Sep-06-2017, 09:08 PM
(This post was last modified: Sep-06-2017, 09:08 PM by Prince_Bhatia.)
Hi,
I am trying to scrape the website "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page=1&PageSize=36&order=BESTMATCH".
What I want to scrape is the product name, its price and the image link.
I have had some success, but with one problem: the name, price and image end up spread across every cell, so the formatting is very poor.
Can someone help me amend the code so that I get the name in the name column, the price in the price column and the image in the image column?
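One possible cause, which I have not confirmed against the real page, is that the scraped text itself contains commas, so joining the fields with "," produces more than three cells per row. A small sketch with made-up values (the product name, price and image URL here are hypothetical, not taken from the site):

```python
# Hypothetical values, chosen only to illustrate the suspected problem.
name = "EVGA GeForce GTX 1080, 8GB GDDR5X"   # comma inside the product name
price = "$1,099.99"                           # comma inside the price
image = "//images.example.com/card.jpg"

# Same idea as the f.write(...) call in my code: join the fields with commas.
row = "{},{},{}".format(name, price, image)

# A spreadsheet splits the line on every comma, not just the two I added.
cells = row.split(",")
print(len(cells))
print(cells)
```

If this is what is happening, three fields end up spread over five cells, which would match what I see in the file.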
from urllib.request import urlopen
from bs4 import BeautifulSoup

#page_url = "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page=1&PageSize=36&order=BESTMATCH"
#html = urlopen(page_url)
#bs0bj = BeautifulSoup(html, "html.parser")
#page_details = bs0bj.find_all("div", {"class":"item-container"})
f = open("Scrapedetails.csv", "w")
Headers = "Item_Name, Price, Image\n"
f.write(Headers)
#for i in page_details:
#    Item_Name = i.find("a", {"class":"item-title"})
#    Price = i.find("li", {"class":"price-current"})
#    Image = i.find("img")
#    Name_item = Item_Name.get_text()
#    Prin = Price.get_text()
#    imgf = Image["src"]  # to get the key src
#    f.write("{}".format(Name_item)+ ",{}".format(Prin)+ ",{}".format(imgf))
#f.close()
for page in range(1,15):
    page_url = "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page={}&PageSize=36&order=BESTMATCH".format(page)
    html = urlopen(page_url)
    bs0bj = BeautifulSoup(html, "html.parser")
    page_details = bs0bj.find_all("div", {"class":"item-container"})
    for i in page_details:
        Item_Name = i.find("a", {"class":"item-title"})
        Price = i.find("li", {"class":"price-current"})
        Image = i.find("img")
        Name_item = Item_Name.get_text()
        Prin = Price.get_text()
        imgf = Image["src"]  # to get the key src
        f.write("{}".format(Name_item)+ ",{}".format(Prin)+ ",{}".format(imgf)+ "\n")
f.close()

I am attaching the Excel file too. Also, what are the newer ways to save data to CSV? Can someone help me with code for that as well?
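For the CSV part, one option that seems relevant is Python's built-in csv module, whose writer quotes any field that contains a comma or a newline. Below is a minimal sketch, not a tested fix for the Newegg page: the helper names `clean` and `save_rows`, the output file name and the sample row are all made up for illustration, and the sample stands in for the values that `get_text()` and `Image["src"]` would return.

```python
import csv
import os
import tempfile

def clean(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())

def save_rows(path, rows):
    """Write a header plus (name, price, image) rows.

    csv.writer quotes fields that contain commas, so each value
    stays in its own column when the file is opened in a spreadsheet.
    """
    # newline="" is what the csv docs recommend when opening the file.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Item_Name", "Price", "Image"])
        for name, price, image in rows:
            writer.writerow([clean(name), clean(price), image])

# Made-up sample row standing in for scraped values (hypothetical data).
sample = [
    ("EVGA GeForce GTX 1080, 8GB GDDR5X", "\n$1,099.99\n",
     "//images.example.com/card.jpg"),
]

# Write to a temporary folder and read the file back to check the columns.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Scrapedetails_fixed.csv")
    save_rows(path, sample)
    with open(path, newline="", encoding="utf-8") as f:
        rows_back = list(csv.reader(f))

print(rows_back[1])
```

In the scraping loop, the same idea would mean collecting `(Name_item, Prin, imgf)` tuples and passing them to a writer like this instead of building the line by hand with `f.write`.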
