I want to be able to extract data from multiple pages. The pages are in the following format:
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=1&sort_order=price_asc
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=2&sort_order=price_asc
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=3&sort_order=price_asc

In these links, the only thing that changes in the URL is the number following page=.

I have created code so far that exports the results into a CSV file. However, this only works for one URL:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=1&sort_order=price_asc'
# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parser
page_soup = soup(page_html, "html.parser")
# grabs each property
listings = page_soup.findAll("div",{"class":"tmp-search-card-list-view__card-content"})
filename = "trademe.csv"
f = open(filename, "w")
headers = "title, price, area\n"
f.write(headers)
for listing in listings:
    title_listing = listing.findAll("div", {"class":"tmp-search-card-list-view__title"})
    price_listing = listing.findAll("div", {"class":"tmp-search-card-list-view__price"})
    area_listing = listing.findAll("div", {"class":"tmp-search-card-list-view__subtitle"})
    title = title_listing[0].text.strip()
    price = price_listing[0].text.strip()
    area = area_listing[0].text.strip()
    print("title: " + title)
    print("price: " + price)
    print("area: " + area)
    f.write(title.replace(",", "^") + "," + price.replace(",", "") + "," + area.replace(",", "^") + "\n")
f.close()

How would I get this working so that it keeps going through all of the page numbers in the URL? I could create a text file with the possible links, but I'm still not sure what to do to get this to work.

I'm new to Python.
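
Since only the number after page= changes, one possible approach is to build each URL from a template and repeat the same parsing for every page. The following is only a rough sketch under some assumptions: the names URL_TEMPLATE, page_urls, text_of, parse_listings and scrape are made up for illustration, the last page number has to be supplied by hand, and it assumes the other query parameters (such as rsqid and key) stay valid across pages. It also swaps the manual comma replacement for Python's csv module, which quotes fields that contain commas.

```python
import csv
from urllib.request import urlopen

# Assumption: the search URL from the question, with the page number
# replaced by a {page} placeholder. Everything else is unchanged.
URL_TEMPLATE = (
    "https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx"
    "?cid=5748&search=1&134=9&135=2&rptpath=350-5748-"
    "&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1"
    "&originalsidebar=1&key=1227701521&page={page}&sort_order=price_asc"
)

# Common prefix of the CSS class names used in the original script.
CARD = "tmp-search-card-list-view__"


def page_urls(first_page, last_page):
    """Return one URL per page number, first_page..last_page inclusive."""
    return [URL_TEMPLATE.format(page=n) for n in range(first_page, last_page + 1)]


def text_of(listing, suffix):
    """Return the stripped text of the first matching div, or '' if absent."""
    node = listing.find("div", {"class": CARD + suffix})
    return node.text.strip() if node else ""


def parse_listings(page_html):
    """Extract (title, price, area) tuples from one results page."""
    # Imported here so that page_urls can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    page_soup = BeautifulSoup(page_html, "html.parser")
    listings = page_soup.find_all("div", {"class": CARD + "card-content"})
    return [
        (text_of(l, "title"), text_of(l, "price"), text_of(l, "subtitle"))
        for l in listings
    ]


def scrape(first_page, last_page, filename="trademe.csv"):
    """Fetch every page in the range and write all listings to one CSV file."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "price", "area"])
        for url in page_urls(first_page, last_page):
            with urlopen(url) as response:
                page_html = response.read()
            writer.writerows(parse_listings(page_html))
```

Calling scrape(1, 3) would then request pages 1 to 3 and append every listing to the same trademe.csv. The file is opened once, outside the loop, so later pages do not overwrite earlier ones. If the links were kept in a text file instead, the loop over page_urls(...) could be replaced by a loop over the lines of that file.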
