Unable to print data while looping through list in csv for webscraping - Python

Prince_Bhatia · Oct-04-2017, 10:25 AM

I have a CSV file containing a list of URLs that need to be scraped. The website I am scraping is http://www.rera-rajasthan.in/ProjectSearch, a real estate website that lists property names, each with a link to the property details. I was able to scrape those links into a CSV; now I need to loop through all the extracted links for further web scraping.

This website requires a POST request to search for projects. I applied the same method to the extracted links too.

But when I run this code, it prints nothing:

import requests
from bs4 import BeautifulSoup
from urllib.request import urlopen
import csv
import json

#links = []

links = []
reranumber = []
table_attr = {"class":"table table-bordered"}

with open("RajLinks.csv", newline= '') as f:
    reader = csv.reader(f)
    for row in reader:
        reranumber = row[0]
        link = row[1]
        links.append(link)

def getData(url):
    url = "http://www.rera-rajasthan.in/Home/GetProjectsList"
    user_agent = {"User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0"}
    payload = {'certificateNo': '', 'PageSize': '50', 'District': '', 'v': '', 'projectName': '', 'promoterName': '', 'page': '1', 'tehsil': ''}
    r = requests.post(url, headers=user_agent, params=payload)
    data = r.text
    return data

#getdata

for sublist in links:
    htmldata = getData(link)
    soup = BeautifulSoup(htmldata, "html.parser")
    tables = soup.find_all("table", table_attr)
    for table in tables:
        txt = table.text
    if txt.find("Contact Address"):
        trs = table.find_all("tr")
        for data in trs:
            name = data[1].text
            print(name)

It should print the first tr in the "Contact Address" table that it finds. I am extracting the links column.

I am attaching the CSV. Can someone please guide me?

wavic · Oct-04-2017, 11:18 AM

The page strangely lacks classes and ids, so you cannot target a specific element directly. What you can do is find the table you want to scrape by using the h3 tag above it:

table = soup.find('h3', text='CONTRACTOR').find_next_sibling('table')

Note the find_next_sibling method.
Then you can get all the tr tags and, from the second one, get the desired td. You have to use indices because, as I said, there are no classes or ids to point to.

address = table.find_all('tr')[1].find_all('td')[2].text

Finally, you get 'S-33/34, JDA Shopping Center, Amrapali Circel, Vaishali Nagar, Jaipur' from the first url in the csv.
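
To see the two steps together without hitting the site, here is a minimal sketch run against made-up markup. SAMPLE_HTML only imitates the layout described above (bare tables, each preceded by an h3), and contractor_address, addresses_from_csv and the fetch argument are hypothetical names used for illustration. It assumes the CSV has the registration number in the first column and the link in the second, as in the question's code, and it uses string=, the current name for the text= argument in Beautiful Soup 4.4 and later.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up markup imitating the described layout: tables without classes
# or ids, each preceded by an h3 heading. The real page will differ.
SAMPLE_HTML = """
<div>
  <h3>PROMOTER</h3>
  <table>
    <tr><th>No</th><th>Name</th><th>Address</th></tr>
    <tr><td>1</td><td>Promoter A</td><td>1 Example Road</td></tr>
  </table>
  <h3>CONTRACTOR</h3>
  <table>
    <tr><th>No</th><th>Name</th><th>Address</th></tr>
    <tr><td>1</td><td>Contractor B</td><td>2 Sample Street</td></tr>
  </table>
</div>
"""


def contractor_address(html):
    """Return the third cell of the second row under the CONTRACTOR heading, or None."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h3", string="CONTRACTOR")
    if heading is None:
        return None
    # Skips the whitespace between the tags and returns the next <table>.
    table = heading.find_next_sibling("table")
    if table is None:
        return None
    rows = table.find_all("tr")
    if len(rows) < 2:
        return None
    cells = rows[1].find_all("td")
    if len(cells) < 3:
        return None
    return cells[2].get_text(strip=True)


def addresses_from_csv(csv_file, fetch):
    """Yield (registration_number, address) for every row of the links CSV.

    fetch is any callable that takes a URL and returns the page HTML.
    """
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        number, link = row[0], row[1]
        # Use the link from the current row, not a variable left over
        # from an earlier loop.
        yield number, contractor_address(fetch(link))


# Stand-in CSV and fetcher so the sketch runs offline; both are made up.
fake_csv = io.StringIO("R-001,http://example.invalid/project/1\n")
for number, address in addresses_from_csv(fake_csv, lambda url: SAMPLE_HTML):
    print(number, address)
```

For the real pages, fetch could be something like lambda url: requests.get(url).text, assuming the detail links answer a plain GET, which this thread does not confirm. The None checks are there so a page with a different layout gives an empty result instead of an exception halfway through the loop.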
