May-29-2022, 07:06 PM
(This post was last modified: May-29-2022, 07:06 PM by eddywinch82.)
Hi there,
I have the following Python Code :-
import pandas as pd
import requests
import numpy as np
from bs4 import BeautifulSoup
import xlrd
import re
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
res3 = requests.get("https://web.archive.org/web/20220521203053/https://www.military-airshows.co.uk/press22/bbmfschedule2022.htm")
soup3 = BeautifulSoup(res3.content,'lxml')
BBMF_2022 = []
#BBMF_elem = soup3.find_all('a', string=re.compile(r'between|Flypast'))
for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    print(li1)
    #print(li2)
    #BBMF_2022.append(li1)
#check if links are in dataframe
#df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])
#df

The issue I have is that when I run the code, the data for the 15 entries from May 28th to May 29th is printed several times. I am not sure why that is the case. Could someone suggest the reason, and tell me what I need to change in the code so that the data is printed only once? I am trying to scrape data from a website, keeping the entries that contain the word "between" or "Flypast".
When I use the following piece of Code instead :-
for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    #print(li1)
    #print(li2)
    BBMF_2022.append(li1)
df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])
df

The first entry for the 28th May is shown in the DataFrame 15 times, instead of the 15 separate entries I mentioned before.
Any help would be much appreciated.
Best Regards
Eddie Winch ))
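
One possible explanation for the repetition, offered only as a sketch: if all of the matching links sit inside a single shared parent element, then item.find_parent().text returns that whole block once per link. The markup below (SAMPLE_HTML) is hypothetical and stands in for the real page, whose structure is not shown in this post; the names parent_texts and link_texts are also made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not the real page):
# three matching links that all share ONE parent element.
SAMPLE_HTML = """
<p><a href="#a">Flypast at Alpha</a><br>
<a href="#b">Display between 1pm and 2pm</a><br>
<a href="#c">Flypast at Charlie</a></p>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = soup.find_all("a", string=re.compile(r"between|Flypast"))

# Parent text: every link reports the same shared block,
# so the full set of entries appears once per link.
parent_texts = [a.find_parent().text for a in links]

# Link text: each link reports only its own entry.
link_texts = [a.get_text(strip=True) for a in links]

print(len(parent_texts), "parent texts,", len(set(parent_texts)), "distinct")
for text in link_texts:
    print(text)
```

If the real page is laid out like this, collecting the text of each link (or of a narrower enclosing element) rather than the shared parent should give one row per entry. Whether that applies here depends on the actual HTML.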
