How to fix looking specific word in a webpage

BSOD · Jun-16-2020, 08:01 PM

I was using the code below to scrape a website and check whether it runs WordPress by looking for "/wp-". It partially works, but it also partially doesn't.

The problem is that when it looks for and counts /wp-, it reports far too many results on every site I check. If I manually inspect https://arstechnica.com/ and search for /wp- with Ctrl+F, I get around 46 results.
If I use the code, it reports 922 results.

Is there a way to stop it from reporting so many results?
Also, is there a way to return only the first result for /wp-?
I would like to use both approaches in future code.

Thank you very much for your help and any advice you might have on how to fix this!

#!/usr/bin/env python3

import requests
from bs4 import BeautifulSoup


def count_words(url, the_word):
    r = requests.get(url, allow_redirects=False)
    soup = BeautifulSoup(r.content, 'lxml')
    # find() returns only the first text node that contains the_word
    words = soup.find(string=lambda text: text and the_word in text)
    print(words)
    # len() of that single string is its character count, not a match count
    return len(words)


def main():
    url = 'https://arstechnica.com/'
    word = '/wp-'
    count = count_words(url, word)
    print('\nUrl: {}\ncontains {} occurrences of word: {}'.format(url, count, word))


if __name__ == '__main__':
    main()
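
One possible direction, offered as a sketch rather than a definitive fix: the 922 is most likely the number of characters in the first text node that contains /wp-, because find() returns a single string and len() measures that string. Counting the substring in the page source and collecting the href/src values that contain it gives both a total and a first match. The version below uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The names count_in_source, links_containing, first_link and SAMPLE_HTML are made up for illustration, and the sample markup stands in for the text you would get from requests.get(url).text.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect href/src attribute values that contain a given substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name in ("href", "src") and value and self.needle in value:
                self.hits.append(value)


def count_in_source(html, needle):
    """Count non-overlapping occurrences of needle in the raw HTML text."""
    return html.count(needle)


def links_containing(html, needle):
    """Return href/src values containing needle, in document order."""
    collector = AttrCollector(needle)
    collector.feed(html)
    return collector.hits


def first_link(html, needle):
    """Return the first href/src value containing needle, or None."""
    hits = links_containing(html, needle)
    return hits[0] if hits else None


# Hypothetical sample markup, used in place of a downloaded page
SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="/wp-content/themes/demo/style.css">
<script src="/wp-includes/js/jquery.js"></script>
</head><body>
<p>Mentions /wp- in visible text.</p>
<a href="/about">About</a>
</body></html>
"""

if __name__ == "__main__":
    print(count_in_source(SAMPLE_HTML, "/wp-"))
    print(links_containing(SAMPLE_HTML, "/wp-"))
    print(first_link(SAMPLE_HTML, "/wp-"))
```

The totals may still differ from what the browser shows, since requests only sees the HTML as delivered, before any JavaScript has run.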
