How do I extract specific lines from HTML files before and after a word?

glittergirl · (This post was last modified: Aug-05-2019, 05:45 PM by glittergirl.)

I am trying to extract the 10 lines before and after the word "apple" from a directory (with subdirectories) full of HTML files. I want to write these lines to a CSV file. Ideally, the CSV file will contain two columns: 1) the HTML filename and 2) the 10 lines before and after the word "apple".

I have done the following:

import glob
import collections
import itertools
import sys
import csv

for filepath in glob.glob('**/*.html', recursive=True):
    with open(filepath) as f:
        before = collections.deque(maxlen=10)
        for line in f:
            if 'apple' in line:
                sys.stdout.writelines(before)
                sys.stdout.write(line)
                sys.stdout.writelines(itertools.islice(f, 10))
            break
        results = before.append(line)
        print(results)

I am currently getting a bunch of rows that say "None" in my terminal when I print the results. What is the issue here?

fishhook · Aug-06-2019, 07:23 AM

Why do you expect the "append" method to return a value?
https://docs.python.org/2/library/collec...que.append
The documentation says nothing about a return value. When a function does not explicitly return a result, Python returns None, so "results = before.append(line)" always sets results to None.
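
Two other things in the posted code look worth checking: as indented, "break" runs after the first line of every file whether or not it matched, and "before.append(line)" sits outside the for loop, so the deque never fills up. Below is a minimal sketch of one way to restructure it, not a definitive fix. The helper names context_around and write_contexts, the output file name apple_context.csv, the column headers, and the UTF-8 encoding are all assumptions made for illustration, and it only captures the first match in each file.

```python
import collections
import csv
import glob
import itertools


def context_around(lines, word="apple", n=10):
    """Return up to n lines before and after the first line containing word.

    Hypothetical helper: returns None when the word never appears.
    """
    before = collections.deque(maxlen=n)  # keeps only the last n lines seen
    lines = iter(lines)
    for line in lines:
        if word in line:
            # Pull the next n lines from the same iterator.
            after = list(itertools.islice(lines, n))
            return list(before) + [line] + after
        # append() changes the deque in place and returns None,
        # so its result is deliberately not assigned to anything.
        before.append(line)
    return None


def write_contexts(pattern="**/*.html", out_path="apple_context.csv"):
    """Write one CSV row per matching file: filename, surrounding lines."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["filename", "context"])  # assumed column names
        for filepath in glob.glob(pattern, recursive=True):
            # Assumed encoding; undecodable bytes are replaced, not fatal.
            with open(filepath, encoding="utf-8", errors="replace") as f:
                found = context_around(f)
            if found is not None:
                writer.writerow([filepath, "".join(found)])
```

Calling write_contexts() from the top-level directory should produce the two-column CSV described in the question. The csv module quotes the multi-line context field, so each file still ends up as a single row.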
