Dec-06-2019, 03:39 PM
Hello guys,
I have a directory with a lot of text files. I need to loop through them and extract a certain section from each one. The text files are all formatted in the same standard way, like this:
Quote:
ABC: this is a text
SECTION 2: this is more text
ANOTHER SECTION: blah blah blah. This is another section
SECTION 4:
SECTION 5:
* A list
* Another list. I need this
YET ANOTHER SECTION: A bunch. of sentences. exist here.
OTHER FINDINGS: None.
FINAL
THIS IS NOT IMPORTANT
What I need to do is extract the "SECTION 5" portion of the text. I know the split method, but splitting the file by ":" and then further splitting by "*" doesn't quite seem right:
import glob

# pattern matching all the text files
path = "reports/*.txt"
file_id = 0

# loop through files, one at a time
for file_name in glob.glob(path):
    file_id += 1
    with open(file_name, 'rt') as myfile:
        current_file = myfile.read()
    # split the whole file on ":" and then each piece on "*"
    section_list = current_file.split(':')
    for list_section in section_list:
        further_split = list_section.split('*')
        for x in further_split:
            # print the individual item, not the whole list
            print("An item in list: " + x)

Is there a more elegant/better way to get what I need? What I am really after is this: within the section that I care about, I want to loop through each of the subsections, which are delimited by "*", and work with those strings.

I would appreciate any help!
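For what it's worth, here is a minimal sketch of one possible approach, not the only way to do it. It assumes that every section header is a line starting with capital letters, digits, or spaces followed by a colon (as in the sample above), and that a section runs until the next such header. The names HEADER, extract_section, and bullet_items are made up for illustration.

```python
import glob
import re

# Assumption: a header line looks like "SECTION 5:" or "ABC: some text",
# i.e. uppercase letters, digits, or spaces, followed by a colon.
HEADER = re.compile(r"^([A-Z][A-Z0-9 ]*):[ \t]*(.*)$")


def extract_section(text, wanted="SECTION 5"):
    """Return the body lines of the section named `wanted` (hypothetical helper)."""
    body = []
    inside = False
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # Any header line closes the previous section and may open ours.
            inside = match.group(1) == wanted
            if inside and match.group(2):
                # Keep text that follows the colon on the header line itself.
                body.append(match.group(2))
        elif inside:
            body.append(line)
    return body


def bullet_items(lines):
    """Return the text of each line that starts with '*', without the marker."""
    return [
        line.lstrip()[1:].strip()
        for line in lines
        if line.lstrip().startswith("*")
    ]


if __name__ == "__main__":
    # Same glob pattern as in the original attempt.
    for file_name in glob.glob("reports/*.txt"):
        with open(file_name, "rt") as myfile:
            items = bullet_items(extract_section(myfile.read()))
        for item in items:
            print(file_name, item)
```

Working line by line avoids the problem with split(':'), which also cuts at any colon that happens to appear inside a sentence. If the real headers can contain lowercase letters or punctuation, the HEADER pattern would need adjusting.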
