Hey folks. I am a complete beginner with Python and NLP.
I am stuck on this question: "Store the resulting Doc objects into a list named A".
Is it possible to do this with "A=list(doc)"? How would the items still remain Doc objects inside the list? Thank you so much!
This is my exercise:
1. Load text files from a directory and read their contents
The directory data contains a subdirectory named sotu with five State of the Union speeches from various presidents of the United States, which are stored as UTF-8 encoded plain text files.
Import the pathlib module and use the module to read the contents of each text file into string objects.
Then import the spacy library and load a small language model for English. Assign the model under the variable nlp.
Process the texts using the language model and store the resulting Doc objects into a list named speeches.
And my answer:
from pathlib import Path
corpus_dir = Path("data/sotu")
files = list(corpus_dir.glob(pattern='*.txt'))
for file in files:
    text = file.read_text(encoding='utf-8')
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp(text)
speeches = list(doc)
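A note on the last line: list(doc) iterates over a single Doc, so it gives a list of Token objects rather than a list of Docs. Also, as written, nlp(text) appears to run only once, on whichever text was read last. Below is a minimal sketch of one way to collect one Doc per file: start with an empty list and append the result of nlp(text) inside the loop. The names process_speeches and main are hypothetical helpers made up for this illustration, not part of the exercise or of spaCy, and the sketch assumes the en_core_web_sm model is installed and the files live in data/sotu.

```python
from pathlib import Path


def process_speeches(corpus_dir, nlp):
    """Return a list with one processed object per .txt file in corpus_dir.

    `nlp` is any callable that takes a string; with spaCy it is the loaded
    language model, and each call returns a single Doc.
    """
    speeches = []
    # sorted() only makes the file order predictable; it is optional.
    for file in sorted(Path(corpus_dir).glob("*.txt")):
        text = file.read_text(encoding="utf-8")
        # append() stores the whole Doc as one list item,
        # whereas list(doc) would split it into its tokens.
        speeches.append(nlp(text))
    return speeches


def main():
    # Imported here so the helper above can be tried without spaCy installed.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    speeches = process_speeches("data/sotu", nlp)
    # Each item should be a spaCy Doc, one per speech.
    print(len(speeches), type(speeches[0]))
```

Calling main() should print the number of speeches and the type of the first item. For larger collections, spaCy's nlp.pipe(texts) is another option for processing many strings, though a plain loop is probably fine for five files.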
