Feb-15-2020, 05:33 AM
I have been testing a Python API that can fetch transcripts from YouTube videos: https://github.com/jdepoix/youtube-transcript-api
Here are some test scripts based on the documentation.
#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'

# retrieve the available transcripts
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

# iterate over all available transcripts
for transcript in transcript_list:
    # the Transcript object provides metadata properties
    print(
        transcript.video_id,
        transcript.language,
        transcript.language_code,
        # whether it has been manually created or generated by YouTube
        transcript.is_generated,
        # whether this transcript can be translated or not
        transcript.is_translatable,
        # a list of languages the transcript can be translated to
        transcript.translation_languages,
    )

    # fetch the actual transcript data
    print(transcript.fetch())

    # translating the transcript will return another transcript object
    print(transcript.translate('en').fetch())

# you can also directly filter for the language you are looking for, using the transcript list
transcript = transcript_list.find_transcript(['de', 'en'])

# or just filter for manually created transcripts
transcript = transcript_list.find_manually_created_transcript(['de', 'en'])

# or automatically generated ones
transcript = transcript_list.find_generated_transcript(['de', 'en'])

When I run it I get this error message:

$ python3 test6.py
Traceback (most recent call last):
  File "test6.py", line 14, in <module>
    transcript.video_id,
AttributeError: 'dict' object has no attribute 'video_id'
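The traceback points at the type of the elements being iterated: judging by the output of the second script, get_transcript appears to return a list of plain dicts with 'text', 'start' and 'duration' keys, not Transcript objects. A minimal offline sketch, using hand-copied sample data instead of a live API call (the describe helper is hypothetical, just for illustration), reproduces the same error and shows key access working:

```python
# Sample entries copied from the second script's output; no network call is made.
sample_entries = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
]


def describe(entry):
    """Hypothetical helper: summarize one entry using dict key access."""
    return f"{entry['start']:.2f}s: {entry['text']}"


# Attribute access on a dict fails the same way the first script does.
try:
    sample_entries[0].video_id
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(describe(sample_entries[0]))
```

If the metadata properties are what is wanted, the documentation example seems to start from YouTubeTranscriptApi.list_transcripts(video_id) rather than get_transcript; that is an assumption about the library version installed, so it is worth checking against the README.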
The second script is very basic:

#!/usr/bin/python
from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)
print(transcript_list[0])
print(transcript_list[1])
print(transcript_list[2])

and returns:

$ python3 test7.py
{'duration': 3.04, 'text': '[Music]', 'start': 0.82}
{'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36}
{'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53}
1. Can you please advise what is causing the error in the first script?
2. How can the second script be modified to search through the transcript for specific keywords?
There is a search example at https://stackoverflow.com/questions/5428...ata-api-v3 . It searches on the video title and works okay. I am just referencing it in case it helps with adding code to the second script to search the transcript.
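To make question 2 concrete, here is a rough sketch of the kind of search I have in mind. It assumes the transcript is a list of dicts shaped like the output above; search_transcript is a hypothetical helper name, not part of the library, and the sample data stands in for a real get_transcript call:

```python
def search_transcript(entries, keywords):
    """Return (start, text) pairs for entries whose text contains any keyword.

    Matching is a case-insensitive substring test, so 'escape' also matches
    'escaped'. This is an illustrative helper, not part of youtube_transcript_api.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        (entry['start'], entry['text'])
        for entry in entries
        if any(keyword in entry['text'].lower() for keyword in lowered)
    ]


# Sample entries copied from the second script's output; in a real run this
# list would come from YouTubeTranscriptApi.get_transcript(video_id).
sample_entries = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
    {'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53},
]

matches = search_transcript(sample_entries, ['salvation', 'escape'])
for start, text in matches:
    print(f"{start:.2f}s  {text}")
```

Returning the start time alongside the text would make it possible to jump to the matching point in the video.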
