Mar-20-2018, 12:48 PM
import os
import urllib.request

# the path where the html is located
path = r"C:\Users\The Capricorn\Documents\Html"

for filename in os.listdir(path):
    # Now we have to find the full path name of the files
    subpath = os.path.join(path, filename)
    if subpath.endswith('.html'):
        print(subpath)
        print('Reading', filename, '....')
        html = open(subpath, 'r').read()
        if html:
            print('Successfully fetched Html')
    else:
        for file in os.listdir(subpath):
            # getting the full path of html file
            fullpath = os.path.join(subpath, file)
            if fullpath.endswith('.html'):
                print(fullpath)
                print('Reading', file, '....')
                html = open(fullpath, 'r').read()
                if html:
                    print('Successfully fetched Html')

This code reads the local HTML files in a directory. It works fine when path contains no sub-folders or only a single level of sub-folders, but not when there are folders nested inside those sub-folders. It also raises an error if path contains files with an extension other than .html, because os.listdir is then called on a file rather than a folder. What should I do to correct this?
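
One possible approach, offered as a minimal sketch rather than a definitive fix, is to let os.walk do the traversal. It visits the top folder and every nested sub-folder at any depth, and it hands back file names separately from folder names, so os.listdir is never called on a non-HTML file. The helper name find_html_files is hypothetical (it is not in the code above), and the case-insensitive extension check is an added assumption.

```python
import os


def find_html_files(root):
    """Return the full paths of all .html files under root, at any depth."""
    matches = []
    # os.walk yields (dirpath, dirnames, filenames) for root and for
    # every folder nested below it, so no manual recursion is needed.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Files with other extensions are simply skipped.
            if filename.lower().endswith('.html'):
                matches.append(os.path.join(dirpath, filename))
    return matches


if __name__ == "__main__":
    # Same folder as in the question; adjust as needed.
    path = r"C:\Users\The Capricorn\Documents\Html"
    for fullpath in find_html_files(path):
        print('Reading', os.path.basename(fullpath), '....')
        # The with-block closes the file once it has been read.
        with open(fullpath, 'r') as f:
            html = f.read()
        if html:
            print('Successfully fetched Html')
```

Collecting the paths in a list first keeps the traversal separate from the reading, which makes it easier to check which files were found before opening any of them.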
