Jun-06-2018, 04:16 PM
So, I'm using BeautifulSoup to try to read URLs from a website. The code I have so far is along these lines:
File: appName \ models.py
~~~
from django.db import models
from bs4 import BeautifulSoup
from urllib import request


class BBCHeaders(models.Model):
    url = "http://www.bbc.co.uk/news"  # URL for test reasons
    content = request.urlopen(url).read()  # Open url
    soup = BeautifulSoup(content, "html.parser")  # Grab the page details
    try:  # I am unsure about 'try' function, using it to 'try' a value (I may be using it wrong?)
        for element in soup.body.find_all('nav'):  # Search for all elements that are of the category 'nav' for navigation tree
            for link in element.find_all('a', text = True):  # Find all the links within the navigation tree
                for strLink in link.find_all('span'):  # Check for heading strings in navigation tree
                    if strLink != None:  # Filter out all results where span is 'None'
                        # print(link.span.string)  # Print the heading of the link
                        # print(link.get('href'))  # Print the URL padding for link
                        nBBC = models.CharField(link.string)  # Save the headers into SQL
                        urlBBC = models.CharField(link.get('href'))  # Save links into SQL
    except(TypeError):  # Exception of TypeError
        pass  # Do Nothing
~~~

The code keeps raising a TypeError. If I type it into the Python shell and use print instead of saving to nBBC, it prints the results: each heading followed by the URL padding that leads to the destination. So the script works in a Python 3.6 shell, but Django doesn't like it. I suspect I have my *try* block in the wrong place, or that I need to refine my search criteria.

Has anyone with more experience run into a *NoneType* error? How did you get around it?
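
One possible direction, offered as a minimal sketch and not a confirmed fix: everything in the class body above runs when Django imports models.py, and the loop variables are left behind as class attributes, which may be what Django trips over. The sketch below moves the parsing into a plain function and checks for None explicitly instead of catching TypeError. The function name `extract_nav_links` and the `SAMPLE_HTML` markup are hypothetical stand-ins so the example runs without a network call; the real BBC page structure may differ.

~~~python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real navigation bar.
SAMPLE_HTML = """
<html><body>
<nav>
  <a href="/news"><span>News</span></a>
  <a href="/sport"><span>Sport</span></a>
  <a href="/empty"></a>
</nav>
</body></html>
"""


def extract_nav_links(html):
    """Return (heading, href) pairs for every <nav> link that wraps a <span>."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for nav in soup.find_all("nav"):
        for link in nav.find_all("a"):
            span = link.find("span")
            href = link.get("href")
            # find() and get() return None when nothing matches, so skip
            # those links explicitly instead of catching TypeError later.
            if span is None or href is None:
                continue
            results.append((span.get_text(strip=True), href))
    return results


if __name__ == "__main__":
    for heading, href in extract_nav_links(SAMPLE_HTML):
        print(heading, href)
~~~

With the parsing out of the way, the model class would hold only field definitions, for example `title = models.CharField(max_length=200)` and `link = models.CharField(max_length=200)` (both field names are assumptions), and rows could be created from a view or management command with `BBCHeaders.objects.create(title=heading, link=href)`. Note that `CharField` needs a `max_length` argument; its first positional argument is the field's verbose name, not the value to store.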
