I have a function that takes a URL and returns the text of the page at that URL:
from urllib.request import Request, urlopen

import bs4 as bs


def extract_raw_text_from_url(url, set_parser='lxml'):
    try:
        req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})  # Set user agent as Mozilla. Otherwise: Error 403
        source = urlopen(req).read()  # Return source code
        parser = set_parser
        soup = bs.BeautifulSoup(source, parser)  # create beautiful soup object
        text = soup.get_text()  # get text of website
    except ValueError:  # ToDo: Why urllib.error.URLError is unknown? I want to include it in exception! Works in Colab!
        text = []
    return text
How do I properly test this function? Since I think it's bad practice to make a real request every time I run the tests, I think it would be a good idea to mock the response.
Any idea how to do this? I am using pytest but I'm still a beginner.
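One possible way to do this, as a minimal sketch rather than a definitive answer: replace the `urlopen` name that the function looks up with a `unittest.mock.Mock`, so no network call is made. The function is repeated here only so the block runs on its own. Assumptions that are not in the question: `bs4` is installed, the tests pass `'html.parser'` so `lxml` is not needed, and the test names and fake HTML are made up for illustration. In a real project the function would live in its own module, and you would more commonly patch it there, e.g. `patch("my_module.urlopen", fake_urlopen)` with a hypothetical `my_module`.

```python
from unittest.mock import Mock, patch
from urllib.request import Request, urlopen

import bs4 as bs


def extract_raw_text_from_url(url, set_parser='lxml'):
    # Copy of the function from the question, so the sketch is self-contained.
    try:
        req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        source = urlopen(req).read()
        soup = bs.BeautifulSoup(source, set_parser)
        text = soup.get_text()
    except ValueError:
        text = []
    return text


def test_returns_page_text_without_network():
    # urlopen(req) returns the mock's return_value; .read() yields canned bytes.
    fake_urlopen = Mock()
    fake_urlopen.return_value.read.return_value = (
        b"<html><body><p>Hello, test!</p></body></html>"
    )
    # Swap the name `urlopen` in the globals the function actually uses.
    with patch.dict(extract_raw_text_from_url.__globals__,
                    {"urlopen": fake_urlopen}):
        text = extract_raw_text_from_url("http://example.com",
                                         set_parser="html.parser")

    assert "Hello, test!" in text
    # The function should have built a Request for the URL we passed in.
    request = fake_urlopen.call_args[0][0]
    assert request.full_url == "http://example.com"


def test_returns_empty_list_on_value_error():
    # Make the fake urlopen raise, to exercise the except branch.
    fake_urlopen = Mock(side_effect=ValueError("simulated failure"))
    with patch.dict(extract_raw_text_from_url.__globals__,
                    {"urlopen": fake_urlopen}):
        assert extract_raw_text_from_url("http://example.com") == []
```

Functions named `test_*` like these are collected by pytest as-is. The `with` block restores the real `urlopen` afterwards, so other tests are not affected.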