Stuck With Encodings In Python With Beautifulsoup
The page is encoded in UTF-8, and Python's HTMLParser handles it fine with no UnicodeDecodeError, but I get an error when I try to parse it with BeautifulSoup. I've tried _*_ codi
Solution 1:
OK, I found the solution on my last try; maybe it will help others with the same problem. The strings need to be encoded, not decoded:
print([e.encode('utf-8', 'ignore') for e in stuff])
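To make the encode-versus-decode distinction concrete, here is a minimal sketch assuming Python 3, where text is str and encode() produces bytes. The helper name encode_all and the sample strings are made up for illustration; they stand in for whatever stuff holds in the original snippet.

```python
def encode_all(items, encoding="utf-8"):
    """Encode each text string to bytes (hypothetical helper, not from the original post)."""
    return [item.encode(encoding) for item in items]


# Hypothetical sample data standing in for `stuff`.
stuff = ["caf\u00e9", "na\u00efve"]
encoded = encode_all(stuff)

# Encoding turns text into bytes; decoding with the same codec reverses it.
round_trip = [b.decode("utf-8") for b in encoded]
print(encoded)
```

Note that in Python 3 printing the list shows the bytes reprs (b'...') rather than the accented characters, so this is mainly useful for seeing what actually gets written to the stream.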
Solution 2:
You shouldn't be getting UnicodeEncodeError: 'ascii'.. errors when you print. This usually happens when your locale is broken or set to C, in which case Python cannot pick an appropriate encoder for the stdout stream. Run locale and check for errors or warnings.
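You can also check from inside Python which encodings it actually picked. This is a rough sketch, not a full diagnosis; the function name describe_io_encoding is made up, and it only reports what the standard sys and locale modules expose.

```python
import locale
import sys


def describe_io_encoding():
    """Report the encodings Python chose (hypothetical helper for illustration)."""
    return {
        # Encoding attached to the stdout stream; may be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding Python derives from the current locale settings.
        "preferred": locale.getpreferredencoding(False),
    }


info = describe_io_encoding()
print(info)
```

If either value comes back as an ASCII variant (for example ANSI_X3.4-1968), that would be consistent with a locale of C.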
If you can't fix your locale, you can often override Python's stdout encoder by setting PYTHONIOENCODING in your environment to an encoding that matches your terminal emulator. Often you can get by with:
export PYTHONIOENCODING=UTF-8
or
PYTHONIOENCODING=UTF-8 python my_script.py
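To see why the stream's encoder matters, here is a small sketch that imitates stdout with an in-memory stream. It assumes Python 3; write_with_encoding is a made-up helper, and the in-memory stream is only a stand-in for the real terminal, not what Python does internally.

```python
import io


def write_with_encoding(text, encoding):
    """Write text through a text stream with the given encoding and return the raw bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # keep the wrapper from closing the buffer behind our back
    return data


text = "caf\u00e9"

# A UTF-8 stream accepts the accented character.
utf8_bytes = write_with_encoding(text, "utf-8")

# An ASCII stream (what you effectively get with a C locale) rejects it.
try:
    write_with_encoding(text, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(utf8_bytes, ascii_failed)
```

Setting PYTHONIOENCODING=UTF-8 has roughly the effect of moving the real stdout from the second case to the first.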