Url Component % And \x
I have a question.
st = 'b%C3%BCrokommunikation'
urllib2.unquote(st)
OUTPUT: 'b\xc3\xbcrokommunikation'
But if I print it:
print urllib2.unquote(st)
OUTPUT: bürokommunikation
Why?
Solution 1:
When you print the string, your terminal emulator decodes the UTF-8 byte sequence \xc3\xbc as the character ü and displays it correctly.
However, as @MarkDickinson says in the comments, ü doesn't exist in ASCII, so to write the string to a file you need to decode it to a unicode object and tell Python which encoding to use when writing, for instance UTF-8.
This is very easy using the codecs library:
import codecs
import urllib2

# First decode the percent-escaped UTF-8 bytes into a Python unicode string
st = "b%C3%BCrokommunikation"
decoded_string = urllib2.unquote(st).decode('utf-8')

# Write it to a file, keeping the UTF-8 encoding
with codecs.open('my_file.txt', 'w', 'utf-8') as f:
    f.write(decoded_string)
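For readers on Python 3, where urllib2 no longer exists, roughly the same idea can be sketched with urllib.parse.unquote and the built-in open. This is only an illustration under that assumption: the temporary-directory path is a hypothetical choice made so the snippet does not write into the current directory.

```python
import os
import tempfile
from urllib.parse import unquote

st = "b%C3%BCrokommunikation"

# unquote() decodes the percent-escapes as UTF-8 and returns a str
decoded = unquote(st, encoding="utf-8")

# Hypothetical output location, chosen only for this example
path = os.path.join(tempfile.gettempdir(), "my_file.txt")

# The built-in open() takes the encoding directly in Python 3
with open(path, "w", encoding="utf-8") as f:
    f.write(decoded)

# Reading the file back in binary mode shows the two UTF-8 bytes for ü
with open(path, "rb") as f:
    raw = f.read()
```

Here raw holds the bytes that were actually written to disk, so ü appears in it as the pair \xc3\xbc.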
Solution 2:
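You are looking at the same result in both cases. When you evaluate the expression at the interactive prompt without print, Python shows the __repr__() of the string, which escapes non-ASCII bytes as \x sequences. When you use print, the raw bytes are sent to the terminal, which displays them as the character ü instead of the \x escapes.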
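The difference between the two views can be reproduced in a small sketch. It assumes Python 3, where urllib.parse.unquote_to_bytes returns a bytes object that plays the role of the Python 2 str in the question; the variable names are only illustrative.

```python
from urllib.parse import unquote_to_bytes

st = "b%C3%BCrokommunikation"

# Raw bytes after percent-decoding, comparable to the Python 2 str
raw = unquote_to_bytes(st)

# repr() escapes the non-ASCII bytes, which is what the prompt echoes
shown_by_repr = repr(raw)
print(shown_by_repr)

# Decoding the same bytes as UTF-8 gives the text a terminal would display;
# print(text) shows bürokommunikation on a UTF-8 terminal
text = raw.decode("utf-8")
```

The bytes are identical in both cases. Only the presentation changes: repr() writes \xc3\xbc, while the decoded text contains the single character ü.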