
Url Component % And \x

I have a doubt.

st = 'b%C3%BCrokommunikation'
urllib2.unquote(st)
OUTPUT: 'b\xc3\xbcrokommunikation'

But, if I print it:

print urllib2.unquote(st)
OUTPUT: bürokommunikation

Why?

Solution 1:

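When you print the string, Python sends the raw bytes \xc3\xbc to your terminal emulator. Those two bytes are the UTF-8 encoding of ü, so a terminal that expects UTF-8 decodes them and displays the character correctly.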

However, as @MarkDickinson says in the comments, ü doesn't exist in ASCII, so to write the string to a file you need to decode it to a unicode object and tell Python which encoding to use for the file, for instance UTF-8.

This is very easy using the codecs module:

import codecs
import urllib2

# Unquote to a UTF-8 byte string, then decode it to a unicode object
st = "b%C3%BCrokommunikation"
decoded_string = urllib2.unquote(st).decode('utf-8')

# Write it to a file; codecs.open encodes the text as UTF-8 on the way out
with codecs.open('my_file.txt', 'w', 'utf-8') as f:
    f.write(decoded_string)
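
For readers on Python 3, where urllib2 no longer exists, the same steps might look roughly like the sketch below. It assumes urllib.parse.unquote as the replacement for urllib2.unquote and the built-in open with an encoding argument in place of codecs.open. The temporary directory is only a hypothetical location for my_file.txt.

```python
import os
import tempfile
from urllib.parse import unquote

st = "b%C3%BCrokommunikation"

# In Python 3, unquote decodes the percent-escapes as UTF-8 by default
# and returns a text (str) object directly.
text = unquote(st)

# Hypothetical location for the output file (a fresh temporary directory).
path = os.path.join(tempfile.mkdtemp(), "my_file.txt")

# The built-in open accepts an encoding, so codecs.open is not needed here.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the raw bytes back to confirm what was written to disk.
with open(path, "rb") as f:
    data = f.read()

print(data)  # b'b\xc3\xbcrokommunikation'
```

The bytes on disk are the same \xc3\xbc pair shown in the question, which is the UTF-8 encoding of ü.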

Solution 2:

You are looking at the same result. When you evaluate the expression without the print command, the interpreter just shows the __repr__() result, which escapes non-ASCII bytes with \x. When you use print, it shows the character itself instead of the escaped form.
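
The difference between the repr and the underlying data can be checked directly. The following is a minimal sketch written for Python 3, assuming urllib.parse.unquote_to_bytes as a stand-in for the byte string that urllib2.unquote returns in Python 2. The names raw and text are illustrative only.

```python
from urllib.parse import unquote_to_bytes

st = 'b%C3%BCrokommunikation'

# Bytes object, comparable to the str that urllib2.unquote returns in Python 2.
raw = unquote_to_bytes(st)

# Decoding the UTF-8 bytes yields text in which the u-umlaut is one character.
text = raw.decode('utf-8')

# repr() escapes the two non-ASCII bytes; the data itself is unchanged.
print(repr(raw))            # b'b\xc3\xbcrokommunikation'

# 18 bytes versus 17 characters: \xc3\xbc is a single character once decoded.
print(len(raw), len(text))  # 18 17
```

Calling print(text) on a UTF-8 terminal would then show the word with the umlaut, just as print does in the question.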
