Python Regex Not Working
I am using the following code:
downloadlink = re.findall('http://uploadir.com/u/(.*)\b', str(downloadhtml))
However, when I pass it the following string:
Solution 1:
Get in the habit of making all regex patterns with raw strings:
In [16]: re.findall("http://uploadir.com/u/(.*)\b", '<input type="text" value="http://uploadir.com/u/bb41c5b3" />')
Out[16]: []
In [17]: re.findall(r"http://uploadir.com/u/(.*)\b", '<input type="text" value="http://uploadir.com/u/bb41c5b3" />')
Out[17]: ['bb41c5b3']
The difference is due to \b being interpreted differently:
In [18]: '\b'
Out[18]: '\x08'
In [19]: r'\b'
Out[19]: '\\b'
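'\b' is a single character, the ASCII backspace, so the first pattern only matches if the text contains a literal backspace. r'\b' is a string of two characters, a backslash and a b, which the regex engine interprets as a word boundary.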
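The same comparison can be run as a standalone script rather than an interactive session. This is a minimal sketch; the variable names `html`, `plain` and `raw` are chosen here for illustration and are not from the original question.

```python
import re

html = '<input type="text" value="http://uploadir.com/u/bb41c5b3" />'

# In a normal string literal, Python converts \b to one backspace character.
assert len('\b') == 1 and '\b' == '\x08'

# In a raw string the backslash survives, so the regex engine receives \b
# and treats it as a word boundary. The escaped form '\\b' is equivalent.
assert len(r'\b') == 2 and r'\b' == '\\b'

plain = re.findall('http://uploadir.com/u/(.*)\b', html)   # looks for a backspace
raw = re.findall(r'http://uploadir.com/u/(.*)\b', html)    # looks for a word boundary

print(plain)  # []
print(raw)    # ['bb41c5b3']
```

The greedy (.*) first runs to the end of the input and then backtracks to the last position where a word boundary holds, which here is right after bb41c5b3.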
Solution 2:
>>> import re
>>> html = '<input type="text" value="http://uploadir.com/u/bb41c5b3" />'
>>> regex = r'http://uploadir.com/u/([^"]+)'
>>> link = re.findall(regex, html)
>>> link
['bb41c5b3']
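Solution 2 comes without an explanation, so here is a hedged sketch of why the negated character class may be the safer choice. [^"]+ stops at the closing quote of the attribute, whereas the greedy (.*)\b can run across several links on one line. The helper name `extract_ids` and the two-link sample string are assumptions made for illustration. The dots in the host name are also escaped here, which Solution 2 does not do.

```python
import re

# Pattern follows Solution 2; escaping the dots is an addition in this sketch.
LINK_RE = re.compile(r'http://uploadir\.com/u/([^"]+)')

def extract_ids(html):
    """Return the part after /u/ for every uploadir link found in html."""
    return LINK_RE.findall(html)

one = '<input type="text" value="http://uploadir.com/u/bb41c5b3" />'
print(extract_ids(one))  # ['bb41c5b3']

# Hypothetical input with two links on the same line.
two = ('<a href="http://uploadir.com/u/aaa111"></a> '
       '<a href="http://uploadir.com/u/bbb222"></a>')
print(extract_ids(two))  # ['aaa111', 'bbb222']

# For comparison, the greedy pattern returns a single over-long match here.
greedy = re.findall(r'http://uploadir.com/u/(.*)\b', two)
print(len(greedy))  # 1
```

For anything beyond a quick one-off extraction, an HTML parser is usually more robust than a regex, but that is outside what either answer proposes.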