How To Use '?' To Extract Optional Substring Between Two Matching Pattern In Python?
I was answering this question. Consider this string str1 = '{'show permission allowed to 16': 'show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:
Solution 1:
Temp is not captured because you made that group optional: the surrounding .*? is free to consume Temp itself, so the optional group matches nothing and the overall match still succeeds.
To solve this, use a negative lookahead so that the characters skipped around the optional group can be anything except the start of Temp, as in this regex:
from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*? ALL-..)
The (?:(?!Temp).)*? parts are what stop Temp from being consumed, while still matching any other character.
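To see what the (?:(?!Temp).)* piece does on its own, here is a minimal sketch. The short text value is a made-up fragment of the sample string, chosen only for illustration.

```python
import re

# Hypothetical fragment of the sample string, shortened for illustration.
text = "group 16(Temp):teacher:"

# A plain greedy dot runs straight over "Temp".
plain = re.match(r".*", text).group()

# The tempered dot checks (?!Temp) before each character, so it stops
# right before the first occurrence of "Temp".
tempered = re.match(r"(?:(?!Temp).)*", text).group()

print(plain)     # group 16(Temp):teacher:
print(tempered)  # group 16(
```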
Regex explanation:
from group - matches this text literally.
(?:(?!Temp).)*? - ?: makes the group non-capturing (groups capture by default). (?!Temp). matches any single character as long as Temp does not start at that position, * repeats that zero or more times, and the trailing ? makes the repetition lazy, i.e. as few characters as possible. Together this matches any text that does not contain Temp.
(Temp)? - optionally captures Temp if it is present.
(?:(?!Temp).)*? - again matches zero or more characters without running into Temp, just like above.
\\t - matches a literal backslash followed by t (the sample string contains the two characters \t, not a real tab).
(.*? ALL-..) - captures as few characters as possible, followed by a space, the literal ALL-, and any two characters.
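The difference the lookahead makes can be checked side by side. This is only a sketch: the naive pattern below is an assumed shape of the original attempt (plain lazy dots around the optional group), since the question's exact regex is not shown here, and s is a shortened version of the sample string.

```python
import re

# Shortened version of the sample input; "\\n" and "\\t" are literal
# backslash sequences here, exactly as in the original string.
s = ("from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\n"
     "School permissions from group 18:library to group 16(Temp):teacher:"
     "\\n\\tNo Allow ALL-00")

# Assumed shape of the failing attempt: plain lazy dots around (Temp)?.
naive = r'from group (\d+).*?(Temp)?.*?\\t(.*? ALL-..)'
# Pattern from this answer: the lazy runs are not allowed to cross "Temp".
fixed = r'from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*? ALL-..)'

naive_result = re.findall(naive, s)
fixed_result = re.findall(fixed, s)

print(naive_result)  # [('17', '', 'Allow ALL-00'), ('18', '', 'No Allow ALL-00')]
print(fixed_result)  # [('17', '', 'Allow ALL-00'), ('18', 'Temp', 'No Allow ALL-00')]
```

With the naive pattern the second lazy dot swallows Temp, so the optional group stays empty for group 18; the tempered version forces the engine to hand Temp to the capturing group.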
Hope this clarifies the regex. Let me know in case you have any further queries.
Sample Python code:
import re
s = '{"show permission allowed to 16": "show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\nSchool permissions from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\nSchool permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00\\nRTYAHY: FALSE\\nRTYAHY: FALSE\\n\\n#"}'
arr = re.findall(r'from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*? ALL-..)', s)
print(arr)
Prints:
[('17', '', 'Allow ALL-00'), ('18', 'Temp', 'No Allow ALL-00'), ('20', '', 'Check ALL-00')]
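If the one-line pattern is hard to read, the same regex can be written with re.VERBOSE so each piece carries its own comment. This is a sketch rather than a required change; the names PATTERN and sample are illustrative, and sample is a shortened version of the input.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE.
# In verbose mode unescaped spaces are ignored, so literal spaces are written as [ ].
PATTERN = re.compile(r"""
    from[ ]group[ ](\d+)   # literal text, then the group number (capture 1)
    (?:(?!Temp).)*?        # lazily skip characters without stepping onto Temp
    (Temp)?                # optionally capture the marker (capture 2)
    (?:(?!Temp).)*?        # same tempered skip after the marker
    \\t                    # a literal backslash followed by the letter t
    (.*?[ ]ALL-..)         # shortest run ending in ' ALL-' plus two characters (capture 3)
""", re.VERBOSE)

# Shortened sample; the backslash sequences are literal, as in the original string.
sample = ("from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\n"
          "School permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00")

verbose_result = PATTERN.findall(sample)
print(verbose_result)  # [('18', 'Temp', 'No Allow ALL-00'), ('20', '', 'Check ALL-00')]
```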
Edit: To list only the tuples that do not contain Temp, use the following regex, which refuses to match any substring that has Temp inside it:
from group (\d+)(?:(?!Temp).)*\\t(.*? ALL-..)
Sample Python code:
import re
str1 = '{"show permission allowed to 16": "show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\nSchool permissions from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\nSchool permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00\\nRTYAHY: FALSE\\nRTYAHY: FALSE\\n\\n#"}'
arr = re.findall(r'from group (\d+)(?:(?!Temp).)*\\t(.*? ALL-..)', str1)
print(arr)
Prints:
[('17', 'Allow ALL-00'), ('20', 'Check ALL-00')]
The result no longer contains the tuple for the group marked Temp.
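An alternative, if you would rather keep a single regex, is to reuse the three-group pattern and drop the Temp entries in Python afterwards. This is a sketch of that option, not part of the original approach; the variable names and the shortened input are illustrative.

```python
import re

# Shortened sample; the backslash sequences are literal, as in the original string.
s = ("from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\n"
     "School permissions from group 18:library to group 16(Temp):teacher:"
     "\\n\\tNo Allow ALL-00\\n"
     "School permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00")

pattern = r'from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*? ALL-..)'

matches = re.findall(pattern, s)

# Keep only entries whose optional Temp group came back empty, and drop that slot.
without_temp = [(group, perm) for group, temp, perm in matches if not temp]

print(without_temp)  # [('17', 'Allow ALL-00'), ('20', 'Check ALL-00')]
```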