How To Use '?' To Extract Optional Substring Between Two Matching Pattern In Python?

I was answering this question. Consider this string str1 = '{'show permission allowed to 16': 'show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:

Solution 1:

Temp is not captured because you made that group optional: the engine can satisfy the optional group with an empty match and let .*? consume Temp instead, so Temp never ends up in the group.
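A minimal sketch of that behaviour, assuming the failing pattern had a lazy .*? on either side of (Temp)? (the question's exact regex is not shown here, so both the pattern and the shortened sample string are illustrative):

```python
import re

# Shortened, hypothetical sample. "\\t" in a normal Python string is a
# literal backslash followed by "t", as in the data from the question.
sample = 'from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00'

# Assumed shape of the failing pattern: lazy wildcards around an optional group.
naive = re.compile(r'from group (\d+).*?(Temp)?.*?\\t(.*? ALL-..)')

naive_result = naive.findall(sample)
print(naive_result)  # the middle item is '' although the text contains Temp
```

The first .*? matches nothing, (Temp)? then matches empty because Temp does not start right after the digits, and the second .*? runs across Temp on its way to \\t.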

To fix this, put a negative lookahead in front of the wildcard so that it can match any character but cannot run across Temp. The optional group then gets the chance to capture Temp when it is present:

from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*? ALL-..)
                   ^^^^^^^^^ Matches any single character, but only where the text Temp does not start

Regex explanation:

  • from group - matches this text literally
  • (\d+) - captures the one or more digits of the group number
  • (?:(?!Temp).)*? - (?: ... ) is a non-capturing group (plain parentheses would capture). Inside it, the negative lookahead (?!Temp) lets . match a character only if the text Temp does not start at that position, so the group matches any run of characters without stepping over Temp. * repeats it zero or more times, and the trailing ? makes the repetition lazy, i.e. it matches as few characters as possible
  • (Temp)? - optionally captures Temp if it is present at this point
  • (?:(?!Temp).)*? - again matches as few characters as possible without stepping over Temp, exactly as above
  • \\t - matches a literal backslash followed by t (the sample string contains the two characters \ and t, not a real tab)
  • (.*? ALL-..) - captures as few characters as possible, followed by a space, the literal ALL- and any two characters
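To see the (?:(?!Temp).) construct in isolation, here is a small sketch on two made-up strings. It uses the greedy * rather than the lazy *? from the answer, purely so that the stopping point is visible:

```python
import re

# (?:(?!Temp).)* consumes characters one at a time, but only while the
# text at the current position does not start with "Temp".
tempered = re.compile(r'(?:(?!Temp).)*')

no_temp = tempered.match('16:teacher:').group()
stops_early = tempered.match('16(Temp):teacher:').group()

print(repr(no_temp))      # the whole string, there is no "Temp" to stop at
print(repr(stops_early))  # everything up to, but not including, "Temp"
```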

Hope this clarifies the regex. Let me know in case you have any further queries.

Sample Python code:

import re

s = '{"show permission allowed to 16": "show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\nSchool permissions from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\nSchool permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00\\nRTYAHY: FALSE\\nRTYAHY: FALSE\\n\\n#"}'

# Each tuple is (group number, 'Temp' or '' if absent, permission text)
arr = re.findall(r'from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*?ALL-..)', s)
print(arr)

Prints:

[('17', '', 'Allow ALL-00'), ('18', 'Temp', 'No Allow ALL-00'), ('20', '', 'Check ALL-00')]
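If the positional tuples are hard to read, the same pattern could be written with re.VERBOSE and named groups. This is only a sketch: the group names src, temp and perm and the shortened sample string are assumptions made for illustration. Note that an optional group that did not take part in the match is reported as None by groupdict(), whereas findall() gives ''.

```python
import re

# Same regex as above, spread out with re.VERBOSE. In verbose mode plain
# spaces are ignored, so the literal spaces are written as [ ].
pattern = re.compile(r'''
    from[ ]group[ ](?P<src>\d+)   # group number after the text: from group
    (?:(?!Temp).)*?               # anything, without stepping over Temp
    (?P<temp>Temp)?               # optional marker
    (?:(?!Temp).)*?               # anything up to the tab marker
    \\t                           # literal backslash followed by t
    (?P<perm>.*?ALL-..)           # permission text ending in ALL-xx
''', re.VERBOSE)

# Shortened, hypothetical sample in the same shape as the original data.
sample = (
    'from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\n'
    'from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00'
)

rows = [m.groupdict() for m in pattern.finditer(sample)]
for row in rows:
    print(row)
```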

Edit: Listing only the tuples that do not contain Temp

To skip every entry that contains Temp, drop the optional group and keep the negative lookahead, so that a match can never span the text Temp:

from group (\d+)(?:(?!Temp).)*\\t(.*? ALL-..)

Sample Python code:

import re

str1 = '{"show permission allowed to 16": "show permission to 16\\nSchool permissions from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\nSchool permissions from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\nSchool permissions from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00\\nRTYAHY: FALSE\\nRTYAHY: FALSE\\n\\n#"}'

# Each tuple is (group number, permission text); entries containing Temp are skipped
arr = re.findall(r'from group (\d+)(?:(?!Temp).)*\\t(.*?ALL-..)', str1)
print(arr)

Prints:

[('17', 'Allow ALL-00'), ('20', 'Check ALL-00')]

The tuple for group 18, which contains Temp, is no longer in the result.
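An alternative sketch, not part of the original answer, is to keep the three-group pattern and do the filtering in Python. One reason to consider it: the greedy * in the two-group pattern backtracks to the last \\t it can reach before any Temp, so two consecutive entries without Temp between them would be merged into a single match. The sample string below is a shortened, hypothetical version of the data.

```python
import re

# Shortened, hypothetical sample in the same shape as the original data.
sample = (
    'from group 17:student to group 16:teacher:\\n\\tAllow ALL-00\\n'
    'from group 18:library to group 16(Temp):teacher:\\n\\tNo Allow ALL-00\\n'
    'from group 20:Gym to group 16:teacher:\\n\\tCheck ALL-00'
)

# Three-group pattern: the middle group is 'Temp' or '' for each entry.
pattern = re.compile(
    r'from group (\d+)(?:(?!Temp).)*?(Temp)?(?:(?!Temp).)*?\\t(.*?ALL-..)'
)

# Keep only the entries whose optional group did not capture anything.
without_temp = [
    (num, perm) for num, temp, perm in pattern.findall(sample) if not temp
]
print(without_temp)
```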
