
Searching For A Specific Phrase In CSV File Using Regex In Python

I have a csv database of tweets, which I need to search for a list of specific phrases and words. For example, I'm searching for 'global warming'. I want to find not only 'global

Solution 1:

You can keep your own code with one very simple modification:

...

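for row in csv_read:
    # row is assumed to be a string here; a row from csv.reader is a list
    # and would need its fields joined before calling .lower()
    row_lower = row.lower()
    search_terms = ["global warming", "globalwarming"]

    # matching rows go to the first writer, everything else to the second
    if any(term in row_lower for term in search_terms):
        writer.writerow(row)
    else:
        writer2.writerow(row)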
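For reference, here is a minimal, self-contained sketch of the same substring check, assuming the rows come from csv.reader, so each row is a list of fields rather than a string. The sample data and the helper names row_matches and split_rows are made up for illustration; they are not from the question.

```python
import csv
import io

# Hypothetical sample data standing in for the tweet CSV.
SAMPLE_CSV = (
    'id,text\n'
    '1,"Global Warming is real"\n'
    '2,"I like #globalwarming memes"\n'
    '3,"Nice weather today"\n'
)

SEARCH_TERMS = ["global warming", "globalwarming"]


def row_matches(row, terms=SEARCH_TERMS):
    """Return True if any term occurs in any field of the row, ignoring case."""
    # Join with a newline so a term cannot match across two adjacent fields.
    text = "\n".join(row).lower()
    return any(term in text for term in terms)


def split_rows(csv_text):
    """Partition the CSV rows into (matched, unmatched) lists."""
    matched, unmatched = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        (matched if row_matches(row) else unmatched).append(row)
    return matched, unmatched


matched, unmatched = split_rows(SAMPLE_CSV)
```

With real files you would pass the rows in matched and unmatched to writer.writerow and writer2.writerow respectively, exactly as in the loop above.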

If you must use a regex, or you are afraid of missing rows such as "...global(more than one space)warming...", "..global____warming.." or "..global serious warming..", use a compiled pattern instead:

...

import re

global_regex = re.compile(r'global.*?warming', re.IGNORECASE)
for row in csv_read:
    # search() returns a match object, or None if the row does not match
    if global_regex.search(row):
        writer.writerow(row)
    else:
        writer2.writerow(row)

I compiled the regex outside the loop for better performance.
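Keep in mind that `global.*?warming` is lazy but still accepts any characters between the two words, so it also matches tweets where "global" and "warming" merely appear in that order. A rough comparison with a stricter pattern might look like this. The stricter pattern is my own assumption and not part of the original code, and the sample strings are invented:

```python
import re

# The pattern from the answer: anything (except a newline) may sit in between.
loose = re.compile(r'global.*?warming', re.IGNORECASE)
# Hypothetical stricter variant: only whitespace, punctuation or underscores
# are allowed between the two words.
tight = re.compile(r'global[\W_]*warming', re.IGNORECASE)

samples = [
    "Global   warming is here",
    "global____warming",
    "global serious warming",
    "Global markets fell; a warming trend followed",
]

loose_hits = [bool(loose.search(text)) for text in samples]
tight_hits = [bool(tight.search(text)) for text in samples]

for text, l_hit, t_hit in zip(samples, loose_hits, tight_hits):
    print(f"loose={l_hit!s:5} tight={t_hit!s:5} {text}")
```

The stricter pattern no longer matches "global serious warming", so which one fits depends on how many false positives you can tolerate. Note also that `.` does not match a newline unless re.DOTALL is passed, so a tweet with a line break between the two words would be missed by the loose pattern as written.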

Here you can see the regex in action.
