
Python: Finding The Word That Shows Up The Most?

I'm trying to get my program to report the word that shows up the most in a text file. For example, if I type 'Hello I like pie because they are like so good' the program should pr

Solution 1:

I think you may want to do:

for word in textInput.split():

Currently, you are iterating over every character in textInput. To iterate over every word, you must first split the string into a list of words. By default, .split() splits on whitespace, but you can change this by passing a delimiter to split().
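As a quick illustration of the two ways of calling split(), here is a minimal sketch. The comma-separated string is just a made-up example, not something from the question:

```python
text = "Hello I like pie because they are like so good"

# No argument: split on any run of whitespace.
words = text.split()

# Explicit delimiter: split on commas instead (made-up sample data).
csv_words = "pie,cake,pie".split(",")

print(words[:3])   # ['Hello', 'I', 'like']
print(csv_words)   # ['pie', 'cake', 'pie']
```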

Also, you need to check if the word is in your dictionary, not in your original string. So try:

if word in word_counter:...

Then, to find the entry with the highest occurrences:

highest_word = ""
highest_value = 0

for k,v in word_counter.items():
  if v > highest_value:
    highest_value = v
    highest_word = k

Then, just print out the value of highest_word and highest_value.
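If you would rather not write the loop by hand, the same lookup can probably be done with the built-in max(). This is only a sketch, and the small word_counter dictionary here is sample data I made up for illustration:

```python
# Sample counts, made up for illustration.
word_counter = {"Hello": 1, "like": 2, "pie": 1}

# max() iterates over the keys and compares them by their counts.
highest_word = max(word_counter, key=word_counter.get)
highest_value = word_counter[highest_word]

print(highest_word)   # like
print(highest_value)  # 2
```

Note that, like the loop, this returns only one word when several are tied.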

To keep track of ties, just keep a list of the highest words. If we find a higher occurrence, clear the list and continue rebuilding. Here is the full program so far:

textInput = "He likes eating because he likes eating"
word_counter = {}
for word in textInput.split():
  if word in word_counter:
    word_counter[word] += 1
  else:
    word_counter[word] = 1

highest_words = []
highest_value = 0

for k,v in word_counter.items():
  # if we find a new highest value, create a new list,
  # add the entry and update the highest value
  if v > highest_value:
    highest_words = []
    highest_words.append(k)
    highest_value = v
  # else if the value is the same, add it
  elif v == highest_value:
    highest_words.append(k)

# print out the highest words
for word in highest_words:
  print(word)
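One thing the program above does not handle is capitalization: "He" and "he" are counted as different words. Here is a rough sketch of the same idea wrapped in a function that lowercases the text first. The name most_frequent_words is my own, and lowercasing is an assumption about what you want, not something the question asked for:

```python
def most_frequent_words(text):
    """Return (words, count) for the most frequent word(s) in text."""
    counts = {}
    # Lowercase first so "He" and "he" are counted together (an assumption).
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    if not counts:
        return [], 0
    highest = max(counts.values())
    # Keep every word that reaches the highest count, so ties are preserved.
    return [w for w, c in counts.items() if c == highest], highest


words, count = most_frequent_words("He likes eating because he likes eating")
for word in words:
    print(word)
```

With this input, "he", "likes" and "eating" all tie at two occurrences each.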

Solution 2:

Instead of rolling your own counter, a better idea is to use the Counter class from the collections module.

>>> input = 'blah and stuff and things and stuff'
>>> from collections import Counter
>>> c = Counter(input.split())
>>> c.most_common()
[('and', 3), ('stuff', 2), ('things', 1), ('blah', 1)]
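If you only need the top word, or want to keep ties as in the question, something along these lines should work. It is a sketch using most_common(1); the variable names are mine:

```python
from collections import Counter

text = "blah and stuff and things and stuff"
counts = Counter(text.split())

# most_common(1) returns a one-element list of (word, count) pairs.
top_word, top_count = counts.most_common(1)[0]

# Collect every word that shares the top count, in case of a tie.
tied = [w for w, c in counts.items() if c == top_count]

print(top_word)   # and
print(top_count)  # 3
print(tied)       # ['and']
```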

Also, as a general code style thing, please avoid adding comments like this:

option = 0

It makes your code less readable, not more.

Solution 3:

With Python 3.6+ you can use statistics.mode:

>>> from statistics import mode
>>> mode('Hello I like pie because they are like so good'.split())
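One caveat about ties: before Python 3.8, mode() raises StatisticsError when more than one value is most common, and from 3.8 on it returns the first one it encounters. If you want all of the tied words, statistics.multimode (added in Python 3.8) may be a better fit. A small sketch, assuming Python 3.8 or newer:

```python
from statistics import mode, multimode  # multimode needs Python 3.8+

words = "Hello I like pie because they are like so good".split()
print(mode(words))  # like

# With a tie, multimode returns every most-common word,
# in the order each was first seen.
tied_words = "He likes eating because he likes eating".split()
print(multimode(tied_words))  # ['likes', 'eating']
```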

Solution 4:

I'm not too keen on Python, but on your last print statement, shouldn't you have a %s?

i.e.: print("The word that showed up the most was: %s" % word)
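For what it's worth, %s only gets filled in when the value is supplied with the % operator; passing it as a second argument to print just prints both items. A small sketch, where the value of word is made up for the example:

```python
word = "like"  # made-up value for the example

# Old-style formatting: the % operator substitutes word for %s.
old_style = "The word that showed up the most was: %s" % word

# str.format does the same thing with {} placeholders.
new_style = "The word that showed up the most was: {}".format(word)

print(old_style)  # The word that showed up the most was: like
```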
