Counting letters in a string in Python

I have to write a function, countLetters(word), that takes in a word as argument and returns a list that counts the number of times each letter appears. The letters must be sorted in alphabetical order.

This is my attempt:

def countLetters(word):
    x = 0
    y = []
    for i in word:
        for j in range(len(y)):
            if i not in y[j]:
                x = (i, word.count(i))
                y.append(x)
    return y

I first tried it without the if i not in y[j]

countLetters("google")

the result was

[('g', 2), ('o', 2), ('o', 2), ('g', 2), ('l', 1), ('e', 1)] 

when I wanted

[('e', 1), ('g', 2), ('l', 1), ('o', 2)]

When I added the if i not in y[j] filter, it just returned an empty list [].

Could someone please point out my error here?

asked May 28, 2014 at 12:05

7 Answers

I recommend the collections module's Counter if you're on Python 2.7+:

>>> import collections
>>> s = 'a word and another word'
>>> c = collections.Counter(s)
>>> c
Counter({' ': 4, 'a': 3, 'd': 3, 'o': 3, 'r': 3, 'n': 2, 'w': 2, 'e': 1, 'h': 1, 't': 1})

You can do the same in any version of Python with an extra line or two:

>>> c = {}
>>> for i in s: 
...     c[i] = c.get(i, 0) + 1

This would also be useful to check your work.

To sort in alphabetical order (the Counter output above is sorted by frequency):

>>> for letter, count in sorted(c.items()):
...     print('{letter}: {count}'.format(letter=letter, count=count))
... 
 : 4
a: 3
d: 3
e: 1
h: 1
n: 2
o: 3
r: 3
t: 1
w: 2

or to keep in a format that you can reuse as a dict:

>>> import pprint
>>> pprint.pprint(dict(c))
{' ': 4,
 'a': 3,
 'd': 3,
 'e': 1,
 'h': 1,
 'n': 2,
 'o': 3,
 'r': 3,
 't': 1,
 'w': 2}

Finally, to get that as a list:

>>> pprint.pprint(sorted(c.items()))
[(' ', 4),
 ('a', 3),
 ('d', 3),
 ('e', 1),
 ('h', 1),
 ('n', 2),
 ('o', 3),
 ('r', 3),
 ('t', 1),
 ('w', 2)]

answered May 30, 2014 at 22:05

Most pythonic way. If you want to define a closed alphabet in your problem, you could use a filter function before the Counter. - pierre tupin

Using Counter.most_common is preferable. The sort is hidden, it's possible to get only the n most_common and it's easy to iterate over it. - pierre tupin

Yep, but the questioner asked for it: "The letters must be sorted in alphabetical order." - Russia Must Remove Putin

Oh ok, I didn't see this part. So the alphabet has to be defined more strictly: filter before the Counter; use a key function in sorted to avoid strange results (where do the accented characters go?); and apply other operations if an unusual alphabet is used (non-ASCII text encoded as UTF-8, for example). - pierre tupin
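The filtering idea from these comments could be sketched as follows. This is only an illustration: the helper name `count_alpha`, and the choice of `str.isalpha` plus lowercasing as the definition of the "alphabet", are assumptions and not something the commenters specified.

```python
from collections import Counter

def count_alpha(text):
    # Hypothetical helper: keep only alphabetic characters and fold case
    # before counting, so spaces, digits and punctuation are ignored.
    letters = (ch.lower() for ch in text if ch.isalpha())
    # Sort the (letter, count) pairs alphabetically by letter.
    return sorted(Counter(letters).items())

print(count_alpha("A word and another word"))
```

Note that `sorted` compares code points here, so an accented letter such as 'é' would land after 'z'; a locale-aware key function would be needed to place it elsewhere.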

I think the problem lies in your outer for loop, as you are iterating over each letter in the word.

If the word contains a letter more than once, for example "bees", the loop will count the 'e's twice, because the for loop visits every character and does not skip letters it has already seen. Looking at how iteration over a string works might clarify this. I'm not sure this will solve your problem, but it is the first thing I noticed.

Maybe you could try something like this:

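check_string = "google"  # example input (assumed; not defined in the original snippet)
tally = {}
for s in check_string:
    if s in tally:  # dict.has_key() was removed in Python 3
        tally[s] += 1
    else:
        tally[s] = 1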

and then you can just retrieve the tally for each letter from that dictionary.
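One way to retrieve the counts from such a tally, and to turn it into the alphabetically sorted list the question asks for, is sketched below. The function name `tally_letters` is an assumption made for illustration; the membership test uses the `in` operator, which works on both Python 2 and Python 3.

```python
def tally_letters(check_string):
    # Build the letter -> count dictionary in a single pass.
    tally = {}
    for s in check_string:
        if s in tally:
            tally[s] += 1
        else:
            tally[s] = 1
    return tally

tally = tally_letters("google")
print(tally["o"])             # count for a single letter
print(sorted(tally.items()))  # (letter, count) pairs in alphabetical order
```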

answered May 28, 2014 at 13:05

Your list y is always empty, so you never get inside the loop for j in range(len(y)).

P.S. your code is not very pythonic
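To spell this out: while `y` is empty, `range(len(y))` is `range(0)`, so the inner loop body, which is the only place that appends to `y`, never runs. Below is a minimal sketch of one possible fix that keeps the shape of the original function; the list `seen` is a name introduced here for illustration.

```python
def countLetters(word):
    seen = []  # letters already counted (hypothetical helper list)
    y = []
    for i in word:
        if i not in seen:
            seen.append(i)
            y.append((i, word.count(i)))
    # Sort the (letter, count) tuples alphabetically by letter.
    return sorted(y)

print(countLetters("google"))
```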

answered May 28, 2014 at 12:05

Works fine with latest Py3 and Py2

from collections import Counter

def countItems(iterable):  # renamed from iter to avoid shadowing the builtin
    return sorted(Counter(iterable).items())
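A quick usage sketch: because `Counter` accepts any iterable of hashable items, the same function should work for strings and for lists alike. The function is repeated here (with the parameter named `items`, an arbitrary choice) so that the snippet runs on its own.

```python
from collections import Counter

def countItems(items):
    # Count every element, then sort the (element, count) pairs.
    return sorted(Counter(items).items())

print(countItems("google"))
print(countItems(["b", "a", "b"]))
```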

answered May 28, 2014 at 13:05

Using a dictionary, and pprint from the answer of @Aaron Hall:

import pprint
def countLetters(word):
    y = {}
    for i in word:
        if i in y:
            y[i] += 1
        else:
            y[i] = 1
    return y

res1 = countLetters("google")
pprint.pprint(res1)

res2 = countLetters("Google")
pprint.pprint(res2)

Output:

{'e': 1, 'g': 2, 'l': 1, 'o': 2}

{'G': 1, 'e': 1, 'g': 1, 'l': 1, 'o': 2}

answered May 23, 2017 at 12:05

I am not sure what your expected output is. According to the problem statement, it seems you should sort the word first to get the letter counts in sorted order. The code below may be helpful:

def countLetters(word):
    letter = []
    cnt = []
    for c in sorted(word):
        if c not in letter:
            letter.append(c)
            cnt.append(1)
        else:
            cnt[-1] += 1
    return list(zip(letter, cnt))  # list() so Python 3 returns a list, not a zip iterator

print(countLetters('hello'))

this will give you [('e', 1), ('h', 1), ('l', 2), ('o', 1)]

answered May 28, 2014 at 13:05

The complexity is not good. If you sort the input, the complexity is a function of the input length. If you sort the alphabet, the complexity is a function of the alphabet length (which is generally smaller than the input). - pierre tupin

The complexity mostly lies in sorting the word, which is O(NlogN). Then you only need to iterate the word once to get the result, regardless of alphabet length, because it is always appending or modifying at the end, which is O(N). If you are using a dictionary, you only need O(N) to setup the letter:cnt mapping that's good, but if you need sorted letter:cnt pair, you'll still need to sort it, which is O(NlogN), so the complexity is just the same, that's my opinion. - dguan

N is the length of the collection you sort. In the first case N = length(sentence). In the second case N = length(alphabet). An alphabet is generally short and finite. A sentence may have length 5 or length 5000000. So no, the complexity is not the same. - pierre tupin

yeah, you are absolutely right, although they are both O(N*logN), there could be a huge difference. Thanks. - dguan
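The difference discussed in these comments can be sketched as follows. Both functions return the same result, but the first sorts all N characters of the input while the second sorts only the distinct characters. The function names are made up for this illustration, and `itertools.groupby` is used here in place of the hand-written loop from the answer.

```python
from collections import Counter
from itertools import groupby

def count_sorting_input(text):
    # Sorts every character of the input: O(N log N) in the input length.
    return [(ch, len(list(group))) for ch, group in groupby(sorted(text))]

def count_sorting_alphabet(text):
    # Counts in one O(N) pass, then sorts only the distinct characters,
    # which costs O(K log K) for an alphabet of K symbols.
    return sorted(Counter(text).items())

sample = "hello" * 1000
assert count_sorting_input(sample) == count_sorting_alphabet(sample)
print(count_sorting_alphabet(sample))
```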

You can create a dict of characters first, and then a list of tuples:

text = 'hello'
my_dict = {x : text.count(x) for x in text}
my_list = [(key, my_dict[key]) for key in my_dict]
print(my_dict)
print(my_list)

{'h': 1, 'e': 1, 'l': 2, 'o': 1}
[('h', 1), ('e', 1), ('l', 2), ('o', 1)]
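Note that the output above is in order of first appearance rather than alphabetical order. A small variation, offered only as a sketch, iterates over `set(text)` so each letter is counted once and then sorts the resulting tuples:

```python
text = 'hello'
# set(text) yields each distinct character once, so count() is not
# called repeatedly for duplicated letters such as 'l'.
my_dict = {x: text.count(x) for x in set(text)}
# Sorting the items orders the tuples alphabetically by letter.
my_list = sorted(my_dict.items())
print(my_list)
```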

answered Sep 13, 2022 at 21:09
