Python: read data and build association rules

I have two columns in a text file, which I read into two separate lists in Python. What I want to do is count the occurrences of each pair and build association rules from the counts.

Example:

colA = [a,b,c,d,...]

colB = [c,y,d,e,...]

So far I have only managed to read the data into the two lists. What is the best way to count the occurrences and build the rules?

Code:

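pred = []
succ = []
# Each line holds one tab-separated pair: predecessor, then successor.
with open('arsample.txt') as f:  # the with block closes the file when done
    for line in f:
        lst = line.split('\t')
        pred.append(int(lst[0]))
        succ.append(int(lst[1]))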

The rules would look like this, sorted by probability in descending order:

P   S   Probability
---------------------
a > c   count(a>c)/n
...     ...

asked on November 8, '11 at 14:11

It's not exactly clear what you are trying to do. With the sample data you gave, what is your expected result? -

2 Answers

You can use a dictionary to create a mapping:

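mapping = {}

# Map each value in colA to the position of its first occurrence in colB.
# list.index() raises ValueError if the value is not in colB at all.
for key in colA:
  mapping[key] = colB.index(key)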

To count the occurrences, just use .count():

colA.count('a')

Note that the mapping breaks down if colB contains the same value more than once, because colB.index() only ever returns the position of the first match. You are trying to build a bijection between two lists that contain duplicates, which cannot work. Think of it like recovering x from x^2: both 2 and -2 give 4, so you cannot tell which input you started with.
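Since a one-to-one mapping is not possible with repeated values, one alternative is to count the (colA, colB) pairs directly. The following is only a sketch under assumptions: the sample lists are made up, build_rules is a hypothetical helper name, and "probability" is taken to mean count(pair)/n as in the question.

```python
from collections import Counter

# Hypothetical sample data standing in for the two columns.
colA = ['a', 'b', 'a', 'c', 'a']
colB = ['c', 'y', 'c', 'd', 'e']

def build_rules(preds, succs):
    """Return (pred, succ, probability) tuples, most frequent pair first."""
    pairs = list(zip(preds, succs))   # one (pred, succ) tuple per row
    n = len(pairs)
    counts = Counter(pairs)           # maps each distinct pair to its count
    # most_common() yields the pairs ordered by count, highest first.
    return [(p, s, c / n) for (p, s), c in counts.most_common()]

rules = build_rules(colA, colB)
for p, s, prob in rules:
    print(f"{p} > {s}   {prob:.2f}")
```

Counting tuples avoids the duplicate problem entirely, because repeated values simply increase the count for that pair.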

answered Nov 8, 11:18

colB has elements with the same name so this is a problem. - user366121

You can't construct a mapping. - Licuadora

Take a look at sets:

 http://docs.python.org/library/sets.html

They let you reduce a list to its distinct values:

>>> a = [1,2,2,5,4,5,4,2,1,3]
>>> set(a)
set([1, 2, 3, 4, 5])
>>>

So you will have to build the pairs in a list of strings, I guess...
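As a rough illustration of that idea, the pairs can be built with zip and reduced to their distinct values with set. This sketch uses tuples rather than strings, since tuples of hashable values can be set members too; the sample numbers and variable names are assumptions, not the asker's data.

```python
# Hypothetical sample data standing in for the two columns.
colA = [1, 2, 2, 5, 2]
colB = [3, 4, 4, 1, 3]

pairs = list(zip(colA, colB))   # [(1, 3), (2, 4), (2, 4), (5, 1), (2, 3)]
distinct = set(pairs)           # each pair appears once here
n = len(pairs)

# list.count() rescans the whole list for every distinct pair, which is
# fine for small inputs but gets slow as the data grows.
probabilities = {pair: pairs.count(pair) / n for pair in distinct}

print(probabilities[(2, 4)])
```

The set alone only gives the distinct pairs; the counts still have to come from the original list.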

Hope this helps.

answered Nov 8, 11:18

Looks good, but I still need the count of the pairs; otherwise I cannot calculate the probability. - user366121

Then you'd probably be better off with itertools: from itertools import groupby will give you the sets and the counts. - Luis
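A possible reading of the groupby suggestion is sketched below. Note that groupby only groups adjacent equal items, so the pairs are sorted first; the sample lists and variable names are assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical sample data standing in for the two columns.
colA = ['a', 'b', 'a', 'c', 'a']
colB = ['c', 'y', 'c', 'd', 'e']

# groupby() only merges neighbouring equal items, so sort the pairs first.
pairs = sorted(zip(colA, colB))
n = len(pairs)

# Each group is an iterator over identical pairs; count its members.
pair_counts = [(pair, sum(1 for _ in group)) for pair, group in groupby(pairs)]

# Turn the counts into (pred, succ, probability), highest probability first.
rules = sorted(
    ((p, s, c / n) for (p, s), c in pair_counts),
    key=lambda rule: rule[2],
    reverse=True,
)

for p, s, prob in rules:
    print(f"{p} > {s}   {prob:.2f}")
```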
