A tree structure with dynamic depth and data only in leaves

I have the following hierarchy:

- A
    - X : [1, 2, 3]
    - Y : [4, 5]
    - Z : [10, 11]
- B
    - X : [6, 7]
    - Y : [8]

What I want is for the following queries to give the following results:

get(A) ==> [1,2,3,4,5,10,11]
get(A,Y) ==> [4,5]
get(B) ==> [6,7,8]
get(B,X) ==> [6,7]

So far, it seems easy. I can accomplish this with a dictionary of dictionaries of lists, which could be a defaultdict(lambda: defaultdict(list)) in Python. However, what if I need to make it more generic and add another level, or another two levels? Something like:

- A
    - X
        - i  : [1]
        - ii : [2,3]
    - Y
        - i  : [4, 5]
    - Z
        - ii : [10, 11]
- B
    - X
        - ii  : [6]
        - iii : [7]
    - Y
        - i   : [8]

In this example, the first hierarchy is a "projection" of the second hierarchy where the last level is merged into the parent. So, all queries for the first hierarchy should give the same results.
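For reference, the fixed two-level version mentioned above might look like the sketch below. The names `index` and `get` are only illustrative, and merging leaves in sorted key order is an assumption made here so the results match the sample queries.

```python
from collections import defaultdict

# Two fixed levels: outer key -> inner key -> list of values.
index = defaultdict(lambda: defaultdict(list))

for outer, inner, values in [
    ("A", "X", [1, 2, 3]),
    ("A", "Y", [4, 5]),
    ("A", "Z", [10, 11]),
    ("B", "X", [6, 7]),
    ("B", "Y", [8]),
]:
    index[outer][inner].extend(values)


def get(outer, inner=None):
    """Return one leaf list, or every leaf under `outer` merged in key order."""
    if inner is not None:
        return list(index[outer][inner])
    merged = []
    for key in sorted(index[outer]):
        merged.extend(index[outer][key])
    return merged
```

The signature of `get` hard-codes exactly two levels, which is the limitation the deeper hierarchy runs into.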

Some sample queries for new level:

get(B, X, ii) ==> [6]
get(B,X) ==> [6,7]          (same query and result as before)

Please note that data lives only in leaf nodes. So, for an insertion, the whole path must be given:

insert(A, X, i, 20)

That also means we can pass the depth of the tree to the constructor of the data structure.

EDIT: I realized that I need validation of depth:

  • Insert operation : whole path must be given and the len(path) must be equal to depth
  • Get operation : a path "deeper" than the depth of the structure is not allowed

asked Jul 29 '12 at 02:07

Generally, the way one traverses and operates on elements in a tree is through recursion. You can easily adapt a list-flattening algorithm such as this one to iterate over a node's children.

Thanks for the keyword "flattening". It is easy to search on Google with it :)

3 Answers

from collections import defaultdict
def tree(): return defaultdict(tree)

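def get_(t):
    # Recursively collect every value stored below node t.
    L = []
    if isinstance(t, list):
        # Leaf: the list itself holds the data.
        L.extend(t)
    else:
        # Inner node: gather the values of every child.
        for k in t:
            L.extend(get_(t[k]))
    return sorted(L)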

t = tree()
t['A']['X']['i'] = [1]
t['A']['X']['ii'] = [2,3]
t['A']['Y']['i'] = [4,5]
t['A']['Z']['ii'] = [10,11]

t['B']['X']['ii'] = [6]
t['B']['X']['iii'] = [7]
t['B']['Y']['i'] = [8]

print(get_(t))
print(get_(t['A']))
print(get_(t['A']['X']))
print(get_(t['A']['X']['i']))
print(get_(t['B']['Y']['i']))

>>> 
[1, 2, 3, 4, 5, 6, 7, 8, 10, 11]
[1, 2, 3, 4, 5, 10, 11]
[1, 2, 3]
[1]
[8]
>>> 
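The answer indexes the tree directly, which matches neither the `get(A, Y)` nor the `insert(A, X, i, 20)` call style from the question, and reading a missing key such as `t['Q']` silently creates an empty node. One possible way to wrap the same recursive defaultdict in path-based helpers is sketched below. The helpers `insert`, `get` and `_leaves` are hypothetical and not part of the answer. The sketch takes the value first, as the third answer does, and merges leaves in sorted key order instead of sorting the values. It assumes every inserted path has the same length.

```python
from collections import defaultdict


def tree():
    """Autovivifying nested dict, as in the answer above."""
    return defaultdict(tree)


def insert(t, value, *path):
    """Append `value` to the leaf list at `path` (needs at least one key)."""
    node = t
    for key in path[:-1]:
        node = node[key]  # creates inner nodes on demand
    # setdefault bypasses the default factory, so the leaf becomes a list.
    node.setdefault(path[-1], []).append(value)


def get(t, *path):
    """Collect all leaf values under `path` without creating missing nodes."""
    node = t
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return list(_leaves(node))


def _leaves(node):
    """Yield leaf values below `node`, visiting children in sorted key order."""
    if isinstance(node, list):
        for item in node:
            yield item
    else:
        for key in sorted(node):
            for item in _leaves(node[key]):
                yield item
```

With this, `get(t, 'B', 'X')` and `get(t, 'B', 'X', 'ii')` share one code path, and querying an unknown key returns an empty list without modifying the tree.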

answered Jul 29 '12 at 14:07

Take a look at this:

>>> A = Tree()
>>> B = Tree()
>>> A.insert_subtree("x", Leaf([1, 2, 3]))
>>> A.insert_subtree("y", Leaf([10, 20, 30]))
>>> B.insert_subtree("y", Leaf([100, 101, 102]))
>>> root = Tree({'A': A, 'B': B})
>>> root.get("A")
[1, 2, 3, 10, 20, 30]
>>> root.get("A", "x")
[1, 2, 3]
>>> root.insert("A", "x", 4)
>>> root.get("A", "x")
[1, 2, 3, 4]
>>> root.get("A")
[1, 2, 3, 4, 10, 20, 30]
>>> root.get("B")
[100, 101, 102]

Here is the code that makes it work:

class Leaf(object):
    def __init__(self, data=None):
        self.data = data[:] if data else []

    def __iter__(self):
        for item in self.data:
            yield item

    def insert(self, value):
        self.data.append(value)


class Tree(object):
    def __init__(self, trees=None):
        self.trees = dict(trees) if trees else {}

    def insert_subtree(self, name, tree):
        if name in self.trees:
            raise TreeAlreadyExists()

        self.trees[name] = tree

    def get(self, *args):
        child_name, rest = args[0], args[1:]
        child = self._get_child(child_name)

        if len(rest):
            return child.get(*rest)
        else:
            return [item for item in child]

    def _get_child(self, name):
        if name not in self.trees:
            raise KeyError("Child %s does not exist" % name)
        return self.trees[name]

    def insert(self, *args):
        child_name, rest = args[0], args[1:]
        child = self._get_child(child_name)
        child.insert(*rest)

    def __iter__(self):
        for key in sorted(self.trees.keys()):
            for item in self.trees[key]:
                yield item

class TreeAlreadyExists(Exception):
    pass

answered Jul 29 '12 at 12:07

I wrote this class based on the idea of @black_dragon, with support for depth validation. Here is how to use it (copied from the test case):

def test_index_with_sample_case_for_depth_2(self):
    idx = HierarchicalIndex(2)

    # A
    idx.insert(1, 'A', 'X')
    idx.insert(2, 'A', 'X')
    idx.insert(3, 'A', 'X')

    idx.insert(4, 'A', 'Y')
    idx.insert(5, 'A', 'Y')

    idx.insert(10, 'A', 'Z')
    idx.insert(11, 'A', 'Z')

    #B
    idx.insert(6, 'B', 'X')
    idx.insert(7, 'B', 'X')

    idx.insert(8, 'B', 'Y')

    assert_that(idx.get('A'), equal_to([1, 2, 3, 4, 5, 10, 11]))
    assert_that(idx.get('A', 'Y'), equal_to([4, 5]))
    assert_that(idx.get('B'), equal_to([6, 7, 8]))
    assert_that(idx.get('B', 'X'), equal_to([6, 7]))


def test_index_with_sample_case_for_depth_3(self):
    idx = HierarchicalIndex(3)

    # A
    idx.insert(1, 'A', 'X', 'i')
    idx.insert(2, 'A', 'X', 'ii')
    idx.insert(3, 'A', 'X', 'ii')

    idx.insert(4, 'A', 'Y', 'i')
    idx.insert(5, 'A', 'Y', 'ii')

    idx.insert(10, 'A', 'Z', 'ii')
    idx.insert(11, 'A', 'Z', 'iii')

    #B
    idx.insert(6, 'B', 'X', 'ii')
    idx.insert(7, 'B', 'X', 'iii')

    idx.insert(8, 'B', 'Y', 'i')

    #same queries with case for depth 2
    assert_that(idx.get('A'), equal_to([1, 2, 3, 4, 5, 10, 11]))
    assert_that(idx.get('A', 'Y'), equal_to([4, 5]))
    assert_that(idx.get('B'), equal_to([6, 7, 8]))
    assert_that(idx.get('B', 'X'), equal_to([6, 7]))

    #new queries
    assert_that(idx.get('B', 'X', 'ii'), equal_to([6]))
    assert_that(idx.get('A', 'X', 'ii'), equal_to([2, 3]))

And validation of depth:

def test_index_should_validate_depth_in_operations(self):
    # ....
    # depth=3
    idx = HierarchicalIndex(3)

    assert_that(idx.get('A'), has_length(0))
    assert_that(idx.get('A', 'X'), has_length(0))
    assert_that(idx.get('A', 'X', 'i'), has_length(0))
    self.assertRaises(AssertionError, lambda: idx.get('A', 'X', 'i', '1'))

    self.assertRaises(AssertionError, lambda: idx.insert(1))
    self.assertRaises(AssertionError, lambda: idx.insert(1, 'A'))
    self.assertRaises(AssertionError, lambda: idx.insert(1, 'A', 'X'))
    idx.insert(1, 'A', 'X', 'i')        # should not raise anything
    self.assertRaises(AssertionError, lambda: idx.insert(1, 'A', 'X', 'i', 'a'))

    assert_that(idx.get('A', 'X', 'i'), equal_to([1]))
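The class itself is not reproduced in this answer, so the following is only a minimal sketch of something that would satisfy the tests above, not the author's actual implementation. It assumes nested plain dicts with lists at the last level, and it raises AssertionError for depth violations because that is what the tests expect. The `_leaves` helper and the sorted-key merge order are assumptions of this sketch.

```python
class HierarchicalIndex(object):
    """Fixed-depth tree of nested dicts; values live only in leaf lists."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self._root = {}

    def insert(self, value, *path):
        # The whole path must be given. AssertionError is raised explicitly
        # (not via `assert`) so the check also runs under `python -O`.
        if len(path) != self.depth:
            raise AssertionError("insert needs exactly %d keys" % self.depth)
        node = self._root
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node.setdefault(path[-1], []).append(value)

    def get(self, *path):
        # Shorter paths are fine; deeper ones are rejected.
        if len(path) > self.depth:
            raise AssertionError("path is deeper than %d" % self.depth)
        node = self._root
        for key in path:
            if key not in node:
                return []
            node = node[key]
        return list(self._leaves(node, self.depth - len(path)))

    def _leaves(self, node, levels_below):
        """Yield values below `node`, visiting children in sorted key order."""
        if levels_below == 0:
            for item in node:  # node is a leaf list here
                yield item
        else:
            for key in sorted(node):
                for item in self._leaves(node[key], levels_below - 1):
                    yield item
```

Because the depth is fixed in the constructor, the code never has to inspect a node to tell a leaf from an inner node: the number of remaining levels decides it.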

answered Jul 29 '12 at 17:07

Well, I didn't like the name of the data structure, HierarchicalIndex. A better name? - Ali Ok
