Python yield generator variable scope

I am using yield to create a generator that returns chunks of a string that are being extracted using a regex and re.sub(). While I found an approach that worked, I am a bit confused about why it works one way but not another, as shown below:

This doesn't work (processchunk() is not assigning to the chunk declared in splitmsg):

def splitmsg(msg):
    chunk = None
    def processchunk(match):
        chunk = match.group(1)
        return ""
    while True:
        chunk = None
        msg = re.sub(reCHUNK,processchunk,msg,1)
        if chunk:
            yield chunk
        else:
            break     

This does work (note the only difference being chunk is now a list chunks):

def splitmsg(msg):
    chunks = [ None, ]
    def processchunk(match):
        chunks[0] = match.group(1)
        return ""
    while True:
        chunks[0] = None
        msg = re.sub(reCHUNK,processchunk,msg,1)
        if chunks[0]:
            yield chunks[0]
        else:
            break

My question is basically: why does the scoping of the chunk/chunks variable seem to depend on whether it is a plain variable or a list?

asked Aug 24 '12 at 21:08

possible duplicate of Python variable scope question

2 Answers

In Python, variables can be 'pulled' from the surrounding scope if they are only read from. So the following will work:

def foo():
    spam = 'eggs'
    def bar():
        # spam is only read here, so it is looked up in foo's scope
        print(spam)
    bar()
foo()

because the variable 'spam' is being looked up in the surrounding scope, the foo function.

However, you cannot change the value of a variable in a surrounding scope by assigning to it. You can change global variables (if you declare them as global in your function), but you cannot do that for the variable spam in the function above.

(Python 3 changes this: it adds a new keyword, nonlocal. If you declare spam as nonlocal inside bar, you can assign a new value to that variable inside bar.)
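For what it's worth, a minimal Python 3 sketch of the nonlocal case could look like this. It reuses the foo/bar/spam names from the example above; the value 'ham' is just a placeholder.

```python
def foo():
    spam = 'eggs'

    def bar():
        # Python 3 only: tell the compiler that spam refers to foo's variable
        nonlocal spam
        spam = 'ham'

    bar()
    return spam


print(foo())  # ham
```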

Now to your list. What happens there is that you are not altering the variable chunks at all. Throughout your code, chunks points to one list, and only to that list. As far as Python is concerned, the variable chunks is not altered within the processchunk function.

What happens is that you alter the contents of the list. You can freely assign a new value to chunks[0], because that is not the variable chunks itself; it is the first element of the list that chunks refers to. Python allows this because it is not a variable assignment, but a list manipulation instead.
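To make the difference between rebinding the name and mutating the list concrete, here is a small sketch. The names outer, box, rebind and mutate are invented for this example.

```python
def outer():
    box = [None]              # hypothetical stand-in for chunks
    original_id = id(box)

    def rebind():
        box = ['rebound']     # creates a new local name; outer's box is untouched

    def mutate():
        box[0] = 'mutated'    # box is only read, then the list it names is changed

    rebind()
    after_rebind = list(box)
    mutate()
    after_mutate = list(box)
    # The outer name still refers to the very same list object.
    return after_rebind, after_mutate, id(box) == original_id


print(outer())  # ([None], ['mutated'], True)
```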

So, your 'workaround' is correct, if somewhat obscure. If you use Python 3, you can declare chunk as nonlocal inside processchunk and then things will work without lists too.
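A possible Python 3 version of the original generator, using nonlocal instead of the one-element list, might look like the sketch below. Note that the question never shows reCHUNK, so the pattern used here (text between angle brackets) and the sample message are purely assumptions for the sake of a runnable example.

```python
import re

# ASSUMPTION: the real reCHUNK is not shown in the question; this pattern
# is only a placeholder that captures text between < and >.
reCHUNK = re.compile(r'<(.*?)>')


def splitmsg(msg):
    chunk = None

    def processchunk(match):
        nonlocal chunk            # rebind splitmsg's chunk, not a new local
        chunk = match.group(1)
        return ""

    while True:
        chunk = None
        # Replace only the first match on each pass.
        msg = re.sub(reCHUNK, processchunk, msg, count=1)
        if chunk:
            yield chunk
        else:
            break


print(list(splitmsg('a<one>b<two>c')))  # ['one', 'two']
```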

answered Aug 24 '12 at 21:08

@aaa90210: chunk = match.group(1) shadows the nonlocal variable chunk. - Steven Rumbalski

@aaa90210: chunk[0] = match.group(1) is syntactic sugar for chunk.__setitem__(0, match.group(1)). It's not really an assignment even though it looks like it. - Steven Rumbalski
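One way to see that item assignment is really a method call is to subclass list and record the calls. This is only a sketch; the Recorder class and its calls attribute are hypothetical names for this example.

```python
class Recorder(list):
    """Hypothetical list subclass that logs every __setitem__ call."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __setitem__(self, index, value):
        self.calls.append((index, value))
        super().__setitem__(index, value)


chunk = Recorder([None])
chunk[0] = 'abc'                # item assignment goes through __setitem__
chunk.__setitem__(0, 'def')     # the explicit spelling of the same operation
print(chunk.calls)  # [(0, 'abc'), (0, 'def')]
print(chunk)        # ['def']
```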

In the first case, you are creating a new local variable called chunk. A variable is treated as local to a function if you assign to it inside the function. In the second case, you are modifying the list referred to by the outer variable chunks. Because you don't assign to this variable, it's not treated as local. See for instance this earlier question.
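If you want to check how the compiler classified each name, the code object exposes it. The sketch below strips the two cases down to the bare minimum; first_case and second_case are hypothetical wrappers, and the inner functions take no arguments for brevity.

```python
def first_case():
    chunk = None

    def processchunk():
        chunk = 'x'        # assignment to the name: chunk is compiled as a local
    return processchunk


def second_case():
    chunks = [None]

    def processchunk():
        chunks[0] = 'x'    # the name is only read: chunks is a free variable
    return processchunk


a = first_case().__code__
b = second_case().__code__
print(a.co_varnames, a.co_freevars)  # ('chunk',) ()
print(b.co_varnames, b.co_freevars)  # () ('chunks',)
```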

Assigning to a bare name in Python (someName = ...) is a distinct operation: it binds that name in the current scope. In particular it is not the same as item assignment (someName[0] = ...), which calls methods under the hood to mutate the list.

answered May 23 '17 at 11:05
