How to split a very long regular expression in Python

I have a very long regular expression.

 vpa_pattern = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'

My code to match the groups is as follows:

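    import re

    class ReExpr:
        """Small wrapper that remembers the last match object."""

        def __init__(self):
            self.string = None
            self.rematch = None

        def search(self, regexp, string):
            # Store the match so group() can be called afterwards.
            self.string = string
            self.rematch = re.search(regexp, self.string)
            return bool(self.rematch)

        def group(self, i):
            return self.rematch.group(i)

    m = ReExpr()

    # `line` is one line of the input being parsed.
    if m.search(vpa_pattern, line):
        print(m.group(1))
        print(m.group(2))
        print(m.group(3))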

I tried to split the regular expression pattern across multiple lines in the following ways:

vpa_pattern = '(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'

I even tried:

 vpa_pattern = re.compile(('(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'))

But neither approach works. There is a space between consecutive groups (after one closing parenthesis and before the next opening one), and I guess it is not being picked up correctly when I split the pattern across multiple lines.

asked May 28 '14 at 13:05

What about a simpler regex like (VAP) ((?:[0-9A-Fa-f]{2}:){5}) (.*)? -

3 Answers

Check out the re.X flag (also spelled re.VERBOSE). It allows comments and ignores whitespace in the regex.

a = re.compile(r"""\d +  # the integral part
               \.    # the decimal point
               \d *  # some fractional digits""", re.X)
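
Applied to the pattern from the question, this might look like the sketch below. One thing to watch: because re.X ignores unescaped whitespace, the literal spaces in the original pattern have to be written explicitly, here as [ ]. The sample line is a made-up input for illustration, not something taken from the question.

```python
import re

# Verbose (re.X) version of the pattern from the question.
# Unescaped whitespace is ignored in this mode, so each literal
# space is written as a one-character class: [ ]
vpa_pattern = re.compile(r"""
    (VAP) [ ]                   # group 1: the literal tag, then a space
    ( (?:[0-9A-Fa-f]{2}:){5}    # group 2: five hex pairs, each followed by a colon
      [0-9A-Fa-f]{2} )          #          plus the sixth pair, without a colon
    : [ ]                       # colon and space after the address
    (.*)                        # group 3: rest of the line
""", re.X)

line = "VAP 00:1a:2B:3c:4D:5e: some message"  # hypothetical sample input
m = vpa_pattern.search(line)
if m:
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
```

A compiled pattern can also be passed as the first argument of re.search, so it should work unchanged with the ReExpr wrapper from the question.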

answered May 28 '14 at 13:05

+1 And it should also be noted that Python's r"""raw multi-line string""" syntax (used here) makes writing these self-documenting regexes much easier (because it completely avoids any backslash soup confusion). - corredor de crestas

Python concatenates adjacent string literals, so you can write a string in parts across several lines if the parts are enclosed in parentheses:

>>> text = ("alfa" "beta"
... "gama")
>>> text
'alfabetagama'

or in your code:

text = ("alfa" "beta"
        "gama" "delta"
        "omega")
print(text)

will print

alfabetagamadeltaomega
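
For the regex in the question, the same idea could be sketched like this. The way the pattern is split into pieces is just one possible choice, and the sample line is a made-up input. Note that every space that belongs to the pattern sits inside the quotes, so the indentation of the source lines never leaks into the regex.

```python
import re

# Adjacent string literals inside the parentheses are joined into one
# string, so line breaks and indentation are not part of the pattern.
vpa_pattern = (
    r"(VAP) "
    r"([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:"
    r"[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): "
    r"(.*)"
)

# The single-line pattern from the question, for comparison.
original = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'
print(vpa_pattern == original)

line = "VAP 00:1a:2B:3c:4D:5e: some message"  # hypothetical sample input
m = re.search(vpa_pattern, line)
if m:
    print(m.group(2))
```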

answered Jan 21 '15 at 17:01

It's actually quite simple. You already use the {} notation. Use it again. So instead of:

'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):'

which is just a repeat of [0-9A-Fa-f]{2}: 6 times, you can use:

'([0-9A-Fa-f]{2}:){6}'

We can even simplify it further by using \d to represent digits:

'([\dA-Fa-f]{2}:){6}'

NOTE: Depending on which re function you use, you can pass in re.IGNORECASE and simplify that chunk down to [\da-f]{2}:
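
A minimal sketch of that flag on the single chunk (the name pair is only for illustration). Keep in mind that the flag applies to the whole pattern it is compiled with, so in the full regex the literal VAP would also match in lower case.

```python
import re

# One hex pair followed by a colon, matched case-insensitively.
pair = re.compile(r"[\da-f]{2}:", re.IGNORECASE)

print(bool(pair.match("3C:")))  # upper-case hex digit
print(bool(pair.match("3c:")))  # lower-case hex digit
print(bool(pair.match("3g:")))  # "g" is not a hex digit
```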

So your final regex is:

'(VAP) ([\dA-Fa-f]{2}:){6} (.*)'

answered May 23 '17 at 12:05

A repeating group only captures the last repetition. Instead, use a repeating non-capturing group inside a capturing group. Note also that OP's regex does not capture the last colon. - Janne Karila

If the OP's regex doesn't capture the final : then what is the : doing here: '...[0-9A-Fa-f]{2}): (.*)'? - RemolachaDemGuise

The () define a group which the OP accesses as m.group(2). The last : is outside the parentheses. - Janne Karila

I see. They'll both recognize the same strings, though it seems the group structure may be different. - RemolachaDemGuise
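
The difference in group structure discussed in these comments can be checked with a short sketch. The sample line is a made-up input, and the second pattern is one possible way to write the "non-capturing group inside a capturing group" that Janne Karila describes, not necessarily the exact form they had in mind.

```python
import re

line = "VAP 00:1a:2B:3c:4D:5e: some message"  # hypothetical sample input

# Repeated capturing group: both patterns match the line, but here
# group 2 only keeps the last repetition.
repeated = re.search(r"(VAP) ([\dA-Fa-f]{2}:){6} (.*)", line)
print(repeated.group(2))

# Non-capturing repetition wrapped in one capturing group: group 2 is
# the whole address, and the final colon stays outside the group.
nested = re.search(r"(VAP) ((?:[\dA-Fa-f]{2}:){5}[\dA-Fa-f]{2}): (.*)", line)
print(nested.group(2))
```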
