How to split a very long regular expression in Python
5
I have a regular expression that is very long:
vpa_pattern = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'
My code to match the groups is as follows:
import re

class ReExpr:
    def __init__(self):
        self.string = None
        self.rematch = None

    def search(self, regexp, string):
        # Remember the last match so group() can read from it later.
        self.string = string
        self.rematch = re.search(regexp, self.string)
        return bool(self.rematch)

    def group(self, i):
        return self.rematch.group(i)

m = ReExpr()
# line is the input string being searched.
if m.search(vpa_pattern, line):
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
I tried to split the regular expression pattern across multiple lines in the following ways:
vpa_pattern = '(VAP) \
([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
(.*)'
I even tried:
vpa_pattern = re.compile(('(VAP) \
([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
(.*)'))
But neither of these methods works. Each group is followed by a space after its closing parenthesis, and I guess that space is not being picked up when I split the pattern across multiple lines.
3 Answers
8
Take a look at the re.X flag. It allows comments and ignores whitespace in the regex.
a = re.compile(r"""\d + # the integral part
\. # the decimal point
\d * # some fractional digits""", re.X)
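Applied to the pattern from the question, a minimal sketch could look like the following. The sample line is a made-up input, and each literal space is written as `[ ]` because re.X would otherwise ignore it (a backslash-escaped space should work as well).

```python
import re

# Verbose-mode version of the pattern from the question.
# Unescaped whitespace is ignored under re.X, so every literal space
# that must be matched is written as "[ ]".
vap_pattern = re.compile(r"""
    (VAP) [ ]                                          # group 1: the literal tag, then a space
    ( [0-9A-Fa-f]{2} : [0-9A-Fa-f]{2} : [0-9A-Fa-f]{2} :
      [0-9A-Fa-f]{2} : [0-9A-Fa-f]{2} : [0-9A-Fa-f]{2} ) # group 2: six hex pairs joined by colons
    : [ ]                                              # colon and space after the address
    (.*)                                               # group 3: rest of the line
    """, re.X)

sample = "VAP 00:1a:2B:3c:4D:5e: some text"  # hypothetical input line

m = vap_pattern.search(sample)
if m:
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
```

Note that `#` also starts a comment in verbose mode, so a literal `#` in the pattern would need to be escaped or placed in a character class.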
answered May 28 '14 at 13:05
+1 And it should also be noted that Python's r"""raw multi-line string""" syntax (used here) makes writing these self-documenting regexes much easier (because it completely avoids any backslash soup confusion). - corredor de crestas
3
Python allows you to write a string in several parts if they are enclosed in parentheses:
>>> text = ("alfa" "beta"
... "gama")
...
>>> text
'alfabetagama'
or in your code:
text = ("alfa" "beta"
"gama" "delta"
"omega")
print(text)
will print
alfabetagamadeltaomega
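For the pattern in the question, the same idea might be sketched like this. The way the pattern is cut into pieces is only one possible choice, and the comparison against the original one-line string is there to show that no characters are lost, including the spaces.

```python
import re

# Adjacent string literals inside parentheses are joined at compile time,
# so each piece can sit on its own line with its own comment.
vpa_pattern = (
    r"(VAP) "                                          # group 1, then a space
    r"([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:"  # group 2: first three hex pairs
    r"[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2})"   # group 2: last three hex pairs
    r": "                                              # colon and space after the address
    r"(.*)"                                            # group 3: rest of the line
)

# The one-line pattern from the question, for comparison.
original = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'
print(vpa_pattern == original)

sample = "VAP 00:1a:2B:3c:4D:5e: some text"  # hypothetical input line
m = re.search(vpa_pattern, sample)
if m:
    print(m.group(2))
```

Unlike a backslash continuation inside a single string literal, indentation between the pieces never ends up in the pattern, because it lies outside the quotes.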
answered Jan 21 '15 at 17:01
1
It's actually quite simple. You already use the {} notation. Use it again. So instead of:
'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):'
which is just a repeat of [0-9A-Fa-f]{2}: 6 times, you can use:
'([0-9A-Fa-f]{2}:){6}'
We can simplify it even further by using \d to represent digits:
'([\dA-Fa-f]{2}:){6}'
NOTE: Depending on which re function you use, you can pass in re.IGNORECASE and simplify that chunk down to [\da-f]{2}:
So your final regex is:
'(VAP) ([\dA-Fa-f]{2}:){6} (.*)'
answered May 23 '17 at 12:05
A repeating group only captures the last repetition. Instead, use a repeating non-capturing group inside a capturing group. Note also that OP's regex does not capture the last colon. - Janne Karila
If the OP's regex doesn't capture the final : then what is the : here: '...[0-9A-Fa-f]{2}): (.*)' doing? - RemolachaDemGuise
The () defines a group which the OP accesses as m.group(2). The last : is outside the parentheses. - Janne Karila
I see. They'll both recognize the same strings, though it seems the group structure may be different. - RemolachaDemGuise
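The difference discussed in the comments above can be checked with a short sketch. The sample line is a made-up input, and the second pattern is one possible reading of "a repeating non-capturing group inside a capturing group", not necessarily the exact form the commenter had in mind.

```python
import re

sample = "VAP 00:1a:2B:3c:4D:5e: some text"  # hypothetical input line

# Repeating a capturing group: both patterns accept this line, but here
# group 2 holds only the last repetition, trailing colon included.
repeated = re.search(r"(VAP) ([\dA-Fa-f]{2}:){6} (.*)", sample)
print(repeated.group(2))

# Non-capturing repeat (?:...) wrapped in a single capturing group:
# group 2 holds the whole address, and the final colon stays outside it.
nested = re.search(r"(VAP) ((?:[\dA-Fa-f]{2}:){5}[\dA-Fa-f]{2}): (.*)", sample)
print(nested.group(2))
```

With this input the first print shows only the final pair and its colon, while the second shows all six pairs, which is what the code in the question expects from m.group(2).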
What about a simpler regex like (VAP) ((?:[0-9A-Fa-f]{2}:){5}) (.*)? - Kiro