Java 6 regex: multiple matches of one group

Here is a simple pattern: [key]: [value1] [value2] [value3] [valueN]

I want to get:

  1. the key
  2. an array of values

Here is my regex: ^([^:]+):(:? ([^ ]+))++$

Here is my text: foo: a b c d

Matcher gives me 2 groups: foo (as key) and d (as values).

If I use +? instead of ++, I get a, not d.

So Java returns only the first (or last) occurrence of the group.

I can't use find() here because there is only one match.

What can I do except splitting regex into 2 parts and using find for the array of values? I have worked with regular expressions in many other environments and almost all of them have ability to fetch "first occurrence of group 1", "second occurrence of group 1" and so on.

How can I do this with java.util.regex in JDK 6?

Thanks.

asked Aug 27 '11 at 14:08

Can you please clarify the point about there being "only one match"? There's no way to capture an indeterminate number of matches like you're asking, so some iteration is required here.

It is already 2013 and there is still no decent solution to this problem! facepalm

2 Answers

The total number of match groups does not depend on the target string ("foo: a b c d", in your case), but on the pattern. Your pattern will always have 3 groups:

^([^:]+):(:? ([^ ]+))++$
 ^       ^   ^
 |       |   |
 1       2   3

The 1st group will hold your key, and the 2nd group, which matches the same text as group 3 plus the leading space, will always hold just one of your values. This is either the first value (in the case of the reluctant +?) or the last value (in the case of greedy matching).
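To make the point about the fixed group count concrete, here is a minimal sketch that runs the pattern exactly as written in the question (including the "(:? " group) against the sample text and prints every group. The class name GroupCountDemo and the constant QUESTION_PATTERN are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
  // Pattern copied verbatim from the question; the name is hypothetical.
  static final Pattern QUESTION_PATTERN =
      Pattern.compile("^([^:]+):(:? ([^ ]+))++$");

  public static void main(String[] args) {
    Matcher m = QUESTION_PATTERN.matcher("foo: a b c d");
    if (m.matches()) {
      // groupCount() is determined by the pattern, not by the input text.
      System.out.println("groupCount=" + m.groupCount());
      for (int i = 1; i <= m.groupCount(); i++) {
        // Brackets make the leading space in group 2 visible.
        System.out.println("group(" + i + ")=[" + m.group(i) + "]");
      }
    }
  }
}
```

With the possessive ++ this should report three groups, with groups 2 and 3 holding only the text captured by the last repetition (" d" and "d").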

What you could do is just match:

^([^:]+):\s*(.*)$

so that you have the following matches:

- group(1) = "foo"
- group(2) = "a b c d"

and then split the 2nd group on its whitespace to get all values:

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
  public static void main (String[] args) throws Exception {
    Matcher m = Pattern.compile("^([^:]+):\\s*(.*)$").matcher("foo: a b c d");
    if(m.find()) {
      String key = m.group(1);
      String[] values = m.group(2).split("\\s+");
      System.out.printf("key=%s, values=%s", key, Arrays.toString(values));
    }
  }
}

which will print:

key=foo, values=[a, b, c, d]
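If a plain whitespace split is too loose for the value format, the same two-step idea can be written with a second pattern and find(), which is the "split the regex into 2 parts" route the question mentions. This is only a sketch: the class KeyValues, its methods, and the value pattern \S+ are assumptions chosen for the sample input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValues {
  // Step 1: separate the key from the rest of the line.
  private static final Pattern LINE = Pattern.compile("^([^:]+):\\s*(.*)$");
  // Step 2: one value per find(); swap in a stricter pattern if needed.
  private static final Pattern VALUE = Pattern.compile("\\S+");

  /** Returns the key, or null if the line does not match. */
  static String key(String line) {
    Matcher m = LINE.matcher(line);
    return m.matches() ? m.group(1) : null;
  }

  /** Returns every value after the colon, in order of appearance. */
  static List<String> values(String line) {
    List<String> out = new ArrayList<String>();
    Matcher lineMatcher = LINE.matcher(line);
    if (lineMatcher.matches()) {
      Matcher valueMatcher = VALUE.matcher(lineMatcher.group(2));
      while (valueMatcher.find()) {
        out.add(valueMatcher.group());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String line = "foo: a b c d";
    System.out.printf("key=%s, values=%s%n", key(line), values(line));
  }
}
```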

answered Oct 18 '13 at 0:10

One, I think she meant (?: not (:?. But more importantly, this does get asked for a fair bit. I believe C# has a way of doing this. It might be useful to extend the API so that one could retrieve an array of matches for the Nᵗʰ group via group_array(N) or some such; you would need a new pattern compile flag to enable that, since it’s too expensive for general use. In Perl one could use arrays @1 and @2 instead of scalars $1 and $2, and even define $1 to mean $1[$#1] etc. Is that useful, wicked, or both? :) - tchrist

@tchrist, yeah, you could be right about the :? <-> ?:. I'm not very familiar with C#, and have never heard of this N-grouping feature (do you have a link to the MSDN docs for me?). And it would definitely be both useful and wicked! :) - Bart Kiers

What, make me pollute myself? :) This suggests it under Capture Collection and under Capture. I have trouble reading that, though. :) - tchrist

Is there any other engine that can capture all of the repeated group matches? It would be overkill to run regex matching over regex results. - Uko

@Uko, I don't know. Feel free to create a new question, of course. - Bart Kiers

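import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

String input = "foo: a b c d";
// Tokens are separated by whitespace, optionally preceded by the colon after the key.
Scanner s = new Scanner(input).useDelimiter(Pattern.compile(":?\\s+"));
String key = s.next();
List<String> values = new ArrayList<String>();
while (s.hasNext()) {
    values.add(s.next());
}
System.out.printf("key=%s, values=%s", key, values);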

Prints:

key=foo, values=[a, b, c, d]
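For completeness, a different technique from the Scanner code above: a single pattern driven by find(), where \G anchors each later match to the end of the previous one. This is a sketch under assumptions, not something either answer proposes; the class ContinuedMatch and the method parse are hypothetical names, and values are assumed to be runs of non-whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContinuedMatch {
  // First alternative: the key and its colon at the start of the input.
  // Second alternative: \G resumes exactly where the previous match ended;
  // (?!^) keeps it from firing at position 0 before a key has been seen.
  private static final Pattern TOKEN =
      Pattern.compile("(?:^([^:]+):|\\G(?!^))\\s+(\\S+)");

  /** Returns the key followed by its values, or an empty list if there is no key. */
  static List<String> parse(String line) {
    List<String> out = new ArrayList<String>();
    Matcher m = TOKEN.matcher(line);
    while (m.find()) {
      if (m.group(1) != null) {
        out.add(m.group(1)); // group 1 is only set on the first match
      }
      out.add(m.group(2));
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(parse("foo: a b c d"));
  }
}
```

Unlike the anchored ^...$ pattern, this does not validate the whole line: matching simply stops at the first token that does not fit, so a final check that the last match ended at the end of the input may be needed.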

answered Oct 18 '13 at 0:10
