Extract specific data from a text file in Perl

I am new to Perl and am trying to extract specific data from a file, which looks like this:

 Print of   9 heaviest strained elements:    


   Element no   Max strain 
      20004         9.6 % 
      20013         0.5 % 
      11189         0.1 % 
      20207         0.1 % 
      11157         0.1 % 
      11183         0.0 % 
      10665         0.0 % 
      20182         0.0 % 
      11160         0.0 % 


 ==================================================

I would like to extract the element numbers only (20004, 20013 etc.) and write these to a new file. The reading of the file should end as soon as the line (=========) is reached, as there are more element numbers with the same heading later on in the file. Hope that makes sense. Any advice much appreciated!

I now have this code, which gives me a list of the numbers, maximum 10 in a row:

my $StrainOut = "PFP_elem"."_$loadComb"."_"."$i";
open DATAOUT, ">$StrainOut" or die "can't open $StrainOut";  # Open the file for writing.

open my $in, '<', "$POSTout" or die "Unable to open file: $!\n";
my $count = 0;

 while(my $line = <$in>) {
  last if $line =~ / ={10}\s*/;
  if ($line =~ /% *$/) {
    my @columns = split "         ", $line;
    $count++;
    if($count % 10 == 0) {
      print DATAOUT "$columns[1]\n";
    }
    else {
      print DATAOUT "$columns[1] ";
    }      
  }
}
close (DATAOUT);
close $in;

What needs changing is the "my @columns = split..." line. At the moment it splits up the $line scalar whenever it has '9 spaces'. As the number of digits of the element numbers can vary, this is a poor way of extracting the data. Is it possible to just read from left to right, omitting all spaces and recording numbers only until the numbers are followed by more spaces (that way the percentage value is ignored)?

asked 02 Feb 12 at 11:02

This seems to work: my @columns = split(/\s+/, $line);
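To illustrate why the split pattern matters here, a minimal sketch using one data line from the sample in the question. With a regex pattern such as /\s+/, the leading whitespace produces an empty first field, so the element number sits at index 1 (which matches the $columns[1] used in the question). With the special pattern ' ', leading whitespace is skipped and the number sits at index 0. A capturing regex is shown as a third option; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One data line copied from the sample in the question.
my $line = "      20004         9.6 % \n";

# A regex pattern keeps the empty field produced by leading whitespace.
my @by_regex = split /\s+/, $line;
print "regex split: index 1 is '$by_regex[1]'\n";

# The special pattern ' ' skips leading whitespace before splitting.
my @by_space = split ' ', $line;
print "space split: index 0 is '$by_space[0]'\n";

# Alternative: capture the first run of digits directly.
my ($element) = $line =~ /^\s*(\d+)/;
print "regex capture: '$element'\n";
```

All three approaches cope with element numbers of any length, unlike splitting on a fixed run of nine spaces.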

5 Answers

#!/usr/bin/perl
use strict;
use warnings;

while (<>) {                        # read the file line by line
    if (/% *$/) {                   # if the line ends in a percent sign
        my @columns = split;        # create columns
        print $columns[0], "\n";    # print the first one
    }
    last if /={10}/;                # end of processing
}

Answered 02 Feb 12, 15:02

A one-liner using flip-flop:

perl -ne '
  if ( m/\A\s*(?i)element\s+no/ .. ($end = /\A\s*=+\s*\Z/) ) {
    printf qq[$1\n] if m/\A\s*(\d+)/;
    exit 0 if $end
  }
' infile

Output:

20004
20013
11189
20207
11157
11183
10665
20182
11160
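For readers unfamiliar with the flip-flop, here is a small sketch of the same idea written as a script. It assumes an abbreviated copy of the sample input held in an array, with one extra made-up data line (99999) placed after the separator to show that it is not reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Abbreviated sample input; the 99999 line is invented for illustration
# and sits after the separator, so it must not be reported.
my @lines = (
    " Print of   9 heaviest strained elements:\n",
    "   Element no   Max strain \n",
    "      20004         9.6 % \n",
    "      20013         0.5 % \n",
    " ==================================================\n",
    "      99999         1.0 % \n",
);

my @found;
for (@lines) {
    # In scalar context, `..` is false until the left test matches, then
    # stays true up to and including the line where the right test matches.
    if (/^\s*element\s+no/i .. /^\s*=+\s*$/) {
        push @found, $1 if /^\s*(\d+)/;
    }
}
print "$_\n" for @found;
```

Without the exit used in the one-liner (or a last in a script), a later "Element no" heading would switch the range back on, which is why the one-liner stops as soon as the closing line matches.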

Answered 02 Feb 12, 18:02

#!/usr/bin/perl
use strict;
use warnings;

while (my $f= shift) {
   open(F, $f) or (warn("While opening $f: $!", next);
   my foundstart=0;
  while(<F>) {
     ($foundstart++, next) if /^\s#Element/;
     last if /\s*=+/;
     print $_ if $foundstart;
  }
  $foundstart=0;
  close(F);
}

Answered 02 Feb 12, 15:02

It has compile errors. 1. A closing parenthesis is missing in the warn statement. 2. The foundstart variable must be declared as a scalar with $, and I think the next regex has a typo: # where * was intended. With those fixed, it prints the numbers but also the percentages in my test. - Birei
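Applying the fixes listed in the comment above, the script might look like the following sketch. It is not the answerer's own revision: the sub name element_numbers is hypothetical, the end-of-block regex has been tightened to match only lines made up of equals signs, and a capture has been added so that the percentages are dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the element numbers found between the "Element no" heading
# and the first line of equals signs in the given file.
sub element_numbers {
    my ($file) = @_;
    open(my $fh, '<', $file)
        or do { warn "While opening $file: $!"; return; };
    my $foundstart = 0;
    my @numbers;
    while (my $line = <$fh>) {
        if ($line =~ /^\s*Element/) { $foundstart = 1; next; }
        last if $line =~ /^\s*=+\s*$/;    # stop at the separator line
        # Keep only the leading integer, not the percentage column.
        push @numbers, $1 if $foundstart && $line =~ /^\s*(\d+)/;
    }
    close $fh;
    return @numbers;
}

# Process every file named on the command line.
print "$_\n" for map { element_numbers($_) } @ARGV;
```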

#!/usr/bin/perl
use strict;
use warnings;

open my $rh, '<', 'input.txt' or die "Unable to open file: $!\n";
open my $wh, '>', 'output.txt' or die "Unable to open file: $!\n";

while (my $line = <$rh>) {        
    last if $line =~ /^ ={50}/;
    next unless $line =~ /^ {6}(\d+)/;
    print $wh "$1\n";
}

close $wh;
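The script above writes one number per line, whereas the code in the question grouped the numbers ten per row. A possible way to combine the two is sketched below. The sub names format_rows and write_element_rows are hypothetical, and the input and output file names are assumed to be passed on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Joins numbers into rows of at most $per_row values separated by spaces.
sub format_rows {
    my ($per_row, @numbers) = @_;
    my @rows;
    while (my @chunk = splice(@numbers, 0, $per_row)) {
        push @rows, join(' ', @chunk);
    }
    return @rows;
}

# Reads element numbers up to the separator line and writes them
# to $out_name, ten per row.
sub write_element_rows {
    my ($in_name, $out_name) = @_;
    open my $rh, '<', $in_name  or die "Unable to open $in_name: $!\n";
    open my $wh, '>', $out_name or die "Unable to open $out_name: $!\n";
    my @numbers;
    while (my $line = <$rh>) {
        last if $line =~ /^\s*={10,}/;    # stop at the separator line
        # A data line is an integer followed by a percentage value.
        push @numbers, $1 if $line =~ /^\s*(\d+)\s+[\d.]+\s*%/;
    }
    print $wh "$_\n" for format_rows(10, @numbers);
    close $wh or die "Unable to close $out_name: $!\n";
    close $rh;
}

# Usage: perl script.pl input.txt output.txt
write_element_rows(@ARGV) if @ARGV == 2;
```

Collecting the numbers first and formatting them afterwards avoids the trailing space that the counter-based version leaves on an incomplete last row.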

Answered 02 Feb 12, 15:02

You could do it by running this one-liner in a command shell.

On *nix:

cat in_file.txt | perl -ne 'print "$1\n" if ( m/\s*(\d+)\s*\d+\.\d+/ )' > out_file.txt

On Windows:

type in_file.txt | perl -ne "print qq{$1\n} if ( m/\s*(\d+)\s*\d+\.\d+/ )" > out_file.txt

Answered 02 Feb 12, 16:02

He wants to stop reading the file when the line of equals signs is reached, though. - flesk

The cat is a waste of a process. The -n switch causes an iteration over the file names in @ARGV as if you had written LINE: while (<>) { ... - JRFerguson
