Run a process with string input and output

There are plenty of questions on here related to fork() and exec(). I have not found one that really makes the process of using them simple, though, and making programmers' lives simple is the goal.

I need a C++, Linux-friendly function that does the following:

string RunCommand(string command, string input){}

This function should be able to run a shell command, like grep, "pipe" the content of input into it, read the output, and return it. So if I would do the following at the command line:

ps -elf | grep somequerytext

I would in code do:

string psOutput = RunCommand("ps -elf","");
string grepOutput = RunCommand("grep somequerytext", psOutput);

*edit: The question is what is the best implementation of the RunCommand function.

*edit: popen was considered as a solution for simplicity, but popen restricts you to piping data in or piping data out, but not both.

asked Aug 28, 2011 at 04:08

What is the implementation of the RunCommand function? -

How would we know? You're the one who is writing it. Do you have a specific problem you need help with? -

Sounds like popen or system. -

@Patrick Yeah, like anon said, popen sounds like exactly what you want. -

I looked at popen, but you can either pipe output to it or get output from it, but not both (per the documentation I read anyway). According to our static analysis tools, system is prohibited, so that is off the table. I suspect the best implementation is a fork/exec/pipe combination, but I don't really understand their use. I'm reading up on them now, but it's late and I've got a deadline - help would be appreciated. -

3 Answers

It appears that you need a function to:

  • Create two pipes and fork.
  • The child process then does:
    • Duplicate appropriate descriptors of the pipes so that one file descriptor is standard input and one standard output
    • Close the pipe descriptors
    • Split up the command string into arguments
    • Run the command with the arguments
  • The parent (main) process then does:
    • Close the appropriate pipe file descriptors
    • Write the input string to the child and close the pipe to the child's standard input
    • Read the output string from the child
    • Close the pipe from the child's standard output
    • Wait for the child to die
  • When the child is dead, the main process can continue, returning the string that it read.
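
The outline above could be sketched roughly as follows. This is a minimal illustration, not a finished implementation: it splits the command on whitespace only (no quoting), does almost no error handling, ignores SIGPIPE, and writes all of the input before reading any output, so it can block if the data exceeds the pipe capacity. The names toChild and fromChild are illustrative.

```cpp
#include <sstream>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

std::string RunCommand(const std::string& command, const std::string& input)
{
    // Split the command on whitespace only (no quoting or escaping: a simplification).
    std::istringstream iss(command);
    std::vector<std::string> words;
    for (std::string w; iss >> w; ) words.push_back(w);
    if (words.empty()) return "";
    std::vector<char*> argv;
    for (std::string& w : words) argv.push_back(&w[0]);
    argv.push_back(nullptr);

    int toChild[2];    // parent writes, child reads it as standard input
    int fromChild[2];  // child writes it as standard output, parent reads
    if (pipe(toChild) < 0) return "";
    if (pipe(fromChild) < 0) { close(toChild[0]); close(toChild[1]); return ""; }

    pid_t pid = fork();
    if (pid < 0) {
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        return "";
    }
    if (pid == 0) {
        // Child: make the pipe ends stdin/stdout, then close the originals.
        dup2(toChild[0], STDIN_FILENO);
        dup2(fromChild[1], STDOUT_FILENO);
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        execvp(argv[0], argv.data());
        _exit(127);  // reached only if exec failed
    }

    // Parent: close the ends that belong to the child.
    close(toChild[0]);
    close(fromChild[1]);

    // Write the input, then close so the child sees end-of-file.
    size_t off = 0;
    while (off < input.size()) {
        ssize_t n = write(toChild[1], input.data() + off, input.size() - off);
        if (n <= 0) break;
        off += static_cast<size_t>(n);
    }
    close(toChild[1]);

    // Read everything the child writes until it closes its standard output.
    std::string output;
    char buf[4096];
    ssize_t n;
    while ((n = read(fromChild[0], buf, sizeof buf)) > 0)
        output.append(buf, static_cast<size_t>(n));
    close(fromChild[0]);

    waitpid(pid, nullptr, 0);  // wait for the child to die
    return output;
}
```

With this sketch, RunCommand("grep somequerytext", psOutput) behaves as the question describes, provided psOutput is small enough to fit in the pipe.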

The only potential problem with this outline is if the child writes output before it is finished reading its input, and it writes so much output that the pipe is full (they have a finite and usually quite small capacity). In that case, the processes will deadlock - the parent trying to write to the child, and the child trying to write to the parent, and both stuck waiting for the other to read some data. You can avoid that by having two threads in the parent, one processing the writing, the other processing the reading. Or you can use two child processes, one to run the command and one to write to the standard input, while the parent reads from the command's standard output into a string.
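
A sketch of the two-child variant, under a few assumptions: the function name RunCommandTwoChildren is hypothetical, the command is handed to sh -c instead of being split by hand, and error handling is minimal. One child runs the command, a second child only writes the input, and the parent only reads, so neither side can block the other.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical name; the command is run through "sh -c" (an assumption made for brevity).
std::string RunCommandTwoChildren(const std::string& command, const std::string& input)
{
    int toChild[2], fromChild[2];
    if (pipe(toChild) < 0) return "";
    if (pipe(fromChild) < 0) { close(toChild[0]); close(toChild[1]); return ""; }

    // First child: runs the command with the pipes as stdin/stdout.
    pid_t cmdPid = fork();
    if (cmdPid == 0) {
        dup2(toChild[0], STDIN_FILENO);
        dup2(fromChild[1], STDOUT_FILENO);
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        execl("/bin/sh", "sh", "-c", command.c_str(), (char*) 0);
        _exit(127);  // reached only if exec failed
    }

    // Second child: does nothing but feed the input string to the command.
    pid_t writerPid = fork();
    if (writerPid == 0) {
        close(toChild[0]); close(fromChild[0]); close(fromChild[1]);
        size_t off = 0;
        while (off < input.size()) {
            ssize_t n = write(toChild[1], input.data() + off, input.size() - off);
            if (n <= 0) break;
            off += static_cast<size_t>(n);
        }
        _exit(0);  // exiting closes the pipe, so the command sees end-of-file
    }

    // Parent: keeps only the read end of the output pipe.
    close(toChild[0]); close(toChild[1]); close(fromChild[1]);

    std::string output;
    char buf[4096];
    ssize_t n;
    while ((n = read(fromChild[0], buf, sizeof buf)) > 0)
        output.append(buf, static_cast<size_t>(n));
    close(fromChild[0]);

    if (cmdPid > 0) waitpid(cmdPid, nullptr, 0);
    if (writerPid > 0) waitpid(writerPid, nullptr, 0);
    return output;
}
```

A side effect of this design is that if the command exits without reading all of its input, the resulting SIGPIPE hits the writer child and not the parent.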

One of the reasons there isn't a standard function to do this is precisely the difficulty of deciding what are the appropriate semantics.

I've ignored error handling and signal handling issues; they add to the complexity of it all.

answered Aug 28, 2011 at 09:08

I would do the above (I have in the past), except that in the main process I would use select on the file descriptors to overcome the problem mentioned above. In addition, I would add a timing mechanism to prevent the main process from hanging indefinitely, and also a mechanism (int *) so the exit status can be returned. - Ed Heal

Two good points: the parent could indeed use one of the select() variants to avoid the deadlock, and returning an exit status would be beneficial. There was also no requirement in the original to provide special treatment for standard error output. I note that the deadlock will not occur if the input string is smaller than the size of a pipe buffer, which is traditionally 5120 bytes and often bigger, so it is relatively unlikely to be a problem (but a general purpose solution must handle it). Of course, select() has its own set of problems, too; notably, it modifies its input parameters. - Jonathan Leffler
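
A minimal sketch of the select() approach from these comments, with the optional int * for the exit status. The name RunCommandSelect and its signature are hypothetical, the command is run through sh -c for brevity, and the timeout and SIGPIPE handling are left out. Note how the fd_sets are rebuilt on every iteration, because select() modifies them.

```cpp
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical signature: exitStatus is an optional out-parameter.
std::string RunCommandSelect(const std::string& command, const std::string& input,
                             int* exitStatus = nullptr)
{
    int toChild[2], fromChild[2];
    if (pipe(toChild) < 0) return "";
    if (pipe(fromChild) < 0) { close(toChild[0]); close(toChild[1]); return ""; }
    pid_t pid = fork();
    if (pid < 0) {
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        return "";
    }
    if (pid == 0) {
        dup2(toChild[0], STDIN_FILENO);
        dup2(fromChild[1], STDOUT_FILENO);
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        execl("/bin/sh", "sh", "-c", command.c_str(), (char*) 0);
        _exit(127);  // reached only if exec failed
    }
    close(toChild[0]);
    close(fromChild[1]);
    int wfd = toChild[1], rfd = fromChild[0];
    // Non-blocking writes, so a nearly full pipe yields a short write instead of a stall.
    fcntl(wfd, F_SETFL, fcntl(wfd, F_GETFL) | O_NONBLOCK);
    if (input.empty()) { close(wfd); wfd = -1; }

    std::string output;
    size_t off = 0;
    char buf[4096];
    while (rfd >= 0) {
        // select() overwrites the sets, so they are rebuilt each time round.
        fd_set rset, wset;
        FD_ZERO(&rset);
        FD_ZERO(&wset);
        FD_SET(rfd, &rset);
        if (wfd >= 0) FD_SET(wfd, &wset);
        int maxfd = rfd > wfd ? rfd : wfd;
        if (select(maxfd + 1, &rset, &wset, nullptr, nullptr) < 0) {
            if (errno == EINTR) continue;
            break;
        }
        if (wfd >= 0 && FD_ISSET(wfd, &wset)) {
            ssize_t n = write(wfd, input.data() + off, input.size() - off);
            if (n > 0) off += static_cast<size_t>(n);
            bool fatal = n < 0 && errno != EAGAIN && errno != EINTR;
            if (fatal || off == input.size()) { close(wfd); wfd = -1; }  // child sees EOF
        }
        if (FD_ISSET(rfd, &rset)) {
            ssize_t n = read(rfd, buf, sizeof buf);
            if (n > 0) output.append(buf, static_cast<size_t>(n));
            else if (n == 0 || errno != EINTR) { close(rfd); rfd = -1; }
        }
    }
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);

    int status = 0;
    waitpid(pid, &status, 0);
    if (exitStatus) *exitStatus = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return output;
}
```

A timeout could be added by passing a struct timeval as the last argument to select() where this sketch passes nullptr.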

I figured this would be a sufficiently complex problem. Thanks for the input Jonathan and Ed. I think that will get me going down the right track. - PatrickV

@PatrickV: note that @vine'th suggests using sh -c [string command]; apart from the overhead of executing a shell which then itself forks to execute the command, it is a good suggestion (it saves splitting up the command and arguments in your program). - Jonathan Leffler

Hi Jonathan, thanks for the nice description. I think there is a single pipe (not two), which is shared by the two forked child processes. In the first process, the stdout descriptor is replaced (dup2ed) by the write fd of the pipe, and in the second process, the stdin descriptor is replaced by the read fd of the pipe (or am I missing something?) - vine'th

Before discussing the implementation of RunCommand, let us consider this code fragment:

string psOutput = RunCommand("ps -elf","");
string grepOutput = RunCommand("grep somequerytext", psOutput);

In the above code fragment, the problem is that the commands run sequentially rather than concurrently. (See Programming with POSIX Threads, p. 9.) For example, if ps -elf generates a huge amount of data, all of it is stored in psOutput before being passed to the next command. In a real shell pipeline, the processes run concurrently and data is passed through the pipe (with some buffering, of course), so there is no need to wait for one process to finish before starting the next.
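
To illustrate the concurrent arrangement, here is a minimal sketch with one pipe shared by two child processes. The helper name RunPipeline is hypothetical, each side is run through sh -c for brevity, and the consumer's output simply goes to the caller's standard output.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs "producer | consumer" with both processes alive at once.
// Returns the consumer's exit status, or -1 if it could not be determined.
int RunPipeline(const std::string& producer, const std::string& consumer)
{
    int fds[2];
    if (pipe(fds) < 0) return -1;

    // First child: its standard output becomes the write end of the pipe.
    pid_t first = fork();
    if (first == 0) {
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]); close(fds[1]);
        execl("/bin/sh", "sh", "-c", producer.c_str(), (char*) 0);
        _exit(127);  // reached only if exec failed
    }

    // Second child: its standard input becomes the read end of the pipe.
    pid_t second = fork();
    if (second == 0) {
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]); close(fds[1]);
        execl("/bin/sh", "sh", "-c", consumer.c_str(), (char*) 0);
        _exit(127);  // reached only if exec failed
    }

    // Parent: must close both ends, or the consumer never sees end-of-file.
    close(fds[0]); close(fds[1]);

    int status = -1;
    if (first > 0) waitpid(first, nullptr, 0);
    if (second > 0 && waitpid(second, &status, 0) == second && WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Called as RunPipeline("ps -elf", "grep somequerytext"), both commands run at the same time and no intermediate string is held in memory.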

I suggest looking at chapter 8, "Process Control" (p. 223), of Richard Stevens's Advanced Programming in the UNIX Environment for an implementation of system. Based on Stevens's code, a sample implementation of RunCommand would be as follows (just skeleton code, no error checking):

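/* needs <string>, <unistd.h> and <sys/wait.h> */
int RunCommand(string command)
{
    pid_t pid;
    int status;
    if ( ( pid = fork() ) < 0 ) return -1;
    else if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command.c_str(), (char*) 0);
        _exit(127);  /* reached only if execl fails */
    }
    /* The parent waits for the child */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}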

and then one would invoke the above function as:

string s("ps -elf | grep somequerytext");
int status = RunCommand(s);

The shell takes care of parsing its input and running the commands, setting up pipes between them. If you are interested in understanding how a shell is implemented, see "A Minishell example" in chapter 8, "Unix Processes", of Terrence Chan's Unix System Programming using C++. (Jonathan Leffler's answer pretty much describes a shell implementation!)

answered May 23, 2017 at 12:05

Why not use popen()? It's in the standard library, and very simple to use:

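FILE* f = popen("ps -elf | grep somequerytext", "r");
char buf[2048];
/* read at most sizeof buf - 1 bytes so the terminating '\0' stays in bounds */
buf[fread(buf, 1, sizeof buf - 1, f)] = '\0';
cout << buf;
pclose(f);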
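
If the output may be larger than one buffer, the same idea can be wrapped in a loop. This is only a sketch: the helper name ReadCommandOutput is hypothetical, and like any popen() call in "r" mode it captures output only and cannot feed an input string to the command.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper: returns everything the command writes to standard output.
std::string ReadCommandOutput(const std::string& command)
{
    std::string output;
    FILE* f = popen(command.c_str(), "r");
    if (!f) return output;  // popen failed; return an empty string

    char buf[2048];
    size_t n;
    // Keep reading until fread reports no more data (end-of-file or error).
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        output.append(buf, n);

    pclose(f);  // also waits for the command to finish
    return output;
}
```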

answered Aug 28, 2011 at 11:08

Thanks for the input Dave. That was my first attempt, but popen failed to work as expected. I believe it will not work with piped processes. popen("ps -elf"...) works fine, but popen("ps -elf | grep somequerytext"...) produces no data (while running the same command from the shell does). - PatrickV
