How to read a counter from a Linux C program into a bash test script?
I have a large C/C++ program on a Suse linux system. We do automated testing of it with a bash script, which sends input to the program, and reads the output. It's mainly "black-box" testing, but some tests need to know a few internal details to determine if a test has passed.
One test in particular needs to know how many times the program runs a certain function (which parses a particular response message). When that function runs it issues a log and increments a counter variable. The automated test currently determines the number of invocations by grepping in the log file for the log message, and counting the number of occurrences before and after the test. This isn't ideal, because the logs (syslog-ng) aren't guaranteed, and they're frequently turned off by configuration, because they're basically debug logs.
I'm looking for a better alternative. I can change the program to enhance the testability, but it shouldn't be heavy impact to normal operation. My first thought was, I could just read the counter after each test. Something like this:
gdb --pid=$PID --batch -ex "p numServerResponseX"
That's slow when it runs, but it's good because the program doesn't need to be changed at all. With a little work, I could probably write a ptrace command to do this a little more efficiently.
But I'm wondering if there isn't a simpler way to do this. Could I write the counter to shared memory (with shm_open / mmap), and then read /dev/shm in the bash script? Is there some simpler way I could set up the counter to make it easy to read, without making it slow to increment?
Edit:
Details: The test setup is like this:
testScript <-> sipp <-> programUnderTest <-> externalServer
The bash testScript injects sip messages with sipp, and it generally determines success or failure based on the completion code from sipp. But in certain tests it needs to know the number of responses the program received from the external server. The function "processServerResponseX" processes certain responses from the external server. During the testing there isn't much traffic running, so the function is only invoked perhaps 20 times over 10 seconds. When each test ends and we want to check the counter, there should be essentially no traffic. However during normal operation, it might be invoked hundreds of times a second. The function is roughly:
unsigned long int numServerResponseX;
int processServerResponseX(DMsg_t * dMsg, AppId id)
{
    if (DEBUG_ENABLED)
    {
        syslog(priority, "%s received %d", __func__, (int) id);
    }
    myMutex->getLock();
    numServerResponseX++;
    doLockedStuff(dMsg, id);
    myMutex->releaseLock();
    return doOtherStuff(dMsg, id);
}
The script currently does:
grep processServerResponseX /var/log/logfile | wc -l
and compares the value before and after. My goal is to have this work even if DEBUG_ENABLED is false, and not have it be too slow. The program is multi-threaded, and it runs on an x86_64 SMP machine, so adding any long blocking function would not be a good solution.
4 Answers
3
I would have that certain function "(which parses a particular response message)" write (probably using fopen, then fprintf, then fclose) some textual data somewhere.
That destination could be a FIFO (see fifo(7) ...) or a temporary file in a tmpfs file system (which is a RAM file system), maybe /run/
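A minimal sketch of the temporary-file variant. The helper name publish_counter() and the path are hypothetical, not something from the question. It writes the value to a temporary name and then rename()s it into place, so a reader never sees a half-written file. Doing this on every call would probably be too slow at hundreds of calls per second, so it is better suited to being called occasionally or on request.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: publish a counter value as text in `path`.
 * The value is written to "<path>.tmp" first and then renamed over
 * `path`, so a reader (e.g. `cat` in a bash script) never sees a
 * partially written file. Returns 0 on success, -1 on failure. */
int publish_counter(const char *path, unsigned long value)
{
    char tmp[4096];
    FILE *out;
    int n;

    assert(path != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;              /* path too long */

    out = fopen(tmp, "w");
    if (!out)
        return -1;
    if (fprintf(out, "%lu\n", value) < 0) {
        fclose(out);
        remove(tmp);
        return -1;
    }
    if (fclose(out) != 0) {
        remove(tmp);
        return -1;
    }
    /* rename() replaces the destination atomically on POSIX systems. */
    return rename(tmp, path) == 0 ? 0 : -1;
}
```

With a file under a tmpfs mount, the bash script can then simply cat the file before and after the test and compare the two numbers.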
If your C++ program is big and complex enough, you could consider adding some probing facilities (some means for an external program to query the internal state of your C++ program), e.g. a dedicated web service (using libonion in a separate thread), or some interface to systemd or to D-Bus, or some remote procedure call service like ONC/RPC, JSON-RPC, etc.
You might be interested in POCOlib. Perhaps its logging framework would interest you.
As you mentioned, you might use POSIX shared memory & semaphores (see shm_overview(7) and sem_overview(7) ...).
Perhaps the Linux-specific eventfd(2) is what you need... (you could code a tiny C program to be invoked by your testing bash scripts...)
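To make the eventfd(2) idea a bit more concrete, here is a small sketch of its counter semantics: every write() adds an 8-byte value to a kernel-side counter, and a read() returns the accumulated total and resets it to zero. The function names are hypothetical, and the sketch stays within one process. Getting the descriptor to a separate helper program (by inheriting it across fork, or passing it over a Unix socket) is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Create a non-blocking eventfd counter starting at zero.
 * Returns the descriptor, or -1 on error. */
int counter_open(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Add one to the kernel-side counter. Returns 0 on success. */
int counter_bump(int fd)
{
    const uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/* Return the count accumulated since the previous drain and reset
 * it to zero. A non-blocking read fails with EAGAIN when the counter
 * is already zero; that case is reported as 0 here. */
uint64_t counter_drain(int fd)
{
    uint64_t total = 0;

    if (read(fd, &total, sizeof total) != (ssize_t)sizeof total)
        return 0;
    return total;
}
```

Note that reading is destructive: instead of comparing a before and an after value, the test would drain the counter before the test and drain it again afterwards.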
You could also try to change the command line (I forgot how to do that, maybe libproc or write to /proc/self/cmdline, see proc(5)...). Then ps would show it.
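I don't have a tested recipe for rewriting the command line itself, so the sketch below swaps in a related Linux facility: prctl(PR_SET_NAME) changes the name of the calling thread, which shows up in /proc/PID/comm and in ps -o comm= when called from the main thread. The helper name and the "rx=" prefix are made up for illustration. The kernel keeps only 15 characters, and renaming the process also affects tools that look it up by name (ps -C, killall), so treat this as a curiosity rather than a recommendation.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/prctl.h>

/* Hypothetical helper: expose a counter through the calling thread's
 * name. The kernel keeps at most 15 characters plus the terminator,
 * so the text is kept short ("rx=<value>"). Returns 0 on success,
 * -1 if the text would not fit or prctl() fails. */
int publish_in_thread_name(unsigned long value)
{
    char name[16];
    int n = snprintf(name, sizeof name, "rx=%lu", value);

    assert(n > 0);
    if ((size_t)n >= sizeof name)
        return -1;              /* would be truncated */
    return prctl(PR_SET_NAME, (unsigned long)name, 0UL, 0UL, 0UL);
}
```

If the main thread calls this, a script could read the value with something like cat /proc/$PID/comm and strip the prefix.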
answered May 22 '14 at 21:05
2
I personally usually use the methods Basile Starynkevitch outlined for this, but I wanted to bring up an alternative method using realtime signals.
I am not claiming this is the best solution, but it is simple to implement and has very little overhead. The main downside is that the size of the request and response are both limited to one int (or technically, anything representable by an int or by a void *).
Basically, you use a simple helper program to send a signal to the application. The signal has a payload of one int your application can examine, and based on it, the application responds by sending the same signal back to the originator, with an int of its own as payload.
If you don't need any locking, you can use a simple realtime signal handler. When it catches a signal, it examines the siginfo_t structure. If sent via sigqueue(), the request is in the si_value member of the siginfo_t structure. The handler answers to the originating process (si_pid member of the structure) using sigqueue(), with the response. This only requires about sixty lines of code to be added to your application. Here is an example application, app1.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <time.h>
#include <stdio.h>

#define INFO_SIGNAL (SIGRTMAX-1)

/* This is the counter we're interested in */
static int counter = 0;

static void responder(int signum, siginfo_t *info,
                      void *context __attribute__((unused)))
{
    if (info && info->si_code == SI_QUEUE) {
        union sigval value;
        int response, saved_errno;

        /* We need to save errno, to avoid interfering with
         * the interrupted thread. */
        saved_errno = errno;

        /* Incoming signal value (int) determines
         * what we respond back with. */
        switch (info->si_value.sival_int) {
        case 0: /* Request loop counter */
            response = *(volatile int *)&counter;
            break;
        /* Other codes? */
        default: /* Respond with -1. */
            response = -1;
        }

        /* Respond back to signaler. */
        value.sival_ptr = (void *)0L;
        value.sival_int = response;
        sigqueue(info->si_pid, signum, value);

        /* Restore errno. This way the interrupted thread
         * will not notice any change in errno. */
        errno = saved_errno;
    }
}

static int install_responder(const int signum)
{
    struct sigaction act;

    sigemptyset(&act.sa_mask);
    act.sa_sigaction = responder;
    act.sa_flags = SA_SIGINFO;
    if (sigaction(signum, &act, NULL))
        return errno;
    else
        return 0;
}

int main(void)
{
    if (install_responder(INFO_SIGNAL)) {
        fprintf(stderr, "Cannot install responder signal handler: %s.\n",
                strerror(errno));
        return 1;
    }
    fprintf(stderr, "PID = %d\n", (int)getpid());
    fflush(stderr);

    /* The application follows.
     * This one just loops at 100 Hz, printing a dot
     * about once per second or so. */
    while (1) {
        struct timespec t;

        counter++;
        if (!(counter % 100)) {
            putchar('.');
            fflush(stdout);
        }
        t.tv_sec = 0;
        t.tv_nsec = 10000000; /* 10ms */
        nanosleep(&t, NULL);
        /* Note: Since we ignore the remainder
         * from the nanosleep call, we
         * may sleep much shorter periods
         * when a signal is delivered. */
    }
    return 0;
}
The above responder responds to query 0 with the counter value, and with -1 to everything else. You can add other queries simply by adding a suitable case statement in responder().
Note that locking primitives (except for sem_post()) are not async-signal safe, and thus should not be used in a signal handler. So, the above code cannot implement any locking.
Signal delivery can interrupt a thread in a blocking call. In the above application, the nanosleep() call is usually interrupted by the signal delivery, causing the sleep to be cut short. (Similarly, read() and write() calls may return -1 with errno == EINTR, if they were interrupted by signal delivery.)
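For code that must not be cut short like this, the usual remedy is to retry on EINTR. Here is a sketch for the nanosleep() case, using a hypothetical wrapper sleep_fully() that is not part of app1.c: when the call is interrupted, it continues with the remaining time that nanosleep() reports.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for the full duration even if signal delivery interrupts
 * nanosleep(): on EINTR the unslept remainder is used for the next
 * attempt. Returns 0 on success, -1 on any other error. */
int sleep_fully(long seconds, long nanoseconds)
{
    struct timespec want, left;

    assert(nanoseconds >= 0 && nanoseconds < 1000000000L);
    want.tv_sec = seconds;
    want.tv_nsec = nanoseconds;

    while (nanosleep(&want, &left) == -1) {
        if (errno != EINTR)
            return -1;
        want = left;            /* continue with what is left */
    }
    return 0;
}
```

The same pattern (loop while the call fails with errno == EINTR) applies to read() and write(), with the extra step of accounting for partial transfers.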
If that is a problem, or you are not sure if all your code handles errno == EINTR correctly, or your counters need locking, you can use a separate thread dedicated to the signal handling instead.
The dedicated thread will sleep unless a signal is delivered, and only requires a very small stack, so it really does not consume any significant resources at run time.
The target signal is blocked in all threads, with the dedicated thread waiting in sigwaitinfo(). If it catches any signals, it processes them just like above, except that since this is a thread and not a signal handler per se, you can freely use any locking etc., and do not need to limit yourself to async-signal safe functions.
This threaded approach is slightly longer, adding almost a hundred lines of code to your application. (The differences are contained in the responder() and install_responder() functions; even the code added to main() is exactly the same as in app1.c.)
Here is app2.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>   /* getpid() */
#include <signal.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
#include <stdio.h>

#define INFO_SIGNAL (SIGRTMAX-1)

/* This is the counter we're interested in */
static int counter = 0;

static void *responder(void *payload)
{
    const int signum = (long)payload;
    union sigval response;
    sigset_t sigset;
    siginfo_t info;
    int result;

    /* We wait on only one signal. */
    sigemptyset(&sigset);
    if (sigaddset(&sigset, signum))
        return NULL;

    /* Wait forever. This thread is automatically killed, when the
     * main thread exits. */
    while (1) {
        result = sigwaitinfo(&sigset, &info);
        if (result != signum) {
            if (result != -1 || errno != EINTR)
                return NULL;
            /* A signal was delivered using *this* thread. */
            continue;
        }

        /* We only respond to sigqueue()'d signals. */
        if (info.si_code != SI_QUEUE)
            continue;

        /* Clear response. We don't leak stack data! */
        memset(&response, 0, sizeof response);

        /* Question? */
        switch (info.si_value.sival_int) {
        case 0: /* Counter */
            response.sival_int = *(volatile int *)(&counter);
            break;
        default: /* Unknown; respond with -1. */
            response.sival_int = -1;
        }

        /* Respond. */
        sigqueue(info.si_pid, signum, response);
    }
}

static int install_responder(const int signum)
{
    pthread_t worker_id;
    pthread_attr_t attrs;
    sigset_t mask;
    int retval;

    /* Mask contains only signum. */
    sigemptyset(&mask);
    if (sigaddset(&mask, signum))
        return errno;

    /* Block signum, in all threads. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL))
        return errno;

    /* Start responder() thread with a small stack. */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 32768);
    retval = pthread_create(&worker_id, &attrs, responder,
                            (void *)(long)signum);
    pthread_attr_destroy(&attrs);
    return errno = retval;
}

int main(void)
{
    if (install_responder(INFO_SIGNAL)) {
        fprintf(stderr, "Cannot install responder signal handler: %s.\n",
                strerror(errno));
        return 1;
    }
    fprintf(stderr, "PID = %d\n", (int)getpid());
    fflush(stderr);

    while (1) {
        struct timespec t;

        counter++;
        if (!(counter % 100)) {
            putchar('.');
            fflush(stdout);
        }
        t.tv_sec = 0;
        t.tv_nsec = 10000000; /* 10ms */
        nanosleep(&t, NULL);
    }
    return 0;
}
For both app1.c and app2.c the application itself is the same.
The only modifications needed to the application are making sure all the necessary header files get #included, adding responder() and install_responder(), and a call to install_responder() as early as possible in main().
(app1.c and app2.c only differ in responder() and install_responder(); and in that app2.c needs pthreads.)
Both app1.c and app2.c use the signal SIGRTMAX-1, which should be unused in most applications.
The app2.c approach also has a useful side effect you might wish to use in general: if you use other signals in your application, but don't want them to interrupt blocking I/O calls et cetera (perhaps you have a library that was written by a third party and does not handle EINTR correctly, but you do need to use signals in your application), you can simply block the signals after the install_responder() call in your application. The only thread, then, where the signals are not blocked is the responder thread, and the kernel will use that to deliver the signals. Therefore, the only thread that will ever get interrupted by the signal delivery is the responder thread, more specifically sigwaitinfo() in responder(), and it ignores any interruptions. If you use for example async I/O or timers, or this is a heavy math or data processing application, this might be useful.
Both application implementations can be queried using a very simple query program, query.c:
#define _POSIX_C_SOURCE 200112L
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <stdio.h>

int query(const pid_t process, const int signum,
          const int question, int *const response)
{
    sigset_t prevmask, waitset;
    struct timespec timeout;
    union sigval value;
    siginfo_t info;
    int result;

    /* Value sent to the target process. */
    value.sival_int = question;

    /* Waitset contains only signum. */
    sigemptyset(&waitset);
    if (sigaddset(&waitset, signum))
        return errno = EINVAL;

    /* Block signum; save old mask into prevmask. */
    if (sigprocmask(SIG_BLOCK, &waitset, &prevmask))
        return errno;

    /* Send the signal. */
    if (sigqueue(process, signum, value)) {
        const int saved_errno = errno;
        /* Restore the original signal mask before failing. */
        sigprocmask(SIG_SETMASK, &prevmask, NULL);
        return errno = saved_errno;
    }

    while (1) {
        /* Wait for a response within five seconds. */
        timeout.tv_sec = 5;
        timeout.tv_nsec = 0L;

        /* Set si_code to an uninteresting value,
         * just to be safe. */
        info.si_code = SI_KERNEL;

        result = sigtimedwait(&waitset, &info, &timeout);
        if (result == -1) {
            /* Some other signal delivered? */
            if (errno == EINTR)
                continue;
            /* No response; fail. */
            sigprocmask(SIG_SETMASK, &prevmask, NULL);
            return errno = ETIMEDOUT;
        }

        /* Was this an interesting signal? */
        if (result == signum && info.si_code == SI_QUEUE) {
            if (response)
                *response = info.si_value.sival_int;
            /* Return success. */
            sigprocmask(SIG_SETMASK, &prevmask, NULL);
            return errno = 0;
        }
    }
}

int main(int argc, char *argv[])
{
    pid_t pid;
    int signum, question, response;
    long value;
    char dummy;

    if (argc < 3 || argc > 4 ||
        !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
        fprintf(stderr, "       %s PID SIGNAL [ QUERY ]\n", argv[0]);
        fprintf(stderr, "\n");
        return 1;
    }

    if (sscanf(argv[1], " %ld %c", &value, &dummy) != 1) {
        fprintf(stderr, "%s: Invalid process ID.\n", argv[1]);
        return 1;
    }
    pid = (pid_t)value;
    if (pid < (pid_t)1 || value != (long)pid) {
        fprintf(stderr, "%s: Invalid process ID.\n", argv[1]);
        return 1;
    }

    if (sscanf(argv[2], "SIGRTMIN %ld %c", &value, &dummy) == 1)
        signum = SIGRTMIN + (int)value;
    else
    if (sscanf(argv[2], "SIGRTMAX %ld %c", &value, &dummy) == 1)
        signum = SIGRTMAX + (int)value;
    else
    if (sscanf(argv[2], " %ld %c", &value, &dummy) == 1)
        signum = value;
    else {
        fprintf(stderr, "%s: Unknown signal.\n", argv[2]);
        return 1;
    }
    if (signum < SIGRTMIN || signum > SIGRTMAX) {
        fprintf(stderr, "%s: Not a realtime signal.\n", argv[2]);
        return 1;
    }

    /* Optional query code; defaults to 0. */
    if (argc > 3) {
        if (sscanf(argv[3], " %d %c", &question, &dummy) != 1) {
            fprintf(stderr, "%s: Invalid query.\n", argv[3]);
            return 1;
        }
    } else
        question = 0;

    if (query(pid, signum, question, &response)) {
        switch (errno) {
        case EINVAL:
            fprintf(stderr, "%s: Invalid signal.\n", argv[2]);
            return 1;
        case EPERM:
            fprintf(stderr, "Signaling that process was not permitted.\n");
            return 1;
        case ESRCH:
            fprintf(stderr, "No such process.\n");
            return 1;
        case ETIMEDOUT:
            fprintf(stderr, "No response.\n");
            return 1;
        default:
            fprintf(stderr, "Failed: %s.\n", strerror(errno));
            return 1;
        }
    }

    printf("%d\n", response);
    return 0;
}
Note that I did not hardcode the signal number here; use SIGRTMAX-1 on the command line for app1.c and app2.c. (You can change it. query.c understands SIGRTMIN+n too. You must use a realtime signal, SIGRTMIN+0 to SIGRTMAX-0, inclusive.)
You can compile all three programs using
gcc -Wall -O3 app1.c -o app1
gcc -Wall -O3 app2.c -lpthread -o app2
gcc -Wall -O3 query.c -o query
Both ./app1 and ./app2 print their PIDs, so you don't need to look for them. (You can find the PID using e.g. ps -o pid= -C app1 or ps -o pid= -C app2, though.)
If you run ./app1 or ./app2 in one shell (or both in separate shells), you can see them outputting the dots at about once per second. The counter increases every 1/100th of a second. (Press Ctrl + C to stop.)
If you run ./query PID SIGRTMAX-1 in another shell in the same directory on the same machine, you can see the counter value.
An example run on my machine:
A$ ./app1
PID = 28519
...........
B$ ./query 28519 SIGRTMAX-1
11387
C$ ./app2
PID = 28522
...
B$ ./query 28522 SIGRTMAX-1
371
As mentioned, the downside of this mechanism is that the response is limited to one int (or technically an int or a void *). There are ways around that, however, by also using some of the methods Basile Starynkevitch outlined. Typically, the signal is then just a notification for the application that it should update the state stored in a file, shared memory segment, or wherever. I recommend using the dedicated thread approach for that, as it has very little overhead, and minimal impact on the application itself.
Questions?
answered May 23 '14 at 3:05
0
A hard-coded systemtap solution could look like:
% cat FOO.stp
global counts
probe process("/path/to/your/binary").function("CertainFunction") { counts[pid()] <<< 1 }
probe process("/path/to/your/binary").end {
    printf("pid %d count %d\n", pid(), @count(counts[pid()]))
    delete counts[pid()]
}
# stap FOO.stp
pid 42323 count 112
pid 2123 count 0
... etc, until interrupted
answered May 26 '14 at 1:05
0
Thanks for the responses. There is lots of good information in the other answers. However, here's what I did. First I tweaked the program to add a counter in a shm file:
struct StatsCounter {
    char counterName[8];
    unsigned long int counter;
};
StatsCounter * stats;
void initStatsCounter()
{
    int fd = shm_open("TestStats", O_RDWR|O_CREAT, 0);
    if (fd == -1)
    {
        syslog(priority, "%s:: Initialization Failed", __func__);
        stats = (StatsCounter *) malloc(sizeof(StatsCounter));
    }
    else
    {
        // For now, just one StatsCounter is used, but it could become an array.
        ftruncate(fd, sizeof(StatsCounter));
        stats = (StatsCounter *) mmap(NULL, sizeof(StatsCounter),
                                      PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    }
    // Initialize names. Pad them to 7 chars (save room for \0).
    snprintf(stats[0].counterName, sizeof(stats[0].counterName), "nRespX ");
    stats[0].counter = 0;
}
And changed processServerResponseX to increment stats[0].counter in the locked section. Then I changed the script to parse the shm file with "hexdump":
hexdump /dev/shm/TestStats -e ' 1/8 "%s " 1/8 "%d\n"'
This will then show something like this:
nRespX 23
This way I can extend this later if I want to also look at response Y, ...
Not sure if there are mutual exclusion problems with hexdump if it accessed the file while it was being changed. But in my case, I don't think it matters, because the script only calls it before and after the test, it should not be in the middle of an update.
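If hexdump's format strings ever become awkward, a tiny reader program is an alternative. The sketch below is only an illustration: read_stats() is a hypothetical helper, and it assumes the file holds exactly the StatsCounter layout shown above, compiled for the same architecture. It opens the file node directly (for example /dev/shm/TestStats) and maps it read-only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Same layout as the StatsCounter struct in the program under test. */
struct StatsCounter {
    char counterName[8];
    unsigned long int counter;
};

/* Map the file at `path` read-only and copy out the first record.
 * Returns 0 on success, -1 on failure (including a file that is
 * too small to hold one record). */
int read_stats(const char *path, struct StatsCounter *out)
{
    struct stat st;
    void *map;
    int fd;

    assert(path != NULL && out != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || (size_t)st.st_size < sizeof *out) {
        close(fd);
        return -1;
    }
    map = mmap(NULL, sizeof *out, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid */
    if (map == MAP_FAILED)
        return -1;

    memcpy(out, map, sizeof *out);
    out->counterName[sizeof out->counterName - 1] = '\0';
    munmap(map, sizeof *out);
    return 0;
}
```

A main() that calls read_stats("/dev/shm/TestStats", &s) and prints s.counterName and s.counter would give the script the same "nRespX 23" line as the hexdump command.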
answered May 29 '14 at 23:05
How often is that particular function running (hundreds, or billions of times)? Is it critically quick to run (i.e. running less than a few microseconds) or more than a millisecond? - Basile Starynkevitch
Your stock kernel may not have it, but you could use systemtap and the CONFIG_UTRACE feature to trace which functions your program executed. - Brian Cain
You should explain a bit more what sort of application you are coding, and how often the certain function is running. - Basile Starynkevitch
And yes, it is perfectly fine to read the memory contents of the shared memory segment using /dev/shm/whatever file nodes. That's why they exist in the first place. If your counter/s are properly aligned volatile longs (for whatever definition of this that works for you) you will be able to get a consistent reading every time you access it (optionally stick a __sync_synchronize() after the counter update if you're on a massive smp machine). - oakad
@oakad: No need for volatile or relying on x86/x86-64 atomic access rules. Instead, use __sync_fetch_and_add(&numDiameterResponseX, 1L) to increase the counter atomically without any locking, and __sync_fetch_and_add(&numDiameterResponseX, 0L) to read it atomically (everywhere you read it!), and it'll work on all architectures GCC supports. It has very little overhead, too (it optimizes to lock xaddl %2, (%1) in x86 and x86-64 assembly). - Nominal Animal
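To illustrate the last comment, here is a rough sketch of a lock-free counter using the GCC builtin __sync_fetch_and_add(). The function count_in_parallel() is hypothetical. It uses forked processes and an anonymous shared mapping only to keep the example free of pthread dependencies; the same builtin works the same way between threads of one program, which is the situation in the question.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork `nproc` children that each add 1 to a shared counter `iters`
 * times with __sync_fetch_and_add(), then return the final value
 * (read atomically by adding 0). Returns -1 on setup failure. */
long count_in_parallel(int nproc, long iters)
{
    long result;
    int i, started = 0;
    long *counter;

    assert(nproc > 0 && iters >= 0);
    counter = mmap(NULL, sizeof *counter, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED)
        return -1;
    *counter = 0;

    for (i = 0; i < nproc; i++) {
        pid_t child = fork();
        if (child == -1)
            break;
        if (child == 0) {
            long k;
            for (k = 0; k < iters; k++)
                __sync_fetch_and_add(counter, 1L);  /* atomic increment */
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was actually started. */
    for (i = 0; i < started; i++)
        wait(NULL);

    result = (started == nproc) ? __sync_fetch_and_add(counter, 0L) : -1;
    munmap(counter, sizeof *counter);
    return result;
}
```

With a plain counter++ in place of the builtin, concurrent increments can be lost; with the atomic add, four workers doing 100000 increments each always end at 400000.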