Relative pointers in a memory-mapped file using C

Is it possible to use a structure with a pointer to another structure inside a memory-mapped file, instead of storing the offset in some integral type and calculating the pointer?

E.g. given the following struct:

typedef struct _myStruct_t {
  int number;
  struct _myStruct_t *next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mapViewHandle;
myStruct_t* next = first->next;

instead of this:

typedef struct _myStruct_t {
  int number;
  int next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mappedFileHandle;
myStruct_t* next = (myStruct_t*)(mappedFileHandle+first->next);

I read about '__based' keyword, but this is Microsoft specific and therefore Windows-bound.

Looking for something working with GCC compiler.

asked Jun 12 '14 at 11:06

What is the problem? What have you tried? What errors do you get? The linked list node structure in the first snippet is perfectly normal. A structure in a file sounds a little iffy, so I think you should be a little bit more explicit in what it is that you're trying to achieve. -

@Henrik: There is nothing to try cause I see no way of doing this. Of course I can assign next to the memory mapped base with some offset and use it but as soon as I restart the application the pointer value stored in next is not valid anymore cause the base address of the mapped memory region changes. That's why I use the second snippet. So the question basically is if gcc does support something which could generate this in any way dynamically. It should store the offset instead of the current pointer value. -

I don't know of any way for GCC to help you with this, but assuming you're using mmap to create the mapping, maybe using the MAP_FIXED flag and specifying a fixed base address might help you. -

@Henrik: Nice idea but "it will cause mmap to unmap anything that may already be mapped at that address which is generally a very bad thing" see stackoverflow.com/questions/6446101/… -

3 Answers

I'm pretty sure there's nothing akin to the __based pointer from Visual Studio in GCC. The only time I'd seen anything like that built-in was on some pretty odd hardware. The Visual Studio extension provides an address translation layer around all operations involving the pointer.

So it sounds like you're into roll-your-own territory; although I'm willing to be told otherwise.

The last time I was dealing with something like this it was on the Palm platform, where, unless you locked down memory, there was the possibility of it being moved around. You got memory handles from allocations, and you had to MemHandleLock a handle before you used it and MemPtrUnlock it after you were finished, so the block could be moved around by the OS (which seemed to happen on ARM-based Palm devices).

If you're insistent on storing pointer-esque values in a memory-mapped structure, the first recommendation would be to store the value in an intptr_t, which is an integer type wide enough to hold a pointer value. While your offsets are unlikely to exceed 4GB, it pays to stay safe.

That said, this is probably easy to implement in C++ using a template class, it's just that marking the question as C makes things a lot messier.
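For plain C, the roll-your-own approach can be sketched as a pair of helpers that convert between stored offsets and live pointers. This is only a sketch under assumptions that are not in the question: the names node, node_at and node_offset are hypothetical, the offset is stored in a fixed-width uint64_t so the layout does not change between 32- and 64-bit builds, and offset 0 is reserved to mean "no next node" (so the node at the very start of the mapping can be a list head but never a next target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk node: `next` is a byte offset from the start of
   the mapping, and 0 is reserved to mean "no next node". */
typedef struct node {
    int32_t  number;
    uint32_t pad;   /* explicit padding so the layout is the same on 32- and 64-bit builds */
    uint64_t next;
} node;

/* Turn a stored offset into a pointer that is valid for the current mapping. */
static node *node_at(void *base, uint64_t off)
{
    return off ? (node *)((char *)base + off) : NULL;
}

/* Turn a pointer inside the mapping back into an offset that can be stored. */
static uint64_t node_offset(const void *base, const node *p)
{
    if (!p)
        return 0;
    /* Offset 0 is the null marker, so the target must lie after the base. */
    assert((const char *)p > (const char *)base);
    return (uint64_t)((const char *)p - (const char *)base);
}
```

With these, first->next = node_offset(base, other) replaces a pointer assignment and node_at(base, first->next) replaces a dereference; base is whatever address the mapping call returned in the current run.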

answered Jun 12 '14 at 14:06

I don't think intptr_t is a good idea for a file format: it is platform dependent. Worse, it depends on whether the program is compiled as 32 or 64 bit. If you use intptr_t, a file written by a 32 bit app cannot be read by a 64 bit app and vice versa. I would opt for uint64_t. - cmaster - reinstate monica

@cmaster all things being equal, that's the least of the problems that would exist with the file. Everything in the structure would have to be coded for size, padding, alignment and capacity if it was to be portable across 32/64bit and endianness if it was to be portable across hardware platforms - Anya travesuras

C++: It is very doable and portable (the code, but maybe not the data). It was a while ago, but I created a template for self-relative pointer classes. I had tree structures inside blocks of memory that might move. Internally, the class had a single intptr_t, but the = * . -> operators were overloaded so it appeared like a regular pointer. Handling null took some attention. I also did versions using int, short and (not very useful) char for space-saving pointers that were unable to point far away (outside the memory block).

In C you could use macros to wrap get and set

// typedef struct OBJ { int p; } OBJ;
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
#define SETOBJPTR(P,V) ((P) = (V) ? (char*)(V) - (char*)&(P) : 0)

The above C macros are for self-relative pointers, which can be slightly more efficient than based pointers. Here is a working example of a tree in a small block of relocatable memory using 2-byte (short) pointers to save space. The offsets are computed with char* arithmetic, so the code does not depend on a pointer fitting in an int:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct OBJ
{
  int val;
  short left;   // self-relative offset of left child, 0 = none
  short right;  // self-relative offset of right child, 0 = none
} OBJ;

// P is the offset field itself (an lvalue); the stored value is the
// distance in bytes from that field to the target, and 0 means null.
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
#define SETOBJPTR(P,V) ((P) = (V) ? (char*)(V) - (char*)&(P) : 0)

typedef struct HEAD
{
  short top; // top of tree
  short available; // index of next available place in data block
  char data[0x7FFF]; // put whole tree here
} HEAD;

HEAD * blk;

OBJ * Add(int val)
{
  short * where = &blk->top; // find pointer to "pointer" to place new node
  OBJ * nd;
  while ( ( nd = OBJPTR(*where) ) != 0 )
    where = val < nd->val ? &nd->left : &nd->right;
  nd = (OBJ*) ( blk->data + blk->available ); // allocate node
  blk->available += sizeof(OBJ); // finish allocation
  nd->val = val;
  nd->left = nd->right = 0;
  SETOBJPTR( *where, nd );
  return nd;
}

void Dump(OBJ*top,int indent)
{
  if ( ! top ) return;
  Dump( OBJPTR(top->left), indent + 3 );
  printf( "%*s %d\n", indent, "", top->val );
  Dump( OBJPTR(top->right), indent + 3 );
}

int main(void)
{
  blk = malloc(sizeof(HEAD));
  if ( ! blk ) return 1;
  blk->available = 0; // data starts at offset 4, so nodes stay int-aligned
  blk->top = 0;
  Add(23); Add(2); Add(45); Add(99); Add(0); Add(12);
  Dump( OBJPTR(blk->top), 3 );
  { // PROOF a copy at a different address still has the tree:
    HEAD * blk2 = malloc(sizeof(HEAD)); // malloc keeps the copy aligned for OBJ
    if ( ! blk2 ) return 1;
    memcpy( blk2, blk, sizeof(HEAD) );
    Dump( OBJPTR(blk2->top), 3 );
    free( blk2 );
  }
  free( blk );
  return 0;
}

A note about the based versus self-relative "*" operator. Based can involve 2 addresses and 2 memory fetches. Self-relative involves 1 address and 1 memory fetch. Pseudo assembly, self-relative first and based second:

load reg1,address of pointer
load reg2,fetch reg1
add reg3,reg2+reg1

load reg1,address of pointer
load reg2,fetch reg1
load reg3,address of base
load reg4,fetch base
add reg5,reg2+reg4
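The same difference can be sketched in C. This is an illustration rather than a measurement, and all four function names are hypothetical. In both schemes a stored value of 0 is assumed to mean null, and the offsets are assumed to fit in an int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two addresses inside the same block. */
static int32_t checked_offset(char *to, char *from)
{
    ptrdiff_t d = to - from;
    assert(d >= INT32_MIN && d <= INT32_MAX);
    return (int32_t)d;
}

/* Self-relative: the offset is measured from the field that stores it,
   so the field's own address is the only input needed. */
static void store_self_relative(int32_t *field, void *target)
{
    *field = target ? checked_offset((char *)target, (char *)field) : 0;
}

static void *deref_self_relative(int32_t *field)
{
    return *field ? (char *)field + *field : NULL;
}

/* Based: the offset is measured from the start of the block, so the base
   address must be supplied as well (here as a parameter; in the pseudo
   assembly above it is fetched from memory). */
static void store_based(void *base, int32_t *field, void *target)
{
    *field = target ? checked_offset((char *)target, (char *)base) : 0;
}

static void *deref_based(void *base, int32_t *field)
{
    return *field ? (char *)base + *field : NULL;
}
```

Both variants survive a byte-for-byte copy of the block to another address. The based one needs the new base passed in after the move, while the self-relative one needs nothing beyond the address of the field being read.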

answered Nov 03, 19:14

Your current C macros are bogus, but it's certainly possible to write reasonable ones. - o11c

In OBJPTR() I think you should be adding P to the base pointer, not to itself. - droog luser

I added a complete example, a "proof", and a note about based versus self-relative performance differences. - Codemeister

The first is extremely unlikely to work.

Remember that a pointer, such as struct _myStruct_t *, is a pointer to a location in memory. Suppose that this structure was located at address 1000 in memory: that would mean that the next structure, located just after it, might be located at address 1008, and that's what's stored in ->next (the numbers don't matter; what matters is that they are memory addresses). Now you save that structure to a file (or un-map it). Then you map it again, and this time it ends up starting at address 2000, but the ->next pointer is still 1008.

You have (generally) no control over where files are mapped in memory, so no control over the actual memory locations of the elements within the mapped structure. Therefore you can only depend on relative offsets.

Note that your second version may or may not work as you expect, depending on the declared type of mappedFileHandle. If it's a pointer to myStruct_t, then adding an integer n to it will produce a pointer to an address which is n*sizeof(myStruct_t) bytes higher in memory (as opposed to being n bytes higher).

If you declared mappedFileHandle as

myStruct_t* mappedFileHandle;

then you can subscript it like an array. If the mapped file is laid out as a sequence of myStruct_t blocks, and the next field refers to other blocks by index within that sequence, then (supposing myStruct_t* b is a block of interest)

mappedFileHandle[b->next].number

is the number of the b->next'th block in the sequence.

(This is just a consequence of the way that arrays are defined in C: mappedFileHandle[b->next] is defined to be equivalent to *(mappedFileHandle + b->next), which is an object of type myStruct_t, which you can therefore get the number field of).
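As a rough sketch of that index-based layout, a chain can be followed by subscripting the mapping on every step. The helper sum_chain and the end marker NO_NEXT are hypothetical, not part of the question; the struct is the second version from the question with next reinterpreted as a block index.

```c
#include <assert.h>
#include <stddef.h>

typedef struct myStruct_t {
    int number;
    int next;   /* index of the next block, or NO_NEXT at the end of the chain */
} myStruct_t;

enum { NO_NEXT = -1 };  /* hypothetical end-of-chain marker */

/* Sum the `number` fields along the chain that starts at block `start`.
   `count` is the number of blocks in the mapping. The chain is assumed
   to be free of cycles. */
static long sum_chain(const myStruct_t *mappedFileHandle, size_t count, int start)
{
    long sum = 0;
    for (int i = start; i != NO_NEXT; i = mappedFileHandle[i].next) {
        assert(i >= 0 && (size_t)i < count);  /* reject indices outside the mapping */
        sum += mappedFileHandle[i].number;
    }
    return sum;
}
```

The blocks sit next to each other in the file, but the chain may visit them in any order, since each next value is just an index. Because no addresses are stored, the result is the same wherever the file ends up being mapped.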

answered Jun 12 '14 at 11:06

Your note has nothing to do with the question and I am aware that the + operation on a pointer type will add n times the size of the type. With an array I'm bound to having my structures in sequence which is not what I want to have. This is not a real answer, rather an explanation of the problem, which I thought is clear.. - RafaelH

Then it's not completely clear what your question is. You know that mmap maps at an unpredictable address, so any absolute pointers stored in the mapped file will necessarily be wrong. MAP_FIXED may allow you to play games with the mapping address, but that's probably precarious. If you want to do this sort of thing, then managing a pool yourself and storing the offsets from the start is probably the only way. Is that what you're asking? I don't think the compiler can realistically help out here. - Norman Grey

The question is, if there's any way I can use pointers (like in first example) with a mapped file, so if the pointers are written to file, it writes the offsets instead of runtime addresses. - RafaelH

Ah, right: No, I don't think that's possible in any sort of portable way. Looking at the Windows docs for __based I see what the goal is, but I'm not aware of anything analogous in GCC, and certainly nothing portable to other compilers. I'm afraid it's DIY, and some self-managed pool of structs+offsets in an array.... - Norman Grey
