alex <nospam@nospam.com> wrote:
> On Wed, 11 Jul 2012 11:59:42 -0400, Eric Sosman wrote:
> > A problem with the approach you've outlined is that the
> > checksum computation will include the values of any padding bytes -- the
> > size of `z' in your example almost begs for some padding bytes to be
> > inserted. Since padding bytes are not necessarily preserved when
> > assigning structs or even when assigning to struct elements, a checksum
> > that includes padding bytes is unlikely to be very useful.
> Are you sure about this?? I would expect a struct assign/deepcopy to be
> implemented "under the hood" using memcpy(), not { s1.a=s2.a;
> s1.b=s2.b; } etc. Pretty sure that's what GCC does.

I doubt it. For small objects with fixed sizes it may do the opposite, i.e.
convert memcpy calls to something equivalent to an assignment, presuming the
compiler has enough smarts.

My guess is that assignments can be more easily parallelized by the
compiler. Certainly with an assignment there's more opportunity for
optimization, especially considering padding bytes. With memcpy you're
obscuring the semantics of what you're trying to accomplish and the compiler
has to be more conservative (or, conversely, do more work to untangle the
mess). For all the compiler knows you're implementing some OOP structure
hack depending on padding bytes.

Don't fall into the trap of "theoretically the compiler could figure it
out". Leave that kind of wishful and contingent thinking to the Java/JVM
cult, where thesis papers magically turn into products overnight, at least
in a debate. Simpler code = happier compiler.
alex <nospam@nospam.com> writes:
> On Wed, 11 Jul 2012 11:59:42 -0400, Eric Sosman wrote:
>> A problem with the approach you've outlined is that the
>> checksum computation will include the values of any padding bytes -- the
>> size of `z' in your example almost begs for some padding bytes to be
>> inserted. Since padding bytes are not necessarily preserved when
>> assigning structs or even when assigning to struct elements, a checksum
>> that includes padding bytes is unlikely to be very useful.
>
> Are you sure about this?? I would expect a struct assign/deepcopy to be
> implemented "under the hood" using memcpy(), not { s1.a=s2.a;
> s1.b=s2.b; } etc. Pretty sure that's what GCC does.

A compiler can do it either way. Using the equivalent of a memcpy() call is
certainly a likely approach, but there's no guarantee that it's done that
way.

Code that assumes padding bytes are preserved is likely to work perfectly
until the moment you demonstrate it to an important client.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
On 7/16/2012 5:01 PM, alex wrote:
> On Wed, 11 Jul 2012 11:59:42 -0400, Eric Sosman wrote:
>> A problem with the approach you've outlined is that the
>> checksum computation will include the values of any padding bytes -- the
>> size of `z' in your example almost begs for some padding bytes to be
>> inserted. Since padding bytes are not necessarily preserved when
>> assigning structs or even when assigning to struct elements, a checksum
>> that includes padding bytes is unlikely to be very useful.
>
> Are you sure about this??

Yes.

> I would expect a struct assign/deepcopy to be
> implemented "under the hood" using memcpy(), not { s1.a=s2.a;
> s1.b=s2.b; } etc. Pretty sure that's what GCC does.

My car is blue. Therefore, cars are blue.

--
Eric Sosman
esosman@ieee-dot-org.invalid
On 7/16/2012 5:01 PM, alex wrote:
> On Wed, 11 Jul 2012 11:59:42 -0400, Eric Sosman wrote:
>> A problem with the approach you've outlined is that the
>> checksum computation will include the values of any padding bytes -- the
>> size of `z' in your example almost begs for some padding bytes to be
>> inserted. Since padding bytes are not necessarily preserved when
>> assigning structs or even when assigning to struct elements, a checksum
>> that includes padding bytes is unlikely to be very useful.
>
> Are you sure about this?? I would expect a struct assign/deepcopy to be
> implemented "under the hood" using memcpy(), not { s1.a=s2.a;
> s1.b=s2.b; } etc. Pretty sure that's what GCC does.

It might. The point that is trying to be made in this thread is that it's
not *guaranteed* that padding bytes will be preserved.

--
Kenneth Brody
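To make the distinction concrete, here is a minimal sketch in C. It assumes a cut-down version of the struct from the original post (the elided members are simply left out), and the helper names pstruct_copy_bytes() and pstruct_same_bytes() are made up for illustration. Copying with memcpy() carries every byte across, padding included; plain assignment only has to carry the member values.

```c
#include <string.h>

/* Hypothetical stand-in for the struct discussed in the thread;
   the members elided in the original post are omitted here. */
struct PStruct {
    int x;
    unsigned int y;
    char z[13];
    unsigned int checksum;
};

/* Copy every byte of *src, padding included. A plain assignment
   (*dst = *src) copies the member values but is not required to
   reproduce the padding bytes. */
static void pstruct_copy_bytes(struct PStruct *dst, const struct PStruct *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Byte-wise comparison of two objects. The result is only dependable
   when the copy was made byte-wise, as above. */
static int pstruct_same_bytes(const struct PStruct *a, const struct PStruct *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

After pstruct_copy_bytes(&b, &a), a checksum taken over the whole of b matches one taken over the whole of a. After b = a, that may well hold on a given compiler, but nothing promises it.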
pozz <pozzugno@gmail.com> writes:
> On 12/07/2012 17:08, Eric Sosman wrote:
[snip]
>> If you want to write a struct and a checksum to a file and
>> verify the checksum when you read it back, keep the checksum as
>> a separate variable and don't put it inside the struct.
>
> Could I ignore the "randomness" of the padding bytes? I read that
> the padding bytes can be randomly changed even assigning a value to a
> field of the struct. My application should work in this way:
>
> - at startup, read the configuration file, calculate and verify the
>   checksum: if it isn't correct, use a default struct;
>
> - when a field changes (after assigning it the new value), calculate
>   the new checksum and save both (struct and checksum) to the file;
>
> - during the normal execution of the application, the fields of the
>   struct are accessed many times.
>
> In this situation, could I calculate the checksum on the entire
> memory area of the struct (with padding bytes)? [snip]

Yes, provided (1) if the whole struct is updated, either you calculate a new
checksum from scratch or you make sure the padding bytes are copied also
(eg, by using memcpy()) and use the checksum of the source struct, and
(2) the checksum is stored in a way so as not to perturb the struct's
padding bytes (this can be done by putting the checksum outside the struct
in question, or by updating an in-struct checksum using, eg, memcpy()).

This question was asked fairly clearly and obviously is the most important
one to answer; I don't know why it wasn't responded to more directly.

You are right that reading a struct or its members has no effect on its
padding bytes.
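The workflow described above (checksum kept outside the struct, computed over all of its bytes, verified at startup with a fallback to defaults) might be sketched as follows. Everything here is an assumption for illustration: the struct Config layout, the names config_pack() and config_unpack(), and the byte-sum body of checksum(), which merely stands in for "whatever algorithm" the real function uses. The record is built in a byte buffer so the caller can hand it to fwrite()/fread() unchanged.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration struct; no checksum member inside. */
struct Config {
    int x;
    unsigned int y;
    char z[13];
};

/* Placeholder 16-bit checksum (byte sum modulo 65536). */
static unsigned int checksum(const void *buffer, size_t size)
{
    const unsigned char *p = buffer;
    unsigned int sum = 0;
    while (size-- > 0)
        sum = (sum + *p++) & 0xFFFFu;
    return sum;
}

/* Size of one stored record: raw struct bytes followed by the checksum. */
enum { RECORD_SIZE = sizeof(struct Config) + sizeof(unsigned int) };

/* Call after a field has been changed: the checksum is recomputed from
   scratch over the struct's current bytes, padding included. */
static void config_pack(const struct Config *cfg, unsigned char *record)
{
    unsigned int sum = checksum(cfg, sizeof *cfg);
    memcpy(record, cfg, sizeof *cfg);
    memcpy(record + sizeof *cfg, &sum, sizeof sum);
}

/* Call at startup: returns 1 if the stored checksum matches; otherwise
   copies *fallback into *cfg and returns 0. */
static int config_unpack(struct Config *cfg, const unsigned char *record,
                         const struct Config *fallback)
{
    unsigned int stored;
    memcpy(cfg, record, sizeof *cfg);
    memcpy(&stored, record + sizeof *cfg, sizeof stored);
    if (checksum(cfg, sizeof *cfg) == stored)
        return 1;
    memcpy(cfg, fallback, sizeof *cfg);
    return 0;
}
```

Because the struct bytes only ever move through memcpy() between packing and verifying, the padding bytes seen by the verifier are the ones that were checksummed, whatever values they happened to hold.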
pozz <pozzugno@gmail.com> writes:
> I have a function that computes a 16-bit checksum (following whatever
> algorithm) of a memory space:
>
>   unsigned int checksum(const void *buffer, size_t size);
>
> I want to embed this checksum in a struct:
>
>   struct PStruct {
>     int x;
>     unsigned int y;
>     char z[13];
>     ...
>     unsigned int checksum;
>   };
>
> How to use the checksum() function above? I propose:
>
>   struct PStruct ps;
>   ...
>   ps.checksum = checksum(&ps, offsetof(struct PStruct, checksum));
>
> Is there a better mechanism?

What I think you want is a checksum for the physical value of the struct,
ie, a checksum that will match if the bytes are written out (eg, to a file,
or copied with memcpy()), and then read back similarly (that is, by copying
individual bytes).

Under that assumption, this approach will work okay, except you need to take
care to store the value in the 'checksum' memory area so that it doesn't
perturb the struct being checksum()'ed. This can be done either by using a
temporary variable and then using memcpy() to get the value into
ps.checksum, or by storing indirectly through a pointer:

  *&ps.checksum = checksum( ... );

I'm sure some people will say using memcpy() is safer. Certainly most people
would agree using memcpy() is no less safe.

Alternatively the checksum can be stored separately from the struct, so
there is no chance for storing it to affect the struct's physical value.

Personally, I would probably put the checksum either as the first member of
the struct (rather than the last), or outside the struct altogether (as some
others have explained more fully). However, that's a stylistic choice, not a
mandatory one: any of the three can work, it's just a question of which one
offers the best combination of benefits and costs in your situation.
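The "temporary variable plus memcpy()" variant could look roughly like this. It is a sketch under stated assumptions: the elided members of struct PStruct are left out, pstruct_seal() and pstruct_verify() are invented names, and the body of checksum() is a placeholder byte sum rather than the poster's actual algorithm.

```c
#include <stddef.h>
#include <string.h>

/* Struct from the original post, minus the elided members. */
struct PStruct {
    int x;
    unsigned int y;
    char z[13];
    unsigned int checksum;
};

/* Placeholder 16-bit checksum (byte sum modulo 65536). */
static unsigned int checksum(const void *buffer, size_t size)
{
    const unsigned char *p = buffer;
    unsigned int sum = 0;
    while (size-- > 0)
        sum = (sum + *p++) & 0xFFFFu;
    return sum;
}

/* Checksum the bytes that precede the checksum member, then place the
   result with memcpy() so that no member assignment is performed on
   the struct after the checksum has been computed. */
static void pstruct_seal(struct PStruct *ps)
{
    unsigned int sum = checksum(ps, offsetof(struct PStruct, checksum));
    memcpy(&ps->checksum, &sum, sizeof sum);
}

/* Recompute over the same byte range and compare with the stored value. */
static int pstruct_verify(const struct PStruct *ps)
{
    unsigned int stored;
    memcpy(&stored, &ps->checksum, sizeof stored);
    return checksum(ps, offsetof(struct PStruct, checksum)) == stored;
}
```

A sealed struct that is then moved around only by byte copies (memcpy(), fwrite()/fread()) should continue to verify; assigning to any member afterwards calls for sealing it again.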
Eric Sosman <esosman@ieee-dot-org.invalid> writes:
> On 7/11/2012 10:56 AM, pozz wrote:
>> I have a function that computes a 16-bit checksum (following whatever
>> algorithm) of a memory space:
>>
>>   unsigned int checksum(const void *buffer, size_t size);
>>
>> I want to embed this checksum in a struct:
>>
>>   struct PStruct {
>>     int x;
>>     unsigned int y;
>>     char z[13];
>>     ...
>>     unsigned int checksum;
>>   };
>>
>> How to use the checksum() function above? I propose:
>>
>>   struct PStruct ps;
>>   ...
>>   ps.checksum = checksum(&ps, offsetof(struct PStruct, checksum));
>>
>> Is there a better mechanism?
>
> You'd better hope so :-)
>
> A problem with the approach you've outlined is that the
> checksum computation will include the values of any padding
> bytes -- the size of `z' in your example almost begs for some
> padding bytes to be inserted. [snip elaboration]

You're assuming he wants a checksum on the "logical value" of the struct.
If what he wants is a checksum on the physical value of the struct -- which
appears to be what he does want -- then this approach will work fine
(provided of course care is taken so that storing the checksum will not
perturb the padding bytes, which I have already addressed in an earlier
reply).
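For contrast, a checksum on the "logical value" would feed the members to the checksum one at a time, so padding between them never enters the sum. The sketch below assumes an incremental helper, sum_bytes(), which is not part of the original post (the poster's checksum() takes a single buffer), and again drops the elided struct members.

```c
#include <stddef.h>

/* Struct from the original post, minus the elided members. */
struct PStruct {
    int x;
    unsigned int y;
    char z[13];
    unsigned int checksum;
};

/* Hypothetical incremental 16-bit byte sum: folds `size` bytes into the
   running value `acc` and returns the new running value. */
static unsigned int sum_bytes(unsigned int acc, const void *buffer, size_t size)
{
    const unsigned char *p = buffer;
    while (size-- > 0)
        acc = (acc + *p++) & 0xFFFFu;
    return acc;
}

/* Checksum of the member values only; bytes between or after the
   members are never read. */
static unsigned int pstruct_logical_checksum(const struct PStruct *ps)
{
    unsigned int acc = 0;
    acc = sum_bytes(acc, &ps->x, sizeof ps->x);
    acc = sum_bytes(acc, &ps->y, sizeof ps->y);
    acc = sum_bytes(acc, ps->z, sizeof ps->z);
    return acc;
}
```

This version survives plain struct assignment, at the cost of having to be updated by hand whenever a member is added. Note that all 13 bytes of z are summed, so unused tail bytes of the array need a defined value.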
On 07/15/2012 03:58 AM, pozz wrote:
> On 12/07/2012 17:07, James Kuyper wrote:
>> On 07/12/2012 10:42 AM, pozz wrote:
>>> On 11/07/2012 18:32, Stefan Ram wrote:
>>>> pozz <pozzugno@gmail.com> writes:
>>>>> struct PStruct {
>>>>>   int x;
>>>>>   unsigned int y;
>>>>>   char z[13];
>>>>>   ...
>>>>>   unsigned int checksum;
>>>>> };
>>>>> ps.checksum = checksum(&ps, offsetof(struct PStruct, checksum));
>>>>
>>>> struct a
>>>> { int x;
>>>>   unsigned int y;
>>>>   char z[12]; };
>>>>
>>>> struct p
>>>> { struct a a;
>>>>   unsigned int checksum; };
>>>>
>>>> ...
>>>>
>>>> p.checksum = checksum( &a, sizeof( a ));
>>>>
>>>> or
>>>>
>>>> struct p
>>>> { unsigned int checksum;
>>>>   struct a a; }
>>>>
>>>> checksum( p, sizeof( a ));
>>>>
>>>
>>> What is the advantage of your method?
>>
>> The sizeof expression is simpler than the corresponding offsetof()
>> macro.
>
> Anyway I think this is a compilation-time complexity. ...

What I was concerned with was composition-time complexity: the time it takes
the author to write the expression in the source code. It's also
maintenance-time complexity: the time it takes the reader to read and
understand the code.

> ... Both
> offsetof() and sizeof() should be calculated at compilation time... is
> it correct?

Basically. They aren't required to be calculated at compile time; the C
standard doesn't make such distinctions. However, they can both be computed
at compile time, and it's a reasonable thing to expect of an implementation.

--
James Kuyper
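A compilable reading of the nested-struct suggestion quoted above might look like this. It is only a sketch: the function names p_seal() and p_verify() and the enum names are invented, and checksum() has a placeholder body. It writes the payload as p->a, where the quoted fragment has a bare a. The enum shows that sizeof (of a non-variable-length type) and offsetof() both yield integer constant expressions, so either may appear where the language demands a constant.

```c
#include <stddef.h>

/* Payload and wrapper, following the layout quoted above. */
struct a {
    int x;
    unsigned int y;
    char z[12];
};

struct p {
    struct a a;
    unsigned int checksum;
};

/* Placeholder 16-bit checksum (byte sum modulo 65536). */
static unsigned int checksum(const void *buffer, size_t size)
{
    const unsigned char *q = buffer;
    unsigned int sum = 0;
    while (size-- > 0)
        sum = (sum + *q++) & 0xFFFFu;
    return sum;
}

/* Both forms are accepted as enumeration constants. The two values
   need not be equal: padding may sit between `a` and `checksum`. */
enum {
    PAYLOAD_SIZE_BY_SIZEOF   = sizeof(struct a),
    PAYLOAD_SIZE_BY_OFFSETOF = offsetof(struct p, checksum)
};

/* The payload is named directly, so no offsetof() is needed. The
   checksum is stored by plain assignment, as in the quoted code. */
static void p_seal(struct p *p)
{
    p->checksum = checksum(&p->a, sizeof p->a);
}

static int p_verify(const struct p *p)
{
    return p->checksum == checksum(&p->a, sizeof p->a);
}
```

On the reader's side, sizeof p->a says "all of the payload" without making anyone work out which member happens to follow it.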