|
|||
|
On 4/12/2012 2:51 PM, Willem wrote:
[...]
> That's the kind of thinking that brought us 2-megabyte XML blobs
> as "database entities". Some stuff you just can't refactor when
> it turns out it's a performance killer, so you *have* to account
> for that from the beginning. Not to mention 'if it works, ship'

I thought it was "it compiles without errors... ship it." :-)

--
Kenneth Brody
|
|||
|
On Thu, 12 Apr 2012 21:23:03 -0400, Eric Sosman wrote:
> Despite its infelicities, I see little in the C library design
> that forces implementations or callers to perform poorly. Certainly,
> there are some interfaces that could, if revised, admit of
> higher-performing implementations. For example, malloc() is a fairly
> "narrow" interface, and one can imagine a super_malloc() to which one
> could pass information about the expected lifetime of the allocation,
> its affinity or disaffinity to other allocations, whether it's more
> likely to be used by the GPU or the DMA chip, and so on. But one can
> also see that such a super_malloc() could be rather messy to use:
>
>     char *ptr = malloc(42);
>
> vs.
>
>     super_malloc_options opt = { 0 };
>     opt.request_size = 42;
>     opt.growth_allowance = 42 * 6;
>     opt.lifetime_hint = SUPER_MALLOC_LIFETIME_BRIEF;
>     opt.cache_choice = SUPER_MALLOC_CACHE_AVOID(&blivet);
>     #if __STDC_VERSION__ > 201306L
>     opt.address_randomization =
>     #ifdef __WINDOWS__
>         __MS_RANDOMIZE_IF_ABLE__
>     #else
>         SUPER_MALLOC_RANDOMIZE_IF_ABLE
>     #endif
>         ;
>     #endif
>     char *ptr = super_malloc(&opt);
>
> There is no doubt that an interface along the lines of the latter could
> outperform the familiar malloc() in at least some situations. But which
> interface would you, as a code writer, prefer to use? Which would you
> expect to require more debugging time?

Come on Eric, this is a straw man argument. Of course it is possible to get the best of both worlds. For example, in the Pthreads API, the pthread_create() function takes a pthread_attr_t * argument, but you just need to provide NULL to get a sensible set of defaults.
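A minimal sketch of the "NULL means defaults" idea, applied to an allocator. Every name here (xm_alloc, struct xm_options, the XM_LIFETIME_* constants) is hypothetical and made up for illustration; the lifetime hint is accepted but ignored, since a real allocator would need platform-specific code to act on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical option block; none of these names come from a real library. */
enum xm_lifetime { XM_LIFETIME_DEFAULT = 0, XM_LIFETIME_BRIEF, XM_LIFETIME_LONG };

struct xm_options {
    size_t growth_allowance;    /* extra bytes to reserve beyond the request */
    enum xm_lifetime lifetime;  /* hint only; this sketch ignores it */
};

/* Allocate `size` bytes. Passing NULL for `opt` selects the defaults, in the
 * same spirit as passing NULL attributes to pthread_create(). */
void *xm_alloc(size_t size, const struct xm_options *opt)
{
    const struct xm_options defaults = { 0, XM_LIFETIME_DEFAULT };

    /* This sketch requires a non-zero size so that NULL always means failure. */
    assert(size > 0);
    if (opt == NULL)
        opt = &defaults;
    /* Refuse requests whose total size would overflow size_t. */
    if (opt->growth_allowance > (size_t)-1 - size)
        return NULL;
    return malloc(size + opt->growth_allowance);
}
```

The simple case then stays as short as malloc(42), namely xm_alloc(42, NULL), and only callers who care about the hints pay for filling in the struct.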
|
|||
|
On 4/12/2012 8:35 PM, James Kuyper wrote:
> On 04/12/2012 04:07 PM, BartC wrote:
>> "John Reye" <jononanon@googlemail.com> wrote in message
[...]
>>> Are these deficiencies _only_ string-related?
>>
>> It seems that way. But no-one is bothered about it, because writing
>> alternative versions (thin wrappers around standard functions) is trivial:
> ,,,
>> int strcpy_l(char *s,char *t,int tlen){
>>     if (tlen<0) tlen=strlen(t);
>>     memcpy(s,t,tlen+1);
>>     return tlen;
>> }
>
> Your wrapper version passes through the input array twice: once for
> strlen() and once for memcpy(). That's precisely the deficiency he wants
> to avoid. A minor change to the implementation of strcpy() would allow
> it to return the desired pointer at no extra cost - the cost for your
> wrapper is fairly high.

Well, "fairly high" is relative. What is the code going to do with the copy? Perhaps that is orders of magnitude longer than the call to strlen().

Note, too, that it may be more efficient to call strlen() and then memcpy() than it is to copy byte-by-byte as you look for the nul-terminator. (As I recall, the x86 series of CPU has a single opcode which basically does strlen, and memcpy can copy 4 [or even 8] bytes at a time for everything other than the endpoints.)

--
Kenneth Brody
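For reference, the two strategies being compared can be sketched side by side. The function names copy_one_pass and copy_two_pass are invented for this illustration; both return a pointer to the terminator written in the destination (the same contract as POSIX stpcpy()). Which one is faster depends on the platform and library, and this sketch makes no claim either way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One pass: copy src (including its terminator) byte by byte and return a
 * pointer to the terminator written in dst. */
char *copy_one_pass(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst = *src) != '\0') {
        dst++;
        src++;
    }
    return dst;
}

/* Two passes: strlen() scans src once, then memcpy() copies it in bulk. */
char *copy_two_pass(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    memcpy(dst, src, n + 1);   /* +1 copies the terminator as well */
    return dst + n;
}
```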
|
|||
|
"James Kuyper" <jameskuyper@verizon.net> wrote in message
news:4F882FAF.4050609@verizon.net...
> On 04/13/2012 06:38 AM, BartC wrote:
>>>> int strcpy_l(char *s,char *t,int tlen){
>>>>     if (tlen<0) tlen=strlen(t);
>>>>     memcpy(s,t,tlen+1);
>>>>     return tlen;
>>>> }
>>>
>>> Your wrapper version passes through the input array twice: once for
>>> strlen() and once for memcpy().
>>
>> Yes, I mentioned my versions work better when the caller knows the
>> lengths.
>
> Which I consider a very minor advantage; if the length is known, just
> call memcpy() directly. If it's not known, your version is slower than
> strcpy(), and no faster than strcpy() followed by strlen().

Then you lose all the abstraction that strcpy() etc provide; you have to remember to copy the extra char in the case of strcpy(). And my main example was strcat(); the strcpy() was needed to initialise the destination in each iteration of the test loop.

You don't really want to start inlining functions such as strcat() because the code will obscure whatever it is you're really trying to do.

Don't forget these _l functions also return the length of the resulting string. I altered the two test loops (one using _l functions, one using standard functions) so that the length of the resulting string was needed. (In the _l loop, I also used the result of strcpy_l() as one of the length parameters to strcat_l().)

That way I managed to outperform four x86 C compilers at their highest optimisation settings, *even without knowing* the lengths of the two strings at the start of each iteration! (OK, on a very specific test on two strings of 64 and 80 chars.)

I did another test using a strcmp_l(); this can return an instant result when lengths are unequal, instead of having to scan the actual strings. OK, you will argue this could also be done inline, but you don't really want to do that; it's the compiler's job to inline, not yours.

--
Bartc
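The post does not show strcat_l() or strcmp_l(), so the following is only a guess at their shape, under different names to make that clear. Assumptions: lengths are size_t rather than int, LEN_UNKNOWN stands in for the "negative means unknown" convention of strcpy_l(), and the comparison is an equality test only, since a length mismatch settles equality but not ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the "length not known" marker (an assumption of this sketch). */
#define LEN_UNKNOWN ((size_t)-1)

/* Append src to dst and return the length of the result. Either length may
 * be LEN_UNKNOWN, in which case it is measured with strlen(). The caller
 * must make sure dst has room for the result. */
size_t strcat_len(char *dst, size_t dlen, const char *src, size_t slen)
{
    assert(dst != NULL && src != NULL);
    if (dlen == LEN_UNKNOWN)
        dlen = strlen(dst);
    if (slen == LEN_UNKNOWN)
        slen = strlen(src);
    memcpy(dst + dlen, src, slen + 1);   /* +1 copies the terminator */
    return dlen + slen;
}

/* Return 1 if the strings are equal, 0 otherwise. Unequal lengths answer
 * the question without looking at the characters at all. */
int streq_len(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    if (alen != blen)
        return 0;
    return memcmp(a, b, alen) == 0;
}
```

Feeding the return value of one call in as the dlen of the next is what avoids rescanning the destination on every append.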
|
|||
|
On 04/13/2012 12:56 PM, Kenneth Brody wrote:
> On 4/12/2012 8:35 PM, James Kuyper wrote:
>> On 04/12/2012 04:07 PM, BartC wrote:
>>> "John Reye" <jononanon@googlemail.com> wrote in message
> [...]
>>>> Are these deficiencies _only_ string-related?
>>>
>>> It seems that way. But no-one is bothered about it, because writing
>>> alternative versions (thin wrappers around standard functions) is trivial:
>> ,,,
>>> int strcpy_l(char *s,char *t,int tlen){
>>>     if (tlen<0) tlen=strlen(t);
>>>     memcpy(s,t,tlen+1);
>>>     return tlen;
>>> }
>>
>> Your wrapper version passes through the input array twice: once for
>> strlen() and once for memcpy(). That's precisely the deficiency he wants
>> to avoid. A minor change to the implementation of strcpy() would allow
>> it to return the desired pointer at no extra cost - the cost for your
>> wrapper is fairly high.
>
> Well, "fairly high" is relative. What is the code going to do with the
> copy? Perhaps that is orders of magnitude longer than the call to strlen().

Agreed - I don't like wasting anything, but the amount of waste involved here is not sufficient to justify John Reye's description of these routines as "useless".

> Note, too, that it may be more efficient to call strlen() and then memcpy()
> than it is to copy byte-by-byte as you look for the nul-terminator. (As I
> recall, the x86 series of CPU has a single opcode which basically does
> strlen, and memcpy can copy 4 [or even 8] bytes at a time for everything
> other than the endpoints.)

That's too platform-specific an issue for my tastes; some other platform might have a single opcode for copying while looking for a null terminator. As long as I write clean simple code that does precisely what needs to be done, and no more than what needs to be done, I feel I've done my part of the job; choosing the right op-codes to implement it is the compiler's part.

If a two-pass implementation turns out to be faster than the one-pass solution I wrote, I can hope that a sufficiently sophisticated compiler will convert my one-pass C code into machine code equivalent to a two-pass implementation. I'm not going to lose any sleep over the possibility that it won't. If I need more speed, and can afford to lose portability, I'll use assembly language rather than trying to figure out how to trick a particular C compiler into generating the assembly language I want it to generate. John would probably be more interested in such issues than I am.
|
|||
|
Hi,
On 04/12/2012 08:22 PM, John Reye wrote:
> On Apr 12, 6:13 pm, Jens Gustedt <jens.gust...@loria.fr> wrote:
>> very useful for constructs such as
>>
>> char const* s = strcat(strcpy(malloc(100), "Hello "), "world");
> What's wrong with:
>
> char * s;
> if ((s = (char *)malloc(100)) == NULL) {
>     fprintf(stderr, "Out of memory\n");
>     exit(1);
> }
> strcpy(s, "Hello ");
> strcat(s, "world");
>
> Are you trying to tell me that yours will be more efficient?
> Methinks not.

Mine is an expression, so it can easily be packed in a macro, in the initializer of a `for`-loop variable, whatever.

>>
>> or
>>
>> char const* t = strcat((char[100]){ "Hello " }, s);
> char *t = (char *)((char[100]){ "Hello " });
> strcat(t, s);

In yours the compound literal does not make much sense, since you give away the unprotected pointer to it, so it would better be written

    char t[100] = { "Hello " };
    strcat(t, s);

Mine has the const qualifier on the pointed-to object. How would you achieve this?

>>> I already know the destination, in the first place.
>>
>> No, not necessarily. The first argument is an expression. This feature
>> avoids evaluating that expression multiple times.
>
> Sorry, I don't follow. Could you explain it (perhaps with an example).

If you'd like to use such things in macros (I do) you only want to evaluate each argument once, because of possible side effects.

    #define STRCAT3(X, Y, Z) strcat(strcat((X), (Y)), (Z))

In P99 I have

    #define P99_STRCATS(TARG, ...)

that lifts that idea to (almost) any fixed number of arguments.

(Well, I am cheating a bit: that one uses stpcpy under the hood to be more efficient. But the idea is the same.)

Jens
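The single-evaluation point can be made concrete with a first argument that has a side effect. In the sketch below, STRCAT3 is the macro from the post; STRCAT3_TWICE, next_buffer(), demo_pool and demo_calls are hypothetical names introduced only for the demonstration.

```c
#include <assert.h>
#include <string.h>

/* From the post: X appears once, because the inner strcat() hands the
 * destination back to the outer one. */
#define STRCAT3(X, Y, Z) strcat(strcat((X), (Y)), (Z))

/* Hypothetical alternative that ignores strcat()'s return value. It has to
 * name X twice, so X is evaluated twice. */
#define STRCAT3_TWICE(X, Y, Z) (strcat((X), (Y)), strcat((X), (Z)))

char demo_pool[2][32];   /* two scratch buffers for the demonstration */
int demo_calls;          /* how many times next_buffer() has run */

/* An expression with a side effect: each call hands out a different,
 * freshly emptied buffer. */
char *next_buffer(void)
{
    assert(demo_calls < 2);
    char *b = demo_pool[demo_calls++];
    b[0] = '\0';
    return b;
}
```

With demo_calls reset to 0, STRCAT3(next_buffer(), "Hello ", "world") calls next_buffer() once and builds "Hello world" in a single buffer. STRCAT3_TWICE with the same arguments calls it twice, so "Hello " and "world" end up in two different buffers.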
|
|||
|
On 04/12/2012 06:01 PM, John Reye wrote:
> Example: char * strcat ( char * destination, const char * source );
> Return Value: destination is returned.
>
> How useless is that! I already know the destination, in the first
> place.
> Why not return a pointer to the end of the concatenated string, or the
> size of the string. This would cost the library no extra performance
> cost whatsoever!

Ah, perhaps I found a library function that would fit your needs better. How about POSIX's memccpy?

    (((char*)memccpy(destination, source, 0, SIZE_MAX))-1)

should do what you want, I think.

(You'd have to be sure that source is 0-terminated for this to work, but you'd have to do that for strcat, too.)

Jens
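A hedged sketch of how that memccpy() expression might be wrapped. It differs from the one-liner in two ways, both assumptions of this sketch: it passes the real destination capacity instead of SIZE_MAX, and it checks for the NULL that memccpy() returns when no terminator is found within that many bytes. The names copy_end and append_end are invented here.

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX declaration of memccpy() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dstsize bytes. Returns a pointer to
 * the terminator written in dst, or NULL if src did not fit (in which case
 * dst holds dstsize bytes of src and is NOT terminated). */
char *copy_end(char *dst, const char *src, size_t dstsize)
{
    assert(dst != NULL && src != NULL);
    /* memccpy() returns the address just past the copied '\0'. */
    char *after = memccpy(dst, src, '\0', dstsize);
    return after != NULL ? after - 1 : NULL;
}

/* strcat-like variant: append src to the string already in dst and return a
 * pointer to the new terminator, or NULL if it did not fit. */
char *append_end(char *dst, const char *src, size_t dstsize)
{
    size_t used = strlen(dst);
    assert(used < dstsize);
    return copy_end(dst + used, src, dstsize - used);
}
```

Note that append_end() still has to find the end of dst first, so it only saves a scan if the caller keeps the returned pointer for the next append.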
|
|||
|
On Friday, 13 April 2012 19:48:30 UTC+1, James Kuyper wrote:
>
> If a two-pass implementation turns out to be faster than the one-pass
> solution I wrote, I can hope that a sufficiently sophisticated compiler
> will convert my one-pass C code into machine code equivalent to a
> two-pass implementation. I'm not going to lose any sleep over the
> possibility that it won't. If I need more speed, and can afford to lose
> portability, I'll use assembly language rather than trying to figure out
> how to trick a particular C compiler into generating the assembly
> language I want it to generate.
>

You look at the bottleneck, which is unlikely to be the string concatenation, but might be. You know that. But it's better to remove the bottleneck in C rather than assembly if you can. That way it's more likely that a maintainer can read your code. And you can port it much more easily. On the machine you port to, the optimal two-pass solution might well be pessimal, but in ten years it might be a 20GHz job available for five dollars, so no-one will care.
|
|||
|
Kenneth Brody <kenbrody@spamcop.net> writes:
> On 4/12/2012 2:51 PM, Willem wrote:
> [...]
> > That's the kind of thinking that brought us 2-megabyte XML blobs
> > as "database entities". Some stuff you just can't refactor when
> > it turns out it's a performance killer, so you *have* to account
> > for that from the beginning. Not to mention 'if it works, ship'
>
> I thought it was "it compiles without errors... ship it." :-)

You've obviously never met our suppliers...

Phil
--
> I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
|
|||
|
Jens Gustedt <jens.gustedt@loria.fr> writes:
> Hi,
>
> On 04/12/2012 08:22 PM, John Reye wrote:
> > On Apr 12, 6:13 pm, Jens Gustedt <jens.gust...@loria.fr> wrote:
> >> very useful for constructs such as
> >>
> >> char const* s = strcat(strcpy(malloc(100), "Hello "), "world");

gag.

> > What's wrong with:
> >
> > char * s;
> > if ((s = (char *)malloc(100)) == NULL) {
> >     fprintf(stderr, "Out of memory\n");
> >     exit(1);
> > }
> > strcpy(s, "Hello ");
> > strcat(s, "world");
> >
> > Are you trying to tell me that yours will be more efficient?
> > Methinks not.
>
> Mine is an expression, so it can easily be packed in a macro, in the
> initializer of a `for`-loop variable, whatever.
>
> >>
> >> or
> >>
> >> char const* t = strcat((char[100]){ "Hello " }, s);
> > char *t = (char *)((char[100]){ "Hello " });
> > strcat(t, s);
>
> In yours the compound literal does not make much sense, since you give
> away the unprotected pointer to it, so it would better be written
>
> char t[100] = { "Hello " };
> strcat(t, s);
>
> Mine has the const qualifier on the pointed-to object. How would you
> achieve this?
>
> >>> I already know the destination, in the first place.
> >>
> >> No, not necessarily. The first argument is an expression. This feature
> >> avoids evaluating that expression multiple times.
> >
> > Sorry, I don't follow. Could you explain it (perhaps with an example).
>
> If you'd like to use such things in macros (I do) you only want to
> evaluate each argument once, because of possible side effects.
>
> #define STRCAT3(X, Y, Z) strcat(strcat((X), (Y)), (Z))

retch.

> in P99 I have
>
> #define P99_STRCATS(TARG, ...)
>
> that lifts that idea to (almost) any fixed number of arguments.
>
> (Well I am cheating a bit, that one uses stpcpy under the hood to be
> more efficient. But the idea is the same.)

How can you use the word "efficient" so close to code that, from an efficiency viewpoint, is so obviously inefficient? Do you have an efficient bubblesort too in your library?

If you want to concatenate 3 strings, using sprintf should be way more efficient than the above nonsense.

Phil
--
> I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
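For comparison, a sketch of the formatted-output approach, using snprintf() rather than sprintf() so the destination size is respected. The helper name concat3 is made up for this example, and nothing here measures whether it is actually faster than chained strcat() calls; that would need a benchmark on the platform in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write a, b and c back to back into dst (capacity dstsize). Returns the
 * length of the result, or -1 if it did not fit or formatting failed. */
int concat3(char *dst, size_t dstsize,
            const char *a, const char *b, const char *c)
{
    assert(dst != NULL && dstsize > 0);
    int n = snprintf(dst, dstsize, "%s%s%s", a, b, c);
    /* snprintf() reports the length it wanted to write; a value of
     * dstsize or more means the output was truncated. */
    if (n < 0 || (size_t)n >= dstsize)
        return -1;
    return n;
}
```

Like the _l functions discussed earlier in the thread, this returns the resulting length, so the caller does not need a separate strlen().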