|
|||
|
I see a ~25% increase in performance (VC2010 SP1) when adding an
overload that takes integers by value instead of const references. Why are these not implemented as a part of STL for int, double, etc..? #define MIN(a, b) a<b?a:b #define MAX(a, b) a>b?a:b namespace std { inline const int min(int const a, int const b) { return MIN(a, b); } inline const int max(int const a, int const b) { return MAX(a, b); } } caller: { std::vector<int> v(2); v[0] = get_some_random_number(); v[1] = get_some_random_number(); int m = std::min(v[0], v[1]); // ~25% faster when the overload above is chosen } /Eirik |
|
|
||||
|
||||
|
|
|
|||
|
On Feb 22, 2:56*pm, Mike McCarty <mcmcca...@gmail.com> wrote:
> ~25% faster than what? > > 0.3ns vs. 0.4ns? > > I.e. how did you measure this and how can you be sure what you're seeing isn't just noise? Quite. And with what compiler options? Is the function even being inlined, let alone having the reference/dereference elided? |
|
|||
|
On Feb 22, 3:56*pm, Mike McCarty <mcmcca...@gmail.com> wrote:
> ~25% faster than what? > > 0.3ns vs. 0.4ns? > > I.e. how did you measure this and how can you be sure what you're seeing isn't just noise? ~25% faster than just using std::min(int const&, int const&) as specified by the standard. How many nano seconds faster is not really relevant. But it ran for about 20 seconds with about 420.000 iterations (see test iterations below) per second. test setup: int some_dummy_test_value = 0; size_t vector_size = 1000; vector<int> x = fill_vector_with_random_numbers(vector_size); vector<int> y = fill_vector_with_random_numbers(vector_size); test iteration: for(size_t i(0); i != vector_size; ++i) { some_dummy_test_value += std::min(x[i], y[i]); some_dummy_test_value += std::max(x[i], y[i]); } |
|
|||
|
420K iterations/second? That's ~2us per iteration == very slow (as in unbelievably so). It should take on order of nanoseconds, not microseconds.
In any case, it is possible that the optimizer realized that min(x,y)+max(x,y) == x+y and elided the conditional branches entirely. It may not have been able to make that elision via references. You might want to try making your benchmark less trivial. |
|
|||
|
On Feb 22, 4:56*am, Eirik <eirik.olims...@gmail.com> wrote:
> I see a ~25% increase in performance (VC2010 SP1) when adding an > overload that takes integers by value instead of const references. > Why are these not implemented as a part of STL for int, double, etc..? > > #define MIN(a, b) a<b?a:b > #define MAX(a, b) a>b?a:b > > namespace std > { > * * * * inline const int min(int const a, int const b) > * * * * { > * * * * * * * * return MIN(a, b); > * * * * } > * * * * inline const int max(int const a, int const b) > * * * * { > * * * * * * * * return MAX(a, b); > * * * * } > > } > > caller: > { > * * * * std::vector<int> v(2); > * * * * v[0] = get_some_random_number(); > * * * * v[1] = get_some_random_number(); > * * * * int m = std::min(v[0], v[1]); // ~25% faster when the overload above > is chosen > > } As others have asked, what compiler, what platform, what compiler options - specifically is optimization enabled? |
|
|||
|
On Wed, 2012-02-22, Eirik wrote:
> I see a ~25% increase in performance (VC2010 SP1) when adding an > overload that takes integers by value instead of const references. > Why are these not implemented as a part of STL for int, double, etc..? > .... > namespace std > { > inline const int min(int const a, int const b) > { > return MIN(a, b); > } With my compiler (g++ -O3 for amd64) I get the same assembly code when using the overload -- which is exactly what I'd expect for something this easy to optimize. int foo(int a, int b) { return std::min(a, b); } _Z3fooii: cmpl %edi, %esi movl %edi, %eax cmovle %esi, %eax ret /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
|
|||
|
On 2/22/2012 3:06 PM, Mike McCarty wrote:
> 420K iterations/second? That's ~2us per iteration == very slow (as in unbelievably so). It should take on order of nanoseconds, not microseconds. > yes. 2us is about 2x as long as it takes Windows to do a lock/unlock mutex pair on my system (via the "CreateMutex()" / "WaitForSingleObject()" route), which is itself absurdly slow (an "EnterCriticalSection()" / "LeaveCriticalSection()" pair is 15ns IIRC, or about 67x faster). if so, yes, that is pretty damn slow. unless of course someone is using an interpreter, HW emulator, or dragged some old-as-hell PC out of the closet ("how fast does it go on my early generation 386?"). far more likely, there is some sort of measurement issue, such as giving the wrong number of iterations or something... > In any case, it is possible that the optimizer realized that min(x,y)+max(x,y) == x+y and elided the conditional branches entirely. It may not have been able to make that elision via references. You might want to try making your benchmark less trivial. it depends. it the compiler can figure out that the references aren't needed, it can optimize things away. otherwise, it could be wasting a little bit of time getting/setting via references (which involve an additional indirection internally). granted, in general, I would advise against putting too much weight into benchmarks of this sort without good reason though (very rarely are these the sorts of issues which impact the overall performance of a program). generally, it is a good idea not to try to micro-optimize stuff unless it is really needed (not that one can't be aware of this stuff, or believe that it is all unknowable magic or something, but rather it is more that trying to optimize too much early on tends to result in overall worse code, and often not in areas which are actually particularly relevant to the overall running time). this is also part of why there are profilers: they can help point out where optimization effort would be most useful. or such... |
|
|||
|
On Feb 22, 11:23*pm, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Wed, 2012-02-22, Eirik wrote: > > I see a ~25% increase in performance (VC2010 SP1) when adding an > > overload that takes integers by value instead of const references. > > Why are these not implemented as a part of STL for int, double, etc..? > > ... > > namespace std > > { > > * *inline const int min(int const a, int const b) > > * *{ > > * * * * * *return MIN(a, b); > > * *} > > With my compiler (g++ -O3 for amd64) I get the same assembly code when > using the overload -- which is exactly what I'd expect for something > this easy to optimize. > > int foo(int a, int b) > { > * * return std::min(a, b); > > } > > _Z3fooii: > * * * * cmpl * *%edi, %esi > * * * * movl * *%edi, %eax > * * * * cmovle *%esi, %eax > * * * * ret > > /Jorgen > > -- > * // Jorgen Grahn <grahn@ *Oo *o. * . * * . > \X/ * * snipabacken.se> * O *o * . That example will of course result int the same, as foo takes arguments by value... |
|
|||
|
Eirik <eirik.olimstad@gmail.com> writes:
> That example will of course result int the same, as foo takes > arguments by value... Er, isn't that the whole point -- that std::min yields _exactly the same code_ as your hand-rolled not-using-references MIN thingy (meaning there's no reason to not use std::min)? -miles -- Innards, n. pl. The stomach, heart, soul, and other bowels. |
|
|||
|
On Thu, 2012-02-23, Eirik wrote:
> On Feb 22, 11:23*pm, Jorgen Grahn <grahn+n...@snipabacken.se> wrote: >> On Wed, 2012-02-22, Eirik wrote: >> > I see a ~25% increase in performance (VC2010 SP1) when adding an >> > overload that takes integers by value instead of const references. >> > Why are these not implemented as a part of STL for int, double, etc..? >> >> ... >> > namespace std >> > { >> > * *inline const int min(int const a, int const b) >> > * *{ >> > * * * * * *return MIN(a, b); >> > * *} >> >> With my compiler (g++ -O3 for amd64) I get the same assembly code when >> using the overload -- which is exactly what I'd expect for something >> this easy to optimize. >> >> int foo(int a, int b) >> { >> * * return std::min(a, b); >> >> } >> >> _Z3fooii: >> * * * * cmpl * *%edi, %esi >> * * * * movl * *%edi, %eax >> * * * * cmovle *%esi, %eax >> * * * * ret > That example will of course result int the same, as foo takes > arguments by value... It was the simplest case I could come up with quickly without having it all optimized away. Then please come up with a better example. "int foo(const int& a, const int& b)" perhaps? The burden of proof must ultimately be on you. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|