Newsgroup: comp.lang.lisp
  #1 (permalink)  
Old 06-02-2012, 10:41 AM
Nicolas Neuss
Default Re: Optimizing simple Common Lisp gibbs sampler program

Faheem Mitha <faheem@email.unc.edu> writes:

> (defun gibbs (N thin)
>   (declare (fixnum N thin))
>   (declare (optimize (speed 3) (safety 1)))
>   (let ((x 0.0) (y 0.0))
>     (declare ((double-float 0.0 *) x))
>     (declare (double-float y))
>     (print "Iter x y")
>     (dotimes (i N)
>       (dotimes (j thin)
>         (declare (fixnum i j))
>         (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4))))
>         (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) 2))))))
>       (format t "~a ~a ~a~%" i x y))))

         ^^^^^^

I didn't check it, but isn't the performance of this code completely
output-dominated?

Nicolas

P.S.: Your reasoning about the problems of C++/Python and similar
couplings was also what led me to CL. Maybe I'll expand this later.
  #2 (permalink)  
Old 06-03-2012, 07:00 AM
Faheem Mitha

On Sat, 02 Jun 2012 12:41:46 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:
> Faheem Mitha <faheem@email.unc.edu> writes:
>
>> (defun gibbs (N thin)
>>   (declare (fixnum N thin))
>>   (declare (optimize (speed 3) (safety 1)))
>>   (let ((x 0.0) (y 0.0))
>>     (declare ((double-float 0.0 *) x))
>>     (declare (double-float y))
>>     (print "Iter x y")
>>     (dotimes (i N)
>>       (dotimes (j thin)
>>         (declare (fixnum i j))
>>         (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4))))
>>         (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) 2))))))
>>       (format t "~a ~a ~a~%" i x y))))

>         ^^^^^^
>
> I didn't check it, but isn't the performance of this code completely
> output-dominated?


Hi Nicolas,

Thanks for your reply.

Based on my timings, it does not seem so. The time difference made by
removing the format statement is only a few seconds. Do you get
different results?

> P.S.: Your reasoning about the problems of C++/Python and similar
> couplings was also what led me to CL. Maybe I'll expand this later.


That would be nice.

As it happens, I had earlier read your paper

"On using Common Lisp for Scientific Computing"

but the link for that paper is broken. See, for example,

http://www.cl-user.net/asp/KhGQ/sdataQ19S81cTO7MjDQ3kOHpX8yBX8yBXnMq=/sdataQu3F$sSHnB==

which points to

http://www.iwr.uni-heidelberg.de/org...t2002-40.ps.gz

which returns a not found error.
Regards, Faheem
  #3 (permalink)  
Old 06-04-2012, 01:44 PM
Nicolas Neuss

Faheem Mitha <faheem@email.unc.edu> writes:

> On Sat, 02 Jun 2012 12:41:46 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:
>> Faheem Mitha <faheem@email.unc.edu> writes:
>>
>>> (defun gibbs (N thin)
>>>   (declare (fixnum N thin))
>>>   (declare (optimize (speed 3) (safety 1)))
>>>   (let ((x 0.0) (y 0.0))
>>>     (declare ((double-float 0.0 *) x))
>>>     (declare (double-float y))
>>>     (print "Iter x y")
>>>     (dotimes (i N)
>>>       (dotimes (j thin)
>>>         (declare (fixnum i j))
>>>         (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4))))
>>>         (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) 2))))))
>>>       (format t "~a ~a ~a~%" i x y))))

>>         ^^^^^^
>>
>> I didn't check it, but isn't the performance of this code completely
>> output-dominated?

>
> Hi Nicolas,
>
> Thanks for your reply.
>
> Based on my timings, it does not seem so. The time difference made by
> removing the format statement is only a few seconds. Do you get
> different results?


No, as I said I didn't check:-)

So it looks as if this code does a lot of output, but the output is not
the dominant cost. I would therefore suggest dropping the format
statement for further testing. Next, looking at Darren Wilkinson's
original post, by far the fastest version (8 secs) used the GSL
library. So using cl-gsl instead of cl-rmath would probably speed up
the calculation quite a lot.
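
One way to separate the cost of output from the cost of the computation is to time the same call with *standard-output* bound to a stream that discards everything. The following is only a sketch: noisy-loop is a made-up stand-in for gibbs (cl-rmath is not used here), and seconds-for and time-without-output are hypothetical helper names.

```lisp
(defun noisy-loop (n)
  "Hypothetical stand-in for GIBBS: cheap arithmetic plus one FORMAT per iteration."
  (let ((x 0.0d0))
    (dotimes (i n x)
      (setf x (/ 1.0d0 (+ 1.0d0 (* x x))))
      (format t "~a ~a~%" i x))))

(defun seconds-for (thunk)
  "Call THUNK and return the elapsed real time in seconds."
  (let ((start (get-internal-real-time)))
    (funcall thunk)
    (/ (- (get-internal-real-time) start)
       (float internal-time-units-per-second))))

(defun time-without-output (thunk)
  "Time THUNK while everything it writes to *STANDARD-OUTPUT* is discarded."
  ;; A broadcast stream with no component streams throws its output away.
  (let ((*standard-output* (make-broadcast-stream)))
    (seconds-for thunk)))

;; Usage: compare the two numbers.
;; (seconds-for (lambda () (noisy-loop 100000)))
;; (time-without-output (lambda () (noisy-loop 100000)))
```

Binding the stream only removes the I/O; FORMAT still converts the numbers to text, so deleting the call altogether remains the cleaner test.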

>> P.S.: Your reasoning about the problems of C++/Python and similar
>> couplings was also what led me to CL. Maybe I'll expand this later.

>
> That would be nice.
>
> As it happens, I had earlier read your paper
>
> "On using Common Lisp for Scientific Computing"
>
> but the link for that paper is broken. See, for example,
>
> http://www.cl-user.net/asp/KhGQ/sdataQ19S81cTO7MjDQ3kOHpX8yBX8yBXnMq=/sdataQu3F$sSHnB==
>
> which points to
>
> http://www.iwr.uni-heidelberg.de/org...t2002-40.ps.gz
>
> which returns a not found error.
> Regards, Faheem


Thanks. I'll try correcting this. For the moment, I have put a copy at
<http://scipolis.de/misc/cisc_2002.pdf>.

Nicolas


  #4 (permalink)  
Old 06-04-2012, 06:27 PM
Faheem Mitha

On Mon, 04 Jun 2012 15:44:45 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:

> So it looks as if this code does a lot of output, but the output is
> not the dominant cost. I would therefore suggest dropping the
> format statement for further testing. Next, looking at Darren
> Wilkinson's original post, by far the fastest version (8 secs) used
> the GSL library. So using cl-gsl instead of cl-rmath would probably
> speed up the calculation quite a lot.


I had assumed that the C libraries would not vary greatly in speed. That
might be a wrong assumption. Yes, I could try that. SBCL also has a
statistical profiler. I didn't try that, since I assumed this wouldn't
give me much information for such a small simple function, but it
might at least tell me how much time is spent running the C functions.
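
A minimal sketch of what a run under SBCL's statistical profiler could look like. It is SBCL-only, busy-work is a made-up stand-in for the sampler loop (no cl-rmath calls), and profile-busy-work is a hypothetical wrapper name; the sample count is an arbitrary choice.

```lisp
;; SB-SPROF ships with SBCL as a contrib module and must be loaded first.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-sprof))

(defun busy-work (n)
  "Hypothetical stand-in for the sampler loop: sums N square roots."
  (let ((acc 0.0d0))
    (dotimes (i n acc)
      (incf acc (sqrt (+ 1.0d0 i))))))

(defun profile-busy-work (&optional (n 20000000))
  "Run BUSY-WORK under the statistical profiler and print a flat report."
  (sb-sprof:with-profiling (:max-samples 40000 :report :flat)
    (busy-work n)))

;; Usage: (profile-busy-work)
```

The flat report lists the functions in which samples were taken, so for a loop that mostly calls into a foreign library it should at least give a rough idea of how the time splits between Lisp and foreign code.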

> Thanks. I'll try correcting this. For the moment, I have put a copy at
> <http://scipolis.de/misc/cisc_2002.pdf>.


Thanks. From the references it sounds like this was written around
2002/2003, i.e. about a decade ago. Have your views on this subject
changed in the interim? I think the CL scene has changed a bit since
then. SLIME for example was started around then.

Regards, Faheem
  #5 (permalink)  
Old 06-04-2012, 08:28 PM
Nicolas Neuss

Faheem Mitha <faheem@email.unc.edu> writes:

>> Thanks. I'll try correcting this. For the moment, I have put a copy
>> at <http://scipolis.de/misc/cisc_2002.pdf>.

>
> Thanks. From the references it sounds like this was written around
> 2002/2003, i.e. about a decade ago. Have your views on this subject
> changed in the interim? I think the CL scene has changed a bit since
> then. SLIME for example was started around then.


I'm not sure. On one hand, development of standard (e.g. web)
applications in free CL implementations has become much easier because
of 1. SLIME, 2. more library options, and 3. improvements of
implementations.

On the other hand, the world has changed in (at least) two respects
which are not so favorable for CL's use in scientific computing:

- Recently some languages[*] have acquired a rather fast JIT compiler,
which should make them serious competitors as combined
scripting/computing languages.

- Computers are increasingly multiprocessor machines with a large
number of processors. Standard CL code can hardly use this
power, because it usually spends a rather high percentage (say 10-30%)
of its time in garbage collection, and Amdahl's law kicks in if you do
not have a concurrent garbage collection scheme (which, as far as I
know, none of the CL implementations has at the moment). As a
consequence, for HPC you have to program in an even more C-like style
than before, to avoid garbage collection to a very high degree.
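
To put rough numbers on the Amdahl's law argument: if a fraction s of the run time is serial, the speedup on n processors is at most 1 / (s + (1 - s)/n). The sketch below assumes, as a simplification, that all GC time is serial and that everything else parallelizes perfectly; amdahl-speedup is a name chosen here for illustration.

```lisp
(defun amdahl-speedup (serial-fraction n)
  "Upper bound on the speedup with N processors when SERIAL-FRACTION
of the single-processor run time cannot be parallelized."
  (/ 1 (+ serial-fraction (/ (- 1 serial-fraction) n))))

;; With exact rationals the bounds for 8 processors are:
;; (amdahl-speedup 1/10 8)  ; => 80/17, about 4.7
;; (amdahl-speedup 3/10 8)  ; => 80/31, about 2.6
```

Under these assumptions, 10-30% GC time caps an 8-processor machine at roughly 2.6x to 4.7x, and no number of processors can exceed 1/s, i.e. 10x down to about 3.3x.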

Nicolas
[*] To be more explicit: I learned that this is the case for Racket and
Python/Numpy, although I did not test these features for real
scientific computing applications.
  #6 (permalink)  
Old 06-04-2012, 08:32 PM
Nicolas Neuss

Nicolas Neuss <lastname@scipolis.de> writes:

> I'm not sure. On one hand, development of standard (e.g. web)
> applications in free CL implementations has become much easier because
> of 1. SLIME, 2. more library options, and 3. improvements of

especially Quicklisp!
> implementations.


Nicolas
  #7 (permalink)  
Old 06-04-2012, 09:30 PM
Faheem Mitha

On Mon, 04 Jun 2012 22:28:20 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:
> Faheem Mitha <faheem@email.unc.edu> writes:
>
>>> Thanks. I'll try correcting this. For the moment, I have put a copy
>>> at <http://scipolis.de/misc/cisc_2002.pdf>.

>>
>> Thanks. From the references it sounds like this was written around
>> 2002/2003, i.e. about a decade ago. Have your views on this subject
>> changed in the interim? I think the CL scene has changed a bit since
>> then. SLIME for example was started around then.

>
> I'm not sure. On one hand, development of standard (e.g. web)
> applications in free CL implementations has become much easier because
> of 1. SLIME, 2. more library options, and 3. improvements of
> implementations.
>
> On the other hand, the world has changed in (at least) two respects
> which are not so favorable for CL's use in scientific computing:
>
> - Recently some languages[*] have acquired a rather fast JIT compiler,
> which should make them serious competitors as combined
> scripting/computing languages.
>
> - Computers are increasingly multiprocessor machines with a large
> number of processors. Standard CL code can hardly use this
> power, because it usually spends a rather high percentage (say 10-30%)
> of its time in garbage collection, and Amdahl's law kicks in if you do
> not have a concurrent garbage collection scheme (which, as far as I
> know, none of the CL implementations has at the moment). As a
> consequence, for HPC you have to program in an even more C-like style
> than before, to avoid garbage collection to a very high degree.


I don't know what "concurrent garbage collection" is, but is this
something that could be added to a CL implementation?

> Nicolas
>
>[*] To be more explicit: I learned that this is the case for Racket and
> Python/Numpy, although I did not test these features for real
> scientific computing applications.


It is true that Pypy for example will make Python more competitive for
scientific computing, if/when it matures. From what I know of Racket,
I would not have thought of it as a competitor to CL for scientific
computing, and even if it became successful I would have thought it
would only help CL by association.

Regards, Faheem
  #8 (permalink)  
Old 06-05-2012, 09:33 AM
Nicolas Neuss

Faheem Mitha <faheem@faheem.info> writes:

> I don't know what "concurrent garbage collection" is, but is this
> something that could be added to a CL implementation?


OK, concurrency (i.e. GC while other threads are running) is not
strictly necessary for achieving high performance. More important is
that the GC should be parallelized so that it is not any more a
bottleneck for multithreaded programs. Of course, this could be added
to CL implementations, but -as far as I know- this is not yet the case.
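
For a single thread, the share of run time that goes to GC can be estimated directly. This is an SBCL-only sketch: it relies on sb-ext:*gc-run-time*, which SBCL maintains in internal time units, and gc-fraction is a hypothetical helper name.

```lisp
(defun gc-fraction (thunk)
  "Return the fraction of run time spent in GC while calling THUNK (SBCL only)."
  (let ((gc-start sb-ext:*gc-run-time*)
        (run-start (get-internal-run-time)))
    (funcall thunk)
    (let ((gc (- sb-ext:*gc-run-time* gc-start))
          (run (- (get-internal-run-time) run-start)))
      ;; Guard against a run too short to register on the clock.
      (if (zerop run) 0 (/ gc run)))))

;; Usage:
;; (float (gc-fraction (lambda ()
;;                       (loop repeat 500000 do
;;                         (loop repeat 100 collect nil)))))
```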

You may check yourself what is happening in your implementation by
executing simple garbage producing loops like the following:

(defun simple-consing ()
  (loop repeat 500000 do
    (loop repeat 100 collect nil)))

(execute-with-n-threads #'simple-consing n)

At the ELS2011, I briefly showed the following timings using SBCL on an
AMD 64 bit machine with 2x2 cores:

Threads       1    2    3    4     5     6     7     8
(real) Time   1.6  5.0  8.6  11.6  13.5  16.3  20.5  24

OTOH, computing Mandelbrot sets scales up very nicely:

Threads       1    2    3    4    5    6    7    8
(real) Time   0.6  0.6  0.6  0.9  1.1  1.1  1.4  1.4

> It is true that Pypy for example will make Python more competitive for
> scientific computing, if/when it matures.


Isn't it yet? I learned it only from someone else, so I am genuinely
interested. What is lacking at the moment?

> From what I know of Racket, I would not have thought of it as a
> competitor to CL for scientific computing, and even if it became
> successful I would have thought it would only help CL by association.


Maybe. I would like it, of course.

Nicolas
  #9 (permalink)  
Old 06-05-2012, 10:17 AM
Faheem Mitha

On Tue, 05 Jun 2012 11:33:47 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:
> Faheem Mitha <faheem@faheem.info> writes:
>
>> I don't know what "concurrent garbage collection" is, but is this
>> something that could be added to a CL implementation?

>
> OK, concurrency (i.e. GC while other threads are running) is not
> strictly necessary for achieving high performance. More important is
> that the GC should be parallelized so that it is not any more a
> bottleneck for multithreaded programs. Of course, this could be added
> to CL implementations, but -as far as I know- this is not yet the case.
>
> You may check yourself what is happening in your implementation by
> executing simple garbage producing loops like the following:
>
> (defun simple-consing ()
>   (loop repeat 500000 do
>     (loop repeat 100 collect nil)))
>
> (execute-with-n-threads #'simple-consing n)


I don't have an implementation of execute-with-n-threads
handy. Multiprocessing is on my todo list, but I haven't got to it
yet.
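
For reference, a helper of that name could be sketched as follows. This is not the implementation used for the quoted timings, only a guess at its behaviour: it is SBCL-only (it calls SB-THREAD directly; a portable version would go through a library such as bordeaux-threads), and the optional argument to simple-consing is added here so that short test runs are possible.

```lisp
(defun simple-consing (&optional (outer 500000))
  "Allocate and immediately discard OUTER short lists."
  (loop repeat outer do
    (loop repeat 100 collect nil)))

(defun execute-with-n-threads (function n)
  "Run FUNCTION in N threads at once, wait for all of them,
and return the elapsed real time in seconds."
  (let* ((start (get-internal-real-time))
         (threads (loop for i from 1 to n
                        collect (sb-thread:make-thread
                                 function
                                 :name (format nil "worker-~D" i)))))
    ;; JOIN-THREAD blocks until the given thread has finished.
    (mapc #'sb-thread:join-thread threads)
    (/ (- (get-internal-real-time) start)
       (float internal-time-units-per-second))))

;; Usage: print one timing per thread count.
;; (loop for n from 1 to 8
;;       do (format t "~D thread~:P: ~,2F s~%"
;;                  n (execute-with-n-threads #'simple-consing n)))
```

Each thread does the full amount of work, so with perfect scaling the times would stay flat as n grows up to the number of cores.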

> At the ELS2011, I briefly showed the following timings using SBCL on an
> AMD 64 bit machine with 2x2 cores:
>
> Threads       1    2    3    4     5     6     7     8
> (real) Time   1.6  5.0  8.6  11.6  13.5  16.3  20.5  24


Yes, that's not so great.

> OTOH, computing Mandelbrot sets scales up very nicely:
>
> Threads       1    2    3    4    5    6    7    8
> (real) Time   0.6  0.6  0.6  0.9  1.1  1.1  1.4  1.4
>
>> It is true that Pypy for example will make Python more competitive for
>> scientific computing, if/when it matures.

>
> Isn't it yet? I learned it only from someone else, so I am genuinely
> interested. What is lacking at the moment?


There are some incompatibilities between the pypy and cpython C APIs,
I believe, though I'm not really familiar with the issue. Notably,
pypy can't run numpy yet, and possibly has issues with other cpython
libraries that use the Python C API. I tried running pypy on my
machine. It is certainly fast, and in combination with Python's other
advantages, could prove a formidable competitor in the scientific
computing area in the future. People are already using Python for
scientific computing, even though it is slow and speeding it up
involves the painful process of giving it a C/C++ backend.

This seems to be as recent as anything -
http://morepypy.blogspot.in/2012/04/...ss-report.html
Googling "pypy numpy" gives other similar hits.

Of course, speed is not the only issue to consider. Part of my
interest in CL is that it seems to map better to things like
algorithms, and also has better abstraction facilities - e.g. similar
blocks of syntax can be abstracted away. Similar blocks of syntax are
not uncommon in scientific algorithms.

To take it to the extreme, it is reasonable to use DSLs in scientific
code, imo.

Also, frankly, CL just strikes me as a better designed language than
Python, which is superficially appealing, but imo has a design and
implementation that is not as solid as one would like.

>> From what I know of Racket, I would not have thought of it as a
>> competitor to CL for scientific computing, and even if it became
>> successful I would have thought it would only help CL by association.


> Maybe. I would like it, of course.


Have you used Racket in your work?
Regards, Faheem
  #10 (permalink)  
Old 06-05-2012, 12:09 PM
Nicolas Neuss

Faheem Mitha <faheem@faheem.info> writes:

> On Tue, 05 Jun 2012 11:33:47 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:
>> Faheem Mitha <faheem@faheem.info> writes:
>>
>>> I don't know what "concurrent garbage collection" is, but is this
>>> something that could be added to a CL implementation?

>>
>> OK, concurrency (i.e. GC while other threads are running) is not
>> strictly necessary for achieving high performance. More important is
>> that the GC should be parallelized so that it is not any more a
>> bottleneck for multithreaded programs. Of course, this could be added
>> to CL implementations, but -as far as I know- this is not yet the case.
>>
>> You may check yourself what is happening in your implementation by
>> executing simple garbage producing loops like the following:
>>
>> (defun simple-consing ()
>>   (loop repeat 500000 do
>>     (loop repeat 100 collect nil)))
>>
>> (execute-with-n-threads #'simple-consing n)

>
> I don't have an implementation of execute-with-n-threads
> handy. Multiprocessing is on my todo list, but I haven't got to it
> yet.


If you can run my Femlisp code (I think even the Quicklisp version
works, the CVS version should work in any case), you should be able to
do something like the following in the FL.MULTIPROCESSING package:

(defun speedup-test (func)
  (loop for i from 1 upto 8 do
    (format t "~R thread~:P~%" i)
    (let ((*number-of-threads* i))
      (time (with-workers (func)
              (loop repeat i do (work-on)))))))

(speedup-test (_ (simple-consing 500000)))

> [...]
>>> It is true that Pypy for example will make Python more competitive for
>>> scientific computing, if/when it matures.

>>
>> Isn't it yet? I learned it only from someone else, so I am genuinely
>> interested. What is lacking at the moment?

>
> There are some incompatibilities between the pypy and cpython C APIs,
> I believe, though I'm not really familiar with the issue. Notably,
> pypy can't run numpy yet, and possibly has issues with other cpython
> libraries that use the Python C API. I tried running pypy on my
> machine. It is certainly fast, and in combination with Python's other
> advantages, could prove a formidable competitor in the scientific
> computing area in the future. People are already using Python for
> scientific computing, even though it is slow and speeding it up
> involves the painful process of giving it a C/C++ backend.
>
> This seems to be as recent as anything -
> http://morepypy.blogspot.in/2012/04/...ss-report.html
> Googling "pypy numpy" gives other similar hits.


Thanks, that was informative.

> Of course, speed is not the only issue to consider. Part of my
> interest in CL is that it seems to map better to things like
> algorithms, and also has better abstraction facilities - e.g. similar
> blocks of syntax can be abstracted away. Similar blocks of syntax are
> not uncommon in scientific algorithms.
>
> To take it to the extreme, it is reasonable to use DSLs in scientific
> code, imo.
>
> Also, frankly, CL just strikes me as a better designed language than
> Python, which is superficially appealing, but imo has a design and
> implementation that is not as solid as one would like.


Agreed.

>>> From what I know of Racket, I would not have thought of it as a
>>> competitor to CL for scientific computing, and even if it became
>>> successful I would have thought it would only help CL by association.

>
>> Maybe. I would like it, of course.

>
> Have you used Racket in your work?


Not Racket, but its predecessor DrScheme, and that one only for
educational purposes. However, I was impressed at some time how fast
its JIT compiler had become (although it was for an example outside of
scientific computing).

With "I would like it", I meant "I would like it if it helped CL by
association".

Nicolas
  #11 (permalink)  
Old 06-05-2012, 10:03 PM
Faheem Mitha

On Tue, 05 Jun 2012 14:09:24 +0200, Nicolas Neuss <lastname@scipolis.de> wrote:

[snip]

> If you can run my Femlisp code (I think even the Quicklisp version
> works, the CVS version should work in any case), you should be able to
> do something like the following in the FL.MULTIPROCESSING package:


> (defun speedup-test (func)
>   (loop for i from 1 upto 8 do
>     (format t "~R thread~:P~%" i)
>     (let ((*number-of-threads* i))
>       (time (with-workers (func)
>               (loop repeat i do (work-on)))))))


> (speedup-test (_ (simple-consing 500000)))


Ok, thanks. I'll give that a try. Is it OK if I get back to you with
questions if necessary? I'd do this via email.

[snip]

>>>> From what I know of Racket, I would not have thought of it as a
>>>> competitor to CL for scientific computing, and even if it became
>>>> successful I would have thought it would only help CL by
>>>> association.


>>> Maybe. I would like it, of course.

>>
>> Have you used Racket in your work?


> Not Racket, but its predecessor DrScheme, and that one only for
> educational purposes. However, I was impressed at some time how
> fast its JIT compiler had become (although it was for an example
> outside of scientific computing).


> With "I would like it", I meant "I would like it if it helped CL by
> association".


Right, I see. Yes, that would be good.

Regards, Faheem
  #12 (permalink)  
Old 06-06-2012, 06:50 PM
Nicolas Neuss

Faheem Mitha <faheem@faheem.info> writes:

> Ok, thanks. I'll give that a try. Is it OK if I get back to you with
> questions if necessary? I'd do this via email.


Yes, of course, that would be fine. I'm quite interested in this topic.

Nicolas
All times are GMT.