  #1
03-27-2006, 02:00 PM
Bart Vandewoestyne
status of quadruple precision arithmetic in g95 and gfortran?

The g95 status page mentions that support for quad precision
arithmetic in g95 is `coming soon...'.

I can't find anything related to quadruple precision arithmetic
on the gfortran site.

Can somebody comment on the status of the quadruple precision
arithmetic support in both g95 and gfortran? I assume it isn't
available yet? Anybody who can tell me how far away we are from it?

Regards,
Bart

--
"Share what you know. Learn what you don't."

  #2
03-27-2006, 02:33 PM
Tim Prince
Re: status of quadruple precision arithmetic in g95 and gfortran?

Bart Vandewoestyne wrote:
> The g95 status page mentions that support for quad precision
> arithmetic in g95 is `coming soon...'.
>
> I can't find anything related to quadruple precision arithmetic
> on the gfortran site.
>
> Can somebody comment on the status of the quadruple precision
> arithmetic support in both g95 and gfortran? I assume it isn't
> available yet? Anybody who can tell me how far away we are from it?
>

You may have to search the fortran@gcc.gnu.org list archives (to learn
about gfortran progress) yourself, as it's not clear what you want.
Surely, the question of quad arithmetic will always be target dependent.
gfortran presumably already supports it on a few targets where it's
readily available, but those aren't the most popular ones.
ia64 quad appears to depend mainly on someone doing the work on ia64.md,
where there is support for 80-bit but not 128-bit floating point. In
case you don't see the point, this is a language independent part of
gcc. I wouldn't bet on that changing, even with all the publicity about
financial commitments to gcc development.

  #3
03-27-2006, 04:35 PM
Bart Vandewoestyne
Re: status of quadruple precision arithmetic in g95 and gfortran?

On 2006-03-27, Tim Prince <tprince@nospamcomputer.org> wrote:
>
> You may have to search the fortran@gcc.gnu.org list archives (to learn
> about gfortran progress) yourself, as it's not clear what you want.
> Surely, the question of quad arithmetic will always be target dependent.
> gfortran presumably already supports it on a few targets where it's
> readily available, but those aren't the most popular ones.


Sorry, you are right... I should be more specific, I guess.
If I say 'quadruple precision on i386 architectures', does that
make my question clearer?

Actually, what I basically would like to know is when g95 will be
able to compute with the same maximum precision as for example
ifort 9.0 can do on my Linux i386 box. Suppose I use the
following numeric kinds:

integer, parameter, public :: sp = kind(1.0)
integer, parameter, public :: dp = selected_real_kind(2*precision(1.0_sp))
integer, parameter, public :: qp_preferred = &
selected_real_kind(2*precision(1.0_dp))
integer, parameter, public :: qp = (1+sign(1,qp_preferred))/2*qp_preferred+ &
(1-sign(1,qp_preferred))/2*dp

then qp is available and has higher precision than dp if I compile with
ifort 9.0 on my Debian GNU/Linux box on i386. If I compile this with g95,
then qp is the same numeric kind as dp.

Regards,
Bart

--
"Share what you know. Learn what you don't."

  #4
03-27-2006, 05:03 PM
Joost
Re: status of quadruple precision arithmetic in g95 and gfortran?

Bart,

right now, g95 doesn't support real*16, but does support real*10 on the
appropriate hardware. So I think you could use
qp_preferred = selected_real_kind(1+precision(1.0_dp))
to get real*10, but nothing more than that. An important issue is of
course that if the hardware doesn't support real*16 it is not only very
slow to compute with real*16 quantities, but also a lot of work to
implement all operations required for real*16 calculations.

Joost
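
To see which real kinds a given compiler offers, the selected_real_kind requests discussed above can be probed at run time. This is only a minimal sketch, not tied to any particular compiler; the program name and the list of requested digit counts are illustrative assumptions, not something from the thread.

```fortran
program show_real_kinds
  implicit none
  ! Requested decimal precisions (illustrative): roughly single,
  ! double, x87 extended, and IEEE quad.
  integer, parameter :: digits_wanted(4) = (/ 6, 15, 18, 33 /)
  integer :: i, k

  do i = 1, size(digits_wanted)
    ! selected_real_kind returns a negative value when no real kind
    ! with at least this many decimal digits exists.
    k = selected_real_kind(digits_wanted(i))
    if (k > 0) then
      print *, 'digits requested:', digits_wanted(i), ' kind:', k
    else
      print *, 'digits requested:', digits_wanted(i), ' not available, result:', k
    end if
  end do
end program show_real_kinds
```

A compiler without real*16 reports a negative result for the last request, and one without real*10 does so for the 18-digit request as well.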


  #5
03-27-2006, 06:09 PM
Bernhard Enders
Re: status of quadruple precision arithmetic in g95 and gfortran?

Computing with quad precision is really slow because it is
implemented in software, at least with the Intel Fortran Compiler. It is
implemented in software because there is no system (is there?) that supports
128-bit floating-point arithmetic. In the case of extended precision
(80-bit floating-point arithmetic), several systems have hardware support for
it; that's true for IA32, EM64T, AMD64, etc. So you can expect a smaller
performance hit from extended precision than from quad precision, when
compared to double precision.

Bernhard.


  #6
03-27-2006, 07:53 PM
glen herrmannsfeldt
Re: status of quadruple precision arithmetic in g95 and gfortran?

Bernhard Enders <bgeneto@gmail.com> wrote:
> Computing with quad precision is really slow because it is
> implemented in software, at least with the Intel Fortran Compiler. It is
> implemented in software because there is no system (is there?) that supports
> 128-bit floating-point arithmetic.


IBM S/370 supports 128-bit except for divide, which is done
in software. In ESA/390 and later, divide is done in hardware.

The 360/85 was the original machine supporting this format.

VAX has H-float, but it is done through software emulation
on most models. I was told that the VAX 11/730 supports it
(in microcode), the slowest of the non-micro VAXs.

-- glen

  #7
03-28-2006, 02:51 AM
Tim Prince
Re: status of quadruple precision arithmetic in g95 and gfortran?

Bernhard Enders wrote:
> Computing with quad precision is really slow because it is
> implemented in software, at least with the Intel Fortran Compiler. It is
> implemented in software because there is no system (is there?) that supports
> 128-bit floating-point arithmetic. In the case of extended precision
> (80-bit floating-point arithmetic), several systems have hardware support for
> it; that's true for IA32, EM64T, AMD64, etc. So you can expect a smaller
> performance hit from extended precision than from quad precision, when
> compared to double precision.
>

Several architectures include hardware support for 128-bit floating
point by combinations of instructions. Among those still in production
are IA64 and IBM Power. The former has both 80-bit and 128-bit
IEEE-style, generally at most one of which is implemented with a single
set of options. The latter has a non-IEEE compliant version of somewhat
less precision.
The 80-bit arithmetic might be implemented with 128-bit storage (48 bits
unused), due to the alignment requirements for efficiency.

  #8
03-28-2006, 01:22 PM
Bernhard Enders
Re: status of quadruple precision arithmetic in g95 and gfortran?

Thanks for the information concerning 64-bit architectures. This is a
bit OT, but I would like to know where I can read more technical
information about the "alignment requirements for efficiency", i.e.,
the fact that 80-bit calculations are performed using 128-bit
registers (I have heard about this but can't remember where). And why
in the world don't we have 128-bit arithmetic on 'popular'
architectures such as ia32 or AMD64, if 128-bit registers exist
on these architectures? Is it so difficult (or costly) to
implement 128-bit operations, or is there no interest in doing this?
Just for information, here is an excerpt from the AMD64 architecture
manual, vol. 1, showing that it has 16 128-bit registers with media
instructions (why no 128-bit fp support at all??):

"The AMD64 architecture provides three floating-point instruction
subsets, using three distinct register sets:
- 128-Bit Media Instructions support 32-bit single-precision and 64-bit
double-precision floating-point operations, in addition to integer
operations."

Best regards,

Bernhard.


  #9
03-28-2006, 02:20 PM
Tim Prince
Re: status of quadruple precision arithmetic in g95 and gfortran?

Bernhard Enders wrote:
> Thanks for the information concerning 64-bit architectures. This is a
> bit OT, but I would like to know where I can read more technical
> information about the "alignment requirements for efficiency", i.e.,
> the fact that 80-bit calculations are performed using 128-bit
> registers (I have heard about this but can't remember where). And why
> in the world don't we have 128-bit arithmetic on 'popular'
> architectures such as ia32 or AMD64, if 128-bit registers exist
> on these architectures? Is it so difficult (or costly) to
> implement 128-bit operations, or is there no interest in doing this?
> Just for information, here is an excerpt from the AMD64 architecture
> manual, vol. 1, showing that it has 16 128-bit registers with media
> instructions (why no 128-bit fp support at all??):
>
> "The AMD64 architecture provides three floating-point instruction
> subsets, using three distinct register sets:
> - 128-Bit Media Instructions support 32-bit single-precision and 64-bit
> double-precision floating-point operations, in addition to integer
> operations."
>

The 80-bit fp calculations do use the 80-bit x87 registers. You may
find something about the recommendation for 128-bit data alignment in
the CPU manufacturers' software guides. A packed array of 80-bit data
would involve frequent multiple accesses to cache and memory, including
data which straddle cache line boundaries. Few Fortran compilers support
this format, in spite of it being supported in nearly all C compilers
for linux (but few for Windows).
Most Fortran compilers now do support the 128-bit parallel mode with
auto-vectorization, but that is a big diversion from the original topic
of quad precision. No one calls the 4 simultaneous single (or paired
double) precision operations "quad precision". The details of
implementation vary with hardware type; full width parallel operation on
Intel desktop CPUs, paired 64-bit width floating point units on AMD
desktops, splitting into a pair of closely pipelined 64-bit operations
on pentium-m, all with the same binary software code.

  #10
03-28-2006, 04:02 PM
Herman D. Knoble
Re: status of quadruple precision arithmetic in g95 and gfortran?

You are right about the cost of QP. Skip Knoble

Program Qtime
  ! Sample Program to illustrate DP versus QP compute times:
  ! Intel Fortran V9.0-5748 on AMD Opteron 852 with O2.

  integer, parameter :: QDP = selected_real_kind(30)
  real(kind=QDP) :: x, sum
  real :: T1, T2, Seconds
  integer :: i, pulses, PPS

  x = 1.5_QDP
  sum = 0.0_QDP
  CALL SYSTEM_CLOCK(COUNT=Pulses, COUNT_RATE=PPS)
  T1 = REAL(Pulses, QDP)/PPS

  Do I = 1, 100000000
    sum = sum + I*x
  end do

  CALL SYSTEM_CLOCK(COUNT=Pulses, COUNT_RATE=PPS)
  T2 = REAL(Pulses, QDP)/PPS
  Seconds = T2 - T1
  print *, " Time in seconds: ", Seconds
  print *, "QDP=", QDP
  print *, "Sum=", Sum

end Program Qtime


Output for: integer, parameter :: QDP = selected_real_kind(15)

Time in seconds: 0.5800781
QDP= 8
Sum= 7.500000080627340E+015

Output for: integer, parameter :: QDP = selected_real_kind(30)
Time in seconds: 5.750000
QDP= 16
Sum= 7500000075000000.00000000000000000



  #11
03-28-2006, 04:56 PM
Ian Gay
Re: status of quadruple precision arithmetic in g95 and gfortran?

Herman D. Knoble <SkipKnobleLESS@SPAMpsu.DOT.edu> wrote in
news:4dni22t33u3i6u6ctruig5iltu9he5d6im@4ax.com:

> You are right about the cost of QP. Skip Knoble
>
> Program Qtime
> ! Sample Program to illustrate DP versus QP compute times:
> ! Intel Fortran V9.0-5748 on AMD Opteron 852 with O2.
>
> integer, parameter :: QDP = selected_real_kind(30)
> real(kind=QDP) :: x, sum
> real :: T1,T2, Seconds
> integer :: i, pulses, PPS
>
> x=1.5_QDP
> sum=0.0_QDP
> CALL SYSTEM_CLOCK(COUNT=Pulses,COUNT_RATE=PPS)
> T1 = REAL(Pulses,QDP)/PPS
>
> Do I=1,100000000
> sum=sum+I*x
> end do
>
> CALL SYSTEM_CLOCK(COUNT=Pulses,COUNT_RATE=PPS)
> T2 = REAL(Pulses,QDP)/PPS
> Seconds=T2-T1
> print *, " Time in seconds: ",Seconds
> print *, "QDP=",QDP
> print *, "Sum=",Sum
>
> end Program Qtime
>
>
> Output for: integer, parameter :: QDP = selected_real_kind(15)
>
> Time in seconds: 0.5800781
> QDP= 8
> Sum= 7.500000080627340E+015
>


Why doesn't double precision produce a better result here?


> Output for: integer, parameter :: QDP = selected_real_kind(30)
> Time in seconds: 5.750000
> QDP= 16
> Sum= 7500000075000000.00000000000000000
>
> On Tue, 28 Mar 2006 14:20:04 GMT, Tim Prince
> <tprince@nospamcomputer.org> wrote:
>
> -|Bernhard Enders wrote:
>

<snip>

--
*********** To reply by e-mail, make w single in address **************

  #12
03-28-2006, 05:54 PM
Richard E Maine
Re: status of quadruple precision arithmetic in g95 and gfortran?

Ian Gay <gay@sfuu.ca> wrote:

> Herman D. Knoble <SkipKnobleLESS@SPAMpsu.DOT.edu> wrote in
> news:4dni22t33u3i6u6ctruig5iltu9he5d6im@4ax.com:

....
> > Do I=1,100000000
> > sum=sum+I*x
> > end do

....
> > Sum= 7.500000080627340E+015

>
> Why doesn't double precision produce a better result here?


Why should it produce a better result? If my quick check is correct (and
I might have missed because it isn't very far off and my check was
pretty hasty) this goes past the limits for which IEEE double gives
perfect results. Therefore, you'll have round-off errors in the
addition. And since there are quite a lot of additions (100 million of
them), those round-off errors can work their way up by quite a few bits
from the low order one.

Are you perhaps assuming that just because double has about 15 digits of
precision, that you can count on roundoff always staying down in the
bottom few bits? If so, I suggest reading up on the subject of numerical
instability.

Sounds to me like just a typical case of life in the real world of
floating point arithmetic.
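
One way to check the limit mentioned above is to print the spacing between adjacent representable numbers near the size of the final sum. This is a minimal sketch, assuming that the kind returned by selected_real_kind(15) is IEEE double; the program name and the sample values are illustrative only.

```fortran
program spacing_check
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  real(dp) :: x

  ! spacing(x) is the gap between x and the next representable
  ! number of the same kind.
  x = 2.0_dp**51
  print *, 'spacing near 2**51 :', spacing(x)
  x = 2.0_dp**52
  print *, 'spacing near 2**52 :', spacing(x)
  x = 7.5e15_dp
  print *, 'spacing near 7.5e15:', spacing(x)
end program spacing_check
```

With a 53-bit significand the spacing reaches 1.0 at 2**52 (about 4.5e15). The partial sums in the timing loop are multiples of 1.5, so once they pass that point any partial sum with a fractional part of 0.5 can no longer be stored exactly.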

--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain

  #13
03-28-2006, 07:50 PM
Herman D. Knoble
Re: status of quadruple precision arithmetic in g95 and gfortran?

On Tue, 28 Mar 2006 09:54:09 -0800, nospam@see.signature (Richard E Maine) wrote:

-|Ian Gay <gay@sfuu.ca> wrote:
-|
-|> Herman D. Knoble <SkipKnobleLESS@SPAMpsu.DOT.edu> wrote in
-|> news:4dni22t33u3i6u6ctruig5iltu9he5d6im@4ax.com:
-|...
-|> > Do I=1,100000000
-|> > sum=sum+I*x
-|> > end do
-|...
-|> > Sum= 7.500000080627340E+015
-|>
-|> Why doesn't double precision produce a better result here?
-|
-|Why should it produce a better result? If my quick check is correct (and
-|I might have missed because it isn't very far off and my check was
-|pretty hasty) this goes past the limits for which IEEE double gives
-|perfect results. Therefore, you'll have round-off errors in the
-|addition. And since there are quite a lot of additions (100 million of
-|them), those round-off errors can work their way up by quite a few bits
-|from the low order one.

Richard: As always, thank you.

Ian, here are some additional notes and code that you may wish to
check out that may help you with this and other more complex cases.

First, I completely agree with Richard's analysis, namely that such sums
are not the best numerical computations. But a loop like

   sum = 0
   delta = 0.1      ! any fraction not exactly representable in binary
   do i = 1, n
      sum = sum + delta      ! method 1
   end do

will accumulate the representation error (of delta).
From long experience we know that representation
error (of decimal fractions) can also be magnified by subtracting
two nearly equal quantities: the most significant digits cancel
during the subtraction, and the least significant digits
(where various numerical errors can be) become the most
significant digits. The example (program fuzztest)
http://ftp.cac.psu.edu/pub/ger/fortran/hdk/eps.f90
and
http://ftp.cac.psu.edu/pub/ger/fortran/hdk/example1.txt
illustrate this somewhat dramatically.

The loop

   do i = 1, n
      sum = sum + i*delta      ! method 2
   end do

may have round-off error, but at least it will not accumulate
representation error the way the method 1 loop above does.

There's also a better way to sum, as Kahan (and Giles) point out at:
http://ftp.cac.psu.edu/pub/ger/fortran/hdk/KahnSum.f90
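
For readers who cannot fetch that file, the following is a minimal sketch of compensated (Kahan) summation applied to the same terms as the timing loop. It is not the linked KahnSum.f90; the program and variable names are assumptions made for illustration.

```fortran
program kahan_demo
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  real(dp) :: total, comp, term, y, t
  integer :: i

  total = 0.0_dp
  comp  = 0.0_dp              ! running estimate of the lost low-order bits

  do i = 1, 100000000
    term  = 1.5_dp * i
    y     = term - comp       ! apply the correction from the previous step
    t     = total + y         ! low-order bits of y may be lost here
    comp  = (t - total) - y   ! recover what was lost, with opposite sign
    total = t
  end do

  print *, 'Compensated sum =', total
end program kahan_demo
```

The correction relies on the subtractions being evaluated as written, so optimization options that allow the compiler to reassociate floating-point expressions (for example -ffast-math in GCC-based compilers) may remove the benefit.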


-|
-|Are you perhaps assuming that just because double has about 15 digits of
-|precision, that you can count on roundoff always staying down in the
-|bottom few bits? If so, I suggest reading up on the subject of numerical
-|instability.
-|
-|Sounds to me like just a typical case of life in the real world of
-|floating point arithmetic.

I agree with Richard here also. I included displaying the sum realizing that
roundoff would likely happen.

The real purpose of the posting was to illustrate the order of
magnitude difference in computation time between Double and Quadruple
precision, using the Intel compiler, which supports Real*16 (and
Complex*32). I'd guess that a compiler can do a quad software implementation
faster than Quad.f90: http://users.bigpond.net.au/amiller/quad.html
which uses a Fortran derived type (quad).

All the best.
Skip Knoble


  #14
03-29-2006, 01:58 AM
Ian Gay
Re: status of quadruple precision arithmetic in g95 and gfortran?

nospam@see.signature (Richard E Maine) wrote in
news:1hcwlu7.17lxrqj1vfm668N%nospam@see.signature:

> Ian Gay <gay@sfuu.ca> wrote:
>
>> Herman D. Knoble <SkipKnobleLESS@SPAMpsu.DOT.edu> wrote in
>> news:4dni22t33u3i6u6ctruig5iltu9he5d6im@4ax.com:

> ...
>> > Do I=1,100000000
>> > sum=sum+I*x
>> > end do

> ...
>> > Sum= 7.500000080627340E+015

>>
>> Why doesn't double precision produce a better result here?

>
> Why should it produce a better result? If my quick check is
> correct (and I might have missed because it isn't very far off
> and my check was pretty hasty) this goes past the limits for which
> IEEE double gives perfect results. Therefore, you'll have
> round-off errors in the addition. And since there are quite a lot
> of additions (100 million of them), those round-off errors can
> work their way up by quite a few bits from the low order one.
>


Recall that x had the value 1.5.
I was thinking that since 1.5 is exactly representable in binary
floating point (and assuming a good compiler would represent it
exactly :-)), the errors would be much smaller than this. On
further thought, I see that this is a (carefully constructed?)
pathological example of errors from denormalization of the smaller
argument to a floating add.
For amusement (qt is the OP's program set for double precision;
Windows XP on an Athlon):

C:\source\test>g95 -o qt qt.f95

C:\source\test>qt
Time in seconds: 0.5781
QDP= 8
Sum= 7.5000000806273400D+15

C:\source\test>g95 -O1 -o qt qt.f95

C:\source\test>qt
Time in seconds: 0.2187
QDP= 8
Sum= 7.5000000750000000D+15

If you don't specify optimization, g95 loads and stores SUM each
cycle. If you optimize, it's kept in the 80-bit floating point stack,
so you get the advantage of the longer accumulator, as well as the
speedup. (Unless you're foolish enough to force sse2 evaluation).
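
The effect of a wider accumulator can also be requested explicitly instead of relying on the optimizer. This is a minimal sketch, assuming the compiler offers an 18-digit real kind (real*10 on x87 hardware); if it does not, the sign trick from earlier in the thread makes ep fall back to dp and both sums use the same kind. The program and variable names are illustrative.

```fortran
program accumulate_wide
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: ep_req = selected_real_kind(18)
  ! ep is ep_req when that kind exists (ep_req > 0), otherwise dp.
  integer, parameter :: ep = (1 + sign(1, ep_req))/2*ep_req + &
                             (1 - sign(1, ep_req))/2*dp
  real(dp) :: sum_dp
  real(ep) :: sum_ep
  integer :: i

  sum_dp = 0.0_dp
  sum_ep = 0.0_ep
  do i = 1, 100000000
    sum_dp = sum_dp + 1.5_dp * i   ! double precision accumulator
    sum_ep = sum_ep + 1.5_ep * i   ! wider accumulator, if available
  end do

  print *, 'kinds used (dp, ep):', dp, ep
  print *, 'double accumulator :', sum_dp
  print *, 'wider accumulator  :', sum_ep
end program accumulate_wide
```

Whether the double precision result differs from the wider one still depends on the compiler and options, since an optimizer may keep sum_dp in an 80-bit register as described above.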

> Are you perhaps assuming that just because double has about 15
> digits of precision, that you can count on roundoff always staying
> down in the bottom few bits? If so, I suggest reading up on the
> subject of numerical instability.
>
> Sounds to me like just a typical case of life in the real world of
> floating point arithmetic.
>




--
*********** To reply by e-mail, make w single in address **************

  #15
03-29-2006, 02:16 AM
Richard Maine
Re: status of quadruple precision arithmetic in g95 and gfortran?

Ian Gay <gay@sfuu.ca> wrote:

> nospam@see.signature (Richard E Maine) wrote in
> news:1hcwlu7.17lxrqj1vfm668N%nospam@see.signature:


> > Why should it produce a better result? If my quick check is
> > correct (and I might have missed because it isn't very far off
> > and my check was pretty hasty) this goes past the limits for which
> > IEEE double gives perfect results...


> Recall that x had the value 1.5.
> I was thinking that since 1.5 is exactly representable in binary...


Just because 1.5 is exactly representable in IEEE double does not mean
that all the numbers involved in the calculation are. Specifically,
these numbers get big enough that sum is not exactly representable. But
it sounds like you now see that.

> On
> further thought, I see that this is a (carefully constructed?)
> pathological example of errors from denormalization of the smaller
> argument to a floating add.


While I think you have the facts right, I'm not so sure about your
evaluation of it as being carefully constructed or pathological. I'd say
it was much more like typical of floating point roundoff issues.

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain