mpfs - an experiment in stochastic lazy floating-point arithmetic
What approaches are available when we need to do reliable floating-point
arithmetic? I'm not thinking here of problems like "compute 100 million
decimal digits of pi". What use are all those digits? And anyway, in such a
calculation it is impossible in principle to guarantee that all digits are
correct. Rather, I'm thinking of problems of the type "determine rigorously
whether x<y or not", where x and y are computed numbers. Such a
calculation can in principle be solved by a computation which terminates in
finite time, assuming x and y are not equal. Problems of this type occur in
computational number theory and computational geometry. Related problems
include computing the floor of a computed real, and the continued fraction of
an irrational (which needs the floor function).
Here is a typical heuristic, traditionally used in computer algebra systems, for getting results which are correct with (usually) high probability:
- Set a precision level for the whole computation. Record
the result. Repeat the whole computation at a higher precision. If the
result doesn't change, it's probably correct.
It's clear that the computational complexity will increase as we move from
asking for a result that is probably correct, to demanding one that is
guaranteed correct. The main question in this field is: can we write
a general-purpose software package which gives guaranteed results with an
acceptably short computation time? Here are some approaches:
- Error analysis: perform (by hand) a complete analysis of the propagation
of round-off error through the computation. Set the precision level of each
step so that the final error is acceptably small. This method was used by
some entries in the Many-digits friendly competition. But this is hardly
feasible for a general-purpose software package.
- Exact real arithmetic: there have been many attempts at this - see the
links at xrc.
Essentially, these methods use lazy evaluation where the whole directed
acyclic graph (DAG) representing the computation is stored, and intermediate
results are re-evaluated as needed. Each step is performed with a bounded
error, so the final error is bounded. The main problems in practice concern
the storage required to store the DAG, and the fact that the error bounds are
typically too pessimistic. In many cases, the minimum possible "graininess"
of 1 bit (the smallest possible increase in precision per step) is larger than
is actually needed to satisfy error bounds. In large calculations, this can
result in unacceptable inefficiencies. Also, transcendental functions are
difficult to make efficient in these methods.
Implementations of this approach include iRRAM and RealLib.
Experience shows that which approach performs best is highly
problem-dependent.
the idea
The present method is an experimental stochastic variation of exact
real arithmetic. The novelty is a way to avoid the 1-bit graininess mentioned
above. It is assumed that arbitrary precision floating-point arithmetic with
correct rounding is available (such as provided by mpfr). The ingredients are:
- The complete DAG is stored, as in my xrc.
- The intermediate value at each node is a floating-point
approximation x and absolute error bound e. The interval [x-e,x+e] always strictly bounds the correct result. This interval will be refined automatically as necessary.
- When evaluation is initiated, the increase in precision as we move away
from the output node is probabilistic, controlled by a parameter alpha. The
optimum value (i.e. that minimizing computation time) for alpha will depend on
the problem, but the final results are independent of alpha. In essence, if
alpha<1, then 1 bit is added with probability alpha. If
alpha>1, then floor(alpha) bits are added. This allows tuning - for
example, for the logistic map problem, we know that alpha=1 should be optimal,
because the Lyapunov exponent is such that one bit is lost on average per iteration.
- The results of evaluation-initiating functions like mpfs_cmp,
mpfs_floor are guaranteed correct (subject to there being no bugs!).
- mpfs_out_str tries to print the requested number of digits correctly, but
does not guarantee correctness.
The stochastic component is of course a heuristic. Why might this be
reasonable? Why not compute error bounds exactly, as iRRAM does? One
answer is already given above - error bounds are always pessimistic. Another
answer is that computing error bounds itself takes time. One might compare
the way packet protocols are implemented - at the lowest level, packets are
just transmitted, regardless of whether they might suffer collisions or
otherwise get lost. So at this level the system is stochastic, and
transmission is not reliable. It is made reliable by being wrapped in
another layer (e.g. TCP), which detects when packets are lost and resends them
if necessary. My mpfs software has a similar two-level design - the lower
level is not reliable, but the upper layer (the only layer with which the user
interacts) is reliable. (More accurately, the lower layer actually is
reliable, but not necessarily precise enough. The upper layer ensures
sufficient precision.)
download
mpfs-0.9.tgz
installation
- Make sure you have
GMP and
mpfr
installed.
- Edit the top few lines of Makefile and tests/Makefile.
- make
- make check
- make install (as root)
available functions
The naming convention follows mpfr, except that there is no output ("rop")
argument.
char* mpfs_get_version(void);
void mpfs_set_alpha(double);
double mpfs_get_alpha(void);
mpfs_t mpfs_const_pi(void);
mpfs_t mpfs_set_d(double);
mpfs_t mpfs_abs(mpfs_t);
mpfs_t mpfs_neg(mpfs_t);
mpfs_t mpfs_sqrt(mpfs_t);
mpfs_t mpfs_exp(mpfs_t);
mpfs_t mpfs_expm1(mpfs_t);
mpfs_t mpfs_log(mpfs_t);
mpfs_t mpfs_log1p(mpfs_t);
mpfs_t mpfs_sin(mpfs_t);
mpfs_t mpfs_cos(mpfs_t);
mpfs_t mpfs_tan(mpfs_t);
mpfs_t mpfs_atan(mpfs_t);
mpfs_t mpfs_add(mpfs_t, mpfs_t);
mpfs_t mpfs_sub(mpfs_t, mpfs_t);
mpfs_t mpfs_mul(mpfs_t, mpfs_t);
mpfs_t mpfs_div(mpfs_t, mpfs_t);
mpfs_t mpfs_si_add(long, mpfs_t);
mpfs_t mpfs_si_sub(long, mpfs_t);
mpfs_t mpfs_si_mul(long, mpfs_t);
mpfs_t mpfs_si_div(long, mpfs_t);
mpfs_t mpfs_add_si(mpfs_t, long);
mpfs_t mpfs_sub_si(mpfs_t, long);
mpfs_t mpfs_mul_si(mpfs_t, long);
mpfs_t mpfs_div_si(mpfs_t, long);
mpfs_t mpfs_mul_ui(mpfs_t, unsigned long);
mpfs_t mpfs_div_ui(mpfs_t, unsigned long);
mpfs_t mpfs_d_add(double, mpfs_t);
mpfs_t mpfs_d_sub(double, mpfs_t);
mpfs_t mpfs_d_mul(double, mpfs_t);
mpfs_t mpfs_d_div(double, mpfs_t);
mpfs_t mpfs_add_d(mpfs_t, double);
mpfs_t mpfs_sub_d(mpfs_t, double);
mpfs_t mpfs_mul_d(mpfs_t, double);
mpfs_t mpfs_div_d(mpfs_t, double);
int mpfs_cmp_d(mpfs_t x, double d);
int mpfs_cmp(mpfs_t x, mpfs_t y);
mpfs_t mpfs_floor(mpfs_t);
size_t mpfs_out_str(FILE *stream, int base, size_t digits, mpfs_t x);
At the moment there is no way to free mpfs_t objects. This will be added
in a later version, using reference counting.
example program
Coding is similar to xrc, but differs from mpfr in that variables do not
have to be initialized, and functions return a pointer to their output rather
than returning the result in an argument. This example is the classic problem:
determine whether exp(pi*sqrt(163)) is integral or not (it is not). Here is how the code
might look with different approaches.
- double: (result inconclusive)
#include <stdio.h>
#include <math.h>
int main() {
double x=exp(M_PI*sqrt(163));
printf("x-floor(x)=%g\n",x-floor(x));
return 0;
}
- mpfr: (result suggestive, but not rigorously conclusive)
#include <stdio.h>
#include <mpfr.h>
int main() {
mpfr_t x,pi,f;
mpfr_set_default_prec(500);
mpfr_init_set_ui(x,163,MPFR_RNDN);
mpfr_init(pi);
mpfr_init(f);
mpfr_const_pi(pi,MPFR_RNDN);
mpfr_sqrt(x,x,MPFR_RNDN);
mpfr_mul(x,pi,x,MPFR_RNDN);
mpfr_exp(x,x,MPFR_RNDN);
mpfr_floor(f,x);
mpfr_sub(x,x,f,MPFR_RNDN);
printf("x-floor(x)>0?: %d\n",mpfr_cmp_ui(x,0));
return 0;
}
- xrc: (result conclusive)
#include <stdio.h>
#include <xrc.h>
int main() {
xr_t x,z;
x=xr_exp(xr_mul(xr_pi(),xr_sqrt(xr_init(163,1))));
z=xr_near_int(x);
printf("cmp(x,nearint(x))=%d\n",xr_cmp(x,z));
return 0;
}
- mpfs: (result conclusive)
#include <stdio.h>
#include <mpfs.h>
int main() {
mpfs_t x;
x=mpfs_exp(mpfs_mul(mpfs_const_pi(),mpfs_sqrt(mpfs_set_d(163))));
printf("cmp(floor(x),x)=%d\n",mpfs_cmp(mpfs_floor(x),x));
return 0;
}
sample timings on an AMD64 2.2GHz
This software is not designed to win competitions for calculating many digits,
but on the other hand it should not be too bad at this either.
These problems are from the Many-digits friendly competition. The timings are for mpfs-0.9.
| problem | mpfs-0.9 command | time | mpfr team command | time |
| sin(tan(cos(1))), 10^6 digits | tests/C01 6 | 6s | ./C01 1000000 | 6s |
| logistic map, 10^4 iterations, 10^4 digits | tests/C13 4 | 3s | ./C13 10000 | 3s |
This page was last modified 2024-01-21 10:57.