### pipemath-1.2

#### introduction

This package is intended for processing of very large data sets
via shell pipelines. The programs do not store the data.
They are responses to the challenge: can one perform some of
the standard computations of statistical data analysis
(autocorrelation of a scalar time-series, covariance matrix
of a set of vectors, and least-squares polynomials) if one
receives the data points one at a time, and must process them
and throw them away before receiving the next data point?
Of course, all this must be done while preserving numerical
stability. The three C programs I provide seem to achieve these
aims for the three specific problems mentioned.
The ideas could be relevant more generally to stream computing
and distributed data analysis; see e.g.

Version 1.2 is 64-bit clean. A new feature is that the covariance
program takes no arguments.
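
As an illustration of the one-pass idea described in this introduction, here is a minimal sketch of a running mean and variance that sees each data point once and then discards it. It uses the textbook Welford update, which is one well-known way to stay numerically stable; it is not taken from the package sources, and the names `running_stats`, `rs_init`, `rs_push` and `rs_variance` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Running statistics for a scalar stream, using O(1) memory.
   Hypothetical names; a sketch, not the package's own code. */
typedef struct {
    size_t n;     /* number of points seen so far */
    double mean;  /* running mean */
    double m2;    /* running sum of squared deviations from the mean */
} running_stats;

void rs_init(running_stats *s)
{
    s->n = 0;
    s->mean = 0.0;
    s->m2 = 0.0;
}

/* Absorb one data point; the caller may discard x afterwards. */
void rs_push(running_stats *s, double x)
{
    double delta = x - s->mean;      /* deviation from the old mean */
    s->n += 1;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean);  /* old deviation times new deviation */
}

/* Unbiased sample variance; needs at least two points. */
double rs_variance(const running_stats *s)
{
    assert(s->n >= 2);
    return s->m2 / (double)(s->n - 1);
}
```

A filter built on this would call `rs_push` once per value read from standard input and print the result at end of file. The update avoids forming the difference of two large sums, which is where the naive sum-of-squares formula loses precision.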

#### quick start

`tar zvxf pipemath-1.2.tgz; cd pipemath-1.2; make`

#### programs

Lines in the data file starting with # are ignored.

- autocorrelation:
Computes the autocorrelation function of a scalar time series.
Usage: `cat datafile | autocorrelation [maxlag=20 [stride=1 [dt=1]]]`

- covariance:
Computes the covariance matrix of a set of n-vectors.
Usage: `cat datafile | covariance`
or: `covariance < datafile`
Each line of datafile has an n-vector. The value of n is determined
by the number of items on the first line. All subsequent lines must have
the same number of items.

- lsqpoly:
Fits a least-squares polynomial.
Usage: `cat datafile | lsqpoly [degree=1]`.
Each line of datafile has an x,y pair and an optional weight.
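
To make the covariance case above concrete, the following sketch shows one way to update a covariance matrix as n-vectors arrive one at a time, keeping only the running mean and an n-by-n accumulator. It is a straightforward generalization of the Welford update, offered as an assumption about how such a program could work rather than a description of the `covariance` source. All names (`cov_acc`, `cov_init`, `cov_push`, `cov_get`, `cov_free`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* One-pass covariance accumulator for n-vectors (hypothetical sketch). */
typedef struct {
    size_t n;       /* vector dimension */
    size_t count;   /* number of vectors seen so far */
    double *mean;   /* running mean, length n */
    double *delta;  /* scratch space, length n */
    double *c;      /* co-moment matrix, n*n, row-major */
} cov_acc;

void cov_free(cov_acc *a)
{
    free(a->mean);
    free(a->delta);
    free(a->c);
    a->mean = a->delta = a->c = NULL;
}

/* Returns 0 on success, -1 if memory allocation fails. */
int cov_init(cov_acc *a, size_t n)
{
    a->n = n;
    a->count = 0;
    a->mean = calloc(n, sizeof *a->mean);
    a->delta = calloc(n, sizeof *a->delta);
    a->c = calloc(n * n, sizeof *a->c);
    if (!a->mean || !a->delta || !a->c) {
        cov_free(a);
        return -1;
    }
    return 0;
}

/* Absorb one n-vector; the caller may discard x afterwards. */
void cov_push(cov_acc *a, const double *x)
{
    size_t i, j;
    a->count += 1;
    for (i = 0; i < a->n; i++) {
        a->delta[i] = x[i] - a->mean[i];  /* deviation from the old mean */
        a->mean[i] += a->delta[i] / (double)a->count;
    }
    /* old deviation in row i times new deviation in column j */
    for (i = 0; i < a->n; i++)
        for (j = 0; j < a->n; j++)
            a->c[i * a->n + j] += a->delta[i] * (x[j] - a->mean[j]);
}

/* Unbiased sample covariance of components i and j; needs count >= 2. */
double cov_get(const cov_acc *a, size_t i, size_t j)
{
    assert(a->count >= 2 && i < a->n && j < a->n);
    return a->c[i * a->n + j] / (double)(a->count - 1);
}
```

In a pipeline filter, `cov_init` would be called once the first line has fixed n, and `cov_push` once per subsequent line. Memory use is proportional to n squared and independent of the number of data points.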

#### download

pipemath-1.2.tgz

#### installation

```
make
make test
sudo make install
```