Euler Math Toolbox - Reference

# Statistics with Euler Math Toolbox

## Content

EMT contains many statistical distributions, tests, plots, and functions for reading and writing data. For examples and more functions, read the following introduction notebook.

Statistical functions.

## Random Variables

Euler has a reliable random number generator, which can be used to create random variables for many distributions. If you need a reproducible sequence, set a seed value with seed(x). Otherwise, the time value (in seconds) at the start of the current Euler session is used as the seed.

The newer functions for creating random variables start with rand...(), replacing older functions with less uniform naming.

```function comment seed (x)
```
```  Set the seed for the random numbers

After setting a seed, all random numbers will be determined from
the seed.
```
```function comment random (n,m)
```
```  Uniformly distributed random variables in [0,1]

random() : One random variable
random(n,m) : Matrix of random variables
random(n) : Row vector of random variables
random([n,m]) : Matrix of random variables

The function fastrandom is a quicker, but less reliable,
alternative.

See:   intrandom (Statistics with Euler Math Toolbox),   normal (Statistics with Euler Math Toolbox)
```
```function randuniform (n : index, m : index, a : number, b : number)
```
```  Uniformly distributed random samples from the interval (a,b)

See:   random (Statistics with Euler Math Toolbox),   random (Maxima Documentation)
```
```function comment intrandom (n,m,k)
```
```  Integer random variables in {1,...,k}

intrandom(k) : One random variable
intrandom(n,m,k) : Matrix of random variables
intrandom(n,k) : Row vector of random variables
intrandom([n,m],k) : Matrix of random variables

See:   random (Maxima Documentation)
```
```function randint (n,m,k=none)
```
```  Integer random variables in {1,...,k}

randint(n,m,k) : Matrix of random variables
randint(n,k) : Vector of random variables
```
```function comment normal (n,m)
```
```  0-1-Gaussian distributed random variables

normal() : One random variable
normal(n,m) : Matrix of random variables
normal(n) : Row vector of random variables
normal([n,m]) : Matrix of random variables

The function fastnormal() is a quicker, but less reliable,
alternative.
```
```function randnormal (n : index, m : index,
mean : number = 0, stdev : nonnegative number = 1)
```
```  Random samples from a normal (Gaussian) distribution

```

The following distributions are based on Julia code by John D. Cook.

```function randmatrix (n:index, m:index=none, f$:string)
```
```  Apply the random generator f$ to generate a matrix.
```
```function randexponential (n : index, m : index=none,
mean : positive number = 1)
```
```  Random matrix from an exponential distribution

randexponential(n,m) : mean=1
randexponential(n,m,mean) : nxm matrix
randexponential(n,mean=v) : vector with mean=v

See:   randnormal (Statistics with Euler Math Toolbox),   randuniform (Statistics with Euler Math Toolbox)
```
```function randgamma (n : index, m : index = none,
shape : nonnegative number=1, scale : nonnegative number=1)
```
```  Random samples from a gamma distribution

Implementation based on "A Simple Method for Generating Gamma Variables"
by George Marsaglia and Wai Wan Tsang.
ACM Transactions on Mathematical Software
Vol 26, No 3, September 2000, pages 363-372.

Example:

>k=10; theta=2;
>x=randgamma(10000,shape=k,scale=theta);
>plot2d(x,>distribution); ...
>plot2d("x^(k-1)*exp(-x/theta)/theta^k/gamma(k)", ...
```
```function randchi (n : index, m : index, dof : index)
```
```  Random samples from a chi square distribution
```
```function randinversegamma (n : index, m : index,
shape : positive number, scale : positive number)
```
```  Random samples from an inverse gamma distribution
```
```function randweibull (n : index, m : index,
shape : positive number, scale : positive number)
```
```  Random samples from a Weibull distribution
```
```function randcauchy (n : index, m : index,
mean : number=0, scale : positive number=1)
```
```  Random samples from a Cauchy distribution
```
```function randt (n : index, m : index,
dof : positive integer)
```
```  Random samples from a Student-t distribution

See Seminumerical Algorithms by Knuth
```
```function randlaplace (n : index, m : index,
mean : number, scale : positive number)
```
```  Random samples from a Laplace distribution
The Laplace distribution is also known as the double exponential distribution.
```
```function randlognormal (n : index, m : index,
mu : number, sigma : positive number)
```
```  Random samples from a log-normal distribution
```
```function randbeta (n : index, m : index,
a: positive number, b : positive number)
```
```  Random samples from a Beta distribution

There are more efficient methods for generating beta samples.
However, such methods are only a little more efficient and much more complicated.
```

## Statistical Distributions

Euler has a lot of routines to generate random numbers (named "rand..."), like the built-in functions random and normal. Moreover, Euler has functions for distributions ("...dis") and their densities ("q..."). For examples, see the following introduction notebook.

This file provides more distributions, random numbers, and tests.

```function comment bindis (k:natural, n:natural, p:number)
```
```  Cumulative binomial distribution

Binomial distribution for i<=k out of n with probability p.

From AlgLib.
```
```function map binsum (k:natural, n:natural, p:number)
```
```  Binomial sum for getting i<=k out of n runs with probability p.

Uses an actual summation to compute the binomial sum. bindis() is
faster.

See:   bindis (Statistics with Euler Math Toolbox),   normalsum (Statistics with Euler Math Toolbox)
```
```function map invbindis (px:number, n:natural, p:number)
```
```  Inverse cumulative binomial distribution

Finds k such that the probability of i<=k out of n is just more
than px. The result may not be an integer; in that case, use
k=floor(result). A bisection method is used.

>bindis(4,10,0.6), invbindis(%,10,0.6)
0.1662386176
4
```
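As a cross-check outside Euler, the cumulative binomial probability can be reproduced by direct summation. The following is a minimal Python sketch, not the AlgLib routine behind bindis. The names `binom_cdf` and `inv_binom_cdf` are hypothetical, and the inverse is a plain linear search for the smallest integer k whose cumulative probability reaches px, which is only an assumption about the intended rounding.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def inv_binom_cdf(px, n, p):
    """Smallest integer k with P(X <= k) >= px (assumed convention)."""
    for k in range(n + 1):
        if binom_cdf(k, n, p) >= px:
            return k
    return n  # guard against rounding when px is very close to 1


# Compare with the value of bindis(4,10,0.6) shown in the example above.
print(round(binom_cdf(4, 10, 0.6), 10))
print(inv_binom_cdf(0.16, 10, 0.6))
```

Direct summation is fine for small n; for large n a library routine based on the incomplete beta function is the usual choice.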
```function comment bincdis (k,n,p)
```
```  Complementary cumulative binomial distribution

The complement of the cumulative binomial distribution for i<=k
out of n with probability p, i.e., 1-bindis(k,n,p).

From AlgLib.
```
```function comment invpbindis (k,n,px)
```
```  Inverse (for p) cumulative binomial distribution

Solves px=bindis(k,n,p) for p. Assumes integer k and n.

From AlgLib.

```
```function overwrite normaldis (x : real, mean : real = 0, dev : real = 1)
```
```  Cumulative normal distribution

This function calls the built-in _normaldis(x) with adjusted mean
and standard deviation.
```
```function overwrite invnormaldis (p : real, mean : real = 0, dev : real = 1)
```
```  Inverse of cumulative normal distribution

This function calls the built-in _invnormaldis(x) and adjusts the
mean and the standard deviation.
```
```function comment erf (x)
```
```  Gauss error function

This is the integral of exp(-t^2)/sqrt(pi) from -x to x (from
AlgLib). It is connected to normaldis() via
2*normaldis(sqrt(2)*x)-1=erf(x).
```
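The relation between erf and the cumulative normal distribution stated above can be checked numerically. This is a small Python sketch using only the standard library, where `statistics.NormalDist().cdf` plays the role of normaldis; the sample points are arbitrary.

```python
from math import erf, sqrt
from statistics import NormalDist

# Standard normal CDF (mean 0, standard deviation 1).
phi = NormalDist(0.0, 1.0).cdf

for x in (-1.5, 0.0, 0.3, 2.0):
    lhs = 2 * phi(sqrt(2) * x) - 1   # 2*normaldis(sqrt(2)*x)-1
    rhs = erf(x)
    print(x, lhs, rhs)
```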
```function comment erfc (x)
```
```  Complementary Gauss error function

1-erf(x)
```
```function normalsum (i:natural, n:natural, p:number)
```
```  Probability of getting i or fewer hits in n runs.

Works like binsum, but is much faster for large n and medium p.

See:   binsum (Statistics with Euler Math Toolbox)
```
```function map hypergeomsum (i:natural, n:natural, itot:natural, ntot:natural)
```
```  Hypergeometric sum.

This is the probability of getting i or fewer hits if n objects are
picked randomly from an urn containing ntot objects, itot of which are good.

i : we want i or less hits in n picked objects
n : number of randomly picked objects
itot : total number of positive objects
ntot : total number of objects

Examples:
>1-hypergeomsum(7,13,13,52) // 8 or more spades in Bridge
0.00126372228099
>columnsplot(hypergeomsum(0:20,20,20,40),lab=0:20):
>hypergeomsum(4,20,20,40), 1-hypergeomsum(15,20,20,40)
0.000179983683393
0.000179983683393
```
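The hypergeometric sum can be written out with binomial coefficients. The Python sketch below is an independent illustration, not Euler's implementation; the function name `hypergeom_sum` is hypothetical, and the arguments follow the order of hypergeomsum.

```python
from math import comb


def hypergeom_sum(i, n, itot, ntot):
    """Probability of at most i good objects among n drawn without replacement."""
    total = comb(ntot, n)
    hits = sum(comb(itot, j) * comb(ntot - itot, n - j) for j in range(i + 1))
    return hits / total


# The two examples from the help text above.
print(1 - hypergeom_sum(7, 13, 13, 52))   # 8 or more spades in a bridge hand
print(hypergeom_sum(4, 20, 20, 40))
```

The counts are exact integers; only the final division is done in floating point.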
```function qnormal (x, m=0, d=1)
```
```  Density (PDF) of the m-d-normal distribution

This is the density function of the Gauss normal distribution with
mean m and standard deviation d.

See:   normaldis (Statistics with Euler Math Toolbox),   erf (Statistics with Euler Math Toolbox),   erf (Maxima Documentation)
```
```function map gammarestr (x)
```
```  Special Gamma function, works only if 2x is a natural number

See:   gamma (Mathematical Functions),   gamma (Maxima Documentation)
```
```function qchidis (x, n)
```
```  Density (PDF) of the chi-squared distribution
```
```function comment chidis (x,n)
```
```  Chi-squared distribution with n degrees of freedom

Algorithm from AlgLib.
```
```function comment chicdis (x,n)
```
```  Complementary chi-squared distribution with n degrees of freedom

Algorithm from AlgLib.

See:   chidis (Statistics with Euler Math Toolbox),   invchidis (Statistics with Euler Math Toolbox),   invchicdis (Statistics with Euler Math Toolbox)
```
```function invchidis (x, n)
```
```  Inverse of the chi-squared distribution

See:   invchicdis (Statistics with Euler Math Toolbox)

```
```function comment invchicdis (x, n)
```
```  Inverse of the complementary chi-squared distribution

Algorithm from AlgLib.

```
```function qtdis (t:real, n:nonnegative integer)
```
```  Density (PDF) of the Student t distribution

```
```function comment tdis (x:real, n:natural)
```
```  Student T distribution with n degrees of freedom

Algorithm from AlgLib.

See:   invtdis (Statistics with Euler Math Toolbox)
```
```function comment invtdis (x:nonnegative, n:natural)
```
```  Inverse Student T distribution with n degrees of freedom

Algorithm from AlgLib.

```
```function qfdis (x, n, m)
```
```  Density (PDF) of the F-distribution
```
```function overwrite map fdis (x, a, b)
```
```  F distribution

Vectorizes the built-in function _fdis(x,a,b).
```
```function overwrite map fcdis (x, a, b)
```
```  Complementary F distribution

Vectorizes the built-in function _fcdis(x,a,b).
```
```function overwrite map invfcdis (x, a, b)
```
```  Inverse of the complementary F distribution

Vectorizes the built-in function _invfcdis(x,a,b)
```
```function map invfdis (x, a, b)
```
```  Inverse of the F distribution

```

## Descriptive Statistical Functions

```function meandev (x:numerical, v=none)
```
```  Mean value and statistical standard deviation of [x1,...,xn]

An optional additional parameter v contains the multiplicities of
x. m=mean(x) will assign the mean value only! If x is a matrix the
function works on each row.

x : data (1xm or nxm)
v : multiplicities (1xm or nxm)

See:   mean (Maxima Documentation)
```
```function mean (x:numerical, v:real vector=none)
```
```  Mean value of x.

An optional additional parameter contains multiplicities.

See:   meandev (Statistics with Euler Math Toolbox),   median (Statistics with Euler Math Toolbox),   median (Maxima Documentation)
```
```function dev (x:numerical, v:real vector=none)
```
```  Experimental standard deviation of x

An additional parameter may contain multiplicities.

See:   meandev (Statistics with Euler Math Toolbox)
```
```function median (x, v=none, p:real vector=0.5)
```
```  Quantile such that a fraction p of the x[i] are less than or equal to it.

v are optional multiplicities for the values. If x is a matrix, the
function works on all rows of x.

x : data (1xm or nxm)
v : multiplicities (1xm or nxm)
p : desired percentage (real or row vector)

See:   mean (Statistics with Euler Math Toolbox),   mean (Maxima Documentation),   quartiles (Statistics with Euler Math Toolbox),   quantile (Statistics with Euler Math Toolbox),   quantile (Maxima Documentation)
```
```function pfold (v: real vector, w: real vector)
```
```  Distribution of the sum of two distributions

v[i], w[i] contain the probabilities that each random variable is
equal to i-1. result[i] contains the probability that the sum of
the two random variables is i-1.

See:   fold (Numerical Algorithms),   fftfold (Numerical Algorithms)
```
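The distribution of a sum is a discrete convolution of the two probability vectors. The Python sketch below illustrates this under the assumption that the two random variables are independent; it uses 0-based indexing, so `v[i]` is the probability of the value i, and it is not Euler's pfold code.

```python
def pfold(v, w):
    """Distribution of X+Y for independent X, Y with P(X=i)=v[i], P(Y=j)=w[j]."""
    result = [0.0] * (len(v) + len(w) - 1)
    for i, pv in enumerate(v):
        for j, pw in enumerate(w):
            result[i + j] += pv * pw   # the sum i+j collects the product of probabilities
    return result


coin = [0.5, 0.5]            # values 0 and 1 with equal probability
print(pfold(coin, coin))     # [0.25, 0.5, 0.25]
```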
```function comment quantile (v:vector,p:real)
```
```  Compute the p-quantile of the elements in v

Function from AlgLib. This function takes care of multiplicities
of the two values closest to the quantile. For the lower, upper or
middle quantile, use the median function.

>quantile([1,2],20%)
1.2
>quantile([1,2,2],20%)
1.4
```
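The two example values above are consistent with linear interpolation between neighbouring order statistics. The Python sketch below uses that rule; it is an assumption that reproduces the documented results, not a statement about the AlgLib algorithm, and the function name is hypothetical.

```python
from math import floor


def quantile(values, p):
    """p-quantile (0 <= p <= 1) by linear interpolation between order statistics."""
    s = sorted(values)
    pos = p * (len(s) - 1)            # fractional 0-based position in the sorted data
    lo = floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


print(quantile([1, 2], 0.2))
print(quantile([1, 2, 2], 0.2))
```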
```function covar (x:real vector, y:real vector)
```
```  Empirical covariance of x and y

The covariance is the scalar product of x and y after
centralization (x-mean(x),y-mean(y)) divided by n-1, where n is
the length of x and y.

See:   covarmatrix (Statistics with Euler Math Toolbox)
```
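The definition above translates directly into code. This is a minimal Python sketch of the empirical covariance with the n-1 divisor, together with the correlation obtained by normalizing with the two standard deviations; it illustrates the formulas and is not Euler's source.

```python
from math import sqrt


def covar(x, y):
    """Empirical covariance: centered scalar product divided by n-1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def correl(x, y):
    """Correlation: covariance normalized by both standard deviations."""
    return covar(x, y) / sqrt(covar(x, x) * covar(y, y))


print(covar([1, 2, 3], [2, 4, 6]))    # 2.0
print(correl([1, 2, 3], [6, 4, 2]))   # perfectly anti-correlated data
```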
```function covarmatrix (x:real)
```
```  Empirical covariance matrix of the rows of x

The covariance matrix contains the empirical covariances of the rows
of x, i.e., the scalar products of the centralized rows divided by
the number of columns of x minus 1.

```
```function sphering (X)
```
```  Sphering of the matrix X.

The matrix X contains samples of random variables in its rows. The
sphering of X is a linear transformation Y=T.(X-m), such that the
rows of Y have mean 0 and an identity correlation matrix.

Returns {Y,T,m}

See:   covarmatrix (Statistics with Euler Math Toolbox)
```
```function correl (x:real vector, y:real vector)
```
```  Correlation of x and y

The correlation is the scalar product of the centralized and
normalized vectors x and y.
```
```function correlmatrix (x:real)
```
```  Correlation matrix of the rows of x

See:   covar (Statistics with Euler Math Toolbox)
```
```function ranks (x)
```
```  Ranks of the elements of x in x.

This is the number i of the item x[i] in the vector x. With
multiplicities, the rank is the mean rank of the equal elements.

Works for reals, real vectors, or string vectors x.

See:   rankcorrel (Statistics with Euler Math Toolbox)
```
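The mean-rank rule for equal elements can be sketched as follows. This Python version is an illustration of the rule described above with a hypothetical function name; ranks are 1-based, as in Euler.

```python
def ranks(x):
    """1-based ranks of the elements of x; ties receive the mean of their ranks."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    result = [0.0] * len(x)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted element is equal.
        while end + 1 < len(order) and x[order[end + 1]] == x[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1   # mean of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


print(ranks([10, 20, 20, 30]))   # [1.0, 2.5, 2.5, 4.0]
```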
```function rankcorrel (x:real vector, y:real vector)
```
```  Rank correlation of x and y

See:   ranks (Statistics with Euler Math Toolbox)
```
```function empdist (x:real vector, vsorted:real vector)
```
```  Empirical distribution

The vector vsorted contains empirical data. Then we compute the
empirical cumulative distribution (CDF) of the data at the points
x[i].

x : vector of values, usually sorted
vsorted : sorted(!) vector of empirical values.

>short empdist(1:6,sort(intrandom(1,6000,6)))
[ 0.16283  0.33083  0.49317  0.662  0.832  1 ]
```
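An empirical cumulative distribution can be evaluated with a binary search on the sorted data. The Python sketch below assumes the "less than or equal" convention, which matches the example above where the last value is 1; it is not Euler's implementation.

```python
from bisect import bisect_right


def empdist(x, vsorted):
    """Fraction of the sorted data vsorted that is <= each point of x."""
    n = len(vsorted)
    return [bisect_right(vsorted, xi) / n for xi in x]


data = sorted([3, 1, 2, 1])
print(empdist([1, 2, 3], data))   # [0.5, 0.75, 1.0]
```

The data must be sorted, just as for the vsorted parameter of empdist.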
```function randpint (n:index, m:index, p:vector)
```
```  nxm random numbers with probabilities in p

Generates nxm random numbers from 1 to k based on the vector of
probabilities p[1],...,p[k].

```
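One common way to draw integers with given probabilities is to search a uniform random number in the cumulative sums of p. The Python sketch below uses that technique; it is an assumption for illustration and may differ from how randpint works internally. The `rng` parameter is a hypothetical addition that allows a seeded generator.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def randpint(n, m, p, rng=random):
    """n x m matrix (list of rows) of integers 1..k drawn with probabilities p."""
    cum = list(accumulate(p))          # cumulative sums of the probabilities
    k = len(p)

    def draw():
        u = rng.random() * cum[-1]     # scaling tolerates p not summing exactly to 1
        return min(bisect_right(cum, u), k - 1) + 1

    return [[draw() for _ in range(m)] for _ in range(n)]


rng = random.Random(42)                # fixed seed for a reproducible sketch
print(randpint(2, 5, [0.2, 0.3, 0.5], rng))
```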
```function randmultinomial (n:index, m:index, p: vector)
```
```  n multinomial random numbers based on a density

This generates n outcomes of m throws with probabilities
p[1],...,p[k]. The result is an nxk matrix.

See:   randpint (Statistics with Euler Math Toolbox),   chitest (Statistics with Euler Math Toolbox)
```

## Statistical Tests

```function chitest (x:real vector, y:positive vector,
montecarlo=false, nmontecarlo=1000, p=false)
```
```  Perform a chi^2 test, if x has the expected frequency y

This function tests an observed frequency x against an expected
frequency y. E.g., if 40 men are found when sampling 100 persons, then
[40,60] has to be tested against [50,50]. The result of the test is
less than 5%, which means that the hypothesis that the sample obeys
the expected frequency can be rejected with an error of less than 5%.

For a meaningful test, sum(x) should be equal to sum(y), unless
p=true. In this case, y is interpreted as a vector of probabilities,
not a vector of events.

To get frequencies of data from the data, use "getfrequencies",
"count", or "histo".

montecarlo : If montecarlo is not zero, the method uses a
Monte Carlo simulation. It generates nmontecarlo random events of
sum(x) data with the distribution in y, and counts how often the
statistic sum((x-y)^2/y) is larger than the observed statistic.

x,y : two real row vectors (1xn)

Returns the error level for rejecting the hypothesis that the
observed frequency x has the expected frequency y.

>x=[100,90]; y=[0.5,0.5]*sum(x); chitest(x,y)
0.468159909854
>chitest(x,y,>montecarlo)
0.43
>chitest(x,[0.5,0.5],>p)
0.468159909854

See:   getfrequencies (Statistics with Euler Math Toolbox),   count (Statistics with Euler Math Toolbox),   histo (Statistics with Euler Math Toolbox)
```
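For two classes the chi^2 test has one degree of freedom, and the error level can be written with the complementary error function. The Python sketch below covers only this special case, because the standard library has no general chi-squared distribution; the function names are hypothetical, and it is not the code behind chitest.

```python
from math import erfc, sqrt


def chi2_statistic(observed, expected):
    """The statistic sum((x-y)^2/y) for observed x and expected y."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chitest_two_classes(observed, expected):
    """Error level for exactly two classes (one degree of freedom)."""
    if len(observed) != 2 or len(expected) != 2:
        raise ValueError("this sketch handles two classes only")
    # For one degree of freedom, P(chi^2 > s) = erfc(sqrt(s/2)).
    return erfc(sqrt(chi2_statistic(observed, expected) / 2))


print(chitest_two_classes([100, 90], [95, 95]))   # compare with the chitest example above
print(chitest_two_classes([40, 60], [50, 50]))    # the 40-of-100 example, below 5%
```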
```function testnormal (r:real vector, n:integer, v:real vector, ..
m:number, d:number)
```
```  Test an observed frequency for normal distribution.

Test the number of data v[i] in the ranges r[i],r[i+1] against the
normal distribution with mean m and deviation d, using the chi^2
method.

r : ranges (sorted 1xm vector)
n : total number of data
v : number of data in the ranges (1x(m-1) vector)
m : expected mean value
d : expected deviation

Return the error we get, if we reject the normal distribution.
```
```function ttest (m:number, d:real scalar, n:natural, mu:number)
```
```  Student's t-test

Tests whether the measured mean m with measured deviation d of n data
comes from a distribution with mean value mu.

m : mean value of data
d : standard deviation of data
n : number of data
mu : mean value to test for

Returns the error alpha, if we reject that the data come from a
distribution with mean mu.
```
```function tcompare (m1:number, d1:number, n1:natural, ..
m2:number, d2:number, n2:natural)
```
```  Test, if two measured data agree in mean.

The data must be normally distributed. Returns the error you make,
if you reject that both data are from the same normal distribution.

m1,m2 : means of the data
d1,d2 : standard deviation of the data
n1,n2 : number of data

Returns the error alpha, if we reject that the data come from a
distribution with the same expected mean.
```
```function tcomparedata (x:real vector, y:real vector)
```
```  Compare x and y for same mean

Calls "tcompare" to compare the two observations for the same mean.

Returns the error we make, if we reject that both data come from a
distribution with the same expected mean.

```
```function tabletest (A:real)
```
```  Chi^2-Test the results a[i,j] for independence of the rows from the columns.

The table test tests for independence of the rows of the table
from the columns. E.g., if some items are observed [40,50] times
for men, and [50,30] times for women, we can ask if the
observations depend on the gender. In this case we can reject
independence with 1.8% error level.

This test should only be used for large table entries.

Return the error you make, if you reject independence.
```
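The usual chi^2 independence test compares the table with the table expected under independence, where each entry is the row sum times the column sum divided by the total. The Python sketch below does this for a 2x2 table only (one degree of freedom); the function names are hypothetical, and it is an assumption that tabletest follows the same scheme.

```python
from math import erfc, sqrt


def expected_table(a):
    """Expected counts under independence: row sum * column sum / total."""
    rows = [sum(r) for r in a]
    cols = [sum(c) for c in zip(*a)]
    total = sum(rows)
    return [[r * c / total for c in cols] for r in rows]


def tabletest_2x2(a):
    """Error level for rejecting independence in a 2x2 table."""
    e = expected_table(a)
    stat = sum((a[i][j] - e[i][j]) ** 2 / e[i][j]
               for i in range(2) for j in range(2))
    return erfc(sqrt(stat / 2))   # chi^2 tail probability, one degree of freedom


print(tabletest_2x2([[40, 50], [50, 30]]))   # the men/women example above
```

For this table the sketch gives an error level a little below 2%, in line with the value quoted in the help text.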
```function expectedtable (A:real)
```

```function contingency (A:real, correct=1)
```
```  Contingency coefficient of a matrix A.

If the coefficient is close to 0, we tend to say that the rows and
the columns are independent.

correct : Correct the coefficient, so that it is between 0 and 1
```
```function varanalysis
```
```  varanalysis(x1,x2,x3,...) test for same mean.

Test the data sets for the same mean, assuming normally distributed
data sets. This is also known as one of the ANOVA tests.

Returns the error we make, if we reject same mean.

Example:
>seed(0.5); v=normal(1,10)+1; w=normal(1,12)+2; u=normal(1,5);
>varanalysis(v,w,u)
0.000556414242764 // reject same mean!
```
```function mediantest (a:real vector, b:real vector)
```
```  Median test for equal mean.

Test the two distributions a and b on equal mean value. For this,
both distributions are checked on exceeding the median of the
cumulative distribution.

Returns the error we make, if we reject that a and b can have the
same mean.
```
```function ranktest (a:real vector, b:real vector, eps=epsilon())
```
```  Mann-Whitney test of a and b for the same distribution

Return the error we make, if we reject the same distribution.
```
```function signtest (a:real vector, b:real vector)
```
```  Test, if the expected mean of a is not better than b

Assume a(i) and b(i) are results of a treatment. Then we can ask,
if a is better than b.

a,b : row vectors of same size

Return the error we make, if we decide that a is better
than b.
```
```function wilcoxon (a:real vector, b:real vector, eps=sqrt(epsilon()))
```
```  Test, if the expected mean of a is not better than b

This is a sharper test for the same problem as in "signtest".

Returns the error you make, if you decide that a is better
than b.

See:   signtest (Statistics with Euler Math Toolbox)
```

## Statistical Plots

```function quartiles (x, outliers=1.5)
```
```  Quartiles for each row of x.

This computes [Min,Q1,M,Q2,Max], where M is the median, Q1
the median of the lower half and Q2 the median of the upper half.

outliers : If none, Min and Max are the minimal and maximal values
of the data. Otherwise, Min is the least data value, which is not
smaller than Q1-outliers*range, where range=Q2-Q1. Similar for
Max.

See:   boxplot (Statistics with Euler Math Toolbox),   boxplot (Maxima Documentation)
```
```function boxplot (data:real, lab=none, style="0#",
textcolor=none, outliers=1.5, pointstyle="o",
range=none)
```
```  Summary of the quartiles in graphical form.

data : vector or matrix. In case of a matrix, the rows are used.
style : If present, it is used as fill style, the default is "O#"
lab : Labels for each row of the data (vector of strings)
textcolor : Color of the labels (vector of colors)
outliers : Factor for the maximal whisker length or none
pointstyle : Point style for outliers
range : 1x2 vector for the plot range (or none)

>x=normal(1000)*10+1000; boxplot(x):
>x=randnormal(5,1000,100,10); boxplot(x,outliers=none):

See:   quartiles (Statistics with Euler Math Toolbox),   barstyle (Euler Core)
```
```function columnsplot (x:vector, lab=none,
style="O#", color=green, textcolor=none,
width=0.4, frame=true, grid=true)
```
```  Plot the elements of x as columns.

x : vector of values
lab : a string vector with one label for each element of x.
style,color : fill style and color for the bars
textcolor : color for the labels

See:   style (Euler Core),   style (Maxima Documentation),   color (Euler Core),   color (Maxima Documentation),   plot2d (Plot Functions),   plot2d (Maxima Documentation)
```
```function dataplot (x:real, y:real, style="[]w", color=1)
```
```  Plot the data (x,y) with point and line plots.

x : real row vector
y : real row vector or matrix (one row for each data).
style : a style or a vector of styles
color : a color or a vector of colors

You can use a vector of styles and a vector of colors. These
vectors must contain as many elements as there are rows of y.

See:   statplot (Statistics with Euler Math Toolbox)
```
```function piechart (x:real vector, style="0#",
color=green, lab:string vector=none, r=1.5, textcolor=red)
```
```  plot the data x in a pie chart.

x : the vector of data
color : a color or a vector of colors (same length as x)
style : a style or a vector of styles
lab : a vector of labels (same length as x)
r : The piechart has radius 1. To leave space use r=1.5.
```
```function starplot (v, style="/", color=green, lab:string=none,
rays:integer=0, pstyle="[]w", textcolor=red, r=1.5)
```
```  A star like plot with a filled star or with rays and dots only
```
```function logimpulseplot (x, y=none, style="O#", color=green, d=0.1)
```
```  Logarithmic impulse plot of y.
```
```function columnsplot3d (z:real, srows=none, scols=none,
angle=30°, height=40°, zoom=2.5, distance=5,
crows:vector=none, ccols:vector=none, positive:integer=false)
```
```  Plot 3D columns from the matrix z.

This function shows a 3D plot of columns with heights z[i,j] in
a rectangular array. z can be any real nxm matrix.

z : the values to be displayed
srows : labels for the rows
scols : labels for the columns
crows : colors of the rows
ccols : colors of the columns (alternatively)
positive : plot only positive columns

Example
>x=normal(1,1000); y=normal(1,1000);
>v=-6:6; z=find2(x,y,v,v);
>columnsplot3d(z,v,v,>positive):

See:   find2 (Statistics with Euler Math Toolbox)
```
```function mosaicplot (z: real, srows=none, scols=none,
textcolor=red, color=green, style="O#")
```
```  Mosaic plot of the data in z.

z : matrix with values
srows, scols : label strings for the rows and columns (string
vectors)
color : a color or a vector of colors for the columns of the plot.
style : a style or a vector of styles.

For an example see the introduction to statistics.
```
```function scatterplots (M:real, lab=none,
ticks=1, grid=4, style="..")
```
```  Plot all rows of M against all rows of M.

The labels are shown in the diagonal of the plot.

lab : labels for the rows.
```
```function statplot (x, y=none, plottype="b",
pstyle="[]w", lstyle="-", fstyle="O#",
xl="", yl="", color=none, vertical=0)
```
```  Plots x against y.

This is a simple form of using plot2d with point, line or bar
options.

The available plot types are

'p' : point plot
'l' : line plot
'b' : both
'h' : histogram plot
's' : surface plot

pstyle, lstyle, fstyle : Styles for the points, lines and bars

color : color or color array
vertical : vertical labels

See:   style (Euler Core),   style (Maxima Documentation)
```
```function getspectral (x)
```
```  Get a spectral color for 0<=x<=1.

The scheme runs from blue (0) to red (1)
```
```function colormap (A, spectral=0, color=white)
```
```  Plot a color map of the matrix A.

The plot has a color scale on the right. The color is either a fixed
color (white by default) or spectral colors.

Example
>colormap(randexponential(50,50),color=yellow); ...
>title("Exponential distribution"); ...
>xlabel("n"); ylabel("m"):
```

## Data Tables in Statistics

```function writetable (x,
fixed:integer=0, wc:index=10, dc:nonnegative integer=2,
labc=none, labr=none, wlabr=none, lablength=1,
NA=".", NAval=NAN,
ctok:index=none, tok:string vector=none,
file=none, separator=none, comma=false,
date=none, time=none)
```
```  Write a table x of statistical values

wc : default width for all columns or vector of widths. This is
used only if the separator is not set.
dc : default decimal precision for all columns or vector of
precision values.
fixed : use fixed number of decimal digits
(boolean or vector of boolean).

labc : labels for the columns (string or real vector)
lablength : increase the width of the columns, if labels are wider.
labr : labels for the rows (string or real vector)

NA, NAval : Token string and value to represent "Not Available". By
default "." and NAN is used.

comma : write with decimal comma instead of dot.
separator : use this separator string instead of the default
blanks. Note that the number of blanks is determined
by wc, if no separator is given.

date : vector of columns, which should be written as dates.
time : vector of columns, which should be written as times.

Write a table with labels for the columns and rows and formats for
each row. A typical table looks like this

A     B     C
G   1.02     2     f
H   3.05     5     m

Each number in the table can be translated into a token string.
This translation can be set with a global variable tok (string
vector) which applies to all columns with indices in ctok (index
vector). Or it can be set in each column with an assigned variable
tok? (string vector), where ? is the number of the column. Note
that these assigned variables need to be declared with :=, since
they are not in the parameter list of writetable().

See the introduction to statistics for an example.

See:   readtable (Statistics with Euler Math Toolbox)
```
```function readtable (filename:string, clabs=1, rlabs=0,
NA=".", NAval=NAN,
ctok:index=none, tokens=[none],
separator=none, comma=false,
date=none, list=false)
```
```  Read a table from a file.

filename : readtable(none,...) will use an open file.
clabs : The table has a line with headings.
rlabs : Each line has a heading label.
NA, NAval : Sets the string and the returned value for NA (not
available).
ctok : Indices of the columns, where tokens are to be collected.
tok1=..., tok2=... : Individual string arrays for columns.
separator : Optional separating characters.
comma : Use decimal commas instead of dots.
date : vector of columns which contain a date.

The table can have a header line (clabs=1) and row labels
(rlabs=1). The entries of the table can be numbers (by default with
decimal dots) or strings. In case of strings, these tokens are
translated to unique numbers. The translation can either be set for
each column separately in string vectors with names tok1, tok2
etc., or for the complete table in the tokens parameter.

The tokens are collected from the columns with indices in the ctok
vector. If a column has a tok? parameter (tok1, tok2, etc.), tokens
are not collected automatically from that column but the
translation in tok? is used.

Note that you have to write tok1:=... since the token parameters
are not pre-defined parameters in the parameter list.

The table can also contain expressions with units or global
variables.

"Not Available" can be represented by a special string. The
default is ".". In the numerical table, it is represented by
default as NAN. If you do not like this, simply let NAN be
represented by any other string and translate it into a numerical
token.

Dates are converted to a unique day number.

See the introduction for statistics for an example.

The default separator is a comma, semicolon, blank or tabulator. If
you have a file with semicolons and decimal commas, just enable
>comma. This will replace all commas with dots before the
evaluation.

Returns {table, heading string, token strings, rowlabel strings}

See:   writetable (Statistics with Euler Math Toolbox),   date (Basic Utilities),   day (Basic Utilities),   day (Astronomical Functions)
```
```function tablecol (M:real, j:nonnegative vector, NAval=NAN)
```
```  The non-NAN values in the columns j of the table M.

To access a table column, you could simply use M[,j], where j is a
row vector of indices or a single index. But this function skips
any NAN values in any of the columns j. It returns the columns
as rows (transposed) and the indices of the rows.

NAval : The value that should be treated as "Not Available"

Returns {columns as rows, indices of non-NAN rows}
```
```function selectrows (M:real, j:index, v:real vector, NAval=NAN)
```
```  Select the row indices i with M[i,j] in v and not NAN.
```
```function sortedrows (M:real, j:nonnegative integer vector)
```
```  Index of rows for sorted table with respect to columns in j

The table gets sorted in lexicographic order.

Returns : {sorted table, index of sorted values}
```

## Shuffle, Sort and Find

For statistical purposes and many other applications, Euler has efficient functions to find values in a vector.

```function comment shuffle (v)
```
```  Shuffle the vector v

See:   sort (Statistics with Euler Math Toolbox),   sort (Maxima Documentation)
```
```function comment sort (v)
```
```  Sort the vector v

The function returns {x,i}, where x is the sorted vector, and i is
the vector of indices, which sort the vector.

>v=shuffle(1:10)
[6,  3,  1,  5,  10,  4,  9,  8,  2,  7]
>{vx,i}=sort(v); vx,
[1,  2,  3,  4,  5,  6,  7,  8,  9,  10]
>v[i]
[1,  2,  3,  4,  5,  6,  7,  8,  9,  10]

See:   shuffle (Statistics with Euler Math Toolbox)
```
```function comment lexsort (A)
```
```  Lexicographic sort of the rows of A

Returns {Asorted,i}, where i is the vector of indices, which sorts
the rows of A.

>A=intrandom(5,5,3)
2       1       2       1       2
1       3       3       1       2
3       3       2       1       2
3       1       3       2       2
3       2       1       1       1
>lexsort(A)
1       3       3       1       2
2       1       2       1       2
3       1       3       2       2
3       2       1       1       1
3       3       2       1       2

See:   sort (Maxima Documentation)
```
```function overwrite unique (v)
```
```  Unique elements in v

>v=intrandom(10,12)
[6,  2,  3,  9,  6,  5,  7,  7,  10,  2]
>unique(v)
[2,  3,  5,  6,  7,  9,  10]
```
```function comment find (v,x)
```
```  Find x in the intervals of the sorted vector v

Returns the index i such that v[i] <= x < v[i+1]. It returns 0 for
elements smaller than the first element of v, and length(v) for
elements larger than or equal to the last element of v. The function
maps to x.

The function works for sorted vectors of strings v, and strings or
string vectors x using alphabetic (ASCII) string comparison.

>s=random(10)
[0.270906,  0.704419,  0.217693,  0.445363,  0.308411,  0.914541,
0.193585,  0.463387,  0.095153,  0.595017]
>v=0.2:0.2:0.8
[0.2,  0.4,  0.6,  0.8]
>find(v,s)
[1,  3,  1,  2,  1,  4,  0,  2,  0,  2]

See:   indexof (Statistics with Euler Math Toolbox),   indexofsorted (Statistics with Euler Math Toolbox)
```
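The interval rule of `find` matches a right-biased binary search. The Python sketch below (not EMT code) assumes exactly that equivalence; `find_interval` is a hypothetical name, and the returned indices are 1-based as in EMT.

```python
from bisect import bisect_right


def find_interval(v, xs):
    """For each x in xs, return i with v[i] <= x < v[i+1] (1-based), 0 if x < v[1]."""
    # bisect_right counts the elements of v that are <= x, which is
    # the 1-based index of the interval containing x.
    return [bisect_right(v, x) for x in xs]


v = [0.2, 0.4, 0.6, 0.8]
s = [0.270906, 0.704419, 0.217693, 0.445363, 0.308411,
     0.914541, 0.193585, 0.463387, 0.095153, 0.595017]
print(find_interval(v, s))
```

Because Python compares strings lexicographically, the same sketch also works for sorted lists of strings.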
```function comment count (v,n)
```
```  Counts v[i] in integer intervals [i-1,i[ up to n

Returns a vector c, where c[i] is the number of elements of v in
the half-open interval [i-1,i[ for 1<=i<=n.

>count([0,0.1,0.2,1,1.5,2],2)
[3,  2]
```
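The half-open intervals explain why the value 2 is not counted in the example. A minimal Python sketch of this rule follows (not EMT code); `count_intervals` is a hypothetical name, and dropping values outside [0,n[ is an assumption based on the example.

```python
import math


def count_intervals(v, n):
    """counts[i-1] is the number of elements of v in [i-1, i) for 1 <= i <= n."""
    counts = [0] * n
    for x in v:
        k = math.floor(x)  # x lies in [k, k+1)
        if 0 <= k < n:     # values outside [0, n) are ignored (assumption)
            counts[k] += 1
    return counts


print(count_intervals([0, 0.1, 0.2, 1, 1.5, 2], 2))
```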
```function comment indexof (v,x)
```
```  Find x in the vector v

Find the first occurrence of x in the vector v. Maps to x.

>v=intrandom(10,10)
[6,  5,  2,  2,  3,  8,  5,  4,  4,  2]
>indexof(v,1:10)
[0,  3,  5,  8,  2,  1,  0,  6,  0,  0]
>indexof(["This","is","a","test"],"a")
3

See:   indexofsorted (Statistics with Euler Math Toolbox),   find (Statistics with Euler Math Toolbox)
```
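The first-occurrence rule can be sketched in Python (not EMT code). The name `index_of` is hypothetical; the sketch assumes 1-based positions and 0 for values that do not occur, as the example output indicates.

```python
def index_of(v, xs):
    """1-based position of the first occurrence of each x in v, or 0 if absent."""
    first = {}
    for pos, value in enumerate(v, start=1):
        # setdefault keeps the position already stored, i.e. the first one.
        first.setdefault(value, pos)
    return [first.get(x, 0) for x in xs]


v = [6, 5, 2, 2, 3, 8, 5, 4, 4, 2]
print(index_of(v, range(1, 11)))
print(index_of(["This", "is", "a", "test"], ["a"]))
```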
```function comment indexofsorted (v,x)
```
```  Find x in the sorted vector v

Find the last occurrence of x in the sorted vector v. Note that
indexof returns the first occurrence. Maps to x.

>v=sort(intrandom(10,10))
[3,  4,  5,  5,  5,  6,  8,  8,  9,  10]
>indexofsorted(v,1:10)
[0,  0,  1,  2,  5,  6,  0,  8,  9,  10]

See:   find (Statistics with Euler Math Toolbox)
```
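On a sorted vector, the last occurrence can be located by binary search. The Python sketch below (not EMT code) assumes this is equivalent to the documented behavior; `index_of_sorted` is a hypothetical name.

```python
from bisect import bisect_right


def index_of_sorted(v, xs):
    """1-based position of the last occurrence of each x in sorted v, or 0 if absent."""
    result = []
    for x in xs:
        i = bisect_right(v, x)  # number of elements <= x
        # x occurs in v exactly if the element just before position i equals x.
        result.append(i if i > 0 and v[i - 1] == x else 0)
    return result


v = [3, 4, 5, 5, 5, 6, 8, 8, 9, 10]
print(index_of_sorted(v, range(1, 11)))
```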
```function comment multofsorted (v, x)
```
```  Counts x in the sorted vector v

The function maps to x.

>v=intrandom(1000,10); multofsorted(sort(v),1:10), sum(%)
[88,  84,  126,  86,  110,  104,  86,  103,  113,  100]
1000

See:   getmultiplicities (Statistics with Euler Math Toolbox),   getfrequencies (Statistics with Euler Math Toolbox)
```
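Counting in a sorted vector amounts to the distance between the first and last position of a value. A minimal Python sketch of that idea (not EMT code, `mult_of_sorted` is a hypothetical name):

```python
from bisect import bisect_left, bisect_right


def mult_of_sorted(v, xs):
    """Number of occurrences of each x in the sorted list v."""
    # bisect_left is the count of elements < x, bisect_right the count of elements <= x.
    return [bisect_right(v, x) - bisect_left(v, x) for x in xs]


v = [3, 4, 5, 5, 5, 6, 8, 8, 9, 10]
m = mult_of_sorted(v, range(1, 11))
print(m)
print(sum(m) == len(v))
```

As in the EMT example, the multiplicities add up to the length of v when every element of v is among the queried values.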
```function getfrequencies (x:real vector, r: real vector)
```
```  Count the number of x in the intervals of the sorted vector r.

The function returns the number of x[j] in the intervals r[i-1] to
r[i].

x : real row vector (1xn)
r : real sorted row vector (1xm)

Returns the frequencies f as a row vector (1x(m-1))

See:   count (Statistics with Euler Math Toolbox),   histo (Statistics with Euler Math Toolbox),   multofsorted (Statistics with Euler Math Toolbox),   getmultiplicities (Statistics with Euler Math Toolbox)
```
```function getmultiplicities (x, y, sorted=0)
```
```  Counts how often the elements of x appear in y.

This works for string vectors and for real vectors.

sorted : if true, then y is assumed to be sorted.

See:   count (Statistics with Euler Math Toolbox),   getfrequencies (Statistics with Euler Math Toolbox),   multofsorted (Statistics with Euler Math Toolbox)
```
```function getstatistics (x:real vector, y:real vector=none)
```
```  Return statistics of the values in the vector x.

If y is none, the function returns {xu,mu}, where xu are the
unique elements of x, and mu are the multiplicities of these
values.

Otherwise, the function returns {xu,yu,M}, where xu are the unique
elements of x, yu the unique elements of y, and M is a table of
multiplicities of pairs (xu[i],yu[j]) in (x[k],y[k]), k=1...n.
```
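The two return forms can be sketched in Python (not EMT code). The name `get_statistics` is hypothetical, and the ascending order of the unique values is an assumption.

```python
from collections import Counter


def get_statistics(x, y=None):
    """Unique values with multiplicities, or a table of pair counts if y is given."""
    xu = sorted(set(x))
    if y is None:
        counts = Counter(x)
        return xu, [counts[a] for a in xu]
    yu = sorted(set(y))
    pairs = Counter(zip(x, y))  # counts of the pairs (x[k], y[k])
    # M[i][j] is the number of k with x[k] == xu[i] and y[k] == yu[j].
    M = [[pairs[(a, b)] for b in yu] for a in xu]
    return xu, yu, M


print(get_statistics([1, 2, 2, 3]))
print(get_statistics([1, 2, 2, 3], [0, 1, 1, 0]))
```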
```function args histo (d:real vector, n:index=10,
integer:integer=0, even:integer=0, v:real vector=none,
bar=1)
```
```  Computes {x,y} for histogram plots.

d : 1xm vector of data

Returns {x,y} with

x - End points of the intervals (equispaced n+1 points)
y - The number of data in the subintervals (frequencies)

integer : flag for distributions on integers
even : flag for evenly spaced discrete distributions
This is used by plot2d for bar styles.

v : optional interval boundaries (ordered).

bar : If true, the function returns two vectors for >bar in plot2d.
If false, it returns a sawtooth function for plot2d.

The plot function plot2d has parameters distribution=1, histogram=1
to achieve the same effect.

See:   plot2d (Plot Functions),   plot2d (Maxima Documentation)
```
```function find2 (x:vector, y:vector,
vx:vector=none, vy:vector=none, n:integer=none)
```
```  Matrix count for pairs x[i],y[i] in the bounds.

x,y : Vectors of same size.
vx,vy : Sorted vector of bounds, if present (must enclose x resp. y)
n : If vx or vy is not present, number of intervals between the bounds of x.

Returns a matrix with counts.

See:   columnsplot3d (Statistics with Euler Math Toolbox)
```

## Confidence Intervals

```function cinormal (mean:numerical, sigma:numerical, alpha=0.05)
```
```  Confidence interval for known mean and standard deviation.

See:   cimean (Statistics with Euler Math Toolbox)
```
```function cimean (data: real vector, alpha=0.05)
```
```  Confidence interval for the mean of normally distributed data

This is a symmetric interval around the mean value of the data
containing the true mean of the random experiment in 95% (default
alpha=0.05) of the cases. The data are assumed to come from
independent, identically normally distributed random variables.

Clopper-Pearson confidence interval for k hits in n.

The upper bound of the interval is such that P(X<=k,p)=alpha/2, the
lower bound such that P(X>=k,p)=alpha/2. In other words, if p is
outside the interval then k is an event which is less likely than
alpha. This interval estimator yields an interval which contains
the true p in 95% (default alpha=0.05) of the cases.

>clopperpearson(20,400)
[0.0308831,  0.076167]

```
