SAISIR

A comprehensive package for chemometrics in the MATLAB environment

Documentation on SAISIR functions

For more information on the general structure of SAISIR, see also the manual.

Unité de Sensométrie et de Chimiométrie ENITIAA-INRA (Nantes, France)

Coordinator: Dr Dominique BERTRAND (e-mail: bertrand "at" nantes.nantes.fr)

Unité GENIAL, Equipe Ingénierie Analytique pour la Qualité des Aliments (Paris, France)

Coordinator: Dr Christophe CORDELLA (e-mail: cordella "at" paris.inra.fr)

Go to thematic list of functions

Go to alphabetic list of functions

Documentation automatically generated on 07-Jan-2010

This software is copyrighted by ENITIAA-INRA, Unité de Sensométrie et de Chimiométrie, Nantes (France)

Thematic list of functions

· Elementary manipulation of SAISIR files

· Elementary data transformation and pre-treatments

· Principal Component analysis

appendbag1 - Merge an arbitrary number of "bag" files according to rows

usage: [X]= appendbag1(X1, X2, X3,.....)

bag2group - uses the identifiers in bag to create groups

function [group_type]=bag2group(bag)

bag_appendrow1 - Merges an arbitrary number of bags according to rows

usage: [bag]= bag_appendrow1(bag1, bag2 ....)

check_name - Checks whether some strings are strictly identical in a string array

function [detected,names]=check_name(string)

excel2bag - reads an excel file and creates the corresponding text bag

(array of characters)

excel2saisir - reads an excel text file

function [saisir] = excel2saisir(filename,(nchar),(start),(xend))

issaisir - tests if the input argument is a SAISIR matrix

test=issaisir(X);

matrix2saisir - transforms a MATLAB matrix into a saisir structure

X = matrix2saisir(data,(coderow),(codecol))

readexcel1 - reads an Excel file in the .CSV format (creates a 3-way character matrix).

function [data] = readexcel1(filename,(nchar),(deb),(xend))

readident - loads a file of strings

function [ident, nident] = readident(filename,namesize)

saisir2ascii - Saves a saisir file into a simple ASCII format

function saisir2ascii(X,filename,separator)

saisir2excel - Saves a saisir file in a format compatible with Excel

function saisir2excel(X,filename)

saisir_check - Checks if the data respect the saisir structure

function check=saisir_check(X)

string2saisir - creation of a saisir file from a string table (first column=name)

function [saisir] = string2saisir(data)

string2text - saves a vector of strings in a .txt format

function string2text(str,filename)

Thematic list
Alphabetic list

Elementary manipulation of SAISIR files

addcode - adds a string before or after a matrix of characters

function str1 = addcode(str,code,(deb_end))

alphabetic_sort - sorts the rows of X according to the alphabetical order of the row identifiers

function X1=alphabetic_sort(X,start_pos:end_pos)

appendbag1 - Merge an arbitrary number of "bag" files according to rows

usage: [X]= appendbag1(X1, X2, X3,.....)

appendcol - merges two files according to columns

function: [X3]= appendcol(X1,X2)

appendcol1 - merges an arbitrary number of files according to columns

usage: [X]= appendcol1(X1,X2,X3,...)

appendrow - merges two SAISIR matrices according to rows

usage: [X3]= appendrow(X1,X2)

appendrow1 - Merges an arbitrary number of files according to rows

usage: [X]= appendrow1(X1,X2,X3,...)

bag2group - uses the identifiers in bag to create groups

function [group_type]=bag2group(bag)

bag_appendrow1 - Merges an arbitrary number of bags according to rows

usage: [bag]= bag_appendrow1(bag1, bag2 ....)

build_indicator - builds a disjunctive (indicator) table

function [indicator, groupings]=build_indicator(x);

check_name - Checks whether some strings are strictly identical in a string array

function [detected,names]=check_name(string)

create_group - creates a vector of numbers indicating groups from identifiers

group=create_group(X,code_list,startpos,endpos)

create_group1 - uses the identifiers to create groups

use the identifier for creating groups.

deletecol - deletes columns of saisir files

function [X1] = deletecol(X,index)

deleterow - deletes rows

function X1 = deleterow(X,index)

eliminate_nan - suppresses "not a number" data in a saisir structure

function [X1] = eliminate_nan(X,(row_or_col))

find_index - finds the index corresponding to the closest "value"

function index=find_index(str,value);

find_max - gives the indices of the max value of a MATLAB Matrix

function [row,col,value]=find_max(matrix)

find_min - gives the indices of the min value of a MATLAB Matrix

function [row,col,value]=find_min(matrix)

find_peaks - finds and displays peaks greater than a threshold value

function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

group_centering - Centers data according to groups

function X1=group_centering(X,group);

issaisir - tests if the input argument is a SAISIR matrix

test=issaisir(X);

matrix2saisir - transforms a MATLAB matrix into a saisir structure

X = matrix2saisir(data,(coderow),(codecol))

num2str1 - Justified num2string

function str=num2str1(vector,ndigit);

random_splitrow - random selection of rows

function[X1,X2]=random_splitrow(X, nselect)

reorder - reorders the data of files A1 and A2 according to their identifiers

function [B1 B2]=reorder(A1,A2)

repeat_string - builds a matrix of characters by repeating a string

function str1=repeat_string(str,ntimes);

row_center - subtracts the average row from each row

function [X] = row_center(X1)

saisir_check - Checks if the data respect the saisir structure

function check=saisir_check(X)

saisir_sort - sorts the rows of s according to the values in a column

function [X1 X2]=saisir_sort(X,ncol,minmax)

saisir_transpose - transposes a data matrix following the saisir format

function [X] = saisir_transpose(X1)

select_from_identifier - Uses identifier of rows for selecting samples

function [X1] = select_from_identifier(X,startpos,str)

select_from_variable - uses identifiers of columns for selecting variables

function [X1] = select_from_variable(X,startpos,str)

selectcol - creates a new data matrix with the selected columns

function [X] = selectcol(X1,index)

selectrow - creates a new data matrix with the selected rows

function [X] = selectrow(X1,index)

split_average - averages observations according to the identifiers

function res=split_average(X,startpos,endpos)

splitrow - splits a data matrix into 2 resulting matrices

function [X1, X2]= splitrow(X,index)
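Several entries above (create_group, build_indicator) derive groups from a portion of the row identifiers. The following is a minimal sketch of that idea in plain MATLAB, not the SAISIR code itself: the identifiers, the variable names and the choice of unique for numbering the codes are assumptions made for illustration.

```matlab
% Row identifiers stored as a character matrix, one identifier per row
% (hypothetical sample names; the group code is the first character).
ident = ['A01'; 'B01'; 'A02'; 'C01'; 'B02'];

startpos = 1;   % first character of the group code
endpos   = 1;   % last character of the group code

% Extract the code of each row and number the distinct codes.
codes = cellstr(ident(:, startpos:endpos));
[code_list, first_idx, group] = unique(codes);
group = group(:);   % force a column vector of group numbers

% Disjunctive (indicator) table: indicator(i, g) is 1 when row i is in group g.
ngroup = numel(code_list);
indicator = double(bsxfun(@eq, group, 1:ngroup));

disp(group');
disp(indicator);
```

Each row of the indicator table contains exactly one 1, in the column of its group. The functions listed above may number or order the groups differently.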

Thematic list
Alphabetic list

Elementary data transformation and pre-treatments

build_indicator - builds a disjunctive (indicator) table

function [indicator, groupings]=build_indicator(x);

center - subtracts the average from each row

function [X1 xmean] = center(X)

correct_baseline - simple linear baseline correction, using intensity

function [saisir] = correct_baseline (saisir1,col1,col2)

create_group - creates a vector of numbers indicating groups from identifiers

group=create_group(X,code_list,startpos,endpos)

create_group1 - uses the identifiers to create groups

use the identifier for creating groups.

eliminate_nan - suppresses "not a number" data in a saisir structure

function [X1] = eliminate_nan(X,(row_or_col))

moving_average - Moving average of signals

function X1=moving_average(X,window_size)

moving_max - replaces the central point of a moving window by the maximum value

function [X1] = moving_max(X,window_size)

moving_min - replaces the central point of a moving window by the minimum value

function [X1] = moving_min(X,window_size)

msc - Multiplicative scatter correction on spectra

function [X1] = msc(X,(reference))

norm_col - divides each column by the corresponding standard deviation

function [saisir] = norm_col(saisir1,(mode))

normc - Normalize columns of a matrix.

random_saisir - Creation of a random matrix

function[X]=random_saisir(nrow,ncol)

random_select - builds a vector of random elements 0 or 1

function[selected]=random_select(nel, nselect, (nrepeat))

randomize - Builds a file of randomly attributed vector in X1

function X1=randomize(X)

reorder - reorders the data of files A1 and A2 according to their identifiers

function [B1 B2]=reorder(A1,A2)

saisir_derivative - n-th order derivative using the Savitzky-Golay coefficients

[X]=saisir_derivative(X1,polynom_order,window_size,derivative_order)

saisir_emsc - Correction of spectra by EMSC

[X1, emsc_model, coefficients]=saisir_emsc(X,good_spectra,bad_spectra,ref);

sgolaycoef - Computes the Savitzky-Golay coefficients

function [B,G] = sgolaycoef(k,F)

snv - Standard normal variate correction on spectra

function [X1] = snv(X)

standardize - divides each column by the corresponding standard deviation

function [X, xstd] = standardize(X1,(option))

subtract_variable - subtracts a given variable from all the others

function [X1] = subtract_variable(X,ncol)

surface1 - Represent a surface in three dimensions

function [zmin, zmax]=surface1(X)

surface_std - divides each row by the sum of its corresponding columns

function [X1] = surface_std(X,(threshold))
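The snv and msc entries above name two standard scatter corrections for spectra. As a rough illustration of what these corrections compute, here is a sketch on a plain MATLAB matrix with hypothetical data. It is not the SAISIR implementation, which works on saisir structures and may differ in details such as the choice of the reference spectrum.

```matlab
% Three hypothetical spectra (rows) measured at four variables (columns).
X = [1 2 3 4;
     2 4 6 9;
     3 3 5 5];

% SNV: centre each row on its own mean, then divide by its own
% standard deviation.
row_mean = mean(X, 2);
row_std  = std(X, 0, 2);
X_snv = bsxfun(@rdivide, bsxfun(@minus, X, row_mean), row_std);

% MSC: regress each spectrum on a reference (assumed here to be the mean
% spectrum), then remove the fitted offset and divide by the fitted slope.
reference = mean(X, 1);
X_msc = zeros(size(X));
for k = 1:size(X, 1)
    p = polyfit(reference, X(k, :), 1);   % p(1) = slope, p(2) = offset
    X_msc(k, :) = (X(k, :) - p(2)) / p(1);
end
```

After SNV every row has mean 0 and standard deviation 1. After MSC, regressing any corrected row on the reference gives a slope of 1 and an offset of 0.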

Thematic list
Alphabetic list

barycenter_map - graph of map of barycenter

function barycenter_map(X,col1,col2,group,(charsize))

browse - browses a series of curves

function browse(X,xstart)

ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels

ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

colored_curves - displays curves coloured according to groups

function h=colored_curves(X,group)

colored_map1 - colored map : using a portion of the identifiers as labels

colored_map1(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

colored_map2 - colored map : using a portion of the identifiers as labels

colored_map2(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

colored_map4 - colored map according to 2 criteria

colored_map4(X,col1,col2,color_choice,(symbol_choice),(charsize))

correlation_circle - Displays the correlation circle after PCA

function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))

correlation_plot - Draws a correlation between scores and tables

function handle=correlation_plot(scores,col1,col2, X1,X2, ...);

curve - represents a row of a matrix as a single curve

handle=curve(X,(nrow), (xlabel),(ylabel),(title))

curves - represents several rows of a matrix as curves

usage: handle=curves(X,range,(xlabel),(ylabel),(title))

dendro - dendrogram using Euclidean metric and Ward linkage

function group=dendro(X,(topnodes))

ellipse_map - plots the ellipse confidence interval of groups

function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))

find_peaks - finds and displays peaks greater than a threshold value

function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

labelled_hist - draws a histogram in which each observation name is used as a label

function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))

list - lists rows (only with a small number of columns)

function list(X,(start))

map - graph of map of data using identifiers as names

function map(X,col1,col2,(col1label),(col2label),(title),(charsize),(margin))

map3D - Draws a 3D map

function map3D(X,col1,col2,col3,(label1),(label2),(label3),(title),(charsize))

mir_style - changes the sign of the variables of MIR spectra

function [names] = mir_style(names1)

plotmatrix1 - biplots of columns of matrices with colors

function plotmatrix1(s,startpos,endpos,charsize)

sensory_profile - Graphical representation of sensory profile

function[h]=sensory_profile(X,range,max_score,(title))

show_vector - represents a row of a matrix as a succession of identifiers

function handle=show_vector(X, (nrow) ,(csize),(xlab),(ylab),(title))

submap - partial display of observations

function submap(X,col1,col2,xstring,(col1label),(col2label),(title),(charsize),(marg))

symbol_map - map with symbols : using a portion of the identifiers for

symbol_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize))

tcurve - representation of a column of a given matrix as a curve

function handle=tcurve(X, ncol, (xlabel),(ylabel),(title))

tcurves - represents several columns of a matrix as curves

function handle=tcurves(X, range, (xlabel),(ylabel),(title))

trajectory_curve - plots coloured XYcurves

function handle=trajectory_curve(X,col1,col2,startpos,endpos);

xdisp - smart display of heterogeneous variables

function xdisp(varargin)

xy_plot - Biplot of one column of X versus one column of Y

function handle=xy_plot(X, xcol, Y, ycol,start_pos,end_pos);

xyz_colored_map1 - Draws a colored 3D map from a Saisir file

xyz_colored_map1(X,col1,col2,col3,startpos,endpos)

Thematic list
Alphabetic list

anavar1 - One way analysis of variance on all the columns

function res = anavar1(X,g)

anovan1 - N-way analysis of variance (ANOVA) on data matrices.

function res = anovan1(X,model,gr1, gr2, ...)

contingency_khi2 - computes chi-square (khi2) statistics on a contingency table

function res=contingency_khi2(table);

contingency_table - Computes a contingency table

function table=contingency_table(g1,g2)

cormap - Correlation between two tables

function [cor] = cormap(X1,X2)

covmap - assesses the covariances between two tables

function [cov] = covmap(X1,X2)

distance - Usual Euclidean distances

function [D] = distance(X1,X2)

find_max - gives the indices of the max value of a MATLAB Matrix

function [row,col,value]=find_max(matrix)

find_min - gives the indices of the min value of a MATLAB Matrix

function [row,col,value]=find_min(matrix)

group_centering - Centers data according to groups

function X1=group_centering(X,group);

group_mean - gives the means of group of rows

function X1=group_mean(X,startpos,endpos)

labelled_hist - draws a histogram in which each observation name is used as a label

function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))

mdistance - computes distances between the two tables using metric "metric"

function dis = mdistance(X1,X2,metric)

nancor - Matrix of correlation with missing data

function[cor]=nancor(X1,X2)

row_center - subtracts the average row from each row

function [X] = row_center(X1)

saisir_mean - computes the mean of the columns, following the saisir format

function[xmean]=saisir_mean(X);

saisir_std - computes the standard_deviations of the columns, following the saisir format

function[xstd]=saisir_std(X)

split_average - averages observations according to the identifiers

function res=split_average(X,startpos,endpos)
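The distance and cormap entries above both compare two tables. The sketch below shows one common reading of each on plain MATLAB matrices with hypothetical data: Euclidean distances between the rows of two tables, and Pearson correlations between the columns of two tables that share the same rows. Whether the SAISIR functions follow exactly these conventions (for example plain versus squared distances) is an assumption.

```matlab
% Euclidean distance between every row of X1 and every row of X2
% (both tables must have the same number of columns).
X1 = [0 0; 3 4];
X2 = [0 0; 6 8; 1 1];
n1 = size(X1, 1);
n2 = size(X2, 1);
D = zeros(n1, n2);
for i = 1:n1
    for j = 1:n2
        D(i, j) = sqrt(sum((X1(i, :) - X2(j, :)) .^ 2));
    end
end

% Correlation between every column of A and every column of B
% (both tables must have the same number of rows).
A = [1 2; 2 4; 3 7; 4 8];
B = [2 1; 4 0; 6 1; 8 0];
Ac = bsxfun(@minus, A, mean(A, 1));   % centre the columns
Bc = bsxfun(@minus, B, mean(B, 1));
C = (Ac' * Bc) ./ (sqrt(sum(Ac .^ 2, 1))' * sqrt(sum(Bc .^ 2, 1)));
```

Here D has one row per row of X1 and one column per row of X2, and C(i, j) is the correlation between column i of A and column j of B.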

Thematic list
Alphabetic list

ca - CORRESPONDENCE ANALYSIS

function ca_type=ca(N);

ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels

ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

comdim - Finding common dimensions in multitable data (saisir format)

function[res]=comdim(collection,(ndim),(threshold))

covariance_pca - principal component analysis when knowing the covariance (of variables)

function[pcatype]=covariance_pca(covariance_type,(nscore))

covmap - assesses the covariances between two tables

function [cov] = covmap(X1,X2)

cumulate_covariance - Covariance on huge data set

function [covariance_type]=cumulate_covariance((X),(covariance_type1));

d2_factorial_map - assesses a factorial map from a table of squared distance

function [ftype] = d2_factorial_map(X)

distance - Usual Euclidean distances

function [D] = distance(X1,X2)

mdistance - computes distances between the two tables using metric "metric"

function dis = mdistance(X1,X2,metric)

mfa - Multiple factor analysis

function res=mfa(collection);

multiway_pca - Multi way principal component analysis

function res=multiway_pca(collection);

nuee - Dynamic clustering (nuées dynamiques, k-means)

function[res]=nuee(X,ngroup,(nchanged))

pca_cross_ridge_regression - PCA ridge regression with crossvalidation

function[res]=pca_cross_ridge_regression(X,y,krange,selected)

regression_score - build a factorial space for regression

function res=regression_score(x,beta,(y))

saisir_linkage - assesses a simple linkage vector from a matrix of distance

function z=saisir_linkage(dis)

statis - Multiway method STATIS

function res=statis(collection);

xcomdim - Finding common dimensions in multitable data

No direct use. Normally called with function "comdim"

Thematic list
Alphabetic list

applypca - computes the scores of supplementary observations

function [supscores]=applypca(pcatype, X)

applypcr - applies basic PCR on data

function [predy]=applypcr(pcrtype,X)

applyspcr - Applies a stepwise PCR

function [predy]=applyspcr(spcrtype,X)

ca - CORRESPONDENCE ANALYSIS

function ca_type=ca(N);

ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels

ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

change_sign - changes the sign of a component and of its associated eigenvector

function [pcatype1] = change_sign(pcatype,ncomp)

comdim - Finding common dimensions in multitable data (saisir format)

function[res]=comdim(collection,(ndim),(threshold))

contingency_khi2 - computes chi-square (khi2) statistics on a contingency table

function res=contingency_khi2(table);

contingency_table - Computes a contingency table

function table=contingency_table(g1,g2)

correlation_circle - Displays the correlation circle after PCA

function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))

correlation_plot - Draw a correlation between scores and tables

function handle=correlation_plot(scores,col1,col2, X1,X2, ...);

covariance_pca - principal component analysis when knowing the covariance (of variables)

function[pcatype]=covariance_pca(covariance_type,(nscore))

cumulate_covariance - Covariance on huge data set

function [covariance_type]=cumulate_covariance((X),(covariance_type1));

d2_factorial_map - assesses a factorial map from a table of squared distance

function [ftype] = d2_factorial_map(X)

dimcrosspcr1 - validation of PCR (samples in validation are selected)

function [pcrres]=dimcrosspcr1(X,y,ndim,selected)

fda1 - stepwise factorial discriminant analysis on PCA scores

function[fdatype]=fda1(pcatype, group, among, maxscore)

fda2 - stepwise factorial discriminant analysis on PCA scores with verification

function[res]=fda2(X, group, among, maxscore, selected)

mfa - Multiple factor analysis

function res=mfa(collection);

multiway_pca - Multi way principal component analysis

function res=multiway_pca(collection);

normed_pca - PCA with normalisation of data

function[pcatype]=normed_pca(X)

pca - principal component analysis on raw data

function [pcatype]=pca(X,(var_score))

pca1 - assesses principal component analysis on raw data (case nrows>ncolumns)

This function is not to be called directly.

pca2 - computes principal component analysis on raw data (case nrows>ncolumns)

This function is not to be called directly.

pca_cano - generalized canonical analysis after PCAs on each table

function res=pca_cano(collection,ndim,graph);

pca_stat - Gives some complementary statistics on PCA observations

function res=pca_stat(pca_type, comp1, comp2);

pcareconstruct - reconstructs original data from a PCA model and a file of score

function[res]=pcareconstruct(pcatype,score,nscore)

pcr - PCR (components introduced in the order of eigenvalues)

function [pcrtype]=pcr(X,y,maxdim)

spcr - stepwise Principal component regression

function [spcrtype]=spcr(X,y,maxdim,(maxrank),(corr_cov))

statis - Multiway method STATIS

function res=statis(collection);

xpca - PCA on a matlab data matrix

computes a basic principal component analysis (on non-normalised data)
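The pca, applypca and pcareconstruct entries above describe three related steps: fitting a PCA on raw (centred, non-normalised) data, projecting supplementary observations, and rebuilding data from scores. A minimal sketch of those steps on a plain MATLAB matrix follows. It uses a singular value decomposition, which is one standard way to obtain the components. The data and variable names are hypothetical, and the fields actually stored in a SAISIR pcatype structure are not shown here.

```matlab
% Hypothetical data: 5 observations (rows), 3 variables (columns).
X = [2.5 2.4 0.5;
     0.5 0.7 1.9;
     2.2 2.9 0.4;
     1.9 2.2 1.1;
     3.1 3.0 0.3];

% Fit: centre the columns, then decompose.
xmean = mean(X, 1);
Xc = bsxfun(@minus, X, xmean);
[U, S, V] = svd(Xc, 'econ');      % columns of V are the eigenvectors
scores = U * S;                   % same as Xc * V
eigenvalues = diag(S) .^ 2;       % up to a 1/n or 1/(n-1) factor
explained = 100 * eigenvalues / sum(eigenvalues);

% Apply: scores of a supplementary observation, centred with the
% mean of the calibration data.
Xsup = [2.0 2.0 1.0];
supscores = bsxfun(@minus, Xsup, xmean) * V;

% Reconstruct: approximate the original data from the first components.
ncomp = 2;
Xrec = bsxfun(@plus, scores(:, 1:ncomp) * V(:, 1:ncomp)', xmean);
```

Using all components instead of ncomp = 2 reproduces X exactly, up to rounding.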

Thematic list
Alphabetic list

apply_multiple_regression - applies multiple_regression on "unknown" data

function[res]=apply_multiple_regression(X,type,(y))

apply_ridge_regression - applies ridge regression on "unknown data"

function[res]=apply_ridge_regression(ridgetype,X,(y))

applylr1 - Apply basic latent root model on saisir data x

function [predy]=applylr1(lrtype,X)

applypcr - applies basic PCR on data

function [predy]=applypcr(pcrtype,X)

applypls - applies a pls model on an unknown data set

function res=applypls(X,plsmodel, (knowny))

applyspcr - Applies a stepwise PCR

function [predy]=applyspcr(spcrtype,X)

basic_pls - basic pls with keeping loadings and scores

function[res] = basic_pls(X,y,ndim)

basic_pls2 - PLS2 on several variables, several dimensions

function result=basic_pls2(X,y,maxdim)

cross_ridge_regression - ridge regression with validation

function[res]=cross_ridge_regression(X,y,krange,selected)

crossval_multiple_regression - validation of multiple regression.

function[res]=crossval_multiple_regression(X,y,selected)

dimcross_stepwise_regression - Tests models obtained from stepwise regression

function[res]=dimcross_stepwise_regression(X,y,selected,Pthres,(confidence))

dimcrosspcr1 - validation of PCR (samples in validation are selected)

function [pcrres]=dimcrosspcr1(X,y,ndim,selected)

leave_one_out_pls1 - PLS1 with leave_one out validation

function res=leave_one_out_pls1(X,y,ndim);

leave_one_out_pls2 - PLS2 with leave_one out validation

function res=leave_one_out_pls2(X,y,ndim);

lr1 - Latent root regression

function [lr1type]=lr1(X,y,maxdim,(ratioxy))

multiple_regression - Simple Multiple linear regression (all the variables)

function res=multiple_regression(x,y);

pcr - PCR (components introduced in the order of eigenvalues)

function [pcrtype]=pcr(X,y,maxdim)

pcr1 - Basic model of PCR (components introduced in the order of eigenvalues)

function [pcrtype]=pcr1(X,y,dim)

pls2obs - PLS regression with many observations

function [mtBpls,mtT, tBeta] = pls2var(mtX,mtY,nbdim)

quickpls - Quick PLS regression from 1 to ndim dimensions

function [plstype]=quickpls(X,y,ndim)

ridge_regression - Basic ridge regression

function [ridgetype]=ridge_regression(X,y,krange)

ridge_regression1 - Basic ridge regression at a given norm

function [ridgetype]=ridge_regression1(X,y,normrange)

saisirpls - PLS regression with "dim" dimensions

function [plstype]=saisirpls(X,Y,dim)

simple_regression - mono_linear regressions

function [beta beta0]=simple_regression(X,y);

spcr - stepwise Principal component regression

function [spcrtype]=spcr(X,y,maxdim,(maxrank),(corr_cov))

stepwise_regression - stepwise regression between x and y

function[result]=stepwise_regression(x,y,Pthres,(confidence))
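The ridge_regression entry above takes a range of ridge parameters (krange). As a hedged illustration of the textbook technique, the sketch below fits ridge coefficients for several values of k on centred data and predicts a new observation. The data are hypothetical, and it is an assumption that krange in SAISIR has the same meaning as the penalty k used here.

```matlab
% Hypothetical calibration data: y is exactly x1 + x2.
X = [1 2; 2 1; 3 4; 4 3; 5 6];
y = [3; 3; 7; 7; 11];

% Centre X and y so that the intercept is handled separately.
xmean = mean(X, 1);
ymean = mean(y);
Xc = bsxfun(@minus, X, xmean);
yc = y - ymean;

% One column of coefficients per ridge parameter k.
krange = [0 0.1 1 10];
p = size(X, 2);
beta = zeros(p, numel(krange));
for j = 1:numel(krange)
    beta(:, j) = (Xc' * Xc + krange(j) * eye(p)) \ (Xc' * yc);
end
beta0 = ymean - xmean * beta;     % one intercept per k

% Prediction for a new observation, one value per k.
Xnew = [6 5];
predy = Xnew * beta + beta0;
```

With k = 0 the model reduces to ordinary least squares, which recovers the coefficients [1; 1] on this noise-free example. Larger values of k shrink the coefficient vector towards zero.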

Thematic list
Alphabetic list

applypls - applies a pls model on an unknown data set

function res=applypls(X,plsmodel, (knowny))

applyplsda - Applies pls discriminant analysis after model assessment using plsda

function[res]=applyplsda(X,plsdatype,(actual_group))

basic_pls - basic pls with keeping loadings and scores

function[res] = basic_pls(X,y,ndim)

basic_pls2 - PLS2 on several variables, several dimensions

function result=basic_pls2(X,y,maxdim)

crossplsda - validation on PLS discriminant analysis

function[res]=crossplsda(X,group,dim,selected)

crossvalpls - validation of pls with up to ndim dimensions.

function [res]=crossvalpls(X,y,ndim,selected)

crossvalpls1a - crossvalidation of pls with up to ndim dimensions.

function [res]=crossvalpls1a(X,y,ndim,selected)

leave_one_out_pls1 - PLS1 with leave_one out validation

function res=leave_one_out_pls1(X,y,ndim);

leave_one_out_pls2 - PLS2 with leave_one out validation

function res=leave_one_out_pls2(X,y,ndim);

PCA1 -assesses principal component analysis on raw data (case nrows>ncolumns)

[type]=afdlike(x,y,select,parmi)

pls2obs - PLS regression with many observations

function [mtBpls,mtT, tBeta] = pls2obs(mtX,mtY,nbdim)

plsda - Pls discriminant analysis following the saisir format

function[plsdatype]=plsda(X,group,ndim)

quickpls - Quick PLS regression from 1 to ndim dimensions

function [plstype]=quickpls(X,y,ndim)

saisirpls - PLS regression with "dim" dimensions

function [plstype]=saisirpls(X,Y,dim)

Thematic list
Alphabetic list

apply_nuee - applies dynamic clustering (nuées dynamiques, k-means)

function[res]=apply_nuee(X,barycenter)

apply_quaddis - Quadratic discriminant analysis

function result = apply_quaddis(quaddis_type,x,(known_group));

apply_stepwise_regression - applies stepwise_regression on "unknown" data

function[res]=apply_stepwise_regression(stepwise_type,X,(y))

applyfda1 - application of factorial discriminant analysis on PCA scores

function[res]=applyfda1(X,fdatype,(actual_group))

applyplsda - Applies pls discriminant analysis after model assessment using plsda

function[res]=applyplsda(X,plsdatype,(actual_group))

barycenter_map - graph of map of barycenter

function barycenter_map(X,col1,col2,group,(charsize))

basic_pls - basic pls with keeping loadings and scores

function[res] = basic_pls(X,y,ndim)

build_indicator - builds a disjunctive (indicator) table

function [indicator, groupings]=build_indicator(x);

contingency_table - Computes a contingency table

function table=contingency_table(g1,g2)

create_group - creates a vector of numbers indicating groups from identifiers

group=create_group(X,code_list,startpos,endpos)

create_group1 - uses the identifiers to create groups

use the identifier for creating groups.

crossfda1 - validation on discrimination according to fda1 (directly on data)

function[res]=crossfda1(X,group,among,maxvar,ntest)

crossmaha - validation on discrimination according to maha1 (directly on data)

function[discrtype]=crossmaha(X,group,maxvar,ntest)

crossplsda - validation on PLS discriminant analysis

function[res]=crossplsda(X,group,dim,selected)

crossval_quaddis - crossvalidation of quadratic dis. analysis

function res=crossval_quaddis(X,group,selected)

crossvalpls - validation of pls with up to ndim dimensions.

function [res]=crossvalpls(X,y,ndim,selected)

crossvalpls1a - crossvalidation of pls with up to ndim dimensions.

function [res]=crossvalpls1a(X,y,ndim,selected)

dendro - dendrogram using Euclidean metric and Ward linkage

function group=dendro(X,(topnodes))

ellipse_map - plots the ellipse confidence interval of groups

function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))

fda1 - stepwise factorial discriminant analysis on PCA scores

function[fdatype]=fda1(pcatype, group, among, maxscore)

fda2 - stepwise factorial discriminant analysis on PCA scores with verification

function[res]=fda2(X, group, among, maxscore, selected)

maha - simple discriminant analysis forward introducing variables no validation samples

function[res]=maha(X,group,maxvar)

maha1 - forward linear discriminant analysis DIRECTLY ON DATA with validation samples

function[discrtype1]=maha1(calibration_data,calibration_group,maxvar,test_data,test_group)

maha3 - simple discriminant analysis forward introducing variables no validation samples

function[discrtype]=maha3(X,group,maxvar)

maha4 - simple discriminant analysis forward

function[discrtype]=maha4(X,group,maxvar)

maha6 - simple discriminant analysis forward introducing variables with validation samples

function[discrtype]=maha6(X,group,maxvar,selected)

nuee - Dynamic clustering (nuées dynamiques, k-means)

function[res]=nuee(X,ngroup,(nchanged))

plsda - Pls discriminant analysis following the saisir format

function[plsdatype]=plsda(X,group,ndim)

quaddis - Quadratic discriminant analysis

function quadis_type=quaddis(x,group);

random_select - builds a vector of random elements 0 or 1

function[selected]=random_select(nel, nselect, (nrepeat))

Thematic list
Alphabetic list

apply_nuee - applies dynamic clustering (nuées dynamiques, k-means)

function[res]=apply_nuee(X,barycenter)

function build_documentation(directory_name,filename,(thematic_list))

build_indicator - builds a disjunctive (indicator) table

function [indicator, groupings]=build_indicator(x);

cumulate_covariance - Covariance on huge data set

function [covariance_type]=cumulate_covariance((X),(covariance_type1));

dendro - dendrogram using Euclidean metric and Ward linkage

function group=dendro(X,(topnodes))

documentation_dico - builds a dictionary for the HTML documentation

function documentation_dico(fid,function_name);

documentation_thematic_list - HTML thematic list of function

eigord - diagonalization of a square matrix

function [mtvec,mtval] = eigord(mtA)

find_index - finds the index corresponding to the closest "value"

function index=find_index(str,value);

find_max - gives the indices of the max value of a MATLAB Matrix

function [row,col,value]=find_max(matrix)

find_min - gives the indices of the min value of a MATLAB Matrix

function [row,col,value]=find_min(matrix)

find_peaks - finds and displays peaks greater than a threshold value

function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

group_mean - gives the means of group of rows

function X1=group_mean(X,startpos,endpos)

html_header - build the first part of the documentation

function html_header(fid)

html_notice - prints the helps of functions in HTML document

function html_notice(fid,function_name);

html_postface - Finish the work in HTML documentation

function html_postface(fid);

issaisir - tests if the input argument is a SAISIR matrix

test=issaisir(X);

list - lists rows (only with a small number of columns)

function list(X,(start))

matrix2saisir - transforms a MATLAB matrix into a saisir structure

X = matrix2saisir(data,(coderow),(codecol))

mdistance - computes distances between the two tables using metric "metric"

function dis = mdistance(X1,X2,metric)

mir_style - changes the sign of the variables of MIR spectra

function [names] = mir_style(names1)

multiway_pca - Multi way principal component analysis

function res=multiway_pca(collection);

nuee - Dynamic clustering (nuées dynamiques, k-means)

function[res]=nuee(X,ngroup,(nchanged))

num2str1 - Justified num2string

function str=num2str1(vector,ndigit);

pca_ridge_regression - Basic ridge regression after PCA

function[ridgetype]=pca_ridge_regression(pcatype,y,range)

random_saisir - Creation of a random matrix

function[X]=random_saisir(nrow,ncol)

random_select - builds a vector of random elements 0 or 1

function[selected]=random_select(nel, nselect, (nrepeat))

random_splitrow - random selection of rows

function[X1,X2]=random_splitrow(X, nselect)

randomize - Builds a file of randomly attributed vector in X1

function X1=randomize(X)

reorder - reorders the data of files A1 and A2 according to their identifiers

function [B1 B2]=reorder(A1,A2)

repeat_string - builds a matrix of characters by repeating a string

function str1=repeat_string(str,ntimes);

row_center - subtracts the average row from each row

function [X] = row_center(X1)

saisir_check - Checks if the data respect the saisir structure

function check=saisir_check(X)

saisir_linkage - assesses a simple linkage vector from a matrix of distance

function z=saisir_linkage(dis)

saisir_mult - matrix multiplication following the SAISIR format

function X12=saisir_mult(X1,X2);

saisir_sort - sorts the rows of s according to the values in a column

function [X1 X2]=saisir_sort(X,ncol,minmax)

saisir_sum - calculates the sum of the rows

function xsum=saisir_sum(X);

saisir_transpose - transposes a data matrix following the saisir format

function [X] = saisir_transpose(X1)

seekstring - returns a vector giving the indices of string in matrix of char x in which 'str' is present

function index = seekstring(identifiers,xstr)

sgolaycoef - Computes the Savitzky-Golay coefficients

function [B,G] = sgolaycoef(k,F)

split_average - averages observations according to the identifiers

function res=split_average(X,startpos,endpos)

standardize - divides each column by the corresponding standard deviation

function [X, xstd] = standardize(X1,(option))

string2saisir - creation of a saisir file from a string table (first column=name)

function [saisir] = string2saisir(data)

string2text - saves a vector of strings in a .txt format

function string2text(str,filename)

thematic_classification - builds a thematic classification of the .m files

function res=thematic_classification(function_name,(previous));

w - w: (for "what") lists the fields which are present in a structure

function res= w(xstruct);

xdisp - smart display of heterogeneous variables

function xdisp(varargin)

Alphabetic list

addcode	alphabetic_sort	anavar1	anovan1	appendbag1
appendcol	appendcol1	appendrow	appendrow1	applyfda1
applylr1	applypca	applypcr	applypls	applyplsda
applyspcr	apply_multiple_regression	apply_nuee	apply_quaddis	apply_ridge_regression
apply_stepwise_regression	bag2group	bag_appendrow1	barycenter_map	basic_pls
basic_pls2	browse	build_documentation	build_indicator	ca
ca_map	center	change_sign	check_name	colored_curves
colored_map1	colored_map2	colored_map4	comdim	contingency_khi2
contingency_table	cormap	correct_baseline	correlation_circle	correlation_plot
covariance_pca	covmap	create_group	create_group1	crossfda1
crossmaha1	crossplsda	crossvalpls	crossvalpls1a	crossval_multiple_regression
crossval_quaddis	cross_ridge_regression	cumulate_covariance	curve	curves
d2_factorial_map	deletecol	deleterow	dendro	dimcrosspcr1
dimcross_stepwise_regression	distance	documentation_dico	documentation_thematic_list	eigord
eliminate_nan	ellipse_map	excel2bag	excel2saisir	fda1
fda2	find_index	find_max	find_min	find_peaks
group_centering	group_mean	html_header	html_notice	html_postface
issaisir	labelled_hist	leave_one_out_pls1	leave_one_out_pls2	list
lr1	maha	maha1	maha3	maha4
maha6	map	map3D	matrix2saisir	mdistance
mfa	mir_style	moving_average	moving_max	moving_min
multiple_regression	multiway_pca	nancor	normc	normed_pca
norm_col	nuee	num2str1	pca	pca1
pca2	pcareconstruct	pca_cano	pca_cross_ridge_regression	pca_ridge_regression
pca_stat	pcr	pcr1	plotmatrix1	pls
pls2obs	pls2var	plsda	quaddis	quickpls
randomize	random_saisir	random_select	random_splitrow	readexcel1
readident	regression_score	reorder	repeat_string	ridge_regression
ridge_regression1	row_center	saisir2ascii	saisir2excel	saisirpls
saisir_check	saisir_derivative	saisir_linkage	saisir_mean	saisir_mult
saisir_sort	saisir_std	saisir_sum	saisir_transpose	seekstring
selectcol	selectrow	select_from_identifier	select_from_variable	sensory_profile
sgolaycoef	show_vector	simple_regression	snv	spcr
splitrow	split_average	standardize	statis	stepwise_regression
string2saisir	string2text	submap	subtract_variable	surface1
surface_std	symbol_map	tcurve	tcurves	thematic_classification
trajectory_curve	w	xcomdim	xdisp	xpca

addcode

addcode - adds a string before or after a matrix of characters

function str1 = addcode(str,code,(deb_end))

Input argument:
==============
str: a matrix of characters (n x p)
code: a string (1 x k)
deb_end: a number (0 = addition before; 1 = addition after; default: 0)
Output argument:
===============
str1: a matrix of characters (n x (k+p))

This function is mainly used to recode the identifiers of observations or
variables (".i", or ".v")
example:
data.i
ans =
casein
albumin
zein
>> data.i=addcode(data.i,'1')
data =
i: [3x8 char]
>> data.i
ans =
1casein
1albumin
1zein
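
The recoding shown above can be reproduced with base MATLAB functions. This is a minimal sketch and not the SAISIR source: the variable names "ids", "prefix", "ids_before" and "ids_after" are illustrative, and the only assumption is that identifiers are stored as a blank-padded character matrix.

```matlab
% Identifiers as a blank-padded character matrix (3 x 7), like a ".i" field.
ids = char('casein', 'albumin', 'zein');
code = '1';

% Repeat the code once per row, then concatenate horizontally.
prefix = repmat(code, size(ids, 1), 1);   % 3 x 1 character column
ids_before = [prefix ids];                % addition before (deb_end = 0)
ids_after  = [ids prefix];                % addition after  (deb_end = 1)

disp(ids_before)
```

In this sketch, an addition after the identifiers places the code after the padding blanks of the shorter names.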

Return to thematic list
Return to alphabetic list
HOME

alphabetic_sort

alphabetic_sort - sorts the rows of x according to the alphabetic order of rows identifiers

function X1=alphabetic_sort(X,start_pos,end_pos)

Input arguments:
===============
X : SAISIR matrix
start_pos, end_pos : character positions in the identifiers.

Output argument:
================
X1 : SAISIR matrix with the rows sorted in alphabetic order.
This function sorts the observations (rows) according to
their identifiers (".i" field of X).

example
=======
>> a.i
ans =
xenon
krypton
Aluminium
>> a.d
ans =
1.00 2.00 3.00 4.00
5.00 6.00 7.00 8.00
9.00 10.00 11.00 12.00
>> b=alphabetic_sort(a,1,5);
>> b.i
ans =
Aluminium
krypton
xenon
>> b.d
ans =
9.00 10.00 11.00 12.00
5.00 6.00 7.00 8.00
1.00 2.00 3.00 4.00
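
The same reordering can be sketched with the base MATLAB function "sortrows". This is an illustration of the principle, not the SAISIR implementation: it assumes that the sort key is the substring from start_pos to end_pos, and it sorts by character code, so uppercase letters come before lowercase ones (how "alphabetic_sort" treats case is not stated here).

```matlab
% A small SAISIR-like structure: data (.d), row identifiers (.i), column identifiers (.v).
a.d = [1 2 3 4; 5 6 7 8; 9 10 11 12];
a.i = char('xenon', 'krypton', 'Aluminium');
a.v = char('v1', 'v2', 'v3', 'v4');

start_pos = 1;
end_pos = 5;

% Sort on the selected characters only and keep the permutation.
[~, order] = sortrows(a.i(:, start_pos:end_pos));

% Apply the same permutation to the identifiers and to the data rows.
b.d = a.d(order, :);
b.i = a.i(order, :);
b.v = a.v;

disp(b.i)
```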

anavar1

anavar1 - One way analysis of variance on all the columns

function res = anavar1(X,g)

Performs a one-way analysis of variance for each column of X.

Input arguments:
===============
X: SAISIR matrix (n x p)
g: SAISIR vector of group identifiers (integers, n x 1)

Output arguments:
================
res with fields:
res.F : SAISIR vector of Fisher values for each variable in X (1 x p)
res.F.df : degrees of freedom of the model
res.p : SAISIR vector of associated probabilities (1 x p).

Note: res.F and res.p can be examined as curves (using "curve" function)
or by the command "show_vector" (for discrete variables)

The function performs p independent one-way ANOVAs, one for each column of X, using the groups
defined in g: observations having the same number belong to the same group.

See also: "anovan1", "show_vector","create_group1"
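
The F value of a one-way ANOVA can be computed for all columns at once from the between-group and within-group sums of squares. The sketch below uses plain matrices and invented toy data; it illustrates the statistic only and is not the code of "anavar1". The associated probabilities would additionally require the F distribution function, which is not part of base MATLAB.

```matlab
% Toy data: 6 observations, 2 variables, 3 groups of 2 observations.
X = [1 2; 2 1; 3 5; 4 6; 8 3; 9 4];
g = [1; 1; 2; 2; 3; 3];

[n, p] = size(X);
levels = unique(g);
k = numel(levels);
grand_mean = mean(X, 1);

ssb = zeros(1, p);   % between-group sum of squares, one value per column
ssw = zeros(1, p);   % within-group sum of squares, one value per column
for j = 1:k
    Xj = X(g == levels(j), :);
    nj = size(Xj, 1);
    mj = mean(Xj, 1);
    ssb = ssb + nj * (mj - grand_mean) .^ 2;
    ssw = ssw + sum((Xj - repmat(mj, nj, 1)) .^ 2, 1);
end

F = (ssb / (k - 1)) ./ (ssw / (n - k));   % 1 x p vector of F values
df = [k - 1, n - k];                      % degrees of freedom (model, residual)
disp(F)
```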

anovan1

anovan1 - N-way analysis of variance (ANOVA) on data matrices.

function res = anovan1(X,model,gr1, gr2, ...)

Performs as many independent N-way analyses of variance as the number of columns in X

Input arguments:
===============
X: SAISIR matrix of response values (n x p)
model (integer): gives the level of desired interactions
(1= no interactions studied; 2: first degree of interactions ... ) (see
Matlab function ANOVAN)
gr1, gr2, ...: SAISIR vectors of qualitative groups, each forming a factor of the ANOVA
(n x 1). Identical numbers mean that the corresponding observations are in
the same group

Output arguments:
================
res with fields
res.F: the F values associated with each effect and (possibly) interaction
res.P: probability
res.df (characters): degrees of freedom
res.singular: singularity of the model. If the singularity == 1, the model
is redundant and a lower level of interaction must be tested.
see also: anovan, anova, anavar1, show_vector, create_group1

example:
my_anova=anovan1(spectra,2,grouping1, grouping2);
show_vector(my_anova.P,2); %% examination of the probabilities of the
%% second factor (with identifiers)
curve(my_anova.F,2); %% Fisher F examined as a curve.

appendbag1

appendbag1 - Merge an arbitrary number of "bag" files according to rows

usage: [X]= appendbag1(X1, X2, X3,.....)

Input argument:
===============
X1, X2, X3 ... : "bag" structure (see function "excel2bag" for details)
with the same number of columns

Output argument:
================
X: concatenated bag structure

A bag is a structure with the fields X.d, X.i and X.v,
in which bag.d is a three-way table of characters.
If bag.d is dimensioned (n x p x v), n is the number of rows, v the
number of columns, and p the number of characters in each string.
The structure bag is obtained as an output argument of the function
"excel2bag".

Example:
[value1,bag1]=excel2bag('data1',['A'; 'B'],20);
[value2,bag2]=excel2bag('data2',['A'; 'B'],20);
bag3=appendbag1(bag1,bag2);

appendcol

appendcol - merges two files according to columns

function: [X3]= appendcol(X1,X2)

Input arguments:
===============
X1, X2 : SAISIR matrices of dimensions n x p and n x q

Output argument:
===============
X3 : SAISIR matrix of dimensions n x (p+q)

The identifiers of rows are copied into X3.i
The identifiers of columns are the concatenation of X1.v and X2.v

Example:
total=appendcol(chemistry1, chemistry2);

see also: appendrow, appendrow1, appendcol1
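
The effect of a column-wise merge on the three fields of a SAISIR structure can be sketched with base MATLAB concatenation. The data and identifiers below are invented for illustration, and the sketch makes no claim about the checks performed by "appendcol" itself.

```matlab
% Two SAISIR-like structures with the same rows (n = 2).
X1.d = [1 2; 3 4];
X1.i = char('obs1', 'obs2');
X1.v = char('pH', 'ash');

X2.d = [5; 6];
X2.i = X1.i;
X2.v = char('protein');

% Column-wise merge: data side by side, row identifiers kept,
% column identifiers stacked (char pads them to a common width).
X3.d = [X1.d X2.d];
X3.i = X1.i;
X3.v = char(X1.v, X2.v);

disp(size(X3.d))
```

A row-wise merge follows the same pattern with the roles of the ".i" and ".v" fields exchanged.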

appendcol1

appendcol1 - merges an arbitrary number of files according to columns

usage: [X]= appendcol1(X1,X2,X3,...)

Input arguments:
===============
X1, X2, X3 ... : SAISIR matrices with the same number of rows

Output argument:
===============
X : SAISIR matrix

The identifiers of rows are copied into X.i
The identifiers of columns are the concatenation of X1.v, X2.v, X3.v ...

Example:
total=appendcol1(chemistry1, chemistry2, chemistry3);

see also: appendrow, appendrow1, appendcol

appendrow

appendrow - merges two SAISIR matrices according to rows

usage: [X3]= appendrow(X1,X2)

Input arguments:
================
X1, X2 : SAISIR matrices of dimensions n1 x p and n2 x p

Output argument:
==============
X3 : SAISIR matrix of dimensions (n1+n2) x p

The identifiers of columns are copied into X3.v
The identifiers of rows are the concatenation of X1.i and X2.i

Example:
total=appendrow(spectra1, spectra2);

see also: appendcol, appendrow1, appendcol1

appendrow1

appendrow1 - Merges an arbitrary number of files according to rows

usage: [X]= appendrow1(X1,X2,X3,...)

Input arguments:
===============
X1, X2, X3 ... : SAISIR matrices with the same number of columns

Output argument:
===============
X : SAISIR matrix with p columns

The identifiers of columns are copied into X.v
The identifiers of rows are the concatenation of X1.i, X2.i, X3.i ...

Example:
total=appendrow1(spectra1, spectra2, spectra3);

see also: appendcol, appendrow, appendcol1

applyfda1

applyfda1 - application of factorial discriminant analysis on PCA scores

function[res]=applyfda1(X,fdatype,(actual_group))

Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
fdatype : structure, output of function "fda1"
actual_group (optional)(n x 1): SAISIR vector indicating the membership of
the observations. Identical numbers indicate that the corresponding
observations belong to the same group.

Output argument:
===============
res with fields
datafactor : discriminant scores (n rows)
predicted_group: number of the group predicted for each observation
If actual_group is given, the field "confusion" gives the confusion matrix
(rows: actual groups; columns: groups predicted by "fda1")

Typical example:
================
(calibration data in "calibration_data", group in "qualitative_group",
Unknown data in "unknown_data");
%Building the model
p=pca(calibration_data);
fdatype=fda1(p, qualitative_group, 10, 5)
%Applying the model
res=applyfda1(unknown_data,fdatype);
See also : fda1, maha3, maha6, plsda, pca

applylr1

applylr1 - applies a basic latent root model to SAISIR data X

function [predy]=applylr1(lrtype,X)

Input arguments:
================
lrtype: structure, output of function "lr1" (predictive model)
X : SAISIR matrix of predictive variables

Output argument:
===============
predy : SAISIR matrix of predicted y (for all the models requested when using
"lr1")

Creates as many predicted y as allowed by the dimensions in lrtype.

applypca

applypca - computes the scores of supplementary observations

function [supscores]=applypca(pcatype, X)

Assesses the scores of supplementary observations.

Input arguments:
===============
pcatype: (structure) output argument of functions "pca","normed_pca", or
"covariance_pca"
X : SAISIR matrix of supplementary observations

Output argument:
===============
supscores : SAISIR matrix of scores of X

Typical example
===============
p=pca(spectra);%% PCA
supscores=applypca(p,supplementary_spectra);
map(supscores,1,2);

The number of columns of X must be compatible with pcatype.
If "normed_pca" was applied, X is divided by the standard deviations of
the principal observations prior to projection
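
Projecting supplementary observations amounts to centering them with the mean of the calibration set and multiplying by the eigenvectors. The sketch below shows this principle with a singular value decomposition in base MATLAB. It is an assumption-laden illustration: the variable names are invented, the fields of the actual "pcatype" structure are not used, and no scaling by standard deviations is applied.

```matlab
% Calibration set: 4 observations, 2 variables.
Xcal = [2 0; 0 2; -2 0; 0 -2] + repmat([10 20], 4, 1);
n = size(Xcal, 1);

% PCA of the centered calibration data through the SVD.
mu = mean(Xcal, 1);
Xc = Xcal - repmat(mu, n, 1);
[U, S, V] = svd(Xc, 'econ');      % columns of V are the eigenvectors
scores_cal = Xc * V;              % scores of the calibration observations

% Supplementary observations: same centering, same eigenvectors.
Xsup = [11 21];
supscores = (Xsup - repmat(mu, size(Xsup, 1), 1)) * V;

disp(supscores)
```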

applypcr

applypcr - applies basic PCR on data

function [predy]=applypcr(pcrtype,X)

Applies a basic PCR model (in pcrtype) to the SAISIR data X.
Creates as many predicted y as allowed by the dimensions in pcrtype.

Input arguments:
===============
pcrtype: structure, output of function "pcr"
X: SAISIR matrix of predictive variables

Output argument:
===============
predy : SAISIR matrix of predicted y, for all the dimensions tested

see also: "pcr", "pcr1", "basic_pls"

applypls

applypls - applies a pls model on an unknown data set

function res=applypls(X,plsmodel, (knowny))

Input arguments:
---------------
X : SAISIR matrix (n x p)
plsmodel :output argument of functions "saisirpls", "basic_pls" or "basic_pls2"
knowny (optional): actual value of y (if it is known, this allows the
computation of r2 and RMSEV)

Output arguments:
---------------
res with fields
PREDY: predicted values for all the models given by "plsmodel" (n x
ndim)
RMSEV: root mean square error of validation for all the models (only if
"knowny" is given) (1 x ndim)
r2: determination coefficient between predicted and observed y values
(only if "knowny" is given) (1 x ndim)
T: PLS scores of the set (n x ndim)
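
When the actual y values are known, the two validation statistics can be computed per model from the matrix of predictions. The following sketch uses invented numbers and assumes the usual definitions: RMSEV as the square root of the mean squared error, and r2 as the squared correlation between observed and predicted values. Whether "applypls" uses exactly these conventions is not stated here.

```matlab
% Known y (n x 1) and predictions from ndim = 2 models (n x ndim).
knowny = [1; 2; 3; 4];
PREDY = [1.5 1.1; 2.5 2.0; 2.5 2.9; 3.5 4.0];

[n, ndim] = size(PREDY);

% Root mean square error of validation, one value per model.
err = PREDY - repmat(knowny, 1, ndim);
RMSEV = sqrt(mean(err .^ 2, 1));

% Squared correlation between observed and predicted y, one value per model.
r2 = zeros(1, ndim);
dy = knowny - mean(knowny);
for k = 1:ndim
    dp = PREDY(:, k) - mean(PREDY(:, k));
    r2(k) = (sum(dy .* dp)) ^ 2 / (sum(dy .^ 2) * sum(dp .^ 2));
end

disp([RMSEV; r2])
```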

applyplsda

applyplsda - Applies pls discriminant analysis after model assessment using plsda

function[res]=applyplsda(X,plsdatype,(actual_group))

Returns the predicted group of the unknown data X.

Input arguments:
===============
X : SAISIR matrix of predictive variables
plsdatype: structure returned by function 'plsda'
actual_group (optional): SAISIR vector of observed groups.
Observations with the same group number belong to the same group.

Output argument:
==============
res with fields:
confusion1: matrix of confusion, method 1 (if "actual_group"
defined)
ncorrect1: Number of correct classifications, method 1 (if "actual_group"
defined)
predgroup1: predicted group (method 1)
confusion: matrix of confusion, method 0 (if "actual_group"
defined)
ncorrect: Number of correct classifications, method 0 (if "actual_group"
defined)
predgroup: predicted group (method 0)

Method 1: (attribution to index of max of predicted Y)
Method 0: (shortest Mahalanobis distance calculated on PLS scores);
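
Method 1 and the associated confusion matrix can be sketched as follows in base MATLAB. The matrix "Ypred" of predicted indicator values and the group numbers are invented, and the sketch assumes that groups are numbered 1 to ngroup; it is not taken from the code of "applyplsda".

```matlab
% Predicted indicator values: one row per observation, one column per group.
Ypred = [0.9 0.1 0.0; 0.2 0.7 0.1; 0.1 0.3 0.6; 0.4 0.5 0.1];
actual_group = [1; 2; 3; 1];

% Method 1: each observation is attributed to the column with the largest value.
[~, predgroup1] = max(Ypred, [], 2);

% Confusion matrix: rows are actual groups, columns are predicted groups.
ngroup = size(Ypred, 2);
confusion1 = zeros(ngroup);
for i = 1:numel(actual_group)
    r = actual_group(i);
    c = predgroup1(i);
    confusion1(r, c) = confusion1(r, c) + 1;
end

ncorrect1 = sum(diag(confusion1));   % correctly classified observations
disp(confusion1)
```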

applyspcr

applyspcr - Applies a stepwise PCR

function [predy]=applyspcr(spcrtype,X)

Input arguments:
===============
spcrtype: (structure), output argument of spcr
X: SAISIR matrix of predictive variables

Output argument:
===============
predy: y predicted for all the tested dimensions

apply_multiple_regression

apply_multiple_regression - applies multiple_regression on "unknown" data

function[res]=apply_multiple_regression(X,type,(y))

Input arguments:
==============
X: SAISIR matrix of predictive variables
type: output argument of function "multiple_regression"
y (optional): SAISIR vector of known observed y

Output argument:
===============
res with fields:
ypred: predicted y
if input argument "y" given :
r2: r2 value
RMSEV: root mean square error of validation

apply_nuee

apply_nuee - applies "nuée dynamique" (K-means)

function[res]=apply_nuee(X,barycenter)

X : a matrix of dimension n x p
barycenter : a matrix (k x p) defining the barycenters of the k groups.
It can be the field "center" of the output of function "nuee"

apply_quaddis

apply_quaddis - Quadratic discriminant analysis

function result = apply_quaddis(quaddis_type,x,(known_group));

Application of quadratic discriminant analysis (test)
====================================================
Input arguments:
================
quaddis_type : output of function "quaddis" (structure)
x : predictive data matrix (n x p)
known_group : true groups of observations in x (n x 1) (optional)

Output arguments:
================
result with fields:
predgroup : predicted groups (n x 1)
density : pseudo-density (n x gmax)
proba : probability of belonging to a given group (n x gmax)
if "known_group" defined
nscorrect100 : percentage of correct classification (number)
sconfus : confusion matrix (gmax x gmax)

apply_ridge_regression

apply_ridge_regression - applies ridge regression on "unknown data"

function[res]=apply_ridge_regression(ridgetype,X,(y))

Input arguments:
===============
ridgetype (structure): output argument of ridge_regression
X: SAISIR matrix of predictive variables
y (optional): SAISIR vector of observed y

Output arguments:
===============
res with fields
predy: predicted values for all the ridge predictive models
(see function "ridge_regression")
If input argument "y" is given:
r2: values for all the ridge predictive models
rmsecv: root mean square error of validation for all the predictive models

apply_stepwise_regression

apply_stepwise_regression - applies stepwise_regression on "unknown" data

function[res]=apply_stepwise_regression(stepwise_type,X,(y))

Input arguments:
================
stepwise_type: array of cells obtained as output of stepwise_regression
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of observed variable (n x 1)

Output argument
===============
res with fields
predy : predicted y for all the tested models
rmsev : Root mean square error of validation (if input argument "y"
defined)
r2 : determination coefficients between observed and predicted y

Applies as many models as are available in "stepwise_type"

bag2group

bag2group - uses the identifiers in bag to create groups

function [group_type]=bag2group(bag)

Input argument
=============
bag: "bag" structure , output of function "excel2bag"

Output argument
=============
group_type : array of cells of SAISIR structures,
such that group_type{i} contains the SAISIR structure of groups defined
by the corresponding column i in "bag".
Creates as many groups as there are different strings in the column of bag.d

Useful for discriminant analysis or correspondence analysis

bag_appendrow1

bag_appendrow1 - Merges an arbitrary number of bags according to rows

usage: [bag]= bag_appendrow1(bag1, bag2 ....)

Input arguments
==============
bag1, bag2, bag3 ... : structure "bag" as defined in function "excel2bag"

Output argument
=============
bag: concatenation of bag1, bag2, bag3 ..., with an increase in the number of
rows.
The second and third dimensions of bag.d must be equal in all the bags.

barycenter_map

barycenter_map - draws a map of barycenters

function barycenter_map(X,col1,col2,group,(charsize))

Input arguments:
==============
X: SAISIR matrix (n x p)
col1, col2 : indices of the variables plotted in X and Y (integers)
group: SAISIR vector of groups (n x 1). Indicates the group to which
each observation belongs
charsize (optional): size of the font (default: 7).

Displays two columns as a map.
Each observation is linked to the barycenter of its group by a straight line.

basic_pls

basic_pls - basic pls with keeping loadings and scores

function[res] = basic_pls(X,y,ndim)

Input arguments:
---------------
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of variables to be predicted
ndim: maximum number of dimension asked

Output arguments:
----------------
res with fields:
T: PLS scores (n x ndim)
P: PLS loadings such as X = TP + residuals (ndim x p)
beta: final regression coefficients with ndim dimensions (p x 1)
beta0: final intercept value (number)
meanx: mean of X (1 x p)
meany: mean of y (number)
predy: predicted y with ndim dimensions (n x 1)
error: Root mean square error of the model with ndim dimensions (number)
corcoef: correlation coefficient r between observed and predicted values
in the model with ndim dimensions
BETA: regression coefficients for the ndim models (p x ndim)
BETA0: intercepts of the ndim models (ndim x 1)
loadings: pls loadings such as T=X*loadings (with X centred) (p x ndim)
Q: Q value such as y = TQ + residual (ndim x 1)
PREDY: predicted y values up to ndim dimensions (n x ndim)
RMSEC: Root mean square error of prediction up to ndim dimensions (1 x
ndim)
r2: determination coefficient between observed and predicted values (1 x
ndim)
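
The relations between the fields listed above (X = TP + residuals, T = X*loadings with X centered, y = TQ + residual) can be checked on a small example. The sketch below is a textbook PLS1 algorithm with deflation, written in base MATLAB on invented data. It is offered as an assumption about the kind of computation involved, not as the code of "basic_pls", and the variable names only mirror the field names for readability.

```matlab
% Toy data: 6 observations, 2 predictive variables, one y.
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5.5];
y = [1; 2; 3; 5; 6; 8];
ndim = 2;

[n, p] = size(X);
meanx = mean(X, 1);
meany = mean(y);
E = X - repmat(meanx, n, 1);   % centered X, deflated at each step
f = y - meany;                 % centered y, deflated at each step

T = zeros(n, ndim);  W = zeros(p, ndim);
P = zeros(ndim, p);  Q = zeros(ndim, 1);
for a = 1:ndim
    w = E' * f;
    w = w / norm(w);                 % weight vector
    t = E * w;                       % score vector
    P(a, :) = (t' * E) / (t' * t);   % X loading (row), so that X ~ T*P
    Q(a) = (t' * f) / (t' * t);      % y loading, so that y ~ T*Q
    E = E - t * P(a, :);             % deflation of X
    f = f - t * Q(a);                % deflation of y
    T(:, a) = t;
    W(:, a) = w;
end

loadings = W / (P * W);          % p x ndim, such that T = Xcentered * loadings
beta = loadings * Q;             % regression coefficients (p x 1)
beta0 = meany - meanx * beta;    % intercept
predy = X * beta + beta0;        % predicted y with ndim dimensions
disp([y predy])
```

With ndim equal to the number of variables, as here, the coefficients coincide with those of ordinary least squares, which gives a simple way to check the sketch.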

basic_pls2

basic_pls2 - PLS2 on several variables, several dimensions

function result=basic_pls2(X,y,maxdim)

Input arguments:
===============
X: SAISIR matrix of predictive variables ( n x p)
y: SAISIR matrix of observed y (n x m)
maxdim (integer): maximum number of PLS dimensions

Output arguments:
===============
result with fields
T: PLS scores (n x maxdim)
loadings: PLS loadings (p x maxdim)
meanx: average of X (1 x p)
res: array of cells (maxdim elements)
A structure in res{i} with i = 1 ... maxdim contains the results of the
prediction of the variable i.

result.res{i} has the fields:
nom: (string) , name of the considered variable i
BETA (p x maxdim): regression coefficients up to maxdim dimensions
BETA0 (1 x maxdim): intercepts of the models
PREDY (n x maxdim): predicted y for 1 to maxdim dimensions
RMSEC (1 x maxdim): root mean square errors of calibration
r2 (1 x maxdim): r2 for 1 to maxdim dimensions

browse

browse - browses a series of curves

function browse(X,xstart)

Displays the rows of the SAISIR matrix X as curves.
Right button to go down, left button to go up, Ctrl C to exit.
If X.v can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.

build_documentation

function build_documentation(directory_name,filename,(thematic_list))

Input arguments:
================
directory_name : name of the directory (working with
the Matlab command "what")
filename: name of the resulting HTML file.
Extension ".htm" is added to this name
thematic_list (optional): output of function "thematic_classification"

The function builds the HTML documentation of SAISIR by concatenation of the
"Helps". If "thematic_list" is defined, it also gives a thematic list of functions.
The resulting HTML file is in "filename"

Typical example
===============
aux=what('saisir');%% functions in the directory "saisir";
function_name=char(aux.m);% list of function in SAISIR
build_documentation('saisir','SAISIR documentation');%% builds the HTML file

%To also have a thematic index in the HTML document, one must use the function
%"thematic_classification".
%For example
mylist=thematic_classification(function_name);
build_documentation('saisir','SAISIR_documentation',mylist);%% builds the HTML file
Then the resulting SAISIR_documentation.htm file can be examined with a web
browser such as Internet Explorer or Firefox. Internet Explorer is better here.

See also: "thematic_classification"

build_indicator

build_indicator - builds a disjunctive table

function [indicator, groupings]=build_indicator(x);

Each column of x must contain integer values.
Builds the complete table of indicators.
Useful for computing multiple correspondence analysis
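
A complete table of indicators contains, for each column of x, one 0/1 column per distinct value. The following base MATLAB sketch builds such a table from an invented matrix of integers; it shows the principle only, and the layout of the second output "groupings" of the actual function is not reproduced.

```matlab
% Two qualitative variables coded as integers (3 observations).
x = [1 2; 2 1; 1 3];

indicator = [];
for j = 1:size(x, 2)
    levels = unique(x(:, j));
    for k = 1:numel(levels)
        % One 0/1 column per level of variable j.
        indicator = [indicator, double(x(:, j) == levels(k))];
    end
end

disp(indicator)
```

Each row of the resulting table sums to the number of variables, since an observation belongs to exactly one level of each variable.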

ca

ca - CORRESPONDENCE ANALYSIS

function ca_type=ca(N);

Computes correspondence analysis from the contingency table in N.
If only groupings are available, the contingency table must be computed
before using this function (see for example function "contingency_table")

==============================================================
Fields of the output
score :CA scores of rows followed by CA scores of columns
eigenval :eigenvalues, percentage of inertia, cumulated percentage
contribution :contribution to the component rows, then columns
squared_cos :squared cosines of rows, then columns
khi2 :khi2 of the contingency table
df :degrees of freedom
probability :probability of random values in contingency table
==============================================================

The identifiers of rows of 'score' (which are the identifiers of rows and columns of N, N.i and N.v)
are preceded by the letter 'r' or 'c'.
It is therefore possible to use color for emphasizing rows and columns in
the simultaneous biplot of rows and columns

Source : G. Saporta. Probabilités, analyse des données et statistiques.
Editions Technip, page 198 and following.
REMARK : use function "ca_map" to plot the biplot observation/variable

ca_map

ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels

ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

Input arguments:
================
X: SAISIR data matrix
col1, col2: rank of the columns to be displayed (normally scores obtained
from function "ca").
startpos, endpos: position of the string in the identifiers indicating the
color of the display

Biplot of two columns as a colored map, useful for correspondence analysis (from function "ca").
The coloration of the displayed descriptors depends on the arguments
"startpos" and "endpos". If one of these arguments is zero, a single (black) color is used.
Otherwise, the string name(startpos:endpos) is extracted from the names of the individuals.
Two observations for which these strings are different
are also colored differently.

THIS FUNCTION IS SPECIFIC TO CORRESPONDENCE ANALYSIS:
If the first letter of the identifier is either c (column) or r (row),
this letter is removed from the name.
The letter c produces an italic display. This allows a representation in
which the variables are in italic letters.

center

center - subtracts the average row from each row

function [X1 xmean] = center(X)

Input argument:
---------------
X : SAISIR matrix (n x p)
Output argument:
---------------
X1:SAISIR matrix (n x p) centered (the average of each column of X1 is
equal to 0)
xmean: SAISIR vector (1 x p) of the average row.

change_sign

change_sign - changes the sign of a component and of its associated eigenvector

function [pcatype1] = change_sign(pcatype,ncomp)

Input arguments:
---------------
pcatype : output argument of function "pca"
ncomp : rank of the component whose sign is changed

Output argument:
---------------
pcatype1: new pca structure

The function is useful when several PCAs have been computed.
For the sake of clarity, it may be useful to have the axis oriented in (about)
the same directions. In this way, the graphical representations may be
easier to interpret.

see also: function "smart_coord"

check_name

check_name - Checks whether some strings are strictly identical in a string array

function [detected,names]=check_name(string)

Input argument:
--------------
"string": a matrix of characters

Output arguments:
--------------
detected: vector giving the indices of the observations having the same
name.
names: matrix of characters giving the identical names found

This function is mainly used in relationship with function "reorder"

colored_curves

colored_curves - displays curves colored according to groups

function h=colored_curves(X,group)

Input argument:
--------------
X: SAISIR matrix (n x p)
group: SAISIR vector of groups (integers, n x 1).

The function displays all the observations as curves.
Each curve is colored according to the values in "group". The observations
with the same group number are colored identically.

colored_map1

colored_map1 - colored map : using a portion of the identifiers as labels

colored_map1(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

Biplot of two columns as colored map

Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: position in the identifier strings of rows ('.i') for
the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
margin (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
For the French users: there is a synonym function "carte_couleur1".
Use preferably "colored_map1"
The coloration of the displayed descriptors depends on the arguments
startpos and endpos.
From the names of the individuals, the string name(startpos:endpos) is extracted. Two observations
for which these strings are different are also colored differently.
example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'colored_map1(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.
The characters are extracted from positions 1 to 5, that is, the strings 'wheat', 'barle',
'ricex'. A different color will be given to each of these strings.
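
The way a portion of the identifiers defines color groups can be sketched with base MATLAB functions. The variable names "keys", "labels" and "groupnum" are illustrative; the sketch only shows how identical substrings lead to identical group numbers, not how the function assigns the actual colors.

```matlab
% Row identifiers, as in the example above.
ids = char('wheat1', 'barle2', 'ricex1', 'wheat2', 'barle3');
startpos = 1;
endpos = 5;

% Extract the substring used for the coloration.
keys = ids(:, startpos:endpos);

% Identical substrings receive the same group number.
[labels, ~, groupnum] = unique(cellstr(keys));

disp(labels')
disp(groupnum(:)')
```

Here three distinct strings are found, so three colors would be needed.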

See also colored_map2 (same principle but with the whole identifier name
displayed)

colored_map2

colored_map2 - colored map : coloration from a portion of the identifiers, whole identifiers displayed as labels

colored_map2(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

Biplot of two columns as colored map

Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: position in the identifier strings of rows ('.i') for
the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
margin (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
For the French users: there is a synonym function "carte_couleur2".
Use preferably "colored_map2"
The coloration of the displayed descriptors depends on the arguments
startpos and endpos.
From the names of the individuals, the string name(startpos:endpos) is extracted. Two observations
for which these strings are different are also colored differently.
example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'colored_map2(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.
The characters are extracted from positions 1 to 5, that is, the strings 'wheat', 'barle',
'ricex'. A different color will be given to each of these strings.

See also colored_map1 (same principle but with only the portion of
the identifier names, from startpos to endpos, displayed)

colored_map4

colored_map4 - colored map according to 2 criteria

colored_map4(X,col1,col2,color_choice,(symbol_choice),(charsize))

Input arguments
--------------
X: SAISIR matrix of data (n x p)
col1, col2 : rank of the columns to be represented
color_choice: first criterion of choice, dealing with the colors
symbol_choice (optional): second criterion of choice, dealing with the
symbols

Biplot of two columns as colored map.
The coloration of the displayed descriptors depends on the argument
"color_choice" (either matrix of char, vector of numbers or SAISIR structure),
with a number of elements equal to the number of rows in X.
If the elements of "color_choice" are different, they are also colored differently.
If "symbol_choice" is also defined (either matrix of char, vector of numbers
or SAISIR structure), different symbols are used. The color of the point
is then given by "color_choice", and the shape of the symbol depends on
"symbol_choice".

Example of use:
Colored text with color determined by the second character in wheat.i
colored_map4(wheat,1,50,wheat.i(:,2));

colored_map4(wheat,1,50,wheat.i(:,2),wheat.i(:,3));
color: determined by the second character in wheat.i
colored symbol: shape determined by the third character in wheat.i

comdim

comdim - Finding common dimensions in multitable data (saisir format)

function[res]=comdim(collection,(ndim),(threshold))

Input arguments:
---------------
collection: vector of SAISIR files (the numbers "nrow" of rows in each table must be equal).
ndim (optional): number of common dimensions
threshold (optional): if the "difference of fit" is lower than threshold, the
iterative loop stops

Output arguments:
-----------------
res with fields:
Q : observations scores (nrow x ndim)
explained : 1 x ndim, percentage explanation given by each dimension
saliences : weights of the original tables in each
dimension (ntable x ndim).

Method published by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H. MacFie
in Food quality and Preference 11 (2000) 151-154

Typical example (suppose 3 SAISIR matrices "spectra1", "spectra2", "spectra3"):
collection(1)=spectra1; collection(2)=spectra2; collection(3)=spectra3;
myresult=comdim(collection);
map(myresult.Q,1,2); %% looking at the compromise scores
figure;
map(myresult.saliences,1,2); %% looking at the weights


contingency_khi2

contingency_khi2 - computes khi2 stats on a contingency table

function res=contingency_khi2(table);

Input argument:
--------------
table: SAISIR matrix of contingency table (n x p)

output argument
--------------
res with fields:
theo: theoretical contingency table assuming independence of rows and
columns in "table"
khi2: khi2 value
dll: degrees of freedom of the model
P: probability of the null hypothesis ("independence of rows and columns")

Each element of the input argument "table" gives the number of observations
that belong both to the group of the corresponding row and to the group of the
corresponding column. For example, table.d(2,4) indicates the number of
observations which are both in group 2 of the rows and in group 4 of the
columns.
The contingency table can be created with the function "contingency_table".
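
The statistics listed above can be sketched in plain MATLAB as follows. This is only an illustration under stated assumptions: "counts" is a hypothetical plain matrix standing for table.d, the variable names mirror the output fields, and the code is not the SAISIR implementation of "contingency_khi2".

```matlab
% Hypothetical 2 x 3 contingency table (plain matrix, i.e. what table.d would hold)
counts = [10 20 30; 20 20 20];

n    = sum(counts(:));                          % total number of observations
theo = sum(counts, 2) * sum(counts, 1) / n;     % expected counts under independence
khi2 = sum((counts(:) - theo(:)).^2 ./ theo(:));
dll  = (size(counts, 1) - 1) * (size(counts, 2) - 1);   % degrees of freedom

% Upper tail of the chi-square distribution, written with the regularized
% incomplete gamma function so that no toolbox is required
P = gammainc(khi2 / 2, dll / 2, 'upper');
```

A small P suggests that rows and columns are not independent.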


contingency_table

contingency_table - Computes a contingency table

function table=contingency_table(g1,g2)

Input arguments
---------------
g1 and g2: SAISIR vectors (n x 1) of groups (possibly computed with "create_group1").
In these vectors, identical numbers indicate that the observations belong to the same group.

Output argument
---------------
table : contingency table (ngroup1 x ngroup2). A value table.d(i,j) indicates the
number of observations belonging both to group i in g1 and to group j in g2.
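
The counting described above can be illustrated with base MATLAB. This is a sketch only: g1 and g2 are hypothetical plain vectors of group numbers (what the .d field of the SAISIR vectors would hold), and "accumarray" is used here for convenience; it is not claimed to be what "contingency_table" does internally.

```matlab
% Hypothetical group numbers of 6 observations according to two criteria
g1 = [1 1 2 2 2 3]';
g2 = [1 2 1 1 2 2]';

ngroup1 = max(g1);
ngroup2 = max(g2);

% counts(i,j) = number of observations in group i of g1 and group j of g2
counts = accumarray([g1 g2], 1, [ngroup1 ngroup2]);
```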

see also "contingency_khi2", "ca" (correspondence analysis),
"build_indicator"


cormap

cormap - Correlation between two tables

function [cor] = cormap(X1,X2)

Input arguments
--------------
X1 and X2: SAISIR matrix dimensioned n x p1 and n x p2 respectively

Output argument:
---------------
cor: matrix of correlations (p1 x p2)
The tables must have the same number of rows.
An element cor.d(i,j) is the correlation coefficient between the column i of X1 and j of X2
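
As an illustration, the p1 x p2 matrix of correlation coefficients can be computed in base MATLAB as sketched below. X1 and X2 are hypothetical plain matrices (not SAISIR structures), and the code shows the definition rather than the actual body of "cormap".

```matlab
% Two hypothetical tables with the same number of rows (n = 4)
X1 = [1 2; 2 4; 3 7; 4 8];                  % n x p1
X2 = [2 1 0; 4 0 1; 6 1 0; 8 0 1];          % n x p2

% Center each column, then scale it to unit norm
Z1 = bsxfun(@minus, X1, mean(X1, 1));
Z2 = bsxfun(@minus, X2, mean(X2, 1));
Z1 = bsxfun(@rdivide, Z1, sqrt(sum(Z1.^2, 1)));
Z2 = bsxfun(@rdivide, Z2, sqrt(sum(Z2.^2, 1)));

% cor(i,j) is the correlation between column i of X1 and column j of X2
cor = Z1' * Z2;                             % p1 x p2
```

Skipping the scaling step and dividing Z1' * Z2 by n - 1 gives the usual sample covariances instead of the correlations.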


correct_baseline

correct_baseline - simple linear baseline correction, using intensity

function [saisir] = correct_baseline (saisir1,col1,col2)

The baseline is modelled by a straight line going from data points col1 to col2
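
One plausible reading of this correction is sketched below in base MATLAB. It is an assumption, not the documented behaviour: for each row, a straight line is drawn through the intensities at data points col1 and col2 and subtracted from the whole row. "spectra" is a hypothetical plain matrix standing for saisir1.d.

```matlab
% Two hypothetical spectra (rows) with 5 data points each
spectra = [1 2 4 3 3; 0 1 3 5 4];
col1 = 1;
col2 = 5;

p = size(spectra, 2);

% Slope of the line joining the intensities at col1 and col2 (one slope per row)
slope = (spectra(:, col2) - spectra(:, col1)) / (col2 - col1);

% Baseline evaluated at every data point, then removed
baseline  = bsxfun(@plus, spectra(:, col1), slope * ((1:p) - col1));
corrected = spectra - baseline;
```

With this reading, the corrected intensities are zero at col1 and col2.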


correlation_circle

correlation_circle - Displays the correlation circle after PCA

function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))

Input arguments
---------------
pcatype: output argument of function "pca"
X: original data matrix (n x p)
col1 and col2: ranks of the PC-scores to be represented.
startpos and endpos (optional): key in the variable identifiers for coloring the variables

Output argument
--------------
res: matrix (p x number of scores) of the correlations between the variables and all
the available PC-scores

The function draws the correlation circle.

%typical example
%Let "chemistry" be a SAISIR matrix
mypca=pca(chemistry);
correlation_circle(mypca,chemistry,1,2); %correlation circle of plane #1-#2

%Note: the function "correlation_plot" should be preferred


correlation_plot

correlation_plot - Draws a correlation plot between scores and tables

function handle=correlation_plot(scores,col1,col2, X1,X2, ...);

Input arguments:
==============
scores: - ORTHOGONAL scores obtained by multidimensional analysis
col1 and col2: - Indices (ranks) of the scores to be plotted (integer numbers)
X1, X2, ... - Arbitrary number of tables giving the variables to be plotted
The number of rows in the scores and in the other tables must be identical.

The function displays the correlation circle, with a different color for
each table X1, X2 ...
A dotted line gives the level of 50% explained variance.
If the columns of the input argument "scores" are not orthogonal, the graph is generally incorrect
and a warning message is displayed.


covariance_pca

covariance_pca - principal component analysis when knowing the covariance (of variables)

function[pcatype]=covariance_pca(covariance_type,(nscore))

Input arguments:
---------------
covariance_type: output argument of function "cumulate_covariance"
nscore : (integer, optional) number of scores to be calculated (default :
all)

Output argument:
----------------
pcatype with fields:
eigenvec: eigenvectors
eigenval: eigenvalues
average: average value of the active observations

Performs PCA on the covariance matrix as calculated by "cumulate_covariance".
This function is useful for carrying out PCA on huge data sets (see
"cumulate_covariance" for an example of use).

%typical example:
A complete script must therefore be something like
cov=cumulate_covariance(spectra_1);%% starting
cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2
cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3
...
cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k
covariance_type=cumulate_covariance([],cov);%% finishing
[pcatype]=covariance_pca(covariance_type,(ncomponent));
score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1

This function is mainly used to compute PCA on huge data sets that cannot
be loaded completely into the free memory, and thus must be split into smaller subsets of observations.

Related function: "cumulate_covariance" (covariance of huge data sets)


covmap

covmap - assesses the covariances between two tables

function [cov] = covmap(X1,X2)

Input arguments
--------------
X1 and X2: SAISIR matrix dimensioned n x p1 and n x p2 respectively

Output argument:
---------------
cov: matrix of covariances (p1 x p2)
The tables must have the same number of rows.
An element cov.d(i,j) is the covariance between the column i of X1 and j of X2


create_group

create_group - creates a vector of numbers indicating groups from identifiers

group=create_group(X,code_list,startpos,endpos)

Input arguments:
---------------
X: SAISIR matrix (n x p)
code_list:matrix of character (k x q)
startpos, endpos: place in the identifier names where to find the code

Output argument:
---------------
group: SAISIR vector (n x 1) of groups. Identical numbers indicate that the observations belong to the same group.
Normally, k groups are identified.

typical use:
group=create_group(X,['A1';'B2';'C1'],3,4);
The command seeks in X.i the codes 'A1', 'B2', 'C1' in positions 3 to 4.
Observations with code 'A1', 'B2', 'C1' are placed in groups numbered 1, 2
and 3 respectively.

see also "create_group1"
Group structures are used in discriminant analysis, anova and related
methods.
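
The matching of codes against identifiers can be sketched in base MATLAB as below. The identifiers in "ids" are hypothetical (they stand for the rows of X.i), and the loop only illustrates the principle; it is not the SAISIR source. The last lines show the variant without a code list, where every distinct substring defines a group, in the spirit of "create_group1".

```matlab
% Hypothetical identifiers, one row per observation (all of the same length)
ids = ['xxA1_01'; 'xxB2_01'; 'xxC1_01'; 'xxA1_02'; 'xxC1_02'];
code_list = ['A1'; 'B2'; 'C1'];
startpos = 3;
endpos   = 4;

keys  = ids(:, startpos:endpos);         % portion of the identifiers holding the code
group = zeros(size(ids, 1), 1);          % 0 = no code found
for k = 1:size(code_list, 1)
    hit = all(bsxfun(@eq, keys, code_list(k, :)), 2);
    group(hit) = k;                      % observations with code k go to group k
end

% Variant without code_list: one group per distinct substring
[labels, ~, group1] = unique(cellstr(keys));
group1 = group1(:);
```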


create_group1

create_group1 - uses the identifiers to create groups

function saisir=create_group1(s,startpos,endpos)

Uses the identifiers to create groups: creates as many groups as there are
different strings between positions startpos and endpos of the identifiers.
s: SAISIR file
startpos and endpos: positions of the discriminating characters


crossfda1

crossfda1 - validation on discrimination according to fda1 (directly on data)

function[res]=crossfda1(X,group,among,maxvar,ntest)

input arguments
===============
X: Saisir data matrix (n x p)
group: Saisir vector of group (integer, n x 1). Two observations belonging
to the same group have the same group number.
among: maximum rank of the PC scores entered in the model (see "fda1" for
details)
maxvar: (integer) maximal number of scores entered in the model
ntest: (integer) number of observations in each group in the validation
set

output arguments:
===============
res with fields
fdatype: fdatype built up with the calibration set (see function "fda1")
verification:
with fields
datafactor: discriminant scores of the validation set
predicted_group: in the validation set
confusion: confusion matrix of the validation set (rows: actual group;
columns: predicted group)

The function applies "fda1" after dividing the sample in X into a calibration set and a validation set.
In each group, "ntest" observations are randomly drawn (without repetition) and placed in the validation set.

%typical example:
g=create_group1(wheat,2,3);%% creation of the vector of group number
res=crossfda1(wheat,g,10,5,5);
res.validation.confusion;%% confusion matrix in validation
%colored_map1(res.validation.datafactor,1,2,2,3); %% looking at the
%discriminant biplot of the validation set
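
The random division described above (ntest observations drawn without repetition in each group) can be sketched in base MATLAB as follows. "group" is a hypothetical plain vector of group numbers, and the resulting 0/1 vector follows the convention used for "selected" elsewhere in this documentation (0 = calibration, 1 = validation). This is an illustration, not the code of "crossfda1".

```matlab
% Hypothetical group numbers of 12 observations (3 groups)
group = [1 1 1 1 2 2 2 2 2 3 3 3]';
ntest = 2;                                  % validation observations per group

selected = zeros(size(group));              % 0 = calibration set
labels = unique(group);
for k = 1:numel(labels)
    members = find(group == labels(k));
    shuffled = members(randperm(numel(members)));   % random order, no repetition
    selected(shuffled(1:ntest)) = 1;                % 1 = validation set
end
```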


crossmaha1

crossmaha - validation on discrimination according to maha1 (directly on data)

function[discrtype]=crossmaha(X,group,maxvar,ntest)

Input arguments
==============
X: SAISIR matrix (n x p)
group:SAISIR vector of group numbers (n1 x 1)
maxvar:(integer) maximum number of variables introduced in the model
ntest (integer): number of observations in each group in the validation
set

Output argument:
===============
discrtype with fields:
step with fields
index: vector giving the rank of the variable introduced at each step
(maxvar values)
correct: vector giving the number of correct classifications in the
calibration set (maxvar values)
name: identifiers of the variables introduced in the models (matrix of
char with maxvar rows)
ntestcorrect: vector giving the number of correct classifications in the
validation set (maxvar values)
classed: SAISIR vector of the predicted group numbers of the calibration
set (n1 x 1). Only the result of the final step is given
testclassed: SAISIR vector of the predicted group numbers of the test set (n2 x 1).
Only the result of the final step is given
selected: vector indicating the observations in the calibration (0) and in
the validation (1) set
confusion with fields
cal: confusion matrix of the calibration set
val: confusion matrix of the validation set

Applies maha1 after dividing the sample in X into a calibration set and a validation set.
In each group, ntest observations are randomly drawn (without repetition) and placed in the validation set.

see also function "maha1"


crossplsda

crossplsda - validation on PLS discriminant analysis

function[res]=crossplsda(X,group,dim,selected)

Input arguments:
===============
X:SAISIR matrix of predictive data (n x p)
group: SAISIR vector of group numbers (n x 1). Identical numbers indicate that
the observations belong to the same group
dim: maximum number of PLS dimensions
selected:matlab VECTOR with 0= calibration sample, 1= verification sample

Output argument:
===============
res with fields:
confusion1: confusion matrix of the calibration set , method 1
ncorrect1: number of correctly classified obs. in calibration, method1
nscorrect1: number of correctly classified obs. in validation, method1
sconfusion1: confusion matrix of the validation set , method 1
confusion: confusion matrix of the calibration set , method 0
ncorrect: number of correctly classified obs. in calibration, method 0
nscorrect: number of correctly classified obs. in validation, method 0
sconfusion: confusion matrix of the validation set, method 0
info: ' no index: max of predicted Y; 1: mahalanobis distance on latent variables t'

The function divides the data collection into a calibration set
(selected=0) and a validation set (selected = 1)
The function "plsda" is applied on the calibration set, and tested on the validation set.

Two strategies for attributing a group to each observation are tested:
Method 0 (no index): the observations are classified in the group for
which the predicted indicator variable is the highest.
Method 1: (preferable) linear discrimination on the PLS scores

see also:"plsda", "applyplsda"


crossvalpls

crossvalpls - validation of pls with up to ndim dimensions.

function [res]=crossvalpls(X,y,ndim,selected)

crossvalpls1a

crossvalpls1a - crossvalidation of pls with up to ndim dimensions.

function [res]=crossvalpls1a(X,y,ndim,selected)

Input args:
===========
X: predictive data n x p
y: Variable to be predicted n x 1
ndim: maximal number of dimensions in the PLS model
selected: Matlab vector n x 1 (0 = obs in calibration; 1 = obs in validation)

Output args
============
res with fields
---calibration: calibration results with fields
res with fields:
T: PLS scores (n x ndim)
P: PLS loadings such as X = TP + residuals (ndim x p)
beta: final regression coefficients with ndim dimensions (p x 1)
beta0: final intercept value (number)
meanx: mean of X (1 x p)
meany: mean of y (number)
predy: predicted y with ndim dimensions (n x 1)
error: Root mean square error of the model with ndim dimensions (number)
corcoef: correlation coefficient r between observed and predicted values
in the model with ndim dimensions
BETA: regression coefficients for the ndim models (p x ndim)
BETA0: intercepts of the ndim models (ndim x 1)
loadings: pls loadings such as T=X*loadings (with X centred) (p x ndim)
Q: Q value such as y = TQ + residual (ndim x 1)
PREDY: predicted y values up to ndim dimensions (n x ndim)
RMSEC: Root mean square error of calibration up to ndim dimensions (1 x
ndim)
r2: determination coefficient between observed and predicted values (1 x
ndim)
---validation : validation results with fields
PREDY: predicted y in validation (n x ndim)
RMSEV: Root mean square error of validation (1 x ndim )
r2: determination coefficient (yobs/ypred) (1 x ndim)
T : PLS scores in validation
OBSY: observed y in validation (number of rows=number of obs in
validation)

See also: "crossvalpls"
Note:
crossvalpls1a is slower than crossvalpls but gives more information (for
example the PLS scores in validation)


crossval_multiple_regression

crossval_multiple_regression - validation of multiple regression.

function[res]=crossval_multiple_regression(X,y,selected)

Input arguments:
================
X: SAISIR matrix of predictive data (n x p)
y: SAISIR vector of the variable to be predicted ( n x 1)
selected: MATLAB vector ( n x 1) giving the samples placed in the
verification set: 1= in verification; 0 = in calibration set

Output arguments:
================
res with fields:
calibration with fields
ypred: predicted y (n x 1)
beta0: intercept of the regression
beta: regression coefficients (p x 1)
r2: r2 between observed and predicted values in calibration
RMSEC: Root mean square error of calibration
validation with fields
ypred: predicted values in validation
r2: determination coefficient between obs and predicted values in
validation
RMSEV: Root mean square error of validation

Divides the data into a calibration and a validation set.
Multiple linear regression is established on the calibration set
and validated on the validation set.
The division calibration/validation is determined by the vector "selected"
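
The calibration/validation scheme can be sketched with ordinary least squares in base MATLAB. The data are hypothetical plain matrices, the names follow the output fields listed above, and the error formulas (root of the mean squared residual) are an assumption; the SAISIR function may normalise differently.

```matlab
% Hypothetical data: 8 observations, 2 predictive variables
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5; 7 8; 8 7];
y = 1 + 2*X(:,1) - X(:,2) + [0.1 -0.1 0.05 -0.05 0 0.1 -0.1 0]';
selected = [0 0 1 0 0 1 0 1]';              % 1 = validation, 0 = calibration

cal = (selected == 0);
val = (selected == 1);

% Model established on the calibration set (intercept + coefficients)
A     = [ones(sum(cal), 1) X(cal, :)];
coef  = A \ y(cal);
beta0 = coef(1);
beta  = coef(2:end);
ypred_cal = A * coef;
RMSEC = sqrt(mean((y(cal) - ypred_cal).^2));

% Model tested on the validation set
ypred_val = beta0 + X(val, :) * beta;
RMSEV = sqrt(mean((y(val) - ypred_val).^2));
c  = corrcoef(y(val), ypred_val);
r2 = c(1, 2)^2;                             % determination coefficient in validation
```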


crossval_quaddis

crossval_quaddis - crossvalidation of quadratic discriminant analysis

function res=crossval_quaddis(X,group,selected)

Input arguments:
================
X : matrix of predictive data (n x p)
group : vector of known groups (integers, n x 1)
selected : MATLAB vector (n x 1)
with elements 0: selected in the calibration set
1: selected in the validation set

Output argument
===============
res with fields:
calibration: structure quaddis_type as defined in function "quaddis"
validation : structure as defined in "apply_quaddis"

see also quaddis, apply_quaddis


cross_ridge_regression

cross_ridge_regression - ridge regression with validation

function[res]=cross_ridge_regression(X,y,krange,selected)

Input arguments:
---------------
X: Saisir matrix of the predictive data set (n x p)
y: Saisir vector of value to be predicted (n x 1)
krange: Matlab vector of double (k x 1) (see function "ridge_regression")
selected: Matlab vector whose elements are either equal to 0 (calibration) or 1 (validation)

Output arguments
----------------
res with fields
predy: predicted y in validation (n2 x k)
obsy: observed y in validation (n2 x k)
r2: r2 between observed and predicted y in validation (1 x k)
rmsecv: root mean square error of validation (1 x k)
ridgetype: see function "ridge_regression"

Divides a collection into a calibration set (selected = 0) and a verification set
(selected = 1).
Applies the ridge_regression models to the validation set.
All the models with the k parameter in "krange" are tested.


cumulate_covariance

cumulate_covariance - Covariance on huge data set

function [covariance_type]=cumulate_covariance((X),(covariance_type1));

This function is mainly useful for computing PCA on very large data sets

Input arguments:
---------------
X: SAISIR matrix X (n x p) or [];
covariance_type1 (optional): output argument of a previous call to "cumulate_covariance"

Output argument:
----------------
covariance_type: either intermediary results in cumulating covariance
or a structure containing covariance matrix (at completion)
At completion, "covariance_type" has fields:
covariance: matrix p x p of covariances
average: average value of all the observations (1 x p)
n: total number of observations involved in computing the covariance
matrix

If the second argument (covariance_type1) is undefined, the function initiates the calculation of the covariance.
If the two arguments are defined, it cumulates the covariance.
If the first argument is [], it finishes the work.

A complete script must therefore be something like
cov=cumulate_covariance(spectra_1);%% starting
cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2
cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3
...
cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k
covariance_type=cumulate_covariance([],cov);%% finishing
[pcatype]=covariance_pca(covariance_type,(ncomponent));
score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1

Related function: "covariance_pca" (PCA from covariance)
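
The principle of cumulating a covariance over blocks of observations can be sketched in base MATLAB as below. The blocks are hypothetical plain matrices, the division by n - 1 is an assumption (the SAISIR function may divide by n), and the eigen-decomposition at the end only illustrates what a PCA on the covariance matrix involves; none of this is the SAISIR source.

```matlab
% Three hypothetical blocks of observations sharing the same p = 3 variables
blocks = {[1 2 3; 4 5 7], [2 0 1; 3 3 3; 5 1 0], [0 1 4]};
p = 3;

% Starting: running totals
n  = 0;                 % number of observations seen so far
s  = zeros(1, p);       % sum of the observations
ss = zeros(p, p);       % sum of cross-products

% Cumulating: each block is visited once and can then be released
for k = 1:numel(blocks)
    B  = blocks{k};
    n  = n + size(B, 1);
    s  = s + sum(B, 1);
    ss = ss + B' * B;
end

% Finishing: average and covariance of all the observations
average    = s / n;
covariance = (ss - n * (average' * average)) / (n - 1);

% PCA from the covariance matrix: eigenvalues sorted in decreasing order
[vec, val] = eig((covariance + covariance') / 2);
[eigenval, order] = sort(diag(val), 'descend');
eigenvec = vec(:, order);

% Projecting one block on the principal axes
score1 = bsxfun(@minus, blocks{1}, average) * eigenvec;
```

Only the running totals n, s and ss need to stay in memory, which is what makes the approach suitable for data sets that cannot be loaded at once.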


curve

curve - represents a row of a matrix as a single curve

handle=curve(X,(nrow),(xlabel),(ylabel),(title))

Input arguments:
---------------
X : SAISIR matrix
nrow : index of the row to be shown
xlabel, ylabel (optional) : labels on X and Y
title (optional) : title of the graph.
This function draws the row (typically a spectrum) as a curve.
If X.v can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.
The function "courbe" is a synonym of this function.


curves

curves - represents several rows of a matrix as curves

usage: handle=curves(X,(range),(xlabel),(ylabel),(title))

Input arguments:
---------------
X : SAISIR matrix
range (optional): vector of integer values giving the indices of the
rows to be displayed (default : all rows displayed)
xlabel, ylabel (optional): labels in X and Y
title (optional): title of the graph.
If X.v can be interpreted as a vector of number (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables
Example:
curves(spectra,1:2:100,'wavenumber','log 1/R','Raman spectra');
plots the rows 1, 3, 5, ... 99 as curves.
The function "courbes" is a synonym of this function.


d2_factorial_map

d2_factorial_map - assesses a factorial map from a table of squared distance

function [ftype] = d2_factorial_map(X)

Input argument:
==============
X : Saisir matrix n x n of squared distances

Output argument:
===============
ftype with fields:
eigenval: eigenvalues
score: scores

%typical (demonstrative) example
%===============
xdist=distance(data,data);
xdist.d=xdist.d.*xdist.d ;%% warning !! squared distance needed
ftype=d2_factorial_map(xdist);
map(ftype.score,1,2);
p=pca(data);
figure;
map(p.score,1,2);%% identical to previous figure

Useful when only the distance matrix is available.
Uses the Torgerson approach to transform squared distances into pseudo
scalar products. Gives the factorial scores of the distances.
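
The Torgerson transformation mentioned above can be sketched in base MATLAB as follows. "data" is a hypothetical plain matrix used only to build a table of squared distances, and the scaling of the scores (eigenvectors multiplied by the square root of the eigenvalues) is an assumption; the scaling used by "d2_factorial_map" may differ.

```matlab
% Hypothetical observations, used here only to build squared distances
data = [0 0; 3 0; 0 4; 3 4; 1 2];
n  = size(data, 1);
sq = sum(data.^2, 2);
D2 = bsxfun(@plus, sq, sq') - 2 * (data * data');    % n x n squared distances

% Double centering: squared distances -> pseudo scalar products
J = eye(n) - ones(n) / n;
W = -0.5 * J * D2 * J;
W = (W + W') / 2;                                    % enforce exact symmetry

% Diagonalization, eigenvalues in decreasing order
[vec, val] = eig(W);
[eigenval, order] = sort(diag(val), 'descend');

% Keep the dimensions with a non-negligible positive eigenvalue
keep  = eigenval > 1e-10 * max(eigenval);
score = bsxfun(@times, vec(:, order(keep)), sqrt(eigenval(keep))');
```

The Euclidean distances between the rows of "score" reproduce the original distances, so the map is equivalent to a PCA map of the data up to the sign of the axes.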


deletecol

deletecol - deletes columns of saisir files

function [X1] = deletecol(X,index)

input arguments
===============
X:Saisir matrix (n x p)
index:vector indicating the columns to be deleted

Output argument
==============
X1:saisir matrix (n x q) with q <=p

The columns to be deleted are indicated by the vector index (numbers or booleans)

% Typical Examples
reduced=deletecol(data,[1 3 5]);%% deletes 3 columns
reduced1=deletecol(data,sum(data.d)==0); % deletes all the columns with
% the sum equal to 0

see also: deleterow, selectrow, selectcol
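
What deleting columns implies for the three parts of a SAISIR structure can be sketched as follows. The structure below is built by hand and is an assumption: field d holds the data, i the row identifiers and v the column identifiers, each stored as a matrix of char with one row per identifier. The code illustrates the principle and is not the source of "deletecol".

```matlab
% Hypothetical structure with fields d (data), i (row names), v (column names)
data.d = [1 0 3 0; 4 0 6 1];
data.i = ['obs1'; 'obs2'];
data.v = ['var1'; 'var2'; 'var3'; 'var4'];

index = (sum(data.d, 1) == 0);          % booleans: columns whose sum is 0

% Accept either booleans or column numbers
if islogical(index)
    drop = index(:)';
else
    drop = false(1, size(data.d, 2));
    drop(index) = true;
end

% The data and the column identifiers must be reduced together
reduced.d = data.d(:, ~drop);
reduced.i = data.i;
reduced.v = data.v(~drop, :);
```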


deleterow

deleterow - delete rows

function X1 = deleterow(X,index)

input arguments
===============
X:Saisir matrix (n x p)
index: vector indicating the rows to be deleted

Output argument
==============
X1:saisir matrix (n1 x p) with n1 <=n

The rows to be deleted are indicated by the vector index (numbers or booleans)

% Typical Examples
reduced=deleterow(data,[1 3 5]);%% deletes 3 rows
reduced1=deleterow(data,sum(data.d,2)==0); % deletes all the rows with
% the sum equal to 0

see also: selectrow, selectcol,deletecol


dendro

dendro - dendrogram using Euclidean metric and Ward linkage

function group=dendro(X,(topnodes))

Input arguments:
================
X : Saisir data matrix
topnodes (optional): level of cutting the dendrogram

Output argument
===============
group: groups number at the level of cutting defined by topnodes.

typical use
===========
g=dendro(data,30);

g will contain numbers ranging from 1 to 30 indicating the group number

The function attempts to display the identifiers on the dendrogram.
This works only with a small number of identifiers.


dimcrosspcr1

dimcrosspcr1 - validation of PCR (samples in validation are selected)

function [pcrres]=dimcrosspcr1(X,y,ndim,selected)

Input arguments:
=================
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of variable to be predicted (n x 1)
ndim: max number of PCR dimensions tested
selected: MATLAB vector of samples selected as calibration set (==0)
and verification set (==1)

Output arguments:
================
pcrres with fields
r2: determination coefficient between observed and predicted y (1 x ndim)
predy: predicted y for all the dimensions tested ( n x ndim)
rmsev: root mean square error of validation for all the dimensions tested (1 x ndim)
obsy: observed y in the validation set
The components are introduced in the order of the eigenvalues.

Remarks:
1) The vector "selected" can be built randomly using the function
"random_select"
2) The biplot of observed/predicted values can be displayed by the command
"xy_plot". For example,
"xy_plot(pcrres.obsy,1,pcrres.predy,3)"
will show the PCR model with 3 dimensions.


dimcross_stepwise_regression

dimcross_stepwise_regression - Tests models obtained from stepwise regression

function[res]=dimcross_stepwise_regression(X,y,selected,Pthres,(confidence))

Input arguments:
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of variable to be predicted (n x 1)
selected: MATLAB vector (n elements). 0 = in the calibration set;
1 = validation set
Pthres: probability threshold for entering or discarding a variable
confidence (optional): confidence interval for the correlation coefficient

Output argument:
res with fields
calibration (see "stepwise_regression")
validation (see "apply_stepwise_regression")
validation has fields
predy: predicted y in validation for all the regression models
rmsev: root mean square error of validation (idem)
r2: determination coefficient between observed and predicted y
observed_y: vector of observed y values in the validation set.

The function divides the data set X and y in a calibration set and a
validation set. The vector "selected" defines the division.
The stepwise regression models are established on the calibration set
and tested on the validation set. All the calculated models are tested,
which gives as many columns of predicted y as the number of models computed
by stepwise_regression.


distance

distance - Usual Euclidian distances

function [D] = distance(X1,X2)

Input arguments
===============
X1, X2: SAISIR matrices dimensioned n1 x p and n2 x p respectively

Output argument
==============
D: matrix n1 x n2 of Euclidian distances between the observations

the tables must have the same number of columns
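
For illustration, the n1 x n2 matrix of Euclidean distances can be obtained in base MATLAB as sketched below. X1 and X2 are hypothetical plain matrices, and the expansion of the squared norm is one common way of computing the distances; it is not claimed to be the way "distance" is written.

```matlab
% Hypothetical tables with the same number of columns (p = 2)
X1 = [0 0; 3 4];                 % n1 x p
X2 = [0 0; 6 8; 3 0];            % n2 x p

% ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a*b'
sq1 = sum(X1.^2, 2);
sq2 = sum(X2.^2, 2);
D2  = bsxfun(@plus, sq1, sq2') - 2 * (X1 * X2');

% Rounding errors can give tiny negative values, hence the max
D = sqrt(max(D2, 0));            % n1 x n2
```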


documentation_dico

documentation_dico - builds a dictionary for HTML doc

function documentation_dico(fid,function_name);

(no direct use)
This function builds up a part of the automatic documentation
This is a part of an HTML document as created by "build_documentation"


documentation_thematic_list

documentation_thematic_list - HTML thematic list of function

Input arguments
===============
fid : file identifier
directory_name: name of the directory (working with
the Matlab command "what")
function_name: list of names of SAISIR function
thematic_list: output of function "thematic_classification"

This function builds a part of HTML SAISIR documentation (function
"build_documentation").
This part corresponds to the thematic list of function
This function must be called by the function "build_documentation"


eigord

eigord - diagonalization of a square matrix

function [mtvec,mtval] = eigord(mtA)

This function is not directly called


eliminate_nan

eliminate_nan - suppresses "not a number" data in a saisir structure

function [X1] = eliminate_nan(X,(row_or_col))

Input argument:
==============
X:Saisir data matrix
row_or_col:
if row_or_col==0 tries to have the maximum number of rows
if row_or_col==1 tries to have the maximum number of columns

Output argument:
===============
X1:Saisir data matrix with no NaN values

From a given SAISIR file possibly containing NaN values (undetermined values),
creates a file containing only known values.
Only useful when very few rows or columns contain NaN values.
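
The two options can be illustrated on a plain matrix as below. This is the simplest reading of "row_or_col" and is an assumption: keeping the maximum number of rows is taken to mean dropping every column that contains a NaN, and keeping the maximum number of columns to mean dropping every row that contains a NaN. The actual function may use a finer strategy, and in a SAISIR structure the identifiers would have to be reduced in the same way as the data.

```matlab
% Hypothetical data with one missing value
d = [1 2 3; 4 NaN 6; 7 8 9; 1 1 1];
bad = isnan(d);

% Maximum number of rows: drop the columns containing NaN
keep_all_rows = d(:, ~any(bad, 1));     % 4 x 2

% Maximum number of columns: drop the rows containing NaN
keep_all_cols = d(~any(bad, 2), :);     % 3 x 3
```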


ellipse_map

ellipse_map - plots the ellipse confidence interval of groups

function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))

Input arguments:
================
- X: data matrix
- col1, col2: represented columns
- gr: qualitative groups (referred to by integer numbers)
- centroid_variability: either 0 (variability of the individual data
points) or 1 (variability of the centroid itself) (default: 0)
- confidence (optional): probability of lying outside the ellipse (default: 0.05)
- point_plot (optional): if point_plot ~=0, also plots the individual points as
symbols (inactivated if centroid_variability is set to 1)

Useful in discriminant analysis and related methods.


excel2bag

excel2bag - reads an excel file and creates the corresponding text bag (array of characters)

function [X,bag] = excel2bag(filename,ref_text_col,(nchar),(deb),(xend))
Reads an excel file which has been saved in the .csv format (the delimiter is ';').
The excel file includes the identifiers of rows and columns.

Input arguments
===============
filename :excelfile in the '.csv' format
nchar :number of characters read in each cell of the excel files
(the other are ignored)
ref_text_col :an array of string giving the reference of columns designed as forming the columns of bag.d.
THESE COLUMNS ARE DESIGNATED USING THE EXCEL STYLE ('AA','AB' ....)
deb : number of the first row decoded
xend : final row decoded

output argument:
===============
X: matrix of numerical values
bag: a structure (bag.d,bag.i,bag.v), with bag.d being here a matrix of char

X contains the numerical values from the excel file, with the exception of the columns referenced by ref_text_col
bag contains the character values from the excel file, referenced by ref_text_col

Example:
[value,bag]=excel2bag('olive',['A'; 'B'],20)
the columns 'A' and 'B' from excel are read as text (in output bag)
the other columns are read as number (in value) respecting the saisir
structure.

This function is normally used in relation
with "bag2group".


excel2saisir

excel2saisir - reads an excel text file

function [saisir] = excel2saisir(filename,(nchar),(start),(xend))

Input arguments
==============
filename: (string) name of the text Excel file in .csv format
nchar : (integer, optional) number of character kept in the identifiers
(default : 20)
start:(integer , optional) Index of the beginning of the observations to be loaded
xend:(integer , optional) Index of the final of the observations to
be loaded (greater than start)

Reads an Excel file which has been saved in the .csv format (the delimiter is ';').
The Excel file includes the identifiers of rows and columns.
start: number of the first row decoded, xend: final row decoded
The Excel layout must be the following (example):
varname1 varname2 varname3
obsname1 number11 number12 number13
obsname2 number21 number22 number23
obsname3 number31 number32 number33

The decimal separator is the point (".") NOT THE COMMA (",")
Example of .csv format (3 rows named "obs1", "obs2", "obs3"; 3 columns named
"var1", "var2", "var3")
data :
=======================
var1 var2 var3
obs1 1 2 3
obs2 4 5 6
obs3 7 8 9
=======================
The corresponding .csv Excel file is:
;var1;var2;var3
obs1;1;2;3
obs2;4;5;6
obs3;7;8;9


fda1

fda1 - stepwise factorial discriminant analysis on PCA scores

function[fdatype]=fda1(pcatype, group, among, maxscore)

Input arguments:
---------------
pcatype: (structure) output argument of function "pca" applied on the predictive data
set
group : SAISIR vector of group (integer values). Identical numbers mean
that the observations belong to the same group.
among : (integer) Maximal rank (dimension) of PC-score allowed to enter in the model
maxscore: (integer) Maximum number of scores allowed to enter in the model.

Output arguments:
----------------
fdatype with fields:
introduced: rank of the PC scores introduced in the model
ncorrect: number of correct classifications (no validation) at each step
beta: projection coefficients such as datafactor=X*beta
datafactor: discriminant scores
centroidfactor: scores of the barycenters (centroids)
eigenval: eigenvalues of the discriminant analysis
confusion: confusion matrix (row actual; column predicted)
average: average of the predictive data set.

Assesses a stepwise factorial discriminant analysis according
to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).
The basic idea is to assess a factorial discriminant analysis on the scores of
a previous PCA. The criterion of score selection is the maximisation of
the trace of T^(-1)B.
In order to avoid using PC-scores with very small eigenvalues,
the input argument "among" gives the maximal dimension to be allowed.
"maxscore" indicates the maximal number of scores.
"datafactor" corresponds to the final model
(with maxscore scores introduced). If one is interested in a more
economical model, it is easy, looking at the classification, to reduce the
value in "maxscore" and re-run "fda1".
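
The selection criterion can be illustrated in plain MATLAB. The sketch below is not the code of "fda1": it only computes trace(inv(T)*B) for each candidate score of a toy matrix S (hypothetical values) and picks the best one, assuming that T and B are divided by n.

```matlab
% Toy PC scores: column 1 separates the groups, column 2 does not
S = [1.0 0.0; 1.2 0.1; 0.9 -0.1; 3.0 0.0; 3.1 0.2; 2.9 -0.2];
group = [1; 1; 1; 2; 2; 2];
n = size(S, 1);
gm = mean(S, 1);                        % grand mean

% Total covariance matrix
Sc = S - repmat(gm, n, 1);
T = (Sc' * Sc) / n;

% Between-group covariance matrix
labels = unique(group);
B = zeros(size(S, 2));
for k = 1:numel(labels)
    rows = (group == labels(k));
    dev = mean(S(rows, :), 1) - gm;
    B = B + sum(rows) * (dev' * dev) / n;
end

% Criterion for each single candidate score
crit = zeros(1, size(S, 2));
for j = 1:size(S, 2)
    crit(j) = trace(T(j, j) \ B(j, j));
end
[~, first] = max(crit);                 % score that would enter first
```

In a stepwise procedure the same criterion would be evaluated on the already selected scores plus each remaining candidate.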

Typical example:
g=create_group1(wheat,1,3);%% creation of a grouping from the identifiers
%names, using characters in positions 1 to 3
p=pca(wheat); %% first PCA
res=fda1(p,g,20,5);%% model with 5 scores introduced among the 20 first
%ones
res.ncorrect.d
% ans =
% 13.00 1 PC score introduced
% 31.00
% 69.00
% 77.00
% 93.00 5 PC scores introduced
colored_map1(res.datafactor,1,2,1,3)% map of the discrimination
figure;
ellipse_map(res.datafactor,1,2,g,1,0.05) % shown as confidence ellipses

Note: the number of dimensions in datafactor is at most the number of
qualitative groups minus 1
(with 2 groups, only 1 discriminant dimension!).

SEE ALSO maha3, maha6, plsda, quaddis, applyfda1.


fda2

fda2 - stepwise factorial discriminant analysis on PCA scores with verification

function[res]=fda2(X, group, among, maxscore, selected)

Input arguments:
---------------
X: SAISIR matrix of the predictive data set (n x p)
group : SAISIR vector of group (integer values). Identical numbers mean
that the observations belong to the same group.
among : (integer) Maximal rank (dimension) of PC-score allowed to enter in the model
maxscore: (integer) Maximum number of scores allowed to enter in the model.
selected: matlab vector (n x 1) with elements =0 (calibration set), or 1
(validation set)

Output arguments:
----------------
res with fields:
introduced: rank of the PC scores introduced in the model
ncorrect: number of correct classifications in the calibration set at each step
nscorrect: number of correct classifications in the validation (supplementary) set at each step
beta: projection coefficients such that datafactor=X*beta
datafactor: discriminant scores of the calibration set
centroidfactor: scores of the barycenters (centroids) computed from the
calibration set
supscore: discriminant scores of the validation set
eigenval: eigenvalues of the discriminant analysis
confusion: confusion matrix (row actual; column predicted) in the
calibration set
average: average of the calibration set.
sconfusion: confusion matrix (row actual; column predicted) in the
validation (or "supplementary" set)

Assesses a stepwise factorial discriminant analysis according
to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).
The basic idea is to apply a factorial discriminant analysis to the scores of
a previous PCA. The criterion of score selection is the maximisation of
the trace of inv(T)*B (T: total covariance matrix, B: between-group
covariance matrix).
In order to avoid using PC-scores with very small eigenvalues,
the input argument "among" gives the maximal dimension to be allowed.
"maxscore" indicates the maximal number of scores.
"datafactor" corresponds to the final model
(with maxscore scores introduced). If one is interested in a more
economical model, it is easy, looking at the classification, to reduce the
value in "maxscore" and re-run "fda2".
The collection is divided into a calibration set and a validation set from the
elements of the input argument "selected".
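
The role of "selected" can be sketched in plain MATLAB. Here randperm is used as a stand-in for "random_select", whose exact behaviour is not described in this entry, and X is a toy matrix.

```matlab
X = magic(6);                   % toy data: 6 observations, 6 variables
n = size(X, 1);
nval = round(n / 3);            % one third of the rows for validation

perm = randperm(n);
selected = zeros(n, 1);         % 0: calibration set
selected(perm(1:nval)) = 1;     % 1: validation set

Xcal = X(selected == 0, :);     % rows used to build the model
Xval = X(selected == 1, :);     % rows used to verify it
```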

%Typical example:
g=create_group1(data,1,3);%% creation of a grouping from the identifiers
%names, using characters in position 1 to 3
%random selection of 1/3 in the validation set
res=fda2(data,g,10,5,random_select(size(data.d,1),round(size(data.d,1)/3)));

SEE ALSO maha3, maha6, plsda, quaddis, applyfda1, fda1.


find_index

find_index - find the index corresponding to the closest "value"

function index=find_index(str,value);

useful for finding the wavelength index in strings

Input argument:
==============
str: an array of characters which can be interpreted as a vector of numbers
when using a command such as vector=str2num(str);
value: a numerical value normally in the range given by vector.

Output argument:
===============
Index (rank) of the variable

Example of use:
index=find_index(spectra.v,1104);
Finds the index of the variable in "spectra" closest to the value 1104.
Important note: the numbers obtained with str2num(str) are assumed to be sorted
(this is normally the case with spectral data).
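
The search for the closest value can be sketched as follows in plain MATLAB. This is an illustration, not the code of "find_index": it uses str2double on the rows of a toy identifier matrix and does not rely on the values being sorted.

```matlab
str = ['1100'; '1102'; '1104'; '1106'];   % toy variable identifiers
value = 1103.2;

vector = str2double(cellstr(str));        % identifiers converted into numbers
[~, index] = min(abs(vector - value));    % rank of the closest identifier
```

Here index is 3, the rank of the identifier '1104'.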


find_max

find_max - gives the indices of the max value of a MATLAB Matrix

function [row,col,value]=find_max(matrix)

Input argument
==============
matrix: MATLAB matrix

Output arguments
===============
row, col: indexes of the row and column of the maximum, respectively
value: value of the maximum

see also find_min
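
An equivalent computation in plain MATLAB, given as a sketch on a toy matrix M (the same idea with "min" applies to find_min):

```matlab
M = [3 8 1; 9 2 7];
[value, k] = max(M(:));             % maximum and its linear index
[row, col] = ind2sub(size(M), k);   % linear index converted to row and column
```

For this matrix, value is 9, found in row 2 and column 1.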


find_min

find_min - gives the indices of the min value of a MATLAB Matrix

function [row,col,value]=find_min(matrix)

Input argument
==============
matrix: MATLAB matrix

Output arguments
===============
row, col: indexes of the row and column of the minimum, respectively
value: value of the minimum

see also find_max


find_peaks

find_peaks - finds and displays peaks greater than a threshold value

function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

Input arguments
==============
X:SAISIR data matrix of "spectra-like" data
nrow:(integer): index of the row to be studied
threshold: peaks of absolute value lower than the threshold are not
detected
windowsize (integer, preferably odd number): size of the moving window
in which the peaks are to be found
min_max: either 0 : only maximum detected, or 1: maximum and minimum detected

Output argument
==============
vect: matlab vector of the positions (name of variable converted into
numbers)
index: matlab vector of the index of the variables corresponding to peaks

Inside a moving window of size "windowsize" (data points), the function
detects the maximum (or the maximum and minimum values). The identified
positions are considered as "peaks" and shown on the display.
The corresponding variable identifiers (normally wavelengths or
retention time values) are converted into numbers and given in the output
argument "vect".

The combination of threshold and moving window prevents a long series of
spurious peaks from being identified when the studied curve is not perfectly smooth.
"windowsize" sets the minimum gap (in data points) between two
consecutive peaks. The threshold restricts detection to peaks whose absolute value exceeds it.
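
The principle can be sketched in plain MATLAB for the detection of maxima only. This is an illustration under simple assumptions (a point is a peak when it is the maximum of the window centered on it), not the code of "find_peaks"; y is a toy signal.

```matlab
y = [0 1 5 1 0 2 9 2 0 1 0];     % toy signal
threshold = 3;
windowsize = 3;                  % odd number of data points
half = floor(windowsize / 2);

index = [];                      % ranks of the detected peaks
for i = half + 1 : numel(y) - half
    win = y(i - half : i + half);
    % a peak is the maximum of its window and is not below the threshold
    if y(i) == max(win) && abs(y(i)) >= threshold
        index(end + 1) = i;
    end
end
```

The small bump at position 10 is a local maximum but is rejected by the threshold, so index is [3 7].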


group_centering

group_centering - Centers data according to groups

function X1=group_centering(X,group);

Input arguments
==============
X: SAISIR matrix (n x p)
group: SAISIR vector of group (integer, n x 1). Identical values
in "group" indicate that the corresponding observations belong to the same group.

Output argument
===============
X1:SAISIR matrix (n x p) group-centered.

For each group, as defined by the input argument "group", the function computes the
average observation (1 x p). This average is subtracted from all the
observations belonging to this group.

A typical use of this function is the centering of sensory data according to each panellist.
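
The operation can be sketched in plain MATLAB on ordinary matrices (the SAISIR function works on SAISIR structures; X and group below are toy values):

```matlab
X = [1 2; 3 4; 10 20; 30 40];
group = [1; 1; 2; 2];

X1 = X;
labels = unique(group);
for k = 1:numel(labels)
    rows = (group == labels(k));
    m = mean(X(rows, :), 1);                          % average observation (1 x p)
    X1(rows, :) = X(rows, :) - repmat(m, sum(rows), 1);
end
```

After centering, the average of each group in X1 is zero for every variable.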


group_mean

group_mean - gives the means of group of rows

function X1=group_mean(X,startpos,endpos)

Input arguments
===============
X: SAISIR data matrix ( n x p)
startpos, endpos : (integers) character positions in the identifiers giving the key
for building the qualitative groups.

Output argument
===============
X1: matrix of averages of groups (k x p) with k the number of found
groups.

This function uses the row identifiers for creating groups.
It creates as many groups as there are different strings from startpos to endpos.
The function gives the matrix of averages according to groups
(barycenters).
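
A plain MATLAB sketch of the same idea, with toy identifiers ids and a toy data matrix d (names chosen for the illustration; the SAISIR function works directly on the fields of X):

```matlab
ids = ['A01'; 'A02'; 'B01'; 'B02'];     % row identifiers
d = [1 2; 3 4; 10 20; 30 40];
startpos = 1;
endpos = 1;

keys = cellstr(ids(:, startpos:endpos));   % key extracted from each identifier
[labels, ~, g] = unique(keys);             % one group per different key

means = zeros(numel(labels), size(d, 2));
for k = 1:numel(labels)
    means(k, :) = mean(d(g == k, :), 1);   % barycenter of group k
end
```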


html_header

html_header - build the first part of the documentation

function html_header(fid)

This function has no direct use
It is used for beginning the HTML documentation of SAISIR


html_notice

html_notice - prints the helps of functions in HTML document

function html_notice(fid,function_name);

This function builds up the core of the documentation (which is the
gathering of the individual helps).
It also builds the necessary hypertext links.

No direct use


html_postface

html_postface - Finish the work in HTML documentation

function html_postface(fid);

No direct use


issaisir

issaisir - tests if the input argument is a SAISIR matrix

test=issaisir(X);

Input argument:
==============
X: anything

Output argument:
===============
test: (boolean) "true" if X is a SAISIR structure, "false" otherwise.

SEE also : saisir_check
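
A minimal sketch of such a test, assuming that a SAISIR structure is recognised by the presence of the fields "d", "i" and "v" (the actual function may apply other checks; looks_like_saisir is a hypothetical name):

```matlab
looks_like_saisir = @(X) isstruct(X) && all(isfield(X, {'d', 'i', 'v'}));

S.d = [1 2; 3 4];
S.i = ['obs1'; 'obs2'];
S.v = ['var1'; 'var2'];

test1 = looks_like_saisir(S);          % structure with the three fields
test2 = looks_like_saisir(magic(3));   % ordinary MATLAB matrix
```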


labelled_hist

labelled_hist - draws a histogram in which each obs. name is considered as a label

function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))

Input arguments
===============
X : SAISIR data matrix
col : the column from which the histogram is drawn
startpos and endpos : the position in the row identifier strings considered as keys for coloration
nclass : number of desired classes
charsize : the size of character on the graph
car : (optional) if car (a string) is defined, all the observations are represented with
this string, colored differently according to the extracted key.
By choosing car='--' (for example), it is possible to avoid overlapping identifiers.

This function builds a histogram in which each observation is represented
by a colored code (key) extracted from the row identifiers.

Note: It is generally necessary to adjust "nclass" and "charsize" to obtain a readable histogram.



leave_one_out_pls1

leave_one_out_pls1 - PLS1 with leave-one-out validation

function res=leave_one_out_pls1(X,y,ndim);

Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of the variable to be predicted (n x 1)
ndim: (integer) maximal number of tested PLS dimensions

Output arguments:
===============
res with fields
predy: SAISIR matrix of predicted y in leave-one-out (n x ndim) for all
the dimensions tested.
rmse: Root mean square error (1 x ndim) for all the dimensions tested
r2: r2 value between observed and predicted y for all the dimensions
tested.
optimal_error: (double) minimal rmse among all the dimensions tested.
optimal_dim: (integer) PLS dimension giving the best model.
optimal_r2: r2 value for the best model.

The function leaves out one observation and builds a model with the
remaining observations. The left-out observation is then predicted.
This procedure is carried out for the n rows in X.

This function is very slow and must be used only for small data sets
(typically fewer than 30 observations). Otherwise, another
validation procedure should be preferred.
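
The leave-one-out loop itself can be sketched in plain MATLAB. To keep the example self-contained, ordinary least squares (backslash operator) is used here in place of PLS, so this illustrates the validation scheme only, on toy data.

```matlab
X = [1; 2; 3; 4; 5];
y = [2.1; 3.9; 6.2; 7.8; 10.1];
n = size(X, 1);

predy = zeros(n, 1);
for i = 1:n
    keep = true(n, 1);
    keep(i) = false;                         % observation i is left out
    A = [ones(n - 1, 1) X(keep, :)];
    b = A \ y(keep);                         % model on the n-1 other rows
    predy(i) = [1 X(i, :)] * b;              % prediction of the left-out row
end

rmse = sqrt(mean((y - predy) .^ 2));         % root mean square error
```

One model is fitted per observation, which explains why the procedure becomes slow when n is large.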


leave_one_out_pls2

leave_one_out_pls2 - PLS2 with leave-one-out validation

function res=leave_one_out_pls2(X,y,ndim);

Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR matrix of the variables to be predicted (n x k)
ndim: (integer) maximal number of tested PLS dimensions

Output arguments:
===============
res with fields
col: vector of k cells. The cell res.col{i} contains the y predicted values
(n x ndim) associated with the variable i, for all the PLS dimensions
tested.
RMSEV:root mean square error (k x ndim) for all the variables (rows) and all the
dimensions (columns).
r2: r2 values (k x ndim) for all the variables (rows) and all the
dimensions (columns).

The function leaves out one observation and builds a model with the
remaining observations. The y values of the left-out observation are then predicted.
This procedure is carried out for the n rows in X.

This function is very slow and must be used only for small data sets
(typically fewer than 30 observations). Otherwise, another
validation procedure should be preferred.


list

list - lists rows (only with a small number of columns)

function list(X,(start))

start: starting index in the list (default : 1)


lr1

lr1 - Latent root regression

function [lr1type]=lr1(X,y,maxdim,(ratioxy))

Computes a basic latent root model

Input arguments:
================
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of variables to be predicted (n x 1)
maxdim (integer): maximal number of dimensions introduced in the model
ratioxy (float, optional): number greater than 0 and less than 1
giving the relative importance of X and y (close to 1: X important;
close to 0: X not important)

Output argument:
================
lr1type with fields:
predy: predicted y for all the models, up to maxdim dimensions (n x maxdim)
corrcoef: correlation coefficients between y and predicted y (1 x maxdim)
beta: regression coefficients for all the models (p x maxdim)
averagey: average of y
averagex: average of x
ratioxy: copy of parameter ratioxy


maha

maha - simple forward discriminant analysis (stepwise introduction of variables, no validation samples)

function[res]=maha(X,group,maxvar)

Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.

Output arguments:
-----------------
res with fields:
step with fields
index: vector of integers (1 x maxvar) giving the indices of the
selected variables
correct: vector of integer ( 1 x maxvar) giving the number of
correct classifications at each step
name: identifiers of the introduced variables (matrix of char with
maxvar rows)
classed: predicted groups in the final step (SAISIR vector of integers
n x 1)

The function assesses a simple quadratic discriminant analysis introducing
up to maxvar variables.
At each step, the most discriminating variable (according to the percentage of correct
classification) is introduced. The selection is forward only (no variable is removed).
The function makes use of the MATLAB function "classify".
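
The forward selection loop can be sketched in plain MATLAB. To avoid any toolbox dependency, a nearest-centroid rule is used here as a stand-in for "classify", so the sketch shows the selection scheme and not the discriminant method of "maha"; the data are toy values.

```matlab
% Toy data: variable 2 separates the two groups, variables 1 and 3 do not
X = [0.1 1.0 5; 0.9 1.2 4; 0.5 0.9 6; 0.2 3.1 5; 0.8 2.9 4; 0.4 3.0 6];
group = [1; 1; 1; 2; 2; 2];
maxvar = 2;
labels = unique(group);

selected = [];                     % indices of the variables introduced so far
ncorrect = zeros(1, maxvar);       % correct classifications at each step
for step = 1:maxvar
    best = -1;
    bestvar = 0;
    for v = setdiff(1:size(X, 2), selected)
        trial = [selected v];
        % squared distances of each observation to each group centroid
        d2 = zeros(size(X, 1), numel(labels));
        for k = 1:numel(labels)
            c = mean(X(group == labels(k), trial), 1);
            dev = X(:, trial) - repmat(c, size(X, 1), 1);
            d2(:, k) = sum(dev .^ 2, 2);
        end
        [~, pos] = min(d2, [], 2);
        nc = sum(labels(pos) == group);
        if nc > best               % keep the most discriminating candidate
            best = nc;
            bestvar = v;
        end
    end
    selected(end + 1) = bestvar;
    ncorrect(step) = best;
end
```

On these data the first variable introduced is variable 2, which already classifies the 6 observations correctly.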


maha1

maha1 - forward linear discriminant analysis DIRECTLY ON DATA with validation samples

function[discrtype1]=maha1(calibration_data,calibration_group,maxvar,test_data,test_group)

Input arguments
==============
calibration_data: SAISIR matrix (n1 x p)
calibration_group:SAISIR vector of group numbers (n1 x 1)
maxvar:(integer) maximum number of variables introduced in the model
test_data:SAISIR matrix (n2 x p)
test_group: SAISIR vector of group numbers (n2 x 1);

Output argument:
===============
discrtype1 with fields:
step with fields
index: vector giving the rank of the variable introduced at each step
(maxvar values)
correct: vector giving the number of correct classifications in the
calibration set (maxvar values)
name: identifiers of the variables introduced in the models (matrix of
char with maxvar rows)
ntestcorrect: vector giving the number of correct classifications in the
validation set (maxvar values)
classed: SAISIR vector of the predicted group numbers of the calibration
set (n1 x 1). Only the result of the final step is given
testclassed:SAISIR vector of the predicted group numbers of the test set (n2 x 1)
Only the result of the final step is given

The function assesses a simple linear discriminant analysis introducing
up to maxvar variables.
At each step, the most discriminating variable (according to the percentage of correct
classification of the calibration set) is introduced. The selection is forward only.
Uses the MATLAB function "classify".

%Typical example

mydis=maha1(wheat1,g1,5,wheat2,g2)
disp(mydis.step.ntestcorrect);%Looking at the number of correct
%classifications in the test set


maha3

maha3 - simple forward discriminant analysis (stepwise introduction of variables, no validation samples)

function[discrtype]=maha3(X,group,maxvar)

Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.

Output arguments:
-----------------
discrtype with fields:
ncorrect: vector of integers (1 x maxvar) indicating the number of
correct classifications at each step.
classed: SAISIR vector of integer (n x 1) indicating the predicted group number
for the model with maxvar variables introduced.
confusion: SAISIR confusion matrix (row: actual group, column : predicted group).
varrank: vector of integer (1 x maxvar) indicating the index (rank) of the
introduced variables.

The function computes a linear discriminant analysis introducing up to
maxvar variables.
At each step, the most discriminating variable according to the
maximisation of the trace of inv(T)*B is introduced.


maha4

maha4 - simple forward discriminant analysis

function[discrtype]=maha4(X,group,maxvar)

Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.

Output arguments:
-----------------
discrtype with fields:
ncorrect: vector of integers (1 x maxvar) indicating the number of
correct classifications at each step.
classed: SAISIR vector of integer (n x 1) indicating the predicted group number
for the model with maxvar variables introduced.
confusion: SAISIR confusion matrix (row: actual group, column : predicted group).
varrank: vector of integer (1 x maxvar) indicating the index ("rank") of the
introduced variables.

The function computes a linear discriminant analysis introducing up to
maxvar variables. At each step, the new variable giving the highest number
of correctly classified samples is introduced.


maha6

maha6 - simple forward discriminant analysis (stepwise introduction of variables, with validation samples)

function[discrtype]=maha6(X,group,maxvar,selected)

Input arguments:
===============
X: SAISIR matrix (n x p) of data
group: SAISIR vector of integer (n x 1) indicating the group number of the
observations. Two observations sharing a same group number belong to the
same qualitative group
maxvar: integer giving the maximal number of variables introduced in the
model
selected: Matlab vector (n x 1) the elements of which are equal to 0
(observation placed in the calibration set) or 1 (observation placed in
the validation set).

Output argument:
==============
discrtype with fields
ncorrect: vector of integers (1 x maxvar) giving the number of correctly
classified observations in the calibration set.
classed: SAISIR vector of integer giving the predicted group of the
calibration set.
confusion: matlab confusion matrix of the calibration set for the final
step.
sconfusion: matlab confusion matrix of the validation set for the final
step.
sclassed: SAISIR vector of integer giving the predicted group of the
validation set.
nscorrect: vector of integers (1 x maxvar) giving the number of correctly
classified observations in the validation (supplementary) set.

The function computes a linear discriminant analysis introducing up to maxvar variables.
At each step, the most discriminating variable
according to the maximisation of the trace of inv(T)*B is introduced.
The collection is divided into calibration and validation samples according to "selected":
selected=0: sample placed in the calibration set; selected=1: sample placed in the validation set.

%Typical use
%===========
load data;
g=create_group1(data,1,2); %% supposing that the identifiers contain a key
%for forming the qualitative group in position of characters 1 and 2
sel=random_select(size(data.d,1),round(size(data.d,1)/3));%% a third in
%validation
res=maha6(data,g,5,sel);
xdisp('Evolution of correct classification in the calibration set',res.ncorrect);
xdisp('Evolution of correct classification in the validation set', res.nscorrect);


map

map - draws a map of the data (two columns) using the identifiers as names

function map(X,col1,col2,(col1label),(col2label),(title),(charsize),(margin))

Input arguments
---------------
X: SAISIR matrix
col1, col2 : index of the two columns to be represented
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
margin (optional) : margin value allowing an extension of the axes in order
to cope with long identifiers (default value: 0.05)
For French users: there is a synonym function "carte".
Use preferably "map".


map3D

map3D - Draws a 3D map

function map3D(X,col1,col2,col3,(label1),(label2),(label3),(title),(charsize))

Input arguments:
===============
X: SAISIR matrix (n x p)
col1, col2, col3: indices of the columns to be represented
as X, Y and Z in the 3D plot (integers).
label1,label2, label3: label of axes on X, Y, Z (optional, strings or vectors of
char)
charsize: size of the characters (optional, default :6)

Synonym of "carte3D" (French name). Use preferably "map3D".


matrix2saisir

matrix2saisir - transforms a Matlab matrix into a SAISIR structure

X = matrix2saisir(data,(coderow),(codecol))

Input arguments:
---------------
data : Matlab matrix
coderow (optional) : string (code added to the row identifiers)
codecol (optional): string (code added to the variables identifiers)

Output argument:
----------------
X: SAISIR matrix with fields "d" (copy of "data"), "i" (identifiers of
rows), "v" (identifiers of variables)

SAISIR means "statistique appliquée à l'interprétation des spectres infrarouge"
or "statistics applied to the interpretation of IR spectra".

See the manual of SAISIR for understanding the rationale of this structure.

%Typical use:
%===========
A=[1 2 3 4; 5 6 7 8];
B=matrix2saisir(A,'row # ', 'Column # ');
% >> B.d
% ans =
% 1 2 3 4
% 5 6 7 8
% >> B.i
% ans =
% row # 1
% row # 2
% >> B.v
% ans =
% Column # 1
% Column # 2
% Column # 3
% Column # 4
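
For illustration, a structure with the same three fields can be built by hand in plain MATLAB. This sketch mimics the example above; the exact padding of the identifiers produced by "matrix2saisir" may differ.

```matlab
A = [1 2 3 4; 5 6 7 8];
[n, p] = size(A);

B.d = A;                                              % data
B.i = [repmat('row # ', n, 1) num2str((1:n)')];       % row identifiers
B.v = [repmat('Column # ', p, 1) num2str((1:p)')];    % variable identifiers
```

The identifiers are stored as matrices of char, one row per observation (field "i") or per variable (field "v").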


mdistance

mdistance - computes distances between the two tables using metric "metric"

function dis = mdistance(X1,X2,metric)

Input arguments:
================
X1: SAISIR matrix ( n1 x p)
X2: SAISIR matrix (n2 x p)
metric: SAISIR matrix (p x p) of the metric

Output argument:
================
dis: SAISIR matrix of distances between observations (n1 x n2) according to the metric "metric"

example of use (computing Mahalanobis distances):
================================================
data=matrix2saisir(rand(50,10));%% dummy data
data1=center(data);%% centered data
metric=matrix2saisir(inv(data1.d'*data1.d));
dis=mdistance(data1,data1,metric);%% Mahalanobis distance between
%observations
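
The distance according to a metric M can be sketched in plain MATLAB as the quadratic form (x1-x2)*M*(x1-x2)'. Whether "mdistance" returns this quantity or its square root is not stated in this entry; the sketch below takes the square root, which is an assumption, and uses toy matrices.

```matlab
X1 = [0 0; 1 0];          % n1 x p
X2 = [0 0; 0 2];          % n2 x p
M = eye(2);               % identity metric: usual Euclidean distance

D = zeros(size(X1, 1), size(X2, 1));
for i = 1:size(X1, 1)
    for j = 1:size(X2, 1)
        dev = X1(i, :) - X2(j, :);
        D(i, j) = sqrt(dev * M * dev');
    end
end
```

Replacing M by the inverse of a covariance matrix gives distances of the Mahalanobis type.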


mfa

mfa - Multiple factor analysis

function res=mfa(collection);

Input argument:
===============
collection: VECTOR OF SAISIR structures
example of building a collection: col{1}=table1; col{2}=table2; col{3}=table3;
in which table1, table2, table3 are SAISIR structures
Each table must include the same observations, but not necessarily the same variables.
WARNING! In this version, the variables are not normalised !

Let n be the number of observations, and t the number of tables
Let WHOLE be the matrix of appended tables normalized according to MFA (dimensions n x m)
Let k be the rank of WHOLE

Output arguments:
================
res with fields:
score (n x k) : scores of the individuals (compromise)
eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)
eigenval (1 x k) : eigenvalues of the PCA on WHOLE
average (1 x m) : averages of the variables of WHOLE
var_score (m x k) : scores of the variables
proj {1xt cell} : projectors for computing the projection of new observations of each table
first_eigenval (1xt) : first eigenvalues of the individual PCAs on each table
trajectory (q x k) : individual score of each row of each table (q = total number of rows in all the tables q=n*t)
id_group (q x 1) : identification of the belonging of the observation_score into a given table
table_score (t x k) : scores of the tables

Information on MFA can be found in SPAD TM Version 5.0 (procédure AFMUL);
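
The usual MFA weighting can be sketched in plain MATLAB: each centered table is divided by its first singular value before the tables are appended and submitted to a global PCA. This is a sketch of the general method on toy tables, assuming this weighting; the scaling conventions of "mfa" itself may differ.

```matlab
col{1} = [1 2; 3 4; 5 7];              % toy table 1 (3 observations)
col{2} = [10 0 1; 0 5 2; 3 3 3];       % toy table 2 (same observations)

whole = [];
firstsv = zeros(1, numel(col));
for t = 1:numel(col)
    Xc = col{t} - repmat(mean(col{t}, 1), size(col{t}, 1), 1);   % centering
    s = svd(Xc);
    scaled = Xc / s(1);                % first singular value set to 1
    s2 = svd(scaled);
    firstsv(t) = s2(1);
    whole = [whole, scaled];           % appended tables (n x m)
end

[U, S, V] = svd(whole, 'econ');        % global PCA on the appended tables
score = U * S;                         % scores of the individuals
```

With this weighting no single table can dominate the first dimension of the compromise.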


mir_style

mir_style - changes the sign of the variables of MIR spectra

function [names] = mir_style(names1)

Input argument:
===============
names1: matrix of char interpretable as numbers through "str2num"

Output argument:
===============
names: matrix of char interpretable as numbers through "str2num"

The function changes the sign of the variable identifiers (here normally wavenumbers)
so that the spectra are plotted in the usual direction for mid-infrared spectra.
This may help spectroscopists to examine data and loadings.

%Typical use:
%============
%Let us suppose that "midIR" is a SAISIR matrix of Mid-infrared spectra
%with wavenumbers as variables identifiers
midIR.v=mir_style(midIR.v);
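
The sign change can be sketched in plain MATLAB on a toy matrix of identifiers. This is an illustration only; the formatting of the strings returned by "mir_style" may differ.

```matlab
names1 = ['1800'; '1700'; '1600'];        % toy wavenumber identifiers
values = str2double(cellstr(names1));     % identifiers converted into numbers
names = num2str(-values);                 % negative values back to char
```

The resulting identifiers increase from -1800 to -1600, so a plot against them shows the highest wavenumbers on the left.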


moving_average

moving_average - Moving average of signals

function X1=moving_average(X,window_size)

Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points which are locally
averaged (odd number).

Output arguments
===============
X1: matrix of averaged data

The function replaces a given variable by its average in the range defined
by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the signal.

For example with window_size = 5
a given variable x(i) of index i is replaced by the local average
(x(i-2)+x(i-1)+x(i)+x(i+1)+x(i+2))/5
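
The same computation on a single toy signal, written in plain MATLAB as a sketch (the SAISIR function processes all the rows of X):

```matlab
x = [1 2 3 4 5 6 7];
window_size = 5;
half = (window_size - 1) / 2;
n = numel(x);

x1 = zeros(1, n - 2 * half);          % the two ends of the signal are lost
for i = half + 1 : n - half
    x1(i - half) = mean(x(i - half : i + half));
end
```

The 7 points give 3 averaged values, [3 4 5], since 2 variables are lost at each end.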


moving_max

moving_max - replaces the central point of a moving window by the maximum value

function [X1] = moving_max(X,window_size)

Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points on which the max
value is computed.

Output arguments
===============
X1: matrix of local maxima

The function replaces a given variable by the maximum value in the range defined
by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the signal.

For example with window_size = 5
a given variable x(i) of index i is replaced by the local maximum of variables
x(i-2); x(i-1); x(i); x(i+1); x(i+2)
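
A plain MATLAB sketch of the moving maximum on a toy signal (replacing "max" by "min" gives the moving minimum):

```matlab
x = [1 5 2 0 3 9 4];
window_size = 3;
half = (window_size - 1) / 2;
n = numel(x);

x1 = zeros(1, n - 2 * half);
for i = half + 1 : n - half
    x1(i - half) = max(x(i - half : i + half));   % local maximum around x(i)
end
```

For this signal x1 is [5 5 3 9 9].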


moving_min

moving_min - replaces the central point of a moving window by the minimum value

function [X1] = moving_min(X,window_size)

Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points on which the min
value is computed.

Output arguments
===============
X1: matrix of local minima

The function replaces a given variable by the minimum value in the range defined
by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the signal.

For example with window_size = 5
a given variable x(i) of index i is replaced by the local minimum of variables
x(i-2); x(i-1); x(i); x(i+1); x(i+2)


multiple_regression

multiple_regression - Simple Multiple linear regression (all the variables)

function res=multiple_regression(x,y);

Input arguments
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1);

Output argument
===============
res with fields
ypred: predicted y
beta0: intercept of the model (double)
beta: regression coefficient
r2: r2 value
RMSEC: Root mean square error of calibration
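
The quantities listed above can be computed in plain MATLAB as follows. This is a sketch on toy data, assuming that RMSEC is the square root of the mean squared residual (the denominator used by the SAISIR function is not stated here).

```matlab
X = [1 2; 2 1; 3 4; 4 3; 5 6];
y = 1 + 2 * X(:, 1) + 3 * X(:, 2);       % toy response, exactly linear
n = size(X, 1);

A = [ones(n, 1) X];                      % column of ones for the intercept
b = A \ y;                               % least squares solution
beta0 = b(1);
beta = b(2:end);

ypred = A * b;
r2 = 1 - sum((y - ypred) .^ 2) / sum((y - mean(y)) .^ 2);
RMSEC = sqrt(mean((y - ypred) .^ 2));
```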


multiway_pca

multiway_pca - Multi way principal component analysis

function res=multiway_pca(collection);

Input argument:
==============
collection : ARRAY OF SAISIR matrices
example of building a collection: col{1}=table1; col{2}=table2; col{3}=table3;
in which table1, table2, table3 are SAISIR structures
Each table must include the same observations, but not necessarily the same variables.
WARNING! In this version, the variables are not normalised !

Let n be the number of observations, and t the number of tables
Let WHOLE be the matrix of appended tables(dimensions n x m)
Let k be the rank of WHOLE

Output argument:
===============
res with fields:

score (n x k) : scores of the individuals (compromise)
eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)
eigenval (1 x k) : eigenvalues of the PCA on WHOLE
average (1 x m) : averages of the variables of WHOLE
var_score (m x k) : scores of the variables
proj {1xt cell} : projectors for computing the projection of new observations of each table
trajectory (q x k) : individual score of each row of each table (q = total number of rows in all the tables)
id_group (q x 1) : identification of the belonging of the observation_score into a given table
table_score (t x k) : scores of the tables

This function is identical to "mfa" except that each table is simply scaled so that its norm is equal to 1.


nancor

nancor - Matrix of correlation with missing data

function[cor]=nancor(X1,X2)

Input arguments:
================
X1 and X2: SAISIR matrices dimensioned (n x p1) and (n x p2)
respectively.

Output argument:
================
cor: SAISIR matrix (dimensioned p1 x p2).
An element cor.d(a,b) is the correlation coefficient between the column a
of X1 and b of X2.
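
One common way of handling missing data is to compute each coefficient on the rows where both columns are observed. The plain MATLAB sketch below follows this pairwise approach on toy matrices; that "nancor" proceeds in exactly this way is an assumption.

```matlab
X1 = [1 5; 2 3; NaN 4; 4 1];     % one missing value in column 1
X2 = [2; 4; 6; 8];

cor = zeros(size(X1, 2), size(X2, 2));
for a = 1:size(X1, 2)
    for b = 1:size(X2, 2)
        ok = ~isnan(X1(:, a)) & ~isnan(X2(:, b));   % rows observed in both
        c = corrcoef(X1(ok, a), X2(ok, b));
        cor(a, b) = c(1, 2);
    end
end
```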


normc

normc - Normalize columns of a matrix.

Syntax

normc(M)

Description

NORMC(M) normalizes the columns of M to a length of 1.

Examples

m = [1 2; 3 4]
n = normc(m)

See also NORMR
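
When "normc" is not available, the same normalisation can be written with elementary operations. The sketch below applies it to the matrix of the example:

```matlab
m = [1 2; 3 4];
len = sqrt(sum(m .^ 2, 1));               % length of each column
n = m ./ repmat(len, size(m, 1), 1);      % columns scaled to length 1
```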


normed_pca

normed_pca - PCA with normalisation of data

function[pcatype]=normed_pca(X)

Assesses principal component analysis on normalised data

Input arguments:
---------------
X: SAISIR matrix

Output arguments:
----------------
pcatype with fields:
score :PC score
eigenvec :eigenvectors (loadings)
eigenval :eigenvalues
average :average observation
var_score :scores of the variables
std: standard deviations of the columns of X
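
A PCA on normalised data can be sketched in plain MATLAB with a singular value decomposition. The sketch assumes that the columns are centered and divided by their standard deviation (n-1 denominator) and that the eigenvalues are divided by n-1; the conventions of "normed_pca" may differ. X is a toy matrix.

```matlab
X = [2 1; 4 3; 6 2; 8 6];
n = size(X, 1);

average = mean(X, 1);                  % average observation
sd = std(X, 0, 1);                     % standard deviations of the columns
Z = (X - repmat(average, n, 1)) ./ repmat(sd, n, 1);

[U, S, V] = svd(Z, 'econ');
score = U * S;                         % PC scores
eigenvec = V;                          % eigenvectors (loadings)
eigenval = diag(S) .^ 2 / (n - 1);     % eigenvalues
```

With these conventions the eigenvalues sum to the number of variables, here 2.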


norm_col

norm_col - divides each column by the corresponding standard deviation

function [saisir] = norm_col(saisir1,(mode))

Divides each column by the corresponding standard deviation.
mode (optional): 0 or 1, standard deviation computed with division by n-1 or by n respectively
default : 1
OBSOLETE: prefer the function "standardize"
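
The operation can be sketched in base MATLAB on a plain matrix as follows. This is only an illustration of the description above, not the code of norm_col; the name stdmode is hypothetical and plays the role of the "mode" argument (the second argument of the MATLAB function std follows the same 0/1 convention).

```matlab
% Sketch on a plain MATLAB matrix (not a SAISIR structure).
X = [1 10; 2 20; 3 60];
stdmode = 1;                         % 0: division by n-1, 1: division by n
s = std(X, stdmode, 1);              % standard deviation of each column
Xn = bsxfun(@rdivide, X, s);         % each column divided by its standard deviation
disp(std(Xn, stdmode, 1))            % every column now has unit standard deviation
```

Note that the columns are scaled but not centred.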


nuee

nuee - Nuée dynamique (K-means)

function[res]=nuee(X,ngroup,(nchanged))

Clusters the data into ngroup groups according to the K-means method ("nuée dynamique").

Input arguments
==============
X: SAISIR data matrix
ngroup (integer): number of groups asked for
nchanged (optional): the iteration stops as soon as no more than nchanged
group assignments have changed in the previous iteration.
This saves some computation time. nchanged must be small in comparison with
the number of rows of X

Output argument
==============
res with fields
group: SAISIR vector of groups. Observations with the same group number
have been classified in the same group.
centre: barycenter of the groups.

Warning: the function may reduce the number of groups


num2str1

num2str1 - Justified num2string

function str=num2str1(vector,ndigit);

Input arguments:
===============

vector : Matlab vector of integers
ndigit : positive integer

This function transforms numbers into matrices of char.
The function justifies the strings by adding zeros.

If the first argument is a row vector, it is transposed

%Example
%=======
x=[1 2 100];
x1=num2str1(x,5);
x1
% 00001
% 00002
% 00100
The main use of this function is to help build smart row names
in SAISIR matrices using the system of extractable fields in the names.
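
For readers who want to see what the zero-padding amounts to, here is a minimal base-MATLAB sketch that produces the same kind of char matrix as the example above. It is an illustrative equivalent, not the source of num2str1, and it assumes that no number has more than ndigit digits.

```matlab
% Hypothetical base-MATLAB equivalent of the zero-padding step.
x = [1 2 100];
ndigit = 5;
fmt = sprintf('%%0%dd', ndigit);       % builds the format string '%05d'
x1 = repmat('0', numel(x), ndigit);    % preallocate the matrix of char
for i = 1:numel(x)
    x1(i,:) = sprintf(fmt, x(i));      % one zero-padded string per row
end
disp(x1)
```

Because every row has the same width, the result can be concatenated column-wise with other char matrices to build row identifiers.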


pca

pca - principal component analysis on raw data

function [pcatype]=pca(X,(var_score))

Computes principal component analysis (on non-normalised data)

Input arguments:
---------------
X: SAISIR matrix
var_score : optional (0: only the scores of the observations;
1: also gives the scores of the variables; default: 0)

Output arguments:
----------------
pcatype with fields:
score :PC score
eigenvec :eigenvectors (loadings)
eigenval :eigenvalues
average :average observation
var_score : (if input argument "var_score" is defined) scores of the variables
NOTE: the weights of the observations are equal to 1/(number of rows)
All the possible scores are calculated

SEE ALSO : normed_pca, cumulate_covariance, covariance_pca,
correlation_plot, apply_pca

%typical example:
%spectra: n x p, chemistry n x k
p=pca(spectra);%% PCA
map(p.score,1,2);%% PC plot 1-2
correlation_plot(p.score,1,2, chemistry);%% correlation with chemistry
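
As an illustration of what the output fields contain, the following sketch computes a PCA on raw (centred, non-normalised) data with plain MATLAB matrices. It assumes observation weights of 1/n as stated in the note above and uses a singular value decomposition; the algorithm actually used by "pca", and the sign of the eigenvectors, may differ. Variable names mirror the field names but are illustrative.

```matlab
% Sketch of a PCA on raw data with plain MATLAB matrices.
X = [2 0 1; 4 1 3; 6 1 2; 8 2 6; 5 3 3];
n = size(X,1);
average = mean(X,1);                    % average observation
Xc = bsxfun(@minus, X, average);        % centred data
[U,S,V] = svd(Xc, 'econ');              % Xc = U*S*V'
eigenvec = V;                           % loadings, one component per column
eigenval = (diag(S).^2) / n;            % assumes weights of 1/n per observation
score = Xc * eigenvec;                  % PC scores of the observations
disp(eigenval')
```

With this scaling, the variance of each column of score (computed with division by n) equals the corresponding eigenvalue.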


pca1

pca1 - assesses principal component analysis on raw data (case nrows>ncolumns)

This function is not to be called directly.

Use "pca" or "normed_pca"


pca2

pca2 - computes principal component analysis on raw data (case nrows>ncolumns)

This function is not to be called directly.

Use "pca" or "normed_pca"


pcareconstruct

pcareconstruct - reconstructs original data from a PCA model and a matrix of scores

function[res]=pcareconstruct(pcatype,score,nscore)

Input arguments:
===============
pcatype: output argument of function "pca"
score: SAISIR matrix of scores (computed from the same pcatype model)
nscore: number of components involved in the reconstruction of the data

Output argument:
================
res: SAISIR matrix of reconstructed data. The quality of the
reconstruction depends on the number of scores introduced (variable
"nscore").

From the previously computed scores, the function rebuilds the original data matrix.
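
The reconstruction can be sketched with plain MATLAB matrices as the product of the first nscore scores by the transposed eigenvectors, plus the average observation. This is a sketch under the assumption that the model was computed on centred, non-normalised data; it is not the code of pcareconstruct.

```matlab
% Sketch: rebuilding the data from PC scores (plain MATLAB matrices).
X = [2 0 1; 4 1 3; 6 1 2; 8 2 6; 5 3 3];
average = mean(X,1);
Xc = bsxfun(@minus, X, average);
[U,S,V] = svd(Xc, 'econ');              % V plays the role of the eigenvectors
score = Xc * V;                         % PC scores
nscore = 2;                             % number of components kept
Xhat = bsxfun(@plus, score(:,1:nscore) * V(:,1:nscore)', average);
err2 = norm(X - Xhat, 'fro');           % reconstruction error with 2 components
Xfull = bsxfun(@plus, score * V', average);
errfull = norm(X - Xfull, 'fro');       % practically zero with all components
disp([err2 errfull])
```

The error decreases as nscore grows and vanishes (up to rounding) when all the components are used.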


pca_cano

pca_cano - generalized canonical analysis after PCAs on each table

function res=pca_cano(collection,ndim,graph);

Input arguments:
===============
collection: array of SAISIR matrices with the same number of rows (see
below)
ndim : dimension of each individual PCA (must be less than the smallest
number of variables)
graph : if different from 0, displays examples of graphs

Let n be the number of observations in each table and k the number of tables.

Output argument:
===============
res with fields
compromise : PCA giving the compromise
observation_score : scores of each observation of each table (n*k rows)
id_group : groups identifying each observation in observation_score (for graph)
projector : struct array giving the vectors allowing the projection of each data set
score_correlation : correlations between compromise scores and table scores
table_average : struct array giving the average of each original data table

Adapted from: G. Saporta, Probabilités, analyse des données et statistiques,
Editions Technip, page 192 and following.

The function first computes a PCA on each of the SAISIR matrices in
"collection". Only the first ndim PC scores are kept for each matrix.
Canonical analysis is then carried out on the series of scores.
This procedure avoids inverting the original matrices. The use
of few scores guarantees that the computation is feasible, even if the
original matrices have collinear variables.

Note :
the argument collection is obtained for example by
collection(1)=data1;collection(2)=data2; ...;collection(k)=datak;
In which data1, data2, ..., datak are SAISIR matrices with the same number
of rows.


pca_cross_ridge_regression

pca_cross_ridge_regression - PCA ridge regression with crossvalidation

function[res]=pca_cross_ridge_regression(X,y,krange,selected)

Input arguments:
===============
X: SAISIR matrix (n x p) of predictive variables
y: SAISIR vector (n x 1) of the variable to be predicted
krange: MATLAB vector of integers (1 x q)
selected: MATLAB vector (n x 1) with elements = 0 (selected in
calibration)
or 1 (selected in validation)

Output argument:
==============
res with fields:
predy: predicted y in the validation set (q columns)
obsy: observed y of the validation set (1 column)
r2: r2 between observed and predicted values in the validation set
(vector with q elements)
rmsecv: root mean square error of validation (1 x q)
ridgetype: calibration model (see function "pca_ridge_regression")

This function divides a collection into a calibration set and a validation set
using the input argument "selected"
and applies the pca_ridge_regression models to the validation set.

The function calculates as many ridge regression models as the number of
elements in "krange".
In ridge regression, the product X'X is replaced by X'X + kI.
In the present function, k is in fact the eigenvalue of the corresponding PCA component.
For example, if krange = [1 3 5], the tested values of k
are the eigenvalues #1, #3 and #5.
The rationale is that a suitable k in ridge regression is very difficult to find;
it is a good idea to test values in the range of the observed eigenvalues.
Typical example
===============
%Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1) the variable to be predicted.
myrange=1:10; %% testing the eigenvalues from 1 to 10
n=size(DATA.d,1);
sel=random_select(n, round(n/3));%% 1/3 in validation
[res]=pca_cross_ridge_regression(DATA,y,myrange,sel);%% testing the 10 first
%eigenvalues
xy_plot(res.predy,10, res.obsy,1);%% display of the 10th model

%See also: pca_ridge_regression, ridge_regression, ridge_regression1,
%apply_ridge_regression


pca_ridge_regression

pca_ridge_regression - Basic ridge regression after PCA

function[ridgetype]=pca_ridge_regression(pcatype,y,range)

Input arguments
===============
pcatype: structure, output argument of function "pca"
y: variable to be predicted (n x 1)
range: MATLAB vector of positive integers (1 x q)

Output arguments:
================
ridgetype with fields:
beta: coefficients of the model (p x q), applicable on X
krange: (1 x q) values of the coefficient k of ridge regressions
averagex: average of original predictive variables (1 x p)
averagey: average of y (1 x 1)
rmsec: root mean square error of calibration (1 x q)
predy: y predicted (n x q)
corr: correlation coefficient (1 x q) between predicted and observed
values

The function calculates as many ridge regression models as the number of
elements in "range".
In ridge regression, the product X'X is replaced by X'X + kI.
In the present function, k is in fact the eigenvalue of the corresponding PCA component.
For example, if range = [1 3 5], the tested values of k
are the eigenvalues #1, #3 and #5.
The rationale is that a suitable k in ridge regression is very difficult to find;
it is a good idea to test values in the range of the observed eigenvalues.
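
The idea of taking k among the eigenvalues can be sketched on plain MATLAB matrices as follows. This is a minimal illustration, assuming that the eigenvalues are those of the centred cross-product matrix Xc'Xc; the scaling of the eigenvalues actually used by SAISIR is not stated here and may differ (for example by a factor 1/n). The name kindex is hypothetical and plays the role of the "range" argument.

```matlab
% Sketch: ridge regression with k chosen among the eigenvalues of Xc'*Xc.
X = [2 0 1; 4 1 3; 6 1 2; 8 2 6; 5 3 3; 1 2 2];
y = [1; 3; 3; 7; 4; 1];
Xc = bsxfun(@minus, X, mean(X,1));      % centred predictors
yc = y - mean(y);                       % centred response
G = Xc'*Xc;
ev = sort(eig(G), 'descend');           % eigenvalues, largest first (assumed scaling)
kindex = [1 3];                         % use eigenvalues #1 and #3 as k
p = size(X,2);
beta = zeros(p, numel(kindex));
for j = 1:numel(kindex)
    k = ev(kindex(j));
    beta(:,j) = (G + k*eye(p)) \ (Xc'*yc);   % X'X replaced by X'X + kI
end
disp(beta)
```

The first eigenvalues give large values of k (strong shrinkage of beta), the last ones give values close to ordinary least squares.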

%Typical example
%===============
%Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1) the variable to be predicted.
mypca=pca(DATA);
myrange=1:20; %% testing the eigenvalues from 1 to 20
res=pca_ridge_regression(mypca,y,myrange);
xdisp(res.rmsec.d);%% displaying (for example) the errors

See also ridge_regression, ridge_regression1, apply_ridge_regression,
pca_cross_ridge_regression.


pca_stat

pca_stat - Gives some complementary statistics on PCA observations

function res=pca_stat(pca_type, comp1, comp2);

Input arguments:
===============

pca_type: output argument of function "pca"
comp1, comp2: indices of the PCA components to be analyzed

Output argument:
===============
res: SAISIR matrix with 7 columns: QLT, col1, CO2col1, CTRcol1, col2, CO2col2, CTRcol2
QLT : squared cosine with the plane (quality of the representation of the
observations)
CO2col1 and CO2col2 : squared cosine of the angle between the observation and the axis
We have QLT = CO2col1 + CO2col2
CTRcol1 and CTRcol2 : contribution of the observation to the component.

From G.Saporta, Probabilités analyse des données et statistiques, Ed Technip, page 182
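
These statistics can be sketched with plain MATLAB matrices using the usual definitions: the squared cosine is the squared score divided by the squared distance of the observation to the average, and the contribution is the weighted squared score divided by the eigenvalue. The sketch assumes weights of 1/n and no rescaling (for example to percentages); the output of pca_stat may be scaled differently.

```matlab
% Sketch: squared cosines (CO2), quality (QLT) and contributions (CTR).
X = [2 0 1; 4 1 3; 6 1 2; 8 2 6; 5 3 3];
n = size(X,1);
Xc = bsxfun(@minus, X, mean(X,1));
[U,S,V] = svd(Xc, 'econ');
score = Xc * V;                                   % PC scores
eigenval = diag(S).^2 / n;                        % assumes weights of 1/n
comp1 = 1;
comp2 = 2;
d2 = sum(Xc.^2, 2);                               % squared distance to the average
CO2 = bsxfun(@rdivide, score.^2, d2);             % squared cosines, all components
CTR = bsxfun(@rdivide, score.^2 / n, eigenval');  % contributions, all components
QLT = CO2(:,comp1) + CO2(:,comp2);                % quality on the plane comp1-comp2
disp([QLT CO2(:,comp1) CTR(:,comp1) CO2(:,comp2) CTR(:,comp2)])
```

For each component the contributions sum to 1 over the observations, and for each observation the squared cosines sum to 1 over all the components.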

%Typical example:
p=pca(DATA);
res=pca_stat(p,1,2);%% stats for components #1 and #2
saisir2excel(res,'pca results');%% to be looked at with Excel


pcr

pcr - PCR (components introduced in the order of eigenvalues)

function [pcrtype]=pcr(X,y,maxdim)

Builds a basic PCR model
Input arguments
---------------
X : SAISIR matrix (n x p)
y : SAISIR vector (n x 1)
maxdim : maximal dimension of the model (integer)

Output arguments
----------------
pcrtype with fields
pca: Structure giving the PCA results (see function "pca")
beta: regression coefficients APPLICABLE ON THE PC SCORES (ndim x 1)
predy: predicted y up to ndim dimensions (n x ndim)
r2: determination coefficients between predicted and observed values (1 x
ndim)
averagey: mean of y (number)
beta1:regression coefficients APPLICABLE ON THE X DATA centred (ndim x p)
obsy: observed y (copy of input argument y)
rmsec: root mean square error of calibration (n - dim - 1 degrees of freedom)


pcr1

pcr1 - Basic model of PCR (components introduced in the order of eigenvalues)

function [pcrtype]=pcr1(X,y,dim)

Input arguments:
===============
X: SAISIR matrix (n x p) of predictive data
y: SAISIR matrix (n x k) of variables to be predicted (k variables)
dim: (integer) dimension of the model

Output argument
===============
pcrtype with fields:
pca: result of pca applied on X (see function "pca")
beta: coefficients of the models (p x k) obtained with "dim" dimensions
averagex: average of x (1 x p)
averagey: average of y (1 x k)
info: 'predicting several y with xxx dimensions'
r2: determination coefficient for the ys (1 x k)
predy: predicted y (n x k).

This function makes models for all the ys which are in "y" . The models
are built only with "dim" dimensions.

See also : pcr (only one y, but the dimensions are scanned),apply_pcr


plotmatrix1

plotmatrix1 - biplots of columns of matrices with colors

function plotmatrix1(s,startpos,endpos,charsize)


pls

pls - PLS regression (Partial Least Squares).

function [mtBpls,mtYp,mtT,tbeta] = pls(mtX,mtY,nbdim)

No direct use in SAISIR

Input arguments
================
mtX Matlab matrix of centered predictive data
mtY Matlab matrix of Y-variables (centered).
nbdim dimension of the PLS model

Output arguments
=================
mtBpls PLS regression coefficients
mtYp predicted Y
mtT PLS scores


pls2obs

pls2obs - PLS regression with many observations

function [mtBpls,mtT, tBeta] = pls2obs(mtX,mtY,nbdim)

This function is normally not directly called


pls2var

pls2var - PLS regression with many variables

function [mtBpls,mtT, tBeta] = pls2var(mtX,mtY,nbdim)

This function is normally not directly called


plsda

plsda - Pls discriminant analysis following the saisir format

function[plsdatype]=plsda(X,group,ndim)

Input arguments:
===============
X : SAISIR matrix of predictive variables (n x p)
group: SAISIR (n x 1) vector of integers of groups. Observations with
the same number in group belong to the same group (missing groups not allowed)
ndim: number of dimensions in the PLS model

Output arguments
================
plsdatype with fields
beta :coeff for predicting the indicator matrix
beta0 :intercept for predicting the indicator matrix
t :PLS latent variable
predy :predicted indicator matrix
classed :predicted groups according to method #0
ncorrect :number of rightly classified samples according to method #0
(attribution to index of max of predicted Y)
confusion :confusion matrix according to method #0
ncorrect1 :number of rightly classified samples according to method #1
(mahalanobis distance on latent variable t)
confusion1 :confusion matrix according to method #1
tbeta :coeff for predicting the latent variables t
tbeta0 :intercept for predicting the latent variables t
linear :linear form for direct prediction of group
linear0 :the min of x'*linear + linear0 gives the predicted group
(this is equivalent with considering Mahalanobis distances)


quaddis

quaddis - Quadratic discriminant analysis

function quaddis_type=quaddis(x,group);

Quadratic discriminant analysis (training)
A multinormal distribution is assumed in each qualitative group

Input args:
============
x: predictive data set (matrix n x p)
group: qualitative groups (matrix n x 1) with integers ranging from 1
to the maximum number of groups (gmax)

Output args:
============
quaddis_type with fields:
ncorrect100: percentage of correct classification (number)
confus : confusion matrix (gmax x gmax)
mean : means according to each group (gmax x p)
predgroup : predicted groups (integer) (n x gmax)
density : pseudo-densities of each observation (n x gmax)
proba : probability of belonging to a given group (n x gmax)

model : predictive model with fields:
inv: Matlab matrices of Mahalanobis metrics (cube p x p x gmax)
Mut: Matlab matrices of means according to each group (gmax x p)
det: Matlab vector of determinants of covariance matrices of each
group (1 x gmax)
See also apply_quaddis, crossval_quaddis


quickpls

quickpls - Quick PLS regression from 1 to ndim dimensions

function [plstype]=quickpls(X,y,ndim)

Input arguments:
===============
X: SAISIR matrix (n x p) of predictive data
y: SAISIR vector (n x 1) of variable to be predicted
ndim: maximum dimensions of the model

output argument:
===============
plstype with fields:
BETA: regression coefficient of the models (p x ndim)
BETA0:intercept of the models (p x 1)
PREDY: predicted y for all the models (n x ndim)
T: PLS scores (n x ndim)
RMSEC: root mean square error of calibration (1 x ndim)
r2: r2 coefficient (1 x ndim)

This function calculates the PLS models for 1 to ndim dimensions.
All the models are kept.
The algorithm is not the NIPALS algorithm, but another one which is
faster.

This function makes use of the function "pls" (normally in the directory "pls").

See also: basic_pls, basic_pls2 (slower but giving more complete outputs)


randomize

randomize - Builds a matrix X1 containing the rows of X in random order

function X1=randomize(X)

Input argument
=============
X: SAISIR matrix (n x p)

Output argument:
===============
X1: SAISIR matrix (n x p) with the rows randomly allocated

This function randomly changes the rank (indices) of the observations.
This is useful for validation tests, when comparing the results with those
obtained by chance.


random_saisir

random_saisir - Creation of a random matrix

function[X]=random_saisir(nrow,ncol)

Input arguments
===============
nrow, ncol: integers (number of rows and columns of the resulting matrix)

Output argument
===============
X: matrix of random elements (nrow x ncol)


random_select

random_select - building a vector of random elements 0 or 1

function[selected]=random_select(nel, nselect, (nrepeat))

Input arguments:
===============
nel: (integer) number of elements in the output vector "selected"
nselect: (integer smaller than nel) number of elements taking the value 1
nrepeat: (integer, optional) number of consecutive replicates.

Output argument:
===============
selected: MATLAB vector with nel elements equal to 1 or 0

This function builds a MATLAB vector of nel elements with nselect
elements equal to 1 in random positions, and (nel-nselect) equal to 0.
If nrepeat (optional) is given, nselect values are still randomly selected, but in blocks of nrepeat consecutive elements.
For example, if nrepeat = 3, a possible result is [0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 ...]

This function is useful for dividing a collection into two sets, for
example in many functions of SAISIR allowing a validation test.
The case with "nrepeat" defined corresponds to the situation in which the
replicates are in equal numbers and consecutive in the data collection.

Typical use: randomly building a calibration and validation set
==============================================================
[n,p]=size(DATA.d);
sel=random_select(n,round(n/3));%% A third in validation
cal=selectrow(DATA,sel==0);%% building the calibration set
val=selectrow(DATA,sel==1);%% building the validation set
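
The selection itself can be sketched in base MATLAB with randperm. This is an illustration, not the code of random_select. The block variant assumes that nel and nselect are both multiples of nrepeat and that nselect counts individual elements; how random_select interprets nselect when nrepeat is given is not specified above.

```matlab
% Sketch: random 0/1 selection vector, with and without blocks of replicates.
nel = 12;
nselect = 6;
perm = randperm(nel);                    % random ordering of 1..nel
selected = zeros(nel,1);
selected(perm(1:nselect)) = 1;           % nselect random positions set to 1

% Block variant (assumption: nel and nselect are multiples of nrepeat)
nrepeat = 3;
nblock = nel / nrepeat;                  % number of blocks of replicates
bperm = randperm(nblock);
bsel = zeros(1,nblock);
bsel(bperm(1:nselect/nrepeat)) = 1;      % select whole blocks at random
selected_rep = reshape(repmat(bsel, nrepeat, 1), [], 1);  % expand each block
disp([selected selected_rep])
```

Selecting whole blocks keeps all the replicates of a sample in the same set, which avoids over-optimistic validation results.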

See also: random_splitrow


random_splitrow

random_splitrow - random selection of rows

function[X1,X2]=random_splitrow(X, nselect)

Input argument
=============
X: SAISIR matrix (n x p)
nselect: integer, less than n.

Output arguments
================
X1, X2: SAISIR matrices of the resulting split

This function randomly divides a matrix into two matrices:
X1 with nselect rows and X2 with n-nselect rows

Typical use : building a calibration and a validation set
========================================================
[n,p]=size(DATA.d);
[cal,val]=random_splitrow(DATA, round(n*2/3));%% two thirds in calibration
% cal and val are respectively the calibration and validation sets


readexcel1

readexcel1 - reads an excel file in the .CSV format (creates a 3-way character matrix).

function [data] = readexcel1(filename,(nchar),(deb),(xend))

!!! NO DIRECT USE. Use function "excel2saisir"

Input arguments
===============
filename: name of an excel file saved in the .CSV format.
nchar: length of the element data(i,:,j), i.e. the number of
characters which are kept (default: 20)
deb, xend: first and last rows which are loaded (default: all)

Output argument:
===============
data: a 3-way array data(row,pos,col)
where row is the excel row, col the excel column,
and pos the position of the character in the string

If the string is shorter than nchar, it is padded with white space.
If the string is longer than nchar, the end of the string is lost.
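
The padding and truncation rule can be illustrated in base MATLAB as follows. This is a sketch of the rule stated above, not the code of readexcel1; the strings and variable names are illustrative.

```matlab
% Sketch: forcing strings to a fixed width of nchar characters.
nchar = 6;
s1 = 'abc';                                        % shorter than nchar
s2 = 'abcdefghij';                                 % longer than nchar
f1 = [s1(1:min(numel(s1),nchar)), blanks(max(0, nchar - numel(s1)))];
f2 = [s2(1:min(numel(s2),nchar)), blanks(max(0, nchar - numel(s2)))];
disp(['[' f1 ']'])                                 % padded with white space
disp(['[' f2 ']'])                                 % truncated
```

A value of nchar that is too small therefore silently loses the end of long cells, so it should be chosen according to the longest expected entry.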

This is a first step for decoding data coming from excel

%Example:
%========
mywork=readexcel1('work1.csv',15);

See also: excel2saisir, saisir2excel


readident

readident - loads a file of strings

function [ident, nident] = readident(filename,namesize)

Input arguments
==============
filename: (string) file name of the text file in the current directory
namesize: (integer) maximum size of the string to be read (default : 10)

Output arguments:
================
ident: matrix of char
nident:number of rows in ident (identifiers)

Loads an array of strings into a matrix of char
namesize gives the maximum number of characters in each string

Main use: loading identifiers of rows and variables from a text file


regression_score

regression_score - build a factorial space for regression

function res=regression_score(x,beta,(y))

------------------
Input arguments:
===============
x : data matrix (n x p)
beta : vector of regression coefficients (p x 1)
y :(optional) known y value

Output argument:
===============
res with fields
score : regression scores (also y split)
reconstructed_norm2: squared norms of the scores
cumulated_norm2: cumulated squared norms of the scores
projector: matrix such that score=x*projector
eigenvec_sum: sum of the eigenvectors of PCA (linked to the theory)
xmean: mean of x
r2 : if (y defined) r2 of the cumulated model

Given a data matrix x and the regression coefficients beta,
the function builds a matrix of orthogonal scores
such that the predicted y is equal to the sum of these scores.
The scores can be used to examine the observations "oriented" towards the
prediction of y.
As the scores are ranked according to their ability to predict y,
it is possible to examine the observations starting with the first scores.


reorder

reorder - reorders the data of files A1 and A2 according to their identifiers

function [B1 B2]=reorder(A1,A2)

Input arguments:
===============
A1, A2: SAISIR matrices in which the rows have at least some identifiers in
common

Output arguments
================
B1, B2: reordered matrices.

This function realigns the rows of A1 and A2 so that
their identifiers correspond.
This is necessary for any predictive method (particularly regressions).
The function discards the observations which are not present in both A1 and A2.
The matrix B1 corresponds to A1 and matrix B2 to A2.
Fails if A1 or A2 contains duplicate row identifiers.
A2 is the leader (B1 is as close as possible to the order of A2)

%Typical example:
%===============
%Let X and y be matrices to be reordered
[X1, y1]=reorder(X,y);
%In X1 and y1 the rows now have the same identifiers (with possibly some
%loss of observations).
%
%If the function fails because some identifiers are duplicated, use the
%function "check_name" to identify these duplicated identifiers, and remove some of them
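
The realignment can be sketched in base MATLAB with ismember on the identifiers. The sketch uses cell arrays of strings and plain matrices for brevity (a SAISIR matrix of char identifiers can be converted with cellstr); it follows the order of the second table, in line with the statement that A2 is the leader, but it is not the code of reorder.

```matlab
% Sketch: realigning two tables on their common row identifiers.
id1 = {'s3'; 's1'; 's2'; 's5'};
A1 = [30; 10; 20; 50];
id2 = {'s1'; 's2'; 's4'; 's3'};
A2 = [1; 2; 4; 3];
[found, loc] = ismember(id2, id1);   % for each row of A2, its position in A1 (0 if absent)
B2 = A2(found,:);                    % rows of A2 that also exist in A1, in the order of A2
B1 = A1(loc(found),:);               % matching rows of A1, in the same order
idB = id2(found);                    % common identifiers
disp([B1 B2])
```

Rows 's5' (only in the first table) and 's4' (only in the second) are discarded, which is the loss of observations mentioned above.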


repeat_string

repeat_string - builds a matrix of char by repeating a string

function str1=repeat_string(str,ntimes);

Input arguments:
===============
- str : a character string
- ntimes: number of repetitions

Output argument:
================
- str1: the matrix of char with the repeated string.

Example:
>> repeat_string('Vanessa',3)
ans =
Vanessa
Vanessa
Vanessa
Useful for building identifiers in SAISIR

See also: addcode


ridge_regression

ridge_regression - Basic ridge regression

function [ridgetype]=ridge_regression(X,y,krange)

Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1)
krange: MATLAB vector of k-values to be tested in the ridge regression

Let ntest = length(krange)

Output argument:
===============
ridgetype with fields
beta: beta coefficients associated with the ntest k-values as defined in "krange"
averagex: average of X
averagey: average of y
rmsec: Root mean square error of calibration (ntest x 1 )
predy: predicted y for each test k-value (n x ntest)
r2: r2 for each tested k-value (ntest x 1);


ridge_regression1

ridge_regression1 - Basic ridge regression at a given norm

function [ridgetype]=ridge_regression1(X,y,normrange)

ONLY ONE VARIABLE TO BE PREDICTED (the norms are scanned)
Returns as many beta as the number of elements in normrange

Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1)
normrange: tested range of norms of beta (MATLAB vector of positive doubles)

Let ntest = length(normrange)

Output argument:
===============
ridgetype with fields
beta: beta coefficients associated with the ntest norms as defined in "normrange"
averagex: average of X
averagey: average of y
rmsec: Root mean square error of calibration (ntest x 1 )
predy: predicted y for each test k-value (n x ntest)
r2: r2 for each tested norm of beta (ntest x 1);
k : MATLAB vector of resulting k values (ntest x 1)
expected norm: MATLAB vector of expected norms (copy of normrange)

This function carries out as many ridge regressions as the number of
elements in "normrange".
Rather than (as usual) trying to find the k-value of the ridge regression, here
the norm of the regression coefficients beta is directly the
adjusted value. To each norm corresponds a value of k.
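
One way to find the k that yields a requested norm of beta is a bisection, since the norm of the ridge coefficients decreases as k grows. The sketch below works on plain MATLAB matrices and is only one possible approach under that assumption; the search method actually used by ridge_regression1 is not documented here.

```matlab
% Sketch: finding k such that norm(beta) equals a target norm, by bisection.
X = [2 0 1; 4 1 3; 6 1 2; 8 2 6; 5 3 3; 1 2 2];
y = [1; 3; 3; 7; 4; 1];
Xc = bsxfun(@minus, X, mean(X,1));
yc = y - mean(y);
p = size(X,2);
G = Xc'*Xc;
g = Xc'*yc;
olsnorm = norm(G \ g);                   % OLS norm: largest reachable norm (k = 0)
target = olsnorm / 2;                    % requested norm of beta
klow = 0;
khigh = 1;
while norm((G + khigh*eye(p)) \ g) > target
    khigh = 2*khigh;                     % widen the bracket until the norm is below target
end
for it = 1:100                           % bisection on k
    k = (klow + khigh) / 2;
    if norm((G + k*eye(p)) \ g) > target
        klow = k;                        % norm still too large: increase k
    else
        khigh = k;                       % norm small enough: decrease k
    end
end
beta = (G + k*eye(p)) \ g;
disp([k norm(beta) target])
```

Any target above the OLS norm cannot be reached, which is why the OLS norm is the natural upper bound for the values in "normrange".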

%Typical example:
%===============
ridgetype=ridge_regression1(X,y,[100, 200]);
%The function displays the Ordinary Least Squares norm of beta
%"OLS norm = 1234.5678"
%This value gives the maximum possible value of the norm
%For example, testing half this norm
ridgetype=ridge_regression1(X,y,1234.5678/2);

See also: ridge_regression


row_center

row_center - subtracts the average row from each row

function [X] = row_center(X1)


saisir2ascii

saisir2ascii - Saves a saisir file into a simple ASCII format

function saisir2ascii(X,filename,separator)

Input arguments
===============
X: SAISIR matrix to be saved
filename: (string) name of the saved file
separator:(string with a single char) separator character

Output argument
===============
none

Transforms a saisir file into a simple .txt file and saves it on disk.
separator is a single character like ' ' or ';' or its ASCII code.
The extension '.txt' is added to the filename.

%Typical example:
%===============
saisir2ascii(data,'mydata',';');
%Saves the SAISIR matrix "data" under the name "mydata.txt", with ";" as separator


saisir2excel

saisir2excel - Saves a saisir file in a format compatible with Excel

function saisir2excel(X,filename)

Input arguments
===============
X: SAISIR matrix to be saved
filename: (string) name of the saved file

Output argument
===============
none

Transforms a saisir file into a simple .CSV file and saves it on disk.
The separator is ";"
The extension '.csv' is added to the filename.

%Typical example:
%===============
saisir2excel(data,'mydata');
%Saves the SAISIR matrix "data" under the name "mydata.csv", with ";" as separator
%This file can be read by Excel


saisirpls

saisirpls - PLS regression with "dim" dimensions

function [plstype]=saisirpls(X,Y,dim)

Input arguments
===============
X: SAISIR matrix of predictive data (n x p)
Y: SAISIR matrix of variables to be predicted (n x k)
dim: (integer) number of dimensions asked

Output arguments
===============
plstype with fields
beta: regression coefficients of the model (p x k)
beta0: intercept of the models (1 x k)
predy: predicted values (n x k)
T: PLS scores of the PLS2 regression model (n x dim)
correlation: correlation coefficient (1 x k)

This function builds a PLS2 model.
Several variables can be predicted, but only the model with "dim" dimensions is tested.

Prefer basic_pls or basic_pls2.

%Typical example
%===============
%Let DATA be dimensioned (n x p)
%Let Y be dimensioned (n x k)
plstype=saisirpls(DATA,Y,10);
%Builds the models with 10 dimensions for the k variables in Y


saisir_check

saisir_check - Checks if the data respect the saisir structure

function check=saisir_check(X)

Input argument:
==============
X: (expected) SAISIR matrix

Output argument:
===============
check:
check = 1 if X is in the SAISIR format (no warning)
check = 2 if X is in the SAISIR format (with warning)
check = 0 if X is not in the SAISIR format (fatal error)

The function tests if the input argument X is a valid SAISIR structure
and gives some information.
If X is a valid structure, also signals (as warning) if there are missing values,
identical rows or columns (which may be the sign of something wrong)

Useful to see if X is a valid '.d','.i','.v' structure.


saisir_derivative

saisir_derivative - n-th order derivative using the Savitzky-Golay coefficients

[X]=saisir_derivative(X1,polynom_order,window_size,derivative_order)

Input arguments:
===============
X1:SAISIR matrix (n x p)
polynom_order:(integer) order of the fitting polynomial
window_size:(integer) number of data points involved in the calculation
derivative_order: (integer, normally 1 or 2) order of the derivative

Output argument:
================
X : transformed data matrix (n rows)

The function assumes that X1 is a matrix of digitized signals (such as
spectra) with constant intervals of digitization.

Example:
=======
res=saisir_derivative(DATA,3,21,2);
%Computes the second derivative using a polynomial of degree 3 as model
%and a window size of 21
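
To show where Savitzky-Golay coefficients come from, the sketch below derives the filter weights by a least-squares polynomial fit over the window and applies them to one signal. It is a generic illustration in base MATLAB, assuming a unit sampling interval and leaving the edges of the signal undefined; the treatment of the edges and the scaling used by saisir_derivative are not described above and may differ.

```matlab
% Sketch: Savitzky-Golay derivative weights from a least-squares polynomial fit.
polynom_order = 3;
window_size = 21;
derivative_order = 2;
half = (window_size - 1) / 2;
t = (-half:half)';                               % positions within the window
A = zeros(window_size, polynom_order + 1);
for j = 0:polynom_order
    A(:, j+1) = t.^j;                            % polynomial basis 1, t, t^2, ...
end
C = pinv(A);                                     % least-squares fitting operator
w = factorial(derivative_order) * C(derivative_order + 1, :);  % filter weights
x = 0:40;
signal = x.^2;                                   % test signal: second derivative is 2
p = numel(signal);
d = NaN(1, p);                                   % edges are left undefined in this sketch
for i = half+1 : p-half
    d(i) = w * signal(i-half:i+half)';           % derivative at the centre of the window
end
disp(d(half+1:half+5))
```

Applied to every row of a matrix of spectra, the same weights give the derivative spectra; a larger window smooths more but blurs narrow peaks.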


saisir_linkage

saisir_linkage - computes a simple linkage vector from a matrix of distances

function z=saisir_linkage(dis)

Input argument
=============
dis: a SAISIR matrix of distances (n x n, symmetric)

Output argument
==============
z: z vector as required by the MATLAB function "dendrogram"

From a complete square matrix of distances, the function
extracts the "unfolded" triangular matrix, which is passed to the MATLAB function "linkage"
with the option "ward".
Returns the z vector as required by the MATLAB "dendrogram" function.
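
The "unfolding" step can be sketched in base MATLAB as below: the strict lower triangle of the square matrix is read column by column, which gives the pair order (2,1), (3,1), ..., (3,2), ... expected by "linkage". The calls to linkage and dendrogram are left as comments because they require the Statistics Toolbox; the sketch is illustrative and is not the code of saisir_linkage.

```matlab
% Sketch: unfolding a square distance matrix into the vector expected by linkage.
D = [0 1 4; 1 0 2; 4 2 0];           % symmetric distances with a zero diagonal
n = size(D,1);
y = D(tril(true(n), -1))';           % pairs (2,1), (3,1), (3,2) as a row vector
disp(y)
% With the Statistics Toolbox installed, the vector can then be used as:
% z = linkage(y, 'ward');
% dendrogram(z);
```

The vector has n*(n-1)/2 elements, one per pair of observations.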

This function is very specific and is intended for experienced users only.

See also : dendro (dendrogram with SAISIR)


saisir_mean

saisir_mean - computes the mean of the columns, following the saisir format

function[xmean]=saisir_mean(X);

Input argument
==============
X: SAISIR matrix (n x p)

Output argument
==============
xmean: SAISIR vector (1 x p) of the mean


saisir_mult

saisir_mult - matrix multiplication following the SAISIR format

function X12=saisir_mult(X1,X2);

Input arguments:
===============
X1 and X2 : SAISIR matrices dimensioned (n x p) and (p x m) respectively

Output argument:
===============
X12: SAISIR matrix (n x m) , result of the multiplication of X1 with X2

Little use!


saisir_sort

saisir_sort - sorts the rows of X according to the values in a column

function [X1 X2]=saisir_sort(X,ncol,minmax)

Input arguments:
===============
X: SAISIR matrix (n x p)
ncol:(integer) rank (index) of the column on which the data are sorted
minmax: 0: increasing order, 1: decreasing order (default : 0)

Output arguments:
================
X1: SAISIR matrix (n x p) sorted according to the column "ncol"
X2: SAISIR matrix (n x (p+1)) sorted according to the column "ncol", with
the rank added in column 1

%Typical example:
%===============
[DATA1,DATA2]=saisir_sort(DATA,5);
map(DATA2,1,6); %% representing the 5th column of DATA (6th column of
%DATA2, which holds the rank in column 1) in increasing order


saisir_std

saisir_std - computes the standard deviations of the columns, following the saisir format

function[xstd]=saisir_std(X)

Input argument
==============
X: SAISIR matrix (n x p)

Output argument
==============
xstd: SAISIR vector (1 x p) of the standard deviation


saisir_sum

saisir_sum - calculates the sum of the rows (one total per column)

function xsum=saisir_sum(X);

Input argument
==============
X: SAISIR matrix (n x p)

Output argument
===============
xsum: SAISIR vector (1 x p) of the sum of the rows (one total per column)


saisir_transpose

saisir_transpose - transposes a data matrix following the saisir format

function [X] = saisir_transpose(X1)

Input argument
=============
X1: SAISIR matrix ( n x p)

Output argument
===============
X: SAISIR matrix (p x n), transpose of X1


seekstring

seekstring - returns the indices of the rows of a matrix of char in which the string 'xstr' is present

function index = seekstring(identifiers,xstr)

Input arguments
===============
identifiers: matrix of characters (n x p)
xstr: string (1 x k), with k smaller than p

Output argument
==============
index: vector of integers giving the indices of the rows of "identifiers" in
which the string "xstr" has been found

%Typical example:
%===============
index=seekstring(DATA.i,'thisname');
%Gives the indices of the rows of DATA.i in which the string "thisname" is present.
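
The search can be pictured with the following plain MATLAB sketch. It is not the source of seekstring: the identifiers are hypothetical, and returning the result as a column vector is an assumption.

```matlab
% Hypothetical matrix of row identifiers (4 rows, 6 characters each)
identifiers = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'];
xstr = 'wheat';

index = [];
for r = 1:size(identifiers, 1)
    % strfind returns the starting positions of xstr in the row (empty if absent)
    if ~isempty(strfind(identifiers(r, :), xstr))
        index = [index; r];
    end
end
disp(index')   % indices of the rows containing xstr
```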


selectcol

selectcol - creates a new data matrix with the selected columns

function [X] = selectcol(X1,index)

Input arguments
===============
X1: SAISIR matrix (n x p)
index: vector of integers or of booleans

Output argument
===============
X: SAISIR matrix with n rows, reduced to the selected columns (variables)

%Typical example:
%===============
reduced=selectcol(DATA,[1 5 6]); %% selects the columns #1, #5, #6 and
%builds the reduced matrix (with 3 columns) in "reduced"

See also: selectrow, deletecol, deleterow, appendcol, appendrow,
appendcol1, appendrow1


selectrow

selectrow - creates a new data matrix with the selected rows

function [X] = selectrow(X1,index)

Input arguments
===============
X1: SAISIR matrix (n x p)
index: vector of integers or of booleans

Output argument
===============
X: SAISIR matrix with p columns, reduced to the selected rows

%Typical example:
%===============
reduced=selectrow(DATA,[1 5 6]); %% selects the rows #1, #5, #6 and
%builds the reduced matrix (with 3 rows) in "reduced"

See also: selectcol, deletecol, deleterow, appendcol, appendrow,
appendcol1, appendrow1


select_from_identifier

select_from_identifier - uses the row identifiers to select samples

function [X1] = select_from_identifier(X,startpos,str)

Input arguments:
===============
X : SAISIR matrix
startpos : beginning position in the character strings of the identifiers
of rows ('.i')
str: string which is used as selection key.

Output argument:
===============
X1 : SAISIR matrix of the selected rows

Creates the data collection "X1", the subset of "X" containing the rows
whose identifiers contain the string str starting at position startpos.

%Example :
%========
%Let X be a SAISIR matrix
%Let X.i be
%'wheat1'
%'barle2'
%'ricex1'
%'wheat2'
%The 'wheat' samples are extracted through
mywheat= select_from_identifier(X,1,'wheat');
%This selects the rows with identifiers 'wheat1' and 'wheat2'
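
The position-anchored matching can be sketched without the toolbox as follows. The data values are hypothetical and the code only illustrates the principle; the actual implementation of select_from_identifier may differ.

```matlab
% Hypothetical SAISIR-like structure
X.d = [1 2; 3 4; 5 6; 7 8];
X.i = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'];
X.v = ['a'; 'b'];

startpos = 1;
str = 'wheat';

% Characters of each identifier from startpos, over the length of str
key = X.i(:, startpos:startpos + length(str) - 1);
keep = strcmp(cellstr(key), str);     % logical vector, one element per row

X1.d = X.d(keep, :);
X1.i = X.i(keep, :);
X1.v = X.v;
disp(X1.i)
```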


select_from_variable

select_from_variable - uses the column identifiers to select variables

function [X1] = select_from_variable(X,startpos,str)

Input arguments:
===============
X : SAISIR matrix
startpos : beginning position in the character strings of the identifiers
of columns ('.v')
str: string which is used as selection key.

Output argument:
===============
X1 : SAISIR matrix of the selected columns

Creates the data collection "X1", the subset of "X" containing the columns
whose variable identifiers contain the string str starting at position startpos.

See also: select_from_identifier


sensory_profile

sensory_profile - Graphical representation of sensory profile

function[h]=sensory_profile(X,range,max_score,(title))

Graphical display of sensory profiles in a "circular (spider web)" representation.

Input arguments:
===============
-X : matrix of data to be displayed
-range : vector of the indices of the rows to be displayed
-max_score : maximal score used in the scale
-title : (optional) title of the graph.
Warning: will not work properly with more than 15 variables.
Preferably shorten the identifiers of the variables to fewer than 8 characters.

%Demonstration example
%====================
senso=rand(5,10)*5;%% simulating 5 panellists, 10 scores, scale from 0 to 5
senso1=matrix2saisir(senso,'judge','descri');%% In SAISIR structure
sensory_profile(senso1,1:3,5);%% graphic of the first 3 panellists


sgolaycoef

sgolaycoef - Computes the Savitzky-Golay coefficients

function [B,G] = sgolaycoef(k,F)

where the polynomial order is k and the frame size is F (an odd number)
No direct use


show_vector

show_vector - represents a row of a matrix as a succession of identifiers

function handle=show_vector(X, (nrow) ,(csize),(xlab),(ylab),(title))

Input arguments
===============
X: SAISIR matrix (n x p)
nrow: index of the row to be displayed (integer less than n, default : 1)
csize: size of the character (default : 10)
xlab, ylab, title: label on axis X, label on axis Y, title, respectively (default:
none)

The identifiers of the columns are plotted with X being the index of the variable and Y the actual
value of the variable for the selected row "nrow"

Main use : examining the output of "anavar1" and "anovan1" functions on
discrete variables


simple_regression

simple_regression - mono-linear regressions (one predictive variable at a time)

function [beta beta0]=simple_regression(X,y);

Input arguments
===============
X:SAISIR matrix of predictive variables (n x p)
y:SAISIR vector of the variable to be predicted (n x 1)

Output arguments
===============
beta:SAISIR vector of the regression coefficients (1 x p)
beta0:SAISIR vector of the intercepts (1 x p)

y is predicted by each column i of X according to ypred=X.d(:,i)*beta.d(i)+beta0.d(i);
There are thus as many mono-linear models as the number of columns in X
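
As an illustration of the formula above, here is a minimal least-squares sketch in plain MATLAB. The data are hypothetical, and the use of ordinary least squares for each column is an assumption about how the coefficients are obtained.

```matlab
% Hypothetical data: n = 4 observations, p = 2 predictive variables
X.d = [1 2; 2 1; 3 4; 4 3];
y.d = [3; 5; 7; 9];              % here y = 2*X.d(:,1) + 1 exactly

[n, p] = size(X.d);
beta.d  = zeros(1, p);
beta0.d = zeros(1, p);
for i = 1:p
    xi = X.d(:, i);
    xc = xi - mean(xi);                                   % centred predictor
    beta.d(i)  = (xc' * (y.d - mean(y.d))) / (xc' * xc);  % slope
    beta0.d(i) = mean(y.d) - beta.d(i) * mean(xi);        % intercept
end

% Prediction of y from the first column only
ypred = X.d(:, 1) * beta.d(1) + beta0.d(1);
```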


snv

snv - Standard normal variate correction on spectra

function [X1] = snv(X)

Input argument:
==============
X:SAISIR matrix of spectra (n x p)

Output argument:
==============
X1:SAISIR matrix of SNV-corrected spectra (n x p)

SNV (Standard Normal Variate) is commonly used in spectroscopy.
It consists of centering and standardizing the ROWS (not the columns)
of the data matrix.
This procedure may reduce the distortion of the spectra caused by light scattering.
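
A minimal sketch of the row-wise correction, written in plain MATLAB on hypothetical spectra. The choice of the n-1 normalization for the standard deviation is an assumption; the snv function may use another convention.

```matlab
% Two hypothetical "spectra" differing only by a multiplicative factor
X.d = [1 2 3 4; 10 20 30 40];
p = size(X.d, 2);

rowmean = mean(X.d, 2);          % mean of each row
rowstd  = std(X.d, 0, 2);        % standard deviation of each row (n-1)

X1.d = (X.d - repmat(rowmean, 1, p)) ./ repmat(rowstd, 1, p);
disp(X1.d)   % both rows are now identical
```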


spcr

spcr - stepwise Principal component regression

function [spcrtype]=spcr(X,y,maxdim,(maxrank),(corr_cov))

The PC scores are introduced in the order of their correlation coefficient or their covariance
with y

Input arguments:
===============
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of observed y
maxdim: (integer) maximum number of PC scores introduced in the regression model
maxrank: (optional, integer) maximal rank of the PC scores in the model
Default value: all components possibly introduced.
corr_cov: (optional) 1: introduction according to the correlation coefficient (default);
0: introduction according to the covariance

Output arguments:
===============
spcrtype with fields
pca: PCA structure (see function "pca")
beta: beta coefficients (applicable on PC scores)
selected_component: rank of the scores introduced in the model
predy: predicted y values for all the steps of spcr
r2: r2 for all the steps of spcr
averagey: average value of y
beta1: beta coefficients (applicable on X centred)
obsy: observed y, copy of input argument y.
rmsec: root mean square error of calibration for all the models


splitrow

splitrow - splits a data matrix into 2 resulting matrices

function [X1, X2]= splitrow(X,index)

Input arguments
===============
X:SAISIR matrix (n x p)
index: MATLAB vector with k elements equal to 1 and n-k elements equal to 0

Output arguments
===============
X1:SAISIR matrix (k x p)
X2:SAISIR matrix ((n-k) x p)

Divides X into two matrices X1 and X2.
The first one corresponds to the kept rows (index = 1, or "true").
The second one is the complement (index = 0, or "false").
index is either a vector of row indices (integers) or a boolean vector.

%Typical example (division of DATA into a calibration and a validation sets):
%===============
[n,p]=size(DATA.d);
sel=random_select(n,round(n/3));%% building a random vector of 0 and 1
[validation_set calibration_set]=splitrow(DATA,sel);%% creation of a
%validation set and a calibration set.


split_average

split_average - averages observations according to the identifiers

function res=split_average(X,startpos,endpos)

Input arguments:
===============
X:SAISIR matrix (n x p)
startpos, endpos: position in the row-identifier strings (".i")

Output argument:
===============
res with fields:
average: averages of the identified groups (p columns)
group: number of observations in each group.

The function extracts the characters in the row identifiers from "startpos"
to "endpos" and makes as many groups as the number of different strings.
The observations are averaged according to these groups.
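
The grouping and averaging can be pictured with this plain MATLAB sketch. The data are hypothetical, and the variable names "average" and "count" are illustrative only; they are not the documented output fields.

```matlab
% Hypothetical data: two groups coded by the first character of the identifier
X.d = [1 2; 3 4; 10 20; 30 40];
X.i = ['A01'; 'A02'; 'B01'; 'B02'];
startpos = 1;
endpos = 1;

keys = cellstr(X.i(:, startpos:endpos));
[names, ignore, g] = unique(keys);      % g(r) = group number of row r
ngroup = length(names);

average = zeros(ngroup, size(X.d, 2));
count = zeros(ngroup, 1);
for k = 1:ngroup
    rows = (g == k);
    average(k, :) = mean(X.d(rows, :), 1);   % mean over the rows of the group
    count(k) = sum(rows);                    % number of observations in the group
end
disp(average)
```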


standardize

standardize - divides each column by the corresponding standard deviation

function [X, xstd] = standardize(X1,(option))

Input arguments
==============
X1:SAISIR matrix (n x p)
option: 0: standard deviation computed with n-1; 1: standard deviation computed with n (default : 1)

Output arguments
===============
X:SAISIR matrix (n x p)
xstd: standard deviation of the columns of X1 (1 x p)
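
A short sketch of the scaling in plain MATLAB, on hypothetical data. It assumes that "option" follows the same 0/1 convention as the flag of the MATLAB function "std", and that no centering is applied, as the description suggests.

```matlab
X1.d = [1 10; 2 20; 3 30; 4 40];      % hypothetical data (n = 4, p = 2)
option = 1;                           % 1: divide by n; 0: divide by n-1

xstd.d = std(X1.d, option, 1);        % standard deviation of each column
X.d = X1.d ./ repmat(xstd.d, size(X1.d, 1), 1);

disp(std(X.d, option, 1))             % each column now has unit standard deviation
```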


statis

statis - Multiway method STATIS

function res=statis(collection);

Input arguments
===============
collection:array of SAISIR matrices with the same number of rows (n)

Output arguments
================
Let n be the number of observations and t the number of tables.
The fields of the output argument res are:

RV: [1x1 struct] (t x t) matrix of the RV values indicating the agreement
between the tables (max value = 1)
eigenval1: [1x1 struct] first eigenvalue of the RV matrix
eigenvec1: [1x1 struct] (t x 1) first eigenvector of the RV matrix. Indicates the
weight associated with each table
Wk: {1xt cell} cell array of (n x n) arrays giving the scalar products between observations
W_compromise: (n x n) array giving the compromise of the arrays Wk
eigenval2: [1x1 struct] r eigenvalues of W_compromise, with r the rank of
W_compromise.
score: [1x1 struct] (n x r) scores of the observations in the compromise. Can be
represented as a factorial map
trajectory: [1x1 struct] (n*t x r) projection of each row vector of each table in the space of the observation scores
group: [1x1 struct] (n*t x 1) table indicating to which table a given
row vector belongs
table_score: [1x1 struct] (t x r) scores of the tables obtained from the
diagonalisation of RV.
table_eigenval: [1x1 struct] (r x 1) eigenvalues of RV. The first one
is the same as eigenval1

The STATIS method is described in "C. Lavit, Analyse conjointe de tableaux quantitatifs, Masson pub, 1988."
Basically, the method attempts to establish a factorial compromise between tables having
the same number of observations.
"collection" is an ARRAY OF SAISIR STRUCTURES containing all the 2-D data tables.
Each table must include the same observations, but not necessarily the same variables.
"group" is useful with the command 'carte_barycentre'.
For example, a command such as "carte_barycentre(res.trajectory,2,3, res.group)" will produce the representation of
the row vectors of each table for the scores 2 and 3. The representation shows the compromise point and its link
to each vector of the tables.
"collection" is built up with commands like:
collection(1)=DATA1; collection(2)=DATA2; ... ; collection(k)=DATAk;

Warning! Such commands work only if the DATA structures have the fields .d, .i, .v IN
THAT ORDER, with NO OTHER FIELDS. Otherwise, MATLAB refuses to build the
vector of SAISIR structures. If needed, use "saisir_check" to verify this
point.
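
To make the RV value more concrete, the sketch below computes the usual RV coefficient between two small hypothetical tables in plain MATLAB. The column-centering of the tables is an assumption; the pre-treatments actually applied by statis are not described here.

```matlab
% Two hypothetical tables describing the same n = 4 observations
T1 = [1 2; 2 1; 3 5; 4 3];
T2 = [2 0 1; 1 1 0; 4 2 3; 3 1 5];
n = size(T1, 1);

% Column-centering (assumed pre-treatment)
C1 = T1 - repmat(mean(T1, 1), n, 1);
C2 = T2 - repmat(mean(T2, 1), n, 1);

% (n x n) scalar products between observations
W1 = C1 * C1';
W2 = C2 * C2';

% RV coefficient between the two tables (between 0 and 1)
rv12 = trace(W1 * W2) / sqrt(trace(W1 * W1) * trace(W2 * W2));
% RV coefficient of a table with itself (equal to 1)
rv11 = trace(W1 * W1) / sqrt(trace(W1 * W1) * trace(W1 * W1));
disp([rv11 rv12])
```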


stepwise_regression

stepwise_regression - stepwise regression between x and y

function[result]=stepwise_regression(x,y,Pthres,(confidence))

Input arguments
===============
x : SAISIR matrix of predictive data (n x p)
y : SAISIR vector of the variable to be predicted (n x 1)
Pthres : probability threshold for entering or discarding a variable
confidence: (default=0.05) is the probability of the confidence interval
for the limits of the regression coefficients

Output argument
===============
result: array of cells corresponding to the steps of the regression
Each cell corresponds to one step (adding or discarding a variable)
In each cell:
message: gives the name of the entered or discarded variable
res : a structure described below
intercept : constant value (beta0) of the current model
RMSE : root mean square error of the model
r2 : determination coefficient
adjusted_r2 : adjusted determination coefficient (taking the dimensions into account)
F : Fisher F value of the current model
probF : probability value associated with F
ypred : predicted y values

res : the rows indicate the variables introduced
the different columns give information on the corresponding
regression coefficients:
1) Regression coefficients
2) Lower confidence limit of the regression coefficients
3) Higher confidence limit
4) Std of regression coeff.
5) t value of reg. coeff.
6) Prob. of reg. coef.
7) Rank of variables

% example
result=stepwise_regression(X,y,0.05);
result{3} %% third model
message: 'Entering variable 2214 at step 3'
res: [1x1 struct]
intercept: 4.66
RMSE: 0.81
r2: 0.90
adjusted_r2: 0.90
F: 417.96
probF: 0
ypred: [1x1 struct]
and result{3}.res gives the statistics on the regression coefficients
which can be consulted for example under Excel using
"saisir2excel(result{3}.res,'model3')"


string2saisir

string2saisir - creation of a saisir file from a string table (first column=name)

function [saisir] = string2saisir(data)

NORMALLY NO DIRECT USE
Creates a saisir file from a string table obtained with the function readexcel1.
A DOS file which has been read by readexcel1 is a 3-way matrix of characters of
the form data(row,pos,col). For example, data(5,:,12) contains the string in row 5 and
column 12.
To be acceptable for the saisir transformation, the data must have
the following format:
1) the first row data(1,:,:) must contain the identifiers of the variables (columns)
Warning: for matrix presentation of the data, the string data(1,:,1) is of no use
and is skipped by the program.
2) the first column data(:,:,1) must contain the identifiers of the observations (rows)
3) the other rows and columns contain strings which can be converted into numbers
(or whitespace)
Such a format is normally obtained by using readexcel1 (Excel data saved as .csv).
If the original Excel file was not appropriate, several columns may
contain strings which cannot be converted into numbers (or whitespace).
In this situation, the undesired columns can be removed using
data(:,:,col)=[]; where col is the index of the column to be removed.
Note that whitespace is replaced by NaN values.
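
The layout data(row,pos,col) and the conversion rules above can be illustrated by the following plain MATLAB sketch. The small table is hypothetical, and the use of "str2double" is an assumption; it is not necessarily the conversion used by string2saisir.

```matlab
% Hypothetical 3-way character matrix: 3 rows, 4 characters, 3 columns
nrow = 3; nchar = 4; ncol = 3;
data = repmat(' ', [nrow nchar ncol]);
data(1, 1:2, 2) = 'v1';   data(1, 1:2, 3) = 'v2';   % identifiers of variables
data(2, 1:2, 1) = 's1';   data(3, 1:2, 1) = 's2';   % identifiers of observations
data(2, 1:3, 2) = '1.5';  data(2, 1:1, 3) = '7';
data(3, 1:2, 2) = '-2';                             % data(3,:,3) is left blank

saisir.i = data(2:end, :, 1);                       % first column, header skipped
saisir.v = squeeze(data(1, :, 2:end))';             % first row, data(1,:,1) skipped
saisir.d = zeros(nrow - 1, ncol - 1);
for r = 2:nrow
    for c = 2:ncol
        % a blank string gives NaN
        saisir.d(r - 1, c - 1) = str2double(data(r, :, c));
    end
end
disp(saisir.d)
```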


string2text

string2text - saves a matrix of strings in a .txt format

function string2text(str,filename)

Input argument
==============
str:matrix of characters
filename:string (vector of characters)

Output argument
==============
None

This function saves a succession of strings (in a matrix of char) under
the name "filename". The extension ".TXT" is added to the file name


submap

submap - partial display of observations

function submap(X,col1,col2,xstring,(col1label),(col2label),(title),(charsize),(marg))

The scale of the COMPLETE map is used.

Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
xstring : string in the names of identifiers which must be displayed
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
marg (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)

Example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command "submap(X,5,3,'wh')" will plot the column 5 as X and the column 3 as Y.
Only the observations containing 'wh' in their names are displayed, but
the axis scales are those of the whole collection.
Useful for emphasizing some groups in a complex plot.


subtract_variable

subtract_variable - subtracts a given variable from all the others

function [X1] = subtract_variable(X,ncol)

Input arguments
===============
X:SAISIR matrix
ncol: (integer) column chosen for the subtraction

Output argument
==============
X1: SAISIR matrix in which the variable has been subtracted

Subtracts the variable of index ncol from each of the other variables of the observations.
Useful for correcting a y-shift of spectral data.
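
A minimal sketch of the subtraction in plain MATLAB, on hypothetical spectra. In this sketch the chosen column itself becomes zero; whether subtract_variable keeps or modifies that column is an assumption.

```matlab
% Hypothetical spectra with a different offset on each row
X.d = [1 2 3; 10 12 15];
ncol = 1;

% The value at column ncol is subtracted from every element of the same row
X1.d = X.d - repmat(X.d(:, ncol), 1, size(X.d, 2));
disp(X1.d)
```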


surface1

surface1 - Represents a surface in three dimensions

function [zmin, zmax]=surface1(X)

Input argument:
==============
X:SAISIR matrix

Output arguments:
================
zmin, zmax: min and max values in X.d

% new version 6/10/2006


surface_std

surface_std - divides each row by the sum of its elements

function [X1] = surface_std(X,(threshold))

Input arguments
===============
X:SAISIR matrix
threshold: (optional) small positive or zero value
If the sum of a row is equal to 0, the elements of this row are set to 0.
If threshold is defined, its value (normally very small) is added to the data.
It is only useful for avoiding the "division by zero" warning.

Output argument
===============
X1:corrected matrix

The function computes the sum of each row. Each value of each row is
divided by the corresponding sum.

In chromatography, this corresponds to giving the same area to all the
chromatograms.
See also: snv
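
The row normalization, including the handling of rows whose sum is zero, can be sketched in plain MATLAB as follows. The data are hypothetical and the optional threshold is left out of this sketch.

```matlab
% Hypothetical "chromatograms"; the second row sums to zero
X.d = [1 1 2; 0 0 0; 2 6 2];

rowsum = sum(X.d, 2);
X1.d = zeros(size(X.d));          % rows with a zero sum stay at 0
ok = (rowsum ~= 0);
X1.d(ok, :) = X.d(ok, :) ./ repmat(rowsum(ok), 1, size(X.d, 2));

disp(sum(X1.d, 2)')               % each non-zero row now sums to 1
```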


symbol_map

symbol_map - map with symbols : using a portion of the identifiers for

symbol_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize))

Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: position in the identifier strings of rows ('.i') for
the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters

For the French users: there is a synonym function "carte_symbole".
Use preferably "symbol_map".
The coloration of the displayed descriptors depends on the arguments
startpos and endpos.
From the names of the individuals, the string name(startpos:endpos) is extracted. Two observations
for which these strings are different are represented with different symbols.
Example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'symbol_map(X,5,3,1,5)' will plot the column 5 as X and the column 3 as Y.
The characters are extracted from positions 1 to 5, giving the strings 'wheat', 'barle',
'ricex'. A different symbol will be given to each of these strings.

See also colored_map1, colored_map2 (same principle but with the
identifier name displayed)


tcurve

tcurve - representation of a column of a given matrix as a curve

function handle=tcurve(X, ncol, (xlabel),(ylabel),(title))

Input arguments:
================
X : SAISIR matrix
ncol : column to be represented
xlabel, ylabel (optional) : labels in X and Y
title (optional) title of the graph.

This function draws a column (typically a loading or an eigenvector) as a curve.
If X.i can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables
A function "tcourbe" is a synonym of this function.


tcurves

tcurves - represents several columns of a matrix as curves

function handle=tcurves(X, range, (xlabel),(ylabel),(title))

Input arguments
===============
X:SAISIR matrix (n x p)
range: indices of the selected columns (Matlab vector of integers)
xlabel, ylabel, title: labels on x, y, and title (strings)

Typical use : showing loadings of PLS or PCA
===========
p=pca(spectra);
tcurves(p.eigenvec,1:4);%% First 4 loadings of PCA


thematic_classification

thematic_classification - builds a thematic classification of the .m files

function res=thematic_classification(function_name,(previous));

Input argument
==============
function_name: matrix of characters giving the function name
previous (optional): previous results of this function
"thematic_classification"

Output argument
==============
res with fields
theme_structure: array of structures with fields:
name: name of the function
theme: vector giving the number of the theme in the thematic list
theme: matrix of char giving the names of the themes.

"thematic classification" presents a list of themes
For each function, the user is asked to give the number in the list of
themes

VERY SPECIFIC USE.
This function is used by the function "build_documentation" in order to give
a thematic list of the functions.
If "previous" is defined, the results are obtained by concatenating
the old and the newly created thematic lists. The functions which are in "previous"
are not considered again.


trajectory_curve

trajectory_curve - plots coloured XYcurves

function handle=trajectory_curve(X,col1,col2,startpos,endpos);

The function represents the columns col1 and col2 as curves (arcs).
The observations which have the same strings in their identifiers are joined.
The points are joined consecutively according to their order (rank) in x.

Input arguments:
================
X : SAISIR matrix
col1, col2 : columns of x to be represented.
startpos, endpos : positions in identifiers indicating which identifiers are to be joined.

Example of use : time series
==============
Identifiers of rows (x.i) like A01; A02; A03... A100; B01 ... B100; C01 ...
with 1 ... 100 indicating times, and A B .. observations varying with time
Command:
trajectory_curve(x,1,2,1,1);
Joins the points labelled 'A' together, then the ones labelled 'B', and so on


w

w - (for "what") lists the fields which are present in a structure

function res= w(xstruct);

Input argument
==============
xstruct: any variable (supposed to be a structure with fields)

Output argument
==============
none

This function displays all the fields found in a structure (typically returned by a
SAISIR function).
"Matrix" or "Vector" means here "SAISIR matrix" or "SAISIR vector".

Example of use
...
pls_res=crossvalpls1a(x,y,10,sel);%% models with 1 to 10 dimensions
w(pls_res);%% gives the fields in "pls_res":
"
calibration
T : matrix 94 X 10
P : matrix 10 X 1050
beta : vector 1050 X 1
beta0 : = 12.7025
....
validation
PREDY : matrix 46 X 10
RMSEV : vector 1 X 10
r2 : vector 1 X 10
T : matrix 46 X 10
OBSY : vector 46 X 1
....
"


xcomdim

xcomdim - Finding common dimensions in multitable data

No direct use. Normally called with function "comdim"

function[Q, saliences, explained]=xcomdim(col,threshold,ndim)
Finds common dimensions in multitable data according to the method 'level3'
proposed by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H. MacFie
in Food Quality and Preference 11 (2000) 151-154
col is an array of matrices with the same number of rows
threshold (optional): if the difference of fit
ndim : number of common dimensions
default: threshold=1E-10; ndim=number of tables
returns Q: (nrow x ndim) the observation loadings


xdisp

xdisp - smart display of heterogeneous variables

function xdisp(varargin)

Input argument
==============
varargin: variable number of arguments, each either a number or a string

The function avoids those tedious "num2str" calls and brackets [] when displaying
text.

Example: xdisp('pi is equal to',pi, 'Don''t you know ','Charlie Brown ?',' age :',6 );
equivalent to
disp(['pi is equal to ', num2str(pi) ' Don''t you know ' 'Charlie Brown ?' ' age :' num2str(6)]);
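
The principle can be sketched in plain MATLAB as follows. This is an illustration, not the source of xdisp: the cell array "args" stands for varargin, and the sketch does not insert any space between the arguments, which may differ from the behaviour of the actual function.

```matlab
% "args" plays the role of varargin in this sketch
args = {'pi is equal to ', pi, ' age: ', 6};

txt = '';
for k = 1:length(args)
    a = args{k};
    if isnumeric(a)
        a = num2str(a);      % numbers are converted to strings
    end
    txt = [txt a];           % strings are concatenated as they are
end
disp(txt)
```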


xpca

xpca - PCA on a matlab data matrix

Performs a basic principal component analysis (on non-normalised data)
directly on the data.
Returns coord.d, eigenvector.d, eigenvalues.d,
average.d
currently only nrow


xyz_colored_map1

xyz_colored_map1 - Draws a colored 3D map from a Saisir file

xyz_colored_map1(X,col1,col2,col3,startpos,endpos)

Input arguments
===============
X: Saisir matrix
col1, col2, col3 : indices of the columns represented in X, Y, Z
startpos, endpos: number indicating the beginning and ending character in
the name identifiers. Two different strings will have different colors.


xy_plot

xy_plot - Biplot of one column of X versus one column of Y

function handle=xy_plot(X, xcol, Y, ycol,start_pos,end_pos);

Input arguments:
================
X,Y : SAISIR matrices (with the same number of rows)
xcol, ycol : rank number (index) of the columns to be plotted
if start_pos and end_pos defined: colored plot according
to the characters of the row identifiers at position start_pos:end_pos



· Loading and saving files

· Elementary manipulation of SAISIR files

· Elementary data transformation and pre-treatments

· Graphical display

· Usual statistics

· Advanced statistics

· Principal Component analysis

· Regression methods

· Partial least square (PLS)

· Discrimination methods

· Miscellaneous

appendbag1 - Merge an arbitrary number of "bag" files according to rows

bag2group - uses the identifiers in bag to create groups

bag_appendrow1 - Merges an arbitrary number of bags according to rows

check_name - Controls if some strings are strictly identical in a string array

excel2bag - reads an excel file and creates the corresponding text bag

excel2saisir - reads an excel text file

issaisir - tests if the input argument is a SAISIR matrix

matrix2saisir - transforms a Matlab matrix in a saisir structure

readexcel1 - reads an excel file in the .CSV format (create a 3way character matrix).

readident - loads a file of strings

saisir2ascii - Saves a saisir file into a simple ASCII format

saisir2excel - Saves a saisir file in a format compatible with Excel

saisir_check - Checks if the data respect the saisir structure

string2saisir - creation of a saisir file from a string table (first column=name)

string2text - save a vector of string in a .txt format

addcode - adds a string before or after a matrix of characters

alphabetic_sort - sorts the rows of x according to the alphabetic order of rows identifiers

appendbag1 - Merge an arbitrary number of "bag" files according to rows

appendcol - merges two files according to columns

appendcol1 - merges an arbitrary number of files according to columns

appendrow - merges two SAISIR matrices according to rows

appendrow1 - Merges an arbitrary number of files according to rows

bag2group - uses the identifiers in bag to create groups

bag_appendrow1 - Merges an arbitrary number of bags according to rows

build_indicator - build a disjoint table

check_name - Controls if some strings are strictly identical in a string array

create_group - creates a vector of numbers indicating groups from identifiers

create_group1 - uses the identifiers to create groups

deletecol - deletes columns of saisir files

deleterow - delete rows

eliminate_nan - suppresses "not a number" data in a saisir structure

find_index - find the index corresponding to the closest "value"

find_max - gives the indices of the max value of a MATLAB Matrix

find_min - gives the indices of the min value of a MATLAB Matrix

find_peaks - finds and displays peaks greater than a threshold value

group_centering - Centers data according to groups

issaisir - tests if the input argument is a SAISIR matrix

matrix2saisir - transforms a Matlab matrix in a saisir structure

num2str1 - Justified num2string

random_splitrow - random selection of rows

reorder - reorders the data of files A1 and A2 according to their identifiers

repeat_string - builds a matrix of char by repeating a string

row_center - subtracts the average row to each row

saisir_check - Checks if the data respect the saisir structure

saisir_sort - sorts the rows of X according to the values in a column

saisir_transpose - transposes a data matrix following the saisir format

select_from_identifier - Uses identifier of rows for selecting samples

select_from_variable - use identifier of columns for selecting variables

selectcol - creates a new data matrix with the selected columns

selectrow - creates a new data matrix with the selected rows

split_average - averages observations according to the identifiers

splitrow - splits a data matrix into 2 resulting matrices

build_indicator - build a disjoint table

center - subtracts the average to each row

correct_baseline - simple linear baseline correction, using intensity

create_group - creates a vector of numbers indicating groups from identifiers

create_group1 - uses the identifiers to create groups

eliminate_nan - suppresses "not a number" data in a saisir structure

moving_average - Moving average of signals

moving_max - replaces the central point of a moving window by the maximum value

moving_min - replaces the central point of a moving window by the minimum value

msc - Multiplicative scatter correction on spectra

norm_col - divides each column by the corresponding standard deviation

normc - Normalize columns of a matrix.