For more information on the general structure of SAISIR, see also the manual.
Unité de Sensométrie et de Chimiométrie, ENITIAA-INRA (Nantes, France)
Coordinator: Dr Dominique BERTRAND (e-mail: bertrand "at" nantes.nantes.fr)
Unité GENIAL, Equipe Ingénierie Analytique pour la Qualité des Aliments (Paris, France)
Coordinator: Dr Christophe CORDELLA (e-mail: cordella "at" paris.inra.fr)
usage: [X]= appendbag1(X1, X2, X3,.....)
function [group_type]=bag2group(bag)
usage: [bag]= appendrow(bag1, bag2 ....)
function [detected,names]=check_name(string)
(array of characters)
function [saisir] = excel2saisir(filename,(nchar),(start),(xend))
test=issaisir(X);
X = matrix2saisir(data,(coderow),(codecol))
function [data] = readexcel1(filename,(nchar),(deb),(xend))
function [ident, nident] = readident(filename,namesize)
function saisir2ascii(X,filename,separator)
function saisir2excel(X,filename)
function check=saisir_check(X)
function [saisir] = string2saisir(data)
function string2text(str,filename)
function str1 = addcode(str,code,(deb_end))
function X1=alphabetic_sort(X,start_pos,end_pos)
usage: [X]= appendbag1(X1, X2, X3,.....)
function: [X3]= appendcol(X1,X2)
usage: [X]= appendcol(X1,X2,X3,...)
usage: [X3]= appendrow(X1,X2)
usage: [X]= appendrow(X1,X2,X3,...)
function [group_type]=bag2group(bag)
usage: [bag]= appendrow(bag1, bag2 ....)
function [indicator, groupings]=build_indicator(x);
function [detected,names]=check_name(string)
group=create_group(X,code_list,startpos,endpos)
use the identifier for creating groups.
function [X1] = deletecol(X,index)
function X1 = deleterow(X,index)
function [X1] = eliminate_nan(X,(row_or_col))
function index=find_index(str,value);
function [row,col,value]=find_max(matrix)
function [row,col,value]=find_min(matrix)
function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)
function X1=group_centering(X,group);
test=issaisir(X);
X = matrix2saisir(data,(coderow),(codecol))
function str=num2str1(vector,ndigit);
function[X1,X2]=random_splitrow(X, nselect)
function [B1 B2]=reorder(A1,A2)
function str1=repeat_string(str,ntimes);
function [X] = center(X1)
function check=saisir_check(X)
function [X1 X2]=saisir_sort(X,ncol,minmax)
function [X] = saisir_transpose(X1)
function [X1] = select_from_identifier(X,startpos,str)
function [X1] = select_from_variable(X,startpos,str)
function [X] = selectcol(X1,index)
function [X] = selectrow(X1,index)
function res=split_average(X,startpos,endpos)
function [X1, X2]= splitrow(X,index)
function [indicator, groupings]=build_indicator(x);
function [X1 xmean] = center(X)
function [saisir] = correct_baseline(saisir1,col1,col2)
group=create_group(X,code_list,startpos,endpos)
use the identifier for creating groups.
function [X1] = eliminate_nan(X,(row_or_col))
function X1=moving_average(X,window_size)
function [X1] = moving_max(X,window_size)
function [X1] = moving_min(X,window_size)
function [X1] = msc(X,(reference))
function [saisir] = norm_col(saisir1,(mode))
function[X]=random_saisir(nrow,ncol)
function[selected]=random_select(nel, nselect, (nrepeat))
function X1=randomize(X)
function [B1 B2]=reorder(A1,A2)
[X]=saisir_derivative(X1,polynom_order,window_size,derivative_order)
[X1, emsc_model, coefficients]=saisir_emsc(X,good_spectra,bad_spectra,ref);
function [B,G] = sgolaycoef(k,F)
function [X1] = snv(X)
function [X, xstd] = standardize(X1,(option))
function [X1] = subtract_variable(X,ncol)
function [zmin, zmax]=surface1(X)
function [X1] = surface_std(X,(threshold))
function barycenter_map(X,col1,col2,group,(charsize))
function browse(X,xstart)
ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))
function h=colored_curves(X,group)
colored_map1(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))
colored_map2(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))
colored_map4(X,col1,col2,color_choice,(symbol_choice),(charsize))
function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))
function handle=correlation_plot(scores,col1,col2, X1,X2, ...);
handle=courbe(X,(nrow), (xlabel),(ylabel),(title))
usage handle=curves(X,range,(xlabel),(ylabel),(title))
function group=dendro(X,(topnodes))
function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))
function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)
function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))
function list(X,(start))
function map(X,col1,col2,(col1label),(col2label),(title),(charsize),(margin))
function map3D(X,col1,col2,col3,(label1),(label2),(label3),(title),(charsize))
function [names] = mir_style(names1)
function plotmatrix1(s,startpos,endpos,charsize)
function[h]=sensory_profile(X,range,max_score,(title))
function handle=show_vector(X, (nrow) ,(csize),(xlab),(ylab),(title))
function submap(X,col1,col2,xstring,(col1label),(col2label),(title),(charsize),(marg))
symbol_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize))
function handle=tcurve(X, ncol, (xlabel),(ylabel),(title))
function handle=tcurves(X, range, (xlabel),(ylabel),(title))
function handle=trajectory_curve(X,col1,col2,startpos,endpos);
function xdisp(varargin)
function handle=xy_plot(X, xcol, Y, ycol,start_pos,end_pos);
xyz_colored_map1(X,col1,col2,col3,startpos,endpos)
function res = anavar1(X,g)
function res = anovan1(X,model,gr1, gr2, ...)
function res=contingency_kh2(table);
function table=contingency_table(g1,g2)
function [cor] = cormap(X1,X2)
function [cov] = covmap(X1,X2)
function [D] = distance(X1,X2)
function [row,col,value]=find_max(matrix)
function [row,col,value]=find_min(matrix)
function X1=group_centering(X,group);
function X1=group_mean(X,startpos,endpos)
function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))
function dis = mdistance(X1,X2,metric)
function[cor]=nancor(X1,X2)
function [X] = center(X1)
function[xmean]=saisir_mean(X);
function[xstd]=saisir_std(X)
function res=split_average(X,startpos,endpos)
function ca_type=ca(N);
ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))
function[res]=comdim(collection,(ndim),(threshold))
function[pcatype]=covariance_pca(covariance_type,(nscore))
function [cov] = covmap(X1,X2)
function [covariance_type]=cumulate_covariance((X),(covariance_type1));
function [ftype] = d2_factorial_map(X)
function [D] = distance(X1,X2)
function dis = mdistance(X1,X2,metric)
function res=mfa(collection);
function res=multiway_pca(collection);
function[res]=nuee(X,ngroup,(nchanged))
function[res]=pca_cross_ridge_regression(X,y,krange,selected)
function res=regression_score(x,beta,(y))
function z=saisir_linkage(dis)
function res=statis(collection);
No direct use. Normally called with function "comdim"
function [supscores]=applypca(pcatype,X)
function [predy]=applypcr(pcrtype,X)
function [predy]=applyspcr(spcrtype,X)
function ca_type=ca(N);
ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))
function [pcatype1] = change_sign(pcatype,ncomp)
function[res]=comdim(collection,(ndim),(threshold))
function res=contingency_kh2(table);
function table=contingency_table(g1,g2)
function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))
function handle=correlation_plot(scores,col1,col2, X1,X2, ...);
function[pcatype]=covariance_pca(covariance_type,(nscore))
function [covariance_type]=cumulate_covariance((X),(covariance_type1));
function [ftype] = d2_factorial_map(X)
function [pcrres]=dimcrosspcr1(X,y,ndim,selected)
function[fdatype]=fda1(pcatype, group, among, maxscore)
function[res]=fda2(X, group, among, maxscore, selected)
function res=mfa(collection);
function res=multiway_pca(collection);
function[pcatype]=normed_pca(X)
function [pcatype]=pca(X,(var_score))
This function is not to be called directly.
This function is not to be called directly.
function res=pca_cano(collection,ndim,graph);
function res=pca_stat(pca_type, comp1, comp2);
function[res]=pcareconstruct(pcatype,score,nscore)
function [pcrtype]=pcr(X,y,maxdim)
function [spcrtype]=spcr(X,y,maxdim, (maxrank)(corr_cov))
function res=statis(collection);
computes a basic principal component analysis (on non-normalised data)
function[res]=apply_multiple_regression(X,type,(y))
function[res]=apply_ridge_regression(ridgetype,X,(y))
function [predy]=applylr1(lrtype,X)
function [predy]=applypcr(pcrtype,X)
function res=applypls(X,plsmodel, (knowny))
function [predy]=applyspcr(spcrtype,X)
function[res] = basic_pls(X,y,ndim)
function result=basic_pls2(X,y,maxdim)
function[res]=cross_ridge_regression(X,y,krange,selected)
function[res]=crossval_multiple_regression(X,y,selected)
function[res]=dimcross_stepwise_regression(X,y,selected,Pthres,(confidence))
function [pcrres]=dimcrosspcr1(X,y,ndim,selected)
function res=leave_one_out_pls1(X,y,ndim);
function res=leave_one_out_pls2(X,y,ndim);
function [lr1type]=lr1(X,y,maxdim,(ratioxy))
function res=multiple_regression(x,y);
function [pcrtype]=pcr(X,y,maxdim)
function [pcrtype]=pcr1(X,y,dim)
function [mtBpls,mtT, tBeta] = pls2var(mtX,mtY,nbdim)
function [plstype]=saisirpls(X,y,ndim)
function [ridgetype]=ridge_regression(X,y,krange)
function [ridgetype]=ridge_regression1(X,y,normrange)
function [plstype]=saisirpls(X,Y,dim)
function [beta beta0]=simple_regression(X,y);
function [spcrtype]=spcr(X,y,maxdim, (maxrank)(corr_cov))
function[result]=stepwise_regression(x,y,Pthres,(confidence))
function res=applypls(X,plsmodel, (knowny))
function[res]=applyplsda(X,plsdatype,(actual_group))
function[res] = basic_pls(X,y,ndim)
function result=basic_pls2(X,y,maxdim)
function[res]=crossplsda(X,group,dim,selected)
function [res]=crossvalpls(X,y,ndim,selected)
function [res]=crossvalpls1a(X,y,ndim,selected)
function res=leave_one_out_pls1(X,y,ndim);
function res=leave_one_out_pls2(X,y,ndim);
[type]=afdlike(x,y,select,parmi)
function [mtBpls,mtT, tBeta] = pls2obs(mtX,mtY,nbdim)
function[plsdatype]=plsda(X,group,ndim)
function [plstype]=saisirpls(X,y,ndim)
function [plstype]=saisirpls(X,Y,dim)
function[res]=nuee(X,barycenter)
function result = apply_quaddis(quaddis_type,x,(known_group));
function[res]=apply_stepwise_regression(stepwise_type,X,(y))
function[res]=applyfda1(X,fdatype,(actual_group))
function[res]=applyplsda(X,plsdatype,(actual_group))
function barycenter_map(X,col1,col2,group,(charsize))
function[res] = basic_pls(X,y,ndim)
function [indicator, groupings]=build_indicator(x);
function table=contingency_table(g1,g2)
group=create_group(X,code_list,startpos,endpos)
use the identifier for creating groups.
function[res]=crossfda1(X,group,among,maxvar,ntest)
function[discrtype]=crossmaha(X,group,maxvar,ntest)
function[res]=crossplsda(X,group,dim,selected)
function res=crossval_quaddis(X,group,selected)
function [res]=crossvalpls(X,y,ndim,selected)
function [res]=crossvalpls1a(X,y,ndim,selected)
function group=dendro(X,(topnodes))
function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))
function[fdatype]=fda1(pcatype, group, among, maxscore)
function[res]=fda2(X, group, among, maxscore, selected)
function[res]=maha(X,group,maxvar)
function[discrtype1]=maha1(calibration_data,calibration_group,maxvar,test_data,test_group)
function[discrtype]=maha3(X,group,maxvar)
function[discrtype]=maha4(X,group,maxvar)
function[discrtype]=maha6(X,group,maxvar,selected)
function[res]=nuee(X,ngroup,(nchanged))
function[plsdatype]=plsda(X,group,ndim)
function quadis_type=quaddis(x,group);
function[selected]=random_select(nel, nselect, (nrepeat))
function[res]=nuee(X,barycenter)
function [indicator, groupings]=build_indicator(x);
function [covariance_type]=cumulate_covariance((X),(covariance_type1));
function group=dendro(X,(topnodes))
function documentation_dico(fid,function_name);
function [mtvec,mtval] = eigord(mtA)
function index=find_index(str,value);
function [row,col,value]=find_max(matrix)
function [row,col,value]=find_min(matrix)
function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)
function X1=group_mean(X,startpos,endpos)
function html_header(fid)
function html_notice(fid,function_name);
function html_postface(fid);
test=issaisir(X);
function list(X,(start))
X = matrix2saisir(data,(coderow),(codecol))
function dis = mdistance(X1,X2,metric)
function [names] = mir_style(names1)
function res=multiway_pca(collection);
function[res]=nuee(X,ngroup,(nchanged))
function str=num2str1(vector,ndigit);
function[ridgetype]=pca_ridge_regression(pcatype,y,range)
function[X]=random_saisir(nrow,ncol)
function[selected]=random_select(nel, nselect, (nrepeat))
function[X1,X2]=random_splitrow(X, nselect)
function X1=randomize(X)
function [B1 B2]=reorder(A1,A2)
function str1=repeat_string(str,ntimes);
function [X] = center(X1)
function check=saisir_check(X)
function z=saisir_linkage(dis)
function X12=saisir_mult(X1,X2);
function [X1 X2]=saisir_sort(X,ncol,minmax)
function xsum=saisir_sum(X);
function [X] = saisir_transpose(X1)
function index = seekstring(identifiers,xstr)
function [B,G] = sgolaycoef(k,F)
function res=split_average(X,startpos,endpos)
function [X, xstd] = standardize(X1,(option))
function [saisir] = string2saisir(data)
function string2text(str,filename)
function res=thematic_classification(function_name,(previous));
function res= w(xstruct);
function xdisp(varargin)
Input arguments:
===============
str: a matrix of characters (n x p)
code: a string (1 x k)
deb_end: a number (0 = addition before; 1 = addition after; default: 0)
Output argument:
===============
str1: a matrix of characters (n x (k+p))
This function is mainly used to recode the identifiers of observations or
variables (".i" or ".v" fields).
example:
data.i
ans =
casein
albumin
zein
>> data.i=addcode(data.i,'1')
data =
i: [3x8 char]
>> data.i
ans =
1casein
1albumin
1zein
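The recoding described above can be sketched in plain Matlab, without the SAISIR toolbox. This is only an illustration written from the description of the arguments; it is not the source of "addcode", and the variable "block" is introduced here for the example.

```matlab
% Illustrative sketch only (not the SAISIR implementation of addcode).
str  = char('casein', 'albumin', 'zein');  % 3 x 7 matrix of characters (padded with blanks)
code = '1';                                % string to add (1 x k)
deb_end = 0;                               % 0 = addition before, 1 = addition after

block = repmat(code, size(str, 1), 1);     % one copy of the code per row (n x k)
if deb_end == 0
    str1 = [block str];                    % code placed before each identifier
else
    str1 = [str block];                    % code placed after each (blank-padded) identifier
end
disp(str1)
```

With deb_end = 1 the code lands after the padding blanks of the shorter identifiers, which is worth keeping in mind when identifiers have different lengths.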
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X: SAISIR matrix
start_pos, end_pos: character positions, in the identifiers, delimiting the substring used for sorting.
Output argument:
================
X1: SAISIR matrix with the rows sorted in alphabetic order.
This function sorts the observations (rows) according to
their identifiers (".i" field of X).
example
=======
>> a.i
ans =
xenon
krypton
Aluminium
>>a.d
ans =
1.00 2.00 3.00 4.00
5.00 6.00 7.00 8.00
9.00 10.00 11.00 12.00
b=alphabetic_sort(a,1,5);
>> b.i
ans =
Aluminium
krypton
xenon
>> b.d
ans =
9.00 10.00 11.00 12.00
5.00 6.00 7.00 8.00
1.00 2.00 3.00 4.00
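The same reordering can be reproduced with standard Matlab functions. The sketch below assumes that the sort key is the substring start_pos:end_pos of each identifier and that case is ignored; both points are assumptions made for the illustration, not statements about the code of "alphabetic_sort".

```matlab
% Illustrative sketch only: sort the rows of a SAISIR-like structure by identifier.
a.i = char('xenon', 'krypton', 'Aluminium');   % identifiers of rows
a.v = char('v1', 'v2', 'v3', 'v4');            % identifiers of columns (hypothetical names)
a.d = [1 2 3 4; 5 6 7 8; 9 10 11 12];          % data
start_pos = 1;
end_pos = 5;

key = lower(a.i(:, start_pos:end_pos));        % sort key (case ignored: assumption)
[~, order] = sortrows(key);                    % row permutation giving alphabetic order

b.i = a.i(order, :);                           % identifiers follow the data
b.v = a.v;
b.d = a.d(order, :);
disp(b.i)
disp(b.d)
```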
Performs a one-way analysis of variance for each column of X.
Input arguments:
===============
X: SAISIR matrix (n x p)
g: SAISIR vector of group identifier (integers ,n x 1)
Output arguments:
================
res with fields:
res.F : SAISIR vector of Fisher values for each variable in X (1 x p)
res.F.df : degrees of freedom of the model
res.p : SAISIR vector of associated probabilities (1 x p).
Note: res.F and res.p can be examined as curves (using the "curve" function)
or with the command "show_vector" (for discrete variables).
The function performs p independent one-way ANOVAs, one per column of X. The groups
are defined in g: observations having the same number belong to the same group.
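For readers who want to see what one F value per column means, here is a minimal sketch of the textbook one-way ANOVA computation in plain Matlab. It works on ordinary matrices rather than SAISIR structures, the data are invented for the illustration, and it is not claimed to be the code of "anavar1". The probabilities would additionally require the F distribution (for example "fcdf" of the Statistics Toolbox) and are left out.

```matlab
% Illustrative sketch only: one F value per column of X (textbook one-way ANOVA).
X = [1 10; 2 11; 3 12; 7 10; 8 11; 9 12];    % n = 6 observations, p = 2 variables
g = [1; 1; 1; 2; 2; 2];                      % group number of each observation

[n, p] = size(X);
groups = unique(g);
k = numel(groups);
grand_mean = mean(X, 1);

ss_between = zeros(1, p);                    % between-group sums of squares
ss_within  = zeros(1, p);                    % within-group sums of squares
for j = 1:k
    Xj = X(g == groups(j), :);               % observations of group j
    nj = size(Xj, 1);
    mj = mean(Xj, 1);
    ss_between = ss_between + nj * (mj - grand_mean).^2;
    ss_within  = ss_within + sum((Xj - repmat(mj, nj, 1)).^2, 1);
end

df_model = k - 1;                            % degrees of freedom of the model
df_error = n - k;                            % degrees of freedom of the residuals
F = (ss_between / df_model) ./ (ss_within / df_error);
disp(F)
```

In this toy data set the first variable separates the two groups and the second does not, so the first F value is large and the second is zero.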
See also: "anovan1", "show_vector", "create_group1"
Performs as many independent N-way analyses of variance as there are
columns in X
Input arguments:
===============
X: SAISIR matrix of response values (n x p)
model (integer): level of interactions desired
(1 = no interactions studied; 2 = first-degree interactions; ...) (see
Matlab function ANOVAN)
gr1, gr2, ...: SAISIR vectors of qualitative groups (n x 1), each forming a factor of the
ANOVA. Identical numbers mean that the corresponding observations are in
the same group
Output arguments:
================
res with fields
res.F: the F values associated with each effect and (possibly) interaction
res.P: associated probabilities
res.df (characters): degrees of freedom
res.singular: singularity of the model. If res.singular == 1, the model
is redundant and a lower level of interaction must be tested.
See also: anovan, anova, anavar1, show_vector, create_group1
Example:
my_anova=anovan1(spectra,2,grouping1, grouping2);
show_vector(my_anova.P,2); %% examination of the probabilities of the second factor (with identifiers)
curve(my_anova.F,2); %% Fisher F examined as a curve.
Input argument:
===============
X1, X2, X3 ... : "bag" structures (see function "excel2bag" for details)
with the same number of columns
Output argument:
================
X: concatenated bag structure
A bag is a structure with the same fields as a SAISIR matrix (X.d, X.i, X.v),
except that bag.d is a three-way table of characters.
If bag.d is dimensioned (n x p x v), n is the number of rows, v the
number of columns, and p the number of characters of each string.
The bag structure is obtained as an output argument of the function
"excel2bag".
Example:
[value1,bag1]=excel2bag('data1',['A'; 'B'],20);
[value2,bag2]=excel2bag('data2',['A'; 'B'],20);
bag3=appendbag1(bag1,bag2);
Input arguments:
===============
X1, X2 : SAISIR matrices dimensioned n x p and n x q
Output argument:
===============
X3 : SAISIR matrix dimensioned n x (p+q)
The identifiers of rows are copied into X3.i
The identifiers of columns are the concatenation of X1.v and X2.v
Example:
total=appendcol(chemistry1, chemistry2);
see also: appendrow, appendrow1, appendcol1
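The effect of the concatenation on the three fields (".d", ".i", ".v") can be sketched in plain Matlab as follows. The small structures below are invented for the illustration, and the sketch is not the code of "appendcol".

```matlab
% Illustrative sketch only: column-wise merging of two SAISIR-like structures.
X1.d = [1 2; 3 4];                 % n x p data (n = 2, p = 2)
X1.i = char('obs1', 'obs2');       % identifiers of rows
X1.v = char('a', 'b');             % identifiers of columns

X2.d = [5; 6];                     % n x q data (q = 1)
X2.i = X1.i;                       % same observations
X2.v = char('c');

X3.d = [X1.d X2.d];                % n x (p+q)
X3.i = X1.i;                       % row identifiers copied from the first matrix
X3.v = char(X1.v, X2.v);           % column identifiers concatenated (padded to equal width)
disp(X3.d)
disp(X3.v)
```

Appending rows is the symmetric operation: the data are stacked vertically, the column identifiers are copied and the row identifiers are concatenated.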
Input arguments:
===============
X1, X2, X3 ... : SAISIR matrices with the same number of rows
Output argument:
===============
X : SAISIR matrix
The identifiers of rows are copied into X.i
The identifiers of columns are the concatenation of X1.v, X2.v, X3.v ...
Example:
total=appendcol(chemistry1, chemistry2, chemistry3 );
see also: appendrow, appendrow1, appendcol
Input arguments:
================
X1, X2 : SAISIR matrices dimensioned n1 x p and n2 x p
Output argument:
==============
X3 : SAISIR matrix dimensioned (n1+n2) x p
The identifiers of columns are copied into X3.v
The identifiers of rows are the concatenation of X1.i and X2.i
Example:
total=appendrow(spectra1, spectra2);
see also: appendcol, appendrow1, appendcol1
Input arguments:
===============
X1, X2, X3 ... : SAISIR matrices with the same number of columns
Output argument:
===============
X : SAISIR matrix with p columns
The identifiers of columns are copied into X.v
The identifiers of rows are the concatenation of X1.i, X2.i, X3.i ...
Example:
total=appendrow(spectra1, spectra2, spectra3);
see also: appendcol, appendrow, appendcol1
Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
fdatype: structure, output of function "fda1"
actual_group (optional) (n x 1): SAISIR vector indicating the group membership of
the observations. Observations with the same number belong to
the same group.
Output argument:
===============
res with fields
datafactor: discriminant scores (n rows)
predicted_group: number of the group predicted for each observation
If actual_group is given, the field "confusion" gives the
confusion matrix
(rows: actual groups; columns: groups predicted by "fda1")
Typical example:
================
(calibration data in "calibration_data", groups in "qualitative_group",
unknown data in "unknown_data")
%Building the model
p=pca(calibration_data);
fdatype=fda1(p, qualitative_group, 10, 5)
%Applying the model
res=applyfda1(unknown_data,fdatype);
See also : fda1, maha3, maha6, plsda, pca
Input arguments:
================
lrtype: structure, output of function "lr1" (predictive model)
X: SAISIR matrix of predictive variables
Output argument
===============
predy: SAISIR matrix of predicted y (for all the models requested when using
"lr1").
The function creates as many predicted y as allowed by the dimensions in lrtype.
Computes the scores of supplementary observations
Input arguments:
===============
pcatype: (structure) output argument of functions "pca", "normed_pca", or "covariance_pca"
X : SAISIR matrix of supplementary observations
Output argument:
===============
supscores : SAISIR matrix of scores of X
Typical example
===============
p=pca(spectra);%% PCA
supscores=applypca(p,supplementary_spectra);
map(supscores,1,2);
The number of columns of X must be compatible with pcatype.
If "normed_pca" was applied, X is divided by the standard deviations of
the principal observations prior to projection.
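The projection of supplementary observations can be sketched with ordinary Matlab matrices. The sketch assumes a PCA on centred, non-normalised data computed through a singular value decomposition; the variable names (Xcal, Xsup, V) are introduced for the illustration and say nothing about the fields actually stored in "pcatype".

```matlab
% Illustrative sketch only: scores of supplementary observations in a centred PCA.
Xcal = [1 2; 2 4.1; 3 5.9; 4 8.2; 5 9.8];        % principal (calibration) observations
xmean = mean(Xcal, 1);                           % average row of the calibration set
Xc = Xcal - repmat(xmean, size(Xcal, 1), 1);     % centred calibration data

[U, S, V] = svd(Xc, 'econ');                     % columns of V = principal axes
scores = Xc * V;                                 % scores of the principal observations

Xsup = [6 12; 0 0];                              % supplementary observations (same columns)
Xsup_c = Xsup - repmat(xmean, size(Xsup, 1), 1); % centred with the CALIBRATION mean
supscores = Xsup_c * V;                          % projection on the same axes
disp(supscores)
```

The supplementary data are centred with the mean of the principal observations, not with their own mean; this is what keeps both sets of scores comparable on the same map.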
Applies a basic PCR model (in pcrtype) to the SAISIR data X.
Creates as many predicted y as allowed by the dimensions in pcrtype.
Input arguments
pcrtype: structure, output of function "pcr"
X: SAISIR matrix of predictive variables
Output argument
predy : SAISIR matrix of predicted y, for all the dimensions tested
See also: "pcr", "pcr1", "basic_pls"
Input arguments:
---------------
X : SAISIR matrix (n x p)
plsmodel: output argument of functions "saisirpls", "basic_pls" or "basic_pls2"
knowny (optional): actual values of y (if known, this allows the
computation of r2 and RMSEV)
Output arguments:
---------------
res with fields
PREDY: predicted values for all the models given by "plsmodel" (n x ndim)
RMSEV: root mean square error of validation for all the models (only if
"knowny" is given) (1 x ndim)
r2: determination coefficient between predicted and observed y values
(only if "knowny" is given) (1 x ndim)
T: PLS scores of the set (n x ndim)
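The two validation statistics can be illustrated on invented numbers. The sketch below assumes that RMSEV is the square root of the mean squared difference between predicted and known y (division by n) and that r2 is the squared correlation coefficient; the exact conventions used inside "applypls" are not documented here, so these are assumptions.

```matlab
% Illustrative sketch only: RMSEV and r2 for each model (one column of PREDY per model).
knowny = [1; 2; 3; 4];                               % known y values (n x 1)
PREDY  = [1.5 1.1; 1.5 2.0; 3.5 2.9; 3.5 4.0];       % predicted y (n x ndim), invented

ndim  = size(PREDY, 2);
RMSEV = zeros(1, ndim);
r2    = zeros(1, ndim);
for k = 1:ndim
    err = PREDY(:, k) - knowny;                      % prediction errors of model k
    RMSEV(k) = sqrt(mean(err.^2));                   % root mean square error (assumed: / n)
    c = corrcoef(PREDY(:, k), knowny);               % 2 x 2 correlation matrix
    r2(k) = c(1, 2)^2;                               % squared correlation
end
disp(RMSEV)
disp(r2)
```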
Returns the predicted groups of the (unknown) data X
Input arguments:
===============
X : SAISIR matrix of predictive variables
plsdatype: structure returned by function 'plsda'
actual_group (optional): SAISIR vector of observed groups.
Observations with the same group number belong to the same group.
Output argument:
==============
res with fields:
confusion1: confusion matrix, method 1 (if "actual_group" defined)
ncorrect1: number of correct classifications, method 1 (if "actual_group" defined)
predgroup1: predicted group (method 1)
confusion: confusion matrix, method 0 (if "actual_group" defined)
ncorrect: number of correct classifications, method 0 (if "actual_group" defined)
predgroup: predicted group (method 0)
Method 1: attribution to the index of the maximum of predicted Y
Method 0: shortest Mahalanobis distance calculated on PLS scores
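Method 1 is simple enough to be sketched in a few lines of plain Matlab. The matrix of predicted Y below is invented, and the sketch is not the code of "applyplsda"; it only shows the attribution rule and how a confusion matrix (rows: actual groups, columns: predicted groups) follows from it.

```matlab
% Illustrative sketch only: attribution to the index of the maximum of predicted Y.
Ypred = [0.9 0.1 0.0;                      % predicted indicator values, one column per group
         0.2 0.7 0.1;
         0.1 0.3 0.6;
         0.6 0.3 0.1];
actual_group = [1; 2; 3; 2];               % known groups (optional in practice)

[~, predgroup1] = max(Ypred, [], 2);       % index of the largest predicted value per row

ngroup = size(Ypred, 2);
confusion1 = zeros(ngroup);                % rows: actual group, columns: predicted group
for k = 1:numel(actual_group)
    a = actual_group(k);
    b = predgroup1(k);
    confusion1(a, b) = confusion1(a, b) + 1;
end
ncorrect1 = trace(confusion1);             % correct classifications lie on the diagonal
disp(confusion1)
```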
Input arguments:
===============
spcrtype:(structure), output argument of spcr
X: SAISIR matrix of predictive variables
Output argument:
===============
predy: y predicted for all the tested dimensions
Input argument:
==============
X: SAISIR matrix of predictive variables
type: output argument of function "multiple_regression"
y (optional): SAISIR vector of known observed y
Output argument:
===============
res with fields:
ypred: predicted y
If input argument "y" is given:
r2: r2 value
RMSEV: root mean square error of validation
X: a matrix of dimension n x p
barycenter: a matrix (k x p) defining the barycenters of the k groups
"barycenter" can be the field "center" of the output of
function "nuee"
Application of quadratic discriminant analysis (test)
====================================================
Input arguments:
================
quaddis_type : output of function "quaddis" (structure)
x : predictive data matrix (n x p)
known_group : true groups of observations in x (n x 1) (optional)
Output arguments:
================
result with fields:
predgroup : predicted groups (n x 1)
density : pseudo-density (n x gmax)
proba : probability of belonging to a given group (n x gmax)
If "known_group" is defined:
nscorrect100 : percentage of correct classification (number)
sconfus : confusion matrix (gmax x gmax)
Input arguments:
===============
ridgetype (structure): output argument of ridge_regression
X: SAISIR matrix of predictive variables
y (optional): SAISIR vector of observed y
Output arguments:
===============
res with fields
predy: predicted values for all the ridge predictive models
(see function "ridge_regression")
If input argument "y" is given:
r2: r2 values for all the ridge predictive models
rmsecv: root mean square error of validation for all the predictive models
Input arguments:
================
stepwise_type: array of cells obtained as output of stepwise_regression
X : SAISIR matrix of predictive variables (n x p)
y (optional): SAISIR vector of the observed variable (n x 1)
Output argument
===============
res with fields
predy : predicted y for all the tested models
rmsev : root mean square error of validation (if input argument "y" is defined)
r2 : determination coefficients between observed and predicted y
Predictions are built for as many models as are available in "stepwise_type".
Input argument
=============
bag: "bag" structure, output of function "excel2bag"
Output argument
=============
group_type : array of cells of SAISIR structures,
such that group_type{i} contains the SAISIR structure of groups defined
by the corresponding column i in "bag"
The function creates as many groups as there are different strings in each column of bag.d.
Useful for discriminant analysis or correspondence analysis.
Input arguments
==============
bag1, bag2, bag3 ... : structure "bag" as defined in function
"excel2bag"
Output argument
=============
bag: concatenation of bag1, bag2, bag3 ..., with an increase of the number of
rows.
The second and third dimensions of bag.d must be equal.
Input arguments:
==============
X: SAISIR matrix (n x p)
col1, col2 : indices of the variables plotted on the X and Y axes (integers)
group: SAISIR vector of groups (n x 1). Indicates the group membership
of each observation
charsize (optional): size of the font (default: 7).
Displays two columns as a map.
Each observation is linked to the barycentre of its own group by a straight line.
Input arguments:
---------------
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of the variable to be predicted
ndim: maximum number of dimensions requested
Output arguments:
----------------
res with fields:
T: PLS scores (n x ndim)
P: PLS loadings such that X = TP + residuals (ndim x p)
beta: final regression coefficients with ndim dimensions (p x 1)
beta0: final intercept value (number)
meanx: mean of X (1 x p)
meany: mean of y (number)
predy: predicted y with ndim dimensions (n x 1)
error: root mean square error of the model with ndim dimensions (number)
corcoef: correlation coefficient r between observed and predicted values
in the model with ndim dimensions
BETA: regression coefficients for the ndim models (p x ndim)
BETA0: intercepts of the ndim models (ndim x 1)
loadings: PLS loadings such that T = X*loadings (with X centred) (p x ndim)
Q: Q value such that y = TQ + residual (ndim x 1)
PREDY: predicted y values up to ndim dimensions (n x ndim)
RMSEC: root mean square error of prediction up to ndim dimensions (1 x ndim)
r2: determination coefficient between observed and predicted values (1 x ndim)
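The relations quoted in the list of fields (X = TP + residuals, T = X*loadings with X centred, y = TQ + residual) can be checked on a small example. The sketch below uses the classical NIPALS algorithm for a single y; this is a common textbook formulation and is only assumed, not confirmed, to match what "basic_pls" computes. The data and the name W (weights) are introduced for the illustration.

```matlab
% Illustrative sketch only: textbook PLS1 (NIPALS) on ordinary matrices.
X = [1 2; 2 1; 3 5; 4 3; 5 6; 6 4];        % n = 6 observations, p = 2 variables
y = [3; 3.2; 8.1; 6.9; 11; 10.1];
ndim = 2;

[n, p] = size(X);
meanx = mean(X, 1);
meany = mean(y);
Xc = X - repmat(meanx, n, 1);              % centred X
yc = y - meany;                            % centred y

E = Xc;                                    % residuals of X, deflated at each step
f = yc;                                    % residuals of y
W = zeros(p, ndim); P = zeros(ndim, p); T = zeros(n, ndim); Q = zeros(ndim, 1);
for a = 1:ndim
    w = E' * f;  w = w / norm(w);          % weight vector
    t = E * w;                             % scores of dimension a
    P(a, :) = (E' * t)' / (t' * t);        % loadings of X (row a)
    Q(a) = (f' * t) / (t' * t);            % loading of y
    E = E - t * P(a, :);                   % deflation of X
    f = f - t * Q(a);                      % deflation of y
    W(:, a) = w;  T(:, a) = t;
end

loadings = W / (P * W);                    % such that T = Xc * loadings (p x ndim)
beta  = loadings * Q;                      % regression coefficients with ndim dimensions
beta0 = meany - meanx * beta;              % intercept
predy = X * beta + beta0;                  % predicted y
disp([y predy])
```

With ndim equal to the number of variables (and X of full rank), the PLS coefficients coincide with those of ordinary least squares, which gives a convenient check of the sketch.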
Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR matrix of observed y (n x m)
maxdim (integer): maximum number of PLS dimensions
Output arguments:
===============
result with fields
T: PLS scores (n x maxdim)
loadings: PLS loadings (p x maxdim)
meanx: average of X (1 x p)
res: array of cells (maxdim elements)
A structure in res{i} with i = 1 ... maxdim contains the results of the
prediction of the variable i.
result.res{i} has the fields:
nom: (string) , name of the considered variable i
BETA (p x maxdim): regression coefficients up to maxdim dimensions
BETA0 (1 x maxdim): intercepts of the models
PREDY (n x maxdim): predicted y for 1 to maxdim dimensions
RMSEC (1 x maxdim): root mean square errors of calibration
r2 (1 x maxdim): r2 for 1 to maxdim dimensions
Displays the rows of the SAISIR matrix X as curves.
Right button to go down, left button to go up, Ctrl-C to exit.
If X.v can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.
Input arguments:
================
directory_name: name of the directory (as used with the Matlab command "what")
file_name: name of the resulting HTML file
The extension ".htm" is added to this name
thematic_list (optional): output of function "thematic_classification"
The function builds the HTML documentation of SAISIR by concatenating the
help texts of the functions. If "thematic_list" is defined, it also gives a
thematic list of functions.
The resulting HTML file is in "file_name".
Typical example
===============
aux=what('saisir'); %% functions in the directory "saisir"
function_name=char(aux.m); % list of functions in SAISIR
build_documentation('saisir','SAISIR documentation'); %% builds the HTML file
% To also have a thematic index in the HTML document, one must use the
% function "thematic_classification".
% For example:
mylist=thematic_classification(function_name);
build_documentation('saisir','SAISIR_documentation',mylist); %% builds the HTML file
The resulting SAISIR_documentation.htm file can then be examined with a web
browser such as Window explorer or Firefox. Window explorer is better
here.
See also: "thematic_classification"
Each column of x must contain integer values.
Builds the complete table of indicators.
Useful for computing multiple correspondence analysis.
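A complete table of indicators (one 0/1 column per level of each qualitative variable) can be built as in the sketch below. It works on an ordinary matrix of integers, the data are invented, and it is not the code of "build_indicator"; the second output "groupings" is not reproduced.

```matlab
% Illustrative sketch only: complete disjunctive (indicator) table.
x = [1 2; 2 2; 1 3; 3 2];                  % 4 observations, 2 qualitative variables

indicator = zeros(size(x, 1), 0);          % columns are appended one level at a time
for j = 1:size(x, 2)
    levels = unique(x(:, j));              % distinct integer codes of variable j
    for k = 1:numel(levels)
        indicator = [indicator, double(x(:, j) == levels(k))];
    end
end
disp(indicator)
```

Each row of the table contains exactly one 1 per original variable, so every row sum equals the number of columns of x.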
Computes a correspondence analysis from the contingency table in N.
If only groupings are available, the contingency table must be computed
before using this function (see for example function "contingency_table")
==============================================================
Fields of the output
score :CA scores of rows followed by CA scores of columns
eigenval :eigenvalues, percentage of inertia, cumulated percentage
contribution :contributions to the components (rows, then columns)
squared_cos :squared cosines (rows, then columns)
khi2 :chi-square of the contingency table
df :degrees of freedom
probability :probability of random values in the contingency table
==============================================================
The identifiers of rows of 'score' (which are the identifiers of rows and
columns of N, i.e. N.i and N.v)
are preceded by the letter 'r' or 'c'.
It is therefore possible to use colors to emphasize rows and columns in
the simultaneous biplot of rows and columns.
Source: G. Saporta. Probabilités, analyse des données et statistiques.
Edition Technip, page 198 and following.
REMARK: use function "ca_map" to plot the observation/variable biplot
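The fields "khi2" and "df" correspond to the usual chi-square statistic of a contingency table. The sketch below computes them with the standard formula on an invented 2 x 2 table; it illustrates the statistic only and is not the code of "ca".

```matlab
% Illustrative sketch only: chi-square statistic of a contingency table.
N = [10 20; 30 40];                              % contingency table (counts)

n = sum(N(:));                                   % total number of observations
expected = sum(N, 2) * sum(N, 1) / n;            % counts expected under independence
khi2 = sum(sum((N - expected).^2 ./ expected));  % chi-square statistic
df = (size(N, 1) - 1) * (size(N, 2) - 1);        % degrees of freedom
disp([khi2 df])
```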
Input arguments:
================
X: SAISIR data matrix
col1, col2: ranks of the columns to be displayed (normally scores obtained
from function "ca")
startpos, endpos: positions, in the identifiers, of the string determining the
color of the display
Biplot of two columns as a colored map, useful for correspondence analysis
(from function "ca").
The coloration of the displayed descriptors depends on the arguments
"startpos" and "endpos". If one of these arguments is zero, a
single (black) color is used.
Otherwise, the string name(startpos:endpos) is extracted from the name of each individual.
Two observations for which these strings are different
are colored differently.
THIS FUNCTION IS SPECIFIC TO CORRESPONDENCE ANALYSIS:
If the first letter of the identifier is either c (column) or r (row),
this letter is removed from the name.
The letter c produces an italic display. This allows a representation in
which the variables are in italic letters.
Input argument:
---------------
X : SAISIR matrix (n x p)
Output argument:
---------------
X1:SAISIR matrix (n x p) centered (the average of each column of X1 is
equal to 0)
xmean: SAISIR vector (1 x p) of the average row.
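As an illustration, column centering can be sketched as follows on a plain MATLAB matrix (the data values are hypothetical and the SAISIR fields are left out):

```matlab
X = [1 2; 3 6; 5 10];                      % n x p data
xmean = mean(X, 1);                        % 1 x p average row
X1 = X - repmat(xmean, size(X, 1), 1);     % subtract the average row from every row
disp(mean(X1, 1));                         % each column average is now 0
```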
Input argument:
--------------
pcatype : output argument of function "pca"
Output argument:
---------------
pcatype1: new pca structure
The function is useful when several PCAs have been computed.
For the sake of clarity, it may be useful to have the axes oriented in
(about) the same directions. In this way, the graphical representations may be
easier to interpret.
see also: function "smart_coord"
Input argument:
--------------
"string": a matrix of characters
Output arguments:
--------------
detected: vectors giving the indices of the observations having the same
name.
names: matrix of characters giving the identical names found
This function is mainly used in relationship with function
"reorder".
Input argument:
--------------
X: SAISIR matrix (n x p)
group: SAISIR vector of groups (integers, n x 1).
The function displays all the observations as curves.
Each curve is colored according to the values in "group". The
observations with the same group number are colored identically.
Biplot of two columns as colored map
Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: position in the identifier strings of rows ('.i') for
the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
marg (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
For French users: there is a synonym function
"carte_couleur1".
Use preferably "colored_map1".
The coloration of the displayed descriptors depends on the arguments
startpos and endpos.
From the names of the individuals, the string name(startpos:endpos) is
extracted. Two observations
for which these strings are different are also colored differently.
Example:
Let X be a SAISIR matrix
%Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'colored_map1(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.
The characters are extracted from 1 to 5, giving the strings 'wheat',
'barle',
'ricex'. A different color will be given to each of these strings.
See also colored_map2 (same principle but with the whole identifier name
displayed)
Biplot of two columns as colored map
Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: position in the identifier strings of rows ('.i') for
the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
marg (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
For French users: there is a synonym function
"carte_couleur2".
Use preferably "colored_map2".
The coloration of the displayed descriptors depends on the arguments
startpos and endpos.
From the names of the individuals, the string name(startpos:endpos) is
extracted. Two observations
for which these strings are different are also colored differently.
Example:
Let X be a SAISIR matrix
%Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'colored_map2(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.
The characters are extracted from 1 to 5, giving the strings 'wheat',
'barle',
'ricex'. A different color will be given to each of these strings.
See also colored_map1 (same principle but with only the portion of
the identifier names, from startpos to endpos, displayed)
Input_arguments
--------------
X: SAISIR matrix of data (n x p)
col1, col2 : rank of the columns to be represented
choice: first criterion of choice, dealing with the colors
symbol_choice (optional): second criterion of choice, dealing with the
symbols
Biplot of two columns as a colored map.
The coloration of the displayed descriptors depends on the argument
"choice" (either matrix of char, vector of numbers or SAISIR structure),
with a number of elements equal to the number of rows in X.
If the elements of "choice" are different, they are also colored
differently.
If "symbol_choice" is also defined (either matrix of char, vector
of numbers
or SAISIR structure), different symbols are used. The color of the point
is then given by "choice", and the shape of the symbol depends on
"symbol_choice".
Examples of use:
%Colored text, with color determined by the second character in wheat.i
colored_map4(wheat,1,50,wheat.i(:,2));
%Colored symbols, with color determined by the second character in wheat.i
%and shape determined by the third character in wheat.i
colored_map4(wheat,1,50,wheat.i(:,2),wheat.i(:,3));
Input arguments:
---------------
collection: vector of SAISIR matrices (the numbers "nrow" of rows in
each table must be equal).
ndim: number of common dimensions
threshold (optional): if the "difference of fit" is lower than threshold,
the iterative loop is stopped
Output arguments:
-----------------
res with fields:
Q : observations scores (nrow x ndim)
explained : 1 x ndim, percentage explanation given by each dimension
saliences : weight of the original tables in each
dimension (ntable x ndim).
Method published by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H.
MacFie in Food Quality and Preference 11 (2000) 151-154.
Typical example (suppose 3 SAISIR matrices
"spectra1", "spectra2", "spectra3"):
collection(1)=spectra1; collection(2)=spectra2; collection(3)=spectra3;
myresult=comdim(collection);
map(myresult.Q,1,2);%% looking at the compromise scores
figure;
map(myresult.saliences,1,2);%% looking at the weights
Input argument:
--------------
table: SAISIR matrix of contingency table (n x p)
Output argument:
--------------
res with fields:
theo: theoretical contingency table assuming independence of rows and
columns in "table"
khi2: chi-square value
dll: degrees of freedom of the model
P: probability of the null hypothesis ("independence of rows and
columns")
Each element of the input argument "table" gives the number of
observations which belong both to the group of the corresponding row and
to the group of the corresponding column. For example, table.d(2,4) is the
number of observations which are both in group 2 of the rows and in group 4
of the columns.
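The quantities listed above can be sketched in plain MATLAB as follows. This is a minimal illustration on an ordinary matrix with hypothetical counts, not the code of the function; in particular, computing P with "gammainc" (upper tail of the chi-square distribution) is an assumption made here to avoid any toolbox dependency.

```matlab
table = [20 10 5; 8 15 12];                      % hypothetical contingency table
n = sum(table(:));
theo = sum(table, 2) * sum(table, 1) / n;        % expected counts under independence
khi2 = sum(sum((table - theo) .^ 2 ./ theo));    % chi-square statistic
dll = (size(table, 1) - 1) * (size(table, 2) - 1);
P = gammainc(khi2 / 2, dll / 2, 'upper');        % probability of a larger chi-square
fprintf('khi2 = %.3f, dll = %d, P = %.4f\n', khi2, dll, P);
```

A small value of P suggests that the rows and the columns are not independent.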
The contingency table can be created with the function
"contingency_table"
Input arguments
---------------
g1 and g2: SAISIR vectors (n x 1) of groups (possibly computed from
"create_group1").
In these vectors, the same number indicates membership of the same group.
Output argument
---------------
table : contingency table (ngroup1 x ngroup2). A value
table.d(i,j) indicates the
number of observations belonging both to group i in g1 and to group j in g2.
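As an illustration, such a cross-count can be obtained in plain MATLAB with "accumarray". The group vectors below are hypothetical, and the SAISIR identifiers are left out:

```matlab
g1 = [1 1 2 2 2 3]';                 % group numbers, first grouping
g2 = [1 2 2 2 1 1]';                 % group numbers, second grouping
table = accumarray([g1 g2], 1);      % table(i,j) = number of obs. in group i and group j
disp(table);
```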
See also "contingency_khi2", "ca" (correspondence
analysis),
"build_indicator"
Input arguments
--------------
X1 and X2: SAISIR matrix dimensioned n x p1 and n x p2 respectively
Output argument:
---------------
cor: matrix of correlations dimensioned p1 x p2
The tables must have the same number of rows.
An element cor.d(i,j) is the correlation coefficient between column i
of X1 and column j of X2.
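A minimal sketch of this computation on plain MATLAB matrices (hypothetical data, no SAISIR structure) centers each column, scales it to unit norm, and takes the cross-product:

```matlab
X1 = [1 2; 2 1; 3 4; 4 3];                         % n x p1
X2 = [2 1 0; 4 0 1; 6 2 0; 8 1 1];                 % n x p2
n = size(X1, 1);
Z1 = X1 - repmat(mean(X1, 1), n, 1);               % centered columns
Z2 = X2 - repmat(mean(X2, 1), n, 1);
Z1 = Z1 ./ repmat(sqrt(sum(Z1 .^ 2, 1)), n, 1);    % unit-norm columns
Z2 = Z2 ./ repmat(sqrt(sum(Z2 .^ 2, 1)), n, 1);
cor = Z1' * Z2;                                    % p1 x p2 correlation coefficients
disp(cor);
```

Here the first column of X2 is twice the first column of X1, so cor(1,1) equals 1.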
The baseline is modelled by a straight line going from data points col1 to
col2
Input arguments
---------------
pcatype: output argument of function "pca"
X: original data matrix (n x p)
col1 and col2: ranks of the PC-scores to be represented.
startpos and endpos (optional): key in the variable identifiers for coloring
the variables
Output argument
--------------
res: matrix (p x n scores) of the correlations between the variables and all
the available PC-scores
The function draws the correlation circle.
%typical example
%Let "chemistry" be a SAISIR matrix
mypca=pca(chemistry);
correlation_circle(mypca,chemistry,1,2);%correlation circle of plane #1-#2
%Note: Use preferably the function "correlation_plot"
Input arguments:
==============
scores: - ORTHOGONAL scores obtained by multidimensional analysis
col1 and col2: - Indices (ranks) of the scores to be plotted (integer
numbers)
X1, X2, ... - Arbitrary number of tables giving the variables to be plotted
The number of rows in the scores and other tables must be identical.
The function displays the correlation circle, with a different color for
each table in X1, X2 ...
A dotted line gives the level of 50% explained variance.
If the input argument "scores" has non-orthogonal columns, the
graph is normally incorrect
and a warning message is displayed.
Input arguments:
---------------
covariance_type: output argument of function
"cumulate_covariance"
nscore : (integer, optional) number of scores to be calculated (default :
all)
Output argument:
----------------
pcatype with fields:
eigenvec: eigenvectors
eigenval: eigenvalues
average: average value of the active observations
Performs PCA on the covariance matrix as calculated
by "cumulate_covariance".
This function is mainly used to compute PCA on huge data sets, which cannot
be loaded completely in the free memory, and thus must be split into smaller
subsets of observations (see "cumulate_covariance").
%typical example:
%A complete script must therefore be something like
cov=cumulate_covariance(spectra_1);%% starting with matrix 1
cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2
cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3
...
cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k
covariance_type=cumulate_covariance([],cov);%% finishing
pcatype=covariance_pca(covariance_type);%% PCA (optional second argument: nscore)
score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1
Related function: "cumulate_covariance" (covariance of huge data
sets)
Input arguments
--------------
X1 and X2: SAISIR matrix dimensioned n x p1 and n x p2 respectively
Output argument:
---------------
cov: matrix of covariances dimensioned p1 x p2
The tables must have the same number of rows.
An element cov.d(i,j) is the covariance between column i of X1 and column j of
X2.
Input arguments:
---------------
X: SAISIR matrix (n x p)
code_list: matrix of characters (k x q)
startpos, endpos: place in the identifier names where to find the code
Output argument:
---------------
group: SAISIR vector (n x 1) of groups. The same number indicates that the
observations belong to the same group.
Normally, k groups are identified.
Typical use:
group=create_group(X,['A1';'B2';'C1'],3,4);
The command seeks in X.i the codes 'A1', 'B2', 'C1' in positions 3 to 4.
Observations with code 'A1', 'B2', 'C1' are placed in groups numbered 1, 2
and 3 respectively.
See also "create_group1".
Group structures are used in discriminant analysis, ANOVA and related
methods.
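The matching of codes inside the identifiers can be sketched in plain MATLAB as below. The identifier matrix "ident" stands for X.i and is hypothetical, as are the other variable names; this is an illustration of the principle, not the code of the function.

```matlab
ident = ['xxA1yy'; 'xxB2yy'; 'xxC1yy'; 'xxA1zz'];   % hypothetical identifiers (X.i)
code_list = ['A1'; 'B2'; 'C1'];
startpos = 3; endpos = 4;
key = cellstr(ident(:, startpos:endpos));           % code found in each identifier
group = zeros(size(ident, 1), 1);
for k = 1:size(code_list, 1)
    match = strcmp(key, code_list(k, :));           % rows carrying the k-th code
    group(match) = k;
end
disp(group');
```

In this sketch, an observation whose code is not in code_list keeps the group number 0.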
Creates as many groups as there are different strings from startpos to endpos
in the identifiers.
function saisir=create_group1(s,startpos,endpos)
s: SAISIR matrix; startpos and endpos: positions of the discriminating characters
input arguments
===============
X: Saisir data matrix (n x p)
group: Saisir vector of groups (integer, n x 1). Two observations belonging
to the same group have the same group number.
among: maximum rank of the PC scores entered in the model (see
"fda1" for
details)
maxvar: (integer) maximal number of scores entered in the model
ntest: (integer) number of observations of each group placed in the validation
set
Output arguments:
===============
res with fields
fdatype: fdatype built up with the calibration set (see function
"fda1")
verification:
with fields
datafactor: discriminant scores of the validation set
predicted_group: predicted groups in the validation set
confusion: confusion matrix of the validation set (rows: actual group;
columns: predicted group)
The function applies "fda1" by dividing the sample in X into a
calibration set and a validation set.
"ntest" observations in each group are randomly (with no repeat)
placed in the validation set.
%typical example:
g=create_group1(wheat,2,3);%% creation of the vector of group number
res=crossfda1(wheat,g,10,5,5);
res.validation.confusion;%% confusion matrix in validation
%colored_map1(res.validation.datafactor,1,2,2,3); %% looking at the
%discriminant biplot of the validation set
Input arguments
==============
X: SAISIR matrix (n x p)
group: SAISIR vector of group numbers (n1 x 1)
maxvar: (integer) maximum number of variables introduced in the model
ntest: (integer) number of observations in each group in the validation
set
Output argument:
===============
discrtype with fields:
step with fields
index: vector giving the rank of the variable introduced at each step
(maxvar values)
correct: vector giving the number of correct classifications in the
calibration set (maxvar values)
name: identifiers of the variables introduced in the models (matrix of
char with maxvar rows)
ntestcorrect: vector giving the number of correct classifications in the
validation set (maxvar values)
classed: SAISIR vector of the predicted group numbers of the calibration
set (n1 x 1). Only the result of the final step is given.
testclassed: SAISIR vector of the predicted group numbers of the test set
(n2 x 1).
Only the result of the final step is given.
selected: vector indicating the observations in the calibration (0) and in
the validation (1) set
confusion with fields
cal: confusion matrix of the calibration set
val: confusion matrix of the validation set
Applies "maha1" by dividing the sample in X into a calibration set and a
verification set.
ntest observations in each group are randomly (with no repeat) placed in
the test group.
See also function "maha1"
Input arguments:
===============
X: SAISIR matrix of predictive data (n x p)
group: SAISIR vector of group numbers (n x 1). The same number indicates that
the observations belong to the same group
dim: maximum number of PLS dimensions
selected: MATLAB vector with 0 = calibration sample, 1 = verification sample
Output argument:
===============
res with fields:
confusion1: confusion matrix of the calibration set, method 1
ncorrect1: number of correctly classified obs. in calibration, method 1
nscorrect1: number of correctly classified obs. in validation, method 1
sconfusion1: confusion matrix of the validation set, method 1
confusion: confusion matrix of the calibration set, method 0
ncorrect: number of correctly classified obs. in calibration, method 0
nscorrect: number of correctly classified obs. in validation, method 0
sconfusion: confusion matrix of the validation set, method 0
info: ' no index: max of predicted Y; 1: mahalanobis distance on latent
variables t'
The function divides the data collection into a calibration set
(selected=0) and a validation set (selected = 1)
The function "plsda" is applied on the calibration set, and
tested on the validation set.
Two strategies for attributing a group to each observation are tested:
Method 0 (no index): the observations are classified in the group for
which the predicted indicator variable is the highest.
Method 1: (preferable) linear discrimination on the PLS scores
see also:"plsda", "applyplsda"
Input args:
===========
X: predictive data n x p
y: Variable to be predicted n x 1
ndim: maximal number of dimensions in the PLS model
selected: Matlab vector n x 1 (0= obs in calibration; 1 in validation)
Output args
============
res with fields
---calibration: calibration results with fields
BETA: regression coefficients (p x ndim)
BETA0: intercept (p x 1)
PREDY: predicted y in calibration (n x ndim)
T: scores PLS (n x ndim)
RMSEC: Root mean square error of calibration (1 x ndim)
r2: determination coefficient (yobs/ypred) (1 x ndim)
---validation : validation results with fields
PREDY: predicted y in validation (n x ndim)
RMSEV: Root mean square error of validation (1 x ndim )
r2: determination coefficient (yobs/ypred) (1 x ndim)
OBSY: observed y in validation (number of rows=number of obs in
validation)
See also: "crossvalpls1a"
Note:
"crossvalpls" is faster than "crossvalpls1a" but gives
less information (for
example, the PLS scores in validation and the loadings in calibration are
not given here).
Input args:
===========
X: predictive data n x p
y: Variable to be predicted n x 1
ndim: maximal number of dimensions in the PLS model
selected: Matlab vector n x 1 (0= obs in calibration; 1 in validation)
Output args
============
res with fields
---calibration: calibration results with fields
T: PLS scores (n x ndim)
P: PLS loadings such as X = TP + residuals (ndim x p)
beta: final regression coefficients with ndim dimensions (p x 1)
beta0: final intercept value (number)
meanx: mean of X (1 x p)
meany: mean of y (number)
predy: predicted y with ndim dimensions (n x 1)
error: Root mean square error of the model with ndim dimensions (number)
corcoef: correlation coefficient r between observed and predicted values
in the model with ndim dimensions
BETA: regression coefficients for the ndim models (p x ndim)
BETA0: intercepts of the ndim models (ndim x 1)
loadings: PLS loadings such as T=X*loadings (with X centred) (p x ndim)
Q: Q value such as y = TQ + residual (ndim x 1)
PREDY: predicted y values up to ndim dimensions (n x ndim)
RMSEC: Root mean square error of calibration up to ndim dimensions (1 x
ndim)
r2: determination coefficient between observed and predicted values (1 x
ndim)
---validation : validation results with fields
PREDY: predicted y in validation (n x ndim)
RMSEV: Root mean square error of validation (1 x ndim )
r2: determination coefficient (yobs/ypred) (1 x ndim)
T : PLS scores in validation
OBSY: observed y in validation (number of rows=number of obs in
validation)
See also: "crossvalpls"
Note:
crossvalpls1a is slower than crossvalpls but gives more information (for
example the PLS scores in validation)
Input arguments:
================
X: SAISIR matrix of predictive data (n x p)
y: SAISIR vector of the variable to be predicted ( n x 1)
selected: MATLAB vector ( n x 1) giving the samples placed in the
verification set: 1= in verification; 0 = in calibration set
Output arguments:
================
res with fields:
calibration with fields
ypred: predicted y (n x 1)
beta0: intercept of the regression
beta: regression coefficients (p x 1)
r2: r2 between observed and predicted values in calibration
RMSEC: Root mean square error of calibration
validation with fields
ypred: predicted values in validation
r2: determination coefficient between obs and predicted values in
validation
RMSEV: Root mean square error of validation
Divides the data into a calibration and a validation set.
Multiple linear regression is established on the calibration set
and validated on the validation set.
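The calibration/validation scheme can be sketched in plain MATLAB as follows. The data are hypothetical and noise-free, the SAISIR structures are left out, and the error formula used here (square root of the mean squared residual) is an assumption; the function itself may use a different divisor.

```matlab
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5; 7 9; 8 7];      % predictive data (n x p)
y = 1 + 2 * X(:, 1) - X(:, 2);                     % hypothetical response
selected = [0 0 0 0 0 1 0 1]';                     % 0 = calibration, 1 = validation
cal = (selected == 0);
val = (selected == 1);
A = [ones(sum(cal), 1), X(cal, :)];                % design matrix with intercept
b = A \ y(cal);                                    % least squares fit
beta0 = b(1);
beta = b(2:end);
RMSEC = sqrt(mean((y(cal) - A * b) .^ 2));         % error in calibration
ypred = beta0 + X(val, :) * beta;                  % prediction of the validation set
RMSEV = sqrt(mean((y(val) - ypred) .^ 2));         % error in validation
```

With noise-free data, the fitted coefficients equal those used to generate y and both errors are zero up to rounding.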
The division calibration/validation is determined by the vector
"selected"
Input arguments:
================
X : matrix of predictive data (n x p)
group : vector of known groups (integers, n x 1)
selected : MATLAB vector (n x 1)
with elements 0: selected in the calibration set
1: selected in the validation set
Output argument
===============
res with fields:
calibration: structure quaddis_type as defined in function
"quaddis"
validation : structure as defined in "apply_quaddis"
see also quaddis, apply_quaddis
Input arguments:
---------------
X: Saisir matrix of the predictive data set (n x p)
y: Saisir vector of value to be predicted (n x 1)
krange: Matlab vector of double (k x 1) (see function
"ridge_regression")
selected: Matlab vector whose elements are either equal to 0 or 1
Output arguments
----------------
res with fields
predy: predicted y in validation (n2 x k)
obsy: observed y in validation (n2 x k)
r2: r2 between observed and predicted y in validation (1 x k)
rmsecv: root mean square error of validation (1 x k)
ridgetype: see function "ridge_regression"
Divides a collection into a calibration set (selected = 0) and a verification
set (selected = 1).
Applies ridge_regression on the calibration set and tests the models on the
validation set.
All the models with the k parameter in "krange" are tested.
This function is mainly useful for computing PCA on very large data sets
Input arguments:
---------------
X: SAISIR matrix X (n x p) or []
covariance_type (optional): output argument of the function
"cumulate_covariance"
Output argument:
----------------
covariance_type: either intermediate results in cumulating the covariance
or a structure containing the covariance matrix (at completion)
At completion, "covariance_type" has fields:
covariance: matrix p x p of covariances
average: average value of all the observations (1 x p)
n: total number of observations involved in computing the covariance
matrix
If the second argument (covariance_type) is undefined, the function initiates
the calculation of the covariance.
If the two arguments are defined, it cumulates the covariance.
If the first argument is [], it finishes the work.
A complete script must therefore be something like
cov=cumulate_covariance(spectra_1);%% starting with matrix 1
cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2
cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3
...
cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k
covariance_type=cumulate_covariance([],cov);%% finishing
pcatype=covariance_pca(covariance_type);%% PCA (optional second argument: nscore)
score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1
Related function: "covariance_pca" (PCA from covariance)
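The idea of cumulating a covariance over blocks of observations can be sketched in plain MATLAB by keeping running sums. This is one possible way of doing it, with hypothetical data and names; the internal fields used by "cumulate_covariance" and its choice of divisor (n - 1 is assumed here) are not documented above.

```matlab
data = [1 2; 3 1; 4 6; 7 5; 8 9; 2 2];               % hypothetical complete data set
chunks = {data(1:2, :), data(3:5, :), data(6, :)};   % blocks loaded one at a time
p = size(data, 2);
n = 0;                    % number of observations seen so far
s = zeros(1, p);          % running sum of the rows
ss = zeros(p, p);         % running sum of cross-products
for k = 1:numel(chunks)
    Xk = chunks{k};
    n = n + size(Xk, 1);
    s = s + sum(Xk, 1);
    ss = ss + Xk' * Xk;
end
average = s / n;                                         % 1 x p average
covariance = (ss - n * (average' * average)) / (n - 1);  % p x p covariance
disp(covariance);
```

Only the three running quantities need to stay in memory, whatever the number of blocks.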
Input argument:
--------------
X : SAISIR matrix
nrow : index of the row to be shown
xlabel, ylabel (optional) : labels on X and Y
title (optional) : title of the graph.
This function draws the row (typically a spectrum) as a curve.
If X.v can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.
The function "courbe" is a synonym of this function.
Input arguments:
---------------
X : SAISIR matrix
range (optional): vector of integer values giving the indices of the
rows to be displayed (default : all rows displayed)
xlabel, ylabel (optional): labels in X and Y
title (optional): title of the graph.
If X.v can be interpreted as a vector of numbers (such as wavelengths),
the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.
Example:
curves(spectra,1:2:100,'wavenumber','log 1/R','Raman spectra');
plots the rows 1, 3, 5, ... 99 as curves.
The function "courbes" is a synonym of this function.
Input argument:
==============
X : Saisir matrix n x n of squared distances
Output argument:
===============
ftype with fields:
eigenval: eigenvalues
score: scores
%typical (demonstrative) example
%===============
xdist=distance(data,data);
xdist.d=xdist.d.*xdist.d ;%% warning !! squared distance needed
ftype=d2_factorial_map(xdist);
map(ftype.score,1,2);
p=pca(data);
figure;
map(p.score,1,2);%% identical to previous figure
Useful when only the distance matrix is available.
Uses the Torgerson approach to transform squared distances into pseudo
scalar products. Gives the factorial scores of the distances.
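The Torgerson transformation can be sketched in plain MATLAB as below: the squared distances are double-centered to obtain pseudo scalar products, whose eigenvectors give the scores. The data and variable names are hypothetical, and the sketch is an illustration of the principle rather than the code of the function.

```matlab
data = [0 0; 3 0; 0 4; 3 4; 1 1];                     % hypothetical observations
n = size(data, 1);
G = data * data';
sq = diag(G);
D2 = repmat(sq, 1, n) + repmat(sq', n, 1) - 2 * G;    % squared Euclidean distances
J = eye(n) - ones(n) / n;                             % centering operator
B = -0.5 * J * D2 * J;                                % pseudo scalar products
B = (B + B') / 2;                                     % remove rounding asymmetry
[V, L] = eig(B);
[eigenval, order] = sort(diag(L), 'descend');
V = V(:, order);
keep = eigenval > 1e-10 * max(eigenval);              % discard null dimensions
score = V(:, keep) * diag(sqrt(eigenval(keep)));      % factorial scores
disp(score);
```

The distances between the rows of "score" are the same as the distances between the original observations.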
input arguments
===============
X:Saisir matrix (n x p)
index:vector indicating the columns to be deleted
Output argument
==============
X1: Saisir matrix (n x q) with q <= p
The deleted columns are indicated by the vector index (numbers or booleans).
% Typical examples
reduced=deletecol(data,[1 3 5]);%% deletes 3 columns
reduced1=deletecol(data,sum(data.d)==0); % deletes all the columns with
% the sum equal to 0
See also: deleterow, selectrow, selectcol
input arguments
===============
X:Saisir matrix (n x p)
index: vector indicating the rows to be deleted
Output argument
==============
X1: Saisir matrix (n1 x p) with n1 <= n
The deleted rows are indicated by the vector index (numbers or booleans).
% Typical examples
reduced=deleterow(data,[1 3 5]);%% deletes 3 rows
reduced1=deleterow(data,sum(data.d,2)==0); % deletes all the rows with
% the sum equal to 0
See also: selectrow, selectcol, deletecol
Input arguments:
================
X : Saisir data matrix
topnodes (optional): level of cutting the dendrogram
Output argument
===============
group: group numbers at the level of cutting defined by topnodes.
Typical use
===========
g=dendro(data,30);
g will contain numbers ranging from 1 to 30 indicating the group number.
Attempts to display the identifiers on the dendrogram.
Works only with a small number of identifiers.
Input arguments:
=================
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of variable to be predicted (n x 1)
ndim: max number of PCR dimensions tested
selected: MATLAB vector of samples selected as calibration set (==0)
and verification set (==1)
Output arguments:
================
pcrres with fields
r2: determination coefficient between observed and predicted y (1 x ndim)
predy: predicted y for all the dimensions tested ( n x ndim)
rmsev: root mean square error of validation for all the dimensions tested
(1 x ndim)
obsy: observed y in the validation set
Components introduced in the order of the eigenvalues
Remarks:
1) The vector "selected" can be built randomly using the
function
"random_select".
2) The biplot of observed/predicted values can be displayed with the command
"xy_plot" (for example,
"xy_plot(pcrres.obsy,1,pcrres.predy,3)"
will show the PCR model with 3 dimensions).
Input arguments:
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of variable to be predicted (n x 1)
selected: MATLAB vector (n elements). 0 = in the calibration set;
1 = validation set
Pthres: probability threshold for entering or discarding a variable
confidence (optional): confidence interval for the correlation coefficient
Output argument:
res with fields
calibration (see "stepwise_regression")
validation (see "apply_stepwise_regression")
validation has fields
predy: predicted y in validation for all the regression models
rmsev: root mean square error of validation (idem)
r2: determination coefficient between observed and predicted y
observed_y: vector of observed y values in the validation set.
The function divides the data sets X and y into a calibration set and a
validation set. The vector "selected" defines the division.
The stepwise regression models are established on the calibration set
and tested on the validation set. All the calculated models are tested,
which gives as many columns of predicted y as the number of models computed
by stepwise_regression.
Input arguments
===============
X1, X2: SAISIR matrices dimensioned n1 x p and n2 x p respectively
Output argument
==============
D: matrix n1 x n2 of Euclidean distances between the observations
The tables must have the same number of columns.
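As an illustration, the n1 x n2 table of Euclidean distances can be computed in plain MATLAB without loops. The data are hypothetical and the SAISIR identifiers are left out:

```matlab
X1 = [0 0; 3 4];                     % n1 x p
X2 = [0 0; 6 8; 3 0];                % n2 x p
n1 = size(X1, 1);
n2 = size(X2, 1);
D2 = repmat(sum(X1 .^ 2, 2), 1, n2) + repmat(sum(X2 .^ 2, 2)', n1, 1) - 2 * X1 * X2';
D = sqrt(max(D2, 0));                % max(.,0) guards against tiny negative rounding errors
disp(D);
```

D(i,j) is the distance between row i of X1 and row j of X2.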
(no direct use)
This function builds up a part of the automatic documentation
This is a part of an HTML document as created by
"build_documentation"
Input arguments
==============
fid : file identifier
directory_name: name of the directory (working with
the Matlab command "what")
function_name: list of names of SAISIR functions
thematic_list: output of function "thematic_classification"
This function builds a part of the HTML SAISIR documentation (function
"build_documentation").
This part corresponds to the thematic list of functions.
This function must be called by the function
"build_documentation".
This function is not directly called
Input argument:
==============
X: Saisir data matrix
row_or_col:
if row_or_col==0, tries to keep the maximum number of rows
if row_or_col==1, tries to keep the maximum number of columns
Output argument:
===============
X1: Saisir data matrix with no NaN values
From a given SAISIR matrix possibly containing NaN values (undetermined
values),
creates a matrix of known values.
Only useful when very few rows or columns contain NaN values.
Input arguments:
================
- X: data matrix
- col1, col2: represented columns
- gr: qualitative groups (coded as integer numbers)
- centroid_variability: either 0 (variability of the individual data
points) or 1 (variability of the centroid itself) (default: 0)
- confidence: P value of the confidence interval to be out of the ellipse
(default: 0.05)
- point_plot: if point_plot ~= 0, also plots the individual points as
symbols (inactivated if centroid_variability is set to 1)
Useful in discriminant analysis and related methods.
function [X,bag] = excel2bag(filename,ref_text_col,(nchar),(deb),(xend))
reads an excel file which has been saved under the format .csv (the
delimiters are ';')
the excel file includes the identifers of rows and columns
Input arguments
===============
filename :excelfile in the '.csv' format
nchar :number of characters read in each cell of the excel files
(the other are ignored)
ref_text_col :an array of string giving the reference of columns designed
as forming the columns of bag.d.
THESE COLUMNS ARE DESIGNATED USING THE EXCEL STYLE ('AA','AB' ....)
deb : number of the first row decoded
xend : final row decoded
output argument:
===============
X: matrix of numerical values
bag: a structure (bag.d,bag.i,bag.v), with bag.d being here a matrix of
char
saisir contains the numerical values from the excel file, with the
exception of the columns referenced by ref_text_col
bag contains the charactes values from the excel file, referenced by
ref_text_col
Example:
[value,bag]=excel2bag('olive',['A'; 'B'],20)
the columns 'A' and 'B' from excel are read as text (in output bag)
the other columns are read as number (in value) respecting the saisir
structure.
This function is normally used together with "bag2group".
Return to thematic list
Return to alphabetic list
HOME
Input arguments
==============
filename: (string) name of the Excel text file in .csv format
nchar: (integer, optional) number of characters kept in the identifiers
(default: 20)
start: (integer, optional) index of the first observation to be loaded
xend: (integer, optional) index of the last observation to be loaded
(greater than start)
Reads an Excel file which has been saved in the .csv format (the
delimiters are ';').
The Excel file includes the identifiers of rows and columns.
The Excel layout must be the following (example):
          varname1  varname2  varname3
obsname1  number11  number12  number13
obsname2  number21  number22  number23
obsname3  number31  number32  number33
The decimal separator is the point ("."), NOT THE COMMA (",").
Example of .csv format (3 rows named "obs1", "obs2", "obs3";
3 columns named "var1", "var2", "var3")
data :
=======================
      var1  var2  var3
obs1  1     2     3
obs2  4     5     6
obs3  7     8     9
=======================
The corresponding .csv Excel file is:
;var1;var2;var3
obs1;1;2;3
obs2;4;5;6
obs3;7;8;9
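As a hedged illustration of this layout, the following plain MATLAB sketch decodes the four lines above into variable names, observation names and a numeric matrix. It is not the code of "excel2saisir"; the lines are held in a cell array instead of being read from a file, and all variable names are made up for the example.

```matlab
% Sketch: parse ';'-delimited text laid out as described above.
lines = {';var1;var2;var3', 'obs1;1;2;3', 'obs2;4;5;6', 'obs3;7;8;9'};

header   = strsplit(lines{1}, ';', 'CollapseDelimiters', false);
varnames = header(2:end);            % first cell is empty (top-left corner)

nobs     = numel(lines) - 1;
obsnames = cell(nobs, 1);
values   = zeros(nobs, numel(varnames));
for k = 1:nobs
    cells        = strsplit(lines{k + 1}, ';', 'CollapseDelimiters', false);
    obsnames{k}  = cells{1};                   % row identifier
    values(k, :) = str2double(cells(2:end));   % '.' is the decimal separator
end
```

A value written with a comma as decimal separator (for example "1,5") would not be decoded correctly by str2double in this sketch (it treats the comma as a thousands separator), which is consistent with the requirement stated above.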
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
---------------
pcatype: (structure) output argument of function "pca" applied to
the predictive data set
group : SAISIR vector of group (integer values). Identical numbers mean
that the observations belong to the same group.
among : (integer) Maximal rank (dimension) of PC-score allowed to enter in
the model
maxscore: (integer) Maximum number of scores allowed to enter in the model.
Output arguments:
----------------
fdatype with fields:
introduced: rank of the PC scores introduced in the model
ncorrect: number of correct classifications (no validation) at each step
beta: projection coefficients such as datafactor=X*beta
datafactor: discriminant scores
centroidfactor: scores of the barycenters (centroids)
eigenval: eigenvalues of the discriminant analysis
confusion: confusion matrix (row actual; column predicted)
average: average of the predictive data set.
Performs a stepwise factorial discriminant analysis according
to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).
The basic idea is to perform a factorial discriminant analysis on the scores
of a previous PCA. The criterion of score selection is the maximisation of
the trace of T^(-1)B.
In order to avoid using PC scores with very small eigenvalues,
the input argument "among" gives the maximal dimension to be
allowed.
"maxscore" indicates the maximal number of scores.
"datafactor" corresponds to the final model
(with maxscore scores introduced). If one is interested in a more
economical model, it is easy, looking at the classification, to reduce the
value of "maxscore" and re-run "fda1".
Typical example:
g=create_group1(wheat,1,3);%% creation of a grouping from the identifier
%names, using characters in positions 1 to 3
p=pca(wheat); %% first PCA
res=fda1(p,g,20,5);%% model with 5 scores introduced among the 20 first
%ones
res.ncorrect.d
% ans =
% 13.00 1 PC score introduced
% 31.00
% 69.00
% 77.00
% 93.00 5 PC scores introduced
colored_map1(res.datafactor,1,2,1,3)% map of the discrimination
figure;
ellipse_map(res.datafactor,1,2,g,1,0.05) % shown as confidence ellipses
Note: the number of dimensions in datafactor cannot exceed the number of
qualitative groups minus 1
(with 2 groups, only 1 discriminant dimension!).
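The selection criterion can be illustrated on a small set of scores. The sketch below is plain MATLAB, not the "fda1" code; it assumes that T is the total sums-of-squares-and-products matrix of the candidate scores and B the corresponding between-group matrix, and all variable names are made up for the example.

```matlab
% Sketch: trace of T^(-1)B for a set of candidate scores (assumed definitions).
scores = [1 0; 2 1; 1 1; 6 5; 7 5; 6 6];
group  = [1; 1; 1; 2; 2; 2];

n        = size(scores, 1);
grand    = mean(scores, 1);
centered = scores - repmat(grand, n, 1);
T        = centered' * centered;          % total matrix

B      = zeros(size(T));                  % between-group matrix
labels = unique(group);
for k = 1:numel(labels)
    members = (group == labels(k));
    d = mean(scores(members, :), 1) - grand;
    B = B + sum(members) * (d' * d);
end

criterion = trace(T \ B);                 % larger means better group separation
```

With two groups, B has rank 1, so the criterion here lies between 0 and 1; this matches the note above that two groups give only one discriminant dimension.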
SEE ALSO maha3, maha6, plsda, quaddis, applyfda1.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
---------------
X: SAISIR matrix of the predictive data set (n x p)
group : SAISIR vector of group (integer values). Identical numbers mean
that the observations belong to the same group.
among : (integer) Maximal rank (dimension) of PC-score allowed to enter in
the model
maxscore: (integer) Maximum number of scores allowed to enter in the model.
selected: matlab vector (n x 1) with elements =0 (calibration set), or 1
(validation set)
Output arguments:
----------------
res with fields:
introduced: rank of the PC scores introduced in the model
ncorrect: number of correct classifications in the calibration set at each
step
nscorrect: number of correct classifications in the validation
(supplementary) set at each step
beta: projection coefficients such as datafactor=X*beta
datafactor: discriminant scores of the calibration set
centroidfactor: scores of the barycenters (centroids) computed from the
calibration set
supscore: discriminant scores of the validation set
eigenval: eigenvalues of the discriminant analysis
confusion: confusion matrix (row actual; column predicted) in the
calibration set
average: average of the calibration set.
sconfusion: confusion matrix (row actual; column predicted) in the
validation (or "supplementary" set)
Performs a stepwise factorial discriminant analysis according
to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).
The basic idea is to perform a factorial discriminant analysis on the scores
of a previous PCA. The criterion of score selection is the maximisation of
the trace of T^(-1)B.
In order to avoid using PC scores with very small eigenvalues,
the input argument "among" gives the maximal dimension to be
allowed.
"maxscore" indicates the maximal number of scores.
"datafactor" corresponds to the final model
(with maxscore scores introduced). If one is interested in a more
economical model, it is easy, looking at the classification, to reduce the
value of "maxscore" and re-run "fda2".
The collection is divided into a calibration set and a validation set
according to the elements of the input argument "selected".
%Typical example:
g=create_group1(data,1,3);%% creation of a grouping from the identifiers
%names, using characters in position 1 to 3
%random selection of 1/3 in the validation set
res=fda2(data,g,10,5,random_select(size(data.d,1),round(size(data.d,1)/3)));
SEE ALSO maha3, maha6, plsda, quaddis, applyfda1, fda1.
Return to thematic list
Return to alphabetic list
HOME
useful for finding the wavelength index in strings
Input argument:
==============
str: an array of characters which can be interpreted as a vector of numbers
when using a command such as vector=str2num(str);
value: a numerical value normally in the range given by vector.
Output argument:
===============
index: index (rank) of the variable
Example of use:
index=find_index(spectra.v,1104);
Finds the index of the variable in "spectra" closest to the value
1104.
Important note: the numbers obtained with str2num(str) are supposed to be
sorted
(this is the normal case with spectral data).
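A minimal plain MATLAB sketch of this lookup is given below. It is an assumption about the logic, not the code of "find_index"; the identifiers and variable names are made up for the example.

```matlab
% Sketch: index of the variable closest to a target value.
v      = ['1100'; '1102'; '1104'; '1106'];   % identifiers stored as a char matrix
value  = 1104.7;
vector = str2num(v);                          % numeric column vector
[~, index] = min(abs(vector - value));        % smallest absolute difference
```

This direct search does not rely on the values being sorted; the toolbox function is stated above to assume sorted values.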
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
matrix: MATLAB matrix
Output arguments
===============
row, col: indexes of the row and column of the maximum, respectively
value: value of the maximum
see also find_min
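The same result can be obtained in plain MATLAB as sketched below; this is an illustration, not necessarily how "find_max" is written.

```matlab
% Sketch: row, column and value of the maximum of a matrix.
matrix = [3 9 2; 7 1 8];
[value, linear_index] = max(matrix(:));              % maximum over all elements
[row, col] = ind2sub(size(matrix), linear_index);    % back to row/column indexes
```

Replacing max by min gives the equivalent of "find_min".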
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
matrix: MATLAB matrix
Output arguments
===============
row, col: indexes of the row and column of the minimum, respectively
value: value of the minimum
see also find_max
Return to thematic list
Return to alphabetic list
HOME
Input arguments
==============
X:SAISIR data matrix of "spectra-like" data
nrow:(integer): index of the row to be studied
threshold: peaks of absolute value lower than the threshold are not
detected
windowsize (integer, preferably odd number): size of the moving window
in which the peaks are to be found
min_max: either 0 : only maximum detected, or 1: maximum and minimum
detected
Output argument
==============
vect: matlab vector of the positions (name of variable converted into
numbers)
index: matlab vector of the index of the variables corresponding to peaks
Inside a moving window of size "windowsize" (data points), the function
detects the maximum (or the maximum and minimum values). The identified
positions are considered as "peaks" and shown on the display.
The corresponding variable identifiers (normally wavelengths or
retention time values) are converted into numbers and given in the output
argument "vect".
The combination of threshold and moving window prevents a large series of
spurious peaks from being identified when the studied curve is not
perfectly smooth.
"windowsize" gives the minimum gap (in data points) between two
consecutive peaks. The threshold restricts the detection to peaks whose
absolute value is greater than a given value.
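The principle can be sketched on a plain vector as follows. This is an assumed, simplified version that detects maxima only; it is not the code of "find_peaks", it produces no display, and the variable names are made up for the example.

```matlab
% Sketch of window-based maximum detection (assumed logic).
signal     = [0 1 3 1 0 0.2 0.1 0 2 5 2 0];
threshold  = 1;
windowsize = 5;                       % odd, in data points
half       = (windowsize - 1) / 2;

index = [];
for i = (half + 1):(numel(signal) - half)
    window = signal((i - half):(i + half));
    % a point is kept if it is the window maximum and reaches the threshold
    if signal(i) == max(window) && abs(signal(i)) >= threshold
        index(end + 1) = i;
    end
end
```

The small bump around index 6 is ignored because it lies below the threshold and is not the maximum of its window.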
Return to thematic list
Return to alphabetic list
HOME
Input arguments
==============
X: SAISIR matrix (n x p)
group: SAISIR vector of groups (integer, n x 1). Identical values
in "group" indicate that the corresponding observations belong
to the same group.
Output argument
===============
X1:SAISIR matrix (n x p) group-centered.
For each group, as defined by the input argument "group", the
function computes the average observation (1 x p). This average is
subtracted from all the observations belonging to this group.
One use of this function is the centering of sensory data according to
each panellist.
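On plain MATLAB arrays (rather than SAISIR structures), the operation can be sketched as follows; the variable names are illustrative and this is not the toolbox code.

```matlab
% Sketch: subtract from each row the average of its own group.
data  = [1 2; 3 4; 10 20; 14 24];
group = [1; 1; 2; 2];

centered = data;
labels   = unique(group);
for k = 1:numel(labels)
    members    = (group == labels(k));
    group_mean = mean(data(members, :), 1);          % 1 x p average
    centered(members, :) = data(members, :) - repmat(group_mean, sum(members), 1);
end
```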
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X: SAISIR data matrix ( n x p)
startpos, endpos: (integers) character positions in the identifiers giving
the key for building the qualitative groups.
Output argument
===============
X1: matrix of averages of groups (k x p) with k the number of found
groups.
This function uses the identifiers for creating groups.
It creates as many groups as there are different strings from startpos to
endpos and gives the matrix of group averages (barycenters).
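A hedged sketch of this key-based grouping on plain MATLAB arrays is given below. The identifiers, data and variable names are made up for the example, and this is not the toolbox implementation.

```matlab
% Sketch: group averages from a key extracted from the row identifiers.
ident    = ['A01x'; 'A02x'; 'B01x'; 'B02x'];   % row identifiers (char matrix)
data     = [1 2; 3 4; 10 20; 14 24];
startpos = 1;
endpos   = 1;

keys = ident(:, startpos:endpos);               % key characters of each row
[group_names, ~, group] = unique(cellstr(keys));
k     = numel(group_names);                     % number of groups found
means = zeros(k, size(data, 2));
for g = 1:k
    means(g, :) = mean(data(group == g, :), 1);
end
```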
Return to thematic list
Return to alphabetic list
HOME
This function has no direct use
It is used for beginning the HTML documentation of SAISIR
Return to thematic list
Return to alphabetic list
HOME
This function builds the core of the documentation (which is the
gathering of the individual help texts).
It also builds the necessary hypertext links.
No direct use
Return to thematic list
Return to alphabetic list
HOME
No direct use
Return to thematic list
Return to alphabetic list
HOME
Input argument:
==============
X: anything
Output argument:
===============
test: (boolean) "true" if X is a SAISIR structure,
"false" otherwise.
SEE also : saisir_check
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X : SAISIR data matrix
col : the column from which the histogram is drawn
startpos and endpos : the position in the row identifier strings considered
as keys for coloration
nclass : number of desired classes
charsize : the size of character on the graph
str : (optional) if str (a string) is defined, all the observations are
represented with this string, colored differently according to the
extracted key.
By choosing str='--' (for example), it is possible to avoid overlapping
identifiers.
This function builds a histogram in which each observation is represented
by a colored code (key) extracted from the row identifiers.
Note: It is generally necessary to adjust "nclass" and
"charsize" to obtain a readable histogram.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X:SAISIR matrix of predictive variables (n x p)
y:SAISIR vector of the variable to be predicted (n x 1)
ndim: (integer) maximal number of tested PLS dimensions
Output arguments:
===============
res with fields
predy: SAISIR matrix of predicted y in leave-one-out (n x ndim) for all
the dimensions tested.
rmse: Root mean square error (1 x ndim) for all the dimensions tested
r2: r2 value between observed and predicted y for all the dimensions
tested.
optimal_error: (double) minimal rmse among all the dimensions tested.
optimal_dim: (integer) PLS dimension giving the best model.
optimal_r2: r2 value for the best model.
The function leaves out one observation and builds a model with the
remaining observations. The left-out observation is then predicted.
This procedure is carried out for the n rows in X.
This function is very slow and must be used only for small data sets
(typically less than 30 observations). Otherwise a validation procedure
with a separate test set should be preferred.
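The leave-one-out loop itself can be sketched in plain MATLAB as below. Note that ordinary least squares with one predictor is used here in place of PLS, purely to keep the example short and self-contained; the data and variable names are made up.

```matlab
% Leave-one-out loop; ordinary least squares stands in for PLS here.
x = [1; 2; 3; 4; 5];
y = [2.1; 3.9; 6.2; 7.8; 10.1];
n = numel(y);

predy = zeros(n, 1);
for i = 1:n
    keep     = true(n, 1);
    keep(i)  = false;                     % leave observation i out
    design   = [ones(n - 1, 1) x(keep)];
    coef     = design \ y(keep);          % fit on the n-1 remaining rows
    predy(i) = [1 x(i)] * coef;           % predict the left-out row
end
rmse = sqrt(mean((y - predy) .^ 2));      % leave-one-out prediction error
```

The model is refitted n times, which is why the procedure becomes slow for large n.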
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X:SAISIR matrix of predictive variables (n x p)
y:SAISIR matrix of the variables to be predicted (n x k)
ndim: (integer) maximal number of tested PLS dimensions
Output arguments:
===============
res with fields
col: vector of k cells. The cell res.col{i} contains the y predicted values
(n x ndim) associated with the variable i, for all the PLS dimensions
tested.
RMSEV: root mean square error (k x ndim) for all the variables (rows) and
all the dimensions (columns).
r2: r2 values (k x ndim) for all the variables (rows) and all the
dimensions (columns).
The function leaves out one observation and builds a model with the
remaining observations. The y values of the left-out observation are then
predicted.
This procedure is carried out for the n rows in X.
This function is very slow and must be used only for small data sets
(typically less than 30 observations). Otherwise a validation procedure
with a separate test set should be preferred.
Return to thematic list
Return to alphabetic list
HOME
start: starting index in the list (default : 1)
Return to thematic list
Return to alphabetic list
HOME
Computes a basic latent root model
Input arguments:
================
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of variables to be predicted (n x 1)
maxdim (integer): maximal number of dimensions introduced in the model
ratioxy (float): number greater than 0 and less than 1
giving the relative importance of x and y (close to 1: x important;
close to 0: x not important)
Output argument:
================
lr1type with fields:
predy: predicted y for all the models, up to maxdim dimensions (n x maxdim)
corrcoef: correlation coefficients between y and predicted y (1 x maxdim)
beta: regression coefficients for all the models (p x maxdim)
averagey: average of y
averagex: average of x
ratioxy: copy of parameter ratioxy
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two
observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.
Output arguments:
-----------------
res with fields:
step with fields
index: vector of integers (1 x maxvar) giving the indices of the
selected variables
correct: vector of integer ( 1 x maxvar) giving the number of
correct classifications at each step
name: identifiers of the introduced variables (matrix of char with
maxvar rows)
classed: predicted groups in the final step (SAISIR vector of integers
n x 1)
The function performs a simple quadratic discriminant analysis introducing
up to maxvar variables.
At each step, the most discriminating variable (according to the percentage
of correct classification) is introduced. Forward selection only.
The function makes use of the MATLAB function "classify".
Return to thematic list
Return to alphabetic list
HOME
Input arguments
==============
calibration_data: SAISIR matrix (n1 x p)
calibration_group:SAISIR vector of group numbers (n1 x 1)
maxvar:(integer) maximum number of variables introduced in the model
test_data:SAISIR matrix (n2 x p)
test_group: SAISIR vector of group numbers (n2 x 1);
Output argument:
===============
discrtype1 with fields:
step with fields
index: vector giving the rank of the variable introduced at each step
(maxvar values)
correct: vector giving the number of correct classifications in the
calibration set (maxvar values)
name: identifiers of the variables introduced in the models (matrix of
char with maxvar rows)
ntestcorrect: vector giving the number of correct classifications in the
validation set (maxvar values)
classed: SAISIR vector of the predicted group numbers of the calibration
set (n1 x 1). Only the result of the final step is given.
testclassed: SAISIR vector of the predicted group numbers of the test set
(n2 x 1). Only the result of the final step is given.
The function performs a simple linear discriminant analysis introducing
up to maxvar variables.
At each step, the most discriminating variable (according to the percentage
of correct classification of the calibration set) is introduced. Forward
selection only.
Uses the MATLAB function "classify".
%Typical example
mydis=maha1(wheat1,g1,5,wheat2,g2)
disp(mydis.step.ntestcorrect);%Looking at the number of correct
%classifications in the test set
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two
observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.
Output arguments:
-----------------
res with fields:
ncorrect: vector of integers (1 x maxvar) indicating the number of
correct classifications at each step.
classed: SAISIR vector of integers (n x 1) indicating the predicted group
number for the model with maxvar variables introduced.
confusion: SAISIR confusion matrix (row: actual group, column: predicted
group).
varrank: vector of integers (1 x maxvar) indicating the index (rank) of the
introduced variables.
The function computes a linear discriminant analysis introducing up to
maxvar variables.
At each step, the most discriminating variable according to the
maximisation of the trace of T^(-1)B is introduced.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
----------------
X :SAISIR matrix (n x p) of predictive variables
group :SAISIR vector (n x 1) of integers indicating the group. Two
observations
belonging to the same group have the same group number
maxvar : integer indicating the maximum number of variables to be
introduced.
Output arguments:
-----------------
res with fields:
ncorrect: vector of integers (1 x maxvar) indicating the number of
correct classifications at each step.
classed: SAISIR vector of integers (n x 1) indicating the predicted group
number for the model with maxvar variables introduced.
confusion: SAISIR confusion matrix (row: actual group, column: predicted
group).
varrank: vector of integers (1 x maxvar) indicating the index
("rank") of the introduced variables.
The function computes a linear discriminant analysis introducing up to
maxvar variables. At each step, the new variable giving the highest number
of correctly classified samples is introduced.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X: SAISIR matrix (n x p) of data
group: SAISIR vector of integer (n x 1) indicating the group number of the
observations. Two observations sharing a same group number belong to the
same qualitative group
maxvar: integer giving the maximal number of variables introduced in the
model
selected: MATLAB vector (n x 1), the elements of which are equal to 0
(observations placed in the calibration set) or 1 (observations placed in
the validation set).
Output argument:
==============
discrtype with fields
ncorrect: vector of integers (1 x maxvar) giving the number of correctly
classified observations in the calibration set.
classed: SAISIR vector of integer giving the predicted group of the
calibration set.
confusion: matlab confusion matrix of the calibration set for the final
step.
sconfusion: matlab confusion matrix of the validation set for the final
step.
sclassed: SAISIR vector of integer giving the predicted group of the
validation set.
nscorrect: vector of integers (1 x maxvar) giving the number of correctly
classified observations in the validation (supplementary) set.
The function computes a linear discriminant analysis introducing up to
maxvar variables.
At each step, the most discriminating variable according to the
maximisation of the trace of T^(-1)B is introduced.
The collection is divided into calibration and validation samples according
to "selected": selected=0, sample placed in the calibration set;
selected=1, sample placed in the validation set.
%Typical use
%===========
load data;
g=create_group1(data,1,2); %% supposing that the identifiers contain a key
%for forming the qualitative group in position of characters 1 and 2
sel=random_select(size(data.d,1),round(size(data.d,1)/3));%% a third in
%validation
res=maha6(data,g,5,sel);
xdisp('Evolution of correct classification in the calibration set',...
res.ncorrect);
xdisp('Evolution of correct classification in the validation set',...
res.nscorrect);
Return to thematic list
Return to alphabetic list
HOME
Input arguments
---------------
X: SAISIR matrix
col1, col2 : index of the two columns to be represented
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
marg (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
For the French users: there is a synonym function "carte".
Use preferably "map"
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X: SAISIR matrix (n x p)
col1, col2, col3: indices of the columns to be represented
as X, Y and Z in the 3D plot (integers).
label1, label2, label3: labels of the X, Y and Z axes (optional, strings or
vectors of char)
charsize: size of the characters (optional, default: 6)
Synonym of "carte3D" (French name). Use "map3D" preferably.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
---------------
data : MATLAB matrix
coderow (optional) : string (code added to the row identifiers)
codecol (optional): string (code added to the variables identifiers)
Output argument:
----------------
X: SAISIR matrix with fields "d" (copy of "data"),
"i" (identifiers of
rows), "v" (identifiers of variables)
SAISIR means "statistique appliquée à l'interprétation des spectres
infrarouge"
or "statistics applied to the interpretation of IR spectra".
See the manual of SAISIR for understanding the rationale of this structure.
%Typical use:
%===========
A=[1 2 3 4; 5 6 7 8];
B=matrix2saisir(A,'row # ', 'Column # ');
% >> B.d
% ans =
% 1 2 3 4
% 5 6 7 8
% >> B.i
% ans =
% row # 1
% row # 2
% >> B.v
% ans =
% Column # 1
% Column # 2
% Column # 3
% Column # 4
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
================
X1: SAISIR matrix ( n1 x p)
X2: SAISIR matrix (n2 x p)
metric: SAISIR matrix (p x p) of the metric
Output argument:
================
dis: SAISIR matrix of distances between observations (n1 x n2) according to
the metric "metric"
example of use (computing Mahalanobis distances):
================================================
data=matrix2saisir(rand(50,10));%% dummy data
data1=center(data);%% centered data
metric=matrix2saisir(inv(data1.d'*data1.d));
dis=mdistance(data1,data1,metric);%% Mahalanobis distance between
%observations
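What a distance "according to a metric" means can be sketched on plain matrices. The sketch assumes that the distance between rows a and b is sqrt((a-b)*M*(a-b)'); whether "mdistance" returns this value or its square is not stated here, so this is an assumption, and the variable names are made up.

```matlab
% Sketch: distances between the rows of A and B under a metric M (assumed formula).
A = [0 0; 1 0];
B = [0 2; 3 4];
M = eye(2);                       % identity metric gives Euclidean distances

dis = zeros(size(A, 1), size(B, 1));
for i = 1:size(A, 1)
    for j = 1:size(B, 1)
        delta     = A(i, :) - B(j, :);
        dis(i, j) = sqrt(delta * M * delta');
    end
end
```

Replacing M by the inverse of a covariance-type matrix, as in the example above, would give Mahalanobis-type distances.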
Return to thematic list
Return to alphabetic list
HOME
Input argument:
===============
collection: VECTOR OF SAISIR structures
Example of building a collection: col{1}=table1; col{2}=table2;
col{3}=table3;
in which table1, table2, table3 are SAISIR structures.
Each table must include the same observations, but not necessarily the same
variables.
WARNING! In this version, the variables are not normalised!
Let n be the number of observations, and t the number of tables
Let WHOLE be the matrix of appended tables normalized according to MFA
(dimensions n x m)
Let k be the rank of WHOLE
Output arguments:
================
res with fields:
score (n x k) : scores of the individuals (compromise)
eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)
eigenval (1 x k) : eigenvalues of the PCA on WHOLE
average (1 x m) : averages of the variables of WHOLE
var_score (m x k) : scores of the variables
proj {1xt cell} : projectors for computing the projection of new
observations of each table
first_eigenval (1xt) : first eigenvalues of the individual PCAS on each
table
trajectory (q x k) : individual score of each row of each table (q = total
number of rows in all the tables q=n*t)
id_group (q x 1) : identification of the belonging of the observation_score
into a given table
table_score (t x k) : scores of the tables
Information on MFA can be found in SPAD TM Version 5.0 (procedure AFMUL).
Return to thematic list
Return to alphabetic list
HOME
Input argument:
===============
names1: matrix of char interpretable as numbers through "str2num"
Output argument:
===============
names2: matrix of char interpretable as numbers through "str2num"
The function negates the variable identifiers (normally wavenumbers)
so that mid-infrared spectra are plotted in the usual direction.
This may help spectroscopists to examine data and loadings.
%Typical use:
%============
%Let us suppose that "midIR" is a SAISIR matrix of mid-infrared spectra
%with wavenumbers as variable identifiers
midIR.v=mir_style(midIR.v);
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points which are locally
averaged (odd number).
Output arguments
===============
X1: matrix of averaged data
The function replaces a given variable by its average in the range defined
by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the
signal.
For example with window_size = 5
a given variable x(i) of index i is replaced by the local average
(x(i-2)+x(i-1)+x(i)+x(i+1)+x(i+2))/5
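On a plain row vector, this local averaging can be sketched with a convolution, as below. This is an illustration, not necessarily how the toolbox function is written.

```matlab
% Sketch: moving average with loss of (window_size-1)/2 points at each end.
x           = [1 2 3 4 5 6 7];
window_size = 5;
kernel      = ones(1, window_size) / window_size;
smoothed    = conv(x, kernel, 'valid');   % numel(x) - (window_size - 1) values
```

With 7 points and window_size = 5, only the 3 central points remain, in agreement with the loss of 2 variables at each end.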
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points on which the max
value is computed.
Output arguments
===============
X1: matrix of local maxima
The function replaces a given variable by the maximum value in the range
defined by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the
signal.
For example with window_size = 5
a given variable x(i) of index i is replaced by the local maximum of
variables
x(i-2); x(i-1); x(i); x(i+1); x(i+2)
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix of data (normally digitized signals such as spectra)
window_size: integer giving the number of data points on which the min
value is computed.
Output arguments
===============
X1: matrix of local minima
The function replaces a given variable by the minimum value in the range
defined by "window_size".
(window_size-1)/2 variables are lost at the beginning and at the end of the
signal.
For example with window_size = 5
a given variable x(i) of index i is replaced by the local minimum of
variables
x(i-2); x(i-1); x(i); x(i+1); x(i+2)
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1);
Output argument
===============
res with fields
ypred: predicted y
beta0: intercept of the model (double)
beta: regression coefficient
r2: r2 value
RMSEC: Root mean square error of calibration
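The listed quantities can be sketched with ordinary least squares on plain matrices, as below. The definitions of r2 (1 minus the ratio of residual to total sum of squares) and RMSEC (division by n) are assumptions made for this example; the toolbox function may use other conventions.

```matlab
% Sketch: multiple linear regression with intercept (assumed definitions).
X = [1 0; 2 1; 3 1; 4 3; 5 2];
y = [3; 6; 8; 12; 13];            % built as y = 1 + 2*x1 + 1*x2
n = size(X, 1);

coef  = [ones(n, 1) X] \ y;       % least-squares solution
beta0 = coef(1);                  % intercept
beta  = coef(2:end);              % regression coefficients
ypred = beta0 + X * beta;

residual = y - ypred;
r2    = 1 - sum(residual .^ 2) / sum((y - mean(y)) .^ 2);
RMSEC = sqrt(mean(residual .^ 2));
```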
Return to thematic list
Return to alphabetic list
HOME
Input argument:
==============
collection : ARRAY OF SAISIR matrices
Example of building a collection: col{1}=table1; col{2}=table2;
col{3}=table3;
in which table1, table2, table3 are SAISIR structures.
Each table must include the same observations, but not necessarily the same
variables.
WARNING! In this version, the variables are not normalised!
Let n be the number of observations, and t the number of tables
Let WHOLE be the matrix of appended tables (dimensions n x m)
Let k be the rank of WHOLE
Output argument:
===============
res with fields:
score (n x k) : scores of the individuals (compromise)
eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)
eigenval (1 x k) : eigenvalues of the PCA on WHOLE
average (1 x m) : averages of the variables of WHOLE
var_score (m x k) : scores of the variables
proj {1xt cell} : projectors for computing the projection of new
observations of each table
trajectory (q x k) : individual score of each row of each table (q = total
number of rows in all the tables)
id_group (q x 1) : identification of the belonging of the observation_score
into a given table
table_score (t x k) : scores of the tables
This function is identical to "mfa" except that the tables are
simply scaled in such a way that their norm is equal to 1.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
================
X1 and X2: SAISIR matrices dimensioned (n x p1) and (n x p2)
respectively.
Output argument:
================
cor: SAISIR matrix (dimensioned p1 x p2).
An element cor.d(a,b) is the correlation coefficient between the column a
of X1 and b of X2.
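A plain MATLAB sketch of this computation is shown below, using standardized columns. It is an illustration on ordinary matrices with made-up variable names, not the toolbox code.

```matlab
% Sketch: correlations between the columns of X1 (n x p1) and X2 (n x p2).
X1 = [1 2; 2 1; 3 4; 4 3];
X2 = [2; 4; 6; 8];
n  = size(X1, 1);

Z1  = (X1 - repmat(mean(X1, 1), n, 1)) ./ repmat(std(X1, 0, 1), n, 1);
Z2  = (X2 - repmat(mean(X2, 1), n, 1)) ./ repmat(std(X2, 0, 1), n, 1);
cor = (Z1' * Z2) / (n - 1);       % p1 x p2 matrix of correlation coefficients
```

Here the first column of X1 is proportional to X2, so its correlation is 1; a column with zero variance would give NaN in this sketch.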
Return to thematic list
Return to alphabetic list
HOME
Syntax
normc(M)
Description
NORMC(M) normalizes the columns of M to a length of 1.
Examples
m = [1 2; 3 4]
n = normc(m)
See also NORMR
Return to thematic list
Return to alphabetic list
HOME
function [pcatype]=pca(X,(var_score))
Assesses principal component analysis (on not normalised data)
Input arguments:
---------------
X: SAISIR matrix
Output arguments:
----------------
pcatype with fields:
score :PC score
eigenvec :eigenvectors (loadings)
eigenval :eigenvalues
average :average observation
var_score :scores of the variables
std: standard deviations of the columns of X
Return to thematic list
Return to alphabetic list
HOME
Divides each column by the corresponding standard deviation.
mode (optional): 0 or 1, division by n-1 or by n respectively
(default: 1)
OBSOLETE: USE FUNCTION "standardize" PREFERABLY
Return to thematic list
Return to alphabetic list
HOME
Clusters the data into ngroup groups according to the K-means method ("nuée
dynamique").
Input arguments
==============
X: SAISIR data matrix
ngroup (integer): number of groups asked for
nchanged (optional): stops the iterations when no more than nchanged
observations changed group in the previous iteration.
This saves some computation time. nchanged must be small in comparison with
the number of rows of X.
Output argument
==============
res with fields
group: SAISIR vector of groups. Observations with the same group number
have been classified in the same group.
centre: barycenter of the groups.
Warning: the function may reduce the number of groups
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
vector : Matlab vector of integers
ndigit : positive integer
This function transforms numbers into matrices of char.
The function justifies the strings by adding zeros.
If the first argument is a row vector, it is transposed
% %Example
% %=======
x=[1 2 100];
x1=num2str1(x,5);
x1
% 00001
% 00002
% 00100
The main use of this function is to help building smart row names
in SAISIR matrices using the system of extractable fields in the names.
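The zero-padding can be reproduced in plain MATLAB with sprintf, as in the sketch below. This is only an illustration, not the code of "num2str1", and it assumes that every number has at most ndigit digits.

```matlab
% Sketch: integers converted to a zero-padded char matrix.
x      = [1 2 100];
ndigit = 5;

x   = x(:);                               % row vectors are transposed
x1  = repmat('0', numel(x), ndigit);      % preallocated char matrix
fmt = sprintf('%%0%dd', ndigit);          % builds the format '%05d'
for k = 1:numel(x)
    x1(k, :) = sprintf(fmt, x(k));
end
```

Because all rows have the same length, a fixed character range of such names can serve as a key for functions that extract groups from identifiers.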
Return to thematic list
Return to alphabetic list
HOME
Assesses principal component analysis (on not normalised data)
Input arguments:
---------------
X: SAISIR matrix
var_score : optional (0: only the scores of the observations;
1: also gives the scores of the variables) (default: 0)
Output arguments:
----------------
pcatype with fields:
score :PC score
eigenvec :eigenvectors (loadings)
eigenval :eigenvalues
average :average observation
var_score : (if input argument "var_score" is defined) scores of the
variables
NOTE: the weights of the observations are equal to 1/(number of rows)
ALL the possible scores are calculated
SEE ALSO: normed_pca, cumulate_covariance, covariance_pca,
correlation_plot, apply_pca
%typical example:
%spectra: n x p, chemistry n x k
p=pca(spectra);%% PCA
map(p.score,1,2);%% PC plot 1-2
correlation_plot(p.score,1,2, chemistry);%% correlation with chemistry
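For readers without the toolbox at hand, the quantities listed above can be sketched with a singular value decomposition in base MATLAB. This is an assumption-laden illustration, not the SAISIR code: the data are hypothetical, and dividing the squared singular values by n follows from the weights 1/(number of rows) mentioned in the NOTA.

```matlab
% Sketch: PCA on centred, non-normalised data (not the SAISIR implementation).
X = [2 4 1; 3 6 2; 5 9 2; 6 13 4; 9 17 5];   % hypothetical 5 x 3 data
n = size(X, 1);
average = mean(X, 1);                  % average observation (1 x p)
Xc = X - repmat(average, n, 1);        % centring only, no scaling
[U, S, V] = svd(Xc, 'econ');
score = U * S;                         % PC scores of the observations
eigenvec = V;                          % eigenvectors (loadings)
eigenval = diag(S).^2 / n;             % eigenvalues with weights 1/n (assumption)
```

The eigenvalues sum to the total variance of the columns (computed with divisor n), and the scores together with the loadings and the average rebuild X exactly.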
Use "pca" or "normed_pca"
Use "pca" or "normed_pca"
Input arguments:
===============
pcatype: output argument of function "pca"
score: SAISIR matrix of scores (computed from the same pcatype model)
nscore: number of components involved in the reconstruction of the data
Output argument:
================
res: SAISIR matrix of reconstructed data. The quality of the
reconstruction depends on the number of scores introduced (variable
"nscore").
From the previously computed scores, the function rebuilds the original
data matrix.
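The reconstruction amounts to multiplying the first nscore scores by the corresponding eigenvectors and adding back the average. A minimal sketch in base MATLAB, with hypothetical data and without any SAISIR call, could look like this:

```matlab
% Sketch: rebuilding the data from the first nscore PC scores
% (illustration only, not the SAISIR code).
X = [2 4 1; 3 6 2; 5 9 2; 6 13 4; 9 17 5];   % hypothetical 5 x 3 data
n = size(X, 1);
average = mean(X, 1);
[U, S, V] = svd(X - repmat(average, n, 1), 'econ');
score = U * S;                                % scores of the centred data
nscore = 2;                                   % number of components kept
Xrec = score(:, 1:nscore) * V(:, 1:nscore)' + repmat(average, n, 1);
err = norm(X - Xrec, 'fro');                  % reconstruction error with 2 scores
Xfull = score * V' + repmat(average, n, 1);   % all scores: exact reconstruction
```

The error "err" decreases as nscore grows and vanishes when all the scores are used.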
% ========================================================================
input argument :
===============
collection: array of SAISIR matrices with the same number of rows (see
below)
ndim : dimension of each individual PCA (must be less than the smallest
number of variables)
graph : if different from 0 : display examples of graph
let n be the number of observations in each table, k the number of tables
%Output argument:
===============
res with fields
compromise : PCA giving the compromise
observation_score : scores of each observation of each table (nxk rows)
id_group : groups identifying each observation in observation_score (for
graph)
projector : struct array giving the vectors allowing the projection of each
data set
score_correlation : correlations between compromise scores and table
scores
table_average : struct array giving the average of each original data table
Adapted from: G. Saporta, Probabilités, analyse des données et statistiques,
Editions Technip, page 192 and following.
% ========================================================================
The function first computes PCA on each of the SAISIR matrices in
"collection". Only the ndim PC scores are kept in each matrix.
Canonical analysis is carried out on the series of scores.
This procedure avoids inverting the original matrices. Using only a
few scores guarantees that the computation is feasible, even if the
original matrices have collinear variables.
Note :
the argument collection is obtained for example by
collection(1)=data1;collection(2)=data2; ...;collection(k)=datak;
In which data1, data2, ..., datak are SAISIR matrices with the same number
of rows.
Input argument:
==============
X: SAISIR matrix (n x p) of predictive variables
y: SAISIR vector (n x 1) of the variable to be predicted
range: MATLAB vector of integers (1 x q)
selected: MATLAB vector (n x 1) with elements = 0 (selected in
calibration)
or 1 ( selected in validation)
Output argument:
==============
res with fields:
predy: predicted y in the validation set (q columns)
obsy: observed y of the validation set (1 column)
r2: r2 between observed and predicted values in the validation set
(vector with q elements)
rmsecv: root mean square error of validation (1 x q)
ridgetype: calibration model (see function
"pca_ridge_regression")
This function divides a collection into a calibration and a validation set
using the input argument "selected",
and applies the pca_ridge_regression model to the validation set.
The function calculates as many ridge regression models as the number of
elements in "range".
In ridge regression, the product X'X is replaced by X'X + kI.
In the present function, k is the eigenvalue of the corresponding
PCA component.
For example, if range = [1 3 5], the tested values of k
are the eigenvalues #1, #3, #5.
The rationale is that a suitable k in ridge regression is very difficult to
find, so it is a good idea to test values in the range of the observed
eigenvalues.
Typical example
===============
%Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1)
%the variable to be predicted.
myrange=1:10;%% testing the eigenvalues from 1 to 10
n=size(DATA.d,1);
sel=random_select(n,round(n/3));%% 1/3 in validation
[res]=pca_cross_ridge_regression(DATA,y,myrange,sel);%% testing the 10 first
%eigenvalues
xy_plot(res.predy,10, res.obsy,1);%% display of the 10th model
%See also: pca_ridge_regression, ridge_regression, ridge_regression1,
%apply_ridge_regression
Input arguments
===============
pcatype: structure, output argument of function "PCA"
y: variable to be predicted (n x 1)
range: MATLAB vector of positive integers (1 x q)
Output arguments:
================
ridgetype with fields:
beta: coefficients of the model (p x q), applicable on X
krange: (1 x q) values of the coefficient k of ridge regressions
averagex: average of original predictive variables (1 x p)
averagey: average of y (1 x 1)
rmsec: root mean square error of calibration (1 x q)
predy: y predicted (n x q)
corr: correlation coefficient (1 x q) between predicted and observed
values
The function calculates as many ridge regression models as the number of
elements in "range".
In ridge regression, the product X'X is replaced by X'X + kI.
In the present function, k is the eigenvalue of the corresponding
PCA component.
For example, if range = [1 3 5], the tested values of k
are the eigenvalues #1, #3, #5.
The rationale is that a suitable k in ridge regression is very difficult to
find, so it is a good idea to test values in the range of the observed
eigenvalues.
%Typical example
%===============
%Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1)
%the variable to be predicted.
mypca=pca(DATA);%% PCA of the predictive variables
myrange=1:20;%% testing the eigenvalues from 1 to 20
res=pca_ridge_regression(mypca,y,myrange);
xdisp(res.rmsec.d);%% displaying (for example) the errors
See also: ridge_regression, ridge_regression1, apply_ridge_regression,
pca_cross_ridge_regression.
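The idea of taking k among the eigenvalues can be sketched in base MATLAB. The code below is an illustration only: the data are hypothetical, and it assumes that the eigenvalues are those of the centred cross-product matrix X'X (the exact scaling used by SAISIR is not stated here).

```matlab
% Sketch: ridge regression with k taken among the eigenvalues
% (illustration only; the eigenvalue scaling is an assumption).
X = [1 2 0; 2 1 1; 3 5 2; 4 3 1; 5 7 4; 6 5 3];   % hypothetical predictors
y = [1.1; 1.9; 4.2; 3.8; 6.9; 6.1];               % hypothetical response
[n, p] = size(X);
Xc = X - repmat(mean(X, 1), n, 1);
yc = y - mean(y);
G = Xc' * Xc;
G = (G + G') / 2;                         % enforce exact symmetry
lambda = sort(eig(G), 'descend');         % eigenvalues, largest first
range = [1 3];                            % ranks of the eigenvalues used as k
beta = zeros(p, numel(range));
for j = 1:numel(range)
    k = lambda(range(j));
    beta(:, j) = (G + k * eye(p)) \ (Xc' * yc);   % X'X replaced by X'X + kI
end
krange = lambda(range)';                  % tested k values (1 x q)
```

Each column of "beta" is one ridge model; the larger the eigenvalue used as k, the more the coefficients are shrunk.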
Input argument :
===============
pca_type: output argument of function "pca"
comp1, comp2: number of the PC components of the PCA to be analyzed
Output argument:
===============
res: SAISIR matrix with 7 columns: QLT, col1, CO2col1, CTRcol1, col2,
CO2col2, CTRcol2
QLT: squared cosine with the plane (quality of the representation of the
observations)
CO2col1 and CO2col2: squared cosine of the angle between the observation
and the axis
We have QLT = CO2col1 + CO2col2
CTRcol1 and CTRcol2: contribution of the observation to the component.
From G. Saporta, Probabilités, analyse des données et statistiques, Ed.
Technip, page 182
%Typical example:
p=pca(DATA);
res=pc_stat(p,1,2);%% stats for components #1 and #2
saisir2excel(res,'pca results');%% to be looked at with Excel
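The statistics can be sketched in base MATLAB from the usual definitions (squared cosine = squared score divided by the squared distance to the centre; contribution = weighted squared score divided by the eigenvalue). This is an illustration with hypothetical data and weights 1/n, not the SAISIR code.

```matlab
% Sketch of the QLT, CO2 and CTR statistics (illustration only).
X = [2 4 1; 3 6 2; 5 9 2; 6 13 4; 9 17 5];   % hypothetical 5 x 3 data
n = size(X, 1);
Xc = X - repmat(mean(X, 1), n, 1);
[U, S, V] = svd(Xc, 'econ');
score = U * S;                         % PC scores
eigenval = diag(S)'.^2 / n;            % eigenvalues with weights 1/n
comp = [1 2];                          % the two components analysed
d2 = sum(score.^2, 2);                 % squared distance of each row to the centre
co2 = score(:, comp).^2 ./ repmat(d2, 1, 2);                  % squared cosines
ctr = score(:, comp).^2 ./ repmat(n * eigenval(comp), n, 1);  % contributions
qlt = sum(co2, 2);                     % quality of representation on the plane
```

The contributions of all observations to a given component sum to 1, and qlt never exceeds 1.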
Computes a basic PCR (principal component regression) model
Input arguments
---------------
X : SAISIR matrix (n x p)
y : SAISIR vector (n x 1)
maxdim : maximal dimensions of the model (integer)
Output arguments
----------------
pcrtype with fields
pca: Structure giving the PCA results (see function "pca")
beta: regression coefficients APPLICABLE ON THE PC SCORES (ndim x 1)
predy: predicted y up to ndim dimensions (n x ndim)
r2: determination coefficients between predicted and observed values (1 x
ndim)
averagey: mean of y (number)
beta1:regression coefficients APPLICABLE ON THE X DATA centred (ndim x p)
obsy: observed y (copy of input argument y)
rmsec: root mean square error of calibration ((n - dim - 1) degrees of freedom)
Input arguments:
===============
X: SAISIR matrix (n x p) of predictive data
y: SAISIR matrix (n x k) of variables to be predicted (k variables)
dim: (integer) dimension of the model
Output argument
===============
pcrtype with fields:
pca: result of pca applied on X (see function "pca")
beta: coefficients of the models (p x k) obtained with "dim"
dimensions
averagex: average of x (1 x p)
averagey: average of y (1 x k)
info: 'predicting several y with xxx dimensions'
r2: determination coefficient for the ys (1 x k)
predy: predicted y (n x k).
This function builds models for all the ys which are in "y". The models
are built only with "dim" dimensions.
See also: pcr (only one y, but the dimensions are scanned), apply_pcr
No direct use in SAISIR
Input arguments
================
mtX Matlab matrix of centered predictive data
mtY Matlab matrix of Y-variables (centered).
nbdim dimension of the PLS model
Output arguments
=================
mtBpls PLS regression coefficients
mtYp predicted Y
mtT PLS scores
This function is normally not directly called
This function is normally not directly called
Input arguments:
===============
X : SAISIR matrix of predictive variables (n x p)
group: SAISIR (n x 1) vector of integers of groups. Observations with
the same number in group belong to the same group (missing groups not
allowed)
ndim: number of dimensions in the PLS model
Output arguments
================
plsdatype with fields
beta :coeff for predicting the indicator matrix
beta0 :intercept for predicting the indicator matrix
t :PLS latent variable
predy :predicted indicator matrix
classed :predicted groups according to method #0
ncorrect :number of rightly classified samples according to method #0
(attribution to index of max of predicted Y)
confusion :confusion matrix according to method #0
ncorrect1 :number of rightly classified samples according to method #1
(mahalanobis distance on latent variable t)
confusion1 :confusion matrix according to method #1
tbeta :coeff for predicting the latent variables t
tbeta0 :intercept for predicting the latent variables t
linear :linear form for direct prediction of group
linear0 :the min of x'*linear + linear0 gives the predicted group
(this is equivalent to considering Mahalanobis distances)
Quadratic discriminant analysis
(Training)
A multinormal distribution is assumed in each qualitative group
===============================================================
Input args:
============
x: predictive data set (matrix n x p)
g: qualitative groups (matrix n x 1) with integer ranging from 1
to maximum number of groups (gmax)
Output args:
============
quaddis_type with fields:
ncorrect100: percentage of correct classification (number)
confus : confusion matrix (gmax x gmax)
mean : means according to each group (gmax x p)
predgroup : predicted groups (integer) (n x gmax)
density : pseudo-densities of each observation (n x gmax)
proba : probability of belonging to a given group (n x gmax)
model : predictive model with fields:
inv: Matlab matrices of Mahalanobis metrics (cube p x p x gmax)
Mut: Matlab matrices of means according to each group (gmax x p)
det: Matlab vector of determinants of covariance matrices of each
group (1 x gmax)
See also apply_quaddis, crossval_quaddis
Input arguments:
===============
X: SAISIR matrix (n x p) of predictive data
y: SAISIR vector (n x 1) of variable to be predicted
ndim: maximum dimensions of the model
output argument:
===============
plstype with fields:
BETA: regression coefficient of the models (p x ndim)
BETA0:intercept of the models (p x 1)
PREDY: predicted y for all the models (n x ndim)
T: PLS scores (n x ndim)
RMSEC: root mean square error of calibration (1 x ndim)
r2: r2 coefficient (1 x ndim)
This function calculates the PLS models for 1 to ndim dimensions.
All the models are kept
The algorithm is not the NIPALS algorithm, but another one which is
faster.
This function makes use of the function pls (normally in the directory
"pls").
See also: basic_pls, basic_pls2 (slower but giving more complete outputs)
Input argument
=============
X: SAISIR matrix (n x p)
Output argument:
X1: SAISIR matrix (n x p) with the rows in random order
This function randomly changes the rank (indices) of the observations.
This is useful for validation tests, when comparing the results with those
obtained by chance.
Input arguments
===============
nrow, ncol: integers (number of rows and columns of the resulting matrix)
Output argument
===============
X: matrix of random elements (nrow x ncol)
Input arguments:
===============
nel: (integer) number of elements in the output vector "selected"
nselect: (integer smaller than nel) number of elements taking the value 1
nrepeat: (integer, optional) number of consecutive replicates.
Output arguments:
===============
selected: MATLAB vector with nel elements equal to 1 or 0
This function builds up a MATLAB vector of nel elements with nselect
elements equal to 1 in random position, and (nel-nselect) equal to 0.
If nrepeat is given, the nselect values are still selected at random, but
organised in blocks of nrepeat consecutive elements.
For example, if nrepeat = 3, a possible result is [0 0 0 1 1 1 0 0 0 1 1 1 1
1 1 ...]
This function is useful for dividing a collection into two sets, for
example in the many functions of SAISIR allowing a validation test.
The case with "nrepeat" defined corresponds to the situation in which the
replicates are in equal numbers and consecutive in the data collection.
Typical use: randomly building a calibration and validation set
==============================================================
[n,p]=size(DATA.d);
sel=random_select(n,round(n/3));%% A third in validation
cal=selectrow(DATA,sel==0);%% building the calibration set
val=selectrow(DATA,sel==1);%% building the validation set
See also: random_splitrow
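The two selection modes can be sketched in base MATLAB. This is an illustration only and not the SAISIR code; in particular, the block case assumes that nselect counts observations and is a multiple of nrepeat, which the description above does not state explicitly.

```matlab
% Sketch of a random 0/1 selection vector (illustration only).
nel = 12; nselect = 6; nrepeat = 3;
% Plain case: nselect ones at random positions
perm = randperm(nel);
selected = zeros(nel, 1);
selected(perm(1:nselect)) = 1;
% Block case (assumption: nselect is a multiple of nrepeat): whole blocks
% of nrepeat consecutive replicates are drawn together
nblock = nel / nrepeat;
bperm = randperm(nblock);
pickblock = zeros(nblock, 1);
pickblock(bperm(1:nselect / nrepeat)) = 1;
selected_rep = kron(pickblock, ones(nrepeat, 1));   % expand each block
```

In the block case all the replicates of a sample fall in the same set, which avoids validating a model on replicates of calibration samples.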
Input argument
=============
X: SAISIR matrix (n x p)
nselect: integer, less than n.
Output arguments
================
X1, X2: SAISIR matrices of the resulting split
This function randomly divides a matrix in two matrices:
X1: with nselect rows, and X2 with n-nselect rows
Typical use : building a calibration and a validation set
========================================================
[n,p]=size(DATA.d);
[cal val]=random_splitrow(DATA, round(n*2/3));%% two thirds in calibration
% cal and val are respectively the calibration and validation sets
!!! NO DIRECT USE. Use function "excel2saisir"
Input arguments
===============
filename: name of an excel file saved in the .CSV format.
nchar: length of the element data(i,:,j), i.e. the number of
characters which are kept (default: 20)
deb, xend: first and last rows which are loaded (default: all)
Output argument:
===============
data: 3-way array data(row,pos,col),
where row is the Excel row, col the Excel column,
and pos the position of the character in the string.
If the string is shorter than nchar, it is padded with white space.
If the string is longer than nchar, the end of the string is lost.
This is a first step for decoding data coming from Excel.
%Example:
%========
mywork=readexcel1('work1.csv',15);
See also: excel2saisir, saisir2excel
Input arguments
==============
filename: (string) file name of the text file in the current directory
namesize: (integer) maximum size of the string to be read (default : 10)
Output arguments:
================
ident: matrix of char
nident:number of rows in ident (identifiers)
Loads an array of strings into a character matrix.
namesize gives the maximum number of characters in each string.
Main use: loading identifiers of rows and variables from a text file.
------------------
Input arguments:
===============
x : data matrix (n x p)
beta : vector of regression coefficients (p x 1)
y :(optional) known y value
Output argument:
===============
res with fields
score : regression scores (also y split)
reconstructed_norm2: squared norms of the scores
cumulated_norm2: cumulated squared norms of the scores
projector: matrix such that score=x*projector
eigenvec_sum: sum of the eigenvectors of PCA (linked to the theory)
xmean: mean of x
r2 : if (y defined) r2 of the cumulated model
Given a data matrix x and the regression coefficients beta,
the function builds a matrix of orthogonal scores
such that the predicted y is equal to the sum of these scores.
The scores can be used to examine the observations "oriented" towards the
prediction of y.
As the scores are ranked according to their ability to predict y,
it is possible to examine the observations beginning with the first scores.
Input arguments:
===============
A1, A2: SAISIR matrices in which the rows have at least some identifiers in
common
Output arguments
================
B1, B2: reordered matrices.
This function realigns the rows of A1 and A2 so that their identifiers
correspond.
This is necessary for any predictive method (particularly regressions).
The function discards the observations which are not present in both A1 and A2.
The matrix B1 corresponds to A1 and the matrix B2 to A2.
Fails if A1 or A2 contains duplicate row identifiers.
A2 is the leader (the order of B1 is as close as possible to the order of A2).
%Typical example:
%===============
%Let X and y be the matrices to be reordered
[X1, y1]=reorder(X,y);
%In X1 and y1 the rows now have the same identifiers (possibly with some
%loss of observations).
%
%If the function fails because some identifiers are duplicated, use the
%function "check_name" to identify these duplicated
%identifiers, and remove some of them
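The realignment can be sketched in base MATLAB by matching the row identifiers of two structures. The code is an illustration only, not the SAISIR implementation: the data are hypothetical, only the fields '.i' and '.d' are handled, and the order of A2 is taken as the reference.

```matlab
% Sketch: realigning two tables on their row identifiers (illustration only).
A1.i = ['s3'; 's1'; 's2'; 's5'];  A1.d = [30; 10; 20; 50];   % hypothetical
A2.i = ['s1'; 's2'; 's4'; 's3'];  A2.d = [1; 2; 4; 3];       % hypothetical
% For each row of A2, look for the same identifier in A1 (A2 drives the order)
[found, pos] = ismember(cellstr(A2.i), cellstr(A1.i));
B2.i = A2.i(found, :);        B2.d = A2.d(found, :);
B1.i = A1.i(pos(found), :);   B1.d = A1.d(pos(found), :);
```

Rows 's5' (only in A1) and 's4' (only in A2) are discarded, and the remaining rows of B1 and B2 carry the same identifiers in the same order.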
Input arguments:
===============
- str : a character string
- ntimes: number of repetitions
Output argument:
================
- str1 the matrix of char with the repeated string.
example:
>> repeat_string('Vanessa',3)
ans =
Vanessa
Vanessa
Vanessa
Useful for building identifiers in SAISIR
See also: addcode
Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1)
krange: MATLAB vector of k-values to be tested in the ridge regression
Let ntest = length(krange)
Output argument:
===============
ridgetype with fields
beta: beta coefficients associated with the ntest k-values as defined in
"krange"
averagex: average of X
averagey: average of y
rmsec: Root mean square error of calibration (ntest x 1 )
predy: predicted y for each test k-value (n x ntest)
r2: r2 for each tested k-value (ntest x 1);
ONLY ONE VARIABLE TO BE PREDICTED (scan the dimensions)
Returns as many beta as the number of elements in normrange
Input arguments:
===============
X: SAISIR matrix of predictive variables (n x p)
y: SAISIR vector of observed y (n x 1)
normrange: tested range of norms of beta (MATLAB vector of positive doubles)
Let ntest = length(normrange)
Output argument:
===============
ridgetype with fields
beta: beta coefficients associated with the ntest norms as defined in
"normrange"
averagex: average of X
averagey: average of y
rmsec: Root mean square error of calibration (ntest x 1 )
predy: predicted y for each test k-value (n x ntest)
r2: r2 for each tested norm of beta (ntest x 1);
k : MATLAB vector of resulting k values (ntest x 1)
expected norm: MATLAB vector of expected norms (copy of normrange)
This function carries out as many ridge regressions as the number of
elements in "normrange".
Rather than (as usual) trying to find the k-value of ridge regression, here it is
directly the norm of the regression coefficients beta which is
adjusted. To each norm corresponds a value of k.
%Typical example:
%===============
ridgetype=ridge_regression1(X,y,[100, 200]);
%The function displays the Ordinary Least Squares norm of beta
%"OLS norm = 1234.5678"
%This value gives the maximum possible value of the norm
%For example, testing half this norm
ridgetype=ridge_regression1(X,y,1234.5678/2);
See also: ridge_regression
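The link between a target norm and k can be sketched in base MATLAB. Since the norm of the ridge coefficients decreases as k increases, a simple bisection finds the k giving a requested norm. The bisection, the data and the variable names are assumptions made for this illustration; the search actually used by SAISIR is not described here.

```matlab
% Sketch: finding the ridge constant k that gives a target norm of beta
% (illustration only; bisection is an assumed search strategy).
X = [1 2 0; 2 1 1; 3 5 2; 4 3 1; 5 7 4; 6 5 3];   % hypothetical predictors
y = [1.1; 1.9; 4.2; 3.8; 6.9; 6.1];               % hypothetical response
[n, p] = size(X);
Xc = X - repmat(mean(X, 1), n, 1);
yc = y - mean(y);
G = Xc' * Xc;
c = Xc' * yc;
olsnorm = norm(G \ c);            % OLS norm: the largest reachable norm
target = olsnorm / 2;             % for example, half the OLS norm
klow = 0;
khigh = 1;
while norm((G + khigh * eye(p)) \ c) > target   % enlarge the bracket
    khigh = 2 * khigh;
end
for it = 1:60                     % bisection on k
    k = (klow + khigh) / 2;
    if norm((G + k * eye(p)) \ c) > target
        klow = k;                 % norm still too large: increase k
    else
        khigh = k;
    end
end
beta = (G + k * eye(p)) \ c;      % ridge coefficients with the target norm
```

Any target below the OLS norm corresponds to one positive k, which is why the function can report a k value for each requested norm.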
Input arguments
===============
X: SAISIR matrix to be saved
filename: (string) name of the saved file
separator:(string with a single char) separator character
Output argument
===============
none
Transforms a SAISIR matrix into a simple .txt file and saves it on disk.
separator is a single character like ' ' or ';' or its ASCII code.
The extension '.txt' is added to the filename.
%Typical example:
%===============
saisir2ascii(data,'mydata',';');
%Saves the SAISIR matrix "data", under the name
%"mydata.txt", with ";" as separator
Input arguments
===============
X: SAISIR matrix to be saved
filename: (string) name of the saved file
Output argument
===============
none
Transforms a SAISIR matrix into a simple .CSV file and saves it on disk.
The separator is ";".
The extension '.csv' is added to the filename.
%Typical example:
%===============
saisir2excel(data,'mydata');
%Saves the SAISIR matrix "data", under the name
%"mydata.csv", with ";" as separator
%This file can be read by Excel
Input arguments
===============
X: SAISIR matrix of predictive data (n x p)
Y: SAISIR matrix of variables to be predicted (n x k)
ndim: (integer) number of dimensions asked
Output arguments
===============
plstype with fields
beta: regression coefficients of the model (p x k)
beta0: intercept of the models (1 x k)
predy: predicted values (n x k)
T: PLS scores of the PLS2 regression model (n x ndim)
correlation: correlation coefficient (1 x k)
This function computes a PLS2 model.
Several variables can be predicted, but only ndim dimensions are tested.
Prefer basic_pls or basic_pls2.
%Typical example
%===============
%Let DATA be of dimensions (n x p)
%Let Y be of dimensions (n x k)
plstype=saisirpls(DATA,Y,10);
%Computes the models with 10 dimensions for the k variables in Y
Input argument:
==============
X: (expected) SAISIR matrix
Output argument:
===============
check:
check = 1 if x is in the SAISIR format (no warning)
check = 2 if x is in the SAISIR format (with warning)
check = 0 if x is not in the SAISIR format (fatal error)
The function tests whether the input argument X is a valid SAISIR structure
and gives some information.
If X is a valid structure, it also signals (as warnings) missing values and
identical rows or columns (which may be the sign of something wrong).
Useful to see if X is a valid '.d','.i','.v' structure.
Input arguments:
===============
X1:SAISIR matrix (n x p)
polynom_order:(integer) order of the fitting polynomial
window_size:(integer) number of data points involved in the calculation
derivative_order: (integer, normally 1 or 2) order of the derivative
Output argument:
================
X : transformed data matrix (n rows)
The function assumes that X is a matrix of digitized signals (such as
spectra) with constant intervals of digitization.
Example:
=======
res=saisir_derivative(DATA,3,21,2);
Computes the second derivative using a polynomial of degree 3 as model
and a window size of 21
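The arguments (polynomial order, window size, derivative order) are those of a Savitzky-Golay type of filter. Assuming that this is the underlying technique, the derivative weights can be sketched in base MATLAB by a least-squares polynomial fit over the window. The test signal and variable names are hypothetical, the sampling interval is taken as 1, and the sketch is not the SAISIR code.

```matlab
% Sketch: derivative by local polynomial fitting (Savitzky-Golay principle,
% assumed here; unit sampling interval; illustration only).
polynom_order = 3; window_size = 21; derivative_order = 2;
half = (window_size - 1) / 2;
t = (-half:half)';                         % positions inside the window
V = zeros(window_size, polynom_order + 1); % Vandermonde matrix of the fit
for j = 0:polynom_order
    V(:, j + 1) = t.^j;
end
C = pinv(V);                               % row j+1 gives the coefficient of t^j
% The d-th derivative at the window centre is d! times the coefficient of t^d
w = factorial(derivative_order) * C(derivative_order + 1, :);
sig = (1:50).^2;                           % test signal with second derivative 2
d = conv(sig, fliplr(w), 'valid');         % weights slid along the signal
```

With the 'valid' option the result is shorter than the signal by window_size - 1 points, since no derivative is estimated where the window does not fit.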
Input argument
=============
dis: a SAISIR matrix of distances (n x n, symmetric)
Output argument
==============
z: z vector as required by the MATLAB function "dendrogram"
From a complete square matrix of distances, the function
extracts the "unfolded" triangular matrix in order to enter the
MATLAB function "linkage"
with the option "ward".
Returns the z vector as required by the MATLAB "dendrogram"
function.
This function is very specific, and should only be used by experienced users!
See also : dendro (dendrogram with SAISIR)
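The "unfolding" step can be sketched in base MATLAB: the lower triangle of the square matrix, read column by column, gives the distances in the pair order (1,2), (1,3), ..., (2,3), ... expected by "linkage". The distance values are hypothetical, and the call to "linkage" is only shown as a comment because it requires the Statistics Toolbox.

```matlab
% Sketch: unfolding a square distance matrix into a vector of pairwise
% distances (illustration only).
dis = [0 1 2 4; 1 0 3 5; 2 3 0 6; 4 5 6 0];   % hypothetical 4 x 4 distances
n = size(dis, 1);
mask = tril(ones(n), -1) == 1;    % strictly lower triangle
v = dis(mask).';                  % pairs (1,2),(1,3),(1,4),(2,3),(2,4),(3,4)
% With the Statistics Toolbox, the tree would then be obtained with
% z = linkage(v, 'ward');  dendrogram(z);
```

The vector has n(n-1)/2 elements, one for each pair of observations.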
Input argument
==============
X: SAISIR matrix (n x p)
Output argument
==============
xmean: SAISIR vector (1 x p) of the mean
Input arguments:
===============
X1 and X2 : SAISIR matrices of dimensions (n x p) and (p x m) respectively
Output argument:
===============
X12: SAISIR matrix (n x m) , result of the multiplication of X1 with X2
Little use!
Input arguments:
===============
X: SAISIR matrix (n x p)
ncol:(integer) rank (index) of the column on which the data are sorted
minmax: 0: increasing order, 1: decreasing order (default : 0)
Output arguments:
================
X1: SAISIR matrix (n x p) sorted according to the column "ncol"
X2: SAISIR matrix (n x (p+1)) sorted according to the column "ncol", with
the rank added in column 1
%Typical example:
%===============
[DATA1,DATA2]=saisir_sort(DATA,5);
map(DATA2,1,6); %% rank (column 1) against the 5th column of DATA (6th column of
%DATA2) in increasing order
Input argument
==============
X: SAISIR matrix (n x p)
Output argument
==============
xstd: SAISIR vector (1 x p) of the standard deviation
Input argument
==============
X: SAISIR matrix (n x p)
Output argument
==============
xsum: SAISIR vector (1 x p) of the sum of the rows
Input argument
=============
X1: SAISIR matrix ( n x p)
Output argument
===============
X: SAISIR matrix (p x n), transpose of X1
Input arguments
===============
identifiers: matrix of characters (n x p)
xstr: string (1 x k), with k smaller than p
Output argument
==============
index: vector of integers giving the indices of the rows of "identifiers" in
which the string "xstr" has been found
%Typical example:
%===============
index=seekstring(DATA.i,'thisname');
Gives the indices in DATA.i in which the string "thisname" is
present.
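A comparable search can be sketched with base MATLAB functions only. The identifiers below are hypothetical and the code is an illustration, not the SAISIR source.

```matlab
% Sketch: rows of a character matrix that contain a given string
% (illustration only).
identifiers = ['thisname01'; 'othername1'; 'x_thisname'];   % hypothetical names
xstr = 'thisname';
hits = strfind(cellstr(identifiers), xstr);   % one cell per row, empty if absent
index = find(~cellfun('isempty', hits));      % indices of the matching rows
```

The string is found wherever it occurs in the identifier, here at the beginning of row 1 and at the end of row 3.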
Input arguments
===============
X1: SAISIR matrix (n x p)
Index: vector of integer or of booleans
Output argument
===============
X: matrix with n rows reduced to the selected variables
%Typical example:
%===============
reduced=selectcol(DATA,[1 5 6]); %% selects the columns #1, #5, #6 and
%builds the reduced matrix (with 3 columns) in "reduced"
%See also: selectrow, deletecol, deleterow, appendcol, appendrow,
%appendcol1, appendrow1
Input arguments
===============
X1: SAISIR matrix (n x p)
Index: vector of integer or of booleans
Output argument
===============
X: matrix with p columns reduced to the selected rows
%Typical example:
%===============
reduced=selectrow(DATA,[1 5 6]); %% selects the rows #1, #5, #6 and
%builds the reduced matrix (with 3 rows) in "reduced"
See also: selectcol, deletecol, deleterow, appendcol, appendrow,
appendcol1, appendrow1
Input arguments:
===============
X : SAISIR matrix
startpos : beginning position in the character strings of the identifiers
of rows ('.i')
str: string which is used as selection key.
Output argument:
===============
X1 : SAISIR matrix of the selected rows
Creates the data collection "X1", which is the subset of "X"
whose row identifiers contain the string str starting at position
startpos.
%Example :
%========
%Let X be a SAISIR matrix
%Let X.i be
%'wheat1'
%'barle2'
%'ricex1'
%'wheat2'
%The 'wheat' samples are extracted through
mywheat= select_from_identifier(X,1,'wheat');
%This selects the rows with identifiers 'wheat1' and 'wheat2'
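The selection logic of this example can be sketched in base MATLAB on a structure with the fields '.d', '.i' and '.v'. The numerical values and variable names are hypothetical, and the code is an illustration rather than the SAISIR implementation.

```matlab
% Sketch: selecting rows from a key found at a fixed position of the
% identifiers (illustration only).
X.i = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'];   % row identifiers
X.v = ['v1'; 'v2'];                               % variable identifiers
X.d = [1 2; 3 4; 5 6; 7 8];                       % hypothetical data
startpos = 1;
str = 'wheat';
field = X.i(:, startpos:startpos + length(str) - 1);   % extracted key field
keep = all(field == repmat(str, size(field, 1), 1), 2);
X1.i = X.i(keep, :);       % identifiers of the selected rows
X1.v = X.v;                % the variables are unchanged
X1.d = X.d(keep, :);       % data of the selected rows
```

Changing startpos moves the comparison window, which is what makes fixed-position fields in the identifiers useful.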
Input arguments:
===============
X : SAISIR matrix
startpos : beginning position in the character strings of the identifiers
of columns ('.v')
str: string which is used as selection key.
Output argument:
===============
X1 : SAISIR matrix of the selected columns
Creates the data collection "X1", which is the subset of "X"
whose variable identifiers contain the string str starting at
position startpos.
See also: select_from_identifier
Graphical display of sensory profiles in a "circular (spider
web)" representation.
Input arguments:
===============
-X : matrix of data to be displayed
-range : vector of the indices of the rows to be displayed
-max_score : maximal score used in the scale
-title : (optional) title of the graph.
Warning: will not work properly with more than 15 variables
Preferably reduce the identifiers of variables to less than 8 characters
%Demonstration example
%====================
senso=rand(5,10)*5;%% simulating 5 panellists, 10 scores, scale from 0 to 5
senso1=matrix2saisir(senso,'judge','descri');%% In SAISIR structure
sensory_profile(senso1,1:3,5);%% graphic of the first 3 panellists
where the polynomial order is K and the frame size is F (an odd number)
No direct use
Input arguments
===============
X: SAISIR matrix (n x p)
nrow: index of the row to be displayed (integer less than n, default : 1)
csize: size of the character (default : 10)
xlab, ylab, title: label on axis X, axis Y, title, respectively (default:
none)
The identifiers of the columns are plotted with X being the index of the
variable and Y the actual
value of the variable for the selected row "nrow".
Main use: examining the output of the "anavar1" and "anovan1" functions on
discrete variables
Input arguments
===============
X:SAISIR matrix of predictive variables (n x p)
y:SAISIR vector of the variable to be predicted (n x 1)
Output arguments
===============
beta:SAISIR vector of the regression coefficients (1 x p)
beta0:SAISIR vector of the intercepts (1 x p)
y is predicted by each column i of X according to
ypred=X.d(:,i)*beta.d(i)+beta0.d(i);
There are thus as many univariate linear models as the number of columns in X
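The column-by-column models can be sketched in base MATLAB with the usual least-squares formulas for a single predictor. The data are hypothetical and plain matrices are used instead of SAISIR structures; this is an illustration, not the SAISIR code.

```matlab
% Sketch: one simple linear regression of y on each column of X
% (illustration only).
X = [1 2; 2 1; 3 4; 4 3; 5 6];     % hypothetical predictors (n x p)
y = [2; 4; 6; 8; 10];              % here y is exactly 2 times column 1
[n, p] = size(X);
beta = zeros(1, p);
beta0 = zeros(1, p);
for i = 1:p
    xc = X(:, i) - mean(X(:, i));
    beta(i) = (xc' * (y - mean(y))) / (xc' * xc);    % slope for column i
    beta0(i) = mean(y) - beta(i) * mean(X(:, i));    % intercept for column i
end
ypred1 = X(:, 1) * beta(1) + beta0(1);   % prediction from the first column
```

For the first column the slope is 2 and the intercept 0, so ypred1 reproduces y.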
Input argument:
==============
X:SAISIR matrix of spectra (n x p)
Output argument:
==============
X1:SAISIR matrix of SNV-corrected spectra (n x p)
SNV (Standard Normal Variate) is commonly used in spectroscopy.
It basically consists in centering
and standardizing the ROWS (not the columns) of the data matrix.
This procedure may reduce the distortion of spectra caused by light scattering.
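The row-wise operation can be sketched in base MATLAB. The spectra are hypothetical, and the divisor n-1 for the standard deviation is an assumption, since the divisor used by SAISIR is not stated here.

```matlab
% Sketch of the SNV correction (illustration only, not the SAISIR code).
X = [1 2 3 4; 2 4 6 8; 5 5 6 8];   % hypothetical spectra, one per row
p = size(X, 2);
rowmean = mean(X, 2);              % mean of each spectrum
rowstd = std(X, 0, 2);             % standard deviation of each spectrum (n-1)
X1 = (X - repmat(rowmean, 1, p)) ./ repmat(rowstd, 1, p);
```

The first two spectra differ only by a multiplicative factor, and they become identical after the correction.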
The PC scores are introduced in the order of their regression coefficient
or their covariance
with y
Input arguments:
===============
X : SAISIR matrix of predictive variables (n x p)
y : SAISIR vector of observed y
maxdim: (integer): maximum number of PC scores introduced in the regression
model
maxrank: (optional, integer): maximal rank of the PC scores in the model
Default value: all components possibly introduced.
corr_cov : 1: introduction according to the correlation coefficient
(default);
or 0: introduction according to the covariance
Output arguments:
===============
spcrtype with fields
pca: PCA structure (see function "pca")
beta: beta coefficients (applicable on PC scores)
selected_component: rank of the scores introduced in the model
predy: predicted y values for all the steps of spcr
r2: r2 for all the steps of spcr
averagey: average value of y
beta1: beta coefficients (applicable on X centred)
obsy: observed y, copy of input argument y.
rmsec: root mean square error of calibration for all the models
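The selection principle can be sketched as follows on plain arrays. This is only an illustration under assumptions, not the SAISIR code: the data are hypothetical, PCA is computed by SVD of the centred matrix, and the components are ranked by absolute correlation with y (the corr_cov=1 case).

```matlab
% Sketch: regression on the PC scores most correlated with y.
Xd = [2 0 1; 0 1 3; 1 3 0; 4 1 2; 3 2 5; 5 4 1];   % hypothetical data
yd = [1; 2; 3; 4; 5; 6];
n  = size(Xd, 1);
Xc = Xd - repmat(mean(Xd, 1), n, 1);   % centred X
[U, S, V] = svd(Xc, 'econ');
T  = U * S;                            % PC scores (equal to Xc*V)
yc = yd - mean(yd);
r  = zeros(1, size(T, 2));
for k = 1:size(T, 2)
    r(k) = (T(:, k)' * yc) / (norm(T(:, k)) * norm(yc));  % correlation with y
end
[~, order] = sort(abs(r), 'descend');  % most correlated scores first
maxdim   = 2;
selected = order(1:maxdim);            % ranks of the scores kept in the model
b     = (T(:, selected)' * T(:, selected)) \ (T(:, selected)' * yc);
predy = T(:, selected) * b + mean(yd); % predicted y
beta1 = V(:, selected) * b;            % coefficients applicable to centred X
```

Because the scores are T = Xc*V, the coefficients on the scores convert directly into coefficients on the centred variables, which is the relation between the "beta" and "beta1" fields listed above.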
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix (n x p)
index: MATLAB vector with k elements equal to 1 and n-k elements equal to 0
Output arguments
===============
X1:SAISIR matrix (k x p)
X2:SAISIR matrix ((n-k) x p)
Divides X into two matrices, X1 and X2.
X1 contains the kept rows (index = 1, or "true").
X2 is the complement (index = 0, or "false").
index is either a boolean vector or a vector of row indices (integers).
%Typical example (division of DATA into a calibration set and a validation set):
%===============
[n,p]=size(DATA.d);
sel=random_select(n,round(n/3));%% building a random vector of 0 and 1
[validation_set, calibration_set]=splitrow(DATA,sel);%% creation of the validation and calibration sets
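What the split does to the three fields of a SAISIR structure can be sketched by hand as follows. The data are hypothetical and the sketch assumes a boolean index; it only mimics the documented behaviour and is not the SAISIR source.

```matlab
% Sketch: splitting the rows of a SAISIR-like structure with a boolean index.
X.d = [1 2; 3 4; 5 6; 7 8];                 % data (n x p)
X.i = ['obs1'; 'obs2'; 'obs3'; 'obs4'];     % row identifiers
X.v = ['v1'; 'v2'];                         % column identifiers
index = logical([1 0 0 1]);                 % rows 1 and 4 are kept
X1.d = X.d(index, :);   X1.i = X.i(index, :);   X1.v = X.v;   % kept rows
X2.d = X.d(~index, :);  X2.i = X.i(~index, :);  X2.v = X.v;   % complement
```

The row identifiers follow the data, and the column identifiers are copied unchanged into both parts.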
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X:SAISIR matrix (n x p)
startpos, endpos: position in the row-identifier strings (".i")
Output argument:
===============
res with fields:
average: averages of the identified groups (p columns)
group: number of observations in each group.
The function extracts the characters of the row identifiers from "startpos" to "endpos" and makes as many groups as there are different strings.
The observations are averaged according to these groups.
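The grouping rule can be sketched on plain arrays as follows. The identifiers and data are hypothetical, and the use of "unique" (which sorts the group names alphabetically) is an assumption about ordering, not a statement about the SAISIR function.

```matlab
% Sketch: averaging observations by a substring of their identifiers.
Xd = [1 2; 3 4; 10 20; 30 40];             % hypothetical data
Xi = ['A01'; 'A02'; 'B01'; 'B02'];         % hypothetical row identifiers
startpos = 1; endpos = 1;
keys = cellstr(Xi(:, startpos:endpos));    % extracted strings, one per row
[names, ~, g] = unique(keys);              % g(r) = group number of row r
ngroup  = numel(names);
average = zeros(ngroup, size(Xd, 2));
count   = zeros(ngroup, 1);
for k = 1:ngroup
    average(k, :) = mean(Xd(g == k, :), 1);   % mean of the rows of group k
    count(k)      = sum(g == k);              % number of observations in group k
end
```

Here the rows starting with 'A' and those starting with 'B' form two groups of two observations each.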
Return to thematic list
Return to alphabetic list
HOME
Input arguments
==============
X1:SAISIR matrix (n x p)
option : either 0: divides by n-1, or 1: divides by n (default : 1)
Output argument
===============
X:SAISIR matrix (n x p)
xstd: standard deviation of the columns of X1 (1 x p)
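The effect of "option" can be sketched on a plain array. The sketch assumes that the function only divides each column by its standard deviation; whether it also centres the columns is not stated on this page. The data are hypothetical.

```matlab
% Sketch: scaling each column by its standard deviation.
X1d = [1 10; 2 20; 3 60];                 % hypothetical data standing for X1.d
option = 1;                               % 0: divide by n-1, 1: divide by n
xstd = std(X1d, option, 1);               % the MATLAB std flag follows the same 0/1 convention
Xd = X1d ./ repmat(xstd, size(X1d, 1), 1);
```

After scaling, every column has unit standard deviation when measured with the same option.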
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
collection:array of SAISIR matrices with the same number of rows (n)
Output arguments
================
Let n be the number of observations and t the number of tables.
The fields of the output argument res are:
RV: [1x1 struct] t x t matrix of RV values indicating the agreement between the tables (max value = 1)
eigenval1: [1x1 struct] first eigenvalue of the RV matrix
eigenvec1: [1x1 struct] first eigenvector of the RV matrix (t x 1). Indicates the weight associated with each table
Wk: {1xt cell} cell array of n x n arrays giving the scalar products between observations
W_compromise: n x n array giving the compromise of the arrays Wk
eigenval2: [1x1 struct] r eigenvalues of W_compromise, with r the rank of W_compromise
score: [1x1 struct] (n x r) scores of the observations in the compromise. Can be represented as a factorial map
trajectory: [1x1 struct] (n*t x r) projection of each row vector of each table in the space of the observation scores
group: [1x1 struct] (n*t x 1) table indicating to which table a given row vector belongs
table_score: [1x1 struct] (t x r) scores of the tables obtained from the diagonalisation of RV
table_eigenval: [1x1 struct] (r x 1) eigenvalues of RV. The first one is the same as eigenval1
The STATIS method is described in C. Lavit, "Analyse conjointe de tableaux quantitatifs", Masson, 1988.
Basically, the method attempts to establish a factorial compromise between tables having the same number of observations.
col is an ARRAY OF CELLS containing all the 2-D data tables (SAISIR format).
Each table must include the same observations, but not necessarily the same variables.
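The first steps of the method (scalar products, RV matrix, table weights, compromise) can be sketched on plain arrays. This is an illustration under assumptions rather than the SAISIR code: the tables are hypothetical, they are column-centred before computing the scalar products, no further normalisation is applied, and the weights are scaled to sum to 1.

```matlab
% Sketch: RV coefficients between tables and weighted compromise.
tables = {[1 2; 3 4; 5 7], [2 0 1; 1 1 0; 0 3 2]};   % t = 2 tables, n = 3 rows
t = numel(tables);
n = size(tables{1}, 1);
W = cell(1, t);
for k = 1:t
    Xc   = tables{k} - repmat(mean(tables{k}, 1), n, 1);  % column centring
    W{k} = Xc * Xc';                    % n x n scalar products between observations
end
RV = zeros(t);
for a = 1:t
    for b = 1:t
        RV(a, b) = trace(W{a} * W{b}) / sqrt(trace(W{a} * W{a}) * trace(W{b} * W{b}));
    end
end
[vec, val] = eig((RV + RV') / 2);       % symmetrised to avoid rounding asymmetry
[~, imax]  = max(diag(val));
w = abs(vec(:, imax));                  % first eigenvector: weight of each table
w = w / sum(w);                         % weights scaled to sum to 1 (assumption)
Wcomp = zeros(n);
for k = 1:t
    Wcomp = Wcomp + w(k) * W{k};        % compromise of the Wk arrays
end
```

The diagonal of RV is 1 by construction, and tables that agree with the others receive larger weights in the compromise.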
group is useful with the command "carte_barycentre".
For example, a command such as "carte_barycentre(res.trajectory,2,3,res.group)" will produce the representation of the row vectors of each table for scores 2 and 3. The representation shows the compromise point and its link to each vector of the tables.
"collection" is built up with commands like:
collection(1)=DATA1; collection(2)=DATA2; ... ; collection(k)=DATAk;
Warning! Such commands work only if the DATA are structures with the fields .d, .i, .v IN THAT ORDER, with NO OTHER FIELDS. Otherwise, MATLAB refuses to build the vector of SAISIR structures. If needed, use "saisir_check" to verify this point.
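A minimal sketch of building such a vector of structures is given below. The tables are hypothetical, and the use of the standard MATLAB function "orderfields" to put the fields of a table in the expected order is a suggestion of this sketch, not something prescribed by SAISIR.

```matlab
% Sketch: building a struct array of SAISIR-like tables.
DATA1.d = [1 2; 3 4];       DATA1.i = ['a'; 'b'];  DATA1.v = ['x'; 'y'];
DATA2.d = [5 6 7; 8 9 10];  DATA2.i = ['a'; 'b'];  DATA2.v = ['p'; 'q'; 'r'];
% A table whose fields were created in another order (i, v, d):
DATA3.i = ['a'; 'b'];  DATA3.v = ['k'; 'l'];  DATA3.d = [0 1; 1 0];
DATA3 = orderfields(DATA3, DATA1);   % reorder the fields as d, i, v
collection(1) = DATA1;
collection(2) = DATA2;
collection(3) = DATA3;
```

All tables share the same rows here, while the number of columns differs from one table to the next.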
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X : SAISIR matrix of predictive data (n x p)
y : SAISIR vector of variable to be predicted (n x 1)
P: probability threshold for entering or discarding a variable
confidence: (default=0.05) probability level of the confidence interval for the limits of the regression coefficients
Output argument
===============
result: array of cells, one cell for each step of the regression (adding or discarding a variable)
In each cell:
message: gives the name of the entered or discarded variable
res: a structure described below
intercept: constant value (beta0) of the current model
RMSE: root mean square error of the model
r2: determination coefficient
adjusted_r2: adjusted determination coefficient (taking the dimensions into account)
F: Fisher F value of the current model
probF: probability value associated with F
ypred: predicted y values
res: the rows indicate the variables introduced;
the columns give information on the corresponding regression coefficients:
1) Regression coefficients
2) Lower confidence limit of regression coefficients
3) Higher confidence limit
4) Std of regression coeff.
5) t value of reg. coeff.
6) Prob. of reg. coeff.
7) Rank of variables
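For readers unfamiliar with the model statistics listed above, the following sketch computes them for one fixed model using the usual textbook formulas. It is an assumption that the SAISIR function uses exactly these definitions (in particular the n-k-1 divisor in RMSE); the data are hypothetical.

```matlab
% Sketch: r2, adjusted r2, F and RMSE of a linear model with k variables.
Xd = [1 0; 2 1; 3 0; 4 2; 5 1; 6 3];      % hypothetical predictors already in the model
yd = [1.1; 2.3; 2.9; 4.6; 5.0; 6.8];
[n, k] = size(Xd);
A     = [ones(n, 1) Xd];                  % design matrix with intercept
coef  = A \ yd;                           % coef(1) is the intercept (beta0)
ypred = A * coef;
sse = sum((yd - ypred).^2);               % residual sum of squares
sst = sum((yd - mean(yd)).^2);            % total sum of squares
r2          = 1 - sse / sst;
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1);
F           = ((sst - sse) / k) / (sse / (n - k - 1));
RMSE        = sqrt(sse / (n - k - 1));    % divisor n-k-1 assumed
```

The adjusted coefficient penalises the number of variables, so it never exceeds r2.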
% example
result=stepwise_regression(X,y,0.05);
result{3} %% third model
message: 'Entering variable 2214 at step 3'
res: [1x1 struct]
intercept: 4.66
RMSE: 0.81
r2: 0.90
adjusted_r2: 0.90
F: 417.96
probF: 0
ypred: [1x1 struct]
result{3}.res gives the statistics on the regression coefficients, which can be consulted for example under Excel using "saisir2excel(result{3}.res,'model3')"
Return to thematic list
Return to alphabetic list
HOME
NORMALLY NO DIRECT USE
Creates a SAISIR file from a string table obtained by the procedure readexcel1.
A DOS file that has been read by readexcel1 is a 3-way matrix of characters of the form data(row,pos,col). For example, data(5,:,12) contains the string in row 5 and column 12.
To be acceptable for SAISIR transformation, the data format must be the following:
1) the first row data(1,:,:) must contain the identifiers of the variables (columns)
Warning: for matrix presentation of the data, the string data(1,:,1) is of no use and is skipped by the program.
2) the first column data(:,:,1) must contain the identifiers of the observations (rows)
3) the other rows and columns contain strings which can be converted into numbers (or whitespace)
Such a format is normally obtained by using readexcel1 (Excel data saved as .csv).
If the original Excel file was not appropriate, several columns may contain strings which cannot be transformed into numbers (or whitespace).
In this situation, the undesired columns can be removed using
data(:,:,col)=[]; where col is the index of the column to be removed
Note that whitespace is replaced by NaN values.
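The layout data(row,pos,col) and its conversion can be sketched as follows. The small table is hypothetical and is built by hand instead of being read by readexcel1; the conversion with "str2double" is an assumption of this sketch, chosen because it returns NaN for blank strings.

```matlab
% Sketch: converting a 3-way character array into a SAISIR-like structure.
cells = {'',   'v1',  'v2';
         'r1', '1.5', '2';
         'r2', '',    '4'};               % hypothetical table content
nchar = 4;                                % number of characters kept per string
[nr, nc] = size(cells);
data = repmat(' ', [nr, nchar, nc]);      % data(row,pos,col), padded with blanks
for r = 1:nr
    for c = 1:nc
        s = cells{r, c};
        if ~isempty(s)
            data(r, 1:numel(s), c) = s;
        end
    end
end
S.d = NaN(nr - 1, nc - 1);
for r = 2:nr
    for c = 2:nc
        S.d(r - 1, c - 1) = str2double(data(r, :, c));   % blank strings give NaN
    end
end
S.i = data(2:end, :, 1);                      % row identifiers (first column)
S.v = permute(data(1, :, 2:end), [3 2 1]);    % column identifiers (first row)
```

The string data(1,:,1) is skipped, and the empty cell in row 'r2' becomes a NaN value.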
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
str:matrix of characters
filename:string (vector of characters)
Output argument
==============
None
This function saves a succession of strings (in a matrix of char) under the name "filename". The extension ".TXT" is added to the file name.
Return to thematic list
Return to alphabetic list
HOME
The scale of the COMPLETE map is used.
Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
xstring : string in the names of identifiers which must be displayed
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
marg (optional) : margin value allowing an extension of the axis in order
to cope with long identifiers (default value: 0.05)
example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command "submap(X,5,3,'wh')" will plot column 5 as X and column 3 as Y.
Only the observations containing 'wh' in their names are displayed, but the general axis scales are those of the whole collection.
Useful for emphasizing some groups in a complex plot.
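The selection and scaling logic can be sketched without the SAISIR function as follows. The data are hypothetical, the margin argument is ignored, and the plotting call is left as a comment so that the sketch runs without a display.

```matlab
% Sketch: keep only some observations but use the axis limits of the full map.
Xi = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'; 'barle3'];   % row identifiers
Xd = [1 5; 2 9; 8 3; 4 4; 6 1];                            % hypothetical data
col1 = 1; col2 = 2; xstring = 'wh';
n = size(Xi, 1);
keep = false(n, 1);
for r = 1:n
    keep(r) = ~isempty(strfind(Xi(r, :), xstring));   % identifier contains xstring
end
% Axis limits computed on ALL observations, not only on the displayed ones
lims  = [min(Xd(:, col1)) max(Xd(:, col1)) min(Xd(:, col2)) max(Xd(:, col2))];
shown = Xd(keep, [col1 col2]);
% To draw the map: plot(shown(:, 1), shown(:, 2), '.'); axis(lims);
```

Keeping the limits of the complete map makes successive sub-maps directly comparable.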
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix
ncol: (integer) column chosen for the subtraction
Output argument
==============
X1: SAISIR matrix in which the variable has been subtracted
Subtracts the variable of index ncol from each of the other variables of the observations.
Useful for correcting a y-shift (baseline offset) of spectral data.
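A minimal sketch of this correction on a plain array is given below. The data are hypothetical, and the sketch subtracts the chosen column from every column, including itself (which therefore becomes zero); how the SAISIR function treats column ncol itself is not stated here.

```matlab
% Sketch: subtracting one variable from all the variables of each observation.
Xd = [1 2 3; 10 12 15];                       % hypothetical spectra (rows)
ncol = 1;                                     % column used as reference
X1d = Xd - repmat(Xd(:, ncol), 1, size(Xd, 2));
```

Each row is shifted by its own value at column ncol, which removes a constant vertical offset between spectra.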
Return to thematic list
Return to alphabetic list
HOME
Input argument:
==============
X:SAISIR matrix
Output arguments:
================
zmin, zmax: min and max values in X.d
% new version 6/10/2006
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix
Threshold: (small) positive or zero value
If the sum of a row is equal to 0, the elements of that row are set to 0.
If threshold is defined, its value (normally very small) is added to the data.
This is only useful for avoiding the "division by zero" warning.
Output argument
===============
X1:corrected matrix
The function computes the sum of each row. Each value of a row is divided by the corresponding sum.
In chromatography, this corresponds to giving the same total area to all the chromatograms.
See also: snv
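The normalisation can be sketched on a plain array as follows. The data are hypothetical and the optional threshold is not used; rows whose sum is zero are simply left at zero, as described above.

```matlab
% Sketch: dividing each row by its sum, leaving all-zero rows at zero.
Xd = [1 1 2; 0 0 0; 2 6 2];                  % hypothetical data
rowsum = sum(Xd, 2);
X1d = zeros(size(Xd));
nz  = rowsum ~= 0;                           % rows that can be normalised
X1d(nz, :) = Xd(nz, :) ./ repmat(rowsum(nz), 1, size(Xd, 2));
```

Every normalised row then sums to 1, so profiles are compared by shape and not by total intensity.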
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
===============
X: SAISIR matrix
col1, col2 : index of the two columns to be represented (integer values)
startpos, endpos: positions in the row identifier strings ('.i') used for the coloration
col1label (optional): Label of the variable forming the X-axis
col2label (optional): Label of the variable forming the Y-axis
title (optional) : title of the graph
charsize (optional) : size of the plotted characters
For French users: there is a synonym function "carte_symbole", but "symbol_map" should be preferred.
The coloration of the displayed symbols depends on the arguments startpos and endpos.
From the name of each individual, the string name(startpos:endpos) is extracted. Two observations for which these strings differ are represented with different symbols.
example:
Let X be a SAISIR matrix
Let X.i be
'wheat1'
'barle2'
'ricex1'
'wheat2'
'barle3'
...
The command 'symbol_map(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.
The characters are extracted from positions 1 to 5, giving the strings 'wheat', 'barle', 'ricex'. A different symbol is given to each of these strings.
See also colored_map1, colored_map2 (same principle but with the identifier name displayed)
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
================
X : SAISIR matrix
ncol : column to be represented
xlabel, ylabel (optional) : labels in X and Y
title (optional) title of the graph.
This function draws a column (typically a loading or an eigenvector) as a curve.
If X.i can be interpreted as a vector of numbers (such as wavelengths), the X scale is given by this vector.
Otherwise, the X-axis is simply given by the rank of the variables.
The function "tcourbe" is a synonym of this function.
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X:SAISIR matrix (n x p)
range: indices of the selected columns (Matlab vector of integers)
xlabel, ylabel, title: labels of the x-axis and y-axis, and title (strings)
Typical use: showing loadings of PLS or PCA
===========
p=pca(spectra);
tcurves(p.eigenvec,1:4);%% First 4 loadings of PCA
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
function_name: matrix of characters giving the function name
previous (optional): previous results of this function "thematic_classification"
Output argument
==============
res with fields
theme_structure: array of structures with fields:
name: name of the function
theme: vector giving the number of the theme in the thematic list
theme: matrix of char giving the names of the themes.
"thematic_classification" presents a list of themes.
For each function, the user is asked to give its number in the list of themes.
VERY SPECIFIC USE.
This function is used by the function "build_documentation" in order to give a thematic list of the functions.
If "previous" is defined, the results are obtained by concatenating the old and the newly created thematic lists. The functions which are in "previous" are not considered again.
Return to thematic list
Return to alphabetic list
HOME
The function represents the columns col1 and col2 as curves (arcs).
The observations which have the same strings in their identifiers are joined.
The points are joined consecutively according to their order (rank) in x.
Input arguments:
================
X : SAISIR matrix
col1, col2 : columns of x to be represented.
startpos, endpos : positions in the identifiers indicating which observations are to be joined.
Example of use: time series
==============
Identifiers of rows (x.i) like A01; A02; A03... A100; B01 ... B100; C01 ...
with 1 ... 100 indicating times, and A, B, ... the observations varying with time
Command:
trajectory_curve(x,1,2,1,1);
Joins the points labelled 'A' together, then the ones labelled 'B', and so on.
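How the points of each trajectory could be collected can be sketched as follows. The identifiers are hypothetical, the grouping relies on the standard function "unique", and the plotting call is left as a comment so that the sketch runs without a display.

```matlab
% Sketch: listing, for each group, the rows to be joined, in their order in x.
xi = ['A01'; 'B01'; 'A02'; 'B02'; 'A03'];    % hypothetical row identifiers
startpos = 1; endpos = 1;
keys = cellstr(xi(:, startpos:endpos));      % group string of each row
[names, ~, g] = unique(keys);
paths = cell(numel(names), 1);
for k = 1:numel(names)
    paths{k} = find(g == k)';                % rows of group k, in order of appearance
end
% To draw trajectory k from data xd:
% plot(xd(paths{k}, col1), xd(paths{k}, col2), '-');
```

The rows of a group do not need to be contiguous: only their relative order in x determines how the points are joined.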
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
xstruct: any variable (supposed to be a structure with fields)
Output argument
==============
none
This function displays all the fields found in a structure (typically returned by a SAISIR function).
"Matrix" or "Vector" means here "SAISIR matrix" or "SAISIR vector".
Example of use
...
pls_res=crossvalpls1a(x,y,10,sel);%% models with 1 to 10 dimensions
w(pls_res);%% gives the fields in "pls_res":
"
calibration
T : matrix 94 X 10
P : matrix 10 X 1050
beta : vector 1050 X 1
beta0 : = 12.7025
....
validation
PREDY : matrix 46 X 10
RMSEV : vector 1 X 10
r2 : vector 1 X 10
T : matrix 46 X 10
OBSY : vector 46 X 1
....
"
Return to thematic list
Return to alphabetic list
HOME
function[Q, saliences, explained]=xcomdim(col,threshold,ndim)
Finds common dimensions in multiple tables according to the method 'level3'
proposed by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H. MacFie
in Food Quality and Preference 11 (2000) 151-154
col is an array of matrices with the same number of rows
threshold (optional): if the difference of fit
ndim : number of common dimensions
default: threshold=1E-10; ndim=number of tables
returns Q: nrow x ndim, the loadings of the observations
Return to thematic list
Return to alphabetic list
HOME
Input argument
==============
varargin: variable number of arguments, each either a number or a string
The function avoids those boring "num2str" calls and brackets [] when displaying text.
Example: xdisp('pi is equal to',pi, 'Don''t you know ','Charlie Brown ?',' age :',6 );
equivalent to
disp(['pi is equal to ', num2str(pi) ' Don''t you know ' 'Charlie Brown ?' ' age ' num2str(6)]);
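The idea behind such a helper can be sketched with a loop over a cell array of mixed arguments. This is not the source of xdisp: the argument list is hypothetical, and joining the pieces with a single space is an assumption of this sketch.

```matlab
% Sketch: displaying a mix of strings and numbers without writing num2str by hand.
args  = {'pi is equal to', pi, 'and the age is', 6};   % hypothetical arguments
parts = cell(size(args));
for k = 1:numel(args)
    if ischar(args{k})
        parts{k} = args{k};              % strings are kept as they are
    else
        parts{k} = num2str(args{k});     % numbers are converted to text
    end
end
txt = strjoin(parts, ' ');               % pieces separated by one space (assumption)
disp(txt);
```

Inside a function, the same loop would run over "varargin" instead of the hand-built cell array.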
Return to thematic list
Return to alphabetic list
HOME
directly on data
returns coord.d, eigenvector.d, eigenvalues.d
average.d
currently only nrow
Return to thematic list
Return to alphabetic list
HOME
Input arguments
===============
X: Saisir matrix
col1, col2, col3 : indices of the columns represented in X, Y, Z
startpos, endpos: number indicating the beginning and ending character in
the name identifiers. Two different strings will have different colors.
Return to thematic list
Return to alphabetic list
HOME
Input arguments:
================
X,Y : SAISIR matrices (with the same number of rows)
xcol, ycol : rank number (index) of the columns to be plotted
if start_pos and end_pos defined: colored plot according
to the characters of the row identifiers at position start_pos:end_pos
Return to thematic list
Return to alphabetic list
HOME