SAISIR

A comprehensive package for chemometrics in the MATLAB environment

Documentation of the SAISIR functions

For more information on the general structure of SAISIR, see also the manual.

Unité de Sensométrie et de Chimiométrie ENITIAA-INRA (Nantes, France)


Coordinator: Dr Dominique BERTRAND (e-mail: bertrand "at" enitiaa-nantes.fr)















Documentation automatically generated on 07-Jan-2009
This software is copyrighted by ENITIAA-INRA, Unité de Sensométrie et de Chimiométrie, Nantes (France)


Thematic list of functions

  • Loading and saving files
  • Elementary manipulation of SAISIR files
  • Elementary data transformation and pre-treatments
  • Graphical display
  • Usual statistics
  • Advanced statistics
  • Principal Component analysis
  • Regression methods
  • Partial least squares (PLS)
  • Discrimination methods
  • Miscellaneous



    Loading and saving files

    appendbag1

    appendbag1 - Merge an arbitrary number of "bag" files according to rows


    usage: [X]= appendbag1(X1, X2, X3,.....)

    bag2group

    bag2group - uses the identifiers in bag to create groups


    function [group_type]=bag2group(bag)

    bag_appendrow1

    bag_appendrow1 - Merges an arbitrary number of bags according to rows


    usage: [bag]= appendrow(bag1, bag2 ....)

    check_name

    check_name - Checks whether any strings are strictly identical in a string array


    function [detected,names]=check_name(string)

    excel2bag

    excel2bag - reads an Excel file and creates the corresponding text bag


    (array of characters)

    excel2saisir

    excel2saisir - reads an Excel text file


    function [saisir] = excel2saisir(filename,(nchar),(start),(xend))

    issaisir

    issaisir - tests if the input argument is a SAISIR matrix


    test=issaisir(X);

    matrix2saisir

    matrix2saisir - transforms a MATLAB matrix into a SAISIR structure


    X = matrix2saisir(data,(coderow),(codecol))

    readexcel1

    readexcel1 - reads an Excel file in the .CSV format (creates a 3-way character matrix).


    function [data] = readexcel1(filename,(nchar),(deb),(xend))

    readident

    readident - loads a file of strings


    function [ident, nident] = readident(filename,namesize)

    saisir2ascii

    saisir2ascii - Saves a saisir file into a simple ASCII format


    function saisir2ascii(X,filename,separator)

    saisir2excel

    saisir2excel - Saves a saisir file in a format compatible with Excel


    function saisir2excel(X,filename)

    saisir_check

    saisir_check - Checks whether the data conform to the SAISIR structure


    function check=saisir_check(X)

    string2saisir

    string2saisir - creation of a saisir file from a string table (first column=name)


    function [saisir] = string2saisir(data)

    string2text

    string2text - saves a vector of strings in .txt format


    function string2text(str,filename)



    Elementary manipulation of SAISIR files

    addcode

    addcode - adds a string before or after a matrix of characters


    function str1 = addcode(str,code,(deb_end))

    alphabetic_sort

    alphabetic_sort - sorts the rows of X according to the alphabetical order of the row identifiers


    function X1=alphabetic_sort(X,start_pos:end_pos)

    appendbag1

    appendbag1 - Merge an arbitrary number of "bag" files according to rows


    usage: [X]= appendbag1(X1, X2, X3,.....)

    appendcol

    appendcol - merges two files according to columns


    function: [X3]= appendcol(X1,X2)

    appendcol1

    appendcol1 - merges an arbitrary number of files according to columns


    usage: [X]= appendcol(X1,X2,X3,...)

    appendrow

    appendrow - merges two SAISIR matrices according to rows


    usage: [X3]= appendrow(X1,X2)

    appendrow1

    appendrow1 - Merges an arbitrary number of files according to rows


    usage: [X]= appendrow(X1,X2,X3,...)

    bag2group

    bag2group - uses the identifiers in bag to create groups


    function [group_type]=bag2group(bag)

    bag_appendrow1

    bag_appendrow1 - Merges an arbitrary number of bags according to rows


    usage: [bag]= appendrow(bag1, bag2 ....)

    build_indicator

    build_indicator - builds a disjunctive (indicator) table


    function [indicator, groupings]=build_indicator(x);

    check_name

    check_name - Checks whether any strings are strictly identical in a string array


    function [detected,names]=check_name(string)

    create_group

    create_group - creates a vector of numbers indicating groups from identifiers


    group=create_group(X,code_list,startpos,endpos)

    create_group1

    create_group1 - uses the identifiers to create groups


    use the identifier for creating groups.

    deletecol

    deletecol - deletes columns of saisir files


    function [X1] = deletecol(X,index)

    deleterow

    deleterow - deletes rows


    function X1 = deleterow(X,index)

    eliminate_nan

    eliminate_nan - suppresses "not a number" data in a saisir structure


    function [X1] = eliminate_nan(X,(row_or_col))

    find_index

    find_index - finds the index corresponding to the closest "value"


    function index=find_index(str,value);

    find_max

    find_max - gives the indices of the max value of a MATLAB Matrix


    function [row,col,value]=find_max(matrix)

    find_min

    find_min - gives the indices of the min value of a MATLAB Matrix


    function [row,col,value]=find_min(matrix)

    find_peaks

    find_peaks - finds and displays peaks greater than a threshold value


    function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

    group_centering

    group_centering - Centers data according to groups


    function X1=group_centering(X,group);

    issaisir

    issaisir - tests if the input argument is a SAISIR matrix


    test=issaisir(X);

    matrix2saisir

    matrix2saisir - transforms a MATLAB matrix into a SAISIR structure


    X = matrix2saisir(data,(coderow),(codecol))

    num2str1

    num2str1 - Justified num2string


    function str=num2str1(vector,ndigit);

    random_splitrow

    random_splitrow - random selection of rows


    function[X1,X2]=random_splitrow(X, nselect)

    reorder

    reorder - reorders the data of files A1 and A2 according to their identifiers


    function [B1 B2]=reorder(A1,A2)

    repeat_string

    repeat_string - builds a matrix of characters by repeating a string


    function str1=repeat_string(str,ntimes);

    row_center

    row_center - subtracts the average row from each row


    function [X] = center(X1)

    saisir_check

    saisir_check - Checks whether the data conform to the SAISIR structure


    function check=saisir_check(X)

    saisir_sort

    saisir_sort - sorts the rows of X according to the values in a column


    function [X1 X2]=saisir_sort(X,ncol,minmax)

    saisir_transpose

    saisir_transpose - transposes a data matrix following the saisir format


    function [X] = saisir_transpose(X1)

    select_from_identifier

    select_from_identifier - Uses identifier of rows for selecting samples


    function [X1] = select_from_identifier(X,startpos,str)

    select_from_variable

    select_from_variable - uses identifiers of columns for selecting variables


    function [X1] = select_from_variable(X,startpos,str)

    selectcol

    selectcol - creates a new data matrix with the selected columns


    function [X] = selectcol(X1,index)

    selectrow

    selectrow - creates a new data matrix with the selected rows


    function [X] = selectrow(X1,index)

    split_average

    split_average - averages observations according to the identifiers


    function res=split_average(X,startpos,endpos)

    splitrow

    splitrow - splits a data matrix into 2 resulting matrices


    function [X1, X2]= splitrow(X,index)



    Elementary data transformation and pre-treatments

    build_indicator

    build_indicator - builds a disjunctive (indicator) table


    function [indicator, groupings]=build_indicator(x);

    center

    center - subtracts the average from each row


    function [X1 xmean] = center(X)
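
    As an illustration of this kind of centering, here is a minimal sketch on a plain MATLAB matrix. It assumes that "the average" is the mean row (the mean of each column), which the two outputs X1 and xmean suggest; it does not call SAISIR and may differ from the toolbox implementation.

```matlab
% Plain numeric matrix: 3 observations (rows) x 2 variables (columns).
X = [1 10; 2 20; 3 30];

% Mean of each column, stored as a single "average row" (assumed meaning).
xmean = mean(X, 1);

% Subtract the average row from every row. bsxfun keeps the code
% compatible with older MATLAB releases and with Octave.
X1 = bsxfun(@minus, X, xmean);

disp(X1);   % each column of X1 now has zero mean
```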

    correct_baseline

    correct_baseline - simple linear baseline correction, using intensity


    function [saisir] = correct_baseline (saisir1,col1,col2)

    create_group

    create_group - creates a vector of numbers indicating groups from identifiers


    group=create_group(X,code_list,startpos,endpos)

    create_group1

    create_group1 - uses the identifiers to create groups


    use the identifier for creating groups.

    eliminate_nan

    eliminate_nan - suppresses "not a number" data in a saisir structure


    function [X1] = eliminate_nan(X,(row_or_col))

    moving_average

    moving_average - Moving average of signals


    function X1=moving_average(X,window_size)
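
    The idea of a moving average can be sketched on a plain matrix whose rows are signals. How SAISIR handles the edges of each signal is not stated here, so this sketch simply keeps the positions where the window is complete; the variable names are illustrative.

```matlab
% Two signals stored as rows of a plain matrix.
X = [1 2 3 4 5; 2 4 6 8 10];
window_size = 3;

% Averaging kernel: equal weights summing to one.
kernel = ones(1, window_size) / window_size;

% Filter along dimension 2 (along each row).
Y = filter(kernel, 1, X, [], 2);

% Keep only the outputs computed from a full window (assumption:
% edge points are dropped rather than padded).
X1 = Y(:, window_size:end);

disp(X1);
```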

    moving_max

    moving_max - replaces the central point of a moving window by the maximum value


    function [X1] = moving_max(X,window_size)

    moving_min

    moving_min - replaces the central point of a moving window by the minimum value


    function [X1] = moving_min(X,window_size)

    msc

    msc - Multiplicative scatter correction on spectra


    function [X1] = msc(X,(reference))
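
    A common formulation of multiplicative scatter correction regresses each spectrum on a reference spectrum and then removes the fitted offset and slope. The sketch below follows that textbook formulation on a plain matrix; the reference vector and the synthetic data are assumptions made for the example, and the SAISIR function may differ in its details.

```matlab
% Reference spectrum (hypothetical values for the example).
ref = [1 2 3 4];

% Two "spectra" that are scaled and shifted copies of the reference.
X = [2*ref + 1; 0.5*ref - 3];

X1 = zeros(size(X));
for r = 1:size(X, 1)
    % Fit X(r,:) ~ p(1)*ref + p(2) by least squares.
    p = polyfit(ref, X(r, :), 1);
    % Remove the additive offset, then the multiplicative factor.
    X1(r, :) = (X(r, :) - p(2)) / p(1);
end

disp(X1);   % both rows are mapped back onto the reference
```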

    norm_col

    norm_col - divides each column by the corresponding standard deviation


    function [saisir] = norm_col(saisir1,(mode))

    normc

    normc - Normalize columns of a matrix.



    random_saisir

    random_saisir - Creation of a random matrix


    function[X]=random_saisir(nrow,ncol)

    random_select

    random_select - builds a vector of random elements 0 or 1


    function[selected]=random_select(nel, nselect, (nrepeat))

    randomize

    randomize - Builds a file of randomly attributed vector in X1


    function X1=randomize(X)

    reorder

    reorder - reorders the data of files A1 and A2 according to their identifiers


    function [B1 B2]=reorder(A1,A2)

    saisir_derivative

    saisir_derivative - n-th order derivative using the Savitzky-Golay coefficients


    [X]=saisir_derivative(X1,polynom_order,window_size,derivative_order)

    saisir_emsc

    saisir_emsc - Correction of spectra by EMSC


    [X1, emsc_model, coefficients]=saisir_emsc(X,good_spectra,bad_spectra,ref);

    sgolaycoef

    sgolaycoef - Computes the Savitzky-Golay coefficients


    function [B,G] = sgolaycoef(k,F)

    snv

    snv - Standard normal variate correction on spectra


    function [X1] = snv(X)
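
    Standard normal variate correction is usually defined row by row: each spectrum is centered on its own mean and divided by its own standard deviation. A minimal sketch on a plain matrix, assuming that usual definition (it does not use the SAISIR structure):

```matlab
% Two spectra stored as rows.
X = [1 2 3; 10 20 30];

% Mean and standard deviation of each row (dimension 2).
mu = mean(X, 2);
sd = std(X, 0, 2);

% Center each row, then scale it to unit standard deviation.
X1 = bsxfun(@rdivide, bsxfun(@minus, X, mu), sd);

disp(X1);
```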

    standardize

    standardize - divides each column by the corresponding standard deviation


    function [X, xstd] = standardize(X1,(option))

    subtract_variable

    subtract_variable - subtracts a given variable from all the others


    function [X1] = subtract_variable(X,ncol)

    surface1

    surface1 - Represent a surface in three dimensions


    function [zmin, zmax]=surface1(X)

    surface_std

    surface_std - divides each row by the sum of its columns


    function [X1] = surface_std(X,(threshold))



    Graphical display

    barycenter_map

    barycenter_map - graph of map of barycenter


    function barycenter_map(X,col1,col2,group,(charsize))

    browse

    browse - browses a series of curves


    function browse(X,xstart)

    ca_map

    ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels


    ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

    colored_curves

    colored_curves - displays curves coloured according to groups


    function h=colored_curves(X,group)

    colored_map1

    colored_map1 - colored map : using a portion of the identifiers as labels


    colored_map1(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

    colored_map2

    colored_map2 - colored map : using a portion of the identifiers as labels


    colored_map2(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

    colored_map4

    colored_map4 - colored map according to 2 criteria


    colored_map4(X,col1,col2,color_choice,(symbol_choice),(charsize))

    correlation_circle

    correlation_circle - Displays the correlation circle after PCA


    function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))

    correlation_plot

    correlation_plot - Draws a correlation between scores and tables


    function handle=correlation_plot(scores,col1,col2, X1,X2, ...);

    curve

    curve - represents a row of a matrix as a single curve


    handle=courbe(X,(nrow), (xlabel),(ylabel),(title))

    curves

    curves - represents several rows of a matrix as curves


    usage handle=curves(X,range,(xlabel),(ylabel),(title))

    dendro

    dendro1 - dendrogram using Euclidean metric and Ward linkage


    function group=dendro(X,(topnodes))

    ellipse_map

    ellipse_map - plots the ellipse confidence interval of groups


    function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))

    find_peaks

    find_peaks - finds and displays peaks greater than a threshold value


    function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

    labelled_hist

    labelled_hist - draws a histogram in which each obs. name is considered as a label


    function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))

    list

    list - lists rows (only with a small number of columns)


    function list(X,(start))

    map

    map - graph of map of data using identifiers as names


    function map(X,col1,col2,(col1label),(col2label),(title),(charsize),(margin))

    map3D

    map3D - Draws a 3D map


    function map3D(X,col1,col2,col3,(label1),(label2),(label3),(title),(charsize))

    mir_style

    mir_style - changes the sign of the variables of MIR spectra


    function [names] = mir_style(names1)

    plotmatrix1

    plotmatrix1 - biplots of columns of matrices with colors


    function plotmatrix1(s,startpos,endpos,charsize)

    sensory_profile

    sensory_profile - Graphical representation of sensory profile


    function[h]=sensory_profile(X,range,max_score,(title))

    show_vector

    show_vector - represents a row of a matrix as a succession of identifiers


    function handle=show_vector(X, (nrow) ,(csize),(xlab),(ylab),(title))

    submap

    submap - partial display of observations


    function submap(X,col1,col2,xstring,(col1label),(col2label),(title),(charsize),(marg))

    symbol_map

    symbol_map - map with symbols : using a portion of the identifiers for


    symbol_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize))

    tcurve

    tcurve - representation of a column of a given matrix as a curve


    function handle=tcurve(X, ncol, (xlabel),(ylabel),(title))

    tcurves

    tcurves - represents several columns of a matrix as curves


    function handle=tcurves(X, range, (xlabel),(ylabel),(title))

    trajectory_curve

    trajectory_curve - plots coloured XYcurves


    function handle=trajectory_curve(X,col1,col2,startpos,endpos);

    xdisp

    xdisp - smart display of heterogeneous variables


    function xdisp(varargin)

    xy_plot

    xy_plot - Biplot of one column of X versus one column of Y


    function handle=xy_plot(X, xcol, Y, ycol,start_pos,end_pos);

    xyz_colored_map1

    xyz_colored_map1 - Draws a colored 3D map from a Saisir file


    xyz_colored_map1(X,col1,col2,col3,startpos,endpos)



    Usual statistics

    anavar1

    anavar1 - One way analysis of variance on all the columns


    function res = anavar1(X,g)

    anovan1

    anovan1 - N-way analysis of variance (ANOVA) on data matrices.


    function res = anovan1(X,model,gr1, gr2, ...)

    contingency_khi2

    contingency_khi2 - computes chi-square (khi2) statistics on a contingency table


    function res=contingency_kh2(table);

    contingency_table

    contingency_table - Computes a contingency table


    function table=contingency_table(g1,g2)
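
    A contingency table counts how many observations fall in each combination of two groupings. The sketch below assumes that g1 and g2 are vectors of positive integer group numbers, one per observation; the form of the inputs expected by the SAISIR function is not stated here.

```matlab
% Group membership of 5 observations according to two criteria
% (hypothetical data; groups are numbered from 1).
g1 = [1 1 2 2 2];
g2 = [1 2 1 1 2];

% counts(i,j) = number of observations in group i of g1
% and group j of g2.
counts = accumarray([g1(:) g2(:)], 1);

disp(counts);
```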

    cormap

    cormap - Correlation between two tables


    function [cor] = cormap(X1,X2)

    covmap

    covmap - computes the covariances between two tables


    function [cov] = covmap(X1,X2)

    distance

    distance - Usual Euclidean distances


    function [D] = distance(X1,X2)
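
    The sketch below computes Euclidean distances between every row of one plain matrix and every row of another, which is one natural reading of the two-table signature. Whether the SAISIR function returns distances or squared distances is not stated here, so treat this as an illustration only.

```matlab
% Rows are observations; both tables share the same 2 variables.
X1 = [0 0; 3 4];
X2 = [0 0; 6 8; 3 0];

n1 = size(X1, 1);
n2 = size(X2, 1);
D = zeros(n1, n2);

for i = 1:n1
    % Differences between row i of X1 and every row of X2.
    delta = bsxfun(@minus, X2, X1(i, :));
    % Euclidean norm of each difference vector.
    D(i, :) = sqrt(sum(delta.^2, 2))';
end

disp(D);
```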

    find_max

    find_max - gives the indices of the max value of a MATLAB Matrix


    function [row,col,value]=find_max(matrix)

    find_min

    find_min - gives the indices of the min value of a MATLAB Matrix


    function [row,col,value]=find_min(matrix)

    group_centering

    group_centering - Centers data according to groups


    function X1=group_centering(X,group);

    group_mean

    group_mean - gives the means of groups of rows


    function X1=group_mean(X,startpos,endpos)

    labelled_hist

    labelled_hist - draws a histogram in which each obs. name is considered as a label


    function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))

    mdistance

    mdistance - computes distances between the two tables using metric "metric"


    function dis = mdistance(X1,X2,metric)

    nancor

    nancor - Matrix of correlation with missing data


    function[cor]=nancor(X1,X2)

    row_center

    row_center - subtracts the average row from each row


    function [X] = center(X1)

    saisir_mean

    saisir_mean - computes the mean of the columns, following the saisir format


    function[xmean]=saisir_mean(X);

    saisir_std

    saisir_std - computes the standard_deviations of the columns, following the saisir format


    function[xstd]=saisir_std(X)

    split_average

    split_average - averages observations according to the identifiers


    function res=split_average(X,startpos,endpos)



    Advanced statistics

    ca

    ca - Correspondence analysis


    function ca_type=ca(N);

    ca_map

    ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels


    ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

    comdim

    comdim - Finding common dimensions in multitable data (saisir format)


    function[res]=comdim(collection,(ndim),(threshold))

    covariance_pca

    covariance_pca - principal component analysis from a known covariance matrix (of variables)


    function[pcatype]=covariance_pca(covariance_type,(nscore))

    covmap

    covmap - computes the covariances between two tables


    function [cov] = covmap(X1,X2)

    cumulate_covariance

    cumulate_covariance - Covariance on a huge data set


    function [covariance_type]=cumulate_covariance((X),(covariance_type1));

    d2_factorial_map

    d2_factorial_map - computes a factorial map from a table of squared distances


    function [ftype] = d2_factorial_map(X)

    distance

    distance - Usual Euclidean distances


    function [D] = distance(X1,X2)

    mdistance

    mdistance - computes distances between the two tables using metric "metric"


    function dis = mdistance(X1,X2,metric)

    mfa

    mfa - Multiple factor analysis


    function res=mfa(collection);

    multiway_pca

    multiway_pca - Multi way principal component analysis


    function res=multiway_pca(collection);

    nuee

    nuee - Nuées dynamiques (dynamic clouds, K-means clustering)


    function[res]=nuee(X,ngroup,(nchanged))

    pca_cross_ridge_regression

    pca_cross_ridge_regression - PCA ridge regression with crossvalidation


    function[res]=pca_cross_ridge_regression(X,y,krange,selected)

    regression_score

    regression_score - build a factorial space for regression


    function res=regression_score(x,beta,(y))

    saisir_linkage

    saisir_linkage - computes a simple linkage vector from a distance matrix


    function z=saisir_linkage(dis)

    statis

    statis - Multiway method STATIS


    function res=statis(collection);

    xcomdim

    xcomdim - Finding common dimensions in multitable data


    No direct use. Normally called with function "comdim"



    Principal Component analysis

    applypca

    applypca - computes the scores of supplementary observations


    function [supscores]=applypca(pcatype, X)

    applypcr

    applypcr - applies basic PCR on data


    function [predy]=applypcr(pcrtype,X)

    applyspcr

    applyspcr - Applies a stepwise PCR


    function [predy]=applyspcr(spcrtype,X)

    ca

    ca - Correspondence analysis


    function ca_type=ca(N);

    ca_map

    ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels


    ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))

    change_sign

    change_sign - changes the sign of a component and of its associated eigenvector


    function [pcatype1] = change_sign(pcatype,ncomp)

    comdim

    comdim - Finding common dimensions in multitable data (saisir format)


    function[res]=comdim(collection,(ndim),(threshold))

    contingency_khi2

    contingency_khi2 - computes chi-square (khi2) statistics on a contingency table


    function res=contingency_kh2(table);

    contingency_table

    contingency_table - Computes a contingency table


    function table=contingency_table(g1,g2)

    correlation_circle

    correlation_circle - Displays the correlation circle after PCA


    function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))

    correlation_plot

    correlation_plot - Draws a correlation between scores and tables


    function handle=correlation_plot(scores,col1,col2, X1,X2, ...);

    covariance_pca

    covariance_pca - principal component analysis from a known covariance matrix (of variables)


    function[pcatype]=covariance_pca(covariance_type,(nscore))

    cumulate_covariance

    cumulate_covariance - Covariance on a huge data set


    function [covariance_type]=cumulate_covariance((X),(covariance_type1));

    d2_factorial_map

    d2_factorial_map - computes a factorial map from a table of squared distances


    function [ftype] = d2_factorial_map(X)

    dimcrosspcr1

    dimcrosspcr1 - validation of PCR (samples in validation are selected)


    function [pcrres]=dimcrosspcr1(X,y,ndim,selected)

    fda1

    fda1 - stepwise factorial discriminant analysis on PCA scores


    function[fdatype]=fda1(pcatype, group, among, maxscore)

    fda2

    fda2 - stepwise factorial discriminant analysis on PCA scores with verification


    function[res]=fda2(X, group, among, maxscore, selected)

    mfa

    mfa - Multiple factor analysis


    function res=mfa(collection);

    multiway_pca

    multiway_pca - Multi way principal component analysis


    function res=multiway_pca(collection);

    normed_pca

    normed_pca - PCA with normalisation of data


    function[pcatype]=normed_pca(X)

    pca

    pca - principal component analysis on raw data


    function [pcatype]=pca(X,(var_score))
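
    For readers new to the method, principal component analysis on raw (centered, unscaled) data can be sketched with a singular value decomposition. The example uses a plain matrix and illustrative variable names; the fields of the structure returned by the SAISIR function are not described here and are not reproduced.

```matlab
% 4 observations x 2 variables (hypothetical data, already zero-mean).
X = [2 0; 0 1; -2 0; 0 -1];

% Center the columns (a no-op for this data, kept for generality).
Xc = bsxfun(@minus, X, mean(X, 1));

% Economy-size SVD: Xc = U*S*V'.
[U, S, V] = svd(Xc, 'econ');

% Scores of the observations and loadings (eigenvectors) of the variables.
scores = U * S;
loadings = V;

% Proportion of total variance carried by each component.
sv2 = diag(S).^2;
explained = sv2 / sum(sv2);

disp(explained');
```

    Multiplying the scores by the transposed loadings gives back the centered data, which is the basis of reconstruction from a reduced number of components.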

    pca1

    pca1 - computes principal component analysis on raw data (case nrows>ncolumns)


    This function is not to be called directly.

    pca2

    pca2 - computes principal component analysis on raw data (case nrows>ncolumns)


    This function is not to be called directly.

    pca_cano

    pca_cano - generalized canonical analysis after PCAs on each table


    function res=pca_cano(collection,ndim,graph);

    pca_stat

    pca_stat - Gives some complementary statistics on PCA observations


    function res=pca_stat(pca_type, comp1, comp2);

    pcareconstruct

    pcareconstruct - reconstructs original data from a PCA model and a file of scores


    function[res]=pcareconstruct(pcatype,score,nscore)

    pcr

    pcr - PCR (components introduced in the order of eigenvalues)


    function [pcrtype]=pcr(X,y,maxdim)

    spcr

    spcr - stepwise Principal component regression


    function [spcrtype]=spcr(X,y,maxdim, (maxrank),(corr_cov))

    statis

    statis - Multiway method STATIS


    function res=statis(collection);

    xpca

    xpca - PCA on a matlab data matrix


    computes a basic principal component analysis (on non-normalised data)



    Regression methods

    apply_multiple_regression

    apply_multiple_regression - applies multiple_regression on "unknown" data


    function[res]=apply_multiple_regression(X,type,(y))

    apply_ridge_regression

    apply_ridge_regression - applies ridge regression on "unknown data"


    function[res]=apply_ridge_regression(ridgetype,X,(y))

    applylr1

    applylr1 - Apply basic latent root model on saisir data x


    function [predy]=applylr1(lrtype,X)

    applypcr

    applypcr - applies basic PCR on data


    function [predy]=applypcr(pcrtype,X)

    applypls

    applypls - applies a pls model on an unknown data set


    function res=applypls(X,plsmodel, (knowny))

    applyspcr

    applyspcr - Applies a stepwise PCR


    function [predy]=applyspcr(spcrtype,X)

    basic_pls

    basic_pls - basic pls with keeping loadings and scores


    function[res] = basic_pls(X,y,ndim)

    basic_pls2

    basic_pls2 - PLS2 on several variables, several dimensions


    function result=basic_pls2(X,y,maxdim)

    cross_ridge_regression

    cross_ridge_regression - ridge regression with validation


    function[res]=cross_ridge_regression(X,y,krange,selected)

    crossval_multiple_regression

    crossval_multiple_regression - validation of multiple regression.


    function[res]=crossval_multiple_regression(X,y,selected)

    dimcross_stepwise_regression

    dimcross_stepwise_regression - Tests models obtained from stepwise regression


    function[res]=dimcross_stepwise_regression(X,y,selected,Pthres,(confidence))

    dimcrosspcr1

    dimcrosspcr1 - validation of PCR (samples in validation are selected)


    function [pcrres]=dimcrosspcr1(X,y,ndim,selected)

    leave_one_out_pls1

    leave_one_out_pls1 - PLS1 with leave-one-out validation


    function res=leave_one_out_pls1(X,y,ndim);

    leave_one_out_pls2

    leave_one_out_pls2 - PLS2 with leave-one-out validation


    function res=leave_one_out_pls2(X,y,ndim);

    lr1

    lr1 - Latent root regression


    function [lr1type]=lr1(X,y,maxdim,(ratioxy))

    multiple_regression

    multiple_regression - Simple Multiple linear regression (all the variables)


    function res=multiple_regression(x,y);

    pcr

    pcr - PCR (components introduced in the order of eigenvalues)


    function [pcrtype]=pcr(X,y,maxdim)

    pcr1

    pcr1 - Basic model of PCR (components introduced in the order of eigenvalues)


    function [pcrtype]=pcr1(X,y,dim)

    pls2var

    pls2obs - PLS regression with many observations


    function [mtBpls,mtT, tBeta] = pls2var(mtX,mtY,nbdim)

    quickpls

    quickpls - Quick PLS regression from 1 to ndim dimensions


    function [plstype]=saisirpls(X,y,ndim)

    ridge_regression

    ridge_regression - Basic ridge regression


    function [ridgetype]=ridge_regression(X,y,krange)
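
    Ridge regression adds a constant k to the diagonal of X'X before solving the normal equations. The sketch below assumes that krange is a list of such constants and that X and y are already centered; both points are assumptions, since the arguments are not documented in this list.

```matlab
% Centered predictors (4 observations x 2 variables) and response.
X = [1 0; 0 1; -1 0; 0 -1];
y = [2; 1; -2; -1];

% Candidate ridge constants (assumed meaning of "krange").
krange = [0 2];

p = size(X, 2);
beta = zeros(p, numel(krange));

for j = 1:numel(krange)
    % Solve (X'X + k*I) * b = X'y for the current constant k.
    beta(:, j) = (X' * X + krange(j) * eye(p)) \ (X' * y);
end

disp(beta);   % one column of coefficients per value of k
```

    With k = 0 the solution is ordinary least squares; larger values of k shrink the coefficients towards zero.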

    ridge_regression1

    ridge_regression1 - Basic ridge regression at a given norm


    function [ridgetype]=ridge_regression1(X,y,normrange)

    saisirpls

    saisirpls - PLS regression with "dim" dimensions


    function [plstype]=saisirpls(X,Y,dim)

    simple_regression

    simple_regression - mono_linear regressions


    function [beta beta0]=simple_regression(X,y);

    spcr

    spcr - stepwise Principal component regression


    function [spcrtype]=spcr(X,y,maxdim, (maxrank),(corr_cov))

    stepwise_regression

    stepwise_regression - stepwise regression between x and y


    function[result]=stepwise_regression(x,y,Pthres,(confidence))



    Partial least squares (PLS)

    applypls

    applypls - applies a pls model on an unknown data set


    function res=applypls(X,plsmodel, (knowny))

    applyplsda

    applyplsda - Applies pls discriminant analysis after model assessment using plsda


    function[res]=applyplsda(X,plsdatype,(actual_group))

    basic_pls

    basic_pls - basic pls with keeping loadings and scores


    function[res] = basic_pls(X,y,ndim)

    basic_pls2

    basic_pls2 - PLS2 on several variables, several dimensions


    function result=basic_pls2(X,y,maxdim)

    crossplsda

    crossplsda - validation on PLS discriminant analysis


    function[res]=crossplsda(X,group,dim,selected)

    crossvalpls

    crossvalpls - validation of pls with up to ndim dimensions.


    function [res]=crossvalpls(X,y,ndim,selected)

    crossvalpls1a

    crossvalpls1a - crossvalidation of pls with up to ndim dimensions.


    function [res]=crossvalpls1a(X,y,ndim,selected)

    leave_one_out_pls1

    leave_one_out_pls1 - PLS1 with leave-one-out validation


    function res=leave_one_out_pls1(X,y,ndim);

    leave_one_out_pls2

    leave_one_out_pls2 - PLS2 with leave-one-out validation


    function res=leave_one_out_pls2(X,y,ndim);

    pls

    PCA1 -assesses principal component analysis on raw data (case nrows>ncolumns)


    [type]=afdlike(x,y,select,parmi)

    pls2obs

    pls2obs - PLS regression with many observations


    function [mtBpls,mtT, tBeta] = pls2obs(mtX,mtY,nbdim)

    plsda

    plsda - PLS discriminant analysis following the saisir format


    function[plsdatype]=plsda(X,group,ndim)

    quickpls

    quickpls - Quick PLS regression from 1 to ndim dimensions


    function [plstype]=saisirpls(X,y,ndim)

    saisirpls

    saisirpls - PLS regression with "dim" dimensions


    function [plstype]=saisirpls(X,Y,dim)



    Discrimination methods

    apply_nuee

    apply_nuee - applies Nuées dynamiques (dynamic clouds, K-means clustering)


    function[res]=nuee(X,barycenter)

    apply_quaddis

    apply_quaddis - Quadratic discriminant analysis


    function result = apply_quaddis(quaddis_type,x,(known_group));

    apply_stepwise_regression

    apply_stepwise_regression - applies stepwise_regression on "unknown" data


    function[res]=apply_stepwise_regression(stepwise_type,X,(y))

    applyfda1

    applyfda1 - application of factorial discriminant analysis on PCA scores


    function[res]=applyfda1(X,fdatype,(actual_group))

    applyplsda

    applyplsda - Applies pls discriminant analysis after model assessment using plsda


    function[res]=applyplsda(X,plsdatype,(actual_group))

    barycenter_map

    barycenter_map - graph of map of barycenter


    function barycenter_map(X,col1,col2,group,(charsize))

    basic_pls

    basic_pls - basic pls with keeping loadings and scores


    function[res] = basic_pls(X,y,ndim)

    build_indicator

    build_indicator - builds a disjunctive (indicator) table


    function [indicator, groupings]=build_indicator(x);

    contingency_table

    contingency_table - Computes a contingency table


    function table=contingency_table(g1,g2)

    create_group

    create_group - creates a vector of numbers indicating groups from identifiers


    group=create_group(X,code_list,startpos,endpos)

    create_group1

    create_group1 - uses the identifiers to create groups


    use the identifier for creating groups.

    crossfda1

    crossfda1 - validation on discrimination according to fda1 (directly on data)


    function[res]=crossfda1(X,group,among,maxvar,ntest)

    crossmaha1

    crossmaha - validation on discrimination according to maha1 (directly on data)


    function[discrtype]=crossmaha(X,group,maxvar,ntest)

    crossplsda

    crossplsda - validation on PLS discriminant analysis


    function[res]=crossplsda(X,group,dim,selected)

    crossval_quaddis

    crossval_quaddis - crossvalidation of quadratic dis. analysis


    function res=crossval_quaddis(X,group,selected)

    crossvalpls

    crossvalpls - validation of pls with up to ndim dimensions.


    function [res]=crossvalpls(X,y,ndim,selected)

    crossvalpls1a

    crossvalpls1a - crossvalidation of pls with up to ndim dimensions.


    function [res]=crossvalpls1a(X,y,ndim,selected)

    dendro

    dendro - dendrogram using Euclidean metric and Ward linkage


    function group=dendro(X,(topnodes))

    ellipse_map

    ellipse_map - plots the ellipse confidence interval of groups


    function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))

    fda1

    fda1 - stepwise factorial discriminant analysis on PCA scores


    function[fdatype]=fda1(pcatype, group, among, maxscore)

    fda2

    fda2 - stepwise factorial discriminant analysis on PCA scores with verification


    function[res]=fda2(X, group, among, maxscore, selected)

    maha

    maha - simple forward discriminant analysis (variables introduced stepwise), no validation samples


    function[res]=maha(X,group,maxvar)

    maha1

    maha1 - forward linear discriminant analysis DIRECTLY ON DATA with validation samples


    function[discrtype1]=maha1(calibration_data,calibration_group,maxvar,test_data,test_group)

    maha3

    maha3 - simple forward discriminant analysis (variables introduced stepwise), no validation samples


    function[discrtype]=maha3(X,group,maxvar)

    maha4

    maha4 - simple discriminant analysis forward


    function[discrtype]=maha4(X,group,maxvar)

    maha6

    maha6 - simple forward discriminant analysis (variables introduced stepwise), with validation samples


    function[discrtype]=maha6(X,group,maxvar,selected)

    nuee

    nuee - Nuee dynamique (KCmeans)


    function[res]=nuee(X,ngroup,(nchanged))

    plsda

    plsda - Pls discriminant analysis following the saisir format


    function[plsdatype]=plsda(X,group,ndim)

    quaddis

    quaddis - Quadratic discriminant analysis


    function quadis_type=quaddis(x,group);

    random_select

    random_select - building a vector of random elements 0 or 1


    function[selected]=random_select(nel, nselect, (nrepeat))



    Miscellaneous

    apply_nuee

    apply_nuee - apply Nuee dynamique (KCmeans)


    function[res]=apply_nuee(X,barycenter)

    build_documentation

    function build_documentation(directory_name,filename,(thematic_list))



    build_indicator

    build_indicator - builds a complete disjunctive (indicator) table


    function [indicator, groupings]=build_indicator(x);

    cumulate_covariance

    cumulate_covariance - Covariance on huge data set


    function [covariance_type]=cumulate_covariance((X),(covariance_type1));

    dendro

    dendro - dendrogram using Euclidean metric and Ward linkage


    function group=dendro(X,(topnodes))

    documentation_dico

    documentation_dico - builds a dictionary for the HTML documentation


    function documentation_dico(fid,function_name);

    documentation_thematic_list

    documentation_thematic_list - HTML thematic list of functions



    eigord

    eigord - diagonalization of a square matrix


    function [mtvec,mtval] = eigord(mtA)

    find_index

    find_index - find the index corresponding to the closest "value"


    function index=find_index(str,value);

    find_max

    find_max - gives the indices of the max value of a MATLAB Matrix


    function [row,col,value]=find_max(matrix)

    find_min

    find_min - gives the indices of the min value of a MATLAB Matrix


    function [row,col,value]=find_min(matrix)

    find_peaks

    find_peaks - finds and displays peaks greater than a threshold value


    function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)

    group_mean

    group_mean - gives the means of group of rows


    function X1=group_mean(X,startpos,endpos)

    html_header

    html_header - build the first part of the documentation


    function html_header(fid)

    html_notice

    html_notice - prints the helps of functions in HTML document


    function html_notice(fid,function_name);

    html_postface

    html_postface - Finish the work in HTML documentation


    function html_postface(fid);

    issaisir

    issaisir - tests if the input argument is a SAISIR matrix


    test=issaisir(X);

    list

    list - lists rows (only with a small number of columns)


    function list(X,(start))

    matrix2saisir

    matrix2saisir - transforms a Matlab matrix in a saisir structure


    X = matrix2saisir(data,(coderow),(codecol))

    mdistance

    mdistance - computes distances between the two tables using metric "metric"


    function dis = mdistance(X1,X2,metric)

    mir_style

    mir_style - changes the sign of the variables of MIR spectra


    function [names] = mir_style(names1)

    multiway_pca

    multiway_pca - Multi way principal component analysis


    function res=multiway_pca(collection);

    nuee

    nuee - Nuee dynamique (KCmeans)


    function[res]=nuee(X,ngroup,(nchanged))

    num2str1

    num2str1 - Justified num2string


    function str=num2str1(vector,ndigit);

    pca_ridge_regression

    pca_ridge_regression - Basic ridge regression after PCA


    function[ridgetype]=pca_ridge_regression(pcatype,y,range)

    random_saisir

    random_saisir - Creation of a random matrix


    function[X]=random_saisir(nrow,ncol)

    random_select

    random_select - building a vector of random elements 0 or 1


    function[selected]=random_select(nel, nselect, (nrepeat))

    random_splitrow

    random_splitrow - random selection of rows


    function[X1,X2]=random_splitrow(X, nselect)

    randomize

    randomize - Builds a file of randomly attributed vector in X1


    function X1=randomize(X)

    reorder

    reorder - reorders the data of files A1 and A2 according to their identifiers


    function [B1 B2]=reorder(A1,A2)

    repeat_string

    repeat_string - builds a matrix of char by repeating a string


    function str1=repeat_string(str,ntimes);

    row_center

    row_center - subtracts the average row from each row


    function [X] = row_center(X1)

    saisir_check

    saisir_check - Checks if the data respect the saisir structure


    function check=saisir_check(X)

    saisir_linkage

    saisir_linkage - assesses a simple linkage vector from a matrix of distance


    function z=saisir_linkage(dis)

    saisir_mult

    saisir_mult - matrix multiplication following the SAISIR format


    function X12=saisir_mult(X1,X2);

    saisir_sort

    saisir_sort - sorts the rows of s according to the values in a column


    function [X1 X2]=saisir_sort(X,ncol,minmax)

    saisir_sum

    saisir_sum - calculates the sum of the rows


    function xsum=saisir_sum(X);

    saisir_transpose

    saisir_transpose - transposes a data matrix following the saisir format


    function [X] = saisir_transpose(X1)

    seekstring

    seekstring - returns the indices of the rows of a character matrix in which the string 'str' is present


    function index = seekstring(identifiers,xstr)

    sgolaycoef

    sgolaycoef - Computes the Savitzky-Golay coefficients


    function [B,G] = sgolaycoef(k,F)

    split_average

    split_average - averages observations according to the identifiers


    function res=split_average(X,startpos,endpos)

    standardize

    standardize - divides each column by the corresponding standard deviation


    function [X, xstd] = standardize(X1,(option))

    string2saisir

    string2saisir - creation of a saisir file from a string table (first column=name)


    function [saisir] = string2saisir(data)

    string2text

    string2text - save a vector of string in a .txt format


    function string2text(str,filename)

    thematic_classification

    thematic_classification - builds a thematic classification of the .m files


    function res=thematic_classification(function_name,(previous));

    w

    w - w: (for "what") lists the fields which are present in a structure


    function res= w(xstruct);

    xdisp

    xdisp - smart display of heterogeneous variables


    function xdisp(varargin)



    Alphabetic list

    addcode alphabetic_sort anavar1 anovan1 appendbag1
    appendcol appendcol1 appendrow appendrow1 applyfda1
    applylr1 applypca applypcr applypls applyplsda
    applyspcr apply_multiple_regression apply_nuee apply_quaddis apply_ridge_regression
    apply_stepwise_regression bag2group bag_appendrow1 barycenter_map basic_pls
    basic_pls2 browse build_documentation build_indicator ca
    ca_map center change_sign check_name colored_curves
    colored_map1 colored_map2 colored_map4 comdim contingency_khi2
    contingency_table cormap correct_baseline correlation_circle correlation_plot
    covariance_pca covmap create_group create_group1 crossfda1
    crossmaha1 crossplsda crossvalpls crossvalpls1a crossval_multiple_regression
    crossval_quaddis cross_ridge_regression cumulate_covariance curve curves
    d2_factorial_map deletecol deleterow dendro dimcrosspcr1
    dimcross_stepwise_regression distance documentation_dico documentation_thematic_list eigord
    eliminate_nan ellipse_map excel2bag excel2saisir fda1
    fda2 find_index find_max find_min find_peaks
    group_centering group_mean html_header html_notice html_postface
    issaisir labelled_hist leave_one_out_pls1 leave_one_out_pls2 list
    lr1 maha maha1 maha3 maha4
    maha6 map map3D matrix2saisir mdistance
    mfa mir_style moving_average moving_max moving_min
    multiple_regression multiway_pca nancor normc normed_pca
    norm_col nuee num2str1 pca pca1
    pca2 pcareconstruct pca_cano pca_cross_ridge_regression pca_ridge_regression
    pca_stat pcr pcr1 plotmatrix1 pls
    pls2obs pls2var plsda quaddis quickpls
    randomize random_saisir random_select random_splitrow readexcel1
    readident regression_score reorder repeat_string ridge_regression
    ridge_regression1 row_center saisir2ascii saisir2excel saisirpls
    saisir_check saisir_derivative saisir_linkage saisir_mean saisir_mult
    saisir_sort saisir_std saisir_sum saisir_transpose seekstring
    selectcol selectrow select_from_identifier select_from_variable sensory_profile
    sgolaycoef show_vector simple_regression snv spcr
    splitrow split_average standardize statis stepwise_regression
    string2saisir string2text submap subtract_variable surface1
    surface_std symbol_map tcurve tcurves thematic_classification
    trajectory_curve w xcomdim xdisp xpca

    addcode

    addcode - adds a string before or after a matrix of characters

    function str1 = addcode(str,code,(deb_end))



    Input argument:

    ==============

    str: a matrix of characters (n x p)

    code: a string (1 x k)

    deb_end: a number (0= addition before; 1 = addition after; default : 0)

    Output argument:

    ===============

    str1 : a matrix of characters (n x (k+p))


    This function is mainly used to recode the identifiers of observations or

    variables (".i", or ".v")

    example:

    data.i

    ans =

    casein

    albumin

    zein

    >> data.i=addcode(data.i,'1')

    data =

    i: [3x8 char]

    >> data.i

    ans =

    1casein

    1albumin

    1zein
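
    The example above can be approximated with base MATLAB only. This is a minimal sketch of the concatenation, not the SAISIR implementation; in particular, how "addcode" treats the trailing blanks of padded identifiers is not stated here, and the variable names are hypothetical.

```matlab
% Identifiers stored as a character matrix (rows are padded with blanks)
ids  = char('casein', 'albumin', 'zein');   % 3 x 7 char
code = '1';
n    = size(ids, 1);

% Equivalent of deb_end = 0: the code is placed before each identifier
before = [repmat(code, n, 1), ids];         % 3 x 8 char

% Equivalent of deb_end = 1: the code is placed after each padded identifier
after  = [ids, repmat(code, n, 1)];         % 3 x 8 char

disp(before);
```

    The result has k more columns than the input, which is consistent with the "3x8 char" size displayed in the example.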


    Return to thematic list
    Return to alphabetic list
    HOME

    alphabetic_sort

    alphabetic_sort - sorts the rows of x according to the alphabetic order of rows identifiers

    function X1=alphabetic_sort(X,start_pos,end_pos)



    Input arguments:

    ===============

    X : SAISIR matrix

    start_pos, end_pos : first and last character positions in the identifiers.


    Output argument:

    ================

    X1 : SAISIR matrix with the rows sorted in alphabetic order.

    This function sorts the observations (rows) according to

    their identifiers (".i" field of X);


    example

    =======

    >> a.i

    ans =

    xenon

    krypton

    Aluminium

    >>a.d

    ans =

    1.00 2.00 3.00 4.00

    5.00 6.00 7.00 8.00

    9.00 10.00 11.00 12.00

    b=alphabetic_sort(a,1,5);

    >> b.i

    ans =

    Aluminium

    krypton

    xenon

    >> b.d

    ans =

    9.00 10.00 11.00 12.00

    5.00 6.00 7.00 8.00

    1.00 2.00 3.00 4.00
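
    The sorted result shown above can be reproduced with base MATLAB. This is only a sketch of the idea (sorting on a range of identifier characters with "sortrows"), assuming a structure with fields ".d", ".i" and ".v"; it is not the SAISIR code.

```matlab
% A small SAISIR-like structure (fields d, i, v)
a.d = [1 2 3 4; 5 6 7 8; 9 10 11 12];
a.i = char('xenon', 'krypton', 'Aluminium');
a.v = char('v1', 'v2', 'v3', 'v4');

start_pos = 1;
end_pos   = 5;

% Sort on the selected characters of the identifiers (character-code order)
[sorted_keys, order] = sortrows(a.i(:, start_pos:end_pos));

% Reorder data and identifiers consistently; variable names are unchanged
b   = a;
b.d = a.d(order, :);
b.i = a.i(order, :);

disp(b.i);
disp(b.d);
```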



    anavar1

    anavar1 - One way analysis of variance on all the columns

    function res = anavar1(X,g)


    Performs a one-way analysis of variance for each column of X.


    Input arguments:

    ===============

    X: SAISIR matrix (n x p)

    g: SAISIR vector of group identifier (integers ,n x 1)


    Output arguments:

    ================

    res with fields:

    res.F : SAISIR vector of Fisher values for each variable in X (1 x p)

    res.F.df : degrees of freedom of the model

    res.p : SAISIR vector of associated probabilities (1 x p).


    Note: res.F and res.p can be examined as curves (using "curve" function)

    or by the command "show_vector" (for discrete variables)


    The function performs p independent one-way ANOVAs. The groups are defined

    in g: observations with the same number belong to the same group.


    See also: "anovan1", "show_vector","create_group1"
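
    The Fisher value computed for each column can be illustrated with base MATLAB. The sketch below uses the textbook one-way ANOVA formula (between-group mean square divided by within-group mean square) on plain matrices; the data are made up, and the associated probabilities are omitted because they need an F distribution function.

```matlab
X = [1 2; 2 1; 3 5; 4 6; 8 3; 9 2];    % n x p data (made-up values)
g = [1; 1; 2; 2; 3; 3];                % group number of each observation

groups = unique(g);
k      = numel(groups);
[n, p] = size(X);
grand  = mean(X, 1);                   % grand mean of each column

ssb = zeros(1, p);                     % between-group sums of squares
ssw = zeros(1, p);                     % within-group sums of squares
for j = 1:k
    Xj  = X(g == groups(j), :);
    nj  = size(Xj, 1);
    mj  = mean(Xj, 1);
    ssb = ssb + nj * (mj - grand).^2;
    ssw = ssw + sum((Xj - repmat(mj, nj, 1)).^2, 1);
end

% One F value per column, with (k-1) and (n-k) degrees of freedom
F = (ssb / (k - 1)) ./ (ssw / (n - k));
disp(F);
```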



    anovan1

    anovan1 - N-way analysis of variance (ANOVA) on data matrices.

    function res = anovan1(X,model,gr1, gr2, ...)


    Performs as many independent N-way analyses of variance as the number of columns in X


    Input arguments:

    ===============

    X: SAISIR matrix of response values (n x p)

    model (integer): gives the level of desired interactions

    (1= no interactions studied; 2: first degree of interactions ... ) (see

    Matlab function ANOVAN)

    gr1; gr2 ...: SAISIR vector of qualitative groups forming a factor of the ANOVA

    (n x 1). Identical numbers mean that the corresponding observations are in

    the same group


    Output arguments:

    ================

    res with fields

    res.F: the F values associated with each effect and (possibly) interaction

    res.P: probability

    res.df (characters): degrees of freedom

    res.singular: singularity of the model. If the singularity == 1, the model

    is redundant and a lower level of interaction must be tested.

    see also: anovan, anova, anavar1, show_vector, create_group1


    example:

    my_anova=anovan1(spectra,2,grouping1, grouping2);

    show_vector(my_anova.P,2); %% examination of the probabilities of the

    %% second factor (with identifiers)

    curve(my_anova.F,2); %% Fisher F examined as a curve.



    appendbag1

    appendbag1 - Merge an arbitrary number of "bag" files according to rows

    usage: [X]= appendbag1(X1, X2, X3,.....)



    Input argument:

    ===============

    X1, X2, X3 ... : "bag" structure (see function "excel2bag" for details)

    with the same number of columns


    Output argument:

    ================

    X: concatenated bag structure


    bag is a structure (such as X.d,X.i,X.v),

    with bag.d being here a three way table of characters

    if bag.d is dimensioned (n x p x v): n is the number of rows, v the

    number of columns, and p the number of characters for each string.

    The structure bag is obtained as output argument of the function

    "excel2bag".


    Example:

    [value1,bag1]=excel2bag('data1',['A'; 'B'],20);

    [value2,bag2]=excel2bag('data2',['A'; 'B'],20);

    bag3=appendbag1(bag1,bag2);



    appendcol

    appendcol - merges two files according to columns

    function: [X3]= appendcol(X1,X2)



    Input arguments:

    ===============

    X1, X2 : SAISIR matrices dimensioned n x p and n x q


    Output argument:

    ===============

    X3 : SAISIR matrix dimensioned n x (p+q)


    The identifiers of rows are copied into X3.i

    The identifiers of columns are the concatenation of X1.v and X2.v


    Example:

    total=appendcol(chemistry1, chemistry2);


    see also: appendrow, appendrow1, appendcol1
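
    The merge described above can be sketched with base MATLAB on structures with fields ".d", ".i" and ".v". This is an illustration of the expected result, not the SAISIR implementation; the check on the number of rows is an assumption.

```matlab
% Two SAISIR-like structures with the same rows
X1.d = [1 2; 3 4];
X1.i = char('obs1', 'obs2');
X1.v = char('a', 'b');

X2.d = [5 6 7; 8 9 10];
X2.i = X1.i;
X2.v = char('c', 'd', 'e');

% Both tables must have the same number of rows
if size(X1.d, 1) ~= size(X2.d, 1)
    error('X1 and X2 must have the same number of rows');
end

X3.d = [X1.d, X2.d];        % n x (p+q) data
X3.i = X1.i;                % row identifiers copied from X1
X3.v = char(X1.v, X2.v);    % column identifiers: X1.v followed by X2.v

disp(size(X3.d));
```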



    appendcol1

    appendcol1 - merges an arbitrary number of files according to columns

    usage: [X]= appendcol1(X1,X2,X3,...)



    Input arguments:

    ===============

    X1, X2, X3 ... : SAISIR Matrices with the same numbers of rows


    Output argument:

    ===============

    X : SAISIR Matrix


    The identifiers of rows are copied into X.i

    The identifiers of columns are the concatenation of X1.v, X2.v, X3.v ...


    Example:

    total=appendcol1(chemistry1, chemistry2, chemistry3);


    see also: appendrow, appendrow1, appendcol



    appendrow

    appendrow - merges two SAISIR matrices according to rows

    usage: [X3]= appendrow(X1,X2)



    Input arguments:

    ================

    X1, X2 : SAISIR matrices dimensioned n1 x p and n2 x p


    Output argument:

    ==============

    X3 : SAISIR matrix dimensioned (n1+n2) x p


    The identifiers of columns are copied into X3.v

    The identifiers of rows are the concatenation of X1.i and X2.i


    Example:

    total=appendrow(spectra1, spectra2);


    see also: appendcol, appendrow1, appendcol1



    appendrow1

    appendrow1 - Merges an arbitrary number of files according to rows

    usage: [X]= appendrow1(X1,X2,X3,...)



    Input arguments:

    ===============

    X1, X2, X3 : SAISIR Matrices with the same number of columns


    Output argument:

    ===============

    X : SAISIR Matrix with p columns


    The identifiers of columns are copied into X.v

    The identifiers of rows are the concatenation of X1.i, X2.i, X3.i ...


    Example:

    total=appendrow1(spectra1, spectra2, spectra3);


    see also: appendcol, appendrow, appendcol1



    applyfda1

    applyfda1 - application of factorial discriminant analysis on PCA scores

    function[res]=applyfda1(X,fdatype,(actual_group))



    Input arguments:

    ===============

    X: Saisir matrix of predictive variables (n x p)

    fdatype : structure, output of function "fda1"

    actual_group (optional)(n x 1): SAISIR vector indicating the membership of

    the observations. A same number indicates that these observations belong to

    the same group.


    Output argument:

    ===============

    res with fields

    datafactor : discriminant scores (n rows)

    predicted_group: number indicating the prediction in each group

    If actual_group is given, the field "confusion" gives the confusion matrix

    (rows: actual groups; columns: groups predicted by "fda1")


    Typical example:

    ================

    (calibration data in "calibration_data", group in "qualitative_group",

    Unknown data in "unknown_data");

    %Building the model

    p=pca(calibration_data);

    fdatype=fda1(p, qualitative_group, 10, 5)

    %Applying the model

    res=applyfda1(unknown_data,fdatype);

    See also : fda1, maha3, maha6, plsda, pca



    applylr1

    applylr1 - Apply basic latent root model on saisir data x

    function [predy]=applylr1(lrtype,X)



    Input arguments:

    ================

    lrtype: structure, output of function "lr1" (predictive model)

    X :SAISIR matrix of predictive variables


    Output argument

    ===============

    predy : SAISIR matrix of predicted y (for all the models requested when using

    "lr1")


    Creates as many predicted y as allowed by the dimensions in lrtype



    applypca

    applypca - computes the scores of supplementary observations

    function [supscores]=applypca(pcatype, X)


    Computes the scores of supplementary observations from an existing PCA model


    Input arguments:

    ===============

    pcatype: (structure) output argument of functions "pca","normed_pca", or

    "covariance_pca"

    X : SAISIR matrix of supplementary observations


    Output argument:

    ===============

    supscores : SAISIR matrix of scores of X


    Typical example

    ===============

    p=pca(spectra);%% PCA

    supscores=applypca(p,supplementary_spectra);

    map(supscores,1,2);


    The number of columns of X must be compatible with pcatype.

    If "normed_pca" was applied, X is divided by the standard deviations of

    the principal observations prior to projection
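
    The projection of supplementary observations can be sketched with base MATLAB. The code below assumes a PCA on column-centred data computed by singular value decomposition; how SAISIR actually stores the mean and the eigenvectors in "pcatype" is not described here, so all variable names are hypothetical.

```matlab
% Calibration (active) observations and supplementary observations
Xcal = [2 0; 0 2; 4 2; 2 4; 2 2];
Xsup = [3 3; 1 1];

% PCA of the centred calibration data
m         = mean(Xcal, 1);
Xc        = Xcal - repmat(m, size(Xcal, 1), 1);
[U, S, V] = svd(Xc, 0);              % columns of V: eigenvectors (loadings)
scores    = Xc * V;                  % scores of the calibration observations

% Supplementary observations: same centring, same eigenvectors
supscores = (Xsup - repmat(m, size(Xsup, 1), 1)) * V;

disp(supscores);
```

    For a normed PCA, each column would additionally be divided by the standard deviation of the calibration data before the projection, as stated above.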



    applypcr

    applypcr - applies basic PCR on data

    function [predy]=applypcr(pcrtype,X)


    Applies a basic PCR model (in pcrtype) to the SAISIR data X

    Creates as many predicted y as allowed by the dimensions in pcrtype


    Input arguments

    pcrtype: structure, output of function "pcr"

    X: SAISIR matrix of predictive variables


    Output argument

    predy : SAISIR matrix of predicted y, for all the dimensions tested


    see also: "pcr", "pcr1", "basic_pls"



    applypls

    applypls - applies a pls model on an unknown data set

    function res=applypls(X,plsmodel, (knowny))



    Input arguments:

    ---------------

    X : SAISIR matrix (n x p)

    plsmodel :output argument of functions "saisirpls", "basic_pls" or "basic_pls2"

    knowny (optional): actual value of y (if it is known, this allows the

    computation of r2 and RMSEV)


    Output arguments:

    ---------------

    res with fields

    PREDY: predicted values for all the models given by "plsmodel" (n x

    ndim)

    RMSEV: root mean square error of validation for all the models (only if

    "knowny" is given) (1 x ndim)

    r2: determination coefficient between predicted and observed y values

    (only if "knowny" is given) (1 x ndim)

    T: PLS scores of the set (n x ndim)
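
    The two validation statistics returned when "knowny" is given can be illustrated with base MATLAB for a single model. This sketch assumes that RMSEV is the root of the mean squared difference and that r2 is the squared correlation between observed and predicted values; the exact formulas used by SAISIR are not given here, and the numbers are made up.

```matlab
knowny = [1; 2; 3; 4];             % observed values
predy  = [1.1; 1.9; 3.2; 3.8];     % values predicted by one model

% Root mean square error of validation
rmsev = sqrt(mean((knowny - predy).^2));

% Determination coefficient taken as the squared correlation
c  = corrcoef(knowny, predy);
r2 = c(1, 2)^2;

disp([rmsev, r2]);
```

    With several models, the same computation is repeated on each column of PREDY, which gives one value per dimension.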



    applyplsda

    applyplsda - Applies pls discriminant analysis after model assessment using plsda

    function[res]=applyplsda(X,plsdatype,(actual_group))


    Returns the predicted group of the (unknown) data X


    Input arguments:

    ===============

    X : SAISIR matrix of predictive variables

    plsdatype: structure returned by function 'plsda'

    actual_group (optional): SAISIR vector of observed groups.

    Observations with the same group number belong to the same group.


    Output argument:

    ==============

    res with fields:

    confusion1: matrix of confusion, method 1 (if "actual_group"

    defined)

    ncorrect1: Number of correct classifications, method 1 (if "actual_group"

    defined)

    predgroup1: predicted group (method1)

    confusion: matrix of confusion, method 0 (if "actual_group"

    defined)

    ncorrect: Number of correct classifications, method 0 (if "actual_group"

    defined)

    predgroup: predicted group (method 0)


    Method 1: (attribution to index of max of predicted Y)

    Method 0: (shortest Mahalanobis distance calculated on PLS scores);
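
    Method 1 can be sketched with base MATLAB, assuming that the predicted Y has one column per group (indicator coding) and that the confusion matrix has actual groups in rows and predicted groups in columns. The predicted values below are made up; this is not the SAISIR code.

```matlab
% Predicted Y: one row per observation, one column per group
predY  = [0.9 0.2 -0.1;
          0.1 0.3  0.6;
          0.4 0.5  0.1];
actual = [1; 3; 1];                 % actual group of each observation

% Method 1: each observation goes to the group with the largest predicted Y
[maxval, predgroup] = max(predY, [], 2);

% Confusion matrix (rows: actual groups, columns: predicted groups)
ngroup    = size(predY, 2);
confusion = accumarray([actual, predgroup], 1, [ngroup ngroup]);
ncorrect  = sum(diag(confusion));

disp(confusion);
```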



    applyspcr

    applyspcr - Applies a stepwise PCR

    function [predy]=applyspcr(spcrtype,X)



    Input arguments:

    ===============

    spcrtype:(structure), output argument of spcr

    X: SAISIR matrix of predictive variables


    Output argument:

    ===============

    predy: y predicted for all the tested dimensions



    apply_multiple_regression

    apply_multiple_regression - applies multiple_regression on "unknown" data

    function[res]=apply_multiple_regression(X,type,(y))



    Input argument:

    ==============

    X: SAISIR matrix of predictive variables

    type: output argument of function "multiple_regression"

    y (optional): SAISIR vector of known observed y


    Output argument:

    ===============

    res with fields:

    ypred: predicted y

    if input argument "y" given :

    r2: r2 value

    RMSEV: root mean square error of validation



    apply_nuee

    apply_nuee - apply Nuee dynamique (KCmeans)

    function[res]=apply_nuee(X,barycenter)


    X :a matrix of dimension n x p

    barycenter : a matrix defining the barycenter k x p with k groups

    Barycenter is possibly the field "center" of the output of function "nuee"
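
    Applying existing barycenters to new observations amounts to a nearest-centre assignment. The sketch below uses base MATLAB on plain matrices and assumes the usual (squared) Euclidean distance; the metric and the output fields of "apply_nuee" are not specified here.

```matlab
X          = [0 0; 1 0; 10 10; 9 11];   % n x p observations
barycenter = [0.5 0; 9.5 10.5];         % k x p group centres

n  = size(X, 1);
k  = size(barycenter, 1);
d2 = zeros(n, k);                       % squared distances to each centre
for j = 1:k
    delta    = X - repmat(barycenter(j, :), n, 1);
    d2(:, j) = sum(delta.^2, 2);
end

% Each observation is assigned to its closest barycenter
[mind2, group] = min(d2, [], 2);

disp(group');
```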



    apply_quaddis

    apply_quaddis - Quadratic discriminant analysis

    function result = apply_quaddis(quaddis_type,x,(known_group));


    Application of quadratic discriminant analysis (test)

    ====================================================

    Input arguments:

    ================

    quaddis_type : output of function "quaddis" (structure)

    x : predictive data matrix (n x p)

    known_group : true groups of observations in x (n x 1) (optional)


    Output arguments:

    ================

    result with fields:

    predgroup : predicted groups (n x 1)

    density : pseudo-density (n x gmax)

    proba : probability of belonging to a given group (n x gmax)

    if "known_group" defined

    nscorrect100 : percentage of correct classification (number)

    sconfus : confusion matrix (gmax x gmax)



    apply_ridge_regression

    apply_ridge_regression - applies ridge regression on "unknown data"

    function[res]=apply_ridge_regression(ridgetype,X,(y))



    Input arguments:

    ===============

    ridgetype (structure): output argument of ridge_regression

    X: SAISIR matrix of predictive variables

    y (optional): SAISIR vector of observed y


    Output arguments:

    ===============

    res with fields

    predy: predicted values for all the ridge predictive models

    (see function "ridge_regression")

    If input argument "y" is given:

    r2: values for all the ridge predictive models

    rmsecv: root mean square error of validation for all the predictive models



    apply_stepwise_regression

    apply_stepwise_regression - applies stepwise_regression on "unknown" data

    function[res]=apply_stepwise_regression(stepwise_type,X,(y))


    Input arguments:

    ================

    stepwise_type: array of cells obtained as output of stepwise_regression

    X : SAISIR matrix of predictive variables (n x p)

    y (optional): SAISIR vector of observed variable (n x 1)


    Output argument

    ===============

    res with fields

    predy : predicted y for all the tested models

    rmsev : Root mean square error of validation (if input argument "y"

    defined)

    r2 : determination coefficients between observed and predicted y


    Applies as many models as are available in "stepwise_type"



    bag2group

    bag2group - uses the identifiers in bag to create groups

    function [group_type]=bag2group(bag)



    Input argument

    =============

    bag: "bag" structure , output of function "excel2bag"


    Output argument

    =============

    group_type : array of cells of structure SAISIR

    such as group_type{i} contains the SAISIR structure of groups as defined

    by the corresponding column i in "bag"

    Creates as many groups as there are different strings in the column of bag.d


    Useful for discriminant analysis or correspondence analysis



    bag_appendrow1

    bag_appendrow1 - Merges an arbitrary number of bags according to rows

    usage: [bag]= bag_appendrow1(bag1, bag2 ....)



    Input arguments

    ==============

    bag1, bag2, bag3 ... : structure "bag" as defined in function "excel2bag"


    Output argument

    =============

    bag: concatenation of bag1, bag2, bag3 ... along the rows (the number of

    rows increases)

    The second and third dimensions of bag.d must be equal in all the bags



    barycenter_map

    barycenter_map - graph of map of barycenter

    function barycenter_map(X,col1,col2,group,(charsize))



    Input arguments:

    ==============

    X: SAISIR matrix (n x p)

    col1, col2 : indices of the variables plotted in X and Y (integers)

    group: SAISIR vector of groups (n x 1). Indicates the group membership

    of each observation

    charsize (optional): size of the font (default: 7).


    Display two columns as a map

    Each observation is linked to its own barycentre by a straight line



    basic_pls

    basic_pls - basic pls with keeping loadings and scores

    function[res] = basic_pls(X,y,ndim)


    Input arguments:

    ---------------

    X: SAISIR matrix of predictive variables (n x p)

    y: SAISIR vector of variables to be predicted

    ndim: maximum number of dimension asked


    Output arguments:

    ----------------

    res with fields:

    T: PLS scores (n x ndim)

    P: PLS loadings such as X = TP + residuals (ndim x p)

    beta: final regression coefficients with ndim dimensions (p x 1)

    beta0: final intercept value (number)

    meanx: mean of X (1 x p)

    meany: mean of y (number)

    predy: predicted y with ndim dimensions (n x 1)

    error: Root mean square error of the model with ndim dimensions (number)

    corcoef: correlation coefficient r between observed and predicted values

    in the model with ndim dimensions

    BETA: regression coefficients for the ndim models (p x ndim)

    BETA0: intercepts of the ndim models (ndim x 1)

    loadings: pls loadings such as T=X*loadings (with X centred) (p x ndim)

    Q: Q value such as y = TQ + residual (ndim x 1)

    PREDY: predicted y values up to ndim dimensions (n x ndim)

    RMSEC: Root mean square error of prediction up to ndim dimensions (1 x

    ndim)

    r2: determination coefficient between observed and predicted values (1 x

    ndim)
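
    The link between the fields BETA, BETA0 and PREDY can be illustrated with base MATLAB. The sketch assumes that column j of PREDY is obtained by applying the model with j dimensions to the raw X, that is X*BETA(:,j) + BETA0(j); this relation is an assumption consistent with the dimensions listed above, and the coefficients are made up.

```matlab
X     = [1 2; 3 4; 5 6];            % n x p predictive variables
BETA  = [0.5 0.4; 0.0 0.1];         % p x ndim regression coefficients
BETA0 = [1; 0.5];                   % ndim x 1 intercepts

n = size(X, 1);

% One column of predictions per model (1 to ndim dimensions)
PREDY = X * BETA + repmat(BETA0', n, 1);   % n x ndim

disp(PREDY);
```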



    basic_pls2

    basic_pls2 - PLS2 on several variables, several dimensions

    function result=basic_pls2(X,y,maxdim)



    Input arguments:

    ===============

    X: SAISIR matrix of predictive variables ( n x p)

    y: SAISIR matrix of observed y (n x m)

    maxdim (integer): maximum number of PLS dimensions


    Output arguments:

    ===============

    result with fields

    T: PLS scores (n x maxdim)

    loadings: PLS loadings (p x maxdim)

    meanx: average of X (1 x p)

    res: array of cells (maxdim elements)

    A structure in res{i} with i = 1 ... maxdim contains the results of the

    prediction of the variable i.


    result.res{i} has the fields:

    nom: (string) , name of the considered variable i

    BETA (p x maxdim): regression coefficients up to maxdim dimensions

    BETA0 (1 x maxdim): intercepts of the models

    PREDY (n x maxdim): predicted y for 1 to maxdim dimensions

    RMSEC (1 x maxdim): root mean square errors of calibration

    r2 (1 x maxdim): r2 for 1 to maxdim dimensions



    browse

    browse - browses a series of curves

    function browse(X,xstart)


    Display the rows of the SAISIR matrix X as curves

    Right button to go down, Left button to go up, Ctrl C to exit

    If X.v can be interpreted as a vector of numbers (such as wavelengths),

    the X scale is given by this vector.

    Otherwise, the X-axis is simply given by the rank of the variables



    build_documentation

    function build_documentation(directory_name,filename,(thematic_list))


    Input arguments:

    ================

    directory_name: name of the directory (as understood by the MATLAB command "what")

    filename: name of the resulting HTML file

    The extension ".htm" is added to this name

    thematic_list: output of function "thematic_classification"


    The function builds the HTML documentation of SAISIR by concatenating the

    "Helps". If "thematic_list" is defined, it also gives a thematic list of functions.

    The resulting HTML file is in "filename".


    Typical example

    ===============

    aux=what('saisir');%% functions in the directory "saisir"

    function_name=char(aux.m);%% list of functions in SAISIR

    build_documentation('saisir','SAISIR documentation');%% builds the HTML file


    To also have a thematic index in the HTML document, use the function "thematic_classification".

    For example:

    mylist=thematic_classification(function_name);

    build_documentation('saisir','SAISIR_documentation',mylist);%% builds the HTML file

    The resulting SAISIR_documentation.htm file can then be examined with a web

    browser such as Internet Explorer or Firefox. Internet Explorer is better here.


    See also: "thematic_classification"



    build_indicator

    build_indicator - builds a complete disjunctive (indicator) table

    function [indicator, groupings]=build_indicator(x);


    Each column of x must contain integer values.

    The function builds the complete table of indicators.

    Useful for computing multiple correspondence analysis.
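
    A minimal sketch of what a complete table of indicators looks like, in plain MATLAB. It is an illustration under assumptions, not the SAISIR implementation: the levels of each column are taken in sorted order, and the second output "groupings" is not reproduced. The data are made up.

```matlab
% 4 observations described by 2 qualitative variables coded as integers
x = [1 2;
     2 1;
     1 3;
     2 3];
nobs = size(x, 1);

indicator = [];
for j = 1:size(x, 2)
    levels = unique(x(:, j))';              % levels of variable j (row vector)
    % one 0/1 column per level of variable j
    block = double(repmat(x(:, j), 1, numel(levels)) == repmat(levels, nobs, 1));
    indicator = [indicator block];          % append the block of variable j
end
```

    Each row of "indicator" contains exactly one 1 per original variable, so every row sums to the number of variables.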



    ca

    ca - CORRESPONDENCE ANALYSIS

    function ca_type=ca(N);


    Computes correspondence analysis from the contingency table in N.

    If only groupings are available, the contingency table must be computed

    before using this function (see for example function "contingency_table").


    ==============================================================

    Fields of the output

    score : CA scores of the rows followed by CA scores of the columns

    eigenval : eigenvalues, percentage of inertia, cumulated percentage

    contribution : contributions to the components, rows then columns

    squared_cos : squared cosines, rows then columns

    khi2 : khi2 (chi-square) statistic of the contingency table

    df : degrees of freedom

    probability : probability associated with khi2 under the hypothesis of a random contingency table

    ==============================================================


    The identifiers of the rows of 'score' (which are the identifiers of the rows and columns of N, N.i and N.v)

    are preceded by the letter 'r' or 'c'.

    It is therefore possible to use color to emphasize rows and columns in

    the simultaneous biplot of rows and columns.


    Source: G. Saporta. Probabilités, analyse des données et statistiques.

    Editions Technip, page 198 and following.

    REMARK : use function "ca_map" to plot the biplot observation/variable
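
    For readers who want to see the computation behind the fields listed above, here is a hedged sketch of correspondence analysis through a singular value decomposition, in plain MATLAB. It follows the textbook formulation and is not taken from the SAISIR source; the variable names and the contingency table are made up.

```matlab
N = [10 5;
      5 10];                        % made-up contingency table
n = sum(N(:));
P = N / n;                          % relative frequencies
r = sum(P, 2);                      % row masses (column vector)
c = sum(P, 1);                      % column masses (row vector)

% Standardized residuals with respect to the independence model r*c
S = (P - r * c) ./ sqrt(r * c);
[U, D, V] = svd(S, 'econ');

eigenval = diag(D).^2;              % principal inertias
rowscore = (U ./ repmat(sqrt(r),  1, size(U, 2))) * D;   % principal coordinates of rows
colscore = (V ./ repmat(sqrt(c'), 1, size(V, 2))) * D;   % principal coordinates of columns

khi2 = n * sum(eigenval);           % total inertia times n gives the khi2 statistic
```

    The last line shows the link between the eigenvalues and the khi2 field: the total inertia is khi2 divided by the number of observations.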



    ca_map

    ca_map - colored map for correspondence analysis: using a portion of the identifiers as labels

    ca_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))



    Input arguments:

    ================

    X: SAISIR data matrix

    col1, col2: ranks of the columns to be displayed (normally scores obtained

    from function "ca")

    startpos, endpos: position of the string in the identifiers indicating the

    color of the display


    Biplot of two columns as a colored map, useful for correspondence analysis (from function "ca").

    The coloration of the displayed descriptors depends on the arguments

    "startpos" and "endpos". If either of these arguments is zero, a single (black) color is used.

    Otherwise, the string name(startpos:endpos) is extracted from the name of each individual.

    Two observations for which these strings are different

    are also colored differently.


    THIS FUNCTION IS SPECIFIC TO CORRESPONDENCE ANALYSIS:

    If the first letter of the identifier is either c (column) or r (row),

    this letter is removed from the name.

    The letter c produces an italic display. This allows a representation in

    which the variables are in italics.



    center

    center - subtracts the average row from each row

    function [X1 xmean] = center(X)



    Input argument:

    ---------------

    X : SAISIR matrix (n x p)

    Output argument:

    ---------------

    X1:SAISIR matrix (n x p) centered (the average of each column of X1 is

    equal to 0)

    xmean: SAISIR vector (1 x p) of the average row.
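
    A minimal sketch of this centering in plain MATLAB, not taken from the SAISIR source. It assumes the SAISIR layout with fields .d (data), .i (row identifiers) and .v (variable identifiers); the values are made up.

```matlab
% Made-up SAISIR-like structure (3 observations, 2 variables)
X.d = [1 2;
       3 6;
       5 10];
X.i = char('obs1', 'obs2', 'obs3');
X.v = char('var1', 'var2');

% Average row (1 x p)
xmean.d = mean(X.d, 1);
xmean.i = 'average';
xmean.v = X.v;

% Subtract the average row from every row; identifiers are unchanged
X1 = X;
X1.d = X.d - repmat(xmean.d, size(X.d, 1), 1);
```

    After this step every column of X1.d has a mean of zero.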



    change_sign

    change_sign - changes the sign of a component and of its associated eigenvector

    function [pcatype1] = change_sign(pcatype,ncomp)



    Input argument:

    --------------

    pcatype : output argument of function "pca"

    ncomp : rank of the component whose sign is to be changed


    Output argument:

    ---------------

    pcatype1: new pca structure


    The function is useful when several PCAs have been computed.

    For the sake of clarity, it may be useful to have the axes oriented in (about)

    the same directions. In this way, the graphical representations may be

    easier to interpret.


    see also: function "smart_coord"



    check_name

    check_name - Controls if some strings are strictly identical in a string array

    function [detected,names]=check_name(string)



    Input argument:

    --------------

    "string": a matrix of characters


    Output arguments:

    --------------

    detected: vector giving the indices of the observations having the same

    name.

    names: matrix of characters giving the duplicated names


    This function is mainly used in relationship with function "reorder"



    colored_curves

    colored_curves - displays curves colored according to groups

    function h=colored_curves(X,group)



    Input argument:

    --------------

    X: SAISIR matrix (n x p)

    group: Saisir vector of groups (integers,n x 1).


    The function displays all the observations as curves.

    Each curve is colored according to the values in "group". The observations

    with the same group number are colored identically.



    colored_map1

    colored_map1 - colored map : using a portion of the identifiers as labels

    colored_map1(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))


    Biplot of two columns as colored map


    Input arguments:

    ===============

    X: SAISIR matrix

    col1, col2 : index of the two columns to be represented (integer values)

    startpos, endpos: position in the identifier strings of rows ('.i') for

    the coloration

    col1label (optional): Label of the variable forming the X-axis

    col2label (optional): Label of the variable forming the Y-axis

    title (optional) : title of the graph

    charsize (optional) : size of the plotted characters

    margin (optional) : margin value allowing an extension of the axes in order

    to cope with long identifiers (default value: 0.05)

    For the French users: there is a synonym function "carte_couleur1".

    Use preferably "colored_map1"

    The coloration of the displayed descriptors depends on the arguments

    startpos and endpos.

    From the name of each individual, the string name(startpos:endpos) is extracted. Two observations

    for which these strings are different are also colored differently.

    Example:

    Let X be a SAISIR matrix

    Let X.i be

    'wheat1'

    'barle2'

    'ricex1'

    'wheat2'

    'barle3'

    ...

    The command 'colored_map1(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.

    The characters are extracted from positions 1 to 5, giving the strings 'wheat', 'barle',

    'ricex'. A different color will be given to each of these strings.
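
    The way a portion of the identifiers defines the color groups can be sketched in plain MATLAB as follows. This is an illustration only, not the SAISIR code; "ids" plays the role of X.i, and the assignment of one color index per distinct key is an assumption.

```matlab
% Row identifiers, as in the example above
ids = char('wheat1', 'barle2', 'ricex1', 'wheat2', 'barle3');
startpos = 1;
endpos   = 5;

% Extract the key of each observation and number the distinct keys
keys = cellstr(ids(:, startpos:endpos));
[labels, firstrow, colorindex] = unique(keys);
colorindex = colorindex(:)';      % one color index per observation
```

    Observations sharing the same key ('wheat1' and 'wheat2', for instance) receive the same index and would therefore be drawn in the same color.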


    See also colored_map2 (same principle but with the whole identifier name

    displayed)



    colored_map2

    colored_map2 - colored map : colors from a portion of the identifiers, whole identifiers as labels

    colored_map2(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize),(margin))


    Biplot of two columns as colored map


    Input arguments:

    ===============

    X: SAISIR matrix

    col1, col2 : index of the two columns to be represented (integer values)

    startpos, endpos: position in the identifier strings of rows ('.i') for

    the coloration

    col1label (optional): Label of the variable forming the X-axis

    col2label (optional): Label of the variable forming the Y-axis

    title (optional) : title of the graph

    charsize (optional) : size of the plotted characters

    margin (optional) : margin value allowing an extension of the axes in order

    to cope with long identifiers (default value: 0.05)

    For the French users: there is a synonym function "carte_couleur2".

    Use preferably "colored_map2"

    The coloration of the displayed descriptors depends on the arguments

    startpos and endpos.

    From the name of each individual, the string name(startpos:endpos) is extracted. Two observations

    for which these strings are different are also colored differently.

    Example:

    Let X be a SAISIR matrix

    Let X.i be

    'wheat1'

    'barle2'

    'ricex1'

    'wheat2'

    'barle3'

    ...

    The command 'colored_map2(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.

    The characters are extracted from positions 1 to 5, giving the strings 'wheat', 'barle',

    'ricex'. A different color will be given to each of these strings.


    See also colored_map1 (same principle but with only the portion of

    the identifier names, from startpos to endpos, displayed)



    colored_map4

    colored_map4 - colored map according to 2 criteria

    colored_map4(X,col1,col2,color_choice,(symbol_choice),(charsize))


    Input_arguments

    --------------

    X: SAISIR matrix of data (n x p)

    col1, col2 : ranks of the columns to be represented

    color_choice: first criterion of choice, dealing with the colors

    symbol_choice (optional): second criterion of choice, dealing with the

    symbols

    charsize (optional): size of the plotted characters


    Biplot of two columns as a colored map.

    The coloration of the displayed descriptors depends on the argument

    "color_choice" (either a matrix of char, a vector of numbers or a SAISIR structure),

    whose number of elements must be equal to the number of rows in X.

    If the elements of "color_choice" are different, they are also colored differently.

    If "symbol_choice" is also defined (either a matrix of char, a vector of numbers

    or a SAISIR structure), different symbols are used. The color of the point

    is then given by "color_choice", and the shape of the symbol depends on

    "symbol_choice".


    Examples of use:

    Colored text with color determined by the second character in wheat.i:

    colored_map4(wheat,1,50,wheat.i(:,2));


    colored_map4(wheat,1,50,wheat.i(:,2),wheat.i(:,3));

    color: determined by the second character in wheat.i

    colored symbol: shape determined by the third character in wheat.i



    comdim

    comdim - Finding common dimensions in multitable data (saisir format)

    function[res]=comdim(collection,(ndim),(threshold))



    Input arguments:

    ---------------

    collection: vector of SAISIR matrices (the number "nrow" of rows must be the same in each table)

    ndim (optional): number of common dimensions

    threshold (optional): convergence threshold; when the "difference of fit" falls below it, the

    iterative loop stops


    Output arguments:

    -----------------

    res with fields:

    Q : observations scores (nrow x ndim)

    explained : 1 x ndim, percentage explanation given by each dimension

    saliences : weight of the original tables in each

    dimensions (ntable x ndim).


    Method published by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H. MacFie

    in Food quality and Preference 11 (2000) 151-154


    Typical example (suppose 3 SAISIR matrices

    "spectra1","spectra2","spectra3"):

    collection(1)=spectra1; collection(2)=spectra2; collection(3)=spectra3;

    myresult=comdim(collection);

    map(myresult.Q,1,2);%% looking at the compromise scores

    figure;

    map(myresult.saliences,1,2);%% looking at the weights



    contingency_khi2

    contingency_khi2 - computes khi2 stats on a contingency table

    function res=contingency_khi2(table);



    Input argument:

    --------------

    table: SAISIR matrix of contingency table (n x p)


    output argument

    --------------

    res with fields:

    theo: theoretical contingency table assuming independence of rows and

    columns in "table"

    khi2: khi2 (chi-square) value

    dll: degrees of freedom of the model

    P: probability (p-value) associated with the null hypothesis ("independence of rows and columns")


    Each element of the input argument "table" gives the number of observations

    which belong to both the group of the corresponding row and the group of the

    corresponding column. For example, table.d(2,4) indicates the number of

    observations which are both in group 2 of the rows and in group 4 of the

    columns.

    The contingency table can be created with the function "contingency_table"
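
    The quantities returned in "theo", "khi2", "dll" and "P" correspond to the classical chi-square test of independence. A hedged sketch in plain MATLAB, with made-up counts and without any claim to reproduce the SAISIR code (the p-value is obtained here from the incomplete gamma function so that no statistics toolbox is needed):

```matlab
table.d = [10 5;
            5 10];                          % made-up contingency table
n = sum(table.d(:));

% Theoretical counts under independence of rows and columns
theo = sum(table.d, 2) * sum(table.d, 1) / n;

% Chi-square statistic and degrees of freedom
khi2 = sum(sum((table.d - theo).^2 ./ theo));
dll  = (size(table.d, 1) - 1) * (size(table.d, 2) - 1);

% Upper tail probability of the chi-square distribution with dll degrees of freedom
P = gammainc(khi2 / 2, dll / 2, 'upper');
```

    A small P suggests that the hypothesis of independence between rows and columns is not tenable.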




    contingency_table

    contingency_table - Computes a contingency table

    function table=contingency_table(g1,g2)



    Input arguments

    ---------------

    g1 and g2: SAISIR vectors (n x 1) of groups (possibly computed with "create_group1").

    In these vectors, the same number indicates membership of the same group.


    Output argument

    ---------------

    table : contingency table (ngroup1 x ngroup2). A value table.d(i,j) indicates the

    number of observations belonging to both group i in g1 and group j in g2.
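
    The counting described above can be sketched with the base MATLAB function "accumarray". This is an illustration, not the SAISIR implementation; it assumes that the groups are numbered 1, 2, ... and uses made-up group vectors.

```matlab
% Made-up group numbers of 5 observations according to two groupings
g1.d = [1; 1; 2; 2; 2];
g2.d = [1; 2; 2; 3; 3];

% table.d(i,j) = number of observations in group i of g1 and group j of g2
table.d = accumarray([g1.d g2.d], 1);
```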


    See also "contingency_khi2", "ca" (correspondence analysis),

    "build_indicator"




    cormap

    cormap - Correlation between two tables

    function [cor] = cormap(X1,X2)



    Input arguments

    --------------

    X1 and X2: SAISIR matrices dimensioned n x p1 and n x p2 respectively


    Output argument:

    ---------------

    cor: matrix of correlations (p1 x p2)

    The tables must have the same number of rows.

    An element cor.d(i,j) is the correlation coefficient between the column i of X1 and j of X2
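
    A minimal sketch of this cross-correlation in plain MATLAB, with made-up data. It is not the SAISIR code; only the .d fields are built, and the identifiers of the result are left out.

```matlab
X1.d = [1 5;
        2 3;
        3 1];                    % n x p1
X2.d = [2; 4; 6];                % n x p2
n = size(X1.d, 1);

% Center the columns of both tables
Z1 = X1.d - repmat(mean(X1.d, 1), n, 1);
Z2 = X2.d - repmat(mean(X2.d, 1), n, 1);

% Cross-products divided by the product of the column norms (p1 x p2)
cor.d = (Z1' * Z2) ./ (sqrt(sum(Z1.^2, 1))' * sqrt(sum(Z2.^2, 1)));
```

    Here the first column of X1 is perfectly correlated with X2 and the second one perfectly anti-correlated.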



    correct_baseline

    correct_baseline - simple linear baseline correction, using intensity

    function [saisir] = correct_baseline (saisir1,col1,col2)


    The baseline is modelled by a straight line going from data points col1 to col2
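
    One possible reading of this correction, sketched in plain MATLAB for a single made-up curve. It assumes that the straight line passes through the intensities at col1 and col2 and is subtracted from the whole curve; the SAISIR function may differ in these details.

```matlab
x = [1 3 6 5 5];                 % one made-up curve (1 x p)
col1 = 1;
col2 = 5;
p = size(x, 2);

% Straight line going through (col1, x(col1)) and (col2, x(col2))
slope    = (x(col2) - x(col1)) / (col2 - col1);
baseline = x(col1) + slope * ((1:p) - col1);

% Corrected curve: zero at both anchor points
corrected = x - baseline;
```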



    correlation_circle

    correlation_circle - Displays the correlation circle after PCA

    function[res]=correlation_circle(pcatype,X,col1,col2,(startpos),(endpos))



    Input arguments

    ---------------

    pcatype: output argument of function "pca"

    X: original data matrix (n x p)

    col1 and col2 : ranks of the PC-scores to be represented.

    startpos and endpos (optional): key in the variable identifiers for coloring the variables


    Output argument

    --------------

    res: matrix (p x nscores) of the correlations between the variables and all

    the available PC-scores


    The function draws the correlation circle.



    %typical example

    %Let "chemistry" be a SAISIR matrix

    mypca=pca(chemistry);

    correlation_circle(mypca,chemistry,1,2);%correlation circle of plane #1-#2


    %Note: Use preferably the function "correlation_plot"



    correlation_plot

    correlation_plot - Draws the correlations between scores and tables

    function handle=correlation_plot(scores,col1,col2, X1,X2, ...);



    Input arguments:

    ==============

    scores: - ORTHOGONAL scores obtained by multidimensional analysis

    col1 and col2: - Indices (ranks) of the scores to be plotted (integer numbers)

    X1, X2, ... - Arbitrary number of tables giving the variables to be plotted

    The number of rows in the scores and in the other tables must be identical.


    The function displays the correlation circle, with a different color for

    each table in X1, X2 ...

    A dotted line gives the level of 50% explained variance.

    If the input argument "scores" has non-orthogonal columns, the graph is normally incorrect

    and a warning message is displayed.
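
    The reason why orthogonal scores are required can be seen in a small sketch in plain MATLAB. It is only an illustration with made-up data, in which the scores are obtained by a singular value decomposition: with orthogonal scores, the squared correlations of a variable add up across scores, so the squared distance of a variable from the origin of the plot is the fraction of its variance explained by the two plotted scores, and the 50% level is a circle of radius sqrt(0.5).

```matlab
X = [2 1 0;
     0 1 3;
     1 4 1;
     3 0 2;
     4 4 4];                                 % made-up data (n x p)
n  = size(X, 1);
Xc = X - repmat(mean(X, 1), n, 1);           % centered variables

% Orthogonal scores (here principal component scores)
[U, S, V] = svd(Xc, 'econ');
T = U * S;

% Correlations between each variable and each score (p x nscores)
Tn = T  ./ repmat(sqrt(sum(T.^2, 1)),  n, 1);
Xn = Xc ./ repmat(sqrt(sum(Xc.^2, 1)), n, 1);
R  = Xn' * Tn;

% Fraction of the variance of each variable explained by scores 1 and 2
explained12 = sum(R(:, 1:2).^2, 2);
radius50 = sqrt(0.5);                        % radius of the 50% circle
```

    With non-orthogonal scores the squared correlations no longer add up in this way, which is why the plot would be misleading.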



    covariance_pca

    covariance_pca - principal component analysis when knowing the covariance (of variables)

    function[pcatype]=covariance_pca(covariance_type,(nscore))



    Input arguments:

    ---------------

    covariance_type: output argument of function "cumulate_covariance"

    nscore : (integer, optional) number of scores to be calculated (default :

    all)


    Output argument:

    ----------------

    pcatype with fields:

    eigenvec: eigenvector

    eigenval:eigenvalues

    average:average value of the active observations
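
    How such fields can be derived from a covariance matrix is sketched below in plain MATLAB. This is a generic eigen-decomposition with made-up numbers, not the SAISIR implementation; in particular the ordering and the sign of the eigenvectors are assumptions.

```matlab
covariance = [2 1;
              1 2];                 % made-up covariance matrix (p x p)
average    = [1 1];                 % made-up average row (1 x p)

% Eigen-decomposition, sorted by decreasing eigenvalue
[vec, val] = eig(covariance);
[eigenval, order] = sort(diag(val), 'descend');
eigenvec = vec(:, order);

% Projection of new observations: center with the stored average, then project
newobs = [2 2;
          0 2];
score = (newobs - repmat(average, size(newobs, 1), 1)) * eigenvec;
```

    Only the covariance matrix and the average are needed, which is what makes the approach usable when the data themselves do not fit in memory.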


    Performs PCA on the covariance matrix as calculated by "cumulate_covariance".

    This function is useful for carrying out PCA on huge data sets which cannot

    be loaded completely in the free memory, and thus must be split into smaller subsets of observations.


    %typical example:

    A complete script must therefore be something like

    cov=cumulate_covariance(spectra_1);%% starting with matrix 1

    cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2

    cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3

    ...

    cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k

    covariance_type=cumulate_covariance([],cov);%% finishing

    [pcatype]=covariance_pca(covariance_type,(ncomponent))

    score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1


    Related function: "cumulate_covariance" (covariance of huge data sets)





    covmap

    covmap - assesses the covariances between two tables

    function [cov] = covmap(X1,X2)



    Input arguments

    --------------

    X1 and X2: SAISIR matrices dimensioned n x p1 and n x p2 respectively


    Output argument:

    ---------------

    cov: matrix of covariances (p1 x p2)

    The tables must have the same number of rows.

    An element cov.d(i,j) is the covariance between the column i of X1 and j of X2



    create_group

    create_group - creates a vector of numbers indicating groups from identifiers

    group=create_group(X,code_list,startpos,endpos)



    Input arguments:

    ---------------

    X: SAISIR matrix (n x p)

    code_list:matrix of character (k x q)

    startpos, endpos: place in the identifier names where to find the code


    Output argument:

    ---------------

    group: SAISIR vector (n x 1) of groups. The same number indicates that the observations belong to the same group.

    Normally, k groups are identified.


    Typical use:

    group=create_group(X,['A1';'B2';'C1'],3,4);

    The command seeks in X.i the codes 'A1', 'B2', 'C1' in positions 3 to 4.

    Observations with code 'A1', 'B2', 'C1' are placed in the groups numbered 1, 2

    and 3 respectively.
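
    The matching of codes in the identifiers can be sketched in plain MATLAB as follows. It is an illustration under assumptions, not the SAISIR code: "ids" plays the role of X.i, and observations whose code is not in the list are simply left in group 0 here.

```matlab
% Made-up row identifiers; the code sits in positions 3 to 4
ids = char('xxA1', 'xxB2', 'xxC1', 'xxA1');
code_list = ['A1'; 'B2'; 'C1'];
startpos = 3;
endpos   = 4;

keys  = cellstr(ids(:, startpos:endpos));
group = zeros(size(ids, 1), 1);
for k = 1:size(code_list, 1)
    % observations carrying the k-th code are placed in group k
    group(strcmp(keys, code_list(k, :))) = k;
end
```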


    See also "create_group1"

    Group structures are used in discriminant analysis, ANOVA and related

    methods




    create_group1

    create_group1 - uses the identifiers to create groups

    function saisir=create_group1(s,startpos,endpos)


    s: SAISIR matrix; startpos and endpos: positions of the discriminating characters in the row identifiers

    Creates as many groups as there are different strings from startpos to endpos in the identifiers.




    crossfda1

    crossfda1 - validation on discrimination according to fda1 (directly on data)

    function[res]=crossfda1(X,group,among,maxvar,ntest)



    input arguments

    ===============

    X: Saisir data matrix (n x p)

    group:Saisir vector of group (integer, n x 1).Two observations belonging

    to the same group have the same group number.

    among: maximum rank of the PC scores entered in the model (see "fda1" for

    details)

    maxvar: (integer) maximal number of scores entered in the model

    ntest: (integer) number of observations in each group placed in the validation

    set


    Output arguments:

    ===============

    res with fields

    fdatype: fdatype built up with the calibration set (see function "fda1")

    verification:

    with fields

    datafactor: discriminant scores of the validation set

    predicted_group: in the validation set

    confusion: confusion matrix of the validation set (rows: actual group;

    columns: predicted group)


    The function applies "fda1" after dividing the observations in X into a calibration set and a validation set.

    "ntest" observations in each group are randomly selected (without repetition) for the validation set.


    %typical example:

    g=create_group1(wheat,2,3);%% creation of the vector of group number

    res=crossfda1(wheat,g,10,5,5);

    res.validation.confusion;%% confusion matrix in validation

    %colored_map1(res.validation.datafactor,1,2,2,3); %% looking at the

    %discriminant biplot of the validation set
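
    The random division into calibration and validation sets can be sketched in plain MATLAB as follows. This only illustrates the principle (ntest observations drawn without repetition in each group) and is not the SAISIR code; the group vector is made up and each group is assumed to contain at least ntest observations.

```matlab
group = [1 1 1 1 2 2 2 2 2]';       % made-up group numbers (n x 1)
ntest = 2;                          % observations per group sent to validation

selected = zeros(size(group));      % 0 = calibration, 1 = validation
labels = unique(group);
for k = 1:numel(labels)
    members = find(group == labels(k));
    shuffled = members(randperm(numel(members)));   % random order, no repeat
    selected(shuffled(1:ntest)) = 1;
end
```

    Because the draw is random, two runs generally give different validation sets and therefore slightly different confusion matrices.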




    crossmaha1

    crossmaha - validation on discrimination according to maha1 (directly on data)

    function[discrtype]=crossmaha(X,group,maxvar,ntest)


    Input arguments

    ==============

    X: SAISIR matrix (n x p)

    group:SAISIR vector of group numbers (n1 x 1)

    maxvar:(integer) maximum number of variables introduced in the model

    ntest (integer): number of observations in each group in the validation

    set


    Output argument:

    ===============

    discrtype with fields:

    step with fields

    index: vector giving the rank of the variable introduced at each step

    (maxvar values)

    correct: vector giving the number of correct classifications in the

    calibration set (maxvar values)

    name: identifiers of the variables introduced into the models (matrix of

    char with maxvar rows)

    ntestcorrect: vector giving the number of correct classifications in the

    validation set (maxvar values)

    classed: SAISIR vector of the predicted group numbers of the calibration

    set (n1 x 1). Only the result of the final step is given

    testclassed:SAISIR vector of the predicted group numbers of the test set (n2 x 1)

    Only the result of the final step is given

    selected: vector indicating the observations in the calibration (0) and in

    the validation (1) set

    confusion with fields

    cal: confusion matrix of the calibration set

    val: confusion matrix of the validation set


    Applies maha1 after dividing the observations in X into a calibration set and a validation set.

    ntest observations in each group are randomly selected (without repetition) for the validation set.


    see also function "maha1"



    crossplsda

    crossplsda - validation on PLS discriminant analysis

    function[res]=crossplsda(X,group,dim,selected)



    Input arguments:

    ===============

    X:SAISIR matrix of predictive data (n x p)

    group: SAISIR vector of group numbers (n x 1). The same number indicates that

    the observations belong to the same group

    dim: maximum number of PLS dimensions

    selected:matlab VECTOR with 0= calibration sample, 1= verification sample


    Output argument:

    ===============

    res with fields:

    confusion1: confusion matrix of the calibration set , method 1

    ncorrect1: number of correctly classified obs. in calibration, method1

    nscorrect1: number of correctly classified obs. in validation, method1

    sconfusion1: confusion matrix of the validation set , method 1

    confusion: confusion matrix of the calibration set , method 0

    ncorrect: number of correctly classified obs. in calibration, method 0

    nscorrect: number of correctly classified obs. in validation, method 0

    sconfusion: confusion matrix of the validation set , method 0

    info: ' no index: max of predicted Y; 1: mahalanobis distance on latent variables t'


    The function divides the data collection into a calibration set

    (selected=0) and a validation set (selected = 1)

    The function "plsda" is applied on the calibration set, and tested on the validation set.


    Two strategies for attributing a group to each observation are tested:

    Method 0 (no index): the observations are classified in the group for

    which the predicted indicator variable is the highest.

    Method 1: (preferable) linear discrimination on the PLS scores
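
    Method 0 can be illustrated with a few lines of plain MATLAB. The predicted indicator values below are made up and stand for the output of a PLS model; the sketch only shows the attribution rule and the resulting confusion matrix, not the SAISIR code.

```matlab
% Made-up predicted indicator variables (3 observations x 3 groups)
predY = [0.80 0.10 0.10;
         0.20 0.30 0.50;
         0.40 0.45 0.15];
actual = [1; 3; 1];                 % known group numbers

% Each observation goes to the group with the highest predicted indicator
[maxval, predicted] = max(predY, [], 2);

% Confusion matrix (rows: actual group; columns: predicted group)
confusion = accumarray([actual predicted], 1, [3 3]);
ncorrect  = sum(predicted == actual);
```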


    see also:"plsda", "applyplsda"



    crossvalpls

    crossvalpls - validation of pls with up to ndim dimensions.

    function [res]=crossvalpls(X,y,ndim,selected)



    Input args:

    ===========

    X: predictive data n x p

    y: Variable to be predicted n x 1

    ndim: maximal number of dimensions in the PLS model

    selected: Matlab vector n x 1 (0= obs in calibration; 1 in validation)


    Output args

    ============

    res with fields

    ---calibration: calibration results with fields

    BETA: regression coefficients (p x ndim)

    BETA0: intercept (p x 1)

    PREDY: predicted y in calibration (n x ndim)

    T: scores PLS (n x ndim)

    RMSEC: Root mean square error of calibration (1 x ndim)

    r2: determination coefficient (yobs/ypred) (1 x ndim)

    ---validation : validation results with fields

    PREDY: predicted y in validation (n x ndim)

    RMSEV: Root mean square error of validation (1 x ndim )

    r2: determination coefficient (yobs/ypred) (1 x ndim)

    OBSY: observed y in validation (number of rows=number of obs in

    validation)


    See also: "crossvalpls1a"

    Note:

    "crossvalpls" is faster than "crossvalpls1a" but gives less information (for

    example the PLS scores in validation and the loadings in calibration are not given here)



    crossvalpls1a

    crossvalpls1a - crossvalidation of pls with up to ndim dimensions.

    function [res]=crossvalpls1a(X,y,ndim,selected)


    Input args:

    ===========

    X: predictive data n x p

    y: Variable to be predicted n x 1

    ndim: maximal number of dimensions in the PLS model

    selected: Matlab vector n x 1 (0= obs in calibration; 1 in validation)

    Output args

    ============

    res with fields

    ---calibration: calibration results with fields

    T: PLS scores (n x ndim)

    P: PLS loadings such as X = TP + residuals (ndim x p)

    beta: final regression coefficients with ndim dimensions (p x 1)

    beta0: final intercept value (number)

    meanx: mean of X (1 x p)

    meany: mean of y (number)

    predy: predicted y with ndim dimensions (n x 1)

    error: Root mean square error of the model with ndim dimensions (number)

    corcoef: correlation coefficient r between observed and predicted values

    in the model with ndim dimensions

    BETA: regression coefficients for the ndim models (p x ndim)

    BETA0: intercepts of the ndim models (ndim x 1)

    loadings: pls loadings such as T=X*loadings (with X centred) (p x ndim)

    Q: Q value such as y = TQ + residual (ndim x 1)

    PREDY: predicted y values up to ndim dimensions (n x ndim)

    RMSEC: Root mean square error of prediction up to ndim dimensions (1 x ndim)

    r2: determination coefficient between observed and predicted values (1 x

    ndim)

    ---validation : validation results with fields

    PREDY: predicted y in validation (n x ndim)

    RMSEV: Root mean square error of validation (1 x ndim )

    r2: determination coefficient (yobs/ypred) (1 x ndim)

    T : PLS scores in validation

    OBSY: observed y in validation (number of rows=number of obs in

    validation)


    See also: "crossvalpls"

    Note:

    crossvalpls1a is slower than crossvalpls but gives more information (for

    example the PLS scores in validation)



    crossval_multiple_regression

    crossval_multiple_regression - validation of multiple regression.

    function[res]=crossval_multiple_regression(X,y,selected)



    Input arguments:

    ================

    X: SAISIR matrix of predictive data (n x p)

    y: SAISIR vector of the variable to be predicted ( n x 1)

    selected: MATLAB vector ( n x 1) giving the samples placed in the

    verification set: 1= in verification; 0 = in calibration set


    Output arguments:

    ================

    res with fields:

    calibration with fields

    ypred: predicted y (n x 1)

    beta0: intercept of the regression

    beta: regression coefficients (p x 1)

    r2: r2 between observed and predicted values in calibration

    RMSEC: Root mean square error of calibration

    validation with fields

    ypred: predicted values in validation

    r2: determination coefficient between obs and predicted values in

    validation

    RMSEV: Root mean square error of validation


    Divides the data into a calibration and a validation set.

    Multiple linear regression is established on the calibration set

    and validated on the validation set.

    The division calibration/validation is determined by the vector "selected"
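
    The procedure can be sketched in plain MATLAB with ordinary least squares. This is an illustration under assumptions (made-up data, errors computed as the square root of the mean squared residual), not the SAISIR implementation.

```matlab
% Made-up data: one predictor, six observations
X = [0; 1; 2; 3; 4; 5];
y = [1; 3; 5; 7; 9; 12];
selected = [0; 0; 0; 0; 1; 1];      % 0 = calibration, 1 = validation

cal = (selected == 0);
val = (selected == 1);

% Least squares fit on the calibration set (intercept + coefficients)
A     = [ones(sum(cal), 1) X(cal, :)];
coef  = A \ y(cal);
beta0 = coef(1);
beta  = coef(2:end);

% Errors in calibration and in validation
RMSEC = sqrt(mean((y(cal) - A * coef).^2));
ypred_val = beta0 + X(val, :) * beta;
RMSEV = sqrt(mean((y(val) - ypred_val).^2));
```

    In this toy case the calibration points lie exactly on a line, so RMSEC is zero while the last validation point is off by one unit.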



    crossval_quaddis

    crossval_quaddis - crossvalidation of quadratic discriminant analysis

    function res=crossval_quaddis(X,group,selected)



    Input arguments:

    ================

    X : matrix of predictive data (n x p)

    group : vector of known groups (integers, n x 1)

    selected : MATLAB vector (n x 1)

    with elements 0: selected in the calibration set

    1: selected in the validation set


    Output argument

    ===============

    res with fields:

    calibration: structure quaddis_type as defined in function "quaddis"

    validation : structure as defined in "apply_quaddis"


    see also quaddis, apply_quaddis



    cross_ridge_regression

    cross_ridge_regression - ridge regression with validation

    function[res]=cross_ridge_regression(X,y,krange,selected)



    Input arguments:

    ---------------

    X: Saisir matrix of the predictive data set (n x p)

    y: Saisir vector of value to be predicted (n x 1)

    krange: MATLAB vector of doubles (k x 1) (see function "ridge_regression")

    selected: MATLAB vector whose elements are equal to either 0 (calibration) or 1 (validation)


    Output arguments

    ----------------

    res with fields

    predy: predicted y in validation (n2 x k)

    obsy: observed y in validation (n2 x k)

    r2: r2 between observed and predicted y in validation (1 x k)

    rmsecv: root mean square error of validation (1 x k)

    ridgetype: see function "ridge_regression"


    Divides a collection into a calibration set (selected = 0) and a validation set

    (selected = 1).

    Builds the ridge_regression models on the calibration set and applies them to the validation set.

    All the models with the k parameter in "krange" are tested
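
    As an illustration of the calibration/validation scheme described above, here is a minimal sketch in plain MATLAB (ordinary matrices, not SAISIR structures). The data values, the variable names and the closed-form ridge estimate on centered data are assumptions made for this example; the actual "ridge_regression" function may center or scale the data differently.

```matlab
% Toy data: 8 observations, 2 predictors (hypothetical values)
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5; 7 8; 8 7];
y = X * [1; 2] + [0.1; -0.1; 0.2; -0.2; 0.1; -0.1; 0.2; -0.2];
selected = [0 0 1 0 0 1 0 1]';        % 0 = calibration, 1 = validation
krange   = [0 0.1 1 10];              % ridge parameters to test

Xc = X(selected == 0, :);  yc = y(selected == 0);
Xv = X(selected == 1, :);  yv = y(selected == 1);

% Centering uses the calibration set only
mx  = mean(Xc, 1);
my  = mean(yc);
Xc0 = Xc - repmat(mx, size(Xc, 1), 1);
Xv0 = Xv - repmat(mx, size(Xv, 1), 1);

p     = size(X, 2);
nk    = numel(krange);
predy = zeros(size(Xv, 1), nk);       % one column of predictions per k
rmsev = zeros(1, nk);
for j = 1:nk
    % Ridge estimate: (X'X + kI)^-1 X'y on centered calibration data
    beta = (Xc0' * Xc0 + krange(j) * eye(p)) \ (Xc0' * (yc - my));
    predy(:, j) = Xv0 * beta + my;
    rmsev(j) = sqrt(mean((yv - predy(:, j)).^2));
end
disp(rmsev)                           % validation error for each k
```

    Each column of predy corresponds to one value of k, which mirrors the (n2 x k) layout of the "predy" field described above.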



    cumulate_covariance

    cumulate_covariance - Covariance on huge data set

    function [covariance_type]=cumulate_covariance((X),(covariance_type1));


    This function is mainly useful for computing PCA on very large data sets


    Input arguments:

    ---------------

    X: SAISIR matrix X (n x p) or [];

    covariance_type1 (optional): output argument of a previous call to "cumulate_covariance"


    Output argument:

    ----------------

    covariance_type: either intermediary results in cumulating covariance

    or a structure containing covariance matrix (at completion)

    At completion, "covariance_type" has fields:

    covariance: matrix p x p of covariances

    average: average value of all the observations (1 x p)

    n: total number of observations involved in computing the covariance matrix


    If the second argument (covariance_type1) is undefined, the function initiates the calculation of the covariance.

    If both arguments are defined, it cumulates the covariance.

    If the first argument is [], it finishes the work.


    A complete script must therefore be something like

    cov=cumulate_covariance(spectra_1);%% starting

    cov=cumulate_covariance(spectra_2,cov);%% cumulating values of matrix 2

    cov=cumulate_covariance(spectra_3,cov);%% cumulating values of matrix 3

    ...

    cov=cumulate_covariance(spectra_k,cov);%% cumulating values of matrix k

    covariance_type=cumulate_covariance([],cov);%% finishing

    [pcatype]=covariance_pca(covariance_type,(ncomponent))

    score1=applypca(pcatype,spectra_1);%% projecting data from spectra_1



    Related function: "covariance_pca" (PCA from covariance)
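
    The start / cumulate / finish scheme can be sketched in plain MATLAB as follows. This is only an illustration under stated assumptions: the data blocks are hypothetical, the running sums used here are one possible choice of intermediary results, and the divisor n-1 is assumed (the SAISIR function may divide by n).

```matlab
% Three hypothetical blocks of observations sharing the same 2 variables
blocks = {[1 2; 3 4; 5 7], [2 0; 4 1], [6 5; 0 3; 1 1]};

% Starting: empty accumulators
n  = 0;                 % number of observations seen so far
s  = zeros(1, 2);       % running sum of the observations
ss = zeros(2, 2);       % running sum of cross-products X'X

% Cumulating: each block is visited once and can then be discarded
for b = 1:numel(blocks)
    Xb = blocks{b};
    n  = n + size(Xb, 1);
    s  = s + sum(Xb, 1);
    ss = ss + Xb' * Xb;
end

% Finishing: average and covariance from the accumulated sums
average    = s / n;
covariance = (ss - n * (average' * average)) / (n - 1);  % n-1 is an assumption

% Check against the covariance computed on the complete table
Xall = vertcat(blocks{:});
assert(max(max(abs(covariance - cov(Xall)))) < 1e-10);
```

    Only the p x p sums are kept in memory, which is why this approach suits data sets too large to load at once.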



    curve

    curve - represents a row of a matrix as a single curve

    handle=curve(X,(nrow), (xlabel),(ylabel),(title))



    Input argument:

    --------------

    X : SAISIR matrix

    nrow : index of the row to be shown

    xlabel, ylabel (optional) : label on X and Y

    title (optional) : title of the graph.

    This function draws the row (typically a spectrum) as a curve.

    If X.v can be interpreted as a vector of numbers (such as wavelengths),

    the X scale is given by this vector.

    Otherwise, the X-axis is simply given by the rank of the variables.

    The function "courbe" is a synonym of this function.



    curves

    curves - represents several rows of a matrix as curves

    usage: handle=curves(X,range,(xlabel),(ylabel),(title))



    Input arguments:

    ---------------

    X : SAISIR matrix

    range (optional): vector of integer values giving the indices of the

    rows to be displayed (default : all rows displayed)

    xlabel, ylabel (optional): labels in X and Y

    title (optional): title of the graph.

    If X.v can be interpreted as a vector of numbers (such as wavelengths),

    the X scale is given by this vector.

    Otherwise, the X-axis is simply given by the rank of the variables.

    Example:

    curves(spectra,1:2:100,'wavenumber','log 1/R','Raman spectra');

    plots rows 1, 3, 5, ..., 99 as curves.

    The function "courbes" is a synonym of this function.



    d2_factorial_map

    d2_factorial_map - computes a factorial map from a table of squared distances

    function [ftype] = d2_factorial_map(X)



    Input argument:

    ==============

    X : Saisir matrix n x n of squared distances


    Output argument:

    ===============

    ftype with fields:

    eigenval: eigenvalues

    score: scores


    %typical (demonstrative) example

    %===============

    xdist=distance(data,data);

    xdist.d=xdist.d.*xdist.d ;%% warning !! squared distance needed

    ftype=d2_factorial_map(xdist);

    map(ftype.score,1,2);

    p=pca(data);

    figure;

    map(p.score,1,2);%% identical to previous figure


    Useful when only the distance matrix is available.

    Uses the Torgerson approach to transform squared distances into pseudo

    scalar products, then gives the factorial scores derived from the distance table.
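
    For readers unfamiliar with the Torgerson approach, the sketch below shows the usual double-centering computation in plain MATLAB. It is an illustration under stated assumptions (hypothetical coordinates, ordinary matrices, and a relative tolerance chosen here to drop null eigenvalues), not the SAISIR implementation itself.

```matlab
% Hypothetical coordinates, used only to build a table of squared distances
X  = [0 0; 3 0; 0 4; 3 4; 1 2];
n  = size(X, 1);
G  = X * X';
d2 = repmat(diag(G), 1, n) + repmat(diag(G)', n, 1) - 2 * G;

% Torgerson double centering: squared distances -> pseudo scalar products
J = eye(n) - ones(n) / n;          % centering operator
B = -0.5 * J * d2 * J;
B = (B + B') / 2;                  % enforce symmetry against round-off

% Diagonalization, eigenvalues sorted in decreasing order
[V, L] = eig(B);
[eigenval, order] = sort(diag(L), 'descend');
V = V(:, order);

% Scores: eigenvectors scaled by the square root of the positive eigenvalues
keep  = eigenval > 1e-10 * eigenval(1);
score = V(:, keep) * diag(sqrt(eigenval(keep)));

% The scores reproduce the original squared distances
Gs  = score * score';
d2s = repmat(diag(Gs), 1, n) + repmat(diag(Gs)', n, 1) - 2 * Gs;
assert(max(abs(d2(:) - d2s(:))) < 1e-8);
```

    Because the distances here come from Euclidean coordinates, the scores match those of a PCA on the centered data up to the sign of each axis, consistent with the example above.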



    deletecol

    deletecol - deletes columns of saisir files

    function [X1] = deletecol(X,index)



    input arguments

    ===============

    X:Saisir matrix (n x p)

    index:vector indicating the columns to be deleted


    Output argument

    ==============

    X1:saisir matrix (n x q) with q <=p


    The deleted columns are indicated by the vector index (numbers or booleans)


    % Typical Examples

    reduced=deletecol(data,[1 3 5]);%% deletes 3 columns

    reduced1=deletecol(data,sum(data.d)==0); % deletes all the columns with the sum equal to 0


    see also: deleterow, selectrow, selectcol



    deleterow

    deleterow - deletes rows of saisir files

    function X1 = deleterow(X,index)


    input arguments

    ===============

    X:Saisir matrix (n x p)

    index:vector indicating the rows to be deleted


    Output argument

    ==============

    X1:saisir matrix (n1 x p) with n1 <=n


    The deleted rows are indicated by the vector index (numbers or booleans)


    % Typical Examples

    reduced=deleterow(data,[1 3 5]);%% deletes 3 rows

    reduced1=deleterow(data,sum(data.d,2)==0); % deletes all the rows with the sum equal to 0


    see also: selectrow, selectcol,deletecol



    dendro

    dendro - dendrogram using Euclidean metric and Ward linkage

    function group=dendro(X,(topnodes))



    Input arguments:

    ================

    X : Saisir data matrix

    topnodes (optional): level of cutting the dendrogram


    Output argument

    ===============

    group: group numbers at the cutting level defined by topnodes.


    typical use

    ===========

    g=dendro(data,30);


    g will contain numbers ranging from 1 to 30 indicating the group number


    Attempts to display the identifiers on the dendrogram.

    Works only with a small number of identifiers



    dimcrosspcr1

    dimcrosspcr1 - validation of PCR (samples in validation are selected)

    function [pcrres]=dimcrosspcr1(X,y,ndim,selected)



    Input arguments:

    =================

    X : SAISIR matrix of predictive variables (n x p)

    y : SAISIR vector of variable to be predicted (n x 1)

    ndim: max number of PCR dimensions tested

    selected: MATLAB vector of samples selected as calibration set (==0)

    and verification set (==1)


    Output arguments:

    ================

    pcrres with fields

    r2: determination coefficient between observed and predicted y (1 x ndim)

    predy: predicted y for all the dimensions tested ( n x ndim)

    rmsev: root mean square error of validation for all the dimensions tested (1 x ndim)

    obsy: observed y in the validation set

    Components introduced in the order of the eigenvalues

    Remarks:

    1) The vector "selected" can be built randomly using the function

    "random_select".

    2) The biplot of observed/predicted values can be displayed with the command

    "xy_plot". For example,

    "xy_plot(pcrres.obsy,1,pcrres.predy,3)"

    will show the PCR model with 3 dimensions.



    dimcross_stepwise_regression

    dimcross_stepwise_regression - Tests models obtained from stepwise regression

    function[res]=dimcross_stepwise_regression(X,y,selected,Pthres,(confidence))


    Input arguments:

    X : SAISIR matrix of predictive variables (n x p)

    y : SAISIR vector of variable to be predicted (n x 1)

    selected: MATLAB vector (n elements). 0 = in the calibration set;

    1 = validation set

    Pthres: probability threshold of entering or discarding variable

    confidence: (optional): confidence interval for the correlation coefficient

    Output argument:

    res with fields

    calibration (see "stepwise_regression")

    validation (see "apply_stepwise_regression")

    validation has fields

    predy: predicted y in validation for all the regression models

    rmsev: root mean square error of validation (idem)

    r2: determination coefficient between observed and predicted y

    observed_y: vector of observed y values in the validation set.


    The function divides the data set X and y in a calibration set and a

    validation set. The vector "selected" defines the division.

    The stepwise regression models are established on the calibration set

    and tested on the validation set. All the calculated models are tested,

    which gives as many columns of predicted y as the number of models computed

    by stepwise_regression.



    distance

    distance - Usual Euclidean distances

    function [D] = distance(X1,X2)



    Input arguments

    ===============

    X1, X2: SAISIR matrices dimensioned n1 x p and n2 x p respectively


    Output argument

    ==============

    D: matrix n1 x n2 of Euclidean distances between the observations


    the tables must have the same number of columns
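
    The n1 x n2 layout of the output can be illustrated with a short sketch in plain MATLAB. The matrices below are hypothetical and are ordinary arrays rather than SAISIR structures; the loop is written for clarity, not speed.

```matlab
X1 = [0 0; 3 4];             % n1 x p (here 2 x 2)
X2 = [0 0; 6 8; 3 0];        % n2 x p (here 3 x 2), same number of columns

n1 = size(X1, 1);
n2 = size(X2, 1);
D  = zeros(n1, n2);
for i = 1:n1
    % Differences between row i of X1 and every row of X2
    delta   = X2 - repmat(X1(i, :), n2, 1);
    D(i, :) = sqrt(sum(delta.^2, 2))';
end
disp(D)                      % D(i,j) = distance between X1(i,:) and X2(j,:)
```

    Element D(i,j) is the distance between observation i of the first table and observation j of the second.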



    documentation_dico

    documentation_dico - builds a dictionary for the HTML documentation

    function documentation_dico(fid,function_name);


    (no direct use)

    This function builds up a part of the automatic documentation

    This is a part of an HTML document as created by "build_documentation"



    documentation_thematic_list

    documentation_thematic_list - HTML thematic list of functions


    Input arguments

    ==============

    fid : file identifier

    directory_name: name of the directory (working with

    the Matlab command "what")

    function_name: list of names of SAISIR functions

    thematic_list: output of function "thematic_classification"


    This function builds a part of HTML SAISIR documentation (function

    "build_documentation").

    This part corresponds to the thematic list of functions.

    This function must be called by the function "build_documentation"



    eigord

    eigord - diagonalization of a square matrix

    function [mtvec,mtval] = eigord(mtA)


    This function is not directly called



    eliminate_nan

    eliminate_nan - suppresses "not a number" data in a saisir structure

    function [X1] = eliminate_nan(X,(row_or_col))



    Input argument:

    ==============

    X:Saisir data matrix

    row_or_col:

    if row_or_col==0 tries to have the maximum number of rows

    if row_or_col==1 tries to have the maximum number of columns


    Output argument:

    ===============

    X1:Saisir data matrix with no NaN values


    From a given saisir file possibly containing NaN values (undetermined values),

    creates a file containing only known values.

    Only useful when very few rows or columns contain NaN values.



    ellipse_map

    ellipse_map - plots the confidence ellipses of groups

    function ellipse_map(X,col1,col2,gr,centroid_variability,(confidence),(point_plot))



    Input arguments:

    ================

    - X: data matrix

    - col1, col2: represented columns

    - gr: qualitative groups (coded as integer numbers)

    - centroid_variability : either 0 (variability of individual data

    points), or 1 (variability of the centroid itself) (default: 0)

    - confidence: probability (P value) of falling outside the confidence ellipse (default: 0.05)

    - point_plot: if point_plot ~=0 , also plots the individual points as

    symbols (inactivated if centroid_variability is set to 1)


    Useful in discriminant analysis and related methods.



    excel2bag

    excel2bag - reads an excel file and creates the corresponding text bag

    (array of characters)


    function [X,bag] = excel2bag(filename,ref_text_col,(nchar),(deb),(xend))

    Reads an excel file which has been saved in the .csv format (the delimiters are ';').

    The excel file includes the identifiers of rows and columns.


    Input arguments

    ===============

    filename : excel file in the '.csv' format

    nchar : number of characters read in each cell of the excel file

    (the others are ignored)

    ref_text_col : an array of strings giving the references of the columns designated as forming the columns of bag.d.

    THESE COLUMNS ARE DESIGNATED USING THE EXCEL STYLE ('AA','AB' ....)

    deb : number of the first row decoded

    xend : final row decoded


    output argument:

    ===============

    X: matrix of numerical values

    bag: a structure (bag.d,bag.i,bag.v), with bag.d being here a matrix of char


    X contains the numerical values from the excel file, with the exception of the columns referenced by ref_text_col

    bag contains the character values from the excel file, referenced by ref_text_col


    Example:

    [value,bag]=excel2bag('olive',['A'; 'B'],20)

    The columns 'A' and 'B' from excel are read as text (in output bag);

    the other columns are read as numbers (in value), respecting the saisir

    structure.


    This function is normally used in conjunction

    with "bag2group".



    excel2saisir

    excel2saisir - reads an excel text file

    function [saisir] = excel2saisir(filename,(nchar),(start),(xend))



    Input arguments

    ==============

    filename: (string) name of the text Excel file in .csv format

    nchar : (integer, optional) number of character kept in the identifiers

    (default : 20)

    start: (integer, optional) Index of the first observation to be loaded

    xend: (integer, optional) Index of the last observation to

    be loaded (greater than start)


    Reads an Excel file which has been saved in the .csv format (the delimiters are ';').

    The Excel file includes the identifiers of rows and columns.

    start: number of the first row decoded; xend: final row decoded.

    The Excel file must have the following layout (example):

    varname1 varname2 varname3

    obsname1 number11 number12 number13

    obsname2 number21 number22 number23

    obsname3 number31 number32 number33


    The decimal separator is the point (".") NOT THE COMMA (",")

    Example of .csv format (3 rows named "obs1", "obs2", "obs3"; 3 columns named

    "var1", "var2", "var3")

    data :

    =======================

    var1 var2 var3

    obs1 1 2 3

    obs2 4 5 6

    obs3 7 8 9

    =======================

    The corresponding .csv Excel file is:

    ;var1;var2;var3

    obs1;1;2;3

    obs2;4;5;6

    obs3;7;8;9



    fda1

    fda1 - stepwise factorial discriminant analysis on PCA scores

    function[fdatype]=fda1(pcatype, group, among, maxscore)



    Input arguments:

    ---------------

    pcatype: (structure) output argument of function "pca" applied on the predictive data

    set

    group : SAISIR vector of group (integer values). Identical numbers mean

    that the observations belong to the same group.

    among : (integer) Maximal rank (dimension) of PC-score allowed to enter in the model

    maxscore: (integer) Maximum number of scores allowed to enter in the model.


    Output arguments:

    ----------------

    fdatype with fields:

    introduced: rank of the PC scores introduced in the model

    ncorrect: number of correct classifications (no validation) at each step

    beta: projection coefficients such as datafactor=X*beta

    datafactor: discriminant scores

    centroidfactor: scores of the barycenters (centroids)

    eigenval: eigenvalues of the discriminant analysis

    confusion: confusion matrix (row actual; column predicted)

    average: average of the predictive data set.


    Performs a stepwise factorial discriminant analysis according

    to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).

    The basic idea is to perform a factorial discriminant analysis on the scores of

    a previous PCA. The criterion of score selection is the maximisation of

    the trace of T^(-1)B.

    In order to avoid using PC-scores with very small eigenvalues,

    the input argument "among" gives the maximal dimension to be allowed.

    "maxscore" indicates the maximal number of scores.

    "datafactor" corresponds to the final model

    (with maxscore scores introduced). If one is interested in a more

    economical model, it is easy, looking at the classification, to reduce the

    value in "maxscore" and re-run "fda1".


    Typical example:

    g=create_group1(wheat,1,3);%% creation of a grouping from the identifiers

    %names, using characters in position 1 to 3

    p=pca(wheat); %% first PCA

    res=fda1(p,g,20,5);%% model with 5 scores introduced among the 20 first

    %ones

    res.ncorrect.d

    % ans =

    % 13.00 1 PC score introduced

    % 31.00

    % 69.00

    % 77.00

    % 93.00 5 PC scores introduced

    colored_map1(res.datafactor,1,2,1,3)% map of the discrimination

    figure;

    ellipse_map(res.datafactor,1,2,g,1,0.05) % shown as confidence ellipses


    Note: the number of dimensions in datafactor is at most the number of

    qualitative groups minus 1

    (with 2 groups, only 1 discriminant dimension!).


    SEE ALSO maha3, maha6, plsda, quaddis, applyfda1.



    fda2

    fda2 - stepwise factorial discriminant analysis on PCA scores with verification

    function[res]=fda2(X, group, among, maxscore, selected)



    Input arguments:

    ---------------

    X: SAISIR matrix of the predictive data set (n x p)

    group : SAISIR vector of group (integer values). Identical numbers mean

    that the observations belong to the same group.

    among : (integer) Maximal rank (dimension) of PC-score allowed to enter in the model

    maxscore: (integer) Maximum number of scores allowed to enter in the model.

    selected: matlab vector (n x 1) with elements =0 (calibration set), or 1

    (validation set)


    Output arguments:

    ----------------

    res with fields:

    introduced: rank of the PC scores introduced in the model

    ncorrect: number of correct classifications in the calibration set at each step

    nscorrect: number of correct classifications in the validation (supplementary) set at each step

    beta: projection coefficients such as datafactor=X*beta

    datafactor: discriminant scores of the calibration set

    centroidfactor: scores of the barycenters (centroids) computed from the

    calibration set

    supscore: discriminant scores of the validation set

    eigenval: eigenvalues of the discriminant analysis

    confusion: confusion matrix (row actual; column predicted) in the

    calibration set

    average: average of the calibration set.

    sconfusion: confusion matrix (row actual; column predicted) in the

    validation (or "supplementary" set)


    Performs a stepwise factorial discriminant analysis according

    to Bertrand et al., J. of Chemometrics, Vol. 4, 413-427 (1990).

    The basic idea is to perform a factorial discriminant analysis on the scores of

    a previous PCA. The criterion of score selection is the maximisation of

    the trace of T^(-1)B.

    In order to avoid using PC-scores with very small eigenvalues,

    the input argument "among" gives the maximal dimension to be allowed.

    "maxscore" indicates the maximal number of scores.

    "datafactor" corresponds to the final model

    (with maxscore scores introduced). If one is interested in a more

    economical model, it is easy, looking at the classification, to reduce the

    value in "maxscore" and re-run "fda2".

    The collection is divided into a calibration and a validation set according to the

    elements of the input argument "selected"


    %Typical example:

    g=create_group1(data,1,3);%% creation of a grouping from the identifiers

    %names, using characters in position 1 to 3

    %random selection of 1/3 in the validation set

    res=fda2(data,g,10,5,random_select(size(data.d,1),round(size(data.d,1)/3)));


    SEE ALSO maha3, maha6, plsda, quaddis, applyfda1, fda1.



    find_index

    find_index - find the index corresponding to the closest "value"

    function index=find_index(str,value);


    useful for finding the wavelength index in strings


    Input argument:

    ==============

    str: an array of characters which can be interpreted as a vector of numbers

    when using a command such as vector=str2num(str);

    value: a numerical value normally in the range given by vector.


    Output argument:

    ===============

    Index (rank) of the variable


    Example of use:

    index=find_index(spectra.v,1104);

    Finds the index of the variable in "spectra" closest to the value 1104.

    Important note: the numbers obtained with str2num(str) are supposed to be sorted

    (this is normally the case with spectral data).
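
    The idea of a closest-value lookup can be sketched in plain MATLAB as follows. The identifiers and the searched value are hypothetical, and the exhaustive search used here is an assumption made for illustration; the SAISIR function may rely on the sorted order instead.

```matlab
% Hypothetical variable identifiers, one per row, as in a field such as X.v
v = ['1100'; '1102'; '1104'; '1106'];

vector = str2num(v);                      % char array -> numeric column vector
value  = 1104.7;                          % value to be located

% Index of the identifier closest to "value"
[gap, index] = min(abs(vector - value));
disp(index)                               % rank of the closest variable
```

    Here the closest identifier is '1104', so index is 3.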



    find_max

    find_max - gives the indices of the max value of a MATLAB Matrix

    function [row,col,value]=find_max(matrix)


    Input argument

    ==============

    matrix: MATLAB matrix


    Output arguments

    ===============

    row, col: indexes of the row and column of the maximum, respectively

    value: value of the maximum


    see also find_min
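
    The same three outputs can be obtained in plain MATLAB with built-in functions, as in the minimal sketch below. The matrix is hypothetical, and this is an illustration of the result, not necessarily how find_max is written.

```matlab
matrix = [1 7 3; 9 2 8];                  % hypothetical MATLAB matrix

[value, k] = max(matrix(:));              % maximum and its linear index
[row, col] = ind2sub(size(matrix), k);    % linear index -> row and column
% Here row = 2, col = 1 and value = 9
```

    Replacing max by min gives the corresponding result for the minimum.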



    find_min

    find_min - gives the indices of the min value of a MATLAB Matrix

    function [row,col,value]=find_min(matrix)



    Input argument

    ==============

    matrix: MATLAB matrix


    Output arguments

    ===============

    row, col: indexes of the row and column of the minimum, respectively

    value: value of the minimum


    see also find_max



    find_peaks

    find_peaks - finds and displays peaks greater than a threshold value

    function [vect, index]=find_peaks(X,nrow,threshold,windowsize,min_max)



    Input arguments

    ==============

    X:SAISIR data matrix of "spectra-like" data

    nrow:(integer): index of the row to be studied

    threshold: peaks of absolute value lower than the threshold are not

    detected

    windowsize (integer, preferably odd number): size of the moving window

    in which the peaks are to be found

    min_max: either 0 : only maximum detected, or 1: maximum and minimum detected


    Output argument

    ==============

    vect: matlab vector of the positions (name of variable converted into

    numbers)

    index: matlab vector of the index of the variables corresponding to peaks


    Inside a moving window of size "windowsize" (data points)

    detects the maximum (or maximum and minimum values). The identified

    positions are considered as "peaks" and shown on the display.

    The corresponding variables identifiers (normally wavelengths, or

    retention time values) are converted into numbers and given in the output

    argument "vect".


    The combination of threshold and moving window prevents a long series of

    spurious peaks from being identified when the studied curve is not perfectly smooth.

    The "windowsize" indicates the minimum gap (in data points) between two

    consecutive peaks. The threshold restricts detection to peaks greater than a given value.
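
    The interplay of threshold and moving window can be illustrated with a minimal sketch in plain MATLAB. It handles maxima only (the case min_max = 0), uses a hypothetical curve, and assumes a window centered on each data point; the actual find_peaks function may treat window edges and plateaus differently.

```matlab
y          = [0 1 3 1 0 0.2 0.1 0 2 5 2 0];   % hypothetical curve
threshold  = 1.5;                             % smaller maxima are ignored
windowsize = 3;                               % odd window size, in data points
half       = floor(windowsize / 2);

index = [];                                   % indices of the detected peaks
for i = 1:numel(y)
    lo = max(1, i - half);                    % window limits, clipped at the edges
    hi = min(numel(y), i + half);
    % A peak is above the threshold and is the maximum of its window
    if y(i) > threshold && y(i) == max(y(lo:hi))
        index(end + 1) = i;
    end
end
disp(index)                                   % positions of the peaks
```

    With these values, the small bump at position 6 is rejected by the threshold and only positions 3 and 10 are reported.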



    group_centering

    group_centering - Centers data according to groups

    function X1=group_centering(X,group);



    Input arguments

    ==============

    X: SAISIR matrix (n x p)

    group: SAISIR vector of group (integer, n x 1). Identical values

    in "group" indicate that the corresponding observations belong to the same group.


    Output argument

    ===============

    X1:SAISIR matrix (n x p) group-centered.


    For each group, as defined by the input argument "group", the function computes the

    average observation (1 x p). This average is subtracted from all

    observations belonging to this group.


    One use of this function is the centering of sensory data according to each panellist.
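
    The computation described above can be sketched in plain MATLAB as follows. The data and group numbers are hypothetical, and ordinary arrays are used instead of SAISIR structures.

```matlab
X     = [1 10; 3 14; 5 20; 9 30];     % hypothetical data (n x p)
group = [1; 1; 2; 2];                 % group number of each observation

X1     = zeros(size(X));
labels = unique(group);
for g = 1:numel(labels)
    rows = (group == labels(g));      % observations of the current group
    m    = mean(X(rows, :), 1);       % average observation of the group (1 x p)
    X1(rows, :) = X(rows, :) - repmat(m, sum(rows), 1);
end
disp(X1)                              % each group now has a zero mean
```

    After centering, the column means within each group are zero, so differences between groups (for example between panellists) are removed.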





    group_mean

    group_mean - gives the means of groups of rows

    function X1=group_mean(X,startpos,endpos)



    Input arguments

    ===============

    X: SAISIR data matrix ( n x p)

    startpos, endpos : (integers) character positions in the identifiers giving the key

    for building the qualitative groups.


    Output argument

    ===============

    X1: matrix of averages of groups (k x p) with k the number of found

    groups.


    This function uses the identifiers for creating groups.

    It creates as many groups as there are different strings from startpos to endpos.

    The function gives the matrix of averages according to groups

    (barycenters).




    html_header

    html_header - builds the first part of the documentation

    function html_header(fid)


    This function has no direct use

    It is used for beginning the HTML documentation of SAISIR



    html_notice

    html_notice - prints the help of functions in the HTML document

    function html_notice(fid,function_name);


    This function builds up the core of the documentation (which is the

    gathering of individual helps).

    It also builds the necessary hypertext links.


    No direct use




    html_postface

    html_postface - finishes the work in the HTML documentation

    function html_postface(fid);


    No direct use



    issaisir

    issaisir - tests if the input argument is a SAISIR matrix

    test=issaisir(X);



    Input argument:

    ==============

    X: anything


    Output argument:

    ===============

    test: (boolean) "true" if X is a SAISIR structure, "false" otherwise.


    SEE also : saisir_check



    labelled_hist

    labelled_hist - draws a histogram in which each obs. name is considered as a label

    function labelled_hist(X,col,startpos,endpos,(nclass),(charsize),(car))


    Input arguments

    ===============

    X : SAISIR data matrix

    col : the column from which the histogram is drawn

    startpos and endpos : the position in the row identifier strings considered as keys for coloration

    nclass : number of desired classes

    charsize : the size of character on the graph

    str : (optional) if str (a string) is defined, all the observations are represented with

    this string colored differently according to the extracted key.

    by choosing str='--' (for example),it is possible to avoid overlapping identifiers.


    This function builds a histogram in which each observation is represented

    by a colored code (key) extracted from the row identifiers.


    Note: It is generally necessary to adjust "nclass" and "charsize" to obtain a readable histogram





    leave_one_out_pls1

    leave_one_out_pls1 - PLS1 with leave-one-out validation

    function res=leave_one_out_pls1(X,y,ndim);



    Input arguments:

    ===============

    X:SAISIR matrix of predictive variables (n x p)

    y:Saisir vector of the variable to be predicted (n x 1)

    ndim: (integer) maximal number of tested PLS dimensions


    Output arguments:

    ===============

    res with fields

    predy: SAISIR matrix of predicted y in leave-one-out (n x ndim) for all

    the dimensions tested.

    rmse: Root mean square error (1 x ndim) for all the dimensions tested

    r2: r2 value between observed and predicted y for all the dimensions

    tested.

    optimal_error: (double) minimal rmse among all the dimensions tested.

    optimal_dim: (integer) PLS dimension giving the best model.

    optimal_r2: r2 value for the best model.


    The function leaves out one observation and builds a model with the

    remaining observations. The left-out observation is then predicted.

    This procedure is carried out for the n rows in X.


    This function is very slow and should be used only for small data sets

    (typically fewer than 30 observations). Otherwise, a validation procedure

    should be preferred.
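
    The leave-one-out loop itself can be sketched in plain MATLAB as shown below. To keep the example self-contained, an ordinary least-squares model replaces the PLS1 model; this substitution, the data values and the variable names are assumptions made for illustration only.

```matlab
% Hypothetical data: 6 observations, 2 predictors
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5];
y = [5.1; 3.9; 11.2; 9.8; 17.1; 15.9];

n     = size(X, 1);
predy = zeros(n, 1);
for i = 1:n
    keep    = true(n, 1);
    keep(i) = false;                       % leave observation i out

    % Model built on the n-1 remaining observations
    % (least squares with intercept, standing in for PLS1)
    A = [ones(n - 1, 1) X(keep, :)];
    b = A \ y(keep);

    % Prediction of the left-out observation
    predy(i) = [1 X(i, :)] * b;
end
rmse = sqrt(mean((y - predy).^2));         % leave-one-out error
disp(rmse)
```

    One model is fitted per observation, so the cost grows with n; in the SAISIR function this loop is additionally repeated for every tested dimension, which explains the slowness noted above.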



    leave_one_out_pls2

    leave_one_out_pls2 - PLS2 with leave-one-out validation

    function res=leave_one_out_pls2(X,y,ndim);



    Input arguments:

    ===============

    X:SAISIR matrix of predictive variables (n x p)

    y:SAISIR matrix of the variables to be predicted (n x k)

    ndim: (integer) maximal number of tested PLS dimensions


    Output arguments:

    ===============

    res with fields

    col: vector of k cells. The cell res.col{i} contains the y predicted values

    (n x ndim) associated with the variable i, for all the PLS dimensions

    tested.

    RMSEV:root mean square error (k x ndim) for all the variables (rows) and all the

    dimensions (columns).

    r2: r2 values (k x ndim) for all the variables (rows) and all the

    dimensions (columns).


    The function leaves out one observation and builds a model with the

    remaining observations. The y values of the left-out observation are then predicted.

    This procedure is carried out for the n rows in X.


    This function is very slow and should be used only for small data sets

    (typically fewer than 30 observations). Otherwise, a validation procedure

    should be preferred.



    list

    list - lists rows (only with a small number of columns)

    function list(X,(start))


    start: starting index in the list (default : 1)



    lr1

    lr1 - Latent root regression

    function [lr1type]=lr1(X,y,maxdim,(ratioxy))


    Computes a basic latent root model


    Input arguments:

    ================

    X: SAISIR matrix of predictive variables (n x p)

    y: SAISIR vector of variables to be predicted (n x 1)

    maxdim (integer): maximal number of dimensions introduced in the model

    ratioxy (float) : positive number between 0 and 1

    giving the relative importance of x and y. 1: x important; 0: x not

    important


    Output argument:

    ================

    lr1type with fields:

    predy: predicted y for all the models, up to maxdim dimensions (n x maxdim)

    corrcoef: correlation coefficients between y and predicted y (1 x maxdim)

    beta: regression coefficients for all the models (p x maxdim)

    averagey: average of y

    averagex: average of x

    ratioxy: copy of parameter ratioxy



    maha

    maha - simple discriminant analysis forward introducing variables no validation samples

    function[res]=maha(X,group,maxvar)



    Input arguments:

    ----------------

    X :SAISIR matrix (n x p) of predictive variables

    group :SAISIR vector (n x 1) of integers indicating the group. Two observations

    belonging to the same group have the same group number

    maxvar : integer indicating the maximum number of variables to be

    introduced.


    Output arguments:

    -----------------

    res with fields:

    step with fields

    index: vector of integers (1 x maxvar) giving the indices of the

    selected variables

    correct: vector of integer ( 1 x maxvar) giving the number of

    correct classifications at each step

    name: identifiers of the introduced variables (matrix of char with

    maxvar rows)

    classed: predicted groups in the final step (SAISIR vector of integers

    n x 1)


    The function performs a simple quadratic discriminant analysis introducing

    up to maxvar variables.

    At each step, the most discriminating variable (according to the percentage of correct

    classification) is introduced. Selection is forward only.

    The function makes use of the Matlab function "classify".



    maha1

    maha1 - forward linear discriminant analysis DIRECTLY ON DATA with validation samples

    function[discrtype1]=maha1(calibration_data,calibration_group,maxvar,test_data,test_group)



    Input arguments

    ==============

    calibration_data: SAISIR matrix (n1 x p)

    calibration_group:SAISIR vector of group numbers (n1 x 1)

    maxvar:(integer) maximum number of variables introduced in the model

    test_data:SAISIR matrix (n2 x p)

    test_group: SAISIR vector of group numbers (n2 x 1);


    Output argument:

    ===============

    discrtype1 with fields:

    step with fields

    index: vector giving the rank of the variable introduced at each step

    (maxvar values)

    correct: vector giving the number of correct classifications in the

    calibration set (maxvar values)

    name: identifiers of the variables introduced in the models (matrix of

    char with maxvar rows)

    ntestcorrect: vector giving the number of correct classifications in the

    validation set (maxvar values)

    classed: SAISIR vector of the predicted group numbers of the calibration

    set (n1 x 1). Only the result of the final step is given

    testclassed:SAISIR vector of the predicted group numbers of the test set (n2 x 1)

    Only the result of the final step is given


    The function performs a simple linear discriminant analysis introducing

    up to maxvar variables.

    At each step, the most discriminating variable (according to the percentage of correct

    classification of the calibration set) is introduced. Selection is forward only.

    Uses the Matlab function "classify".


    %Typical example


    mydis=maha1(wheat1,g1,5,wheat2,g2)

    disp(mydis.step.ntestcorrect);%Looking at the number of correct

    %classifications in the test set



    maha3

    maha3 - simple discriminant analysis forward introducing variables no validation samples

    function[discrtype]=maha3(X,group,maxvar)



    Input arguments:

    ----------------

    X :SAISIR matrix (n x p) of predictive variables

    group :SAISIR vector (n x 1) of integers indicating the group. Two observations

    belonging to the same group have the same group number

    maxvar : integer indicating the maximum number of variables to be

    introduced.


    Output arguments:

    -----------------

    discrtype with fields:

    ncorrect: vector of integers (1 x maxvar) indicating the number of

    correct classifications at each step.

    classed: SAISIR vector of integers (n x 1) indicating the predicted group number

    for the model with maxvar variables introduced.

    confusion: SAISIR confusion matrix (row: actual group, column: predicted group).

    varrank: vector of integers (1 x maxvar) indicating the index (rank) of the

    introduced variables.


    The function computes a linear discriminant analysis introducing up to

    maxvar variables.

    At each step, the most discriminating variable, i.e. the one maximising the

    trace of inv(T)*B, is introduced (in the usual notation, T is the total

    covariance matrix and B the between-group covariance matrix).
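
    A forward selection driven by the trace criterion can be sketched as follows. This is an illustration on plain matrices, not the SAISIR code: T and B are taken here as the total and between-group covariance matrices (both divided by n), which is an assumption about the exact scaling, and the toy data are invented.

```matlab
% Forward selection maximising trace(inv(T)*B) (illustrative sketch).
X = [1 1  0.0; 2 2  0.1; 3 3 -0.1; ...
     1 2  5.0; 2 3  5.1; 3 4  4.9];        % n x p toy data
group = [1; 1; 1; 2; 2; 2];                 % group numbers
maxvar = 2;
[n, p] = size(X);
Xc = X - repmat(mean(X, 1), n, 1);          % centred data
M = zeros(n, p);                            % each row replaced by its group mean
labels = unique(group);
for g = 1:numel(labels)
    idx = (group == labels(g));
    M(idx, :) = repmat(mean(Xc(idx, :), 1), sum(idx), 1);
end
T = (Xc' * Xc) / n;                         % total covariance matrix
B = (M' * M) / n;                           % between-group covariance matrix
selected = [];
for step = 1:maxvar
    best = -Inf;
    bestvar = 0;
    for j = setdiff(1:p, selected)          % candidates not yet introduced
        cand = [selected j];
        crit = trace(T(cand, cand) \ B(cand, cand));
        if crit > best
            best = crit;
            bestvar = j;
        end
    end
    selected = [selected bestvar];          % forward only: never removed
end
```

    In this toy data set the third column separates the two groups almost perfectly, so it is introduced first.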



    maha4

    maha4 - simple discriminant analysis forward

    function[discrtype]=maha4(X,group,maxvar)



    Input arguments:

    ----------------

    X :SAISIR matrix (n x p) of predictive variables

    group :SAISIR vector (n x 1) of integers indicating the group. Two observations

    belonging to the same group have the same group number

    maxvar : integer indicating the maximum number of variables to be

    introduced.


    Output arguments:

    -----------------

    discrtype with fields:

    ncorrect: vector of integers (1 x maxvar) indicating the number of

    correct classifications at each step.

    classed: SAISIR vector of integers (n x 1) indicating the predicted group number

    for the model with maxvar variables introduced.

    confusion: SAISIR confusion matrix (row: actual group, column: predicted group).

    varrank: vector of integers (1 x maxvar) indicating the index ("rank") of the

    introduced variables.


    The function computes a linear discriminant analysis introducing up to

    maxvar variables. At each step, the new variable giving the highest number

    of correctly classified samples is introduced.



    maha6

    maha6 - simple discriminant analysis forward introducing variables with validation samples

    function[discrtype]=maha6(X,group,maxvar,selected)



    Input arguments:

    ===============

    X: SAISIR matrix (n x p) of data

    group: SAISIR vector of integer (n x 1) indicating the group number of the

    observations. Two observations sharing a same group number belong to the

    same qualitative group

    maxvar: integer giving the maximal number of variables introduced in the

    model

    selected: Matlab vector (n x 1), the elements of which are equal to 0

    (observation placed in the calibration set) or 1 (observation placed in

    the validation set).


    Output argument:

    ==============

    discrtype with fields

    ncorrect: vector of integers (1 x maxvar) giving the number of correctly

    classified observations in the calibration set.

    classed: SAISIR vector of integer giving the predicted group of the

    calibration set.

    confusion: matlab confusion matrix of the calibration set for the final

    step.

    sconfusion: matlab confusion matrix of the validation set for the final

    step.

    sclassed: SAISIR vector of integers giving the predicted group of the

    validation set.

    nscorrect: vector of integers (1 x maxvar) giving the number of correctly

    classified observations in the validation (supplementary) set.


    The function computes a linear discriminant analysis introducing up to maxvar variables.

    At each step, the most discriminating variable, according to the

    maximisation of the trace of inv(T)*B, is introduced.

    The collection is divided into calibration and validation samples according to "selected":

    selected = 0: sample placed in the calibration set; selected = 1: sample placed in the validation set.


    %Typical use

    %===========

    load data;

    g=create_group1(data,1,2); %% supposing that the identifiers contain a key

    %for forming the qualitative group in position of characters 1 and 2

    sel=random_select(size(data.d,1),round(size(data.d,1)/3));%% a third in

    %validation

    res=maha6(data,g,5,sel);

    xdisp('Evolution of correct classification in the calibration set',res.ncorrect);

    xdisp('Evolution of correct classification in the validation set', res.nscorrect);



    map

    map - graph of map of data using identifiers as names

    function map(X,col1,col2,(col1label),(col2label),(title),(charsize),(margin))


    Input arguments

    ---------------

    X: SAISIR matrix

    col1, col2 : index of the two columns to be represented

    col1label (optional): Label of the variable forming the X-axis

    col2label (optional): Label of the variable forming the Y-axis

    title (optional) : title of the graph

    charsize (optional) : size of the plotted characters

    margin (optional) : margin value extending the axes in order

    to cope with long identifiers (default value: 0.05)

    For French-speaking users, a synonymous function "carte" exists.

    Use "map" preferably.



    map3D

    map3D - Draws a 3D map

    function map3D(X,col1,col2,col3,(label1),(label2),(label3),(title),(charsize))



    Input arguments:

    ===============

    X: SAISIR matrix (n x p)

    col1, col2, col3: indices of the columns to be represented

    as X, Y and Z in the 3D plot (integers).

    label1,label2, label3: label of axes on X, Y, Z (optional, strings or vectors of

    char)

    charsize: size of the characters (optional, default :6)


    Synonym of "carte3D" (French name). Use "map3D" preferably.



    matrix2saisir

    matrix2saisir - transforms a Matlab matrix in a saisir structure

    X = matrix2saisir(data,(coderow),(codecol))



    Input arguments:

    ---------------

    data : Matlab matrix

    coderow (optional) : string (code added to the row identifiers)

    codecol (optional): string (code added to the variables identifiers)


    Output argument:

    ----------------

    X: SAISIR matrix with fields "d" (copy of "data"), "i" (identifiers of

    rows), "v" (identifiers of variables)


    SAISIR means "statistique appliquée à l'interpretation des spectres infrarouge"

    or "statistics applied to the interpretation of IR spectra"


    See the manual of SAISIR for understanding the rationale of this structure.


    %Typical use:

    %===========

    A=[1 2 3 4; 5 6 7 8];

    B=matrix2saisir(A,'row # ', 'Column # ');

    % >> B.d

    % ans =

    % 1 2 3 4

    % 5 6 7 8

    % >> B.i

    % ans =

    % row # 1

    % row # 2

    % >> B.v

    % ans =

    % Column # 1

    % Column # 2

    % Column # 3

    % Column # 4



    mdistance

    mdistance - computes distances between the two tables using metric "metric"

    function dis = mdistance(X1,X2,metric)



    Input arguments:

    ================

    X1: SAISIR matrix ( n1 x p)

    X2: SAISIR matrix (n2 x p)

    metric: SAISIR matrix (p x p) of the metric


    Output argument:

    ================

    dis: SAISIR matrix of distances between observations (n1 x n2) according to the metric "metric"


    example of use (computing Mahalanobis distances):

    ================================================

    data=matrix2saisir(rand(50,10));%% dummy data

    data1=center(data);%% centered data

    metric=matrix2saisir(inv(data1.d'*data1.d));

    dis=mdistance(data1,data1,metric);%% Mahalanobis distance between

    %observations
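
    The computation behind a distance "according to a metric" can be sketched on plain matrices as follows. The sketch assumes that the distance between two rows x1 and x2 is sqrt((x1-x2)*M*(x1-x2)'); whether mdistance returns this value or its square is not stated here, and the toy data are invented.

```matlab
% Distances between the rows of two tables under a metric M (illustrative sketch).
X1 = [0 0; 1 0];                  % n1 x p
X2 = [0 0; 0 2; 3 4];             % n2 x p
M = eye(2);                       % identity metric gives the Euclidean distance
n1 = size(X1, 1);
n2 = size(X2, 1);
D = zeros(n1, n2);
for i = 1:n1
    for j = 1:n2
        d = X1(i, :) - X2(j, :);  % difference between the two observations
        D(i, j) = sqrt(d * M * d');
    end
end
```

    Replacing M by the inverse of a covariance matrix turns the same loop into a Mahalanobis distance.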



    mfa

    mfa - Multiple factor analysis

    function res=mfa(collection);



    Input argument:

    ===============

    collection: VECTOR OF SAISIR structures

    Example of building a collection: col{1}=table1; col{2}=table2; col{3}=table3;

    in which table1, table2 and table3 are SAISIR structures.

    Each table must include the same observations, but not necessarily the same variables.

    WARNING! In this version, the variables are not normalised !


    Let n be the number of observations, and t the number of tables

    Let WHOLE be the matrix of appended tables normalized according to MFA (dimensions n x m)

    Let k be the rank of WHOLE


    Output arguments:

    ================

    res with fields:

    score (n x k) : scores of the individuals (compromise)

    eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)

    eigenval (1 x k) : eigenvalues of the PCA on WHOLE

    average (1 x m) : averages of the variables of WHOLE

    var_score (m x k) : scores of the variables

    proj {1xt cell} : projectors for computing the projection of new observations of each table

    first_eigenval (1 x t) : first eigenvalues of the individual PCAs on each table

    trajectory (q x k) : individual score of each row of each table (q = total number of rows in all the tables q=n*t)

    id_group (q x 1) : identification of the belonging of the observation_score into a given table

    table_score (t x k) : scores of the tables


    Information on MFA can be found in SPAD TM Version 5.0 (procedure AFMUL).
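
    The table weighting that characterises MFA can be sketched as follows. In the usual description of the method, each centred table is divided by the square root of the first eigenvalue of its own PCA before the tables are appended; the sketch uses the first singular value, which differs only by a factor common to all tables. This is an assumption about the general method, not a transcription of the SAISIR code, and the toy tables are invented.

```matlab
% MFA-style weighting of tables before a global PCA (illustrative sketch).
tables = {[1 2; 3 4; 5 7; 2 1], [10 0; 20 5; 40 1; 30 2]};   % same rows, toy data
n = size(tables{1}, 1);
whole = [];
for t = 1:numel(tables)
    Xc = tables{t} - repmat(mean(tables{t}, 1), n, 1);  % centre the table
    s = svd(Xc);                                         % singular values, largest first
    whole = [whole, Xc / s(1)];                          % weight by the first singular value
end
% Every block of "whole" now has a leading singular value of 1,
% so no single table dominates the first axis of the global PCA.
[U, S, V] = svd(whole, 'econ');
score = U * S;                                           % compromise scores
```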



    mir_style

    mir_style - changes the sign of the variables of MIR spectra

    function [names] = mir_style(names1)



    Input argument:

    ===============

    names1: matrix of char interpretable as numbers through "str2num"


    Output argument:

    ===============

    names: matrix of char interpretable as numbers through "str2num"


    The function changes the sign of the variable identifiers (normally wavenumbers)

    so that graphs of mid-infrared spectra have their usual direction.

    This may help spectroscopists to examine data and loadings.


    %Typical use:

    %============

    %Let us suppose that "midIR" is a SAISIR matrix of Mid-infrared spectra

    %with wavenumbers as variables identifiers

    midIR.v=mir_style(midIR.v);
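
    The sign change itself can be sketched with standard MATLAB functions. This is an illustration of the idea only; the actual implementation of mir_style is not shown in this documentation and the identifiers below are invented.

```matlab
% Negating numeric identifiers stored as a matrix of char (illustrative sketch).
names1 = ['4000'; '3998'; '3996'];        % toy wavenumber identifiers
values = str2double(cellstr(names1));     % char matrix -> numeric column vector
names = num2str(-values);                 % negate, then convert back to char
```

    Plotting against the negated values puts the highest wavenumbers on the left of the graph.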



    moving_average

    moving_average - Moving average of signals

    function X1=moving_average(X,window_size)



    Input arguments

    ===============

    X:SAISIR matrix of data (normally digitized signals such as spectra)

    window_size: integer giving the number of data points which are locally

    averaged (odd number).


    Output arguments

    ===============

    X1: matrix of averaged data


    The function replaces a given variable by its average in the range defined

    by "window_size".

    (window_size-1)/2 variables are lost at the beginning and the end of the signal.


    For example with window_size = 5

    a given variable x(i) of index i is replaced by the local average

    (x(i-2)+x(i-1)+x(i)+x(i+1)+x(i+2))/5
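
    The local averaging can be sketched on a plain matrix whose rows are signals. The sketch follows the description above (odd window, (window_size-1)/2 points lost at each end); the toy data and variable names are assumptions.

```matlab
% Moving average over the columns of a matrix of signals (illustrative sketch).
X = [1 2 3 4 5 6 7; 2 4 6 8 10 12 14];    % two toy signals, one per row
window_size = 5;                           % odd number of points
h = (window_size - 1) / 2;                 % half-width of the window
p = size(X, 2);
X1 = zeros(size(X, 1), p - 2 * h);         % h points are lost at each end
for i = (h + 1):(p - h)
    X1(:, i - h) = mean(X(:, (i - h):(i + h)), 2);   % local average around point i
end
```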



    moving_max

    moving_max - replaces the central point of a moving window by the maximum value

    function [X1] = moving_max(X,window_size)



    Input arguments

    ===============

    X:SAISIR matrix of data (normally digitized signals such as spectra)

    window_size: integer giving the number of data points on which the max

    value is computed.


    Output arguments

    ===============

    X1: matrix of local maxima


    The function replaces a given variable by the maximum value in the range defined

    by "window_size".

    (window_size-1)/2 variables are lost at the beginning and the end of the signal.


    For example with window_size = 5

    a given variable x(i) of index i is replaced by the local maximum of variables

    x(i-2); x(i-1); x(i); x(i+1); x(i+2)



    moving_min

    moving_min - replaces the central point of a moving window by the minimum value

    function [X1] = moving_min(X,window_size)



    Input arguments

    ===============

    X:SAISIR matrix of data (normally digitized signals such as spectra)

    window_size: integer giving the number of data points on which the min

    value is computed.


    Output arguments

    ===============

    X1: matrix of local minima


    The function replaces a given variable by the minimum value in the range defined

    by "window_size".

    (window_size-1)/2 variables are lost at the beginning and the end of the signal.


    For example with window_size = 5

    a given variable x(i) of index i is replaced by the local minimum of variables

    x(i-2); x(i-1); x(i); x(i+1); x(i+2)



    multiple_regression

    multiple_regression - Simple Multiple linear regression (all the variables)

    function res=multiple_regression(x,y);



    Input arguments

    ===============

    X: SAISIR matrix of predictive variables (n x p)

    y: SAISIR vector of observed y (n x 1);


    Output argument

    ===============

    res with fields

    ypred: predicted y

    beta0: intercept of the model (double)

    beta: regression coefficient

    r2: r2 value

    RMSEC: Root mean square error of calibration
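
    The quantities listed above can be obtained on plain matrices as sketched below. The number of degrees of freedom used for RMSEC (n - p - 1) is an assumption, as the convention used by multiple_regression is not stated here; the toy data are invented.

```matlab
% Multiple linear regression with intercept (illustrative sketch).
X = [1 2; 2 1; 3 4; 4 3; 5 5];             % n x p predictors (toy data)
y = [3; 4; 6; 9; 10];                      % n x 1 observed values
[n, p] = size(X);
b = [ones(n, 1) X] \ y;                    % least-squares solution
beta0 = b(1);                              % intercept
beta = b(2:end);                           % regression coefficients
ypred = beta0 + X * beta;                  % predicted y
ssres = sum((y - ypred) .^ 2);             % residual sum of squares
sstot = sum((y - mean(y)) .^ 2);           % total sum of squares
r2 = 1 - ssres / sstot;                    % determination coefficient
rmsec = sqrt(ssres / (n - p - 1));         % assumed degrees of freedom: n - p - 1
```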



    multiway_pca

    multiway_pca - Multi way principal component analysis

    function res=multiway_pca(collection);



    Input argument:

    ==============

    collection : ARRAY OF SAISIR matrices

    Example of building a collection: col{1}=table1; col{2}=table2; col{3}=table3;

    in which table1, table2 and table3 are SAISIR structures.

    Each table must include the same observations, but not necessarily the same variables.

    WARNING! In this version, the variables are not normalised !


    Let n be the number of observations, and t the number of tables

    Let WHOLE be the matrix of appended tables(dimensions n x m)

    Let k be the rank of WHOLE


    Output argument:

    ===============

    res with fields:


    score (n x k) : scores of the individuals (compromise)

    eigenvec ( m x k) : eigenvectors of the PCA on WHOLE (no direct use)

    eigenval (1 x k) : eigenvalues of the PCA on WHOLE

    average (1 x m) : averages of the variables of WHOLE

    var_score (m x k) : scores of the variables

    proj {1 x t cell} : projectors for computing the projection of new observations of each table

    trajectory (q x k) : individual score of each row of each table (q = total number of rows in all the tables)

    id_group (q x 1) : identification of the belonging of the observation_score into a given table

    table_score (t x k) : scores of the tables


    This function is identical to "mfa" except that each table is simply scaled so that its norm is equal to 1.



    nancor

    nancor - Matrix of correlation with missing data

    function[cor]=nancor(X1,X2)



    Input arguments:

    ================

    X1 and X2: SAISIR matrices dimensioned (n x p1) and (n x p2)

    respectively.


    Output argument:

    ================

    cor: SAISIR matrix (dimensioned p1 x p2).

    An element cor.d(a,b) is the correlation coefficient between the column a

    of X1 and b of X2.
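
    One common way of computing correlations in the presence of missing data is pairwise deletion: for each pair of columns, only the rows where both values are present are used. The sketch below illustrates this on plain matrices with NaN as the missing-value code; that nancor proceeds exactly in this way is an assumption.

```matlab
% Correlation matrix with pairwise deletion of missing values (illustrative sketch).
X1 = [1 2; 2 NaN; 3 6; 4 8; NaN 10];       % n x p1, NaN marks missing data
X2 = [2; 4; 6; NaN; 10];                   % n x p2
p1 = size(X1, 2);
p2 = size(X2, 2);
C = zeros(p1, p2);
for a = 1:p1
    for b = 1:p2
        ok = ~isnan(X1(:, a)) & ~isnan(X2(:, b));   % rows complete for this pair
        xa = X1(ok, a) - mean(X1(ok, a));
        xb = X2(ok, b) - mean(X2(ok, b));
        C(a, b) = (xa' * xb) / sqrt((xa' * xa) * (xb' * xb));
    end
end
```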




    normc

    normc - Normalize columns of a matrix.


    Syntax


    normc(M)


    Description


    NORMC(M) normalizes the columns of M to a length of 1.


    Examples


    m = [1 2; 3 4]

    n = normc(m)


    See also NORMR





    normed_pca

    normed_pca - PCA with normalisation of data

    function[pcatype]=normed_pca(X)


    Computes principal component analysis on normalised data


    Input arguments:

    ---------------

    X: SAISIR matrix


    Output arguments:

    ----------------

    pcatype with fields:

    score :PC score

    eigenvec :eigenvectors (loadings)

    eigenval :eigenvalues

    average :average observation

    var_score :scores of the variables

    std: standard deviations of the columns of X



    norm_col

    norm_col - divides each column by the corresponding standard deviation

    function [saisir] = norm_col(saisir1,(mode))


    Divides each column by the corresponding standard deviation.

    mode (optional): 0 or 1, for division by n-1 or by n respectively

    (default: 1)

    OBSOLETE: USE PREFERABLY FUNCTION "standardize"
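
    On a plain matrix, the division described above can be sketched with the standard MATLAB function std, whose second argument follows the same convention (0: division by n-1, 1: division by n). The toy data are invented and the columns are not centred, in line with the description.

```matlab
% Dividing each column by its standard deviation (illustrative sketch).
X = [1 10; 2 20; 3 60];                    % toy data, n x p
mode = 1;                                  % 1: divide by n; 0: divide by n-1
s = std(X, mode, 1);                       % standard deviation of each column
Xn = X ./ repmat(s, size(X, 1), 1);        % each column divided by its own value
```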



    nuee

    nuee - Nuee dynamique (KCmeans)

    function[res]=nuee(X,ngroup,(nchanged))


    Clusters the data into ngroup according to the KCmeans method ("nuée dynamique");


    Input arguments

    ==============

    X: SAISIR data matrix

    ngroup (integer): number of groups asked for

    nchanged (optional): the iterations stop when no more than nchanged

    observations changed group during the previous iteration.

    This saves computing time. nchanged must be small in comparison with

    the number of rows of X.


    Output argument

    ==============

    res with fields

    group: SAISIR vector of groups. Observations with the same group number

    have been classified in the same group.

    centre: barycenter of the groups.


    Warning: the function may reduce the number of groups
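
    The iterative principle can be sketched with a basic k-means loop on a plain matrix. This is a generic illustration, not the SAISIR implementation: the initial centres, the stopping threshold stop_at (playing the role of nchanged) and the toy data are assumptions.

```matlab
% Basic k-means iterations (illustrative sketch).
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];  % toy data, n x p
ngroup = 2;
centre = X([1 4], :);                      % assumed initial centres
stop_at = 0;                               % stop when no more than stop_at rows change
n = size(X, 1);
group = zeros(n, 1);
for iter = 1:100
    d = zeros(n, ngroup);
    for g = 1:ngroup
        delta = X - repmat(centre(g, :), n, 1);
        d(:, g) = sum(delta .^ 2, 2);      % squared distance to centre g
    end
    [dummy, newgroup] = min(d, [], 2);     % nearest centre for each observation
    changed = sum(newgroup ~= group);      % number of observations that moved
    group = newgroup;
    for g = 1:ngroup
        if any(group == g)                 % an empty group keeps its old centre
            centre(g, :) = mean(X(group == g, :), 1);
        end
    end
    if changed <= stop_at
        break;
    end
end
```

    A group that becomes empty during the iterations is one way in which the final number of groups can be smaller than requested.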



    num2str1

    num2str1 - Justified num2string

    function str=num2str1(vector,ndigit);



    Input arguments:

    ===============


    vector : Matlab vector of integers

    ndigit : positive integer


    This function transforms numbers into matrices of char.

    The function justifies the strings by adding zeros.


    If the first argument is a row vector, it is transposed


    % %Example

    % %=======

    x=[1 2 100];

    x1=num2str1(x,5);

    x1

    % 00001

    % 00002

    % 00100

    The main use of this function is to help building smart row names

    in SAISIR matrices using the system of extractable fields in the names.





    pca

    pca - principal component analysis on raw data

    function [pcatype]=pca(X,(var_score))


    Computes principal component analysis (on non-normalised data)


    Input arguments:

    ---------------

    X: SAISIR matrix

    var_score (optional): 0: only the scores of the observations are computed;

    1: the scores of the variables are also computed (default: 0)


    Output arguments:

    ----------------

    pcatype with fields:

    score :PC score

    eigenvec :eigenvectors (loadings)

    eigenval :eigenvalues

    average :average observation

    var_score : (if requested through the input argument "var_score") scores of the variables

    NOTE: the weights of the observations are equal to 1/(number of rows).

    ALL the possible scores are calculated.


    SEE ALSO : normed_pca, cumulate_covariance, covariance_pca,

    correlation_plot, apply_pca


    %typical example:

    %spectra: n x p, chemistry n x k

    p=pca(spectra);%% PCA

    map(p.score,1,2);%% PC plot 1-2

    correlation_plot(p.score,1,2, chemistry);%% correlation with chemistry
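
    For readers who want to see the underlying computation, a PCA on centred, non-normalised data with observation weights of 1/n can be sketched on a plain matrix as follows. The scaling of the scores and the variable names are assumptions and may differ from the fields returned by pca.

```matlab
% PCA on centred raw data through the covariance matrix (illustrative sketch).
X = [2 0 1; 0 2 0; 4 2 3; 2 4 1; 3 3 5; 1 1 2];   % toy data, n x p
n = size(X, 1);
average = mean(X, 1);                      % average observation
Xc = X - repmat(average, n, 1);            % centred data
C = (Xc' * Xc) / n;                        % covariance with weights 1/n
C = (C + C') / 2;                          % enforce exact symmetry
[V, D] = eig(C);
[eigenval, order] = sort(diag(D), 'descend');
eigenvec = V(:, order);                    % loadings, largest eigenvalue first
score = Xc * eigenvec;                     % scores of the observations
```

    With this scaling, the variance of each column of score (computed with weights 1/n) equals the corresponding eigenvalue.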




    pca1

    pca1 - assesses principal component analysis on raw data (case nrows>ncolumns)

    This function is not to be called directly.


    Use "pca" or "normed_pca"



    pca2

    pca2 - computes principal component analysis on raw data (case nrows>ncolumns)

    This function is not to be called directly.


    Use "pca" or "normed_pca"



    pcareconstruct

    pcareconstruct - reconstructs original data from a PCA model and a file of score

    function[res]=pcareconstruct(pcatype,score,nscore)



    Input arguments:

    ===============

    pcatype: output argument of function "pca"

    score: SAISIR matrix of scores (computed from the same pcatype model)

    nscore: number of components involved in the reconstruction of the data


    Output argument:

    ================

    res: SAISIR matrix of reconstructed data. The quality of the

    reconstruction depends on the number of scores introduced (variable

    "nscore").


    From the previously computed scores, the function rebuilds the original data matrix.
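
    The reconstruction amounts to multiplying the retained scores by the transposed loadings and adding back the average. The sketch below shows this on a plain matrix, with the PCA obtained through a singular value decomposition; the variable names and toy data are assumptions.

```matlab
% Rebuilding data from the first nscore principal components (illustrative sketch).
X = [2 0 1; 0 2 0; 4 2 3; 2 4 1; 3 3 5; 1 1 2];   % toy data, n x p
n = size(X, 1);
average = mean(X, 1);
Xc = X - repmat(average, n, 1);
[U, S, V] = svd(Xc, 'econ');
score = U * S;                             % scores of the observations
eigenvec = V;                              % loadings
nscore = 2;                                % number of components kept
Xhat = score(:, 1:nscore) * eigenvec(:, 1:nscore)' + repmat(average, n, 1);
err = norm(X - Xhat, 'fro');               % reconstruction error with nscore components
Xfull = score * eigenvec' + repmat(average, n, 1);   % all components: exact rebuild
```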



    pca_cano

    pca_cano - generalized canonical analysis after PCAs on each table

    function res=pca_cano(collection,ndim,graph);


    % ========================================================================

    Input arguments:

    ===============

    collection: array of SAISIR matrices with the same number of rows (see

    below)

    ndim : dimension of each individual PCA (must be less than the smallest

    number of variables)

    graph : if different from 0, displays examples of graphs


    Let n be the number of observations in each table, k the number of tables.

    Output argument:

    ===============

    res with fields

    compromise : PCA giving the compromise

    observation_score : scores of each observation of each table (nxk rows)

    id_group : groups identifying each observation in observation_score (for graph)

    projector : struct array giving the vectors allowing the projection of each data set

    score_correlation : correlations between compromise scores and table scores

    table_average : struct array giving the average of each original data table

    Adapted from: G. Saporta. Probabilités, analyse des données et statistiques.

    Edition Technip, page 192 and following.

    % ========================================================================


    The function first computes a PCA on each of the SAISIR matrices in

    "collection". Only the first ndim PC scores are kept for each matrix.

    Canonical analysis is then carried out on the series of scores.

    This procedure avoids inverting the original matrices. The use

    of few scores guarantees that the computation is feasible, even if the

    original matrices have collinear variables.



    Note :

    the argument collection is obtained for example by

    collection(1)=data1;collection(2)=data2; ...;collection(k)=datak;

    In which data1, data2, ..., datak are SAISIR matrices with the same number

    of rows.




    pca_cross_ridge_regression

    pca_cross_ridge_regression - PCA ridge regression with crossvalidation

    function[res]=pca_cross_ridge_regression(X,y,krange,selected)



    Input argument:

    ==============

    X: SAISIR matrix (n x p) of predictive variables

    y: SAISIR vector (n x 1) of the variable to be predicted

    krange: MATLAB vector of integers (1 x q)

    selected: MATLAB vector (n x 1) with elements = 0 (selected in

    calibration)

    or 1 ( selected in validation)


    Output argument:

    ==============

    res with fields:

    predy: predicted y in the validation set (q columns)

    obsy: observed y of the validation set (1 column)

    r2: r2 between observed and predicted values in the validation set

    (vector with q elements)

    rmsecv: root mean square error of validation (1 x q)

    ridgetype: calibration model (see function "pca_ridge_regression")


    This function divides a collection into a calibration set and a validation set

    using the input argument "selected". The pca_ridge_regression model is built on

    the calibration set and applied to the validation set.


    The function calculates as many ridge regression models as the number of

    elements in "krange".

    In ridge regression, the product X'X is replaced by X'X + kI.

    In the present function, k is in fact the eigenvalue of the corresponding PCA component.

    For example, if krange = [1 3 5], the tested values of k

    are the eigenvalues #1, #3, #5.

    The rationale is that a suitable k in ridge regression is very difficult to find;

    it is a good idea to test values in the range of the observed eigenvalues.


    Typical example

    ===============

    %Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1) the variable to be predicted.

    myrange=1:10; %% testing the first 10 eigenvalues

    sel=random_select(size(DATA.d,1), round(size(DATA.d,1)/3));%% 1/3 in validation

    [res]=pca_cross_ridge_regression(DATA,y,myrange,sel);

    xy_plot(res.predy,10, res.obsy,1);%% display of the 10th model



    %See also: pca_ridge_regression, ridge_regression, ridge_regression1,

    apply_ridge_regression



    pca_ridge_regression

    pca_ridge_regression - Basic ridge regression after PCA

    function[ridgetype]=pca_ridge_regression(pcatype,y,range)



    Input arguments

    ===============

    pcatype: structure, output argument of function "pca"

    y: variable to be predicted (n x 1)

    range: MATLAB vector of positive integers (1 x q)


    Output arguments:

    ================

    ridgetype with fields:

    beta: coefficients of the model (p x q), applicable on X

    krange: (1 x q) values of the coefficient k of ridge regressions

    averagex: average of original predictive variables (1 x p)

    averagey: average of y (1 x 1)

    rmsec: root mean square error of calibration (1 x q)

    predy: y predicted (n x q)

    corr: correlation coefficient (1 x q) between predicted and observed

    values


    The function calculates as many ridge regression models as the number of

    elements in "range".

    In ridge regression, the product X'X is replaced by X'X + kI.

    In the present function, k is in fact the eigenvalue of the corresponding PCA component.

    For example, if range = [1 3 5], the tested values of k

    are the eigenvalues #1, #3, #5.

    The rationale is that a suitable k in ridge regression is very difficult to find;

    it is a good idea to test values in the range of the observed eigenvalues.
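
    The idea of taking k among the eigenvalues can be sketched on plain matrices as follows. The sketch assumes that the eigenvalues are those of the centred cross-product matrix X'X; the exact scaling used by SAISIR (for instance a division by n) is not stated here, and the toy data are invented.

```matlab
% Ridge regression with k chosen among the eigenvalues of X'X (illustrative sketch).
X = [1 2 0; 2 1 1; 3 4 1; 4 3 3; 5 5 2; 6 4 5];   % toy predictors, n x p
y = [1; 2; 4; 5; 7; 8];                            % toy response
n = size(X, 1);
Xc = X - repmat(mean(X, 1), n, 1);         % centred predictors
yc = y - mean(y);                          % centred response
G = Xc' * Xc;
G = (G + G') / 2;                          % enforce exact symmetry
[V, D] = eig(G);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);
idx = 2;                                   % use eigenvalue #2 as ridge constant
k = lambda(idx);
% inv(G + k*I) = V * diag(1 ./ (lambda + k)) * V', hence:
beta = V * ((V' * (Xc' * yc)) ./ (lambda + k));
beta_direct = (G + k * eye(size(G, 1))) \ (Xc' * yc);   % same model, direct solve
```

    Working in the eigenvector basis makes it cheap to compute the models for several values of k, since only the diagonal term changes.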


    %Typical example

    ===============

    %Let DATA (n x p) be the SAISIR matrix of predictive variables and y (n x 1) the variable to be predicted.

    mypca=pca(DATA);

    myrange=1:20; %% testing the eigenvalues from 1 to 20

    res=pca_ridge_regression(mypca,y,myrange);

    xdisp(res.rmsec.d);%% displaying (for example) the errors


    See also ridge_regression, ridge_regression1, apply_ridge_regression,

    pca_cross_ridge_regression.




    pca_stat

    pca_stat - Gives some complementary statistics on PCA observations

    function res=pca_stat(pca_type, comp1, comp2);



    Input argument :

    ===============


    pca_type: output argument of function "pca"

    comp1, comp2: number of the PC components of the PCA to be analyzed


    Output argument:

    ===============

    res: SAISIR matrix with 7 columns: QLT, col1, CO2col1, CTRcol1, col2, CO2col2, CTRcol2

    QLT : squared cosine with the plane (quality of the representation of the

    observations)

    CO2col1 and CO2col2 : squared cosine of the angle between the observation and the axis

    We have QLT = CO2col1 + CO2col2

    CTRcol1 and CTRcol2 : contribution of the observation to the component.


    From G.Saporta, Probabilités analyse des données et statistiques, Ed Technip, page 182
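
    These statistics can be sketched on a plain matrix using their classical definitions with observation weights of 1/n. That columns "col1" and "col2" of the result hold the scores on the two components is an assumption, as are the variable names and the toy data.

```matlab
% Squared cosines and contributions of observations in a PCA (illustrative sketch).
X = [2 0 1; 0 2 0; 4 2 3; 2 4 1; 3 3 5; 1 1 2];   % toy data, n x p
n = size(X, 1);
Xc = X - repmat(mean(X, 1), n, 1);
[U, S, V] = svd(Xc, 'econ');
score = U * S;                             % scores on all components
eigenval = diag(S) .^ 2 / n;               % eigenvalues with weights 1/n
c1 = 1;
c2 = 2;                                    % components to be analysed
d2 = sum(score .^ 2, 2);                   % squared distance to the origin
co2_1 = score(:, c1) .^ 2 ./ d2;           % squared cosine with axis c1
co2_2 = score(:, c2) .^ 2 ./ d2;           % squared cosine with axis c2
qlt = co2_1 + co2_2;                       % quality of representation in the plane
ctr_1 = score(:, c1) .^ 2 / (n * eigenval(c1));   % contribution to component c1
ctr_2 = score(:, c2) .^ 2 / (n * eigenval(c2));   % contribution to component c2
res = [qlt score(:, c1) co2_1 ctr_1 score(:, c2) co2_2 ctr_2];
```

    With these definitions the contributions to a given component sum to 1 over the observations.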


    %Typical example:

    p=pca(DATA);

    res=pca_stat(p,1,2);%% stats for components #1 and #2

    saisir2excel(res,'pca results');%% to be looked at with Excel




    pcr

    pcr - PCR (components introduced in the order of eigenvalues)

    function [pcrtype]=pcr(X,y,maxdim)


    Fits a basic PCR model

    Input arguments

    ---------------

    X : SAISIR matrix (n x p)

    y : SAISIR vector (n x 1)

    maxdim : maximal dimensions of the model (integer)


    Output arguments

    ----------------

    pcrtype with fields

    pca: Structure giving the PCA results (see function "pca")

    beta: regression coefficients APPLICABLE ON THE PC SCORES (ndim x 1)

    predy: predicted y up to ndim dimensions (n x ndim)

    r2: determination coefficients between predicted and observed values (1 x

    ndim)

    averagey: mean of y (number)

    beta1:regression coefficients APPLICABLE ON THE X DATA centred (ndim x p)

    obsy: observed y (copy of input argument y)

    rmsec: root mean square error of calibration, with (n - dim - 1) degrees of freedom
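
    The outputs listed above can be related to each other with a short plain-MATLAB sketch of principal component regression. The toy data and all variable names are hypothetical, the PCA is obtained here by singular value decomposition, and the sketch is not the code of "pcr".

```matlab
% Illustrative sketch only (plain MATLAB), not the SAISIR implementation.
X = [1 2 1; 2 1 3; 3 4 2; 4 3 5; 5 6 4; 6 5 7];   % toy predictors (n x p)
y = [1; 2; 2; 4; 5; 5];                           % toy response (n x 1)
[n, p] = size(X);
Xc = X - repmat(mean(X, 1), n, 1);                % centred X
averagey = mean(y);
yc = y - averagey;
[U, S, V] = svd(Xc, 'econ');
T = U * S;                                        % scores, in decreasing order of eigenvalue
maxdim = 2;
% The scores are orthogonal, so each coefficient is obtained independently.
beta = (T(:, 1:maxdim)' * yc) ./ sum(T(:, 1:maxdim) .^ 2, 1)';
predy = zeros(n, maxdim);
beta1 = zeros(maxdim, p);
rmsec = zeros(1, maxdim);
for d = 1:maxdim
    predy(:, d) = averagey + T(:, 1:d) * beta(1:d);             % model with d components
    beta1(d, :) = (V(:, 1:d) * beta(1:d))';                     % same model, applied to centred X
    rmsec(d) = sqrt(sum((y - predy(:, d)) .^ 2) / (n - d - 1));
end
disp(rmsec);
```

    Row d of beta1 gives the same predictions as the first d elements of beta, but it is applied to the centred X data instead of the scores.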



    pcr1

    pcr1 - Basic model of PCR (components introduced in the order of eigenvalues)

    function [pcrtype]=pcr1(X,y,dim)



    Input arguments:

    ===============

    X: SAISIR matrix (n x p) of predictive data

    y: SAISIR matrix (n x k) of variables to be predicted (k variables)

    dim: (integer) dimension of the model


    Output argument

    ===============

    pcrtype with fields:

    pca: result of pca applied on X (see function "pca")

    beta: coefficients of the models (p x k) obtained with "dim" dimensions

    averagex: average of x (1 x p)

    averagey: average of y (1 x k)

    info: 'predicting several y with xxx dimensions'

    r2: determination coefficient for the ys (1 x k)

    predy: predicted y (n x k).


    This function builds a model for each of the variables in "y". The models

    are built with exactly "dim" dimensions.


    See also: pcr (only one y, but the dimensions are scanned), apply_pcr



    plotmatrix1

    plotmatrix1 - biplots of columns of matrices with colors

    function plotmatrix1(s,startpos,endpos,charsize)




    pls

    pls - PLS regression (Partial Least Squares).

    function [mtBpls,mtYp,mtT,tbeta] = pls(mtX,mtY,nbdim)


    No direct use in SAISIR


    Input arguments

    ================

    mtX Matlab matrix of centered predictive data

    mtY Matlab matrix of Y-variables (centered).

    nbdim dimension of the PLS model


    Output arguments

    =================

    mtBpls PLS regression coefficients

    mtYp predicted Y

    mtT PLS scores



    pls2obs

    pls2obs - PLS regression with many observations

    function [mtBpls,mtT, tBeta] = pls2obs(mtX,mtY,nbdim)


    This function is normally not directly called



    pls2var

    pls2var - PLS regression with many variables

    function [mtBpls,mtT, tBeta] = pls2var(mtX,mtY,nbdim)


    This function is normally not directly called



    plsda

    plsda - Pls discriminant analysis following the saisir format

    function[plsdatype]=plsda(X,group,ndim)



    Input arguments:

    ===============

    X : SAISIR matrix of predictive variables (n x p)

    group: SAISIR (n x 1) vector of integers of groups. Observations with

    the same number in group belong to the same group (missing group not allowed)

    ndim: number of dimensions in the PLS model


    Output arguments

    ================

    plsdatype with fields

    beta :coeff for predicting the indicator matrix

    beta0 :intercept for predicting the indicator matrix

    t :PLS latent variable

    predy :predicted indicator matrix

    classed :predicted groups according to method #0

    ncorrect :number of rightly classified samples according to method #0

    (attribution to index of max of predicted Y)

    confusion :confusion matrix according to method #0

    ncorrect1 :number of rightly classified samples according to method #1

    (mahalanobis distance on latent variable t)

    confusion1 :confusion matrix according to method #1

    tbeta :coeff for predicting the latent variables t

    tbeta0 :intercept for predicting the indicator matrix

    linear :linear form for direct prediction of group

    linear0 :the min of x'*linear + linear0 gives the predicted group

    (this is equivalent with considering Mahalanobis distances)
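
    Classification method #0 (attribution to the index of the maximum of the predicted indicator matrix) can be illustrated in plain MATLAB. The sketch below does not perform any PLS regression: the matrix predy contains hypothetical predicted values, and the sketch only shows how the indicator matrix, the predicted groups and the confusion matrix relate to each other.

```matlab
% Illustrative sketch only (plain MATLAB); predy holds made-up values.
group = [1; 1; 2; 2; 3; 3];                       % known groups (n x 1)
n = numel(group);
ngroup = max(group);
Y = zeros(n, ngroup);
Y(sub2ind([n ngroup], (1:n)', group)) = 1;        % indicator matrix (one column per group)
predy = [0.8 0.1 0.1; 0.4 0.5 0.1; 0.2 0.7 0.1; ...
         0.1 0.6 0.3; 0.0 0.2 0.8; 0.3 0.3 0.4];  % hypothetical predicted indicator matrix
[~, classed] = max(predy, [], 2);                 % method #0: index of the max of each row
confusion = zeros(ngroup);
for i = 1:n
    confusion(group(i), classed(i)) = confusion(group(i), classed(i)) + 1;
end
ncorrect = sum(classed == group);                 % number of rightly classified samples
disp(confusion);
```

    In this toy case the second observation is attributed to group 2 instead of group 1, so 5 observations out of 6 are rightly classified.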



    quaddis

    quaddis - Quadratic discriminant analysis

    function quaddis_type=quaddis(x,group);


    Quadratic discriminant analysis

    (Training)

    A multinormal distribution is assumed in each qualitative group

    ===============================================================

    Input args:

    ============

    x: predictive data set (matrix n x p)

    group: qualitative groups (matrix n x 1) with integers ranging from 1

    to the maximum number of groups (gmax)


    Output args:

    ============

    quaddis_type with fields:

    ncorrect100: percentage of correct classification (number)

    confus : confusion matrix (gmax x gmax)

    mean : means according to each group (gmax x p)

    predgroup : predicted groups (integer) (n x gmax)

    density : pseudo-densities of each observation (n x gmax)

    proba : probability of belonging to a given group (n x gmax)


    model : predictive model with fields:

    inv: Matlab matrices of Mahalanobis metrics (cube p x p x gmax)

    Mut: Matlab matrices of means according to each group (gmax x p)

    det: Matlab vector of determinants of covariance matrices of each

    group (1 x gmax)

    See also apply_quaddis, crossval_quaddis



    quickpls

    quickpls - Quick PLS regression from 1 to ndim dimensions

    function [plstype]=quickpls(X,y,ndim)



    Input arguments:

    ===============

    X: SAISIR matrix (n x p) of predictive data

    y: SAISIR vector (n x 1) of variable to be predicted

    ndim: maximum dimensions of the model


    output argument:

    ===============

    plstype with fields:

    BETA: regression coefficient of the models (p x ndim)

    BETA0:intercept of the models (p x 1)

    PREDY: predicted y for all the models (n x ndim)

    T: PLS scores (n x ndim)

    RMSEC: root mean square error of calibration (1 x ndim)

    r2: r2 coefficient (1 x ndim)


    This function calculates the PLS models for 1 to ndim dimensions.

    All the models are kept

    The algorithm is not the NIPALS algorithm, but another one which is

    faster.


    This function makes use of function pls (normally in the directory "pls").


    see also: basic_pls, basic_pls2 (slower but giving more complete outputs)



    randomize

    randomize - Builds a matrix X1 containing the rows of X in random order

    function X1=randomize(X)



    Input argument

    =============

    X: SAISIR matrix (n x p)


    Output argument:

    X1: SAISIR matrix (n x p) with the rows randomly allocated


    This function randomly changes the rank (indices) of the observations.

    This is useful for validation tests, when comparing the results with those

    obtained by chance.



    random_saisir

    random_saisir - Creation of a random matrix

    function[X]=random_saisir(nrow,ncol)




    Input arguments

    ===============

    nrow, ncol: integers (number of rows and columns of the resulting matrix)


    Output argument

    ===============

    X: matrix of random elements (nrow x ncol)




    random_select

    random_select - building a vector of random elements 0 or 1

    function[selected]=random_select(nel, nselect, (nrepeat))



    Input arguments:

    ===============

    nel: (integer) number of elements in the output vector "selected"

    nselect: (integer smaller than nel) number of elements taking the value 1

    nrepeat: (integer, optional) number of consecutive replicates.

    Output arguments:


    ===============

    selected: MATLAB vector with nel elements equal to 1 or 0


    This function builds up a MATLAB vector of nel elements with nselect

    elements equal to 1 in random position, and (nel-nselect) equal to 0.

    nrepeat (optional) randomly selects nselect values, but organised by block of nrepeat groups

    For example, if nrepeat =3 a possible result is [0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 ...]


    This function is useful for dividing a collection into two sets, for

    example in many functions of SAISIR allowing a validation test.

    The case with "nrepeat" defined corresponds to the situation in which the

    replicates are in equal numbers and consecutive in the data collection.


    Typical use: randomly building a calibration and validation set

    ==============================================================

    [n,p]=size(DATA.d);

    sel=random_select(n,round(n/3));%% A third in validation

    cal=selectrow(DATA,sel==0);%% building the calibration set

    val=selectrow(DATA,sel==1);%% building the validation set
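
    For readers who want to see what such a selection vector looks like, here is a minimal plain-MATLAB sketch of the same idea. It is not the code of random_select; the column orientation, the variable names and the way blocks are counted in the "nrepeat" case (here a hypothetical number of selected blocks, nselblock) are assumptions.

```matlab
% Illustrative sketch only (plain MATLAB), not the SAISIR implementation.
nel = 12;                                  % number of elements
nselect = 4;                               % number of elements set to 1
selected = zeros(nel, 1);
perm = randperm(nel);                      % random order of the indices
selected(perm(1:nselect)) = 1;             % nselect ones in random positions

% Variant with consecutive replicates kept together (assumption: whole
% blocks of nrepeat elements are selected or rejected).
nrepeat = 3;
nblock = nel / nrepeat;                    % nel must be a multiple of nrepeat
nselblock = 2;                             % hypothetical number of selected blocks
blocksel = zeros(nblock, 1);
perm = randperm(nblock);
blocksel(perm(1:nselblock)) = 1;
selected_rep = kron(blocksel, ones(nrepeat, 1));   % expand each block to nrepeat elements
disp([selected selected_rep]);
```

    Keeping replicates in the same block avoids having replicates of one sample in both the calibration and the validation sets.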


    See also: random_splitrow




    random_splitrow

    random_splitrow - random selection of rows

    function[X1,X2]=random_splitrow(X, nselect)



    Input argument

    =============

    X: SAISIR matrix (n x p)

    nselect: integer, less than n.


    Output arguments

    ================

    X1, X2: SAISIR matrices of the resulting split


    This function randomly divides a matrix in two matrices:

    X1: with nselect rows, and X2 with n-nselect rows


    Typical use : building a calibration and a validation set

    ========================================================

    [n,p]=size(DATA.d);

    [cal val]=random_splitrow(DATA, round(n*2/3));%% two thirds in calibration

    % cal and val are respectively the calibration and validation sets



    readexcel1

    readexcel1 - reads an excel file in the .CSV format (create a 3way character matrix).

    function [data] = readexcel1(filename,(nchar),(deb),(xend))



    !!! NO DIRECT USE. Use function "excel2saisir"


    Input arguments

    ===============

    filename: name of an excel file saved in the .CSV format.

    nchar: nchar is the length of the element data(i,:,j), ie the number of

    characters which are kept (default: 20)

    deb, xend: first and last rows which are loaded. (default : all)


    Output argument:

    ===============

    data: 3-way character array data(row,pos,col)

    where row is the excel rows, col the excel columns,

    and pos is the character in the string


    If the string is shorter than nchar, it is padded with white space.

    If the string is longer than nchar, the end of the string is lost.


    This is a first step for decoding data coming from excel


    %Example:

    %========

    mywork=readexcel1('work1.csv',15);


    See also: excel2saisir, saisir2excel



    readident

    readident - loads a file of strings

    function [ident, nident] = readident(filename,namesize)



    Input arguments

    ==============

    filename: (string) file name of the text file in the current directory

    namesize: (integer) maximum size of the string to be read (default : 10)


    Output arguments:

    ================

    ident: matrix of char

    nident:number of rows in ident (identifiers)


    Loads an array of string in a matrix format

    namesize gives the maximum number of characters in each string


    Main use: loading identifiers of rows and variables from a text file




    regression_score

    regression_score - build a factorial space for regression

    function res=regression_score(x,beta,(y))


    ------------------

    Input arguments:

    ===============

    x : data matrix (n x p)

    beta : vector of regression coefficients (p x 1)

    y :(optional) known y value


    Output argument:

    ===============

    res with fields

    score : regression scores (also y split)

    reconstructed_norm2: squared norms of the scores

    cumulated_norm2: cumulated squared norms of the scores

    projector: matrix such that score=x*projector

    eigenvec_sum: sum of the eigenvectors of PCA (linked to the theory)

    xmean: mean of x

    r2 : if (y defined) r2 of the cumulated model


    Given a data matrix x and the regression coefficients beta,

    the function builds a matrix of orthogonal scores

    such that the predicted y is equal to the sum of these scores.

    The scores can be used to examine the observations "oriented" towards the

    prediction of y.

    As the scores are ranked according to their ability to predict y,

    it is possible to examine the observations starting with the first scores.



    reorder

    reorder - reorders the data of files A1 and A2 according to their identifiers

    function [B1 B2]=reorder(A1,A2)



    Input arguments:

    ===============

    A1, A2: SAISIR matrices in which the rows have at least some identifiers in

    common


    Output arguments

    ================

    B1, B2: reordered matrices.


    This function makes it possible to realign the rows of A1 and A2, in order

    to have the identifiers corresponding.

    This is necessary for any predictive method (particularly regressions).

    The function discards the observations which are not present in both A1 and A2.

    The matrix B1 corresponds to A1 and matrix B2 to A2

    Fails if A1 or A2 contains duplicate identifiers of rows.

    A2 is leader (the rows of B1 follow the order of A2 as closely as possible)
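
    The realignment described above can be sketched with the standard MATLAB function ismember. This is only an illustration: the identifiers are stored here in hypothetical cell arrays of strings (SAISIR stores them as character matrices, which cellstr can convert), and the data are plain matrices, not SAISIR structures.

```matlab
% Illustrative sketch only (plain MATLAB), not the SAISIR implementation.
id1 = {'s1'; 's2'; 's3'; 's4'};   A1 = [10; 20; 30; 40];   % first table and its row identifiers
id2 = {'s3'; 's1'; 's5'; 's2'};   A2 = [3; 1; 5; 2];       % second table (the "leader")
[found, loc] = ismember(id2, id1);   % for each row of A2: is it in A1, and at which position
B2 = A2(found, :);                   % rows of A2 also present in A1, in the order of A2
B1 = A1(loc(found), :);              % matching rows of A1, in the same order
idB = id2(found);                    % common identifiers
disp(idB');
```

    Rows 's4' (only in A1) and 's5' (only in A2) are discarded, and B1 and B2 end up with identical row identifiers.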


    %Typical example:

    %===============

    %Let X and y be the matrices to be reordered

    [X1, y1]=reorder(X,y);

    %In X1 and y1 the rows now have the same identifiers (possibly with some

    %loss of observations).

    %

    %If the function fails because some identifiers are duplicated, use the

    %function "check_name" to identify these duplicated identifiers, and remove some of them




    repeat_string

    repeat_string - builds a matrix of char by repeating a string

    function str1=repeat_string(str,ntimes);


    Input arguments:

    ===============

    - str : a character string

    - ntimes: number of repetition

    Output argument:

    ================

    - str1 the matrix of char with the repeated string.

    example:

    >> repeat_string('Vanessa',3)

    ans =

    Vanessa

    Vanessa

    Vanessa

    Useful for building identifiers in SAISIR


    See also: addcode



    ridge_regression

    ridge_regression - Basic ridge regression

    function [ridgetype]=ridge_regression(X,y,krange)



    Input arguments:

    ===============

    X: SAISIR matrix of predictive variables (n x p)

    y: SAISIR vector of observed y (n x 1)

    krange: MATLAB vector of k-values to be tested in the ridge regression


    Let ntest = length(krange)


    Output argument:

    ===============

    ridgetype with fields

    beta: beta coefficients associated with the ntest k-values as defined in "krange"

    averagex: average of X

    averagey: average of y

    rmsec: Root mean square error of calibration (ntest x 1 )

    predy: predicted y for each test k-value (n x ntest)

    r2: r2 for each tested k-value (ntest x 1);



    ridge_regression1

    ridge_regression1 - Basic ridge regression at a given norm

    function [ridgetype]=ridge_regression1(X,y,normrange)


    ONLY ONE VARIABLE TO BE PREDICTED (scan the dimensions)

    returns as many beta as the number of elements in normrange


    Input arguments:

    ===============

    X: SAISIR matrix of predictive variables (n x p)

    y: SAISIR vector of observed y (n x 1)

    normrange: tested range of norms of beta (MATLAB vector of positive doubles)


    Let ntest = length(normrange)


    Output argument:

    ===============

    ridgetype with fields

    beta: beta coefficients associated with the ntest norms as defined in "normrange"

    averagex: average of X

    averagey: average of y

    rmsec: Root mean square error of calibration (ntest x 1 )

    predy: predicted y for each test k-value (n x ntest)

    r2: r2 for each tested norm of beta (ntest x 1);

    k : MATLAB vector of resulting k values (ntest x 1)

    expected norm: MATLAB vector of expected norms (copy of normrange)


    This function carries out as many ridge regressions as the number of

    elements in "normrange".

    Rather than (as usual) trying to find the k-value of the ridge regression, here it is

    directly the norm of the regression coefficients beta which is the

    adjusted value. To each norm, there is a corresponding value of k.
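
    The correspondence between a norm and a value of k can be illustrated as follows. The norm of the ridge coefficients decreases steadily from the OLS norm (k = 0) towards 0 as k increases, so the k giving a requested norm can be found by a simple bisection. The sketch below is plain MATLAB on hypothetical toy data; the search method actually used by ridge_regression1 is not documented here, and bisection is only one possible choice.

```matlab
% Illustrative sketch only (plain MATLAB); bisection is an assumed search method.
X = [1 2 1; 2 1 3; 3 4 2; 4 3 5; 5 6 4; 6 5 7];   % toy predictors (n x p)
y = [1; 2; 2; 4; 5; 5];                           % toy response (n x 1)
[n, p] = size(X);
Xc = X - repmat(mean(X, 1), n, 1);
yc = y - mean(y);
A = Xc' * Xc;
b = Xc' * yc;
olsnorm = norm(A \ b);                 % norm of beta for k = 0 (maximum possible norm)
target = olsnorm / 2;                  % requested norm: half the OLS norm
klow = 0;
khigh = 1;
while norm((A + khigh * eye(p)) \ b) > target
    khigh = 2 * khigh;                 % enlarge the interval until it brackets the target
end
for iter = 1:100
    k = (klow + khigh) / 2;
    if norm((A + k * eye(p)) \ b) > target
        klow = k;                      % norm still too large: k must increase
    else
        khigh = k;                     % norm too small: k must decrease
    end
end
beta = (A + k * eye(p)) \ b;
fprintf('OLS norm = %g, k = %g, norm of beta = %g\n', olsnorm, k, norm(beta));
```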


    %Typical example:

    %===============

    ridgetype=ridge_regression1(X,y,[100, 200]);

    %The function displays the Ordinary Least Squares norm of beta

    %"OLS norm = 1234.5678"

    %This value gives the maximum possible value of the norm

    %For example, testing half this norm

    ridgetype=ridge_regression1(X,y,1234.5678/2);


    See also: ridge_regression



    row_center

    row_center - subtracts the average row from each row

    function [X] = row_center(X1)




    saisir2ascii

    saisir2ascii - Saves a saisir file into a simple ASCII format

    function saisir2ascii(X,filename,separator)



    Input arguments

    ===============

    X: SAISIR matrix to be saved

    filename: (string) name of the saved file

    separator:(string with a single char) separator character


    Output argument

    ===============

    none


    Transforms a saisir file into a simple .txt file and saves it on disk

    separator is a single character like ' ' or ';' or its ASCII code;

    The extension '.txt' is added to the filename


    %Typical example:

    %===============

    saisir2ascii(data,'mydata',';');

    %Saves the SAISIR matrix "data", under the name "mydata.txt", with ";" as separator



    saisir2excel

    saisir2excel - Saves a saisir file in a format compatible with Excel

    function saisir2excel(X,filename)



    Input arguments

    ===============

    X: SAISIR matrix to be saved

    filename: (string) name of the saved file


    Output argument

    ===============

    none


    Transforms a saisir file into a simple .CSV file and saves it on disk

    The separator is ";"

    The extension '.csv' is added to the filename


    %Typical example:

    %===============

    saisir2excel(data,'mydata');

    %Saves the SAISIR matrix "data", under the name "mydata.csv", with ";" as separator

    %This file is read by Excel



    saisirpls

    saisirpls - PLS regression with "dim" dimensions

    function [plstype]=saisirpls(X,Y,dim)



    Input arguments

    ===============

    X: SAISIR matrix of predictive data (n x p)

    Y: SAISIR matrix of variables to be predicted (n x k)

    dim: (integer) number of dimensions requested (written "ndim" below)


    Output arguments

    ===============

    plstype with fields

    beta: regression coefficients of the model (p x k)

    beta0: intercept of the models (1 x k)

    predy: predicted values (n x k)

    T: PLS scores of the PLS2 regression model (n x ndim)

    correlation: correlation coefficient (1 x k)


    This function fits a PLS2 model

    Several variables can be predicted, but only ndim dimensions are tested


    Preferably use basic_pls or basic_pls2


    %Typical example

    %===============

    %Let DATA be dimensioned (n x p)

    %Let Y be dimensioned (n x k)

    plstype=saisirpls(DATA,Y,10);

    %Assesses the models with 10 dimensions for the k variables in Y



    saisir_check

    saisir_check - Checks if the data respect the saisir structure

    function check=saisir_check(X)



    Input argument:

    ==============

    X: (expected) SAISIR matrix


    Output argument:

    ===============

    check:

    check = 1 if x is in the SAISIR format (no warning)

    check = 2 if x is in the SAISIR format (with warning)

    check = 0 if x is not in the SAISIR format (fatal error)


    The function tests if the input argument X is a valid SAISIR structure

    and gives some information.

    If X is a valid structure, also signals (as warning) if there are missing values,

    identical rows or columns (which may be the sign of something wrong)


    Useful to see if X is a valid '.d','.i','.v' structure.




    saisir_derivative

    saisir_derivative - n-th order derivative using the Savitzky-Golay coefficients

    [X]=saisir_derivative(X1,polynom_order,window_size,derivative_order)



    Input arguments:

    ===============

    X1:SAISIR matrix (n x p)

    polynom_order:(integer) order of the fitting polynomial

    window_size:(integer) number of data points involved in the calculation

    derivative_order: (integer, normally 1 or 2) order of the derivative


    Output argument:

    ================

    X : transformed data matrix (n rows)


    The function assumes that X is a matrix of digitized signals (such as

    spectra) with constant intervals of digitization.


    Example:

    =======

    res=saisir_derivative(DATA,3,21,2);

    Computes the second derivative using a polynomial of degree 3 as model

    and a window size of 21
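
    To make the role of the three parameters more concrete, the sketch below derives Savitzky-Golay weights from a local least-squares polynomial fit and applies them to one signal. It is plain MATLAB and only a sketch: it assumes a unit sampling interval, simply drops the points at both ends of the signal (the treatment of the edges by saisir_derivative is not described here), and uses hypothetical variable names.

```matlab
% Illustrative sketch only (plain MATLAB), assuming a unit sampling interval.
polynom_order = 3;
window_size = 21;                                  % must be odd
derivative_order = 2;
m = (window_size - 1) / 2;                         % half-width of the window
A = bsxfun(@power, (-m:m)', 0:polynom_order);      % design matrix of the local polynomial
C = pinv(A);                                       % row j+1 gives the coefficient of degree j
g = factorial(derivative_order) * C(derivative_order + 1, :);   % weights of the derivative at the window centre
t = 1:60;
x = t .^ 2;                                        % test signal: its second derivative is exactly 2
d = conv(x, g(end:-1:1), 'valid');                 % weighted sum over each window (edges dropped)
disp(d(1:5));
```

    A polynomial of degree 2 is fitted exactly by a polynomial of degree 3, so every value of d is equal to 2 up to rounding errors.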



    saisir_linkage

    saisir_linkage - computes a simple linkage vector from a matrix of distances

    function z=saisir_linkage(dis)



    Input argument

    =============

    dis: a SAISIR matrix of distances (n x n, symmetric)


    Output argument

    ==============

    z: z vector as required by the MATLAB function "dendrogram"


    From a complete square matrix of distances

    extracts the "unfolded" triangular matrix in order to enter the matlab program "linkage"

    with the option "ward"

    Returns the z vector as required by the MATLAB "dendrogram" function


    This function is very specific, and can be used only by skilled persons!


    See also : dendro (dendrogram with SAISIR)



    saisir_mean

    saisir_mean - computes the mean of the columns, following the saisir format

    function[xmean]=saisir_mean(X);



    Input argument

    ==============

    X: SAISIR matrix (n x p)


    Output argument

    ==============

    xmean: SAISIR vector (1 x p) of the mean




    saisir_mult

    saisir_mult - matrix multiplication following the SAISIR format

    function X12=saisir_mult(X1,X2);



    Input arguments:

    ===============

    X1 and X2 : SAISIR matrices dimensioned (n x p) and (p x m) respectively


    Output argument:

    ===============

    X12: SAISIR matrix (n x m) , result of the multiplication of X1 with X2


    Little use!



    saisir_sort

    saisir_sort - sorts the rows of s according to the values in a column

    function [X1 X2]=saisir_sort(X,ncol,minmax)



    Input arguments:

    ===============

    X: SAISIR matrix (n x p)

    ncol:(integer) rank (index) of the column on which the data are sorted

    minmax: 0: increasing order, 1: decreasing order (default : 0)


    Output arguments:

    ================

    X1: SAISIR matrix (n x p) sorted according to the column "ncol"

    X2: SAISIR matrix (n x (p+1)) sorted according to the column "ncol", with

    the rank added in column 1


    %Typical example:

    %===============

    DATA1=saisir_sort(DATA,5);

    map(DATA1,1,6); %% representing the 5 th column of DATA (6th column of

    %DATA1) in increasing order



    saisir_std

    saisir_std - computes the standard_deviations of the columns, following the saisir format

    function[xstd]=saisir_std(X)



    Input argument

    ==============

    X: SAISIR matrix (n x p)


    Output argument

    ==============

    xstd: SAISIR vector (1 x p) of the standard deviation



    saisir_sum

    saisir_sum - calculates the sum of the rows

    function xsum=saisir_sum(X);



    Input argument

    ==============

    X: SAISIR matrix (n x p)


    Output argument

    ==============

    xsum: SAISIR vector (1 x p) of the sum of the rows



    saisir_transpose

    saisir_transpose - transposes a data matrix following the saisir format

    function [X] = saisir_transpose(X1)



    Input argument

    =============

    X1: SAISIR matrix ( n x p)


    Output argument

    ===============

    X: SAISIR matrix (p x n), transpose of X1



    seekstring

    seekstring - returns a vector giving the indices of string in matrix of char x in which 'str' is present

    function index = seekstring(identifiers,xstr)



    Input arguments

    ===============

    identifiers: matrix of characters (n x p)

    xstr: string (1 x k), with k smaller than p


    Output argument

    ==============

    index: vector of integers giving the indices of the rows of "identifiers" in

    which the string "xstr" has been found


    %Typical example:

    %===============

    index=seekstring(DATA.i,'thisname');

    Gives the indices in DATA.i in which the string "thisname" is present.



    selectcol

    selectcol - creates a new data matrix with the selected columns

    function [X] = selectcol(X1,index)



    Input arguments

    ===============

    X1: SAISIR matrix (n x p)

    index: vector of integers or of booleans


    Output argument

    ===============

    X: matrix with n rows reduced to the selected variables


    %Typical example:

    %===============

    reduced=selectcol(DATA,[1 5 6]); %% selects the columns #1, #5, #6 and

    %builds the reduced matrix (with 3 columns) in "reduced"


    %See also: selectrow, deletecol, deleterow, appendcol, appendrow,

    appendcol1, appendrow1



    selectrow

    selectrow - creates a new data matrix with the selected rows

    function [X] = selectrow(X1,index)



    Input arguments

    ===============

    X1: SAISIR matrix (n x p)

    index: vector of integers or of booleans


    Output argument

    ===============

    X: matrix with p columns reduced to the selected rows


    %Typical example:

    %===============

    reduced=selectrow(DATA,[1 5 6]); %% selects the rows #1, #5, #6 and

    %builds the reduced matrix (with 3 rows) in "reduced"


    See also: selectcol, deletecol, deleterow, appendcol, appendrow,

    appendcol1, appendrow1



    select_from_identifier

    select_from_identifier - Uses identifier of rows for selecting samples

    function [X1] = select_from_identifier(X,startpos,str)



    Input arguments:

    ===============

    X : SAISIR matrix

    startpos : beginning position in the character strings of the identifiers

    of rows ('.i')

    str: string which is used as selection key.


    Output argument:

    ===============

    X1 : SAISIR matrix of the selected rows


    Creates the data collection "X1" which is the subset of "X"

    the identifiers of which contain the string str, in starting position startpos


    %Example :

    %========

    %Let X be a SAISIR matrix

    %Let X.i being

    %'wheat1'

    %'barle2'

    %'ricex1'

    %'wheat2'

    %The 'wheat' samples are extracted through

    mywheat= select_from_identifier(X,1,'wheat');

    %This selects the rows with identifiers 'wheat1' and 'wheat2'
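
    The selection rule of this example can be reproduced on a plain character matrix, which may help when checking which rows a given key will pick. The sketch below is plain MATLAB, not the code of select_from_identifier; the variable names are hypothetical, and it assumes that the key fits within the width of the identifiers.

```matlab
% Illustrative sketch only (plain MATLAB), not the SAISIR implementation.
ids = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'];     % row identifiers, as in X.i
startpos = 1;                                       % position at which the key must start
str = 'wheat';                                      % selection key
cols = startpos:(startpos + length(str) - 1);       % characters compared with the key
keep = all(ids(:, cols) == repmat(str, size(ids, 1), 1), 2);   % true where the key matches
index = find(keep);                                 % indices of the selected rows
disp(index');
```

    Here index contains 1 and 4, the rows whose identifiers start with 'wheat'.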



    select_from_variable

    select_from_variable - use identifier of columns for selecting variables

    function [X1] = select_from_variable(X,startpos,str)



    Input arguments:

    ===============

    X : SAISIR matrix

    startpos : beginning position in the character strings of the identifiers

    of columns ('.v')

    str: string which is used as selection key.


    Output argument:

    ===============

    X1 : SAISIR matrix of the selected columns


    Creates the data collection "X1", which is the subset of "X" formed by the columns

    whose variable identifiers contain the string str, starting at position startpos.


    see also : select_from_identifier



    sensory_profile

    sensory_profile - Graphical representation of sensory profile

    function[h]=sensory_profile(X,range,max_score,(title))


    Graphical display of sensory profiles in a "circular (spider web)" representation.


    Input arguments:

    ===============

    -X : matrix of data to be displayed

    -range : vector of the indices of the rows to be displayed

    -max_score : maximal score used in the scale

    -title : (optional) title of the graph.

    Warning: will not work properly with more than 15 variables

    Preferably reduce the identifiers of variables to less than 8 characters


    %Demonstration example

    %====================

    senso=rand(5,10)*5;%% simulating 5 panellists, 10 scores, scale from 0 to 5

    senso1=matrix2saisir(senso,'judge','descri');%% In SAISIR structure

    sensory_profile(senso1,1:3,5);%% graphic of the first 3 panellists



    sgolaycoef

    sgolaycoef - Computes the Savitzky-Golay coefficients

    function [B,G] = sgolaycoef(k,F)


    where the polynomial order is k and the frame size is F (an odd number)

    No direct use



    show_vector

    show_vector - represents a row of a matrix as a succession of identifiers

    function handle=show_vector(X, (nrow) ,(csize),(xlab),(ylab),(title))



    Input arguments

    ===============

    X: SAISIR matrix (n x p)

    nrow: index of the row to be displayed (integer less than n, default : 1)

    csize: size of the character (default : 10)

    xlab, ylab, title: label on axis X, axis Y, title, respectively (default:

    none)


    The identifiers of the columns are plotted with X being the index of the variable and Y the actual

    value of the variable for the selected row "nrow"


    Main use : examining the output of "anavar1" and "anovan1" functions on

    discrete variables



    simple_regression

    simple_regression - mono-linear regressions

    function [beta beta0]=simple_regression(X,y);



    Input arguments

    ===============

    X:SAISIR matrix of predictive variables (n x p)

    y:SAISIR vector of the variable to be predicted (n x 1)


    Output arguments

    ===============

    beta:SAISIR vector of the regression coefficients (1 x p)

    beta0:SAISIR vector of the intercepts (1 x p)


    y is predicted by each column i of X according to ypred=X.d(:,i)*beta.d(i)+beta0.d(i);

    There are thus as many mono-linear models as the number of columns in X
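
    As an illustration of the model described above, the following sketch computes one slope and one intercept per column with plain MATLAB arrays. It is not the SAISIR implementation; Xd, yd, slope and intercept are names chosen for this example and stand for X.d, y.d, beta.d and beta0.d.

```matlab
% Sketch only: one least-squares line per column, using plain MATLAB arrays
Xd = [1 2; 2 1; 3 4; 4 3];           % n x p predictive data (plays the role of X.d)
yd = [2; 4; 6; 8];                   % n x 1 variable to be predicted (y.d)
[n, p] = size(Xd);
slope = zeros(1, p);                 % plays the role of beta.d
intercept = zeros(1, p);             % plays the role of beta0.d
yc = yd - mean(yd);                  % centred response
for i = 1:p
    xc = Xd(:, i) - mean(Xd(:, i));  % centred predictor
    slope(i) = (xc' * yc) / (xc' * xc);
    intercept(i) = mean(yd) - slope(i) * mean(Xd(:, i));
end
ypred1 = Xd(:, 1) * slope(1) + intercept(1);  % prediction from column 1 alone
```

    Each column is treated independently, so the p models share nothing except the response y.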



    snv

    snv - Standard normal variate correction on spectra

    function [X1] = snv(X)



    Input argument:

    ==============

    X:SAISIR matrix of spectra (n x p)


    Output argument:

    ==============

    X1:SAISIR matrix of SNV-corrected spectra (n x p)


    SNV (Standard Normal Variate) is commonly used in spectroscopy.

    It basically consists of centering and standardizing the ROWS (not the columns) of the data matrix.

    This procedure may reduce the scatter deformation of spectra.
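
    The row-wise operation can be sketched with a plain MATLAB matrix as follows. This is an illustration, not the SAISIR code; the n-1 divisor used for the standard deviation is an assumption, and S, rowMean, rowStd and S1 are names chosen for this example.

```matlab
% Sketch only: SNV on a plain matrix of spectra (one spectrum per row)
S = [1 2 3 4; 10 20 30 40; 5 5 6 8];   % n x p spectra (plays the role of X.d)
rowMean = mean(S, 2);                  % n x 1 mean of each spectrum
rowStd = std(S, 0, 2);                 % n x 1 standard deviation (n-1 divisor, assumed)
S1 = (S - rowMean) ./ rowStd;          % implicit expansion (MATLAB R2016b or later)
```

    Rows 1 and 2 differ only by a multiplicative factor, and they become identical after the correction.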



    spcr

    spcr - stepwise Principal component regression

    function [spcrtype]=spcr(X,y,maxdim,(maxrank),(corr_cov))


    The PC scores are introduced in the order of their regression coefficient or their covariance

    with y


    Input arguments:

    ===============

    X : SAISIR matrix of predictive variables (n x p)

    y : SAISIR vector of observed y

    maxdim: (integer): maximum number of PC scores introduced in the regression model

    maxrank: (optional, integer): maximal rank of the PC scores in the model

    Default value: all components possibly introduced.

    corr_cov : (optional) 1: introduction according to the correlation coefficient (default);

    0: introduction according to the covariance


    Output arguments:

    ===============

    spcrtype with fields

    pca: PCA structure (see function "pca")

    beta: beta coefficients (applicable on PC scores)

    selected_component: rank of the scores introduced in the model

    predy: predicted y values for all the steps of spcr

    r2: r2 for all the steps of spcr

    averagey: average value of y

    beta1: beta coefficients (applicable on X centred)

    obsy: observed y, copy of input argument y.

    rmsec: root mean square error of calibration for all the models



    splitrow

    splitrow - splits a data matrix into 2 resulting matrices

    function [X1, X2]= splitrow(X,index)



    Input arguments

    ===============

    X:SAISIR matrix (n x p)

    index: MATLAB vector with k elements equal to 1 and n-k elements equal to 0


    Output arguments

    ===============

    X1:SAISIR matrix (k x p)

    X2:SAISIR matrix ((n-k) x p)


    Divides X into two matrices X1 and X2.

    The first one corresponds to the kept rows (index = 1, or "true");

    the second one is the complement (index = 0, or "false").

    index is either a vector of row indices (integers) or a boolean vector.


    %Typical example (division of DATA into a calibration and a validation sets):

    %===============

    [n,p]=size(DATA.d);

    sel=random_select(n,round(n/3));%% building a random vector of 0 and 1

    [validation_set calibration_set]=splitrow(DATA,sel);%% creation of the validation and calibration sets.
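
    The effect of a 0/1 index on the three fields of a SAISIR structure can be sketched without the toolbox. The code below is an illustration under the assumption that the column identifiers (.v) are simply copied to both outputs; it is not the SAISIR source.

```matlab
% Sketch only: splitting the rows of a SAISIR-like structure with a 0/1 vector
X.d = [1 2; 3 4; 5 6; 7 8];                % data (n x p)
X.i = ['obs1'; 'obs2'; 'obs3'; 'obs4'];    % row identifiers
X.v = ['v1'; 'v2'];                        % column identifiers
index = [1 0 0 1];                         % 1 = kept in X1, 0 = sent to X2
keep = logical(index);
X1.d = X.d(keep, :);   X1.i = X.i(keep, :);   X1.v = X.v;
X2.d = X.d(~keep, :);  X2.i = X.i(~keep, :);  X2.v = X.v;
```

    The rows flagged with 1 end up in the first output, which is why the validation set comes first in the example above.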



    split_average

    split_average - averages observations according to the identifiers

    function res=split_average(X,startpos,endpos)



    Input arguments:

    ===============

    X:SAISIR matrix (n x p)

    startpos, endpos: position in the row-identifier strings (".i")


    Output argument:

    ===============

    res with fields:

    average: averages of the identified groups (p columns)

    group: number of observations in each group.


    The function extracts the characters in the row identifiers from "startpos"

    to "endpos" and makes as many groups as the number of different strings

    The observations are averaged according to these groups
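
    The grouping rule can be sketched in plain MATLAB as shown below. This is an illustration only: the identifiers, the data and the variable names (key, names, g, average, count) are invented for the example, and the alphabetical order of the groups comes from the MATLAB function "unique", not necessarily from SAISIR.

```matlab
% Sketch only: averaging rows that share the same substring in their identifiers
d = [1 2; 3 4; 10 20; 30 40; 5 5];                         % n x p data (X.d)
ids = ['wheat1'; 'wheat2'; 'barle1'; 'barle2'; 'ricex1'];  % row identifiers (X.i)
startpos = 1; endpos = 5;
key = cellstr(ids(:, startpos:endpos));    % substring defining the groups
[names, ~, g] = unique(key);               % g(k) = group number of row k
ngroup = numel(names);
average = zeros(ngroup, size(d, 2));
count = zeros(ngroup, 1);
for k = 1:ngroup
    average(k, :) = mean(d(g == k, :), 1); % mean over the rows of group k
    count(k) = sum(g == k);                % number of observations in group k
end
```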




    standardize

    standardize - divides each column by the corresponding standard deviation

    function [X, xstd] = standardize(X1,(option))



    Input arguments

    ==============

    X1:SAISIR matrix (n x p)

    option : either 0: divides by n-1, or 1: divides by n (default : 1)


    Output argument

    ===============

    X:SAISIR matrix (n x p)

    xstd: standard deviation of the columns of X1 (1 x p)
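
    The two divisors can be compared with the built-in function "std", whose second argument follows the same 0/1 convention. The sketch below works on a plain matrix and is not the SAISIR code; D, Dstd and the chosen data are assumptions made for the example.

```matlab
% Sketch only: dividing each column by its standard deviation
D = [1 10; 2 20; 3 30; 4 40];      % n x p data (plays the role of X1.d)
option = 1;                        % 0: divisor n-1, 1: divisor n (documented default)
xstd = std(D, option, 1);          % 1 x p standard deviations of the columns
Dstd = D ./ xstd;                  % implicit expansion (MATLAB R2016b or later)
```

    The columns are scaled but not centred, so two proportional columns become identical.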



    statis

    statis - Multiway method STATIS

    function res=statis(collection);



    Input arguments

    ===============

    collection:array of SAISIR matrices with the same number of rows (n)


    Output arguments

    ================

    Let n be the number of observations, and t the number of tables

    the fields of the output argument res are:


    RV: [1x1 struct] matrix t x t of the RV values indicating the agreement

    between the tables (max value = 1)

    eigenval1: [1x1 struct] first eigenvalue of the RV matrix

    eigenvec1: [1x1 struct] first eigenvector of the RV matrix (t x 1) . Indicates the

    weight associated with each table

    -Wk: {1xt cell} cell of n x n arrays giving the scalar products between observations

    -W_compromise: n x n array giving the compromise of the arrays Wk

    eigenval2: [1x1 struct] r eigenvalues of W_compromise, with r the rank of

    W_compromise.

    score: [1x1 struct] (n x r ) Scores of the compromise of the observations. Can be

    represented as factorial map

    trajectory: [1x1 struct] (n*t x r) Projection of each row vector of each table in the space of observation_score

    group: [1x1 struct] (n*t x 1) table giving the belonging of a given

    row_vector to a table

    table_score: [1x1 struct] (t x r) scores of the tables obtained from

    diagonalisation of RV.

    table_eigenval: [1x1 struct] (r x 1) eigenvalues of RV . The first one

    is the same as eigenval1



    The STATIS method is described in "C. Lavit, Analyse conjointe de tableaux quantitatifs, Masson pub., 1988."

    Basically the method attempts to establish a factorial compromise between tables having

    the same number of observations.

    "collection" is an ARRAY of SAISIR structures containing all the 2-D data tables (SAISIR format).

    Each table must include the same observations, but not necessarily the same variables.

    group is useful with the command 'carte_barycentre'

    For example, a command such as "carte_barycentre(res.trajectory,2,3, res.group)" will produce the representation of

    the row vector of each table for the score 2 and 3. The representation shows the compromise point and its link

    to each vector of the tables.

    "collection" is built up with commands like:

    collection(1)=DATA1; collection(2)=DATA2; ...; collection(k)=DATAk;


    Warning ! Such commands work only if DATA are .d, .i, .v structures IN

    THAT ORDER, with NO OTHER FIELDS. Otherwise, MATLAB refuses to build the

    vector of SAISIR structure. Possibly use "saisir_check" for verifying this

    point.
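
    To give an idea of what the RV matrix and the table weights represent, here is a minimal sketch using the usual definition of the RV coefficient on plain MATLAB matrices. It is not the SAISIR implementation: the column centring, the data and the names T, W, RV, weights are assumptions made for the example.

```matlab
% Sketch only: RV coefficients between tables sharing the same n observations
T = {[1 2; 2 1; 3 5; 4 3], [2 4; 4 2; 6 10; 8 6], [1 0 2; 0 1 1; 2 2 0; 1 3 1]};
t = numel(T);
W = cell(1, t);
for k = 1:t
    Tc = T{k} - mean(T{k}, 1);       % column-centred table (implicit expansion)
    W{k} = Tc * Tc';                 % n x n scalar products between observations
end
RV = zeros(t, t);
for a = 1:t
    for b = 1:t
        RV(a, b) = trace(W{a} * W{b}) / sqrt(trace(W{a} * W{a}) * trace(W{b} * W{b}));
    end
end
[vec, val] = eig((RV + RV') / 2);    % symmetrised to avoid round-off asymmetry
[eigenval1, pos] = max(diag(val));   % largest eigenvalue of the RV matrix
weights = abs(vec(:, pos));          % first eigenvector: weight of each table
```

    The second table is twice the first one, so their RV value is 1, the maximum.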




    stepwise_regression

    stepwise_regression - stepwise regression between x and y

    function[result]=stepwise_regression(x,y,Pthres,(confidence))



    Input arguments

    ===============

    x : SAISIR matrix of predictive data (n x p)

    y : SAISIR vector of the variable to be predicted (n x 1)

    Pthres : probability threshold for entering or discarding a variable

    confidence: (optional, default=0.05) is the probability of the confidence interval

    for the limits of the regression coefficients


    Output argument

    ===============

    result: array of cells corresponding to each step of the regression

    Each cell corresponds to one step (adding or discarding a variable)

    In each cell:

    message: gives the name of the entered or discarded variable

    res : a structure described below

    intercept : constant value (beta0) of the current model

    RMSE : root mean square error of the model

    r2 : determination coefficient

    adjusted_r2 : adjusted determination coefficient (taking the dimensions into account)

    F : Fisher F value of the current model

    probF : probability value associated with F

    ypred : predicted y values


    res : the rows indicate the variables introduced

    the different columns give information on the corresponding

    regression coefficients:

    1) regression coefficients

    2) Lower confidence limit of regression coefficients

    3) Higher confidence limit

    4) Std of regression coeff.

    5) t value of reg. coeff.

    6) Prob. of reg. coef.

    7) Rank of variables


    % example

    result=stepwise_regression(X,y,0.05);

    result{3} %% third model

    message: 'Entering variable 2214 at step 3'

    res: [1x1 struct]

    intercept: 4.66

    RMSE: 0.81

    r2: 0.90

    adjusted_r2: 0.90

    F: 417.96

    probF: 0

    ypred: [1x1 struct]

    and result{3}.res gives the statistics on the regression coefficients

    which can be consulted for example under Excel using

    "saisir2excel(result{3}.res,'model3')"



    string2saisir

    string2saisir - creation of a saisir file from a string table (first column=name)

    function [saisir] = string2saisir(data)


    NORMALLY NO DIRECT USE

    Creates a SAISIR file from a string table obtained with the procedure readexcel1.

    A DOS file which has been read by readexcel1 is a 3-way matrix of characters of

    the form data(row,pos,col). For example, data(5,:,12) contains the string in row 5 and

    column 12.

    For the data to be acceptable for the SAISIR transformation, the data format must be

    the following:

    1) the first row data(1,:,:) must contain the identifiers of the variables (columns)

    warning: for matrix presentation of the data, the string data(1,:,1) is of no use

    and is skipped by the program.

    2) the first column data(:,:,1) must contain the identifiers of the observations (rows)

    3) the other rows and columns contain strings which can be converted into numbers

    (or whitespace)

    Such a format is normally obtained by using readexcel1 (Excel data saved as .csv).

    If the original Excel file was not appropriate, it is possible that several columns

    contain strings which cannot be converted into numbers (or whitespace).

    In this situation, it is possible to remove the undesired column using

    data(:,:,col)=[]; where col is the index of the column to be removed.

    Note that whitespace is replaced by NaN values.



    string2text

    string2text - saves a vector of strings in .txt format

    function string2text(str,filename)



    Input argument

    ==============

    str:matrix of characters

    filename:string (vector of characters)


    Output argument

    ==============

    None


    This function saves a succession of strings (in a matrix of char) under

    the name "filename". The extension ".TXT" is added to the file name



    submap

    submap - partial display of observations

    function submap(X,col1,col2,xstring,(col1label),(col2label),(title),(charsize),(marg))


    The scale of the COMPLETE map is used.


    Input arguments:

    ===============

    X: SAISIR matrix

    col1, col2 : index of the two columns to be represented (integer values)

    xstring : string in the names of identifiers which must be displayed

    col1label (optional): Label of the variable forming the X-axis

    col2label (optional): Label of the variable forming the Y-axis

    title (optional) : title of the graph

    charsize (optional) : size of the plotted characters

    marg (optional) : margin value allowing an extension of the axis in order

    to cope with long identifiers (default value: 0.05)


    example:

    Let X be a SAISIR matrix

    %Let X.i being

    'wheat1'

    'barle2'

    'ricex1'

    'wheat2'

    'barle3'

    ...

    The command "submap(X,5,3,'wh')" will plot column 5 as X and column 3 as Y.

    Only the observations containing 'wh' in their names are displayed, but

    the general axis scales are those of the whole collection.

    Useful for emphasizing some groups in a complex plot.
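
    The selection rule of the example can be reproduced with standard MATLAB string functions. The sketch below only shows how the displayed rows could be identified; it does not plot anything and is not the SAISIR code (the names found and selected are assumptions).

```matlab
% Sketch only: selecting the observations whose identifier contains a string
ids = ['wheat1'; 'barle2'; 'ricex1'; 'wheat2'; 'barle3'];    % row identifiers (X.i)
xstring = 'wh';
found = ~cellfun(@isempty, strfind(cellstr(ids), xstring));  % one logical per row
selected = find(found)';             % indices of the rows that would be displayed
```

    Here rows 1 and 4 ('wheat1' and 'wheat2') are the ones that would be kept.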



    subtract_variable

    subtract_variable - subtracts a given variable from all the others

    function [X1] = subtract_variable(X,ncol)



    Input arguments

    ===============

    X:SAISIR matrix

    ncol: (integer) column chosen for the subtraction


    Output argument

    ==============

    X1: SAISIR matrix in which the variable has been subtracted


    Subtracts the variable of index ncol from all the other variables of each observation.

    Useful for correcting a y-shift of spectral data.
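
    On a plain matrix, the correction amounts to a single subtraction, as sketched below. This is an illustration, not the SAISIR source; in particular, keeping the reference column (which becomes zero) in the result is an assumption.

```matlab
% Sketch only: subtracting one column from all the columns
D = [1 2 3; 4 6 8; 10 10 10];      % n x p data (plays the role of X.d)
ncol = 1;                          % column chosen for the subtraction
D1 = D - D(:, ncol);               % implicit expansion (MATLAB R2016b or later)
```

    Each row is shifted by its own value at column ncol, which removes a constant offset from each spectrum.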



    surface1

    surface1 - Represent a surface in three dimensions

    function [zmin, zmax]=surface1(X)



    Input argument:

    ==============

    X:SAISIR matrix


    Output arguments:

    ================

    zmin, zmax: min and max values in X.d



    % new version 6/10/2006



    surface_std

    surface_std - divide each row by the sum of its corresponding columns

    function [X1] = surface_std(X,(threshold))



    Input arguments

    ===============

    X:SAISIR matrix

    threshold: (optional) small positive or zero value

    If the sum of a row is equal to 0, the elements of that row are set to 0.

    If threshold is defined, it (normally a very small value) is added to the data.

    Only useful for avoiding the "division by zero" warning.


    Output argument

    ===============

    X1:corrected matrix


    The function computes the sum of each row. Each value of each row is

    divided by the corresponding sum.


    In chromatography, this corresponds to giving the same surface to all the

    chromatograms.

    See also: snv
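
    The normalisation, including the special case of a row whose sum is zero, can be sketched on a plain matrix as follows. The code is an illustration and not the SAISIR implementation; the optional threshold is left out, and D, rowSum, ok and D1 are names chosen for the example.

```matlab
% Sketch only: giving the same total area (row sum) to every row
D = [1 1 2; 0 0 0; 2 3 5];          % n x p data (plays the role of X.d)
rowSum = sum(D, 2);                 % n x 1 sum of each row
D1 = zeros(size(D));
ok = rowSum ~= 0;                   % rows with a zero sum are left at 0
D1(ok, :) = D(ok, :) ./ rowSum(ok); % implicit expansion (MATLAB R2016b or later)
```

    After the correction every non-empty row sums to 1.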



    symbol_map

    symbol_map - map with symbols : using a portion of the identifiers for

    symbol_map(X,col1,col2,startpos,endpos,(col1label),(col2label),(title),(charsize))



    Input arguments:

    ===============

    X: SAISIR matrix

    col1, col2 : index of the two columns to be represented (integer values)

    startpos, endpos: position in the identifier strings of rows ('.i') for

    the coloration

    col1label (optional): Label of the variable forming the X-axis

    col2label (optional): Label of the variable forming the Y-axis

    title (optional) : title of the graph

    charsize (optional) : size of the plotted characters


    For the French users: there is a synonym function "carte_symbole".

    Use preferably "symbol_map"

    The coloration of the displayed descriptors depends on the arguments

    startpos and endpos.

    From the names of the individuals, the string name(startpos:endpos) is extracted. Two observations

    for which these strings are different, are represented with different symbols.

    example:

    Let X be a SAISIR matrix

    %Let X.i being

    'wheat1'

    'barle2'

    'ricex1'

    'wheat2'

    'barle3'

    ...

    The command 'symbol_map(X,5,3,1,5)' will plot column 5 as X and column 3 as Y.

    The characters are extracted from positions 1 to 5, giving the strings 'wheat', 'barle',

    'ricex'. A different symbol will be given to each of these strings.


    See also colored_map1, colored_map2 (same principle but with the

    identifier name displayed)



    tcurve

    tcurve - representation of a column of a given matrix as a curve

    function handle=tcurve(X, ncol, (xlabel),(ylabel),(title))



    Input arguments:

    ================

    X : SAISIR matrix

    ncol : column to be represented

    xlabel, ylabel (optional) : labels in X and Y

    title (optional) title of the graph.


    This function draws a column (typically a loading or an eigenvector) as a curve.

    If X.i can be interpreted as a vector of numbers (such as wavelengths),

    the X scale is given by this vector.

    Otherwise, the X-axis is simply given by the rank of the variables

    A function "tcourbe" is a synonym of this function.



    tcurves

    tcurves - represents several columns of a matrix as curves

    function handle=tcurves(X, range, (xlabel),(ylabel),(title))



    Input arguments

    ===============

    X:SAISIR matrix (n x p)

    range: indices of the selected columns (Matlab vector of integers)

    xlabel, ylabel, title: (optional) labels of the x and y axes, and title (strings)


    Typical use : showing loadings of PLS or PCA

    ===========

    p=pca(spectra);

    tcurves(p.eigenvec,1:4);%% First 4 loadings of PCA




    thematic_classification

    thematic_classification - builds a thematic classification of the .m files

    function res=thematic_classification(function_name,(previous));



    Input argument

    ==============

    function_name: matrix of characters giving the function name

    previous (optional): previous results of this function

    "thematic_classification"


    Output argument

    ==============

    res with fields

    theme_structure: array of structures with fields:

    name: name of the function

    theme: vector giving the number of the theme in the thematic list

    theme: matrix of char giving the names of the themes.


    "thematic classification" presents a list of themes

    For each function, the user is asked to give the number in the list of

    themes


    VERY SPECIFIC USE.

    This function is used by the function "build_documentation" in order to give

    a thematic list of the functions.

    If "previous" is defined, the results are obtained with concatenation of

    the old and newly created thematic list. The functions which are in "previous"

    are not considered again



    trajectory_curve

    trajectory_curve - plots coloured XYcurves

    function handle=trajectory_curve(X,col1,col2,startpos,endpos);


    The function represents the columns col1 and col2 as curves (arcs)

    The observations which have the same strings in their identifiers are joined.

    The points are joined consecutively according to their order (rank) in x.


    Input arguments:

    ================

    X : SAISIR matrix

    col1, col2 : columns of x to be represented.

    startpos, endpos : positions in identifiers indicating which identifiers are to be joined.


    Example of use : time series

    ==============

    identifiers of rows (x.i) like A01; A02; A03... A100; B01 ... B100; C01 ...

    with 1 ... 100 indicating times, and A B .. observations varying with time

    Command:

    trajectory_curve(x,1,2,1,1);

    Joins the points labelled 'A' together, then the ones labelled 'B', and so on



    w

    w - w: (for "what") lists the fields which are present in a structure

    function res= w(xstruct);



    Input argument

    ==============

    xstruct: any variable (supposed to be a structure with fields)


    Output argument

    ==============

    none


    This function displays all the fields found in structures (given by a

    SAISIR function).

    "Matrix" or "Vector" means here "SAISIR matrix" or "SAISIR vector"


    Example of use

    ...

    pls_res=crossvalpls1a(x,y,10,sel);%% models with 1 to 10 dimensions

    w(pls_res);%% gives the fields in "pls_res":

    "

    calibration

    T : matrix 94 X 10

    P : matrix 10 X 1050

    beta : vector 1050 X 1

    beta0 : = 12.7025

    ....

    validation

    PREDY : matrix 46 X 10

    RMSEV : vector 1 X 10

    r2 : vector 1 X 10

    T : matrix 46 X 10

    OBSY : vector 46 X 1

    ....

    "



    xcomdim

    xcomdim - Finding common dimensions in multitable data

    No direct use. Normally called with function "comdim"


    function[Q, saliences, explained]=xcomdim(col,threshold,ndim)

    Finding common dimensions in multitable according to method 'level3'

    proposed by E.M. Qannari, I. Wakeling, P. Courcoux and H. J. H. MacFie

    in Food quality and Preference 11 (2000) 151-154

    col is an array of matrices with the same number of rows

    threshold (optional): if the difference of fit
    ndim : number of common dimensions

    default: threshold=1E-10; ndim=number of tables

    returns Q: nrow x ndim the observations loadings



    xdisp

    xdisp - smart display of heterogeneous variables

    function xdisp(varargin)



    Input argument

    ==============

    varargin: variable number of arguments, each either a number or a string


    The function avoids those boring "num2str" calls and brackets [] when displaying

    text.


    Example: xdisp('pi is equal to',pi, 'Don''t you know ','Charlie Brown ?',' age :',6 );

    equivalent to

    disp(['pi is equal to ', num2str(pi) ' Don''t you know ' 'Charlie Brown ?' ' age ' num2str(6)]);
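
    The idea behind the function can be sketched with a loop over a cell array of mixed arguments. This is an illustration only, not the SAISIR code: the cell array args stands for varargin, and no separator is inserted between the pieces, which is an assumption.

```matlab
% Sketch only: concatenating numbers and strings before display
args = {'pi is equal to ', pi, ' age: ', 6};   % stands for varargin
txt = '';
for k = 1:numel(args)
    if ischar(args{k})
        txt = [txt args{k}];             % strings are appended unchanged
    else
        txt = [txt num2str(args{k})];    % numbers are converted with num2str
    end
end
disp(txt);
```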




    xpca

    xpca - PCA on a matlab data matrix

    Computes a basic principal component analysis (on non-normalised data) directly on the data.

    Returns coord.d, eigenvector.d, eigenvalues.d, average.d

    currently only nrow
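
    A basic PCA on centred, unscaled data can be sketched with the singular value decomposition, as shown below. This is an illustration under stated assumptions and not the SAISIR source: the use of "svd", the divisor n for the eigenvalues and the plain variables coord, eigenvector, eigenvalues and average (named after the documented outputs) are choices made for the example.

```matlab
% Sketch only: PCA of a plain MATLAB matrix, columns centred but not scaled
D = [2 0; 0 2; 4 2; 2 4];                % n x p data
average = mean(D, 1);                    % 1 x p column means
Dc = D - average;                        % centred data (implicit expansion)
[U, S, V] = svd(Dc, 'econ');             % Dc = U*S*V'
coord = U * S;                           % scores of the observations
eigenvector = V;                         % loadings
eigenvalues = diag(S).^2 / size(D, 1);   % variances of the scores (divisor n, assumed)
```

    The data can be rebuilt as coord*eigenvector' + average, and the eigenvalues sum to the total variance of the columns.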


    xyz_colored_map1

    xyz_colored_map1 - Draws a colored 3D map from a Saisir file

    xyz_colored_map1(X,col1,col2,col3,startpos,endpos)



    Input arguments

    ===============

    X: Saisir matrix

    col1, col2, col3 : indices of the columns represented in X, Y, Z

    startpos, endpos: number indicating the beginning and ending character in

    the name identifiers. Two different strings will have different colors.



    xy_plot

    xy_plot - Biplot of one column of X versus one column of Y

    function handle=xy_plot(X, xcol, Y, ycol,start_pos,end_pos);



    Input arguments:

    ================

    X,Y : SAISIR matrices (with the same number of rows)

    xcol, ycol : rank number (index) of the columns to be plotted

    if start_pos and end_pos are defined: colored plot according

    to the characters of the row identifiers at position start_pos:end_pos




    Documentation automatically generated on 07-Jan-2009
    This software is copyrighted by ENITIAA-INRA, Unité de Sensométrie et de Chimiométrie, Nantes (France)
    End of SAISIR documentation