MethodFisher.cxx

00001 // @(#)root/tmva $Id: MethodFisher.cxx 37506 2010-12-10 14:00:41Z stelzer $
00002 // Author: Andreas Hoecker, Xavier Prudent, Joerg Stelzer, Helge Voss, Kai Voss
00003 
00004 /**********************************************************************************
00005  * Project: TMVA - a Root-integrated toolkit for multivariate Data analysis       *
00006  * Package: TMVA                                                                  *
00007  * Class  : MethodFisher                                                          *
00008  * Web    : http://tmva.sourceforge.net                                           *
00009  *                                                                                *
00010  * Description:                                                                   *
00011  *      Implementation (see header for description)                               *
00012  *                                                                                *
00013  * Original author of this Fisher-Discriminant implementation:                    *
00014  *      Andre Gaidot, CEA-France;                                                 *
00015  *      (Translation from FORTRAN)                                                *
00016  *                                                                                *
00017  * Authors (alphabetical):                                                        *
00018  *      Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland              *
00019  *      Xavier Prudent  <prudent@lapp.in2p3.fr>  - LAPP, France                   *
00020  *      Helge Voss      <Helge.Voss@cern.ch>     - MPI-K Heidelberg, Germany      *
00021  *      Kai Voss        <Kai.Voss@cern.ch>       - U. of Victoria, Canada         *
00022  *                                                                                *
00023  * Copyright (c) 2005:                                                            *
00024  *      CERN, Switzerland                                                         *
00025  *      U. of Victoria, Canada                                                    *
00026  *      MPI-K Heidelberg, Germany                                                 *
00027  *      LAPP, Annecy, France                                                      *
00028  *                                                                                *
00029  * Redistribution and use in source and binary forms, with or without             *
00030  * modification, are permitted according to the terms listed in LICENSE           *
00031  * (http://tmva.sourceforge.net/LICENSE)                                          *
00032  **********************************************************************************/
00033 
00034 //_______________________________________________________________________
00035 /* Begin_Html
00036   Fisher and Mahalanobis Discriminants (Linear Discriminant Analysis)
00037 
00038   <p>
00039   In the method of Fisher discriminants, event selection is performed
00040   in a transformed variable space with zero linear correlations, by
00041   distinguishing the mean values of the signal and background
00042   distributions.<br></p>
00043 
00044   <p>
00045   The linear discriminant analysis determines an axis in the (correlated)
00046   hyperspace of the input variables
00047   such that, when projecting the output classes (signal and background)
00048   upon this axis, they are pushed as far as possible away from each other,
00049   while events of the same class are confined to a close vicinity.
00050   The linearity property of this method is reflected in the metric with
00051   which "far apart" and "close vicinity" are determined: the covariance
00052   matrix of the discriminant variable space.
00053   </p>
00054 
00055   <p>
00056   The classification of the events in signal and background classes
00057   relies on the following characteristics (only): overall sample means,
00058   <i><my:o>x</my:o><sub>i</sub></i>, for each input variable, <i>i</i>,
00059   class-specific sample means, <i><my:o>x</my:o><sub>S(B),i</sub></i>,
00060   and total covariance matrix <i>T<sub>ij</sub></i>. The covariance matrix
00061   can be decomposed into the sum of a <i>within-</i> (<i>W<sub>ij</sub></i>)
00062   and a <i>between-class</i> (<i>B<sub>ij</sub></i>) class matrix. They describe
00063   the dispersion of events relative to the means of their own class (within-class
00064   matrix), and relative to the overall sample means (between-class matrix).
00065   The Fisher coefficients, <i>F<sub>i</sub></i>, are then given by <br>
00066   <center>
00067   <img vspace=6 src="gif/tmva_fisherC.gif" align="bottom" >
00068   </center>
00069   where TMVA sets <i>N<sub>S</sub>=N<sub>B</sub></i>, so that the factor
00070   in front of the sum simplifies to &frac12;.
00071   The Fisher discriminant then reads<br>
00072   <center>
00073   <img vspace=6 src="gif/tmva_fisherD.gif" align="bottom" >
00074   </center>
00075   The offset <i>F</i><sub>0</sub> centers the sample mean of <i>x</i><sub>Fi</sub>
00076   at zero. Instead of using the within-class matrix, the Mahalanobis variant
00077   determines the Fisher coefficients as follows:<br>
00078   <center>
00079   <img vspace=6 src="gif/tmva_mahaC.gif" align="bottom" >
00080   </center>
00081   with resulting <i>x</i><sub>Ma</sub> that are very similar to the
00082   <i>x</i><sub>Fi</sub>. <br></p>
00083 
00084   TMVA provides two outputs for the ranking of the input variables:<br>
00085   <ul>
00086   <li> <u>Fisher test:</u> the Fisher analysis aims at simultaneously maximising
00087   the between-class separation, while minimising the within-class dispersion.
00088   A useful measure of the discrimination power of a variable is hence given
00089   by the diagonal quantity: <i>B<sub>ii</sub>/W<sub>ii</sub></i>.
00090   </li>
00091 
00092   <li> <u>Discrimination power:</u> the value of the Fisher coefficient is a
00093   measure of the discriminating power of a variable. The discrimination power
00094   of a set of input variables can therefore be measured by the scalar
00095   <center>
00096   <img vspace=6 src="gif/tmva_discpower.gif" align="bottom" >
00097   </center>
00098   </li>
00099   </ul>
00100   The corresponding numbers are printed on standard output.
00101   End_Html */
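The two formula images referenced above (gif/tmva_fisherC.gif, gif/tmva_fisherD.gif) do not survive in this text dump. Reconstructed from the surrounding description and from GetFisherCoeff() below (a hedged reading, not a verbatim copy of the images), they should be:

```latex
% Fisher coefficients (cf. tmva_fisherC.gif), W = within-class matrix:
F_i \;=\; \frac{\sqrt{N_S N_B}}{N_S + N_B}
          \sum_j W^{-1}_{ij}\,\bigl(\bar{x}_{S,j} - \bar{x}_{B,j}\bigr)

% Fisher discriminant (cf. tmva_fisherD.gif):
x_{\mathrm{Fi}} \;=\; F_0 + \sum_i F_i\, x_i
```

With N_S = N_B, as set in TMVA, the prefactor indeed reduces to the &frac12; quoted in the text.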
00102 //_______________________________________________________________________
00103 
00104 #include <iomanip>
00105 #include <cassert>
00106 
00107 #include "TMath.h"
00108 #include "Riostream.h"
00109 
00110 #include "TMVA/VariableTransformBase.h"
00111 #include "TMVA/MethodFisher.h"
00112 #include "TMVA/Tools.h"
00113 #include "TMatrix.h"
00114 #include "TMVA/Ranking.h"
00115 #include "TMVA/Types.h"
00116 #include "TMVA/ClassifierFactory.h"
00117 
00118 REGISTER_METHOD(Fisher)
00119 
00120 ClassImp(TMVA::MethodFisher);
00121 
00122 //_______________________________________________________________________
00123 TMVA::MethodFisher::MethodFisher( const TString& jobName,
00124                                   const TString& methodTitle,
00125                                   DataSetInfo& dsi,
00126                                   const TString& theOption,
00127                                   TDirectory* theTargetDir ) :
00128    MethodBase( jobName, Types::kFisher, methodTitle, dsi, theOption, theTargetDir ),
00129    fMeanMatx     ( 0 ),
00130    fTheMethod    ( "Fisher" ),
00131    fFisherMethod ( kFisher ),
00132    fBetw         ( 0 ),
00133    fWith         ( 0 ),
00134    fCov          ( 0 ),
00135    fSumOfWeightsS( 0 ),
00136    fSumOfWeightsB( 0 ),
00137    fDiscrimPow   ( 0 ),
00138    fFisherCoeff  ( 0 ),
00139    fF0           ( 0 )
00140 {
00141    // standard constructor for the "Fisher" method
00142 }
00143 
00144 //_______________________________________________________________________
00145 TMVA::MethodFisher::MethodFisher( DataSetInfo& dsi,
00146                                   const TString& theWeightFile,
00147                                   TDirectory* theTargetDir ) :
00148    MethodBase( Types::kFisher, dsi, theWeightFile, theTargetDir ),
00149    fMeanMatx     ( 0 ),
00150    fTheMethod    ( "Fisher" ),
00151    fFisherMethod ( kFisher ),
00152    fBetw         ( 0 ),
00153    fWith         ( 0 ),
00154    fCov          ( 0 ),
00155    fSumOfWeightsS( 0 ),
00156    fSumOfWeightsB( 0 ),
00157    fDiscrimPow   ( 0 ),
00158    fFisherCoeff  ( 0 ),
00159    fF0           ( 0 )
00160 {
00161    // constructor from weight file
00162 }
00163 
00164 //_______________________________________________________________________
00165 void TMVA::MethodFisher::Init( void )
00166 {
00167    // default initialization called by all constructors
00168 
00169    // allocate Fisher coefficients
00170    fFisherCoeff = new std::vector<Double_t>( GetNvar() );
00171 
00172    // the minimum requirement to declare an event signal-like
00173    SetSignalReferenceCut( 0.0 );
00174 
00175    // this is the preparation for training
00176    InitMatrices();
00177 }
00178 
00179 //_______________________________________________________________________
00180 void TMVA::MethodFisher::DeclareOptions()
00181 {
00182    //
00183    // MethodFisher options:
00184    // format and syntax of option string: "type"
00185    // where type is "Fisher" or "Mahalanobis"
00186    //
00187    DeclareOptionRef( fTheMethod = "Fisher", "Method", "Discrimination method" );
00188    AddPreDefVal(TString("Fisher"));
00189    AddPreDefVal(TString("Mahalanobis"));
00190 }
00191 
00192 //_______________________________________________________________________
00193 void TMVA::MethodFisher::ProcessOptions()
00194 {
00195    // process user options
00196    if (fTheMethod ==  "Fisher" ) fFisherMethod = kFisher;
00197    else                          fFisherMethod = kMahalanobis;
00198 
00199    // this is the preparation for training
00200    InitMatrices();
00201 }
00202 
00203 //_______________________________________________________________________
00204 TMVA::MethodFisher::~MethodFisher( void )
00205 {
00206    // destructor
00207    if (fBetw       ) { delete fBetw; fBetw = 0; }
00208    if (fWith       ) { delete fWith; fWith = 0; }
00209    if (fCov        ) { delete fCov;  fCov = 0; }
00210    if (fDiscrimPow ) { delete fDiscrimPow; fDiscrimPow = 0; }
00211    if (fFisherCoeff) { delete fFisherCoeff; fFisherCoeff = 0; }
00212 }
00213 
00214 //_______________________________________________________________________
00215 Bool_t TMVA::MethodFisher::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
00216 {
00217    // Fisher can only handle classification with 2 classes
00218    if (type == Types::kClassification && numberClasses == 2) return kTRUE;
00219    return kFALSE;
00220 }
00221 
00222 //_______________________________________________________________________
00223 void TMVA::MethodFisher::Train( void )
00224 {
00225    // computation of Fisher coefficients by series of matrix operations
00226 
00227    // get the mean value of each variable for signal, backgd and signal+backgd
00228    GetMean();
00229 
00230    // get the matrix of covariance 'within class'
00231    GetCov_WithinClass();
00232 
00233    // get the matrix of covariance 'between class'
00234    GetCov_BetweenClass();
00235 
00236    // get the full covariance matrix ('within' + 'between')
00237    GetCov_Full();
00238 
00239    //--------------------------------------------------------------
00240 
00241    // get the Fisher coefficients
00242    GetFisherCoeff();
00243 
00244    // get the discriminating power of each variable
00245    GetDiscrimPower();
00246 
00247    // nice output
00248    PrintCoefficients();
00249 }
00250 
00251 //_______________________________________________________________________
00252 Double_t TMVA::MethodFisher::GetMvaValue( Double_t* err, Double_t* errUpper )
00253 {
00254    // returns the Fisher value (no fixed range)
00255    const Event * ev = GetEvent();
00256    Double_t result = fF0;
00257    for (UInt_t ivar=0; ivar<GetNvar(); ivar++)
00258       result += (*fFisherCoeff)[ivar]*ev->GetValue(ivar);
00259 
00260    // cannot determine error
00261    NoErrorCalc(err, errUpper);
00262 
00263    return result;
00264 
00265 }
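The response computed in GetMvaValue() above is just the affine form y = F0 + &Sigma;<sub>i</sub> F<sub>i</sub> x<sub>i</sub>. A minimal standalone sketch (illustrative names only, not TMVA API):

```cpp
#include <cstddef>
#include <vector>

// Standalone sketch of the Fisher response: y = f0 + sum_i coeff[i]*x[i],
// mirroring MethodFisher::GetMvaValue(). Names are illustrative only.
double FisherResponse(double f0,
                      const std::vector<double>& coeff,
                      const std::vector<double>& x)
{
   double y = f0;
   for (std::size_t i = 0; i < coeff.size(); ++i)
      y += coeff[i] * x[i];
   return y;
}
```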
00266 
00267 //_______________________________________________________________________
00268 void TMVA::MethodFisher::InitMatrices( void )
00269 {
00270    // initialization method; creates global matrices and vectors
00271 
00272    // average value of each variable for S, B, S+B
00273    fMeanMatx = new TMatrixD( GetNvar(), 3 );
00274 
00275    // the covariance 'within class' and 'between class' matrices
00276    fBetw = new TMatrixD( GetNvar(), GetNvar() );
00277    fWith = new TMatrixD( GetNvar(), GetNvar() );
00278    fCov  = new TMatrixD( GetNvar(), GetNvar() );
00279 
00280    // discriminating power
00281    fDiscrimPow = new std::vector<Double_t>( GetNvar() );
00282 }
00283 
00284 //_______________________________________________________________________
00285 void TMVA::MethodFisher::GetMean( void )
00286 {
00287    // compute mean values of variables in each sample, and the overall means
00288 
00289    // initialize internal sum-of-weights variables
00290    fSumOfWeightsS = 0;
00291    fSumOfWeightsB = 0;
00292 
00293    const UInt_t nvar = DataInfo().GetNVariables();
00294 
00295    // init vectors
00296    Double_t* sumS = new Double_t[nvar];
00297    Double_t* sumB = new Double_t[nvar];
00298    for (UInt_t ivar=0; ivar<nvar; ivar++) { sumS[ivar] = sumB[ivar] = 0; }
00299 
00300    // compute sample means
00301    for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {
00302 
00303       // read the Training Event into "event"
00304       const Event * ev = GetEvent(ievt);
00305 
00306       // sum of weights
00307       Double_t weight = GetTWeight(ev);
00308       if (DataInfo().IsSignal(ev)) fSumOfWeightsS += weight;
00309       else                         fSumOfWeightsB += weight;
00310 
00311       Double_t* sum = DataInfo().IsSignal(ev) ? sumS : sumB;
00312 
00313       for (UInt_t ivar=0; ivar<nvar; ivar++) sum[ivar] += ev->GetValue( ivar )*weight;
00314    }
00315 
00316    for (UInt_t ivar=0; ivar<nvar; ivar++) {
00317       (*fMeanMatx)( ivar, 2 ) = sumS[ivar];
00318       (*fMeanMatx)( ivar, 0 ) = sumS[ivar]/fSumOfWeightsS;
00319 
00320       (*fMeanMatx)( ivar, 2 ) += sumB[ivar];
00321       (*fMeanMatx)( ivar, 1 ) = sumB[ivar]/fSumOfWeightsB;
00322 
00323       // signal + background
00324       (*fMeanMatx)( ivar, 2 ) /= (fSumOfWeightsS + fSumOfWeightsB);
00325    }
00326    delete [] sumS;
00327    delete [] sumB;
00328 }
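The per-class accumulation in GetMean() boils down to a weighted sample mean, mean = &Sigma; w&middot;x / &Sigma; w. A minimal sketch with invented names (not TMVA code):

```cpp
#include <cstddef>
#include <vector>

// Weighted sample mean, as accumulated per class in GetMean():
// mean = sum_ev (w_ev * x_ev) / sum_ev w_ev. Illustrative sketch only.
double WeightedMean(const std::vector<double>& values,
                    const std::vector<double>& weights)
{
   double sum = 0.0, sumOfWeights = 0.0;
   for (std::size_t i = 0; i < values.size(); ++i) {
      sum          += weights[i] * values[i];
      sumOfWeights += weights[i];
   }
   return sum / sumOfWeights;
}
```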
00329 
00330 //_______________________________________________________________________
00331 void TMVA::MethodFisher::GetCov_WithinClass( void )
00332 {
00333    // the matrix of covariance 'within class' reflects the dispersion of the
00334    // events relative to the center of gravity of their own class  
00335 
00336    // assert required
00337    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );
00338 
00339    // product matrices (x-<x>)(y-<y>) where x, y are variables
00340 
00341    // init
00342    const Int_t nvar  = GetNvar();
00343    const Int_t nvar2 = nvar*nvar;
00344    Double_t *sumSig  = new Double_t[nvar2];
00345    Double_t *sumBgd  = new Double_t[nvar2];
00346    Double_t *xval    = new Double_t[nvar];
00347    memset(sumSig,0,nvar2*sizeof(Double_t));
00348    memset(sumBgd,0,nvar2*sizeof(Double_t));
00349    
00350    // 'within class' covariance
00351    for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {
00352 
00353       // read the Training Event into "event"
00354       const Event* ev = GetEvent(ievt);
00355 
00356       Double_t weight = GetTWeight(ev); // may ignore events with negative weights
00357 
00358       for (Int_t x=0; x<nvar; x++) xval[x] = ev->GetValue( x );
00359       Int_t k=0;
00360       for (Int_t x=0; x<nvar; x++) {
00361          for (Int_t y=0; y<nvar; y++) {            
00362             Int_t mc = DataInfo().IsSignal(ev) ? 0 : 1; // use the mean of the event's own class
00363             Double_t v = ( (xval[x] - (*fMeanMatx)(x, mc))*(xval[y] - (*fMeanMatx)(y, mc)) )*weight;
00364             if (DataInfo().IsSignal(ev)) sumSig[k] += v; else sumBgd[k] += v;
00365             k++;
00366          }
00367       }
00368    }
00369    Int_t k=0;
00370    for (Int_t x=0; x<nvar; x++) {
00371       for (Int_t y=0; y<nvar; y++) {
00372          (*fWith)(x, y) = (sumSig[k] + sumBgd[k])/(fSumOfWeightsS + fSumOfWeightsB);
00373          k++;
00374       }
00375    }
00376 
00377    delete [] sumSig;
00378    delete [] sumBgd;
00379    delete [] xval;
00380 }
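One entry of the pooled matrix built in GetCov_WithinClass() is a weighted covariance, cov(x,y) = &Sigma; w&middot;(x-&lt;x&gt;)(y-&lt;y&gt;) / &Sigma; w. A minimal sketch (names are illustrative, not TMVA API):

```cpp
#include <cstddef>
#include <vector>

// One weighted covariance entry, as accumulated in GetCov_WithinClass():
// cov(x,y) = sum_ev w*(x - meanX)*(y - meanY) / sum_ev w.
double WeightedCov(const std::vector<double>& xs,
                   const std::vector<double>& ys,
                   const std::vector<double>& ws,
                   double meanX, double meanY)
{
   double sum = 0.0, sumOfWeights = 0.0;
   for (std::size_t i = 0; i < xs.size(); ++i) {
      sum          += ws[i] * (xs[i] - meanX) * (ys[i] - meanY);
      sumOfWeights += ws[i];
   }
   return sum / sumOfWeights;
}
```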
00381 
00382 //_______________________________________________________________________
00383 void TMVA::MethodFisher::GetCov_BetweenClass( void )
00384 {
00385    // the matrix of covariance 'between class' reflects the dispersion of the
00386    // events of a class relative to the global center of gravity of all classes,
00387    // hence the separation between classes
00388 
00389    // assert required
00390    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);
00391 
00392    Double_t prodSig, prodBgd;
00393 
00394    for (UInt_t x=0; x<GetNvar(); x++) {
00395       for (UInt_t y=0; y<GetNvar(); y++) {
00396 
00397          prodSig = ( ((*fMeanMatx)(x, 0) - (*fMeanMatx)(x, 2))*
00398                      ((*fMeanMatx)(y, 0) - (*fMeanMatx)(y, 2)) );
00399          prodBgd = ( ((*fMeanMatx)(x, 1) - (*fMeanMatx)(x, 2))*
00400                      ((*fMeanMatx)(y, 1) - (*fMeanMatx)(y, 2)) );
00401 
00402          (*fBetw)(x, y) = (fSumOfWeightsS*prodSig + fSumOfWeightsB*prodBgd) / (fSumOfWeightsS + fSumOfWeightsB);
00403       }
00404    }
00405 }
00406 
00407 //_______________________________________________________________________
00408 void TMVA::MethodFisher::GetCov_Full( void )
00409 {
00410    // compute full covariance matrix from sum of within and between matrices
00411    for (UInt_t x=0; x<GetNvar(); x++) 
00412       for (UInt_t y=0; y<GetNvar(); y++) 
00413          (*fCov)(x, y) = (*fWith)(x, y) + (*fBetw)(x, y);
00414 }
00415 
00416 //_______________________________________________________________________
00417 void TMVA::MethodFisher::GetFisherCoeff( void )
00418 {
00419    // Fisher = Sum { [coeff]*[variables] }
00420    //
00421    // let Xs be the array of the mean values of variables for signal evts
00422    // let Xb be the array of the mean values of variables for backgd evts
00423    // let InvWith be the inverse matrix of the 'within class' correlation matrix
00424    //
00425    // then the array of Fisher coefficients is 
00426    // [coeff] = sqrt(fNsig*fNbgd)/fNevt * transpose{Xs-Xb} * InvWith
00427 
00428    // assert required
00429    assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0);
00430 
00431    // invert covariance matrix
00432    TMatrixD* theMat = 0;
00433    switch (GetFisherMethod()) {
00434    case kFisher:
00435       theMat = fWith;
00436       break;
00437    case kMahalanobis:
00438       theMat = fCov;
00439       break;
00440    default:
00441       Log() << kFATAL << "<GetFisherCoeff> undefined method" << GetFisherMethod() << Endl;
00442    }
00443 
00444    TMatrixD invCov( *theMat );
00445    if ( TMath::Abs(invCov.Determinant()) < 10E-24 ) {
00446       Log() << kWARNING << "<GetFisherCoeff> matrix is almost singular with determinant="
00447               << TMath::Abs(invCov.Determinant()) 
00448               << " did you use variables that are linear combinations or highly correlated?" 
00449               << Endl;
00450    }
00451    if ( TMath::Abs(invCov.Determinant()) < 10E-120 ) {
00452       Log() << kFATAL << "<GetFisherCoeff> matrix is singular with determinant="
00453               << TMath::Abs(invCov.Determinant())  
00454               << " did you use variables that are linear combinations?" 
00455               << Endl;
00456    }
00457 
00458    invCov.Invert();
00459    
00460    // apply rescaling factor
00461    Double_t xfact = TMath::Sqrt( fSumOfWeightsS*fSumOfWeightsB ) / (fSumOfWeightsS + fSumOfWeightsB);
00462 
00463    // compute difference of mean values
00464    std::vector<Double_t> diffMeans( GetNvar() );
00465    UInt_t ivar, jvar;
00466    for (ivar=0; ivar<GetNvar(); ivar++) {
00467       (*fFisherCoeff)[ivar] = 0;
00468 
00469       for (jvar=0; jvar<GetNvar(); jvar++) {
00470          Double_t d = (*fMeanMatx)(jvar, 0) - (*fMeanMatx)(jvar, 1);
00471          (*fFisherCoeff)[ivar] += invCov(ivar, jvar)*d;
00472       }    
00473     
00474       // rescale
00475       (*fFisherCoeff)[ivar] *= xfact;
00476    }
00477 
00478    // offset correction
00479    fF0 = 0.0;
00480    for (ivar=0; ivar<GetNvar(); ivar++){ 
00481       fF0 += (*fFisherCoeff)[ivar]*((*fMeanMatx)(ivar, 0) + (*fMeanMatx)(ivar, 1));
00482    }
00483    fF0 /= -2.0;  
00484 }
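In one dimension the matrix formula above collapses to a scalar expression, which makes the rescaling factor and the mean difference easy to see. A minimal sketch (illustrative names, not TMVA API):

```cpp
#include <cmath>

// One-dimensional reduction of the formula in GetFisherCoeff():
// F = sqrt(NS*NB)/(NS+NB) * (meanS - meanB) / W, where W is the (scalar)
// within-class covariance. Illustrative sketch only.
double FisherCoeff1D(double nS, double nB,
                     double meanS, double meanB, double within)
{
   const double xfact = std::sqrt(nS * nB) / (nS + nB);   // cf. xfact in GetFisherCoeff()
   return xfact * (meanS - meanB) / within;
}
```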
00485 
00486 //_______________________________________________________________________
00487 void TMVA::MethodFisher::GetDiscrimPower( void )
00488 {
00489    // computation of the discrimination power indicator for each variable:
00490    // small values of "fWith" indicate high compactness of the sig & bkg classes,
00491    // large values of "fBetw" indicate large separation between sig & bkg
00492    //
00493    // we want the signal & background classes as compact and separated as possible;
00494    // the discrimination power is defined here as the ratio "fBetw/fCov" (fCov = fWith + fBetw)
00495    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00496       if ((*fCov)(ivar, ivar) != 0) 
00497          (*fDiscrimPow)[ivar] = (*fBetw)(ivar, ivar)/(*fCov)(ivar, ivar);
00498       else
00499          (*fDiscrimPow)[ivar] = 0;
00500    }
00501 }
00502 
00503 //_______________________________________________________________________
00504 const TMVA::Ranking* TMVA::MethodFisher::CreateRanking() 
00505 {
00506    // computes ranking of input variables
00507 
00508    // create the ranking object
00509    fRanking = new Ranking( GetName(), "Discr. power" );
00510 
00511    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00512       fRanking->AddRank( Rank( GetInputLabel(ivar), (*fDiscrimPow)[ivar] ) );
00513    }
00514 
00515    return fRanking;
00516 }
00517 
00518 //_______________________________________________________________________
00519 void TMVA::MethodFisher::PrintCoefficients( void ) 
00520 {
00521    // display Fisher coefficients and discriminating power for each variable
00522    // (notes on required variable transformations are printed alongside)
00523    Log() << kINFO << "Results for Fisher coefficients:" << Endl;
00524 
00525    if (GetTransformationHandler().GetTransformationList().GetSize() != 0) {
00526       Log() << kINFO << "NOTE: The coefficients must be applied to TRANSFORMED variables" << Endl;
00527       Log() << kINFO << "  List of the transformations: " << Endl;
00528       TListIter trIt(&GetTransformationHandler().GetTransformationList());
00529       while (VariableTransformBase *trf = (VariableTransformBase*) trIt()) {
00530          Log() << kINFO << "  -- " << trf->GetName() << Endl;
00531       }
00532    }
00533    std::vector<TString>  vars;
00534    std::vector<Double_t> coeffs;
00535    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00536       vars  .push_back( GetInputLabel(ivar) );
00537       coeffs.push_back(  (*fFisherCoeff)[ivar] );
00538    }
00539    vars  .push_back( "(offset)" );
00540    coeffs.push_back( fF0 );
00541    TMVA::gTools().FormattedOutput( coeffs, vars, "Variable" , "Coefficient", Log() );   
00542 
00543    if (IsNormalised()) {
00544       Log() << kINFO << "NOTE: You have chosen to use the \"Normalise\" booking option. Hence, the" << Endl;
00545       Log() << kINFO << "      coefficients must be applied to NORMALISED (') variables as follows:" << Endl;
00546       Int_t maxL = 0;
00547       for (UInt_t ivar=0; ivar<GetNvar(); ivar++) if (GetInputLabel(ivar).Length() > maxL) maxL = GetInputLabel(ivar).Length();
00548 
00549       // Print normalisation expression (see Tools.cxx): "2*(x - xmin)/(xmax - xmin) - 1.0"
00550       for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00551          Log() << kINFO 
00552                  << setw(maxL+9) << TString("[") + GetInputLabel(ivar) + "]' = 2*(" 
00553                  << setw(maxL+2) << TString("[") + GetInputLabel(ivar) + "]"
00554                  << setw(3) << (GetXmin(ivar) > 0 ? " - " : " + ")
00555                  << setw(6) << TMath::Abs(GetXmin(ivar)) << setw(3) << ")/"
00556                  << setw(6) << (GetXmax(ivar) -  GetXmin(ivar) )
00557                  << setw(3) << " - 1"
00558                  << Endl;
00559       }
00560       Log() << kINFO << "The TMVA Reader will properly account for this normalisation, but if the" << Endl;
00561       Log() << kINFO << "Fisher classifier is applied outside the Reader, the transformation must be" << Endl;
00562       Log() << kINFO << "implemented -- or the \"Normalise\" option is removed and Fisher retrained." << Endl;
00563       Log() << kINFO << Endl;
00564    }   
00565 }
00566   
00567 //_______________________________________________________________________
00568 void TMVA::MethodFisher::ReadWeightsFromStream( istream& istr )
00569 {
00570    // read Fisher coefficients from weight file
00571    istr >> fF0;
00572    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) istr >> (*fFisherCoeff)[ivar];
00573 }
00574 
00575 //_______________________________________________________________________
00576 void TMVA::MethodFisher::AddWeightsXMLTo( void* parent ) const 
00577 {
00578    // create XML description of Fisher classifier
00579 
00580    void* wght = gTools().AddChild(parent, "Weights");
00581    gTools().AddAttr( wght, "NCoeff", GetNvar()+1 );
00582    void* coeffxml = gTools().AddChild(wght, "Coefficient");
00583    gTools().AddAttr( coeffxml, "Index", 0   );
00584    gTools().AddAttr( coeffxml, "Value", fF0 );
00585    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00586       coeffxml = gTools().AddChild( wght, "Coefficient" );
00587       gTools().AddAttr( coeffxml, "Index", ivar+1 );
00588       gTools().AddAttr( coeffxml, "Value", (*fFisherCoeff)[ivar] );
00589    }
00590 }
00591 
00592 //_______________________________________________________________________
00593 void TMVA::MethodFisher::ReadWeightsFromXML( void* wghtnode ) 
00594 {
00595    // read Fisher coefficients from xml weight file
00596    UInt_t ncoeff, coeffidx;
00597    gTools().ReadAttr( wghtnode, "NCoeff", ncoeff );
00598    fFisherCoeff->resize(ncoeff-1);
00599 
00600    void* ch = gTools().GetChild(wghtnode);
00601    Double_t coeff;
00602    while (ch) {
00603       gTools().ReadAttr( ch, "Index", coeffidx );
00604       gTools().ReadAttr( ch, "Value", coeff    );
00605       if (coeffidx==0) fF0 = coeff;
00606       else             (*fFisherCoeff)[coeffidx-1] = coeff;
00607       ch = gTools().GetNextChild(ch);
00608    }
00609 }
00610 
00611 //_______________________________________________________________________
00612 void TMVA::MethodFisher::MakeClassSpecific( std::ostream& fout, const TString& className ) const
00613 {
00614    // write Fisher-specific classifier response
00615    Int_t dp = fout.precision();
00616    fout << "   double              fFisher0;" << endl;
00617    fout << "   std::vector<double> fFisherCoefficients;" << endl;
00618    fout << "};" << endl;
00619    fout << "" << endl;
00620    fout << "inline void " << className << "::Initialize() " << endl;
00621    fout << "{" << endl;
00622    fout << "   fFisher0 = " << std::setprecision(12) << fF0 << ";" << endl;
00623    for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
00624       fout << "   fFisherCoefficients.push_back( " << std::setprecision(12) << (*fFisherCoeff)[ivar] << " );" << endl;
00625    }
00626    fout << endl;
00627    fout << "   // sanity check" << endl;
00628    fout << "   if (fFisherCoefficients.size() != fNvars) {" << endl;
00629    fout << "      std::cout << \"Problem in class \\\"\" << fClassName << \"\\\"::Initialize: mismatch in number of input values\"" << endl;
00630    fout << "                << fFisherCoefficients.size() << \" != \" << fNvars << std::endl;" << endl;
00631    fout << "      fStatusIsClean = false;" << endl;
00632    fout << "   }         " << endl;
00633    fout << "}" << endl;
00634    fout << endl;
00635    fout << "inline double " << className << "::GetMvaValue__( const std::vector<double>& inputValues ) const" << endl;
00636    fout << "{" << endl;
00637    fout << "   double retval = fFisher0;" << endl;
00638    fout << "   for (size_t ivar = 0; ivar < fNvars; ivar++) {" << endl;
00639    fout << "      retval += fFisherCoefficients[ivar]*inputValues[ivar];" << endl;
00640    fout << "   }" << endl;
00641    fout << endl;
00642    fout << "   return retval;" << endl;
00643    fout << "}" << endl;
00644    fout << endl;
00645    fout << "// Clean up" << endl;
00646    fout << "inline void " << className << "::Clear() " << endl;
00647    fout << "{" << endl;
00648    fout << "   // clear coefficients" << endl;
00649    fout << "   fFisherCoefficients.clear(); " << endl;
00650    fout << "}" << endl;
00651    fout << std::setprecision(dp);
00652 }
00653 
00654 //_______________________________________________________________________
00655 void TMVA::MethodFisher::GetHelpMessage() const
00656 {
00657    // get help message text
00658    //
00659    // typical length of text line: 
00660    //         "|--------------------------------------------------------------|"
00661    Log() << Endl;
00662    Log() << gTools().Color("bold") << "--- Short description:" << gTools().Color("reset") << Endl;
00663    Log() << Endl;
00664    Log() << "Fisher discriminants select events by distinguishing the mean " << Endl;
00665    Log() << "values of the signal and background distributions in a trans- " << Endl;
00666    Log() << "formed variable space where linear correlations are removed." << Endl;
00667    Log() << Endl;
00668    Log() << "   (More precisely: the \"linear discriminator\" determines" << Endl;
00669    Log() << "    an axis in the (correlated) hyperspace of the input " << Endl;
00670    Log() << "    variables such that, when projecting the output classes " << Endl;
00671    Log() << "    (signal and background) upon this axis, they are pushed " << Endl;
00672    Log() << "    as far as possible away from each other, while events" << Endl;
00673    Log() << "    of the same class are confined to a close vicinity. The  " << Endl;
00674    Log() << "    linearity property of this classifier is reflected in the " << Endl;
00675    Log() << "    metric with which \"far apart\" and \"close vicinity\" are " << Endl;
00676    Log() << "    determined: the covariance matrix of the discriminating" << Endl;
00677    Log() << "    variable space.)" << Endl;
00678    Log() << Endl;
00679    Log() << gTools().Color("bold") << "--- Performance optimisation:" << gTools().Color("reset") << Endl;
00680    Log() << Endl;
00681    Log() << "Optimal performance for Fisher discriminants is obtained for " << Endl;
00682    Log() << "linearly correlated Gaussian-distributed variables. Any deviation" << Endl;
00683    Log() << "from this ideal reduces the achievable separation power. In " << Endl;
00684    Log() << "particular, no discrimination at all is achieved for a variable" << Endl;
00685    Log() << "that has the same sample mean for signal and background, even if " << Endl;
00686    Log() << "the shapes of the distributions are very different. Thus, Fisher " << Endl;
00687    Log() << "discriminants often benefit from suitable transformations of the " << Endl;
00688    Log() << "input variables. For example, if a variable x in [-1,1] has a " << Endl;
00689    Log() << "parabolic signal distribution and a uniform background " << Endl;
00690    Log() << "distribution, the mean value is zero in both cases, leading " << Endl;
00691    Log() << "to no separation. The simple transformation x -> |x| renders this " << Endl;
00692    Log() << "variable powerful for the use in a Fisher discriminant." << Endl;
00693    Log() << Endl;
00694    Log() << gTools().Color("bold") << "--- Performance tuning via configuration options:" << gTools().Color("reset") << Endl;
00695    Log() << Endl;
00696    Log() << "<None>" << Endl;
00697 }
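The x -&gt; |x| example in the help text above can be checked numerically: on [-1,1] a parabolic signal density 3/2&middot;x&sup2; and a uniform background density 1/2 both have mean 0, while the means of |x| are 3/4 and 1/2. A sketch with invented names, using simple midpoint integration:

```cpp
#include <cmath>
#include <functional>

// Mean of an observable under a density on [-1,1], by midpoint integration.
// Used to verify the help text: signal density 3/2*x^2 and background
// density 1/2 both give <x> = 0, but <|x|> = 3/4 vs 1/2.
double MeanOf(const std::function<double(double)>& density,
              const std::function<double(double)>& observable,
              int nSteps = 100000)
{
   double sum = 0.0;
   const double h = 2.0 / nSteps;                 // interval width
   for (int i = 0; i < nSteps; ++i) {
      const double x = -1.0 + (i + 0.5) * h;      // midpoint of subinterval i
      sum += observable(x) * density(x) * h;
   }
   return sum;
}
```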

Generated on Tue Jul 5 15:25:03 2011 for ROOT_528-00b_version by  doxygen 1.5.1