Dr. Frank Dieterle, Ph. D. Thesis

9.2.5.   PCA-NN

As in section 6.9, the input space for the data analysis by neural networks was compressed by a principal component analysis (PCA). The optimal number of 19 principal components was determined by the minimum crossvalidation error of the calibration data. The predictions of the calibration data are promising, whereas the predictions of the validation data are significantly worse (see table 6). The true-predicted plots (figure 68) show that the predictions of the validation data are systematically too high.
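The selection of the number of principal components by the minimum crossvalidation error can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in the thesis: the data are synthetic, and the function names (`pca_fit`, `train_net`, `cv_rmse`, `select_n_components`), the tiny one-hidden-layer network, the 5-fold split, and all hyperparameters are hypothetical choices made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column mean and the first n_components loading vectors."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def pca_transform(X, mean, loadings):
    """Project the data onto the retained principal components (scores)."""
    return (X - mean) @ loadings

def train_net(S, y, n_hidden=4, epochs=300, lr=0.05, seed=0):
    """Tiny one-hidden-layer tanh network trained by batch gradient descent.
    A stand-in for the thesis networks; topology and settings are assumed."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(S.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=n_hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(S @ W1 + b1)              # hidden activations, shape (n, h)
        err = H @ W2 + b2 - y                 # residuals, shape (n,)
        dH = np.outer(err, W2) * (1 - H**2)   # error backpropagated through tanh
        W2 -= lr * (H.T @ err / n)
        b2 -= lr * err.mean()
        W1 -= lr * (S.T @ dH / n)
        b1 -= lr * dH.mean(axis=0)
    return lambda S_new: np.tanh(S_new @ W1 + b1) @ W2 + b2

def cv_rmse(X, y, n_components, n_folds=5, seed=0):
    """Crossvalidated RMSE of PCA compression followed by the network."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # PCA and scaling are fitted on the training part of the fold only.
        mean, P = pca_fit(X[train_idx], n_components)
        S_train = pca_transform(X[train_idx], mean, P)
        scale = S_train.std(axis=0) + 1e-12
        predict = train_net(S_train / scale, y[train_idx])
        pred = predict(pca_transform(X[test_idx], mean, P) / scale)
        sq_err.append((pred - y[test_idx]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

def select_n_components(X, y, max_components):
    """Pick the number of components with the lowest crossvalidation error."""
    errors = [cv_rmse(X, y, k) for k in range(1, max_components + 1)]
    return int(np.argmin(errors)) + 1, errors

# Synthetic stand-in data: 60 samples, 20 nonlinear, noisy signal channels.
rng = np.random.default_rng(1)
t = rng.uniform(0, 1, size=(60, 2))           # hypothetical concentrations
B = rng.normal(size=(2, 20))
X = np.tanh(t @ B) + 0.02 * rng.normal(size=(60, 20))
y = t[:, 0]

best_k, errors = select_n_components(X, y, max_components=8)
print(best_k, np.round(errors, 3))
```

In this sketch the PCA is refitted inside every fold so that the held-out samples do not influence the projection. Whether the thesis proceeded this way is not stated in the text.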

The bias can be explained by the different noise levels of the validation and calibration data sets in combination with the nonlinearities in the data sets (see the discussion in section 6.9). The linear PCA projection spreads the nonlinearities over many principal components, so that the minimum crossvalidation error criterion selects the high number of 19 components. At the same time, these components also capture the typical noise of the calibration data set. Thus, most of the components contain a mixture of information that is important for the model and information about noise. As the validation data set was recorded by averaging two measurements, its noise is significantly reduced. This changes the data structure and therefore the projection by the PCA, which causes the significant prediction bias.
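The size of the noise reduction caused by averaging two measurements can be estimated with a short calculation. It assumes that the noise contributions of the two measurements are independent, have zero mean, and share the same variance; these assumptions, and the symbols x_1, x_2, s, e_1, e_2 and sigma, are introduced here for illustration only and are not stated in the text.

```latex
% Two repeated measurements of the same signal s with independent noise
% e_1, e_2 of variance \sigma^2 (assumed):
x_1 = s + e_1, \qquad x_2 = s + e_2

% Variance of the averaged measurement:
\operatorname{Var}\!\left(\frac{x_1 + x_2}{2}\right)
  = \frac{1}{4}\left(\operatorname{Var}(e_1) + \operatorname{Var}(e_2)\right)
  = \frac{\sigma^2}{2}

% Standard deviation of the noise after averaging:
\sigma_{\mathrm{avg}} = \frac{\sigma}{\sqrt{2}} \approx 0.71\,\sigma
```

Under these assumptions the validation data carry roughly 71 % of the noise standard deviation of the calibration data, so principal components that partly model calibration noise receive systematically different scores for the validation samples.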



figure 68: Predictions of the calibration and validation data by PCA-NN.

© Dr. Frank Dieterle, 14.08.2006