9.2.5. PCA-NN (Dr. Frank Dieterle)

Frank Dieterle

Ph. D. Thesis

9. Results – All Data Sets

9.2. Methanol, Ethanol and 1-Propanol by SPR

9.2.5. PCA-NN

Similar to section 6.9, the input space for the data analysis by neural networks was compressed by a principal component analysis (PCA). The optimal number of principal components, 19, was determined by the minimum cross-validation error of the calibration data. The predictions of the calibration data are promising, whereas the predictions of the validation data are significantly worse (see table 6). The true-predicted plots (figure 68) demonstrate that the predictions of the validation data are biased towards values that are too high.
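The selection of the number of principal components by the minimum cross-validation error can be sketched as follows. This is only an illustrative sketch and not the implementation used in the thesis: the data are synthetic, and the tiny network, its hyperparameters (`hidden`, `epochs`, `lr`), the number of folds and all function names are assumptions made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Column means and the first n_components principal axes of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def train_mlp(T, y, hidden=4, epochs=300, lr=0.05, seed=0):
    """Tiny one-hidden-layer tanh network, full-batch gradient descent (assumed setup)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(T.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(T @ W1 + b1)               # hidden activations
        err = H @ W2 + b2 - y                  # prediction residuals
        dH = np.outer(err, W2) * (1.0 - H**2)  # error backpropagated to the hidden layer
        W2 -= lr * H.T @ err / len(y)
        b2 -= lr * err.mean()
        W1 -= lr * T.T @ dH / len(y)
        b1 -= lr * dH.mean(axis=0)
    return lambda Tn: np.tanh(Tn @ W1 + b1) @ W2 + b2

def cv_rmse_per_k(X, y, max_k, n_folds=5, seed=0):
    """Cross-validated RMSE for 1..max_k principal components."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), n_folds)
    rmse = np.empty(max_k)
    for k in range(1, max_k + 1):
        sq_err = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            mean, axes = pca_fit(X[train_idx], k)       # PCA on the training fold only
            T_train = (X[train_idx] - mean) @ axes.T    # scores of the training fold
            scale = T_train.std(axis=0) + 1e-12         # unit-variance scores as inputs
            predict = train_mlp(T_train / scale, y[train_idx])
            T_test = (X[test_idx] - mean) @ axes.T      # held-out samples, same projection
            sq_err += np.sum((predict(T_test / scale) - y[test_idx]) ** 2)
        rmse[k - 1] = np.sqrt(sq_err / len(y))
    return rmse

# Synthetic, mildly nonlinear "sensor" data (hypothetical, not the SPR data)
rng = np.random.default_rng(1)
conc = rng.uniform(0.0, 1.0, 60)
channel = np.linspace(0.0, 1.0, 20)
X = (np.outer(conc, np.exp(-(channel - 0.4) ** 2 / 0.02))
     + np.outer(conc ** 2, np.exp(-(channel - 0.6) ** 2 / 0.05))
     + rng.normal(scale=0.02, size=(60, 20)))

errors = cv_rmse_per_k(X, conc, max_k=6)
best_k = int(np.argmin(errors)) + 1   # number of components with minimum CV error
print("CV RMSE per number of components:", np.round(errors, 4))
print("selected number of components:", best_k)
```

In this sketch the PCA is refitted on each training fold, so the held-out samples do not influence the projection that is used to predict them.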

The bias can be explained by the different noise levels of the validation and calibration data sets in combination with the nonlinearities in the data sets (see the discussion in section 6.9). The linear PCA projection spreads the nonlinearities over many principal components, which leads the minimum cross-validation error criterion to select the high number of 19 components. At the same time, the typical noise of the calibration data set is also included in these components. Thus, most of these components contain a mixture of information relevant to the model and information about noise. As the validation data set was recorded by averaging two measurements, its noise is significantly reduced. This changes the data structure and thus the projection by the PCA, which causes the significant bias of the predictions.
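The noise argument above can be illustrated with a small simulation. The sketch below uses purely synthetic data with hypothetical values for the noise level, the response profiles and the sample size. It does not reproduce the thesis data or the bias itself. It only shows that averaging two measurements halves the noise variance, so the validation scores on the noise-dominated components of a PCA fitted to the calibration data are distributed differently from the calibration scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_channels, sigma = 2000, 30, 0.05   # all values are hypothetical

x = np.linspace(0.0, 1.0, n_channels)
p1 = np.exp(-(x - 0.4) ** 2 / 0.02)   # assumed response profile, linear term
p2 = np.exp(-(x - 0.6) ** 2 / 0.05)   # assumed response profile, quadratic term

def measure(conc, n_repeats):
    """Noisy signals, averaged over n_repeats replicate measurements."""
    clean = np.outer(conc, p1) + np.outer(conc ** 2, p2)   # mildly nonlinear response
    noise = rng.normal(scale=sigma, size=(n_repeats,) + clean.shape)
    return clean + noise.mean(axis=0)

X_cal = measure(rng.uniform(0.0, 1.0, n_samples), n_repeats=1)  # single measurements
X_val = measure(rng.uniform(0.0, 1.0, n_samples), n_repeats=2)  # average of two

# PCA fitted on the calibration data only, then applied to both data sets
mean = X_cal.mean(axis=0)
_, _, vt = np.linalg.svd(X_cal - mean, full_matrices=False)
scores_cal = (X_cal - mean) @ vt.T
scores_val = (X_val - mean) @ vt.T

# In this toy model the clean signal spans two directions, so the components
# from the third one onwards are dominated by noise.
var_cal = scores_cal[:, 2:].var(axis=0).mean()
var_val = scores_val[:, 2:].var(axis=0).mean()
ratio = var_val / var_cal
print("validation / calibration score variance on noise components:", round(ratio, 3))
```

A network trained on the calibration scores therefore sees inputs with a different spread when it is applied to the validation data. In the real data, where information and noise are mixed within the same components, such a shift can plausibly translate into a systematic prediction error.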

figure 68: Predictions of the calibration and validation data by PCA-NN.
