
9.1.4.   Brute Force Variable Selection

The low number of only 3 variables selected by the growing neural network framework for an optimal model also allows a comparison with a brute force variable selection. According to expression (14), there are 23426 different combinations of selecting 3 variables out of 53. For each of these combinations, a neural network (fully connected, with 4 hidden neurons and 1 output neuron) was trained on the calibration data set, and then the mean error of the prediction of the validation data set for both analytes was calculated (analogous to the prediction error shown in figure 4 for 2 variables of the refrigerant data set). This procedure was repeated 25 times with different initial weights for the neural networks (a higher number of runs would be desirable but was limited by the computing time). Among the 25 best networks with respect to the lowest mean prediction error of the validation set, only 1 combination of 3 variables was selected more than once. This combination was identical to the 3 time points selected by the growing networks, whereas the 23 other best selections were all different. Thus, the best combination of variables depends strongly on the initial weights of the training, which indicates a high correlation of the variables, rendering many combinations of 3 variables very similar. Nevertheless, the 3 variables selected by the parallel growing network framework were the variables most frequently selected individually among the 25 best selections of the brute force method, confirming the quality of the variable selection by the parallel growing network framework.
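The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name and the synthetic data are assumptions, and a least-squares fit with a small random weight perturbation stands in for training a 4-hidden-neuron network, which keeps the sketch fast while preserving the structure of the method — enumerate every subset of size 3, score each on the validation set, and repeat the whole search with different random initializations.

```python
import itertools
import math
import numpy as np

# With 53 candidate variables and subsets of size 3, the search space is
# math.comb(53, 3) == 23426 combinations, each of which gets its own model.

def brute_force_select(X_cal, y_cal, X_val, y_val, k=3, n_runs=25,
                       noise=0.05, seed=0):
    """Return the best k-variable subset found in each of n_runs searches.

    For every combination of k columns, a model is fitted on the
    calibration data and scored by the RMSE of its validation-set
    prediction. Here the 'model' is a linear least-squares fit whose
    weights are randomly perturbed, mimicking the run-to-run variance
    that different initial neural network weights cause in the thesis.
    """
    rng = np.random.default_rng(seed)
    winners = []
    for _ in range(n_runs):
        best_err, best_cols = np.inf, None
        for cols in itertools.combinations(range(X_cal.shape[1]), k):
            Xc, Xv = X_cal[:, list(cols)], X_val[:, list(cols)]
            w, *_ = np.linalg.lstsq(Xc, y_cal, rcond=None)
            # perturb the fitted weights: stand-in for the variability
            # introduced by different random initial network weights
            w = w + rng.normal(0.0, noise, size=w.shape)
            err = np.sqrt(np.mean((Xv @ w - y_val) ** 2))
            if err < best_err:
                best_err, best_cols = err, cols
        winners.append(best_cols)
    return winners
```

Tallying how often each individual variable appears across the returned winners reproduces the frequency analysis used above to compare the brute force selections with the variables chosen by the parallel growing network framework.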

Page 118 © Frank Dieterle, 03.03.2019