Dr. Frank Dieterle, Ph. D. Thesis

2.4.2.   Bootstrapping

Bootstrap resampling was originally developed to help analysts estimate how much their results might have changed if a different random sample had been used, and how different the results might be when a model is applied to new data. Bootstrapping has also become increasingly popular for resampling small data sets [18]. It is based on sampling with replacement to form a calibration set. In the most popular variant, the 0.632 bootstrap, a sample is drawn n times from the n available samples to form the calibration set, so the same sample can be selected several times. The samples that were never picked are then used as the test set. The probability that a particular sample is not picked for the calibration set is:

\[ \left(1 - \frac{1}{n}\right)^{n} \approx e^{-1} \approx 0.368 \]

Consequently, the test set contains on average about 36.8% of the samples, while the calibration set is built from the remaining 63.2% or so, with some of these samples replicated. Bootstrapping is not affected by asymptotic inconsistency and might be the best way of estimating the error for very small data sets, and the complete procedure can be repeated arbitrarily often [9].
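
The splitting procedure described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names (bootstrap_split, mean_test_fraction, expected_test_fraction), the seed and the use of the standard random module are assumptions made for illustration only. The sketch draws n indices with replacement for the calibration set, assigns the never-picked indices to the test set, and compares the average test-set share over many repetitions with the probability that a sample is never picked.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap split over the sample indices 0..n_samples-1.

    Returns (calibration, test). The calibration list holds n_samples
    indices drawn with replacement, so duplicates are possible. The test
    list holds every index that was never drawn.
    """
    calibration = [rng.randrange(n_samples) for _ in range(n_samples)]
    picked = set(calibration)
    test = [i for i in range(n_samples) if i not in picked]
    return calibration, test


def mean_test_fraction(n_samples, n_repeats, seed=0):
    """Average share of samples that end up in the test set.

    The split is repeated n_repeats times with a seeded generator
    (the seed is an arbitrary, hypothetical choice for reproducibility).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_repeats):
        _, test = bootstrap_split(n_samples, rng)
        total += len(test)
    return total / (n_repeats * n_samples)


def expected_test_fraction(n_samples):
    """Probability that one sample is never picked in n_samples draws."""
    return (1.0 - 1.0 / n_samples) ** n_samples


if __name__ == "__main__":
    n = 100
    print("expected test fraction:", expected_test_fraction(n))
    print("simulated test fraction:", mean_test_fraction(n, 2000))
```

For n = 100 the expected test fraction is roughly 0.366, already close to the limiting value of 0.368, and the simulated average should lie near it. In an actual error estimation, a model would be fitted on the calibration indices and evaluated on the test indices in each repetition; that step is omitted here because it depends on the calibration method.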

© Dr. Frank Dieterle, 14.08.2006