
10.3.   Optimization of the Measurements

Among the many parameters to be decided on and adjusted, the scanning speed of the time-resolved sensor responses was an often-discussed subject during the measurements for this work. Slow scanning of the sensor responses over time results in a low number of time points, which allows a calibration without variable selection or at least speeds up the variable selection procedures significantly. On the other hand, slow scanning of the sensor responses might miss the differences between the sensor responses of analytes that show very similar kinetics. To investigate this topic in more detail, fully connected neural networks were trained on the refrigerant data set, whereby the number of time points was systematically reduced by using only every 2nd, every 3rd, ... time point. Table 11 shows the prediction errors, which decrease with an increasing number of time points, i.e. with an increasing scanning speed. The table also demonstrates that only a sophisticated variable selection procedure improves the performance of calibration and prediction (compare with table 3 and table 4).

 

Method                    Calibration Data Set    Validation Data Set
                          R22      R134a          R22      R134a
Each Time Point           1.5      2.6            2.2      3.3
Each 2nd Time Point       2.0      3.0            2.4      3.3
Each 3rd Time Point       2.2      3.1            2.7      3.4
Each 4th Time Point       2.4      3.2            2.8      3.5
Each 5th Time Point       2.9      3.5            3.2      3.8
Each 10th Time Point      4.5      3.7            4.9      4.1
Each 20th Time Point      21.9     55.2           21.6     52.1

table 11:    Relative RMSE in % for the prediction of the refrigerant data set by fully connected neural networks using only every nth time point, which simulates a slower scanning of the time-resolved sensor responses.
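The downsampling experiment can be reproduced in a few lines. The following minimal sketch, given here in Python, trains one regressor per sampling rate using only every nth time point; the array layout, the network size and the training settings are illustrative assumptions and do not reproduce the exact configuration used in this work.

import numpy as np
from sklearn.neural_network import MLPRegressor

def relative_rmse(y_true, y_pred, conc_range):
    # RMSE expressed in % of the calibrated concentration range.
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return 100.0 * rmse / conc_range

def downsampling_errors(X_cal, y_cal, X_val, y_val, conc_range,
                        steps=(1, 2, 3, 4, 5, 10, 20)):
    # X_*: (samples, time points) responses, y_*: (samples, 2) concentrations.
    errors = {}
    for n in steps:
        Xc, Xv = X_cal[:, ::n], X_val[:, ::n]   # keep only every nth time point
        net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
        net.fit(Xc, y_cal)
        errors[n] = relative_rmse(y_val, net.predict(Xv), conc_range)
    return errors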

The variable selection by the frameworks also gives an indication of an optimal scanning speed for the time-resolved sensor responses. For practically all variable selections performed by the frameworks of the previous chapters, many of the selected variables were adjacent in time. For example, figure 46 shows that 9 out of the 12 time points within the interval from 67 s to 93 s are selected, demonstrating that nearly all information of the selected interval is exploited and that a further increase of the scanning speed might yield even more useful information.

The fact that the selected variables cluster in a few intervals is also known in PLS and has motivated further developments of PLS such as Interval Partial Least Squares (IPLS) [266]. It has often been stated that the collinearity of a certain number of variables stabilizes the predictions [41], whereas too many collinear variables negatively affect the predictions (see also section 2.8).
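To make the interval idea concrete, the following sketch splits the variables into contiguous intervals, fits a separate PLS model on each interval and ranks the intervals by their cross-validated error. The number of intervals and of PLS components are illustrative assumptions, not the settings of the method described in [266].

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

def ipls_interval_ranking(X, y, n_intervals=10, n_components=3):
    # Split the time axis into equally sized blocks of adjacent variables.
    bounds = np.linspace(0, X.shape[1], n_intervals + 1, dtype=int)
    scores = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        pls = PLSRegression(n_components=min(n_components, hi - lo))
        mse = -cross_val_score(pls, X[:, lo:hi], y, cv=5,
                               scoring="neg_mean_squared_error").mean()
        scores.append(((lo, hi), np.sqrt(mse)))
    return sorted(scores, key=lambda s: s[1])   # best interval first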

For practically all selections of the variables by the frameworks (for example in sections 8.4.1, 9.1.2, 9.2.3, 9.2.4 and 9.3.2), the selected variables are located directly after the beginning and directly after the end of the exposure to the analyte. This implies that the complete measurement time is not needed for the determination of the sample composition, but only a short interval of exposure followed by a short interval of analyte desorption. It also implies that the time of exposure to the analyte can be reduced, which additionally results in a faster desorption of the analyte (a synergetic effect) and consequently reduces the time needed between measurements. For this work, the times used for the exposure to the analyte and the subsequent recovery were determined by visually inspecting the sensor responses of the single analytes (like figure 24) and then choosing the time interval in which the shapes of the sensor responses differ significantly. For routine analysis, the calibration should be repeated measuring only during the time intervals proposed by the frameworks, which will save time and money.
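A possible implementation of such a restricted measurement scheme is sketched below: only the time points inside two short windows, one after the beginning and one after the end of the exposure, are kept. The window boundaries are hypothetical placeholders and would have to be replaced by the intervals actually proposed by the frameworks.

import numpy as np

def restrict_to_windows(X, t, windows=((60.0, 95.0), (300.0, 335.0))):
    # X: (samples, time points) responses, t: time axis in s.
    mask = np.zeros_like(t, dtype=bool)
    for start, end in windows:
        mask |= (t >= start) & (t <= end)   # keep points inside any window
    return X[:, mask], t[mask]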

The number of measurements that have to be performed for a calibration is also a significant point to be decided on when planning an experimental design. As the number of measurements for a full factorial design strongly increases with the number of analytes and the number of concentration levels (see equation (1)), the number of concentration levels for the calibration of the ternary and quaternary mixtures was rather low compared with the binary mixtures of the refrigerants. The price to be paid for calibrating with a 4-level design (used for the calibration of the quaternary mixtures) instead of a 21-level design can be estimated by using only 16 calibration samples instead of 441 samples of the refrigerant data. The mean relative RMSE of the validation for the non-optimized neural networks thereby increases from 2.7% for the 21-level design to 6.7% for the 4-level design. Thus, it is expected that the calibrations of the ternary and especially of the quaternary mixtures can be significantly improved by measuring more calibration samples.
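The growth described by equation (1) is easy to verify, assuming that a full factorial design measures every combination of the concentration levels of all analytes, i.e. levels to the power of analytes samples. The short sketch below reproduces the 441 versus 16 samples of the 21-level and 4-level binary designs mentioned above.

from itertools import product

def full_factorial(levels, analytes):
    # All combinations of concentration levels for a full factorial design.
    return list(product(range(levels), repeat=analytes))

print(len(full_factorial(21, 2)))   # 441 samples: 21-level binary design
print(len(full_factorial(4, 2)))    # 16 samples: 4-level binary design
print(len(full_factorial(4, 4)))    # 256 samples for a 4-level quaternary design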

The choice of the optimal thickness of the sensitive layer depends on several parameters, which are discussed in chapter 5 and in more detail in the results, and which are only summarized here. A thick layer means slow kinetics of the analytes, allowing the discrimination of very small and similar analytes. On the other hand, big analytes need a very long time until a noticeable sensor response can be observed, resulting in long measurement times. Thin layers, which allow fast measurements, can only be used in some setups due to a low signal-to-noise ratio, whereby a smoothing of the noisy signals can improve the calibration (in contrast to smoothing the nearly noise-free signals of thick layers). Among the different devices, the SPR setup is the most appropriate for time-resolved measurements using Makrolon, but needs the most complex equipment (such as an exact temperature control). The 4λ setup is the smallest and cheapest device but is only fairly appropriate for Makrolon as sensitive layer, whereas the RIfS array setup lies between the two former setups with respect to all these concerns.
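As an illustration of the smoothing mentioned above, the sketch below applies a Savitzky-Golay filter to the noisy responses of thin layers. The choice of this filter and its window length and polynomial order are assumptions for illustration only and would have to be tuned to the actual noise level.

from scipy.signal import savgol_filter

def smooth_responses(X, window=11, polyorder=3):
    # Smooth each time-resolved sensor response along the time axis.
    return savgol_filter(X, window_length=window, polyorder=polyorder, axis=1)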

Thus, apart from the highest possible scanning speed of the sensor responses in combination with a variable selection, and the highest possible number of calibration samples, no general recommendation can be given for most parameters, as the optimal solution is determined by the analytes under investigation and by external conditions such as the time allowed for each measurement, the demanded robustness of the devices, and much more.
