Commit a8701fff authored by DIANE Abderrahim
This workflow was developed to address these challenges by providing a semi-automatic solution that simplifies the development of NIRS calibrations and facilitates predictions using these calibrations. Its primary objective is to reduce the barriers for laboratory personnel performing analyses, whether they are beginners or experienced professionals. The workflow helps minimize reliance on costly and time-consuming wet chemistry while enabling efficient and accurate data processing. Furthermore, it extends its utility beyond NIRS to other optical techniques, such as MIRS, broadening its applicability to a wider range of analytical methods. By offering diverse functionalities and intuitive tools, it not only supports various spectroscopic techniques but also enhances laboratory efficiency, making it a versatile and essential resource for modern analytical workflows.
</p>
## Main Features
<p style='text-align: justify;'>
The NIRS workflow integrates a powerful collection of advanced data analysis algorithms commonly used for infrared (IR) data processing. It spans the entire NIRS calibration development pipeline, from signal preprocessing to the application of both supervised and unsupervised machine learning techniques.
</p>

## I - Sample Selection
<p style='text-align: justify;'>
Selecting a representative subset for NIRS (Near-Infrared Spectroscopy) calibration is a critical step in developing robust calibration models, particularly before proceeding to chemical analysis. This process involves identifying a smaller group of samples that adequately reflect the chemical, physical, and spectral variability present in the entire dataset. By focusing on a well-selected subset, calibration efforts can achieve better generalization across unseen data, avoid overfitting, and optimize the use of computational resources.
</p>
<p style='text-align: justify;'>
The selected subset of samples will undergo analysis using a reference method, providing the chemical values that will serve as the target variables for calibration model development. This dual role of the subset — capturing the diversity of the dataset while being used for detailed reference analysis — underscores the importance of careful selection. A representative subset ensures that the calibration model is informed by both accurate and diverse chemical compositions, enabling robust predictions for future samples.
</p>
<p style='text-align: justify;'>
This approach is particularly vital when dealing with large datasets, where performing reference chemical analysis on all samples is impractical, costly, or time-intensive. By selecting a subset strategically, the workload and cost of laboratory analysis are reduced while maintaining the quality and reliability of the calibration model. Additionally, selecting samples that represent the entire spectrum of variability in the dataset ensures the model's ability to generalize effectively to unseen data.
</p>
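One widely used strategy for picking such a spectrally representative subset is the Kennard-Stone algorithm, which iteratively selects the samples most distant from those already chosen. The sketch below is illustrative only (the function name and synthetic data are not from the workflow, which may rely on clustering-based selection instead):

```python
import numpy as np

def kennard_stone(X, n_select):
    """Kennard-Stone subset selection: seed with the two most distant
    samples, then repeatedly add the sample farthest from the set of
    already-selected samples."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all spectra
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Seed with the two most mutually distant samples
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        # Distance of each remaining sample to its nearest selected sample
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining[int(np.argmax(min_d))])
    return selected

# Example: pick 10 representative spectra out of 100 synthetic ones
rng = np.random.default_rng(0)
spectra = rng.normal(size=(100, 50))   # 100 samples x 50 wavelengths
subset = kennard_stone(spectra, 10)    # indices into the original dataset
print(len(subset))
```

The selected indices identify the samples that would then be sent for reference chemical analysis, while the remaining spectra stay available for later prediction.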
## II - Calibration Model Development
<p style='text-align: justify;'>
Once the representative subset of samples is selected, calibration development involves creating a predictive model that links the spectral data with the chemical reference values obtained through standard laboratory methods. This step is crucial for translating the spectral information into meaningful chemical predictions.
</p>
<p style='text-align: justify;'>
Key steps in calibration development include:
</p>

1. **Reference Analysis**: The selected samples are analyzed using a reference method (e.g., wet chemistry, chromatography) to obtain accurate chemical values, which serve as the target variables for the calibration model.
2. **Preprocessing**: The spectral data are preprocessed to minimize noise, baseline shifts, and scattering effects. Common preprocessing techniques include Savitzky-Golay smoothing, Standard Normal Variate (SNV), and Multiplicative Scatter Correction (MSC).
3. **Model Building**: Multivariate regression methods, such as Partial Least Squares Regression (PLSR), Principal Component Regression (PCR), or machine learning models, are applied to build the calibration model. The selected subset ensures that the model is trained on a diverse dataset, capturing the variability in both spectral features and chemical compositions.
4. **Model Validation**: The calibration model is validated on an independent set of samples (not part of the selected subset). Metrics such as R², the Root Mean Square Error of Prediction (RMSEP), and residuals are used to evaluate performance. Cross-validation methods, such as leave-one-out or k-fold validation, can also be applied.
5. **Model Optimization**: Model parameters, such as the number of latent variables in PLSR or hyperparameters in machine learning methods, are tuned to achieve the best balance between prediction accuracy and model complexity.
<!-- ### Dimension Reduction -->
<!-- ### Clustering analysis -->
<!-- [K-Means](Clustering.md#k-means-clustering) -->
<!-- [HDBSCAN](Clustering.md#hdbscan-clustering) -->
<!-- ### Representative subset selection -->
<!-- ## II - Models Creation -->
<!-- ### Data split into train/test subsets -->
<!-- ### Predictive model creation -->
<!-- [lwPlsR from Jchemo (Julia)](model_creation.md) -->
<!-- ### Predictive model evaluation -->
<!-- ## III - Prediction making -->
<!-- ## IV - Reporting -->