Multivariate modeling - Near infrared spectroscopy-based evaluation of patellar tendon and knee

In study I, the calibration model was constructed using the mixOmics variant of NIPALS PLSR [118] in R [119], which accounts for grouping variables. These grouping variables are similar to random effects in mixed-effects models in the sense that they can control for inter-subject variation and repeated measures [120,121]. In this study, the individual joints were assigned as grouping variables and the ligament/tendon type was included as a one-hot encoded predictor in the models. Leave-one-out cross-validation was used both to determine the optimal number of latent variables and to evaluate the final performance of each model. Table 6.2 contains the full list of reference variables used in study I.

In study II, the relationship between the NIR spectra and each reference variable (Table 6.4) was modeled using PLSR (implemented in Python using the scikit-learn module [115]). Each model was created with a one-dimensional response variable (i.e., separate models for different reference variables) using the NIPALS algorithm, with 5-fold cross-validation for determining the number of latent variables. The effect of multiple preprocessing pipelines on the model performance was investigated using the nippy module. Due to the relatively low number of samples, the model performance was evaluated with cross-validation.
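The latent-variable selection described above can be sketched in a few lines. The thesis relied on the scikit-learn PLSR implementation; the block below is not that code but a dependency-free illustration of the same two ingredients, NIPALS for a single response and 5-fold cross-validation over candidate numbers of latent variables. The function names, the synthetic data, and the selection rule (lowest cross-validated RMSE) are assumptions made for this example.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model using NIPALS-style deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [_dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                        # scores
        tt = _dot(t, t)
        load = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = _dot(f, t) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict one sample by replaying the stored deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = _dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def kfold_rmse(X, y, n_components, k=5, seed=0):
    """Pooled RMSE over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train], n_components)
        sq_err += sum((y[i] - predict_pls1(model, X[i])) ** 2 for i in fold)
    return math.sqrt(sq_err / len(X))


def select_n_components(X, y, max_components, k=5):
    """Assumed rule: keep the number of latent variables with the lowest RMSE."""
    scores = {a: kfold_rmse(X, y, a, k) for a in range(1, max_components + 1)}
    return min(scores, key=scores.get), scores


# Synthetic stand-in for spectra (30 samples, 4 variables); not thesis data.
rng = random.Random(1)
X_demo = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
y_demo = [2 * r[0] - r[1] + rng.gauss(0, 0.1) for r in X_demo]
best_lv, cv_scores = select_n_components(X_demo, y_demo, max_components=4)
```

In practice the same loop would be run once per reference variable, since each model has a one-dimensional response.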

In study III, the automated preprocessing optimization was demonstrated with two examples using previously published datasets. In both examples, the creation and validation of the calibration models followed the methodology of the original publication, with the exception of preprocessing, which was optimized for each model.

The first example used a barley cultivar identification dataset with a nu-variant support vector machine (SVM) as the classifier [99]. The second example predicted the instantaneous modulus of equine articular cartilage using PLSR [22]. The performance of the second example was validated using an independent test set, which was selected using the same criteria as in the original paper [22]. Both the SVM and PLSR models were implemented in Python using the scikit-learn module [115].

7 Results

This section briefly summarizes the main findings of the studies included in the thesis. A more in-depth and detailed report of the results can be found in the original publications, which are included as supplementary material at the end of the thesis.

In study I, the PLSR models predicting the mechanical and morphological properties of the ligament and tendon samples from NIRS were ranked based on their cross-validated coefficient of determination (Q²_CV) and root mean squared error (RMSE_CV) (Fig. 7.1). Individual models were created for the five adjacent NIRS measurement sites, and the median value over these sites was considered as the overall prediction performance (per property). PLSR models for which the cross-validation yielded Q²_CV values greater than 0.5 and RMSE_CV values less than 10% were considered to have acceptable predictive performance. Out of all the biomechanical property groups (i.e., morphological, sinusoidal, stress-relaxation, and quasi-static), only the quasi-static group produced models with acceptable performance. More specifically, three properties related to the ultimate tensile strength of the sample satisfied both conditions: toughness at yield point (Q²_CV = 0.54, RMSE_CV = 6.14%), toughness at failure point (Q²_CV = 0.53, RMSE_CV = 6.55%), and ultimate strength of the sample (Q²_CV = 0.52, RMSE_CV = 8.27%). The group describing morphological properties of the samples yielded the lowest prediction performance of all the groups. Prediction performance in the groups related to viscoelastic properties (sinusoidal and stress-relaxation) ranked between the quasi-static and morphological groups but below the selected threshold for acceptable accuracy. Additionally, statistical testing (Kruskal-Wallis one-way analysis of variance) revealed no statistically significant differences in the NIR spectra between the different ligament and tendon types.
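The acceptance rule used above (Q²_CV greater than 0.5 and range-normalized RMSE_CV below 10%) can be written out as a small sketch. The function names are hypothetical, and the formula for Q²_CV (one minus the ratio of the summed squared cross-validated residuals to the total sum of squares around the overall mean) is the common textbook definition, assumed here rather than taken from the thesis.

```python
import math


def cv_metrics(y_true, y_pred):
    """Return (Q2, range-normalized RMSE in %) for cross-validated predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    q2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # RMSE is normalized to the range of the dependent variable.
    nrmse_pct = 100.0 * rmse / (max(y_true) - min(y_true))
    return q2, nrmse_pct


def is_acceptable(q2, nrmse_pct, q2_min=0.5, nrmse_max=10.0):
    """Thresholds as stated in the text: Q2 > 0.5 and RMSE < 10 %."""
    return q2 > q2_min and nrmse_pct < nrmse_max


# Invented reference values and cross-validated predictions (not thesis data).
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cv = [1.1, 1.9, 3.2, 3.8, 5.1]
q2_demo, nrmse_demo = cv_metrics(y_ref, y_cv)

# Reported value pair for toughness at yield point passes both thresholds.
yield_toughness_ok = is_acceptable(0.54, 6.14)
```

Both conditions must hold at once, which is why a model with a reasonable Q²_CV can still be rejected when its normalized error is too large.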

In study II, the PLSR models predicting the chemical and structural properties of the ligament and tendon samples were ranked similarly to study I (Fig. 7.2). In study II, however, the effect of NIRS measurement site was omitted and all models were created with spectra averaged over the five sites (i.e., each performance metric is reported as the median over the repeated cross-validations with bootstrapped 95% confidence intervals). Out of all the PLSR models, only those predicting the contents of water (R²_CV = 0.65, RMSE_CV = 9.97%) and hydroxyproline (R²_CV = 0.57, RMSE_CV = 12.58%) produced R²_CV values higher than 0.5. The models predicting the remaining chemical properties produced R²_CV values below 0.14. Similarly, the models predicting the structural properties (i.e., collagen crimp length and angle) produced R²_CV < 0.04.
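Reporting a metric as the median over repeated cross-validations with a bootstrapped 95% confidence interval can be illustrated as follows. This is a minimal sketch assuming a percentile bootstrap of the median; the thesis does not state which bootstrap variant was used, and the function name, the number of resamples, and the sample values are invented for illustration.

```python
import random
import statistics


def bootstrap_median_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Median of `values` with a percentile-bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and collect the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = medians[int((alpha / 2) * n_boot)]
    upper = medians[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.median(values), lower, upper


# Invented R2_CV values from ten repeated cross-validations (not thesis data).
r2_runs = [0.61, 0.66, 0.58, 0.70, 0.64, 0.67, 0.53, 0.69, 0.65, 0.63]
median_r2, ci_low, ci_high = bootstrap_median_ci(r2_runs)
```

A fixed seed keeps the interval reproducible between runs, which matters when intervals for several models are compared side by side.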

Full details of the PLSR models are reported in Table 7.2. The investigation of cross-correlation between the reference properties revealed a strong cross-correlation between the water and hydroxyproline contents (Table 7.1).

In study III, an open-source Python module (nippy) for more efficient spectral preprocessing was developed. In short, the developed module acts as a repository of numerous preprocessing operations, which can be easily combined into different preprocessing pipelines through the use of configuration files (Fig. 7.3). The

Figure 7.1: Calibration and cross-validated accuracies of PLSR models predicting different mechanical and morphological properties of bovine knee ligaments and the patellar tendon. Boxes represent the models created from five separate NIRS measurement locations and the vertical lines correspond to the median performance. The root mean squared error (RMSE) is normalized to the range of the dependent variable. Dashed lines correspond to the limit of acceptable performance in cross-validated metrics.

Figure 7.2: Cross-validated accuracies of PLSR models predicting chemical and structural properties of bovine knee ligaments and the patellar tendon. The root mean squared error (RMSE) is normalized to the range of the dependent variable. Vertical lines correspond to the median accuracy across the cross-validation folds, while the width of the boxes marks the 95% confidence interval of the median. The dashed vertical line represents the limit of acceptable performance in cross-validated metrics.

Table 7.1: Correlation coefficients between the different chemical and structural reference properties of the ligaments and the patellar tendon.

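| | Water | Hydroxypr. | Hydroxypr. d.w. | Uronic acid | Uronic acid d.w. | Elastin | Elastin d.w. | Crimp angle | Crimp length |
|---|---|---|---|---|---|---|---|---|---|
| Water | 1.0 | | | | | | | | |
| Hydroxypr. | -0.9 | 1.0 | | | | | | | |
| Hydroxypr. d.w. | 0.1 | 0.3 | 1.0 | | | | | | |
| Uronic acid | 0.1 | -0.0 | 0.1 | 1.0 | | | | | |
| Uronic acid d.w. | 0.6 | -0.6 | 0.0 | 0.8 | 1.0 | | | | |
| Elastin | -0.3 | 0.3 | 0.1 | -0.0 | -0.2 | 1.0 | | | |
| Elastin d.w. | 0.3 | -0.2 | 0.1 | 0.1 | 0.2 | 0.8 | 1.0 | | |
| Crimp angle | -0.2 | 0.1 | -0.2 | 0.0 | -0.1 | 0.3 | 0.2 | 1.0 | |
| Crimp length | -0.4 | 0.6 | 0.4 | 0.1 | -0.1 | 0.1 | -0.1 | -0.4 | 1.0 |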

Table 7.2: Performance of PLSR model prediction based on the coefficient of determination for the calibration set (R²_cal.). Cross-validated performance is reported according to the coefficients of determination (R²_CV) and root mean squared errors (RMSE_CV) together with the corresponding bootstrapped confidence intervals. The number of latent variables (LVs) used in the PLSR models is also reported. The wavelength (λ) column gives the final number of wavelengths chosen by the variable selection step relative to the initial number of wavelengths used in the model construction (final / initial).

| Property | λ | LVs | R²_cal. | R²_CV | RMSE_CV |
|---|---|---|---|---|---|
| Water | 161 / 232 | 13 | 1.00 | 0.65 (0.53 – 0.71) | 0.48 (0.41 – 0.52) |
| Hydroxypr. | 159 / 232 | 11 | 1.00 | 0.57 (0.48 – 0.68) | 0.56 (0.50 – 0.60) |
| Hydroxypr. d.w. | 90 / 232 | 2 | 0.63 | 0.06 (-0.23 – 0.21) | 0.85 (0.71 – 0.92) |
| Uronic acid | 153 / 364 | 1 | 0.29 | 0.05 (-0.10 – 0.13) | 0.84 (0.80 – 0.87) |
| Uronic acid d.w. | 171 / 232 | 3 | 0.50 | 0.14 (-0.30 – 0.26) | 0.86 (0.76 – 0.91) |
| Elastin | 74 / 232 | 1 | 0.10 | -0.19 (-0.35 – -0.06) | 0.82 (0.73 – 0.90) |
| Elastin d.w. | 56 / 88 | 4 | 0.13 | -0.53 (-0.83 – -0.35) | 0.87 (0.78 – 1.03) |
| Crimp angle | 46 / 88 | 4 | 0.38 | 0.04 (-0.11 – 0.20) | 0.80 (0.76 – 0.85) |
| Crimp length | 11 / 364 | 2 | 0.08 | -0.33 (-0.49 – -0.16) | 0.80 (0.73 – 1.00) |

[Figure 7.3 panels: a, configuration section (section header [SAVGOL]); b, generated pipelines; c, an example configuration file]

Figure 7.3: a: Preprocessing operations in nippy are defined as sections. Each section can contain multiple parameters and each parameter can contain multiple different values. As an example, a configuration section for Savitzky-Golay filtering with spectral derivation is provided. The also_skip parameter is used to create an additional pipeline where the operation in question is omitted. b: Pipelines generated from the configuration of panel a. Four filtering window sizes and derivation up to the second degree produce a total of 12 preprocessing pipelines. The inclusion of the also_skip parameter produces one additional reference pipeline which does not alter the input data. c: Multiple configuration sections can be combined to form a single configuration file. The nippy module will automatically generate all possible combinations while ensuring that only one operation per category is included in the pipelines.

module, therefore, facilitates rapid testing of large numbers of comparable preprocessing pipelines in order to find the best one for a given application. Paper III describes the features and typical usage of the module, while more comprehensive documentation, source code, and examples are available online (https://github.com/uef-bbc/nippy).
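The pipeline counts given in the caption of Fig. 7.3 (4 window sizes × 3 derivative orders = 12 pipelines, plus one reference pipeline from also_skip) follow from a plain Cartesian product. The sketch below reproduces that arithmetic with the standard library only; it does not use or imitate the nippy API. Apart from also_skip and the section name SAVGOL, which appear in the figure, the parameter names, the window sizes, and the second section are hypothetical.

```python
from itertools import product

# Hypothetical values: the source only states that there are four window
# sizes and derivatives up to the second degree.
savgol_section = {
    "filter_win": [7, 11, 21, 41],
    "deriv_order": [0, 1, 2],
    "also_skip": True,
}


def expand_section(name, section):
    """List every parameter combination of one section as a pipeline step."""
    params = {k: v for k, v in section.items() if k != "also_skip"}
    keys = list(params)
    variants = [
        (name, dict(zip(keys, combo)))
        for combo in product(*(params[k] for k in keys))
    ]
    if section.get("also_skip"):
        variants.append(None)  # None marks "this operation is omitted"
    return variants


def build_pipelines(config):
    """Combine sections so that each contributes at most one step per pipeline."""
    per_section = [expand_section(name, sec) for name, sec in config.items()]
    return [
        [step for step in combo if step is not None]
        for combo in product(*per_section)
    ]


pipelines = build_pipelines({"SAVGOL": savgol_section})

# Adding a second, parameter-free section (hypothetical) doubles the count.
combined = build_pipelines({"SAVGOL": savgol_section, "SNV": {"also_skip": True}})
```

The empty pipeline in the result corresponds to the reference case in which the input data are left unaltered, and the total grows multiplicatively with every section added to the configuration.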

In study III, the benefit of automated spectral preprocessing was demonstrated with two examples: classification and regression (comparison of results presented in Fig. 7.4). In the classification task, a total of 38 preprocessing pipelines were compared to predict 24 classes of a stratified 30% test set (N = 360) using a support vector machine classifier. The baseline test set prediction accuracy (i.e., no preprocessing) of the classifier was 80.3%, while the same value for the best performing preprocessing pipeline was 87.2% (Fig. 7.5 a). The classification accuracy of the test set reported in the study from which the dataset originated was 86.9% [99]. The preprocessing pipeline with the highest accuracy consisted of SNV scatter correction and first-order derivative SG filtering (11 points, 3rd-order polynomial fit).
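As a reminder of what the scatter-correction step of the best classification pipeline does, standard normal variate (SNV) correction centers each spectrum on its own mean and scales it by its own standard deviation. The sketch below covers only this SNV step (the Savitzky-Golay derivative is not shown); the function name, the use of the sample standard deviation, and the absorbance values are assumptions for illustration.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: per-spectrum centering and scaling."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (assumed)
    return [(value - mean) / std for value in spectrum]


# Invented absorbance values for a single spectrum (not thesis data).
raw = [0.21, 0.25, 0.40, 0.62, 0.58, 0.33]
corrected = snv(raw)

# A multiplicative and additive distortion of the same spectrum, as caused
# by scattering, is mapped onto the same corrected spectrum.
distorted = snv([2.0 * value + 0.3 for value in raw])
```

This invariance to offset and scale is why SNV is used to suppress scatter-related variation between spectra before a model is fitted.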

In the second example of study III, PLSR models were used to predict the instantaneous modulus of equine fetlock cartilage from NIR spectra. A total of 1618 comparable preprocessing pipelines were tested using the coefficient of determination (R²) and root mean squared error of prediction (RMSEP) of an independent test set (9% of the data, N = 70) as the performance metrics. The accuracy of the baseline model was R² = 0.25, RMSEP = 3.06 MPa, whereas the best preprocessing pipeline

Figure 7.4: Comparison of model prediction performance in classification (top) and regression (bottom) examples as a result of preprocessing optimization. Performance values reported in the original publication of each open dataset are also included.

produced a model with R² = 0.63 and RMSEP = 2.15 MPa (Fig. 7.5 b). The original study reported a predictive performance of R² = 0.51 and RMSEP = 2.46 MPa for the independent test set [22]. The optimal preprocessing pipeline consisted of limiting the total wavelength range to 700 – 960 nm, RNV scatter correction with an 85% – 15% interquartile range, and SG filtering (41 points, 3rd-order polynomial fit). No derivation was applied to the NIR spectra.

[Figure 7.5 panel annotations: R² = 0.25, RMSEP = 3.06 MPa; R² = 0.63, RMSEP = 2.15 MPa]

Figure 7.5: a: The effect of preprocessing optimization in the regression example. The baseline performance corresponds to a PLSR model with no preprocessing applied to the NIR spectra. b: The effect of preprocessing optimization in the classification example. The confusion matrix on the left corresponds to the baseline model with no preprocessing of the NIR spectra, while the confusion matrix on the right corresponds to the best preprocessing pipeline.

8 Discussion

The interactions between NIR light and connective tissues of the knee have been extensively investigated in the past [9–12, 83, 122–125]. Rapid, non-destructive tissue evaluation with NIRS could have substantial diagnostic applications in the fields of orthopedic surgery and arthroscopy. While the NIRS-based tissue evaluation technique has been demonstrated for articular cartilage [9, 12, 83, 123–125], meniscus [10, 122], and subchondral bone [11, 126], its feasibility for analysing ligament properties is yet to be proven. Studies I and II of this thesis, therefore, focused on establishing which tissue properties of ligaments can be quantitatively determined using NIRS. The investigation was carried out using the four primary knee ligaments and the patellar tendon of ten skeletally mature bovine stifle joints of comparable age. The ligament samples were fully characterized by a set of reference variables describing their mechanical response, chemical composition, and internal structure. The relationship between the NIRS measurements, taken from the surface of the samples, and the reference variables was examined using various PLSR models.

Due to increasing interest in NIRS-based evaluation of connective tissues, arthroscopic NIRS is now emerging as a new sub-field of applied spectroscopy [127]. The accuracy of all NIRS calibration models, however, heavily depends on the data analysis side of the technique (i.e., chemometrics). Study III of the thesis focused on improving the development of chemometric models by releasing an open-source toolbox for optimizing spectral preprocessing. In study IV, an extensive set of NIRS measurements and associated reference properties of equine articular cartilage was released in an open data publication to further facilitate the development of chemometric methods for arthroscopic tissue evaluation.

8.1 NIRS-BASED EVALUATION OF KNEE LIGAMENTS AND THE PATELLAR TENDON

The sensitivity of NIRS towards different reference properties of ligaments and the patellar tendon was determined in two parts: study I investigated the mechanical and morphological properties, while study II focused on the chemical composition and internal structure. The mechanical properties with the highest prediction accuracy belonged to the quasi-static properties, which were determined during the ultimate tensile test. The properties related to the failure mechanics of the tissue were predicted with the highest accuracy. Models predicting the viscoelastic properties determined using stress-relaxation and sinusoidal testing, however, failed to reach acceptable levels of accuracy. Finally, properties related to the morphology of the sample pieces were predicted with the lowest overall accuracy. The morphological properties included in study I were mostly a result of sample preparation (e.g., sample dimensions) and did not represent any natural anatomical features of the tissue, with the exception of tissue density. Although not strictly a morphological property, the tissue density was included in the morphology group for the sake of convenience, as making a fifth property group for a single reference variable was deemed impractical. It should also be noted that the internal crimp structure of the collagen fibers could also be construed to belong to the morphology group. The crimp parameters were, however, not determined before study II and were analysed at a later date. The low prediction accuracy of these morphological features indicates that the sample preparation did not, in any meaningful or systematic way, alter the light propagation in the tissue samples. This made it possible to rule out one possible source of error when considering the limitations of the study.

In study II, the only properties with a high prediction accuracy were the water and hydroxyproline contents (hydroxyproline was used as an indicator of the total amount of collagen in the tissue). The result was not surprising, as water and collagen are the two most abundant components in ligaments. It should be noted, however, that there was a strong negative correlation (r = -0.9) between the two properties. With such a strong correlation, the prediction of one property will automatically yield the other one. In the context of this dataset, NIRS is not really evaluating two separate aspects of the chemical composition, but rather a dyad of two interlinked properties. Further investigation of the regression coefficients of the water and hydroxyproline models revealed that both models heavily utilized wavelength regions related to water and CH-bonds. It appears that the models utilized the absorbances of both water and protein components to reach the reported prediction accuracy. The remaining chemical and structural properties resulted in models with low prediction accuracy. With respect to the elastin and uronic acid contents, this was most likely caused by their relative proportions being below the detection threshold of NIRS. While structural properties (e.g., particle size) can alter the scattering of NIR light within materials, the low prediction accuracy of the crimp parameters suggests that this is not the case for the folded structure of collagen fibers, as far as this sample set is concerned.
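A brief numerical aside on the correlation discussed above: for two linearly related variables, the squared Pearson coefficient gives the share of variance in one variable that a straight-line fit on the other reproduces, so r = -0.9 corresponds to roughly 81%. The sketch below computes the coefficient from scratch; the function name and the sample values are invented for illustration and are not thesis data.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Invented water / hydroxyproline values (arbitrary units), illustration only.
water = [62.0, 64.5, 66.0, 68.5, 70.0, 72.5]
hydroxyproline = [9.8, 9.1, 9.3, 8.2, 8.4, 7.6]

r = pearson_r(water, hydroxyproline)
shared_variance = r ** 2        # fraction of variance a linear fit reproduces
reported_shared = (-0.9) ** 2   # value implied by the r reported in Table 7.1
```

The remaining variance, about one fifth at r = -0.9, is the part of one property that cannot be inferred from the other through a linear relationship.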

The mechanical properties of ligament tissue are not directly related to any of its optical properties; instead, the ligament's response to tensile loading is dictated by some combination of the tissue composition and structural organization (Fig. 8.1). Conversely, both the composition and structure can display a direct relationship to optical properties via absorption and scattering, respectively. The PLSR models of study I described the indirect relationship between visible/NIR light and the mechanical behaviour of ligaments, while in study II the relationship was more direct. It stands to reason that the PLSR modeling of an indirect interaction is more likely to result in lower accuracy than models of direct interactions. Although the models evaluated in studies I and II had comparable prediction accuracies, the models devised in study I relied on an additional grouping variable which eliminated the inter-specimen variation in the data. No similar normalization was necessary for the models in study II, which most likely results from the more robust link between the reference properties and the NIR spectra.

To summarize the findings of studies I and II, NIRS seems to be sensitive towards the mechanical properties related to the yield and failure mechanics of the ligament and patellar tendon samples. During study I it was already hypothesized that these properties are most likely governed by collagen, which is the main load-bearing component of the tissue [37]. The subsequent findings of study II supported this idea, as the hydroxyproline content (i.e., the surrogate for the total amount of collagen) was one of the best predicted properties of the samples. An earlier comparison of the mechanical properties and chemical composition of this same sample set also detected a correlation between the hydroxyproline content and the tensile strength of the tissue [128]. From the perspective of arthroscopic applications, the most likely usage of NIRS in evaluating ligaments would be to determine the

Figure 8.1: NIRS measurements can either be directly or indirectly related to the reference properties of the target material. In direct models, the interaction between incoming NIR light and the material is determined by the absorption by a specific analyte or the scattering induced by the internal structure. In indirect models, such as those predicting the mechanical properties, the relationship is more complex and most likely results from the interplay between several compositional and structural variables. Direct NIRS models tend to be simpler and more robust, resulting in more reliable and accurate models.

severity of the damage to injured ligaments. Therefore, it could be argued that out of all the possible mechanical properties, predictions of strength and stiffness of the tissue are the most relevant parameters for potential future clinical applications.

With the exception of the work of Padalkar et al. [123], very little prior research has been conducted on NIRS-based analysis of knee ligaments. In Padalkar et al., NIR spectra were measured from six different locations of six bovine ACLs. The reference analysis consisted of determining the corresponding water content in all of the 36 measurement locations. Additionally, six bovine PTs were also measured with NIRS (two locations per tendon), but the reference analysis was omitted for these samples. The study concluded that NIRS would be capable of detecting varying water contents at different spatial locations along the length of the ACL. This finding was corroborated by study II, which unsurprisingly showed that, in addition to the ACL, the water content can be determined in all four knee ligaments and the PT. Padalkar et al. also showed that a linear discriminant analysis classifier combined with PCA was capable of differentiating the NIR spectra of ACL and PT. The comparison of NIR spectra acquired from different ligament types in study I, however, failed to detect any statistically significant differences between the ligament types. This finding does not fully contradict the findings of Padalkar et al., as a more extensive classification analysis with spectral decomposition was not included in study I. Among other optical spectroscopic techniques, Matsunaga et al. recently reported Raman spectroscopy to be sensitive towards microstructural degeneration of the ACL [129]. Unfortunately, a similar analysis for NIRS could not be replicated in this thesis, as the ligament reference properties only consisted of bulk compositional and structural properties.

Studies I and II had some limitations that are worth considering. While the total
