Validation parameters for FPA-FTIR

5 QUALITY CONTROL AND ASSURANCE

In MP analysis, pre-treatment methods may include the use of oxidizers, acids, or bases, which may degrade or fragment MPs (Hurley et al., 2018). MPs can also be degraded during the measurement. Of the common measurement techniques, Raman spectroscopy in particular can degrade small particles because it exposes samples to intense laser light. The stability of MPs during pre-treatment and analysis should be studied with reference particles of various polymer types, sizes, and shapes. Moreover, if environmental samples are stored for a long time before MP analysis, their stability should be ensured, because the matrix can, for example, have a high or low pH or contain oxidizing compounds that may degrade MPs.

5.2 CONTAMINATION

MPs are ubiquitous: they are found in air, water, laboratory equipment, and chemicals. Samples are therefore easily contaminated unless precautions are taken. Because contamination can happen during sampling, pre-treatment, or measurement, it should be controlled throughout the whole process, at each step where applicable (Brander et al., 2020). Preventing and measuring contamination are thus crucial parts of QA and QC, and the contamination level must also be known for method validation. Multiple practices have been proposed for preventing contamination: working in a laminar flow cabinet, avoiding plastic equipment, wearing a coat made of natural fibers, using laboratory gloves, and filtering the water and solutions used in the pre-treatments (Brander et al., 2020; Prata et al., 2019). Glassware should be washed properly and rinsed with ultrapure water before use. Sample containers must be kept closed whenever possible with glass lids or metal foil. Because the laboratory air should be as clean as possible, the use of air purifiers and sticky mats is encouraged. The smaller the particles studied, the more carefully contamination must be controlled.

Contamination is monitored with blank samples, in other words negative controls (Brander et al., 2020). Depending on the sample matrix, a blank sample can be, for example, ultrapure water, empty and clean containers or filters, or old, dated sediment that is free of plastic. The principle is that a plastic-free sample undergoes the same pre-treatment and analysis process as the actual samples. The number of MPs in the blanks then defines the contamination level. At least three blanks are required to obtain a mean and standard deviation (SD). However, while blanks are typically relatively easy to design and run for laboratory analysis, monitoring contamination during sampling is not always easy for environmental samples. If the sampling equipment contains plastic parts, measuring their FTIR or Raman spectra is recommended so that the equipment materials can be excluded from the results (Brander et al., 2020; Kroon et al., 2018). The other source of contamination during sampling is air, which can be monitored separately with clean filters if fallout is expected to affect the samples. However, closed sampling systems are preferable because they prevent exposure to air.
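As a minimal sketch of how the contamination level could be summarised from blanks, the following Python snippet computes the mean and sample SD of the MP counts in three blanks. The counts and the function name are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean, stdev


def blank_contamination(blank_counts):
    """Return (mean, sample SD) of the MP counts found in blank samples."""
    # The text above asks for at least three blanks to report mean and SD.
    if len(blank_counts) < 3:
        raise ValueError("at least three blanks are needed for mean and SD")
    return mean(blank_counts), stdev(blank_counts)


# Hypothetical MP counts from three blanks (illustrative values only)
blank_mean, blank_sd = blank_contamination([2, 4, 3])
print(f"contamination level: {blank_mean:.1f} ± {blank_sd:.1f} MPs per blank")
```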

5.3 VALIDATION PARAMETERS FOR FPA-FTIR

To validate an FPA-FTIR method for qualitative and quantitative analysis of MPs, several validation parameters should be measured. A single measurement has multiple outputs, such as numbers of MPs, polymer types, and particle sizes, and all of these should be validated. The following subsections evaluate the feasibility and determination of the validation parameters for the outputs of FPA-FTIR. However, because the definitions of validation parameters are not universally agreed upon (Magnusson and Örnemark, 2014), each definition is briefly explained in this context before its feasibility for FPA-FTIR is assessed.

5.3.1 Selectivity, specificity, and matrix interference

Taverniers et al. (2004) define specificity as: “The ability of the method to determine accurately and specifically the analyte of interest in the presence of other compounds in a sample matrix under the stated conditions of the test.” Specificity and selectivity are sometimes used interchangeably, but specificity is sometimes defined as 100% selectivity (Taverniers et al., 2004). IUPAC (International Union of Pure and Applied Chemistry) recommends using selectivity instead of specificity in analytical chemistry to “express the extent to which a particular method can be used to determine analytes under given conditions in the presence of other components of similar behaviour” (Vessman et al., 2001). In this thesis, selectivity is used following the IUPAC definition to express the capability of the method to determine the analyte in the presence of other, possibly interfering compounds.

In FPA-FTIR, the polymers of MPs are identified, and particles are classified as MPs or not. If the identification is wrong, both the polymer type and the MP count are affected. Selectivity therefore means the ability of FPA-FTIR and the data analysis to distinguish between MPs and non-plastic polymer materials. Environmental samples usually contain remains of plants and biota, which may cover MPs, add features to the IR spectra, and prevent identification (Primpke et al., 2020a).

Moreover, the reference spectral library should contain a comprehensive set of plastic types to enable selective identification of MPs. The threshold for the correlation coefficient between a sample spectrum and a reference spectrum is the most important parameter affecting selectivity. If it is set too low, the probability of false positives increases, and if it is set too high, the probability of false negatives increases.
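The effect of the threshold can be illustrated with a small sketch. It assumes that the match score is the Pearson correlation coefficient between a sample spectrum and each library spectrum; the synthetic "spectra", the `best_match` function, and the threshold values 0.7 and 0.999 are hypothetical and not taken from any particular FPA-FTIR software.

```python
import numpy as np


def best_match(sample, library, threshold=0.7):
    """Return (polymer, r) for the best-correlated reference spectrum.

    If the best correlation is below the threshold, the particle is left
    unidentified and (None, r) is returned.
    """
    best_name, best_r = None, -1.0
    for name, reference in library.items():
        r = np.corrcoef(sample, reference)[0, 1]  # Pearson correlation
        if r > best_r:
            best_name, best_r = name, r
    if best_r < threshold:
        return None, best_r
    return best_name, best_r


# Synthetic single-band "spectra", for illustration only
x = np.linspace(0, 1, 200)
library = {
    "PE": np.exp(-((x - 0.3) / 0.05) ** 2),
    "PS": np.exp(-((x - 0.7) / 0.05) ** 2),
}
rng = np.random.default_rng(0)
sample = library["PE"] + 0.05 * rng.normal(size=x.size)  # noisy PE spectrum

polymer, r = best_match(sample, library, threshold=0.7)
strict_polymer, _ = best_match(sample, library, threshold=0.999)
print(polymer, strict_polymer)
```

With the moderate threshold the noisy spectrum is matched to PE, whereas the very strict threshold rejects it, which corresponds to a false negative.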

No single measure exists for defining selectivity for every method (Taverniers et al., 2004). However, for many methods, including FPA-FTIR, recovery tests are suitable for measuring the selectivity of quantitation and identification. Because both sample pre-treatment and measurement affect the selectivity of quantitation, reference MPs should be added before the pre-treatment. Similarly, recovery (spiking) tests can be used to test identification by adding MPs between pre-treatment and measurement to detect whether matrix residues affect the identification of MPs.

Interference caused by the matrix is the most prominent reason for low selectivity in FPA-FTIR analysis of MPs. Ideally, the measured signal originates only from the analyte, but other compounds in the sample matrix can interfere with the signal (Magnusson and Örnemark, 2014). If the volume of the matrix remains high after the pre-treatment, the matrix can cover MPs and leave them unidentified and uncounted. Moreover, the matrix can prevent the particle size from being measured from the spectral images if it partly covers MPs. Matrix interference can therefore decrease the selectivity of the analysis.

5.3.2 Recovery

MPs can be lost or destroyed during the pre-treatment, or they can be misidentified in the measurement and data analysis, which leads to underestimated results. Moreover, if specific types of MPs are more prone to being ignored, misidentified, aggregated, or masked by other particles, the result is biased. The efficiency of the pre-treatment and analysis method can be estimated with recovery tests, in other words spiking tests or positive controls. In a recovery test, samples are spiked with a known amount of MPs and treated in the same way as the actual samples (Brander et al., 2020). Usually, the spiked MPs have a specific particle size, material, and colour, which enables their detection and quantitation. In recovery tests, the sample matrix has to be the same as in the actual samples. Recovery samples should be prepared at least in triplicate (Hermsen et al., 2018). The best procedure is to conduct a recovery test for both pre-treatment and measurement, either in one sample or separately. In any case, both must be tested to validate the method.

To date, recovery tests have been conducted mainly with commercially available size-standard PS beads or with custom-made MPs cut or ground from large plastic items (e.g. Hurley et al., 2018; Simon et al., 2018; Uurasjärvi et al., 2021). Furthermore, Seghers et al. (2021) have published a concept for producing and characterizing reference MPs for validation purposes. Because MPs of different materials, sizes, and shapes can behave differently, a comprehensive set of MPs would be best for spiking. However, as only a limited selection of standardized reference materials is available in relevant sizes and plastic types, this is not typically done (Brander et al., 2020). The use of traceable standard or certified reference materials is always recommended. When method validation is performed with standard reference materials, the studies are reproducible. In practice, selecting representative reference particles is complicated, because MPs found in the environment are weathered and oxidized and may differ from the standard materials.

A recovery rate of >80% is typically considered sufficient for (FPA-)FTIR analysis of MPs (Brander et al., 2020). However, Simon et al. (2018), for example, argue that aggregation and overlapping of particles easily decrease the recovery rate, and that lower rates can be satisfactory. The reported recovery rates for pre-treatments before FPA-FTIR vary between 58% (Simon et al., 2018) and 85% (Löder et al., 2017). In practice, the required rate of >80% can be difficult to reach for samples that undergo extensive pre-treatment, or for small particles. Moreover, because recovery rates are not typically tested for all kinds of MPs and are not representative of every MP type in the samples, their use as correction factors for multiplying the raw data has not been encouraged (Simon et al., 2018).
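As an illustration, the recovery rate of a triplicate spiking test could be calculated as sketched below. The spiked and recovered counts and the function name are hypothetical; only the >80% criterion comes from the text above.

```python
def recovery_rate(recovered, spiked):
    """Recovery rate in percent: recovered MPs relative to spiked MPs."""
    if spiked <= 0:
        raise ValueError("spiked count must be positive")
    return 100.0 * recovered / spiked


# Hypothetical triplicate: 100 reference MPs spiked into each sample
spiked = 100
recovered_counts = [84, 79, 88]

rates = [recovery_rate(n, spiked) for n in recovered_counts]
mean_rate = sum(rates) / len(rates)
meets_criterion = mean_rate > 80.0  # criterion of >80% (Brander et al., 2020)

print(f"recovery rates: {rates}, mean {mean_rate:.1f}%")
```

In this invented example one replicate falls below 80% although the mean exceeds it, which shows why the individual replicates should be reported alongside the mean.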

5.3.3 Accuracy, trueness, repeatability, precision, and uncertainty

Accuracy is the closeness of a measured value to a reference value, which can be, for example, the value of a standard reference material (Magnusson and Örnemark, 2014). Accuracy can be divided into two components, trueness and precision. Trueness describes how close the mean of an infinite number of results is to a reference value. It is usually expressed as bias, which can be obtained by spiking tests or by measuring reference materials.

Precision means the closeness of the values of independent measurements conducted under the same conditions (Magnusson and Örnemark, 2014). Multiple types of precision tests exist. For example, repeatability tests are done in one laboratory with identical samples and parameters for each measurement, inter-day or intra-day. Inter-laboratory precision is tested in multiple laboratories with identical samples and parameters. Inter-day precision tests can be run with different operators and instruments (if applicable) to measure the dependence on operator and instrument. Precision is usually expressed as standard deviation or relative standard deviation (%RSD).

By these definitions, a measurement suffers from random error when trueness is good but precision is low (Magnusson and Örnemark, 2014). Similarly, systematic error is high when trueness is low but precision is high. Accuracy is poor when both trueness and precision are low, in which case the measurement has both systematic and random errors.
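The two components can be sketched numerically: bias as the difference between the mean result and the reference value, and precision as %RSD. The replicate counts and the reference value below are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bias_and_rsd(results, reference):
    """Return (bias, %RSD) for replicate results against a reference value."""
    m = mean(results)
    bias = m - reference              # trueness, expressed as bias
    rsd = 100.0 * stdev(results) / m  # precision, expressed as %RSD
    return bias, rsd


# Hypothetical MP counts from five replicates of a sample spiked with 110 MPs
results = [95, 105, 100, 98, 102]
bias, rsd = bias_and_rsd(results, reference=110)
print(f"bias: {bias} MPs, precision: {rsd:.1f} %RSD")
```

Here the replicates agree well with each other but lie below the reference value, which corresponds to the case of high precision and low trueness, that is, a dominating systematic error.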

For FPA-FTIR analysis of MPs, the accuracy of MP counting can be tested by measuring spiked samples, which contain known numbers of MPs. Similarly, the accuracy of the particle size measurement can be tested with MPs of known sizes. Because pre-treatment methods can affect accuracy considerably, both pre-treatment and measurement should be tested with reference samples. Similarly, the repeatability of pre-treatment and measurement may vary between samples. It is tested by analysing multiple samples containing identical amounts of the reference materials. Intermediate precision of FPA-FTIR and other methods could be cross-validated by analysing identical samples with two or more methods.

Measurement uncertainty is a concept that may be applicable to FPA-FTIR analysis of MPs. Uncertainty can be defined as “A parameter associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand” (ISO, 2008). The uncertainty of a chemical analysis is estimated in four steps: 1) specify the measurand, 2) identify uncertainty sources, 3) quantify uncertainty components, and 4) calculate the combined uncertainty (Ellison and Williams, 2012). As discussed earlier, MP analysis often lacks the first step when “MP” is not specified, because it has no agreed definition. Uncertainty sources to identify and quantify may arise during sampling, storage, sample pre-treatment, measurement, and data analysis. The combined uncertainty aggregates the uncertainties caused by the different steps of the analysis process, in other words the unit operations.
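As a sketch of step 4, and assuming that the uncertainty components of the unit operations are independent and expressed as standard uncertainties on the same (for example relative) scale, the combined standard uncertainty is commonly obtained as the root sum of squares. The subscripts below are illustrative labels for the steps listed above, not symbols used elsewhere in this thesis.

```latex
u_c = \sqrt{u_{\mathrm{sampling}}^2 + u_{\mathrm{storage}}^2
          + u_{\mathrm{pretreatment}}^2 + u_{\mathrm{measurement}}^2
          + u_{\mathrm{data\ analysis}}^2}
```

If the components are correlated, covariance terms would have to be added, so this form should be read as the simplest case only.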

5.3.4 Working range, sensitivity and limit of detection and quantitation

Working range is the range of the amount of the analyte, for which the method provides results with acceptable uncertainty (Magnusson and Örnemark, 2014). In FPA-FTIR analysis of MPs, the working range extends theoretically from one particle
