
INTERVENTION ON DAILY PHYSICAL ACTIVITY (II–III)

6.3 METHODOLOGICAL CONSIDERATIONS

In the exercise intervention study, the sample size was estimated to provide sufficient power to detect a 0.2 mmol/L change in glucose levels. The sample may, however, have been too small to detect a change in physical activity, which may have affected the results of the studies (II, III, Wasenius et al. unpublished observations). No power calculations were performed in the rehabilitation intervention study, as the goal was not to investigate the effectiveness of the intervention, but rather to measure the change in physical activity from a physiological point of view. This reasoning also applies to study II. Based on the present findings, it is clear that individual variation in the change in physical activity is large and that substantially larger sample sizes may have been required for significant results.
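The dependence of the required sample size on individual variation can be illustrated with a minimal sketch. It uses the standard normal-approximation formula for comparing two group means and is not the calculation performed in the original trial. The 0.2 mmol/L difference comes from the text above; the standard deviations, the significance level, the power, and the function name `n_per_group` are assumptions chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference
    `delta` between two groups with common standard deviation `sd`
    (two-sided test, normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical standard deviations (mmol/L); not taken from the study data.
for sd in (0.5, 1.0):
    print(sd, n_per_group(delta=0.2, sd=sd))
```

Because the required sample size grows with the square of the standard deviation, doubling the between-subject variation roughly quadruples the number of subjects needed. This is consistent with the remark above that large individual variation calls for substantially larger samples.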

The study designs were appropriate for both the rehabilitation and the exercise intervention studies. An RCT design, which is the most robust design for studying causal relationships, was applied in the exercise intervention study. However, the dropouts may have somewhat affected the randomization and thereby influenced the results. In addition, the lack of blinding, especially of the subjects, which is almost impossible to achieve in exercise studies, may have influenced their physical activity.

In the rehabilitation group, the estimates could have been improved by the addition of a control group. It would not, however, have been possible to randomly allocate subjects to an intervention or a control group, because the subjects were selected by experts at KELA and the researchers could not influence this selection procedure. Therefore, the sample consisted of volunteers and a selected group of subjects, which limits the generalizability of the findings. In the exercise intervention, the generalizability of the findings may also have been affected by the partly volunteer-based sample.

The physical activity measurements in this study were largely based on self-reports, either diaries or questionnaires. Self-reports by subjects with a higher fat percentage and by men have been shown to be associated with greater over-reporting of high-intensity physical activity and under-reporting of low-intensity physical activity (Buchowski et al., 1999). Several other studies have also reported that self-reports (questionnaires and physical activity logs) are associated with over-reporting of moderate to vigorous physical activity compared to accelerometers (Macfarlane et al., 2006; Troiano et al., 2008; Boon et al., 2010) or heart rate measurements (Macfarlane et al., 2006). The discrepancy between the methods may be partly explained by the limited validity of waist-worn accelerometers in measuring lifestyle physical activities that involve upper body movement, load carriage, or differing slope and surface of terrain (Jakicic et al., 1999; Hendelman et al., 2000). In addition, the accelerometer thresholds commonly applied for moderate physical activity were derived from walking and jogging activities and are substantially higher (1952 counts and 2020 counts) than the thresholds (190.7 counts and 574 counts) derived from field studies of lifestyle activities (Hendelman et al., 2000; Swartz et al., 2000). Applying a lower threshold or a wider range for moderate physical activity is likely to increase the measured duration of such activities, thereby narrowing the gap between the objective measurements and self-reports. In another study, an increase of approximately 0.1 units (from r = 0.24 to r = 0.32 or r = 0.35) in the correlation coefficient between the self-reports and the accelerometer was found when thresholds based on lifestyle activities were used instead of those derived from walking and jogging activities (Ainsworth et al., 2000a). Therefore, a significant portion of the difference between the methods remains to be explained.
According to Haskell (2012), it is possible that self-reports measure different or complementary aspects of physical activity compared to objective methods. In the present study, the possible over-reporting can be argued to have had only a minor influence on the findings, as the analyses were based on comparisons of groups with similar body weights and on the investigation of change in physical activity.
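The effect of the chosen threshold on the measured duration of moderate physical activity can be illustrated with a small sketch. The thresholds 1952 and 574 are those quoted above. The minute-by-minute count series is invented for illustration, and the assumption of one-minute epochs and the function name `minutes_at_or_above` are not taken from the cited studies.

```python
def minutes_at_or_above(counts_per_minute, threshold):
    """Number of one-minute epochs whose count reaches the threshold."""
    return sum(1 for c in counts_per_minute if c >= threshold)


# Hypothetical accelerometer output, one value per minute (not study data).
counts = [120, 450, 600, 900, 1500, 2100, 2500, 300, 700, 1952]

walking_based = minutes_at_or_above(counts, 1952)   # walking/jogging threshold
lifestyle_based = minutes_at_or_above(counts, 574)  # lifestyle-activity threshold

print(walking_based, lifestyle_based)
```

With these invented counts, the lifestyle-based threshold classifies more minutes as at least moderate activity than the walking-based threshold. This is the mechanism by which a lower threshold would narrow the gap to self-reports.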

Furthermore, the aim in both datasets was to measure physical activity continuously for several weeks, which limits the use of objective measures. In the present data, financial limitations as well as the limited data storage capacity and battery life of objective monitors (accelerometers) restricted their use. Moreover, the burden on subjects of wearing physical activity monitors continuously for several weeks, reporting the wear time, recharging batteries, and possibly uploading the saved files to a computer and sending them to the researchers would have been tremendous. In addition, objective measurements should be complemented by some type of log or diary to obtain information about physical activity subcategories. Possibly due to these limitations, several previous studies have employed objective measures to measure 1–2 weeks of physical activity before the intervention and 1–2 weeks during the middle and/or last weeks of the exercise intervention. This approach has also resulted in a great number of incomplete measurements and missing data in some clinical trials (Hollowell et al., 2009; Rangan et al., 2011).

In the present study, objective measurements with SWA were applied to validate the MET-value estimations of the interventions. The applicability of this method in a clinical environment was first tested in study I. A heart rate based method was also used in study I, but it proved to be less reliable and more complicated to analyze than SWA. Thus, the SWA was also chosen for the exercise intervention study. MET-values for NW and RT training were similar to those in previous studies (Phillips and Ziuraitis, 2003; Phillips and Ziuraitis, 2004; Hansen and Smith, 2009; Jurimae et al., 2009; Schiffer et al., 2009; Schiffer et al., 2011). However, the mean MET-values used for the calculation of the RT dose were unable to account for the anaerobic, EPOC, and sinusoidal (exercise to recovery to exercise) nature of RT energy expenditure, which may have affected the results of the present studies (International Organization For Standardization, 2004; Scott, 2011). The measurement of multiple individual activities during the rehabilitation intervention may also have increased the risk of error, owing to the limited number of SWA validation studies for individual activities. Furthermore, the average of all subjects was used to estimate the RIPA, which may have affected the individual data. The RIPA was, however, a group-based rehabilitation, in which all subjects went through the same program.

The risk factors of type 2 diabetes were measured with standard methods. However, in the exercise intervention study, the VO2peak estimation was based on a test with one-minute steps and an increase of 25 W per step until the end-point of the test was reached. The increase may have been too steep for some subjects to reach steady state at each step. The non-steady state exercise may have induced exhaustion of the working muscles before the true VO2max was achieved. The test was, however, performed similarly before and after the intervention, which allowed a comparison of the results. It is nevertheless possible that the physiological nature of the test did not reflect well the physiological nature of the NW, which could partially explain why no significant change in VO2peak was observed. In contrast, structured exercise has been shown to induce a 12% increase in VO2max among subjects with type 2 diabetes in a previous meta-analysis (Boule et al., 2003).

Statistical methods applied in this study followed the current tradition. ANCOVA may have been more robust than non-parametric tests of change scores in detecting significant between-group differences in risk factors of type 2 diabetes (IV) (Jamieson, 1999; Fitzmaurice, 2001). The non-normal distribution of the data, however, precluded the use of this approach even after data transformation. Furthermore, we did not impute the missing response data, because the goal of this study was not to investigate the effectiveness of the intervention. Including the dropouts by means of imputation would have biased the estimate of the exercise effect, although the estimates of clinical effectiveness would have been more accurate.

Moreover, the inclusion of the baseline value of the dependent variable in the regression analyses for change (Wasenius et al. unpublished observations), as in previous studies (Bouchard and Rankinen, 2001; Nikander et al., 2006), may have biased the interpretation. In this approach, the baseline value is included on both sides of the regression equation (Y2 – Y1 = b0 + b1X1 + b2Y1 + e), which makes it correlated with the error term. On the other hand, exercise response is widely reported as a change score (Y2 – Y1), and it is well known that the baseline value will have an effect on it. Thus, it would have been contradictory not to include the baseline value in the model. In the present study, several multiple linear regression analyses were performed with and without the baseline value as a predictor, and the one showing the most consistency and the best fit was reported. The best-fit model was chosen based on the proportion of variation explained (R2), the absence of multicollinearity, and the normality and homoskedasticity of the residuals.
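The coupling between the change score and the baseline value can be illustrated with a small simulation. It is a sketch under stated assumptions rather than a re-analysis of the study data: each subject has a stable true value, no true change occurs, and the two measurements Y1 and Y2 differ only by independent measurement error. The function names, sample size, standard deviations, and seed are hypothetical.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def baseline_slope(n=5000, true_sd=1.0, error_sd=1.0, seed=0):
    """Slope of the change score (Y2 - Y1) on the baseline Y1 when the
    true value does not change between the two measurements."""
    rng = random.Random(seed)
    true = [rng.gauss(0, true_sd) for _ in range(n)]
    y1 = [t + rng.gauss(0, error_sd) for t in true]  # baseline with error
    y2 = [t + rng.gauss(0, error_sd) for t in true]  # follow-up with error
    change = [b - a for a, b in zip(y1, y2)]
    return slope(y1, change)


print(baseline_slope())               # clearly negative despite no true change
print(baseline_slope(error_sd=0.0))   # no measurement error, no spurious slope
```

Under these assumptions the expected slope is -error_sd^2 / (true_sd^2 + error_sd^2), so a negative association between the baseline and the change score arises without any real baseline effect. This is one reason to compare models with and without the baseline value as a predictor.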