• Research article
  • Open access
  • Published: 20 March 2014

The statistical interpretation of pilot trials: should significance thresholds be reconsidered?

  • Ellen C Lee 1 ,
  • Amy L Whitehead 1 ,
  • Richard M Jacques 1 &
  • Steven A Julious 1  

BMC Medical Research Methodology volume 14, Article number: 41 (2014)


In an evaluation of a new health technology, a pilot trial may be undertaken prior to a trial that makes a definitive assessment of benefit. The objective of pilot studies is to provide sufficient evidence that a larger definitive trial can be undertaken and, at times, to provide a preliminary assessment of benefit.

We describe significance thresholds, confidence intervals and surrogate markers in the context of pilot studies and how Bayesian methods can be used in pilot trials. We use a worked example to illustrate the issues raised.

We show how significance levels other than the traditional 5% should be considered to provide preliminary evidence for efficacy and how estimation and confidence intervals should be the focus to provide an estimated range of possible treatment effects. We also illustrate how Bayesian methods could assist in the early assessment of a health technology.

Conclusions

We recommend that in pilot trials the focus should be on descriptive statistics and estimation, using confidence intervals, rather than formal hypothesis testing and that confidence intervals other than 95% confidence intervals, such as 85% or 75%, be used for the estimation. The confidence interval should then be interpreted with regards to the minimum clinically important difference. We also recommend that Bayesian methods be used to assist in the interpretation of pilot trials. Surrogate endpoints can also be used in pilot trials but they must reliably predict the overall effect on the clinical outcome.


In an evaluation of a new health technology, a pilot trial may be undertaken prior to a trial that makes a definitive assessment of benefit. The main objective of a pilot trial is to provide sufficient assurance to enable a larger definitive trial to be undertaken. For example, a pilot trial may assess aspects such as recruitment rates or whether the technologies can be implemented.

Pilot studies are more about learning than confirming: they are not designed to formally assess evidence of benefit. As such, for clinical endpoints, rather than formal hypothesis testing to prove definitively there is a response, it is usually more informative to provide an estimate of the range of possible responses [ 1 , 2 ]. This estimation may not be around the primary endpoint for the definitive study but could be on a surrogate or an early assessment of an endpoint which may be assessed at a later time point in the definitive study [ 3 ].

In this paper we present and discuss approaches towards significance thresholds and confidence interval levels in pilot studies. The methods are divided into three main sections. In the first, we provide alternatives to hypothesis testing using the conventional 5% significance level. We then discuss the use of surrogate outcomes in pilot studies. Finally, a Bayesian approach to significance thresholds is introduced. Throughout the paper we use a worked example to illustrate the methods discussed.

Methods and results

Significance and confidence levels

Pilot studies are not formally powered to assess effect. However, it may be of interest to calculate confidence intervals to describe the range of effects, even if this is not a conventional 95% confidence interval. In this section we give a rationale for confidence interval estimation and “hypothesis testing” in pilot studies.

Significance levels and power calculations

Pilot studies are usually underpowered to achieve statistical significance at the commonly used 5% level. Despite recommendations that formal significance levels are not provided for pilot studies [ 4 , 5 ], many still quote and interpret P-values. In a survey of pilot studies published in 2007–8, Arain et al. [ 6 ] found that 81% (21/26) of pilot studies performed hypothesis tests in order to comment on the statistical significance of results. If the primary purpose of a pilot study is to provide preliminary evidence of the efficacy of an intervention, then the significance level can be increased for hypothesis testing [ 7 ]. Stallard [ 8 ] recommends that the design for a phase II trial is based on a one-sided Type I error rate of α = 0.2, whilst Schoenfeld [ 9 ] proposed a higher Type I error rate for preliminary testing in pilot trials, up to a (one-sided) α = 0.25. In studies other than drug trials, the setting and personnel may not be representative of a future main trial: a pilot trial might see a greater treatment difference due to protocol adherence and enthusiasm in the pilot centre, which might not be replicated in a multi-centre trial. Nevertheless, the pilot may still be underpowered for a traditional 5% significance threshold.

It should be noted that in the context of a pilot study a Type I error would have a different impact. For a definitive study, a Type I error would mean therapies or health technologies being falsely concluded to be beneficial. In this context the Type I error is referred to as society's risk, and the wish is to keep it as low as possible. For a pilot study the impact of a Type I error is that a definitive study may falsely be undertaken. Although there is a consequence for patients in the trial (being randomised to therapies when there is equipoise), the impact of this false positive error falls mainly on the sponsor or funder, i.e. sponsors spend more money and resources on the ‘wrong’ study that will not result in a true effect or benefit from the new technology.

The aim of a pilot study, therefore, is to inform both the decision whether to conduct a confirmatory study and the design of the larger confirmatory trial. Any P-values interpreted in a pilot study should carry a disclaimer that the study is not adequately powered [ 10 , 11 ]; and while post hoc power calculations are possible [ 11 ], they are generally not advisable [ 12 ]. Instead, estimation and confidence intervals should be used to infer the size and direction of the treatment effect.

Confidence intervals

It is recommended in pilot trials that the focus is on descriptive statistics and estimation rather than formal hypothesis testing [ 4 ]. A confidence interval for the treatment effect will inform the decision, amongst other factors, whether or not to perform a confirmatory trial. The confidence interval should be interpreted with regard to the minimum clinically important difference (MCID) [ 12 ]; this is the difference between treatment groups that is considered to be clinically meaningful, specified a priori. If a confidence interval for the treatment difference includes both zero and the MCID, then the results of the pilot study could be considered to be equivocal. There could be no difference between treatments, or there could be a difference larger than the MCID; the results would not preclude either possibility. This approach is superior to formal hypothesis testing as there is insufficient power to test hypotheses, and its focus on the MCID will help inform the main confirmatory trial. Interpreting confidence intervals this way also helps investigators visualise the evidence of effect from the pilot trial.

It is common to report the 95% confidence interval, which corresponds to a 5% significance level. In a pilot study, without adequate power, we can consider confidence intervals of different widths to help inform our decision making; these can then be displayed alongside each other to illustrate the strength of preliminary evidence. We suggest setting minimum prior requirements: that the mean treatment difference is above zero, and that a CI of a certain confidence level includes (or is above) the MCID.
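The interpretation rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the names `Interval`, `interpret` and `meets_minimum_requirement` are hypothetical, and the MCID is assumed to be positive. The numbers used are the mean difference, 95% confidence interval and MCID reported for the worked example in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A confidence interval for the treatment difference."""
    level: float  # confidence level, e.g. 0.95
    lower: float
    upper: float


def interpret(ci: Interval, mcid: float) -> str:
    """Describe where the interval sits relative to zero and a positive MCID."""
    if ci.upper < mcid:
        return "wholly below the MCID"
    if ci.lower >= mcid:
        return "wholly at or above the MCID"
    if ci.lower > 0:
        return "excludes zero and includes the MCID"
    return "equivocal: includes both zero and the MCID"


def meets_minimum_requirement(mean_diff: float, ci: Interval, mcid: float) -> bool:
    """Mean difference above zero, and the interval includes or exceeds the MCID."""
    return mean_diff > 0 and ci.upper >= mcid


# Values reported for the worked example in this article.
MCID = 5.0
MEAN_DIFF = 12.8
ci_95 = Interval(level=0.95, lower=-0.8, upper=26.6)

print(interpret(ci_95, MCID))
print(meets_minimum_requirement(MEAN_DIFF, ci_95, MCID))
```

Applying the same two functions to intervals at several confidence levels gives a tabular counterpart to displaying the intervals alongside each other.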

Worked example

The Leg Ulcer Study was a randomised controlled trial designed to investigate the relative cost effectiveness of community leg ulcer clinics that use four layer compression bandaging versus usual care provided by district nurses [ 13 , 14 ]. In the trial 233 patients with venous leg ulcers were allocated at random to the intervention (120) or control (113) group. The SF-36 questionnaire was completed at baseline, three and twelve months post randomisation. For this example we investigate the SF-36 General Health (GH) dimension score. The GH dimension is scored on a 0 (poor) to 100 (good health) scale.

We assume that 3 month data for the first 40 patients is the pilot study data. There were 31 individuals with complete 3 month SF-36 GH dimension data (17 in treatment group and 14 in control group).

Note that missing data on 22.5% (9/40) of patients is quite high and may be considered unacceptable for a main study. In the actual trial just 14% (29/230) of the SF-36 data were missing [ 15 ], so our pilot sample may well show a high rate by chance. If this was a true pilot study then a missing data rate of 22.5% may need some investigation. There are statistical methods for accounting for missing data [ 16 ]. However, the only solution to missing data is not to have any. After a pilot study, measures to ensure complete data would need to be investigated to bring the level of missing data to an acceptable level.

We take the minimum clinically important difference to be a 5 point difference in SF-36 GH dimension scores at 3 months post-randomisation; we assume a standard deviation of 20 points. Without seeing the actual trial results, with 40 individuals, there would be 20% power to detect a 5 point or more difference between the groups if it truly existed, which is clearly underpowered by conventional standards. Thus, for such a trial it would be more appropriate to estimate possible effects than to carry out formal hypothesis tests.
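A power figure of this kind can be approximated with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' calculation: it uses a Normal approximation with 20 patients per arm, and the function name `approx_power` is hypothetical. The text does not state the significance level or sidedness behind the quoted power, so the sketch prints a few combinations; under these assumptions a one-sided 5% test gives a value close to the quoted 20%.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff: float, sd: float, n_per_arm: int,
                 alpha: float = 0.05, two_sided: bool = True) -> float:
    """Normal-approximation power for comparing two means with equal arms."""
    z = NormalDist()
    # Standard error of the difference between two group means.
    se = sd * sqrt(2.0 / n_per_arm)
    tail = alpha / 2.0 if two_sided else alpha
    critical = z.inv_cdf(1.0 - tail)
    # Probability of exceeding the critical value in the direction of the
    # true effect; the negligible opposite tail is ignored.
    return z.cdf(diff / se - critical)


# MCID of 5 points and SD of 20 points are taken from the text;
# 20 per arm is an assumed even split of the 40 pilot patients.
for alpha, two_sided in [(0.05, True), (0.05, False), (0.10, False), (0.25, False)]:
    label = "two-sided" if two_sided else "one-sided"
    power = approx_power(5.0, 20.0, 20, alpha, two_sided)
    print(f"alpha={alpha:.2f} ({label}): power ~ {power:.2f}")
```

Whatever the exact convention, the power is far below the 80% or 90% usually required of a definitive trial, which is the point the example makes.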

Table 1 displays the results comparing the mean SF-36 GH dimension scores between the home (control) and clinic (intervention) groups. The mean difference was found to be 12.8, which is statistically significant at the 10% but not the 5% level; there is some evidence of a difference in SF-36 GH dimension between groups. If the significance level was set to 10%, there would be sufficient preliminary evidence of a treatment difference and this would lead on to a full-scale study.

The leg ulcer randomised controlled trial reported in 1998 obtained appropriate ethics committee approvals [ 14 ]. The use of the data from this trial for the work presented in this paper has been approved by School of Health and Related Research (University of Sheffield) ethics as secondary analysis of anonymised data.

Figure 1 shows a range of confidence intervals for the mean difference in SF-36 GH scores between the treatment groups. The 95% CI crosses both 0 and the MCID, which gives inconclusive evidence. The 80% and 90% confidence intervals both exclude 0 and cross the MCID; at these levels there is evidence of a treatment difference which is potentially clinically important. A confidence interval of 75% or lower would be wholly above or equal to the MCID, suggesting at this level that there is a clinically meaningful difference in SF-36 General Health between the groups.

Figure 1. Mean difference in SF-36 GH dimension scores between treatment and control with confidence intervals (based on n = 31 patients).

The NIHR Evaluation, Trials and Studies Coordinating Centre (NETSCC) describes a pilot study as a smaller version of the main trial, designed to test whether components of the main study can all work together as well as a preliminary assessment of clinical efficacy. This screening function of pilot studies requires a preliminary evaluation of treatments. Therefore, using the definitive clinical endpoint during a pilot trial may not always be viable. There may be times when measuring the clinical endpoint is not efficient [ 17 ]. For example, if the clinical endpoint is the five year survival rate, then an assessment of disease progression or tumour shrinkage may be assessed in the pilot. Such endpoints would be used as surrogates for the definitive endpoint. We will now discuss surrogates in more detail [ 18 ].

Surrogate endpoints

In the situations described above an investigator may consider using an endpoint other than the clinical endpoint: a surrogate endpoint. ICH E9 [ 19 ] defines a surrogate endpoint as

‘A variable that provides an indirect measurement of effect in situations where direct measurement of clinical effect is not feasible or practical’.

Using a surrogate endpoint can reduce the required sample size or the duration of the trial compared to using the clinical endpoint. This leads to cost reductions which may be crucial for trial feasibility [ 18 ]. For an endpoint to be considered a surrogate, the relationship between it and the clinical outcome must be biologically plausible. In addition, the surrogate must have demonstrable prognostic value for the clinical outcome and there must be evidence from clinical trials that treatment effects on the surrogate outcome correspond to treatment effects on the clinical outcome [ 19 ].

The risks involved when using surrogate endpoints

When an aim of a pilot study is to estimate design parameters, using a surrogate endpoint may mean we do not get precise estimates. For example, designing the study around a surrogate, or around an assessment at an earlier time point, may mean having suboptimal information to estimate the variance of the clinical endpoint. It may also mean we do not get an accurate estimate of attrition rates.

A surrogate endpoint must reliably predict the overall effect on the clinical outcome [ 20 ]. Otherwise it would be possible to wrongly reject effective treatments or take ineffective treatments through to further testing. If a surrogate does predict clinical benefit it could mean treatment benefits can be brought to patients earlier than if clinical outcomes were used and possibly at a lower cost [ 21 ].

Worked example revisited

Using the same data set as in the previous example we now look at the 12 month SF-36 general health (GH) dimension data for the main trial. There were 233 people in the study in total, 155 with complete SF-36 GH dimension data and 78 observations were recorded as missing. From the 155 observed outcomes 80 were in the clinic group and 75 were in the home or control group – note we had 23% attrition at 3 months compared to 31% at 12 months. Such considerations may be important when trying to design a definitive trial.

Table  2 presents the results from comparing the mean SF-36 GH dimension scores between home and clinic groups. The mean difference was 3.33 which is not significant at the 5% level. The original presentation of these results in 1998 stated that they observed a general deterioration of health status over time, with no difference between the two groups [ 14 ].

In the previous worked example we envisaged that the pilot trial had 40 patients and measured the 3-month GH dimension score. Using a significance level of 10% we would have proceeded to the main trial. The 3-month GH dimension score is now considered as a surrogate endpoint for the clinical outcome of the 12-month GH dimension score. If we used a significance level of 5% to assess the clinical outcome, the difference between the groups is not statistically significant. Using the 3-month endpoint in the pilot study and a less stringent significance level would cause us to proceed to the main trial after the pilot study, only to observe no significant difference between the two groups in the main study. It could be a Type I error in the pilot that led us to the main study, or it could be due to the treatment having no long-term efficacy; for example, the intervention may have a short-term benefit which does not last for 12 months. The ‘large’ effect of 12.8 points in the first 40 patients at 3 months has not been replicated at 12 months in the full study.

Bayesian methods

The Bayesian framework offers an alternative approach to the Frequentist significance levels and confidence intervals discussed in the previous section. It allows prior beliefs about the intervention to be combined with the observed data to form posterior responses about the outcome of interest. These posterior responses can then be used to inform decisions about whether a larger definitive trial should be undertaken. One approach to making a decision about the intervention is to use pre-specified Go/No-Go criteria.

Go/No-Go criteria

Julious et al. [ 22 ] define a Go/No-Go decision as a hurdle in a clinical development path to necessitate further progression or otherwise of a health technology. These hurdles can be set low or high depending on the stage of development of the intervention.

At the planning stage of a pilot study there are a number of decisions that need to be made about how Go/No-Go criteria are defined. The first concerns the metric that is going to measure success or failure. Julious and Swank [ 23 ] suggest a method of calculating a probability of success for different development plans based on decision trees and Bayes’ Theorem. They take into account the study team’s confidence (expressed as a probability) that the intervention will meet the safety and efficacy targets for success, and then calculate the probability that each part of the clinical assessment will correctly indicate that the health technology works or does not work.

Chuang-Stein et al. [ 24 ] suggest that a good metric is the probability that there will be a successful confirmatory trial outcome. This is also called assurance by O’Hagan et al. [ 25 ] or average power by Chuang-Stein [ 26 ] and is used in Bayesian sample size calculations for confirmatory trials. The method that we describe here in detail uses prior beliefs and the data collected from the pilot study to calculate the probability of detecting a clinically meaningful difference. This method has previously been described by Julious et al. [ 22 ] for binary and Normal outcomes, and Parmar et al. [ 27 ] for survival outcomes.

The second decision concerns the cut-off or level of the criteria. For example, do we want to be 70% or 80% sure that a confirmatory trial will show a minimum clinically meaningful difference? With a pilot study, criteria could be set to minimise the probability of a false positive, (i.e. minimising the probability of progressing an intervention that will fail in a confirmatory trial) but if the goal is set too high then this will increase the probability of a false negative (i.e. stopping an intervention that works from going to a confirmatory trial) [ 22 ]. Other factors may also influence the choice of criteria, for example, the sponsor of a drug trial may be more willing to accept an incorrect go decision rather than an incorrect no-go decision if the new treatment is the first in class rather than one of several drugs in class [ 24 ].

Prior distributions

As with all Bayesian methods, prior distributions have to be specified for the parameters that we are interested in making inference about and this leads to the question of how these distributions are defined. The simplest approach is to use a non-informative prior. In this case the results will be similar to the Frequentist analysis because all of the information is coming from the observed response. Alternatively, a prior can be elicited based on expert knowledge of the intervention. This may, for example, be based on the synthesis of evidence from previous studies of the same or similar interventions as suggested by Chuang-Stein et al. [ 24 ]. Other elicitation techniques including the elicitation from multiple experts are discussed in Spiegelhalter et al. [ 28 ].

With a large sample size for the pilot study the posterior distribution will be robust to changes in the prior [ 29 ]. However, sample sizes in pilot studies are typically small - in a literature survey by Arain et al. [ 6 ] the median number of participants was 76 - and therefore an informative prior distribution may have a large influence on the posterior distribution. We illustrate in our example that caution should be taken when specifying a prior distribution for a pilot study, as different priors may lead to different interpretations of the results.

Probability of detecting a clinically meaningful difference

We now outline one possible method for calculating the probability of detecting a clinically meaningful difference for data that are anticipated to take a Normal form. In the context of Go/No-Go criteria we need to determine the probability of observing a difference, $d_i$, or greater given that $d_{pilot}$ has already been observed, i.e. $\mathrm{prob}(\theta > d_i \mid d_{pilot})$, where $\theta$ is the mean difference.

For Normal data of the form $X_1, X_2, \ldots, X_n \sim N(\theta, \sigma^2)$ we wish to make inference about $\theta$ for given $\sigma^2$. In this case the Normal family is conjugate and we have the following prior: $\theta \sim N(\mu_{prior}, \sigma_{prior}^2)$. Note that other distributions may be used for the prior. The Bayesian updating rules can then be defined as follows.

Prior values for the mean difference and population standard deviation are defined as $d_{prior}$ and $s_{prior}$ respectively. The observed mean difference and population standard deviation from the pilot data are defined as $d_{pilot}$ and $s_{pilot}$ respectively. Hence $s\sqrt{(r+1)/(rn)}$ is an estimate of the standard deviation around the mean, where $r$ is the allocation ratio between groups and $n$ is the number of individuals per arm.

The posterior distribution is calculated through a weighted sum of the prior and observed responses. The posterior estimate of the mean difference, $d_{post}$, is defined as

and the posterior estimate of the variance around the mean, $s_{post}^2$, is defined as

From these posterior values a density distribution for $\mathrm{prob}(\theta > d_i \mid d_{pilot})$ can be defined so that the probability of observing a difference, $d_i$, or greater, for a given $d_{post}$ would be
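The update described here is the standard conjugate Normal result, in which the prior and the pilot estimate are weighted by their precisions. A sketch of it in LaTeX follows. The symbols $v_{prior}$ and $v_{pilot}$ are notation assumed for this sketch, denoting the variances around the prior and pilot mean differences respectively; they may not match the authors' own notation.

```latex
\begin{align}
d_{post} &= \frac{d_{prior}/v_{prior} + d_{pilot}/v_{pilot}}
                 {1/v_{prior} + 1/v_{pilot}}, \\[4pt]
s_{post}^{2} &= \left(\frac{1}{v_{prior}} + \frac{1}{v_{pilot}}\right)^{-1}, \\[4pt]
\mathrm{prob}(\theta > d_i \mid d_{pilot})
  &= 1 - \Phi\!\left(\frac{d_i - d_{post}}{s_{post}}\right),
\end{align}
```

Here $\Phi$ is the standard Normal cumulative distribution function. With a non-informative prior, $1/v_{prior}$ tends to zero and the posterior mean and variance reduce to the pilot estimate and its variance.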

Worked example revisited with Bayesian approach

Using the same leg ulcer data as described previously, we demonstrate how to calculate the probability that the mean difference in SF-36 GH dimension scores at 3 months post randomisation is greater than the minimum clinically important difference of five points. This question may also be stated in terms of a ‘Go’ criterion, for example:

Are we at least 75% sure of having a mean difference in SF-36 GH dimension that is greater than the minimum clinically meaningful difference of five points at 3 months post randomisation?

For the expository purpose of this exercise we will consider the following three Normally distributed priors:

  • Non-informative prior.

  • Pessimistic prior, with a mean difference of 4 and 90% certainty that the mean difference is within −1 and 9.

  • Optimistic prior, with a mean difference of 7 and 90% certainty that the mean difference is within 4 and 10.

Table  3 displays the posterior mean, posterior standard deviation, and the probability that the mean difference in SF-36 GH dimension score is greater than the minimum clinically meaningful difference of 5 points for our examples of a non-informative, pessimistic and optimistic prior distribution. When using both the non-informative and the optimistic prior the probability of achieving a clinically meaningful difference is greater than our pre-set threshold of 75%.

Figure 2 shows the prior, observed, and posterior distributions for each of our three examples. The non-informative prior has no influence on the posterior distribution and the 95% credibility interval for the posterior mean difference is the same as the 95% confidence interval found previously (−0.8 to 26.6). In the case of the pessimistic and optimistic priors the posterior distribution is heavily influenced by the choice of prior because the observed data have such a small sample size. This emphasises that caution is required when specifying a prior distribution for pilot studies.
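The calculation behind this example can be sketched with the standard conjugate Normal update. The sketch rests on assumptions that are not taken from the article: the standard error of the pilot estimate is backed out of the reported 95% interval (−0.8 to 26.6) using a Normal quantile, the prior standard deviations are derived from the stated 90% ranges in the same way, and all function names are hypothetical. The output is therefore indicative and is not expected to reproduce Table 3 exactly.

```python
from math import inf, sqrt
from statistics import NormalDist

Z = NormalDist()


def sd_from_interval(lower: float, upper: float, coverage: float) -> float:
    """SD of a Normal whose central `coverage` interval is (lower, upper)."""
    return (upper - lower) / (2.0 * Z.inv_cdf(0.5 + coverage / 2.0))


def posterior(d_prior: float, sd_prior: float, d_pilot: float, se_pilot: float):
    """Precision-weighted update; sd_prior=inf represents a flat prior."""
    w_prior = 0.0 if sd_prior == inf else 1.0 / sd_prior ** 2
    w_pilot = 1.0 / se_pilot ** 2
    var_post = 1.0 / (w_prior + w_pilot)
    d_post = var_post * (w_prior * d_prior + w_pilot * d_pilot)
    return d_post, sqrt(var_post)


def prob_exceeds(d_i: float, d_post: float, sd_post: float) -> float:
    """Posterior probability that the mean difference is greater than d_i."""
    return 1.0 - Z.cdf((d_i - d_post) / sd_post)


MCID = 5.0           # from the text
GO_THRESHOLD = 0.75  # 'Go' criterion from the text
D_PILOT = 12.8       # observed mean difference from the text
# Assumption: standard error recovered from the reported 95% interval.
SE_PILOT = sd_from_interval(-0.8, 26.6, 0.95)

priors = {
    "non-informative": (0.0, inf),
    "pessimistic": (4.0, sd_from_interval(-1.0, 9.0, 0.90)),
    "optimistic": (7.0, sd_from_interval(4.0, 10.0, 0.90)),
}

results = {}
for name, (mean, sd) in priors.items():
    d_post, sd_post = posterior(mean, sd, D_PILOT, SE_PILOT)
    p = prob_exceeds(MCID, d_post, sd_post)
    results[name] = p
    decision = "Go" if p >= GO_THRESHOLD else "No-Go"
    print(f"{name}: mean={d_post:.1f}, sd={sd_post:.1f}, P(diff > MCID)={p:.2f}, {decision}")
```

Under these assumptions the pessimistic prior falls below the 75% threshold while the other two exceed it, in line with the pattern described in the text. Changing `GO_THRESHOLD` shows how the height of the hurdle trades false positives against false negatives.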

Figure 2. Prior, observed and posterior distributions for non-informative, pessimistic and optimistic priors.

It could be argued that a Bayesian approach is appealing as it formally accounts for any related work (and/or beliefs held by investigators) by setting priors before the start of a study [ 22 ]. Once the trial has been completed, the observed data are combined with the priors to form a posterior distribution for the treatment response. The interpretation is then through a measure that is more easily understood: in our example, the probability that the response is greater than 5.

This paper has demonstrated a variety of approaches towards significance thresholds in pilot studies. When undertaking a pilot investigation, it was shown how significance levels other than the “traditional” 5% should be considered to provide preliminary evidence for efficacy. It was highlighted how estimation and confidence intervals should be focused on in order to provide an estimated range of possible treatment effects.

Interpreting confidence intervals with respect to the minimum clinically important difference should be considered. Investigating several confidence intervals of different widths and displaying them as in Figure  1 can aid decision making and is a helpful way of displaying evidence in pilot studies. Minimum prior requirements can be set and used in addition to the graphical display to help illustrate the strength of preliminary evidence. However, caution must be taken when using a surrogate outcome in pilot studies as it must reliably predict the clinical endpoint.

Bayesian methods could also assist in the early assessment of a health technology. Pilot data can be combined with prior beliefs in order to calculate the probability that there will be a successful confirmatory trial outcome. This can be framed as a Go/No-Go hurdle such as: are we at least 75% sure of having a mean difference larger than the minimum clinically meaningful difference? We demonstrated how care must be taken when choosing a prior distribution; the posterior distribution can be heavily influenced by the choice of prior as pilot data usually have a small sample size.

We recommend that in pilot trials the focus should be on descriptive statistics and estimation, using confidence intervals, rather than formal hypothesis testing. We further recommend that confidence intervals in addition to 95% confidence intervals, such as 85% or 75%, be used for the estimation. The confidence interval should then be interpreted with regards to the minimum clinically important difference and we suggest setting minimum prior requirements. Although Bayesian methods could assist in the interpretation of pilot trials, we recommend that they are used with caution due to small sample sizes.

Abbreviations

GH: General Health

MCID: Minimum Clinically Important Difference

NETSCC: National Institute for Health Research Evaluation, Trials and Studies Coordinating Centre.

References

1. Wood J, Lambert M: Sample size calculations for trials in health services research. J Health Serv Res Policy. 1999, 4 (4): 226-229.

2. Julious SA, Patterson SD: Sample sizes for estimation in clinical research. Pharm Stat. 2004, 3 (3): 213-215. 10.1002/pst.125.

3. Biomarkers Definitions Working Group: Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther. 2001, 69 (3): 89-95.

4. Lancaster GA, Dodd S, Williamson PR: Design and analysis of pilot studies: recommendations for good practice. J Eval Clin Pract. 2004, 10 (2): 307-312. 10.1111/j..2002.384.doc.x.

5. Thabane L, Ma J, Chu R, Cheng J, Ismaila A, Rios LP, Robson R, Thabane M, Giangregorio L, Goldsmith CH: A tutorial on pilot studies: the what, why and how. BMC Med Res Methodol. 2010, 10: 1. 10.1186/1471-2288-10-1.

6. Arain M, Campbell MJ, Cooper CL, Lancaster GA: What is a pilot or feasibility study? A review of current practice and editorial policy. BMC Med Res Methodol. 2010, 10: 67. 10.1186/1471-2288-10-67.

7. Kianifard F, Islam MZ: A guide to the design and analysis of small clinical studies. Pharm Stat. 2011, 10 (4): 363-368. 10.1002/pst.477.

8. Stallard N: Optimal sample sizes for phase II clinical trials and pilot studies. Stat Med. 2012, 31: 1031-1042. 10.1002/sim.4357.

9. Schoenfeld D: Statistical considerations for pilot-studies. Int J Radiat Oncol Biol Phys. 1980, 6 (3): 371-374. 10.1016/0360-3016(80)90153-4.

10. Papadakis S, Aitken D, Gocan S, Riley D, Laplante MA, Bhatnagar-Bost A, Cousineau D, Simpson D, Edjoc R, Pipe AL, Sharma M, Reid RD: A randomised controlled pilot study of standardised counselling and cost-free pharmacotherapy for smoking cessation among stroke and TIA patients. BMJ Open. 2011, 1 (2): e000366.

11. Legault C, Jennings JM, Katula JA, Dagenbach D, Gaussoin SA, Sink KM, Rapp SR, Rejeski WJ, Shumaker SA, Espeland MA: Designing clinical trials for assessing the effects of cognitive training and physical activity interventions on cognitive outcomes: the Seniors Health and Activity Research Program Pilot (SHARP-P) study, a randomized controlled trial. BMC Geriatr. 2011, 11: 27. 10.1186/1471-2318-11-27.

12. Walters SJ: Consultants’ forum: should post hoc sample size calculations be done?. Pharm Stat. 2009, 8 (2): 163-169. 10.1002/pst.334.

13. Walters SJ, Morrell CJ, Dixon S: Measuring health-related quality of life in patients with venous leg ulcers. Qual Life Res. 1999, 8 (4): 327-336. 10.1023/A:1008992006845.

14. Morrell CJ, Walters SJ, Dixon S, Collins KA, Brereton LML, Peters J, Brooker CGD: Cost effectiveness of community leg ulcer clinics: randomised controlled trial. Br Med J. 1998, 316 (7143): 1487-1491. 10.1136/bmj.316.7143.1487.

15. Collins K, Morrell J, Peters J, Walters S, Brooker C, Brereton L: Problems associated with patient satisfaction surveys. Bri J Commun Health Nurs. 2007, 2 (3): 156-163.

16. Carpenter JR, Kenward MG: Multiple Imputation and its Application. 2013, Chichester: Wiley

17. De Gruttola VG, Clax P, DeMets DL, Downing GJ, Ellenberg SS, Friedman L, Gail MH, Prentice R, Wittes J, Zeger SL: Considerations in the evaluation of surrogate endpoints in clinical trials: Summary of a National Institutes of Health Workshop. Control Clin Trials. 2001, 22 (5): 485-502. 10.1016/S0197-2456(01)00153-2.

18. Prentice RL: Surrogate endpoints in clinical-trials - definition and operational criteria. Stat Med. 1989, 8 (4): 431-440. 10.1002/sim.4780080407.

19. International Conference on Harmonisation: ICH E9 statistical principles for clinical trials. 1998, http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E9/Step4/E9_Guideline.pdf

20. Fleming TR, DeMets DL: Surrogate end points in clinical trials: are we being misled?. Ann Intern Med. 1996, 125 (7): 605-613. 10.7326/0003-4819-125-7-199610010-00011.

21. Temple R: Are surrogate markers adequate to assess cardiovascular disease drugs?. J Am Med Assoc. 1999, 282 (8): 790-795. 10.1001/jama.282.8.790.

22. Julious SA, Machin D, Tan SB: An Introduction to Statistics in Early Phase Trials. 2010, Oxford: Wiley-Blackwell

23. Julious SA, Swank DJ: Moving statistics beyond the individual clinical trial: applying decision science to optimize a clinical development plan. Pharm Stat. 2005, 4 (1): 37-46. 10.1002/pst.149.

24. Chuang-Stein C, Kirby S, French J, Kowalski K, Marshall S, Smith MK, Bycott P, Beltangady M: A quantitative approach for making go/no-go decisions in drug development. Drug Inform J. 2011, 45 (2): 187-202.

25. O’Hagan A, Stevens JW, Campbell MJ: Assurance in clinical trial design. Pharm Stat. 2005, 4 (3): 187-201. 10.1002/pst.175.

26. Chuang-Stein C: Sample size and the probability of a successful trial. Pharm Stat. 2006, 5 (4): 305-309. 10.1002/pst.232.

27. Parmar MKB, Ungerleider RS, Simon R: Assessing whether to perform a confirmatory randomized clinical trial. J Natl Canc Inst. 1996, 88 (22): 1645-1651. 10.1093/jnci/88.22.1645.

28. Spiegelhalter DJ, Abrams KR, Myles JP: Bayesian Approaches to Clinical Trials and Health-Care Evaluation. 2004, Chichester: John Wiley & Sons

29. Lee PM: Bayesian Statistics: An Introduction. 1989, New York: Oxford University Press; Edward Arnold

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/41/prepub

Acknowledgements

We thank Professor Stephen Walters who provided the data used in the worked example. ALW is funded by a School of Health and Related Research (ScHARR) Postgraduate Teaching Assistant Studentship. ECL, RMJ and SAJ did not receive any funding for this work.

Author information

Authors and affiliations

Medical Statistics Group, School of Health and Related Research (ScHARR), University of Sheffield, 30 Regent Street, Sheffield, S1 4DA, UK

Ellen C Lee, Amy L Whitehead, Richard M Jacques & Steven A Julious

Corresponding author

Correspondence to Steven A Julious .

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors contributed equally to the work in this paper. All authors read and approved the final manuscript.

Ellen C Lee, Amy L Whitehead, Richard M Jacques and Steven A Julious contributed equally to this work.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( https://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Lee, E.C., Whitehead, A.L., Jacques, R.M. et al. The statistical interpretation of pilot trials: should significance thresholds be reconsidered?. BMC Med Res Methodol 14 , 41 (2014). https://doi.org/10.1186/1471-2288-14-41

Received: 18 October 2013

Accepted: 12 March 2014

Published: 20 March 2014

DOI: https://doi.org/10.1186/1471-2288-14-41

  • Pilot trial
  • Type I error
  • Confidence interval
  • Significance

BMC Medical Research Methodology

ISSN: 1471-2288

Hypothesis-testing, Hypothesis-generation, Pilot Studies, in Clinical Research Areas

Aug 23, 2014

Helena Chmura Kraemer, Ph.D., Stanford University (Emerita), University of Pittsburgh

The Scientific Method--Ideally
[Flow diagram; labels recovered from the slide: Exploration/Hypothesis Generation (Past Published Research (Clinical & Basic); Clinical Experience and Observation; Secondary Data Analyses with Personal/Shared Data; Exploratory Studies), Hypothesis Formulation, HT Design, Pilot Study, HT Execution, HT Conclusions, Publications, Data Sharing, Independent Replication & Validation]

As is:
[Flow diagram; labels recovered from the slide: Exploration/Hypothesis Generation (Past Published Research (Clinical & Basic); Clinical Experience and Observation), Hypothesis Formulations, HT Design, HT Execution, HT Conclusions, Publications]

Hypothesis-Testing (HT): The heart of clinical research
  • Required: Each “a priori” hypothesis must have scientific rationale and empirical justification.
  • Required for EACH SPECIFIC hypothesis:
  • Research Protocol (Sampling, Measurement, Design)
  • Analytic Plan
  • Testing criteria
  • Significance level
  • Adequate Power

Crucial Questions
  • Where do you get STRONG hypotheses to be tested without well-done exploratory studies?
  • Where do you get the information needed to test those hypotheses most effectively and efficiently (design and power) without well-done exploratory studies?
  • How can you be sure that what you propose to do in a hypothesis-testing study is feasible without well-done pilot studies?

How can you do hypothesis-generating or pilot studies without funding?
  • Since reviewers confuse the types of studies, the criteria for evaluating one type of study are often applied to another type, which confuses researchers.
  • Researchers misrepresent hypothesis-generating as HT, or badly designed HT as “pilot” studies, which confuses reviewers.
  • Researchers=Reviewers!!!
  • Clarification of these issues is necessary for productive communication between researchers and reviewers and with the research and clinical communities.

What is a Hypothesis-Generating (Exploratory) Study?
  • A large study on a relevant population meant to explore certain phenomena in order to generate important and innovative hypotheses for future testing…
  • … and to generate information relevant to designing those studies most cost-effectively.
  • Phase I, Phase II trials?

Development vs. Testing
  • Hypothesis to be developed: Some gene (multiple candidates) is related to diagnosis D (Onset? Type? Severity? Course? Treatment Resistance?), perhaps in conjunction with certain environmental influences (multiple candidates).
  • Possible moderating or mediating relationship between genes and environment on D may exist: Does the result differ according to ethnicity or gender?
  • Hypothesis to be tested: Gene G moderates the effect of E on the onset of disorder D in patients between the ages of 15 and 30.

What is a Pilot Study?
  • A pilot study is a small study done as a preliminary study to a hypothesis-testing study, in which research tactics intended for a hypothesis-testing study are tried out.
  • A feasibility study
  • An effort to “debug” the proposed design.

Major Contrasts
  • A hypothesis-generating study focuses on research questions to be answered in the future, and is large.
  • A pilot study focuses on tactics used to answer research questions, and is small.

How Large is Large?
  • In HT, a sample size large enough to yield 80% power to detect (5% significance level) any effect size above the threshold of clinical significance.
  • May be as few as 10 subjects per group, to as many as thousands per group.
  • In Hypothesis-generating, sample sizes similar to those generally used in HT to follow, if not larger, large enough to get credible effect sizes.
  • In Pilot studies, only enough to convince the user and reviewers that the tactic will work.
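
The power criterion on this slide can be made concrete with the usual normal-approximation formula for comparing two group means, n per group ≈ 2(z₁₋α/₂ + z₁₋β)² / d², where d is the standardized effect size at the threshold of clinical significance. The sketch below is an illustration added here, not part of the presentation; the function name `n_per_group` and the example effect sizes are assumptions, and an exact t-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per group for a two-sided, two-sample comparison
    of means, using the normal approximation (hypothetical helper)."""
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # two-sided significance quantile
    z_beta = z.inv_cdf(power)              # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


if __name__ == "__main__":
    # Illustrative effect sizes only: very large, moderate, and small.
    for d in (1.3, 0.5, 0.1):
        print(f"d = {d}: about {n_per_group(d)} subjects per group")
```

With these inputs the required size ranges from about ten per group for a very large effect to well over a thousand per group for a small one, which is the spread the slide describes.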

Evaluating a proposal for an exploratory study
  • Yes:
  • Are the issues not yet well researched or well understood but of clinical importance?
  • Is the study, if it proposes to collect new data, ethical?
  • Is the sample representative of a clinically relevant population and large enough?
  • Are the measures comprehensive enough to shed light on the issues and of good enough quality (reliability, validity, sensitivity) to shed light on the issues?
  • Have examples of important questions been articulated to show the direction of researchers’ thinking and to support their lack of bias and analytic and interpretative skills?
  • No:
  • Hypotheses? Tests? Power? Pilot studies?

Evaluating a Proposal for a Pilot Study
  • YES:
  • Are the issues in the HT-to-be under consideration of clinical importance?
  • Are the feasibility questions to be addressed in the pilot study pertinent and important to the design of the HT?
  • Under what conditions would what is seen in the pilot study discourage proposing doing the main study or changing its design (tweaking)?
  • Is it clear that if the HT were found to be feasible, the researchers would submit a proposal for that HT as an R01?
  • NO:
  • Hypotheses? Tests? Power? Pilot studies?

The Bottom Line with Pilot Studies
  • You don’t want to find out after the HT study is started that you’ve made mistakes in the protocol that invalidate the testing, or make it unlikely that credible results can be obtained!

Conclusion
  • A badly designed, underpowered hypothesis-testing study is neither a pilot study, nor an exploratory study.
  • Well-designed exploratory studies are necessary to having strong hypotheses in hypothesis-testing studies and the information necessary to design them well.
  • Well-conceived pilot studies are necessary to avoid catastrophes in hypothesis-testing studies.
  • What to do to clarify clear communication between researchers and reviewers, and to foster the proposals, funding, and publication of good science?

  • Open access
  • Published: 18 June 2024

A single-arm, open-label pilot study of neuroimaging, behavioral, and peripheral inflammatory correlates of mindfulness-based stress reduction in multiple sclerosis

  • Christopher C. Hemond 1 ,
  • Mugdha Deshpande 1 ,
  • Idanis Berrios-Morales 1 ,
  • Shaokuan Zheng 2 ,
  • Jerrold S. Meyer 3 ,
  • George M. Slavich 4 &
  • Steven W. Cole 4  

Scientific Reports volume 14, Article number: 14044 (2024)

  • Chronic inflammation
  • Magnetic resonance imaging
  • Neuroimmunology
  • Neurological disorders
  • Stress and resilience

Multiple sclerosis (MS) is a chronic neurological disease frequently associated with significant fatigue, anxiety, depression, and stress. These symptoms are difficult to treat, and prominently contribute to the decreases in quality of life observed with MS. The underlying mechanisms of these “silent” symptoms are not well understood and include not just the psychological responses to a chronic disease, but also biological contributions from bidirectional psycho-neuro-immune (dys)regulation of systemic inflammatory biology. To address these issues, we conducted a prospective, observational pilot study to investigate the psychological, biological, and neuroarchitecture changes associated with a mindfulness-based stress reduction (MBSR) program in MS. The overarching hypothesis was that MBSR modulates systemic and central nervous system inflammation via top-down neurocognitive control over forebrain limbic areas responsible for the neurobiological stress response. 23 patients were enrolled in MBSR and assessed pre/post-program with structural 3 T MRI, behavioral measures, hair cortisol, and blood measures of peripheral inflammation, as indexed by the Conserved Transcriptional Response to Adversity (CTRA) profile. MBSR was associated with improvements across a variety of behavioral outcomes, as well as on-study enlargement of the head of the right hippocampus. The CTRA analyses revealed that greater inflammatory gene expression was related to worse patient-reported anxiety, depression, stress, and loneliness, in addition to lower eudaimonic well-being. Hair cortisol did not significantly change from pre- to post-MBSR. These results support the use of MBSR in MS and elucidate inflammatory mechanisms related to key patient-reported outcomes in this population.

Introduction

Multiple sclerosis (MS) is an inflammatory and neurodegenerative disease characterized by immune-mediated demyelination of the central nervous system 1 . One trigger for an inflammatory flare is psychological stress, as demonstrated through prospective MRI trials and retrospective cohort and case-control studies 2 , 3 , 4 . The effect size of stress on risk of disease flare is moderate (d = 0.53), and clinically relevant, exceeding the effect size for first-generation MS therapeutics like interferon-beta and glatiramer acetate (decreased risk, d ~ 0.3) 2 . A recent randomized controlled trial showed that a cognitive-based stress reduction intervention temporarily reduced the number of new inflammatory brain lesions in MS by over 50%, acting synergistically with MS medication 5 , 6 .

Many MS patients experience high rates of stress, depression, fatigue, anxiety, and cognitive dysfunction, the so-called “silent symptoms” that are not reflected in typical neurological disability grading scales. These “silent” symptoms are highly prevalent, underrecognized, disabling, and represent an unmet care need in MS. Most if not all FDA-approved disease modifying therapies (DMT) do little to improve these symptoms; for example, first-line DMTs only modestly improve 7 , or decrease 8 quality of life measures in MS patients. A recent survey showed that 100% of MS patients rely on at least one complementary or integrative health intervention to address unmet symptomatic needs 9 .

Mindfulness-Based Stress Reduction (MBSR) may help address some of these unmet needs and offer a solution to common problems in stress-reduction services that are typically difficult to access and scale (most require one-on-one interactions), prohibitively costly, nonstandardized, and can have high attrition rates. MBSR was initially developed at the University of Massachusetts Medical School by Professor Jon Kabat-Zinn, and is a manualized group intervention that teaches a set of tools to cultivate present moment awareness with an attitude of non-judgement and acceptance 10 . MBSR has salubrious psychological effects on both clinical and healthy populations. A meta-analysis of MBSR training in 2668 healthy individuals showed moderate improvements in perceived stress, anxiety, depression, distress, and quality of life, with larger effect sizes for the standardized 8-week course compared to shortened alternatives 11 . Populations with higher baseline stress seemed to benefit more 11 , 12 . Moreover, a study in persons with MS by Grossman and colleagues found durable improvements in quality of life, depression, and fatigue at 6-month follow-up compared to an educational control group 13 .

The neurobiological mechanisms of MBSR’s efficacy remain incompletely understood. One potential mechanism is via the psycho-neuro-immune axis, such that reducing stress would mitigate the pro-inflammatory potential of the innate immune system through top-down efferent pathways involving the sympathetic nervous system (SNS) and hypothalamic–pituitary–adrenal (HPA) axis 14 . This inflammatory response to stress can be measured as the “conserved transcriptional response to adversity” (CTRA) 15 , 16 , a gene expression profile first observed in persons with chronic stress and loneliness 17 . The CTRA is characterized by upregulation of inflammatory gene expression, and downregulation of genes related to antibody synthesis and innate antiviral response 18 , factors which remain intriguing and relevant given the strong evidence for Epstein-Barr Virus as a cause of MS 19 , 20 . Early data have shown that MBSR may produce anti-inflammatory effects 21 , 22 , 23 , beneficial changes to neuroendocrine stress hormone release 24 , and changes in structural neuroarchitecture relevant to stress 25 , 26 , 27 .

There is considerable convergence between neurological substrates of MBSR, stress, and the sympathetic autonomic regulatory network that has been shown to modulate the CTRA. Most of the shared anatomy is related to limbic and paralimbic structures such as the hippocampus, amygdala, cingulate cortex, prefrontal cortex and insula 28 , 29 , 30 . Some 25 , 27 (but not all 31 ) studies have shown structural changes in limbic areas following MBSR; few studies have explored how the CTRA changes in relation to formal MBSR 23 or its similar mindful aware practices 32 , 33 , 34 , 35 , 36 . No studies to our knowledge have assessed the structural neural correlates of cross-sectional or longitudinal changes in the CTRA.

Present study

To explore these ideas, we conducted a longitudinal, observational, unblinded, single-arm study of persons with MS (pwMS) who had been recommended (or had chosen) to participate in an MBSR course. We collected pre- and post-MBSR patient-reported outcome measures, hair cortisol, MRI scans, and blood samples that were assessed for the CTRA profile. Based on the literature reviewed above, we hypothesized that participating in MBSR would be associated with group-level improvements in self-reported stress and anxiety measures, and, additionally, that these measures would relate to biological changes as measured by serum inflammatory markers, reduced long-term cortisol output (as measured in the hair), and MRI volumetric changes at the patient level. Specifically, we hypothesized that structural changes in a set of cerebral gray matter structures involved in central (cerebral) autonomic control would predict improvements in peripheral inflammatory measures as assessed by the CTRA. We chose 13 (lateralized) a priori limbic/paralimbic regions of interest (ROI) to analyze based on overlapping literature between (1) structures previously shown to be associated with MBSR and (2) structures with potential modulatory influence on the hypothalamic–pituitary–adrenal (HPA) axis and sympathetic (autonomic) output. We chose ROIs including the amygdala, hippocampus, hypothalamus, brainstem, insula, anterior cingulate, and subcallosal structures, all of which could plausibly affect peripheral immune functioning via top-down modulation of the “master regulatory” structures of neuroendocrine and autonomic efferent pathways, including the paraventricular nucleus of the hypothalamus 37 and other brainstem nuclei 38. We did not assess other brain regions as we anticipated limited power for the study and were concerned about type I (false positive) statistical errors. Assessing longitudinal patient self-reported outcome metrics in parallel with biological outcomes is fertile territory for exploring which self-reported outcomes may carry more biological influence, and therefore are most important to target from a clinical perspective.

Study design and participants

This is a prospective, pre/post observational cohort study of pwMS who chose to participate in an MBSR program at the University of Massachusetts. Patients were either referred to MBSR classes by their healthcare provider or contacted the research staff in response to poster advertisements in the clinic. Patients in this MS clinic are able to enroll in MBSR classes free-of-charge through a separate grant, and participating in this research was entirely voluntary. Participants were not paid. Inclusion criteria were: age 18–75, and having a diagnosis of multiple sclerosis without any clinical exacerbation in the prior 6 months. Exclusion criteria included those who had taken MBSR or were enrolled in dedicated mindfulness training in the prior 10 years, severe psychiatric comorbidities (schizoform spectrum), or persons on nonselective beta-blockers (starting in January 2020, due to potential disruption of inflammatory gene expression outcomes). We did not exclude patients taking stable doses of serotonergic/noradrenergic reuptake inhibitors, tricyclic antidepressants, neuropathic pain medications, anti-spasticity agents, or anti-fatigue medications. No patients reported taking illicit substances.

A baseline research visit occurred between 1 and 21 days prior to the introductory session of the MBSR course. At this visit, patients underwent hair sample collection, a 3T MRI scan, and (starting January 2020) venipuncture. Questionnaires for self-reported outcomes were completed either on paper at the time of the visit, or any time prior to the start of the course using the online platform RedCap. Follow-up clinic visits were arranged to occur within 3 weeks of course completion. Participants undergoing MBSR at the start of the COVID-19 pandemic incurred a delay or cancellation of their follow-up visit due to temporary shutdown of research facilities. Patients were enrolled between April 2019 and September 2022.

106 pwMS were referred/interested in the mindfulness course (9 men, 97 women); 33 ultimately enrolled. Of these 33 patients, 23 (70%) chose to participate in the observational research and were included in this analysis. 96% (22/23) completed the MBSR course. The mean age ± SD of the cohort at baseline was 45.6 ± 11.3 years; all patients were female and classified as relapsing–remitting MS phenotype (Table 1 ). No adverse events were reported or noted.

None of the participants experienced a clinical MS flare on-study, nor was there any evidence of on-study inflammatory disease activity based on stability of T2-hyperintense lesion number and volumes on pre- and post-MRI scans (data not shown). No patient changed disease modifying therapy over the course of the study.

Class description

The MBSR course at UMass is a secular, manualized protocol consisting of weekly, 2.5-h interactive didactic and practicum sessions with the goal of learning non-judgmental present moment awareness of emotions and thoughts. There is a component of gentle standing yoga as well, which can be optionally performed in a chair. The class duration is 8 weeks, with an additional 8-h “all-day” session typically occurring during weeks 5 or 6. This class was initially based in-person, prior to the onset of the COVID-19 pandemic (March 2020), after which it was permanently switched to a virtual format (September 2020). All teachers were MBSR-certified.

Patient reported outcomes

Pre- and post-MBSR psychosocial questionnaire assessments included the Brief Inventory of Perceived Stress 39; Depression, Anxiety, and Stress Scale (DASS-21) 40; UCLA Loneliness scale 41; Modified Fatigue Impact Scale (5-item) 42; and other measures that were administered but not the focus of the present analysis.

Clinical data

Patient clinical and demographic data were obtained with patient consent through a clinical query of their electronic medical record. This included their neurological disability scores (the Expanded Disability Status Scale 43, ranging from 0 = no objective disability to 10 = death from MS), cognitive processing speed (the symbol digit modalities test), disease duration, and treatment records including medications and use of disease-modifying therapies. Any use of glucocorticoids was recorded and converted to prednisone-equivalent dosing; this exposure was not uncommon given the frequent use of B-cell depleting agents requiring glucocorticoids as a premedication. The patient cohort exhibited overall low neurological disability (EDSS median = 1.5, ranging between 1.0 and 6.0), and none were below the threshold on the symbol digit modalities test (< 40) concerning for cognitive impairment (Table 1) 44.

Hair cortisol

Hair samples were collected containing approximately 50–100 individual hairs, cutting as close as possible to the surface of the scalp in the area of the vertex. Samples were immediately trimmed to a length of 2.5 cm as measured from the base of the follicle and sealed in aluminum foil pouches. Samples were stored at room temperature for less than 4 weeks before being shipped in batches to the University of Massachusetts Amherst for later processing and analysis. Hair samples were processed and analyzed according to previously described methods 45 with minor modifications. Briefly, each sample was weighed, washed twice with isopropanol to remove external contaminants, and then ground to a fine powder. The samples were then extracted overnight in methanol, the methanol was evaporated followed by reconstitution of the extract in assay buffer, and the reconstituted extract was then spin-filtered to remove any residual solid material. Lastly, cortisol was analyzed in duplicate along with standards and quality controls using the Arbor Assays DetectX enzyme immunoassay. Intra- and inter-assay coefficients of variation are both < 10% for this assay.
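
The coefficient of variation (CV) quoted for the assay is the standard deviation of replicate readings expressed as a percentage of their mean. A minimal sketch for one sample measured in duplicate follows; the function name `cv_percent` and the readings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def cv_percent(replicates: list[float]) -> float:
    """Coefficient of variation of replicate readings, in percent."""
    return 100 * stdev(replicates) / mean(replicates)


if __name__ == "__main__":
    # Made-up duplicate readings for a single sample, in arbitrary units.
    duplicate = [10.0, 11.0]
    print(f"intra-assay CV: {cv_percent(duplicate):.1f}%")
```

An intra-assay CV is typically summarized over the duplicates of many samples, and an inter-assay CV over the same control material run on different plates; the calculation per set of replicates is the same.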

Gene expression analysis

Blood samples were collected into PAXgene RNA tubes according to manufacturer instructions, and stored upright at −80 °C. Samples were sent and processed in a single batch at the UCLA Social Genomics Core. The CTRA was assessed as has been described previously 46, 47, 48. Briefly, total RNA was extracted from PAXgene RNA tubes (Qiagen PAXgene Blood RNA IVD), reverse-transcribed into cDNA using a high-efficiency mRNA-targeted enzyme system (Lexogen QuantSeq 3’ FWD), and sequenced on an Illumina NextSeq instrument (Lexogen GmbH). Sequencing targeted 5 million reads per sample (achieved mean = 5.7 million), each of which was mapped to the GRCh38 human reference transcriptome (average 99.6% mapped) using the STAR aligner. Transcript abundance was quantified as gene transcripts per million total mapped reads (TPM), floored at 1 TPM to suppress spurious variability, log2-transformed to stabilize variance, and mean-centered for linear statistical model analyses as described below.
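
The three normalization steps described for transcript abundance (floor at 1 TPM, log2 transform, mean-center) can be sketched as follows. This is an illustration only: the function name `normalize_tpm` and the input values are hypothetical, and centering each gene across samples is an assumption, since the text does not state the axis along which values were centered.

```python
from math import log2


def normalize_tpm(tpm_values: list[float], floor: float = 1.0) -> list[float]:
    """Floor, log2-transform, and mean-center one gene's TPM values
    across samples (assumed centering axis)."""
    floored = [max(v, floor) for v in tpm_values]   # values below 1 TPM become 1
    logged = [log2(v) for v in floored]             # log2 stabilizes variance
    center = sum(logged) / len(logged)              # mean of the log2 values
    return [v - center for v in logged]


if __name__ == "__main__":
    # Made-up TPM values for one gene in four samples.
    print(normalize_tpm([0.2, 1.0, 4.0, 16.0]))
```

Flooring before the log transform keeps near-zero abundances from producing large negative values that would dominate the variance.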

MRI acquisition and analysis

Participants were scanned on a Philips Ingenia CX dStream 3.0 T system using a standardized acquisition protocol. This included a 3D T1-weighted sagittal MPRAGE (field of view 256 mm × 240 mm with matrix size of 256 × 240, slice thickness 1.0 mm, slice number of 181, TE = 3.2 ms, TR = 6.9 ms, TI delay = 870 ms, shot interval = 3000 ms, flip angle = 8°) and 3D FLAIR (FOV of 256 mm × 256 mm with matrix size of 256 × 256, slice thickness 1.0 mm, slice number of 181, TE = 268 ms, TR = 4800 ms, TI = 1650 ms, flip angle = 90°) sequences. MS lesions were automatically segmented from FLAIR images using the Lesion Segmentation Toolbox (v3.0) 49 to obtain lesion counts and total T2-hyperintense lesion volume (T2LV). All T1-weighted images were processed using the automated longitudinal pipeline in the Freesurfer toolbox using the default settings 50. Subsegmentation of the amygdala, hippocampus, brainstem and hypothalamus was performed as needed using additional Freesurfer toolboxes 51, 52, 53, 54.

We limited our analysis to limbic and cortical areas of interest based on our hypotheses as outlined in the introduction. Cortical surfaces were parcellated from the Destrieux atlas. The volumes (subcortical) or surface area (cortical) of the following 13 cerebral structures—by hemisphere—were exported for analysis: amygdala, hippocampus, hypothalamus, brainstem (unilateral), insula, anterior cingulate, and subcallosum. Sub-segmentation analysis of these structures was performed if significance was found at the level of the whole structure after correction for multiple comparisons.

Statistical analysis

All variables were assessed for normality using histogram visualization and determination of skew and kurtosis, with non-normal variables undergoing either log-transformation or non-parametric statistical tests as indicated. Descriptive statistics and pre/post assessments of psychosocial measures were determined using chi-square, Wilcoxon rank, or t-tests as appropriate. We used mixed-effects regression modeling to determine associations between MRI structural regions-of-interest (dependent variable) and patient-reported psychological measures, adjusted for age and intracranial volume as fixed effects, and subject identity as a random effect. We performed analyses both with and without the addition of prior steroid use as a fixed effect. Interaction terms were introduced for specific hypothesis testing based on results. Each independent variable assessment was corrected for multiple comparisons (of the selected brain ROIs) using the Benjamini–Hochberg procedure. All aforementioned statistical analyses employed the R software (www.r-project.org).
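For readers unfamiliar with the Benjamini–Hochberg procedure, the adjustment can be sketched in a few lines of Python. This is an illustrative re-implementation, not the authors' code (the analyses above were run in R); the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical unadjusted p-values for four ROIs (illustration only).
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a larger one.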

CTRA analyses and statistics were performed separately using mixed effect linear models implemented in SAS PROC MIXED to quantify the association of study variables with average expression of 53 standard CTRA indicator gene transcripts as previously described 48 . Briefly, these analyses treated as a repeated measure the expression of 19 canonical proinflammatory response genes ( CXCL8, FOS, FOSB, FOSL1, FOSL2, IL1A, IL1B, IL6, JUN, JUNB, JUND, NFKB1, NFKB2, PTGS1, PTGS2, REL, RELA, RELB, TNF ) and 34 Type I IFN response genes ( GBP1, IFI16, IFI27, IFI27L1, IFI27L2, IFI30, IFI35, IFI44, IFI44L, IFI6, IFIH1, IFIT1, IFIT1B, IFIT2, IFIT3, IFIT5, IFITM1, IFITM2, IFITM3, IFITM4P, IFITM5, IFNB1, IGLL1, IGLL3P, IRF2, IRF7, IRF8, JCHAIN, MX1, MX2, OAS1, OAS2, OAS3, OASL ), with the latter sign-inverted to reflect their inverse contribution to the CTRA profile 48 . Among these transcripts, 7 showed minimal levels and variance in expression (SD = 0; IFITM4P, IFITM5, IFNB1, IGLL1, IGLL3P, IL6, IL1A ) and were excluded from further analysis. Log 2 transcript abundance values were tested for average association with study time point (pre- vs post-MBSR), while controlling for patient age, ethnicity, BMI, and two treatment variables found to empirically affect CTRA gene expression values: exposure to B cell depletion therapy, and exposure to pharmacologic glucocorticoids (with the latter quantified as prednisone-equivalent steroid dose discounted by duration since last dose). When noted, additional substantive variables such as patient-reported outcomes or MRI volumetric parameters were added to this benchmark analysis model. Models were estimated by maximum likelihood, and included a compound symmetry covariance matrix to account for correlation among residuals across transcripts (equivalent to a subject-specific random intercept).
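The sign convention and gene exclusion described above can be illustrated with a simplified per-sample composite. This sketch is not the repeated-measures mixed model fitted in SAS PROC MIXED; it only shows how sign-inverting the Type I IFN genes and dropping zero-variance transcripts would enter an averaged score. The function name `ctra_composite` and all expression values are hypothetical.

```python
from statistics import mean, pstdev

def ctra_composite(expr, inverted_genes):
    """Per-sample mean of signed log2 expression, skipping zero-variance genes.

    expr: dict mapping gene name -> list of log2 values, one per sample.
    inverted_genes: set of gene names whose values are sign-inverted.
    """
    # Drop transcripts with no variance across samples (SD = 0).
    kept = {g: v for g, v in expr.items() if pstdev(v) > 0}
    if not kept:
        raise ValueError("no genes with nonzero variance")
    n_samples = len(next(iter(kept.values())))
    scores = []
    for s in range(n_samples):
        signed = [(-v[s] if g in inverted_genes else v[s]) for g, v in kept.items()]
        scores.append(mean(signed))
    return scores

# Hypothetical log2 values for three samples (illustration only, not study data).
expr = {
    "IL1B": [1.0, 2.0, 3.0],   # proinflammatory gene: positive sign
    "IFI27": [2.0, 2.0, 4.0],  # Type I IFN gene: sign-inverted
    "IL6": [0.0, 0.0, 0.0],    # zero variance: excluded
}
print(ctra_composite(expr, inverted_genes={"IFI27"}))  # [-0.5, 0.0, -0.5]
```

A model-based analysis such as the one described above additionally accounts for covariates and for correlation among transcripts within a subject, which a simple average does not.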

Consent to participate

This study was reviewed and approved by the University of Massachusetts ethics board (IRB Protocols #H00017392). Data collection, storage, and access were in accordance with the Health Insurance Portability and Accountability Act. All patients provided written informed consent prior to enrollment.

Patient-reported measures

Of 23 patients, 19 (83%) completed both the baseline and follow-up questionnaires. Significant improvements were observed across nearly all of the measures, including perceived stress, anxiety, depression, fatigue, and loneliness. See Table 2 for a summary of the results.

Gene expression

A sub-cohort of 12 patients provided pre- (n = 12) and post-MBSR (n = 10) blood samples starting in January 2020. All samples passed quality control metrics and were used for CTRA analysis. In a mixed-effect regression controlling for age, race, BMI, use of B-cell depletion, and recent steroid dose as fixed effects, and patient identity as a random effect, results showed no significant change in CTRA gene expression from pre- to post-MBSR (−0.022 ± 0.058 log2 mRNA abundance, p = 0.715, 95% CI [−0.156, 0.112]). However, CTRA gene expression was associated with greater loneliness (0.142 ± 0.055, p = 0.010, [0.034, 0.250]), and higher levels of anxiety (DASS Anxiety subscale: 0.221 ± 0.029, p < 0.001, [0.164, 0.277]) and stress (DASS Stress subscale: 0.231 ± 0.038, p < 0.001, [0.157, 0.306]; BIPS “Pushed” subscale: 0.118 ± 0.051, p = 0.022, [0.017, 0.218]; BIPS “Control” subscale: 0.228 ± 0.055, p < 0.001, [0.121, 0.336]; and BIPS “Conflict” subscale: 0.108 ± 0.048, p = 0.024, [0.014, 0.202]). CTRA gene expression was inversely related to eudaimonic well-being (−0.511 ± 0.217, p = 0.019, [−0.937, −0.086]) but not hedonic well-being (0.508 ± 0.223, p = 0.023, [0.070, 0.946]). Additionally, the CTRA was inversely associated with hair cortisol concentrations (−0.093 ± 0.037, p = 0.011, [−0.165, −0.021]).

In patients with complete paired hair sample analysis (samples = 28, n  = 14), no significant difference in hair cortisol was detected pre- vs. post-MBSR (pre = 4.25 pg/mg; post = 3.66 pg/mg; Wilcoxon signed-rank paired test: V = 64, p  = 0.50), after the removal of one outlying measurement in a participant using a facial product containing a steroid. The significance of this test did not change while adjusting for prior steroid use. Hair cortisol measures correlated moderately (Spearman’s ρ = 0.51, p  = 0.002) with the total amount of exogenous glucocorticoid (prednisone equivalent) used in the prior 2.5 months. There were no differences in the amount of prior steroids received in the pre vs. post periods [pre-MBSR median = 0 mg (IQR 0, 125) and post-MBSR median = 0 mg (IQR 0, 0); Wilcoxon paired rank test: V = 38, p -value = 0.30].

MRI analyses

All pre-MBSR (n = 17) and post-MBSR (n = 13) MRI scans were of good quality and free of significant artifacts based on manual review. Table 3 summarizes the results of the 13 mixed-effects regressions with the bilateral limbic/paralimbic brain ROIs as the outcome variable and MBSR as the explanatory variable. Each regression was adjusted for age and intracranial volumes as fixed effects, and subject identity as a random effect; p-values were adjusted for multiple comparisons by the Benjamini–Hochberg method. Using these models, we observed an association between increased right hippocampal volumes and MBSR (see Table 3). We then assessed the right hippocampus in greater detail stratified by region (head, body, tail), finding the largest association in the hippocampal head (β = 24.2 mm³ larger, post-MBSR; see Fig. 1). Further subsegmentation of the hippocampal head into discrete nuclei showed significant post-MBSR enlargements in the subiculum, presubiculum, molecular layer, and CA3. These effects were attenuated but remained significant after adjusting for prior steroid use; see Table 4 for full detail. We also performed a sensitivity analysis excluding patients recently (within 3 months) started on a new or different disease-modifying therapy (n = 2), as these could potentially be associated with “pseudoatrophy” 55. The association between MBSR and right hippocampus volume was attenuated after excluding these patients (beta decreased from 36.6 to 32.1; unadjusted p-value increased from 0.003 to 0.01; p-value adjusted for multiple comparisons increased from 0.03 to 0.12). A sensitivity analysis of the symbol digit modalities test as a measure of cognitive functioning did not show any significant associations with cerebral ROIs (results not shown).

Figure 1

Hippocampal head volume is observed to be larger following MBSR. Violin plot showing group pre-post enlargement in right hippocampal head volume, unadjusted for covariates. The p -value listed is adjusted for age, intracranial volume and intra-subject correlation.

We additionally determined associations between structural volumes of limbic/paralimbic areas and behavioral patient-reported outcomes. The full results of these analyses are presented in Supplementary Tables. In brief, few associations survived adjustment for prior steroid exposure and corrections for multiple comparisons. Notable exceptions included a negative association between right hippocampal volume and fatigue (β = −44.5, p = 0.001; p = 0.011 after correction for multiple comparisons and steroid use). These changes were most notable in the head of the hippocampus, in areas of the presubiculum, the subiculum, CA1, molecular layer, and CA3 (all with negative betas, all p < 0.05 after steroid adjustment and correction for multiple comparisons). Figure 2 shows an example of these segmented structures in one participant. There was no interaction between fatigue and MBSR (p > 0.05) on hippocampal volumes.

Figure 2

Hippocampal subsegmentation with coronal (top row), axial (middle row) and sagittal (bottom row) sections through the hippocampus. Column A is unlabeled, Column B is labeled with head/body/tail, with 3D enlargement in section C. Column E is labeled using Freesurfer hippocampal subsegmentation, with a coronal cross-section enlarged in box D.

In this pre-post observational study of an 8-week MBSR class in pwMS, we found significant improvements in the debilitating “silent symptoms” of MS, as well as an associated enlargement of the anterior right hippocampus (head). The CTRA did not significantly change pre-post MBSR but was robustly increased with higher patient-reported levels of stress, anxiety, loneliness, and lower reported well-being in this sample. These data thus help elucidate biological mechanisms potentially underlying these symptoms in pwMS.

The leading factors affecting health-related quality of life in MS are fatigue and depression, rather than physical disability or ambulatory status 56. Because these symptoms are not readily identified by observation (and sometimes not routinely followed in neurological practice), they are deemed “silent” or “invisible”. These challenges are compounded by a lack of pharmacological therapies proven to benefit fatigue 57. For these reasons, any interventions that can alleviate “silent” MS symptoms are of high importance. Data from this study support a growing body of literature suggesting this low-risk educational intervention should be considered for clinical applications. Although the design of this study (lacking a control group) was not meant to demonstrate efficacy, and we therefore cannot exclude nonspecific effects, a recent meta-analysis of 14 randomized controlled trials of MBSR in MS highlights consistently high clinical value in improving quality of life 58.

The mechanistic basis for MBSR efficacy is poorly understood. We hypothesized that MBSR would reduce both perceived stress and the CTRA through top-down modulation of sympathetic output via a convergent downstream structure, the paraventricular nucleus of the hypothalamus 37. This master regulatory nucleus is itself regulated via a set of limbic and paralimbic areas, many of which, such as the amygdala 27 and hippocampus 59, 60, have previously shown volumetric changes associated with MBSR. Here, we did observe a pre-post enlargement of right anterior hippocampal volume, most notably in the presubiculum, molecular layer and CA3; this finding was attenuated after controlling for prior exposure to steroids, which are known to affect hippocampal volumes 61. Right hippocampal enlargement has been observed in several prior studies of mindfulness, especially in long-term meditators 59, 60. Given the small sample size, this finding is at risk of being a type I error, and due to the study design we cannot exclude other nonspecific effects of study participation or other uncontrolled variables in this clinical cohort. We did not find changes in any other pre-specified region that survived covariate adjustment and multiple comparisons (see Supplement for full results). In comparison to the literature, a notable recent RCT assessing structural brain changes related to MBSR (using similar acquisition and post-processing methods to this study) did not show any areas significantly different pre/post 31, although it is possible that our group of highly motivated participants systematically differed from the healthy participants in this other study. Eight weeks may also not be a long enough duration for substantial structural changes. We also did not control for the possibility of structural disruptions related to T2-hyperintense lesion volumes in MS, although no participants had evidence of new lesions on-study.

In addition to the effects of MBSR, we observed associations between several psychosocial factors and MRI-structural volumes using mixed-effect models. Most notably, one of these associations included a negative relationship between fatigue and right hippocampal volume, specifically the anterior regions including body and head. Subsegmentation showed that greater fatigue was associated with smaller right presubiculum, subiculum, CA1, CA3, and molecular layer. This association was generally unchanged with adjustment for steroids. Results here are similar to a study of fatigue in healthy aging adults ( N  = 1374, M age  = 72 years) also using Freesurfer post-processing, that showed significant reductions in combined hippocampal volumes with greater fatigue 62 . Several studies in pwMS also showed associations between more severe fatigue and smaller bilateral 63 , or right hippocampal volumes 64 . There was no interaction between fatigue and MBSR in our data, suggesting an independent association between the two.

We observed a robust and coherent association between CTRA and measures of stress, anxiety, and loneliness, as well as protective (inverse) associations with eudaimonic well-being (but not hedonic well-being). These are novel findings in the MS population to our knowledge, and the apparently protective role of eudaimonic well-being suggests new directions for psychological interventions to improve QOL in pwMS. We did not, however, find support for our hypothesis that MBSR would be associated with changes in the CTRA. These results are in contrast with a small RCT (N = 40) in healthy older adults that showed reductions in NF-κB-associated gene expression following MBSR 65. A caveat of interpretation is that much of the study population (83%) was on an immunomodulatory therapy, with a substantial proportion being on B-cell depleting medications (either rituximab or ocrelizumab); this is likely reflected as the lack of IgG response in the sample, and we speculate that this could have contributed to differences in results compared to healthy populations. Another caveat is that the present study used a standard pre-specified CTRA gene composite as the outcome analyzed, whereas the previous study linking MBSR to differential gene expression used a different transcriptomic measure involving bioinformatic assessment of NF-κB gene regulation 23. We also did not observe any neural (MRI) correlates of CTRA after adjustment for steroid use, although unadjusted analyses did reveal several structures that were larger in association with greater CTRA (these results are not presented). Exploring the structural and functional neural correlates of the CTRA remains a fruitful area for future research on bidirectional brain-immune communication; investigators should carefully control for any steroid usage or exposures among participants.

Last, we did not observe any on-study MBSR-related changes in hair cortisol as hypothesized. The reasons for this could include the small sample size, complications from steroid medication exposure, and the fact that changes in perceived or psychological stress are not always accompanied by parallel changes in cortisol output, including output measured by cortisol accumulation in hair 66. Hair cortisol measurements did, however, reflect on-study steroid exposure (Spearman’s ρ = 0.51), supporting their validity as a biomarker.

A primary limitation of this study is the small sample size (with even smaller cohorts of complete CTRA and MRI data), which puts this study at substantial risk of type I and type II errors and may lead present findings to be inconsistent with previous results. The study also focused on a population of “real world” MS patients, a dynamic neurological cohort that was entirely female. These biases potentially constrain the generalizability of findings. An additional limitation of the study is the occurrence of the COVID-19 pandemic halfway through recruitment, necessitating a change in class structure from in-person to virtual and disrupting follow-up. We included all data whenever possible to maximize power, but inconsistent or missing data could also have introduced a bias. Although most demographic and clinical parameters were similar pre- and post-COVID, there was notably a significantly higher level of depression symptoms in the post-COVID cohort (see supplemental material). Notwithstanding these caveats, we present these data and our experience as a feasible structure for future research in psycho-neuro-immunology, including the need to carefully account for potential methodological pitfalls such as steroid use and immunotherapy.

MBSR remains a promising non-pharmacological strategy for addressing the debilitating “silent symptoms” of MS. The present data validate the associations between patient-reported stress, anxiety, and loneliness with greater systemic inflammation as reflected by the CTRA, and patient-reported eudaimonic well-being with reduced inflammation, in an MS population. Although we did not find any consistent associations between MBSR and the CTRA, we did observe a pre-post enlargement of right anterior hippocampal nuclei as well as negative correlations between fatigue and these same structures, findings that should be interpreted with caution due to methodological limitations. Future research directions include the assessment of CTRA changes in relation to functional brain connectivity (resting-state fMRI), which may reflect changes in the bidirectional neural-immune communication more sensitively.

Data availability

Data from this article are available to others upon reasonable request and completion of a data sharing agreement. Please contact the corresponding author (Christopher Hemond) for details.

Dutta, R. & Trapp, B. D. Mechanisms of neuronal dysfunction and degeneration in multiple sclerosis. Prog. Neurobiol. 93 , 1–12 (2011).

Mohr, D. C., Hart, S. L., Julian, L., Cox, D. & Pelletier, D. Association between stressful life events and exacerbation in multiple sclerosis: A meta-analysis. BMJ 328 , 731 (2004).

Yamout, B., Itani, S., Hourany, R., Sibaii, A. M. & Yaghi, S. The effect of war stress on multiple sclerosis exacerbations and radiological disease activity. J. Neurol. Sci. 288 , 42–44 (2010).

Golan, D., Somer, E., Dishon, S., Cuzin-Disegni, L. & Miller, A. Impact of exposure to war stress on exacerbations of multiple sclerosis. Ann. Neurol. 64 , 143–148 (2008).

Mohr, D. C. et al. A randomized trial of stress management for the prevention of new brain lesions in MS. Neurology 79 , 412–419 (2012).

Shields, G. S., Spahr, C. M. & Slavich, G. M. Psychosocial interventions and immune system function: A systematic review and meta-analysis of randomized clinical trials. JAMA Psychiatry 77 , 1031 (2020).

Putzki, N. et al. Quality of life in 1000 patients with early relapsing-remitting multiple sclerosis. Eur. J. Neurol. 16 , 713–720 (2009).

Zwibel, H. L. & Smrtka, J. Improving quality of life in multiple sclerosis: An unmet need. Am. J. Manag. Care 17 (Suppl 5), S139–S145 (2011).

Stoll, S. S., Nieves, C., Tabby, D. S. & Schwartzman, R. Use of therapies other than disease-modifying agents, including complementary and alternative medicine, by patients with multiple sclerosis: A survey study. J. Am. Osteopath. Assoc. 112 , 22–28 (2012).

Kabat-Zinn, J. Wherever you go, there you are: Mindfulness meditation in everyday life. (1994).

Khoury, B., Sharma, M., Rush, S. E. & Fournier, C. Mindfulness-based stress reduction for healthy individuals: A meta-analysis. J. Psychosom. Res. 78 , 519–528 (2015).

Vergara, R. C., Baquedano, C., Lorca-Ponce, E., Steinebach, C. & Langer, Á. I. The impact of baseline mindfulness scores on mindfulness-based intervention outcomes: Toward personalized mental health interventions. Front. Psychol. 13 , 934614 (2022).

Grossman, P. et al. MS quality of life, depression, and fatigue improve after mindfulness training: A randomized trial. Neurology 75 , 1141–1149 (2010).

Wohleb, E. S., Franklin, T., Iwata, M. & Duman, R. S. Integrating neuroimmune systems in the neurobiology of depression. Nat. Rev. Neurosci. 17 , 497–511. https://doi.org/10.1038/nrn.2016.69 (2016).

Slavich, G. M. & Cole, S. W. The emerging field of human social genomics. Clin. Psychol. Sci. 1 , 331–348 (2013).

Slavich, G. M., Mengelkoch, S. & Cole, S. W. Human social genomics: Concepts, mechanisms, and implications for health. Lifestyle Med. 4 , e75 (2023).

Cole, S. W. The conserved transcriptional response to adversity. Curr. Opin. Behav. Sci. 28 , 31–37. https://doi.org/10.1016/j.cobeha.2019.01.008 (2019).

Irwin, M. R. & Cole, S. W. Reciprocal regulation of the neural and innate immune systems. Nat. Rev. Immunol. 11 , 625–632 (2011).

Pender, M. P., Csurhes, P. A., Burrows, J. M. & Burrows, S. R. Defective T-cell control of Epstein-Barr virus infection in multiple sclerosis. Clin. Transl. Immunol. 6 , e147 (2017).

Bjornevik, K. et al. Longitudinal analysis reveals high prevalence of Epstein-Barr virus associated with multiple sclerosis. Science 375 , 296–301 (2022).

Hoge, E. A. et al. The effect of mindfulness meditation training on biological acute stress responses in generalized anxiety disorder. Psychiatry Res. 262 , 328–332. https://doi.org/10.1016/j.psychres.2017.01.006 (2017).

Creswell, J. D. et al. Alterations in resting-state functional connectivity link mindfulness meditation with reduced interleukin-6: A randomized controlled trial. Biol. Psychiatry 80 , 53–61 (2016).

Creswell, J. D. et al. Mindfulness-based stress reduction training reduces loneliness and pro-inflammatory gene expression in older adults: A small randomized controlled trial. Brain Behav. Immun. 26 , 1095–1101 (2012).

Sanada, K. et al. Effects of mindfulness-based interventions on salivary cortisol in healthy adults: A meta-analytical review. Front. Physiol. 7 , (2016).

Hölzel, B. K. et al. Mindfulness practice leads to increases in regional brain gray matter density. Psychiatry Res. Neuroimaging 191 , 36–43 (2011).

Singleton, O. et al. Change in brainstem gray matter concentration following a mindfulness-based intervention is correlated with improvement in psychological well-being. Front. Hum. Neurosci. 8 , 33 (2014).

Hölzel, B. K. et al. Stress reduction correlates with structural changes in the amygdala. Soc. Cogn. Affect. Neurosci. 5 , 11–17 (2009).

Hermans, E. J., Henckens, M. J. A. G., Joëls, M. & Fernández, G. Dynamic adaptation of large-scale brain networks in response to acute stressors. Trends Neurosci. 37 , 304–314 (2014).

Hermans, E. J. et al. Stress-related noradrenergic activity prompts large-scale neural network reconfiguration. Science 334 , 1151–1153 (2011).

Gotink, R. A., Meijboom, R., Vernooij, M. W., Smits, M. & Hunink, M. G. M. 8-week mindfulness based stress reduction induces brain changes similar to traditional long-term meditation practice - A systematic review. Brain Cogn. 108 , 32–41 (2016).

Kral, T. R. et al. Absence of structural brain changes from mindfulness-based stress reduction: Two combined randomized controlled trials. Sci. Adv. 8 (20), eabk3316 (2022).

West, T. N. et al. Effect of mindfulness versus loving-kindness training on leukocyte gene expression in midlife adults raised in low-socioeconomic status households. Mindfulness 13 , 1185–1196 (2022).

Boyle, C. C., Cole, S. W., Dutcher, J. M., Eisenberger, N. I. & Bower, J. E. Changes in eudaimonic well-being and the conserved transcriptional response to adversity in younger breast cancer survivors. Psychoneuroendocrinology 103 , 173–179 (2019).

Walton, K. G., Wenuganen, S. & Cole, S. W. Transcendental meditation practitioners show reduced expression of the conserved transcriptional response to adversity. Brain Behav. Immun. Health 32 , 100672 (2023).

Lebares, C. C. et al. Enhanced stress resilience training in surgeons: Iterative adaptation and biopsychosocial effects in 2 small randomized trials. Ann. Surg. 273 , 424–432 (2021).

Dutcher, J. M., Cole, S. W., Williams, A. C. & Creswell, J. D. Smartphone mindfulness meditation training reduces pro-inflammatory gene expression in stressed adults: A randomized controlled trial. Brain Behav. Immun. 103 , 171–177 (2022).

Bains, J. S., Cusulin, J. I. W. & Inoue, W. Stress-related synaptic plasticity in the hypothalamus. Nat. Rev. Neurosci. 16 , 377–388 (2015).

Lamotte, G., Shouman, K. & Benarroch, E. E. Stress and central autonomic network. Auton. Neurosci. 235 , 102870 (2021).

Lehman, K. A., Burns, M. N., Gagen, E. C. & Mohr, D. C. Development of the brief inventory of perceived stress. J. Clin. Psychol. 68 , 631–644 (2012).

Lovibond, S. H. & Lovibond, P. F. Depression anxiety stress scales. Psychol. Assess. https://doi.org/10.1037/t01004-000 (2011).

Russell, D., Peplau, L. A. & Ferguson, M. L. Developing a measure of loneliness. J. Pers. Assess. 42 , 290–294 (1978).

Ritvo, P. et al. Multiple sclerosis quality of life inventory: a user’s manual. (1997).

Kurtzke, J. F. Rating neurologic impairment in multiple sclerosis: An expanded disability status scale (EDSS). Neurology 33 , 1444–1444 (1983).

Van Schependom, J. et al. The symbol digit modalities test as sentinel test for cognitive impairment in multiple sclerosis. Eur. J. Neurol. 21 , 1219 (2014).

Meyer, J., Novak, M., Hamel, A. & Rosenberg, K. Extraction and analysis of cortisol from human and monkey hair. J. Vis. Exp. 50882, https://doi.org/10.3791/50882 (2014).

Fredrickson, B. L. et al. A functional genomic perspective on human well-being. Proc. Natl. Acad. Sci. 110 , 13684–13689 (2013).

Powell, N. D. et al. Social stress up-regulates inflammatory gene expression in the leukocyte transcriptome via β-adrenergic induction of myelopoiesis. Proc. Natl. Acad. Sci. 110 , 16574–16579 (2013).

Cole, S. W., Shanahan, M. J., Gaydosh, L. & Harris, K. M. Population-based RNA profiling in add health finds social disparities in inflammatory and antiviral gene regulation to emerge by young adulthood. Proc. Natl. Acad. Sci. 117 , 4601–4608 (2020).

Schmidt, P. et al. An automated tool for detection of FLAIR-hyperintense white-matter lesions in multiple sclerosis. NeuroImage 59 , 3774–3783 (2012).

Reuter, M., Schmansky, N. J., Rosas, H. D. & Fischl, B. Within-subject template estimation for unbiased longitudinal image analysis. NeuroImage 61 , 1402–1418 (2012).

Iglesias, J. E. et al. Bayesian longitudinal segmentation of hippocampal substructures in brain MRI using subject-specific atlases. NeuroImage 141 , 542–555 (2016).

Saygin, Z. M. et al. High-resolution magnetic resonance imaging reveals nuclei of the human amygdala: Manual segmentation to automatic atlas. NeuroImage 155 , 370–382 (2017).

Billot, B. et al. Automated segmentation of the hypothalamus and associated subunits in brain MRI. NeuroImage 223 , 117287 (2020).

Iglesias, J. E. et al. Bayesian segmentation of brainstem structures in MRI. NeuroImage 113 , 184–195 (2015).

De Stefano, N. et al. Clinical relevance of brain volume measures in multiple sclerosis. CNS Drugs 28 , 147–156 (2014).

Biernacki, T. et al. Contributing factors to health-related quality of life in multiple sclerosis. Brain Behav. 9 , e01466 (2019).

Nourbakhsh, B. et al. Safety and efficacy of amantadine, modafinil, and methylphenidate for fatigue in multiple sclerosis: A randomised, placebo-controlled, crossover, double-blind trial. Lancet Neurol. 20 , 38–48 (2021).

Simpson, R. et al. A systematic review and meta-analysis exploring the efficacy of mindfulness-based interventions on quality of life in people with multiple sclerosis. J. Neurol. 270 , 726–745 (2023).

Hölzel, B. K. et al. Investigation of mindfulness meditation practitioners with voxel-based morphometry. Soc. Cogn. Affect. Neurosci. 3 , 55–61 (2008).

Luders, E., Toga, A. W., Lepore, N. & Gaser, C. The underlying anatomical correlates of long-term meditation: Larger hippocampal and frontal volumes of gray matter. NeuroImage 45 , 672–678 (2009).

Brown, E. S. et al. Hippocampal volume in healthy controls given 3-day stress doses of hydrocortisone. Neuropsychopharmacology 40 , 1216–1221 (2015).

Carvalho, D. Z. et al. Excessive daytime sleepiness and fatigue may indicate accelerated brain aging in cognitively normal late middle-aged and older adults. Sleep Med. 32 , 236–243 (2017).

Palotai, M. et al. History of fatigue in multiple sclerosis is associated with grey matter atrophy. Sci. Rep. 9 , 14781 (2019).

Mistri, D. et al. Hippocampal subfields in RRMS: The modulatory role of gender and fatigue. J. Neurol. Sci. 429 , 117646 (2021).

Creswell, J. D. et al. Mindfulness-based stress reduction training reduces loneliness and pro-inflammatory gene expression in older adults: A small randomized controlled trial. Brain Behav. Immun. 26 , 1095–1101 (2012).

Weckesser, L. J. et al. The psychometric properties and temporal dynamics of subjective stress, retrospectively assessed by different informants and questionnaires, and hair cortisol concentrations. Sci. Rep. 9 , 1098 (2019).

Acknowledgements

We thank Sara Carbone and Anthony Maciag for their assistance with data collection and MBSR enrollment logistics, respectively. We thank all of the participants who volunteered to take part in this research.

This project was supported by a pilot research award from EMD Serono through the Consortium of MS Centers (CMSC) research program to C.C.H., as well as funding from the Multiple Sclerosis Foundation to C.C.H. for providing free access for MBSR in our MS population, irrespective of research participation. G.M.S. was supported by grant #OPR21101 from the California Governor’s Office of Planning and Research/California Initiative to Advance Precision Medicine. The findings and conclusions in this article are those of the authors and do not necessarily represent the views or opinions of these organizations, which had no role in designing or planning this study; in collecting, analyzing, or interpreting the data; in writing the article; or in deciding to submit this article for publication.

Author information

Authors and affiliations.

Department of Neurology, University of Massachusetts Chan Medical School, 55 Lake Avenue North, Worcester, MA, 01655, USA

Christopher C. Hemond, Mugdha Deshpande & Idanis Berrios-Morales

Department of Radiology, University of Massachusetts Chan Medical School, Worcester, MA, 01655, USA

Shaokuan Zheng

Department of Psychological & Brain Sciences, University of Massachusetts Amherst, Amherst, MA, 01003, USA

Jerrold S. Meyer

Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, 90095, USA

George M. Slavich & Steven W. Cole

Contributions

C.C.H.: Conceptualization, methodology, software, formal analysis, investigation, resources, data curation, writing—original draft, writing—review and editing, visualization, funding acquisition; M.D.: Project administration, data curation; I.B.M.: Resources; J.S.M.: Methodology, writing—review and editing; S.Z.: Methodology, writing—review and editing; G.M.S.: Methodology, formal analysis, writing—review and editing; S.W.C.: Methodology, formal analysis, writing—review and editing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Christopher C. Hemond .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Hemond, C.C., Deshpande, M., Berrios-Morales, I. et al. A single-arm, open-label pilot study of neuroimaging, behavioral, and peripheral inflammatory correlates of mindfulness-based stress reduction in multiple sclerosis. Sci Rep 14, 14044 (2024). https://doi.org/10.1038/s41598-024-62960-w

Received: 14 February 2024

Accepted: 23 May 2024

Published: 18 June 2024

DOI: https://doi.org/10.1038/s41598-024-62960-w

  • Mindfulness-based stress reduction
  • Conserved transcriptional response to adversity
  • Perceived stress
  • Multiple sclerosis

Tofacitinib Hypothesis-generating, Pilot Study for Corticosteroid-Dependent Sarcoidosis

The safety and scientific validity of this study is the responsibility of the study sponsor and investigators. Listing a study does not mean it has been evaluated by the U.S. Federal Government. Read our disclaimer for details.
ClinicalTrials.gov Identifier: NCT03793439
Recruitment Status: Completed
First Posted: January 4, 2019
Results First Posted: February 18, 2022
Last Update Posted: February 18, 2022
Condition or disease: Sarcoidosis, Pulmonary; Sarcoidosis Lung; Sarcoidosis
Intervention/treatment:
  • Drug: Tofacitinib 5mg Oral Tablet [Xeljanz] 16 week trial
  • Diagnostic Test: Spirometry
  • Genetic: RNA Sequencing
  • Diagnostic Test: Laboratory testing
  • Drug: Corticosteroid
  • Drug: Tofacitinib 5mg [Xeljanz] 1 year open-label extension
Phase: Phase 1

Primary Objectives:

Objective 1: Test the hypothesis that the addition of tofacitinib will allow patients with sarcoidosis to have 50% or greater reduction in their corticosteroid requirement without a significant decrease in pulmonary function testing, and with a similar quality of life as measured by a validated questionnaire (1).

Objective 2: Test the hypothesis that the addition of tofacitinib will result in significantly decreased expression of signal transducer and activator of transcription (STAT)-1 dependent gene expression.

This is a 16-week open-label, interventional, proof of concept, hypothesis-generating study. All subjects will receive Tofacitinib 5mg twice daily for 16 weeks. After four weeks on Tofacitinib, the corticosteroid will be tapered per a pre-defined protocol; once a reduction of 50% has been achieved, any further taper will be per physician discretion. After 16 weeks, subjects who meet the primary end-point will be permitted an optional one year open-label extension.

Study Type: Interventional (Clinical Trial)
Actual Enrollment: 5 participants
Allocation: N/A
Intervention Model: Single Group Assignment
Intervention Model Description: Open-label, interventional, proof of concept, hypothesis-generating study
Masking: None (Open Label)
Primary Purpose: Treatment
Official Title: Tofacitinib Hypothesis-generating, Pilot Study for Corticosteroid-Dependent Sarcoidosis
Actual Study Start Date: May 15, 2019
Actual Primary Completion Date: June 24, 2020
Actual Study Completion Date: June 24, 2021

Arm: Experimental: Open-label treatment
Intervention/treatment:
  • Drug: Tofacitinib 5mg Oral Tablet [Xeljanz] 16 week trial
  • Diagnostic Test: Spirometry
  • Genetic: RNA Sequencing
  • Diagnostic Test: Laboratory testing
  • Drug: Corticosteroid
  • Drug: Tofacitinib 5mg [Xeljanz] 1 year open-label extension

Ages Eligible for Study: 18 Years to 89 Years (Adult, Older Adult)
Sexes Eligible for Study: All
Accepts Healthy Volunteers: No

Inclusion Criteria:

  • Meet World Association of Sarcoidosis and other Granulomatous Disorders (WASOG) definition of pulmonary sarcoid
  • Histologically proven sarcoid
  • Evidence of pulmonary sarcoid on chest radiograph
  • Forced vital capacity of > 50%
  • Require 15-30mg/day of prednisone or equivalent corticosteroid to control sarcoidosis.
  • Stable dose of prednisone or equivalent corticosteroid for 4 weeks prior to enrollment.

Exclusion Criteria:

  • May be taking methotrexate but not other immunosuppressive or immunomodulatory treatments in the two months prior to study period. This includes but is not limited to azathioprine, cyclophosphamide, leflunomide, mycophenolate mofetil, cyclosporine, tacrolimus, and biologic medications.
  • Patients requiring >30mg/day prednisone or equivalent.
  • Pregnant or lactating women.
  • Hemoglobin < 9g/dL or hematocrit < 30%
  • White blood cell count <3.0 K/cu mm
  • Absolute neutrophil count <1.2 K/cu mm
  • Platelet count <100 K/cu mm
  • Subjects with an estimated glomerular filtration rate (GFR) ≤40 ml/min
  • Subjects with a total bilirubin, aspartate aminotransferase (AST), or alanine aminotransferase (ALT) more than 1.5 times the upper limit of normal at screening.
  • Severe, progressive, or uncontrolled chronic liver disease including fibrosis, cirrhosis, or recent or active hepatitis.
  • History of any lymphoproliferative disorder such as Epstein Barr virus (EBV) related lymphoproliferative disorder, history of lymphoma, leukemia, or signs and symptoms suggestive of current lymphatic disease.
  • Current malignancy or history of malignancy, with the exception of adequately treated or excised non-metastatic basal cell or squamous cell cancer of the skin, or cervical carcinoma in situ.
  • Have or have had an opportunistic infection (e.g., herpes zoster [shingles], cytomegalovirus, Pneumocystis carinii, aspergillosis and aspergilloma, histoplasmosis, or mycobacteria other than TB) within 6 months prior to screening.
  • Have a known infection with human immunodeficiency virus (HIV)
  • Have current signs and symptoms of systemic lupus erythematosus, or severe, progressive, or uncontrolled renal, hepatic, hematologic, endocrine, pulmonary, cardiac (New York Heart Association class III or IV), neurologic, or cerebral diseases (with the exception of sarcoidosis).
United States, Oregon
Oregon Health & Science University
Portland, Oregon, United States, 97239
Principal Investigator: Jim Rosenbaum, MD Oregon Health and Science University
Responsible Party: Jim Rosenbaum, Professor of Ophthalmology, Medicine, and Cell Biology, OHSU, Oregon Health and Science University
ClinicalTrials.gov Identifier: NCT03793439
Other Study ID Numbers: STUDY00017902
First Posted: January 4, 2019   
Results First Posted: February 18, 2022
Last Update Posted: February 18, 2022
Last Verified: December 2021
Plan to Share IPD: Yes
Plan Description: De-identified individual participant data for all primary and secondary outcomes will be made available.
Supporting Materials: Study Protocol
Statistical Analysis Plan (SAP)
Informed Consent Form (ICF)
Clinical Study Report (CSR)
Time Frame: January 1, 2022 until December 31, 2023
Access Criteria: Email [email protected]
Studies a U.S. FDA-regulated Drug Product: Yes
Studies a U.S. FDA-regulated Device Product: No
Sarcoidosis
Corticosteroid dependent sarcoidosis
Additional relevant MeSH terms:
Sarcoidosis, Pulmonary
Sarcoidosis
Lymphoproliferative Disorders
Lymphatic Diseases
Hypersensitivity, Delayed
Hypersensitivity
Immune System Diseases
Lung Diseases, Interstitial
Lung Diseases
Respiratory Tract Diseases
Prednisone
Tofacitinib
Anti-Inflammatory Agents
Glucocorticoids
Hormones
Hormones, Hormone Substitutes, and Hormone Antagonists
Physiological Effects of Drugs
Antineoplastic Agents, Hormonal
Antineoplastic Agents
Janus Kinase Inhibitors
Protein Kinase Inhibitors
Enzyme Inhibitors
Molecular Mechanisms of Pharmacological Action
The relationship between patient self-reported, pre-morbid physical activity and clinical outcomes of inpatient treatment in youth with anorexia nervosa: a pilot study.

Variable | Steps/Day | PA1-6 | PA-Pre | PA-Post | Change PA-Pre to PA-Post (%)
(label missing) | 0.286 (0.221) | −0.048 (0.842) | −0.163 (0.492) | 0.58 | 0.728
(label missing) | 0.338 (0.145) | 0.084 (0.724) | −0.081 (0.735) | 0.6 | 0.756
(label missing) | 0.118 (0.621) | −0.104 (0.662) | −0.315 (0.176) | 0.387 (0.092) | 0.541
(label missing) | 0.381 (0.098) | −0.112 (0.638) | −0.193 (0.415) | 0.626 | 0.707
(label missing) | 0.308 (0.186) | 0.071 (0.768) | −0.015 (0.949) | 0.66 | 0.671
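
Predictor | Effect Size Full Model | Confidence Interval Full Model | p Full Model | Effect Size Univariable | Confidence Interval Univariable | p Univariable | r Univariable
(label missing) | −1.36 | [−5.44; 2.71] | 0.474 | −0.357 | [−5.05; 4.33] | 0.875 | 0.0014
(label missing) | 0.0000842 | [−0.000317; 0.000486] | 0.651 | 0.000118 | [−0.000289; 0.000525] | 0.551 | 0.020
(label missing) | −0.0142 | [−0.0508; 0.0224] | 0.408 | 0.00925 | [−0.0154; 0.0339] | 0.440 | 0.034
(label missing) | 0.0124 | [−0.0127; 0.0375] | 0.297 | −0.000924 | [−0.0140; 0.0121] | 0.883 | 0.0012
(label missing) | −0.000268 | [−0.00733; 0.00679] | 0.934 | −0.00000850 | [−0.00545; 0.00544] | 0.997 | 0.00000060
(label missing) | −6.28 | [−24.5; 11.9] | 0.459 | 1.06 | [−4.66; 6.79] | 0.701 | 0.0084
(label missing) | 10.7 | [−4.5; 25.9] | 0.149 | 2.01 | [−3.20; 7.23] | 0.428 | 0.035
(label missing) | 0.113 | [−1.38; 1.61] | 0.869 | 0.300 | [−1.04; 1.64] | 0.643 | 0.012
(label missing) | −0.686 | [−1.09; −0.28] | 0.004 | −0.465 | [−0.742; −0.189] | 0.002 | 0.41
R² of the model: (value missing)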

Predictor | Effect Size Full Model | Confidence Interval Full Model | p Full Model | Effect Size Univariable | Confidence Interval Univariable | p Univariable | r Univariable
No Comorbidity | −19.0 | [−56.1; 18.1] | 0.281 | −10.5 | [−51.9; 30.9] | 0.600 | 0.016
Steps/day | −0.000874 | [−0.00453; 0.00278] | 0.606 | −0.000600 | [−0.00424; 0.00304] | 0.733 | 0.0066
PA 1-6 | 0.0367 | [−0.296; 0.370] | 0.811 | 0.299 | [0.133; 0.465] | 0.001 | 0.44
PA-pre | 0.0907 | [−0.138; 0.319] | 0.397 | 0.149 | [0.059; 0.238] | 0.003 | 0.40
PA-post | −0.00822 | [−0.0725; 0.0561] | 0.782 | 0.00642 | [−0.0419; 0.0547] | 0.783 | 0.0043
PA-new | 67.0 | [−98; 233] | 0.388 | 29.0 | [−20.0; 78.0] | 0.230 | 0.079
PA-high | −49.7 | [−188; 89] | 0.442 | −3.67 | [−50.8; 43.5] | 0.872 | 0.0015
EDE-Q Global | 7.23 | [−6.4; 20.9] | 0.265 | 4.02 | [−7.8; 15.8] | 0.484 | 0.028
Admission %mBMI (%) | −2.23 | [−5.92; 1.47] | 0.209 | −1.86 | [−4.92; 1.20] | 0.219 | 0.083
R² of the model: (value missing)

Characteristic | N | Percentage
Restrictive | 16 | 64.0
Binge–purge | 5 | 20.0
Atypical | 4 | 16.0
None | 14 | 56.0
Depression | 5 | 20.0
Obsessive compulsive disorder | 4 | 16.0
Anxiety disorder | 3 | 12.0
Borderline Personality Disorder | 2 | 8.0
None | 23 | 92.0
Stimulating/non-sedating antidepressants | 2 | 8.0
Antipsychotic medication | 2 | 8.0
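
Characteristic | Patients with AN, Baseline (n = 25) | Healthy Controls (n = 22) | p
(label missing) (years) | 15.1 ± 1.7 [12.1–17.8] | 14.7 ± 1.3 [13.0–17.1] | 0.494
(label missing) N (%) | 22 (88.0%) | 22 (100%) | 0.004
(label missing) (kg) | 41.2 ± 5.5 [31.3–52.4] | 56.2 ± 10.6 [37.1–77.6] |
(label missing) (cm) | 165 ± 8 [150–186] | 165 ± 8 [153–182] | 0.993
(label missing) | 2 ± 4 [0–19] | 54 ± 29 [3–89] |
(label missing) | 74.8 ± 6 [65.9–89.5] | 102.4 ± 12.1 [78.0–121.2] |
(label missing) (kg/m²) | 15.0 ± 1.0 [13.0–18.0] | 20.6 ± 2.7 [15.6–24.0] |
(label missing) (months) | 10 [0–64] | NA |
(label missing) | 17 (77.3%) | 0 (0%) |
(label missing) | 3 (13.6%) | 4 (18.2%) | 1.000
(label missing) | 2 (9.1%) | 15 (68.2%) |
(label missing) | 0 (0%) | 3 (13.6%) | 0.602
(label missing) (admission) | 8736 (6755/10,158) [2026–24,536] | 11855 (9104/13,954) [4427–23,139] |
(label missing) (min/week) | 115 (75/200) [0–375] | 68 (29/105) [0–330] |
(label missing) (min/week) | 120 (60/240) [0–800] | NA |
(label missing) (min/week) | 420 (170/767) [0–1680] | NA |
(label missing) (%) | 244 ± 323 [0–1300] | |
(label missing) | 3.32 ± 1.69 [0.40–5.40] | NA |
(label missing) | 2.94 ± 1.82 [0.20–5.60] | NA |
(label missing) | 2.58 ± 1.72 [0.00–5.60] | NA |
(label missing) | 3.50 ± 1.99 [0.00–5.80] | NA |
(label missing) | 4.10 ± 1.87 [0.50–6.00] | NA |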
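
Variable | Steps/Day | PA1-6 (min/week) | PA-Pre (min/week) | PA-Post (min/week) | Change of PA-Pre to PA-Post (%)
Steps/Day | 1 | | | |
PA1-6 (min/week) | 0.168 (0.434) | 1 | | |
PA-Pre (min/week) | −0.017 (0.938) | 0.633 | 1 | |
PA-Post (min/week) | 0.476 | 0.284 (0.179) | 0.291 (0.168) | 1 |
Change of PA-Pre to PA-Post (%) | 0.46 | −0.073 (0.735) | −0.154 (0.473) | 0.805 | 1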

Predictor | Effect Size | Confidence Interval | p-Value
Admission %mBMI (%) | −0.620 | [−0.862; −0.378] | 0.001
New onset/high intensity PA | 5.69 | [2.12; 9.25] | 0.004
R² of the model: (value missing)
PA before onset AN | 0.149 | [0.059; 0.238] | 0.003
R² of the model: (value missing)

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Pech, M.; Correll, C.U.; Schmidt, J.; Zeeck, A.; Hofmann, T.; Busjahn, A.; Haas, V. The Relationship between Patient Self-Reported, Pre-Morbid Physical Activity and Clinical Outcomes of Inpatient Treatment in Youth with Anorexia Nervosa: A Pilot Study. Nutrients 2024 , 16 , 1889. https://doi.org/10.3390/nu16121889

Does TomoDirect 3DCRT represent a suitable option for post-operative whole breast irradiation? A hypothesis-generating pilot study

Affiliation.

  • 1 Department of Medical Physics, Ospedale Regionale U. Parini, AUSL Valle d'Aosta, Aosta, Italy.
  • PMID: 23241224
  • PMCID: PMC3547690
  • DOI: 10.1186/1748-717X-7-211

Background: This study investigates the use of TomoDirect™ 3DCRT for adjuvant whole breast radiotherapy (AWBRT), an attractive treatment option mainly for radiotherapy departments that lack conventional linacs and are equipped only with helical tomotherapy units.

Methods: Plans were created for 17 breast cancer patients using TomoDirect in 3DCRT and IMRT modalities and field-in-field 3DCRT planning (FIF), and were compared in terms of PTV coverage, overdosage, homogeneity, conformality and dose to OARs. The possibility of defining patient-class solutions for TD-3DCRT use was investigated by correlating OAR dose constraints with patient-specific anatomic parameters.

Results: TD-3DCRT showed significantly higher PTV coverage and homogeneity than TD-IMRT and FIF. PTV conformality was significantly better for FIF, while no differences were found between TD-3DCRT and TD-IMRT. TD-3DCRT showed significantly higher mean values of the OAR dosimetric endpoints than TD-IMRT; with respect to FIF, TD-3DCRT showed significantly higher values for lung V(20Gy), mean heart dose and V(25Gy), while contralateral lung maximum dose and contralateral breast mean dose were significantly lower. The Central Lung Distance (CLD) and the maximal Heart Distance (HD) proved to be useful clinical tools for predicting whether TD-3DCRT can be employed: positive correlations were found between CLD and both V(20Gy) and mean lung dose, and between HD and both V(25Gy) and the mean heart dose. TD-3DCRT showed a significantly shorter mean beam-on time than TD-IMRT.

Conclusions: The present study showed that TD-3DCRT and TD-IMRT are two feasible and dosimetrically acceptable treatment approaches for AWBRT, with optimal PTV coverage and adequate OAR sparing. Some concerns might be raised about the dose to organs at risk if TD-3DCRT is applied to a general population. Correctly clustering patients according to simple quantitative anatomic measures would help to allocate them to the appropriate treatment planning strategy in terms of both target coverage and normal tissue sparing.

PTV and OARs cumulative DVHs for the 3 techniques.

Regression plots of: (a) lung V 20Gy vs. CLD; (b) MLD vs. CLD;…

Differential PTV DVHs for the techniques.

Allocation pattern of the 17 patients according to CLD and HD.

National Institutes of Health (NIH) - Turning Discovery into Health

Thursday, June 6, 2024

NIH launches $30 million pilot to test feasibility of a national primary care research network

Initiative aims to improve health outcomes by integrating research in everyday primary care settings.

CARE for Health infographic

The National Institutes of Health (NIH) is investing approximately $30 million in total over fiscal years 2024 and 2025 to pilot a national primary care research network that integrates clinical research with community-based primary care. The new initiative called Communities Advancing Research Equity for Health – or CARE for Health – seeks to improve access to clinical research to inform medical care, particularly for those in communities historically underrepresented in clinical research or underserved in health care. Informed by the health needs of these communities, CARE for Health will help to grow an evidence base that contributes to improved patient outcomes, provide communities access to the best available scientific research and expand opportunities to participate in clinical trials and studies. NIH Director Monica M. Bertagnolli, M.D., lays out her vision for CARE for Health in a Science Editorial that was published today.

“Despite tremendous scientific progress, the health of important segments of the U.S. population is getting worse, not better,” said Dr. Bertagnolli. “Health is dependent upon many factors.  We recognize that environmental and societal factors are very important, and that each community is unique. Because of this, we must adapt our research to be more inclusive and more responsive to the needs of communities currently underserved in health research. Our vision for CARE for Health is to help primary care providers and their patients contribute to knowledge generation, and to deliver evidence back to them to achieve better care.”

Supported through the NIH Common Fund, CARE for Health will initially leverage existing NIH-funded clinical research networks and community partners to establish the infrastructure that will support research at select primary care sites. Initial awards will fund organizations that serve rural communities and are expected to be made in fall 2024.

“Health research should be accessible to all populations. Clinical trials should reflect the diversity of Americans – because we know that delivers the best results,” said HHS Secretary Xavier Becerra. “We are taking a big step towards ensuring communities that are historically underrepresented in clinical research are fully included and have the same access to the best available results and analysis. There has never been more potential for progress than we have today.”

Participating clinical sites will be able to choose research studies based on health issues affecting and prioritized by their communities. Patients will be able to contribute their data to research in order to generate results that are clinically meaningful to them. Final study findings and aggregate results will be shared with research participants. CARE for Health will expand NIH-funded research studies to increase engagement with people from communities historically underrepresented or underserved in health care and clinical research. This includes people from certain racial and ethnic groups, those who are older, those who live in rural areas and those who have low socioeconomic status or lower educational attainment. Studies will seek to address common health issues, as well as disease prevention.  

“Community-oriented primary care not only provides essential health services, but it also engenders trust among those who lack confidence in recommended medical care or science,” said Dr. Bertagnolli. “In fact, greater availability of primary care services in communities is associated with fewer disparities in health outcomes and lower mortality. We earn people’s trust when they get access to the care they need and when they can see direct benefits from their participation in research.”

As CARE for Health expands, the program will launch new studies across the network and further establish study sites, training capabilities, data management and increased interoperability. By expanding collaborations to integrate research data into clinical practice and clinical data collection into research studies, the network will facilitate the use of innovative practices and trial designs to minimize burden of research on primary care providers and patients.

“The goal is to create a learning health system in which research informs clinical practice and clinical data informs research,” said NIH Deputy Director for Program Coordination, Planning, and Strategic Initiatives Tara A. Schwetz, Ph.D. “As the program grows, sites and their communities will help design new clinical studies reflecting their specific health needs, and results from those studies will inform the care they receive.”

The NIH is hosting a public workshop on Friday, June 7 from 10 a.m. to 12:30 p.m. EDT to share findings from a series of listening sessions on the challenges and opportunities for integrating research into primary care. Learn more about the workshop and register for the event.

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit www.nih.gov.

The state of AI in early 2024: Gen AI adoption spikes and starts to generate value

If 2023 was the year the world discovered generative AI (gen AI), 2024 is the year organizations truly began using, and deriving business value from, this new technology. In the latest McKinsey Global Survey on AI, 65 percent of respondents report that their organizations are regularly using gen AI, nearly double the percentage from our previous survey just ten months ago. Respondents’ expectations for gen AI’s impact remain as high as they were last year, with three-quarters predicting that gen AI will lead to significant or disruptive change in their industries in the years ahead.

About the authors

This article is a collaborative effort by Alex Singla, Alexander Sukharevsky, Lareina Yee, and Michael Chui, with Bryce Hall, representing views from QuantumBlack, AI by McKinsey, and McKinsey Digital.

Organizations are already seeing material benefits from gen AI use, reporting both cost decreases and revenue jumps in the business units deploying the technology. The survey also provides insights into the kinds of risks presented by gen AI—most notably, inaccuracy—as well as the emerging practices of top performers to mitigate those challenges and capture value.

AI adoption surges

Interest in generative AI has also brightened the spotlight on a broader set of AI capabilities. For the past six years, AI adoption by respondents’ organizations has hovered at about 50 percent. This year, the survey finds that adoption has jumped to 72 percent (Exhibit 1). And the interest is truly global in scope. Our 2023 survey found that AI adoption did not reach 66 percent in any region; however, this year more than two-thirds of respondents in nearly every region say their organizations are using AI. (Organizations based in Central and South America are the exception, with 58 percent of respondents working for organizations based there reporting AI adoption.) Looking by industry, the biggest increase in adoption can be found in professional services, a category that includes respondents working for organizations focused on human resources, legal services, management consulting, market research, R&D, tax preparation, and training.

Also, responses suggest that companies are now using AI in more parts of the business. Half of respondents say their organizations have adopted AI in two or more business functions, up from less than a third of respondents in 2023 (Exhibit 2).

Gen AI adoption is most common in the functions where it can create the most value

Most respondents now report that their organizations, and they as individuals, are using gen AI. Sixty-five percent of respondents say their organizations are regularly using gen AI in at least one business function, up from one-third last year. The average organization using gen AI is doing so in two functions, most often in marketing and sales and in product and service development, two functions in which previous research determined that gen AI adoption could generate the most value (“The economic potential of generative AI: The next productivity frontier,” McKinsey, June 14, 2023), as well as in IT (Exhibit 3). The biggest increase from 2023 is found in marketing and sales, where reported adoption has more than doubled. Yet across functions, only two use cases, both within marketing and sales, are reported by 15 percent or more of respondents.

Gen AI also is weaving its way into respondents’ personal lives. Compared with 2023, respondents are much more likely to be using gen AI at work and even more likely to be using gen AI both at work and in their personal lives (Exhibit 4). The survey finds upticks in gen AI use across all regions, with the largest increases in Asia–Pacific and Greater China. Respondents at the highest seniority levels, meanwhile, show larger jumps in the use of gen AI tools for work and outside of work compared with their midlevel-management peers. Looking at specific industries, respondents working in energy and materials and in professional services report the largest increase in gen AI use.

Investments in gen AI and analytical AI are beginning to create value

The latest survey also shows how different industries are budgeting for gen AI. Responses suggest that, in many industries, organizations are about equally as likely to be investing more than 5 percent of their digital budgets in gen AI as they are in nongenerative, analytical-AI solutions (Exhibit 5). Yet in most industries, larger shares of respondents report that their organizations spend more than 20 percent on analytical AI than on gen AI. Looking ahead, most respondents—67 percent—expect their organizations to invest more in AI over the next three years.

Where are those investments paying off? For the first time, our latest survey explored the value created by gen AI use by business function. The function in which the largest share of respondents report seeing cost decreases is human resources. Respondents most commonly report meaningful revenue increases (of more than 5 percent) in supply chain and inventory management (Exhibit 6). For analytical AI, respondents most often report seeing cost benefits in service operations—in line with what we found last year —as well as meaningful revenue increases from AI use in marketing and sales.

Inaccuracy: The most recognized and experienced risk of gen AI use

As businesses begin to see the benefits of gen AI, they’re also recognizing the diverse risks associated with the technology. These can range from data management risks such as data privacy, bias, or intellectual property (IP) infringement to model management risks, which tend to focus on inaccurate output or lack of explainability. A third big risk category is security and incorrect use.

Respondents to the latest survey are more likely than they were last year to say their organizations consider inaccuracy and IP infringement to be relevant to their use of gen AI, and about half continue to view cybersecurity as a risk (Exhibit 7).

Conversely, respondents are less likely than they were last year to say their organizations consider workforce and labor displacement to be relevant risks and are not increasing efforts to mitigate them.

In fact, inaccuracy, which can affect use cases across the gen AI value chain, ranging from customer journeys and summarization to coding and creative content, is the only risk that respondents are significantly more likely than last year to say their organizations are actively working to mitigate.

Some organizations have already experienced negative consequences from the use of gen AI, with 44 percent of respondents saying their organizations have experienced at least one consequence (Exhibit 8). Respondents most often report inaccuracy as a risk that has affected their organizations, followed by cybersecurity and explainability.

Our previous research has found that there are several elements of governance that can help in scaling gen AI use responsibly, yet few respondents report having these risk-related practices in place (“Implementing generative AI with speed and safety,” McKinsey Quarterly, March 13, 2024). For example, just 18 percent say their organizations have an enterprise-wide council or board with the authority to make decisions involving responsible AI governance, and only one-third say gen AI risk awareness and risk mitigation controls are required skill sets for technical talent.

Bringing gen AI capabilities to bear

The latest survey also sought to understand how, and how quickly, organizations are deploying these new gen AI tools. We have found three archetypes for implementing gen AI solutions: takers use off-the-shelf, publicly available solutions; shapers customize those tools with proprietary data and systems; and makers develop their own foundation models from scratch (“Technology’s generational moment with generative AI: A CIO and CTO guide,” McKinsey, July 11, 2023). Across most industries, the survey results suggest that organizations are finding off-the-shelf offerings applicable to their business needs, though many are pursuing opportunities to customize models or even develop their own (Exhibit 9). About half of reported gen AI uses within respondents’ business functions are utilizing off-the-shelf, publicly available models or tools, with little or no customization. Respondents in energy and materials, technology, and media and telecommunications are more likely to report significant customization or tuning of publicly available models or developing their own proprietary models to address specific business needs.

Respondents most often report that their organizations required one to four months from the start of a project to put gen AI into production, though the time it takes varies by business function (Exhibit 10). It also depends upon the approach for acquiring those capabilities. Not surprisingly, reported uses of highly customized or proprietary models are 1.5 times more likely than off-the-shelf, publicly available models to take five months or more to implement.

Gen AI high performers are excelling despite facing challenges

Gen AI is a new technology, and organizations are still early in the journey of pursuing its opportunities and scaling it across functions. So it’s little surprise that only a small subset of respondents (46 out of 876) report that a meaningful share of their organizations’ EBIT can be attributed to their deployment of gen AI. Still, these gen AI leaders are worth examining closely. These, after all, are the early movers, who already attribute more than 10 percent of their organizations’ EBIT to their use of gen AI. Forty-two percent of these high performers say more than 20 percent of their EBIT is attributable to their use of nongenerative, analytical AI, and they span industries and regions—though most are at organizations with less than $1 billion in annual revenue. The AI-related practices at these organizations can offer guidance to those looking to create value from gen AI adoption at their own organizations.

To start, gen AI high performers are using gen AI in more business functions—an average of three functions, while others average two. They, like other organizations, are most likely to use gen AI in marketing and sales and product or service development, but they’re much more likely than others to use gen AI solutions in risk, legal, and compliance; in strategy and corporate finance; and in supply chain and inventory management. They’re more than three times as likely as others to be using gen AI in activities ranging from processing of accounting documents and risk assessment to R&D testing and pricing and promotions. While, overall, about half of reported gen AI applications within business functions are utilizing publicly available models or tools, gen AI high performers are less likely to use those off-the-shelf options than to either implement significantly customized versions of those tools or to develop their own proprietary foundation models.

What else are these high performers doing differently? For one thing, they are paying more attention to gen-AI-related risks. Perhaps because they are further along on their journeys, they are more likely than others to say their organizations have experienced every negative consequence from gen AI we asked about, from cybersecurity and personal privacy to explainability and IP infringement. Given that, they are more likely than others to report that their organizations consider those risks, as well as regulatory compliance, environmental impacts, and political stability, to be relevant to their gen AI use, and they say they take steps to mitigate more risks than others do.

Gen AI high performers are also much more likely to say their organizations follow a set of risk-related best practices (Exhibit 11). For example, they are nearly twice as likely as others to involve the legal function and embed risk reviews early on in the development of gen AI solutions, that is, to “shift left.” They’re also much more likely than others to employ a wide range of other best practices, from strategy-related practices to those related to scaling.

In addition to experiencing the risks of gen AI adoption, high performers have encountered other challenges that can serve as warnings to others (Exhibit 12). Seventy percent say they have experienced difficulties with data, including defining processes for data governance, developing the ability to quickly integrate data into AI models, and an insufficient amount of training data, highlighting the essential role that data play in capturing value. High performers are also more likely than others to report experiencing challenges with their operating models, such as implementing agile ways of working and effective sprint performance management.

About the research

The online survey was in the field from February 22 to March 5, 2024, and garnered responses from 1,363 participants representing the full range of regions, industries, company sizes, functional specialties, and tenures. Of those respondents, 981 said their organizations had adopted AI in at least one business function, and 878 said their organizations were regularly using gen AI in at least one function. To adjust for differences in response rates, the data are weighted by the contribution of each respondent’s nation to global GDP.
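The GDP weighting mentioned above can be illustrated with a small sketch. The article does not describe the exact weighting procedure, so the scheme below is an assumption: each country's respondents share a total weight equal to that country's share of global GDP. The function name `weighted_share`, the country labels, and all numbers are hypothetical and are not survey data.

```python
from collections import Counter


def weighted_share(responses, gdp_share):
    """Return the weighted share of 'yes' answers.

    responses: list of (country, answered_yes) tuples.
    gdp_share: dict mapping country -> share of global GDP.
    Assumed scheme: a country's GDP share is split equally
    among that country's respondents.
    """
    if not responses:
        raise ValueError("no responses")
    counts = Counter(country for country, _ in responses)
    weighted_yes = 0.0
    weighted_total = 0.0
    for country, answered_yes in responses:
        weight = gdp_share[country] / counts[country]
        weighted_total += weight
        if answered_yes:
            weighted_yes += weight
    return weighted_yes / weighted_total


# Hypothetical data: country A is over-represented among respondents
# relative to its (made-up) GDP share.
responses = [("A", True), ("A", True), ("A", False), ("B", False)]
gdp_share = {"A": 0.25, "B": 0.75}

unweighted = sum(yes for _, yes in responses) / len(responses)
weighted = weighted_share(responses, gdp_share)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy case the weighted share is lower than the raw share because the over-represented country's answers are scaled down, which is the kind of adjustment for uneven response rates that the methodology note describes.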

Alex Singla and Alexander Sukharevsky are global coleaders of QuantumBlack, AI by McKinsey, and senior partners in McKinsey’s Chicago and London offices, respectively; Lareina Yee is a senior partner in the Bay Area office, where Michael Chui, a McKinsey Global Institute partner, is a partner; and Bryce Hall is an associate partner in the Washington, DC, office.

They wish to thank Kaitlin Noe, Larry Kanter, Mallika Jhamb, and Shinjini Srivastava for their contributions to this work.

This article was edited by Heather Hanselman, a senior editor in McKinsey’s Atlanta office.

Front Med (Lausanne), PMC10725950

Markers of oxidative stress during post-COVID-19 fatigue: a hypothesis-generating, exploratory pilot study on hospital employees

Hanna Hofmann, Alexandra Önder, Juliane Becker, Michael Gröger, Markus M. Müller, Fabian Zink, Barbara Stein, Peter Radermacher, Christiane Waller

1 Department of Psychosomatic Medicine and Psychotherapy, General Hospital Nuremberg, Paracelsus Medical University, Nuremberg, Germany

2 Anesthesiological Pathophysiology and Process Engineering, University Hospital, Ulm, Germany

Associated data

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Introduction

Post-COVID-19 fatigue is common after recovery from COVID-19. Excess formation of reactive oxygen species (ROS) leading to oxidative stress-related mitochondrial dysfunction is referred to as a cause of these chronic fatigue-like symptoms. The present observational pilot study aimed to investigate a possible relationship between the course of ROS formation, subsequent oxidative stress, and post-COVID-19 fatigue.

A total of 21 post-COVID-19 employees of the General Hospital Nuremberg suffering from fatigue-like symptoms were studied during their first consultation (T1: on average 3 months after recovery from COVID-19), which comprised an educational talk on post-COVID-19 symptomatology and individualized outpatient strategies to resume normal activity, and 8 weeks thereafter (T2). Fatigue severity was quantified using the Chalder Fatigue Scale together with a health survey (Patient Health Questionnaire) and self-report on wellbeing (12-Item Short-Form Health Survey). We measured whole blood superoxide anion (O₂•⁻) production rate (electron spin resonance, as a surrogate for ROS production) and oxidative stress-induced DNA strand breaks (single cell gel electrophoresis: “tail moment” in the “comet assay”).

Data are presented as mean ± SD or median (interquartile range) depending on the data distribution. Differences between T1 and T2 were tested using a paired Wilcoxon signed-rank test or a paired t-test. Fatigue intensity decreased from 24 ± 5 at T1 to 18 ± 8 at T2 (p < 0.05), which coincided with reduced O₂•⁻ formation (from 239 ± 55 to 195 ± 59 nmol/s; p < 0.05) and attenuated DNA damage [tail moment from 0.67 (0.36–1.28) to 0.32 (0.23–0.71); p = 0.05].
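The paired comparison between T1 and T2 can be sketched in a few lines. This is a minimal illustration of the paired t statistic only, using the Python standard library; it is not the authors' analysis code, the function name `paired_t_statistic` is hypothetical, and the scores are invented values rather than study data. Obtaining a p-value would additionally require the t distribution, which a statistics package provides.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired measurements.

    The statistic is the mean within-subject difference (after - before)
    divided by its standard error.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    if len(before) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 denominator
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error, len(diffs) - 1


# Invented fatigue scores for four hypothetical subjects at T1 and T2.
t1_scores = [26, 24, 20, 28]
t2_scores = [20, 19, 18, 21]

t_value, dof = paired_t_statistic(t1_scores, t2_scores)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

A negative t here reflects lower scores at T2 than at T1. For skewed data such as the tail moment, the rank-based Wilcoxon test named in the text would be used in place of this statistic.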

Our pilot study shows that post-COVID-19 fatigue coincides with (i) enhanced O₂•⁻ formation and oxidative stress, which are (ii) reduced with attenuation of fatigue symptoms.

1 Introduction

Fatigue after acute viral infection is a well-known consequence of, e.g., an Epstein-Barr virus (EBV) infection (1). Similarly, after the acute infection with SARS-CoV-2 has resolved, a significant number of patients continue to suffer from various physical and psychological symptoms, eventually lasting for several months (2), among which post-infectious fatigue is a common finding (3). Fatigue is characterized by severe physical and mental exhaustion disproportionate to the previous activity (2), which results in markedly impaired cardiorespiratory fitness (4). In post-COVID-19 patients, female sex and a pre-existing diagnosis of depression and/or anxiety are frequently present (5), while the degree of fatigue is often unrelated to the initial disease severity (5, 6). Despite the high impact on individual mental and physical health and quality of life, the pathophysiology of this fatigue is still not known (7).

Post-COVID-19 fatigue symptomatology resembles that of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) (8), and substantial overlap has been reported between post-COVID-19 and ME/CFS symptoms (9). Persistent neuroinflammation (10) and brain antioxidant capacity (11), redox imbalance (oxidative stress) (12), and consecutive mitochondrial dysfunction resulting from impaired mitochondrial respiratory activity and/or a reduced number of intact mitochondria (13) have been referred to as a possible link between post-COVID-19 fatigue and ME/CFS. Most recently, a significant relationship was shown between a neuropsychiatric symptoms score and a score based on the relationship between serum markers of oxidative and nitrosative stress and antioxidant capacity (14). Finally, oxidative stress is defined as the mismatch between the production and/or accumulation of reactive oxygen species (ROS) and the radical scavenger (antioxidant) capacity (15). This can result in damage to the DNA and/or mitochondria, the latter being mainly responsible for cellular energy metabolism. ROS formation is a natural process (16), e.g., for antimicrobial host defense (17), and mitochondrial respiration is the major source of ROS generation (18).

Activated immune cells (monocytes, neutrophils) also directly release ROS through NADPH oxidase activity (19). However, this excess ROS formation has also been referred to as a major pathophysiological mechanism of COVID-19: by increasing extracellular trap formation, it suppresses the T-cell response, i.e., the adaptive immune system response necessary to eliminate virus-infected cells (20).

Given the fundamental role of oxidative stress during the acute phase of a SARS-CoV-2 infection, we aimed to assess a possible relationship between oxidative stress and sequelae in patients who had recovered from the disease. For this purpose, in the present hypothesis-generating, exploratory pilot study, we investigated markers of oxidative stress and post-COVID-19 fatigue symptoms in hospital employees. We collected psychosocial data and analyzed ROS concentration and oxidative DNA damage in blood cells at two different time points prior to and after psychosomatic counseling.

2.1 Subjects and ethics

The present dataset is based on data collected from 21 hospital employees of the post-COVID-19 outpatient clinic at the Department of Psychosomatic Medicine and Psychotherapy, General Hospital Nuremberg, Paracelsus Medical University. The outpatient clinic was set up in March 2021 to support healthcare workers in the metropolitan region of Nuremberg in dealing with the consequences of a SARS-CoV-2 infection and to initiate treatment if necessary.

Prior to inclusion, all subjects gave their written informed consent for participation. The study was conducted in accordance with the Declaration of Helsinki; the study protocol had been approved by the Ethics Committee of the Paracelsus Medical University (No. FMS_W_010.22-XI-3) and the Bavarian State Chamber of Physicians (Bayerische Landesärztekammer No. 22035) and registered in the German Clinical Trials Register (ID: DRKS00028108).

2.2 Study design

The present observational, hypothesis-generating clinical pilot study was carried out on patients of the interdisciplinary post-COVID-19 consultation hour established at General Hospital Nuremberg for hospital employees of all professional groups. Inclusion criteria were age between 18 and 70 years, COVID-19 infection, fatigue symptomatology, and post-COVID-19 syndrome according to the “Long/Post-COVID” guideline of the “ Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften ” (AWMF) ( 21 ). Exclusion criteria were insufficient knowledge of the German language to answer the questionnaires, an untreated somatic disease susceptible to provoking fatigue-like symptoms (e.g., malnutrition, electrolyte disturbances, and endocrine and neurological disorders), and/or the presence of a psychiatric disorder (such as addictive disorder, dementia, psychotic disorder, or suicidality). In particular, except for three individuals, none of the patients included had undergone psychotherapy within 12 months preceding the SARS-CoV-2 infection. A total of 16 women and 5 men with a median age of 52 (range: 32–64) years were recruited. The acute SARS-CoV-2 infection occurred between March 2020 and December 2021; the time interval between the SARS-CoV-2 infection and the first visit (T1) to the interdisciplinary post-COVID-19 consultation was at least 3 months. In 20 out of the 21 patients, SARS-CoV-2 treatment was confined to outpatient clinical care; the only patient requiring hospitalization did not need any ICU treatment. Hence, the patients studied had only shown mild to moderate severity of the acute SARS-CoV-2 infection; long-term pulmonary and/or cardiovascular sequelae were not present either.

Employees with fatigue symptoms presented at the Department of Psychosomatic Medicine and Psychotherapy between 10 a.m. and 12 p.m. for about half an hour and were always treated by the same physician (C.W.). Before the consultation started, the participants were asked to fill out the questionnaires. This was followed by a medical history interview. After a rest period of 5 min, blood was taken and immediately processed at a mobile lab desk for analysis of reactive oxygen species (ROS) formation and oxidative DNA damage. The intervention consisted of an educational talk during which the clinician explained the typical symptoms of the post-COVID-19 syndrome and the relationship between both physical and psychosocial stress and symptom amplification in the recovery phase. Depending on the degree of stress, an individualized outpatient procedure was determined to allow for the resumption of everyday and work activities, and a second psychosomatic consultation was arranged at an interval of 8 weeks to assess the progress (T2). At T2, the completion of the questionnaires and blood sampling were carried out in the same way as at T1. Data from all 21 employees were analyzed for the first consultation, whereas only 15 employees attended the second: one person could not have blood drawn, and five others did not need a second consultation; their questionnaire data are therefore missing for T2.

2.3 Psychometric analysis

In addition to the collection of the sociodemographic data “age,” “gender,” and “time and course of SARS-CoV-2 infection,” the following psychometric analyses were performed:

2.3.1 Mental health

Mental health was surveyed using the German Version of the Patient Health Questionnaire (PHQ-D) ( 22 ), which is a self-assessment tool consisting of several modules. We used the PHQ-D modules “somatization (PHQ-15),” “depression (PHQ-9),” and “stress (PHQ-Stress).” The PHQ-15 includes 15 physical complaints such as abdominal pain, headache, dizziness, shortness of breath, or palpitations. Respondents are asked to indicate to what extent they feel affected by the symptoms mentioned during the last 2 and 4 weeks for lack of energy and sleep disorder, respectively. The PHQ-9 module on depression comprises nine items. Participants are asked how often they felt affected by complaints like loss of interest, hopelessness, reduced appetite, or concentration difficulties during the last 2 weeks. The PHQ-Stress module measures psychosocial stress factors and comprises 10 items. For example, it asks how much a person felt affected by worries about their health, difficulties with their partner, stress at work, or financial worries during the last 4 weeks. The response formats are as follows: For PHQ-15 and PHQ-Stress, 0 = not bothered at all, 1 = bothered a little, and 2 = bothered a lot, and for PHQ-9, 0 = not at all, 1 = several days, 2 = more than half the days, and 3 = nearly every day. Each module is evaluated by forming the sum score. For PHQ-15, this can range from 0 to 30; for PHQ-9, from 0 to 27; and for PHQ-Stress, from 0 to 20. Higher total scale values indicate a more severe mental disorder. Scale sum scores can be categorized and interpreted as follows: minimal (0–4), mild (5–9), moderate (10–14), and severe (≥15); for PHQ-9, moderate (10–14), moderately severe (15–19), and severe (≥20) symptom expression.
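
The sum scoring and severity bands described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring procedure used in the study: all function and variable names are hypothetical, the first set of bands is assumed to refer to the PHQ-15 (the text does not name the scale), and the lower PHQ-9 bands (minimal 0–4, mild 5–9) are assumed to match, since the text lists the PHQ-9 bands only from “moderate” upward.

```python
# Illustrative sketch of PHQ module scoring; names and example answers are hypothetical.

# (number of items, maximum item value) per module, as described in the text
PHQ_MODULES = {
    "PHQ-15": (15, 2),      # sum range 0-30
    "PHQ-9": (9, 3),        # sum range 0-27
    "PHQ-Stress": (10, 2),  # sum range 0-20
}

# Severity bands as (lower bound, label), checked from the highest bound down.
# Assumption: the first band set in the text refers to the PHQ-15, and the
# PHQ-9 shares the minimal/mild bands.
SEVERITY_BANDS = {
    "PHQ-15": [(15, "severe"), (10, "moderate"), (5, "mild"), (0, "minimal")],
    "PHQ-9": [(20, "severe"), (15, "moderately severe"), (10, "moderate"),
              (5, "mild"), (0, "minimal")],
}


def phq_sum_score(module, answers):
    """Validate the item answers of one module and return their sum."""
    n_items, max_value = PHQ_MODULES[module]
    if len(answers) != n_items:
        raise ValueError(f"{module} expects {n_items} items, got {len(answers)}")
    if any(a not in range(max_value + 1) for a in answers):
        raise ValueError(f"{module} answers must be integers from 0 to {max_value}")
    return sum(answers)


def phq_severity(module, score):
    """Map a sum score to its severity band (PHQ-15 and PHQ-9 only)."""
    for lower_bound, label in SEVERITY_BANDS[module]:
        if score >= lower_bound:
            return label
    raise ValueError("score must be non-negative")


example_answers = [1] * 9 + [2] * 2 + [0] * 4             # hypothetical PHQ-15 answers
example_score = phq_sum_score("PHQ-15", example_answers)  # 9 * 1 + 2 * 2 = 13
example_band = phq_severity("PHQ-15", example_score)      # falls in the 10-14 band
```

No bands are defined here for PHQ-Stress because the text gives none for that module.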

2.3.2 Self-report of health and wellbeing

The German version of the “Short-Form-12 Health Survey” (SF-12) ( 23 ) was used to measure health-related quality of life. The SF-12 is a short version of the Short-Form-36 Health Survey (SF-36) ( 24 ) and consists of 12 items. The eight dimensions of the SF-36 are represented in the SF-12 by four individual items (general health perception, pain, vitality, and social functioning) and four item pairs (physical functioning, physical role functioning, emotional role functioning, and psychological wellbeing). Respondents are asked to use multilevel response scales to describe, e.g., their health in general (1 = excellent to 5 = poor), to assess whether and, if so, to what extent they had been limited by their current health in moderately difficult activities (e.g., moving a table, vacuuming, bowling, playing golf; 1 = yes, severely limited to 3 = no, not limited at all), or, e.g., how often they had felt “full of energy” in the past 4 weeks (1 = always to 6 = never). The subscales of general perception of health, physical functioning, physical role functioning, and pain represent the physical dimension of health. Vitality, psychological wellbeing, emotional role function, and social functioning represent the psychological dimension. A sum scale can be calculated for both physical (Physical Composite Score) and mental (Mental Composite Score) health. Scores were calculated, and standard values applied, according to the manual by Morfeld et al. ( 25 ). Higher values on the sum scales reflect better subjective physical and mental health. Standard values can be found in the manual. For the German SF-12, these were taken from the standardization of the SF-36.

2.3.3 Fatigue

Fatigue was assessed using the German version (FS) ( 26 ) of the Chalder fatigue scale ( 27 ). The scale is a self-report instrument and measures the intensity of fatigue during the last 4 weeks according to 11 items. Seven items relate to the physical component of fatigue, and four items relate to mental fatigue. For example, the physical dimension of fatigue is surveyed with the questions “ Do you have problems with tiredness? ,” “ Do you need to rest more? ,” or “ Do you feel sleepy or drowsy? ,” while the items “ Do you have difficulty concentrating? ,” “ Do you make slips of the tongue when speaking? ,” or “ How is your memory? ” are examples of the mental dimension of fatigue. The items are answered in a four-point response format, for items 1 to 10, 0 = less than usual, 1 = no more than, 2 = more than, and 3 = much more than usual, and for item 11, 0 = better than, 1 = no worse than, 2 = worse than, and 3 = much worse than. The expressions on the two subscales (physical fatigue and mental fatigue) and a total scale score are determined. The evaluation is either dimensional using a Likert scale from 0 to 3 or categorical using a bimodal scale of (0, 1: 0; 2, 3: 1). Thus, evaluations can be made regarding the severity as well as possible case identification. In the present study, a dimensional evaluation was used. Higher total values represent more pronounced fatigue symptoms. In a study using the Chalder fatigue scale, mean fatigue scores of 24.4 ± 5.8 ( n = 361) and 14.2 ± 4.6 ( n = 1,615) were found for CFS patients and a “ non-clinical community ” sample presenting to a general practitioner, respectively ( 28 ).
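
The two evaluation modes of the fatigue scale (dimensional and categorical) can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the authors' scoring code: the names are hypothetical, and the assignment of items 1 to 7 to the physical subscale and items 8 to 11 to the mental subscale is an assumption, because the text gives only the number of items per subscale.

```python
# Illustrative sketch of Chalder fatigue scale scoring; names are hypothetical.

PHYSICAL_ITEMS = range(0, 7)   # assumption: items 1-7 form the physical subscale
MENTAL_ITEMS = range(7, 11)    # assumption: items 8-11 form the mental subscale


def chalder_scores(answers, bimodal=False):
    """Return physical, mental and total scores for 11 answers coded 0-3.

    bimodal=False: dimensional (Likert) scoring, item values 0-3 are summed.
    bimodal=True: categorical scoring, answers 0 and 1 count as 0, 2 and 3 as 1.
    """
    if len(answers) != 11 or any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("expected 11 answers, each coded 0, 1, 2 or 3")
    values = [int(a >= 2) for a in answers] if bimodal else list(answers)
    physical = sum(values[i] for i in PHYSICAL_ITEMS)
    mental = sum(values[i] for i in MENTAL_ITEMS)
    return {"physical": physical, "mental": mental, "total": physical + mental}


example = [2, 3, 2, 1, 2, 3, 2, 1, 2, 2, 1]      # hypothetical answers
likert = chalder_scores(example)                  # dimensional, as used in the study
binary = chalder_scores(example, bimodal=True)    # categorical alternative
```

With dimensional scoring the total can range from 0 to 33, and with bimodal scoring from 0 to 11, which is why the two modes are not interchangeable when comparing against published reference values.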

2.3.4 Blood analyses

Immediately after sampling, 2 ml of venous blood collected in Lithium-Heparin-Serum Monovettes (Sarstedt, Nümbrecht, Germany), on ice and under light protection, was taken to the mobile lab desk for further processing. Blood samples were processed for the measurement of the superoxide anion ( O 2 • - ) production rate as a surrogate for ROS production and the quantification of oxidative stress-induced DNA strand breaks (single cell gel electrophoresis: “tail moment” in the “comet assay”).

2.3.4.1 Superoxide anion ( O 2 • - ) production

Superoxide anion ( O 2 • - ) production was determined based on electron paramagnetic resonance (EPR) using the VitaScreen ® device (Noxygen Science Transfer and Diagnostics GmbH, Elzach, Germany). For this purpose, the device was heated to 37°C to mirror in vivo conditions, and 15 μl of blood was pipetted into a light-protected PCR reaction tube. The blood solution was mixed with 15 μl of the spin probe 1-hydroxy-3-methoxycarbonyl-2,2,5,5-tetramethylpyrrolidine (CMH, 400 μmol/L) (Elzach, Germany) diluted in Krebs-HEPES buffer containing deferoxamine and the Na salt of diethyldithiocarbamic acid. The CMH-blood mixture was sucked up using a microcapillary, sealed on one side with sealing wax, and subsequently placed in the resonator of the VitaScreen ® . After 10 min of reaction, the result was recorded as “cellular metabolic activity (CMA) of ROS in total cells” in nmol/s ( 29 , 30 ).

2.3.4.2 DNA damage

Oxidative DNA damage was quantified via the determination of DNA strand breaks using single-cell gel electrophoresis (an alkaline version of the “comet assay”) of whole blood samples. Briefly, cell lysis for at least 1 h and slide processing were performed as previously described in detail ( 31 , 32 ) using alkali denaturation and electrophoresis (0.86 V/cm at a pH ≈ 13) to transform alkali-sensitive parts of the DNA into DNA strand breaks. After staining every slide with 50 μl ethidium bromide (Carl Roth, Germany), DNA damage was analyzed under a fluorescence microscope (Olympus, Germany) using image analysis to determine the mean “tail moment” and the mean “tail intensity” of 100 randomly selected nuclei per slide (two slides each per measurement in each individual) (COMET Assay IV, version 4.3, Perceptive Instruments, Haverhill, United Kingdom) ( 32 , 33 ). Nuclei with a calculated “tail moment” of <0.1 were qualified as “undamaged” ( 33 ).

2.3.5 Statistical analysis

Data were analyzed with the statistics package SPSS (version 28, IBM, United States). Mean differences were tested using the t -test for dependent samples or the Wilcoxon test, depending on whether the assumption of a normal distribution was fulfilled. Statistical significance was set at p < 0.05.
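
As a rough illustration of the dependent-samples t -test, the following Python sketch computes the paired t statistic from complete pairs only. It is not the SPSS procedure used in the study: the function name and the data are hypothetical, differences are taken as T1 minus T2 (so that a positive t indicates a decrease), and the p-value, the normality check, and the Wilcoxon alternative are left out because they require a statistics package and the text does not state which normality check was applied.

```python
# Illustrative sketch of a paired (dependent-samples) t statistic; data are hypothetical.
import math
import statistics


def paired_t_statistic(t1, t2):
    """Return (t, degrees of freedom, n pairs) for participants measured at both times.

    t1 and t2 map participant IDs to values; IDs missing at either time point
    are dropped, so only complete pairs enter the test.
    """
    paired_ids = sorted(set(t1) & set(t2))
    if len(paired_ids) < 2:
        raise ValueError("need at least two complete pairs")
    differences = [t1[i] - t2[i] for i in paired_ids]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)        # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1, n


# Hypothetical scores: participant "d" has no T2 value and is excluded.
scores_t1 = {"a": 10, "b": 12, "c": 14, "d": 9}
scores_t2 = {"a": 8, "b": 11, "c": 11}
t_value, df, n_pairs = paired_t_statistic(scores_t1, scores_t2)
```

Restricting the calculation to complete pairs matters in a small sample with dropouts, because group means reported for all participants at T1 can differ from the means of the paired subset that the test actually uses.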

Table 1 and Figures 1 and 2 summarize the results of the fatigue and mental health parameters as well as O 2 • - production rate and the quantification of the DNA damage as assessed using the “tail moment” in the “Comet Assay.” While the fatigue severity was significantly reduced from T1 to T2 ( Table 1 : overall results; Figure 1 , upper panel : individual findings), the attenuation of the PHQ-15 level just did not reach statistical significance ( p = 0.054). None of the other psychometric analyses showed any difference. Whole blood O 2 • - production rate also significantly decreased between the two measurement points ( Table 1 : overall results; Figure 1 , middle panel : individual findings), whereas again, the reduction of the “tail moment” just did not reach statistical significance ( Table 1 : overall results; Figure 1 , lower panel : individual findings; p = 0.053). Figure 2 shows the individual differences between T1 and T2.

Table 1. Overall results for fatigue, mental health (SF-12 PCS, SF-12 MCS, PHQ-15, PHQ-9, and PHQ-Stress), whole blood superoxide anion ( O 2 • - ), and DNA damage (“tail moment” in the “comet assay”) at T1 and T2.

| Parameter | T1 | T2 | t-test or Wilcoxon test | p-value | Effect size |
| Fatigue | 23.7 ± 5.4 (n = 21) | 18.3 ± 8.1 (n = 15) | t = 2.6 | 0.023 | 0.42 |
| SF-12 PCS | 33.7 ± 9.8 (n = 18) | 35.5 ± 10.3 (n = 15) | t = −0.2 | 0.864 | −0.05 |
| SF-12 MCS | 37.0 ± 10.3 (n = 18) | 41.2 ± 13.1 (n = 15) | t = −0.8 | 0.435 | −0.30 |
| PHQ-15 | 13.0 ± 5.8 (n = 21) | 10.1 ± 5.8 (n = 15) | t = 2.1 | 0.054 | 0.31 |
| PHQ-9 | 9.6 ± 4.5 (n = 21) | 7.7 ± 4.6 (n = 15) | z = −1.2 | 0.281 | 0.32 |
| PHQ-Stress | 5.6 ± 3.1 (n = 21) | 4.5 ± 3.1 (n = 15) | z = −0.7 | 0.464 | 0.19 |
| O 2 • - [nmol/s] | 239 ± 55 (n = 21) | 195 ± 59 (n = 18) | t = 2.3 | 0.037 | 0.70 |
| Tail moment | 0.67 (0.36; 1.28) (n = 21) | 0.32 (0.23; 0.71) (n = 15) | z = −1.9 | 0.053 | 0.50 |

Data are presented as mean ± SD or median (interquartile range), depending on the presence/absence of normal data distribution. Note that the p-values for the paired t-test and the Wilcoxon test refer to the number of measurements available at both T1 and T2. For individual data, see Figure 1 . Effect sizes: Cohen's d for the t-test (calculated using https://www.psychometrica.de/effect_size.html); r = |z/√N| for the Wilcoxon test.
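
The effect size r for the Wilcoxon test named in the table note is a one-line calculation, sketched here in Python. The function name and the example inputs are hypothetical, and the choice of N is left to the caller because the note does not spell out which count was used; since the z values in the table are rounded, recomputing r from them would only approximate the reported effect sizes.

```python
# Illustrative sketch of the effect size r = |z| / sqrt(N) for the Wilcoxon test.
import math


def wilcoxon_effect_size_r(z, n):
    """Return r = |z| / sqrt(n) for a Wilcoxon z statistic and sample size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(z) / math.sqrt(n)


# Hypothetical inputs, not taken from the study data.
r_example = wilcoxon_effect_size_r(z=-2.0, n=16)   # 2.0 / 4
```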

Figure 1. Individual results for the fatigue score (upper panel) as well as whole blood O 2 • - formation rate (in nmol/s) (middle panel) and DNA damage (tail moment in the comet assay) (lower panel) at T1 and T2. Note that black symbols represent patients for whom complete datasets were available at both time points T1 and T2, whereas red symbols represent patients for whom data at T2 were not available for all items.

Figure 2. Individual results for the fatigue score (upper panel) as well as whole blood O 2 • - formation rate (in nmol/s) (middle panel) and DNA damage (tail moment in the comet assay) (lower panel) as difference values between T1 and T2.

4 Discussion

The present observational, exploratory, and hypothesis-generating pilot study aimed to assess a possible relationship between oxidative stress and fatigue-like sequelae in hospital employees after a SARS-CoV-2 infection. The main results were that post-COVID-19 fatigue coincides with (i) enhanced O 2 • - formation and oxidative stress, which are (ii) reduced with attenuation of fatigue symptoms.

The fatigue severity, as assessed using the Chalder fatigue score, was significantly reduced between the two measurement time points. While the fatigue score at T1 (23.7 ± 5.4) was similar to that reported in 361 CFS patients (24.4 ± 5.8) ( 28 ), the values at T2 (18.3 ± 8.1) were still higher than in the 1,615 control patients (14.2 ± 4.6) in that study. However, in CFS patients, graded exercise ( 35 ) and cognitive behavioral therapy ( 36 ) had yielded similar reductions of the Chalder fatigue score, by approximately five points from 24–26 to 19–21 ( 35 , 36 ), and oral oxaloacetate had reduced it by 25% ( 34 ). Hence, the attenuation of the fatigue score in our post-COVID patients agrees well with reports on various therapeutic interventions in CFS patients.

According to the PHQ-Stress score, our patients presented with only a mild stress level at T1. Consequently, given the only minor symptomatic burden, we did not expect a major effect on the PHQ-Stress score at T2, and the mean difference was negligible. Both the PHQ-9 score, i.e., the quantification of depressive symptoms, and the PHQ-15 score, i.e., the quantification of somatic symptoms, were only moderate at T1. While the PHQ-9 score did not differ at T2, the PHQ-15 score was attenuated, although this effect narrowly missed statistical significance ( p = 0.054). The finding for PHQ-9 agrees well with the assumption that our patients were “mentally healthy,” which is supported by the fact that only three patients had received psychotherapeutic treatment within the 12 months prior to the investigation. The PHQ-15 score not only addresses mental health but also comprises somatic symptoms that may also be present in CFS patients ( 37 ). Hence, it is tempting to speculate that the reduction in fatigue, as reflected by the Chalder fatigue score, may have resulted in a reduced PHQ-15 score as well.

In CFS patients, increased plasma peroxide and serum oxidized low-density lipoprotein levels have been reported, suggesting enhanced ROS concentrations [e.g., ( 38 )]. Aggravated oxidative stress resulting from excess ROS production is said to play a role in the development of post-COVID-19 syndrome ( 39 – 41 ). Although, to the best of our knowledge, there is no comparable literature on measuring either ROS formation rate or oxidative stress using the methods shown here, our findings are in good agreement with this assumption: the mean O 2 • - formation rate at T1 was higher than the upper threshold reported for healthy volunteers without an increased ROS production rate [220 nmol/s; ( 29 )] and decreased to levels within the normal range at T2. In addition, the amount of DNA damage as measured using single cell gel electrophoresis and reported as the “tail moment” in the comet assay at T1 (median 0.67) was markedly higher than in various previous investigations of our group in healthy volunteers [median 0.18, 0.23, and 0.30 ( 31 , 32 , 42 ), respectively]. In the present study, at T2, the median tail moment (0.32) had returned to values similar to those in these previous studies.

4.1 Limitations

The relatively small cohort studied may have precluded more robust, statistically significant results. In addition, due to the observational, exploratory pilot nature of the study, we could not include a control group that did not undergo the educational talk on the typical symptoms of post-COVID-19 syndrome or, in particular, the individualized outpatient procedure. Hence, we cannot discriminate between a possible effect of this procedure and a putative time-dependent resolution of the fatigue symptoms and/or the biological findings. Our study was further limited due to our inability to control possible confounding factors that are well-established to affect DNA damage and/or fatigue (e.g., acute stressors or infections, smoking, nutritional habits, and partial resumption of physical activity). Moreover, to the best of our knowledge, our study is the first to examine fatigue and oxidative cell stress by combining the methods described. Hence, no data are available in the literature that would have supported a case number estimate. Consequently, an a priori power analysis was impossible. Finally, our study population was confined to hospital employees, which may cause a selection bias in the recruitment and, consequently, limit the generalizability of the results to a broader population.

4.2 Conclusion

Our data suggest a connection between oxidative cell stress and post-COVID-19 fatigue. This possible relationship warrants further investigation so that knowledge can be gained about pathophysiological processes (oxidative stress) in the development of fatigue. This implies psychosomatic treatment options, e.g., mindfulness-based interventions, that stimulate antioxidative targets through psychological and biomolecular mechanisms.

Data availability statement

Ethics statement

The studies involving humans were approved by Ethikkommission der Bayerischen Landesärztekammer. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

HH: Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Validation, Visualization, Writing—original draft, Writing—review & editing. AÖ: Data curation, Formal analysis, Investigation, Methodology, Validation, Writing—original draft. JB: Formal analysis, Writing—review & editing. MG: Data curation, Methodology, Validation, Writing—review & editing. MM: Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing—review & editing. FZ: Data curation, Methodology, Validation, Writing—review & editing. BS: Data curation, Formal analysis, Funding acquisition, Methodology, Validation, Writing—review & editing. PR: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing—review & editing. CW: Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing—review & editing.

Acknowledgments

We would like to thank Alexandra Hass for providing skillful technical assistance.

Funding Statement

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This study was supported by the Verein zur Förderung des Tumorzentrums der Universität Erlangen-Nürnberg e.V. (HH) and the Deutsche Forschungsgemeinschaft (DFG: grant number Project-ID 251293561–Collaborative Research Center (CRC) 1149 Project B03) (PR).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

China has become a scientific superpower

From plant biology to superconductor physics the country is at the cutting edge.

The 500-meter Aperture Spherical Telescope (FAST) in Pingtang County, southwest China's Guizhou Province.

In the atrium of a research building at the Chinese Academy of Sciences (CAS) in Beijing is a wall of patents. Around five metres wide and two storeys high, the wall displays 192 certificates, positioned in neat rows and tastefully lit from behind. At ground level, behind a velvet rope, an array of glass jars contain the innovations that the patents protect: seeds.

CAS —the world’s largest research organisation—and institutions around China produce a huge amount of research into the biology of food crops. In the past few years Chinese scientists have discovered a gene that, when removed, boosts the length and weight of wheat grains, another that improves the ability of crops like sorghum and millet to grow in salty soils and one that can increase the yield of maize by around 10%. In autumn last year, farmers in Guizhou completed the second harvest of genetically modified giant rice that was developed by scientists at CAS .

The Chinese Communist Party ( CCP ) has made agricultural research—which it sees as key to ensuring the country’s food security —a priority for scientists. Over the past decade the quality and the quantity of crop research that China produces has grown immensely, and now the country is widely regarded as a leader in the field. According to an editor of a prestigious European plant-sciences journal, there are some months when half of the submissions can come from China.

A journey of a thousand miles

The rise of plant-science research is not unique in China. In 2019 The Economist surveyed the research landscape in the country and asked whether China could one day become a scientific superpower. Today, that question has been unequivocally answered: “yes”. Chinese scientists recently gained the edge in two closely watched measures of high-quality science, and the country’s growth in top-notch research shows no sign of slowing. The old science world order, dominated by America, Europe and Japan, is coming to an end.

One way to measure the quality of a country’s scientific research is to tally the number of high-impact papers produced each year—that is, publications that are cited most often by other scientists in their own, later work. In 2003 America produced 20 times more of these high-impact papers than China, according to data from Clarivate, a science analytics company (see chart 1). By 2013 America produced about four times the number of top papers and, in the most recent release of data, which examines papers from 2022, China had surpassed both America and the entire European Union ( EU ).

Metrics based on citations can be gamed, of course. Scientists can, and do, find ways to boost the number of times their paper is mentioned in other studies, and a recent working paper, by Qiu Shumin, Claudia Steinwender and Pierre Azoulay, three economists, argues that Chinese researchers cite their compatriots far more than Western researchers do theirs. But China now leads the world on other benchmarks that are less prone to being gamed. It tops the Nature Index, created by the publisher of the same name, which counts the contributions to articles that appear in a set of prestigious journals. To be selected for publication, papers must be approved by a panel of peer reviewers who assess the study’s quality, novelty and potential for impact. When the index was first launched, in 2014, China came second, but its contribution to eligible papers was less than a third of America’s. By 2023 China had reached the top spot.

According to the Leiden Ranking of the volume of scientific research output, there are now six Chinese universities or institutions in the world top ten, and seven according to the Nature Index. They may not be household names in the West yet, but get used to hearing about Shanghai Jiao Tong, Zhejiang and Peking (Beida) Universities in the same breath as Cambridge, Harvard and ETH Zurich. “Tsinghua is now the number one science and technology university in the world,” says Simon Marginson, a professor of higher education at Oxford University. “That’s amazing. They’ve done that in a generation.”

Today China leads the world in the physical sciences, chemistry and Earth and environmental sciences, according to both the Nature Index and citation measures (see chart 2). But America and Europe still have substantial leads in both general biology and medical sciences. “Engineering is the ultimate Chinese discipline in the modern period,” says Professor Marginson, “I think that’s partly about military technology and partly because that’s what you need to develop a nation.”

Applied research is a Chinese strength. The country dominates publications on perovskite solar panels, for example, which offer the possibility of being far more efficient than conventional silicon cells at converting sunlight into electricity. Chinese chemists have developed a new way to extract hydrogen from seawater using a specialised membrane to separate out pure water, which can then be split by electrolysis. In May 2023 it was announced that the scientists, in collaboration with a state-owned Chinese energy company, had developed a pilot floating hydrogen farm off the country’s south-eastern coast.

China also now produces more patents than any other country, although many are for incremental tweaks to designs, as opposed to truly original inventions. New developments tend to spread and be adopted more slowly in China than in the West. But its strong industrial base, combined with cheap energy, means that it can quickly spin up large-scale production of physical innovations like materials. “That’s where China really has an advantage on Western countries,” says Jonathan Bean, CEO of Materials Nexus, a British firm that uses AI to discover new materials.

The country is also signalling its scientific prowess in more conspicuous ways. Earlier this month, China’s Chang’e-6 robotic spacecraft touched down in a gigantic crater on the far side of the Moon, scooped up some samples of rock, planted a Chinese flag and set off back towards Earth. If it successfully returns to Earth at the end of the month, it will be the first mission to bring back samples from this hard-to-reach side of the Moon.

First, sharpen your tools

The reshaping of Chinese science has been achieved by focusing on three areas: money, equipment and people. In real terms, China’s spending on research and development (R&D) has grown 16-fold since 2000. According to the most recent data from the OECD, from 2021, China still lagged behind America on overall R&D spending, dishing out $668bn, compared with $806bn for America at purchasing-power parity. But in terms of spending by universities and government institutions only, China has nudged ahead. In these places America still spends around 50% more on basic research, accounting for costs, but China is splashing the cash on applied research and experimental development (see chart 3).

Money is meticulously directed into strategic areas. In 2006 the CCP published its vision for how science should develop over the next 15 years. Blueprints for science have since been included in the CCP ’s five-year development plans. The current plan, published in 2021, aims to boost research in quantum technologies, AI , semiconductors, neuroscience, genetics and biotechnology, regenerative medicine, and exploration of “frontier areas” like deep space, deep oceans and Earth’s poles.

Creating world-class universities and government institutions has also been a part of China’s scientific development plan. Initiatives like “Project 211”, the “985 programme” and the “China Nine League” gave money to selected labs to develop their research capabilities. Universities paid staff bonuses—estimated at an average of $44,000 each, and up to a whopping $165,000—if they published in high-impact international journals.

Building the workforce has been a priority. Between 2000 and 2019, more than 6m Chinese students left the country to study abroad, according to China’s education ministry. In recent years they have flooded back, bringing their newly acquired skills and knowledge with them. Data from the OECD suggest that, since the late 2000s, more scientists have been returning to the country than leaving. China now employs more researchers than both America and the entire EU .

Many of China’s returning scientists, often referred to as “sea turtles” (a play on the Chinese homonym haigui, meaning “to return from abroad”), have been drawn home by incentives. One such programme launched in 2010, the “Youth Thousand Talents”, offered researchers under 40 one-off bonuses of up to 500,000 yuan (equivalent to roughly $150,000 at purchasing-power parity) and grants of up to 3m yuan to get labs up and running back home. And it worked. A study published in Science last year found that the scheme brought back high-calibre young researchers—they were, on average, in the most productive 15% of their peers (although the real superstar class tended to turn down offers). Within a few years, thanks to access to more resources and academic manpower, these returnees were lead scientists on 2.5 times more papers than equivalent researchers who had remained in America.

As well as pull, there has been a degree of push. Chinese scientists working abroad have been subject to increased suspicion in recent years. In 2018 America launched the China Initiative, a largely unsuccessful attempt to root out Chinese spies from industry and academia. There have also been reports of students being deported because of their association with China’s “military-civilian fusion strategy”. A recent survey of current and former Chinese students studying in America found that the share who had experienced racial abuse or discrimination was rising.

The availability of scientists in China means that, for example in quantum computing, some of the country’s academic labs are more like commercial labs in the West, in terms of scale. “They have research teams of 20, 30, even 40 people working on the same experiments, and they make really good progress,” says Christian Andersen, a quantum researcher at Delft University. In 2023 researchers working in China broke the record for the number of quantum bits, or qubits, entangled inside a quantum computer.

China has also splurged on scientific kit. In 2019, when The Economist last surveyed the state of the country’s scientific research, it already had an enviable inventory of flashy hardware including supercomputers, the world’s largest filled-aperture radio telescope and an underground dark-matter detector. The list has only grown since then. The country is now home to the world’s most sensitive ultra-high-energy cosmic-ray detector (which has recently been used to test aspects of Albert Einstein’s special theory of relativity), the world’s strongest steady-state magnetic field (which can probe the properties of materials) and soon will have one of the world’s most sensitive neutrino detectors (which will be used to work out which type of these fundamental subatomic particles has the highest mass). Europe and America have plenty of cool kit of their own, but China is rapidly adding hardware.

Individual labs in China’s top institutions are also well equipped. Niko McCarty, a journalist and former researcher at the Massachusetts Institute of Technology who was recently given a tour of synthetic biology labs in China, was struck by how, in academic institutions, “the machines are just more impressive and more expansive” than in America. At the Advanced Biofoundry at the Shenzhen Institute of Advanced Technology, which the country hopes will be the centre of China’s answer to Silicon Valley, Mr McCarty described an “amazing building with four floors of robots”. As Chinese universities fill with state-of-the-art equipment and elite researchers, and salaries become increasingly competitive, Western institutions look less appealing to young and ambitious Chinese scientists. “Students in China don’t think about America as some ‘scientific Mecca’ in the same way their advisers might have done,” said Mr McCarty.

Students visit Handan Artificial Intelligence Education Base during the science and technology week in Handan City, north China's Hebei Province.

Take AI, for example. In 2019 just 34% of Chinese students working in the field stayed in the country for graduate school or work. By 2022 that number was 58%, according to data from the AI talent tracker by MacroPolo, an American think-tank (in America the figure for 2022 was around 98%). China now contributes to around 40% of the world’s research papers on AI, compared with around 10% for America and 15% for the EU and Britain combined. One of the most highly cited research papers of all time, demonstrating how deep neural networks could be trained on image recognition, was written by AI researchers working in China, albeit for Microsoft, an American company. “China’s AI research is world-class,” said Zachary Arnold, an AI analyst at the Georgetown Centre for Security and Emerging Technology. “In areas like computer vision and robotics, they have a significant lead in research publications.”

Growth in the quality and quantity of Chinese science looks unlikely to stop anytime soon. Spending on science and technology research is still increasing—the government has announced a 10% increase in funding in 2024. And the country is training an enormous number of young scientists. In 2020 Chinese universities awarded 1.4m engineering degrees, seven times more than America did. China has now educated, at undergraduate level, 2.5 times more of the top-tier AI researchers than America has. And by 2025, Chinese universities are expected to produce nearly twice as many PhD graduates in science and technology as America.

To see further, ascend another floor

Although China is producing more top-tier work, it still produces a vast amount of lower-quality science too. On average, papers from China tend to have lower impact, as measured by citations, than those from America, Britain or the EU. And while the chosen few universities have advanced, mid-level universities have been left behind. China’s second-tier institutions still produce work that is of relatively poor quality compared with their equivalents in Europe or America. “While China has fantastic quality at the top level, it’s on a weak base,” explains Caroline Wagner, professor of science policy at Ohio State University.

When it comes to basic, curiosity-driven research (rather than applied) China is still playing catch-up—the country publishes far fewer papers than America in the two most prestigious science journals, Nature and Science. This may partly explain why China seems to punch below its weight in the discovery of completely new technologies. Basic research is particularly scant within Chinese companies, creating a gap between the scientists making discoveries and the industries that could end up using them. “For more original innovation, that might be a minus,” says Xu Xixiang, chief scientist at LONGi Green Energy Technology, a Chinese solar company.

Incentives to publish papers have created a market for fake scientific publications. A study published earlier this year in the journal Research Ethics featured anonymous interviews with Chinese academics, one of whom said he had “no choice but to commit [research] misconduct” to keep up with pressures to publish and retain his job. “Citation cartels” have emerged, where groups of researchers band together to write low-quality papers that cite each other’s work in an effort to drive up their metrics. In 2020 China’s science agencies announced that such cash-for-publication schemes should end and, in 2021, the country announced a nationwide review of research misconduct. That has led to improvements—the rate at which Chinese researchers cite themselves, for example, is falling, according to research published in 2023. And China’s middle-ranking universities are slowly catching up with their Western equivalents, too.

The areas where America and Europe still hold the lead are, therefore, unlikely to be safe for long. Biological and health sciences rely more heavily on deep subject-specific knowledge and have historically been harder for China to “bring back and accelerate”, says Tim Dafforn, a professor of biotechnology at University of Birmingham and former adviser to Britain’s department for business. But China’s profile is growing in these fields. Although America currently produces roughly four times more highly influential papers in clinical medicine, in many areas China is producing the most papers that cite this core research, a sign of developing interest that presages future expansion. “On the biology side, China is growing remarkably quickly,” says Jonathan Adams, chief scientist at the Institute for Scientific Information at Clarivate. “Its ability to switch focus into a new area is quite remarkable.”

The rise of Chinese science is a double-edged sword for Western governments. China’s science system is inextricably linked with its state and armed forces—many Chinese universities have labs explicitly working on defence and several have been accused of engaging in espionage or cyber-attacks. China has also been accused of intellectual-property theft and increasingly stringent regulations have made it more difficult for international collaborators to take data out of the country; notoriously, in 2019, the country cut off access to American-funded work on coronaviruses at the Wuhan Institute of Virology. There are also cases of Chinese researchers failing to adhere to the ethical standards expected by Western scientists.

Despite the concerns, Chinese collaborations are common for Western researchers. Roughly a third of papers on telecommunications by American authors involve Chinese collaborators. In imaging science, remote sensing, applied chemistry and geological engineering, the figures are between 25% and 30%. In Europe the numbers are lower, around 10%, but still significant. These partnerships are beneficial for both countries. China tends to collaborate more in areas where it is already strong like materials and physics. A preprint study, released last year, found that for AI research, having a co-author from America or China was equally beneficial to authors from the other country, conferring on average 75% more citations.

Several notable successes have come from working together, too. During the covid-19 pandemic a joint venture between Oxford University’s Engineering Department and the Oxford Suzhou Centre for Advanced Research developed a rapid covid test that was used across British airports. In 2015 researchers at University of Cardiff and South China Agricultural University identified a gene that made bacteria resistant to the antibiotic colistin. Following this, China, the biggest consumer of the drug, banned its use in animal feed, and levels of colistin resistance in both animals and humans declined.

In America and Europe, political pressure is limiting collaborations with China. In March, America’s Science and Technology Agreement with China, which states that scientists from both countries can collaborate on topics of mutual benefit, was quietly renewed for a further six months. Although Beijing appears keen to renew the 45-year-old agreement, many Republicans fear that collaboration with China is helping the country achieve its national-security goals. In Europe, with the exception of environmental and climate projects, Chinese universities have been effectively barred from accessing funding through the Horizon programme, a huge European research initiative.

There are also concerns among scientists that China is turning inwards. The country has explicit aims to become self-reliant in many areas of science and technology and also shift away from international publications as a way of measuring research output. Many researchers cannot talk to the press—finding sources in China for this story was challenging. One Chinese plant scientist, who asked to remain anonymous, said that she had to seek permission a year in advance to attend overseas conferences. “It’s contradictory—on the one hand, they set restrictions so that scientists don’t have freedoms like being able to go abroad to communicate with their colleagues. But on the other hand, they don’t want China to fall behind.”

Live until old, learn until old

The overwhelming opinion of scientists in China and the West is that collaboration must continue or, better, increase. And there is room to do more. Though China’s science output has grown dramatically, the share that is conducted with international collaborators has remained stable at around 20%—Western scientists tend to have far more international collaborations. Western researchers could pay more attention to the newest science from China, too. Data from a study published last year in Nature Human Behaviour showed that, for work of equivalent quality, Chinese scientists cite Western papers far more than vice versa. Western scientists rarely visit, work or study in China, depriving them of opportunities to learn from Chinese colleagues in the way Chinese scientists have done so well in the West.

Closing the door to Chinese students and researchers wishing to come to Western labs would also be disastrous for Western science. Chinese researchers form the backbone of many departments in top American and European universities. In 2022 more of the top-tier AI researchers working in America hailed from China than from America. The West’s model of science currently depends on a huge number of students, often from overseas, to carry out most day-to-day research.

There is little to suggest that the Chinese scientific behemoth will not continue growing stronger. China’s ailing economy may eventually force the CCP to slow spending on research, and if the country were to become completely cut off from the Western science community its research would suffer. But neither of these looks imminent. In 2019 we also asked if research could flourish in an authoritarian system. Perhaps over time its limits will become clear. But for now, and at least for the hard sciences, the answer is that it can thrive. “I think it’d be very unwise to call limits on the Chinese miracle,” says Prof Marginson. “Because it has had no limits up until now.” ■

This article appeared in the Science & technology section of the print edition under the headline “Soaring dragons”

From the June 15th 2024 edition

COMMENTS

  1. Formulating Hypotheses for Different Study Designs

    Formulating Hypotheses for Different Study Designs. Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate ...

  2. The Role and Interpretation of Pilot Studies in Clinical Research

    A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis testing study. Investigators should be forthright in stating these objectives of a pilot study.

  3. A tutorial on pilot studies: the what, why and how

    2. Narrowing the focus: Pilot studies for randomized studies. Pilot studies can be conducted in both quantitative and qualitative studies. Adopting a similar approach to Lancaster et al., we focus on quantitative pilot studies - particularly those done prior to full-scale phase III trials. Phase I trials are non-randomized studies designed to investigate the pharmacokinetics of a drug (i.e ...

  4. A tutorial on pilot studies: the what, why and how

    Pilot studies for phase III trials - which are comparative randomized trials designed to provide preliminary evidence on the clinical efficacy of a drug or intervention - are routinely performed in many clinical areas. Also commonly known as "feasibility" or "vanguard" studies, they are designed to assess the safety of treatment or interventions; to assess recruitment potential; to assess the ...

  5. The statistical interpretation of pilot trials: should significance

    In a survey of pilot studies published in 2007-8, Arain et al. found that 81% (21/26) of pilot studies performed hypothesis tests in order to comment on the statistical significance of results. If the primary purpose of a pilot study is to provide preliminary evidence of the efficacy of an intervention, then the significance level can be ...

  6. Pilot Studies in Clinical Research

    Pilot studies are small-scale studies conducted to gather information and provide a foundation for the design of a definitive trial. They do not seek to estimate treatment efficacy or effectiveness themselves but may be used to assess whether a definitive trial is feasible and how it can be carried out. The objectives of a study can be met only ...

  7. The role and interpretation of pilot studies in clinical research

    A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis testing study. Investigators should be forthright in stating these objectives of a pilot study.

  8. Conceptions and Misconceptions, Uses and Misuses of Pilot Studies in

    the pilot study. Pilot studies are booming. Although the number of randomized controlled trials in PubMed peaked in 2015 and has since dropped by about 25%, the number of pilot studies has increased by about 50% in the same period (Fig. 1). Formal pilot studies often represent the first step from the conception of an intervention to its ...

  9. Development of new or worsening headache after cochlear implant

    Development of new or worsening headache after cochlear implant activation: A hypothesis-generating pilot study of incidence, timing, and clinical factors. ... Of note, this pilot study did not classify headache phenotypes according to ICHD-3 (International Classification of Headache Disorders 3rd edition) criteria. This was done purposely to ...

  10. PDF STUDY DESIGN: PILOT STUDIES

    7/9/2013 7 Inadequate literature review Need to generate hypotheses Running out of: Time Money Patients Patience Laziness Bad reasons to do a "Pilot Study" Test integrity & feasibility Recruitment & consent Intervention (e.g. tolerance, compliance, retention) Data collection (e.g. forms, interface, time) Equipment Other procedures (e.g. randomization)

  11. Formulating Hypotheses for Different Study Designs

    naturally occurring event or a proposed outcome of an intervention.1,2 Hypothesis testing requires choosing the most appropriate methodology and adequately powering statistically the study to ...

  12. PDF The role and interpretation of pilot studies in impact evaluation

    A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis testing study.1 However, in-depth pilot studies (also referred to as formative studies) that are

  13. Recommendations for Planning Pilot Studies in Clinical and

    Analyses for pilot studies should mainly rely on estimation (point and interval estimation) and involve only limited hypothesis testing within the scope of the original aims. In a pilot study, the aims should focus on endpoints other than efficacy and safety measurements. For example, they should focus on feasibility.

  14. Full article: Multivariate proteomic analysis of the cerebrospinal

    The results from this hypothesis-generating pilot study have to be confirmed in larger, hypothesis-driven studies with age-matched controls. Nevertheless, the present report indicates a future possibility that a panel of multiple biomarkers will be able to shed light upon the mechanisms involved in neuropathic pain. We think that the systems ...

  15. Multivariate proteomic analysis of the cerebrospinal fluid of patients

    The results from this hypothesis-generating pilot study have to be confirmed in larger, hypothesis-driven studies with age-matched controls, but the present study illustrates the fruitfulness of combining proteomics with multivariate data analysis in hypothesis-generating pain biomarker studies in humans.

  16. The role and interpretation of pilot studies in clinical research

    A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis testing study. Investigators should be forthright in stating these objectives of a pilot study.

  17. PPT

    How can you do hypothesis-generating or pilot studies without funding? • Since reviewers confuse the types of studies, the criteria for evaluating one type of study are often applied to another type, which confuses researchers. • Researchers misrepresent hypothesis-generating as HT, or badly designed HT as "pilot" studies, which ...

  18. The therapeutic effect of bromocriptine in combination with

    The therapeutic effect of bromocriptine in combination with spironolactone in patients with primary aldosteronism: a hypothesis generating pilot study Oncotarget. 2017 Sep 6;8(44):77609-77621. doi: 10.18632/oncotarget.20670. ... Conclusions: In this pilot study, we found that short-term addition of bromocriptine to spironolactone improved the ...

  19. A single-arm, open-label pilot study of neuroimaging ...

    Scientific Reports - A single-arm, open-label pilot study of neuroimaging, behavioral, and peripheral inflammatory correlates of mindfulness-based stress reduction in multiple sclerosis Skip to ...

  20. PDF Recommendations for Planning Pilot Studies in Clinical and

    Analyses for pilot studies should mainly rely on estimation (point and interval estimation) and involve only limited hypothesis testing within the scope of the original aims. In a pilot study, the aims should focus on endpoints other than efficacy and safety measurements. For example, they should focus on feasibility.

  21. Pilot studies: Are they appropriately reported?

    INTRODUCTION. Generation of good quality evidence requires well designed and accurately performed clinical studies. Feasibility of conducting such studies requires an a priori estimate of both time and cost. Pilot studies, which are performed ahead of the main study, help us to narrow down the feasibility of a study by formulating same/similar hypothesis, calculating the sample size required ...

  22. Tofacitinib Hypothesis-generating, Pilot Study for Corticosteroid

    Tofacitinib Hypothesis-generating, Pilot Study for Corticosteroid-Dependent Sarcoidosis. ... This is a 16-week open-label, interventional, proof of concept, hypothesis-generating study. All subjects will receive Tofacitinib 5mg twice daily for 16 weeks. After four weeks on Tofacitinib, the corticosteroid will be tapered per a pre-defined ...

  23. Nutrients

    This exploratory study included relatively few participants; thus, the results are hypothesis-generating. Future studies with larger sample sizes and a longitudinal design are warranted to better understand the role of PA in the evolution of AN and in the response to treatment in adolescents with AN.

  24. Does TomoDirect 3DCRT represent a suitable option for post ...

    The present study showed that TD-3DCRT and TD-IMRT are two feasible and dosimetrically acceptable treatment approach for AWBRT, with an optimal PTV coverage and adequate OARs sparing. ... Does TomoDirect 3DCRT represent a suitable option for post-operative whole breast irradiation? A hypothesis-generating pilot study Radiat Oncol. 2012 ...

  25. NIH launches $30 million pilot to test feasibility of a national

    Initiative aims to improve health outcomes by integrating research in everyday primary care settings. ... is investing approximately $30 million in total over fiscal years 2024 and 2025 to pilot a national primary care research network that integrates clinical research with community-based primary care. ... Patients will be able to contribute ...

  26. The state of AI in early 2024: Gen AI adoption spikes and starts to

    The average organization using gen AI is doing so in two functions, most often in marketing and sales and in product and service development—two functions in which previous research determined that gen AI adoption could generate the most value 3 "The economic potential of generative AI: The next productivity frontier," McKinsey, June 14 ...

  27. Markers of oxidative stress during post-COVID-19 fatigue: a hypothesis

    For this purpose, in the present hypothesis-generating, exploratory pilot study, we investigated markers of oxidative stress and post-COVID-19 fatigue symptoms in hospital employees. We collected psychosocial data and analyzed ROS concentration and oxidative DNA damage in blood cells at two different time points prior to and after psychosomatic ...

  28. China has become a scientific superpower

    In the atrium of a research building at the Chinese Academy of Sciences (CAS) in Beijing is a wall of patents. Around five metres wide and two storeys high, the wall displays 192 certificates ...