Introduction
The growing promise and burgeoning complexity of biomedicine warrant a robust physician-scientist workforce. However, attrition in the physician-scientist path has been a longstanding problem [Reference Garrison and Ley1–Reference Keswani, Moles and Morowitz7]. Sustaining a physician-scientist career not only requires investigative and clinical skills but also versatility in navigating competing time commitments, sustaining innovation and funding, and prioritizing clinical, research, and personal goals [Reference Daye, Patel, Ahn and Nguyen8].
Women face unique difficulties navigating physician-scientist careers compared to men. Women comprise a minority of funded investigators [Reference Levey, Gentile, Jolly, Beaty and Levey9,10], apply for grant funding at a lower rate, and are cited less often [Reference Salata, Geraci and Rockey4]. Women MD-PhDs who attained NIH predoctoral grants are only 37% as likely as their male counterparts to eventually have independent NIH research funding [Reference Ghosh-Choudhary, Carleton, Nouraie, Kliment and Steinman11].
Women express lower confidence in career advancement in medicine [Reference Jones, Griffith, Ubel, Stewart and Jagsi12,Reference Pololi, Civian, Brennan, Dottolo and Krupat13], and in knowledge and performance despite equal clinical knowledge and skills as men [Reference Vajapey, Weber and Samora14]. Women residents and fellows participating in a clinical research training program scored lower than men when self-assessing their ability to conduct clinical research [Reference Bakken, Sheridan and Carnes15]. While systemic factors contribute to the shortfall in physician scientists and disproportionately affect women [Reference Gillen, Markowitz, Long, Villegas-Estrada, Chang and Gupta16–Reference Kwan, Daye and Schmidt21], gender gaps in confidence and self-efficacy could counter the resiliency needed to overcome barriers to success for physician scientists.
The Clinical Research Appraisal Inventory (CRAI) has been used to measure confidence in research skills [Reference Mullikin, Bakken and Betz22,Reference Robinson, Switzer and Cohen23]. The CRAI, however, focuses on the research domains pertinent to conducting clinical research studies. No investigation to date has measured confidence in research, career, and personal domains in a physician-scientist cohort focused on basic science and laboratory-based translational research.
The University of Pittsburgh School of Medicine training portfolio includes the Medical Scientist Training Program (MSTP, MD-PhD) and Physician Scientist Training Program (PSTP) [Reference Steinman, Proulx and Levine24,Reference Shah and Rao25] medical student programs and a Burroughs Wellcome Fund-supported Physician Scientist Incubator Program that trains MD-only residents and fellows in preclinical research. While PSTP is an acronym also used for resident and fellow training programs, our PSTP is specific to medical students as described by Steinman et al. [Reference Steinman, Proulx and Levine24]. We refer herein to participating medical students as “MSTP/PSTP” and to the residents/fellows in the BWF Incubator program as “BWF Fellows.”
We developed a Physician Scientist Confidence questionnaire to measure self-confidence in scientific, personal, and professional competencies at early and later points in the training process in these three programs. Our objective was to evaluate the programmatic impact on trainees’ confidence over time and by gender. A secondary objective was to assess the impact of training on confidence rankings by career level.
Results
Cohort characteristics
There were 102 trainees who completed the survey at both time points, administered through Research Electronic Data Capture (REDCap) [Reference Harris, Taylor and Minor26,Reference Harris, Taylor, Thielke, Payne, Gonzalez and Conde27]. Two individuals were not included in the final analysis: one individual preferred not to identify gender; another individual had logged back into the time 1 survey at time 2. The cohort included 61% female trainees. 82% were enrolled in the MSTP/PSTP program and 18% in the BWF incubator program. Full demographics of the sample are shown in Supplement, Suppl. Table 1. The average time between initial and follow-up survey responses was 1.6 years (see detail in Supplement, Suppl. Methods). All participants included in analyses consented under an expedited protocol approved by the University of Pittsburgh Institutional Review Board. 57% of consented eligible individuals completed both surveys. There was no significant difference in age or gender distribution between those who completed both surveys and are included in this analysis and those who are not included because they answered only one or neither survey, did not respond, or declined consent (see Supplemental Methods, p.S23).
Difference in responses to individual survey questions
Overall, mean scores across all confidence survey items increased at follow-up by a mean of 0.64 (95% CI, 0.25–1.03) on the 11-point scale.
Figure 1 shows average level of confidence by response to each individual item in the survey for the total cohort, men and women. Responses to survey items by training level are shown in Supplement, Suppl. Figure 1. Overall, confidence increased over time. While both men and women rated their level of confidence higher at time 2, this increase was more marked for women. Averaging all item responses, women rated their level of confidence lower than men at time 1 but not at time 2.
For the entire cohort, mean confidence scores increased for 35 of 36 items, with a small decrease (0.153, 2.3% change from initial level) only in confidence in the ability to “Nourish your physical and emotional health.” This decrease was seen in the response of both women and men. Women increased their confidence in response to all other (35/36) items, whereas men rated their confidence higher for 27 and lower for 9 items (see Supplement, Suppl. Figure 2). Between time 1 and time 2, the average increase in women’s confidence scores was 0.56 (95% CI 0.045 to 1.07) greater than the increase in men’s scores.
Grouping of survey competencies and mean scores across subscales
To identify thematic subscales, we conducted an exploratory factor analysis (EFA). The EFA identified five subscales: Career Sustainability, Science Productivity, Grant Management, Goal Setting, and Goal Alignment, shown along with the contributing ranking items in Table 1.
The subscales were compared by training level and gender as summarized in Table 2.
p values were derived from a paired t-test. Participants who completed research confidence skill items included in each subscale at time 1 and at follow-up were included in the analysis.
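As a minimal sketch of the paired comparison described above, the following Python snippet computes a paired t statistic from each participant's time 1 and follow-up subscale means. The scores and the function name `paired_t_statistic` are hypothetical illustrations, not study data or the authors' analysis code; a p value would then be obtained from the t distribution with n − 1 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(time1, time2):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    if len(time1) != len(time2):
        raise ValueError("paired samples must have the same length")
    # Per-participant change (follow-up minus initial).
    diffs = [b - a for a, b in zip(time1, time2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / se, n - 1


# Hypothetical subscale means on the 11-point scale (not study data).
t1 = [4.0, 5.0, 6.0, 5.0, 4.0]
t2 = [5.0, 6.0, 6.0, 7.0, 5.0]
mean_diff, t_stat, dof = paired_t_statistic(t1, t2)
print(f"mean change = {mean_diff:.2f}, t = {t_stat:.2f}, df = {dof}")
```

Only participants with scores at both time points enter the calculation, which mirrors the inclusion rule stated for each subscale.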
MSTP = Medical Scientist Training Program; PSTP = Physician Scientist Training Program (medical student); BWF = Burroughs Wellcome Fund (BWF physician-scientist incubator for residents and fellows).
Notably, the level of confidence increased for every subscale for the full cohort. The subscale with the smallest increase was Goal Alignment, because of a decrease in confidence in skills assigned to this category among men. This was the sole instance of a drop in confidence for a subscale in any trainee group.
Men increased confidence only in the Grant Management subscale. In contrast, women showed an increase in confidence in all five of the thematic subscales. In the supplement, Suppl. Table 2 compares men and women for each subscale at both time points. Initially, women ranked significantly lower in confidence than men in 4 of 5 subscales. At follow-up (time 2), there was no significant difference between men and women in any subscale.
We also examined self-rated confidence by training level. Despite the difference in training level, both BWF Fellows and MSTP/PSTP medical students showed similar levels of confidence in the initial survey (time 1, Table 2). Both groups showed the greatest increase in confidence in skills related to Grant Management, and also significantly increased confidence in Career Sustainability and Science Productivity. Only the BWF Fellows significantly increased confidence in the other two subscales, Goal Setting and Goal Alignment.
The BWF Fellow and MSTP/PSTP groups each had a majority of women respondents (66% and 61%, respectively). To assess whether the increase in confidence among the different cohorts was restricted to the women in the resident/fellow group, we calculated mean scores by career level and gender as shown in the Supplement, Suppl. Table 3.
In the BWF Fellow cohort, both men and women increased their level of confidence in 4/5 subscales (in all but Goal Alignment). In contrast, in the MSTP/PSTP student group, the change in confidence over time only increased significantly among women. Women exhibited a significant increase in every subscale except for Goal Setting.
The increase in confidence by women during the training period remained for certain subscales after adjustment for initial scores in a mixed-effects model. The mixed-effects model showed a differential impact of programming by gender for two of the five subscales, Goal Alignment and Career Sustainability. The model output is shown in Table 3. The increase among females surpassed the increase among males for “Career Sustainability” (time v subscale interaction term = 0.68 [95% CI: 0.03–1.33, p = 0.042]) and for “Goal Alignment” (time v subscale interaction term = 0.96 [95% CI: 0.33–1.59, p = 0.003]). Other subscales did not meet the threshold for significance.
P values are bolded for the interaction term of gender and time (Gender*time). This difference-in-differences estimator is calculated as (male mean score at time 1 − male mean score at time 2) − (female mean score at time 1 − female mean score at time 2). A p < 0.05 indicates a significant interaction term of gender and time.
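The arithmetic of the difference-in-differences estimator defined above can be sketched in a few lines of Python. The group means below are hypothetical placeholders, not values from Table 3, and `diff_in_diff` is an illustrative name. The study's estimates come from a mixed-effects model; this sketch shows only the estimator applied to raw group means.

```python
def diff_in_diff(male_t1, male_t2, female_t1, female_t2):
    """(Male time 1 - male time 2) - (female time 1 - female time 2)."""
    return (male_t1 - male_t2) - (female_t1 - female_t2)


# Hypothetical mean subscale scores (not study data).
estimate = diff_in_diff(male_t1=6.0, male_t2=6.5, female_t1=5.0, female_t2=6.5)
# A positive value means the women's increase exceeded the men's increase.
print(estimate)
```

With these placeholder means, men rise by 0.5 and women by 1.5, so the estimator returns the 1.0-point difference between the two changes.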
We also analyzed the effect of the training period by career level in a mixed-effect model. Training had a differential effect by career level across all subscales. This was demonstrated by a significant interaction term between career level and time for each subscale as shown in Table 4. For all subscales, BWF Fellows showed a greater increase in mean scores compared to MSTP/PSTP medical students.
MSTP = Medical Scientist Training Program; PSTP = Physician Scientist Training Program (medical student); BWF = Burroughs Wellcome Fund (BWF physician-scientist incubator for residents and fellows).
P values are bolded for the interaction term of career level and time (Career level*time). This difference-in-differences estimator is calculated as (BWF mean score at time 1 − BWF mean score at time 2) − (MSTP/PSTP mean score at time 1 − MSTP/PSTP mean score at time 2); p < 0.05 is considered significant.
Lack of change in motivation, satisfaction, or grit
The surveys of self-rated confidence were conducted concurrently with measurement of motivation [Reference Robinson, Switzer and Cohen28], burnout [Reference Dolan, Mohr and Lempa29], satisfaction [Reference Diener, Emmons, Larsen and Griffin30], and grit [Reference Duckworth and Quinn31]. We explored whether ratings of these measures changed during training. However, no significant changes in motivation, satisfaction, or grit were seen in the full cohort (Supplement Suppl. Table 4, Suppl. Figures 3, 4). Burnout scores increased modestly in the cohort from 1.94 to 2.12 (p = 0.03, 95% CI 0.015–0.342). Overall, a relationship between these factors and the observed increase in self-confidence among women was not evident and was not pursued further.
Curricular element ranking by participants
The medical student PSTP program [Reference Steinman, Proulx and Levine24] is a 5-year MD program that includes 16 months of basic/translational laboratory research in addition to six required PSTP enrichment courses beyond the medical school curriculum; the MSTP MD-PhD program has nine required MSTP enrichment courses (four co-enrolled by PSTPs) beyond those of medical and graduate school; the BWF Incubator Fellows engage in 2 years of laboratory work concurrent with weekly professional and/or scientific development classes. All three programs share the same director (R.A.S.), who instructs the majority of classes. Common training components of all programs include courses or classes on grant writing, whiteboard work-in-progress presentations, directed interviews with near-peer role models, mock study sections, and a variety of classes on professional development topics. All of the programs include career advisors or development committee meetings and 4–6 individual sessions with professional career coaches.
Respondents were asked what curricular features contributed to each subscale by ranking the top 3 out of a list of courses/classes/activities that they felt contributed to each of the five thematic competency subscales (Supplement Suppl. Table 5A). A brief description of each subscale accompanied the list; additionally, text fields were available for comments. Sixty-nine participants (69%) responded to the curriculum survey (8 BWF Fellows, 23 PSTP and 38 MSTP). The top curricular items that were identified in common by all three cohorts for each subscale are shown in Supplement Suppl. Tables 5B, 5C. Professional development classes were linked by all to Career Sustainability, whiteboard talks and rigor sessions to Science Productivity, and grantwriting classes to Grant Management. The 1-on-1 sessions with professional coaches were noted by all cohorts as a top factor in building skills in Goal Setting and Goal Alignment, consistent with a recent report on coaching for residency transitions [Reference Winkel, Chang, McGlone, Gillespie and Triola32].
Other factors that could impact trainee confidence
The BWF cohort comprises residents and fellows and is older (mean 31.6; median 30.5 years old) than the MSTP/PSTP cohort (mean 25.4; median 25.0 years old). Conceivably, being older could position the BWF cohort to benefit more from program elements. However, there was only weak correlation between age and changes in the level of confidence over time for the entire cohort (r² = 0.11, linear regression), men (r² = 0.14), and women (r² = 0.13). Moreover, the BWF cohort did not differ significantly from the medical students (p = 0.27) in their ranking of confidence at baseline, despite their age difference.
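For readers who wish to run a similar check on their own program data, the coefficient of determination from a simple linear regression of confidence change on age can be sketched as follows. The ages and changes are hypothetical, and `r_squared` is an illustrative helper rather than the authors' code; for a single predictor, r² equals the squared Pearson correlation.

```python
import statistics


def r_squared(x, y):
    """Squared Pearson correlation, equal to r^2 for simple linear regression."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical ages and changes in mean confidence (not study data).
ages = [24, 25, 26, 30, 31, 33]
changes = [0.5, 1.0, 0.2, 0.9, 0.4, 1.1]
print(f"r^2 = {r_squared(ages, changes):.2f}")
```

A value near 0 indicates that age explains little of the variation in confidence change, which is the sense in which the correlations reported above are weak.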
Mentoring can have a large impact on confidence in physician-scientist skills. All participants were asked, “To what extent do you feel your primary research mentor is meeting your expectations?” From participants as a whole as well as those at each training level and for each gender, the mentors received a median rank of 4.0 (“exceeds expectations”) on a 5-point scale. There was no significant difference between training levels or genders at either time point in participant ranking of mentors.
Discussion
The objective of this study was to evaluate the impact of our laboratory-linked physician-scientist training programs on trainees’ level of confidence in professional, personal, and scientific competencies over time, by gender and by career level. We observed a significant gender gap in confidence at the initial assessment with females expressing lower confidence in all areas queried. That finding is consistent with reports that women in medicine and science have lower perceived self-efficacy than men [Reference Jones, Griffith, Ubel, Stewart and Jagsi12–Reference Vajapey, Weber and Samora14,Reference Epstein and Fischer33]. The onset of this gap in academic confidence is quite early and present in high school if not earlier [Reference Lips34]. This study was the first to explore this gender gap in confidence specifically in pre- and post-doctoral physician-scientist trainees engaged in preclinical research training.
Increase in women trainees’ confidence
It is striking that the women in this study, whether medical students or residents/fellows, reported an increase in their level of confidence during training. There have been few studies assessing changes in confidence among women in academia. Bakken demonstrated that women training in clinical research ranked their ability in six clinical investigation competencies lower than men; interestingly, men’s confidence increased more than women’s [Reference Bakken, Sheridan and Carnes15] following a skill-building workshop.
Several studies have measured confidence in performance among medical students [Reference Klassen and Klassen35] and medical postgraduates [Reference Vajapey, Weber and Samora14]. There was no gender difference among Lerner College of Medicine students in their clinical research confidence (using the CRAI survey) at matriculation or at graduation [Reference Bierer, Prayson and Dannefer36]. Versions of the CRAI have also been used to measure changes in self-efficacy following clinical research training or for medical students doing Scholarly Projects; while increases were noted, those studies did not analyze effects by gender [Reference Lipira, Jeffe and Krauss37,Reference DiBiase, Beach and Carrese38]. The CRAI instrument analyzes confidence in research activities related to design, reporting, conceptualizing, planning, funding, and protecting subjects in studies. The literature indicates that the challenges negotiated by physician scientists extend beyond those activities.
Our instrument was designed specifically for physician scientists in training and structured to encompass not only performance-related domains but also questions related to personal and professional persistence, goal setting, and goal alignment. While several of the items in the Goal Alignment and Goal Setting subscales are important in personal (as well as academic) settings, this study did not comprehensively explore the range of factors involved in the personal agency of physician scientists.
The magnitude of significant changes in confidence rating for subscales ranged from 0.3 to 1.0 overall, from 0.4 to 0.7 among the medical students, and from 0.9 to 2.4 among the resident/fellow cohort. The magnitude of these changes in confidence is comparable with other assessments of changes in efficacy or confidence in college students, STEM trainees, or medical students [Reference DiBiase, Beach and Carrese38–Reference Betz and Schifano40]. Ultimately, the significance of our findings will require correlation of self-ranked confidence with career persistence and success.
In our study, men rated their confidence levels higher than women initially. One could posit that men’s higher initial confidence ranking indicates that men are subject to the Dunning-Kruger effect [Reference Kruger and Dunning41] and relatively unaware of their shortcomings. However, the moderate range of men’s initial rankings (from 4.2 to 6.5 out of a maximum score of 10) suggests that Dunning-Kruger overconfidence was not a major factor.
Confidence scores differed significantly between men and women initially, with women rating themselves lower than men at time 1 but not at follow-up. To compare the change over time in confidence as a function of gender, we used a mixed model correcting for gender differences at the initial assessment. The differential effects of programming by gender were significant for the subscales Career Sustainability and Goal Alignment after correction for initial scores. Given the evidence that fewer women persist in physician-scientist careers [Reference Levey, Gentile, Jolly, Beaty and Levey9,Reference Akabas and Brass42], it is promising that women in our cohort increased their confidence in these subscales linked to persistence.
Greater increase in confidence ranking at the resident/fellow level
A secondary objective was to assess differences in self-confidence in professional, personal, and scientific competencies over time by career level. Despite having similar scores initially, BWF Fellows showed greater increases in confidence across all subscales than MSTP/PSTP students. This could indicate that physician-scientist training programs are most effective during residency/fellowship or may be a function of the MD-only BWF Fellow cohort (56% in surgery or surgical specialties) or of our BWF Incubator program curriculum. While similar research and professional competencies were taught in the pre- and postgraduate programs, the context and case studies were tailored to training stage.
Ratings of motivation, grit, satisfaction, burnout
In addition to our confidence rating questions, we surveyed participants with validated scales for motivation [Reference Robinson, Switzer and Cohen28], burnout [Reference Dolan, Mohr and Lempa29], satisfaction [Reference Diener, Emmons, Larsen and Griffin30] and grit [Reference Duckworth and Quinn31]. Only burnout scores increased between initial assessment and follow-up, increasing (0.18 on a 5-point scale) in the full cohort and in men but not women. Whether this contributed to the more modest increase in confidence in men compared with women is unclear.
Women scored higher on the grit scale than men both at initial assessment and at follow-up, without a significant change between timepoints. Higher levels of grit may characterize the population of women choosing this long and challenging career path. It is interesting that at time 1, women ranked 9.5% higher than men in grit and yet rated their confidence lower. The linkage between grit and self-efficacy is complex [Reference Neroni, Meijs, Kirschner, Xu and de Groot43], and a career development model proposing interrelatedness of grit and confidence may be insufficient. It remains to be seen whether the confidence scale we employed is a more robust measure of career persistence and progress than the other measures that were static over the course of the study.
Perceptions of subscale-related curricular elements
We conducted a survey in which we asked our cohort to rank which elements of the curriculum they perceived as important in building their confidence in the subscale domains identified in this study. Although this method is purely descriptive, we believe it offers insight into where to enhance our training programs. Curricular elements identified as building confidence in the surveyed competencies included grantwriting classes, rigor discussions, physician-scientist talks, role model and near-peer interviews, coaching, and whiteboard talks with peers. Our findings reinforce the value of coaching [Reference Deiorio, Carney, Kahl, Bonura and Juve44,Reference Sabatine and Wendell45] and role models [Reference Bakken46].
Limitations
Our evaluation was conducted during the COVID-19 pandemic. All classes were virtual between spring 2020 and fall 2021 due to COVID-19 restrictions. The pandemic stressed academia, with higher academic costs for women [Reference Ellinas, Ark, Kaljo, Quinn, Krier and Farkas47,Reference Weinreich, Kotini-Shah and Man48]. It is interesting that in our cohort, women’s reported confidence increased despite the pandemic. While a full accounting is beyond the scope of this paper, in a separate survey the trainees were asked if they strongly disagreed (1) or strongly agreed (5) on a 5-point Likert scale with the statement: Changes to my home life due to the COVID-19 pandemic have greatly impacted my ability to work. The response of the full cohort was 3.0 (neutral) at both study timepoints. Neither men (p = 0.19, difference 0.39, 95% CI −0.21 to 0.98) nor women (p = 0.26, difference −0.18, 95% CI −0.51 to 0.14) reported a significant change in the impact of COVID-19 on their work between the T1 and T2 timepoints. While not significant within gender groups, the slight decrease in women’s ranking of the burden of COVID-19 over time was significant (p = 0.048) in comparison to the difference over time in men’s ranking of the COVID-19 question. Although we did not detect a higher impact of COVID-19 on women as reported elsewhere, it is unclear whether that finding will be generalizable to the post-pandemic era.
Given that this is a single-institution study, the generalizability of this survey tool to other training programs and settings remains to be determined. The survey of perceived confidence in professional, personal, and scientific competencies that we used has not been rigorously validated. Additionally, the EFA and outcome analyses were conducted with the same cohort, so the subscales derived may or may not generalize. We studied three physician-scientist training programs, primarily focused on preclinical research. Although some trainees engaged in both preclinical and clinical research, it is unclear if similar outcomes apply to programs limited to clinical research.
Conclusions
During our pre- and postgraduate physician-scientist training programs, confidence in scientific, professional, and personal skills increased significantly in postgraduate trainees and at all training levels among women. This positive trend in women’s confidence during training may contribute to reducing gender gaps in persistence in academic medicine. Our findings aim to assist physician-scientist training program leaders as they evaluate their trainees and develop their curriculum.
Methods
The 36 Likert-type survey items measuring self-rated confidence included 5 items from CRAI-12 [Reference Robinson, Switzer and Cohen23]. Additional items were developed based on literature on barriers/facilitators identified by physician scientists and on the results of a programmatic needs assessment that we had previously conducted with 143 resident/fellow trainees equally divided between academic educational, clinical, or basic/translational research tracks at our institution. We retained the 11-point rating scale used in the CRAI. The final 36 items were assessed for face validity during cognitive interviews with MD-PhD alumni. Details on survey administration, exploratory factor analysis, design of mixed-effects modeling, and the curricular survey are presented in the Supplement, Supplemental Methods.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/cts.2024.556.
Acknowledgments
We acknowledge the help of Bowa Lee MD for input on BWF Scholar experience. We acknowledge support from the Clinical and Translational Science Institute at the University of Pittsburgh (UL1-TR-001857) for the use of REDCap.
Author contributions
TK and CNP share the first author position. TK conducted data analysis, prepared tables, conducted mixed methods analyses, co-wrote the manuscript, and edited and approved the final manuscript; CNP co-conceived the study, conducted exploratory factor analysis, oversaw study conduct, prepared materials for the IRB, conducted preliminary analysis, and edited and approved the final manuscript; SMN oversaw statistical analyses and edited and approved the final manuscript; ASM and RJR reviewed and summarized relevant literature, generated the graphical abstract, and edited and approved the final manuscript; RAS co-conceived the study, reviewed and conducted data analysis, drafted the manuscript, and edited and approved the final manuscript.
Funding statement
The University of Pittsburgh holds a Physician-Scientist Institutional Award from the Burroughs Wellcome Fund that supported this study.
Competing interests
The authors have declared that no conflict of interest exists.