Using a hybrid design to analyze effectiveness and implementation of a refined energy-balance education module for upper elementary physical education

Informed by a previous experiment, this study examined the effectiveness and implementation of a refined energy-balance (EB) education module for upper elementary physical education (PE). The refined module was distributed through a participatory research network to six schools in Iowa (United States) that participated in the study, and the evaluation followed the type I effectiveness-implementation hybrid design. The PE teachers completed standardized log sheets after delivering each lesson. Students in fourth and fifth grades (n = 570 at baseline and n = 586 at post-test) completed a written test before and after the experiment to assess EB knowledge. The overall completion rate of the log sheets was low and varied across lessons, but the process data meaningfully informed the implementation. The EB knowledge gain over time was modest but statistically significant in both boys and girls in fourth and fifth grades. The experienced schools showed greater knowledge gains than the naïve schools that had not previously been exposed to the EB module. The findings support the potential of the refined EB education module for broader dissemination in elementary schools. Students from all six schools, regardless of grade and gender, showed the same increasing pattern in EB knowledge, although the more experienced schools showed greater gains than the naïve schools. These findings support the effectiveness of the refined EB education module when distributed through the Iowa FitnessGram Initiative, which provided a useful structure to distribute, facilitate, and evaluate the use of the modules under real-world conditions.


Introduction
Teaching children the basic principles of energy balance (EB) is crucial for them to learn how to establish and maintain a healthy body weight over time (Chen & Nam, 2017; Manore et al., 2014; Pasco & Ennis, 2015). In the United States (U.S.), physical education (PE) is an important setting through which students can gain EB knowledge and establish lifestyle behaviors for long-term weight control (Bassett et al., 2013). Traditionally, EB concepts, principles, and strategies are rarely taught in the gymnasium during PE, although they do appear in health education textbooks at the secondary school level (e.g., late middle school or early high school years) (Friedman, Stine, & Whalen, 2009). Because PE and health education are offered as two separate subjects, and are sometimes taught by two different teachers, it is difficult for students to connect their learning experiences across classrooms, which limits their potential to construct a deeper understanding of the content , 2014. Formal health education in U.S. public schools usually does not start until the middle school years, but it would be more effective if youth already had foundational knowledge and skills related to healthy living.
eJRIEPS Hors Série 3 Décembre 2019
Formalized EB education in upper elementary grades is particularly important to counter the high prevalence of childhood obesity and the declines in physical activity that tend to occur through adolescence. Previous research has shown that EB education in upper elementary school PE programming is both feasible and effective (Chen, Zhu, Androzzi, & Nam, 2018).
However, the previous study only demonstrated the educational module's efficacy or internal validity under idealized and relatively controlled circumstances. Thus, additional research is needed to test effectiveness in broader and less controlled settings.
The dissemination and implementation of evidence-based curricula is critical to strengthening the quality of educational programming in PE (Giblin, Collins, MacNamara, & Kiely, 2014).
Some exemplary curricula, such as the Science, PE, and Me curriculum and Fitness for Life, have demonstrated success in enhancing student learning and have been adopted by PE teachers for implementation in numerous schools (Chen, Martin, Sun, & Ennis, 2007; Sun, Chen, Zhu, & Ennis, 2012; Thompson & Hannon, 2012). However, many curriculum intervention trials stop at the end of the efficacy-testing phase. A curriculum with established efficacy (using an exploratory trial) may have shown adequate internal validity to produce the expected educational outcomes under ideal and controlled circumstances; however, it cannot be assumed to work when applied in less structured or uncontrolled settings (Godwin et al., 2003; Halbert & MacPhail, 2010). The under-emphasis on the external validity of interventions is not unique to PE pedagogy research. It is also a pressing concern in other fields such as public health and medicine, in which only a limited number of interventions have been studied for effectiveness (pragmatic trial) and sustainability (Colditz, 2012; Green & Nasser, 2012). A recent review found that only a small percentage of health interventions have systematically examined effectiveness, adoption, implementation, and maintenance, which has slowed their translation into dissemination for large practical impact (Green & Nasser, 2012). Additional research examining the dissemination and implementation of interventions is critical to promote the translation of science into practice (Green & Nasser, 2012; Welk, 2017).

The effectiveness-implementation hybrid design
One solution to the existing concerns about the external validity of interventions is to study effectiveness and implementation simultaneously using a hybrid design (Curran, Bauer, Mittman, Pyne, & Stetler, 2012). Concerned about the slow translation of health promotion science, Geoffrey Curran and colleagues critiqued the issues inherent in traditional step-wise clinical trials (from efficacy and effectiveness to dissemination) and subsequently proposed the effectiveness-implementation hybrid design (Brown et al., 2017; Curran et al., 2012). The hybrid design is positioned to expedite the process needed for science-informed or evidence-based practice, and three types of hybrid designs were proposed (Curran et al., 2012). The type I hybrid design focuses on testing the effects of a health promotion intervention on outcomes (primary purpose) while observing and gathering information on implementation (secondary purpose) (Curran et al., 2012; Wolfenden, Williams, Wiggers, Nathan, & Yoong, 2016). The type II hybrid design places equal emphasis on determining intervention effectiveness and the implementation strategy, while the type III hybrid design primarily focuses on determining the utility of an implementation strategy but also secondarily addresses intervention outcomes (Curran et al., 2012). Following the type I hybrid design, we distributed and evaluated the refined EB education module through a participatory research network (PRN) called the Iowa FitnessGram Initiative (http://www.iowafitnessgram.org).

Description of the Iowa FitnessGram Initiative
The Iowa FitnessGram Initiative (http://www.iowafitnessgram.org) is a PRN of K-12 schools/teachers and university researchers committed to promoting youth physical fitness and physical activity in schools. As a PRN, it is designed to provide mutual benefits through participation, training, communication, implementation, and evaluation. The advantage to schools is that the network provides resources and training aimed at enhancing the assessment and promotion of physical activity and physical fitness (e.g., FitnessGram assessment in-service training; provision of programming). The advantage to the research group is the ability to deploy training and programs to learn how they work under real-world conditions. The prominent NFL PLAY 60 FitnessGram Partnership project was established as a distributed PRN (Welk, Bai, Saint-Maurice, Allums-Featherston, & Candelaria, 2016) and provided a model for the more localized Iowa FitnessGram Initiative. The goal was to promote bilateral communication between researchers and practitioners for sustained collaboration, leading to a harmonious integration of science and practice (Welk, 2017). It proved to be a particularly useful setting to distribute the refined EB education module for evaluation in this study.

Description of the EB Education Module
The EB Education module was designed and validated by a panel of elementary PE teachers, teacher educators, and researchers to teach students EB concepts, principles, and strategies through movement tasks in upper elementary school grades (Chen et al., 2018). The lessons are aligned with the U.S. national PE standards (Society for Health and Physical Educators, 2014) and use developmentally appropriate tasks to teach boys and girls in fourth and fifth grades relevant content in the psychomotor, cognitive, and affective domains. Each lesson is centered on a focal daily key question guiding students' learning of EB concepts. For example, one lesson that addresses physical activity intensity has a focal question stated as: How do you know if your heart is working hard enough to improve fitness?
(Answer: Take your pulse to see if you are in the Target Heart Rate Zone.) In this lesson, students learn about maximal heart rate (220 − age) and the Target Heart Rate Zone (50-85% of maximal heart rate) through movement stations categorized as light, moderate, or vigorous intensity physical activities. The stations allow them to learn how to quickly and accurately take their pulse (carotid and radial) and how to monitor and regulate exercise intensity.
At the end of the lesson, they are expected to name and differentiate activities of varying intensity categories. The optional incorporation of pedometers also enables students to better understand exercise intensity (steps per minute) and its link to calories. This is just one example, but all lessons follow a lesson plan script that unfolds in three sequential parts: a 5-minute teacher-selected warm-up, a 20-minute "think as they move" main activity, and a 5-minute closure. In each lesson, the students are instructed to complete a task sheet to make connections between the physical and cognitive engagements within the lesson.
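The heart-rate arithmetic in the example lesson can be illustrated with a minimal Python sketch. The maximal heart rate rule (220 minus age) and the 50-85% zone come from the lesson description above; the function names, the 15-second pulse-counting window, and the example age are assumptions introduced here for illustration only and are not part of the module.

```python
"""Illustrative sketch of the heart-rate arithmetic in the exercise-intensity lesson."""


def max_heart_rate(age_years: int) -> int:
    """Age-predicted maximal heart rate in beats per minute (220 minus age)."""
    return 220 - age_years


def target_heart_rate_zone(age_years: int, lower_pct: int = 50,
                           upper_pct: int = 85) -> tuple[float, float]:
    """Return the (low, high) bounds of the Target Heart Rate Zone in beats per minute."""
    hr_max = max_heart_rate(age_years)
    return hr_max * lower_pct / 100, hr_max * upper_pct / 100


def beats_per_minute(pulse_count: int, seconds_counted: int = 15) -> float:
    """Convert a pulse count taken over a short window into beats per minute.

    The 15-second default window is an assumption, not taken from the module.
    """
    return pulse_count * 60 / seconds_counted


def in_target_zone(bpm: float, age_years: int) -> bool:
    """Check whether a measured heart rate falls inside the 50-85% zone."""
    low, high = target_heart_rate_zone(age_years)
    return low <= bpm <= high


if __name__ == "__main__":
    age = 10  # hypothetical upper elementary student
    low, high = target_heart_rate_zone(age)
    bpm = beats_per_minute(pulse_count=30)
    print(f"Zone: {low} to {high} bpm; measured: {bpm} bpm; "
          f"in zone: {in_target_zone(bpm, age)}")
```

For a hypothetical 10-year-old, the rule gives a maximal heart rate of 210 beats per minute and a zone of 105 to 178.5 beats per minute, so a pulse count of 30 in 15 seconds (120 beats per minute) would fall inside the zone.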
Our previous experiment established the efficacy of the EB education module in four elementary schools (randomized to experimental or control schools) (Chen et al., 2018).
The PE teachers in the experimental schools were instructed to use the 11 lesson plans, while the PE teachers in the control schools were asked to follow their regular curriculum. That study demonstrated that both boys and girls in fourth and fifth grades who received the EB education significantly increased their EB knowledge (by ~15% as measured by a valid knowledge test) while also maintaining a level of in-class physical activity (measured by Actigraph GT3X+ accelerometers) similar to that of the students in the control condition (Chen et al., 2018). During the implementation, the PE teachers closely adhered (~90%) to the scripted PE lesson plans (as assessed by field observation checklists), and they also provided useful feedback for future improvement and modifications (as assessed by teacher interviews) (Chen et al., 2018).
The present study used the refined versions of these 11 lessons along with 5 new ones.
Four of the schools in the evaluation were involved in the previous project (called experienced schools) and two were new schools (called naïve schools). This follow-up study was essential to determine whether the lessons could be implemented under less controlled settings with more distributed training methods.

Research purpose
This study focused on testing the effectiveness and implementation of the refined EB education module when distributed through a PRN, the Iowa FitnessGram Initiative. Guided by the effectiveness-implementation hybrid design (Curran et al., 2012), the primary purpose of this study (effectiveness testing) was to examine the difference in EB knowledge between baseline and post-implementation of the refined module, as well as differences in knowledge gains between experienced schools and naïve schools, by grade (i.e., fourth grade and fifth grade), and by gender (i.e., boys and girls). The secondary purpose of the study (implementation testing) was to capture the process of implementing the module through process evaluation.

Setting and participants
Six elementary schools recruited from the state of Iowa (U.S.A.) participated in this study. Table 1 shows the sociodemographic characteristics of these schools. Four of the six schools had previous exposure to the curriculum (i.e., returning schools) and were in the same small town, while the other two (i.e., new schools) were in remote rural areas. The small-town schools were about twice as large as the rural schools. The schools showed varying percentages of free or reduced lunch eligibility, ranging from 22.1% to 58.7% (M/SD = 36.8%/12.2%).
The free or reduced lunch eligibility is a frequently used measure by education researchers to capture school-level socio-economic status in the U.S. public school system (Sirin, 2005).
The sample involved 570 and 586 fourth and fifth grade students at baseline and post-test, respectively. A majority of the students enrolled in these schools were Caucasian/White, which is typical of the ethnic composition of the state. The study was approved by the Iowa State University Institutional Review Board (IRB) and the participating school districts. The approved IRB protocol waived the collection of parental consent and assent because the study was rated as minimal risk and only de-identified data were collected.

Research design and procedures
This study employed a one-group pretest-posttest research design (Thomas, Nelson, & Silverman, 2015). Unlike in the previous efficacy study (Chen et al., 2018), the six PE teachers (one per school) in this study received minimal training because the focus was to examine the natural adoption and implementation of the refined EB education module. Two weeks prior to the start of implementation (February 2016), the schools received the teacher's manual and ancillary materials (e.g., pedometers, task sheets, PE posters). One week before starting, students in fourth and fifth grades completed the baseline assessments.
Then, the implementation started and lasted for 12 consecutive weeks (excluding the spring break week in March). In the course of implementation, the research team conducted three checkpoint sessions with the school implementation teams to monitor the process and provide guidance to schools. The checkpoints (~20 minutes per session) involved telephone conversations between the researchers and the school implementation staff at approximately weeks 4, 8, and 12. In addition, the PE teachers received weekly e-Newsletters by email that included teaching tips and resources to facilitate implementation. The PE teachers completed standardized log sheets to document the process of implementation. Lastly, the students completed the same assessments at the end of the implementation in May.

The Refined EB Education Module and its distribution
The refined EB education module consisted of 16 scripted lessons after five new lessons were added to the original 11-lesson version. The new lessons added content on sedentary behavior and screen time mitigation, in addition to physical activity and healthy-eating promotion (i.e., consuming more fruit and vegetables). Table 2 below shows the scope and sequence of the refined module. In the present study, PE teachers were advised to teach one PE lesson per week and were afforded autonomy to choose 12 of the 16 lessons for implementation.
Furthermore, the PE teachers were encouraged to make modifications to the lessons at their discretion to fit local characteristics and meet school needs, and to document these modifications on the log sheets. The rationale for these three arrangements was to empower the PE teachers with choices and autonomy, resulting in more motivated adoption and implementation of the refined module.

Data collection and reduction
We collected the completed log sheets from the schools, and a trained data analyst then entered the data into a computer for processing. We distributed the EB knowledge test to the schools as a Qualtrics survey (www.Qualtrics.com), and students responded in their media centers with technical assistance from their PE teachers. During testing, we instructed the students to try their best to answer the questions and told them that their responses would not affect their grades or school standing. After data collection, we scored the tests in accordance with the answer key (1 = correct; 0 = incorrect). EB knowledge performance was determined by computing the percentage of correct answers (i.e., the number of correctly answered questions divided by 28 questions).
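The scoring rule described above (1 = correct, 0 = incorrect, then percentage correct out of 28 questions) can be sketched in a few lines of Python. This is an illustration only: the function names, the answer key, the sample responses, and the choice to score an unanswered item as 0 are hypothetical and do not reproduce the study's actual test or data handling.

```python
NUM_ITEMS = 28  # number of questions on the EB knowledge test


def score_items(responses, answer_key):
    """Score each response against the key: 1 = correct, 0 = incorrect.

    An unanswered item (None) is scored 0 here; this is an assumption.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    return [1 if given == correct else 0
            for given, correct in zip(responses, answer_key)]


def percent_correct(item_scores):
    """Percentage of correctly answered questions."""
    return 100 * sum(item_scores) / len(item_scores)


if __name__ == "__main__":
    # Hypothetical answer key and one hypothetical student's responses.
    answer_key = ["A", "B", "C", "D"] * 7
    assert len(answer_key) == NUM_ITEMS
    responses = [None] * 7 + answer_key[7:]  # first 7 items left unanswered
    item_scores = score_items(responses, answer_key)
    print(f"{sum(item_scores)} of {NUM_ITEMS} correct "
          f"({percent_correct(item_scores)}%)")
```

Keeping the item-level 0/1 scores, rather than only the total, would also allow percentages to be computed separately for subsets of items, such as the Do, View, and Chew knowledge areas.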

Data analysis
We analyzed the teachers' log data using descriptive analyses of the individual items (i.e., mean, standard deviation, and percentage) to quantify the implementation of the EB education module.
We next analyzed the differences in EB knowledge by time (pretest vs. posttest), group (four experienced schools vs. two naïve schools), grade (fourth vs. fifth grade), and gender (boys vs. girls) using analysis of variance (ANOVA). To further depict the change patterns in the overall EB knowledge scores, we conducted a cluster analysis to classify scores into three knowledge levels (i.e., high, moderate, and low), followed by ANOVA to verify the group classification. Based on the classifications, we evaluated the migration of scores across knowledge level categories between pretest and posttest (e.g., fewer low knowledge students and more high knowledge students would suggest knowledge gain) using descriptive statistics and Pearson's chi-square analysis. We also compared knowledge gain scores in the three behavior areas related to EB (i.e., knowledge of Do, knowledge of View, and knowledge of Chew). Cohen's d was computed as the effect size for the mean difference comparisons (small = 0.20, medium = 0.50, large = 0.80).
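Two of the statistics named above, Cohen's d and Pearson's chi-square, can be illustrated with a short standard-library sketch. It rests on assumptions: the text does not state which variant of Cohen's d was used, so the pooled-standard-deviation form for two independent groups is shown; the "negligible" label for values below 0.20 is added here; and all scores and counts are hypothetical rather than study data.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / math.sqrt(pooled_var)


def label_effect(d):
    """Apply the small/medium/large cut-offs (0.20, 0.50, 0.80) to |d|."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # label added for this sketch


def chi_square_statistic(table):
    """Pearson's chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


if __name__ == "__main__":
    # Hypothetical percent-correct scores at pretest and posttest.
    pretest, posttest = [40, 50, 60], [50, 60, 70]
    d = cohens_d(pretest, posttest)
    print(f"d = {d:.2f} ({label_effect(d)})")

    # Hypothetical counts of students per knowledge level (low, moderate, high);
    # rows are pretest and posttest.
    migration = [[30, 40, 30], [10, 40, 50]]
    chi2, dof = chi_square_statistic(migration)
    print(f"chi-square = {chi2:.2f}, df = {dof}")
```

With a 2 × 3 table (time by knowledge level) there are 2 degrees of freedom, so a statistic above the 0.05 critical value of 5.99 would indicate that the distribution of students across knowledge levels differs between pretest and posttest.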

PE teachers' log sheets
Although we encouraged the PE teachers (one PE teacher per school) to complete the log sheets as soon as they finished teaching each lesson, we did not make this a requirement of the study. Only three PE teachers (out of six) from three different […] They also perceived that the students enjoyed the PE lessons (32/47, 89.3%) and showed a good level of engagement during the lessons (45/47, 95.8%).
To determine knowledge gain in the three behavior domains related to EB, we computed the mean differences in knowledge of Do (physical activity), knowledge of View (sedentary behavior and screen time), and knowledge of Chew (dietary behavior) between pre-test and post-test. Figure 1 illustrates these changes. Knowledge gains were observed in all three behavior domains (knowledge of Do: 12.57%, knowledge of View: 6.06%, knowledge of Chew: 5.08%), meaning that the students enhanced their knowledge in all three areas.

Discussion
The results from this study support the effectiveness and implementation of the refined EB education module.

Figure 1. Energy-Balance Knowledge in Behavior Areas (Do, View, and Chew) and Overall across Time (Pre- and Post-Test)
The students from the experienced schools showed greater gains than those from the naïve schools.
The main goal of the refined EB education module was to promote knowledge about EB principles through active tasks during PE classes. The lessons afforded students the opportunity to engage in carefully designed learning tasks (i.e., "think as they move") in order to acquire an enhanced understanding of EB and its related behaviors. The original module demonstrated efficacy in promoting EB knowledge through active PE lessons (Chen et al., 2018), so we evaluated the effectiveness of the refined module using a type I effectiveness-implementation hybrid design (Curran et al., 2012). Specifically, the participating schools implemented the refined module with little training and support, receiving it through the robust PRN (i.e., the Iowa FitnessGram Initiative). The goal of releasing the lessons in this way was to assess whether they could be distributed through the network to promote broader dissemination. Indeed, our results confirmed a significant gain in EB knowledge (ΔM = 8.5%, ranging from 2.1% to 12.8%, d = 0.56) among the students, although the degree of change was smaller than that observed in the previous efficacy study (ΔM = ~15% increase; Chen et al., 2018). The EB knowledge gain was also evident among both boys and girls in fourth and fifth grades.
The overall smaller knowledge gains observed in this effectiveness study compared to the previous efficacy study (Chen et al., 2018), as well as the smaller gains observed in the naïve schools compared to the experienced schools, suggest some loss of fidelity when less structure is provided. The process data showed a low completion rate of the log sheets, which hampered our ability to fully understand the teachers' implementation process across the six schools. Future research should use multiple forms of quality process measures (e.g., checklists, interviews) to keep the implementation process accountable and informative as teachers proceed with the intervention.
Asking teachers to voluntarily report the degree of implementation is relevant to researchers but may not be perceived as relevant or important by practitioners. Therefore, future studies should identify strategies to improve the completion of self-reported fidelity data in a more consistent manner. Recognition, incentives, reminders, and encouragement could be provided through the PRN so that the participating teachers see process evaluation as an important step of program evaluation.
Nonetheless, in spite of not being able to see the full implementation landscape, we did gather some encouraging and useful insight about the module implementation. From the completed log sheets, we learned that the refined EB education lessons led to overall positive teaching (high teacher satisfaction, appropriate content difficulty, with modest level of lesson modifications) and learning (enjoyable experiences and modest engagement).
From these process data, future adaptations can be made to facilitate easier access and implementation. For example, we believe that communication with the PE teachers may be more effective when coordinated messages come from within the school system (i.e., from another member of the school such as the principal or a colleague). We will also attempt to furnish lesson plans and ancillary instructional materials (i.e., task sheets) in simpler and shorter forms. The broader release of the materials through the Iowa FitnessGram Initiative and other PE-related networks would enable teachers to adopt and adapt these materials to fit their own needs.
A major strength of the study is that we followed Curran et al.'s effectiveness-implementation hybrid design (type I) to evaluate both the effectiveness and implementation of the refined EB education module. This investigation highlights the importance of studying the external validity of curriculum interventions in PE pedagogy research by following a robust hybrid research design. The findings confirm the utility of the Iowa FitnessGram Initiative as a PRN for program evaluation. The other strength of the study is related to the continuity of our research on EB education through school PE. The findings from our previous efficacy study and this effectiveness study provide a more complete story on the steps needed to translate an evidence-based curriculum or program into a practical set of instructional strategies.
Despite the aforementioned strengths, we also acknowledge several limitations. First, although the primary purpose of this study was to evaluate the effectiveness of the refined EB education module, a substantial amount of the self-reported data was missing, which prevented us from fully addressing the secondary purpose (implementation testing). The completed log sheets (with a low completion rate) were the only process data used to describe the implementation process, limiting our ability to observe the full picture of implementation as well as the potential moderating effect of implementation on intervention effectiveness. The next step in this line of research would be to carry out Curran et al.'s type II hybrid research design, which treats effectiveness and implementation testing as two equally important aims (Curran et al., 2012). The other potential limitation of the study is the lack of a control group for effectiveness testing.
Traditional effectiveness studies often involve a randomly assigned control group. Having a control group would strengthen the evidence from our effectiveness trial, but it would also make the study less feasible and practical. In addition, our previous efficacy study already showed significant outcomes when compared to control schools, so we did not consider control schools essential in this round of evaluation. Our cross-sectional comparisons of knowledge gains between school types (experienced vs. naïve), grades (fourth vs. fifth), and genders (boys vs. girls), along with the process evaluation data in this study, provide rich details about both the effectiveness and implementation of the module.