How Well Can Students Evaluate Online Science Information? Contributions of Prior Knowledge, Gender, Socioeconomic Status, and Offline Reading Ability
Abstract
This study investigated how well seventh-grade students (n = 1,434) evaluated the credibility of online information in science. The analysis examined the extent to which evaluation appeared to share aspects of other elements of online research and comprehension, including locating, synthesizing, and communicating. This study also investigated the extent to which prior knowledge, gender, socioeconomic status, and offline reading ability affected students’ evaluation during online reading in science. Results suggest that evaluation is a unique and difficult dimension of online research and comprehension. Results also suggest that girls outperform boys and that students with greater prior knowledge and offline reading ability can better evaluate online information compared with those with less prior knowledge and offline reading ability.
One of the most important skill sets that readers need today is the ability to evaluate the credibility of online information (Leu, Kinzer, Coiro, Castek, & Henry, 2013). This skill set is especially important in disciplines such as science that rely on credibility evaluation to build an accurate understanding of concepts over time (Halverson, Siegel, & Freyermuth, 2010). Unfortunately, prior research has shown that students are not especially skilled in this area (e.g., Goldman, Braasch, Wiley, Graesser, & Brodowinska, 2012; Walraven, Brand-Gruwel, & Boshuizen, 2009). Yet, this research has been underdeveloped in critical ways: It has focused on older adolescents, has tended to rely on narrow views of evaluation, and has largely ignored effects of individual difference variables. These limitations have resulted in a relatively narrow view of students’ evaluation abilities, making it difficult to develop effective instruction.
Thus, the purpose of this study was twofold: First, I sought to understand how well seventh graders evaluated credibility when prompted to engage in multiple components of evaluation during online science inquiry. Second, I sought to understand the extent to which four group difference variables, each with implications for individual differences, contributed to their ability to do so. Specifically, this study investigated two research questions:
- To what extent can seventh graders evaluate the credibility of information during an online research and comprehension task in science that also includes locating, synthesizing, and communicating information?
- To what extent do prior knowledge, gender, socioeconomic status (SES), and offline reading ability contribute to students’ ability to evaluate information credibility during an online research and comprehension task in science?
Theoretical and Empirical Perspectives
In this study, I used a disciplinary literacy framework (T. Shanahan & Shanahan, 2008), a theory of online research and comprehension (Leu et al., 2013), and perspectives on individual differences during reading (see Afflerbach, 2016) to guide my investigation. A disciplinary literacy framework posits that literacy is characterized by the specific needs of the discipline in which it operates. In this study, I viewed webpages as science texts and the process of meaning making while evaluating during an online research and comprehension task that also included locating, synthesizing, and communicating (Leu et al., 2013) as occurring through an interaction of an individual reader, a text, and a context (RAND Reading Study Group, 2002). I also used theoretical and empirical perspectives on credibility to define credibility evaluation within this context. Here, I define credibility as accuracy (Kiili, Laurinen, & Marttunen, 2008) or believability (Hovland, Janis, & Kelley, 1953) with three subordinate tiers: knowledge–claim credibility, or the extent to which an author’s claims are consistent with the knowledge the reader believes to be true (see, e.g., Bromme, Kienhues, & Porsch, 2010); source credibility, or the extent to which the source of information (i.e., author, publisher) is credible (see Wineburg, 1991); and context credibility, or the extent to which the context in which the information is presented is credible (Wathen & Burkell, 2002). In this study, I define credibility evaluation as an iterative process of determining the extent to which information is credible using all three tiers.
Methods
Participants were 1,434 seventh graders (736 girls and 698 boys) from two U.S. states who were engaged in year 4 of a five-year study called the ORCA Project. This project developed and validated the Online Research and Comprehension Assessments (ORCAs; Leu, Kulikowich, Sedransk, & Coiro, 2009). Students in the present study completed one of four versions, each on a different science topic, of the ORCA-II Virtuals. Each ORCA-II Virtual comprises four items for each of four skill areas: locating, synthesizing, communicating, and evaluating. Students interacted with items through a virtual environment with avatars, email, and a social network (Leu, Kulikowich, Sedransk, & Coiro, 2014). Prior to beginning each ORCA-II Virtual, students completed a 10-item multiple-choice prior knowledge measure on the science domain (heart or eye health) of their randomly assigned version. Students also identified their gender and completed an offline reading measure, developed and validated as part of the project (Cui, Bruner-Sedransk, & Sedransk, 2014). SES data were collected by school, with percentage of students receiving free or reduced-price lunch (FRPL) as a proxy measure of SES.
To investigate research question 1, I first used an analysis of shared variance involving four regression analyses to investigate the extent to which elements found in the Evaluate construct were also found in the Locate, Synthesize, and Communicate constructs. Second, I used three separate two-level models to evaluate the relative difficulty of Evaluate compared with the other three skill areas. The outcome measure for each of these analyses was a composite variable: the difference between each student’s score in a given skill area and that student’s Evaluate score. For research question 2, I used a two-level model to investigate the extent to which prior knowledge, gender, SES, and offline reading ability contributed to students’ evaluation abilities. Prior knowledge, gender, SES, and offline reading scores served as the independent variables, with overall Evaluate scores as the dependent variable. In the multilevel analyses, students (level 1) were nested within schools (level 2).
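To make this analytic design concrete, the sketch below shows how such models could be specified in Python with pandas and statsmodels. It is a minimal illustration rather than the study’s actual analysis code: the data file orca_scores.csv and all column names (evaluate, locate, synthesize, communicate, prior_knowledge, gender, offline_reading, school_frpl, school_id) are hypothetical, and the study’s exact estimation and error-correction procedures are not reproduced.

```python
# Minimal sketch of the two sets of models, assuming a hypothetical data frame
# with one row per student. Requires pandas and statsmodels.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("orca_scores.csv")  # hypothetical file: one row per student

# Research question 1: difference scores comparing each skill area with
# Evaluate, each fit as an intercept-only model with a random intercept
# for school (students at level 1 nested within schools at level 2).
for skill in ["locate", "synthesize", "communicate"]:
    df[f"{skill}_minus_evaluate"] = df[skill] - df["evaluate"]
    model = smf.mixedlm(f"{skill}_minus_evaluate ~ 1",
                        data=df, groups=df["school_id"])
    print(skill, model.fit().summary())

# Research question 2: student-level predictors (level 1) plus school means
# and school FRPL (level 2) predicting Evaluate, with a random intercept
# for school. Continuous predictors are standardized.
for col in ["prior_knowledge", "offline_reading"]:
    df[f"{col}_z"] = (df[col] - df[col].mean()) / df[col].std()
for col in ["prior_knowledge_z", "gender", "offline_reading_z"]:
    df[f"{col}_school"] = df.groupby("school_id")[col].transform("mean")

rq2 = smf.mixedlm(
    "evaluate ~ prior_knowledge_z + gender + offline_reading_z"
    " + prior_knowledge_z_school + gender_school + offline_reading_z_school"
    " + school_frpl",
    data=df,
    groups=df["school_id"],
)
print(rq2.fit().summary())
```

The random intercept for school is what partitions the variance in each outcome into within- and between-school components, matching the nesting described above.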
Results and Discussion
Results for research question 1 revealed that, on average, students were not especially skilled at the related areas of locating, synthesizing, evaluating, or communicating (see Table 1 for means, Table 2 for percentages of students scoring correctly on Evaluate items, and Table 3 for correlations) but were particularly unskilled at evaluating. Only 4% scored correctly on all four Evaluate items. Most students (82.7%) correctly identified the author, but only 22.5% correctly evaluated the author’s expertise, 31% the author’s point of view, and 15% the overall credibility of the webpage (see Table 2). Students performed significantly more poorly on Evaluate than on Locate and Synthesize. However, students performed most poorly on Communicate (see Table 1 for means and Table 4 for parameter estimates). One reason why students performed poorly on Evaluate may be that it is not well defined as a process and thus not well taught. Much work in this area has focused on subcomponents of evaluation, such as sourcing (see Bråten, Stadtler, & Salmerón, 2018), rather than viewing evaluation as an iterative process involving multiple, integrated components, even though such a process has been identified in both offline (C. Shanahan, Shanahan, & Misischia, 2011) and online contexts (Kiili et al., 2008; Wathen & Burkell, 2002). Students may benefit from learning how to evaluate using all three tiers (knowledge–claim, sourcing, and context) within an online inquiry task rather than using a single tier in isolation.
Table 1. Means, Standard Deviations, and Reliability Estimates for Study Variables

Variable | M (SD) | Cronbach’s α |
---|---|---|
Research question 1 | | |
Evaluate (out of 4) | 1.51 (0.98) | .44 |
Evaluate 1 (out of 1) | .83 (.38) | |
Evaluate 2 (out of 1) | .23 (.42) | |
Evaluate 3 (out of 1) | .31 (.46) | |
Evaluate 4 (out of 1) | .15 (.36) | |
Locate (out of 4) | 1.85 (1.23) | .56 |
Synthesize (out of 4) | 2.41 (1.35) | .66 |
Communicate (out of 4) | 1.05 (1.08) | .49 |
Locate − Evaluate (out of 4) | 0.342 (1.39) | |
Synthesize − Evaluate (out of 4) | 0.901 (1.37) | |
Communicate − Evaluate (out of 4) | −0.463 (1.30) | |
Research question 2 | | |
Prior knowledge (out of 10) | 4.73 (1.63) | |
Gender (out of 1) | .49 (.50) | |
Offline reading (out of 15) | 9.32 (3.05) | |
School mean for prior knowledge (out of 10) | 4.73 (0.489) | |
School mean for gender (out of 1) | .49 (.084) | |
School mean for offline reading (out of 15) | 9.32 (1.23) | |
School FRPL (out of 100) | 39.12 (23.33) | |

Note
- A 100-point school FRPL (free or reduced-price lunch) score would indicate that 100% of students at a given school received free or reduced-price lunch.
Table 2. Percentages of Students Scoring Correctly on the Evaluate Items

Evaluate component | Percentage of students scoring correctly |
---|---|
Evaluate (out of 4 score points) | |
4 score points correct | 4.0 |
3 score points correct | 11.6 |
2 score points correct | 28.8 |
1 score point correct | 42.9 |
0 score points correct | 12.8 |
Individual Evaluate score points (out of 1 point each) | |
Evaluate 1: Identify the author | 82.7 |
Evaluate 2: Evaluate the author’s expertise | 22.5 |
Evaluate 3: Evaluate the author’s point of view | 31.0 |
Evaluate 4: Evaluate the overall credibility of the webpage | 15.0 |
Table 3. Correlations Among Study Variables

Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
---|---|---|---|---|---|---|---|---|---|---|---|
Dependent variable | | | | | | | | | | | |
1. Evaluate | — | | | | | | | | | | |
Independent variables | | | | | | | | | | | |
2. Locate | .229** | — | | | | | | | | | |
3. Synthesize | .352** | .298** | — | | | | | | | | |
4. Communicate | .245** | .230** | .224** | — | | | | | | | |
Student-level variables | | | | | | | | | | | |
5. Prior knowledge | .186** | .158** | .233** | .137** | — | | | | | | |
6. Gender | −.083** | −.083** | −.229** | −.055* | .026 | — | | | | | |
7. Offline reading measure | .351** | .217** | .373** | .276** | .230** | −.013 | — | | | | |
School means | | | | | | | | | | | |
8. Prior knowledge | .166** | .108** | .185** | .131** | .300** | .003 | .187** | — | | | |
9. Gender | −.030 | −.014 | −.030 | .024 | .005 | .167** | .072** | .018 | — | | |
10. Offline reading measure | .234** | .135** | .211** | .165** | .139** | .030 | .404** | .462** | .178** | — | |
11. Free or reduced-price lunch | −.086** | −.114** | −.132** | −.113** | −.176** | −.029 | −.186** | −.587** | −.174** | −.461** | — |

Note
- *p < .05, two-tailed. **p < .01, two-tailed.
Table 4. Parameter Estimates for Two-Level Models Comparing Each Skill Area With Evaluate

Dependent variable | Fixed effects intercept | Random effects residual | Random effects intercept | Within-school variance | Between-school variance |
---|---|---|---|---|---|
Locate − Evaluate | 0.367* (0.050) | 1.90* (0.072) | 0.044a (0.022) | 0.977 | 0.023 |
Synthesize − Evaluate | 0.902* (0.041) | 1.85* (0.070) | 0.014a (0.015) | 0.993 | 0.007 |
Communicate − Evaluate | −0.445* (0.048) | 1.57* (0.060) | 0.043a (0.021) | 0.970 | 0.027 |

Note
- Results are based on data from 1,434 students distributed across 40 classroom sites. Standard errors are in parentheses. A Bonferroni correction was used to control for Type I error at the .05 level.
- a Not statistically significant.
- *p < .05.
Each skill area contributed a small but unique amount of variance to Evaluate: Locate accounted for 5.2% of the variance in Evaluate, Synthesize for 12.3%, and Communicate for 5.9%, and all three analyses were statistically significant (see Table 5). Together, Locate, Synthesize, and Communicate accounted for 16.2% of the variance in Evaluate, which was also statistically significant; Synthesize contributed the most, followed by Communicate and then Locate (see Table 6; a computational sketch of this shared-variance analysis follows Table 6). The small but positive and statistically significant correlations between each of the other three skill areas and Evaluate (see Table 3) are consistent with these R² values. These analyses support the idea that the Evaluate scale measured a construct distinct from the other three scales. However, the other scales’ contributions to Evaluate suggest that the constructs share some commonality.
Table 5. Simple Regressions Predicting Evaluate From Each Skill Area

Variable | R | Adjusted R² | F | Unstandardized β |
---|---|---|---|---|
Synthesize | .352 | .123 | 202.51† | 0.257† |
Communicate | .245 | .059 | 91.30† | 0.225† |
Locate | .229 | .052 | 79.27† | 0.183† |

Note
- †p < .005.
Table 6. Multiple Regression Predicting Evaluate From Synthesize, Communicate, and Locate

Variable | R | Adjusted R² | F | Unstandardized β | Standardized β |
---|---|---|---|---|---|
Synthesize, Communicate, and Locate | .404 | .162 | 93.01* | | |
Synthesize | | | | 0.21* | 0.29* |
Communicate | | | | 0.14* | 0.16* |
Locate | | | | 0.09* | 0.11* |

Note
- *p < .05.
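As a concrete companion to Tables 5 and 6, the following sketch shows how the shared-variance analysis could be run with ordinary least squares in statsmodels. The data frame and column names are the same hypothetical ones used in the earlier sketch: the simple regressions give the variance in Evaluate accounted for by each skill area alone, and the combined regression gives the variance accounted for jointly.

```python
# Sketch of the shared-variance analysis summarized in Tables 5 and 6,
# using the same hypothetical data frame and column names as above.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("orca_scores.csv")  # hypothetical file: one row per student

# Simple regressions: variance in Evaluate accounted for by each skill alone.
for skill in ["synthesize", "communicate", "locate"]:
    fit = smf.ols(f"evaluate ~ {skill}", data=df).fit()
    print(f"{skill}: adj. R2 = {fit.rsquared_adj:.3f}, "
          f"b = {fit.params[skill]:.3f}, F = {fit.fvalue:.2f}")

# Combined regression: variance in Evaluate shared with all three skills,
# with each predictor's relative contribution shown by its coefficient.
combined = smf.ols("evaluate ~ synthesize + communicate + locate", data=df).fit()
print(f"combined: adj. R2 = {combined.rsquared_adj:.3f}")
print(combined.params)  # unstandardized coefficients, as in Table 6
```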
Analyses for research question 2 revealed that, of the variance in Evaluate, 90% occurred within schools and 9.7% occurred between schools, both of which were statistically significant (see Table 7; a sketch of this variance partition follows the table). For student-level effects, there was a positive and statistically significant, but small, relation between prior knowledge and Evaluate: For each standard deviation increase in prior knowledge, Evaluate was predicted to increase by 0.09 points (see Table 7). Online reading research has suggested that prior knowledge may be less important in an online than in an offline context (Coiro, 2011). However, this may not be true for evaluation specifically: Offline reading studies have suggested that prior knowledge may be especially important for evaluating knowledge claims (Scharrer, Stadtler, & Bromme, 2014) and sourcing (C. Shanahan et al., 2011). Girls performed significantly better on Evaluate than boys, by an average of 0.15 points (see Table 7). One reason the gender gap was not larger may be that boys’ skills in and attitudes toward science (Katz, Allbritton, Aronis, Wilson, & Soffa, 2006) and online environments (Liu & Huang, 2008) mitigated typical offline reading gender gaps. There was a positive and statistically significant, but small, relation between offline reading and Evaluate: For each standard deviation increase in offline reading, Evaluate was predicted to increase by 0.26 points (see Table 7). The contribution of offline reading to online reading was statistically significant but not large, supporting the idea that online and offline reading share commonalities but are not the same (Coiro, 2011).
At the school level, there was a positive and statistically significant relation between the school mean for offline reading and Evaluate (see Table 3 for the correlation): For each standard deviation increase in the school mean for offline reading, Evaluate was predicted to increase by 0.21 points. However, the school mean variables for prior knowledge, gender, and FRPL were not statistically significant (see Table 7 for estimates). That FRPL was not statistically significant was surprising, given that prior work in both offline reading (Bailey & Dynarski, 2011) and online reading (Leu et al., 2015) has found a large achievement gap between students from higher and lower income families. This result may have occurred because of the limited variance in evaluation scores, pointing to the need to develop better assessments and instruction.
Table 7. Parameter Estimates for Two-Level Models Predicting Evaluate

Parameter | Model 1 (unconditional) | Model 2 (conditional) |
---|---|---|
Fixed effects | | |
Intercept | 1.48‡ (0.06) | 1.78† (0.27) |
Level 1: Student specific | | |
Prior knowledge (standardized) | | 0.09a (0.02) |
Gender | | −0.15† (0.05) |
Offline reading (standardized) | | 0.26† (0.02) |
Level 2: School means | | |
Prior knowledge (standardized) | | 0.09a (0.06) |
Gender | | −0.75a (0.49) |
Offline reading (standardized) | | 0.21† (0.05) |
Socioeconomic status | | 0.00a (0.00) |
Random parameters | | |
Intercept | 0.10‡ (0.03) | 0.05† (0.02) |
Residual | 0.88‡ (0.03) | 0.79† (0.03) |
−2 log likelihood | 3,946.11 | 3,800.94 |

Note
- Results are based on data from 1,434 students distributed across 40 classrooms. Standard errors are in parentheses. A Bonferroni correction was used to control for Type I error at the .05 level. Estimates for student- and school-level predictors apply to Model 2, which is the only model that includes predictors.
- a Not statistically significant.
- †p < .005. ‡p < .017.
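To show where the within- and between-school variance shares reported above come from, the sketch below derives the intraclass correlation from an unconditional random-intercept model (Model 1 in Table 7). The data frame and column names remain the hypothetical ones from the earlier sketches; the variance partition itself is the standard ICC calculation.

```python
# Sketch: within- and between-school variance shares from the unconditional
# model (Model 1 in Table 7), using the same hypothetical data frame as above.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("orca_scores.csv")  # hypothetical file: one row per student

# Intercept-only model with a random intercept for school.
m1 = smf.mixedlm("evaluate ~ 1", data=df, groups=df["school_id"]).fit()

tau = m1.cov_re.iloc[0, 0]  # between-school (random intercept) variance
sigma2 = m1.scale           # within-school (residual) variance

icc = tau / (tau + sigma2)  # share of variance lying between schools
print(f"between-school share: {icc:.3f}")   # ~.097 in the study
print(f"within-school share: {1 - icc:.3f}")
```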
Implications
Findings from the present study suggest three key priorities for instruction and assessment in evaluation: beginning instruction at a young age, targeting instruction to the needs of learners with different characteristics, and further developing the Evaluate construct within online research and comprehension assessments to inform this instruction. First, by grade 7, students already experience significant difficulty with evaluating. To successfully teach students to evaluate, we need to start young, as with other literacy skills (Annie E. Casey Foundation, 2010). Second, results revealed that students with more prior knowledge and offline reading ability performed better than those with less and that girls performed better than boys. This suggests that differentiating instruction to meet different learners’ needs will be more effective than using a single approach (Coiro, 2011; Kiili et al., 2008). Third, results revealed that Evaluate was a construct distinct from, but related to, Locate, Synthesize, and Communicate. Rather than measuring evaluation in isolation, it may be best to measure it alongside related skills (Leu et al., 2013) because students appear to evaluate as part of a dynamic process involving other skills (Goldman et al., 2012). Including new items to measure additional component skills would provide detailed information for instructional planning and allow for further development of the online evaluation construct, particularly within different disciplines. Disciplinary literacy in online contexts is not yet well defined or understood, and such work would be a significant contribution leading to improved instruction.
Notes
Portions of this material are based on work supported by the Institute of Education Sciences, U.S. Department of Education (grants R305G050154 and R305A090608). Opinions expressed herein are solely those of the author and do not necessarily represent the position of the Institute of Education Sciences or the U.S. Department of Education.
Biography
ELENA FORZANI earned her PhD in educational psychology, with a specialization in cognition, instruction, and learning technology, from the University of Connecticut, Storrs, USA. Her dissertation was supported by her committee members: Donald J. Leu (chair), Mary Anne Doyle, Michael A. Coyne, and Christopher Rhoads. Forzani is an assistant professor in literacy education at Boston University, Massachusetts, USA; email [email protected]. Her work focuses on digital literacies learning, with an emphasis on understanding how students comprehend and evaluate disciplinary information in online contexts.