STATISTICS

The literature review in a dissertation serves to illuminate the research gap and thereby justify the necessity of the study and the proposed study methodology. Though this purpose remains consistent across all methodologies, the structure of the literature review section of a qualitative dissertation often differs slightly from that of a quantitative dissertation. For example, literature reviews in quantitative dissertations are predominantly constructed around the variables, while those in qualitative dissertations can be constructed in many ways. In this assignment, you will contrast the structures of a qualitative and a quantitative literature review and consider a rationale for those differences.

General Requirements:

Use the following information to ensure successful completion of the assignment:

  • Review the Vangilder (qualitative) dissertation. (ATTACHED)
  • Review the Wigton (quantitative) dissertation. (ATTACHED)
  • APA style is required for this assignment.
  • You are required to submit this assignment to LopesWrite. 

Directions:

Write a paper (600-750 words) in which you contrast the structures of the qualitative and quantitative dissertations referenced above and provide a rationale for the differences. Include the following in your paper:

  1. A clear description of the primary differences between the structures of the literature reviews.
  2. A rationale for the observed differences. What factors contributed to the differences in the structures of these literature reviews?

A Grounded Theory Investigation of Thinking and Reasoning with Multiple
Representational Systems for Epistemological Change in Introductory Physics
Submitted by
Clark Henson Vangilder

A Dissertation Presented in Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy in Psychology

Grand Canyon University
Phoenix, Arizona
February 23, 2016

All rights reserved.
This work is protected against unauthorized copying under Title 17, United States Code
Microform Edition © ProQuest LLC.
ProQuest LLC.
789 East Eisenhower Parkway
P.O. Box 1346
Ann Arbor, MI 48106 – 1346
Published by ProQuest LLC (2016). Copyright of the Dissertation is held by the Author.
ProQuest Number: 10027568

© by Clark Henson Vangilder, 2016
All rights reserved.

GRAND CANYON UNIVERSITY

A Grounded Theory Investigation of Thinking and Reasoning with Multiple
Representational Systems for Epistemological Change in Introductory Physics

I verify that my dissertation represents original research, is not falsified or plagiarized,
and that I have accurately reported, cited, and referenced all sources within this
manuscript in strict compliance with APA and Grand Canyon University (GCU)
guidelines. I also verify my dissertation complies with the approval(s) granted for this
research investigation by GCU Institutional Review Board (IRB).

Clark Henson Vangilder, February 8, 2016

Abstract
Conceptual and epistemological change work in concert under the influence of
representational systems, and are employed by introductory physics (IP) students in the
thinking and reasoning that they demonstrate in various modelling and problem-solving
processes. A grounded theory design was used to qualitatively assess how students used
multiple representational systems (MRS) in their own thinking and reasoning along the
way to personal epistemological change. This study was framed by the work of Piaget and
other cognitive theorists and conducted in a college in Arizona; the sample size was 44.
The findings herein suggest that thinking and reasoning are distinct processes that handle
concepts and conceptual frameworks in different ways, and thus a new theory for the
conceptual framework of thinking and reasoning is proposed. Thinking is defined as the
ability to construct a concept, whereas reasoning is the ability to construct a conceptual
framework (build a model). A taxonomy of conceptual frameworks encompasses thinking
as a construct dependent on building a model, and relies on the interaction of at least four
different types of concepts during model construction. Thinking is synonymous with the
construction of conceptual frameworks, whereas reasoning is synonymous with the
coordination of concepts. A new definition for understanding as the ability to relate
conceptual frameworks (models) was also created as an extension of the core elements of
thinking and reasoning about the empirically familiar regularities (laws) that are part of
Physics.
Keywords: thinking, reasoning, understanding, concept, conceptual framework,
personal epistemology, epistemological change, conceptual change, representational
system, introductory physics, model, modeling, physics.

Dedication
This work is dedicated to my marvelous wife, Gia Nina Vangilder. Above all
others, she has sacrificed much during the journey to my Ph.D. Her unwavering love and
loyalty transcend the practical benefits of her proofreading assistance over the years, as
well as other logistical maneuverings pertaining to our family enduring the time
commitment that such an endeavor requires of me personally.
You are amazing, Gia, and I love you more than mere words can describe!
Most importantly, I thank God Himself for putting my mind in a wonderful
universe so rich with things to explore.

Acknowledgments
I am exceptionally pleased to have worked with the committee that has approved
this document—Dr. Racheal Stimpson (Chair), Dr. Pat D’Urso (Methodologist), and Dr.
Rob MacDuff (Content Expert). Each one of you has contributed to my success in your
own special way, and with your own particular talents.
I am blessed to have walked this path under your guidance.
Honorable mention is given to Dr. Rob MacDuff, whose influence and collaboration
over the years are valuable beyond measure or words. Neither of us would be where we are
without the partnership of theory and practice that has defined our collaboration for
more than a decade now. I am truly blessed to know you and work with you.

Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction to the Study
  Introduction
  Background of the Study
    Personal epistemology
    Representational Systems
  Problem Statement
  Purpose of the Study
  Research Questions and Phenomenon
  Qualitative Research Questions
  Advancing Scientific Knowledge
  Significance of the Study
  Rationale for Methodology
  Nature of the Research Design for the Study
  Definition of Terms
  Assumptions, Limitations, Delimitations
  Summary and Organization of the Remainder of the Study
Chapter 2: Literature Review
  Introduction to the Chapter and Background to the Problem
  Theoretical Foundations and Conceptual Framework
    Personal epistemology
    Thinking and reasoning
    Building a conceptual model for this study
    Representational systems
    Self-efficacy, self-regulation, and journaling
    Convergence of conceptual and theoretical foundations
  Review of the Literature
    A brief history of personal epistemology research
    A brief history of assessment on personal epistemology
    Connections between conceptual change and personal epistemology
    Conceptual change in introductory physics
    Personal epistemologies and learning physics
    Thinking and reasoning in introductory physics
    Study methodology
    Study instruments and measures
  Summary
Chapter 3: Methodology
  Introduction
  Statement of the Problem
  Research Questions
  Research Methodology
  Research Design
  Population and Sample Selection
  Instrumentation and Sources of Data
    Classroom activities and assessment instrument
  Validity
  Reliability
  Data Collection and Management
  Data Analysis Procedures
    Preparation of data
    Data analysis
  Ethical Considerations
  Limitations and Delimitations
  Summary
Chapter 4: Data Analysis and Results
  Introduction
  Descriptive Data
  Data Analysis Procedures
    Coding schemes
    Triangulation of data
  Results
    PEP Analysis
    Qualitative analysis
    Analysis of the physics and reality activity journals
    Consideration of research questions with current results
    Combined analysis of the remaining study activities
    Other assessments
  Summary
Chapter 5: Summary, Conclusions, and Recommendations
  Introduction
  Summary of the Study
  Summary of Findings and Conclusion
    Research Question 1
    Research Question 2
    Definitions
    Predictions
    Suggestions for TRU Learning Theory use
  Implications
    Theoretical implications
    Practical implications
    Future implications
    Strengths and weaknesses
  Recommendations
    Recommendations for future research
    Recommendations for future practice
References
Appendix A. Site Authorization Form
Appendix B. Student Consent Form
Appendix C. GCU D-50 IRB Approval to Conduct Research
Appendix D. Psycho-Epistemological Profile (PEP)
Appendix E. What is Physics? What is Reality? Is Physics Reality?
Appendix F. Numbers Do Not Add
Appendix G. The Law of the Circle
Appendix H. The Zeroth Laws of Motion
Appendix I. End of Term Interview

List of Tables
Table 1. Literature Review Search Pattern 1
Table 2. Literature Review Search Pattern 2
Table 3. Study Population Demographics
Table 4. Interview Transcript Data
Table 5. PEP Dimension Scores
Table 6. Basic PEP Composite Descriptive Statistics
Table 7. Basic PEP Dimension Descriptive Statistics
Table 8. Primary PEP Dimension Changes
Table 9. Secondary PEP Dimension Changes
Table 10. Tertiary PEP Dimension Changes
Table 11. PEP Score Distributions Normality Tests
Table 12. Overall Coding Results
Table 13. Coding Results for the Elements of Thought (EoT)
Table 14. Jaccard Indices for Distinction and EoT Code Comparison
Table 15. Examples of Concept Coordination
Table 16. Examples of Belief Development Claims About Thinking
Table 17. Examples of EoT Belief Development
Table 18. Examples of Belief Development
Table 19. Examples of Belief Development
Table 20. Force Concept Inventory (FCI) Results
Table 21. Mechanics Baseline Test (MBT) Results
Table 22. Cognitive Modeling Approach to Axiom Development

List of Figures
Figure 1. The eight elements of thought.
Figure 2. The eight elements of scientific thought.
Figure 3. Typical classroom activity life cycle.
Figure 4. Cluster analysis circle graph for EoT and distinctions.
Figure 5. Cluster analysis dendrogram.
Figure 6. Distinctions and coordinations vs. EoT node matrix.
Figure 7. Concepts and individual POV node matrix.
Figure 8. Distinctions and coordinations vs. EoT node matrix.
Figure 9. MSPR group discussions distinctions-coordinations EoT node matrix.
Figure 10. MSPR journals distinctions-coordinations EoT node matrix.
Figure 11. MSPR math EoT node matrix.
Figure 12. MSPR science EoT node matrix.
Figure 13. MSPR physics EoT node matrix.
Figure 14. Distinctions vs. EoT node matrix.
Figure 15. Coordinations vs. EoT node matrix.
Figure 16. Belief development with TRU claims node matrix.
Figure 17. Node matrix comparing beliefs with EoT.
Figure 18. Node matrix comparing true claims with EoT.
Figure 19. Cognitive Modeling Taxonomy of Conceptual Frameworks – Processes.
Figure 20. Cognitive Modeling Taxonomy of Conceptual Frameworks – Collections.
Figure 21. CMTCF example 1: first zeroth law of motion.
Figure 22. Vector diagrammatic model of the First Zeroth Law.
Figure 23. Graphical model of the First Zeroth Law.
Figure 24. CMTCF Example 2: Second Zeroth Law of Motion.
Figure 25. CMTCF example 3: Second Zeroth Law axiom.

Chapter 1: Introduction to the Study
Introduction
The cumulative history of physics education research (PER) for the last 34 years
has led to a reform in science teaching that has fundamentally changed the nature of
physics instruction in many places around the world (Modeling Instruction Project, 2013;
ISLE, 2014). Historical developments in PER have highlighted the connection that exists
between conceptual change and the way that students come to learn (Hake, 2007;
Hestenes, 2010), the difficulties that impede their learning (Lising & Elby, 2005), the
connection between personal epistemology and learning physics (Brewe, Traxler, de la
Garza & Kramer, 2013; Ding, 2014; Zhang & Ding, 2013), and theoretical developments
that inform pedagogical reform (Hake, 1998; Hestenes, 2010). To date, little research has
been done exploring the particular mechanisms of general epistemological change
(Bendixen, 2012), with PER pioneers such as Redish (2013) suggesting the need for a
basis in psychological theory for how physics students think and believe when it comes to
learning and knowledge acquisition. There is still no definitive answer about general
epistemological change within the literature (Hofer, 2012; Hofer & Sinatra, 2010), and
many of the leading researchers have been studying it within the context of mathematics
and/or physics (see Hammer & Elby, 2012; Schommer-Aikins & Duell, 2013).
The central goal of this research was to determine how students encode meaning
through the deployment of multiple representational systems (MRS)—such as words,
symbols, diagrams, and graphs—in an effort towards thinking and reasoning their way
through epistemological change in an Introductory Physics (IP) classroom. Specifically,
this study positions MRS as tools for thinking and reasoning that are capable of
producing epistemological change. Among other things, the study sought to find the types
and numbers of MRS that are the most useful in producing epistemological change. Such
findings would then inform the PER community concerning the capacity that MRS have
for encoding meaning during the scientific thinking and reasoning process. Moreover, the
relative importance of personal epistemology in the process of conceptual change—either
as a barrier or a promoter—is the kind of information needed for continued progress in
the PER reform effort, as well as learning theory in general. The PER Community has a
number of peer-reviewed journals such as the American Journal of Physics (see Hake
1998, 2007; Lising & Elby, 2005; Redish 2013) and the Physical Review Special Topics –
Physics Education Research (see Bing & Redish, 2012; Bodin, 2012; Brewe, 2011; De
Cock, 2012; Ding, 2014), where much of the research is reported.
The multi-decade findings of both the PER community and the researchers
involved with personal epistemology indicate a deep connection between learning
physics and beliefs about the world, as well as how those epistemic views correspond to
conceptual change. It is impossible to do Physics without the aid of conventional
representational systems such as natural language and mathematics; hence the inherent
capacity for those representational systems to influence both conceptual and epistemic
knowledge (Plotnitsky, 2012) is a legitimate point of inquiry that has gone largely
unnoticed. The usage of one or more representational systems should inform researchers
of what the student is thinking or reasoning about: specifically, the ontology, and
therefore the beliefs that such a learner has concerning what has been encoded by MRS.
Beliefs about reality and the correspondence to Physics are inextricably linked through
MRS.
According to Pintrich (2012), it is unclear at this time how representational
systems influence epistemological change when deployed in learning environments of
any type. Historically, the lessons learned from the advance of the learning sciences have
shown that personal choices in representational systems are critical to the metacognitive
strategies that lead to increased learning and knowledge transfer (Kafai, 2007) when
situated in learning environments that are collaborative and individually reflective against
the backdrop of prior knowledge (Bransford, Brown, Cocking, & National Research
Council, 1999). The central goal of this research was to determine how thinking and
reasoning with multiple representational systems (MRS)—such as diagrams, symbols,
and natural language—influences epistemological change within the setting of an IP
classroom. The study described herein positions adult community college students in a
learning environment rich with conceptual and representational tools, along with a set of
challenges to their prior knowledge and beliefs. This study answers a long-standing
deficit in the literature on epistemological change (Bendixen, 2012; Pintrich, 2012) by
providing a deeper understanding of the processes and mechanisms of epistemological
change as they pertain to context (domain of knowledge) and representational systems in
terms of the psychological constructs of thinking and reasoning. This chapter will set up
the background for the study research questions based on the current and historical
findings within the fields of personal epistemology research, and the multi-decade
findings of the PER community.
Background of the Study
The current state of research on personal epistemology is one of theoretical
competition (Hofer, 2012; Pintrich, 2012) concerning how learners situated within
different contexts, domains of inquiry, and developmental stages obtain epistemological
advancement, as well as whether or not to include the nature of learning alongside the
nature of knowledge and knowing in the definition of personal epistemology (Hammer &
Elby, 2012). The term epistemology deals with the origin, nature, and usage of
knowledge (Hofer, 2012), and thus epistemological change addresses how individual
beliefs are adjusted and for what reasons. Moreover, the field has not produced a clear
understanding of how those learners develop conceptual knowledge about the world with
respect to their personal beliefs about the world (Hofer, 2012). Conceptual change
research has not fared much better, and suffers from a punctuated view of conceptual
change that has been dominated by pre-post testing strategies rather than process studies
(diSessa, 2010). According to Hofer (2012), future research needs to find relations
between psychological constructs and epistemological frameworks in order to improve
methodology and terminology such that comparable studies can be conducted—thus
unifying the construct of personal epistemology within the fields of education and
developmental psychology. Bendixen (2012) suggested that little research on the
processes and mechanisms of epistemological change has been done, and echoed the call
by Hofer and Pintrich (1997) for more qualitative studies examining the contextual
factors that can constrain or facilitate the process of personal epistemological theory
change. Moore (2012) cited the need for research addressing the debate over domain-
general versus domain-specific epistemic cognition in terms of the features of learning
environments that influence learning and produce qualitative changes in the complexity
of student thinking.
Wiser and Smith (2010) described some of the deep connections that exist
between concept formation, ontology, and personal epistemology, within a framework of
metacognitive control that is central to modeling phenomena through both top-down and
bottom-up mental processes. These sorts of cognitive developments depend on the ability
to use representational systems that are rational (mathematics) and/or metaphorical
(natural language), within a methodological context that is empirical (measurement) in
nature. The student’s transition from holding a naïve theory—such as objects possess a
force property—to holding a more sophisticated or expert theory—that forces act on
objects (Hammer & Elby, 2012)—is by means of representational systems that serve in
part as epistemic resources for modeling real-world phenomena (Bing & Redish, 2012;
Moore et al., 2013). Moreover, it is the coupling of internal representations (mental
models) with the external representations that we call models, which is critical to the
reasoning process (Nersessian, 2010) and its assessment. These findings suggest an
intimate connection between personal epistemology and representational systems as they
function in concert with thinking, reasoning, and conceptual change; however, they do so
without specifying any particular tools. The central aim of this research is to describe
how MRS are used in the thinking and reasoning that accompanies epistemological
change.
Personal epistemology. Personal Epistemology (PE) has been an expanding field
of inquiry for at least 40 years, with a coalescence of a handful of models and theories
emerging in the late 1990s to early 2000s—such as process and developmental models,
and at least four different assessment instruments for judging the epistemic state of
learners at most any age (Herrón, 2010; Hofer & Pintrich, 2012). While the current
models and theories agree on the relationships to variables such as gender, prior
knowledge, beliefs about learning, and critical thinking (Herrón, 2010), it is not clear at
this time whether or not a unitary construct for personal epistemology applies in all
cases—suggesting a number of domain-specific (knowledge area such as science) gaps
that need further research.
The content of physics is neither purely rational nor purely empirical; it also depends
on metaphorical representations—such as the term flow for energy transfer, light is a
particle/wave, and electrons tunneling through quantum spaces—in order to foster the
understanding of complex phenomena and their underlying theories (Brewe, 2011;
Lancor, 2012; Scherr, Close, McKagan, & Vokos, 2012; Scherr, Close, Close, & Vokos,
2012). One of the earliest attempts to measure personal epistemology was the Psycho-
epistemological Profile (PEP) (Royce & Mos, 1980), which measures personal
epistemology on three dimensions: Rational, Empirical, and Metaphorical, and is
therefore an ideal assessment tool for scientific domains of epistemology. The rational
dimension of PEP assumes that knowledge is obtained through reason and logic, whereas
the empirical dimension derives and justifies knowledge through direct observation. The
metaphorical dimension of PEP defines knowledge as derived intuitively with a view to
subsequent verification of its universality.
Representational Systems. Schemata theory (Anderson et al., 1977) suggested that a
dynamic process of memory storage and retrieval, in concert with the use of
representational systems, leads to schemata, which serve as interpretive frameworks within
the process of epistemological advancement. Under the Modeling Instruction Theory for
Teaching Physics (Hestenes, 2010) students are taught to use a representational tool
known as a system schema that represents an abstraction of a given picture of some
physical situation. Specifically, this diagrammatic tool compels students to represent
various objects and interactions with regard to the system that governs them, and these
relationships are then productive for various aspects of the problem-solving event. One of
its capacities is as an error-checking device that validates (or invalidates) the equation
model of the same system—such as verifying the equation set adequately represents the
superposition of forces. Simply put, you can move forward with a solution (decision)
once you have verified that nothing was (a) left out of the model or (b) included
illegitimately. The use of multiple representational systems within an IE classroom forces
the reconciliation of multiple schemata on singular and/or connected phenomena. This
sort of conceptual turbulence challenges the epistemic stance of the learner, and thereby
provides an opportunity to detect epistemological change as a function of MRS.
Hestenes (2010) deployed multiple types of representations for encoding structure
in terms of systemic (links among interacting parts), geometric (configurations and
locations), object (intrinsic properties), interaction (causal), and temporal (changes in the
system) as ways to model and categorize the observations that students of science make in
an effort to mimic the expert view. In these ways, MRS are instrumental for modeling
the structure of physical phenomena (Plotnitsky, 2012; Scherr et al., 2012), and therefore
serve as evidence of what students believe the varied representational conventions of
mathematics and physics are capable of describing. The status of MRS as elements of
epistemological change is the primary research question in this study.
Problem Statement
It was not known (a) how thinking and reasoning with MRS occurs, and (b) how
that sort of thinking and reasoning affects epistemological change in terms of
mechanisms and processes—whether cognitive, behavioral, or social—in an IP
classroom. Moreover, as shown in the review of the literature herein, it is not clear what
anyone means by the terms thinking and reasoning within any context. The use of
representational systems—such as symbols, diagrams, and narratives—is undoubtedly
central to the progress of science education by virtue of its ubiquitous deployment in the
realm of natural science itself (Plotnitsky, 2012; Scherr, Close, McKagan & Vokos,
2012). Given the cognitive filter that personal epistemology provides for the acquisition
and the application of knowledge (Schommer-Aikins, 2012), it seemed reasonable to
investigate the nature of epistemological change in concert with the thinking and
reasoning that occurs by means of the representational systems associated with a domain
of knowledge—such as IP. The importance of this study hinged on its ability to answer a
long-standing deficit in the literature on epistemological change (Bendixen, 2012;
Pintrich, 2012) by providing a deeper understanding of the processes and mechanisms of
epistemological change as they pertain to context (domain of knowledge) and
representational systems in terms of the psychological constructs of thinking and
reasoning. These findings better inform the Physics Education Research (PER)
community concerning the capacity that MRS have for encoding meaning during the
scientific thinking and reasoning process, while simultaneously clarifying what is meant
by those processes. Moreover, the relative importance of personal epistemology in the
process of conceptual change, either as a barrier or a promoter, is the kind of
information needed for continued progress in the PER reform effort, as well as learning
theory in general. The importance of advancing scientific thinking, reasoning, and
conceptual change (in terms of epistemological change) lies in the clear evidence from
PER that conceptual change has a positive effect on achievement in terms of problem-
solving skills (Coletta & Phillips, 2010; Coletta, Phillips & Steinert, 2007a; Hake, 2007).
Purpose of the Study
The purpose of this qualitative grounded theory study was to determine how
representational systems deployed in an IP classroom correspond to epistemological
change in accordance with the ways that students therein think and reason, within a study
sample at Central Arizona College—located in Coolidge, Arizona. The collaborative and
writing-intensive nature of the IP curriculum at Central Arizona College lends itself well
to the research questions and methodology of this study. The use of representational
systems—such as symbols, diagrams, and natural language—is undoubtedly central to
the progress of science education by virtue of its ubiquitous deployment in the realm of
natural science itself (Plotnitsky, 2012; Scherr, Close, McKagan & Vokos, 2012). Given
the cognitive filter that personal epistemology provides for the acquisition and the
application of knowledge (Schommer-Aikins, 2012), it seemed reasonable to investigate
the nature of epistemological change in concert with the thinking and reasoning that
occurs by means of the representational systems associated with a domain of
knowledge—such as IP. The researcher identified the mechanisms of epistemological
change (Bendixen, 2012) as they correspond to thinking and reasoning with MRS. The
value of such knowledge to educational reform efforts is significant in terms of (a) the
specific mechanisms for epistemological change (Bendixen, 2012), and (b) the
psychological constructs that generate them (Hofer, 2012).
Ongoing PER reform efforts—such as the development of assessment instruments
and pedagogical change—will benefit tremendously from knowing the types and
frequencies of deployment for representational systems that are effective for producing
conceptual and epistemological change in IP. Furthermore, the relative frequency of use
coupled with personal stances about the usefulness of those representational systems will
provide the information needed to reform instruction in topics that tend to confuse
students during their learning trajectory.
Research Questions and Phenomenon
The goal of this qualitative grounded theory study was to determine the influence
that multiple representational systems (MRS) have on the thinking and reasoning of 20-
30 community college IP students at Central Arizona College with respect to their
conceptual frameworks and personal epistemology. Forty-four semi-structured interviews
based on instructional goals, survey response data, and student journal entries were
conducted at regular intervals during the study in order to obtain emergent themes
concerning how students think and reason about symbols and operations in mathematics,
as well as how they monitor their own thinking about the same. Journals and semi-
structured interviews—in the form of group Socratic dialogs—reveal the ways in which
students shift between representational systems (languages) in an effort to model
mathematical systems, while providing ample means for triangulating the data in parallel
with field notes and memos made by the author-researcher. Multiple electronic polls were
given throughout the treatment in order to capture opinions about thinking and reasoning,
knowledge acquisition and usage, as well as how concepts and beliefs change as a result.
As shown in the forthcoming review of the literature, thinking and reasoning are
poorly defined and often conflated (Evans, 2012; Evans & Over, 2013; Mulnix, 2012;
Nimon, 2013; Peters, 2007). Given the absence of consensus on the definitions of
thinking and reasoning within the research literature, the author proposed new definitions
for thinking and reasoning as a means for coding, counting, and classifying instances of
student thinking and reasoning with representational systems that were based on the
synthesis of a model for thinking put forward by Paul and Elder (2008). Thinking is
defined as the ability to construct a model, and reasoning is defined as the ability to relate
two or more models. A model is simply any representation of structure, and structure
refers to the way in which relations can be encoded (Hestenes, 2010). The following
research questions were crafted in such a manner as to encompass the gap in the literature
related to the process and mechanisms of epistemological change as they relate to the
psychological constructs of thinking and reasoning within the domain of IP, as well as the
features of Hofer’s epistemic cognition model (Hofer, 2004; Sinatra, Kienhues, & Hofer,
2014) involving the domain of knowledge, the contextual factors of the learning
environment, and how student reflection within the curriculum contributes to
metacognitive monitoring.
Qualitative Research Questions
R1: How do IP students use representational systems in their thinking and
reasoning?
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?
In order to facilitate an investigation of these research questions, a series of
activities comprising the standard curriculum of IP students at a rural community college
was studied. Beginning with group discussions, journals, and surveys on the nature of
Physics and reality, students then began to deploy new representational systems designed
to expose and refine conceptions of number and mathematical operations that are critical
to the language of Physics. These advances were then carried forward to an investigation of
motion that serves as the basis of the entire course. Exit interviews at the semester’s end
reflected on all that was learned and how the conceptual and representational tools used
throughout the course influenced thinking, reasoning, and personal epistemology.
Advancing Scientific Knowledge
As described in the forthcoming literature review, a lack of clarity exists in the
literature concerning the definitions of thinking and reasoning; however, there is an
abundance of claims that all sorts of thinking and reasoning underlie every advance in
human learning. In order to facilitate more efficient data collection, the author introduced
definitions of thinking and reasoning as follows. Thinking is defined as the ability to
construct a model. This definition is (a) flexible enough to encompass any
representational system, (b) straightforward enough to permit the kinds of frequency
distributions and classification schemes that enable direct measurement of this cognitive
behavior, and (c) inspired by the work of PER pioneers cited herein, such as Hestenes,
Hake, Redish, and Mazur. The term model is simply any representation of structure
(Hestenes, 2010), and structure is a broader term—open to wide interpretation—
encompassing the way that interconnectedness between and within systems is articulated.
Furthermore, the term reasoning is defined herein as the ability to relate two or more
models, and therefore coordinates the terms in a manner that lends consistency and
coherence to the measure of these cognitive behaviors by simply counting attempts.
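As an illustration only, the counting of attempts described above might be sketched in Python as follows. The excerpts, student labels, and the coding rule (one model constructed counts as a thinking attempt, two or more models related counts as a reasoning attempt) are hypothetical assumptions made for this sketch; they are not data or procedures taken from the study.

```python
from collections import Counter

# Hypothetical coded excerpts (not data from the study). Each entry lists the
# representations a student constructed or related in one journal or
# interview excerpt, as judged by a human coder.
coded_excerpts = [
    {"student": "S01", "models": ["graph"]},
    {"student": "S01", "models": ["graph", "equation"]},
    {"student": "S02", "models": ["diagram", "equation", "words"]},
    {"student": "S02", "models": []},
]


def classify_attempt(models):
    """Label an excerpt using the definitions in the text (assumed coding rule).

    One model constructed counts as a thinking attempt; two or more models
    related to each other count as a reasoning attempt.
    """
    if len(models) >= 2:
        return "reasoning"
    if len(models) == 1:
        return "thinking"
    return "uncoded"


# Count attempts of each kind across all excerpts.
attempt_counts = Counter(classify_attempt(e["models"]) for e in coded_excerpts)

# Count how often each representational system appears in any attempt.
system_counts = Counter(m for e in coded_excerpts for m in e["models"])

print(attempt_counts)
print(system_counts)
```

A tally of this kind yields the frequency distributions mentioned above, although deciding whether an excerpt actually relates two models, rather than merely mentioning them, remains a judgment made by the coder.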
Little research has been done exploring the particular mechanisms of
epistemological change along developmental trajectories or with respect to the
dimensions of personal epistemology (Bendixen, 2012; Hofer, 2012). Moreover, it was
not known how representational systems influence such change when deployed in
learning environments of any type (Pintrich, 2012). Personal epistemology is linked to
conceptual change (Bendixen, 2012; Hofer, 2012), and representational systems are
required for producing conceptual change (diSessa, 2010). The gap in the literature that
this study addresses is the lack of connections that exists between representational
systems, conceptual change, and epistemological change, and what processes and
mechanisms are productive for such change (Bendixen, 2012; Hofer, 2012; Pintrich,
2012). The persistent question of educational research is ‘what works best and why?’ and
it is the lived experience of learners situated in an IP classroom that should expose their
thoughts and beliefs concerning the representational tools that they use and/or struggle
with when encoding for meaning.
The PER literature speaks extensively to improving the thinking and/or reasoning
skills of students in introductory physics courses (Coletta & Phillips, 2010; Coletta et al.,
2007a; Hake, 2007), without ever providing or relying on a clear definition for thinking
or reasoning in general terms. Thinking and reasoning within the context of problem
solving is part of the functional relationship that exists between the personal
epistemology of students and their learning in general (Lising & Elby, 2005; Schommer-
Aikins & Duell, 2013). The use of representational systems—such as symbols, graphs,
diagrams, and narratives—is undoubtedly central to the progress of science education by
virtue of its ubiquitous deployment in the realm of natural science itself. The evidence
cited herein shows a lack of clarity on the mechanisms of conceptual and epistemological
change as they correspond (1) to one another and (2) to problem-solving skills.
Moreover, it is not clear what sort of thinking and reasoning is being deployed in an
effort to produce those changes in a knowledge-domain requiring MRS (Plotnitsky,
2012). This study addressed all of these concerns at the focal point of epistemological
change, and thus answered the call for clarity and mechanistic description within the
literature.
Significance of the Study
The role of representational systems is believed to be a factor in promoting
conceptual and epistemological change in settings such as Introductory Physics
classrooms (Brewe et al., 2013), as well as learning in general (Lising & Elby, 2005;
Pintrich, 2012). This research sought to understand (1) what, if any, connection(s) exist
between thinking and reasoning with MRS and epistemological change—as prescribed in
the research questions, and then (2) begin to unravel the types and numbers of
representational systems that are effective for promoting those changes by specifying the
mechanisms (Bendixen, 2012) and processes (Hofer, 2012) found therein. The value of
such knowledge to educational reform efforts is significant, as it identified specific
mechanisms for epistemological change (Bendixen, 2012) in terms of the psychological
constructs that generate them (Hofer, 2012), as well as the epistemic resources for
conceptual formation (Bing & Redish, 2012; Wiser & Smith, 2010) and change
(Jonassen, Strobel, & Gottdenker, 2005) within learning environments designed for
epistemic change (Muis & Duffy, 2013).
The importance of epistemological change for this study is evident in its close
connection to the field of conceptual change (diSessa, 2010) and how they are
coordinated in PER through the use of representational systems (Brewe et al., 2013).
Moreover, epistemological change would be better understood in terms of the influence
of representational systems (Pintrich, 2012) and the incremental processes associated
with conceptual change (diSessa, 2010), while also helping to address the lack of theoretical
clarity that persists in defining each of these constructs (Hofer, 2012; Pintrich, 2012). A
secondary goal, inextricably linked to the primary goal, is to clearly distinguish
thinking and reasoning from one another and to show how MRS are used to encode the meaning
evident in those constructs. Such a discovery has the potential for providing a general
metric for the constructs of thinking and reasoning in any domain of knowledge with
respect to the representational systems that accompany it.
Personal epistemology has connections with multiple fields of psychology and
learning science including conceptual change (diSessa, 2010; Jonassen et al., 2005;
Nersessian, 2010), metacognition (Barzilai & Zohar, 2014; Bromme, Pieschl, & Stahl,
2010; Hofer, 2012; Hofer & Sinatra, 2010; Mason & Bromme, 2010; Muis, Kendeou &
Franco, 2011), self-regulated learning (Cassidy, 2011; Greene, Muis & Pieschl, 2010;
Muis & Franco, 2010), and self-efficacy through locus of control (Cifarelli, Goodson-
Espy, & Jeong-Lim, 2010; Kennedy, 2010). Each of these constructs or cognitive
functions is communicated through representational systems that students presumably
think and reason about along their way to an understanding that shapes their set of
personal beliefs. Research that seeks to obtain a deeper understanding of the processes
and mechanisms associated with changes along any of those dimensions will have a
lasting impact on multiple areas of psychology and learning science in general.
The PER community has promoted, created, and uncovered a vast array of IE
methods that have surely improved learning outcomes in IP classrooms (Coletta &
Phillips, 2010; Coletta et al., 2007a; Hake, 1998; Hestenes, 2010)—and therefore some
sort of cognitive behavior. So while there is little doubt that some sort of thinking and/or
reasoning is going on while students are learning any topic, it is not clear in the literature
what the specific qualities of thinking and reasoning are when it comes to learning in IP.
Given the deep connections that exist between metacognition and epistemological
frameworks (Barzilai & Zohar, 2014), the effort to obtain the factors of epistemological
change in terms of the tools that are instrumental to that effect presents a grand
opportunity to the teaching and learning enterprise.
Rationale for Methodology
A qualitative approach was used in this study. The foundations of qualitative
research rest on the inductive analysis that makes developing an understanding of the
phenomena from the viewpoint of the participants possible (Merriam, 2010) in a manner
that respects how the meaning is constructed in social settings (Yin, 2011) where the
researcher is the primary data collection instrument responsible for producing a richly
descriptive account of the outcomes (Merriam, 2010). Given the nature of the study on
personal epistemology—beliefs about knowledge and its acquisition—and how students
obtain advances in personal epistemology, qualitative methods lend themselves best to
the project described herein because they provide a richer description (Schommer-Aikins,
2012) of the lived experience of the study participants (Charmaz, 2006; Glaser &
Strauss, 2009). Moreover, given that the research design was grounded theory, the
necessity of qualitative methodology for data collection and analysis is properly
constrained within this methodology by virtue of its underlying logic and interpretive
framework (Charmaz, 2006).
Nature of the Research Design for the Study
A grounded theory approach (Charmaz, 2006) was used in designing this
qualitative study in order to produce a substantive theory capable of describing the
complex interactions that comprise the phenomena of thinking and reasoning with MRS,
and its influence on epistemological change within the context of a community college IP
classroom. Grounded theory is a qualitative design that allows a researcher to form an
abstract theory of processes or interactions that are grounded in the views of the
participants (Charmaz, 2006; Glaser & Strauss, 2009). Given the fact that personal
epistemology is entirely about personal beliefs and viewpoints, a grounded theory
exploration of the underlying mechanisms and processes of epistemological change is
entirely consistent with the research questions probing how students think and reason
their way towards epistemological change using MRS.
Approximately 30 students comprised the study population from which archived
data were drawn at Central Arizona College, which is consistent with the 20-30 study
participants recommended for grounded theory research by Creswell (2013), and the 30-
50 participants suggested by Morse (2000). Charmaz (2006) suggested that 25 interviews
are sufficient for grounded theory designs on small projects. Given the current study is
using interviews, written journals, and electronic polls, a group of slightly more than 30
student participants should be more than adequate for obtaining the level of theoretical
saturation which is the ultimate criterion for sample size in grounded theory designs
(Corbin & Strauss, 2008). The archived data in this study included numerous student
journals throughout the IP curriculum, group discussion transcripts, and miscellaneous
assessment results—such as Force Concept Inventory (FCI), and the Psycho-
epistemological Profile (PEP)—that are all part of the normal classroom experience of IP
students at Central Arizona College, which were selected purposively due to their
suitability for the study and amount of data available for the researcher. In order to
eliminate as much researcher bias as possible, archived data was used.
Grounded theory design was selected because of its capacity to capture in theory
the ‘how’ of structure and process within a social setting, versus a phenomenological
‘what’ of the events (Birks & Mills, 2011). The research questions proposed for this
study ask how MRS are used in the processes of thinking and reasoning within
epistemological change, and thus fall under the heading of grounded theory by virtue of
the research question itself—which seeks an answer to a how type of question. In order to
make the connection to epistemological change in these terms though, a certain amount
of discourse analysis is required. However, discourse analysis alone cannot answer the
‘how’ questions because such a design is methodologically constrained to the meaning
that is negotiated in the ‘what’ of language rather than the process of negotiating meaning
with language itself (Yin, 2011). Though phenomenology, discourse analysis, and
grounded theory come from different historical and philosophical traditions, the
boundaries between them are somewhat porous in terms of the methodology required for
a particular kind of research question (Yin, 2011), as well as the fact that the elements of
one type of question—such as a ‘how’ question—often entail elements of another type of
research question, such as the ‘what’ type (Starks & Trinidad, 2007).
Since this study’s research questions probe how students use MRS in
their thinking and reasoning for epistemological change, the importance of using
grounded theory as a tool for grounding the theory in the particular viewpoints of the
participants (Charmaz, 2006; Glaser & Strauss, 2009) further solidifies the primacy of
grounded theory over other designs—such as discourse analysis. Personal epistemology
obviously pertains to personal viewpoints, which must be expressed in language. The
language used by IP students is situated in social contexts constrained by the MRS that
are conventionally used within Physics, such as graphs, equations, pictures, and words
(Plotnitsky, 2012). In this case, the viewpoints that are the central focus of personal
epistemological change are developing within a context that can only be described using
a limited set of representational systems. The connection between the personal and social
aspects of the learning environment for this study, in parallel with the particular uses of
MRS (language), was far too intimate to ignore.
Definition of Terms
Conceptual change. In order to define conceptual change, one must first define a
concept. In general, it is the internal representation that learners construct for themselves
based on the external representations of others (Nersessian, 2010). Conceptual change is
measured on many levels from the taxonomic and semantic aspects of how symbols are
related to referents, as well as how those representations correspond to more complex
conceptual structures such as an event (Hestenes, 2010).
Multiple representational systems (MRS). The use of words, symbols, and
pictures in order to communicate an idea or present a model is described as multiple
representations (Fyfe, McNeil, Son, & Goldstone, 2014; Harr et al., 2014), multiple
external representations (Fyfe et al., 2014; Wu & Puntambekar, 2012), and multiple
representational systems (Ainsworth, Bibby, & Wood, 2002) in the literature.
Personal epistemology. The psychological construct of personal epistemology is
used to describe how personal beliefs relate to what knowledge is, how it is obtained,
what it is used for, and how useful it is in any context (Hofer & Pintrich, 2012).
Assumptions, Limitations, Delimitations
The following assumptions are given with respect to this study.
1. It was assumed that survey participants in this study were not deceptive
with their answers, and that the participants answered questions honestly and
to the best of their ability. The course and curriculum under study was the
regular curriculum for Physics students at Central Arizona College, and
therefore part of the normal experience that counts for a grade.

2. It is assumed that this study was an accurate representation of what is typical
in IE IP classrooms. The instructor (the author) has been trained in IE methods
for the last decade and has based his research on the best practices of the PER
community.

The following limitations/delimitations apply to this study. The generalizability of
the findings that emerge from this study is limited to the IE class of IP classrooms
typically studied by the PER community. According to Merriam (2010), generalizability
in qualitative research must be thought of differently than it is in quantitative designs.
External validity is the qualitative equivalent of generalizability, and is constrained by the
perception that users of the research have with regard to the transferability to another
context or domain of knowledge. The author makes no claims with regard to
generalizability in this study aside from the likelihood that this design could produce
similar results in other IE IP classrooms. This limitation is consistent with the theoretical
and pedagogical norms that persist in that category of instructional practice. One long-
term goal of this dissertation is to prepare for the development of a learning theory,
which requires a great deal more than is typically contained in just one dissertation.
1. The student body was not randomly selected. This qualitative study depends
on purposive sampling of qualified students, which was obtained by
identifying students who meet the pre-requisites for taking physics for
university transfer purposes.

2. The study population was limited to two Physics courses at one community
college. The author-researcher has no other access to students.

Summary and Organization of the Remainder of the Study
Conceptual change and epistemological change are connected by the
representational systems used by learners when deploying them in contexts that require
modeling (Hestenes, 2010; Nersessian, 2010). Learning physics requires thinking and
reasoning within a context for problem solving where beliefs about the world are
regularly challenged (Lising & Elby, 2005). However, there is no clear definition of the
terms thinking and reasoning (Nimon, 2013; Peters, 2007) even though scores of types of
thinking are well attested within the literature—specifically with respect to this study:
scientific thinking and reasoning within the context of learning physics (Coletta et al.,
2007a, 2007b; Hake, 1998; Hestenes, 2010).
Chapter 2 presents a review of current and historical research on the connections
that exist between thinking, reasoning, representational systems, conceptual change and
epistemological change, as well as the theoretical foundations underlying the present
study. Chapter 3 describes the methodology and research design for a generic qualitative
design, and the data collection and analysis procedures for this investigation. Chapter 4
delivers the actual data analysis with written and graphic summaries of the results, which
lead into an interpretation and discussion of the results, as they relate to the existing body
of research related to the dissertation topic.
The timeline for completing this dissertation consists of three primary stages. In
stage one, the proposal is completed and approved by August 13, 2014—the end of
PSY955 Dissertation 1, and subsequently approved in PSY960 Dissertation 2. Data
collection begins immediately in PSY960 Dissertation 2 in conjunction with the start of
the courses being studied at Central Arizona College that begin on August 18th. The
analysis phase began subsequent to the approved Proposal in July 2015, and the data
analysis was completed during PSY969 Research Continuation 4. The remainder of the
dissertation was completed during PSY970 Research Continuation 5 in January 2016.
Chapter 2: Literature Review
Introduction to the Chapter and Background to the Problem
The basic premise of this research was that the use of MRS is the fundamental
feature of the kinds of thinking and reasoning that promote both conceptual and
epistemological change; however, this study was concerned with just the connections that
exist between thinking and reasoning with MRS and personal epistemological change.
Specifically, the premise was that the resources for conceptual change are contingent on the resources for
epistemic change, if not entirely the same. Moreover, the inherent need of
representational systems for communicating meaning is central to conceptual change as
well as the set of personal beliefs that accompany personal epistemology. The extent to
which epistemological change is connected to the deployment of MRS is the central
research focus that is capable of better informing all PER initiatives concerning the
foundations of thinking and reasoning required for this sort of change.
The current state of research on the personal epistemology of learners situated
within different contexts, domains of inquiry, and developmental stages, has not
produced a clear understanding of how those learners (a) develop conceptual knowledge
about the world with respect to (b) their personal beliefs about the world as it (c) relates
to physics (Hofer, 2012). However, there is evidence showing that when the science
pedagogy matches the science practice, then students are more likely to obtain positive
conceptual change based on the features of instruction and curricular content upon which
student beliefs about the world are formed (Lee & Chin-Chung, 2012). Conceptual
change research has also failed to produce clear understanding of how learners develop
conceptual knowledge about the world, and suffers from a punctuated view of conceptual
change that has been dominated by pre-post testing strategies rather than process studies
(diSessa, 2010).
According to Hofer (2012), future research needs to find relations between
psychological constructs and epistemological frameworks. Bendixen (2012) suggested
that little research on the processes and mechanisms of epistemological change has been
done, and echoed the call by Hofer and Pintrich (1997) for more qualitative studies
examining the contextual factors that can constrain or facilitate the process of personal
epistemological theory change. The general call for studies probing the connections that
exist between conceptual and epistemological change, as well as the processes and
mechanisms that are productive for those changes, is clearly warranted by these findings.
Moreover, the PER literature also includes studies into the connection between personal
epistemology and conceptual change in terms of representational systems (Brewe et al.,
2013; Lising & Elby, 2005), lending further warrant to the study proposed herein.
Though the particular research questions for this study were focused on epistemological
change, the findings cited thus far warrant a discussion of conceptual change in this
literature review.
Wiser and Smith (2010) describe some of the deep connections that exist between
concept formation, ontology, and personal epistemology, within a framework of
metacognitive control that is central to modeling phenomena through both top-down
(perceptions influenced by prior knowledge) and bottom-up (perceptions influenced by
new data) mental processes. These sorts of cognitive developments depend on the ability
to use representational systems that are rational (mathematics) and/or metaphorical
(natural language), within a methodological context that is empirical (measurement) in
nature. The student’s transition from naïve to expert theories occurs by means of
representational systems that serve in part as epistemic resources for modeling real-world
phenomena (Bing & Redish, 2012; Hestenes, 2010; Moore et al., 2013). Moreover, it is
the coupling of internal representations (mental models) with the external representations
that we call models, which is critical to the reasoning process and its assessment
(Nersessian, 2010). These findings suggest an intimate connection between personal
epistemology and representational systems as they function in concert with thinking,
reasoning, and conceptual change; however, they do so without specifying any particular
tools. The central aim of this research was to identify some of the most basic
representational tools that are instrumental for epistemological change.
The Physics Education Research (PER) community has claimed significant gains
in student thinking and reasoning (Coletta & Phillips, 2010; Coletta et al., 2007a; Hake,
2007) through conceptual change (Hake, 1998), without ever defining what is meant by
the terms thinking and reasoning. As shown in the forthcoming review of the literature,
thinking and reasoning are poorly defined and often conflated (Mulnix, 2012; Nimon,
2013; Peters, 2007). In order to facilitate more efficient data collection, the author
introduced definitions of thinking and reasoning as follows. Thinking is defined as the
ability to construct a model. This definition is (1) flexible enough to encompass any
representational system, (2) straightforward enough to permit the kinds of frequency
distributions and classification schemes that enable direct measurement of this cognitive
behavior, and (3) inspired by the work of PER pioneers cited herein, such as Hestenes,
Hake, Redish, and Mazur. The term model is simply any representation of structure
(Hestenes, 2010), and structure is a broader term—open to wide interpretation—
encompassing the way that interconnectedness between and within systems is articulated.
Furthermore, the term reasoning is defined herein as the ability to relate two or more
models, and therefore coordinates the terms in a manner that lends consistency and
coherence to the measure of these cognitive behaviors by simply counting attempts.
Though the reform movement as studied in the PER literature has obtained
notable success (Coletta et al., 2007a, 2007b; Hake, 1998), one question that emerges
from the gaps in this body of literature, as well as the persistent conflation of the terms
thinking and reasoning that is common to both the literature and the discourse of math
and science education research (Glevey, 2006), is what exactly do we mean by thinking
and reasoning?
The search terms “definition of thinking” OR “thinking is defined” in scholarly
journals whose names include psycholog* OR cogn* OR educ* yielded only 118 peer-
reviewed articles from 1963 to 2014 in EBSCO Academic Search Complete (EBSCO),
and 47 articles from 1991 – 2014 in ProQuest—as illustrated below in Table 1.
Table 1

Literature Review Search Pattern 1
Date Range     Type of thinking   Hits in EBSCO   Hits in ProQuest
1963 – 2014    Critical           27              -
               Other              91              -
1991 – 2014    Critical           -               23
               Other              -               24

These initial search results indicate that critical thinking has dominated the field of
research, with a share that remained stable over the years since 1963 yet has waned in recent years. In
both databases, the table entry for “other” is predominantly filled with n = 1 tallies, while
the remainder are n = 2. In other words, somewhere between one-half and three-quarters
of published research on thinking is scattered among scores of types of thinking distinct
from critical thinking, or types of thinking such as schizophrenic thinking, that are not
applicable to this study. Table 2 below illustrates a more recent tally for the search terms
given above.
Table 2

Literature Review Search Pattern 2
Date Range     Type of thinking   Hits in EBSCO   Hits in ProQuest
2004 – 2014    Critical           16              5
               Other              70              15
2009 – 2014    Critical           8               3
               Other              31              8

Applying the same search criteria for a definition of reasoning yielded only 31
articles in EBSCO for the years 1981 – 2014, and 6 articles in ProQuest for the years
1996 – 2014. Restricting the years to 2004 – 2014 produced only 4 ProQuest and 21
EBSCO articles, whereas a 2009 – 2014 search obtained only 3 ProQuest and 9 EBSCO
articles. Changing the search constraints in both databases to just the term “thinking”
produced 946 EBSCO and 652 ProQuest articles for the years 2004 – 2014. This means
that at best, roughly 9% of all research making claims about thinking in the last 10 years
operated with a clear definition of the term. An identical search for the term “reasoning”
produced 601 EBSCO and 152 ProQuest articles—indicating that approximately 3% of
research articles in the last 10 years made claims about reasoning without the aid of a
basic definition.
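The percentages quoted above follow from the reported hit counts, and the arithmetic can be checked with a short Python sketch. The dictionary layout and the function name share_with_definition are illustrative assumptions rather than part of the original search procedure, and the counts for thinking are taken as the sum of the Critical and Other rows of Table 2 for 2004 – 2014.

```python
# Hit counts for 2004-2014 as reported in the text and in Table 2.
# "defined" = articles matching the definition search terms;
# "total"   = articles matching the bare search term.
counts = {
    "thinking": {
        "EBSCO": {"defined": 16 + 70, "total": 946},
        "ProQuest": {"defined": 5 + 15, "total": 652},
    },
    "reasoning": {
        "EBSCO": {"defined": 21, "total": 601},
        "ProQuest": {"defined": 4, "total": 152},
    },
}


def share_with_definition(defined: int, total: int) -> float:
    """Return the percentage of articles that matched the definition search."""
    return 100.0 * defined / total


# Percentage of articles with an explicit definition, per term and database.
shares = {
    term: {db: share_with_definition(**c) for db, c in dbs.items()}
    for term, dbs in counts.items()
}

for term, dbs in shares.items():
    for db, pct in dbs.items():
        print(f"{term:9s} {db:8s} {pct:4.1f}%")
```

Under these assumptions, the largest share for thinking comes from EBSCO (about 9%), which matches the "at best" figure in the text, while both databases give roughly 3% for reasoning.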
This review of the literature was structured in terms of how the theories and the
histories of conceptual and epistemological change correspond to the progress of learning
in general, and physics in particular. Though the study was particularly focused on
epistemological change, research on thinking and reasoning within the context of IP has
historically been concerned with conceptual change. However, a great deal of research in
personal epistemology has occurred within IP classrooms; hence the need to give
attention to both conceptual and epistemological change in this literature review.
Additionally, the constructs of thinking and reasoning were considered in general
psychological terms as well as the particulars of Physics education. Self-regulated
learning, self-efficacy, metacognition, and student journaling converge on the
aforementioned theoretical aspects of this study in terms of conceptual and
epistemological change, as well as classroom management and the curriculum used by
the study sample.
The foundations of this study were both theoretical and conceptual, consisting of
the constructs of personal epistemology, thinking and reasoning, and representational
systems—as well as the connections that exist between them and conceptual change,
metacognition, self-efficacy, self-regulated learning, and locus of control. Though the
study was focused on personal epistemology, the entailments listed herein are given
treatment in this chapter in accord with how they influence the study and research
questions. Personal epistemology, thinking and reasoning, and representational systems
were the central focus of the two research questions that are given in the so-named
subsections of the section titled theoretical and conceptual foundations. Metacognition,
self-efficacy, self-regulated learning, and locus of control were factors of the study
environment by virtue of research demonstrating that particular pedagogical and
curricular interventions, such as journaling, lead to changes in these same
constructs, and are covered in the subsection titled self-efficacy, self-regulation, and
journaling. In this way, they served as conceptual foundations for the study in terms of
what to expect in the data analysis phase.
The literature review section of chapter two builds on the theoretical and
conceptual foundations as they apply first to epistemological and conceptual change in
general, and second to how thinking and reasoning within the context of IP contribute to
personal epistemological change within that domain, and perhaps in general. This section
begins with brief histories of personal epistemology research and the attempts to assess
this construct, followed by a discussion of how personal epistemology and conceptual
change intersect as fields of research. The remainder of the literature review consists of
subsections addressing conceptual change in IP, personal epistemology in IP, and
thinking and reasoning in IP classroom settings.
Theoretical Foundations and Conceptual Framework
Personal epistemology. Piaget’s cognitive developmental process of
equilibration (Piaget, 1970) is—from a historical perspective—central to the theoretical
underpinnings of what personal epistemology researchers call epistemological
advancement (Bendixen, 2012). Hofer (2004) suggested the concept of epistemic
metacognition as a way to understand how students shift beliefs through reflection, while
epistemic beliefs also constrain and/or advance conceptual change.
In either case, the domain of knowledge and the educational context determine the
direction and magnitude of such transitions in personal epistemology, as well as its
overall advancement for the student. Scientific reasoning is naturally recursive by virtue
of the fact that empirical investigations challenge the models and hypotheses put forward
by scientists, thus forcing the type of declarative metacognition (Hofer & Sinatra, 2010)
that influences personal beliefs. IE physics classrooms attempt to simulate the behavior
of a scientific community by virtue of a discourse that is based on inquiry, collaboration,
and consensus building (Bruun & Brewe, 2013; Hestenes, 2010; Irving & Sayre, 2014).
Moreover, the very nature of an IE physics classroom relies on leveraging
representational systems in order to produce a change in beliefs about the real world
through conceptual change. However, in practice, conceptual change interventions differ
from epistemological change interventions by virtue of the fact that conceptual change
instruction seeks to merely confront and change existing beliefs, whereas epistemological
change instruction seeks to influence how beliefs direct learning and the enactment of
epistemology in the classroom (Ding, 2014). Epistemic recursion is therefore a key factor
in scientific advance, and is one way to understand Hofer’s conceptual model of
epistemic metacognition—which served as the conceptual framework of this study.
Thinking and reasoning. In the new paradigm for the psychology of reasoning,
probability, rather than logic, is the rational basis for understanding all human inference
(Pfeifer, 2013). Moreover, thinking and reasoning are coupled through the new paradigm
in dual-process theories by virtue of the fact that Type 2 (reflective process) thinking is
defined as enabling “us to reason by supposition, engaging in hypothetical thinking and
mental simulation decoupled from some of our actual beliefs” (Evans & Over, 2013),
whereas Type 1 (intuitive) thinking is fast and automatic and concerns the feelings of
confidence that accompany answers or decisions (Evans, 2012). Common definitions of
the term thinking refer to particular cognitive processes such as transformations of mental
representations (Holyoak & Morrison, 2012; Sinatra & Chinn, 2011), or even cognition
as a general process (Nimon, 2013), whereas reasoning has become synonymous with
cognitive processing in general (Evans, 2012) via memory and reasoning for decision
making in social habitats (Rai, 2012). Mulnix (2012, p. 477) conflates the terms thinking
and reasoning by stating, “Critical thinking is the same as thinking rationally or reasoning
well.” Such definitions are clearly circular, and therefore do nothing in the effort to
clarify what is meant by the psychological construct, much less the neuropsychological
reality in terms of neurons and various regions of the brain.
Piaget operationalized the construct of thinking in terms of developmental stages,
whereas more modern cognitivists adopted an information-theoretic approach based on
brain waves, and connectionist notions of neural systems (Peters, 2007). Vygotsky
defined it simply as dialog (Fernyhough, 2011). In describing the conceptions of
philosophers such as Hegel, Heidegger, Kant, and Wittgenstein, Peters (2007) listed
thinking as representation, opinion-making, scientific problem-solving, revealing what is
concealed, and concept-making—thereby covering most of psychology in the broadest
sense of the term, while giving little by way of specific mechanisms.
These assertions made within the literature suggested a need for greater clarity in
defining both thinking and reasoning before any progress can be made in measuring these
constructs. However, according to Elder and Paul (2007b), all thinking consists of the
following eight elements: generating purpose(s), raising questions, using
information, utilizing concepts, making inferences, making assumptions, generating
implications, and embodying a point of view. Elder and Paul affirm the common treatment
of thinking and reasoning as virtually synonymous terms in their assertion that “whenever
we think, we reason” (Elder & Paul, 2007b, “All Humans Use Their Thinking”, para. 2).
In other words, thinking is merely a stage of reasoning in the model put forward by Elder
and Paul. Reasoning is then defined as a sense-making and conclusion-making process
conducted by the mind, based on reasons—implying an “ability to engage in a set of
interrelated intellectual processes” (Elder & Paul, 2007b, “All Humans Use Their
Thinking”, para. 5), such as the eight elements of thinking already given herein. One
distinguishing factor of thinking relative to reasoning in the model offered by Elder and
Paul is that thinking is what agents do when making sense of the world, whereas
reasoning is how agents are able to come to decisions about the elements of their thought.
In an attempt to explain how the human mind learns, Elder and Paul (2007a)
define thinking in even more general terms as the process by which we take control of the
mind in an effort to figure things out. Moreover, these thoughts influence our feelings,
and thus the way that we interpret and come to believe various things—in other words,
thinking informs our viewpoints. Given the consistency of this model with the general
scope of personal epistemology, the models and definitions for thinking and reasoning by
Elder and Paul described herein will serve as a conceptual foundation for what is meant
in this study by the constructs of thinking and/or reasoning.
The model put forward by Elder and Paul (2007b) contains 35 dimensions of
critical thought consisting of 9 affective dimensions, and 26 cognitive dimensions broken
into 17 macro-abilities, and 9 micro-skills. Point of view, questioning, assumption
making, and using information are four of the eight elements of thought that also appear
within the cognitive dimension macro-abilities. Of the remaining four elements of
thought, only inference making appears in the cognitive dimension micro-skills set. No
other elements of thought are clearly listed within the 35 dimensions, although each of
the key terms is; for example, the exploration of implications is listed as a cognitive
micro-skill, but the ability to generate implications is specified in the eight elements of
thought.

Figure 1. The eight elements of thought.
All thought, according to Paul and Elder (2008), consists of eight unique elements that are
situated within a particular context.

Mulnix (2012) affirms the equivalence of thinking and reasoning that Elder and
Paul (2007a) assert, whereas Evans (2012) places thinking at the heart of decision-
making and reasoning, as Elder and Paul suggest—in particular, that the process of
thinking generates the reasons that the process of reasoning then bases its conclusions on.
Holyoak and Morrison (2012, p. 1) define thinking as “the systematic transformation of
mental representations of knowledge to characterize actual or possible states of the world,
often in service of goals,” which is essentially goal-directed modeling as defined herein.
These convergences in definitions for the constructs of thinking and reasoning suggest a
recent emergence of coherence in the field that is useful for the purposes of this study.

Figure 2. The eight elements of scientific thought.
The general elements of thought remain unchanged when applied in a particular
context—such as natural science.
1 From The Miniature Guide for Students and Faculty to Scientific Thinking (Kindle section title Why
Scientific Thinking?), by L. Elder and R. Paul, 2008, Copyright 2008 by the Foundation for Critical
Thinking… Reprinted with permission.

Building a conceptual model for this study. The transition from general thought
to scientific thought is in the specificity of context (Paul & Elder, 2008). In an effort to
distinguish thinking from reasoning, the author proposed the following definitions of
thinking and reasoning as conceptual bases for coding evidence of the same throughout
this study. Thinking is hereby defined as the ability to construct a model, which is one
of the items within the elements of scientific thought given by Paul and Elder, called
concepts. This definition is (1) flexible enough to encompass any representational
system, (2) straightforward enough to permit the kinds of frequency distributions and
classification schemes that enable direct measurement of this cognitive behavior, and
(3) inspired by the work of PER pioneers cited herein, such as Hestenes, Hake, Redish,
and Mazur. The term model is simply any representation of structure (Hestenes, 2010),
and structure is a broader term—open to wide interpretation—encompassing the way that
interconnectedness between and within systems is articulated. Furthermore, the term
reasoning is defined herein as the ability to relate two or more models; this definition
therefore coordinates the two terms in a manner that lends consistency and coherence to
the measure of these cognitive behaviors by simply counting attempts. This definition for
reasoning is
consistent with two of the elements of scientific thought suggested by Paul and Elder:
scientific implications and consequences, and scientific point of view.
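Because both definitions reduce to countable events, the high-level coding can in principle be tallied mechanically. The Python sketch below is only an illustration under stated assumptions: each coded excerpt is assumed to list the models a student constructed (one thinking attempt each) and the explicit links drawn between two models (one reasoning attempt each). The excerpt data, the field names, and the function name tally_attempts are hypothetical and are not study data or the study's coding procedure.

```python
from collections import Counter

# Hypothetical coded excerpts (not study data). Each entry lists the models a
# student constructed and the explicit links drawn between pairs of models.
coded_excerpts = [
    {"models": ["graph", "equation"], "links": [("graph", "equation")]},
    {"models": ["diagram"], "links": []},
    {
        "models": ["words", "diagram", "equation"],
        "links": [("words", "diagram"), ("diagram", "equation")],
    },
]


def tally_attempts(excerpts):
    """Count thinking attempts (models built) and reasoning attempts (models related)."""
    tally = Counter()
    for excerpt in excerpts:
        tally["thinking"] += len(excerpt["models"])
        tally["reasoning"] += len(excerpt["links"])
    return tally


tally = tally_attempts(coded_excerpts)
print(tally["thinking"], tally["reasoning"])
```

Keeping models and links in separate fields reflects the distinction drawn above: constructing a model is counted independently of relating it to another model.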
In the model for scientific thought shown in Figure 2 above, axioms are part of
the assumptions that are made rather than the result of any process, whereas in Physics,
axioms are used in order to generate new ones, as well as being part of fundamental
assumptions. Moreover, there are implications and consequences associated with axiom
development—also an element of scientific thought—that must be accounted for. The
right-hand side of Figure 2 is largely empirical in nature, whereas the left-hand side is
rational. To the degree that reasons are generated by thinking about scientific information
in the form of data and observations, and if decisions about the interrelatedness of those
reasons are what comprise reasoning, then the model for scientific thought put forward by
Paul and Elder (2008) already has natural divisions for the constructs of thinking and
reasoning as defined herein by the author. Given that the author’s definitions are primarily
for high-level coding that is consistent with the practice of physics in an IP classroom, the
fine-grained distinction put forward in the model by Paul and Elder served as additional
theoretical codes used in the data analysis.

Language is the primary means by which human beings encode for meaning. The
academic setting of an Introductory Physics (IP) classroom requires an array of
languages—or what this study calls multiple representational systems (MRS). Words,
symbols, graphs, and diagrams encode various kinds of meaning depending on the
context of student inquiry. The following section addresses this topic with a view to how
encoding for meaning with MRS corresponds to thinking and reasoning.
Representational systems. Representational tools and systems have the capacity
to encode information (Fekete, 2010), promote conceptual change (Johri & Lohani, 2011;
Johri & Olds, 2011), as well as direct inquiry (Moore et al., 2013), scaffold learning
(Eitel et al., 2013), and facilitate the process of knowledge construction (Kolloffel,
Eysink, & Jong, 2011). Fekete (2010) suggested that representations are simply the
realization that there exists an isomorphism (one-to-one relationship) between the
conceptual/perceptual domain, and the activity space where representation occurs.
Activity spaces are technically defined as “spatiotemporal events produced by dynamical
systems” (Fekete, 2010, p. 69), and neural systems in the human brain mimic those
dynamical systems to some degree. The dynamical systems approach is conceptually
equivalent to using most any marker, or token, to describe one thing in terms of
another—which is the general practice of Physics (Plotnitsky, 2012; Wu & Puntambekar,
2012).
Hestenes (2010) deployed multiple types of representations for encoding structure
in terms of systemic (links among interacting parts), geometric (configurations and
locations), object (intrinsic properties), interaction (causal), and temporal (changes in the
system) as ways to model and categorize the observations that students of science make in
an effort to mimic the expert view. In these ways, MRS are instrumental for modeling the
structure of physical phenomena, and therefore serve as evidence of what students
believe the varied representational conventions of mathematics and physics are capable
of describing. Their status as mechanisms of epistemological change is the primary
research question at hand.
Waldrip and Prain (2012) have qualitatively tested an intervention that relies
heavily on representational systems in an effort to promote scientific reasoning as a
cognitive activity that involves thinking by means of constructing representations, and
subsequently judging them for their efficacy—which under the model for scientific
thought proposed by Paul and Elder (2008), is both thinking by representation, and
reasoning through judgment of those thoughts. The results they obtained indicate that an
interactive environment where observed phenomena are tested and re-tested, represented
and re-represented, and evaluated through group collaborations that give opportunities to
defend and judge hypotheses, positively influences student confidence and engagement.
The distinction that Fekete (2010) offers in terms of how representations relate to their
encodings is part of the conceptual basis for thinking and reasoning as defined by Paul
and Elder (2008), and described in learning environments by Waldrip and Prain (2012).
Moreover, the features of models in Physics—such as systemic, geometric, object,
interaction, and temporal (Hestenes, 2010)—serve as very particular and fine-grained
conceptual distinctions to be coded for in the qualitative analysis of student artifacts in
this study.
Prior knowledge influences the top-down thinking and reasoning that students
bring to learning habitats where new information found therein is designed to promote
bottom-up forms of thinking and reasoning for conceptual, and potentially
epistemological change. However, as shown in the next section, epistemic beliefs are
strong motivators for and against self-regulated learning. In other words, certain beliefs
either promote or stifle the types of thinking and reasoning that are required for learning.
Self-efficacy, self-regulation, and journaling. The epistemic beliefs that
students have concerning the development of scientific knowledge directly influence the
acquisition of that knowledge, and therefore the achievement that shepherds self-concept
and self-efficacy when learning in the scientific domain (Mason, Boscolo, Tornatora, &
Ronconi, 2012; Sawtelle, Brewe, Goertzen, & Kramer, 2012). Cassidy (2011) points out
the fact that academic control is one factor within the complex of self-regulated learning
that competes with a student’s self-evaluation—such as the belief that learning is
dependent on the amount of struggle involved with academic endeavors and inborn traits
such as intelligence (Koksal & Yaman, 2012). Achievement gaps narrow in classrooms
where extensive reading and writing are organic to an engaging experience that
contributes to enhanced motivation, self-efficacy, and locus of control—which are
essential components of active learning and achievement in academic settings (Kennedy,
2010). Moreover, the likelihood that a student will deploy any particular representational
medium—journal or otherwise—depends on factors such as motivation, goal orientation,
self-regulation, and general interest in the domain of knowledge relevant to the setting
(Bodin & Winberg, 2012; Kennedy, 2010). Therefore, providing students with an
opportunity to defend their strategies through discussion and written journals is helpful in
promoting the kinds of self-advocacy that catalyze self-regulated learning (Cifarelli,
Goodson-Espy, & Jeong-Lim, 2010; Muis & Duffy, 2013). Furthermore, metacognitive
monitoring, self-efficacy, and self-regulated learning are optimized when the
epistemological domain of a learner and the epistemology of the domain focus are
matched—such as a rationalist in a mathematics setting (Muis & Franco, 2010).
The aforementioned findings served as a broad conceptual and theoretical
foundation for this study by virtue of the fact that data collection and pedagogy within the
study environment match the general features described therein. Journaling and
collaboration are the central features of the classroom environment where thinking and
reasoning with MRS is being deployed. Muis and Franco (2010) linked metacognitive
monitoring, self-efficacy, self-regulated learning, and epistemology in ways that are
consistent with Hofer’s epistemic metacognition model (Hofer, 2004)—which also
served as the overarching conceptual framework for this study. The connections that exist
between metacognition, epistemology, and self-regulated learning (Barzilai & Zohar,
2014; Greene, Muis, & Pieschl, 2010) are relatively new in the literature (Hofer &
Sinatra, 2010), but nonetheless warranted attention in this study given their connections
to the primary data collection method of student journals.
Convergence of conceptual and theoretical foundations. Epistemic beliefs are
typically expressed in the form of language. Within the field of
Physics—and thus an IP classroom—MRS serve as the languages by which a learner is
able to encode for meaning, and therefore transmit in writing or in narratives their own
epistemic stance. Thinking and reasoning are unavoidable cognitive activities for both
conceptual and epistemological change, and are necessarily metaphorical, empirical, and
rational in the context of Physics. The efficacy of journal activities to generate self-
efficacy and self-regulated learning through metacognitive monitoring (Muis & Franco,
2010), while simultaneously affording the author-researcher a corpus of student artifacts
employing MRS, provides an equally fertile source of data for analysis of student
thinking and reasoning. In these ways, the model for scientific thought (Elder & Paul,
2007b) corresponds with the advance of self-efficacy and self-regulation that is consistent
with epistemic change in scientific domains of knowledge (Mason et al., 2012; Muis &
Duffy, 2013; Sawtelle et al., 2012). Moreover, the use of journals and interviews provides
ample opportunity for the kinds of student reflection that reveal the connections between
conceptual and epistemological change through what Hofer (2004) described as epistemic
metacognition.
Review of the Literature
A brief history of personal epistemology research. Personal Epistemology (PE)
has been an expanding field of inquiry for at least 40 years, with a handful of models
and theories coalescing in the late 1990s to early 2000s, such as
process and developmental models (Bendixen, 2012), and at least four different
assessment instruments for judging the epistemic state of learners at most any age (Hofer
& Pintrich, 2012). Student beliefs about knowledge are multidimensional and
multilayered, such that the nature of knowledge itself can be described along the
dimensions of certainty and simplicity, whereas the dimensions source of knowledge and
its justification describe the nature of knowing (Hofer & Pintrich, 2012; Mason, Boldrin,
& Ariasi, 2010). Epistemological beliefs are simply beliefs about what knowledge is and
how it is obtained (Richter & Schmid, 2010), and are a form of declarative metacognitive
knowledge (Hofer, 2004). Richter and Schmid (2010) distinguish epistemological
metacognition from psychological metacognition in terms of their differing content,
where psychological metacognition refers to mechanisms of memory and learning, and
epistemological metacognition refers to the process by which knowledge is qualified.
Multiple lines of research into personal epistemology in student populations
indicate that fine-grained cognitive resources better explain the formation of beliefs
about learning than do developmental stages, or belief-systems (Hofer & Pintrich, 2012).
Naïve epistemologies are proposed to precede sophisticated ones developmentally—such
that the natural progression of knowledge as facts justified by authority (naïve) is
transformed into a more complex and nuanced network of ideas (sophisticated) that are
understood socially and contingently, and subsequently result in higher achievement
(Bromme et al., 2010). However, Bråten and Strømsø (2005) found that naïve
epistemology produces better results when the topic at hand is unfamiliar and complex—
thus compelling the epistemological framework to rely on authority—whereas a more
sophisticated epistemology relying on knowledge as a more personal and subjective
construction is more likely to misconstrue the textual evidence under analysis.
Whether sophisticated epistemologies positively influence learning is
contingent on the context of the task and the level of expertise that task participants
possess (Hammer & Elby, 2012). Both context and skill place particular kinds of
demands on the deployment of representational systems in accordance with the epistemic
beliefs that students possess with respect to the capacity of those systems to encode
meaning.
Developmental models such as the epistemological reflection model (Baxter
Magolda, 2012) offer a constructivist viewpoint for understanding the mechanism(s) for
epistemological change, whereas process-model theorists consider more fine-grained
cognitive resources than developmental stages or beliefs (Bendixen, 2012) as a means for
explaining epistemological advance. Finer-grained resources include particular views
about knowledge in general, acquisition of said knowledge, the kinds of and interrelations
of knowledge types, and the sources of that knowledge. Bendixen and Feucht (2010)
offer an integrative model that attempts to capture the clear findings of both the
developmental and cognitive branches of the field, by framing the mechanism of change
as having three distinct components: epistemic doubt, epistemic volition, and resolution
strategies. Epistemic doubt (cognitive dissonance related to beliefs) and epistemic
volition (the will to change) work in concert towards epistemological advance (Rule &
Bendixen, 2010). Resolution strategies are simply reflective, socially interactive,
retrospections by which a person analyzes the implications of personal belief (Baxter
Magolda, 2012; Bendixen, 2012).
Domain-general and domain-specific epistemologies are distinct factors that
influence learning (Lee & Chin-Chung, 2012; Schommer-Aikins & Duell, 2013). In a
study involving 701 college students in the United States, researchers used path analysis
to determine that domain-general beliefs have an indirect effect on performance, whereas
domain-specific (mathematics) beliefs have both direct and indirect effects on
mathematical problem solving. The beliefs that are formed within the context of a
particular domain influence thinking and reasoning more dramatically than do domain-
general beliefs that apply to all situations. For example, the belief that the average person
learns quickly or not at all was strongly correlated with a weak mathematical background
due to choices influenced by the belief that mathematics is not useful or accessible.
Moreover, the opposite was also found to be true: that a belief that mathematics takes
time to learn and is useful is consistent with the practice of taking more mathematics
courses and devoting the diligence to them that accompanies successful skill
development (Schommer-Aikins & Duell, 2013).
While few researchers in the field of personal epistemology doubt the reality of
developmental stages for epistemological advance, the evidence of domain-specific
processes and environments is the primary reason that the majority of attention has
shifted to the mechanisms of epistemological change in terms of psychological
constructs—such as thinking and reasoning (Hofer & Pintrich, 2012). Strategies for
resolving epistemic doubt (Bendixen & Feucht, 2010) and the implications of and on
personal beliefs (Bendixen, 2012) are metacognitive and epistemological in nature
(Barzilai & Zohar, 2014; Hofer, 2004; Richter & Schmid, 2010), and require some level
of social interaction and individual analysis, as well as a positive affective backdrop from
which motivation leads to concentration and control in problem-solving settings (Bodin
& Winberg, 2012; Muis & Duffy, 2013). With respect to this study, the context of
epistemological advance is scientific, and therefore the thinking and reasoning is as well
(Paul & Elder, 2008).
A brief history of assessment on personal epistemology. One of the earliest
attempts to psychometrically measure personal epistemology was the Psycho-
epistemological Profile (PEP), which measures the construct on three dimensions:
Rational, Empirical, and Metaphorical (Royce & Mos, 1980). The rational dimension of
PEP assumes that knowledge is obtained through reason and logic, whereas the empirical
dimension derives and justifies knowledge through direct observation. The metaphorical
dimension of PEP sees knowledge as derived intuitively with a view to subsequent
verification of its universality. The PEP instrument has demonstrated concurrent validity
based on examination of group scores and their correspondence to the underlying theory
(Royce & Mos, 1980). For example, biologists and chemists were typically strongest on
the empirical dimension of PEP, whereas persons situated in the performing arts were
more metaphorical in nature, just as mathematicians tend to score higher on the rational
dimension than on either of the other two PEP dimensions. Furthermore, the construct
validity of the PEP is supported by moderate to moderately high correlations, significant
at the p = 0.05 level, with the Myers-Briggs Personality Test and the MMPI (Royce & Mos, 1980).
Royce and Mos (1980) also reported positive correlations for each item on the
PEP with the total score in its dimension. Split-half reliability coefficients on two forms
of the PEP indicate satisfactory homogeneity with correlations of r = .75, .85, and .76
corresponding to the rational, metaphoric, and empirical dimensions, respectively, for a
sample of n = 142 students on form V of the test given in 1970, versus correlations of r =
.77, .88, and .77 for a sample of n = 95 students on form VI (current form) given in 1975.
Test-retest reliability coefficients for the PEP in two small sample studies are given as
follows: junior college students (n = 19) tested over a three-month period obtained
reliability coefficients of r = .61, .78, and .67 on the rational, metaphoric, and empirical
dimensions, whereas first-year university students (n = 43) tested over a nine-month
period obtained reliability coefficients of r = .68, .66, and .87 on the rational, metaphoric,
and empirical dimensions, respectively. The moderately high inter-correlations between
dimensions of the PEP indicate considerable dependence between these epistemic styles;
however, the relative degree of independence suggested their existence as separable and
meaningful dimensions of personal epistemology (Royce & Mos, 1980).
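For readers unfamiliar with the statistic, a split-half coefficient of the kind reported above can be sketched in Python as the Pearson correlation between scores on two halves of the same scale. The passage does not state how the PEP halves were formed or whether a correction was applied, so the odd/even item split, the optional Spearman-Brown step, the function names, and the response data below are all assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """Correlate odd-item and even-item half scores for one dimension.

    responses: one list of item scores per respondent (assumed odd/even split).
    """
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    return pearson_r(odd_half, even_half)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


# Hypothetical responses: 5 respondents x 6 items on one dimension (not PEP data).
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 1, 2, 3, 2, 2],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 2, 1],
]

r_half = split_half_reliability(responses)
print(round(r_half, 3), round(spearman_brown(r_half), 3))
```

A test-retest coefficient would use the same correlation function, applied to total scores from two administrations instead of two halves.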

The Epistemological Questionnaire (EQ; Schommer, 1990), the Epistemic Beliefs
Inventory (EBI; Schraw, Bendixen, & Dunkle, 2012), and the Epistemological Beliefs
Survey (EBS; Wood & Kardash, 2012) are the three most studied assessments of personal
epistemology to date (DeBacker, Crowson, Beesley, Thoma, & Hestevold, 2008).
Schommer’s approach broke with the tradition of developmental structure models in favor of a
system of independent dimensions pertaining to beliefs about knowledge and knowing
rather than strictly held beliefs. The structure of knowledge as simple vs. complex,
whether or not knowledge is certain, how personal authority and locus of control
determine knowledge, the speed at which learning is possible, and how fixed or malleable
learning truly is, comprise Schommer’s dimensions of personal epistemology. She
created a 63-question instrument with 5 dimensions and 12 subsets within the same. In
her original study, the four factors of Simple Knowledge, Certain Knowledge, Innate
Ability, and Quick Learning emerged; however, the fifth dimension of Source of
Knowledge did not (Schommer, 1990). Subsequent studies have had inconsistent
replications of Schommer’s factor extraction due to abridged versions of the original
instrument, and comparing subscales rather than items in factor extraction processes
(DeBacker et al., 2008). Moreover, attempts to replicate suffer from small sample sizes,
and a tendency to find different factors and number of factors that while clearly related to
Schommer’s original 5 dimensions at face value, are nonetheless structured differently.
The EBI was created in response to these issues and had initially obtained
consistent factor extractions in studies by Bendixen, Schraw and Dunkle (1998) and
Schraw et al. (2012) that seemed to preserve the 5-dimension structure originally
proposed by Schommer (1990). However, Nussbaum and Bendixen (2003) could obtain
only three interpretable factors in their study (n = 238) of the EBI: Simple Knowledge (α
= .69), Certain Knowledge (α = .69), and Innate Ability (α = .77). With respect to internal
consistency across the paper-based, computer-based, or web-based delivery modalities,
Cronbach’s alpha values ranging from .42 to .79 were found with sample sizes of n = 67
and n = 101 (Hardré, Crowson, Kui, & Cong, 2007). Though an improvement on the EQ,
these findings reveal modest correlations within relatively low sample sizes, indicating
the need for continued research in the assessment of personal epistemology (Hofer,
2012).
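For reference, the internal consistency coefficients reported above can be computed directly from item-level responses. The following Python sketch illustrates Cronbach’s alpha for k items, computed as k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores. The function name, the data layout, and the small data set are hypothetical and are not drawn from any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a table of item scores.

    scores: list of respondents, each a list of k numeric item scores
    (a hypothetical layout chosen for this sketch).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 Likert-type items.
example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(example)
print(round(alpha, 2))
```

Values of alpha near 1 indicate that the items vary together, whereas values below the conventional .70 benchmark, such as several of those cited above, indicate weaker internal consistency.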
In two samples (n = 417, n = 378), confirmatory factor analysis (CFA) revealed a
poor fit to the 5 dimensions of the EBI in both sample groups, and internal consistency
coefficients were consistently below .70 (DeBacker et al., 2008). The EBS was tested on
two sample groups (n = 380, n = 415) with only marginal increases in internal
consistency falling below .80 (DeBacker et al., 2008). Low correlations between only two
dimensions of the EQ within all 6 subscales indicate low reliability among the subscales
themselves. For these reasons, and others, DeBacker et al. (2008) argue that the entire
enterprise for assessing personal epistemology has suffered from a purely empirical
approach that has not been properly grounded in theory, and suggest that researchers
within this field more clearly define epistemic beliefs and distinguish them from beliefs about those
beliefs. Hofer (2012, Kindle location 409) sums it up clearly by stating, “we need
considerably more effort addressed toward either unifying our language or clarifying our
existing distinctions in terminology, improving methodological approaches so that
comparable studies can be conducted, and in considering the relation between
epistemological understanding and other key constructs.” It is assumed within this study
that thinking and reasoning qualify under the heading of other key constructs.
The Colorado Learning Attitudes about Science Survey (CLASS) instrument was
designed to measure epistemic stance of students with respect to student knowledge of,
and learning of, Physics (Douglas et al., 2014). Averaging the number of responses that
agree with the pre-determined expert view is the method by which the percent favorable
score is assigned to a student completing the CLASS survey. The instrument consists of
42 questions distributed in 7 categories: personal interest, real world connections,
conceptual connections, applied conceptual understanding, problem solving general,
problem solving confidence, and problem solving sophistication. In spite of the robust
validation of this instrument using over 7,000 respondents since 2003, interviews, and
factor analysis, the CLASS instrument was deemed unsuitable for this study due to its
emphasis on problem solving and conceptual change—two dimensions of learning
beyond the scope of this study, as well as an unstable factor structure (Douglas et al.,
2014).
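The percent favorable score described above can be illustrated with a short Python sketch. It assumes that each response has already been coded as agree, neutral, or disagree and that an expert key is available for every scored item; the function name, item identifiers, and data are hypothetical, and the published CLASS scoring procedure may differ in its details.

```python
def percent_favorable(responses, expert_key):
    """Percentage of scored items on which a student matches the expert view.

    responses:  dict mapping item id -> "agree", "neutral", or "disagree"
    expert_key: dict mapping item id -> "agree" or "disagree"
    Both layouts are assumptions made for this sketch.
    """
    if not expert_key:
        raise ValueError("expert_key must contain at least one item")
    # Count items where the student's coded response equals the expert view.
    matches = sum(
        1 for item, expert_view in expert_key.items()
        if responses.get(item) == expert_view
    )
    return 100.0 * matches / len(expert_key)


# Hypothetical four-item key and one student's responses.
key = {"q1": "agree", "q2": "disagree", "q3": "agree", "q4": "disagree"}
student = {"q1": "agree", "q2": "agree", "q3": "neutral", "q4": "disagree"}
print(percent_favorable(student, key))  # 50.0
```

In this sketch a neutral or missing response simply counts as not favorable, which is one of several possible conventions.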
Regardless of the state of affairs in assessing personal epistemology, the PEP
instrument was ideal for the purposes of this study for at least two reasons: its rational,
empirical, and metaphorical dimensions match the practice of science in general, and
physics in particular (Lancor, 2012; Lee & Chin-Chung, 2012; Plotnitsky, 2012), as well
as the fact that it does not suffer from any of the reliability issues that other instruments
exhibit, as described herein. Moreover, as the research questions were not concerned with
the direction of epistemological change, neither are the dimensions of the PEP.
Connections between conceptual change and personal epistemology. The
histories of both conceptual change and epistemological change research span
approximately 40 years each, and are just beginning to reach a point in the most recent
decade where theoretical and methodological coherence are feasible (diSessa, 2010;
Hofer, 2012). The sometimes intersecting histories of both fields are worthy of brief
exploration in this review of the literature, as both fields have major contributors whose
research occurred in mathematics and physics classrooms such as the one proposed in
this study. This section describes the major connections that exist between conceptual
change and epistemological change with an emphasis consistent with the research
questions addressing epistemological change through thinking and reasoning with MRS.
Thomas Kuhn first introduced the terminology of conceptual change in his
landmark treatise The Structure of Scientific Revolutions in reference to how concepts
embedded within a scientific theory change when the theory (or paradigm) underlying
them changes (diSessa, 2010). Historically, the process of conceptual change in
educational settings was thought to consist of (a) conceptual dissatisfaction, (b) the
recognition of new and intelligible conceptions that are (c) plausible, and (d) perceived as
fruitful for progress (Posner, Strike, Hewson, & Gertzog, 1982). This classical model
became understood as cognitive conflict strategy (CCS) and ultimately failed as an
instructional strategy because student learning was found to be a gradual process that is
influenced by affective and motivational factors that are contingent on personal
epistemology (Lee & Chin-Chung, 2012).
The framework theory of conceptual change asserts that naïve theoretical
frameworks for understanding the world are difficult to change because everyday
experience affirms their perceived stature in spite of the felt conflict that persists between
their content and that of conventionally accepted theories (Vosniadou, Vamvakoussi, &
Skopeli, 2010). Conceptual change can be achieved through bottom-up additive
mechanisms such as the acquisition of new information through experience, or through
top-down mechanisms such as instruction-induced conceptual change (Vosniadou, 2007).
Additive mechanisms for conceptual change produce synthetic models consistent with
assimilation and accommodation processes (Piaget, 1970), and lack the sort of meta-
conceptual basis that instruction-induced frameworks are capable of providing
(Vosniadou, 2007). Metacognition is central to the awareness that one’s personal naïve
theory is in conflict with another theory, and therefore productive in both conceptual and
epistemological change (Barzilai & Zohar, 2014; Bendixen, 2012; Chang, Wen, Kuo, &
Tsai, 2010).
The process of building models is commonly understood as a means for assessing
our understanding of one or more theories, which thereby forces a reconciliation of naïve
personal theory with conventionally understood theory (Hestenes, 2010; Jonassen, 2010).
In contrast to classical conceptual change theories, the framework hypothesis is
theoretically constructivist in nature, and views misconceptions as “dynamic, situated,
and constantly changing representations that adapt to contextual variables or to the
learners developing knowledge” (Vosniadou, 2007, p. 60).
Moreover, these models are representative of ontological categories pertaining to
substance and process—as well as epistemological ones concerning the domain of
inquiry. However, it is still unclear at the present time whether or not conceptual change
consists of discrete (knowledge-in-pieces model) comparisons, or continuous (coherence
model) bits of knowledge that are connected structurally by the relations that make them
meaningful on a larger scale (diSessa, 2010). According to Inagaki and Hatano (2010),
conceptual change involves a complete restructuring of knowledge systems in general
because it involves not only individual concepts, but also how those concepts stand in
relation to rules, models and personal theories. Measuring conceptual change is one of
the fundamental and persistent problems in cognitive psychology; however, the
difference between spontaneous conceptual change and instruction-induced conceptual
change is rooted in the intentional efforts of a cognitive agent to resolve the incongruity
within their knowledge system (Inagaki & Hatano, 2010). Part of the trouble in
measurement of conceptual change is in tracking how a change in the truth-value of one
piece of knowledge corresponds to changes in related pieces of knowledge—which
simply highlights the difference between the knowledge-in-pieces versus coherence
viewpoints, which dominate the field of conceptual change.
Clement (2010) describes the longstanding gap at the core of conceptual change
theory in terms of how the mechanisms of conceptual change are presently unknown,
even though the conditions for, and effects of, conceptual change are known. According to
Clement (2010), part of this problem rests in defining what a model is, and distinguishing
the features of a mental model from external representations of that mental model. Both
Clement (2010) and Fekete (2010) distinguish the features and existence of a mental
model from external representations that persons make of those mental models.
Nersessian (2010) offers a definition of mental model as an abstract conceptual system
used for reasoning, which idealistically represents the salient features of a physical
system through the use of surrogate objects to which the cognitive agent imparts
properties and behaviors. However, the conceptual change process associated with any
model varies widely in scope—such as a complete paradigm shift, model synthesis, major
model modification versus minor model revision, concept integration and/or
differentiation, bridging analogies, and new model construction.
Though this study had a singular focus on epistemological change in terms of
MRS used in thinking and reasoning processes, conceptual change is expected for all of
the reasons, and in the ways described herein (Chang et al., 2010; Lee & Chin-Chung,
2012). One consistent theme shown in these research findings is that conceptual change
depends on the restructuring of knowledge domains in terms of the relationships that
exist between models (diSessa, 2010; Inagaki & Hatano, 2010), as they produce changes
in personal, and sometimes naïve theories (Jonassen, 2010; Vosniadou, Vamvakoussi &
Skopeli, 2010). Nersessian (2010) specifies the mechanism for conceptual change as
model-based reasoning capable of producing paradigm shifts, model
revision/integration/synthesis, and new model construction. Each of Nersessian’s metrics
is consistent with the elements of scientific thinking given by Paul and Elder (2008),
and thus represent specific targets for analysis in this study.
Conceptual change in introductory physics. Under the premise that conceptual
change and scientific reasoning are sequentially fixed with respect to development of
problem-solving skills, Physics Education Research (PER) pioneers created the Force
Concept Inventory (FCI) as a means for assessing the Newtonian force concept in a
student’s understanding before and after instruction (Hestenes, 2010). The most
successful reforms arising from PER include Interactive Engagement (IE) approaches
such as Peer Instruction (Gok, 2011; Wood, Galloway, Hardy, & Sinclair, 2014), and
Modeling Instruction (Hestenes, 2010). Both the FCI and the Mechanics Baseline Test
(MBT) are used widely within the PER community in order to assess the effectiveness of
IE techniques relative to the teaching and learning of introductory physics courses (Hake,
1998)—as is Lawson’s Classroom Test of Scientific Reasoning (CTSR) for general
scientific reasoning (Coletta et al., 2007a). According to Coletta and Phillips (2010), IE
techniques are able to produce measurable changes in scientific thinking and reasoning
that exceed the kinds of assessment gains normally obtained through traditional
instruction—such as the fact that students in IE classrooms obtain an average normalized
gain on the FCI that is more than twice that of the traditional students (Cahill et al., 2014;
Hake, 1998; Rudolph et al., 2014).
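The normalized gain referred to above is conventionally defined (Hake, 1998) as the ratio of the actual average gain to the maximum possible average gain, g = (post − pre) / (100 − pre), with scores expressed as percentages. A minimal Python sketch follows; the function name and the class averages are hypothetical and are used only to show the arithmetic.

```python
def normalized_gain(pre_percent, post_percent):
    """Normalized gain from pre- and post-test averages given in percent."""
    if not 0 <= pre_percent < 100:
        raise ValueError("pre_percent must be in the range [0, 100)")
    # Actual gain divided by the maximum gain still available after the pre-test.
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class averages: 40% before instruction, 70% after.
print(normalized_gain(40.0, 70.0))  # 0.5
```

Because the gain is scaled by the room left for improvement, classes with different pre-test averages can be compared on a common footing.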
The FCI has been shown to define a unidimensional construct distinguishing non-
Newtonian and Newtonian populations, where the defining concept that separates the two
is the idea that no net force is required in order to maintain constant velocity (Planinic,
Ivanjek, & Susac, 2010). However, using Rasch analysis techniques on the FCI,
differential item functioning (DIF) analysis revealed that two different groups with equal
ability were not able to consistently answer certain FCI questions in the same way—
suggesting that the construct changes slightly from pre- to post-test. According to
Planinic et al., the width of the FCI as it pertains to the concepts covered is too narrow for
the proper discrimination in the range of abilities relative to the construct. The authors
suggest a number of improvements, including two different tests (pre- and post-) that share
a common set of items, as well as simply removing items from the middle of the test and
adding entirely new ones at the extremes. Moreover, the authors stress that the FCI is still
a useful test for assessing the efficacy of instruction relative to the Newtonian force
concept (Planinic et al., 2010).
Yasuda and Taniguchi (2013) determined that 2 of the 30 FCI questions were
invalid by using a series of sub-questions in order to validate whether or not the learners
actually possessed the conceptual knowledge required to answer the original items. By
combining the results of testing for both false positives and false negatives in student response
patterns, as well as the validity of the sub-questions, Yasuda and Taniguchi were able to
find a significant difference (α = 0.05) between the pre- and post-test conditions. This
study did not extend beyond the two questions under study, and the researchers suggest
that further research is required in multiple populations internationally, as well as for the
rest of the FCI test items. These findings suggest a source of systematic error that has the
potential to reform current understanding of the usefulness and import of the FCI as an
instrument that has shaped PER for several decades now (Yasuda & Taniguchi, 2013).
Wang and Bao (2010) developed the FCI-metric as a way to assess IP student
proficiency based on the FCI score. These researchers used a 3-parameter Item Response
Theory (IRT) model based on data obtained at Ohio State University from 2003 to 2007.
The pre-test data consisted of 2,802 students and the post-test data included 2,729
students. Eigenvalue analysis of the correlation matrices of pre- and post-test conditions
of the FCI in this sample indicated a single proficiency variable (unidimensionality) for all 30 items
on the FCI. However, interpretation of the fit between the assessment model and the
underlying cognitive model is subject to systematic variations that occur within the
assessment model—in particular, which of three particular IRT models are used (Chen et
al., 2011). In their analysis, Chen et al. used archived data from 3,139 participants with
each of three 3-parameter logistic models: R, MULTILOG with pre-processing, and
MULTILOG without pre-processing. Though each method produces consistent results,
the variation between proficiency and ability parameters may lead to misunderstandings
in certain contexts. The researchers suggest further analysis in order to determine more
precisely which of the models is best, and within what context it should be used (Chen et
al., 2011).
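A 3-parameter logistic model of the kind referenced above expresses the probability of a correct response as a function of proficiency θ and three item parameters: discrimination a, difficulty b, and a guessing floor c. The Python sketch below uses the common form P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))). The function name and parameter values are hypothetical, and the cited analyses may have used a different scaling convention.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under a 3-parameter logistic model.

    theta: examinee proficiency
    a:     item discrimination
    b:     item difficulty
    c:     lower asymptote (guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: when proficiency equals difficulty, the probability
# lies halfway between the guessing floor c and 1.
print(round(p_correct_3pl(theta=0.0, a=1.2, b=0.0, c=0.2), 3))  # 0.6
```

The lower asymptote c is what distinguishes this model from simpler logistic forms: even a student of very low proficiency retains roughly a probability c of answering a multiple-choice item correctly.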
The Force and Motion Conceptual Evaluation (FMCE) assesses fluency with
verbal and graphic representations of just the force concept and one-dimensional
kinematics, as opposed to the FCI’s broadened focus including verbal and pictorial two-
dimensional motion, vectors, Newtonian forces, and mechanical systems in general
(Thornton, Kuhl, Cummings, & Marx, 2009). Though FCI and FMCE scores have a
strong positive correlation (r = 0.78), students who perform well on one do not
necessarily perform well on the other; and therefore the use of both assessments in
various instructional settings reveals important features of instruction and the use of
representational systems and how those factors convey to student learning (Thornton et
al., 2009). In a study involving 3,420 students at 13 different institutions, pre-test and
post-test scores for the FMCE revealed a normalized gain of 65% ± 6 for IE methods
versus 15% ± 3 for traditional methods.
Interactive Engagement (IE) techniques alone are not always the source of
conceptual change. In a study involving 2,537 undergraduate students taking a
second-semester IP course in electricity and magnetism at four major universities,
conceptual gains on the Brief Electricity & Magnetism Assessment (BEMA) for the
groups using a particular curriculum, the Matter & Interactions (M&I) textbook and labs,
exceeded those of traditional students by a factor of 2 (Ding & Caballero, 2014). The M&I
curriculum reorganizes content and places an emphasis on microscopic cause and effect
patterns, as well as providing lab opportunities to develop simulations in a programming
environment, whereas traditional methods emphasize standard textbook content and
conventional labs involving strictly physical apparatus. Analysis of time devoted to
lecture topic areas found that there was no significant difference between the traditional and
reformed curricula, thus highlighting the difference in content and emphasis.
Decades of PER have established that certain pedagogies obtain better conceptual
gains than others—namely, IE methods outperform traditional methods (Hake, 1998).
The study environment described herein is an IE reformed pedagogy with ample
opportunity for students to express and use concepts in a collaborative way. Moreover,
many of the assessments created by the PER community to measure conceptual change
are used by the author-researcher, and are therefore material to the overall discussion
concerning the relationship between conceptual and epistemological change.
Personal epistemologies and learning physics. Epistemic beliefs have the
capacity to bias the learning of students towards preferred types of information and
learning environments (Muis, Kendeou, & Franco, 2011). Student epistemologies have
been shown to be productive in their capacity for transfer from physics to other domains of
knowledge—such as mathematics (Forsyth, 2012; Po-Hung & Shiang-Yao, 2011), but
not necessarily from mathematics back to physics (Po-Hung & Shiang-Yao, 2011). In
their study, Po-Hung and Shiang-Yao noted that students of mathematics based their
interest in learning physics on their belief in the capacity of mathematics to prove things
versus what physics is able to demonstrate. Though students believed that the fields of
mathematics and physics are intimately related, their beliefs about the types of
knowledge that each field conveys determined not only personal interest, but also the
degree to which those connections would be promoted in their teaching practice after
college.
Ding (2014) found two factors that influence student conceptual gains in IP: pre-
existing scientific reasoning skills, and pre-instructional personal epistemology. Path
analysis was used to confirm the existence of a “direct causal influence” of
pre-instructional personal epistemology on conceptual learning in IP (Ding, 2014, p. 5). In
this study of 167 first-year calculus-based IP students at a university in eastern
China, the FCI, CTSR, and the CLASS instruments were given as pre- and post-
instruction tests. The structure of the classroom environment was a traditional lecture
format where the instructor made no efforts to promote or probe student epistemologies.
The CTSR scores of this sample population were typical of incoming college freshmen,
whereas the FCI normalized gains were above average at 52.1% ±18.9. In this study, the
researcher cautions that the small-to-moderate path strengths obtained between pre-
instructional epistemology and conceptual gains confirm the veridicality of the model,
but fall short of providing a strong, causal proof. Ding (2014) recommends further
research in classroom settings where instruction cultivates student reasoning and
epistemological growth. In a similar study, Bodin and Winberg (2012) noted that in
addition to prior knowledge and epistemological beliefs, locus of control and positive
emotions associated with concentration serve to enhance and predict performance.
The Maryland Physics Expectations (MPEX) survey of epistemological stance in
IP measures epistemological attitudes and beliefs along six dimensions: independence,
coherence, concept, reality link, math link, and effort link (Sharma, Ahluwalia, &
Sharma, 2013). The Coherence dimension refers to the degree to which a student
perceives the topic as disjointed pieces versus a continuous whole, the Concept dimension
of the MPEX-II refers to how students see concepts as merely cues towards a formula
versus a substantive description of reality, and Independence refers to whether or not the
student places authority in their own understanding or in an external source such as a
teacher or textbook. The reality link attempts to discern whether or not students see ideas
in physics as relevant to real life, whereas the math link probes the students’ view of
math as disconnected from physics versus representative of it. The effort link merely
gauges how diligently students attempt to use information and make sense of it. Sharma
et al. (2013) found that undergraduate students in the United States, Thailand, Turkey,
and India tend to become more entrenched in their novice-like views of physics due to a
full year of traditional instruction. The only exception to this trend was in master’s degree
students, who presumably had greater interest in the field due to their voluntary election
to pursue graduate work in physics. The general conclusion of these researchers is that an
indifference in teacher attitudes about the relationship between students and instructors
leads to a mediocre at best learning experience that tends to drive students away from
science.
A truncated version of the MPEX survey of epistemological stance in IP —the
Maryland Physics Expectations-II (MPEX-II)— was found to be psychometrically
unreliable in a large study of 505 Turkish high school students in IP (Yerdelen-Damar,
Elby, & Eryilmaz, 2012). Its shortfall in reliability stems from the fact that
there are at least two perspectives from which to interpret the correlations between items
in each dimension of the survey: the beliefs perspective or the resources perspective. A
beliefs perspective understands epistemology in developmental stages and/or naïve
versus expert theory construction, whereas the resources perspective understands
epistemology as a context-dependent construct deployed in accordance with the setting
in which a student is situated. The MPEX-II has only three dimensions: Coherence, Concept,
and Independence. Two out of three factors on the survey fell below the 0.70 threshold
for reliable Cronbach’s alpha due to weak correlations among items within those
dimensions. Interpretation based on the beliefs perspective suggests that the instrument
failed to measure the actual student beliefs, whereas the resources perspective suggests
that the details in the survey items serve to activate distinct epistemological resources.
The main finding of Yerdelen-Damar et al. (2012) was that the MPEX-II is structured to
be understood from the beliefs perspective, and is therefore partisan with respect to
competing theories of epistemological growth.
Epistemological resources include calculations, physical mapping, invoking
authority, and mathematical consistency (Bing & Redish, 2012). Physical mappings
differ from calculations by virtue of how consistent the symbols and diagrams are with
the physical properties of a system, whereas calculations are simply algorithms that lead
to trustworthy results. Moreover, the epistemological resource of invoking authority
further relies on implicit trust in a source of knowledge—such as an instructor or
textbook from which physical mappings and algorithms are given. Bing and Redish
(2012) identified these four epistemological resources based on the analysis of over 150 hours
of videotaped discussions of upper-division physics students arguing for or against claims
disputed in a classroom setting. Each of these epistemological resources served as
warrants for the beliefs held by students engaged in conflict resolution of physics
problems.
Bodin (2012) used network analysis to study the epistemological framing of
physics students engaged in computational physics problem solving in order to generate
graphical representations of epistemic framing before and after a problem-solving
episode. The elements within the epistemic frame proposed by Bodin consist of
knowledge, beliefs, and skills. In the process of solving numerical problems within a
computational environment, the shift in epistemic framing revealed in the before and
after conditions indicated both conceptual change and the construction of new knowledge
for those students. According to Bodin (2012), these findings suggest that assignments
structured to mix competencies and skills from multiple disciplines facilitate the
construction of new knowledge, and thus a shift in epistemic framing that inevitably
progresses from naïve to expert over time.
Epistemological framing is a problem common to classroom environments where
students frame the problem-solving activity as an answer-generating one rather than a
knowledge construction one (Hutchison & Elby, 2013), and where group discussions are
an integral part of the course design (Irving, Martinuk, & Sayre, 2013). Moreover,
epistemological framing is a tool by which learners make sense of current problems in
light of prior experience (Hutchison & Elby, 2013). In their study, IP students were asked
a think-aloud question about two projectile motions, where all variables were the same
except the initial condition. Perceiving that the question had a straightforward answer in
terms of simple facts, many students misinterpreted the question and answered incorrectly.
When the researcher focused their attention on the salient aspects of the situation, all
students quickly realized their mistake and reasoned correctly to the right answer. A
control group of students inexperienced with Physics were asked the same questions, and
all reasoned intuitively towards the correct answer because they did not frame the
question as an opportunity to simply recall textbook-level facts.
Bing and Redish (2012) described the components of epistemological framing as
social, artifacts, affect, and epistemology. The social component of an epistemological
framework describes the who and the how of interactions within groups. Artifactual
components refer to materials used in the process of problem-solving, whereas the
component of affect deals strictly with how an individual feels about those activities. The
epistemology component of an epistemological frame refers to the means by which an
individual constructs new knowledge. The authors use epistemological resources and
epistemological framing as the basis for an ontology of student cognition in physics
capable of describing the elements of student thinking and reasoning therein. Bodin
(2012) describes this sort of epistemological framing as the activation of a network of
epistemological resources, where the network is the ways in which knowledge, beliefs,
and skills are organized within context. Furthermore, Bing and Redish (2012) suggest
that analyzing student work in terms of epistemological resources and epistemological
framing provides a way to assess a student’s transition from novice to expert condition
by virtue of what they call a journeyman stage where thinking and reasoning are coupled
with diligent efforts to coherently justify the knowledge that they are actively
constructing.
Hammer and Elby (2012) suggest an ontological approach to forming an adequate
theory of epistemological change in terms of the resources that are (1) productive for that
change as they are (2) situated within the context that students actually use them. This
changes the traditional focus of simply cataloging how student epistemologies differ from
the experts, to probing the unexplored domain of epistemological resources, and their
capacity to produce epistemological change. Representational systems are one such
resource, and the content of MRS in Physics is neither purely rational nor empirical, but
also depends on metaphorical representations—such as the term flow for energy transfer,
light is a particle/wave, and electrons tunneling through quantum spaces—in order to
foster the understanding of complex phenomena and their underlying theories (Brewe,
2011; Lancor, 2012; Scherr et al., 2012; Scherr, Close, Close, & Vokos, 2012). For
example, the conventional language of physics has proven productive in the hands of
expert physicists; however, due to its metaphorical nature, it is a source for conceptual
confusion among students (Hammer & Elby, 2012) because the common everyday
notions of force and motion held by laypeople, are rarely what physicists are referring to
in their models (Hestenes & Wells, 1992).
A pseudo-longitudinal study of last-year high school students (N = 157), year 1–5
undergraduate students (N = 406), and post-doctorate researchers or university
professors (N = 74) in the United Kingdom indicated no significant change in attitude
towards Physics during the undergraduate experience using the Colorado Learning
Attitudes about Science Survey (CLASS) instrument (Bates, Galloway, Loptson, &
Slaughter, 2011). There were, however, significant changes in level of expert-like
thinking as measured by the CLASS instrument at the entry and exit points of the
undergraduate program, which researchers attribute to a selection effect reflecting levels
of personal interest, as well as to the fact that the approximately 15% of last-year students
intending to major in Physics must take an entrance exam for university admission. In a large-scale
study of Chinese middle (N = 521) and high school students (N = 797), results showed
that traditional lecture-based instruction in Physics over a three-year period produces a
reduction in expert-like views in Physics (Zhang & Ding, 2013). One exception to this
trend was in grades 9 and 12 where changes in content, sequence, pace, and external
motivations produce slight increases in expert-like views of Physics. Researchers
hypothesized that both pedagogical and non-pedagogical factors influence the complex
interaction between formal instruction and personal epistemology in Physics.
The Colorado Learning Attitudes about Science Survey for Experimental Physics
(E-CLASS) instrument was developed for the sake of assessing epistemology and
expectations in IP laboratory settings (Zwickl, Hirokawa, Finkelstein, & Lewandowski,
2014). The E-CLASS was designed to be given at the beginning and the end of a typical
semester, and presents paired questions addressing the students’ perception of their own
work along with the students’ perception of how an expert physicist would view the
same. The instrument has been developed and validated through extensive testing and
interviews with students participating in 45 classes distributed among 20 different
institutions. In order to establish the content validity of an expert view, 23 expert
physicists at 7 universities were recruited for the sake of establishing consensus
viewpoints of the test items. Most items obtained a 90% or greater consensus; the
items in the 70% or above consensus range dealt with instructor beliefs about the difficulty of
experimenting and student abilities related to lab methodology. Convergent validity
based on correlations with other assessment instruments and course grades has not been
obtained; however, student interviews (N = 42) conducted for the sake of validation
revealed consistent interpretations and valuation of the questions across the curriculum.
Standard lecture courses cause a negative shift in self-efficacy that influences the
decline in positive attitudes about Physics and a tendency towards novice theories of
Physics, whereas Modeling Instruction—or IE structured courses—produce no change or
positive changes to attitudes about Physics in terms of expert-like dispositions (Lindsey
et al., 2012; Sawtelle, Brewe, & Kramer, 2010, 2012). IP course designs with
epistemological framing in mind obtain conceptual and epistemological gains (Redish &
Hammer, 2009) through curricular strategies that promote expert thinking—such as
exploring the implications of ideas, sense-making collaborations, and leveraging secure
ideas as a sort of conceptual foothold. According to Redish and Hammer (2009, p. 2),
students in such a reformed IP classroom learn to coordinate “conceptual and
epistemological resources” into their everyday thinking.
The fact that physics is a domain of knowledge requiring rational, empirical, and
metaphorical thinking and reasoning (Lancor, 2012; Lee & Chin-Chung, 2012;
Plotnitsky, 2012) suggests that epistemological change within this domain involves those
same dimensions. The Psycho-epistemological Profile (PEP) described herein was
selected for this very reason for deployment on the study sample, as it describes
knowledge acquisition in terms of rational epistemologies where knowledge is obtained
through reason and logic, empirical dimensions that derive and justify knowledge through
direct observation, and a metaphorical dimension where knowledge is derived intuitively
with a view to subsequent verification of its universality. Other instruments for
measuring personal epistemology—such as the EQ, EBI, and EBS—were not selected
due to the fact that after 50 years of international research on personal epistemology, no
well-validated instrument capable of measuring epistemological development in large
groups of students has emerged (Richardson, 2013). Assuming the PEP dimensions
represent mechanisms for epistemological change lends opportunity for consideration of
the tools and processes that govern those mechanisms—such as thinking and reasoning
with MRS within the context of Physics.
Thinking and reasoning in introductory physics. Instruments such as the FCI
and FMCE were designed to measure conceptual change in IP. Findings from
assessments such as these are also used to make assertions about the scientific thinking
and reasoning of students in IP settings (Cahill et al., 2014; Hake, 1998; Hestenes, 2010)
through additional measures such as Lawson’s CTSR (Coletta & Phillips, 2010), which is a
test of formal reasoning, and the ways in which MRS correspond to the force concept
(Nieminen, Savinainen, & Viiri, 2012). In these ways, the PER community is committed
to the promotion and assessment of conceptual change through scientific thinking and
reasoning with MRS.
Coletta and Phillips (2010, p. 13) created the Thinking in Physics (TIP)
instructional program in order to “improve students’ thinking and problem-solving skills”
in first-semester introductory physics courses. However, though the term thinking is used
14 times in the article, nowhere within the article is the term defined; rather, it is always
positioned in the context of either scientific reasoning, or problem-solving skills. The
conceptual basis for the TIP intervention was the Cognitive Acceleration through Science
Education (CASE) program by Adey and Shayer (1994), and the Numerical
Relationships (NR) curriculum by Kurtz and Karplus (1979). TIP is one of many IE
interventions that enjoys significant gains in conceptual understanding and problem-
solving skill (Coletta & Phillips, 2010) when compared to traditional methods of
instruction (Hake, 1998). However, no clear definition of thinking and reasoning emerges
from this body of literature in spite of the fact that tremendous gains have been recorded
by the PER community’s attention to theoretical and pedagogical reforms that define IE.
Active learning and IE strategies involving reading, experimentation and
discussion produce significant changes in formal reasoning when compared to traditional
methods (Marušić, Mišurac Zorica, & Pivac, 2012). In their study, Marušić et al.
compared a control group (n = 124) experiencing traditional lecture methods, with a group
(n = 181) learning physics via lecture and reading (LPLR), and another group (n = 170)
learning physics by doing (LPD). Both the LPLR and the LPD groups engaged in
discussion of course content, the only difference being whether the discussion focused
on lecture/reading content or classroom experiments. There was no
statistically significant difference in pretest scores among the three groups, and the control
group experienced no significant changes on the post-test. However, the
normalized gain on the CTSR for the LPLR group was 0.016, while the LPD group had
a gain of 0.31. Transitions from concrete thinking to formal thinking occurred for 24% of
LPLR learners and 44% of LPD learners, and were attributed to the active learning
strategies of predict, observe, and explain in small and large group settings. In a smaller
study, Marušić and Sliško (2012) repeated these findings by obtaining effect sizes of d =
0.30 for the reading, presenting, and questioning (RPQ) group (n = 91), and d = 0.65 for
the experimenting and discussion (ED) group (n = 85).
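The two effect measures reported in this paragraph can be made concrete with a short sketch. It assumes that the normalized gain follows Hake's (1998) definition, the ratio of the actual gain to the maximum possible gain on percentage scores, and that d is Cohen's d computed with a pooled standard deviation; the cited studies may have used variants of either measure. The function names and all sample values below are hypothetical and are not data from the studies discussed.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_gain(pre_pct, post_pct):
    """Hake-style normalized gain: actual gain over maximum possible gain.

    Both arguments are percentage scores (0-100), an assumption of this sketch.
    """
    if pre_pct >= 100:
        raise ValueError("pre-test score leaves no room for gain")
    return (post_pct - pre_pct) / (100 - pre_pct)


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical class averages in percent correct, for illustration only.
print(normalized_gain(40.0, 58.6))  # roughly 0.31

# Hypothetical post-test scores for two small groups, for illustration only.
print(cohens_d([14, 16, 18, 20], [12, 14, 16, 18]))
```

Read this way, a normalized gain of 0.31 means that a group closed about a third of the distance between its pretest average and a perfect score, which is why the measure allows comparison across groups with different starting points.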

IE is the category under which the most successful reforms in PER fall, and
includes approaches such as Peer Instruction (Gok, 2011), Modeling Instruction, and
Workshop Physics (Cahill et al., 2014), which are all capable of producing conceptual
gains of more than twice those of traditionally taught students on assessments such as the Force
Concept Inventory (Bruun & Brewe, 2013; Cahill et al., 2014; Formica, Easley &
Spraker, 2010; Rudolph et al., 2014). The Force Concept Inventory (FCI) was created to
assess an individual student’s Newtonian force concept before and after instruction, under the
premise that said students possess particular scientific reasoning skills (Hestenes,
2010), even though it never defines the terms thinking or reasoning.
IE methods alone are not likely to produce the highest conceptual gains on the
FCI for portions of the student population who do not possess certain cognitive skills—
such as those measured by the Lawson Classroom Test of Scientific Reasoning (CTSR)
(Coletta et al., 2007a), or the SAT (Coletta et al., 2007b). Moreover, the companion
assessment to the FCI for measuring problem-solving skill associated with the Newtonian
concept, the Mechanics Baseline Test (MBT), further defines the Newtonian threshold
in terms of problem-solving ability (Hestenes & Wells, 1992). Students who score 60%
or 80% on the FCI are typically able to solve problems on the MBT at the same levels of
performance. These findings reveal how conceptual change corresponds to the critical
and scientific thinking or reasoning that accompanies problem-solving skill. However,
although the PER literature claims gains in scientific thinking and reasoning through
assessments like the FCI and the MBT, the fact that no clear definition of thinking or
reasoning has emerged from that literature indicates a need to revisit the fundamental
underpinnings of the theories and methods that have driven this success. The continued
conflation and fuzzy or non-existent definitions of the terms thinking and reasoning stalls
the scientific progress that is so desperately needed in the research of cognition and
instruction. In terms of cognition, the impact of such a renaissance has seemingly
limitless potential for clarity and progress within the teaching enterprise.
Representational consistency is the ability to interpret representations of content
and context that are isomorphic (Nieminen, Savinainen & Viiri, 2012). In their study of
131 high school students who took the Representational Variant of the Force Concept
Inventory (R-FCI) for representational consistency (Nieminen, Savinainen & Viiri,
2010), and the FCI, Nieminen et al. found a strong positive correlation between pre-
instructional levels of representational consistency and conceptual change associated with
the force concept by correlating pre-test R-FCI scores with FCI gains. Additionally, there
was no correlation between pre-instructional representational consistency and the gain
that they obtained in representational consistency between pre- and post-test conditions—
thus suggesting that prior knowledge is not a limiting factor in a student’s ability to learn
MRS and subsequently use that new knowledge to advance conceptual change and
problem-solving skills (Nieminen et al., 2010). In a related study, De Cock (2012) noted
that student success in solving a problem is related to both the representational format of
the problem and the underlying concept. Moreover, the ability of a student to deploy
MRS is related to the initial representational format of the problem that they are engaged
in solving.
The importance of coordinating the psychology of thinking and reasoning with
the scientific types and practices of thinking and reasoning converges where conceptual
and epistemological changes occur. Multiple Representational Systems (MRS) express
the rational, empirical, and metaphorical nature of scientific content (Lancor, 2012; Lee
& Chin-Chung, 2012; Plotnitsky, 2012). Human thinking and reasoning, with and on
these MRS, is central to the process of science (Plotnitsky, 2012); and therefore to the
concepts and beliefs that it has capacity to convey to its consumers.
Study methodology. Hammer and Elby (2012) suggested a qualitative approach
to classroom observations of IP student beliefs rather than focusing primarily on the ways
that student beliefs differ from educators’ views via the use of epistemological surveys.
Moreover, careful consideration of the student’s epistemological resources as they are
situated within the context of IP coursework is central to uncovering the methods and
processes that activate them. Bell and Linn (2012) found that students are more likely to
develop a collection of disjoint ideas about physics rather than a cohesive view—which
requires an effective instructional strategy in order to equip students with the conceptual
and representational tools that are needed for structuring knowledge in a meaningful way.
According to Bell and Linn (2012), one reason for this is that students tend to see science
differently than scientific inquiry. In other words, students see science as merely a static
collection of facts, whereas scientific inquiry is a dynamic knowledge-generating enterprise. Student
success is therefore linked to their epistemological view about scientific knowledge.
Learning environments based on IE models have a long-standing record of
success in terms of conceptual change related to the central idea in Physics, namely the
force concept (see Coletta & Phillips, 2010; Hake, 1998, 2007; Hestenes, 2010). The key
features of an IE IP classroom are guided inquiry and collaboration that are facilitated by a
pedagogical approach which leverages Socratic dialog as a means for constructing
coherent knowledge structures (Cahill et al., 2014; Hake, 1998, 2007; Rudolph et al.,
2014). Given the need for student knowledge construction to match the actual process of
doing science (Bell & Linn, 2012; Marušić et al., 2012), and the established structure of
IE IP learning environments from a modeling perspective (Hestenes, 2010), a qualitative
method such as grounded theory (Charmaz, 2006) is required for theoretical advancement
in a setting where the lived experience of the students is purposely designed to mimic the
true practice of science. Key factors of the learning environment where a proper view of
science as inquiry can be developed are ones that position inquiry as a means for
obtaining personally relevant understanding, as well as fruitful collaboration and debate
of the findings that emerge from inquiry (Bell & Linn, 2012).
Kalman and Rohar (2010) used an intrinsic case study design to determine that a
curriculum structured around reflective writing, collaborative groups, and debate is
capable of positively influencing the development of a scientific mindset. In their study,
Kalman and Rohar recruited over 75 students from three universities in order to collect 3
groups of 5 students—one from each location. The researchers analyzed written artifacts
from the case study participants, as well as interview data to assess cognitive activity
during reflective writing, summary writing, conceptual change, and views on the
usefulness of the course design. In addition to the qualitative evidence for conceptual and
epistemological development over the course of one semester, each of the 15 participants in
this study scored in the top 25% to 75% of their classes on the final examination—thus
suggesting a positive outcome for the course design (Kalman & Rohar, 2010).
Hofer (2012) suggested that future research needs to find relations between
psychological constructs and epistemological frameworks in order to improve
methodology and terminology such that comparable studies can be conducted. Wiser and
Smith (2010) showed how concept formation and personal epistemology are connected
through metacognitive control while modeling phenomena; however, conceptual change
research has been dominated by pre-post testing strategies (see Hake, 1998) rather than
process studies (diSessa, 2010). Representational systems serve as epistemic resources
for modeling real-world phenomena (Bing & Redish, 2012; Moore et al., 2013), and it is
the coupling of internal representations (mental models) with the external representations
that we call models, which is critical to the reasoning process (Nersessian, 2010) and its
assessment. These findings suggest a deep connection between personal epistemology
and representational systems as they function in concert with thinking, reasoning, and
conceptual change. However, it is still unclear exactly what the processes and
mechanisms of each construct are (Bendixen, 2012; Hofer, 2012).
The qualitative methodology used in this study—and its associated grounded
theory design—are fine-tuned to probe the deep connections described herein between
personal epistemology, MRS, and the psychological constructs of thinking and reasoning
in terms of conceptual change. Moreover, the evidence that reflective writing and
collaboration lead to the development of a scientific mindset (Kalman & Rohar, 2010), as
well as an IE instructional setting where classroom activities mimic true science (Bell &
Linn, 2012), suggests a research method that has the capacity to reveal the how of
processes that influence the lived experience of persons engaged in learning (Bernard &
Ryan, 2010; Boeije, 2010) that occurs within a social environment (Yin, 2011).
Qualitative research methods consist of inductive analytical techniques that make
developing an understanding of phenomena from the viewpoint of the participants
possible (Merriam, 2010) in a manner that respects how the meaning is constructed in
social settings (Yin, 2014).
Study instruments and measures. The Hake (1998) study demonstrated
conclusively that Interactive Engagement (IE) methods dramatically outperform
traditional methods of instruction in terms of conceptual gains as measured by the Force
Concept Inventory (FCI) and problem solving skills as assessed by the Mechanics
Baseline Test (MBT). A total of 62 Introductory Physics courses with a total enrollment
of 6,542 students from various colleges, universities and high schools participated in this
study, which found that IE methods produced more than double the average gain of
traditional methods on the FCI, a difference of nearly 2 standard deviations. Results on the
MBT, which involved approximately half the entire study sample (n = 3,259 in 30 courses),
showed a strong (r = 0.91) correlation between problem-solving skill on the MBT and conceptual knowledge
on the FCI, where the highest gains on the FCI correlated with the highest post-test scores
on the MBT. Both the FCI and the MBT were fixed events in the normal curriculum of
IP students in the study sample at Central Arizona College. While these assessments do
not measure epistemological change in any way, they do measure conceptual change—
which is expected along the way to epistemological change. Their inclusion in this
grounded theory study was based on their place in the natural setting of student
experience, as well as their expected value with respect to the theoretical foundations of
the study as described herein.
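The correlation coefficient reported above can be illustrated with a minimal sketch of the Pearson product-moment formula. It assumes that r is computed from paired course-level values, which is one plausible reading of the Hake (1998) summary. The function name and the five data pairs are hypothetical; they are not Hake's data and are not intended to reproduce the reported r = 0.91.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)


# Hypothetical course-level values, for illustration only:
# normalized FCI gain and average MBT post-test score (percent) per course.
fci_gain = [0.20, 0.25, 0.45, 0.55, 0.65]
mbt_post = [35.0, 42.0, 55.0, 58.0, 70.0]

print(round(pearson_r(fci_gain, mbt_post), 2))
```

A value of r near 1 indicates that courses with higher conceptual gains also tend to have higher problem-solving scores; it does not by itself establish that one causes the other.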
The Psycho-epistemological Profile (PEP) was selected for measuring
epistemological change for two reasons. First, the three dimensions that it measures
perfectly match the properties of physics as a domain of knowledge requiring rational,
empirical, and metaphorical thinking and reasoning (Lancor, 2012; Lee & Chin-Chung,
2012; Plotnitsky, 2012). The PEP defines the rational dimension of personal epistemology as
knowledge obtained through reason and logic, the empirical dimension of personal
epistemology as knowledge derived and justified through direct observation, and the
metaphorical dimension as knowledge derived intuitively with a view to
subsequent verification of its universality. Second, the theory and practice of Physics in
any setting assumes these dimensions within the varied uses of MRS, and therefore
presents an optimal matching of assessment with curriculum and instruction within an IE
Introductory Physics course.
Summary
One can hardly deny that thinking and reasoning are fundamental features of the
cognitive activities that accompany classroom learning. The evidence described herein
suggests a deep connection between personal epistemology, metacognition, and the use
of representational systems for the sake of conceptual change (Bendixen, 2012; Mason &
Bromme, 2010). However, research initiatives to date have failed to consistently define
and distinguish what is meant by the terms thinking and reasoning (Mulnix, 2012;
Nimon, 2013; Peters, 2007), as well as the specific factors that produce epistemological
change in terms of representational systems, or schemata. One exception to the lack of
coherence in defining thinking and reasoning is the multi-decade work of Linda Elder
and Richard Paul at the Foundation for Critical Thinking (FCT, 2014). According to
Elder and Paul (2007), thinking is merely a form of reasoning—which corresponds to the
conflation of both constructs by Mulnix (2012) and Evans (2012). Paul and Elder (2008)
then formalized the equivalence of thinking and reasoning by specifying 8 universal
elements of thought and 35 dimensions of critical thought. While the model is useful for
coding student artifacts, it does not fully assist the effort to answer the research questions
that specify thinking and reasoning as separate constructs; hence the author’s definitions
of (1) thinking as the ability to construct a model, and (2) reasoning as the ability to relate
two or more models permit a bifurcation of the Paul and Elder model for the sake of analysis.
The two gaps in the literature—mechanisms of epistemological change and
thinking versus reasoning—are connected by the representational systems that a student
is able to use along with the thinking and reasoning that students employ when solving
problems. The central aim of this study is to determine how thinking and reasoning
with MRS influence personal epistemological change in an IP classroom.
Introductory Physics students at Central Arizona College participated in a series
of activities designed to leverage the power of multiple representational systems for
encoding the structure (models) of physical phenomena (law-like behavior), and
simultaneously promote metacognitive reflection on the meaning of the results, as well as
the tools and the processes that have capacity to produce them. Twenty-nine students
comprising 2 class groups served as the study sample. The class groups consisted of one
algebra-based physics class group and one calculus-based physics class group. The
structure of the classroom experience under study matched the conceptual frameworks
previously declared for this study in terms of how student journaling and classroom
collaboration lead to self-regulated (Cifarelli, Goodson-Espy, & Jeong-Lim, 2010) and
self-efficacious (Muis & Franco, 2010) epistemic metacognition (Hofer, 2004).
Specifically, collaborative activities shift the locus of control from teacher to student in
ways that promote epistemic metacognition (Muis & Duffy, 2013).

The value of this research is rooted in its potential to simultaneously address all of
the concerns exposed by the gaps identified within the multiple streams of literature cited
herein. In addition to highlighting the deep connections that exist between conceptual
change and epistemological change in terms of representational systems, the opportunity
to lend clarity to the psychological constructs of thinking and reasoning in general terms
as well as how they convey to the central focus of this study (epistemology), is
substantial. Representational systems (language in general) are undeniably essential to
communicating ideas and personal beliefs within social settings—such as student
learning environments. Journaling and collaborative discourse provide ample evidence
for how students use and think about the representational systems that they deploy within
academic settings. Given the ease with which such artifacts can be obtained, a corpus of
student journals, interviews, and polls is at the core of data collection in the study proposed
herein. Moreover, as the ebb and flow of classroom activity is somewhat fluid and
adaptable to student and instructor needs, a grounded theory design was selected for
organizing such data for the sake of the stated research questions.
The research questions for this study can be summarized as: how do students use
MRS in their thinking and reasoning about personal beliefs as situated within the context
and the goals of an IP course? In other words, how do they think about their beliefs,
which are also thoughts themselves? Epistemic metacognition (Hofer, 2004) is therefore
an almost inevitable outcome within a learning environment where students are required
to compare and contrast ideas related to what they think and believe. According to Paul
and Elder (2008), thinking and model building of this sort is where scientific opinions
and point-of-view emerge. Qualitative methods are ideal for capturing the true nature of
participant viewpoints within the natural setting from which they emerge (Bernard &
Ryan, 2010; Boeije, 2010) in a social environment (Yin, 2011), and were thus employed
by this study.
The qualitative data in the form of student journals, surveys, and interviews
obtained throughout the study are punctuated by a number of traditional IE assessments of
IP, and the Psycho-epistemological Profile (PEP). The general expectation is that
conceptual change as measured by the FCI—or other assessments occurring in the study
environment—will correspond with epistemological change as described by the PEP.
Given the rich context of IP for the use of MRS, and the seemingly inevitable result of IE
methods producing conceptual change (Coletta & Phillips, 2010; Hake, 1998), it is
reasonable to expect the potential for epistemological change in concert with student
discourse and activity within the natural setting of an IP course. Chapter 3 will provide a
detailed accounting of student views and practice using MRS in Physics.

Chapter 3: Methodology
Introduction
The purpose of this study is to determine how students in an IP classroom think
and reason with MRS as they experience epistemological change. The importance of this
study hinges on its ability to answer a long-standing deficit in the literature on
epistemological change (Bendixen, 2012; Pintrich, 2012) by providing a deeper
understanding of the processes and mechanisms of epistemological change as they
pertain to context (domain of knowledge) and representational systems in terms of the
psychological constructs of thinking and reasoning. Such findings better inform the
Physics Education Research (PER) community concerning the capacity that MRS have
for encoding meaning during the scientific thinking and reasoning process, while
simultaneously clarifying what is meant by those processes. Moreover, the relative
importance of personal epistemology in the process of conceptual change—either as a
barrier or a promoter—is the kind of information needed for continued progress in the
PER reform effort, as well as learning theory in general. The importance of advancing
scientific thinking, reasoning, and conceptual change in terms of epistemological
change lies in the clear evidence from PER that conceptual change has a positive effect
on achievement in terms of problem-solving skills (Coletta & Phillips, 2010; Coletta et
al., 2007a; Hake, 2007).
The research questions can be summarized as follows: How do students in IP use
representational systems to encode meaning, and promote their own thinking, reasoning,
and understanding, as they experience conceptual and/or epistemological change? This
chapter presents a detailed review of the research questions and the methodology and
design employed to answer them. Efforts to ensure the validity and reliability of the
measures and instrumentation are discussed in conjunction with the data collection and
analyses. The chapter terminates with a discussion of ethical concerns and various
limitations of the study.
Statement of the Problem
It was not known (a) how thinking and reasoning with multiple representational
systems (MRS) occurs, and (b) how that sort of thinking and reasoning affects
epistemological change in terms of mechanisms and processes—whether cognitive,
behavioral, or social—in an IP classroom. The use of representational systems—such as
symbols, diagrams, and narratives—is undoubtedly central to the progress of science
education by virtue of its ubiquitous deployment in the realm of natural science itself
(Plotnitsky, 2012). Given the cognitive filter that personal epistemology provides for the
acquisition and the application of knowledge (Schommer-Aikins, 2012), it seemed
reasonable to investigate the nature of epistemological change in concert with the
thinking and reasoning that occurs by means of the representational systems associated
with a domain of knowledge—such as IP.
The importance of this study hinged on its ability to answer a long-standing
deficit in the literature on epistemological change (Bendixen, 2012; Pintrich, 2012) by
providing a deeper understanding of the processes and mechanisms of epistemological
change as they pertain to context (domain of knowledge) and representational systems in
terms of the psychological constructs of thinking and reasoning in an IP classroom. Such
findings would then better inform the Physics Education Research (PER) community
concerning the capacity that MRS have for encoding meaning during the scientific
thinking and reasoning process, while simultaneously clarifying what is meant by those
processes. Moreover, the relative importance of personal epistemology in the process of
conceptual change (diSessa, 2010)—either as a barrier or a promoter—is the kind of
information needed for continued progress in the PER reform effort (Redish, 2013), as
well as learning theory in general.
Research Questions
The goal of this qualitative grounded theory study was to determine the influence
that multiple representational systems (MRS) have on the thinking and reasoning of
community college students with respect to their conceptual frameworks and personal
epistemology. Semi-structured interviews based on instructional goals, survey response
data, and student journal entries were collected at regular intervals during the study in
order to obtain emergent themes concerning how students think and reason about
mathematics, as well as how they monitor their own thinking. Journals and semi-
structured interviews—in the form of group Socratic dialogs—revealed the ways in
which students shift between representational systems (languages) in an effort to model
mathematical systems. Multiple electronic polls were given throughout the treatment in
order to capture opinions about thinking and reasoning, knowledge acquisition and usage,
as well as how concepts and beliefs change as a result. Exit interview questions
terminated the semester filled with daily group interview/discussions and several weekly
journals covering the same material. By that time, the study population’s ability to have
substantial and meaningful discourse was fairly well developed, as evidenced by the
more than 200 pages of interview transcripts. Also, each individual submitted a written
version of his or her own answers to the exit interview questions prior to the interview.

R1: How do IP students use MRS in their thinking and reasoning?
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?
The instrument used herein for assessing personal epistemology (the PEP) has no
preferred direction for epistemological change because it simply measures personal
epistemology along three dimensions: rational, metaphorical, and empirical. The structure
of an IP course is already fine-tuned to the PEP dimensions given the widespread use of
MRS in a collaborative learning community focused on conceptual development and
problem-solving skills that involve the use of narrative, specialized symbol systems, and
diagrammatic tools. The PEP survey was selected primarily due to its affinity with an IE
IP course as described above; but also in light of the fact that the most-used instruments
for personal epistemology still suffer from unstable factors (Barzilai & Zohar, 2014).
A grounded theory approach (Charmaz, 2006) was used in designing this
qualitative study in order to produce a substantive theory capable of describing the
complex interactions that comprise the phenomena of thinking and reasoning with MRS,
and its influence on epistemological change within the context of a community college IP
classroom. Grounded theory is a qualitative design that allows a researcher to form an
abstract theory of processes or interactions that are grounded in the views of the
participants (Charmaz, 2006; Glaser & Strauss, 2009). Given the fact that personal
epistemology is entirely about personal beliefs and viewpoints, a grounded theory
exploration of the underlying mechanisms and processes of epistemological change is
entirely consistent with the research questions probing how students think and reason
their way towards epistemological change using MRS. Thirty-four students comprised
the study population from which the archived data on 29 of those students was drawn—
which is consistent with the 20-30 study participants recommended for grounded theory
research by Creswell (2013), and the 30-50 participants suggested by Morse (2000).
Charmaz (2006) suggests that 25 interviews are sufficient for grounded theory designs
on small projects, and this study consisted of 44 interviews. Given that the study used
interviews, written journals, and electronic polls, a group of 29 student participants was
more than adequate in order to obtain the level of theoretical saturation which is the
ultimate criterion for sample size in grounded theory designs (Corbin & Strauss, 2008).
Research Methodology
A qualitative approach was used in this study. The foundations of qualitative
research rest on the inductive analysis that makes developing an understanding of the
phenomena from the viewpoint of the participants possible (Merriam, 2010) in a manner
that respects how the meaning is constructed in social settings (Yin, 2014) where the
researcher is the primary data collection instrument responsible for producing a richly
descriptive account of the outcomes (Merriam, 2010). Given the nature of the study on
personal epistemology—beliefs about knowledge and its acquisition—and how students
obtain advances in personal epistemology, qualitative methods lend themselves best to
the project described herein because quantitative test scores do not address the ‘how’ of
anything with a view to theory building until qualitative methods expose the concepts and
hypotheses to be quantified (Yin, 2011). Moreover, given that the research design was
grounded theory, the necessity of qualitative methodology for data collection and analysis
is properly constrained within this methodology by virtue of its underlying logic and
interpretive framework (Charmaz, 2006).

Research Design
A grounded theory approach (Charmaz, 2006) was used in designing this
qualitative study in order to advance a new theory that is capable of describing the
connections that exist between thinking and reasoning with MRS, and its influence on
epistemological change relative to the student experience in an IE IP classroom.
Grounded theory designs lend a researcher the required tools for developing a theory of
processes or interactions that are grounded in the views of the participants (Charmaz,
2006; Glaser & Strauss, 2009). Personal epistemology is entirely about personal beliefs
and viewpoints; therefore, a grounded theory exploration of the underlying mechanisms
and processes of epistemological change is entirely consistent with the research questions
probing how students think and reason their way towards epistemological change using
MRS.
Baxter Magolda (2004) deployed a grounded theory approach for a 16-year
longitudinal study upon which the Epistemological Reflection model (Baxter Magolda,
2012) was established, because of its affinity with constructivist developmental theories,
the constructivist paradigm in general, and the fundamental structure of qualitative
inquiry at large. According to Baxter Magolda (2004), the data that she obtained from
more than 1,000 students prior to the longitudinal study served as the categories against
which the grounded theory could be constantly compared to the evolving interpretations
that emerged throughout the study period. In this way, the grounded theory defines the
core category around which all emergent themes find ground (Corbin & Strauss, 2008),
and thus serves to manage the uncertainty—and even the bias—that accompanies the
analysis of personal epistemology as observed within a constructivist context.

Palmer and Marra (2004) used a grounded theory design to study the domain-
specific epistemologies of 220 students attending a large eastern research university.
Students were interviewed extensively in order to determine their epistemological
orientation of science as facts versus humanities as facts (stage 1), science as theory
versus humanities as opinions (stage 2), and science as an evolving construction of
commitments within theory versus humanities as construction of facts with evidence and
reason (stage 3). In a sub-sample of 60 upper division students in science and
engineering, it was found that the shift from stage 1 to stage 2 is easier for the humanities
student than it is for the science student; however, the shift from stage 2 to 3 is much
easier for the science student than the humanities student. The grounded theory design
selected by these researchers allowed for the evidence to be grounded in the narratives of
the students engaged in epistemic development, and thus form a predictive theory for
explaining the differences and the transitions that naturally emerge.
Thirty-four Introductory Physics (IP) students comprised the study population
from which archived data on 29 of those students was sampled—which is consistent with
the 20-30 study participants recommended for grounded theory research by Creswell
(2013), and the 30-50 participants suggested by Morse (2000). Charmaz (2006) suggested
that 25 interviews are sufficient for grounded theory designs on small projects, and this
study conducted 44 interviews. A similar study consisting of 18 students used interviews
and questionnaires to compare and contrast domain-specific epistemological beliefs with
respect to physics and biology (Lee & Chin-Chung, 2012). Forsyth (2012) conducted a
single case study of one individual examining the epistemology of far transfer (how one
domain of knowledge influences the understanding of other domains) using a series of three
interviews aimed at describing the relations of similarity between physics and its
application to other content areas. Given the current study is using interviews, written
journals, and electronic polls, a group of 20 – 30 student participants should be more than
adequate for obtaining the level of theoretical saturation which is the ultimate criterion
for sample size in grounded theory designs (Corbin & Strauss, 2008).
Population and Sample Selection
Thirty-four students enrolled in two IP courses at Central Arizona College— a
Hispanic serving institution (HSACU, 2014) located in Coolidge, Arizona—during the
2014 fall semester comprised a small portion of the nearly 6,500 students attending that
campus. Course enrollment in each course was 17 students ranging in age from 17 – 45.
Archival data collected on the study sample of 29 students was from the existing
curriculum for IP students at Central Arizona College. Site authorization (see Appendix
B) has been obtained to use archived data, and specifies how the school and the
researcher will maintain anonymity of the student study sample during data collection
and analysis. Given that the data collected was from archival sources, no informed
consent was required or obtained. Yin (2011) stated that Institutional Review Board
(IRB) practices are typically ambivalent when it comes to data sources such as these;
however, so long as the basic ethical mandate to protect the anonymity of students is
upheld, archival data in this form does not require consent. Nevertheless, the growing
trend within social science research to use textual and visual archived data is an ethical
problem only to the extent that, as databases increase in size, identifying
participants becomes more likely (Crow & Edwards, 2012). Given that the archival
database for this study is limited to one college with participants who spend no longer
than 2 years at the institution, the threat of violating anonymity is virtually non-existent
so long as local researchers and the local IRB maintain data security. Moreover, the
decision to use archived data—i.e. artifacts already graded and returned to students of the
researcher—was implemented so that students felt no undue performance pressure
relative to the study. Artifacts collected during the progress of each course contain no
student identifiers other than their student identification number, which is removed prior
to release for analysis through a joint effort of the local IRB office and the author-
researcher, and replaced with generic identifiers FS1 and MS1 in order to designate
female student 1, and male student 1, respectively. Archived data will be held on site by
the author-researcher for no more than 5 years prior to disposal.
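The identifier replacement described above can be pictured with a short sketch. The function name, the record layout, and the sample ID values are hypothetical; the text states only that student identification numbers were replaced with generic labels such as FS1 and MS1.

```python
from typing import Dict, List, Tuple


def assign_generic_ids(records: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map each (student_id, sex) record to a label such as FS1 or MS1.

    `records` is a hypothetical layout: (student_id, "F" or "M").
    Labels are numbered in order of first appearance within each group.
    """
    counters = {"F": 0, "M": 0}
    mapping: Dict[str, str] = {}
    for student_id, sex in records:
        if student_id in mapping:
            continue  # already labeled; keep the first assignment
        counters[sex] += 1
        mapping[student_id] = f"{sex}S{counters[sex]}"
    return mapping


# Made-up ID numbers, for illustration only.
roster = [("A001", "F"), ("A002", "M"), ("A003", "F")]
labels = assign_generic_ids(roster)
print(labels)
```

Once artifacts are relabeled, a lookup table like this one would need to be stored separately from the artifacts, or discarded, for the relabeling to protect anonymity.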
The study population from which archived data was drawn consisted of 34
individuals—which is consistent with the 20-30 study participants recommended for
grounded theory research by Creswell (2013), and the 30-50 participants suggested by
Morse (2000). This study conducted 44 semi-structured interviews, which is far more
than the minimum 25 suggested by Charmaz (2006) when using grounded theory designs
on small projects. Given the current study is using interviews, written journals, and
electronic polls given within the online learning management system used for
coursework, a group of 29 student participants should be more than adequate for
obtaining the level of theoretical saturation which is the ultimate criterion for sample size
in grounded theory designs (Corbin & Strauss, 2008).
Instrumentation and Sources of Data

The IP course from which archived data were sampled involves semi-structured
interviews in the form of group discussions following lab investigations, electronic polls,
group interviews, and student journal entries that are collected throughout the semester.
During the first part of the course under study, topics including the nature of physics and
reality, as well as the foundations of mathematics and geometry were part of the
curriculum leading into the enterprise of crafting physical laws that describe empirically
familiar regularities in nature. The terminal point of the course is group exit interviews;
however, its content is contingent on progress through the activities that precede it, and
most likely not in the treatment phase. Figure 3 illustrates the typical flow
of activities during each learning cycle. Each student will complete the Psycho-epistemological
Profile (PEP; Royce & Mos, 1980) before any IP class activities are
conducted, and then again prior to the end of the course. Each participant will complete
the study by participating in a group exit interview that includes a review of the changes
in their PEP profile scores.
The Psycho-epistemological Profile (PEP) measures personal epistemology on
three dimensions: Rational, Empirical, and Metaphorical (Royce & Mos, 1980). The
rational dimension of PEP assumes that knowledge is obtained through reason and logic,
whereas the empirical dimension derives and justifies knowledge through direct
observation. The metaphorical dimension of PEP sees knowledge as derived intuitively
with a view to subsequent verification of its universality.

Figure 3. Typical classroom activity life cycle.
In general, the standard curriculum for IP students at CAC involves a guided inquiry (lab)
event followed by 1 – 2 group collaborations and discussions, as well as subsequent
electronic polling. At least one individual journal assignment is included along with a
follow-up writing assignment.

Classroom activities and assessment instrument. Four distinct classroom
activities comprised the standard curriculum at the beginning of each IP class delivered at
Central Arizona College, which address the nature of Physics versus reality, the
conceptual basis of numbers, how to physically model a conversion factor, and creating
physical laws of motion from basic observations. Each activity is described in detail
below, as well as included in Appendices A-G.
Physics and reality. The very first activity of this study existed to set up the basic
nature of classroom discourse as a Socratic dialog within a learning environment
designed to mimic a scientific community—otherwise known as Modeling Discourse
(Hestenes, 2010). Students began by individually answering the following questions in
writing. What is Physics? What is Reality? Is Physics Reality? Students then formed small
groups of 3-4 and compared their answers with a goal to report some sort of consensus to
the larger group. During the larger group discussions, each student team presented their
findings on small whiteboards during what is henceforth called a “board meeting.” The
instructor/evaluator posed a series of questions addressing the how and the why that
students answered the way that they did in an open-ended semi-structured group
interview.
As a follow up to this activity, students were assigned a Library/Internet research
project to investigate other opinions/beliefs about this series of questions, and summarize
those findings in a short essay where they reflect on their own initial set of beliefs, and
compare it to both the small and large group consensus, as well as their new findings
from the research project. A poll addressing the change in belief and the mechanisms for
that change was crafted based on these results and delivered prior to the post-essay
follow up discussion through the learning management system used for all coursework. A
second semi-structured group interview addressing the student data generated thus far in
the study asked students to reflect on how and why their beliefs have or have not changed
as a result of the activity. Additionally, students were asked how they felt about the
process, and how this may have changed their views about science in general.
Numbers do not add. The second activity was step one in delivering new and
modified mathematical representational systems for use throughout the course, and was
designed to produce a conceptual change about what numbers actually are, and what they
are used for in mathematical modeling. The worksheet provided to each small group of 3-
4 students consisted of modifying a circle and a square to the point where each shape has
been partitioned into fourths, and labeled with the appropriate numeral ¼. Students were
then compelled to join each representation—the geometric and the numeric—by addition.
This step presented a new challenge to students because their normal belief that ¼ + ¼
adds up to ½ conflicts with the fact that joining the circle and square into one new whole
yields shaded regions that are still ¼ of the new whole. The activity transitioned into
students crafting a consensus viewpoint about what has just happened suitable for sharing
with the full class group. The central goals of the activity were to challenge the traditional
concept of number that students presently hold, as well as the arithmetic representation
and operational definition of the fraction/quotient operation. The instructor posed a series
of questions addressing the how and the why that students answered the way that they
did.
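The arithmetic behind this activity can be sketched in a few lines. The sketch assumes that one fourth of each shape is shaded and that every part counts equally once the shapes are joined; the function name is illustrative and does not come from the course materials.

```python
from fractions import Fraction


def join_wholes(shaded_a: int, parts_a: int, shaded_b: int, parts_b: int) -> Fraction:
    """Shaded fraction of the figure formed by joining two partitioned shapes.

    Assumes every part counts equally, so the joined whole has
    parts_a + parts_b parts, of which shaded_a + shaded_b are shaded.
    """
    return Fraction(shaded_a + shaded_b, parts_a + parts_b)


same_whole = Fraction(1, 4) + Fraction(1, 4)  # both quarters refer to one whole
new_whole = join_wholes(1, 4, 1, 4)           # circle and square joined together

print(same_whole)  # 1/2
print(new_whole)   # 1/4
```

The two results differ because ordinary fraction addition presumes a shared whole, whereas joining the shapes creates a new whole with eight parts, two of them shaded.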
The follow up individual journal assignment (one-page or more essay) required
each student to reflect on what he or she learned and how their understanding of number,
fraction and quotient—and the ways that they are represented—had changed as a result of
the activity. A poll addressing the change in belief and the mechanisms for that change
was crafted based on these results, and delivered prior to the post-essay follow up
discussion through the learning management system used for all coursework. A second
semi-structured group interview addressing the student data generated thus far in the
study asked students to reflect on how and why their beliefs have or have not changed as
a result of the activity. Additionally, students were asked how they felt about the process,
and how this may have changed their views about science in general. This activity was
one of many activities described herein designed to change thinking and reasoning, and
thus personal epistemology as beliefs and concepts change as a result of such influences.

The law of the circle. This activity leveraged the findings of the prior activity as
step two in learning how to build mathematical (axiomatic) laws from basic observations
(measurement) first represented in natural language (narratives). Students were given a
set of different sized tubes and equipment for measuring circumference and diameter.
The goal was to not only obtain the measurement data, but also to recognize the
empirically familiar regularity that circumference and diameter increase or decrease
together, which served as the basis for stating a physical law about circles. Subsequent to
obtaining a suitable version of this law in natural language form (determined by
instructor/evaluator), and based on an expanded view of what a quotient operation can be
used for from the prior experiment, students were guided through the process of using
arithmetic to craft the conventional formula for the circumference of a circle. The
instructional goal had been reached at the point where students recognize that the
relationship between circumference and diameter is always the same—i.e. the number pi.
In the large group discussion that followed, the instructor posed a series of questions
addressing the how and the why each student group answered the way that they did, as
well as how they felt about their changes in thinking and reasoning.
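The regularity students are guided toward can be illustrated with a short sketch. The diameters below are idealized values generated from C = pi * d, not student measurements, and the function name is hypothetical.

```python
import math


def circumference_to_diameter_ratios(measurements):
    """Return C/d for each (circumference, diameter) pair."""
    return [c / d for c, d in measurements]


# Idealized values generated from C = pi * d; not data from the study.
diameters_cm = [2.0, 3.5, 5.0, 8.0]
measurements = [(math.pi * d, d) for d in diameters_cm]

ratios = circumference_to_diameter_ratios(measurements)
print(ratios)
```

With real tubes the quotients would scatter around pi because of measurement error, which is part of what makes the constant ratio an empirically familiar regularity and not a definition handed to students.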
The follow up individual journal assignment (one-page or more essay) required
each student to reflect on what he or she learned during the activity, and how their
understanding of number, fraction, quotient, and equation—and the ways that they are
represented—has changed as a result of the activity. A poll addressing the change in
belief and the mechanisms for that change was crafted based on these results, and
delivered prior to the post-essay follow up discussion through the learning management
system used for all coursework. A second semi-structured group interview addressing the
student data generated thus far in the study asked students to reflect on how and why
their beliefs have or have not changed as a result of the activity.
The zeroth laws of motion. Based on observations of various aspects of simple
motion—such as dropping and/or rolling a ball—students were guided to a conclusion
consistent with the fact that no object can be in two places at the same time, or that it
always takes some non-zero amount of time in order for anything to change position.
There were numerous variants on those statements, and the goal was to simply get as
close as possible to either of the two model statements provided herein. This inquiry
leveraged the findings and skills developed in the two prior investigations in terms of
how to encode empirically familiar regularities (laws) using symbols and arithmetic
operations—namely the quotient operation in terms of how it has come to be defined in
the study. The traditional instructional goal of this activity was to obtain the essence of
the principles of momentum and energy—upon which all of physics is based, and are
more fully explicated as the course progressed beyond the study phase.
The follow up individual journal assignment (one-page or more essay) required
each student to reflect on what he or she learned during the activity, and how their
understanding of momentum, energy, laws, and physics—and the ways that they are
represented—had changed as a result of the activity. A poll addressing the change in
belief and the mechanisms for that change was crafted based on these results, and
delivered prior to the post-essay follow up discussion through the learning management
system used for all coursework. A second semi-structured group interview addressing the
student data generated thus far in the study asked students to reflect on how and why
their beliefs have or have not changed as a result of the activity.

Exit interviews. Each study participant completed a brief semi-structured group
interview designed to elicit responses that represent their reactions and feelings about the
entire experience. Within the structure of those small-group interviews, the following
questions were directed at each participant. Each interview was recorded and
subsequently transcribed for analysis. The following questions are based on the two
research questions described in chapter 1: (R1) How do IP students use representational
systems in their thinking and reasoning, and (R2) How does the use of MRS in the
thinking and reasoning of IP students promote personal epistemological change? This
body of questions served as a foundation for a semi-structured discussion designed to
elicit extensive student response concerning their own thinking and reasoning relative to
personal epistemology.
1. How has your thinking changed as a result of this experience?

2. How has your reasoning changed as a result of this experience?

3. How has your understanding changed as a result of this experience?

4. Do any of these changes impact your thinking and reasoning outside of this
experience? How so?

5. Do any of the changes in your understanding impact your beliefs about
anything? How so?

6. In what ways have any personal beliefs changed as a result of this experience?

7. How would you describe conceptual change, and have you experienced any
during this experience?

8. What conceptual changes have you identified in yourself?

Validity
Validity can be obtained through triangulation, saturation, data trails, bracketing
the researchers’ subjectivity/bias, member checks and participant review, prolonged
engagement, and reflexivity (Frost, 2011), as well as simply giving attention to
disconfirming evidence and contradictory interpretations—which is essential to
establishing the trustworthiness, or validity of qualitative data (Yin, 2011). Saturation is a
state within qualitative analysis where new data is no longer productive in its capacity to
generate new themes or categories, and is therefore contingent on the efficiency of the
data collection and management processes that precede it. Properly coding the data,
summarizing, and aggregating those results is not only essential to obtaining confidence
in the emergent themes, but also the only way to really know that you have reached a
saturation point (Saldaña, 2013). Triangulation through multiple data sources—
interviews, documents, and field notes—generally serves the needs of validity for the
qualitative aspects of this study.
The data sources for this study included field notes and memos from the author-
researcher observations, written journal documents, transcribed group interviews, small
group lab discussions, data journal entries, and conceptual inventory results from
instruments like the FCI and TUG-K. Journals, interviews, and field notes/memos were
sufficient for triangulation in this study. Two coding schemes—a priori theoretical and
indigenous in vivo—were deployed in this study. The theoretical codes—such as
instances of thinking or reasoning with one or more representational systems—flowed
directly from the research questions, and the definitions for the constructs of thinking and
reasoning that the author-researcher had derived from the literature. Indigenous (in vivo)
codes were derived directly from the data as it was analyzed, and were therefore
unpredictable in many ways; however, given the nature of this study, it was reasonable to
expect changes in belief and concept, as well as opinions about the usefulness of various
facts and the authority they carried. Given the nature of the study and the metaphorical
dimension of personal epistemology on the PEP survey, the coding for narrative
mechanisms such as metaphors/analogies was expected. The data sources described
herein involved small group, large group, and individual contributions for each activity
under study in the form of written documents, group narratives, and polling data; and
therefore provided a complete and complementary picture of student beliefs, as well as
the ways in which they come to those beliefs. Both coding schemes were applied to all
three data sources providing a coherent way to triangulate the data.
Reliability
The concept of reliability in qualitative research is identical to quantitative
methods, in that (a) consistency is the goal, for the sake of (b) replication by other
researchers (Butler-Kisber, 2010). Qualitative research is capable of generating millions
of words that must be grouped into “units of work” that subsequently can be coded semi-
quantitatively (Johnson, Dunlap, & Benoit, 2010). While each student and group may
respond differently to a particular treatment, it is the nature of the activities and the
questions therein that must be consistent in order for replication of the study to be
meaningful. The details of classroom activity provided in a prior section were given with
this end in mind. With respect to the study itself, those classroom activities were designed
to correspond to one another progressively, while also retaining the same structure in
terms of individual, small-group, and large-group activities, both in scope and in
sequence.
There are at least two levels of replication that researchers attempting to repeat
this study should be aware of before starting a similar study. First, though the content of IP
is extremely stable, the classroom and its curriculum are reformed in accordance with the
Interactive Engagement (IE) methods and paradigms described in the literature review;
thus requiring some minimal preparation in those techniques in order to match the overall
framework of this study. Second, the PEP instrument, and the various PER community
assessments such as the FCI, make no assumptions about the structure and pedagogy of
the learning habitat. Any IP class is eligible to use these same assessment devices for
their intended purposes. Finally, the nature of the student body is a minor factor in terms
of developmental trajectories and demographic qualities. The setting was a rural
community college serving a largely Hispanic student body with ages ranging
from 16 – 63 for the college at large, 17 – 45 within the study population, and 18 – 45
within the study sample.
A remaining threat to reliability is researcher bias; however, field notes and
memos served as two methods of bracketing the personal bias of the author-researcher
during data collection and analysis (Butler-Kisber, 2010). The methodological elements
of triangulation that lend validity actually provided the basic structure for reliability as it
pertains to (1) variations in observation and (2) data collection techniques (Butler-Kisber,
2010).
Data Collection and Management
Site authorization to use archived data from the IP course under study was
obtained (see Appendix C). The student groups participating in this study were a
purposive sample (Bazeley & Jackson, 2013; Frost, 2011) of adult community college
students at Central Arizona College. Thirty-four students comprised the group that met
twice weekly for 3-hour sessions where group interviews and lab activities were
conducted, and were separated by electronic polls, traditional homework assignments,
and journal artifacts collected through the course management system. Electronic polls
were administered as a follow-up to group interviews and journal assignments that are
coordinated with classroom events. Poll results were subsequently the focus of journal
reflections about personal and corporate classroom views. The audio of interviews and
classroom collaborations were recorded, and subsequently transcribed and analyzed.
Student anonymity was maintained through the use of generic ID numbers during the
collection of survey instruments—including, but not limited to: the Force Concept
Inventory (FCI), the Mechanics Baseline Test (MBT), the Psycho-epistemological Profile
(PEP), and the Test of Understanding Kinematics Graphs (TUG-K), which are normal
events within the lifecycle of IP courses at Central Arizona College. Given that the data
collected was from archival sources, no informed consent was required or obtained.
The FCI is a multiple-choice assessment designed to be given in a pre-test and
post-test sequence surrounding a first-semester physics course. FCI results are
meaningful when the Hake (1988) gain is calculated for each student and the class as a
whole. The MBT is designed to be given as a single-event test near the end of a course,
and has the features of serving as a standard exam, as well as being well coordinated with
FCI. The TUG-K is a standalone test that can be given as a pre-post-test if desired, so
long as the pre-test comes before curriculum content exposure to graphing kinematics.
Finally, the PEP survey is also a pre-test/post-test survey that should encompass the
treatment designed to produce epistemological change. All of these instruments are
paper-and-pencil multiple-choice tests delivered in the classroom environment of the
study sample. Scan forms are graded using the ZipGrade© app on an iPad, and then
transferred to a spreadsheet for analysis and subsequent import to SPSS.
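The gain calculation mentioned for the FCI can be sketched as follows, using the usual normalized-gain formula (post minus pre, divided by the maximum possible improvement). The default `max_score` of 30 is an assumption about the form in use, and the scores are made up for illustration; neither comes from this study.

```python
def normalized_gain(pre: float, post: float, max_score: float = 30.0) -> float:
    """Fraction of the possible improvement that was actually achieved."""
    if pre >= max_score:
        raise ValueError("gain is undefined when the pre-test is already at the maximum")
    return (post - pre) / (max_score - pre)


# Made-up scores for illustration only (not data from the study).
pre_scores = [10, 15, 12]
post_scores = [20, 24, 21]

# One gain per student, plus a class gain computed from the class averages.
student_gains = [normalized_gain(a, b) for a, b in zip(pre_scores, post_scores)]
class_gain = normalized_gain(sum(pre_scores) / len(pre_scores),
                             sum(post_scores) / len(post_scores))
print(student_gains, class_gain)
```

The class gain here is computed from class-average scores, which in general differs from the mean of the individual student gains; an analyst would need to state which of the two is being reported.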
The vast majority of data collected comes in the form of journal assignments
(essays) embedded throughout the standard IP curriculum at Central Arizona College.
Journal assignments direct the students to reflect on their own thinking in terms of how
concepts and beliefs have changed, and what in their opinion was the source of those
changes, if any. A number of group interviews have been recorded that punctuate these
journal assignments, and serve to represent a group-think-aloud with the expectation that
its influence can be seen in subsequent personal journal entries. A number of electronic
and paper-based polls were also given in order to obtain rank-orderings of various
representations and the reflections on why one is preferred over the others. The ordinal
polling data arising naturally within the curriculum leads to basic descriptive statistics,
which serve merely as a backdrop to the qualitative analysis of this study. Each of these
general approaches to data collection served the interests of the research questions that
seek to describe how students think and reason with MRS along the way to personal
epistemological change.
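As one example of the basic descriptive statistics that rank-order polls support, the sketch below computes a median rank for each representation. The ballot layout, the option labels, and the values are hypothetical and serve only to show the shape of such a summary.

```python
from statistics import median
from typing import Dict, List


def median_ranks(ballots: List[Dict[str, int]]) -> Dict[str, float]:
    """Median rank per option, where 1 is the most preferred."""
    options = ballots[0].keys()
    return {opt: median(b[opt] for b in ballots) for opt in options}


# Made-up ballots for illustration only.
ballots = [
    {"narrative": 1, "symbols": 3, "diagram": 2},
    {"narrative": 2, "symbols": 3, "diagram": 1},
    {"narrative": 1, "symbols": 2, "diagram": 3},
]
summary = median_ranks(ballots)
print(summary)
```

The median is used here because ranks are ordinal; a mean would treat the distance between ranks as equal, which the polls do not establish.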
The Institutional Review Board (IRB) Office at Central Arizona College
worked in concert with the author-researcher in order to collect all course artifacts prior
to release for analysis. All student identifiers present on written documents and
assessments were removed and replaced by generic identifiers—such as Student 1,
Student 2, etc.—for the sake of anonymity. Transcripts of group interviews recorded in
the classroom were obtained via the TranscribeMe™ service embedded within the
NVivo software used in this study for qualitative analysis. All data collected will remain
on site with the author-researcher for no more than 2 years prior to being disposed of
through shredding of paper artifacts and deletion of electronic files.
Data Analysis Procedures
The student groups participating in this study were a purposive sample (Frost,
2011) of adult community college students at Central Arizona College. Thirty-four
students were in the group that met twice weekly for 3-hour sessions where group
interviews and lab activities were conducted, and were separated by electronic polls,
traditional homework assignments, and journal artifacts collected through the course
management system. Electronic polls were administered as a follow-up to group interviews
and journal assignments that were coordinated with classroom events. Poll results were
subsequently the focus of journal reflections about personal and corporate classroom
views. The audio of interviews and classroom collaborations was recorded, and
subsequently transcribed and analyzed.
Preparation of data. The Institutional Review Board (IRB) Office at Central
Arizona College worked in concert with the instructor to collect all course artifacts prior
to release for analysis. All student identifiers present on written documents and
assessments were removed and replaced by generic identifiers—such as Student 1,
Student 2, etc.—for the sake of anonymity. Transcripts of recorded interviews were
obtained via the TranscribeMe™ service embedded within the NVivo software used in
this study for qualitative analysis. Coding in NVivo corresponded to the theoretical
aspects of the research questions—such as instances of encoding that are representative
of thinking, reasoning, conceptual and/or epistemological change—as well as emergent
themes occurring naturally during document and narrative analysis (Butler-Kisber, 2010;
Rubin & Rubin, 2012).
Data analysis. The first step of data analysis is open coding for the identification
of key words and word groupings in the data (Saldaña, 2013). Step two of data analysis
follows with in vivo codes when important words and word groupings warrant their own
code label. Groups of related codes form categories that can become theoretically
saturated when new data analysis returns the same codes (Birks & Mills, 2011). Constant
comparison of current activities to prior activities, researcher memos on the current and
prior activities, group interview transcripts, and emergent themes and patterns in all of
the artifacts produced by the study population were coded within NVivo in an effort to
reach saturation. NVivo codes were analyzed for relationships and subsequently
displayed in multiple graph formats ranging from bar charts to cluster analysis maps that
reveal the relationships that exist between nodes (codes) and/or families of nodes
(Bazeley & Jackson, 2013).
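One way to picture the saturation criterion described above is to track how many previously unseen codes each newly analyzed source contributes. The following is only an illustrative sketch: the function names, the `window` parameter, and the sample code labels are hypothetical and are not drawn from the study data or from NVivo.

```python
from typing import Iterable, List, Set


def new_codes_per_source(coded_sources: Iterable[Set[str]]) -> List[int]:
    """Count how many previously unseen codes each source adds, in order."""
    seen: Set[str] = set()
    counts: List[int] = []
    for codes in coded_sources:
        fresh = codes - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def is_saturated(counts: List[int], window: int = 2) -> bool:
    """Treat coding as saturated when the last `window` sources add no new codes.

    The window size is an assumption made for illustration only.
    """
    return len(counts) >= window and all(c == 0 for c in counts[-window:])


# Hypothetical example: each set holds the codes applied to one source.
sources = [
    {"thinking", "reasoning"},
    {"reasoning", "representation"},
    {"thinking", "representation"},
    {"reasoning"},
]
counts = new_codes_per_source(sources)
print(counts)
print(is_saturated(counts))
```

In this toy run the last two sources return only codes that were already present, which is the pattern the text describes as saturation.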
Two coding schemes—a priori theoretical and indigenous in vivo—were
employed in this study. The theoretical codes flowed directly from the research questions,
and the definitions for the constructs of thinking and reasoning that the author-researcher
had derived from the literature based on the EOT by Paul and Elder (2008). Per the
research questions concerned with the constructs of thinking and reasoning with MRS for
epistemological change, four basic theoretical codes were used in order to begin the
coding process: thinking, reasoning, representation, and epistemology/epistemological—
which are the key elements of the two research questions that ask how students think and
reason with MRS and how that corresponds to epistemological change. Thinking can be
defined and detected as critical or scientific, whereas reasoning can be defined and
detected as metaphorical, analogical, or proportional. Representational systems come in
multiple forms such as spoken or narrative language, diagrammatical, graphical, and
symbolic.
Indigenous (in vivo) codes flowed directly from the data as it is experienced or
analyzed, and are therefore unpredictable in many ways. There was little need for memos
on the theoretical codes as they are constructed a priori; however, in vivo coding required
a nearly constant practice of writing memos concerning not only the basis for creating a
new code (Birks & Mills, 2011; Saldaña, 2013), but also how to bracket the researcher's
bias relative to the observations and expectations of the researcher. In conjunction with
the epistemological change detected between pre-post-test conditions of the PEP, each
source of data—interviews, journals, and polls—proved to be a rich framework from
which to analyze how students think and reason with MRS in concert with personal
epistemological change.
Themes emerged in two ways. First, the researcher perceived a theme; second,
that theme was confirmed or denied by the pattern that can be seen when a large enough
family of nodes encodes for a trait or construct evident in the data (Bazeley & Jackson,
2013). Visual analysis in NVivo provided for the coordination of many different codes
such that a correspondence between theoretical and/or in vivo themes was evident by
inspection of cluster analysis and frequency charts.
Ethical Considerations
No personal or acute effects were expected for any persons in the study
population drawn from archived data. An IRB representative from Central Arizona
College’s Office of Institutional Planning and Research verified that no student
identifiers were present in the sampled data. In an effort to assist in maintaining
anonymity for various assessment and research purposes, much of the archived data is
already free of names and other student identifiers, as well as the fact that electronic polls
were completed anonymously. All student identifiers present on written documents and
assessments were removed and replaced by generic identifiers—such as Male Student
1, Female Student 2, etc.—for the sake of anonymity. Students also used avatar names on
each document submission—which further protected student anonymity without
researcher oversight. The potential for student coercion has been eliminated by the fact
that any students’ grades associated with the archived data were already finalized prior to
collection and analysis. Artifacts collected during the progress of each course contained
no student identifiers other than their student identification number and/or avatar name.
Archived data will be held on site in a locked room by the researcher for no more than 5
years prior to disposal so that other researchers can access the data. Researcher bias was
handled by bracketing the presuppositions of the author throughout the phases of data
collection and analysis (Fischer, 2009).
Limitations and Delimitations
Many of the limitations with research methodology arise when only one
method is used (Frost, 2011). A grounded theory design has methodological capacity
for deploying multiple methods, such as ethnographic and phenomenological
(Charmaz, 2006), and is therefore able to leverage those combined strengths while
minimizing individual methodological weaknesses. The bricolage of multiple methods
in such a design allows the multiple perspectives that come with those analytical
approaches to minimize the assumptions of researcher bias (Frost, 2011) while
simultaneously increasing the reliability and validity of the findings (Butler-Kisber,
2010). Qualitative research has general limitations of researcher skill, time required
for deep analysis, researcher bias, researcher presence, and limits to generalizability.
Given the stability of all assessment instruments used in this study, as well as
the acceptance of grounded theory and qualitative methods for social science research,
the limitations of the study were primarily restricted to (1) the researcher's personal bias,
and (2) the reformed pedagogy and curriculum that is described in the PER literature
detailed within the literature review. Researcher bias was handled by data collection
protocols such as memos during both data collection and data analysis, which bracket
the presuppositions and opinions of the researcher relative to the observations that
they make, and the inferences that they draw from the data. The reformed pedagogy
common to IP classrooms using IE methods has a strong theoretical and empirical
foundation that has already been described in the literature review, and therefore
presented no challenge to authenticity pertaining to content and practice. However,
replicating some of the curriculum content in the absence of training in IE methods for
IP is likely to affect the receipt of similar effects using the same line of questioning and
assessment. Technically, this is not really a limitation at all, as one would expect
different teaching styles and curricular content to have different effects on students.
Summary
The purpose of this qualitative study was to determine the influence that multiple
representational systems (MRS) have on the thinking and reasoning of community
college students with respect to their conceptual frameworks and personal epistemology.
The importance of this study hinged on its ability to answer a long-standing deficit in the
literature on epistemological change (Bendixen, 2012; Pintrich, 2012) by providing a
deeper understanding of the processes and mechanisms of epistemological change as they
pertain to context (domain of knowledge) and representational systems in terms of the
psychological constructs of thinking and reasoning. A grounded theory approach
(Charmaz, 2006) was used in designing this qualitative study in order to produce a
substantive theory capable of describing the complex interactions that comprise the
phenomena of thinking and reasoning with MRS, and its influence on epistemological
change within the context of a community college IP classroom.
Thirty-four students enrolled in two IP courses at Central Arizona College—a
Hispanic-serving institution (HSACU, 2014) located in Coolidge, Arizona—during the
2014 fall semester comprise a small portion of the nearly 6,500 students attending that
campus. Enrollment in each course was 17 students ranging in age from 18 to 45,
including one 17-year-old male not included in the study. Archival data collected on 29
students comprising the study sample was from the existing curriculum for IP students at
Central Arizona College. The IP courses from which archived data was sampled involved
semi-structured interviews in the form of group discussions following lab investigations,
electronic surveys, and student journal entries that were collected throughout the
semester.
The expectation of this study was the finding that multiple representational
systems (MRS) are factors of epistemological and conceptual change. Moreover, the
qualitative findings of student discourse and document analysis revealed how MRS
facilitates thinking and reasoning according to the operational definitions provided
herein. Given that this qualitative study aimed simply to explore the data, a hypothesis
about the number and type of representational systems and their capacity to produce
conceptual and epistemological change emerged from the findings, and thereby served
the needs of further theoretical development. The analytic basis for the development of
new theory from this study began with the coding process, with open coding of key
terms—such as thinking and reasoning.
Journal activities—and the discussions that punctuated them—generated self-
efficacy and self-regulated learning through metacognitive monitoring (Muis & Franco,
2010), while simultaneously providing a rich source of evidence for student thinking and
reasoning. These same data also revealed connections between conceptual and
epistemological change through what Hofer (2004) described as epistemic metacognition.
The model for scientific thought advanced by Elder and Paul (2007b) corresponded with
the advance of self-efficacy and self-regulation that is consistent with epistemic change in
scientific domains of knowledge (Mason et al., 2012; Sawtelle et al., 2012).
Following the data analysis in this chapter, the researcher leveraged the
qualitative results of this grounded theory study in the form of a new theory concerning
the nature of thinking and reasoning with multiple representational systems (MRS) and
how that corresponds to personal epistemological change in terms of conceptual
frameworks. Summaries of the content analysis of student journals, polls, and interviews
are presented with a view to capturing and describing how students think and reason with
MRS. Data from various assessments such as the FCI and the PEP will be discussed in
terms of how conceptual change and MRS work in concert in order to produce
epistemological change. However, the quantitative results presented in the next chapter,
of the study instruments specified in this proposal, are offered for descriptive purposes
only, and are not intended to form the basis for any inference in this qualitative study.

Chapter 4: Data Analysis and Results
Introduction
It is not known how (a) thinking and reasoning with multiple representational
systems (MRS) occurs, and (b) how that sort of thinking and reasoning affects
epistemological change in terms of mechanisms and processes—whether cognitive,
behavioral, or social—in an IP classroom. A qualitative methodology was used in this
study in an effort to develop an understanding of the phenomena from the viewpoint of
the participants (Merriam, 2009, 2010). Moreover, given the manner in which meaning is
constructed in social settings (Yin, 2011) where the researcher is the primary data
collection instrument responsible for producing a richly descriptive account of the
outcomes (Merriam, 2009, 2010), a qualitative methodology was required. Grounded
theory design was used as a means for developing a substantive theory capable of
describing the complex interactions that comprise the phenomena of thinking and
reasoning with MRS, and its influence on epistemological change within the context of a
community college IP classroom.
The goal of this qualitative grounded theory study is to determine the influence
that multiple MRS have on IP students with respect to their conceptual frameworks and
personal epistemology.
R1: How do IP students use representational systems in their thinking and
reasoning?
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?

This chapter describes in detail the qualitative results of the study by cataloging
the various outcomes of study instruments, interviews, and documentation produced by
students in an IP course. Detailed thematic analyses of student documents and
discussions/interviews are constantly compared and contrasted with one another and the
PEP survey data. Descriptive measures from conceptual inventories normally deployed in
an IP classroom setting are also discussed in contrast with the qualitative results
described herein. Qualitative results are also described quantitatively in an effort to
interpret not only the scope of those findings (Chi, 1997), but also to contrast with
individual and group collaborative outcomes (Clarà & Mauri, 2010) that involve multiple
phases and dimensions of content analysis that are not easily isolated within one
methodological approach (Häkkinen, 2013). Quantitative descriptions of these qualitative
data are intended for comparison and contrast purposes only.
Descriptive Data
The sample population for this study consists of 34 IP students ranging in age
from 17 to 42 years, purposively drawn from two IP courses at Central Arizona
College, located in Coolidge, Arizona. Twenty-nine students were selected from the
sample population in order to form the study population based on persistence in the
course, and adult age status. Four students dropped the course before mid-term, and one
student who persisted until the end was under age 18. Thirteen adult students from
College Physics (algebra-based) and 16 adult students from University Physics (calculus-
based) participated in this study. Table 3, below, describes the distribution of students
according to course and gender.

Table 3

Study Population Demographics
Course               Female   Male   Total
College Physics           7      6      13
University Physics        3     13      16
Total                    10     19      29

Each course met twice a week for three hours at a time. During week one, the
Physics and Reality activity was used to set the stage for scientific discourse by asking
questions that require no special knowledge. The primary goal of the activity was to set a
collaborative tone for small and large-group settings where consensus and description are
required. The Learning the Language activity consisted of several individual activities
that attempted to reform the student ideas about arithmetic and representation of number
versus quantity—which are essential to the law-making procedures that begin to unfold in
the Law of the Circle lab. This lab forms the basis of the next phase where the laws of
motion are constructed and analyzed for conceptual content, as well as analytical
capacity. Students were then polled about the various versions and interpretations of
those versions where an axiom is positioned against a natural language explanation. All
of this happened within the first 3-4 weeks of the course. The post-testing and exit
interviews occurred during the fifteenth and sixteenth weeks of the course, respectively.
The Physics and Reality, and Math-Science-Physics and Reality classroom events
occurred during four separate 3-hour meetings in the first week of classes held on August
18, and August 20, 2014 for both IP groups, and consisted of 17 different observations
during that time interval. A total of 99 pages of small and large group
interview/discussion transcripts were collected for this event, consisting of 4 hours and
51 minutes of audio recordings spanning 39,667 words comprising 39% of the overall
transcript data. Twenty-nine students submitted a total of 56 journal documents
associated with the interview/discussion activities occurring on the two days of this
event.
The Learning the Language classroom events occurred during four separate 3-
hour meetings held on August 25, and August 27, 2014 for both IP groups, and consisted
of 19 different observations during that time interval. A total of 102 pages of small and
large group interview/discussion transcripts were collected for this event, consisting of 3
hours and 46 minutes of audio recordings spanning 41,660 words comprising 41% of the
overall transcript data. Twenty-nine students submitted a total of 56 journal documents
associated with the interview/discussion activities occurring on the two days of this
event. A follow-up activity to the Law of the Circle activity was the creation of the First
and Second Zeroth Laws (FZL and SZL)—which utilized the methods developed in the
Law of the Circle activity in order to create two conventional equations of motion. No
interview data was collected for the FZL and SZL poll reflections. However, 18 out of
the 29 participants submitted journal reflections on the content of the core activity and the
follow-up poll.
The Exit Interview classroom event occurred during two separate 3-hour meetings
held on December 10, 2014 for both IP groups, and consisted of 8 different observations
during that time interval. A total of 37 pages of small and large group
interview/discussion transcripts were collected for this event, consisting of 2 hours and
11 minutes of audio recordings spanning 19,633 words comprising 19% of the overall
transcript data. Twenty-seven students submitted a single journal document answering
the interview questions described herein before attending the semi-structured interview.
A total of 238 pages of transcript data were collected during this study covering
44 different classroom observations. Coding for both the interview data and journal data
consisted of 2,597 references covering 853 sources as collected and arranged within the
NVivo software used for this analysis, as illustrated in Table 4 below. Codes for gender,
student, and course are not included in these totals, which strictly represent the a priori
theoretical codes and in vivo coding activity.
Table 4

Interview Transcript Data
Event                   Page Count   Word Count   Percentage
Physics and Reality             99       39,667        39.3%
Learning the Language          102       41,660        41.3%
Exit Interview                  37       19,633        19.4%
Total                          238      100,960       100.0%
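The percentage column in Table 4 follows directly from the reported word counts. As a minimal sketch (the variable names are illustrative and not part of the study), each event's share of the transcript data can be recomputed as follows:

```python
# Word counts per classroom event, as reported in Table 4.
word_counts = {
    "Physics and Reality": 39_667,
    "Learning the Language": 41_660,
    "Exit Interview": 19_633,
}

total_words = sum(word_counts.values())

# Share of the overall transcript data per event, rounded to one decimal place.
shares = {
    event: round(100 * count / total_words, 1)
    for event, count in word_counts.items()
}

for event, share in shares.items():
    print(f"{event}: {share}%")
print(f"Total words: {total_words:,}")
```

The recomputed shares agree with the rounded percentages quoted in the running text for each event.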

Data Analysis Procedures
The first step of data analysis is open coding for the identification of key words
and word groupings in the data (Saldaña, 2013). Step two of data analysis follows with in
vivo codes when important words and word groupings warrant their own code label.
Groups of related codes form categories that can become theoretically saturated when
new data analysis returns the same codes (Birks & Mills, 2011). Constant comparison of
current activities to prior activities, researcher memos on the current and prior activities,
group interview transcripts, and emergent themes and patterns in all of the artifacts
produced by the study population were coded within NVivo in an effort to reach
saturation. NVivo codes can be analyzed for relationships and subsequently displayed in
multiple graph formats ranging from bar charts to cluster analysis maps that reveal the
relationships that exist between nodes (codes) and/or families of nodes (Bazeley &
Jackson, 2013).
Two coding schemes—a priori theoretical and indigenous in vivo—were
employed in this study. The theoretical codes flow directly from the research questions,
and the definitions for the constructs of thinking and reasoning that the author-researcher
has derived from the literature. Per the research questions concerned with the constructs
of thinking and reasoning with MRS for epistemological change, at least four basic
theoretical codes are warranted: thinking in terms of coordinations and distinctions,
reasoning in terms of transformation on thinking, representation, and epistemology in
terms of expressed belief. Additionally, the 8 elements of thought by Paul and Elder
(2008) were used as a priori theoretical codes.
A total of 16 memos were recorded during the analysis, and illustrate decisions
made about in vivo coding, theoretical coding, and theoretical development during the
coding process. The theoretical codes of Distinctions, Coordinations, and EoT all contain
some number of child code relationships.
The child codes for Distinctions consist of various is and is not types of
statements concerning math, science, physics, and reality, and are listed in Table 9 below.
In each case for these child nodes, the coding process involved assigning the code to
statements that were explicitly in that form, or were deemed to satisfy the code definition.
Details on the parent-child code relationships are provided below.
Coding schemes. Coding schemes were constructed on the basis of research
questions asking how students think and reason with MRS for the sake of epistemological
change. The key terms of thinking and reasoning were coded for by means of a priori
theoretical codes, whereas personal epistemology was coded for in vivo as the difference
between content of beliefs and the structure and process of belief construction emerged
within the data. Two themes emerged that matched the research questions: belief
development and claims about Thinking, Reasoning, and Understanding—or TRU
Claims.
The parent code Beliefs is an a priori theoretical code addressing the
epistemological aim of the research questions, and consists of the in vivo child codes
Belief Development, Changed Belief Influence, and Old Beliefs. Beliefs is a container for
old beliefs and the factors that influence a change in beliefs. Belief Development refers to
statements that indicate a change in, or new way to form beliefs, whereas Changed Belief
Influence refers to claims about the cause for a change in particular beliefs. Old Beliefs
code for statements that indicate what a belief changed from.
The parent code Coordinations is an a priori theoretical code addressing reasoning
and consists of the in vivo child codes Collections, I Believe Because, IF-THEN or
Because, and Related Things. Coordinations code for the relationships that students
encode for when relating two or more of the distinctions coded for under the parent node
called Distinctions, and essentially identifies the ability to categorize. Collections is a
code that identifies when students combine multiple concepts in an effort to describe their
beliefs or ideas, and essentially represents the ability to classify. The I Believe Because
code identifies statements that explicitly state a point of view in those terms or its
equivalent. IF-THEN or BECAUSE encodes for statements that employ those very words
and/or the same reasoning process. The Related Things code describes lists encoded by
students in an effort to express a common relationship among multiple concepts.

The parent code Distinctions is an a priori theoretical code designed to capture the
conceptual content of thinking in terms of the distinctions that students make—such as the
categorical operation of IS and IS NOT in reference to various concepts. In particular, the
in vivo child codes associated with Distinctions include Physics IS reality, Physics IS
NOT reality, etc. A total of 16 different sub-Distinction codes were created, and are
described in detail in Table 9 below.
The Elements of Thought (EoT) parent code is an a priori theoretical code for
capturing the EoT as described by Paul and Elder (2008). The child codes of
Assumptions, Concepts, Implications, Information, Interpretation, POV, Purpose, and
Question are described extensively in the literature review chapter. The POV was
expanded into two child nodes—Individual and Group—in order to account for the two
types of activities where students were asked to express individual opinions versus a
group consensus. Detailed results of this coding scheme are described in Table 10 below.
The Transformations parent code is an a priori theoretical code attempting to
capture the creation of new ideas with a view to how that intersects with the Collections,
Distinctions, and Coordinations code sets. The in vivo codes of Thinking Claim,
Reasoning Claim, and Understanding Claim were created to code for students explicitly
describing how their thinking, reasoning, or understanding has changed when asked those
very questions—such as during the exit interview described by Table 2 below. Two
additional in vivo codes for Questions and Reactions to Others were created in an effort
to catalog the general questions that students raised which were not part of the Question
EoT scheme, as well as which decisions were based on interaction with other
participants' ideas.

A priori theoretical codes for Thinking and Reasoning were deployed late in the
coding process due to an effort to give other a priori codes about those constructs
primacy, as well as a general lack of clarity in the data upon which to discern what the
models described by students actually were. The subsequent process of attempting to
code for these constructs once the data lent itself to the scheme, led to a discovery
concerning the conceptual framework that is necessary for thinking that is described in
chapter 5 as the Cognitive Modelling Taxonomy for Conceptual Frameworks (CMTCF).
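The parent-child code relationships described in this section can be summarized as a simple nested structure. The sketch below is only an illustrative restatement of the text: the dictionary layout and the helper function are hypothetical, only the two Distinctions child codes actually quoted above are listed, and placing Questions and Reactions to Others under Transformations is an assumption based on the paragraph in which they are introduced.

```python
from typing import Dict, List, Optional

# Parent codes mapped to the child codes named in the text.
coding_scheme: Dict[str, List[str]] = {
    "Beliefs": ["Belief Development", "Changed Belief Influence", "Old Beliefs"],
    "Coordinations": [
        "Collections", "I Believe Because", "IF-THEN or Because", "Related Things",
    ],
    # Only the two child codes quoted in the text; the others are not named here.
    "Distinctions": ["Physics IS reality", "Physics IS NOT reality"],
    "Elements of Thought (EoT)": [
        "Assumptions", "Concepts", "Implications", "Information",
        "Interpretation", "POV: Individual", "POV: Group", "Purpose", "Question",
    ],
    # Assumption: the two additional in vivo codes are grouped here.
    "Transformations": [
        "Thinking Claim", "Reasoning Claim", "Understanding Claim",
        "Questions", "Reactions to Others",
    ],
}


def find_parent(child: str) -> Optional[str]:
    """Return the parent code that contains the given child code, if any."""
    for parent, children in coding_scheme.items():
        if child in children:
            return parent
    return None


print(find_parent("Old Beliefs"))
print({parent: len(children) for parent, children in coding_scheme.items()})
```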
Triangulation of data. The research questions concerning how MRS are used in
the thinking and reasoning of IP students for epistemological change were answered by
analyzing four sources of data: written journals, small and large group
discussions/interviews, polling data, and researcher memos. Moreover, two main coding
schemes—a priori theoretical and in vivo—were used during data analysis. Coding
patterns were consistent between written and narrative sources regardless of whether or
not the event was a group activity or an individual reflection, and are described fully in
the results section that follows. Furthermore, the results of quantitative measures such as
the PEP showed patterns of change that were consistent with the qualitative findings of
this study. Those results are also described fully in the next section. The general validity
of this study emerges from the coherence between these sources. External validity is the
qualitative equivalent of generalizability, and is contingent on the perception that other
researchers have regarding the transferability of findings to other domains. The theory
produced by this study has potential import to general, educational, and cognitive
psychology, as well as human and machine learning.

According to Yin (2014), rich data, respondent validation, triangulation, quasi-
statistics, and comparison are effective strategies for defeating threats to validity in
qualitative research. Triangulation was achieved through the comparison of both group
and individual sources of qualitative data longitudinally through multiple events until
persistent themes were detected. Expert panel review of these findings served to confirm
the results as consistent and valid within the data itself, and with respect to the researcher
memos. Rich data was obtained in terms of the sheer volume of data—such as 238 pages
of transcribed interviews, and 2,597 references coded in 853 sources. Respondent
validation in the exit interviews confirmed central features of the research questions in
terms of participant experience relative to thinking, reasoning, and understanding. As
described in the introduction to this chapter, quantitative descriptions—or quasi-
statistics—have been used extensively for the sake of comparison and contrast within and
between data sources. Multiple events within the study permitted a constant comparison
of study outcomes in different contextual settings with the same set of participants.
Together, these practices during the study phase support the general validity of the study
outcomes.
The limitations of these findings are restricted to the depth and breadth of the
collected qualitative data—which comes in the form of written journals and transcribed
interviews—and the coding process itself. Overall coding consisted of 2,597 references in
853 sources. Twenty researcher memos were written during the process of coding
analysis, which subsequently led to the creation of a new theory based on the data. A total
of 238 pages of small and large group discussion/interview transcripts were also
analyzed. Given the size of this data set, and the varied nature of its content, there is no
reason to suspect missing data. However, given the large number of theoretical and in
vivo codes employed by the study, it is possible that certain coding errors might have
occurred in this large data set. The results section shows the consistency between
theoretical and in vivo coding outcomes, as well as the extensive reach of all coding
schemes, and therefore suggests that there is no viable source of these error types therein.
It is possible, however, that certain theoretical codes specified in the Proposal were
incorrectly coded for—such as the EoT codes by Paul and Elder (2008) that were used in
this study. The likelihood of incorrectly coding is extremely low given (a) the simplicity
of the constructs defined by Paul and Elder (2008), as well as (b) the consistent
application of those codes in a manner that led to the discovery of conceptual frameworks
within the emergent theory proposed herein.
Researcher bias is another possible limitation that has been accounted for through
the bracketing of researcher opinions in the form of memos created throughout the coding
analysis process. A priori theoretical codes based on literature review, in parallel with
researcher-produced in vivo coding schemes, served to balance researcher bias against
consensus views within the field of study. As the results herein reveal, researcher bias
was minimized well within acceptable constraints. The limitations of incorrect coding
and missing data have the potential to reshape the thematic results. However, given the
extensive amount of data described herein, it is unlikely that either limitation is relevant
given the internal consistency of theoretical and in vivo coding procedures that were
employed.

Results
The following sections provide the results of study components described in the
Proposal. The primary instrument for measuring epistemological change (the PEP), as
well as other miscellaneous assessments (FCI, MBT) are detailed first. It should be noted
that these quantitative results are provided strictly for the purpose of providing
descriptive data on elements of the Proposal that were material to the overall study
goals—but not directly part of the research questions. Two main types of analysis on the
qualitative data were performed—cluster analysis and node matrices, as well as constant
comparative analysis through memo bracketing. Cluster analysis compares codes or
families of codes against one another by calculating a similarity index based on either
linear regression or coding set intersections. Node matrices simply cross-tabulate
individual node comparisons, and thus provide the user with a sense of how frequently
various texts coincide—or share a single code value. These measures are intended for
description only, and in no way provide quantitative support to the inferences made
herein.
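A node matrix of the kind described above can be sketched in a few lines. This is only an illustration under stated assumptions: the passage IDs and the `coded_passages` mapping are hypothetical and are not study data, and the function name `node_matrix` is invented for the example.

```python
from itertools import combinations

# Hypothetical coding data (not study data): each code maps to the set of
# passage IDs to which that code was applied.
coded_passages = {
    "Physics IS": {1, 2, 3, 5, 8},
    "Individual POV": {2, 3, 5, 8, 9},
    "Concepts": {3, 4, 9},
}


def node_matrix(codes):
    """Cross-tabulate codes: count the passages shared by each pair of codes."""
    matrix = {}
    for code_a, code_b in combinations(sorted(codes), 2):
        matrix[(code_a, code_b)] = len(codes[code_a] & codes[code_b])
    return matrix


matrix = node_matrix(coded_passages)
for (code_a, code_b), shared in matrix.items():
    print(f"{code_a} x {code_b}: {shared} shared passages")
```

Each cell is a simple count of coincident passages, which is why such a matrix describes the coding but does not by itself support inference.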
PEP Analysis. All 29 students from the study population took the PEP survey in
both pre- and post-test conditions, with results shown below in Table 2. The
COMPOSITE score is the mean value of the PEP-dimension—Rational, Empirical, and
Metaphorical—scores obtained in either test condition, as shown in Table 3 and Table 4,
and per course as shown in Table 5. Results show an overall increase in composite PEP
scores, as well as an average increase of approximately 3 points along each dimension of
the PEP. Changes in the range and standard deviation for pre-post dimension scores
reveal an overall decreased variance in the data simultaneous to an overall increase in
each dimension of the PEP instrument. The following quantitative data is given for
descriptive purposes only, and to address outcomes specified in the Proposal.
Table 5

PEP Dimension Scores
                        Pre-test M(SD)    Post-test M(SD)
College Physics
    Rational            111.2(9.6)        114.6(10.3)
    Empirical           111.5(11.5)       107.5(11.2)
    Metaphorical        104.7(12.8)       100.1(8.9)
University Physics
    Rational            111.4(12.8)       116.2(11.5)
    Empirical           102.1(11.10)      111.2(9.5)
    Metaphorical        93.6(14.4)        101.8(15.7)
Combined
    Rational            111.3(11.2)       114.7(10.1)
    Empirical           106.3(12.3)       109.4(10.6)
    Metaphorical        98.6(14.8)        101.0(13.4)

Tables 6 and 7 provide aggregate descriptions of changes in PEP scores with
respect to PEP dimension in both pre- and post-test conditions.
Table 6

Basic PEP Composite Descriptive Statistics
n Range M SD
preCOMPOSITE 29 45.33 105.40 11.39
postCOMPOSITE 29 39.00 108.34 9.37
PEP_Change 29 60.00 2.94 13.18

Table 7

Basic PEP Dimension Descriptive Statistics
PEP Dimension n Range M SD
preRATIONAL 29 44 111.31 11.68
preEMPIRICAL 29 52 106.34 12.47
preMETAPHORICAL 29 65 98.55 15.03
postRATIONAL 29 42 114.66 10.31
postEMPIRICAL 29 44 109.38 10.74
postMETAPHORICAL 29 58 101.00 13.68
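As a consistency check on Tables 6 and 7, the composite means can be recomputed from the dimension means. The sketch below assumes, as stated in the text, that the composite is the mean of the Rational, Empirical, and Metaphorical scores; because every student contributes to all three dimensions, the mean composite equals the mean of the three dimension means. The variable and function names are illustrative only.

```python
# Dimension means copied from Table 7 (n = 29 for every row).
pre_means = {"Rational": 111.31, "Empirical": 106.34, "Metaphorical": 98.55}
post_means = {"Rational": 114.66, "Empirical": 109.38, "Metaphorical": 101.00}


def composite(dimension_means):
    """Composite score: the mean of the three PEP dimension scores."""
    return sum(dimension_means.values()) / len(dimension_means)


pre_composite = composite(pre_means)
post_composite = composite(post_means)
pep_change = post_composite - pre_composite

print(f"pre = {pre_composite:.2f}, post = {post_composite:.2f}, change = {pep_change:.2f}")
```

The recomputed values agree with the composite means and the PEP_Change mean in Table 6 to within rounding of the published figures.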

Twelve of the 29 students in this study retained their primary PEP dimension (D1)
as rational, whereas 5 students switched their primary dimension from empirical to
rational. No students changed from the primary dimension of metaphorical to rational.
According to Table 8, approximately the same number of students switched D1 from
empirical to rational (n = 5) as switched from rational to empirical (n = 4).
One student retained D1 as metaphorical, while 3 students retained D1 as empirical.
Overall, 17 percent of the study population shifted their primary PEP dimension to
rational, while 41 percent retained the rational dimension for D1. The remainder of
changes in D1 are given below in Table 8. The following quantitative data is given for
descriptive purposes only, and to address outcomes specified in the Proposal.

Table 8

Primary PEP Dimension Changes
preD1 postD1 Matching nodes (n) Matching nodes (%)
R R 12 41
E E 3 10
M M 1 3
R M 2 7
E R 5 17
M R 0 0
R E 4 14
E M 1 3
M E 1 3
Total 29 100
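The percentages in Table 8 follow directly from the counts. The following sketch, offered for description only, recomputes them from the tabulated counts; the dictionary layout and helper names are assumptions made for the example.

```python
# Counts copied from Table 8: (pre-test D1, post-test D1) -> number of students.
d1_transitions = {
    ("R", "R"): 12, ("E", "E"): 3, ("M", "M"): 1,
    ("R", "M"): 2, ("E", "R"): 5, ("M", "R"): 0,
    ("R", "E"): 4, ("E", "M"): 1, ("M", "E"): 1,
}

total = sum(d1_transitions.values())


def percent(n, sample_size):
    """Share of the sample as a whole-number percentage."""
    return round(100 * n / sample_size)


# Students whose primary dimension did not change between test conditions.
retained = sum(n for (pre, post), n in d1_transitions.items() if pre == post)

# Students who moved to the rational dimension from either other dimension.
shifted_to_rational = sum(
    n for (pre, post), n in d1_transitions.items() if post == "R" and pre != "R"
)

print(f"total = {total}")
print(f"retained D1: {retained} ({percent(retained, total)}%)")
print(f"shifted to rational: {shifted_to_rational} ({percent(shifted_to_rational, total)}%)")
```

Because each row is rounded separately, the percentage column need not sum to exactly 100.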

Six of the 29 students in this study retained their secondary PEP dimension (D2)
as empirical, whereas 3 students switched their secondary dimension from rational to
empirical, and 3 students switched from metaphorical to empirical. Another 6 students
retained the rational dimension for D2 between pre- and post-test conditions, whereas
only 3 retained the metaphorical dimension for D2. Overall, 42% of the sample population
retained either the rational or the empirical dimension for D2 between pre- and post-test
conditions, with an equal number retaining the rational dimension as retained the empirical
dimension. Ten percent of students in this study retained the metaphorical dimension for
D2. The remainder of changes in D2 are given below in Table 9.

Table 9

Secondary PEP Dimension Changes
preD2 postD2 Matching nodes (n) Matching nodes (%)
R R 6 21
E E 6 21
M M 3 10
R E 3 10
E M 2 7
M E 3 10
R M 1 3
E R 4 14
M R 1 3
Total 29 100

Fourteen of 29 students in the study population retained metaphorical as their last
PEP profile dimension (D3), whereas only three students retained the empirical
dimension, and no students held the rational dimension between pre- and post-test
conditions. An equal number of students (n = 5, 17%) switched D3 from empirical to
metaphorical, or vice versa. The remainder of changes in D3 are given below in Table 10.
Table 10

Tertiary PEP Dimension Changes
preD3 postD3 Matching nodes (n) Matching nodes (%)
R R 0 0%
E E 3 10%
M M 14 48%
R M 1 3%
E R 0 0%
M R 1 3%
R E 0 0%
E M 5 17%
M E 5 17%
Total 29 100%

All pre-post scores, and the overall PEP change, are normally distributed, as
shown by the Shapiro-Wilk (SW) normality tests in Table 11, which is given for
descriptive purposes only, in accordance with what is generally appropriate for an
instrument of this type. Epistemological change of the type measured by the PEP
instrument is part of the naturally occurring background of the study environment, rather
than a study outcome falling under the lens of qualitative design. The PEP was specified
in the Proposal, and the content of the PEP is material to the research questions;
therefore, these descriptive statistics are offered for description only, rather than for
inference of any sort.
Table 11

PEP Score Distributions Normality Tests
Kolmogorov-Smirnov(a) Shapiro-Wilk
Variable Statistic df Sig. Statistic df Sig.
preRATIONAL .166 29 .040 .937 29 .083
preEMPIRICAL .114 29 .200* .972 29 .619
preMETAPHORICAL .106 29 .200* .970 29 .549
postRATIONAL .095 29 .200* .951 29 .189
postEMPIRICAL .104 29 .200* .969 29 .530
postMETAPHORICAL .137 29 .172 .941 29 .107
preCOMPOSITE .122 29 .200* .946 29 .146
postCOMPOSITE .100 29 .200* .973 29 .650
PEP_Change .143 29 .132 .969 29 .525
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Qualitative analysis.
Overall math science physics and reality sequence. Combined coding of the
Physics and Reality, and the Math-Science-Physics-Reality journals, and small/large
group discussions/interviews is shown in Table 12 below, according to the codes, and the
number of sources and references coded. The first two memos created while coding this
source data detected a pattern of distinction making concerning the connection between
physics and reality—which led to the 8 distinctions listed below. The Collections,
Coordinations, and Transformation codes were created based on a memo entry that noted
how students were collecting concepts and distinctions in an effort to create new ideas.
The fourth memo entry noted a pattern of question-asking in the source data that was in
some way connected to both the thinking and reasoning of students in the sample
population, as well as revealing the epistemic stance and/or doubt therein.
Table 12

Overall Coding Results
Select codes and sub-codes          Sources   References
Beliefs                             49        104
    Belief Development              22        35
    Changed Belief Influence        27        35
    Old Belief                      18        20
Coordinations                       115       338
    Collections                     43        75
    I Believe Because               62        96
    IF-THEN or BECAUSE              34        57
    Related Things                  60        109
Distinctions                        92        688
    Math DOES                       32        67
    Math IS                         28        62
    Math IS NOT Reality             9         10
    Physics DOES                    49        100
    Physics IS                      70        128
    Physics IS NOT or DOES NOT      3         3
    Physics IS NOT Reality          51        74
    Physics IS Reality              12        14
    Physics is Reality MAYBE        4         4
    Reality is                      50        81
    Reality IS NOT                  3         3
    Science DOES                    26        48
    Science IS                      35        78
    Science IS NOT Reality          8         8
    Science IS Reality              4         4
Transformations                     30        43
Total                               286       1173

Table 13 below provides data on the coding for journal submissions and
small/large group discussions/interview results in terms of the Elements of Thought as
described by Paul and Elder (2008), as they pertain to both the Physics and Reality and
the Math-Science-Physics-Reality activities. Given the collaborative nature of the
activities, it was necessary to code Point-of-View (POV) as either individual or group—
depending on the specific journal question and/or group activity. Transcripts of group
discussions/interviews, as well as individual journal assignments made explicit reference
to the type of POV that was being expressed, and thus led to coding POV as either Group
or Individual.
Table 13

Coding Results for the Elements of Thought (EoT)
Codes and sub-codes Sources References
Elements of Thought 109 571
Assumptions 16 21
Concepts 69 194
Implications 40 67
Information 9 10
Interpretation 55 88
POV 23 73
Purpose 9 12
Question 43 106
Demographics codes for Course, Gender, and Student have been omitted and listed in
Table 1.

Figure 4 below is a circle graph illustration of the connection between statements
made by students indicating that science is what math is or does, math is what science is
or does, as well as the strong connection between the beliefs that science and math are
not equivalent to reality, but merely ways to describe it.

Figure 4. Cluster analysis circle graph for EoT and distinctions.

The associative strength between the codes related to student opinions illustrated in
Figure 4 was measured using the Jaccard Index for similarity, which is based on a part-whole
relationship between the number of common code references (intersection of sets)
relative to the total number of code references in the union of sets (Levandowsky & Winter,
1971). Table 14 below lists the relevant Jaccard Indices for Figure 4.
Table 14

Jaccard Indices for Distinction and EoT Code Comparison
Node A Node B Jaccard Index
Science IS NOT Reality Math IS NOT Reality 1.00
Science IS Math DOES 0.91
Science IS Math IS 0.81
Math IS Math DOES 0.78
Science IS Science DOES 0.78
Science DOES Math DOES 0.75
Physics IS Individual POV 0.71
Science DOES Math IS 0.71
Cluster analysis on the codes: Distinction and Elements of Thought (EoT). Indices less
than 0.70 are not graphed in Figure 4, and therefore do not appear in this table.
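The Jaccard Index reported in Table 14 can be illustrated with a minimal sketch. The reference IDs below are hypothetical and are not study data; only the set definition (size of the intersection divided by size of the union) comes from the description above.

```python
def jaccard_index(refs_a, refs_b):
    """Jaccard similarity: |A intersect B| / |A union B| for two sets of coded references."""
    union = refs_a | refs_b
    if not union:
        # Two empty sets share nothing to compare; this sketch treats that as 0.0.
        return 0.0
    return len(refs_a & refs_b) / len(union)


# Hypothetical reference IDs for two codes (illustration only).
code_a_refs = {1, 2, 3, 4}
code_b_refs = {2, 3, 4, 5}

partial_overlap = jaccard_index(code_a_refs, code_b_refs)
identical_sets = jaccard_index(code_a_refs, code_a_refs)

print(f"partial overlap: {partial_overlap:.2f}")
print(f"identical sets: {identical_sets:.2f}")
```

Under this definition an index of 1.00, as listed for Science IS NOT Reality and Math IS NOT Reality, means the two codes were applied to exactly the same set of references.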

Figure 5. Cluster analysis dendrogram.
A dendrogram is a hierarchical tree structure with branches, sub-branches, and leaves.
The right-most leaves are the most similar. Leaves on a branch have a strong relationship
in terms of co-occurrence within source material.

Cluster A
Cluster B

Cluster B consists exclusively of code pairings involving a point of view relating
physics and reality. Student opinions illustrated by this cluster indicate that concepts of what
physics does are strongly related to the point of view that physics is not reality because it is
about reality. The connection between opinions about what physics is, and what reality is,
is disjoint from the decision about the nature of physics as either reality or not reality.
The distinctions of IS and DOES describe statements that define the object of each code
in terms that are either process/product oriented (DOES), or more abstract in terms of
what the term is like or about (IS). Moreover, the decision that physics is reality is quite
distant from the pairing of the aforementioned sub-clusters. The point of view that
physics might be reality (MAYBE code) is contingent on what physics is not, or does not
do, in parallel with what reality is not. Each of these distinctions involves a negative
conclusion that grounds the uncertainty concerning physics and reality.
During the coding process, the in vivo Collections code was brought into a child
relationship with the Coordinations code—which also includes IF-THEN or BECAUSE,
Related Things, and I Believe Because child codes—because it matched that a priori
theoretical construct partially representative of reasoning. The Coordinations parent code
is a container for relationships within or between the collections of Distinctions, whereas
the parent node EoT is an a priori theoretical code consisting of 9 child nodes as listed in
Figure 6 below. Group POV and Individual POV are child nodes of the EoT POV, and
thus expand the original listing of the 8 EoT by Paul and Elder (2008) by one.

Figure 6. Distinctions and coordinations vs. EoT node matrix.

Individual POV was the largest contributor to how students organized their
thoughts as captured by the Coordinations and Distinctions codes. Approximately 56% of
the distinction making was in the form of an Individual point-of-view (POV), whereas
roughly 40% of Coordinations were in this form. Group POV contributed 10% and 3%,
respectively, to the Distinctions and Coordinations code structure. Conceptual content
increased from roughly 12% in Distinctions to 19% in Coordinations. The Distinctions
code is an a priori theoretical code representing the construct of thinking in terms of the
portion of model construction that requires descriptive metrics (distinctions), whereas
Coordinations attempts to provide a container for elements of reasoning. Figure 7 below
illustrates the relative percentages from Figure 6. The major shifts illustrated therein
include less POV in the transition from Distinctions to Coordinations, as well as an
increased usage of Concepts, Interpretation, and the consideration of Implications.

Figure 7. Concepts and individual POV node matrix.

The majority of Concept codes that also coded as POV came in the form of
Individual POV across all sources. A total of 71 out of 362 Coordination codes across all
sources were Concepts, whereas a total of 61 out of 487 Distinctions codes were
Concepts. Figure 7 above illustrates the coincidence of Concept and Individual POV
codes, whereas Figure 6 above details the occurrence of EoT relative to Distinctions and
Coordinations. A total of 148 out of 362 Coordination codes across all sources were
Individual POV, whereas a total of 277 out of 487 Distinctions codes were Individual
POV. The following quotes provide examples of the coincidence of Concepts with
Individual POV.

Table 15

Examples of Concept Coordination
Student, Source, Date
    Quotation
MSUP6, Physics and Reality Journal 1, August 24, 2014
    “physics is to reality what a map is to the real world”
FSCP8, Physics and Reality Journal 1, August 24, 2014
    “physics is one of the scientific parts of reality”
MSCP4, Physics and Reality Journal 1, August 27, 2014
    “reality is closely correlated to that of Physics and is indeed a
    subset of reality”
FSCP2, Math-Science-Physics-Reality Journal 2, August 24, 2014
    “Physics is a subcategory of science that involves things that are
    perceived, such as time, motion, energy”
MSCP4, Math-Science-Physics-Reality Journal 2, August 27, 2014
    “mathematics, science, and physics are tools that support and
    explain concepts of reality”

Figure 8 below illustrates the relative percentages for each of the EoT coincident
with either Distinctions or Coordinations. Approximately 20% of the EoT codes that
were coincident with Coordinations were Concepts, whereas 12.5% of the Distinction
codes coincident with EoT were Concepts. Individual POV comprised 41% of the
coincident codes, and 57% of the Distinctions codes were also Individual POV.
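The relative percentages quoted above can be recovered from the code counts reported earlier in this section (71 and 148 of 362 Coordinations codes; 61 and 277 of 487 Distinctions codes). The sketch below is for description only, and the `counts` layout and `share` helper are illustrative assumptions.

```python
# Counts as reported in the text: total codes per parent code, and how many of
# those coincided with the Concepts and Individual POV elements of thought.
counts = {
    "Coordinations": {"total": 362, "Concepts": 71, "Individual POV": 148},
    "Distinctions": {"total": 487, "Concepts": 61, "Individual POV": 277},
}


def share(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    parent: {
        element: share(n, row["total"])
        for element, n in row.items()
        if element != "total"
    }
    for parent, row in counts.items()
}

for parent, row in shares.items():
    for element, pct in row.items():
        print(f"{parent} / {element}: {pct}%")
```

These values round to the approximately 20%, 12.5%, 41%, and 57% figures given in the text.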

Figure 8. Distinctions and coordinations vs. EoT node matrix.

Figures 9 and 10 below illustrate the pattern of EoT use by students when
considering the Math Science Physics and Reality (MSPR) questions in a group setting
versus an individual reflection opportunity. In both cases, the usage of EoT is reduced by
at least one half when shifting from Distinctions to Coordinations.

Figure 9. MSPR group discussions distinctions-coordinations EoT node matrix.

Figure 10. MSPR journals distinctions-coordinations EoT node matrix.

The Distinctions child codes can be grouped into four different collections as
illustrated in Figures 11 – 15 below. The Distinctions concerning math and reality indicate
a dominant disposition to answer the question of whether or not math is reality in terms
of what Math IS (n = 75) and/or what Math DOES (n = 72). One of the three coding
references for Math IS Reality was rooted in EoT Assumptions, whereas the remaining
two were expressed as Individual POV. A total of 10 EoT codes comprise the Math IS
NOT Reality Distinction, with 1 coded as Concepts, 2 Interpretation codes, 1 Group
POV, and 5 Individual POV codes.

Figure 11. MSPR math EoT node matrix.
Similar results were found for the Science and Reality question in terms of the
number of codes generated for the decision to accept or not accept science as reality. One
of the five coding references for Science IS Reality was rooted in EoT Assumptions,
whereas the remaining four were expressed as Individual POV (N = 3), and Implications
(N = 1). A total of 8 EoT codes comprise the Science IS NOT Reality Distinction, with
1 coded as Concepts, 2 Interpretation codes, 1 Group POV, and 4 Individual POV
codes. However, the relationship between what Science IS and what Science DOES was
not as closely matched to the same distinctions relative to math—with a near 2-to-1 ratio
between what Science IS (N = 93), and what Science DOES (N = 53).

Figure 12. MSPR science EoT node matrix.

The vast majority of Distinctions made about the nature of Physics and Reality
came in the form of Physics DOES (N = 103), Physics IS (N = 146), and Physics IS NOT
REALITY (N = 81). Fifteen references coded as Physics IS Reality, whereas only 7
coded for Physics IS Reality MAYBE.

Figure 13. MSPR physics EoT node matrix.

Analysis of the physics and reality activity journals. A closer look at the
particular distinctions that students made in the Physics and Reality activity reveals the
dominance of individual POV in the reflections offered by students submitting the
follow-up journal. Figure 14 below illustrates the relative frequency of relevant
Distinctions and their correspondence to the EoT. Assumptions, Information, and
Questions were the least used codes, whereas Concepts, Implications, and Interpretation
comprised most of the remaining code content in this activity.

Figure 14. Distinctions vs. EoT node matrix.

A comparison of the Coordinations with EoT reveals a shift in the dominance that
POV had over Distinctions—to Interpretation and Implications. The use of Assumptions
and Questions also increased when compared to Distinctions, as illustrated in Figure 15
below. The largest shift occurred for IF-THEN reasoning in terms of the Implications that
students perceived in their reflections about physics and reality. Statements coded as I
Believe Because, and Related Things, employed all of the EoT, whereas Collections and
IF-THEN statements coded for all of the EoT except Information. Assumptions and
Questions increased in all four categories of Coordinations.

Figure 15. Coordinations vs. EoT node matrix.

Consideration of research questions with current results. The two research
questions are (1) How do IP students use representational systems in their thinking and
reasoning, and (2) How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change? Given that the Physics and Reality activity is
entirely narrative or written, natural language in those two forms comprises the entire
scope of representational systems employed by students thus far. Figures 6, 7, and 8
illustrate how students are using the written natural language representational system,
which has been influenced by both small and large-group narratives. Initial results
indicate that distinction-making (Distinctions code) employs a different distribution of
the EoT than does the synthesis of those distinctions (Coordinations code). In other
words, thinking and reasoning within the EoT are potentially distinguishable, as opposed
to reasoning being simply another form of thought as suggested by Paul and Elder
(2008). These shifts in thinking and reasoning associated with belief formation and
change were situated within the collaborative and individual reflections of the
participants in the Physics and Reality activity.
The aggregate PEP results (Table 6) indicate large-scale variations (SD = 13.18)
in epistemology, with small changes in composite personal epistemology (M = 2.94).
Eleven of the 29 respondents retained their pre-test PEP profile in the post-test condition,
14 experienced a singular shift between 2 of the 3 PEP profile dimensions, and the
remaining 4 obtained a complete change in PEP dimension profile in the post-test
condition. Nine students changed their primary PEP dimension. The degree to which the
Physics and Reality activity contributed to these changes in PEP dimension is unknown
at this time due to the close proximity of the activity to the pre-test condition.
Combined analysis of the remaining study activities. As a follow up to the
MSPR activities, students entered a phase of the curriculum designed to develop
mathematical modelling skills needed for modelling Physics in the Learning the Language
activity (see Appendix E and F). The skills developed in that activity were subsequently
used to build conventional laws of motion, and those activities factored into the analysis
of polling results concerning the First Zeroth Law (FZL) and the Second Zeroth Law
(SZL) created by the students in order to describe motion—such as the basic equations
for speed and acceleration (see Appendix G). This is the foundation of the entire course,
and subsequent curriculum in support of the course learning outcomes relied heavily on
this process throughout the remainder of the semester leading up to the end of term where
the exit interviews occurred. Each of these activities posed the same set of questions to
students concerning their POV on whether or not they had experienced any changes in
their thinking, reasoning, or understanding. Figure 16 below illustrates coding from the
activities comparing claims about thinking (T), reasoning (R), and understanding (U)—or
what was coded herein as TRU Claims.

Figure 16. Belief development with TRU claims node matrix.

A total of 54 references to Belief Development were coded in 31 sources, whereas
381 TRU Claims were made in 219 duplicated sources throughout the study data—
Thinking Claims sources (N = 71), Reasoning Claim sources (N = 61), and
Understanding Claim sources (N = 87). Figure 16 above shows the intersection of those
codes with Thinking Claims (N = 10), Reasoning Claims (N = 5), and Understanding
Claims (N = 8). Figure 17 below illustrates the connection between Belief Development
and EoT. The following direct quotes provide a sampling of the coincidence of TRU
Claims and Belief Development in terms of Thinking Claims.
Table 16

Examples of Belief Development Claims About Thinking
Student, Source, Date Quotation
FSCP3, Physics-Reality WRAPUP
Journal, September 8, 2014
“I have learned many new avenues of belief development, like
compilation of thoughts and ideas, deductive reasoning, and
conclusive resolution… PHYSICS!”
FSUP1, Physics-Reality WRAPUP
Journal, September 12, 2014
“My thinking did change a lot about these questions and I
concluded that there are many answers not one solid answer since
everyone will have their own meaning about reality and if
physics is reality because all of our thoughts are very different.”
FSUP1, Physics-Reality WRAPUP
Journal, September 12, 2014
“I have always been a very close minded person when it comes to
believing others and their ideas, so no my way of believing did
not change, but my way of thinking has.”
MSCP5, Physics-Reality
WRAPUP Journal, August 30,
2014
“I believe that my thinking has changed a lot after these activities
… I believe that I am starting to think outside of the box.”
MSUP15, Physics-Reality
WRAPUP Journal, September 10,
2014
“Some of the things that have been changed are my thinking,
reasoning, understanding, and the way I come to believe.”

The following direct quotes provide a sampling of the coincidence of TRU Claims
and Belief Development in terms of both Reasoning and Understanding Claims.

Table 17

Examples of EoT Belief Development
Student, Source, Date Quotation
MSCP4, Physics-Reality
WRAPUP Journal, September 9,
2014
“And I came to believe that my previous reasoning didn’t allow
for the opinions that I didn’t quite understand the true meanings
of the terms.”
MSCP6, Physics-Reality
WRAPUP Journal, September 7,
2014
“How I have come to believe has slightly changed as my
understanding and reasoning changed.”
MSUP15, Physics-Reality
WRAPUP Journal, September 10,
2014
“Some of the things that have been changed are my thinking,
reasoning, understanding, and the way I come to believe.”

Figure 17. Node matrix comparing beliefs with EoT.

Changed Belief Influence (a code tracking a new belief with its cause) was the
largest coincidence (N = 31) within the family of codes comprising the Beliefs node,
which includes Belief Development (N = 4). Only 10 Old Belief codes coincide with
EoT from the 20 references within 18 sources that originally coded for Old Beliefs. The
Belief Development code specifies a student’s perception that the way that they have
come to believe in things has changed in some way. Most students did not feel as though
they changed the way that they come to believe, but many students did experience a
change in beliefs concerning math, science, physics and reality, as indicated herein.
Figure 18 below illustrates the connection between TRU Claims and EoT. The following
direct quotes provide a sampling of the coincidence of EoT and Belief Development in
terms of the child codes Individual POV and Changed Belief Influence.
Table 18

Examples of Belief Development
Student, Source, Date Quotation
FSCP6, Physics & Reality
Journals, August 26, 2014
“I do think physics is reality because it is used in everyday life!”
MSCP4, Physics & Reality
Journals, August 27, 2014
“I am now of the opinion that Physics as is all the disciplines of
science are merely tools to get to reality.”
MSUP11, Physics & Reality
Journals, August 27, 2014
“Physics represents Reality…that’s as good as it gets.”
FSUP2, Physics & Reality
Journals, August 27, 2014
“When all that is thought about it would be hard to say physics is
reality because there is no particular definition and one definition
of reality cannot be isolated to fit the question. If the definition of
reality could be distorted to where it only meant what is
physically real the yes, physics would be reality, but taking in
part of the definition is not possible … physics is not reality.”
PHY121 Small group 10 member,
Math-Science-Physics-Reality
Narratives talk 2, August 20, 2014
“My reality changed a little bit– my personal thing, I was
thinking about it. This is just kind of what I wrote down. I wrote
down, Reality is the actual occurrence of things the way they
really are. People do not experience different realities, per se, just
different parts of the same reality or even seeing the same reality
from a different perspective.”
FSCP8, Physics-Reality WRAPUP
Journal, September 10, 2014
“Now, after the two Journal assignments and the discussions in
class, I lean towards ‘Physics represents Reality’.”

Figure 18. Node matrix comparing TRU claims with EoT.

Individual POV is the largest component of Thinking Claims (N = 13) and
Understanding Claims (N = 22), while comprising a small portion of Reasoning Claims
(N = 1). The conceptual content of both Thinking (N = 6) and Understanding (N = 7)
Claims was roughly triple the conceptual content of Reasoning Claims (N = 2). Belief
Development and Changed Belief Influence coded together 4 times, as demonstrated in
the direct quotes from sources given below.

Table 19

Examples of Belief Development
Student, Source, Date Quotation
FSUP1, Physics and Reality Wrap
Up Journal, September 12, 2014
“My thinking did change a lot about these questions and I
concluded that there are many answers not one solid answer since
everyone will have their own meaning about reality and if
physics is reality because all of our thoughts are very different.
My thinking I believe also expanded more into other thoughts. I
never sat in my room and thought about these questions, ever. So
it made my thoughts expand in a different way than they usually
do”
MSCP4, Physics and Reality Wrap
Up Journal, September 9, 2014
“And I came to believe that my previous reasoning didn’t allow
for the opinions that I didn’t quite understand the true meanings
of the terms”
PHY111 Small Group 4, Learning
the Language Narratives,
September 28, 2014
“I think I learned that it changed my view on how people
understand. So two people with different bodies of knowledge
can end up at the same conclusion about something, and how
different ideas can represent the same concept”
MSCP5, FZL SZL Poll
Reflections, September 24, 2014
“I have just been in a groove for so long and thinking they were
the same thing that it took me a second to recognize the terms are
describing different forms of motion”

Other assessments. The FCI and the MBT were not part of the study in terms of
research questions; however, they were part of the standard set of assessments associated
with the courses. Twenty-two of the 29 study participants completed both the pre- and
post-test FCI with results shown in Table 20. Twenty of the 29 study participants were
present to take the MBT with results shown in Table 21. According to Hake (1998), MBT
scores tend to be 15% lower than post-FCI scores. College Physics students in this study
had a mean MBT score that was only 5.5% lower than the post-FCI average, while the
University Physics students had a mean MBT score that was 17.7% lower than the post-
FCI average. The combined mean score for the FCI was 49.3%, while the combined
mean score for the MBT was 37.7%—a difference of 11.6%. It should be noted that
Yasuda and Taniguchi (2013) determined that 2 of the 30 FCI questions were invalid, and
found systemic errors suggesting further research on the validity of the FCI. Interestingly,
Hake (1998) also noted systematic errors in his data at the time, but nonetheless
attributed the stark differences in conceptual change between traditional (low-gain) and
IE (high-gain) results to pedagogy.
Table 20

Force Concept Inventory (FCI) Results
Course                Pre-test M(SD)    Post-test M(SD)    Gain ⟨g⟩
College Physics       8.3(3.7)          11.8(3.1)          0.16
University Physics    13.3(7.3)         17.8(6.4)          0.27
Overall               10.8(6.4)         15.0(5.9)          0.21
The normalized gain ⟨g⟩ is calculated as the ratio of the pre-post test score difference, relative to
the difference between a perfect score and the pre-test score—thereby excluding prior knowledge
from the evaluation.
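The normalized gain defined in the table note can be sketched as follows. The sketch assumes a perfect FCI score of 30, based on the 30 FCI questions mentioned in the text, and applies the formula to the course means in Table 20; the function and variable names are illustrative only.

```python
FCI_MAX_SCORE = 30  # Assumption: one point per question on the 30-question FCI.


def normalized_gain(pre_mean, post_mean, max_score=FCI_MAX_SCORE):
    """Normalized gain: (post - pre) / (perfect score - pre)."""
    if pre_mean >= max_score:
        raise ValueError("pre-test mean must be below the perfect score")
    return (post_mean - pre_mean) / (max_score - pre_mean)


# Course means copied from Table 20.
college_gain = normalized_gain(8.3, 11.8)
university_gain = normalized_gain(13.3, 17.8)
overall_gain = normalized_gain(10.8, 15.0)

print(f"College Physics: {college_gain:.2f}")
print(f"University Physics: {university_gain:.2f}")
print(f"Overall: {overall_gain:.2f}")
```

Applied to the rounded course means, this reproduces the College Physics and University Physics gains. The Overall row comes out near 0.22 rather than the reported 0.21; the difference may reflect rounding of the published means or a gain averaged over individual students, which the table does not specify.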

Table 21

Mechanics Baseline Test (MBT) Results
Course Percent Score M(SD)
College Physics 33.8(11.0)
University Physics 41.5(16.4)
Overall 37.7(14.2)

Summary
Students in this study population were given several assessments—the PEP for
personal epistemology, and the FCI and MBT for conceptual change. In all three cases,
positive gains were made as described herein. The elements of thinking and reasoning in
terms of the a priori theoretical coding definitions described in the Proposal, as well as in
vivo coding, reveal patterns that could support theoretical advancement in terminology
that is capable of lending clarity to models of conceptual and epistemological change.
Belief development and the influence on belief change also contain the distinct patterns
of thinking and reasoning suggested above, and thereby lend further support to a
theoretical advance in the way that concepts, models, thinking, and reasoning are
understood with respect to epistemological change, as well as general cognitive
constructs.
The research questions were concerned with how students use MRS in their
thinking and reasoning, and how the use of those MRS in thinking and reasoning
influences epistemological change. The MRS used in this study were largely written and
narrative discourse about curriculum content that is symbolic and diagrammatic. The
process of coding for EoT led to the discovery that concepts are poorly defined within the
framework of thinking as described by Paul and Elder (2008), but also with respect to
conceptual change literature in general (Vosniadou, 2010). The pattern that emerged from
the study data suggested that concepts come in several families, each of which specifies
different kinds of properties and relations. The ongoing process of memo-writing in
parallel to coding activity produced a taxonomy of conceptual frameworks which unifies
and answers the research questions in this study. Chapter 5 presents that finding along
with theoretical and practical implications for continued research and ongoing practice.
One of the limitations in this study is the duration of time between the end of the
analysis phase in weeks one through four of the course, and the exit interviews conducted
during week sixteen. Moreover, the pre-test and post-test conditions occurred 15 weeks
apart, spanning the first and last days of class prior to the week in which exit interviews
were conducted. Though the course content of weeks five through fifteen was
fundamentally the same as the week four content in terms of structure and approach, none
of the week five through fifteen activities were part of the analysis. However, the focus of
the exit interview questions specified how the overall experience related to changes in
thinking, reasoning, understanding, and personal epistemology. While there were some
large-scale changes in personal epistemology as measured by the PEP, it is not clear
which phase of the course was related to that change.
The students' perception of the interview and journal questions that specify
thinking and reasoning is potentially different from what the researcher perceives, as
well as what the theoretical coding definitions prescribe. The coding scheme employed in
this study aimed to provide unambiguous definitions that are dependent on speech
patterns rather than the inference of the coder. Moreover, constant comparative analysis
and memo bracketing of the researcher's opinions provide a backdrop for understanding
these potential limitations of the data and its analysis.

Chapter 5: Summary, Conclusions, and Recommendations
Introduction
The purpose of this study was to address a long-standing gap in the personal
epistemology literature concerning the resources and mechanisms for personal
epistemological change by crafting a new theory that describes the connection between
thinking and reasoning with MRS and epistemological change. Given the context of an IP
classroom for the study, and the persistent efforts of researchers in the PER community to
measure conceptual change in IP settings, this study also includes data on conceptual
change and the possible connection between conceptual change and personal
epistemology. Thinking and reasoning are required for both conceptual and
epistemological change, but are poorly defined in the literature (Nimon, 2013; Peters,
2007). In order to remedy this situation, a streamlined definition of thinking and
reasoning was adopted for the sake of coding source materials that matched the practice
of Physics in general while also being consistent with current mainstream views in the
literature.
The importance of a new and productive theory of learning that spans the fields of
conceptual change and personal epistemology cannot be overstated, especially in terms
of the psychological constructs of thinking and reasoning. Good theories have a broad
explanatory scope that is resilient enough to handle significant changes in context—such
as the content domain, and the conventional representational systems that work therein.
Though this study is focused on the practice of Physics in an IP setting, as well as the
MRS that are used in that endeavor, a grounded theory explaining how that is done has
significant potential for describing thinking and reasoning with MRS in general. The
potential of such a theory has import to human and machine intelligence in terms of the
structure of knowledge by means of thinking and reasoning processes that employ MRS.
Such a theory would provide researchers with the necessary tools to construct educational
content and assessments regardless of the type of representational system. The Cognitive
Modeling Taxonomy of Conceptual Frameworks (CMTCF) offered herein not only
presents a means by which to do this, but the theory of learning that it is positioned in
attempts to link the biology of brain function to the cognitive and behavioral activities
that lead to MRS artifacts.
The study was conducted in two IP classrooms at a rural community college in
central Arizona during the fall semester of 2014. Twenty-nine students participated in the
study, which consisted of observation of normal classroom activities with the curriculum
at Central Arizona College, as well as several assessments including the PEP and
the FCI. The goal of this qualitative grounded theory study is to determine the influence
that MRS have on IP students with respect to their conceptual frameworks and
personal epistemology.
R1: How do IP students use representational systems in their thinking and
reasoning?
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?
This chapter presents a new theory of learning that connects the neural activity of
the brain to the cognitive and behavioral processes that learners use in order to generate
artifacts in MRS. The core elements of the TRU Learning Theory are definitions of the
psychological constructs of thinking, reasoning, and understanding in terms of conceptual
frameworks. The CMTCF was generated in response to emergent themes in the study
data that indicated how students used and constructed concepts. Traditional definitions of
concepts tend towards general abstract ideas, rather than the way that concepts relate to
and build upon one another. The CMTCF defines these relationships in terms of the way
that they correspond to conventional and theoretical methods for modeling in IP.
Moreover, the CMTCF answers the need for clarity about what a concept actually is in
light of the forty-year history of conceptual change research that has resisted defining the
term (Vosniadou, 2010).
The remainder of this chapter is organized to present an overall summary of the
study explaining the general topic and the importance of the study, as well as a summary
of findings and conclusions—which include the introduction of a new theory of learning.
Implications for future theoretical development and research trajectories are offered in
concert with the practical implications of these study findings. Finally, recommendations
for the pursuit of new research questions and new or enhanced practices are offered in
conclusion.
Summary of the Study
It is not known (a) how thinking and reasoning with MRS occurs, and (b) how
that sort of thinking and reasoning affects epistemological change in terms of
mechanisms and processes—whether cognitive, behavioral, or social—in an IP
classroom. The findings herein suggest that this is due to the fact that concepts and
conceptual frameworks are poorly understood, and that this is the missing structure that
conventional definitions of the term model tend to ignore. Beliefs about Physics either
refer to or require multiple representational systems (MRS), such as words, symbols,
and pictures (Plotnitsky, 2012), and are situated within a social and collaborative learning
environment. The way in which Introductory Physics (IP) students use MRS in their
thinking and reasoning ultimately conveys to changing concepts and beliefs. How
students think and reason with MRS, and then how that conveys to epistemological
change was the goal of this study as described by the following research questions.
R1: How do IP students use representational systems in their thinking and
reasoning?
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?
Conceptual change and epistemological change are connected by the
representational systems used by learners when deploying them in contexts that require
modeling (Jonassen et al., 2005; Nersessian, 2010). Learning physics requires thinking
and reasoning within a context for problem solving where beliefs about the world are
regularly challenged (Lising & Elby, 2005). However, there is no clear definition of the
terms thinking and reasoning (Nimon, 2013; Peters, 2007) even though scores of types of
thinking are well attested within the literature—specifically with respect to this study:
scientific thinking and reasoning within the context of learning physics (Coletta et al.,
2007a, 2007b; Hake, 1998; Hestenes, 2010; Rosenberg, Lorenzo, & Mazur, 2006).
Furthermore, the term concept is poorly defined at best (Vosniadou, 2010).
In an effort to answer the research questions directed at how students think and
reason with MRS towards epistemological change, the analysis not only confirmed that
thinking and reasoning are poorly defined, but also showed that their content in
terms of concepts is poorly understood, which makes describing thinking and
reasoning in terms of concepts vague at best. During the coding process, reflections by
the researcher in the memos discovered that concepts are descriptive, categorical, or
relational. This finding led to the construction of the Cognitive Modeling Taxonomy of
Conceptual Frameworks, which serves to amplify the definition of thinking as the ability
to construct a model by uncovering the conceptual structure of models themselves. The
conventional definition of a model as any representation of structure (Hestenes, 2010) did
nothing to distinguish what structure actually is, and therefore made the judging of what
counts as a model as subjective as the person making the judgment. Moreover, if
modelling and thinking or reasoning are to be coordinated in any way, a solid set of
definitions for what these processes are and what their content must be is essential for
theoretical advance in educational and psychological research. The remainder of chapter
5 is dedicated to exploring and explicating such an advance.
Summary of Findings and Conclusion
Epistemological change measured by the PEP instrument revealed a modest
positive shift in composite PEP scores by means of dramatic shifts within and between
the three dimensions of the PEP: rational, empirical, and metaphorical. Analysis of the
study artifacts (journals, discussions, and exit interviews) reveals a consistent pattern of
concept construction by means of thinking and reasoning as distinct processes capable of
forming conceptual frameworks. Thinking and reasoning (as defined by TRU) are
believed to be the mechanisms of epistemological change (belief development), whereas
conceptual frameworks and the learning environment are believed to be epistemological
resources upon which epistemic framing coordinates conceptual change with belief
change. These conclusions are described below in reference to the research questions and
the emergent themes from the analysis of study data.
The significance of these findings corresponds to a reasonable call for paradigm
shifts in conceptual change and personal epistemology research, as well as human and
machine learning. Conceptual change research has long been stifled by a persistent
devotion to pre-post-test approaches that have yet to produce theoretical clarity (diSessa,
2010). Personal epistemology research has suffered a similar fate at the hands of models
and theories that lack clarity, and assessment methods that fail to produce consistent
results. According to Clement (2010), the mechanisms of conceptual change are not
known because the definition of a model is vague. Bendixen (2012) echoes the
assessment given by Clement (2010) and diSessa (2010) concerning conceptual change
when describing the state of epistemological change research as having little to no data on
the processes and mechanisms of this phenomenon. The call for qualitative studies
investigating the contextual factors of epistemological change has persisted from Hofer
and Pintrich (1997) through Bendixen (2012). This study sought to fill those gaps in the
literature, and has thus produced a new theory of learning—the TRU Learning Theory—
that brings them all together under the structure of the CMTCF, described herein.
Research Question 1.
R1: How do IP students use representational systems in their thinking and
reasoning?

The Physics and reality classroom event revealed patterns in thinking and
reasoning that were limited to written and narrative representational systems. The nature
of the questions in this event did not warrant the use of graphical or diagrammatic
representational systems because the activity demanded only narrative and written
responses in natural language, and therefore none were observed. However, intricate
patterns were obtained in the qualitative analysis revealing the kinds of interactions that
lead to belief change and the types of narratives that promote those changes, as well as
the cognitive resources that support them.
The largest shifts in thinking and reasoning occurred in terms of the individual
and group point of view (POV) expressed by students in journal entries and group
discussions or interviews. Epistemological changes measured by the PEP instrument
indicate large variations in one or more epistemological dimensions (rational,
metaphorical, or empirical) in concert with the treatment under study. It is unsurprising
that the content of beliefs is in some way conceptual, and therefore the shifts in concept
usage observed in the coding of study artifacts also describe in some way the underlying
structure of thinking and reasoning.
Theme 1: distinctions (thinking). The very nature of the questions posed in this
activity (What is Physics? What is Reality? Is Physics Reality?) seems to force students
to evaluate not only their own beliefs, but also the beliefs of others in the process of
compelling them to define their concepts. As students consider the questions, they use
inferential (if-then) reasoning (coded under Coordinations) in the comparison of their
own views with the views of others—which forces a coordination of existing concepts
and the possible assimilation/accommodation of concepts provided by other students
engaged in the same debate. Reflection on one's own beliefs requires the use of
metacognitive control, while the consideration of other perspectives requires a process of
critical listening followed by dialectical thinking about that content. Subsequent
interpretations (an EoT) and assimilations/accommodations of new information (another
EoT) present an organic opportunity for thinking and reasoning because of the nature of
the questions and the learning habitat.
This study positioned the construct of thinking in two ways by using the EoT by
Paul and Elder (2008), and a synthesis of that structure with the general practice of
science in terms of modelling. The proposed coding scheme of thinking as the ability to
construct a model, and reasoning as the ability to relate one or more models, was intended to
provide a better fit to scientific thinking and reasoning in parallel with the EoT. A
decision to delay coding by this scheme was made in an effort to allow the in vivo coding
and the a priori theoretical coding in terms of the EoT to flow naturally in the first phase
of data analysis. During that process the discovery was made that the content of EoT
Concepts varies so widely that coding for thinking or reasoning by the proposed
synthetic scheme was not possible without a clear definition of what concepts are, and
how they are coordinated into a model. Some of the content coded as concepts presented
in forms that dealt strictly with categorical declarations versus relational ones. This
finding led to the taxonomy offered below. According to Vosniadou (2010), conceptual
change researchers have historically tended to avoid defining the term concept,
preferring rather to position concepts as something that changes within a larger theoretical
framework of cognition. Moreover, the definition of model as any representation of
structure was found to be lacking sufficient clarity, and thus the following taxonomy of
conceptual frameworks was created.
The Cognitive Modeling Taxonomy of Conceptual Frameworks (CMTCF).
Concepts are defined as one or more descriptive distinctions (metrics) about something.
There are two families of concepts leading up to the collection of concepts that comprise
the structure of a model. Pericepts are concepts that have the capacity to categorize or
classify other concepts, whereas Metacepts serve to define relations or relationships
between other concepts because some concepts are about other concepts rather than being
about something. Modelling concepts are simply the coordination of multiple concepts
that represent the structure of complex ideas, and in this way, the term modelling
becomes more precise by specifying the resource for structure.
The way that concepts are coordinated (conceptual frameworks) is the structure
that the conventional definition of model refers to when defining model as any
representation of structure. Representational systems (words, pictures, symbols,
diagrams, etc.) are the only ways to communicate the content of a model—which is
entirely in the form of concepts whose construction and coordination require, and thus
define, the construct that we call thinking. The following definitions are offered in
establishment of the proposed conceptual framework.
Concept. One or more descriptive (metric) distinctions about something. For
example, bananas are yellow provides the metric yellow as describing a property of
bananas. The statement: stones are hard uses the term hard in the same way.
Pericepts. Concepts that serve to categorize or classify other concepts. This
family of concepts includes Supercepts and Subcepts, defined below.

Supercept. Categorical concepts with capacity to group other concepts. For
example, the statement: sticks and stones are things uses the concept of thing as a
grouping concept for the class of things known as sticks and stones.
Subcept. Concepts that form classes within, or under, more general concepts such
as Supercepts. For example, the statement: apples and oranges are fruit uses the concepts
of apples and oranges as classes within the categorical concept (Supercept) of fruit.
Metacepts. Concepts that serve to define a relation or a relationship on or between
other concepts. This family of concepts includes Hypocepts and Hypercepts, defined
below.
Hypocept. Relational concepts about other concepts. For example, the statement:
sticks are not stones uses the relational concept of not the same as a way to encode for the
lack of equality between sticks and stones. The statement: this is greater than that uses the
relational concept of greater than to encode for how much larger this is than that.
Hypercept. Concepts about other concepts encoded in the form of a relationship.
For example, the statement: there are three groups of two stones uses the concepts of
numbers as a way to encode for the group structure of the collection in terms of what can
be counted.
Modelling Concepts. The coordination of concepts by means of other concepts,
which in turn represent the structure of complex ideas by creating a new concept. This
family of concepts includes the Multicept and the Nomocept, as defined below. The primary
difference between a Multicept and a Nomocept is the manner by which Hypercepts are
joined, and the descriptive scope of the Hypercepts in relationship.

Multicept. The encoding of Hypercepts coordinated by a Hypocept. For example,
the statement: three groups of two stones is equivalent to two groups of three stones uses
the hypocept of equivalent to as a way to coordinate the invariance under regrouping that
is evident in the fact that regardless of how you group six stones—2 groups of 3, or 3
groups of 2—the sum total remains the same.
Nomocept. The coordination of Hypercepts through a reasoning process. For
example, the statement: a change in position always requires a change in time uses an
inferential reasoning process in order to connect the two change quantity Hypercepts. The
term Nomocept was chosen because the Greek word nomos refers to law, and scientific
laws are simply empirically familiar regularities. The statement in this example is an
undeniably true description of motion for any object in a Newtonian world.
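The definitions above can be summarized as a small lookup structure. This is only an illustrative sketch, not part of the study: the class names, the Family grouping labels, and the members helper are assumptions introduced here, while the concept types and example statements are taken from the definitions above.

```python
from dataclasses import dataclass
from enum import Enum


class Family(Enum):
    """Families of concepts named in the taxonomy (labels are illustrative)."""
    BASIC = "Concept"
    PERICEPT = "Pericepts"            # categorize or classify other concepts
    METACEPT = "Metacepts"            # define relations or relationships
    MODELLING = "Modelling Concepts"  # coordinate concepts into a model


@dataclass(frozen=True)
class ConceptType:
    name: str
    family: Family
    example: str


# Each entry pairs a concept type with the example statement given for it.
CMTCF = [
    ConceptType("Concept", Family.BASIC, "bananas are yellow"),
    ConceptType("Supercept", Family.PERICEPT, "sticks and stones are things"),
    ConceptType("Subcept", Family.PERICEPT, "apples and oranges are fruit"),
    ConceptType("Hypocept", Family.METACEPT, "sticks are not stones"),
    ConceptType("Hypercept", Family.METACEPT,
                "there are three groups of two stones"),
    ConceptType("Multicept", Family.MODELLING,
                "three groups of two stones is equivalent to "
                "two groups of three stones"),
    ConceptType("Nomocept", Family.MODELLING,
                "a change in position always requires a change in time"),
]


def members(family: Family) -> list[str]:
    """Return the names of the concept types that belong to a family."""
    return [c.name for c in CMTCF if c.family is family]


print(members(Family.METACEPT))
```

Grouping by family in this way makes the two-level structure explicit: Pericepts and Metacepts each contain two concept types, and the Modelling Concepts are built by coordinating Hypercepts.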

Figure 19. Cognitive Modeling Taxonomy of Conceptual Frameworks – Processes.

The Cognitive Modeling Taxonomy of Conceptual Frameworks (CMTCF) shown
above in Figure 19 illustrates the progression of Thinking as a construct that advances
vertically from basic concepts towards complex conceptual frameworks. Reasoning
processes occur horizontally in the form of coordinating concepts at all levels of thinking.
Thinking and reasoning are thus fundamentally different, though inextricably linked.
Understanding is possible only at the level where models interact with other models in
terms of the symbol-to-symbol and symbol-to-referent correspondence—which
ultimately entails reasoning between multiple contexts via MRS. The Nomocept is the
pinnacle of the taxonomy, and represents the generalization of law-like models based on
empirically familiar regularities (EFR). It should be noted that top-down thinking from a
law is also permitted in this taxonomy in the case where the EFR is already known by the
modeler. Law-like understanding can be parsed from the top-down, or built from the
bottom-up in manners consistent with top-down and bottom-up theories of thinking and
reasoning.
Given that thinking was initially defined as the ability to construct a model, and
that a model is simply any representation of structure (Hestenes, 2010), this taxonomy
proposes that multicepts are the structure of models, and that the construction of a model
under this framework is evidence of thinking. If the content of thinking is model
construction, and model construction is concept coordination, then the coordination of
models is naturally something other than thinking—or at least a form of thinking on a
higher level than what has been described herein. The EoT provided by Paul and Elder
(2008) define reasoning as just another kind of thinking; however, there is no clear means
by which to distinguish thinking from reasoning under that model. The coding scheme for
reasoning defined as the ability to relate multiple models ended up matching the patterns
found in the study data, and is therefore proposed as the new definition of that construct.
Theme 2: coordinations (reasoning). Students engaged in the metacognitive
control associated with considering their own views and the views of others are
compelled to sort and organize the pre-existing concepts that they have in light of new
ones exposed in the group interactions. Initially, this is handled by trying to defend
existing views, but also by attempting to assimilate new ideas when found to be superior
to old ones. When the Distinctions that students made are compared to the Coordinations
in terms of how the EoT comprise either cognitive activity (see Figure 6), the results
indicate a shift away from Group or Individual POV simultaneous with an increase in the
use of Information, Interpretation, Assumptions, and Questions. The use of Concepts
remains largely the same. The content of thinking and reasoning in terms of the EoT is
largely the same in terms of the individual elements of thought as described by Paul and
Elder (2008); however, in addition to the varied proportions of EoT usage, there are
additional ways in which those collections of EoT are coordinated. Those additional
methods of coordination come largely in the form of if-then statements that are
classically understood as inferential reasoning, as well as knowledge justification
statements that use the term because to link up beliefs with models and concepts.
This shift is one way to distinguish the cognitive activity of thinking from
reasoning in terms of how model/concept creation arises from the distinction-making
process (thinking), and the coordination of multiple models/concepts relationally
(reasoning). The findings offered in support of this pattern are consistent with the
definitions of thinking and reasoning offered in chapter 2 as a way to encapsulate the EoT
by Paul and Elder (2008) within the practice of science in general. Though Distinctions
and Coordinations employ the same set of EoT, they do so in varying and consistent
proportions—thereby indicating on one level that reasoning is in fact another type of
thinking, but also allowing for a means to distinguish that shift in terms of cognitive
processes other than distinction-making. For example, the sub-codes of the Coordination
theme are IF-THEN, Related Things, Collections, and I Believe Because. In all cases, the
cognitive agent is forming relationships between previously made distinctions, and/or
existing Coordinations—in other words, reasoning involves the formation of relationships
between things or relationships, which affirms the definition of reasoning offered by the
author: reasoning is the ability to relate multiple models.
The simple difference between thinking and reasoning is the difference between
model construction and model-to-model interaction. The content of thinking and
reasoning is entirely models, and models are simply coordinations of concepts. However,
thinking is the process of model construction in terms of concepts, whereas reasoning is
the process of forming model relations and relationships. Both processes rely
fundamentally on concepts since relational and relationship concepts are the glue that
makes model construction possible.
Given the Cognitive Modeling Taxonomy of Conceptual Frameworks (CMTCF)
offered herein, it is now possible to accept the definition of reasoning as the ability to
relate multiple models when model construction is understood as evidence of thinking.
Thinking and reasoning are thus connected in terms of what they operate on or within.
The noticeable transitions in magnitude and frequency of EoT usage between
Distinctions (container for descriptive metrics) and Coordinations (relations between
descriptive metrics) indicate that thinking and reasoning are in fact distinct cognitive
constructs deserving of separate consideration in future research.
Research Question 2.
R2: How does the use of MRS in the thinking and reasoning of IP students
promote personal epistemological change?
Theme 1: belief development. The findings herein describe patterns of thinking
and reasoning about the content and the process of coming to a set of beliefs concerning
the nature of physics, mathematics, science, reality, thinking, reasoning, understanding,
and motion. The content of thinking in terms of concepts, and the coordination of
families of concepts through inferential reasoning and knowledge justification suggest
that the use of natural language in either written or narrative form is essential to
producing epistemological change. Qualitative findings described in chapter 4 illustrate
that change explicitly, and the quantitative results of the PEP assessment support the
import of those findings in terms of the epistemological dimensions of rational, empirical,
and metaphorical measured by the PEP. The PEP dimensions are more or less
dispositions towards thinking or reasoning about the world, and therefore expand the
scope of inquiry about the types of thinking and reasoning that produce epistemological
change.
In the second half of the activity set under study, students were repeatedly asked
to reflect on how their thinking, reasoning, and understanding had changed as a result of
the activities conducted thus far. Most participants did not think that their beliefs had
changed; however, most participants did think that the way that they understand their
beliefs had changed. Given the large-scale changes in the PEP dimensions among most of
the participants, this suggests that the curriculum under study is structured well for
epistemological change. In general, this curriculum is collaborative and reflective with
intensive writing and discussion opportunities. This sort of learning community makes
metacognitive and critical listening demands about beliefs, and therefore suggests that
curriculum content and/or pedagogical engagement are resources for epistemological
change.
The how question of thinking and reasoning with MRS has now been
answered in terms of the conceptual frameworks described by the CMTCF. Moreover, if
thinking is the ability to construct a model, and models are the coordination of different
families of concepts, then to the degree that beliefs have conceptual content, the CMTCF
describes epistemological change. No matter how vague or precise one's definition of
thinking might be, it is difficult to imagine that it is devoid of conceptual content. The
structure of beliefs can now be analyzed in terms of not only its conceptual content, but
also whether or not the structure of those beliefs is limited to mere thinking about a single
model, or reasoning between multiple models.
Theme 2: Thinking-Reasoning-Understanding (TRU) Claims. Students made
claims about whether or not their thinking, reasoning, or understanding had changed after
attempting to define understanding. Understanding was typically defined as the ability to
explain what you know to another person. Memo activity during the Learning the
Language activity questioned whether or not understanding is context-dependent because
the nature of this activity required students to reason between MRS in order to obtain a
meaningful and coherent interpretation. The construct of Understanding was thus defined
as the ability to sustain Reasoning across a shift in context. Based on the sum total of
these findings, the author suggests a new theory of learning called the TRU Learning
Theory, where TRU is an acronym for Think-Reason-Understand.
Introduction and background to the TRU Learning Theory. The Hebbian
Principle of neurons that fire together wire together is sufficient for defining learning as
the coordination of the activation of multiple regions of the brain. Various
representational systems (RS) have the ability to activate different regions of the brain.
Through these multiple representational systems (MRS), humans encode for meaning in
an attempt to build models of the world with capacity to represent the structure as they
perceive it.
The cognitive activities of thinking and reasoning are often conflated with one
another or defined in self-referential manners. The term model is most often defined as
any representation of structure, where structure indicates the relations between things
being modelled. Concepts are inevitably part of this cognitive-behavioral process, but
they are more vaguely defined than models. The definitions of the TRU
constructs are given below in terms of concepts and conceptual frameworks as defined by
the CMTCF.
TRU Theoretical Statement. The TRU Learning Theory asserts that multiple
representational systems (MRS) encode for meaning by coordinating concepts in a manner
that activates multiple regions of the brain, and thus form conceptual frameworks in
accordance with the Hebbian Principle.
Definitions
Thinking. The ability to construct a concept. The most basic concept besides
mere recognition of a thing is a concept that provides a descriptive metric for the thing in
question. Along this line of thinking, the concepts that serve to classify and categorize
allow the cognitive agent to sort and organize worldly objects and events.
Reasoning. The ability to construct a conceptual framework. The Modeling
Concepts of the Multicept and the Nomocept are conceptual frameworks that function as
models. Multicepts are constructed on the basis of relations and relationships, whereas
the Nomocept is borne out of a reasoning process such as inferential reasoning.
Understanding. The ability to relate conceptual frameworks. The use of MRS
allows for the encoding of a model in multiple languages—so to speak. The degree to
which a modeler can form symbol-to-symbol and symbol-to-referent connection both
within and between MRS is the degree to which the models are understood within and
between contexts.

Figure 20. Cognitive Modeling Taxonomy of Conceptual Frameworks – Collections.

Example from Physics Using the CMTCF. Consider the law-like statement: a
change in the position [of an object] always requires a change in time. From an
observational point of view, this is an Empirically Familiar Regularity (EFR) in the large-
scale world of objects that can be seen. Objections to this rule (law) deny physical reality.
Position and time are basic concepts about features of the world where objects reside, and
the concept that there can be a change in either one is properly described under the
CMTCF as a hypercept: concepts about other concepts encoded in the form of a
relationship. In this case, both a change in time and a change in position apply the
relational concept of change to the basic concepts of position and time in order to form
new concepts that are about position and time.
The phrase “always requires” encodes for an inferential connection between these
two change quantities that fundamentally places the two concepts in a relationship that
obeys the law (nomos) previously given—that a change in position always requires a
change in time. This conceptual connection is a nomocept because it serves to define the
law-like connection (empirically familiar regularity) between the two hypercepts on
position and time.
[Diagram: the law "a change in position always requires a change in time," with "a change in position" and "a change in time" each annotated as a hypercept and the connection between them annotated as the nomocept.]

Figure 21. CMTCF example 1: first zeroth law of motion.
One way to encode for this model is to write the equation $\Delta x = v\,\Delta t$, which
specifies speed as the physical connection between space and time. Speed is not a thing,
and neither was the relational concept of “always requires”; hence the conceptual
connection as a metacept (hypercept) retains its basic property of about-ness as a derived
quantity in the physical world. Speed is about coordinated changes rather than a
substance upon which a categorical statement can find grounding, or an object upon
which a descriptive metric can be formed.
The Cognitive Modeling approach requires that an axiom should encode for what
the law (nomocept) describes. The following arithmetic illustrates how this is done using
the quotient operation as a way to encode for the “always requires” type of metacept.
Table 22

Cognitive Modeling Approach to Axiom Development

$\Delta x = \Delta x$, $\Delta t = \Delta t$
    Identity Multicept. It is undeniably true that something is identical to itself. It is
    important to realize that though a position can be identified in the real world, a
    change in position, and/or a change in time, is a metacept about such worldly
    things.

$\frac{\Delta t}{\Delta t} = 1$
    Quotient Identity Multicept. Since changes in time are independent of the change in
    position of the object, we can use this fancy form of one (ffoo) as a multiplier
    that has capacity to preserve the identity on their dependent multicepts, while
    making a new relation feasible.

$\Delta x = \Delta x\,\frac{\Delta t}{\Delta t}$
    Preservation of Identity Multicept using a fancy form of one (ffoo). The axiom
    DOES NOT yet encode for what the law describes because the two quantities do
    not stand in a binding (quotient) relationship with one another.

$\Delta x = \frac{\Delta x}{\Delta t}\,\Delta t$
    Letting the axiom encode for what the law describes by using the quotient
    relationship symbol as a way to encode for “always requires”. In this case,
    “always requires” is synonymous with the concept of “per”. Moreover, standard
    arithmetic permits the shift.

$\Delta x = v\,\Delta t$
    Creating a new concept and encoding for it with a new symbol. The
    symbol-referent connection is to a hypercept, which is a type of metacept, and
    metacepts are about something rather than being ontologically something.

This vector diagram (Figure 22) encodes for the axiom as well as the narrative
law by illustrating with geometric objects all of the concepts except for a change in time.
The quotient of change in position relative to a change in time scales the change in
position length by some amount. Such a conceptual synonym is able to model some of
the parts of the original Multicept, but not all. Moreover, it has capacity to encode for
direction in a way that the natural language model cannot.

The ability to coordinate multiple models—such as natural and diagrammatic—
lends clarity to both models because each RS has a different capacity for encoding
meaning. An additional diagram is needed in order to fully coordinate the first diagram
with the natural language law, or its axiom.
In this second diagram (Figure 23), the connection between space and time can be seen. The slope of
the line is a constant value whose magnitude is dependent on how wide the time interval
is in comparison to the position interval. In this way, the diagrammatic RS encodes (or
potentially encodes) for the magnitude of the speed relative to the primary change
quantity (position) that defines it relative to time. However, this graphical RS does not
encode for direction, whereas the vector diagrammatic RS does. For these reasons, RS
other than just the natural or the symbolic are needed in order to generate a
comprehensive model that is the result of multiple model-to-model interactions.
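
The reading of speed as the slope of the position-time line can be sketched numerically. The following Python fragment is illustrative only: the function name speed_from_graph and the sample readings are assumptions introduced here and are not part of the study data. It computes the quotient of the change in position over the change in time, and then rebuilds the change in position from that quotient.

```python
from fractions import Fraction


def speed_from_graph(t0, x0, t1, x1):
    """Return the slope of the position-time line between two readings.

    The slope is the change in position per change in time. A zero change
    in time is rejected, since the quotient is undefined in that case.
    """
    delta_t = Fraction(t1) - Fraction(t0)
    delta_x = Fraction(x1) - Fraction(x0)
    if delta_t == 0:
        raise ValueError("a change in position always requires a change in time")
    return delta_x / delta_t


# Hypothetical readings from a position-time graph (seconds, meters).
v = speed_from_graph(0, 2, 4, 14)

# Rebuild the change in position from the speed and the change in time.
delta_x_rebuilt = v * (4 - 0)
```

Exact fractions are used so that the rebuilt change in position matches the original interval without rounding error.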

Figure 22. Vector diagrammatic model of the First Zeroth Law.

Reasoning = model-to-model interaction. In the same way that a change in
position requires a change in time, so does a change in speed.
[Diagram: the law "a change in speed always requires a change in time," with "a change in speed" and "a change in time" each annotated as a hypercept and the connection between them annotated as the nomocept.]
Figure 24. CMTCF Example 2: Second Zeroth Law of Motion.
The Cognitive Modeling approach permits the construction of an axiom that
encodes symbolically what the law above describes, namely $\Delta v = a\,\Delta t$.
[Diagram: the axiom $\Delta v = a\,\Delta t$, with $\Delta v$ and $\Delta t$ annotated as hypercepts, the equals sign annotated as a hypocept (equivalence relation), and $a$ as the new construct linked to $\Delta t$ by a new hypercept.]

Figure 25. CMTCF example 3: Second Zeroth Law axiom.

Figure 23. Graphical model of the First Zeroth Law.

This axiom ends up creating a new concept conventionally known as acceleration.
In one sense, it is as though the reasoning process of “always requires” became an
equivalence relationship for the new construct. This highlights the transformative
capacity of RS to encode for and subsequently create new concepts. However, the new
conceptual creation tends to be more or less explicit depending on the RS context. For
example, the natural language of a change in speed always requires a change in time
does not explicitly provide one with the concept of acceleration, much less the
construct. However, the symbolic approach lends the modeler an opportunity to
pick or create a name label for the new concept, which in this case is a relationship, or
Hypercept.
The transition between natural language and symbolic language in the
acceleration law given above retained the elements on the outer edges (change in speed
and time) as Hypercepts across both RS. These are essentially RS synonyms. The
reasoning process of always requires is in some way equivalent to the combination of an
equivalence relation (hypocept) and the new construct of acceleration labeled a.
Moreover, a new Hypercept connection serves to link the equivalence relation and the
new construct within the original conceptual framework. The empirically familiar
regularity that supports the notion of always requires thus results in an equivalence
relation connecting the original concepts via a new and derived concept. The
mathematical equivalent of the universal quantifier on constraint (always requires) is an
equivalence relation definition via a new concept.
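
The steps of Table 22 can be repeated for the second law. What follows is a sketch only. It assumes the symbols $v$ for speed and $t$ for time, takes $a$ as the label for the new construct, and assumes a nonzero change in time so that the quotient is defined.

```latex
\begin{align*}
\Delta v &= \Delta v, \quad \Delta t = \Delta t
  && \text{(identity)} \\
\frac{\Delta t}{\Delta t} &= 1
  && \text{(quotient identity, the ffoo)} \\
\Delta v &= \Delta v \, \frac{\Delta t}{\Delta t}
  && \text{(preservation of identity)} \\
\Delta v &= \frac{\Delta v}{\Delta t} \, \Delta t
  && \text{(quotient encodes ``always requires'')} \\
\Delta v &= a \, \Delta t, \quad a \equiv \frac{\Delta v}{\Delta t}
  && \text{(new symbol for the new concept)}
\end{align*}
```

The last line is where the equivalence relation and the new construct appear together: the quotient is given a name, and the change in speed is then defined through it.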
Perhaps one translation of the symbols back to words is that a change in speed is
the same but not the same as the connection between acceleration and time. The notion of
the same but not the same is what the equivalence relation demands in the absence of an
identity. Another possible translation might be that a change in speed is defined to be the
connection between acceleration and time.
Predictions.
1. Two or more RS are required for building a physical network of knowledge
within the human brain.

2. There is a causal connection between the cognitive content of a model as
defined herein, and the behavioral artifacts that present in the form of MRS.

3. Understanding is contingent on one or more diagrammatic RS within the
family of MRS used by a modeler. In other words, the ability to make symbol-
symbol and symbol-referent connections within and between RS is the key to
understanding.

4. Optimal learning achievement is contingent on a family of MRS that include
symbolic, diagrammatic, and natural language RS.

Suggestions for TRU Learning Theory use
1. Obtain frequencies for every attempt to produce concepts and models, and
then compare that to the frequency of attempted concepts and models that
have logical and/or conventional merit.

2. Log the frequency of RS used—such as natural, symbolic, diagrammatic,
etc.—and continuously compare relative to the evolution of conceptual
frameworks within persons or groups.

3. Build curriculum and assessments from the ground up, using the CMTCF as a
guide. This is likely a paradigm shift away from the dominance of pre-post-
test strategies towards a qualitative constant comparative analysis.

4. Use as a measure of intelligence in terms of the creative output of the
individual relative to conventional merit, as well as RS fluency within a
cultural context.

5. Use as a standard for machine learning and artificial intelligence. Current
approaches to machine learning are consistent with the findings herein, but
fail to specify the process in sufficient detail, or provide coherence in terms of
the basic constructs described herein.
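
Suggestions 1 and 2 above amount to frequency counts over coded artifacts. A minimal sketch in Python follows, assuming a hypothetical record format in which each coded artifact lists the RS used and an expert judgment of conventional merit. The field names, function names, and sample data are invented for illustration and do not come from the study.

```python
from collections import Counter

# Hypothetical coded artifacts: each records the RS used and whether an
# expert judged the attempted model to have conventional merit.
artifacts = [
    {"rs": "natural", "conventional": False},
    {"rs": "symbolic", "conventional": True},
    {"rs": "diagrammatic", "conventional": True},
    {"rs": "natural", "conventional": True},
    {"rs": "graphical", "conventional": False},
]


def rs_frequencies(records):
    """Count how often each representational system appears."""
    return Counter(record["rs"] for record in records)


def merit_ratio(records):
    """Fraction of attempted models judged to have conventional merit."""
    if not records:
        return 0.0
    with_merit = sum(1 for record in records if record["conventional"])
    return with_merit / len(records)


freq = rs_frequencies(artifacts)
ratio = merit_ratio(artifacts)
```

Repeating these counts at successive points in a course would give the continuous comparison that suggestion 2 describes.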
These findings, and the TRU Learning Theory described herein, provide a clear
path for future analyses of MRS in terms of the conceptual content, and the means by
which those concepts are constructed. This applies to all learning—human or otherwise.
Given that personal beliefs and the cognitive processes for forming those beliefs are
described in terms of concepts and conceptual frameworks, the TRU Learning Theory
simultaneously answers the call by both conceptual change research and epistemological
change research fields for greater theoretical clarity as it pertains to terminology,
mechanisms, and resources.
Implications
Paul and Elder (2008) suggested the process of thinking is what generates the
reasons that the process of reasoning then bases its conclusions on. Holyoak and
Morrison (2012) defined thinking as transformations of mental representations for the
sake of goal-directed modeling. The definition of model as any representation of structure
(Hestenes, 2010) is surely true, but provides no clarity with which to judge the relative
merit of any model. The CMTCF solves all of the problems in clarity described in the
literature review by distinguishing thinking from reasoning in terms of their conceptual
content and coordination. A general definition for the constructs of thinking and
reasoning in terms of concepts has widespread implications for research in general
psychology, philosophy, personal epistemology, conceptual change, human learning,
machine learning, and intelligence.
Theoretical implications. The changes in theory that the TRU Learning Theory
might impose are more likely to be affordances in clarity rather than content. The current
definitions of constructs like thinking and reasoning are merely vague or circular. The
general use of the term concept boils down to any idea that an agent can have, and the
field of conceptual change research has avoided defining what concepts actually are
(Vosniadou, 2010). In each case, the present lack of clarity is still true to some degree;
however, measuring change on constructs so poorly defined may prove to have been less
than useful.
In many ways, the CMTCF is the missing link in conceptual change research
because it precisely defines what concepts are, as well as how the coordination of
concepts corresponds to thinking, reasoning, and understanding. Moreover, conceptual
change research has suffered from a snapshot view of conceptual change in the form of
pre-post-test strategies instead of longitudinal qualitative ones (diSessa, 2010). The
degree to which concepts correspond to beliefs, and conceptual change corresponds to
conceptual frameworks, determines the connection that exists between the research fields
of conceptual change and personal epistemology. TRU Learning specifies the
mechanisms for conceptual change by defining how thinking and reasoning correspond to
conceptual frameworks. The primary resource for conceptual change is the use of MRS
that are situated within various domains of knowledge and inquiry, as well as social
structure—which Bodin (2012) describes as epistemological framing activating a
network of epistemological resources. Bing and Redish (2012) defined epistemological
resources as social, affective, and artifact-based, whereas Bodin (2012) described the
epistemological framing as dealing with the way in which knowledge and beliefs are
constructed. The CMTCF could then lend precision to the question of epistemological
framing in terms of how conceptual frameworks structure both knowledge and beliefs, as
well as the interaction that exists between the social aspects of a learning environment
and the MRS used therein to produce artifacts that encode for concepts and models.

Human and machine learning differ on many levels; but with respect to the
elements of this study, one clear distinction is the ability of humans to form beliefs versus
the ability of a machine to do the same. The correspondence between human thinking and
reasoning versus machine thinking and reasoning within the context of model and
knowledge construction is foundational to the potential research questions of (a) what is a
belief, and (b) what do machines believe? This sort of trajectory in research could also
touch on issues in the difference between mind and brain, as well as consciousness.
Practical implications. Each of the elements of the TRU Learning Theory
(concepts, models, etc.) can be counted in terms of their construction and their
conventional efficacy in terms of how naïve or sophisticated they are judged to be by an
expert. Moreover, within each of these frequencies there is ample opportunity to capture
various types of each construct in both qualitative and quantitative ways. As shown in the
prior section, the TRU Learning Theory has the capacity to fully describe the conceptual
framework and cognitive processes required to fully understand motion—the most basic
construct in Physics. Moreover, the example given utilized four different representational
systems in the process—natural, symbolic, diagrammatic, and graphical. Furthermore, the
Cognitive Modeling approach provides a modeling method for converting words into
symbols that equate to conventional models. The amount of conceptual change that can
be tracked and measured under such a paradigm holds great promise for education reform
within Physics. The CMTCF is general enough to apply to any structured body of
knowledge regardless of context, so long as conventional representational systems are
productive in expressing the conceptual content of that domain. Therefore, the TRU
Learning Theory is a general theory of learning due to its wide application in terms of
MRS, as well as the hypothesis that multiple regions of the brain are activated and
coordinated by the use of MRS.
Future implications. The TRU Learning Theory is in some ways a complete
paradigm shift away from the status-quo of pre-post-test assessments of conceptual
change (diSessa, 2010), by answering the demand in personal epistemology research to
explain how learners develop conceptual knowledge about the world and how that
conceptual knowledge influences belief (Hofer, 2012). It is the mechanisms and
processes of epistemological change, as well as the contextual factors of the same, that
have eluded personal epistemology researchers for the last 40 years (Bendixen, 2012).
These two fields are more deeply connected than anyone could have realized in the
absence of a clear set of definitions for all of the critical constructs therein—namely,
concepts and models.
If the Hebbian principle of neurons that fire together wire together is true, then
human learning could be defined as the coordination of the activation of multiple regions
of the brain. MRS have capacity to activate various regions of the brain, and thus
influence learning in this fashion. The connection between MRS and regions of interest
(ROI) in the brain has potential import for understanding how brain function and
conceptual change are structured. The parallels between machine learning and human
learning are now accessible in terms of the ways in which knowledge content is encoded.
Intelligence research is no better off than the aforementioned fields with respect to
understanding how its conceptual content corresponds to the types of thinking and
reasoning that intelligence tests are supposed to measure. A great deal of clarity is
possible with the aid of the CMTCF when it comes to defining intelligence, and therefore
another potential paradigm shift is possible in both human and machine intelligence
research. In many ways, the CMTCF portion of the TRU Learning Theory is akin to
discovering the atomic structure of matter. It would be presumptuous to extend the
import of this theory any further than that atomic structure metaphor, as the
aforementioned paradigm shifts are more likely to provide a deeper set of advances in
theory and practice.
Strengths and weaknesses. The strength of this study is rooted in the logical
consistency of the TRU Learning Theory that was produced through an extensive
analysis of the data provided by the study population. Multiple coding schemes in
multiple data types converge on the same result that thinking and reasoning are distinct
cognitive processes dealing with aspects of the knowledge construction process. The key
aspect of the TRU Learning Theory that coordinates knowledge construction with the
constructs of thinking and reasoning is the CMTCF—which defines concepts and situates
them in conceptual frameworks built by the cognitive mechanism of thinking and
reasoning. Moreover, the CMTCF is well-suited for modeling Physics in general by
virtue of its focus on models and model construction, as well as positioning the
construction of law-like models as the paragon of the taxonomy. Two potential
weaknesses in the study are (a) the long span of time between the series of activities
studied and the post-test condition for the PEP, and (b) the potential misuse of the a priori
theoretical codes for the EoT by Paul and Elder (2008). However, one of the problems
that arose in using the Concepts EoT code gave rise to the realization that Concepts come
in several different types, and that they are coordinated with one another in ways that
demand more than one distinction (coding option). The potential to misuse the EoT codes
did exist; however, the in vivo coding process ameliorated this, and led directly to the
author-defined constructs of thinking and reasoning—which are fundamental to the
conceptual framework process in the CMTCF.
Recommendations
The TRU Learning Theory is an ambitious proposal with far-reaching
implications for research and practice within multiple fields. The learning sciences span
both human and machine learning, and therefore psychology, neuroscience, and computer
science. New opportunities to refine theoretical approaches through use of the CMTCF,
as well as common metrics for thinking, reasoning, and understanding, serve to present a
grand opportunity for interdisciplinary collaborations that could lend clarity to
fundamental research questions in all applicable fields—such as what is thinking, what
do or can machines believe, etc. The following recommendations attempt to capture those
opportunities in a clear and concise manner.
Recommendations for future research.
1. The CMTCF is the starting point for analyzing any future research on conceptual
change. In order to know what conceptual change actually is, it is crucial to have a
clear definition of concept first. Based on findings therein, a complete reevaluation of
instruments designed to measure conceptual change is warranted in both qualitative
and quantitative modes.

2. The key elements of the TRU Learning Theory are based on the CMTCF. Research
on thinking and reasoning would therefore need to retool in terms of the structure of
models—a paradigm-shift for any field of inquiry that purports to measure or define
thinking and reasoning.

3. Human and machine learning are contingent on the ability of an agent to construct a
model, and therefore have and coordinate concepts. Both human and machine
intelligence are linked to what either type of agent can learn, and thus all areas of
intelligence research could benefit from a more precise definition of concepts,
models, thinking and reasoning. How do different measures of intelligence
correspond to the conceptual frameworks described by the CMTCF?

4. Neurons that fire together wire together, and these sorts of results exist because of
behavior in learning environments. The features of those learning environments
are rich with social and collaborative properties, as well as MRS. Extensive
research capable of tracking and coordinating these properties of the teaching and
learning enterprise is needed. Mixed method research that includes direct
measurements of brain activity with respect to the presence of this set of
environmental properties is crucial.

5. Students are capable of producing models, some of which are conventional. The
frequencies of both, as well as the RS employed in the process, are needed for
determining the optimal set of resources required for optimal learning within any
discipline. Moreover, the varied types of reasoning—such as inferential,
analogical, metaphorical, proportional, etc.—must be tracked in exhaustive detail
in order to discover how those cognitive patterns correspond to conceptual
frameworks.

6. How do conceptual frameworks correspond to personal epistemology in terms of
basic conceptual content? In other words, how many descriptive metrics are
sufficient for conceptual change? Is there a maximum number beyond which
thinking and reasoning are impaired? What is the relationship between conceptual
change and epistemological change?

7. How do the parts of speech correspond with the cognitive operations described by
the CMTCF? Adjectives describe categorical features of concepts, and are thus
related to the Pericept construct. Adverbs and conjunctions describe relations and
relationships, and are therefore suitable for use as linguistic forms of Metacepts.
The ability to digitally automate the search for parts of speech equates in part to
an ability to detect thinking and reasoning, and in this way, the CMTCF can be
used for analysis of natural language artifacts.
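
The mapping in recommendation 7 from parts of speech to CMTCF constructs can be illustrated with a toy example. The Python sketch below is not a real part-of-speech tagger: the small lexicon, the function name tag_constructs, and the sample sentence are all hypothetical, and a practical analysis would substitute a trained tagger. Only the mapping itself (adjectives to Pericepts, adverbs and conjunctions to Metacepts) is taken from the recommendation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a trained part-of-speech tagger.
LEXICON = {
    "constant": "adjective",
    "heavy": "adjective",
    "always": "adverb",
    "never": "adverb",
    "because": "conjunction",
    "and": "conjunction",
}

# Mapping suggested in the text: adjectives relate to the Pericept construct,
# while adverbs and conjunctions serve as linguistic forms of Metacepts.
POS_TO_CONSTRUCT = {
    "adjective": "Pericept",
    "adverb": "Metacept",
    "conjunction": "Metacept",
}


def tag_constructs(text):
    """Count candidate CMTCF constructs signaled by known words in the text."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        part_of_speech = LEXICON.get(word)
        if part_of_speech is not None:
            counts[POS_TO_CONSTRUCT[part_of_speech]] += 1
    return counts


# Hypothetical student statement used only to exercise the function.
sample = "A change in position always requires a change in time because speed is constant."
counts = tag_constructs(sample)
```

Words outside the lexicon are ignored, so the counts are at best a lower bound on the constructs present in a natural language artifact.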

Recommendations for future practice. In terms of educational practice—within
the context of this study—it is the learning habitat and the curricular content that drive
epistemological change. The learning habitat is a student-centered community seeking
consensus through collaboration within a framework of guided inquiry. The core
elements of that guided inquiry are conceptual and representational tools that envision
learning as the coordination of the activation of multiple regions of the brain. In other
words, students learn because they are able to represent their ideas using MRS, as
opposed to just one representational system, as is typical in traditional instructional
modes. So in addition to the core curricular content, it is the pedagogy employed by an
instructor that is essential to the content being successful for teaching and learning. A
summary listing of particular practices is offered below.
1. Classroom collaboration. Instructors should plan for generous amounts of active
collaboration in randomized groups, so that metacognitive control is influenced
by the opportunity to practice critical listening and dialectical thinking on
multiple perspectives.

2. Representational tools. In order for deep learning to happen, students must
represent their ideas in MRS. The use of MRS promotes model-model
interactions; and therefore, in accordance with TRU, a great deal of
understanding. Representational systems encode for concepts, and concepts are
elemental to beliefs. Coordinating concepts leads to conceptual change, and
therefore belief change. The nature of conceptual change is thus related to belief
development, or epistemological change.
3. Socratic dialog. The primary mode of instruction should be Socratic in nature.
Concept construction is an active process requiring the kind of cognitive effort
that cannot be generated by mere lecture. Conceptual frameworks are built by
modeling rather than listening because questions are a resource for challenging
beliefs. The key to mental development is the ability to challenge one’s own
understanding—which requires a change in the conceptual framework of models.

4. Journaling. Metacognitive control is the single best pathway to achievement, as
well as a process for encoding knowledge that is rich with opportunity for
constructing conceptual frameworks. Journaling is an excellent means by which
to obtain such artifacts.
If curriculum and pedagogy are designed using the CMTCF, then both the theory
and the practice that define the Teaching Enterprise will be coordinated in a manner that
will inevitably lead to higher achievement for students. Knowledge and beliefs are built
on concepts, but can only be justified by sound thinking and reasoning. The structure of
both thinking and reasoning in terms of the TRU Learning Theory and the CMTCF
provides a clear map of the cognitive and behavioral landscape that emerges in the ideal
instructional setting. To this end, the primary beneficiaries of the TRU Learning Theory
are teachers and students. Students are likely to benefit the most because it is their
conceptual frameworks that are likely to change the most. However, one would expect
that teachers would benefit in a similar way as they begin to reform their own conceptual
frameworks in the process of developing better curriculum and assessments. Researchers
in the learning sciences will also benefit from a comprehensive theory that is context-
independent.

References
Adey, P. S., & Shayer, M. (1994). Really Raising Standards: Cognitive Intervention and
Academic Achievement. London: Routledge.
Ainsworth, S., Bibby, P., & Wood, D. (2002). Examining the effects of different multiple
representational systems in learning primary mathematics. Journal of the
Learning Sciences, 11(1), 25–61. doi:10.1207/s15327809jls1101_2
Anderson, R. C., Reynolds, R. E., Schallert, D. L., & Goetz, E. T. (1977). Frameworks for
comprehending discourse. American Educational Research Journal, 14, 367-81.
Barzilai, S., & Zohar, A. (2014). Reconsidering personal epistemology as metacognition:
a multifaceted approach to the analysis of epistemic thinking. Educational
Psychologist, 49(1), 13-35. doi:10.1080/00461520.2013.863265
Bates, S. P., Galloway, R. K., Loptson, C., & Slaughter, K. A. (2011). How attitudes and
beliefs about physics change from high school to faculty. Physical Review Special
Topics – Physics Education Research, 7(2). doi:10.1103/physrevstper.7.020114
Baxter Magolda, M. B. (2004). Evolution of a constructivist conceptualization of
epistemological reflection. Educational Psychologist, 39(1), 31–42.
doi:10.1207/s15326985ep3901_4
Baxter Magolda, M. B. (2012). Epistemological Reflection: The evolution of
epistemological assumptions from age 18 to 30. In Hofer, B.K. & Pintrich, P.R.
(2012). Personal Epistemology: The Psychology of Beliefs About Knowledge and
Knowing. Taylor and Francis. Kindle Edition.
Bazeley, P. & Jackson, K. (2013). Qualitative Data Analysis with NVivo. SAGE
Publications. Kindle Edition.
Bendixen, L. D. (2012). A process model of epistemic belief change. In Hofer, B.K. &
Pintrich, P.R. (2012). Personal Epistemology: The Psychology of Beliefs About
Knowledge and Knowing. Taylor and Francis. Kindle Edition.
Bendixen, L. D., & Feucht, F. C. (2010). Personal Epistemology in the Classroom:
Theory, Research, and Implications for Practice. Cambridge, UK: Cambridge
Univ. Press.
Bendixen, L. D., Schraw, G., & Dunkle, M. E. (1998). Epistemic beliefs and moral
reasoning. The Journal of Psychology, 132, 187–200.
Bell, P., & Linn, M.P. (2012). Beliefs about science: how does science instruction
contribute? In Hofer, B.K. & Pintrich, P.R. (2012). Personal Epistemology: The
Psychology of Beliefs About Knowledge and Knowing. Taylor and Francis. Kindle
Edition.
Bernard, R. H., & Ryan, G. W. (2010). Analyzing qualitative data: Systematic
approaches. Thousand Oaks, CA: SAGE Publications.
Bing, T. J., & Redish, E. F. (2012). Epistemic complexity and the journeyman-expert
transition. Physical Review Special Topics – Physics Education Research, 8(1).
doi:10.1103/physrevstper.8.010105
Birks, M., & Mills, J. (2011). Grounded Theory: A Practical Guide. Sage publications.
Kindle edition.
Boeije, H. (2010). Analysis in Qualitative Research. Thousand Oaks, CA: SAGE
Publications.
Bodin, M. (2012). Mapping university students’ epistemic framing of computational
physics using network analysis. Physical Review Special Topics – Physics
Education Research, 8(1). doi:10.1103/physrevstper.8.010115
Bodin, M., & Winberg, M. (2012). Role of beliefs and emotions in numerical problem
solving in university physics education. Physical Review Special Topics – Physics
Education Research, 8(1). doi:10.1103/physrevstper.8.010108
Bransford, J. D., Brown, A. L., & Cocking, R. R. (1999). How people learn: Brain, mind,
experience, and school. National Academy Press.
Bråten, I., & Strømsø, H. I. (2005). The relationship between epistemological beliefs,
implicit theories of intelligence, and self-regulated learning among Norwegian
post-secondary students. The British Journal of Educational Psychology, 75, 539–
565.
Brewe, E. (2011). Energy as a substance-like quantity that flows: Theoretical
considerations and pedagogical consequences. Physical Review Special Topics –
Physics Education Research, 7(2). doi:10.1103/physrevstper.7.020106
Brewe, E., Traxler, A., de la Garza, J., & Kramer, L. H. (2013). Extending positive
CLASS results across multiple instructors and multiple classes of Modeling
Instruction. Physical Review Special Topics – Physics Education Research, 9(2).
doi:10.1103/physrevstper.9.020116
Bromme, R., Pieschl, S., & Stahl, E. (2010). Epistemological beliefs are standards for
adaptive learning: a functional theory about epistemological beliefs and
metacognition. Metacognition & Learning, 5(1), 7-26.
Bruun, J., & Brewe, E. (2013). Talking and learning physics: Predicting future grades
from network measures and Force Concept Inventory pretest scores. Physical
Review Special Topics – Physics Education Research, 9(2).
doi:10.1103/physrevstper.9.020109
Butler-Kisber, L. (2010). Qualitative inquiry: Thematic, narrative, and arts-informed
perspectives. London: Sage.
Cahill, M. J., Hynes, K. M., Trousil, R., Brooks, L. A., McDaniel, M. A., Repice, M., …
Frey, R. F. (2014). Multiyear, multi-instructor evaluation of a large-class
interactive-engagement curriculum. Physical Review Special Topics – Physics
Education Research, 10(2). doi:10.1103/physrevstper.10.020101
Cassidy, S. (2011). Self-regulated learning in higher education: Identifying key
component processes. Studies In Higher Education, 36(8), 989-1000.
Chang, C.-Y., Wen, M. L., Kuo, P.-C., & Tsai, C.-C. (2010). Exploring high school
students’ views regarding the nature of scientific theory: a study in Taiwan. The
Asia-Pacific Education Researcher, 19(1). doi:10.3860/taper.v19i1.1515
Charmaz, K. C. (2006). Constructing grounded theory: A practical guide through
qualitative analysis. Thousand Oaks: Sage Publications.
Chen, L., Han, J., Wang, J., Tu, Y., & Bao, L. (2011). Comparisons of item response
theory algorithms on force concept inventory. Research in Education Assessment
and Learning, 2(02), 26-34.
Cifarelli, V., Goodson-Espy, T., & Jeong-Lim, C. (2010). Associations of students’
beliefs with self-regulated problem solving in college algebra. Journal Of
Advanced Academics, 21(2), 204-232.

Clarà, M., & Mauri, T. (2010). Toward a dialectic relation between the results in CSCL:
Three critical methodological aspects of content analysis schemes. Computer
Supported Learning, 5(1), 117–136. doi:10.1007/s11412-009-9078-4
Clement, J. (2010). The role of explanatory models in teaching for conceptual change.
International Handbook of Research on Conceptual Change (Educational
Psychology Handbook). Taylor and Francis. Kindle Edition.
Coletta, V. P., & Phillips, J. A. (2010). Developing thinking & problem solving skills in
introductory mechanics. AIP Conference Proceedings, 1289(1), 13-16.
Coletta, V. P., Phillips, J. A., & Steinert, J. J. (2007a). Why you should measure your
students’ reasoning ability. The Physics Teacher, 45, 235-238.
Coletta, V. P., Phillips, J. A., & Steinert, J. J. (2007b). Interpreting force concept
inventory scores: Normalized gain and SAT scores. Physical Review Special
Topics: Physics Education Research, (1).
Corbin, J., & Strauss, A. (2008). Basics of Qualitative Research (3rd ed.): Techniques
and Procedures for Developing Grounded Theory. Thousand Oaks, CA: SAGE
Publications, Inc. doi:10.4135/9781452230153
Creswell, J. W. (2013). Research Design: Qualitative, Quantitative, and Mixed Methods
Approaches. SAGE Publications. Kindle Edition.
Crow, G., & Edwards, R. (2012). Perspectives on working with archived textual and
visual material in social research: editors’ introduction. International Journal Of
Social Research Methodology, 15(4), 259-262.
doi:10.1080/13645579.2012.688308

DeBacker, T. K., Crowson, H., Beesley, A. D., Thoma, S. J., & Hestevold, N. L. (2008).
The challenge of measuring epistemic beliefs: an analysis of three self-report
instruments. Journal Of Experimental Education, 76(3), 281-312.
De Cock, M. (2012). Representation use and strategy choice in physics problem solving.
Physical Review Special Topics – Physics Education Research, 8(2).
doi:10.1103/physrevstper.8.020117
Ding, L. (2014). Verification of causal influences of reasoning skills and epistemology on
physics conceptual learning. Physical Review Special Topics – Physics Education
Research, 10(2). doi:10.1103/physrevstper.10.023101
Ding, L., & Caballero, M. D. (2014). Uncovering the hidden meaning of cross-
curriculum comparison results on the Force Concept Inventory. Physical Review
Special Topics – Physics Education Research, 10(2).
doi:10.1103/physrevstper.10.020125
diSessa, A.A. (2010). A bird’s-eye view of the “pieces” vs. “coherence” controversy
(from the “pieces” side of the fence). International Handbook of Research on
Conceptual Change (Educational Psychology Handbook). Taylor and Francis.
Kindle Edition.
Douglas, K. A., Yale, M. S., Bennett, D. E., Haugan, M. P., & Bryan, L. A. (2014).
Evaluation of Colorado Learning Attitudes about Science Survey. Physical Review
Special Topics – Physics Education Research, 10(2).
doi:10.1103/physrevstper.10.020128

Eitel, A., Scheiter, K., Schüler, A., Nyström, M., & Holmqvist, K. (2013). How a picture
facilitates the process of learning from text: Evidence for scaffolding. Learning
and Instruction, 28, 48–63. doi:10.1016/j.learninstruc.2013.05.002
Elder, L., & Paul, R. (2007a). The miniature guide to the human mind. Dillon Beach, CA:
Foundation for Critical Thinking. Kindle edition.
Elder, L., & Paul, R. (2007b). The thinker’s guide to analytic thinking. Dillon Beach, CA:
Foundation for Critical Thinking. Kindle edition.
Evans, J. T. (2012). Questions and challenges for the new psychology of
reasoning. Thinking & Reasoning, 18(1), 5-31.
doi:10.1080/13546783.2011.637674
Evans, J. T., & Over, D. E. (2013). Reasoning to and from belief: deduction and
induction are still distinct. Thinking & Reasoning, 19(3/4), 267-283.
doi:10.1080/13546783.2012.745450
FCT. (2014). The foundation for critical thinking. Retrieved from
http://www.criticalthinking.org
Fekete, T. (2010). Representational systems. Minds & Machines, 20(1), 69-101.
doi:10.1007/s11023-009-9166-2
Fernyhough, C. (2011). Even “internalist” minds are social. Style, 45(2), 272-275.
Fischer, C. T. (2009). Bracketing in qualitative research: conceptual and practical
matters. Psychotherapy Research, 19(4-5), 583–590.
doi:10.1080/10503300902798375
Formica, S. P., Easley, J. L., & Spraker, M. C. (2010). Transforming common-sense
beliefs into Newtonian thinking through Just-In-Time Teaching. Physical Review
Special Topics – Physics Education Research, 6(2).
doi:10.1103/physrevstper.6.020106
Forsyth, B. R. (2012). Beyond physics: A case for far transfer. Instructional Science,
40(3), 515–535. doi:10.1007/s11251-011-9188-z
Frost, N. (2011). Qualitative research methods in psychology: Combining core
approaches. New York, NY: Open University Press.
Fyfe, E., McNeil, N., Son, J., & Goldstone, R. (2014). Concreteness fading in
mathematics and science instruction: a systematic review. Educational
Psychology Review, 26(1), 9-25. doi:10.1007/s10648-014-9249-3
Glaser, B.G. & Strauss, A.L. (2009). The discovery of grounded theory: strategies for
qualitative research. Aldine Transaction. Kindle Edition.
Glevey, K. E. (2006). Promoting thinking skills in education. London Review Of
Education, 4(3), 291-302.
Gok, T. (2011). The impact of peer instruction on college students’ beliefs about physics
and conceptual understanding of electricity and magnetism. International Journal
of Science and Math Education, 10(2), 417–436. doi:10.1007/s10763-011-9316-x
Greene, J. A., Muis, K. R., & Pieschl, S. (2010). The role of epistemic beliefs in students’
self-regulated learning with computer-based learning environments: conceptual
and methodological issues. Educational Psychologist, 45(4), 245-257.
doi:10.1080/00461520.2010.515932
Hake, R. R. (1998). Interactive-engagement versus traditional methods: a six-thousand-
student survey of mechanics test data for introductory physics courses. American
Journal of Physics, 66, 64-74.

Hake, R. (2007). Six lessons from the physics education reform effort. Latin-American
Journal Of Physics Education, (1), 24.
Hammer, D., & Elby, A. (2012). On the form of a personal epistemology. In Hofer, B.K.
& Pintrich, P.R. (2012). Personal Epistemology: The Psychology of Beliefs About
Knowledge and Knowing. Taylor and Francis. Kindle Edition.
Hardré, P. L., Crowson, H., Kui, X., & Cong, L. (2007). Testing differential effects of
computer-based, web-based and paper-based administration of questionnaire
research instruments. British Journal Of Educational Technology, 38(1), 5-22.
Harr, N., Eichler, A., Renkl, A., Rich, P., & Kuan-Chung, C. (2014). Integrating
pedagogical content knowledge and pedagogical/psychological knowledge in
mathematics. Frontiers In Psychology, 5, 1-10. doi:10.3389/fpsyg.2014.00924
Herrón, M. A. (2010). Epistemology and epistemic cognition: The problematic virtue of
relativism and its implications for science education. Zona Próxima, (12), 96-107.
Hestenes, D. (2010). Modeling theory for math and science education. In Lesh, R.,
Galbraith, P., Hines, C. & Hurford, A. (eds.) Modeling Students’ Mathematical
Competencies. New York: Springer.
Hestenes, D., & Wells, M. (1992). A mechanics baseline test. The Physics Teacher, 30,
159-166.
Hofer, B. K. (2004). Epistemological understanding as a metacognitive process: Thinking
aloud during online searching. Educational Psychologist, 39, 43–55.
Hofer, B. K. (2012). Personal epistemology as a psychological and educational construct:
an introduction. In Hofer, B.K. & Pintrich, P.R. (2012). Personal Epistemology:
The Psychology of Beliefs About Knowledge and Knowing. Taylor and Francis.
Kindle Edition.
Hofer, B. K., & Pintrich, P. R. (1997). The development of epistemological theories:
beliefs about knowledge and knowing and their relation to learning. Review of
Educational Research, 67(1), 88–140.
Hofer, B. K., & Pintrich, P. R. (2012). Personal Epistemology: The Psychology of Beliefs
About Knowledge and Knowing. Taylor and Francis. Kindle Edition.
Hofer, B. K., & Sinatra, G. M. (2010). Epistemology, metacognition, and self-regulation:
musings on an emerging field. Metacognition and Learning, 5(1), 113–120.
doi:10.1007/s11409-009-9051-7
Holyoak, K., & Morrison, R. (2012). Thinking and reasoning: a reader’s guide. In
Holyoak, K. & Morrison, R. (2012). The Oxford Handbook of Thinking and
Reasoning. Oxford Press. doi: 10.1093/oxfordhb/9780199734689.013.000
HSACU. (2014). US Department of Agriculture. Appendix B to Part 3434—List of
HSACU institutions, 2013-2014. Retrieved from
http://www.nifa.usda.gov/nea/education/pdfs/hispanic/hsacu_inst_2014
Hutchison, P., & Elby, A. (2013). Evidence of epistemological framing in survey
question misinterpretation. AIP Conference Proceedings, 1513(1), 194-197.
doi:10.1063/1.4789685
Inagaki, K., & Hatano, G. (2010). Conceptual change in naïve biology. International
Handbook of Research on Conceptual Change (Educational Psychology
Handbook). Taylor and Francis. Kindle Edition.

Irving, P., Martinuk, M., & Sayre, E. (2013). Transitions in students’ epistemic framing
along two axes. Physical Review Special Topics – Physics Education Research,
9(1). doi:10.1103/physrevstper.9.010111
Irving, P. W., & Sayre, E. C. (2014). Conditions for building a community of practice in
an advanced physics laboratory. Physical Review Special Topics – Physics
Education Research, 10(1). doi:10.1103/physrevstper.10.010109
ISLE. (2014). Investigative science learning environment. Retrieved from
http://pum.rutgers.edu/isle.php
Johnson, B. D., Dunlap, E., & Benoit, E. (2010). Structured qualitative research:
organizing “mountains of words” for data analysis, both qualitative and
quantitative. Substance Use & Misuse, 45(5), 648–670.
Johri, A., & Lohani, V. (2011). A framework for improving engineering representational
literacy through the use of pen-based computing. International Journal of
Engineering Education, 27(5), 958–967.
Johri, A., & Olds, B. (2011). Situated engineering learning: Bridging engineering
education research and the learning sciences. Journal of Engineering Education,
100(1), 151–185.
Jonassen, D. (2010). Model building for conceptual change. International Handbook of
Research on Conceptual Change (Educational Psychology Handbook). Taylor
and Francis. Kindle Edition.
Jonassen, D., Strobel, J., & Gottdenker, J. (2005). Model building for conceptual
change. Interactive Learning Environments, 13(1/2), 15-37.

Kafai, Y.B. (2007). Constructionism. In Sawyer, R. K. (2007). The Cambridge handbook
of the learning sciences. Cambridge: Cambridge University Press. Kindle Edition.
Kalman, C. S., & Rohar, S. (2010). Toolbox of activities to support students in a physics
gateway course. Physical Review Special Topics – Physics Education Research,
6(2). doi:10.1103/physrevstper.6.020111
Kennedy, E. (2010). Narrowing the achievement gap: motivation, engagement, and self-
efficacy matter. Journal Of Education, 190(3), 1-11.
Koksal, M. S., & Yaman, S. (2012). An investigation of the epistemological predictors of
self-regulated learning of advanced science students. Science Educator, 21(2), 45.
Kolloffel, B., Eysink, T., & Jong, T. (2011). Comparing the effects of representational
tools in collaborative and individual inquiry learning. International Journal Of
Computer-Supported Collaborative Learning, 6(2), 223-251.
doi:10.1007/s11412-011-9110-3
Kurtz, B., & Karplus, R. (1979). Intellectual development beyond elementary school VII:
Teaching for proportional reasoning. School Science and Mathematics, 79, 387–398.
Lancor, R. (2012). Using metaphor theory to examine conceptions of energy in biology,
chemistry, and physics. Science & Education, 23(6), 1245–1267.
doi:10.1007/s11191-012-9535-8
Lee, S., & Chin-Chung, T. (2012). Students’ domain-specific scientific epistemological
beliefs: a comparison between biology and physics. Asia-Pacific Education
Researcher (De La Salle University Manila), 21(2), 215-229.

Levandowsky, M., & Winter, D. (1971). Distance between sets. Nature, 234(5323), 34–35.
doi:10.1038/234034a0
Lindsey, B. A., Hsu, L., Sadaghiani, H., Taylor, J. W., & Cummings, K. (2012). Positive
attitudinal shifts with the Physics by Inquiry curriculum across multiple
implementations. Physical Review Special Topics – Physics Education Research,
8(1). doi:10.1103/physrevstper.8.010102
Lising, L., & Elby, A. (2005). The impact of epistemology on learning: A case study
from introductory physics. American Journal of Physics, 73(4), 372-382.
Marušić, M., Mišurac Zorica, I., & Pivac, S. (2012). Influence of learning physics by
reading and learning physics by doing on the shift in level of scientific
reasoning. Journal Of Turkish Science Education (TUSED), 9(1), 146-161.
Marušić, M., & Sliško, J. (2012). Influence of three different methods of teaching
physics on the gain in students’ development of reasoning. International Journal
of Science Education, 34, 301–326. doi:10.1080/09500693.2011.582522
Mason, L., Boscolo, P., Tornatora, M. C., & Ronconi, L. (2012). Besides knowledge: a
cross-sectional study on the relations between epistemic beliefs, achievement
goals, self-beliefs, and achievement in science. Instructional Science, 41(1), 49–79.
doi:10.1007/s11251-012-9210-0
Mason, L., & Bromme, R. (2010). Situating and relating epistemological beliefs into
metacognition: studies on beliefs about knowledge and knowing. Metacognition and
Learning, 5(1), 1–6. doi:10.1007/s11409-009-9050-8

Mason, L., Boldrin, A., & Ariasi, N. (2010). Epistemic metacognition in context:
evaluating and learning online information. Metacognition & Learning, 5(1), 67-90.
Merriam, S. B. (2009). Qualitative research: A guide to design and implementation. San
Francisco, CA: Jossey Bass.
Merriam, S.B. (2010). Qualitative research in practice: examples for discussion and
analysis. Kindle Edition.
Modeling Instruction Project. (2013). Modeling Instruction – Legacy Site. Retrieved
from http://modeling.asu.edu
Moore, W.S. (2012). Understanding learning in a postmodern world: reconsidering the
perry scheme of intellectual and ethical development. In Hofer, B.K. & Pintrich,
P.R. (2012). Personal Epistemology: The Psychology of Beliefs About Knowledge
and Knowing. Taylor and Francis. Kindle Edition.
Moore, T. J., Miller, R. L., Lesh, R. A., Stohlmann, M. S., & Kim, Y. R. (2013).
Modeling in Engineering: The role of representational fluency in students’
conceptual understanding. Journal Of Engineering Education, 102(1), 141-178.
doi:10.1002/jee.20004
Morse, J. M. (2000). Determining sample size. Qualitative Health Research, 10(1), 3–5.
doi:10.1177/104973200129118183
Muis, K. R., & Duffy, M. C. (2013). Epistemic climate and epistemic change: instruction
designed to change students’ beliefs and learning strategies and improve
achievement. Journal Of Educational Psychology, 105(1), 213-225.
doi:10.1037/a0029690

Muis, K.R., & Franco, G.M. (2010). Epistemic profiles and metacognition: support for
the consistency hypothesis. Metacognition And Learning, 5(1), 27-45.
Muis, K.R., Kendeou, P., & Franco, G.M. (2011). Consistent results with the consistency
hypothesis? The effects of epistemic beliefs on metacognitive processing.
Metacognition And Learning, 6(1), 45-63.
Mulnix, J. (2012). Thinking critically about critical thinking. Educational Philosophy &
Theory, 44(5), 464-479.
Nersessian, N.J. (2010). Mental modeling in conceptual change. International Handbook
of Research on Conceptual Change (Educational Psychology Handbook). Taylor
and Francis. Kindle Edition.
Nieminen, P., Savinainen, A., & Viiri, J. (2010). Force Concept Inventory-based
multiple-choice test for investigating students’ representational consistency.
Physical Review Special Topics – Physics Education Research, 6(2).
doi:10.1103/physrevstper.6.020109
Nieminen, P., Savinainen, A., & Viiri, J. (2012). Relations between representational
consistency, conceptual understanding of the force concept, and scientific
reasoning. Physical Review Special Topics – Physics Education Research, 8(1).
doi:10.1103/physrevstper.8.010123
Nimon, H. I. (2013). Role of neuro-psychological studies in intelligence
education. Journal of Strategic Security, 6(5), 256-266.
doi:10.5038/1944-0472.6.3S.25

Nussbaum, E. M., & Bendixen, L. D. (2003). Approaching and avoiding arguments: The
role of epistemological beliefs, need for cognition, and extraverted personality
traits. Contemporary Educational Psychology, 28, 573–595.
Palmer, B., & Marra, R. M. (2004). College student epistemological perspectives across
knowledge domains: A proposed grounded theory. Higher Education, 47(3), 311–335.
doi:10.1023/b:high.0000016445.92289.f1
Paul, R., & Elder, L. (2008). A miniature guide for students and faculty to scientific
thinking. Dillon Beach, CA: Foundation for Critical Thinking. Kindle edition.
Peters, M. A. (2007). Kinds of thinking, styles of reasoning. Educational Philosophy &
Theory, 39(4), 350-363.
Pfeifer, N. (2013). The new psychology of reasoning: A mental probability logical
perspective. Thinking & Reasoning, 19(3/4), 329-345.
doi:10.1080/13546783.2013.838189
Piaget, J. (1970). Psychology and epistemology. New York: Viking Press.
Pintrich, P. R. (2012). Future challenges and directions for theory and research on
personal epistemology. Personal Epistemology: The Psychology of Beliefs About
Knowledge and Knowing. Taylor and Francis. Kindle Edition.
Planinic, M., Ivanjek, L., & Susac, A. (2010). Rasch model based analysis of the Force
Concept Inventory. Physical Review Special Topics – Physics Education
Research, 6(1). doi:10.1103/physrevstper.6.010103
Plotnitsky, A. (2012). On foundational thinking in fundamental physics, from Riemann to
Einstein to Heisenberg. doi:10.1063/1.3688981

Po-Hung, L., & Shiang-Yao, L. (2011). A cross-subject investigation of college students’
epistemological beliefs of physics and mathematics. Asia-Pacific Education
Researcher (De La Salle University Manila), 20(2), 336-351.
Posner, G. J., Strike, K. A., Hewson, P. W., & Gertzog, W. A. (1982). Accommodation
of a scientific conception: Towards a theory of conceptual change. Science
Education, 66(2), 211–227.
Rai, T. S. (2012). Thinking in societies and cultures. In Holyoak, K. & Morrison, R.
(2012). The Oxford Handbook of Thinking and Reasoning. Oxford Press.
doi:10.1093/oxfordhb/9780199734689.013.0029
Redish, E. F. (2013). Oersted Lecture 2013: How should we think about how our students
think? American Journal of Physics, 82, 537.
Redish, E. F., & Hammer, D. (2009). Reinventing college physics for biologists:
Explicating an epistemological curriculum. American Journal of Physics, 77(7),
629. doi:10.1119/1.3119150
Richardson, J. T. E. (2013). Epistemological development in higher education.
Educational Research Review, 9, 191–206. doi:10.1016/j.edurev.2012.10.001
Richter, T., & Schmid, S. (2010). Epistemological beliefs and epistemic strategies in self-
regulated learning. Metacognition & Learning, 5(1), 47-65.
Rosenberg, J. L., Lorenzo, M., & Mazur, E. (2006). Peer instruction: Making science
engaging. Handbook of College Science Teaching, 77-85.
Royce, J. R., & Mos, L. P. (1980). Manual: Psycho-epistemological profile. Center for
Advanced Study in Theoretical Psychology: University of Alberta.

Rudolph, A. L., Lamine, B., Joyce, M., Vignolles, H., & Consiglio, D. (2014).
Introduction of interactive learning into French university physics classrooms.
Physical Review Special Topics – Physics Education Research, 10(1).
doi:10.1103/physrevstper.10.010103
Rubin, H. J., & Rubin, I. S. (2012). Qualitative interviewing: The art of hearing data (3rd
ed.). Thousand Oaks, CA: Sage.
Rule, D. C. & Bendixen, L. D. (2010). The integrative model of personal epistemology
development: theoretical underpinnings and implications for education. In B. K.
Hofer and P. R. Pintrich (Eds.), Personal Epistemology in the Classroom: Theory,
Research, and Implications for Practice. Kindle Edition.
Saldaña, J. (2013). The coding manual for qualitative researchers. SAGE Publications.
Kindle Edition.
Sawtelle, V., Brewe, E., & Kramer, L. H. (2010). Positive impacts of modeling
instruction on self-efficacy. AIP Conference Proceedings, 1289(1), 289-292.
doi:10.1063/1.3515225
Sawtelle, V., Brewe, E., & Kramer, L. H. (2012). Exploring the relationship between
self-efficacy and retention in introductory physics. Journal of Research in Science
Teaching, 49(9), 1096–1121. doi:10.1002/tea.21050
Sawtelle, V., Brewe, E., Goertzen, R. M., & Kramer, L. H. (2012). Identifying events that
impact self-efficacy in physics learning. Physical Review Special Topics – Physics
Education Research, 8(2). doi:10.1103/physrevstper.8.020111

Scherr, R. E., Close, H. G., McKagan, S. B., & Vokos, S. (2012). Representing energy. I.
Representing a substance ontology for energy. Physical Review Special Topics –
Physics Education Research, 8(2). doi:10.1103/physrevstper.8.020114
Scherr, R. E., Close, H. G., Close, E. W., & Vokos, S. (2012). Representing energy. II.
Energy tracking representations. Physical Review Special Topics – Physics
Education Research, 8(2). doi:10.1103/physrevstper.8.020115
Schommer, M. (1990). Effects of beliefs about the nature of knowledge on
comprehension. Journal of Educational Psychology, 82, 498–504.
Schommer-Aikins, M. (2012). An evolving theoretical framework for an epistemological
belief system. In Hofer, B.K. & Pintrich, P.R. (2012). Personal Epistemology:
The Psychology of Beliefs About Knowledge and Knowing. Taylor and Francis.
Kindle Edition.
Schommer-Aikins, M., & Duell, O. K. (2013). Domain specific and general
epistemological beliefs: Their effects on mathematics. Revista de Investigación
Educativa, 31(2), 317-330.
Schraw, G., Bendixen, L. D., & Dunkle, M. E. (2012). Development and validation of the
Epistemic Belief Inventory. In B. K. Hofer & P. R. Pintrich (Eds.), Personal
epistemology: The psychology of beliefs about knowledge and knowing (pp. 103–118).
Mahwah, NJ: Erlbaum.
Sharma, S., Ahluwalia, P. K., & Sharma, S. K. (2013). Students’ epistemological beliefs,
expectations, and learning physics: An international comparison. Physical Review
Special Topics – Physics Education Research, 9(1).
doi:10.1103/physrevstper.9.010117

Sinatra, G. M., & Chinn, C. (2011). Thinking and reasoning in science: Promoting
epistemic conceptual change. In K. Harris, C. B. McCormick, G. M. Sinatra, & J.
Sweller (Eds.), Critical theories and models of learning and development relevant
to learning and teaching, Volume 1. APA Educational Psychology Handbook
Series (pp. 257–282). Washington, DC: APA Publications. doi:10.1037/13275-011
Sinatra, G. M., Kienhues, D., & Hofer, B. K. (2014). Addressing challenges to public
understanding of science: epistemic cognition, motivated reasoning, and
conceptual change. Educational Psychologist, 49(2), 123–138.
doi:10.1080/00461520.2014.916216
Starks, H., & Brown Trinidad, S. (2007). Choose your method: a comparison of
phenomenology, discourse analysis, and grounded theory. Qualitative Health
Research, 17(10), 1372–1380. doi:10.1177/1049732307307031
Thornton, R., Kuhl, D., Cummings, K., & Marx, J. (2009). Comparing the force and
motion conceptual evaluation and the force concept inventory. Physical Review
Special Topics – Physics Education Research, 5(1).
doi:10.1103/physrevstper.5.010105
Vosniadou, S. (2007). The cognitive-situative divide and the problem of conceptual
change. Educational Psychologist, 42(1), 55-66.
Vosniadou, S., Vamvakoussi, X., & Skopeliti, I. (2010). The framework theory approach to
the problem of conceptual change. International Handbook of Research on
Conceptual Change (Educational Psychology Handbook). Taylor and Francis.
Kindle Edition.

Wang, J., & Bao, L. (2010). Analyzing force concept inventory with item response
theory. American Journal of Physics, 78(10), 1064. doi:10.1119/1.3443565
Wiser, M., & Smith, C.L. (2010). Learning and teaching about matter in grades K–8:
when should the atomic-molecular theory be introduced? International
Handbook of Research on Conceptual Change (Educational Psychology
Handbook). Taylor and Francis. Kindle Edition.
Wood, A. K., Galloway, R. K., Hardy, J., & Sinclair, C. M. (2014). Analyzing learning
during Peer Instruction dialogues: A resource activation framework. Physical
Review Special Topics – Physics Education Research, 10(2).
doi:10.1103/physrevstper.10.020107
Wood, P., & Kardash, C. (2012). Critical elements in the design and analysis of studies of
epistemology. In B. K. Hofer & P. R. Pintrich (Eds.), Personal epistemology: The
psychology of beliefs about knowledge and knowing (pp. 231–260). Mahwah, NJ:
Erlbaum.
Wu, H., & Puntambekar, S. (2012). Pedagogical affordances of multiple external
representations in scientific processes. Journal Of Science Education &
Technology, 21(6), 754-767. doi:10.1007/s10956-011-9363-7
Yasuda, J., & Taniguchi, M. (2013). Validating two questions in the Force Concept
Inventory with subquestions. Physical Review Special Topics – Physics Education
Research, 9(1). doi:10.1103/physrevstper.9.010113
Yerdelen-Damar, S., Elby, A., & Eryilmaz, A. (2012). Applying beliefs and resources
frameworks to the psychometric analyses of an epistemology survey. Physical
Review Special Topics – Physics Education Research, 8(1).
doi:10.1103/physrevstper.8.010104
Yin, R. K. (2011). Qualitative research from start to finish. New York, NY: The Guilford
Press.
Yin, R.K. (2014). Case study research: Design and methods. (5th ed.). Los Angeles, CA:
Sage.
Zhang, P., & Ding, L. (2013). Large-scale survey of Chinese precollege students’
epistemological beliefs about physics: A progression or a regression? Physical
Review Special Topics – Physics Education Research, 9(1).
doi:10.1103/physrevstper.9.010110
Zwickl, B. M., Hirokawa, T., Finkelstein, N., & Lewandowski, H. J. (2014).
Epistemology and expectations survey about experimental physics: Development
and initial results. Physical Review Special Topics – Physics Education Research,
10(1). doi:10.1103/physrevstper.10.010120

Appendix A
Site Authorization Form

Appendix B
Student Consent Form

Appendix C
GCU D-50 IRB Approval to Conduct Research

Appendix D
Psycho-Epistemological Profile (PEP)

Appendix E
What is Physics? What is Reality? Is Physics Reality?

Appendix F
Numbers Do Not Add

Appendix G
The Law of the Circle

Appendix H
The Zeroth Laws of Motion

Appendix I
End of Term Interview


Evaluating 19-Channel Z-score Neurofeedback:

Addressing Efficacy in a Clinical Setting

Submitted by

Nancy L. Wigton

A Dissertation Presented in Partial Fulfillment

of the Requirements for the Degree

Doctorate of Philosophy

Grand Canyon University

Phoenix, Arizona

May 15, 2014

All rights reserved

UMI Number: 3625170
Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author.

© by Nancy L. Wigton, 2014

All rights reserved.

Abstract

Neurofeedback (NF) is gaining recognition as an evidence-based intervention grounded

in learning theory, and 19-channel z-score neurofeedback (19ZNF) is a new NF model.

Peer-reviewed literature is lacking regarding empirical-based evaluation of 19ZNF. The

purpose of this quantitative research study was to evaluate the efficacy of 19ZNF, in a

clinical setting, using archival data from a Southwest NF practice, with a retrospective

one-group pretest-posttest design. Each of the outcome measures framed a group such

that 19ZNF was evaluated, as it relates to the particular neuropsychological constructs of

attention (n = 10), behavior (n = 14), executive function (n = 12), as well as

electrocortical functioning (n = 21). The research questions asked if 19ZNF improves

these constructs. One-tailed t tests were performed to compare pre-post scores for included

clinical assessment scales, and selected quantitative electroencephalographic (QEEG)

metrics. For all pre-post comparisons, the direction of change was in the predicted

direction. Moreover, for all outcome measures, the group means were beyond the

clinically significant threshold before 19ZNF, and no longer clinically significant after

19ZNF. All differences were statistically significant, with results ranging from p = .000

to p = .008; and effect sizes ranging from 1.29 to 3.42. Results suggest 19ZNF improved

attention, behavior, executive function, and electrocortical function. This study provides

beginning evidence of 19ZNF’s efficacy, adds to what is known about 19ZNF, and offers

an innovative approach for using QEEG metrics as outcome measures. These results may

lead to a greater acceptance of 19ZNF, as well as foster needed additional scientific

research.

Keywords: Neurofeedback, QEEG, z-score neurofeedback, 19ZNF, EEG biofeedback

Dedication

This dissertation is dedicated to my Lord and Savior, Jesus. From my first

thoughts of considering a doctoral program being divinely inspired and directed, through

to the last step I will take across a graduation stage, the Father, Son, and Holy Spirit are

always the center point, the anchor. To that end, three Bible passages capture the

experience of my journey.

The way of God is perfect, the Lord’s word has stood the test; He is the shield of

all who take refuge in Him. What god is there but the Lord? What rock but our

God? – the God who girds me with strength and makes my way blameless, who

makes me swift as the deer and sets me secure on the mountains (Psalms 18:30-

33, New English Bible).

“Commit your life to the Lord; trust in Him and He will act. He will make your

righteousness shine clear as the day and the justice of your cause like the sun at noon”

(Psalms 37:5-6).

“Not to us, O Lord, not to us, but to thy name ascribe the glory, for thy true love

and for thy constancy” (Psalms 115:1).

Acknowledgments

It is only through the Lord’s strength and wisdom that this dissertation came to

fruition. Next, I acknowledge the man with whom the Lord has made me one, my

husband. You are truly the wind beneath my wings, and without you I would not have

had the wherewithal to complete this endeavor. Thank you for all your support and

sharing your perseverance for my good. I also wish to acknowledge, with unbounded

gratitude, the most perfect dissertation committee possible for this journey.

To my chair, Dr. Genomary Krigbaum, words are insufficient to fully express the

depth and breadth of my appreciation for your support, guidance, and direction. When I

first read descriptions of what the ideal chair would be, with characteristics inclusive of

mentor, advocate, role model, teacher, defender, guide, supervisor, coach, encourager,

and friend, I wondered if it would ever be possible to find all those elements in one

person. Yet in you, I found them all, and more. Por siempre agradecida. Moreover, thank

you for encouraging me to build on the methodology you started. To Dr. Daniel Smith, I

am grateful that you joined my dissertation team. I knew I could count on you for your

statistical expertise, and you did not disappoint. Thank you for the many conversations

prior to my dissertation journey, and in helping to pave the way for the best committee

possible. To Dr. Genie Bodenhamer-Davis, as a most respected neurofeedback

practitioner and educator, I am humbled and honored that you were willing to assist me in

my dissertation journey. Thank you, so much, for your counsel over the last 3 years. To

Dr. Ron Bonnstetter, thank you for your support in being my adjunct dissertation reader.

Thank you for your compliments on my writing and your assurance I have what it takes

to succeed as a scholar.

Table of Contents

List of Tables
List of Figures
Chapter 1: Introduction to the Study
    Introduction
    Background of the Study
    Problem Statement
    Purpose of the Study
    Research Questions and Hypotheses
    Advancing Scientific Knowledge
    Significance of the Study
    Rationale for Methodology
    Nature of the Research Design for the Study
    Definition of Terms
    Assumptions, Limitations, Delimitations
    Summary and Organization of the Remainder of the Study
Chapter 2: Literature Review
    Introduction and Background to the Problem
        Historical overview of EEG and QEEG
        Historical overview of NF
        How problem/gap of 19ZNF research evolved into current form
    Theoretical Foundations and/or Conceptual Framework
        Foundations of EEG and QEEG
        Learning theory as applied to NF
        Traditional/amplitude-based models of NF
        QNF model of NF
        ZNF model of NF
    Review of the Literature – Key Themes
        QNF in the literature
        4ZNF in the literature
        19ZNF in the literature
        Outcome measures for ZNF research
    Summary
Chapter 3: Methodology
    Introduction
    Statement of the Problem
    Research Questions and Hypotheses
    Research Methodology
    Research Design
    Population and Sample Selection
    Instrumentation
    Validity
    Reliability
    Data Collection Procedures
    Data Analysis Procedures
    Ethical Considerations
    Limitations
    Summary
Chapter 4: Data Analysis and Results
    Introduction
    Descriptive Data
    Data Analysis Procedures
    Results
    Summary
Chapter 5: Summary, Conclusions, and Recommendations
    Introduction
    Summary of the Study
    Summary of Findings and Conclusion
    Implications
        Theoretical implications
        Practical implications
        Future implications
    Recommendations
        Recommendations for future research
        Recommendations for practice
References
Appendix A
Appendix B
Appendix C
Appendix D

List of Tables

Table 1.1. Research Questions and Variables
Table 4.1. Descriptive Data for All Groups
Table 4.2. Shapiro-Wilk Results for Difference Scores
Table 4.3. Summary of Results – All Groups

List of Figures

Figure 1.1. Formation of Sample Groups
Figure 4.1. IVA Group Pre-Post Scores
Figure 4.2. DSMD Group Pre-Post Scores
Figure 4.3. BRIEF Group Pre-Post Scores
Figure 4.4. QEEG Group Pre-Post Scores

Chapter 1: Introduction to the Study

Introduction

Neurofeedback (NF) is an operant conditioning brainwave biofeedback technique,

which is also referred to as electroencephalographic (EEG) biofeedback. This modality,

dating back to the 1970s (Lubar & Shouse, 1976; Sterman, LoPresti, & Fairchild, 2010),

trains electrical signals of targeted frequencies and involves recording EEG data from

scalp sensors with an amplifier, which is subsequently processed by computer software.

The software provides visual and sound display feedback to the trainee, thereby

providing a reward stimulus when the brain is functioning in the target range. This

reward process generates learning such that the brain’s functioning is conditioned in the

intended manner.

Over the years, new models of NF have been developed, and the most current

iteration is a style of NF which is termed z-score NF (ZNF). ZNF is different from more

traditional NF models in that it incorporates into the NF session real-time quantitative

EEG (QEEG) z-score metrics making it possible to combine operant conditioning with

real-time assessment using a normative database (Collura, Thatcher, Smith, Lambos, &

Stark, 2009; Thatcher, 2012). In 2006, a 4-channel ZNF (4ZNF) technique was

introduced, which in 2009 was expanded to include all 19 sites of the International 10-20

System (of electrode placement) to allow for a 19-channel ZNF (19ZNF). To date, case

study and anecdotal clinical reports within the field indicate this new 19ZNF approach is

an improvement over traditional NF models (J. L. Koberda, Moses, Koberda, & Koberda,

2012a; Wigton, 2013). However, the efficacy of this new model has not yet been

established from empirical studies. This research is different from prior qualitative

studies; it has been completed as a quantitative analysis of pre-post outcome measures

with group data, and thus, it is a beginning in establishing empirical evidence regarding

19ZNF.

The remainder of this chapter formulates this dissertation through a review of the

study background, problem statement, purpose and significance, and how this research

advances scientific knowledge. Moreover, the research questions and hypotheses are

presented, together with the methodology rationale and the nature of the research design.

An extended Definition section is included to review the many technical terms germane

to this research. Readers unfamiliar with NF or QEEGs may find it helpful to review the

definitions first. Finally, to establish the scope of the study, a list of assumptions,

limitations, and delimitations is included.

Background of the Study

In recent years NF has seen increasing acceptance as a therapeutic technique.

Current literature includes reviews and meta-analyses which establish a recognition of

NF as effective for the specific condition of attention deficit hyperactivity disorder

(ADHD) (Arns, de Ridder, Strehl, Breteler, & Coenen, 2009; Brandeis, 2011;

Gevensleben, Rothenberger, Moll, & Heinrich, 2012; Lofthouse, Arnold, Hersch, Hurt, &

DeBeus, 2012; Niv, 2013; Pigott, De Biase, Bodenhamer-Davis, & Davis, 2013).

However, the type of NF covered in these reviews is limited to the oldest NF model

(theta/beta ratio) and/or slow cortical potential NF. Yet of note are reports in the literature

of a different NF model which is informed by QEEG data. This QEEG-guided NF (QNF)

is reported to be used for a much wider range of conditions; not only ADHD, but also

behavior disorders, cognitive dysfunction, various mood disorders, epilepsy,

posttraumatic stress disorder, head injuries, autism spectrum disorders, migraines,

learning disorders, schizophrenia, and mental retardation (Arns, Drinkenburg, &

Kenemans, 2012; Breteler, Arns, Peters, Giepmans, & Verhoeven, 2010; Coben &

Myers, 2010; J. L. Koberda, Hillier, Jones, Moses, & Koberda, 2012; Surmeli, Ertem,

Eralp, & Kos, 2012; Surmeli & Ertem, 2009, 2010, 2011; Walker, 2009, 2010b, 2011,

2012b).

Yet, all the aforementioned models are limited in their use of only one or two

electrodes and they also require many sessions to achieve good clinical outcomes. For the

above-cited studies the reported average number of sessions was 40.5. Moreover,

Thatcher (2012, 2013) reports 40 to 80 sessions to be the accepted norm for these older

style models; thus leading to a sizeable cost to access this treatment. However, one of the

newest ZNF models shows promise to bring about positive clinical outcomes in

significantly fewer sessions (Thatcher, 2013). With 4ZNF there have been reports of

successful clinical outcomes with fewer than 25 sessions (Collura, Guan, Tarrant, Bailey, &

Starr, 2010; Hammer, Colbert, Brown, & Ilioi, 2011; Wigton, 2008); whereas clinical

reviews and recent conference reports (J. L. Koberda, Moses, Koberda, & Koberda,

2012b; Rutter, 2011; Wigton, 2009, 2010a, 2010b, 2013; Wigton & Krigbaum, 2012)

suggest 19ZNF can result in positive clinical outcomes, as well as QEEG normalization,

in as few as 5 to 15 sessions. Therefore, a NF technique that shows promise to bring

clinical improvement in fewer sessions – thereby reducing treatment cost – deserves

empirical study.

Currently in the peer-reviewed published literature, there are a couple of

descriptive and clinical review articles about the 19ZNF model (Thatcher, 2013; Wigton,

2013) and two single case study reports (Hallman, 2012; J. L. Koberda et al., 2012a);

however rigorous scientific studies evaluating 19ZNF have not been found, which poses

a gap in the literature. Therefore, before the question of efficiency and number of

sessions is examined, first its efficacy should be established. NF and ZNF efficacy has

been discussed in the literature as having the desired effect in terms of improved clinical

outcomes (La Vaque et al., 2002; Thatcher, 2013; Wigton, 2013), a definition that fits

well within the scope of this research. In this study, there are two types of clinical

outcome measures; one type (clinical assessments) is a set of psychometric tests designed

to measure symptom severity and/or improvement, the other type (QEEG z-scores)

provides a representative measure of electrocortical dysfunction and/or improvement.

Thus, this dissertation is intended to address efficacy of 19ZNF in a clinical setting,

through a retrospective evaluation of clinical outcomes, as measured by clinical

assessments and QEEG z-scores.

Problem Statement

It is not known, by way of statistical evaluation of either clinical assessments or

QEEG z-scores, if 19ZNF is an effective NF technique. This is an important problem

because 19ZNF is a new NF model currently in use by a growing number of practitioners,

yet scientific research investigating its efficacy is lacking. According to an Efficacy Task

Force, established by the two primary professional organizations for NF and biofeedback

professionals,¹ anecdotal reports (regardless of how many) are insufficient as a basis for

determining treatment efficacy, and uncontrolled case studies are scientifically weak (La

Vaque et al., 2002). Therefore, scientific evidence of efficacy for 19ZNF is needed.

¹ The primary professional societies for neurofeedback and biofeedback are the International Society for Neurofeedback and Research (ISNR; www.isnr.org) and the Association for Applied Psychophysiology and Biofeedback (AAPB; www.aapb.org).

The identified population for this study is made up of those seeking NF services

(both adults and children), and those who become NF clients. These individuals may

have an array of symptoms, which adversely affect their daily functioning; they may also

have previously diagnosed mental health disorders. When seeking NF services these

individuals must choose among a variety of NF models. However, the dearth of scientific

literature regarding 19ZNF limits the information available to inform that decision-

making process. Therefore, it is vital that both NF clinicians and clients have empirically

derived information regarding the clinical value and efficacy of this new NF technique.

Consequently, the problem of this empirical gap impacts the NF clinician and client alike.

The goal of this research is to provide a first step towards addressing this

research gap.

Purpose of the Study

The purpose of this quantitative, retrospective, one-group, pretest-posttest study

was to compare clinical assessment scores and QEEG z-score data, before and after

19ZNF sessions, from the archived data of a private

neurofeedback practice in the Southwest region of the United States. The comparisons

were accomplished via statistical analysis appropriate to the data (i.e., paired t tests), and

will be further discussed in the Data Analysis section of Chapter 3. The independent

variable is defined as the 19ZNF, and the dependent variables are defined as the standard

scaled scores of three clinical assessments and QEEG z-score data. The clinical

assessments measure symptoms of attention, behavior, and executive function, whereas

the z-scores provide a representative measure of electrocortical function. The full scope

of the assessments is further outlined in the Instrumentation section of Chapter 3.

Given the retrospective nature of this study, there were no individuals, as subjects,

with which to interact. However, the target population group is considered to be adults

and children with clinical symptoms of compromised attention, behavior, or executive

function, who are interested in NF as an intervention for improvement of those

symptoms. This pretest-posttest comparison research contributes to the NF field by

conducting a scientific study, using quantitative group methods, to address the efficacy of

the new 19ZNF model.

Research Questions and Hypotheses

If the problem to be addressed is a lack of scientific evidence demonstrating

efficacy of 19ZNF, the solution lies in evaluating its potential for improving clinical

outcomes as measured by clinical assessments and electrocortical metrics. Therefore,

posing research questions in terms of clinical symptomology and cortical function

measures is a reasonable approach. For this research, the independent variable is the

19ZNF and the dependent variables are clinical outcomes, as measured by the scaled

scores from three clinical assessments and z-scores from QEEG data. The clinical

assessments are designed to measure symptom severity of attention, behavior, and

executive functioning, and the z-scores are a representational measure of electrocortical

function. The data gathering, score calculation, and data analysis were conducted by the

researcher.

The following research questions guided this study:

R1a. Does 19ZNF improve attention as measured by the Integrated Visual and Auditory continuous performance test (IVA; BrainTrain, Incorporated, Chesterfield, VA)?

Ha1a: The post scores will be higher than the pre scores for the IVA assessment.

H01a: The post scores will be lower than, or not significantly different from, the pre scores of the IVA assessment.

R1b. Does 19ZNF improve behavior as measured by the Devereux Scale of Mental Disorders (DSMD; Pearson Education, Incorporated, San Antonio, TX)?

Ha1b: The post scores will be lower than the pre scores for the DSMD assessment.

H01b: The post scores will be higher than, or not significantly different from, the pre scores of the DSMD assessment.

R1c. Does 19ZNF improve executive function as measured by the Behavior Rating Inventory of Executive Functioning (BRIEF; Western Psychological Services, Incorporated, Torrance, CA)?

Ha1c: The post scores will be lower than the pre scores for the BRIEF assessment.

H01c: The post scores will be higher than, or not significantly different from, the pre scores of the BRIEF assessment.

R2. Does 19ZNF improve electrocortical function as measured by QEEG z-scores (using the Neuroguide Deluxe software, Applied Neuroscience Incorporated, St. Petersburg, FL), such that the post z-scores are closer to the mean than pre z-scores?

Ha2: The post z-scores will be closer to the mean than the pre z-scores.

H02: The post z-scores will be farther from the mean than, or not significantly different from, the pre z-scores.

Table 1.1 outlines the research questions and variables.

Table 1.1

Research Questions and Variables

| Research Questions | Hypotheses | Variables | Instrument(s) |
|---|---|---|---|
| 1a. Does 19ZNF improve attention as measured by the IVA? | The post scores will be higher than the pre scores for the IVA assessment. | IV: 19ZNF; DV: IVA standard scale scores | IVA computerized performance test |
| 1b. Does 19ZNF improve behavior as measured by the DSMD? | The post scores will be lower than the pre scores for the DSMD assessment. | IV: 19ZNF; DV: DSMD standard scale scores | DSMD rating scale |
| 1c. Does 19ZNF improve executive function as measured by the BRIEF? | The post scores will be lower than the pre scores for the BRIEF assessment. | IV: 19ZNF; DV: BRIEF standard scale scores | BRIEF rating scale |
| 2. Does 19ZNF improve electrocortical function as measured by QEEG z-scores such that the post z-scores are closer to the mean than pre z-scores? | The post QEEG z-scores will be closer to the mean than the pre z-scores. | IV: 19ZNF; DV: QEEG z-scores | QEEG z-score data generated from Neuroguide software |

Advancing Scientific Knowledge

The theoretical framework of NF is the application of operant conditioning upon

the EEG, which leads to electrocortical changes, and in turn, better brain function and

clinical symptom improvement; moreover, studies evaluating traditional NF have

demonstrated its efficacy (Arns et al., 2009; Pigott et al., 2013). The 19ZNF model is

new, and experiencing increased use in the NF field, yet efficacy has not been established

via empirical investigation. There is a gap in the literature in that the only peer-reviewed

information available to date, regarding 19ZNF, are reviews, clinical report presentations,

and single case studies. Also noted as typically absent from traditional NF studies are

analyses of pre-post QEEG data (Arns et al., 2009); this lack of pre-post QEEG data

continues in the QNF literature as well. This, then, poses a secondary gap, in terms of

methodology, which this study has the potential to fill.

The clinical condition most researched for demonstrating traditional NF efficacy

is ADHD (Pigott et al., 2013), which includes cognitive functions of attention and

executive function. These issues also lead to some associated behavioral problems with

adverse impacts in instructional settings that are also treated with 19ZNF. Therefore,

addressing efficacy of 19ZNF with clinical assessments designed to measure these

constructs, will contribute to filling the gap of what is not known about this new NF

model, within a framework related to cognition and instruction. If efficacy is

demonstrated, the theory of operant conditioning, upon which NF is founded, may be

expanded to include 19ZNF.

Significance of the Study

The 19ZNF model is theoretically distinct from traditional NF in that it

targets real-time QEEG z-scores with a goal of normalizing QEEG metrics (as indicated

by clinical symptom presentation) rather than only increasing or decreasing targeted brain

frequencies. This model has been in existence for five years and its use by NF clinicians

is rapidly growing. Thus far, other than two qualitatively-oriented, single case study

reports (Hallman, 2012; J. L. Koberda et al., 2012a), there are no empirical group studies,

with a quantitative methodology, studying the efficacy of 19ZNF in peer-reviewed

literature. The significance of this study is that it aims to fill this gap, manifest

as a dearth of 19ZNF efficacy studies.

Moreover, few NF studies include analysis of EEG measures as an outcome

measure (Arns et al., 2009). Therefore, demonstrating how z-scores from QEEG data can

be used for group comparison studies, in a way not previously explored, will benefit the

scientific community. Thus, this research has the potential for opening doors for further

research.

It was expected the findings would demonstrate 19ZNF results in improved

clinical outcomes, as measured by clinical and QEEG assessments; thus demonstrating

efficacy. Potential NF clients will benefit from this contribution of what is known about

19ZNF by having more information upon which to base decisions for what type of NF

they wish to pursue. These results may provide the start of an

evidence-based foundation for its use. This foundation may lead to a greater acceptance

of what may be a more efficient (and thereby more economical) NF model, as well as

foster the needed additional scientific research of 19ZNF.

Rationale for Methodology

The field of clinical psychophysiology makes use of quantifiable variables and the

associated research should include specific independent variables, as well as dependent

variables that relate to treatment response (e.g., clinical assessments) and the measured

physiological component (e.g., EEG metrics) (La Vaque et al., 2002). Yet, many NF

studies do not use the EEG metric as a psychophysiologic measure, but rather provide

reports, which are more qualitative in nature. Therefore, there is a need for NF research,

with sound quantitative methodologies, using QEEG data as an outcome measure.

Currently, the available 19ZNF studies are in the form of qualitative research

(Hallman, 2012; J. L. Koberda et al., 2012a). This literature entails presenting data, from

single case studies, in the form of unstructured subjective reports of symptom

improvement and graphical images of before and after QEEG findings, where the

improvement is represented by a change in color on the picture (without statistical

analysis of data). However, for this dissertation, the goal is to explore statistical

relationships between the variables under investigation. The strength of quantitative

methodologies, including quasi-experimental research, is that they provide sufficient

information, regarding the relationship of the investigation variables, to enable the study

of the effects of the independent variable upon the dependent variable (Carr, 1994); this

is suitable in the evaluation of a quantitative technology such as 19ZNF.

As previously stated, for this research the independent variable is specified as

19ZNF. The dependent variables in this study are continuous variables in the form of

standard scores from clinical assessments (IVA, DSMD, and BRIEF) and z-scores from

QEEG data. The alternative hypotheses for all research questions predict a directional

significant difference between the means of the pre and the post values for all dependent

variables. Therefore, a quantitative methodology is appropriate for this dissertation.
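To make the directional (one-tailed) paired comparison concrete, the sketch below computes a paired t statistic by hand. It is only an illustration under stated assumptions: the scores are hypothetical and not taken from this study, the function name `paired_t` is invented for the example, and the critical value 1.833 is the standard one-tailed table value for alpha = .05 with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical pre/post standard scores for 10 clients (illustrative only,
# not data from this study). Higher scores are better, as on the IVA.
pre = [78, 82, 75, 80, 85, 79, 77, 83, 81, 76]
post = [92, 95, 88, 90, 99, 91, 89, 97, 94, 90]


def paired_t(pre_scores, post_scores):
    """Return (t, df) for the paired differences post - pre."""
    diffs = [b - a for a, b in zip(pre_scores, post_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


t_stat, df = paired_t(pre, post)

# One-tailed critical value for alpha = .05 and df = 9 (standard t table).
# The value is df-specific, so the check below guards on df as well.
T_CRIT_DF9 = 1.833
reject_null = df == 9 and t_stat > T_CRIT_DF9
print(f"t({df}) = {t_stat:.2f}, reject H0: {reject_null}")
```

For measures where lower scores indicate improvement (as with the DSMD and BRIEF), the same sketch would apply with the direction of the difference reversed. In practice, a statistics package would normally be used to obtain exact p values.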

Nature of the Research Design for the Study

This quasi-experimental research used a retrospective one-group, pretest-posttest

design. When the goal of research is to measure a modification of a behavior pattern, or

internal process that is stable and likely unchangeable on its own, the one-group pretest-

posttest design is appropriate (Kerlinger, 1986). In this type of design the dependent

variable pretest measures are compared to the posttest values for each subject, thus

comparing the members of the group to themselves rather than to a control or comparison

group (Kerlinger, 1986). Consequently, the group members become their own control,

hence reducing the potential for extraneous variation due to individual-to-individual

differences (Kerlinger & Lee, 2000). Moreover, the size of the treatment effect can be

estimated by analyzing the difference between the pretest and the posttest measures

(Reichardt, 2009). Therefore, this design, as well as a quantitative methodology, is well

suited to evaluate the pre-post outcome measures from a clinical setting.
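As a rough illustration of estimating a treatment effect from pretest-posttest differences, the sketch below computes a standardized mean difference for paired scores. This is one common convention (mean difference divided by the standard deviation of the differences); the effect size formula actually used in this study is not stated in this chapter, so that choice is an assumption. The scores and the function name `cohens_d_paired` are hypothetical.

```python
import statistics


def cohens_d_paired(pre_scores, post_scores):
    """Standardized mean difference of paired scores (mean diff / SD of diffs)."""
    diffs = [b - a for a, b in zip(pre_scores, post_scores)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


# Hypothetical T scores where lower is better (illustrative only).
pre = [70, 68, 72, 66, 74, 69]
post = [60, 61, 63, 60, 62, 63]

d = cohens_d_paired(pre, post)
print(f"d = {d:.2f}")  # negative here because scores dropped after treatment
```

The sign of the result simply reflects the direction of change; its magnitude is what is usually interpreted as the size of the effect.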

The rationale for a retrospective study is that the data

available for analysis came from pre-existing archived records, which frequently provide

a rich source of readily accessible data (Gearing, Mian, Barber, & Ickowicz, 2006).

Within the pool of available data, a sample group was gathered for which various pre and

post assessments were performed during the course of 19ZNF treatment. As depicted in

Figure 1.1, an initial group was formed for which pre-post QEEG assessments and z-

scores were available, and for which either the IVA, DSMD, or BRIEF pre-post

assessment data was also available (n = 21). From this collection three additional groups

were formed: One group for the IVA data (n = 10), a second group for the DSMD data (n

= 14), and a third group for the BRIEF data (n = 12). Therefore, using a one-group

pretest-posttest design with these identified groups is fitting. The independent variable is

the 19ZNF and the dependent variables are the data from the clinical assessments and

QEEG files (IVA, DSMD, BRIEF, and z-scores).

Formation of Sample Groups

Figure 1.1. Illustration of how the sample groups were formed. The

total number of subjects in the sample is 21. However, out of those

21, some may have multiple assessments, therefore subjects may be

in more than one clinical assessment group.
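The overlapping group structure described in the caption can be sketched with sets. The subject IDs and the particular pattern of overlap below are hypothetical, chosen only so that the group sizes match those reported (10, 14, 12, and 21 in total); the actual overlap between groups is not given in this passage.

```python
# Hypothetical subject IDs 1..21 (illustrative only; the actual overlap
# between groups is not reported in this passage).
iva = set(range(1, 11))      # 10 subjects with pre-post IVA data
dsmd = set(range(8, 22))     # 14 subjects with pre-post DSMD data
brief = set(range(1, 13))    # 12 subjects with pre-post BRIEF data

qeeg = iva | dsmd | brief    # everyone with at least one clinical assessment
in_multiple = {s for s in qeeg if (s in iva) + (s in dsmd) + (s in brief) > 1}

print(len(iva), len(dsmd), len(brief), len(qeeg))
print(len(in_multiple))
```

Because the three assessment groups sum to 36 memberships across 21 subjects, at least some subjects must belong to more than one group, which is the point the caption makes.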

Definition of Terms

The following terms were used operationally in this study.

19ZNF. 19-channel z-score NF is a style of NF using all 19 sites of the

International 10-20 system, where real-time QEEG metrics are incorporated into the NF

session in the form of z-scores (Collura, 2014). The goal is for the targeted excessive z-

score metrics (whether high or low) to normalize (move towards the mean). The 19ZNF

cases included in this study are those for which the assessed clinical symptoms

corresponded with the z-score deviations of the QEEG findings, such that a treatment

goal of overall QEEG normalization was clinically appropriate. While the 19ZNF

protocols are individually tailored to the clinical and QEEG findings, the same treatment

goal always applies, that is the overall QEEG normalization. Therefore, the underlying

19ZNF protocol of overall QEEG normalization is consistent for all cases.
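The idea of z-scores moving towards the mean can be sketched as follows. This is a minimal illustration, assuming that normalization is summarized as a smaller average absolute z-score after treatment; how this study aggregates its z-score metrics is not described in this definition. The values and the helper names `z_score` and `mean_abs_z` are hypothetical.

```python
def z_score(value, norm_mean, norm_sd):
    """Distance of a measurement from a normative mean, in SD units."""
    return (value - norm_mean) / norm_sd


def mean_abs_z(z_values):
    """Average distance from the normative mean, ignoring sign."""
    return sum(abs(z) for z in z_values) / len(z_values)


# Hypothetical z-scores for a few QEEG metrics (illustrative only).
pre_z = [2.4, -1.9, 2.1, -2.6]
post_z = [0.9, -0.7, 1.1, -1.2]

improved = mean_abs_z(post_z) < mean_abs_z(pre_z)
print(improved)
```

Taking absolute values treats excessively high and excessively low z-scores alike, which matches the description of targeting deviations "whether high or low".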

Absolute power. A QEEG metric which is a measure of total energy, at each

electrode site, for a defined frequency band (Machado et al., 2007); may be expressed in

terms of microvolts, microvolts squared, or z-scores when compared to a normative

database (Collura, 2014).

Amplifier. The equipment that detects, amplifies, and digitizes the brainwave

signal (Collura, 2014). The term is more correctly referred to as a differential amplifier

because the electrical equipment measures the difference between two signal inputs

(brainwaves from electrode locations) (Collura, Kaiser, Lubar, & Evans, 2011).

Amplitude. A measure of the magnitude or size of the EEG signal, typically

expressed in terms of microvolts (µV) (Collura et al., 2011). This can be thought

of as how much energy is in the EEG frequency.

Biofeedback. A process of learning how to change physiological activity with the

goal of improving health and/or performance (AAPB, 2011). A simple example of

biofeedback is the act of stepping on a scale to measure one’s weight.

Behavior Rating Inventory of Executive Functioning (BRIEF). The BRIEF,

published by Western Psychological Services, Incorporated (Torrance, CA), is a rating

scale. It has forms for both children and adults, and is designed to assess behavioral,

emotional, and metacognitive skills, which broadly encompass executive skills, rather

than measure behavior problems or psychopathology (Donders, 2002). The test results

are expressed as T scores for various scales and sub-scales (with clinically significant

scores ≥ 65), and lower scores indicate improvement upon re-assessment. The composite

and global scales of Behavior Regulation Index, Metacognition Index, and Global

Executive Composite were included in this study.

Coherence. A measure of similarity between two EEG signals, which also

reflects the degree of shared information between the sites; computed in terms of a

correlation coefficient, which varies from .00 to 1.00 (Collura et al., 2011).

Devereux Scale of Mental Disorders (DSMD). The DSMD, published by

Pearson Education, Incorporated (San Antonio, TX), is a rating scale. It is designed to

assess behavior problems and psychopathology in children and adolescents (Cooper,

2001). The test results are reported in the form of T scores for various scales and sub-

scales (with clinically significant scores ≥ 60), and lower scores indicate improvement

upon re-assessment. The composite and global scales of Externalizing, Internalizing, and

Total were

included in this study.

Electrode. Central to NF is the detection and analysis of the EEG signal from the

scalp. In order to record brainwaves it is necessary to attach metallic sensors

(electrodes) to the scalp and/or ears (with a paste or gel) to facilitate this process (Collura,

2014).

Electroencephalography (EEG). A recording of brain electrical activity (i.e.

brainwaves) using differential amplifiers, measured from the scalp (Collura et al., 2011).

The information from each site or channel is digitized to be viewed as an oscillating line,

such that all channels can be viewed on a computer screen at one time.

Fast Fourier transform (FFT). The conversion of a series of digital EEG

readings into frequency ranges/bands, which can be viewed in a spectral display. Just as

different frequencies of light can be seen when filtered through a prism, so too can EEG

elements be isolated when filtered through an FFT process into different frequency bands

(Collura, 2014).

Frequency / frequency bands. The representation of how fast the signal is

moving, expressed in terms of Hertz (Hz) (Collura, 2014) and commonly arranged in

bandwidths, also referred to as bands. Generally accepted frequency bands are delta (1-4

Hz), theta (4-7 Hz), alpha (8-12 Hz), beta (12-25 Hz), and high beta (25-30 Hz); the beta

band may be broken down into smaller bands of beta1 (12-15 Hz), beta2 (15-18 Hz), and

beta3 (18-25 Hz), and the alpha band may be divided into alpha1 (8-10 Hz) and alpha2

(10-12 Hz).
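The relationship between a digitized signal, its frequency content, and the band labels listed above can be sketched in a few lines of Python. This is a minimal illustration only: it uses a naive discrete Fourier transform rather than an optimized FFT, it treats each band as a half-open interval (an assumption made here to resolve the shared boundaries at 4, 12, and 25 Hz), and the function names and the synthetic 10 Hz test signal are hypothetical.

```python
import cmath
import math

# Band limits in Hz as listed in the definition above; half-open [low, high)
# intervals are an assumption made for this sketch.
BANDS = [
    ("delta", 1, 4),
    ("theta", 4, 7),
    ("alpha", 8, 12),
    ("beta", 12, 25),
    ("high beta", 25, 30),
]


def dft_power(signal):
    """Naive discrete Fourier transform; returns power for bins 0..N/2."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        powers.append(abs(coeff) ** 2)
    return powers


def dominant_frequency(signal, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC bin."""
    powers = dft_power(signal)
    peak_bin = max(range(1, len(powers)), key=lambda k: powers[k])
    return peak_bin * sample_rate_hz / len(signal)


def band_of(freq_hz):
    """Name of the band containing freq_hz, or None if it falls in a gap."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    return None


# One second of a synthetic 10 Hz sine wave sampled at 128 Hz.
rate = 128
wave = [math.sin(2 * math.pi * 10 * t / rate) for t in range(rate)]
peak = dominant_frequency(wave, rate)
print(peak, band_of(peak))  # prints: 10.0 alpha
```

Because the listed theta band ends at 7 Hz and the alpha band begins at 8 Hz, a frequency such as 7.5 Hz falls into no band in this sketch and `band_of` returns `None`.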

Gaussian. Referring to the normal distribution and/or normal curve (Thatcher,

2012).

Hedges’ d. An effect size, belonging to the d family indices (along with Hedges’

g), which use the standard score form of the difference between the means; therefore it is

similar to the Cohen’s d, with the same interpretation (Hunter & Schmidt, 2004).

However, when used with small sample sizes, both Cohen’s d and Hedges’ g can

have an upward bias and be somewhat inflated, whereas Hedges’ d includes a

correction for this bias (Hunter & Schmidt, 2004). Therefore, in studies with smaller

sample sizes, the use of the Hedges’ d provides a more conservative, and likely more

accurate effect size. Also, complicating this issue is confusion in the literature regarding

the use of the designator g or d for which particular Hedges index, and/or which

calculation does or does not include the correction factor (Hunter & Schmidt, 2004). For

example, frequently Hedges’ g is described as adjusting for small sample sizes;

however, this is only true if the calculation used includes the correction factor. Moreover,

there are even variations in the literature of the correction equation which is applied. As a

result, the only way to know which calculation is actually being used is for the Hedges’

index equation to be explicitly reported. To that end, for this study, the Hedges’ d

definition/calculation will be that used in the MetaWin 2.1 meta-analysis software

(Rosenberg, Adams, & Gurevitch, 2000). In this context the Hedges’ d is calculated by

multiplying the Hedges’ g by the correction, which is sometimes referred to as J.

Where $J$ denotes the small-sample correction factor, and therefore $d = g \times J$.
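As a rough illustration of the calculation described above (Hedges’ g multiplied by a small-sample correction), the following Python sketch may be helpful. Several points are assumptions made here rather than details confirmed by the text: the correction is taken to have the commonly cited form J = 1 − 3 / (4(n1 + n2 − 2) − 1), g is computed as a mean difference divided by a pooled standard deviation for two groups, and all function names and input values are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups (two-group form assumed here)."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def correction_j(n1, n2):
    """Small-sample correction J; the exact form is an assumption of this sketch."""
    return 1.0 - 3.0 / (4.0 * (n1 + n2 - 2) - 1.0)


def hedges_g(mean1, mean2, sd1, n1, sd2, n2):
    """Uncorrected standardized mean difference."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def hedges_d(mean1, mean2, sd1, n1, sd2, n2):
    """Corrected effect size: g multiplied by J."""
    return hedges_g(mean1, mean2, sd1, n1, sd2, n2) * correction_j(n1, n2)


# Hypothetical values: two groups of 10 with equal standard deviations.
g = hedges_g(10.0, 8.0, 2.0, 10, 2.0, 10)
d = hedges_d(10.0, 8.0, 2.0, 10, 2.0, 10)
print(round(g, 4), round(d, 4))  # prints: 1.0 0.9577
```

With small groups the correction pulls the estimate toward zero (here from 1.0 to about 0.96), which is the more conservative behavior described in the definition; as the sample sizes grow, J approaches 1 and the two indices converge.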

Hertz (Hz). The number of times an EEG wave oscillates (moves up and down)

within a second; commonly expressed as cycles per second

(Collura, 2014).

International 10-20 System. A standardized and internationally accepted method

of EEG electrode placement locations (also referred to as sites) on the scalp. The

nomenclature of 10-20 derives from electrode locations being spaced a distance of either

10% or 20% of the measured distance from certain landmarks on the head. The system

consists of a total of 19 sites, with eight locations on the left, eight on the right, and three

central sites found on the midline between the right and left side of the head (Collura,

2014).

Integrated Visual and Auditory + Plus Continuous Performance Test (IVA). The IVA,

developed and published by BrainTrain Incorporated (Chesterfield, VA), is a

computerized interactive assessment. It is normed for individuals over the age of 5, and it

is designed to assess both auditory and visual attention and impulse control with the aim

to aid in the quantification of symptoms and diagnosis of ADHD (Sanford & Turner,

2009). The test results are reported in the form of quotient scores for various scales and

sub-scales (with clinically significant scores ≤ 85), and higher scores indicate

improvement upon re-assessment. The global and composite scales of Full Scale

Attention Quotient, Auditory Attention Quotient, and the Visual Attention Quotient were

included in this study.

Joint time frequency analysis (JTFA). A method of digitizing the EEG signal

which allows for moment-to-moment (i.e. real time) measures of EEG signal changes

(Collura, 2014).

Montage. The configuration of the electrodes and software defining the reference

point and electrode linkages, for the differential recording of the EEG signals (Thatcher,

2012). For example, in a linked-ears montage, the signal for each electrode site is

referenced to the signal of the ear electrodes linked together. In a Laplacian montage, the

signal for each electrode site is referenced to the signal of the weighted average of the

surrounding electrode sites.
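The two montage examples above can be expressed as simple re-referencing arithmetic. The Python sketch below is illustrative only: the electrode labels A1 and A2 for the ears, the microvolt values, and the equal neighbor weights are all hypothetical, and an actual Laplacian montage would derive its weights from electrode geometry.

```python
def linked_ears(samples, ear_left="A1", ear_right="A2"):
    """Reference each scalp site to the mean of the two ear electrodes."""
    reference = (samples[ear_left] + samples[ear_right]) / 2.0
    return {
        site: value - reference
        for site, value in samples.items()
        if site not in (ear_left, ear_right)
    }


def laplacian(samples, site, neighbor_weights):
    """Reference one site to the weighted average of its surrounding sites."""
    total_weight = sum(neighbor_weights.values())
    weighted_average = (
        sum(samples[name] * weight for name, weight in neighbor_weights.items())
        / total_weight
    )
    return samples[site] - weighted_average


# One hypothetical instant of recorded potentials, in microvolts.
samples = {"Cz": 12.0, "C3": 10.0, "C4": 8.0, "A1": 2.0, "A2": 4.0}

print(linked_ears(samples))  # prints: {'Cz': 9.0, 'C3': 7.0, 'C4': 5.0}
print(laplacian(samples, "Cz", {"C3": 1.0, "C4": 1.0}))  # prints: 3.0
```

The same raw potentials therefore yield different derived signals depending on the montage, which is why the montage must be specified alongside any reported QEEG metric.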

Neurofeedback. An oversimplified, yet accurate, definition of neurofeedback is

that it is simply biofeedback with brainwaves. Generally, it is an implicit learning process

(involving both operant and classical conditioning) where changes in brainwave

signal/patterns, in a targeted direction, generate a reward (a pleasant tone and change in

a video animation) such that the desired brainwave events occur more often (Collura,

2014; Thatcher, 2012).

Normalization. In the context of NF, refers to the progression of excessive z-

scores towards the mean (i.e. z = 0), meaning the NF trainee’s EEG is moving closer to

the EEG range of normal (i.e. typical) individuals of his/her age (Collura, 2014). Thus,

the concept of normalization is generally accepted to be when the z-scores of the QEEG

move towards the mean (i.e. in the direction of z = 0).
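The idea of normalization can be made concrete with a short Python sketch. It assumes a z-score is computed in the usual way, as the difference between an observed value and the normative mean divided by the normative standard deviation, and it treats a metric as having normalized when the absolute value of its z-score decreases. The function names and the numerical values are hypothetical and are not drawn from any normative database.

```python
def z_score(value, norm_mean, norm_sd):
    """Standard score of an observed value against a normative mean and SD."""
    return (value - norm_mean) / norm_sd


def moved_toward_mean(z_pre, z_post):
    """True when the post z-score is closer to z = 0 than the pre z-score."""
    return abs(z_post) < abs(z_pre)


# Hypothetical metric with a normative mean of 10.0 and SD of 2.0.
z_pre = z_score(14.0, 10.0, 2.0)   # 2.0, an excessive high value
z_post = z_score(11.0, 10.0, 2.0)  # 0.5, closer to the mean
print(z_pre, z_post, moved_toward_mean(z_pre, z_post))  # prints: 2.0 0.5 True
```

Using the absolute value handles excessive low z-scores in the same way as excessive high ones, and it also means that a value overshooting the mean (for example, from 1.0 to −1.5) is not counted as normalization.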

Power spectrum. The distribution of EEG energy across the frequency bands,

typically from 1 Hz to 30 Hz and frequently displayed as a line graph, histogram, or color

topographic (i.e. visual representations of the numerical data) images (Collura, 2014).

Phase. The temporal relationship between two EEG signals, reflecting the speed

of shared information (Collura et al., 2011).

Protocol. The settings designated in NF software, informed by a treatment plan,

which determine how the NF proceeds. This establishes parameters such as metrics (e.g.

absolute vs. relative power), direction of training (i.e. targeting more or less), length of

session, and other decision points in the NF process (Collura et al., 2011).

Quantitative EEG (QEEG). The numerical analysis of the EEG such that it is

transformed into a range of frequencies as well as various metrics such as absolute

power, relative power, power ratios, asymmetry, coherence, and phase (Collura, 2014;

Thatcher, 2012). The data is typically made up of raw numbers, statistical transforms into

z-scores, and/or topographic images (Collura, 2014). As a dependent variable in this

study, QEEG z-scores are considered a representational measure of electrocortical

function. The metrics of absolute power, relative power, and coherence were included.

Relative power. A QEEG metric representing the amount of energy, divided by

the total energy, at each electrode site, for a defined frequency band. It reflects how much

energy is present compared to all other frequencies (Collura, 2014).
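The distinction between absolute and relative power can be illustrated with a brief Python sketch. It assumes the absolute power for each band at a single electrode site is already available; the band values and the function name are hypothetical.

```python
def relative_power(absolute_power):
    """Divide each band's absolute power by the total power at the site."""
    total = sum(absolute_power.values())
    if total <= 0:
        raise ValueError("total power must be positive")
    return {band: power / total for band, power in absolute_power.items()}


# Hypothetical absolute power values for one electrode site.
site_power = {"delta": 20.0, "theta": 10.0, "alpha": 15.0, "beta": 5.0}
print(relative_power(site_power))
# prints: {'delta': 0.4, 'theta': 0.2, 'alpha': 0.3, 'beta': 0.1}
```

Because the relative values always sum to 1 across the included bands, an increase in one band's relative power implies a decrease elsewhere, even when its absolute power is unchanged.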

Assumptions, Limitations, Delimitations

This section identifies the assumptions and specifies the limitations, together with

the delimitations of the study. The following assumptions were present in this study:

1. It was assumed that traditional neurofeedback is deemed efficacious as

discussed and demonstrated in the literature (Arns et al., 2009; Pigott et

al., 2013).

2. It was assumed that the subjects are representative of the population of those

who seek NF treatment for various mental health disorders; thus allowing

for results to be generalized to that population (Gravetter & Wallnau,

2010).

3. It was assumed the sample is homogeneous and selected from a population

that fits the normal distribution such that the distribution of sample means is

also likely to fit a normal distribution (Gravetter & Wallnau, 2010).

4. It was assumed that responses provided on rating scale instruments accurately

reflect perceived or remembered observations, thus minimizing bias for

over or under-reporting of observations (Kerlinger & Lee, 2000).

The following limitations were present in this study:

1. Research design elements. A general limitation of designs that incorporate a

pretest-posttest formulation is primarily related to the passage of time

between administering the pre and post assessments (Kerlinger & Lee,

2000). Factors such as history and maturation cannot be controlled for;

therefore it is not possible to know whether or not they have impacted the

dependent variable measures (Hunter & Schmidt, 2004). However, for this

study the time between the pre and post assessment is relatively short, and

can be measured in terms of weeks. Therefore, the impact of time-related

confounds was anticipated to be minimal. Further limitations which also

must be recognized are a lack of comparison to a traditional NF group, and

a lack of a randomized control group.

2. Small sample size. Larger sample sizes are preferred in order to allow for

stronger statistical analysis and more generalizability (Gravetter &

Wallnau, 2010). Given this study used pre-existing archived data, the

number of samples was restricted to what was found in the files; thus

there was no option to increase sample size. However, as detailed in

Chapter 3, the sample sizes for each group provided sufficient power to

allow for adequate statistical analysis.

The following delimitations were present in this study:

1. This study was delimited to the scope of the surface formulation of 19ZNF.

Therefore, it did not include in its scope other variations of 19-channel NF

models, founded in inverse solution theories, such as low-resolution brain

electromagnetic tomography (LORETA) ZNF or functional magnetic

resonance imaging (fMRI) tomography NF models.

2. This study was delimited to a scope of NF research data collected primarily

from clinical settings, as opposed to laboratory-based experimental

research.

3. The academic quality standards for this dissertation delimit the literature

reviewed for this study to exclude certain non-peer-reviewed sources (i.e.

NF industry newsletters).

In spite of the above stated assumptions, limitations, and delimitations, this study

has potential to be of value to the scientific and neurofeedback community. Given the

data for this research comes from a real-world clinical setting, the findings of this study

still contribute to advancing the scientific knowledge of 19ZNF.

Summary and Organization of the Remainder of the Study

In summary, while NF has a history spanning over 40 years, it is only now

gaining acceptance as an evidence-based mental health intervention (Pigott et al., 2013).

Various models of NF have been developed over the years, with one of the newest

iterations including 19ZNF, which is reported to lead to improved clinical outcomes in

fewer sessions than other models (Thatcher, 2013; Wigton, 2013). However, there are

significant gaps in terms of peer-reviewed literature and research, such that efficacy of

19ZNF has yet to be established. This dissertation intends to fill these gaps by addressing

efficacy of 19ZNF, in a clinical setting, using a comparison of pretest-posttest measures

of clinical assessments and QEEG z-scores.

The following chapters include the literature review in Chapter 2 and a

description of the methodology, research design, and the procedures for the study in

Chapter 3. The literature review first explores the background and history of the problem,

then discusses theoretical foundations and conceptual frameworks, and finally reviews

the literature pertaining to the NF models relevant to this study. Of note is the necessity

of a significantly expanded theoretical/conceptual section. The methodological

foundations of a treatment intervention based in EEG/QEEG technology, combined with

the need to explore the theoretical foundations of three different NF models (traditional,

QNF, and ZNF), require more in depth coverage of the topics involved in that section.

Chapter 2: Literature Review

Introduction and Background to the Problem

The focus of this study was to explore the efficacy of 19ZNF in a clinical setting,

through the use of clinical assessments and QEEG z-scores as outcome measures. Yet, a

review of the literature is necessary to place this research into the context of NF theory and

the various models that have come before 19ZNF. This literature review consists of three

sections.

The first section addresses the history and background of NF in general and

specifically introduces ZNF, as well as comments on how the gap in research for 19ZNF

evolved into its current form. The second section focuses on the theoretical foundations

and conceptual frameworks of NF and QEEG. First, an overview of the foundations of

EEG and QEEG is presented. Next, an overview of learning theory as applied to NF is

discussed. Then, the theoretical frameworks supporting the different models of NF

(traditional, QNF, and ZNF) are reviewed. Last, key themes of NF concepts relevant to

this dissertation including applications of QNF, the development of 4ZNF, and finally the

emergence of 19ZNF are examined. Also included in this section is a review of suitable

outcome measures for use in ZNF research, with special attention paid to prior NF

research regarding performance tests, rating scale assessments, and QEEG z-scores, as

outcome measures.

Of note for this literature review is the necessity to include reviews of conference

oral and poster presentations (which are subject to a peer-review acceptance process).

While inclusion of these sources may be an unusual dissertation strategy, it is necessary

due to the scarcity of sources in the peer-reviewed published literature regarding ZNF

models. To exclude these sources would be to limit the coverage of the available

literature regarding the NF model which is the focus of this dissertation (19ZNF).

The literature for this review was surveyed through a variety of means. The

researcher’s personal library (from nearly fifteen years of practicing in the NF field)

served as the foundation for the literature search. Then, this was expanded through online

searches of various university libraries via academic databases such as Academic Search

Complete, PsycINFO, PsycARTICLES, and MEDLINE, with search strings of

combinations of terms such as NF, QEEG, EEG biofeedback, and z-score(s). Additionally,

the databases of various industry specific journals, such as the Journal of Neurotherapy,

Clinical EEG and Neuroscience, as well as the Applied Psychophysiology and

Biofeedback journal were queried with similar search terms. Moreover, with the specified

journals, names of leading authors in the QNF and ZNF field (e.g. Koberda, Surmeli,

Walker) were used for search terms.

Historical overview of EEG and QEEG. A review of NF literature reveals a

common theme that the deepest roots of NF go back only as far as Hans Berger’s (1929)

discovery of EEG applications in humans. However, the antecedents of EEG technology

can actually be traced back as far as the 1790s with the work of Luigi Galvani and the

discovery of excitatory and inhibitory electrical forces in frog legs, leading to the

recognition of living tissue having significant electrical properties (Bresadola, 2008;

Collura, 1993). The next notable application occurred when Richard Caton (1875) was

the first to discover electrical activity in the brains of monkeys, rabbits, and cats, and to

make observations regarding the relationship of this activity to physiological functions

(Collura, 1993). Yet for applications of EEG in humans, Berger is generally recognized

as the first to record and report on the phenomenon. Thus, it would be most correct to

consider Caton as the first electroencephalographer, and Berger as the first human

electroencephalographer (Collura, 1993). Moreover, Berger’s contributions were

significant as they spurred a plethora of research and technological advancements in EEG

technology in the 1930s and 1940s worldwide. Of note is that Berger not only identified

both alpha and beta waves, but he was also the first to recognize the EEG signal as being

a mixture of various frequencies which could be quantitatively estimated, and spectrally

analyzed through the use of a Fourier transform, thus paving the way for QEEG

technology as well (Collura, 2014; Thatcher, 2013; Thatcher & Lubar, 2009).

Even while there was an understanding of multiple components to the EEG signal

as early as the 1930s, the advent of computer technology was necessary to make possible

QEEG advances (Collura, 1995); for example, the incorporation of normative databases

in conjunction with QEEG analysis. Therefore, the historical landmarks of EEG

developments can trace the modern start of normative database applications of QEEG

back to the 1970s with the work of Matousek and Petersen (1973) as well as John (1977)

(Pizzagalli, 2007; Thatcher & Lubar, 2009). However, while work exploring NF

applications with QEEG technology began in the 1970s, its wider acceptance and use in

the NF field was not until closer to the mid-1990s (Hughes & John, 1999; Thatcher &

Lubar, 2009). Here too, advances in computer technology, whereby personal computers

were able to process more data in less time, made way for advances in the clinical

applications of NF.

Historical overview of NF. The historical development of neurofeedback dates

back to the 1960s and early 1970s when researchers were studying the EEG activity in

both animals and humans. In these early days, Kamiya (1968, 1969) was studying how

humans could modify alpha waves, and Sterman and colleagues (Sterman et al., 2010;

Wyrwicka & Sterman, 1968) were able to demonstrate that cats could generate sensory

motor rhythm, which led to the discovery that this process could make the brain more

resistant to seizure activity; this eventually carried over to work in humans (Budzynski,

1999). Later, Lubar (Lubar & Shouse, 1976) expanded on Sterman’s work, and began

studies applying NF technology to the condition of attention disorders. This work led to

an expansion of clinical applications of neurofeedback to mental health issues such as

ADHD, depression and anxiety, using a training protocol generally designed to increase

one frequency (low beta or beta, depending on the hemisphere) and decrease two other

frequencies (theta and high beta) (S. Othmer, Othmer, & Kaiser, 1999).

Then, in the 1990s QEEG technology began gaining wider acceptance in the NF

community, for the purpose of guiding the development of protocols for NF (Johnstone

& Gunkelman, 2003). The use of normative referenced databases has been an accepted

practice in the medical and scientific community and the advantage it brings to

neurofeedback is the allowance for the comparison of an individual to a norm-referenced

population, in terms of z-scores, to identify measures of aberrant EEG activity (Thatcher

& Lubar, 2009). This made possible the development of models, which focused more on

the individualized and unique needs of the client rather than a one-size-fits-all model.

Consequently, during the ensuing decade, the QNF model began taking hold in the NF

industry. However, the primary number of channels incorporated in the amplifiers of the

time was still limited to only two.

In 2006, the 4-channel – 4ZNF – technique was introduced. ZNF incorporates the

application of an age matched normative database to instantaneously compute z-scores,

via Joint Time Frequency Analysis (as opposed to the fast Fourier transform), making

possible a dynamic mix of both real-time assessment and operant conditioning

simultaneously (Collura et al., 2009; Thatcher, 2012). While the QNF of the 1990s held

as a common goal movement of the z-scores in the QEEG towards the mean, the advent

of ZNF brought with it the more frequent use of the term normalizing the QEEG or

normalization to refer to this process. It is now generally acknowledged that the term

normalization, when used to describe the process of ZNF, refers to the progression of the

z-scores towards the mean (i.e. z = 0), meaning that the NF trainee’s EEG is moving

closer to the EEG range of normal (i.e. typical) individuals of his/her age. But by 2009

the 4ZNF model was further enhanced to include the availability of up to all 19 electrode

sites in the International 10-20 system.

This surface potential 19ZNF greatly expands the number of scalp locations and

measures, including the ability to train real-time z-scores using various montages such as

linked-ears, averaged reference, and Laplacian, as well as simultaneous inclusion of all

connectivity measures such as coherence and phase lag. This, then, makes possible the

inclusion of all values from the database metrics for any given montage (as many as a

total of 5700 variables) in any protocol (Collura et al., 2009). But the advent of 19ZNF

not only increased the number and types of metrics available to target, it also brought two

major changes to the landscape of NF. First, it established a new model wherein the

target of interest for the NF is the QEEG calculated z-scores of the various metrics

(frequency/power, coherence, etc.), rather than the amplitude of particular frequency

bands (theta, beta, etc.). Second, it changed the makeup of a typical NF session. In either

the conventional QNF model, or 4ZNF, the clinician will develop a protocol guided by

the QEEG findings, but will generally employ the same protocol settings repeatedly for

multiple NF sessions until the next assessment QEEG is scheduled. However, with

19ZNF, in every session the clinician can acquire and process QEEG data, compare the

pre-session data to past session data, then design an individualized z-score normalization

protocol based on that day’s QEEG profile, and then perform a 19ZNF session, all within

an hour (Wigton, 2013). Thus, each 19ZNF session uses a protocol unique to the client’s

brainwave activity of that day, providing further tailoring of the NF to the individual

needs of the client, on a session-by-session basis. This, then, brought a new dynamic to

the normalization model of NF such that z-scores (rather than amplitude of frequencies)

could be targeted, on a global basis, so as to make possible a goal of normalizing all the

QEEG z-scores (when clinically appropriate) in the direction of z = 0.

How problem/gap of 19ZNF research evolved into current form. Over its

more than 40-year history NF has frequently been criticized as lacking credible research,

as evidenced by Loo and Barkley’s (2005) critique. Nevertheless, even Loo and Makeig

(2012) recently conceded that the research has improved. For example, Arns et al. (2009)

conducted the first comprehensive meta-analysis of NF, covering 1194 subjects,

concluding that it was both efficacious and specific as a treatment for ADHD, with large

to medium effect sizes for inattention and impulsivity, respectively. Then, in a research

review sponsored by the International Society for Neurofeedback and Research (ISNR),

in what is a comprehensive review of controlled studies of NF, Pigott et al. (2013)

evaluated 22 studies to conclude that NF meets the criteria of an evidence-based

treatment for ADHD. This review further documents that NF has been found to be

superior to various experimental group controls, shows equivalent effectiveness to

stimulant medication, and leads to sustained gains even after termination of treatment.

However, as encouraging as this body of research is, it is limited in that the model

covered by these studies is largely limited to one of the most traditional models of NF

(theta/beta ratio NF) and only addresses a single condition of ADHD. Missing from these

comprehensive reviews and meta-analyses are newer QNF models, which have been in

use since the 1990s, and are frequently employed for a wider range of disorders in

addition to ADHD. Yet, that is not to say that QNF is devoid of research. In fact, from

2002 to 2013 there are at least 20 studies in peer-reviewed literature covering the QNF

model, yet there is great diversity in the different conditions treated in these studies, as

well as a greater use of individualized, custom-designed protocols; hence making meta-

analysis of this collection of research less feasible. Nonetheless, these studies do

represent a body of research pointing to the efficacy of the QNF model.

Yet, when it comes to the newest models of surface ZNF, there is no such

collection of research in the literature. There exist only two studies (Collura et al., 2010;

Hammer et al., 2011) which evaluate sample groups of the 4ZNF model, and the Collura

et al. report is mostly descriptive in nature. This, then, leaves only one experimental

study. There is one dissertation on 4ZNF (Lucido, 2012), but it too is a single case study.

Regarding 19ZNF, as of this writing, there are only two peer-reviewed published

empirical reports specifically evaluating surface potential 19ZNF (Hallman, 2012; J. L.

Koberda et al., 2012b) and those are only case studies.

Yet, this is not to say the peer-reviewed literature landscape is entirely devoid of

any mention of surface ZNF models. Nevertheless, what does exist is mostly information

about the technique in the form of review articles (Collura, 2008; Stoller, 2011; Thatcher,

2013; Wigton, 2013), chapters in edited books (Collura et al., 2009; Wigton, 2009), as

well as numerous qualitative oral and poster conference presentations since 2008. Of note

is a recent poster presentation (Wigton & Krigbaum, 2012), with a later expansion of that

work (Krigbaum & Wigton, 2013), which was a multicase empirical investigation of

19ZNF; however it primarily focused on a proposed research methodology for assessing

the degree of z-scores progression towards the mean. There also exist anecdotal

observations in the form of case reports in non-peer-reviewed publications and internet

website postings. Yet, while anecdotal observations and information from review and

case study reports are helpful for initial appraisals of a new model, quantitative statistical

analysis is needed to validate theories born of early qualitative evaluations, to counter a

lack of acceptance from the wider neuroscience

community.

Much of the focus of discussions of 19ZNF is on the potential for good clinical

outcomes in fewer sessions than traditional NF (J. L. Koberda et al., 2012a; Rutter, 2011;

Thatcher, 2013; Wigton, 2009; Wigton, 2013). However, before the question of the number of

sessions is examined, the efficacy of this emerging model should first be

established, because empirical studies evaluating the efficacy of 19ZNF are absent

from the literature. This dissertation was intended to fill this gap of knowledge, by

analyzing the efficacy of 19ZNF in a clinical setting.

Theoretical Foundations

Foundations of EEG and QEEG. Hughes and John (1999) discussed a decade-

long history, inclusive of over 500 EEG and QEEG related reports, the findings of which

indicate that cortical homeostatic systems underlie the regulation of the EEG power

spectrum, that there is a stable characteristic in healthy humans (both for age and cross-

culturally), and that the EEG/QEEG measures are sensitive to psychiatric disorders.

These factors made possible the application of Gaussian-derived normative data to the

QEEG metrics such that these measures are independent of ethnic or cultural factors,

which allow objective brain function assessment in humans of any background, origin, or

age. As a result, Hughes and John assert that, when using artifact-free QEEG data, the

probability of false positive findings is below that which would be expected by chance

at a p value of .0025. Thus, changes in the QEEG values would not be expected to occur

by chance, nor is there a likelihood of a regression to the mean of QEEG derived z-scores

because EEG measures, and the corresponding QEEG values, are not random. Since the

work of Hughes and John, well over a decade ago, there have been numerous studies

published in the literature further demonstrating the reliability and validity of QEEGs

(Cannon et al., 2012; Corsi-Cabrera, Galindo-Vilchis, del-Río-Portilla, Arce, & Ramos-

Loyo, 2007; Hammond, 2010; Thatcher, 2012; Thatcher & Lubar, 2009).

Learning theory as applied to NF. As has been stated, NF is also frequently

referred to as EEG biofeedback, and biofeedback has been defined simply as the process

whereby an individual learns how to change physiological activity (AAPB, 2011). As

Demos (2005) asserted, biofeedback is a two-way model such that 1) the physiologic

activity of interest is recorded, and 2) reinforcement is provided each time the activity

occurs; as a result, voluntary control of the targeted physiologic activity becomes

possible. On the surface this is a basic descriptor of operant conditioning. As a result, a

common practice in the literature is for NF to be referred to only as an operant

conditioning technique. However, the theoretical frameworks of NF are more correctly

framed as encompassing both classical and operant conditioning mechanisms (Collura,

2014; Sherlin, Arns, Lubar, Heinrich, Kerson, Strehl, & Sterman, 2011; Thatcher, 2012;

M. Thompson & Thompson, 2003). Operant conditioning, as first conceptualized by

Edward Thorndike (1911) with the Law of Effect, which holds that satisfying rewards

strengthen behavior, and as further advanced by B. F. Skinner (1953), has as its

primary principle that when an event is reinforced/rewarded it is likely to recur

(Hergenhahn, 2009); for Skinner, the reinforcer is anything that has contingency to a

response. It is important to note that operant conditioning relates to the learning of

volitionally controlled responses, motivation is necessary, and rewards need to be desired

or meaningful (M. Thompson & Thompson, 2003).

In contrast, classical conditioning, established by Ivan Pavlov (1928), differs in

that it deals with learning of reflexive or autonomic nervous responses. The primary

mechanism is based in the associative principles of contiguity and frequency such that the

presence of a dog’s food, which naturally elicits a salivation reflex, when paired

(contiguity) with a bell, repeatedly (frequency), will lead to the dog salivating upon the

presentation of only the bell (Hergenhahn, 2009). Thus, the pairing of two previously

unpaired events results in automatic learning in the form of classical conditioning. Yet, it

is important to note that while operant conditioning involves volitionally oriented

behavior modification, NF is a learning process which occurs largely outside of

conscious awareness; in essence, an implicit learning process (Collura, 2014). As applied

to NF, the change in the EEG, as reflected in brainwave frequencies, patterns, or z-scores,

is the behavior which is modified as a result of the combined classical and operant

conditioning occurring in the NF session (Collura, 2014).

In this context then, successful NF involves a motivated trainee experiencing the

repeated pairing of meaningful auditory and/or visual reward signals, when the recorded

brainwaves fall in a targeted range. The reward signal is typically in the form of an

auditory tone (beep, chime, music) in combination with an animated visual display

(simple game-like displays or movies), which when aesthetically pleasant to the trainee

enhances and promotes the process. Some have noted the importance of additional

learning theory components such as shaping (Collura, 2014; Sherlin et al., 2011; M.

Thompson & Thompson, 2003), anticipation of future rewards (Thatcher, 2012), and

secondary reinforcers (Sherlin et al., 2011; M. Thompson & Thompson, 2003) to further

inform NF to varying degrees. These variations as applied to NF have served to generate

a range of NF models over the years; however the basic foundations of classical/operant

conditioning remain constant in all the models.

Traditional/amplitude-based models of NF. In NF, when the EEG is divided

into different frequency bands (alpha, beta, etc.) the amplitude measures how much of that

frequency is present within the total EEG spectrum recording. The basic goal of

amplitude NF treatment models is to either increase or decrease the amplitude of a

particular frequency. These models are the longest-standing conceptualization of NF

techniques and for that reason, for the purposes herein, the term traditional will be used

to refer to these models of NF. The earliest traditional model of NF started with Kamiya’s

(1968) discovery in the early 1960s that human alpha waves could be increased and

trained to occur for increased periods of time. Next, Sterman and Friar (1972) followed up

on this work by expanding the training Sterman had been conducting with cats to include

humans, with the first known case of resolving a seizure disorder in a person using NF. In

this model the goal was to increase the beta frequency of 12-15 Hz, also referred to as

sensorimotor rhythm (SMR), along the sensorimotor cortex of the brain. Others then

expanded on this model. For example, Lubar believed the model Sterman developed

would be applicable to children with attention disorders (Robbins, 2000). After a year-

long academic fellowship with Sterman, he moved on to develop his own model which

incorporated decreasing the theta frequency in addition to increasing beta (Robbins,

2000). Lubar and Shouse (1976) reported on the first use of this approach, which was the

foundation for what would become one of the most commonly reported and researched

protocols (for use with attention disorders) in the literature since the early 1990s; that of

the theta/beta ratio model.

Another example of a traditional NF model with roots to Sterman’s efforts is the

Othmer model (S. Othmer, Othmer, & Kaiser, 1999), employing a combination of

increasing beta (either 12-15 Hz or 15-18 Hz) together with decreasing theta (4-7 Hz), and

a higher beta band (22-30 Hz); again with electrode placements primarily along the

sensorimotor cortex locations of the scalp. In the years since its introduction, there have

been different modifications and variations of the Othmer approach (S. F. Othmer &

Othmer, 2007). Nevertheless, consistent with traditional NF, this model makes use of

targeting the amplitudes of frequency bands in particular directions (i.e. make more or

less of targeted frequencies).

While some built models based in the original findings of Sterman, others

expanded on Kamiya’s work, by developing models which targeted the increase of alpha

and/or theta frequencies (in parietal brain regions) to enhance relaxation and creative

states (Budzynski, 1999). Peniston and Kulkosky (1990, 1991) developed applications of

these approaches, which led to treatment models for alcoholism and posttraumatic stress

disorders. Yet still others, such as Baehr, Rosenfeld, and Baehr (1997), established

protocols targeted to balance alpha in the frontal regions as a treatment for depression.

While each of the above models targeted different frequencies with a variety of

protocols, consistent was a focus on changing the amount of the brainwave of interest;

the desired outcome is either greater or lesser amplitude of a target frequency. Moreover,

pre-treatment assessment of EEG activity to inform NF protocols is limited to nonexistent

in the majority of these models, with a typical one-size-fits-all approach. While selecting

the particular NF model for a treatment approach (i.e. theta-beta ratio versus alpha-theta

training) is informed by the presenting symptoms of each case, personalizing a NF

protocol to address the individual brainwave patterns of the client is not the focus of these

approaches.

QNF model of NF. A key focus of QNF is precisely tailoring the NF protocol,

based on the individual EEG baseline and symptom status of the client, as determined by

the QEEG, in conjunction with clinical history and presenting symptoms (Arns et al.,

2012). The primary premise of this approach is that localized cortical dysfunctions, or

dysfunctional connectivity between localized cortical areas, correspond with a variety of

mental disorders and presenting symptoms (Coben & Myers, 2010; Collura, 2010;

Walker, 2010a). When the EEG record of an individual is then compared to a normative

database representing a sample of healthy individuals, the resulting outlier data

(deviations of z-scores from the mean) help link clinical symptoms to brain dysregulation

(Thatcher, 2013). For example, when an excess of higher beta frequencies is found, the

typical associated symptoms include irritability, anxiety, and a lowered frustration/stress

tolerance (Walker, 2010a).

The conceptual framework of the stability of QEEG, as noted above, applies to

QNF in that a stable EEG is not expected to change without any intervention; thus the

changes seen as a result of QNF are not occurring by chance, but are due to the operant

conditioning of the brainwaves as a result of the NF process (Thatcher, 2012). Therefore,

in the example of excess beta frequencies, when the symptoms of anxiety and irritability

are resolved after QNF, and the post QEEG shows the beta frequencies to be reduced

(closer to the mean), it is assumed the improvement in symptoms is due to the change in

the QEEG; thus representing improved electrocortical functioning (Arns et al., 2012;

Walker, 2010a). The term for this process, which has arisen secondary to QNF, is

generally referred to as normalization of the QEEG, or simply normalization (Collura,

2008; Surmeli & Ertem, 2009; Walker, 2010a). Consequently, the concept of

normalization is generally accepted to be when the z-scores of the QEEG move towards

the mean (i.e. z = 0).
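
The notion of normalization described above can be illustrated with a brief sketch. This is only an illustration under stated assumptions: the normative mean and standard deviation, the raw amplitude values, and the function names are hypothetical and are not drawn from any normative database or from the studies cited.

```python
def z_score(value, norm_mean, norm_sd):
    """Distance of a QEEG value from the normative mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (value - norm_mean) / norm_sd


def normalization_change(z_pre, z_post):
    """Positive result means the z-score moved toward the mean (z = 0)."""
    return abs(z_pre) - abs(z_post)


# Hypothetical beta amplitudes (microvolts) compared with hypothetical norms.
z_pre = z_score(14.0, norm_mean=8.0, norm_sd=2.0)   # excess beta before NF
z_post = z_score(10.0, norm_mean=8.0, norm_sd=2.0)  # closer to the mean after NF
change = normalization_change(z_pre, z_post)
```

Taking the difference of absolute values treats a high z-score and a low z-score alike, so movement toward z = 0 from either side counts as normalization.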

It is also important to note that the QNF model, with its reliance on the QEEG to

guide the NF protocol, embraces the heterogeneity of QEEG patterns as discussed by

Hammond (2010). In understanding that a particular clinical symptom presentation may

be related to varied deviations in the QEEG, it quickly becomes apparent that each NF

protocol needs to be personalized to the client; as well as monitored and modified for

maximum treatment effect (Surmeli et al., 2012). This, then, results in different

electrophysiological presentations being treated differently, even if the overarching

diagnosis is the same. This clinical approach is supported through multiple reports in the

literature discussing how training the deviant z-scores towards the mean (i.e. normalizing

the QEEG) in QNF results in the greatest clinical benefit (Arns et al., 2012; Breteler et

al., 2010; Collura, 2008; Ogrim & Kestad, 2013; Surmeli et al., 2013; Surmeli & Ertem,

2009, 2010; Walker, 2009, 2010a, 2011, 2012a).

However, while the personalization of NF protocols aids in greater specificity in

client treatment, it creates methodological challenges for researching QEEG based NF

models; which will be discussed further below. When boiling down the elements of study

to a lowest common denominator, overall normalization of the QEEG is the only

common point of measurement. Therefore a reasonable tool, as a measure of change in

the QEEG, would be a value reflecting the change of targeted z-scores for a particular

metric.

In summary then, in the normalization model of QNF, when the QEEG data show

excessive deviations of z-scores, and those deviations correspond to the clinical picture,

the NF protocol is targeted to train the amplitude of the frequency in the direction of the

mean (i.e. create more or less energy within a specified frequency band). In other words,

if the QEEG indicates an excess of a beta frequency (i.e. high z-scores), and the

presenting symptoms are expected with that pattern (i.e. anxiety), the protocol would be

designed to decrease the amplitude of that beta frequency. Conversely, if the QEEG

indicates a deficit of an alpha frequency, with corresponding symptoms, the protocol

would be designed to increase the amplitude of the alpha frequency. The QNF model

then, is simply traditional amplitude based NF using the QEEG to guide the protocol

development for the NF sessions.
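
The decision logic summarized above can be sketched as follows. The cutoff value, the band names, and the z-scores are hypothetical, and the sketch deliberately omits the clinical step of confirming that a deviation corresponds to the presenting symptoms; it is not a description of any published QNF protocol.

```python
def training_direction(z, cutoff=1.5):
    """Return which way to train amplitude for one z-scored metric.

    cutoff is a hypothetical threshold for an 'excessive' deviation.
    """
    if z > cutoff:
        return "decrease"    # excess, e.g. high beta accompanying anxiety
    if z < -cutoff:
        return "increase"    # deficit, e.g. low alpha
    return "no training"     # within the range treated as typical


# Hypothetical z-scores for three frequency bands at one electrode site.
site = {"beta": 2.4, "alpha": -1.9, "theta": 0.3}
protocol = {band: training_direction(z) for band, z in site.items()}
```

In practice the clinician would apply such a rule only to deviations that match the clinical picture, as the text notes.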

ZNF model of NF. The ZNF model leverages the statistical underpinnings of a

normal distribution, where a value converted to a z-score is a measure of the distance

from the mean of a population, such that the mean represents a range considered to be

normal (or typical) (Collura, 2014). With ZNF the real-time QEEG metrics are

incorporated into the NF session using a joint time frequency analysis (rather than fast

Fourier transform) to produce instantaneous z-scores, which allows for real-time QEEG

assessment to be paired with operant conditioning (Collura, 2014; Thatcher, 2013).

Therefore, where the QNF model has amplitude (as guided by the QEEG) as its targeted

metric, in its most basic form, the ZNF model targets the calculated real-time z-scores.

Yet, that being said, it is important to note that the z-scores can be considered a meta-

component of EEG metrics (i.e. amplitude or connectivity) and ultimately, even when z-

scores are targeted, the underlying EEG components are still being trained.

Nevertheless, directly targeting z-scores results in a different dynamic in the NF

training protocol. The goal is no longer to simply make more or less frequency amplitude,

but for the targeted excessive z-score metrics (whether high or low) to move towards the

mean, that is to normalize. Thus, there is a greater focus on the construct of

normalization. A second change is the inclusion of many more metrics to target. ZNF

makes available simultaneously, for up to ten frequency bands, both absolute and relative

power, ratios between frequencies (i.e. theta/beta ratio or alpha/beta ratio), as well as the

inclusion of connectivity metrics such as asymmetry, coherence, or phase lag, all as

active training metrics. Therefore, when applied to 4ZNF, the maximum number of

metrics to train is 248 (Collura, 2014) and, within the scope of the 19ZNF the maximum

number of metrics is 5700 (Collura et al., 2009). These changes make the entire range of

all QEEG metrics, or a subset of selected metrics, available for targeting with ZNF

models. Moreover, the increased number of metrics targeted by 19ZNF may allow for an

increase in regulation and synchronization of neural activity simply by the greater

number of training variables. Nonetheless, one consistent theme remains aligned with the

QNF model, in that the decision to target normalization of QEEG metrics is determined

by the presenting clinical symptoms; thus when QEEG deviations correspond to

presenting symptoms, normalization is a reasonable treatment goal.

In asking if the 19ZNF improves attention, behavior, executive function, or

electrocortical function, the research questions for this study add to what is known

regarding whether operant conditioning with 19ZNF produces clinical results that are

comparable to those reported in the literature for traditional or QNF models. Moreover,

this study also evaluates questions regarding 19ZNF and normalization of QEEG metrics.

This research fits within the overarching NF model with a specific focus on evaluating

efficacy of the ZNF model. As has been demonstrated in the literature, traditional NF is

well researched (Arns et al., 2009; Pigott et al., 2013), and as will be discussed in the next

section, the QNF model is well addressed in the literature. Conversely, as will be seen,

the ZNF models (4ZNF and 19ZNF) are still minimally represented in the literature.

Therefore, this study addresses an area which calls for further research.

Review of the Literature – Key Themes

QNF in the literature. Beginning with QNF models in reviewing the NF

literature is applicable in that the QNF model laid the ground-work for the ZNF models

that followed. Both QNF and ZNF models hold the generalized goal of normalizing the

QEEG, and for that reason, QNF is chosen as the first key theme in reviewing NF in the

literature. With few exceptions, literature presented on the QNF model comes from

research conducted in clinical settings. As a result, given the ethical constraints of

conducting research in clinical settings (e.g. asking clients to accept sham or placebo

conditions) (Gevensleben et al., 2012) few are blinded and/or randomized-controlled

studies.

Arns et al. (2012) conducted a well-designed open-label study of 21 ADHD

subjects using the QNF model, incorporating pre-post outcome measures and QEEG data.

The purpose was to investigate if the personalized medicine approach of QNF was more

efficacious (as defined by effect size) for ADHD than the traditional theta/beta or slow

cortical potential models, as reported in a meta-analysis three years earlier (Arns et al.,

2009). The outcome measures incorporated were a self-report scale based on the

Diagnostic and Statistical Manual-IV list of symptoms and the Beck Depression

Inventory. The findings of this study were statistically significant improvements (p ≤

.003) in both the attention (ATT) and hyperactivity (HI) subtypes of ADHD symptoms as

well as depression symptoms. In this study, the mean number of sessions was 33.6, and

the effect size was 1.8 for the ATT subtype, and 1.2 for the HI subtype; this was a

substantial increase over the traditional model effect sizes of 1.0 (ATT) and 0.7 (HI)

respectively. This suggests the QNF model is more efficacious (i.e. effect size of clinical

improvements) than the older traditional theta/beta or slow cortical potential models.

Furthermore, in this study, non-z-score EEG microvolt data was reported for only nine

frontal and central region electrode sites, and three frequency bands, on a pre-post basis.

In addition, the protocols employed are described as a selection of one of five

standard protocols, with QEEG informed modifications. The limitations of this study

were few but include a lack of a control group, a fairly small sample size, and that some

outcome measures were collected on only a sub-group of participants (thus reducing net

sample size). Moreover the pre-post QEEG data analysis was limited.
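
For readers unfamiliar with effect sizes, the following sketch shows one common way a pre-post standardized mean difference can be computed. The formula Arns et al. (2012) actually used is not stated in this section, so the pooled-SD form of Cohen's d below is an assumption, and the ratings are hypothetical.

```python
from statistics import mean, stdev


def cohens_d_pre_post(pre, post):
    """Standardized mean difference using the pooled SD of pre and post scores."""
    pooled_sd = ((stdev(pre) ** 2 + stdev(post) ** 2) / 2) ** 0.5
    return (mean(pre) - mean(post)) / pooled_sd


# Hypothetical symptom ratings (higher = more symptoms) for five participants.
pre = [8, 7, 9, 6, 8]
post = [5, 4, 6, 4, 5]
d = cohens_d_pre_post(pre, post)
```

Other formulations (for example, dividing by the SD of the difference scores) give different values for the same data, which is one reason effect sizes should be compared across studies with care.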

J. L. Koberda, Hillier, et al. (2012) reported on the use of QNF in a clinical

setting of a neurology private practice. All 25 participants were treated with at least 20

sessions of a single-channel traditional NF protocol, which was guided by QEEG data

and symptoms, with a goal to improve symptoms and normalize the QEEG. Clinical

improvement was measured by subjective reports from the participants in the categories

of not sure (n = 4), mild if any (n = 1), mild improvement (n = 3), improved/improvement

(n = 13), much improved (n = 2), and major improvement (n = 2); with a total of 84% (n

= 21) reporting some degree of improvement. QEEG change was reported as a clinical

subjective estimation (based on visual inspection of the QEEG topographic images) of

change in the targeted frequencies, in the categories of no major change/no improvement

(n = 6), mild improvement (n = 9), improvement (n = 8), or marked improvement (n = 1),

and one participant not interested in post-QEEG; with a total of 75% (n = 18) showing

estimation of improvement in the QEEG. Of note with this study was the heterogeneous

collection of symptoms treated which included ADD/ADHD, anxiety, autism spectrum,

behavior symptoms, cognitive symptoms, depression, fibromyalgia, headaches, major

traumatic brain injury, pain, seizures, stroke, and tremor, in varying degrees of

comorbidity per case. However, the primary limitation of this study was the loosely

defined subjective estimations of improvement for both clinical symptoms and QEEG

outcomes.
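
The percentages reported for this study can be checked against the category counts given in the text. The short sketch below does so; the grouping of categories into "some degree of improvement" is inferred from the reported totals (n = 21 and n = 18) rather than stated explicitly in the source.

```python
# Category counts as reported in the text for the 25 participants.
clinical = {
    "not sure": 4,
    "mild if any": 1,
    "mild improvement": 3,
    "improved/improvement": 13,
    "much improved": 2,
    "major improvement": 2,
}
qeeg = {
    "no major change/no improvement": 6,
    "mild improvement": 9,
    "improvement": 8,
    "marked improvement": 1,
}

clinical_total = sum(clinical.values())
clinical_improved = clinical_total - clinical["not sure"]
clinical_pct = 100 * clinical_improved / clinical_total

# One participant declined the post-QEEG, so the QEEG denominator is smaller.
qeeg_total = sum(qeeg.values())
qeeg_improved = qeeg_total - qeeg["no major change/no improvement"]
qeeg_pct = 100 * qeeg_improved / qeeg_total
```

Under this grouping the clinical figure uses all 25 participants as the denominator, while the QEEG figure uses the 24 who completed a post-treatment QEEG.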

In their randomized control study, Breteler et al. (2010) evaluated QNF as an

additional treatment with a linguistic education program. From the total sample of 19, ten

participants were in the NF group and nine were in the control group. Individual NF

protocols were based on QEEG results and four rules, with a general (though not

strictly adhered to) z-score cutoff of 1.5, which resulted in the use of eight personalized

protocols. Improvement was determined by results of outcome measures of various

reading and spelling tests, as well as computerized neuropsychological tests. Paired t tests

were applied for analysis of the difference values between the pre and post scores. The

reported findings showed the NF group improved spelling scores with a very large

Cohen’s d effect size of 3; however, there was no improvement in reading or neuropsychological

scores. QEEG data was reported, in terms of pre-post z-scores, on an individual basis (i.e.

per each case) for a limited number of targeted sites, frequencies, and coherence pairs;

with most showing statistically significant normalization.

In a retrospective study using archived clinical case files, Huang-Storms,

Bodenhamer-Davis, Davis, and Dunn (2006) evaluated the efficacy of QNF for 20

adopted children with a history of abuse who also had behavioral, emotional, social, and

cognitive problems. The children all received 30 sessions of NF (from a private practice

setting) with QNF protocols, which were individualized based on the QEEG profiles.

Data from the files of 20 subjects were collected to include pre and post scores for

outcome measures from a behavioral rating scale (Child Behavior Checklist; CBCL), and

a computerized performance test (Test of Variables of Attention; TOVA). The findings

for the CBCL were statistically significant (p < .05) for most scales and the TOVA

findings were statistically significant (p < .05) for three scales, thus demonstrating QNF

efficacy for the subjects in this study. There was no quantified QEEG reported; only

observations of general trends in the pretreatment QEEG findings, such as excess slow

waves in frontal and/or central areas.

Two researchers are most notable for several published studies evaluating the

QNF model: Walker, and Surmeli and colleagues. Each has a particular

consistent style in structuring their studies; and both have reported on the use of QNF

with a wide variety of clinical conditions. Therefore their works will be reviewed in a

grouping format. Walker has reported on mild closed head injury (Walker, Norman, &

Weber, 2002), anxiety associated with posttraumatic stress (Walker, 2009), migraine

headaches (Walker, 2011), enuresis (Walker, 2012a), dysgraphia (Walker, 2012b), and

anger control disorder (Walker, 2013). His QNF protocol development centers on

tailoring the protocol to the individual clinical QEEG data, with some restrictions of

either increasing or decreasing the amplitude of certain frequency ranges. For example,

the protocols for the anger outburst study restricted the target range to decrease only

excess z-scores of beta frequencies, combined with decreasing excess z-scores of 1-10 Hz

frequencies. For the migraine and anxiety/posttraumatic stress studies both were based on

individual excess z-score values found in the beta frequencies in a range of 21-30 Hz (to

decrease) with an addition of increasing 10 Hz. For all studies the electrode sites selected

were ones where the deviant z-scores in the targeted range were found. In the mild closed

head injury article, the protocol was different because the study was meant to evaluate

coherence training with a stated goal to normalize coherence z-scores. Thus, the most

deviant coherence pair was selected first (for five sessions each) and, then progressed to

lesser deviant pairs until the symptoms resolved or until 40 sessions were completed.

None of Walker’s reports declare a particular research design; still all involve pretest-

posttest comparisons of various outcome measures.

The outcome measures that Walker typically employs are primarily Likert or

percentage-based self-reports, except in the anger control disorder study where the

DeFoore Anger Scale self-report instrument was used to track the number of anger

outbursts. However, while all protocols are personalized, and based on QEEG findings,

there are no quantified pre-post QEEG data used as an outcome measure, and none are

reported in his studies. Overall the findings of all of Walker’s studies show improvements

in the targeted clinical conditions. In the mild closed head injury study, with an n = 26,

84% of the participants reported greater than 50% improvement in symptoms. For the

anxiety/post-traumatic stress article, with an n = 19, all improved on a Likert scale (1 –

10; 10 being worst) from an average rating of 6 before NF treatment to an average rating

of 1 after NF treatment. With the migraine study, where 46 NF participants were

compared to 25 patients who chose to remain on medication, 54% had complete

remission of headaches, 39% had a greater than 50% reduction, and 4% experienced less

than 50% reduction in migraines, all in the NF group, while in the medication group, 84%

had no change in migraines and only 8% had a greater than 50% reduction in headaches.

In three of his more recent studies, for the enuresis (n = 11), dysgraphia (n = 24), and

anger control research (n = 46), Walker reported all findings for all participants (in all

three studies) showed statistically significant improvement at p < .001.

Surmeli and colleagues reported on Down syndrome (Surmeli & Ertem, 2007),

personality disorders (Surmeli & Ertem, 2009), mental retardation (Surmeli & Ertem,

2010), obsessive compulsive disorder (Surmeli & Ertem, 2011), and schizophrenia

(Surmeli et al., 2012). Notable in this collection of work are conditions previously not

known to respond to NF, such as personality disorders, mental retardation, Down

syndrome, and schizophrenia. All of these studies report the QNF protocol as being

individualized, as informed by a combination of the QEEG findings and clinical

judgment, with an overall goal to normalize the QEEG patterns. Notable for most of the

Surmeli et al. studies is the high number of sessions reported for the cases, ranging from

an average of 50 to an average of 120 sessions. No particular research design is declared

in the Surmeli et al. studies, but here too, comparisons of pretest-posttest outcome

measures are reported.

The outcome measures in the studies mentioned above generally make use of

clinical assessment instruments designed to measure the symptoms targeted for the QNF

treatment. For example, the schizophrenia study employed the Positive and Negative

Syndrome Scale (PANSS), and for the obsessive compulsive disorder research they

incorporated the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS). For many studies,

the computerized performance Test of Variables of Attention (TOVA) was used. Yet, as with

Walker’s work, in spite of all protocols being individually QEEG-guided, QEEG data is

not used or reported as an outcome measure; only observations of general trends of the

changes in QEEGs are discussed. However, the targeted clinical symptoms, as measured

by the clinical assessments, were reported as having statistically significant improvement

in all studies. For the personality disorder study, with an n = 13, twelve were significantly

improved on all outcome measures; with the Symptom Assessment 45 Questionnaire at p

= .002, the Minnesota Multiphasic Personality Inventory (MMPI) Psychopathy scale at p

= .000, and the TOVA at p < .05 on the visual and auditory impulsivity scales. In the

study of participants with mental retardation (n = 23), 19 showed

improvement on the Wechsler Intelligence Scale for Children-Revised (Verbal

scale, p = .034; Performance scale, p = .000; Total scale, p = .000) and the TOVA

(Auditory and Visual Omission scale, p < .02; Auditory and Visual Commission scale, p

< .03; Auditory and Visual Response Time Variability scale, p < .03). In the Down

syndrome study, while the outcome measure was not a commercialized assessment, they

did develop a questionnaire formulated to evaluate symptoms associated with Down

syndrome. The findings were that all subjects in the study (n = 7) showed improvement at

p < .02 on all questionnaire scales. With QNF for obsessive compulsive disorder, with an

n = 36, 33 showed improvement on the Y-BOCS (Obsession subscale, Compulsion

subscale, and Total score all p < .01). Finally, in the schizophrenia study, with an n = 51,

47 out of 48 patients who completed pre and post PANSS improved on all scales at p <

.01. Moreover of the 33 who were able to complete the MMPI, findings showed

significant improvements (p < .01) on the scales of Schizophrenia, Paranoia,

Psychopathic Deviation, and Depression.

This review of QNF research fits within this dissertation topic as examples of how

prior studies with QEEG data have been addressed in the literature. As can be seen,

studies evaluating QNF are typically found in clinical settings, with a wide variety of

clinical symptoms and/or mental health diagnoses, and frequently have relatively small

sample sizes. Moreover the NF protocols employed typically are tailored to the

individual, informed by QEEG, with a goal to normalize the QEEG. The overwhelming

majority of clinical QNF research employs retrospective pre-post comparison research

designs and the outcome measures used are tied to the symptoms of investigation. Yet

few, if any, report pre-post QEEG metrics, and only one (Arns et al., 2012) incorporated

statistical analysis of QEEG metrics as an outcome measure (and that was to a limited

degree). Therefore, in the QNF literature, it has become an accepted practice to define

efficacy in terms of measuring symptom improvement with various clinical assessments

(both commercially and informally developed). Nevertheless, clearly there is a gap in the

reporting of group QEEG z-score mean data in the present QNF research.

4ZNF in the literature. Given that 4ZNF is the forerunner to 19ZNF, this topic is

explored to provide historical context on both its development and its coverage in the

literature. While there are numerous studies in the literature for QNF, when it comes to

ZNF studies, such is not the case. However, for the 4ZNF model there are four

representations of 4ZNF clinical results in the literature.

In a first poster presentation on the topic, Wigton (2008) presented a single case

study where 4ZNF was used with an adult to address a diagnostic history of ADHD,

bipolar disorder, and anxiety symptoms. The primary pre-post outcome measure was the

IVA. Also included were topographic images of pre and post QEEG assessments. After

25 sessions of 4ZNF, in addition to multiple subjective reports of symptom improvement

from the participant, the scaled scores for the IVA showed marked improvement. The full

scale Response Control scale improved from 29 to 94, and the full scale Attention scale

from 0 to 96. The QEEG findings (as reported by visual presentation of QEEG

topographic images) showed improvements in terms of normalization in the QEEG, most

noticeably in the left frontal delta and theta frequencies, as well as coherence and phase

lag normalization. However, a limitation of this study was a lack of statistical analysis of

pre-post QEEG data and the use of only one clinical assessment for outcome measures.

Collura et al. (2010) was the first peer-reviewed publication addressing 4ZNF,

although its organization was a loosely structured collection of clinical reports from six

clinicians covering 24 successful cases. Nonetheless, for a model with little scientific

evidence, it does stand as the only representation in the literature of a multiple-clinician

report of clinical results with 4ZNF. All cases reported clinical improvement, with no

abreactions, and the average number of sessions for all cases presented was 21.1. The

limitations of this case study are the lack of a structured methodology, no statistical

analysis, and limited pre-post outcome measures and/or QEEG data.

The study conducted by Hammer et al. (2011) represents, to-date, the only

quantitative analysis of 4ZNF. Its strength is a sound methodology with a randomized,

parallel group, single-blind design, together with QEEG z-scores as an outcome measure.

However, the setting for this research was not clinical, but rather a university

psychophysiology laboratory wherein participants were recruited specifically for the

study. The purpose was to both explore 4ZNF as a new NF model, and to evaluate the

efficacy of two different 4ZNF protocols for insomnia. The primary findings suggest that

4ZNF may be a beneficial treatment for insomnia. While this study had very small group

sample sizes (n = 5 and n = 3) all insomnia related outcome measures resulted in pre-post

treatment improvement in symptoms, and normal (or near normal) sleep was achieved by

all participants. Moreover, at follow-up 6 to 9 months after treatment, over half sustained

the treatment response. The findings of this study included QEEG measures showing

that statistically significant electrocortical change occurred for the delta frequency (p < .001)

and beta frequency (p < .01), but not high beta (p < .11). However, a limitation is that the

reported findings only included three frequencies, and the absolute and relative power z-

scores were combined in the analysis; therefore a more discrete picture of overall QEEG

normalization was not available. Further limitations of this study were the small sample

size and the lack of a control group. Yet this study does stand alone, being a peer-reviewed

publication, as an example of a quantitative methodology for measuring normalization of

QEEG z-scores with the binomial test of significance, with the 4ZNF model.

A dissertation conducted by Lucido (2012) was a single case study to evaluate the

use of 4ZNF for an adult with autism spectrum condition (ASC). This study used a

multiple baseline design, with five rounds of assessment data gathered before the 4ZNF

sessions, and a round of assessments at five incremental points during/after the NF

treatment. The outcome measures employed were the Neuropsych Questionnaire, the

CNS Vital Signs computerized neurocognitive assessment, and the Test of Nonverbal

Intelligence. While QEEG data was gathered and purported as an outcome measure, only

limited pre-post colorized topographic images were provided as a means to demonstrate

generalized changes in QEEG metrics. The results were that, with only one exception

(cognitive processing speed), all symptoms assessed with the outcome measures

improved. These included ASC symptoms, executive function, depression, anxiety, mood

stability, attention, and intelligence. To the study’s credit, this was a well-designed, well-

controlled case study; however, it still represents only a single case.

Overall, the 4ZNF model is poorly represented in the NF literature. However,

there are still themes relevant to this dissertation. Of the studies reported, most are from

clinical settings. Moreover, clinical assessments, as outcome measures, are used in all

studies. A particular stand out, though, is the Hammer et al. (2011) research, wherein

statistical analysis of QEEG metrics was used as an outcome measure.

19ZNF in the literature. With 19ZNF being the focus of this study, reviewing

what literature is available is necessary. Yet, there is an even greater dearth of published

literature for 19ZNF than for 4ZNF. Therefore, a review of conference oral and poster

presentations is necessary to sufficiently address what is known regarding 19ZNF.

Moreover, the literature reviewed herein is restricted to evaluative and/or case study

research reports regarding clinical applications of 19ZNF (rather than technical reviews

of 19ZNF).

In a first published clinical review of 19ZNF, Wigton (2009) reported initial

findings in which substantial QEEG normalization and clinical improvement were

achieved in as little as three sessions. While research into this technique was clearly

needed, the degree of success achieved in just a few sessions was a novel finding for

relative to previously known NF models. Later, in a conference presentation, Wigton (2010a)

reported on a series of case reviews that employed the Laplacian montage with 19ZNF.

There were 10 cases which included conditions such as anger issues, anxiety, ADHD, and

impaired cognition. The findings were that 19ZNF led to clinical improvements and

QEEG normalization, in fewer than 10 sessions, in seven out of the 10 cases. In this

presentation outcome measures included the IVA, the DSMD, and Likert scale reports. A

year later Rutter (2011) described, in a conference presentation, her use of 19ZNF and

how she was able to see initial indications of QEEG normalization in as little as five

sessions.

In their conference oral presentation, J. L. Koberda et al. (2012a) reported on a

comparison between 25 cases using traditional 1-channel NF and a mixed pool of 15

cases using either surface 19ZNF or LORETA ZNF. However, it is not clear how many

were 19ZNF and how many were LORETA ZNF cases. In this presentation, the clinical

symptoms addressed in the 15 cases were varied and included anxiety, headaches, chronic

pain, cognitive and behavioral disorders, as well as focal neurological disorders. The

essential finding of this presentation was that both the traditional single-channel NF and

the 19ZNF/LORETA ZNF led to improvement in clinical symptoms and improvements

in QEEG measures, but the 19ZNF/LORETA ZNF did so in fewer sessions. The

traditional NF group showed subjective self-report improvements of 84% and a

75% improvement in QEEG measures, whereas the 19ZNF/LORETA ZNF group

showed 95% subjective improvement and 62.5% improvement in QEEG measures.

However, an operationalized definition of these improvements was not clearly described

or quantified; nor were there any follow-up data reported. Nevertheless, the number of

sessions for the traditional NF was at least 20, whereas the number for the

19ZNF/LORETA ZNF group was an average of nine sessions.

Hallman (2012) presents a qualitative-style clinical review of a single case study

of a child with fetal alcohol syndrome. The purpose of the article was to describe the case

wherein 80 sessions of 19ZNF resulted in unexpectedly remarkable symptom and

behavior improvements. Moreover, the topographic images of pre-post QEEG data also

showed almost complete normalization; still there was no quantified measurement or

statistical analysis of QEEG data. There also were only subjective parental reports and no

outcome measures to quantify degree of symptom improvement.

J. L. Koberda et al. (2012b) also conducted a single case study of a 23-year-old

male, for the purpose of reporting clinical outcomes using two types of 19ZNF (surface

and LORETA). After only 15 sessions, improvements in a cognitive assessment outcome

measure were achieved; still, no inferential statistical analyses were reported for the

pre-post outcome measures. Moreover, the use of two distinctly different 19ZNF

modalities (surface and LORETA ZNF) makes it hard to know if one better accounted for

the improvement over the other. Finally, while improvements in QEEG data were

reported, again no inferential statistical analyses of these improvements were presented.

Krigbaum and Wigton (2013) present findings for 10 cases with 19ZNF. This

study is notable in that it introduced a proposed methodology for statistically

demonstrating z-score progression towards the mean (i.e., z = 0), and an approach for

plotting individual learning curves as a result of 19ZNF. Additionally, cases in the study

included outcome measures such as the IVA, DSMD, BRIEF and Likert scale (reported

on a supplementary basis, with only an indication of improvement or not), and all

outcome measures showed improvement at case completion. Repeated measures analysis

of variance (rANOVA) and paired t tests supported all three research questions such that

the z-scores progressed towards the mean (rANOVA absolute power, p < .001; relative

power, p < .04; coherence, p < .001); the post z-scores were closer to the mean than the

pre z-scores (paired t test absolute power, p < .007; relative power, p < .05; coherence, p

< .03); and clinical improvement was reported in all cases. However, no follow-up data

was reported.

Clearly, the research evaluating 19ZNF is in its infancy and there is a great need

for scientifically sound investigations. More so, the research needs to move beyond

clinical reviews and case studies. As in QNF research, the use of clinical

assessments as outcome measures is an important element; additionally, finding ways to

include QEEG metrics as outcome measures would benefit 19ZNF research.

Outcome measures for ZNF research. This topic is included to explore outcome

measures that are suitable for ZNF research. A good deal of NF research occurs in

clinical settings, where assessment instruments are employed as part of the case workup.

As such, the use of those same measures after treatment is a natural fit for what are

frequently pretest-posttest research frameworks. Other than informal self-reports (i.e.,

Likert scales), two types of popular outcome measures found in the NF literature are

rating-scale type assessments and computerized performance tests. Moreover, commonly

found in NF studies is the use of multiple outcome measures. Further, while the use of

EEG metrics as outcome measures of electrocortical change is infrequently incorporated

in NF research, there are a few reports in the literature which will be reviewed.

Computerized performance tests. Computerized performance tests are common

outcome measures in NF research, usually as a means to evaluate attention-related

symptoms associated with ADHD. One of those instruments is the IVA. While the IVA

was designed as a diagnostic aid for ADHD, the manual provides usage indications to

include assessing self-control and attention problems related to other disorders such as

depression, anxiety, head injuries, dementia, and other medical problems (Sanford &

Turner, 2009). Several NF studies have incorporated the IVA as an outcome measure to

assess attention related symptoms.

In their study evaluating the effects of NF on the cognitive abilities of a nonclinical

group of college students, Fritson, Wadkins, Gerdes, and Hof (2008) used the IVA as one of their outcome

measures; each group (experimental and control) had an n = 16. The stated objective was

to determine effects of NF on attention, impulsivity, mood, intellectual functioning,

emotional intelligence, and general self-efficacy. The IVA was one of several outcome

measures and was included to assess response control (i.e., impulsivity) and attention. The

researchers reported results in terms of the means and standard deviations of pre-post

values of eight of the primary scales of the instrument. The statistical analyses performed

were multivariate analyses of variance (MANOVAs) between the control and experimental

groups.

In evaluating the utility of the Tower of London test, as a suitable assessment

instrument for clients with Asperger’s who undergo NF, Knezevic, Thompson, and

Thompson (2010) employed the IVA as one of the outcome measures. They included six

scales of the IVA (Auditory and Visual Prudence, Auditory and Visual Vigilance, and

Auditory and Visual Speed) to assess the efficacy of NF, and evaluate the measure of

impulse control as compared to the Tower of London test. The number of subjects

reported for the IVA varied for the different scales used from a low of n = 6 to a high of n

= 12, because they only included for analysis cases where pre-test scores needed to

improve. The researchers reported the means and standard deviations of the pre-post

values of the included scales, and performed paired t tests for statistical analysis.

Steiner, Sheldrick, Gotthelf, and Perrin (2011) conducted a randomized controlled

study with 41 children, comparing NF to a standardized computer attention training

program and used four outcome measures including the IVA. However, they only

included for analysis the two most broadly defined full-scale components of Response

Control and Attention, and only reported on an n = 6 for the NF group, and an n = 10 for

the computerized training group. Repeated measures ANOVAs were performed to

analyze the pre-post outcome measures in this study.

Rating scales. Rating scale instruments are one of the most common assessment

tools found in NF literature for measuring clinical outcomes. Rating scales are

instruments which require rated objects to be assigned to categories or numerical

continua, by the rater or observer, based on their perception or remembrance of the

behavior being rated (Kerlinger & Lee, 2000). Rating scales frequently employed in NF

literature include the BRIEF, the Conners’ Rating Scales-Revised (CRS-R), the Behavior

Assessment Scale for Children (BASC), and the Child Behavior Checklist (CBCL). The

following are examples from the literature of their use in NF studies.

In a randomized study, Orgim and Kestad (2013) compared NF to medication for

a heterogeneous ADHD group with various comorbidities; each group had an n = 16, and

the NF group was administered 30 NF sessions. The outcome measures included the

rating scales of CRS-R and BRIEF. They conducted analysis of covariance (ANCOVA)

statistical tests, using baseline measurement (Time-1) as the covariate; and they analyzed

group differences at Time-2 for selected scaled scores.

The study of Huang-Storms et al. (2006) provided an example of the use of rating

scales, in a retrospective clinical study, in the form of the CBCL together with a

computerized performance test. The total number of valid CBCLs reported on was an n =

18, and all aforementioned scales were included in the analysis. The statistics employed

were two-tailed paired t test analysis.

Drechsler et al. (2007) conducted a study with an experimental design to assess

the efficacy of slow cortical potential NF with ADHD using multiple outcome measures;

where the experimental group had an n = 17 and the control group had an n = 13. Here

they employed two rating scales: the CRS-R and the BRIEF. Moreover, they only

included the composite or global scales from these instruments and performed repeated

measures MANOVAs for analysis.

In a randomized control study, Steiner et al. (2011) compared traditional NF with

computerized attention training and a waitlist control group; the group sizes were n = 9, n =

11, and n = 15. In this study, they used three rating scales: the CRS-R, the BASC, and the

BRIEF. Here too, they included selected scales from the assessments for analysis. The

statistics applied were rANOVAs, in an effort to detect if the experimental conditions

resulted in greater effects for the post NF assessment over the control group.

QEEG z-scores. As stated above, the vast majority of the QNF studies did not use

pre-post EEG metrics or z-scores as an outcome measure. Likewise, few traditional NF

studies included EEG values as an outcome measure. Yet,

in one study purported to evaluate EEG effects of NF, Gevensleben et al. (2009) reported

values, as grouped together for nine regions across the scalp, and four frequency bands.

The averages of the microvolt values (raw, non z-score EEG values) were computed for

each region and frequency band, and post values minus pre values were used as a

measure of change. Since this was a study for traditional/amplitude NF, no z-score

metrics were used. Further, there were no goals of normalization in the NF protocols.

Two QNF studies do stand out for reporting, to some degree, pre-post EEG

metrics as part of the research. With Arns et al. (2012), non z-score pre-post EEG

microvolt data was analyzed, but for only nine sites, exclusive to frontal and central

areas, and for just three power frequencies. The group data was averaged, and presented

in a graph, for each site and frequency combination. Statistically significant pre-post

differences were noted for this data. The second QNF study (Breteler et al., 2010) did

report some pre-post z-score information, but it was lacking in depth. The QEEG data

were reported for a limited number of sites and frequencies, as well as coherence pairs,

presumably as identified from the personalized training protocols.

Hammer et al. (2011) presented a unique offering in performing the binomial test

of significance to evaluate z-scores as an outcome measure of normalization. While the

results did show a statistically significant number of z-scores normalized after 4ZNF, the

findings were for only three frequencies (delta, beta, and high beta), and combined values

for absolute and relative power. Moreover, this methodology is limited in that it only

provides a yes/no level of analysis for normalization, not a discrete measure of change

towards the mean. Nonetheless, it is a useful offering in an effort to present a measure of

normalization of the QEEG in response to 4ZNF.
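
As a minimal sketch of the kind of yes/no analysis described above, the following Python example counts how many pre-post z-score pairs moved closer to zero and applies a one-sided binomial test against a chance rate of 0.5. The data values, the function names, and the choice of 0.5 as the null proportion are assumptions made for illustration only; they are not taken from Hammer et al. (2011).

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def count_normalized(pre, post):
    """Count pairs whose post z-score is closer to the mean (z = 0) than the pre z-score."""
    return sum(1 for a, b in zip(pre, post) if abs(b) < abs(a))


# Hypothetical pre/post z-scores for eight site-frequency combinations.
pre_z = [2.1, -1.8, 2.5, -2.2, 1.9, -2.7, 2.3, 1.6]
post_z = [1.2, -0.9, 1.1, -2.4, 0.8, -1.5, 1.0, 0.7]

normalized = count_normalized(pre_z, post_z)
p_value = binomial_upper_tail(normalized, len(pre_z))
print(f"{normalized} of {len(pre_z)} normalized, one-sided p = {p_value:.4f}")
```

The test registers only whether each value moved toward the mean, not how far it moved, which is the limitation noted above.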

One reason for the lack of reporting of z-scores as outcome measures may be due

to the nature of z-scores encompassing both positive and negative values, which, when

averaged, tend to cancel out a magnitude of effect. This was noted in Ramezani’s (2008)

dissertation, which was a study comparing pre and post z-scores of coherence and phase

lag as a result of traditional NF. He noted that mean comparisons of z-scores, with both

positive and negative values being cancelled in the averaging process, had the potential of

masking true differences. In an effort to account for this, he chose to transform the values

by computing the absolute value of the z-score. He then used a score of z ≥ 1.0 as

the inclusion criterion for analysis. This approach allowed for statistical analysis (i.e.,

averaging, ANOVAs, t tests) to be performed on the resulting z-scores transformed to

absolute values.
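
The cancelling effect described above, and the effect of the absolute-value transform, can be illustrated with a short Python sketch. The z-scores below are hypothetical and chosen so that the cancellation is exact; the z ≥ 1.0 inclusion step is not modeled here because its exact application is not detailed in this review.

```python
from statistics import mean

# Hypothetical z-scores for one participant across six sites (not from any study).
pre_z = [2.0, -2.0, 1.5, -1.5, 2.5, -2.5]
post_z = [1.0, -1.0, 0.5, -0.5, 1.5, -1.5]

# Signed means: positive and negative deviations cancel, hiding the change.
signed_pre = mean(pre_z)
signed_post = mean(post_z)

# Absolute-value transform: each score becomes a distance from the mean (z = 0).
abs_pre = mean([abs(z) for z in pre_z])
abs_post = mean([abs(z) for z in post_z])

print("signed means:", signed_pre, signed_post)
print("absolute means:", abs_pre, abs_post)
```

With these values, both signed means are zero even though every site moved one unit toward the mean, whereas the absolute means fall from 2.0 to 1.0.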

Krigbaum and Wigton (2013) presented a methodological approach to account for

positive and negative z-scores, by splitting the positive from negative z-scores, outside of

a cut-off score of z = ±1.0, to calculate what is termed Sites of Interest (SoI). The

averaged SoI values were then plotted to display a learning curve for each participant,

and statistical analysis (i.e., t tests and rANOVAs) was performed on the mean SoI z-score

values. While this methodology fits well for a single-subject design, and in quantifying

the progression of the z-scores towards the mean, its limitation lies in that (in the form

presented) it is not well suited for comparisons of group mean QEEG data. For example,

the split of positive and negative z-scores does not provide a single overall measure of

change for the z-scores. However, there is room to build on this research to develop a

methodology for comparing group data of QEEG z-scores.
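
One possible reading of the split described above is sketched below in Python. The cut-off of z = ±1.0 comes from the text; the function names, the sample data, and the decision to re-select sites at every session are assumptions for illustration and may differ from the procedure in Krigbaum and Wigton (2013).

```python
from statistics import mean

CUTOFF = 1.0  # cut-off of z = ±1.0, as described in the text


def split_sites_of_interest(z_scores, cutoff=CUTOFF):
    """Return (positive, negative) z-scores lying outside ±cutoff."""
    positive = [z for z in z_scores if z > cutoff]
    negative = [z for z in z_scores if z < -cutoff]
    return positive, negative


def soi_means(z_scores, cutoff=CUTOFF):
    """Mean of the positive and of the negative sites of interest (None if a side is empty)."""
    positive, negative = split_sites_of_interest(z_scores, cutoff)
    return (
        mean(positive) if positive else None,
        mean(negative) if negative else None,
    )


# Hypothetical z-scores for one participant at three sessions.
sessions = [
    [2.4, 1.8, -2.2, -1.6, 0.4],
    [1.9, 1.3, -1.7, -1.2, 0.3],
    [1.4, 0.9, -1.3, -0.8, 0.2],
]

learning_curve = [soi_means(s) for s in sessions]
for number, (pos, neg) in enumerate(learning_curve, start=1):
    print(f"session {number}: positive SoI mean = {pos}, negative SoI mean = {neg}")
```

Each session yields two values rather than one, which illustrates why the split, in this form, does not give a single overall measure of change.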

Therefore, while few NF studies include EEG or QEEG z-score metrics as

outcome measures, when they do, frequently they only analyze selective components (i.e.,

selected sites and/or frequencies). As a result, to date, no proposed methodology for

quantifying overall normalization has been published. Averaging non-transformed z-

scores is less than optimal due to the cancelling factor of the positive and negative values;

and the binomial test of significance provides only limited categorical analysis of the

data, without a measure of distance from the mean. The Krigbaum and Wigton (2013)

study appeared the closest to providing a model for measuring overall normalization of

the QEEG at this time. Still, building on this approach, by taking the absolute value of the

z-scores, to provide a single value as a measure of the distance from the mean, could

prove advantageous.

In summary, common themes in the literature present suitable outcome measures

for NF research to consist of computerized performance tests, rating scale instruments,

and QEEG metrics. Examples such as the IVA, the BRIEF, and z-scores were discussed.

These findings are relevant to this research in that the same or similar instruments were

used for the present study.

Summary

In reviewing the 40-year history of NF, a discussion of the historical context of

EEG, QEEG, and NF was presented. NF is grounded in learning theory and through the

years various models, such as traditional NF, QNF, and ZNF, have emerged. While 19ZNF is

one of the newest NF models, it does not enjoy a demonstration of efficacy by evidence-

based research, which exists for the traditional models. In fact, there are significant gaps

in the literature in that no scientifically rigorous studies of 19ZNF have been found. This

study aims to begin filling this empirical gap by analyzing the efficacy of 19ZNF

in a clinical setting, thereby contributing to the body of scholarly knowledge

regarding 19ZNF.

Prior QNF and ZNF research is commonly found in clinical settings. These

research studies typically employ pretest-posttest designs using relatively small sample

sizes, while incorporating clinical assessment instruments and occasionally QEEG

metrics, as outcome measures. Moreover, NF protocols are generally individually

tailored, based on QEEG findings, with a goal to normalize the QEEG; and

heterogeneous collections of conditions included in studies are frequently found. These

traditions were followed for this study, in both design and outcome measures, in

evaluating 19ZNF. In utilizing QEEG z-scores as an outcome measure, prior research

methods (SoI and taking absolute values of z-scores) were expanded on to establish a

measure of distance from the mean, for statistical analysis of group data. The ZNF

theory, grounded in the use of real-time z-scores with a goal of normalizing the QEEG,

such that the z-scores move towards the mean (z = 0), underlies the 19ZNF approach;

which was the focus of investigation in this pretest-posttest comparison research.

A detailed review and description of the methodology for this research is

presented in the following chapter. To be included is an overview of the study, as well as

further discussion of data collection and analysis methods. Additionally, the

instrumentation, together with reliability and validity issues, will be discussed as it

applies to the study. Limitations will also be reviewed.

Chapter 3: Methodology

Introduction

Over the years, new models of NF have been developed, and one of the most

current is 19ZNF. To date, case study and anecdotal clinical reports within the field

indicate this new 19ZNF approach is an improvement over traditional NF models (J. L.

Koberda et al., 2012a; Wigton, 2013). Still, the efficacy of this new model has not yet

been established from empirical studies.

This research is different from other 19ZNF studies. It is a quantitative analysis of

pre-post outcome measures, with group data from a clinical setting, and thus, it is a

beginning in establishing empirical evidence regarding 19ZNF. The purpose of this

retrospective one-group pretest-posttest research was to compare the difference between

pre and post clinical assessments and QEEG z-scores data, before and after 19ZNF

sessions, from archived data of a private neurofeedback practice in the Southwest region

of the United States.

The remainder of this chapter reviews the problem statement and research

questions, discusses the methodology and research design, and also describes the

population and sample selection. Next, the instrumentation is presented together with a

discussion of the associated validity and reliability. Then, data collection and data

analysis are covered. Finally, a discussion of ethical considerations and the study

limitations is presented.

Statement of the Problem

It is not known, by way of statistical evaluation of either clinical assessments or
QEEG z-scores, if 19ZNF is an effective NF technique. This is an important problem

because 19ZNF is a new NF model currently in use by a growing number of practitioners,

yet scientific research investigating its efficacy is lacking. Anecdotal reports are

insufficient as a basis for determining treatment efficacy and uncontrolled case studies

are scientifically weak (La Vaque et al., 2002). Therefore, scientifically sound evidence

of efficacy for 19ZNF is needed.

Research Questions and Hypotheses

For this research, the independent variable was the 19ZNF and the dependent

variables were clinical outcomes, as measured by the scaled scores from three clinical

assessments (IVA, DSMD, BRIEF) and z-scores from QEEG data. Given the

retrospective nature of this study, the approach for data collection was gathering archived

de-identified data, from closed case files, of an NF private practice. The process consisted

of collecting the necessary data elements (i.e., subject demographics, assessment scale

scores, and z-scores) into spreadsheets, for further analysis by statistical software (SPSS).

As will be discussed in detail in the research design section below, this study employed a

one-group pretest-posttest design. This was the best design for the proposed research

because the goal was to compare the means of the outcome measures at two different

time points (before and after 19ZNF) (Kerlinger & Lee, 2000).

As will be detailed in the instrumentation section, and briefly reviewed below, the

clinical assessments are generally designed to measure symptom severity of attention,

behavior, and executive functioning; and the z-scores are a representational measure of

electrocortical function. The clinical assessments are commercially available instruments,

widely used in the mental health field for measuring symptom severity. The QEEG data

has been collected with a commercially available QEEG software package, which has

been in general use in the neurofeedback field since 2002.

The instrument to measure attention was the IVA continuous performance test.

This is a computerized test designed to assess both auditory and visual attention and

impulse control symptoms associated with ADHD (Sanford & Turner, 2009). The

associated research question and hypotheses were:

R1a. Does 19ZNF improve attention as measured by the IVA assessment?

Ha1a: The post scores will be higher than the pre scores for the IVA

assessment.
H01a: The post scores will be lower than, or not significantly different
from, the pre scores of the IVA assessment.

The instrument to measure behavior was the DSMD. This is a behavioral rating

scale, completed by parents, designed to assess behavior problems and psychopathology

in children and adolescents (Cooper, 2001). The associated research question and

hypotheses were:

R1b. Does 19ZNF improve behavior as measured by the DSMD assessment?

Ha1b: The post scores will be lower than the pre scores for the DSMD

assessment.
H01b: The post scores will be higher than, or not significantly different
from, the pre scores of the DSMD assessment.

The instrument to measure executive function was the BRIEF. This is a rating

scale, completed by parents, or self-rated in adults, designed to measure observations of

executive function skills in everyday environments (Gioia, Isquith, Guy, & Kenworthy,

2000; Roth, Isquith, & Gioia, 2005). The associated research question and hypotheses

were:

R1c. Does 19ZNF improve executive function as measured by the BRIEF

assessment?

Ha1c: The post scores will be lower than the pre scores for the BRIEF

assessment.
H01c: The post scores will be higher than, or not significantly different
from, the pre scores of the BRIEF assessment.

The QEEG z-scores, which are a representational

measure of electrocortical function, were derived from QEEG assessments collected using the

Neuroguide software. This is software designed to provide statistical analysis of the

quantified EEG metrics, such that z-scores are calculated to allow a comparison to a

normative database (Thatcher, 2012). The associated research question and hypotheses

were:

R2. Does 19ZNF improve electrocortical function as measured by QEEG z-

scores such that the post z-scores are closer to the mean than pre z-scores?

Ha2: The post z-scores will be closer to the mean than the pre z-scores.

H02: The post z-scores will be farther from the mean than, or not significantly

different from, the pre z-scores.

Research Methodology

The field of clinical psychophysiology makes use of quantifiable variables and the
associated research should include specific independent variables, as well as dependent

variables, which relate to treatment response (i.e., clinical assessments) and the measured

physiological component (i.e., EEG metrics) (La Vaque et al., 2002). Yet, many NF

studies do not use the EEG metrics as a measure of the cortical component of

psychophysiologic function (Arns et al., 2009), but rather provide reports that are

more qualitative in nature to discuss NF-related QEEG changes. Moreover, NF research

needs to include quantitative methodologies, using QEEG data as an outcome measure, to

learn more about the psychophysiological basis of NF (Gevensleben et al., 2009). Therefore, a

quantitative methodology was selected, as opposed to qualitative, to address this need.

Currently, the available 19ZNF studies are in the form of qualitative research

(Hallman, 2012; J. L. Koberda et al., 2012a). This literature entails presenting data from

single case studies in the form of unstructured subjective reports of symptom

improvement, as well as graphical images of before and after QEEG findings, where the

improvement is represented by a change in color on the picture (without statistical

analysis of data). However, for this dissertation, the goal was to explore statistical

relationships between the variables under investigation; thus calling for a quantitative

approach. The strength of quantitative methodologies, including quasi-experimental

research, is that they provide sufficient information regarding the relationship between the

investigation variables, and its level of significance, to enable the study of the effects of

the independent variable upon the dependent variable (Carr, 1994). Therefore, employing

a quantitative method is intended to leverage this strength in the evaluation of 19ZNF.

Research Design

This quasi-experimental research used a retrospective, one-group pretest-posttest

design. When the goal of research is to measure a modification to a behavior pattern, or

internal process that is stable and likely unchangeable on its own, the one-group pretest-

posttest design is appropriate (Hunter & Schmidt, 2004; Kerlinger, 1986). This type of

design answers the research questions by comparing the collected dependent variable

pretest measures to the posttest values for each subject; thus comparing the members of

the group to themselves, rather than to a control or comparison group (Kerlinger & Lee,

2000). Consequently, the group members become their own control; thus controlling for

and thereby reducing the potential for extraneous variation due to individual-to-individual

differences (Kerlinger & Lee, 2000). Moreover, the size of the treatment effect can be
estimated by analyzing the difference between the pretest and the posttest measures

(Reichardt, 2009). This is a retrospective study because the data

available for analysis are from pre-existing archived records, which frequently provide a

rich source of readily accessible data (Gearing et al., 2006). Therefore, the chosen design

for this investigation is the best to evaluate the pre-post outcome measures from a clinical

setting, as well as the identified research questions for this study.
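
To make the within-subject comparison concrete, the following Python sketch computes the paired differences for a one-group pretest-posttest sample, along with a paired t statistic and a standardized mean difference. The scores and the function name are hypothetical, and the effect size shown (mean difference divided by the standard deviation of the differences) is one common choice assumed for illustration rather than the one used in this study. A p value is not computed here; that would require a t distribution, as provided by a statistical package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(pre, post):
    """Mean difference, paired t statistic, degrees of freedom, and effect size d_z."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need equal-length pre and post lists with at least two subjects")
    # Each subject serves as their own control: analyze post minus pre per subject.
    diffs = [b - a for a, b in zip(pre, post)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    n = len(diffs)
    t_stat = mean_diff / (sd_diff / sqrt(n))
    d_z = mean_diff / sd_diff  # standardized mean difference for paired data
    return {"mean_diff": mean_diff, "t": t_stat, "df": n - 1, "d_z": d_z}


# Hypothetical scaled scores for five subjects (higher = better on this scale).
pre_scores = [80, 85, 78, 90, 82]
post_scores = [88, 91, 85, 93, 90]

result = paired_summary(pre_scores, post_scores)
print(result)
```

For instruments where lower scores indicate improvement, the sign of the mean difference would be interpreted in the opposite direction.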

As previously stated, the independent variable was the 19ZNF and the dependent

variables were the data from the three clinical assessments and QEEG files; as such, the

specific instruments used to collect the data were the IVA, DSMD, and BRIEF

psychometric tests, as well as the QEEG software. A sample group was formed for each

dependent variable outcome measure so as to form four groups for analysis. Therefore,

using a one-group pretest-posttest design with these identified groups is fitting.

Population and Sample Selection

When individuals seek NF services they must choose among a variety of NF

models. Yet the dearth of scientific literature regarding 19ZNF limits the information

available for that decision process. The identified population for this research was made

up of those seeking NF services (both adults and children), or those who accessed NF

services. These individuals may have had an array of symptoms, which adversely affect

their daily functioning, most commonly in the areas of attention, behavior, and executive

function; they may also have been previously diagnosed with related mental health disorders.

From the total population (those seeking, or who have already accessed, NF services),

this particular study population was identified as all prior clients of the NF private

practice which provided the retrospective data. Given the retrospective nature of this

research, there was no active recruitment of subjects; thus sample selection was

determined by inclusion criteria from available pre-existing cases. The study sample,

then, comprised the cases which met the inclusion criteria of being a 19ZNF case, having both

a pre and post QEEG assessment, as well as either an IVA, or a DSMD, or a BRIEF

assessment, for both pre and post conditions. Moreover, given the sample consisted only

of pre-existing de-identified data, as will be further detailed below (Data Collection

section), there was no need for an informed consent process. For this research, the total

aggregate sample size was 21 subjects, which was then divided into three additional

outcome measures groups (IVA, DSMD, or BRIEF). The sample size for the IVA group

was 10, the DSMD group was 14, the BRIEF group was 12, and all 21 subjects had

QEEG data.

In a meta-analysis evaluating traditional NF, for ADHD, not using QEEG–

targeted specificity in the NF protocols, Arns et al. (2009) reported an average (averaged

for attention and hyperactivity symptoms) Hedges’ d effect size of 0.85 (0.3 as small, 0.5

as medium, and 0.8 as large). In a more recent NF study where the treatment was more

personalized and targeted with QNF, Arns et al. (2012) reported the average Hedges’ d

effect size was found to be nearly double to 1.45 for the combined symptoms of attention

and hyperactivity. Arns et al. (2012) suggested these findings indicate the personalization

of treatment protocols, afforded by QNF, improves clinical outcomes. Given that 19ZNF

also incorporates personalized QEEG-informed treatment protocols, it is reasonable to

expect comparable effect sizes with 19ZNF. Thus, in determining a needed sample size

using the G*Power3 software (Faul, Erdfelder, Lang, & Buchner, 2007), for the reasons

cited by Arns et al. (2012), it would be reasonable to use a predicted effect size in the

range of 1.0 to 1.5. Using the more conservative effect size value of 1.0, with a one-tail

analysis, alpha level of .05, and a power level of 0.80, for repeated measures t tests, the

calculated needed minimum sample size is eight. Therefore, groups with a sample size of

10 or more are sufficient for the data analysis to be performed in this study.
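
As an illustration only, the sample-size reasoning above can be approximated without G*Power. The sketch below is not the G*Power3 implementation; it assumes the standard noncentral t formulation of power for a one-tailed paired t test, and the function names (t_upper_tail, critical_t, required_n) and the numerical integration settings are hypothetical choices made for this example.

```python
import math


def _phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def t_upper_tail(t_value, df, ncp=0.0, steps=2000):
    """P(T > t_value) for a t variable with noncentrality parameter ncp.

    Writes T = (Z + ncp) / (S / sqrt(df)) with S chi-distributed, then
    integrates over S with Simpson's rule (an illustrative numerical choice).
    """
    upper = math.sqrt(df) + 12.0  # chi density is negligible beyond this point
    h = upper / steps
    log_norm = (1.0 - df / 2.0) * math.log(2.0) - math.lgamma(df / 2.0)
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        if s == 0.0:
            density = math.exp(log_norm) if df == 1 else 0.0
        else:
            density = math.exp(log_norm + (df - 1) * math.log(s) - s * s / 2.0)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * density * _phi(ncp - t_value * s / math.sqrt(df))
    return total * h / 3.0


def critical_t(alpha, df):
    """One-tailed critical value, found by bisection on the central t tail."""
    low, high = 0.0, 100.0
    for _ in range(50):
        mid = (low + high) / 2.0
        if t_upper_tail(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def required_n(effect_size, alpha=0.05, power=0.80, max_n=200):
    """Smallest n whose one-tailed paired t test reaches the target power."""
    for n in range(2, max_n + 1):
        df = n - 1
        ncp = effect_size * math.sqrt(n)  # noncentrality for difference scores
        achieved = t_upper_tail(critical_t(alpha, df), df, ncp)
        if achieved >= power:
            return n, achieved
    raise ValueError("target power not reached within max_n")


n, achieved = required_n(1.0)
print(n, round(achieved, 3))
```

Under these assumptions, the search should agree with the minimum sample size of eight reported above; small differences in the achieved power relative to G*Power output would reflect only the numerical integration.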

Instrumentation

The type of archived data used was from the following instruments: One

computerized performance test (IVA), two rating scales (DSMD and BRIEF), and QEEG

z-scores (Neuroguide software). All clinical assessments are commercially available

validated instruments, having a history of common use in the mental health industry. The

QEEG software is also commercially available, and since 2002 has been used

internationally by NF clinicians, in university research settings, and military/veteran

institutions (Besenyei et al., 2012; Thatcher, North, & Biver, 2005). All instruments were

completed as part of the pre and post assessment routines during the previously

completed NF treatment process. All treatments were provided by the researcher who is a

state Licensed Professional Counselor, a board certified Neurofeedback Therapist, and a

certified QEEG Diplomate. Descriptions of each of the instruments are provided next,

with a discussion of validity and reliability in separate following sections.

IVA. As reported by Sanford and Turner (2009), the IVA is a 13-minute

computerized test, with 500 responding or inhibiting trials, normed for ages six to adult,

designed to assess both auditory and visual attention and impulse control; with the aim to

aid in the quantification of symptoms and diagnosis of ADHD. Yet, the manual provides

usage indications to include assessing attention and self-control problems related to other

disorders, such as depression, anxiety, head injuries, dementia, and other medical

problems. The test taker is given standardized instructions, from a computer digitized

voice file, that they will see or hear the numbers 1 or 2, and to click the mouse when they

see or hear the number 1, and to refrain from clicking if they see or hear the number 2.

There are two global full-scale composite scores of Full Scale Response Control

Quotient, and Full Scale Attention Quotient. Each full scale is broken into auditory and

visual scales. Auditory and visual primary scales for Response Control include Prudence

(impulse control), Consistency (response reliability), and Stamina (sustained attention

over time). Auditory and visual subscales for Attention include Vigilance (inattention),

Focus (mental processing variability), and Speed (reaction time). The test results are

reported in the form of quotient scores such that a score of ≤ 85 is indicative of clinical

significance. As a performance test, the IVA is completed directly by the subject.

DSMD. The DSMD is a behavior rating scale designed to assess behavior

problems and psychopathology in children and adolescents; the child form (ages 5 to 12)

and adolescent forms (ages 13 to 18) have 110 items which describe problem behaviors,

with a 65% overlap between the two forms (Cooper, 2001). The rater can be either a

parent or teacher, with separate norms for each; in this research, only parent ratings are

used. Both versions have (1) a composite Externalizing scale made up of Conduct and

Attention (child)/Delinquency (adolescents), (2) a composite Internalizing scale made up

of Anxiety and Depression, (3) a composite Critical Pathology scale made up of Autism

and Acute Problems, and (4) a global Total scale (Peterson, 2001). The instrument scores

are expressed in T scores, with scores ≥ 60 indicating clinical significance, and can be

completed in about 15 minutes.

BRIEF / BRIEF-A. The BRIEF is a rating scale, with 86 items, designed to

sample observations of children’s (ages 5 to 18) executive function skills in everyday

natural settings, with forms suitable for completion by parents and teachers (Donders,

2002). For this study only the parent form was available. This instrument is intended to

assess behavioral, emotional, and metacognitive skills, which broadly encompass

executive skills, rather than measure behavior problems or psychopathology (Donders,

2002). The BRIEF-A is the adult version (ages 18 to 90), self-report form, with 75 items,

which is designed to assess the views of one’s own executive function skills (self-

regulation) in their everyday environment (Gioia et al., 2000). Both instruments have an

overall summary scale of Global Executive Composite (GEC), which is comprised of two

primary sub-scales of Behavioral Regulation Index (BRI) and Metacognition Index (MI).

The BRI is made up of clinical scales of Inhibit, Shift, and Emotional Control for both the

adult and child versions, with the BRIEF-A adding a scale of Self-Monitor to the

behavior regulatory clinical scales category. The MI, for both the BRIEF and BRIEF-A,

is made up of five clinical scales of Initiate, Working Memory, Plan/Organize,

Organization of Materials, and Monitor. Both assessments take approximately 15 minutes

to complete; and scores are expressed in terms of T scores, with scores ≥ 65 indicating

clinical significance (Gioia et al., 2000; Roth et al., 2005).

Neuroguide and QEEG acquisition. The QEEG data was acquired and

processed with the Neuroguide software. This software is designed to collect

conventional EEG data, and then allow for simultaneous visual inspection of the raw

EEG waveforms together with statistical analysis of the quantified EEG metrics

(Thatcher, 2012). Software modules allow the EEG data to be compared to a lifespan

normative database. The database has been normed, for both eyes open and eyes closed

conditions, with 625 individuals from ages of 2 months to 82 years of age, with the

included subjects being screened for normalcy (normal intelligence, lack of pathology or

mental health disorders) through history, interviews, neuropsychological testing and other

evaluations (Thatcher, Walker, Biver, North, & Curtin, 2003). The amplifier used for the

EEG acquisition was the Brainmaster-Discovery 24E (Brainmaster Technologies, Inc,

Bedford, OH), with an A/D conversion of 24 bits resolution, a sampling rate of 256 Hz,

and input impedance of 1000 GOhms. Impedance is the obstruction of flow of electrical

current when measuring non-direct current signals (Farley & Connolly, 2005).

EEG data was acquired and processed as has been described by Krigbaum and

Wigton (2013), using accepted standards of QEEG acquisition methods, thus ensuring

quality recordings. An electrode cap (Electro-Cap Inc; Eaton, OH) was used to place the

19 electrodes according to the International 10-20 System referenced to linked ears, with

Electro-Cap brand electro-conductive gel. Electrode impedances were adjusted to be

below 10k ohm for all electrodes and balanced. The digital format of the EEG recording

was with a low-pass filter of 50 Hz and a high-pass filter of 0.5 Hz. The pre and post

EEG recordings were acquired with eyes open in a waking-relaxed state, sitting in an

upright relaxed position. The instructions given were to remain still, inhibit muscle

activity from forehead, neck, and jaws, as well as eye movements and blinks. Screening

of EEG was conducted carefully to exclude technical and biological artifacts. The EEG

Selection method (Thatcher, 2012) was used to eliminate artifacts prior to submitting the

EEG to a fast Fourier transformation (FFT) procedure. The remaining edited EEG

consisted of an average of 1 minute of data (30 2s epochs), thus ensuring a representative

sample of data verified by the split-half and test-retest values being ≥ .90. The digitally

filtered frequency bands, for surface potential metrics of absolute power, relative power,

and coherence, were as follows: Delta (1-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), alpha1

(8-10 Hz), alpha2 (10-12 Hz), beta (12-25 Hz), beta1 (12-15 Hz), beta2 (15-18 Hz), beta3

(18-25 Hz), and high beta (25-30 Hz).

Validity

The concept of test validity refers to the degree to which a test accurately measures that

which it proposes to measure, and also how well it measures the target in question

(Anastasi & Urbina, 1997). Thus, the emphasis is on the accuracy of the measure with

regard to the aspect of what is to be measured. Aspects of validity of the outcome

measures for this study will next be addressed.

IVA. A concurrent and diagnostic validity study was conducted by Nova

Southeastern University and BrainTrain, Incorporated. The findings suggested the overall

accuracy, when compared to diagnoses of ADHD provided by physician/psychologists, to

be statistically significant (p < .0001). Moreover, the sensitivity (true positives) was

reported to be 92%, specificity (true negatives) as 90%, and positive and negative

predictive power as 89% and 93% respectively (Sanford & Turner,

2009).
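
The four accuracy indices reported above all derive from a two-by-two table of test results against clinician diagnoses. A minimal sketch follows; the function name and the counts are hypothetical and were chosen only to illustrate the arithmetic, and they are not the validation study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from a 2x2 table.

    tp/fn: diagnosed cases the test flagged / missed.
    tn/fp: non-cases the test cleared / flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all cases
        "specificity": tn / (tn + fp),  # true negatives among all non-cases
        "ppv": tp / (tp + fp),          # positive predictive power
        "npv": tn / (tn + fn),          # negative predictive power
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=46, fp=5, tn=45, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that the predictive power values depend on how common the diagnosis is in the sample, whereas sensitivity and specificity do not.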

DSMD. Peterson (2001) reported content validity for the DSMD to be good, with

a strong congruence with the Diagnostic Statistical Manual-IV criteria regarding the

behaviors examined. Moreover, the DSMD scales have a diagnostic potential to identify

normal versus hospitalized children/adolescents with an accuracy range of 70% to 90%

(Cooper, 2001). In a study to examine concurrent validity with the BASC and the CBCL,

Smith and Reddy (2002) found the DSMD to demonstrate strong concurrent validity with

scales that were conceptually similar. For example, between the DSMD and the

CBCL, the correlations were .81 for the Externalizing scale, .83 for the Internalizing

scale, and .86 for the Total scale (Smith & Reddy, 2002). This is important, given that many

NF studies have previously used the BASC and CBCL; thus demonstrating the DSMD to

be similar to other rating scales, as a behavior measure, used with prior NF studies.

BRIEF / BRIEF-A. Content validity for the BRIEF was determined by seeking

agreement between multiple pediatric neuropsychologists and the test authors for fit of

each test item. The items retained in the clinical scales have item-total correlations that

range from .43 to .73 (Gioia et al., 2000). Content validity for the BRIEF-A was

conducted in a similar manner by seeking agreement among multiple neuropsychologists

experienced with executive function issues in clinical practice. Of the retained items for

the clinical scales the agreement ranged from .38 to .98 (Roth et al., 2005).

Neuroguide QEEG database. As described by Thatcher et al. (2003), the

validation procedure for the Neuroguide QEEG database included a leave one out

Gaussian (normal distribution) cross-validation process, whereby the data for each

subject in the database was removed and then compared to that same database. This is

important because the database against which data are compared needs to fit the normal curve

to ensure unbiased error estimates. In a normal distribution cross-validation with a perfect

fit, it would be expected that 2.3% of the comparison sample would fall outside of +2

standard deviations (SD) and again at -2 SD, and that 0.13% at +3 SD and again at –3

SD. Therefore, percentages which approximate these values can be deemed as validating

the normal distribution. The cross-validation process for the Neuroguide database

revealed an overall percentage (of all metrics) at +2 SD to be 2.58%, and for –2 SD to be

1.98%; then for +3 SD to be 0.18%, and for -3 SD to be 0.14%. Moreover, the kurtosis

and skewness of the database, if fitting the normal distribution, would be within a few

percentage points of zero. Thatcher et al. (2003) reported the validation process found

the Neuroguide database to meet the criteria for skewness with an overall percentage of

0.17%, and for kurtosis with an overall percentage of 2.91%.
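
The expected tail percentages quoted above follow directly from the standard normal distribution. The short sketch below recomputes them for comparison with the observed cross-validation values; the function name is hypothetical, and the observed figures are simply copied from the text.

```python
import math


def upper_tail_percent(z):
    """Percentage of a standard normal distribution lying above z."""
    return 100.0 * 0.5 * math.erfc(z / math.sqrt(2.0))


# Observed +2 SD and +3 SD percentages as quoted in the text.
observed = {2.0: 2.58, 3.0: 0.18}

for z, obs in observed.items():
    expected = upper_tail_percent(z)
    print(f"+{z:.0f} SD: expected {expected:.2f}%, observed {obs:.2f}%")
```

By symmetry, the same expected percentages apply below -2 SD and -3 SD.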

Reliability

Reliability is an important aspect in determining if one can trust that a particular

assessment will give a comparatively similar measure if it is given at another time. As

such, reliability reflects score consistency and predicts how much variation one can

expect from one administration of the test to the next (Anastasi & Urbina, 1997). Thus,

reliability allows an estimate of the error of measurement for the instrument. Aspects of

reliability of the outcome measures for this study will next be addressed.

IVA. A test-retest study was conducted by Nova Southeastern University and

BrainTrain, Incorporated, with a testing interval of 1 to 4 weeks. The results showed

statistically significant (p < .01) reliability coefficients ranging from .37 to .75 (Attention

scales: .66 to .75; Response Control scales: .37 to .41). The findings of this study are

further reported to support the IVA as being a stable measure of performance while also

being robust against learning or practice effects, such that changes in test scores over

time can reliably be attributed to environmental or treatment effects (Sanford & Turner,

2009).

DSMD. The test-retest reliability was measured for the DSMD and is reported to

range from .80 to .90 for the scales, with an interval of a 24-hour period (Peterson, 2001).

BRIEF/BRIEF-A. The test-retest reliability was measured for both the clinical

and normative samples of the BRIEF, which was reported to be .81 for the normative

sample and .79 for the clinical sample, with an average interval of two to three weeks;

whereas the reliability for the BRI, MI, and GEC was ≥ .80 for both the clinical and

normative samples (Gioia et al., 2000). For the BRIEF-A the test-retest reliability, over

an average interval of four weeks, for the clinical scales was reported to range from .82 to

.93; with the reliability for the BRI, MI and GEC being > .92 (Roth et al., 2005).

Neuroguide QEEG software. Recently, a study was conducted to evaluate the

reliability of the FFT metrics of the Neuroguide software. Cannon et al. (2012) found the

Neuroguide test-retest reliability, at a 30-day interval, to be ≥ .77 for absolute and relative

power, and coherence. A further measure of reliability, for the individual EEG records

in Neuroguide, is a pair of test-retest and split-half values calculated when the

artifacts are removed; values ≥ .90 indicate a representative sample of the overall

EEG record (Thatcher, 2012). The edited EEG records for this study were edited such

that both the split-half and test-retest measures were on average ≥ .90.

Data Collection Procedures

The sample consisted of a convenience sample from reviewed closed cases, of

clients from a private neurofeedback practice, who were administered the clinical

assessments and QEEGs before and after 19ZNF treatment. Regarding the retrospective

data used in this study, those clients were informed that after their treatment was

completed and their case closed, non-identifying data could be used for quality assurance

and/or future research purposes; they were all given the opportunity to opt-out. To be

considered an available 19ZNF case, the clinical symptoms presented during the intake

assessment corresponded with the z-score deviations of the QEEG findings, such that a

treatment goal of overall QEEG normalization was clinically appropriate. While the

19ZNF protocol developed for each case was individually tailored to the clinical and

QEEG findings, and possibly modified at each session to correspond with the baseline

QEEG data of that day, the same treatment goal always applied; that of overall QEEG

normalization. Therefore, the underlying 19ZNF protocol of overall QEEG normalization

was consistent for all cases. The hardware platform was the Brainmaster Discovery 24E

amplifier, and the software platform was either the Brainmaster Discovery or Neuroguide

NF-1 19ZNF software. The 19ZNF sessions used the Brainmaster Flashgame visual NF

displays (i.e. simple non-movie animations); and the reward percentages were

approximately 30% to 50% (i.e. 20 to 30 rewards-per-minute).

As depicted in Figure 1.1, from the available 19ZNF cases, an initial group was

formed for which pre-post QEEG assessments existed, and for which either the IVA,

DSMD, or BRIEF pre-post assessment data were also available (n = 21). From this

collection, three additional groups were formed. One group was created for the IVA data

(n = 10), a second group for the DSMD data (n = 14), and a third group for the BRIEF

data (n = 12).

The data collected for this study were from pre-existing documents/files and

recorded by the investigator in a manner such that the subjects cannot be identified.

Therefore, in accordance with 45 CFR 46.101(b) and 46.101(b)(4), this research was

exempt from the requirements of the Protection of Human Subjects 45 CFR part 46

(2009) regulation. Consequently, the university Institutional Review Board (IRB)

determined this study to be exempt from IRB review, under exemption category 7.4 (see

Appendix B). As such, IRB-approved informed consent for use of the de-identified data

for this research was not necessary. All data for this study were previously obtained

during the course of subjects’ NF treatment. While the data came from records that

already exist prior to the start of the study, there was a form of data collection by pulling

de-identified information from a review of the archived records of the private practice.

Upon IRB approval, the information was gathered and de-identified in a format such that

it was impossible to identify the subjects. For example, copies/scans were made of the

assessment scoring sheets, but names and/or birthdates (or any other identifying

information) were redacted, and only a sequential case number was assigned and written

on documents associated with that case. The pre and post scaled score data, from the

copied assessment forms, were entered into a spreadsheet to facilitate data analysis. For

the QEEG data, with the Neuroguide software, the report generation feature was used to

save the z-scores into tab delimited text files, which were then saved as Microsoft Excel

worksheet files, thus preparing the data for further analysis.

The redacted paper forms of the collected data set are stored in a secured manner

(i.e. under lock/key) and separate from the clinical source files (which also provides for

physical backup of data). For data that was entered into spreadsheets and the statistical

software package, those digital files are stored on an external flash drive separate from

any installed computer hard-drive; and, when not in use, will be kept with the paper data

files in the same secured manner. The data are stored in a secured manner, with hard-

copy (i.e. paper) data as a form of permanent backup, separate from the archived source

files, and will be maintained for the required 3 years after the completion of the study. At

the end of the 3 years, paper files will be shredded and electronic media digitally erased.

A further subject identity protection was that the findings reported were only descriptive

group data, and no individual case was described or discussed; thus preventing any

possible inadvertent identification of persons.

Data Analysis Procedures

In general, the research questions asked if 19ZNF improved attention, behavior,

executive function, and electrocortical function, as measured by the clinical assessments

of the IVA, DSMD, BRIEF, and QEEG z-scores. All alternative hypotheses were similar

in that the IVA hypothesis predicted the post scores would be higher than the pre scores,

the DSMD and BRIEF hypotheses predicted the post scores would be lower than the pre

scores, and the z-score hypothesis predicted post z-scores to be closer to the mean than

pre z-scores. The null hypotheses all predicted no significant difference, or a difference

opposite the direction of improvement. The level of significance for this study was alpha

= .05.

As previously described in the above section, the scaled scores from the clinical

assessments and QEEG z-scores were collected from archived clinical files and organized

by data entry into spreadsheets for analysis in SPSS v21 software. Columns for relevant

data categories (demographics, pre scores, post scores, difference scores), and identified

relevant scales (composite and global), were created to facilitate entry of data into the

fields of the spreadsheets. The data analysis started with performing descriptive statistics

on each of the sample groups; the means for the pre, post, and difference scores were also

calculated. The specific scales that were analyzed from each clinical assessment are

described next, and are followed by details related to the z-score data analysis.

IVA. The IVA assessment has two primary categories of scales, Response

Control and Attention. The research question associated with this variable focused on

improvement of attention. Thus, in order to maintain alignment with the research

question, only the overall scales specific to attention were analyzed. Therefore scores

from the Full Scale Attention Quotient, Auditory Attention Quotient, and the Visual

Attention Quotient, were collected and analyzed. Additionally, these scales have higher

reliability measures than Response Control scales.

DSMD. The DSMD has two composite scales more specific to generalized

behavior, that being the Externalizing Composite and Internalizing Composite scales, as

well as a Total scale. These three scales correlate strongly (.81, .83, .86, respectively)

with similarly named scales from the CBCL, which is an instrument commonly used as

an outcome measure of behavior in NF studies. Thus this strategy maintained alignment

with the associated research question (improvement of behavior) for this variable.

BRIEF / BRIEF-A. While all scales on the BRIEF instruments capture elements

of executive function, in order to maintain alignment with the analysis of the other

instruments (i.e. analyzing generalized composite/global scales), only the composite

scales of Behavior Regulation Index and Metacognition Index, as well as the Global

Executive Composite scale were analyzed. Moreover, both the BRIEF and BRIEF-A

contain these composite/global scales, thus maintaining consistency in the child and adult

assessment measures. Therefore, these scales maintained alignment with the associated

research question (improvement of executive function) for this variable.

QEEG z-scores. The QEEG z-scores are a representational measure of

electrocortical function, such that z-scores which are closer to the mean represent

improved functioning; thus maintaining alignment with the research question associated

with this variable. The z-score data were calculated for the QEEG metrics of absolute

power, relative power, and coherence; the same procedure was followed for each metric.

First the z-scores were converted into a spreadsheet format. Next, the values were

transformed to the absolute value. Then, the pre z-scores which were ≥ 1.0 were

highlighted as being the targeted (by site and frequency) z-scores. Those targeted z-

scores were averaged to create a single value, representing an overall distance from the

mean for that metric, for that case. Next, the same targeted z-scores for the corresponding

post values (i.e. same site and frequency) were identified and averaged. This allowed the

pre and post averaged targeted z-score values to be compared, as a measure of change,

such that a lower post value (compared to the pre value) would be closer to the mean.
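
The z-score procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the spreadsheet workflow actually used: the function name, the (site, band) keys, and the example values are all hypothetical.

```python
def mean_targeted_z(pre, post, threshold=1.0):
    """Average |z| over the sites and frequencies targeted at pretest.

    pre and post map a (site, band) key to a z-score. A key is targeted
    when its absolute pre z-score is at or above the threshold; the same
    keys are then averaged in the post data.
    """
    targeted = [key for key, z in pre.items() if abs(z) >= threshold]
    if not targeted:
        raise ValueError("no pre z-scores meet the threshold")
    pre_mean = sum(abs(pre[key]) for key in targeted) / len(targeted)
    post_mean = sum(abs(post[key]) for key in targeted) / len(targeted)
    return pre_mean, post_mean


# Hypothetical values for one case and one metric.
pre = {("Fz", "theta"): 2.0, ("Cz", "alpha"): -1.5, ("Pz", "beta"): 0.4}
post = {("Fz", "theta"): 1.0, ("Cz", "alpha"): -0.5, ("Pz", "beta"): 0.9}

pre_mean, post_mean = mean_targeted_z(pre, post)
print(pre_mean, post_mean)
```

In this example the third entry is not targeted because its pre value is below 1.0, so its post value does not enter the average even though it moved away from the mean.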

Statistical analysis. Given that each of the variables forms a separate analysis

group, the proposed data analysis aligned with the one-group pretest-posttest design. The

paired (within-subjects/repeated measures) t test was appropriate (assuming the

difference scores to be normally distributed) for this quantitative research, with

continuous variables, because it was based on the difference scores (between pre and

post) for measures taken for each person in a single sample, while allowing for sufficient

statistical power with smaller sample sizes (Gravetter & Wallnau, 2010). Effect size was

computed for discussion of practical results, and compared to that previously reported

from prior studies in the literature.

The statistical analysis was conducted with the SPSS v21 statistical package. For

all hypotheses, the plan was for paired t tests on the pre/post difference scores, for the

means of the selected scales and z-scores, for each outcome measure. The data from the

spreadsheet columns, for the pre and post values (for the scales of each outcome

measure) was transferred into SPSS. Next, the SPSS command sequence selected was

Analyze>Compare Means>Paired-Samples T Test. The pre values were identified as

Variable1 and the post values identified as Variable2, and the Confidence Interval

Percentage was set at 95%. Finally, Hedges’ d effect sizes were calculated with the

Metawin 2.1 software.
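
For readers without access to SPSS, the paired t statistic itself is simple to compute from the difference scores. The following is a minimal sketch, not the SPSS procedure; the function name and the scores are hypothetical, and the difference is taken as pre minus post to mirror the Variable1 and Variable2 ordering described above.

```python
import math
import statistics


def paired_t(pre, post):
    """Return the paired t statistic and its degrees of freedom."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference uses the sample SD (n - 1).
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


# Hypothetical scores for four subjects.
pre = [10, 12, 14, 16]
post = [8, 9, 13, 12]

t_stat, df = paired_t(pre, post)
print(round(t_stat, 3), df)
```

The resulting statistic would then be compared against the one-tailed critical value for the stated alpha of .05 and the given degrees of freedom.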

Ethical Considerations

There were no ethical problems for this dissertation primarily because it was

determined to be exempt from the requirements of the Protection of Human Subjects

45 CFR part 46 (2009) regulation. Consequently, IRB-approved informed consent for

research was not necessary. As described above, all data was pre-existing prior to the

start of the study and recorded such that there was no potential for revealing the

identity of any person included. The researcher owned the private practice data;

therefore, no data use or site authorization was needed. The data was stored in a

secured manner, with hard-copy (i.e. paper) data as a form of permanent backup,

separate from the archived source files, and will be maintained this way for the

required 3 years after the completion of the study. At the end of the 3 years, paper files

will be shredded and electronic media digitally erased.

Limitations

There were three primary limitations to this study: research design

elements, sample size, and the question of efficacy. Moreover, it is important to examine

potential sources for bias in any research. Thus, this aspect will also be discussed.

Most criticisms of pretest-posttest designs, which imply they are inadequate due

to threats to internal validity, can be traced back to Campbell and Stanley (1963).

However, as pointed out by Hunter and Schmidt (2004), the identified limiting elements

(history, maturation, instrumentation, testing, and regression) were only presented by

Campbell and Stanley as potential threats, which may or may not adversely impact a

study. Moreover, in studies of psychological factors, where the intent is intervention

evaluation, the behavior targeted by the treatment (i.e. the DV) is typically quite difficult

to change without some intervention; thus the Campbell and Stanley potential validity

threats were ruled out (Hunter & Schmidt, 2004).

Nonetheless, a general limitation of designs that incorporate a pretest-posttest

framework is primarily related to the passage of time between administering the pre and

post assessments (Kerlinger & Lee, 2000). Factors such as history (concurrent events

external to the study scope) and maturation (internal growth factors occurring regardless

of interventions) cannot be controlled for. Therefore, it is not possible to know whether or

not they have impacted the DV measures (Hunter & Schmidt, 2004). Yet, when the time

between testing points is short, the impact of extraneous variation is lessened (Kerlinger

& Lee, 2000; Reichardt, 2009). In this study, the time between the pre and post

assessment was relatively short, measured in terms of weeks. Therefore, the impact of

time-related confounds was considered to be minimal. Also identified as a potential

validity threat is the phenomenon of a regression to the mean, where high or low scores

are, by chance, found to be closer to the mean when retested. However, there is an

inverse relationship between the degree of statistical regression and an instrument’s

reliability (Kirk, 2009); such that instruments with higher reliability have less variability

in the measurement error. Given that the reliability of the instruments in this study is

relatively high, the estimate of the error of measurement is comparatively low. Thus,

potential validity threats related to regression effects were minimal.
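
The inverse relationship between reliability and regression can be made concrete with the classical test theory approximation, in which the expected retest score is pulled toward the mean in proportion to one minus the reliability. This is a textbook simplification offered as a sketch, not a computation taken from Kirk (2009) or from the study data; the function names and the T score values are hypothetical.

```python
def expected_retest(score, mean, reliability):
    """Expected retest score under the classical test theory approximation."""
    return mean + reliability * (score - mean)


def expected_regression(score, mean, reliability):
    """Expected movement toward the mean attributable to chance alone."""
    return score - expected_retest(score, mean, reliability)


# Hypothetical T score of 70 on a scale with a mean of 50.
for reliability in (0.90, 0.40):
    shift = expected_regression(70, 50, reliability)
    print(f"reliability {reliability:.2f}: expected shift {shift:.1f} points")
```

With a reliability of .90 the expected shift is small relative to the 20-point deviation, whereas a reliability of .40 would predict a shift of more than half that deviation.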

Larger sample sizes are preferred in order to allow for stronger statistical analysis

and more generalizability (Gravetter & Wallnau, 2010). Given this study used pre-

existing archived data, the number of samples was restricted to what was found in the

files; thus there was no option to increase sample size. Though, as discussed, the sample

sizes for each group had sufficient power to allow for adequate statistical analysis.

In order to fully address the question of efficacy, additional studies involving both

follow-up data and control group comparison data are necessary. This is especially true in

answering whether 19ZNF is superior to other QEEG-based approaches. Therefore,

limitations of this study, which also must be recognized, are a lack of comparison to a

traditional NF group and a lack of a randomized control group. Nevertheless, given the

data for this research comes from a real-world clinical setting, the findings of this study

can still contribute to advancing the scientific knowledge of 19ZNF.

Finally, in examining potential sources of bias, in a retrospective study where

the data comes from the archived treatment cases of the researcher, a question could

be asked regarding how the researcher can account for the potential for bias. Given the data

was pre-existing in closed cases, and could only be reported, the numerical

information could not be changed nor manipulated. In other words, the data existed in

a set form, and the statistical analysis conveys the message. Moreover, by de-

identifying the data such that every subject was reduced to merely a case number, the

researcher even became blind to the identity of the subjects within the study. Further,

there was no qualitative data in this study for the researcher to interpret. For these

reasons, it is believed the potential for bias was minimized in this study.

Summary

In summary, the methodology for this retrospective pretest-posttest comparative

research was presented. As was reviewed, the independent variable was the 19ZNF and

the dependent variables were the data from the three clinical assessments and QEEG

z-scores; the instruments were the IVA, DSMD, and BRIEF psychometric tests, and QEEG

software. The population was described as those who seek NF services, with the study

sample being the pre-existing data available meeting the inclusion criteria, such that four

groups were formed; one group for each outcome measure. A discussion was presented

regarding data aspects germane to a retrospective study, such as how the data was

pre-existing and only de-identified information was collected. Consequently, this study

qualified as exempt from the requirements of the Protection of Human Subjects 45 CFR

part 46 (2009) regulation, and IRB-approved informed consent was not necessary.

Finally, limitations were reviewed, and what are typically identified as potential

weaknesses in pretest-posttest designs (Campbell & Stanley, 1963) were minimally

impactful because intervention-targeted behaviors frequently do not change without

effective intervention (Hunter & Schmidt, 2004), there was a short pre-post time interval

(Reichardt, 2009), and the instruments employed in this study have relatively high

reliability measures (Kirk, 2009).

In this study, all research questions were similar and the paired t test was an

appropriate statistic to compare the means of the different data groups. Moreover, effect

size was computed and compared to prior studies. In the following chapter, the process of

the data analysis, as well as results, will be discussed.

Chapter 4: Data Analysis and Results

Introduction

Addressing efficacy of 19ZNF is important because it was not known, by way of

statistical evaluation of either clinical assessments or QEEG z-scores, if 19ZNF is an

effective NF technique. Therefore, the purpose of this quantitative research was to

evaluate 19ZNF, in a clinical setting, using a retrospective one-group pretest-posttest

research design. Generally, the research questions asked if 19ZNF improves attention,

behavior, executive function, and electrocortical function, as measured by the outcome

measures of the IVA, DSMD, BRIEF, and QEEG z-scores. All alternative hypotheses

were similar in that the IVA hypothesis predicted the post scores would be higher than

the pre scores, the DSMD and BRIEF hypotheses predicted the post scores would be

lower than the pre scores, and the QEEG hypothesis predicted post z-scores would be

closer to the mean than pre z-scores. The null hypotheses all predicted no significant

difference, or a difference opposite the direction of improvement.

This chapter first presents the descriptive data of each of the groups for the IVA,

DSMD, BRIEF, and QEEG z-score data. Then, the steps taken for data analysis are

described. Finally, results of the data analysis are presented.

Descriptive Data

The QEEG group represented the inclusion of all subjects for the study, from

which the other groups were formed; therefore, it is described first. Then, the groups for

the IVA, DSMD, and BRIEF are described. Table 4.1 summarizes the descriptive

information as discussed in the following sections. It is important to note, while the

clinical assessment groups were diverse diagnostically, when viewed by clinical

complaints, in terms of the neuropsychological constructs of attention, behavior, or

executive function, the subjects collectively formed well-defined groups of the kind the

assessment instruments are designed to measure.

QEEG group. The total sample size for this group was 21; there was no reported

experience of 19ZNF prior to coming to this practice. The subjects ranged in age from 7

to 63 years, with a mean age of 21.19 years (SD = 18.12); including 15 children and six

adults, 10 males and 11 females. Seventeen of the subjects were White, two were Asian,

and two were Latino; while five were categorized as low socioeconomic status (SES), 14

as medium SES, and two as high SES. The make-up of the diagnoses² and/or presenting

conditions included mostly a combination of ADHD-Inattentive presentation (ADHD-I)

and ADHD-Combined presentation (ADHD-C) (ADHD-I = 4, ADHD-C = 7); yet, there

were three subjects with ADHD-C comorbid with another disorder (ADHD-

C/Unspecified Anxiety Disorder, ADHD-C/Autism Spectrum Disorder, ADHD-

C/Unspecified Learning Disorder). Finally, the other diagnoses included one comorbid

Unspecified Anxiety/Unspecified Depressive Disorder, one Autism Spectrum Disorder,

one Unspecified Bipolar Disorder, one Reactive Attachment Disorder, one comorbid

Obsessive-Compulsive Disorder/issues with executive function, and two subjects with

presenting issues of difficulty with executive functioning. A total of 16 subjects had no

medication usage, two subjects were on medication, two subjects started on medication

² Given the retrospective nature of the data, all initial diagnoses were made in accordance with the

Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American

Psychiatric Association, 2000). However, all diagnostic criteria were confirmed with, and are reported in

accordance with, the DSM-5 (American Psychiatric Association, 2013) taxonomy.

but had ceased medication by the post assessment, and one subject had a reduction of

medication by one-third at the time of post assessment. The number of sessions from pre

assessment to post assessment ranged from three to 20, with a mean of 10.90 (SD = 3.88).

The targeted session frequency was once per week. The number of weeks for treatment

(pre to post assessment) ranged from two to 22, with a mean of 11.76 (SD = 5.19).

Finally, the number of weeks from pre assessment to post assessment ranged from two to

43, with a mean of 15.10 (SD = 10.03). The descriptive data for this group is summarized

in Table 4.1.

IVA group. The total sample size for this group was 10. The subjects ranged in

age from 7 to 63 years, with a mean age of 26.80 years (SD = 19.84); including five

children and five adults, five males and five females. Nine of the subjects were White,

and one was Latino; while three were categorized as low SES, five as medium SES, and

two as high SES. The make-up of the diagnoses and/or presenting conditions included

mostly a combination of ADHD, with three ADHD-I and four ADHD-C; yet, there were

two subjects with ADHD-C comorbid with another disorder (ADHD-C/ Unspecified

Anxiety Disorder, ADHD-C/ Unspecified Learning Disorder). Finally, the other

diagnoses included one subject with presenting issues of difficulty with executive

functioning. A total of eight subjects had no medication usage, one subject was on

medication, and one subject started on medication but had ceased medication by the post

assessment. The number of sessions from pre assessment to post assessment ranged from

three to 15, with a mean of 9.70 (SD = 3.92). The targeted session frequency was once

per week. The number of weeks for treatment (pre to post assessment) ranged from two

to 15, with a mean of 9.40 (SD = 4.40). Finally, the number of weeks from pre

assessment to post assessment ranged from two to 43, with a mean of 13.20 (SD = 11.11).

The descriptive data for this group is summarized in Table 4.1.

DSMD group. The total sample size for this group was 14. The subjects ranged in

age from 7 to 17 years, with a mean age of 10.86 years (SD = 2.91); including 14 children

and no adults, seven males and seven females. Ten of the subjects were White, two were

Asian, and two were Latino; while three were categorized as low SES, nine as medium

SES, and two as high SES. The make-up of the diagnoses and/or presenting conditions

included a combination of ADHD, with two ADHD-I and five ADHD-C; yet, there were

two subjects with ADHD-C comorbid with another disorder (ADHD-C/Autism Spectrum

Disorder, ADHD-C/Unspecified Learning Disorder). Finally, the other diagnoses

included one comorbid Unspecified Anxiety/Unspecified Depressive Disorder, one

Autism Spectrum Disorder, one Unspecified Bipolar Disorder, one Reactive Attachment

Disorder, and one subject with presenting issues of difficulty with executive functioning.

A total of 11 subjects had no medication usage, one subject was on medication, one

subject started on medication but had ceased medication by the post assessment, and one

subject had a reduction of medication by one-third at the time of post assessment. The

number of sessions from pre assessment to post assessment ranged from three to 20, with

a mean of 11.43 (SD = 4.13). The targeted session frequency was once per week. The

number of weeks for treatment (pre to post assessment) ranged from three to 22, with a

mean of 12.57 (SD = 5.60). Finally, the number of weeks from pre assessment to post

assessment ranged from six to 37, with a mean of 15.36 (SD = 8.63). The descriptive data

for this group is summarized in Table 4.1.

BRIEF group. The total sample size for this group was 12. The subjects ranged

in age from 7 to 63 years, with a mean age of 20.25 years (SD = 19.97); including 10

children and two adults, six males and six females. Eleven of the subjects were White,

and one was Latino; while two were categorized as low SES, nine as medium SES, and

one as high SES. The make-up of the diagnoses and/or presenting conditions included a

combination of ADHD, with two ADHD-I and two ADHD-C; yet, there were two

subjects with ADHD-C comorbid with another disorder (ADHD-C/Autism Spectrum

Disorder, ADHD-C/ Unspecified Learning Disorder). Finally, the other diagnoses

included one comorbid Unspecified Anxiety/Unspecified Depressive Disorder, one

Autism Spectrum Disorder, one Reactive Attachment Disorder, one comorbid Obsessive-

Compulsive Disorder and issues with executive function, and two subjects with

presenting issues of difficulty with executive functioning. A total of 10 subjects had no

medication usage, one subject was on medication, and one subject started on medication

but had ceased medication by the post assessment. The number of sessions from pre

assessment to post assessment ranged from three to 20, with a mean of 11.83 (SD = 2.69).

The targeted session frequency was once per week. The number of weeks for treatment

(pre to post assessment) ranged from approximately three to 22, with a mean of 13.50

(SD = 3.97). Finally, the number of weeks from pre assessment to post assessment ranged

from six to 37, with a mean of 16.17 (SD = 8.44). The descriptive data for this group is

summarized in Table 4.1.

Table 4.1

Descriptive Data for All Groups

Category                         QEEG Group      IVA Group       DSMD Group      BRIEF Group
                                 (n = 21)        (n = 10)        (n = 14)        (n = 12)

Age
  M (SD)                         21.19 (18.12)   26.80 (19.84)   10.86 (2.91)    20.25 (19.97)
  Children                       15              5               14              10
  Adults                         6               5               0               2

Gender
  Male                           10              5               7               6
  Female                         11              5               7               6

Ethnicity
  White                          17              9               10              11
  Asian                          2               0               2               1
  Latino                         2               1               2               0

Socioeconomic Status
  Low                            5               3               3               2
  Medium                         14              5               9               9
  High                           2               2               2               1

Diagnosis or Condition
  ADHD-Inattentive               4               3               2               2
  ADHD-Combined                  7               4               5               2
  ADHD-C/Anxiety                 1               1               0               0
  ADHD-C/ASD                     1               0               1               1
  ADHD-C/LD                      1               1               1               1
  Anxiety/Depression             1               0               1               1
  ASD                            1               0               1               1
  Bipolar                        1               0               1               0
  Executive Function             2               1               1               2
  OCD/Exec Function              1               0               0               1
  RAD                            1               0               1               1

Medication
  No                             16              8               11              10
  Yes                            2               1               1               1
  Yes to off                     2               1               1               1
  Yes to reduced                 1               0               1               0

# Sessions pre-to-post
  M (SD)                         10.90 (3.88)    9.70 (3.92)     11.43 (4.13)    11.83 (2.69)

# Weeks for treatment
  M (SD)                         11.76 (5.19)    9.40 (4.40)     12.57 (5.60)    13.50 (3.97)

# Weeks pre to post assessment
  M (SD)                         15.10 (10.03)   13.20 (11.11)   15.36 (8.63)    16.17 (8.44)

Note. ADHD: Attention Deficit Hyperactivity Disorder; ASD: Autism Spectrum Disorder; LD: Learning Disorder; RAD: Reactive Attachment Disorder; OCD: Obsessive-Compulsive Disorder.

Data set limitations. Given the data for this study was derived from a

retrospective collection of information from existing files, there are some inherent

limitations. Yet, while these limitations are unavoidable, taking data from real-world

records gives an opportunity to evaluate an intervention using realistic information

typically found in a clinical setting. Such was the case with this study. One example is

with regard to medication usage of the subjects. In a true experimental setting, having no

medication usage in all subjects would be ideal; however, NF clinicians routinely have

clients who seek out NF while still taking medications. The frequency of cases involving

medication use in this study, with an overall five out of 21 for the QEEG group, two out

of 10 in the IVA group, three out of 14 for the DSMD group, and two out of 12 for the

BRIEF group, was a fairly accurate representation of the overall population that has been

seen in this practice for close to 15 years. Therefore, while an argument could be made

that a data set with no medication usage may provide for more credible results; in reality,

the data in this study made the results more generalizable to the population of those who

actually seek NF services.

The other example in this study, impacted by a fixed data set, was regarding the

number of weeks from pre assessment to post assessment. The apparent great variability

in number of weeks was accounted for by two outlier cases on the high end (i.e., 37 and

43 weeks), and one outlier case on the low end (i.e., 2 weeks). If these outliers were

excluded, the range of number of weeks would have been from six to 26 for the QEEG

and DSMD groups, from seven to 15 for the IVA group, and from seven to 26 for the

BRIEF group. This, then, better explains why the group means for the number of

weeks from pre assessment to post assessment (as seen in Table 4.1) are approximately 15 weeks for the

QEEG and DSMD groups, 13 weeks for the IVA group, and 16 weeks for the BRIEF

group. Also of note, it is important to realize that the number of weeks from pre to post

assessment does not represent the number of sessions; it is only the time elapsed between

assessment points. To clarify this aspect, the number of sessions pre-to-post treatment, as

well as the number of weeks for the treatment, are reported to better illustrate the

timeframes and sessions performed during the 19ZNF.

Data Analysis Procedures

The data analysis procedures were conducted with no deviation from what was

described in the previous Methodology chapter. Given this study consisted solely of data

collection and analysis, the greatest potential source of error was mistakes in data collection and

data entry, which would negatively impact this research by producing inaccurate results. This was

mitigated by being careful in the data handling, as well as double checking the data

collection and entry. An additional check of data processing was accomplished by

independently repeating the data collection and analysis steps two times, thus

providing a thorough accuracy check of the values collected and analyzed.

Prior to analysis, using SPSS v. 21, the data were reviewed and there were no

outliers or missing data found. For each data group (IVA, DSMD, BRIEF, and QEEG) an

SPSS file was set up and variables such as Case Number, Pre and Post variables for each

scale were established. Next, data was transferred from the data collection spreadsheet to

the appropriate SPSS columns. Then, Difference variables were created, for each scale,

using the SPSS command sequence of Transform>Compute Variable>Create Difference

to calculate the difference score between the pre and post scale values.

Normality checks. Prior to running parametric t tests, for repeated measures data,

checking to ensure the difference scores meet the necessary assumption of normality is an

important step to demonstrate the validity and reliability of the data analysis and

inference (Gravetter & Wallnau, 2010). There are various techniques for checking

normality. Graphical methods include histograms or Q-Q plots, while numerical methods

include skewness/kurtosis coefficients or formal normality tests. As shown in Appendix

C, the Q-Q plots for all the difference scores analyzed provide visual evidence of the

difference scores meeting the assumption of normality. However, only formal normality

tests provide conclusive evidence, with specific cut-off values (i.e., p values), that the

requirement for a normal distribution has been met (Razali & Wah, 2011). In a study

comparing four formal normality tests (i.e., Shapiro-Wilk, Kolmogorov-Smirnov,

Lilliefors, and Anderson-Darling), Razali and Wah (2011) found the Shapiro-Wilk test to

be the most powerful for all sample sizes and distribution types. Therefore, the Shapiro-

Wilk test was also used to check the difference scores for normality. This was

accomplished using the SPSS command sequence of Analyze>Descriptive

Statistics>Explore. The Shapiro-Wilk computations for all scales, in all groups, resulted

in p > .05 (ranging from p = .084 to p = .980), indicating no significant departure from

normality in the difference scores. Meeting this assumption provides confidence that the statistical

analysis yields reliable and valid results (Razali & Wah, 2011). Therefore, the Shapiro-Wilk

testing supports the validity and reliability of both the interpretation of the data and the

inferences drawn from the data in this study. A breakdown of the Shapiro-Wilk p values for the

difference scores is provided in Table 4.2.

Table 4.2

Shapiro-Wilk Results for Difference Scores

Groups / Scales              Shapiro-Wilk p Values

IVA (n = 10)
  Audio Attention            .429
  Visual Attention           .314
  Full Scale Attention       .980

DSMD (n = 14)
  Externalizing              .771
  Internalizing              .336
  Total                      .582

BRIEF (n = 12)
  BRI                        .178
  MI                         .934
  GEC                        .084

QEEG (n = 21)
  Absolute Power             .930
  Relative Power             .778
  Coherence                  .437
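
Among the numerical methods for checking normality mentioned above, skewness and kurtosis coefficients are the simplest to compute by hand. The sketch below is illustrative only and is not the Shapiro-Wilk procedure used in this study; it computes moment-based skewness and excess kurtosis for a set of difference scores. The function name and the difference scores are hypothetical, and statistical packages commonly report bias-adjusted variants of these coefficients, so values from such software may differ slightly.

```python
import statistics


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness (g1) and excess kurtosis (g2) of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3, and 4 (divided by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical post-minus-pre difference scores (not data from this study).
differences = [12, 18, 25, 9, 30, 21, 15, 27, 19, 21]
skew, kurt = skewness_and_excess_kurtosis(differences)
print(f"skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}")
```

Values of both coefficients near zero are consistent with an approximately normal distribution, although with samples this small a formal test and a Q-Q plot remain the more informative checks.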

Paired t tests. To compute the paired t tests, for the three scales in each group,

the SPSS command sequence executed was Analyze>Compare Means>Paired Samples T

Test. For each scale, the Pre variable was moved to the Variable 1 position and the Post

variable was moved to the Variable 2 position. Given the directional nature of all

hypotheses, it was necessary to divide the SPSS-computed 2-tailed p value by two in

order to derive the 1-tailed p value. Finally, the Hedges’ d effect sizes were calculated

using the MetaCalc module of the Metawin 2.1 software.
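
The paired t computation and the halving of the two-tailed p value described above can be sketched outside SPSS as follows. This is a minimal illustration under stated assumptions, not a reproduction of the SPSS output: the function names and the pre/post scores are hypothetical, and the tail probability is obtained by numerically integrating the Student's t density with Simpson's rule rather than by a library routine. As in the SPSS setup described, the difference is taken as Pre minus Post.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired t test on the Pre minus Post differences."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


def t_upper_tail(t, df, steps=2000):
    """P(T >= |t|) for Student's t, via Simpson's rule on the density."""
    t = abs(t)
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # Area from 0 to |t| subtracted from the half of the density above zero.
    return 0.5 - total * h / 3


# Hypothetical pre/post standard scores (not data from this study).
pre = [80, 92, 75, 88, 101, 70, 95, 84]
post = [98, 104, 90, 97, 110, 93, 99, 101]

t_stat, df = paired_t(pre, post)
# Half of the two-tailed p; this is the directional p value only when the
# observed change is in the predicted direction.
one_tailed_p = t_upper_tail(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, one-tailed p = {one_tailed_p:.4f}")
```

Because Post is higher than Pre in these assumed scores, the t statistic is negative, which mirrors the sign convention of the IVA results reported later in this chapter.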

The analysis for each of the psychometric assessment groups maintained

alignment with the associated research questions by including the specified scales; those

which most closely associate with the constructs of interest. These included the Attention

scales for the IVA group, the Internalizing, Externalizing, and Total scales for the DSMD

group, and the composite indices of Behavior Regulation, Metacognition, and Global

Executive for the BRIEF group. For the QEEG group, analyzing whether the post z-

scores are closer to the mean maintains alignment with the z-score research question.

Moreover, the paired t test analyses, where the means of the pre values are compared to

the means of the post values, were appropriate for the one-group pretest-posttest design of

this study.

Results

For all of the research questions in this study, the group means direction of

change was first determined; then, the paired t test was performed to compare the means

of the pre and post scores. Finally, the Hedges’ d effect size (Hd) was calculated. No

outliers were found in the group means data analyzed. Line graphs showing the pretest

and posttest scores, for each individual subject, are shown in Appendix D to provide a

detailed picture of individual assessments.
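
For readers who wish to see how a bias-corrected effect size can be derived from a t statistic, one textbook conversion is sketched below. The source does not state which formula the MetaCalc module applied, so this is an assumption and not a description of that software: the sketch treats the pre and post scores as two groups of equal size n, converts t to a standardized mean difference, and applies the small-sample correction J. The function name is hypothetical.

```python
import math


def hedges_d_from_t(t: float, n: int) -> float:
    """Bias-corrected standardized mean difference from a t statistic.

    Assumption (not confirmed by the source): pre and post are treated as
    two groups of equal size n, so d = |t| * sqrt(2 / n), which is then
    multiplied by the correction J = 1 - 3 / (4 * (2n - 2) - 1).
    """
    d = abs(t) * math.sqrt(2.0 / n)
    j = 1.0 - 3.0 / (4.0 * (2 * n - 2) - 1.0)
    return d * j


# Auditory Attention values reported in this chapter: t(9) = -4.29, n = 10.
print(round(hedges_d_from_t(-4.29, 10), 2))
```

Other conversions exist, including ones based on the pooled pre and post standard deviations or on the standard deviation of the difference scores, and they can yield noticeably different values from the same data.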

Research question 1a: IVA group. Does 19ZNF improve attention as measured

by the IVA assessment?
Ha1a: The post scores will be higher than the pre scores for the IVA
assessment.
H01a: The post scores will be lower than, or not significantly different
from, the pre scores of the IVA assessment.

For this research question, the scales of Auditory Attention, Visual Attention, and

Full Scale were evaluated; with the threshold for clinical significance being ≤ 85. The

mean post scores were higher than the pre scores for all scales; thus the change was in the

predicted direction. The mean of the Auditory Attention scale pre scores was 86.50 (SD =

14.11), 95% CI [76.40, 96.60], and the mean of the post scores was 106.20 (SD = 10.76),

[98.50, 113.90]. The mean of the Visual Attention scale pre scores was 83.60 (SD =

19.37), [69.74, 97.46], and the mean of the post scores was 103.70 (SD = 13.21), [94.25,

113.15]. The mean of the Full Scale pre scores was 83.40 (SD = 18.23), [70.36, 96.44],

and the mean of the post scores was 105.60 (SD = 12.25), [96.84, 114.36]. Moreover, the

mean pre scores for all three scales were at or below the cutoff threshold indicating

clinical significance; and the mean post scores for all three scales were above the clinical

cutoff threshold. The one-tailed t test results showed the pre and post scores differed

significantly; with the Auditory Attention scale t(9) = -4.29, p = .001, Hd = 1.84; the

Visual Attention scale t(9) = -3.00, p = .008, Hd = 1.29; and the Full Scale t(9) = -3.78, p

= .002, Hd = 1.62. Therefore, the null hypothesis was rejected in favor of the alternative

hypothesis, as the post scores were higher than the pre scores for the IVA assessment;

thus suggesting improvement in attention. See Figure 4.1 for a graphical representation of

the pre and post scale scores.

Figure 4.1. Mean IVA group standard scores before and after 19ZNF

sessions. The dotted line indicates threshold for clinical significance;

values at or below the line suggest clinically relevant symptoms. Post

values above the line suggest improvements in attention. All post scores

are statistically significant at p ≤ .008.
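
The 95% confidence intervals reported with each mean can be approximated from the summary statistics alone. The source does not say how the intervals were computed; the sketch below assumes the usual t-based interval, M ± t(crit) × SD / sqrt(n). The function name and the lookup table are illustrative, with the critical values hard-coded from a standard t table for the degrees of freedom used in this chapter.

```python
import math

# Two-sided 95% critical values of Student's t (standard table values)
# for the degrees of freedom of the four groups in this chapter.
T_CRIT_95 = {9: 2.262, 11: 2.201, 13: 2.160, 20: 2.086}


def mean_ci_95(mean: float, sd: float, n: int):
    """95% confidence interval for a mean: M +/- t_crit * SD / sqrt(n)."""
    margin = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Auditory Attention pre scores as reported: M = 86.50, SD = 14.11, n = 10.
low, high = mean_ci_95(86.50, 14.11, 10)
print(f"[{low:.2f}, {high:.2f}]")
```

Under this assumption, the interval for the Auditory Attention pre scores agrees with the reported [76.40, 96.60] to within rounding of the summary statistics.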

Research question 1b: DSMD group. Does 19ZNF improve behavior as

measured by the DSMD assessment?
Ha1b: The post scores will be lower than the pre scores for the DSMD
assessment.
H01b: The post scores will be higher than, or not significantly different
from, the pre scores of the DSMD assessment.

For this research question, the scales of Externalizing, Internalizing, and Total

were evaluated; with the threshold for clinical significance being ≥ 60. The mean post

scores were lower than the pre scores for all scales; thus the change was in the predicted

direction. The mean of the Externalizing scale pre scores was 68.21 (SD = 15.49), 95%

CI [59.27, 77.16], and the mean of the post scores was 57.71 (SD = 12.87), [50.28,

65.14]. The mean of the Internalizing scale pre scores was 66.21 (SD = 9.82), [60.55,

71.88], and the mean of the post scores was 57.29 (SD = 9.85), [51.60, 62.97]. The mean

of the Total scale pre scores was 65.00 (SD = 10.58), [58.89, 71.11], and the mean of the

post scores was 55.64 (SD = 10.76), [49.43, 61.86]. Moreover, the mean pre scores for all

three scales were above the cutoff threshold indicating clinical significance, and the mean

post scores for all three scales were below the clinical cutoff threshold. The one-tailed t

test results showed the pre and post scores differed significantly; with the Externalizing

scale t(13) = 4.97, p = .000, Hd = 1.83; the Internalizing scale t(13) = 6.43, p = .000, Hd

= 2.36; and the Total scale t(13) = 9.36, p = .000, Hd = 3.42. Therefore, the null

hypothesis was rejected in favor of the alternative hypothesis, as the post scores were

lower than the pre scores for the DSMD assessment; thus suggesting improvement in

behavior. See Figure 4.2 for a graphical representation of the pre and post scale scores.

Figure 4.2. Mean DSMD group standard scores before and after 19ZNF

sessions. The dotted line indicates threshold for clinical significance;

values at or above the line suggest clinically relevant symptoms. Post

values below the line suggest improvements in behavior. All post scores

are statistically significant at p = .000.

Research question 1c: BRIEF group. Does 19ZNF improve executive function

as measured by the BRIEF assessment?

Ha1c: The post scores will be lower than the pre scores for the BRIEF
assessment.
H01c: The post scores will be higher than, or not significantly different
from, the pre scores of the BRIEF assessment.

For this research question, the scales of BRI, MI, and GEC were evaluated;

with the threshold for clinical significance being ≥ 65. The mean post scores were lower

than the pre scores for all scales; thus the change was in the predicted direction. The

mean of the BRI scale pre scores was 71.00 (SD = 11.40), 95% CI [63.77, 78.23], and the

mean of the post scores was 60.17 (SD = 10.27), [53.64, 66.69]. The mean of the MI

scale pre scores was 76.08 (SD = 8.24), [70.85, 81.32], and the mean of the post scores

was 65.67 (SD = 10.36), [59.08, 72.25]. The mean of the GEC scale pre scores was 75.75

(SD = 9.33), [69.82, 81.68], and the mean of the post scores was 64.50 (SD = 9.91),

[58.20, 70.80]. Moreover, the mean pre scores for all three scales were above the cutoff

threshold indicating clinical significance, and the mean post scores for all three scales

were below the clinical cutoff threshold. The one-tailed t test results showed the pre and

post scores differed significantly; with the BRI scale t(11) = 4.37, p = .001, Hd = 1.72;

the MI scale t(11) = 4.39, p = .001, Hd = 1.73; and the GEC scale t(11) = 4.66, p = .000,

Hd = 1.84. Therefore, the null hypothesis was rejected in favor of the alternative

hypothesis, as the post scores were lower than the pre scores for the BRIEF assessment;

thus suggesting improvement in executive function. See Figure 4.3 for a graphical

representation of the pre and post scale scores.

Figure 4.3. Mean BRIEF group standard scores before and after 19ZNF

sessions. The dotted line indicates threshold for clinical significance;
values at or above the line suggest clinically relevant symptoms. Post

values below the line suggest improvements in executive function. All

post scores are statistically significant at p ≤ .001.

Research question 2: QEEG group. Does 19ZNF improve electrocortical

function as measured by QEEG z-scores such that the post z-scores are closer to the mean

than pre z-scores?
Ha2: The post z-scores will be closer to the mean than the pre z-scores.

H02: The post z-scores will be farther from the mean than, or not significantly

different from, the pre z-scores.

For this research question, the QEEG metrics of Absolute power, Relative power,

and Coherence were evaluated; with the targeted transformed z-score threshold value

being z ≥ 1.00. The mean post z-scores were lower than the pre z-scores for all metrics;

thus the change was in the predicted direction and the z-scores were closer to the mean.

The mean of the Absolute power pre z-scores was 1.46 (SD = 0.28), 95% CI [1.33, 1.59],

and the mean of the post scores was 1.03 (SD = 0.37), [0.87, 1.20]. The mean of the

Relative power pre z-scores was 1.51 (SD = 0.22), [1.41, 1.61], and the mean of the post

scores was 1.13 (SD = 0.35), [0.97, 1.29]. The mean of the Coherence pre z-scores was

1.46 (SD = 0.14), [1.40, 1.53], and the mean of the post scores was 0.96 (SD = 0.32),

[0.82, 1.11]. Moreover, the mean pre scores for all metrics were above 1.00, and the

mean post scores for all metrics approached or were below 1.00. The one-tailed t test

results showed the pre and post scores differed significantly; with the Absolute power

t(20) = 7.73, p = .000, Hd = 2.29; the Relative power t(20) = 5.22, p = .000, Hd = 1.76;

and the Coherence t(20) = 6.55, p = .000, Hd = 1.88. Therefore, the null hypothesis was

rejected in favor of the alternative hypothesis, as the post z-scores were closer to the

mean than the pre z-scores; thus suggesting improvement in electrocortical functioning.

See Figure 4.4 for a graphical representation of the pre and post scale scores.

Figure 4.4. Mean QEEG group targeted z-scores before and after 19ZNF

sessions. The dotted line indicates threshold for inclusion as targeted

z-scores; values above the line suggest electrocortical dysfunction. Post

values at or below the line suggest improvements in electrocortical

function. All post scores are statistically significant at p = .000.

Summary

The research questions for this study asked if the independent variable of 19ZNF

improved attention, behavior, executive function, and electrocortical function. The

dependent variables to test the hypotheses included the scaled scores from the IVA,

DSMD, and BRIEF clinical assessments and QEEG z-scores. The difference scores were

normally distributed, thus supporting the use of one-tailed t tests to compare the pre to the

post scores for each of the dependent variables.

For all pre-post comparisons, the direction of change in the scores was in the

predicted direction for all hypotheses. Moreover, for all the outcome measures, the

averaged scores were beyond the clinically significant threshold before 19ZNF and

changed to no longer being so after 19ZNF. Finally, for all research questions, the null

hypothesis was rejected, in favor of the conclusion that 19ZNF improved attention,

behavior, executive function, and electrocortical function (respective to each hypothesis).

All differences were statistically significant, with results ranging from p = .000 to p =

.008; and Hd values ranging from 1.29 to 3.42. Table 4.3 provides a cumulative summary

of the results of these findings for all groups.

In the chapter that follows, a discussion of these findings will be presented.

Conclusions and interpretations regarding the contributions of this research will be

offered. Furthermore, a review of the implications (practical, theoretical, and future) of

this research, and recommendations for future research and practice will be provided.

Table 4.3

Summary of Results – All Groups

Groups / Scales            PRE Scores       POST Scores       t(df)        p       Hedges’ d
                           M (SD)           M (SD)

IVA
  Audio Attention          86.50 (14.11)    106.20 (10.76)    -4.29 (9)    .001    1.84
  Visual Attention         83.60 (19.37)    103.70 (13.21)    -3.00 (9)    .008    1.29
  Full Scale Attention     83.40 (18.23)    105.60 (12.25)    -3.78 (9)    .002    1.62

DSMD
  Externalizing            68.21 (15.49)    57.71 (12.87)     4.97 (13)    .000    1.83
  Internalizing            66.21 (9.82)     57.29 (9.85)      6.43 (13)    .000    2.36
  Total                    65.00 (10.58)    55.64 (10.76)     9.36 (13)    .000    3.42

BRIEF
  BRI                      71.00 (11.40)    60.17 (10.27)     4.37 (11)    .001    1.72
  MI                       76.08 (8.24)     65.67 (10.36)     4.39 (11)    .001    1.73
  GEC                      75.75 (9.33)     64.50 (9.91)      4.66 (11)    .000    1.84

QEEG Z-Scores
  Absolute Power           1.46 (0.28)      1.03 (0.37)       7.73 (20)    .000    2.29
  Relative Power           1.51 (0.22)      1.13 (0.35)       5.22 (20)    .000    1.76
  Coherence                1.46 (0.14)      0.96 (0.32)       6.55 (20)    .000    1.88

Chapter 5: Summary, Conclusions, and Recommendations

Introduction

The primary problem this research sought to address was that it was not known,

by way of statistical evaluation of either clinical assessments or QEEG z-scores, if

19ZNF was an effective NF technique. This problem, manifest as a lack of literature,

leaves clinicians and prospective NF clients alike without research-based evidence to

evaluate 19ZNF. Currently, mostly qualitative case-study reports have been found in the

literature. Thus, this study has importance in its aim to fill this empirical gap.

As has been discussed, NF is gaining recognition as an evidence-based

intervention grounded in learning theory. Among the different models developed over the

last 40 years, 19ZNF is one of the newest. Yet, while 19ZNF is reported to lead to

improved clinical outcomes in fewer sessions than traditional NF, and a growing number

of clinicians are adding this model to their practice, the peer-review literature is lacking

regarding the efficacy of the model. This study was different in its use of group means

data to directly compare pre and post outcome measure variables, to include QEEG data,

to begin an evaluation of efficacy of 19ZNF. The use of a quasi-experimental design in

this research, which has not been typical in prior 19ZNF evaluations, provides baseline

research for investigating the efficacy of 19ZNF. The use of quantitative methods, with

group means data, contributes to the base of knowledge regarding 19ZNF by providing

statistical analysis, which allows for greater generalization over qualitative and/or case

study investigations.

Summary of the Study

This chapter aims to first present a summary and conclusions of the study. Next,

practical, theoretical, and future implications will be reviewed. Finally, future research

and practice recommendations will be discussed.

This retrospective pretest-posttest study investigated if 19ZNF improved

attention, behavior, executive function, and electrocortical functioning. To that end, the

research questions asked if 19ZNF improved: Attention as measured by the IVA,

behavior as measured by the DSMD, executive function as measured by the BRIEF, and

electrocortical function as measured by QEEG z-scores. Paired t tests were performed to

compare the means of four outcome measures, which included three clinical assessments

(IVA, DSMD, and BRIEF) and QEEG z-scores. Each of the clinical assessments framed

a sample group such that the efficacy of 19ZNF was evaluated, as it relates to the

particular neuropsychological constructs of attention (n = 10), behavior (n = 14),

executive function (n = 12), and additionally as related to electrocortical functioning (n =

21). The focus of the IVA sample group was attention, and the scales specific to attention

were the Auditory Attention, Visual Attention, and Full Scale. The focus of the DSMD

sample group was behavior, and the scales specific to behavior were the Externalizing,

Internalizing and Total. The focus of the BRIEF sample group was executive function,

and the composite scales included were BRI, MI, and GEC. The focus of the QEEG

sample group was electrocortical function, and the QEEG metrics included were

Absolute power, Relative power, and Coherence.
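
The paired t test described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the analysis procedure or software actually used in this study. The scores are hypothetical, and the function name paired_t is an assumption introduced here. Differences are taken as pre minus post, which appears consistent with the signs reported in Table 4.3 (a negative t where higher post scores indicate improvement, as with the IVA).

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t test on pre minus post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must be equal-length with at least 2 pairs")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))  # mean difference over its standard error
    return t, n - 1


# Hypothetical standard scores for five subjects (not data from this study).
pre_scores = [80, 85, 90, 75, 88]
post_scores = [95, 100, 98, 92, 101]
t_value, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_value:.2f}")
```

Because each subject serves as his or her own comparison, the test operates on the within-subject differences rather than on the two sets of scores separately, which is why the degrees of freedom equal the number of pairs minus one.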

Overall, the makeup of the sample was a diagnostically diverse mixture of adults

and children, with most diagnoses related to ADHD. The sample consisted of more

children (QEEG = 15, IVA = 5, DSMD = 14, BRIEF = 10) than adults (QEEG = 6, IVA

= 5, DSMD = 0, BRIEF = 2). Other sample characteristics consistent across all groups

are that subjects were evenly divided with respect to gender, were primarily ethnically white, and

were mostly medium SES.
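
As a simple arithmetic check, the child and adult counts above can be compared against the group sizes reported for each outcome measure. The sketch below only restates figures from this chapter; the variable names are illustrative assumptions.

```python
# Counts transcribed from the sample description above.
children = {"QEEG": 15, "IVA": 5, "DSMD": 14, "BRIEF": 10}
adults = {"QEEG": 6, "IVA": 5, "DSMD": 0, "BRIEF": 2}
# Group sizes (n) reported for each outcome measure.
reported_n = {"IVA": 10, "DSMD": 14, "BRIEF": 12, "QEEG": 21}

totals = {group: children[group] + adults[group] for group in reported_n}
for group, n in reported_n.items():
    status = "matches" if totals[group] == n else "differs from"
    print(f"{group}: {children[group]} children + {adults[group]} adults "
          f"= {totals[group]} ({status} n = {n})")
```

The counts sum to the reported group sizes in every case. The IVA group is the one group split evenly between children and adults.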

In Chapter 1, an orienting framework of the study was presented to include the

problem statement and study purpose, as well as the methodology rationale and nature of

the research design. In Chapter 2, a review of the literature was presented. The history

and background of ZNF was first addressed; then, the theoretical foundations and

conceptual frameworks of NF and QEEG were presented. Theoretical frameworks

supporting the models of traditional NF, QNF, and ZNF were then reviewed, as were key

NF themes related to applications of QNF and the emergence of 19ZNF. Moreover,

outcome measures suitable for ZNF research were discussed. The focus of Chapter 3 was

the methodology of the study and Chapter 4 presented research findings and results.

Summary of Findings and Conclusion

Operant conditioning is the theoretical foundation of NF, with demonstrated

efficacy in improving brain functioning and clinical symptoms, through the resulting

electrocortical changes. However, whether this also holds true for the new 19ZNF model

has been an outstanding question. As discussed in Chapter 1, and again in Chapter 3, the

aim of this study was to provide the beginnings of an evidence-based foundation for the

efficacy of 19ZNF. The focus was to evaluate if 19ZNF would result in improved clinical

symptoms and electrocortical function as measured by the identified outcome measures.

In general, the findings of this study were that attention, behavior, executive function,

and electrocortical function all improved after approximately ten 19ZNF sessions, with

the number of sessions ranging from an average of 9.70 to 11.83 sessions across the four

groups. This study also supported the clinical reports of Thatcher (2013) and Wigton

(2013) that 19ZNF results in improvement in clinical symptoms in fewer sessions than

the 40+ sessions typical in traditional NF. Also notable is that the frequency of the

sessions was an average of once per week, rather than the two to three times per week as

is typical of traditional NF or QNF. Each finding will next be reviewed separately, to

further discuss the significance of this study as related to the identified constructs of

attention, behavior, executive function, and

electrocortical function.

Research question 1a: IVA group. Does 19ZNF improve attention as measured

by the IVA assessment? In answering this research question, as seen in Table 4.3, the

post scores were higher than the pre scores for the IVA, thus lending support for attention

being improved. Although this group was made up of subjects with varying diagnoses

(though mostly associated with ADHD), as a collective group, they all initially exhibited

symptoms of attention dysfunction, as all the group means Attention scales scores fell at

or below the clinically significant range (Auditory Attention = 86.50, Visual Attention =

83.60, and Full Scale = 83.40). As was expected, 19ZNF resulted in a positive clinical

outcome of improved attention, as the subjects’ performance on the posttest assessment

significantly improved. After 19ZNF, all the included group means Attention scales were

no longer in the clinically significant range (Auditory Attention = 106.20, Visual Attention

= 103.70, and Full Scale = 105.60). The effect sizes for the three scales (1.84, 1.29, and

1.62, respectively) are all considered very large. Therefore, the results of this research

question were both clinically and

statistically significant.

Given that no prior 19ZNF studies were found which analyzed IVA data as an

outcome measure, no direct comparison to prior research is possible. Moreover, there

were no QNF studies found incorporating the IVA as an outcome measure. In looking at

traditional NF studies, while Knezevic et al. (2010) incorporated the IVA in their study,

they did not use any of the composite Attention scales. The Fritson et al. (2008) study

is not a relevant comparison as they used a sample of non-clinical college students.

Finally, in the research of Steiner et al. (2011), the only comparable scale used was the

Attention Full scale; yet, with an n = 6, while the post scores were in the desired

direction, the pre-post difference scores were not statistically significant.

Research question 1b: DSMD group. Does 19ZNF improve behavior as

measured by the DSMD assessment? In answering this research question, as seen in

Table 4.3, the post scores were lower than the pre scores for the DSMD, thus lending

support for behavior being improved. Although this group was made up of subjects with

varying diagnoses, as a collective group, they all initially exhibited symptoms of

behavioral issues, as all the included group means scales scores fell at or above the

clinically significant range (Externalizing = 68.21, Internalizing = 66.21, and Total =

65.00). As was expected, 19ZNF resulted in a positive clinical outcome of improved

behavior, as the subjects’ scores on the posttest assessment significantly improved. After

19ZNF, all the included group means scales were no longer in the clinically significant

range (Externalizing = 57.71, Internalizing = 57.29, and Total = 55.64). The effect sizes

for the three scales (1.83, 2.36, and 3.42, respectively) are all interpreted as being very

large and are the largest effect sizes in this study. Therefore, the results of this research

question were both clinically and statistically significant.

To date, no prior NF studies (ZNF, QNF, or traditional NF) have conducted

outcome measure analysis with the DSMD; as such, there are no relevant existing studies

with which to directly contrast or compare. However, the DSMD scales of Externalizing,

Internalizing, and Total correlate well to the similarly named scales of the CBCL.

Huang-Storms et al. (2006) used the CBCL as an outcome measure in their retrospective

pretest-posttest study evaluating traditional NF. All post scores were in the desired direction and

difference scores for all scales were statistically significant (p < .01) with medium to

large effect sizes (Externalizing, Cohen’s d = .94; Internalizing, Cohen’s d = .59; Total,

Cohen’s d = .78).

Research question 1c: BRIEF group. Does 19ZNF improve executive function

as measured by the BRIEF assessment? In answering this research question, as seen in

Table 4.3, the post scores were lower than the pre scores for the BRIEF, thus lending

support for executive function being improved. Although this group was made up of

subjects with varying diagnoses, as a collective group, they all initially exhibited

symptoms of compromised executive function, with all the included group means scales

scores falling at or above the clinically significant range (BRI = 71.00, MI = 76.08, and

GEC = 75.75). As was expected, 19ZNF resulted in a positive clinical outcome of

improved executive function, as the subjects’ scores on the posttest assessment

significantly improved. After 19ZNF, all the included group means scales were no longer

in the clinically significant range (BRI = 60.17, MI = 65.67, and GEC = 64.50). The

effect sizes for the three scales (1.72, 1.73, and 1.84, respectively) are all interpreted as

being very large. Therefore, the results of this research question were both clinically and

statistically significant.

Here too, no prior 19ZNF studies were found which conducted outcome measure

analysis with the BRIEF; thus, no direct comparison to prior research is possible.

However, the Orgim and Kestad (2013) study, which compared QNF to medication for

ADHD, included the BRI and MI scales of the BRIEF among various outcome measures.

For the BRIEF scales analyzed, while the post scores were in the desired direction for

both groups, the differences between the NF and medication groups were not significant. The

Drechsler et al. (2007) study compared SCP NF to group therapy and incorporated the

BRI and MI scales of the BRIEF as two of their outcome measures. Their findings

indicated a statistically significant (p = .004) improvement for NF, more than group

therapy, on the MI scale of the BRIEF; whereas there were no significant differences for

NF versus group therapy for the BRI scale. Finally, Steiner et al. (2011) incorporated the

GEC scale of the BRIEF as one of many outcome measures in comparing traditional NF

to computerized attention training and to a waitlist control. For all groups, for the primary

parent and co-parent ratings, all post scores moved in the desired direction; however, only

the computerized attention training resulted in a significant difference (p < .05) for the

GEC scale.

Research question 2: QEEG group. Does 19ZNF improve electrocortical

function as measured by QEEG z-scores, such that the post z-scores are closer to the

mean than pre z-scores? In answering this research question, as seen in Table 4.3, the

post z-scores were closer to the mean than the pre z-scores, thus lending support for

electrocortical function being improved. Although this group was made up of subjects

with varying diagnoses, as a collective group, they all exhibited electrocortical

dysregulation, with all the targeted z-scores group means falling above the z-score

threshold (Absolute power = 1.46, Relative power = 1.51, and Coherence = 1.46). As was

expected, 19ZNF resulted in a positive clinical outcome of improved electrocortical

function, as the subjects’ averaged targeted z-scores on the posttest assessment

significantly improved. After 19ZNF, the targeted z-score group means for Absolute

power (1.03) and Coherence (0.96) were at or below the z-score threshold, with the

Relative power (1.13) approaching the threshold. The effect sizes for the three scales

(Absolute power = 2.29, Relative power = 1.76, and Coherence = 1.88) are all interpreted

as being very large. Therefore, the results of this research question were both clinically

and statistically significant. Moreover, these findings suggested that, as a group, the

subjects’ QEEG z-scores normalized as a result of 19ZNF; and perhaps more

importantly, the normalization was accompanied by clinical symptom improvement.

As has been stated, few NF studies make use of QEEG metrics as outcome

measures. More so, as of this writing, no prior NF studies (ZNF, QNF, or traditional NF)

have been found incorporating a measure of overall QEEG normalization. Thus, there are

no relevant existing studies with which to contrast or compare.

Conclusions. The literature reviewed for this study found both traditional NF and

QNF studies consistently employed retrospective pretest-posttest designs. This research

was consistent with those prior works. Significant differences were found between the

pre and post scores, thus indicating positive clinical outcomes. However, this research

was also innovative in that it made use of QEEG metrics, as outcome measures, to

provide an overall measure of the distance from the mean, for determining overall

normalization of the z-scores. Here too, the pre to post score differences were significant

for all metrics indicating normalization of the QEEG z-scores, thus indicating improved

electrocortical function.

Arns et al. (2009, 2012) have discussed effect sizes in studies evaluating NF to

treat ADHD. For traditional NF models, Hd effect sizes were 0.7 and 1.0 for hyperactive

and attention symptoms, respectively; yet for the QNF models, Hd effect sizes were 1.2

and 1.8 (hyperactive and attention symptoms, respectively). In this research, Hd effect

sizes ranged from 1.29 to 3.42, with an average of 1.97. Therefore, the effect sizes for

this study were similar to, or greater than, what has been reported for QNF and traditional

NF models. Moreover, if NF efficacy is defined in terms of large effect sizes when

comparing pre-post outcome measure data (Arns et al., 2012), then the effect sizes of this

study support 19ZNF as being effective.
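
The range and average quoted above can be reproduced from the twelve effect sizes listed in Table 4.3. The sketch below is illustrative only. It uses Python's decimal module to avoid binary floating-point rounding, and it assumes half-up rounding to two decimal places, which is not stated in the text.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hedges' d values transcribed from Table 4.3, keyed by group and then scale.
effect_sizes = {
    "IVA": {"Audio Attention": "1.84", "Visual Attention": "1.29",
            "Full Scale Attention": "1.62"},
    "DSMD": {"Externalizing": "1.83", "Internalizing": "2.36", "Total": "3.42"},
    "BRIEF": {"BRI": "1.72", "MI": "1.73", "GEC": "1.84"},
    "QEEG": {"Absolute Power": "2.29", "Relative Power": "1.76",
             "Coherence": "1.88"},
}

values = [Decimal(v) for scales in effect_sizes.values() for v in scales.values()]
smallest, largest = min(values), max(values)
# Assumption: the reported average was rounded half-up to two decimals.
mean_d = (sum(values) / len(values)).quantize(Decimal("0.01"),
                                              rounding=ROUND_HALF_UP)
print(smallest, largest, mean_d)  # prints: 1.29 3.42 1.97
```

The smallest value belongs to the IVA Visual Attention scale and the largest to the DSMD Total scale, in agreement with the discussion of the individual research questions.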

Therefore, as was proposed in Chapter 1, it is reasonable to conclude that the

theory of operant conditioning, upon which NF is founded, can be expanded to include

19ZNF. It is also reasonable to conclude that, in the context of this study, the findings

supported the efficacy of 19ZNF in improving attention, behavior, executive function,

and electrocortical function. Thus, this research addressed the literature gap and began to

lend credence to the position that 19ZNF could be considered an evidence-based

intervention. Further, this study demonstrated that QEEG z-scores data can be used for

group comparison studies, in a way not previously developed; thus, this study has the

potential for cultivating future QEEG-based research.

Implications

The objective of this research was a comparison of outcome measures before and

after 19ZNF to evaluate the efficacy of this NF intervention. In reviewing the theoretical

framework discussed in the literature review, certain elements are pertinent to the

findings of this research. Hughes and John (1999) demonstrated EEG/QEEG measures to

be sensitive to psychiatric disorders. The QNF model (which informs the 19ZNF model)

is founded on the premise that electrocortical dysfunctions correspond with clinical

symptoms and mental disorders (Coben & Myers, 2010; Collura, 2010; Walker, 2010a),

such that clinical symptoms can be linked to brain dysregulation (Thatcher, 2013).

Further, when NF results in symptom resolution, together with QEEG normalizing, this

represents an improvement in electrocortical functioning (Arns et al., 2012; Walker,

2010a). Therefore, the findings of this study (with the 19ZNF protocol of QEEG

normalization) were consistent with the multiple reports in the literature suggesting

QEEG normalization protocols bring about clinical benefits (Arns et al., 2012; Breteler et

al., 2010; Collura, 2008; Orgim & Kestad, 2013; Surmeli et al., 2013; Surmeli & Ertem,

2009, 2010; Walker, 2009, 2010a, 2011, 2012a).

Theoretical implications. QEEG normalization is a theoretical construct which

has grown in popularity with the advent of the QNF model; as has the use of individually

tailored QEEG-based protocols to bring about that normalization. Additionally, clinical

reports have suggested 19ZNF may exhibit better performance than traditional NF. These

findings supported 19ZNF as a NF modality which can bring about both QEEG

normalization and symptom improvement. More so, it can do so quite efficiently, as

evidenced by the results of this study occurring, on average, within about 10 sessions, at a

target frequency of once per week.

As discussed in the literature review, the greater specificity that QEEG-based

methods allowed in treatment also creates methodological challenges due to the need to

account for both positive and negative z-scores. This study’s method of transforming the

z-scores to the absolute value, then tracking pre to post changes of the targeted z-scores,

presented an innovative methodology for measuring overall normalization of the QEEG.

If further validated, this approach has the potential to open new avenues for QEEG-based

research, both within the NF community as well as the broader neuroscience fields.
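
The general idea of the transformation can be sketched as follows. This is a hedged illustration rather than the exact procedure used in this study. It assumes that a metric counts as targeted when the absolute value of its pretest z-score is at or beyond 1.00, and that normalization is summarized as the mean absolute z-score of the targeted metrics. The function name, the metric labels, and the data are hypothetical.

```python
from statistics import mean

Z_THRESHOLD = 1.00  # threshold value named in the text (z ± 1.00)


def normalization_summary(pre_z, post_z, threshold=Z_THRESHOLD):
    """Return targeted metric names and their mean |z| at pretest and posttest."""
    # Assumption: a metric is "targeted" when its pretest |z| meets the threshold.
    targeted = [name for name, z in pre_z.items() if abs(z) >= threshold]
    if not targeted:
        raise ValueError("no pretest z-scores meet the threshold")
    # Taking absolute values lets positive and negative deviations be
    # averaged without cancelling each other out.
    pre_mean = mean(abs(pre_z[name]) for name in targeted)
    post_mean = mean(abs(post_z[name]) for name in targeted)
    return targeted, pre_mean, post_mean


# Hypothetical z-scores for one subject (not data from this study).
pre = {"metric_A": 1.8, "metric_B": -1.4, "metric_C": 0.3}
post = {"metric_A": 1.1, "metric_B": -0.9, "metric_C": 0.4}
targeted, pre_mean, post_mean = normalization_summary(pre, post)
print(targeted, round(pre_mean, 2), round(post_mean, 2))
```

A posttest mean that is smaller than the pretest mean indicates that the targeted z-scores moved closer to the normative mean of zero, regardless of whether they started above or below it.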

The implications of this study, as related to cognition and instruction, are twofold.

First, the findings suggested 19ZNF improves the attention and executive function

components of cognition. Second, when cognition improves, more mental resources are

made available for an individual to better engage in instructional processes. The findings

of this study also suggested that 19ZNF can improve behavior. In group educational

settings, when disruptive behavior improves, distractions to other learners are reduced

and the effectiveness of the instructional environment can be enhanced. Therefore, this

study lent support to 19ZNF as benefiting both cognition and instruction.

Practical implications. This research begins to address the literature gap

regarding evidence-based findings of 19ZNF. Thus, this study can provide NF clients and

clinicians with information regarding its efficacy in improving attention, behavior,

executive function, and electrocortical function. Furthermore, it suggests that 19ZNF may

address the need for 40+ sessions for success with NF. If 19ZNF is shown to be an

evidence-based intervention which requires fewer sessions than traditional NF or QNF,

clients will benefit through the associated cost savings. Also of note, while not a specific

focus in this research, is that the 19ZNF in this study occurred at a frequency of only

once per week, rather than the two to three times per week as with other models. These

aspects, taken together, may potentially serve to reduce resistance of third-party payers to

include NF as covered services.

Future implications. Future implications of this study depend on future research.

This study only provided the beginning steps of forming an evidence-based framework

for 19ZNF. As will be discussed below, much remains to be investigated and evaluated

through further research. However, that being said, this study has the potential of

widening the acceptance of 19ZNF, as well as opening new frontiers for QEEG-based

research.

Strengths of this study include being a first quantitative analysis of group means

data from 19ZNF; as of this writing, no other such analysis has been found. Thus, this research

contributed in taking the empirical evaluation of 19ZNF beyond clinical reports and case

study presentations. Moreover, data coming from a real-world clinical setting suggests

clinicians employing this new model may have similar results. Given the pretest-posttest

design, and the group means averaged time between pre and post assessments ranged

from 13 to 16 weeks (see Table 4.1), the previously identified limitation of potential

maturational or history effects likely had minimal impact on the findings. This, then,

increased the credibility of the conclusions. However, remaining weaknesses, inherent in

retrospective studies in clinical settings, included limitations already discussed, such as

small sample size, lack of a separate control group (lack of randomization), or

comparison to traditional and QNF models. Therefore, recommendations for further

research are next provided.

Recommendations

As discussed in the Limitations section of Chapter 3, the question of efficacy

cannot be fully explored without further research, and this is even more true of the question of whether

19ZNF is superior to other NF approaches. Therefore, specific recommendations for further

research are presented. Additionally, recommendations for practice will be reviewed.

Recommendations for future research. As has been discussed, this study was

only a beginning step toward proving 19ZNF as efficacious; thus the recommendations

herein serve to propose next steps in forwarding this line of research. A notable

significance of this study, in advancing scientific knowledge, was that it filled the gap of

a lack of quantitative studies evaluating 19ZNF. However, the gap is large and more

research is needed. Therefore, all the following recommendations would be best

implemented through the use of quantitative methodologies, in order to apply evidence-

based strategies and statistical analysis to evaluate outcomes of 19ZNF as a treatment

intervention.

A single study is insufficient to fully validate the efficacy of any treatment

intervention. Thus, replication of this study would add to the scientific integrity of the

results; however, doing so with larger sample sizes would, of course, be recommended.

Next, follow-up studies are a needed area of focus. While 19ZNF may be effective in the

short-term, the question of whether the benefits hold over time is still outstanding. With

19ZNF being newer than other approaches that are backed by more research, direct

comparisons to the traditional or QNF models are needed; particularly with randomized

assignments. Additional suggestions for randomized control group research are for

comparisons to waitlist groups. However, randomized controlled methods are less

feasible in clinical settings; and as such, these studies will likely require university and/or

grant-supported research settings (more conducive to true experimental designs) to

complete. Other comparison research should also explore comparisons of 19ZNF using

surface montages (as with this study) to 19ZNF using inverse-solution montages (e.g.

LORETA).

As has been discussed, few NF studies employ QEEG metrics as a direct outcome

measure; and even fewer do so in analyzing group means data. Therefore, an additional

notable significance of this study, in advancing scientific knowledge, is the novel

development of a measure of overall QEEG normalization, by tracking the pre-post

values of the targeted transformed z-scores. Here too, though, replication and further

validation are needed. Also recommended is an investigation of whether z ± 1.00 is an

optimal threshold value to determine targeted z-scores.

Recommendations for practice. Both NF clinicians and prospective clients will

benefit from reviewing this study. Researchers will also find this study of interest in

furthering what is known about NF, and/or using QEEG metrics as outcome measures in

NF or other QEEG-based investigations. For clinicians employing 19ZNF, who do not

already do so, incorporating the regular use of pre and post outcome measures, and

gathering pre-session baseline QEEG data, is important to furthering what is known

about 19ZNF. Currently, 19ZNF is in its infancy, and likely will face resistance in the

scientific community, much the same as traditional NF has until only recent years. The

settings where conventional experimental work occurs (i.e. grant-funded and/or

university laboratories) may be less likely to embrace research with newer 19ZNF

models, in favor of traditional NF models; at least in the short term. As a result, the

clinical setting is currently the primary source of data to evaluate 19ZNF. Therefore,

performing quality pre-post assessments, and then moving forward in research with that

data, will be necessary to advance the acceptance of 19ZNF by the wider scientific

community.

As was discussed in Chapter 1, this study has the potential of opening doors to

future QEEG-based research, in demonstrating that z-scores from QEEG data can be used

for group comparison studies, in a way not previously developed. In moving forward

with this line of research, this study proposed a method for using QEEG metrics for

measuring the degree of normalization. Therefore, incorporating QEEG data as outcome

measures is a practical reality for NF researchers. Thus, practice recommendations are for

including these metrics in future research.

References

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental

disorders (4th ed., text rev.). Washington, DC: American Psychiatric Publishing.

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental

disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.

Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River,

NJ: Pearson.

Arns, M., de Ridder, S., Strehl, U., Breteler, M., & Coenen, A. (2009). Efficacy of

neurofeedback treatment in ADHD: The effects on inattention, impulsivity and

hyperactivity: A meta-analysis. Clinical EEG and Neuroscience, 40(3), 180-189.

Arns, M., Drinkenburg, W., & Kenemans, J. L. (2012). The effects of QEEG-informed

neurofeedback in ADHD: An open-label pilot study. Applied Psychophysiology

and Biofeedback, 37, 171-180. doi:10.1007/s10484-012-9191-4

Association for Applied Psychophysiology and Biofeedback (AAPB). (2011). What is

biofeedback? Association for Applied Psychophysiology and Biofeedback,

[website home page], retrieved from: www.aapb.org

Baehr, E., Rosenfeld, J. P., & Baehr, R. (1997). The clinical use of an alpha asymmetry

protocol in the neurofeedback treatment of depression: Two case studies. Journal

of Neurotherapy, 2(3), 10-23.

Berger, H. (1929). Über das Elektrenkephalogramm des Menschen [About the

electroencephalogram of humans]. Archiv für Psychiatrie und Nervenkrankheiten,

87, 527-570.

Besenyei, M., Varga, E., Fekete, I., Puskás, S., Hollódy, K., Fogarasi, A., … Clemens, B.

(2012). EEG background activity is abnormal in the temporal and inferior parietal

cortex in benign rolandic epilepsy of childhood: A LORETA study. Epilepsy

Research, 98(1), 44-49. doi:10.1016/j.eplepsyres.2011.08.013

Brandeis, D. (2011). Neurofeedback training in ADHD: More news on specificity.

Clinical Neurophysiology, 122, 856-857. doi:10.1016/j.clinph.2010.08.011

Bresadola, M. (2008). Animal electricity at the end of the eighteenth century: The many

facets of a great scientific controversy. Journal of the History of the

Neurosciences, 17(1), 8-32. doi:10.1080/09647040600764787

Breteler, M. H. M., Arns, M., Peters, S., Giepmans, I., & Verhoeven, L. (2010).

Improvements in spelling after QEEG-based neurofeedback in dyslexia: A

randomized controlled treatment study. Applied Psychophysiology and

Biofeedback, 35, 5–11. doi:10.1007/s10484-009-9105-2

Budzynski, T. H. (1999). From EEG to neurofeedback. In J. R. Evans & A. Abarbanel

(Eds.), Introduction to quantitative EEG and neurofeedback (pp. 65-79). San

Diego, CA: Academic Press.

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs

for research. Chicago, IL: Rand McNally.

Cannon, R. L., Baldwin, D. R., Shaw, T. L., Diloreto, D. J., Phillips, S. M., Scruggs, A.

M., & Riehl, T. C. (2012). Reliability of quantitative EEG (qEEG) measures and

LORETA current source density at 30 days. Neuroscience Letters, 518, 27-31.

doi:10.1016/j.neulet.2012.04.035

Carr, L. T. (1994). The strengths and weaknesses of quantitative and qualitative research:

What method for nursing? Journal of Advanced Nursing, 20, 716-721.

Caton, R. (1875). The electric currents of the brain. British Medical Journal, 2, 278.

Coben, R., & Myers, T. E. (2010). The relative efficacy of connectivity guided and

symptom based EEG biofeedback for autistic disorders. Applied

Psychophysiology and Biofeedback, 35, 13–23. doi:10.1007/s10484-009-9102-5

Collura, T. F. (1993). History and evolution of electroencephalographic instruments and

techniques. Journal of Clinical Neurophysiology, 10(4), 476-504.

Collura, T. F. (1995). History and evolution of computerized electroencephalography.

Journal of Clinical Neurophysiology, 12(3), 214-229.

Collura, T. F. (2008). Neuronal dynamics in relation to normative

electroencephalography assessment and training. Biofeedback, 36(4), 134-139.

Collura, T. F. (2010). Conclusion: QEEG-guided neurofeedback in context and in

practice. Applied Psychophysiology and Biofeedback, 35, 37-38.

doi:10.1007/s10484-009-9108-z

Collura, T. F. (2014). Technical foundations of neurofeedback. New York, NY:

Routledge.

Collura, T. F., Guan, J. G., Tarrant, J., Bailey, J., & Starr, F. (2010). EEG biofeedback

case studies using live z-score training and a normative database. Journal of

Neurotherapy, 14(1), 22-46. doi:10.1080/10874200903543963

Collura, T. F., Kaiser, D., Lubar, J., & Evans, J. (2011). Biofeedback Glossary:

Dictionary of Biofeedback / Neurofeedback Terms. [HTML webpage].

Association for Applied Psychophysiology and Biofeedback (AAPB). Retrieved

from: http://www.aapb.org/i4a/pages/index.cfm?pageid=3462

Collura, T. F., Thatcher, R. W., Smith, M. L., Lambos, W. A., & Stark, C. R. (2009).

EEG Biofeedback training using live z-scores and a normative database. In T. H.

Budzynski, H. K. Budzynski, J. R. Evans & A. Abarbanel (Eds.), Introduction to

quantitative EEG and neurofeedback: Advanced theory and applications (2nd

ed., pp. 29-59). Burlington, MA: Elsevier.

Cooper, C. (2001). Review of the Devereux Scales of Mental Disorders. In B. S. Plake

& J. C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 408-

410). Lincoln, NE: Buros Institute of Mental Measurements.

Corsi-Cabrera, M. M., Galindo-Vilchis, L. L., del-Río-Portilla, Y. Y., Arce, C. C., &

Ramos-Loyo, J. J. (2007). Within-subject reliability and inter-session stability of

EEG power and coherent activity in women evaluated monthly over nine months.

Clinical Neurophysiology, 118(1), 9-21. doi:10.1016/j.clinph.2006.08.013

Demos, J. N. (2005). Getting started with neurofeedback. New York, NY: W. W. Norton.

Donders, J. (2002). The Behavior Rating Inventory of Executive Function: Introduction.

Child Neuropsychology, 8(4), 229-230.

Drechsler, R., Straub, M., Doehnert, M., Heinrich, H., Steinhausen, H., & Brandeis, D.

(2007). Controlled evaluation of a neurofeedback training of slow cortical

potentials in children with attention deficit/hyperactivity disorder. Behavioral and

Brain Functions, 3(35), 1-13. doi:10.1186/1744-9081-3-35

Farley, J. W., & Connolly, J. J. (2005). Low impedances: How important are they with

digital recordings? American Journal of Electroneurodiagnostic Technology,

45(2), 139-44.

Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible

statistical power analysis program for the social, behavioral, and biomedical

sciences. Behavior Research Methods, 39, 175-191.

Fritson, K. K., Wadkins, T. A., Gerdes, P., & Hof, D. (2008). The impact of neurotherapy

on college students’ cognitive abilities and emotions. Journal of Neurotherapy,

11(4), 1-9. doi:10.1080/10874200802143998

Gearing, R. E., Mian, I. A., Barber, J., & Ickowicz, A. (2006). A methodology for

conducting retrospective chart review research in child and adolescent psychiatry.

Journal of the Canadian Academy of Child and Adolescent Psychiatry, 15(3),

126-134.

Gevensleben, H., Holl, B., Albrecht, B., Schlamp, D., Kratz, O., Studer, P., Wangler, S., …

Heinrich, H. (2009). Distinct EEG effects related to neurofeedback training in

children with ADHD: A randomized controlled trial. International Journal of

Psychophysiology, 74, 149–157. doi:10.1016/j.ijpsycho.2009.08.005

Gevensleben, H., Rothenberger, A., Moll, G. H., & Heinrich, H. (2012). Neurofeedback

in children with ADHD: Validation and challenges. Expert Review of

Neurotherapeutics, 12(4), 447–460. doi:10.1586/ERN.12.22

Gioia, G. A., Isquith, P. K., Guy, S. C., & Kenworthy, L. (2000). BRIEF: Behavior

Rating Inventory of Executive Function: Professional manual. Lutz, FL:

Psychological Assessment Resources PAR.

Gravetter, F. J., & Wallnau, L. B. (2010). Statistics for the behavioral sciences (9th ed.).

Belmont, CA: Wadsworth Cengage Learning.

Hallman, D. W. (2012). 19-Channel neurofeedback in an adolescent with FASD. Journal

of Neurotherapy, 16(2), 150-154. doi:10.1080/10874208.2012.677646

Hammer, B., Colbert, A., Brown, K., & Ilioi, E. (2011). Neurofeedback for insomnia: A

pilot study of z-score SMR and individualized protocols. Applied

Psychophysiology and Biofeedback, 36(4), 251-64.

doi:10.1007/s10484-011-9165-y

Hammond, D. C. (2010). The need for individualization in neurofeedback:

Heterogeneity in QEEG patterns associated with diagnoses and symptoms.

Applied Psychophysiology and Biofeedback, 35, 31–36.

doi:10.1007/s10484-009-9106-1

Hergenhahn, B. R. (2009). An introduction to the history of psychology (6th ed.).

Belmont, CA: Wadsworth.

Huang-Storms, L., Bodenhamer-Davis, E., Davis, R., & Dunn, J. (2006). QEEG-guided

neurofeedback for children with histories of abuse and neglect:

Neurodevelopmental rationale and pilot study. Journal of Neurotherapy, 10(4), 3-

16. doi:10.1300/J184v10n04_02

Hughes, J. R., & John, E. R. (1999). Conventional and quantitative

electroencephalography in psychiatry. The Journal of Neuropsychiatry and

Clinical Neurosciences, 11(2), 190-208.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis (2nd ed.). Thousand

Oaks, CA: SAGE Publications, Inc.

doi:10.4135/9781412985031

John, E. R. (1977). Neurometrics: Quantitative electrophysiological analyses. In E. R.

John & R. W. Thatcher (Eds.), Functional neuroscience, (Vol. II). New Jersey: L.

Erlbaum Assoc.

Johnstone, J., & Gunkelman, J. (2003). Use of databases in QEEG evaluation. Journal of

Neurotherapy, 7(3), 31-52. doi:10.1300/J184v07n03_02

Kamiya, J. (1968). Conscious control of brainwaves. Psychology Today, 1, 56-60.

Kamiya, J. (1969). Operant control of the EEG alpha rhythm and some of its reported

effects on consciousness. In C. T. Tart (Ed.), Altered states of consciousness (pp.

489-501). New York: Wiley.

Kerlinger, F. N. (1986). Foundations of behavioral research (3rd ed.). Orlando, FL: Holt,

Rinehart, & Winston.

Kerlinger, F. N., & Lee, H. B. (2000). Foundations of behavioral research (4th ed.).

United States: Wadsworth Thomson Learning.

Kirk, R. (2009). Experimental design. In R. E. Millsap, & A. Maydeu-Olivares (Eds.),

The Sage handbook of quantitative methods in psychology (pp. 23-46). London:

Sage Publications Ltd.

doi:10.4135/9780857020994.n2

Knezevic, B., Thompson, L., & Thompson, M. (2010). Pilot project to ascertain the

utility of Tower of London test to assess outcomes of neurofeedback in clients

with Asperger’s Syndrome. Journal of Neurotherapy, 14(1), 3-19.

doi:10.1080/10874200903543922

Koberda, J. L., Hillier, D. S., Jones, B., Moses, A., & Koberda, L. (2012). Application of

neurofeedback in general neurology practice. Journal of Neurotherapy, 16(3),

231-234. doi:10.1080/10874208.2012.705770

Koberda, J. L., Moses, A., Koberda, P., & Koberda, L. (2012a, September). Comparison

of the effectiveness of z-score surface/LORETA 19-electrodes neurofeedback to

standard 1-electrode neurofeedback. Oral Presentation at the 20th Annual

Conference of the International Society for Neurofeedback and Research,

Orlando, FL.

Koberda, J. L., Moses, A., Koberda, L., & Koberda, P. (2012b). Cognitive enhancement

using 19-electrode z-score neurofeedback. Journal of Neurotherapy, 16(3), 224-

230. doi:10.1080/10874208.2012.705770

Krigbaum, G., & Wigton, N. L. (2013). A proposed methodology of analysis for

monitoring treatment progression with 19-channel z-score neurofeedback

(19ZNF) in a single-subject design. Manuscript submitted for publication.

La Vaque, T. J., Hammond, D. C., Trudeau, D., Monastra, V., Perry, J., Lehrer, P., …

Sherman, R. (2002). Template for developing guidelines for the evaluation of the

clinical efficacy of psychophysiological interventions. Applied Psychophysiology

& Biofeedback, 27(4), 273–281.

Lofthouse, N., Arnold, L. E., Hersch, S., Hurt, E., & DeBeus, R. (2012). A review of

neurofeedback treatment for pediatric ADHD. Journal of Attention Disorders, 16,

351-372. doi:10.1177/1087054711427530

Loo, S. K., & Barkley, R. A. (2005). Clinical utility of EEG in attention deficit

hyperactivity disorder. Applied Neuropsychology, 12(2), 64–76.

Loo, S. K., & Makeig, S. (2012). Clinical utility of EEG in attention deficit/hyperactivity

disorder: A research update. Neurotherapeutics, 9(3), 569-587.

doi:10.1007/s13311-012-0131-z

Lubar, J. F., & Shouse, M. N. (1976). EEG and behavioral changes in a hyperactive child

concurrent with training of the sensorimotor rhythm (SMR): A preliminary report.

Biofeedback & Self-Regulation, 1, 293–306.

Lucido, M. L. (2012). Effects of neurofeedback on neuropsychological functioning in an

adult with autism (Doctoral dissertation). Available from ProQuest Central

database (UMI No. 3499929).

Machado, S., Portella, C. E., Silva, J. G., Velasques, B., Terra, P., Vorkapic, C. F.,

… Ribeiro, P. (2007). Changes in quantitative EEG absolute power during the task

of catching an object in free fall. Arquivos de Neuro-Psiquiatria, 65(3-A), 633-636.

Matousek, M., & Petersen, I. (1973). Automatic evaluation of background activity by

means of age dependent EEG quotients. EEG & Clinical Neurophysiology, 35,

603–612.

Niv, S. (2013). Clinical efficacy and potential mechanisms of neurofeedback. Personality

and Individual Differences, 54, 676-686.

Ogrim, G., & Hestad, K. A. (2013). Effects of neurofeedback versus stimulant medication

in attention-deficit/hyperactivity disorder: A randomized pilot study. Journal of

Child and Adolescent Psychopharmacology, 23(7), 448-57.

doi:10.1089/cap.2012.0090

Othmer, S., Othmer, S. F., & Kaiser, D. A. (1999). EEG biofeedback: An emerging

model for its global efficacy. In J. R. Evans & A. Abarbanel (Eds.), Introduction

to quantitative EEG and neurofeedback (pp. 243-310). San Diego, CA: Academic

Press.

Othmer, S. F., & Othmer, S. (2007). Interhemispheric EEG training: Clinical experience

and conceptual models. In J. R. Evans (Ed.), Handbook of neurofeedback

(pp. 109-136). New York: Haworth Medical Press.

Pavlov, I. (1928). Natural science and the brain (W. H. Gantt, Trans.). In Lectures on

conditioned reflexes: Twenty-five years of objective study of the higher nervous

activity (behaviour) of animals (pp. 120-130). New York: Liveright Publishing.

doi:10.1037/11081-010

Peniston, E. G., & Kulkosky, P. J. (1990). Alcoholic personality and alpha-theta

brainwave training. Medical Psychotherapy, 3, 37-55.

Peniston, E. G., & Kulkosky, P. J. (1991). Alpha-theta brainwave neuro-feedback

therapy for Vietnam veterans with combat-related post-traumatic-stress disorder.

Medical Psychotherapy, 4, 47-60.

Peterson, C. A. (2001). Review of the Devereux Scales of Mental Disorders. In B. S.

Plake & J. C. Impara (Eds.), The fourteenth mental measurements yearbook (pp.

408–412). Lincoln, NE: Buros Institute of Mental Measurements.

Pigott, H. E., De Biase, L., Bodenhamer-Davis, E., & Davis, R. E. (2013). The evidence-

base for neurofeedback as a reimbursable healthcare service to treat attention

deficit/hyperactivity disorder. Retrieved from International Society for

Neurofeedback and Research website: http://www.isnr.org/uploads/nfb-adhd

Pizzagalli, D. A. (2007). Electroencephalography and high-density electrophysiological

source localization. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.),

Handbook of psychophysiology (3rd ed., pp. 56-84). Cambridge, England:

Cambridge University Press.

Protection of Human Subjects, 45 C.F.R. pt. 46 (2009).

Ramezani, A. (2008). The effects of sequential versus referential montage neurofeedback

amplitude training on QEEG measures of phase and coherence (Doctoral

dissertation). Available from ProQuest Central database (UMI No. 3352127).

Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-

Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling

and Analytics, 2(1), 21-33.

Reichardt, C. (2009). Quasi-experimental design. In R. E. Millsap, & A. Maydeu-

Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp.

46-72). London: Sage Publications Ltd.

doi:10.4135/9780857020994.n3

Robbins, J. (2000). A symphony in the brain. New York, NY: Grove Press.

Rosenberg, M. S., Adams, D. C., & Gurevitch, J. (2000). MetaWin: Statistical software

for meta-analysis, version 2.0. Sunderland, MA: Sinauer Associates.

Roth, R. M., Isquith, P. K., & Gioia, G. A. (2005). BRIEF: Behavior Rating Inventory of

Executive Function – Adult version: Professional manual. Lutz, FL: PAR

Inc.

Rutter, P. (2011, September). Potential clinical applications for 19 channel live z-score

training using Percent ZOK and ZPlus protocols. Oral Presentation at the 19th

Annual Conference of the International Society for Neurofeedback and Research,

Carefree, AZ.

Sanford, J. A., & Turner, A. (2009). Integrated Visual and Auditory Continuous

Performance Test Administration manual. North Chesterfield, VA: BrainTrain

Inc.

Sherlin, L. H., Arns, M., Lubar, J., Heinrich, H., Kerson, C., Strehl, U., & Sterman, M. B.

(2011). Neurofeedback and basic learning theory: Implications for research and

practice. Journal of Neurotherapy, 15(4), 292-304.

doi:10.1080/10874208.2011.623089

Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.

Smith, S. R., & Reddy, L. A. (2002). The concurrent validity of the Devereux Scales of

Mental Disorders. Journal of Psychoeducational Assessment, 20, 112-127.

doi:10.1177/073428290202000201

Steiner, N. J., Sheldrick, R. C., Gotthelf, D., & Perrin, E. C. (2011). Computer-based

attention training in the schools for children with attention deficit/hyperactivity

disorder: A preliminary trial. Clinical Pediatrics, 50, 615-622.

Sterman, M. B., & Friar, L. (1972). Suppression of seizures in an epileptic following

sensorimotor EEG feedback training. Electroencephalography and Clinical

Neurophysiology, 33(1), 89-95.

Sterman, M. B., LoPresti, R. W., & Fairchild, M. D. (2010). Electroencephalographic and

behavioral studies of monomethyl hydrazine toxicity in the cat. Journal of

Neurotherapy, 14(4), 293-300. doi:10.1080/10874208.2010.523367

Stoller, L. (2011). Z-Score training, combinatorics, and phase transitions. Journal of

Neurotherapy, 15(1), 35-53. doi:10.1080/10874208.2010.545758

Sürmeli, T., & Ertem, A. (2007). EEG neurofeedback treatment of patients with Down

syndrome. Journal of Neurotherapy, 11(1), 63-68. doi:10.1300/J184v11n01_07

Surmeli, T., & Ertem, A. (2009). QEEG guided neurofeedback therapy in personality

disorders: 13 case studies. Clinical EEG and Neuroscience, 40, 5-10.

Surmeli, T., & Ertem, A. (2010). Post WISC-R and TOVA improvement with QEEG

guided neurofeedback training in mentally retarded: A clinical case series of

behavioral problems. Clinical EEG and Neuroscience, 41(1), 32-41.

Surmeli, T., & Ertem, A. (2011). Obsessive compulsive disorder and the efficacy of

qEEG-Guided neurofeedback treatment: A case series. Clinical EEG and

Neuroscience, 42(3), 195-201.

Surmeli, T., Ertem, A., Eralp, E., & Kos, I. H. (2012). Schizophrenia and the efficacy of

qEEG guided neurofeedback treatment: A clinical case series. Clinical EEG and

Neuroscience, 43, 133-144. doi:10.1177/1550059411429531

Thatcher, R. W. (2012). Handbook of quantitative electroencephalography and EEG

biofeedback. St. Petersburg, FL: Anipublishing.

Thatcher, R. W. (2013). Latest developments in live z-score training: Symptom check

list, phase reset, and LORETA z-score biofeedback. Journal of Neurotherapy,

17(1), 69-87. doi:10.1080/10874208.2013.759032

Thatcher, R. W., & Lubar, J. F. (2009). History of the scientific standards of QEEG

normative databases. In T. H. Budzynski, H. K. Budzynski, J. R. Evans & A.

Abarbanel (Eds.), Introduction to quantitative EEG and neurofeedback: Advanced

theory and applications (2nd ed., pp. 29-59). Burlington, MA: Elsevier.

Thatcher, R. W., North, D., & Biver, C. (2005). Evaluation and validity of a LORETA

normative EEG database. Clinical EEG and Neuroscience, 36(2), 116–122.

Thatcher, R. W., Walker, R. A., Biver, C., North, D., & Curtin, R. (2003). Quantitative

EEG normative databases: Validation and clinical correlation. Journal of

Neurotherapy, 7(3-4), 87–122. doi:10.1300/J184v07n03_05

Thompson, M., & Thompson, L. (2003). The neurofeedback book. Wheat Ridge, CO:

Association of Applied Psychophysiology and Biofeedback.

Thorndike, E. L. (1911). Animal intelligence. New York: Macmillan.

Walker, J. E. (2009). Anxiety associated with post-traumatic stress disorder: The role of

quantitative electroencephalograph in diagnosis and in guiding neurofeedback

training to remediate the anxiety. Biofeedback, 37, 67-70.

Walker, J. E. (2010a). Recent advances in quantitative EEG as an aid to diagnosis and as

a guide to neurofeedback training for cortical hypofunctions, hyperfunctions,

disconnections, and hyperconnections: Improving efficacy in complicated

neurological and psychological disorders. Applied Psychophysiology and

Biofeedback, 35, 25–27. doi:10.1007/s10484-009-9107-0

Walker, J. E. (2010b). Using QEEG-guided neurofeedback for epilepsy versus

standardized protocols: Enhanced effectiveness? Applied Psychophysiology and

Biofeedback, 35, 29–30. doi:10.1007/s10484-009-9123-0

Walker, J. E. (2011). QEEG-guided neurofeedback for recurrent migraine headaches.

Clinical EEG and Neuroscience, 42(1), 59-61.

Walker, J. E. (2012a). Remediation of enuresis using QEEG-guided neurofeedback

training. Biofeedback, 40(3), 109-112. doi:10.5298/1081-5937-40.3.04

Walker, J. E. (2012b). QEEG-guided neurofeedback for remediation of dysgraphia.

Biofeedback, 40(3), 113–114. doi:10.5298/1081-5937-40.3.03

Walker, J. E. (2013). QEEG-guided neurofeedback for anger/anger control disorder.

Journal of Neurotherapy, 17(1), 88-92. doi:10.1080/10874208.2012.705767

Walker, J. E., Norman, C. A., & Weber, R. K. (2002). Impact of qEEG-guided coherence

training for patients with a mild closed head injury. Journal of Neurotherapy,

6(2), 31-43. doi:10.1300/J184v06n02_05

Wigton, N. L. (2008, September). 4-channel z-score neurofeedback – A single case study.

Poster presented at the 16th Annual Conference of the International Society for

Neurofeedback and Research, San Antonio, TX.

Wigton, N. L. (2009). First impressions of Neuroguide real-time z-score training. In J.

Demos (Ed.), Getting started with dynamic z-score training (pp. 81-89).

Westminster, VT: Neurofeedback of S.VT.

Wigton, N. L. (2010a, September). Laplacian z-score neurofeedback: A unique option in

the realm of multi-channel z-score neurofeedback. Plenary Session Oral

Presentation at the 18th Annual Conference of the International Society for

Neurofeedback and Research, Denver, CO.

Wigton, N. L. (2010b, September). Case studies overview of multi-channel z-score

neurofeedback. Poster presented at the 18th annual International Society for

Neurofeedback and Research Conference, Denver, CO.

Wigton, N. L. (2013). Clinical perspectives of 19-channel z-score neurofeedback:

Benefits and limitations. Journal of Neurotherapy, 17(4), 259-264.

doi:10.1080/10874208.2013.847142

Wigton, N. L., & Krigbaum, G. (2012, September). Insights gained from over 3 years of

19-channel z-score neurofeedback: Towards new paradigms. Poster presented at

the 20th annual International Society for Neurofeedback & Research Conference,

Orlando, FL.

Wyrwicka, W., & Sterman, M. B. (1968). Instrumental conditioning of sensorimotor cortex

EEG spindles in the waking cat. Physiology and Behavior, 3(5), 703-707.

Appendix A

Test Distribution Limitations

Copies of the commercially available BRIEF and DSMD psychometric test

instruments cannot be provided due to copyright protections. The publisher of the BRIEF,

PAR Incorporated, states on the Permissions page of its website

(www4.parinc.com/ProRes/permissions.aspx) that permission to include copies of an

entire test will not be granted for any publication, including dissertations. The publisher

of the DSMD, Pearson Education Incorporated, states on the Terms and Conditions page

of its website (www.pearsonclinical.com/psychology/legal/termsofsale.html) that

reproducing test items/scales is strictly prohibited by law as well as the terms and

conditions for their products. The IVA is a computerized performance test, and as such, is

only accessible by running the program on a computer. Therefore, a printed copy of this

test is not available for inclusion in an appendix.

Appendix B

IRB Letter: Determination of Exempt Status

Appendix C

Q-Q Plots of Difference Scores

IVA Group: Auditory Scale, Visual Scale, Full Scale

DSMD Group: Externalizing Scale, Internalizing Scale, Total Scale

BRIEF Group: BRI Scale, MI Scale, GEC Scale

QEEG Group: Absolute Power, Relative Power, Coherence

Appendix D

Line Graphs of Individual Pre-Post Scores

IVA Group

DSMD Group

BRIEF Group

QEEG Group
