1. Please download and read the document below, “Task for qualitative data analysis_Excerpts interviews of trainee teachers” (see attached). Use the excerpts from the interviews and the thematic analysis phases as presented in this week’s materials in order to generate ‘codes’ or ‘themes’.
Create at least two (2) themes. Produce a brief report (phase 7) (maximum word count: 500 words) to present one of the themes.
2. Read carefully the following research problem:
“Research studies suggest that teachers’ attitudes towards the inclusion of students with disabilities are influenced by a number of interrelated factors. For example, some earlier studies indicate that the nature of disability and the associated educational problems presented influence teachers’ attitudes. These are termed as ‘child-related’ variables. Other studies suggest demographic and other personality factors which can be classified as ‘teacher-related’ factors. Finally, the specific context is found to be another influencing factor and can be termed as ‘educational environment-related’” (Avramidis & Norwich, 2002).
Based on this research problem, please provide a research question that can address two or more variables. Bear in mind that the research question needs to use quantitative terms, defining the variables you will use.
Finally, discuss which statistical test you would use to answer your research question and explain the rationale behind your choice.
3. During the module you had the opportunity to engage with various topics related to the area of Research Practices and Methodologies, and throughout the lessons you applied these Research Practices and Methodologies for academic purposes.
Now, describe/provide/present three (3) points of the module that could help you improve your learning, and provide examples from your practice and experience in academic research.
Then, provide your final thoughts and feedback about Research Practices and Methodologies.
Why did you decide to become a teacher?
Data excerpt 1: … as a person I’m quite outgoing, I’m quite a confident person and I think
my communication skills are one of my strengths as are my facilitation and group working
skills, and so really, yeah, I thought from quite an early age that I’d be a candidate who
would make a successful teacher. I thought after I finish my degree I’ll apply for some jobs in
advertising and didn’t really get anywhere, it’s a real tough graduate job so it’s very hard and
in the end I’d already applied for this [PGCE] as my fallback and so I did this in the end
because I couldn’t get a job in advertising. Sounds awful but yeah. I did work experience in a
school, voluntary, I went in half a day a week and I just loved it so I knew it was the right
thing for me really. It’s nice to be part of people’s growing up. I look back at my teachers and
I still remember the ones that I loved at primary school. I remember the impact they made on
my life… I’d like to be able to give that to children, that sort of enjoyment and the amount of
pleasure I got out of it… I’d love to think that fifteen years down the line somebody would
say that about me.
Data excerpt 2: Coming from the family that I come from we’ve got a lot of children
around us so I’ve grown up with lots of children and being the oldest as well helped my
brothers and sisters learn. So I’ve always been being a teacher. It’s all I want to do really, just
be a teacher… I knew from day one, I had applied for teaching as soon as I left school. I’d
always wanted to be a teacher. Because I speak different languages, I speak Urdu, Arabic,
Farsi, French and English, I can see things from a different perspective sometimes. Certain
people might think that a child doesn’t do that properly, but I sometimes see what they’re
doing because I can see it from here and from here and I can put that across… so I knew I
could bring certain things into the teaching profession. My secondary education was very
much ‘you will listen to the teacher and you will learn from the teacher’. I don’t think
that’s the case. I think children learn from each other. I’m a facilitator rather than a person
who’s going to stand up in front of the blackboard six hours a day. I’ve done lots of voluntary
work with children and when I was doing my degree I did a lot of work with children and I
did really enjoy it. I do have three small children and that did have an impact in my choice
because you do sort of imagine that a career in teaching will fit in with family life more
conveniently
Data excerpt 3: I’m a big kid at heart and I thought if I become a teacher I’ll try really
hard to do things that are entertaining because I know how boring [school] can be. I hated
school. I was expelled from school twice. That might be another reason that I went into
teaching, the idea of going back and doing it better. I’ve grown up around teachers, you
know, arguments about teaching over the Christmas dinner table, that really put me off in
those days, but now I’ve worked for ten years and I’ve got a different perspective on it. I
swore blind I’d never do it but ten years on, your life changes. I’ve worked all my life. I was
in banking for over 10 years, then I went into HR [Human Resources] through banking, and
got made redundant twice in a year. And I just thought, OK seeing as I am not working
anyway, I may as well go into teaching. [I visited a school] and the fact my brain was
bubbling with ideas of how to cope with things said to me that it would be quite a nice job to
do creatively. It would be an outlet for my creative side which banking and HR hasn’t given
me. I have children of my own and the school year helps. I’m not going to have to think
‘what am I going to do for six weeks in the summer?’ I am going to be able to spend time
with my own children as well.
Data excerpt 4: I worked abroad in French speaking countries, I speak fluent French and I
saw the drive to get primary schools having foreign languages. That was something I really
wanted to be a part of. It was really contributing rather than just making profits, which was a
big factor and I’ve got three young children so family and work / life balance was a big issue
and really that was more important than money… I’d earn a bit less but I’d get a good balance
on that. And interest. I was sort of groaning about my old job. It was too easy and a bit boring
and I relished the idea of getting my brain going again… So it was interests as well, and
stimulation… I couldn’t do with just sitting in front of a computer, I needed to be on my feet,
moving around, interacting, the whole of stimulation of that really.
Data excerpt 5: When I started working as a support assistant in schools with children
who didn’t really speak English, I would tell them a story or show them pictures in a book to
talk about, just to see a spark, and I thought I want the knowledge, I want something more, I
want to be able to give them more. And I got to the point with being a support assistant that it
wasn’t enough, it’s not very good pay, and I was on the maximum which was £8k and you
can’t buy a house or get a mortgage on that.
EDU730: Research Practices and Methods
Week 8:
Qualitative data analysis
Topic goals
To discuss some of the theoretical models within which qualitative data can be analysed and to select the most appropriate model for a particular piece of research.
To understand the stages involved in qualitative data analysis, and gain some experience in coding and developing categories.
To assess how rigour can be maximised in qualitative data analysis.
Task – Forum
Use the excerpts from the interviews and the thematic
analysis phases as presented in this week’s materials in
order to generate ‘codes’ or ‘themes’.
QUALITATIVE DATA ANALYSIS
1.1 INTRODUCTION TO QUALITATIVE DATA ANALYSIS:
Based on the previous weeks and the materials provided, you are probably familiar with the basic differences between qualitative and quantitative research methods, and with the different applications those methods can have in dealing with the research questions posed.
Qualitative research is particularly good at answering the ‘why’, ‘what’ or ‘how’
questions, such as:
“What are the perceptions of carers living with people with learning
disability, as
regards their own health needs?”
“Why do students choose to study for the MSc in Research Methods through the online programme?”
1.2 What do we mean by analysis?
As explored in previous weeks, quantitative research techniques generate a mass of numbers that need to be summarised, described and analysed. The data are explored by using graphs and charts, by doing cross-tabulations, and by calculating means and standard deviations. Further analysis would build on these initial findings, seeking patterns and relationships in the data by performing multiple regression, or perhaps an analysis of variance (Lacey and Luff, 2007).
So it is with qualitative data analysis.
Qualitative Data Analysis (QDA) is the range of processes and
procedures whereby we move from the qualitative data that have
been collected into some form of explanation, understanding or
interpretation of the people and situations we are investigating.
QDA is usually based on an interpretative philosophy. The idea is to
examine the meaningful and symbolic context of qualitative data
(http://onlineqda.hud.ac.uk/Intro_QDA/what_is_qda.php)
Interviews and observational data generate a large amount of text, which needs to be described and summarised.
The questions asked may require the researchers to seek relationships
between various themes that have been identified, or to relate behaviour
or ideas to biographical characteristics of respondents such as age or
gender.
Implications for policy or practice may be derived from the data, or
interpretation sought of puzzling findings from previous studies.
Ultimately theory could be developed and tested using advanced analytical
techniques.
1.3 Approaches in Analysis
a) Deductive approach
– Using your research questions to group the data and then look for
similarities and differences
– Used when time and resources are limited
– Used when qualitative research is a smaller component of a larger
quantitative study
b) Inductive approach
– Used when qualitative research is a major design of the inquiry
– Using an emergent framework to group the data and then look for
relationships
http://onlineqda.hud.ac.uk/Intro_QDA/what_is_qda.php
In summary:
There are no ‘quick fix’ techniques in qualitative analysis (Lacey and Luff, 2007). There are probably as many different ways of analysing qualitative data as there are qualitative researchers doing it! It is argued that qualitative analysis is an interpretive and subjective exercise, and the researcher is intimately involved in the process, not aloof from it (Pope and Mays, 2006).
However, there are some theoretical approaches to choose from, and this week we will explore a basic one. In addition, there are some common processes, no matter which approach you take. Analysis of qualitative data usually goes through some or all of the following stages (though the order may vary):
Familiarisation with the data through review, reading, listening etc.
Transcription of tape-recorded material
Organisation and indexing of data for easy retrieval and identification
Anonymising of sensitive data
Coding (may be called indexing)
Identification of themes
Re-coding
Development of provisional categories
Exploration of relationships between categories
Refinement of themes and categories
Development of theory and incorporation of pre-existing knowledge
Testing of theory against the data
Report writing, including excerpts from original data if appropriate (e.g. quotes from interviews)
Adapted from Lacey and Luff (2009, pp. 6-7)
1.2 What do you want to get out of your data?
It is not always necessary to go through all the stages above, but it is suggested
that some of them are necessary in order to go in-depth in your analysis!
Let’s take an example based on the research question provided above about the
health needs of the carers:
Research question:
“What are the perceptions of carers living with people with learning disability, as
regards their own health needs?”
You may be interested in finding out which community services need to be provided in order for the perceived needs of the carers to be met.
You might also be interested in knowing what kinds of services are needed or are valued by most of the carers.
Maybe several respondents mention that they struggle with depression and
loneliness
In order to explore this, three broad levels of analysis could be pursued, as follows:
One approach is simply to count the number of times a particular word or concept (e.g. loneliness) occurs in a narrative. Such an approach is called content analysis. It is not purely qualitative, since the qualitative data are categorised quantitatively and can then be subjected to statistical analysis (a minimal counting sketch follows after these three approaches).
Another approach is thematic analysis, with which we would want to go deeper than this. All units of data (e.g. sentences or paragraphs) referring to loneliness could be given a particular code, extracted and examined in more detail. Do participants talk of being lonely even when others are present? Are there particular times of day or week when they experience loneliness? In what terms do they express loneliness? Are those who speak of loneliness also those who experience depression? Such questions can lead to themes which could eventually be developed, such as ‘lonely but never alone’.
Finally, for theoretical analysis such as grounded theory, we go further in depth. For example, while analysing the data you may have developed a theory that depression is associated with the perceived loss of a ‘normal’ child/spouse. The disability may be attributed to an accident, or to some failure of medical care, without which the person cared for would still be ‘normal’. You may be able to test this emerging theory against existing theories of loss in the literature, or against further analysis of the data. You may even search for ‘deviant cases’, that is, data which seem to contradict your theory, and seek to modify your theory to take account of this new finding. This process is sometimes known as ‘analytic induction’, and is used to build and test emerging theory.
(Lacey and Luff, 2009, p.8)
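As a minimal illustration of the counting involved in content analysis, the Python sketch below tallies how often the concept terms ‘lonely’ and ‘loneliness’ appear in two transcript fragments. Python is simply one convenient choice; the transcripts, speaker labels and terms are invented placeholders rather than data from this module.

# A minimal content-analysis sketch: counting occurrences of concept words
# (here "lonely"/"loneliness") in invented transcript fragments.
import re
from collections import Counter

transcripts = {
    "carer_1": "Some days I feel so lonely, even when the house is full.",
    "carer_2": "I manage, but the loneliness at night is the hardest part.",
}

concept_terms = ("lonely", "loneliness")

counts = Counter()
for carer, text in transcripts.items():
    words = re.findall(r"[a-z]+", text.lower())  # crude tokenisation
    counts[carer] = sum(words.count(term) for term in concept_terms)

print(counts)  # Counter({'carer_1': 1, 'carer_2': 1})

The resulting frequencies could then be tabulated or analysed statistically, which is what makes content analysis only partly qualitative.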
In the following sections we will explore two approaches for qualitative data
analysis: a) grounded theory approach and b) thematic analysis.
1.4 Grounded Theory
Grounded theory developed out of research by sociologists Glaser and Strauss (1967), who were concerned to outline an inductive method of qualitative research which would allow social theory to be generated systematically from data. As such, theories should be ‘grounded’ in rigorous empirical research, rather than produced in the abstract.
Grounded theory is a methodology; it is a way of thinking about and
conceptualising data. It is an approach to research as a whole and as such
can use a range of different methods.
Grounded Theory analysis is inductive, in that the resulting theory
‘emerges’ from the data through a process of rigorous and structured
analysis.
1.5 Procedure and the Rules of Grounded Theory approach
1) Data Collection and Analysis are Interrelated Processes. In grounded theory,
the analysis begins as soon as the first bit of data is collected.
2) Concepts Are the Basic Units of Analysis. A theorist works with
conceptualizations of data, not the actual data per se. Theories can’t be built with
actual incidents or activities as observed or reported; that is, from “raw data.” The
incidents, events, and happenings are taken as, or analyzed as, potential
indicators of phenomena, which are thereby given conceptual labels. If a
respondent says to the researcher, “Each day I spread my activities over the
morning, resting between shaving and bathing,” then the researcher might label
this phenomenon as “pacing.” As the researcher encounters other incidents, and
when after comparison to the first, they appear to resemble the same
phenomena, then these, too, can be labeled as “pacing.” Only by comparing
incidents and naming like phenomena with the same term can a theorist
accumulate the basic units for theory. In the grounded theory approach such
concepts become more numerous and more abstract as the analysis continues.
3) Categories Must Be Developed and Related. Concepts that pertain to the
same phenomenon may be grouped to form categories. Not all concepts become
categories. Categories are higher in level and more abstract than the concepts
they represent. They are generated through the same analytic process of making
comparisons to highlight similarities and differences that is used to produce lower
level concepts. Categories are the “cornerstones” of a developing theory. They
provide the means by which a theory can be integrated.
4) Sampling in Grounded Theory Proceeds on Theoretical Grounds. Sampling
proceeds not in terms of drawing samples of specific groups of individuals, units
of time, and so on, but in terms of concepts, their properties, dimensions, and
variations.
5) Analysis Makes Use of Constant Comparisons. As an incident is noted, it
should be compared against other incidents for similarities and differences. The
resulting concepts are labeled as such, and over time, they are compared and
grouped as previously described.
6) Patterns and Variations Must Be Accounted For. The data must be examined
for regularity and for an understanding of where that regularity is not apparent.
7) Process Must Be Built Into the Theory. In grounded theory, process has several
meanings. Process analysis can mean breaking a phenomenon down into stages,
phases, or steps. Process may also denote purposeful action/interaction that is
not necessarily progressive, but changes in response to prevailing conditions
8) Writing Theoretical Memos Is an Integral Part of Doing Grounded Theory.
Since the analyst cannot readily keep track of all the categories, properties,
hypotheses, and generative questions that evolve from the analytical process,
there must be a system for doing so. The use of memos constitutes such a system.
Memos are not simply about “ideas.”
(adapted from Corbin and Strauss, 1990, pp.7-10)
1.6 Thematic Analysis approach (Braun and Clarke, 2006, p.79)
Thematic analysis is a method for identifying, analysing, and reporting patterns
(themes) within data. It minimally organises and describes your data set in (rich)
detail. However, it also often goes further than this, and interprets various
aspects of the research topic (Boyatzis, 1998).
Boyatzis (1998) defines the ‘unit of coding’ as the most basic segment, or element, of the raw data or information that can be assessed in a meaningful way regarding the phenomenon (p. xi).
A good thematic code ‘captures the qualitative richness of the phenomenon’ (Boyatzis, 1998, p. 31) and has five elements:
1. A label
2. A definition of when the theme occurs
3. A description of how to know when the theme occurs
4. A description of any qualifications or exclusions to the theme
5. Examples to eliminate possible confusion when looking at the theme
Braun and Clarke (2006 pp 94-95) identify some “potential pitfalls” to be avoided
in qualitative analysis
1. A failure to actually analyse the data
2. Using data collection questions as themes that are reported
3. A weak or unconvincing analysis
4. A mismatch between the data and the analytic claims that are made about it.
1.7 Phases of thematic analysis (inductive and deductive) (Braun and Clarke,
2006)
Phase – Description of the process
1. Development of a priori codes – Determining important theoretical areas that can be used as initial codes to organize the data (Boyatzis, 1998). Use of theory-driven coding that links to the theoretical framework of the study.
2. Familiarization with the data – Transcription of data and field notes, reading and re-reading the data, noting down initial ideas (Braun and Clarke, 2006).
3. Carrying out theory-driven coding – Coding data in a systematic fashion within each interview and the field notes and across the entire data set, collating data relevant to each a priori code (Boyatzis, 1998; Braun and Clarke, 2006).
4. Reviewing and revising codes and carrying out additional data-driven coding – Reviewing and revising theory-driven codes in the context of the data (Boyatzis, 1998). Additional coding is done at this stage, which is not confined by the a priori codes, and inductive (data-driven) codes are assigned to the data (Fereday and Muir-Cochrane, 2006).
5. Searching for themes – Collating codes into potential themes, gathering all data relevant to each potential theme (Braun and Clarke, 2006; Fereday and Muir-Cochrane, 2006).
6. Reviewing themes – Checking whether the themes produced are related to the coded extracts (Level 1) and the entire data set (Level 2), as well as developing the thematic ‘map’ of the analysis (Braun and Clarke, 2006), so as to determine the credibility of the themes (Fereday and Muir-Cochrane, 2006).
7. Producing the report – The final opportunity for analysis, in which vivid, compelling extract examples are selected, a final analysis of the selected extracts is carried out, the analysis is related back to the research questions and the relevant literature, and a scholarly report of the analysis is produced (Braun and Clarke, 2006).
1.8 Example of qualitative data analysis using thematic analysis
Question: “How do you feel about your student accommodation?” (an open question)
Participants: 10 Master’s students living in student accommodation
• You have coded three data segments using the code ‘satisfactory
accommodation’. You have defined ‘satisfactory’ as instances when
students indicate that their accommodation generally meets their needs,
but they report mixed views, balancing positive opinions with critical
comments. You have decided not to include views which are almost
exclusively positive or negative. The data segments you have coded as
‘satisfactory’ are:
‘It’s okay – it’s not my home, my house at home in my country, but I have
the things I need, desk, bed, arm chair, clean and warm, not damp or
anything.’ (Student 3)
‘It could be nicer – the decoration is a bit old, and it can be a little bit noisy
at night sometimes – but overall it’s fine just for students. When I graduate
and get a job, I want to rent a more modern apartment, fashionable with
lots of technology.’ (Student 9)
‘The only thing is it’s a bit small… I can’t invite all my friends to my room to
watch television or chat, so we have to go to the coffee shop, cinema… it’s a
bit expensive always going out. That’s the main problem, but I quite like it,
it’s quite good, I feel quite safe.’ (Student 2)
Is it okay to say ‘3 students reported that their accommodation was satisfactory’? In qualitative studies, we are interested in individuals’ feelings, thoughts, beliefs and unique contributions. It is okay to say that three students reported this about their accommodation.
1.9 Producing the report of the data
Several students suggested their
accommodation, while having some
limitations, was generally satisfactory,
being ‘okay’ (student 2) or ‘fine for
students’ (student 9). Their accommodation
appeared to meet many of their needs, for instance, student 3 commented ‘I
have the things I need, a desk, bed, arm chair, clean and warm, not damp or
anything’, while student 2 reported she ‘feels quite safe’. However, they also
noted some limitations, for example, about the limited space: ‘it’s a bit small… I
can’t invite all my friends to my room’ (student 2), and the décor: ‘it could be
nicer – the decoration is a bit old’ (student 9). Nonetheless, the students seemed
to be quite accepting of these limitations – notably, student 2 still said ‘I quite
like it, it’s quite good’ even though she found it quite expensive going out to see
friends because her room was too small to invite them over.
There was also some suggestion that the students tended to think of their
accommodation as temporary; student 3 is clear ‘it is not my home, my house’,
while student 9 is already planning to rent a more modern apartment which
suits his tastes better on graduating. This might be considered to have made
them more accepting of their accommodation’s limitations, as long as their
accommodation generally meets their main needs as students.
Summary:
The words in bold and underlined font indicate how we suggest possible conclusions from the data, as in qualitative research we talk about interpretations and about how ‘reality’ is constructed from other people’s points of view.
Therefore, we tend not to say, for example, ‘students are not satisfied’; we prefer to report ‘students seem not to be satisfied’.
Task – Forum
Using this week’s notes, please download and read the document below, “Task for qualitative data analysis_Excerpts interviews of trainee teachers”. Use the excerpts from the interviews and the thematic analysis phases as presented in this week’s materials in order to generate ‘codes’ or ‘themes’.
Create at least two (2) themes. Produce a brief report (phase 7) (maximum word count: 500 words) to present one of the themes.
Further reading:
Aronson, J. (1995). A pragmatic view of thematic analysis. The qualitative report, 2(1), 1-
3.
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative research in
psychology, 3(2), 77-101.
Boyce, C. and Neale, P., 2006. Conducting in-depth interviews: A guide for designing and
conducting in-depth interviews for evaluation input.
Charmaz, K. (2011). Grounded theory methods in social justice research. The Sage
handbook of qualitative research, 4, 359-380.
Corbin, J. M., & Strauss, A. (1990). Grounded theory research: Procedures, canons, and
evaluative criteria. Qualitative sociology, 13(1), 3-21.
Doody, O., & Noonan, M. (2013). Preparing and conducting interviews to collect
data. Nurse researcher, 20(5), 28-32.
Fereday, J. and Muir-Cochrane, E., (2006). Demonstrating rigour using thematic analysis:
A hybrid approach of inductive and deductive coding and theme
development. International journal of qualitative methods, 5(1), pp.80-92.
Jacob, S. A., & Furgerson, S. P. (2012). Writing interview protocols and conducting
interviews: Tips for students new to the field of qualitative research. The Qualitative
Report, 17(42), 1-10.
Lacey, A., & Luff, D. (2001). Qualitative data analysis (pp. 320-357). Sheffield: Trent
Focus.
Smith, J., & Firth, J. (2011). Qualitative data analysis: the framework approach. Nurse
researcher, 18(2), 52-62.
Smithson, J. (2000). Using and analysing focus groups: limitations and
possibilities. International journal of social research methodology, 3(2), 103-119.
Strauss, A., & Corbin, J. (1994). Grounded theory methodology. Handbook of qualitative
research, 17, 273-85.
References:
Boyatzis, R. E. (1998). Transforming qualitative information: Thematic analysis and code development. Sage.
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative
research in psychology, 3(2), 77-101.
Corbin, J. M., & Strauss, A. (1990). Grounded theory research: Procedures, canons, and
evaluative criteria. Qualitative sociology, 13(1), 3-21.
Fereday, J. and Muir-Cochrane, E., (2006). Demonstrating rigour using thematic analysis:
A hybrid approach of inductive and deductive coding and theme
development. International journal of qualitative methods, 5(1), pp.80-92.
Glaser, B., & Strauss, A. (1967). The discovery of grounded theory. Weidenfield &
Nicolson, London, 1-19.
Lacey A. and Luff D. (2009) Qualitative Research Analysis. The NIHR RDS for the East
Midlands / Yorkshire & the Humber.
EDU730: Research Practices and Methods
Week 9:
Quantitative Data Analysis
Topic goals
To gain an understanding of quantitative analysis
To become familiar with the statistical tests for quantitative research
To understand the stages involved in quantitative data analysis
Task – Forum
Based on the given research problem, provide a research
question that can address two or more variables, using
quantitative terms, defining the variables you will
use.
Discuss which statistical test you would use to answer
your research question and explain the rationale behind
your choice.
QUANTITATIVE DATA ANALYSIS
1. Introduction
The main purpose of analysing data is to gain useful and valuable information. Data analysis is used to describe data, and to compare and find relationships or differences between variables. The researcher uses techniques to convert the data into numerical forms.
1.1. Prepare your data
As a researcher you have to be sure that your data are correct, e.g. that respondents answered all of the questions, that your transcriptions are accurate, etc. You have to identify your missing data and then convert the responses into a numerical form, e.g. red = 1, yellow = 2, green = 3 (a minimal sketch of this step follows below).
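As an illustration of this preparation step, the sketch below checks for missing answers and recodes a categorical response numerically. The column name "favourite_colour", the coding scheme and the use of Python with pandas are all assumptions made for the example; the module does not prescribe a particular package.

# A minimal data-preparation sketch: checking for missing answers and
# converting a categorical response into numerical codes (red=1, yellow=2, green=3).
import pandas as pd

responses = pd.DataFrame({
    "respondent": [1, 2, 3, 4],
    "favourite_colour": ["red", "yellow", None, "green"],
})

# Identify missing data before recoding
print(responses["favourite_colour"].isna().sum())  # 1 missing answer

# Convert the named categories into numerical form
coding = {"red": 1, "yellow": 2, "green": 3}
responses["colour_code"] = responses["favourite_colour"].map(coding)
print(responses)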
1.2. Scales of measurements
Before analyzing quantitative data, researchers must identify the level of measurement associated with the data. The type of analysis that you can use on a set of data depends on its scale of measurement. The scales of measurement are nominal, ordinal, interval and ratio.
Nominal data
Data has no logical order and can be classified into non-numerical or named categories. It is basic classification data. The values we assign simply replace the names and cannot be ordered, e.g. male, female, district A, district B.
Example: Male or Female
There is no order associated with male or female
Ordinal data
Data has a logical order, but the differences between values are not constant. These data are typically used for questions referring to ratings of quality or agreement, such as good, fair, bad, or strongly agree, agree, disagree, strongly disagree.
Example: 1st, 2nd, 3rd
Example: T-shirt size (small, medium, large)
Interval data:
Data is continuous and has a logical order, with standardized differences between values, but no natural zero.
Example: Fahrenheit degrees
* Remember that ratios are meaningless for interval data. You cannot say, for
example, that one day is twice as hot as another day.
Ratio data
Data is continuous, ordered, has standardized differences between values, and a
natural zero
Example: height, weight, age, length
Having an absolute zero allows you to meaningfully argue that one measure is twice as long as another.
For example – 10 km is twice as long as 5 km
Remember that there are several ways of approaching a research question and how
the researcher puts together a research question will determine the type of
methodology, data collection method, statistics, analysis and presentation that will
be used to approach the research problem.
For each type of data you have to use different analysis techniques. When using a
quantitative methodology, you are normally testing a theory through the testing of
a hypothesis.
1.3. Hypothesis/Null hypothesis:
A hypothesis is a logical assumption, a reasonable guess, or a suggested answer to
a research problem.
A null hypothesis states that minor differences between the variables can occur
because of chance errors, and are therefore not significant.
*Chance error is defined as the difference between the predicted value of a
variable (by the statistical model in question) and the actual value of the variable.
In statistical hypothesis testing, a type I error is the incorrect rejection of a true null
hypothesis (a “false positive”), while a type II error is incorrectly retaining a false
null hypothesis (a “false negative”). Simply, a type I error is detecting an effect (e.g.
a relationship between two variables) that is not present, while a type II error is
failing to detect an effect that is present.
1.4. Randomised, controlled and double-blind trial
Randomised – participants are chosen at random.
Controlled – there is a control group as well as an experimental
group.
Double-blind – neither the subjects nor the researchers know who is in which
group.
Variables:
An experiment has three characteristics:
1. A manipulated independent variable (often denoted by x), whose variation does not depend on that of another variable.
2. Control of other (extraneous) variables.
3. The observed effect of the independent variable on the dependent variable (often denoted by y), whose value depends on that of the independent variable.
1.5. Validity, reliability and generalizability
Validity: refers to whether the researcher measures what he/she wants to
measure. The three types of validity are:
Content validity – refers to whether or not the content of the items is right to measure the concept.
Criterion validity – refers to whether scores on the instrument relate to other established measures (criteria) of the same or a related concept.
Construct validity – refers to whether the instrument is designed so that it captures the several factors that make up the construct, rather than just one.
(Muijs, 2010)
Reliability: “refers to the extent to which test scores are free of measurement
error” (Muijs, 2010, pg.82). The two types of reliability are:
Repeated measures or test-retest reliability – refers to whether the instrument can be trusted to give similar results if used again at a later time with the same respondents.
Internal consistency – refers to whether all the items are measuring the same construct.
Generalizability: refers to the extent to which your findings can be generalized from your sample to the population.
2. Descriptive statistics
Descriptive statistics summarise data. They are used to describe variables and the basic features of the data that have been collected in a study. They provide simple summaries about the sample, including measures of central tendency and spread (e.g. mean, median, standard deviation). Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.
It should be noted that with descriptive statistics no conclusions can be extended beyond the immediate group from which the data were gathered.
Some popular summary statistics for interval variables
Mean: the arithmetic average of the values, calculated by adding all the values and dividing by the total number of values.
Median: the data point that is in the middle of the ‘low’ and ‘high’ values, after they are put in numerical order.
Mode: the most commonly occurring score in a data set.
Range: the difference between the highest score and the lowest score.
Standard deviation: “The standard deviation exists for all interval variables. It is the average distance of each value away from the sample mean. The larger the standard deviation, the farther away the values are from the mean; the smaller the standard deviation, the closer the values are to the mean” (Patel, 2009, pg. 5).
Minimum and maximum value: the smallest and largest scores in the data set.
Frequency: the number of times a certain value appears.
Quartiles: the same idea as the median, applied to quarter (1/4) intervals of the ordered data.
(Adapted from Patel, 2009, pg. 6)
3. Data distribution
Before beginning the statistical tests, it is necessary to check the distribution of
your data. The main types of distribution are normal and non-normal.
Example
Case no – Grade
1 – 90
2 – 67
3 – 85
4 – 90
5 – 100
6 – 58
7 – 90
Total: 580
Mean: 82.86 (580 ÷ 7)
Median: 90
Mode: 90
Minimum value: 58
Maximum value: 100
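The summary statistics for this example can be checked with a few lines of code. The sketch below uses Python’s standard library purely as a convenient calculator; the module does not prescribe a particular tool.

# A minimal check of the summary statistics for the grades example above,
# using only Python's standard library.
import statistics

grades = [90, 67, 85, 90, 100, 58, 90]

print(sum(grades))                         # total: 580
print(round(statistics.mean(grades), 2))   # mean: 82.86
print(statistics.median(grades))           # median: 90
print(statistics.mode(grades))             # mode: 90
print(min(grades), max(grades))            # minimum: 58, maximum: 100
print(max(grades) - min(grades))           # range: 42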
3.1. The Normal distribution
When the data tend to cluster around a central value with no bias to the left or right, the distribution gets close to a “normal distribution”.
The graph of the normal distribution depends on two factors: the mean (M) and the standard deviation (SD). The basic characteristics of a normal curve are: a) it is a bell-shaped curve; b) it is perfectly symmetrical; c) the mode, median and mean lie in the middle of the curve (50% of the values lie to the left of the mean, and 50% lie to the right); and d) approximately 95% of the values are found within two standard deviations of the mean (in both directions) (Patel, 2009). The location of the centre of the graph is determined by the mean of the distribution, and the height and width of the graph are determined by the standard deviation. When the standard deviation is large, the curve is short and wide; when the standard deviation is small, the curve is tall and narrow. Normal distribution graphs look like a symmetric, bell-shaped curve. When measuring things like people’s height, weight, salary, opinions or votes, the graph of the results is very often a normal curve (Langley and Perrie, 2014).
3.2. Non-Normal Distributions:
There are several ways in which a distribution can be non-normal.
4. Statistical Analysis
Statistical tests are used to make inferences about data, and can tell us whether an observed pattern is likely to be real. There is a wide range of statistical tests, and the decision about which of them to use depends on your research design. If your data are normally distributed you should choose a parametric test; otherwise you should choose a non-parametric test (a minimal sketch of checking this follows below).
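One common way to check the normality assumption is a distribution test such as the Shapiro-Wilk test, alongside a histogram. The sketch below uses Python with SciPy and invented scores; both the package choice and the data are assumptions for illustration, and with very small samples such tests have little power (see Reason 2 under nonparametric tests below).

# A minimal sketch of checking whether data look normally distributed
# before choosing between a parametric and a non-parametric test.
from scipy import stats

scores = [52, 61, 58, 70, 66, 73, 64, 59, 68, 75, 62, 71]  # invented data

statistic, p_value = stats.shapiro(scores)
if p_value > 0.05:
    print("No evidence against normality; a parametric test may be reasonable.")
else:
    print("Data look non-normal; consider a non-parametric test.")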
4.1. Parametric and Nonparametric Tests
A parametric statistical test makes assumptions about the parameters (defining
properties) of the population distribution(s) from which one’s data are drawn,
whereas a non-parametric test makes no such assumptions. Nonparametric tests
are also called distribution-free tests because they do not assume that your data
follow a specific distribution (Frost, 2015).
Parametric tests (means) – Nonparametric tests (medians)
1-sample t test – 1-sample Sign, 1-sample Wilcoxon
2-sample t test – Mann-Whitney test
One-Way ANOVA – Kruskal-Wallis, Mood’s median test
Factorial DOE with one factor and one blocking variable – Friedman test
It is argued that nonparametric tests should be used when the data do not meet
the assumptions of the parametric test, particularly the assumption about normally
distributed data. However, there are additional considerations when deciding
whether a parametric or nonparametric test should be used.
4.2. Reasons to Use Parametric Tests
Reason 1: Parametric tests can perform well with skewed and non-normal
distributions
Parametric tests can perform well with continuous data that are not normally
distributed if the sample size guidelines demonstrated in the table below are
satisfied.
Parametric analyses – Sample size guidelines for non-normal data
1-sample t test – Greater than 20
2-sample t test – Each group should be greater than 15
One-Way ANOVA – For 2-9 groups, each group should be greater than 15; for 10-12 groups, each group should be greater than 20.
Note: These guidelines are based on simulation studies conducted by statisticians at
Minitab.
Reason 2: Parametric tests can perform well when the spread of each group
is different
While nonparametric tests do not assume that your data are normally distributed,
they do have other assumptions that can be hard to satisfy. For example, when
using nonparametric tests that compare groups, a common assumption is that the
data for all groups have the same spread (dispersion). If the groups have a different
spread, then the results from nonparametric tests might be invalid.
Reason 3: Statistical power
Parametric tests usually have more statistical power compared to nonparametric
tests. Hence, they are more likely to detect a significant effect when one truly
exists.
http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/power-and-sample-size/what-is-power/
4.3. Reasons to Use Nonparametric Tests
Reason 1: Your area of study is better represented by the median
The fact that a parametric test can be performed with non-normal data does not imply that the mean is the best measure of central tendency for your data. For example, the centre of a skewed distribution (e.g. income) can be better measured by the median, where 50% of values are above the median and 50% are below. However, if you add a few billionaires to a sample, the mathematical mean increases greatly, although the income of the typical person does not change.
When the distribution is skewed enough, the mean is strongly influenced by
changes far out in the distribution’s tail, whereas the median continues to more
closely represent the center of the distribution.
Reason 2: You have a very small sample size
If the data are not normally distributed and do not meet the sample size guidelines for the parametric tests, then a nonparametric test should be used. In addition, when you have a very small sample, it might be difficult to ascertain the distribution of your data, as distribution tests will lack sufficient power to provide meaningful results.
http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/summary-statistics/measures-of-central-tendency/
Reason 3: You have ordinal data, ranked data, or outliers that you cannot
remove
Typical parametric tests can only assess continuous data, and their results can be seriously affected by outliers. Conversely, some nonparametric tests can handle ordinal and ranked data without being significantly affected by outliers.
4.4. Statistical tests
One-tailed test: a test of a statistical hypothesis in which the region of rejection is on only one side of the sampling distribution. For example, suppose the null hypothesis states that the mean is less than or equal to 10; the alternative hypothesis would then be that the mean is greater than 10.
Two-tailed test: when using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions. For example, we may wish to compare the mean of a sample to a given value x using a t-test; our null hypothesis is that the mean is equal to x.
Alpha level (p value): In statistical analysis the researcher examines whether there is any significance in the results. The p value is equal to the probability of obtaining the observed difference, or one more extreme, if the null hypothesis is true.
The acceptance or rejection of a hypothesis is based upon a level of significance – the alpha (α) level.
This is typically set at the 5% (0.05) α level, followed in popularity by the 1% (0.01) α level.
These are usually designated as p, i.e. p = 0.05 or p = 0.01.
So, what do we mean by the levels of significance that the p value can give us?
The p value is concerned with confidence levels. This states the threshold at which
you are prepared to accept the possibility of a Type I Error – otherwise known as a
false positive – rejecting a null hypothesis that is actually true.
The question that significance levels answer is ‘How confident can the researcher
be that the results have not arisen by chance?’
Note: The confidence levels are expressed as a percentage.
So if we had a result of:
p = 1.00, then there would be a 100% possibility that the results occurred by chance.
p = 0.50, then there would be a 50% possibility that the results occurred by chance.
p = 0.05, then we are 95% certain that the results did not arise by chance.
p = 0.01, then we are 99% certain that the results did not arise by chance.
Clearly, we want our results to be as accurate as possible, so we set our significance levels as low as possible – usually at 5% (p = 0.05), or better still, at 1% (p = 0.01). Anything above these figures is not considered accurate enough; in other words, the results are not significant.
Now, you may be thinking that if an effect could not have arisen by chance 90 times
out of 100 (p = 0.1), then that is pretty significant.
However, what we are determining with our levels of significance is ‘statistical significance’, hence we are much stricter with that, so we would usually not accept values greater than p = 0.05.
So when looking at the statistics in a research paper, it is important to check the ‘p’
values to find out whether the results are statistically significant or not.
(Burns & Grove, 2005)
p-value – Outcome of test – Statement
greater than 0.05 – Fail to reject H0 – No evidence to reject H0
between 0.01 and 0.05 – Reject H0 (accept H1) – Some evidence to reject H0 (therefore accept H1)
between 0.001 and 0.01 – Reject H0 (accept H1) – Strong evidence to reject H0 (therefore accept H1)
less than 0.001 – Reject H0 (accept H1) – Very strong evidence to reject H0 (therefore accept H1)
ANOVA (Analysis of Variance)
ANOVA is one of a number of tests (along with ANCOVA – analysis of covariance – and MANOVA – multivariate analysis of variance) that are used to describe/compare a number of groups. ANOVA is used to determine whether the difference in means (averages) across two or more groups is statistically significant.
T-test
The t-test is used to assess whether the means of two groups differ statistically
from each other.
The Mann-Whitney U-test is used to test for differences between two independent groups on a continuous measure, e.g. do males and females differ in terms of their levels of anxiety?
This test requires one grouping variable with two categories (e.g. male/female gender) and one continuous variable (e.g. anxiety level). Basically, the Mann-Whitney U-test converts the scores on the continuous variable to ranks across the two groups, and then evaluates whether the medians of the two groups differ significantly.
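A minimal sketch of both tests on the same two groups is shown below, using Python with SciPy. The anxiety-style scores and group labels are invented for illustration, and the package choice is an assumption rather than something prescribed by the module.

# Comparing two independent groups with a parametric and a non-parametric test.
from scipy import stats

group_a = [21, 25, 19, 30, 27, 24, 22, 28]  # invented scores for group A
group_b = [18, 20, 17, 23, 19, 21, 16, 22]  # invented scores for group B

t_stat, t_p = stats.ttest_ind(group_a, group_b)     # independent t-test (compares means)
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)  # Mann-Whitney U (based on ranks)

print(f"t-test: t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.3f}")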
Wilcoxon signed-rank test
The Wilcoxon signed-rank test (also known as Wilcoxon matched-pairs test) is the
most common nonparametric test for the two-sampled repeated measures design
of research study.
Kruskal-Wallis test
The Kruskal-Wallis test is used to compare the means amongst more than two
samples, when either the data are ordinal or the distribution is not normal. When
there are only two groups, then it is the equivalent of the Mann-Whitney U-test.
This test is typically used to determine the significance of difference among three or
more groups.
Correlations
These tests are used to justify the nature of the relationship between two
variables, and this relation statistically, is referred to as a linear trend. This
relationship between variables usually presented on scatter plots. A correlation
does not explain causation and it does not mean that one variable is the cause of
the other.
This and other possibilities are listed below:
Variable 1 – Variable 2 – Type of correlation
Math Score ↑ – Science Score ↑ – Positive; as Math Score improves, Science Score improves
Math Score ↓ – Science Score ↓ – Positive; as Math Score declines, Science Score declines
Math Score ↑ – Science Score ↓ – Negative; as Math Score improves, Science Score declines
Math Score ↓ – Science Score ↑ – Negative; as Math Score declines, Science Score improves
[Figure: scatter plots illustrating these relationships, e.g. a perfect positive correlation.]
Pearson’s correlation
It is used to test the correlation between two continuous variables. The value for Pearson’s correlation lies between -1.00 (perfect negative correlation) and +1.00 (perfect positive correlation), with 0.00 indicating no correlation.
Spearman rank correlation test
The Spearman rank correlation test is used to demonstrate the association
between two ranked variables (X and Y), which are not normally distributed. It is
frequently used to compare the scores of a group of subjects on two measures (i.e.
a coefficient correlation based on ranks).
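A minimal sketch of both correlation tests, using Python with SciPy and invented maths and science scores (the data, variable names and package are assumptions for illustration):

# Pearson and Spearman correlations between two measures.
from scipy import stats

maths = [55, 60, 65, 70, 75, 80, 85, 90]      # invented scores
science = [52, 58, 68, 69, 72, 84, 83, 91]    # invented scores

r, r_p = stats.pearsonr(maths, science)       # parametric, linear association
rho, rho_p = stats.spearmanr(maths, science)  # non-parametric, based on ranks

print(f"Pearson r = {r:.2f} (p = {r_p:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {rho_p:.3f})")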
Chi-square test
There are two different types of chi-square tests – but both involve categorical data.
One type of chi-square test compares the frequency count of what is expected in
theory against what is actually observed.
The second type of chi-square test is known as a chi-square test with two variables
or the chi-square test for independence.
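A minimal sketch of the second type, the chi-square test for independence, is given below. The 2x2 table of counts (gender by programme choice) is invented, and the use of Python with SciPy is an assumption for illustration:

# Chi-square test of independence between two categorical variables.
from scipy import stats

#            programme A  programme B
observed = [[30,          20],   # male
            [25,          35]]   # female

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")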
Regression
It is an extension of correlation and is used to determine whether one variable is a predictor of another variable. Regression is used to determine how strong the relationship is between your intervention and your outcome variables.
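A minimal simple-regression sketch is shown below, asking whether hours of revision predict exam score. Both variables are invented, and Python with SciPy is used only as a convenient illustration:

# Simple linear regression: does hours of revision predict exam score?
from scipy import stats

hours = [2, 4, 5, 7, 8, 10, 12, 14]        # invented predictor values
scores = [48, 55, 57, 63, 66, 72, 78, 85]  # invented outcome values

result = stats.linregress(hours, scores)
print(f"slope = {result.slope:.2f}, intercept = {result.intercept:.2f}")
print(f"R-squared = {result.rvalue**2:.2f}, p = {result.pvalue:.3f}")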
Table for common statistical tests
Type of test – Use – Parametric/Non-parametric

Correlation – these tests assess the nature of the relationship between two variables
Pearson’s correlation – Tests for the strength of the association between two continuous variables – Parametric
Spearman rank correlation test – Tests for the strength of the association between two ordinal, ranked variables (X and Y) – Non-parametric
Chi-square test – Tests for the strength of the association between two categorical variables – Non-parametric

Comparison of means – look for the difference between the means of variables
Paired t-test – Tests for difference between two related variables – Parametric
Independent t-test – Tests for difference between two independent variables – Parametric
ANOVA – Tests if the difference in means (averages) across two or more groups is statistically significant; used to describe/compare a number of groups – Parametric

Regression – assess if change in one variable predicts change in another variable
Simple regression – Tests how change in the predictor variable predicts the level of change in the outcome variable – Parametric
Multiple regression – Tests how change in the combination of two or more predictor variables predicts the level of change in the outcome variable – Parametric

Other non-parametric tests
Mann-Whitney U-test – Tests for differences between two independent groups on a continuous measure – Non-parametric
Wilcoxon rank-sum test – Tests for difference between two independent variables; takes into account the magnitude and direction of difference – Non-parametric
Wilcoxon signed-rank test – Tests for difference between two-sampled repeated measures; takes into account the magnitude and direction of difference – Non-parametric
Kruskal-Wallis test – Tests for differences among more than two samples (used when data are ordinal or not normally distributed) – Non-parametric
5. Power of the study
There is increasing criticism about the lack of statistical power of published
research in sports and exercise science and psychology. Statistical power is defined
as the probability of rejecting the null hypothesis; that is, the probability that the
study will lead to significant results. If the null hypothesis is false but not rejected, a
type 2 error occurs. Cohen suggested that a power of 0.80 is satisfactory when an
alpha is set at 0.05—that is, the risk of type 1 error (i.e. rejection of the null
hypothesis when it is true) is 0.05. This means that the risk of a type 2 error is 0.20.
The magnitude of the relation or treatment effect (known as the effect size) is a
factor that must receive a lot of attention when considering the statistical power of
a study. When calculated in advance, this can be used as an indicator of the degree
to which the researcher believes the null hypothesis to be false. Each statistical test
has an effect size index that ranges from zero upwards and is scale free. For
instance, the effect size index for a correlation test is r; where no conversion is
required. For assessing the difference between two sample means, Cohen’s d, Hedges’ g, or Glass’s Δ can be used. These divide the difference between two means by a standard deviation. Formulae are available for converting other statistical test results (e.g. t-test, one-way analysis of variance, and χ2 results) into effect size indexes (see Rosenthal, 1991).
Effect sizes are typically described as small, medium, and large. Correlation effect sizes of 0.1, 0.3, and 0.5, and Cohen’s d values of 0.2, 0.5, and 0.8, equate to small, medium, and large effects respectively. It is important to note that the power of a study is linked to the sample size: the smaller the expected effect size, the larger the sample size required to have sufficient power to detect that effect size.
For example, a study that assesses the effects of habitual physical activity on body fat in children might have a medium effect size (e.g. see Rowlands et al., 1999). In this study, there was a moderate correlation between habitual physical activity and body fat, with a medium effect size. A large effect size may be anticipated in a study that assesses the effects of a very low energy diet on body fat in overweight women (e.g. see Eston et al., 1995). In Eston et al.’s study, a significant reduction in total energy intake resulted in a substantial decrease in total body mass and the percentage of body fat.
The effect size should be estimated during the design stage of a study, as this will allow the researcher to determine the sample size required to give adequate power for a given alpha (i.e. p value). The study can then be designed to ensure that there is sufficient power to detect the effect of interest, thus minimising the possibility of a type 2 error.
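As a minimal illustration of an effect size index, the sketch below calculates Cohen’s d for the difference between two sample means by dividing it by the pooled standard deviation. The data are invented, and the use of Python’s standard library is simply a convenient choice:

# Cohen's d for the difference between two sample means (pooled SD in the denominator).
import statistics

group_a = [21, 25, 19, 30, 27, 24, 22, 28]  # invented scores
group_b = [18, 20, 17, 23, 19, 21, 16, 22]  # invented scores

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
sd_a, sd_b = statistics.stdev(group_a), statistics.stdev(group_b)
n_a, n_b = len(group_a), len(group_b)

pooled_sd = (((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)) ** 0.5
d = (mean_a - mean_b) / pooled_sd
print(f"Cohen's d = {d:.2f}")  # here d is roughly 1.6, a large effect (0.2 small, 0.5 medium, 0.8 large)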
Table 3.
Small, medium and large effect sizes as defined by Cohen
When empirical data are available, they can be used to assess the effect size for a
study. However, for some research questions it is difficult to find enough
information (e.g. there is limited empirical information on the topic or insufficient
detail provided in the results of the relevant studies) to estimate the expected
effect size. In order to compare effect sizes of studies that differ in sample size, it is
recommended that, in addition to reporting the test statistic and p value, the
appropriate effect size index is also reported.
6. Data presentation
A set of data on its own is very hard to interpret. There is a lot of information
contained in the data, but it is hard to see. Eye-balling your data using graphs and
exploratory data analysis is necessary for understanding important features of the
data, detecting outliers, and data which has been recorded incorrectly. Outliers are
extreme observations which are inconsistent with the rest of the data. The
presence of outliers can significantly distort some of the more formal statistical
techniques, and hence there is a high need for preliminary detection and correction
or accommodation of such observations, before further analysis takes place.
Usually, a straight line fits the data well. However, the outlier “pulls” the line in the
direction of the outlier, as demonstrated in the lower graph in Figure 2. When the
line is dragged towards the outlier, the rest of the points then fall farther from the
line that they would otherwise fall on or close to. In this case the “fit” is reduced;
thus, the correlation is weaker. Outliers typically occur from an error including a
mismarked answer paper, a mistake in entering a score in a database, a subject who
EDU730: Research
Practices and Methods
Page 22 EDU730: Research Practices and Methods
misunderstood the directions etc. The researcher should always seek to understand
the cause of an outlying score. If the cause is not legitimate, the researcher should
eliminate the outlying score from the analysis to avoid distorts in the
analysis.
Figure 1. A demonstration of how outliers can identified using graphs
EDU730: Research
Practices and Methods
Page 23 EDU730: Research Practices and Methods
Figure 2. The two graphs above demonstrate Data where no outliers are observed
(top graph) and Data where an Outlier is observed (bottom graph).
6.1. Charts for quantitative data
There are different types of charts that can be used to present quantitative data.
Dot plots are one of the simplest ways of displaying all the data. Each dot
represents an individual and is plotted along a vertical axis. Data for several groups
can be plotted alongside each other for comparison (Freeman& Julious, 2005).
Scatter plots: it is a type of diagram that typically presents the values of tow
variables. The data are displayed as a collection of points. Each point position
depends of the horizontal and vertical axis.
EDU730: Research
Practices and Methods
Page 24 EDU730: Research Practices and Methods
7. Quantitative Software for Data Analysis
Quantitative studies often result in large numerical data sets that would be difficult
to analyse without the help of computer software packages. Programs such as
EXCEL are available to most researchers and are relatively straight-forward. These
programs can be very useful for descriptive statistics and less complicated analyses.
However, sometimes the data require more sophisticated software. There are a
number of excellent statistical software packages including:
SPSS – The Statistical Package for Social Science (SPSS) is one of the most popular
software in social science research. SPSS is comprehensive and compatible with
almost any type of data and can be used to run both descriptive statistics and other
more complicated analyses, as well as to generate reports, graphs, plots and trend
lines based on data analyses.
STATA – This is an interactive program that can be used for both simple and
complex analyses. It can also generate charts, graphs and plots of data and results.
This program seems a bit more complicated than other programs as it uses four
different windows including the command window, the review window, the result
window and the variable window.
SAS – The Statistical Analysis System (SAS) is another very good statistical software
package that can be useful with very large data sets. It has additional capabilities
that make it very popular in the business world because it can address issues such
as business forecasting, quality improvement, planning, and so forth. However,
some knowledge of programming language is necessary to use the software,
making it a less appealing option for some researchers.
R programming – R is an open source programming language and software
environment for statistical computing and graphics that is supported by the R
Foundation for Statistical Computing. The R language is commonly used
among statisticians and data miners for developing statistical software and data
analysis.
(Blaikie, 2003)
https://en.wikipedia.org/wiki/Open_source
https://en.wikipedia.org/wiki/Programming_language
https://en.wikipedia.org/wiki/Statistical_computing
https://en.wikipedia.org/wiki/Statistician
https://en.wikipedia.org/wiki/Data_mining
https://en.wikipedia.org/wiki/Statistical_software
https://en.wikipedia.org/wiki/Data_analysis
https://en.wikipedia.org/wiki/Data_analysis
EDU730: Research
Practices and Methods
Page 25 EDU730: Research Practices and Methods
8. Statistical Symbols:
α: significance level (type I error).
b or b0: y intercept.
b1: slope of a line (used in regression).
β: probability of a Type II error.
1-β: statistical power.
BD or BPD: binomial distribution.
CI: confidence interval.
CLT: Central Limit Theorem.
d: difference between paired data.
df: degrees of freedom.
DPD: discrete probability distribution.
E = margin of error.
f = frequency (i.e. how often
something happens).
f/n = relative frequency.
HT = hypothesis test.
Ho = null hypothesis.
H1 or Ha: alternative hypothesis.
IQR = interquartile range.
m = slope of a line.
M: median.
n: sample size or number of trials in
a binomial experiment.
σ : standard error of the
proportion.
p: p-value, or probability of success in
a binomial experiment, or population
proportion.
ρ: correlation coefficient for a
population.
: sample proportion.
P(A): probability of event A.
P(AC) or P(not A): the probability that A
doesn’t ha en.
P(B|A): the probability that event B
occurs, given that event A occurs.
Pk: kth percentile. For example, P90 =
90th percentile.q: probability of failure in
a binomial or geometric distribution.
Q1: first quartile.
Q3: third quartile.
r: correlation coefficient of a sample.
R²: coefficient of determination.
s: standard deviation of a sample.
s.d or SD: standard deviation.
SEM: standard error of the mean.
SEP: standard error of the proportion.
http://www.statisticshowto.com/what-is-an-alpha-level/
http://www.statisticshowto.com/type-i-and-type-ii-errors-definition-examples/
http://cs.selu.edu/~rbyrd/math/intercept/
http://www.statisticshowto.com/regression/
http://www.statisticshowto.com/type-i-and-type-ii-errors-definition-examples/
http://www.statisticshowto.com/statistical-power/
http://www.statisticshowto.com/binomial-distribution-article-index/
http://www.statisticshowto.com/how-to-find-a-confidence-interval/
http://www.statisticshowto.com/central-limit-theorem-examples/
http://www.statisticshowto.com/degrees-of-freedom/
http://www.statisticshowto.com/discrete-probability-distribution/
http://www.statisticshowto.com/how-to-calculate-margin-of-error/#WhatMofE
http://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/
http://www.statisticshowto.com/what-is-the-null-hypothesis/
http://www.statisticshowto.com/what-is-an-alternate-hypothesis/
http://www.statisticshowto.com/probability-and-statistics/interquartile-range/
http://www.statisticshowto.com/median
http://www.statisticshowto.com/find-sample-size-statistics/
http://www.statisticshowto.com/how-to-determine-if-something-is-a-binomial-experiment/
http://www.statisticshowto.com/p-value/
http://www.statisticshowto.com/how-to-determine-if-something-is-a-binomial-experiment/
http://www.statisticshowto.com/population-proportion/
http://www.statisticshowto.com/population-proportion/
http://www.statisticshowto.com/how-to-compute-pearsons-correlation-coefficients/
http://www.statisticshowto.com/probability-and-statistics/probability-main-index/
http://www.statisticshowto.com/percentiles/
http://www.statisticshowto.com/geometric-distribution/
http://www.statisticshowto.com/what-are-quartiles/
http://www.statisticshowto.com/what-are-quartiles/
http://www.statisticshowto.com/how-to-compute-pearsons-correlation-coefficients/
http://www.statisticshowto.com/what-is-a-coefficient-of-determination/
http://www.statisticshowto.com/what-is-standard-deviation/
http://www.statisticshowto.com/sample/
http://www.statisticshowto.com/what-is-standard-deviation/
http://www.statisticshowto.com/calculate-standard-error-sample-mean/
EDU730: Research
Practices and Methods
Page 26 EDU730: Research Practices and Methods
N: population size.
ND: normal distribution.
σ: standard deviation.
σ : standard error of the mean.
t: t-score.
μ mean.
ν: degrees of freedom.
X: a variable.
χ
2
: chi-square.
x: one data value.
: mean of a sample.
z: z-score.
Accessed: http://www.statisticshowto.com/statistics-symbols/
9. Task – Forum
Read carefully the following research problem:
“Research studies suggest that teachers’ attitudes towards the inclusion
of students with disabilities are influenced by a number of interrelated
factors. For example, some earlier studies indicate that the nature of
disability and the associated educational problems presented influence
teachers’ attitudes. These are termed as ‘child-related’ variables. Other
studies suggest demographic and other personality factors which can be
classified as ‘teacher-related’ factors. Finally, the specific context is
found to be another influencing factor and can be termed as
‘educational environment-related’ (Avramidis & Norwich, 2002).
Based on this research problem, please provide a research question that
can address two or more variables. Bear in mind that the research
question needs to use quantitative terms, defining the variables you will
use.
Finally, discuss which statistical test you would use to answer your
research question and explain the rationale behind your choice.
http://www.statisticshowto.com/what-is-a-population/
http://www.statisticshowto.com/probability-and-statistics/normal-distributions/
http://www.statisticshowto.com/what-is-standard-deviation/
http://www.statisticshowto.com/calculate-standard-error-sample-mean/
http://www.statisticshowto.com/t-score/
http://www.statisticshowto.com/mean
http://www.statisticshowto.com/degrees-of-freedom/
http://www.statisticshowto.com/variable/
http://www.statisticshowto.com/chi-square/
http://www.statisticshowto.com/mean/
http://www.statisticshowto.com/sample/
http://www.statisticshowto.com/z-score-definition/
http://www.statisticshowto.com/statistics-symbols/
EDU730: Research
Practices and Methods
Page 27 EDU730: Research Practices and Methods
Further Reading and Study
Book
Muijs, D. (2010). Doing quantitative research in education with SPSS. Sage.
References:
Avramidis, E., & Norwich, B. (2002). Teachers’ attitudes towards
integration/inclusion: a review of the literature. European Journal of Special
Needs Education, 17(2), 129-147.
Blaikie, N. (2003). Analyzing quantitative data: From description to
explanation. Sage.
Burns N, Grove SK (2005). The Practice of Nursing Research: Conduct, Critique,
and Utilization (5th Ed.). St. Louis, Elsevier Saunders
Eston, RG, Fu F. Fung L (1995). Validity of conventional anthropometric
techniques for estimating body composition in Chinese adults. Br J Sports Med,
29, 52–6.
Freeman, J. V., & Julious, S. A. (2005). The visual display of quantitative
information. Scope, 14(2), 11-15.
EDU730: Research
Practices and Methods
Page 28 EDU730: Research Practices and Methods
Frost J. (2015). Choosing Between a Nonparametric Test and a Parametric Test.
Retrieved from http://blog.minitab.com/blog/adventures-in-statistics-
2/choosing-between-a-nonparametric-test-and-a-parametric-test
angley , Perrie Y (2014). Maths Skills for Pharmacy: Unlocking
Pharmaceutical Calculations. Oxford University Press.
Muijs, D. (2010). Doing quantitative research in education with SPSS. Sage.
Patel, P. (2009, October). Introduction to Quantitative Methods. In Empirical
Law Seminar.
Rosenthal R. (1991.). Meta-analytic procedures for social research (revised
edition). Newbury Park, CA: Sage,
Rowlands A.V, Eston R.G, Ingledew D.K. (1999). The relationship between
activity levels, body fat and aerobic fitness in 8–10 year old children. J Appl
Physiol, 86, 1428–35.
EDU730: Research
Practices and Methods
Page 1 EDU730: Research Practices and Methods
Week 9:
Quantitative Data Analysis
Topic goals
To gain an understanding of quantitative analysis.
To become familiar with the statistical tests used in quantitative research.
To understand the stages involved in quantitative data analysis.
Task – Forum
Based on the given research problem, provide a research
question that can address two or more variables, using
quantitative terms, defining the variables you will
use.
Discuss which statistical test you would use to answer
your research question and explain the rationale behind
your choice.
QUANTITATIVE DATA ANALYSIS
1. Introduction
The main purpose of analysing data is to gain useful and valuable information. Data analysis is used to describe data and to compare variables or to find relationships and differences between them. To do this, the researcher uses techniques that convert the data into numerical form.
1.1. Prepare your data
As a researcher you have to make sure that your data are correct, e.g. that respondents answered all of the questions and that your transcriptions are accurate. You then have to identify any missing data and convert the responses into numerical form, e.g. red = 1, yellow = 2, green = 3.
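For illustration, the sketch below shows what this preparation step might look like in Python with the pandas library; the column names and coding scheme are hypothetical.

```python
# A minimal sketch (pandas, hypothetical column names) of the preparation step
# described above: flagging missing answers and recoding a categorical response.
import pandas as pd

responses = pd.DataFrame({
    "respondent": [1, 2, 3, 4],
    "favourite_colour": ["red", "yellow", None, "green"],  # None = missing answer
})

# Identify missing data before any recoding.
print(responses["favourite_colour"].isna().sum(), "missing value(s)")

# Convert the named categories into numerical form (red=1, yellow=2, green=3).
codes = {"red": 1, "yellow": 2, "green": 3}
responses["colour_code"] = responses["favourite_colour"].map(codes)
print(responses)
```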
1.2. Scales of measurements
Before analysing quantitative data, researchers must identify the level of measurement associated with the data. The type of analysis that can be applied to a data set depends on its scale of measurement. The scales of measurement are nominal, ordinal, interval and ratio.
Nominal data
Data have no logical order and can be classified into non-numerical or named categories. It is basic classification data. Any numerical values assigned simply replace the names and cannot be ordered.
Example: male or female, district A or district B – there is no order associated with these categories.
Ordinal data
Data have a logical order, but the differences between values are not constant. These data are usually used for questions that refer to ratings of quality or agreement, such as good/fair/bad or strongly agree/agree/disagree/strongly disagree.
Example: 1st , 2nd, 3rd
Example: T-shirt size (small, medium, large)
Interval data:
Data are continuous and have a logical order, with standardised differences between values, but no natural zero.
Example: Fahrenheit degrees
* Remember that ratios are meaningless for interval data. You cannot say, for
example, that one day is twice as hot as another day.
Ratio data
Data are continuous, ordered, have standardised differences between values, and a natural zero.
Example: height, weight, age, length
Having an absolute zero allows you to argue meaningfully that one measure is twice as large as another.
For example – 10 km is twice as long as 5 km
Remember that there are several ways of approaching a research question and how
the researcher puts together a research question will determine the type of
methodology, data collection method, statistics, analysis and presentation that will
be used to approach the research problem.
For each type of data you have to use different analysis techniques. When using a
quantitative methodology, you are normally testing a theory through the testing of
a hypothesis.
1.3. Hypothesis/Null hypothesis:
A hypothesis is a logical assumption, a reasonable guess, or a suggested answer to
a research problem.
A null hypothesis states that any differences observed between the variables can occur because of chance error and are therefore not significant.
*Chance error is defined as the difference between the predicted value of a
variable (by the statistical model in question) and the actual value of the variable.
In statistical hypothesis testing, a type I error is the incorrect rejection of a true null
hypothesis (a “false positive”), while a type II error is incorrectly retaining a false
null hypothesis (a “false negative”). Simply, a type I error is detecting an effect (e.g.
a relationship between two variables) that is not present, while a type II error is
failing to detect an effect that is present.
1.4. Randomised, controlled and double-blind trial
Randomised – participants are chosen or assigned at random.
Controlled – there is a control group as well as an experimental
group.
Double-blind – neither the subjects nor the researchers know who is in which
group.
Variables:
An experiment has three characteristics:
1. A manipulated independent variable (often denoted by x), whose variation does not depend on that of another variable.
2. Control of other, extraneous variables that could affect the outcome.
3. The observed effect of the independent variable on the dependent variable (often denoted by y), whose value depends on that of the independent variable.
1.5. Validity, reliability and generalizability
Validity: refers to whether the researcher measures what he/she wants to
measure. The three types of validity are:
Content validity – refers to whether or not the content of the variables is right to
measure the concept.
Criterion validity – refers to whether scores on the instrument correspond to other, external measures (criteria) of the same concept.
Construct validity – refers to the design of your instrument so that it contains
several factors, rather than just one.
(Muijs, 2010)
Reliability: “refers to the extent to which test scores are free of measurement
error” (Muijs, 2010, pg.82). The two types of reliability are:
Repeated measures or test–retest reliability – refers to whether the instrument can be trusted to give similar results if administered again later to the same respondents.
Internal consistency – refers to whether all the items are measuring the same
construct.
Generalizability: refers to the extent to which the findings from your sample can be generalized to the population.
2. Descriptive statistics
Descriptive statistics summarise data. They are used to describe variables and the basic features of the data that have been collected in a study. They provide simple summaries about the sample, including measures of central tendency (e.g. mean, median, mode) and of spread (e.g. standard deviation, range). Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.
It should be noted that with descriptive statistics no conclusions can be extended
beyond the immediate group from which the data was gathered.
Some popular summary statistics for interval variables
Mean: the arithmetic average of the values, calculated by adding all the values and dividing by the total number of values.
Median: the data point that lies in the middle once the values are put in numerical order.
Mode: the most commonly occurring score in a data set.
Range: the difference between the highest score and the lowest score.
Standard deviation: “The standard deviation exists for all interval variables. It is the average distance of each value away from the sample mean. The larger the standard deviation, the farther away the values are from the mean; the smaller the standard deviation, the closer the values are to the mean” (Patel, 2009, pg. 5).
Minimum and maximum value: the smallest and largest score in the data set.
Frequency: the number of times a certain value appears.
Quartiles: the values that divide the ordered data into four equal parts (the second quartile is the median).
(Adapted from Patel, 2009, pg. 6)
3. Data distribution
Before beginning the statistical tests, it is necessary to check the distribution of
your data. The main types of distribution are normal and non-normal.
Example
Case no    Grades
1          90
2          67
3          85
4          90
5          100
6          58
7          90
Total      580
Mean: 82.9 (580 ÷ 7)
Median: 90
Mode: 90
Minimum value: 58
Maximum value: 100
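These summary values can be checked with a few lines of Python using only the standard library:

```python
# A quick check of the worked example above.
import statistics

grades = [90, 67, 85, 90, 100, 58, 90]

print("Total:", sum(grades))                        # 580
print("Mean:", round(statistics.mean(grades), 1))   # 82.9
print("Median:", statistics.median(grades))         # 90
print("Mode:", statistics.mode(grades))             # 90
print("Minimum:", min(grades))                      # 58
print("Maximum:", max(grades))                      # 100
```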
3.1. The Normal distribution
When the data tend to cluster around a central value with no bias to the left or right, they approximate a “Normal Distribution”.
The graph of the normal distribution depends on two factors: the mean (M) and the standard deviation (SD). The basic characteristics of a normal curve are: a) it is a bell-shaped curve; b) it is perfectly symmetrical; c) the mode, median, and mean lie in the middle of the curve (50% of the values lie to the left of the mean and 50% to the right); d) approximately 95% of the values are found within two standard deviations of the mean (in both directions) (Patel, 2009). The location of the centre of the graph is determined by the mean of the distribution, and the height and width of the graph are determined by the standard deviation. When the standard deviation is large, the curve is short and wide; when the standard deviation is small, the curve is tall and narrow. Normal distribution graphs look like a symmetric, bell-shaped curve. When measuring things like people’s height, weight, salary, opinions or votes, the graph of the results is very often a normal curve (Langley & Perrie, 2014).
3.2. Non-Normal Distributions:
There are several ways in which a distribution can be non-normal.
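Before choosing between parametric and non-parametric tests (see the next section), it is common to check formally whether the data depart from normality. The sketch below, which assumes the SciPy and NumPy libraries are available and uses simulated data, applies the Shapiro–Wilk test to a roughly normal and a clearly skewed sample.

```python
# A minimal sketch: checking for normality with the Shapiro-Wilk test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
normal_scores = rng.normal(loc=70, scale=10, size=100)   # roughly normal
skewed_scores = rng.exponential(scale=10, size=100)      # clearly non-normal

for name, data in [("normal", normal_scores), ("skewed", skewed_scores)]:
    stat, p = stats.shapiro(data)
    # p > 0.05: no evidence against normality; p <= 0.05: evidence of non-normality
    print(f"{name}: W = {stat:.3f}, p = {p:.3f}")
```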
4. Statistical Analysis
Statistical tests are used to make inferences from data, and can tell us whether an observed effect is likely to be real rather than due to chance. There is a wide range of statistical tests, and the decision about which one to use depends on your research design. If your data are normally distributed you would normally choose a parametric test; otherwise you would choose a non-parametric test.
4.1. Parametric and Nonparametric Tests
A parametric statistical test makes assumptions about the parameters (defining
properties) of the population distribution(s) from which one’s data are drawn,
whereas a non-parametric test makes no such assumptions. Nonparametric tests
are also called distribution-free tests because they do not assume that your data
follow a specific distribution (Frost, 2015).
Parametric tests (means)                                   Nonparametric tests (medians)
1-sample t test                                            1-sample sign test, 1-sample Wilcoxon
2-sample t test                                            Mann-Whitney test
One-way ANOVA                                              Kruskal-Wallis, Mood's median test
Factorial DOE with one factor and one blocking variable    Friedman test
It is argued that nonparametric tests should be used when the data do not meet
the assumptions of the parametric test, particularly the assumption about normally
distributed data. However, there are additional considerations when deciding
whether a parametric or nonparametric test should be used.
4.2. Reasons to Use Parametric Tests
Reason 1: Parametric tests can perform well with skewed and non-normal
distributions
Parametric tests can perform well with continuous data that are not normally
distributed if the sample size guidelines demonstrated in the table below are
satisfied.
Parametric analysis     Sample size guideline for non-normal data
1-sample t test         Greater than 20
2-sample t test         Each group should be greater than 15
One-way ANOVA           If you have 2–9 groups, each group should be greater than 15;
                        if you have 10–12 groups, each group should be greater than 20.
Note: These guidelines are based on simulation studies conducted by statisticians at
Minitab.
Reason 2: Parametric tests can perform well when the spread of each group
is different
While nonparametric tests do not assume that your data are normally distributed,
they do have other assumptions that can be hard to satisfy. For example, when
using nonparametric tests that compare groups, a common assumption is that the
data for all groups have the same spread (dispersion). If the groups have a different
spread, then the results from nonparametric tests might be invalid.
Reason 3: Statistical power
Parametric tests usually have more statistical power compared to nonparametric
tests. Hence, they are more likely to detect a significant effect when one truly
exists.
4.3. Reasons to Use Nonparametric Tests
Reason 1: Your area of study is better represented by the median
The fact that a parametric test can be performed on non-normal data does not imply that the mean is the best measure of central tendency for your data. For example, the centre of a skewed distribution (e.g. income) is better measured by the median, where 50% of values are above the median and 50% are below. If you add a few billionaires to a sample, the arithmetic mean increases greatly, although the income of the typical person does not change.
When the distribution is skewed enough, the mean is strongly influenced by
changes far out in the distribution’s tail, whereas the median continues to more
closely represent the center of the distribution.
Reason 2: You have a very small sample size
If the data are not normally distributed and do not meet the sample size guidelines for the parametric tests, then a nonparametric test should be used. In addition, when you have a very small sample, it might be difficult to ascertain the distribution of your data, as the distribution tests will lack sufficient power to provide meaningful results.
Reason 3: You have ordinal data, ranked data, or outliers that you cannot
remove
Typical parametric tests can only assess continuous data, and their results can be seriously affected by outliers. Conversely, some nonparametric tests can handle ordinal and ranked data without being significantly affected by outliers.
4.4. Statistical tests
One-tailed test: A test of a statistical hypothesis, where the region of rejection is on
only one side of the sampling distribution is called a one-tailed test. For example,
suppose the null hypothesis states that the mean is less than or equal to 10. The
alternative hypothesis would be that the mean is greater than 10.
Two-tailed test: When using a two-tailed test, regardless of the direction of the
relationship you hypothesize, you are testing for the possibility of the relationship
in both directions. For example, we may wish to compare the mean of a sample to a
given value x using a t-test. Our null hypothesis is that the mean is equal to x.
Alpha level (p value): In statistical analysis the researcher examines whether the results are significant. The p value is the probability of obtaining the observed difference, or one more extreme, if the null hypothesis is true.
The acceptance or rejection of a hypothesis is based upon a level of significance – the alpha (α) level.
This is typically set at the 5% (0.05) α level, followed in popularity by the 1% (0.01) α level.
These are usually designated as p, i.e. p = 0.05 or p = 0.01.
So, what do we mean by the levels of significance that the p value can give us?
The p value is concerned with confidence levels. This states the threshold at which
you are prepared to accept the possibility of a Type I Error – otherwise known as a
false positive – rejecting a null hypothesis that is actually true.
The question that significance levels answer is ‘How confident can the researcher
be that the results have not arisen by chance?’
Note: The confidence levels are expressed as a percentage.
So if we had a result of:
p = 1.00, the observed difference is entirely consistent with chance (we have no evidence against the null hypothesis).
p = 0.50, there is a 50% probability of obtaining a difference at least this large by chance alone.
p = 0.05, there is only a 5% probability of obtaining a difference at least this large by chance alone, so we can be reasonably confident (at the 95% level) that the result did not arise by chance.
p = 0.01, there is only a 1% probability of obtaining a difference at least this large by chance alone, so we can be confident at the 99% level.
Clearly, we want to limit the risk of a false positive, so we set our significance levels as low as possible – usually at 5% (p = 0.05), or better still, at 1% (p = 0.01).
Anything above these figures is not considered strict enough; in other words, the results are not regarded as significant.
Now, you may be thinking that if an effect would arise by chance only 10 times out of 100 (p = 0.1), then that is pretty convincing.
However, what we are determining with our levels of significance is ‘statistical significance’, and by convention we are much stricter than that, so we would usually not accept values greater than p = 0.05.
So when looking at the statistics in a research paper, it is important to check the ‘p’
values to find out whether the results are statistically significant or not.
(Burns & Grove, 2005)
p-value                   Outcome of test          Statement
greater than 0.05         Fail to reject H0        No evidence to reject H0
between 0.01 and 0.05     Reject H0 (accept H1)    Some evidence to reject H0 (therefore accept H1)
between 0.001 and 0.01    Reject H0 (accept H1)    Strong evidence to reject H0 (therefore accept H1)
less than 0.001           Reject H0 (accept H1)    Very strong evidence to reject H0 (therefore accept H1)
ANOVA (Analysis of Variance)
ANOVA is one of a family of tests (along with ANCOVA – analysis of covariance – and MANOVA – multivariate analysis of variance) that are used to compare a number of groups. ANOVA is used to determine whether the difference in means (averages) across two or more groups is statistically significant.
T-test
The t-test is used to assess whether the means of two groups differ statistically
from each other.
Mann-Whitney U-test
The Mann-Whitney U-test is used to test for differences between two independent groups on a continuous measure, e.g. do males and females differ in terms of their levels of anxiety?
This test requires one categorical variable with two groups (e.g. male/female) and one continuous variable (e.g. anxiety level). Basically, the Mann-Whitney U-test converts the scores on the continuous variable to ranks across the two groups and compares the medians of the two groups. It then evaluates whether the medians for the two groups differ significantly.
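As an illustration, the sketch below runs a two-sample t-test and its non-parametric counterpart, the Mann-Whitney U-test, on hypothetical anxiety scores for two groups; it assumes the SciPy library is installed.

```python
# A minimal sketch with hypothetical anxiety scores for two independent groups.
from scipy import stats

males   = [12, 15, 14, 10, 18, 13, 16, 11]
females = [14, 17, 19, 15, 20, 16, 18, 21]

# Parametric: independent (2-sample) t-test compares the group means.
t_stat, t_p = stats.ttest_ind(males, females)
print(f"t-test: t = {t_stat:.2f}, p = {t_p:.3f}")

# Non-parametric alternative: Mann-Whitney U-test compares the rank distributions.
u_stat, u_p = stats.mannwhitneyu(males, females, alternative="two-sided")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.3f}")
```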
Wilcoxon signed-rank test
The Wilcoxon signed-rank test (also known as Wilcoxon matched-pairs test) is the
most common nonparametric test for the two-sampled repeated measures design
of research study.
Kruskal-Wallis test
The Kruskal-Wallis test is used to compare more than two independent samples (based on ranked scores) when either the data are ordinal or the distribution is not normal. When there are only two groups, it is the equivalent of the Mann-Whitney U-test. This test is typically used to determine the significance of differences among three or more groups.
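A comparable sketch for more than two groups, again with hypothetical scores and assuming SciPy, runs a one-way ANOVA alongside the Kruskal-Wallis test:

```python
# A minimal sketch comparing three hypothetical groups with one-way ANOVA
# (parametric) and the Kruskal-Wallis test (non-parametric).
from scipy import stats

group_a = [72, 68, 75, 70, 74]
group_b = [80, 78, 85, 82, 79]
group_c = [65, 70, 66, 64, 69]

f_stat, f_p = stats.f_oneway(group_a, group_b, group_c)
print(f"One-way ANOVA: F = {f_stat:.2f}, p = {f_p:.4f}")

h_stat, h_p = stats.kruskal(group_a, group_b, group_c)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {h_p:.4f}")
```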
Correlations
These tests are used to examine the nature of the relationship between two variables; statistically, this relationship is referred to as a linear trend. The relationship between the variables is usually presented on a scatter plot. A correlation does not establish causation: it does not mean that one variable is the cause of the other.
The possible types of correlation are listed below:
Variable 1    Action   Variable 2      Action   Type of correlation
Math score    ↑        Science score   ↑        Positive: as Math score improves, Science score improves
Math score    ↓        Science score   ↓        Positive: as Math score declines, Science score declines
Math score    ↑        Science score   ↓        Negative: as Math score improves, Science score declines
Math score    ↓        Science score   ↑        Negative: as Math score declines, Science score improves
Accompanying graphs (not reproduced here) illustrate these relationships, for example a perfect positive correlation.
Pearson’s correlation
It is used to test the correlation between two continuous variables. The value of Pearson’s correlation coefficient lies between −1.00 (perfect negative correlation) and +1.00 (perfect positive correlation), with 0.00 indicating no correlation.
Spearman rank correlation test
The Spearman rank correlation test is used to assess the association between two ranked variables (X and Y) which are not normally distributed. It is frequently used to compare the scores of a group of subjects on two measures (i.e. it is a correlation coefficient based on ranks).
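Both correlation coefficients can be computed with SciPy; the maths and science scores below are hypothetical.

```python
# A minimal sketch: Pearson and Spearman correlations on hypothetical scores.
from scipy import stats

maths   = [55, 60, 65, 70, 75, 80, 85]
science = [52, 63, 61, 72, 78, 79, 88]

r, r_p = stats.pearsonr(maths, science)        # parametric, continuous data
rho, rho_p = stats.spearmanr(maths, science)   # non-parametric, rank-based

print(f"Pearson r = {r:.2f} (p = {r_p:.4f})")
print(f"Spearman rho = {rho:.2f} (p = {rho_p:.4f})")
```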
Chi-square test
There are two different types of chi-square tests – but both involve categorical data.
One type of chi-square test compares the frequency count of what is expected in
theory against what is actually observed.
The second type of chi-square test is known as a chi-square test with two variables
or the chi-square test for independence.
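A minimal sketch of the chi-square test for independence, using a hypothetical 2 × 2 contingency table and assuming SciPy, is shown below.

```python
# A minimal sketch of a chi-square test for independence on a hypothetical
# 2 x 2 contingency table (e.g. two groups vs. agree/disagree).
from scipy.stats import chi2_contingency

observed = [[30, 10],   # group 1: agree, disagree
            [20, 25]]   # group 2: agree, disagree

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```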
Regression
It is an extension of correlation and is used to determine whether one variable is a predictor of another. Regression is used to determine how strongly your predictor (e.g. intervention) variable is related to your outcome variable.
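A minimal sketch of simple linear regression, with hypothetical study-hours and exam-score data and assuming SciPy, is shown below.

```python
# A minimal sketch of simple linear regression: do hours of study predict exam score?
from scipy import stats

hours  = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 72, 78, 83]

result = stats.linregress(hours, scores)
print(f"slope = {result.slope:.2f}, intercept = {result.intercept:.2f}")
print(f"r = {result.rvalue:.2f}, R-squared = {result.rvalue**2:.2f}, p = {result.pvalue:.4f}")

# Predicted score for a student who studies 10 hours (extrapolation - use with care).
print("predicted:", result.intercept + result.slope * 10)
```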
Table of common statistical tests

Correlation – tests that examine the nature of the relationship between two variables:
Pearson’s correlation: tests the strength of the association between two continuous variables. Parametric.
Spearman rank correlation test: tests the strength of the association between two ordinal, ranked variables (X and Y). Non-parametric.
Chi-square test: tests the strength of the association between two categorical variables. Non-parametric.

Comparison of means – tests that look for differences between groups:
Paired t-test: tests for a difference between two related measures. Parametric.
Independent t-test: tests for a difference between two independent groups. Parametric.
ANOVA: tests whether the difference in means across two or more groups is statistically significant. Parametric.
Mann-Whitney U-test: tests for a difference between two independent groups on a continuous measure. Non-parametric.
Wilcoxon rank-sum test: tests for a difference between two independent groups, taking into account the magnitude and direction of the differences. Non-parametric.
Wilcoxon signed-rank test: tests for a difference in a two-sample repeated-measures design, taking into account the magnitude and direction of the differences. Non-parametric.
Kruskal-Wallis test: tests for differences among more than two independent samples when the data are ordinal or not normally distributed. Non-parametric.

Regression – assesses whether change in one variable predicts change in another:
Simple regression: tests how change in the predictor variable predicts the level of change in the outcome variable. Parametric.
Multiple regression: tests how change in a combination of two or more predictor variables predicts the level of change in the outcome variable. Parametric.
5. Power of the study
There is increasing criticism about the lack of statistical power of published research in sports and exercise science and psychology. Statistical power is defined as the probability of rejecting the null hypothesis when it is false; that is, the probability that the study will lead to significant results when a real effect exists. If the null hypothesis is false but not rejected, a type 2 error occurs. Cohen suggested that a power of 0.80 is satisfactory when alpha is set at 0.05 – that is, when the risk of a type 1 error (i.e. rejecting the null hypothesis when it is true) is 0.05. This means that the accepted risk of a type 2 error is 0.20. The magnitude of the relation or treatment effect (known as the effect size) is a factor that must receive a lot of attention when considering the statistical power of
a study. When calculated in advance, this can be used as an indicator of the degree to which the researcher believes the null hypothesis to be false. Each statistical test has an effect size index that ranges from zero upwards and is scale free. For instance, the effect size index for a correlation test is r, for which no conversion is required. For assessing the difference between two sample means, Cohen’s d, Hedges’ g, or Glass’s Δ can be used. These divide the difference between two means by a standard deviation. Formulae are available for converting other statistical test results (e.g. t test, one-way analysis of variance, and χ2 results) into effect size indexes (see Rosenthal, 1991).
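As a concrete illustration of this calculation, the sketch below computes Cohen’s d for two hypothetical, independent groups by dividing the difference between the group means by the pooled standard deviation (standard library only).

```python
# A minimal sketch of Cohen's d for two independent groups (hypothetical data).
# d = (mean1 - mean2) / pooled standard deviation
import statistics

def cohens_d(group1, group2):
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)  # sample variances
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

intervention = [78, 82, 75, 80, 85, 79]
control      = [70, 74, 72, 68, 75, 71]
print(f"Cohen's d = {cohens_d(intervention, control):.2f}")
```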
Effect sizes are typically described as small, medium, and large. Correlations of 0.1, 0.3, and 0.5 and Cohen’s d values of 0.2, 0.5, and 0.8 correspond to small, medium, and large effect sizes respectively. It is important to note that the power of a study is linked to the sample size, i.e. the smaller the expected effect size, the larger the sample size required to have sufficient power to detect it.
For example, a study that assesses the effects of habitual physical activity on body
fat in children might have a medium effect size (e.g. see Rowlands et al., 1999). In
this study, there was a moderate correlation between habitual physical activity and
body fat, with a medium effect size. A large effect size may be anticipated in a study
that assesses the effects of a very low energy diet on body fat in overweight women
(e.g. see Eston et al., 1995). In Eston et al.’s study, a significant reduction in total energy intake resulted in a substantial decrease in total body mass and in the percentage of body fat.
The effect size should be estimated during the design stage of a study, as this will allow the researcher to determine the sample size required to give adequate power for a given alpha level. The study can then be designed to ensure that there is sufficient power to detect the effect of interest, thereby minimising the possibility of a type 2 error.
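As an illustration of this design-stage calculation, the sketch below (which assumes the statsmodels package is installed) estimates the sample size per group needed to detect a medium effect (Cohen’s d = 0.5) in a two-sample t-test with 80% power at α = 0.05.

```python
# A minimal sketch of an a priori power calculation for a two-sample t-test:
# how many participants per group are needed to detect a medium effect
# (Cohen's d = 0.5) with power = 0.80 and alpha = 0.05?
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 64
```

The result (roughly 64 participants per group) illustrates the point made above: the smaller the expected effect size, the larger the required sample.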
Table 3. Small, medium and large effect sizes as defined by Cohen
Effect size index              Small    Medium    Large
Correlation coefficient (r)    0.1      0.3       0.5
Cohen's d                      0.2      0.5       0.8
When empirical data are available, they can be used to assess the effect size for a
study. However, for some research questions it is difficult to find enough
information (e.g. there is limited empirical information on the topic or insufficient
detail provided in the results of the relevant studies) to estimate the expected
effect size. In order to compare effect sizes of studies that differ in sample size, it is
recommended that, in addition to reporting the test statistic and p value, the
appropriate effect size index is also reported.
6. Data presentation
A set of data on its own is very hard to interpret. There is a lot of information contained in the data, but it can be hard to see. Eye-balling your data using graphs and exploratory data analysis is necessary for understanding important features of the data, detecting outliers, and spotting data which have been recorded incorrectly. Outliers are extreme observations which are inconsistent with the rest of the data. The presence of outliers can significantly distort some of the more formal statistical techniques, so there is a strong need for preliminary detection and correction or accommodation of such observations before further analysis takes place.
Usually, a straight line fits the data well. However, an outlier “pulls” the line in its direction, as demonstrated in the lower graph in Figure 2. When the line is dragged towards the outlier, the rest of the points fall farther from the line than they otherwise would. In this case the “fit” is reduced; thus, the correlation is weaker. Outliers typically result from an error such as a mismarked answer paper, a mistake in entering a score into a database, or a subject who misunderstood the directions. The researcher should always seek to understand the cause of an outlying score. If the cause is not legitimate, the researcher should remove the outlying score from the analysis to avoid distorting the results.
Figure 1. A demonstration of how outliers can be identified using graphs
Figure 2. The two graphs demonstrate data where no outliers are observed (top graph) and data where an outlier is observed (bottom graph).
6.1. Charts for quantitative data
There are different types of charts that can be used to present quantitative data.
Dot plots are one of the simplest ways of displaying all the data. Each dot
represents an individual and is plotted along a vertical axis. Data for several groups
can be plotted alongside each other for comparison (Freeman& Julious, 2005).
Scatter plots: a scatter plot is a type of diagram that presents the values of two variables. The data are displayed as a collection of points; each point’s position is determined by its values on the horizontal and vertical axes.
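A minimal sketch of a scatter plot, using hypothetical scores and assuming the matplotlib library is available, with one deliberate outlier to illustrate the point made in the previous section:

```python
# A minimal sketch of a scatter plot with one deliberate outlier (hypothetical data).
import matplotlib.pyplot as plt

maths   = [55, 60, 65, 70, 75, 80, 85, 90]
science = [52, 63, 61, 72, 78, 79, 88, 20]   # the last point is an outlier

plt.scatter(maths[:-1], science[:-1], label="observations")
plt.scatter(maths[-1:], science[-1:], color="red", label="outlier")
plt.xlabel("Maths score")
plt.ylabel("Science score")
plt.legend()
plt.show()
```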
7. Quantitative Software for Data Analysis
Quantitative studies often result in large numerical data sets that would be difficult
to analyse without the help of computer software packages. Programs such as
EXCEL are available to most researchers and are relatively straight-forward. These
programs can be very useful for descriptive statistics and less complicated analyses.
However, sometimes the data require more sophisticated software. There are a
number of excellent statistical software packages including:
SPSS – The Statistical Package for the Social Sciences (SPSS) is one of the most popular software packages in social science research. SPSS is comprehensive and compatible with
almost any type of data and can be used to run both descriptive statistics and other
more complicated analyses, as well as to generate reports, graphs, plots and trend
lines based on data analyses.
STATA – This is an interactive program that can be used for both simple and
complex analyses. It can also generate charts, graphs and plots of data and results.
This program seems a bit more complicated than other programs as it uses four
different windows including the command window, the review window, the result
window and the variable window.
SAS – The Statistical Analysis System (SAS) is another very good statistical software
package that can be useful with very large data sets. It has additional capabilities
that make it very popular in the business world because it can address issues such
as business forecasting, quality improvement, planning, and so forth. However, some knowledge of its programming language is necessary to use the software, making it a less appealing option for some researchers.
R programming – R is an open source programming language and software
environment for statistical computing and graphics that is supported by the R
Foundation for Statistical Computing. The R language is commonly used
among statisticians and data miners for developing statistical software and data
analysis.
(Blaikie, 2003)
8. Statistical Symbols:
α: significance level (probability of a type I error).
b or b0: y intercept.
b1: slope of a line (used in regression).
β: probability of a type II error.
1 − β: statistical power.
BD or BPD: binomial distribution.
CI: confidence interval.
CLT: Central Limit Theorem.
d: difference between paired data.
df: degrees of freedom.
DPD: discrete probability distribution.
E: margin of error.
f: frequency (i.e. how often something happens).
f/n: relative frequency.
HT: hypothesis test.
H0: null hypothesis.
H1 or Ha: alternative hypothesis.
IQR: interquartile range.
m: slope of a line.
M: median.
n: sample size or number of trials in a binomial experiment.
σp̂: standard error of the proportion.
p: p-value, or probability of success in a binomial experiment, or population proportion.
ρ: correlation coefficient for a population.
p̂: sample proportion.
P(A): probability of event A.
P(AC) or P(not A): the probability that A does not happen.
P(B|A): the probability that event B occurs, given that event A occurs.
Pk: kth percentile; for example, P90 = 90th percentile.
q: probability of failure in a binomial or geometric distribution.
Q1: first quartile.
Q3: third quartile.
r: correlation coefficient of a sample.
R²: coefficient of determination.
s: standard deviation of a sample.
s.d. or SD: standard deviation.
SEM: standard error of the mean.
SEP: standard error of the proportion.
N: population size.
ND: normal distribution.
σ: standard deviation.
σx̄: standard error of the mean.
t: t-score.
μ: mean (of a population).
ν: degrees of freedom.
X: a variable.
χ²: chi-square.
x: one data value.
x̄: mean of a sample.
z: z-score.
Source: http://www.statisticshowto.com/statistics-symbols/
9. Task – Forum
Read carefully the following research problem:
“Research studies suggest that teachers’ attitudes towards the inclusion
of students with disabilities are influenced by a number of interrelated
factors. For example, some earlier studies indicate that the nature of
disability and the associated educational problems presented influence
teachers’ attitudes. These are termed as ‘child-related’ variables. Other
studies suggest demographic and other personality factors which can be
classified as ‘teacher-related’ factors. Finally, the specific context is
found to be another influencing factor and can be termed as
‘educational environment-related’ (Avramidis & Norwich, 2002).
Based on this research problem, please provide a research question that
can address two or more variables. Bear in mind that the research
question needs to use quantitative terms, defining the variables you will
use.
Finally, discuss which statistical test you would use to answer your
research question and explain the rationale behind your choice.
Further Reading and Study
Book
Muijs, D. (2010). Doing quantitative research in education with SPSS. Sage.
References:
Avramidis, E., & Norwich, B. (2002). Teachers’ attitudes towards
integration/inclusion: a review of the literature. European Journal of Special
Needs Education, 17(2), 129-147.
Blaikie, N. (2003). Analyzing quantitative data: From description to
explanation. Sage.
Burns, N., & Grove, S. K. (2005). The practice of nursing research: Conduct, critique, and utilization (5th ed.). St. Louis: Elsevier Saunders.
Eston, R. G., Fu, F., & Fung, L. (1995). Validity of conventional anthropometric techniques for estimating body composition in Chinese adults. British Journal of Sports Medicine, 29, 52-56.
Freeman, J. V., & Julious, S. A. (2005). The visual display of quantitative
information. Scope, 14(2), 11-15.
Frost, J. (2015). Choosing between a nonparametric test and a parametric test. Retrieved from http://blog.minitab.com/blog/adventures-in-statistics-2/choosing-between-a-nonparametric-test-and-a-parametric-test
Langley, C., & Perrie, Y. (2014). Maths skills for pharmacy: Unlocking pharmaceutical calculations. Oxford: Oxford University Press.
Muijs, D. (2010). Doing quantitative research in education with SPSS. Sage.
Patel, P. (2009, October). Introduction to Quantitative Methods. In Empirical
Law Seminar.
Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Newbury Park, CA: Sage.
Rowlands, A. V., Eston, R. G., & Ingledew, D. K. (1999). The relationship between activity levels, body fat and aerobic fitness in 8–10 year old children. Journal of Applied Physiology, 86, 1428-1435.