Please, no plagiarism, and make sure you are able to access all resources on your own before you bid. Main references come from Neukrug, E. S., & Fawcett, R. C. (2015) and/or the Encyclopedia of Counseling (2017). You need to have scholarly support for any claim of fact or recommendation regarding treatment. APA format also requires headings. Use the prompt each week to guide your heading titles and organize the content of your initial post under the appropriate headings. Remember to use scholarly research from peer-reviewed articles that are current. I have attached examples and expectations so you can see how to earn full points. Please follow the instructions to get full credit for the discussion. The client case (Tracey) is attached, along with the diagnosis. I need this completed by 01/07/19 at 4pm.
DISCUSSION POSTS— For your main discussion posts, I require that all posts be a MINIMUM of 250 words. The main post must contain a minimum of two (2) different references from a peer-reviewed journal, scholarly book, or scholarly website. It is a good idea to use your Learning Resources each week. Wikipedia does not count as a scholarly website since its information is not validated.
Discussion – Week 7
Case Study and Assessments from MMY
How do clinicians find and select an appropriate assessment to administer to a client? One source of information is the Mental Measurements Yearbook (MMY). As you review the descriptions of the cases in “Practice Making a Diagnosis,” from Neukrug and Fawcett, consider issues that may be relevant and that you would like to assess. Review the list of possible issues in the handout and select two that you would like to explore further through the MMY.
To Prepare:
In the Walden Library, select Databases A-Z on the home page.
Select M from the alphabetic menu.
Select Mental Measurements Yearbook with Tests in Print from the options.
From the Learning Resources cases, identify two assessments in the MMY that you think would best provide information for you to consider exploring with this client.
By Day 3 of Week 7
Post your assessment selections and reasons why you selected these assessments. What would be the pros and cons of using each? Which would you select for this client?
Be sure to support your postings and responses with specific references to the Learning Resources.
Required Resources
Neukrug, E. S., & Fawcett, R. C. (2015). Exercise 3.3: "Practice making a diagnosis." In Essentials of testing and assessment: A practical guide for counselors, social workers, and psychologists (p. 55). Stamford, CT: Cengage Learning.
Carlson, J. F., Geisinger, K. F., & Jonson, J. L. (Eds.). (2017). The twentieth mental measurements yearbook. Lincoln, NE: Buros Center for Testing.
Neukrug, E. S., & Fawcett, R. C. (2015). Chapter 6: "Statistical concepts: Making meaning out of raw scores." In Essentials of testing and assessment: A practical guide for counselors, social workers, and psychologists (pp. 110-126). Stamford, CT: Cengage Learning.
Neukrug, E. S., & Fawcett, R. C. (2015). Chapter 7: "Statistical concepts: Creating new scores to interpret test data." In Essentials of testing and assessment: A practical guide for counselors, social workers, and psychologists (pp. 127-149). Stamford, CT: Cengage Learning.
II. Tracey Tracey is a 25-year-old single working mother. Her daughter, Alicia, is 3 years old and in day care during the work week. Tracey was recently divorced from Alicia’s father and has sole custody of their child because her ex-husband was physically abusive. In the past few years, when the marital problems began, Tracey became overwhelmed with anxiety but was so busy that she stated she just didn’t have time to deal with it. She starts her day at 5:30 a.m. to get Alicia dressed, packed, and ready for day care so that she can get to work by 7:00 a.m. Tracey usually has breakfast on the road, and she frequents the drive-through on her way to work for convenience. At work it’s “go, go, go,” and Tracey doesn’t usually have time to break for lunch. By the time she picks up Alicia from day care and gets home, it’s about 6:00 p.m. Tracey cooks dinner by 7:00 p.m., which usually consists of a healthy, balanced meal. Once she gives Alicia a bath and puts her to bed, Tracey finally gets a breather to relax on the couch and watch TV. Now that she is alone, she feels an uncontrollable urge to snack and often goes through a large bag of potato chips followed by a quart of ice cream before she realizes it. Sometimes, she finishes eating that amount before her favorite half-hour sitcom is over. “I just can’t stop. It’s like I zone out, and I don’t even realize how much I’ve eaten. I feel like I can’t control myself. Usually, I feel physically sick by the end of it and just pass out, like a food coma.” Tracey doesn’t like to eat junk food in front of others because she’s ashamed that she has gained so much weight since the divorce and feels self-conscious. She’s been eating in secret like this for the past year since the divorce, and it happens almost every night. It’s gotten to the point where she has begun isolating herself, preferring to go home and snack all night in front of the TV instead of spending time with family and friends.
Neukrug, Edward S. Essentials of Testing and Assessment: A Practical Guide for Counselors, Social Workers, and Psychologists, Enhanced (p. 55). Cengage Learning. Kindle Edition.
Diagnosis
307.51 (F50.8) Binge eating disorder, moderate; V61.03 (Z63.5) Disruption of family by divorce (recent); V60.2 (Z59.7) Low income; V62.9 (Z60.9) Unspecified problem related to social environment: Social isolation.
Neukrug, Edward S. Essentials of Testing and Assessment: A Practical Guide for Counselors, Social Workers, and Psychologists, Enhanced (p. 57). Cengage Learning. Kindle Edition.
Article
Assessment and Self-Injury: Implications for Counselors
Laurie M. Craigen (Old Dominion University, Norfolk, Virginia, USA), Amanda C. Healey (East Tennessee State University, Johnson City, Tennessee, USA), Cynthia T. Walley (Hunter College, New York City, New York, USA), Rebekah Byrd (Old Dominion University), and Jennifer Schuster (Old Dominion University)
Measurement and Evaluation in Counseling and Development, 43(1), 3-15. © The Author(s) 2010. DOI: 10.1177/0748175610362237
Corresponding Author: Laurie M. Craigen, PhD, LPC, Old Dominion University, 110 Education Building, Norfolk, VA 23529 USA. Email: lcraigen@odu.edu
Abstract
This article provides readers with an understanding of self-injury assessment. The article begins with a critical review of a number of self-injury assessments. The latter section of the article introduces a comprehensive two-tiered approach to accurately assessing self-injury. Implications for counselors related to the assessment of self-injury are also provided.
Keywords
self-injury, assessment, self-injurious behavior
Self-injurious behavior is an increasing issue
among adolescents and young adults. Accord-
ing to current research, self-injurious behavior
occurs in 4% to 39% of adolescents in the
general population and the numbers are pre-
dicted to rise, due to various reasons, ranging
from levels and quality of social interactions
with peers to the availability and assimilation
of coping behaviors through access to the
Internet (Briere & Gil, 1998; Favazza, 1996;
Gratz, 2001; Gratz, Conrad, & Roemer, 2002;
Muehlenkamp & Gutierrez, 2004; Nock &
Prinstein, 2005; Ross & Heath, 2002). Statis-
tics on the incidence of self-injury can be
unreliable, underestimating the true incidence
of self-injury. The reality is that many inci-
dents will be dealt with by the individual, in
private, and will never reach the attention of
medical services or mental health profession-
als (McAllister, 2003). Recently, there has
been a surge in the literature related to defin-
ing and explaining the behavior (Gratz, 2006).
Conversely, very little is known about the
assessment of self-injury, and therefore, a gap
exists between understanding the behavior and
implementing focused counseling interventions
and treatment (White Kress, 2003). The
purpose of this article is to provide readers
with knowledge about the difficulties related
to accurately evaluating self-injury and the
history of self-injury assessments, while also
introducing a comprehensive two-tiered
approach to assessing self-injury, emphasiz-
ing a holistic perspective.
Review of Self-Injury
Assessments
The development of inventories to evaluate
self-injury began in the early 1990s and con-
tinues today. As the conceptualizations and
definitions of self-injury have evolved, so too
has the focus of the assessments tailored for
its evaluation. Although the newer scales appear
to assess the behaviors and attitudes associated
with self-injury, many have not been through
the rigorous testing necessary to fully evaluate
their efficacy, reliability, and validity. Thus, when
selecting and administering assessments, it is
necessary for counselors to understand the evolv-
ing nature and continuing development of the
instrument they select for evaluating self-injury.
In the following section, a brief overview of
the inventories available for assessing self-
injurious behaviors is provided (see Table 1).
Self-Injury Trauma Scale (SITS)
One of the first inventories to be developed
for the assessment of self-injurious behaviors
is the SITS created by Iwata, Pace, and Kissel
(1990). It was created to evaluate the extent of
tissue damage caused by self-injury. This inven-
tory examines categories including location,
type, number, and severity of the tissue damage
as well as a summary evaluation of severity and
current risk for continued self-injury. SITS
defines its typical use in terms of quantifying
tissue damage directly. It also permits differ-
entiation of self-injury according to topography,
location of the injury on the body, type of injury,
number of injuries, and estimates of severity
through evaluation of the injuries themselves.
Test-retest reliability was reported at r = .68
(Iwata et al., 1990). This assessment was later
used to evaluate self-injury in conjunction
with physical pain as based on the proposition
that the experience and expression of pain is
somehow different among those individuals who
self-injure, therefore leading to the acceptabil-
ity and tolerability of self-injury as a behavior
(Symons & Danov, 2005).
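A brief supplementary note (an addition for this packet, not text from the article): test-retest reliability coefficients such as the SITS's r = .68 are conventionally Pearson correlations between the scores obtained from two separate administrations of the same instrument,
\[
r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}},
\]
where \(x_i\) and \(y_i\) are a given respondent's scores at the first and second administration; the closer r is to 1.0, the more stable the measurement is over time.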
The SITS was later used in a study to
determine the effects of a psychopharmacologi-
cal treatment on those with intellectual
disabilities who engaged in self-injury. In this
study, the SITS inventory was found to be reli-
able when used in conjunction with the
Non-Communication Children’s Pain Check-
list–Revised (NCCPC-R) in recognizing and
tracking self-injury from the perspective of an
outside observer—in this case, the parent
(McDonough, Hillery, & Kennedy, 2000). No
specific data were reported related to concurrent
validity beyond the statement that “the mean
NCCPC-R score was 20.1 for time intervals
scored with self-injurious behavior (SIB) and
2.5 for time intervals scored without SIB” (p.
474) as indicated by the SITS. The initial evalu-
ation of the inventory's efficacy and subsequent
usage found the scale to be a reliable method for
collecting data on surface tissue damage caused
by self-injury. However, the use of this scale
might not be practical for counselors but could
be useful for professionals who intervene with
the physical consequences of self-injury, such as
school nursing staff or medical professionals.
Self-Harm Inventory (SHI)
The SHI was developed by Sansone, Wiederman,
and Sansone (1998) in the context of screen-
ing for Borderline Personality Disorder (BPD).
It was the belief of the instrument developers
that BPD exists on a continuum in which self-
injury is the most severe manifestation of self-
sabotaging behaviors. With regard to the uses
of the SHI, self-harm is defined as the deliber-
ate, direct destruction of body tissue without
conscious suicidal intent but results in injury
severe enough for tissue damage to occur. The
SHI assesses frequency, severity, duration, and
type of self-injurious behavior. The SHI was
found to be highly related to the Diagnostic
Interview for Borderlines (DIB) at a correla-
tion of r = .76 and the Personality Diagnostic
Questionnaire–Revised at r = .71 with regard
to non-psychotic adults (Sansone et al., 1998).
The developers of this inventory also showed
that the SHI was able to predict the diagnosis
of BPD as based on its convergent validity. This
inventory is made up of 22 items that were
selected due to their correlation with the DIB,
and each question begins with the phrase, “Have
you ever on purpose, or intentionally . . . ,” and
respondents were asked to give a “yes” or
“no” answer (Sansone, Songer, Douglas, &
Sellbom, 2006, p. 976). The final score is a
simple summation of the items endorsed by
the client. In developing and testing the mea-
sure, it showed acceptable levels of clinical
accuracy as a measure for the diagnosis of
BPD by assessing a pattern of self-destructive
Table 1. Strengths and Weaknesses of Current Scale Inventories
SITS — Iwata, Pace, & Kissel (1990). Reliability: test-retest. Use/factors: assessing tissue damage as a result of self-injury. Predictive ability: able to predict current risk. Suicidality: not evaluated.
SHI — Sansone, Wiederman, & Sansone (1998). Validity: predictive, convergent. Use/factors: identifying self-injury in conjunction with BPD. Predictive ability: predicts presence of borderline personality features. Suicidality: can differentiate between high and low lethality.
SIQ — Alexander (1999). Reliability: internal consistency; test-retest. Validity: face, convergent, divergent. Use/factors: use with those who have suffered trauma. Predictive ability: measures intent to self-harm. Suicidality: measures major suicide concepts.
DSHI — Gratz (2001). Reliability: test-retest; internal consistency. Validity: construct, convergent, and discriminant. Use/factors: behaviorally based; clinical populations. Predictive ability: able to predict the features of self-injurious behaviors. Suicidality: suicidal intent assessed.
SI-IAT — Nock & Banjai (2007). Validity: predictive. Use/factors: assessing beliefs and identification with self-injury. Predictive ability: predicts suicidal ideation and behaviors. Suicidality: evaluated and differentiated.
SASII — Linehan, Comtois, Brown, Heard, & Wagner (2006). Reliability: interrater. Validity: predictive, content. Use/factors: provides descriptive information on suicidal and self-injurious behaviors. Predictive ability: evaluates past behavior; based on PHI. Suicidality: suicidal intent and lethality of self-injurious behaviors.
SITBI — Nock, Holmberg, Photos, & Michel (2007). Reliability: inter-rater; test-retest. Validity: construct. Use/factors: assesses a wide range of self-injury-related constructs. Predictive ability: none stated. Suicidality: assessed gestures, plan, ideations, and attempts.
Note. SITS = Self-Injury Trauma Scale; SHI = Self-Harm Inventory; SIQ = Self-Injury Questionnaire; DSHI = Deliberate Self-Harm Inventory; SI-IAT = Self-Injury Implicit Association Test; SASII = Suicide Attempt Self-Injury Interview; SITBI = Self-Injurious Thoughts and Behaviors Interview; BPD = Borderline Personality Disorder; PHI = Parasuicide History Interview.
behaviors (Sansone, Whitecare, Meier, & Murry,
2001). Additionally, the SHI has been shown to
have an acceptable level of internal consistency
with Cronbach’s α = .80 (Sansone et al., 2006).
The developers have stated that the inventory
could help clinicians identify and distinguish
high-lethality and low-lethality self-injury.
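As supplementary context (not part of the original article): Cronbach's alpha, the internal-consistency statistic reported for the SHI (α = .80), is conventionally computed for a k-item scale as
\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right),
\]
where \(\sigma^{2}_{Y_i}\) is the variance of item \(i\) and \(\sigma^{2}_{X}\) is the variance of the total score; by common rule of thumb, values of about .70 or above are read as acceptable internal consistency, which is why .80 is described as acceptable here.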
Self-Injury Questionnaire (SIQ)
The SIQ was developed by Alexander (1999)
and later evaluated by Santa Mina, Gallop,
and Links (2006). This inventory was created
to evaluate and differentiate the intentions behind
self-injurious behaviors as based on a history
of childhood physical and/or sexual abuse. The
questionnaire was developed using a guiding
definition of self-injury as simply self-destructive
behaviors without the intent to die. Preliminary
findings of the initial research study that used
the SIQ showed good face validity and ade-
quate test-retest reliability in nonclinical
populations. Test-retest reliability over a 2-week
period of the behavioral items ranged from
r = .29 to r = 1.0, with a total correlation of
test-retest of r = .91 (Alexander, 1999). A sep-
arate study also revealed similar results for
the SIQ in acute populations, with the addi-
tion of statistical analysis resulting in findings
of high internal consistency of the total scale
(α = .83; 95% Confidence Interval [CI]) and an
adequate Cronbach’s alpha for each subscale
(α = .72 to .77) (Santa Mina et al., 2006).
Convergent validity analyses were also con-
ducted by Santa Mina et al. (2006) between
the SIQ and the Suicide Intent Scale (SIS), the
Beck Depression Inventory II (BDI II), and the
Self-Inflicted Injury Severity Form (SIISF).
The convergent validity between the SIQ and
the scales was reported to be r = –.37 with
regard to the factor of stimulation and the SIS,
r = .23 with regard to the affect regulation
factor of the SIQ as compared to the BDI II,
and r = –.25 with regard to the dissociation
factor of the SIQ and the SIISF. The SIQ is a
30-item self-report instrument conceptualized
from developments in trauma research. This
questionnaire measures the intent of self-injury
through evaluation methods across various
subscales, including body alterations, indirect
self-injury, failure to care for oneself, and
overt self-injury. The SIQ measures the func-
tions, types, and frequency of self-injuring
behaviors in association with a trauma history.
Questions on the SIQ related to agreement to
engagement in behaviors such as tattooing
and the frequency and number of self-injurious
acts related to these behaviors. Following each
behavioral item, if agreement was stated, par-
ticipants were then asked to circle further
items related to the reason contributing to the
behavior. At the time of this publication, this
inventory was yet to be tested in a clinical
setting; therefore, its efficacy with regard to
counseling is unclear and needs to be tested
further.
Deliberate Self-Harm Inventory (DSHI)
The DSHI was developed using an integrated
definition of self-injury in order to help pro-
vide a clear foundation for the instrument, given
that previous assessments lacked consensus
in definition (Gratz, 2001). It is based on the
notion that self-harm is the deliberate, direct
destruction of body tissue without conscious
suicidal intent but results in injury severe enough
for tissue damage to occur (Fliege et al., 2006).
This measure evaluates various features of self-
injury, including frequency, severity, duration,
and types of self-injurious behaviors. The
inventory consists of 17 items that are behav-
iorally based and reliant on self-report. The
DSHI has been found to be reliable and valid
for assessing self-injury and past suicidal
behaviors (Gratz, 2006; Gratz & Chapman,
2007; Gratz et al., 2002; Lundh, Karim, &
Quilisch, 2007), with adequate internal reliabil-
ity at α = .62 (Fliege et al., 2006) and adequate
test-retest reliability during a 2- to 4-week
period of φ = .68 (p = .001) (Gratz, 2001). In
the study by Gratz (2001), adequate construct,
convergent, and discriminant reliability was
also found. This assessment is in wide use,
and its brief length lends itself to application
in clinical and outpatient settings. This assess-
ment could be useful in mental health as well
as school settings to determine the need,
immediacy, and level of intervention needed
with regard to a client or student presenting
self-injurious behaviors.
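A supplementary interpretive note (not from the article): the φ reported for the DSHI's test-retest reliability is most naturally read as a phi coefficient, the form the Pearson correlation takes when both measurements are dichotomous (e.g., a behavior endorsed or not endorsed at each administration). For a 2 × 2 table with cell counts a, b, c, and d,
\[
\phi = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}},
\]
so φ = .68 indicates reasonably consistent yes/no reporting across the 2- to 4-week interval.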
Suicide Attempt Self-Injury
Interview (SASII)
The SASII was designed to evaluate factors
involved in what the authors referred to as
“nonfatal suicide attempts and intentional self-
injury” (Linehan, Comtois, Brown, Heard, &
Wagner, 2006, p. 304). This measure, once
referred to as the Parasuicide History Inven-
tory, was developed to better understand the
methods involved in self-injury—the motiva-
tions, consequences, ritual, and impulsivity of
the act itself. Its validity and reliability mea-
sures were taken using an inpatient population.
In defining suicidal behavior, this instrument
includes all general definitions pertaining to
parasuicide, fatal and nonfatal suicide, and
self-injury without the intent to die. Therefore,
it does provide descriptive details about self-
injurious and suicidal behaviors but does not
differentiate between the two beyond lethality.
This instrument has been used in several
recent studies that confirm its usability and
importance in assessing the multiple aspects
of suicidal and self-injurious behaviors
(Brown, Comtois, & Linehan, 2002; Koons
et al., 2001). Six scales were developed based
on factor analysis with factors loading at .4 or
above. These six scales evaluated lethality of
the method, suicidal and nonsuicidal intent
associated with an episode, communication of
suicide intent prior to the episode, impulsiv-
ity, physical condition, and level of medical
treatment. The assessment showed high inter-
rater reliability at r = .918 for classification of
suicidality components and r = .843 for epi-
sodes classified as a single event versus a
cluster of self-injurious events (Linehan
et al., 2006). The SASII instrument is useful
in that it provides a rating concerning the
lethality of the act in question in terms of
several com ponents including medical and
other con sequences. This instrument can also
be used to evaluate treatment outcomes
through pre- and postassessment.
Self-Injury Implicit Association
Test (SI-IAT)
The SI-IAT was developed by Nock and Banjai
(2007) to assess self-injury in terms of the
identity with and beliefs surrounding the act
itself. This test was based on the Implicit Asso-
ciation Test (IAT), developed by Greenwald,
McGhee, and Schwartz (1998). To understand
the SI-IAT, it is important to know a little bit
about the test from which it was developed.
The IAT is primarily used for evaluating asso-
ciations to nonclinical constructs and beliefs.
The IAT itself has been shown to have strong
reliability, construct validity, and the capacity
to distinguish clinical changes caused by treat-
ment and attempts to mask feelings. The SI-IAT
was created in order to integrate the advantages
of the IAT in an attempt to assess self-injury
without relying on explicit self-report. The
test measures the implicit associations indi-
viduals have concerning self-injury in terms
of identification with the behavior as well as
attitudes about it.
The research studies conducted by Nock
and Banjai (2007) using the SI-IAT showed
that the assessment was able to strongly predict
recent suicidal ideation and suicide attempts,
with good incremental predictive validity ranging
from .74 to .77 with the participating ado-
lescent population. The assessment could
also distinguish between groups of nonsuicidal
ado lescents who had negative beliefs about
self-injury, adolescents with suicidal ideations
who showed some positive identification, and
adolescents who had attempted suicide while
having strong identification with self-injurious
behaviors. Because of the interpretive nature
of this assessment, it would be important for
counselors to use this in conjunction with mul-
tiple informal assessment techniques to evaluate
the client’s perceptions with regard to his or
her statements. This would help avoid coun-
selor bias in determining the client’s level of
identification with the behaviors. This assess-
ment is also helpful in evaluating how useful the
client views his or her self-injurious behav-
iors in managing symptomology. The level at
which a client integrates self-injury into his or
her identity and views self-injury as assistive
to his or her functioning could drastically affect
the approach and interventions the counselor
ultimately decides to use in the process of
treatment. For example, if the client views
self-injury as an effective coping strategy to
reduce stress, the counselor and client could
explore alternative stress-reduction strategies
in counseling sessions.
Self-Injurious Thoughts and
Behaviors Interview (SITBI)
The SITBI was developed by Nock, Holmberg,
Photos, and Michel (2007) as a 169-item
str uctured interview that assesses the pres-
ence, frequency, severity, age-of-onset, and
general characteristics associated with the
thoughts and behaviors of suicidal ideations
and suicide attempts. The SITBI assumes that,
by definition, self-injury does not include the
intent to die and thus differentiates self-injury
from suicidal intent and action. In assessing the
strengths of the interview as an assessment
tool, the authors found it to have strong inter-
rater reliability (Nock et al., 2007), good
test-retest reliability (average k = .70) after 6
months, good construct validity in relation to
suicide measures and suicide attempts (k =
.65), and concurrent validity with measures of
suicidal ideations and gestures. However, it
did have weak reliability in assessing suicide
gestures and plans. Predictive validity for sui-
cidal ideation or future self-injury was not
addressed in the study conducted by Nock et
al. (2007). It is the belief of the authors that
the interview could be used easily in a variety
of clinical settings to get an overview of current
and recent self-injurious behaviors; however,
because of the length of the assessment, there
are time constraints to consider with regard to
the practicality of its use.
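As supplementary context (not part of the article): the k values reported for the SITBI appear to be Cohen's kappa coefficients, which correct raw agreement (between raters, or between two administrations) for the agreement expected by chance:
\[
\kappa = \frac{p_o - p_e}{1 - p_e},
\]
where \(p_o\) is the observed proportion of agreement and \(p_e\) is the proportion expected by chance; by common convention, kappas around .65 to .70, as reported here, indicate good agreement.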
The self-injury assessment tools that have
been developed over recent years have clear
strengths and weaknesses. For counselors, it is
important to consider the population you are
using before selecting a particular self-injury
assessment tool as well as the setting in which
you will be implementing it. Also, it is critical
to realize that the aforementioned formal assess-
ments are only one piece of the assessment
process. Counselors should never use these
measures in isolation for determining the course
of treatment, outcomes, or need for intervention.
The following section outlines a recommended
approach for assessing self-injury and using
formal assessments in conjunction with addi-
tional evaluation methods.
Comprehensive Assessment
Approach
The need for a comprehensive and multilevel
approach to the assessment and evaluation of
self-injury is clear because of the multifaceted
nature of self-injury. The following section out-
lines a two-tiered process of assessing self-injury.
This process includes the use of both formal and
informal assessment procedures (see Figure 1).
Formal Assessment
The first step in this integrated approach
involves the formal assessment of self-injury
(as introduced above) as well as other possi-
bly related concerns, such as depression,
traumatic history, or anxiety. These mental
health concerns necessitate mentioning
because of independent empirical indications
of association with self-injurious behaviors
(Conaghan & Davidson, 2002; Herpertz, Sass,
& Favazza, 1997; Klonsky & Olino, 2008;
Sansone, Chu, & Wiederman, 2007; Sansone
& Levitt, 2002). Overall, formal assessment
measures allow for more accurate diagnoses
and appropriate evaluation and enhance the
formulation of an informed treatment plan.
Self-Injury assessment measures. Many self-
injury assessment tools are available for
consideration during the implementation of a
formal assessment process as previously pre-
sented (see Table 1). Selecting an appropriate
tool based on population, validity, and reli-
ability is necessary in treating self-injurious
behavior.
Additional formal assessments. Self-injury
rarely occurs in isolation. As stated previously,
many mental health disorders coexist with
self-injury. Thus, a combination of formal
assessments is fundamental, as it is imperative
to examine the intent behind each act of self-
injury to carefully evaluate which elements of
concern or distress are present for each unique
individual. Because of the complex nature of
self-injury, the more accurate the evaluation,
the better suited and successful the treatment
will be (White Kress, 2003). Thus, it would
behoove counselors to also use standardized
assessments that evaluate areas such as (but
not limited to) suicide, trauma, depression,
anxiety, and eating disorders. The following
are examples of assessments that could address
these indicators. Although this list is not com-
prehensive, other assessments may be selected
and should be matched to the unique needs of
the client:
[Figure 1. Two-tier model of assessment. Tier One, Formal Assessment (self-injury assessment/inventory; suicidality protocol/inventory; trauma inventory; Beck Depression Inventory; anxiety scales), used in combination with Tier Two, Informal Assessment (all ongoing): background, familial history, peer support, social support, negative/positive influences, emotional capacity, verbal ability to express emotions, and coping strategies.]
• Suicidality Protocol/Inventories: that
is, Inventory of Suicide Orientation-30,
Beck Suicide Inventory, Reasons for
Living Inventory, Hopelessness Scale,
Scale for Suicide Ideation, Suicide
Probability Scale, Suicide Ideation
Questionnaire, and Suicide Probabil-
ity Scale
• Trauma Inventories: that is, Early
Trauma Inventory, Trauma Coping
Inventory, Trauma Symptom Inven-
tory, Trauma Assessment Inventories
• Depression Inventories: that is,
Inventory of Depressive Symptoma-
tology, BDI, Children’s Depression
Inventory, Major Depression Inven-
tory, Inventory of Depression and
Anxiety Symptoms, Zung Self-Rating
Depression Scale
• Anxiety Inventories: that is, Beck
Anxiety Inventory, Spielberger State-
Trait Anxiety Scales, Anxiety Status
Inventory
• Eating Disorder Inventories: that is,
Eating Disorders Inventories, Eating
Attitudes Test, Eating Disorder Exa-
mination, and additional measures
suited for the particular client
The aforementioned formal assessments vary
according to reliability and validity. Thus, prior
to selecting a measure, it is important to exa-
mine its strengths as well as the population
being served.
Informal Assessment
The second step in this approach involves
using informal assessment measures. Infor-
mal assessment techniques are subjective and
provide counselors with additional tools for
understanding clients (Neukrug & Fawcett,
2005). The majority of informal assessments
are used in a formative evaluative manner, rather
than through a pretreatment or posttreatment
(summative) evaluation. Informal assessment
techniques combined with formal assessments
allow the clinician to gain a comprehensive,
holistic, and in-depth understanding of the
client and his or her presenting concerns. For
example, gaining an understanding of past and
current familial and relational connections as
well as relational conflicts could lead to greater
insight into the client’s reasoning for his or
her self-injurious behaviors and the structure
of his or her current support network. With all
informal assessment techniques, it is neces-
sary to consistently be aware of cultural context
and how this could be a factor for each client.
Although many techniques can be used to con-
duct informal assessments, only those most
pertinent to the treatment of self-injurious behav-
iors are addressed in this section.
Intakes. Many informal assessment measures
exist and should be used during intake and
also throughout the treatment process for each
individual. At intake, it is important to add a
section or line dedicated to self-injury. This is
an area that is often left off of intakes and is
important in the initial assessment. For exam-
ple, “Have you ever intentionally hurt yourself
for any reason?”
Interviews. Parent and teacher interviews are
a great tool to access valuable information
about your client and his or her experiences
with self-injury. Although many individuals go
to great lengths to hide their self-injury from
parents and teachers, valuable information
can be garnered from speaking with these
individuals, as they may play an important
role in the client’s self-injury and might also
serve as an ally for the client as he or she
explores issues related to his or her behaviors
in counseling. Some questions that might
garner useful treatment information include
the following: “Is the client’s behavior consis-
tent at home and school?” “Does the client
engage in isolative behaviors?” “How does the
client normally express his or her feelings or
needs?” “What type of internalizing or exter-
nalizing behaviors are the parents or teachers
aware of in your client?”
Observations. Observations are an important
assessment tool, providing counselors with an
additional mechanism for understanding the
client (Neukrug & Fawcett, 2005). Although
not all clients who self-injure present in the
same way, there may be consistent behaviors,
appearances, or nuances that could provide
counselors with helpful information to sup-
plement their understanding of the client. For
example, a client who self-injures may often-
times wear clothes that hide his or her injuries
or have many unexplained cuts, scars, or burns
(White Kress, Gibson, & Reynolds, 2004).
Additionally, clients may avoid conversations
about self-injury or deny their personal expe-
riences with self-injury.
Background information. Acquiring back-
ground information is a vital aspect of self-
injury assessment and can potentially provide
the counselor with valuable information about
the contributing factors related to the client’s
self-injurious behavior. When obtaining back-
ground information, it is necessary to focus
on all aspects of the individual and not limit
the assessment to the behavior itself. This knowl-
edge provides counselors with valuable
information about what lies beneath the sur-
face of the wounds, a focus of treatment that
has been ignored in the past (Craigen &
Foster, 2009; Walsh, 2006).
Familial history is one aspect of background
information that is often overlooked. Gather-
ing information about an individual’s family
history avoids pathologizing the behavior and
views the presenting behaviors through more
of a systemic lens. Seeking to understand all
contributing factors such as a client’s per-
spectives and experiences regarding his or her
family might not have been considered in the
past; however, it is necessary (McAllister, 2003;
Selekman, 2002). For example, the counselor
may ask, “Who do you talk to in your family
about your feelings?” “How does your family
typically deal with their emotions?” “What
feelings do you have for different members of
your family?” or “What events in your past
family history have affected you negatively?”
In addition to familial information, it is also
important to discuss with the client his or her
peer and social supports (Walsh, 2006). This
is particularly relevant in the adolescent popu-
lation because at this developmental milestone,
peer supports are highly valued. For example,
counselors may say, “Tell me about your
friends.” Or they may ask, “When you are
upset, do you typically talk with your friends?”
“Do your friends know about your self-injurious
behavior?” Other factors that affect the indivi-
dual and need to be assessed are negative
or positive influences that could facilitate
self-injury. These could include Internet sites
dedicated to perpetuating self-injurious behav-
ior, friends who self-injure, and/or media role-
models who self-injure or have self-injured.
Emotional capacity. Evaluating the emotional
capacity of the individual using informal
assessment techniques is an essential process
in developing effective treatment interven-
tions and conceptualizing the issues related to
the self-injurious behaviors. Examining an indi-
vidual’s ability to outwardly express and
understand his or her feelings involves an
ongoing process of assessment, evaluation,
and treatment with clients who self-injure.
One’s ability to express emotions is a concern
for many but particularly those who self-injure.
Since this is the case, it may be important to
ask clients, “If your wounds could speak,
what would they say about you?” (Levenkron,
1998). Additionally, basic questions that assess
one’s feelings voc abulary can also be benefi-
cial in the informal assessment process.
Coping strategies. In addition to assessing
the emotional capacity of clients who self-injure,
coping strategies can also be assessed by using
informal assessment techniques and can be
incorporated in any treatment approach for
those who self-injure. For example, it may be
important to ask clients, “What do you do
when you feel angry, anxious, or upset?” or
“What function does self-injury serve for you?”
These two questions allow the counselor to
examine how and to what extent self-injury
serves as a maladaptive coping strategy for
clients presenting with self-injurious behaviors.
Typically, the use of self-injury is seen as an
effective method for dealing with overwhelm-
ing emotions associated with traumatic memories
or other issues occurring in the client’s life
(Gratz, 2007). Therefore, it is necessary to
determine how invested the client is in the
counseling process and how interested he or
she is in working toward a change with regard
to this pattern of behavior. Clients may be
fearful that any attempt to alter their current
way of coping could result in an increased
level of instability that would result in hospi-
talization or worse. Evaluating the fear and
anxiety clients may be associating with change
could be critical in determining an effective
treatment approach. Determining a client’s
concerns, commitment, and understanding with
regard to the counseling process is an integral
component of any assessment process and is
particularly crucial with regard to the issue of
self-injury.
Synthesis of Approaches
This article serves to illuminate the benefits of
both a formal and informal approach to assess-
ing self-injury. Although each approach is
important, the integration of both approaches is
vital (see Figure 1). In the comprehensive
two-tiered model of assessment, the formal ass-
essments serve as the first step in evaluating
self-injury; formal assessments provide coun-
selors with a standardized and quantifiable
way of determining the seriousness of the
problem and can also reflect progress or regres-
sion in treatment. The informal assessments,
as described above, serve to support, enhance,
and depict a comprehensive view of self-
injury. In addition to using the perspectives of
others, the informal assessment also widens
the lens in which self-injury has been examined
in the past. Although the formal assessments
focus on the behavior of self-injury, the infor-
mal assessments examine context, background,
and emotional capacities. Thus, although both
approaches are important, counselors will ben-
efit from using them in tandem when assessing
self-injury to focus treatment and hopefully
improve short- and long-term outcomes.
Counselor Implications
Counselors will inevitably encounter individ-
uals who self-injure, creating instances whereby
they may have a responsibility to properly
assess and evaluate self-injury in their clients.
Although the assessment of self-injury is
clearly in the early stages, further research on
new and established assessment tools is
needed. Conceptualization of self-injurious
behaviors is multidimensional; therefore,
assessment of these behaviors needs to be
complementary. For mental health profession-
als, an accurate assessment that addresses frequency,
severity (tissue damage and intention), dura-
tion, type, thoughts and attitudes, and age of
onset is essential to treatment. Professionals
must also be aware of culture when assessing
those who self-injure. Cultural considerations
would include, but not be limited to, family
experiences, religion, ethnicity, and gender.
Additionally, qualitative research methods
that examine counselors’ and client’s percep-
tions about self-injury assessment tools as well
as their perceived usefulness could be helpful.
In addition, cultural considerations need to
be included in current research. Cultural dimen-
sions may contribute to the variability of
accurately assessing those who self-injure,
which would eventually affect treatment. In
addition to research, counselors must begin to
expand their knowledge base on the topic of
assessment and self-injury. Because the defi-
nition of self-injury continues to be debated,
which affects the consistency of assessment,
further research is needed in this area.
Trainings that increase awareness about
self-injury assessment scales are imperative.
Because suicide is often discussed in counselor
education programs, incorporating self-inju-
rious behavior into the curriculum could be a
way to dialogue about this topic. By encom-
passing self-injurious behavior into counseling
programs, students will be exposed to charac-
teristics and features of this behavior that are
vital to assessment and intervention. In addi-
tion, training may also be in the form of
community-wide or in-service trainings that
focus on assessment. Training and practice
must address the numerous difficulties in assess-
ment of self-injury, such as varied nomenclature,
conflicting theoretical definitions, and incon-
sistencies with other disorders. In addition,
training must include the comprehensive
assessment approach, which includes formal
and informal assessment measures. On a broader
level, the topic of self-injury and assessment
should be presented at local, regional, and
national counseling conferences.
Given the review of the current self-injury
assessments, there are notable limitations and
weaknesses within these scales. For example,
all of the reviewed inventories were either
developed in conjunction with a diagnosis of
BPD or they assessed a component of suicidal
ideation. Furthermore, the assessments reviewed
failed to consider cultural context and were
normed on homogeneous samples, ignoring
diverse populations. Thus, to accurately assess
self-injury, it is imperative for counselors
and researchers to develop a scale that (a) is
normed on a heterogeneous sample, (b) is inde-
pendent from the criteria of BPD, and (c)
evaluates self-injury without the inclusion of
suicidal ideations. The development of a scale
like this would benefit clinicians and clients
and would contribute greatly to the accurate
assessment of self-injury.
Summary
The topic of assessment and self-injury is
quickly beginning to gain attention among
mental health professionals and researchers.
Although there are several assessment tools
available to counselors, many have method-
ological flaws (e.g., low reliability and validity
and lack of factor analytic procedures) and are
used solely for a distinct population of indi-
viduals who self-injure. Prior to selecting a
formal self-injury assessment, it is important
to examine the strength of the assessments as
well as the population being served. Addi-
tionally, it is important never to use one
instrument in isolation. Combining additional
formal assessments and using many informal
assessment methods throughout the counsel-
ing relationship is imperative. Future research
and training on the topic of self-injury is clearly
needed.
Declaration of Conflicting Interests
The authors declared no potential conflicts of inter-
ests with respect to the authorship and/or
publication of this article.
Financial Disclosure/Funding
The authors disclosed receipt of the following
financial support for the research and/or authorship
of this article: Institute for the Study of Disadvan-
tage and Disability awarded a student research
honorarium to the second author.
References
Alexander, L. (1999). The functions of self-injury
and its link to traumatic events in college students.
UMI Dissertation Services (UMI No. 9932285).
Briere, J., & Gil, E. (1998). Self-mutilation in clini-
cal and general population samples: Prevalence,
correlates, and functions. American Journal of
Orthopsychiatry, 68(4), 609–620.
Brown, M. Z., Comtois, K. A., & Linehan, M. M.
(2002). Reasons for suicide attempts and non-
suicidal self-injury in women with borderline
personality disorder. Journal of Abnormal Psy-
chology, 111, 198–202.
Conaghan, S., & Davidson, K. M. (2002). Hope-
lessness and the anticipation of positive and
negative future experiences in older parasui-
cidal adults. The British Journal of Clinical
Psychology/The British Psychological Society,
41(3), 233–242.
Craigen, L., & Foster, V. (2009). A qualitative
investigation of the counseling experiences of
adolescent women with a history of self-injury.
Journal of Mental Health Counseling, 31(1),
76–94.
Favazza, A. R. (1996). Bodies under siege: Self-
mutilation and body modification in culture
and psychiatry (2nd ed.). Baltimore: The Johns
Hopkins University Press.
Fliege, H., Kocaleventa, R. D., Waltera, O. B.,
Becka, S., Gratz, K. L., Gutierrez, P. M., et al.
(2006). Three assessment tools for deliberate
self-harm and suicide behavior: Evaluation and
psychopathological correlates. Journal of Psy-
chosomatic Research, 61, 113–121.
Gratz, K. L. (2001). Measurement of deliberate self-
harm: Preliminary data on the deliberate self-
harm inventory. Journal of Psychopathology
and Behavioral Assessment, 23(4), 253–263.
Gratz, K. L. (2006). Risk factors for deliberate
self-harm among female college students: The
role and interaction of childhood maltreatment,
emotional inexpressivity, and affect intensity/
reactivity. American Journal of Orthopsychia-
try, 76(2), 238–250.
Gratz, K. L. (2007). Targeting emotion dysregula-
tion in the treatment of self-injury. Journal of
Clinical Psychology, 63(11), 1091–1103.
Gratz, K. L., & Chapman, A. L. (2007). The role of
emotional responding and childhood maltreat-
ment in the development and maintenance of
deliberate self-harm among male undergradu-
ates. Psychology of Men and Masculinity, 8(1),
1–14.
Gratz, K. L., Conrad, S. D., & Roemer, L. (2002).
Risk factors for deliberate self-harm among col-
lege students. American Journal of Orthopsy-
chiatry, 72(1), 128–140.
Greenwald, A. G., McGhee, D. E., & Schwartz,
J. L. K. (1998). Measuring individual differ-
ences in implicit cognition: The implicit asso-
ciation test. Journal of Personality and Social
Psychology, 74, 1464–1480.
Herpertz, S., Sass, H., & Favazza, A. (1997).
Impulsivity in self-mutilative behavior: Psy-
chometric and biological findings. Journal of
Psychiatric Research, 31(4), 451–465.
Iwata, B. A., Pace, G. M., & Kissel, R. C. (1990).
The self-injury trauma (SIT) scale: A method for
quantifying surface tissue damage caused by
self-injurious behavior. Journal of Applied
Behavior Analysis, 23(1), 99–110.
Klonsky, E. D., & Olino, T. M. (2008). Identify-
ing clinically distinct subgroups of self-injurers
among young adults: A latent class analysis.
Journal of Consulting and Clinical Psychology,
76(1), 22–27.
Koons, C. R., Robins, C. J., Tweed, J. L., Lynch,
T. R., Gonzalez, A. M., Morse, J. Q., et al.
(2001). Efficacy of dialectical behavior therapy
in women veterans with borderline personality
disorder. Behavior Therapy, 32, 371–390.
Levenkron, S. (1998). Understanding and overcom-
ing self-mutilation. New York: W.W. Norton.
Linehan, M. M., Comtois, K. A., Brown, Z. M.,
Heard, H. L., & Wagner, A. (2006). Suicide
attempt self-injury interview (SASSI): Develop-
ment, reliability, and validity of a scale to assess
suicide attempts and intentional self-injury. Psy-
chological Assessment, 18(3), 303–312.
Lundh, L. G., Karim, J., & Quilisch, E. V. A. (2007).
Deliberate self-harm in 15-year-old adolescents:
A pilot study with a modified version of the
Deliberate Self-Harm Inventory. Scandinavian
Journal of Psychology, 48, 33–41.
McAllister, M. (2003). Multiple meanings of self-
harm: A critical review. International Journal of
Mental Health Nursing, 12, 175–185.
McDonough, M., Hillery, J., & Kennedy, N. (2000,
December). Olanzapine for chronic, stereotypic
self-injurious behaviour: A pilot study in seven
adults with intellectual disability. Journal of
Intellectual Disability, 44(6), 677–684.
Muehlenkamp, J. J., & Gutierrez, P. M. (2004).
An investigation of differences between self-
injurious behavior and suicide attempts in a
sample of adolescents. Suicide & Life-Threat-
ening Behavior, 34(1), 12–23.
Neukrug, E., & Fawcett, C. (2005). Essentials of
testing and assessment: A practical guide for
counselors, social workers, and psychologists.
Belmont, CA: Thomson Brooks/Cole.
Nock, M. K., & Banjai, M. R. (2007). Prediction
of suicide ideation and attempts among ado-
lescents using a brief performance-based test.
Journal of Consulting and Clinical Psychology,
75(5), 707–715.
Nock, M. K., Holmberg, E. B., Photos, V. I., &
Michel, B. D. (2007). Self-injurious thoughts
and behaviors interview: Development, reliabil-
ity and validity in an adolescent sample. Psy-
chological Assessment, 19(3), 309–317.
Nock, M. K., & Prinstein, M. J. (2005). Contex-
tual features and behavioral functions of self-
mutilation among adolescents. Journal of Abnor-
mal Psychology, 114, 140–146.
Ross, S., & Heath, N. L. (2002). A study of the
frequency of self-mutilation in a community
sample of adolescents. Journal of Youth and
Adolescence, 31, 67–77.
Sansone, R. A., Chu, J., & Wiederman, M. (2007).
Self-inflicted bodily harm among victims of
intimate-partner violence. Clinical Psychology
& Psychotherapy, 14(5), 352–357.
Sansone, R. A., & Levitt, J. L. (2002). Self-harm
behaviors among those with eating disorders:
An overview. Eating Disorders, 10(3), 205–213.
Sansone, R. A., Songer, D. A., Douglas, A., &
Sellbom, M. (2006). The relationship between
suicide attempts and low-lethal self-harm behav-
ior among psychiatric inpatients. Journal of
Psychiatric Practice, 12(3), 148–152.
Sansone, R. A., Whitecare, P., Meier, B. P., &
Murry, A. (2001). The prevalence of borderline
personality among primary care patients with
chronic pain. General Hospital Psychiatry, 23(4),
193–197.
Sansone, R. A., Wiederman, M. W., & Sansone, L. A.
(1998). The self-harm inventory (SHI): Devel-
opment of a scale for identifying self-destructive
behaviors and borderline personality disorder. Jour-
nal of Clinical Psychology, 54(7), 973–983.
Santa Mina, E. E., Gallop, R., & Links, P. (2006). The
self-injury questionnaire: Evaluation of the psy-
chometric properties in a clinical population. Jour-
nal of Mental Health Nursing, 13(2), 221–227.
Selekman, M. D. (2002). Living on the razor’s
edge: Solution-oriented brief family therapy
with self-harming adolescents. New York:
Norton.
Symons, F. J., & Danov, S. E. (2005). A prospec-
tive clinical analysis of pain behavior and self-
injurious behavior. Pain, 117, 473–477.
Walsh, B. W. (2006). Treating self-injury: A practi-
cal guide. New York: Guilford.
White Kress, V. E. (2003). Self-injurious behaviors:
Assessment and diagnoses. Journal of Counsel-
ing and Development, 81, 490–496.
White Kress, V. E., Gibson, D. M., & Reynolds,
C. A. (2004). Adolescents who self-injure:
Implications and strategies for school coun-
selors. Professional School Counseling, 7(3),
195–201.
Bios
Laurie M. Craigen, PhD, LPC, is an assistant pro-
fessor in the Department of Counseling and Human
Services at Old Dominion University in Norfolk,
Virginia. She also works as a Licensed Professional
Counselor at Southside Counseling Center in Suf-
folk, VA. Laurie is actively involved in research on
mental health concerns in women, particularly with
self-injurious behavior. Additionally, she has pre-
sented at local, regional, and national conferences
on the topic of self-injury and is an Assistant Editor
of Human Service Education.
Amanda C. Healey, PhD, LPC-MHSP, NCC, is
currently a temporary fulltime counseling program
faculty member at East Tennessee State University.
She is involved in research pertaining to issues of
self injurious behaviors, professional identity
development in counseling, and burnout in mental
health and has published on these topics. Amanda
works from an Adlerian-Feminist perspective and
this is reflected in her professional and scholarly
activities.
Cynthia T. Walley, PhD, NCC, is an Assistant
Professor in the Educational Foundations and
Counseling Department at Hunter College in New
York, NY. Dr. Walley’s research interest include,
school counseling preparation, adolescent mental
health, and assessment and diagnosis.
Rebekah Byrd, MSEd, LPC, NCC, is a doctoral
candidate at Old Dominion University in Norfolk,
Virginia. She currently works as the Director of
CARE NOW, a middle school based Character
Education Program and also serves as President for
the ODU chapter of Chi Sigma Iota. Rebekah
supervises master’s counseling students and teaches
undergraduate and master’s classes. Over the last
year she has published two book chapters and two
articles; presented at the national, regional, and
state level and won a competitive research grant.
Jennifer Schuster, MEd, is a 2009 graduate of
the Master's Program in School Counseling at Old
Dominion University. Jennifer is currently work-
ing as a school counselor in Newport News,
Virginia and continues to engage in research proj-
ects at Old Dominion University.
Personal Experience Inventory for Adults
Review of the Personal Experience Inventory for Adults by MARK D. SHRIVER, Assistant Professor, University of Nebraska Medical Center, Omaha, NE:
The stated purpose of the Personal Experience Inventory for Adults (PEI-A) is to function as “a comprehensive, standardized self-report inventory to assist in problem identification, treatment referral, and individualized planning associated with addressing the abuse of alcohol and other drugs by adults” (p. 3). It is designed to yield “comprehensive information about an individual’s substance abuse patterns and problems [and] . . .also helps to identify the psycho-social difficulties of the referred individual” (p. 3). The PEI-A is developed from, and an extension of, the PEI (1989) which is a self-report measure of drug and alcohol use for adolescents ages 12 to 18 (see Mental Measurements Yearbook (1992) 11:284 for reviews by Tony Toneatto and Jalie A. Tucker).
The manual states that the PEI-A “was designed primarily as a clinical descriptive tool for use by addiction professionals” and that it “is intended to supplement a comprehensive assessment process” (p. 5). Specifically, the PEI-A was developed to measure the following characteristics:
1. The presence of the psychological, physiological, and behavioral signs of alcohol and other drug abuse and dependence.
2. The nature and style of drug use (e.g., consequences, personal effects, and setting).
3. The onset, duration, and frequency of use for each of the major drug categories.
4. The characteristics of psychosocial functioning, especially factors identified as precipitating or maintaining drug involvement and expected to be relevant to treatment goals.
5. The existence of behavioral or mental problems that may accompany drug use (e.g., sexual abuse or co-addiction).
6. The sources of invalid response tendencies (e.g., “faking bad,” “faking good,” inattention, or random responding).
The appropriate use of any measure for its intended purposes depends on its sample representation, reliability, and validity. This information is reviewed below with respect to the PEI-A's stated purposes.
TEST ADMINISTRATION. The manual provides easy-to-read instructions on test administration, which will assist with increasing standardization of the administration. The manual states that the reading level of the measure is approximately sixth grade (p. 7), and the test may be read to clients with lower reading abilities. No discussion is presented, however, regarding whether this type of administration occurred during norming of the test, or how this type of administration may affect the reliability or validity of the self-report measure, as the examiner is directly involved with administration, which runs counter to one of the intended goals for test development (p. 29).
The test is scored by computer, either through a mail-in service, a FAX service, or by computer disk. The mail-in service typically requires approximately 3-5 working days to return scores (p. 75). This may be too long in some clinical settings. Interpretations of each scale are provided in the computer-generated report.
TEST DEVELOPMENT. The content of the PEI-A is largely derived from the PEI. A panel of experts is reported to have examined the items on the PEI and made changes where necessary to adapt the items to an adult population. The panel of experts was composed of “Groups of researchers and drug treatment service providers” (p. 30), but no further indication is provided in the manual about who these individuals are, where they are from, and what respective experience/expertise they have in item/test development. Item selection and scale development proceeded on a “rational basis” (p. 30). In addition to the PEI, a drug use frequency checklist adapted from adolescent and adult national survey instruments was incorporated into the drug consumption section of the PEI-A.
The initial items and scales of the PEI-A were examined with a sample of 300 drug clinic subjects (150 males, 150 females) for internal scale consistency (alpha coefficients) and interscale correlations. Correlations between Problem Severity Scales "were somewhat higher than desired for scales intended to contribute substantial unique and reliable information about the respondent, ranging from .55 to .92" (p. 33). Alpha coefficients for individual scales were good, typically within the .75 to .93 range (pp. 32-33). Following examination of the correlations on this initial sample of subjects, "only minor adjustments in item assignment" were made in scales (p. 33).
The content of the Problem Severity Scales is described as “multidimensional, oriented around signs and symptoms of drug abuse and dependence, and not anchored in any single theoretical model” (p. 30). Review of the items on the measure suggests that the content appears appropriate for the purposes listed above. Review of the empirical evidence for sample comparisons, reliability, and validity will help determine if items originally developed for adolescents and refined for adults based on unknown expert opinion are truly valid for adults.
The test items were not analyzed statistically for possible bias. The test was examined for differences in internal consistency across gender and race, and significant differences were not found; however, predictive validity and differential decision making across gender and race have not yet been examined. Differences in the primary language of the subjects were not discussed, and it is difficult to determine how language (i.e., not primarily English) might affect responses (written or oral) on the PEI-A.
SAMPLES FOR TEST VALIDATION AND NORMING. Three samples were chosen for test norming: 895 drug clinic clients from Minnesota, Illinois, Washington, California, Missouri, and Ontario, although specific numbers from each state are not provided; 410 criminal offenders, most from Minnesota; and 690 nonclinical participants, all from Minnesota. All sample subjects were volunteers. No discussion is provided in the manual regarding the possible impact on client self-report due to the sample selection process, and whether valid interpretation of the results can be made with individuals who may be tested under some type of coercion such as for court-ordered treatment. Sample demographic information is provided in the manual regarding mean age, age range, gender, minority, percent in prior treatment, marital status, employment status, and education (p. 36).
Scores from the measure are compared with the drug clinic sample in the form of T scores; however, score comparisons (T scores) are also provided at the end of the computerized report for the nonclinic sample. Given the restricted geographic sampling of the nonclinic group, it is difficult to determine if this comparison provides useful information for individuals who are not from Minnesota. It is also unclear if the nonclinic sample participated by mail as described on page 35 of the manual or through group testing as described on page 36 of the manual. Both contexts for test taking are somewhat different from the typical administration (e.g., individualized, in drug clinic) described in the manual and may limit score interpretations even further.
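To make the T-score comparisons concrete, a brief illustrative sketch of the standard linear conversion follows; the raw score, reference mean, and reference standard deviation are invented numbers, not PEI-A norms, and the published norm tables should be used in practice.

    def to_t_score(raw, ref_mean, ref_sd):
        # T = 50 + 10 * z, where z is the raw score's distance from the
        # reference (e.g., drug clinic) mean in standard deviation units
        return 50 + 10 * (raw - ref_mean) / ref_sd

    print(round(to_t_score(raw=34, ref_mean=28.0, ref_sd=6.0), 1))   # 60.0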
The drug clinic sample is described in terms of two groups: outpatient and residential treatment. Only 37.5% of the outpatient drug clinic sample is female (approximately 188). Only 36.3% of the residential drug clinic sample is female (approximately 143). Separate T scores are provided for male and female samples (p. 36). Only approximately 113 members of the outpatient drug clinic sample are of an ethnic minority status, and approximately 68 members of the residential drug clinic sample are of an ethnic minority status. Minority is not defined further (e.g., African American, Hispanic, Native American), although it is conceivable that minority status may differentially impact drug use. In addition, although the gender representation may be an accurate reflection of general population drug use, the small sample size for females limits normative comparisons. Reported drug use patterns may also differ by gender.
Information is not provided on whether the norm groups come from rural or urban settings. A rural or urban context may impact drug use (i.e., availability of drugs). Also, specific numbers are not provided relative to the geographic regions from which the drug clinic samples originate, and as indicated by the authors (p. 36), geographic region may impact reported drug use (i.e., higher cocaine use in California and Washington reported relative to Midwest states and Ontario).
In summary, caution is advised in using the PEI-A with females, minorities, and individuals from geographic regions other than those sampled. In addition, the comparison with nonclinic population may not be useful for individuals outside of Minnesota. The test norms appear to be useful for comparing Caucasian males with possible drug use history with the drug clinic sample.
RELIABILITY. Internal consistency reliabilities are provided (coefficient alpha) for the entire sample and provided for male, female, white, and minority samples (pp. 37-41). In addition, test-retest reliabilities are presented for one week and for one month using the drug clinic sample, although there was some intervening treatment between pre- and posttest scores (pp. 42-43). Only reliabilities for the drug clinic and nonclinic samples will be discussed as these represent the primary comparative groups for examinees.
Coefficient alphas are generally good for the Problem Severity Scales (median = .89, range = .72 to .94) and the Psychosocial Scales (median = .81, range = .67 to .91). The coefficient alphas are low for the Validity Scales (median = .63, range = .58 to .77) (p. 37). The authors claim the reliability estimates for the Validity Scales compare favorably with those of other instruments, and this may be true; however, these values are only marginally acceptable given the use to be made of these scores.
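For readers reviewing these statistics, the following minimal Python sketch shows how a coefficient alpha of the kind reported above is computed from a respondents-by-items matrix; the simulated responses are illustrative only and are unrelated to PEI-A items or data.

    import numpy as np

    def cronbach_alpha(item_scores):
        # rows = respondents, columns = items on a single scale
        k = item_scores.shape[1]
        item_vars = item_scores.var(axis=0, ddof=1)        # variance of each item
        total_var = item_scores.sum(axis=1).var(ddof=1)    # variance of scale totals
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    rng = np.random.default_rng(0)
    fake_scale = rng.integers(0, 4, size=(100, 10))        # 100 respondents, 10 items
    print(round(cronbach_alpha(fake_scale), 2))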
Median test-retest reliabilities at one week (70 individuals) were as follows: Problem Severity Scales .71 (.60 to .88), Psychosocial Scales .66 (.55 to .87), and Validity Indices .52 (.40 to .57). One-month test-retest reliabilities were lower, as expected given the intervening treatment (pp. 42-43). Given that some subjects in the one-week test-retest group also received intervening treatment, it can reasonably be said that test-retest reliability has not been adequately examined and no conclusions can be drawn regarding the temporal stability of the test. This makes the measure less useful pre- and posttreatment, as it is difficult to determine whether changes in scores are due to treatment or to a lack of score stability. This conflicts with the authors' conclusion, however, that scores can be compared pre- and posttreatment (p. 43).
In summary, the internal consistency estimates of the Problem Severity Scales and the Psychosocial Scales range from good to acceptable. More research is definitely needed on the stability of the test scores (test-retest) before conclusions can be drawn regarding the test’s usefulness pre- and posttreatment.
VALIDITY. One potential use of this instrument is to determine appropriate treatment options for individuals. Drug clinic subjects (N = 251) were classified into three referral categories: no treatment, outpatient treatment, and residential treatment based on clinical staff ratings. Mean scores on the PEI-A Problem Severity Scales were examined and expected differences in scores were found for the three groups (p. 46). Future researchers, however, may want to look at the contribution the PEI-A provides above and beyond other information used in making referral decisions. In other words, are these mean score differences useful? Also, significant differences in mean scores according to sample group membership (nonclinical, drug clinic, and criminal offender) were also found (p. 46). Again, an empirical examination as to how this information contributes as part of a comprehensive assessment would be useful.
Seven of the Problem Screens were compared with staff ratings to determine sensitivity and specificity of the screens, essentially the degree of agreement regarding the existence of problems (p. 48). For the total sample, there were significant correlations (p < .05) for agreement between the PEI-A and staff ratings for negative ratings (i.e., individual not identified with having problem), but not for positive ratings (i.e., individual identified as having problem) (p. 49).
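The sensitivity/specificity logic described above can be summarized with a small illustrative computation; the screen results and staff ratings below are invented and do not reproduce the PEI-A data.

    def sensitivity_specificity(screen_positive, staff_positive):
        # both arguments are lists of booleans, one entry per client
        pairs = list(zip(screen_positive, staff_positive))
        tp = sum(s and c for s, c in pairs)            # both flag the problem
        tn = sum(not s and not c for s, c in pairs)    # both say no problem
        fn = sum(not s and c for s, c in pairs)        # screen misses a staff-rated problem
        fp = sum(s and not c for s, c in pairs)        # screen flags a problem staff did not
        return tp / (tp + fn), tn / (tn + fp)

    print(sensitivity_specificity([True, True, False, False, True],
                                  [True, False, False, False, True]))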
The Validity Indices were found to correlate as expected with the Minnesota Multiphasic Personality Inventory (MMPI) Validity scales (p. 48).
To assess the construct validity of the scale, correlations with tests purported to measure similar constructs were examined. Moderate correlations were found between the Problem Severity Basic Scale scores and the Alcohol Dependence Scale (.41-.66; p. 44; ADS; Horn, Skinner, Wanberg, & Foster, 1982). Correlations are also provided between Problem Severity Scale scores and the Drug Use Frequency Checklist; however, the Drug Use Frequency Checklist is actually part of the PEI-A, so the usefulness of this information for construct validity is weakened. The Psychosocial Scales of the PEI-A were found to correlate significantly with MMPI scales, suggesting the Psychosocial Scales are measuring psychopathology to some extent (p. 45), but there does not appear to be much differentiation between the PEI-A scales, as all but Rejecting Convention and Spiritual Isolation correlate highly with each of the MMPI scales. Finally, information is provided that "select" PEI-A scales (p. 45) correlate significantly with a Significant Other Questionnaire. However, the Significant Other Questionnaire is also developed from PEI-A items, which again attenuates the meaningfulness of this relationship.
In summary, the validity evidence presented in the manual does not appear to address specifically the intended purposes/applications of the test noted above. The content looks good, but much more empirical research is needed on the validity of this instrument specifically related to the applications for which it is intended. Future research should address whether this instrument contributes significantly (above and beyond other information in a comprehensive assessment) to decision making involved in assessing and treating individuals with alcohol and drug use problems.
SUMMARY. The PEI-A may be most useful for examining alcohol and drug use in white males who are compared with a drug clinic sample. Results of this test are intended to tell the clinician whether an individual is similar to individuals in the drug clinic sample and to provide some information on the impact of drugs on the individual’s life. Caution is urged in using the PEI-A with females and minorities given the small sample sizes. Geographic region and urban-rural differences may also impact reports of drug use and should be considered by the test user. In addition, this test may not be useful for individuals whose primary language is not English. The use of the nonclinic scores for comparisons is questionable for individuals outside Minnesota. Estimates of the internal consistency reliability of the scales and content appear good. Additional research on test-retest reliability is needed. More research on the validity of the PEI-A as part of a comprehensive assessment is needed. The PEI-A looks promising, but users are encouraged to heed the test author’s statement that this test should only be used as part of comprehensive assessment.
REVIEWER’S REFERENCES
Horn, J. L., Skinner, H. A., Wanberg, K., & Foster, F. M. (1982). Alcohol Dependence Scale (ADS). Toronto: Addiction Research Foundation.
Toneatto, T. (1992). [Review of the Personal Experience Inventory.] In J. J. Kramer & J. C. Conoley (Eds.), The eleventh mental measurements yearbook (pp. 660-661). Lincoln, NE: Buros Institute of Mental Measurements.
Tucker, J. A. (1992). [Review of the Personal Experience Inventory.] In J. J. Kramer & J. C. Conoley (Eds.), The eleventh mental measurements yearbook (pp. 661-663). Lincoln, NE: Buros Institute of Mental Measurements.
Review of the Personal Experience Inventory for Adults by CLAUDIA R. WRIGHT, Professor of Educational Psychology, California State University, Long Beach, CA:
The Personal Experience Inventory for Adults (PEI-A) is a standardized self-report instrument for use by service providers in the substance abuse treatment field to assess patterns of abuse and related problems in adult clients (age 19 or older). The two-part, 270-item PEI-A is made up of 10 problem severity scales, 11 psychosocial scales, 5 validity indicators, and 10 problem screens; it parallels in content and form the two-part, 300-item Personal Experience Inventory (PEI; 11:284) developed for use with adolescents (age 18 or younger). A broad theoretical framework, influenced by Alcoholics Anonymous, social learning, and psychiatric models, underlies the development of both inventories. The manual presents a thorough treatment of test development, standardization, and validation procedures along with clear test administration and computer-scoring guidelines and useful strategies for score interpretation. The inventory is written at a sixth-grade reading level. No provisions are made for non-English-speaking test takers.
NORMING PROCEDURES. Norm tables were constructed separately for males and females in two standardization samples (clinical and nonclinical). Normative data were obtained primarily from Midwestern Whites, raising concerns about the generalizability of score interpretations to clients classified as nonwhite. Demographic information presented in the PEI-A manual indicates that 20% of the clinical sample (n = 895) was classified as minority. Clinic respondents attended outpatient and residential Alcoholics Anonymous-based programs at 12 sites (located in 3 midwestern and 2 western states and 1 Canadian province). No rationale was provided for site selection. A total of 690 Minnesota residents comprised the nonclinical sample; 11% were classified as minority. A sample of 410 criminal offenders (77% were male; 68% of the sample was nonwhite) was used to provide data for some validation analyses.
Caution is warranted in applying the PEI-A norms to members of nonwhite groups in either clinical or nonclinical settings. The test developer is to be commended for briefly acknowledging this limitation. Sampling that includes more regions, broader ethnic representation, and types of treatment program sites is essential.
RELIABILITY. For 1,995 respondents, median Cronbach alphas were (a) Problem Severity Scales = .89 (range: .81-.93); (b) Psychosocial Scales = .80 (range: .75-.88); and (c) three of the five Validity Indicators = .70 (range: .65-.73). When subsamples were broken out by gender, ethnicity (white or minority), and setting (nonclinical, drug clinic, or criminal offender), patterns of reliability estimates were comparable to those obtained with the total sample. One-week (n = 58; .42-.78, mdn = .69) and one-month (n = 49; .39-.72, mdn = .52) stability indexes for problem screens were lower than desired due to respondents’ exposure to treatment programs during the test-retest intervals.
CONTENT VALIDATION. Common content validation procedures were followed. Researchers and treatment providers rated PEI items intended for inclusion in the PEI-A with respect to clinical relevance and importance to adult substance abuse. Based upon rater feedback, minor item modifications were made.
CRITERION-RELATED VALIDITY. Concurrent validity evidence for the PEI-A was provided by data comparisons examining the effects on scale scores of (a) treatment history for substance abuse among drug clinic clients (no sample size reported); (b) referral recommendation (no treatment, outpatient, or residential) (N = 251); (c) setting (nonclinical, drug clinic, or criminal offender) (N = 1,978); and (d) DSM-III-R (American Psychiatric Association, 1987) diagnosis of abuse or dependence upon alcohol or drugs (N = 244). The observed group differences obtained from scores on the 10 Problem Severity Scales supported the view that individuals referred to treatment settings (outpatient or residential) had greater problems with higher substance use, dependence, and related consequences of usage compared to those for whom no drug treatment was recommended. The 11 Psychosocial Scales fared less well in distinguishing among the three groups with only three scales (Negative Self-Image, Deviant Behavior, and Peer Drug Use) yielding statistically significant differences. In a separate analysis, scores obtained from a nonclinical subsample (n = 687) were significantly lower on each of the 21 scales (all p < .01) when compared with those from drug clinic (n = 887) and offender (n = 404) groups. For the DSM-III-R Diagnosis comparison, clients identified as dependent on alcohol or drugs had significantly higher scores on the 5 Basic Scales when compared to those classified as abusing these substances.
Although the measure is purportedly used to assist in treatment referral, no predictive validity information was presented linking referral decisions based upon standing on the PEI-A scales and outcome success.
CONSTRUCT VALIDITY. Only modest to moderate levels of construct validity evidence were presented based on correlations between PEI-A Problem Severity Basic scale scores and performance on the Alcohol Dependence Scale (ADS; Horn, Skinner, Wanberg, & Foster, 1982) and the PEI-A Drug Use Frequency Checklist. Moderate coefficients were obtained for a sample of 89 clients indicating that the 5 Basic Scale scores were somewhat related to ADS scores (.52-.63, mdn = .59) and Checklist scores (.41-.66; mdn = .55). For a sample of 213 clinic respondents, correlations among the 11 PEI-A Psychosocial Scales and 9 Minnesota Multiphasic Personality Inventory (MMPI) Scales yielded 62 out of 99 possible coefficients ranging from .20-.69, mdn = .38 (all p < .001) indicating, for the most part, only modest levels of shared variance (4% to 48% explained, mdn = 14%). Moderate coefficients (above the median) were associated with PEI-A scales that deal with personal adjustment issues (e.g., Negative Self-Image, Psychological Disturbance, Social Isolation, and Absence of Goals). PEI-A scale scores dealing with personal values and environmental influences (e.g., Rejecting Convention and Spiritual Isolation) yielded negligible correlations with the MMPI. PEI-A and MMPI validity indicators also were moderately correlated.
Inspection of intercorrelations among the 10 Problem Severity Scales revealed moderate to strong coefficients posing a multicollinearity problem. It is evident from data reported in the manual that the statistical contribution of unique variance to score interpretation associated with each of the 5 Clinical Scales adds little or no unique information (rxys ranged from .04 to .09, mdn = .05). This outcome was consistent with that reported for the same 10 scales of the PEI. The 5 Clinical Scales were retained “because users have found these scales helpful” (manual, p. 33). The retention of redundant scales requires more detailed explanation than that provided in the manual. For future research and test development purposes, targeting items from scales that contribute unique information for provider applications and removing redundant items would strengthen this section of the inventory.
Intercorrelations among the Psychosocial Scales revealed patterns of coefficients more distinctive of a multidimensional scale (as intended) with proportions of unique variance ranging from .18 to .57 (mdn = .29). However, lower reliability estimates and the inability of these scales to distinguish between referral groups is of concern.
SUMMARY. The Personal Experience Inventory for Adults (PEI-A) offers a beginning point to the service provider for assessment. Most PEI-A scale scores demonstrate adequate levels of reliability and distinguish between clinical and nonclinical groups. Current norms may be too restrictive for some settings. Based upon validity evidence provided, caution is warranted in all testing with use of scores from the Clinical Scales, which are redundant with the Basic Scales and with scores from the Psychosocial Scales, which have shown only low to moderate relationships with related constructs. PEI-A computer-generated recommendations for individual clients should be considered in light of these limitations and decisions made in conjunction with other measures.
REVIEWER’S REFERENCES
Horn, J. L., Skinner, H. A., Wanberg, K., & Foster, F. M. (1982). Alcohol Dependence Scale (ADS). Toronto: Addiction Research Foundation.
American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.
Overeating Questionnaire
Review of the Overeating Questionnaire by JAMES P. DONNELLY, Assistant Professor, Department of Counseling, School & Educational Psychology, University at Buffalo, Amherst, NY:
DESCRIPTION. The Overeating Questionnaire (OQ) is an 80-item self-report measure of attitudes and behaviors related to obesity. In the test manual, the authors indicated that the OQ was developed to meet a growing need for a comprehensive measure useful in the treatment of obesity, especially in individualized treatment planning. They also noted that the wide age range covered by the norms for the measure meets the increasing need for assessment of children and adolescents in weight-loss programs. Users are advised that the test is not intended to be used in diagnosis of eating disorders such as anorexia or more general mental health issues like depression.
The measure includes two validity scales (Inconsistent Responding and Defensiveness) as well as 10 clinically oriented scales. The six clinical scales specifically related to eating include: Overeating, Undereating, Craving, Expectations about Eating, Rationalizations, and Motivation to Lose Weight. The remaining four clinical scales address more general health-related issues thought to be central to weight loss treatment, including Health Habits, Body Image, Social Isolation, and Affective Disturbance. The measure also includes 14 items related to patient identity, demographics, weight, and general health behavior.
The OQ can be completed via paper form or computer, and can be administered by a technician. Interpretation of results, which include raw scores, normalized T scores, percentiles, and a graphic profile plot, should be done by a professional with competence in psychometrics sufficient to be able to read and understand the test manual. Time for test completion is said to average about 20 minutes and requires a fourth-grade reading level. The paper or “autoscore” version is printed on a cleverly designed form that integrates all items, scoring instructions and worksheet, and a scoring page (or “profile sheet”) that includes raw score, percentile, and T score equivalents. Hand scoring on the worksheet is facilitated by a combination of arrows, boxes, and shading, which makes the computation of raw scale scores relatively quick and easy. The profiling of scores facilitates efficient visual identification of relative strengths and vulnerabilities, but is not intended for classification of subtypes of test takers. The computer version of the test was not available for this review; however, the manual provides a description and a sample report.
DEVELOPMENT. The development process appears to have generally followed accepted scale development practices (e.g., DeVellis, 2003), though some irregularities in the manual report cause concern. Item development and evaluation included two sequences of literature review, data collection, and item and scale analysis. No specific theory was cited. Following an initial literature review, 140 items thought to be related to overeating and responsiveness to weight loss interventions were written. Constructs represented in this item set included attitudes toward weight, food, eating, and self-image. Items reflecting defensiveness and general psychosocial functioning were also included. The initial item set was studied in a sample of convenience in a university medical school setting (no other description of the participants or their number is given). Based on examination of correlations, 129 items were retained, supplemented by an additional 59 new items generated from feedback from the pilot sample and additional literature review. The second item set was evaluated based on responses of 140 nursing students. The manual notes that the scale structure based on the new data was generally similar to the original set with two minor exceptions, yet no specifics on how scale structure was studied are given. For final inclusion, an item had to correlate at least .30 with its intended scale, and had to show discrimination of at least .10 greater correlation with its own versus any other scale. In addition, final decisions were made with regard to item readability and content uniqueness, resulting in the final 80-item set.
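As a worked illustration of the two retention rules just described (a .30 minimum correlation with the intended scale and at least a .10 advantage over every other scale), consider the following sketch; the correlation values are hypothetical.

    own_scale_r = 0.42                    # hypothetical item-to-intended-scale correlation
    other_scale_rs = [0.25, 0.18, 0.30]   # hypothetical correlations with the other scales

    keep_item = (own_scale_r >= 0.30 and
                 all(own_scale_r - r >= 0.10 for r in other_scale_rs))
    print(keep_item)   # True: the item meets both retention criteria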
As noted, there are two validity scales, Inconsistent Responding Index (INC) and Defensiveness (DEF). The INC scale includes 15 pairs of items with correlations of .5 or greater in the standardization sample. The scale is scored by counting all of the item pairs in which the response differed by at least 2 scale points. The test authors computed the average INC score for 200 randomly generated scores to provide an interpretive guide vis-à-vis the probability that an INC score reflects random responding. For example, an INC score of 5 is associated with a 71% likelihood that the scale was completed randomly. The Defensiveness scale includes seven items representing idealized self-evaluations (e.g., “I am always happy”). Relatively less information is provided on this scale, except that T scores above 60 are said to suggest caution in interpretation and reassurance for anyone completing the scale in the context of treatment.
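The INC scoring rule is simple enough to state as a short sketch: count the item pairs whose responses differ by two or more points on the 0-4 rating scale. The pair numbers and responses below are invented; the actual 15 OQ pairs belong to the instrument itself.

    INC_PAIRS = [(1, 45), (7, 52), (12, 60)]               # hypothetical item-number pairs
    responses = {1: 3, 45: 0, 7: 2, 12: 4, 52: 2, 60: 3}   # hypothetical 0-4 ratings

    inc_score = sum(1 for a, b in INC_PAIRS
                    if abs(responses[a] - responses[b]) >= 2)
    print(inc_score)   # higher counts suggest inconsistent or random responding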
TECHNICAL.
Standardization. The standardization sample of 1,788 was recruited nationally from schools and community settings. A table of breakdowns by gender, age, race/ethnicity, education, and region is provided with national proportions for each variable for comparison, with the exception of age (perhaps because the categories used for the test were not comparable to U.S. Census records, though no explanation is given). Overall, as the test authors noted, the sample resembles national data with some underrepresentation of males and some minority groups. The sample data were then transformed to normalized T scores, which were the basis for both the examination of subgroup differences and the clinical scoring procedures.
The analysis of subgroups involved inspection of means with interpretation of differences guided by a general statement regarding effect sizes (.1-.3 = small, .3-.5 = moderate, greater than .5 = large). The use of effect sizes as an interpretive guide is laudable, but more specific reference to the meaningfulness of these numbers in the context of obesity research and treatment would be a significant improvement. For example, some of the subscales may represent attitudes and behaviors that are more difficult to change in treatment than others; some scales may be more stable following treatment than others; and some may be more highly correlated with other treatment outcomes such as Body Mass Index, any of which would significantly affect interpretation. We can hope that future research provides such data. Nevertheless, the tables indicate that most of the subgroup mean differences are less than the 3 T-score points the authors suggest is the upper limit of a small effect. The differences beyond this level are noted in text, and further research is acknowledged as important in these instances. The overall conclusion that the subgroup differences are minimal simplifies the matter of scoring and interpretation because the T-score norms essentially become a “one size fits all” scoring protocol, a trade of simplicity for specificity that may be welcomed in the clinical setting on purely practical grounds, but cannot be said to reflect strong evidence-based assessment at this point in time.
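The correspondence between T-score differences and the effect-size guideline is worth spelling out: because T scores have a standard deviation of 10, a 3-point subgroup difference equals a standardized effect of 0.3, the stated upper bound for a small effect.

    group_difference_t = 3
    effect_size = group_difference_t / 10   # T scores have SD = 10
    print(effect_size)                       # 0.3, the upper limit of a "small" effect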
Reliability. Reliability data for the OQ are presented in terms of internal consistency for the standardization sample and 1-week test-retest reliability for a separate group. The coefficient alpha estimates for the 10 clinical scales and the Defensiveness scale show evidence of strong internal consistency, with a range of .79 to .88 across the subscales for the full sample. Interestingly, the test authors separately examined internal consistency for the 68 children aged 9 or 10 in the sample. For this group, one scale (Health Habits) dipped below .70 (to .66), but otherwise the reliability estimates remained reasonably strong (range = .72-.88). In the same table, the authors also provided corrected median item-total correlations for the items in each scale, along with ranges for these estimates. Again, the evidence points toward desirable internal consistency. The 1-week test-retest data are also strong if we merely examine the range of the estimates (.64-.94), but are much more limited when taking into account the small size of this sample (n = 24), the fact that no information is given about the sample, and the absence of any theoretical or other comment on why this interval was chosen or whether the constructs measured by the scales should be stable over this interval.
Validity. The manual reports evidence of construct validity that reflects internal and external validity characteristics of the scales. The internal validity report includes tables of scale intercorrelations as well as the results of a principal components analysis on the standardization sample. The external validity data include correlations with a number of other scales and variables chosen to reflect plausible relationships that would provide convergent and divergent validity evidence.
The table of intercorrelations and the accompanying interpretive text are consistent with previously described internal structure of the measure. The principal components analysis was conducted separately for seven scales measuring vulnerabilities (e.g., Overeating) and the remaining three measuring strengths (e.g., Motivation to Lose Weight). The table reporting this analysis includes only the component loadings. No other information on important details of the analysis that should typically be reported is given (e.g., rotation, extraction criteria, eigenvalues) (Henson & Roberts, 2006). The authors noted that the loadings are generally consistent with indicated scales, though, for example, two clearly distinct but adjoining components are combined in a single scale.
Additional construct validity data are presented in the form of correlational studies further examining the relationship of OQ scales to person characteristics such as BMI in the standardization sample, and a small sample (N = 50) study of OQ correlations with five previously established self-report measures of related constructs (e.g., eating, self-concept, stress). In addition, a study of Piers-Harris Self-Concept and OQ scores for 268 of the “youngsters” from the standardization sample was mentioned (no other information is given on this subsample). The authors’ conclusion that the overall pattern is consistent with expectations given the nature of the OQ scale constructs is quite global but not unreasonable.
COMMENTARY. Strengths of the OQ include the efficiency of a single instrument for virtually anyone who might be seen in treatment, ease of administration and scoring, attention to response style, inclusion of specific eating and more general health behaviors, a reasonably large standardization sample of children and adults, internal consistency reliability, face validity, and some evidence of construct validity. The question of to what extent the standardization sample resembles the likely clinical population is not directly addressed. A case could be made that the sample is, in fact, a good comparison one because a large proportion of the U.S. population is overweight and at some point may seek professional assistance. The use of effect sizes in interpretation is commendable, but should eventually be more specifically associated with clinical data in the intended population in future versions of the scale. In addition, some details of the measure development process are missing from the manual (e.g., minimal reporting of the pilot samples, few details of the principal-components analysis).
SUMMARY. The OQ is a relatively new measure attempting to address a major health issue with a comprehensive and efficient set of scales intended for use in individualized treatment of overeating. The test manual sets a relatively circumscribed goal of aiding in individual treatment planning, but that process must be undertaken without the benefit of any predictive data. The OQ ambitiously attempts to provide a single measure for children through older adults with a single set of norms. In providing a user-friendly format and some good psychometric evidence, it is potentially useful in the expressed goal of aiding in treatment planning. Further research is needed to enhance the clinician’s ability to confidently employ the measure, especially in understanding the relationship of scores and profile patterns to treatment process and outcome.
REVIEWER’S REFERENCES
DeVellis, R. F. (2003). Scale development. Thousand Oaks, CA: Sage.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Some common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393-416.
Review of the Overeating Questionnaire by SANDRA D. HAYNES, Dean, School of Professional Studies, Metropolitan State College of Denver, Denver, CO:
DESCRIPTION. The Overeating Questionnaire is an 80-item self-report questionnaire designed to measure key habits, thoughts, and attitudes related to obesity in order to establish individualized weight loss programs. Such an instrument is rare as tests of eating behavior are typically geared toward anorexia nervosa and bulimia nervosa. The paper-and-pencil version of the questionnaire can be administered individually or in a group and takes approximately 20 minutes to complete. The administration time for the PC version is similar but, as suggested, administration is accomplished using computer keyboard and mouse. After completing identifying information including age, gender, education, and race/ethnicity, examinees are asked to answer questions in Part I regarding height, historical weight and eating patterns, use of alcohol and drugs, health problems, and perceptions of weight in self and others. Part II consists of a list of 80 statements that the examinee is asked to rate with regard to agreement on a 5-point scale: Not at all (0), A little bit (1), Moderately (2), Quite a lot (3), and Extremely (4). Care should be taken to ensure that clients respond to all statements on the questionnaire. If an item has been left blank and an answer cannot be obtained from the client, the median score for that item is used in scoring. No written instructions are given to the client regarding the correction of responses made in error. The sample scoring sheet shows errors being crossed out. Verbal instruction should be given.
Scoring is manual using the paper-and-pencil AutoScore™ form or computerized using the PC version. Using the AutoScore™ form, responses are automatically transferred to an easy-score worksheet. Raw scores for each question are transferred to a box under the appropriate scale heading. Numbers from columns representing each of 11 scales are then summed and transferred to the profile sheet. The profile sheet contains corresponding normalized T-scores and percentiles, and provides a graphic representation of results. Scores greater than or equal to 60T are considered high; scores greater than or equal to 70T are very high. Scores less than or equal to 40T are considered low. A 12th score, the Inconsistent Responding Index (INC), is calculated by comparing responses to 15 pairs of similar items.
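A minimal sketch of the profile-sheet cutoff rule quoted above (60T or higher is high, 70T or higher is very high, 40T or lower is low) follows; the scale labels and T scores are invented, and real interpretation requires the published norm tables.

    def classify_t(t_score):
        if t_score >= 70:
            return "very high"
        if t_score >= 60:
            return "high"
        if t_score <= 40:
            return "low"
        return "average"

    hypothetical_profile = {"Scale A": 72, "Scale B": 63, "Scale C": 38, "Scale D": 51}
    for scale, t in hypothetical_profile.items():
        print(scale, t, classify_t(t))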
Remarkably little attention is paid to the computerized scoring in the text of the manual. (It is described in an appendix.) Using this method, the client uses a computer to complete the questionnaire. Scoring is quicker and multiple tests can be scored at the same time. An interpretive report is automatically produced. Even so, care should be taken to ensure accuracy of the report.
As mentioned, 12 scores are generated from the questionnaire. Of the 12 scores, 2 are validity scores: Inconsistent Responding (INC) and Defensiveness (DEF). Using INC, an inconsistency is noted if the difference between the paired items is greater than or equal to 2. There is no absolute cutoff score for a high INC score; an INC of 5 or more indicates a 71% probability of random or careless responding. Clients should be queried about their distractibility during test taking, and the results of the INC score should be discussed in the interpretive report. The DEF score corresponds to items indicative of an idealized self. If the DEF score is elevated, the accuracy of responding to the questionnaire as a whole is questionable.
Of the 10 remaining scores, 6 of the scores are classified under the category Eating-Related Habits and Attitudes. This cluster of scores identifies positive and negative habits and attitudes that enhance or interfere with maintenance of healthy body weight. These scores are: Overeating (OVER), Undereating (UNDER), Craving (CRAV), Expectations About Eating (EXP), Rationalizations (RAT), and Motivation to Lose Weight (MOT). The 4 remaining scales are classified as General Health Habits and Psychosocial Functioning. These scores are: Health Habits (HEAL), Body Image (BODY), Social Isolation (SOCIS), and Affective Disturbance (AFF). This cluster of scores identifies positive and negative aspects of the environment that enhance or interfere with the maintenance of healthy body weight. Taken together, these scores are designed to help the clinician and client develop an effective, personalized weight reduction plan.
DEVELOPMENT. The OQ was formulated after extensive literature review, creation of an initial item pool of 140 items, and modification of the item pools and scales in two pilot tests. The initial items were related to attitudes toward weight, food and eating, self-image, and defensive responding. Related questions were placed into different scales as they were identified in the pilot testing process. The final 80-item questionnaire was derived from an intercorrelation evaluation of "fit" within the scales and from feedback from respondents. The INC index was constructed after the final 80 items were selected, by correlating item pairs; pairs with a correlation of .50 or higher in the standardization sample were included in the index. Readability was taken into consideration, and the reading level of the final form is fourth grade.
TECHNICAL.
Standardization. A standardization sample of 1,788 individuals ranging in age from 9 to 98, drawn from public, nonclinical settings (such as public schools), was used to standardize the OQ. Males, persons of color, and those with less education were somewhat underrepresented. Nonetheless, the authors examined differences by gender, ethnicity, age, education, and region of the United States, and standard scores remained relatively consistent across these demographic variables. The authors are well aware of the need to continue their research on differences among individuals from various demographic backgrounds.
Reliability. Estimates of internal consistency (coefficient alpha), item-to-scale correlations, and test-retest reliability were examined. All measures of reliability indicate that the OQ is a reliable measure. Specific values are generally acceptable to high with an internal consistency median value of .82 (.77 for respondents aged 9-10), item to scale correlations median value of .55, and test-retest correlation median value of .88. The first two estimates of reliability were conducted using the entire standardization sample. Test-retest reliability used a subgroup of 24 individuals aged 27-64 with a 1-week interval between testing. Further investigation of test-retest reliability is warranted given the small sample size and short retest interval.
Validity. Construct and discriminant validity measures were used to assess the validity of the OQ. Construct validity was evaluated in three ways: interscale correlations, a factor analysis showing the relationships among responses given to test items, and correlations between a scale and other measures of a similar characteristic. The first two measures showed strong evidence that the OQ scales measure unique, although sometimes related, constructs. The third measure indicated good correlation with other measures of similar characteristics and good negative correlation with measures of opposite characteristics.
Discriminant validity was assessed in two ways. First, three subgroups from the standardization sample who indicated in one of three ways that they were overweight were compared to the overall sample. As expected, individual scores from these groups differed significantly from those without weight problems on most scales. However, females scored differently on more scales than did males. Such a finding underscores the need for further research into gender and other demographic differences in scoring. Second, the standardization sample was compared to a group of individuals who were in treatment for mood disorders, all of whom were overweight. All but three scores were above average for this group as compared to the standardization group.
COMMENTARY. The major strength of the OQ is its measurement of the key habits, thoughts, and attitudes related to obesity in order to establish individualized weight loss programs. Thus, not only does the questionnaire focus on obesity, an important yet often neglected area of eating-related assessment, but it also appears that it may be a useful instrument in the development of personalized weight loss programs. The efficacy of the latter claim needs further research, however. Administration and scoring are straightforward, and the ability to administer the OQ to individuals or to a group is a plus.
The manual is well organized and is easy to read. Psychometric concepts are explained prior to giving the specific measures of the OQ and were well evaluated. More supporting interpretive comments would make the test more useful in clinical situations.
SUMMARY. The OQ appears to be a well-researched measure of factors that influence obesity. More research is needed on the efficacy of the instrument in establishing effective treatment protocols.
Firestone Assessment of Self-Destructive Thoughts
Review of the Firestone Assessment of Self-Destructive Thoughts by WILLIAM E. MARTIN, JR., Professor of Educational Psychology, Northern Arizona University, Flagstaff, AZ:
The Firestone Assessment of Self-Destructive Thoughts (FAST) is designed to measure the "Continuum of Negative Thought Patterns" as it relates to a client's level of self-destructive potential or suicidality. The authors recommend that the FAST be used for screening, diagnosis, monitoring treatment progress and outcome, research, and therapy. The FAST is theoretically grounded in what the authors refer to as the "concept of the voice," which refers to negative thoughts and attitudes that are said to be at the core of maladaptive behavior.
The FAST consists of 84 items that provide self-report information on how frequently the respondent is experiencing various negative thoughts directed toward himself or herself. Four "composites" and 11 linked "continuum levels" comprise the FAST. One composite is named Self-Defeating and has five continuum levels (Self-Depreciation, Self-Denial, Cynical Attitudes, Isolation, and Self-Contempt). Addictions is another composite, with Addictions as its single continuum level. A third composite is Self-Annihilating, with five continuum levels (Hopelessness, Giving Up, Self-Harm, Suicide Plans, and Suicide Injunctions). The last composite is Suicide Intent, and no continuum levels are identified for it.
ADMINISTRATION, SCORING, AND INTERPRETATION. The FAST instrument is a seven-page perforated, self-carbon form used for responding to items, scoring responses, and graphing the results. T scores are derived for the 11 continuum levels, four composites, and for the total score. Percentiles and 90% confidence interval bands also are available for use. The T scores are plotted on the T-Score profile graph, which has shaded partitions that indicate if the T scores fall within a nonclinical range, equivocal range, or clinical ranges that include elevated and extremely elevated.
The normative sample for the FAST was a clinical sample of outpatient clients undergoing psychotherapy. A T score of 50 on any scale represents the average performance of an individual from the normative sample who was in outpatient treatment with no suicide ideation. The nonclinical range is a T score between 20 and 41, whereas the equivocal range is 42-48. The two clinical ranges are elevated (49-59) and extremely elevated (60+). Any score that falls above the equivocal range is treated with concern, and anyone scoring in the extremely elevated range on levels 7-11, the Self-Annihilating Composite, the Suicide Intent Composite, or the Total score should be immediately assessed for suicide potential.
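Assuming the ranges given above (nonclinical 20-41, equivocal 42-48, elevated 49-59, extremely elevated 60+), the profile classification can be sketched as follows; this is illustrative only and is not a substitute for the manual's interpretive guidance.

    def fast_range(t_score):
        # classification assumes the T-score ranges summarized in the text above
        if t_score >= 60:
            return "extremely elevated"
        if t_score >= 49:
            return "elevated"
        if t_score >= 42:
            return "equivocal"
        return "nonclinical"

    for t in (35, 45, 55, 65):
        print(t, fast_range(t))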
DEVELOPMENT OF THE SCALES. The items for the FAST were derived from actual statements of 21 clinical outpatients who were receiving “voice therapy” in groups. Nine of the outpatients had a previous history of serious suicide attempts and the others exhibited less severe self-defeating behaviors including self-denial, isolation, substance abuse, and eating disorders. The list of items was further refined from a study conducted to select those factors that significantly discriminated between suicide attempters and nonattempters. Then items were retained or deleted based upon their psychometric relationship to hypothesized constructs, resulting in the current 84-item version of the FAST.
RELIABILITY AND VALIDITY. Cronbach's alpha reliability coefficients ranging from .76 to .91 (Mdn = .84) are reported for the 11 level scores. Standard errors of measurement and 90% confidence intervals also are provided; however, sample sizes and descriptions are not provided for these measures. Test-retest reliability coefficients (1-266 days) ranged from .63 to .94 (M = .82) using a combined sample (N = 131) of nonclinical respondents, psychotherapy outpatients, and psychiatric inpatients.
Content validity of the FAST was investigated using a Guttman Scalogram Analysis resulting in a coefficient of reproducibility of .91 and a coefficient of scalability of .66. FAST Total Scores were correlated with the Suicide Ideation subscale of the Suicide Probability Scale (r = .72) as indicators of convergent validity. An exploratory factor analysis was conducted using 579 outpatients resulting in a 3-factor solution (Self-Annihilating, Self-Defeating, and Addictions), which provided support for construct validity. Evidence for criterion-related validity was demonstrated from studies showing how FAST scores were able to discriminate inpatient and outpatient ideators from nonideators and to identify individuals who made prior suicide attempts.
SUMMARY. The authors have put forth empirical evidence that supports the psychometric properties of the FAST. However, continuing studies are needed, especially related to the effectiveness of the FAST in diagnosing and predicting chemically addictive behavior. Furthermore, the construct validity of scores from the FAST needs further consideration. First, the items for the FAST were generated from a small (N = 21), somewhat restricted focus group of persons receiving "voice therapy." Second, the FAST is closely anchored to a theoretical orientation, the "concept of the voice," for which additional validation studies are needed.
Overall, the FAST is a measure worth considering for professionals working with individuals who have exhibited self-destructive potential or suicidality. However, I encourage professionals to study the theoretical orientation underlying the FAST and determine if it is congruent with their own expectations for clinical outcomes prior to extensive use of the instrument.
Review of the Firestone Assessment of Self-Destructive Thoughts by ROBERT C. REINEHR, Professor of Psychology, Southwestern University, Georgetown, TX:
The Firestone Assessment of Self-Destructive Thoughts (FAST) is a self-report questionnaire intended to provide clinicians with a tool for the assessment of a patient’s suicide potential. Respondents are asked to endorse how frequently they are experiencing various negative thoughts directed toward themselves. The items were derived from the actual statements of clinical outpatients who were members of therapy groups in which the techniques of Voice Therapy were used.
Voice Therapy is a technique developed by the senior test author as a means of giving language to the negative thought processes that influence self-limiting, self-destructive behaviors and lifestyles. The FAST includes items intended to assess each of 11 levels of a Continuum of Negative Thought Patterns. Items were assigned to levels based on the judgments of advanced graduate students and psychologists with training in Voice Therapy.
In the standardization process, the FAST was administered to a sample of 478 clients who were currently receiving outpatient psychotherapy and who did not have any current (within the last month) suicide ideation, suicide threats, or suicide attempts. Standard scores were calculated for the Total Score, for four composite scores derived by factor analysis and other statistical procedures, and for each of the 11 levels of negative thought patterns.
Estimates of internal consistency are based on a single sample, the size of which is not reported in the manual. They range from .76 to .97, with the majority falling between .81 and .88. Test-retest reliability estimates are reported for three samples with intervals from 28-266 days in one study and 1-31 days in another: psychiatric inpatients (n = 28), psychotherapy outpatients (n = 68), and nonclinical college students (n = 35). Reliabilities for the various levels of the negative-thought continuum range from .63 to .94, with the higher coefficients generally being found among the nonclinical respondents. Test-retest reliability estimates for the various composite scores and for the total score are somewhat higher, ranging from .79 to .94.
As an indication of construct validity, FAST scores were compared to scores on the Beck Depression Inventory (BDI), the Beck Suicide Inventory (BSI), and the Suicide Probability Scale (SPS). The FAST Total score had its highest correlations with the BDI (.73), the BSI (.72), and the Suicide Ideations subscale of the SPS (.76). The composite scores and the various level scores had lower correlations with the subscales of the Beck instruments or the SPS.
The FAST was administered to groups of inpatients and outpatients with various diagnoses including Adjustment Disorder, Anxiety Disorder, Bipolar Disorder, Depression, Personality Disorder, Schizophrenia, and Substance Abuse, and to a nonclinical sample of 172 college students. Each of the clinical groups was further subdivided into suicide Ideators and Nonideators. Ideators had higher average FAST Total scores than did Nonideators and clinical groups had higher average FAST Total scores than did the nonclinical group. Information is provided in the manual with respect to the relationships between the various FAST subscales and the diagnostic groups and subgroups.
SUMMARY. In general, it would appear that the FAST is similar in many ways to other depression and suicide inventories. Total Scores tend to be higher for respondents in diagnostic groups than for nonclinical respondents, and within diagnostic groups, Suicide Ideators score more highly than do Nonideators.
Within the limits of these findings, the FAST may be useful to clinicians as an indication of how a given respondent’s answers compare to those of various diagnostic groups. It might also be possible to use the scale as a clinical tool for the evaluation of change during therapy, although use as a psychometric instrument is not justified on the basis of the evidence presented in the manual.
Psychiatric Diagnostic Screening Questionnaire
Review of The Psychiatric Diagnostic Screening Questionnaire by MICHAEL G. KAVAN, Associate Dean for Student Affairs and Associate Professor of Family Medicine, Creighton University School of Medicine, Omaha, NE:
DESCRIPTION. The Psychiatric Diagnostic Screening Questionnaire (PDSQ) is a self-report instrument designed to screen for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV; American Psychiatric Association, 1994) Axis I disorders that are most commonly seen in medical and outpatient mental health settings. It is designed to be completed by individuals 18 years of age and older prior to their initial diagnostic interview. The PDSQ covers 13 Axis I areas including Major Depressive Disorder, Posttraumatic Stress Disorder, Bulimia/Binge-Eating Disorder, Obsessive-Compulsive Disorder, Panic Disorder, Psychosis, Agoraphobia, Social Phobia, Alcohol Abuse/Dependence, Drug Abuse/Dependence, Generalized Anxiety Disorder, Somatization Disorder, and Hypochondriasis. It also provides a PDSQ Total score, which acts as a global measure of psychopathology.
According to the manual, the PDSQ is designed to be used in “any clinical or research setting where screening for psychiatric disorders is of interest” (manual, p. 2). It may be administered and scored by any appropriately trained and supervised technician; however, clinical interpretation should only be undertaken by a professional with appropriate psychometric and clinical training.
The PDSQ consists of 125 items in which respondents are requested to answer yes or no to each test booklet question according to “how you have been acting, feeling, or thinking” during the past 2 weeks or 6 months, depending on the symptom cluster. Typical administration time is between 15 and 20 minutes. Scoring is completed by hand and entails counting the number of yes responses on each PDSQ subscale and entering that number in the space provided on the accompanying summary sheet. Subscale scores are then compared to cutoff scores to determine whether follow-up interviewing is indicated. In addition, the scorer is to circle critical items to which the respondent answered “yes.” All subscale scores are then summed in order to obtain a PDSQ Total raw score. Finally, the PDSQ Total raw score is transferred to a PDSQ Score Conversion table that converts the total score into a T-score. On the back side of the summary sheet is a table that includes diagnosis percentages of persons who endorsed each item and either qualified or failed to qualify for a subscale diagnosis. An accompanying CD provides follow-up interview guides for all 13 disorders. These may be printed and then used to gather additional diagnostic information regarding these syndromes.
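The hand-scoring steps described above reduce to counting "yes" responses per subscale, flagging subscales at or above their cutoffs, and summing to a total raw score, as the brief sketch below illustrates; the subscale names, cutoffs, and responses are invented, and the actual cutoffs and T-score conversion come from the PDSQ summary sheet.

    responses = {"Subscale A": [1, 1, 0, 1, 1],    # 1 = "yes", 0 = "no" (hypothetical)
                 "Subscale B": [0, 0, 1, 0, 0]}
    cutoffs = {"Subscale A": 3, "Subscale B": 2}   # hypothetical cutoff scores

    subscale_scores = {scale: sum(items) for scale, items in responses.items()}
    flagged = [s for s, score in subscale_scores.items() if score >= cutoffs[s]]
    total_raw = sum(subscale_scores.values())
    print(subscale_scores, flagged, total_raw)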
As noted previously, scores from the PDSQ are then used to facilitate the initial diagnostic evaluation. The author notes that “results should be verified whenever possible against all available information, including the results of patient interviews, clinical history, professional consultations, service agency records, and the results of additional psychological tests” (manual, p. 11).
DEVELOPMENT. The PDSQ was developed to be a relatively brief, self-administered questionnaire for the assessment of various DSM-IV Axis I disorders in psychiatric patients. Development of the measure began over 10 years ago with an instrument entitled the SCREENER, which was originally designed to screen for psychiatric disorders in primary care settings and later in outpatient mental health settings. Following subscale revisions, the SCREENER became a 102-item version of the PDSQ. Through additional modifications the PDSQ took its present form as a scale of 125 items.
TECHNICAL. The author stresses the importance of patients being able to understand any self-administered instrument. Accordingly, readability studies of the initial version of the PDSQ were conducted, with estimates ranging from a 5.8 grade level (Flesch-Kincaid method) to a 9.2 grade level (Bormuth formula). Additional understandability studies using psychiatric outpatients demonstrated that PDSQ items were "written at a level that most individuals … would understand" (manual, p. 27). The author acknowledges, however, that one-third of the sample were college graduates and only 5% had less than a high school diploma.
Initial and replication studies were conducted to estimate internal consistency and test-retest reliability on 112- and 139-item versions of the PDSQ. Samples were large, but dominated by white, married or single, and educated females. Internal consistency values (Cronbach alpha) for the initial study on 732 psychiatric outpatients ranged from .73 (Somatization Disorder) to .95 (Drug Abuse/Dependence), whereas a replication study involving 994 psychiatric outpatients found internal consistency estimates to range from .66 (Psychosis and Somatization Disorder) to .94 (Posttraumatic Stress Disorder). Test-retest reliability coefficients on a subsample of these patients ranged from .66 (Bulimia/Binge-Eating Disorder) to .98 (Drug Abuse/Dependence) for the initial study (mean interval of 4.8 days) and from .61 (Mania/Hypomania) to .93 (Drug Abuse/Dependence) in the replication study (mean interval of 1.6 days).
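For readers wanting to connect these figures to the underlying arithmetic, Cronbach's alpha can be computed directly from item-level data. The sketch below uses invented responses for a hypothetical five-item subscale; it does not reproduce any PDSQ data.

import numpy as np

# Hypothetical item responses (rows = respondents, columns = items) for a
# five-item subscale; 1 = "yes", 0 = "no". Illustrative data only.
items = np.array([
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
])

k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)      # variance of each item
total_variance = items.sum(axis=1).var(ddof=1)  # variance of the total score

# Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(round(alpha, 2))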
The author reports that 27 of the 112 items did not achieve a minimum endorsement base rate of 5% during the initial study and were not used to determine test-retest reliability. Eighty-three of the 85 remaining items had a Cohen’s kappa coefficient, which corrects for chance levels of agreement, between .67 and .92. In the replication study, only two items were excluded in the test-retest reliability study. Cohen’s kappa for the remaining items ranged from .50 to .83. Although there is some disagreement regarding the interpretation of kappa, Spitzer, Fleiss, and Endicott (1978) suggest that values greater than .75 demonstrate good reliability, values between .50 and .75 suggest fair reliability, and values below .50 connote poor reliability. In the initial study, 7 subscales (Major Depressive Disorder, Dysthymic Disorder, Bulimia/Binge-Eating Disorder, Mania/Hypomania, Agoraphobia, Generalized Anxiety Disorder, and Hypochondriasis) would be considered to have fair reliability and 7 (PTSD, Obsessive-Compulsive Disorder, Panic Disorder, Psychosis, Social Phobia, Alcohol Abuse/Dependence, and Somatization Disorder) would be considered to have good reliability (1 subscale did not meet the base rate standard). In the replication study, 14 subscales would be considered to have fair reliability and 1 (Drug Abuse/Dependence) would be considered to have good reliability.
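Cohen's kappa, and the Spitzer, Fleiss, and Endicott (1978) interpretive bands cited above, can be illustrated with a short calculation on a single hypothetical yes/no item administered twice; the responses below are invented for illustration.

import numpy as np

# Hypothetical test-retest answers to one yes/no item (1 = "yes", 0 = "no").
time1 = np.array([1, 1, 0, 0, 1, 0, 1, 1, 0, 0])
time2 = np.array([1, 1, 0, 1, 1, 0, 1, 0, 0, 0])

# Observed agreement across the two administrations.
p_observed = np.mean(time1 == time2)

# Chance agreement from the marginal "yes" rates at each administration.
p_yes1, p_yes2 = time1.mean(), time2.mean()
p_chance = p_yes1 * p_yes2 + (1 - p_yes1) * (1 - p_yes2)

# Cohen's kappa corrects observed agreement for chance agreement.
kappa = (p_observed - p_chance) / (1 - p_chance)

# Spitzer, Fleiss, & Endicott (1978) interpretive bands cited in the review.
if kappa > .75:
    label = "good reliability"
elif kappa >= .50:
    label = "fair reliability"
else:
    label = "poor reliability"

print(round(kappa, 2), label)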
To document discriminant and convergent validity, corrected item/subscale total correlation coefficients were calculated between each item and subscale. The mean of the correlations between each subscale item and that subscale's total score was compared to the mean of the correlations between each subscale item and the other 14 subscale scores. The author points out that in 90.2% of the calculations the item/parent-subscale correlation was higher than each of the item/other-subscale correlations. A similar pattern emerged from the replication study, with 97.1% of items correlating more highly with their parent subscale. Data are not provided on correlations between each subscale and the other individual subscales within the PDSQ.
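The "corrected" item/subscale correlation referred to here correlates each item with its subscale total after that item has been removed from the total, so the item cannot inflate its own correlation. A minimal sketch of the calculation, using invented data, follows.

import numpy as np

# Hypothetical yes/no responses (rows = respondents) for a four-item subscale.
items = np.array([
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
])

# Corrected item-total correlation: correlate each item with the sum of the
# OTHER items on its subscale, so the item is not counted in its own total.
for j in range(items.shape[1]):
    rest_total = items.sum(axis=1) - items[:, j]
    r = np.corrcoef(items[:, j], rest_total)[0, 1]
    print(f"item {j + 1}: corrected item-total r = {r:.2f}")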
The PDSQ subscales were also compared to “other measures of the same construct versus measures of different constructs” (manual, pp. 31-32). In all instances, the PDSQ subscale scores were significantly correlated with measures of similar syndromes. In addition, correlations were higher between scales assessing the same symptom domain than scales assessing other symptom domains. Interpretation is somewhat clouded by the manual’s lack of clarity regarding the nature of these measures.
Finally, criterion validity was documented by comparing the scores of respondents with and without a particular DSM-IV diagnosis. In both the initial and replication studies, the average PDSQ score was significantly higher for those with versus those without the disorder (the only exception was Mania-Hypomania, which was subsequently dropped from the PDSQ).
Cutoff scores are provided based on a study of 630 psychiatric outpatients who were interviewed with the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID; First, Spitzer, Gibbon, & Williams, 1997). Based on results from this study and the fact that the PDSQ is intended to be used as an aid for conducting an initial diagnostic evaluation, the author has recommended a cutoff score resulting in diagnostic sensitivity of 90%. These cutoff scores are provided on the PDSQ Summary Sheet. In addition, a table within the manual includes cutoff scores, sensitivity, negative and positive predictive values, and separate columns estimating the rates of occurrence among psychiatric patients and in the general population, the latter being based on information obtained from the DSM-IV.
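Sensitivity, specificity, and predictive values summarize how a cutoff performs against a criterion diagnosis such as one established by the SCID. The sketch below computes them from a hypothetical 2 x 2 classification table; the counts are invented and are not taken from the PDSQ validation study.

# Hypothetical screening results against a criterion interview (e.g., SCID).
true_pos = 45    # screen positive, disorder present
false_neg = 5    # screen negative, disorder present
false_pos = 90   # screen positive, disorder absent
true_neg = 360   # screen negative, disorder absent

sensitivity = true_pos / (true_pos + false_neg)   # proportion of cases detected
specificity = true_neg / (true_neg + false_pos)   # proportion of non-cases cleared
ppv = true_pos / (true_pos + false_pos)           # P(disorder | screen positive)
npv = true_neg / (true_neg + false_neg)           # P(no disorder | screen negative)

print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")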
Limited data are provided within the manual on the PDSQ Total Score. The author states that it is the only norm-referenced score in the instrument. The Total Score is expressed as a standard T-score and is a means for “comparing the patient’s level of symptom endorsement with that of the average patient seen for intake in a clinical psychiatric outpatient setting” (manual, p. 11). Apparently, it provides a “rough measure of the overall level of psychopathology and consequent dysfunction that a patient reports” (manual, p. 11). However, the author states that it is only loosely related to the distress a patient may be experiencing and it should not be used as an index of severity.
COMMENTARY. The purpose of the PDSQ is to screen for DSM-IV Axis I disorders that are most commonly seen in outpatient mental health settings. With any measure such as this, the real question is: Is it accurate, and does it improve efficiency? In regard to accuracy, the PDSQ has respectable internal consistency and test-retest reliability. In addition, convergent and discriminant validity studies demonstrate that PDSQ items correlate more strongly with their parent subscale than with other subscales within the PDSQ. Also, PDSQ items were more strongly correlated with other measures of the same construct than with measures of different constructs, although the manual is somewhat unclear as to the nature of these "measures." Finally, the PDSQ appears to have decent sensitivity and specificity and does well at identifying both principal and comorbid disorders. A problem, however, is that the PDSQ has no validity indices, so patients who misrepresent themselves on the instrument cannot be detected. Any interpretation should, therefore, be done cautiously and with corroborating information.
In regard to the question of efficiency, the author admits that this, like the issue of accuracy, remains an empirical question. Despite the lack of supportive data within the manual, the PDSQ does appear to readily guide the interview toward symptom areas requiring more detailed assessment. In and of itself, this should streamline the diagnostic interview.
Potential PDSQ users are cautioned about several other areas. The first relates to the samples used in studying the PDSQ: although sample sizes are typically adequate, the generalizability of findings is somewhat limited by the rather homogeneous (i.e., mostly white, female, married/single, and well-educated) patient samples used in the various studies. Users of the PDSQ are also reminded of the fairly high reading level necessary for self-administration and the lack of validity indices within the instrument.
SUMMARY. The author should be commended for developing a self-report screening measure that is relatively easy to administer and score and has acceptable evidence of reliability and validity. As the author notes, the PDSQ "is not a substitute for a diagnostic interview …. There are no special questions on the PDSQ that allow it to detect psychopathology that otherwise would go undetected during a clinical evaluation" (Zimmerman, 2003, p. 284). Nonetheless, the PDSQ will likely guide clinicians toward those areas of clinical concern that need additional assessment. In doing so, the PDSQ should serve its intended purpose of increasing clinical diagnostic accuracy and efficiency. Additional studies will be needed to determine the overall impact of the PDSQ on these issues and whether it leads to improved treatment outcomes.
REVIEWER’S REFERENCES
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1997). Structured clinical interview for DSM-IV Axis I Disorders (SCID). Washington, DC: American Psychiatric Association.
Spitzer, R. L., Fleiss, J. L., & Endicott, J. (1978). Problems of classification: Reliability and validity. In M. A. Lipton, A. DiMarco, & K. Killam (Eds.), Psychopharmacology: A generation of progress (pp. 857-869). New York: Raven.
Zimmerman, M. (2003). What should the standard of care for psychiatric diagnostic evaluations be? Journal of Nervous and Mental Disease, 191, 281-286.
Review of The Psychiatric Diagnostic Screening Questionnaire by SEAN P. REILLEY, Assistant Professor of Psychology, Morehead State University, Morehead, KY:
DESCRIPTION. The Psychiatric Diagnostic Screening Questionnaire (PDSQ) consists of 125 items (111 numbered items, 2 with multiple parts) that tap symptoms of several DSM-IV Axis I disorders commonly seen in outpatient settings. The PDSQ can be completed on-site in as little as 20 minutes or at home in advance of an appointment. Respondents use one of three time frames (past 2 weeks, past 6 months, lifetime recollection) to specify the presence ("Yes") or absence ("No") of symptoms. Responses can be rapidly hand-summed into raw subscale scores and converted to T-scores by clinicians and appropriately trained staff. The inventory yields a total score and 13 subscale scores, denoted in brackets, which tap mood [Major Depressive Disorder], anxiety [Posttraumatic Stress Disorder, Obsessive Compulsive Disorder, Panic Disorder, Agoraphobia, Social Phobia, Generalized Anxiety Disorder], eating [Bulimia/Binge-Eating Disorder], somatoform [Somatization Disorder, Hypochondriasis], substance abuse/dependence [Alcohol Abuse/Dependence, Drug Abuse/Dependence], and psychotic [Psychosis] symptoms. Summary sheets assist with identification of 45 possible critical items and comparison of subscale scores with recommended clinical cutting scores. A compact disc containing follow-up interview guides with prompts related to DSM-IV criteria is available from the test publisher for each subscale.
DEVELOPMENT. The PDSQ is an atheoretical inventory. Items were written to reflect symptom criteria for the DSM-IV Axis I disorders most common in epidemiological surveys and in published research articles. Most items adequately represent the DSM-IV nosology except those comprising the Alcohol and Drug Abuse/Dependence and the Psychosis subscales: the former reflect abuse/dependence symptoms broader than those required by the DSM-IV, whereas the latter assess critical symptoms of several nonspecific psychotic disorders. During the item revision process, 89% of items successfully passed four criteria established by the developer. The addition of new and revised items did not succeed in meeting five additional subscale retention criteria for the Anorexia Nervosa, Body Dysmorphic Disorder, Mania/Hypomania, Dysthymic Disorder, Generalized Anxiety Disorder, Psychosis, Somatization, and Hypochondriasis subscales. The latter four subscales were nevertheless retained, based in part on adequate diagnostic performance with outpatient clinical groups, but could benefit from further modification. The 13 subscales comprising the current version of the PDSQ contain uneven item distributions ranging from 5 to 22 items. This, in addition to the lack of items to assess response bias, raises concern: for some subscales, endorsement of a single transparent item is sufficient to exceed the clinical screening criterion, which could lower their positive predictive power. The developer does offer practical suggestions for detecting response bias using the PDSQ Total score. However, this index is norm referenced rather than criterion referenced like the subscale scores, so bias detection procedures using the PDSQ Total score need empirical validation with clinical groups before they can be accepted.
TECHNICAL. Multiple adult medical and psychiatric outpatient samples, each of at least 400 individuals (over 3,000 combined), were used to standardize the PDSQ. Although these sample sizes are certainly commendable, the data come predominantly from Caucasian (85% to 94%) high school graduates (89% to 94%) living in Providence, RI. The impact of gender on PDSQ norms is not reported, despite women outnumbering men by a 2:1 ratio in all standardization samples. Because several DSM-IV syndromes tapped by the inventory show marked gender differences (e.g., Major Depressive Disorder), gender impact studies are needed. Perhaps more salient is the need for a broader and more representative normative sample to improve the generalizability of the PDSQ for diverse populations, including rural, multi-ethnic, and geriatric adults, as well as those with lower education and/or socioeconomic status.
Readability analyses using the Flesch-Kincaid and Bormuth methods indicate that PDSQ items range from fifth- to ninth-grade reading levels. Studies employing simpler forced-choice procedures (Understand/Don't Understand) suggest that greater than 95% of adults with a high school degree or equivalency understood all PDSQ items. Despite these initial data, no minimum level of reading skill is recommended in the manual.
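For reference, the Flesch-Kincaid grade level mentioned here is computed from word, sentence, and syllable counts using the widely published formula shown below; the counts in the sketch are invented and are not from the PDSQ readability study.

# Flesch-Kincaid grade level from word, sentence, and syllable counts.
# The counts below are invented, not taken from the PDSQ readability study.
total_words = 1200
total_sentences = 150
total_syllables = 1700

grade_level = (0.39 * (total_words / total_sentences)
               + 11.8 * (total_syllables / total_words)
               - 15.59)
print(round(grade_level, 1))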
Reliability estimates are reported for previous PDSQ versions that include items and subscales not found on the present version. Extrapolating from these data, the Cronbach alpha coefficients of subscales common to the current PDSQ are adequate (.66) to excellent (.94). To date, internal consistency estimates have been the only basis for examining the latent variables comprising the PDSQ; no attempts are reported to validate its primary factor structure. This could be accomplished using advanced modeling procedures such as confirmatory factor analysis or structural equation modeling. Test-retest estimates over approximately a week are borderline adequate (kappa = .56) to excellent (kappa = .98) at the item level, and slightly higher at the subscale level (rs ranging from .72 to .93). Studies involving longer test-retest intervals are needed to bolster the initial data, given the longer temporal requirements of several DSM-IV syndromes (e.g., Major Depression, 2 weeks; PTSD, 4 weeks) tapped by the PDSQ.
Data concerning convergent and discriminant validity of the PDSQ are based on initial outpatient (n = 732) and replication samples (n = 994) using multiple methods. Across studies, the mean corrected item-parent PDSQ subscale correlations (rs ranging from .42 to .85) are significantly higher than 90% of those afforded by item-other PDSQ subscale relations (rs ranging from .15 to .35). Subscale-specific correlations with externally recognized instruments are modest (r = .25) to very good (r = .77), and higher than those afforded by nonspecific PDSQ subscales (rs ranging from .15 to .35). Thus, initial internal and external comparisons of PDSQ subscales suggest appropriate convergent and discriminant validity. Criterion validity is initially supported by significantly higher diagnosis-specific PDSQ subscale scores (e.g., Major Depressive Disorder) among outpatient groups with the corresponding DSM-IV disorder than among those without the disorder. Absent from the manual, however, are comparisons of non-diagnosis-specific PDSQ scores between outpatient groups. Inclusion of these data could further bolster the criterion validity evidence of the PDSQ.
The initial sensitivity, specificity, and positive and negative predictive power of PDSQ cutting scores for primary DSM-IV diagnoses are based on a single sample of psychiatric outpatients (n = 630). Subscale sensitivity is generally adequate (75%) to very good (100%) in this sample, with less variability noted for rates of specificity (range 83% to 100%). The developer recommends a sensitivity level of 90% for establishing cutting scores for clinical practice. However, four subscales, Obsessive Compulsive Disorder (89%), Psychosis (75%), and both Alcohol (85%) and Drug (85%) Abuse/Dependence, fail to reach this sensitivity level. Using the most liberal cutoff scores, positive predictive values range considerably (18% to 100%), whereas negative predictive values are high and fairly consistent (97% to 100%). Seven subscales yield positive predictive values below 60%, which, in part, may be due to low base rates of the disorders tapped by the Bulimia/Binge Eating Disorder, Somatization Disorder, Hypochondriasis, and Psychosis subscales. However, as noted, the Drug Abuse/Dependence and Psychosis subscales provide a less than adequate mapping of the DSM-IV nosology, which may negatively affect their predictive ability. Finally, the differential validity evidence of all PDSQ cutting scores needs to be clarified for gender and diversity considerations.
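The reviewer's point about base rates can be made concrete: holding sensitivity and specificity fixed, positive predictive power drops sharply as a disorder becomes rarer in the screened population. The sketch below applies Bayes' theorem with assumed values chosen only to show the effect, not the PDSQ's actual operating characteristics.

# How base rate drives positive predictive power (PPV) at fixed accuracy.
# Sensitivity/specificity values here are assumed for illustration only.
sensitivity = 0.90
specificity = 0.85

for base_rate in (0.30, 0.10, 0.02):   # prevalence among those screened
    p_positive = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    ppv = sensitivity * base_rate / p_positive
    print(f"base rate {base_rate:.0%}: PPV = {ppv:.0%}")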
COMMENTARY. The PDSQ appears to be a potentially valuable screening instrument for common DSM-IV Axis I disorders in outpatient settings. Several issues need to be addressed in order to firmly anchor the psychometrics and generalizability of the PDSQ. First, a more representative standardization sample needs to be collected using the current version of the PDSQ. In that sample, gender and diversity contributions to PDSQ scores need clarifying and a minimum reading level should be established. Second, response bias needs to be addressed either by inclusion of new items or additional studies designed to empirically validate bias detection techniques using the PDSQ Total Score. Third, factor analysis or structural equation modeling is needed to adequately assess the overall PDSQ factor structure and to address less than adequate homogeneity in several subscales. Fourth, longer test-retest studies are needed to bridge the existing stability of several subscales with specific temporal requirements of several selected DSM-IV disorders. Finally, the positive predictive power of Psychosis, Bulimia/Binge Eating Disorder, Generalized Anxiety Disorder, Somatization Disorder, Hypochondriasis, and Alcohol and Drug Abuse/Dependence subscales needs to be improved.
SUMMARY. The developer, to his credit, has produced a potentially valuable screening instrument, and one of the first that directly incorporates the DSM-IV nosology for common Axis I disorders. Significant care was taken in initial studies to evaluate PDSQ items and subscales using multiple reliability and validity indices. In order for this instrument to become a gold standard, a more representative standardization sample is needed. Careful, continued validation work will also be required to solidify the PDSQ factor structure, to enhance the homogeneity and test-retest reliability of specific subscales, and to improve their positive predictive power. As a whole, this inventory is recommended for screening purposes with an eye to its current limitations.