Would the problem you identified (Hospital Survey on Patient Safety Culture) lend itself to a qualitative or quantitative design? What level of evidence (research design) would best address the problem?
Use these 2 articles. Are they qualitative or quantitative? What type of research is each?
discussionShort answer
Journal of Data Science
355-376, DOI: 10.6339/JDS.201804_16(2).0007
CAN EMOTICONS BE USED TO PREDICT SENTIMENT?
Keenen Cates1, Pengcheng Xiao1,∗, Zeyu Zhang1, Calvin Dailey1
1 Department of Mathematics, University of Evansville
1800 Lincoln Ave, Evansville, Indiana, 47722 USA
Abstract: Getting a machine to understand the meaning of language is an
important goal in a wide variety of fields, from advertising to entertainment.
In this work, we focus on Youtube comments from the top two hundred trending
videos as a source of user text data. Previous sentiment analysis models focus
on hand-labelled data or predetermined lexicons. Our goal is to train a model
to label comment sentiment with emoticons by training on other user-generated
comments containing emoticons. Naive Bayes and Recurrent Neural Network models
are both investigated and implemented in this study, and the validation
accuracies of the Naive Bayes model and the Recurrent Neural Network model are
found to be .548 and .812, respectively.
Key words: Sentiment analysis, Emoticons, Natural Language Processing,
Machine Learning.
1. Introduction
Sentiment analysis is a branch of natural language processing that involves trying to
understand the underlying sentiment and emotion behind language. For example, “Have a
great day” has a positive sentiment, and “Have a bad day” has a negative sentiment.
Current state-of-the-art techniques for modelling sentiment in language involve using
machine learning and deep neural networks to classify the sentiment of language. For
example, SemEval is a yearly contest for trying to classify tweets as Positive, Negative, or
Neutral. Its findings advance the field of sentiment analysis and machine learning
(Rosenthal, Noura, and Preslav 2017).
1.1 Objectives
Our focus is on another major social platform, Youtube, which garners hundreds of
thousands of comments and other user-generated statistics. User data yields important
results in the social sciences. In particular, we are interested in the top trending
Youtube videos, and aim to identify the sentiment of commenters by suggesting what emoticon
a user might use with their comments. We suggest that emoticons give insight into the
sentiment of the user, and that the emoticons' pictographic nature gives us a better language to
indicate emotion. Using the subset of comments with emoticons, we engineered a
labelled dataset of comments and emoticons. Our models take advantage of this
labelling to model the emoticon lexicon. This is further used to suggest what emoticons
might accompany a comment (Hogenboom 2013). Using this dataset and the models we
have created, we hope to answer whether or not we can accurately predict what emoticon a
user might use.
1.2 Literature Review
Sentiment Analysis drives many industries, and being able to correctly identify
sentiment in a Youtube comment would allow automated systems to moderate comments
or correctly recommend media or advertisements to users. In general, there are two
methods that Natural Language Processing researchers use for Sentiment Analysis:
lexicon-based and machine-learning-based. Sentiment Analysis is a fairly robust field,
and has consistently seen interest since its conception. Interest in the field has grown
rapidly with the surge in data that accompanied the rise of the internet; in many cases the
amount of data is intractable. Social platforms such as Youtube by themselves generate
more data than any one human could analyze. Therefore a system of Natural Language
Processing (NLP) is required to deal with the sheer volume of data.
Natural Language Processing can be considered a subset of cognitive science or
computer science. The concept of natural language processing originally came about in the
mid-20th century. The initial motivation was language translation (Salas-Zárate 2017).
Natural Language Processing naturally lends itself to the field of Artificial Intelligence, as
there is a strong desire for agents that can understand human language; for example, a chat
bot. Sentiment Analysis did not pull much attention until the early 2000s. The natural
language processing systems that were developed at first were only applicable to narrow
subject areas, such as answering questions with information from a database about moon
rocks, or answering questions from a manual on airplane maintenance (Liu 2012). The
explosion of social data quickly created a necessity to autonomously understand language
sentiment. Especially with the ubiquitous nature of social media in recent years, the field
of sentiment analysis has become more and more applicable to many fields. It has been
one of the most active areas of research in the field of natural language processing since
the turn of the century (Pozzi 2017).
There are many commercial applications. It may have significant effects for the fields
of management, political science, economics, and other social sciences, among others (Liu
2012). Sentiment analysis, also known as opinion mining, refers to the process of creating
automatic tools or systems which can derive subjective information from text in natural
(human) languages, as opposed to computer code. The subjective information most
commonly desired by researchers is opinions and sentiments, hence the name sentiment
analysis. Sentiment analysis, while originally practiced only by computer scientists, has
become widely used in management science and the social sciences. Microsoft,
Google, Hewlett-Packard, IBM, and others have created their own systems for sentiment
analysis.
Before the turn of the century, there were earlier developments in what would later
become the field of sentiment analysis. Anderson and McMaster (1982) provided a way to model
the affective tone of an entire document based on the “semantic differential scores” of each
of the words in the document. The semantic meanings and scores were derived from a 1965
study by Heise. According to Lee and Pang (2002), the early 2000s marked an explosion of research in
sentiment analysis. This increase in the study of the topic was partially attributed to the
increasing popularity of machine learning models, and the availability of training sets with
which machine learning models could be trained. Turney (2002) used an algorithm based
on parts-of-speech tagging and semantic orientation in order to classify online reviews as
recommended or not recommended. Lee and Pang (2002) used machine
learning techniques such as Support Vector Machines and Naive Bayes in order to
classify the sentiment of movie reviews. Dave, Lawrence, and Pennock (2003) classified
polarity of web reviews based on several n-gram methods. It was not as accurate when
applied to individual sentences because it was developed with the purpose of classifying
reviews which normally contained multiple sentences. Hu and Liu (2004) used a method
that could predict the sentimental orientation of opinion words and therefore the opinion
orientation of a sentence. It was an unsupervised method that did not require a corpus, and
was loosely based on the work of Dave, Lawrence, and Pennock. It returned
sentiments at the sentence level instead of for the entire review at once. Then it combined
the sentence-level sentiments to give a summary of the entire review. Moraes, Valiati, and
Neto (2013) showed the effectiveness of machine learning processes as opposed to
lexicon-based models. They empirically compared the Support Vector Machines and
Artificial Neural Network machine learning methods for sentiment analysis and found that
the Artificial Neural Networks performed better. Wang, Liu, Sun, Wang, and Wang (2015)
showed the effectiveness of long short-term memory (LSTM) recurrent neural networks
for sentiment analysis by predicting the sentiments of tweets.
1.3 Sentiment Lexicon
The lexicon method splits input text into many individual words or phrases called
tokens. Then, it creates a table of these tokens and records the number of times each token
shows up in the text. The resulting tally is called a “Bag of Words” model. Once this
process is done, another tool called “Sentiment Lexicon” is used for computing the
classification of the bag of tokens we mentioned above. The Sentiment Lexicon has the
sentiment values, which can be just positive or negative numbers or some other value-
representations, like vectors, that are pre-recorded for each token. This can be done either
manually or by some machine learning techniques. Once we have the input text tokenized
and a suitable Sentiment Lexicon, the final task is to design a function to compute the
final sentiment. The simplest way to compute the final sentiment is to sum the sentiment
values of each token together. The lexicon method is a traditional way to deal with natural
language processing problems, and it has a good theoretical basis. Many people are still
using and studying this method despite its origins in the 1960s. However, it does have
some drawbacks, such as ignoring the wholeness and continuity of the text.
We know that the meaning of a sentence depends heavily on word order and context;
these should not be ignored if we want a truly intelligent sentiment processing system
(Taboada 2011).
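The lexicon method described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the lexicon entries and their values are invented for the example, and tokenization is plain whitespace splitting.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption); real lexicons are far larger.
SENTIMENT_LEXICON = {"great": 1.0, "good": 0.5, "bad": -1.0, "terrible": -1.5}

def tokenize(text):
    """Split text into lowercase word tokens on whitespace."""
    return text.lower().split()

def bag_of_words(text):
    """Count how many times each token appears in the text."""
    return Counter(tokenize(text))

def lexicon_sentiment(text, lexicon=SENTIMENT_LEXICON):
    """Sum the lexicon value of every token occurrence; unknown tokens count as 0."""
    bag = bag_of_words(text)
    return sum(lexicon.get(token, 0.0) * count for token, count in bag.items())
```

Because the score is a sum over an unordered bag, any reordering of the same words receives the same score, which is exactly the word-order drawback noted above.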
1.4 Machine Learning
In the Machine Learning technique of sentiment analysis the classification algorithm
uses a training set to learn a model based on features in the set. This makes a more nuanced
classification possible and can help with ambiguous words or interpretations that vary by
context. A method of feature extraction must be chosen. Some of these methods include
n-grams, which are sets of words that contain n words each. Others use parts-of-speech
information, emotional, affective, or semantic data. One of the disadvantages of the
machine learning method is that it requires a large set of labelled data to be used as the
training set. It is simpler to use the lexicon-based method unless a suitable training set is
available (Salas-Zárate 2017).
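As a small illustration of n-gram feature extraction, the sketch below returns every run of n consecutive tokens. The function name `ngrams` is an assumption made for this example and is not taken from the paper.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Bigrams keep some local word order that single-word features lose.
bigrams = ngrams("have a great day".split(), 2)
```

With n = 1 this reduces to the individual words, so word-level features are the special case of unigrams.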
We will need to classify the sentiments of the emoticons manually in order to prepare
them for use in our analysis. Once that is done, we can compile our training set using the
comments in the data that already contain emoticons, using the sentiments of each
emoticon. Then our model will be able to classify and assign an emoticon to each comment
in the data set that does not already contain one. Recurrent Neural Networks (RNNs) have
had a great deal of success in natural language processing. The reason is that
text data is highly sequential; for example, the word “day” does not mean much unless you
know the words that came before it, e.g. “Have a great day.” RNNs have pushed the state
of the art beyond previous architectures on short-length text data (Lee and Dernoncourt 2016).
Given that previous attempts to model sentiment have not thoroughly explored emoticons,
we hope to answer the question of whether or not we can accurately recommend emoticons
that might accompany a piece of text. Once we have answered this, further research can
make attempts to analyze sentiment with emoticons on a machine.
2. Methodology
2.1 Data
To get our data, we used the Data Science Competition Website Kaggle. On this
website, people share datasets, competitions, and tutorials. We found a dataset containing
comments from the top 200 trending Youtube videos. The author of this dataset obtained
the data through Youtube’s publicly available API, which allows developers to easily
query for data on Youtube. The data contains profanity and nonsensical text, and is
noisy in general. Some of it could be generated by bots, and we do no vetting to
determine whether a comment actually comes from a human. The noisiness of the data
might prevent us from training a successful model; however, we assume that the large
amount of data will help our models perform well in spite of its low quality.
In order to answer the question of whether or not a model could recommend emoticons,
we created two models that attempt to perform this recommendation. We also created a
simple dummy model for purposes of comparison. We have roughly three hundred
thousand comments with emoticons, and use them to bootstrap a dataset of comments with
labels. More data is desirable, but this is a fairly large corpus for initial research.
In total, there are 691,388 rows in the dataset. A large proportion of them (more than
200,000) contain emoticons, so there is quite a bit of data, and it would be fairly
straightforward to access the Youtube API and get more if needed, so the amount of data
is not a constraint. As for features, we use only the text; likes, reply threads, and so
on are ignored in this phase of the project. On
average, each text is 15 words long. Figure 1 shows some examples of how the data looks.
Figure 1: Example unprocessed data
2.2 Evaluation Metrics
The models will be evaluated using a holdout set of data, for which each model will
recommend five emoticons that might accompany a text. If at least one recommendation
is an emoticon that occurs in the validation comment, then we consider it to be
a “correct” guess. Accuracy is then the number of correct guesses divided by the total
number of guesses. Keras calls this metric “top k categorical accuracy”, and we implement
it for our models. Mathematically, for each comment $x \in \text{Comments}$ with matching
label set $y \in \text{Labels}$, let $\text{score}(x) = 1$ if any
$p \in \operatorname{argmax}_{k=5}(\text{predict\_labels}(x))$ is in $y$, and
$\text{score}(x) = 0$ otherwise. Here $\text{predict\_labels}(x)$ returns the probability
of each output class occurring. The accuracy of the model is then

$$\text{accuracy} = \frac{\sum_{i=1}^{N} \text{score}(x_i)}{N}$$

where $x_i \in \text{Comments}$ and $N = |\text{Comments}|$.
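The metric can be sketched in plain Python as follows. This is an illustrative re-implementation under stated assumptions rather than the Keras function itself: each prediction is a list of class probabilities indexed by class, and each label set is a Python set of class indices.

```python
def top_k_labels(probabilities, k=5):
    """Return the indices of the k largest probabilities, highest first."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return ranked[:k]

def score(probabilities, true_labels, k=5):
    """1 if any of the top-k predicted classes is in the true label set, else 0."""
    return 1 if any(p in true_labels for p in top_k_labels(probabilities, k)) else 0

def top_k_accuracy(all_probabilities, all_true_labels, k=5):
    """Mean score over all comments: correct guesses divided by total guesses."""
    scores = [score(p, y, k) for p, y in zip(all_probabilities, all_true_labels)]
    return sum(scores) / len(scores)
```

Note that a larger k can only keep or raise this accuracy, so scores computed with different values of k are not comparable with each other.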
One consideration is that the distribution of emoticons occurring in the corpus of data
is highly skewed (see Figure 2). This is good reason to consider F1 scores, which might be
better suited to future analysis. However, we chose top k accuracy because it more closely
resembles the question we are asking.
Figure 2: Distribution of a subset of Emoticons
2.3 Analysis Plan
In order to compare the performance of our models, we created a holdout set of data
meant only for validating accuracy. We also defined what a prediction would be for
each model: each model outputs its five highest-probability predictions. If any of those
predictions is among the emoticons of the validation comment, then we consider it an
accurate prediction. To analyze the dataset, we compute the prediction accuracy of each
model and compare those scores. One might also consider looking at the training accuracy
of each model; however, these scores are not directly comparable, so we ignore them
except for the purposes of optimizing the model.
2.4 Approach
In our approach, we had to make a few crucial assumptions and simplifications to
frame our problem. First, our dataset involved input data with multiple output
classifications. For example, a user can add hundreds of the same emoticon or many
different emoticons. As a preprocessing step, we narrowed down these classes to the
unique emoticons that show up in a comment, and unrolled the data set so that each data
point has a single label. Table 1 displays how each comment gets unrolled into multiple
data points with single labels.
Original comment        Emoticons
I loved this video!     x x y

Unrolled comment        Label
I loved this video!     x
I loved this video!     y
Table 1: Unrolling of data labels
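The unrolling step in Table 1 can be sketched as below. The function name `unroll` and the use of the placeholder strings "x" and "y" for emoticons are assumptions made for illustration.

```python
def unroll(comment, emoticons):
    """Turn one comment with many emoticons into one (comment, label) pair
    per unique emoticon."""
    unique = list(dict.fromkeys(emoticons))  # de-duplicate, keep first-seen order
    return [(comment, label) for label in unique]

# Mirrors Table 1: repeated emoticons collapse, distinct ones become separate rows.
rows = unroll("I loved this video!", ["x", "x", "y"])
```

Each resulting row carries exactly one label, which is what allows the problem to be treated as multi-class, single-label classification.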
The other assumption exists only for our Naive Bayes Model, and it is that all words
in the comments are independent. This assumption is difficult to back up, and it is not clear
whether there is mutual dependence or mutual exclusivity between words. However, our
Recurrent Neural Network does not have this limitation because it can model the entire
sequence.
2.5 Preprocessing
One of the most important steps is the preprocessing stage, which is done before any
model is trained. We first separate the data into comments with emoticons and comments
without emoticons. We then make all comments lowercase and normalize both sets of
comments by creating a dictionary that maps punctuation to tokens, and a dictionary of
word counts over all comments that uses the ordering of each word as its embedding. Table
2 shows an example of how the dictionaries are used to tokenize a comment. A similar
process is used to encode the emoticons: we use a dictionary to encode them as integers.
Preprocessing the comments in this way gives us a normalized integer sequence, which
handles comments that differ only in the capitalization of words.
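A minimal sketch of this preprocessing is given below. The punctuation dictionary entries and the choice to rank words by descending frequency are assumptions for illustration; the paper does not list its dictionaries or state the exact ordering.

```python
from collections import Counter

# Hypothetical punctuation-to-token dictionary (assumption).
PUNCTUATION_TOKENS = {"!": "<EXCLAMATION>", "?": "<QUESTION>",
                      ".": "<PERIOD>", ",": "<COMMA>"}

def normalize(comment):
    """Lowercase the comment and replace punctuation with separate tokens."""
    text = comment.lower()
    for mark, token in PUNCTUATION_TOKENS.items():
        text = text.replace(mark, f" {token} ")
    return text.split()

def build_vocabulary(comments):
    """Map each word to an integer id by frequency rank (most common word gets 0)."""
    counts = Counter(word for comment in comments for word in normalize(comment))
    return {word: rank for rank, (word, _) in enumerate(counts.most_common())}

def encode(comment, vocabulary):
    """Convert a comment to an integer sequence, skipping unknown words."""
    return [vocabulary[word] for word in normalize(comment) if word in vocabulary]
```

Because lowercasing happens before lookup, "GREAT video" and "great video" map to the same integer sequence.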
2.6 Dummy Model
For purposes of comparison, we created a very simple model that always predicts that
a comment would use the emoticon with the largest prior probability. The motivation
is that this gives us a baseline score to beat. If we can do significantly better than
this, then we know that the models have potential.
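A baseline of this kind can be sketched as follows. The function names are hypothetical, and the sketch assumes a single constant guess per comment, as the description above suggests.

```python
from collections import Counter

def fit_dummy(train_labels):
    """Return the label with the largest prior probability in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]

def dummy_accuracy(most_common_label, validation_label_sets):
    """Fraction of validation comments whose emoticon set contains the constant guess."""
    hits = sum(1 for labels in validation_label_sets if most_common_label in labels)
    return hits / len(validation_label_sets)
```

On a heavily skewed label distribution, even this constant guess can score well, which is why it is a useful floor for the other models.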
2.7 Naive Bayes Model
Our second model uses Bayesian statistics to create tables of posterior probabilities
for each class given a word, using Bayes' rule. Naive Bayes is a conditional
probability model: given some instance to be classified, represented by a vector of
features:

$$x = (x_1, \ldots, x_n)$$
We then compute the probability of each output class $C_k$ using conditional probability

$$p(C_k \mid x_1, \ldots, x_n)$$
Table 2: Tokenization Process
Since $n$ can be large, making this model less tractable, we need to reformulate our
model using Bayes' rule. In plain English,

$$\text{posterior} = \frac{\text{prior} \cdot \text{likelihood}}{\text{evidence}}$$

and symbolically,

$$p(C_k \mid x) = \frac{p(C_k)\, p(x \mid C_k)}{p(x)}$$

In practice, the numerator is the most important part, as the denominator does not depend
on $C_k$, effectively making it a constant. The numerator is equivalent to the joint
probability model, meaning we can replace the numerator with

$$p(C_k, x_1, \ldots, x_n)$$
We can then rewrite the numerator using the chain rule for repeated applications of
conditional probability; the derivation is in Appendix 1. Then we add the naive assumption
of conditional independence, allowing us to further simplify our model to

$$p(C_k \mid x_1, \ldots, x_n) = \frac{1}{Z}\, p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k)$$

where $Z$ is

$$Z = p(x) = \sum_{k} p(C_k)\, p(x \mid C_k)$$

which is a scaling factor that depends only on the instance. The derivation is in Appendix 2.
To make a classifier, we would generally take the argmax of the simplified model
without $Z$, but in our case we take the top five arguments, as our program recommends
multiple emoticons that might be appropriate; this is a small departure from the usual
definition of a Naive Bayes classifier.
We implement this model in Python, and the model follows Figure 3.

Figure 3: Naive Bayes Model
Another problem is that we have to deal with words that never show up in our corpus
of texts. To handle this, we smooth the probabilities: any word or class that does not
show up is given a very small probability that is close to, but not, zero.
Otherwise, the whole product would zero out whenever a word is not in the corpus.
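The classifier and its smoothing can be sketched as below. This is not the authors' implementation: it swaps in standard add-alpha (Laplace) smoothing as one common way to give unseen words a small nonzero probability, works in log space to avoid underflow, and uses hypothetical function names. It ranks classes by the class prior times the product of per-word likelihoods and returns the top k.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(pairs):
    """Count class and per-class word frequencies from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in pairs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary

def log_score(tokens, label, model, alpha=1.0):
    """log prior + sum of log likelihoods, with add-alpha smoothing (assumption)."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    # One extra vocabulary slot is reserved for words never seen in training.
    denom = sum(word_counts[label].values()) + alpha * (len(vocabulary) + 1)
    result = math.log(class_counts[label] / total)
    for token in tokens:
        result += math.log((word_counts[label][token] + alpha) / denom)
    return result

def recommend(tokens, model, k=5):
    """Return the k classes with the highest unnormalized posterior."""
    ranked = sorted(model[0], key=lambda label: log_score(tokens, label, model),
                    reverse=True)
    return ranked[:k]
```

The scaling factor is never computed here because it is the same for every class and so does not change the ranking.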
2.8 Recurrent Neural Network
Our third and final model is a recurrent neural network; our architecture is shown
in Table 3.
Input
Embedding Layer
LSTM Layer
LSTM Layer
LSTM Layer
Fully-Connected Layer
Output layer
Table 3: RNN Architecture
Recurrent neural networks are a class of neural networks whose connections form a
directed cycle, allowing them to take time into account through a notion of memory. This
makes RNNs well suited to predicting arbitrary sequences by taking advantage of their memory.
The label data also undergoes another transformation before the RNN begins the
learning process. Since the emoticons are encoded using an ordinal number, the integer
representation does not quite make sense as one emoticon is not greater than another. To
rectify this, we represent the integer as a one-hot vector: we take a fixed-length
vector whose size is the total number of output classes, and the integer is used as the
index of the “hot” class. Table 4 gives a small example of encoding a small class space.
Table 4: One-Hot Encoding
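One-hot encoding of an integer label can be sketched in a few lines. The function name and the four-class example are illustrative assumptions.

```python
def one_hot(index, num_classes):
    """Return a list of zeros of length num_classes with a 1 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("index must be in the range [0, num_classes)")
    vector = [0] * num_classes
    vector[index] = 1
    return vector

# Class 2 in a hypothetical four-class space.
encoded = one_hot(2, 4)
```

Every encoded vector is equally distant from every other, so no emoticon is treated as greater than another.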
One of the major features of this model is the stacked LSTM layers. This architecture
allows us to better model hierarchical elements of language. This means each layer will
represent progressively complex parts of the hierarchy. One might imagine this in terms
of the composition of the human face. For example, the most basic element is an edge.
Then a more complex step would be individual elements of the face such as a nose or
mouth. Then the most complex part would be the entire face, and the composition of its
requisite parts.
The LSTM itself is able to remember previous contexts in sentences, meaning we
could potentially get more performance as our model becomes better at modelling
context. Our RNN took much longer to run, so in order to train the model we
decided to use more powerful hardware in the form of a GPU. The neural network was
trained on a GPU using FloydHub, a platform for running deep learning projects. The
expense was roughly 14 dollars, as we subscribed to the Data Science plan, which gave
us 10 hours of GPU time that we used for experimentation on multiple occasions. The
price was remarkably cheap compared to other platforms such as Amazon. Usage of
FloydHub is remarkably simple, and resembles version control programs such as git. One
simply uploads code to the website using command line tools, and is given an
interface to interact with the instance. This service was worthwhile to learn because it
abstracted away elements such as infrastructure, version control, and storage, so we could
focus on the problem.
In addition to our baseline architecture, we also perform dropout on each layer,
which helps prevent training bias because the network probabilistically “drops” some
of the weights, which forces the network to build redundancies. For the training metric, we
implemented the top k categorical accuracy metric described in the evaluation metrics. For
the objective function, we found that categorical cross entropy worked best; it typically
works well in multi-class, single-label scenarios. Using TFLearn, a deep learning library
for Python, we implemented the architecture we decided on with relative ease. TFLearn
builds on top of TensorFlow, abstracting away many of the lower-level computational
components, and allowing the programmer to think about the layers and interactions
between layers rather than how to build a well-known type of layer or cell.
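To make the objective concrete, the sketch below computes categorical cross entropy for a single one-hot target in plain Python. It is an illustration of the loss only, not the TFLearn implementation; the `softmax` helper and the `eps` clipping constant are assumptions added so the example is self-contained.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot_target, predicted, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(one_hot_target, predicted))
```

With a one-hot target only the true class contributes, so the loss is small when the model puts high probability on the correct emoticon and grows as that probability shrinks.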
2.9 Implementation
2.9.1 Programming Language and Libraries
• Python 3
• TFLearn, a deep learning library featuring a higher-level API for TensorFlow.
• TensorFlow, a deep learning library.
As mentioned throughout the text, the models were implemented using the listed
libraries. We did our coding on the website FloydHub via IPython Notebooks, which
abstracted away much of the setup. We split our code into three notebooks: one each for
preprocessing, the Bayesian model, and the RNN. We ran into very few problems implementing
our solution; however, some are outlined below.
2.9.2 Problems
• Bayes Smoothing We ran into a small hitch with the Bayesian model when
querying prior probabilities for values that did not exist in the data. We
used a technique to “smooth” the values by assigning a small probability to these values.
• Skin Tone Modifiers There are emoticons that modify other emoticons, e.g.
allowing one to change the skin tone of the smiley face. We found that these confounded
our predictions, and removed them as possible predictions.
• Finding loss, activation, and metrics We had to experiment many times to find the
best loss, activation, and metric functions for our RNN. In our experience, this process
can come down to simple trial and error.
2.10 Refinement
Originally, our RNN model did not perform as well as we had hoped; however, a few
optimizations to our model greatly improved its performance. The first model we used was
a multi-class, multi-label classifier, which performed very poorly: its accuracy
was .508, which left much to be desired. We believe the reason for this is that
instead of one-hot encoded vectors, we had many-hot encoded vectors. This means that the
label space would be of order $2^{\#\text{ of emoticons}}$. Since this space is extremely
large, the model would have trouble representing any reasonable portion of it. For this
reason, we needed to unroll data points to perform multi-class, single-label
classification. After adjusting our loss function, metric function, and activation
function, we ended up with much better performance. We believe this is because of the
reduction in potential labels to just the number of emoticons. In addition,
hyperparameters such as learning rate and batch size were adjusted to find out what
setting worked best. The best we found was a learning rate of .001
and a batch size of 128.
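The difference in label-space size can be checked with a short calculation. The paper does not state how many distinct emoticons it uses, so the value 100 below is purely hypothetical.

```python
def multi_label_space(num_emoticons):
    """Number of distinct many-hot vectors: every subset of the emoticons."""
    return 2 ** num_emoticons

def single_label_space(num_emoticons):
    """Number of distinct one-hot vectors: one per emoticon."""
    return num_emoticons

NUM_EMOTICONS = 100  # hypothetical count, for illustration only
ratio = multi_label_space(NUM_EMOTICONS) // single_label_space(NUM_EMOTICONS)
```

Even for a modest number of emoticons the many-hot space is astronomically larger, which is consistent with the explanation given for the poor multi-label results.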
3. Results
In order to validate the models, we created a holdout set of labelled data that none of
the models got to use for training or testing. The accuracy of each model using top k
categorical accuracy is in tables 5 and 6.
Model         Accuracy
Dummy         .527
Naive Bayes   .859
RNN           .702
Table 5: Training Accuracy Results

Model         Accuracy
Dummy         .527
Naive Bayes   .548
RNN           .812
Table 6: Validation Accuracy Results
Table 6 gives us a measurement of how well our recommendation engine gives us
accurate emoticons to represent our text. Our results do not promote strong confidence in
our Naive Bayes Model’s ability to recommend emoticons; however, there are some
potential improvements to the model such as n-gram modelling. Notably, the Bayesian
model performs decently on the training data, but generalizes quite poorly and shows signs
of over-fitting. The RNN, on the other hand, surprisingly performs slightly worse on
training, but performs much better on the validation set. Whatever the reason for this
phenomenon, it is clear that the model generalizes much better.
3.1 Visualization of Model Functionality
We have a model that could be incorporated into a wide variety of applications; for
example, a browser plugin that predicts what emoticons you might put with a comment
and assists the user, similar to an auto-complete feature. One issue to consider might be the
nature of Youtube comments themselves, which might prevent the generalization of this
model to other applications. However, the models do show that this sort of functionality is
possible. For example, we have pulled some examples from the data and run them through
our models to produce the table below; the comments themselves seem to be quite
different from more formal forms of language.
Table 7: Example data and predictions
While the machine learning back-end may not be the most sophisticated, the model
does a good job in practice of giving recommendations, and we think the model would be
good enough to use for applications to be built on top of.
3.2 Limitations
One limitation of our models is that words that do not show up in the Youtube
comment corpus cause issues, as our models have trouble predicting outputs for words
that they have never seen. One way to fix this might be to mine more comment data. One
drawback of the Naive Bayes model is that we may not be able to model longer-term
trends in comments; however, given the short length of the comments, this may be a
non-issue. We are also limited in our choice of language modelling because we work at the
word level. We would likely see large improvement by expanding our level of modelling to some
type of n-gram. The RNN has limitations in multi-class classification, and this may be
hindering its ability to learn. Another limitation might be that the training time is cost
prohibitive. The model would likely continue to learn and perform better with more
training time and data, meaning ultimately a higher cost for the model. The Naive Bayes
model is easy to program, has a fast run time, and does not need to train for hours upon hours.
Another major consideration is that an RNN might be a bad fit. We originally thought
long-term sequential modelling would be important, but it turns out the average comment
length is 15 words. Since the texts are so short,
we might have to thoroughly rethink our strategy if this sequential
modelling is unimportant.
3.3 Future Work
In order to eliminate the assumption of independence in the Bayesian model, we can
add complexity by changing the level at which we model the data. To do so, we
would need to employ a skip-gram or n-gram model that contains larger parts of the
sequence data. One
might also explore alternative Bayesian models such as Hidden Markov Models. The same
improvements to the data modelling using n-grams would likely improve the quality of the
RNN results. The RNN model likely has a great deal of room for improvement; one might
experiment with hyperparameter tuning or modifying the architecture. There are even more
powerful models such as CRNNs and GANs that push the state of the art in deep learning.
These models would be worth exploring; however, we pushed our newfound deep learning
knowledge as far as we could in the time allotted.
Another important consideration is the unrolling of the data. Future work should
further explore how to deal with multi-class classification, which would likely involve
writing new validation and loss functions for the neural network model. However, the
Naive Bayes Model does not suffer from this limitation.
Future work might also try to further connect emoticons and sentiment. We
hypothesize that emoticons will naturally lend themselves to easy conversion into
sentiment classes. However, our current models predict only what emoticon might be used,
and the user of the model would have to infer what sentiment the emoticon might convey
depending on context.
One might also find more optimizations by adding further preprocessing steps, for
example, eliminating common English words that add very little information.
3.4 Reflection
Looking back at the process, here are the steps we took to get to the current models:
• Literature Review We made sure to have a rough idea of what people in this field
have tried, and what the state of the art is.
• Deciding on a Model After reviewing the field, we made a decision on what models
we wanted to implement which set the tone for preprocessing and implementation.
• FloydHub Next we set up our programming environment with cloud computing in
mind. It’s important to set up an environment such as FloydHub or AWS to minimize
training time on a fast GPU. At this step we also made sure to download all the libraries we
would need.
• Preprocessing A large majority of our time was spent learning how to deal with
the data, and exploring the data itself. We had to go through multiple iterations of
embedding and tokenization to find the method that made sense.
• Model Implementation After preprocessing our data, this step was fairly
straightforward. Most of the time at this step is dealing with edge cases, or optimization of
models rather than the actual implementation.
• Reftnement Refinement may have been the hardest part because we had to make
inferences about why our model was not performing up to our desires. It’s hard to say what
the potential of each model was, so we kept iterating until we had something that seemed
substantial.
3.5 Conclusion
Overall, there are many areas for potential improvement, and our work serves as a
baseline for recommending emoticons. Nevertheless, we have begun to answer our original
question: it seems plausible that emoticons can be assigned accurately to comments as
noisy as Youtube comments, making it easy for a casual observer to understand the
sentiment of a text.
Acknowledgment: The authors thank the anonymous referee for the constructive
review of the paper, which has greatly improved the quality of the article. The
authors would also like to thank the mathematics department at the University of
Evansville for its generous support.
Appendix
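1. Chain rule for repeated applications of conditional probability.
\[
\begin{aligned}
p(C_k, x_1, \dots, x_n) &= p(x_1, \dots, x_n, C_k) \\
&= p(x_1 \mid x_2, \dots, x_n, C_k)\, p(x_2, \dots, x_n, C_k) \\
&= p(x_1 \mid x_2, \dots, x_n, C_k)\, p(x_2 \mid x_3, \dots, x_n, C_k)\, p(x_3, \dots, x_n, C_k) \\
&= \cdots \\
&= p(x_1 \mid x_2, \dots, x_n, C_k)\, p(x_2 \mid x_3, \dots, x_n, C_k) \cdots p(x_{n-1} \mid x_n, C_k)\, p(x_n \mid C_k)\, p(C_k)
\end{aligned}
\]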
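2. Naive assumption of conditional independence to simplify the model. Thus the joint
model can be derived via:
\[
\begin{aligned}
p(C_k \mid x_1, \dots, x_n) &\propto p(C_k, x_1, \dots, x_n) \\
&= p(C_k)\, p(x_1 \mid C_k)\, p(x_2 \mid C_k)\, p(x_3 \mid C_k) \cdots \\
&= p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k)
\end{aligned}
\]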
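The factorised model above can be turned into a small classifier. The sketch below is illustrative only and is not the implementation used in this study: the function names `train` and `predict`, the toy data, and the add-one (Laplace) smoothing are all assumptions introduced here. It scores each class by log p(C_k) plus the sum of log p(x_i | C_k), which is the logarithm of the product in item 2.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count classes and per-class tokens from (tokens, label) pairs."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict(tokens, class_counts, token_counts, vocab):
    """Return the class maximising log p(C_k) + sum_i log p(x_i | C_k)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior, log p(C_k)
        # Add-one smoothing (an assumption here) keeps unseen tokens from zeroing the product.
        denom = sum(token_counts[label].values()) + len(vocab)
        for token in tokens:
            score += math.log((token_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Hypothetical toy comments labelled with the emoticon they contained.
    data = [
        (["love", "great"], ":)"),
        (["great", "fun"], ":)"),
        (["hate", "bad"], ":("),
        (["bad", "boring"], ":("),
    ]
    model = train(data)
    print(predict(["great", "love"], *model))
```

Working in log space avoids numerical underflow when a comment contains many tokens.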
References
[1] Anderson, C., McMaster, G. (1982). Computer Assisted Modeling of Affective Tone
in Written Documents. Computers and the Humanities, 16(1), 1-9.
[2] Brill, E., Mooney, R. J. (1997). An Overview of Empirical Natural Language
Processing. AI Magazine, 18(4), 13.
[3] Chipman, S. E. (2017). The Oxford Handbook of Cognitive Science. Oxford: Oxford
University Press.
[4] Dave, K., Lawrence, S., and Pennock, D. (2003). Mining the Peanut Gallery: Opinion
Extraction and Semantic Classification of Product Reviews. In Proceedings of the
12th International Conference on World Wide Web (WWW 03). ACM, New York,
NY, USA, 519-528.
[5] Hu, M., Liu, B. (2004). Mining and Summarizing Customer Reviews. In Pro- ceedings
of the Tenth ACM SIGKDD International Conference on Knowl- edge Discovery and
Data Mining pp. 168-177. ACM.
[6] Hogenboom, Alexander, et al.(2013) Exploiting emoticons in sentiment analysis.
Proceedings of the 28th Annual ACM Symposium on Applied Computing. ACM.
[7] Kang, Mangi, Jaelim Ahn, and Kichun Lee.(2017) Opinion mining using ensem- ble
text hidden Markov models for text classification. Expert Systems with Applications.
[8] Lee, Ji Young, and Franck Dernoncourt.(2016) Sequential short-text classifi- cation
with recurrent and convolutional neural networks. arXiv preprint arXiv:1603.03827.
[9] Liu, B. (2012). Sentiment Analysis and Opinion Mining. Morgan and Claypool.
LSTM Networks for Sentiment Analysis. DeepLearning 0.1 Documentation,
Deeplearning. Retrieved December 01, 2017.
[10] Mitchell, J. (Datasnaek). Trending Youtube Video Statistics and Comments.
Kaggle, Kaggle Inc., Aug./Sep. 2017.
374 Can emoticons be used to predict sentiment?
[11] Moraes, R., Valiati, J. F., Neto, W. P. G. (2013). Document-level Sentiment
Classification: An Empirical Comparison between SVM and ANN. Expert Systems
with Applications, 40(2), 621-633.
[12] Naive Bayes classifier.Wikipedia, Wikimedia Foundation INC, 30 Nov. 2017,
Available from http://en.wikipedia.org/wiki/NaiveBayesclassifier.
[13] Pang, B., Lee, L., Vaithyanathan, S. (2002). Thumbs up? Proceedings of the ACL-02
Conference on Empirical Methods in Natural Language Processing
– EMNLP 02.
[14] Pozzi, F. A. (2017). Sentiment Analysis in Social Networks. Amsterdam: Else- vier.
[15] Rosenthal Sara, Noura Farra, and Preslav Nakov. (2017 )SemEval-2017 task 4:
Sentiment analysis in Twitter. Proceedings of the 11th International Workshop on
Semantic Evaluation .
[16] Salas-Za ŕate, M. P., Medina-Moreira, J., Lagos-Ortiz, K., Luna-Aveiga, H.,
Rodr íguez-Garc ía, M. A .́, and Valencia-Garc ía, R.(2017) Sentiment Analy- sis on
Tweets about Diabetes: An Aspect-Level Approach. Computational and
Mathematical Methods In Medicine, 1-9.
[17] Siersdorfer, Stefan, et al.(2010) How useful are your comments?: analyzing and
predicting Youtube comments and comment ratings. Proceedings of the 19th
international conference on World wide web. ACM.
[18] Taboada, Maite, et al.(2011) Lexicon-based methods for sentiment analysis.
Computational linguistics 37.2:267-307.
[19] Turney, P. D. (2002). Thumbs Up or Thumbs Down?: Semantic Orientation Applied
to Unsupervised Classification of Reviews. In Proceedings of the 40th Annual
Meeting on Association for Computational Linguistics, pp. 417-424. Association for
Computational Linguistics.
[20] Wang, X., Liu, Y., Sun, C., Wang, B., Wang, X. (2015). Predicting Polarities of
Tweets by Composing Word Embeddings with Long Short-Term Memory. In
Proceedings of the 53rd Annual Meeting of the Association for Compu- tational
Linguistics and the 7th International Joint Conference on Natural Language
Keenen Cates, Pengcheng Xiao, Zeyu Zhang, Calvin Dailey 375
Processing Volume 1: Long Papers, pp. 1343-1353, Beijing, Chi- na. Association for
Computational Linguistics.
[21] Whitelaw, C., Garg, N., Argamon, S. (2005). Using Appraisal Groups for Sen- timent
Analysis. In Proceedings of the 14th ACM International Conference on Information
and Knowledge Management, pp. 625-631. ACM.
∗Corresponding author: px3@evansville.edu; fax: (812)488-2944
International Journal for Quality in Health Care, 2018, 30(9), 660–677
doi: 10.1093/intqhc/mzy080
Advance Access Publication Date: 17 May 2018
Review
The patient safety culture: a systematic review by characteristics of Hospital Survey on
Patient Safety Culture dimensions
CLÁUDIA TARTAGLIA REIS1,2, SOFIA GUERRA PAIVA2, and PAULO SOUSA2,3
1Brazilian Minister of Health, SMS Cataguases, Rua José Gustavo Cohen, 70 Cataguases, MG 36772-014, Brazil,
2National School of Public Health, Universidade Nova de Lisboa, Avenida Padre Cruz, 1600-540 Lisboa, Portugal,
and 3CISP—Centro de Investigação em Saúde Pública, ENSP-Universidade Nova de Lisboa, Avenida Padre Cruz,
1600-540, Lisboa, Portugal
Address reprint requests to: Cláudia Tartaglia Reis, Rua Manoel Ramos Trindade 76/201 Cataguases, MG, Brazil.
Tel: +55 32-3421-3121; Fax: +55 32-3429-2600; E-mail: clautartaglia@gmail.com
Editorial Decision 22 March 2018; Accepted 3 May 2018
Abstract
Purpose: To learn the weaknesses and strengths of safety culture as expressed by the
dimensions measured by the Hospital Survey on Patient Safety Culture (HSOPSC) at hospitals
in the various cultural contexts. The aim of this study was to identify studies that have
used the HSOPSC to collect data on safety culture at hospitals; to survey their findings in
the safety culture dimensions and possible contributions to improving the quality and
safety of hospital care.
Data sources: Medline (via PubMed), Web of Science and Scopus were searched from 2005 to
July 2016 in English, Portuguese and Spanish.
Study selection: Studies were identified using specific search terms and inclusion criteria. A total
of 33 articles, reporting on 21 countries, was included.
Data extraction: Scores were extracted by patient safety culture dimensions assessed by the
HSOPSC. The quality of the studies was evaluated by the STROBE Statement.
Results: The dimensions that proved strongest were ‘Teamwork within units’ and ‘Organisational
learning–continuous improvement’. Particularly weak dimensions were ‘Non-punitive response to
error’, ‘Staffing’, ‘Handoffs and transitions’ and ‘Teamwork across units’.
Conclusion: The studies revealed a predominance of hospital organisational cultures that were
underdeveloped or weak as regards patient safety. To be effective, safety culture
evaluation should be tied to strategies designed to develop safety culture hospital-wide.
Key words: patient safety, safety culture, survey, hospital care, quality improvement
Background
Patient safety is a critical component of the quality of healthcare. It
is increasingly recognised that strengthening safety culture in health
organisations is important to continuously improving the quality of
care. Strong safety culture is associated with achieving favourable
outcomes, especially in hospitals [1, 2].
Safety culture comprises an understanding of values, beliefs and standards as regards
what is important in an organisation and what safety-related attitudes and behaviour are
valued, supported and expected [3]. Organisations with a strong safety culture are
characterised by good communication among staff, mutual trust and common perceptions of
the importance of safety and the effectiveness of preventive measures [4, 5].
© The Author(s) 2018. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved.
For permissions, please e-mail: journals.permissions@oup.com
Safety culture is a multidimensional concept defined, in the health service context, as
the product of values, attitudes, perceptions, competences and standards of individual and
group behaviour that determine the administration’s commitment, style and proficiency in
managing patient safety [6].
Hospital safety culture assessment is being used as a management tool and encouraged by
health policymakers and managers in countries around the world. The culture assessment has
multiple uses: (i) building staff awareness on patient safety; (ii) evaluating the
present state of patient safety culture (PSC) in the organisation; (iii) identifying strong
points of safety culture and areas for improvement; (iv) analysing safety culture trends
over time; (v) evaluating the impact on safety culture of initiatives and interventions to
improve patient safety and (vi) drawing comparisons within and between health
organisations [3].
In the 2000s, questionnaires and assessment instruments were developed to assist in
understanding an organisation’s safety culture and whether it is ready to receive measures
to improve the safety and quality of care as well as to ascertain what factors may
favour or hinder efforts in this respect. These questionnaires are based on a combination
of dimensions; they are considered an efficient strategy and offer methodological
advantages, such as assuring participant anonymity and lower costs than qualitative
approaches [7]. Since the mid-2000s, such instruments have been the subject of a number of
review studies, which have compared their overall characteristics and examined their
psychometric properties [8–11].
The Hospital Survey on Patient Safety Culture (HSOPSC), created by the Agency for
Healthcare Research and Quality (AHRQ) in the USA [12], is applicable to hospital staff
whose work influences patient care directly or indirectly, from housekeeping and security
to nurses and physicians (clinical staff or non-clinical staff, such as unit clerks, staff
in units such as pharmacy, laboratory/pathology, staff in other areas, such as
administration and management). The HSOPSC has performed satisfactorily in psychometric
analyses, as demonstrated by a number of studies [9–11], and is accessible to
professionals the world over interested in assessing safety culture at their hospital. It
is being used by hundreds of hospitals in the USA and several other industrialised and
developing countries. By 2015, more than 60 countries had published studies using this
instrument, which is available in some 30 different translations, backed by transcultural
adaptation studies [13].
In this context, the study question here is: as measured by the HSOPSC in the various
cultural contexts in which it has been used, what dimensions of safety culture in
hospitals are classified as strong and weak?
Table 1 Search strategy in MEDLINE via PubMed (a)
Strategy: Keywords
#1: ‘Safety culture’ (All fields) OR ‘safety climate’ (All fields) OR organisational culture [MeSH Terms]
#2: Hospitals [MeSH Terms]
#3: Patient safety [All fields]
#4: #1 AND #2 AND #3
(a) Period: 1 January 2005–31 July 2016. Languages: English, Portuguese and Spanish.
Table 2 Patient safety culture dimensions and definitions
Patient safety culture dimension: Definition (the extent to which…)
Unit level dimensions
Communication openness: Staff speak up freely if they see something that may affect a patient negatively and feel free to question those with more authority.
Feedback and communication about error: Staff are informed about errors that happen, are given feedback about changes implemented and discuss ways to prevent errors.
Teamwork within units: Staff support each other, treat each other with respect and work together as a team.
Non-punitive response to error: Staff feel that their mistakes and event reports are not held against them and that mistakes are not recorded in their personnel file.
Organisational learning–continuous improvement: Mistakes have led to positive changes and changes are evaluated for effectiveness.
Supervisor/manager expectations and actions promoting patient safety: Supervisors/managers consider staff suggestions for improving patient safety, praise staff for following patient safety procedures and do not overlook patient safety problems.
Staffing: There are enough staff to handle the workload and work hours are appropriate to provide the best care for patients.
Hospital level dimensions
Teamwork across units: Hospital units cooperate and coordinate with one another to provide the best care for patients.
Handoffs and transitions: Important patient care information is transferred across hospital units and during shift changes.
Management support for patient safety: Hospital management provides a work climate that promotes patient safety and shows that patient safety is a top priority.
Outcome dimensions
Frequency of events reported: Mistakes of the following types are reported: (1) mistakes caught and corrected before affecting the patient, (2) mistakes with no potential to harm the patient and (3) mistakes that could harm the patient, but do not.
Overall perceptions of patient safety: Procedures and systems are good at preventing errors and there is a lack of patient safety problems.
Source: Adapted from Sorra et al. [3].
Study objectives
This article sought to identify studies that have used the HSOPSC to collect data on
safety culture at hospitals and to learn their chief findings relating to safety culture
dimensions and possible contributions to improving the quality of hospital care.
We believe the HSOPSC to be a useful and accessible management tool for health personnel
and managers interested in safer and better quality healthcare for hospital patients.
Methods
The systematic literature review conducted to meet the stated aims
was guided by a protocol designed by the three authors.
The search methodology and related findings are described in
accordance with the relevant sections of the Preferred Reporting Items
for Systematic Reviews and Meta-Analyses (PRISMA) statement [14].
Articles were selected by consulting the following data bases: MEDLINE (via PubMed), Web
of Science and Scopus. The search strategy included combined terms using the Boolean
operator ‘OR’ between keywords or similar MeSH terms; and terms with different meanings
were combined using the Boolean operator ‘AND’ to refine the search. The search strategy
used for MEDLINE is shown in Table 1. For the other data bases, the strategy was the same,
but adapted to the characteristics of each. The search was complemented by consulting both
the Research Reference List of articles that have used the HSOPSC, which is posted on the
AHRQ website [13], and the references cited in the articles identified by the search.
Figure 1 Study selection flowchart.
Identification: Records identified by database search: MEDLINE (n = 239), Scopus (n = 365), Web of Science (n = 284); total (n = 888). Additional records identified from other sources (AHRQ Research Reference List) (n = 69).
Screening: Records after duplicates removed (n = 824). Records screened (n = 261). Records excluded (n = 179).
Eligibility: Full-text articles assessed for eligibility (n = 82). Full-text articles excluded, with reasons (n = 48):
– Did not assess culture (8)
– Did not present data on culture (12)
– Used another instrument (13)
– Used comparative database data (5)
– Assessed specific sectors (5)
– Included only one professional category (3)
– Results presented in article already included (3)
Included: Studies included (n = 33).
Given the diversity of specific features displayed by instruments available for assessing
PSC [9–11], we opted to select articles that met the following eligibility criteria:
(i) studies using the HSOPSC to measure the dimensions of safety culture among staff at
acute care hospitals and (ii) articles in English, Portuguese and Spanish. The following
studies were excluded: those (i) in the form of letters, editorials, commentaries, case
studies and reviews; (ii) with no abstract available; (iii) that focussed on only one
category of hospital staff; (iv) that focussed on only one specific hospital unit or
sector; (v) that focussed only on transcultural adaptation of the instrument, without
reporting findings on safety culture; (vi) that used information from data bases for
benchmarking, where eligibility and sampling criteria are not given and (vii) published in
languages other than Portuguese, English and Spanish. The review period began in 2005,
1 year after the instrument was provided by the AHRQ, and ended on 31 July 2016. The
exclusion criteria were based on the concept of safety culture itself, one of whose
dimensions is defined as teamwork, as it was not the purpose of this review to learn about
the safety culture of specific professional categories, but of members of the overall
hospital staff. In the same way, priority was given to studies that assessed safety
culture in several hospital units or sectors because the study question here was to
ascertain the safety culture status in hospitals that had applied the HSOPSC for that
purpose.
The HSOPSC measures 12 dimensions of safety culture, with three or four items each,
totalling 42 items. The AHRQ recommends using the estimated mean percentage of positive
responses obtained in each dimension as the measure of safety culture status. As an
evaluative parameter, it suggests that any dimension for which the percentage of positive
responses is 75% or more should be considered a strong or developed dimension of safety
culture in the population studied. Meanwhile, any dimension for which the percentage of
positive responses is 50% or less should be considered ‘needing improvement’ and should be
prioritised in related investments [3]. However, early studies that assessed safety
culture in hospitals using the HSOPSC aimed primarily not to assess safety culture, but to
adapt the instrument transculturally for use in other countries. Many of these studies
evaluating safety culture dimensions estimated mean scores ranging from 0 to 5 in each
dimension, where a mean score closer to 5.0 denotes a dimension in which safety culture is
strong among hospital staff.
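The evaluative parameter described above can be expressed as a short rule. The following is a minimal sketch, not an AHRQ tool: the function name `classify_dimension` and the label ‘intermediate’ for scores between the two cut-offs are assumptions introduced here for illustration. The example scores are taken from Table 3.

```python
def classify_dimension(percent_positive: float) -> str:
    """Classify a dimension by its mean percentage of positive responses.

    Thresholds follow the parameter described in the text: 75% or more is
    strong, 50% or less is needing improvement. The 'intermediate' label for
    values in between is an assumption made for this illustration.
    """
    if percent_positive >= 75:
        return "strong"
    if percent_positive <= 50:
        return "needing improvement"
    return "intermediate"


if __name__ == "__main__":
    # Example scores reported in Table 3 (reference [46] for the first two, [48] for the third).
    examples = {
        "Teamwork within units [46]": 83.0,
        "Staffing [46]": 30.0,
        "Teamwork within units [48]": 72.0,
    }
    for dimension, score in examples.items():
        print(dimension, "->", classify_dimension(score))
```

This rule applies only to studies that report percentages of positive responses; studies that report mean scores on the 0 to 5 scale would need a separate comparison.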
Accordingly, the measures of interest to this systematic review were: in studies that
reported in percentage form, the mean percentage of positive responses obtained on
dimensions of safety culture and, in studies that opted to estimate measures ranging from
0 to 5, the mean scores estimated by dimension.
The 12 safety culture dimensions measured by the HSOPSC and
their respective definitions are given in Table 2.
To begin with, two of the authors, independently, read the titles of the articles. After
exclusion of duplicate articles and those that did not provide an abstract, the abstracts
of articles not excluded at this first stage were evaluated independently. Articles were
selected for inclusion in the review after independent readings of the complete texts. In
cases where one of the two authors raised doubts as to whether or not to include an
article in the review, a third evaluator who participated in designing the study was
consulted and a final decision was taken by consensus among the three.
Data were drawn from the articles on the basis of the information about their authors,
year of publication, study design, study period and site, study population
characteristics, how the survey was administered, response rate and main findings on the
safety culture dimensions specified by the authors.
The quality of the studies selected was evaluated using the Strengthening the Reporting
of Observational Studies in Epidemiology (STROBE) tool [15], adapted into Portuguese,
which has a 22-item checklist, known as the STROBE Statement. This choice reflected the
fact that all the studies using the HSOPSC used an observational design as part of their
method.
Results
The searches of the three data bases (MEDLINE, Web of Science and Scopus), on
24 September 2016, identified 888 relevant titles. To these were added 69 articles
identified in the Research Reference List posted on the AHRQ website [13]. After
eliminating duplicate titles, 824 articles were selected for reading. Of these, 563 were
discarded for meeting at least one of the exclusion criteria, leaving 261 whose abstracts
were read. After reading all the abstracts, 82 articles were selected for the complete
text to be read. No additional articles were included from examining the references of the
articles selected. Figure 1 shows a flowchart of the article selection process.
Figure 2 Studies by country and year of publication.
Table 3 Characteristics of the selected studies
Columns: Reference (year); Study site (HSOPSC language); Study design and period; Study population/setting/sample size/participant characteristics; Survey administration mode/response rate/number of HSOPSC dimensions; Study results (Stronger, Weaker, Obs.); STROBE instrument items not fully covered
Hefner, Hilligoss, Knupp et al. [48]
Study site (HSOPSC language): USA (English)
Study design and period: Cross-sectional study; HSOPSC was administered between mid-2011 and 2013.
Study population/setting/sample size/participant characteristics: Eight departments at Ohio State University Wexner Medical Center (OSUWMC), comprising six hospitals and two campuses/1425 employees were included before Crew Resource Management (CRM) training and 1308 afterwards/Nurses (advanced practice registered nurses), doctors (physicians, including fellows and some residents) and staff were included.
Survey administration mode/response rate/number of HSOPSC dimensions: Electronic mode/55% response rate (N = 784) pre-CRM and 51% response rate (N = 667) post-CRM/12 HSOPSC dimensions.
Stronger: ‘Teamwork within units’ (72% positive response rate).
Weaker: Low pre-CRM scores: ‘Non-punitive Response to Errors’ (28%), ‘Handoffs and Transitions’ (35%), ‘Staffing’ (42%), ‘Teamwork Across units’ (40%), ‘Frequency of Events Reported’ (46%), ‘Overall Perceptions of Patient Safety’ (48%) and ‘Communication Openness’ (49%). Low post-CRM scores: ‘Non-punitive Response to Errors’ (35%), ‘Handoffs and Transitions’ (42%), ‘Staffing’ (43%) and ‘Teamwork Across units’ (44%).
Obs.: No dimension scored ≥75%, either pre- or post-CRM.
STROBE instrument items not fully covered: No descriptive statistics given for participating professional categories.
Kiaei, Ziaee, Mohebbifar et al. [47]
Study site (HSOPSC language): Iran (Persian)
Study design and period: Cross-sectional study/hospitals of three central provinces of Iran (Tehran, Alborz and Qazvin) in 2013.
Study population/setting/sample size/participant characteristics: About 10 teaching hospitals of central provinces: Tehran, Alborz and Qazvin/552 hospital personnel/292 nurses (53.4%), 47 auxiliary health workers (8.6%), 36 physicians (7.6%), 31 operation room technicians (5.7%), 22 unit managers (4%), 15 speech therapists, audiologists or physiotherapists (2.7%), nine technicians (1.6%), five pharmacists (0.9%), one nutritionist (0.2%) and eight other jobs (1.5%).
Survey administration mode/response rate/number of HSOPSC dimensions: No information given on survey administration mode/none on response rate/12 dimensions of the HSOPSC.
Stronger: No dimension scored ≥75%.
Weaker: ‘Handoffs and Transitions’ (54.49%), ‘Frequency of event reporting’ (55.63%).
Obs.: ‘Teamwork within units’ is known to be the strongest point of patient safety culture (PSC) in most related studies, but not in this study.
STROBE instrument items not fully covered: Participant inclusion criteria not stated. Not stated in what form (e.g. paper or electronic) questionnaires were distributed. Response rate not reported. Study limitations not stated.
Al-Mandhari, Al-Zakwani, Al-Kindi et al. [46]
Study site (HSOPSC language): Oman (English), compared with Taiwan (Chinese), Lebanon (Arabic) and USA (English)
Study design and period: Cross-sectional study; data collection period was not stated.
Study population/setting/sample size/participant characteristics: Eight regional hospitals operating under the Oman Ministry of Health/professional and allied healthcare staff working in government hospitals in Oman/400 employees/nurses (60.15%), physicians (21.01%), technicians (8.88%), pharmacists (4.31%) and others (5.58%).
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/98% response rate (N = 390)/12 dimensions of the HSOPSC.
Stronger: ‘Organisational learning–continuous improvement’ (84%) and ‘Teamwork within units’ (83%).
Weaker: ‘Hospital non-punitive response to error’ (25.0%), ‘Staffing’ (30.0%) and ‘Handoffs and transitions’ (25.0%).
Obs.: –
STROBE instrument items not fully covered: Data collection period not stated.
El-Jardali, Sheikh, Garcia, Jamal et al. [45]
Study site (HSOPSC language): Saudi Arabia (Arabic)
Study design and period: Cross-sectional study. December 2011 to March 2012.
Study population/setting/sample size/participant characteristics: The hospital comprises two sites, Site A (large, 800 beds) and Site B (small, 104 beds)/3000 employees were included/registered nurses (50.1%), technicians (12.0%), attending or staff physicians (6.1%) and unit assistants, clerks or secretaries (5.2%).
Survey administration mode/response rate/number of HSOPSC dimensions: Mixed mode (electronic and hard copy format)/85.7% response rate (N = 2572)/12 dimensions of the HSOPSC.
Stronger: ‘Organisational Learning and Continuous Improvement’ (79.6%) and ‘Teamwork within units’ (78.5%).
Weaker: ‘Hospital non-punitive response to error’ (26.8%), ‘Staffing’ (35.1%) and ‘Communication openness’ (42.9%).
Obs.: When results on survey composites were compared with results from Lebanon and the USA, several areas requiring improvement were noted.
STROBE instrument items not fully covered: Participant inclusion criteria not stated.
Fujita, Seto, Kitazawa et al. [44]
Study site (HSOPSC language): Japan (Japanese)
Study design and period: Cross-sectional study in 2012.
Study population/setting/sample size/participant characteristics: Eighteen hospitals in Japan/12 076 healthcare workers/9.2% physicians, 46.4% nurses, 14.4% administrative workers and 30.0% other roles.
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/72% response rate (N = 8,700)/12-dimension HSOPSC.
Stronger: The highest-scoring dimension was ‘Teamwork within hospital units’ (total sample T = 70%; high patient safety score H = 79%; and lowest patient safety score L = 63%).
Weaker: ‘Hospital handoffs and transitions’ (T = 36%; H = 41%; L = 32%), ‘Staffing’ (T = 40%; H = 44%; L = 38%), ‘Non-punitive response to error’ (T = 43%; H = 50%; L = 37%) and ‘Teamwork across units’ (T = 44%; H = 52%; L = 38%).
Obs.: PSC scores were estimated for the total sample (T) and for two clusters, by two unit response patterns: those with the highest scores (High PSC units, H) and lowest scores (Low PSC units, L).
STROBE instrument items not fully covered: Reports that hospital participation was voluntary, but participant inclusion criteria not stated.
Eiras, Escoval, Grillo et al. [43]
Study site (HSOPSC language): Portugal (Portuguese)
Study design and period: Cross-sectional psychometric study.
Study population/setting/sample size/participant characteristics: Three hospitals; 4057 questionnaires were distributed; the final dataset totalled 884 questionnaires.
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/24.6% response rate (N = 884). The 12-dimension HSOPSC was confirmed.
Stronger: ‘Teamwork within units’ (70%), ‘Organisational learning–continuous improvement’ (65%) and ‘Supervisor/manager expectations and actions promoting patient safety’ (63%).
Weaker: ‘Non-punitive response to error’ (25%), ‘Management support for patient safety’ (37%) and ‘Staffing’ (39%).
Obs.: Measurement of healthcare safety culture is still at a relatively immature stage in Portugal.
STROBE instrument items not fully covered: Data collection period not stated. No descriptive statistics of participant characteristics given, possibly because the main aim was psychometric validation of the instrument used.
Agnew, Flin, Mearns [42]
Study site (HSOPSC language): Scotland (English)
Study design and period: Cross-sectional study in 2009.
Study population/setting/sample size/participant characteristics: A sample of National Health Service (NHS) acute hospitals, six NHS acute hospitals in Scotland/1866 clinical staff from many work areas/units/nurses (53%), nursing or healthcare assistants (13%) and medical and dental consultants (22%).
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/23% response rate/12-dimension HSOPSC.
Stronger: ‘Teamwork within units’ (73%).
Weaker: ‘Handover’ (32%), ‘Hospital management support for patient safety’ (38%), ‘Teamwork across units’ (39%), ‘Non-punitive response to error’ (44%), ‘Staffing’ (45%) and ‘Feedback and communication about error’ (45%).
Obs.: –
STROBE instrument items not fully covered: Study design not indicated in title or abstract.
Amarapathy, Sridharan, Perera et al. [41]
Study site (HSOPSC language): Sri Lanka (not given)
Study design and period: Cross-sectional descriptive study to assess current PSC in a tertiary care hospital in Sri Lanka.
Study population/setting/sample size/participant characteristics: A tertiary care hospital/of 389 respondents, 16 (the smallest percentage, 4.1%) were consultants, while 214 (the largest percentage, 55%) were nursing officers. The rest were 52 medical officers (13.4%), 42 house officers (10.8%), 41 administrators (10.5%) and 24 PG-trainees (6.2%).
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/no information on response rate/11-dimension version of HSOPSC.
Stronger: ‘Teamwork within units’ (84.8%), ‘Organisation learning–continuous improvement’ (82.5%) and ‘Overall perception of patient safety’ (81.3%).
Weaker: ‘Workload and staff’ (15.7%), ‘Frequency of events reporting as it occurs’ (36.6%) and ‘Non-punitive response to errors’ (39.4%).
Obs.: –
STROBE instrument items not fully covered: Data collection period not stated. Survey response rate not reported.
Davoodi, Mohammadzadeh, Shabestari et al. [40]
Study site (HSOPSC language): Iran (Persian)
Study design and period: Cross-sectional, analytical-descriptive study in the 3 months from April to June 2012.
Study population/setting/sample size/participant characteristics: Twenty-five government hospitals in Khorasan Razavi Province (13 in Mashhad and 12 in other cities) affiliated to Mashhad University of Medical Sciences/1200 clinical staff/nurses (77%), physicians (10%), laboratory staff (5.9%), radiology staff (3.5%), operation room staff (0.3%), general managers with no specialty in therapeutic procedures (0.2%).
Survey administration mode/response rate/number of HSOPSC dimensions: Hard copy format/76% response rate (N = 922)/12-dimension version of HSOPSC.
Stronger: ‘Organisational learning–continuous improvement’ (79.85%) and ‘Teamwork within units’ (71.92%).
Weaker: ‘Non-punitive response to error’ (21.57%), ‘Staffing’ (26.35%), ‘Frequency of events reported’ (42.85%) and ‘Communication openness’ (45.46%).
Obs.: –
STROBE instrument items not fully covered: Possible study limitations were not stated. External validity of results was not discussed.
Downloaded from https://academic.oup.com/intqhc/article-abstract/30/9/660/4998840 by Adam Ellsworth, Adam Ellsworth on 07 January 2019
Hamdan, Saleem [39] Palestine
(Arabic)
Cross-sectional design.
Data were collected
between July and
August 2011.
Eleven general public
hospitals in the West
Bank/1460 clinical and
non-clinical hospital
staff/most participants
were nurses and
physicians (69.2%).
Hard copy format/
response rate = 51.2%/
12-dimension version of
HSOPSC.
‘Teamwork within units’
(71%) and
‘Organisational learning
and continuous
improvement’ (62%).
‘Non-punitive response to
error’ (17%),
‘Frequency of events
reported’ (35%),
‘Communication
openness’ (36%),
‘Hospital management
support for patient
safety’ (37%) and
‘Staffing’ (38%).
— —
Jones, Skinner, High
et al. [38]
USA (English) Quasi-experimental
design: a cross-sectional,
descriptive comparison of
HSOPSC results from
an intervention group and a
static group/from
February 2008 to
March 2009.
Thirty-seven hospitals/
4601 personnel/static
group: nurses (27.0%),
allied health staff
(21.7%), non-clinical
support staff (15.3%),
clinical support staff
(9.9%), administration-
management (11.7%)
and the intervention
group: nurses (32.0%),
allied health staff
(23.3%), clinical
support staff (11.8%),
non-clinical support
staff (11.2%).
Hard copy format/
response rate = 75.3%
(N = 3465)/12-
dimension version of
HSOPSC
Intervention group vs.
static group:
‘Organisational
learning–continuous
improvement’ (76% vs.
71%), ‘Teamwork
within units’ (82% vs.
80%) and ‘Teamwork
across hospital
departments’ (67% vs.
62%).
— Mean positive
response scores are
not given for all
dimensions and it is
thus not possible to
identify mean scores
of <50%.
—
Nie, Mao, Cui et al.
[37]
China
(Chinese)
Cross-sectional study;
from July to December
2011.
Thirty-two hospitals in 15
cities across China/1160
healthcare workers,
physicians (surgical and
internal clinicians)/the
majority of respondents
were nurses (66%), then
surgical clinicians (33%)
and internal medicine
clinicians (30%).
Hard copy format/
response rate = 77% (N
= 1160)/10-dimension
version of HSOPSC
‘Organisation learning–
continuous
improvement’ (88%)
and ‘Teamwork within
units’ (84%).
‘Feedback and
communication about
error’ (50%) and
‘Staffing’ (45%).
— —
Occelli, Quenon, Kret
et al. [36]
France
(French)
Cross-sectional study in
January.
Seven hospitals in South-
western France. At the
selected hospitals/524
employees included:
nurses (45.8%),
auxiliary nurses
(32.7%), physicians
(13.9%) and others
(7.6%).
Hard copy format/
response rate = 76.5%
(N = 401)/10-dimension
version of HSOPSC
— ‘Overall perceptions of
safety’ (25.0–71.8%),
‘Non-punitive response
to error’ (3.5–47.1%),
‘Staffing’ (15.0–58.3%),
‘Hospital management
support for patient
safety’ (15.4–58.8%)
and ‘Teamwork across
hospital units’
(24.6–66.7%).
The article does not
mention whether the
findings revealed
dimensions classified
as stronger.
—
Table 3 Continued
Columns: Reference (year); Study site (HSOPSC language); Study design and period; Study population/setting/sample size/participant characteristics; Survey administration mode/response rate/number of HSOPSC dimensions; Study results (Stronger, Weaker, Obs.); STROBE instrument items not fully covered
Robida [35] Slovenia
(Slovene)
Cross-sectional
psychometric in 2010.
Three acute general
hospitals/all clinical and
non-clinical staff (n =
1745).
Hard copy format/
response rate = 60% (N
= 1048)/12-dimension
version of HSOPSC
— ‘Non-punitive response to
error’ (39% positive
response rate), ‘Staffing’
(31%), ‘Hospital
management support
for patient safety’
(39%) and ‘Teamwork
across hospital units’
(41%).
No PSC dimension
reached the
artificially set value
of 75% of positive
answers.
No descriptive
statistics given
on participant
characteristic.
Study limitations
not discussed.
External validity
of results not
discussed.
Abdelhai, Abdelaziz,
Ghanem [34]
Egypt (Arabic) Analytical, cross-sectional
design study; data was
collected from
December 2011 to
March 2012.
Cairo University Teaching
Hospitals—Cairo/400
healthcare providers/219
(54.8%) were
physicians, 99 (24.7%)
nurses and 82 (20.5%)
paramedical personnel.
Hard copy format/
response rate = 100%
(N = 400)/12-dimension
version of HSOPSC
‘Overall perceptions of
patient safety’ (74.3%).
‘Non-punitive response to
error’ (33.3%),
‘Supervisor/manager
expectations and
actions promoting
safety’ (36.8%),
‘Communication
openness’ (42%) and
‘Teamwork across
units’ (42.3%).
— Possible study
limitations not
reported.
Aboul-Fotouh,
Ismail, EzElarab
et al. [33]
Egypt (Arabic) Cross-sectional study;
data was collected from
November 2008 to
May 2009.
Ain Shams University
hospitals/738 healthcare
providers.
Hard copy format/
response rate = 69.1%
(N = 510)/12-dimension
version of HSOPSC
‘Organisational learning’
(78.2%).
‘Non-punitive response to
error’ (19.5%);
‘Handoffs and
transitions’ (24.6%),
‘Hospital management
support for patient
safety’ (27.2%),
‘Adverse event
reporting’ (33.4%),
‘Overall perception of
safety’ (33.9%),
‘Communication
openness’ (34.6%),
‘Teamwork across
units’ (38.0%),
‘Feedback and
communication about
error’ (39.7%),
‘Supervisor/manager
expectations and
actions promoting
safety’ (46.4%).
— —
Smits, Wagner,
Spreeuwenberg,
[32]
The
Netherlands
(Dutch)
A cross-sectional study
was conducted from
October 2006 to
February 2008.
Twenty-eight hospital
units of 20 hospitals in
the Netherlands/nurses
(74%), resident
physicians (10%),
medical specialists (6%)
and managers (2%),
other professions (5%).
Hard copy format/
response rate = 56% (N
= 542)/11-dimension
version of HSOPSC
‘Teamwork within units’
(3.83), ‘Communication
openness’ (3.72), ‘Non-
punitive response to
error’ (3.57).
‘Willingness to report’
(2.78), ‘Hospital
management support’
(2.82) and ‘Teamwork
across hospital units’
(2.85).
— —
Bagnasco, Tibaldi,
Chirone et al. [31]
Italy (Italian) Cross-sectional study. A hospital in Northern
Italy/1008
questionnaires were
distributed/directors/
coordinators,
physicians, nurses/
midwives,
physiotherapists and
technicians were
involved.
Hard copy format/
response rate = 71% (N
= 724)/12-dimension
version of HSOPSC.
‘Organisational learning—
continued improvement’
(74% positive
response).
‘Hospital management
support for patient
safety’ (28%), ‘Staffing’
(30%), ‘Teamwork
among hospital units’
(30%) and ‘Non-
punitive response to
error’ (35%).
No dimension scored
75% or more.
No descriptive
statistics of
study
participants
presented.
Occelli, Quenon,
Hubert et al. [30]
France
(French)
A cross-sectional,
descriptive study in
2007.
Six hospitals (three public
and three private) in the
Aquitaine region/488
professionals (268 were
nursing staff).
Hard copy format/
response rate = 65%/
12-dimension version of
HSOPSC.
— ‘Non-punitive response to
error’ (13–52%),
‘Staffing’ (14–64%),
‘Management support
for patient safety’
(7–67%), ‘Handoffs
and transition’
(27–70%).
No dimension scored
75% or more.
—
Bodur, Filiz [29] Turkey
(Turkish)
Psychometric cross-
sectional study in 2008
Three hospitals (one
general, one teaching
and one university
hospital) in the
metropolitan centre of
Konya Province/
physicians and nurses (n
= 309).
Hard copy format/by
hospital type, response
rates were 56% for
university hospitals,
72% for general public
hospitals and 86% for
teaching hospitals/10-
dimension version of
HSOPSC.
‘Teamwork within units’
(70%), followed by
‘Overall perceptions of
safety’ (62%).
Items in the ‘Frequency of
events reported’ (15%)
and ‘Non-punitive
response to error’
(24%).
— Study
participant
inclusion
criteria not
stated.
Campbell, Singer,
Kitch et al. [28]
USA (English) Cross-sectional study in
2008.
Massachusetts General
Hospital (MGH) a 900-
bed acute care hospital/
nurses and attending
physicians (N = 4 283)/
80% nurses and 20%
physicians.
Mixed mode (electronic
and hard copy format)/
73% response rate
(N = 2 163)/12
dimensions of the
HSOPSC
‘Teamwork within units’
(85%).
‘Handoffs and transitions’
(45%) and ‘Event
reporting’ (49%).
— —
Chen, Li [27] Taiwan
(Chinese)
Cross-sectional design in
2007.
Forty-two hospitals (10
medical centres, 16
regional hospitals and
16 community
hospitals)/1788
professionals included/
29.2% (N = 230)
physicians, 60.6% (N =
478) nurses and 10.2%
(N = 80) administrators.
Hard copy format/
response rate = 78.8%
(N = 788)/12-dimension
version of HSOPSC
‘Teamwork within units’
(94%) and ‘Supervisor/
manager expectations
and actions promoting
patient safety’ (74%).
‘Non-punitive response to
Error’ (45%), ‘Hospital
Handoffs and
Transitions’ (48%) and
‘Staffing’ (39%).
— —
El-Jardali, Jaafar,
Dimassi et al. [26]
Lebanon
(Arabic)
Cross-sectional study in
2009.
Sixty-eight Lebanese
hospitals participated in
the study/sample
= 12 250 employees/
physicians, nurses,
clinical and non-clinical
staff, pharmacy and
laboratory staff, dietary
and radiology staff,
supervisors and hospital
managers.
Most respondents (57.8%)
were nurses.
Hard copy format/
response rate = 55.56%
(N = 6807)/12-
dimension version of
HSOPSC
‘Teamwork within units’
(82.3%), ‘Hospital
management support
for patient safety’
(78.4%) and
‘Organisational learning
and continuous
improvement’ (78.3%).
‘Non-punitive response to
error’ (24.3%),
‘Staffing’ (36.8%) and
‘Hospital handoffs and
transitions’ (49.7%).
— Only percentage
of respondents
available was
for nurses.
Percentages of
other
professionals
not given.
Hellings, Schrooten,
Klazinga et al. [25]
Belgium
(Dutch)
Cross-sectional study
before and after
implementation
approach. First
measurement: between
September and October
2005, except for the
hospital five pilot
(April–May, 2005); the
second measurement:
between April and
August 2007.
Five hospitals- institutional
status (private and
public)/nurses (60.2%),
head nurses (3.9%),
nurse assistants (7.3%),
physicians (9.0%), head
physicians (1.8%),
junior physicians
(0.9%), pharmacists
(0.5%), pharmacy
assistants (1.1%),
middle management
(0.6%), technicians
(4.8%), paramedics
(5.3%) and others
(3.4%).
Hard copy format/77%
response rate in first
survey (N = 3940) and
68% (N = 3626) in
second survey/12-
dimension version of
HSOPSC
In both first and second
surveys, the highest
scoring was ‘Teamwork
within hospital units’,
even though no hospital
scored ≥75%.
Lowest scores (<50%) at the five hospitals in first and second measurement were ‘Non-punitive response to error’, ‘Staffing’, ‘Teamwork across hospital units’ and ‘Hospital handoffs and transitions’.
— —
Olsen [24] Norway
(Norwegian)
Cross-sectional study
validated two safety
climate instruments: (1)
Short Safety Climate
Survey (SSCS) and (2)
Hospital Survey on
Patient Safety Culture-
short form (HSOPSC-
short). The surveys
started in April 2006
and September 2007,
respectively.
A large regional hospital in
Norway.
The target group in the
hospital included health
workers and other
personnel employed in
the same working
environment as the
healthcare personnel/
nurses represented the
largest job category
(50%). ‘Non-nurses’
was not described.
Hard copy format/hospital
response rate was 55%
(N = 1919)/HSOPSC-
short form (five
dimensions).
At the hospital level, the
strong HSOPSC
dimensions were
‘Teamwork within
units’ (mean 3.84) and
‘Supervisor/manager
expectations and
actions promoting
safety’ (mean 3.82).
Meanwhile,
‘Organisational
management support
for safety’ was the
weakest dimension
(mean 2.85).
— —
Blegen, Gearhart,
O’Brien et al. [23]
USA (English)
Psychometric cross-
sectional study
Survey was administered
before the first
intervention (March to
June 2006) and again at
the end of the project
(March 2007).
Three hospitals; the survey
was administered to 454
healthcare staff before
and after a series of
multidisciplinary
interventions/(434
before, 368 after) were
mostly registered nurses
(30% before, 33%
after), followed by
medical residents (24%,
27%), pharmacists
(12%, 12%) and
attending physicians
(10%, 13%). The
remainder were other
nursing care providers
(12%, 5%), therapists
(5%, 6%),
administrators and
managers (2%, 2%) and
others (5%, 2%).
Hard copy format/
response rate pre-
intervention = 96% (N
= 434); response rate at
project end = 81% (N =
368)/11-dimension
version of HSOPSC.
‘Teamwork within units’
(78%).
‘Non-punitive response to
error’ (40%) and
‘Hospital handoffs and
transitions’ (42%).
— —
Smits, Wagner,
Spreeuwenberg
et al. [22]
The
Netherlands
(Dutch)
Cross-sectional study;
eight hospitals were
surveyed in May–June
2005 and 11 in May–June
2006.
Nineteen hospitals (nine
general hospitals, nine
teaching hospitals and
one university hospital)/
a total of 1889 hospital
staff participated in the
study/participants were
1174 registered nurses
(62.7%), 50 resident
nurses (2.7%), 65
clerks/secretaries
(3.5%), 69 resident
physicians (3.7%), 109
medical specialists
(5.8%), 58 managers
(3.1%) and 346 others
(18.3%).
Hard copy format/1889
respondents at 87 units
in 19 hospitals
completed the
questionnaire. Response
rates were scored for 67
of the 87 units: there
was no reliable
information about the
number of people
having received a
questionnaire in 20
units. The mean
response rate (known
for 67 units) was 80%
(25–100%). The
number of respondents
per unit ranged from
seven to 53 (mean of
22)/11-dimension
version of HSOPSC.
‘Teamwork within units’
(mean 3.88) and
‘Openness of
communication’ (mean
3.78).
‘Teamwork across
hospital units’ (mean
2.85), ‘Hospital
management support’
(mean 2.97) and
‘Frequency of event
reporting’ (mean 2.99).
— —
Al-Ahamadi [21] Saudi Arabia
(English)
Cross-sectional study
during May–August,
2008.
The study population
comprised all medical
and administrative staff
at all public and private
hospitals in Riyadh/
nurses (63.7%),
physicians (8.8%) and
technicians (8.1%); the
last category was
dieticians (0.4%).
Hard copy format/
response rate = 47.4%
(N = 1224)/12-
dimension version of
HSOPSC.
‘Organisational learning’
(75.9%), ‘Teamwork
within units’ (70%).
‘Handoffs and transitions’
(47.6%),
‘Communication
openness’ (44.2%),
‘Staffing’ (31.2%) and
‘Non-punitive response
to error’ (21.1%).
— Study limitations
not stated.
Sine, Northcutt [20] USA (English) Mixed method study:
cross-sectional study
(Phase 1); focus group
using techniques of
interactive qualitative
analysis (Phase 2).
A medium-sized urban
hospital setting.
Hard copy format/
response rate not given/
12-dimension version of
HSOPSC.
‘Teamwork within units’
(89%), ‘Management
Support for Patient
Safety’ (81%) and
‘Organisational
Learning’ (80%).
‘Non-punitive response to
error’ (45%).
— Study
participant
characteristics
not given.
No information
given on
sample size.
Saturno, Gama, De
Oliveira-Sousa
et al. [19]
Spain
(Catalan,
Basque
Galician
and
Spanish)
Cross-sectional study.
No information on data
collection period is
given.
Twenty-four hospitals (5
large—>500 beds, 13
medium—200–499 beds
and six small—<200
beds)/6257 health
professionals (N =
6257) (physicians,
nurses, pharmacists,
physiotherapists,
psychologists, etc.). The
sample comprised
mostly nurses (61.1%).
Hard copy format/
response rate = 40%/
12-dimension version of
HSOPSC.
‘Teamwork within units’
(71.8%) ‘Supervisor/
manager expectations/
actions’ (61.8%).
‘Adequate staffing’
(27.6%) and ‘Hospital
management support
for patient safety’
(24.5%).
— Data collection
period not
specified.
Smits, Christiaans-
Dingelhoff,
Wagner, Wal,
Groenewegen [18]
The
Netherlands
(Dutch)
Psychometric cross-
sectional study
The Dutch version of the
HSOPSC was
distributed at eight
hospitals in the
Netherlands in June
2005.
Eight hospitals (four
general, three teaching
and one university) in
the Netherlands/nurses
(59.8%), medical
consultants (6.8%),
resident physicians
(6.0%), administrative
staff (4.3%), trainee
nurses (2.6%) or in
management (2.4%).
Hard copy format/583
staff members
completed the
questionnaire (response
rate not available)/11-
dimension version of
HSOPSC.
‘Teamwork within units’
(3.89), ‘Communication
openness’ (3.76);
‘Adequate staffing’
(3.73), ‘Non-punitive
response to error’ (3.61)
and ‘Supervisor/
manager expectations/
actions’ (3.58).
‘Teamwork across
hospital units’ (2.82).
— —
Jones, Skinner, Xu,
Sun, Mueller [17]
USA (English) Cross-sectional study in
2005 and 2007.
Twenty-four Critical
Access Hospitals
(CAHs) in 2005 (1995
eligible employees); in
Spring 2007, 21 of these
24 CAHs chose to
participate in a
reassessment (1963
eligible employees).
Respondent demographics
by position were
consistent in 2005 and
2007: respectively,
nurses (35 and 37%);
allied health personnel
(28 and 24%); support
personnel (12 and
12%); administrators
and managers, (12 and
12%); providers, (7 and
6%); and others (7 and
8%).
Hard copy format/
response rate (2005) =
70.4%; response rate
(2007) = 70.0%/12-
dimension version of
HSOPSC.
In the first assessment
(2005): ‘Teamwork
within departments’
(80%).
In the second assessment
(2007): ‘Teamwork
within departments’
(81%), while
‘Organisational
learning–continuous
improvement’ and
‘Supervisor/manager
expectations and
actions promoting
patient safety’, achieved
75% scores.
‘Non-punitive response to
error’ scored lowest
(50% in 2005 and 52%
in 2007).
No dimension scored
<50%.
—
Hellings, Schrooten,
Klazinga et al. [16]
Belgium
(Dutch)
Cross- sectional study was
conducted from March
to November 2005.
Five general hospitals/
3940 individuals: 2813
nurses and assistants
(71.40%), 462
physicians (11.73%),
397 physiotherapists,
laboratory and
radiology assistants,
social workers (10.08%)
and 64 pharmacists and
pharmacy assistants
(1.62%).
The questionnaire was
distributed on paper/
response rate = 77% (N
= 3940)/11-dimension
version of HSOPSC.
‘Teamwork within
hospital units’ scored
highest (70%).
‘Hospital management
support for patient
safety’ (35%), ‘Non-
punitive response to
error’ (36%), ‘Hospital
transfers and
transitions’ (36%),
‘Staffing’ (38%) and
‘Teamwork across
hospital units’ (40%).
— —
Finally, 33 studies [16–48] were included, which had been pub-
lished between 2007 and 2016, all in English, except one in Spanish
[19]. Figure 2 shows the studies by country where they were carried
out and year of publication.
The 33 studies [16–48] originated from 21 countries at varying
stages of development. The characteristics of the 33 studies are
shown in Table 3.
All the studies used an observational epidemiological design and
presented findings on the status of safety culture in their study
sample. However, the studies’ focus varied:
(i) 11 studies focused primarily on evaluating the status of safety culture
among hospital staff [19, 22, 25–27, 30, 33, 34, 39, 41, 47]; (ii)
10 studies focused on psychometric validation of the HSOPSC [16, 18,
23, 24, 29, 31, 35–37, 43]; (iii) five studies evaluated safety culture by
investigating relations between dimensions of the culture and characteristics
of the hospitals or participants [28, 40, 42, 44, 45]; (iv) four
studies evaluated the effects on PSC of investments in improving the
quality and safety of healthcare at hospitals [17, 20, 38, 48]; (v) two
studies investigated associations between safety culture and outcome
variables [21, 32] and (vi) one study [46] evaluated safety culture
among hospital staff and made comparisons with studies in other
countries.
Regarding their participants, 26 (78.8%) of the 33 studies stated
that these were mainly nurses [16–19, 21–30, 32, 36–42, 44–47], in
proportions ranging from 27% [38] to 80% [28]. Five studies did not
give the demographic characteristics of the sample [20, 31, 35, 43, 48].
Approximately 85% of the studies (N = 28) collected their data
using the instrument on paper [16–19, 21–27, 29–44, 46], achieving
response rates ranging from 23% [42] to 100% [34].
Quality assessment of the studies
Of the articles included in this review, 45.5% (N = 15) met
the criteria of the STROBE Statement [16, 18, 21–25, 27, 28,
30, 32, 33, 37–39]. Of those that did not meet the STROBE
criteria: four presented no descriptive statistics on the participants
[17, 26, 31, 48]; three did not state the study participant inclusion
criteria [29, 44, 45]; two failed to specify the data collection period
[19, 46]; two did not discuss the study’s external validity or limitations
[34, 40]; one did not state the study design in its title or
abstract [42]; one did not report the response rate [41]; one stated
neither the data collection period nor descriptive statistics on the
participants [43]; one did not discuss the study’s limitations [21];
one stated neither the inclusion criteria, how data were collected,
the response rate nor the study’s limitations [47]; one did not give
descriptive statistics on the participants or discuss its limitations
and external validity [35]; and, lastly, one study did not give descriptive
statistics on its participants or information on sample size and
response rate [20] (Table 3).
As regards the status of PSC, which was the main focus of this
review, most of the studies were found to estimate scores for safety
culture dimensions as mean percentages of positive responses to
their component items, with the exception of four [18, 22, 24, 32]
which estimated mean scores from 0 to 5 (Table 3).
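The scoring approach described above, a dimension score taken as the mean percentage of positive responses across its component items, can be sketched roughly as follows. This is a minimal illustration, not the procedure used by any of the reviewed studies: the 1 to 5 response coding, the rule that answers of 4 or 5 count as positive (1 or 2 for negatively worded items), and the sample responses are all assumptions made for the example.

```python
from statistics import mean


def percent_positive(item_responses, negatively_worded=False):
    """Percentage of answered responses that count as positive for one item.

    Assumption: responses are coded 1-5; 4 and 5 are positive for positively
    worded items, 1 and 2 are positive for negatively worded items.
    """
    answered = [r for r in item_responses if r is not None]
    if not answered:
        raise ValueError("item has no answered responses")
    if negatively_worded:
        positive = sum(1 for r in answered if r <= 2)
    else:
        positive = sum(1 for r in answered if r >= 4)
    return 100.0 * positive / len(answered)


def dimension_score(items):
    """Mean of the item-level percent-positive values for one dimension."""
    return mean(percent_positive(resp, neg) for resp, neg in items)


# Hypothetical responses for a three-item dimension (None = unanswered).
teamwork_items = [
    ([5, 4, 4, 2, 5], False),
    ([4, 4, 3, None, 5], False),
    ([1, 2, 2, 4, 1], True),   # negatively worded item
]
score = dimension_score(teamwork_items)
print(f"Dimension score: {score:.1f}% positive")
```

Unanswered items are dropped from the denominator here; a study could reasonably handle missing data differently.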
The main safety culture dimensions that scored the highest percentages
of positive responses in the studies and, therefore, are classified
as strong or developed dimensions, were: (i) ‘Teamwork within units’
(78–89%) [20, 23, 26–28, 37, 38, 41, 45, 46] and (ii) ‘Organisational
learning–continuous improvement’ (71–88%) [17, 20, 21, 26, 33,
37, 38, 40, 41, 45, 46]. In studies that estimated dimension scores
from 0 to 5, the strongest dimensions were: (i) ‘Teamwork within
units’ (3.78–3.89) [18, 22, 24, 32]; (ii) ‘Communication openness’
(3.72–3.78) [18, 22, 32] and (iii) ‘Supervisor/manager expectations
and actions promoting patient safety’ (3.58–3.82) [18, 24] (Table 3).
The main safety culture dimensions that scored 50% or fewer
positive responses and, therefore, can be classified as weak, were: (i)
‘Non-punitive response to error’ (3.5–47%) [16, 20, 21, 23, 25–27,
29–31, 33–36, 39, 40, 42–46, 48]; (ii) ‘Staffing’ (14–45%) [16, 19,
21, 25–27, 30, 35–37, 39–43, 45, 46, 48]; (iii) ‘Handoffs and transi-
tions’ (24.6–49.7%) [16, 21, 23, 26–28, 30, 33, 42, 44, 46, 48]; (iv)
‘Teamwork across units’ (24.6–44%) [16, 25, 31, 33–36, 42, 44, 48];
(v) ‘Hospital management support for patient safety’ (15.4–39%)
[16, 19, 31, 33, 36, 39, 42, 43]; (vi) ‘Frequency of events reported’
(15–49%) [28, 29, 33, 39–41]; (vii) ‘Communication openness’
(36–45.5%) [21, 34, 39, 40, 45]; (viii) ‘Feedback and communication
about error’ (39.7–50%) [37, 42]; (ix) ‘Supervisor/manager
expectations and actions promoting patient safety’ (36.8–46.4%)
[33, 34] and (x) ‘Overall perceptions of patient safety’ (25–33.9%)
[33, 37]. In studies that estimated scores from 0 to 5, by dimension,
the weakest dimensions were (i) ‘Hospital management support for
patient safety’ (2.82–2.97) [22, 24, 32] and (ii) ‘Frequency of events
reported’ (2.78–2.99) [22, 32] (Table 3).
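The strong/weak classification used in this section can be illustrated with a small sketch. The 50% cut-off for weak dimensions is the one stated in the text; the 75% cut-off for strong dimensions is an assumption here, borrowed from the observations in Table 3 that mention a 75% level of positive answers. The scores are those reported in Table 3 for the Lebanese study by El-Jardali et al.; the function name and the "intermediate" label are hypothetical.

```python
def classify_dimension(score, strong_at=75.0, weak_at=50.0):
    """Label a percent-positive dimension score.

    weak_at follows the 50% rule stated in the text; strong_at (75%) is an
    assumed threshold taken from the observations recorded in Table 3.
    """
    if score >= strong_at:
        return "strong"
    if score <= weak_at:
        return "weak"
    return "intermediate"


# Percent-positive scores as listed in Table 3 for the Lebanese study.
lebanon_scores = {
    "Teamwork within units": 82.3,
    "Hospital management support for patient safety": 78.4,
    "Organisational learning and continuous improvement": 78.3,
    "Non-punitive response to error": 24.3,
    "Staffing": 36.8,
    "Hospital handoffs and transitions": 49.7,
}

labels = {name: classify_dimension(s) for name, s in lebanon_scores.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

This applies only to studies reporting percentages; the studies that reported mean scores on a numeric scale would need different cut-offs, which the text does not specify.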
Discussion
Interest in PSC has been growing since the 2000s, when health systems
were challenged to offer safe, better quality care. This interest
arose from concern over safety shortcomings in structures and
work processes and from recognition of the high risk of incidents and the
complexity inherent in healthcare provision.
There is mounting evidence of the influence of safety culture on
patient clinical outcomes, examples of which are rates of infection
and readmission [49–51]. In this regard, developing and strengthening
safety culture is a prominent means of managing and minimising risk
in health organisations. The first step in setting this whole process in
motion is to assess the current status of safety culture [52]. Safety culture
assessment makes it possible to identify significant safety issues in
work routines and working conditions, to manage them prospectively
and to monitor safety-related changes and outcomes.
Nurses accounted for the largest proportion of participants
in ~80% of the studies included in this review [16–19, 21–30, 32,
36–42, 44–47], suggesting that this professional category is inclined
to collaborate and engage with surveys on patient safety, as has
been found in other contexts [53]. Nonetheless, when the intention
is to ascertain the status of culture at the level of the organisation as
a whole, all professional categories should be encouraged to partici-
pate in safety culture surveys.
In 10 of the 33 studies included in this review [16, 18, 23, 24,
29, 31, 35–37, 43], the main aim was the psychometric validation
of translated versions of the HSOPSC, pointing to an interest in the
various countries in assessing safety culture among hospital staff.
Although all the studies offered findings on safety culture among
hospital staff, they differed in focus, illustrating how broadly
safety culture assessment is applicable to management. For example,
Hefner et al. [48] evaluated the impact on PSC of implementing
Crew Resource Management (CRM), a strategy that is being used
to strengthen PSC by applying a systematic approach to training
teams in interpersonal communication, teamwork, leadership and
decision-making [54]. One quasi-experimental study [38] evaluated
how training applied to a set of 23 hospitals impacted PSC and then
compared this with a static group of 14 hospitals. Intervention
group HSOPSC scores were significantly higher than static group
scores in three dimensions assessing the flexible and learning compo-
nents of safety culture [38]. In one US study [17], the authors used
results from a rural-adapted version of the HSOPSC to plan, execute
and evaluate a 2-year patient safety programme in 24 Critical
Access Hospitals. The HSOPSC detected changes in safety culture
over time when managers used a change strategy to execute specific
practices that support the four components of an informed, safe
culture.
The data collection method most used among the studies (85%,
N = 28) was administration of the questionnaire on paper, which
comparative data show usually produces a higher response
rate than web surveys [3]. The response rates in these
studies ranged from 23 to 100%. Two studies are particularly notable
for having used a mixed mode of administration (electronic and paper),
obtaining response rates of 73% [28] and 85.7% [45]. Response rates are
important because low values can limit the ability to generalise findings
to the hospital as a whole. When response rates are low, there
is a danger that the large number of staff who did not respond to
the survey might have responded very differently from those who
did respond, which is one of the major possible biases of cross-sectional
studies. Accordingly, the decision to use a survey on paper,
a web survey or a mixed data collection method should consider the
various factors that may influence the response rate, such as the
available resources, the means used to assure respondent anonymity,
the hospital’s experience with web surveys and so on [3].
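To make the non-response concern concrete, the sketch below computes a response rate and then the worst-case range for a percent-positive score if every non-respondent had answered negatively or positively. The bounding approach is an illustration added here and was not used in the review; it also treats the dimension score as a simple proportion of respondents, which is only an approximation. The figures (1048 returned of 1745 staff, 39% positive for ‘Non-punitive response to error’) are those listed in Table 3 for the Slovenian study.

```python
def response_rate(returned, distributed):
    """Fraction of distributed questionnaires that were returned."""
    if distributed <= 0 or not 0 <= returned <= distributed:
        raise ValueError("need 0 <= returned <= distributed and distributed > 0")
    return returned / distributed


def worst_case_bounds(percent_positive, rate):
    """Range of the whole-staff percent positive under extreme non-response.

    Lower bound: all non-respondents would have answered negatively.
    Upper bound: all non-respondents would have answered positively.
    """
    lower = rate * percent_positive
    upper = lower + (1.0 - rate) * 100.0
    return lower, upper


rate = response_rate(1048, 1745)
low, high = worst_case_bounds(39.0, rate)
print(f"Response rate: {rate:.1%}")
print(f"Worst-case range for the dimension: {low:.1f}% to {high:.1f}%")
```

The width of the range equals the non-response fraction, which is why a low response rate leaves so little that can be said about the hospital as a whole.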
Of the 33 studies included, nine used random sampling [18, 19,
22, 26, 27, 40, 41, 46, 47]. Random samples are an efficient (low-cost)
option for cross-sectional studies in that they enable characteristics
of the population to be determined from a small number of
participating units [55]. The probable explanation for this choice is
that the studies using random sampling included larger numbers of
hospital units [18, 19, 22, 26, 27, 40, 47]. For smaller hospitals, by
contrast, the User’s Guide provided by the AHRQ [3] recommends that, if
the hospital has a staff of fewer than 500, efforts should be made to
include them all in the study.
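The census-versus-sample rule mentioned above could be expressed along these lines. This is a hedged sketch: the 500-staff threshold comes from the guidance cited in the text, but the function name, the `sample_size` parameter and the use of simple random sampling without replacement are assumptions for illustration, since the text does not say how large a sample should be or how it should be drawn.

```python
import random

CENSUS_THRESHOLD = 500  # staff count below which everyone is invited (per the cited guidance)


def select_participants(staff_ids, sample_size, seed=None):
    """Return the staff to invite: everyone in small hospitals, a random sample otherwise.

    sample_size is a hypothetical parameter; the source gives no sample-size rule.
    """
    staff = list(staff_ids)
    if len(staff) < CENSUS_THRESHOLD:
        return staff
    if sample_size > len(staff):
        raise ValueError("sample_size cannot exceed the number of staff")
    rng = random.Random(seed)  # seeded generator so the draw can be reproduced
    return rng.sample(staff, sample_size)


small_hospital = select_participants(range(300), sample_size=100)
large_hospital = select_participants(range(2000), sample_size=600, seed=1)
print(len(small_hospital), len(large_hospital))
```

In practice a stratified draw by unit or profession might be preferred so that smaller professional groups are represented, but that choice is outside what the text specifies.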
Against the STROBE Statement checklist, the studies were gener-
ally of good quality and about half the studies met all the require-
ments listed for observational epidemiological studies [16, 18,
21–25, 27, 28, 30, 32, 33, 37–39]. It should be noted, however, that
some editors have proven reticent in view of the fact that the
STROBE initiative seeks to formalise the description of studies con-
ducted in such a heterogeneous field of research as epidemiology,
particularly as regards observational studies. This initiative, they
claim, may not favour the execution and description of singular, cre-
ative studies [15]. The studies included were found to feature a
diversity of objectives and methods, which may have contributed to
whether or not they met the items listed in the STROBE Statement.
Characteristics of the patient safety culture
dimensions
The central aim of this review was to ascertain the characteristics of
PSC at hospitals in the various cultural contexts. Dimensions in
which safety culture was classified as strong and weak were
identified.
‘Teamwork within units’ scored high in countries at different
stages of development and in studies with different temporal
characteristics [20, 23, 26–28, 37, 38, 41, 45, 46]. The process of
providing healthcare is intrinsically interdisciplinary. Teams generally
comprise people who work together to achieve definite, shared goals,
where each member has specific competences, tasks and functions
in specialised work, uses shared resources and communicates in order
to coordinate and adapt to change. Observational studies of team
behaviour as it relates to high standards of clinical performance have
identified patterns of communication, coordination and leadership
that provide support for effective teamwork [56].
‘Staffing’ scored low in ~55% of the studies (N = 18) [16, 19,
21, 25–27, 30, 35–37, 39–43, 45, 46, 48]. The results suggest that,
in the contexts of more than half the hospitals participating in the
studies, staff felt overloaded because staffing was inadequate for
the workload, which can compromise the quality of care
provided.
‘Organisational learning–continuous improvement’ was per-
ceived as strong by participants in 33% (N = 11) of the studies
[17, 20, 21, 26, 33, 37, 38, 40, 41, 45, 46]. This dimension relates
to learning in health organisations, which does not consist in a sin-
gle intervention, but is a continuous phenomenon occurring in for-
mal and informal learning. It is fundamentally important to manage
learning requirements in healthcare systems because these are com-
plex, interconnected, dynamic systems where all have tasks and
responsibilities in executing the assigned functions, communicating
and conveying the flow of relevant information and collectively pro-
viding safe care for patients [57]. In the context of patient safety,
where the main goal is to reduce avoidable harm resulting from
healthcare, ‘Frequency of Events Reported’ (an outcome dimension)
has the potential to contribute continuously to learning. Safety inci-
dent reports make it possible to identify the possible causes of fail-
ures in work processes and structures. However, the outcome
dimension ‘Frequency of Events Reported’ did not prove strong in
all the studies included in this review, but needed improvement in
the various countries represented.
‘Teamwork across units’ captures respondents’ perceptions of
coordination and cooperation among hospital units with a view to
providing the best possible healthcare to patients. This dimension
could be improved in all the organisations considered in the set of
studies included in this review, while in 30% (N = 10) of the studies,
this dimension was considered weak and scored <50% positive
responses [16, 25, 31, 33–36, 42, 44, 48].
Similarly, ‘Handoffs and transitions’ proved weak in 36%
(N = 12) of the studies [16, 21, 23, 26–28, 30, 33, 42, 44, 46, 48]
and needed improvement in all the studies included. ‘Handoffs and
transitions’ are targeted by quality improvement efforts in health
organisations because they entail high risk of safety incidents and
can lead to loss of important information and to fragmentation of
patient care [58].
Lastly, a culture of blame appears to exist in the hospitals over-
all. In nearly 70% of the studies (N = 22) [16, 20, 21, 23, 25–27,
29–31, 33–36, 39, 40, 42–46, 48], the dimension ‘Non-punitive
response to error’ proved weak. A punitive culture with regard to
the occurrence of safety incidents discourages staff from reporting,
makes it difficult to discover possible causes and thus prevents learn-
ing from mistakes. In a strong safety culture, individuals feel com-
fortable about drawing attention to potential risks or actual
failures, with no fear of censure by managers [59]. Wachter (2013)
claims the ‘no-blame’ approach was responsible for many of the
advances made by the patient safety movement in its first decade,
but argues that most adverse events result from multiple causes and
are unintentional. Occasionally, however, blame may be appropriate
in certain situations that involve individuals who commit frequent,
careless errors, who fail to keep up with developments in their
speciality or who choose to ignore sensible safety standards.
Wachter (2013) cites the emergence of the concept of a ‘just culture’
(instead of a ‘no-blame’ culture) as a way to shift the (appropriate)
no-blame focus back onto the care process. The assumption is that
competent collaborators make mistakes and there is a need to make
individuals (and institutions) accountable for blameworthy errors or
conditions.
In this connection, the HSOPSC is being reviewed to construct a
new version absorbing suggestions from user feedback from around
the world, which include incorporating the ‘just culture’ concept
(https://www.ahrq.gov/professionals/quality-patient-safety/patientsafetyculture/hospital/update/index.html).
Study limitations
The authors recognise that this study has a number of limitations.
Firstly, as regards the databases consulted, it was decided to restrict
the search to three databases because they were considered suitable
for collecting all the eligible articles according to the proposed subject
and objectives and because they were available to the authors in their
academic setting. With a view to correcting any kind of selection
bias, we consulted the Research Reference List available on the
AHRQ website at https://www.ahrq.gov/professionals/quality-patient-safety/patientsafetyculture/resources/index.html and, from it,
added another 69 articles to those obtained in the database searches.
Another issue that should be highlighted is that this review
searched for articles in English, Portuguese and Spanish only. It is
possible that this search strategy may have failed to retrieve some
articles, although we have identified no articles published in other
languages, not even in the Research Reference list posted on the
AHRQ website, leading us to believe that, by and large, such articles
have been published in English and Spanish. No published article
using the HSOPSC in Latin American countries was identified.
Another important potential limitation of this review was the
authors’ choice not to conduct a meta-analysis. The rationale behind
this is that the findings of the studies included are difficult to gener-
alise and compare, for the following reasons: the studies occurred in
different time periods, they used different sampling strategies and
were conducted in hospital contexts in countries at different stages
of development, which entail different capacities for investment in
improving the quality and safety of care at the study hospitals.
Conclusion
This systematic review demonstrated that the assessment of safety
culture in health organisation settings has received special interest
on the part of health researchers, managers and practitioners in vari-
ous parts of the world.
The set of studies included in this review reveals that hospital
organisational cultures are predominantly underdeveloped or weak
as regards patient safety and comprise dimensions that require
strengthening. In particular, it underlines the need to think about:
(i) strategies to prepare personnel to offer safe, quality
healthcare; (ii) work processes surrounding shift changes and handovers,
so as to prevent loss of important information about patients
and their treatment; (iii) cooperation, integration and coordination
of teamwork among the hospital units, in order to prevent fragmentation
of care; and, lastly, (iv) the culture of blame, which should
give way to a ‘just culture’ approach, which would counter the urge
to blame, enhance professional and institutional accountability and
prioritise the identification of systemic failures and, consequently,
proceed to mitigate them.
Use of the HSOPSC to measure safety culture in hospital organi-
sations proved efficient, applicable to the various objectives of the
studies included in this review and adaptable to the different cultural
and organisational development contexts. The findings of these
safety culture assessment studies are highly useful and constitute a
knowledge base for taking specific improvement action.
Acknowledgements
This paper is part of the result of the Postdoctoral Research done by C.T.R. at
the National School of Public Health—Universidade NOVA de Lisboa, super-
vised by Professor P.S. and supported by the Brazilian Minister of Health.
References
1. Di Cuccio MH. The relationship between patient safety culture and
patient outcomes: a systematic review. J Patient Saf 2014;22:11–8.
2. Fan CJ, Pawlik TM, Daniels T et al. Association of safety culture with
surgical site infection outcomes. J Am Coll Surg 2016;222:122–8.
3. Sorra J, Gray L, Streagle S et al. AHRQ Hospital Survey on Patient Safety
Culture: User’s Guide. (Prepared by Westat, under Contract No.
HHSA290201300003C). AHRQ Publication No. 15-0049-EF (Replaces
04-0041). Rockville, MD: Agency for Healthcare Research and Quality.
January 2016. https://www.ahrq.gov/sites/default/files/wysiwyg/professionals/quality-patient-safety/patientsafetyculture/hospital/userguide/hospcult
(3 August 2016, date last accessed).
4. Health and Safety Commission. Third Report: Organizing for Safety.
ACSNI Study Group of Human Factors. London: HMSO, 1993.
5. Cox SJ, Cox T. The structure of employee attitude to safety: a European
example. Work & Stress 1991;5:93–106.
6. Nieva VF, Sorra J. Safety culture assessment: a tool for improving patient
safety in healthcare organizations. Qual Saf Health Care 2003;12:ii17–23.
7. Flin R. Measuring safety culture in healthcare: a case for accurate diagno-
sis. Saf Sci 2007;45:653–67.
8. Colla JB, Bracken AC, Kinney LM et al. Measuring patient safety climate:
a review of surveys. Qual Saf Health Care 2005;14:364–6.
9. Flin R, Burns C, Mearns K et al. Measuring safety climate in health care.
Qual Saf Health Care 2006;15:109–15.
10. Robb G, Seddon M. Measuring the safety culture in a hospital setting: a
concept whose time has come? NZMJ 2010;123:66–76.
11. Halligan M, Zecevic A. Safety Culture in healthcare: a review of concepts,
dimensions, measures and progress. BMJ Qual Saf 2011;20:338–43.
12. Sorra JS, Nieva VF. Hospital Survey on Patient Safety Culture (Prepared
by Westat, under Contract No. 290-96-0004). AHRQ Publication No.
04-0041. Rockville, MD: Agency for Healthcare Research and Quality. September 2004.
13. Surveys on Patient Safety Culture Research Reference List. Content last
reviewed April 2017. Agency for Healthcare Research and Quality,
Rockville, MD. http://www.ahrq.gov/professionals/quality-patient-safety/patientsafetyculture/resources/index.html
(3 October 2016, date last accessed).
14. Moher D, Liberati A, Tetzlaff J et al. Preferred reporting items for system-
atic reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol
2009;62:1006–12.
15. Malta M, Cardoso LO, Bastos FI et al. Iniciativa STROBE: subsídios
para a comunicação de estudos observacionais. Rev Saúde Pública 2010;
44:559–65.
16. Hellings J, Schrooten W, Klazinga N et al. Challenging patient safety culture:
survey results. Int J Health Care Qual Assur 2007;20:620–32.
17. Jones KJ, Skinner A, Xu L et al. The AHRQ Hospital Survey on Patient
Safety Culture: A tool to plan and evaluate patient safety programs. In:
Henriksen K, Battles JB, Keyes MA, et al., eds. Advances in Patient
Safety: New Directions and Alternative Approaches. Vol. 2: Culture and
Redesign. Rockville, MD: Agency for Healthcare Research and Quality;
July 2008. AHRQ Publication No. 08-0034-2.
18. Smits M, Christiaans-Dingelhoff I, Wagner C et al. The psychometric
properties of the ‘Hospital Survey on Patient Safety Culture’ in Dutch
hospitals. BMC Health Serv Res 2008;8:230.
19. Saturno PJ, Da Silva Gama ZA, De Oliveira-Sousa SL et al. Analysis of
patient safety culture in Spanish National Health System hospitals. Med
Clin (Barc) 2008;131:18–25.
20. Sine DM, Northcutt N. Interactive qualitative assessment of patient safety
culture survey scores. J Patient Saf 2008;4:78–83.
21. Al-Ahmadi TA. Measuring patient safety culture in Riyadh’s hospitals: a
comparison between public and private hospitals. J Egypt Public Health
Assoc 2009;84:479–500.
22. Smits M, Wagner C, Spreeuwenberg P et al. Measuring patient safety cul-
ture: an assessment of the clustering of responses at unit level and hospital
level. Qual Saf Health Care 2009;18:292–6.
23. Blegen MA, Gearhart S, O’Brien R et al. AHRQ’s Hospital Survey on
Patient Safety Culture: psychometric analyses. J Patient Saf 2009;5:139–44.
24. Olsen E. Exploring the possibility of a common structural model measur-
ing associations between safety climate factors and safety behaviour
in health care and the petroleum sectors. Accid Anal Prev 2010;42:
1507–16.
25. Hellings J, Schrooten W, Klazinga NS et al. Improving patient safety cul-
ture. Int J Health Care Qual Assur 2010;23:489–506.
26. El-Jardali F, Jaafar M, Dimassi H et al. The current state of patient safety
culture in Lebanese hospitals: a study at baseline. Int J Qual Health Care
2010;22:386–95.
27. Chen IC, Li HH. Measuring patient safety culture in Taiwan using the
Hospital Survey on Patient Safety Culture (HSOPSC). BMC Health Serv
Res 2010;10:152.
28. Campbell EG, Singer S, Kitch BT et al. Patient safety climate in hospitals:
act locally on variation across units. Jt Comm J Qual Patient Saf 2010;
36:319–26.
29. Bodur S, Filiz E. Validity and reliability of Turkish version of ‘Hospital
Survey on Patient Safety Culture’ and perception of patient safety in pub-
lic hospitals in Turkey. BMC Health Serv Res 2010;10:28.
30. Occelli P, Quenon J, Hubert B et al. Development of a safety culture: ini-
tial measurements of six hospitals in France. J Health Risk Manag 2011;
30:42–7.
31. Bagnasco A, Tibaldi L, Chirone P et al. Patient safety culture: an Italian
experience. J Clin Nurs 2011;20:1188–95.
32. Smits M, Wagner C, Spreeuwenberg P et al. The role of patient safety cul-
ture in the causation of unintended events in hospitals. J Clin Nurs 2012;
21:3392–401.
33. Aboul-Fotouh AM, Ismail NA, EzElarab HS et al. Assessment of patient
safety culture among healthcare providers at a teaching hospital in Cairo,
Egypt. East Mediterr Health J 2012;18:372–7.
34. Abdelhai R, Abdelaziz SB, Ghanem NS. Assessing patient safety culture
and factors affecting it among health care providers at Cairo university
hospitals. J Am Sci 2012;8:277–85.
35. Robida A. Hospital Survey on Patient Safety Culture in Slovenia: a psy-
chometric evaluation. Int J Qual Health Care 2013;25:469–75.
36. Occelli P, Quenon J-L, Kret M et al. Validation of the French version of
the Hospital Survey on Patient Safety Culture questionnaire. Int J Qual
Health Care 2013;25:459–68.
37. Nie Y, Mao X, Cui H et al. Hospital survey on patient safety culture in
China. BMC Health Serv Res 2013;13:228.
38. Jones KJ, Skinner AM, High R et al. A theory-driven, longitudinal evalu-
ation of the impact of team training on safety culture in 24 hospitals.
BMJ Qual Saf 2013;22:394–404.
39. Hamdan M, Saleem AA. Assessment of patient safety culture in
Palestinian public hospitals. Int J Qual Health Care 2013;25:167–75.
40. Davoodi R, Shabestari MM, Takbiri A et al. Patient safety culture based
on medical staff attitudes in Khorasan Razavi hospitals, Northeastern
Iran. Iranian J Publ Health 2013;42:1292–8.
41. Amarapathy M, Sridharan S, Perera R et al. Factors affecting patient
safety culture in a tertiary care hospital in Sri Lanka. Int J Sci Tech Res
2013;2:173–80.
42. Agnew C, Flin R, Mearns K. Patient safety climate and worker safety
behaviours in acute hospitals in Scotland. J Safety Res 2013;45:95–101.
43. Eiras M, Escoval A, Grillo IM et al. The hospital survey on patient safety
culture in Portuguese hospitals: instrument validity and reliability. Int J
Health Care Qual Assur 2014;27:111–22.
44. Fujita S, Seto K, Kitazawa T et al. Characteristics of unit-level patient
safety culture in hospitals in Japan: a cross-sectional study. BMC Health
Serv Res 2014;14:508.
45. El-Jardali F, Sheikh F, Garcia NA et al. Patient safety culture in a
large teaching hospital in Riyadh: baseline assessment, comparative ana-
lysis and opportunities for improvement. BMC Health Serv Res 2014;
14:122.
46. Al-Mandhari A, Al-Zakwani I, Al-Kindi M et al. Patient safety culture
assessment in Oman. Oman Med J 2014;29:264–70.
47. Kiaei MZ, Ziaee A, Mohebbifar R et al. Patient safety culture in teaching
hospitals in Iran: assessment by the Hospital Survey on Patient Safety
Culture (HSOPSC). J Health Man Info 2016;3:51–6.
48. Hefner JL, Hilligoss B, Knupp A et al. Cultural transformation after
implementation of a crew resource management: is it really possible? Am
J Med Qual 2017;32:384–90.
49. Huang DT, Clermont G, Kong L et al. Intensive care unit safety culture
and outcomes: a US multicenter study. Int J Qual Health Care 2010;22:
151–61.
50. Mardon RE, Khanna K, Sorra J et al. Exploring relationships between
hospital patient safety culture and adverse events. J Patient Saf 2010;6:
226–32.
51. Hansen LO, Williams MV, Singer SJ. Perceptions of hospital safety cli-
mate and incidence of readmission. Health Serv Res 2011;46:596–616.
52. Pronovost PJ, Weast B, Bishop K et al. Senior executive adopt-a-work
unit: a model for safety improvement. Jt Comm J Qual Saf 2004;30:
59–68.
53. Aiken LH, Sermeus W, Van den Heede K et al. Patient safety, satisfaction,
and quality of hospital care: cross sectional surveys of nurses and patients
in 12 countries in Europe and the United States. BMJ 2012;344:e1717.
54. Maynard MT, Marshall D, Dean MD. Crew resource management and
teamwork training in health care: a review of the literature and recom-
mendations for how to leverage such interventions to enhance patient
safety. Adv Health Care Manag 2012;13:59–91.
55. Pereira MG. Epidemiologia, teoria e prática. Rio de Janeiro: Guanabara
Koogan, 2001.
56. Manser T. Teamwork and patient safety in dynamic domains of
healthcare: a review of the literature. Acta Anaesthesiol Scand 2009;53:
143–51.
57. Ratnapalan S, Uleryk E. Organizational learning in health care organiza-
tions. Systems 2014;2:24–33.
58. Lee S-H, Phan PH, Dorman T et al. Handoffs, safety culture, and prac-
tices: evidence from the hospital survey on patient safety culture. BMC
Health Serv Res 2016;16:254.
59. Wachter RM. Compreendendo a Segurança do Paciente – 2ª Ed. AMGH
Editora. Porto Alegre, 2013.
Copyright of International Journal for Quality in Health Care is the property of Oxford
University Press / USA and its content may not be copied or emailed to multiple sites or
posted to a listserv without the copyright holder’s express written permission. However, users
may print, download, or email articles for individual use.