Keywords should include: Allele frequencies; Admixture; Population genetics; Gene flow; Gulf countries; Saudi Arabia.
– The text should be no fewer than 500 words.
– Accepted plagiarism level: under 15%.
– A references section should be added, in alphabetical order, using EndNote or Mendeley.
I have uploaded some articles that may help.
You can use any articles related to my subject.
gcsp-2014-054
OPEN ACCESS Review article
Arab gene geography: From population diversities to personalized medical genomics
Ghazi O. Tadmouri¹, Konduru S. Sastry², Lotfi Chouchane²,*
ABSTRACT
Genetic disorders are not equally distributed over the geography of the Arab region. While a number of
disorders have a wide geographical presence encompassing 10 or more Arab countries, almost half of
these disorders occur in a single Arab country or population. Nearly one-third of the genetic disorders
in Arabs result from congenital malformations and chromosomal abnormalities, which are also
responsible for a significant proportion of neonatal and perinatal deaths in Arab populations.
Strikingly, about two-thirds of these diseases in Arab patients follow an autosomal recessive mode of
inheritance. High fertility rates together with increased consanguineous marriages, generally noticed in
Arab populations, tend to increase the rates of genetic and congenital abnormalities. Many of the
nearly 500 genes studied in Arab people revealed striking spectra of heterogeneity with many novel
and rare mutations causing large arrays of clinical outcomes. In this review we provide an overview of
Arab gene geography and of various genetic abnormalities in Arab populations, including blood,
metabolic, and circulatory disorders and neoplasms, and we discuss the molecules or genes
responsible for these disorders. Although studying Arab-specific genetic disorders
resulted in a high value knowledge base, approximately 35% of genetic diseases in Arabs do not have
a defined molecular etiology. This is a clear indication that comprehensive research is required in this
area to understand the molecular pathologies causing diseases in Arab populations.
Keywords: Arab populations, neolithic, population genetics, gene geography, genetic disorders, neoplasms
Cite this article as: Tadmouri GO, Sastry KS, Chouchane L. Arab gene geography: From population
diversities to personalized medical genomics, Global Cardiology Science and Practice 2014:54
http://dx.doi.org/10.5339/gcsp.2014.54
Submitted: 1 September 2014
Accepted: 11 December 2014
© 2014 Tadmouri, Sastry & Chouchane, licensee Bloomsbury Qatar Foundation Journals. This is an open
access article distributed under the terms of the Creative Commons Attribution license CC BY 4.0,
which permits unrestricted use, distribution and reproduction in any medium, provided the original
work is properly cited.
¹ Faculty of Public Health, Jinan University, Tripoli, Lebanon
² Laboratory of Genetic Medicine and Immunology, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, Qatar
*Email: loc2008@qatar-med.cornell.edu
A DEFINITION OF ‘ARAB POPULATIONS’
The term “Arabs” indicates a panethnicity of peoples of various ancestral origins, religious
backgrounds, and historic identities. It is possible to define the geographical area inhabited by Arabs
using one of the two following approaches:
(1) The linguistic approach is a relaxed definition and it includes all populations speaking the
Arabic language and living in a vast area extending from south of Iran in the east to Morocco in
the west including parts in the south-east of Asia Minor, East, and West Africa.
(2) The political definition of Arabs is more conservative as it only includes those populations
residing in 23 Arab States, namely: Algeria, Bahrain, Comoros, Djibouti, Egypt, Eritrea, Iraq,
Jordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, Palestine, Qatar, Saudi Arabia,
Somalia, Sudan, Syria, Tunisia, United Arab Emirates (UAE), and Yemen.
In the subsequent parts of this paper, the political definition will mainly be used to define
the term “Arab region” or simply “the region”. In all cases, the Arab geocultural unit is the largest in the
world after Russia and Anglo-America. This unit is home to more than 375 million people and spans more
than 14 million square kilometers.[1]
PALEOLITHIC OUT-OF-AFRICA MIGRATIONS
Archeological excavations, historical records, and molecular analyses, mainly based on the study of
uniparental Y-chromosome and mitochondrial DNA (mtDNA), provided considerable information
regarding the early evolutionary history of modern humans in the vast geographical region embracing
Arab populations. The advent of genomic methodologies based on the simultaneous analysis of
hundreds of thousands of single nucleotide polymorphisms allowed the drawing of conclusions on the
genetic structures of Arab populations with a higher resolution.[2]
DNA evidence indicates that modern humans originated in East Africa about 200-100 kiloyears (kyr)
ago then established regional populations throughout the continent.[3] Archeological artifacts excavated
from Taforalt in today’s Morocco indicate that human inhabitation of the modern-day Maghreb region
(i.e., Morocco, Algeria, Tunisia and Libya) dates back to some 82 kyr ago.[4] At that time,
settlements in the region were characterized by developed cultural manifestations that would appear
in Europe only 40 millennia later.[5] According to the Recent Out-of-Africa model, members of one
branch of anatomically modern humans left Africa for the Near East some 70-45 kyr ago.[6,7] Phylogenies
constructed on the basis of mtDNA comparisons indicate two possible migration routes in this
episode of human history (see Figure 1):
(1) A major route lay across the Bab-el-Mandeb straits in the Red Sea, linking modern day Eritrea and
Djibouti in Africa to Yemen, hence probably making the Arabian Peninsula the initial
staging post in the first successful migration of anatomically modern humans out of Africa
70-60 kyr ago.[7-9] Y-chromosome diversity studies in modern Saudi males support this view as
14% of them exhibit a pool typical of African biogeographic ancestry.[10] High diversity in the
Y-haplogroup substructure in samples from the region extends the geography of this active
route to include southern Arabia, South Iran, and South Pakistan. This route has possibly
maintained its important role in influencing gene flow from Africa along the coastal
crescent-shaped corridor of the Gulf of Oman and could have facilitated human dispersals into
the region until nearly 2500 years ago.[11]
(2) Another route followed the Nile from East Africa, heading northwards and crossing through the
Sinai Peninsula into the Levant, and resulted in a noticeable gene flow during the Upper
Paleolithic and Mesolithic periods between 40-14 kyr ago.[12-14] Recent data from Alu/short
tandem repeat compound systems and genome-wide polymorphisms support this
view with 4-15% of the Levantine groups harboring African ancestry while this influence barely
reaches 1-3% in Southern Europeans.[15-18] Human populations in the Near East then branched
in several directions, some heading north into Europe and others heading east into Asia.[19-22]
Y-chromosome analysis supports this view and demonstrates the absence of any significant
genetic barrier in the Levant, where a remarkable genetic variation was attained and gene flow
followed the “isolation-by-distance” model. This is in contrast to a strong north-south genetic
barrier, for both male and female gene flow, in the western Mediterranean basin, defined by
the Gibraltar Strait.[23,24]
Figure 1. Out-of-Africa migration routes.
Tadmouri, Sastry & Chouchane. Global Cardiology Science and Practice 2014:54
Paleoanthropological evidence and mtDNA variation analysis indicate that both the Levantine
corridor and the Horn of Africa served, repeatedly, as migratory passageways between Africa and
Eurasia.[14,25] Some of the oldest known genetic mutations that could have followed this route include:
(1) the delta F508 (c.1521_1523delCTT) mutation of the CFTR gene, which is responsible today for a
majority of cases with cystic fibrosis in Europe,[26] and (2) the p.Glu6Val sickle cell mutation associated
with the Benin haplotype and frequently observed in the western coastal region of the Arabian
Peninsula, the Levant, Egypt, and in the Maghreb region.[27]
Some studies also support the view that regions near, but external to northeast Africa, like the
Levant, the southern-Arabian Peninsula, or Mesopotamia could have served as incubators for the early
diversification of non-African lineages and the development of local cultural techniques.[10,28,29] Again,
the p.Glu6Val sickle cell mutation provides supporting evidence for this view since the mutation
associated with the Arab/Asian haplotype seems to be restricted to the eastern coastal regions of the
Arabian Peninsula with milder presence in Mesopotamia and the Levant.[27]
THE EARLY FARMERS
Around 12 kyr ago, Neolithic human populations adopted agricultural technologies
that brought about a far-reaching shift in subsistence and lifestyle. Improvement of the climatic
conditions in the area along with the practice of agriculture helped in the establishment of major
historical settlements with sizeable densities that could have contributed enormously to the genetic
makeup of modern Arab populations. Yet, farming was almost always associated with settlements near
mosquito-infested soft and marshy soil causing large malarial outbreaks.[30-32] These outbreaks
imposed selective pressure on the human genome and amplified the frequencies of several genetic
disorders including sickle cell disease, β-thalassemia, and glucose-6-phosphate dehydrogenase
(G6PD) deficiency.[33-35] Infectious agents that favored humid conditions could have also played major
roles in the selective advantage to a variety of other genetic traits.[36,37] A major example in this category
includes the heterozygote advantage of cystic fibrosis carriers against tuberculosis.[38] On the other
hand, adapting to an active lifestyle along with calorie-restricted diets, common in communities at the
time, could have provided protective features that suppressed the expression of celiac disease, type 2
diabetes, and inflammatory bowel disease.[39]
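As a rough illustration of how malaria-driven selection can hold a disease allele at an appreciable
frequency, the following minimal Python sketch iterates the textbook one-locus model of heterozygote
advantage. The selection coefficients s and t, the starting frequency, and all function names are
hypothetical choices made for illustration; they are not estimates taken from the studies cited above.

```python
def next_generation(q, s, t):
    """Return the frequency of allele `a` after one round of selection.

    Assumed (illustrative) genotype fitnesses: AA = 1 - s, Aa = 1, aa = 1 - t,
    where `a` is the disease allele and q is its current frequency.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    # Allele `a` is carried by all surviving aa and by half of surviving Aa.
    return (q * q * (1.0 - t) + p * q) / mean_fitness


def equilibrium_frequency(s, t):
    """Analytical equilibrium of allele `a` under heterozygote advantage."""
    return s / (s + t)


def simulate(q0, s, t, generations):
    """Iterate the selection model starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_generation(q, s, t)
    return q


if __name__ == "__main__":
    # Hypothetical values: a 10% fitness cost for non-carriers (e.g. malaria
    # mortality) and an 80% cost for affected homozygotes.
    s, t = 0.1, 0.8
    print("simulated :", simulate(0.001, s, t, 2000))
    print("analytical:", equilibrium_frequency(s, t))
```

Under these assumptions the simulated frequency settles at the analytical value s / (s + t), which
shows how a harmful allele can persist as long as carriers retain an advantage.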
In this phase of human history, the Arabian Peninsula, Sub-Saharan Africa, the Levant and Iran saw
local population expansions from refugia that could have participated in the building of the primitive
Arabian population. Y-chromosome and mtDNA haplogroup data support this view.[9,40] For example,
approximately 62–69% of today’s males in Saudi Arabia share common structures with those in the
Near East and this demonstrates a possibly important role for the Levant in shaping the Neolithic
dispersal of human settlements in the Gulf.[10] This genetic evidence is consistent with archeological
interpretations of the expansion of sedentary Natufian hamlets in the Levant during the wet phase
15–13 kyr ago.[41] Male lineage estimates for these prominent Levantine haplogroups indicate a north to
south influence with a history of almost 12 kyr in Saudi Arabia, 11 kyr in Yemen, mainly in the western
region, and only nearly 7 kyr in Qatar and the UAE.[10] Detailed analyses hint at a major terrestrial
colonization of the eastern Arabian Peninsula, which was followed by population isolation
from the western Arabian Peninsula and which demonstrates significant genetic affinities to near-eastern
populations.[42] Many of the earliest disease-causing genetic mutations might have followed these
steps.[43] In particular, the c.208-2A>G mutation in the human amnionless homolog (AMN) gene,
found in 15% of Imerslund-Gräsbeck syndrome cases, could have emerged in the region around
13.6 kyr ago. Today, this mutation is responsible for over 50% of the Imerslund-Gräsbeck syndrome cases
among Arabic, Turkish, and Sephardic Jewish families.[44] On the other hand, studies of mtDNA
variability confirm a notable sub-Saharan African female-driven flow in the Arabian Peninsula.[9,25,29] An
Iranian influence also existed, but this was weakened by the presence of barriers to gene flow posed by
the two major Iranian deserts and the Zagros mountain range.[45]
Analysis of the pattern of Y-chromosome and mtDNA variations in North Africa provides evidence of
the relatively young population history of North Africa, mainly influenced by a strong demic expansion
of Neolithic pastoralists from the Levant and possible admixture with original settlers.[7,46,47] Some of
these earliest civilizations in the Maghreb region include immigrant Berbers who originated from the
Sahara 10,000 years ago and left considerable gene imprints in the gene pool of the populations
inhabiting the area between modern day Mauritania and southern Egypt.[48,49] Nearly 2,000 years later,
Mesolithic Capsians became the next influential genetic stock in the region.[50]
MAJOR EVENTS IN ANCIENT HISTORY
In the Arabian Peninsula, Semitic-speaking peoples of Arabian origin migrated into the valley of the
Tigris and Euphrates rivers in Mesopotamia some 7,000-5,500 years ago.[51,52] Analysis of Y
chromosome and mitochondrial DNA in Iraqi Marsh Arabs revealed a prevalent autochthonous Middle
Eastern component for both male and female gene pools, with weak Southwest Asian and African
contributions.[29] The detailed analysis of genome-wide variation patterns among Qataris indicates that
the Southwest Asian influence is derived from Greater Persia rather than from China while the African
stock has a sub-Saharan origin and not a Southern African Bantu origin.[53] Data from the neighboring
Bahraini and Emirati populations reveal an increasing North-to-South influence of the Southwest Asian
component with a high contribution of 23% and 24%, respectively.[54] This could also explain the
exceptionally high frequencies of the Asian sickle cell mutation in the region extending from Kuwait to
the United Arab Emirates.[27]
Archeological evidence further indicates that another group of Semites left Arabia around 4,500
years ago during the Early Bronze Age, settled along the Levant, and mixed with the local
populations there. Some 3,500 years ago, the Phoenician civilization of Lebanon became a developed
enterprising maritime trading culture. Phoenician traders spread across the Mediterranean and
established major cities and colonies that harbored their pathologic or polymorphic gene
variations.[55-58] Among the pathologic gene variations that could have followed Phoenician footsteps
are (1) the IVS-I-110 (c.93-21G>A) beta-globin gene mutation, the most frequently encountered
beta-thalassemia mutation among Arabs, and (2) the p.G542X mutation in the CFTR gene, a frequently
observed cystic fibrosis mutation in the Mediterranean basin.[56,59] Results of the Genographic
Consortium from Y-chromosome variations indicate that as many as 1 in 17 men living today on the
coasts of North Africa and southern Europe may have a Phoenician direct male-lineage ancestry.[60] The
genetic pool was further enriched in Mesopotamia through Persians, while the Romans maintained
settlements throughout most of the region for 600 years before being replaced by the
Byzantines.[61]
MAJOR EVENTS IN MEDIEVAL HISTORY
Soon after the rise of Islam 1,400 years ago, the Arab Caliphates unified the region flanking the
Mediterranean and amalgamated the dominant ethnic identity that persists today in the Near East, the
Levant, the Maghreb, and Andalusia in the Iberian Peninsula.[58,62,63] The Arabian Peninsula gained an
increasing role and linked distant populations of China and India to communities of the Mediterranean
and beyond. During this period, demographical dynamics were predominantly governed by cultural
change in endogenous populations rather than demic influences with significant gene flow.[64] This view
is strongly supported by Y-chromosome analysis of Muslim expansion in India and mtDNA
haplogroups in the Sinai Peninsula and North Africa.[24,65,66]
During the 11th-13th centuries CE, the Levant witnessed major Crusader settlements that could have
caused remarkable genetic drifts and bottlenecks and introduced western European lineages.[58] In the
16th century CE, the impact of the western European gene stock extended to the eastern Arabian
Peninsula where major parts, including today’s Bahrain, fell under the authority of the Portuguese for
nearly 150 years. This presence left clear impressions in the mutational spectrum of common disorders
in the eastern Arabian Peninsula, as in the frequent observation of the western Mediterranean Codon 39
(c.118C>T) β-thalassemia mutation[67-69] (reviewed in Obeid and Tadmouri[27]). On the contrary, some
other disorders from the region have possibly spread out to geographically distant locations under this
Portuguese influence, as demonstrated in the increasing evidence noted with regard to the world
distribution of Machado-Joseph disease.[70,71] During the 13th-19th centuries CE, Ottomans controlled
much of the lands surrounding the Mediterranean then expanded their influence to cover all the
Arabian Peninsula and further contributed to the enrichment of the genetic pool in the region.[72] After
the 19th century, areas of the Maghreb were colonized by France, Spain and Italy while the Levant,
Egypt, and the Arabian Peninsula were colonized by France and England.
Despite this long trail of historical admixtures, genetic isolates persisted in the Arab region. Some of
these isolates include the inhabitants of the Island of Jerba in Tunisia,[73] the Bedouins of Sinai,[65] the
dwellers of the Dead Sea region in Jordan,[74] the Druze of the Levant,[58] and the Kurdish population of
Northern Iraq.[75]
THE GENETIC HETEROGENEITY OF ARABS
Arab populations display some of the highest rates of consanguineous marriages in the world
including a large proportion of first cousin marriages.[76] At a macrogenomic level, this norm permits the
reunion of ancestral chromosomal segments in a homozygous pattern referred to as the autozygome.[77]
At a microgenomic level, however, populations in the region exhibit exceptionally high levels of
variance within those runs of homozygosity.[2] This variance seems to follow a sexually asymmetric
model with higher heterogeneity recorded among the female groups while paternal lineages are mostly
of autochthonous origin.[29,78] Either way, this variance leads phenotypically to a wide array of more
than 1,100 genetic disorders described in the region of which 44% are confined to a single population
or region, a diversity of affected body systems and of clinical outcomes, and a diversity of disease
incidence and geographical distributions (reviewed in Tadmouri[79]).
While the common practice of consanguinity seems to have also contributed to the preponderance
of autosomal recessive (60%) over autosomal dominant (28%) disorders in the region,[76] it is
probably the large spectra of pathological gene mutations associated with many genetic disorders in
the region that best emphasize the genetic heterogeneity of Arab populations. The following
disease families represent a few examples of a continuously growing list of disorders related to a long list
of mutations, many of which have possibly originated in the region.
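To illustrate why consanguinity favors the appearance of autosomal recessive disorders, the following
minimal Python sketch applies the standard population-genetics expression for the frequency of
recessive homozygotes under inbreeding, q² + Fq(1 − q). The allele frequency used is hypothetical and
the function names are illustrative; F = 1/16 is the textbook inbreeding coefficient for the offspring
of first cousins.

```python
F_FIRST_COUSINS = 1.0 / 16.0  # inbreeding coefficient of first-cousin offspring


def affected_frequency(q, inbreeding_coefficient=0.0):
    """Expected frequency of recessive homozygotes (aa).

    q is the frequency of the recessive allele; with F = 0 this reduces to
    the Hardy-Weinberg value q**2.
    """
    p = 1.0 - q
    return q * q + inbreeding_coefficient * p * q


def relative_risk(q, inbreeding_coefficient):
    """Ratio of affected frequency with inbreeding to that under random mating."""
    return affected_frequency(q, inbreeding_coefficient) / affected_frequency(q)


if __name__ == "__main__":
    q = 0.01  # hypothetical frequency of a rare recessive disease allele
    print("random mating :", affected_frequency(q))
    print("first cousins :", affected_frequency(q, F_FIRST_COUSINS))
    print("relative risk :", relative_risk(q, F_FIRST_COUSINS))
```

For a rare allele with q = 0.01, the sketch gives roughly a sevenfold higher frequency of affected
offspring for first-cousin unions than under random mating, and the relative effect grows as the allele
becomes rarer.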
BLOOD DISORDERS
β-Thalassemia
β-thalassemia syndromes are a group of hereditary disorders characterized by a genetic deficiency in
the synthesis of beta-globin chains. A meta-analysis of 6,652 β-thalassemia alleles from 17 Arab
populations indicated the presence of 73 out of the ~250 β-globin gene mutations occurring
worldwide. In contrast to many world populations, this heterogeneity seems to be a common
observation in many Arab populations irrespective of the size of pooled β-thalassemia alleles. This
case is clearly demonstrated in Algeria, Egypt, Morocco, Tunisia, and the United Arab Emirates
exhibiting the largest heterogeneity with more than 20 β-thalassemia mutation types described in each
population so far (reviewed in Obeid and Tadmouri[27]).
Glucose-6-Phosphate Dehydrogenase (G6PD) deficiency
G6PD deficiency is an X-linked inherited disorder caused by a defect or deficiency in the production of
an important red blood cell enzyme called G6PD. G6PD deficiency may cause the sudden, premature
destruction of red blood cells, leading to hemolytic anemia since the body cannot compensate for the
destroyed cells. In Tunisia, the African G6PD*A− variant is the most prevalent among G6PD patients and
causes a severe phenotype of hemolytic anemia following the ingestion of fava beans.[80] This mutation is
followed in frequency by the G6PD*Mediterranean (c.563C>T; p.Ser188Phe) and the G6PD*Aures
(c.143T>C; p.Ile48Thr) mutations. The latter was originally described in Algeria and then in Saudi
Arabia.[81,82] The analysis of mildly affected males revealed the association of
c.1311C>T, a newly described silent mutation in exon 12, with the c.93T>C polymorphism in
intron 11 and two single intronic base deletions: IVS-V-17 (-C) and IVS-VIII-43 (-G).[80]
In Sudan, the G6PD*B variant represents the most common type of enzyme in all the population groups.
However, the mutant G6PD*A+ enzyme, which has normal activity, is prevalent among individuals of African
descent. Among the deficiency-causing variants, G6PD*Mediterranean and G6PD*A− are the most
common.[83] The genetic heterogeneity of G6PD further continues in the Arabian Peninsula. In the United
Arab Emirates, G6PD*B+ is the major allele described among non-deficient subjects while the
G6PD*Mediterranean mutation is the most common cause of G6PD deficiency among Emirati
patients.[84] Other mutations detected include the African G6PD*A− (c.202G>A) and the G6PD*Aures
mutations.[84] This spectrum of mutations seems to be shared with neighboring Kuwait, where the
G6PD*Mediterranean and the African G6PD*A− genotypes are the most common, followed by the less
frequent G6PD*Chatham and G6PD*Aures alleles.[85] The Saudi population is no exception: G6PD*A−,
G6PD*Mediterranean, and G6PD*B+ are the major variants producing a severe deficiency state among
affected individuals. These variants exhibit a significant difference in their frequencies, with the highest
recorded in areas that were endemic to malaria and have high frequencies of sickle cell disease and
β-thalassemia, namely, the Eastern and the Southern Regions.[86,87] In neighboring Jordan, molecular
screening of G6PD alleles revealed a higher incidence of the disease in the Jordan Valley, known for its
historically higher rates of malaria, when compared to the Amman area, and has also shown the
existence of six mutations: the c.563C>T G6PD*Mediterranean mutation (53%), the African G6PD*A−
(c.376A>G + 202G>A; p.Asn126Asp + Val68Met) mutation, G6PD*Chatham (c.1003G>A;
p.Ala335Thr), G6PD*Valladolid (c.406C>T), G6PD*Aures (c.143T>C), and G6PD*Asahi
(c.202G>A).[88] Molecular screening of G6PD alleles in Iraqi Kurdish males indicated that the
G6PD*Mediterranean variant was the most common (88%), followed by the G6PD*Chatham variant
(c.1003G>A; 9%).[89] A study of 21 unrelated individuals with G6PD*Mediterranean[90] confirmed
that almost all patients from Saudi Arabia, Iraq, Iran, Jordan, Lebanon, and Palestine share the
c.563C>T mutation.
METABOLIC DISORDERS
Cystic fibrosis
Cystic fibrosis is a multi-system, life-threatening inherited disorder that primarily affects the lungs and
digestive system. The spectrum of cystic fibrosis mutations in Arab populations reveals a major
difference from worldwide observations. For example, more than 70% of cystic fibrosis patients with
European ancestry show the delta F508 (c.1521_1523delCTT) mutation of the CFTR gene. In Arab
patients these figures are far from being homogeneous. A comprehensive meta-analysis of 827 cystic
fibrosis alleles encompassing 15 Arab populations revealed a wide spectrum of 56 CFTR gene
mutations responsible for the disease in the region (unpublished observations). This heterogeneity
seems to continue at the regional level as well. For instance, the cystic fibrosis population of the Arabian
Peninsula exhibits 17 CFTR mutations. In Saudi Arabians, the 3120+1G>A (c.2988+1G>A) CFTR
mutation is the most common, while in Kuwait it is replaced by the delta F508 mutation. In neighboring
Bahrain, three other mutations seem to prevail: 2043delG (c.1911delG), 548A>T
(c.416A>T), and 4041C>G (c.3909C>G). This battery of mutations is replaced in Qatar by the
commonly observed c.3700A>G (p.I1234V) mutation in the CFTR gene. The picture further changes in
Oman and the United Arab Emirates where the c.1647T>G (p.S549R) mutation is common and the
delta F508 occurs at relatively low frequencies, but exclusively in patients of Baluchi descent (reviewed
in Obeid and Tadmouri[27]).
Lipoid congenital adrenal hyperplasia
This is a severe genetic disorder of steroid hormone biosynthesis, in which the production of all adrenal
and gonadal steroids is significantly impaired by a severe defect in the conversion of cholesterol to
pregnenolone. Worldwide, lipoid congenital adrenal hyperplasia is caused by nearly 35 mutations in
the steroidogenic acute regulatory (StAR) protein gene. Collective results of 20 Arab patients from
Libya, Egypt, Palestine, Jordan, Kuwait, Qatar, Saudi Arabia, and Yemen indicate the presence of 12
mutations including five novel ones in the StAR gene (reviewed in Obeid and Tadmouri[27]).
DISORDERS OF THE CIRCULATORY SYSTEM
An extensive survey on genetic disorders in Arab people indicated that there are at least 27 disorders of
the circulatory system known to run in Arab families.[79] However, unlike blood disorders and common
metabolic abnormalities, appreciation of the genetic etiologies of diseases of the circulatory system
has only occurred in the last decade. As a result, only scanty information is available that hints at
specific genetic signatures characteristic of Arab patients with cardiovascular disorders.
Congenital heart disease (CHD)
CHD is a structural abnormality of the heart or intra-thoracic great vessels. It is the most common birth
defect worldwide, representing one third of all congenital malformations presenting in the neonatal
period. Arabs are liable to have more children with congenital defects including CHD because of high
fertility rates.[76,91] The presence of small isolated communities in different parts of the Arab world with
the common practice of consanguinity is further evidence of a high incidence of CHD (e.g., Armenians,
Bedouins, Druze, Jews, Kurds, Nubians, Berbers, Tebo, and Twareq). A molecular study in Lebanese
CHD patients identified a differential duplication of a 44-bp intronic segment within the Rel-family
transcription factor gene, NFATC1, suggesting that this gene could be a potential ventricular septal
defect-susceptibility gene.[92] In a prospective study involving 60 Jordanian babies with cleft lip and/or
cleft palate, 47% had CHD. However, no chromosomal studies were performed in these patients.[93]
Coronary artery disease (CAD)
A study of Arabs living in Kuwait found a strong association between a C to G substitution
in the 3-prime untranslated region (3’UTR) of the APOC3 gene and coronary artery disease. The
population in the study included adults from Kuwait, Jordan, Palestine, Lebanon, Syria, Egypt, and
Iraq.[94] In Saudi individuals, CAD was also found to be associated with the 3’UTR allele of the APOC3
gene,[95] and other associations were found with the MTHFR c.677C>T variant, a platelet
glycoprotein receptor IIIa (PlA1/PlA1) genotype,[96,97] and the null-genotypes of GSTT1 and GSTM1.[98] In
support of a probable specificity of the genotypic etiology of coronary artery disease in Arabs, no
association was found with the lipoprotein lipase (LPL) polymorphisms (LPL-HindIII and LPL-PvuII);[99]
the infrequent band of 3.2-kb of the apolipoprotein A-I/C-III;[100] the insertion/deletion sites in the
polymorphic region of intron 16 of the angiotensin I-converting enzyme (ACE) gene;[101] the p.W64R
polymorphism of the β3-adrenoceptor (β3-AR) gene;[102] PvuII polymorphism in the LPL gene;[103] and the
c.677C>T and c.1298A>C variants of the MTHFR gene.[104]
Hypertrophic cardiomyopathy (HCM)
HCM is characterized by an abnormal thickening of the heart muscle, resulting from mutations in one
of several genes that lead to defects in the protein components of the cardiac muscle. Apical
hypertrophic cardiomyopathy in a father and daughter of a Lebanese Christian family has been reported.
In both, identical segments of the left ventricle were involved in the hypertrophic process with differing
degrees of severity.[105] In an analysis of data pertaining to all patients less than 50 years of age in Qatar,
six of 42 Qataris were diagnosed with HCM, making it the most encountered cardiomyopathy in this
group following dilated cardiomyopathy. HCM occurred in two peaks: one below 15 years of age, and
the other between 36 and 50 years of age. About 27% of the children (between 1 and 15 years) were
found to have HCM. The prevalence rate of HCM was calculated as 3.1 per 100,000 of the population.[106]
Arterial tortuosity syndrome
Probably, the earliest account of the disease in the region dates back to the year 2000 with the description
of 12 patients from eight different families in Saudi Arabia.[107] The first mutations associated with the
disease, however, were reported six years later in patients of Moroccan origin who had homozygosity
for the c.510G>A (p.W170X) and for a frameshift c.961delG (p.V321fsX391) mutation in the SLC2A10
gene.[108] In Qatar, two mutations, a novel p.R105C and a recurrent p.S81R, were recently described in the
SLC2A10 gene in seven patients from two unrelated families.[109]
Other disorders
In two consanguineous Saudi families, long QT syndrome (LQTS) was described as segregating with a
novel homozygous splicing mutation in the KCNQ1 gene. The observation of the same mutation in both
families indicated that this could be a founder mutation.[110] On the contrary, Naxos disease, a rare
cardiomyopathy disorder, failed to exhibit linkage with the previously identified plakoglobin gene in
two Saudi patients,[111] indicating that the disease might have a private signature in the region.
NEOPLASMS
Neoplasms are not typically regarded as population-specific disorders. However, several aspects of
these disorders differ by race and ethnicity. Among Arabs, several types of cancers show many distinct
features that are quite different from those seen in other populations worldwide. Very preliminary data
from the CTGA (Catalogue for Transmission Genetics in Arabs) Database for genetic disorders in Arab
populations indicate the presence of at least 55 cancer types in Arab people.[112] Breast, ovarian, lung,
and colorectal cancers are the main cancers that run in Arab families. Cancer susceptibility genes for
many of these cancers have been reported. Yet, other cancers with familial types such as prostate,
pancreatic, and testicular cancers did not reveal specific cancer-susceptibility genes at this time.
Breast and ovarian cancer
Broadly speaking, about 90% of breast cancer cases are sporadic, and the processes leading to gene
mutations in such cases are not well understood. Defined genetic predisposition accounts for only
about 5–10% of breast cancer cases. In either familial or sporadic cases, multiple genetic
etiologies, related to mutations in oncogenes and tumor suppressor genes, characterize breast
carcinomas in Arab patients.
113
A large fraction of inherited cases of breast cancer are usually
associated with mutations of the BRCA1 and BRCA2 genes. Other genes have also been implicated,
such as BRCATA, BRCA3, TP53, BRIP1, PTEN, and STK11. In sporadic breast cancer, increased
susceptibility has been attributed to mutations in low-penetrance genes including TNFA, HSP70-2, and
TNFRII. These private signatures of the disease in the region have probably contributed to the peculiar
clinical characteristics of the disease in Arab women, particularly the mean age of onset, which is
at least a decade earlier than in women of other ethnicities, and the more aggressive course of the
disease.
114
According to a study by Rouba et al.,
115
the proportion of BRCA1 and BRCA2 mutations could be
higher in Arab women when compared to other populations.
115
In Morocco, five deleterious mutations
in the BRCA1 gene were encountered in families with breast/ovarian cancer, including the novel
compound deletional c.2805delA/2924delA mutation.
116
In Algerian women, four of 11 familial cases
were associated with BRCA1 alterations.
117
In neighboring Tunisia, the prevalence of BRCA1 mutations in familial breast cancer is
estimated to be between 16% and 38%.
118,119
There, four BRCA1 mutations have been identified
including a novel Tunisian-specific c.212+2insG mutation and a frequently observed c.798_799delTT
Tunisian and North African founder mutation.
119,120
In Egypt, the p.Arg841Trp BRCA1 disease-associated
mutation was detected while a novel p.Glu1373X mutation in exon 12 of the BRCA1 gene was identified
in ovarian or breast cancer patients in Arab kindred from East Jerusalem.
121
An extensive analysis of
familial breast cancer in Lebanon revealed the presence of 38 BRCA1 sequence variants, many of which
are novel.
122
Adding to this heterogeneity, two other unclassified BRCA1 variants, p.Phe486Leu and
p.Asn550His, were detected in Saudi patients.
123
For the BRCA2 gene, the picture is much the same. Four mutations in the BRCA2 gene cause
breast/ovarian cancer in Moroccan families, including three novel ones (c.3381delT/3609delT,
c.7110delA/7338delA, and c.7235insG/7463insG).
116
The same study also identified a large number of
distinct polymorphisms and unclassified variants in BRCA2 as well as in BRCA1 that were described for
the first time.
116
In four unrelated Tunisian families, two novel c.1313dupT and c.7654dupT mutations in
exons 10 and 16 of the BRCA2 gene were reported.
124
In an Arab patient of Palestinian descent with
breast cancer, the c.2482delGACT novel BRCA2 truncating mutation was observed.
123
An extensive
analysis of familial breast cancer in Lebanon revealed the presence of 40 BRCA2 gene sequence
variants, many of which are novel.
122
In Saudi patients, an unclassified p.Asp1420Tyr BRCA2 variant
was detected.
123
This array of region-specific mutations seems to extend to Arab diasporas as well. For
example, the c.5804del4 mutation in exon 11 of the BRCA2 gene was seen in nearly half of the carriers of
deleterious mutations in Arab American women. This mutation has not been previously associated with
a particular Arab ethnicity and may represent a founder mutation of recent origin.
125
Another frequently mutated gene in Arab breast cancer patients is the TP53 gene. In fact, the
frequency of TP53 mutations among Saudi patients is one of the highest in the world. The list of
mutations includes seven novel ones, five of which are found in exon 4 of the TP53 gene. In brief, tumors
from Arab breast cancer patients have a high prevalence (29%) of TP53 mutations in exons 4 and 5,
whereas the smallest proportion of TP53 mutations (10%) is found in exon 7. Also, an excess of
G:C>A:T transitions (49%) at non-CpG sites was noted, suggesting exposure to particular
environmental carcinogens such as N-nitroso compounds.
126
In addition, several single nucleotide
polymorphisms in Arab patients seem to be specific to the indigenous populations and could be
associated with increased risk of breast cancer. Examples include: the p.Pro72Pro in the TP53 gene and
the c.309GG in the MDM2 gene in Saudi women, the c.-251A IL8 allele in Tunisian women, and the
c.1298A>C DNA polymorphism in the MTHFR gene in patients of Syrian ancestry.
127-129
In western societies, mutation of the TP53 gene is highly associated with epithelial ovarian cancers
(50–80%); however, only 32% of Arab patients with this neoplasm exhibit TP53 gene mutations. Instead,
PIK3CA amplification, but not PIK3CA mutation, is the single most common genetic alteration in Arab
cases (60%) and is mutually exclusive with gene mutations in both PI3 Kinase and MAPK pathways
(PIK3CA, KRAS, and BRAF).
130-132
This finding suggests a significant role for the dysregulated
PI3K/Akt pathway in the pathogenesis of ovarian cancers.
132
Colorectal carcinoma (CRC)
This type of neoplasm is a further example demonstrating a genetic heterogeneity in the region in
which not only different alleles of the same gene are involved, but also several genes seem to be of
importance for the emergence of this ailment. In Moroccan patients with attenuated polyposis, the
homozygous p.Tyr165Cys and c.1186_1187insGG mutations of the MYH gene were reported
133,134
whereas in neighboring Tunisia, a large deletion involving exon 6 of MLH1, a DNA mismatch repair
gene, was observed in a family with six patients diagnosed with colorectal or endometrial cancer
and characterized by a severe phenotype and an early onset.
135
Another study in Tunisians
demonstrated a significant association of the p.E1317Q, p.D1822V, and p.I1307K variants of the
adenomatous polyposis coli (APC) gene with colorectal carcinoma risk.
136
The p.I1307K mutation
seems to have a long history in the region, as shown by the repeated observation of the allele
in many of its populations. In 1999, the p.I1307K mutation was first described among
Ashkenazi and Yemenite Jews.
137
A study on the general population demonstrated a carrier frequency
of the allele in Yemenite Jews of approximately 5%.
138
A more extensive analysis showed the p.I1307K
mutation existed in Sephardi Jews of Syrian, Egyptian, Moroccan, Yemeni, and Palestinian origins, as
well as in Muslim and Christian individuals of Arab descent. This study also demonstrated that the
ancestor of modern p.I1307K alleles existed some 2.2-2.95 kya.
139
The picture of colorectal carcinoma
is further complicated by a recent study that investigated methylation
patterns in colorectal carcinomas from Egypt and Jordan and showed that differing gene methylation
patterns and mutation frequencies are also involved, indicating dissimilar molecular
pathogenesis and probably reflecting different environmental exposures.
140
Prostate cancer
In Tunisians, a significantly increased prostate cancer risk was associated with the VEGF-634 (GC + CC)
combined genotype while the VEGF-634C allele was associated with high histological grade. However,
the VEGF-1154A/-634G haplotype was negatively associated with prostate cancer risk and high tumor
grade.
141
No association was observed between the p.N700S TSP1 polymorphism and prostate cancer
risk or severity. Yet, subjects carrying one copy of the MMP9-1562T allele exhibited a threefold higher
risk of developing prostate cancer.
142
Other neoplasms
The CYP1A1*2C, GSTT1 null, and GSTP1 TT genotypes demonstrated significant association with diffuse
large B-cell lymphoma (DLBCL) in the Saudi population.
143
For the CYP1A1 c.4887C>A polymorphism, the CA and AA genotypes
and the variant A allele were associated with a significantly greater risk of
papillary thyroid cancer in Saudi patients compared with the wild-type CC genotype. Also, in thyroid cancer,
GSTT1 null showed a higher risk while GSTM1 null showed a protective effect.
144
Tunisian smokers carrying
this latter allele had an approximately 2.2-fold higher risk of bladder cancer.
145
Furthermore, individuals
carrying at least one copy of the methionine synthase (MS) c.2756A>G variant allele and those
heterozygous for the c.1298A>C MTHFR polymorphism displayed 2.33- and 1.8-fold increased risks
of developing bladder cancer, respectively.
146
FINAL NOTE
A multitude of studies reviewed in this paper clearly indicate that the Arab region was an important
milieu for the early adaptations of modern human populations to the out-of-Africa environment. The
experiences learned in that period certainly have allowed human populations to establish further
settlements and cover many areas in the rest of the world. The tidal movements of historical
populations in and out of the Arab region allowed the area to become an important bridge for the flow
of genes between Africa, Asia, and Europe. This characteristic made the area a focal point of attraction
for many population geneticists seeking to fill the gap in the interpretation of benign or lethal genomic
variations in world populations.
While we could be fascinated with the extent of the genetic heterogeneity that characterizes Arab
populations, understanding the genetic structure of populations and exploring their biogeographical
heterogeneities may also yield a better understanding of the genetic processes and, eventually,
disease etiologies in the region. In many instances, studying Arab families, with Arab-specific genetic
disorders, has resulted in a high value knowledge base and linked many genes to well-defined
phenotypes and helped a great deal in global genome annotation efforts.
147
Yet, many of the nearly
500 genes studied in Arab people revealed striking spectra of heterogeneities with many rare and novel
mutations causing large arrays of clinical outcomes, thus, considerably complicating proper counseling
and diagnosis for many disorders. Unfortunately, large-scale personalized
medical genomics may not materialize in the near future, especially because
hundreds of genetic disorders in Arabs have no defined molecular determinants and because
limited economic resources constrain genomic research throughout the region.
REFERENCES
[1] US Census Bureau. http://www.census.gov/population/international/data/idb/informationGateway.php, visited:
2.3.2014.
[2] Hunter-Zinck H, Musharoff S, Salit J, Al-Ali KA, Chouchane L, Gohar A, Matthews R, Butler MW, Fuller J, Hackett NR,
Crystal RG, Clark AG. Population genetic structure of the people of Qatar. Am J Hum Genet. 2010;87(1):17–25.
[3] Liu H, Prugnolle F, Manica A, Balloux F. A geographically explicit genetic model of worldwide human-settlement
history. Am J Hum Genet. 2006;79(2):230–237.
[4] Ferembach D. Human remains from the epipaleolithic period in the Taforalt grotto in eastern Morocco. C R Hebd
Seances Acad Sci. 1959;248(24):3465–3467.
[5] Bouzouggar A, Barton N, Vanhaeren M, d’Errico F, Collcutt S, Higham T, Hodge E, Parfitt S, Rhodes E, Schwenninger JL,
Stringer C, Turner E, Ward S, Moutmir A, Stambouli A. 82,000-year-old shell beads from North Africa and implications
for the origins of modern human behavior. Proc Natl Acad Sci U S A. 2007;104(24):9964–9969.
[6] Relethford JH. Genetic evidence and the modern human origins debate. Heredity. 2008;100(6):555–563.
[7] Fernandes V, Alshamali F, Alves M, Costa MD, Pereira JB, Silva NM, Cherni L, Harich N, Cerny V, Soares P, Richards
MB, Pereira L. The Arabian cradle: Mitochondrial relicts of the first steps along the southern route out of Africa. Am J
Hum Genet. 2012;90(2):347–355.
[8] Bailey GN, Flemming NC, King GCP, Lambeck K, Momber G, Moran LJ, Al-Sharekh A, Vita-Finzi C. Coastlines,
submerged landscapes, and human evolution: The Red Sea Basin and the Farasan Islands. J Island Coastal Archaeol.
2007;2:127–160.
[9] Cerný V, Mulligan CJ, Rídl J, Zaloudková M, Edens CM, Hájek M, Pereira L. Regional differences in the distribution of
the sub-Saharan, West Eurasian, and South Asian mtDNA lineages in Yemen. Am J Phys Anthropol.
2008;136(2):128–137.
[10] Abu-Amero KK, Hellani A, González AM, Larruga JM, Cabrera VM, Underhill PA. Saudi Arabian Y-Chromosome
diversity and its relationship with nearby regions. BMC Genet. 2009;10:59.
[11] Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ. Y-chromosome diversity characterizes the
Gulf of Oman. Eur J Hum Genet. 2008;16(3):374–386.
[12] Cann RL, Stoneking M, Wilson AC. Mitochondrial DNA and human evolution. Nature. 1987;325(6099):31–36.
[13] Ingman M, Kaessmann H, Pääbo S, Gyllensten U. Mitochondrial genome variation and the origin of modern humans.
Nature. 2000;408(6813):708–713.
[14] Luis JR, Rowold DJ, Regueiro M, Caeiro B, Cinnioğlu C, Roseman C, Underhill PA, Cavalli-Sforza LL, Herrera RJ. The
Levant versus the Horn of Africa: Evidence for bidirectional corridors of human migrations. Am J Hum Genet.
2004;74(3):532–544.
[15] Pérez-Miranda AM, Alfonso-Sánchez MA, Peña JA, Herrera RJ. Qatari DNA variation at a crossroad of human
migrations. Hum Hered. 2006;61(2):67–79.
[16] Ferri G, Tofanelli S, Alù M, Taglioli L, Radheshi E, Corradini B, Paoli G, Capelli C, Beduschi G. Y-STR variation in
Albanian populations: Implications on the match probabilities and the genetic legacy of the minority claiming an
Egyptian descent. Int J Legal Med. 2010;124(5):363–370.
[17] González-Pérez E, Esteban E, Via M, Gayà-Vidal M, Athanasiadis G, Dugoujon JM, Luna F, Mesa MS, Fuster V,
Kandil M, Harich N, Bissar-Tadmouri N, Saetta A, Moral P. Population relationships in the Mediterranean revealed by
autosomal genetic data (Alu and Alu/STR compound systems). Am J Phys Anthropol. 2010;141(3):430–439.
[18] Moorjani P, Patterson N, Hirschhorn JN, Keinan A, Hao L, Atzmon G, Burns E, Ostrer H, Price AL, Reich D. The history of
African gene flow into Southern Europeans, Levantines, and Jews. PLoS Genet. 2011;7(4):e1001373.
[19] Ke Y, Su B, Song X, Lu D, Chen L, Li H, Qi C, Marzuki S, Deka R, Underhill P, Xiao C, Shriver M, Lell J, Wallace D,
Wells RS, Seielstad M, Oefner P, Zhu D, Jin J, Huang W, Chakraborty R, Chen Z, Jin L. African origin of modern humans
in East Asia: A tale of 12,000 Y chromosomes. Science. 2001;292(5519):1151–1153.
[20] Maca-Meyer N, González AM, Larruga JM, Flores C, Cabrera VM. Major genomic mitochondrial lineages delineate
early human expansions. BMC Genet. 2001;2:13.
[21] Underhill PA, Passarino G, Lin AA, Shen P, Mirazón Lahr M, Foley RA, Oefner PJ, Cavalli-Sforza LL. The phylogeography
of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet. 2001;65(Pt
1):43–62.
[22] Lahr MM, Field JS. Assessment of the southern dispersal: GIS-based analyses of potential routes at oxygen isotopic
stage 4. J World Prehistory. 2005;19:1–45.
[23] Manni F, Leonardi P, Barakat A, Rouba H, Heyer E, Klintschar M, McElreavey K, Quintana-Murci L. Y-chromosome
analysis in Egypt suggests a genetic regional continuity in Northeastern Africa. Hum Biol. 2002;74(5):645–658.
[24] Ennafaa H, Cabrera VM, Abu-Amero KK, González AM, Amor MB, Bouhaha R, Dzimiri N, Elgaaïed AB, Larruga JM.
Mitochondrial DNA haplogroup H structure in North Africa. BMC Genet. 2009;10:8.
[25] Rowold DJ, Luis JR, Terreros MC, Herrera RJ. Mitochondrial DNA geneflow indicates preferred usage of the Levant
Corridor over the Horn of Africa passageway. J Hum Genet. 2007;52(5):436–447.
[26] Saleheen D, Frossard PM. The cradle of the ΔF508 mutation. J Ayub Med Coll Abbottabad. 2008;20(4):157–160.
[27] Obeid T, Tadmouri GO. Initial results of a pilot Arab human variome project. In: Tadmouri GO, Taleb Al Ali M,
Al Khaja N, eds. Genetic Disorders in the Arab World: Qatar. Dubai, United Arab Emirates: Centre for Arab Genomic
Studies; 2012.
[28] Rose JI. New light on human prehistory in the Arabo-Persian Gulf Oasis. Curr Anthropol. 2010;51:849–883.
[29] Al-Zahery N, Pala M, Battaglia V, Grugni V, Hamod MA, Hooshiar Kashani B, Olivieri A, Torroni A, Santachiara-
Benerecetti AS, Semino O. In search of the genetic footprints of Sumerians: A survey of Y-chromosome and mtDNA
variation in the Marsh Arabs of Iraq. BMC Evol Biol. 2011;11:288.
[30] Grmek MD. Malaria in the eastern Mediterranean in prehistory and antiquity. Parassitologia. 1994;36:1–6.
[31] de Zulueta J. Malaria and ecosystems: From prehistory to posteradication. Parassitologia. 1994;36:7–15.
[32] Joy DA, Feng X, Mu J, Furuya T, Chotivanich K, Krettli AU, Ho M, Wang A, White NJ, Suh E, Beerli P, Su XZ. Early origin
and recent expansion of Plasmodium falciparum. Science. 2003;300(5617):318–321.
[33] Angel JL. Porotic hyperostosis, anemias, malarias, and marshes in the prehistoric Eastern Mediterranean. Science.
1966;153:760–763.
[34] Carter R, Mendis KN. Evolutionary and historical aspects of the burden of malaria. Clin Microbiol Rev.
2002;15:564–594.
[35] Kwiatkowski DP. How malaria has affected the human genome and what human genetics can teach us about
malaria. Am J Hum Genet. 2005;77:171–192.
[36] Ziskind B, Halioua B. La tuberculose en ancienne Egypte. Rev Mal Respir. 2007;24(10):1277–1283.
[37] Karlsson EK, Kwiatkowski DP, Sabeti PC. Natural selection and infectious disease in human populations. Nat Rev
Genet. 2014;15(6):379–393.
[38] Poolman EM, Galvani AP. Evaluating candidate agents of selective pressure for cystic fibrosis. J R Soc Interface.
2007;4(12):91–98.
[39] Stiehm ER. Disease versus disease: How one disease may ameliorate another. Pediatrics. 2006;117(1):184–191.
[40] Abu-Amero KK, Larruga JM, Cabrera VM, González AM. Mitochondrial DNA structure in the Arabian Peninsula. BMC
Evol Biol. 2008;8:45.
[41] Bar-Yosef O. The Natufian culture in the Levant, threshold to the origins of agriculture. Evol Anthropol.
1998;6:159–177.
[42] AlShamali F, Pereira L, Budowle B, Poloni ES, Currat M. Local population structure in Arabian Peninsula revealed by
Y-STR diversity. Hum Hered. 2009;68(1):45–54.
[43] Fu W, O’Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S, Rieder MJ, Altshuler D, Shendure J, Nickerson DA,
Bamshad MJ, NHLBI Exome Sequencing Project, Akey JM. Analysis of 6,515 exomes reveals the recent origin of most
human protein-coding variants. Nature. 2013;493(7431):216–220.
[44] Beech CM, Liyanarachchi S, Shah NP, Sturm AC, Sadiq MF, de la Chapelle A, Tanner SM. Ancient founder mutation is
responsible for Imerslund-Gräsbeck Syndrome among diverse ethnicities. Orphanet J Rare Dis. 2011;6:74.
[45] Terreros MC, Rowold DJ, Mirabal S, Herrera RJ. Mitochondrial DNA and Y-chromosomal stratification in Iran:
Relationship between Iran and the Arabian Peninsula. J Hum Genet. 2011;56(3):235–246.
[46] Arredi B, Poloni ES, Paracchini S, Zerjal T, Fathallah DM, Makrelouf M, Pascali VL, Novelletto A, Tyler-Smith C.
A predominantly neolithic origin for Y-chromosomal DNA variation in North Africa. Am J Hum Genet.
2004;75(2):338–345.
[47] Kujanová M, Pereira L, Fernandes V, Pereira JB, Cerný V. Near eastern neolithic genetic input in a small oasis of the
Egyptian Western Desert. Am J Phys Anthropol. 2009;140(2):336–346.
[48] Lucotte G, Aouizérate A, Berriche S. Y-chromosome DNA haplotypes in north African populations. Hum Biol.
2000;72(3):473–480.
[49] Lucotte G, Mercier G. Brief communication: Y-chromosome haplotypes in Egypt. Am J Phys Anthropol.
2003;121(1):63–66.
[50] Irish JD. The Iberomaurusian enigma: North African progenitor or dead end? J Hum Evol. 2000;39:393–410.
[51] Beech M, Cuttler R, Moscrop D, Kallweit H, Martin J. New evidence for the Neolithic settlement of Marawah Island,
Abu Dhabi, United Arab Emirates. PSAS. 2005;35:37–56.
[52] Bahri R, El Moncer W, Al-Batayneh K, Sadiq M, Esteban E, Moral P, Chaabani H. Genetic differentiation and origin of
the Jordanian population: An analysis of Alu insertion polymorphisms. Genet Test Mol Biomarkers.
2012;16(5):324–329.
[53] Omberg L, Salit J, Hackett N, Fuller J, Matthew R, Chouchane L, Rodriguez-Flores JL, Bustamante C, Crystal RG, Mezey
JG. Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations. BMC Genet.
2012;13:49.
[54] Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab Emirates: Phylogenetic relationships and
ancestral populations. Gene. 2014;533(1):411–419.
[55] Walter H, Matsumoto H, De Stefano GF. Gm and Km allotypes in four Sardinian population samples. Am J Phys
Anthropol. 1991;86(1):45–50.
[56] Tadmouri GO, Garguier N, Demont J, Perrin P, Başak AN. History and origin of beta-thalassemia in Turkey: Sequence
haplotype diversity of beta-globin genes. Hum Biol. 2001;73(5):661–674.
[57] Gérard N, Berriche S, Aouizérate A, Diéterlen F, Lucotte G. North African Berber and Arab influences in the western
Mediterranean revealed by Y-chromosome DNA haplotypes. Hum Biol. 2006;78(3):307–316.
[58] Zalloua PA, Xue Y, Khalife J, Makhoul N, Debiane L, Platt DE, Royyuru AK, Herrera RJ, Hernanz DF, Blue-Smith J, Wells
RS, Comas D, Bertranpetit J, Tyler-Smith C, Genographic Consortium. Y-chromosomal diversity in Lebanon is
structured by recent historical events. Am J Hum Genet. 2008;82:873–882.
[59] Loirat F, Hazout S, Lucotte G. G542X as a probable Phoenician cystic fibrosis mutation. Hum Biol.
1997;69(3):419–425.
[60] Zalloua PA, Platt DE, El Sibai M, Khalife J, Makhoul N, Haber M, Xue Y, Izaabel H, Bosch E, Adams SM, Arroyo E, López-
Parra AM, Aler M, Picornell A, Ramon M, Jobling MA, Comas D, Bertranpetit J, Wells RS, Tyler-Smith C, Genographic
Consortium. Identifying genetic traces of historical expansions: Phoenician footprints in the Mediterranean. Am J
Hum Genet. 2008;83:633–642.
[61] Zahed L, Demont J, Bouhass R, Trabuchet G, Hänni C, Zalloua P, Perrin P. Origin and history of the IVS-I-110 and
codon 39 beta-thalassemia mutations in the Lebanese population. Hum Biol. 2002;74:837–847.
[62] Fattoum S, Abbes S. Some data on the epidemiology of hemoglobinopathies in Tunisia. Hemoglobin.
1985;9:423–429.
[63] Ben Abdeladhim A, Aïssaoui B, Boussen M. Homozygous O Arab hemoglobinopathy in a Tunisian family. Apropos of
a case. Tunis Med. 1987;65:571–574.
[64] Labie D, Elion J, Beldjord C. On the diversity of beta-globin mutations, a reflection of recent historic events in Israel.
Am J Hum Genet. 1994;55(6):1284–1285.
[65] Salem AH, Badr FM, Gaballah MF, Pääbo S. The genetics of traditional living: Y-chromosomal and mitochondrial
lineages in the Sinai Peninsula. Am J Hum Genet. 1996;59(3):741–743.
[66] Gutala R, Carvalho-Silva DR, Jin L, Yngvadottir B, Avadhanula V, Nanne K, Singh L, Chakraborty R, Tyler-Smith C.
A shared Y-chromosomal heritage between Muslims and Hindus in India. Hum Genet. 2006;120(4):543–551.
[67] Gomes MP, da Costa MG, Braga LB, Cordeiro-Ferreira NT, Loi A, Pirastu M, Cao A. Beta-thalassemia mutations in the
Portuguese population. Hum Genet. 1988;78(1):13–15.
[68] Jassim N, Merghoub T, Pascaud O, al Mukharraq H, Ducrocq R, Labie D, Elion J, Krishnamoorthy R, Arrayed SA.
Molecular basis of beta-thalassemia in Bahrain: An epicenter for a Middle East specific mutation. Ann N Y Acad Sci.
1998;850:407–409.
[69] Al-Ali AK, Al-Ateeq S, Imamwerdi BW, Al-Sowayan S, Al-Madan M, Al-Muhanna F, Bashaweri L, Qaw F. Molecular
Bases of beta-thalassemia in the Eastern Province of Saudi Arabia. J Biomed Biotechnol. 2005;2005(4):322–325.
[70] Purdey M. The pathogenesis of Machado Joseph Disease: A high manganese/low magnesium initiated CAG
expansion mutation in susceptible genotypes? J Am Coll Nutr. 2004;23(6):715S–729S.
[71] Mittal U, Srivastava AK, Jain S, Jain S, Mukerji M. Founder haplotype for Machado-Joseph disease in the Indian
population: Novel insights from history and polymorphism studies. Arch Neurol. 2005;62(4):637–640.
[72] Haj Khelil A, Laradi S, Miled A, Tadmouri GO, Ben Chibani J, Perrin P. Clinical and molecular aspects of
haemoglobinopathies in Tunisia. Clin Chim Acta. 2004;340:127–137.
[73] Loueslati BY, Cherni L, Khodjet-Elkhil H, Ennafaa H, Pereira L, Amorim A, Ben Ayed F, Ben Ammar Elgaaied A. Islands
inside an island: Reproductive isolates on Jerba island. Am J Hum Biol. 2006;18(1):149–153.
[74] González AM, Karadsheh N, Maca-Meyer N, Flores C, Cabrera VM, Larruga JM. Mitochondrial DNA variation in
Jordanians and their genetic relationship to other Middle East populations. Ann Hum Biol. 2008;35(2):212–231.
[75] Jalal SD, Al-Allawi NA, Bayat N, Imanian H, Najmabadi H, Faraj A. β-Thalassemia mutations in the Kurdish population
of northeastern Iraq. Hemoglobin. 2010;34(5):469–476.
[76] Tadmouri GO, Nair P, Obeid T, Al Ali MT, Al Khaja N, Hamamy HA. Consanguinity and reproductive health among
Arabs. Reprod Health. 2009;6:17.
[77] AlKuraya FS. Autozygome decoded. Genet Med. 2010;12(12):765–771.
[78] Fadhlaoui-Zid K, Martinez-Cruz B, Khodjet-el-khil H, Mendizabal I, Benammar-Elgaaied A, Comas D. Genetic structure
of Tunisian ethnic groups revealed by paternal lineages. Am J Phys Anthropol. 2011;146(2):271–280.
[79] Tadmouri GO. Genetic disorders in Arabs. In: Tadmouri GO, Taleb Al Ali M, Al Khaja N, eds. Genetic Disorders in the
Arab World: Qatar. Dubai, United Arab Emirates: Centre for Arab Genomic Studies; 2012.
[80] Daoud BB, Mosbehi I, Préhu C, Chaouachi D, Hafsia R, Abbes S. Molecular characterization of erythrocyte glucose-6-
phosphate dehydrogenase deficiency in Tunisia. Pathol Biol (Paris). 2008;56(5):260–267.
[81] Nafa K, Reghis A, Osmani N, Baghli L, Benabadji M, Kaplan JC, Vulliamy TJ, Luzzatto L. G6PD Aures: A new mutation
(48 Ile→Thr) causing mild G6PD deficiency is associated with favism. Hum Mol Genet. 1993;2(1):81–82.
[82] Niazi GA, Adeyokunnu A, Westwood B, Beutler E. Neonatal jaundice in Saudi newborns with G6PD Aures. Ann Trop
Paediatr. 1996;16(1):33–37.
[83] Beutler E. Glucose-6-phosphate dehydrogenase deficiency. In: Williams WJ, Beutler E, Erslev AS, Lichtman MA, eds.
Haematology. New York: McGraw-Hill; 1991.
[84] Bayoumi RA, Nur-E-Kamal MS, Tadayyon M, Mohamed KK, Mahboob BH, Qureshi MM, Lakhani MS, Awaad MO,
Kaeda J, Vulliamy TJ, Luzzatto L. Molecular characterization of erythrocyte glucose-6-phosphate dehydrogenase
deficiency in Al-Ain District, United Arab Emirates. Hum Hered. 1996;46(3):136–141.
[85] AlFadhli S, Kaaba S, Elshafey A, Salim M, AlAwadi A, Bastaki L. Molecular characterization of glucose-6-phosphate
dehydrogenase gene defect in the Kuwaiti population. Arch Pathol Lab Med. 2005;129(9):1144–1147.
[86] El-Hazmi MA, Al-Swailem AR, Al-Faleh FZ, Warsy AS. Frequency of glucose-6-phosphate dehydrogenase, pyruvate
kinase and hexokinase deficiency in the Saudi population. Hum Hered. 1986;36(1):45–49.
[87] El-Hazmi MA, Warsy AS. Frequency of glucose-6-phosphate dehydrogenase variants and deficiency in Arabia. Gene
Geogr. 1990;4(1):15–19.
[88] Karadsheh NS, Moses L, Ismail SI, Devaney JM, Hoffman E. Molecular heterogeneity of glucose-6-phosphate
dehydrogenase deficiency in Jordan. Haematologica. 2005;90(12):1693–1694.
[89] Al-Allawi N, Eissa AA, Jubrael JM, Jamal SA, Hamamy H. Prevalence and molecular characterization of Glucose-6-
Phosphate dehydrogenase deficient variants among the Kurdish population of Northern Iraq. BMC Blood Disord.
2010;10:6.
[90] Kurdi-Haidar B, Mason PJ, Berrebi A, Ankra-Badu G, al-Ali A, Oppenheim A, Luzzatto L. Origin and spread of the
glucose-6-phosphate dehydrogenase variant (G6PD-Mediterranean) in the Middle East. Am J Hum Genet.
1990;47(6):1013–1019.
[91] Aburawi EH. Call for multinational studies of the epidemiology of congenital heart disease in the Arab World.
Ibnosina J Med BS. 2013.
[92] Yehya A, Souki R, Bitar F, Nemer G. Differential duplication of an intronic region in the NFATC1 gene in patients with
congenital heart disease. Genome. 2006;49(9):1092–1098.
[93] Aqrabawi HE. Facial cleft and associated anomalies: Incidence among infants at a Jordanian medical centre. East
Mediterr Health J. 2008;14(2):356–359.
[94] Tas S. Strong association of a single nucleotide substitution in the 3’-untranslated region of the apolipoprotein-CIII
gene with common hypertriglyceridemia in Arabs. Clin Chem. 1989;35(2):256–259.
[95] Hussain SS, Buraiki J, Dzimiri N, Butt AI, Vencer L, Basco MC, Khan B. Polymorphism in apoprotein-CIII gene and
coronary heart disease. Ann Saudi Med. 1999;19(3):201–205.
[96] Abu-Amero KK, Wyngaard CA, Dzimiri N. Association of the platelet glycoprotein receptor IIIa (PlA1/PlA1) genotype
with coronary artery disease in Arabs. Blood Coagul Fibrinolysis. 2004;15(1):77–79.
[97] Al-Ali AK, Al-Muhana FA, Larbi EB, Abdulmohsen MF, Al-Sultan AI, Al-Maden MS, Al-Ateeq SA. Frequency of
methylenetetrahydrofolate reductase C677T polymorphism in patients with cardiovascular disease in Eastern Saudi
Arabia. Saudi Med J. 2005;26(12):1886–1888.
[98] Abu-Amero KK, Al-Boudari OM, Mohamed GH, Dzimiri N. T null and M null genotypes of the glutathione S-transferase
gene are risk factor for CAD independent of smoking. BMC Med Genet. 2006;7:38.
[99] Abu-Amero KK, Wyngaard CA, Al-Boudari OM, Kambouris M, Dzimiri N. Lack of association of lipoprotein lipase gene
polymorphisms with coronary artery disease in the Saudi Arab population. Arch Pathol Lab Med.
2003;127(5):597–600.
[100] Johansen K, Dunn B, Tan JC, Kwaasi AA, Skotnicki A, Skotnicki M. Coronary artery disease and apolipoprotein A-I/C-III
gene polymorphism: A study of Saudi Arabians. Clin Genet. 1991;39(1):1–5.
[101] Dzimiri N, Basco C, Moorji A, Meyer BF. Angiotensin-converting enzyme polymorphism and the risk of coronary heart
disease in the Saudi male population. Arch Pathol Lab Med. 2000;124(4):531–534.
[102] Abu-Amero KK, Al-Boudari OM, Mohamed GH, Dzimiri N. Beta 3 adrenergic receptor Trp64Arg polymorphism and
manifestation of coronary artery disease in Arabs. Hum Biol. 2005;77(6):795–802.
[103] Cagatay P, Susleyici-Duman B, Ciftci C. Lipoprotein lipase gene PvuII polymorphism serum lipids and risk for
coronary artery disease: Meta-analysis. Dis Markers. 2007;23(3):161–166.
[104] Abu-Amero KK, Wyngaard CA, Dzimiri N. Prevalence and role of methylenetetrahydrofolate reductase 677 C→T and
1298 A→C polymorphisms in coronary artery disease in Arabs. Arch Pathol Lab Med. 2003;127(10):1349–1352.
[105] Malouf J, Alam S, Kanj H, Mufarrij A, Der Kaloustian VM. Hypergonadotropic hypogonadism with congestive
cardiomyopathy: An autosomal-recessive disorder? Am J Med Genet. 1985;20(3):483–489.
[106] El-Menyar AA, Bener A, Numan MT, Morcos S, Taha RY, Al-Suwaidi J. Epidemiology of idiopathic cardiomyopathy in
Qatar during 1996-2003. Med Princ Pract. 2006;15(1):56–61.
[107] Al Fadley F, Al Manea W, Nykanen DG, Al Fadley A, Bulbul Z, Al Halees Z. Severe tortuosity and stenosis of the
systemic, pulmonary and coronary vessels in 12 patients with similar phenotypic features: A new syndrome? Cardiol
Young. 2000;10(6):582–589.
[108] Coucke PJ, Wessels MW, Van Acker P, Gardella R, Barlati S, Willems PJ, Colombi M, De Paepe A. Homozygosity
mapping of a gene for arterial tortuosity syndrome to chromosome 20q13. J Med Genet. 2003;40(10):747–751.
[109] Faiyaz-Ul-Haque M, Zaidi SH, Al-Sanna N, Alswaid A, Momenah T, Kaya N, Al-Dayel F, Bouhoaigah I, Saliem M, Tsui
LC, Teebi AS. A novel missense and a recurrent mutation in SLC2A10 gene of patients affected with arterial tortuosity
syndrome. Atherosclerosis. 2009;203(2):466–471.
[110] Bhuiyan ZA, Momenah TS, Amin AS, Al-Khadra AS, Alders M, Wilde AA, Mannens MM. An intronic mutation leading to
incomplete skipping of exon-2 in KCNQ1 rescues hearing in Jervell and Lange-Nielsen syndrome. Prog Biophys Mol
Biol. 2008;98(2-3):319–327.
[111] Stuhrmann M, Bukhari IA, El-Harith el-HA. Naxos disease in an Arab family is not caused by the Pk2157del2 mutation.
Evidence for exclusion of the plakoglobin gene. Saudi Med J. 2004;25(10):1449–1452.
[112] Tadmouri GO, Nair P. Cancers in Arab populations: Concise notes. Hamdan Medical J. 2012;5(1):79–82.
[113] Polyak K. Molecular alterations in ductal carcinoma in situ of the breast. Curr Opin Oncol. 2002;14(1):92–96.
[114] Ayad E, Francis I, Peston D, Shousha S. Triple negative, basal cell type and EGFR positive invasive breast carcinoma
in Kuwaiti and British patients. Breast J. 2009;15(1):109–111.
Page 406 of 408
Tadmouri, Sastry & Chouchane. Global Cardiology Science and Practice 2014:54
[115] Rouba A, Kaisi N, Al-Chaty E, Badin R, Pals G, Young C, Worsham MJ. Patterns of allelic loss at the BRCA1 locus in
Arabic women with breast cancer. Int J Mol Med. 2000;6(5):565–569.
[116] Tazzite A, Jouhadi H, Nadifi S, Aretini P, Falaschi E, Collavoli A, Benider A, Caligo MA. BRCA1 and BRCA2 germline
mutations in Moroccan breast/ovarian cancer families: Novel mutations and unclassified variants. Gynecol Oncol.
2012;125(3):687–692.
[117] Uhrhammer N, Abdelouahab A, Lafarge L, Feillel V, Ben Dib A, Bignon YJ. BRCA1 mutations in Algerian breast cancer
patients: High frequency in young, sporadic cases. Int J Med Sci. 2008;5(4):197–202.
[118] Troudi W, Uhrhammer N, Romdhane KB, Sibille C, Amor MB, Khodjet El Khil H, Jalabert T, Mahfoudh W, Chouchane L,
Ayed FB, Bignon YJ, Elgaaied AB. Complete mutation screening and haplotype characterization of BRCA1 gene in
Tunisian patients with familial breast cancer. Cancer Biomark. 2008;4(1):11–18.
[119] Mahfoudh W, Bouaouina N, Ahmed SB, Gabbouj S, Shan J, Mathew R, Uhrhammer N, Bignon YJ, Troudi W, Elgaaied
AB, Hassen E, Chouchane L. Hereditary breast cancer in Middle Eastern and North African (MENA) populations:
Identification of novel, recurrent and founder BRCA1 mutations in the Tunisian population. Mol Biol Rep.
2012;39(2):1037–1046.
[120] Chouchane L, Boussen H, Sastry KS. Breast cancer in Arab populations: Molecular characteristics and disease
management implications. Lancet Oncol. 2013;14(10):e417–e424.
[121] Kadouri L, Bercovich D, Elimelech A, Lerer I, Sagi M, Glusman G, Shochat C, Korem S, Hamburger T, Nissan A,
Abu-Halaf N, Badrriyah M, Abeliovich D, Peretz T. A novel BRCA-1 mutation in Arab kindred from east Jerusalem with
breast and ovarian cancer. BMC Cancer. 2007;7:14.
[122] Jalkh N, Nassar-Slaba J, Chouery E, Salem N, Uhrchammer N, Golmard L, Stoppa-Lyonnet D, Bignon YJ, Mégarbané A.
Prevalance of BRCA1 and BRCA2 mutations in familial breast cancer patients in Lebanon. Hered Cancer Clin Pract.
2012;10(1):7.
[123] El-Harith el-HA, Abdel-Hadi MS, Steinmann D, Dork T. BRCA1 and BRCA2 mutations in breast cancer patients from
Saudi Arabia. Saudi Med J. 2002;23(6):700–704.
[124] Riahi A, Kharrat M, Ghourabi ME, Khomsi F, Gamoudi A, Lariani I, May AE, Rahal K, Chaabouni-Bouhamed H. Mutation
spectrum and prevalence of BRCA1 and BRCA2 genes in patients with familial and early-onset breast/ovarian cancer
from Tunisia. Clin Genet. 2013;, Dec 28.
[125] Shatavi SV, Dohany L, Chisti MM, Jaiyesimi IA, Zakalik D. Unique genetic characteristics of BRCA mutation carriers in a
cohort of Arab American women. J Clin Oncol. 2013;31(suppl):abstr 1541.
[126] Al-Qasem AJ, Toulimat M, Eldali AM, Tulbah A, Al-Yousef N, Al-Daihan SK, Al-Tassan N, Al-Tweigeri T, Aboussekhra A.
TP53 genetic alterations in Arab breast cancer patients: Novel mutations, pattern and distribution. Oncol Lett.
2011;2(2):363–369.
[127] Snoussi K, Mahfoudh W, Bouaouina N, Ahmed SB, Helal AN, Chouchane L. Genetic variation in IL-8 associated with
increased risk and poor prognosis of breast carcinoma. Hum Immunol. 2006;67(1-2):13–21.
[128] AlShatwi AA, Hasan TN, Shafi G, Alsaif MA, Al-Hazzani AA, Alsaif AA. A single-nucleotide polymorphism in the TP53
and MDM-2 gene modifies breast cancer risk in an ethnic Arab population. Fundam Clin Pharmacol.
2012;26(3):438–443.
[129] Lajin B, Alhaj Sakur A, Ghabreau L, Alachkar A. Association of polymorphisms in one-carbon metabolizing genes with
breast cancer risk in Syrian women. Tumour Biol. 2012;33(4):1133–1139.
[130] Levine DA, Bogomolniy F, Yee CJ, Lash A, Barakat RR, Borgen PI, Boyd J. Frequent mutation of the PIK3CA gene in
ovarian and breast cancers. Clin Cancer Res. 2005;11(8):2875–2878.
[131] Wang Y, Helland A, Holm R, Kristensen GB, Børresen-Dale AL. PIK3CA mutations in advanced ovarian carcinomas.
Hum Mutat. 2005;25(3):322.
[132] Abubaker J, Bavi P, Al-Haqawi W, Jehan Z, Munkarah A, Uddin S, Al-Kuraya KS. PIK3CA alterations in Middle Eastern
ovarian cancers. Mol Cancer. 2009;8:51.
[133] Teebi AS. Genetic disorders among Arab populations. Second Edition. Berlin, Heidelberg, Germany: Springer-Verlag;
2010.
[134] Laarabi FZ, Cherkaoui Jaouad I, Baert-Desurmont S, Ouldim K, Ibrahimi A, Kanouni N, Frebourg T, Sefiani A. The first
mutations in the MYH gene reported in Moroccan colon cancer patients. Gene. 2012;496(1):55–58.
[135] Aissi-Ben Moussa S, Moussa A, Lovecchio T, Kourda N, Najjar T, Ben Jilani S, El Gaaied A, Porchet N, Manai M, Buisine
MP. Identification and characterization of a novel MLH1 genomic rearrangement as the cause of HNPCC in a Tunisian
family: Evidence for a homologous Alu-mediated recombination. Fam Cancer. 2009;8(2):119–126.
[136] Bougatef K, Marrakchi R, Ouerhani S, Sassi R, Moussa A, Kourda N, Blondeau Lahely Y, Najjar T, Ben Jilani S,
Soubrier F, Ben Ammar Elgaaied A. No evidence of the APC D1822V missense variant’s pathogenicity in Tunisian
patients with sporadic colorectal cancer. Pathol Biol (Paris). 2009;57(3):e67–e71.
[137] Patael Y, Figer A, Gershoni-Baruch R, Papa MZ, Risel S, Shtoyerman-Chen R, Karasik A, Theodor L, Friedman E.
Common origin of the I1307K APC polymorphism in Ashkenazi and non-Ashkenazi Jews. Eur J Hum Genet.
1999;7(5):555–559.
[138] Drucker L, Shpilberg O, Neumann A, Shapira J, Stackievicz R, Beyth Y, Yarkoni S. Adenomatous polyposis coli I1307K
mutation in Jewish patients with different ethnicity: Prevalence and phenotype. Cancer. 2000;88(4):755–760.
[139] Niell BL, Long JC, Rennert G, Gruber SB. Genetic anthropology of the colorectal cancer-susceptibility allele APC
I1307K: Evidence of genetic drift within the Ashkenazim. Am J Hum Genet. 2003;73(6):1250–1260.
[140] Chan AO, Soliman AS, Zhang Q, Rashid A, Bedeir A, Houlihan PS, Mokhtar N, Al-Masri N, Ozbek U, Yaghan R,
Kandilci A, Omar S, Kapran Y, Dizdaroglu F, Bondy ML, Amos CI, Issa JP, Levin B, Hamilton SR. Differing DNA
methylation patterns and gene mutation frequencies in colorectal carcinomas from Middle Eastern countries. Clin
Cancer Res. 2005;11(23):8281–8287.
[141] Sfar S, Hassen E, Saad H, Mosbah F, Chouchane L. Association of VEGF genetic polymorphisms with prostate
carcinoma risk and clinical outcome. Cytokine. 2006;35(1-2):21–28.
[142] Sfar S, Saad H, Mosbah F, Gabbouj S, Chouchane L. TSP1 and MMP9 genetic variants in sporadic prostate cancer.
Cancer Genet Cytogenet. 2007;172(1):38–44.
Page 407 of 408
Tadmouri, Sastry & Chouchane. Global Cardiology Science and Practice 2014:54
[143] Al-Dayel F, Al-Rasheed M, Ibrahim M, Bu R, Bavi P, Abubaker J, Al-Jomah N, Mohamed GH, Moorji A, Uddin S, Siraj AK,
Al-Kuraya K. Polymorphisms of drug-metabolizing enzymes CYP1A1, GSTT and GSTP contribute to the development of
diffuse large B-cell lymphoma risk in the Saudi Arabian population. Leuk Lymphoma. 2008;49(1):122–129.
[144] Siraj AK, Ibrahim M, Al-Rasheed M, Abubaker J, Bu R, Siddiqui SU, Al-Dayel F, Al-Sanea O, Al-Nuaim A, Uddin S, Al-
Kuraya K. Polymorphisms of selected xenobiotic genes contribute to the development of papillary thyroid cancer
susceptibility in Middle Eastern population. BMC Med Genet. 2008;9:61.
[145] Ouerhani S, Tebourski F, Slama MR, Marrakchi R, Rabeh M, Hassine LB, Ayed M, Elgaaı̈ed AB. The role of glutathione
transferases M1 and T1 in individual susceptibility to bladder cancer in a Tunisian population. Ann Hum Biol.
2006;33(5-6):529–535.
[146] Ouerhani S, Oliveira E, Marrakchi R, Ben Slama MR, Sfaxi M, Ayed M, Chebil M, Amorim A, El Gaaied AB, Prata MJ.
Methylenetetrahydrofolate reductase and methionine synthase polymorphisms and risk of bladder cancer in a
Tunisian population. Cancer Genet Cytogenet. 2007;176(1):48–53.
[147] Ozçelik T, Kanaan M, Avraham KB, Yannoukakos D, Mégarbané A, Tadmouri GO, Middleton L, Romeo G, King MC,
Levy-Lahad E. Collaborative genomics for human health and cooperation in the Mediterranean region. Nat Genet.
2010;42(8):641–645.
Page 408 of 408
Tadmouri, Sastry & Chouchane. Global Cardiology Science and Practice 2014:54
Genetic heterogeneity of Arab pop-HLA gene-2018 (1)
RESEARCH ARTICLE
The genetic heterogeneity of Arab populations as inferred from HLA genes
Abdelhafidh Hajjej1*, Wassim Y. Almawi2¤, Antonio Arnaiz-Villena3, Lasmar Hattab4, Slama Hmida1
1 Department of Immunogenetics, National Blood Transfusion Center, Tunis, Tunisia
2 Department of Medicine, Harvard Medical School, Boston, MA, United States of America
3 Department of Immunology, University Complutense, School of Medicine, Madrid Regional Blood Center, Madrid, Spain
4 Department of Medical Analysis, Hospital of Gabes (Ghannouch), Gabes, Tunisia
¤ Current address: School of Pharmacy, Lebanese American University, Byblos, Lebanon
* abdelhafidhhajjej@gmail.com
Abstract
This is the first genetic anthropology study on Arabs in the MENA (Middle East and North Africa) region. The present meta-analysis included 100 populations from 36 Arab and non-Arab communities, comprising 16,006 individuals, and evaluates the genetic profile of Arabs using HLA class I (A, B) and class II (DRB1, DQB1) genes. A total of 56 Arab populations comprising 10,283 individuals were selected from several databases and were compared with 44 Mediterranean, Asian, and sub-Saharan populations. The most frequent alleles in Arabs are A*01, A*02, B*35, B*51, DRB1*03:01, DRB1*07:01, DQB1*02:01, and DQB1*03:01, while DRB1*03:01-DQB1*02:01 and DRB1*07:01-DQB1*02:02 are the most frequent class II haplotypes. Dendrograms, correspondence analyses, genetic distances, and haplotype analysis indicate that Arabs could be stratified into four groups. The first consists of North Africans (Algerians, Tunisians, Moroccans, and Libyans) and the first Arabian Peninsula cluster (Saudis, Kuwaitis, and Yemenis), who appear to be related to Western Mediterraneans, including Iberians; this might be explained by a massive migration into these areas when the Sahara underwent a relatively rapid desiccation, starting about 10,000 years BC. The second includes Levantine Arabs (Palestinians, Jordanians, Lebanese, and Syrians), along with Iraqis and Egyptians, who are related to Eastern Mediterraneans. The third comprises Sudanese and Comorians, who tend to cluster with Sub-Saharans. The fourth comprises the second Arabian Peninsula cluster, made up of Omanis, Emiratis, and Bahrainis. It is noteworthy that the two large minorities (Berbers and Kurds) are indigenous (autochthonous) and are not genetically different from "host" and neighboring populations. In conclusion, this study confirmed high genetic heterogeneity among present-day Arabs, especially those of the Arabian Peninsula.
PLOS ONE | https://doi.org/10.1371/journal.pone.0192269 March 9, 2018
OPEN ACCESS
Citation: Hajjej A, Almawi WY, Arnaiz-Villena A, Hattab L, Hmida S (2018) The genetic heterogeneity of Arab populations as inferred from HLA genes. PLoS ONE 13(3): e0192269. https://doi.org/10.1371/journal.pone.0192269
Editor: Amr H Sawalha, University of Michigan, UNITED STATES
Received: November 6, 2017
Accepted: January 19, 2018
Published: March 9, 2018
Copyright: © 2018 Hajjej et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the paper and its Supporting Information files.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The human leukocyte antigen (HLA) system plays a key role in self-nonself recognition and is divided into class I (HLA-A, -B, and -C) and class II (HLA-DP, -DQ, and -DR) loci; it comprises 220 genes in a 3.6 Mb region on the short arm of chromosome 6. The HLA system is highly polymorphic, with more than 17,000 alleles detected. For example, there are 4,828 B, 3,968 A, and 3,579 C class I alleles, compared with 2,103 DRB1 and 1,142 DQB1 class II alleles. Several HLA alleles have been associated with various autoimmune and infectious diseases [1]. HLA class I and class II loci are characterized by high (80–90%) heterozygosity and thus constitute reliable genetic markers for phylogenetic and anthropological studies.
Population studies have confirmed that the frequencies of HLA alleles and haplotypes vary with ethnicity and geographic origin. Because HLA markers are expressed codominantly, heterozygotes can be distinguished from homozygotes, which allows genotypes and allele frequencies to be assigned directly [2]. Linkage disequilibrium (LD) analysis between HLA alleles can be used to estimate the number of generations separating two closely related populations since their divergence. Diversity in haplotype distribution, allele frequency, and LD reflects the extent of variation between closely related populations. Allele frequency-based genetic distance analysis allows construction of phylogenetic trees (dendrograms), which give a relative estimate of the time elapsed since the populations existed as single cohesive units [3–6].
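The point about codominance can be illustrated with a minimal sketch of direct gene counting, in which each typed individual contributes two observed gene copies. The function name and the five-person sample below are hypothetical and are not taken from the study.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Estimate allele frequencies by direct gene counting.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    With codominant markers both alleles are observed, so every
    individual contributes exactly two gene copies to the count.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * len(genotypes)
    return {allele: n / total_copies for allele, n in counts.items()}

# Hypothetical HLA-A typing of five individuals (illustrative only).
sample = [("A*02", "A*02"), ("A*02", "A*01"), ("A*01", "A*03"),
          ("A*02", "A*24"), ("A*03", "A*02")]
freqs = allele_frequencies(sample)
```

Here `freqs` maps each allele to its share of the ten gene copies in the sample; the frequencies sum to one by construction.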
Arabs are a major panethnic group, and their union, the Arab League, is a cultural and ethnic union of 22 member states. As of 2013, nationals of the Arab League countries numbered 357 million and populated an area of 13 million km², straddling Africa and Asia [7]. Ethnic, religious, and linguistic diversity (triple heterogeneity) characterizes Arabs. Most Arabs follow Islam, and Christianity is the second largest religion, with over 15 million Christians. There are also smaller but significant religious minorities (such as Druze and Jews) and a number of non-Arab ethnic minorities (such as Berbers and Kurds) [7, 8].
The history of Arabs extends from circa 1200 BC, when the Southern Arabian Peninsula was ruled by three successive civilizations: the Mineans, who established their capital at Karna (1200–650 BC), the Sabeans in Marib (1000 BC to 570 AD), and the Himyarites (2nd–6th centuries AD) in Dhafar (Oman) [9–11]. These civilizations were built by authentic Yemeni tribes. The kingdom of Kinda was established in Central Arabia in the 4th to early 6th century AD, while the Dilmun civilization was founded in Eastern Arabia. In the 3rd century AD, the East African Kingdom of Aksum extended into Yemen and Western Saudi Arabia [12]. In addition, the Lakhmids (of Yemeni origin) established a dynasty that ruled part of present-day Iraq and Syria in 300–602 AD [10, 13, 14]. The Arab Christian Ghassanids (220–638 AD), originating from Southern Arabia, migrated in the 3rd century to Jordan, where they established a kingdom that extended from Syria to Yathrib (Saudi Arabia) [12, 13]. Islam was introduced to the Arabian Peninsula in 610 AD. Shortly thereafter, Arabian tribes were united as a single Islamic state in the Arabian Peninsula, spearheaded by the Islamic prophet Muhammad. This Islamic state progressively grew in area and in the types and numbers of its populations, and extended from Andalusia (Spain) in the west to the Indus in the east [14].
The subsequent spread of Islam involved the swift invasion of Persia (637–651 AD), Iraq, the Levant, and Egypt (639 AD), and extended into North Africa (640–709) and to Spain, Portugal, and France (Poitiers) in the 8th century AD. Eastwards, Arab expansion into Central Asia, Bukhara (Uzbekistan), Afghanistan (637–709), and the Indus border (664–712) followed. Northwards, Arab invaders came into contact with the Byzantine Empire and with the Caspian and Caucasus regions [15, 16]. With the Islamic expansion from the 7th century onward, social and political groups were gradually Arabized. Arab-Muslim culture spread at the expense of local languages (such as Berber and Kurdish), especially in the Middle East and North Africa, so that Arabized populations came to speak variants of Arabic mixed with the original languages (dialects). The extent of gene exchange between Arabs and these autochthonous groups is undetermined but is thought to be lower than the religious and cultural influence.
Given the large number of conquests, Arabs were in contact with different ethnicities residing in a vast area stretching from Mauritania (West Africa) to the western border of China (East Asia). This suggests that cultural and perhaps genetic relationships were established with these ethnic groups. This work aims to study the HLA distribution in North African and Oriental Arab populations and to compare them with neighboring populations (Sub-Saharan Africans, Europeans, and Asians).
Populations and methods
Search strategy
Datasets of HLA allele frequencies were collected through a systematic review performed according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) criteria [17]; only criteria 1–10, 17, and 26 are applicable to this type of study (S1 Checklist). The PubMed, ScienceDirect, AlleleFrequencies.net, and ResearchGate databases were searched for all papers on HLA polymorphism and HLA disease associations in Arabs. This systematic literature search, covering papers published up to May 31, 2017, was conducted by two investigators (H.A and H.L). The search terms were 'HLA Arabs' or 'Human Leukocyte Antigen Arabs', followed by a search per country ('HLA Tunisians', 'HLA Saudis', and so on for the remaining countries), which resulted in more than 50 keywords. A database from the International Histocompatibility Workshops was also used. Some authors were also contacted by e-mail or through ResearchGate to request information and missing data. While most datasets were taken from studies with an explicit anthropological focus, control groups from case-control disease studies were also used. No language restriction was applied to this search.
Inclusion and exclusion criteria
All included studies met the following criteria: HLA allele frequencies had to be obtained by molecular typing, and subjects had to be typed for at least one of HLA-A, HLA-B, HLA-DRB1, and HLA-DQB1. Publications were excluded if they reported serological data, had a sample size of fewer than 35 individuals, included typed individuals (or controls) who were related or not randomly selected, or presented duplicate data sets. Studies were also excluded if they presented incomplete or partial allele frequencies or if there were significant ambiguities in the typing.
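As a rough illustration, the criteria above can be written as a screening function. This is only a sketch: the record layout, field names, and candidate studies are hypothetical, and the actual screening in the study was done by two reviewers rather than by code.

```python
from dataclasses import dataclass

# Loci and sample-size threshold as stated in the criteria above.
TARGET_LOCI = frozenset({"HLA-A", "HLA-B", "HLA-DRB1", "HLA-DQB1"})
MIN_SAMPLE_SIZE = 35

@dataclass(frozen=True)
class StudyRecord:  # hypothetical record layout, for illustration only
    label: str
    typing: str                    # "molecular" or "serological"
    loci: frozenset
    sample_size: int
    unrelated_random_sample: bool
    duplicate_dataset: bool
    complete_frequencies: bool
    ambiguous_typing: bool

def exclusion_reasons(study):
    """Return the list of criteria a study fails (empty list = eligible)."""
    reasons = []
    if study.typing != "molecular":
        reasons.append("not molecular typing")
    if not (study.loci & TARGET_LOCI):
        reasons.append("no target locus typed")
    if study.sample_size < MIN_SAMPLE_SIZE:
        reasons.append("sample size below 35")
    if not study.unrelated_random_sample:
        reasons.append("related or non-random subjects")
    if study.duplicate_dataset:
        reasons.append("duplicate data set")
    if not study.complete_frequencies:
        reasons.append("incomplete allele frequencies")
    if study.ambiguous_typing:
        reasons.append("ambiguous typing")
    return reasons

# Hypothetical candidates, not studies from the meta-analysis.
candidates = [
    StudyRecord("study_1", "molecular", frozenset({"HLA-A", "HLA-DRB1"}),
                120, True, False, True, False),
    StudyRecord("study_2", "molecular", frozenset({"HLA-B"}),
                30, True, False, True, False),
    StudyRecord("study_3", "serological", frozenset({"HLA-A"}),
                200, True, False, True, False),
]
included = [s.label for s in candidates if not exclusion_reasons(s)]
```

Returning the reasons rather than a bare boolean makes it easy to tabulate why records were dropped, which is the kind of information a flow diagram reports.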
Data extraction
Studies were independently selected by two authors (H.A and H.L). An external referee was
invited in case of disagreements not resolved by both reviewers. Data extracted from selected
papers included publication year, study type (anthropology, association), sample size, HLA-A, -B,
-C, -DRB1, and -DQB1 allele frequencies, haplotype frequencies, region, country, and typed loci.
Statistical analysis
A three-dimensional correspondence analysis and a bi-dimensional representation were performed using VISTA V5.02 software [18]. Phylogenetic trees were constructed from allele frequencies using the Neighbor-Joining (NJ) method [19] and standard genetic distances (SGD) [20], with the DISPAN software, which contains the GNKDST and TREEVIEW programs [21, 22].
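For readers unfamiliar with frequency-based distances, the sketch below computes a standard genetic distance between two populations. It assumes that SGD refers to Nei's standard distance, D = -ln(J_XY / sqrt(J_X J_Y)), without small-sample correction; the text does not spell out the formula, and this is not the DISPAN implementation. The input layout and example frequencies are hypothetical.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance from allele frequencies.

    pop_x, pop_y: dict mapping locus -> {allele: frequency}.
    Only loci present in both populations are used. The per-locus sums
    are not divided by the number of loci because that factor cancels
    in the ratio J_XY / sqrt(J_X * J_Y).
    """
    loci = pop_x.keys() & pop_y.keys()
    if not loci:
        raise ValueError("populations share no typed locus")
    j_x = j_y = j_xy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        for allele in fx.keys() | fy.keys():
            x = fx.get(allele, 0.0)
            y = fy.get(allele, 0.0)
            j_x += x * x
            j_y += y * y
            j_xy += x * y
    if j_xy == 0.0:
        return math.inf  # no shared alleles: identity is zero
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)

# Hypothetical single-locus frequencies (illustrative only).
pop_a = {"DRB1": {"07:01": 0.5, "03:01": 0.5}}
pop_b = {"DRB1": {"07:01": 1.0}}
d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
```

A matrix of such pairwise distances is the usual input to a tree-building method such as Neighbor-Joining.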
Results
Study flow
The use of more than fifty keywords allowed identification of 5,456 papers and HLA datasets, of which 315 were deemed relevant to the study. Of these, 42 articles and 11 HLA datasets containing information on 56 Arab populations, and meeting the study criteria, were included. The study flow is illustrated in Fig 1. In addition, 20 articles and 18 HLA datasets that met the criteria of this study and contained complete information on 44 other populations were selected, but without going through the systematic review. The populations used in the comparison were chosen mainly from countries neighboring Arab countries. This study relied on a database of 100 populations (of which data for 11 populations were extracted from association studies) from 36 Arab and other countries in Asia, Europe, and Africa. The distribution of populations by region is illustrated in Fig 2A. These populations represent allele frequency data for 16,006 individuals (160.06 individuals/population) from 63 references.
Selected populations
Arab populations. The 42 articles and 11 HLA datasets (http://www.allelefrequencies.net) selected provided information on 56 populations (Table 1), comprising 10,283 individuals [23–67]. The 56 different ethnic and religious populations were selected from 18 Arab countries. There were no reliable HLA data for the remaining countries (Somalia, Djibouti, Mauritania, and Qatar) (Fig 2B). The studied populations are divided into 29 African (26 North African and 3 Sub-Saharan) and 27 Asian populations (13 Levantine and 14 Arabian Peninsula). With the exception of 8 populations [28, 38, 47, 48, 50, 52, 53, 55], for which HLA data were extracted from association studies, the remaining 50 studies were anthropological ones.
Neighboring populations. Forty-four worldwide populations [23, 34, 39, 66, 68–85], comprising 5,723 individuals, were selected from 18 countries on three continents, using the same criteria described above (Table 2). These comprised 22 European, 11 non-Arab Asian, and 11 Sub-Saharan African populations. Of the 11 Asian populations, two were Arab minorities living in Iran (Khuzestan and Famoori).
Data for only three populations [74, 75, 84] were extracted from association studies. These populations were typed for at least one of HLA-A, -B, -DRB1, or -DQB1.
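The reported counts can be cross-checked with a few lines of arithmetic. The snippet below only restates figures given in the text; the variable names are illustrative.

```python
# Population and individual counts as reported in the text.
arab_populations, other_populations = 56, 44
arab_individuals, other_individuals = 10_283, 5_723

total_populations = arab_populations + other_populations
total_individuals = arab_individuals + other_individuals
mean_size = total_individuals / total_populations  # individuals per population

# Regional breakdowns quoted for each group of populations.
arab_african = 26 + 3          # North African + Sub-Saharan
arab_asian = 13 + 14           # Levantine + Arabian Peninsula
neighbors = 22 + 11 + 11       # European + Asian + Sub-Saharan African

# Populations whose data came from association studies.
from_association_studies = 8 + 3
```

The totals reproduce the 100 populations, 16,006 individuals, 160.06 individuals per population, and 11 association-study populations quoted above.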
HLA allele frequencies features of Arab populations
Table 3 shows the most frequent HLA-A and -B alleles in Arab populations. A*02 was the most prevalent allele, and its frequency exceeded 25% in some populations, such as Saudis (30.4%) [23], Tunisian Berbers of Zrawa (29.3%) [24], Moroccans (26.2%) [25], and Sudanese (25.9%) [23]. The A*01, *03, *24, *30, and *68 alleles were also common in most Arab populations. For example, the highest frequency of A*01 was seen in Tunisians (15%) [26] and Moroccans (14.8%) [25], while A*03 was prevalent among Iraqi Kurds (15.1%) [23], and A*30 was prevalent among Sudanese (17.6%) [23]. In addition, A*24 was common among Lebanese-Armenians (17.3%) [27], while A*68 was prevalent in Saudis (10.5%) [28]. In contrast, A*25, *28, *34, *36, *43, *66, *69, *74, and *80 are rare among Arabs. It is noteworthy that A*34, described as a rare allele among Arabs, is found at a high frequency (22.2%) in Tunisian Berbers from Zrawa [24], the highest reported for any population worldwide.
Results for the HLA-B locus are presented in Table 3. B*35 was the most frequent B* allele in Palestinians (20.3%) [29] and Lebanese-Armenians (19.8%) [27]. B*35 was found at varied frequencies in Iraqi Kurds (15.6%) [23], Omanis (15.3%) [30], Jordanians (14.9%) [31], and Arab Emirati (11.1%) [23] populations. B*51 was the second most frequent allele, and high frequencies were recorded for Saudis (19.3%) [23], Omanis (17.5%) [30], and Arab Emirati (15.6%) [23] populations. B*50 was also a frequent B* allele in most Arabs, including Saudis (18.8%) [23] and Libyans (16.1%) [31], along with B*08, and B*44 among the Tunisian Berbers of Zrawa (32.8%) [24], the latter being the highest frequency worldwide. Similarly, the frequency of B*27 is the highest among Jordanians (27.1%) [31]. In contrast, the B*37, *42, *46, *47, *48, *54, *59, *67, and *78 alleles are extremely rare or virtually absent in all Arab populations.
The most common DRB1 and DQB1 alleles among Arabs are shown in Table 4. DRB1*07:01 was the most frequent allele among Tunisians from Ghannouch (28.6%) [33], Jordanians (26.9%) [31], and Saudis (26.6%) [23], while Egyptians (8.3%) and Sudanese had the lowest frequencies of DRB1*07:01. DRB1*03:01 was the second most frequent DRB1* allele in some Arabs, such as Tunisians of Tunis (21.9%) [34] and Moroccans of Metelsa (20.2%) [23], but rare in Jordanians (2.4%) [31]. DRB1*11:01 was also frequent among some Arabs, such as Lebanese (36.8%) [35], but rare among Saudis (4.8%) and Moroccans of Chaouya (2.5%) [23]. Furthermore, the DRB1*13:01, *13:02, and *15:01 alleles are relatively frequent among Arabs. A high frequency of DRB1*13:01 was recorded for Sudanese (23.3%), while DRB1*13:02 was virtually absent in Bahrainis [35] and Sudanese [23]. All DRB1*09, *12, and *14 subtypes are extremely rare among Arabs. In addition, DRB1*16 subtypes are rare in all Arab populations except for Bahrain, where DRB1*16:01 is found at a high frequency (13.9%) [35].
Fig 1. Flow diagram of the study selection process. https://doi.org/10.1371/journal.pone.0192269.g001
Fig 2. The distribution of studied populations by region (A) and country (B). https://doi.org/10.1371/journal.pone.0192269.g002
DQB1*02:0X and *03:01 are the most frequent DQB1* alleles in Arabs. The highest frequencies of DQB1*02:0X were reported for Tunisians (Ghannouch; 40.01%) [33], Yemenite-Jews (39.1%) [36], Moroccans (Agadir-Souss; 37.8%) [37], and Saudis (37.3%) [23], while the lowest frequency was found in Egyptians (6%) [38]. On the other hand, DQB1*03:01 is very common among Lebanese (45%) [39] and Algerians (Oran; 35.1%) [23], but not Saudis (7.6%) [23]. DQB1*03:02 and *05:01 are also frequent in most Arabs, such as Tunisians (Ghannouch; 20.7%) [33], Jordanians (17.8%) [31], Palestinians (17.6%) [29], and Lebanese (16.8%) [35]. DQB1*05:01 is frequent among Bahrainis (29.2%) [35], Tunisians (Berbers of Jerba; 22.7%) [40], and Lebanese (20.5%) [35]. Among DQB1*06 subtypes, DQB1*06:02 and *06:03 were the most frequent in most Arab populations, but absent in Bahrainis, where DQB1*06:01 is very frequent (13.20%) [35]. Furthermore, all DQB1*04 subtypes are rare among Arabs, particularly DQB1*04:01, which is virtually absent except in Egyptians (10.17%) [38]. The most common DQB1*04 subtype in Arabs is DQB1*04:02.
Table 1. List of Arab populations used in the present work.
| No. | Population | Symbol | Size | References |
|---|---|---|---|---|
| 1 | Algiers | Alg | 102 | [67] |
| 2 | Algerians-B | Alg-B | 97 | [23] |
| 3 | Algerians-A | Alg-A | 132 | [48] |
| 4 | Algerians-Oran | Ora | 100 | [23] |
| 5 | Gabesians | Gab | 77 | [59] |
| 6 | Gabesians-A | Gab-A | 96 | [40] |
| 7 | Ghannouchians | Gha | 82 | [33] |
| 8 | Berbers-Jerba | Ber-J | 55 | [40] |
| 9 | Berbers-Matmata | Ber-M | 81 | [40] |
| 10 | Berbers-Zrawa | Ber-Z | 70 | [24] |
| 11 | Tunisians | Tun | 376 | [61] |
| 12 | Tunisians-A | Tun-A | 80 | [60] |
| 13 | Tunisians-B | Tun-B | 101 | [34] |
| 14 | Tunisians-C | Tun-C | 100 | [63] |
| 15 | Tunisians-M | Tun-M | 123 | [26] |
| 16 | Southern Tunisians | Tun-S | 250 | [62] |
| 17 | Libyans | Lib | 118 | [32] |
| 18 | Libyans-Jews | Lib-J | 119 | [36] |
| 19 | Berbers-Metelsa | Ber-Me | 99 | [64] |
| 20 | Moroccans | Mor | 96 | [25] |
| 21 | Moroccans-A | Mor-A | 110 | [42] |
| 22 | Moroccans-Agadir | Mor-Ag | 98 | [37] |
| 23 | Moroccans-Chaouya | Mor-Ch | 98 | [65] |
| 24 | Moroccans-Jews | Mor-J | 94 | [66] |
| 25 | Egyptians | Egy | 101 | [39] |
| 26 | Egyptians-A | Egy-A | 121 | [38] |
| 27 | Sudanese | Sud | 200 | [23] |
| 28 | Sudanese-Nuba | Sud-N | 46 | [23] |
| 29 | Comorians | Com | 117 | [43] |
| 30 | Jordanians | Jor | 146 | [31] |
| 31 | Jordanians-A | Jor-A | 1254 | [46] |
| 32 | Syrians | Syr | 200 | [47] |
| 33 | Syrians-A | Syr-A | 225 | [58] |
| 34 | Lebanese | Leb | 95 | [35] |
| 35 | Lebanese-A | Leb-A | 1123 | [45] |
| 36 | Lebanese-B | Leb-B | 191 | [44] |
| 37 | Lebanese-Armen | Leb-Ar | 368 | [27] |
| 38 | Lebanese-KZ | Leb-Kz | 93 | [39] |
| 39 | Lebanese-NS | Leb-Ns | 59 | [39] |
| 40 | Lebanese-Yohmor | Leb-Y | 75 | [39] |
| 41 | Palestinians | Pal | 165 | [29] |
| 42 | Palestinians-A | Pal-A | 109 | [36] |
| 43 | Saudis | Sau | 105 | [28] |
| 44 | Saudis-A | Sau-A | 213 | [23] |
| 45 | Saudis-B | Sau-B | 158 | [49] |
| 46 | Saudis-C | Sau-C | 499 | [23] |
| 47 | Saudis-D | Sau-D | 383 | [50] |
| 48 | Omanis-A | Oma-A | 259 | [30] [51] |
| 49 | Kuwaitis | Kuw | 212 | [52] |
| 50 | Kuwaitis-A | Kuw-A | 114 | [53] |
| 51 | Bahrainis | Bah | 72 | [35] |
| 52 | Emiratis | Emi | 373 | [23] |
| 53 | Iraq kurds | Ira-K | 209 | [54] |
| 54 | Yemenite-Jews | Yem-J | 76 | [36] |
| 55 | Yemen-sana’a | Yem | 50 | [55] |
| 56 | Omanis | Oma | 118 | [56] [57] |
https://doi.org/10.1371/journal.pone.0192269.t001
Allelic comparison between Tunisians and other populations
Allelic comparisons were performed using Neighbor-Joining dendrograms, correspondence analysis, and standard genetic distances. Analyses were performed with class I and class II markers, at both generic and high-resolution levels, to make the most of the available data, since some of the populations included in these comparisons lack high-resolution data.
Neighbor-joining dendrograms. Comparison at the generic level was made using genetic
distances based on DRB1* and DQB1* allelic frequencies. Four groups can be distinguished in
Fig 3. The first group comprises North African Arabs (Tunisians, Algerians, Moroccans, Libyans),
Western Mediterranean Europeans (Iberians, French), Arabian Peninsula Arabs (Saudis,
Kuwaitis, Yemenis), and the Arab minority of Iran (Khuzestani). The second group is formed by
Eastern Mediterranean Europeans (Greeks, Cretans, Albanians, Turks, Macedonians), Italians,
Levant Arabs (Palestinians, Lebanese, Syrians), Iraqi Kurds, Tunisian Berbers (Djerba), and
Iranians. The third group comprises Sub-Saharan Africans (Fulani, Mossi, Rimaibe, Bubi,
Mandenka, and Senegalese). Omanis, Bahrainis, Egyptians, and Sudanese form a fourth, heterogeneous
group containing Asians and Sub-Saharan Africans. Similar results, but with notable differences,
were observed in dendrograms built with standard genetic distances (SGD) based
on generic DRB1 (S1 Fig) and generic B loci (S2 Fig).
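As a rough illustration of how a Neighbor-Joining dendrogram is grown from a matrix of genetic distances, the sketch below computes the standard Q-criterion and returns the first pair of populations that would be joined. The function name `nj_first_pair`, the population labels and the toy distances are hypothetical; they are not taken from Table 5, and the software actually used to build Fig 3 is not named in this section.

```python
from itertools import combinations


def nj_first_pair(labels, dist):
    """Return the pair of labels that Neighbor-Joining would join first.

    `dist` maps frozenset({a, b}) to the distance between labels a and b.
    """
    n = len(labels)
    # Total distance from each label to every other label.
    totals = {
        a: sum(dist[frozenset((a, b))] for b in labels if b != a)
        for a in labels
    }

    # Q(a, b) = (n - 2) * d(a, b) - total(a) - total(b); the minimum is joined.
    def q(pair):
        a, b = pair
        return (n - 2) * dist[frozenset(pair)] - totals[a] - totals[b]

    return min(combinations(labels, 2), key=q)


# Hypothetical toy distances between five populations (illustrative only).
labels = ["P1", "P2", "P3", "P4", "P5"]
raw = {
    ("P1", "P2"): 5, ("P1", "P3"): 9, ("P1", "P4"): 9, ("P1", "P5"): 8,
    ("P2", "P3"): 10, ("P2", "P4"): 10, ("P2", "P5"): 9,
    ("P3", "P4"): 8, ("P3", "P5"): 7, ("P4", "P5"): 3,
}
dist = {frozenset(k): v for k, v in raw.items()}
first = nj_first_pair(labels, dist)
```

In this toy matrix the pair with the smallest raw distance (P4 and P5) is not the pair joined first, because the criterion also accounts for how far each population lies from all the others.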
Correspondence analysis. High-resolution DRB1 correspondence analysis (Fig 4) demonstrated
the clustering of the studied populations into three groups. The first contains
North Africans (Tunisians, Algerians, Moroccans, and Libyans), Iberians (Basques, Spaniards,
Table 2. Worldwide populations included in the meta-analysis.
No. Populations Symbols Size References No. Populations Symbols Size References
1 Spaniards Spa 176 [41] 23 Mossi Mos 42 [39]
2 Portuguese Por 118 [39] 24 Mandenka Mad 200 [39]
3 Murcians Mur 173 [80] 25 Amhara Amh 98 [39]
4 Italians Ita 284 [68] 26 Bubi Bub 101 [39]
5 Basques-A Bas-A 82 [41] 27 Congolese Con 85 [72]
6 Basques-Arratia Bas-Ar 83 [77] 28 Fulani Ful 38 [39]
7 Basques-B Bas-B 99 [70] 29 Gabonese Gab 167 [85]
8 French Fre 179 [68] 30 Nigerians Nig 258 [23]
9 French-Rennes Fre-R 200 [34] 31 Oromo Oro 83 [39]
10 Balearic Bal 90 [71] 32 Rimaibe Rim 39 [39]
11 Corsica Cor 100 [71] 33 Senegalese Sen 177 [39]
12 Sardinians Sar 91 [68] 34 Famoori Arabs Fam 84 [73]
13 Ashkenazi-Jews Ash-J 132 [66] 35 India-Northeast Ind-N 188 [83]
14 Greeks-A Gre-A 96 [39] 36 Indians-Delhi Ind-D 112 [84]
15 Greeks-B Gre-B 101 [39] 37 Iranian-Jews Ira-J 91 [73]
16 Greeks-C Gre-C 98 [39] 38 Iranians Ira 120 [74]
17 Greeks-D Gre-D 242 [23] 39 Iranians-A Ira-A 100 [75]
18 Macedonians Mac 172 [78] 40 Iranians-Azeri Ira-Az 100 [81]
19 Turks Tur 250 [23] 41 Iranians-Kurd Ira-k 100 [81]
20 Turks-A Tur-A 228 [79] 42 Khuzestani Arabs Khu 50 [73]
21 Albanians Alb 160 [76] 43 Pakistanis-Pathan Pak-P 100 [82]
22 Cretans Cre 135 [69] 44 Pakistanis-Sindh Pak-S 101 [82]
https://doi.org/10.1371/journal.pone.0192269.t002
Portuguese, Murcians), French, Saudis, Yemenite Jews, and Khuzestani Arabs. The second contains
Eastern Mediterraneans (Greeks, Cretans, Lebanese, Palestinians, and Macedonians),
Berbers of Djerba, Italians, Iraqi Kurds, Iranians, Egyptians, Ashkenazi Jews, and Moroccan
Jews. The last cluster consists of Sub-Saharan populations. It should be noted that Jordanians,
Bahrainis, and Sudanese fell outside these main groups. Similarly, correspondence analysis
using class I (A and B) data identified three main clusters (Fig 5). The first cluster contains all
Sub-Saharan Africans along with Sudanese. The second cluster contains Eastern Mediterranean
populations (Albanians, Greeks, Cretans, Lebanese, Palestinians, and Macedonians), Italians,
Iraqi Kurds, Ashkenazi Jews, and Jordanians-A. The last cluster includes North Africans
(Tunisians, Algerians, Moroccans, and Libyans), Iberians (Basques, Spaniards), French, and
Saudis.
Correspondence analysis based on generic DRB1 data, and using only Arab populations,
shows that Arabs cluster into four groups (Fig 6). The first contains the North Africans
(Tunisians, Algerians, Moroccans, and Libyans), Saudis, Yemenis, Kuwaitis, and Khuzestanis
(Iranian Arabs). The second cluster includes the Arabs of the Levant (Palestinians, Jordanians,
Lebanese, Syrians), Egyptians, Iraqi Kurds, and Moroccan Jews. The third group consists of
Table 3. Most frequent HLA-A* and -B* alleles in Arab populations.
HLA-A A*01 A*02 A*03 A*24 A*30 A*68
Population % Population % Population % Population % Population % Population %
Tun-M 15.0 Sau-D 30.4 Ira-k 15.1 Leb-Ar 17.3 Sud 17.6 Sau 10.5
Mor 14.8 Ber-Z 29.3 Leb-Ar 14.0 Gha 15.2 Mor-C 13.0 Tun-M 09.4
Jor-A 14.7 Mor 26.2 Pal 10.7 Ira-k 13.9 Tun-A 11.8 Mor 09.3
Ira-k 13.2 sud 25.9 Lib 10.3 Sau-B 13.3 Jor 11.5 Alg-K 08.6
Pal 12.5 Emi 25.2 Mor-A 10.0 Jor-A 10.7 Alg-K 10.2 sud 08.5
Leb-A 12.2 Oma 24.9 Alg-K 09.3 Pal 10.1 Sau-B 10.2 Emi 08.4
Sau-A 12.2 Alg 24.6 Jor-A 09.1 Alg 09.4 Pal 08.4 Lib 08.2
Alg 11.9 Lib 23.5 Emi 09.1 Lib 09.3 Oma-A 07.5 Jor 07.6
Lib 11.5 Jor-A 22.0 Sau-A 08.9 Mor 07.3 Leb-A 06.7 Oma-A 07.1
Oma 07.2 pal 20.5 Gab 07.7 Oma 06.3 Lib 06.4 Leb-A 05.1
Sud 06.5 Leb-A 18.7 Sud 07.1 Sud 06.1 Emi 05.0 Ira-k 03.8
Emi 06.2 Ira-k 17.0 Oma 06.4 Emi 05.2 Ira-k 03.8 Pal 03.6
HLA-B B*07 B*08 B*35 B*44 B*50 B*51
Population % Population % Population % Population % Population % Population %
Jor 27.1 Oma 11.0 Pal 20.3 Ber-Z 32.8 Sau-D 18.8 Sau-C 19.3
Sau-A 11.7 sau-B 10.1 Leb-Ar 19.8 Ira-k 10.3 Lib 16.1 Oma 17.5
Mor 09.0 Emi 08.6 Ira-k 15.6 Mor-C 10.2 Ber-Z 15.7 Emi 15.6
Lib 07.7 Gha 08.5 Oma-A 15.3 pal 09.6 Tun-S 14.2 Ira-K 15.6
Tun-A 07.5 Ira-k 07.2 Jor-A 14.9 Alg 08.8 Mor-C 12.5 Gha 12.2
Alg-k 07.1 Lib 06.4 Emi 11.1 Leb-Ar 08.4 Emi 09.4 Leb-Ar 12.1
Leb-Ar 04.5 Mor-C 06.2 Alg 10.3 Lib 07.6 Jor-A 06.4 Lib 11.1
Ira-k 04.1 Jor 04.7 Lib 10.1 Jor-A 05.6 Pal 05.8 Jor-A 10.3
Oma-A 03.1 Sud 04.0 Tun-M 09.8 Sau-D 03.5 Leb-Ar 05.2 Sud 07.8
Sud 02.8 Alg 03.5 Sau 08.6 Sud 02.3 Alg 05.1 Mor 07.4
Emi 02.4 Leb-Ar 03.0 Mor-C 06.9 Emi 02.3 Oma-A 04.2 Pal 06.4
Pal 01.8 Pal 02.7 sud 06.1 Oma-A 02.1 Sud 02.5 Alg-k 04.7
Only one population per country is illustrated; the frequencies are ranked from highest to lowest for each allele; to identify the population and country see Table 1
https://doi.org/10.1371/journal.pone.0192269.t003
Table 4. Most frequent HLA-DRB1* and -DQB1* alleles in Arab populations.
HLA-DRB1* 03:01 07:01 11:01 13:01 13:02 15:01
Population % Population % Population % Population % Population % Population %
Tun-B 21.9 Gha 28.6 Leb 36.8 Sud 23.3 Mor-Me 11.1 Alg-B 13.4
Mor-Me 20.2 Jor 26.9 Bah 16.0 Sau-A 10.6 Lib 09.3 Mor-c 12.6
Sau-B 16.5 Sau-B 26.6 Egy-A 13.2 Ber-M 08.0 Sau-A 08.9 Ber-Z 11.4
Ora 15.1 Yem-J 22.1 Gab-A 11.2 Leb-B 06.8 Egy-A 07.4 Jor 09.0
Bah 13.9 Mor-Ag 20.5 Pal 10.0 Alg-B 05.6 Tun-C 06.7 Sau-A 08.9
Sud 13.8 Lib-Y 19.6 Ora 08.6 Lib 05.5 Leb-N 05.0 Bah 07.6
Lib 13.6 Lib 17.0 Sud 08.3 Yem-J 05.4 Ora 04.5 Leb 04.7
Yem-J 12.0 Alg-B 15.9 Jor 08.3 Egy-A 04.6 Yem-J 04.0 Lib 04.2
Leb-B 09.6 Pal 12.7 Lib 05.1 Mor-Me 03.5 Pal 03.9 Pal 03.6
Pal 07.6 Bah 09.0 Sau-A 04.8 Jor 02.1 Jor 00.3 Sud 03.3
Egy-A 07.0 Egy-A 08.3 Yem-J 03.4 Bah 02.1 Sud 00.0 Egy-A 02.5
Jor 02.4 Sud 07.8 Mor-C 02.5 Pal 00.9 Bah 00.0 Yem-J 02.0
HLA-DQB1* 02:0X 03:01 03:02 05:01 06:02 06:03
Population % Population % Population % Population % Population % Population %
Gha 40.1 Leb-NS 45.0 Gha 20.7 Bah 29.2 Mor-C 12.9 Egy-A 10.2
Yem-J 39.1 Ora 35.1 Jor 17.8 Ber-J 22.7 Alg 12.8 Jor 08.3
Mor-Ag 37.8 Lib-J 29.6 Pal 17.6 Leb 20.5 Egy-A 12.7 Ber-J 07.8
Sau-B 37.3 Ber-J 27.4 Leb 16.8 Alg 13.9 Tun-A 12.6 Lib-J 07.4
Jor 35.9 Pal 26.7 Yem-J 14.2 Mor-C 12.3 Jor 10.7 Yem-J 06.1
Lib-J 33.3 Yem-J 19.1 Lib-J 13.0 Pal 11.8 Sau-B 05.1 Ora 04.3
Bah 25.7 Bah 16.0 Alg 12.3 Sau-B 10.1 Pal 04.2 Sau-B 04.1
ora 24.5 Mor-C 15.4 Mor-C 12.3 Jor 09.3 Leb-Y 03.7 Leb-Y 03.3
Pal 20.9 Egy 11.9 Bah 09.7 Egy-A 08.5 Yem-J 02.0 Mor-C 01.8
Leb-Y 20.0 Jor 10.0 Sau-B 08.9 Yem-J 06.1 Lib-J 00.8 Pal 01.2
Only one population per country is illustrated; the frequencies are ranked from highest to lowest for each allele; to identify the population and country see Table 1
https://doi.org/10.1371/journal.pone.0192269.t004
Fig 3. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing relatedness between
Arabs and other populations using generic HLA-DRB1* and -DQB1* allele frequency data. Population data were
taken from references detailed in Tables 1 and 2. Bootstrap values from 1,000 replicates are shown.
https://doi.org/10.1371/journal.pone.0192269.g003
Bahrainis, Omanis, Emiratis and Famoori (Iranian Arabs). The fourth is composed of Sudanese,
Sudanese from Nuba, and Comorians.
Genetic distances. Table 5 illustrates standard genetic distances (SGD) between Arabs
and other populations, using generic DRB1* allele frequencies. North Africans and Iberians
are the closest to Saudis. Moroccans (Agadir, 0.0024), Basques-Ar (0.0057), and Tunisians-S.
Fig 4. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
the relationship between Arabs and other populations according to high-resolution HLA-DRB1* allele frequency
data. Only individuals with defined DRB1* subtypes are considered. Population data were taken from references
detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g004
Fig 5. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
a global view of the relationship among Arabs and other populations according to generic HLA-A* and -B* allele
frequency data. Population data were taken from references detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g005
Syrians are genetically close to Eastern Mediterranean populations, such as Cretans (-0.0001) and Lebanese
Armenians (0.0050), while Tunisians are close to Western Mediterranean populations, such as North Africans
and Iberians, and to Saudis. The populations most related to Tunisians are the other Tunisian
populations (Gabesians, -0.0139), Moroccans (Agadir; -0.0080), and Algerians (-0.0055).
Sub-Saharans such as Congolese (0.0519) and Nigerians (0.0828), together with Greeks (0.0836), showed the
closest genetic distances to Comorians. It is noteworthy that the Arab minority in Khuzestan
(Iran) displayed close relatedness with North Africans [such as Gabesians from Tunisia (-0.0086)
and Orans from Algeria (-0.0074)], and with Saudis (0.0231).
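To make the distance values above easier to interpret, the following minimal sketch computes Nei's (1972) standard genetic distance for a single locus from two sets of allele frequencies. It assumes that "SGD" refers to Nei's standard genetic distance, which this section does not state explicitly. The function name `nei_standard_distance` and the frequencies are hypothetical and are not taken from the tables, and pooling all remaining alleles into one "other" class is a simplification made only for illustration.

```python
import math


def nei_standard_distance(freq_x, freq_y):
    """Nei's (1972) standard genetic distance for one locus.

    `freq_x` and `freq_y` map allele names to frequencies (each summing to 1).
    """
    alleles = set(freq_x) | set(freq_y)
    j_x = sum(freq_x.get(a, 0.0) ** 2 for a in alleles)   # identity within X
    j_y = sum(freq_y.get(a, 0.0) ** 2 for a in alleles)   # identity within Y
    j_xy = sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    if j_xy == 0.0:
        return math.inf  # no shared alleles: normalized identity is zero
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Hypothetical generic DRB1 frequencies for two populations (illustrative only).
pop_a = {"DRB1*03": 0.20, "DRB1*07": 0.25, "DRB1*11": 0.15, "other": 0.40}
pop_b = {"DRB1*03": 0.10, "DRB1*07": 0.15, "DRB1*11": 0.35, "other": 0.40}
d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
```

This uncorrected form cannot be negative, so the small negative entries in Table 5 would be consistent with a sample-size-corrected estimator having been used; that is an inference rather than something stated in the text.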
HLA Class I and Class II haplotypes
HLA-A-B haplotypes. HLA A-B haplotypic data are extremely rare in Arabs. The most
frequent A-B haplotypes in Arabs are shown in Table 6. A*02:01-B*50:01 (9.0%) and
A*02:01-B*44:02/03 (7.5%) were the haplotypes with the highest frequencies in Berbers of
Zrawa. A-B haplotype frequencies vary among Arabs but, outside the Berbers of Zrawa, are
comparably low, not exceeding 5.3% (Gabesians, Tunisia). For example, while A*34:02-B*08:01
and A*29:01-B*45:01 characterize Tunisians, A*01-B*57 (2.9%), A*30-B*18 (1.50%), and
A*33:01-B*14:01 (2.50%) characterize Algerians. Several haplotypes identified in Arabs were also seen in
other Mediterraneans. For example, A*32:01-B*40:02 was seen in Greeks (2%) [39] and
Spaniards (0.5%) [41], while A*02:01-B*50:01 was seen in Italians (2%) [68], Portuguese
(3%) [39], and Moroccan Jews (3%) [66]. A*24:02-B*08:01 (4.75%) and A*30:02-B*53:01
(3.48%) were only identified in Saudis.
HLA-DRB1-DQB1 haplotypes. The most frequent DRB1-DQB1 haplotypes with significant
LD in Arabs are listed in Table 7. In general, class II haplotype frequencies are markedly
higher than those of class I haplotypes. DRB1*03:01-DQB1*02:01 was the most frequent
DRB1-DQB1 haplotype in Arabs (Table 7), with a frequency ranging from 3.2% in Lebanese
to 16.60% in Tunisians. DRB1*03:01-DQB1*02:01 is a common class II haplotype in the
Fig 6. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
the relationship between different Arab populations according to generic HLA-DRB1* allele frequency data.
Population data were taken from references detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g006
Table 5. The closest populations to Arabs using standard genetic distances (SGD) based on HLA-DRB1* alleles.
Saudis-B Emiratis Omanis-A Sudanese
Population SGD Population SGD Population SGD Population SGD
Moroccans-Ag 0.0024 Omanis-A 0.0411 Emirates 0.0411 Nigerians 0.0497
Basques-Ar 0.0057 Bahrain 0.0429 Sardinians 0.0939 Egyptians-A 0.0556
Tunisians-S 0.0124 Sardinians 0.0593 Bahrain 0.1327 Congolese 0.0594
Saudis-C 0.0160 Kuwaitis 0.0688 Kuwait 0.2014 Egyptians 0.0620
Ghanouchians 0.0203 Tunisians-B 0.1169 Famoori Arabs 0.2377 Mandenka 0.0908
Saudis 0.0258 Khuzestanis 0.1213 Macedonians 0.2461 Moroccans 0.0984
Tunisians 0.0272 Tunisians-A 0.1276 Tunisians-B 0.3071 Senegalese 0.1044
Kuwaitis-A 0.0312 Algerians-Oran 0.1371 Khuzestanis 0.3192 Bubi 0.1078
Khuzestanis 0.0349 Algerians-A 0.1407 Greeks-B 0.3197 Palestinians-A 0.1111
Spaniards 0.0354 Algerians-B 0.1612 Tunisians-A 0.3261 Pakistanis-S 0.1122
Saudis-D 0.0374 Algiers 0.1639 Kuwaitis-A 0.3544 Tunisians-A 0.1133
Gabesians 0.0377 Saudis-C 0.1746 Algerians-Oran 0.3600 Libyans 0.1197
Gabesians-A 0.0394 Macedonians 0.1756 Algerians-A 0.3639 Sudanese-Nuba 0.1234
Jordanians 0.0428 Gabesians 0.1820 Greeks-D 0.3657 Algerians-B 0.1315
Algerians-B 0.0433 Saudis-D 0.1820 Algerians-B 0.3867 Berbers-Matmata 0.1317
Basques-B 0.0449 Moroccans-Agadir 0.1830 Greeks-C 0.3927 Algerians-A 0.1407
Saudis-A 0.0450 Kuwaitis-A 0.1837 Turks 0.3944 Berbers-Zrawa 0.1409
Algerians-A 0.0497 Famoori Arabs 0.1894 Saudis-C 0.3984 Gabesians 0.1413
Tunisians-C 0.0533 Moroccans-A 0.1900 Algiers 0.4027 Jordanians-A 0.1434
Yemenite-J 0.0536 Gabesians-A 0.1908 Albanians 0.4034 Gabesians-A 0.1442
Khuzestanis Tunisians Syrians-A Comorians
Population SGD Population SGD Population SGD Population SGD
Gabesians -0.0086 Gabesians -0.0139 Cretans -0.0001 Congolese 0.0519
Orans -0.0074 Gabesians-A -0.0081 Lebanese-Ar 0.0050 Nigerians 0.0828
Gabesians-A -0.0025 Moroccans-Agadir -0.0080 Syrians 0.0076 Greeks-A 0.0836
Algerians-A -0.0015 Southern Tunisians -0.0062 Iranians-Kurd 0.0100 Gabonese 0.0904
Moroccans-Ag 0.0106 Algerians-A -0.0055 Lebanese-A 0.0149 Iranians-A 0.0947
Tunisians-S 0.0140 Moroccans-A 0.0010 Lebanese-Y 0.0151 Egyptians-A 0.1090
Tunisians 0.0161 Algerians-B 0.0019 Iranians 0.0159 Iranians 0.1184
Tunisians-C 0.0195 Berbers-Zrawa 0.0027 Lebanese-B 0.0161 Italians 0.1222
Yemenite-J 0.0217 Libyans 0.0028 Iranians-Azeri 0.0185 Iranians-Azeri 0.1394
Tunisians-M 0.0225 Algerians-Oran 0.0033 Turks 0.0192 Iranians-Kurd 0.1418
Saudis-C 0.0231 Tunisians-M 0.0038 Iraq kurdistan 0.0198 Albanians 0.1426
Spaniards 0.0291 Saudis-C 0.0061 Ashkenazi-Jews 0.0222 Turks 0.1428
Saudis 0.0324 Tunisians-C 0.0083 Iranians-A 0.0223 Syrians 0.1470
Saudis-B 0.0349 Algiers 0.0103 Palestinians-A 0.0228 Cretans 0.1483
Algerians-B 0.0353 Berbers-Matmata 0.0106 Italians 0.0241 Egyptians 0.1483
Tunisians-B 0.0422 Moroccans-Chaouya 0.0111 Turks-A 0.0288 Greeks-C 0.1487
Indians-Delhi 0.0454 Spaniards 0.0126 Lebanese 0.0320 Palestinians-A 0.1559
Algiers 0.0461 Moroccans 0.0144 Jordanians-A 0.0355 Iraq Kurdistan 0.1564
Basques-Ar 0.0471 Saudis-D 0.0159 Lebanese-KZ 0.0368 Greeks-D 0.1594
Libyans 0.0485 Khuzestani Arabs 0.0161 Greeks-A 0.0407 Syrians-A 0.1617
(0.0124) had the closest genetic distances to Saudis, while Emiratis were closely related to Omanis (0.0411), Bahrainis (0.0429), Sardinians (0.0593), and Kuwaitis
(0.0688). On the other hand, Sudanese are related to Sub-Saharans, including Nigerians (0.0497) and Congolese (0.0594), and to Egyptians (0.0556).
https://doi.org/10.1371/journal.pone.0192269.t005
Mediterranean basin, and is frequent among Basques (17.5%) [41], Moroccans (17.3%) [25],
Algerians (11.3%) [67], and Cretans (7.4%) [69]. DRB1*07:01-DQB1*02:02 is also
frequent in Arabs, such as Moroccans (16.70%), and is reportedly common in Spaniards
(17.3%) [41] and Moroccans (12.6%) [25], but rare in Southern Tunisians (Gabesians; 2.10%).
In addition, DRB1*07:01-DQB1*02:01 is a common DRB1-DQB1 haplotype, and its
frequency exceeds 4% in several Arab populations.
Table 6. Most frequent (%) HLA Class I (A-B) two-locus haplotypes with significant linkage disequilibrium (P<0.05) in Arabs.
A-B haplotype Tun Saudi-B Alg Mor-Ch Mor-a Ber-Z Lib Gab
01:01–50:01 - - - 04.10 - - - -
01–57 - - 02.90 - - - - -
02:01–07:02 - - - - - - 02.97 -
02:01–44:02/03 03.86 - - 02.10a 02.95c 07.50b - 05.26
02:01–50:01 03.30 - - - 01.99d 09.01 - -
02:01–51:01 - 04.66 - 03.40 01.62f - - -
23:01–50:01 - 04.90 - - - - 02.97 -
24:02–08:01 - 04.75 - - - - - -
29:01–45:01 01.79 - - - - - - 02.10
29:02–44:03 - - - 02.70 - - - -
30–18 - - 01.50 - 02.60 03.00 - -
30:02–53:01 - 03.48 - - - - - -
32:01–40:02 00.80 - - - - 05.66 - -
33:01–14:01 - - 02.50 - 01.86e 01.41 - -
34:02–08:01 02.12 - - - - 06.11 - 02.10
a02:01–44:02.
b02:01–44.
c02-44.
d02-50.
e33-14.
f02-51.
https://doi.org/10.1371/journal.pone.0192269.t006
Table 7. Most frequent (%) HLA Class II (DRB1-DQB1) two-locus haplotypes with significant linkage disequilibrium (P<0.05) in Arabs.
HLA-DRB1-DQB1 Tun Sau-B Mor-Ch Bah Leb Alg Lib-J Yem-J Ber-Z Ber-J
01:02–05:01 02.40 02.85 - - - 08.00 02.10 0.70 09.85 04.50
07:01–02:02 14.80 12.32 16.70 - - - 24.70a 22.10a 16.03 -
03:01–02:01 16.60 13.56 12.30 12.02 03.21 11.30 05.60a 12.00a 11.26 -
10:01–05:01 03.80 03.80 - 01.35 04.90 00.30 00.80 04.00 01.41 03.30
07:01–02:01 - - - 09.38 04.20 09.90 - - - 11.00
15:01–06:02 07.80 03.80 08.90 - - 09.90 - - 11.26 02.00
04:02–03:02 02.60 - 06.20 - - 04.20 03.00 07.50 05.15 -
13:01–06:03 02.40 - - - - 03.30 07.70 05.40 05.63 01.80
16:01–05:01 - - - 13.18 03.79 - - - - -
04:01–03:02 - - - 02.78 14.16 - - - - -
11:01–03:01 07.20b 02.22 - 11.98 31.42 04.70 09.30 03.40 07.00b 03.20
aDQB1*02
b11:01/04-03:01
https://doi.org/10.1371/journal.pone.0192269.t007
In addition, DRB1*16:01-DQB1*05:01 and DRB1*04:01-DQB1*03:02, rare in neighboring
populations and Mediterraneans, were identified only in Lebanese and Bahraini Arabs. The
high frequency of the DRB1*11:01-DQB1*03:01 haplotype (31.42%) among Lebanese is noteworthy,
since it is the highest in all populations studied, whereas the haplotype is rare in Saudis (2.2%). Furthermore,
DRB1*11:01/04-DQB1*03:01, identified in Arabs, is also frequent in Cretans (18.5%) [69] and
Basques (3.1%) [41], while DRB1*01:02-DQB1*05:01 was seen in Spaniards (6.30%) [41]. Varied
frequencies of DRB1*13:01-DQB1*06:03 were also reported for Spaniards (13.23%) [86], Cretans
(3.3%) [69], and Germans (10.8%) [87]. Likewise, DRB1*15:01-DQB1*06:02 was observed
in Cretans (2.6%) [65], Germans (25.2%) [87], and in Southern Ireland (14.90%) [23].
HLA class I and class II extended haplotypes. Table 8 shows the most frequent extended
haplotypes in Arab populations, and their likely origins. The systematic review did not reveal
haplotypes shared by Arab populations because of partial presentation of haplotypic data, dis-
parity in the level of typing resolution, variability of the studied loci, and lack of data. In addi-
tion, Arab populations share their frequent extended haplotypes with several European,
especially Mediterranean, and Asian populations (Table 8). Furthermore, the possible origins
of the most frequent extended haplotypes among Arabs are mainly European, Asian or
Autochthonous.
Table 8. The most frequent (%) HLA extended haplotypes in Arabs.
HLA extended haplotypes Arab populations [references] Possible origin
A*02:01-B*50:01-DRB1*07:01-DQB1*02:02 (a) Southern Tunisians (3.2%) [62], Berbers of Zrawa (8.12%) [24] Euro-Asiatic
A*02:01-B*44-DRB1*04:02-DQB1*03:02 (b) Berbers of Zrawa (6.5%) [24], Tunisians (0.6%) [61] Western European
A*24:02-B*08:01-C*07:02-DRB1*03:01 (c) Saudis (3.16%) [49] Euro-Asiatic
A*23:01-B*50:01-C*06:02-DRB1*07:01 Saudis (3.16%) [49] Autochthonous
A*33-C*8-B*14-DRB1*01:02-DQA1*01:01-DQB1*05:01 (d) Algerians (1.5%) [88] Mediterranean
A*30-C*5-B*18-DRB1*03:01-DQA1*05:01-DQB1*02:01 (e) Algerians (1.5%) [88] Iberian-paleo-North African
A*02:01-C*06:02-B*50:01-DRB1*07:01-DQA1*02:01-DQB1*02:02 (f) Moroccans (2.9%) [65] Euro-Asiatic
A*01:01-C*06:02-B*50:01-DRB1*03:01-DQA1*05:01-DQB1*02:01 (g) Moroccans (2.9%) [65] Mediterranean
A*30-B*07-DRB1*03-DQA1*05:01-DQB1*02:01 (h) Jordanians (1.38%) [31] Euro-Asiatic
A*1-B*8-DRB1*03-DQA1*05:01-DQB1*02:01 (i) Jordanians (1.03%) [31] Pan-European
A*02:01-B*50:01-DRB1*07:01 (j) Libyans (4.24%) [32], Tunisians (1.8%) [60], and Ghannouch (2.5%) [33] North African
A*11:01-B*52:01-DRB1*15:02 (k) Libyans (2.54%) [32]; Yemen Jews (0.93%) [23] Mediterranean
A*69-B*49-DRB1*04:03-DQB1*03:02 Palestinians (2.4%) [29] Autochthonous
A*24-B*18-DRB1*11:04-DQB1*03:01 (l) Palestinians (1.8%) [29] Central-South-Eurasian
(a) present in Spaniards (1.2%) [41], Turks (1.3%) [79], Italians (0.5%) [68], and Moroccan Jews (2%) [66].
(b) also found in British (2.6%), Cornish (7.9%), Danes (2%) [39], Italians (0.9%) [68], Spaniards (0.6%) [41], Spanish Basques (1.9%), Pasiegos (3.3%), Cabuernigos (2.2%) [77], and Portuguese (3.1%) [39].
(c) present at low frequencies in the Euro-Asian minorities of Germany [23].
(d) found in Armenians (0.031), Sardinians (0.027), French (0.014), Greeks (0.011), and Italians (0.007) [68].
(e) also found in Sardinians (11.4%), and French-Basques (4.7%) [68].
(f) present also in Mongolians [68], Turks [79].
(g) found in Spaniards, Italians, and North Africans [65].
(h) present in Cornish (0.084), British (3.3%), and Danes (3.8%) [68].
(i) present in Basques (5%), Spaniards (3.4%) [41], Macedonians (4.9%) [78], Yugoslavians (7.7%), British (2.9%), and Germans (4.8%) [68].
(j) found in Poland Jews (1.15%); Ashkenazi Jews (0.92%) [23].
(k) present in Ashkenazi Jews (1.05%) [23].
(l) found in Armenians (2.1%) and Italians (0.7%) [23].
https://doi.org/10.1371/journal.pone.0192269.t008
http://www.allelefrequencies.net/pop6001c.asp?pop_id=3374
http://www.allelefrequencies.net/pop6001c.asp?pop_id=3388
Discussion
This meta-analysis is the first genetic anthropology study in the MENA region; it included 100
populations from 36 Arab and neighbouring countries, comprising in excess of 16,000
individuals. A main outcome of the study is the lack of striking differences in the distribution
of HLA alleles and haplotypes between North Africans and Arabian Peninsula populations. In
contrast, key differences were noted between Levant Arabs (Lebanese, Palestinians, Syrians)
and other Arab populations, highlighted by high frequencies of A*24, B*35, DRB1*11:01,
DQB1*03:01, and the DRB1*11:01-DQB1*03:01 haplotype in Levantine Arabs compared to other
Arab populations. Class I haplotype frequencies are lower than those of Class II haplotypes because
LD between the A and B loci is weaker, owing to the longer physical distance between them, than
LD between the DRB1 and DQB1 loci. The identification of shared haplotypes between Arabs and other
Mediterranean and Asian populations is attributed to the higher admixture of Mediterraneans and
Asians in Arab populations.
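The link between LD and haplotype frequency made above can be illustrated with a small calculation. The sketch below computes the classical disequilibrium coefficient D (observed haplotype frequency minus the frequency expected under random association) and its normalized form D'. The function name `linkage_disequilibrium` and the input frequencies are hypothetical; they are not taken from Tables 6 or 7, and this section does not say how LD was estimated in the source studies.

```python
def linkage_disequilibrium(hap_freq, p_a, p_b):
    """Return (D, D') for allele A at one locus and allele B at another.

    hap_freq: observed frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B
    """
    d = hap_freq - p_a * p_b  # deviation from random association
    # D' scales D by the largest magnitude possible given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0  # D' keeps the sign of D
    return d, d_prime


# Hypothetical frequencies (illustrative only).
d, d_prime = linkage_disequilibrium(hap_freq=0.10, p_a=0.20, p_b=0.25)
```

With these made-up inputs the haplotype is twice as frequent as expected under independence (0.10 versus 0.05), which is the kind of excess that strong LD between closely spaced loci produces.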
Iberians, North Africans, and Arabian Peninsula inhabitants
The relatedness between North Africans and Iberians was previously discussed [29, 59–62, 69,
78, 79, 86, 88]. Using correspondence analysis, NJ trees and genetic distances, our results show
that North Africans are genetically close to Iberians, which is supported by historical events.
First, this relatedness is attributed to the Berber migration from the African Sahara northwards
in 10000–4000 BC, because of hyper-arid conditions [69]. It may also be explained by the similar
history of Iberians and North Africans, both of whom were invaded by Phoenicians,
Romans, Germans, and Muslim Arabs [89]. The respective invading armies were genetically
mixed; indeed, most of their soldiers were mercenaries recruited in recent conquests, as in the
case of the Phoenicians [90], and the Muslims who invaded Iberia had troops that were mostly Berbers.
The invasion of Iberia by Muslims in the 8th century AD may have had a role in the relatedness
between North Africans and Iberians for two reasons: first, most of the Muslim invaders' recruits
were North African Berbers; second, the Muslims remained settled in Iberia for eight centuries.
However, more ancient and continuous gene exchange between Iberia and North Africa since
prehistoric times may have accounted for most of the exchange
[86]; massive mixed marriages and breeding across Iberian religious groups under Muslim
rule are not documented.
The analyses performed showed that current North Africans are closely related to Tunisian
(Zrawa and Matmata) and Moroccan (Sousse-Agadir and Eljadida) Berbers, suggesting that
North Africans have a genetic Berber profile. On the contrary, North Africans displayed a
greater distance from the Arabs of Levant (Palestinians, Syrians, Lebanese, and Jordanians),
indicating low genetic contribution of Phoenician and Levant Arab invasion of North Africa.
These observations based on HLA markers prompted the conclusion that all Berbers of North
Africa constitute a homogeneous genetic unit, except for small isolates, such as the Berbers of
Djerba, who display a Berber genetic profile.
Saudi populations used in this study originated from Eastern Saudi Arabia, especially from
Riyadh province. There are no reliable HLA data on Eastern Saudi Arabia that shed light on
pre-Islamic history; some ancient people may have originated from old Persians, but quantification
is difficult and remains undetermined [91]. Genetic heterogeneity between Eastern and Western
Saudi Arabia is quite possible, and should be taken into account in further interpretation. All
analyses performed here, using HLA-A, -B, -DRB1, and -DQB1 markers, support the notion that
Saudis, along with Kuwaitis and Yemenis, are closely related to North Africans.
The most plausible explanation for the clustering of West Arabia and Yemen with Iberians/North
Africans is a possible large-scale migration that occurred when the Sahara underwent
desiccation in all directions [92, 93]. The cultural and linguistic relatedness of many Mediterranean
languages, including old Iberian and Basque [92], to the Berber language is concordant with our
genetic findings and the Saharan origin hypothesis; part of the Arabian Peninsula's inhabitants
(including those of Yemen) may also have been reached by Saharan people. In fact, Malika Hachid, who has
been studying Saharan and North African archaeology, culture and the rock painting/writing of
prehistoric Sahara, even suggests that the first known writing alphabet originated in the Sahara.
Proto-Berber rock-writing characters, very similar to present-day Berber scripts, were in use.
This Proto-Berber language could have appeared 5,000 years BC [94, 95].
The HLA genetic similarity of Kuwaitis to this group is more difficult to explain,
but interaction between the Arabian Peninsula and Mesopotamia through the strategic Kuwait
area has been documented since 6,500 years BC (Ubaid Period) [96].
Arabs of Levant
Using genetic distances, correspondence analysis and NJ trees, we showed earlier [61, 62] and
in this study that Palestinians, Syrians, Lebanese and Jordanians are closely related to each
other and to Eastern Mediterranean Europeans (Turks, Cretans, Greeks), Egyptians and Iranians,
as confirmed by analysis of HLA class I (A, B) and class II (DRB1 and DQB1) markers.
However, Levant Arabs are distant from North African Arabs (Tunisians, Algerians, Moroccans
and Libyans) and Iberians (Basques, Spaniards). The strong relatedness between Levant
Arab populations is explained by their common ancestry, the ancient Canaanites, who came
either from Africa or from the Arabian Peninsula via Egypt in 3300 BC [97], and settled in the Levant
lowlands after the collapse of the Ghassulian civilization in 3800–3350 BC [98]. The relatedness is also
attributed to close geographical proximity: the region constituted one territory before 19th-century
British and French colonization.
The close relatedness of Levant Arabs to Egyptians, as confirmed by genetic distances using
HLA markers, may be due to three reasons. First, Egypt is a neighbor of the Levant Arab countries,
and was historically part of the Levant. Second, the Egyptians invaded the Levant several times
throughout history; the most significant invasion was that of 1468 BC, after which they settled for 12 centuries
[99]. Third, the Canaanites, the likely ancestors of Levant Arabs, may have originated
from Africa through Egypt, where they settled for a long period, suggesting likely admixture
between Canaanites and Egyptians.
Historically, the Levant is a wider region that included countries along the Eastern Mediterranean
with its islands, and extended from Greece to Cyrenaica [100]. Broadly, the Levant was historically
characterized by high migratory flow between its sub-regions in all directions. For
example, the present-day Levant, comprising Palestine, Lebanon, Syria, and Jordan, has undergone
successive invasions by populations originating from the Great Levant, including Egyptians
(1468 BC), Horites, Amorites, Hittites (Turks), Greeks (1200 BC), Assyrians (1090 BC) [99],
and more recently the Ottomans. This has favored admixture, reduced genetic distances and homogenized
Great Levant populations, thus explaining the close relatedness of Levant Arabs to Eastern
Mediterranean populations. On the other hand, Levant Arabs are distant from Saudis,
Kuwaitis, and Yemenis, an indication that the contribution of Arabian Peninsula populations
to the Levantine gene pool is low, probably because the 7th-century invasion lacked a
demographic component.
Sudanese and Comorians
Sudanese are close to sub-Saharan Africans (Nigerians, Congolese, and Senegalese) and to North
Africans, in particular Egyptians, suggesting that the genetic profile of Sudanese reflects
admixture between North Africans (especially Egyptians) and sub-Saharan Africans throughout
history. The close relatedness of Sudanese to sub-Saharan Africans suggests a reduced genetic
effect of Arabs on Sudanese. Also, the Comorians (the Comoros Islands officially joined the League of
Arab Countries in 1993) are close to sub-Saharan Africans (Congolese, Nigerians, and Gabonese)
[43], Egyptians, Iranians, and Eastern Mediterranean populations. This suggests high admixture
between populations belonging to three continents in the Comoro Islands, which can be
explained by their geographical position as a corridor for international trade.
Bahrainis, Emiratis, and Omanis
Bahrainis, Emiratis, and Omanis are geographically close populations, which explains their
genetic relationship as demonstrated in this study. These three populations tend to form a heterogeneous
group with Pakistanis, Indians, Iranian Arabs (Famoori), Sardinians (the latter
probably close to Iberians/North Africans but behaving as an outlier group in analyses because
they are a genetic island isolate), Egyptians, and some sub-Saharan Africans, such as Congolese.
These populations appear close to certain Eastern Mediterranean populations, including
Greeks and Macedonians, and to those further away, in particular North Africans, hence explaining their
intermediate grouping and distinction from the two main clusters. Collectively, this suggests high
admixture in these populations brought about by their commercially important position. Sardinia
is a relative genetic isolate “founded” by the Iberian Norax/Nora (first documented Sardinian
capital, close to Cagliari), and Iberians/North Africans may be genetically related to
Sardinians (the A*30-B*18-Cw*5 basic HLA haplotype is very frequent in Sardinia, Iberia, and North
Africa) [93].
Minorities of Arab World
Ethnic minorities. The Kurds and Berbers are the two major ethnic minorities in the Arab
world. Berbers are an indigenous North African ethnic group found over a vast area stretching
from the Atlantic Ocean to the Siwa Oasis in Egypt, and from the Mediterranean Sea to the Niger River.
Berbers number about 20 million people, and constitute 40–45% of Moroccans, 20–25% of Algerians,
and 2–7% of both Libyans and Tunisians. The Kurds live in the northern regions of Iraq (15–20%)
and Syria (10%). They constitute an Indo-European ethnic group, and speak Kurdish. Smaller
minorities include Armenians, Nubians, Assyrians, and Turkmen [99].
The Berber populations used in this work are closely related to each other, as well as to
present-day North Africans and to Western Mediterranean populations, especially Iberians. Indeed,
the Moroccan Berbers are not genetically different from present-day Moroccans, nor from
neighboring populations such as Algerians and Tunisians. This also applies to Tunisian Berbers,
except those of the island of Djerba, who appear to be related to Eastern Mediterranean
populations, including Levant Arabs. This suggests that North African Berbers are genetically
similar to their surrounding populations, and that the differences between them are cultural
rather than genetic, resulting from the 7th century Arabization of the region.
Clustering and genetic distance analyses demonstrated that Iraqi and Iranian Kurds are not
genetically different from Iranians or neighboring populations, including Levant Arabs, and are
close to Turks and other Eastern Mediterranean populations. This suggests that Kurds originate
from the region and are genetically similar to neighboring populations, despite the clear
cultural differences. It also indicates that Kurds, Syrians, Jordanians, Palestinians, Iraqis,
Lebanese, and Iranians probably share the same genetic profile, with few differences. Accordingly,
our findings confirm the results of an earlier study by Arnaiz-Villena et al. on Iraqi Kurds [54].
Religious minorities. Sunni Muslims constitute the majority (80%) of Arab populations,
followed by Shi’a Muslims (10%) who are present in parts of Iraq, Lebanon, Saudi Arabia,
Kuwait, Yemen, and Bahrain. Non-Muslims make up about 10% of all Arabs, and Christianity
(6%) is the second largest religion among Arabs, with about 20 million Christians living in
Lebanon, Egypt, Iraq, Syria, and Jordan. Other minority religions (4%), such as Judaism and the
Druze faith, are practiced on a much smaller scale [99].
HLA data on Sunni and Shiite Arabs are not available, nor are data comparing Muslims with
Christians. The only available data are those concerning Arab Jews. In this study, data are
available for three Jewish populations, including two from North Africa (Moroccan and Libyan
Jews) and one from the Arabian Peninsula (Yemenite Jews). While the genetic distances separating
these three groups of Jews are small (S1 Table), genetic heterogeneity between these
Jewish populations was noted. For example, Yemenite Jews are related to Western Mediterranean
populations, including North Africans and Iberians, while Libyan Jews are related to
Eastern Mediterraneans, including Levantine Arabs. The relatedness of Moroccan Jews
to other communities depends on the studied HLA loci; they associate with Eastern Mediterraneans
using DRB1, but group with Eastern Mediterraneans when the other markers are used.
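The genetic distances referred to above are computed from allele frequencies. As an illustration only, the following minimal Python sketch assumes that the "standard genetic distance" is Nei's 1972 measure [20], D = -ln(I), where I is the normalized identity J_XY / sqrt(J_X * J_Y) averaged over loci. The function name, the data layout (one dictionary of allele frequencies per locus), and the frequencies themselves are hypothetical and are not taken from this study.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele name to
    frequency. Both lists must describe the same loci in the same order.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("populations must share the same non-empty set of loci")

    j_x = j_y = j_xy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        alleles = set(freqs_x) | set(freqs_y)
        # Probability that two genes drawn at random are identical,
        # within each population and between the two populations.
        j_x += sum(freqs_x.get(a, 0.0) ** 2 for a in alleles)
        j_y += sum(freqs_y.get(a, 0.0) ** 2 for a in alleles)
        j_xy += sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)

    n_loci = len(pop_x)
    j_x, j_y, j_xy = j_x / n_loci, j_y / n_loci, j_xy / n_loci

    if j_xy == 0.0:
        # No shared alleles at any locus: the distance is unbounded.
        return math.inf
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Hypothetical single-locus frequencies, for illustration only.
pop_a = [{"DRB1*03": 0.5, "DRB1*07": 0.5}]
pop_b = [{"DRB1*07": 1.0}]
print(nei_standard_distance(pop_a, pop_b))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining procedure [19] takes when building dendrograms.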
Conclusion
This study supports the notion that Arabs are divided into four groups. The first consists
of North Africans (Algerians, Tunisians, Moroccans, and Libyans), Saudis, Kuwaitis, and
Yemenis, who show relatedness to Western Mediterraneans, including Iberians. The second
includes Levantine Arabs (Palestinians, Jordanians, Lebanese, and Syrians), Iraqis, and
Egyptians, who appear to be related to Eastern Mediterraneans and Iranians, who in
turn belonged to the historically described ‘Great Levant’. The third consists of Sudanese and
Comorians, who associate with sub-Saharan Africans. Finally, the fourth group of Arabs
comprises Omanis, Emiratis, and Bahrainis, and associates with heterogeneous
populations (Mediterranean, Asian, and sub-Saharan). Lastly, the two main indigenous
minorities, Berbers and Kurds, are not genetically different from the ‘host’ and neighboring
populations.
Supporting information
S1 Checklist. PRISMA 2009 checklist.
(DOC)
S1 Fig. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing
relatedness between Arabs and other populations using generic HLA-DRB1* allele frequency
data. Population data were taken from the references detailed in Tables 1 and 2.
Bootstrap values from 1,000 replicates are shown.
(TIF)
S2 Fig. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing
relatedness between Arabs and other populations using generic HLA-B* allele frequency
data. Population data were taken from the references detailed in Tables 1 and 2. Bootstrap
values from 1,000 replicates are shown.
(TIF)
S1 Table. Genetic distances between three groups of Arab Jews based on HLA-DRB1 and
-DQB1 allele frequencies.
(DOC)
Author Contributions
Conceptualization: Abdelhafidh Hajjej, Lasmar Hattab, Slama Hmida.
Formal analysis: Abdelhafidh Hajjej, Slama Hmida.
Investigation: Abdelhafidh Hajjej.
Methodology: Wassim Y. Almawi, Slama Hmida.
Software: Abdelhafidh Hajjej, Lasmar Hattab.
Supervision: Slama Hmida.
Validation: Abdelhafidh Hajjej, Wassim Y. Almawi, Antonio Arnaiz-Villena, Lasmar Hattab,
Slama Hmida.
Writing – original draft: Abdelhafidh Hajjej.
Writing – review & editing: Wassim Y. Almawi, Antonio Arnaiz-Villena.
References
1. HLA allele database: http://hla.alleles.org (last accessed on September 17, 2017)
2. Hudson RR. Analysis of population subdivision. In: Balding DJ, Bishop M, Cannings C (Eds).
Handbook of Statistical Genetics. pp. 309–324. John Wiley & Sons, Chichester, UK, 2001
3. Takezaki N, Nei M. Empirical tests of the reliability of phylogenetic trees constructed with microsatellite
DNA. Genetics. 2008; 178(1): 385–92. https://doi.org/10.1534/genetics.107.081505 PMID: 18202381
4. Nei M. Phylogenetic analysis in molecular evolutionary genetics. Annual Review of Genetics. 1996;
30: 371–403. https://doi.org/10.1146/annurev.genet.30.1.371 PMID: 8982459
5. Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-join-
ing method. Proceedings of the National Academy of Sciences USA. 2004; 101(30): 11030–5. https://
doi.org/10.1073/pnas.0404206101 PMID: 15258291
6. Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid
population. Molecular Biology and Evolution. 1995; 12(5): 921–7. https://doi.org/10.1093/
oxfordjournals.molbev.a040269 PMID: 7476138
7. The World Factbook: https://www.cia.gov/library/publications/the-world-factbook
8. Bengio O, Ben-Dor G. Minorities and the State in the Arab World. Lynne Rienner Publishers, 1999.
224 pages
9. Encyclopædia Britannica, Himyar: https://www.britannica.com/topic/Himyar
10. Korotayev A. Ancient Yemen. Oxford: Oxford University Press, 1995.
11. Korotayev A. Pre-Islamic Yemen. Wiesbaden: Harrassowitz Verlag, 1996.
12. Munro-Hay SC. Aksum: An African Civilization of Late Antiquity. Edinburgh: Edinburgh
University Press, 1991.
13. Robin CJ. Arabia and Ethiopia. In: Johnson S (ed.) The Oxford Handbook of Late Antiquity, Oxford
University Press 2012 pp. 247–333, p.279.
14. Hoyland R. Arabia and the Arabs: From the Bronze Age to the Coming of Islam, Routledge, 2001,
p.51.
15. Wikipedia, History of Islam: https://en.wikipedia.org/wiki/History_of_Islam
16. Hourani A. A History of the Arab Peoples. Harvard University Press 2002; pp. 15–19. ISBN
9780674010178.
17. Moher D, Liberati A, Tetzlaff J, Altman DG, and PRISMA Group, “Reprint—preferred reporting
items for systematic reviews and meta-analyses: the PRISMA statement”. Physical Therapy. 2009;
89(9): 873–80. https://doi.org/10.1093/ptj/89.9.873 PMID: 19723669
18. Young FW, Bann CM. A visual statistics system. In Stine RA, Fox J, eds. Statistical computing
environments for social research. New York: Sage publications. 1996; 207–36.
19. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees.
Molecular Biology and Evolution. 1987; 4(4): 406–425. https://doi.org/10.1093/oxfordjournals.molbev.
a040454 PMID: 3447015
20. Nei M. Genetic distances between populations. The American Naturalist. 1972; 106:283. http://jstor.
org/stable/2459777
21. Nei M. Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of
Sciences USA. 1973; 70(12): 3321–3. PMID: 4519626.
22. Nei M, Tajima F, Tateno Y. Accuracy of estimated phylogenetic trees from molecular data. II. Gene
frequency data. Journal of Molecular Evolution. 1983; 19(2): 153–70.
https://doi.org/10.1007/BF02300753 PMID: 6571220
23. Database of allele frequencies: http://www.allelefrequencies.net, 2017
24. Hajjej A, Sellami MH, Kaabi H, Hajjej G, El-Gaaied A, Boukef K, et al. HLA class I and class II polymor-
phisms in Tunisian Berbers. Annals of Human Biology. 2011; 38 (2): 156–64. https://doi.org/10.3109/
03014460.2010.504195 PMID: 20666704
25. Gomez-Casado E, del Moral P, Martinez-Laso J, Garcı́a-Gómez A, Allende L, Silvera-Redondo C,
et al. HLA gene in Arabic-Speaking Moroccans: close relatedness to Berbers and Iberians. Tissue
Antigens. 2000; 55(3): 239–49. https://doi.org/10.1034/j.1399-0039.2000.550307.x PMID: 10777099
26. Mahfoudh N, Ayadi I, Kamoun A, Ammar R, Mallek B, Maalej L, et al. Analysis of HLA-A, -B, -C, -DR,
-DQ polymorphisms in the South Tunisian population and a comparison with other populations. Annals
of Human Biology. 2013; 40(1): 41–7. https://doi.org/10.3109/03014460.2012.734334 PMID:
23095049
27. Matevosyan L, Chattopadhyay S, Madelian V, Avagyan S, Nazaretyan M, Hyussian A, et al. HLA-A,
HLA-B, and HLA-DRB1 allele distribution in a large Armenian population sample. Tissue Antigens.
2011; 78(1): 21–30. https://doi.org/10.1111/j.1399-0039.2011.01668.x PMID: 21501120
28. Hamdi NM, Al-Hababi FH, Eid AE. HLA class I and class II associations with ESRD in Saudi Arabian
population. PLoS One. 2014 Nov 7; 9(11): e111403. https://doi.org/10.1371/journal.pone.0111403
PMID: 25380295
29. Arnaiz-Villena A, Elaiwa N, Silvera C, Rostom A, Moscoso J, Gómez-Casado E, et al. The origin of
Palestinians and their genetic relatedness with other Mediterranean populations. Retraction in: Suciu-
Foca N, Lewis R. Human Immunology. 2001; 62(9): 889–900. (Accessed on
https://commons.wikimedia.org/wiki/File:Palestinians_hla ) PMID: 11543891
30. Albalushi KR, Sellami MH, Alriyami H, Varghese M, Boukef MK, Hmida S. The Investigation of the
Evolutionary History of the Omani Population by Analysis of HLA Class I Polymorphism. Anthropologist.
2014; 18(1): 205–210
31. Sánchez-Velasco P, Karadsheh NS, Garcı́a-Martı́n A, Ruı́z de Alegrı́a C, Leyva-Cobián F. Molecular
analysis of HLA allelic frequencies and haplotypes in Jordanians and comparison with other related
populations. Human Immunology. 2001; 62(9): 901–9. https://doi.org/10.1016/S0198-8859(01)
00289-0. PMID: 11543892.
32. Galgani A, Mancino G, Martı́nez-Labarga C, Cicconi R, Mattei M, Amicosante M, et al. HLA-A, -B and
-DRB1 allele frequencies in Cyrenaica population (Libya) and genetic relationships with other popula-
tions. Hum Immunol. 2013; 74(1): 52–9. https://doi.org/10.1016/j.humimm.2012.10.001 PMID:
23079236
33. Hajjej A, Hmida S, Kaabi H, Dridi A, Jridi A, El Gaaied A, et al. HLA genes in Southern Tunisians
(Ghannouch area) and their relationship with other Mediterraneans. European Journal of Medical
Genetics. 2006; 49(1): 43–56. https://doi.org/10.1016/j.ejmg.2005.01.001 PMID: 16473309
34. Hmida S, Gauthier A, Dridi A, Quillivic F, Genetet B, Boukef K, et al. HLA class II gene polymorphism
in Tunisians. Tissue Antigens. 1995; 45(1): 63–8. https://doi.org/10.1111/j.1399-0039.1995.tb02416.
x PMID: 7725313
35. Almawi WY, Busson M, Tamim H, Al-Harbi EM, Finan RR, Wakim-Ghorayeb SF, et al. HLA class II
profile and distribution of HLA-DRB1 and HLA-DQB1 alleles and haplotypes among Lebanese and
Bahraini Arabs. Clinical and Diagnostic Laboratory Immunology. 2004; 11(4): 770–4. https://doi.org/
10.1128/CDLI.11.4.770-774.2004 PMID: 15242955
36. Amar A, Kwon OJ, Motro U, Witt CS, Bonne-Tamir B, Gabison R, et al. Molecular analysis of HLA
class II polymorphisms among different ethnic groups in Israel. Human Immunology. 1999; 60(8):
723–30. https://doi.org/10.1016/S0198-8859(99)00043-9 PMID: 10439318
37. Izaabel H, Garchon HJ, Caillat-Zucman S, Beaurain G, Akhayat O, Bach JF, et al. HLA class II DNA
polymorphism in a Moroccan population from the Souss, Agadir area. Tissue Antigens. 1998; 51(1):
106–10. https://doi.org/10.1111/j.1399-0039.1998.tb02954.x PMID: 9459511
38. Al-Tonbary Y, Abdel-Razek N, Zaghloul H, Metwaly S, El-Deek B, El-Shawaf R. HLA class II polymor-
phism in Egyptian children with lymphomas. Hematology. 2004; 9(2): 139–45. https://doi.org/10.
1080/1024533042000205487 PMID: 15203870
39. Clayton J, Lonjou C. Allele and Haplotype frequencies for HLA loci in various ethnic groups. In
Charron D, ed. Genetic diversity of HLA. Functional and medical implications. Vol 1. Paris: EDK.
1997; 665–820.
40. Abdennaji Guenounou B, Loueslati BY, Buhler S, Hmida S, Ennafaa H, Khodjet-Elkhil H, et al. HLA
class II genetic diversity in Southern Tunisia and the Mediterranean area. International Journal of
Immunogenetics. 2006; 33(2): 93–103. https://doi.org/10.1111/j.1744-313X.2006.00577.x PMID:
16611253
41. Martinez-Laso J, De Juan D, Martinez-Quiles N, Gomez-Casado E, Cuadrado E, Arnaiz-Villena A.
The contribution of the HLA-A, -B, -C and -DR, -DQ DNA typing to the study of the origins of Spaniards
and Basques. Tissue Antigens. 1995; 45(4): 237–45. https://doi.org/10.1111/j.1399-0039.1995.
tb02446.x PMID: 7638859.
42. Brick C, Bennani N, Atouf O, Essakalli M. HLA-A, -B, -DR and -DQ allele and haplotype frequencies in
the Moroccan population: a general population study. Transfusion Clinique et Biologique. 2006; 13(6):
346–52. https://doi.org/10.1016/j.tracli.2006.12.003 PMID: 17306585
43. Gibert M, Touinssi M, Reviron D, Mercier P, Boëtsch G, Chiaroni J. HLA-DRB1 frequencies of the
Comorian population and their genetic affinities with Sub-Saharan African and Indian Oceanian popu-
lations. Annals of Human Biology. 2006; 33(3): 265–78. https://doi.org/10.1080/03014460600578599
PMID: 17092866
44. Samaha H, Rahal EA, Abou-Jaoude M, Younes M, Dacchache J, Hakime N. HLA class II allele fre-
quencies in the Lebanese population. Molecular Immunology. 2003; 39(17–18): 1079–81. https://doi.
org/10.1016/S0161-5890(03)00073-7 PMID: 12835080
45. Khansa S, Hoteit R, Shammaa D, Khalek RA, El Halas H, Greige L, et al. HLA class II allele frequen-
cies in the Lebanese population. Gene. 2012; 506(2): 396–9. https://doi.org/10.1016/j.gene.2012.06.
063 PMID: 22750800
46. Elbjeirami WM, Abdel-Rahman F, Hussein AA. Probability of finding an HLA-matched donor in imme-
diate and extended families: the Jordanian experience. Biology of Blood and Marrow Transplantation.
2013; 19(2): 221–6. https://doi.org/10.1016/j.bbmt.2012.09.009 PMID: 23025986
47. Mourad J, Monem F. HLA-DRB1 allele association with rheumatoid arthritis susceptibility and severity
in Syria. Revista Brasileira De Reumatologia. 2013; 53(1): 47–56. PMID: 23588515
48. Djidjik R, Allam I, Douaoui S, Meddour Y, Cherguelaı̂ne K, Tahiat A, et al. Association study of human
leukocyte antigen-DRB1 alleles with rheumatoid arthritis in Algerian patients. International Journal of
Rheumatic Diseases. 2014. https://doi.org/10.1111/1756-185X.12272 PMID: 24447879
49. Hajeer AH, Al Balwi MA, Aytül Uyar F, Alhaidan Y, Alabdulrahman A, Al Abdulkareem I, et al. HLA-A,
-B, -C, -DRB1 and -DQB1 allele and haplotype frequencies in Saudis using next generation
sequencing technique. Tissue Antigens. 2013; 82(4): 252–8. https://doi.org/10.1111/tan.12200 PMID:
24461004
50. Hajeer AH, Sawidan FA, Bohlega S, Saleh S, Sutton P, Shubaili A, Tahan AA, Al Jumah M. HLA class
I and class II polymorphisms in Saudi patients with myasthenia gravis. International Journal of Immu-
nogenetics. 2009; 36(3): 169–72. https://doi.org/10.1111/j.1744-313X.2009.00843.x PMID:
19490212
51. Albalushi KR, Sellami MH, Alriyami H, Varghese M, Boukef MK, Hmida S. HLA Class II (DRB1 and
DQB1) Polymorphism in Omanis. Journal of Transplantation Technologies and Research 2014; 4:
134. https://doi.org/10.4172/2161-0991.1000134
52. Haider MZ, Shaltout A, Alsaeid K, Qabazard M, Dorman J. Prevalence of human leukocyte antigen
DQA1 and DQB1 alleles in Kuwaiti Arab children with type 1 diabetes mellitus. Clinical Genetics.
1999; 56(6): 450–6. https://doi.org/10.1034/j.1399-0004.1999.560608.x PMID: 10665665
53. Haider MZ, Zahid MA, Dalal HN, Razik MA. Human leukocyte antigen (HLA) DRB1 alleles in Kuwaiti
Arabs with schizophrenia. American Journal of Medical Genetics. 2000; 96(6): 870–2.
https://doi.org/10.1002/1096-8628(20001204)96:6<870::AID-AJMG36>3.0.CO;2-L PMID: 11121200.
54. Arnaiz-Villena A, Palacio-Grüber J, Muñiz E, Campos C, Alonso-Rubio J, Gomez-Casado E, et al.
Genetic HLA Study of Kurds in Iraq, Iran and Tbilisi (Caucasus, Georgia): Relatedness and Medical
Implications. PLoS One. 2017 Jan 23; 12(1): e0169929. https://doi.org/10.1371/journal.pone.
0169929 PMID: 28114347
55. Nassar MY, Al-Shamahy HA, Masood HA. The Association between Human Leukocyte Antigens and
Hypertensive End-Stage Renal Failure among Yemeni Patients. Sultan Qaboos University Medical
Journal. 2015; 15(2): e241–249. PMID: 26052458
56. Middleton D, Williams F, Meenagh A, Daar AS, Gorodezky C, Hammond M, et al. Analysis of the
distribution of HLA-A alleles in populations from five continents. Human Immunology. 2000; 61
(10): 1048–52. https://doi.org/10.1016/S0198-8859(00)00178-6 PMID: 11082518
57. Williams F, Meenagh A, Darke C, Acosta A, Daar AS, Gorodezky C, et al. Analysis of the distribution
of HLA-B alleles in populations from five continents. Human Immunology. 2001; 62(6): 645–50.
https://doi.org/10.1016/S0198-8859(01)00247-6 PMID: 11390040
58. Jazairi B, Khansaa I, Ikhtiar A, Murad H. Frequency of HLA-DRB1 and HLA-DQB1 Alleles and Haplo-
type Association in Syrian Population. Immunological Investigation. 2016; 45(2): 172–9. https://doi.
org/10.3109/08820139.2015.1131293 PMID: 26853713
59. Hajjej A, Hajjej G, Almawi WY, Kaabi H, El-Gaaied A, Hmida S. HLA class I and class II polymorphism
in a population from south-eastern Tunisia (Gabes Area). International Journal of Immunogenetics.
2011; 38(3): 191–9. https://doi.org/10.1111/j.1744-313X.2011.01003.x PMID: 21385325
60. Hajjej A, Kâabi H, Sellami MH, Dridi A, Jeridi A, El borgi W, et al. The contribution of HLA class I and II
alleles and haplotypes to the investigation of the evolutionary history of Tunisians. Tissue Antigens.
2006; 68(2): 153–62. https://doi.org/10.1111/j.1399-0039.2006.00622.x PMID: 16866885
61. Hajjej A, Almawi WY, Hattab L, El-Gaaied A, Hmida S. HLA Class I and Class II Alleles and Haplo-
types Confirm the Berber Origin of the Present Day Tunisian Population. PLoS One. 2015; 10(8):
e0136909. https://doi.org/10.1371/journal.pone.0136909 PMID: 26317228
62. Hajjej A, Almawi WY, Hattab L, El-Gaaied A, Hmida S. The investigation of the origin of Southern Tuni-
sians using HLA genes. Journal of Human Genetics. 2017; 62(3): 419–429. https://doi.org/10.1038/
jhg.2016.146 PMID: 27881842
63. Ayed K, Ayed-Jendoubi S, Sfar I, Labonne MP, Gebuhrer L. HLA class-I and HLA class-II phenotypic,
gene and haplotypic frequencies in Tunisians by using molecular typing data. Tissue Antigens. 2004;
64(4): 520–32. https://doi.org/10.1111/j.1399-0039.2004.00313.x PMID: 15361135
64. Oumhani K, Canossi A, Piancatelli D, Di Rocco M, Del Beato T, Liberatore G, et al. Sequence-Based
analysis of the HLA-DRB1 polymorphism in Metalsa Berber and Chaouya Arabic-speaking groups
from Morocco. Human Immunology. 2002; 63(2): 129–38. https://doi.org/10.1016/S0198-8859(01)
00370-6 PMID: 11821160
65. Canossi A, Piancatelli D, Aureli A, Oumhani K, Ozzella G, Del Beato T, et al. Correlation between
genetic HLA class I and II polymorphisms and anthropological aspects in the Chaouya population
from Morocco (Arabic speaking). Tissue Antigens. 2010; 76(3): 177–193. https://doi.org/10.1111/j.
1399-0039.2010.01498.x PMID: 20492599
66. Roitberg-Tambur A, Witt CS, Friedmann A, Safirman C, Sherman L, Battat S, Nelken D, Brautbar C.
Comparative analysis of HLA polymorphism at the serologic and molecular level in Moroccan and Ash-
kenazi Jews. Tissue Antigens. 1995; 46(2): 104–10. https://doi.org/10.1111/j.1399-0039.1995.
tb02485.x PMID: 7482502
67. Arnaiz-Villena A, Benmamar D, Alvarez M, Diaz-Campos N, Varela P, Gomez-Casado E, et al. HLA
allele and haplotype frequencies in Algerians. Relatedness to Spaniards and Basques. Human Immu-
nology. 1995; 43(4): 259–68. https://doi.org/10.1016/0198-8859(95)00024-X PMID: 7499173
68. Imanishi T, Akaza T, Kimura A, Tokunaga K, Gojobori T. Allele and haplotype frequencies for HLA and
complement loci in various ethnic groups. In: HLA 1991. Vol 1. Oxford: Oxford University
Press. 1992; 1065–220.
69. Arnaiz-Villena A, Iliakis P, González-Hevilla M, Longás J, Gómez-Casado E, Sfyridaki K, et al. The ori-
gin of Cretan populations as determined by characterization of HLA alleles. Tissue Antigens. 1999; 53
(3): 213–26. https://doi.org/10.1034/j.1399-0039.1999.530301.x PMID: 10203014
70. Comas D, Mateu E, Calafell F, Pérez-Lezaun A, Bosch E, Martı́nez-Arias R, et al. HLA class I and
class II DNA typing and the origin of Basques. Tissue Antigens. 1998; 51(1): 30–40. https://doi.org/
10.1111/j.1399-0039.1998.tb02944.x PMID: 9459501
71. Grimaldi MC, Crouau-Roy B, Amoros JP, Cambon-Thomsen A, Carcassi C, Orru S, et al. West Medi-
terranean islands (Corsica, Balearic Islands, Sardinia) and the Basque population: contribution of HLA
class I molecular markers to their evolutionary history. Tissue Antigens. 2001; 58(5): 281–92. https://
doi.org/10.1034/j.1399-0039.2001.580501.x PMID: 11844138
72. Renquin J, Sanchez-Mazas A, Halle L, Rivalland S, Jaeger G, Mbayo K, et al. HLA class II polymor-
phism in Aka Pygmies and Bantu Congolese and a reassessment of HLA-DRB1 African diversity. Tis-
sue Antigens. 2001; 58(4): 211–22. https://doi.org/10.1034/j.1399-0039.2001.580401.x PMID:
11782272
73. Farjadian S, Ghaderi A. HLA class II genetic diversity in Arabs and Jews of Iran. Iranian Journal of
Immunology. 2007; 4(2): 85–93. https://doi.org/IJIv4i2A3 PMID: 17652848
74. Kollaee A, Ghaffarpor M, Ghlichnia HA, Ghaffari SH, Zamani M. The influence of the HLA-DRB1 and
HLA-DQB1 allele heterogeneity on disease risk and severity in Iranian patients with multiple sclerosis.
International Journal of Immunogenetics. 2012; 39(5): 414–22. https://doi.org/10.1111/j.1744-313X.
2012.01104.x PMID: 22404765
75. Sayad A, Akbari MT, Pajouhi M, Mostafavi F, Zamani M. The influence of the HLA-DRB, HLA-DQB
and polymorphic positions of the HLA-DRβ1 and HLA-DQβ1 molecules on risk of Iranian type 1 diabe-
tes mellitus patients. International Journal of Immunogenetics. 2012; 39(5): 429–36. https://doi.org/
10.1111/j.1744-313X.2012.01116.x PMID: 22494469
76. Sulcebe G, Sanchez-Mazas A, Tiercy JM, Shyti E, Mone I, Ylli Z, et al. HLA allele and haplotype
frequencies in the Albanian population and their relationship with the other European populations.
International Journal of Immunogenetics. 2009; 36(6): 337–43.
https://doi.org/10.1111/j.1744-313X.2009.00868.x PMID: 19703234
77. Sanchez-Velasco P, Gomez-Casado E, Martinez-Laso J, Moscoso J, Zamora J, Lowy E, et al. HLA
alleles in isolated populations from North Spain: origin of the Basques and the ancient Iberians. Tissue
Antigens. 2003; 61(5): 384–92. https://doi.org/10.1034/j.1399-0039.2003.00041.x PMID: 12753657
78. Arnaiz-Villena A, Dimitroski K, Pacho A, Moscoso J, Gómez-Casado E, Silvera-Redondo C, et al.
HLA genes in Macedonians and the sub-Saharan origin of the Greeks. Tissue Antigens. 2001;
57(2): 118–27. https://doi.org/10.1034/j.1399-0039.2001.057002118.x PMID: 11260506
79. Arnaiz-Villena A, Karin M, Bendikuze N, Gomez-Casado E, Moscoso J, Silvera C, et al. HLA alleles
and haplotypes in the Turkish population: relatedness to Kurds, Armenians and other Mediterraneans.
Tissue Antigens. 2001; 57(4): 308–17. https://doi.org/10.1034/j.1399-0039.2001.057004308.x
PMID: 11380939
80. Muro M, Marín L, Torío A, Moya-Quiles MR, Minguela A, Rosique-Roman J, et al. HLA polymorphism
in the Murcia population (Spain): in the cradle of the archaeologic Iberians. Human Immunology. 2001;
62(9): 910–21. https://doi.org/10.1016/S0198-8859(01)00290-7 PMID: 11543893
81. Farjadian S, Ghaderi A. HLA class II similarities in Iranian Kurds and Azeris. International Journal of
Immunogenetics. 2007; 34(6): 457–63. https://doi.org/10.1111/j.1744-313X.2007.00723.x
PMID: 18001303
82. Mohyuddin A, Ayub Q, Khaliq S, Mansoor A, Mazhar K, Rehman S, et al. HLA polymorphism in six
ethnic groups from Pakistan. Tissue Antigens. 2002; 59(6): 492–501.
https://doi.org/10.1034/j.1399-0039.2002.590606.x PMID: 12445319
83. Agrawal S, Srivastava SK, Borkar M, Chaudhuri TK. Genetic affinities of north and northeastern
populations of India: inference from HLA-based study. Tissue Antigens. 2008; 72(2): 120–30.
https://doi.org/10.1111/j.1399-0039.2008.01083.x PMID: 18721272
84. Rani R, Sood A, Goswami R. Molecular basis of predisposition to develop type 1 diabetes mellitus in
North Indians. Tissue Antigens. 2004; 64(2): 145–55.
https://doi.org/10.1111/j.1399-0039.2004.00246.x PMID: 15245369
85. Migot-Nabias F, Fajardy I, Danze PM, Everaere S, Mayombo J, Minh TN, et al. HLA class II
polymorphism in a Gabonese Banzabi population. Tissue Antigens. 1999; 53(6): 580–5.
https://doi.org/10.1034/j.1399-0039.1999.530610.x PMID: 10395110
86. Arnaiz-Villena A, Muñiz E, Campos C, Gomez-Casado E, Tomasi S, Martínez-Quiles N, et al. Origin
of Ancient Canary Islanders (Guanches): presence of Atlantic/Iberian HLA and Y chromosome genes
and Ancient Iberian language. International Journal of Modern Anthropology. 2015; 8: 67–93.
https://doi.org/10.4314/ijma.v1i8.4
87. Reil A, Bein G, Machulla HK, Sternberg B, Seyfarth M. High-resolution DNA typing in immunoglobulin
A deficiency confirms a positive association with DRB1*0301, DQB1*02 haplotypes. Tissue Antigens.
1997; 50(5): 501–6. https://doi.org/10.1111/j.1399-0039.1997.tb02906.x PMID: 9389325
88. Arnaiz-Villena A, Martínez-Laso J, Gómez-Casado E, Díaz-Campos N, Santos P, Martinho A, et al.
Relatedness among Basques, Portuguese, Spaniards, and Algerians studied by HLA allelic
frequencies and haplotypes. Immunogenetics. 1997; 47(1): 37–43. PMID: 9382919
89. Stearns PN. The Encyclopedia of World History: Ancient, Medieval, and Modern, Chronologically
Arranged, 6th ed., Houghton Mifflin Harcourt, 2001, 2017, pp. 129–131.
90. Mira-Guardiola MA (2000). Cartago contra Roma. Ed.: Alderaban. Madrid, Spain.
91. Sellier J, Sellier A. Atlas des Peuples d’Orient. Paris, France: Editions La Decouverte, 1993.
92. Arnaiz-Villena A, Martinez-Laso J, Alonso-García J. The Correlation Between Languages and Genes:
The Usko-Mediterranean Peoples. Human Immunology. 2001; 62(9): 1051–1061.
https://doi.org/10.1016/S0198-8859(01)00300-7 PMID: 11543906
93. Arnaiz-Villena A, Gomez-Casado E, Martinez-Laso J. Population genetic relationships between
Mediterranean populations determined by HLA allele distribution and a historic Perspective. Tissue
Antigens. 2002; 60(2): 111–21. https://doi.org/10.1034/j.1399-0039.2002.600201.x PMID: 12392505
94. Hachid M. Postface de l’ouvrage “aux origines de l’ecriture au Maroc. corpus des inscriptions
amazighes des sites d’art rupestre du maroc” edited by: Skounti A., Lemdjidi A. and Nami M. Publication
de l’institut royal de la culture amazighe. Cealpa, Rabat, Morocco, 2003.
95. Malika H. Les premier berebers entre mediterranee, tassili et nil. Edited by Edisud. Aix-en-Provence,
France, 2000.
96. Carter RA. Boat remains and trade in Persian Gulf during the 6th and 5th millenia BC. Antiquity. 2006;
80(307): 52–63. https://doi.org/10.1017/S0003598X0009325X
97. Kuhrt A. The ancient Near East (3000–330 BC). Vol II. Barcelona, Editorial Critica, 2001.
98. Hitti PK. History of Syria: Including Lebanon and Palestine, 2004, p26
99. Encyclopaedia Britannica: https://www.britannica.com/
100. Sartre M, D’Alexandre à Zénobie: Histoire du Levant antique, IVe siècle avant Jésus-Christ-IIIe siècle
après Jésus-Christ, Fayard, 2001.
Genetic heterogeneity of Arab pop-HLA gene-2018
RESEARCH ARTICLE
The genetic heterogeneity of Arab
populations as inferred from HLA genes
Abdelhafidh Hajjej1*, Wassim Y. Almawi2¤, Antonio Arnaiz-Villena3, Lasmar Hattab4,
Slama Hmida1
1 Department of Immunogenetics, National Blood Transfusion Center, Tunis, Tunisia, 2 Department of
Medicine, Harvard Medical School, Boston, MA, United States of America, 3 Department of Immunology,
University Complutense, School of Medicine, Madrid Regional Blood Center, Madrid, Spain, 4 Department of
Medical Analysis, Hospital of Gabes (Ghannouch), Gabes, Tunisia
¤ Current address: School of Pharmacy, Lebanese American University, Byblos, Lebanon
* abdelhafidhhajjej@gmail.com
Abstract
This is the first genetic anthropology study on Arabs in the MENA (Middle East and North Africa)
region. The present meta-analysis included 100 populations from 36 Arab and non-Arab
communities, comprising 16,006 individuals, and evaluates the genetic profile of Arabs using HLA
class I (A, B) and class II (DRB1, DQB1) genes. A total of 56 Arab populations comprising
10,283 individuals were selected from several databases and compared with 44 Mediterranean,
Asian, and sub-Saharan populations. The most frequent alleles in Arabs are A*01,
A*02, B*35, B*51, DRB1*03:01, DRB1*07:01, DQB1*02:01, and DQB1*03:01, while
DRB1*03:01-DQB1*02:01 and DRB1*07:01-DQB1*02:02 are the most frequent class II
haplotypes. Dendrograms, correspondence analyses, genetic distances, and haplotype analysis
indicate that Arabs can be stratified into four groups. The first consists of North Africans
(Algerians, Tunisians, Moroccans, and Libyans) and the first Arabian Peninsula cluster
(Saudis, Kuwaitis, and Yemenis), who appear to be related to Western Mediterraneans, including
Iberians; this might be explained by a massive migration into these areas when the Sahara
underwent a relatively rapid desiccation, starting about 10,000 years BC. The second includes
Levantine Arabs (Palestinians, Jordanians, Lebanese, and Syrians), along with Iraqis and Egyptians,
who are related to Eastern Mediterraneans. The third comprises Sudanese and Comorians,
who tend to cluster with Sub-Saharans. The fourth comprises the second Arabian Peninsula
cluster, made up of Omanis, Emiratis, and Bahrainis. It is noteworthy that the two large
minorities (Berbers and Kurds) are indigenous (autochthonous) and are not genetically different from
“host” and neighboring populations. In conclusion, this study confirmed high genetic
heterogeneity among present-day Arabs, especially those of the Arabian Peninsula.
OPEN ACCESS
Citation: Hajjej A, Almawi WY, Arnaiz-Villena A, Hattab L, Hmida S (2018) The genetic
heterogeneity of Arab populations as inferred from HLA genes. PLoS ONE 13(3): e0192269.
https://doi.org/10.1371/journal.pone.0192269
Editor: Amr H Sawalha, University of Michigan, UNITED STATES
Received: November 6, 2017
Accepted: January 19, 2018
Published: March 9, 2018
Copyright: © 2018 Hajjej et al. This is an open access article distributed under the terms of the
Creative Commons Attribution License, which permits unrestricted use, distribution, and
reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the paper and its Supporting Information
files.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The human leukocyte antigen (HLA) system plays a key role in self/non-self recognition. It
is divided into class I (HLA-A, -B, and -C) and class II (HLA-DP, -DQ, and -DR) loci, and
comprises 220 genes in a 3.6 Mb region on the short arm of chromosome 6. The HLA system is
highly polymorphic, with more than 17,000 alleles detected. For example, there are 4,828
B, 3,968 A, and 3,579 C class I alleles, compared with 2,103 DRB1 and 1,142 DQB1 class II
alleles. Several HLA alleles have been associated with autoimmune and infectious diseases
[1]. HLA class I and class II loci are characterized by high (80–90%) heterozygosity, and thus
constitute reliable genetic markers for phylogenetic and anthropological studies.
Population studies have confirmed that the frequencies of HLA alleles and haplotypes vary
with ethnicity and geographic origin. Because HLA markers are codominantly expressed,
heterozygotes can be distinguished from homozygotes, which allows genotypes to be assigned
and allele frequencies to be estimated [2]. Linkage disequilibrium (LD) analysis
between HLA alleles can indicate the number of generations separating two closely related
populations since the time of their separation. Diversity in haplotype distribution, allele frequency,
and LD reflects the extent of variation between closely related populations. Genetic distance
analysis based on allele frequencies allows the construction of phylogenetic trees
(dendrograms), which give a relative estimate of the time elapsed since the populations existed as
a single cohesive unit [3–6].
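As an illustration of why codominant markers make allele-frequency estimation straightforward, the following minimal Python sketch estimates frequencies by direct gene counting. It is not taken from the study: the function name `allele_frequencies` and the five-person sample are hypothetical and serve only to show the principle.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies by direct gene counting.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    Codominance means both alleles of every individual are observed,
    so each individual contributes exactly two counted alleles.
    """
    counts = Counter()
    for allele_1, allele_2 in genotypes:
        counts[allele_1] += 1
        counts[allele_2] += 1
    total = 2 * len(genotypes)
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical sample of five individuals typed at HLA-A (generic level).
sample = [
    ("A*02", "A*02"),  # homozygote
    ("A*01", "A*02"),
    ("A*02", "A*24"),
    ("A*01", "A*03"),
    ("A*24", "A*02"),
]
freqs = allele_frequencies(sample)
```

In this invented sample, A*02 is counted 5 times out of 10 alleles, giving a frequency of 0.5; the frequencies of all alleles sum to 1.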
Arabs are a major panethnic group, and the Arab League is a cultural and ethnic
union of 22 member states. As of 2013, the Arab League countries had 357 million
nationals, who populate an area of 13 million km² straddling Africa and Asia [7].
Arabs are characterized by ethnic, religious, and linguistic diversity (triple heterogeneity).
Most Arabs follow Islam; Christianity is the second largest religion, with over 15 million
Christians. There are also smaller but significant religious minorities (such as Druze and Jews),
and a number of non-Arab ethnic minorities (such as Berbers and Kurds) [7, 8].
The history of Arabs extends from circa 1200 BC, when the southern Arabian Peninsula
was ruled by three successive civilizations: the Mineans, who established their capital at Karna
(1200–650 BC), the Sabeans in Marib (1000 BC–570 AD), and the Himyarites (2nd–6th
centuries AD) in Dhafar (Oman) [9–11]. These civilizations were built by authentic Yemeni
tribes. The kingdom of Kinda was established in Central Arabia in the 4th to early 6th century
AD, while the Dilmun civilization was founded in Eastern Arabia. In the 3rd century AD, the East
African Kingdom of Aksum extended into Yemen and western Saudi Arabia [12]. In
addition, the Lakhmids (of Yemeni origin) established a dynasty that ruled part of
present-day Iraq and Syria in 300–602 AD [10, 13, 14]. The Arab Christian Ghassanids
(220–638 AD), originating from southern Arabia, migrated in the 3rd century to Jordan,
where they established a kingdom that extended from Syria to Yathrib (Saudi Arabia)
[12, 13]. Islam was introduced to the Arabian Peninsula in 610 AD. Shortly thereafter, the Arabian
tribes were united as a single Islamic state in the Arabian Peninsula, led by
the Islamic prophet Muhammad. This Islamic state progressively grew in area,
and in the types and numbers of its populations, and extended from Andalusia (Spain) in the
west to the Indus in the east [14].
The subsequent spread of Islam involved the swift invasion of Persia (637–651 AD), Iraq, the Levant,
and Egypt (639 AD), and extended into North Africa (640–709) and to Spain, Portugal, and
France (Poitiers) in the 8th century AD. Eastwards, Arab expansion into Central Asia, Bukhara
(Uzbekistan), Afghanistan (637–709), and the Indus border (664–712) followed. Northwards,
Arab invaders came into contact with the Byzantine Empire, the Caspian, and the Caucasus
[15, 16]. With the Islamic expansion from the 7th century, social and political groups
were gradually Arabized. Arab-Muslim culture spread at the expense of local
languages (such as Berber and Kurdish), especially in the Middle East and North Africa, so that
Arabized populations came to speak variants of Arabic mixed with the original languages (dialects). The
extent of Arab gene exchange with these autochthonous groups is undetermined, but is thought
to be lower than the religious and cultural influence.
Given the large number of conquests, Arabs came into contact with different ethnicities
residing in a vast area stretching from Mauritania (West Africa) to the western border of China (East
Asia). This suggests that cultural, and perhaps genetic, relationships were established with these
ethnic groups. This work aims to study the HLA distribution in North African and Oriental
Arab populations, and to compare them with neighboring populations (Sub-Saharan Africans,
Europeans, and Asians).
Populations and methods
Search strategy
Datasets of HLA allele frequencies were collected through a systematic review performed according to
the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) criteria [17]; only
criteria 1–10, 17, and 26 are applicable to this type of study (S1 Checklist).
The PubMed, ScienceDirect, AlleleFrequencies.net, and ResearchGate databases were searched for
all papers on HLA polymorphism and HLA disease associations in Arabs. This systematic
literature search, covering papers published up to May 31, 2017, was conducted by two
investigators (H.A. and H.L.); the search terms used were ‘HLA Arabs’ or ‘Human Leukocyte Antigen
Arabs’. A search per country followed (‘HLA Tunisians’, ‘HLA Saudis’, and so on), which
resulted in more than 50 keywords being used. A database
from the International Histocompatibility Workshops was also used. Some authors were also
contacted by e-mail or through ResearchGate to request information and missing data. While
most datasets were taken from studies with an explicit anthropological focus, control groups
from case-control disease studies were also used. No language restriction was applied to
this search.
Inclusion and exclusion criteria
All included studies met the following criteria: HLA allele frequencies had to be obtained by
molecular typing, and subjects had to be typed for at least one of the following loci: HLA-A,
HLA-B, HLA-DRB1, and HLA-DQB1. Publications were excluded if they reported serological data,
had a sample size of fewer than 35 individuals, included typed individuals (or controls) who were
related or not randomly selected, or presented duplicate datasets. Studies were also excluded if they
presented incomplete or partial allele frequencies, or if there were significant ambiguities in the typing.
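The criteria above can be read as a single eligibility test applied to each candidate dataset. The sketch below is one possible encoding in Python, under stated assumptions: the `Study` record, its field names, and the two example entries are hypothetical and are not part of the original review protocol.

```python
from dataclasses import dataclass

MIN_SAMPLE_SIZE = 35                       # studies with fewer individuals are excluded
REQUIRED_LOCI = {"A", "B", "DRB1", "DQB1"}  # at least one of these must be typed


@dataclass
class Study:
    """Hypothetical record describing one candidate dataset."""
    name: str
    typing_method: str          # "molecular" or "serological"
    sample_size: int
    loci: set
    unrelated_random: bool      # subjects unrelated and randomly selected
    duplicate: bool             # dataset already reported elsewhere
    complete_frequencies: bool  # full allele-frequency table available
    ambiguous_typing: bool      # significant typing ambiguities


def is_eligible(study):
    """Return True when a study passes every inclusion/exclusion rule."""
    return (
        study.typing_method == "molecular"
        and study.sample_size >= MIN_SAMPLE_SIZE
        and bool(REQUIRED_LOCI & study.loci)
        and study.unrelated_random
        and not study.duplicate
        and study.complete_frequencies
        and not study.ambiguous_typing
    )


# Two invented candidates: one acceptable, one too small and serological.
accepted = Study("example-1", "molecular", 120, {"A", "B"}, True, False, True, False)
rejected = Study("example-2", "serological", 30, {"DRB1"}, True, False, True, False)
eligible = [s.name for s in (accepted, rejected) if is_eligible(s)]
```

Expressing the rules as one predicate makes it easy for two reviewers to apply identical criteria independently and then compare their selections.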
Data extraction
Studies were independently selected by two authors (H.A. and H.L.). An external referee was
consulted when disagreements could not be resolved by the two reviewers. Data extracted from selected
papers included publication year, study type (anthropology, association), sample size, HLA-A, -B,
-C, -DRB1, and -DQB1 allele frequencies, haplotype frequencies, region, country, and typed loci.
Statistical analysis
A three-dimensional correspondence analysis and its bi-dimensional representation were
performed using VISTA V5.02 software [18]. Phylogenetic trees were constructed from allele
frequencies using the Neighbor-Joining (NJ) method [19] and standard genetic distances
(SGD) [20], with the DISPAN software package, which contains the GNKDST and TREEVIEW programs [21, 22].
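To make the distance step concrete, the sketch below computes a genetic distance between two populations from their allele frequencies. It assumes that the SGD used here corresponds to Nei's standard genetic distance, D = -ln(J_XY / sqrt(J_X J_Y)), where J_X and J_Y are the sums of squared allele frequencies within each population and J_XY is the sum of cross-products; this is an assumption about reference [20], not a statement of how DISPAN is implemented. The function name and the example frequencies are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: dict mapping locus -> {allele: frequency}.
    Only loci present in both populations are used. Sums over loci are
    used instead of means because the number of loci cancels in the ratio.
    """
    j_x = j_y = j_xy = 0.0
    for locus in pop_x.keys() & pop_y.keys():
        fx, fy = pop_x[locus], pop_y[locus]
        for allele in fx.keys() | fy.keys():
            x = fx.get(allele, 0.0)
            y = fy.get(allele, 0.0)
            j_x += x * x
            j_y += y * y
            j_xy += x * y
    if j_xy == 0.0:
        return math.inf  # no shared alleles: identity is zero
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Invented frequencies for two populations at a single locus.
pop_a = {"DRB1": {"03:01": 0.5, "07:01": 0.5}}
pop_b = {"DRB1": {"03:01": 0.5, "11:01": 0.5}}
distance_ab = nei_standard_distance(pop_a, pop_b)
```

For these invented populations the genetic identity is 0.25 / 0.5 = 0.5, so the distance equals ln 2 (about 0.693). Identical populations give a distance of zero, and the measure is symmetric in its two arguments.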
Results
Study flow
The use of more than fifty keywords identified 5,456 papers and HLA datasets,
of which 315 were deemed relevant to the study. Of these, 42 articles and 11 HLA datasets
containing information on 56 Arab populations, and meeting the study criteria, were included.
The study flow is illustrated in Fig 1. In addition, 20 articles and 18 HLA datasets that met
the criteria of this study and contained complete information on 44 other populations were
selected, but without going through the systematic review. The populations used for
comparison were chosen mainly from countries neighboring the Arab countries. The study thus relied on a
database of 100 populations (data for 11 of which were extracted from association
studies) from 36 Arab and non-Arab countries in Asia, Europe, and
Africa. The distribution of populations by region is illustrated in Fig 2A. These populations
represent allele frequency data for 16,006 individuals (160.06 individuals per population),
drawn from 63 references.
Selected populations
Arab populations. The 42 articles and 11 HLA datasets (http://www.allelefrequencies.net)
selected provided information on 56 populations (Table 1), comprising 10,283 individuals
[23–67]. These 56 ethnic and religious populations were selected from 18 Arab
countries. There were no reliable HLA data for the remaining countries (Somalia, Djibouti,
Mauritania, and Qatar) (Fig 2B). The studied populations are divided into 29 African (26 North
African and 3 Sub-Saharan) and 27 Asian populations (13 Levantine and 14 from the Arabian
Peninsula). With the exception of 8 populations [28, 38, 47, 48, 50, 52, 53, 55], for which HLA data
were extracted from association studies, the data for the remaining 50 were extracted from
anthropological studies.
Neighboring populations. Forty-four worldwide populations [23, 34, 39, 66, 68–85],
comprising 5,723 individuals, were selected from 18 countries on three continents, using the same
criteria previously described (Table 2). These comprised 22 European, 11 non-Arab Asian,
and 11 Sub-Saharan African populations. Of the 11 Asian populations, two were Arab
minorities living in Iran (Khuzestani and Famoori).
Data for only three populations [74, 75, 84] were extracted from association studies. These
populations were typed for at least HLA-A, -B, -DRB1, or -DQB1.
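The headline totals of the dataset follow directly from the two subsets described above. The short Python tally below simply re-adds the counts reported in the text; it introduces no new data, and the variable names are illustrative.

```python
# Counts as reported in the text for the two subsets of the database.
arab = {"populations": 56, "individuals": 10283}
neighbouring = {"populations": 44, "individuals": 5723}

total_populations = arab["populations"] + neighbouring["populations"]
total_individuals = arab["individuals"] + neighbouring["individuals"]
mean_sample_size = total_individuals / total_populations

# Regional breakdowns stated in the text.
arab_african = 26 + 3           # North African + Sub-Saharan Arab populations
arab_asian = 13 + 14            # Levantine + Arabian Peninsula populations
neighbouring_total = 22 + 11 + 11  # European + Asian + Sub-Saharan African
association_derived = 8 + 3     # Arab + neighbouring populations from association studies

print(total_populations, total_individuals, round(mean_sample_size, 2))
# prints: 100 16006 160.06
```

The tally reproduces the 100 populations, 16,006 individuals, and 160.06 individuals per population given in the study flow, as well as the 11 populations drawn from association studies.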
HLA allele frequencies features of Arab populations
Table 3 shows the most frequent HLA-A and -B alleles in Arab populations. A*02 was the
most prevalent allele, and its frequency exceeded 25% in some populations, such as Saudis
(30.4%) [23], Tunisian Berbers of Zrawa (29.3%) [24], Moroccans (26.2%) [25], and
Sudanese (25.9%) [23]. The A*01, *03, *24, *30, and *68 alleles were also common in most Arab
populations. For example, the highest frequencies of A*01 were seen in Tunisians (15%) [26] and
Moroccans (14.8%) [25], while A*03 was prevalent among Iraqi Kurds (15.1%) [23], and
A*30 was prevalent among Sudanese (17.6%) [23]. In addition, A*24 was common among
Lebanese-Armenians (17.3%) [27], while A*68 was prevalent in Saudis (10.5%) [28]. In
contrast, A*25, *28, *34, *36, *43, *66, *69, *74, and *80 are rare among Arabs. It is noteworthy
that A*34, although rare among Arabs in general, is found at a high frequency (22.2%) in
Tunisian Berbers from Zrawa [24], the highest reported for any population worldwide.
Results for the HLA-B locus are presented in Table 3. B*35 was the most frequent HLA-B allele in
Palestinians (20.3%) [29] and Lebanese-Armenians (19.8%) [27]. B*35 was found at varied
frequencies in Iraqi Kurds (15.6%) [23], Omanis (15.3%) [30], Jordanians (14.9%) [31], and
Arab Emirati (11.1%) [23] populations. B*51 was the second most frequent allele, and high
frequencies were recorded for Saudis (19.3%) [23], Omanis (17.5%) [30], and Arab Emirati
(15.6%) [23] populations. B*50 was also a frequent HLA-B allele in most Arabs, including Saudis
(18.8%) [23] and Libyans (16.1%) [31], along with B*08 and B*44; B*44 reached 32.8% among the
Tunisian Berbers of Zrawa [24], the highest frequency worldwide. Similarly, the
frequency of B*27 is the highest among Jordanians (27.1%) [31]. In contrast, the B*37, *42, *46, *47,
*48, *54, *59, *67, and *78 alleles are extremely rare or virtually absent in all Arab populations.
The most common DRB1 and DQB1 alleles among Arabs are shown in Table 4.
DRB1*07:01 was the most frequent allele among Tunisians from Ghannouch (28.6%) [33],
Jordanians (26.9%) [31], and Saudis (26.6%) [23], while Egyptians (8.3%) and Sudanese had the
lowest frequencies of DRB1*07:01. DRB1*03:01 was the second most frequent DRB1 allele in
some Arabs, such as Tunisians of Tunis (21.9%) [34] and Moroccans of Metelsa (20.2%) [23],
but rare in Jordanians (2.4%) [31]. DRB1*11:01 was also frequent among some Arabs, such as
Lebanese (36.8%) [35], but rare among Saudis (4.8%) and Moroccans of Chaouya (2.5%) [23].
Furthermore, the DRB1*13:01, *13:02, and *15:01 alleles are relatively frequent among Arabs. A high
frequency of DRB1*13:01 was recorded for Sudanese (23.3%), while DRB1*13:02 was virtually
absent in Bahrainis [35] and Sudanese [23]. All DRB1*09, *12, and *14 subtypes are extremely
rare among Arabs. In addition, DRB1*16 subtypes are rare in all Arab populations except for
Bahrainis, in whom DRB1*16:01 is found at a high frequency (13.9%) [35].
Fig 1. Flow diagram of the study selection process.
https://doi.org/10.1371/journal.pone.0192269.g001
Fig 2. The distribution of studied populations by region (A) and country (B).
https://doi.org/10.1371/journal.pone.0192269.g002
DQB1*02:0X and *03:01 are the most frequent DQB1 alleles in Arabs. The highest
frequencies of DQB1*02:0X were reported for Tunisians (Ghannouch; 40.01%) [33], Yemenite Jews
(39.1%) [36], Moroccans (Agadir-Souss; 37.8%) [37], and Saudis (37.3%) [23], while the lowest
frequency was found in Egyptians (6%) [38]. On the other hand, DQB1*03:01 is very common
among Lebanese (45%) [39] and Algerians (Oran; 35.1%) [23], but not Saudis (7.6%) [23].
DQB1*03:02 and *05:01 are also frequent in most Arabs, such as Tunisians (Ghannouch;
20.7%) [33], Jordanians (17.8%) [31], Palestinians (17.6%) [29], and Lebanese (16.8%) [35].
DQB1*05:01 is frequent among Bahrainis (29.2%) [35], Tunisians (Berbers of Jerba; 22.7%)
[40], and Lebanese (20.5%) [35]. Among DQB1*06 subtypes, DQB1*06:02 and *06:03 were the
most frequent in most Arab populations, but were absent in Bahrainis, in whom DQB1*06:01 is very
frequent (13.20%) [35]. Furthermore, all DQB1*04 subtypes are rare among Arabs, particularly
Table 1. List of Arab populations used in the present work.
No. Populations Symbols Size References No. Populations Symbols Size References
1 Algiers Alg 102 [67] 29 Comorians Com 117 [43]
2 Algerians-B Alg-B 97 [23] 30 Jordanians Jor 146 [31]
3 Algerians-A Alg-A 132 [48] 31 Jordanians-A Jor-A 1254 [46]
4 Algerians-Oran Ora 100 [23] 32 Syrians Syr 200 [47]
5 Gabesians Gab 77 [59] 33 Syrians-A Syr-A 225 [58]
6 Gabesians-A Gab-A 96 [40] 34 Lebanese Leb 95 [35]
7 Ghannouchians Gha 82 [33] 35 Lebanese-A Leb-A 1123 [45]
8 Berbers-Jerba Ber-J 55 [40] 36 Lebanese-B Leb-B 191 [44]
9 Berbers-Matmata Ber-M 81 [40] 37 Lebanese-Armen Leb-Ar 368 [27]
10 Berbers-Zrawa Ber-Z 70 [24] 38 Lebanese-KZ Leb-Kz 93 [39]
11 Tunisians Tun 376 [61] 39 Lebanese-NS Leb-Ns 59 [39]
12 Tunisians-A Tun-A 80 [60] 40 Lebanese-Yohmor Leb-Y 75 [39]
13 Tunisians-B Tun-B 101 [34] 41 Palestinians Pal 165 [29]
14 Tunisians-C Tun-C 100 [63] 42 Palestinians-A Pal-A 109 [36]
15 Tunisians-M Tun-M 123 [26] 43 Saudis Sau 105 [28]
16 Southern Tunisians Tun-S 250 [62] 44 Saudis-A Sau-A 213 [23]
17 Libyans Lib 118 [32] 45 Saudis-B Sau-B 158 [49]
18 Libyans-Jews Lib-J 119 [36] 46 Saudis-C Sau-C 499 [23]
19 Berbers-Metelsa Ber-Me 99 [64] 47 Saudis-D Sau-D 383 [50]
20 Moroccans Mor 96 [25] 48 Omanis-A Oma-A 259 [30] [51]
21 Moroccans-A Mor-A 110 [42] 49 Kuwaitis Kuw 212 [52]
22 Moroccans-Agadir Mor-Ag 98 [37] 50 Kuwaitis-A Kuw-A 114 [53]
23 Moroccans-Chaouya Mor-Ch 98 [65] 51 Bahrainis Bah 72 [35]
24 Moroccans-Jews Mor-J 94 [66] 52 Emiratis Emi 373 [23]
25 Egyptians Egy 101 [39] 53 Iraq kurds Ira-K 209 [54]
26 Egyptians-A Egy-A 121 [38] 54 Yemenite-Jews Yem-J 76 [36]
27 Sudanese Sud 200 [23] 55 Yemen-sana’a Yem 50 [55]
28 Sudanese-Nuba Sud-N 46 [23] 56 Omanis Oma 118 [56] [57]
https://doi.org/10.1371/journal.pone.0192269.t001
DQB1*04:01 which is virtually absent, except in Egyptians (10.17%) [38]. The most common
DQB1*04 subtype in Arabs is DQB1*04:02.
Allelic comparison between Tunisians and other populations
Allelic comparisons were made using neighbor-joining dendrograms, correspondence analysis, and
standard genetic distances. Analyses were performed with class I and class II markers, at both generic
and high-resolution levels, to make the most of the available data, since some of the
populations included in these comparisons lack high-resolution data.
Neighbor-joining dendrograms. Comparison at the generic level was made using genetic
distances based on DRB1 and DQB1 allele frequencies. Four groups can be distinguished in
Fig 3. The first group comprises North African Arabs (Tunisians, Algerians, Moroccans,
Libyans), Western Mediterranean Europeans (Iberians, French), Arabian Peninsula Arabs (Saudis,
Kuwaitis, Yemenis), and the Arab minority of Iran (Khuzestani). The second group is formed by
Eastern Mediterranean Europeans (Greeks, Cretans, Albanians, Turks, Macedonians), Italians,
Levant Arabs (Palestinians, Lebanese, Syrians), Iraqi Kurds, Tunisian Berbers (Djerba), and
Iranians. The third group comprises Sub-Saharan Africans (Fulani, Mossi, Rimaibe, Bubi,
Mandenka, and Senegalese). Finally, Omanis, Bahrainis, Egyptians, and Sudanese form a
heterogeneous group containing Asians and Sub-Saharan Africans. Similar results, but with notable
differences, were observed in dendrograms built with standard genetic distances (SGD) based
on generic DRB1 (S1 Fig) and generic B loci (S2 Fig).
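For readers unfamiliar with how a neighbor-joining dendrogram is grown from a distance matrix, the sketch below shows the pair-selection step of the standard NJ algorithm: the pair (i, j) minimizing Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) is joined first. This is a generic textbook illustration, not the DISPAN implementation; the function name, the taxon labels a to e, and the distances are hypothetical.

```python
from itertools import combinations


def nj_pair_to_join(names, dist):
    """Return the pair of taxa that neighbor-joining would merge first.

    names: list of taxon labels.
    dist:  dict mapping frozenset({i, j}) -> distance between i and j.
    """
    n = len(names)
    # Total distance from each taxon to all others.
    row_sum = {
        i: sum(dist[frozenset((i, k))] for k in names if k != i)
        for i in names
    }

    def q(i, j):
        # NJ criterion: small Q means i and j are close to each other
        # relative to their distances from everything else.
        return (n - 2) * dist[frozenset((i, j))] - row_sum[i] - row_sum[j]

    return min(combinations(names, 2), key=lambda pair: q(*pair))


# Invented distance matrix for five hypothetical populations.
names = ["a", "b", "c", "d", "e"]
pairwise = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}
dist = {frozenset(pair): d for pair, d in pairwise.items()}
first_pair = nj_pair_to_join(names, dist)
```

In this invented matrix, d and e have the smallest raw distance (3), yet a and b are joined first because their Q value (-50) is lower than that of d and e (-48). The example shows that NJ corrects for each taxon's overall divergence rather than simply clustering the closest pair.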
Correspondence analysis. High-resolution DRB1 correspondence analysis (Fig 4)
demonstrated the clustering of the studied populations into three groups. The first contains
North Africans (Tunisians, Algerians, Moroccans, and Libyans), Iberians (Basques, Spaniards,
Table 2. Worldwide populations included in the meta-analysis.
No. Populations Symbols Size References No. Populations Symbols Size References
1 Spaniards Spa 176 [41] 23 Mossi Mos 42 [39]
2 Portuguese Por 118 [39] 24 Mandenka Mad 200 [39]
3 Murcians Mur 173 [80] 25 Amhara Amh 98 [39]
4 Italians Ita 284 [68] 26 Bubi Bub 101 [39]
5 Basques-A Bas-A 82 [41] 27 Congolese Con 85 [72]
6 Basques-Arratia Bas-Ar 83 [77] 28 Fulani Ful 38 [39]
7 Basques-B Bas-B 99 [70] 29 Gabonese Gab 167 [85]
8 French Fre 179 [68] 30 Nigerians Nig 258 [23]
9 French-Rennes Fre-R 200 [34] 31 Oromo Oro 83 [39]
10 Balearic Bal 90 [71] 32 Rimaibe Rim 39 [39]
11 Corsica Cor 100 [71] 33 Senegalese Sen 177 [39]
12 Sardinians Sar 91 [68] 34 Famoori Arabs Fam 84 [73]
13 Ashkenazi-Jews Ash-J 132 [66] 35 India-Northeast Ind-N 188 [83]
14 Greeks-A Gre-A 96 [39] 36 Indians-Delhi Ind-D 112 [84]
15 Greeks-B Gre-B 101 [39] 37 Iranian-Jews Ira-J 91 [73]
16 Greeks-C Gre-C 98 [39] 38 Iranians Ira 120 [74]
17 Greeks-D Gre-D 242 [23] 39 Iranians-A Ira-A 100 [75]
18 Macedonians Mac 172 [78] 40 Iranians-Azeri Ira-Az 100 [81]
19 Turks Tur 250 [23] 41 Iranians-Kurd Ira-k 100 [81]
20 Turks-A Tur-A 228 [79] 42 Khuzestani Arabs Khu 50 [73]
21 Albanians Alb 160 [76] 43 Pakistanis-Pathan Pak-P 100 [82]
22 Cretans Cre 135 [69] 44 Pakistanis-Sindh Pak-S 101 [82]
https://doi.org/10.1371/journal.pone.0192269.t002
Genetic heterogeneity of Arabs
PLOS ONE | https://doi.org/10.1371/journal.pone.0192269 March 9, 2018 7 / 24
Portuguese, Murcians), French, Saudis, Yemenite Jews, and Khuzestani Arabs. The second contains
Eastern Mediterraneans (Greeks, Cretans, Lebanese, Palestinians, and Macedonians),
Berbers of Djerba, Italians, Iraqi Kurds, Iranians, Egyptians, Ashkenazi Jews, and Moroccan
Jews. The last cluster consists of Sub-Saharan populations. Notably, Jordanians,
Bahrainis, and Sudanese fell outside these main groups. Similarly, correspondence analysis
using class I loci (A and B) identified three main clusters (Fig 5). The first cluster contains all
Sub-Saharan Africans along with Sudanese. The second cluster contains Eastern Mediterranean
populations (Albanians, Greeks, Cretans, Lebanese, Palestinians, and Macedonians), Italians,
Iraqi Kurds, Ashkenazi Jews, and Jordanians-A. The last cluster includes North Africans
(Tunisians, Algerians, Moroccans, and Libyans), Iberians (Basques, Spaniards), French, and
Saudis.
Correspondence analysis based on generic DRB1 data and restricted to Arab populations
shows that Arabs cluster into four groups (Fig 6). The first contains the North Africans
(Tunisians, Algerians, Moroccans, and Libyans), Saudis, Yemenis, Kuwaitis, and Khuzestanis
(Iranian Arabs). The second cluster includes the Arabs of the Levant (Palestinians, Jordanians,
Lebanese, Syrians), Egyptians, Iraqi Kurds, and Moroccan Jews. The third group consists of
Table 3. Most frequent HLA-A* and -B* alleles in Arab populations.
HLA-A A*01 A*02 A*03 A*24 A*30 A*68
Population % Population % Population % Population % Population % Population %
Tun-M 15.0 Sau-D 30.4 Ira-k 15.1 Leb-Ar 17.3 Sud 17.6 Sau 10.5
Mor 14.8 Ber-Z 29.3 Leb-Ar 14.0 Gha 15.2 Mor-C 13.0 Tun-M 09.4
Jor-A 14.7 Mor 26.2 Pal 10.7 Ira-k 13.9 Tun-A 11.8 Mor 09.3
Ira-k 13.2 Sud 25.9 Lib 10.3 Sau-B 13.3 Jor 11.5 Alg-K 08.6
Pal 12.5 Emi 25.2 Mor-A 10.0 Jor-A 10.7 Alg-K 10.2 Sud 08.5
Leb-A 12.2 Oma 24.9 Alg-K 09.3 Pal 10.1 Sau-B 10.2 Emi 08.4
Sau-A 12.2 Alg 24.6 Jor-A 09.1 Alg 09.4 Pal 08.4 Lib 08.2
Alg 11.9 Lib 23.5 Emi 09.1 Lib 09.3 Oma-A 07.5 Jor 07.6
Lib 11.5 Jor-A 22.0 Sau-A 08.9 Mor 07.3 Leb-A 06.7 Oma-A 07.1
Oma 07.2 Pal 20.5 Gab 07.7 Oma 06.3 Lib 06.4 Leb-A 05.1
Sud 06.5 Leb-A 18.7 Sud 07.1 Sud 06.1 Emi 05.0 Ira-k 03.8
Emi 06.2 Ira-k 17.0 Oma 06.4 Emi 05.2 Ira-k 03.8 Pal 03.6
HLA-B B*07 B*08 B*35 B*44 B*50 B*51
Population % Population % Population % Population % Population % Population %
Jor 27.1 Oma 11.0 Pal 20.3 Ber-Z 32.8 Sau-D 18.8 Sau-C 19.3
Sau-A 11.7 Sau-B 10.1 Leb-Ar 19.8 Ira-k 10.3 Lib 16.1 Oma 17.5
Mor 09.0 Emi 08.6 Ira-k 15.6 Mor-C 10.2 Ber-Z 15.7 Emi 15.6
Lib 07.7 Gha 08.5 Oma-A 15.3 Pal 09.6 Tun-S 14.2 Ira-k 15.6
Tun-A 07.5 Ira-k 07.2 Jor-A 14.9 Alg 08.8 Mor-C 12.5 Gha 12.2
Alg-K 07.1 Lib 06.4 Emi 11.1 Leb-Ar 08.4 Emi 09.4 Leb-Ar 12.1
Leb-Ar 04.5 Mor-C 06.2 Alg 10.3 Lib 07.6 Jor-A 06.4 Lib 11.1
Ira-k 04.1 Jor 04.7 Lib 10.1 Jor-A 05.6 Pal 05.8 Jor-A 10.3
Oma-A 03.1 Sud 04.0 Tun-M 09.8 Sau-D 03.5 Leb-Ar 05.2 Sud 07.8
Sud 02.8 Alg 03.5 Sau 08.6 Sud 02.3 Alg 05.1 Mor 07.4
Emi 02.4 Leb-Ar 03.0 Mor-C 06.9 Emi 02.3 Oma-A 04.2 Pal 06.4
Pal 01.8 Pal 02.7 Sud 06.1 Oma-A 02.1 Sud 02.5 Alg-K 04.7
Only one population per country is illustrated; the frequencies are ranked from highest to lowest for each allele; to identify the population and country see Table 1
https://doi.org/10.1371/journal.pone.0192269.t003
Table 4. Most frequent HLA-DRB1* and -DQB1* alleles in Arab populations.
HLA-DRB1* 03:01 07:01 11:01 13:01 13:02 15:01
Population % Population % Population % Population % Population % Population %
Tun-B 21.9 Gha 28.6 Leb 36.8 Sud 23.3 Mor-Me 11.1 Alg-B 13.4
Mor-Me 20.2 Jor 26.9 Bah 16.0 Sau-A 10.6 Lib 09.3 Mor-C 12.6
Sau-B 16.5 Sau-B 26.6 Egy-A 13.2 Ber-M 08.0 Sau-A 08.9 Ber-Z 11.4
Ora 15.1 Yem-J 22.1 Gab-A 11.2 Leb-B 06.8 Egy-A 07.4 Jor 09.0
Bah 13.9 Mor-Ag 20.5 Pal 10.0 Alg-B 05.6 Tun-C 06.7 Sau-A 08.9
Sud 13.8 Lib-Y 19.6 Ora 08.6 Lib 05.5 Leb-N 05.0 Bah 07.6
Lib 13.6 Lib 17.0 Sud 08.3 Yem-J 05.4 Ora 04.5 Leb 04.7
Yem-J 12.0 Alg-B 15.9 Jor 08.3 Egy-A 04.6 Yem-J 04.0 Lib 04.2
Leb-B 09.6 Pal 12.7 Lib 05.1 Mor-Me 03.5 Pal 03.9 Pal 03.6
Pal 07.6 Bah 09.0 Sau-A 04.8 Jor 02.1 Jor 00.3 Sud 03.3
Egy-A 07.0 Egy-A 08.3 Yem-J 03.4 Bah 02.1 Sud 00.0 Egy-A 02.5
Jor 02.4 Sud 07.8 Mor-C 02.5 Pal 00.9 Bah 00.0 Yem-J 02.0
HLA-DQB1* 02:0X 03:01 03:02 05:01 06:02 06:03
Population % Population % Population % Population % Population % Population %
Gha 40.1 Leb-NS 45.0 Gha 20.7 Bah 29.2 Mor-C 12.9 Egy-A 10.2
Yem-J 39.1 Ora 35.1 Jor 17.8 Ber-J 22.7 Alg 12.8 Jor 08.3
Mor-Ag 37.8 Lib-J 29.6 Pal 17.6 Leb 20.5 Egy-A 12.7 Ber-J 07.8
Sau-B 37.3 Ber-J 27.4 Leb 16.8 Alg 13.9 Tun-A 12.6 Lib-J 07.4
Jor 35.9 Pal 26.7 Yem-J 14.2 Mor-C 12.3 Jor 10.7 Yem-J 06.1
Lib-J 33.3 Yem-J 19.1 Lib-J 13.0 Pal 11.8 Sau-B 05.1 Ora 04.3
Bah 25.7 Bah 16.0 Alg 12.3 Sau-B 10.1 Pal 04.2 Sau-B 04.1
Ora 24.5 Mor-C 15.4 Mor-C 12.3 Jor 09.3 Leb-Y 03.7 Leb-Y 03.3
Pal 20.9 Egy 11.9 Bah 09.7 Egy-A 08.5 Yem-J 02.0 Mor-C 01.8
Leb-Y 20.0 Jor 10.0 Sau-B 08.9 Yem-J 06.1 Lib-J 00.8 Pal 01.2
Only one population per country is illustrated; the frequencies are ranked from highest to lowest for each allele; to identify the population and country see Table 1
https://doi.org/10.1371/journal.pone.0192269.t004
Fig 3. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing relatedness between
Arabs and other populations using generic HLA-DRB1* and -DQB1* allele frequency data. Population data were
taken from the references detailed in Tables 1 and 2. Bootstrap values from 1,000 replicates are shown.
https://doi.org/10.1371/journal.pone.0192269.g003
Bahrainis, Omanis, Emiratis, and Famoori (Iranian Arabs). The fourth is composed of Sudanese,
Sudanese from Nuba, and Comorians.
Genetic distances. Table 5 illustrates standard genetic distances (SGD) between Arabs
and other populations, using generic DRB1* allele frequencies. North Africans and Iberians
are the closest to Saudis. Moroccans (Agadir, 0.0024), Basques-Ar (0.0057), and Tunisians-S.
Fig 4. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
the relationship between Arabs and other populations according to high-resolution HLA-DRB1* allele frequency
data. Only individuals with defined DRB1* subtypes are considered. Population data were taken from the references
detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g004
Fig 5. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
a global view of the relationship among Arabs and other populations according to generic HLA-A* and -B* allele
frequency data. Population data were taken from the references detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g005
Syrians are genetically close to Eastern Mediterraneans, such as Cretans (-0.0001) and Lebanese
Armenians (0.0050), while Tunisians are close to Western Mediterraneans, such as North Africans
and Iberians, and to Saudis. The populations most related to Tunisians are the other Tunisian
populations (Gabesians, -0.0139), Moroccans (Agadir; -0.0080), and Algerians (-0.0055). Sub-Saharans
such as Congolese (0.0519) and Nigerians (0.0828), and Greeks (0.0836), showed the
closest genetic distances to Comorians. It is noteworthy that the Arab minority in Khuzestan
(Iran) displayed close relatedness with North Africans [such as Gabesians from Tunisia (-0.0086)
and Orans from Algeria] and Saudis (0.0231).
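To make the distance values more concrete, here is a minimal sketch that assumes SGD refers to Nei's standard genetic distance, computed for a single locus as D = -ln(J_XY / sqrt(J_X * J_Y)), where J_X and J_Y are the sums of squared allele frequencies in each population and J_XY is the sum of the products of matching allele frequencies. This is the uncorrected form; the slightly negative values in Table 5 may arise from a sample-size correction that is not reproduced here. The function name and the allele frequencies are hypothetical.

```python
import math


def nei_standard_distance(freq_x, freq_y):
    """Uncorrected Nei standard genetic distance for one locus.

    freq_x, freq_y: dicts mapping allele name -> frequency (each summing to 1).
    """
    alleles = set(freq_x) | set(freq_y)
    # Gene identity within each population
    j_x = sum(freq_x.get(a, 0.0) ** 2 for a in alleles)
    j_y = sum(freq_y.get(a, 0.0) ** 2 for a in alleles)
    # Gene identity between the two populations
    j_xy = sum(freq_x.get(a, 0.0) * freq_y.get(a, 0.0) for a in alleles)
    if j_xy == 0.0:
        # No shared alleles: the distance is unbounded
        return math.inf
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# Hypothetical DRB1 allele frequencies (not taken from the tables)
pop_a = {"03:01": 0.3, "07:01": 0.5, "11:01": 0.2}
pop_b = {"03:01": 0.2, "07:01": 0.3, "11:01": 0.5}

print(round(nei_standard_distance(pop_a, pop_b), 4))
```

Two populations with identical allele frequencies give a distance of zero, and the distance grows as their frequency profiles diverge.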
HLA Class I and Class II haplotypes
HLA-A-B haplotypes. HLA A-B haplotypic data are extremely rare in Arabs. The most
frequent A-B haplotypes in Arabs are shown in Table 6. A*02:01-B*50:01 (9.0%) and
A*02:01-B*44:02/03 (7.5%) were the haplotypes with the highest frequencies, both in Berbers of
Zrawa. A-B haplotype frequencies vary among Arab populations but remain of comparable
magnitude; apart from the Berbers of Zrawa, they did not exceed
5.3% (Gabesians, Tunisia). For example, while A*34:02-B*08:01 and A*29:01-B*45:01
characterize Tunisians, A*01-B*57 (2.9%), A*30-B*18 (1.50%), and A*33:01-B*14:01
(2.50%) characterize Algerians. Several haplotypes identified in Arabs were also seen in
other Mediterraneans. For example, A*32:01-B*40:02 was seen in Greeks (2%) [39] and
Spaniards (0.5%) [41], while A*02:01-B*50:01 was seen in Italians (2%) [68], Portuguese
(3%) [39], and Moroccan Jews (3%) [66]. A*24:02-B*08:01 (4.75%) and A*30:02-B*53:01
(3.48%) were only identified in Saudis.
HLA-DRB1-DQB1 haplotypes. The most frequent DRB1-DQB1 haplotypes with significant
LD in Arabs are listed in Table 7. In general, class II haplotype frequencies are markedly
higher than those of class I haplotypes. DRB1*03:01-DQB1*02:01 was the most frequent
DRB1-DQB1 haplotype in Arabs (Table 7), with frequencies ranging from 3.2% in Lebanese
to 16.60% in Tunisians. DRB1*03:01-DQB1*02:01 is a common class II haplotype in the
Fig 6. Correspondence analysis (bi-dimensional representation), based on the standard genetic distances, showing
the relationship between different Arab populations according to generic HLA-DRB1* allele frequency data.
Population data were taken from the references detailed in Tables 1 and 2.
https://doi.org/10.1371/journal.pone.0192269.g006
Table 5. The closest populations to Arabs using standard genetic distances (SGD) based on HLA-DRB1* alleles.
Saudis-B Emiratis Omanis-A Sudanese
Population SGD Population SGD Population SGD Population SGD
Moroccans-Ag 0.0024 Omanis-A 0.0411 Emirates 0.0411 Nigerians 0.0497
Basques-Ar 0.0057 Bahrain 0.0429 Sardinians 0.0939 Egyptians-A 0.0556
Tunisians-S 0.0124 Sardinians 0.0593 Bahrain 0.1327 Congolese 0.0594
Saudis-C 0.0160 Kuwaitis 0.0688 Kuwait 0.2014 Egyptians 0.0620
Ghanouchians 0.0203 Tunisians-B 0.1169 Famoori Arabs 0.2377 Mandenka 0.0908
Saudis 0.0258 Khuzestanis 0.1213 Macedonians 0.2461 Moroccans 0.0984
Tunisians 0.0272 Tunisians-A 0.1276 Tunisians-B 0.3071 Senegalese 0.1044
Kuwaitis-A 0.0312 Algerians-Oran 0.1371 Khuzestanis 0.3192 Bubi 0.1078
Khuzestanis 0.0349 Algerians-A 0.1407 Greeks-B 0.3197 Palestinians-A 0.1111
Spaniards 0.0354 Algerians-B 0.1612 Tunisians-A 0.3261 Pakistanis-S 0.1122
Saudis-D 0.0374 Algiers 0.1639 Kuwaitis-A 0.3544 Tunisians-A 0.1133
Gabesians 0.0377 Saudis-C 0.1746 Algerians-Oran 0.3600 Libyans 0.1197
Gabesians-A 0.0394 Macedonians 0.1756 Algerians-A 0.3639 Sudanese-Nuba 0.1234
Jordanians 0.0428 Gabesians 0.1820 Greeks-D 0.3657 Algerians-B 0.1315
Algerians-B 0.0433 Saudis-D 0.1820 Algerians-B 0.3867 Berbers-Matmata 0.1317
Basques-B 0.0449 Moroccans-Agadir 0.1830 Greeks-C 0.3927 Algerians-A 0.1407
Saudis-A 0.0450 Kuwaitis-A 0.1837 Turks 0.3944 Berbers-Zrawa 0.1409
Algerians-A 0.0497 Famoori Arabs 0.1894 Saudis-C 0.3984 Gabesians 0.1413
Tunisians-C 0.0533 Moroccans-A 0.1900 Algiers 0.4027 Jordanians-A 0.1434
Yemenite-J 0.0536 Gabesians-A 0.1908 Albanians 0.4034 Gabesians-A 0.1442
Khuzestanis Tunisians Syrians-A Comorians
Population SGD Population SGD Population SGD Population SGD
Gabesians -0.0086 Gabesians -0.0139 Cretans -0.0001 Congolese 0.0519
Orans -0.0074 Gabesians-A -0.0081 Lebanese-Ar 0.0050 Nigerians 0.0828
Gabesians-A -0.0025 Moroccans-Agadir -0.0080 Syrians 0.0076 Greeks-A 0.0836
Algerians-A -0.0015 Southern Tunisians -0.0062 Iranians-Kurd 0.0100 Gabonese 0.0904
Moroccans-Ag 0.0106 Algerians-A -0.0055 Lebanese-A 0.0149 Iranians-A 0.0947
Tunisians-S 0.0140 Moroccans-A 0.0010 Lebanese-Y 0.0151 Egyptians-A 0.1090
Tunisians 0.0161 Algerians-B 0.0019 Iranians 0.0159 Iranians 0.1184
Tunisians-C 0.0195 Berbers-Zrawa 0.0027 Lebanese-B 0.0161 Italians 0.1222
Yemenite-J 0.0217 Libyans 0.0028 Iranians-Azeri 0.0185 Iranians-Azeri 0.1394
Tunisians-M 0.0225 Algerians-Oran 0.0033 Turks 0.0192 Iranians-Kurd 0.1418
Saudis-C 0.0231 Tunisians-M 0.0038 Iraq Kurdistan 0.0198 Ashkenazi-Jews 0.0222
Spaniards 0.0291 Saudis-C 0.0061 Ashkenazi-Jews 0.0222 Turks 0.1428
Saudis 0.0324 Tunisians-C 0.0083 Iranians-A 0.0223 Syrians 0.1470
Saudis-B 0.0349 Algiers 0.0103 Palestinians-A 0.0228 Cretans 0.1483
Algerians-B 0.0353 Berbers-Matmata 0.0106 Italians 0.0241 Egyptians 0.1483
Tunisians-B 0.0422 Moroccans-Chaouya 0.0111 Turks-A 0.0288 Greeks-C 0.1487
Indians-Delhi 0.0454 Spaniards 0.0126 Lebanese 0.0320 Palestinians-A 0.1559
Algiers 0.0461 Moroccans 0.0144 Jordanians-A 0.0355 Iraq Kurdistan 0.1564
Basques-Ar 0.0471 Saudis-D 0.0159 Lebanese-KZ 0.0368 Greeks-D 0.1594
Libyans 0.0485 Khuzestani Arabs 0.0161 Greeks-A 0.0407 Syrians-A 0.1617
(0.0124) had the closest genetic distances to Saudis, while Emiratis were closely related to Omanis (0.0411), Bahrainis (0.0429), Sardinians (0.0593), and Kuwaitis
(0.0688). On the other hand, Sudanese are related to Sub-Saharans, including Nigerians (0.0497) and Congolese (0.0594), and to Egyptians (0.0556).
https://doi.org/10.1371/journal.pone.0192269.t005
Mediterranean basin, and is frequent among Basques (17.5%) [41], Moroccans (17.3%) [25],
Algerians (11.3%) [67], and Cretans (7.4%) [69]. DRB1*07:01-DQB1*02:02 is also
frequent in Arabs, such as Moroccans (16.70%), and is reportedly common in Spaniards
(17.3%) [41] and Moroccans (12.6%) [25], but rare in Southern Tunisians (Gabesians; 2.10%).
In addition, DRB1*07:01-DQB1*02:01 is a common DRB1-DQB1 haplotype, with a
frequency exceeding 4% in several Arab populations.
Table 6. Most frequent (%) HLA Class I (A-B) two-locus haplotypes with significant linkage disequilibrium (P<0.05) in Arabs.
A-B haplotype Tun Saudi-B Alg Mor-Ch Mor-a Ber-Z Lib Gab
01:01–50:01 - - - 04.10 - - - -
01–57 - - 02.90 - - - - -
02:01–07:02 - - - - - - 02.97 -
02:01–44:02/03 03.86 - - 02.10a 02.95c 07.50b - 05.26
02:01–50:01 03.30 - - - 01.99d 09.01 - -
02:01–51:01 - 04.66 - 03.40 01.62f - - -
23:01–50:01 - 04.90 - - - - 02.97 -
24:02–08:01 - 04.75 - - - - - -
29:01–45:01 01.79 - - - - - - 02.10
29:02–44:03 - - - 02.70 - - - -
30–18 - - 01.50 - 02.60 03.00 - -
30:02–53:01 - 03.48 - - - - - -
32:01–40:02 00.80 - - - - 05.66 - -
33:01–14:01 - - 02.50 - 01.86e 01.41 - -
34:02–08:01 02.12 - - - - 06.11 - 02.10
a02:01–44:02.
b02:01–44.
c02-44.
d02-50.
e33-14.
f02-51.
https://doi.org/10.1371/journal.pone.0192269.t006
Table 7. Most frequent (%) HLA Class II (DRB1-DQB1) two-locus haplotypes with significant linkage disequilibrium (P<0.05) in Arabs.
HLA-DRB1-DQB1 Tun Sau-B Mor-Ch Bah Leb Alg Lib-J Yem-J Ber-Z Ber-J
01:02–05:01 02.40 02.85 - - - 08.00 02.10 0.70 09.85 04.50
07:01–02:02 14.80 12.32 16.70 - - - 24.70a 22.10a 16.03 -
03:01–02:01 16.60 13.56 12.30 12.02 03.21 11.30 05.60a 12.00a 11.26 -
10:01–05:01 03.80 03.80 - 01.35 04.90 00.30 00.80 04.00 01.41 03.30
07:01–02:01 - - - 09.38 04.20 09.90 - - - 11.00
15:01–06:02 07.80 03.80 08.90 - - 09.90 - - 11.26 02.00
04:02–03:02 02.60 - 06.20 - - 04.20 03.00 07.50 05.15 -
13:01–06:03 02.40 - - - - 03.30 07.70 05.40 05.63 01.80
16:01–05:01 - - - 13.18 03.79 - - - - -
04:01–03:02 - - - 02.78 14.16 - - - - -
11:01–03:01 07.20b 02.22 - 11.98 31.42 04.70 09.30 03.40 07.00b 03.20
aDQB1*02
b11:01/04-03:01
https://doi.org/10.1371/journal.pone.0192269.t007
In addition, DRB1*16:01-DQB1*05:01 and DRB1*04:01-DQB1*03:02, rare in neighboring
populations and Mediterraneans, were identified only in Lebanese and Bahraini Arabs. The
high frequency of the DRB1*11:01-DQB1*03:01 haplotype among Lebanese (31.42%) is noteworthy,
since it is the highest in all populations studied, whereas the haplotype is rare in Saudis (2.2%). Furthermore,
DRB1*11:01/04-DQB1*03:01, identified in Arabs, is also found in Cretans (18.5%) [69] and
Basques (3.1%) [41], while DRB1*01:02-DQB1*05:01 was seen in Spaniards (6.30%) [41]. Varied
frequencies of DRB1*13:01-DQB1*06:03 were also reported for Spaniards (13.23%) [86], Cretans
(3.3%) [69], and Germans (10.8%) [87]. Likewise, DRB1*15:01-DQB1*06:02 was observed
in Cretans (2.6%) [65], Germans (25.2%) [87], and the population of Southern Ireland (14.90%) [23].
HLA class I and class II extended haplotypes. Table 8 shows the most frequent extended
haplotypes in Arab populations, and their likely origins. The systematic review did not reveal
haplotypes shared across Arab populations, owing to partial presentation of haplotypic data,
disparity in the level of typing resolution, variability of the studied loci, and lack of data. In addition,
Arab populations share their frequent extended haplotypes with several European,
especially Mediterranean, and Asian populations (Table 8). Furthermore, the possible origins
of the most frequent extended haplotypes among Arabs are mainly European, Asian, or
autochthonous.
Table 8. The most frequent (%) HLA extended haplotypes in Arabs.
HLA extended haplotypes Arab populations [references] Possible origin
A*02:01-B*50:01-DRB1*07:01-DQB1*02:02 (a) Southern Tunisians (3.2%) [62], Berbers of Zrawa (8.12%) [24] Euro-Asiatic
A*02:01-B*44-DRB1*04:02-DQB1*03:02 (b) Berbers of Zrawa (6.5%) [24], Tunisians (0.6%) [61] Western European
A*24:02-B*08:01-C*07:02-DRB1*03:01 (c) Saudis (3.16%) [49] Euro-Asiatic
A*23:01-B*50:01-C*06:02-DRB1*07:01 Saudis (3.16%) [49] Autochthonous
A*33-C*8-B*14-DRB1*01:02-DQA1*01:01-DQB1*05:01 (d) Algerians (1.5%) [88] Mediterranean
A*30-C*5-B*18-DRB1*03:01-DQA1*05:01-DQB1*02:01 (e) Algerians (1.5%) [88] Iberian-paleo-North African
A*02:01-C*06:02-B*50:01-DRB1*07:01-DQA1*02:01-DQB1*02:02 (f) Moroccans (2.9%) [65] Euro-Asiatic
A*01:01-C*06:02-B*50:01-DRB1*03:01-DQA1*05:01-DQB1*02:01 (g) Moroccans (2.9%) [65] Mediterranean
A*30-B*07-DRB1*03-DQA1*05:01-DQB1*02:01 (h) Jordanians (1.38%) [31] Euro-Asiatic
A*1-B*8-DRB1*03-DQA1*05:01-DQB1*02:01 (i) Jordanians (1.03%) [31] Pan-European
A*02:01-B*50:01-DRB1*07:01 (j) Libyans (4.24%) [32], Tunisians (1.8%) [60], and Ghannouch (2.5%) [33] North African
A*11:01-B*52:01-DRB1*15:02 (k) Libyans (2.54%) [32]; Yemen Jews (0.93%) [23] Mediterranean
A*69-B*49-DRB1*04:03-DQB1*03:02 Palestinians (2.4%) [29] Autochthonous
A*24-B*18-DRB1*11:04-DQB1*03:01 (l) Palestinians (1.8%) [29] Central-South-Eurasian
(a) present in Spaniards (1.2%) [41], Turks (1.3%) [79], Italians (0.5%) [68], and Moroccan Jews (2%) [66].
(b) also found in British (2.6%), Cornish (7.9%), Danes (2%) [39], Italians (0.9%) [68], Spaniards (0.6%) [41], Spanish Basques (1.9%), Pasiegos (3.3%), Cabuemigos (2.2%) [77], and Portuguese (3.1%) [39].
(c) present at low frequencies in the Euro-Asian minorities of Germany [23].
(d) found in Armenians (0.031), Sardinians (0.027), French (0.014), Greeks (0.011), and Italians (0.007) [68].
(e) also found in Sardinians (11.4%), and French-Basques (4.7%) [68].
(f) present also in Mongolians [68], Turks [79].
(g) found in Spaniards, Italians, and north Africans [65].
(h) present in Cornish (0.084), British (3.3%), and Danes (3.8%) [68].
(i) present in Basques (5%), Spaniards (3.4%) [41], Macedonians (4.9%) [78], Yugoslavians (7.7%), British (2.9%), and Germans (4.8%) [68].
(j) found in Poland Jews (1.15%); Ashkenazi Jews (0.92%) [23].
(k) present in Ashkenazi Jews (1.05%) [23].
(l) found in Armenians (2.1%) and Italians (0.7%) [23].
https://doi.org/10.1371/journal.pone.0192269.t008
Discussion
This meta-analysis is the first genetic anthropology study in the MENA region; it included 100
populations from 36 Arab and neighbouring countries, comprising in excess of 16,000
individuals. A main outcome of the study is the lack of striking differences in the distribution
of HLA alleles and haplotypes between North Africans and Arabian Peninsula populations. In
contrast, key differences were noted between Levant Arabs (Lebanese, Palestinians, Syrians)
and other Arab populations, highlighted by high frequencies of A*24, B*35, DRB1*11:01,
DQB1*03:01, and the DRB1*11:01-DQB1*03:01 haplotype in Levantine Arabs compared to other
Arab populations. Class I haplotype frequencies are lower than class II haplotype frequencies because
LD between the A and B loci is weaker than LD between the DRB1 and DQB1 loci, owing to the
longer physical distance separating A and B. The identification of shared haplotypes between Arabs and other
Mediterranean and Asian populations is attributed to the higher admixture of Mediterraneans and
Asians in Arab populations.
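The link between haplotype frequency and LD can be sketched with the usual two-locus coefficients: D = p(AB) - p(A) * p(B), and Lewontin's D', which rescales D by its maximum attainable value. The example below is illustrative only. It pairs the A*02:01-B*50:01 haplotype frequency reported for Berbers of Zrawa in Table 6 with the generic A*02 and B*50 allele frequencies listed for the same population in Table 3, a simplifying assumption because the typing resolutions differ. The function name is hypothetical.

```python
def linkage_disequilibrium(p_hap, p_a, p_b):
    """Return (D, D') for a two-locus haplotype.

    p_hap: observed haplotype frequency
    p_a:   frequency of the allele at the first locus
    p_b:   frequency of the allele at the second locus
    """
    # Deviation of the observed haplotype frequency from random association
    d = p_hap - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # Normalised coefficient; defined as 0 when an allele is fixed or absent
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Illustrative inputs (assumption: generic allele frequencies stand in for
# the high-resolution alleles of the haplotype)
d, d_prime = linkage_disequilibrium(p_hap=0.0901, p_a=0.293, p_b=0.157)
print(round(d, 4), round(d_prime, 3))
```

A haplotype observed no more often than the product of its allele frequencies gives D close to zero, which is the situation expected for loosely linked loci.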
Iberians, North Africans, and Arabian Peninsula inhabitants
The relatedness between North Africans and Iberians was previously discussed [29, 59–62, 69,
78, 79, 86, 88]. Using correspondence analysis, NJ trees, and genetic distances, our results show
that North Africans are genetically close to Iberians, which is supported by historical events.
First, this relatedness is attributed to the Berber migration from the African Sahara northwards
in 10000–4000 BC, because of hyper-arid conditions [69]. It may also be explained by the similar
history of Iberians and North Africans, both of whom were invaded by Phoenicians,
Romans, Germans, and Muslim Arabs [89]. The respective invading armies were genetically
mixed; indeed, most of their soldiers were mercenaries recruited in recent conquests, as in the
case of the Phoenicians [90], and the Muslims who invaded Iberia had troops that were mostly Berbers.
The invasion of Iberia by Muslims in the 8th century AD may have contributed to the relatedness
between North Africans and Iberians for two reasons: first, most of the Muslim invaders' recruits
were North African Berbers; second, the Muslims remained settled in Iberia for eight centuries.
However, more ancient and continuous gene exchange between Iberia and North Africa since
prehistoric times may have been the main source of exchange
[86]; massive mixed marriages and breeding across religious Iberian groups under Muslim
rule are not documented.
The analyses performed showed that current North Africans are closely related to Tunisian
(Zrawa and Matmata) and Moroccan (Sousse-Agadir and Eljadida) Berbers, suggesting that
North Africans have a Berber genetic profile. In contrast, North Africans displayed a
greater distance from the Arabs of the Levant (Palestinians, Syrians, Lebanese, and Jordanians),
indicating a low genetic contribution of the Phoenician and Levant Arab invasions of North Africa.
These observations based on HLA markers prompted the conclusion that all Berbers of North
Africa constitute a homogeneous genetic unit, except for small isolates, such as the Berbers of
Djerba, who display a Berber genetic profile.
Saudi populations used in this study originated from Eastern Saudi Arabia, especially from
Riyadh province. There are no reliable HLA data on Eastern Saudi Arabia that shed light on pre-Islamic
history; some ancient people may have originated from old Persians, but quantification
is difficult and remains undetermined [91]. Genetic heterogeneity between Eastern and Western
Saudi Arabia is quite possible, and should be taken into account in further interpretation. All
analyses performed here, using HLA-A, -B, -DRB1, and -DQB1 markers, support the notion that
Saudis, along with Kuwaitis and Yemenis, are closely related to North Africans.
The most plausible explanation for the clustering of West Arabia and Yemen with Iberians/North
Africans is a possible massive migration in all directions that occurred when the Sahara underwent
desiccation [92, 93]. The cultural and linguistic relatedness of many Mediterranean
languages, including old Iberian and Basque [92], with the Berber language is concordant with our
genetic findings and the Saharan origin hypothesis; part of the Arabian Peninsula's inhabitants
(including those of Yemen) may also have been reached by Saharan people. In fact, Malika Hachid, who has
been studying Saharan and North African archaeology, culture, and the rock painting/writing of
prehistoric Sahara, even suggests that the first known writing alphabet originated in the Sahara.
Proto-Berber rock-writing characters, very similar to present-day Berber scripts, have been used.
This Proto-Berber language could have appeared 5,000 years BC [94, 95].
The HLA genetic similarity of Kuwaitis to this group is more difficult to explain,
but interaction between the Arabian Peninsula and Mesopotamia through the strategic Kuwait
area has been documented since 6,500 years BC (Ubaid period) [96].
Arabs of Levant
Using genetic distances, correspondence analysis, and NJ trees, we showed earlier [61, 62] and
in this study that Palestinians, Syrians, Lebanese, and Jordanians are closely related to each
other and to Eastern Mediterranean Europeans (Turks, Cretans, Greeks), Egyptians, and Iranians,
as confirmed by analysis of HLA class I (A, B) and class II (DRB1 and DQB1) markers.
However, Levant Arabs are distant from North African Arabs (Tunisians, Algerians, Moroccans,
and Libyans) and Iberians (Basques, Spaniards). The strong relatedness between Levant
Arab populations is explained by their common ancestry, the ancient Canaanites, who came
either from Africa or the Arabian Peninsula via Egypt in 3300 BC [97], and settled in the Levant
lowlands after the collapse of the Ghassulian civilization in 3800–3350 BC [98]. The relatedness is also
attributed to close geographical proximity within a region that constituted one territory before 19th century
British and French colonization.
The close relatedness of Levant Arabs to Egyptians, as confirmed by genetic distances using
HLA markers, may be due to three reasons. First, Egypt is a neighbor of the Levant Arab countries,
and historically part of the Levant. Second, the Egyptians invaded the Levant several times
throughout history; the most significant was the 1468 BC invasion, after which they settled for 12 centuries
[99]. Third, the Canaanites, the likely ancestors of Levant Arabs, may have originated
from Africa through Egypt, where they settled for a long period, suggesting likely admixture
between Canaanites and Egyptians.
Historically, the Levant is a wider region that included countries along the Eastern Mediterranean
with its islands, and extended from Greece to Cyrenaica [100]. Broadly, the Levant was historically
characterized by high migratory flow between its sub-regions in all directions. For
example, the present-day Levant, comprising Palestine, Lebanon, Syria, and Jordan, has undergone
successive invasions by populations originating from the Great Levant, including Egyptians
(1468 BC), Horites, Amorites, Hittites (Turks), Greeks (1200 BC), Assyrians (1090 BC) [99],
and more recently the Ottomans. This has favored admixture, reduced genetic distances, and homogenized
Great Levant populations, thus explaining the close relatedness of Levant Arabs to Eastern
Mediterranean populations. On the other hand, Levant Arabs are distant from Saudis,
Kuwaitis, and Yemenis, an indication that the contribution of the Arabian Peninsula populations
to the Levantine gene pool is low, probably because the 7th century invasion lacked a substantial
demographic component.
Sudanese and Comorians
Sudanese are close to sub-Saharan Africans (Nigerians, Congolese, and Senegalese) and to North
Africans, in particular Egyptians, suggesting that the genetic profile of Sudanese results from admixture
between North Africans (especially Egyptians) and sub-Saharan Africans throughout
history. The close relatedness of Sudanese to sub-Saharan Africans suggests a reduced genetic
effect of Arabs on Sudanese. Likewise, the Comorians (the Comoros Islands officially joined the Arab
League in 1993) are close to sub-Saharan Africans (Congolese, Nigerians, and Gabonese)
[43], Egyptians, Iranians, and Eastern Mediterraneans. This suggests high admixture
between populations belonging to three continents in the Comoro Islands, which can be
explained by their geographical position as a corridor for international trade.
Bahrainis, Emiratis, and Omanis
Bahrainis, Emiratis, and Omanis are geographically close populations, which explains their
genetic relationship as demonstrated in this study. These three populations tend to form a heterogeneous
group with Pakistanis, Indians, Iranian Arabs (Famoori), Sardinians (the latter
probably close to Iberians/North Africans but behaving as an outlier group in the analyses because
they are a genetic island isolate), Egyptians, and some sub-Saharan Africans, such as Congolese.
These populations appear close to certain Eastern Mediterranean populations, including
Greeks and Macedonians, and to more distant ones, in particular North Africans, hence explaining their
intermediate grouping and distinction from the two main clusters. Collectively, this suggests high
admixture in these populations, brought about by their commercially important position. Sardinia
is a relative genetic isolate "founded" by the Iberian Norax/Nora (first documented Sardinian
capital, close to Cagliari), and Iberians/North Africans may be genetically related to
Sardinians (the A*30-B*18-Cw*5 basic HLA haplotype is very frequent in Sardinia, Iberia, and North
Africa) [93].
Minorities of Arab World
Ethnic minorities. The Kurds and Berbers are the two major ethnic minorities in the Arab
world. Berbers are an indigenous North African ethnic group found over a vast area stretching
from the Atlantic Ocean to the Siwa Oasis in Egypt, and from the Mediterranean Sea to the Niger River. Berbers
number about 20 million people, and constitute 40–45% of Moroccans, 20–25% of Algerians,
and 2–7% of the populations of both Libya and Tunisia. The Kurds live in the northern regions of Iraq (15–20%)
and Syria (10%). They constitute an Indo-European ethnic group, and speak Kurdish. Smaller
minorities include Armenians, Nubians, Assyrians, and Turkmen [99].
The Berber populations used in this work are closely linked to each other, as well as to present-day
North Africans and to Western Mediterranean populations, especially Iberians. Indeed,
the Moroccan Berbers are not genetically different from current Moroccans, nor from
neighboring populations, such as Algerians and Tunisians. This also applies to Tunisian Berbers,
except those of the island of Djerba, who appear to be related to Eastern Mediterranean populations,
including Levant Arabs. This suggests that North African Berbers are in close genetic
harmony with the surrounding populations, and that the differences between them are cultural rather than
genetic, owing to the 7th century Arabization of the region.
Clustering and genetic distance analyses demonstrated that Iraqi and Iranian Kurds are not
genetically different from Iranians or neighboring populations, including Levantine Arabs, and are
close to Turks and other Eastern Mediterranean populations. This suggests that Kurds originate
from the region and are in genetic harmony with neighboring populations, despite the clear
cultural differences. It also suggests that Kurds, Syrians, Jordanians, Palestinians, Iraqis,
Lebanese, and Iranians probably share the same genetic profile, with few differences. Accordingly,
our findings confirm the results of an earlier study by Arnaiz-Villena et al. on Iraqi Kurds [54].
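The genetic distances referred to in this section can be illustrated with a minimal sketch of Nei's standard genetic distance [20] for a single locus, D = −ln(J_XY / √(J_X · J_Y)), where J_X and J_Y are the sums of squared allele frequencies in populations X and Y, and J_XY is the sum of the products of the frequencies of shared alleles. The function name and the allele frequencies below are hypothetical and serve only as an illustration; they are not data from this study, and the sketch is not the software used for the analyses reported here.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance for one locus.

    freqs_x, freqs_y: dicts mapping allele name -> frequency in each
    population (each dict is assumed to sum to 1).
    """
    alleles = set(freqs_x) | set(freqs_y)
    # J_XY: probability that two alleles, one drawn from each population, are identical
    j_xy = sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)
    # J_X, J_Y: within-population probabilities of identity
    j_x = sum(p * p for p in freqs_x.values())
    j_y = sum(p * p for p in freqs_y.values())
    identity = j_xy / math.sqrt(j_x * j_y)
    if identity == 0.0:
        # No shared alleles: the distance is unbounded
        return math.inf
    return -math.log(identity)


# Hypothetical HLA-DRB1 allele frequencies (illustration only, not study data)
pop_a = {"DRB1*03": 0.20, "DRB1*07": 0.30, "DRB1*11": 0.50}
pop_b = {"DRB1*03": 0.25, "DRB1*07": 0.25, "DRB1*11": 0.50}
pop_c = {"DRB1*03": 0.60, "DRB1*07": 0.10, "DRB1*15": 0.30}

d_ab = nei_standard_distance(pop_a, pop_b)
d_ac = nei_standard_distance(pop_a, pop_c)
print(f"D(A, B) = {d_ab:.4f}")
print(f"D(A, C) = {d_ac:.4f}")
```

In this toy example, populations A and B carry the same alleles at similar frequencies and therefore lie much closer to each other than A and C do. With several loci, J_X, J_Y and J_XY are usually averaged across loci before taking the logarithm, and the resulting pairwise distance matrix is the kind of input a neighbor-joining procedure works from.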
Religious minorities. Sunni Muslims constitute the majority (80%) of Arab populations,
followed by Shi’a Muslims (10%) who are present in parts of Iraq, Lebanon, Saudi Arabia,
Kuwait, Yemen, and Bahrain. Non-Muslims make up about 10% of all Arabs, and Christianity
(6%) is the second largest religion among Arabs, with about 20 million Christians living in
Lebanon, Egypt, Iraq, Syria, and Jordan. Other minor religions (4%), such as Judaism and the
Druze faith, are practiced on a much smaller scale [99].
HLA data on Sunni and Shiite Arabs are not available, nor are data comparing Muslims with
Christians. The only available data concern Arab Jews. In this study, data are available for
three Jewish populations, two from North Africa (Moroccan and Libyan Jews) and one from the
Arabian Peninsula (Yemenite Jews). While the genetic distances separating these three groups of
Jews are small (S1 Table), genetic heterogeneity between these Jewish populations was noted. For
example, Yemenite Jews are related to Western Mediterranean populations, including North Africans
and Iberians, while Libyan Jews are related to Eastern Mediterraneans, including Levantine Arabs.
The relatedness of Moroccan Jews to other communities depends on the HLA loci studied; they
associate with Eastern Mediterraneans using DRB1, but group with Eastern Mediterraneans when the
other markers are used.
Conclusion
This study supports the notion that Arabs are divided into four groups. The first consists
of North Africans (Algerians, Tunisians, Moroccans, and Libyans), Saudis, Kuwaitis, and
Yemenis, who show relatedness to Western Mediterraneans, including Iberians. The second
includes Levantine Arabs (Palestinians, Jordanians, Lebanese, and Syrians), Iraqis, and
Egyptians, who appear to be related to Eastern Mediterraneans and Iranians, who in
turn belonged to the historically described ’Great Levant’. The third consists of Sudanese and
Comorians, who associate with sub-Saharan Africans. Finally, the fourth group of Arabs
comprises Omanis, Emiratis, and Bahrainis. This group associates with heterogeneous
populations (Mediterranean, Asian, and sub-Saharan). Lastly, the two main indigenous
minorities, Berbers and Kurds, are not genetically different from the ‘host’ and neighboring
populations.
Supporting information
S1 Checklist. PRISMA 2009 checklist.
(DOC)
S1 Fig. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing
relatedness between Arabs and other populations using generic HLA-DRB1* allele frequency
data. Population data were taken from the references detailed in Tables 1 and 2.
Bootstrap values from 1,000 replicates are shown.
(TIF)
S2 Fig. Neighbor-Joining dendrograms, based on standard genetic distances (SGD), showing
relatedness between Arabs and other populations using generic HLA-B* allele frequency
data. Population data were taken from the references detailed in Tables 1 and 2. Bootstrap
values from 1,000 replicates are shown.
(TIF)
S1 Table. Genetic distances between three groups of Arab Jews based on HLA-DRB1 and
-DQB1 allele frequencies.
(DOC)
Author Contributions
Conceptualization: Abdelhafidh Hajjej, Lasmar Hattab, Slama Hmida.
Formal analysis: Abdelhafidh Hajjej, Slama Hmida.
Investigation: Abdelhafidh Hajjej.
Methodology: Wassim Y. Almawi, Slama Hmida.
Software: Abdelhafidh Hajjej, Lasmar Hattab.
Supervision: Slama Hmida.
Validation: Abdelhafidh Hajjej, Wassim Y. Almawi, Antonio Arnaiz-Villena, Lasmar Hattab,
Slama Hmida.
Writing – original draft: Abdelhafidh Hajjej.
Writing – review & editing: Wassim Y. Almawi, Antonio Arnaiz-Villena.
References
1. HLA allele database: http://hla.alleles.org (last accessed on September 17, 2017)
2. Hudson RR. Analysis of population subdivision. In: Balding DJ, Bishop M, Cannings C (Eds). Handbook of statistical genetics. Chichester, UK: John Wiley & Sons, 2001; pp. 309–324.
3. Takezaki N, Nei M. Empirical tests of the reliability of phylogenetic trees constructed with microsatellite DNA. Genetics. 2008; 178(1): 385–92. https://doi.org/10.1534/genetics.107.081505 PMID: 18202381
4. Nei M. Phylogenetic analysis in molecular evolutionary genetics. Annual Review of Genetics. 1996; 30: 371–403. https://doi.org/10.1146/annurev.genet.30.1.371 PMID: 8982459
5. Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proceedings of the National Academy of Sciences USA. 2004; 101(30): 11030–5. https://doi.org/10.1073/pnas.0404206101 PMID: 15258291
6. Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution. 1995; 12(5): 921–7. https://doi.org/10.1093/oxfordjournals.molbev.a040269 PMID: 7476138
7. The World Factbook: https://www.cia.gov/library/publications/the-world-factbook
8. Bengio O, Ben-Dor G. Minorities and the State in the Arab World. Lynne Rienner Publishers, 1999; 224 pages.
9. Encyclopædia Britannica, Himyar: https://www.britannica.com/topic/Himyar
10. Korotayev A. Ancient Yemen. Oxford: Oxford University Press, 1995.
11. Korotayev A. Pre-Islamic Yemen. Wiesbaden: Harrassowitz Verlag, 1996.
12. Munro-Hay SC. Aksum: An African Civilization of Late Antiquity. Edinburgh: Edinburgh University Press, 1991.
13. Robin CJ. Arabia and Ethiopia. In: Johnson S (ed.) The Oxford Handbook of Late Antiquity. Oxford University Press, 2012; pp. 247–333, p. 279.
14. Hoyland R. Arabia and the Arabs: From the Bronze Age to the Coming of Islam. Routledge, 2001; p. 51.
15. Wikipedia, History of Islam: https://en.wikipedia.org/wiki/History_of_Islam
16. Hourani A. A History of the Arab Peoples. Harvard University Press, 2002; pp. 15–19. ISBN 9780674010178.
17. Moher D, Liberati A, Tetzlaff J, Altman DG, and PRISMA Group, “Reprint—preferred reporting
items for systematic reviews and meta-analyses: the PRISMA statement”. Physical Therapy. 2009;
89(9): 873–80. https://doi.org/10.1093/ptj/89.9.873 PMID: 19723669
18. Young FW, Bann CM. A visual statistics system. In: Stine RA, Fox J, eds. Statistical computing environments for social research. New York: Sage Publications. 1996; 207–36.
19. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution. 1987; 4(4): 406–425. https://doi.org/10.1093/oxfordjournals.molbev.a040454 PMID: 3447015
20. Nei M. Genetic distances between populations. The American Naturalist. 1972; 106: 283. http://jstor.org/stable/2459777
21. Nei M. Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences USA. 1973; 70(12): 3321–3. PMID: 4519626
22. Nei M, Tajima F, Tateno Y. Accuracy of estimated phylogenetic trees from molecular data. II. Gene frequency data. Journal of Molecular Evolution. 1983; 19(2): 153–70. https://doi.org/10.1007/BF02300753 PMID: 6571220
23. Database of allele frequencies: http://www.allelefrequencies.net, 2017
24. Hajjej A, Sellami MH, Kaabi H, Hajjej G, El-Gaaied A, Boukef K, et al. HLA class I and class II polymorphisms in Tunisian Berbers. Annals of Human Biology. 2011; 38(2): 156–64. https://doi.org/10.3109/03014460.2010.504195 PMID: 20666704
25. Gomez-Casado E, del Moral P, Martinez-Laso J, García-Gómez A, Allende L, Silvera-Redondo C, et al. HLA genes in Arabic-Speaking Moroccans: close relatedness to Berbers and Iberians. Tissue Antigens. 2000; 55(3): 239–49. https://doi.org/10.1034/j.1399-0039.2000.550307.x PMID: 10777099
26. Mahfoudh N, Ayadi I, Kamoun A, Ammar R, Mallek B, Maalej L, et al. Analysis of HLA-A, -B, -C, -DR, -DQ polymorphisms in the South Tunisian population and a comparison with other populations. Annals of Human Biology. 2013; 40(1): 41–7. https://doi.org/10.3109/03014460.2012.734334 PMID: 23095049
27. Matevosyan L, Chattopadhyay S, Madelian V, Avagyan S, Nazaretyan M, Hyussian A, et al. HLA-A, HLA-B, and HLA-DRB1 allele distribution in a large Armenian population sample. Tissue Antigens. 2011; 78(1): 21–30. https://doi.org/10.1111/j.1399-0039.2011.01668.x PMID: 21501120
28. Hamdi NM, Al-Hababi FH, Eid AE. HLA class I and class II associations with ESRD in Saudi Arabian population. PLoS One. 2014 Nov 7; 9(11): e111403. https://doi.org/10.1371/journal.pone.0111403 PMID: 25380295
29. Arnaiz-Villena A, Elaiwa N, Silvera C, Rostom A, Moscoso J, Gómez-Casado E, et al. The origin of Palestinians and their genetic relatedness with other Mediterranean populations. Retraction in: Suciu-Foca N, Lewis R. Human Immunology. 2001; 62(9): 889–900. (Accessed on https://commons.wikimedia.org/wiki/File:Palestinians_hla) PMID: 11543891
30. Albalushi KR, Sellami MH, Alriyami H, Varghese M, Boukef MK, Hmida S. The Investigation of the Evolutionary History of the Omani Population by Analysis of HLA Class I Polymorphism. Anthropologist. 2014; 18(1): 205–210
31. Sánchez-Velasco P, Karadsheh NS, García-Martín A, Ruíz de Alegría C, Leyva-Cobián F. Molecular analysis of HLA allelic frequencies and haplotypes in Jordanians and comparison with other related populations. Human Immunology. 2001; 62(9): 901–9. https://doi.org/10.1016/S0198-8859(01)00289-0 PMID: 11543892
32. Galgani A, Mancino G, Martínez-Labarga C, Cicconi R, Mattei M, Amicosante M, et al. HLA-A, -B and -DRB1 allele frequencies in Cyrenaica population (Libya) and genetic relationships with other populations. Human Immunology. 2013; 74(1): 52–9. https://doi.org/10.1016/j.humimm.2012.10.001 PMID: 23079236
33. Hajjej A, Hmida S, Kaabi H, Dridi A, Jridi A, El Gaaled A, et al. HLA genes in Southern Tunisians (Ghannouch area) and their relationship with other Mediterraneans. European Journal of Medical Genetics. 2006; 49(1): 43–56. https://doi.org/10.1016/j.ejmg.2005.01.001 PMID: 16473309
34. Hmida S, Gauthier A, Dridi A, Quillivic F, Genetet B, Boukef K, et al. HLA class II gene polymorphism in Tunisians. Tissue Antigens. 1995; 45(1): 63–8. https://doi.org/10.1111/j.1399-0039.1995.tb02416.x PMID: 7725313
35. Almawi WY, Busson M, Tamim H, Al-Harbi EM, Finan RR, Wakim-Ghorayeb SF, et al. HLA class II profile and distribution of HLA-DRB1 and HLA-DQB1 alleles and haplotypes among Lebanese and Bahraini Arabs. Clinical and Diagnostic Laboratory Immunology. 2004; 11(4): 770–4. https://doi.org/10.1128/CDLI.11.4.770-774.2004 PMID: 15242955
36. Amar A, Kwon OJ, Motro U, Witt CS, Bonne-Tamir B, Gabison R, et al. Molecular analysis of HLA class II polymorphisms among different ethnic groups in Israel. Human Immunology. 1999; 60(8): 723–30. https://doi.org/10.1016/S0198-8859(99)00043-9 PMID: 10439318
37. Izaabel H, Garchon HJ, Caillat-Zucman S, Beaurain G, Akhayat O, Bach JF, et al. HLA class II DNA polymorphism in a Moroccan population from the Souss, Agadir area. Tissue Antigens. 1998; 51(1): 106–10. https://doi.org/10.1111/j.1399-0039.1998.tb02954.x PMID: 9459511
38. Al-Tonbary Y, Abdel-Razek N, Zaghloul H, Metwaly S, El-Deek B, El-Shawaf R. HLA class II polymorphism in Egyptian children with lymphomas. Hematology. 2004; 9(2): 139–45. https://doi.org/10.1080/1024533042000205487 PMID: 15203870
39. Clayton J, Lonjou C. Allele and Haplotype frequencies for HLA loci in various ethnic groups. In: Charron D, ed. Genetic diversity of HLA. Functional and medical implications. Vol 1. Paris: EDK. 1997; 665–820.
40. Abdennaji Guenounou B, Loueslati BY, Buhler S, Hmida S, Ennafaa H, Khodjet-Elkhil H, et al. HLA class II genetic diversity in Southern Tunisia and the Mediterranean area. International Journal of Immunogenetics. 2006; 33(2): 93–103. https://doi.org/10.1111/j.1744-313X.2006.00577.x PMID: 16611253
41. Martinez-Laso J, De Juan D, Martinez-Quiles N, Gomez-Casado E, Cuadrado E, Arnaiz-Villena A. The contribution of the HLA-A, -B, -C and -DR, -DQ DNA typing to the study of the origins of Spaniards and Basques. Tissue Antigens. 1995; 45(4): 237–45. https://doi.org/10.1111/j.1399-0039.1995.tb02446.x PMID: 7638859
42. Brick C, Bennani N, Atouf O, Essakalli M. HLA-A, -B, -DR and -DQ allele and haplotype frequencies in the Moroccan population: a general population study. Transfusion Clinique et Biologique. 2006; 13(6): 346–52. https://doi.org/10.1016/j.tracli.2006.12.003 PMID: 17306585
43. Gibert M, Touinssi M, Reviron D, Mercier P, Boëtsch G, Chiaroni J. HLA-DRB1 frequencies of the Comorian population and their genetic affinities with Sub-Saharan African and Indian Oceanian populations. Annals of Human Biology. 2006; 33(3): 265–78. https://doi.org/10.1080/03014460600578599 PMID: 17092866
44. Samaha H, Rahal EA, Abou-Jaoude M, Younes M, Dacchache J, Hakime N. HLA class II allele frequencies in the Lebanese population. Molecular Immunology. 2003; 39(17–18): 1079–81. https://doi.org/10.1016/S0161-5890(03)00073-7 PMID: 12835080
45. Khansa S, Hoteit R, Shammaa D, Khalek RA, El Halas H, Greige L, et al. HLA class II allele frequencies in the Lebanese population. Gene. 2012; 506(2): 396–9. https://doi.org/10.1016/j.gene.2012.06.063 PMID: 22750800
46. Elbjeirami WM, Abdel-Rahman F, Hussein AA. Probability of finding an HLA-matched donor in immediate and extended families: the Jordanian experience. Biology of Blood and Marrow Transplantation. 2013; 19(2): 221–6. https://doi.org/10.1016/j.bbmt.2012.09.009 PMID: 23025986
47. Mourad J, Monem F. HLA-DRB1 allele association with rheumatoid arthritis susceptibility and severity in Syria. Revista Brasileira De Reumatologia. 2013; 53(1): 47–56. PMID: 23588515
48. Djidjik R, Allam I, Douaoui S, Meddour Y, Cherguelaîne K, Tahiat A, et al. Association study of human leukocyte antigen-DRB1 alleles with rheumatoid arthritis in Algerian patients. International Journal of Rheumatic Diseases. 2014. https://doi.org/10.1111/1756-185X.12272 PMID: 24447879
49. Hajeer AH, Al Balwi MA, Aytül Uyar F, Alhaidan Y, Alabdulrahman A, Al Abdulkareem I, et al. HLA-A, -B, -C, -DRB1 and -DQB1 allele and haplotype frequencies in Saudis using next generation sequencing technique. Tissue Antigens. 2013; 82(4): 252–8. https://doi.org/10.1111/tan.12200 PMID: 24461004
50. Hajeer AH, Sawidan FA, Bohlega S, Saleh S, Sutton P, Shubaili A, Tahan AA, Al Jumah M. HLA class I and class II polymorphisms in Saudi patients with myasthenia gravis. International Journal of Immunogenetics. 2009; 36(3): 169–72. https://doi.org/10.1111/j.1744-313X.2009.00843.x PMID: 19490212
51. Albalushi KR, Sellami MH, Alriyami H, Varghese M, Boukef MK, Hmida S. HLA Class II (DRB1 and DQB1) Polymorphism in Omanis. Journal of Transplantation Technologies and Research. 2014; 4: 134. https://doi.org/10.4172/2161-0991.1000134
52. Haider MZ, Shaltout A, Alsaeid K, Qabazard M, Dorman J. Prevalence of human leukocyte antigen DQA1 and DQB1 alleles in Kuwaiti Arab children with type 1 diabetes mellitus. Clinical Genetics. 1999; 56(6): 450–6. https://doi.org/10.1034/j.1399-0004.1999.560608.x PMID: 10665665
53. Haider MZ, Zahid MA, Dalal HN, Razik MA. Human leukocyte antigen (HLA) DRB1 alleles in Kuwaiti Arabs with schizophrenia. American Journal of Medical Genetics. 2000; 96(6): 870–2. https://doi.org/10.1002/1096-8628(20001204)96:6<870::AID-AJMG36>3.0.CO;2-L PMID: 11121200
54. Arnaiz-Villena A, Palacio-Grüber J, Muñiz E, Campos C, Alonso-Rubio J, Gomez-Casado E, et al. Genetic HLA Study of Kurds in Iraq, Iran and Tbilisi (Caucasus, Georgia): Relatedness and Medical Implications. PLoS One. 2017 Jan 23; 12(1): e0169929. https://doi.org/10.1371/journal.pone.0169929 PMID: 28114347
55. Nassar MY, Al-Shamahy HA, Masood HA. The Association between Human Leukocyte Antigens and Hypertensive End-Stage Renal Failure among Yemeni Patients. Sultan Qaboos University Medical Journal. 2015; 15(2): e241–249. PMID: 26052458
56. Middleton D, Williams F, Meenagh A, Daar AS, Gorodezky C, Hammond M, et al. Analysis of the distribution of HLA-A alleles in populations from five continents. Human Immunology. 2000; 61(10): 1048–52. https://doi.org/10.1016/S0198-8859(00)00178-6 PMID: 11082518
57. Williams F, Meenagh A, Darke C, Acosta A, Daar AS, Gorodezky C, et al. Analysis of the distribution of HLA-B alleles in populations from five continents. Human Immunology. 2001; 62(6): 645–50. https://doi.org/10.1016/S0198-8859(01)00247-6 PMID: 11390040
58. Jazairi B, Khansaa I, Ikhtiar A, Murad H. Frequency of HLA-DRB1 and HLA-DQB1 Alleles and Haplotype Association in Syrian Population. Immunological Investigations. 2016; 45(2): 172–9. https://doi.org/10.3109/08820139.2015.1131293 PMID: 26853713
59. Hajjej A, Hajjej G, Almawi WY, Kaabi H, El-Gaaied A, Hmida S. HLA class I and class II polymorphism in a population from south-eastern Tunisia (Gabes Area). International Journal of Immunogenetics. 2011; 38(3): 191–9. https://doi.org/10.1111/j.1744-313X.2011.01003.x PMID: 21385325
60. Hajjej A, Kâabi H, Sellami MH, Dridi A, Jeridi A, El borgi W, et al. The contribution of HLA class I and II alleles and haplotypes to the investigation of the evolutionary history of Tunisians. Tissue Antigens. 2006; 68(2): 153–62. https://doi.org/10.1111/j.1399-0039.2006.00622.x PMID: 16866885
61. Hajjej A, Almawi WY, Hattab L, El-Gaaied A, Hmida S. HLA Class I and Class II Alleles and Haplotypes Confirm the Berber Origin of the Present Day Tunisian Population. PLoS One. 2015; 10(8): e0136909. https://doi.org/10.1371/journal.pone.0136909 PMID: 26317228
62. Hajjej A, Almawi WY, Hattab L, El-Gaaied A, Hmida S. The investigation of the origin of Southern Tunisians using HLA genes. Journal of Human Genetics. 2017; 62(3): 419–429. https://doi.org/10.1038/jhg.2016.146 PMID: 27881842
63. Ayed K, Ayed-Jendoubi S, Sfar I, Labonne MP, Gebuhrer L. HLA class-I and HLA class-II phenotypic, gene and haplotypic frequencies in Tunisians by using molecular typing data. Tissue Antigens. 2004; 64(4): 520–32. https://doi.org/10.1111/j.1399-0039.2004.00313.x PMID: 15361135
64. Oumhani K, Canossi A, Piancatelli D, Di Rocco M, Del Beato T, Liberatore G, et al. Sequence-Based analysis of the HLA-DRB1 polymorphism in Metalsa Berber and Chaouya Arabic-speaking groups from Morocco. Human Immunology. 2002; 63(2): 129–38. https://doi.org/10.1016/S0198-8859(01)00370-6 PMID: 11821160
65. Canossi A, Piancatelli D, Aureli A, Oumhani K, Ozzella G, Del Beato T, et al. Correlation between genetic HLA class I and II polymorphisms and anthropological aspects in the Chaouya population from Morocco (Arabic speaking). Tissue Antigens. 2010; 76(3): 177–193. https://doi.org/10.1111/j.1399-0039.2010.01498.x PMID: 20492599
66. Roitberg-Tambur A, Witt CS, Friedmann A, Safirman C, Sherman L, Battat S, Nelken D, Brautbar C. Comparative analysis of HLA polymorphism at the serologic and molecular level in Moroccan and Ashkenazi Jews. Tissue Antigens. 1995; 46(2): 104–10. https://doi.org/10.1111/j.1399-0039.1995.tb02485.x PMID: 7482502
67. Arnaiz-Villena A, Benmamar D, Alvarez M, Diaz-Campos N, Varela P, Gomez-Casado E, et al. HLA allele and haplotype frequencies in Algerians. Relatedness to Spaniards and Basques. Human Immunology. 1995; 43(4): 259–68. https://doi.org/10.1016/0198-8859(95)00024-X PMID: 7499173
68. Imanishi T, Akaza T, Kimura A, Tokunaga K, Gojobori T. Allele and haplotype frequencies for HLA and complement loci in various ethnic groups. In: HLA 1991. Vol 1. Oxford: Oxford University Press. 1992; 1065–220.
69. Arnaiz-Villena A, Iliakis P, González-Hevilla M, Longás J, Gómez-Casado E, Sfyridaki K, et al. The origin of Cretan populations as determined by characterization of HLA alleles. Tissue Antigens. 1999; 53(3): 213–26. https://doi.org/10.1034/j.1399-0039.1999.530301.x PMID: 10203014
70. Comas D, Mateu E, Calafell F, Pérez-Lezaun A, Bosch E, Martínez-Arias R, et al. HLA class I and class II DNA typing and the origin of Basques. Tissue Antigens. 1998; 51(1): 30–40. https://doi.org/10.1111/j.1399-0039.1998.tb02944.x PMID: 9459501
71. Grimaldi MC, Crouau-Roy B, Amoros JP, Cambon-Thomsen A, Carcassi C, Orru S, et al. West Mediterranean islands (Corsica, Balearic Islands, Sardinia) and the Basque population: contribution of HLA class I molecular markers to their evolutionary history. Tissue Antigens. 2001; 58(5): 281–92. https://doi.org/10.1034/j.1399-0039.2001.580501.x PMID: 11844138
72. Renquin J, Sanchez-Mazas A, Halle L, Rivalland S, Jaeger G, Mbayo K, et al. HLA class II polymorphism in Aka Pygmies and Bantu Congolese and a reassessment of HLA-DRB1 African diversity. Tissue Antigens. 2001; 58(4): 211–22. https://doi.org/10.1034/j.1399-0039.2001.580401.x PMID: 11782272
73. Farjadian S, Ghaderi A. HLA class II genetic diversity in Arabs and Jews of Iran. Iranian Journal of Immunology. 2007; 4(2): 85–93. https://doi.org/IJIv4i2A3 PMID: 17652848
74. Kollaee A, Ghaffarpor M, Ghlichnia HA, Ghaffari SH, Zamani M. The influence of the HLA-DRB1 and HLA-DQB1 allele heterogeneity on disease risk and severity in Iranian patients with multiple sclerosis. International Journal of Immunogenetics. 2012; 39(5): 414–22. https://doi.org/10.1111/j.1744-313X.2012.01104.x PMID: 22404765
75. Sayad A, Akbari MT, Pajouhi M, Mostafavi F, Zamani M. The influence of the HLA-DRB, HLA-DQB and polymorphic positions of the HLA-DRβ1 and HLA-DQβ1 molecules on risk of Iranian type 1 diabetes mellitus patients. International Journal of Immunogenetics. 2012; 39(5): 429–36. https://doi.org/10.1111/j.1744-313X.2012.01116.x PMID: 22494469
76. Sulcebe G, Sanchez-Mazas A, Tiercy JM, Shyti E, Mone I, Ylli Z, et al. HLA allele and haplotype frequencies in the Albanian population and their relationship with the other European populations. International Journal of Immunogenetics. 2009; 36(6): 337–43. https://doi.org/10.1111/j.1744-313X.2009.00868.x PMID: 19703234
77. Sanchez-Velasco P, Gomez-Casado E, Martinez-Laso J, Moscoso J, Zamora J, Lowy E, et al. HLA alleles in isolated populations from North Spain: origin of the Basques and the ancient Iberians. Tissue Antigens. 2003; 61(5): 384–92. https://doi.org/10.1034/j.1399-0039.2003.00041.x PMID: 12753657
78. Arnaiz-Villena A, Dimitroski K, Pacho A, Moscoso J, Gómez-Casado E, Silvera-Redondo C, et al. HLA genes in Macedonians and the sub-Saharan origin of the Greeks. Tissue Antigens. 2001; 57(2): 118–27. https://doi.org/10.1034/j.1399-0039.2001.057002118.x PMID: 11260506
79. Arnaiz-Villena A, Karin M, Bendikuze N, Gomez-Casado E, Moscoso J, Silvera C, et al. HLA alleles and haplotypes in the Turkish population: relatedness to Kurds, Armenians and other Mediterraneans. Tissue Antigens. 2001; 57(4): 308–17. https://doi.org/10.1034/j.1399-0039.2001.057004308.x PMID: 11380939
80. Muro M, Marín L, Torío A, Moya-Quiles MR, Minguela A, Rosique-Roman J, et al. HLA polymorphism in the Murcia population (Spain): in the cradle of the archaeologic Iberians. Human Immunology. 2001; 62(9): 910–21. https://doi.org/10.1016/S0198-8859(01)00290-7 PMID: 11543893
81. Farjadian S, Ghaderi A. HLA class II similarities in Iranian Kurds and Azeris. International Journal of Immunogenetics. 2007; 34(6): 457–63. https://doi.org/10.1111/j.1744-313X.2007.00723.x PMID: 18001303
82. Mohyuddin A, Ayub Q, Khaliq S, Mansoor A, Mazhar K, Rehman S, et al. HLA polymorphism in six ethnic groups from Pakistan. Tissue Antigens. 2002; 59(6): 492–501. https://doi.org/10.1034/j.1399-0039.2002.590606.x PMID: 12445319
83. Agrawal S, Srivastava SK, Borkar M, Chaudhuri TK. Genetic affinities of north and northeastern populations of India: inference from HLA-based study. Tissue Antigens. 2008; 72(2): 120–30. https://doi.org/10.1111/j.1399-0039.2008.01083.x PMID: 18721272
84. Rani R, Sood A, Goswami R. Molecular basis of predisposition to develop type 1 diabetes mellitus in North Indians. Tissue Antigens. 2004; 64(2): 145–55. https://doi.org/10.1111/j.1399-0039.2004.00246.x PMID: 15245369
85. Migot-Nabias F, Fajardy I, Danze PM, Everaere S, Mayombo J, Minh TN, et al. HLA class II polymorphism in a Gabonese Banzabi population. Tissue Antigens. 1999; 53(6): 580–5. https://doi.org/10.1034/j.1399-0039.1999.530610.x PMID: 10395110
86. Arnaiz-Villena A, Muñiz E, Campos C, Gomez-Casado E, Tomasi S, Martínez-Quiles N, et al. Origin of Ancient Canary Islanders (Guanches): presence of Atlantic/Iberian HLA and Y chromosome genes and Ancient Iberian language. International Journal of Modern Anthropology. 2015; 8: 67–93. https://doi.org/10.4314/ijma.v1i8.4
87. Reil A, Bein G, Machulla HK, Sternberg B, Seyfarth M. High-resolution DNA typing in immunoglobulin A deficiency confirms a positive association with DRB1*0301, DQB1*02 haplotypes. Tissue Antigens. 1997; 50(5): 501–6. https://doi.org/10.1111/j.1399-0039.1997.tb02906.x PMID: 9389325
88. Arnaiz-Villena A, Martínez-Laso J, Gómez-Casado E, Díaz-Campos N, Santos P, Martinho A, et al. Relatedness among Basques, Portuguese, Spaniards, and Algerians studied by HLA allelic frequencies and haplotypes. Immunogenetics. 1997; 47(1): 37–43. PMID: 9382919
89. Stearns PN. The Encyclopedia of World History: Ancient, Medieval, and Modern, Chronologically Arranged, 6th ed., Houghton Mifflin Harcourt, 2001, 2017, pp. 129–131.
90. Mira-Guardiola MA (2000). Cartago contra Roma. Ed.: Alderaban. Madrid, Spain.
91. Sellier J, Sellier A. Atlas des Peuples d’Orient. Paris, France: Editions La Decouverte, 1993
92. Arnaiz-Villena A, Martinez-Laso J, Alonso-García J. The Correlation Between Languages and Genes: The Usko-Mediterranean Peoples. Human Immunology. 2001; 62(9): 1051–1061. https://doi.org/10.1016/S0198-8859(01)00300-7 PMID: 11543906
93. Arnaiz-Villena A, Gomez-Casado E, Martinez-Laso J. Population genetic relationships between Medi-
terranean populations determined by HLA allele distribution and a historic Perspective. Tissue Anti-
gens. 2002; 60(2): 111–21. https://doi.org/10.1034/j.1399-0039.2002.600201.x PMID: 12392505
94. Hachid M. Postface de l’ouvrage “aux origines de l’ecriture au Maroc. corpus des inscriptions ama-
zighes des sites d’art rupestre du maroc” edited by: Skounti A., Lemdjidi A. and Nami M. Publication
de l’institut royal de la culture amazighe. Cealpa, rabat, morocco, 2003.
95. Malika H. Les premier berebers entre mediterranee, tassili et nil. Edited by edisud. aix-en-provence,
France 2000
96. Carter RA. Boat remains and trade in Persian Gulf during the 6th and 5th millenia BC. Antiquity. 2006;
80(307): 52–63. https://doi.org/10.1017/S0003598X0009325X
Genetic heterogeneity of Arabs
PLOS ONE | https://doi.org/10.1371/journal.pone.0192269 March 9, 2018 23 / 24
97. Kuhrt A. The ancient Near East (3000–330 BC). Vol II. Barcelona, Editorial Critica, 2001.
98. Hitti PK. History of Syria: Including Lebanon and Palestine, 2004, p26
99. Encyclopaedia Britannica: https://www.britannica.com/
100. Sartre M, D’Alexandre à Zénobie: Histoire du Levant antique, IVe siècle avant Jésus-Christ-IIIe siècle
après Jésus-Christ, Fayard, 2001.
Genetics, health, urban Dubai-2018
City and cosmology: genetics, health, and urban living in Dubai
Aaron Parkhurst
Department of Anthropology, University College London (UCL), London, United Kingdom
ARTICLE HISTORY
Received 28 September 2017
Accepted 9 October 2017
ABSTRACT
In light of increasingly high rates of diabetes, heart disease, and
obesity among citizens of the Arabian Gulf, popular health
discourse in the region has emphasised the emergent Arab genome
as the primary etiological basis of major health conditions.
However, after many years of public dissemination of genomic
knowledge in the region, and widespread acceptance of this
knowledge among Gulf Arab citizens, the rates of chronic illness
continue to increase. This paper briefly explores the clash between
indigenous Islamic knowledge systems and biomedical knowledge
systems imported into the United Arab Emirates. It presents
vignettes collected from interviews and participant observation in
Dubai as part of nearly four years of ethnographic research,
completed as part of the author’s doctoral work on ‘Anxiety and
Identity in Southeast Arabia’. Rather than radically informing health
seeking behaviours among many UAE citizens, the emphasis on the
‘Arab Genome’ has instead reconfirmed the authority of Bedouin
cosmological understandings of disease, reshaping the language
that people use to engage with their bodies and their health. Local
cosmology remains a powerful discursive element that often
operates in contention, in sometimes powerfully subtle ways, with
novel health initiative regimes. For many people in the region,
genomic information, as it is often discussed and propagated in the
UAE, shares an intimate relationship with ideas of fate and national
identity, and sometimes serves to mitigate the increasingly
uncertain terms of engagement that people share between the
body, their health, and rapidly changing urban landscapes.
KEYWORDS
Genetics; medical anthropology; chronic illness; fate; urban anthropology
Introduction
The underlying premise of this article extends from a simple, but profound anthropologi-
cal critique in the practice of biomedicine in different societies. That is, when policy plan-
ners and health professionals try to think through ideas of behaviour change that
accompany much of the discourse on obesity, diabetes, heart disease, and global health in
general, they need to take into account people’s perceptions or ideas of their ability to cre-
ate bodily change for themselves in general. Medical anthropology has long emphasised
the role of cultural landscapes and idiosyncrasies in producing powerful regimes of both
CONTACT Aaron Parkhurst a.parkhurst@ucl.ac.uk
© 2018 Informa UK Limited, trading as Taylor & Francis Group
ANTHROPOLOGY & MEDICINE, 2018
VOL. 25, NO. 1, 68–84
https://doi.org/10.1080/13648470.2017.1398815
http://orcid.org/0000-0002-0762-0929
illness and health, and alarming rates of chronic illness across the globe re-illuminate the
systematic neglect of culture in policy planning and debate (Napier et al. 2014). How is
agency constructed in ‘health seeking behaviour’, and what are the wider social factors
that inform ‘health seeking behaviour’?
This paper is informed by long-term fieldwork in Dubai that focused on these questions
of health seeking behaviours and how they relate to local ideas of fate, agency, and
genes. Further to these ideas, however, Dubai provides a unique context to think through
many forms of chronic illness that become propagated through individual habits and
behaviours. From questions that emerge in my recent inquiries on the human body and
urban environments, this paper explores an anthropological problem presented by the
body in the city, namely, the disruption of the stable relationship between the human
body and the environment. Genetics, as a concept, becomes an explanatory model that
men and women in Southeast Arabia utilise to speak towards this disruption.
The ethnographic data used in this paper was collected as part of nearly four years of
anthropological fieldwork in Dubai and Abu Dhabi, in which I lived and worked as an
anthropologist (February 2007–October 2010). It forms part of a larger body of work on
the relationship between globalisation, chronic illness, and tradition within Southeast Ara-
bia, undertaken as my doctoral research. The research was conducted in many social and
medical spaces, but primarily in participants’ homes, cafés, and other intimate social
spaces. Part of this ethnography was also conducted in clinical settings, involving partici-
pant observation in three mental health institutions (one in Abu Dhabi and two in Dubai),
and two nutrition clinics in Dubai. My anthropological research began as a project study-
ing mental health and the stigmatisation of mental illness in the Emirates, as well as
men’s health issues in the country in general. The current focus on diabetes and genetics
emerged from concerns from both local health authorities and from Emirati lay persons.
During my time in the Emirates researching chronic illness, Emiratis in general spoke
often and openly about their engagement with genetics, and about both their deep love of
and anxiety about the city. These themes comprise the focus of this paper.
The research methodology consisted primarily of participant observation and inter-
views conducted in both Arabic and English. Unless otherwise stated, the dialogue pre-
sented in this paper was conducted in English. Most of the discussions between my
participants and myself were qualitative, open ended engagements, though many inter-
views directed participants to discuss their understandings of genetics, the city, or both.
Participants were recruited in a wide number of contexts: some participated in discussions
as part of formal discussions in clinics; others were recruited through participant observation
in Dubai, and we met in their homes, cafés, or places of work in which I had access
and permission to conduct fieldwork. Still others were part of a support network in my
Arabic education. Most of the participants that inform the ethnography of this paper, and
with whom I became close, were men. This is partly due to the nature of the overarching
research questions on men’s mental and physical health issues in the Emirates, but it is
also due to the social structures of the country. While women participated in general
interviews in public health spaces, I only had ethnographic access to men in more personal
and private social spaces. The participants with whom this paper is concerned are almost
all Emirati citizens living in Dubai, with the exception of some perspectives from Euro-American
health professionals working in the city. Citizenship in the UAE is still
informed by tribal affiliation. Many Emiratis in Dubai and Abu Dhabi are members of
different branches of the Bani-Yas tribe, a large and powerful kinship group that enjoys a
long history in the Arabian Peninsula. However, there are also many who trace their line-
age through other large tribes. Emirati tribal leaders (sheikhs) often draw upon Bedouin
identity in public discourse in Dubai, though the label of Bedouin is rather fluid. While
different families in the Emirates have diverse historical backgrounds and histories that
shape their experience of the developing Emirati cities, this paper draws upon shared
understandings of the body and cosmology that unify the citizens of the Emirates.
Diabetes in the Emirates
The predominant blood sugar disorder discussed in this paper is Diabetes Mellitus Type 2.
This condition is characterised by the inability of the body to respond to insulin properly,
and usually develops in adulthood. There are many risk factors that are known to
contribute to Diabetes Mellitus Type 2, henceforth often referred to in this paper as simply
‘diabetes’, but most salient in public health narratives are those risk factors that correlate
diabetes with obesity (Body Mass Index of 30 and higher), personal diets, behaviours,
and habits. Diabetes is well understood as contributing profoundly to a wide range of co-morbidities.
Because of its relationship with obesity, the two are often discussed in unison by
health officials in Dubai.
The experience of diabetes in Dubai is often explained through narratives of ‘energy’.
Those who have the condition complain of not having any energy to go shopping, or go
to work, and sometimes complain that they do not have the ‘energy’ to go outside, as the
heat of Dubai’s oppressive climate stifles them. This is especially frustrating for those who
are told their condition is tied to inactivity. The experience of diabetes, however, is highly
variable in Dubai, especially as the condition presents itself in increasingly younger indi-
viduals. It often first presents itself as a major problem when people have other ailments
or are treated for other conditions. The experience of the condition remains confusing for
many of the people with whom I spoke, especially for younger individuals (in their late
20s or early 30s). They were aware, and even fearful, of the cardiovascular risks that the
condition informs, and they all had personally known others whose death at an early age
due to cardiovascular disease was informed by diabetes. While they felt the physical effects
of the chronic illness, and indeed, some had been diagnosed after an initial diabetic attack,
their social lives, in their own terms, had yet to be grossly impacted by the disease. As a
result, it was difficult for many people to narrate their current suffering beyond physical
sensation. As I will discuss later, for many the condition was considered with some ambiv-
alence. In this regard, when I spoke with people about the experience of living with diabe-
tes, they often turned the discussion away from their own lives, and instead borrowed
pathology as an opportunity to think through other aspects of their society.
Diabetes, and even obesity in general, is often seen by Emiratis in the UAE as a condi-
tion brought about by modernity. The Arabic term for diabetes in the Emirates is ‘da3 al-
suker’ and translates literally as ‘disease of sugar’. However, the Latin term ‘diabetes’ is
used ubiquitously in both Arabic and English discourse. In this regard, its immediate rela-
tionship to food and drink consumption is disrupted, allowing for more fluid and com-
plex understandings of the origins of the condition. Long-term medical professionals in
the UAE remember and recognise the historical development of blood sugar discourse in
the country. For example, a German physician who had practiced in the country for
20 years explained, ‘There was an idea, and I still come across this, that we [here he refers
to himself, and other Euro-American expatriates] brought some of these conditions with
us. Sometimes people might say “you made this problem so you fix it”, and I had no idea
what they were talking about’. The physician later came to understand that his patients
were referring to the idea of Euro-American immigrants as perceived agents of disease, or
at least associating these expatriates with the conditions of change and foreign influence
that bring sickness. ‘My father thinks these things’, a friend explained to me. ‘He thinks
diabetes is a conspiracy from Israel or something like this’. I asked why. ‘Well, people
didn’t have this problem, … nobody used to have Diabetes. Or maybe they had it, I think,
but nobody knew about these things. So they blamed everyone else. And now we know it
is genetic, but even now some people don’t believe that’.
There is great complexity embedded in these ideas. Israel, here, is understood to be in
partnership with American and European governments to subvert Arab society, though
these ideas are not shared by everybody. There is also an attempt to understand how dia-
betes developed so quickly in the rapidly growing city. Other logics concern immigration
as a direct process of pathology. In this regard, diabetes is seen less as something that
develops from habits, and instead is partially socially constructed as something caused by
ambiguous pathogens that accompany immigration. Others see Euro-American expan-
sion as an agent of corruption, if not a direct agent of disease. The complex consequence
of these commercial and social infiltrations on the human body is a trend seen in many
areas of the world, and has been given the moniker ‘cocacolonisation’ (see Leatherman
and Goodman 2005). In the past, diabetes was not known to be a problem, and suddenly,
one day it was. According to the International Diabetes Federation (IDF), during the culmination
of my fieldwork, the Emirates had the second-highest rate of diabetes in the
world, behind the small Pacific island nation of Nauru (IDF 2010). This trend remains
strong. Current data from the IDF holds that nearly 1 in 5 adults in the UAE is currently
afflicted with diabetes, and the country’s rates of diabetes are rising faster than those of both its
neighbours in the Arabian peninsula and the world at large (IDF 2015). If these rates
continue, the prevalence of diabetes is expected to double within a generation.
My participants do not use the term ‘cocacolonisation’, but they are aware of these
forces of commercial and social intrusions, and they see these processes centred in
the city, namely, Dubai. My friend Ali, for example, spoke often about the problems that
the city posed and the dilemmas it caused for him and his peers. Ali explains, ‘There are
some people who just think it would be better if everyone (foreign) left, and there are
other people who are afraid of what will happen if everybody leaves’. ‘What do you think’,
I ask him. ‘I think like most people we love people to come here and we love to share our
country. But maybe some people are meant to come live here, maybe some people should
only come visit. Smaller is ok too, all these towers… It will be good to slow down, or else
people (locals) will never leave their homes, and the people coming here will be bored,
and they will stop coming… people are becoming very selfish… . [We] do not have to do
much. We need to be better’. At other times, he and his peers would complain about the
fast food that they and their children consumed, or the amount of TV their family
watched, always wildly gesturing to the streets. The city then becomes tied to indigenous
understandings of modernity and disease, and is understood to be mapped upon the
human body. The body and the city is, in many ways, still a developing subject of analysis
in social science, though it has an emerging collection of thought in a range of disciplines
from geography and anthropology to psychoanalysis. While architectural planning has
for centuries drawn upon human corporeality to understand the form of
streets, buildings, townships and cities (see Vitruvius and Da Vinci, for example), philosophers
and artists near the beginning of the last century began to recognise the metropolis
as a new grounding for human culture and corporeality (e.g. Mumford 1934; Metropolis
1927). In a different vein, other thinkers in anthropology and geography conceived of the
body and society as mirrors for each other (Douglas 1966), and the city, specifically, as a
metaphor for the human body in which stable urban landscapes inform cultural under-
standings of the body and identity (Sennett 1994). In this way, space, place and the body
become concretely joined. What Sennett identified is how urban spaces become norma-
tive, seemingly stable lived experiences for those who live within them. Yet, he also shows
how this normative experience of urban-ness belies the reality of the city as a highly unsta-
ble, and profoundly fluid and dynamic space. It is a transformational entity in its own
right that shares an anthropologically reciprocal relationship with the human body: the
city-cum-body is constructed by the body, much as people embody the dynamic forces of
the city (ibid).
In discourse on diabetes, obesity, and heart disease, social scientists have long argued
for a more holistic view of the body in relation to society to think through health seeking
behaviour (see Edwards 2012; Paul 2005; Mendenhall et al. 2010). Specifically, in order to
create changes and shifts in health delivery and demographics, especially in a context
such as London or Dubai, policy planners need to think beyond what a health authority
might be able to issue, and think additionally about the pragmatics and lived experience
of people as they try to move through their daily life. In terms of diet and exercise, this
has implications for public transport, daily commutes, housing prices, and a wide range
of socio-economic policy and practice. In this regard, city politics and urban management
in the US and UK, for example, have informed urban neighbourhood demographics, the
distances between an individual’s work and residence, the pragmatics of daily travel, and
opportunities to create and utilise time for activities beyond income production and
household maintenance. These aspects of quotidian city life are mapped onto the human
body in the form of chronic illness (Church et al. 2011; Cetateanu and Jones 2014; Bur-
goine et al. 2014; Bourgois 2011). The structural limitations of urban living often provide
daunting hurdles to prevention of chronic illness, but there is a psychological aspect to
health behaviour and practice that is sometimes ignored. That is the sense of futility many
people express and experience in thinking through how they might work upon their
bodies.
Diabetes and fate
Obesity and diabetes are made complex in Dubai, as they are medical categories that are
often fraught with ambivalence, and they are not always seen as unhealthy body categories
in the city and country at large. This is certainly not unique to this region of the world (see
Randall 2011, or Popenoe 2003, for example). One of the issues that contributes to high
Body Mass Index and high rates of blood-sugar disorders in Southeast Arabia that is not
discussed in this paper is the perception of these conditions as normative or healthy, and
in the case of obesity, sometimes desired. However, as discussed in the section above, dia-
betes, specifically, is often understood as a condition of modernity, a sudden product of
‘modernisation’. This is evidenced by my participants in a number of ways. One concern
from locals is the idea of Western imperialism as an agent of disease. The widespread idea
of diabetes in the region grew in similar terms to the influx of foreign immigration, prod-
ucts, and ideas. This type of modernity also brought more robust systems of medicaliza-
tion into the country. Very few in the Emirates were diagnosed with diabetes before the
invitation towards foreign development, and so it is rather reasonable to deduce that it is
a ‘Western’ illness category that expatriates brought (and continue to bring) into the
country. This perception is made complicated by discourse that links Western material
and social imports to cultural pollutants, if not direct agents of disease.
American-designed fast-food industries, expensive villas, sport-utility vehicles, mass media, and
even increased longevity become objects vacillating between desire and danger. All these
vacillating objects were tied to urbanising processes, and the city is perceived to be the
locus of these goods. In this regard, the desert was often looked upon as a safe haven. As
one of my participants proudly advertised, ‘I make my family go camping to the desert
every month usually because it is the best thing to grow up right… It is like a medicine’.
Though, even then, my friend’s ‘tent’ was fitted with modern amenities. Vacillation, as
theorised by Ghassan Hage (2010),
occurs because we do not always know what we want and we often want contradictory
things… we can say that vacillation is when there are many incompatible things giving mean-
ing to our lives and we find ourselves pursuing them despite their incompatibility. What is
important, though, is that vacillation is not just a movement between various states of being;
rather, it is a state of being in itself. (Hage 2010, 152)
My participants often describe themselves in this way, torn between desires for conflicting
interests and identities. Some defined the city as ‘a place where people don’t know how to
not want things’. The desire for both modernity and tradition, and the perceived futility
of pursuing both, creates conditions of uncertainty that my participants expressed often.
The city becomes a vessel for this uncertainty, and becomes tied to other categories of
ambiguity more closely associated with the body; namely, genetics.
As Kilshaw has demonstrated in her ethnography in Qatar (Kilshaw 2015), the Qatari
state’s dedicated mission to become ‘modern’ borrows significantly on the role of genetics,
but this is often in contention with the way that local Qataris ‘themselves understand and
incorporate genetic knowledge into their lives’ (Kilshaw, this issue). Institutionalised
genetic sequencing and testing programmes speak towards a local desire to bring Qatar
forward as a global leader in healthcare, and they become representative of a ‘modernity’
of which Qatari citizens are very proud. Yet, balancing these desires with traditional
emphasis on inheritance makes genetic dissemination very complex, and in some ways,
ironic (Kilshaw 2015, this issue). In the context of Dubai, the imports described above
bring both comfort and ‘corruption’, and are problematically, though not necessarily
falsely, tied to conditions that are often ethnographically also attributed to genetics, such
as ‘misbehaving children’ (in terms of autism spectrum), depression, and, saliently, diabe-
tes. All these categories are, then, often understood as diseases brought by the West. Some
speak of diabetes as a result of a loss of traditional value and culture or religion. For exam-
ple, I met a participant who insisted that soft drinks, and specifically Coca Cola, were
ruining the health of the city (indirectly invoking the idea of coca-colonisation discussed
above), which is something he and I agreed on to a degree. He asserted, however, that if
locals drank more coffee, as was considered traditional, then the diabetes epidemic could
be annihilated. There may be some medical truth to this, depending on the ways and the
amounts coffee is consumed. However, my participant’s concern was not with the physi-
cal and chemical properties of the drink. The harmful long-term effects of soft drink con-
sumption are not always perceived to stem from the ingredients of the products: sugar,
corn syrup, or, perhaps, colouring compounds. Rather, it is the nationalism of the prod-
uct, and its cultural disruption that is understood to be poison for the human body. ‘Coca-
colonisation’, then, is a useful but limited concept in the region as it directs analysis of
health seeking behaviour away from the individual and places it within wider systems of
structural imbalance. My participants do often recognise that Coca-Cola, as a ‘material’,
leads to diabetes, but this ‘material’ takes on a different meaning depending on its source.
In this regard, sugar is good when it is used to make local products, and bad when it is
imposed upon those who fall within Euro-American patterns of consumption.
Parallel to local understandings of foreign influence is an increasingly prevalent public
discourse on genetics. Within popular imagination, there is a widely held perception of
genetics as diabetic aetiology; that is, genes are largely, if not wholly, responsible for diabetes.
For example, when I was discussing aetiology with one of my participants, I was
speaking about genetic susceptibility for type 2 diabetes, a ‘gene’ for diabetes, and he was
speaking of ‘Al Djinn’, those ambiguous agents of the desert, usually frustratingly amoral,
that are known to influence the world of humans and disrupt human agency. I am careful
to note that he probably does not mean this literally, that genes and Djinn are one and
the same. Or, if he does, it remains speculative. However, in many regions of Southeast Arabia,
genes and Djinn, as ambiguous categories of nature and fate, do borrow each other’s
language, if not further synonymy. It is a recognition that the sands of the
Rub al Khali, the vast desert that lies across the Southeastern Arabian peninsula, and the
human body are both their own cosmologies, populated by cosmological agents that can
affect one’s life and well-being.
In this way, genes have been incorporated into indigenous cosmology. The language
and rhetoric that my participants apply to discourses of fate are often re-appropriated to
help them think through genetics and other biomedical body knowledge. While I do not
have the space in this paper to unpack the complex construction of ‘fate’ itself in Dubai,
my larger ethnography has shown that fate is a language of uncertainty in Dubai, but is
often incommensurable and sometimes even congruous with deep personal agency (Par-
khurst 2014). In thinking through the body in the city, and the body of the future, fate
becomes a rhetoric that is helpful to situate oneself in the conditions of vacillation I have
described above. In relationship to disease, other anthropologists have shown how Islamic
conceptions of fate are better understood as languages for structural imbalance. Sherine
Hamdy’s work in Egypt, for example, shows how fate is invoked by some as a mechanism
to take action and meaning within systems of political failure and structural violence
(Hamdy 2008, 2009). In contrast to traditional perceptions of ‘Islamic fate’ by colonialist
thinkers, my participants often invoked strong sentiments of personal cultivation and cos-
mological futility simultaneously. Because of its place in religion and other systems of
social relations, fate, locally defined as submission to God, is proudly locally owned as a
marker of identity, yet is practiced with ambivalence. Processes of modernity and urbani-
sation as understood by my participants, because of their own ambiguity, and because of
their association with bringing both success and disease, are then placed within this
language of fate. As genes become increasingly understood as carriers of both identity and
disease, they become tied to these languages as well.
The development and dissemination of molecular biological science in laboratory cul-
tures over the last five decades informs the social understandings of genes as the science is
imported into new contexts. Outside of the Middle East, this trend has provoked wide
philosophical and bioethical debate. In discussing genes with patients, or the public, an
often-overlooked consequence is a lay understanding of genes as destiny. Within the sci-
entific community, this problem has been discussed for decades, asking, in a broader
sense, what it means to say ‘x-gene determines y’. Richard Dawkins has fought against
this type of genetic deterministic understanding, asking, ‘Why are genetic determinants
thought to be any more ineluctable, or blame-absolving, than ‘environmental’ ones?’
(1999, 10–11). There is, arguably, a cultural miscommunication here between the cultures
of laboratories and the general public. For many philosophers, and laypeople, the question
is somewhat teleological; for biologists, the question is statistical (ibid.). Nonetheless, the
human body, as it ambiguously weaves through all systems of social relations, blurring
biology and culture, remains a steadfast anthropological problem (Csordas 1994; Scheper-
Hughes and Lock 1987), and genetics, understood as synecdoche for the body, have only
complicated long-running debates on what it means to ‘be in the world’ (Franklin 1995).
Ethnographically speaking, genetic understandings can be strikingly and profoundly
meaningful, and have the potential to elicit powerful change in individual and social iden-
tity (Rabinow 1996). Anthropologists, recognising the need to create new theoretical tools
to think through the ramifications of genomic information in society, have taken on-
board this concept of ‘biosociality’ to help understand the role of genes within ethnogra-
phy (see Gibbon and Novas 2008). However, they are also critical of instilling too much
power within genes as definitive instruments of change and control (Rabinow 2008).
Within the clinic, semantics of genes can radically inform patient behaviour, in both
informing aetiology (see Senior et al. 1999 for example), and, in new ways, avoiding aetiol-
ogy (Franklin and Roberts 2006). Within larger debates in anthropology, these disruptions
of nature and culture perhaps provide evidence for the post-modern viewpoint that social
science itself has ill-constructed binaries which it debates and refutes (Latour 1993).
Molecular biology may have a role here as well. New genetic sciences and epigenetic influ-
ences on the body contribute to the development of radically new debates within anthro-
pology on nature and culture (Lock 2013, 2015). However, it is worth noting that, for the
people within the context of this study, the line between the biological and the social has
always been very weakly drawn. The people of Dubai do recognise genes are biological
agents, but they are simultaneously social ones, as I will discuss.
How people construct the notion of fate, or destiny, in relation to genes, is just as deli-
cate as social and biological binaries. The language of genetics, premised on imaginations
of the inevitability of nature, remains an instrument that can invoke a sense of fate, or a
prescription of behaviour. This is further complicated by deeply held values of genetics as
specifications of race, and by extension, ethnicity (Fullwiley 2007). As I have argued
elsewhere (2014), one implication here is that many geneticists still wantonly operate under
the same formulas for ‘national character’ that social sciences have accused the Oriental-
ists of perpetrating, and that scholars have attempted to weed out of anthropology. In this
way, the semantics of genes are translated outside of the laboratory to the public to give
chemical and organic evidence towards national identity.
ANTHROPOLOGY & MEDICINE 75
In the Emirates, the relationship between genes and national identity often takes com-
plex forms. While there exists a robust local knowledge of the mechanisms of inheritance
and kinship in Southern Arabia that I have not the space to discuss in depth here, genes
as biological entities are not necessarily part of, and not always associated with this inheri-
tance and kinship. Genes are widely known as identity markers independent of kinship.
They are widely known to be carriers of disease, but are not generally understood to con-
tain the essence of, or the benign traits of, a person. The following brief conversation
between myself and two of my participants, a debate on the genetic influences to, say, hair
colour vs. diabetes, illustrates local incommensurability between genes and inheritance. It
began from a popular discussion among my participants – what makes a person beautiful.
(Ali) ‘A woman’s hair comes from her mother, and that is why they are keeping it like this
[silky, and pitch black]’
(Myself) ‘What about diabetes’, I asked, ‘is this something that comes from the mother or
from the father?’
(Ali) ‘No, this one is genetic I think.’
(I continued) ‘Sure, but do you get it from your mother’s side of the family, or does it come
from your father’s side of the family?’
(Ali) ‘No, these ones, these diseases they are genetic.’
(Myself) ‘Fine, but where does it come from?’
(Ali) ‘No, yaani, they do not come from anywhere. I am trying to tell you that. They are not
coming from anywhere.’
(Myself) ‘But if they are genetic, they are inherited from someone!’
(Ali) ‘Yes, but no it does not come from anywhere, yaani, this is why it means it is genetic’
(Myself) ‘Well, what does genetic mean?’
(Ali) ‘It means that you have genes… that it is because you are Arab or maybe like these
people’, he points to a group of Filipinos who were working at the café in which we met.
(Myself) ‘[The Filipinos] are genetic?’, I asked. The two men at the table could see that I was
confused.
(Rahman) ‘Don’t you know that Arabs have these genes and that British people have these
genes and all these peoples have these genes’, one of them yells at me.
(Ali) ‘He means different genes’, his colleague explains.
(Rahman) ‘Yes, yaani, different genes all these people,’ he clarifies.
(Myself) ‘Yes, I understand that, but where do these genes come from?’
(Ali) ‘But they are not coming from anywhere is what I am telling you. They are because they
are these people… .’ His friend interrupts,
(Rahman) ‘We are Arab so we have some of these ones [genes].’
(Myself) ‘Is being Arab genetic then?’, I asked.
(Rahman) ‘Yes, of course, and like you are coming here from England.’
(Myself) ‘Is being English genetic?’
(Rahman) ‘Yes that is what we are trying to tell you.’
(Myself) ‘Ok, so is being Emirati genetic?’ This question seemed to provoke some thinking.
After a short time, they answered.
(Both) ‘No, this one is not genetic, it is coming from who your father is.’
The debate continued for some time. I asked about skin (from the mother), height (from
the father), obesity (genetic), cancer (genetic), eyes (mother), gender (father), and so forth.
I continued these questions with many people throughout my fieldwork, with more or less
the same responses. In terms of pathology, diabetes, cancer, obesity, and both psychotic
and non-psychotic mental illness: these conditions and behaviours were perceived to be
genetic. However, certain types of nationality and general behaviour, and the phenotypic
attributes of appearance were said to originate with parents, in the home, and in the
womb. Ethnicity, as a concept, and as a broad signifier, is often slippery. Being ‘Arab’ or
Chinese, or White-European, in local terms, was discussed as evidenced through genes.
Being Emirati, for example, is inherited, but not genetic, while being Arab, and more spe-
cifically, deriving from Southeastern Arabia at large (Bahrain, Qatar, Emirates, Oman,
possibly Saudi, but not Yemen) is said to be informed through genetics. Beyond biology,
many factors contribute to these designations: Bani-Yas tribal affiliations, ties to desert
and coastal landscapes, concepts of wealth, constructs of purity, and language practices –
to name but a few. While the limits of genetic influence in popular imagination provide
further ethnographic evidence on the nature of agency in kinship and reproductive practi-
ces, the ambiguous coupling between pathology and ethnicity speaks to the constructs of
genes in this paper.
John Avise, in his monograph on the Genetic Gods (2001), extends genetic determinism
to the structural realm of cosmology, attempting to ask and answer questions that are, for
many people, religious. The link between genes and gods can be, Avise argues, a rather
rapid one. Certainly, as invoked in the anecdote above regarding Al Djinn diabetes, there
is evidence for this in my field-site as well. Here, of course, the connection is not made
with ‘Gods’, but it is still made with religious cosmological entities. This synonymy and
parallelism presents an anthropological question: If genes conjure up their own cosmology
within the imagination, then is it reasonable to suggest that an already present and strong
cosmology might inform genetics? In the Arabian Gulf, genetics has found an audience
with which it was unfamiliar. The intentions behind its language are especially vulnerable.
The men and women of the Emirates already have a very robust and complex language of
their own with which they can engage fate. Genetic dissemination was bound, in some
way, to be reworked under these powerful Arabic articulations. There is not space here to
do justice to the diverse and encompassing language of fate in the Emirates, let alone the
Arabic-speaking world at large, and despite the complexity of fatalistic discourse in the
region, modern ethnography conducted in the Arabian Peninsula remains sparse. This
paper in many ways takes the presence of fatalistic language as an ethnographic given,
even if the link between behaviour and discourse is often nebulous and even sometimes
careless (Chaves 2010).
In terms of how genetics and fate are interwoven in Southeast Arabia in general, other
research has provided insight in contexts outside chronic illness. Kilshaw (2015) has
analysed how maternal prospects, marriage and consanguinity highlight genetics and the
management of risk in Qatari communities in and around Doha. Similarly, inherited
blood disorders and genomic testing not only encroach upon marriage practice in Oman,
but become novel signifiers of nationalism, history and identity in a context in which nor-
mative concepts of time and history are politically prescribed (Beaudevin 2013).
The research presented here complements these works. Rather than simply replace the
cultural models of the world that the Bedouin and coastal tribes of the UAE know to be
true, foreign medical and scientific concepts are re-shaped and interpreted through the
languages of the desert, themselves becoming common discursive elements of public
knowledge. In thinking through ‘genes’ as agents of disease, and Djinn as ambiguous spi-
rits of the desert, my participants see congruences. The slippage between Djinn and genes
becomes a powerful metaphor to depict the fallacies inherent in the designs of globaliza-
tion and in the assumptions embedded in Western scientific empiricism and dissemina-
tion. The direct association between these terms is not as important here. What I argue is
that the failure to recognise genetics as its own cosmology can indeed perpetuate suffering.
I have argued that Emirati conceptions of the self and body in relation to nature, spirits
and foreigners are challenged by the promises of globalization and modernity. As people
move through the desert, the coast and the rapidly growing cities, their quest for an elu-
sive notion of modernity ricochets into local systems of destiny, cosmology, agency, body
practices, and kinship, and the languages one uses to articulate the ‘self’ and world are
transformed.
The language of fate is a language in which genetics is often fully embedded. As dis-
cussed, while the epistemology of ‘genetic determinism’ has been a trope borrowed in
both social and biological landscapes in the West, fate is far more culturally owned in
the Gulf, and in much wider social ways than in, say, the UK. Ideas of Islamic fatalism are
often a proudly culturally owned category in the region. As Bourdieu has told us of
the Kabyle, ‘Submission to nature is inseparable from submission to the passage of time
scanned in the rhythms of nature’ (Bourdieu 1963, 57). Bourdieu was attempting to
understand fate, fatalism, or determinism among his Islamic informants in Algeria. In his
writings, his informants understood fate as scanned in the rhythms of nature. This is not
foreign to my participants who invoke a similar symbolic association with nature, and
specifically the moon, the tides, and even genes. However, these terms are variable in
meaning for my participants depending on the context in which fate is invoked. I have
argued elsewhere (2014) that ‘Islamic fatalism’ in the Gulf is often a poor concept. Rather
than see themselves and their fate as inescapable from the moon or the tide (as James Frazer
(1990) poetically described a century ago), I struggled to find notions of Islamic
fatalism in common practice, and in the reality of people moving through their day. Rather,
nature was a stable stage in which people took comfort. The coast, tides and waves, and
above all, the desert, provided a language of empowerment, that the individual could
effect change in the world, and should indeed do so. Oil provided an index of power that
was granted from nature, and Bedouin reliance on the ever-stable Earth reinforced these
motivations for planning, hurrying, scheming and creating – at least among the Gulf’s
elite. Bourdieu’s Algerian notions of fate and hubris do exist in conversation and song,
but they usually contradict practice.
Chronic illness in Dubai, and specifically diabetes, complicates applications of fate.
Many of my participants, and especially young and middle-aged men, understood diabe-
tes as something within the body that can make one sick, but not as a constant condition
in and of itself. In times of diabetic distress, patients would eagerly seek immediate medi-
cal attention, and then participate in health planning in the few days and weeks following
their distress, though these behaviours would generally transform into old habits. It was
difficult for many participants to imagine themselves being ill in those times in which
they did not feel ill, and indeed felt normal. It is in these contexts that the languages of fate
and genes were simultaneously invoked.
In these ways, I have briefly outlined how genes in the Emirates become simultaneously
tied to pathology, race, ethnicity, and fate. These relationships are made evident in local
discourse in complex ways. Consanguinity, in the Emirates, for example, is increasing.
Studies indicate that slightly over half of Emirati marriages are consanguineous (Al-Gazali
et al. 1997). However, as the local population increases, and as the Emirati population has
more access to education and health services, the rates of consanguinity have increased.
Contrary to patterns in many other parts of the world, in the span of one generation,
research indicates that rates of consanguineous relations have risen another 10%, and the
preferred marriage is between first cousins (ibid.). Studies in the Emirates have attempted
to examine the effects of the trends in these marriage practices on health patterns
(Tadmouri 2009; Abdulrazzaq et al. 1997; Al-Gazali et al. 1995). However, new local
understandings of genetics give novel meanings to inheritance. Paired with traditional ideas of
fate, and tied, again, to anxieties related to an ever-increasingly heterogeneous city, genetic
information, for many, may ironically help inform higher rates of consanguinity. Simi-
larly, Kilshaw has collected narratives of women in a similar context in Qatar in which
dialogue between genes, responsibilities towards health, arranged marriages, and familial
obligations are constantly contested and negotiated (2015, this issue).
Chronic illness presents similar challenges. Chronic illness is, by its nature, confusing
as negatively constructed pathology. If disease in general can be discussed through fatalis-
tic terms, chronic illness, for which patients may not recognise or anticipate future symp-
toms, becomes even more of a logical consequence of destiny. Race is constructed in
Dubai as a profoundly positive form of cultural capital, and genes as markers of race are
proudly owned. In other aspects of social engagement, many are very protective of what is
acceptable as informed by genetics. Mental illnesses are often said to be genetic, but sexual
behaviour is not, and my participants become deeply offended at suggestions that sexual
behaviour might be informed from biology. When race, seen as a profoundly positive
social capital, is made parallel to pathology in terms of genetic dissemination, an
individual’s natural approach to their chronic illness often becomes marked by indiffer-
ence, and on some occasions, might even be embraced as a socially owned form of cultural
capital, regardless of the health consequences. As a result, emergent public genetic educa-
tion on the ‘Arab’ genome, designed by health authorities to curb those habits that
encourage and spread chronic illness, is locally embraced as authoritative knowledge,
mirroring the language of fate that local residents have long used to articulate their world.
However, as authoritative as the concept of the gene in the Emirates is, it fails to produce
the behaviour change for which health planners have hoped. Indeed, the opposite effect
has occurred, as the rates of diabetes and obesity continue to climb.
The body and the city
The ‘city’, however, creates a new and very real dilemma for those who inhabit it, and
it has ramifications for the body. In my previous fieldwork, I set off to answer a very broad
question in the Emirates: What happens to identity within indigenous culture when faced
with globalization and modernization on such a rapid course? Dubai, perhaps more than
anywhere else in the world, is well suited to afford opportunities to explore this question.
The city itself became a protagonist, and a type of an anti-heroine. In the years I lived in
the city, I was able to watch megaliths rise from the sand. Countless workers from South
Asia spun webs of steel and scaffolding from dawn until after dusk. Every evening the
towers were half a metre taller. One can drive somewhere in the morning, only to be lost
when the road is wiped away by evening. The city is a fortress against nature, a place
that – even for my participants – could not be, should not be. For many of the residents
of Dubai, the city is an impossible landscape, save for the vision of the sheikhs, and the
blessings of God. Dubai, for many local people, is itself an articulation of their sub-con-
scious, arising from the dreams of their leaders who imagined the wealth of the city as
they stared across what was once a tiny creek babbling along sand and rock. Because of
this perception of Dubai as a materiality of local dream-scape, her betrayal is especially
harsh. Many of the men and women who watched the first cargo come to Jebel Ali Port,
and who remember the first hotels and towers, now feel that the city is designed for every-
one except them. Some people act as if the city has its own agency, and there is a sense of
amorality in its development, but for my participants, who are fiercely loyal to each other,
to the sheikhs, and to Dubai, there is a sense that Dubai has not reciprocated, that at some
point the city began to be disloyal.
The sand, the coast, and even the oil, previously dependable wells of wealth, gifted by
the desert and the sea, that are worthy of their own ethnography (see Limbert 2010), are
no longer the stable entities with which local people can pivot themselves against to enact
identity. In this way, the relationship between people and their environment, as I have
witnessed it in Dubai, becomes profoundly disrupted. Local Bedouin and Beni Yas tribal
cosmology has long seen actors subject to the permanence of land and the inevitability of
predictable – if sometimes oppressive – nature, the extreme reality of the desert, and the
moon and tides in which they see the natural symbols of fate. The city, in this sense is pro-
foundly disruptive. Cosmology which has long-depended upon a relationship between
moving bodies and stable Earth fails to cohere in a landscape of rising monoliths, 14-lane
highways, and an influx of cultures and languages from abroad, and, of course, genetic
heterogeneity. Emiratis now compose less than 10 per cent of Dubai’s population. No
longer the flexible bodies against the rock, many people develop a deep anxiety which
limits their ability for action across the gamut of individual and social enterprise. In other
words, the uncertainty of ‘modernity’, whatever ‘modernity’ is, makes thoroughly intoler-
able the complexity of life’s choices.
I have studied how this frustration over uncertainty becomes enacted in local cosmol-
ogy, among the Djinn of the Emirates, who lash out against both the past and the future
(2014). Here, though, there are repercussions for the body, which is, in the Emirates at
least, one of the casualties in the conflict between local identity and the changing urban
landscape. Fate becomes enacted uniquely here. Rapid change radically disrupts people’s
ability to see themselves in the future, and so ‘health seeking behaviour’ becomes desir-
able, but highly directionless. Genes, too, take on further meaning in light of the shaky
ground. As a biological category of both fate and ethnicity, they are relied upon to provide
an anchor to identity when identity is under threat from a newly uncertain world. They
become a cosmology in and of themselves, synonymous with tradition, and their associa-
tion with pathology is forgiven, and even valued as a consequence of fate. Health educa-
tion directed towards managing and preventing chronic illness asks the individual to
imagine one’s body in the future. However, the body, as outlined at the beginning of this
paper, is inextricably intertwined with the urban cosmos, and for many, the unstable,
uncertain city makes this request for vision cognitively exasperating and disheartening.
The city, as I have described, is a site of vacillation for my participants. They have called
it, poetically, the ‘inescapable place of desire’, highlighting their deep frustrations. My par-
ticipants are not usually resentful towards the city. Indeed, they often express deep love
for it along with their exasperation. They do not want the city to collapse, but they are
simultaneously overwhelmed by it. Diabetes and genes become enmeshed in this
exasperation, and many turn to concepts of fate to cope with their precariousness. Genes help
concretise this fate within the human body.
Conclusion
I suggest as a final thought that both chronic illness and anxiety in the Emirates are partly
the result of the ways in which many local people define what modernity means to them.
Perhaps Dubai’s betrayal is that it grew too quickly. Foreigners come to the desert and sift
in and out of memory and landscapes, but it is the local Emirati who are left to make sense
of the shadows of all this movement. Genes and Djinn, germs and fate, SUVs and oil, sky-
scrapers in the city, and the sands of the empty quarter all must be constantly reimagined,
and it can be very arduous work. Emirati citizens value tradition and preservation, and
they do want to preserve the new city. The task at hand is how paradoxically to create tra-
dition and sustainability against a backdrop of something entirely new, but not just new,
from something that has no firm foundation. Emirati locals do by and large know the
steps they need to take for healthier lives, and they are educated on what health-seeking
behaviours will drive communal health, but with both local imagination and local health
structures, they lack novel frameworks in which these behaviours carry deeper meaning.
For my participants, whilst their futures and their city sit upon volatile terrain, they hold
steadfast to cosmologies that help anchor them to the world that they value, and they are
fiercely proud of constructed Arab identifiers that help index their lives as both desert and
urban people. Genes become valued as these identifiers, and are tied to conceptions of
fate. Local people understand pathology when it is presented through genetic discourse,
but in terms of the uncertainty of the city, and the threats the city presents to local iden-
tity, pathology becomes equally tied to fate.
The systems which inform rising rates of obesity and diabetes around the world are
massively complex, and there is a host of social and biological factors that inform these
body categories. My simple existential point is that when suddenly faced with the
intensely myriad choices of the modern world, many people (regardless of nationality,
religion, gender or race) simply, and ironically, cannot make any. This includes choices
on health and habits. It becomes profoundly difficult to consider the future body in a
landscape that wantonly clouds future vision. In the context of Dubai’s rapid urban
growth, residents rely upon structures of cosmology that they hold self-evident to cope
with radical uncertainty, and they apply these cosmologies to the body and to emergent
biomedical categories. In addition to health care education, and policy that addresses
structural violence, in all its many forms, I argue that health-care planning and policy can
still be profoundly informed by local cosmology, and it must take into account how the
human figure pivots itself against a world that is, for many, no longer as sturdy and
dependable as they once had known.
Ethical approval
This paper is derived from research that was conducted with ethics approval from UCL.
Acknowledgments
This paper would not have been possible without the participation and help of my informants in the
United Arab Emirates, and I am grateful for the time they have given me. The author would like to
thank the editors of this special edition, Susie Kilshaw, Sahra Gibbon, and Margaret Sleeboom-
Faulkner for their reviews and suggestions that helped develop this paper. The author also thanks
the anonymous reviewers for their helpful comments and edits.
Disclosure statement
No potential conflict of interest was reported by the author.
ORCID
Aaron Parkhurst http://orcid.org/0000-0002-0762-0929
References
Abdulrazzaq, Y. M., A. Bener, L. I. Al-Gazali, A. I. Al-Khayat, R. Micallef, and T. Gaber. 1997. “A
Study of Possible Deleterious Effects of Consanguinity.” Clinical Genetics 51: 167–173.
Al-Gazali, L. I., A. Bener, Y. M. Abdulrazzaq, R. Micallef, A. I. Al-Khayat, and T. Gaber. 1997.
“Consanguineous Marriages in the United Arab Emirates.” Journal of Biosocial Science 29 (4):
491–497.
Al-Gazali, L. I., A. H. Dawodu, K. Sabarinathan, and M. Varghese. 1995. “The Profile of Major
Congenital Abnormalities in the United Arab Emirates (UAE) Population.” Journal of Medical
Genetics 32: 7–13.
Avise, John C. 2001. The Genetic Gods: Evolution and Belief in Human Affairs. Boston, MA:
Harvard University Press. (first published 1998).
Beaudevin, Claire. 2013. “Old Diseases & Contemporary Crisis. Inherited Blood Disorders in
Oman.” Anthropology & Medicine 20 (2): 175–189.
Bourdieu, Pierre. 1963. “The Attitude of the Algerian Peasant Toward Time.” In Mediterranean
Countrymen: Essays in the Social Anthropology of the Mediterranean, edited by J. Pitt-Rivers, 55–
72. Paris: Mouton.
Bourgois, P. 2011. “Lumpen Abuse: The Human Rights Cost of Righteous Neoliberalism.” City and
Society 23 (1): 2–12.
Burgoine, T., N. G. Forouhi, S. J. Griffin, N. J. Wareham, and P. Monsivais. 2014. “Associations
Between Exposure to Takeaway Food Outlets, Takeaway Food Consumption, and Body Weight
in Cambridgeshire, UK: Population Based, Cross Sectional Study.” British Medical Journal 348:
g1464. doi:10.1136/bmj.g1464.
Cetateanua, A., and A. Jones. 2014. “Understanding the Relationship Between Food Environments,
Deprivation and Childhood Overweight and Obesity: Evidence from a Cross Sectional England-
Wide Study.” Health & Place 27: 68–76.
Chaves, Mark. 2010. “SSSR Presidential Address: Rain Dances in the Dry Season: Overcoming the
Religious Congruence Fallacy.” Journal for the Scientific Study of Religion 49 (1): 1–14.
Church, T. S., D. M. Thomas, C. Tudor-Locke, P. T. Katzmarzyk, C. P. Earnest, R. Q. Rodarte, C. K.
Martin, S. N. Blair, and C. Bouchard. 2011. “Trends Over 5 Decades in U.S. Occupation-Related
Physical Activity and Their Associations with Obesity.” PLoS ONE 6(5): e19657.
Csordas, Thomas J. 1994. Embodiment and Experience: The Existential Ground of Culture and Self.
Cambridge, UK: Cambridge University Press.
Dawkins, Richard. 1999. The Extended Phenotype: The Long Reach of the Gene. Oxford, UK: Oxford
University Press.
Douglas, Mary. 1966. Purity and Danger. London: Routledge.
Edwards, N. 2012. “Taking Action on Health Inequities: Essential Contributions by Qualitative
Researchers.” International Journal of Qualitative Methods 11: 61–63.
Farmer, Paul. 2005. Pathologies of Power: Health, Human Rights, and The New War on The Poor.
Berkeley: University of California Press.
Franklin, Sarah. 1995. “Science as Culture, Cultures of Science.” Annual Review of Anthropology 24:
163–184.
Franklin, Sarah, and Roberts, Celia. 2006. An Ethnography of Preimplantation Genetic Diagnosis.
Princeton, NJ: Princeton University Press.
Frazer, J. G. 1990. “The Golden Bough.” In The Golden Bough, 701–711. London, UK: Palgrave
Macmillan.
Fullwiley, Duana. 2007. “Race and Genetics: Attempts to Define the Relationship.” Biosocieties
2 (02): 221–237.
Gibbon, Sahra, and Novas, Carlos. 2008. Biosocialities, Genetics and the Social Sciences: Making
Biologies and Identities. London: Routledge.
Hage, Ghassan. 2010. “Hating Israel in the Field.” In Emotions in the Field, edited by James Davies
and Dimitrina Spencer, 129–154. Palo Alto, CA: Stanford University Press.
Hamdy, Sherine F. 2008. “When the State and Your Kidneys Fail: Political Etiologies in an Egyptian
Dialysis Ward.” American Ethnologist 35 (4): 553–569.
Hamdy, Sherine F. 2009. “Islam, Fatalism, and Medical Intervention: Lessons from Egypt on the
Cultivation of Forbearance (Sabr) and Reliance on God (Tawakkul).” Anthropological Quarterly
82 (1): 173–196.
International Diabetes Federation. 2010. IDF Diabetes Atlas. 5th ed. Accessed 04 April 2017.
http://www.diabetesatlas.org/resources/previous-editions.html,
http://www.allcountries.org/ranks/diabetes_prevalence_country_ranks.html
International Diabetes Federation. 2015. IDF Diabetes Atlas. 7th ed. Accessed 04 April 2017.
http://www.diabetesatlas.org/resources/previous-editions.html
Kilshaw, S., T. Al Raisi, and F. Alshaban. 2015. “Arranging Marriage; Negotiating Risk: Genetics
and Society in Qatar.” Anthropology & Medicine 22 (2): 98–113.
Latour, Bruno. 1993. We Have Never Been Modern. Translated by C. Porter. London: Harvester
Wheatsheaf.
Leatherman, Thomas L., and Alan Goodman. 2005. “Coca-Colonization of Diets in the Yucatan.”
Social Science & Medicine (The Social Production of Health: Critical Contributions from
Evolutionary, Biological and Cultural Anthropology: Papers in Memory of Arthur J. Rubel)
61 (4): 833–846. doi:10.1016/j.socscimed.2004.08.047.
Limbert, Mandana. 2010. In the Time of Oil: Piety, Memory, and Social Life in an Omani Town.
Palo Alto, CA: Stanford University Press.
Lock, Margaret. 2013. “The Epigenome and Nature/Nurture Reunification: A Challenge for
Anthropology.” Medical Anthropology 32 (4): 291–308.
Lock, Margaret. 2015. “Comprehending the Body in the Era of the Epigenome.” Current Anthropol-
ogy 56 (2): 151–177.
Mendenhall, E., R. A. Seligman, A. Fernandez, and E. A. Jacobs. 2010. “Speaking Through Diabetes:
Rethinking the Significance of Lay Discourses on Diabetes.” Medical Anthropology Quarterly
24 (2): 220–239.
“Metropolis”. 1927. Dir. Fritz Lang [Film]. Germany: Universum Film AG.
Mumford, Lewis. 1934. Technics and Civilization. New York: Harcourt, Brace & Company.
Napier, A. David, Clyde Ancarno, Beverley Butler, Joseph Calabrese, Angel Chater, Helen Chatter-
jee, François Guesnet, et al. 2014. “Culture and Health.” The Lancet 384 (9954): 1607–1639.
Parkhurst, A. L. 2014. “Genes and Djinn: Identity and Anxiety in Southeast Arabia.” PhD diss, UCL
(University College London).
Popenoe, Rebecca. 2003. Feeding Desire: Fatness, Beauty and Sexuality Among a Saharan People:
Fatness and Beauty in the Sahara. London: Routledge.
Rabinow, P. 1996. Artificiality and Enlightenment: From Sociobiology to Biosociality. Essays on the
Anthropology of Reason. Princeton, NJ: Princeton University Press.
Rabinow, Paul. 2008. “Afterword. Concept Work.” In Biosocialities, Genetics and the Social
Sciences: Making Biologies and Identities, edited by S. Gibbon and C. Novas, 188–193. London:
Routledge.
Randall, S. C. 2011. “Fat and Fertility, Mobility and Slaves: Long-Term Perspectives on Tuareg
Obesity and Reproduction.” In Fatness and the Maternal Body: Women’s Experiences of Corporeality
and the Shaping of Social Policy, edited by M. Unnithan-Kumar and S. Tremayne, 43–70.
Oxford: Berghahn.
Scheper-Hughes, Nancy, and Margaret M. Lock. 1987. “The Mindful Body: A Prolegomenon to
Future Work in Medical Anthropology.” Medical Anthropology Quarterly 1 (1): 6–41.
Senior, V., T. M. Marteau, and T. J. Peters. 1999. “Will Genetic Testing for Predisposition for Dis-
ease Result in Fatalism? A Qualitative Study of Parents Responses to Neonatal Screening for
Familial Hypercholesterolaemia.” Social Science & Medicine 48 (12): 1857–1860.
Sennett, Richard. 1994. Flesh and Stone: The Body and the City in Western Civilization. New York:
W.W. Norton.
Tadmouri, Ghazi O., Pratibha Nair, Tasneem Obeid, Mahmoud T. Al Ali, Najib Al Khaja, and
Hanan A. Hamamy. 2009. “Consanguinity and Reproductive Health Among Arabs.” Reproduc-
tive Health 6: 17. doi:10.1186/1742-4755-6-17.
Copyright of Anthropology & Medicine is the property of Routledge and its content may not
be copied or emailed to multiple sites or posted to a listserv without the copyright holder’s
express written permission. However, users may print, download, or email articles for
individual use.
journal.pone.0203644
RESEARCH ARTICLE
Population structure and gene flow of the
tropical seagrass, Syringodium filiforme, in the
Florida Keys and subtropical Atlantic region
Alexandra L. Bijak1*, Kor-jent van Dijk2, Michelle Waycott2,3
1 Department of Environmental Sciences, University of Virginia, Charlottesville, Virginia, United States of
America, 2 School of Biological Sciences, Environment Institute, Australian Centre for Evolutionary Biology
and Biodiversity, University of Adelaide, Adelaide, South Australia, Australia, 3 State Herbarium of South
Australia, Department of Environment, Water and Natural Resources, Adelaide, South Australia, Australia
* alb5bd@virginia.edu
Abstract
Evaluating genetic diversity of seagrasses provides insight into reproductive mode and
adaptation potential, and is therefore integral to broader conservation strategies for coastal
ecosystems. In this study, we assessed genetic diversity, population structure and gene
flow in an opportunistic seagrass, Syringodium filiforme, in the Florida Keys and subtropical
Atlantic region. We used microsatellite markers to analyze 20 populations throughout the
Florida Keys, South Florida, Bermuda and the Bahamas primarily to understand how
genetic diversity of S. filiforme partitions across the Florida Keys archipelago. We found low
allelic diversity within populations, detecting 35–106 alleles across all populations, and in
some instances moderately high clonal diversity (R = 0.04–0.62). There was significant
genetic differentiation between Atlantic and Gulf of Mexico (Gulf) populations (FST = 0.109 ±
0.027, p-value = 0.001) and evidence of population structure based on cluster assignment,
dividing the region into two major genetic demes. We observed asymmetric patterns in gene
flow, with a few instances in which there was higher than expected gene flow from Atlantic to
Gulf populations. In South Florida, clustering into Gulf and Atlantic groups indicates dispersal
in S. filiforme may be limited by historical or contemporary geographic and hydrologic barriers,
though genetic admixture between populations suggests exchange may occur between
narrow channels in the Florida Keys, or has occurred through other mechanisms in recent
evolutionary history, maintaining regional connectivity. The variable genotypic diversity, low
genetic diversity and evidence of population structure observed in populations of S. filiforme
resemble the population genetics expected for a colonizer species.
PLOS ONE | https://doi.org/10.1371/journal.pone.0203644 September 5, 2018 1 / 18
OPEN ACCESS
Citation: Bijak AL, van Dijk K-j, Waycott M (2018) Population structure and gene flow of the
tropical seagrass, Syringodium filiforme, in the Florida Keys and subtropical Atlantic region.
PLoS ONE 13(9): e0203644. https://doi.org/10.1371/journal.pone.0203644
Editor: Heather M. Patterson, Department of Agriculture and Water Resources, AUSTRALIA
Received: April 7, 2018
Accepted: August 26, 2018
Published: September 5, 2018
Copyright: © 2018 Bijak et al. This is an open access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted use, distribution, and reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: Complete microsatellite genotype data are available from the
Dryad database (accession number doi:10.5061/dryad.pp0q255).
Funding: Financial support for this study was provided by the Jones Environmental Research
Endowment to the Department of Environmental Sciences at the University of Virginia. The
funders had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.
Introduction
Genetic diversity is paramount to the long-term survival of populations, as genetic variation
provides the basis for adaptation to environmental change via natural selection and confers
short-term fitness advantages at the population level. Population structure and gene flow,
which describe the level of genetic differentiation and connectivity between populations, are
important components and drivers of genetic diversity. Quantifying genetic diversity within
and among natural populations enhances conservation efforts because genetic patterns are dif-
ficult to predict given the complex suite of environmental and biological factors that contribute
to genetic diversity and population structure [1]. In seagrass ecosystems, genetic diversity is of
particular concern because within-species diversity may replace the functional role of species
diversity due to the limited number of species present in seagrass communities [2]. Within-
species diversity is also important to the short-term population persistence of seagrasses as
genetically diverse assemblages of multiple unique genotypes (or clones) promote greater resis-
tance and faster recovery following disturbance [3,4]. Successional stage, an aggregate category
based on multiple traits, provides an ecological lens to assess broad patterns in plant genetic
diversity and population structure. Examining population genetics within the context of eco-
logical succession aids in identifying traits that foster resilience in populations of foundational
taxa such as seagrasses, and thereby the ecosystems they support.
Theory suggests early successional species are expected to have diminished genetic diversity
due to founder effects and to develop strong population structure due to limited gene flow,
while later successional species are typified by greater standing genetic diversity and weaker
population structure [5]. In terrestrial ecosystems, long-lived woody species tend to have more
genetic diversity within populations and less variation between populations based on allozyme
studies [6] in congruence with expectations, though patterns in terrestrial pioneering species
are less clear. Populations of an early successional species, Silene dioica, in the Gulf of Bothnia
show strong differentiation when the supply of colonists is limited [7], while other European
early colonizing plant species have higher than expected genetic diversity within populations
and low genetic differentiation between populations [8,9]. The relationship between succes-
sional status and population genetics in seagrasses, however, has not been thoroughly
explored.
Seagrasses present an opportunity to study environmental and ecological determinants of
genetic diversity because they comprise a globally distributed, paraphyletic taxon that has
evolved from up to four independent lineages [10] and represents a spectrum of life history
strategies. Analyses of diversity, population structure and gene flow can reveal biological and
physical phenomena that promote or deter the exchange of genetic material across popula-
tions. Species biological traits such as breeding system, pollination mechanisms and dispersal
capability strongly influence genetic diversity as measured by genotypic diversity, gene copy
(or allele) diversity and heterozygosity [11]. The capability to propagate through horizontal
rhizome expansion and reproduction by seed has led to early notions that seagrasses are pre-
dominantly clonal and therefore lack genetic diversity [12,13]. The development of high-reso-
lution markers prompted studies that have countered this expectation by detecting higher
genetic diversity than initially reported for several seagrass species [14], generating questions
regarding the role of dispersal and sexual reproduction in shaping seagrass population genet-
ics. Environmental conditions, such as water quality, prevailing winds and local water move-
ment, contribute to fine-scale population genetic structure in seagrasses [15,16], while
geographic history, including glaciation and continental drift, in conjunction with modern
gene flow patterns influenced by oceanic hydrology, determine genetic connectivity at broader
spatial scales [17]. In this study, we described the genetic diversity, population structure and
gene flow of the opportunistic seagrass, Syringodium filiforme, in the Florida Keys and subtrop-
ical Atlantic region.
Competing interests: The authors have declared that no competing interests exist.
S. filiforme is widely distributed throughout the western tropical and subtropical Atlantic
Ocean in shallow coastal and back reef environments [18,19], and is a common species in
seagrass meadows that cover as many as 17,629 km² of South Florida coastline [20]. These
meadows support marine food webs ranging from epiphytic algae to large grazers, deliver
ecosystem services by stabilizing coastal sediments and improving local water quality, and have
recently been recognized for their role in carbon storage [21,22]. Seagrasses in this region are
threatened by local impacts related to water quality such as sedimentation and nutrient over-
enrichment [23–25] and have experienced substantial die-offs within the past several decades,
notably in Florida Bay [26–29] and Tampa Bay [30]. In order to understand the full impact of
environmental decline and perturbations in seagrass ecosystems, evaluating existing levels of
genetic diversity, population structure and gene flow is essential. Sampling locations for this
study spanned across tens of kilometers in the focal area of the Florida Keys and South Florida,
but also included remote populations in Bermuda and the Bahamas at distances of hundreds
to thousands of kilometers apart in order to compare diversity in South Florida populations to
diversity across more distant populations.
S. filiforme generally dominates early successional meadows because it has relatively high
horizontal rhizome elongation rates [31] and tolerates sediment conditions that are less favor-
able to other dominant seagrasses, but is also present in the climax state [32]. These traits
enable S. filiforme to quickly colonize bare areas through clonal propagation, but also through
seed and vegetative fragment dispersal, especially following disturbance [33]. Based on the
ability to colonize, reproduce by seed, and generate and maintain substantial biomass, Kilmin-
ster et al. [34] categorized Syringodium species as opportunistic, exhibiting a mixture of life his-
tory traits found in both colonizing and persistent species. Previous studies have characterized
the genetic diversity of other common South Florida seagrasses, Thalassia testudinum and
Halodule wrightii. In line with theory, T. testudinum, a late successional seagrass species, exhib-
ited high genetic diversity within populations and weak genetic structure in Florida Bay and
the Lower Keys (regions within the Florida Keys are typically described on a north-south basis
as Upper, Middle and Lower Keys) [35–37]. As expected for an early colonizer and opportu-
nistic species, most of the genetic variation in H. wrightii partitioned among populations rather
than within populations in a study focused on the Gulf of Mexico and Florida Bay [38]. We
hypothesized S. filiforme would exhibit genetic diversity and population structure patterns sim-
ilar to those expected for colonizer species, and would therefore reveal high clonality, low
genetic diversity and strong differentiation between populations throughout the Florida Keys
and wider study area.
South Florida coastal waters exhibit particularly complex hydrology, especially around the
Florida Keys, an archipelago that spans 350 km from the South Florida mainland to Key West,
separating the Gulf of Mexico and Florida Bay from the Atlantic Ocean [39]. The distribution
of S. filiforme across the Florida Keys ranges from marginal and patchy in northeastern Florida
Bay, to sparse in offshore intermixed beds on the Atlantic Ocean (hereafter referred to as
Atlantic) side, to dense, monospecific stands along the Middle and Lower Keys on the Gulf of
Mexico (hereafter referred to as Gulf) side [40,41]. The division created by the Florida Keys
archipelago separates geographically proximal S. filiforme populations in the Atlantic and Gulf
basins, leading us to predict these basins host genetically distinct populations due to limited
opportunity for propagule exchange and gene flow across a physical barrier. We expected the
Bahamas population to have high genetic connectivity with the Florida populations because of
its relative proximity, but the Bermuda population to be genetically distinct from the Florida
populations due to its geographic isolation.
In this study, we used species-specific microsatellite loci to assess genetic diversity, popula-
tion genetic structure and connectivity via gene flow in S. filiforme across the Florida Keys and
subtropical Atlantic region. We examined 1) whether clonality varies within 100s of m² and
1000s of m² spatial scales; 2) the relative amounts of genetic diversity present within
individuals and populations of S. filiforme; 3) the degree to which genetic differentiation among
populations results in population structure; and 4) whether there are patterns in the magnitude
and direction of gene flow between populations.
Methods
Sample collection
We sampled a total of 20 meadows, hereafter termed populations, in South Florida, the Baha-
mas and Bermuda (Fig 1; see S1 Table for site GPS coordinates) following three sampling
designs over the summers of 2014 and 2015. In 2014, we sampled within a ~2,500 m² area to
estimate clonal extent. After detecting unique genotypes within meters of each other, we
reduced the sampling area in 2015 and modified the sampling area for Florida Bay and Bermuda
to accommodate the greater patchiness of S. filiforme meadows. Genetic data
collected with uneven but similar sampling schemes are comparable when using unique geno-
types for regional analyses of genetic diversity and population structure [42], assuming the
alleles detected are representative of the areas sampled [43,44]. The use of three sampling
approaches did not allow for direct comparison of genotypic diversity across sampling designs;
however, the primary goal of this study was to determine allelic diversity in order to evaluate
population structure and regional connectivity, not to determine fine-scale population structure
or the spatial distribution of clones.
Fig 1. Map of study area and sampling locations. The inset map shows the relative positions of the Florida Keys, Tampa Bay, the Bahamas and
Bermuda. The main map shows the positions of the Florida Keys sampling locations. Site numbers are displayed to minimize text in the figure
(corresponding site names are available in Table 1). Sampling methodology is represented by shape in both the main and inset maps (sampling area of
~2,500 m²: circle; sampling area of ~500 m²: square; Florida Bay and Bermuda–composite sampling areas of ~70 m²: triangle).
https://doi.org/10.1371/journal.pone.0203644.g001
In 2014, we sampled eight populations in the Upper and Middle Keys on the Atlantic side,
two populations on the Gulf side (Sprigger and Sluiceway) and a single population in Tampa
Bay. In 2015, we sampled six populations in the Middle and Lower Keys on the Gulf side and
one population in the northeastern portion of Florida Bay. Additionally in 2015, we sampled
single populations in San Salvador, the Bahamas and Bailey’s Bay, Bermuda. Leaves of at least
50 individual S. filiforme ramets were randomly collected within a ~2,500 m² sampling area, spaced 5 m apart, for the 2014 collection, and within a ~500 m² sampling area, spaced 1.5 m apart, for most of the 2015 collection. At the Florida Bay and Bermuda sites where the distribution of S. filiforme was limited, six and five smaller areas (spaced < 1 km) were sampled, respectively. Within each area, 24 leaves were collected from ramets (spaced 1.5 m apart) in a ~70 m² sampling area.
Ethics statement
Permits were required for sample collection in the Lower Florida Keys (Florida Keys National Marine Sanctuary) and Florida Bay (Everglades National Park) in 2015 because sediment was collected in addition to seagrass plant tissue for supplemental analyses; sampling in these areas was conducted under FKNMS-2015-085 and EVER-2013-SCI-0058, respectively. Sampling in Bermuda was conducted under the Bermuda Dept. of Conservation Services License no. 15-04-16-22.
Genotyping
Total genomic DNA was extracted from the samples collected in 2014 using a DNeasy™ Plant Kit (QIAGEN) according to the manufacturer’s instructions. Extracted DNA was quantified on a Qubit® 2.0 Fluorometer (Invitrogen). Samples collected in 2015 were sent to the University of Wisconsin Biotechnology (University of Wisconsin, Wisconsin, USA) for extraction and quantification. DNA was extracted from 40–50 mg of dried leaf tissue using the CTAB method as described in Saghai-Maroof et al. with minimal modification [45]. Following elution, a final DNA cleaning step was performed using a 1.5:1 by volume ratio of Axygen Clean-Seq beads (Corning Life Sciences, Corning, NY, USA) to extracted DNA to remove any remaining inhibitory compounds in the sample. DNA was quantified using Quant-IT PicoGreen fluorescent dye (Thermo Fisher, Waltham, MA, USA). All extracted DNA was diluted to a concentration of ~5 ng μL⁻¹. For some samples, DNA extraction was unsuccessful due to the poor tissue quality of senescing seagrass leaves, reducing the sample size for several sites.
A total of 17 microsatellite loci were amplified using fluorescently labeled primers [46].
PCR was conducted in three multiplex panels using a Type-it® Microsatellite Multiplex PCR Kit (QIAGEN) in 10 μL reactions with 0.5 μL of 2 μM primer mix and 1 μL of diluted template DNA. PCR conditions were set to the manufacturer’s optimized cycling conditions (QIAGEN). PCR products were sequenced on a capillary-based 3730xl DNA Analyzer (Applied Biosystems) with an internal ET-ROX 500 size standard at the Georgia Genomics Facility (University of Georgia, Georgia, USA). Fragment lengths for each locus were determined using Geneious v7.1.9 (Biomatters Ltd.) and its microsatellite plugin (v1.4.0). Verification samples from 2014 were included in 2015 PCR and sequencing steps to assess the reproducibility of our methods. Approximately 32% of the verification sample loci either did not successfully amplify during PCR or did not produce microsatellite peaks when sequenced, likely due to pipetting error. When verification samples were successfully amplified and sequenced, discrepancies between microsatellite peaks in 2014 and 2015 occurred for less than 3% of samples.
Within-population genetic diversity
The number of unique multi-locus genotypes (MLGs), G, the probability that individuals sharing the same genotype were derived from separate sexual events (Psex), and the probability of clonal identity (Pgen) were estimated for each population using GENCLONE version 2.0 [47,48]. Unique MLGs were identified under the assumption that scoring error and somatic mutation rates were negligible (genotypes with a single allele difference were considered distinct). We tested for the presence of null alleles across all populations using ML-Null Freq with 100,000 randomizations [49].
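The MLG identification rule and Pgen described above can be illustrated with a minimal Python sketch. This is not the GENCLONE implementation: the toy genotypes, the function names, and the choice to estimate allele frequencies from one copy of each distinct MLG are all assumptions made for illustration. It counts MLGs by exact matching (so a single allele difference yields a distinct MLG) and computes the Hardy-Weinberg probability of a genotype as the product over loci of p² for homozygous loci and 2pq for heterozygous loci.

```python
from collections import Counter
from math import prod

# Hypothetical toy data (not from the study): each sample is a tuple of
# per-locus genotypes, and each genotype is a sorted pair of allele sizes.
samples = [
    ((150, 150), (201, 205)),
    ((150, 150), (201, 205)),  # exact replicate of the first sample (same MLG)
    ((150, 154), (201, 201)),
    ((154, 154), (205, 205)),
]


def unique_mlgs(samples):
    """Count exact multi-locus genotypes; any allele difference is a new MLG."""
    return Counter(samples)


def allele_frequencies(genets, locus):
    """Allele frequencies at one locus, estimated from distinct MLGs only."""
    counts = Counter(allele for g in genets for allele in g[locus])
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def p_gen(mlg, genets):
    """Hardy-Weinberg probability of one MLG across all loci."""
    per_locus = []
    for locus, (a, b) in enumerate(mlg):
        freq = allele_frequencies(genets, locus)
        # p^2 for a homozygote, 2pq for a heterozygote
        per_locus.append(freq[a] ** 2 if a == b else 2 * freq[a] * freq[b])
    return prod(per_locus)


mlg_counts = unique_mlgs(samples)
genets = list(mlg_counts)            # one copy of each distinct MLG
G, N = len(genets), len(samples)     # number of genets and of sampled ramets
first_pgen = p_gen(genets[0], genets)
print(G, N, first_pgen)
```

Treating genotypes as hashable tuples keeps the "no scoring error, no somatic mutation" assumption explicit: relaxing it would mean replacing exact matching with a distance threshold.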
Genotypic richness, R, the proportion of genetically distinct individuals (or genets) in the population, was calculated as R = (G − 1)/(N − 1) [50]. For the remainder of population genetic analyses, replicate MLGs were removed from the dataset to avoid allele frequency bias due to the presence of clones. The total number of alleles, A, average number of alleles per locus, NA, and average allelic richness per locus standardized by smallest sample size, AR, were calculated using the ‘diveRsity’ package [51] in R [52]. For each population, observed heterozygosity (Ho), expected heterozygosity (He), and deviation from Hardy-Weinberg equilibrium as measured by the inbreeding coefficient, FIS, were calculated in GENALEX version 6.5 [53,54]. We calculated linkage disequilibrium for each population using log-likelihood tests in GENEPOP version 4.2 [55,56] and determined significance using a sequential Bonferroni correction to account for multiple comparisons.
Genetic differentiation between populations
An analysis of molecular variance (AMOVA) was performed first on all populations to assess overall genetic differentiation, and again with only populations bordering the archipelago (populations with numeric codes 1–16 in Fig 1) in a nested design to evaluate differentiation between the Gulf and Atlantic populations, following the assumptions of the Infinite Allele Model in GENODIVE [57]. Standard deviations for AMOVA F-statistics were calculated by jackknife resampling over loci, and 999 permutation tests were used to assess significance. Fixation indices Weir and Cockerham’s FST [58] and Jost’s D [59] were calculated for all possible pairwise population combinations using the ‘diveRsity’ package in R. Statistical significance was determined by 95% confidence intervals derived from bias corrected bootstrapping.
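Returning to the within-population statistics defined above, the following Python sketch applies R = (G − 1)/(N − 1) to four (N, G) pairs copied from Table 1, and computes Ho, He and FIS for a single locus. The single-locus genotypes are hypothetical, and the uncorrected estimator FIS = 1 − Ho/He is an assumption; GENALEX averages over loci and may apply corrections, so this is not expected to reproduce the tabulated Ho, He or FIS values.

```python
from collections import Counter

# (N, G) pairs copied from Table 1 for four populations.
table1 = {
    "Carysfort": (48, 19),
    "Elbow": (45, 28),
    "Key West": (23, 2),
    "Florida Bay": (123, 6),
}


def genotypic_richness(n_samples, n_genets):
    """R = (G - 1) / (N - 1)."""
    return (n_genets - 1) / (n_samples - 1)


richness = {pop: round(genotypic_richness(n, g), 2) for pop, (n, g) in table1.items()}


def locus_stats(genotypes):
    """Observed and expected heterozygosity and FIS for one diploid locus."""
    n = len(genotypes)
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for g in genotypes for allele in g)
    he = 1 - sum((c / (2 * n)) ** 2 for c in counts.values())
    fis = 1 - ho / he  # negative values indicate heterozygote excess
    return ho, he, fis


# Hypothetical genotypes at one locus (allele sizes), not data from the study.
toy_locus = [(150, 154), (150, 154), (150, 154), (150, 150)]
ho, he, fis = locus_stats(toy_locus)
print(richness)
print(ho, he, fis)
```

In the toy locus three of four genets are heterozygous, so Ho exceeds He and FIS is negative, the same direction of deviation reported for most populations in this study.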
Principal components analysis (PCA) was performed in GENODIVE using a covariance matrix based on individual allele frequencies to determine whether geographically proximal samples exhibit similar allele frequencies, but without the assumption of hierarchical genetic structure.
Population structure and gene flow
To determine the most likely number of population clusters, K, population assignment utilizing a Bayesian approach was performed in the genetic software program STRUCTURE [60]. Admixture was specified in the model, allowing genotypes to show membership to more than one cluster. The correlated allele frequency model was selected and sampling locations were not used as priors in the analysis. Model parameters were set to K = 1–20, with 10 iterations run for each K, and an initial burn-in period of 100,000 iterations (sufficient for α and FST to converge) followed by 1,000,000 Markov Chain Monte Carlo repetitions. The most likely number of population clusters was determined by the ad hoc quantity, ΔK [61]. The complementary software program CLUMPAK [62] was used for downstream processing, and DISTRUCT was used for visual representation of the results [63].
Average total migration, Nm, was estimated using FST [64] and rare alleles methods [65] in GENALEX and GENEPOP, respectively. Additionally, pairwise relative migration rates were estimated using Alcala’s Nm [66] and directionality of differentiation was estimated according to methods developed by Sundqvist et al. [67] using ‘diveRsity’ in R.
Results
Within-population genetic diversity
For most populations, Pgen ranged from 3.3 × 10⁻⁸ to 7.0 × 10⁻³, indicating there was a low probability of generating the observed genotypes under Hardy-Weinberg Equilibrium conditions.
Psex ranged from 1.7 × 10⁻⁷ to 2.0 × 10⁻³, though there were higher values for Psex in the following Florida populations: Crane (0.094), Key West (0.15), Tampa Bay (0.063), and Florida Bay (0.19). The few instances in which Psex exceeded 0.05 occurred in populations dominated by few clones, thereby inflating Psex, and were unlikely to have greatly impacted the accuracy of heterozygosity estimates and other statistical analyses performed. The 37 instances (of the total 2,720 pairwise comparisons) in which linkage disequilibrium was significant after a Bonferroni correction was applied (p-value < 0.003) were also unlikely to affect subsequent population genetic analyses.
Genotypic richness was highly variable among populations, ranging from 0.04 to 0.62 (Table 1). Genotypic richness for Florida Keys populations sampled in 2014 and 2015 ranged from 0.37 to 0.62 and from 0.05 to 0.43, respectively. Genotypic richness values may have been overestimated because we did not account for scoring error or somatic mutation when identifying unique MLGs. The total number of alleles ranged from 35 to 106 and the average number of alleles per locus ranged from 2.06 to 6.24. Once adjusted for sample size, allelic richness was similar across all populations, ranging from 2.03 to 2.56. Observed heterozygosity ranged from 0.21 to 0.51, and expected heterozygosity ranged from 0.18 to 0.44. Deviation from Hardy-Weinberg conditions was detected in nine populations (p-value < 0.05), most of which exhibited negative inbreeding coefficients. We found no significant effect of null alleles, except in populations with few genets (G ≤ 10) and loci for which all samples were homozygous for the same allele, or fixed. Excluding populations with few genets and loci with fixed alleles, the mean per locus significance of heterozygote deficiency due to null alleles across populations ranged from 0.276 to 0.827.
Table 1. Summary genetic statistics for all populations.
No.  Population    N    G   R     A    NA    AR    Ho           He           FIS
1    Carysfort     48   19  0.38  98   5.76  2.53  0.51 ± 0.08  0.43 ± 0.07  -0.16 ± 0.04
2    Elbow         45   28  0.61  106  6.24  2.53  0.47 ± 0.08  0.43 ± 0.07  -0.08 ± 0.02
3    Dixie         50   20  0.39  88   5.18  2.48  0.44 ± 0.08  0.39 ± 0.07  -0.10 ± 0.05
4    Conch         47   18  0.37  85   5.00  2.47  0.51 ± 0.09  0.40 ± 0.06  -0.23 ± 0.05
5    Davis         47   22  0.46  102  6.00  2.56  0.48 ± 0.07  0.44 ± 0.07  -0.09 ± 0.02
6    Molasses      48   22  0.45  93   5.47  2.55  0.49 ± 0.07  0.43 ± 0.07  -0.13 ± 0.01
7    Alligator     45   19  0.41  90   5.29  2.55  0.45 ± 0.07  0.42 ± 0.06  -0.08 ± 0.05
8    Tennessee     46   29  0.62  102  6.00  2.51  0.41 ± 0.07  0.40 ± 0.07  -0.04 ± 0.04
9    Sprigger      32   18  0.55  73   4.29  2.49  0.44 ± 0.07  0.38 ± 0.06  -0.17 ± 0.03
10   Sluiceway     48   22  0.45  66   3.88  2.43  0.42 ± 0.07  0.34 ± 0.06  -0.20 ± 0.04
11   Marathon      22   10  0.43  67   3.94  2.45  0.44 ± 0.09  0.42 ± 0.05  0.08 ± 0.14
12   Pigeon        43   17  0.38  75   4.41  2.44  0.38 ± 0.06  0.33 ± 0.05  -0.12 ± 0.05
13   Bahia Honda   47   15  0.30  67   3.94  2.43  0.39 ± 0.08  0.35 ± 0.06  -0.09 ± 0.07
14   Water         39   12  0.29  67   3.94  2.47  0.42 ± 0.07  0.38 ± 0.05  -0.10 ± 0.08
15   Crane         31   12  0.37  59   3.47  2.37  0.29 ± 0.06  0.28 ± 0.05  -0.05 ± 0.08
16   Key West      23   2   0.05  35   2.06  2.03  0.41 ± 0.12  0.21 ± 0.06  -0.94 ± 0.04
17   Tampa Bay     33   6   0.16  47   2.76  2.24  0.21 ± 0.07  0.18 ± 0.05  -0.09 ± 0.08
18   Florida Bay   123  6   0.04  54   3.18  2.36  0.41 ± 0.08  0.34 ± 0.05  -0.19 ± 0.10
19   Bahamas       44   19  0.42  69   4.06  2.37  0.32 ± 0.08  0.29 ± 0.07  -0.08 ± 0.05
20   Bermuda       107  20  0.18  67   3.94  2.39  0.26 ± 0.06  0.29 ± 0.06  0.12 ± 0.07
Numeric codes are provided alongside location name for each population. Sample size (N), number of unique multilocus genotypes (G), genotypic richness (R), total number of alleles (A), average number of alleles per locus (NA), allelic richness per locus (AR), observed heterozygosity (Ho), expected heterozygosity (He) and inbreeding coefficient (FIS) are reported for each population. Standard error is included for Ho, He and FIS. Values in bold indicate significant deviation from Hardy-Weinberg equilibrium at p-value < 0.05.
https://doi.org/10.1371/journal.pone.0203644.t001
Genetic differentiation between populations
AMOVA revealed significant genetic differentiation between all populations (FST = 0.149 ± 0.017, p-value = 0.001) and significant genetic differentiation between the Gulf and Atlantic populations (FST = 0.109 ± 0.027, p-value = 0.001). The results of pairwise population differentiation were consistent across both statistics, FST and Jost’s D (S2 Table), with maximum values calculated as 0.531 and 0.295, respectively.
Similar patterns in relative differentiation between populations were observed for both statistics, and differences were primarily in the magnitude of pairwise values, thus only FST will be described in detail. Pairwise differentiation values were low to moderate within Atlantic populations (FST = 0.000–0.092), and low to high within Gulf populations (FST = 0.012–0.237). Pairwise differentiation values between Atlantic and Gulf populations ranged from 0.041 between Davis in the Upper Keys and Marathon near the Middle Keys, to 0.330 between Conch in the Upper Keys and Crane in the Lower Keys. Florida Bay exhibited similar levels of differentiation from Gulf (FST = 0.144–0.273) and Atlantic (FST = 0.177–0.259) sites. Tampa Bay, the westernmost site sampled, exhibited high differentiation from Atlantic populations (FST = 0.236–0.37) and moderate to high differentiation from Gulf populations (FST = 0.101–0.279). The highest overall pairwise differentiation was found between the Bahamas and Tampa Bay, where FST = 0.531. The next greatest values were found between the Bahamas and Gulf populations (FST = 0.261–0.473), and values were moderate between the Bahamas and Atlantic populations (FST = 0.192–0.259). Bermuda pairwise differentiation with Atlantic and Gulf sites was moderate to high, with FST values ranging from 0.142 to 0.224 and 0.176 to 0.280, respectively.

In the PCA, the first two principal component axes contained 18.3% and 7.7% of total variance, respectively (S1 Fig). The Atlantic and Gulf sites clustered separately, with some overlap occurring, mostly between Gulf sites proximal to breaks in the Middle Keys (Marathon, Pigeon and Sprigger), and Atlantic sites. Tampa Bay clustered with the Gulf sites, while Bermuda clustered between Gulf and Atlantic sites. The Bahamas clustered separately from all other sites.
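Two of the quantities reported above lend themselves to a quick arithmetic check. The sketch below is illustrative only and rests on assumptions that this section does not state: that genotypic richness follows R = (G − 1)/(N − 1), that 17 loci were scored (inferred because A/NA is approximately 17 in every row of Table 1), and that FST can be read through Wright's island-model approximation Nm ≈ (1/FST − 1)/4. The function names are hypothetical, and the island-model conversion is a textbook rule of thumb with strong equilibrium assumptions, not necessarily the estimator used in the study.

```python
"""Illustrative cross-checks on reported summary statistics (not the authors' code)."""

# A few rows of Table 1: (N, G, A, reported R, reported NA).
TABLE_1_ROWS = {
    "Carysfort": (48, 19, 98, 0.38, 5.76),
    "Elbow": (45, 28, 106, 0.61, 6.24),
    "Key West": (23, 2, 35, 0.05, 2.06),
    "Florida Bay": (123, 6, 54, 0.04, 3.18),
    "Bermuda": (107, 20, 67, 0.18, 3.94),
}

# Assumption: number of loci inferred from A / NA in Table 1, not stated here.
ASSUMED_LOCI = 17


def genotypic_richness(n_samples: int, n_genotypes: int) -> float:
    """R = (G - 1) / (N - 1): 0 for a single clone, 1 if every sample is unique."""
    if n_samples < 2:
        raise ValueError("need at least two samples")
    return (n_genotypes - 1) / (n_samples - 1)


def mean_alleles_per_locus(total_alleles: int, n_loci: int = ASSUMED_LOCI) -> float:
    """NA = A / number of loci."""
    return total_alleles / n_loci


def nm_from_fst(fst: float) -> float:
    """Island-model approximation: migrants per generation Nm = (1 / FST - 1) / 4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    for name, (n, g, a, r_rep, na_rep) in TABLE_1_ROWS.items():
        r = genotypic_richness(n, g)
        na = mean_alleles_per_locus(a)
        print(f"{name}: R = {r:.2f} (reported {r_rep}), NA = {na:.2f} (reported {na_rep})")
    # Global FST, Gulf vs Atlantic FST, and the maximum pairwise FST from the text.
    for fst in (0.149, 0.109, 0.531):
        print(f"FST = {fst}: Nm is roughly {nm_from_fst(fst):.2f}")
```

For example, Carysfort (N = 48, G = 19) gives 18/47 ≈ 0.38, matching Table 1. Under the island-model approximation, FST = 0.2 corresponds to one migrant per generation, so the global FST of 0.149 implies somewhat more than one migrant per generation, while the maximum pairwise value of 0.531 implies far fewer.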
Population structure and gene flow

Population structure was present, with greatest statistical support for K = 2 population clusters (ΔK = 297.26), followed by K = 4 (ΔK = 20.84). For K = 2, Atlantic and Gulf populations clustered separately, and Tampa Bay, Florida Bay and the Bahamas were assigned to the Gulf cluster (Fig 2). The genotypes in the Bermuda population show mixed membership in both the Atlantic and Gulf clusters. For K = 4, the Atlantic and Gulf populations still clustered separately, and the Bahamas and Bermuda were assigned to distinct clusters. For both K = 2 and K = 4, Gulf sites proximal to breaks in the Middle Keys (Sprigger, Marathon and Pigeon) exhibit admixture with Atlantic populations.

Average migration between all populations was 1.7 and 2.6 migrants per generation, following the FST method and private alleles method, respectively. Relative pairwise migration was highest among Atlantic populations, ranging from 0.174 to 1, on a scale from 0 to 1 (Fig 3; S3 Table). Within the Atlantic group, lowest genetic exchange occurred from Conch to Elbow, and the highest from Davis to Carysfort. Exchange within the Gulf populations (excluding Key West) ranged from 0.029 to 0.792, and the greatest exchange occurred between Sluiceway and Sprigger, both located on the western edge of Florida Bay. Exchange to and from Key West was particularly low and did not exceed 0.084. There was greater relative migration from Atlantic sites to Gulf sites proximal to a break in the Middle Keys (Marathon, Pigeon and Sprigger) than there was from within the Gulf. Florida Bay exhibited relative migration rates lower than 0.125 with greatest outgoing migration to the Atlantic site Davis.
Tampa Bay exhibited migration rates lower than 0.148 with highest migration coming from Gulf sites. The Bahamas exhibited negligible migration rates, not exceeding 0.085. Incoming relative migration to Bermuda was always less than 0.067, while outgoing migration ranged from 0.01 to 0.23, with the highest rates of exchange occurring with the Atlantic populations.

Discussion

Within the Florida Keys and subtropical Atlantic region, S. filiforme exhibits low genetic diversity when compared with other temperate and tropical seagrass species. We found 1) the level of clonality in S. filiforme, as measured by shared multilocus genotypes, to be highly variable among populations; 2) low allelic diversity and heterozygote excess in almost every population; 3) evidence of genetic differentiation and population structure, in which the sampled populations were assigned to two major demes separated by the Florida Keys archipelago; and 4) asymmetric gene flow patterns, though average migration rates across all populations exceeded one migrant per generation.

Fig 2. Diagrams of STRUCTURE cluster assignment. (A) K = 2 cluster assignment and (B) K = 4 cluster assignment. Population names are on the x-axis, separated by black vertical bands. Individual genotypes are represented as vertical bars and cluster assignment is depicted by color.
https://doi.org/10.1371/journal.pone.0203644.g002

Genotypic richness of S. filiforme varied widely across sampling sites, though this cannot be completely explained by disparities in sampling area. Sample collections in 2014 were from within an area of ~2,500 m², in which genotypic richness ranged from 0.37 to 0.62.
Sample collections in 2015 were from within an area of ~500 m², in which genotypic richness ranged from 0.05 to 0.43. Therefore, within each sampling scheme, we observed a wide range in clonality. In the larger areas sampled in 2014, we detected one genet present in two adjacent populations (Sprigger and Sluiceway), extending over hundreds of meters. For the 2015 collection sites, sampling in a smaller total area with shorter distances between each shoot sampled may have led to a decrease in detection efficiency of total genets and number of alleles present in the population [68]. Allelic richness standardized by smallest sample size was consistent across all sites, suggesting the observed higher number of alleles and unique MLGs in 2014 collection populations were related to the spatial scale of sampling and do not necessarily indicate greater diversity in the Atlantic populations. Low genotypic richness in some populations and low allelic diversity in all populations of S. filiforme across the subtropical Atlantic region underscore the advantage of clonal reproduction in this environment. It is also possible that we observed an edge-of-range effect [69] in which populations of a species closer to their range limit express lower genetic diversity than populations in the center of the species’ range. Though our study did not span across the center of distribution for S. filiforme, we would expect to find greater levels of diversity in Caribbean populations.

Fig 3. Diagram of relative magnitude and direction of gene flow. Nodes represent populations (refer to Table 1 to match numeric codes with location names). Arrows are weighted according to Alcala’s Nm values (S3 Table), which range from 0.004 to 1.000, and arrowheads show the estimated direction of gene flow.
https://doi.org/10.1371/journal.pone.0203644.g003

The strongest population structure clearly develops in South Florida, where Tampa Bay, though hundreds of kilometers away from the Florida Keys, groups with Gulf populations, suggesting the Florida Keys archipelago presented historical barriers to gene flow between Gulf and Atlantic demes, and perhaps continues to impede gene flow with contemporary land configurations and sea levels. This is also supported by the relative migration and direction of gene flow calculations, which revealed asymmetric patterns in the magnitude and direction of genetic exchange. The Atlantic populations are strongly connected to one another, as are the Gulf populations, though to a lesser extent. The genetic disjunction between Atlantic and Gulf S. filiforme populations in Florida may provide evidence of a phylogeographic break, which has been observed for a number of warm-temperate marine and intertidal organisms, related to increases in seawater temperature (and thereby northward shifts in temperate species’ range limits) associated with glacial retreats that occurred throughout the Pleistocene [70–73]. Historical changes in sea level (and not necessarily temperature) may have been a primary factor contributing to the development of the genetic break for S. filiforme, a tropical species tolerant of warm seawater temperatures. During the Pleistocene, glacial advances exposed more of the Florida peninsula and may have restricted estuarine habitat to a small area within the western Gulf of Mexico [74], while glacial retreats increased sea level and promoted the expansion of estuarine habitat, likely causing increased contact between eurythermal species along the southern tip of the peninsula [75]. Depending on Pliocene distributions of S.
filiforme, changes in sea level that resulted in the final emergence of the Florida peninsula may have instigated the genetic break. McCommas [76] attributed the genetic discontinuity between the Gulf of Mexico and the Atlantic populations of the sea anemone, Bunodosoma cavernata, to this vicariant event based on estimated time since divergence. Without fossil evidence or molecular clock calculations, we can merely suggest the break we found in Florida was similarly initiated by prior fluctuations in sea level and maintained by contemporary ocean currents.

Interestingly, there are exceptions to the Atlantic-Gulf divide for S. filiforme, in which we detected relatively high gene flow from Atlantic populations to Gulf populations proximal to a break in the Middle Keys (at sites Marathon, Pigeon and Sprigger). Additionally, the Marathon population appeared more genetically similar to the Atlantic populations than to those in the Gulf. This finding could reflect shared ancestry and relatively recent divergence between the Atlantic and Gulf populations, but does not exclude the possibility of genetic exchange occurring between Atlantic and Gulf populations across the archipelago via propagules or rafting vegetation.

We found relative gene flow between proximal Florida Bay, Tampa Bay and Key West populations and other South Florida populations to be comparable to (and in some instances less than) gene flow levels observed between Florida and more distant non-Florida populations. These populations were also highly clonal, exhibiting the lowest genotypic richness values measured in this study. We sampled in the northeastern-most extent of Florida Bay in Blackwater Sound, an enclosed area with few hydrological connections to the greater Florida Bay or the Atlantic; we suspect the low gene flow and low genotypic richness are related to this isolation.
Gene flow between the Key West population and other South Florida populations may be limited by hydrologic rather than topographic isolation: strong reversing tidal currents flowing between the Gulf of Mexico and the Florida Straits may prevent mixing between the Key West population and the further eastward Lower Keys populations. The Tampa Bay population, located roughly halfway up the Florida peninsula, approaches the northern range limit for S. filiforme in the Gulf of Mexico. In the latter half of the 20th century, Tampa Bay experienced a major decline (~70%) in historical seagrass coverage due to rapid population expansion and development along the coast [77]. Since adopting policies to prevent pollution and dredging activities, seagrasses in Tampa Bay have been on a recovery trajectory and now exceed historical extent [78]. It is unclear whether the low gene flow and high clonality in this population reflect its northern position, past seagrass decline or contemporary dispersal limitations. The Tampa Bay population is not representative of the entire estuary as the samples were collected near the mouth of the bay, disregarding the meadows within the interior of the bay. Further research on the population genetics of all seagrass species throughout Tampa Bay is warranted, particularly given its tumultuous history of environmental decline, restoration and recovery.

Though Bermuda is the farthest from all other populations, the greatest differentiation occurred between the Bahamas and Florida sites. High relatedness between Florida and Bermuda populations of H. wrightii [79] indicates similar mechanisms may be responsible for this pattern.
Population structure and gene flow patterns in the Bahamas and Bermuda populations were somewhat counter to our expectations, but must be interpreted with caution because we only sampled one site from each location. In the K = 2 population clusters scenario, the Bermuda population contains genotypes with near equal membership in the Atlantic and Gulf clusters, while the Bahamas population shows complete membership in the Atlantic cluster. These results imply the Bahamas population groups with the Atlantic populations, though our previous analyses suggest relatively strong genetic differentiation and limited gene flow between the Bahamas and all other sites. Contemporary surface ocean currents directing the movement of propagules, and therefore genetic exchange between populations, may be responsible for these patterns [80]. Based on the mixed-membership genotypes in the Bermuda population, it is plausible the Bermuda population developed from an initial source population in recent evolutionary history that later diverged to create the two major clusters identified here, and now receives propagules from Florida via the Gulf Stream at a frequency sufficient to prevent strong genetic differentiation. The moderate degree of gene flow from Bermuda to the Florida populations estimated here (Fig 3) is interesting, as propagule dispersal via surface currents in the opposite direction (south to north) along the Gulf Stream seems more likely. Despite the westward flow of the Antilles current, the topography of the islands of the Bahamas might restrict gene flow between Florida populations and the remote sampling location in San Salvador, the easternmost island of the Bahamas. We believe further sampling across the subtropical Atlantic, especially along the western Bahamian islands, will clarify unexpected gene flow patterns.
The high genetic exchange within Atlantic populations as evidenced by high migration rates may be explained by hydrologic connections created by surface currents and eddies that form along the Florida Keys Atlantic coastline. The gene flow patterns observed here roughly agree with the modeled and observed movement of spiny lobster (Panulirus argus) larvae along a ‘recruitment conveyor’ in the Florida Keys, in which spawning larvae near the Yucatán Peninsula have been identified as source populations [81]. The net eastward and northward movement of the Florida current along the Florida Shelf and the intermittent formation of small eddies could facilitate local movement and entrainment of seagrass propagules [82]. Less genetic exchange within the Gulf populations is perhaps related to the isolating topography of the Lower Keys, in which several small key islands and narrow channels separate the seagrass meadows, potentially hindering the movement of propagules. Though mean hydrological transport occurs from the Gulf to the Atlantic, westward tidal flow sometimes pushes Atlantic waters through channels in the Keys [83,84] and could promote movement of propagules of Atlantic origin through to Gulf side populations Marathon, Pigeon and Sprigger, facilitating the admixture of genotypes detected between clusters.

The population genetics of S. filiforme in the subtropical Atlantic appear to match theoretical predictions for a colonizer species. The S. filiforme meadows we sampled contained variable genotypic diversity, likely a result of site-specific properties influencing the growth and reproductive strategies in this species as well as propagule supply [85]. The low allelic diversity within S.
filiforme meadows and evidence for population structure along the possible phylogeographic boundary in the Florida Keys are typical of colonizers. These findings are consistent with a study on the only congener of S. filiforme, Syringodium isoetifolium, which also exhibited variable genotypic diversity and population structure defined by bioregions in the western North Pacific [86]. The climax species of the tropical and subtropical Atlantic, T. testudinum, exhibited high genotypic and allelic diversity, and no evidence of population structure in Florida Bay [37], and similarly high allelic diversity and little evidence of population structure across ~1000 km of coastline along the Yucatán Peninsula in Mexico [87]. In contrast, the colonizer species H. wrightii showed high clonality and strong differentiation among edge-of-range populations in Florida, North Carolina and Bermuda [79], and generally high clonality and weak population structure along the western Gulf of Mexico coast [88]. The population genetics of S. filiforme conform to expectations for colonizer species, with genotypic diversity mediated by local conditions and meadow demographics.

Successional status is derived from environmental tolerances and growth and reproductive strategies that in turn impact population genetics, while modern oceanic hydrology ultimately controls dispersal trajectories and therefore genetic exchange. It is likely that evolutionarily historical population dynamics under past continent arrangements and sea levels are the dominant forces driving the population structure in S. filiforme in the subtropical Atlantic Ocean. The higher genotypic diversity found in S.
filiforme in certain populations suggests that some meadows may be more resilient to disturbances than others, and these more resilient meadows may enhance recovery of depauperate meadows by sustaining a supply of propagules and gene flow, but only where ocean currents and land barriers do not impede connectivity. Whether overall low genetic diversity and strong population structure in subtropical Atlantic populations of S. filiforme equates to limited capability for adaptation to selective pressures has yet to be tested.

Supporting information

S1 Table. Sample site GPS coordinates. GPS coordinates mark the exact location of each sample site. Latitude and longitude are in decimal degrees. (DOCX)

S2 Table. Pairwise genetic differentiation. FST values are provided to the left of the diagonal and Jost’s D values are provided to the right of the diagonal. Bold text indicates significance based on non-overlapping confidence intervals. (DOCX)

S3 Table. Values for relative magnitude and direction of gene flow. Values represent the relative amount of gene flow from populations in the first column to receiving populations identified in the first row. For example, the highest amount of gene flow (1.000) occurs from Carysfort to Davis, while the lowest amount of gene flow occurs from the Bahamas to Key West (0.004). Bold text indicates significance based on non-overlapping 95% confidence intervals. (DOCX)

S1 Fig. Principal components analysis (PCA) plot. Axis loading values are depicted for the two principal component axes containing the greatest amount of variation, PC1 (18.3% variance) and PC2 (7.7% variance). Genotypes from each population group are distinguished by color and shape (Atlantic: blue circles, Gulf: orange triangles, Bermuda: yellow diamonds, Bahamas: magenta squares).
(EPS) Genetic diversity of the seagrass, Syringodium filiforme, in the subtropical Atlantic PLOS ONE | https://doi.org/10.1371/journal.pone.0203644 September 5, 2018 13 / 18 http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0203644.s001 http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0203644.s002 http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0203644.s003 http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0203644.s004 https://doi.org/10.1371/journal.pone.0203644 Acknowledgments The authors thank Thomas Frankovich for assistance in the field, Margot Miller and Ainsley Calladine for technical and laboratory support at the University of Virginia and University of Adelaide, Laura K. Reynolds for thoughtful feedback on early versions of the manuscript and anonymous reviewers whose suggestions also greatly improved the manuscript. Sampling for this study was conducted under permits FKNMS-2015-085, EVER-2013-SCI-0058, and Ber- muda Dept. of Conservation Services License no. 15-04-16-22. The Jones Environmental Research Endowment to the Department of Environmental Sciences at the University of Vir- ginia funded this research. Author Contributions Conceptualization: Alexandra L. Bijak, Michelle Waycott. Formal analysis: Alexandra L. Bijak, Kor-jent van Dijk. Investigation: Alexandra L. Bijak. Methodology: Kor-jent van Dijk. Supervision: Michelle Waycott. Visualization: Alexandra L. Bijak, Kor-jent van Dijk. Writing – original draft: Alexandra L. Bijak. Writing – review & editing: Alexandra L. Bijak, Kor-jent van Dijk, Michelle Waycott. References 1. Gray A. Genetic diversity and its conservation in natural populations of plants. Biodivers Lett. 1996; 3: 71–80. 2. Duffy JE. Biodiversity and the functioning of seagrass ecosystems. Mar Ecol Prog Ser. 2006; 311: 233– 250. 3. 
Hughes AR, Stachowicz JJ. Genetic diversity enhances the resistance of a seagrass ecosystem to dis- turbance. Proc Natl Acad Sci. 2004; 101: 8998–9002. https://doi.org/10.1073/pnas.0402642101 PMID: 15184681 4. Randall Hughes A, Stachowicz JJ. Seagrass genotypic diversity increases disturbance response via complementarity and dominance. J Ecol. 2011; 99: 445–453. https://doi.org/10.1111/j.1365-2745.2010. 01767.x 5. Loveless MD, Hamrick JL. Ecological determinants of genetic structure in plant populations. Annu Rev Ecol Syst. 1984; 15: 65–95. 6. Hamrick JL, Godt MJW, Sherman-Broyles SL. Factors influencing levels of genetic diversity in woody plant species. New For. 1992; 6: 95–124. https://doi.org/10.1007/978-94-011-2815-5_7 7. Giles BE, Goudet J. Genetic differentiation in Silene dioica metapopulations: Estimation of spatiotempo- ral effects in successional plant species. Am Nat. 1997; 149: 507–526. https://doi.org/10.1086/286002 8. Raffl C, Schönswetter P, Erschbamer B. “Sax-sess”—genetics of primary succession in a pioneer spe- cies on two parallel glacier forelands. Mol Ecol. 2006; 15: 2433–2440. https://doi.org/10.1111/j.1365- 294X.2006.02964.x PMID: 16842417 9. Raffl C, Holderegger R, Parson W, Erschbamer B. Patterns in genetic diversity of Trifolium pallescens populations do not reflect chronosequence on alpine glacier forelands. Heredity. 2008; 100: 526–532. https://doi.org/10.1038/hdy.2008.8 PMID: 18270530 10. Les DH, Cleland MA, Waycott M. Phylogenetic studies in Alismatidae, II: Evolution of marine angio- sperms (seagrasses) and hydrophily. Syst Bot. 1997; 22: 443–463. 11. Kendrick GA, Orth RJ, Statton J, Hovey R, Ruiz-Montoya L, Lowe RJ, et al. Demographic and genetic connectivity: The role and consequences of reproduction, dispersal and recruitment in seagrasses. Biol Rev Camb Philos Soc. 2016. 
https://doi.org/10.1111/brv.12261 PMID: 27010433 Genetic diversity of the seagrass, Syringodium filiforme, in the subtropical Atlantic PLOS ONE | https://doi.org/10.1371/journal.pone.0203644 September 5, 2018 14 / 18 https://doi.org/10.1073/pnas.0402642101 http://www.ncbi.nlm.nih.gov/pubmed/15184681 https://doi.org/10.1111/j.1365-2745.2010.01767.x https://doi.org/10.1111/j.1365-2745.2010.01767.x https://doi.org/10.1007/978-94-011-2815-5_7 https://doi.org/10.1086/286002 https://doi.org/10.1111/j.1365-294X.2006.02964.x https://doi.org/10.1111/j.1365-294X.2006.02964.x http://www.ncbi.nlm.nih.gov/pubmed/16842417 https://doi.org/10.1038/hdy.2008.8 http://www.ncbi.nlm.nih.gov/pubmed/18270530 https://doi.org/10.1111/brv.12261 http://www.ncbi.nlm.nih.gov/pubmed/27010433 https://doi.org/10.1371/journal.pone.0203644 12. Barrett SCH, Eckert CG, Husband BC. Evolutionary processes in aquatic plant populations. Aquat Bot. 1993; 44: 105–145. https://doi.org/10.1016/0304-3770(93)90068-8 13. Kendrick GA, Duarte CM, Marbà N. Clonality in seagrasses, emergent properties and seagrass land- scapes. Mar Ecol Prog Ser. 2005; 290: 291–296. https://doi.org/10.3354/meps290291 14. Arnaud-Haond S, Alberto F, Teixeira S, Procaccini G, Serrão EA, Duarte CM. Assessing genetic diver- sity in clonal organisms: Low diversity or low resolution? Combining power and cost efficiency in select- ing markers. J Hered. 2005; 96: 434–440. https://doi.org/10.1093/jhered/esi043 PMID: 15743902 15. Oliva S, Romero J, Marta PE. Reproductive strategies and isolation-by-demography in a marine clonal plant along an eutrophication gradient. Mol Ecol. 2014; 23: 5698–5711. https://doi.org/10.1111/mec. 12973 PMID: 25331192 16. Sinclair EA, Krauss SL, Anthony J, Hovey R, Kendrick GA. The interaction of environment and genetic diversity within meadows of the seagrass Posidonia australis (Posidoniaceae). Mar Ecol Prog Ser. 2014; 506: 87–98. https://doi.org/10.3354/meps10812 17. 
Serra IA, Innocenti AM, Di Maida G, Calvo S, Migliaccio M, Zambianchi E, et al. Genetic structure in the Mediterranean seagrass Posidonia oceanica: Disentangling past vicariance events from contemporary patterns of gene flow. Mol Ecol. 2010; 19: 557–568. https://doi.org/10.1111/j.1365-294X.2009.04462.x PMID: 20051010 18. Short FT, Carruthers TJB, Dennison WC, Waycott M. Global seagrass distribution and diversity: A bio- regional model. J Exp Mar Bio Ecol. 2007; 350: 3–20. https://doi.org/10.1016/j.jembe.2007.06.012 19. Creed JC, Phillips RC, van Tussenbroek. Seagrasses of the Caribbean. In: Green EP, Short FT editors. World atlas of seagrasses. Berkeley, USA: University of California Press; 2003. pp. 234–240. 20. Fourqurean JW, Durako MJ, Hall MO, Hefty LN. Seagrass distribution in South Florida: A multi-agency coordinated monitoring program. In: Porter JW, Porter KG editors. The Everglades, Florida Bay, and coral reefs of the Florida Keys: An ecosystem sourcebook. 2002. pp. 497–522. 21. Costanza R, d’Arge R, de Groot R, Farber S, Grasso M, Hannon B, et al. The value of the world’s eco- system services and natural capital. Nature. 1997; 387: 253–260. https://doi.org/10.1038/387253a0 22. Fourqurean JW, Duarte CM, Kennedy H, Marbà N, Holmer M, Mateo MA, et al. Seagrass ecosystems as a globally significant carbon stock. Nat Geosci. 2012; 5: 505–509. https://doi.org/10.1038/ngeo1477 23. Sargent FJ, Leary TJ, Crewz DW, Kruer CR. Scarring of Florida’s seagrasses: Assessment and man- agement options. St. Petersburg (FL): Florida Marine Research Institute; 1995. Report No.: FMRI Tech. Rep. TR-1. 24. Short FT, Wyllie-Echeverria S. Natural and human-induced disturbance of seagrasses. Environ Con- serv. 1996; 23: 17. https://doi.org/10.1017/S0376892900038212 25. Orth RJ, Carruthers TJB, Dennison WC, Duarte CM, James W, Heck KL, et al. Global crisis for sea- grass ecosystems. BioScience. 2006; 56: 987–996. 26. Hall MO, Furman BT, Merello M, Durako MJ. 
Recurrence of Thalassia testudinum seagrass die-off in Florida Bay, USA: Initial observations. Mar Ecol Prog Ser. 2016; 560: 243–249. https://doi.org/10.3354/ meps11923 27. Hall MO, Durako MJ, Fourqurean JW, Zieman JC. Decadal changes in seagrass distribution and abun- dance in Florida Bay. Estuaries. 1999; 22: 445–459. https://doi.org/10.2307/1353210 28. Roblee MB, Barber TR, Carlson PR, Durako MJ, Fourqurean JW, Muehkstein LK, et al. Mass mortality of the tropical seagrass Thalassia testudinum in Florida Bay (USA). Mar Ecol Prog Ser. 1991; 71: 297–299. 29. Zieman JC, Fourqurean JW, Frankovich TA. Seagrass Die-Off in Florida Bay: Long-term trends in abundance and growth of turtle grass, Thalassia testudinum. Estuaries. 1999; 22: 460–470. https://doi. org/10.2307/1353211 30. Johansson JOR. Historical overview of Tampa Bay water quality and seagrass issues and trends. In: Greening HS, editor. Seagrass Management, It’s Not Just Nutrients! Symposium. St. Petersburg, Flor- ida; 2002. p. 246. 31. Marbà N, Duarte CM. Rhizome elongation and seagrass clonal growth. Mar Ecol Prog Ser. 1998; 174: 269–280. https://doi.org/10.3354/meps174269 32. Williams SL. Experimental studies of Caribbean seagrass bed development. Ecol Monogr. 1990; 60: 449–469. https://doi.org/10.2307/1943015 33. Kendall MS, Battista T, Hillis-Starr Z. Long term expansion of a deep Syringodium filiforme meadow in St. Croix, US Virgin Islands: The potential role of hurricanes in the dispersal of seeds. Aquat Bot. 2004; 78: 15–25. https://doi.org/10.1016/j.aquabot.2003.09.004 34. Kilminster K, McMahon K, Waycott M, Kendrick GA, Scanes P, McKenzie L, et al. Unravelling complex- ity in seagrass systems for management: Australia as a microcosm. Sci Total Environ. Elsevier B.V.; 2015; 534: 97–109. 
https://doi.org/10.1016/j.scitotenv.2015.04.061 PMID: 25917445 Genetic diversity of the seagrass, Syringodium filiforme, in the subtropical Atlantic PLOS ONE | https://doi.org/10.1371/journal.pone.0203644 September 5, 2018 15 / 18 https://doi.org/10.1016/0304-3770(93)90068-8 https://doi.org/10.3354/meps290291 https://doi.org/10.1093/jhered/esi043 http://www.ncbi.nlm.nih.gov/pubmed/15743902 https://doi.org/10.1111/mec.12973 https://doi.org/10.1111/mec.12973 http://www.ncbi.nlm.nih.gov/pubmed/25331192 https://doi.org/10.3354/meps10812 https://doi.org/10.1111/j.1365-294X.2009.04462.x http://www.ncbi.nlm.nih.gov/pubmed/20051010 https://doi.org/10.1016/j.jembe.2007.06.012 https://doi.org/10.1038/387253a0 https://doi.org/10.1038/ngeo1477 https://doi.org/10.1017/S0376892900038212 https://doi.org/10.3354/meps11923 https://doi.org/10.3354/meps11923 https://doi.org/10.2307/1353210 https://doi.org/10.2307/1353211 https://doi.org/10.2307/1353211 https://doi.org/10.3354/meps174269 https://doi.org/10.2307/1943015 https://doi.org/10.1016/j.aquabot.2003.09.004 https://doi.org/10.1016/j.scitotenv.2015.04.061 http://www.ncbi.nlm.nih.gov/pubmed/25917445 https://doi.org/10.1371/journal.pone.0203644 35. Schlueter MA, Guttman SI. Gene flow and genetic diversity of turtle grass, Thalassia testudinum, banks ex könig, in the lower Florida Keys. Aquat Bot. Elsevier; 1998; 61: 147–164. https://doi.org/10.1016/ S0304-3770(98)00063-1 36. Kirsten JH, Dawes CJ, Cochrane BJ. Randomly amplified polymorphism detection (RAPD) reveals high genetic diversity in Thalassia testudinum banks ex König (Turtlegrass). Aquat Bot. 1998; 61: 269–287. https://doi.org/10.1016/S0304-3770(98)00070-9 37. Bricker E, Waycott M, Calladine A, Zieman JC. High connectivity across environmental gradients and implications for phenotypic plasticity in a marine plant. Mar Ecol Prog Ser. 2011; 423: 57–67. https://doi. org/10.3354/meps08962 38. Angel R. 
Genetic diversity of Halodule wrightii using random amplified polymorphic DNA. Aquat Bot. 2002; 74: 165–174. https://doi.org/10.1016/S0304-3770(02)00079-7
39. Briceño HO, Boyer JN, Castro J, Harlem P. Biogeochemical classification of South Florida’s estuarine and coastal waters. Mar Pollut Bull. 2013; 75: 187–204. https://doi.org/10.1016/j.marpolbul.2013.07.034 PMID: 23968989
40. Fourqurean JW, Boyer JN, Durako MJ, Hefty LN, Peterson BJ. Forecasting responses of seagrass distributions to changing water quality using monitoring data. Ecol Appl. 2003; 13: 474–489. https://doi.org/10.1890/1051-0761(2003)013[0474:FROSDT]2.0.CO;2
41. Fourqurean JW, Willsie A, Rose CD, Rutten LM. Spatial and temporal pattern in seagrass community composition and productivity in South Florida. Mar Biol. 2001; 138: 341–354. https://doi.org/10.1007/s002270000448
42. Diekmann OE, Serrão EA. Range-edge genetic diversity: Locally poor extant southern patches maintain a regionally diverse hotspot in the seagrass Zostera marina. Mol Ecol. 2012; 21: 1647–1657. https://doi.org/10.1111/j.1365-294X.2012.05500.x PMID: 22369278
43. Arnaud-Haond S, Duarte CM, Alberto F, Serrão EA. Standardizing methods to address clonality in population studies. Mol Ecol. 2007; 16: 5115–5139. https://doi.org/10.1111/j.1365-294X.2007.03535.x PMID: 17944846
44. Balloux F, Lugon-Moulin N. The estimation of population differentiation with microsatellite markers. Mol Ecol. 2002; 11: 155–165. https://doi.org/10.1046/j.0962-1083.2001.01436.x PMID: 11856418
45. Saghai-Maroof MA, Soliman KM, Jorgensen RA, Allard RW. Ribosomal DNA spacer-length polymorphisms in barley: Mendelian inheritance, chromosomal location, and population dynamics. Proc Natl Acad Sci. 1984; 81: 8014–8018. https://doi.org/10.1073/pnas.81.24.8014 PMID: 6096873
46. Bijak AL, van Dijk K-J, Waycott M. Development of microsatellite markers for a tropical seagrass, Syringodium filiforme (Cymodoceaceae). Appl Plant Sci. 2014; 2: 1–4.
https://doi.org/10.3732/apps.1400082 PMID: 25309842
47. Parks JC, Werth CR. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). Am J Bot. 1993; 80: 537–544. https://doi.org/10.1002/j.1537-2197.1993.tb13837.x PMID: 30139148
48. Arnaud-Haond S, Belkhir K. GENCLONE: A computer program to analyse genotypic data, test for clonality and describe spatial clonal organization. Mol Ecol Notes. 2007; 7: 15–17. https://doi.org/10.1111/j.1471-8286.2006.01522.x
49. Kalinowski ST, Taper ML. Maximum likelihood estimation of the frequency of null alleles at microsatellite loci. Conserv Genet. 2006; 7: 991–995. https://doi.org/10.1007/s10592-006-9134-9
50. Dorken ME, Eckert CG. Severely reduced sexual reproduction in northern populations of a clonal plant, Decodon verticillatus (Lythraceae). J Ecol. 2011; 89: 339–350. https://doi.org/10.1046/j.1365-2745.2001.00558.x
51. Keenan K, Mcginnity P, Cross TF, Crozier WW, Prodöhl PA. DiveRsity: An R package for the estimation and exploration of population genetics parameters and their associated errors. Methods Ecol Evol. 2013; 4: 782–788. https://doi.org/10.1111/2041-210X.12067
52. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2014. Available: http://www.r-project.org/
53. Peakall R, Smouse PE. GENALEX 6: Genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol Notes. 2006; 6: 288–295. https://doi.org/10.1111/j.1471-8286.2005.01155.x
54. Peakall R, Smouse PE. GenALEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics. 2012; 28: 2537–2539. https://doi.org/10.1093/bioinformatics/bts460 PMID: 22820204
55. Raymond M, Rousset F. Genepop (Version-1.2): Population genetics software for exact tests and ecumenicism. J Hered. 1995; 86: 248–249.
56. Rousset F.
GENEPOP’007: A complete re-implementation of the GENEPOP software for Windows and Linux. Mol Ecol Resour. 2008; 8: 103–106. https://doi.org/10.1111/j.1471-8286.2007.01931.x PMID: 21585727
57. Meirmans PG, Van Tienderen PH. GENOTYPE and GENODIVE: Two programs for the analysis of genetic diversity of asexual organisms. Mol Ecol Notes. 2004; 4: 792–794. https://doi.org/10.1111/j.1471-8286.2004.00770.x
58. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984; 38: 1358–1370. https://doi.org/10.1111/j.1558-5646.1984.tb05657.x PMID: 28563791
59. Jost L. GST and its relatives do not measure differentiation. Mol Ecol. 2008; 17: 4015–4026. https://doi.org/10.1111/j.1365-294X.2008.03887.x PMID: 19238703
60. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155: 945–959. https://doi.org/10.1111/j.1471-8286.2007.01758.x PMID: 10835412
61. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol Ecol. 2005; 14: 2611–2620. https://doi.org/10.1111/j.1365-294X.2005.02553.x PMID: 15969739
62. Kopelman NM, Mayzel J, Jakobsson M, Rosenberg NA, Mayrose I. CLUMPAK: A program for identifying clustering modes and packaging population structure inferences across K. Mol Ecol Resour. 2015; https://doi.org/10.1111/1755-0998.12387 PMID: 25684545
63. Rosenberg NA. DISTRUCT: A program for the graphical display of population structure. Mol Ecol Notes. 2004; 4: 137–138. https://doi.org/10.1046/j.1471-8286.2003.00566.x
64. Wright S. Genetical structure of populations. Ann Eugen. 1951; 15: 323–354. https://doi.org/10.1111/j.1469-1809.1949.tb02451.x PMID: 24540312
65. Barton NH, Slatkin M. A quasi-equilibrium theory of the distribution of rare alleles in a subdivided population. Heredity. 1986; 56: 409–415. https://doi.org/10.1038/hdy.1986.63 PMID: 3733460
66. Alcala N, Goudet J, Vuilleumier S.
On the transition of genetic differentiation from isolation to panmixia: What we can learn from Gst and D. Theor Popul Biol. 2014; 93: 75–84. https://doi.org/10.1016/j.tpb.2014.02.003 PMID: 24560956
67. Sundqvist L, Keenan K, Zackrisson M, Prodöhl P, Kleinhans D. Directional genetic differentiation and relative migration. Ecol Evol. 2016; 6: 3461–3475. https://doi.org/10.1002/ece3.2096 PMID: 27127613
68. Leberg PL. Estimating allelic richness: Effects of sample size and bottlenecks. Mol Ecol. 2002; 11: 2445–2449. PMID: 12406254
69. Billingham MR, Reusch TBH, Alberto F, Serrão EA. Is asexual reproduction more important at geographical limits? A genetic study of the seagrass Zostera marina in the Ria Formosa, Portugal. Mar Ecol Prog Ser. 2003; 265: 77–83. https://doi.org/10.3354/meps265077
70. Felder DL, Staton JL. Genetic differentiation in trans-Floridian species complexes of Sesarma and Uca (Decapoda: Brachyura). J Crustac Biol. 1994; 14: 191–209.
71. Young AM, Torres C, Mack JE, Cunningham CW. Morphological and genetic evidence for vicariance and refugium in Atlantic and Gulf of Mexico populations of the hermit crab Pagurus longicarpus. Mar Biol. 2002; 140: 1059–1066. https://doi.org/10.1007/s00227-002-0780-2
72. Lee TN, Foighil DÓ. Hidden Floridian biodiversity: Mitochondrial and nuclear gene trees reveal four cryptic species within the scorched mussel, Brachidontes exustus, species complex. Mol Ecol. 2004; 13: 3527–3542. https://doi.org/10.1111/j.1365-294X.2004.02337.x PMID: 15488009
73. Mathews LM. Cryptic biodiversity and phylogeographical patterns in a snapping shrimp species complex. Mol Ecol. 2006; 15: 4049–4063. https://doi.org/10.1111/j.1365-294X.2006.03077.x PMID: 17054502
74. Petuch EJ. Geographical heterochrony: Contemporaneous coexistence of neogene and recent molluscan faunas in the Americas. Palaeogeogr Palaeoclimatol Palaeoecol. Elsevier; 1982; 37: 277–312. https://doi.org/10.1016/0031-0182(82)90041-4
75. Avise JC.
Molecular population structure and the biogeographic history of a regional fauna: A case history with lessons for conservation biology. Oikos. 1992; 63: 62–76.
76. Mccommas SA. Biochemical genetics of the sea anemone Bunodosoma cavernata and the zoogeography of the Gulf of Mexico. Mar Biol. 1982; 68: 169–173.
77. Lewis RR, Durako MJ, Moffler MD, Phillips RC. Seagrass meadows of Tampa Bay—a review. In: Treat SF, Simon JL, Lewis RR, Whitman RL Jr., editors. Tampa Bay Area Scientific Information Symposium. Minneapolis, Minnesota: Burgess Publishing Company; 1985. pp. 210–246.
78. Sherwood ET, Greening HS, Johansson JOR, Kaufman K, Raulerson GE. Tampa Bay (Florida, USA): Documenting seagrass recovery since the 1980’s and reviewing the benefits. Southeast Geogr. 2017; 57: 294–319. https://doi.org/10.1353/sgo.2017.0026
79. Digiantonio GB. The genetic diversity of two contrasting seagrass species using microsatellite analysis. M.Sc. Thesis, University of Virginia. 2017. Available from: https://libraetd.lib.virginia.edu/public_view/nk322d45k.
80. McMahon K, van Dijk K-J, Ruiz-Montoya L, Kendrick GA, Krauss SL, Waycott M, et al. The movement ecology of seagrasses. Proc R Soc B. 2014; 281: 20140878. https://doi.org/10.1098/rspb.2014.0878 PMID: 25297859
81. Yeung C, Lee TN. Larval transport and retention of the spiny lobster, Panulirus argus, in the coastal zone of the Florida Keys, USA. Fish Oceanogr. 2002; 11: 286–309.
82. Lee TN, Williams E. Mean distribution and seasonal variability of coastal currents and temperature in the Florida Keys with implications for larval recruitment. Bull Mar Sci. 1999; 64: 35–56.
83. Smith NP. Long-term Gulf-to-Atlantic transport through tidal channels in the Florida Keys. Bull Mar Sci. 1994; 54: 602–609.
84. Lee TN, Smith NP. Volume transport variability through the Florida Keys tidal channels. Cont Shelf Res. 2002; 22: 1361–1377. http://dx.doi.org/10.1016/S0278-4343(02)00003-1
85. Kendrick GA, Waycott M, Carruthers TJB, Cambridge ML, Hovey R, Krauss SL, et al. The central role of dispersal in the maintenance and persistence of seagrass populations. Bioscience. 2012; 62: 56–65.
https://doi.org/10.1525/bio.2012.62.1.10
86. Kurokochi H, Matsuki Y, Nakajima Y, Fortes MD, Uy WH, Campos WL, et al. A baseline for the genetic conservation of tropical seagrasses in the western North Pacific under the influence of the Kuroshio Current: the case of Syringodium isoetifolium. Conserv Genet. 2015; https://doi.org/10.1007/s10592-015-0764-7
87. van Dijk K-J, van Tussenbroek B, Jiménez-Durán K, Márquez-Guzmán G, Ouborg J. High levels of gene flow and low population genetic structure related to high dispersal potential of a tropical marine angiosperm. Mar Ecol Prog Ser. 2009; 390: 67–77. https://doi.org/10.3354/meps08190
88. Larkin PD, Maloney TJ, Rubiano-rincon S, Barrett MM. A map-based approach to assessing genetic diversity, structure, and connectivity in the seagrass Halodule wrightii. Mar Ecol Prog Ser. 2017; 567: 95–107.

MEPopulation2014
The influence of admixture and consanguinity on population genetic diversity in Middle East
Article in Journal of Human Genetics · September 2014
DOI: 10.1038/jhg.2014.81 · Source: PubMed
ORIGINAL ARTICLE
The influence of admixture and consanguinity on population genetic diversity in Middle East
Xiong Yang1, Suzanne Al-Bustan2,
Qidi Feng1, Wei Guo3, Zhiming Ma3, Makia Marafie4, Sindhu Jacob5, Fahd Al-Mulla5 and Shuhua Xu1

The Middle East (ME) is an important crossroad where modern humans migrated ‘out of Africa’ and spread into Europe and Asia. After the initial peopling and long-term isolation leading to well-differentiated populations, the ME also had a crucial role in subsequent human migrations among Africa, Europe and Asia; thus, recent population admixture has been common in the ME. On the other hand, consanguinity, a well-known practice in the ME, often reduces genetic diversity and works in opposition to admixture. Here, we explored the degree to which admixture and consanguinity jointly affected genetic diversity in ME populations. Genome-wide single-nucleotide polymorphism data were generated in two representative ME populations (Arabian and Iranian), with comparisons made with populations worldwide. Our results revealed an overall higher genetic diversity in both ME populations relative to other non-African populations. We identified a much larger number of long runs of homozygosity in ME populations than in any other populations, which was most likely attributed to high levels of consanguineous marriages that significantly decreased both individual and population heterozygosity. Additionally, we were able to distinguish African, European and Asian ancestries in ME populations and quantify the impact of admixture and consanguinity with statistical approaches. Interestingly, genomic regions with significantly excessive ancestry from individual source populations are functionally enriched in olfactory pathways, which were suspected to be under natural selection. Our findings suggest that genetic admixture, consanguinity and natural selection have collectively shaped the genetic diversity of ME populations, which has important implications in both evolutionary studies and medical practices.
Journal of Human Genetics advance online publication, 25 September 2014; doi:10.1038/jhg.2014.81

INTRODUCTION
Studies of both mitochondrial DNA (mtDNA) and Y-chromosome lineages indicate that after modern humans migrated out of Africa tens of thousands of years ago, they arrived in the Middle East (ME) and then dispersed into Europe and Asia.1,2 Over thousands of years, most human populations remained relatively isolated, evolved independently and developed the distinct genomic characteristics observed today. However, Africa, Asia and Europe are geographically connected by the ME, which provides opportunities for population contact and thus population admixture, an effect that became more pronounced with the growth of trade and the establishment of the Silk Road. Previous studies have identified admixture events in ME populations using both uniparental markers and genome-wide single-nucleotide polymorphisms (SNPs).3–6 A previous study of Uyghurs and African Americans reported that admixture could increase the genetic diversity of the admixed populations.7 Moreover, ME populations generally have large family sizes, and marriages between relatives are very common;8 thus, consanguinity is highly prevalent in this region. A similar situation is encountered in Central Asia, South Asia and the Americas,9,10 especially in Islamic-influenced areas. Consanguinity is generally considered to decrease genetic diversity and to result in many recessive diseases such as neuromuscular disorders, metabolic disorders, osteopetrosis syndromes and chondrodystrophia.8,11 However, to our knowledge, few studies have examined the influences of admixture and consanguinity on population genetic diversity simultaneously.
In the present study, we attempt to qualify and quantify the influence of these two forces in two representative ME populations residing in Kuwait, with evidence showing that both populations experienced admixture and consanguinity.3–5,11 Ultimately, mathematical modeling was used to elucidate the degree to which admixture and consanguinity shaped the genetic diversity and structure of the two ME human populations.

1Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max-Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China;
2Department of Biological Sciences, Kuwait University, Safat, Kuwait;
3Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China;
4Kuwait Medical Genetic Center, Maternity Hospital, Sulaibikhat, Kuwait and
5Department of Pathology, Faculty of Medicine, Health Sciences Center, Kuwait University, Safat, Kuwait
Correspondence: Professor S Xu, Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max-Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China.
E-mail: xushua@picb.ac.cn
Received 8 July 2014; revised 25 August 2014; accepted 27 August 2014
Journal of Human Genetics (2014), 1–8. © 2014 The Japan Society of Human Genetics. All rights reserved 1434-5161/14. www.nature.com/jhg

MATERIALS AND METHODS
Samples and quality controls
DNA samples were collected from 42 Kuwaitis whose ancestry had been traced back at least four generations to the Arabian Peninsula (ARB) and 22 Kuwaitis whose ancestry had been traced back at least four generations to Persia (IRN) via pedigree analysis. These samples were genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0 (Santa Clara, CA, USA) according to standard protocols. Additionally, the raw data of 200 unrelated samples from four populations in the International HapMap Project phase III12 were downloaded: 50 CHB (Han Chinese in Beijing, China), 50 JPT (Japanese in Tokyo, Japan), 50 YRI (Yoruba in Ibadan, Nigeria) and 50 CEU (Utah residents with Northern and Western European ancestries), all of which were also genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0. All the raw array data were called with Affymetrix Power Tools (APT, Version 1.15.0) using the Affymetrix platform annotation file (Genome-Wide SNP 6 annotation na32, with genome references to UCSC hg19 or NCBI GRCh37). One sample from the ARB sample group was removed from subsequent analyses owing to a calling rate < 85%. Data filtering was performed within each population: samples with a missing rate > 5% per individual, SNPs with a missing rate > 5% and SNPs failing the Hardy–Weinberg equilibrium test (P-value < 1.0E−6) were excluded from the analysis. Data from the six populations were merged on the SNPs shared across populations (721 989 autosomal SNPs), and SNPs with a minor allele frequency (MAF) < 0.01 were excluded (11 362 autosomal SNPs).
Finally, 40 ARB samples, 22 IRN samples, all 200 selected HapMap Phase III samples and 710 627 autosomal SNPs were used for subsequent analyses.

Population structure analysis
Markers with r² > 0.5, calculated in a 50-SNP sliding window shifted at a 5-SNP interval, were removed to reduce strong linkage disequilibrium (LD). Principal component analysis (PCA) was performed on the six populations with the thinned autosomal SNPs (334 705 markers) using smartpca in the EIGENSOFT package (version 4.2).13,14 The population structure was also inferred with ADMIXTURE (version 1.2.3),15 which implements a maximum-likelihood method to estimate individual ancestries. The analysis was performed with LD-pruned SNPs, with 10-fold cross-validation (--cv=10), K from 2 to 6 and other parameters set to default.

Assessing genetic diversity
To eliminate the effects that the ascertainment bias of genotyping data might have on the measurement of genetic diversity, we merged the SNPs of the four reference populations (CEU, CHB, JPT and YRI) called from the next-generation sequencing database.16,17 Next, we calculated the MAF and divided the range MAF ⩾ 0.05 into intervals of width 0.01, that is, [0.05, 0.06), [0.06, 0.07), …, and [0.49, 0.50]. We then randomly sampled SNPs from the genotyping data of the same four merged populations according to the proportion of sequencing-called SNPs in each interval, and repeated the sampling process 10 times. The expected heterozygosity for each SNP (HSe) was used to measure the SNP-based diversity of each population by the formula:

\[ H_{Se} = \frac{1}{n} \sum_{i=1}^{n} \left( 1 - \sum_{j} p_j^2 \right) \]

where n denotes the number of SNPs in a sliding window and p_j denotes the frequency of the jth allele at the ith SNP. We chose a 100 kb sliding window, with at least five SNPs per 100 kb. Individual-based diversity was measured by dividing the number of heterozygous SNPs by the total number of non-missing SNPs per individual.
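
The two heterozygosity measures defined above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names, the list-based inputs and the 0/1/2 genotype coding (with None for a missing call) are assumptions made for the example, and the ascertainment-bias resampling step is not reproduced.

```python
from typing import Optional, Sequence


def expected_heterozygosity(allele_freqs_per_snp: Sequence[Sequence[float]]) -> float:
    """Mean over the SNPs in one window of (1 - sum_j p_j^2).

    Each inner sequence holds the allele frequencies of one SNP
    (hypothetical input layout; frequencies are assumed to sum to 1).
    """
    if not allele_freqs_per_snp:
        raise ValueError("window contains no SNPs")
    total = 0.0
    for freqs in allele_freqs_per_snp:
        total += 1.0 - sum(p * p for p in freqs)
    return total / len(allele_freqs_per_snp)


def individual_heterozygosity(genotypes: Sequence[Optional[int]]) -> float:
    """Heterozygous calls divided by non-missing calls for one individual.

    Genotypes are assumed to be coded as minor-allele counts 0, 1 or 2,
    with None marking a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("individual has no non-missing genotypes")
    return sum(1 for g in called if g == 1) / len(called)


if __name__ == "__main__":
    window = [(0.5, 0.5), (0.9, 0.1)]  # two biallelic SNPs in one window
    print(expected_heterozygosity(window))
    print(individual_heterozygosity([0, 1, 2, 1, None]))
```

In a window-based analysis such as the one described, expected_heterozygosity would be called once per 100 kb window that contains at least five SNPs.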
The data were then phased using Beagle (version 3.3.2)18,19 with default parameters, and the expected haplotype heterozygosity (HHe) was calculated for windows of 10, 20, 30, 40, 50, 100, 200 and 500 kb as previously described.7 All measurements were performed independently on the 10 sampling repeats.

Assessing consanguinity
If a child receives two copies of the same segment, one from each parent, the result is a run of homozygosity (ROH).20 The probability that a child of a consanguineous marriage receives the same segment from both parents is substantially elevated; thus, ROH can be used to measure the level of consanguinity. Here, ROHs were identified with PLINK (version 1.07)21 using a sliding window of 500 kb containing at least 50 SNPs, allowing one heterozygote and no more than five missing SNPs per window. ROHs were retained if they spanned at least 500 kb, with a minimum density of one SNP per 50 kb and a maximum distance between two adjacent SNPs of 100 kb. Two adjacent ROH segments were merged if the proportion of overlap was > 0.05. ROH fragments were then clustered into three classes with Mclust (an R package) using the methods described by Pemberton et al.9

Testing population admixture
To test whether ARB and IRN were admixed populations, we used the three-population test (F3 test), which can provide strong evidence of population admixture by modeling genetic drift paths.22 The F3 test has the general form F3 (C; A, B), in which C denotes the target population and A and B denote two reference populations; here, ARB and IRN were the target populations. We selected CEU, CHB and YRI as the reference populations (surrogate ancestral populations) because they are less admixed and high-density SNP data are available. Significantly negative F3 scores support admixture in the target population, with gene flow from sources related to the two reference populations.
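
To make the logic of F3 (C; A, B) concrete, the sketch below computes the basic point estimate as the mean over SNPs of (c − a)(c − b), where a, b and c are the allele frequencies in populations A, B and C. It is a simplified illustration under stated assumptions: the function name and list inputs are hypothetical, and the finite-sample correction and the block-jackknife standard error that a real F3 test needs to call a score "significantly" negative are deliberately left out.

```python
from typing import Sequence


def f3_statistic(
    target: Sequence[float],
    ref_a: Sequence[float],
    ref_b: Sequence[float],
) -> float:
    """Uncorrected F3(C; A, B): mean over SNPs of (c - a) * (c - b).

    All three inputs are per-SNP frequencies of the same allele, in the
    same SNP order. No sample-size correction or significance test is
    performed here.
    """
    if not (len(target) == len(ref_a) == len(ref_b)):
        raise ValueError("frequency vectors must have equal length")
    if not target:
        raise ValueError("no SNPs supplied")
    products = [(c - a) * (c - b) for c, a, b in zip(target, ref_a, ref_b)]
    return sum(products) / len(products)


if __name__ == "__main__":
    # Target frequency lies between the two references: negative product.
    print(f3_statistic([0.5], [0.1], [0.9]))
    # Target frequency lies outside the references: positive product.
    print(f3_statistic([0.1, 0.2], [0.5, 0.6], [0.9, 0.8]))
```

The sign behaviour is the point of the example: when the target's allele frequencies tend to fall between those of the two references, as expected for a mixture, the products are negative on average.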
Inferring local ancestry

To determine the local ancestry of each SNP in each individual, we used ELAI, a two-layer hidden Markov model that models LD within and among groups.23 YRI, CEU and CHB were set as surrogate ancestral reference populations, with 50 EM steps, 3 upper clusters and 30 lower clusters. Previous studies of admixture events in other ME populations reported that the events occurred about 100 generations ago;24,25 thus, this time estimate was used as a prior in our local ancestry inference.

Linear regression analysis

To determine how admixture and consanguinity jointly influenced genetic diversity, linear regression analysis was performed at both the SNP and individual levels as follows:

(1) For SNP diversity, the ROH score was defined as

\[ X_{s1} = \frac{\text{number of occurrences of that SNP in ROH regions}}{\text{number of individuals in the admixed population}} \]

and the admixture effect was defined as

\[ X_{s2} = \sum_{i=1}^{3} \alpha_{i} \cdot 2 f_{i} (1 - f_{i}) \]

where αi is the ancestral contribution to that SNP, fi is the MAF of that SNP and i indexes the ancestral populations. The diversity of an SNP (Ys) was then modeled according to

\[ Y_{s} = \beta_{s0} + \beta_{s1} X_{s1} + \beta_{s2} X_{s2} + \varepsilon_{1}, \qquad \varepsilon_{1} \sim N(0, \sigma_{1}^{2}) \]

(2) For individual diversity, the ROH score was defined as

\[ X_{i1} = \frac{\text{number of SNPs of that individual in ROH regions}}{\text{total number of SNPs in that individual}} \]

and the admixture effect was defined as

\[ X_{i2} = \sum_{i=1}^{3} \alpha_{i} H_{i} \]

where αi is the ancestral contribution to that individual, Hi is the mean individual diversity and i indexes the ancestral populations. Similarly, the diversity of an individual (Yi) was modeled according to

\[ Y_{i} = \beta_{i0} + \beta_{i1} X_{i1} + \beta_{i2} X_{i2} + \varepsilon_{2}, \qquad \varepsilon_{2} \sim N(0, \sigma_{2}^{2}) \]

The null hypothesis (H0) for these models is that the ROH score and the admixture effect have no impact on the observed SNP and individual diversities. Based on this hypothesis, linear regression analysis was performed separately on the two ME populations, ARB and IRN, for the 10 sampling repeats.
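The SNP-level model above is an ordinary least-squares regression with two predictors, and a minimal Python sketch may help to fix ideas. It assumes that the ROH score Xs1 and the per-ancestry contributions and MAFs are already available; the function names are hypothetical, the input values are invented for illustration, and the sketch returns only the coefficients (no significance tests or adjusted R²).

```python
from typing import List, Sequence, Tuple


def admixture_effect(alphas: Sequence[float], mafs: Sequence[float]) -> float:
    """Xs2 = sum over ancestral populations of alpha_i * 2 * f_i * (1 - f_i)."""
    if len(alphas) != len(mafs):
        raise ValueError("one MAF is needed per ancestral contribution")
    return sum(a * 2.0 * f * (1.0 - f) for a, f in zip(alphas, mafs))


def fit_two_predictor_ols(y: Sequence[float], x1: Sequence[float],
                          x2: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2 + error."""
    if not (len(y) == len(x1) == len(x2)) or len(y) < 3:
        raise ValueError("need at least three observations of equal-length inputs")
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Normal equations (X^T X) beta = X^T y, stored as a 3 x 4 augmented matrix.
    aug: List[List[float]] = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the model is not identifiable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(aug[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (aug[i][3] - rest) / aug[i][i]
    return beta[0], beta[1], beta[2]


# Invented example: six SNPs with known ROH scores and admixture effects.
roh_score = [0.0, 0.1, 0.2, 0.3, 0.05, 0.25]
adm_effect = [0.3, 0.1, 0.4, 0.2, 0.45, 0.15]
diversity = [0.07 - 0.06 * a + 0.8 * b for a, b in zip(roh_score, adm_effect)]
print(fit_two_predictor_ols(diversity, roh_score, adm_effect))
print(admixture_effect([0.6, 0.3, 0.1], [0.2, 0.4, 0.1]))
```

The individual-level model has the same form, with Xi1 and Xi2 in place of Xs1 and Xs2. Under the null hypothesis both slope coefficients are zero; a negative coefficient on the ROH score and a positive coefficient on the admixture effect correspond to the opposing influences that the analysis sets out to quantify.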
Admixture and consanguinity in the Middle East
X Yang et al
Journal of Human Genetics

RESULTS

Population structure of ME populations

PCA was performed at the individual level to investigate the population structure. A plot of the two most significant principal components (PCs) (Figure 1a) showed that individuals from Africa, Europe and Asia clustered tightly within their groups. PC1 clearly separated Africans and non-Africans, whereas PC2 separated Asians and Europeans. However, individuals from the two ME populations (ARB and IRN) clustered loosely, with ARB samples located along the edge between YRI and CEU, while the IRN samples shifted slightly towards the Eastern Asian populations (CHB and JPT) (Figure 1a). The long tails exhibited by the two ME populations in the PCA plot imply possible admixture events, or the occurrence of gene flow from other populations. In the ADMIXTURE analysis, the lowest cross-validation error was found at K = 3 (Supplementary Figure S1). These results show that the two ME populations derive their ancestry mainly from European (blue) and African (gray) sources, with a slight Eastern Asian (red) component (Figure 1b), which is consistent with the PCA results and suggests admixture events.

ME populations show higher genetic diversity than the other non-African populations

To compare the genetic diversity of ME populations relative to others, SNP-based, haplotype-based and individual-based heterozygosity were measured. All diversities were calculated from ascertainment bias-corrected SNP subsets, with independent sampling repeated 10 times. The mean SNP-based diversity (HSe) in the two ME populations was higher than that in the CEU, CHB and JPT populations, but slightly lower than that in the YRI population (Figure 2a), and the same pattern was noted for individual-based heterozygosity (Figure 2b).
Remarkably, when SNPs in ROH regions were excluded for each individual to control for potential consanguinity, the two ME populations exhibited even higher individual heterozygosity than the other non-African populations and showed levels comparable to the African population (Figure 2b). When examining haplotype-based heterozygosity (HHe), similar patterns were observed regardless of window size (Figure 2c), with values increasing towards 1 as the window size increased and almost reaching 1 at the largest window sizes. One possible interpretation of these results is that the two ME populations are admixed populations with ancestral contributions from African, European and Asian populations, and that the increase in genetic diversity due to admixture is counteracted by the high prevalence of consanguineous marriage, which is consistent with previous findings.3,5

ME populations show higher consanguinity

To compare consanguinity among populations, we measured it using ROH and clustered the ROH fragments into three classes. Both the total number and the total length of ROHs per individual gradually increased with geographical distance from Africa for the short (Figure 3a) and intermediate (Figure 3b) ROH classes. These patterns were consistent with a previous study based on the HGDP data set.9 However, the two ME populations presented large variations in both the total number and total length of ROHs per individual (Figure 3c). When the total ROH length was plotted against the total ROH number per individual, a strong correlation was noted for both the short and intermediate classes, with the distance along the fitted line proportional to the geographical distance from Africa (Figures 3d and e).
For the long ROH class, the two ME populations showed a greater total ROH number and a longer total ROH length per individual than the other four populations (Figure 3f), with the long ROH fragments most likely arising from recent background relatedness; that is, consanguinity. Thus, it is plausible that consanguinity has reduced the genetic diversity of the two originally admixed ME populations, which exhibit a lower genetic diversity than their surrogate ancestral YRI population. However, the two ME populations still showed higher genetic diversity than the other non-African populations. This may be because European and Asian populations experienced a bottleneck after their divergence from the ME populations, and because the duration and strength of consanguinity have been insufficient to counteract completely the diversity introduced by admixture in the two ME populations.

Evidence of admixture in the ME populations

ADMIXTURE analysis revealed that the two ME populations had the highest genetic similarities to Europeans, followed by Africans and Asians. To formally test for admixture in these populations, we first calculated F3 (ARB or IRN; YRI, CEU) and observed significantly negative values for both ARB and IRN; we then calculated F3 (ARB or IRN; YRI, CHB or JPT), but none of the values were negative; and finally we calculated F3 (ARB or IRN; CEU, CHB or JPT), and only the value for IRN was significantly negative, regardless of whether the Asian reference population was CHB or JPT (Table 1), thus indicating that both of these populations are admixed. In summary, the ARB population received ancestral contributions from European and African populations, whereas the IRN population received ancestral contributions from European, African and Asian populations. These results were in accordance with the PCA and ADMIXTURE analyses. Moreover, some individuals showed excessive African ancestry (Figure 1b), suggesting recent gene flow from African populations, which was consistent with previous mtDNA studies.3 For the ARB population, negative values were not obtained in the F3 tests that used Asian reference populations, possibly because the level of gene flow was too low to be detected. Furthermore, negative F3 values were not obtained for either of the ME populations when YRI was paired with CHB or JPT as reference populations, which could be attributed to the fact that the admixture events were mainly between European and African populations, with only low-level gene flow occurring with Asian populations.

Figure 1 Population structure analysis. (a) Principal component analysis (PCA) with samples from the two Middle East (ME) populations, Arabian (ARB) and Iranian (IRN), and 200 samples of four reference populations (CEU, CHB, JPT and YRI) from the International HapMap Project III. (b) ADMIXTURE analysis with data pruned based on linkage disequilibrium (LD); the lowest cross-validation error was observed at K = 3.

The direction and magnitude of influences of admixture and consanguinity on genetic diversity

To investigate the direction and magnitude of the influences that admixture and consanguinity had on genetic diversity, a linear model was proposed, with the ROH score and admixture effect fitted to the observed diversity. Regression analysis was performed on the two ME populations separately with the 10 independent samplings to investigate relationships at both the SNP and individual levels. At the SNP level, the results for both ME populations were highly concordant among the 10 independent samplings, with the intercept (βs0), ROH score (Xs1) coefficient (βs1) and admixture effect (Xs2) coefficient (βs2) all statistically significant (Supplementary Table S1).
Owing to the level of consistency among the 10 independent samplings, the regression model parameters βs0, βs1 and βs2 were averaged to generate the final regression model for the SNP diversity as: Ys(ARB) = 0.06722 − 0.05680 * Xs1 + 0.80069 * Xs2 (mean adjusted R² = 0.66885) and Ys(IRN) = 0.05560 − 0.03963 * Xs1 + 0.82681 * Xs2 (mean adjusted R² = 0.70289). Similar results were obtained for the individual diversity. Both the regression models for ARB and IRN showed statistical significance and the 10 independent samplings were highly concordant (Supplementary Table S2). Again, the regression model parameters βi0, βi1 and βi2 were averaged to obtain the final individual diversity models: Yi(ARB) = −1.05647 − 0.36337 * Xi1 + 4.71255 * Xi2 (mean adjusted R² = 0.90692) and Yi(IRN) = −1.76090 − 0.28837 * Xi1 + 7.16242 * Xi2 (mean adjusted R² = 0.97046). The positive coefficients for the admixture effect confirmed an increase in genetic diversity owing to admixture, which was consistent with the previous study,7 whereas the negative ROH score coefficients confirmed a decrease in genetic diversity owing to consanguinity. Overall, linear modeling enabled the quantification of the influences of admixture and consanguinity on the genetic diversity in the two ME populations.

Figure 2 Single-nucleotide polymorphism (SNP) level, individual level and haplotype level of genetic diversity obtained from 10 independent random samplings. (a) Mean SNP heterozygosity of a 100 kb sliding window. (b) Mean individual heterozygosity calculated from non-missing SNPs with and without runs of homozygosity (ROH) regions considered; and (c) mean haplotype heterozygosity. Haplotype heterozygosity was calculated by sliding windows of 10, 20, 30, 40, 50, 100, 200 and 500 kb.

Genome-wide distribution of local ancestry in ME populations

The local ancestry at each SNP for each individual was estimated by ELAI. The local ancestry contributions from different ancestries were not uniformly distributed across the genome, with some genomic regions showing excessive ancestry contribution from a given parental population (Figure 4). For the ARB and IRN populations, the loci showing excess or deficient ancestry contribution beyond the 0.5% quantile were collected, and the ARB and IRN SNP sets were found to overlap substantially (Supplementary Figure S2). These overlaps could reflect adaptation of the two populations to the same local environment. Functional annotation of these overlapping SNPs was performed using the DAVID database26,27 and showed all of the top 10 categories to relate to the olfactory perception pathway (Benjamini-corrected P-value < 6.50 × 10⁻¹⁵; false discovery rate P-value < 6.60 × 10⁻¹⁴) (Table 2). The genes enriched among the top 10 functional categories were mostly from olfactory receptor families (2, 4, 5, 8, 9, 10, 11 and 12), with the exception of GABBR1 and MAS1L (Supplementary Table S3). GABBR1 encodes a B-type receptor for γ-aminobutyric acid, the main inhibitory neurotransmitter in the mammalian central nervous system, whereas MAS1L is a G-protein-coupled receptor and is associated with the G-protein-coupled receptor protein signaling pathway. It is likely that these two genes are also indirectly associated with the olfactory perception pathways.

DISCUSSION

In this study, we attempted to explore the combined effect of genetic admixture and consanguinity on human genetic diversity. We analyzed genome-wide SNP data of two ME populations, ARB and IRN, and our results showed that the genetic diversity of the two ME populations was higher than that of the other non-African populations, which was consistent with an admixture scenario. At the same time, long ROH fragments were also identified in a vast number of genomic regions in the two ME populations, which was also consistent with the expected consequence of consanguineous marriage. These results suggest that the demographic history of the two ME populations is very complex. Considering the geographical location of the ME, the higher genetic diversity observed in the two ME populations could simply be explained by a scenario in which these populations were surrogate ancestral populations of the European and Asian populations. However, signatures of population admixture were also very pronounced based on PCA, ADMIXTURE analysis and F3 testing. Therefore, a more likely yet more complex scenario is that the two ME populations were admixed populations with gene flow contributions from European, Asian and African populations. These admixture events increased the genetic diversity of the ME populations to levels comparable to or higher than those of African populations, and this diversity then gradually decreased because of the prevalent cultural practice of consanguinity. Our results support the second scenario and are consistent with previous findings.3–5 Nevertheless, the genetic architectures of modern ME populations could result from ancient migration and subsequent gene flow (or admixture) between well-differentiated populations, entangled with recent consanguineous marriages. Social and historical documentation in conjunction with other genetic findings all support this interpretation.8,28

Figure 3 Runs of homozygous fragments (ROHs). (a–c) Total length (top) and total number (bottom) of short (a), intermediate (b) and long (c) ROHs per individual, respectively. (d–f) Scatterplot of total length against the total number of short (d), intermediate (e) and long (f) ROHs per individual, respectively. Legends in (e) and (f) are the same as those in (d).

A challenge when analyzing genetic diversity based on genotyping data is potential ascertainment bias.
The availability of public sequencing data in worldwide populations (e.g., CEU, CHB, JPT and YRI) made it possible to correct for this bias by referencing the MAF spectrum of sequence data (Supplementary Figure S3). This bias was corrected by randomly sampling a subset of SNPs from the genotyping data according to the distribution of MAFs from the sequencing data, with this approach repeated 10 times. Interestingly, even after correction, the CEU diversity was still slightly higher than that of CHB and JPT. One possible cause of this difference between European and Asian populations is a difference in the strength and duration of a population bottleneck. To address this problem, the individual diversities of the four reference populations were calculated from sequencing data, and the same pattern was observed as when randomly sampling genotyping data (Supplementary Figure S4). Therefore, it seemed that the higher genetic diversity in the CEU population was intrinsic, suggesting that recent gene flow in Europeans could be an important factor, with genetic contributions from other sources such as African or even possibly some archaic humans29,30 having significantly influenced the European gene pool.

Table 1 F3 test results

A    B    C    F3-score   S.e.      Z-score
YRI  CEU  ARB  −0.01079   0.000294  −36.699
YRI  CEU  IRN  −0.0086    0.000293  −29.382
YRI  CHB  ARB  0.017876   0.000618  28.941
YRI  CHB  IRN  0.019322   0.000653  29.61
YRI  JPT  ARB  0.018061   0.000619  29.193
YRI  JPT  IRN  0.019288   0.000651  29.635
CEU  CHB  ARB  0.008413   0.000285  29.486
CEU  CHB  IRN  −0.00126   0.00026   −4.854
CEU  JPT  ARB  0.008449   0.000289  29.265
CEU  JPT  IRN  −0.00145   0.000264  −5.491

Abbreviations: ARB, Arabian; CEU, Utah residents with Northern and Western European ancestries; CHB, Han Chinese in Beijing, China; JPT, Japanese in Tokyo, Japan; IRN, Iranian; YRI, Yoruba in Ibadan, Nigeria. A and B denote the two proxy parental populations and C is the target population tested. A significantly negative value in the F3 test indicates population admixture in target population C. Z-score < −1.64 (corresponding P-value for one-tailed test is 0.05) indicates statistical significance.

Figure 4 Mean ancestry contributions. (a) Mean ancestry contributions for each single-nucleotide polymorphism (SNP) in Arabian (ARB). Top: mean European ancestry contribution to ARB; bottom: mean Asian ancestry contribution to ARB. (b) Mean ancestry contribution for each SNP in Iranian (IRN). Top: mean European ancestry contribution to IRN; bottom: mean Asian ancestry contribution to IRN. Black solid line denotes the average mean ancestry contribution across the genome; blue solid line denotes the 99.5% quantile; and red solid line denotes the 0.5% quantile.

Admixture has been a common phenomenon throughout the history of modern humans, with previously isolated populations often coming into contact through colonization and migration. It is especially common in the ME, which has been a melting pot of cultures, languages and people. Both prehistoric and recent genetic admixture have greatly influenced the genetic makeup of regional ME populations. On the other hand, consanguineous marriage is prevalent in many ME countries, which is expected to decrease ME genetic diversity. In a retrospective study based on modern human genomic data, it is difficult to fully distinguish the influences of admixture and consanguinity on genetic diversity, as each has a confounding effect on the other. However, we were able to confirm the theoretical predictions and roughly estimate the magnitudes of the influence of admixture and consanguinity based on the statistical approaches used in this study.
Our analyses revealed that the current genetic architectures of the two ME populations were shaped by the joint effect of these two forces, which resulted from historical, cultural and potentially also religious factors. Additionally, we explored the possible influence of a third force on regional genetic diversity: natural selection. Our approach to searching for footprints of natural selection in both ME populations was based on admixture analysis, seeking to identify genomic regions in which local ancestry deviated significantly from the mean genome-wide distribution. While this approach could only detect signatures of natural selection that arose after population admixture, it is extremely interesting that the top candidate genes underlying natural selection in the two ME populations were associated with olfactory pathways. While we could not provide a convincing interpretation for these selection signatures, the statistical signals could not be explained by stochastic processes. This suggests the presence of environmental pressures on these genes in the history of the two ME populations. Taken together, genetic admixture, consanguinity and natural selection have jointly shaped the genetic diversity of the two ME populations, with admixture and consanguinity having opposing effects on diversity, while natural selection exhibits a more regional effect relative to the genome-wide influences of the former two factors.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

ACKNOWLEDGEMENTS

These studies were supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) (XDB13040100) and by the National Science Foundation of China (NSFC) grants (91331204; 31171218).
This research was supported in part by the Ministry of Science and Technology (MoST) International Cooperation Base of China and by the National Center for Mathematics and Interdisciplinary Sciences (NCMIS), Academy of Mathematics and Systems Science, CAS. SX is a Max-Planck Independent Research Group Leader and a member of the CAS Youth Innovation Promotion Association. SX also gratefully acknowledges the support of the National Program for Top-notch Young Innovative Talents of the ‘Ten-Thousand-Talents’ Project and the support of the KC Wong Education Foundation, Hong Kong. Fahd Al-Mulla was supported by the Kuwait Foundation for Advancement of Sciences (No. 2011-1302-06). WG was supported by the Fundamental Research Funds for the Central Universities (2011JBZ019). We thank LetPub (http://www.letpub.com) for its linguistic assistance during the preparation of this manuscript. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author contributions: SX conceived and designed the study; FA-M, SA-B and MM collected and genotyped the samples; XY, QF and WG analyzed the data; XY and SX wrote the paper.

1 Oppenheimer, S. Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map. Philos. Trans. R. Soc. Lond. Ser. B 367, 770–784 (2012).
2 Wilder, J. A., Kingan, S. B., Mobasher, Z., Pilkington, M. M. & Hammer, M. F. Global patterns of human mitochondrial DNA and Y-chromosome structure are not influenced by higher migration rates of females versus males. Nat. Genet. 36, 1122–1125 (2004).
3 Theyab, J. B., Al-Bustan, S. & Crawford, M. H. The genetic structure of the Kuwaiti population: mtDNA inter- and intra-population variation. Hum. Biol. 84, 379–403 (2012).
4 Triki-Fendri, S., Alfadhli, S., Ayadi, I., Kharrat, N., Ayadi, H. & Rebai, A. Genetic structure of Kuwaiti population revealed by Y-STR diversity. Ann. Hum. Biol. 37, 827–835 (2010).
5 Alsmadi, O., Thareja, G., Alkayal, F., Rajagopalan, R., John, S. E., Hebbar, P. et al. Genetic substructure of Kuwaiti population reveals migration history. PLoS ONE 8, e74913 (2013).
6 Hellenthal, G., Busby, G. B., Band, G., Wilson, J. F., Capelli, C., Falush, D. et al. A genetic atlas of human admixture history. Science 343, 747–751 (2014).
7 Xu, S., Jin, W. & Jin, L. Haplotype-sharing analysis showing Uyghurs are unlikely genetic donors. Mol. Biol. Evol. 26, 2197–2206 (2009).
8 Teebi, A. S. & Teebi, S. A. Genetic diversity among the Arabs. Community Genet. 8, 21–26 (2005).
9 Pemberton, T. J., Absher, D., Feldman, M. W., Myers, R. M., Rosenberg, N. A. & Li, J. Z. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 91, 275–292 (2012).
10 Leutenegger, A. L., Sahbatou, M., Gazal, S., Cann, H. & Genin, E. Consanguinity around the world: what do the genomic data of the HGDP–CEPH diversity panel tell us? Eur. J. Hum. Genet. 19, 583–587 (2011).
11 Al-Kandari, Y. Y. & Crews, D. E. The effect of consanguinity on congenital disabilities in the Kuwaiti population. J. Biosoc. Sci. 43, 65–73 (2011).
12 International HapMap Consortium, Altshuler, D. M., Gibbs, R. A., Peltonen, L. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
13 Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
14 Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A. & Reich, D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
15 Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
16 1000 Genomes Project Consortium, Abecasis, G. R., Auton, A., Brooks, L. D., DePristo, M. A., Durbin, R. M. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
17 1000 Genomes Project Consortium, Abecasis, G. R., Altshuler, D., Auton, A., Brooks, L. D., Durbin, R. M. et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
18 Browning, B. L. & Browning, S. R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210–223 (2009).
19 Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
20 Kirin, M., McQuillan, R., Franklin, C. S., Campbell, H., McKeigue, P. M. & Wilson, J. F. Genomic runs of homozygosity record population history and consanguinity. PLoS ONE 5, e13996 (2010).
21 Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
22 Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
23 Guan, Y. Detecting structure of haplotypes and local ancestry. Genetics 196, 625–642 (2014).
24 Jin, W., Wang, S., Wang, H., Jin, L. & Xu, S. Exploring population admixture dynamics via empirical and simulated genome-wide distribution of ancestral chromosomal segments. Am. J. Hum. Genet. 91, 849–862 (2012).
25 Price, A. L., Tandon, A., Patterson, N., Barnes, K. C., Rafaels, N., Ruczinski, I. et al. Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet. 5, e1000519 (2009).
26 Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
27 Huang, D. W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009).
28 Al-Bustan, M., Majeed, S., Bitar, M. S. & Al-Asousi, A. Socio-demographic features and knowledge of diabetes mellitus among diabetic patients in Kuwait. Int. Q. Community Health Educ. 17, 65–76 (1997).
29 Der Sarkissian, C., Balanovsky, O., Brandt, G., Khartanovich, V., Buzhilova, A., Koshel, S. et al. Ancient DNA reveals prehistoric gene-flow from Siberia in the complex human population history of North East Europe. PLoS Genet. 9, e1003296 (2013).
30 Botigue, L. R., Henn, B. M., Gravel, S., Maples, B. K., Gignoux, C. R., Corona, E. et al. Gene flow from North Africa contributes to differential human genetic diversity in southern Europe. Proc. Natl Acad. Sci. USA 110, 11791–11796 (2013).

Table 2 Functional annotation of the overlapped SNPs beyond 0.5% quantile in European or Asian ancestries within ARB and IRN populations

Category          Term                                     P-value   Benjamini  FDR
GOTERM_BP_FAT     Sensory perception of smell              1.50E−29  8.50E−27   2.10E−26
SP_PIR_KEYWORDS   Olfaction                                3.50E−29  6.10E−27   4.30E−26
GOTERM_MF_FAT     Olfactory receptor activity              4.70E−28  8.30E−26   5.80E−25
GOTERM_BP_FAT     Sensory perception of chemical stimulus  9.70E−28  2.80E−25   1.40E−24
SP_PIR_KEYWORDS   Sensory transduction                     5.20E−23  4.50E−21   6.30E−20
KEGG_PATHWAY      Olfactory transduction                   1.60E−22  8.20E−21   1.60E−19
SP_PIR_KEYWORDS   G-protein-coupled receptor               1.20E−20  7.00E−19   1.50E−17
SP_PIR_KEYWORDS   Transducer                               1.60E−19  7.00E−18   2.00E−16
GOTERM_BP_FAT     Sensory perception                       6.80E−19  1.30E−16   1.00E−15
GOTERM_BP_FAT     Cognition                                4.50E−17  6.50E−15   6.60E−14

Abbreviations: ARB, Arabian; FDR, false discovery rate; IRN, Iranian; SNP, single-nucleotide polymorphism.
Supplementary Information accompanies the paper on the Journal of Human Genetics website (http://www.nature.com/jhg)

nihms788970

Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery

Eric M. Scott1,2,3, Anason Halees4,5,6, Yuval Itan1,7, Emily G. Spencer1,2,3, Yupeng He1,2,3, Mostafa Abdellateef Azab1,2,3, Stacey B.
Gabriel8, Aziz Belkadi9,10, Bertrand Boisson8,9,10, Laurent Abel6,9,10, Andrew G. Clark11, Greater Middle East Variome Consortium1,2,3, Fowzan S. Alkuraya12,13, Jean-Laurent Casanova1,7,9,10,14, and Joseph G. Gleeson1,2,3 1Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA 2Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA 3Laboratory for Pediatric Brain Disease, The Rockefeller University, New York, NY 10065, USA 4Department of Biostatistics, King Faisal Specialist Hospital & Research Center, Riyadh, 11211, Saudi Arabia 5Department of Epidemiology, King Faisal Specialist Hospital & Research Center, Riyadh, 11211, Saudi Arabia 6Scientific Computing, King Faisal Specialist Hospital & Research Center, Riyadh, 11211, Saudi Arabia 7St. Giles Laboratory of Human Genetics of Infectious Diseases, The Rockefeller University, New York, NY, 10065, USA 8The Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms Correspondence: jogleeson@rockefeller.edu. 
#Full list of Consortium contributors provided in Acknowledgements URLs ANNOVAR, http://annovar.openbioinformatics.org Kinship-based INference for Gwas (KING), http://people.virginia.edu/~wc9c/KING/ Plink, http://pngu.mgh.harvard.edu/~purcell/plink/ PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/ PSEQ, http://atgu.mgh.harvard.edu/plinkseq/pseq.shtml SnpEff, http://snpeff.sourceforge.net/SnpEff_manual.html UCSC Genome Browser, http://genome.ucsc.edu 1,000 Genomes Browser, http://browser.1000genomes.org Consang.net, http://consang.net/index.php/Global_prevalence Denisovan to Human alignment (FTP), http://www.eva.mpg.de/denisova Neanderthal to Human alignment (FTP), http://cdna.eva.mpg.de/neandertal GME Variome, http://gme.igm.ucsd.edu Author contributions E.M.S. performed analysis and generated all figures. A.H, Y.I, Y.H., M.A.A. consulted on analysis. E.G.S., A.B., B.B., A.A., F.S.A., J.-L.C., J.G.G. contributed subjects and jointly wrote and edited the manuscript. S.B.G. oversaw sequencing. A.G.C. consulted on population studies. GME Consortium identified subjects for study. Competing financial interests The authors declare no competing financial interests Published as: Nat Genet. 2016 September ; 48(9): 1071–1076. 
9Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, INSERM, Paris, France, EU 10Paris Descartes University, Imagine Institute, Paris, France, EU 11Department of Molecular Biology and Genetics, Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA 12Department of Genetics, King Faisal Specialist Hospital and Research Center, Riyadh, Saudi Arabia 13Department of Anatomy and Cell Biology, College of Medicine, Alfaisal University, Riyadh, Saudi Arabia 14Pediatric Hematology-Immunology Unit, Necker Hospital for Sick Children, Paris, France, EU
Abstract
The Greater Middle East (GME) has been a central hub of human migration and population admixture. The tradition of consanguinity, variably practiced in the Gulf region, North Africa, and Central Asia 1–3, has resulted in an elevated burden of recessive disease 4. Here we generated a whole exome GME variome from 1,111 unrelated subjects. We detected substantial diversity from sub-geographies, continental and subregional admixture, and several ancient founder populations with little evidence of bottlenecks. Measured consanguinity was an order of magnitude above that of other sampled populations, and included an increased burden of runs of homozygosity (ROH), but no evidence for reduced burden of deleterious variation due to classically theorized ‘genetic purging’.
Applying this database to unsolved GME recessive conditions reduced the number of potential disease-causing variants by 4–7-fold. These results reveal the variegated GME genetic architecture and support future human genetic discoveries in Mendelian and population genetics.
Keywords
Mutational load; whole exome sequencing; introgression; admixture; inbreeding coefficient; homozygous; derived allele frequency; consanguineous; selective pressure; runs of homozygosity
Nat Genet. Author manuscript; available in PMC 2017 March 01.
The Greater Middle East (GME), loosely defined as a large swath of Arab and non-Arab Muslim countries from Morocco in the west to as far east as Pakistan 5, is home to approximately 10% of the world’s population. Despite its invaluable contribution to our understanding of the genetic causes of inherited conditions, especially recessive conditions, and its critical role as a crossroads of early civilizations, its genetic architecture and the extent of its rare genetic variation remain poorly defined 6–8. To address this shortcoming, the GME Variome Consortium collected whole-exome data on 1,794 self-reported nationals from GME regions participating in on-going genetics studies. In order to minimize selection bias or overrepresentation of disease alleles, we selected primarily healthy individuals from families, and wherever possible, removed from datasets the allele that brought the family to medical attention. Samples were jointly processed, and filtered for quality and familial relation, leaving 1,111 high-quality unrelated individuals.
We grouped the 1,111 GME exomes into six different GME subregions: Northwest Africa (NWA, 85 samples), Northeast Africa (NEA, 423 samples), Turkish Peninsula (TP, 140 samples), Syrian Desert (SD, 81 samples), Arabian Peninsula (AP, 214 samples), and Persia and Pakistan (PP, 168 samples) (Fig. S1, Table S1), which represent historic groupings, and then compared them with exomic data of nine established continental populations from 1000 Genomes (1000G) 9. Unbiased identity-by-state clustering showed that samples largely grouped according to the location of ascertainment, validating grouping criteria (Fig. S2).
To evaluate GME genetic substructure, we ran the unsupervised algorithm ADMIXTURE 10, where K=6 clusters minimized cross-validation error (Fig. S3). We found some overlap with the primary admixture components from Africa, Europe and East Asia at the edges of geography, but also a large proportion not found in previous reference samples (Figs. 1a, S4). The admixture results also aligned with publications reporting common variation 11–13. The least admixed samples were found in NWA, AP, and PP, suggesting these were founder populations, but showed inter-regional variation of GME-specific components suggesting local admixture (Fig. 1b), and potentially supporting historic events. The NWA component was found from west to east across North Africa, likely representing the presence of Berber genetic background 14. The AP component likely represented ancestral Arab populations and was observed in nearly all regions, possibly a result of the Arab conquests of the 7th century coincident with the expansion of the Arabic language now spoken over much of the region. Similarly, the Persian expansion into TP, SD, and parts of NEA in the 5th century was the most likely contributor of PP signal.
Additional sources of human heterogeneity derive from ancient introgression. We found similar patterns of Neanderthal introgression across all GME populations with the exception of NWA, which clustered closer to Sub-Saharan Africans (Fig. S5) 15–17. These data support the reduced Neanderthal introgression observed in native African populations.
Patterns of human migration and drift were recapitulated using TreeMix among GME subregions, based upon 1000G control populations (Fig. 1c) 18. The inferred tree with no migration showed tight clusters of European and Asian populations, but much larger apparent divergence among GME regions. The ordering of GME subregions from the root corroborated much of the ‘out-of-Africa’ ordering of subsequent founder populations 13. Within the GME, the distance from the root emulated the west-to-east organization of GME samples, with PP showing the largest inferred drift parameter, supporting a west-to-east trajectory of human migrations.
Assessment of Wright’s fixation index (Fst) demonstrated that the GME grouped with European populations, agreeing with TreeMix results. This resulted in three distinct clusters with a low degree of differentiation (Figs. 1d, S6). PP and NWA represented the extremes of the identified subregions, and showed the highest degree of differentiation (Fst = 0.026) (>2x
compared to the distance between Finnish (FIN) and Toscani (TSI) but smaller than
intercontinental comparisons). Of the four measured 1000G European populations, GME Fst
measurements were closest to TSI, especially SD and TP, consistent with higher levels of
European admixture in these populations. Despite the contribution of admixture, these
values suggested extended periods of isolation relative to 1000G populations within each
subregion.
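To make the Fst comparisons above more concrete, the following minimal sketch computes a simple Wright-style Fst for two populations as (H_T − H_S)/H_T, summed across sites. It assumes equal population weights and known allele frequencies; it is not the Weir & Cockerham estimator used for the reported values, and the function name `wright_fst` is hypothetical.

```python
def wright_fst(freqs_pop1, freqs_pop2):
    """Ratio-of-sums Fst for two populations from per-site allele frequencies.

    Illustrative only: equal population weights, no sample-size correction.
    """
    if len(freqs_pop1) != len(freqs_pop2):
        raise ValueError("frequency lists must have the same length")
    h_within = 0.0  # summed mean within-population expected heterozygosity (H_S)
    h_total = 0.0   # summed expected heterozygosity of the pooled population (H_T)
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        h_within += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        p_bar = (p1 + p2) / 2
        h_total += 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # no polymorphic sites; report no differentiation
    return (h_total - h_within) / h_total
```

Identical frequencies in both populations give 0, and fixed differences give 1; values such as the 0.026 quoted above sit near the low end of that range.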
Inter-subregion relationships were tested using principal component analysis (PCA). As
expected, the first two PCs separated along well-established geographic axes: PC1 separated
Sub-Saharan Africans from all other populations, and PC2 separated Eurasian populations
(Fig. S7). GME sub-regions fell between the 1000G African, East Asian, and European
populations, supporting recent admixture. PP and TP were closer to East Asian, while NEA, NWA, and
SD were closer to Sub-Saharan Africans. PC3 and PC4 separated samples along
topographical north-south and east-west gradients, while exhibiting largely distinct but
overlapping groups with a high degree of inter-region diversity (Fig. 2a).
To test whether these populations were subject to bottlenecks, we calculated the mean linkage
disequilibrium (LD) decay, because LD should decay more slowly with distance in populations
that experienced stronger bottlenecks (Fig. 2b). LD for each GME population decayed faster than
European and East Asian but slower than African populations. LD decayed faster in NWA
and NEA compared with other GME regions, in agreement with our TreeMix results.
Diverse patterns of admixture across these regions suggested these trends were not
predominantly due to intermixing, but instead argued for a historic common ancient
bottleneck.
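As an illustration of how an LD decay curve of this kind can be tabulated, the sketch below computes r2 between pairs of SNPs as the squared correlation of genotype dosages (0/1/2) and averages it within distance bins up to 70 kb. The dosage-based r2, the 10 kb bin size, and the function names are assumptions made for the example, not a description of the tool settings used in the study.

```python
from collections import defaultdict


def dosage_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return None  # monomorphic site: r2 is undefined
    return cov * cov / (v1 * v2)


def ld_decay(positions, genotypes, max_dist=70_000, bin_size=10_000):
    """Mean r2 per distance bin for all SNP pairs at most max_dist bp apart.

    positions: base-pair coordinate of each SNP (same chromosome).
    genotypes: one dosage vector per SNP, all over the same samples.
    Returns {bin_start_bp: mean_r2}.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dist = abs(positions[j] - positions[i])
            if dist > max_dist:
                continue
            r2 = dosage_r2(genotypes[i], genotypes[j])
            if r2 is None:
                continue
            bin_index = dist // bin_size
            sums[bin_index] += r2
            counts[bin_index] += 1
    return {b * bin_size: sums[b] / counts[b] for b in sorted(sums)}
```

Plotting the returned means against bin start gives the decay curve; a population with a stronger bottleneck would be expected to retain higher mean r2 at larger distances.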
An estimated 20–50% of all GME marriages are consanguineous (compared with < 0.2% in the Americas and Western Europe) 1–3, with the majority being between first cousins. This roughly 100X higher rate of consanguinity has correlated with roughly a doubling of the rate of recessive Mendelian disease 19,20. European, African and East Asian 1000G populations all had distributions of estimated inbreeding coefficients (F) ~0.005, whereas GME F values ranged from 0.059 to 0.098, but with high variance within each population (Fig. 2c). Thus, measured F was ~10–20X higher, reflecting the shared blocks common to all human populations. F values were dominated by immediate family structure rather than historic or population-wide data trends (Fig. S8) 21. Examining the larger set of 1,794 exomes that included many parent-child trios also showed an overwhelming influence of immediate family structure, in which offspring from first-cousin marriages displayed higher F values compared with non-consanguineous marriages (Fig. 2d).
We expected that higher F values would correlate with an increased burden and length of ‘runs of homozygosity’ (ROH), defined as homozygous haplotypes as a function of length 22. 1000G sub-Saharan Africa displayed the smallest total ROH as expected 23, whereas the two other 1000G assessed populations were relatively similar to GME (Figs. 3a, S9), probably reflecting similar lengths of short (<0.515 Mb) and medium (0.516–1.606 Mb) ROH. Most striking was the increase in long ROH (>1.607 Mb), found nearly exclusively in
GME samples, especially for those over 4 Mb (Fig. 3b). In the GME, there was an
enrichment of rare and very rare variants (AF < 0.05 and AF < 0.01, respectively) in longer ROH, and of common variants (AF >= 0.05) in shorter ROH (Fig. 3c), suggesting that the longer ROH
result from recent consanguinity 24.
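The inbreeding coefficients (F) compared above can be illustrated with a heterozygosity-based, method-of-moments estimate of the general form F = (O − E)/(N − E), where O is the observed number of homozygous sites, E the number expected under Hardy-Weinberg proportions, and N the number of sites genotyped. The sketch below is a simplified version under stated assumptions: allele frequencies are taken as known, no small-sample correction is applied, and the function name is hypothetical.

```python
def inbreeding_f(genotypes, alt_freqs):
    """Method-of-moments F = (O - E) / (N - E) over non-missing sites.

    genotypes: per-site dosages (0, 1, 2) for one individual; None if missing.
    alt_freqs: population alternate-allele frequency at each site.
    """
    observed_hom = 0
    expected_hom = 0.0
    n_sites = 0
    for g, p in zip(genotypes, alt_freqs):
        if g is None:
            continue  # skip missing calls
        n_sites += 1
        if g in (0, 2):
            observed_hom += 1
        expected_hom += 1 - 2 * p * (1 - p)  # Hardy-Weinberg homozygosity
    if n_sites == expected_hom:
        raise ValueError("no informative (polymorphic) sites")
    return (observed_hom - expected_hom) / (n_sites - expected_hom)
```

An individual with more homozygous sites than expected gets a positive F, while an excess of heterozygous sites gives a negative value.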
This increased ROH provided an opportunity to identify homozygous loss-of-function
variants (LOF) in healthy humans. While these variants are only putatively LOF until
experimentally verified, they exhibit the strongest signs of selective pressure and are the
first checked as disease candidates 25. Recently, among 2,636 sequenced and 101,584 chip-imputed
Icelanders, 1,171 genes were predicted to be inactivated 26. From our 354 exomes
on verified healthy adults, we found 301 genes with rare homozygous putative LOF variants
(Table S2, S3), with only 50 genes overlapping with the Icelandic gene list. Similarly, the
ExAC dataset on 60,706 sequenced individuals identified 2,068 inactivated genes, of which
only 94 overlapped with our 301 genes. This suggests that the set of non-clinically
relevant LOF variants is far from being exhausted. The GME represents an optimal
population from which to identify homozygous variants due to the elevated consanguinity
rates.
Darwin observed that rare self-fertilized orchid strains exhibited surprisingly higher fitness
than founder strains, which he termed ‘hero strains’ 27. This led to the concept of ‘purging of
recessive alleles’ by Haldane 28, referring to increased loss of deleterious alleles due to
increased selective pressure in inbred populations. Purging was hypothesized to impact the
GME genome due to the higher rates of birth defects incompatible with future
reproduction 29, but has yet to be documented in humans. We compared the distribution of
derived allele frequencies (DAF) in GME and 1000G populations 30. Variants were divided
into 7 functional and PolyPhen-2 deleterious classes. We calculated mean DAFs using
chimpanzee (PanTro2) as the common ancestor (Figs. S10, S11) 31. Neither autosomal nor
X-linked variants showed significant differences (Fig. S12), arguing against a measurable
effect on overall variant burden resulting from consanguinity.
Numerous studies have relied on the increased power of GME-resident consanguineous
families to identify causes of recessive disease, but the lack of an accessible variome has
hindered progress. Efforts like the NHLBI GO Exome Sequencing Project (ESP) produced
variomes for European American (EA) and African American (AA) populations, but poor
correlation of DAFs between population pairs indicated that neither was a good estimator
for GME DAFs (Pearson’s r = 0.7979 for GME vs. EA, 0.385 for GME vs. AA, and 0.1447 for EA vs. AA;
Figs. 4a, S13). Moreover, we found much of the GME variation to be poorly represented
outside the GME (Fig. 4b), with the majority of variants in the rarest DAF bin found only in
the GME.
In order to assess how well the GME Variome captured extant exome variation, we sub-
sampled the cohort for 100 iterations from 5 to 700 individuals, and for 8 variant classes (see
methods, Fig. 4c). There was decay in the number of unique variants and accumulation of
rare variants as sample size increased, due to a scaled ability to estimate prevalence. When
sampled near 1,000 individuals, the change in mean of these values was negligible as new
samples were added. Thus the GME Variome should allow accurate determination of
population-level DAFs for all but the rarest alleles.
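The sub-sampling idea can be sketched as a simple discovery curve: draw random subsets of the cohort at increasing sizes, repeat each draw many times, and record how many distinct variants are observed on average. The example below is a simplified illustration and does not reproduce the per-class analysis of Fig. 4c; the data layout (one set of variant IDs per individual), the fixed random seed, and the function name are assumptions.

```python
import random


def subsample_variant_counts(carriers_by_sample, sizes, iterations=100, seed=0):
    """Mean number of distinct variants seen in random subsamples of each size.

    carriers_by_sample: list of sets, one set of variant IDs per individual.
    sizes: subsample sizes to evaluate (each at most the cohort size).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    means = {}
    for size in sizes:
        total = 0
        for _ in range(iterations):
            chosen = rng.sample(carriers_by_sample, size)
            total += len(set().union(*chosen))
        means[size] = total / iterations
    return means
```

When the mean stops changing appreciably as the subsample size grows, additional samples are adding few previously unseen variants.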
In order to investigate the potential of the Variome to expedite the discovery of new disease
genes, we compared causal variant sets from GME families displaying recessive hereditary
spastic paraplegia (HSP), where we recently established 17 new genetic forms of disease 32.
For a disease like recessive HSP with a prevalence of 3–10 per 100,000 33 and where there
are more than 40 genetic forms and hundreds of individual genetic mutations known, the
expected allele frequency for any causative mutation should be <1:1000 (see methods). Select individuals from 20 representative families underwent whole exome sequencing. For each family, we calculated the number of alleles that passed standard filtering (i.e. LOF or otherwise potential ‘high impact’) 34 and were unique, both without and with the DAFs of the Variome (see website for Variome below). Using only exome data from the 20 families and public sources, there were on average 56, 20, and 11 unique variants passing filters from families with one, two or three sequenced affected members, respectively. In contrast, by accessing the Variome there were on average 13, 5, and 4 unique variants (Fig. 4d, Table S4), yielding a 4–7-fold reduction of the number of variants requiring further consideration. Loosening the allowable AFs to <1:500, <1:333 or <1:250 also showed substantial reduction in the number of variants for consideration.
Here we have interrogated the fine-scale genomic structure across the GME, shaped by prehistoric as well as historic migrations, conquests, and cultural traditions. The degree of unique genetic variation represented in the GME was surprising given previous efforts to capture diversity, and speaks to the value of sampling understudied populations. The data support records of migrations and conquests, but also suggest a previously unstudied GME contribution. Despite millennia of elevated consanguinity in the GME, we detected no evidence for purging of recessive alleles. Instead, we detected large rare homozygous blocks, distinct from the small homozygous blocks found in other populations, supporting recent consanguineous matings, and allowing identification of genes harboring putatively high impact homozygous variants in healthy humans from this population. Applying the Variome to future sequencing projects for GME-originating subjects could aid in recessive gene identification across all classes of disease. The GME Variome is a publicly accessible resource that will facilitate a broad range of genomic studies in the GME and globally.
Online Methods
1. Definition of the Greater Middle East
The term “greater Middle East” has been used to refer to a large swath of Arab and non-Arab Muslim countries, stretching from Morocco in the west to as far east as Pakistan in South Asia. However, no precise listing of designated countries has yet emerged. “U.S. Working Paper for G8 Sherpas,” Al-Hayat, February 13, 2004. Available online at [http://english.daralhayat.com/Spec/02-2004/Article-20040213-ac40bdaf-c0a8-01ed-004e-5e7ac897d678/story.html] and [https://www.fas.org/sgp/crs/mideast/RS22053]. Editable map of the Middle East was downloaded from [http://www.presentationmagazine.com].
2. Exome Resequencing
2.1 Study sample—The 2,497 individuals used in the analysis were selected from samples ascertained across three labs and recruited with the help of clinicians that constituted the GME Consortium. Although these individuals were not a random sample, they were ascertained within a wide variety of distinct phenotypes such that cohort-specific effects were not expected to bias patterns of variation.
All study participants in each of the component studies provided written informed consent for the use of their DNA in studies aimed at identifying genetic risk variants for disease, and for broad data sharing. Institutional certification was obtained for each sample to allow deposition of genotype data in dbGaP and other purposes.
2.2 Exome resequencing, variant calling, and filtering—Blood DNA was extracted using Qiagen reagents, subjected to exome capture with the Agilent SureSelect Human All Exome 50 Megabase (Mb) kit, and sequenced on an Illumina HiSeq2000 instrument, resulting in ~94% target coverage at > 30X depth 35–37. To minimize batch effects and ensure consistent variant calling, FASTQ files were reprocessed and jointly called
using the GATK pipeline (version 3.1–1), adhering to best practices 38 and
eliminating duplicate reads. Paired-end reads
were aligned to the human reference genome NCBI Build 37, using BWA (version 0.7.5) 39.
Principal component analysis (PCA) was run on the resultant set of variants to identify
potential batch effects between labs, sequencing centers, or collectively run groups of
samples, and samples were eliminated until no batch effects were observed.
We calculated four quality control (QC) metrics for each sample using PSEQ and identified
statistical outliers. Metrics included: total number of variants, transition/transversion ratio,
number of sequenced positions, and number of singletons. Due to possible reference
distance bias, we considered samples grouped by geographic region independently. Samples
were identified as outliers using a cutoff of >5 standard deviations from the mean threshold
for each QC metric, removing 314 samples. The PCA based outlier analysis algorithm from
the EIGENSOFT software library was also run, but failed to find any additional samples
violating a standard deviation threshold of 5.0 40.
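The per-region outlier rule described above can be sketched as follows: samples are grouped by geographic region, and within each group a sample is flagged when a QC metric lies more than 5 standard deviations from the group mean. This is a minimal illustration, not the PSEQ or EIGENSOFT procedure; the input dictionaries and the function name `flag_outliers` are hypothetical, and the sample standard deviation is an assumed choice.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(metric_by_sample, region_by_sample, n_sd=5.0):
    """Return IDs of samples whose metric is more than n_sd SDs from their region mean.

    metric_by_sample: sample ID -> QC metric value (e.g. number of singletons).
    region_by_sample: sample ID -> geographic region label.
    """
    by_region = defaultdict(list)
    for sample, value in metric_by_sample.items():
        by_region[region_by_sample[sample]].append((sample, value))
    outliers = set()
    for members in by_region.values():
        values = [v for _, v in members]
        if len(values) < 2:
            continue  # standard deviation is undefined for a single sample
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # all values identical: nothing to flag
        outliers.update(s for s, v in members if abs(v - mu) > n_sd * sd)
    return outliers
```

Running the function once per QC metric and taking the union of the flagged IDs would give the set of samples to remove.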
To ensure unbiased population structure statistics and allele frequency estimates, we
removed close and cryptic relationships from the dataset. Kinship estimation was generated
using KING, which calculated relatedness between all pairs of individuals and was robust to
population structure 41. Using the 182,967 LD filtered SNPs, we ran KING following
standard guidelines for a 3rd degree relationship (i.e. first cousins), using a kinship
coefficient of 0.04419. When a cluster of related individuals was identified, we preferentially
removed the individuals whose exclusion left the largest number of samples. Of the remaining 2,183 samples after
outlier filtering, 667 samples were removed to reduce dataset relatedness, leaving a final
cohort of 1,516 unrelated individuals. Remaining samples were rerun through KING,
which identified no additional kinships. Final continental sample counts after filtering: Sub-Saharan
Africa: 19, America: 33, Europe: 378, Oceania: 1, and Middle East: 1,111.
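One way to remove related individuals while retaining as many samples as possible is a greedy heuristic: repeatedly drop the individual with the most remaining relatives until no related pair is left. The sketch below assumes pairwise kinship coefficients are already available (for example from KING output) and reuses the 0.04419 cutoff quoted above; the greedy strategy and the function name are assumptions for illustration and are not guaranteed to find the largest possible unrelated set.

```python
def prune_related(samples, kinship_pairs, threshold=0.04419):
    """Greedy pruning: repeatedly drop the sample with the most related partners.

    samples: iterable of sample IDs.
    kinship_pairs: dict mapping (sample_a, sample_b) -> kinship coefficient;
        both IDs must appear in samples.
    Returns the set of retained samples, with no remaining pair >= threshold.
    """
    related = {s: set() for s in samples}
    for (a, b), phi in kinship_pairs.items():
        if phi >= threshold:
            related[a].add(b)
            related[b].add(a)
    kept = set(related)
    while True:
        # Most-connected remaining sample; ties broken by sorted ID order.
        worst = max(sorted(kept), key=lambda s: len(related[s] & kept), default=None)
        if worst is None or not (related[worst] & kept):
            return kept
        kept.remove(worst)
```

For a family cluster in which one individual is related to everyone else, this removes that individual first and keeps the rest.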
Coverage statistics were generated across all internal exome data sets using BEDTools, to
calculate the average coverage across each exon 42. Exons were filtered from the analysis if
greater than 5% of samples had less than 10x average coverage. Out of the initial 192,056
exons targeted by the Agilent SureSelect II capture kit, 170,032 exons were well covered in
at least 95% of samples. Variants were filtered if identified outside of these genomic regions,
leaving 32,967,859 bases under consideration (~1% of the human genome) within 17,800
genes.
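The exon-level coverage rule above (drop an exon if more than 5% of samples have less than 10x average coverage) can be expressed in a few lines. The sketch assumes per-sample mean depths per exon have already been computed, for example with BEDTools; the dictionary layout and the function name are hypothetical.

```python
def well_covered_exons(coverage, min_depth=10.0, max_fail_fraction=0.05):
    """Keep exons where at most max_fail_fraction of samples fall below min_depth.

    coverage: dict mapping exon ID -> non-empty list of per-sample mean depths.
    Returns exon IDs in input order.
    """
    kept = []
    for exon, depths in coverage.items():
        low = sum(1 for d in depths if d < min_depth)
        if low / len(depths) <= max_fail_fraction:
            kept.append(exon)
    return kept
```

Variants falling outside the retained exons would then be excluded from downstream analyses.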
Standard filters retained variants that were called with posterior probability >99% (glfMultiples
SNP quality > 20), were at least 5 bp away from an indel detected in the 1000G Pilot
Project, were targeted in at least 95% of individuals, and had a total depth across samples
between 6,823 and 6,823,000 (~1–1000 reads per sample on average) 9. Variant positions
were filtered based on population statistics including a ‘missingness’ rate (referring to the
percent of samples where information was missing) of less than 5%, and Hardy-Weinberg
equilibrium (HWE) deviation p-value < 0.00005 43.
We generated a subset of variants in minimal linkage disequilibrium (LD) by pruning variants exhibiting pairwise linkage disequilibrium (r2). Variants were filtered to exclude SNPs with minor allele frequency (MAF) <5%, and all indels. Remaining SNPs were pruned adhering to a maximum r2 threshold of 0.5 using PLINK’s ‘--indep-pairwise’ command 43. Of the initial 578,231 variants, 182,967 SNPs passed filters. This LD pruned dataset was used for population structure characterization including principal component analysis (PCA), Wright’s fixation index (Fst), admixture analysis, KING relationship testing, and estimation of inbreeding coefficient.
2.3 Geographic region assignment—Samples were recruited from 20 countries and territories across the GME and grouped into a set of six geographic regions: Northwest Africa (85 Samples), Northeast Africa (423 Samples), Arabian Peninsula (214 Samples), Syrian Desert (81 Samples), Turkish Peninsula (140 Samples), and Persia and Pakistan (168 Samples). Country boundaries were not used to group samples for two reasons:
1] Inconsistent sampling left several countries with too few samples to accurately represent the diversity of the population. Syria and Yemen, for example, were only represented by a few samples, due to ongoing conflicts.
2] Current country borders frequently fail to accurately separate ethnicities, due to a combination of recent migrations and recent political history. For example, south-eastern Arabian Peninsula Bedouin tribes do not distinguish between the relatively recently defined borders of Oman and the UAE.
Self-identified ethnicities were available for some samples, but incompleteness of this annotation, and the great diversity of populations affiliating as “Arab”, prompted use of geography for groupings. As much as possible we assigned location to the current residence, rather than ancestral residence or location where samples were drawn.
While some reference GME ethnicities exist in public resources, such as the Human Genome Diversity Project (HGDP) 44, we found both the breadth of ascertained ethnicities and sample size insufficient to impute ethnicities where absent.
The original cohort was largely composed of samples from GME countries, but also included samples of African, European, and East Asian descent. To ensure consistency in our geographic designations we performed linkage clustering, based on pairwise distances between samples using Plink’s ‘--distance-matrix’ command 43. We performed hierarchical clustering on all samples using Ward’s hierarchical clustering method (“ward.D2” option for the “hclust” algorithm in R) 45.
3. Population Structure of GME
3.1 Data integration—Population structure was analyzed in the context of continental populations from the 1000 Genomes Phase I (1000G) dataset 9. As 1000G samples were generated from a combination of whole genome and exome sequencing, variants falling outside of RefSeq exonic regions +/− 30 base pairs (bp) were filtered using BedTools and merged with the GME cohort 46,47. Nine populations from 1000G data were used in comparative analyses: African populations YRI and LWK; East Asian populations CHB, CHS, and JPT; and European populations GBR, TSI, IBS, and FIN. Related 1000G samples were filtered by a KING analysis as previously described. A total of 1,821 samples remained after filtering, representing 15 geographic regions, 6 from the GME and 9 from 1000G.
3.2 Substructure analysis—To investigate the influence of admixture on the GME samples, we used the block relaxation algorithm implemented in ADMIXTURE to estimate individual ancestry proportions given K ancestral populations 48.
Unsupervised ADMIXTURE was run using default settings (folds=5) on merged GME and 1000G samples and iterations of K values from 2 to 14. The minimum squared error from ADMIXTURE’s cross-validation procedure, which evaluates the fit of different values of K, identified an optimum of K = 6 for GME samples alone and K = 7 when 1000G control data were included.
3.3 PCA and Wright’s Fixation Index (Fst)—Principal component analysis (PCA) was used to investigate the affinities within human populations and the relationships between them. We performed PCA on GME and 1000G samples using the SmartPCA tool from the EIGENSOFT software library, and the first four principal components were compared graphically 40,49. Wright’s fixation index (Fst) was used to explore the degree of differentiation between populations. Fst values and standard error for all pairs of populations were calculated using the estimator of Weir & Cockerham, also included in the EIGENSOFT software library. All plots were generated using ggplot2 50.
3.4 LD decay—Pairwise linkage disequilibrium among pairs of SNPs is an indicator of the past history of recombination and genetic drift. To calculate LD, we tallied pairwise r2 for SNP pairs for all GME and control populations using the Plink “r2” option 43. Correlations between all SNPs falling within each sliding window of 70 kilobases (kb) were calculated with no lower limit on r2 values. Pairwise correlations were binned by genomic distance between SNPs (up to 70 kb), and averages calculated for each bin. Control samples followed expected patterns of LD decay.
3.5 Estimation of inbreeding—The inbreeding coefficient of an individual (F) was used to represent the probability that two randomly chosen alleles at a homologous locus within an individual were identical by descent (IBD) with respect to a base reference population in which all alleles were independent. While the true inbreeding coefficient of an individual is often unknown, several estimation methods have been shown to give a reasonable estimate. F estimates were calculated using the Plink “het” algorithm on LD pruned variants following the authors’ guidelines 43. We compared results to the HMM algorithm Festim 51 and found the two estimates were very similar (Pearson’s r: 0.874), but Festim frequently failed to return results for samples with missing data. Negative F values were most likely the result of biased variant sampling, a high degree of interracial marriage, or recent intermixing of previously disparate populations 8.
3.6 Runs of homozygosity (ROH) estimation—To infer estimates of the autozygosity and relative recent population size, we estimated runs of homozygosity using the HMM algorithm H3M2 52. H3M2 was run directly on aligned BAM files, following the authors’ recommendations for all parameters. The proportion of genome and exome falling within ROH was calculated for each sample using BedTools. ROH length classes were based on published ranges 23, where the authors used machine learning to identify three ROH classes: Short (<0.515 Mb), Medium (0.516–1.606 Mb), and Long (>1.607 Mb). We
compared densities of ROH lengths from internal data and found a nearly identical distribution
to the published values used to identify these classes.
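To illustrate how ROH segments can be assigned to the three length classes, the sketch below classifies each segment and sums total ROH length per class. It assumes class boundaries of 0.516 Mb and 1.606 Mb, following the ranges quoted in the main text; how lengths falling exactly on a boundary are handled is an assumption, as are the constant and function names.

```python
SHORT_MAX_MB = 0.516   # assumed boundary between short and medium ROH
MEDIUM_MAX_MB = 1.606  # assumed boundary between medium and long ROH


def classify_roh(length_mb):
    """Assign an ROH length in megabases to 'short', 'medium', or 'long'."""
    if length_mb < SHORT_MAX_MB:
        return "short"
    if length_mb <= MEDIUM_MAX_MB:
        return "medium"
    return "long"


def total_roh_by_class(segments):
    """Sum ROH length (Mb) per class from (start_bp, end_bp) segments."""
    totals = {"short": 0.0, "medium": 0.0, "long": 0.0}
    for start, end in segments:
        length_mb = (end - start) / 1_000_000
        totals[classify_roh(length_mb)] += length_mb
    return totals
```

Comparing the per-class totals between individuals or populations separates the short and medium ROH shared across populations from the long ROH associated with recent consanguinity.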
4. Variant Annotation and Classification
4.1 Variant annotation—Functional annotation was performed for genetic purging and
loss of function analyses. Variants were annotated using the ANNOVAR suite of scripts
(version 2014Nov12) 53. ANNOVAR classified variants into eight coding region functional
groups including: “frameshift_deletion”, “frameshift_insertion”, “nonframeshift_deletion”,
“nonframeshift_insertion”, “nonsynonymous_SNV”, “stopgain”, “stoploss”, and
“synonymous_SNV”. Non-coding variants are classified as “unknown”. Splicing defects
were identified based on 2 base pair distance from the splice junction, either on the intronic
or exonic side. A predicted deleteriousness classification was generated for each missense
variant using PolyPhen-2 54. The functional designations for PolyPhen-2 include: B
(Benign), P (Possibly Damaging), D (Probably Damaging). We compared these annotations
to those generated by SNPEff 55, and while there were some differences, found distributions
of calls from each sample to be consistent.
4.2 Ancestral allele identification—We used the Chimpanzee genome as the closest
assembled out-group genome. Ancestral allele estimates were obtained by UCSC pairwise
alignments between human reference hg19 and chimp references PanTro2 and PanTro4.
Systematic lookups for all GME and 1000G variants were performed using UCSC Genome
Browser tools and custom scripts to identify associated chimpanzee alleles. We compared
PanTro2 and PanTro4 to assess the difference in correcting the apparent reference bias, but
found both worked equally well.
Estimated ancestral alleles were used as the reference allele to calculate derived allele
frequencies (DAF). DAFs were not calculated for variants where the ancestral allele was not
present in the human germline.
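The polarization step can be sketched for a single biallelic site: if the ancestral (outgroup) allele matches the human reference allele, the alternate allele is derived; if it matches the alternate allele, the reference allele is derived; otherwise no DAF is reported. The function and parameter names below are hypothetical, and the sketch assumes allele counts are already available.

```python
def derived_allele_frequency(ref, alt, ancestral, alt_count, n_chromosomes):
    """DAF for a biallelic site, or None when the ancestral state is unusable.

    ref, alt: human reference and alternate alleles.
    ancestral: outgroup (e.g. chimpanzee) allele aligned to this position.
    alt_count: number of chromosomes carrying the alternate allele.
    n_chromosomes: number of chromosomes with a genotype call.
    """
    if n_chromosomes <= 0:
        return None
    alt_freq = alt_count / n_chromosomes
    if ancestral == ref:
        return alt_freq        # alternate allele is derived
    if ancestral == alt:
        return 1.0 - alt_freq  # reference allele is derived
    return None                # ancestral allele not observed in humans: skip
```

Sites returning None correspond to the cases described above in which the ancestral allele is not present among the human alleles.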
4.3 Identity-by-state (IBS) distance to reference—To interrogate the potential biases
that might result from reference selection, we calculated the IBS distance between samples
and multiple references, including hg19 and chimpanzee. The distance represents
the proportion of positions that diverge from reference, and was calculated between all pairs
of samples and references.
The IBS distance, d, is the number of differing alleles between two samples
divided by the total number of alleles compared. More formally, let p and q be two n-length
genotype vectors (here, p is the reference and q is the sample being compared) whose entries
take values in v = {0, 1, 2}, encoding the homozygote for the human reference allele, the
heterozygote, and the homozygote for the alternate allele, respectively. For any two samples,
we calculated d as:

$$d(p, q) = \frac{\sum_{i=1}^{n} |p_i - q_i|}{2n}$$

where p = (p1, p2, …, pn) and q = (q1, q2, …, qn).
Each vector represented all genotype calls between the two samples, excluding filtered sites
or missing positions.
The IBS distance was calculated for all GME and 1000G samples against the hg19 and
chimpanzee reference genomes. All genotypes from the merged VCF file were coded based
on a comparison to the hg19 reference. Variant positions were filtered to remove indels, due
to the possibility of alignment errors, and non-biallelic sites. When comparing to hg19,
vector p was represented by a vector of zeros.
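The verbal definition of d (differing alleles divided by alleles compared, with genotypes coded 0, 1, 2) can be sketched in Python as follows. This is an illustrative reading of the definition, not the authors' script. Representing missing or filtered calls as `None` is an assumption of the sketch.

```python
from typing import Optional, Sequence


def ibs_distance(p: Sequence[Optional[int]], q: Sequence[Optional[int]]) -> float:
    """Proportion of differing alleles between two genotype vectors coded 0/1/2.
    Sites where either genotype is None (missing or filtered) are skipped."""
    if len(p) != len(q):
        raise ValueError("genotype vectors must have equal length")
    diff = 0
    compared = 0
    for gp, gq in zip(p, q):
        if gp is None or gq is None:
            continue
        diff += abs(gp - gq)   # 0, 1 or 2 differing alleles at this site
        compared += 2          # two alleles compared per diploid site
    if compared == 0:
        raise ValueError("no comparable sites")
    return diff / compared


sample = [0, 1, 2, None, 0]        # toy genotypes, not real data
hg19 = [0] * len(sample)           # the reference is a vector of zeros
print(ibs_distance(hg19, sample))  # 3 differing alleles out of 8 compared
```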
4.4 Hereditary Spastic Paraplegia (HSP) candidate variant analysis—Samples
from 20 consanguineous families displaying an autosomal recessive inheritance pattern of
HSP were selected from a previously analyzed cohort of 55 families 32; these 20 were chosen
because a single genetic cause had been identified in each. Families were
analyzed in adherence to published methods 32. Briefly, homozygous variants were filtered
based on family structure to ensure variants segregated with the disease phenotype. We
performed deleteriousness filtering using functional classes and GERP++ scores 56. All
candidate variants were either potentially LOF (frameshift, stop, or perturbing splicing) or coding
variants with a GERP++ score >4.
The maximum allele frequency for candidate variants was based on established rates of
disease prevalence, estimated at 1:10,000 for clinical presentations classified as HSP 57.
Approximately 50% of HSP is autosomal dominant and, of the remainder, about 50% is
explained by mutations in SPG11 58, leaving only 1:40,000 with recessive HSP caused by
other genes. At least 35 other genes are reported to cause recessive HSP. Thus, the
contribution to HSP disease prevalence for any given gene is unlikely to be more than
1:1,000,000. While the prevalence of HSP mutations is not expected to be uniform, we expect
the maximum carrier frequency for any new causal variant to be no more than 1:1000,
assuming full penetrance and classic recessive inheritance, and in actuality it is likely to be
much rarer given allelic diversity.
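The arithmetic behind these bounds can be retraced in a few lines of Python. The prevalence figures come from the text. The Hardy-Weinberg step (a fully penetrant recessive disease has prevalence q², so q is the square root of the prevalence) is the standard assumption and is stated here as such rather than taken from the authors' code. Under that assumption, the 1:1000 bound corresponds to the allele frequency q.

```python
from math import sqrt

prevalence_hsp = 1 / 10_000          # all clinical HSP (from the text)
recessive = prevalence_hsp * 0.5     # about half of HSP is autosomal dominant
non_spg11 = recessive * 0.5          # about half of the remainder is SPG11
n_other_genes = 35                   # "at least 35 other genes"
per_gene = non_spg11 / n_other_genes

# Hardy-Weinberg: prevalence of a fully penetrant recessive disease is q**2.
bound = 1 / 1_000_000                # per-gene prevalence bound used in the text
max_allele_freq = sqrt(bound)

print(f"recessive, non-SPG11 prevalence: 1:{1 / non_spg11:,.0f}")
print(f"per-gene prevalence: about 1:{1 / per_gene:,.0f}")
print(f"allele frequency implied by the bound: 1:{1 / max_allele_freq:,.0f}")
```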
With roughly 1000 individuals in our cohort, we calculated that variants with DAFs <1:1000
should not be observed commonly in our dataset, AFs <1:500 should not be observed in
more than 1 individual, AFs <1:333 in no more than 2 individuals, and AFs <1:250 in no
more than 3 individuals. Variants passing the deleteriousness and allele frequency thresholds
were treated as candidates when assessing how useful the Variome is for limiting the number of
deleterious variants considered as candidates.
5. Testing for the Influence of Genetic Purging
Consanguinity has been practiced in the GME for at least several centuries 59. Simulations
of GME-like populations have found that sufficient time has passed for purging to have been
effective in reducing genetic load 29. Clinical studies comparing rates of
birth defects, premature births, or miscarriages between communities that practice
consanguinity and largely out-breeding populations have found that all metrics fall
within the range expected for the rate of the immediate form of consanguinity 21,60,61. More recent genetic
studies investigating differential selective pressure across human populations focused on the
role of population bottlenecks, neglecting the potential influence of consanguinity, and
lacked representation from the GME 31,62. For these reasons, we sought to investigate the
possibility that genetic purging has influenced variant burden in the GME.
In order to approach the question of variable selective pressure across human populations,
we implemented a variation of the DAF comparison method 31. We assumed that any change
in the efficacy of natural selection should be evident across populations in the mean DAF
within each variant class.
For all variants described across the GME and 1000G populations, we filtered for high
quality calls, identified ancestral alleles (described in the “Ancestral Allele” section), annotated
variants for predicted function and PolyPhen-2 classes using ANNOVAR, down-sampled to
achieve equivalent numbers of chromosomes across populations, and calculated DAFs for
all positions. Variants were grouped by class, and the DAF means were calculated for each
population. Standard errors were calculated by bootstrapping DAF means for 1000
iterations.
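A bootstrap standard error for the mean DAF of one variant class in one population might be sketched in Python as below. Resampling individual variants with replacement is an assumption of this sketch. The text specifies only that DAF means were bootstrapped for 1000 iterations. The DAF values are made-up toy numbers.

```python
import random
from statistics import mean, stdev
from typing import Sequence, Tuple


def bootstrap_mean_se(dafs: Sequence[float], iterations: int = 1000,
                      seed: int = 0) -> Tuple[float, float]:
    """Return (mean DAF, bootstrap standard error of the mean)."""
    rng = random.Random(seed)
    n = len(dafs)
    # Each iteration resamples n variants with replacement and records the mean.
    boot_means = [mean(rng.choices(dafs, k=n)) for _ in range(iterations)]
    return mean(dafs), stdev(boot_means)


# Toy DAFs for one variant class in one population (not real data).
toy_dafs = [0.01, 0.02, 0.02, 0.05, 0.10, 0.25, 0.40]
m, se = bootstrap_mean_se(toy_dafs)
print(f"mean DAF = {m:.3f}, bootstrap SE = {se:.3f}")
```

Two populations can then be compared class by class by checking whether their mean DAFs differ by more than the bootstrap standard errors.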
Recent studies using PolyPhen-2 demonstrated a deflation of deleteriousness scores for
derived variants found in the hg19 reference, likely due to a training artifact 31,62. Before
using PolyPhen-2 classes, we corrected this bias for all derived reference positions. Bias
correction was implemented by grouping variants into DAF bins and calculating the
proportions of each PolyPhen-2 class per bin for ancestral reference positions. Using these
proportions as expectations, all derived reference positions were randomly reassigned a
new PolyPhen-2 class based on a hypergeometric distribution within each DAF bin. DAF
means across classes for all included 1000G and GME populations showed no deviation
outside the standard error for any two populations 61.
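One possible reading of this reassignment step is sketched below in Python. It is an interpretation, not the authors' code. The tuple layout, the ten equal-width DAF bins, and the fallback to sampling with replacement when a bin has too few ancestral-reference variants are all assumptions of the sketch. Drawing labels without replacement from the ancestral-reference variants in a bin yields class counts that follow a (multivariate) hypergeometric distribution.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

# (DAF, PolyPhen-2 class, True if the hg19 reference allele is the derived allele)
Variant = Tuple[float, str, bool]


def daf_bin(daf: float, n_bins: int = 10) -> int:
    """Index of the equal-width DAF bin containing daf (bin count is an assumption)."""
    return min(int(daf * n_bins), n_bins - 1)


def reassign_classes(variants: List[Variant], seed: int = 0) -> List[Variant]:
    """Redraw the class of every derived-reference variant from the classes
    observed at ancestral-reference variants in the same DAF bin."""
    rng = random.Random(seed)
    pools: Dict[int, List[str]] = defaultdict(list)
    targets: Dict[int, List[int]] = defaultdict(list)
    for i, (daf, cls, ref_is_derived) in enumerate(variants):
        if ref_is_derived:
            targets[daf_bin(daf)].append(i)
        else:
            pools[daf_bin(daf)].append(cls)

    out = list(variants)
    for b, idxs in targets.items():
        pool = pools.get(b)
        if not pool:
            continue  # no expectation available for this bin: labels left as-is
        if len(idxs) <= len(pool):
            draws = rng.sample(pool, len(idxs))     # without replacement
        else:
            draws = rng.choices(pool, k=len(idxs))  # sketch-only fallback
        for i, cls in zip(idxs, draws):
            daf, _, flag = out[i]
            out[i] = (daf, cls, flag)
    return out
```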
6. Neanderthal and Denisovan Introgression Analysis
Neanderthal-derived variants are often subjected to strong negative selection, thereby
making exome analysis inadequate for estimating age of introgression. Thus we calculated
the proportion observed between extant populations 15,63.
To estimate introgression in exome samples, we identified aligned consensus calls for all
human variant positions from the chimpanzee, Neanderthal and Denisovan reference
genomes. Alignments of Neanderthal and Denisovan genomes to 1000G variant positions
were downloaded from the Max Planck Institute for Evolutionary Anthropology FTP 15,64.
Neanderthal and Denisovan alleles were identified from the hg19-ancestor alignment files.
Chimpanzee alleles were identified as described in the “ancestral allele” section of these
methods.
We projected GME and 1000G control populations on the principal components calculated
using representative samples from Neanderthal, Denisova, and chimpanzee 16,65,66, and
aligned the human samples to these ancestral populations. Principal components were
computed using R’s “prcomp” function (see web resources), and projected vectors were
calculated for all 1000G and GME samples. Distance from the re-adjusted origin to each
species reflected the proportion of introgression observed in each sample. The limited
number of SNPs that were examined in this analysis compared to similar genotype-based
analysis likely inflated the sampling variance within populations, and limited the sensitivity
of our analysis to smaller introgression proportions. Centroids for all populations were
labeled with their abbreviated names. Similar to previous work, Europeans, East Asians, and
GME populations overlapped, and demonstrated larger proportions of Neanderthal ancestry than
African populations 65–67.
Supplementary Material
Refer to Web version on PubMed Central for supplementary material.
Acknowledgments
The authors thank Shamil Sunyaev and David Reich for help with PolyPhen-2 and DAF corrections, Michael
Turchin for help with purging analysis, Joseph Pickrell for help with TreeMix, Vineet Bafna, Nicholas Schork,
Stefano Bonissone for suggestions. Work was supported by grants from the National Institutes of Health
(P01HD070494, R01NS048453), Qatari National Research Foundation (NPRP6-1463), Simons Foundation Autism
Research Initiative (175303 and 275275) to JGG, the Yale Center for Mendelian Disorders (U54HG006504), the
Broad Institute (U54HG003067), The Rockefeller University CTSA (5UL1RR024143-04), the Howard Hughes
Medical Institute (to JGG and J-LC), Institut National de la Santé et de la Recherche Médicale, the St. Giles
Foundation, and the Candidoser Association, R01AI088364, R37AI095983, P01AI061093, U01AI109697 (to J-
LC), U01AI088685 to J-LC and LA, R21AI107508 (to E. Jouanguy), DHFMR Collaborative Research Grant and
KACST 13-BIO1113-20 (to FSA).
Greater Middle Eastern Variome Consortium
Sohair Abdel Rahim, Sawsan Abdel-Hadi, Ghada Abdel-Salam, Ekram Abdel-Salam, Mohammed Abdou, Avinash
Abhytankar, Parisa Adimi, Jamil Ahmad, Mustafa Akcakus, Guside Aksu, Sami Al Hajjar, Suliman Al Juamaah,
Saleh Al Muhsen, Nouriya Al Sannaa, Salem Al Tameni, Jumana Al-Aama, Nasir Al-Allawi, Raidah Al-Baradie,
Lihadh Al-Gazali, Amal Al-Hashem, Waleed Al-Herz, Deema Al-Jeaid, Asma Al-Tawari, Abdullah Alangari,
Alexandre Alcais, Tariq S AlFawaz, Zobaida Alsum, Aomar Ammar-Khodja, Sepideh Amouian, Cigdem Arikan,
Omid Aryani, Ayca Aslanger, Cigdem Aydogmus, Caner Aytekin, Matloob Azam, Boglarka Bansagi, Mohamed-
Rhida Barbouche, Laila Bastaki, Tawfeg Ben-Omran, PS Bindu, Lizbeth Blancas, Stéphanie Boisson-Dupuis,
Damien Bonnet, Omar Boudghene Stambouli, Aziz Bousfiha, Lobna Boussafara, Jeannette Boutros, Jacinta
Bustamante, Huseyin Caksen, Yildiz Camcioglu, Emilie Catherinot, Fatma C Celik, Michael Ciancanelli, Funda E
Cipe, Gary Clark, Aurélie Cobat, Sinan Comu, Angela Condie, Antonio Condino-Neto, Mukesh Desai, William
Dobyns, Figen Dogu, Mohamed Domaia, Meltem Dorum, Odul Egritas, Safa El Azbaoui, Jamila El Baghdadi,
Mona El Ruby, Ashraf El-Harouni, Reem A Elfeky, Gehad Elghazali, Eissa Faqeih, Elif Fenerci, Claire Fieschi,
Cipe Funda, Iman Gamal, Umit Gelik, Fetah Genel, Alper Gezdirici, KM Girisha, Amy Goldstein, Padraic Grattan-
Smith, Neerja Gupta, Jin Hahn, Nevin Hatipoglu, Raoul Hennekam, Massoud Houshmand, Philippe Ichai, Aydan
Ikinciogullari, Samira Ismail, Chaim Jalas, Emmanuelle Jouanguy, Madhulika Kabra, Göknur Kalkan, Majdi Kara,
Neslihan Karaca, Kadri Karaer, Ariana Kariminejad, Hulya Kayserili, Melike Keser-Emiroglu, Sara S Kilic, Najib
Kissani, Cristina Kokron, Roshan Koul, Necil Kutukculer, Fanny Lanternier, Alireza Mahdaviani, Nizar Malhaoui,
Lobna Mansour, Davood Mansouri, Lucia Margari, Enza Maria Valente, Naima Marzouki, Amira Masri, Amina
Megahed, Hisham Megahed, Najla Mekki, Mehrnaz Mesdaghi, Mohd Mikati, Faezeh Mojahedi, John Mulley,
Sheela Nampoothiri, Carmen Navarrete, Tarek Omar, Azza Oraby, Ayse Pandaluz, Nima Parvaneh, Turkan
Patiroglu, Zeynep Peker Koc, Isabelle Pellier, Capucine Picard, Anne Puel, Annick Raas-Rothschild, Anna Rajab,
Didier Raoult, Ismail Reisli, Nima Rezaei, Ayoub Sabri, Yasin Sahin, Laila Saleem, Fadia Salem, Najla Sameer
AlSediq, Ozden Sanal, Terry Sanger, Hanan Shakankiry, Lei Shang, Nabil Shehata, Nuri Shembesh, Vared Shkalim,
Ameen Softah, Sameera Sogaty, Neveen Soliman, Fatma Sonmez-Aunaci, Laszlo Sztriha, Lynda Taibi-Berrah,
Samia Temtamy, Hasan Tonekaboni, Doris Trauner, Beyhan Tuysuz, Beyhan Tuysuz, Ali Varan, Guillaume Vogt,
Christopher Walsh, Geoffrey Woods, Gozde Yesil, Alisan Yildiran, Basak Yildiz, Adnan Yuksel, Maha Zaki, Shen-
Ying Zhang
References
1. Anwar WA, Khyatti M, Hemminki K. Consanguinity and genetic diseases in North Africa and
immigrants to Europe. Eur J Public Health. 2014; 24(Suppl 1):57–63. [PubMed: 25107999]
2. Al-Gazali L, Hamamy H, Al-Arrayad S. Genetic disorders in the Arab world. British Med J. 2006;
333:831–4.
3. Hussain R, Bittles AH. The prevalence and demographic characteristics of consanguineous
marriages in Pakistan. J Biosoc Sci. 1998; 30:261–75. [PubMed: 9746828]
4. Sheffield VC, Stone EM, Carmi R. Use of isolated inbred human populations for identification of
disease genes. Trends Genet. 1998; 14:391–6. [PubMed: 9820027]
5. Sharp, JM. The Broader Middle East and North Africa Initiative: An overview. CRS Report for
Congress; 2005.
6. Hellenthal G, et al. A genetic atlas of human admixture history. Science. 2014; 343:747–51.
[PubMed: 24531965]
7. Ravindranath V, et al. Regional research priorities in brain and nervous system disorders. Nature.
2015; 527:S198–206. [PubMed: 26580328]
8. Hunter-Zinck H, et al. Population genetic structure of the people of Qatar. Am J Hum Genet. 2010;
87:17–25. [PubMed: 20579625]
9. Consortium GP, et al. An integrated map of genetic variation from 1,092 human genomes. Nature.
2012; 491:56–65. [PubMed: 23128226]
10. Moreno-Estrada A, et al. Reconstructing the population genetic history of the Caribbean. PLoS
Genet. 2013; 9:e1003925. [PubMed: 24244192]
11. Botigue LR, et al. Gene flow from North Africa contributes to differential human genetic diversity
in southern Europe. Proc Natl Acad Sci U S A. 2013; 110:11791–6. [PubMed: 23733930]
12. Li JZ, et al. Worldwide human relationships inferred from genome-wide patterns of variation.
Science. 2008; 319:1100–4. [PubMed: 18292342]
13. Henn BM, et al. Genomic ancestry of North Africans supports back-to-Africa migrations. PLoS
Genet. 2012; 8:e1002397. [PubMed: 22253600]
14. Gerard N, Berriche S, Aouizerate A, Dieterlen F, Lucotte G. North African Berber and Arab
influences in the western Mediterranean revealed by Y-chromosome DNA haplotypes. Hum Biol.
2006; 78:307–16. [PubMed: 17216803]
15. Green RE, et al. A draft sequence of the Neandertal genome. Science. 2010; 328:710–22.
[PubMed: 20448178]
16. Sankararaman S, et al. The genomic landscape of Neanderthal ancestry in present-day humans.
Nature. 2014; 507:354–7. [PubMed: 24476815]
17. Consortium STD, et al. Sequence variants in SLC16A11 are a common risk factor for type 2
diabetes in Mexico. Nature. 2014; 506:97–101. [PubMed: 24390345]
18. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele
frequency data. PLoS Genet. 2012; 8:e1002967. [PubMed: 23166502]
19. Tadmouri GO, et al. Consanguinity and reproductive health among Arabs. Reprod Health. 2009;
6:17. [PubMed: 19811666]
20. Leutenegger AL, Sahbatou M, Gazal S, Cann H, Genin E. Consanguinity around the world: what
do the genomic data of the HGDP-CEPH diversity panel tell us? Eur J Hum Genet. 2011; 19:583–
7. [PubMed: 21364699]
21. Bittles, AH.; Black, ML. Global patterns and tables of consanguinity. 2014.
22. Pippucci T, Magi A, Gialluisi A, Romeo G. Detection of runs of homozygosity from whole exome
sequencing data: state of the art and perspectives for clinical, population and epidemiological
studies. Hum Hered. 2014; 77:63–72. [PubMed: 25060270]
23. Pemberton TJ, et al. Genomic patterns of homozygosity in worldwide human populations. Am J
Hum Genet. 2012; 91:275–92. [PubMed: 22883143]
24. Szpiech ZA, et al. Long runs of homozygosity are enriched for deleterious variation. Am J Hum
Genet. 2013; 93:90–102. [PubMed: 23746547]
25. MacArthur DG, et al. A systematic survey of loss-of-function variants in human protein-coding
genes. Science. 2012; 335:823–8. [PubMed: 22344438]
26. Sulem P, et al. Identification of a large set of rare complete human knockouts. Nat Genet. 2015;
47:448–52. [PubMed: 25807282]
27. Jones, S. The Darwin Archipelago. Yale University Press; New Haven: 2011.
28. Haldane JBS. The effect of variation of fitness. Am Nat. 1937; 71:337–349.
29. Overall AD, Ahmad M, Nichols RA. The effect of reproductive compensation on recessive
disorders within consanguineous human populations. Heredity. 2002; 88:474–9. [PubMed:
12180090]
30. Neale BM, et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders.
Nature. 2012; 485:242–5. [PubMed: 22495311]
31. Simons YB, Turchin MC, Pritchard JK, Sella G. The deleterious mutation load is insensitive to
recent population history. Nat Genet. 2014; 46:220–4. [PubMed: 24509481]
32. Novarino G, et al. Exome sequencing links corticospinal motor neuron disease to common
neurodegenerative disorders. Science. 2014; 343:506–11. [PubMed: 24482476]
33. Blackstone C, O’Kane CJ, Reid E. Hereditary spastic paraplegias: membrane traffic and the motor
pathway. Nat Rev Neurosci. 2011; 12:31–42. [PubMed: 21139634]
34. MacArthur DG, et al. Guidelines for investigating causality of sequence variants in human disease.
Nature. 2014; 508:469–76. [PubMed: 24759409]
35. Dixon-Salazar TJ, et al. Exome sequencing can improve diagnosis and alter patient management.
Sci Transl Med. 2012; 4:138ra78.
36. Okada S, et al. IMMUNODEFICIENCIES. Impairment of immunity to Candida and
Mycobacterium in humans with bi-allelic RORC mutations. Science. 2015; 349:606–13. [PubMed:
26160376]
37. Alsalem AB, Halees AS, Anazi S, Alshamekh S, Alkuraya FS. Autozygome sequencing expands
the horizon of human knockout research and provides novel insights into human phenotypic
variation. PLoS Genet. 2013; 9:e1004030. [PubMed: 24367280]
38. DePristo MA, et al. A framework for variation discovery and genotyping using next-generation
DNA sequencing data. Nat Genet. 2011; 43:491–8. [PubMed: 21478889]
39. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform.
Bioinformatics. 2010; 26:589–95. [PubMed: 20080505]
40. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS Genet. 2006;
2:e190. [PubMed: 17194218]
41. Manichaikul A, et al. Robust relationship inference in genome-wide association studies.
Bioinformatics. 2010; 26:2867–73. [PubMed: 20926424]
42. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features.
Bioinformatics. 2010; 26:841–2. [PubMed: 20110278]
43. Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage
analyses. Am J Hum Genet. 2007; 81:559–75. [PubMed: 17701901]
44. Cann HM, et al. A human genome diversity cell line panel. Science. 2002; 296:261–2. [PubMed:
11954565]
http://consang.net
45. Behar DM, et al. The genome-wide structure of the Jewish people. Nature. 2010; 466:238–42.
[PubMed: 20531471]
46. Danecek P, et al. The variant call format and VCFtools. Bioinformatics. 2011; 27:2156–8.
[PubMed: 21653522]
47. Pruitt KD, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;
42:D756–63. [PubMed: 24259432]
48. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated
individuals. Genome Res. 2009; 19:1655–64. [PubMed: 19648217]
49. Price AL, et al. Principal components analysis corrects for stratification in genome-wide
association studies. Nat Genet. 2006; 38:904–9. [PubMed: 16862161]
50. Wickham, H. ggplot2: Elegant graphics for data analysis. Springer Science & Business Media;
2009.
51. Polasek O, et al. Comparative assessment of methods for estimating individual genome-wide
homozygosity-by-descent from human genomic data. BMC Genomics. 2010; 11:139. [PubMed:
20184767]
52. Magi A, et al. H3M2: detection of runs of homozygosity from whole-exome sequencing data.
Bioinformatics. 2014; 30:2852–9. [PubMed: 24966365]
53. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-
throughput sequencing data. Nucleic Acids Res. 2010; 38:e164. [PubMed: 20601685]
54. Adzhubei IA, et al. A method and server for predicting damaging missense mutations. Nat
Methods. 2010; 7:248–9. [PubMed: 20354512]
55. Cingolani P, et al. A program for annotating and predicting the effects of single nucleotide
polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2;
iso-3. Fly (Austin). 2012; 6:80–92. [PubMed: 22728672]
56. Davydov EV, et al. Identifying a high fraction of the human genome to be under selective
constraint using GERP++. PLoS Comput Biol. 2010; 6:e1001025. [PubMed: 21152010]
57. Erichsen AK, Koht J, Stray-Pedersen A, Abdelnoor M, Tallaksen CM. Prevalence of hereditary
ataxia and spastic paraplegia in southeast Norway: a population-based study. Brain. 2009;
132:1577–88. [PubMed: 19339254]
58. Stevanin G, et al. Mutations in SPG11 are frequent in autosomal recessive spastic paraplegia with
thin corpus callosum, cognitive decline and lower motor neuron degeneration. Brain. 2008;
131:772–84. [PubMed: 18079167]
59. Vardi-Saliternik R, Friedlander Y, Cohen T. Consanguinity in a population sample of Israeli
Muslim Arabs, Christian Arabs and Druze. Ann Hum Biol. 2002; 29:422–31. [PubMed:
12160475]
60. Shami SA, Qaisar R, Bittles AH. Consanguinity and adult morbidity in Pakistan. Lancet. 1991;
338:954. [PubMed: 1681304]
61. Stoltenberg C, Magnus P, Lie RT, Daltveit AK, Irgens LM. Birth defects and parental
consanguinity in Norway. Am J Epidemiol. 1997; 145:439–48. [PubMed: 9048518]
62. Do R, et al. No evidence that selection has been less effective at removing deleterious mutations in
Europeans than in Africans. Nat Genet. 2015; 47:126–31. [PubMed: 25581429]
63. Consortium STD, et al. Association of a low-frequency variant in HNF1A with type 2 diabetes in a
Latino population. JAMA. 2014; 311:2305–14. [PubMed: 24915262]
64. Meyer M, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science.
2012; 338:222–6. [PubMed: 22936568]
65. Huerta-Sanchez E, et al. Altitude adaptation in Tibetans caused by introgression of Denisovan-like
DNA. Nature. 2014; 512:194–7. [PubMed: 25043035]
66. Wang S, Lachance J, Tishkoff SA, Hey J, Xing J. Apparent variation in Neanderthal admixture
among African populations is consistent with gene flow from Non-African populations. Genome
Biol Evol. 2013; 5:2075–81. [PubMed: 24162011]
67. Lowery RK, et al. Neanderthal and Denisova genetic affinities with contemporary humans:
introgression versus common ancestral polymorphisms. Gene. 2013; 530:83–94. [PubMed:
23872234]
Figure 1. Greater Middle East Variome as a hub of human genetics
a. Map of GME sub-regions. Lines define borders for admixture analysis from East Asia,
Europe, Sub-Saharan Africa and the novel GME contribution (NWA: Northwest Africa,
NEA: Northeast Africa, TP: Turkish Peninsula, SD: Syrian Desert, AP: Arabian Peninsula,
PP: Persia and Pakistan). Pie charts: admixture proportions of 1000 Genomes Project
(1000G) continental populations according to K=6 clusters.
b. Global ancestry proportions (K=6) for 1000G control populations with three distinct
sources of contribution. 1000G population contributions: Africa (red), Europe (green) and
East Asia (blue). GME populations from west to east: NWA (purple), AP (orange), and PP
(yellow) derived from the GME.
c. TreeMix phylogeny of GME along with 1000G controls representing population
divergence patterns. Length of the branch proportional to population drift. GME populations
grouped around the African branch, but showed a substantial divergence. YRI: Yoruba in
Ibadan, LWK: Luhya in Webuye Kenya, FIN: Finnish, GBR: Great Britain, TSI: Toscani,
CHS: Southern Han Chinese, CHB: Han Chinese in Beijing, JPT: Japanese in Tokyo.
d. Wright’s Fixation Index (Fst) values for all pairs of GME and 1000G European
populations, showing a smaller distance between GME and European populations compared
with Sub-Saharan African populations. Greatest Fst value between any two GME
populations was 0.026 (i.e. a quarter of the distance between FIN and JPT).
Figure 2. Wide diversity and high inbreeding coefficients in GME substructure
a. Principal component analysis (PCA) for individuals from GME and 1000G populations.
Individuals projected along PC3 and PC4 axes. Persia and Pakistan (PP), Northwest Africa
(NWA) and Europe defined the limits from right, left, and top, as coinciding with geography.
Arab Peninsula (AP) defined the bottom limit, and was closest to Northeast Africa (NEA)
and Syrian Desert (SD).
b. GME populations had increased rates of linkage disequilibrium decay compared to 1000G
European and East Asian populations. Mean variant correlations (r2) shown for each 1,000
basepair (bp) bin from 1,000–70,000 bp.
c. Inbreeding coefficient (F) distributions for GME and 1000G populations. GME
populations (purple) showed elevated F values, consistent with increased rates of
consanguineous marriages. Box plots show median (horizontal line), 25%ile (45° angle),
75%ile (90° angle), minimum and maximum observations (whiskers).
d. F distributions for family structures for GME and European American (EA) trios. Mean F
values correlated with expected for consanguineous offspring. Unk=unknown.
Figure 3. Distributions of short and long Runs of Homozygosity (ROH) correlate with patterns
of bottlenecks and recent consanguinity
a. Sample burdens of ROH grouped by length (Short: <0.155 Mb, Medium: 0.156–1.606
Mb, Long: >1.607 Mb). GME samples (purple) showed a unique contribution of long ROH
compared with other populations (*), with less in short and medium bins compared to
Europe and East Asia. Total ROH in GME sub-regions overlapped with that of European and East
Asian populations, likely due to greater bottlenecks in the latter.
b. Histograms of long ROH for GME, Africa, Europe, and East Asia. GME samples more
frequently harbored runs >4 Mb compared to other populations. ROH >15 Mb are binned
together (* peak unique to Middle East).
c. Longer GME ROH spans were enriched for rare variation, while shorter runs were
enriched for more common variation. Proportion of variants binned by allele frequency for
different sized ROH, binned by 0.5 Mb intervals. Probability density function calculated for
each allele frequency class. Note that AFs for common alleles declined whereas AFs for rare
and very rare alleles rose as ROH increased in size (Common: AF > .05, Rare: AF 0.05–
0.01, Very Rare: AF < 0.01).
Figure 4. GME Variome facilitates the discovery of Mendelian disease genes
a–b. Comparison of rare derived allele frequencies (DAF) between GME and Exome
Sequencing Project (ESP). AA: African American, EA: European-American. Hexagonal
bins shaded by log number of variants within each bin. Pearson’s r suggests GME DAFs
were not accurately estimated by AA or EA populations.
b. The majority of variants in the rarest DAF bins were unique to the GME. AA: found only
in GME and AA. EA: found only in GME and EA. All: found in GME, EA and AA. GME
Unique: found only in GME.
c. Change in per-individual burden of eight variant classes as a function of increasing the
number of individuals incorporated into the GME Variome cohort. As sample size increased
there was a drop in the number of unique variants, along with more accurate estimation of
DAFs for rare variants. Bootstraps were sampled with replacement for 100 iterations to
calculate standard errors. “High impact”: variants meeting predicted deleteriousness
thresholds (see Methods).
d. Number of candidate variants for 20 families, meeting segregation and deleteriousness
filtering criteria, using DAFs derived from Hereditary Spastic Paraplegia (HSP)-only
families (top) or also incorporating the GME Variome (bottom). Single, Duo, Trio: families
with one, two or three affected members. Colors: number of individuals sharing the variant.
“0”: no other individuals carried the allele, etc. Analysis was performed using this threshold
for the number of individuals sharing alleles (0,1,2,3). Note drop in number of segregating
variants for any given family after the GME Variome was applied.
40246_2018_Article_152
PRIMARY RESEARCH Open Access
Molecular characterization of exonic
rearrangements and frame shifts in the
dystrophin gene in Duchenne muscular
dystrophy patients in a Saudi community
Nasser A. Elhawary1,2*, Essam H. Jiffri3, Samira Jambi4, Ahmad H. Mufti1, Anas Dannoun1, Hassan Kordi1,
Asim Khogeer5, Osama H. Jiffri3, Abdelrahman N. Elhawary6 and Mohammed T. Tayeb1
Abstract
Background: In individuals with Duchenne muscular dystrophy (DMD), exon skipping treatments that
restore a wild-type phenotype or correct the frame shift of the mRNA transcript of the dystrophin
(DMD) gene are mutation-specific. To characterize DMD rearrangements at the molecular level and
predict the reading frame, we simultaneously screened all 79 DMD gene exons of 45 unrelated male
DMD patients using a multiplex ligation-dependent probe amplification (MLPA) assay for
deletion/duplication patterns. Multiplex PCR was used to confirm
single deletions detected by the MLPA.
Results: There was a clear diagnostic delay, with a highly significant difference between the
age at initial symptoms and the age at clinical evaluation of DMD cases (t value, 10.3; 95% confidence interval 5.95–
8.80, P < 0.0001); the mean difference between the two ages was 7.4 years. Overall, we identified 147 intragenic
rearrangements: 46.3% deletions and 53.7% duplications. Most of the deletions (92.5%) were between exons 44 and
56, with exon 50 being the most frequently involved (19.1%). Eight new rearrangements, including a mixed
deletion/duplication and double duplications, were linked to seven cases with DMD. Of all the cases, 17.8% had
duplications with no hot spots. In addition, confirmation of the reading frame hypothesis helped account for new
DMD rearrangements in this study. We found that 81% of our Saudi patients would potentially benefit from exon
skipping, of which 42.9% had a mutation amenable to skipping of exon 51.
Conclusions: Our study could generate considerable data on mutational rearrangements that may promote future
experimental therapies in Saudi Arabia.
Keywords: Duchenne muscular dystrophy, Dystrophin gene, Large rearrangements, Frame shift, MLPA, Saudi
community
Background
Dystrophinopathies are the most common form of muscu-
lar dystrophy in childhood. They are caused by mutations
in the dystrophin gene (DMD; OMIM #300377) [1, 2]. Du-
chenne muscular dystrophy (DMD; OMIM #310200) is a
severe form of muscular dystrophy, with an incidence of 1
in 3600–5000 male births [3]. Becker muscular dystrophy
(BMD) is a milder form of DMD, with an incidence of 1
in 20,000 male births (BMD; OMIM # 300376) [4].
DMD is characterized by rapidly progressive degener-
ation and necrosis of the proximal muscles and calf
pseudo-hypertrophy. Most DMD patients show muscle
weakness at age 2 or 3, but it may be seen as early as in-
fancy. Patients commonly lose independent ambulation
by the age of 12 and die of dilated cardiomyopathy
around the second or third decade. In comparison, pa-
tients with BMD exhibit relatively minor pathological
* Correspondence: naelhawary@uqu.edu.sa
1Department of Medical Genetics, Medicine College, Umm Al-Qura
University, P.O. Box 57543, Mecca 21955, Saudi Arabia
2Department of Molecular Genetics, Faculty of Medicine, Ain Shams
University, Cairo 11566, Egypt
Full list of author information is available at the end of the article
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Elhawary et al. Human Genomics (2018) 12:18
https://doi.org/10.1186/s40246-018-0152-8
symptoms, slower progression, later onset, and longer
survival. Patients with an intermediate form of the dis-
ease, intermediate muscular dystrophy (IMD), may con-
tinue to walk until they are 16 years of age [4, 5].
The DMD gene is one of the largest known genes in
humans, with 79 exons (approximately 2.4 Mb of gen-
omic DNA) [1] expressing a 427-kDa muscular protein
that plays a fundamental role in stabilizing the sarco-
lemma. It does so by using a complex of glycoproteins
associated with dystrophin to link actin filaments within
the cytoskeleton and the extracellular matrix. Lack of
dystrophin breaks these connections, altering the plasma
membrane and finally producing myofiber degeneration
and necrosis [6]. Thus, according to the reading frame
hypothesis [7], DMD mutations that destroy the reading
frame result in a truncated, non-functional dystrophin
protein associated with a “DMD” phenotype. These mu-
tations frequently generate a premature stop codon that
activates nonsense-mediated mRNA decay [8]. On the
other hand, mutations that maintain the reading frame
can permit semi-functional dystrophin protein and thus
give rise to a “BMD” phenotype [4, 9]. Together, these
two phenotype-genotype correlations explain more than
92% of all cases [7].
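The reading frame rule described above reduces to a divisibility check: a deletion or duplication whose coding length is a multiple of three keeps the codon phase, and any other length shifts it. The following is a minimal Python sketch under that assumption only; the function name is illustrative, the sizes are those listed for three deletions in Table 1 of this article, and the check ignores the known exceptions to the rule.

```python
def predict_frame(coding_bases_changed: int) -> str:
    """Apply the reading frame rule to a deletion or duplication.

    A change whose length is a multiple of 3 keeps the codon phase
    (in-frame, expected BMD-like); any other length shifts it
    (out-of-frame, expected DMD-like).
    """
    if coding_bases_changed <= 0:
        raise ValueError("size must be a positive number of bases")
    return "in-frame" if coding_bases_changed % 3 == 0 else "out-of-frame"


# Sizes in bases as listed in Table 1 (illustrative selection).
examples = {"del 48": 186, "del 45": 176, "del 50": 109}
predictions = {name: predict_frame(size) for name, size in examples.items()}
print(predictions)
```

In this sketch the deletion of exon 48 is predicted in-frame, matching the BMD phenotype reported for that case, whereas the deletions of exon 45 or exon 50 are predicted out-of-frame.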
Different types of mutations have been reported in pa-
tients with DMD and BMD. These are mainly large rear-
rangements (deletions in approximately 60–70% of
patients and duplications in approximately 7–10%), with
the remaining being point mutations (mainly nonsense
mutations) and small deletions or insertions [10, 11].
Most gross deletions can be detected by multiplex PCR
(mPCR) [12, 13] and are clustered in the proximal and
central hot spot regions [14, 15]. Although a large pro-
portion of the duplications were reported many years
ago [16, 17], most laboratories do not systematically
screen for these rearrangements. Duplication analysis
and the determination of at-risk carrier status of the
DMD gene require quantitative investigation, which is
laborious and technically demanding [18, 19]. Previous
studies have applied Southern blotting [16, 20], pulsed-
field gel electrophoresis, quantitative mPCR [21–23],
multiplex amplifiable probe hybridization [24], and com-
parative genomic hybridization microarray [25].
Given that deletions and duplications of one or more
exons are found in the majority (70%) of patients, it is
most cost-efficient and labor-efficient to check for these
mutations first. A reliable and rapid technique, multiplex
ligation-dependent probe amplification (MLPA), has been
applied to cover the whole DMD gene to detect deletions
and duplications and to identify exactly which exons are
involved in deletions or duplications [18, 26–29]. This ap-
proach reveals whether a given exon is present and allows
the copy number of each exon to be calculated by com-
paring relative peak heights. MLPA can detect both
deletions and duplications in patients as well as in female
carriers. Compared to array comparative genomic
hybridization, MLPA is a low-cost and technically uncom-
plicated method.
Although several studies have investigated exonic dele-
tions in different populations [11, 30–36], it is unknown
where these deletions occur in the Saudi population. Al-
though a few studies have described the molecular diag-
nosis of DMD in Saudi patients, the large deletions
associated with disease were examined in only some
exons, and the studies were limited by small sample
sizes [37, 38]. Molecular characterization with an MLPA
strategy has been proposed to detect large intragenic
rearrangements across all exons of the DMD gene, a
strategy that covers nearly 75% of all mutations in the
gene. Thus, accurate molecular diagnosis may provide
information on eligibility for mutation-specific
treatments. A plausible frame shift hypothesis
suggests how one might reduce disease severity via exon
skipping, for example, by correcting the fidelity of the
translational reading frame with large DMD deletions or
restoring the wild type with large DMD duplications.
To our knowledge, the present study is the first study
using the MLPA strategy to identify genotype-phenotype
correlations in DMD patients in a Saudi community.
Our results will add valuable data on de novo mutations
in this population and to the databases of different
DMD web pages as well.
Methods
Ethics statement and participants
All participants were enrolled under a protocol approved
by the Institutional Biomedical Ethics Committee at
Umm Al-Qura University (ref. #HAPO-02-K-012). Par-
ents of all participants gave written consent after being
informed about the aim of the study.
The study included 45 unrelated male patients with
DMD selected from 65 families from the western region
of the Kingdom of Saudi Arabia (KSA), including Jeddah,
Mecca, Taif, and Hada. Twenty additional eligible male pa-
tients did not enroll because their parents refused to share
their clinical data, their clinical profiles were incomplete,
or their creatine phosphokinase assessments were missing.
For each patient included in the study, a clinical data sheet
was recorded in the database of the Molecular Genetics
Laboratory in the Department of Medical Genetics at
Umm Al-Qura University. Clinical information was inde-
pendent of any molecular DNA data for the DMD gene or
its protein. Dystrophin probands were diagnosed by clin-
ical geneticists or pediatricians based on strict criteria in-
cluding a clinical presentation expected for DMD, family
history of X-linked muscular dystrophy, or muscle biopsy
with a dystrophin analysis performed using immunohisto-
chemistry. Clinical diagnosis of dystrophin probands
included age at onset, age at clinical evaluation, calf pseu-
dohypertrophy, age at wheelchair confinement, cardiac
function, and motor function. A histopathological study
was performed before molecular DNA analysis if muscle
biopsies were available. To avoid bias, we included only
one case for each family. We categorized patients accord-
ing to age at loss of ambulation: DMD ≤ 12 years, IMD
12–16 years, and BMD > 16 years. Cases with a family his-
tory of autosomal recessive inheritance or with normal
dystrophin protein were excluded.
DNA isolation
Genomic DNA was isolated from buccal cells using the
Oragene DNA-OGR-575 kit (DNA Genotek Inc.,
Ottawa, ON, Canada) according to the manufacturer’s
protocol with some modifications. Briefly, the full buccal
cells were collected within 30 min, and the Oragene tube
was capped immediately. The cells were incubated with
the OGR-lysis buffer in a water bath at 53 °C to release
the DNA, which was then precipitated by ethanol and
dissolved in elution buffer [39].
Multiplex polymerase chain reaction
The genomic DNA of all DMD patients was subjected to
multiplex PCR (mPCR) to screen for DMD deletions
using 15 primer sets (Additional file 1: Table S1). The ol-
igonucleotides included flanking sequences of exons 4,
8, 12, 17, 19, 44, 45, 48, and 51 [12] and of exons 6, 13,
47, 50, 52, and 60 [13]. We modified Chamberlain’s
mPCR set by omitting dimethylsulfoxide, which could
result in a lower PCR yield. PCR cycling was programmed
as an initial denaturation at 95 °C for 6 min (1 round),
followed by 23 rounds of denaturation at 94 °C for 30 s,
annealing at 53 °C for 30 s, and elongation at 65 °C for
4 min, with a final elongation at 65 °C for 7 min [12].
Hot-start mPCR was performed using Beggs’ PCR pro-
gram [13]: 95 °C for 6 min (1 round) and 25 subsequent
cycles including DNA denaturing at 95 °C for 30 s, an-
nealing at 56 °C for 1 min, and elongation at 68 °C for
4 min. Amplification reactions were carried out on ther-
mal cycler Engine Dyad (Bio-Rad Laboratories Inc.,
Hercules, CA). PCR products (10–15 μl) were separated
on 3% NuSieve agarose (BMA Bioproducts, Rockland,
ME). The gels were viewed using the Gel Documenta-
tion and Analysis System (G-Box, SynGene, Frederick,
MD, USA).
Multiplex ligation-dependent probe amplification
We analyzed all DMD cases for large deletions and large
duplications using MLPA SALSA P034/P035 DMD kits
(http://www.mrc-holland.com) following the manufac-
turer’s instructions. In brief, denaturation, hybridization,
ligation, and amplification steps were performed on a
DNA Engine Dyad thermal cycler (Bio-Rad Laboratories
Inc., Hercules, CA). Finally, PCR amplification was per-
formed using SALSA MLPA PCR primers labeled with
the FAM dye. A mixture of 0.7 μl of PCR product, 0.2 μl
of 600 LIZ GS size-standard, and 9.0 μl of Hi-Di form-
amide was incubated for 3 min at 86 °C and cooled at
4 °C for 2 min. The MLPA product mix was separated
on a POP7 polymer (Applied Biosystems Inc., Life
Technologies, Foster City, CA) at 60 °C with settings of
1.6 kV for injection voltage, 18 s for injection time, 15 kV
for run voltage, and 1800 s for run time.
Data analysis
The raw data were analyzed using GeneMapper Software
5 (Applied Biosystems Inc., Life Technologies, Foster
City, CA). DNA from cases with single-exon deletions
was re-examined using conventional PCR. Initial analysis
was performed by visual inspection to look for
missing exon-specific peaks. For the remaining samples,
the peak height of each exon was divided by the two
nearest control peaks. The median ratio across all sam-
ples for each peak was calculated and used as a reference
for one copy. For the sake of accuracy, any normalized ra-
tio below 0.3 was considered a possible deletion. A dupli-
cation was considered if a normalized ratio was 1.8–2.0. If
any single-exon deletion was identified, conventional PCR
amplification was carried out to validate this deletion
using primer sets and PCR conditions given in the Leiden
Muscular Dystrophy pages (http://www.dmd.nl).
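The normalization and calling steps described above can be sketched as follows. This is an illustrative Python outline, not the GeneMapper workflow used in the study: it assumes that "divided by the two nearest control peaks" means division by their mean, the function and variable names are hypothetical, and the peak heights in the usage note are invented. Only the thresholds (below 0.3 for a possible deletion, 1.8 to 2.0 for a duplication) come from the text.

```python
from statistics import mean, median

DELETION_MAX = 0.3               # normalized ratio below this: possible deletion
DUPLICATION_RANGE = (1.8, 2.0)   # normalized ratio in this range: duplication


def normalized_ratios(exon_peaks, control_peaks):
    """Normalize one exon probe across samples.

    exon_peaks:    {sample: peak height of the exon probe}
    control_peaks: {sample: (height of control 1, height of control 2)}
    Each height is divided by the mean of its two control peaks (an
    assumption), then by the median ratio across samples, which serves
    as the one-copy reference.
    """
    raw = {s: h / mean(control_peaks[s]) for s, h in exon_peaks.items()}
    reference = median(raw.values())
    return {s: r / reference for s, r in raw.items()}


def call(ratio):
    """Classify a normalized ratio with the thresholds given in the text."""
    if ratio < DELETION_MAX:
        return "possible deletion"
    if DUPLICATION_RANGE[0] <= ratio <= DUPLICATION_RANGE[1]:
        return "duplication"
    return "no call"
```

For example, with invented heights `{"A": 100, "B": 105, "C": 0, "D": 200, "E": 95}` and control peaks of 100 for every sample, sample C would be flagged as a possible deletion and sample D as a duplication. Because the patients are hemizygous males, one copy corresponds to a ratio near 1.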
Databases and confirming mutations for the DMD gene
We checked all mutations recorded in this study accord-
ing to available databases established by the Leiden Mus-
cular Dystrophy pages (http://www.dmd.nl) [40], the
Leiden Open Variation Database 3.0 (http://www.lovd.
nl/3.0/home) [41], and UMD-DMD (http://www.umd.
be/DMD/) [31, 32]. Databases for exon skipping to re-
store the DMD reading frame were found on sites devel-
oped by Leiden University Medical Center (http://www.
exonskipping.nl/?s=exon+skipping&submit=Go) and
CureDuchenne (https://www.cureduchenne.org/cure/
edystrophin/).
Statistical analysis
Hardy-Weinberg equilibrium (HWE) deviation was examined
for X-linked DMD cases in this study using the
Online Encyclopedia for Genetic Epidemiology studies
software (http://www.oege.org/software/hwe-mr-calc.shtml).
We used G*Power software
(http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/download-and-register/)
for power analysis, to determine the sample size needed
to achieve 80% power for a t test under the point-biserial
model. A priori sample size and post hoc power estimates
were computed given our DMD sample size, a significance
level of α = 0.05,
and the effect size index “r” (the absolute value of the
correlation coefficient in the population, 0 < “r” < 1). We
used a paired t test to compare the age at onset with
the age at clinical evaluation for each DMD case. A
two-sided P value less than 0.05 was considered to
indicate statistical significance, and 95% confidence
intervals (CIs) were calculated for all analyses.
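As an illustration of the paired comparison described above, the sketch below computes the mean within-patient difference, its standard error, and the paired t statistic using only the Python standard library. The ages are hypothetical and are not the study data; the critical t value needed for the confidence interval is passed in by the caller (for example from a t table), since the standard library does not provide it.

```python
import math
import statistics


def paired_t(onset_ages, evaluation_ages, t_critical=None):
    """Paired t test on per-patient differences (evaluation minus onset).

    Returns a dict with the mean difference, its standard error, the
    t statistic, the degrees of freedom and, if t_critical is supplied,
    the corresponding confidence interval for the mean difference.
    """
    diffs = [e - o for o, e in zip(onset_ages, evaluation_ages)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    result = {"mean_diff": mean_d, "se": se, "t": mean_d / se, "df": n - 1}
    if t_critical is not None:
        result["ci"] = (mean_d - t_critical * se, mean_d + t_critical * se)
    return result


# Hypothetical ages in years for four patients (illustrative only).
example = paired_t([1, 2, 3, 4], [5, 8, 9, 12])
print(example)
```

The P value is then obtained from the t distribution with n − 1 degrees of freedom, which a statistics package would normally supply.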
Results
Clinical profile
Among 45 unrelated patients, 21 were diagnosed with
DMD, 10 with IMD, and 5 with BMD. The unassigned
patients were defined as not determined (ND), as they
were too young to permit a definitive diagnosis (n = 9).
The median age at onset was 3.5 years (range
1.0–7.0 years), while the median presenting age was 11.5 years
(1.5–20 years) (Fig. 1). Most of the patients reported ini-
tial symptoms between 1 and 3 years of age (71.1%, 32/
45), followed by those reporting symptoms at 4–5 years
(24.4%, 11/45). The age at clinical evaluation was most
frequently between 10 and 12 years (35.6%, 16/45). We
found a highly significant difference between the age
at initial symptoms and the age at clinical evaluation
of DMD cases (t = 10.3; 95% CI 5.95–8.80; P < 0.0001).
The mean difference between the two ages was 7.4 years.
Hardy-Weinberg equilibrium
All affected males were in HWE for the DMD gene
deletions/duplications (χ2 = 1.00, P = 0.317);
heterozygotes are absent under this X-linked recessive
mode of inheritance.
Large-scale rearrangements
Using mPCR, we identified 55 large deletions in the 45
unrelated DMD patients. MLPA detected 147 intragenic
rearrangements, 68 (46.3%) of which were large
deletions and 79 (53.7%) of which were large duplica-
tions. All deletions identified by mPCR were confirmed
by the MLPA-based screening. The utility of MLPA
assay for all exons is clear, as 13 (19%) of 68 deletions
were detected using MLPA but were not detected by
conventional mPCR analysis. The percentage of cases
with deletions and duplications were 46.7% (21/45) and
17.8% (8/45), respectively. Table 1 includes the large re-
arrangements that were identified in the present study
and had been previously described.
New mutations in the DMD gene
We also identified seven previously undescribed large
DMD rearrangements from eight Saudi cases. These
new mutations, based on the DMD databases [31, 32,
40, 41], included one mixed rearrangement (del 45–52 +
dup 21–23), one large deletion (del 45–56), two large
duplications (8–30 and 17–24), and three double dupli-
cations (dup 2–4 + dup 18–19, dup 13 + dup 21–24,
and dup 56–58 + dup 62–64). These large mutations
were from eight (17.8%) of the Saudi patients (Table 2).
Distribution of rearrangements
In this study, deletions did not have a random distribu-
tion. We found that 92.5% (63/68) of hot spot deletions
were linked to exons 44–56 (central region), whereas
7.5% (5/68) of deletions were related to exons 10–20.
Exon 50 was most frequently involved in deletions
(19.1%, 13/68), followed by exons 48 and 49 (each 11.8%,
8/68). The rate of deletions increased from a minimum in
exon 44 to a maximum in exon 50 and then decreased
until the 3′ end of the DMD gene, with no deletion in
exons 57–79 (distal region) (Fig. 2). Moreover, we found
that the number of cases with a deletion of only one
exon was lower than the number with deletions of more
than one exon (9/21, 42.9% versus 12/21, 57.1%). About
half of the deletions (44.4%) were detected only once, in
Fig. 1 The age at onset and the age of clinical evaluation of DMD patients in this study. The analysis of DMD cases showed an apparent
diagnostic delay
agreement with the high allelic heterogeneity of the
DMD gene.
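The per-exon deletion counts reported above can be reproduced by expanding each deleted exon range and tallying exons. The sketch below is an illustrative Python tally, not the authors' analysis script; the ranges are transcribed from the MLPA columns of Tables 1 and 2 of this article (deletion cases only), and the variable names are hypothetical.

```python
from collections import Counter

# (first deleted exon, last deleted exon) for each of the 21 deletion cases.
deletions = [
    (10, 11), (18, 20), (44, 44), (44, 48), (45, 45), (45, 50), (45, 52),
    (47, 50), (47, 50), (48, 48), (49, 50), (49, 50), (50, 50), (50, 50),
    (50, 50), (50, 50), (50, 52), (51, 51), (55, 55),    # Table 1
    (45, 52), (45, 56),                                   # Table 2
]

# Count how many cases delete each exon.
per_exon = Counter(
    exon for first, last in deletions for exon in range(first, last + 1)
)

total_exon_deletions = sum(per_exon.values())
central_region = sum(n for exon, n in per_exon.items() if 44 <= exon <= 56)
single_exon_cases = sum(1 for first, last in deletions if first == last)

print(total_exon_deletions, per_exon[50], per_exon[48], per_exon[49])
print(central_region, single_exon_cases)
```

Under this transcription the tally gives 68 deleted exons in total, 13 involving exon 50, 8 each for exons 48 and 49, 63 within exons 44–56, and 9 single-exon cases, in line with the counts quoted in the text.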
Duplications were distributed in the proximal (68/79,
86.1%), central (5/79, 6.3%), and distal regions (6/79,
7.6%) (Fig. 2). Unlike deletions, duplicated exons were
more frequent in the proximal region (26 duplications)
than in the central and distal regions (13 duplications).
The most frequent duplications were of exons 21, 22,
and 23 (7.6%, 6/79 each), followed by exons 18 and 19
(5.1%, 4/79 each). We did not find any duplications in
exons 31–49 (central region) or exons 69–79 (distal
region) within our cases (Fig. 2). Similar to deletions,
42.5% of exonic duplications (17/40) were observed only
once, revealing a considerable heterogeneity of
duplications.
Reading frame shift and phenotype correlation
Gene rearrangements (deletions and duplications) were
correlated with clinical phenotypes in 28 unrelated cases:
11 (39.3%) with DMD, 7 (25%) with IMD, 3 (10.7%) with
BMD, and 7 (25%) with ND. We also predicted the
translational reading frame in 28 DMD cases with rear-
rangements identified in this study, using the Leiden
Muscular Dystrophy pages (http://www.dmd.nl). Apply-
ing the reading frame rule revealed consistency with the
frame shift rule for 90.9% (10/11) of the individuals with
Table 1 Previously described large rearrangements identified in this study and their reading frame shifts
Family no. | Phenotype | Multiplex PCR | MLPA del/dup | Exon(s) del/dup | Codons del/dup | Frame shift | Amino acid change (a) | cDNA (a)
DS-23 | DMD | No del | Del 10–11 | 371 | 123 2/3 | Stop at 323 | p.His321PhefsX3 | c.961_1331del
DS-1 | DMD | Del 19 | Del 18–20 | 454 | 151 1/3 | −1 | p.Arg723Lys874 | c.2169_2622del
DS-37 | IMD | Del 44 | Del 44 | 148 | 49 1/3 | Stop at 2113 | p.Arg2098AsnfsX16 | c.6291_6438del
DS-34 | ND | Del 44–48 | Del 44–48 | 808 | 269 1/3 | −1 | p.Arg2098Gln2366del | c.6291_7098del
DS-24 | DMD | Del 45 | Del 45 | 176 | 58 2/3 | Stop at 2163 | p.Glu2147AlafsX17 | c.6439_6614del
DS-8 | IMD | Del 45–50 | Del 45–50 | 871 | 290 1/3 | Stop at 2155 | p.Glu2147LeufsX9 | c.6439_7309del
DS-38 | IMD | Del 45–52 | Del 45–52 | 1222 | 407 1/3 | Stop at 2168 | p.Glu2147LeufsX22 | c.6439_7660del
DS-29 | DMD | Del 47–50 | Del 47–50 | 547 | 182 1/3 | Stop at 2263 | p.Val2257LeufsX7 | c.6763_7309del
DS-20 | ND | Del 47–50 | Del 47–50 | 547 | 182 1/3 | Stop at 2263 | p.Val2257LeufsX7 | c.6763_7309del
DS-48 | BMD | Del 48 | Del 48 | 186 | 62 | In-frame | p.Val2305Gln2366del | c.6913_7098del
DS-30 | DMD | Del 50 | Del 49–50 | 211 | 70 1/3 | Stop at 2375 | p.Glu2367LeufsX9 | c.7099_7309del
DS-31 | IMD | Del 50 | Del 49–50 | 211 | 70 1/3 | Stop at 2375 | p.Glu2367LeufsX9 | c.7099_7309del
DS-36 | DMD | Del 50 | Del 50 | 109 | 36 1/3 | Stop at 2409 | p.Arg2401LeufsX9 | c.7201_7309del
DS-32 | IMD | Del 50 | Del 50 | 109 | 36 1/3 | Stop at 2409 | p.Arg2401LeufsX9 | c.7201_7309del
DS-33 | ND | Del 50 | Del 50 | 109 | 36 1/3 | Stop at 2409 | p.Arg2401LeufsX9 | c.7201_7309del
DS-35 | ND | Del 50 | Del 50 | 109 | 36 1/3 | Stop at 2409 | p.Arg2401LeufsX9 | c.7201_7309del
DS-27 | ND | Del 50–52 | Del 50–52 | 460 | 153 1/3 | Stop at 2422 | p.Arg2401LeufsX22 | c.7201_7660del
DS-12 | DMD | Del 51 | Del 51 | 233 | 77 2/3 | Stop at 2469 | p.Ser2437CysfsX33 | c.7310_7542del
DS-18 | DMD | No del | Del 55 | 190 | 63 1/3 | Stop at 2700 | p.Val2677ThrfsX24 | c.8028_8217del
DS-11 | ND | No del | Dup 50–51 | 343 | 114 | In-frame | p.Arg2401Lys2514dup | c.7201-?_7542+?dup
DMD Duchenne muscular dystrophy, IMD intermediate muscular dystrophy, BMD Becker muscular dystrophy, ND not determined
(a) These data are based on the Leiden Muscular Dystrophy Pages (http://www.dmd.nl/) and the UMD-DMD (http://www.umd.be/DMD/)
DMD phenotypes and 100% (7/7) of the individuals with
IMD phenotypes. Likewise, the DMD genes in all cases
with BMD phenotypes had in-frame functional effects
on the DMD protein (cases #DS-48, #DS-50, and #DS-
52) (Tables 1 and 2). All previously described rearrange-
ments we detected gave rise to a stop codon and thus a
truncated protein, except for case #DS-48 with a BMD
phenotype and case #DS-11 with ND, which gave rise to
in-frame predictions (Table 1). Two cases identified in
this study (#DS-53 and #DS-22) reflected both in-frame
and reading frame shift predictions (Fig. 3). The complex
rearrangement of case #DS-53 (del 45–52 + dup 21–23)
was associated with an IMD phenotype, with transla-
tional reading frame predictions with in-frame and
frame shift patterns (Fig. 3). This phenotype may have
occurred from the addition of exons 21–23 to the
mRNA transcript lessening the damaging effect of del
45–52 on the functional protein. On the contrary, case
#DS-22 could not have corrected for the harmful dup
56–58, giving rise to a DMD phenotype (Table 2).
Discussion
The present study used a facile, reliable, and
time-saving MLPA strategy to identify large
rearrangements covering all 79 exons of the DMD gene.
Our results showed prevalences of 46.7% and 17.8%,
respectively, for large deletions and large duplications in
45 Saudi patients with DMD. Unlike the hot spot dele-
tions in exons 44–56 (92.5%), the hot spot deletions near
the 5′ end of the gene were not distinctive, and no large
hot spot duplications were found anywhere along the
DMD gene. The presence of an unusual MLPA pattern
in our Saudi sample, including non-contiguous duplica-
tions as well as contiguous deletions combined with
Table 2 New DMD mutational rearrangements identified in this study and their predicted reading frame shifts
Case no. | Phenotype | Multiplex PCR | MLPA del/dup | Exon(s) del/dup | Codons del/dup | Frame shift | Amino acid change (a)
DS-2 | DMD | No del | dup 2–4 and dup 18–19 | 233; 212 | 77 2/3; 70 2/3 | +2; +2 | p.Val89MetfsX15 (b); p.Lys724Gly795fsX1
DS-14 | DMD | No del | dup 8–30 | 3679 | 1226 1/3 | +1 | p.Val218K1412fsX
DS-15 | ND | No del | dup 8–30 | 3679 | 1226 1/3 | +1 | p.Val218K1412fsX
DS-52 | BMD | No del | dup 13 and dup 21–24 | 120; 654 | 40; 218 | 0; 0 | p.Val495Val535dup; p.Asp875Lys1093dup
DS-50 | BMD | No del | dup 17–24 | 1248 | 428 | 0 | p.Ile665Lys1093dup
DS-53 | IMD | del 45–51 | del 45–52 and dup 21–23 | 1222; 540 | 407 1/3; 180 | −1 (c); 0 | p.Glu2147LeufsX22 (c); p.Asp875Asn1055dup
DS-25 | IMD | del 45–51 | del 45–56 | 1952 | 650 2/3 | −2 | p.Glu2147Ser2798
DS-22 | DMD | No del | dup 56–58 and dup 62–64 | 451; 198 | 150 1/3; 66 | +1; 0 | p.Ser2798Lys2891fsX4; p.Ser3056Asp3122
DMD Duchenne muscular dystrophy, IMD intermediate muscular dystrophy, BMD Becker muscular dystrophy, ND not determined
(a) Theoretical amino acid change based on the database of the Leiden Muscular Dystrophy Pages (http://www.dmd.nl/)
(b) Previously described duplication at cDNA (c.32_265dup) showing amino acid change (p.Val89MetfsX15) resulting in a termination transcript at codon 103
(c) Previously described deletion at cDNA (c.6439_7660del) showing amino acid change (p.Glu2147LeufsX22) resulting in a termination transcript at codon 2168
Fig. 2 The frequency of large mutational rearrangements for each exon of the DMD gene. A region with a high frequency of deletions was
found in exons 44–56. No such region of frequency was detected for large duplications
non-contiguous duplications, suggests complex rear-
rangements. Our findings regarding double, separate du-
plications and complex rearrangements are consistent
with some previous reports in Serbian and South African
patients [18, 42].
Results from MLPA-related studies among different
ethnic populations are conflicting in terms of rates of
large rearrangements within the DMD gene. Studies
have found rates of deletions (and duplications) of
71.8–79.0% (16.4–19.8%) in Chinese [29], 79.5% (6.5%) in
Indian [43], 60% (10.0%) in Japanese [44], 45.5–71.8%
(16.7%) in Korean [35], and 28.2% (20.5%) in Taiwanese [45]
populations. When compared with our sample, a Turk-
ish sample has also been shown to have a relatively
higher rate of deletions within the DMD gene (63.7%)
[46], likely because of admixture with other European
ethnicities. Rates of DMD deletions and duplications in
some other Middle Eastern populations are more similar
to what we found: Egyptian (51.3% deletions) [47], Iran-
ian (51% deletions) [48], Moroccan (51% deletions) [49],
and Syrian (49.0% deletions; 9.8% duplications) [50]. The
majority of the reported DMD gene mutations in our
Saudi data showed translational reading frame shifts
(94.4%), while 5.6% of the mutations did not follow the
reading frame rule. This latter outcome is relatively con-
sistent with the corresponding values in the TREAT-
NMD DMD Global database (7%) [11], the UMD-DMD
database (4%) [32], and the Leiden database (9%) [41].
The overall rate of consanguinity in KSA is 57.7%, ran-
ging from 34 to 80.6% [51], with lower rates in Mecca
Fig. 3 A schematic overview of new complex large rearrangements in the DMD gene. a Case #DS-53, with an unusual mixed rearrangement
(dup 21–23 + del 45–52), leads to an out-of-frame shift giving rise to a severe DMD phenotype. b Case #DS-2, with a double duplication (dup
2–4 + dup 18–19), results in out-of-frame shifts with a DMD phenotype. c Case #DS-52, with two in-frame duplications
(dup 13 + dup 21–24), giving rise to a BMD phenotype. d Case #DS-22, with a double duplication within the mature mRNA comprising an
out-of-frame (dup 56–58) and an in-frame (dup 62–64) mutation, giving rise to a DMD phenotype
(North Western region) than in Riyadh (Central region)
(44.1% versus 62.8%) [51]. This may account for the in-
creased deletions in patients of Riyadh (21/27, 77.8%)
when compared with those in our study [37]. During
Muslim immigration from the Levant and Africa in early
Islamic times, extensive intermarriage reinforced gene
flow at the DMD locus into the Saudi population. This has
likely influenced the prevalence of different Mendelian patterns,
particularly X-linked types, exemplified by the
consistency of data for DMD rearrangements between
our study and a recent Spanish cohort study (46.1%,
131/284 for deletions and 56/284, 19.7% for duplica-
tions) [52].
It is noteworthy that some populations have inherent
reproductive barriers that prevent interbreeding, which
keeps them at native levels without merging (i.e., cryptic
taxa). Other populations may lack inherent reproductive
isolation. Therefore, admixture among different geo-
graphical populations might increase genetic variations
and perhaps create new genotypic combinations within
non-isolated (or non-native) populations [53]. Thus,
genetic variations among Gulf Arabs and some Middle
Eastern individuals (e.g., Berbers in North Africa,
Kurds, Upper Egyptians) [54, 55] should be handled
with caution, as increased consanguinity, extensive re-
productive isolation, and admixture with native source
populations (e.g., Black Africans, South Eastern Asians,
Caucasians) have had substantial roles in gene flow or
founder effects in these populations.
In our study, the analysis of DMD cases showed an ap-
parent diagnostic delay, as 69.8% of our patients showed
their first symptoms at an early age (1–3 years), but 44%
of these patients were 9–12 years old at first clinical
examination. Other countries have also reported long
delays in diagnosis of the disease, with a mean delay be-
tween 1.6 and 2.5 years [56, 57]. In south China, the first
symptoms occurred by 3 years of age, but the age at
clinical evaluation was 6–8 years [36]. Numerous studies
have advocated raising public awareness to identify early
symptoms in DMD patients [47, 57, 58], as parents are
usually the first to notice symptoms, which prompt them
to visit a health professional. To further reduce diagnos-
tic delay, creatine phosphokinase (CPK) testing should
be emphasized in primary care and performed as a rou-
tine test in children’s physical examinations.
Earlier clinical trials reported the safety and biochem-
ical efficacy of intravenous or intramuscular administra-
tion of antisense oligonucleotides (20-30 mer) to bring
hope to DMD patients with large deletions [59]. There-
fore, inducing exon 51 skipping to restore the open
reading frame is an attractive therapeutic strategy that
can be achieved with splice-switching oligomers. After
the US Food and Drug Administration (FDA) acceler-
ated approval of AVI-4658/eteplirsen (Exondys 51;
Sarepta Therapeutics Inc., Cambridge, MA, USA), tar-
geting DMD exon 51 skipping, eteplirsen was approved
and introduced in some countries [60–63]. Eteplirsen is
useful for patients with amenable DMD deletions, end-
ing at exon 50 and starting at exon 52 [64]. To date, ete-
plirsen has not been approved by the Saudi FDA
(https://www.sfda.gov.sa). Numerous other efforts have
used antisense oligomers to induce skipping of exon
53 (SRP4053, PRO053), exon 45 (DS-514b, SRP4045),
and exon 44 (PRO044)
(https://www.clinicaltrials.gov/beta/home) [62, 65].
Based on our data for deletions,
exon skipping could eventually apply to 81% (17/21) of
DMD cases with large deletions. Among our Saudi
patients with DMD gene deletions, the exons whose
skipping would most frequently be applicable were
exons 51 (42.9%, 9/21), 53 (14.3%,
3/21), 44 (9.5%, 2/21), 45 (4.8%, 1/21), 43 (4.8%, 1/21),
and 50 (4.8%, 1/21). Wein et al. have recently reported
the efficiency of exon skipping in the DMD gene, with
each duplicated exon expressing a wild-type, full-length
mRNA [66]. For more than one duplicated exon, several
antisense oligomers can be delivered as a cocktail of
drugs to skip larger regions of the transcript. Thus, for
duplications in exons 45–55, therapeutic skipping can be
applied to more than 60% of all DMD patients [67].
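The exon 51 figure above can be checked against the deletion ranges in Tables 1 and 2 with the simplified criterion stated in the text (a deletion ending at exon 50 or starting at exon 52). The Python sketch below is illustrative only: the function name is hypothetical, the ranges are transcribed from the MLPA columns of the two tables, and the criterion does not itself verify that the reading frame is restored, so it is not a substitute for the exon skipping databases cited in the Methods.

```python
# (first deleted exon, last deleted exon) for the 21 deletion cases,
# transcribed from the MLPA columns of Tables 1 and 2.
deletions = [
    (10, 11), (18, 20), (44, 44), (44, 48), (45, 45), (45, 50), (45, 52),
    (47, 50), (47, 50), (48, 48), (49, 50), (49, 50), (50, 50), (50, 50),
    (50, 50), (50, 50), (50, 52), (51, 51), (55, 55),
    (45, 52), (45, 56),
]


def amenable_to_exon51_skipping(first, last):
    """Simplified criterion from the text: the deletion borders exon 51."""
    return last == 50 or first == 52


amenable = [d for d in deletions if amenable_to_exon51_skipping(*d)]
share = round(100 * len(amenable) / len(deletions), 1)
print(len(amenable), len(deletions), share)
```

With this transcription, 9 of the 21 deletion cases (42.9%) meet the criterion, all of them deletions ending at exon 50.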
Although the power is conventionally utilized for poly-
genic disorders, the power under different monogenic
model of inheritance has not been systematically consid-
ered. This issue could be explained because of wide is-
sues, for example, rate of background variation in
disease-associated genes, mode of inheritance, extent of
penetrance, and locus heterogeneity.
In contrast to the dominant model of inheritance, incomplete
penetrance does not hold for the recessive
model, and in consequence, much smaller sample sizes
are needed under a recessive model, even in the presence
of high locus heterogeneity [68]. According to our
a priori sample size estimations at an effect size of r = 0.3
(medium effect) or r = 0.5 (strong effect), we would
need 64 or 21 samples, respectively, to ensure a power
of 80%. Thus, post hoc analysis using the DMD
sample size in this study (n = 45 cases) indicated
a power of 66.7% (r = 0.3) and 98.5% (r = 0.5).
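The text does not state which test, tail, or software produced these figures. As a rough illustration only, the sketch below approximates power for detecting a correlation-type effect size r using the Fisher z-transformation with a one-sided alpha of 0.05; both the approximation and the alpha level are assumptions made here, so the results land near the quoted values without reproducing them exactly.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_for_correlation(r, n, alpha=0.05):
    """Approximate one-sided power to detect correlation r with n samples.

    Uses the Fisher z-transformation: atanh(r) has standard error
    1 / sqrt(n - 3). This is an assumed method, not the study's own.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    noncentrality = math.atanh(r) * math.sqrt(n - 3)
    return _STD_NORMAL.cdf(noncentrality - z_crit)


def required_n(r, power=0.80, alpha=0.05):
    """Smallest n reaching the target power under the same approximation."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


for effect in (0.3, 0.5):
    print(
        f"r = {effect}: n for 80% power ~ {required_n(effect)}, "
        f"power at n = 45 ~ {power_for_correlation(effect, 45):.3f}"
    )
```

Exact methods, such as those implemented in dedicated power-analysis software, generally give somewhat smaller required sample sizes than this normal approximation.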
Pinning down the spectrum of mutations for DMD
has been difficult because of poor replication of studies.
First, when compared with our study, some studies have
had populations with admixed ethnicities, conflicting outcomes,
or small sample sizes, which lessen the strength
of the overall results. Second, various molecular technologies
have been utilized to examine DMD patients,
resulting in a broad range of false-positive or false-negative
results regarding rearrangements. Our study
mainly used the DMD MLPA test, providing a cheap
and straightforward DNA-based test that can screen for
deletions and duplications and be performed in any
DNA laboratory. Third, insufficient communication between
clinicians and geneticists, because of difficulty
accessing hospitals of interest, may result in underdiagnosis
of critical cases. Closer coordination between
clinicians and geneticists may help promote and
improve the genetic diagnosis of dystrophinopathies and
advance potential therapies in these cases.
Conclusions
We detected nine previously undescribed exonic rear-
rangements within the DMD gene, including one un-
usual mixed rearrangement. MLPA or mPCR can be
used to define the molecular characteristics of DMD re-
arrangements and hence the effects of the frame shifts
on genotype-phenotype correlations in Saudi patients.
This information will also be important for future gene
therapy targeting exon skipping of the DMD gene. The
clinical characteristics of our cohort revealed a diagnostic delay,
suggesting the need for more public awareness about early
symptoms of the disease. Moreover, CPK testing should
be performed as a routine test in children’s hospitals and
in primary care settings. In KSA, molecular testing of
DMD patients should be covered by medical insurance,
at least once in a lifetime. This single test could lead to
genetic diagnosis of more patients. The large deletions
and duplications we identified are predictive and intri-
guing, but the study needs to be replicated in different
ethnic populations of the Middle East, as well as in other
Saudi governorates. The sample size for this
study might not have been large enough to explore the
DMD mutational mechanisms, and extensive sequencing
analyses will be needed to discover the DMD breakpoints
at the nucleotide level. Ongoing analyses of
whole-exome sequences for Saudi patients with DMD
are being carried out to identify the small breakpoints
within the DMD gene.
Additional file
Additional file 1: Table S1. Oligonucleotide Sequences of 15 multiplex
PCR sets and amplification size fragments. (DOCX 17 kb)
Abbreviations
BMD: Becker muscular dystrophy; CPK: Creatine phosphokinase;
DMD: Duchenne muscular dystrophy; FDA: Food and Drug Administration;
IMD: Intermediate muscular dystrophy; KSA: Kingdom of Saudi Arabia;
MLPA: Multiplex ligation-dependent probe amplification; ND: Not
determined
Acknowledgements
The authors would like to thank the parents of the cases for their
participation in this study. The authors also thank the Institute of Scientific
Research at Umm Al-Qura University (Project #43309030) for financial support
and the Faculty of Medicine, Cairo University-Giza, Egypt, for allowing ANE to
assist in following up the clinical phenotypes of the DMD cases.
Funding
This work was funded through grants from the Institute of Scientific
Research at Umm Al-Qura University (Project #43309030).
Availability of data and materials
The data sets analyzed during the current study are available from the
corresponding author.
Authors’ contributions
NAE and MTT designed the research; NAE, MTT, SJ, AD, EHJ, ANE, HK, and AK
performed the research; NAE, MTT, and AHM analyzed the data; NAE, MTT,
NB, KFA, and MR wrote the paper. Also, NAE and MTT initiated the grant
funding through a contract with the Institute of Scientific Research at Umm
Al-Qura University. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Written informed consent was obtained from the parents of all the
participants enrolled in this project (#43309030), which was approved by the
Institutional Biomedical Ethics Committee of Umm Al-Qura University. The
study was performed in accordance with the declaration of the National Committee of
Biomedical Ethics at King Abdulaziz City for Sciences and Technology (KACST)
(http://bioethics.kacst.edu.sa/About.aspx?lang=en-US).
Consent for publication
Written informed consent was obtained from the parents of all study
participants to publish the results.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Author details
1Department of Medical Genetics, Medicine College, Umm Al-Qura
University, P.O. Box 57543, Mecca 21955, Saudi Arabia. 2Department of
Molecular Genetics, Faculty of Medicine, Ain Shams University, Cairo 11566,
Egypt. 3Department of Medical Laboratory Technology, Faculty of Applied
Medical Sciences, King Abdul-Aziz University, Jeddah, Saudi Arabia.
4Department of Pediatrics, Al Hada Military Hospital, Al Hada, Saudi Arabia.
5Department of Plan and Research, General Directorate of Health Affairs,
Mecca Region, Ministry of Health, Mecca, Saudi Arabia. 6Department of
Pediatrics, Faculty of Medicine, Cairo University, Giza, Egypt.
Received: 27 January 2018 Accepted: 2 April 2018
References
1. Hoffman EP, Brown RH Jr, Kunkel LM. Dystrophin: the protein product of
the Duchenne muscular dystrophy locus. Cell. 1987;51(6):919–28.
2. Koenig M, Monaco AP, Kunkel LM. The complete sequence of dystrophin
predicts a rod-shaped cytoskeletal protein. Cell. 1988;53(2):219–28.
3. Emery AE. Population frequencies of inherited neuromuscular diseases—a
world survey. Neuromuscul Disord. 1991;1(1):19–29.
4. Bushby K, Finkel R, Birnkrant DJ, Case LE, Clemens PR, Cripe L, Kaul A,
Kinnett K, McDonald C, Pandya S, et al. Diagnosis and management of
Duchenne muscular dystrophy, part 1: diagnosis, and pharmacological and
psychosocial management. Lancet Neurol. 2010;9(1):77–93.
5. Jarmin S, Kymalainen H, Popplewell L, Dickson G. New developments in the
use of gene therapy to treat Duchenne muscular dystrophy. Expert Opin
Biol Ther. 2014;14(2):209–30.
6. Durbeej M, Campbell KP. Muscular dystrophies involving the dystrophin-
glycoprotein complex: an overview of current mouse models. Curr Opin
Genet Dev. 2002;12(3):349–61.
7. Monaco AP, Bertelson CJ, Liechti-Gallati S, Moser H, Kunkel LM. An
explanation for the phenotypic differences between patients bearing partial
deletions of the DMD locus. Genomics. 1988;2(1):90–5.
8. Hentze MW, Kulozik AE. A perfect message: RNA surveillance and nonsense-
mediated decay. Cell. 1999;96(3):307–10.
9. Muntoni F, Torelli S, Ferlini A. Dystrophin and mutations: one gene, several
proteins, multiple phenotypes. Lancet Neurol. 2003;2(12):731–40.
10. Flanigan KM, Dunn DM, von Niederhausern A, Howard MT, Mendell J,
Connolly A, Saunders C, Modrcin A, Dasouki M, Comi GP, et al. DMD Trp3X
nonsense mutation associated with a founder effect in North American
families with mild Becker muscular dystrophy. Neuromuscul Disord. 2009;
19(11):743–8.
11. Bladen CL, Salgado D, Monges S, Foncuberta ME, Kekou K, Kosma K,
Dawkins H, Lamont L, Roy AJ, Chamova T, et al. The TREAT-NMD DMD
Global Database: analysis of more than 7,000 Duchenne muscular dystrophy
mutations. Hum Mutat. 2015;36(4):395–402.
12. Chamberlain JS, Gibbs RA, Ranier JE, Nguyen PN, Caskey CT. Deletion
screening of the Duchenne muscular dystrophy locus via multiplex DNA
amplification. Nucleic Acids Res. 1988;16(23):11141–56.
13. Beggs AH, Koenig M, Boyce FM, Kunkel LM. Detection of 98% of DMD/BMD
gene deletions by polymerase chain reaction. Hum Genet. 1990;86(1):45–8.
14. Forrest SM, Cross GS, Speer A, Gardner-Medwin D, Burn J, Davies KE.
Preferential deletion of exons in Duchenne and Becker muscular
dystrophies. Nature. 1987;329(6140):638–40.
15. Oudet C, Hanauer A, Clemens P, Caskey T, Mandel JL. Two hot spots of
recombination in the DMD gene correlate with the deletion prone regions.
Hum Mol Genet. 1992;1(8):599–603.
16. Den Dunnen JT, Grootscholten PM, Bakker E, Blonden LA, Ginjaar HB,
Wapenaar MC, van Paassen HM, van Broeckhoven C, Pearson PL, van
Ommen GJ. Topography of the Duchenne muscular dystrophy (DMD) gene:
FIGE and cDNA analysis of 194 cases reveals 115 deletions and 13
duplications. Am J Hum Genet. 1989;45(6):835–47.
17. Hu XY, Ray PN, Murphy EG, Thompson MW, Worton RG. Duplicational
mutation at the Duchenne muscular dystrophy locus: its frequency,
distribution, origin, and phenotypegenotype correlation. Am J Hum Genet.
1990;46(4):682–95.
18. Lalic T, Vossen RH, Coffa J, Schouten JP, Guc-Scekic M, Radivojevic D,
Djurisic M, Breuning MH, White SJ, den Dunnen JT. Deletion and
duplication screening in the DMD gene using MLPA. Eur J Hum Genet.
2005;13(11):1231–4.
19. Elhawary NA, Shawky RM, Elsayed N. High-precision DNA microsatellite
genotyping in Duchenne muscular dystrophy families using ion-pair
reversed-phase high performance liquid chromatography. Clin Biochem.
2006;39(7):758–61.
20. Koenig M, Hoffman EP, Bertelson CJ, Monaco AP, Feener C, Kunkel LM.
Complete cloning of the Duchenne muscular dystrophy (DMD) cDNA and
preliminary genomic organization of the DMD gene in normal and affected
individuals. Cell. 1987;50(3):509–17.
21. Ioannou P, Christopoulos G, Panayides K, Kleanthous M, Middleton L. Detection
of Duchenne and Becker muscular dystrophy carriers by quantitative multiplex
polymerase chain reaction analysis. Neurol. 1992;42(9):1783–90.
22. Kodaira M, Hiyama K, Karakawa T, Kameo H, Satoh C. Duplication detection
in Japanese Duchenne muscular dystrophy patients and identification of
carriers with partial gene deletions using pulsed-field gel electrophoresis.
Hum Genet. 1993;92(3):237–43.
23. Yau SC, Bobrow M, Mathew CG, Abbs SJ. Accurate diagnosis of carriers of
deletions and duplications in Duchenne/Becker muscular dystrophy by
fluorescent dosage analysis. J Med Genet. 1996;33(7):550–8.
24. White S, Kalf M, Liu Q, Villerius M, Engelsma D, Kriek M, Vollebregt E, Bakker
B, van Ommen GJ, Breuning MH, et al. Comprehensive detection of
genomic duplications and deletions in the DMD gene, by use of multiplex
amplifiable probe hybridization. Am J Hum Genet. 2002;71(2):365–74.
25. del Gaudio D, Yang Y, Boggs BA, Schmitt ES, Lee JA, Sahoo T, Pham HT,
Wiszniewska J, Chinault AC, Beaudet AL, et al. Molecular diagnosis of
Duchenne/Becker muscular dystrophy: enhanced detection of dystrophin
gene rearrangements by oligonucleotide array-comparative genomic
hybridization. Hum Mutat. 2008;29(9):1100–7.
26. Schouten JP, McElgunn CJ, Waaijer R, Zwijnenburg D, Diepvens F, Pals G.
Relative quantification of 40 nucleic acid sequences by multiplex ligation-
dependent probe amplification. Nucleic Acids Res. 2002;30(12):e57.
27. Schwartz M, Duno M. Improved molecular diagnosis of dystrophin gene
mutations using the multiplex ligation-dependent probe amplification
method. Genet Test. 2004;8(4):361–7.
28. Janssen B, Hartmann C, Scholz V, Jauch A, Zschocke J. MLPA analysis for the
detection of deletions, duplications and complex rearrangements in the
dystrophin gene: potential and pitfalls. Neurogenetics. 2005;6(1):29–35.
29. Chen C, Ma H, Zhang F, Chen L, Xing X, Wang S, Zhang X, Luo Y. Screening
of Duchenne muscular dystrophy (DMD) mutations and investigating its
mutational mechanism in Chinese patients. PLoS One. 2014;9(9):e108038.
30. Nobile C, Toffolatti L, Rizzi F, Simionati B, Nigro V, Cardazzo B, Patarnello T,
Valle G, Danieli GA. Analysis of 22 deletion breakpoints in dystrophin intron
49. Hum Genet. 2002;110(5):418–21.
31. Cotton RG, Auerbach AD, Beckmann JS, Blumenfeld OO, Brookes AJ, Brown
AF, Carrera P, Cox DW, Gottlieb B, Greenblatt MS, et al. Recommendations
for locus-specific databases and their curation. Hum Mutat. 2008;29(1):2–5.
32. Tuffery-Giraud S, Beroud C, Leturcq F, Yaou RB, Hamroun D, Michel-Calemard L,
Moizard MP, Bernard R, Cossee M, Boisseau P, et al. Genotype-phenotype analysis
in 2,405 patients with a dystrophinopathy using the UMD-DMD database: a
model of nationwide knowledgebase. Hum Mutat. 2009;30(6):934–45.
33. Mitsui J, Takahashi Y, Goto J, Tomiyama H, Ishikawa S, Yoshino H, Minami N,
Smith DI, Lesage S, Aburatani H, et al. Mechanisms of genomic instabilities
underlying two common fragile-site-associated loci, PARK2 and DMD, in
germ cell and cancer cell lines. Am J Hum Genet. 2010;87(1):75–89.
34. Ankala A, Kohn JN, Hegde A, Meka A, Ephrem CL, Askree SH, Bhide S,
Hegde MR. Aberrant firing of replication origins potentially explains
intragenic nonrecurrent rearrangements within genes, including the human
DMD gene. Genome Res. 2012;22(1):25–34.
35. Suh MR, Lee KA, Kim EY, Jung J, Choi WA, Kang SW. Multiplex ligation-
dependent probe amplification in X-linked recessive muscular dystrophy in
Korean subjects. Yonsei Med J. 2017;58(3):613–8.
36. Wang DN, Wang ZQ, Yan L, He J, Lin MT, Chen WJ, Wang N. Clinical and
mutational characteristics of Duchenne muscular dystrophy patients based
on a comprehensive database in South China. Neuromuscul Disord. 2017;
27(8):715–22.
37. Al-Jumah M, Majumdar R, Al-Rajeh S, Chaves-Carballo E, Salih MM, Awada A,
Al-Shahwan S, Al-Uthaim S. Deletion mutations in the dystrophin gene of
Saudi patients with Duchenne and Becker muscular dystrophy. Saudi Med J.
2002;23(12):1478–82.
38. Tayeb MT. Deletion mutations in Duchenne muscular dystrophy (DMD) in
Western Saudi children. Saudi J Biol Sci. 2010;17(3):237–40.
39. Elhawary NA, Nassir A, Saada H, Dannoun A, Qoqandi O, Alsharif A, Tayeb
MT. Combined genetic biomarkers confer susceptibility to risk of urothelial
bladder carcinoma in a Saudi population. Dis Markers. 2017;2017:1474560.
40. Aartsma-Rus A, Van Deutekom JC, Fokkema IF, Van Ommen GJ, Den
Dunnen JT. Entries in the Leiden Duchenne muscular dystrophy mutation
database: an overview of mutation types and paradoxical cases that confirm
the reading-frame rule. Muscle Nerve. 2006;34(2):135–44.
41. White SJ, den Dunnen JT. Copy number variation in the genome; the human
DMD gene as an example. Cytogenet Genome Res. 2006;115(3–4):240–6.
42. Kerr R, Robinson C, Essop FB, Krause A. Genetic testing for Duchenne/Becker
muscular dystrophy in Johannesburg, South Africa. S Afr Med J. 2013;103(12
Suppl 1):999–1004.
43. Manjunath M, Kiran P, Preethish-Kumar V, Nalini A, Singh RJ, Gayathri N. A
comparative study of mPCR, MLPA, and muscle biopsy results in a cohort of
children with Duchenne muscular dystrophy: a first study. Neurol India.
2015;63(1):58–62.
44. Okubo M, Minami N, Goto K, Goto Y, Noguchi S, Mitsuhashi S, Nishino I.
Genetic diagnosis of Duchenne/Becker muscular dystrophy using next-
generation sequencing: validation analysis of DMD mutations. J Hum Genet.
2016;61(6):483–9.
45. Liang WC, Wang CH, Chou PC, Chen WZ, Jong YJ. The natural history of the
patients with Duchenne muscular dystrophy in Taiwan: a medical center
experience. Pediatr Neonatol. 2017.
46. Ulgenalp A, Giray O, Bora E, Hizli T, Kurul S, Sagin-Saylam G, Karasoy H, Uran
N, Dizdarer G, Tutuncuoglu S, et al. Deletion analysis and clinical
correlations in patients with Xp21 linked muscular dystrophy. Turk J Pediatr.
2004;46(4):333–8.
47. Elhawary NA, Shawky RM, Hashem N. Frameshift deletion mechanisms in
Egyptian Duchenne and Becker muscular dystrophy families. Mol Cells.
2004;18(2):141–9.
48. Nouri N, Fazel-Najafabadi E, Salehi M, Hosseinzadeh M, Behnam M, Ghazavi
MR, Sedghi M. Evaluation of multiplex ligation-dependent probe
amplification analysis versus multiplex polymerase chain reaction assays in
the detection of dystrophin gene rearrangements in an Iranian population
subset. Adv Biomed Res. 2014;3:72.
49. Sbiti A, El Kerch F, Sefiani A. Analysis of dystrophin gene deletions by
multiplex PCR in Moroccan patients. J Biomed Biotechnol. 2002;2(3):158–60.
50. Madania A, Zarzour H, Jarjour RA, Ghoury I. Combination of conventional
multiplex PCR and quantitative real-time PCR detects large rearrangements
in the dystrophin gene in 59% of Syrian DMD/BMD patients. Clin Biochem.
2010;43(10–11):836–42.
51. el-Hazmi MA, al-Swailem AR, Warsy AS, al-Swailem AM, Sulaimani R, al-
Meshari AA. Consanguinity among the Saudi Arabian population. J Med
Genet. 1995;32(8):623–6.
52. Vieitez I, Gallano P, Gonzalez-Quereda L, Borrego S, Marcos I, Millan JM, Jairo
T, Prior C, Molano J, Trujillo-Tiebas MJ, et al. Mutational spectrum of
Duchenne muscular dystrophy in Spain: study of 284 cases. Neurologia.
2017;32(6):377–85.
53. Lavergne S, Molofsky J. Increased genetic variation and evolutionary
potential drive the success of an invasive grass. Proc Natl Acad Sci U S A.
2007;104:3883–8.
54. Rund D, Cohen T, Filon D, Dowling CE, Warren TC, Barak I, Rachmilewitz E,
Kazazian HH Jr, Oppenheim A. Evolution of a genetic disease in an ethnic
isolate: beta-thalassemia in the Jews of Kurdistan. Proc Natl Acad Sci U S A.
1991;88(1):310–4.
55. Jiffri EH, Bogari N, Zidan KH, Teama S, Elhawary NA. Molecular updating of
β-thalassemia mutations in the upper Egyptian population. Hemoglobin.
2010;34(6):538–47.
56. Ciafaloni E, Fox DJ, Pandya S, Westfield CP, Puzhankara S, Romitti PA,
Mathews KD, Miller TM, Matthews DJ, Miller LA, et al. Delayed diagnosis in
Duchenne muscular dystrophy: data from the Muscular Dystrophy
Surveillance, Tracking, and Research Network (MD STARnet). J Pediatr. 2009;
155(3):380–5.
57. van Ruiten HJ, Straub V, Bushby K, Guglieri M. Improving recognition of
Duchenne muscular dystrophy: a retrospective case note review. Arch Dis
Child. 2014;99(12):1074–7.
58. Li X, Zhao L, Zhou S, Hu C, Shi Y, Shi W, Li H, Liu F, Wu B, Wang Y. A
comprehensive database of Duchenne and Becker muscular dystrophy
patients (0−18 years old) in East China. Orphanet J Rare Dis. 2015;10:5.
59. Cirak S, Arechavala-Gomeza V, Guglieri M, Feng L, Torelli S, Anthony K, Abbs
S, Garralda ME, Bourke J, Wells DJ, et al. Exon skipping and dystrophin
restoration in patients with Duchenne muscular dystrophy after systemic
phosphorodiamidate morpholino oligomer treatment: an open-label, phase
2, dose-escalation study. Lancet. 2011;378(9791):595–605.
60. Mendell JR, Goemans N, Lowes LP, Alfano LN, Berry K, Shao J, Kaye EM,
Mercuri E, Eteplirsen Study G, Telethon Foundation DMDIN. Longitudinal
effect of eteplirsen versus historical control on ambulation in Duchenne
muscular dystrophy. Ann Neurol. 2016;79(2):257–71.
61. Shimizu-Motohashi Y, Miyatake S, Komaki H, Takeda S, Aoki Y. Recent advances
in innovative therapeutic approaches for Duchenne muscular dystrophy: from
discovery to clinical trials. Am J Transl Res. 2016;8(6):2471–89.
62. Lee BL, Nam SH, Lee JH, Ki CS, Lee M, Lee J. Genetic analysis of dystrophin
gene for affected male and female carriers with Duchenne/Becker muscular
dystrophy in Korea. J Korean Med Sci. 2012;27(3):274–80.
63. Lim KR, Maruyama R, Yokota T. Eteplirsen in the treatment of Duchenne
muscular dystrophy. Drug Des Devel Ther. 2017;11:533–45.
64. van Deutekom JC, van Ommen GJ. Advances in Duchenne muscular
dystrophy gene therapy. Nat Rev Genet. 2003;4(10):774–83.
65. Mah JK. Current and emerging treatment strategies for Duchenne muscular
dystrophy. Neuropsychiatr Dis Treat. 2016;12:1795–807.
66. Wein N, Vulin A, Findlay AR, Gumienny F, Huang N, Wilton SD, Flanigan KM.
Efficient skipping of single exon duplications in DMD patient-derived cell
lines using an antisense oligonucleotide approach. J Neuromuscul Dis. 2017;
4(3):199–207.
67. Aoki Y, Yokota T, Nagata T, Nakamura A, Tanihata J, Saito T, Duguez SM,
Nagaraju K, Hoffman EP, Partridge T, et al. Bodywide skipping of exons 45-
55 in dystrophic mdx52 mice by systemic antisense delivery. Proc Natl Acad
Sci U S A. 2012;109(34):13763–8.
68. Guo MH, Dauber A, Lippincott MF, Chan YM, Salem RM, Hirschhorn JN.
Determinants of power in gene-based burden testing for monogenic
disorders. Am J Hum Genet. 2016;99(3):527–39.
Arab genome-Health and Wealth-2016
Gene 592 (2016) 239–243
Review
The Arab genome: Health and wealth
Hatem Zayed
College of Health and Sciences, Biomedical Sciences Department, Qatar University, PO Box 2713, Doha, Qatar
E-mail address: hatem.zayed@qu.edu.qa.
http://dx.doi.org/10.1016/j.gene.2016.07.007
0378-1119/© 2016 Published by Elsevier B.V.
Article info
Article history:
Received 21 June 2016
Accepted 3 July 2016
Available online 5 July 2016
Abstract
The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to
the prevalent endogamous and consanguineous marriage culture and the long history of admixture among dif-
ferent ethnic subcultures descended from the Asian, European, and African continents. Human genome sequenc-
ing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying
disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dy-
namics of the human genome, discovering rare genetic variations, and studying early human migration out of
Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project.
In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome
reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare var-
iants, and identifying a meaningful genotype-phenotype correlation for complex diseases.
Keywords:
Arab countries
Human genome sequencing
Whole exome sequencing
Consanguinity
Endogamous marriage
Novel genes
Novel variants
Contents
1. Introduction
2. The Arab world
2.1. Inbred Arab communities and rare variants discovery
3. The Arab genome
3.1. Discovery of novel disease-causing genes and the Arab genome
3.2. Arab efforts in genome sequencing
3.3. The Arab genome and the “Out of Africa” theory
3.4. Benefits of sequencing the Arab genome
4. Conclusion
Disclosure declaration
References
1. Introduction
The completion of the Human Genome Project (HGP) in April 2003
provided a wealth of information to scientists and clinicians. Subse-
quently, the world has witnessed rapid evolution in the field of
human genetics and genomics (Lander et al., 2001; Venter et al.,
2001). Initially, the focus of the HGP was to catalog the protein-
expressing genes, which are now estimated to include approximately
20,000 to 25,000 coding genes (International Human Genome
Sequencing Consortium, 2004). However, the hard work of decoding
the function of many genes and their precise genotype-phenotype cor-
relation in disease development remains.
From the publication of the first draft of the human genome, there
has been fierce competition to develop sequencing technologies that
are faster, more efficient and cheaper and to make the price of human
genome sequencing more affordable. Thus far, whole genome/exome
sequencing has provided outstanding insights into the frequency and
incidence of novel variants in the human genome that are associated
with disease phenotypes. This information gives populations around the
world the opportunity to map sequence variants
that might be unique to their own members and that might be
responsible for genetic disorders in their specific populations. For this
purpose, the HapMap (human haplotype mapping) Project was
launched in 2002 (International HapMap Consortium, 2003); this pro-
ject has identified a considerable number of genetic variants, providing
extensive catalogs for genetic variation. The HapMap Project has also
served as the basis for genome-wide association studies (GWAS). In
particular, the HapMap Project has contributed to the successful map-
ping of more than 100 genomic regions that are associated with genetic
diseases (International HapMap Consortium, 2003).
As an extension of the HapMap Project, the 1000 Genomes Project
was launched in 2008 through international concerted efforts
(Buchanan et al., 2012). This project aims to sequence the whole ge-
nomes of 1000 unidentified individuals from Europe, America, Africa,
and Asia, and will add information to the single-nucleotide polymor-
phism (SNP) database already cataloged by the HapMap Project and
provide a rich resource for both SNPs and structural variant haplotypes.
Although this information will allow researchers to learn more about
many genetic variants and genetic diseases, unfortunately, the Arab ge-
nome is greatly under-represented in the international efforts of such
genomic studies; specifically, it is not included in the HGP, HapMap Project,
or 1000 Genomes Project. Sequencing the Arab genome is clearly important,
and this genome should not be omitted from the diverse collections of
genomes that have already been sequenced. Therefore, I focus this review on
the importance of the Arab genome and its potential
contribution to the genomic sciences.
2. The Arab world
The Arab world includes 22 Arabic-speaking countries (Fig. 1). Ac-
cording to the World Bank latest classification for 2015 (http://data.
worldbank.org), the Arab countries include high-income countries
(HICs) such as Bahrain, Kuwait, Oman, Saudi Arabia, Qatar, and the
United Arab Emirates; middle-income countries (MICs) such as
Algeria, Egypt, Iraq, Jordan, Lebanon, Libya, Morocco, Palestine, Sudan,
Syria, and Tunisia; and low-income countries (LICs) such as Comoros,
Djibouti, Mauritania, Somalia, and Yemen. These countries occupy a
large area that extends from the Atlantic Ocean in the west to the Arabian
Sea in the east, and the Arab population is approaching 0.5 billion.
This region has been extensively exposed to many successive invaders
from Turkey, Rome, and Europe as well as to traders and immigrants,
thus contributing to mixing of the ethnic demographics of the population.
However, the HICs, which include countries with the highest
Gross Domestic Product (GDP) per capita worldwide (http://data.worldbank.org),
spend less than 0.2% of their GDP on scientific development
(Giles, 2006). This phenomenon has led to the emigration of
many Arab scientists to the West to look for better opportunities.
Recently, however, biomedical disease-based research has received special
attention from Arab governments, with the aim of improving the
understanding and treatment of common diseases afflicting the Arab
population. Various attempts have been made by Saudi Arabia and
Qatar in particular to establish a research infrastructure, but
progress has been slow relative to the amount of capital infused
into such programs, and the benefits of such investments might take
significant time to materialize. In this manuscript I will refer to the
“Arab genome” as the genome of the 22 Arab countries.
Fig. 1. Arabic speaking countries according to the latest WHO classification.
(Source: http://www.arabic-keyboard.org/arabic).
2.1. Inbred Arab communities and rare variants discovery
There are 955 genetic diseases that have been identified in Arabs, of
which 586 (60%) are reported to be recessive diseases (http://www.cags.org.ae).
Arabs have one of the highest rates of consanguineous
marriage worldwide, reaching up to ~70%, with an extreme prevalence
of first-cousin marriage (Tadmouri et al., 2009). These factors, together
with the endogamous marriage culture and large family sizes, are
responsible for the spread of genetic diseases in Arab countries, with a
high prevalence of rare diseases (Teebi and Teebi, 2005). Endogamous
marriages approach 100% in many Arab countries, especially the
Gulf States (i.e., Bahrain, Kuwait, Oman, Qatar, Saudi Arabia and the
United Arab Emirates). For example, women in Saudi Arabia are
prohibited from marrying men other than Arab men from the Gulf
countries without special dispensation from the king
(http://web.archive.org/web/20120614045804/http://travel.state.gov/travel/cis_pa_tw/tw/tw_931.html),
and men must acquire a government permit
to marry a foreign woman. This law is applicable to the six Gulf States
and is due to deeply entrenched, centuries-old traditions that strongly
favor marriage within the same Arab subcultures. In addition, this mar-
riage culture is still on the rise; for example, consanguineous marriage
rates in Qatar increased from 41.8% to 54.5% in just one generation
(Bener and Alali, 2006).
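The link between consanguinity and recessive disease can be made concrete with the standard inbreeding coefficient F. The following is a minimal sketch, assuming a single biallelic locus, a hypothetical disease-allele frequency q, and the textbook value F = 1/16 for the offspring of first cousins. The function name and the value of q are illustrative and are not taken from the studies cited above.

```python
from fractions import Fraction


def recessive_genotype_frequency(q, F):
    """Expected frequency of the homozygous recessive genotype.

    Standard population-genetics result: q^2 * (1 - F) + q * F,
    where q is the disease-allele frequency and F the inbreeding coefficient.
    """
    return q * q * (1 - F) + q * F


# Textbook inbreeding coefficients (assumptions, not values from the cited studies).
F_UNRELATED = Fraction(0)
F_FIRST_COUSINS = Fraction(1, 16)

# Hypothetical allele frequency chosen only for illustration.
q = Fraction(1, 100)

risk_outbred = recessive_genotype_frequency(q, F_UNRELATED)
risk_cousins = recessive_genotype_frequency(q, F_FIRST_COUSINS)
relative_risk = risk_cousins / risk_outbred

print(float(risk_outbred), float(risk_cousins), float(relative_risk))
```

Under these assumptions the relative risk equals (1 - F) + F/q, so it grows as the allele becomes rarer, which is the reasoning behind studying rare recessive variants in consanguineous families.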
A large number of rare variants still have unknown clinical significance
because of the limitations of current technologies: establishing their
effects requires a large number of individuals harboring these variants,
which are largely untested by high-density SNP arrays.
Therefore, studying inbred communities such as Arab communities is
an ideal scenario to understand the effect of genetic variants on the
human genome. In this regard, genetic analysis of the Arab genome is
considered to be a goldmine for genomic scientists who are looking
for a more discernible correlation between the genotype and the pheno-
type of genetic diseases, and particularly complex disorders and rare ge-
netic disorders. The inbreeding nature of many Arab communities and
the commonness of the conservative marriage culture might predict a
wide class of complex disorders, especially if the causative variants are
rare and the most identified genetic variants causing the complex dis-
eases in humans are partially recessive (Bittles and Black, 2010; Rudan
et al., 2003). In this regard, Arabs represent an ideal population for bet-
ter understanding the pathogenesis and prognosis of recessive diseases,
which are yet to be elucidated. Although the consanguineous, endogamous
Arab culture seems to predict a conserved pool of genes among
Arabs, the structure of the Arab genome became diversified over time,
mainly due to admixing of the genome with those of different ethnic
groups descended from Africa, Asia, and Europe (Teebi and Teebi,
2005), which provides another opportunity for understanding the dynamics
of the Arab genome and the “out of Africa” migration theory.
3. The Arab genome
Although the Arab region is considered to be a hot spot for medical
and clinical genetic studies (Nat. Genet., 2006), Arabs have been slow
to explore their own genome. This reticence might be due to the follow-
ing reasons: (1) in most Arab countries, it is not yet affordable to se-
quence a genome, even for clinical diagnostic reasons, despite the
continually diminishing costs of next-generation sequencing technologies;
(2) research is not considered to be a necessity in most Arab countries,
mainly due to economic reasons; and (3) there is a dearth of well-
trained scientists in genomics. As a consequence, there is a lack of infor-
mation related to molecular pathogenesis and poor knowledge of both
the genotype-phenotype correlation of genetic diseases and the gene
variants that are responsible for the spread of these diseases that are
segregating in the Arab genome. This is the case even for the most dev-
astating diseases, such as diabetes and cardiovascular disorders, which
compromises the level of the health care provided to the Arab popula-
tion. Therefore, Arab governments must prioritize seeking the means
to understand the complexity and dynamics of the Arab genome, espe-
cially in countries that are able to afford the costs of genome sequencing.
Consistent with this concept, a genomic revolution has been ignited in
the Arabian Peninsula, especially in the Gulf States of Saudi Arabia,
Kuwait, and Qatar, as the US Encyclopedia of DNA Elements (ENCODE)
project and the Arab genome initiatives, represented by the Saudi
Human Genome Project (SHGP) (http://shgp.kacst.edu.sa/site), the
Qatar Genome Project (QGP) (Al-Mulla, 2014), and the Kuwaiti Genome
Project (KGP) (Thareja et al., 2015), aim to systematically and compre-
hensively analyze and catalog the genetic variants and haplotypes that
are associated with health and disease. These efforts are expected to
help in the identification of novel disease-associated gene variants.
The initiatives also aim to derive reference genome sequence(s) for
different subpopulations of different ancestries in Kuwait. Although Arab
scientists are a decade late in sequencing the Arab genome, this
sequencing is expected to contribute to knowledge related to migration,
genome ancestry, genome evolution, genome dynamics, mapping of
rare disease-associated variants, and novel disease-associated gene
discovery.
3.1. Discovery of novel disease-causing genes and the Arab genome
Inbreeding is associated with an increased disease risk based on
increased homozygosity at many genetic loci (Rudan et al., 2003). It also
leads to a high probability of shared ancestry between randomly selected
Arab individuals and to longer runs of homozygosity, which makes highly
consanguineous families in inbred Arab communities ideal for mapping
rare disease susceptibility loci. A representative example
was provided by Verge et al. (1998), who analyzed an inbred Bedouin
Arab community with a long history of first-cousin marriage. They
studied a large Arab family of 248 individuals living in Israel, in which
19 relatives were affected with type 1 diabetes and carried rare
predisposing haplotypes that were not found in other families. The
researchers discovered a novel susceptibility locus
(IDDM17; MIM#603266) for type 1 diabetes, which was mapped to
chromosome 10 (10q25.1). Another example is the identification of a
novel locus defined by a TMEM107 mutation through
sequencing 25 families with the rare ciliopathy Meckel-Gruber syndrome
(Shaheen et al., 2015). A further study led to the
discovery of six novel candidate genes that were found to be associated
with embryonic lethality in Saudi Arabian consanguineous families
(Shamseldin et al., 2015).
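The mapping logic described above, searching for stretches of homozygosity shared by affected relatives, can be illustrated with a short sketch. This is a simplified illustration and not the pipeline used in the cited studies: it assumes genotypes coded as alternate-allele counts (0, 1, 2) at markers ordered by position, uses a hypothetical minimum run length counted in markers instead of physical distance, and does not check that the shared homozygous segments carry the same haplotype. The function names and toy data are invented for the example.

```python
def runs_of_homozygosity(genotypes, min_length=3):
    """Return (start, end) index pairs (end exclusive) of homozygous runs.

    genotypes: alternate-allele counts per marker, ordered by position
               (0 or 2 = homozygous, 1 = heterozygous).
    min_length: minimum number of consecutive homozygous markers to report
                (hypothetical threshold chosen for the toy data).
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None  # a heterozygous marker ends the stretch
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes)))
    return runs


def shared_runs(affected_individuals, min_length=3):
    """Marker indices lying inside a homozygous run in every affected individual."""
    shared = None
    for genotypes in affected_individuals:
        covered = set()
        for start, end in runs_of_homozygosity(genotypes, min_length):
            covered.update(range(start, end))
        shared = covered if shared is None else shared & covered
    return sorted(shared or [])


# Toy genotypes for two affected relatives (invented data).
affected = [
    [0, 1, 2, 2, 2, 2, 1, 0, 0],
    [1, 0, 2, 2, 2, 2, 1, 1, 0],
]

print(runs_of_homozygosity(affected[0]))
print(shared_runs(affected))
```

In a real analysis the candidate interval would be the region covered by the shared markers, which is then searched for rare homozygous variants by exome or genome sequencing.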
Whole exome sequencing (WES) has also revealed
a long list of novel candidate genes among consanguineous
Arab families. For example, a study of 143 multiplex Saudi families
identified 69 genes linked to recessive diseases that
had not previously been associated with genetic diseases
(Alazami et al., 2015). Diagnostic WES has also identified
several novel disease-associated genes among 149 probands from
a highly consanguineous population in Qatar with various
Mendelian phenotypes, mainly neurocognitive (Yavarna et al.,
2015). In a study of 18 consanguineous Arab families with Meckel–
Gruber syndrome (MKS), WES revealed likely pathogenic mutations
in three novel candidate MKS disease-causing genes (C5orf42, EVC2,
and SEC8) (Shaheen et al., 2013). The ARL6IP6 gene was identified as
a novel candidate gene for a syndromic form of cutis marmorata
telangiectatica congenita (CMTC) in a Saudi consanguineous
family (Abumansour et al., 2015). Therefore, the Arab
genome carries significant potential for advancing the fields of clinical
and medical genetics.
3.2. Arab efforts in genome sequencing
The SHGP is a 5-year project launched in December 2013 that in-
volves a partnership between the SHGP and Life Technologies (http://
shgp.kacst.edu.sa/site). The aim of the project is to sequence 100,000
Saudi genomes that represent both normal and disease conditions to
identify Saudi-specific genetic variants that are linked to high-
incidence genetic diseases in Saudi Arabia, such as diabetes, deafness,
cardiovascular disorders, cancer, and neurodegenerative diseases
(Abu-Elmagd et al., 2015). The SHGP’s specific mission is to establish a
genotype-phenotype correlation for genetic disease and to create a
foundation for personalized medicine, in which treatment will be devel-
oped based on the DNA blueprint of each Saudi individual. This ap-
proach will reduce the cost of health care, as the health care expenses
related to human genetic disease are greater than $30 billion annually
in Saudi Arabia (http://shgp.kacst.edu.sa/site).
A few days after the SHGP announcement, Qatar announced its in-
tention to launch the QGP and a plan to sequence the genomes of all
Qatari citizens (~300,000) (Al-Mulla, 2014). Similarly to the SHGP, the
QGP seeks the future protection of Qatari citizens from the spread of
genetic diseases due to the deeply entrenched culture of endogamous and
consanguineous marriage by understanding the genomic make-up of
the Qatari population, and integrating the sequencing information into
clinical care for Qatari individuals. The data collected from the genome
sequencing will be used as a platform for developing customized molecular
diagnostic approaches for Arabs (Zayed and Ouhtit, 2016), will help to
create the foundation of personalized medicine in the Arabian Peninsula,
and are expected to advance prenatal screening and genetic counseling
for disease-carrying individuals in Qatar. QGP has already started its
pilot phase by sequencing 3000 Qatari citizens (http://www.qatar-
tribune.com/viewnews.aspx?d=20151214&cat=nation2&pge=5).
Computational analyses that aim to decode the Qatari genome and map
the genetic variants that are unique to Qatari individuals are supported
by generous competitive funding from the Qatar Foundation
(https://www.qf.org.qa). These sequencing data are kept in electronic
medical records which will be an integral part of the Qatari National
Health Service.
The KGP is an initiative to determine the genetic diversity of the
main ethnic groups that constitute the Kuwaiti population, namely,
Saudi Arabians, Bedouins, and Persians, ascribing their origin to dif-
ferent regions of the Arabian Peninsula and West Asia (modern
Iranians). Thus, this project is the first to report a reference genome
resource for the population of Persian ancestry in Kuwait (Thareja
et al., 2015).
3.3. The Arab genome and the “Out of Africa” theory
The modern Arab gene pool exhibits a very interesting genetic
structure: it has numerous pockets of inbred communities due to
the prevalence of consanguineous unions, conserved pools of ge-
nomes due to widespread endogamous marriage, and a mixed gene
pool due to the history of Arab nations and the admixture of the ge-
nomes of different ethnic groups with those of people from Europe,
Africa, and Asia. This diversity is important in terms of understanding
genome evolution and dynamics, answering the “Out of Africa”
human migration question, and providing insights into the migration
routes of early modern humans from Africa to Eurasia. The primary
African origin of all modern human populations is well
known, but the routes of human migration out of Africa are still
uncertain. One potential route is through the Levant. Although the North
African genetic background stems mainly from the Near East/Arabian
Peninsula, the genomic ancestry of the Arabs of North Africa also shows an
African genome background due to historical mixing with sub-Saharan
African genomes (Henn et al., 2012). Another potential
route is to the south, across the Arabian Peninsula, which is a
nexus of Asia, Africa, and Europe (Kopp et al., 2014).
Fernandes et al. (2012) focused on disentangling the
impact of several waves of migration into the Arabian Peninsula
in terms of the contribution of African input and provided evidence
that the Arabian Peninsula could be the first staging post in the spread
of modern humans from Africa to the rest of the world.
Interestingly, sequencing of just 13 exomes and 2 full genomes in
Kuwait revealed ancestral genomic signature traces stemming from
Asia, Europe and Africa (Alsmadi et al., 2014; Alsmadi et al., 2013).
Egypt is an Afro-Asian Arab country that shares the Mediterranean Sea
with European countries (Fig. 1), and it has been proposed as a potential
source of the exodus of the African genome to Eurasia (Pagani et al.,
2015) according to geographical, archaeological, and genetic evidence.
African genomic components have been mapped (Pagani et al., 2015);
however, most of the analyzed Egyptian haplotypes were genetically
similar to those of modern non-Africans. The study concluded that
Egypt was a potential gateway for the migration of the African genome
to the rest of the world. Therefore, comparing Egyptian genomes
with European ones supports the exit route through Egypt, whereas
comparing Ethiopian genomes with Arab genomes addresses the southern
route of the out-of-Africa migration.
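The idea of a mixed gene pool can be made concrete with a simple two-source admixture model, in which the allele frequency of an admixed population at each marker is a weighted average of the frequencies in two source populations. The sketch below is a minimal illustration under that assumption; it is not the method used in the studies cited above, and the function name, the source labels, and all frequencies are hypothetical.

```python
def estimate_admixture_proportion(p_mixed, p_source_a, p_source_b):
    """Least-squares estimate of m in p_mixed = m * p_a + (1 - m) * p_b.

    Each argument is a list of allele frequencies, one per marker.
    The estimate is clipped to the interval [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for pm, pa, pb in zip(p_mixed, p_source_a, p_source_b):
        diff = pa - pb
        numerator += (pm - pb) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("source populations have identical allele frequencies")
    m = numerator / denominator
    return min(1.0, max(0.0, m))


# Hypothetical allele frequencies at four markers in two source populations.
p_a = [0.10, 0.80, 0.30, 0.55]
p_b = [0.60, 0.20, 0.70, 0.05]

# Simulate an admixed population with 25% ancestry from source A.
true_m = 0.25
p_mixed = [true_m * a + (1 - true_m) * b for a, b in zip(p_a, p_b)]

m_hat = estimate_admixture_proportion(p_mixed, p_a, p_b)
print(round(m_hat, 3))
```

Markers whose frequencies differ strongly between the source populations contribute most to the estimate, which is why ancestry-informative markers are preferred in practice.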
3.4. Benefits of sequencing the Arab genome
Given the frequent spread of genetic diseases in Arab countries,
establishing reference genome(s) that reflect the diversity and population
structure of Arab countries will serve as an example for other communi-
ties with comparable population structures and will have many bene-
fits, including, but not limited to, (1) serving as a vital tool for the
identification of novel variants; (2) serving as a baseline for further ge-
nomic epidemiological studies in Arab nations; (3) serving as a useful
foundation for cohort and case-control genetic studies that aim to char-
acterize the genetic etiology of genetic diseases; (4) improving genetic
counseling for individuals with genetic disorders; (5) serving as a plat-
form for future GWAS; (6) advancing translational medicine in the
fields of personalized medicine and pharmacogenomics, allowing med-
ications to be individualized to Arab patients and Arab responses to
drugs to become well understood; (7) allowing the study of inbred
Arab communities, and specifically the Bedouin population, thus serv-
ing as a valuable tool to facilitate the discovery of rare and novel gene
variants and novel genes; this information is very important to better
understand the molecular pathology of complex diseases/traits and is
expected to shed light on other genetic risk factors related to gene-
environment interactions and epistasis as well as many other genetic
risk factors with major importance in genetic disease development,
and (8) serving as a historical tracing tool for population migration.
The ultimate goal of the Arab genome is to create a database of the
DNA variation in the Arab population and to make it available to clini-
cians and researchers in Arab countries who seek to increase the
power of disease prediction, to understand gene-drug interactions, to
study the Arab population substructures, to improve understanding of
the nature of Arab genetic diversity, and to trace population migration.
All of these endeavors will contribute to one major aim, which is to im-
prove patients’ quality of life by improving overall health care and sav-
ing lives. However, translating the outcome of the results of the Arab
genome into effective clinical practice is a challenging task that will re-
quire concerted efforts by both policymakers and scientists to imple-
ment effective strategies in the health care sector and to make funding
available to allow such programs to continue.
4. Conclusion
Arabs are an ideal population for genetic studies, with a diverse genet-
ic structure, ranging from inbred communities to a diverse gene pool
that includes elements from Europe, Asia, and Africa. This feature renders
the Arab population a rich source of information that would be of
global benefit. This emphasizes the value of a consensus Arab genome
reference(s) which will positively impact the future directions of person-
alized medicine. Using genomic sequencing technologies, numerous rare
variants and novel genes have been identified in Arab families, mainly
with consanguineous marriage history. The outcomes of the SHGP and
QGP are soon to be released, which will pave the way for a future consensus
Arab genome reference(s). Therefore, there is an urgent need for data
sharing, both locally and internationally, which dictates the need for the
development of mechanisms and standards to facilitate this sharing.
Disclosure declaration
Hatem Zayed declares no conflict of interest.
References
Abu-Elmagd, M., Assidi, M., Schulten, H.J., Dallol, A., Pushparaj, P., Ahmed, F., Scherer, S.W.,
Al-Qahtani, M., 2015. Individualized medicine enabled by genomics in Saudi Arabia.
BMC Med. Genet. 8 (Suppl. 1), S3.
Abumansour, I.S., Hijazi, H., Alazmi, A., Alzahrani, F., Bashiri, F.A., Hassan, H., Alhaddab, M.,
Alkuraya, F.S., 2015. ARL6IP6, a susceptibility locus for ischemic stroke, is mutated in
a patient with syndromic Cutis Marmorata Telangiectatica Congenita. Hum. Genet.
134, 815–822.
243H. Zayed / Gene 592 (2016) 239–243
Alazami, A.M., Patel, N., Shamseldin, H.E., Anazi, S., Al-Dosari, M.S., Alzahrani, F., Hijazi, H.,
Alshammari, M., Aldahmesh, M.A., Salih, M.A., Faqeih, E., Alhashem, A., Bashiri, F.A.,
Al-Owain, M., Kentab, A.Y., Sogaty, S., Al Tala, S., Temsah, M.-H., Tulbah, M.,
Aljelaify, R.F., Alshahwan, S.A., Seidahmed, M.Z., Alhadid, A.A., Aldhalaan, H.,
AlQallaf, F., Kurdi, W., Alfadhel, M., Babay, Z., Alsogheer, M., Kaya, N., Al-Hassnan,
Z.N., Abdel-Salam, G.M.H., Al-Sannaa, N., Al Mutairi, F., El Khashab, H.Y., Bohlega, S.,
Jia, X., Nguyen, H.C., Hammami, R., Adly, N., Mohamed, J.Y., Abdulwahab, F.,
Ibrahim, N., Naim, E.A., Al-Younes, B., Meyer, B.F., Hashem, M., Shaheen, R., Xiong,
Y., Abouelhoda, M., Aldeeri, A.A., Monies, D.M., Alkuraya, F.S., 2015. Accelerating
novel candidate gene discovery in neurogenetic disorders via whole-exome sequenc-
ing of prescreened multiplex consanguineous families. Cell Rep. 10, 148–161.
Al-Mulla, F., 2014. The locked genomes: a perspective from Arabia. Applied & Translation-
al Genomics 3, 132–133.
Alsmadi, O., Thareja, G., Alkayal, F., Rajagopalan, R., John, S.E., Hebbar, P., Behbehani, K.,
Thanaraj, T.A., 2013. Genetic substructure of Kuwaiti population reveals migration
history. PLoS One 8, e74913.
Alsmadi, O., John, S.E., Thareja, G., Hebbar, P., Antony, D., Behbehani, K., Thanaraj, T.A.,
2014. Genome at juncture of early human migration: a systematic analysis of two
whole genomes and thirteen exomes from Kuwaiti population subgroup of inferred
Saudi Arabian tribe ancestry. PLoS One 9, e99069.
Bener, A., Alali, K.A., 2006. Consanguineous marriage in a newly developed country: the
Qatari population. J. Biosoc. Sci. 38, 239–246.
Bittles, A.H., Black, M.L., 2010. Evolution in health and medicine Sackler colloquium: con-
sanguinity, human evolution, and complex diseases. Proc. Natl. Acad. Sci. U. S. A. 107
(Suppl. 1), 1779–1786.
Buchanan, C.C., Torstenson, E.S., Bush, W.S., Ritchie, M.D., 2012. A comparison of cataloged
variation between International HapMap Consortium and 1000 Genomes Project
data. J. Am. Med. Inform. Assoc. 19, 289–294.
Fernandes, V., Alshamali, F., Alves, M., Costa, M.D., Pereira, J.B., Silva, N.M., Cherni, L.,
Harich, N., Cerny, V., Soares, P., Richards, M.B., Pereira, L., 2012. The Arabian cradle:
mitochondrial relicts of the first steps along the southern route out of Africa. Am.
J. Hum. Genet. 90, 347–355.
Giles, J., 2006. Islam and science: oil rich, science poor. Nature 444, 28.
Henn, B.M., Botigue, L.R., Gravel, S., Wang, W., Brisbin, A., Byrnes, J.K., Fadhlaoui-Zid, K.,
Zalloua, P.A., Moreno-Estrada, A., Bertranpetit, J., Bustamante, C.D., Comas, D., 2012.
Genomic ancestry of North Africans supports back-to-Africa migrations. PLoS Genet.
8, e1002397.
International HapMap Consortium, 2003. The International HapMap Project. Nature 426,
789–796.
International Human Genome Sequencing Consortium, 2004. Finishing the euchromatic
sequence of the human genome. Nature 431, 931–945.
Kopp, G.H., Roos, C., Butynski, T.M., Wildman, D.E., Alagaili, A.N., Groeneveld, L.F., Zinner,
D., 2014. Out of Africa, but how and when? The case of hamadryas baboons (Papio
hamadryas). J. Hum. Evol. 76, 154–164.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar,
K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., et al., 2001. Initial sequencing and
analysis of the human genome. Nature 409, 860–921.
Editorial, 2006. The germinating seed of Arab genomics. Nat. Genet. 38, 851.
Pagani, L., Schiffels, S., Gurdasani, D., Danecek, P., Scally, A., Chen, Y., Xue, Y., Haber, M.,
Ekong, R., Oljira, T., Mekonnen, E., Luiselli, D., Bradman, N., Bekele, E., Zalloua, P.,
Durbin, R., Kivisild, T., Tyler-Smith, C., 2015. Tracing the route of modern humans
out of Africa by using 225 human genome sequences from Ethiopians and
Egyptians. Am. J. Hum. Genet. 96, 986–991.
Rudan, I., Rudan, D., Campbell, H., Carothers, A., Wright, A., Smolej-Narancic, N.,
Janicijevic, B., Jin, L., Chakraborty, R., Deka, R., Rudan, P., 2003. Inbreeding and risk
of late onset complex disease. J. Med. Genet. 40, 925–932.
Shaheen, R., Faqeih, E., Alshammari, M.J., Swaid, A., Al-Gazali, L., Mardawi, E., Ansari, S.,
Sogaty, S., Seidahmed, M.Z., AlMotairi, M.I., Farra, C., Kurdi, W., Al-Rasheed, S.,
Alkuraya, F.S., 2013. Genomic analysis of Meckel-Gruber syndrome in Arabs reveals
marked genetic heterogeneity and novel candidate genes. Eur. J. Hum. Genet. 21,
762–768.
Shaheen, R., Almoisheer, A., Faqeih, E., Babay, Z., Monies, D., Tassan, N., Abouelhoda, M.,
Kurdi, W., Al Mardawi, E., Khalil, M.M., Seidahmed, M.Z., Alnemer, M., Alsahan, N.,
Sogaty, S., Alhashem, A., Singh, A., Goyal, M., Kapoor, S., Alomar, R., Ibrahim, N.,
Alkuraya, F.S., 2015. Identification of a novel MKS locus defined by TMEM107 muta-
tion. Hum. Mol. Genet. 24, 5211–5218.
Shamseldin, H.E., Tulbah, M., Kurdi, W., Nemer, M., Alsahan, N., Al Mardawi, E., Khalifa, O.,
Hashem, A., Kurdi, A., Babay, Z., Bubshait, D.K., Ibrahim, N., Abdulwahab, F., Rahbeeni,
Z., Hashem, M., Alkuraya, F.S., 2015. Identification of embryonic lethal genes in
humans by autozygosity mapping and exome sequencing in consanguineous fami-
lies. Genome Biol. 16, 116.
Tadmouri, G.O., Nair, P., Obeid, T., Al Ali, M.T., Al Khaja, N., Hamamy, H.A., 2009. Consan-
guinity and reproductive health among Arabs. Reprod. Health 6, 17.
Teebi, A.S., Teebi, S.A., 2005. Genetic diversity among the Arabs. Community Genet. 8,
21–26.
Thareja, G., John, S.E., Hebbar, P., Behbehani, K., Thanaraj, T.A., Alsmadi, O., 2015. Sequence
and analysis of a whole genome from Kuwaiti population subgroup of Persian ances-
try. BMC Genomics 16, 92.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O.,
Yandell, M., Evans, C.A., Holt, R.A., Gocayne, J.D., Amanatides, P., Ballew, R.M.,
Huson, D.H., Wortman, J.R., Zhang, Q., Kodira, C.D., Zheng, X.H., Chen, L., Skupski,
M., Subramanian, G., Thomas, P.D., Zhang, J., Gabor Miklos, G.L., Nelson, C., Broder,
S., Clark, A.G., Nadeau, J., McKusick, V.A., Zinder, N., Levine, A.J., Roberts, R.J., Simon,
M., Slayman, C., Hunkapiller, M., Bolanos, R., Delcher, A., Dew, I., Fasulo, D., Flanigan,
M., Florea, L., Halpern, A., Hannenhalli, S., Kravitz, S., Levy, S., Mobarry, C., Reinert,
K., Remington, K., Abu-Threideh, J., Beasley, E., Biddick, K., Bonazzi, V., Brandon, R.,
Cargill, M., Chandramouliswaran, I., Charlab, R., Chaturvedi, K., Deng, Z., Di
Francesco, V., Dunn, P., Eilbeck, K., Evangelista, C., Gabrielian, A.E., Gan, W., Ge, W.,
Gong, F., Gu, Z., Guan, P., Heiman, T.J., Higgins, M.E., Ji, R.R., Ke, Z., Ketchum, K.A.,
Lai, Z., Lei, Y., Li, Z., Li, J., Liang, Y., Lin, X., Lu, F., Merkulov, G.V., Milshina, N., Moore,
H.M., Naik, A.K., Narayan, V.A., Neelam, B., Nusskern, D., Rusch, D.B., Salzberg, S.,
Shao, W., Shue, B., Sun, J., Wang, Z., Wang, A., Wang, X., Wang, J., Wei, M., Wides,
R., Xiao, C., Yan, C., Yao, A., Ye, J., Zhan, M., Zhang, W., Zhang, H., Zhao, Q., Zheng, L.,
Zhong, F., Zhong, W., Zhu, S., Zhao, S., Gilbert, D., Baumhueter, S., Spier, G., Carter,
C., Cravchik, A., Woodage, T., Ali, F., An, H., Awe, A., Baldwin, D., Baden, H., Barnstead,
M., Barrow, I., Beeson, K., Busam, D., Carver, A., Center, A., Cheng, M.L., Curry, L.,
Danaher, S., Davenport, L., Desilets, R., Dietz, S., Dodson, K., Doup, L., Ferriera, S.,
Garg, N., Gluecksmann, A., Hart, B., Haynes, J., Haynes, C., Heiner, C., Hladun, S., Hostin,
D., Houck, J., Howland, T., Ibegwam, C., Johnson, J., Kalush, F., Kline, L., Koduru, S., Love,
A., Mann, F., May, D., McCawley, S., McIntosh, T., McMullen, I., Moy, M., Moy, L., Mur-
phy, B., Nelson, K., Pfannkoch, C., Pratts, E., Puri, V., Qureshi, H., Reardon, M.,
Rodriguez, R., Rogers, Y.H., Romblad, D., Ruhfel, B., Scott, R., Sitter, C., Smallwood,
M., Stewart, E., Strong, R., Suh, E., Thomas, R., Tint, N.N., Tse, S., Vech, C., Wang, G.,
Wetter, J., Williams, S., Williams, M., Windsor, S., Winn-Deen, E., Wolfe, K., Zaveri, J.,
Zaveri, K., Abril, J.F., Guigo, R., Campbell, M.J., Sjolander, K.V., Karlak, B., Kejariwal,
A., Mi, H., Lazareva, B., Hatton, T., Narechania, A., Diemer, K., Muruganujan, A., Guo,
N., Sato, S., Bafna, V., Istrail, S., Lippert, R., Schwartz, R., Walenz, B., Yooseph, S.,
Allen, D., Basu, A., Baxendale, J., Blick, L., Caminha, M., Carnes-Stine, J., Caulk, P.,
Chiang, Y.H., Coyne, M., Dahlke, C., Mays, A., Dombroski, M., Donnelly, M., Ely, D.,
Esparham, S., Fosler, C., Gire, H., Glanowski, S., Glasser, K., Glodek, A., Gorokhov, M.,
Graham, K., Gropman, B., Harris, M., Heil, J., Henderson, S., Hoover, J., Jennings, D.,
Jordan, C., Jordan, J., Kasha, J., Kagan, L., Kraft, C., Levitsky, A., Lewis, M., Liu, X.,
Lopez, J., Ma, D., Majoros, W., McDaniel, J., Murphy, S., Newman, M., Nguyen, T.,
Nguyen, N., Nodell, M., Pan, S., Peck, J., Peterson, M., Rowe, W., Sanders, R., Scott, J.,
Simpson, M., Smith, T., Sprague, A., Stockwell, T., Turner, R., Venter, E., Wang, M.,
Wen, M., Wu, D., Wu, M., Xia, A., Zandieh, A., Zhu, X., 2001. The sequence of the
human genome. Science 291, 1304–1351.
Verge, C.F., Vardi, P., Babu, S., Bao, F., Erlich, H.A., Bugawan, T., Tiosano, D., Yu, L.,
Eisenbarth, G.S., Fain, P.R., 1998. Evidence for oligogenic inheritance of type
1 diabetes in a large Bedouin Arab family. J. Clin. Invest. 102, 1569–1575.
Yavarna, T., Al-Dewik, N., Al-Mureikhi, M., Ali, R., Al-Mesaifri, F., Mahmoud, L., Shahbeck,
N., Lakhani, S., AlMulla, M., Nawaz, Z., Vitazka, P., Alkuraya, F.S., Ben-Omran, T., 2015.
High diagnostic yield of clinical exome sequencing in Middle Eastern patients with
Mendelian disorders. Hum. Genet. 134, 967–980.
Zayed, H., Ouhtit, A., 2016. Accredited genetic testing in the Arab Gulf region: reinventing
the wheel. J. Hum. Genet. http://dx.doi.org/10.1038/jhg.2016.22 (Epub ahead of
print).
The Arab genome: Health and wealth
City and cosmology: genetics, health, and urban living in
Dubai
Aaron Parkhurst
Department of Anthropology, University College London (UCL), London, United Kingdom
ARTICLE HISTORY
Received 28 September 2017
Accepted 9 October 2017
ABSTRACT
In light of increasingly high rates of diabetes, heart disease, and
obesity among citizens of the Arabian Gulf, popular health
discourse in the region has emphasised the emergent Arab genome
as the primary etiological basis of major health conditions.
However, after many years of public dissemination of genomic
knowledge in the region, and widespread acceptance of this
knowledge among Gulf Arab citizens, the rates of chronic illness
continue to increase. This paper briefly explores the clash between
indigenous Islamic knowledge systems and biomedical knowledge
systems imported into the United Arab Emirates. It presents
vignettes collected from interviews and participant observation in
Dubai as part of nearly four years of ethnographic research,
completed as part of the author’s doctoral work on ‘Anxiety and
Identity in Southeast Arabia’. Rather than radically informing health
seeking behaviours among many UAE citizens, the emphasis on the
‘Arab Genome’ has instead reconfirmed the authority of Bedouin
cosmological understandings of disease, reshaping the language
that people use to engage with their bodies and their health. Local
cosmology remains a powerful discursive element that often
operates in contention, in sometimes powerfully subtle ways, with
novel health initiative regimes. For many people in the region,
genomic information, as it is often discussed and propagated in the
UAE, shares an intimate relationship with ideas of fate and national
identity, and sometimes serves to mitigate the increasingly
uncertain terms of engagement that people share between the
body, their health, and rapidly changing urban landscapes.
KEYWORDS
Genetics; medical
anthropology; chronic illness;
fate; urban anthropology
The underlying premise of this article extends from a simple, but profound anthropologi-
cal critique in the practice of biomedicine in different societies. That is, when policy plan-
ners and health professionals try to think through ideas of behaviour change that
accompany much of the discourse on obesity, diabetes, heart disease, and global health in
general, they need to take into account people’s perceptions or ideas of their ability to cre-
ate bodily change for themselves in general. Medical anthropology has long emphasised
the role of cultural landscapes and idiosyncrasies in producing powerful regimes of both
CONTACT Aaron Parkhurst a.parkhurst@ucl.ac.uk
© 2018 Informa UK Limited, trading as Taylor & Francis Group
ANTHROPOLOGY & MEDICINE, 2018
VOL. 25, NO. 1, 68–84
https://doi.org/10.1080/13648470.2017.1398815
http://orcid.org/0000-0002-0762-0929
illness and health, and alarming rates of chronic illness across the globe re-illuminate the
systematic neglect of culture in policy planning and debate (Napier et al. 2014). How is
agency constructed in ‘health seeking behaviour’, and what are the wider social factors
that inform ‘health seeking behaviour’?
This paper is informed by long-term fieldwork in Dubai that focused on these questions
of health seeking behaviours and how they relate to local ideas of fate, agency, and
genes. Further to these ideas, however, Dubai provides a unique context to think through
many forms of chronic illness that become propagated through individual habits and
behaviours. From questions that emerge in my recent inquiries on the human body and
urban environments, this paper explores an anthropological problem presented by the
body in the city, namely, the disruption of the stable relationship between the human
body and the environment. Genetics, as a concept, becomes an explanatory model that
men and women in Southeast Arabia utilise to speak towards this disruption.
The ethnographic data used in this paper was collected as part of nearly four years of
anthropological fieldwork in Dubai and Abu Dhabi, in which I lived and worked as an
anthropologist (February 2007–October 2010). It forms part of a larger body of work on
the relationship between globalisation, chronic illness, and tradition within Southeast Ara-
bia, undertaken as my doctoral research. The research was conducted in many social and
medical spaces, but primarily in participants’ homes, cafés, and other intimate social
spaces. Part of this ethnography was also conducted in clinical settings, involving partici-
pant observation in three mental health institutions (one in Abu Dhabi and two in Dubai),
and two nutrition clinics in Dubai. My anthropological research began as a project study-
ing mental health and the stigmatisation of mental illness in the Emirates, as well as
men’s health issues in the country in general. The current focus on diabetes and genetics
emerged from concerns from both local health authorities and from Emirati lay persons.
During my time in the Emirates researching chronic illness, Emiratis in general spoke
often and openly about their engagement with genetics, and about both their deep love of and
anxiety about the city. These themes comprise the focus of this paper.
The research methodology consisted primarily of participant observation and inter-
views conducted in both Arabic and English. Unless otherwise stated, the dialogue pre-
sented in this paper was conducted in English. Most of the discussions between my
participants and myself were qualitative, open ended engagements, though many inter-
views directed participants to discuss their understandings of genetics, the city, or both.
Participants were recruited in a wide number of contexts: some participated in discussions
as part of formal discussions in clinics; others were recruited through participant observation
in Dubai, and we met in their homes, cafés, or places of work in which I had access
and permission to conduct fieldwork. Still others were part of a support network in my
Arabic education. Most of the participants that inform the ethnography of this paper, and
with whom I became close, were men. This is partly due to the nature of the overarching
research questions on men’s mental and physical health issues in the Emirates, but it is
also due to the social structures of the country. While women participated in general
interviews in public health spaces, I only had ethnographic access to men in more personal
and private social spaces. The participants with whom this paper is concerned are almost
all Emirati citizens living in Dubai, with the exception of some perspectives from Euro-
American health professionals working in the city. Citizenship in the UAE is still
informed by tribal affiliation. Many Emiratis in Dubai and Abu Dhabi are members of
different branches of the Bani-Yas tribe, a large and powerful kinship group that enjoys a
long history in the Arabian Peninsula. However, there are also many who trace their line-
age through other large tribes. Emirati tribal leaders (sheikhs) often draw upon Bedouin
identity in public discourse in Dubai, though the label of Bedouin is rather fluid. While
different families in the Emirates have diverse historical backgrounds and histories that
shape their experience of the developing Emirati cities, this paper draws upon shared
understandings of the body and cosmology that unify the citizens of the Emirates.
The predominant blood sugar disorder discussed in this paper is Diabetes Mellitus Type 2.
This condition is characterised by the inability of the body to respond to insulin properly,
and usually develops in adulthood. There are many risk factors that are known to
contribute to Diabetes Mellitus Type 2, henceforth often referred to in this paper as sim-
ply ‘diabetes’, but most salient in public health narratives are those risk factors that corre-
late diabetes to obesity (Body Mass Index of 30 and higher), personal diets, behaviours,
and habits. Diabetes is well-understood as contributing profoundly to a wide-range of co-
morbidities. Because of its relationship with obesity, they are often discussed in unison by
health officials in Dubai.
The experience of diabetes in Dubai is often explained through narratives of ‘energy’.
Those who have the condition complain of not having any energy to go shopping, or go
to work, and sometimes complain that they do not have the ‘energy’ to go outside, as the
heat of Dubai’s oppressive climate stifles them. This is especially frustrating for those who
are told their condition is tied to inactivity. The experience of diabetes, however, is highly
variable in Dubai, especially as the condition presents itself in increasingly younger indi-
viduals. It often first presents itself as a major problem when people have other ailments
or are treated for other conditions. The experience of the condition remains confusing for
many of the people with whom I spoke, especially for younger individuals (in their late
20s or early 30s). They were aware, and even fearful, of the cardiovascular risks that the
condition informs, and they all had personally known others whose death at an early age
due to cardiovascular disease was informed by diabetes. While they felt the physical effects
of the chronic illness, and indeed, some had been diagnosed after an initial diabetic attack,
their social lives, in their own terms, had yet to be grossly impacted by the disease. As a
result, it was difficult for many people to narrate their current suffering beyond physical
sensation. As I will discuss later, for many the condition was considered with some ambiv-
alence. In this regard, when I spoke with people about the experience of living with diabe-
tes, they often turned the discussion away from their own lives, and instead borrowed
pathology as an opportunity to think through other aspects of their society.
Diabetes, and even obesity in general, is often seen by Emiratis in the UAE as a condi-
tion brought about by modernity. The Arabic term for diabetes in the Emirates is ‘da3 al-
suker’ and translates literally as ‘disease of sugar’. However, the Latin term ‘diabetes’ is
used ubiquitously in both Arabic and English discourse. In this regard, its immediate rela-
tionship to food and drink consumption is disrupted, allowing for more fluid and com-
plex understandings of the origins of the condition. Long-term medical professionals in
the UAE remember and recognise the historical development of blood sugar discourse in
the country. For example, a German physician who had practiced in the country for
20 years explained, ‘There was an idea, and I still come across this, that we [here he refers
to himself, and other Euro-American expatriates] brought some of these conditions with
us. Sometimes people might say ‘you made this problem so you fix it’, and I had no idea
what they were talking about’. The physician later came to understand that his patients
were referring to the idea of Euro-American immigrants as perceived agents of disease, or
at least associating these expatriates with the conditions of change and foreign influence
that bring sickness. ‘My father thinks these things’, a friend explained to me. ‘He thinks
diabetes is a conspiracy from Israel or something like this’. I asked why. ‘Well, people
didn’t have this problem, … nobody used to have Diabetes. Or maybe they had it, I think,
but nobody knew about these things. So they blamed everyone else. And now we know it
is genetic, but even now some people don’t believe that’.
There is great complexity embedded in these ideas. Israel, here, is understood to be in
partnership with American and European governments to subvert Arab society, though
these ideas are not shared by everybody. There is also an attempt to understand how dia-
betes developed so quickly in the rapidly growing city. Other logics concern immigration
as a direct process of pathology. In this regard, diabetes is seen less as something that
develops from habits, and instead is partially socially constructed as something caused by
ambiguous pathogens that accompany immigration. Others see Euro-American expan-
sion as an agent of corruption, if not a direct agent of disease. The complex consequence
of these commercial and social infiltrations on the human body is a trend seen in many
areas of the world, and has been given the moniker ‘cocacolonisation’ (see Leatherman
and Goodman 2005). In the past, diabetes was not known to be a problem, and suddenly,
one day it was. According to the International Diabetes Federation (IDF), during the culmination
of my fieldwork, the Emirates had the second highest rates of diabetes in the
world, behind the small Pacific island nation of Nauru (IDF 2010). This trend remains
strong. Current data from the IDF holds that nearly 1 in 5 adults in the UAE is currently
afflicted with diabetes, and the country’s rates of diabetes are rising faster than both its
neighbours in the Arabian peninsula and in the world at large (IDF 2015). If these rates
continue, the prevalence of diabetes is expected to double within a generation.
My participants do not use the term ‘cocacolonisation’, but they are aware of these
forces of commercial and social intrusions, and they see these processes centred in
the city, namely, Dubai. My friend Ali, for example, spoke often about the problems that
the city posed and the dilemmas it caused for him and his peers. Ali explains, ‘There are
some people who just think it would be better if everyone (foreign) left, and there are
other people who are afraid of what will happen if everybody leaves’. ‘What do you think’,
I ask him. ‘I think like most people we love people to come here and we love to share our
country. But maybe some people are meant to come live here, maybe some people should
only come visit. Smaller is ok too, all these towers… It will be good to slow down, or else
people (locals) will never leave their homes, and the people coming here will be bored,
and they will stop coming… people are becoming very selfish… . [We] do not have to do
much. We need to be better’. At other times, he and his peers would complain about the
fast food that they and their children consumed, or the amount of TV their family
watched, always wildly gesturing to the streets. The city then becomes tied to indigenous
understandings of modernity and disease, and is understood to be mapped upon the
human body.
The body in the city is, in many ways, still a developing subject of analysis
in social science, though it has an emerging collection of thought in a range of disciplines
from geography and anthropology to psychoanalysis. While architectural planning has
throughout centuries borrowed upon human corporeality to understand the form of
streets, buildings, townships and cities (see Vitruvius and da Vinci, for example), philosophers
and artists near the beginning of the last century began to recognise the metropolis
as a new grounding for human culture and corporeality (e.g. Mumford 1934; Metropolis
1927). In a different vein, other thinkers in anthropology and geography conceived of the
body and society as mirrors for each other (Douglas 1966), and the city, specifically, as a
metaphor for the human body in which stable urban landscapes inform cultural under-
standings of the body and identity (Sennett 1994). In this way, space, place and the body
become concretely joined. What Sennett identified is how urban spaces become norma-
tive, seemingly stable lived experiences for those who live within them. Yet, he also shows
how this normative experience of urban-ness belies the reality of the city as a highly unsta-
ble, and profoundly fluid and dynamic space. It is a transformational entity in its own
right that shares an anthropologically reciprocal relationship with the human body: the
city-cum-body is constructed by the body, much as people embody the dynamic forces of
the city (ibid).
In discourse on diabetes, obesity, and heart disease, social scientists have long argued
for a more holistic view of the body in relation to society to think through health seeking
behaviour (see Edwards 2012; Paul 2005; Mendenhall et al. 2010). Specifically, in order to
create changes and shifts in health delivery and demographics, especially in a context
such as London or Dubai, policy planners need to think beyond what a health authority
might be able to issue, and think additionally about the pragmatics and lived experience
of people as they try to move through their daily life. In terms of diet and exercise, this
has implications for public transport, daily commutes, housing prices, and a wide range
of socio-economic policy and practice. In this regard, city politics and urban management
in the US and UK, for example, have informed urban neighbourhood demographics, the
distances between an individual’s work and residence, the pragmatics of daily travel, and
opportunities to create and utilise time for activities beyond income production and
household maintenance. These aspects of quotidian city life are mapped onto the human
body in the form of chronic illness (Church et al. 2011; Cetateanu and Jones 2014; Bur-
goine et al. 2014; Bourgois 2011). The structural limitations of urban living often provide
daunting hurdles to prevention of chronic illness, but there is a psychological aspect to
health behaviour and practice that is sometimes ignored. That is the sense of futility many
people express and experience in thinking through how they might work upon their
bodies.
Diabetes and fate
Obesity and diabetes are made complex in Dubai, as they are medical categories that are
often fraught with ambivalence, and they are not always seen as unhealthy body categories
in the city and country at large. This is certainly not unique to this region of the world (see
Randall 2011, or Popenoe 2003, for example). One of the issues that contributes to high
Body Mass Index and high rates of blood-sugar disorders in Southeast Arabia that is not
discussed in this paper is the perception of these conditions as normative or healthy, and
in the case of obesity, sometimes desired. However, as discussed in the section above, dia-
betes, specifically, is often understood as a condition of modernity, a sudden product of
‘modernisation’. This is evidenced by my participants in a number of ways. One concern
from locals is the idea of Western imperialism as an agent of disease. The widespread idea
of diabetes in the region grew in similar terms to the influx of foreign immigration, prod-
ucts, and ideas. This type of modernity also brought more robust systems of medicaliza-
tion into the country. Very few in the Emirates were diagnosed with diabetes before the
invitation towards foreign development, and so it is rather reasonable to deduce that it is
a ‘Western’ illness category that expatriates brought (and continue to bring) into the
country. This perception is made complicated by discourse that links Western material
and social imports to cultural pollutants, if not direct agents of disease. American
designed fast-food industries, expensive villas, sport-utility vehicles, mass media, and
even increased longevity become objects vacillating between desire and danger. All these
vacillating objects were tied to urbanising processes, and the city is perceived to be the
locus of these goods. In this regard, the desert was often looked upon as a safe haven. As
one of my participants proudly advertised, ‘I make my family go camping to the desert
every month usually because it is the best thing to grow up right… It is like a medicine’.
Though, even then, my friend’s ‘tent’ was fitted with modern amenities. Vacillation, as
theorised by Ghassan Hage (2010),
occurs because we do not always know what we want and we often want contradictory
things… we can say that vacillation is when there are many incompatible things giving mean-
ing to our lives and we find ourselves pursuing them despite their incompatibility. What is
important, though, is that vacillation is not just a movement between various states of being;
rather, it is a state of being in itself. (Hage 2010, 152)
My participants often describe themselves in this way, torn between desires for conflicting
interests and identities. Some defined the city as ‘a place where people don’t know how to
not want things’. The desire for both modernity and tradition, and the perceived futility
of pursuing both, creates conditions of uncertainty that my participants expressed often.
The city becomes a vessel for this uncertainty, and becomes tied to other categories of
ambiguity more closely associated with the body; namely, genetics.
As Kilshaw has demonstrated in her ethnography in Qatar (Kilshaw 2015), the Qatari
state’s dedicated mission to become ‘modern’ borrows significantly on the role of genetics,
but this is often in contention with the way that local Qataris ‘themselves understand and
incorporate genetic knowledge into their lives’ (Kilshaw, this issue). Institutionalised
genetic sequencing and testing programmes speak towards a local desire to bring Qatar
forward as a global leader in healthcare, and they become representative of a ‘modernity’
of which Qatari citizens are very proud. Yet, balancing these desires with traditional
emphasis on inheritance makes genetic dissemination very complex, and in some ways,
ironic (Kilshaw 2015, this issue). In the context of Dubai, the imports described above
bring both comfort and ‘corruption’, and are problematically, though not necessarily
falsely, tied to conditions that are often ethnographically also attributed to genetics, such
as ‘misbehaving children’ (in terms of autism spectrum), depression, and, saliently, diabe-
tes. All these categories are, then, often understood as diseases brought by the West. Some
speak of diabetes as a result of a loss of traditional value and culture or religion. For exam-
ple, I met a participant who insisted that soft drinks, and specifically Coca Cola, were
ruining the health of the city (indirectly invoking the idea of coca-colonisation discussed
above), which is something he and I agreed on to a degree. He asserted, however, that if
locals drank more coffee, as was considered traditional, then the diabetes epidemic could
be annihilated. There may be some medical truth to this, depending on the ways and the
amounts coffee is consumed. However, my participant’s concern was not with the physi-
cal and chemical properties of the drink. The harmful long-term effects of soft drink con-
sumption are not always perceived to stem from the ingredients of the products: sugar,
corn syrup, or, perhaps, colouring compounds. Rather, it is the nationalism of the prod-
uct, and its cultural disruption that is understood to be poison for the human body. ‘Coca-
colonisation’, then, is a useful but limited concept in the region as it directs analysis of
health seeking behaviour away from the individual and places it within wider systems of
structural imbalance. My participants do often recognise that Coca-Cola, as a ‘material’,
leads to diabetes, but this ‘material’ takes on different meaning depending on its source.
In this regard, sugar is good when it is used to make local products, and bad when it is
imposed upon those who fall within Euro-American patterns of consumption.
Parallel to local understandings of foreign influence is an increasingly prevalent public
discourse on genetics. Within popular imagination, there is a widely-held perception of
genetics as diabetic aetiology; that is, genes are largely, if not wholly, responsible for
diabetes. For example, when I was discussing aetiology with one of my participants, I was
speaking about genetic susceptibility for type 2 diabetes, a ‘gene’ for diabetes, and he was
speaking of ‘Al Djinn’, those ambiguous agents of the desert, usually frustratingly amoral,
that are known to influence the world of humans and disrupt human agency. I am careful
to note that he probably does not mean this literally, that genes and Djinn are one and the
same. Or, if he does, it remains speculative. However, in many regions of Southeast Ara-
bia, genes and Djinn, as ambiguous categories of nature and fate, do borrow each other’s
language, if not further synonymy. It is a recognition that the sands and vastness of the
Rub al Khali, the vast desert that lies across the Southeastern Arabian peninsula, and the
human body were both their own cosmologies, populated by cosmological agents that can
affect one’s life and well-being.
In this way, genes have been incorporated into indigenous cosmology. The language
and rhetoric that my participants apply to discourses of fate are often re-appropriated to
help them think through genetics and other biomedical body knowledge. While I do not
have the space in this paper to unpack the complex construction of ‘fate’ itself in Dubai,
my larger ethnography has shown that fate is a language of uncertainty in Dubai, but is
often incommensurable and sometimes even congruous with deep personal agency (Par-
khurst 2014). In thinking through the body in the city, and the body of the future, fate
becomes a rhetoric that is helpful to situate oneself in the conditions of vacillation I have
described above. In relationship to disease, other anthropologists have shown how Islamic
conceptions of fate are better understood as languages for structural imbalance. Sherine
Hamdy’s work in Egypt, for example, shows how fate is invoked by some as a mechanism
to take action and meaning within systems of political failure and structural violence
(Hamdy 2008, 2009). In contrast to traditional perceptions of ‘Islamic fate’ by colonialist
thinkers, my participants often invoked strong sentiments of personal cultivation and cos-
mological futility simultaneously. Because of its place in religion and other systems of
social relations, fate, locally defined as submission to God, is proudly locally owned as a
marker of identity, yet is practiced with ambivalence. Processes of modernity and urbani-
sation as understood by my participants, because of their own ambiguity, and because of
their association with bringing both success and disease, are then placed within this
language of fate. As genes become increasingly understood as carriers of both identity and
disease, they become tied to these languages as well.
The development and dissemination of molecular biological science in laboratory cul-
tures over the last five decades informs the social understandings of genes as the science is
imported into new contexts. Outside of the Middle-East, this trend has provoked wide
philosophical and bioethical debate. In discussing genes with patients, or the public, an
often-overlooked consequence is a lay understanding of genes as destiny. Within the sci-
entific community, this problem has been discussed for decades, asking, in a broader
sense, what it means to say ‘x-gene determines y’. Richard Dawkins has fought against
this type of genetic deterministic understanding, asking, ‘Why are genetic determinants
thought to be any more ineluctable, or blame-absolving, than ‘environmental’ ones?’
(1999, 10–11). There is, arguably, a cultural miscommunication here between the cultures
of laboratories and the general public. For many philosophers, and laypeople, the question
is somewhat teleological; for biologists, the question is statistical (ibid). Nonetheless, the
human body, as it ambiguously weaves through all systems of social relations, blurring
biology and culture, remains a steadfast anthropological problem (Csordas 1994; Scheper-
Hughes and Lock 1987), and genetics, understood as synecdoche for the body, have only
complicated long-running debates on what it means to ‘be in the world’ (Franklin 1995).
Ethnographically speaking, genetic understandings can be strikingly and profoundly
meaningful, and have the potential to elicit powerful change in individual and social iden-
tity (Rabinow 1996). Anthropologists, recognising the need to create new theoretical tools
to think through the ramifications of genomic information in society, have taken on board
this concept of ‘biosociality’ to help understand the role of genes within ethnography
(see Gibbon and Novas 2008). However, they are also critical of instilling too much
power within the gene as definitive instruments of change and control (Rabinow 2008).
Within the clinic, semantics of genes can radically inform patient behaviour, in both
informing aetiology (see Senior et al. 1999 for example), and, in new ways, avoiding aetiol-
ogy (Franklin and Roberts 2006). Within larger debates in anthropology, these disruptions
of nature and culture perhaps provide evidence for the post-modern viewpoint that social
science itself has ill-constructed binaries which it debates and refutes (Latour 1993).
Molecular biology may have a role here as well. New genetic sciences and epigenetic influ-
ences on the body contribute to the development of radically new debates within anthro-
pology on nature and culture (Lock 2013, 2015). However, it is worth noting that, for the
people within the context of this study, the line between the biological and the social has
always been very weakly drawn. The people of Dubai do recognise genes are biological
agents, but they are simultaneously social ones, as I will discuss.
How people construct the notion of fate, or destiny, in relation to genes, is just as deli-
cate as social and biological binaries. The language of genetics, premised on imaginations
of the inevitability of nature, remains an instrument that can invoke a sense of fate, or a
prescription of behaviour. This is further complicated by deeply held values of genetics as
specifications of race, and by extension, ethnicity (Fullwiley 2007). As I have argued else-
where (2014) one implication here is that many geneticists still wantonly operate under
the same formulas for ‘national character’ that social sciences have accused the Oriental-
ists of perpetrating, and that scholars have attempted to weed out of anthropology. In this
way, the semantics of genes are translated outside of the laboratory to the public to give
chemical and organic evidence towards national identity.
In the Emirates, the relationship between genes and national identity often takes com-
plex forms. While there exists a robust local knowledge of the mechanisms of inheritance
and kinship in Southern Arabia that I have not the space to discuss in depth here, genes
as biological entities are not necessarily part of, and not always associated with this inheri-
tance and kinship. Genes are widely known as identity markers independent of kinship.
They are widely known to be carriers of disease, but are not generally understood to con-
tain the essence of, or the benign traits of, a person. The following brief conversation
between myself and two of my participants, a debate on the genetic influences to, say, hair
colour vs. diabetes, illustrates local incommensurability between genes and inheritance. It
began from a popular discussion among my participants – what makes a person beautiful.
(Ali) ‘A woman’s hair comes from her mother, and that is why they are keeping it like this
[silky, and pitch black]’
(Myself) ‘What about diabetes’, I asked, ‘is this something that comes from the mother or
from the father?’
(Ali) ‘No, this one is genetic I think.’
(I continued) ‘Sure, but do you get it from your mother’s side of the family, or does it come
from your father’s side of the family?’
(Ali) ‘No, these ones, these diseases they are genetic.’
(Myself) ‘Fine, but where does it come from?’
(Ali) ‘No, yaani, they do not come from anywhere. I am trying to tell you that. They are not
coming from anywhere.’
(Myself) ‘But if they are genetic, they are inherited from someone!’
(Ali) ‘Yes, but no it does not come from anywhere, yaani, this is why it means it is genetic’
(Myself) ‘Well, what does genetic mean?’
(Ali) ‘It means that you have genes… that it is because you are Arab or maybe like these
people’, he points to a group of Filipinos who were working at the café in which we met.
(Myself) ‘[The Filipinos] are genetic?’, I asked. The two men at the table could see that I was
confused.
(Rahman) ‘Don’t you know that Arabs have these genes and that British people have these
genes and all these peoples have these genes’, one of them yells at me.
(Ali) ‘He means different genes’, his colleague explains.
(Rahman) ‘Yes, yaani, different genes all these people,’ he clarifies.
(Myself) ‘Yes, I understand that, but where do these genes come from?’
(Ali) ‘But they are not coming from anywhere is what I am telling you. They are because they
are these people… .’ His friend interrupts,
(Rahman) ‘We are Arab so we have some of these ones [genes].’
(Myself) ‘Is being Arab genetic then?’, I asked.
(Rahman) ‘Yes, of course, and like you are coming here from England.’
(Myself) ‘Is being English genetic?’
(Rahman) ‘Yes that is what we are trying to tell you.’
(Myself) ‘Ok, so is being Emirati genetic?’ This question seemed to provoke some thinking.
After a short time, they answered.
(Both) ‘No, this one is not genetic, it is coming from who your father is.’
The debate continued for some time. I asked about skin (from the mother), height (from
the father), obesity (genetic), cancer (genetic), eyes (mother), gender (father), and so forth.
I continued these questions with many people throughout my fieldwork, with more or less
the same responses. In terms of pathology, diabetes, cancer, obesity, and both psychotic
and non-psychotic mental illness: these conditions and behaviours were perceived to be
genetic. However, certain types of nationality and general behaviour, and the phenotypic
attributes of appearance were said to originate with parents, in the home, and in the
womb. Ethnicity, as a concept, and as a broad signifier, is often slippery. Being ‘Arab’ or
Chinese, or White-European, in local terms, was discussed as evidenced through genes.
Being Emirati, for example, is inherited, but not genetic, while being Arab, and more spe-
cifically, deriving from Southeastern Arabia at large (Bahrain, Qatar, Emirates, Oman,
possibly Saudi, but not Yemen) is said to be informed through genetics. Beyond biology,
many factors contribute to these designations: Bani-Yas tribal affiliations, ties to desert
and coastal landscapes, concepts of wealth, constructs of purity, and language practices –
to name but a few. While the limits of genetic influence in popular imagination provide
further ethnographic evidence on the nature of agency in kinship and reproductive practi-
ces, the ambiguous coupling between pathology and ethnicity speaks to the constructs of
genes in this paper.
John Avise, in his monograph on the Genetic Gods (2001), extends genetic determinism
to the structural realm of cosmology, attempting to ask and answer questions that are, for
many people, religious. The link between genes and gods can be, Avise argues, a rather
rapid one. Certainly, as invoked in the anecdote above regarding Al Djinn diabetes, there
is evidence for this in my field-site as well. Here, of course, the connection is not made
with ‘Gods’, but it is still made with religious cosmological entities. This synonymy and
parallelism presents an anthropological question: If genes conjure up their own cosmology
within the imagination, then is it reasonable to suggest that an already present and strong
cosmology might inform genetics? In the Arabian Gulf, genetics has found an audience
with which it was unfamiliar. The intentions behind its language are especially vulnerable.
The men and women of the Emirates already have a very robust and complex language of
their own with which they can engage fate. Genetic dissemination was bound, in some
way, to be reworked under these powerful Arabic articulations. There is not space here to
do justice to the diverse and encompassing language of fate in the Emirates, let alone the
Arabic-speaking world at large, and despite the complexity of fatalistic discourse in the
region, modern ethnography conducted in the Arabian Peninsula remains sparse. This
paper in many ways takes the presence of fatalistic language as an ethnographic given,
even if the link between behaviour and discourse is often nebulous and even sometimes
careless (Chaves 2010).
In terms of how genetics and fate are interwoven in Southeast Arabia in general, other
research has provided insight in contexts outside chronic illness. Kilshaw (2015) has analysed
how maternal prospects, marriage and consanguinity highlight genetics and the
management of risk in Qatari communities in and around Doha. Similarly, inherited
blood disorders and genomic testing not only encroach upon marriage practice in Oman,
but become novel signifiers of nationalism, history and identity in a context in which nor-
mative concepts of time and history are politically prescribed (Beaudevin 2013).
The research presented here complements these works. Rather than simply replace the
cultural models of the world that the Bedouin and coastal tribes of the UAE know to be
true, foreign medical and scientific concepts are re-shaped and interpreted through the
languages of the desert, themselves becoming common discursive elements of public
knowledge. In thinking through ‘genes’ as agents of disease, and Djinn as ambiguous spi-
rits of the desert, my participants see congruences. The slippage between Djinn and genes
becomes a powerful metaphor to depict the fallacies inherent in the designs of globaliza-
tion and in the assumptions embedded in Western scientific empiricism and dissemina-
tion. The direct association between these terms is not as important here. What I argue is
that the failure to recognise genetics as its own cosmology can indeed perpetuate suffering.
I have argued that Emirati conceptions of the self and body in relation to nature, spirits
and foreigners are challenged by the promises of globalization and modernity. As people
move through the desert, the coast and the rapidly growing cities, their quest for an elu-
sive notion of modernity ricochets into local systems of destiny, cosmology, agency, body
practices, and kinship, and the languages one uses to articulate the ‘self’ and world are
transformed.
The language of fate is a language in which genetics is often fully embedded. As dis-
cussed, while the epistemology of ‘genetic determinism’ has been a trope borrowed in
both social and biological landscapes in the West, fate is far more culturally owned in
the Gulf, and in much wider social ways than in, say, the UK. Ideas of Islamic fatalism are
often a proudly culturally owned category in the region. As Bourdieu has told us of
the Kabyle, ‘Submission to nature is inseparable from submission to the passage of time
scanned in the rhythms of nature’ (Bourdieu 1963, 57). Bourdieu was attempting to
understand fate, fatalism, or determinism among his Islamic informants in Algeria. In his
writings, his informants understood fate as scanned in the rhythms of nature. This is not
foreign to my participants who invoke a similar symbolic association with nature, and
specifically the moon, the tides, and even genes. However, these terms are variable in
meaning for my participants depending on the context in which fate is invoked. I have
argued elsewhere (2014) that ‘Islamic fatalism’ in the Gulf is often a poor concept. Rather
than finding that people see themselves and their fate as inescapable from the moon or the tide
(as James Frazer (1990) poetically described a century ago), I struggled to find notions of Islamic
fatalism in common practice, and in the reality of people moving through their day. Rather,
nature was a stable stage in which people took comfort. The coast, tides and waves, and
above all, the desert, provided a language of empowerment, that the individual could
effect change in the world, and should indeed do so. Oil provided an index of power that
was granted from nature, and Bedouin reliance on the ever-stable Earth reinforced these
motivations for planning, hurrying, scheming and creating – at least among the Gulf’s
elite. Bourdieu’s Algerian notions of fate and hubris do exist in conversation and song,
but they usually contradict practice.
Chronic illness in Dubai, and specifically diabetes, complicates applications of fate.
Many of my participants, and especially young and middle-aged men, understood diabe-
tes as something within the body that can make one sick, but not as a constant condition
in and of itself. In times of diabetic distress, patients would eagerly seek immediate medi-
cal attention, and then participate in health planning in the few days and weeks following
their distress, though these behaviours would generally revert to old habits. It was
difficult for many participants to imagine themselves being ill in those times in which
they did not feel ill, and indeed felt normal. It is in these contexts that the languages of fate
and genes were simultaneously invoked.
In these ways, I have briefly outlined how genes in the Emirates become simultaneously
tied to pathology, race, ethnicity, and fate. These relationships are made evident in local
discourse in complex ways. Consanguinity, in the Emirates, for example, is increasing.
Studies indicate that slightly over half of Emirati marriages are consanguineous (Al-Gazali
et al. 1997). However, as the local population increases, and as the Emirati population has
more access to education and health services, the rates of consanguinity have increased.
Contrary to patterns in many other parts of the world, in the span of one generation,
research indicates that rates of consanguineous relations have risen another 10%, and the
preferred marriage is between first cousins (ibid.). Studies in the Emirates have attempted
to examine the effects of the trends in these marriage practices on health patterns
(Tadmouri 2009; Abdulrazzaq et al. 1997; Al Gazali 1995). However, new local under-
standings of genetics give novel meanings to inheritance. Paired with traditional ideas of
fate, and tied, again, to anxieties related to an ever-increasingly heterogenous city, genetic
information, for many, may ironically help inform higher rates of consanguinity. Simi-
larly, Kilshaw has collected narratives of women in a similar context in Qatar in which
dialogue between genes, responsibilities towards health, arranged marriages, and familial
obligations are constantly contested and negotiated (2015, this issue).
Chronic illness presents similar challenges. Chronic illness is, by its nature, confusing
as negatively constructed pathology. If disease in general can be discussed through fatalis-
tic terms, chronic illness, for which patients may not recognise or anticipate future symp-
toms, becomes even more of a logical consequence of destiny. Race is constructed in
Dubai as a profoundly positive form of cultural capital, and genes as markers of race are
proudly owned. In other aspects of social engagement, many are very protective of what is
acceptable as informed by genetics. Mental illnesses are often said to be genetic, but sexual
behaviour is not, and my participants become deeply offended at suggestions that sexual
behaviour might be informed by biology. When race, seen as a profoundly positive
social capital, is made parallel to pathology in terms of genetic dissemination, an
individual’s natural approach to their chronic illness often becomes marked by indiffer-
ence, and on some occasions, might even be embraced as a socially owned form of cultural
capital, regardless of the health consequences. As a result, emergent public genetic educa-
tion on the ‘Arab’ genome, designed by health authorities to curb those habits that
encourage and spread chronic illness, is locally embraced as authoritative knowledge,
mirroring the language of fate that local residents have long used to articulate their world.
However, as authoritative as the concept of the gene in the Emirates is, it fails to produce
the behaviour change for which health planners have hoped. Indeed, the opposite effect
has occurred, as the rates of diabetes and obesity continue to climb.
The body and the city
The ‘city’, however, creates a new and very real dilemma for those who inhabit it, and
it has ramifications for the body. In my previous fieldwork, I set off to answer a very broad
question in the Emirates: What happens to identity within indigenous culture when faced
with globalization and modernization on such a rapid course? Dubai, perhaps more than
anywhere else in the world, is well suited to afford opportunities to explore this question.
The city itself became a protagonist, and a type of an anti-heroine. In the years I lived in
the city, I was able to watch megaliths rise from the sand. Countless workers from South
Asia spun webs of steel and scaffolding from dawn until after dusk. Every evening the
towers were half a metre taller. One can drive somewhere in the morning, only to be lost
when the road is wiped away by evening. The city is a fortress against nature, a place
that – even for my participants – could not be, should not be. For many of the residents
of Dubai, the city is an impossible landscape, save for the vision of the sheikhs, and the
blessings of God. Dubai, for many local people, is itself an articulation of their sub-con-
scious, arising from the dreams of their leaders who imagined the wealth of the city as
they stared across what was once a tiny creek babbling along sand and rock. Because of
this perception of Dubai as a materiality of local dream-scape, her betrayal is especially
harsh. Many of the men and women who watched the first cargo come to Jebel Ali Port,
and who remember the first hotels and towers, now feel that the city is designed for every-
one except them. Some people act as if the city has its own agency, and there is a sense of
amorality in its development, but for my participants, who are fiercely loyal to each other,
to the sheikhs, and to Dubai, there is a sense that Dubai has not reciprocated, that at some
point the city began to be disloyal.
The sand, the coast, and even the oil, previously dependable wells of wealth, gifted by
the desert and the sea, that are worthy of their own ethnography (see Limbert 2010), are
no longer the stable entities against which local people can pivot themselves to enact
identity. In this way, the relationship between people and their environment, as I have
witnessed it in Dubai, becomes profoundly disrupted. Local Bedouin and Beni Yas tribal
cosmology has long seen actors subject to the permanence of land and the inevitability of
predictable – if sometimes oppressive – nature, the extreme reality of the desert, and the
moon and tides in which they see the natural symbols of fate. The city, in this sense is pro-
foundly disruptive. Cosmology which has long-depended upon a relationship between
moving bodies and stable Earth fails to cohere in a landscape of rising monoliths, 14-lane
highways, and an influx of cultures and languages from abroad, and, of course, genetic
heterogeneity. Emiratis now compose less than 10 per cent of Dubai’s population. No
longer the flexible bodies against the rock, many people develop a deep anxiety which limits
their ability for action across the gamut of individual and social enterprise. In other
words, the uncertainty of ‘modernity’, whatever ‘modernity’ is, makes thoroughly intoler-
able the complexity of life’s choices.
I have studied how this frustration over uncertainty becomes enacted in local cosmol-
ogy, among the Djinn of the Emirates, who lash out against both the past and the future
(2014). Here, though, there are repercussions for the body, which is, in the Emirates at
least, one of the casualties in the conflict between local identity and the changing urban
landscape. Fate becomes enacted uniquely here. Rapid change radically disrupts people’s
ability to see themselves in the future, and so ‘health seeking behaviour’ becomes desir-
able, but highly directionless. Genes, too, take on further meaning in light of the shaky
ground. As a biological category of both fate and ethnicity, they are relied upon to provide
an anchor to identity when identity is under threat from a newly uncertain world. They
become a cosmology in and of themselves, synonymous with tradition, and their associa-
tion with pathology is forgiven, and even valued as a consequence of fate. Health educa-
tion directed towards managing and preventing chronic illness asks the individual to
imagine one’s body in the future. However, the body, as outlined at the beginning of this
paper, is inextricably intertwined with the urban cosmos, and for many, the unstable,
uncertain city makes this request for vision cognitively exasperating and disheartening.
The city, as I have described, is a site of vacillation for my participants. They have called
it, poetically, the ‘inescapable place of desire’, highlighting their deep frustrations. My par-
ticipants are not usually resentful towards the city. Indeed, they often express deep love
for it along with their exasperation. They do not want the city to collapse, but they are
simultaneously overwhelmed by it. Diabetes and genes become enmeshed in this exasperation,
and many turn to concepts of fate to cope with their precariousness. Genes help
concretise this fate within the human body.
I suggest as a final thought that both chronic illness and anxiety in the Emirates are partly
the result of the ways in which many local people define what modernity means to them.
Perhaps Dubai’s betrayal is that it grew too quickly. Foreigners come to the desert and sift
in and out of memory and landscapes, but it is the local Emirati who are left to make sense
of the shadows of all this movement. Genes and Djinn, germs and fate, SUVs and oil, sky-
scrapers in the city, and the sands of the empty quarter all must be constantly reimagined,
and it can be very arduous work. Emirati citizens value tradition and preservation, and
they do want to preserve the new city. The task at hand is how paradoxically to create tra-
dition and sustainability against a backdrop of something entirely new, but not just new,
from something that has no firm foundation. Emirati locals do by and large know the
steps they need to take for healthier lives, and they are educated on what health-seeking
behaviours will drive communal health, but with both local imagination and local health
structures, they lack novel frameworks in which these behaviours carry deeper meaning.
For my participants, whilst their futures and their city sit upon volatile terrain, they hold
steadfast to cosmologies that help anchor them to the world that they value, and they are
fiercely proud of constructed Arab identifiers that help index their lives as both desert and
urban people. Genes become valued as these identifiers, and are tied to conceptions of
fate. Local people understand pathology when it is presented through genetic discourse,
but in terms of the uncertainty of the city, and the threats the city presents to local iden-
tity, pathology becomes equally tied to fate.
The systems which inform rising rates of obesity and diabetes around the world are
massively complex, and there is a host of social and biological factors that inform these
body categories. My simple existential point is that when suddenly faced with the
intensely myriad choices of the modern world, many people (regardless of nationality,
religion, gender or race) simply, and ironically, cannot make any. This includes choices
on health and habits. It becomes profoundly difficult to consider the future body in a
landscape that wantonly clouds future vision. In the context of Dubai’s rapid urban
growth, residents rely upon structures of cosmology that they hold self-evident to cope
with radical uncertainty, and they apply these cosmologies to the body and to emergent
biomedical categories. In addition to health care education, and policy that addresses
structural violence, in all its many forms, I argue that health-care planning and policy can
still be profoundly informed by local cosmology, and it must take into account how the
human figure pivots itself against a world that is, for many, no longer as sturdy and
dependable as they once had known.
Ethical approval
This paper is derived from research that was conducted with ethics approval from UCL.
Acknowledgments
This paper would not be possible without the participation and help from my informants in the
United Arab Emirates, and I am grateful for the time they have given me. The author would like to
thank the editors of this special edition, Susie Kilshaw, Sahra Gibbon, and Margaret Sleeboom-
Faulkner for their reviews and suggestions that helped develop this paper. The author also thanks
the anonymous reviewers for their helpful comments and edits.
Disclosure statement
No potential conflict of interest was reported by the author.
ORCID
Aaron Parkhurst http://orcid.org/0000-0002-0762-0929
References
Abdulrazzaq, Y. M., A. Bener, L. I. Al-Gazali, A. I. Al-Khayat, R. Micallef, and T. Gaber. 1997. “A
Study of Possible Deleterious Effects of Consanguinity.” Clinical Genetics 51: 167–173.
Al Gazali, L. I., A. Bener, Y. M. Abdulrazzaq, R. Micallef, A. I. Al-Khayat, and T. Gaber. 1997.
“Consanguineous Marriages in the United Arab Emirates.” Journal of Biosocial Science 29 (4):
491–497.
Al-Gazali, L. I., A. H. Dawodu, K. Sabarinathan, and M. Varghese. 1995. “The Profile of Major
Congenital Abnormalities in the United Arab Emirates (UAE) Population.” Journal of Medical
Genetics 32: 7–13.
Avise, John C. 2001. The Genetic Gods: Evolution and Belief in Human Affairs. Boston, MA:
Harvard University Press. (first published 1998).
Beaudevin, Claire. 2013. “Old Diseases & Contemporary Crisis. Inherited Blood Disorders in
Oman.” Anthropology & Medicine 20 (2): 175–189.
Bourdieu, Pierre. 1963. “The Attitude of the Algerian Peasant Toward Time.” In Mediterranean
Countrymen: Essays in the Social Anthropology of the Mediterranean, edited by J. Pitt-Rivers, 55–
72. Paris: Mouton.
Bourgois, P. 2011. “Lumpen Abuse: The Human Rights Cost of Righteous Neoliberalism.” City and
Society 23 (1): 2–12.
Burgoine, T., N. G. Forouhi, S. J. Griffin, N. J. Wareham, and P. Monsivais. 2014. “Associations
Between Exposure to Takeaway Food Outlets, Takeaway Food Consumption, and Body Weight
in Cambridgeshire, UK: Population Based, Cross Sectional Study.” British Medical Journal 348:
g1464. doi:10.1136/bmj.g1464.
Cetateanua, A., and A. Jones. 2014. “Understanding the Relationship Between Food Environments,
Deprivation and Childhood Overweight and Obesity: Evidence from a Cross Sectional England-
Wide Study.” Health & Place 27: 68–76.
Chaves, Mark. 2010. “SSSR Presidential Address: Rain Dances in the Dry Season: Overcoming the
Religious Congruence Fallacy.” Journal for the Scientific Study of Religion 49 (1): 1–14.
Church, T. S., D. M. Thomas, C. Tudor-Locke, P. T. Katzmarzyk, C. P. Earnest, R. Q. Rodarte, C. K.
Martin, S. N. Blair, and C. Bouchard. 2011. “Trends Over 5 Decades in U.S. Occupation-Related
Physical Activity and Their Associations with Obesity.” PLoS ONE 6(5): e19657.
Csordas, Thomas J. 1994. Embodiment and Experience: The Existential Ground of Culture and Self.
Cambridge, UK: Cambridge University Press.
Dawkins, Richard. 1999. The Extended Phenotype: The Long Reach of the Gene. Oxford, UK: Oxford
University Press.
Douglas, Mary. 1966. Purity and Danger. London: Routledge.
Edwards, N. 2012. “Taking Action on Health Inequities: Essential Contributions by Qualitative
Researchers.” International Journal of Qualitative Methods 11: 61–63.
Farmer, Paul. 2005. Pathologies of Power: Health, Human Rights, and The New War on The Poor.
Berkeley: University of California Press.
Franklin, Sarah. 1995. “Science as Culture, Cultures of Science.” Annual Review of Anthropology 24:
163–184.
Franklin, Sarah, and Roberts, Celia. 2006. An Ethnography of Preimplantation Genetic Diagnosis.
Princeton, NJ: Princeton University Press
Frazer, J. G. 1990. “The Golden Bough.” In The Golden Bough, 701–711. London, UK: Palgrave
Macmillan.
Fullwiley, Duana. 2007. “Race and Genetics: Attempts to Define the Relationship.” Biosocieties
2 (02): 221–237.
Gibbon, Sahra, and Novas, Carlos. 2008. Biosocialities, Genetics and the Social Sciences: Making
Biologies and Identities. London: Routledge.
Hage, Ghassan. 2010. “Hating Israel in the Field.” In Emotions in the Field, edited by James Davies
and Dimitrina Spencer, 129–154. Palo Alto, CA: Stanford University Press.
Hamdy, Sherine F. 2008. “When the State and Your Kidneys Fail: Political Etiologies in an Egyptian
Dialysis Ward.” American Ethnologist 35 (4): 553–569.
Hamdy, Sherine F. 2009. “Islam, Fatalism, and Medical Intervention: Lessons from Egypt on the
Cultivation of Forbearance (Sabr) and Reliance on God (Tawakkul).” Anthropological Quarterly
82 (1): 173–196.
International Diabetes Federation. 2010. IDF Diabetes Atlas. 5th ed. Accessed 04 April 2017. http://
www.diabetesatlas.org/resources/previous-editions.html, http://www.allcountries.org/ranks/dia
betes_prevalence_country_ranks.html
International Diabetes Federation. 2015. IDF Diabetes Atlas. 7th ed. Accessed 04 April 2017. http://
www.diabetesatlas.org/resources/previous-editions.html
Kilshaw, S., T. Al Raisi, and F. Alshaban. 2015. “Arranging Marriage; Negotiating Risk: Genetics
and Society in Qatar.” Anthropology & Medicine 22 (2): 98–113.
Latour, Bruno. 1993. We Have Never Been Modern. Translated by C. Porter. London: Harvester
Wheatsheaf.
Leatherman, Thomas L., and Alan Goodman. 2005. “Coca-Colonization of Diets in the Yucatan.”
Social Science & Medicine (The Social Production of Health: Critical Contributions from
Evolutionary, Biological and Cultural Anthropology: Papers in Memory of Arthur J. Rubel)
61 (4): 833–846. doi:10.1016/j.socscimed.2004.08.047.
Limbert, Mandana. 2010. In the Time of Oil: Piety, Memory, and Social Life in an Omani Town.
Palo Alto, CA: Stanford University Press.
Lock, Margaret. 2013. “The Epigenome and Nature/Nurture Reunification: A Challenge for
Anthropology.” Medical Anthropology 32 (4): 291–308.
Lock, Margaret. 2015. “Comprehending the Body in the Era of the Epigenome.” Current Anthropol-
ogy 56 (2): 151–177.
Mendenhall, E., R. A. Seligman, A. Fernandez, and E. A. Jacobs. 2010. “Speaking Through Diabetes:
Rethinking the Significance of Lay Discourses on Diabetes.” Medical Anthropology Quarterly
24 (2): 220–239.
“Metropolis”. 1927. Dir. Fritz Lang [Film]. Germany: Universum Film AG.
Mumford, Lewis. 1934. Technics and Civilization. New York: Harcourt, Brace & Company.
Napier, A. David, Clyde Ancarno, Beverley Butler, Joseph Calabrese, Angel Chater, Helen Chatter-
jee, François Guesnet, et al. 2014. “Culture and Health.” The Lancet 384 (9954): 1607–1639.
Parkhurst, A. L. 2014. “Genes and Djinn: Identity and Anxiety in Southeast Arabia.” PhD diss, UCL
(University College London).
Popenoe, Rebecca. 2003. Feeding Desire: Fatness, Beauty and Sexuality Among a Saharan People:
Fatness and Beauty in the Sahara. London: Routledge
Rabinow, P. 1996. Artificiality and Enlightenment: From Sociobiology to Biosociality. Essays on the
Anthropology of Reason. Princeton, NJ: Princeton University Press.
Rabinow, Paul. 2008. “Afterword. Concept Work.” In Biosocialities, Genomics and the Social Scien-
ces; Making Biologies and Identities, edited by S. Gibbon and C. Novas, 188–193. London:
Routledge.
Randall, S. C. 2011. “Fat and Fertility, Mobility and Slaves: Long-Term Perspectives on Tuareg Obe-
sity and Reproduction.” In Fatness and the Maternal Body: Women’s Experiences of Corporeality
and the Shaping of Social Policy, edited by M. Unnithan-Kumar and S. Tremayne, 43–70.
Oxford: Berghahn.
Scheper-Hughes, Nancy, and Margaret M. Lock. 1987. “The Mindful Body: A Prolegomenon to
Future Work in Medical Anthropology.” Medical Anthropology Quarterly 1 (1): 6–41.
Senior, V., T. M. Marteau, and T. J. Peters. 1999. “Will Genetic Testing for Predisposition for Dis-
ease Result in Fatalism? A Qualitative Study of Parents Responses to Neonatal Screening for
Familial Hypercholesterolaemia.” Social Science & Medicine 48 (12): 1857–1860.
Sennett, Richard. 1994. Flesh and Stone: The Body and the City in Western Civilization. New York:
W.W. Norton
Tadmouri, Ghazi O., Pratibha Nair, Tasneem Obeid, Mahmoud T. Al Ali, Najib Al Khaja, and
Hanan A. Hamamy. 2009. “Consanguinity and Reproductive Health Among Arabs.” Reproduc-
tive Health 6: 17. doi:10.1186/1742-4755-6-17.
Copyright of Anthropology & Medicine is the property of Routledge and its content may not
be copied or emailed to multiple sites or posted to a listserv without the copyright holder’s
express written permission. However, users may print, download, or email articles for
individual use.
Human Diversity in Jordan: Polymorphic Alu Insertions in General Jordanian and Bedouin Groups
Daniela Zanetti,1 May Sadiq,2 Robert Carreras-Torres,1 Omar Khabour,3
Almuthanna Alkaraki,2 Esther Esteban,1 Marc Via,4 and Pedro Moral1*
1Department of Animal Biology-Anthropology, University of Barcelona, Barcelona, Spain.
2Department of Biological Sciences, Yarmouk University, Irbid, Jordan.
3Department of Medical Laboratory Sciences, Jordan University of Science and Technology, Irbid, Jordan, and Department of Biology,
Faculty of Science, Taibah University, Saudi Arabia.
4Department of Psychiatry and Clinical Psychobiology, University of Barcelona, Barcelona, Spain.
*Correspondence to: Pedro Moral, Biodiversity Research Institute, Department of Animal Biology-Anthropology, University of
Barcelona, Avenida Diagonal no. 643, 08028 Barcelona, Spain. E-mail: pmoral@ub.edu.
Human Biology, Spring 2014, v. 86, no. 2, pp. 131–138. Copyright © 2014 Wayne State University Press, Detroit, Michigan 48201
KEY WORDS: Alu insertion polymorphisms, Jordan, Bedouins, population genetics.
Abstract
Jordan, located in the Levant region, is an area crucial for the investigation of human migration between
Africa and Eurasia. However, the genetic history of Jordanians has yet to be clarified, including the
origin of the Bedouins today resident in Jordan. Here, we provide new genetic data on autosomal
independent markers in two Jordanian population samples (Bedouins and the general population) to
begin to examine the genetic diversity inside this country and to provide new information about the
genetic position of these populations in the context of the Mediterranean and Middle East area. The
markers analyzed were 18 Alu polymorphic insertions characterized by their identity by descent, known
ancestral state (lack of insertion), and apparent selective neutrality. The results indicate significant
genetic differences between Bedouins and general Jordanians (p = 0.038). Whereas Bedouins show a
close genetic proximity to North Africans, general Jordanians appear genetically more similar to other
Middle East populations. In general, these data are consistent with the hypothesis that Bedouins had an
important role in the peopling of Jordan and constitute the original substrate of the current population.
However, migration into Jordan in recent years likely has contributed to the diversity among current
Jordanian population groups.
The State of Jordan emerged in 1946 as the Hashemite Kingdom of Transjordan when Britain and France divided the Middle East
after World War II. Since 1948 it has officially been
known as the Hashemite Kingdom of Jordan. Jor-
dan is a predominantly Arab nation, whose capital
and largest city is Amman. It is located on the East
Bank of the Jordan River and the Dead Sea and
borders Palestine and Israel states to the west, Syria
to the north, Saudi Arabia to the south and east, and
Iraq to the northeast.
Because of its position in the Levant region,
Jordan represents one of the major pathways for
human movement. Since antiquity, traders tra-
versed this area carrying products from the lands
of the Indian Ocean basin to Syria, to be distributed
from there to other parts of the Mediterranean
world. Jordan was a crossroads for people from
all over what is known today as the Middle East.
Because of its strategic position connecting Asia,
Africa, and Europe in the ancient world, Jordan
was a major transit zone and thus an object of
132 ■ Zanetti et al.
contention among the rival empires of ancient
Persians, Macedonian Greeks, and many others
(Salibi 1998).
Current inhabitants of Jordan are mostly Arab
descendants of Transjordan or Palestine, and Bed-
ouins, part of a predominantly desert-dwelling
Arabian ethnic group traditionally divided into
tribes. Historically, the inhabitants of this desert,
which spreads northward into Syria, eastward
into Iraq, and southward into Saudi Arabia, were
Bedouin pastoralists (Salibi 1998). Today around
98% of the 7.9 million Jordanians are of Arab
origin, along with other small minorities such as
Circassians (1%) and Armenians (1%). Culturally,
the official language is Arabic; in terms of religion,
over 92% of the people are Sunni Muslims, around
6% are Christians (mostly Greek Orthodox, but
some Greek and Roman Catholics, Syrian Ortho-
dox, Coptic Orthodox, Armenian Orthodox, and
Protestant denominations), and the remaining 2%
are Shia Muslim and Druze populations (Central
Intelligence Agency 2013–2014).
Historically, the term “Bedouin” has denoted
both a nomadic way of life and a group identity.
Bedouins were the original settlers in the Middle
East. From the Arabian Peninsula, their original
home, they spread out and now live in desert
regions of all the countries between the Arabian
Gulf and the Atlantic. The Arab conquest of North
Africa in the seventh century AD caused a wide
dispersion, such that today the Arab culture is
extended over North Africa and beyond.
The availability of historical and ethnical in-
formation about Jordanian peoples (Salibi 1998)
contrasts with the lack of information about
the genetic background of these groups. As far
as we know, previous genetic information about
Jordanian populations includes two studies on
uniparental markers analyzed in Bedouins and
general Jordanians (Flores et al. 2005; González et
al. 2008) and a survey of a reduced number of Alu
insertions, fewer than those analyzed in this study,
in a sample of the general Jordanian population
(Bahri et al. 2011). Variation in the uniparental
markers (Y-chromosome and mitochondrial DNA)
underlines the genetic outlier position of Bedouins,
whereas general Jordanians are relatively close to
the neighboring Middle East groups.
To provide new insight from autosomal gene
variation about the distinctiveness of Bedouins
suggested by uniparental markers, this study genotyped
18 autosomal Alu insertions in two different
Jordanian samples: one of individuals of Bedouin
origin and the other considered representative
of the general Jordanian population. The main
objective was to test whether autosomal markers
confirm the previous population differentiation
within Jordan revealed by uniparental markers. The
secondary objectives were to determine the degree
of genetic heterogeneity in Jordan, the genetic
position of Bedouins and general Jordanians in
the general context of the Mediterranean and the
Middle East areas, and to provide new data about
the potential influence of Bedouins, as representa-
tives of Arab origins, in North Africa.
In this study 18 Alu insertion markers were se-
lected because they are a useful tool for population
studies on the basis of their identity by descent,
known ancestral state, and selective neutrality
(Cordaux et al. 2006; Cordaux and Batzer 2009).
The potential usefulness of specific Alu loci as
ancestry-informative markers has been explored
to detect differences between populations and to
estimate biogeographical ancestry (Luizon et al.
2007). Polymorphic Alu insertions have also been
used in several studies tackling many historical
and demographical questions (González-Pérez et
al. 2010; Terreros et al. 2009).
Materials and Methods
Samples and Markers
A total of 96 blood samples from healthy unrelated
individuals of both sexes, collected from different
regions of the north, center, and south of Jordan,
were classified into two groups: Bedouins (n =
43) and general Jordanians (n = 53). Collection,
classification, and DNA isolation of all samples
were carried out by researchers at Yarmouk Uni-
versity. All participants were selected because their
relatives were born in Jordan for at least three gen-
erations. The general Jordanian group was mostly
sampled in Jordanian cities, such as Amman and
Irbid. The Bedouin samples were collected from
the Badia desert in collaboration with the Jordan
Badia Research and Development Center. These
samples were classified according to the town
or village in which the subject and the subject’s
parents and grandparents were born, as well as
Polymorphic Alu Insertions in Jordanian and Bedouin Groups ■ 133
the last names of the families and the tribes they
belong to. All subjects signed an informed consent,
and the study was approved by the ethical commit-
tees of the University of Barcelona and Yarmouk
University. The protocols and procedures used in
this research were in compliance with the Declara-
tion of Helsinki.
Genomic DNA was extracted from blood cells
using a Blood DNA Midi Kit (Omega Bio-Tek,
Norcross, GA) according to the manufacturer’s
procedure. Eighteen human-specific Alu polymor-
phic elements (A25, ACE, APOA1, B65, CD4, D1,
DM, FXIIIB, HS2.43, HS4.32, HS4.69, PV92, Sb19.12,
Sb19.3, TPA25, Ya5NBC221, Yb8NBC120, and
Yb8NBC125) located on 10 different chromosomes
(Chr 1, 3, 8, 11, 12, 16, 17, 19, 21, and 22) were typed
by PCR amplification and electrophoretic analysis.
Primers and amplification conditions have been
previously described (Batzer and Deininger 1991;
González-Pérez et al. 2010; Stoneking et al. 1997).
Positive and negative controls for the polymor-
phisms examined were included in all PCR runs.
Statistical Analyses
Standard human population genetic parameters
were obtained. Allele frequencies were estimated
by direct counting. Hardy–Weinberg equilibrium
was assessed by an exact test based on the Markov
chain method (Guo and Thompson 1992) using Ge-
nepop, version 4.2 (Rousset 2008). Heterozygosity
values by locus and population according to Nei’s
formula (Saitou and Nei 1987) were calculated using
Genetix version 4.05 (Belkhir et al. 1996–2004). Dif-
ferences in allele frequency distribution between
the two Jordanian samples and, in general, between
all pairs of populations were assessed by an exact
test based on Fisher’s exact probability test using
the Genepop software.
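The direct-counting and gene-diversity estimates described above can be sketched in Python. This is a minimal illustration, not the Genetix implementation: the function names and the genotype counts in the usage note are hypothetical, and the small-sample correction 2N/(2N - 1) assumes that N diploid individuals (2N gene copies) were typed at the locus.

```python
def insertion_frequency(n_ins_ins, n_ins_abs, n_abs_abs):
    """Insertion allele frequency by direct gene counting.

    Arguments are genotype counts: insertion homozygotes,
    heterozygotes, and homozygotes lacking the insertion.
    """
    n_individuals = n_ins_ins + n_ins_abs + n_abs_abs
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    # Each homozygote carries two insertion copies, each heterozygote one.
    return (2 * n_ins_ins + n_ins_abs) / (2 * n_individuals)


def unbiased_gene_diversity(p_insertion, n_individuals):
    """Nei's unbiased expected heterozygosity for a biallelic locus.

    Assumes n_individuals diploid individuals, i.e. 2N gene copies.
    """
    gene_copies = 2 * n_individuals
    if gene_copies < 2:
        raise ValueError("at least one individual is required")
    sum_sq = p_insertion ** 2 + (1.0 - p_insertion) ** 2
    return gene_copies / (gene_copies - 1) * (1.0 - sum_sq)
```

With hypothetical genotype counts of 10, 12, and 3, `insertion_frequency` gives 0.64, and `unbiased_gene_diversity(0.64, 25)` gives about 0.470, the values listed for DM in Bedouins in Table 1.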
Genetic distances (Reynolds’s distance) and hi-
erarchical analyses of molecular variance (AMOVA)
were estimated using Phylip, version 3.69 (Tuimala
2006), and Arlequin, version 3.5 (Excoffier et al.
2005). Genetic relationships among populations
were assessed by a principal component (PC) plot
using the FactoMineR package of R (Josse 2008).
Comparisons with Published Data Sets
To evaluate the genetic position of Bedouins and
general Jordanians in the Mediterranean and the
Middle East areas, two comparative analyses were
carried out, based on population data available in
the literature. The main analysis focused on the
whole Mediterranean area using 18 polymorphic
Alu insertions in 16 populations, as indicated in
Figure 1. These populations comprised three Span-
ish regions (southern Spain: Andalusia; northern
Spain: Asturias; central Spain: Sierra de Gredos),
southern France (Toulouse), Turkey (Anatolia
Peninsula), Greece (Attica region), five Mediter-
ranean islands (Sardinia, Corsica, Sicily, Crete, and
Minorca), and five Berber groups from Morocco,
Algeria, and Egypt. The Moroccan samples came
FIGURE 1. Geographic location of the populations analyzed in the study: populations analyzed using 18 Alu (circles) and
populations analyzed using the only eight Alu insertion polymorphisms available in the literature (crosses). 1: Amizmiz Berbers
(AMBE), 2: Middle Atlas Berbers (MABE), 3: Northeast Moroccan Berbers (NEBE), 4: Southern Spain, 5: Central Spain, 6:
Northern Spain, 7: France, 8: Corsica, 9: Sicily, 10: Greece, 11: Crete, 12: Turkey, 13: Syria, 14: Iran, 15: United Arab Emirates, 16:
Bahrain, 17: Cyprus, 18: Siwa Berbers (Siwa), 19: Mzab Berbers (Mzab), 20: Sardinia, 21: Menorca.
from High Atlas (Amizmiz Berbers), Middle Atlas
(Berbers from the Khenifra region), and northeast
Moroccan Berbers (Bouhria area). Other Berber
samples were Mzab from Algeria and Siwi from the
Siwa Oasis in Egypt (González-Pérez et al. 2007,
2010).
To obtain a geographically more comprehensive
data set in the Middle East, a second comparative
analysis adding samples from Iran, Cyprus, United
Arab Emirates, Syria, and Bahrain was performed.
This analysis was based on data from only eight Alu
markers available in the literature (Bahri et al. 2013;
González-Pérez et al. 2010; Romualdi et al. 2002;
Stoneking et al. 1997).
Results
Alu insertion frequencies and gene diversities in
Bedouins and general Jordanians are shown in
Table 1. The highest insertion frequencies corre-
spond to the Ya5NBC221 locus in Bedouins (0.941)
and to the APOA1 locus in general Jordanians
(0.950); the lowest frequency values are found
in the HS2.43 locus (0 in Bedouins and 0.08 in
general Jordanians). As expected, the lowest gene
diversity values correspond to loci showing ex-
treme allele frequencies: Ya5NBC221 (H = 0.112) in
Bedouins, APOA1 (H = 0.096) in general Jordanians,
and HS2.43 in both Bedouins (H = 0) and general
Jordanians (H = 0.149). The highest diversity values
corresponding to loci with frequencies close to 0.5
were B65 and TPA25 (H = 0.506) in Bedouins and
TPA25 (H = 0.500) in general Jordanians.
The test for Hardy-Weinberg equilibrium,
after Bonferroni correction, indicates significant
deviations only for D1 (p < 0.001) and FXIIIB (p <
0.001) in general Jordanians. Chance is the most
likely explanation for this departure because there
is no particular reason to expect a Hardy-Weinberg
deviation for these markers, and the deviations are
not shared by the two population samples.
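For a biallelic marker such as an Alu insertion, the exact Hardy-Weinberg probability can be obtained by enumerating every possible heterozygote count rather than by the Markov chain sampling that Genepop uses. The sketch below is that swapped-in enumeration approach, not Genepop's algorithm; the function names are hypothetical, and the number of tests passed to the Bonferroni helper (for example 18 loci in 2 samples) is an assumption, since the text does not state it.

```python
from math import exp, lgamma, log


def hw_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact Hardy-Weinberg p-value for a biallelic locus.

    Sums the probabilities of all heterozygote counts that are no more
    probable than the observed one, conditional on the allele counts.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (lgamma(n + 1) - lgamma(hom_a + 1) - lgamma(het + 1)
                - lgamma(hom_b + 1) + het * log(2)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    observed = log_prob(n_ab)
    total = 0.0
    # Heterozygote counts must share the parity of the allele counts.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        lp = log_prob(het)
        if lp <= observed + 1e-9:
            total += exp(lp)
    return min(total, 1.0)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests
```

A locus would then be flagged only when `hw_exact_p(...)` falls below `bonferroni_threshold(0.05, n_tests)`.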
Comparison of the two Jordanian samples
shows that the average gene diversity in general
Jordanians (0.366 ± 0.142) is only slightly higher
than in Bedouins (0.349 ± 0.146). In general,
Table 1. Alu Insertion Frequencies, Gene Diversities, and p-Values of Hardy-Weinberg (H-W) Equilibrium in Bedouins and General Jordanians

Locus       Bedouin                                General_Jordan                         Frequency Range
            N   Insertion  Heterozygosity  H-W     N   Insertion  Heterozygosity  H-W     Low                     High
DM          25  0.640      0.470           0.187   37  0.405      0.489           0.048   Siwa (0.356)            Sicily (0.674)
HS4.69      42  0.452      0.501           0.530   50  0.440      0.498           0.011   Mzab (0.287)            Bedouin (0.452)
HS4.32      38  0.776      0.352           0.059   51  0.824      0.294           0.638   Central Spain (0.493)   General_Jordan (0.824)
Ya5NBC221   34  0.941      0.112           1.000   41  0.939      0.116           0.121   Southern Spain (0.725)  Northern Spain (0.978)
Sb19.3      42  0.750      0.380           1.000   53  0.755      0.374           0.259   AMBE (0.613)            Sardinia (0.945)
HS2.43      38  0.000      0.000           <0.001  50  0.080      0.149           0.261   Bedouin (0)             Sardinia (0.171)
Sb19.12     43  0.267      0.396           0.133   53  0.274      0.401           1.000   Mzab (0.135)            Central Spain (0.4)
B65         40  0.500      0.506           0.536   48  0.563      0.497           0.140   Siwa (0.150)            Crete (0.647)
Yb8NBC120   33  0.394      0.485           0.270   43  0.430      0.496           1.000   Siwa (0.023)            AMBE (0.569)
Yb8NBC125   41  0.134      0.235           1.000   53  0.226      0.354           0.048   Siwa (0.065)            General_Jordan (0.226)
PV92        27  0.241      0.373           0.613   35  0.143      0.248           0.526   Sicily (0.079)          MABE (0.368)
D1          39  0.385      0.479           0.005   51  0.412      0.489           <0.001  United Arab Emirates (0.08)  Sicily (0.474)
FXIIIB      43  0.302      0.427           1.000   52  0.298      0.423           <0.001  Iran (0.214)            Turkey (0.584)
A25         43  0.105      0.190           0.372   53  0.132      0.231           0.575   Syria (0)               Central Spain (0.175)
CD4         37  0.797      0.328           0.616   43  0.663      0.452           0.041   Crete (0.593)           Bedouin (0.797)
TPA25       38  0.487      0.506           0.204   49  0.551      0.500           0.251   Siwa (0.317)            NEBE (0.661)
APOA1       38  0.868      0.232           0.098   50  0.950      0.096           0.100   Siwa (0.84)             France (0.981)
ACE         42  0.202      0.327           0.657   53  0.387      0.479           0.772   Bedouin (0.202)         Central Spain (0.467)
Average heterozygosity: Bedouin 0.349 ± 0.146; General_Jordan 0.366 ± 0.142

Abbreviations: N: number of individuals typed; AMBE: Amizmiz Berbers, MABE: Middle Atlas Berbers, NEBE: Northeast Moroccan Berbers, MZAB: Mzab Berbers. Variation ranges are given according to data from reviewed literature for populations represented in Figure 1.
the Jordanian frequencies and gene diversities
show values within the variation range of other
Mediterranean populations. Extreme values
were found only for HS2.43 and ACE in Bedouins,
corresponding to the lowest frequencies in
the literature reviewed, and for HS4.69 and CD4 in
Bedouins and HS4.32 and Yb8NBC125 in general
Jordanians, which are the highest values in the
literature reviewed. Allele frequency comparisons
show significant differences across all 18 loci (p
= 0.038; 36 df) between Bedouins and general
Jordanians. Locus-by-locus comparisons indicate
significant differences for DM (p = 0.015), HS2.43
(p = 0.01), and ACE (p = 0.005) markers.
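The 36 degrees of freedom reported for the comparison across 18 loci are what Fisher's method for combining independent per-locus probabilities would give (a chi-square statistic with 2 df per locus). The sketch below shows that combination under the assumption that this is how the global test was obtained; the function name is hypothetical, and because only three of the 18 per-locus p-values are reported, any full input list would have to come from the original analysis.

```python
from math import exp, log


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (chi_square, degrees_of_freedom, combined_p).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    chi_square = -2.0 * sum(log(p) for p in p_values)
    df = 2 * k
    # For even df the chi-square upper tail has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = chi_square / 2.0
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half / i
        series += term
    return chi_square, df, exp(-half) * series
```

Passing 18 per-locus p-values returns `df == 36`, the value quoted in the text.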
Concerning population relationships, the PC
analysis based on the whole set of Alu insertion
polymorphisms in 16 populations indicates that
the two first axes account for 49.31% of the total
genetic variance (Figure 2). The first axis (33.76%
of the total variance) clusters Bedouins along with
North African samples with a certain separation
from the rest. Within this group, the Siwa Oasis
sample appears in the most distant position. The
second component underlines the separation
of the Western Mediterranean samples (central
Spain, France, north of Spain, Corsica, and Sicily)
from Eastern Mediterranean groups (Greece, Tur-
key, Crete) and general Jordan. When the analysis
was repeated to remove the effect of the Siwa Oasis
sample (data not shown), the observed pattern was
substantially the same. Population relationships
within Jordan indicate that the Bedouins, closer
to North Africans, show an intermediate position
between these populations and Eastern Mediter-
raneans, whereas general Jordanians cluster with
Eastern Mediterranean populations. Results from
both genetic distance analysis and AMOVA sup-
port the distribution revealed by the PC analysis.
Thus, the average Reynolds genetic distance of
Bedouins to the remaining populations (31 × 10⁻³)
is of the same order of magnitude as the average
distance among all the populations (32 × 10⁻³),
whereas the distance of general Jordanians to
Middle Eastern populations (23 × 10⁻³) is lower
than that corresponding to Bedouins (28 × 10⁻³;
Table 2).
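The distances in Table 2 can be approximated from the insertion frequencies alone. The sketch below assumes the coancestry-type Reynolds formula documented for PHYLIP's Gendist program, specialized to biallelic loci; the function and variable names are hypothetical, and the two frequency lists are the Bedouin and general Jordanian insertion frequencies from Table 1, in the table's locus order.

```python
def reynolds_distance(freqs_a, freqs_b):
    """Reynolds-type genetic distance for biallelic loci.

    freqs_a and freqs_b hold the insertion frequency of each locus in
    the two populations; the second allele is the lack of insertion.
    """
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need two equally long, non-empty frequency lists")
    numerator = 0.0
    denominator = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        # Squared difference summed over both alleles of the locus.
        numerator += 2.0 * (pa - pb) ** 2
        # One minus the summed cross-products of allele frequencies.
        denominator += 1.0 - (pa * pb + (1.0 - pa) * (1.0 - pb))
    if denominator == 0.0:
        raise ValueError("both populations are fixed for the same alleles")
    return numerator / (2.0 * denominator)


# Insertion frequencies from Table 1 (DM ... ACE).
BEDOUIN = [0.640, 0.452, 0.776, 0.941, 0.750, 0.000, 0.267, 0.500, 0.394,
           0.134, 0.241, 0.385, 0.302, 0.105, 0.797, 0.487, 0.868, 0.202]
GENERAL_JORDAN = [0.405, 0.440, 0.824, 0.939, 0.755, 0.080, 0.274, 0.563, 0.430,
                  0.226, 0.143, 0.412, 0.298, 0.132, 0.663, 0.551, 0.950, 0.387]
```

`reynolds_distance(BEDOUIN, GENERAL_JORDAN)` comes out at roughly 0.023, in line with the Bedouin versus GJ entry of Table 2.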
The hierarchical analysis of the allele fre-
quency variance, classifying the populations into
two groups (North Africa plus Bedouins, and all
others) indicates a significant variation between
the two groups, as plotted along the first PC axis
(FST = 3.4%, p < 0.001; FCT = 1.6%, p ≤ 0.001; FSC =
1.8%, p < 0.001). Likewise, the population distribu-
tion associated with the second PC component
is also supported by the AMOVA results. In this
case, the genetic variance between the three
population groups formed by North Africa plus
Bedouins, Middle East plus general Jordanians, and
FIGURE 2. PC plot of 16 populations from the Mediterranean area based on the variation of 18 Alu insertion polymorphisms.
Western Mediterranean also indicates statistically
significant variation (FST = 3%, p < 0.001; FCT = 1.2%,
p ≤ 0.001; FSC = 1.8%, p < 0.001).
A second comparison, partial because it is
based on the variation of only eight Alu markers but
including a wider number of populations (21; PC
analysis population plot not shown) also separates
Bedouins from general Jordanians. However, in this
case, the relative position of the two Jordanian
samples versus other populations shows some
differences compared with results of the previous
analysis. For instance, the general Jordanian group
tends to be closer to Western Mediterranean than
to Middle East populations.
Discussion
This study provides the first comparative genetic
analysis between two Jordanian ethnic groups
selected according to strict and reliable criteria,
Bedouins and general Jordanians, by analyzing 18
autosomal Alu insertion polymorphisms. In gen-
eral, Jordanian allele frequencies and gene diversity
estimates show intermediate values within the vari-
ation range of other Mediterranean populations.
Compared with previous data, Alu frequencies
in general Jordanians are substantially similar to
those previously reported for a partial subset of Alu
markers (10 of the 18) in a Jordanian sample (Bahri
et al. 2011), except for two Alu markers: D1 (p = 0.02)
and HS4.32 (p = 0.006). These few differences could
be related to the potentially diverse origin of the
individuals sampled in each case.
Concerning differentiation within Jordan, this
study indicates a significant difference between
Bedouins and urban inhabitants of Jordan (p =
0.038). Of the 18 autosomal insertion markers,
three differ significantly between the groups: DM (p = 0.015),
HS2.43 (p = 0.01), and ACE (p = 0.005). Considering
the relatively small sample size, the genetic
differences point to a clear separation between
these two groups. This could be related to the fact
that in recent times urban areas have been subject
to several external influences but Bedouins have
conserved their own genetic background because
of their nomadic and isolated lifestyle. In fact,
among all the considered populations in the com-
parative analyses, Bedouins appear to be the most
diverse group, in contrast to general Jordanians,
who cluster with other Middle Eastern groups.
However, we should not ignore the fact that the
Table 2. Reynolds’s Genetic Distances Estimated among All 18 Populations using 18 Alu Insertion Markers
Bedouin GJ Greece Crete Turkey Asturias C_Spain Andalusia Balearic_I France Corsica Sardinia Sicily AMBE MABE NEBE MZAB Siwa
Bedouin —
GJ 0.023 —
Greece 0.028 0.023 —
Crete 0.028 0.019 0.008 —
Turkey 0.027 0.028 0.005 0.009 —
Asturias 0.029 0.012 0.014 0.016 0.019 —
C_Spain 0.039 0.034 0.024 0.029 0.035 0.017 —
Andalusia 0.034 0.019 0.027 0.024 0.023 0.016 0.030 —
Balearic_I 0.026 0.020 0.012 0.007 0.012 0.015 0.026 0.019 —
France 0.020 0.018 0.011 0.014 0.014 0.009 0.016 0.021 0.016 —
Corsica 0.028 0.026 0.008 0.012 0.011 0.016 0.018 0.024 0.008 0.011 —
Sardinia 0.029 0.030 0.026 0.022 0.028 0.027 0.034 0.034 0.025 0.017 0.020 —
Sicily 0.023 0.018 0.017 0.016 0.022 0.015 0.023 0.028 0.016 0.013 0.010 0.033 —
AMBE 0.027 0.017 0.039 0.034 0.041 0.030 0.048 0.025 0.031 0.039 0.043 0.052 0.035 —
MABE 0.028 0.020 0.025 0.023 0.026 0.022 0.033 0.019 0.027 0.028 0.036 0.033 0.040 0.018 —
NEBE 0.024 0.022 0.024 0.023 0.024 0.023 0.042 0.023 0.022 0.031 0.030 0.038 0.036 0.017 0.012 —
MZAB 0.034 0.036 0.045 0.037 0.042 0.042 0.050 0.033 0.036 0.045 0.049 0.035 0.059 0.027 0.010 0.020 —
Siwa 0.076 0.085 0.105 0.101 0.102 0.087 0.084 0.077 0.091 0.091 0.094 0.075 0.100 0.102 0.065 0.083 0.057 —
Abbreviations: AMBE: Amizmiz Berbers, GJ: general Jordanians, MABE: Middle Atlas Berbers, NEBE: Northeast Moroccan Berbers, MZAB: Mzab Berbers.
markers analyzed (number and/or low mutation
rate) may be not powerful enough to uncover
relatively recent demographic events. In this way,
the small inconsistencies in the relative genetic
position of the two Jordanian samples with respect
to other populations found in the two analyses
using different numbers of Alu loci (18 vs. 8) most
likely reflect the role of chance when few mark-
ers are used to characterize human populations.
In any case, the genetic differentiation observed
between Bedouin and general Jordanians using 18
Alu insertion polymorphisms is consistent with
the differentiation reported from the mitochondrial
DNA and Y-chromosome uniparental loci in
two recent studies (Flores et al. 2005; González
et al. 2008). Assuming that Bedouins represent
the original substrate of current-day Jordanians,
the differentiation found between them and the
general Jordanian group could be explained by
a higher Mediterranean influence in the general
population due to Jordan’s position as a crossroads
since ancient times and/or the recent contribution
of immigrants in the last half of the twentieth
century.
In a Mediterranean context, Bedouins seem to
be closer to North African groups, whereas general
Jordanians tend to group with North Mediterra-
neans, especially with the easternmost popula-
tions. Greater genetic proximity of Bedouins and
North Africans could be explained by the impact of
Arabic expansion into North Africa in the seventh
century. However, the outlier position of the Egyp-
tian sample from Siwa, also acknowledged in other
studies (Athanasiadis et al. 2007), together with the
significant lack of Alu data in most points of North
Africa, does not allow definite conclusions.
In summary, this Alu population analysis re-
inforces the genetic distinctiveness of Bedouins,
suggesting that they had an important role in the
peopling of Jordan and probably constitute the
original substrate of this population. Their relative
genetic proximity to North African groups supports
the idea that they share the genetic background of
the populations that spread the Arab culture into
North Africa. The genetic differentiation found
between the two current Jordanian population
groups could be attributed to some extent
to a relatively recent contribution of immigrants
coming from neighboring areas. However, this
conclusion needs to be confirmed with additional
markers to avoid random effects associated with
the use of a low number of markers.
Acknowledgments
We thank all participants who provided blood samples and
Mr. Nawras Al-Jazi from Badia Research Program Jordan for
facilitating sample collection in the remote southern regions
of Jordan. This work was supported by Programa de Coop-
eración Interuniversitaria e Investigación Científica grants
A/023616/09 and A/030982/10 from the Agencia Española
de Cooperación Internacional para el Desarrollo, and by
the CGL-2011-27866 project to the participant Yarmouk and
Barcelona Universities. D.Z. was supported by Master & Back
grant AF-DR-A2011B-48666-25399/2011.
Received 21 January 2014; revision accepted for
publication 3 June 2014.
Literature Cited
Athanasiadis, G., E. Esteban, M. Via et al. 2007. The X chro-
mosome Alu insertions as a tool for human population
genetics: Data from European and African human
groups. Eur. J. Hum. Genet. 15:578–583.
Bahri, R., A. Ben Halima, I. Ayadi et al. 2013. Genetic position
of Bahrain natives among wider Middle East popula-
tions according to Alu insertion polymorphisms. Ann.
Hum. Biol. 40:35–40.
Bahri, R., W. El Moncer, K. Al-Batayneh et al. 2011. Genetic
differentiation and origin of the Jordanian population:
An analysis of Alu insertion polymorphisms. Genet. Test
Mol. Biomarkers 16:324–329.
Batzer, M. A., and P. L. Deininger. 1991. A human-specific
subfamily of Alu sequences. Genomics 9:481–487.
Belkhir, K., P. Borsa, L. Chikhi et al. 1996–2004. GENETIX
4.05, logiciel sous Windows TM pour la génétique des
populations. Montpellier, France: Laboratoire Génome,
Populations, Interactions, CNRS UMR 5000, Université
de Montpellier II.
Central Intelligence Agency. 2013–2014. Middle East: Jordan.
In The World Factbook, https://www.cia.gov/library/
publications/the-world-factbook/geos/jo.html.
Cordaux, R., and M. A. Batzer. 2009. The impact of ret-
rotransposons on human genome evolution. Nat. Rev.
Genet. 10:691–703.
Cordaux, R., J. Lee, L. Dinoso et al. 2006. Recently integrated
Alu retrotransposons are essentially neutral residents
of the human genome. Gene. 373:138–144.
Excoffier, L., G. Laval, and S. Schneider. 2005. Arlequin ver.
3.0: An integrated software package for population
genetics data analysis. Evol. Bioinform. Online 1:47–50.
Flores, C., N. Maca-Meyer, J. Larruga et al. 2005. Isolates in
a corridor of migrations: A high-resolution analysis
of Y-chromosome variation in Jordan. J. Hum. Genet.
50:435–441.
González, A. M., N. Karadsheh, N. Maca-Meyer et al. 2008.
Mitochondrial DNA variation in Jordanians and their
genetic relationship to other Middle East populations.
Ann. Hum. Biol. 35:212–231.
González-Pérez, E., E. Esteban, M. Via et al. 2010. Popula-
tion relationships in the Mediterranean revealed by
autosomal genetic data (Alu and Alu/STR compound
systems). Am. J. Phys. Anthropol. 141:430–439.
González-Pérez, E., P. Moral, M. Via et al. 2007. The ins and
outs of population relationships in west-Mediterranean
islands: Data from autosomal Alu polymorphisms
and Alu/STR compound systems. J. Hum. Genet.
52:999–1,010.
Guo, S. W., and E. A. Thompson. 1992. Performing the exact
test of Hardy-Weinberg proportion for multiple alleles.
Biometrics 48:361–372.
Josse, J. 2008. FactoMineR : An R package for multivariate
analysis. J. Stat. Softw. 25:1–18.
Luizon, M. R., C. T. Mendes-Junior, S. F. De Oliveira et al.
2007. Ancestry informative markers in Amerindians
from Brazilian Amazon. Am. J. Hum. Biol. 20:86–90.
Romualdi, C., D. Balding, I. S. Nasidze et al. 2002. Patterns
of human diversity, within and among continents,
inferred from biallelic DNA polymorphisms. Genome
Res. 12:602–612.
Rousset, F. 2008. Genepop’007: A complete re-implementa-
tion of the Genepop software for Windows and Linux.
Mol. Ecol. Resour. 8:103–106.
Saitou, N., and M. Nei. 1987. The neighbor-joining method:
A new method for reconstructing phylogenetic trees.
Mol. Biol. Evol. 4:406–425.
Salibi, K. S. 1998. The Modern History of Jordan. London: I.
B. Tauris.
Stoneking, M., J. J. Fontius, S. L. Clifford et al. 1997. Alu
insertion polymorphisms and human evolution: Evi-
dence for a larger population size in Africa. Gen. Res.
7:1,061–1,071.
Terreros, M. C., M. A. Alfonso-Sanchez, G. E. Novick et al.
2009. Insights on human evolution: An analysis of Alu
insertion polymorphisms. J. Hum. Genet. 54:603–611.
Tuimala, J. 2006. A Primer to Phylogenetic Analysis using
the PHYLIP Package. Espoo, Finland: CSC—Scientific
Computing.
Copyright of Human Biology is the property of Wayne State University Press and its content
may not be copied or emailed to multiple sites or posted to a listserv without the copyright
holder’s express written permission. However, users may print, download, or email articles for
individual use.
Gene 592 (2016) 239–243
Review
Hatem Zayed
College of Health and Sciences, Biomedical Sciences Department, Qatar University, PO Box 2713, Doha, Qatar
E-mail address: hatem.zayed@qu.edu.qa.
http://dx.doi.org/10.1016/j.gene.2016.07.007
0378-1119/
© 2016 Published by Elsevier B.V.
Article info
Article history:
Received 21 June 2016
Accepted 3 July 2016
Available online 5 July 2016
Abstract
The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to
the prevalent endogamous and consanguineous marriage culture and the long history of admixture among dif-
ferent ethnic subcultures descended from the Asian, European, and African continents. Human genome sequenc-
ing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying
disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dy-
namics of the human genome, discovering rare genetic variations, and studying early human migration out of
Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project.
In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome
reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare var-
iants, and identifying a meaningful genotype-phenotype correlation for complex diseases.
Keywords:
Arab countries
Human genome sequencing
Whole exome sequencing
Consanguinity
Endogamous marriage
Novel genes
Novel variants
Contents
1. Introduction
2. The Arab world
2.1. Inbred Arab communities and rare variants discovery
3. The Arab genome
3.1. Discovery of novel disease-causing genes and the Arab genome
3.2. Arab efforts in genome sequencing
3.3. The Arab genome and the “Out of Africa” theory
3.4. Benefits of sequencing the Arab genome
4. Conclusion
Disclosure declaration
References
1. Introduction
The completion of the Human Genome Project (HGP) in April 2003
provided a wealth of information to scientists and clinicians. Subse-
quently, the world has witnessed rapid evolution in the field of
human genetics and genomics (Lander et al., 2001; Venter et al.,
2001). Initially, the focus of the HGP was to catalog the protein-
expressing genes, which are now estimated to include approximately
20,000 to 25,000 coding genes (International Human Genome
Sequencing Consortium, 2004). However, the hard work of decoding
the function of many genes and their precise genotype-phenotype cor-
relation in disease development remains.
From the publication of the first draft of the human genome, there
has been fierce competition to develop sequencing technologies that
are faster, more efficient and cheaper and to make the price of human
genome sequencing more affordable. Thus far, whole genome/exome
sequencing has provided outstanding insights into the frequency and
incidence of novel variants in the human genome that are associated
with disease phenotypes. This information provides opportunities to
different populations in the world to be able to map the sequence vari-
ants that might be unique to their own individuals and that might be re-
sponsible for genetic disorders in their specific populations. For this
purpose, the HapMap (human haplotype mapping) Project was
240 H. Zayed / Gene 592 (2016) 239–243
launched in 2002 (International HapMap Consortium, 2003); this pro-
ject has identified a considerable number of genetic variants, providing
extensive catalogs for genetic variation. The HapMap Project has also
served as the basis for genome-wide association studies (GWAS). In
particular, the HapMap Project has contributed to the successful map-
ping of more than 100 genomic regions that are associated with genetic
diseases (International HapMap Consortium, 2003).
As an extension of the HapMap Project, the 1000 Genomes Project
was launched in 2008 through international concerted efforts
(Buchanan et al., 2012). This project aims to sequence the whole ge-
nomes of 1000 unidentified individuals from Europe, America, Africa,
and Asia, and will add information to the single-nucleotide polymor-
phism (SNP) database already cataloged by the HapMap Project and
provide a rich resource for both SNPs and structural variant haplotypes.
Although this information will allow researchers to learn more about
many genetic variants and genetic diseases, the Arab genome is greatly
under-represented in such international genomic efforts; specifically,
it is not included in the HGP, the HapMap Project, or the 1000 Genomes
Project. Sequencing the Arab genome is important, and this genome should
not be omitted from the diverse collections of genomes that have already
been sequenced. This review therefore elaborates on the importance of the
Arab genome and its potential contribution to the genomic sciences.
2. The Arab world
The Arab world includes 22 Arabic-speaking countries (Fig. 1). According
to the World Bank's latest classification for 2015
(http://data.worldbank.org), the Arab countries include high-income countries
(HICs) such as Bahrain, Kuwait, Oman, Saudi Arabia, Qatar, and the
United Arab Emirates; middle-income countries (MICs) such as
Algeria, Egypt, Iraq, Jordan, Lebanon, Libya, Morocco, Palestine, Sudan,
Syria, and Tunisia; and low-income countries (LICs) such as Comoros,
Djibouti, Mauritania, Somalia, and Yemen. These countries occupy a
large area that extends from the Atlantic Ocean in the west to the
Arabian Sea in the east, and the Arab population is approaching 0.5
billion. This region has been extensively exposed to many successive
invaders from Turkey, Rome, and Europe as well as to traders and
immigrants, all of which contributed to the mixing of the ethnic
demographics of the population. However, the HICs, which include
countries with the highest Gross Domestic Product (GDP) per capita
worldwide (http://data.worldbank.org), spend less than 0.2% of their GDP
on scientific development (Giles, 2006). This underinvestment has led
many Arab scientists to emigrate to the West in search of better
opportunities. Recently, however, biomedical disease-based research has
received special attention from Arab governments, with the aim of
improving the understanding and treatment of common diseases afflicting
the Arab population. Saudi Arabia and Qatar in particular have made
various attempts to establish a research infrastructure, but progress
has been slow relative to the amount of capital infused into such
programs, and such investments might take significant time to yield
results. In this manuscript I will refer to the “Arab genome” as the
genome of the 22 Arab countries.
Fig. 1. Arabic speaking countries according to the latest WHO classification.
(Source: http://www.arabic-keyboard.org/arabic).
2.1. Inbred Arab communities and rare variants discovery
There are 955 genetic diseases that have been identified in Arabs, of
which 586 (approximately 60%) are reported to be recessive diseases
(http://www.cags.org.ae). Arabs have one of the highest rates of
consanguineous marriage worldwide, reaching up to ~70%, with an extreme
prevalence of first-cousin marriage (Tadmouri et al., 2009). These
factors, together with the endogamous marriage culture and large family
sizes, are responsible for the spread of genetic diseases in Arab
countries, with a high prevalence of rare diseases (Teebi and Teebi,
2005). Endogamous marriages approach 100% in many Arab countries,
especially the Gulf States (i.e., Bahrain, Kuwait, Oman, Qatar, Saudi
Arabia and the United Arab Emirates). For example, women in Saudi Arabia
are prohibited from marrying men other than Arab men from the Gulf
countries without special dispensation from the king
(http://web.archive.org/web/20120614045804/http://travel.state.gov/travel/cis_pa_tw/tw/tw_931.html),
and men must acquire a government permit
to marry a foreign woman. This law is applicable to the six Gulf States
and is due to deeply entrenched, centuries-old traditions that strongly
favor marriage within the same Arab subcultures. In addition, this mar-
riage culture is still on the rise; for example, consanguineous marriage
rates in Qatar increased from 41.8% to 54.5% in just one generation
(Bener and Alali, 2006).
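The link between consanguinity and recessive disease can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the studies cited here. It assumes a single biallelic locus with a recessive disease allele at frequency q and uses the standard population-genetic result that the probability of a homozygous (affected) child is q^2 + F*q*(1 - q), where F is the inbreeding coefficient of the child (1/16 for the offspring of first cousins). The function names and the allele frequency in the example are hypothetical.

```python
from fractions import Fraction

# Inbreeding coefficient F of the child, by parental relationship.
# These are standard pedigree values, not figures from the article.
F_BY_RELATIONSHIP = {
    "unrelated": Fraction(0),
    "second_cousins": Fraction(1, 64),
    "first_cousins": Fraction(1, 16),
    "double_first_cousins": Fraction(1, 8),
}


def recessive_risk(q: float, f: float) -> float:
    """Probability that a child is homozygous for an allele of frequency q.

    Uses P = q^2 + f*q*(1 - q), where f is the child's inbreeding coefficient.
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must lie in [0, 1]")
    return q * q + f * q * (1.0 - q)


def relative_risk(q: float, relationship: str) -> float:
    """Risk for the given parental relationship divided by the random-mating risk."""
    if q <= 0.0:
        raise ValueError("q must be positive to form a ratio")
    f = float(F_BY_RELATIONSHIP[relationship])
    return recessive_risk(q, f) / recessive_risk(q, 0.0)


if __name__ == "__main__":
    q = 0.01  # hypothetical disease-allele frequency
    for relationship in F_BY_RELATIONSHIP:
        print(relationship, round(relative_risk(q, relationship), 2))
```

Under these assumptions, with q = 0.01 the risk for the child of first cousins is roughly seven times the risk under random mating, and the ratio grows as the allele becomes rarer. This is why rare recessive disorders surface disproportionately in consanguineous families.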
A large number of rare variants still have unknown clinical
significance because of the limitations of current technologies:
assessing them requires large numbers of individuals who harbor these
variants, which are largely untested by high-density SNP arrays.
Therefore, studying inbred communities such as Arab communities is
an ideal scenario for understanding the effect of genetic variants on
the human genome. In this regard, genetic analysis of the Arab genome is
considered to be a goldmine for genomic scientists who are looking
for a more discernible correlation between the genotype and the
phenotype of genetic diseases, particularly complex disorders and rare
genetic disorders. The inbred nature of many Arab communities and
the prevalence of the conservative marriage culture might predict a
wide class of complex disorders, especially if the causative variants are
rare and most of the identified genetic variants causing complex
diseases in humans are partially recessive (Bittles and Black, 2010; Rudan
et al., 2003). Arabs therefore represent an ideal population for better
understanding the pathogenesis and prognosis of recessive diseases,
which are yet to be elucidated. Although the consanguineous, endogamous
Arab culture seems to predict a conserved pool of genes among
Arabs, the structure of the Arab genome became diversified over time,
mainly due to admixture with the genomes of different ethnic
groups descended from Africa, Asia, and Europe (Teebi and Teebi,
2005). This diversification provides another opportunity for
understanding the dynamics of the Arab genome and the “out of Africa”
migration theory.
3. The Arab genome
Although the Arab region is considered to be a hot spot for medical
and clinical genetic studies (Nat. Genet., 2006), Arabs have been slow
to explore their own genome. This delay might be due to the following
reasons: (1) in most Arab countries, it is not yet affordable to
sequence a genome, even for clinical diagnostic reasons, despite the
continually diminishing costs of next-generation sequencing
technologies; (2) research is not considered to be a necessity in most
Arab countries, mainly due to economic reasons; and (3) there is a dearth
of well-trained scientists in genomics. As a consequence, there is a lack
of information related to molecular pathogenesis and poor knowledge of
both the genotype-phenotype correlation of genetic diseases and the gene
variants that are segregating in the Arab genome and are responsible for
the spread of these diseases. This is the case even for the most
devastating diseases, such as diabetes and cardiovascular disorders, and
it compromises the level of health care provided to the Arab population.
Therefore, Arab governments must prioritize seeking the means
to understand the complexity and dynamics of the Arab genome, especially
in countries that are able to afford the costs of genome sequencing.
Consistent with this concept, a genomic revolution has been ignited in
the Arabian Peninsula, especially in the Gulf States of Saudi Arabia,
Kuwait, and Qatar. Like the US Encyclopedia of DNA Elements (ENCODE)
project, the Arab genome initiatives, represented by the Saudi
Human Genome Project (SHGP) (http://shgp.kacst.edu.sa/site), the
Qatar Genome Project (QGP) (Al-Mulla, 2014), and the Kuwaiti Genome
Project (KGP) (Thareja et al., 2015), aim to systematically and
comprehensively analyze and catalog the genetic variants and haplotypes
that are associated with health and disease. These efforts are expected to
help in the identification of novel disease-associated gene variants.
The initiatives also aim to derive reference genome sequence(s) for
different subpopulations of different ancestries in Kuwait. Although Arab
scientists are a decade late in sequencing the Arab genome, this
sequencing is expected to contribute to knowledge related to migration,
genome ancestry, genome evolution, genome dynamics, mapping of
rare disease-associated variants, and novel disease-associated gene
discovery.
3.1. Discovery of novel disease-causing genes and the Arab genome
Inbreeding is associated with an increased disease risk based on
increased homozygosity at many genetic loci (Rudan et al., 2003). It
leads to a high probability of shared ancestry between randomly selected
Arab individuals and to longer runs of homozygosity, which makes
highly consanguineous families in inbred Arab communities ideal for
mapping rare disease susceptibility loci. A representative example
was provided by Verge et al. (1998), who analyzed an inbred Bedouin
Arab community with a long history of first-cousin marriage. They
studied a large Arab family of 248 individuals living in Israel in which
19 relatives were affected with type 1 diabetes and carried rare
predisposing haplotypes that were not found in other families.
Interestingly, the researchers discovered a novel susceptibility locus
(IDDM17; MIM#603266) for type 1 diabetes, which was mapped to
chromosome 10 (10q25.1). Another example is the identification of a
novel locus defined by the TMEM107 mutation through
sequencing 25 families with the rare ciliopathy Meckel-Gruber syndrome
(Shaheen et al., 2015). A further study led to the
discovery of six novel candidate genes that were found to be associated
with embryonic lethality in Saudi Arabian consanguineous families
(Shamseldin et al., 2015).
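The idea behind mapping by runs of homozygosity can be sketched in a few lines. The example below is a simplified illustration and not the method used in any study cited here. It assumes genotypes coded as alternate-allele counts (0 or 2 for homozygous, 1 for heterozygous), scans one individual for uninterrupted homozygous stretches, and then intersects the stretches found in two affected relatives to obtain candidate regions. The function names, thresholds, and coordinates are hypothetical. Real analyses usually tolerate occasional heterozygous or missing calls, require far longer runs (often on the order of a megabase), and check that affected relatives are homozygous for the same haplotype.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start_bp, end_bp)


def find_roh(genotypes: List[int], positions: List[int],
             min_snps: int = 3, min_length_bp: int = 1000) -> List[Run]:
    """Return runs of consecutive homozygous calls as (start_bp, end_bp).

    genotypes: alternate-allele count per SNP (0 or 2 homozygous, 1 heterozygous).
    positions: base-pair coordinate of each SNP, sorted in ascending order.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have equal length")
    runs: List[Run] = []
    start = None  # index at which the current homozygous run began
    # A trailing heterozygous sentinel closes a run that reaches the last SNP.
    for i, g in enumerate(genotypes + [1]):
        if g != 1:
            if start is None:
                start = i
        elif start is not None:
            end = i - 1
            n_snps = end - start + 1
            length = positions[end] - positions[start]
            if n_snps >= min_snps and length >= min_length_bp:
                runs.append((positions[start], positions[end]))
            start = None
    return runs


def shared_roh(runs_a: List[Run], runs_b: List[Run]) -> List[Run]:
    """Intersect two lists of runs; each overlap is a candidate region."""
    shared: List[Run] = []
    for a_start, a_end in runs_a:
        for b_start, b_end in runs_b:
            lo, hi = max(a_start, b_start), min(a_end, b_end)
            if lo <= hi:
                shared.append((lo, hi))
    return shared
```

Regions that are homozygous in every affected relative but not in unaffected relatives are the natural places to look for a recessive variant, which is why large consanguineous pedigrees narrow the search so effectively.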
Whole-exome sequencing (WES) has also revealed a long list of novel
candidate genes among consanguineous Arab families. For example, a study
of 143 multiplex Saudi families identified 69 genes linked to recessive
diseases that had not previously been associated with genetic diseases
(Alazami et al., 2015). Diagnostic WES has also identified
several novel disease-associated genes among 149 probands from
the highly consanguineous population of Qatar, with various
Mendelian phenotypes, mainly neurocognitive (Yavarna et al.,
2015). In a study of 18 consanguineous Arab families with Meckel–Gruber
syndrome (MKS), WES revealed likely pathogenic mutations
in three novel candidate MKS disease-causing genes (C5orf42, EVC2,
and SEC8) (Shaheen et al., 2013). The ARL6IP6 gene was identified as
a novel candidate gene for a syndromic form of cutis marmorata
telangiectatica congenita (CMTC) in a Saudi consanguineous family
(Abumansour et al., 2015). Therefore, the Arab
genome carries significant potential for advancing the fields of clinical
and medical genetics.
3.2. Arab efforts in genome sequencing
The SHGP is a 5-year project launched in December 2013 that in-
volves a partnership between the SHGP and Life Technologies (http://
shgp.kacst.edu.sa/site). The aim of the project is to sequence 100,000
Saudi genomes that represent both normal and disease conditions to
identify Saudi-specific genetic variants that are linked to high-
incidence genetic diseases in Saudi Arabia, such as diabetes, deafness,
cardiovascular disorders, cancer, and neurodegenerative diseases
(Abu-Elmagd et al., 2015). The SHGP’s specific mission is to establish a
genotype-phenotype correlation for genetic disease and to create a
foundation for personalized medicine, in which treatment will be devel-
oped based on the DNA blueprint of each Saudi individual. This ap-
proach will reduce the cost of health care, as the health care expenses
related to human genetic disease are greater than $30 billion annually
in Saudi Arabia (http://shgp.kacst.edu.sa/site).
A few days after the SHGP announcement, Qatar announced its
intention to launch the QGP and a plan to sequence the genomes of all
Qatari citizens (~300,000) (Al-Mulla, 2014). Similar to the SHGP, the
QGP seeks to protect Qatari citizens from the spread of genetic
diseases arising from the deeply entrenched culture of endogamous and
consanguineous marriage by understanding the genomic make-up of
the Qatari population and integrating the sequencing information into
clinical care for Qatari individuals. The data collected from the genome
sequencing will be used as a platform for developing customized
molecular diagnostic approaches for Arabs (Zayed and Ouhtit, 2016), will
help to create the foundation of personalized medicine in the Arabian
Peninsula, and are expected to advance prenatal screening and genetic
counseling for disease-carrying individuals in Qatar. The QGP has already
started its pilot phase by sequencing 3000 Qatari citizens
(http://www.qatar-tribune.com/viewnews.aspx?d=20151214&cat=nation2&pge=5).
Computational analyses that aim to decode the Qatari genome and map
the genetic variants unique to Qatari individuals are
supported by generous competitive funding from Qatar Foundation
(https://www.qf.org.qa). These sequencing data are kept in electronic
medical records, which will be an integral part of the Qatari National
Health Service.
The KGP is an initiative to determine the genetic diversity of the
main ethnic groups that constitute the Kuwaiti population, namely,
Saudi Arabians, Bedouins, and Persians, ascribing their origin to dif-
ferent regions of the Arabian Peninsula and West Asia (modern
Iranians). Thus, this project is the first to report a reference genome
resource for the population of Persian ancestry in Kuwait (Thareja
et al., 2015).
3.3. The Arab genome and the “Out of Africa” theory
The modern Arab gene pool exhibits a very interesting genetic
structure: it has numerous pockets of inbred communities due to
the prevalence of consanguineous unions, conserved pools of
genomes due to widespread endogamous marriage, and a mixed gene
pool due to the history of Arab nations and the admixture of their
genomes with those of people from Europe,
Africa, and Asia. This diversity is important for understanding
genome evolution and dynamics, answering the “Out of Africa”
human migration question, and providing insights into the
migration routes of early modern humans from Africa to Eurasia. The
primary African origin of all modern human populations is well
known, but the routes of human migration out of Africa are still
uncertain. One potential route is through the Levant. Although the North
African genetic background stems mainly from the Near East/Arabian
Peninsula, the genomic ancestry of the Arabs of North Africa supports an
African genomic background due to historical mixing with
sub-Saharan African genomes (Henn et al., 2012). Another potential
route is to the south, across the Arabian Peninsula, which is a
nexus of Asia, Africa, and Europe (Kopp et al., 2014). Interestingly,
Fernandes et al. (2012) focused on disentangling the
impact of several waves of migration into the Arabian Peninsula
in terms of their contribution of African input and provided evidence
that the Arabian Peninsula could be the first staging post in the spread
of modern humans from Africa to the rest of the world.
Sequencing of just 13 exomes and 2 full genomes in
Kuwait revealed ancestral genomic signatures stemming from
Asia, Europe and Africa (Alsmadi et al., 2014; Alsmadi et al., 2013).
Egypt is an Afro-Asian Arab country that shares the Mediterranean Sea
with European countries (Fig. 1), and it has been proposed as a potential
source of the exodus of the African genome to Eurasia (Pagani et al.,
2015) according to geographical, archaeological, and genetic evidence.
African genomic components have been mapped (Pagani et al., 2015);
however, most of the analyzed Egyptian haplotypes were genetically
similar to those of modern non-Africans. The study concluded that
Egypt was a potential gateway for the migration of the African genome
to the rest of the world. Therefore, comparing Egyptian genomes
with European ones addresses the exit route through Egypt, whereas
comparing Ethiopian genomes with Arab genomes addresses the southern
route of the out-of-Africa migration.
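The notion of an admixed gene pool can be illustrated with allele frequencies. The sketch below is a toy example and not the method of any study cited here. It assumes a simple two-source linear mixing model in which the allele frequency in the admixed population at each marker is m*p1 + (1 - m)*p2, and it estimates the ancestry fraction m by least squares across markers. All frequencies and names are hypothetical. Published analyses rely on genome-wide, model-based methods and on more than two source populations.

```python
from typing import Iterable, Tuple

# One marker: (frequency in admixed population, in source 1, in source 2)
Locus = Tuple[float, float, float]


def admixture_least_squares(loci: Iterable[Locus]) -> float:
    """Estimate m, the ancestry fraction from source 1, across markers.

    Model per marker: p_admixed = m * p_source1 + (1 - m) * p_source2.
    Least-squares solution: m = sum((pa - p2) * (p1 - p2)) / sum((p1 - p2)^2).
    """
    numerator = 0.0
    denominator = 0.0
    for p_admixed, p_source1, p_source2 in loci:
        delta = p_source1 - p_source2
        numerator += (p_admixed - p_source2) * delta
        denominator += delta * delta
    if denominator == 0.0:
        raise ValueError("no informative markers: source frequencies are identical")
    # Sampling noise can push the raw estimate slightly outside [0, 1].
    return min(1.0, max(0.0, numerator / denominator))


if __name__ == "__main__":
    # Hypothetical frequencies constructed with m = 0.25.
    example = [(0.35, 0.8, 0.2), (0.40, 0.1, 0.5), (0.375, 0.6, 0.3)]
    print(admixture_least_squares(example))
```

Markers whose frequencies differ strongly between the source populations dominate the estimate, which is the reason ancestry-informative markers are preferred for this kind of analysis.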
3.4. Benefits of sequencing the Arab genome
Given the frequent spread of genetic diseases in Arab countries,
establishing reference genome(s) that reflect the diversity and population
structure of Arab countries will serve as an example for other
communities with comparable population structures and will have many
benefits, including, but not limited to, (1) serving as a vital tool for
the identification of novel variants; (2) serving as a baseline for further
genomic epidemiological studies in Arab nations; (3) serving as a useful
foundation for cohort and case-control genetic studies that aim to
characterize the genetic etiology of genetic diseases; (4) improving genetic
counseling for individuals with genetic disorders; (5) serving as a
platform for future GWAS; (6) advancing translational medicine in the
fields of personalized medicine and pharmacogenomics, allowing
medications to be individualized to Arab patients and Arab responses to
drugs to become well understood; (7) allowing the study of inbred
Arab communities, specifically the Bedouin population, and thus
facilitating the discovery of rare and novel gene variants and novel
genes, information that is very important for better understanding the
molecular pathology of complex diseases/traits and is expected to shed
light on gene-environment interactions, epistasis, and other genetic
risk factors of major importance in genetic disease development;
and (8) serving as a historical tracing tool for population migration.
The ultimate goal of sequencing the Arab genome is to create a database
of the DNA variation in the Arab population and to make it available to
clinicians and researchers in Arab countries who seek to increase the
power of disease prediction, to understand gene-drug interactions, to
study Arab population substructures, to improve understanding of
the nature of Arab genetic diversity, and to trace population migration.
All of these endeavors will contribute to one major aim, which is to
improve patients’ quality of life by improving overall health care and
saving lives. However, translating the results of Arab genome
sequencing into effective clinical practice is a challenging task that
will require concerted efforts by both policymakers and scientists to
implement effective strategies in the health care sector and to make
funding available to allow such programs to continue.
4. Conclusion
Arabs are an ideal population for genetic studies, with a genetic
structure that ranges from inbred communities to a diverse gene pool
that includes elements from Europe, Asia, and Africa. This feature renders
the Arab population a rich source of information that would be of
global benefit and emphasizes the value of consensus Arab genome
reference(s), which will positively impact the future directions of
personalized medicine. Genomic sequencing technologies have identified
numerous rare variants and novel genes in Arab families, mainly
those with a history of consanguineous marriage. The outcomes of the SHGP
and QGP are soon to be released, which will pave the way for future
consensus Arab genome reference(s). Therefore, there is an urgent need for
data sharing, both locally and internationally, which dictates the need for
the development of mechanisms and standards to facilitate this sharing.
Disclosure declaration
Hatem Zayed declares no conflict of interest.
References
Abu-Elmagd, M., Assidi, M., Schulten, H.J., Dallol, A., Pushparaj, P., Ahmed, F., Scherer, S.W.,
Al-Qahtani, M., 2015. Individualized medicine enabled by genomics in Saudi Arabia.
BMC Med. Genet. 8 (Suppl. 1), S3.
Abumansour, I.S., Hijazi, H., Alazmi, A., Alzahrani, F., Bashiri, F.A., Hassan, H., Alhaddab, M.,
Alkuraya, F.S., 2015. ARL6IP6, a susceptibility locus for ischemic stroke, is mutated in
a patient with syndromic Cutis Marmorata Telangiectatica Congenita. Hum. Genet.
134, 815–822.
Alazami, A.M., Patel, N., Shamseldin, H.E., Anazi, S., Al-Dosari, M.S., Alzahrani, F., Hijazi, H.,
Alshammari, M., Aldahmesh, M.A., Salih, M.A., Faqeih, E., Alhashem, A., Bashiri, F.A.,
Al-Owain, M., Kentab, A.Y., Sogaty, S., Al Tala, S., Temsah, M.-H., Tulbah, M.,
Aljelaify, R.F., Alshahwan, S.A., Seidahmed, M.Z., Alhadid, A.A., Aldhalaan, H.,
AlQallaf, F., Kurdi, W., Alfadhel, M., Babay, Z., Alsogheer, M., Kaya, N., Al-Hassnan,
Z.N., Abdel-Salam, G.M.H., Al-Sannaa, N., Al Mutairi, F., El Khashab, H.Y., Bohlega, S.,
Jia, X., Nguyen, H.C., Hammami, R., Adly, N., Mohamed, J.Y., Abdulwahab, F.,
Ibrahim, N., Naim, E.A., Al-Younes, B., Meyer, B.F., Hashem, M., Shaheen, R., Xiong,
Y., Abouelhoda, M., Aldeeri, A.A., Monies, D.M., Alkuraya, F.S., 2015. Accelerating
novel candidate gene discovery in neurogenetic disorders via whole-exome sequenc-
ing of prescreened multiplex consanguineous families. Cell Rep. 10, 148–161.
Al-Mulla, F., 2014. The locked genomes: a perspective from Arabia. Applied & Translation-
al Genomics 3, 132–133.
Alsmadi, O., Thareja, G., Alkayal, F., Rajagopalan, R., John, S.E., Hebbar, P., Behbehani, K.,
Thanaraj, T.A., 2013. Genetic substructure of Kuwaiti population reveals migration
history. PLoS One 8, e74913.
Alsmadi, O., John, S.E., Thareja, G., Hebbar, P., Antony, D., Behbehani, K., Thanaraj, T.A.,
2014. Genome at juncture of early human migration: a systematic analysis of two
whole genomes and thirteen exomes from Kuwaiti population subgroup of inferred
Saudi Arabian tribe ancestry. PLoS One 9, e99069.
Bener, A., Alali, K.A., 2006. Consanguineous marriage in a newly developed country: the
Qatari population. J. Biosoc. Sci. 38, 239–246.
Bittles, A.H., Black, M.L., 2010. Evolution in health and medicine Sackler colloquium: con-
sanguinity, human evolution, and complex diseases. Proc. Natl. Acad. Sci. U. S. A. 107
(Suppl. 1), 1779–1786.
Buchanan, C.C., Torstenson, E.S., Bush, W.S., Ritchie, M.D., 2012. A comparison of cataloged
variation between International HapMap Consortium and 1000 Genomes Project
data. J. Am. Med. Inform. Assoc. 19, 289–294.
Fernandes, V., Alshamali, F., Alves, M., Costa, M.D., Pereira, J.B., Silva, N.M., Cherni, L.,
Harich, N., Cerny, V., Soares, P., Richards, M.B., Pereira, L., 2012. The Arabian cradle:
mitochondrial relicts of the first steps along the southern route out of Africa. Am.
J. Hum. Genet. 90, 347–355.
Giles, J., 2006. Islam and science: oil rich, science poor. Nature 444, 28.
Henn, B.M., Botigue, L.R., Gravel, S., Wang, W., Brisbin, A., Byrnes, J.K., Fadhlaoui-Zid, K.,
Zalloua, P.A., Moreno-Estrada, A., Bertranpetit, J., Bustamante, C.D., Comas, D., 2012.
Genomic ancestry of North Africans supports back-to-Africa migrations. PLoS Genet.
8, e1002397.
International HapMap Consortium, 2003. The International HapMap Project. Nature 426,
789–796.
International Human Genome Sequencing Consortium, 2004. Finishing the euchromatic
sequence of the human genome. Nature 431, 931–945.
Kopp, G.H., Roos, C., Butynski, T.M., Wildman, D.E., Alagaili, A.N., Groeneveld, L.F., Zinner,
D., 2014. Out of Africa, but how and when? The case of hamadryas baboons (Papio
hamadryas). J. Hum. Evol. 76, 154–164.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar,
K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., et al., 2001. Initial sequencing and
analysis of the human genome. Nature 409, 860–921.
Nat. Genet., 2006. The germinating seed of Arab genomics (Editorial). Nat. Genet. 38, 851.
Pagani, L., Schiffels, S., Gurdasani, D., Danecek, P., Scally, A., Chen, Y., Xue, Y., Haber, M.,
Ekong, R., Oljira, T., Mekonnen, E., Luiselli, D., Bradman, N., Bekele, E., Zalloua, P.,
Durbin, R., Kivisild, T., Tyler-Smith, C., 2015. Tracing the route of modern humans
out of Africa by using 225 human genome sequences from Ethiopians and
Egyptians. Am. J. Hum. Genet. 96, 986–991.
Rudan, I., Rudan, D., Campbell, H., Carothers, A., Wright, A., Smolej-Narancic, N.,
Janicijevic, B., Jin, L., Chakraborty, R., Deka, R., Rudan, P., 2003. Inbreeding and risk
of late onset complex disease. J. Med. Genet. 40, 925–932.
Shaheen, R., Faqeih, E., Alshammari, M.J., Swaid, A., Al-Gazali, L., Mardawi, E., Ansari, S.,
Sogaty, S., Seidahmed, M.Z., AlMotairi, M.I., Farra, C., Kurdi, W., Al-Rasheed, S.,
Alkuraya, F.S., 2013. Genomic analysis of Meckel-Gruber syndrome in Arabs reveals
marked genetic heterogeneity and novel candidate genes. Eur. J. Hum. Genet. 21,
762–768.
Shaheen, R., Almoisheer, A., Faqeih, E., Babay, Z., Monies, D., Tassan, N., Abouelhoda, M.,
Kurdi, W., Al Mardawi, E., Khalil, M.M., Seidahmed, M.Z., Alnemer, M., Alsahan, N.,
Sogaty, S., Alhashem, A., Singh, A., Goyal, M., Kapoor, S., Alomar, R., Ibrahim, N.,
Alkuraya, F.S., 2015. Identification of a novel MKS locus defined by TMEM107 muta-
tion. Hum. Mol. Genet. 24, 5211–5218.
Shamseldin, H.E., Tulbah, M., Kurdi, W., Nemer, M., Alsahan, N., Al Mardawi, E., Khalifa, O.,
Hashem, A., Kurdi, A., Babay, Z., Bubshait, D.K., Ibrahim, N., Abdulwahab, F., Rahbeeni,
Z., Hashem, M., Alkuraya, F.S., 2015. Identification of embryonic lethal genes in
humans by autozygosity mapping and exome sequencing in consanguineous fami-
lies. Genome Biol. 16, 116.
Tadmouri, G.O., Nair, P., Obeid, T., Al Ali, M.T., Al Khaja, N., Hamamy, H.A., 2009. Consan-
guinity and reproductive health among Arabs. Reprod. Health 6, 17.
Teebi, A.S., Teebi, S.A., 2005. Genetic diversity among the Arabs. Community Genet. 8,
21–26.
Thareja, G., John, S.E., Hebbar, P., Behbehani, K., Thanaraj, T.A., Alsmadi, O., 2015. Sequence
and analysis of a whole genome from Kuwaiti population subgroup of Persian ances-
try. BMC Genomics 16, 92.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O.,
Yandell, M., Evans, C.A., Holt, R.A., Gocayne, J.D., Amanatides, P., Ballew, R.M.,
Huson, D.H., Wortman, J.R., Zhang, Q., Kodira, C.D., Zheng, X.H., Chen, L., Skupski,
M., Subramanian, G., Thomas, P.D., Zhang, J., Gabor Miklos, G.L., Nelson, C., Broder,
S., Clark, A.G., Nadeau, J., McKusick, V.A., Zinder, N., Levine, A.J., Roberts, R.J., Simon,
M., Slayman, C., Hunkapiller, M., Bolanos, R., Delcher, A., Dew, I., Fasulo, D., Flanigan,
M., Florea, L., Halpern, A., Hannenhalli, S., Kravitz, S., Levy, S., Mobarry, C., Reinert,
K., Remington, K., Abu-Threideh, J., Beasley, E., Biddick, K., Bonazzi, V., Brandon, R.,
Cargill, M., Chandramouliswaran, I., Charlab, R., Chaturvedi, K., Deng, Z., Di
Francesco, V., Dunn, P., Eilbeck, K., Evangelista, C., Gabrielian, A.E., Gan, W., Ge, W.,
Gong, F., Gu, Z., Guan, P., Heiman, T.J., Higgins, M.E., Ji, R.R., Ke, Z., Ketchum, K.A.,
Lai, Z., Lei, Y., Li, Z., Li, J., Liang, Y., Lin, X., Lu, F., Merkulov, G.V., Milshina, N., Moore,
H.M., Naik, A.K., Narayan, V.A., Neelam, B., Nusskern, D., Rusch, D.B., Salzberg, S.,
Shao, W., Shue, B., Sun, J., Wang, Z., Wang, A., Wang, X., Wang, J., Wei, M., Wides,
R., Xiao, C., Yan, C., Yao, A., Ye, J., Zhan, M., Zhang, W., Zhang, H., Zhao, Q., Zheng, L.,
Zhong, F., Zhong, W., Zhu, S., Zhao, S., Gilbert, D., Baumhueter, S., Spier, G., Carter,
C., Cravchik, A., Woodage, T., Ali, F., An, H., Awe, A., Baldwin, D., Baden, H., Barnstead,
M., Barrow, I., Beeson, K., Busam, D., Carver, A., Center, A., Cheng, M.L., Curry, L.,
Danaher, S., Davenport, L., Desilets, R., Dietz, S., Dodson, K., Doup, L., Ferriera, S.,
Garg, N., Gluecksmann, A., Hart, B., Haynes, J., Haynes, C., Heiner, C., Hladun, S., Hostin,
D., Houck, J., Howland, T., Ibegwam, C., Johnson, J., Kalush, F., Kline, L., Koduru, S., Love,
A., Mann, F., May, D., McCawley, S., McIntosh, T., McMullen, I., Moy, M., Moy, L., Mur-
phy, B., Nelson, K., Pfannkoch, C., Pratts, E., Puri, V., Qureshi, H., Reardon, M.,
Rodriguez, R., Rogers, Y.H., Romblad, D., Ruhfel, B., Scott, R., Sitter, C., Smallwood,
M., Stewart, E., Strong, R., Suh, E., Thomas, R., Tint, N.N., Tse, S., Vech, C., Wang, G.,
Wetter, J., Williams, S., Williams, M., Windsor, S., Winn-Deen, E., Wolfe, K., Zaveri, J.,
Zaveri, K., Abril, J.F., Guigo, R., Campbell, M.J., Sjolander, K.V., Karlak, B., Kejariwal,
A., Mi, H., Lazareva, B., Hatton, T., Narechania, A., Diemer, K., Muruganujan, A., Guo,
N., Sato, S., Bafna, V., Istrail, S., Lippert, R., Schwartz, R., Walenz, B., Yooseph, S.,
Allen, D., Basu, A., Baxendale, J., Blick, L., Caminha, M., Carnes-Stine, J., Caulk, P.,
Chiang, Y.H., Coyne, M., Dahlke, C., Mays, A., Dombroski, M., Donnelly, M., Ely, D.,
Esparham, S., Fosler, C., Gire, H., Glanowski, S., Glasser, K., Glodek, A., Gorokhov, M.,
Graham, K., Gropman, B., Harris, M., Heil, J., Henderson, S., Hoover, J., Jennings, D.,
Jordan, C., Jordan, J., Kasha, J., Kagan, L., Kraft, C., Levitsky, A., Lewis, M., Liu, X.,
Lopez, J., Ma, D., Majoros, W., McDaniel, J., Murphy, S., Newman, M., Nguyen, T.,
Nguyen, N., Nodell, M., Pan, S., Peck, J., Peterson, M., Rowe, W., Sanders, R., Scott, J.,
Simpson, M., Smith, T., Sprague, A., Stockwell, T., Turner, R., Venter, E., Wang, M.,
Wen, M., Wu, D., Wu, M., Xia, A., Zandieh, A., Zhu, X., 2001. The sequence of the
human genome. Science 291, 1304–1351.
Verge, C.F., Vardi, P., Babu, S., Bao, F., Erlich, H.A., Bugawan, T., Tiosano, D., Yu, L.,
Eisenbarth, G.S., Fain, P.R., 1998. Evidence for oligogenic inheritance of type
1 diabetes in a large Bedouin Arab family. J. Clin. Invest. 102, 1569–1575.
Yavarna, T., Al-Dewik, N., Al-Mureikhi, M., Ali, R., Al-Mesaifri, F., Mahmoud, L., Shahbeck,
N., Lakhani, S., AlMulla, M., Nawaz, Z., Vitazka, P., Alkuraya, F.S., Ben-Omran, T., 2015.
High diagnostic yield of clinical exome sequencing in Middle Eastern patients with
Mendelian disorders. Hum. Genet. 134, 967–980.
Zayed, H., Ouhtit, A., 2016. Accredited genetic testing in the Arab Gulf region: reinventing
the wheel. J. Hum. Genet. http://dx.doi.org/10.1038/jhg.2016.22 (Epub ahead of
print).