Molecular Characterization of Mosquitoes of Anopheles gambiae Species Complex (Diptera: Culicidae) from Sudan and Republic of Southern Sudan  

Asma Mahmoud Hamza1, El Amin El-Rayah2, Sumaia Mohamed Ahmed Abukashawa2
1. Department of Biology, Faculty of Education, University of Kassala, Kassala State, Sudan.
2. Department of Zoology, Faculty of Science, University of Khartoum, Khartoum, Sudan.
Journal of Mosquito Research, 2014, Vol. 4, No. 13   doi: 10.5376/jmr.2014.04.0013
Received: 14 Jun., 2014    Accepted: 13 Jul., 2014    Published: 30 Aug., 2014
© 2014 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:
Hamza et al., 2014, Molecular Characterization of Mosquitoes of Anopheles gambiae Species Complex (Diptera: Culicidae) from Sudan and Republic of Southern Sudan, Journal of Mosquito Research, Vol.4, No.13 1-10 (doi: 10.5376/jmr.2014.04.0013)
Mosquitoes of the Anopheles gambiae complex, namely Anopheles arabiensis (Patton, 1905) and Anopheles gambiae (Giles, 1902), are the major vectors of human malaria on the African continent. This study was conducted mainly to investigate the molecular biology of members of the An. gambiae complex in Sudan and Republic of Southern Sudan. The molecular investigation involved identification of members of the An. gambiae complex using polymerase chain reaction (PCR) techniques based on species-specific nucleotide differences in the intergenic spacer region (IGS) of the ribosomal RNA gene cluster (rRNA), together with partial sequencing and analysis of the IGS regions. Adult Anopheles mosquitoes were collected from four sites in Kassala State, Sudan, and from one site in Western Bahr El Ghazal State, Republic of Southern Sudan. In addition, An. arabiensis specimens obtained from the Sennar town laboratory colony (Sudan) were used in the study. Anopheles mosquitoes were collected by hand capture using a sucking tube (aspirator) during the rainy seasons of 2008, 2009 and 2010. The molecular investigation indicated the presence of two species of the An. gambiae complex, namely An. arabiensis and An. gambiae. An. arabiensis was the predominant species at all the collection sites, while An. gambiae was found sympatrically with An. arabiensis in Republic of Southern Sudan. The analysis of the IGS fragments revealed a moderate level of genetic variation within and between the An. arabiensis populations. An. gambiae individuals showed high genetic similarity. The genetic analysis revealed little population differentiation (Fst=0.067) and a high migration rate (Nm=3.51), which indicated high gene flow between the An. arabiensis populations collected from Kassala State localities. The phylogenetic relationships between the different populations of An. arabiensis and An. gambiae were investigated.
The IGS regions of rRNA gene have been shown to be powerful markers for species identification and studying the genetic structure of members of An. gambiae complex.
Keywords: An. arabiensis; An. gambiae; Ribosomal RNA gene (rRNA); Intergenic spacer region (IGS); Sudan; Republic of Southern Sudan

Malaria is a major health problem in Africa. Members of the An. gambiae complex have been identified as major vectors of human malaria parasites on the African continent. The An. gambiae complex comprises seven genetically and behaviourally distinct species that are morphologically indistinguishable (Davidson et al., 1967; Service, 1985; Hunt et al., 1998). Within this species complex the most important vectors of human malaria are An. gambiae (Giles, 1902) and An. arabiensis (Patton, 1905), which are distributed over 70% of sub-Saharan Africa, with An. arabiensis being distributed over the dry savanna and semi-arid parts of Africa (Service, 1980; Bryan, 1983; Lindsay et al., 1998). The two species are more adapted to the human environment; they are sympatric and synchronic over most of their geographical distribution range (Petrarca et al., 1998).
Of the seven recognized species of the An. gambiae complex, An. arabiensis and An. gambiae are the most abundant and most important vectors of human malaria in Sudan and Republic of Southern Sudan. An. arabiensis has been recorded from many localities in Sudan, from the extreme south up to the northern border with Egypt. It has been reported from many localities in Kassala State, eastern Sudan (Haridi, 1972; Petrarca et al., 1986; Himeidan, 2004). An. gambiae is restricted to Republic of Southern Sudan, where it is found sympatrically with An. arabiensis (Zahar, 1985; Petrarca et al., 1986).
Identification of species within the An. gambiae group is essential for the correct evaluation of malaria vector ecology studies and control programs (Gale, 1987). Understanding the genetic structure of mosquito populations is important for addressing biological and public health issues such as evolution, the spread of insecticide resistance alleles and the epidemiology of vector-borne diseases (Crampton et al., 1994; Tripet et al., 2001; Fanello et al., 2003).
Several methods for identifying species of mosquito complexes have been developed, including polymerase chain reaction (PCR) techniques, which have become the standard method for species identification and for studying genetic structure (Scott et al., 1993; Wilkins et al., 2006). The ribosomal RNA gene cluster (rRNA) is one of the most widely used regions of the genome for inferring genetic variation and phylogenetic relationships. The rRNA gene is a gene family consisting of many copies (100-500) of genes that encode the ribosomal RNA. In eukaryotes, the rRNA gene is composed of tandemly repeated units separated from each other by intergenic non-transcribed spacers (IGS). Each repeat contains the coding genes for 18S, 5.8S and 28S rRNA, in that order, together with an external transcribed spacer (ETS) at the 5´ end of the 18S gene and two internal transcribed spacers (ITS). The first internal transcribed spacer (ITS1) is located between the 18S and 5.8S genes, whereas ITS2 separates the 5.8S and 28S rRNA genes (Hillis and Dixon, 1991). The coding regions of 18S, 5.8S and 28S are highly conserved, whereas the non-coding spacers (ITS1, ITS2 and IGS) are highly variable and evolve at a faster rate than the coding regions. They can vary greatly in length and nucleotide sequence between closely related species (Olsen and Woese, 1993; Aransay, 2000). Highly conserved regions can be used for studying relationships across phyla (Gerbi, 1985), while more variable regions can be used at lower taxonomic levels. The IGS region contains species-specific nucleotide sequences and has facilitated discrimination of species in the An. gambiae group (Collins et al., 1987; Paskewitz et al., 1993; Scott et al., 1993).
The ability to amplify DNA using PCR techniques has greatly facilitated DNA sequence comparisons (Innis et al., 1990) and resulted in the development and use of species-specific diagnostic PCR primer pairs (Paskewitz and Collins, 1990). Nucleotide sequencing of PCR products is suitable both for confirmation and for characterization of mutations detected by one of the described screening methods, such as single nucleotide polymorphism (SNP) analysis. An SNP occurs when a single nucleotide in the genome sequence is changed. SNPs are the commonest type of nucleotide sequence variation in the genome, but they have only recently been used to investigate the evolutionary and demographic history of populations and speciation (Brumfield et al., 2003).
Several molecular markers have been used in studying population genetic structure and gene flow in anopheline mosquitoes. These markers range from classical genetic markers (e.g. mtDNA or rRNA gene) to methods used to detect and identify SNPs and finally to highly polymorphic markers (e.g. RAPDs, microsatellite DNAs) (Norris, 2002). Classical genetic markers target a defined gene or genetic fragment, which is analyzed using different techniques to evaluate genetic variability down to the resolution of SNPs and sequencing.
Gene flow can be defined as the movement of genes within and between different populations (Ferris et al., 1983). It cannot be measured exactly, but it can be estimated using either a direct or an indirect method (Slatkin, 1995). Direct estimates of gene flow are based on the observation of organism dispersal within a defined range of time and space (Taylor et al., 2001). The indirect method is based on the estimation of allele frequencies, obtained by electrophoretic survey of proteins or from DNA sequences using molecular markers (Donnelly and Townson, 2000).
The IGS regions have previously been used as a powerful tool in studying the genetic and phylogenetic divergence between closely related species. Phylogenetic trees are a powerful means of summarizing evolutionary relationships. Many different criteria can be used to construct phylogenetic trees from morphological or molecular data. In integrated mosquito control programmes, taxonomic and phylogenetic studies have been quite useful in understanding vectorial capacity and insecticide resistance in malaria vectors (Sharma and Chaudhry, 2010).
In the present study we conducted a molecular investigation of members of the An. gambiae complex in Sudan and Republic of Southern Sudan. The investigation involved molecular identification of the members of the complex and analysis of partial sequences of the IGS regions of the rRNA gene.
1 Materials and methods
1.1 Collection sites and mosquitoes used in the study
Sudan and Republic of Southern Sudan are situated in the eastern part of the African continent, between 22° and 38° East (E) longitude and between 4° and 22° North (N) latitude (Figure 1). The area is crossed by the River Nile and its tributaries. Sudan has different climatic regions, ranging from desert (dry hot-arid) in the North to semi-desert and savanna in the South. Republic of Southern Sudan has an equatorial (humid-tropical) climate.

Figure 1 Map of Sudan and Republic of Southern Sudan showing the collection sites

Field samples of female An. gambiae complex mosquitoes were collected from four different sites in Kassala State, eastern Sudan (dry area), and from one site in Western Bahr El Ghazal State, northern Republic of Southern Sudan (humid area). In addition, samples of An. arabiensis from the Sennar laboratory colony were used in the study. The laboratory colony material originated from Sennar State (central Sudan) and has been maintained at Sennar town since 2007 (Figure 1). The major consideration in the selection of the study sites was that they represent different ecological regions with different environmental conditions that may affect the distribution and the population genetics of the An. gambiae species complex. The selection was also based on the accessibility of the collection sites. The distance between any two sites in Kassala State exceeds 30 km, which is more than the expected dispersal distance of Anopheles mosquitoes from favorable breeding sites. Table 1 shows the collection sites and the species and number of mosquitoes analyzed.
Indoor-resting wild adult Anopheles mosquitoes were caught in rooms by hand capture using a sucking tube (aspirator) during the rainy seasons of 2008, 2009 and 2010. The collected Anopheles mosquitoes were fixed and preserved individually in 70% ethanol and stored at -20 °C for subsequent processing. Members of the An. gambiae complex were separated from other anopheline mosquitoes using the morphological identification keys of Gillies and De Meillon (1968) and Gillies and Coetzee (1987). The processing of the material for this study was carried out at the Genetics and Molecular Biology laboratory at the Department of Zoology, Faculty of Science, University of Khartoum, Sudan.
1.2 DNA extraction, amplification and sequencing
Genomic DNA was extracted from thorax and abdomen tissues of individual mosquitoes according to the method of Collins et al. (1987) with minor modifications described by Proft et al. (1999). The quantity and quality of the extracted DNA were estimated by absorbance using a Nanodrop spectrophotometer (ND-1000).
PCR and partial sequencing of the IGS region were based on the diagnostic method that uses species-specific nucleotide sequences in the IGS regions of the rRNA gene. The ribosomal set of primers developed by Scott et al. (1993) was used. Three primers were used: two were specific to An. arabiensis and An. gambiae, respectively, and one was a common (universal) primer for both species.

Table 1 Collection sites and the species and number of female An. gambiae species complex mosquitoes used in the study

Species-specific An. arabiensis primer:
Species-specific An. gambiae primer:
Universal 5´ primer sequence:
The amplification reaction followed the protocol described by Scott et al. (1993), slightly modified in the master mix and in the times of the amplification program. PCR reagents were obtained from Vivantis. PCR was performed in a total volume of 25 µL using a thermocycler (Techne, Touchgene Gradient). 5 µL of DNA template (10 ng/µL) were used with 20 µL of PCR mix containing 2.5 µL of 10X PCR buffer (Mg++ free), 1 µL of dNTP mix (each at 10 mM), 1.2 µL of MgCl2, 3 units of Taq polymerase and 2 µL each of forward and reverse primers (20 pmol/25 µL). 11 µL of sterile deionized water were added to bring the final volume to 25 µL.
The PCR was carried out with a program of 30 cycles of denaturation at 94 °C for 1 min, annealing at 50 °C for 1 min and extension at 72 °C for 1 min. After the PCR was completed, 5 µL of the PCR product were mixed with 5 µL of loading dye, electrophoresed through a 2.5% agarose gel and stained with ethidium bromide following the standard method described by Sambrook et al. (1989). The amplified fragments were visualized by illumination with short-wave ultraviolet light and photodocumented.
Samples of purified PCR products were sequenced at Macrogen (www.macrogen.com).
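As a rough bookkeeping aid for the reaction mix described above, the following Python sketch adds up the stated per-reaction volumes and scales them to a batch. It assumes that exactly one forward and one reverse primer volume of 2 µL each enter the mix; the Taq polymerase volume is not stated in the text (only "3 units"), so it is deliberately left out and reported as unaccounted volume. The function names and the 10% pipetting overage are hypothetical.

```python
# Per-reaction volumes (uL) as stated in the text. The Taq volume is not
# given in the source (only "3 units"), so it is not listed here.
STATED_MIX_UL = {
    "10X PCR buffer (Mg++ free)": 2.5,
    "dNTP mix": 1.0,
    "MgCl2": 1.2,
    "forward primer": 2.0,   # assumption: one forward primer volume
    "reverse primer": 2.0,   # assumption: one reverse primer volume
    "sterile deionized water": 11.0,
}
TEMPLATE_UL = 5.0
TOTAL_UL = 25.0


def unaccounted_volume(mix=STATED_MIX_UL, template=TEMPLATE_UL, total=TOTAL_UL):
    """Volume (uL) left over for components whose volume is not stated."""
    return total - template - sum(mix.values())


def scale_mix(n_reactions, overage=0.10, mix=STATED_MIX_UL):
    """Scale the stated volumes to a batch; `overage` is a hypothetical
    allowance for pipetting losses, not a value from the paper."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in mix.items()}


if __name__ == "__main__":
    print(f"Unaccounted volume per reaction: {unaccounted_volume():.1f} uL")
    for name, vol in scale_mix(10).items():
        print(f"{name}: {vol} uL")
```

Under these assumptions the stated components and the template sum to 24.7 µL, leaving about 0.3 µL per reaction, which would be consistent with a small enzyme volume.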
1.3 Sequence analysis
Chromatograms and text files of the sequence results for the seven populations of An. arabiensis and An. gambiae were analyzed using different computer software programs. Mismatches in the alignments were checked by eye for sequence reading errors. The forward and reverse sequence strands of each specimen were matched. The consensus sequences of the An. arabiensis and An. gambiae populations were aligned with the published GenBank reference sequences of Scott et al. (1993). Sequence alignments were done using the software BioEdit and CLC Main Workbench version 6.5. The polymorphisms in the analyzed segments were exported using the software MEGA 5.05 (Tamura et al., 2011). The clustered sequences were then subjected to further analysis.
Using the software DnaSP version 5.10.0, the frequency of each haplotype, haplotype diversity (Nei, 1987) and nucleotide diversity (Tajima, 1983) were calculated. Population genetic differentiation was assessed using Wright's F-statistics (Fst), and levels of gene flow between locations were determined through the effective number of migrants (Nm), also using DnaSP version 5.10.0 (Hudson et al., 1992).
MEGA 5.05 software was used to construct trees of individuals and haplotypes of An. arabiensis and An. gambiae. A neighbor-joining tree (Saitou and Nei, 1987) using the Kimura 2-parameter model (Kimura, 1980) with 1000 bootstrap replicates was constructed from the aligned sequences to identify possible phylogenetic lineages.
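The link between Fst and Nm can be illustrated with a short Python sketch. It assumes Wright's island-model approximation for a diploid nuclear locus, Fst ≈ 1/(4Nm + 1); the text does not state which estimator DnaSP applied here, so this relation and the function names are assumptions for illustration only.

```python
def nm_from_fst(fst, factor=4):
    """Effective number of migrants per generation from Fst under the
    island-model approximation Fst = 1 / (factor * Nm + 1).
    factor=4 corresponds to a diploid nuclear locus (assumed here)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / factor


def fst_from_nm(nm, factor=4):
    """Inverse of nm_from_fst under the same approximation."""
    if nm <= 0:
        raise ValueError("Nm must be positive")
    return 1.0 / (factor * nm + 1.0)


if __name__ == "__main__":
    # Fst value reported in this study for the Kassala State populations.
    print(f"Nm for Fst = 0.067: {nm_from_fst(0.067):.2f}")
    print(f"Fst for Nm = 3.51: {fst_from_nm(3.51):.4f}")
```

With Fst = 0.067 this approximation gives Nm ≈ 3.48, close to the reported 3.51; the small gap may reflect rounding of Fst or a different estimator, which the text does not specify.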
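For readers unfamiliar with the Kimura 2-parameter model, the pairwise distance that feeds the neighbor-joining tree can be sketched in Python as follows. This is a minimal illustration of the published formula (Kimura, 1980), not a reproduction of the MEGA implementation; the function name and the toy sequences are hypothetical, and positions containing gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences:
    d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy sequences: one transition (A/G) and one transversion (C/A).
    print(round(k2p_distance("ACGTACGTAC", "GCGTACGTAA"), 4))
```

A matrix of such distances over all sequence pairs is the input that a neighbor-joining algorithm clusters into a tree.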
2 Results
2.1 Molecular identification of species of An. gambiae complex
The quantity and quality of the template DNA were found to be suitable for PCR amplification. For single female mosquitoes, DNA quantity ranged from 10 to 22 ng/µL and DNA quality ranged from 1.7 to 2.2.
Female mosquitoes of the An. gambiae species complex were identified by PCR as An. arabiensis and An. gambiae. Segments of 315 bp and 390 bp of the IGS region of the rRNA gene were amplified for An. arabiensis and An. gambiae, respectively (Figure 2). An. arabiensis was the predominant species at all the collection sites. All samples from Kassala State were identified as An. arabiensis. In Southern Sudan An. arabiensis was found sympatrically with An. gambiae and represented 72% of the An. gambiae complex in the area.
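The band-size logic behind this identification can be sketched in Python. The expected fragment sizes (315 bp and 390 bp) are taken from the text; the 20 bp tolerance for reading bands off a gel, the function names and the example band sizes are hypothetical choices for illustration.

```python
from collections import Counter

# Diagnostic fragment sizes reported in the text (bp).
EXPECTED_BANDS_BP = {"An. arabiensis": 315, "An. gambiae": 390}


def call_species(band_bp, tolerance_bp=20):
    """Return the species whose expected band lies within `tolerance_bp`
    of the observed band, or None if no species (or more than one) matches.
    The tolerance is a hypothetical value, not taken from the paper."""
    matches = [
        species
        for species, size in EXPECTED_BANDS_BP.items()
        if abs(band_bp - size) <= tolerance_bp
    ]
    return matches[0] if len(matches) == 1 else None


def species_proportions(band_sizes_bp):
    """Proportion of each species call among the bands that could be called."""
    calls = [call_species(b) for b in band_sizes_bp]
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()} if total else {}


if __name__ == "__main__":
    observed = [318, 312, 388, 320]  # made-up band sizes, not study data
    print(species_proportions(observed))
```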

Figure 2 Agarose gel showing DNA fragments of PCR amplification using An. gambiae and An. arabiensis DNA from Sudan and Republic of Southern Sudan

2.2 Sequence alignment and characterization of IGS regions of rRNA gene
The amplified regions corresponded to the rRNA gene segments characterized by Scott et al. (1993). The sequences of the IGS region of the An. arabiensis and An. gambiae populations were aligned with the published sequences of the IGS region of An. arabiensis [GenBank: accession number, U10138] and An. gambiae [GenBank: accession number, U10135] (Scott et al., 1993) (Figure 3). Sequence analysis was carried out on 255 bp and 334 bp of An. arabiensis and An. gambiae, respectively. The analyzed regions corresponded to positions 506-760 for An. arabiensis and 456-789 for An. gambiae in the An. gambiae complex reference sequences of Scott et al. (1993). The sequences of the analyzed regions were deposited in GenBank with accession numbers [KC491792-KC491797] and [KC491806-KC491834] for An. arabiensis and [KC491798-KC491805] for An. gambiae.
2.3 Single nucleotide polymorphisms
Mismatches in the alignments of An. arabiensis sequences indicated that the variations detected by sequencing are substitutions. Three polymorphic sites were identified within the An. arabiensis populations, and no gaps were present. The polymorphic sites were located at positions 590, 693 and 713 of the An. arabiensis reference sequence (Figure 3). The substitution at position 590 (T-A) was found only in colony specimens [accession number, KC491828], while the substitution at position 693 (C-T) was found only in the New Halfa and Aroma populations [accession number, KC491814]. The substitution at position 713 (A-T) was found in all the populations and characterized most individuals of An. arabiensis collected from Wau town [accession number, KC491795].
Of the 35 An. arabiensis specimens sequenced, 23% yielded sequences identical to the An. arabiensis IGS reference sequence. Direct sequencing revealed no substitution, insertion or deletion events within any An. gambiae sequence, i.e. all An. gambiae individuals yielded sequences identical to the published GenBank sequence (Scott et al., 1993).
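The way polymorphic sites are read off an alignment and reported in reference coordinates can be sketched in Python. The function name and the toy alignment are hypothetical; the only values taken from the text are the start and end positions of the analyzed An. arabiensis region (506-760), which are used to map alignment columns to reference positions.

```python
def polymorphic_sites(aligned_seqs, ref_start=1):
    """List (reference_position, alleles) for every alignment column that
    carries more than one base. Gaps ('-') and 'N' are ignored."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for col, column in enumerate(zip(*aligned_seqs)):
        alleles = sorted({base.upper() for base in column} - {"-", "N"})
        if len(alleles) > 1:
            sites.append((ref_start + col, "".join(alleles)))
    return sites


if __name__ == "__main__":
    # Analyzed An. arabiensis region as given in the text: 506-760.
    region_start, region_end = 506, 760
    print("Region length:", region_end - region_start + 1)  # 255 bp

    # Toy alignment (not study data): a single variable column.
    toy = ["ACGT", "ACGA", "ACGT"]
    print(polymorphic_sites(toy, ref_start=region_start))
```

In this numbering, alignment column 0 corresponds to reference position 506, so a site reported at position 590 would sit in column 84 of the analyzed segment.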
2.4 Haplotypes estimation and genetic diversity
Comparison of the IGS regions of the rRNA gene of field An. arabiensis individuals and colony specimens revealed 4 different haplotypes with 3 polymorphic sites (Figure 3). The 4 haplotypes are shared between all An. arabiensis populations, and their average guanine-cytosine (GC) content was 0.519 (51.9%). There are two major groups of haplotypes within the An. arabiensis populations, one being identical to the An. arabiensis IGS reference sequence. The most frequent haplotype is An. arabiensis II, with a frequency of 57%.
The number of haplotypes in each population varies from 1 to 3, and haplotype diversity (Hd) ranges from 0.286 in the Aroma population to 0.600 in the Kassala and New Halfa populations. Nucleotide diversity (π) ranges from 0.00131 in the Wau An. arabiensis population to 0.00341 in the New Halfa population. The An. gambiae population showed the lowest haplotype diversity and nucleotide diversity values (0.000) (Table 2).
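The two diversity statistics reported here can be illustrated with a minimal Python sketch. It uses the standard definitions, Hd = n/(n-1) × (1 - Σ p_i²) for haplotype diversity and the mean number of pairwise differences per site for π; it is not the DnaSP implementation, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise nucleotide differences per site."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diff = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diff / len(pairs) / length


if __name__ == "__main__":
    # Toy sample (not study data): two haplotypes differing at one site.
    sample = ["ACGTACGTAC"] * 3 + ["ACGTACGTAT"] * 2
    print(f"Hd = {haplotype_diversity(sample):.3f}")
    print(f"pi = {nucleotide_diversity(sample):.5f}")
```

A sample in which every individual carries the same sequence gives zero for both statistics, which is the pattern reported for An. gambiae.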

Figure 3 Partial alignments of IGS regions of the rRNA gene of An. gambiae and An. arabiensis from Sudan and Republic of Southern Sudan. The sequences are numbered with reference to the published IGS sequence of An. arabiensis in GenBank [accession number: U10138]. A dot in the alignment indicates that the sequence is identical to the consensus sequence. An. arabiensis I, II, III and IV are the different haplotypes of An. arabiensis

Table 2 Statistical data of 255 bp of IGS regions of rRNA gene polymorphism within An. arabiensis and An. gambiae populations collected from Sudan and Republic of Southern Sudan

2.5 Phylogenetic relationships
All individuals of An. arabiensis and An. gambiae separated into two main clades (Figure 4). One clade consisted of two groups. The first group comprised the An. gambiae sequences obtained in the present study together with the IGS reference sequence of An. gambiae [accession number: U10135]. The second group (the outgroup) comprised an An. melas sequence from GenBank [accession number: U10139] (Scott et al., 1993), which was used to root the phylogeny. The other clade consisted of all An. arabiensis populations plus the IGS reference sequence of An. arabiensis [accession number: U10138].
