Euglena gracilis PacBio sequencing reads corresponding to the partially spliced transcripts of the gapC and tubA genes (in fasta format)

Deposited are the analyzed sequencing data for amplicons obtained by RT-PCR of the partially spliced gapC and tubA transcripts of Euglena gracilis. The PCR products were sequenced commercially on a PacBio RS II (Pacific Biosciences) instrument. The sequencing provider performed the initial quality control and trimming of the reads. The analyzed reads for the three amplicons (gapC, i6-tubA and i5-tubA) were assembled into contigs (c) and assigned to one of two categories: accepted (reads of sufficient quality, retained for further analysis) or discarded (reads rejected from further analysis). They are organized accordingly in the deposited folders (f). For the i6- and i5-tubA amplicons only, some of the discarded reads were shared by both amplicons as a result of the preliminary quality check (mapping of the sequences to the reference based on their length). These reads are provided in a separate folder (f: tubA-i6i5_discarded_reads).

The folder structure is as follows:
f: gapC - accepted_reads (29 files in fasta format), discarded_reads (29 files in fasta format)
f: i5-tubA - accepted_reads (21 files in fasta format), discarded_reads (10 files in fasta format)
f: i6-tubA - accepted_reads (20 files in fasta format), discarded_reads (10 files in fasta format)
f: i6i5-tubA discardedreads (23 files in fasta format)

The files are named according to the following example scheme:
c1_gapC_accepted#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the accepted category in analysis #1
c1_gapC_discarded#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the discarded category in analysis #1

For the gapC amplicon we performed three identical analyses (#1, #2, #3) in order to examine the sequencing reads effectively and reliably; for the i6- and i5-tubA amplicons we performed two (#1, #2).
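For readers who want to work with the files programmatically, the file-naming scheme described above can be unpacked with a short Python sketch. This is only an illustration and is not part of the deposited dataset: the pattern is inferred from the two gapC example names, the optional `.fasta`/`.fa` extension is an assumption, and the names `parse_name`, `ReadFile` and `count_fasta_records` are hypothetical helpers introduced here.

```python
import re
from typing import Iterable, NamedTuple

# Pattern inferred from the example names c1_gapC_accepted#1 and
# c1_gapC_discarded#1. The optional file extension is an assumption;
# the dataset description does not state one.
NAME_RE = re.compile(
    r"^c(?P<contig>\d+)_(?P<amplicon>.+)_(?P<category>accepted|discarded)"
    r"#(?P<analysis>\d+)(?:\.(?:fasta|fa))?$"
)


class ReadFile(NamedTuple):
    contig: int      # contig number, e.g. 1 for "c1"
    amplicon: str    # e.g. "gapC"
    category: str    # "accepted" or "discarded"
    analysis: int    # analysis number, e.g. 1 for "#1"


def parse_name(name: str) -> ReadFile:
    """Split a file name of the form c<N>_<amplicon>_<category>#<M>."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"name does not follow the expected scheme: {name!r}")
    return ReadFile(int(m["contig"]), m["amplicon"], m["category"], int(m["analysis"]))


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count reads in FASTA text by counting header lines starting with '>'."""
    return sum(1 for line in lines if line.startswith(">"))
```

A typical use would be to call `parse_name` on each file name in a folder and group the results by `amplicon`, `category` and `analysis`, then pass an open file handle to `count_fasta_records` to tally the reads per contig.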

Publisher: RepOD

Publication year: 2018

Type of resource: Dataset

Area of study: Natural and mathematical sciences

License for files: not specified

Authors

Author | Affiliation
Gumińska, Natalia | Department of Molecular Phylogenetics and Evolution, Institute of Botany, Faculty of Biology, Biological and Chemical Research Center, University of Warsaw, ul. Żwirki i Wigury 101, 02-089 Warsaw, Poland
Płecha, Magdalena | Department of Molecular Phylogenetics and Evolution, Institute of Botany, Faculty of Biology, Biological and Chemical Research Center, University of Warsaw, ul. Żwirki i Wigury 101, 02-089 Warsaw, Poland
Zakryś, Bożena | Department of Molecular Phylogenetics and Evolution, Institute of Botany, Faculty of Biology, Biological and Chemical Research Center, University of Warsaw, ul. Żwirki i Wigury 101, 02-089 Warsaw, Poland
Milanowski, Rafał | Department of Molecular Phylogenetics and Evolution, Institute of Botany, Faculty of Biology, Biological and Chemical Research Center, University of Warsaw, ul. Żwirki i Wigury 101, 02-089 Warsaw, Poland

Cite this dataset as:

Gumińska, N.; Płecha, M.; Zakryś, B.; Milanowski, R. (2018) Euglena gracilis PacBio sequencing reads corresponding to the partially spliced transcripts of the gapC and tubA genes (in fasta format). RepOD. http://dx.doi.org/10.18150/repod.9692603

Publicly available in RepOD since: 2018-05-09 15:11 (CEST)