Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus

Data ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 110
Author(s):  
Daniela Almeida ◽  
Dany Domínguez-Pérez ◽  
Ana Matos ◽  
Guillermin Agüero-Chapin ◽  
Yuselis Castaño ◽  
...  

Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production.
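As a rough illustration of how several smaller FASTA databases might be combined into one composite file, the sketch below merges in-memory FASTA sources and keeps only the first entry for each distinct sequence. It is not the authors' pipeline (their scripts and command lines are in Dataset_5 and Dataset_6); the function names `parse_fasta` and `merge_nonredundant`, and the choice to drop exact duplicate sequences across all sources, are assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_nonredundant(databases: Iterable[Iterable[str]]) -> Dict[str, str]:
    """Merge FASTA sources into a {sequence: header} map.

    Assumption for this sketch: two entries are redundant when their
    sequences are identical (case-insensitive); the first header wins.
    """
    merged: Dict[str, str] = {}
    for db in databases:
        for header, seq in parse_fasta(db):
            key = seq.upper()
            if key and key not in merged:
                merged[key] = header
    return merged


def to_fasta(merged: Dict[str, str]) -> str:
    """Serialise the merged map back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for seq, header in merged.items())


if __name__ == "__main__":
    # Toy inputs; real sources would be file handles opened on .fasta files.
    public_db = [">P1 toy protein", "MKV", "LL", ">P2 toy protein", "AAA"]
    amp_db = [">AMP1 toy peptide", "mkvll", ">AMP2 toy peptide", "GGG"]
    print(to_fasta(merge_nonredundant([public_db, amp_db])), end="")
```

Because `parse_fasta` accepts any iterable of lines, open file handles can be passed directly in place of the toy lists.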

Antibiotics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 757 ◽  
Author(s):  
Daniela Almeida ◽  
Dany Domínguez-Pérez ◽  
Ana Matos ◽  
Guillermin Agüero-Chapin ◽  
Hugo Osório ◽  
...  

Cephalopods are successful predators that can use a mixture of substances to subdue their prey, which makes them interesting sources of bioactive compounds. In addition to neurotoxins and enzymes, the presence of antimicrobial compounds has been reported. Recently, the transcriptome and the whole proteome of the Octopus vulgaris salivary apparatus were released, but the role of some compounds, such as histones, antimicrobial peptides (AMPs), and toxins, remains unclear. Herein, we profiled the proteome of the posterior salivary glands (PSGs) of O. vulgaris using two sample preparation protocols combined with a shotgun-proteomics approach. Protein identification was performed against a composite database comprising data from the UniProtKB, all transcriptomes available from the cephalopods’ PSGs, and a comprehensive non-redundant AMP database. Out of the 10,075 proteins clustered in 1868 protein groups, 90 clusters corresponded to venom protein toxin families. Additionally, we detected putative AMPs clustered with histones previously found as abundant proteins in the saliva of O. vulgaris. Some of these histones, such as H2A and H2B, are involved in systemic inflammatory responses, and their antimicrobial effects have been demonstrated. These results not only confirm the production of enzymes and toxins by the O. vulgaris PSGs but also suggest their involvement in the first line of defense against microbes.


2012 ◽  
Vol 24 (2) ◽  
pp. 660-675 ◽  
Author(s):  
Anna Stengel ◽  
Irene L. Gügel ◽  
Daniel Hilger ◽  
Birgit Rengstl ◽  
Heinrich Jung ◽  
...  

2021 ◽  
Vol 18 (2) ◽  
pp. 170-175 ◽  
Author(s):  
Haoyu Cheng ◽  
Gregory T. Concepcion ◽  
Xiaowen Feng ◽  
Haowen Zhang ◽  
Heng Li

Author(s):  
Guangtu Gao ◽  
Susana Magadan ◽  
Geoffrey C Waldbieser ◽  
Ramey C Youngblood ◽  
Paul A Wheeler ◽  
...  

Currently, there is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that will represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the de novo genome assembly of the Arlee line from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). It is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% was in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations identified in the Arlee genome included the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
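For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch computes it under the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are assumptions for illustration only; they are not the Arlee assembly data.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating length until
    # at least half of the total assembly length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # hypothetical scaffold lengths
    print(n50(toy_lengths))
```

Comparing `2 * running` with `total` keeps the check in integer arithmetic, so no rounding is involved when the total length is odd.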


2018 ◽  
Vol 19 (2) ◽  
pp. 520 ◽  
Author(s):  
Le Zhao ◽  
Xinmei Zhang ◽  
Zhongying Qiu ◽  
Yuan Huang

Data in Brief ◽  
2020 ◽  
Vol 31 ◽  
pp. 105917
Author(s):  
Marianela Cobos ◽  
Hicler N. Rodríguez ◽  
Segundo L. Estela ◽  
Carlos G. Castro ◽  
J. Dylan Maddox ◽  
...  
