Validating Seed Data Samples for Synthetic Identities – Methodology and Uniqueness Metrics

Spaced Seed Data Structures for De Novo Assembly

International Journal of Genomics ◽

10.1155/2015/196591 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8 ◽

Cited By ~ 3

Author(s):

Inanç Birol ◽

Justin Chu ◽

Hamid Mohamadi ◽

Shaun D. Jackman ◽

Karthika Raghavan ◽

...

Keyword(s):

Data Structure ◽

Data Structures ◽

De Novo ◽

Bloom Filters ◽

De Bruijn Graph ◽

Sequence Specificity ◽

Sequencing Errors ◽

Spaced Seeds ◽

Read Error Correction ◽

Seed Data

De novo assembly of the genome of a species is essential in the absence of a reference genome sequence. Many scalable assembly algorithms use the de Bruijn graph (DBG) paradigm to reconstruct genomes, where a table of subsequences of a certain length is derived from the reads, and their overlaps are analyzed to assemble sequences. Despite longer subsequences unlocking longer genomic features for assembly, associated increase in compute resources limits the practicability of DBG over other assembly archetypes already designed for longer reads. Here, we revisit the DBG paradigm to adapt it to the changing sequencing technology landscape and introduce three data structure designs for spaced seeds in the form of paired subsequences. These data structures address memory and run time constraints imposed by longer reads. We observe that when a fixed distance separates seed pairs, it provides increased sequence specificity with increased gap length. Further, we note that Bloom filters would be suitable to implicitly store spaced seeds and be tolerant to sequencing errors. Building on this concept, we describe a data structure for tracking the frequencies of observed spaced seeds. These data structure designs will have applications in genome, transcriptome and metagenome assemblies, and read error correction.
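The idea of storing paired subsequences in a Bloom filter can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `spaced_seeds`, the class `BloomFilter`, the `left|right` key encoding, and the parameter values are all assumptions chosen for illustration. Each seed is a pair of k-mers separated by a fixed gap, and only the two flanking k-mers are hashed, so bases inside the gap do not affect membership.

```python
import hashlib


def spaced_seeds(read, k, gap):
    """Yield (left, right) k-mer pairs separated by a fixed gap of `gap` bases."""
    span = 2 * k + gap
    for i in range(len(read) - span + 1):
        yield read[i:i + k], read[i + k + gap:i + span]


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive independent hash values by salting BLAKE2b with the hash index.
        for index in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("ascii"),
                digest_size=8,
                salt=index.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def seed_key(left, right):
    """Hypothetical key encoding: only the flanking k-mers are stored."""
    return f"{left}|{right}"


# Toy usage: index the spaced seeds of one short read.
read = "ACGTACGTAC"
seed_filter = BloomFilter(num_bits=1 << 16, num_hashes=3)
for left, right in spaced_seeds(read, k=3, gap=2):
    seed_filter.add(seed_key(left, right))
```

Because the gap bases are never hashed, a read with a substitution inside the gap produces the same key, which is one simple way to picture the error tolerance mentioned in the abstract. Tracking seed frequencies would require a counting variant, which this sketch does not attempt.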


Positivity of the T-System Cluster Algebra

The Electronic Journal of Combinatorics ◽

10.37236/229 ◽

2009 ◽

Vol 16 (1) ◽

Cited By ~ 10

Author(s):

Philippe Di Francesco ◽

Rinat Kedem

Keyword(s):

Continued Fraction ◽

Cluster Algebra ◽

Path Model ◽

Partition Functions ◽

Weighted Graphs ◽

Generic Boundary ◽

Seed Data ◽

Q System ◽

Extra Parameter ◽

T System

We give the path model solution for the cluster algebra variables of the $T$-system of type $A_r$ with generic boundary conditions. The solutions are partition functions of (strongly) non-intersecting paths on weighted graphs. The graphs are the same as those constructed for the $Q$-system in our earlier work, and depend on the seed or initial data in terms of which the solutions are given. The weights are "time-dependent" where "time" is the extra parameter which distinguishes the $T$-system from the $Q$-system, usually identified as the spectral parameter in the context of representation theory. The path model is alternatively described on a graph with non-commutative weights, and cluster mutations are interpreted as non-commutative continued fraction rearrangements. As a consequence, the solution is a positive Laurent polynomial of the seed data.
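For orientation, the recursion relations usually meant by these names can be written as follows. The indexing and normalization shown here are an assumption based on the common form of the $A_r$ systems and may differ from the conventions of the paper itself.

$$T_{\alpha,j,k+1}\,T_{\alpha,j,k-1} = T_{\alpha,j+1,k}\,T_{\alpha,j-1,k} + T_{\alpha+1,j,k}\,T_{\alpha-1,j,k}, \qquad T_{0,j,k} = T_{r+1,j,k} = 1,$$

with $1 \le \alpha \le r$. Dropping the dependence on the extra parameter $j$ gives the $Q$-system:

$$Q_{\alpha,k+1}\,Q_{\alpha,k-1} = Q_{\alpha,k}^{2} + Q_{\alpha+1,k}\,Q_{\alpha-1,k}, \qquad Q_{0,k} = Q_{r+1,k} = 1.$$

As a small illustration of the positivity statement, take $r = 1$ with seed data $(Q_{1,0}, Q_{1,1})$. One step of the recursion gives

$$Q_{1,2} = \frac{Q_{1,1}^{2} + 1}{Q_{1,0}},$$

which is a Laurent polynomial in the seed data with non-negative coefficients.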


Seed Data Analysis for Production Fermenter Performance Estimation

Computer Applications in Biotechnology ◽

10.1016/b978-0-08-042377-7.50013-1 ◽

1995 ◽

pp. 53-58 ◽

Cited By ~ 2

Author(s):

M. Ignova ◽

J. Glassey ◽

A.C. Ward ◽

G.A. Montague ◽

T.S. Irvine

Keyword(s):

Data Analysis ◽

Performance Estimation ◽

Seed Data


Spaced seed data structures

2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2014.6999305 ◽

2014 ◽

Cited By ~ 1

Author(s):

Inanc Birol ◽

Hamid Mohamadi ◽

Anthony Raymond ◽

Karthika Raghavan ◽

Justin Chu ◽

...

Keyword(s):

Data Structures ◽

Seed Data


Reproductive ecology of Dryas integrifolia in the high Arctic semi-desert

Canadian Journal of Botany ◽

10.1139/b96-175 ◽

1996 ◽

Vol 74 (9) ◽

pp. 1451-1460 ◽

Cited By ~ 10

Author(s):

P. G. Krannitz

Keyword(s):

Seed Dispersal ◽

Seed Set ◽

High Arctic ◽

The Arctic ◽

Bet Hedging ◽

Crop Failure ◽

Flower Stalk ◽

Insect Visitation ◽

Solar Noon ◽

Seed Data

Flowering and fruiting of Dryas integrifolia were studied at Igloolik and Pangnirtung to analyse the importance of variation in heliotropy and flower size to seed set and weight. In addition, peduncle elongation and seed plume length were also studied to analyse variation in seed dispersal characters. At both Igloolik and Pangnirtung, most Dryas flowers were not heliotropic throughout the course of the day, but in general, pointed towards the solar noon sun. Benefits to orienting toward the sun were warmer gynoecial temperatures, heavier seeds, and more insect visitation (though not percent seed set). Flowers varied in size from 1.2 to 2.7 cm in diameter and differed in size between plants. Even though larger flowers did not point towards the solar noon sun more than smaller flowers, they had heavier and proportionally more seeds. Variation in peduncle elongation suggests the potential for conservative dispersion when a flower has produced only a few propagules: flowers with fewer or no seeds had shorter stalks. Similarly, with good seed production, a bet hedging strategy is apparent: seeds located at the centre of the receptacle had much longer plumes than those at the perimeter of the seed head. All seed data were from Pangnirtung; the cold summer in 1986 at Igloolik resulted in a complete seed crop failure. Despite the adversity of the arctic climate, there are moderate summers during the lifetime of perennial plants such as D. integrifolia in which adaptations like those described in this study benefit the production of sexual offspring. Keywords: heliotropism, flower stalk elongation, basking insects, seed dispersal, insolation, bet hedging.


Research and Application of Network Technology and Online Translation Tools in English Translation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.998-999.1178 ◽

2014 ◽

Vol 998-999 ◽

pp. 1178-1181

Author(s):

Nan Lu

Keyword(s):

Evaluation Method ◽

Target Language ◽

Surface Pattern ◽

Svm Classifier ◽

High Quality ◽

Web Based ◽

Ranking Svm ◽

Translation Tools ◽

Seed Data ◽

The Web

This paper proposes a novel method for extracting bilingual translation pairs from the web. Based on the observation that translation pairs tend to appear collectively on the web, a recursive process is used to extract high-quality translation pairs. First, the search engine is queried with some seed data and the returned pages are crawled. Next, the Collective Translation Pair Block (CTPB), which contains the collective translation pairs, is identified using a heuristic evaluation method. After the CTPB has been identified, a PAT tree is employed to generate the extraction patterns automatically. A ranking SVM model is then used to re-rank these patterns based on the F measure, and the top 10 patterns are adopted to extract the translation pairs with the help of surface patterns. Finally, to obtain high-quality translations, the extracted translation pairs are verified by an SVM classifier based on the translation relevance between the source and the target language.


Analysis of Benefit Cost Ratio of Sesame Production under Demonstration in Low Lands of Western Zone, Tigray, Ethiopia

South Asian Journal of Social Studies and Economics ◽

10.9734/sajsse/2021/v11i330283 ◽

2021 ◽

pp. 1-7

Author(s):

Tewoderos Meleaku ◽

Desaly Gebre Tshadike ◽

Goteom Zenbe

Keyword(s):

Production Cost ◽

Cost Benefit ◽

The Other ◽

Cost Ratio ◽

Benefit Cost Ratio ◽

Western Zone ◽

The Mean ◽

Seed Data ◽

The Cost ◽

Benefit Cost

This study investigated the per-hectare cost-benefit of sesame production under three practices (farmers' practice, partial package and full package) that farmers carried out side by side on their plots. The benefit-cost ratio analysis of sesame was conducted in the western lowlands of Tigray. It covered the production year 2016/17 E.C and two woredas with six production sites. Forty sesame producers were included as respondents. Producers were categorized into full package (row planting, fertilizer and improved seed users), partial package (broadcast, fertilizer and improved seed users) and non package (broadcast and improved seed). Data were analyzed using SPSS version 16 in terms of percentage, mean, model and others. Per-hectare yield, return, production cost, and benefit-cost ratio differed statistically between the packages. The mean productivity per hectare for full package, partial package and non package was 6.55, 5.26 and 3.85 quintals, respectively. The mean return per hectare of full package, partial package, and non package was 26243.75, 21746.25 and 13178.91 birr, respectively. The production cost per hectare of full package, partial package, and non package was 13826.74, 12561.35 and 8681.46 birr, respectively. The mean benefit-cost ratio was 1.90, 1.74 and 1.50, respectively, for full package, partial package and non package.
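The arithmetic behind a benefit-cost ratio can be checked with a short Python sketch using the mean return and mean production cost figures quoted in the abstract. The function name and dictionary layout are illustrative assumptions. The sketch divides mean return by mean cost, which is only one possible definition; the abstract does not say how its mean ratio was computed, and an average of per-respondent ratios would generally give slightly different values.

```python
# Mean return and mean production cost per hectare (birr), as quoted in the abstract.
PACKAGES = {
    "full package": (26243.75, 13826.74),
    "partial package": (21746.25, 12561.35),
    "non package": (13178.91, 8681.46),
}


def benefit_cost_ratio(mean_return, mean_cost):
    """Ratio of mean return to mean cost (an assumed definition, see lead-in)."""
    return mean_return / mean_cost


ratios = {
    name: round(benefit_cost_ratio(ret, cost), 2)
    for name, (ret, cost) in PACKAGES.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Under this definition the ratios come out at 1.90, 1.73 and 1.52, close to but not identical with the reported 1.90, 1.74 and 1.50, which is consistent with the reported values having been averaged over individual respondents.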


Bootstrapping a Persian Dependency Treebank

Linguistic Issues in Language Technology ◽

10.33011/lilt.v7i.1297 ◽

2012 ◽

Vol 7 ◽

Author(s):

Mojgan Seraji ◽

Beáta Megyesi ◽

Joakim Nivre

Keyword(s):

Open Source ◽

Data Set ◽

Parts Of Speech ◽

Seed Data ◽

Dependency Parser ◽

Syntactic Dependency ◽

Ongoing Project

This paper presents an ongoing project whose goal is to create a freely available dependency treebank for Persian. The data is taken from the Bijankhan corpus, which is already annotated for parts of speech, and a syntactic dependency annotation based on the Stanford Typed Dependencies is added through a bootstrapping procedure involving the open-source dependency parser MaltParser. We report preliminary parsing experiments with promising results after training the parser on a manually annotated seed data set of 215 sentences.


Sistem Pembibitan PT. Agrowisata Porlak Parna Berbasis Web

MIND Journal ◽

10.26760/mindjournal.v5i2.81-91 ◽

2021 ◽

Vol 5 (2) ◽

pp. 81-91

Author(s):

JOICE ANGELINA PURBA ◽

JURMIDA PULUNGAN ◽

MARDI TURNIP ◽

ADVENT TORAS MARBUN

Keyword(s):

Programming Language ◽

Conventional System ◽

Waterfall Model ◽

Seed Data

Abstract: PT. Agrowisata Porlak Parna runs a program that provides and distributes seedlings to preserve the area around Lake Toba. However, processing and distributing the seed data has been difficult because the company still uses a conventional system. For this reason, a nursery system needs to be designed in the form of a website, with PHP as the programming language and the MySQL DBMS as the database, developed following the waterfall model. The design produces a nursery system that makes it easier for admins to process seed data and lets adopters follow the development of their seedlings, which eases the work of employees and improves the company's performance. Keywords: Nursery System, Web, PHP, MySQL DBMS, Waterfall


A NONPARAMETRIC MULTI-SEED DATA CLUSTERING TECHNIQUE

Journal of the Chinese Institute of Industrial Engineers ◽

10.1080/10170660809509067 ◽

2008 ◽

Vol 25 (1) ◽

pp. 1-10

Author(s):

Tseng-Pin Lee ◽

Victor B. Kreng

Keyword(s):

Data Clustering ◽

Clustering Technique ◽

Seed Data
