Genome-Scale Algorithm Design: Biological Sequence Analysis in the Era of High-Throughput Sequencing. By Veli Mäkinen, Djamal Belazzougui, Fabio Cunial, and Alexandru I. Tomescu. Cambridge and New York: Cambridge University Press. $64.99. xxii + 391 p.; ill.; index. ISBN: 978-1-107-07853-6. 2015.

2017, Vol 92 (1), p. 107
2021
Author(s): Hitoshi Iuchi, Taro Matsutani, Keisuke Yamada, Shunsuke Sumi, Shion Hosoda, ...

Remarkable advances in high-throughput sequencing have led to rapid data accumulation, and analyzing biological (DNA/RNA/protein) sequences to gain new biological insights has become both more critical and more challenging. To address this, the application of natural language processing (NLP) to biological sequence analysis has received increasing attention, because biological sequences can be regarded as sentences and the k-mers in these sequences as words. Embedding, which converts words into vectors, is an essential step in NLP. This transformation is a form of representation learning and can be applied to biological sequences. Vectorized biological sequences can be used for function and structure estimation, or as inputs to other probabilistic models. Given the importance of representation learning in biology and its growing application, we review here the existing knowledge on representation learning for biological sequence analysis.
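The "sequences as sentences, k-mers as words" analogy in the abstract can be illustrated with a minimal Python sketch. It is not taken from the reviewed article: the function names `kmer_tokens` and `kmer_count_vector`, the choice of k = 3, and the DNA alphabet are assumptions made for illustration. Note that a k-mer count vector is only a simple bag-of-words representation, not a learned embedding of the kind the review surveys.

```python
from collections import Counter
from itertools import product


def kmer_tokens(sequence, k=3):
    """Split a sequence into overlapping k-mers (the "words" of the analogy)."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_count_vector(sequence, k=3, alphabet="ACGT"):
    """Map a sequence to a fixed-length vector of k-mer counts.

    The vocabulary is every possible k-mer over the alphabet, in
    lexicographic order, so all sequences share the same dimensions.
    This is a bag-of-words baseline, not a learned representation.
    """
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(kmer_tokens(sequence, k))
    return [counts[word] for word in vocabulary]


# Hypothetical example sequence, chosen only for illustration.
tokens = kmer_tokens("ATGCGA", k=3)
vector = kmer_count_vector("ATGCGA", k=3)
print(tokens)
print(len(vector), sum(vector))
```

Learned embeddings replace the count vector with dense vectors trained on large sequence collections, but the tokenization step shown here is the same.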

