Automated Recognition of Chemical Molecule Images Based on an Improved TNT Model

2022 ◽  
Vol 12 (2) ◽  
pp. 680
Author(s):  
Yanchi Li ◽  
Guanyu Chen ◽  
Xiang Li

The automated recognition of optical chemical structures, aided by machine learning, could accelerate research and development. However, historical sources often suffer some degree of image corruption, which can reduce recognition performance to near zero, so chemists need a dependable algorithm to build on. This paper reports the results of research conducted for the Bristol-Myers Squibb-Molecular Translation competition, held on Kaggle, which invited participants to convert old chemical images into their underlying chemical structures, annotated as InChI text; we refer to this task as molecular translation. We propose a transformer-based model for molecular translation. To capture the details of a chemical structure, the extracted image features must be accurate at the pixel level. TNT is an existing transformer model that meets this requirement; however, it was originally designed for image classification and is essentially a transformer encoder, so it cannot be used directly for generation tasks. We also believe that TNT does not integrate the local information of images well, so we improve its core module, the TNT block, and propose a novel module, the Deep TNT block. Stacking this module forms an encoder, which we pair with a vanilla transformer decoder to obtain a chemical formula generation model built on the encoder-decoder structure. Since molecular translation is an image-captioning task, we name the model the Image Captioning Model based on Deep TNT (ICMDT). A comparison with other models shows that ours has advantages in both convergence speed and final description accuracy. We also designed a complete inference and fusion pipeline to further improve the final results.
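
The abstract fixes the overall layout: stacked Deep TNT blocks form the encoder, and a vanilla transformer decoder emits InChI tokens. Since the internals of the Deep TNT block are not given here, the PyTorch sketch below substitutes a plain TNT-style block (an inner transformer refines pixel-level embeddings, which are folded into the patch-level embeddings before an outer transformer runs); all names, dimensions, and the omitted patch/pixel tokenization are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the ICMDT layout, assuming a plain TNT-style block in
# place of the paper's Deep TNT block. Patch/pixel tokenization is omitted.
import torch
import torch.nn as nn

class TNTStyleBlock(nn.Module):
    """Inner transformer over pixel tokens inside each patch; the refined
    pixels are folded into the patch tokens before the outer transformer."""
    def __init__(self, pixel_dim=24, patch_dim=384, pixels_per_patch=16):
        super().__init__()
        self.inner = nn.TransformerEncoderLayer(pixel_dim, nhead=4, batch_first=True)
        self.fold = nn.Linear(pixel_dim * pixels_per_patch, patch_dim)
        self.outer = nn.TransformerEncoderLayer(patch_dim, nhead=6, batch_first=True)

    def forward(self, pixel_tokens, patch_tokens):
        # pixel_tokens: (batch * patches, pixels_per_patch, pixel_dim)
        # patch_tokens: (batch, patches, patch_dim)
        pixel_tokens = self.inner(pixel_tokens)
        b, p, _ = patch_tokens.shape
        folded = self.fold(pixel_tokens.reshape(b, p, -1))
        patch_tokens = self.outer(patch_tokens + folded)
        return pixel_tokens, patch_tokens

class ICMDTSketch(nn.Module):
    """Encoder of stacked TNT-style blocks plus a vanilla transformer decoder
    that predicts InChI tokens autoregressively."""
    def __init__(self, vocab_size=300, depth=4, patch_dim=384):
        super().__init__()
        self.blocks = nn.ModuleList([TNTStyleBlock(patch_dim=patch_dim) for _ in range(depth)])
        self.embed = nn.Embedding(vocab_size, patch_dim)
        layer = nn.TransformerDecoderLayer(patch_dim, nhead=6, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.head = nn.Linear(patch_dim, vocab_size)

    def forward(self, pixel_tokens, patch_tokens, inchi_tokens):
        for blk in self.blocks:
            pixel_tokens, patch_tokens = blk(pixel_tokens, patch_tokens)
        tgt = self.embed(inchi_tokens)
        mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        out = self.decoder(tgt, patch_tokens, tgt_mask=mask)
        return self.head(out)  # per-position logits over the InChI vocabulary
```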

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Daiyou Xiao

Investors commit capital by buying stocks and expect a certain return from the stock market. Before buying, they draw up investment plans from information such as historical stock market transaction data and news about listed companies, and they must collect and analyze these data. The data are cumbersome and demand considerable time and effort, and purely subjective analysis often fails to cover all relevant factors. At the same time, Internet social media, such as posts in stock forums, also influence investors' judgment and behavior: investor sentiment exerts a positive or negative effect on the stock market and thus affects the trend of stock prices. This article therefore proposes a stock market prediction model that first applies data preprocessing to past stock market transaction data, and second designs an image description generation model based on a generative adversarial network. The model comprises a generator and a discriminator. A time-varying pre-attention mechanism is proposed in the generator; it allows each image feature to attend to the image features of other stock markets when predicting stock market trends, so that the decoder can better understand the relational information in the image. The discriminator is based on a recurrent neural network and considers the degree of matching between the input sentence, the four reference sentences, and the image features. Experiments show that the model's accuracy is higher than that of a stock price trend forecasting model based on historical data alone, which demonstrates the effectiveness of the data used in this paper for stock price trend forecasting.
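
As a rough illustration of the generator/discriminator split described above, the PyTorch sketch below pairs an attention-based LSTM generator with a GRU discriminator that scores a candidate sentence against pooled image features. The paper's time-varying pre-attention and four-reference matching are not reproduced; standard multi-head attention stands in, and every module name and size is an assumption.

```python
# Minimal sketch of a GAN-style captioning pair: attention recomputed at every
# decoding step approximates (but does not reproduce) time-varying pre-attention.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """LSTM decoder in which each step attends over all image features
    before emitting the next word."""
    def __init__(self, vocab=5000, feat_dim=512, hid=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, hid)
        self.feat_proj = nn.Linear(feat_dim, hid)
        self.attn = nn.MultiheadAttention(hid, num_heads=8, batch_first=True)
        self.lstm = nn.LSTM(hid * 2, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, feats, tokens):
        # feats: (B, regions, feat_dim); tokens: (B, T)
        keys = self.feat_proj(feats)
        emb = self.embed(tokens)
        ctx, _ = self.attn(emb, keys, keys)        # per-step attention context
        h, _ = self.lstm(torch.cat([emb, ctx], dim=-1))
        return self.out(h)                         # (B, T, vocab) logits

class Discriminator(nn.Module):
    """RNN-based critic: a GRU encodes the sentence, and the final state is
    scored against pooled image features (reference matching omitted)."""
    def __init__(self, vocab=5000, feat_dim=512, hid=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, hid)
        self.gru = nn.GRU(hid, hid, batch_first=True)
        self.score = nn.Linear(hid + feat_dim, 1)

    def forward(self, feats, tokens):
        _, h = self.gru(self.embed(tokens))        # final hidden state
        pooled = feats.mean(dim=1)                 # global image descriptor
        return torch.sigmoid(self.score(torch.cat([h[-1], pooled], dim=-1)))
```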


2009 ◽  
Vol 129 (9) ◽  
pp. 1690-1698
Author(s):  
Manabu Gouko ◽  
Naoki Tomi ◽  
Tomoaki Nagano ◽  
Koji Ito

IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 25360-25370
Author(s):  
Ziwei Zhou ◽  
Liang Xu ◽  
Chaoyang Wang ◽  
Wei Xie ◽  
Shuo Wang ◽  
...  

Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks often face challenges such as single-feature extraction, a lack of global information, and insufficiently detailed description of image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract image features at different scales and to strengthen the model's ability to capture weak features. At the same time, a bidirectional LSTM serves as the decoder to make better use of context information, and the introduction of an improved fuzzy attention mechanism effectively addresses the correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores on objective quantitative evaluation metrics, including BLEU, METEOR, ROUGE-L, and CIDEr, and the generated description sentences accurately express the image content.
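
To make the encoder-decoder pairing concrete, the PyTorch sketch below feeds a densely connected CNN encoder into a bidirectional LSTM decoder through an attention step. The paper's multi-scale feature extraction and fuzzy attention are not reproduced; standard multi-head attention and an off-the-shelf DenseNet-121 backbone stand in, and all names and sizes are illustrative assumptions.

```python
# Minimal sketch of a DenseNet encoder + attention + BiLSTM decoder captioner,
# assuming standard soft attention in place of the paper's fuzzy attention.
import torch
import torch.nn as nn
import torchvision.models as models

class DenseNetBiLSTMCaptioner(nn.Module):
    def __init__(self, vocab=4000, hid=512):
        super().__init__()
        dn = models.densenet121(weights=None)
        self.encoder = dn.features                 # (B, 1024, H', W') feature map
        self.feat_proj = nn.Linear(1024, hid)
        self.embed = nn.Embedding(vocab, hid)
        self.attn = nn.MultiheadAttention(hid, num_heads=8, batch_first=True)
        self.bilstm = nn.LSTM(hid * 2, hid, batch_first=True, bidirectional=True)
        self.out = nn.Linear(hid * 2, vocab)

    def forward(self, images, tokens):
        fmap = self.encoder(images)
        regions = self.feat_proj(fmap.flatten(2).transpose(1, 2))  # (B, H'*W', hid)
        emb = self.embed(tokens)
        ctx, _ = self.attn(emb, regions, regions)  # align words to image regions
        h, _ = self.bilstm(torch.cat([emb, ctx], dim=-1))
        return self.out(h)                         # (B, T, vocab) logits
```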


1982 ◽  
Vol 119 (4-6) ◽  
pp. 348-350
Author(s):  
I.M. Benn ◽  
R.W. Tucker

Author(s):  
Wei Zhao ◽  
Benyou Wang ◽  
Jianbo Ye ◽  
Min Yang ◽  
Zhou Zhao ◽  
...  

In this paper, we propose a Multi-task Learning Approach for Image Captioning (MLAIC), motivated by the fact that humans perform such tasks effortlessly because they possess capabilities spanning multiple domains. Specifically, MLAIC consists of three key components: (i) a multi-object classification model that learns rich category-aware image representations using a CNN image encoder; (ii) a syntax generation model that learns a better syntax-aware LSTM-based decoder; and (iii) an image captioning model that generates textual image descriptions, sharing its CNN encoder with the object classification task and its LSTM decoder with the syntax generation task. In particular, the image captioning model benefits from the additional object categorization and syntax knowledge. To verify the effectiveness of our approach, we conduct extensive experiments on the MS-COCO dataset. The experimental results demonstrate that our model achieves impressive results compared to other strong competitors.
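
The weight-sharing scheme, one CNN encoder shared with classification and one LSTM decoder shared with syntax generation, can be made concrete with a short PyTorch sketch. The ResNet-50 backbone, module sizes, and the unweighted loss sum below are illustrative assumptions rather than the paper's configuration.

```python
# Minimal sketch of three-task sharing: the classification head reuses the CNN
# encoder, and the syntax head reuses the caption decoder's LSTM states.
import torch
import torch.nn as nn
import torchvision.models as models

class MLAICSketch(nn.Module):
    def __init__(self, vocab=10000, syntax_vocab=50, n_classes=80, hid=512):
        super().__init__()
        cnn = models.resnet50(weights=None)
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # shared CNN
        self.cls_head = nn.Linear(2048, n_classes)                # task (i)
        self.embed = nn.Embedding(vocab, hid)
        self.decoder = nn.LSTM(hid, hid, batch_first=True)        # shared LSTM
        self.word_head = nn.Linear(hid, vocab)                    # task (iii)
        self.syntax_head = nn.Linear(hid, syntax_vocab)           # task (ii)
        self.init_h = nn.Linear(2048, hid)

    def forward(self, images, tokens):
        feat = self.encoder(images).flatten(1)                    # (B, 2048)
        h0 = self.init_h(feat).unsqueeze(0)                       # seed decoder
        out, _ = self.decoder(self.embed(tokens), (h0, torch.zeros_like(h0)))
        return self.cls_head(feat), self.word_head(out), self.syntax_head(out)

def multitask_loss(cls_logits, word_logits, syn_logits, labels, words, tags):
    # labels: multi-label float targets; words/tags: next-token index targets.
    bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()
    return (bce(cls_logits, labels)
            + ce(word_logits.flatten(0, 1), words.flatten())
            + ce(syn_logits.flatten(0, 1), tags.flatten()))
```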


Author(s):  
Min Chen ◽  
Shu-Ching Chen

This chapter introduces an advanced content-based image retrieval (CBIR) system, MMIR, in which Markov model mediator (MMM) and multiple instance learning (MIL) techniques are integrated seamlessly and act coherently as a hierarchical learning engine to boost both retrieval accuracy and efficiency. It is well understood that the major bottleneck of CBIR systems is the large semantic gap between low-level image features and high-level semantic concepts; the subjectivity of human perception poses a further challenge. To address these issues, the proposed MMIR system uses the MMM mechanism to drive image-level analysis, together with the MIL technique (with a neural network at its core) to capture and learn object-level semantic concepts in real time with the help of user feedback. In addition, from a long-term learning perspective, MMM mines the user feedback logs to speed up the learning process and to increase retrieval accuracy for a query. Comparative studies on a large set of real-world images demonstrate the promising performance of the proposed MMIR system.
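
To make the long-term learning idea concrete, the Python sketch below gives one plausible reading of the MMM side: a log-derived affinity matrix reweights low-level feature similarity at query time, and positive feedback strengthens the affinity for future queries. The MIL/neural-network object-level stage is omitted, and all functions and parameters here are hypothetical, not the chapter's implementation.

```python
# Minimal sketch: log-derived affinity boosts feature similarity, and positive
# relevance feedback incrementally updates the affinity matrix.
import numpy as np

def rank_images(query_feat, feats, affinity, query_idx):
    # Cosine feature similarity modulated by accumulated query-image affinity.
    sims = feats @ query_feat / (
        np.linalg.norm(feats, axis=1) * np.linalg.norm(query_feat) + 1e-9)
    scores = sims * (1.0 + affinity[query_idx])    # long-term log boost
    return np.argsort(-scores)                     # best matches first

def update_affinity(affinity, query_idx, positive_ids, lr=0.1):
    # Images the user marks relevant become more reachable from this query.
    affinity[query_idx, positive_ids] += lr
    return affinity

# Usage: 100 images with 64-d features, affinity initialized to zero.
feats = np.random.rand(100, 64)
affinity = np.zeros((100, 100))
order = rank_images(feats[0], feats, affinity, query_idx=0)
affinity = update_affinity(affinity, 0, order[:5])  # top-5 marked relevant
```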

