Sequence based Learning for Solubility Prediction from Molecular Smiles

K. Venkateswara Rao, Kunjam Nageswara Rao, G. Sita Ratnam

Abstract

Molecular property prediction, which covers properties such as solubility and toxicity, is one of the time-consuming steps in drug discovery. We propose a Bi-LSTM approach for predicting the solubility of targets identified at the target identification step of drug discovery. The inputs to this sequence-based approach are SMILES (Simplified Molecular Input Line Entry System) strings, which represent molecules as character sequences. Each SMILES string is first tokenized, that is, broken into tokens. The tokens are then passed to an embedding layer, which converts them into dense vectors. The model is trained and tested on the data and then applied to produce predictions. On molecular SMILES representations taken from the FreeSolv dataset, the proposed model achieves an RMSE of 1.22 and outperforms traditional models.
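
The tokenization and evaluation steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expression, the `<pad>` token, and the function names `tokenize`, `build_vocab`, `encode`, and `rmse` are assumptions introduced here for illustration. The sketch stops at integer token ids, which is the form an embedding layer would typically take as input; the Bi-LSTM itself is not shown.

```python
import math
import re

# Assumed token pattern (not from the paper): bracket atoms such as [C@H],
# the two-letter halogens Br and Cl, two-digit ring closures such as %12,
# and otherwise any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Break one SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Assign an integer id to every token seen; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for smiles in smiles_list:
        for token in tokenize(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(smiles, vocab, max_len):
    """Map tokens to ids, then truncate or right-pad with 0 to max_len."""
    ids = [vocab[token] for token in tokenize(smiles)][:max_len]
    return ids + [0] * (max_len - len(ids))


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    molecules = ["CCl", "CO"]  # toy inputs, not taken from FreeSolv
    vocab = build_vocab(molecules)
    print(tokenize("C[C@H](N)Br"))
    print(encode("CCl", vocab, max_len=4))
```

The fixed-length id lists produced by `encode` are what a sequence model would consume in batches, and `rmse` is the metric the abstract reports for the proposed model.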

Article Details

How to Cite
K. Venkateswara Rao. (2023). Sequence based Learning for Solubility Prediction from Molecular Smiles. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11), 1609–1617. Retrieved from https://www.ijritcc.org/index.php/ijritcc/article/view/11076