A CNN and LSTM-based Model for Creating Captions for Photos

Bala Murali Krishna Thati; Swathi Voddi; Srikanth Busa; Surendra Surendra; J N V R Swarup Kumar; M.V.L.N. Raja Rao

doi:10.17762/ijritcc.v11i6s.6965

Published: Jun 14, 2023

Keywords:

Feature Extraction, Image Analysis, Neural Network, Deep Learning, Text Analysis

Bala Murali Krishna Thati

Professor, Department of CSE, Dhanekula Institute of Engineering & Technology, Ganguru, Vijayawada

Swathi Voddi

Assistant Professor, Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada

Srikanth Busa

Professor, Department of CSE, Kallam Haranadhareddy Institute of Technology, Chowdavaram, Guntur, AP

Surendra

Assistant Professor, Department of CSE, K L Deemed to be University, Green Fields, Vaddeswaram, AP

J N V R Swarup Kumar

Assistant Professor, Department of CSE, GITAM School of Technology, GITAM (Deemed to be University), Visakhapatnam

M.V.L.N. Raja Rao

Professor, Department of Information Technology, Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, AP

Abstract

Can a machine interpret the meaning of an image as quickly as the human brain does on seeing it? Computer vision researchers studied this problem intensively and, until recently, considered it unsolvable. Advances in deep learning techniques, the availability of large datasets, and greater processing power now make it possible to build models that generate captions for pictures. This article presents a Python implementation of such a model, which combines a convolutional neural network (CNN) with a particular kind of recurrent neural network, the long short-term memory (LSTM) network, to generate the captions.
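The abstract gives no architectural details, so the following is only a minimal sketch of the general pattern that CNN and LSTM captioners commonly follow: a feature vector (which a CNN encoder would extract from the photo) initialises an LSTM's hidden state, and the LSTM then emits one word at a time until it produces an end token. Everything in it is an assumption made for illustration and not taken from the article: the toy vocabulary, the untrained random weights, the omission of bias terms, the names `lstm_step` and `greedy_caption`, and the fixed `features` list standing in for real CNN output.

```python
import math
import random

VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]  # toy vocabulary (hypothetical)
HIDDEN = 4  # size of the LSTM hidden state and of the word embeddings


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(seed=0):
    """Untrained random weights; a real model would learn these from captioned images."""
    rng = random.Random(seed)
    params = {"embed": random_matrix(rng, len(VOCAB), HIDDEN),
              "out": random_matrix(rng, len(VOCAB), HIDDEN)}
    for gate in ("i", "f", "o", "g"):
        # Each gate sees the concatenation [input, previous hidden state].
        params[gate] = random_matrix(rng, HIDDEN, 2 * HIDDEN)
    return params


def lstm_step(params, x, h, c):
    """One LSTM time step (biases omitted): returns the new hidden and cell states."""
    xh = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in matvec(params["i"], xh)]    # input gate
    f = [sigmoid(v) for v in matvec(params["f"], xh)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["o"], xh)]    # output gate
    g = [math.tanh(v) for v in matvec(params["g"], xh)]  # candidate cell values
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def greedy_caption(params, image_features, max_len=10):
    """Generate words one at a time, always picking the highest-scoring next word."""
    h = list(image_features)  # image features initialise the hidden state
    c = [0.0] * HIDDEN
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        h, c = lstm_step(params, params["embed"][token], h, c)
        scores = matvec(params["out"], h)  # one score per vocabulary word
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


# Stand-in for the feature vector a CNN encoder would extract from a photo.
features = [0.2, -0.1, 0.4, 0.05]
caption = greedy_caption(init_params(), features)
print(" ".join(caption))
```

Because the weights are random, the printed words are meaningless; the sketch only shows how data flows from image features through the recurrent decoder. In practice the encoder would typically be a pretrained CNN and both parts would be trained on paired images and captions with a deep learning library.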

How to Cite

Krishna Thati, B. M., Voddi, S., Busa, S., Surendra, S., Kumar, J. N. V. R. S., & Rao, M. R. (2023). A CNN and LSTM-based Model for Creating Captions for Photos. International Journal on Recent and Innovation Trends in Computing and Communication, 11(6s), 543–547. https://doi.org/10.17762/ijritcc.v11i6s.6965

Issue

Vol. 11 No. 6s (2023): Advances in Computational Modeling and Simulation of Computing Systems

Section

Articles

Contact Us:

Auricle Global Society of Education and Research

Y-18-A, Near Sanskar Play School, Sudarshana Nagar,

Bikaner, Rajasthan (India). Pin 334003

Email: editor@ijritcc.org
