A Survey of Text Summarization Systems for Indian Languages
Abstract
Text summarization plays a crucial role in information retrieval, and it is especially challenging for Indian languages, where linguistic diversity and resource scarcity compound the problem. This survey reviews text summarization techniques applied to several major Indian languages, including Hindi, Tamil, Marathi, Punjabi, Bengali, and Kannada. It covers both extractive and abstractive methods, highlighting language-specific strategies, challenges, and progress. Key issues discussed include the lack of large-scale annotated corpora, the complexity of handling linguistic variation, and the difficulty of processing code-mixed and informal text. The paper also outlines future research directions: integrating multilingual models, building large-scale datasets, and developing domain-specific summarization tools. The findings emphasize the importance of culturally and regionally sensitive summarization systems tailored to the characteristics of Indian languages.