Comparative Analysis of Classification Algorithms for Printed Script

Arey Vikyath Reddy; Pulimamidi Nikhitha (Vardhaman College of Engineering); Kadam Rahul; Muni Sekhar Velpuru

doi:10.63169/GCARED2025.p20

Abstract

Recognizing printed characters across multiple languages is challenging because scripts differ widely in complexity and structure. This study analyzes several classification algorithms for multilingual Optical Character Recognition (OCR), assessing their accuracy, speed, and computational efficiency. It compares traditional machine learning approaches, such as Support Vector Machines (SVM) and k-Nearest Neighbors (k-NN), with deep learning models, particularly Convolutional Neural Networks (CNNs). The dataset comprises printed text in diverse scripts, including Devanagari, Arabic, Chinese, and Telugu. Preprocessing techniques, including noise reduction, binarization, and normalization, are applied to enhance recognition performance. The findings indicate that CNNs significantly surpass the traditional methods in accuracy, demonstrating their effectiveness on complex and varied scripts. However, the study also highlights the trade-off between computational demands and recognition accuracy, noting that conventional methods remain viable in resource-limited environments or for simpler datasets. Overall, CNNs emerge as the most promising approach for improving multilingual OCR technology.