Comparative Analysis of CNN and Transformer Models for Image Classification on the Intel Image Dataset
Authors:
Shreedatta Sawant (Agnel Institute of Technology and Design-Goa)
Ashish Narvekar
Aditya Pednekar
Dinesh DevalieNaik
Dhruv Malvankar
Abstract

This comparative study investigates six neural network architectures (ResNet50, VGG16, EfficientNetB0, Vision Transformer, Swin Transformer, and DenseNet) for classifying images in the Intel Image dataset. The research aims to identify the advantages and disadvantages of each model in terms of accuracy, computational efficiency, and versatility. By implementing and fine-tuning these architectures on the chosen dataset, we assess their ability to categorize various image types. The study provides insight into the trade-off between model complexity and performance, offering guidance for researchers and professionals in selecting appropriate neural network architectures for diverse image classification tasks.

Published in: GCARED 2025 Proceedings
DOI: 10.63169/GCARED2025.p45
Paper ID: GCARED2025-0174