Comparative Analysis of CNN and Transformer models for image classification on Intel Image dataset.

Shreedatta Sawant (Agnel Institute of Technology and Design-Goa); Ashish Narvekar; Aditya Pednekar; Dinesh DevalieNaik; Dhruv Malvankar

doi:10.63169/GCARED2025.p45

Abstract

This comparative study investigates six neural network architectures (ResNet50, VGG16, EfficientNetB0, Vision Transformer, Swin Transformer, and DenseNet) for classifying images in the Intel Image dataset. The research aims to identify the advantages and disadvantages of each model in terms of accuracy, computational efficiency, and versatility. By implementing and fine-tuning these architectures on the chosen dataset, we assess their ability to categorize various image types. This study provides insight into the balance between model complexity and performance, offering valuable guidance for researchers and professionals in selecting appropriate neural network architectures for diverse image classification tasks.