Comparing Deep Convolutional Networks and Capsule Networks in Image Classification Tasks

Dr. T. Charan Singh, Suresh Bhukya

This study compares Deep Convolutional Networks (DCNs) and Capsule Networks (CapsNets) on image classification tasks, evaluating accuracy, training time, and inference time across three benchmark datasets: MNIST, CIFAR-10, and ImageNet. DCNs such as ResNet-50, VGG16, and AlexNet are established, robust solutions for image classification that excel at feature extraction and scale well. However, because their pooling layers discard precise positional information, they are often sensitive to spatial variations such as changes in pose or viewpoint. In contrast, CapsNets, with their capsule-based architecture and dynamic routing algorithm, show promise in preserving spatial hierarchies and improving accuracy, particularly on complex datasets. Although they achieve higher accuracy, CapsNets require significantly more computational resources and have longer training and inference times than traditional DCNs. The study highlights the trade-off between the richer spatial encoding of CapsNets and the efficiency of established DCN architectures, and offers insights into their practical applications and directions for future research.
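
To make the dynamic routing mentioned in the abstract concrete, the following is a minimal NumPy sketch of routing-by-agreement between two capsule layers. It follows the commonly cited formulation (a softmax over routing logits, a weighted sum of predictions, and a "squash" nonlinearity) and is not necessarily the implementation used in this study. The function names `squash` and `dynamic_routing`, the tensor shapes, and the choice of three iterations are assumptions made for illustration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Scale each vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Route predictions from input capsules to output capsules.

    u_hat: array of shape (num_in, num_out, dim_out), the prediction each
           input capsule makes for each output capsule (assumed precomputed).
    Returns the output capsule vectors v (num_out, dim_out) and the
    coupling coefficients c (num_in, num_out) from the last iteration.
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Softmax over output capsules: each input distributes its vote.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions for each output capsule.
        s = np.sum(c[:, :, None] * u_hat, axis=0)
        v = squash(s)
        # Raise logits where a prediction agrees with the output (dot product).
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v, c

# Hypothetical sizes: 8 input capsules, 3 output capsules of dimension 4.
rng = np.random.default_rng(0)
v, c = dynamic_routing(rng.normal(size=(8, 3, 4)))
print(v.shape, c.shape)
```

The routing loop runs at every forward pass, which is one reason a CapsNet can be slower to train and evaluate than a DCN of comparable size.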