RCNX: RESIDUAL CAPSULE NEXT
Thesis posted on 10.05.2021, 15:38 by Arjun Narukkanchira Anilkumar
New machine learning models appear every day. Most computer-vision-oriented
machine learning models derive from the basic structure of the Convolutional Neural Network (CNN).
Machine learning developers use CNNs extensively in image classification, object recognition,
and image segmentation. Although CNNs yield models with superior accuracy,
they have disadvantages. Estimating pose and transformation in
computer vision applications is a difficult task for a CNN, because its operations can
learn only shift-invariant features of an image. These limitations motivate machine learning
developers to design more complex algorithms.
The search for new machine learning models led to Capsule Networks. A Capsule Network
can estimate the pose of objects in an image and recognize transformations of these
objects. Handwritten digit classification was the task capsule networks were initially
designed to solve. Capsule Networks outperform all models on the MNIST dataset of
handwritten digits, but using Capsule Networks for general image classification is not a
straightforward matter of scaling up the number of parameters. By replacing the Capsule
Network's initial layer, a simple convolutional layer, with more complex CNN architectures,
the authors of the Residual Capsule Network achieved a substantial improvement in capsule
network applications without a large number of parameters.
This thesis focuses on improving this recent Residual Capsule Network (RCN) to the
point where accuracy and model size are optimal for the image classification task, with
the CIFAR-10 dataset as the benchmark. Our search for an exemplary capsule network led to
the invention of RCN2: Residual Capsule Network 2 and RCNX: Residual Capsule NeXt.
RCNX is the next generation of RCN. Both outperform existing capsule network
architectures for image classification, such as 3-level RCN, DCNet, DCNet++, and
Capsule Network, and even outperform compact CNNs like MobileNet V3.
RCN2 achieved an accuracy of 85.12% with 1.95 million parameters, and RCNX achieved
89.31% accuracy with 1.58 million parameters on the CIFAR-10 benchmark.