These 10 papers have significantly advanced computer vision.
They cover a wide range of topics, including deep learning architectures, object detection, image segmentation, and efficient model design.
Reading them offers insight into how today's techniques developed and where they are applied.
1. AlexNet: ImageNet Classification with Deep Convolutional Neural Networks (2012) – Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton
This paper introduced the use of deep convolutional neural networks for large-scale image classification tasks and played a crucial role in popularizing deep learning in computer vision.
2. VGGNet: Very Deep Convolutional Networks for Large-Scale Image Recognition (2014) – Karen Simonyan and Andrew Zisserman
VGGNet presented architectures with 16 and 19 weight layers built from small 3×3 convolution filters, demonstrating the importance of depth in convolutional neural networks for image recognition.
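The VGG paper argues that a stack of small 3×3 convolutions can cover the same receptive field as one large filter while using fewer weights. The arithmetic can be sketched as below. The function names and the channel count of 256 are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel_size, channels, num_layers=1):
    """Weights (no biases) in `num_layers` stacked k x k convolutions,
    each with `channels` input and `channels` output channels."""
    return num_layers * kernel_size * kernel_size * channels * channels


def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


C = 256  # illustrative channel count, not taken from the text above
three_3x3 = conv_weights(3, C, num_layers=3)  # 27 * C^2 weights
one_7x7 = conv_weights(7, C)                  # 49 * C^2 weights

print(stacked_receptive_field(3, 3), three_3x3, one_7x7)
```

With 256 channels, the three-layer stack sees a 7×7 window using 1,769,472 weights, against 3,211,264 for a single 7×7 layer, and it applies three non-linearities instead of one.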
3. ResNet: Deep Residual Learning for Image Recognition (2015) – Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun
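ResNet introduced residual (skip) connections, which let each block learn a correction to its input rather than a complete mapping. This eased the optimization difficulties, often described as the vanishing gradient problem, that had limited network depth. It made networks with more than 100 layers trainable and won the ILSVRC 2015 classification challenge.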
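A residual connection adds a block's input back to its output, so the block computes y = F(x) + x. The sketch below is a minimal illustration in plain Python, assuming fully connected layers in place of convolutions and omitting batch normalization. The helper names are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(v, weights):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F(x) = W2 * relu(W1 * x)."""
    f = linear(relu(linear(x, w1)), w2)
    return relu([fi + xi for fi, xi in zip(f, x)])


# With all-zero weights F(x) is zero, so the block passes x through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], zeros, zeros))
```

Because the identity path is always present, a block whose weights are near zero behaves like a pass-through, which is one intuition for why adding more such blocks does not make the network harder to optimize.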
4. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015) – Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun
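This paper introduced Faster R-CNN, which replaced external region proposal methods such as selective search with a Region Proposal Network (RPN) that shares convolutional features with the detection network, making proposals nearly cost-free and detection both faster and more accurate.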
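The Region Proposal Network scores a fixed set of reference boxes, called anchors, at every feature-map location. The paper's default is 3 scales by 3 aspect ratios, giving 9 anchors per location. The sketch below generates them for a single center point. The function name and the convention that a ratio means height divided by width are assumptions made for illustration.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centered at (cx, cy).

    Each anchor has area scale * scale; `ratio` is height / width
    (an assumed convention for this sketch).
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


for box in make_anchors(0.0, 0.0):
    print(box)
```

The network then predicts, for each anchor, an objectness score and four offsets that refine the anchor into a proposal.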
5. YOLO: You Only Look Once: Unified, Real-Time Object Detection (2016) – Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi
YOLO is an object detection algorithm that predicts bounding boxes and class probabilities directly from the full image in a single forward pass of one network, which makes it fast enough for real-time applications.
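In the original YOLO paper the image is divided into an S × S grid, and each cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities, so the output is an S × S × (B·5 + C) tensor. The sketch below computes that shape and decodes one box to pixel coordinates. The defaults S = 7, B = 2, C = 20 and the 448-pixel input follow the paper's PASCAL VOC setup, while the function names are hypothetical.

```python
def yolo_output_shape(S=7, B=2, C=20):
    """Shape of the prediction tensor: one vector per grid cell."""
    return (S, S, B * 5 + C)


def decode_box(row, col, x, y, w, h, S=7, img_w=448, img_h=448):
    """Convert one cell-relative prediction to pixel units.

    (x, y) is the box center as an offset within cell (row, col), in [0, 1];
    (w, h) are fractions of the whole image.
    Returns (center_x, center_y, width, height) in pixels.
    """
    cx = (col + x) / S * img_w
    cy = (row + y) / S * img_h
    return (cx, cy, w * img_w, h * img_h)


print(yolo_output_shape())
print(decode_box(3, 3, 0.5, 0.5, 0.25, 0.5))
```

Since every cell's prediction comes out of the same forward pass, no separate proposal stage is needed, which is where the speed advantage comes from.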
6. Mask R-CNN (2017) – Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick
Mask R-CNN extended the Faster R-CNN framework to perform instance segmentation by adding a branch that predicts a pixel-level mask for each detected object, in parallel with the existing bounding-box and classification branches.
7. DenseNet: Densely Connected Convolutional Networks (2017) – Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger
DenseNet introduced densely connected blocks, allowing information to flow directly from early layers to later layers in the network, promoting feature reuse and alleviating the vanishing gradient problem.
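Inside a dense block, each layer receives the concatenated feature maps of all preceding layers and adds a fixed number of new ones, called the growth rate k. The channel bookkeeping can be sketched as follows. The function names and the example values (64 input channels, k = 32) are illustrative assumptions.

```python
def dense_block_input_channels(k0, growth_rate, num_layers):
    """Input channels seen by each layer of a dense block.

    Layer i (0-based) sees the block input (k0 channels) plus the
    `growth_rate` channels produced by each of the i earlier layers.
    """
    return [k0 + growth_rate * i for i in range(num_layers)]


def dense_block_output_channels(k0, growth_rate, num_layers):
    """Channels leaving the block after every layer's output is concatenated."""
    return k0 + growth_rate * num_layers


print(dense_block_input_channels(64, 32, 4))
print(dense_block_output_channels(64, 32, 4))
```

Each layer only has to produce k new feature maps because it can reuse everything computed before it, which keeps individual layers narrow.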
8. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (2017) – Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille
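DeepLab combined atrous (dilated) convolution, which enlarges the receptive field without reducing feature-map resolution or adding parameters, with atrous spatial pyramid pooling for multi-scale context and a fully connected conditional random field (CRF) to sharpen object boundaries.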
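Atrous convolution spaces the filter taps `rate` samples apart, so for a 1-D signal the output is y[i] = Σ_k x[i + rate·k] · w[k]. Below is a minimal 1-D sketch in plain Python, assuming no padding (only positions where every tap falls inside the input are computed). The function name is hypothetical.

```python
def atrous_conv1d(x, w, rate=1):
    """1-D atrous (dilated) convolution without padding.

    Taps are `rate` samples apart, so a filter of length K spans
    rate * (K - 1) + 1 input samples while still using only K weights.
    """
    span = rate * (len(w) - 1)
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
print(atrous_conv1d(signal, kernel, rate=1))  # taps cover 3 samples
print(atrous_conv1d(signal, kernel, rate=2))  # same 3 weights cover 5 samples
```

Setting `rate=1` gives an ordinary convolution. Larger rates widen the field of view with the same number of weights.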
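9. Attention Is All You Need (2017) – Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin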
Though written for machine translation rather than computer vision, this paper introduced the Transformer, a model built entirely on self-attention. It revolutionized natural language processing and was later adapted successfully to vision tasks.
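The core operation of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. The sketch below implements it for a single head in plain Python, assuming no masking and no learned projections. Matrices are lists of rows, and the function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for a single head, no masking."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


Q = [[0.0, 0.0]]              # one query that matches both keys equally
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, 0.0], [0.0, 4.0]]
print(scaled_dot_product_attention(Q, K, V))
```

In self-attention, Q, K, and V are all derived from the same sequence. For images, that sequence is typically a set of patches or feature-map positions.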
10. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (2019) – Mingxing Tan and Quoc V. Le
EfficientNet introduced compound scaling, which scales network depth, width, and input resolution together using a single coefficient, achieving state-of-the-art ImageNet accuracy at the time with significantly fewer parameters and computations than previous models.
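The scaling rule in the EfficientNet paper, called compound scaling, ties depth, width, and input resolution to one coefficient φ: depth scales by α^φ, width by β^φ, and resolution by γ^φ, with α · β² · γ² ≈ 2 so that compute roughly doubles per unit of φ. The sketch below uses the constants reported in the paper (α = 1.2, β = 1.1, γ = 1.15). The function names are hypothetical.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Multipliers for depth, width, and input resolution at coefficient phi."""
    return {
        "depth": alpha ** phi,
        "width": beta ** phi,
        "resolution": gamma ** phi,
    }


def flops_multiplier(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Approximate compute growth: FLOPs scale with depth * width^2 * resolution^2."""
    return (alpha * beta ** 2 * gamma ** 2) ** phi


for phi in range(4):
    print(phi, compound_scale(phi), round(flops_multiplier(phi), 2))
```

The published model variants also round the scaled values to usable layer counts and channel widths, a step this sketch leaves out.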