Channel-spatial attention network for fewshot classification


Authors: Yan Zhang aff001; Min Fang aff001; Nian Wang aff001
Author affiliations: School of Electronics and Information Engineering, Anhui University, Hefei, China aff001
Published in: PLoS ONE 14(12)
Category: Research Article
doi: 10.1371/journal.pone.0225426

Abstract

Learning a powerful representation for a class with few labeled samples is a challenging problem. Although some state-of-the-art few-shot learning algorithms perform well with meta-learning, they focus only on novel network architectures and fail to exploit the knowledge present in every classification task. In this paper, we propose a combined channel and spatial attention module (C-SAM) that mines more effective information from samples of different classes across tasks. A residual network is used to alleviate the loss of low-level semantic information as the network grows deeper. Finally, a relation network equipped with a C-SAM serves as the classifier; it avoids learning redundant information and compares the relations between different samples. Experiments with the proposed method were carried out on six datasets: miniImageNet, Omniglot, Caltech-UCSD Birds, the Describable Textures Dataset, Stanford Dogs, and Stanford Cars. The experimental results show that the C-SAM outperforms many state-of-the-art few-shot classification methods.
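To make the channel-then-spatial attention idea concrete, the following is a minimal NumPy sketch of a CBAM-style attention block applied to a single feature map. It is an illustrative simplification, not the authors' C-SAM: the learned shared MLP of channel attention and the convolution of spatial attention are omitted, and the pooled descriptors are simply summed before a sigmoid.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Global average- and max-pool each channel,
    # then squash the sum to a per-channel weight in (0, 1).
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))
    w = sigmoid(avg + mx)              # shape (C,)
    return feat * w[:, None, None]

def spatial_attention(feat):
    # feat: (C, H, W). Average- and max-pool across channels,
    # yielding a per-location weight map in (0, 1).
    avg = feat.mean(axis=0)
    mx = feat.max(axis=0)
    w = sigmoid(avg + mx)              # shape (H, W)
    return feat * w[None, :, :]

def c_sam(feat):
    # Channel attention first, then spatial attention (CBAM-style order).
    return spatial_attention(channel_attention(feat))
```

Because every attention weight lies in (0, 1), the block can only rescale activations; in a trained network the learned pooling projections decide which channels and locations are emphasized.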

Keywords:

Algorithms – Attention – Birds – Convolution – Deep learning – Dogs – Learning – Neural networks



