Fire detection via effective vision transformers
Hikmat Yar, Tanveer Hussain, Zulfiqar Ahmad Khan, Mi Young Lee, Sung Wook Baik*
Key Figure

  • Access the paper
  • Published in the Journal of the Korea Next Generation Computing Society, 2021 [🌐 Online]
  • Additional Link
  • Korean Journal Link
  • Abstract
  • In today's modern age, smart and safe cities are one of the major concerns of the research community. Cities are surrounded by open areas, agricultural land, and forests, where fire incidents can threaten human lives and damage property. Recently, vision sensor-based fire detection has attracted experts in the computer vision domain, where leading performance has been achieved by various convolutional neural networks (CNNs). However, these techniques are translation invariant, locality-sensitive, and lack a global understanding of images. Furthermore, CNN-based models use pooling layers for dimensionality reduction, which reduces computational cost but also loses meaningful information, such as the precise location of the most active feature detector. In this work, we develop a Vision Transformer (ViT)-based model for fire detection and feed image patches into the transformer in a sequential structure similar to word embeddings. Experimental results demonstrate competitive performance compared to state-of-the-art CNN-based methods.
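The abstract describes feeding image patches into the transformer in a sequential structure analogous to word embeddings. Below is a minimal NumPy sketch of that patch-embedding step. The dimensions (224×224 input, 16×16 patches, 768-dimensional embeddings) follow the standard ViT-Base configuration and are assumptions for illustration, not the paper's exact settings; in a real model the projection, class token, and positional embeddings are learned parameters.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an HxWxC image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    patches = []
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patches.append(image[i:i + patch_size, j:j + patch_size, :].reshape(-1))
    return np.stack(patches)  # shape: (num_patches, patch_size * patch_size * c)

def embed_patches(patches, d_model, rng):
    """Project patches to d_model, prepend a class token, add positional embeddings."""
    n, p = patches.shape
    proj = rng.standard_normal((p, d_model)) * 0.02     # learned in practice
    tokens = patches @ proj                             # (n, d_model)
    cls = rng.standard_normal((1, d_model)) * 0.02      # learned class token
    seq = np.concatenate([cls, tokens], axis=0)         # (n + 1, d_model)
    pos = rng.standard_normal((n + 1, d_model)) * 0.02  # learned positions
    return seq + pos

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))       # stand-in for a fire-scene frame
patches = patchify(img, 16)           # 196 patches, each of length 768
seq = embed_patches(patches, 768, rng)  # 197-token sequence for the transformer
```

The resulting token sequence is what the transformer encoder consumes; for fire detection, the final class-token representation would feed a fire/no-fire classification head.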

  • Additional Comments