Iris-CV: Classifying Iris Flowers Is Not as Easy as You Thought

Itamar Rocha Filho, João Pedro Vasconcelos Teixeira, João Wallace Lucena Lins, Felipe Honorato de Sousa, Ana Clara Chaves Sousa, Manuel Ferreira Junior, Thaís Ramos, Cecília Silva, Thaís Gaudencio do Rêgo, Yuri de Almeida Malheiros, Telmo de Menezes e Silva Filho

Published: 01 Jan 2021, Last Modified: 17 May 2023, Venue: BRACIS (2) 2021, Readers: Everyone

Abstract: The iris flower dataset is a ubiquitous benchmark in the machine learning literature. With its 150 instances, four continuous features, and three balanced classes, one of which is linearly separable from the others, iris is generally considered an easy problem. Hence, researchers usually rely on other datasets when they need more challenging benchmarks. A similar situation arises with computer vision datasets such as MNIST and ImageNet, which have been widely explored. State-of-the-art models essentially solve these problems, motivating the search for more challenging tasks. Therefore, this paper introduces a new computer vision toy dataset featuring iris flowers. The pictures were taken by users of a nature photography application, so they include noisy background information. Additionally, certain desirable properties, such as a single, similarly sized object at the center of each picture, are not guaranteed, which makes the task more challenging. Our benchmark results show that the dataset can be challenging for traditional machine learning algorithms without any pre-processing steps, while state-of-the-art deep learning architectures achieve around 82% accuracy, which means some effort will be necessary to bring this accuracy closer to what has been accomplished for MNIST and ImageNet.
