[Figure: Diagram of an Ear]

Ear Recognition Research

University of Ljubljana

Joint work of two laboratories.

Journals

  • Efficient ear alignment using a two-stack hourglass network

    Anja Hrovatič, Peter Peer, Vitomir Štruc, Žiga Emeršič: "Efficient ear alignment using a two-stack hourglass network", IET Biometrics, 2023.

    Ear images have been shown to be a reliable modality for biometric recognition with desirable characteristics, such as high universality, distinctiveness, measurability and permanence. While a considerable amount of research has been directed towards ear recognition techniques, the problem of ear alignment is still under-explored in the open literature. Nonetheless, accurate alignment of ear images, especially in unconstrained acquisition scenarios, where the ear appearance is expected to vary widely due to pose and viewpoint variations, is critical for the performance of all downstream tasks, including ear recognition. Here, the authors address this problem and present a framework for ear alignment that relies on a two-step procedure: (i) automatic landmark detection and (ii) fiducial point alignment. For the first (landmark detection) step, the authors implement and train a Two-Stack Hourglass model (2-SHGNet) capable of accurately predicting 55 landmarks on diverse ear images captured in uncontrolled conditions. For the second (alignment) step, the authors use the Random Sample Consensus (RANSAC) algorithm to align the estimated landmark/fiducial points with a pre-defined ear shape (i.e. a collection of average ear landmark positions). The authors evaluate the proposed framework in comprehensive experiments on the AWEx and ITWE datasets and show that the 2-SHGNet model leads to more accurate landmark predictions than competing state-of-the-art models from the literature. Furthermore, the authors also demonstrate that the alignment step significantly improves recognition accuracy with ear images from unconstrained environments compared to unaligned imagery.

    @article{hrovatic2023efficient,
    	title={Efficient ear alignment using a two-stack hourglass network},
    	author={Hrovati{\v{c}}, Anja and Peer, Peter and {\v{S}}truc, Vitomir and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga},
    	journal={IET Biometrics},
    	year={2023},
    	publisher={Wiley Online Library}
    }
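    The two-step procedure above (landmark detection, then RANSAC alignment of the detected points to a mean ear shape) can be sketched as follows. This is a minimal illustration under assumed details (a 2D similarity transform, least-squares fits on two-point samples), not the authors' implementation; the 2-SHGNet detector is replaced by already-given landmark coordinates.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares 2D similarity (scale + rotation + translation) mapping src -> dst."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    s, d = src - mu_s, dst - mu_d
    a = (s * d).sum()                                   # sum of dot products
    b = (s[:, 0] * d[:, 1] - s[:, 1] * d[:, 0]).sum()  # sum of cross products
    denom = (s ** 2).sum()
    sc, ss = a / denom, b / denom                       # scale*cos, scale*sin
    R = np.array([[sc, -ss], [ss, sc]])
    t = mu_d - R @ mu_s
    return R, t

def ransac_align(landmarks, mean_shape, iters=200, thresh=3.0, seed=0):
    """Robustly align detected landmarks to a mean shape, ignoring outlier points."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iters):
        idx = rng.choice(len(landmarks), size=2, replace=False)  # minimal sample
        R, t = fit_similarity(landmarks[idx], mean_shape[idx])
        err = np.linalg.norm(landmarks @ R.T + t - mean_shape, axis=1)
        inliers = err < thresh
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return fit_similarity(landmarks[best], mean_shape[best])  # refit on inliers only
```

    A caller would warp the ear image with the recovered transform before handing it to a downstream recognition model.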
    												  
    										
  • ContexedNet: Context-aware Ear Detection in Unconstrained Settings

    Žiga Emeršič, Diego Sušanj, Blaž Meden, Peter Peer, Vitomir Štruc: "ContexedNet: Context-aware Ear Detection in Unconstrained Settings", IEEE Access, 2021.

    Ear detection represents one of the key components of contemporary ear recognition systems. While significant progress has been made in the area of ear detection over recent years, most of the improvements are direct results of advances in the field of visual object detection. Only a limited number of techniques presented in the literature are domain-specific and designed explicitly with ear detection in mind. In this paper, we aim to address this gap and present a novel detection approach that does not rely only on general ear (object) appearance, but also exploits contextual information, i.e., face-part locations, to ensure accurate and robust ear detection with images captured in a wide variety of imaging conditions. The proposed approach is based on a Context-aware Ear Detection Network (ContexedNet) and poses ear detection as a semantic image segmentation problem. ContexedNet consists of two processing paths: i) a context-provider that extracts probability maps corresponding to the locations of facial parts from the input image, and ii) a dedicated ear segmentation model that integrates the computed probability maps into a context-aware segmentation-based ear detection procedure. ContexedNet is evaluated in rigorous experiments on the AWE and UBEAR datasets and shown to ensure competitive performance when evaluated against state-of-the-art ear detection models from the literature. Additionally, because the proposed contextualization is model agnostic, it can also be utilized with other ear detection techniques to improve performance.

    @article{earContext,
    	author={Emeršič, Žiga and Sušanj, Diego and Meden, Blaž and Peer, Peter and Štruc, Vitomir},
    	journal={IEEE Access}, 
    	title={ContexedNet: Context-aware Ear Detection in Unconstrained Settings}, 
    	year={2021},
    	doi={10.1109/ACCESS.2021.3121792}
    }		
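    One simple way to make a segmentation model context-aware in the sense described above is to stack the face-part probability maps with the image as extra input channels. The following is only a sketch of one plausible integration, assumed here for illustration; ContexedNet's exact fusion mechanism may differ.

```python
import numpy as np

def build_context_input(image, prob_maps):
    """Stack an RGB image (H, W, 3) with K face-part probability maps (K, H, W)
    into a single (H, W, 3 + K) input tensor for a segmentation network."""
    h, w, _ = image.shape
    assert prob_maps.shape[1:] == (h, w), "probability maps must match image resolution"
    maps = np.transpose(prob_maps, (1, 2, 0))  # (K, H, W) -> (H, W, K)
    return np.concatenate([image.astype(np.float32), maps.astype(np.float32)], axis=-1)
```

    The segmentation model then sees both appearance and facial context at every pixel, which is one way to realize the model-agnostic contextualization the abstract mentions.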
    										
  • Evaluation and Analysis of Ear Recognition Models: Performance, Complexity and Resource Requirements

    Žiga Emeršič, Blaž Meden, Vitomir Štruc, Peter Peer: "Evaluation and Analysis of Ear Recognition Models: Performance, Complexity and Resource Requirements", Neural Computing & Applications, 2018.

    Ear recognition technology has long been dominated by (local) descriptor-based techniques due to their formidable recognition performance and robustness to various sources of image variability. While deep-learning-based techniques have started to appear in this field only recently, they have already shown potential for further boosting the performance of ear recognition technology and dethroning descriptor-based methods as the current state of the art. However, while recognition performance is often the key factor when selecting recognition models for biometric technology, it is equally important that the behavior of the models is understood and their sensitivity to different covariates is known and well explored. Other factors, such as the train- and test-time complexity or resource requirements, are also paramount and need to be considered when designing recognition systems. To explore these issues, we present in this paper a comprehensive analysis of several descriptor- and deep-learning-based techniques for ear recognition. Our goal is to discover weak points of contemporary techniques, study the characteristics of the existing technology and identify open problems worth exploring in the future. We conduct our analysis through identification experiments on the challenging Annotated Web Ears (AWE) dataset and report our findings. The results of our analysis show that the presence of accessories and high degrees of head movement significantly impact the identification performance of all types of recognition models, whereas mild degrees of the listed factors and other covariates such as gender and ethnicity impact the identification performance only to a limited extent. From a test-time-complexity point of view, the results suggest that lightweight deep models can be equally fast as descriptor-based methods given appropriate computing hardware, but require significantly more resources during training, where descriptor-based methods have a clear advantage.
As an additional contribution, we also introduce a novel dataset of ear images, called AWE Extended (AWEx), which we collected from the web for the training of the deep models used in our experiments. AWEx contains 4104 images of 346 subjects and represents one of the largest and most challenging (publicly available) datasets of unconstrained ear images at the disposal of the research community.

    @article{EarEvaluation2018,
    	title={Evaluation and analysis of ear recognition models: performance, complexity and resource requirements},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Meden, Bla{\v{z}} and Peer, Peter and {\v{S}}truc, Vitomir},
    	journal={Neural computing and applications},
    	pages={1--16},
    	year={2018},
    	publisher={Springer}
    }
    										
  • Convolutional Encoder-Decoder Networks for Pixel-Wise Ear Detection and Segmentation

    Žiga Emeršič, Luka Lan Gabriel, Vitomir Štruc, Peter Peer: "Convolutional Encoder-Decoder Networks for Pixel-Wise Ear Detection and Segmentation", IET Biometrics, 2018.

    Object detection and segmentation represents the basis for many tasks in computer and machine vision. In biometric recognition systems, the detection of the region-of-interest (ROI) is one of the most crucial steps in the processing pipeline, significantly impacting the performance of the entire recognition system. Existing approaches to ear detection are commonly susceptible to the presence of severe occlusions, ear accessories or variable illumination conditions and often deteriorate in their performance if applied on ear images captured in unconstrained settings. To address these shortcomings, we present a novel ear detection technique based on convolutional encoder-decoder networks (CEDs). We formulate the problem of ear detection as a two-class segmentation problem and design and train a CED-network architecture to distinguish between image-pixels belonging to the ear and the non-ear class. Unlike competing techniques, our approach does not simply return a bounding box around the detected ear, but provides detailed, pixel-wise information about the location of the ears in the image. Experiments on a dataset gathered from the web (a.k.a. in the wild) show that the proposed technique ensures good detection results in the presence of various covariate factors and significantly outperforms competing methods from the literature.

    @article{emersic2018convolutional,
    	title={Convolutional encoder--decoder networks for pixel-wise ear detection and segmentation},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Gabriel, Luka L and {\v{S}}truc, Vitomir and Peer, Peter},
    	journal={IET Biometrics},
    	volume={7},
    	number={3},
    	pages={175--184},
    	year={2018},
    	publisher={IET}
    }		  	
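    Because the CED output is a pixel-wise mask rather than a bounding box, a box can always be recovered from the mask when a downstream component needs one. A minimal sketch of that reduction (a hypothetical helper, not part of the authors' code):

```python
import numpy as np

def mask_to_bbox(mask):
    """Tightest (x_min, y_min, x_max, y_max) box around a binary ear mask,
    or None if no ear pixels were detected."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

    The reverse is not possible, which is why the pixel-wise formulation is strictly more informative than box-level detection.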
    										
  • Ear Recognition: More Than a Survey

    Žiga Emeršič, Vitomir Štruc, Peter Peer: "Ear Recognition: More Than a Survey", Neurocomputing, 2017.

    Automatic identity recognition from ear images represents an active field of research within the biometric community. The ability to capture ear images from a distance and in a covert manner makes the technology an appealing choice for surveillance and security applications as well as other application domains. Significant contributions have been made in the field over recent years, but open research problems still remain and hinder a wider (commercial) deployment of the technology. This paper presents an overview of the field of automatic ear recognition (from 2D images) and focuses specifically on the most recent, descriptor-based methods proposed in this area. Open challenges are discussed and potential research directions are outlined with the goal of providing the reader with a point of reference for issues worth examining in the future. In addition to a comprehensive review on ear recognition technology, the paper also introduces a new, fully unconstrained dataset of ear images gathered from the web and a toolbox implementing several state-of-the-art techniques for ear recognition. The dataset and toolbox are meant to address some of the open issues in the field and are made publicly available to the research community.

    @article{emersic2017ear,
    	title={Ear recognition: More than a survey},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and {\v{S}}truc, Vitomir and Peer, Peter},
    	journal={Neurocomputing},
    	volume={255},
    	pages={26--39},
    	year={2017},
    	publisher={Elsevier}
    }	
    										

    The AWE Toolbox: fill in and sign this form and send it to ziga.emersic@fri.uni-lj.si with the subject "AWE Request: The Toolbox".

Book Chapters

  • Constellation-Based Deep Ear Recognition

    Dejan Štepec, Žiga Emeršič, Peter Peer, Vitomir Štruc: "Constellation-Based Deep Ear Recognition", Deep Biometrics, Springer, 2020.

    This chapter introduces COM-Ear, a deep constellation model for ear recognition. Different from competing solutions, COM-Ear encodes global as well as local characteristics of ear images and generates descriptive ear representations that ensure competitive recognition performance. The model is designed as a dual-path convolutional neural network (CNN), where one path processes the input in a holistic manner, and the second captures local image characteristics from image patches sampled from the input image. A novel pooling operation, called patch-relevant-information pooling, is also proposed and integrated into the COM-Ear model. The pooling operation helps to select features from the input patches that are locally important and to focus the attention of the network on image regions that are descriptive and important for representation purposes. The model is trained in an end-to-end manner using a combined cross-entropy and center loss. Extensive experiments are presented on the recently introduced Extended Annotated Web Ears (AWEx) dataset.

    @Inbook{DeepBio2019,
    	author="{\v{S}}tepec, Dejan
    	and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga
    	and Peer, Peter
    	and {\v{S}}truc, Vitomir",
    	editor="Jiang, Richard
    	and Li, Chang-Tsun
    	and Crookes, Danny
    	and Meng, Weizhi
    	and Rosenberger, Christophe",
    	title="Constellation-Based Deep Ear Recognition",
    	bookTitle="Deep Biometrics",
    	year="2020",
    	publisher="Springer International Publishing",
    	address="Cham",
    	pages="161--190",
    	abstract="This chapter introduces COM-Ear, a deep constellation model for ear recognition. Different from competing solutions, COM-Ear encodes global as well as local characteristics of ear images and generates descriptive ear representations that ensure competitive recognition performance. The model is designed as a dual-path convolutional neural network (CNN), where one path processes the input in a holistic manner, and the second captures local image characteristics from image patches sampled from the input image. A novel pooling operation, called patch-relevant-information pooling, is also proposed and integrated into the COM-Ear model. The pooling operation helps to select features from the input patches that are locally important and to focus the attention of the network on image regions that are descriptive and important for representation purposes. The model is trained in an end-to-end manner using a combined cross-entropy and center loss. Extensive experiments are presented on the recently introduced Extended Annotated Web Ears (AWEx) dataset.",
    	isbn="978-3-030-32583-1",
    	doi="10.1007/978-3-030-32583-1_8",
    	url="https://doi.org/10.1007/978-3-030-32583-1_8"
    }	
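    The combined objective mentioned above (cross-entropy plus center loss) has a simple closed form: the center loss penalizes the squared distance between each feature vector and the center of its class. A minimal numpy sketch, with the weighting factor lam chosen arbitrarily for illustration (the chapter's actual value and center-update rule are not reproduced here):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over a batch, computed in a numerically stable way."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(features, labels, centers):
    """Mean 0.5 * ||x_i - c_{y_i}||^2 between features and their class centers."""
    diff = features - centers[labels]
    return 0.5 * (diff ** 2).sum(axis=1).mean()

def combined_loss(logits, features, labels, centers, lam=0.01):
    """Cross-entropy for discriminability plus center loss for compact clusters."""
    return softmax_cross_entropy(logits, labels) + lam * center_loss(features, labels, centers)
```

    The cross-entropy term keeps classes separable while the center term pulls same-identity features together, which is the usual motivation for this combination.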
    									
  • Deep Ear Recognition Pipeline

    Žiga Emeršič, Janez Križaj, Vitomir Štruc, Peter Peer: "Deep Ear Recognition Pipeline", Recent Advances in Computer Vision: Theories and Applications, Springer, 2019.

    Ear recognition has seen multiple improvements in recent years and still remains very active today. However, it has been approached from the recognition and detection perspectives separately. Furthermore, deep-learning-based approaches that are popular in other domains have seen limited use in ear recognition and even more so in ear detection. Moreover, to obtain a usable recognition system a unified pipeline is needed. The input to such a system should be plain images of subjects and the output identities based only on ear biometrics. We conduct separate analysis through detection and identification experiments on a challenging dataset and, using the best approaches, present a novel, unified pipeline. The pipeline is based on convolutional neural networks (CNN) and presents, to the best of our knowledge, the first CNN-based ear recognition pipeline. The pipeline incorporates both the detection of ears on arbitrary images of people and recognition on these segmented ear regions. The experiments show that the presented system is a state-of-the-art system and, thus, a good foundation for future real-world ear recognition systems.

    @Inbook{DeepEar2019,
    author="Emer{\v{s}}i{\v{c}}, {\v{Z}}iga
    and Kri{\v{z}}aj, Janez
    and {\v{S}}truc, Vitomir
    and Peer, Peter",
    editor="Hassaballah, Mahmoud
    and Hosny, Khalid M.",
    title="Deep Ear Recognition Pipeline",
    bookTitle="Recent Advances in Computer Vision: Theories and Applications",
    year="2019",
    publisher="Springer International Publishing",
    address="Cham",
    pages="333--362",
    abstract="Ear recognition has seen multiple improvements in recent years and still remains very active today. However, it has been approached from the recognition and detection perspectives separately. Furthermore, deep-learning-based approaches that are popular in other domains have seen limited use in ear recognition and even more so in ear detection. Moreover, to obtain a usable recognition system a unified pipeline is needed. The input to such a system should be plain images of subjects and the output identities based only on ear biometrics. We conduct separate analysis through detection and identification experiments on a challenging dataset and, using the best approaches, present a novel, unified pipeline. The pipeline is based on convolutional neural networks (CNN) and presents, to the best of our knowledge, the first CNN-based ear recognition pipeline. The pipeline incorporates both the detection of ears on arbitrary images of people and recognition on these segmented ear regions. The experiments show that the presented system is a state-of-the-art system and, thus, a good foundation for future real-world ear recognition systems.",
    isbn="978-3-030-03000-1",
    doi="10.1007/978-3-030-03000-1_14",
    url="https://doi.org/10.1007/978-3-030-03000-1_14"
    }
    									

Competitions (Conferences)

  • The Unconstrained Ear Recognition Challenge 2019

    Žiga Emeršič, Aruna Kumar S. V., B. S. Harish, Weronika Gutfeter, Jalil Nourmohammadi Khiarak, Andrzej Pacut, Earnest Hansley, Mauricio Pamplona Segundo, Sudeep Sarkar, Hyeonjung Park, Gi Pyo Nam, Ig-Jae Kim, Sagar G. Sangodkar, Ümit Kaçar, Murvet Kirci, Li Yuan, Jishou Yuan, Haonan Zhao, Fei Lu, Junying Mao, Xiaoshuang Zhang, Dogucan Yaman, Fevziye Irem Eyiokur, Kadir Bulut Özler, Hazım Kemal Ekenel, Debbrota Paul Chowdhury, Sambit Bakshi, Pankaj K. Sa, Banshidhar Majhi, Peter Peer, Vitomir Štruc: “The Unconstrained Ear Recognition Challenge 2019”, International Conference On Biometrics, IAPR, 2019.

    This paper presents a summary of the 2019 Unconstrained Ear Recognition Challenge (UERC), the second in a series of group benchmarking efforts centered around the problem of person recognition from ear images captured in uncontrolled settings. The goal of the challenge is to assess the performance of existing ear recognition techniques on a challenging large-scale ear dataset and to analyze performance of the technology from various viewpoints, such as generalization abilities to unseen data characteristics, sensitivity to rotations, occlusions and image resolution and performance bias on sub-groups of subjects, selected based on demographic criteria, i.e. gender and ethnicity. Research groups from 12 institutions entered the competition and submitted a total of 13 recognition approaches ranging from descriptor-based methods to deep-learning models. The majority of submissions focused on ensemble based methods combining either representations from multiple deep models or hand-crafted with learned image descriptors. Our analysis shows that methods incorporating deep learning models clearly outperform techniques relying solely on hand-crafted descriptors, even though both groups of techniques exhibit similar behaviour when it comes to robustness to various covariates, such as the presence of occlusions, changes in (head) pose, or variability in image resolution. The results of the challenge also show that there has been considerable progress since the first UERC in 2017, but that there is still ample room for further research in this area.

    @inproceedings{UERC2019,
    	title={The Unconstrained Ear Recognition Challenge 2019},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}} and SV, A Kumar and Harish, BS and Gutfeter, W and Khiarak, JN and Pacut, A and Hansley, E and Segundo, M Pamplona and Sarkar, S and Park, HJ and others},
    	booktitle={2019 International Conference on Biometrics (ICB)},
    	pages={1--15},
    	year={2019},
    	organization={IEEE}
    }  
    									

    The UERC Toolkit: fill in and sign this form and send it to ziga.emersic@fri.uni-lj.si with the subject "UERC Request: The Toolkit".

  • The Unconstrained Ear Recognition Challenge

    Žiga Emeršič, Dejan Štepec, Vitomir Štruc, Peter Peer, Anjith George, Adil Ahmad, Elshibani Omar, Terrance E. Boult, Reza Safdari, Yuxiang Zhou, Stefanos Zafeiriou, Dogucan Yaman, Fevziye I. Eyiokur, Hazim Kemal Ekenel: “The unconstrained ear recognition challenge”, International Joint Conference on Biometrics, IEEE, 2017.

    In this paper we present the results of the Unconstrained Ear Recognition Challenge (UERC), a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions. The goal of the challenge was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and identify open problems that need to be addressed in the future. Five groups from three continents participated in the challenge and contributed six ear recognition techniques for the evaluation, while multiple baselines were made available for the challenge by the UERC organizers. A comprehensive analysis was conducted with all participating approaches addressing essential research questions pertaining to the sensitivity of the technology to head rotation, flipping, gallery size, large-scale recognition and others. The top performer of the UERC was found to ensure robust performance on a smaller part of the dataset (with 180 subjects) regardless of image characteristics, but still exhibited a significant performance drop when the entire dataset comprising 3,704 subjects was used for testing.

    @inproceedings{UERC2017,
    	title={The unconstrained ear recognition challenge},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and {\v{S}}tepec, Dejan and {\v{S}}truc, Vitomir and Peer, Peter and George, Anjith and Ahmad, Adil and Omar, Elshibani and Boult, Terrance E and Safdari, Reza and Zhou, Yuxiang and others},
    	booktitle={2017 IEEE international joint conference on biometrics (IJCB)},
    	pages={715--724},
    	year={2017},
    	organization={IEEE}
    }
    									

    The UERC Toolkit: fill in and sign this form and send it to ziga.emersic@fri.uni-lj.si with the subject "UERC Request: The Toolkit".

Conferences

  • Generation of 2D ear dataset with annotated view angles as a basis for angle-aware ear recognition

    Anja Hrovatič, Kihoon Kwon, Diego Sušanj, Peter Peer, Žiga Emeršič: "Generation of 2D ear dataset with annotated view angles as a basis for angle-aware ear recognition", International Electrotechnical and Computer Science Conference, IEEE, 2019.

    Ear recognition has seen steady development in recent years. Despite numerous novel approaches, ranging from traditional approaches based on local feature extraction to deep learning approaches, certain issues still remain unsolved. As pointed out in recent studies, one of the most prominent issues is the problem of ear alignment. Traditional approaches have proved unsuccessful at tackling this problem. However, in order to train deep neural networks to estimate pose angles and thereby facilitate ear alignment, a dataset with annotated angles is needed. In this work we present a 2D RGB dataset based on the UND-J 3D dataset, with corresponding 2D angle-annotated images, as a basis for convolutional neural network training.

    @inproceedings{hrovativcgeneration,
    	title={Generation of 2D ear dataset with annotated view angles as a basis for angle-aware ear recognition},
    	author={Hrovati{\v{c}}, Anja and Kwon, Kihoon and Su{\v{s}}anj, Diego and Peer, Peter and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga},
    	booktitle={Proceedings of the 28th International Electrotechnical and Computer Science Conference},
    	year={2019}
    }
    									
  • Mask R-CNN for Ear Detection

    Matic Bizjak, Peter Peer, Žiga Emeršič: “Mask R-CNN for Ear Detection”, International Convention on Information and Communication Technology, Electronics and Microelectronics, IEEE, 2019.

    Ear detection is an important step in the ear recognition pipeline, as it makes or breaks the system. However, there is arguably a lack of ear detection approaches available in the literature. This poses a problem for opening ear recognition systems to wider use and applications in commercial systems. To tackle this problem we present the use of Mask R-CNN for pixel-wise ear detection. Furthermore, we directly compare our approach to one of the previously best performing pixel-wise ear detection approaches by using the same dataset and protocol. Our results, with an intersection-over-union score of 79.24% on the AWE dataset, show the superiority of our approach and present a viable option for future use in ear recognition pipelines.

    @inproceedings{bizjak2019mask,
    	title={Mask R-CNN for Ear Detection},
    	author={Bizjak, Matic and Peer, Peter and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga},
    	booktitle={2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)},
    	pages={1624--1628},
    	year={2019},
    	organization={IEEE}
    }
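    The intersection-over-union score reported above is computed from the predicted and ground-truth pixel masks; a minimal sketch:

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two binary masks; defined as 1.0 when both are empty."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)
```

    A dataset-level score such as the 79.24% above would then be an average of per-image IoU values (the exact averaging protocol is defined by the benchmark, not by this sketch).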
    												
    									
  • Subdivided Ear Recognition

    David Romero, Matej Vitek, Blaž Meden, Peter Peer, Žiga Emeršič: “Subdivided Ear Recognition”, ROSUS, 2019.

    The present paper addresses the performance of different ear feature extractors on partial ear images. The main goal is to find out which parts of the ear have a major influence on successful recognition for each extractor. In this sense, a whole ear recognition pipeline has been simulated using the Annotated Web Ears (AWE) Toolbox and dataset. Ears have been divided into up, down, internal and external parts and the results have been compared. The existence of performance gaps between different ear parts and extractors has been demonstrated. Trying to exploit this, a score-level distance fusion approach has been tested, combining separately obtained distances by means of weighted averaging.

    @inproceedings{earRomero2019,
    	title={Subdivided Ear Recognition},
    	author={Romero, David and Vitek, Matej and Meden, Bla{\v{z}} and Peer, Peter and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga},
    	booktitle={ROSUS},
    	year={2019}
    }
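    The score-level fusion described above can be sketched as a weighted average of per-extractor distance matrices. Min-max normalization is assumed here to put the distances on a comparable scale before averaging; the paper's exact normalization and weights are not reproduced:

```python
import numpy as np

def fuse_distances(distance_mats, weights):
    """Weighted average of several probe-x-gallery distance matrices,
    each min-max normalized to [0, 1] first."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights so they sum to one
    fused = np.zeros_like(np.asarray(distance_mats[0], dtype=float))
    for d, wi in zip(distance_mats, w):
        d = np.asarray(d, dtype=float)
        span = d.max() - d.min()
        d = (d - d.min()) / span if span > 0 else np.zeros_like(d)
        fused += wi * d
    return fused
```

    Recognition then proceeds on the fused matrix exactly as with a single extractor, e.g. by taking the nearest gallery entry per probe.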
    												
    									
  • Unconstrained ear recognition using residual learning and attention mechanisms

    Tim Oblak, Blaž Meden, Peter Peer, Žiga Emeršič: "Unconstrained ear recognition using residual learning and attention mechanisms", International Electrotechnical and Computer Conference, IEEE, 2018.

    With the recent popularity of deep convolutional neural networks, image-based biometrics is one of many domains that consequently gained new progress in solving previously incomplete or unanswered challenges. While some biometric modalities, like the fingerprint, are already considered mature, others are still in need of more reliable approaches. Ears can be used for person identification since they have the necessary properties of a biometric modality. High-accuracy ear identification systems do exist but mostly focus on a controlled environment. In this paper, we try to improve the current state-of-the-art in ear recognition by using residual learning and attention mechanisms. By stacking residual building blocks, we find the optimal architecture to be ResNet with 18 convolutional layers. We achieve a Rank-1 score of 54.46% with full model learning, which is a 5.35 percentage point improvement over the previous best trained on the VGG architecture; however, the model still underperforms against those trained with selective learning. We observe that aggressive data augmentation is needed when dealing with a small dataset. We also conclude that the Attention Model performance is subpar compared to other architectures.

    @inproceedings{oblak2019residualEar,
    	title={Unconstrained ear recognition using residual learning and attention mechanisms},
    	author={Oblak, Tim and Meden, Bla{\v{z}} and Peer, Peter and Emer{\v{s}}i{\v{c}}, {\v{Z}}iga},
    	booktitle={International Electrotechnical and Computer Conference},
    	year={2019},
    	organization={IEEE}
    }
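    The Rank-1 score used above is the fraction of probes whose nearest gallery entry shares their identity. A minimal sketch over a precomputed probe-to-gallery distance matrix:

```python
import numpy as np

def rank1_accuracy(dist, probe_ids, gallery_ids):
    """Rank-1 identification rate. dist has shape (n_probes, n_gallery); a probe
    counts as correct if its closest gallery sample carries the same identity."""
    nearest = np.asarray(dist).argmin(axis=1)
    return float((np.asarray(gallery_ids)[nearest] == np.asarray(probe_ids)).mean())
```

    Higher ranks (Rank-5, CMC curves) generalize this by checking whether the correct identity appears among the k closest gallery entries.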
    												
    									
  • Towards Accessories-Aware Ear Recognition

    Žiga Emeršič, Nil Oleart Playà, Vitomir Štruc, Peter Peer: “Towards Accessories-Aware Ear Recognition”, International Work Conference on Bioinspired Intelligence, IEEE, 2018.

    Automatic ear recognition is gaining popularity within the research community due to numerous desirable properties, such as high recognition performance, the possibility of capturing ear images at a distance and in a covert manner, etc. Despite this popularity and the corresponding research effort that is being directed towards ear recognition technology, open problems still remain. One of the most important issues stopping ear recognition systems from being widely available are ear occlusions and accessories. Ear accessories not only mask biometric features and by this reduce the overall recognition performance, but also introduce new non-biometric features that can be exploited for spoofing purposes. Ignoring ear accessories during recognition can, therefore, present a security threat to ear recognition and also adversely affect performance. Despite the importance of this topic there have been, to the best of our knowledge, no ear recognition studies that would address these problems. In this work we try to close this gap and study the impact of ear accessories on the recognition performance of several state-of-the-art ear recognition techniques. We consider ear accessories as a tool for spoofing attacks and show that CNN-based recognition approaches are more susceptible to spoofing attacks than traditional descriptor-based approaches. Furthermore, we demonstrate that using inpainting techniques or average coloring can mitigate the problems caused by ear accessories and slightly outperforms (standard) black color to mask ear accessories.

    @inproceedings{emersic2018towards,
    	title={Towards accessories-aware ear recognition},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Play{\`a}, Nil Oleart and {\v{S}}truc, Vitomir and Peer, Peter},
    	booktitle={2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI)},
    	pages={1--8},
    	year={2018},
    	organization={IEEE}
    }	
    									
  • Covariate analysis of descriptor-based ear recognition techniques

    Žiga Emeršič, Blaž Meden, Vitomir Štruc, Peter Peer: “Covariate analysis of descriptor-based ear recognition techniques”, International Work Conference on Bioinspired Intelligence, IEEE, 2017.

    Dense descriptor-based feature extraction techniques represent a popular choice for implementing biometric ear recognition system and are in general considered to be the current state-of-the-art in this area. In this paper, we study the impact of various factors (i.e., head rotation, presence of occlusions, gender and ethnicity) on the performance of 8 state-of-the-art descriptor-based ear recognition techniques. Our goal is to pinpoint weak points of the existing technology and identify open problems worth exploring in the future. We conduct our covariate analysis through identification experiments on the challenging AWE (Annotated Web Ears) dataset and report our findings. The results of our study show that high degrees of head movement and presence of accessories significantly impact the identification performance, whereas mild degrees of the listed factors and other covariates such as gender and ethnicity impact the identification performance only to a limited extent.

    @inproceedings{emersic2017covariate,
    	title={Covariate analysis of descriptor-based ear recognition techniques},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Meden, Bla{\v{z}} and Peer, Peter and {\v{S}}truc, Vitomir},
    	booktitle={2017 International Conference and Workshop on Bioinspired Intelligence (IWOBI)},
    	pages={1--9},
    	year={2017},
    	organization={IEEE}
    }						  
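The dense descriptors compared in this covariate study (LBP, HOG and similar) all follow the same pattern: compute a local code per pixel, then concatenate per-block histograms into one feature vector. The sketch below shows a minimal blockwise LBP descriptor under that assumption; it is illustrative only and not one of the 8 evaluated implementations.

```python
import numpy as np

def lbp_descriptor(gray, grid=(4, 4)):
    """Blockwise LBP histogram over a grayscale image.

    A minimal stand-in for the dense descriptors (LBP, HOG, ...) the
    paper compares: each pixel gets an 8-bit code from its 3x3
    neighbourhood, and per-block code histograms are concatenated.
    """
    h, w = gray.shape
    padded = np.pad(gray, 1, mode='edge')
    center = padded[1:-1, 1:-1]
    codes = np.zeros((h, w), dtype=np.uint8)
    # one bit per neighbour, clockwise around the 3x3 window
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes |= (neigh >= center).astype(np.uint8) << bit
    # concatenate normalised per-block histograms into one descriptor
    feats = []
    for rows in np.array_split(codes, grid[0], axis=0):
        for block in np.array_split(rows, grid[1], axis=1):
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            feats.append(hist / max(block.size, 1))
    return np.concatenate(feats)
```

Two descriptors are then matched by a histogram distance (e.g. chi-square or cosine); covariates such as head rotation change which image blocks carry discriminative codes.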
    									
  • Training Convolutional Neural Networks with Limited Training Data for Ear Recognition in the Wild

    Žiga Emeršič, Dejan Štepec, Vitomir Štruc, Peter Peer: "Training Convolutional Neural Networks with Limited Training Data for Ear Recognition in the Wild", International Conference on Automatic Face and Gesture Recognition -- International Workshop on Biometrics in the Wild, IEEE, 2017.

    Identity recognition from ear images is an active field of research within the biometric community. The ability to capture ear images from a distance and in a covert manner makes ear recognition technology an appealing choice for surveillance and security applications as well as related application domains. In contrast to other biometric modalities, where large datasets captured in uncontrolled settings are readily available, datasets of ear images are still limited in size and mostly of laboratory-like quality. As a consequence, ear recognition technology has not yet benefited from advances in deep learning and convolutional neural networks (CNNs) and is still lagging behind other modalities that experienced significant performance gains owing to deep recognition technology. In this paper we address this problem and aim at building a CNN-based ear recognition model. We explore different strategies towards model training with limited amounts of training data and show that by selecting an appropriate model architecture, using aggressive data augmentation and selective learning on existing (pre-trained) models, we are able to learn an effective CNN-based model using a little more than 1300 training images. The result of our work is the first CNN-based approach to ear recognition that is also made publicly available to the research community. With our model we are able to improve on the rank one recognition rate of the previous state-of-the-art by more than 25% on a challenging dataset of ear images captured from the web (a.k.a. in the wild).

    @article{emersic2017training,
    	title={Training convolutional neural networks with limited training data for ear recognition in the wild},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and {\v{S}}tepec, Dejan and {\v{S}}truc, Vitomir and Peer, Peter},
    	journal={arXiv preprint arXiv:1711.09952},
    	year={2017}
    }				  
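"Aggressive data augmentation" is the key ingredient that lets a CNN train on ~1300 images: every epoch sees a differently cropped and photometrically jittered version of each image. The snippet below sketches one such augmentation sample; the crop size and brightness range are illustrative parameters, not the paper's actual settings.

```python
import numpy as np

def augment(image, rng, crop=24):
    """One augmentation sample: random crop plus brightness jitter.

    A hedged sketch of the kind of aggressive augmentation used to
    multiply a small training set; `crop` and the jitter range are
    illustrative only. `rng` is a numpy random Generator.
    """
    h, w = image.shape[:2]
    y = rng.integers(0, h - crop + 1)              # random crop origin
    x = rng.integers(0, w - crop + 1)
    patch = image[y:y + crop, x:x + crop].astype(np.int16)
    patch += rng.integers(-20, 21)                 # global brightness shift
    return np.clip(patch, 0, 255).astype(np.uint8)
```

In training, the second strategy the abstract names, selective learning, freezes early layers of a pre-trained network and fine-tunes only the later, task-specific ones.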
    									
  • Assessment of predictive clustering trees on 2D-image-based ear recognition

    Žiga Emeršič, Peter Peer, Ivica Dimitrovski: "Assessment of predictive clustering trees on 2D-image-based ear recognition", International Electrotechnical and Computer Conference, IEEE, 2016.

    In the last decade, person recognition based on various biometric modalities has steadily been gaining in popularity. The same holds for machine learning approaches and various image classification and retrieval techniques. However, many techniques rely on distinguishing between significantly dissimilar images, which is often not the case in person recognition. Person recognition based on images relies on detecting minor differences rather than the global appearance of an image. To test whether retrieval approaches based on bag-of-words fail at the task of biometric recognition, we evaluated the following procedure. Ear images were used to extract Scale Invariant Feature Transform (SIFT) feature vectors. These vectors were then fed into a forest of Predictive Clustering Trees, k-means and approximate k-means, and then compared to a baseline system where only distances between plain descriptors are compared. While these methods have been proven to perform well on images with significantly different content, the results show that they do not perform well on the task of ear recognition.

    @inproceedings{emersic2016assessment,
    	title={Assessment of predictive clustering trees on 2D-image-based Ear recognition},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Peer, Peter and Dimitrovski, Ivica},
    	booktitle={International Electrotechnical and Computer Science Conference},
    	year={2016}
    }								  
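The bag-of-words step in the evaluated pipeline quantises each local SIFT vector against a k-means codebook and represents an image as a codeword histogram. A minimal sketch of that quantisation, assuming the codebook has already been learned with k-means (the function and its signature are illustrative, not the paper's code):

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Bag-of-words histogram from local descriptors.

    descriptors: N x D array of local features (e.g. 128-D SIFT vectors);
    codebook: K x D array of k-means cluster centres. Returns a
    normalised K-bin histogram of nearest-codeword assignments.
    """
    # squared Euclidean distance from every descriptor to every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                      # nearest codeword index
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The paper's finding is that such global histograms discard exactly the subtle local differences that distinguish one ear from another, which is why the plain-descriptor-distance baseline is hard to beat.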
    									
  • Influence of Alignment on Ear Recognition: Case Study on AWE Dataset

    Metod Ribič, Žiga Emeršič, Vitomir Štruc, Peter Peer: "Influence of Alignment on Ear Recognition: Case Study on AWE Dataset", International Electrotechnical and Computer Conference, IEEE, 2016.

    The ear as a biometric modality presents a viable source for automatic human recognition. In recent years local description methods have been gaining in popularity due to their invariance to illumination and occlusion. However, these methods require that images are well aligned and preprocessed as well as possible. This leads to one of the greatest challenges of ear recognition: sensitivity to pose variations. Recently, we presented the Annotated Web Ears dataset, which opens new challenges in ear recognition. In this paper we test the influence of alignment on recognition performance and show that even with alignment the database remains very challenging, even though alignment improves the recognition rate. We also show that more sophisticated alignment methods are needed to address the AWE dataset efficiently.

    @inproceedings{ribic2016influence,
    	title={Influence of alignment on Ear recognition: case study on AWE dataset},
    	author={Ribi{\v{c}}, M and Emer{\v{s}}i{\v{c}}, {\v{Z}} and {\v{S}}truc, V and Peer, P},
    	booktitle={International Electrotechnical and Computer Science Conference},
    	volume={25},
    	pages={131--134},
    	year={2016}
    }								  
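A typical alignment of the kind this study evaluates maps detected ear landmarks onto a reference shape with a least-squares similarity transform (scale, rotation, translation). The closed-form Umeyama-style solution can be sketched as follows; this is an illustrative implementation, not the one used in the paper:

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity transform mapping landmarks src -> dst.

    src, dst: N x 2 arrays of corresponding landmark positions.
    Returns (scale, R, t) such that dst ~ scale * (src @ R.T) + t.
    Umeyama-style closed form via SVD of the cross-covariance.
    """
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, S, Vt = np.linalg.svd(dst_c.T @ src_c)
    d = np.ones(len(S))
    if np.linalg.det(U @ Vt) < 0:                  # avoid reflections
        d[-1] = -1
    R = (U * d) @ Vt                               # optimal rotation
    scale = (S * d).sum() / (src_c ** 2).sum()     # optimal isotropic scale
    t = dst.mean(axis=0) - scale * (R @ src.mean(axis=0))
    return scale, R, t
```

A rigid similarity like this cannot compensate for out-of-plane pose changes, which is consistent with the paper's conclusion that more sophisticated alignment is needed for AWE.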
    									
  • Toolbox for ear biometric recognition evaluation

    Žiga Emeršič, Peter Peer: "Toolbox for ear biometric recognition evaluation", International Conference on Computer as a Tool, IEEE, 2015.

    Ears are not subject to facial expressions like faces are and do not require close-up capture like fingerprints do. However, there are problems of occlusion, different lighting conditions and angles. These properties mean that the final outcome depends heavily on the selected database and the classification procedures used in the evaluation process. Moreover, the resulting metrics are often difficult to compare, different sections of the evaluation procedure mask the important steps, and frameworks that are usually built on the fly take time to develop. With our toolbox we propose a solution to these problems, enabling faster development in the field of ear biometric recognition.

    @inproceedings{emersic2015toolbox,
    	title={Toolbox for ear biometric recognition evaluation},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Peer, Peter},
    	booktitle={IEEE EUROCON 2015-International Conference on Computer as a Tool (EUROCON)},
    	pages={1--6},
    	year={2015},
    	organization={IEEE}
    }																		
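The core metric such an evaluation toolbox reports is the Cumulative Match Characteristic (CMC), whose first value is the rank-1 recognition rate. A minimal sketch of its computation from a probe-vs-gallery distance matrix, assuming every probe identity appears in the gallery (illustrative only, not the toolbox's actual API):

```python
import numpy as np

def cmc_curve(dist, probe_ids, gallery_ids):
    """Cumulative Match Characteristic from a probe x gallery distance matrix.

    dist[i, j] is the distance between probe i and gallery entry j;
    cmc[r] is the fraction of probes whose correct identity appears
    within the r+1 closest gallery entries (cmc[0] = rank-1 rate).
    Assumes every probe identity is present in the gallery.
    """
    order = np.argsort(dist, axis=1)               # closest gallery first
    ranked = gallery_ids[order]                    # identities sorted by distance
    hits = ranked == probe_ids[:, None]
    first_hit = hits.argmax(axis=1)                # rank of the correct match
    cmc = np.zeros(dist.shape[1])
    for r in first_hit:
        cmc[r:] += 1                               # a hit at rank r counts for all ranks >= r
    return cmc / len(probe_ids)
```

Reporting the full curve rather than a single number is what makes results from different databases and classifiers comparable, which is the toolbox's stated goal.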
    									
  • Ear Biometric Database in the Wild

    Žiga Emeršič, Peter Peer: "Ear Biometric Database in the Wild", International Work Conference on Bioinspired Intelligence, IEEE, 2015.

    Ear biometrics has been gaining in popularity in recent years. One of the major problems in the domain is that no widely used in-the-wild ear databases are available. This makes comparison of existing ear recognition methods demanding and slows progress in the domain. Images that were taken under supervised conditions and are then used to train classifiers in ear recognition methods can in effect cause these classifiers to fail when applied in the wild. In this paper we propose a new database which consists of in-the-wild ear images of known persons taken from the Internet. This ensures different indoor and outdoor lighting conditions, different viewing angles, occlusions, and a variety of image sizes and qualities. In experiments we demonstrate that our database is more challenging than others.

    @inproceedings{emersic2015ear,
    	title={Ear biometric database in the wild},
    	author={Emer{\v{s}}i{\v{c}}, {\v{Z}}iga and Peer, Peter},
    	booktitle={2015 4th international work conference on bioinspired intelligence (IWOBI)},
    	pages={27--32},
    	year={2015},
    	organization={IEEE}
    }