3D face recognition with descriptor images and shallow convolutional neural networks
Cardia Neto, João Baptista
There is an increasing need for systems that can identify a person accurately and quickly. Traditional identification methods rely on something a person knows or something a person has. Such methods have several drawbacks, the main one being that they cannot detect an impostor who uses genuine credentials to pass as the legitimate person. Besides, in some cases it is necessary to discover the identity of people covertly. One way to deal with these problems is to use biometric identification. The face is one of the biometric features that best suits covert identification, since current technology can provide high-resolution 2D face images captured by low-cost cameras, covertly, at a distance, and without cooperation from the people being identified. However, biometric systems based on 2D face recognition generally perform very poorly when the input images present variations in pose, illumination, and facial expression. One way to mitigate this problem is to use 3D face data, but current 3D scanners are expensive and require considerable cooperation from the people being scanned. The use of deep convolutional neural networks is another way to mitigate the drawbacks of traditional 2D face recognition, but it can be unfeasible because of their large training-data and computational-power requirements. Therefore, in this thesis, we introduce a hybrid approach to 3D face recognition, based on Shallow Learned Feature Representation, that focuses on minimizing the amount of data, the computational power, and the processing time required in the training stage, while performing close to state-of-the-art methods and transferring what is learned on high-resolution data to low-resolution data. Another important aspect of the proposed hybrid approach is that it can operate in either classification or feature-extraction mode.
Experimental results obtained by our hybrid approach on the EURECOM Kinect Face dataset, a low-resolution depth dataset, showed a rank-1 recognition rate of 90.75% on the hardest case in classification mode and 73.26% in feature-extraction mode, both better than the rates obtained by related state-of-the-art methods with the same protocol and dataset. We therefore conclude that the proposed hybrid approach helps to attenuate cross-resolution differences, and that using an input built from more discriminative data, such as low-level hand-crafted features, allows a shallow CNN to be used for 3D face recognition.
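For readers unfamiliar with the metric, the rank-1 recognition rate reported above can be illustrated with a minimal sketch. It assumes feature-extraction mode yields one non-zero feature vector per face and that probes are matched to the gallery by cosine distance. The function names, the distance measure, and the toy data are illustrative assumptions, not the protocol used in the thesis.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank1_recognition_rate(gallery, probes):
    """Fraction of probes whose nearest gallery entry has the same identity.

    gallery, probes: lists of (identity, feature_vector) pairs.
    """
    correct = 0
    for probe_id, probe_vec in probes:
        # The rank-1 match is the single closest gallery entry.
        best_id, _ = min(
            gallery, key=lambda item: cosine_distance(probe_vec, item[1])
        )
        if best_id == probe_id:
            correct += 1
    return correct / len(probes)


# Toy data (hypothetical): two enrolled identities, three probes.
gallery = [("alice", [1.0, 0.0]), ("bob", [0.0, 1.0])]
probes = [
    ("alice", [0.9, 0.1]),  # closest to alice: correct
    ("bob", [0.2, 0.8]),    # closest to bob: correct
    ("bob", [0.7, 0.3]),    # closest to alice: incorrect
]
rate = rank1_recognition_rate(gallery, probes)
```

In this toy example two of the three probes are matched to the correct identity, so `rate` is 2/3. In classification mode the same metric would instead compare the network's top predicted class with the true identity.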