The recent growth of wearable cameras has made person re-identification (re-ID) from first-person vision data feasible. Nevertheless, suitable egocentric vision datasets remain scarce, since the recorded footage often suffers from blur, illumination changes, and poor video quality.
A recent study proposes a neural style transfer-based domain adaptation technique, not previously applied in this setting, to bridge the gap between fixed-camera and egocentric datasets.
The approach uses fixed-camera re-ID datasets to improve the performance of egocentric re-ID. First, images combining features of both the egocentric and fixed-camera datasets are generated. A pre-trained model is then fine-tuned on images from a fixed-camera dataset, and the features it computes are used to re-identify individuals. Using style-transferred images increased the recognition rate by up to 203.8% compared with non-style-transferred images.
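The re-identification step itself reduces to comparing feature vectors: a query image's features are matched against a gallery of known identities. A minimal sketch of that matching stage, using cosine similarity (the feature extractor is abstracted away, and the function names here are illustrative, not from the paper):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def re_identify(query_feat, gallery_feats):
    """Rank gallery identities by similarity to the query feature;
    return the index of the best match and all scores."""
    scores = [cosine_similarity(query_feat, g) for g in gallery_feats]
    return int(np.argmax(scores)), scores

# Toy example: three gallery identities with 4-D features
gallery = [np.array([1.0, 0.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0, 0.0]),
           np.array([0.7, 0.7, 0.0, 0.0])]
query = np.array([0.9, 0.1, 0.0, 0.0])
best, scores = re_identify(query, gallery)  # best match is identity 0
```

In practice the feature vectors would come from the fine-tuned network rather than being hand-specified, but the ranking logic is the same.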
Person re-identification (re-ID) in first-person (egocentric) vision is a fairly new and unexplored problem. With the proliferation of wearable video recording devices, egocentric data has become readily available, and person re-identification stands to benefit greatly from it. However, there is a significant lack of large-scale structured egocentric datasets for person re-identification, owing to poor video quality and the scarcity of individuals in most recorded content. Although substantial research exists on person re-identification with fixed surveillance cameras, it does not directly transfer to egocentric re-ID: machine learning models trained on the publicly available large-scale re-ID datasets cannot be applied to egocentric re-ID because of dataset bias. The proposed algorithm uses neural style transfer (NST), built on a variant of a Convolutional Neural Network (CNN), to combine the benefits of fixed-camera vision and first-person vision. NST generates images having features from both egocentric and fixed-camera datasets, which are fed through a VGG-16 network trained on a fixed-camera dataset for feature extraction. These extracted features are then used to re-identify individuals. The fixed-camera dataset Market-1501 and the first-person dataset EGO Re-ID are used in this work, and the results are on par with existing re-identification models in the egocentric domain.
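The core idea behind NST is that a CNN feature map's Gram matrix captures image style (texture statistics) independently of content, so a generated image can be optimized to match the content of one image and the style of another. A minimal NumPy sketch of that style representation and its loss (these helper names are illustrative, not the paper's API):

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (channels, height*width) feature map --
    the style representation used in Gatys-style neural style transfer."""
    c, n = features.shape
    return features @ features.T / n  # channel-by-channel correlations

def style_loss(gen_feats, style_feats):
    # Mean squared difference between Gram matrices: zero when the
    # generated image reproduces the style image's texture statistics
    return float(np.mean((gram_matrix(gen_feats) - gram_matrix(style_feats)) ** 2))
```

In the full method, this style loss (computed on, e.g., a fixed-camera style image) is minimized jointly with a content loss on the egocentric image's feature maps, yielding the hybrid images that are then fed to the VGG-16 feature extractor.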
Research paper: Choudhary, A., Mishra, D., and Karmakar, A., “Domain Adaptive Egocentric Person Re-identification”, 2021. Link: https://arxiv.org/abs/2103.04870