Infants Activity Recognition based on human pose estimation as a support for privacy-preserving neurodevelopmental disorders diagnosis

Sguazza, Simone (2020) Infants Activity Recognition based on human pose estimation as a support for privacy-preserving neurodevelopmental disorders diagnosis. Master thesis, Scuola universitaria professionale della Svizzera italiana.


The project focuses on a Computer Vision application that aims to automate the analysis of infant behavior in order to support the early diagnosis of neurodevelopmental disorders. The dataset is a set of videos of infants aged between 15 and 18 months who are free to move inside a fixed multi-camera indoor environment, in an individual free play area, in close interaction with the educator and with other children outside the free play area. The application is intended to help doctors examine how infants use specific toys in order to detect neurodevelopmental disorders. Today this task is mostly performed manually by experts who watch each video and make annotations, which is time-consuming; the application is designed to help experts automate it. The infants are not aware of the cameras; they explore and interact with the environment and the adult. The video data was not collected for a Computer Vision application: the child can move freely and do whatever he or she wants. Unlike most state-of-the-art studies, where the environment and the subject are arranged to suit skeleton extraction and yield a high-quality sample, nothing was done here to limit external noise when the environment was prepared. In my work I focus on detecting and tracking a generic infant in a noisy environment, and on extracting and stabilizing the skeleton, which I then exploit to infer human activity. The generic tracker works quite well, but it is not always robust enough to follow the child as reliably as a human observer would. Even with a perfect track, the pose estimator may fail to estimate the exact position of the child's skeleton because of the camera angle and perspective. A large variance among skeletons for the same action leads to poor classifier performance.
With this work we demonstrate the potential of computer vision technologies for anonymizing videos and for automatically performing a mobility analysis, which provides valuable support in the diagnosis of neurodevelopmental disorders in infants. The project was divided into three main phases:
- The first phase tracks a generic infant in a delimited free play area, where the infant plays with specific toys and/or explores the environment. I built a tailored Machine Learning tracker to address the problem of aliasing introduced by the adult on the ground within the free play area.
- The second phase generates the skeleton with OpenPose and stabilizes it with a linear Kalman filter.
- The third phase extracts the skeleton, encodes the information from the three cameras within an image and infers the actions with a tailored CNN classifier.
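
The stabilization step of the second phase can be illustrated with a minimal sketch. It assumes a constant-velocity motion model applied independently to each keypoint coordinate, which may differ from the filter design used in the thesis. The class `Kalman1D`, the function `smooth_track`, the noise values, and the use of `None` to mark a frame where the pose estimator returned no keypoint are all hypothetical names and choices introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Kalman1D:
    """Linear Kalman filter with state [position, velocity] for one coordinate.

    Illustrative sketch only; the noise parameters are assumed values.
    """
    pos: float
    vel: float = 0.0
    # State covariance P, stored as four scalars (2x2 matrix).
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    q_pos: float = 1e-2  # assumed process noise on position
    q_vel: float = 1e-2  # assumed process noise on velocity
    r: float = 4.0       # assumed measurement noise variance (pixels^2)

    def predict(self, dt: float = 1.0) -> None:
        """Propagate the state one frame ahead: x = F x, P = F P F^T + Q."""
        self.pos += self.vel * dt
        a00 = self.p00 + dt * self.p10
        a01 = self.p01 + dt * self.p11
        self.p00 = a00 + dt * a01 + self.q_pos
        self.p01 = a01
        self.p10 = self.p10 + dt * self.p11
        self.p11 = self.p11 + self.q_vel

    def update(self, z: float) -> None:
        """Correct the state with a measured position z (H = [1, 0])."""
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # Kalman gain for position
        k1 = self.p10 / s              # Kalman gain for velocity
        y = z - self.pos               # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        p00, p01 = self.p00, self.p01
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01


def smooth_track(zs: List[Optional[float]]) -> List[float]:
    """Filter one coordinate of one keypoint over time.

    Frames with a missing detection (None) are bridged by prediction only.
    """
    first = next(z for z in zs if z is not None)
    kf = Kalman1D(pos=first)
    out = []
    for i, z in enumerate(zs):
        if i > 0:
            kf.predict()
        if z is not None:
            kf.update(z)
        out.append(kf.pos)
    return out
```

In this sketch a full skeleton would be stabilized by calling `smooth_track` once for the x sequence and once for the y sequence of every keypoint. Frames with a missed detection keep an estimate instead of leaving a hole in the sequence.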

Item Type: Thesis (Master)
Supervisors: Papandrea, Michela
Subjects: Computer science
Divisions: Dipartimento tecnologie innovative
