PRE2018 4 Group8

From Control Systems Technology Group

Project Plan

Members

Name Student ID Email Study
Rik Hoekstra 1262076 r.hoekstra@student.tue.nl Applied Mathematics
Wietske Blijjenberg 1025111 w.t.p.blijjenberg@student.tue.nl Software Science
Kilian Cozijnsen 1004704 k.d.t.cozijnsen@student.tue.nl Biomedical Engineering
Arthur Nijdam 1000327 c.e.nijdam@student.tue.nl Biomedical Engineering
Selina Janssen 1233328 s.a.j.janssen@student.tue.nl Biomedical Engineering

Ideas

Surgery robots

The DaVinci surgery system has become a serious competitor to conventional laparoscopic surgery techniques. This is because the machine has more degrees of freedom, allowing the surgeon to carry out movements that were not possible with other techniques. The DaVinci system is controlled by the surgeon themselves, so the surgeon has full control over, and responsibility for, the result. However, as robots become more advanced, they might become more autonomous as well. Mistakes can still occur, albeit perhaps less frequently than with human surgeons. In such cases, who is responsible: the robot manufacturer or the surgeon? In this research project, the ethical implications of autonomous robot surgery could be addressed.

Elderly care robots

The elderly population is growing rapidly in most developed countries, while vacancies in elderly care often remain unfilled. Elderly care robots could therefore be a solution, as they relieve pressure on the carers of elderly people. They can also offer more specialised care and aid the person in their social development. However, the information recorded by the sensors and the video images recorded by cameras should be well protected, as the privacy of the elderly should be ensured. In addition, robot care should not infantilise the elderly and should respect their autonomy.

Facial emotion recognition

Facebook uses advanced Artificial Intelligence (AI) to recognise faces. This data can be used or misused in many ways. Totalitarian governments can use such techniques to control the masses, but care robots could use facial recognition to read the emotional expression of the person they are taking care of. In this research project, facial recognition for emotion regulation can be explored, as there are interesting technical and ethical implications that this technology might have on the field of robotic care.

Subject

We have chosen emotion recognition in the elderly as our subject of study. For this purpose, the following research question was defined:


In what way can Convolutional Neural Networks (CNNs) be used to perform emotion recognition in real-time video images of elderly people?

Sub-subjects

Based on the research question, a set of sub-questions was identified. Together, the answers to these sub-questions should answer the research question.

  • technical sub-questions:

What are the requirements for the dataset that will be used?

How large does the training set need to be to guarantee that it is not biased?

What are the requirements for the test and validation sets?

What are the best features to use for facial expression analysis?

What is a suitable CNN architecture to analyse dynamic facial expressions?


  • USE sub-questions:

What is a possible application of our CNN emotion recognition technology?

Is the use of facial recognition in conflict with privacy?

What are the consequences of false-positives versus false-negatives?

Which users and enterprises would benefit from our software?

Are there legal issues that will impede the application of our technology?
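The sub-question on false positives versus false negatives can be made concrete by treating one emotion as the positive class and counting the four possible outcomes. The following is a minimal sketch, not part of the project code: the function names and the example labels are made up for illustration.

```python
from collections import Counter


def one_vs_rest_counts(true_labels, predicted_labels, target):
    """Count TP, FP, FN and TN when `target` is treated as the positive class."""
    counts = Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target and pred == target:
            counts["tp"] += 1
        elif truth != target and pred == target:
            counts["fp"] += 1
        elif truth == target and pred != target:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def error_rates(counts):
    """Return (false_negative_rate, false_positive_rate); 0.0 when undefined."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["fp"] + counts["tn"]
    fnr = counts["fn"] / positives if positives else 0.0
    fpr = counts["fp"] / negatives if negatives else 0.0
    return fnr, fpr


# Made-up labels, purely for illustration.
truth = ["sad", "sad", "happy", "neutral", "sad", "happy"]
preds = ["sad", "neutral", "happy", "sad", "sad", "happy"]
counts = one_vs_rest_counts(truth, preds, target="sad")
fnr, fpr = error_rates(counts)
print(dict(counts), fnr, fpr)
```

A high false negative rate for "sad" means the robot misses unhappy moments, while a high false positive rate means it intervenes when nothing is wrong; which of the two is worse is a USE question rather than a technical one.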

Problem Statement

According to Kacperck[1], effective communication in elderly care is dependent on the nurse’s ability to listen and utilize non-verbal communication skills. Argyle[2] says there are 3 distinct forms of human non-verbal communication:

  • Non-verbal communication as a replacement for language
  • Non-verbal communication as a support and complement of verbal language, to emphasize a sentence or to clarify the meaning of a sentence
  • Non-verbal communication of attitudes and emotions and manipulation of the immediate social situation, for example when sarcasm is used.

Facial expressions play an important role in these forms of non-verbal communication. However, robots do not have the natural ability to recognize emotions as humans do. This can lead to problems with elderly care robots, for example when a patient consciously or subconsciously tries to communicate something through facial expressions and the display of emotions, and the robot does not recognize this or recognizes it inaccurately. The elderly person may get frustrated because they have to put everything they feel and want into words, which may lead to them appreciating their care less. In the worst case they may not accept the robot because it feels too inhuman and cold.

If robots could accurately recognize human emotions, for example with the use of Convolutional Neural Networks, the care that elderly care robots provide could be enhanced in many ways. However, the use of facial recognition does raise ethical questions, such as whether it compromises privacy. This project investigates in what way Convolutional Neural Networks (CNNs) can and should be used for emotion recognition in elderly care robots.

Objectives:

  • Construct a CNN that can distinguish at least one emotion from other emotions.
  • Analyze the technical possibilities of the CNN.
  • Analyze the ethics of using a CNN for emotion recognition in elderly care robots.
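As background for the first objective, the sketch below shows the three operations that a typical CNN layer stacks: a convolution, a ReLU activation and max pooling. It is a pure-Python illustration under our own assumptions (function names, the toy 4x4 image and the hand-picked kernel), not the network this project will train; a real implementation would use a deep learning framework and learn the kernel values from data.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Like most CNN frameworks, this computes a cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses by zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; an odd trailing row or column is dropped."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy image with a dark left half and a bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
features = relu(conv2d_valid(image, kernel))
print(features)                # strongest response in the middle column
print(max_pool_2x2(features))
```

In a trained CNN, many such kernels are learned per layer, and the pooled feature maps of the last layer feed a classifier that outputs one score per emotion.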

19.9% of the elderly report that they experience feelings of loneliness. A potential cause is that they have often lost a large part of their family and friends. A solution could be an assistive robot with a human-robot interactive aspect, which recognises the facial expression of the elderly person and deduces their needs from it. If the elderly person looks sad, the robot might suggest that they contact a family member or a friend via a Skype call. However, the technology for such interaction has not been developed thoroughly yet. While human-robot interaction using speech analysis is a relatively mature topic, facial expression recognition from a robot's camera images is far less explored. Research also shows that the combination of video images and recorded speech data is especially powerful and accurate in determining an elderly person's emotion. Therefore, this research project proposes a package for facial emotion recognition that can be used for the SocialRobot project, where facial recognition and speech analysis have already been implemented, but facial expression analysis has not.

SocialRobot is a European project that aims to aid elderly people in living independently in their own house and to improve their quality of life by helping them maintain social contacts. The robotic platform has two wheels and is 125 cm tall, so that it looks approachable. Its equipment includes, among other things, a camera, an infrared sensor, microphones, a programmable array of LEDs that can show facial expressions, and a computer. Its battery pack allows it to operate continuously for 5 hours.

The SocialRobot recognises the elderly person's face and then reads their emotion from the responses they give to questions the robot asks. The speech data is analysed and the emotion of the elderly person is derived from it. The accuracy of this system was 82%. The idea is that the robot uses the response as input for the actions it takes afterwards: if the person is sad, it will encourage them to initiate a Skype call with their friends; if the person is bored, it will encourage them to play cards online with friends.
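The behaviour described here, extended with the facial channel this project proposes, could be sketched as follows. This is only an illustration under our own assumptions: the weighted-average ("late fusion") rule, the equal weights, the confidence threshold and all names are hypothetical and are not taken from the SocialRobot software. Only the two suggestions come from the description above.

```python
def fuse(face_probs, speech_probs, face_weight=0.5):
    """Weighted average of two per-emotion probability dicts (late fusion).

    `face_weight` is an assumed tuning parameter, not a published value.
    """
    emotions = set(face_probs) | set(speech_probs)
    fused = {
        emotion: face_weight * face_probs.get(emotion, 0.0)
        + (1 - face_weight) * speech_probs.get(emotion, 0.0)
        for emotion in emotions
    }
    total = sum(fused.values())
    return {e: p / total for e, p in fused.items()} if total else fused


# The two suggestions mentioned in the text; other emotions trigger nothing.
SUGGESTIONS = {
    "sad": "suggest a Skype call with friends",
    "bored": "suggest playing cards online with friends",
}


def choose_action(probs, min_confidence=0.6):
    """Return a suggestion only when the most likely emotion is confident enough."""
    emotion = max(probs, key=probs.get)
    if probs[emotion] < min_confidence:
        return None
    return SUGGESTIONS.get(emotion)


# Made-up classifier outputs for one interaction.
face = {"sad": 0.7, "bored": 0.2, "neutral": 0.1}
speech = {"sad": 0.6, "bored": 0.3, "neutral": 0.1}
print(choose_action(fuse(face, speech)))
```

Returning no action below the threshold is a deliberate choice in this sketch: an unwarranted suggestion may be more intrusive than a missed one, which ties in with the USE sub-question on false positives and false negatives.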

Application example

Bart always describes himself as "pretty active back in his days". But now that he has reached the age of 83, he is not that active anymore. He lives in an apartment complex for the elderly with an artificial companion named John. John is a care robot: besides helping Bart in the household, it functions as an interactive conversation partner. Every morning after greeting Bart, the robot gives a weather forecast, a trick it learned from analysis of human-human conversations, which almost always start with a chat about the weather. Today this seems to work fine, as after some time Bart reacts with: “Well, it promises to be a beautiful day, doesn’t it?” But if this robot were equipped with simple emotion recognition software, it would have noticed that a sad expression appeared on Bart’s face after the weather came up. In fact, every time Bart hears the weather forecast he thinks about how he used to go out to enjoy the sun and the fact that he can’t do that anymore. With emotion recognition the robot could avoid this subject in the future, or it might try to arrange with the nurses that Bart goes outside more often.

In this example Bart would benefit from the implementation of facial emotion recognition software in his care robot. At the same time, a conflict of values arises. The implementation of emotion recognition software could seriously improve the quality of the care delivered by the care robot. On the other hand, we should seriously consider to what extent these robots may replace interaction with real humans. And when the robot decides to take action to get the nurses to let Bart go outside more often, this might conflict with his right to privacy.

Target user evaluation of the initial plan

User - Target User Analysis

According to (20), care robots can be used not only in a functional way, but also to promote the autonomy of the elderly person by assisting them to live in their own home, and to provide psychological support. This is necessary, as researchers from the Amsterdam Study of the Elderly (AMSTEL) found that about 20% of the Dutch elderly experience feelings of loneliness. They have often lost a significant part of their social contacts and possibly their partner, leading to loneliness. The research links loneliness to the onset of dementia. Therefore, the reduction of social isolation is important for both the quality of life and the mental state of the elderly.

While emotion recognition can be used for various kinds of target groups (see the state-of-the-art section), the high levels of loneliness among the elderly are the motivation for choosing the elderly as our target group. However, elderly people are still a broad target group with a wide range of needs, in which the following categories can be defined:

  • Elderly people with regular mental and functional capacities.
  • Elderly people with affected mental capacities but with decent physical capabilities.
  • Elderly people with affected mental and physical capacities.

All of the categories of elderly people might experience loneliness, but categories 2 and 3 are more likely to need a care robot. They are also a vulnerable group of people, as they might not have the mental capacity to consent to the robot's actions. In this respect, interpreting the person's social cues is vital for their treatment, as they might not be able to put their feelings into words. For this group of elderly, false negatives for unhappiness can have an especially large impact. To deduce what impact they can have, it is important to look at the possible applications of this technology in elderly care robots.

As the elderly, especially those of categories 2 and 3, are vulnerable, their privacy should be protected. Information regarding their emotions can be used for their benefit, but can also be used against them, for example to force psychological treatment if the patient does not score well enough on the 'happiness scale' as determined by the AI. Therefore, the system should be safe and secure. If possible, at least in the first stages, secondary users can play a large role as well. Examples of such secondary users are formal caregivers and social contacts. The elderly person should be able to consent to information regarding their emotions being shared with these secondary users.

State-of-the-Art technology

Technical insight

Neural networks can be used for facial recognition and emotion recognition. The approaches in the literature can be classified based on the following elements:

  • The database used for training
  • The feature selection method
  • The neural network architecture

Article 1
  • Database used: own database
  • Feature selection method: Facial expressions are recognized in the following steps. The facial image is divided into three regions of interest (ROI): the eyes, nose and mouth. Features are then extracted using a 2D Gabor filter and LBP, and PCA is used to reduce the number of features.
  • Neural network architecture: Extreme Learning Machine classifier
  • Additional information: This article describes a robotic system that not only recognizes human emotions but also generates its own facial expressions as cartoon symbols.

Article 3
  • Database used: Karolinska Directed Emotional Faces (KDEF) dataset
  • Feature selection method: A deep Convolutional Neural Network (CNN) is used for feature extraction and a Support Vector Machine (SVM) for emotion classification.
  • Neural network architecture: Deep Convolutional Neural Network (CNN)
  • Additional information: This approach reduces the number of layers required and has a classification rate of 96.26%.

Article 5
  • Database used: Extended Cohn-Kanade (CK+) and the Japanese Female Facial Expression (JAFFE) dataset
  • Feature selection method: not mentioned
  • Neural network architecture: Deep Convolutional Neural Networks (DNNs)
  • Additional information: The researchers aimed to identify 6 different facial emotion classes.

Article 7
  • Database used: own database
  • Feature selection method: Facial expression images were recorded and then segmented using skin color. Features were extracted using integral optic density (IOD) and edge detection.
  • Neural network architecture: SVM-based classification
  • Additional information: In addition to facial expressions, speech signals were also recorded. The aim was to classify 5 different emotions, which was achieved with 87% accuracy (5% more than with the images alone).

Article 8
  • Database used: own database
  • Feature selection method: unknown
  • Neural network architecture: Bayesian facial recognition algorithm
  • Additional information: This article is from 1999 and stands at the basis of machine learning, using a Bayesian matching algorithm to predict which faces belong to the same person.

Article 9
  • Database used: unknown
  • Feature selection method: The article uses a 3D Candide face model that describes features of face movement, such as 'brow raiser'; the authors selected the features they considered most important.
  • Neural network architecture: CNN
  • Additional information: The joint probability describes the similarity between the image and the emotion described by the parameters of the Kalman filter of the emotional expression as described by the features, and it is maximized to find the emotion corresponding to the picture. This article is an advancement of the methods described in 8. The system is more effective than other Bayesian methods like Hidden Markov Models and Principal Component Analysis.

Article 10
  • Database used: Cohn-Kanade database
  • Feature selection method: unknown
  • Neural network architecture: Bayes optimal classifier
  • Additional information: The tracking of the features was carried out with a Kalman filter. The correct classification rate was almost 94%.
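
To give an idea of what a hand-crafted feature such as the LBP (local binary pattern) used in article 1 looks like, here is a minimal sketch of the basic 3x3 LBP operator. The neighbour ordering and the function names are our own choices for illustration; the article may use a different LBP variant, and libraries such as scikit-image provide optimised implementations.

```python
# Neighbour offsets (row, col), clockwise starting at the top-left pixel.
# The ordering is a convention chosen for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c): bit k is set when neighbour k >= centre."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a grey image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Toy 3x3 grey-value patch; only the centre pixel has all 8 neighbours.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch, 1, 1))  # neighbours 9, 7 and 8 are >= 6, giving 2 + 8 + 16 = 26
```

The histogram of such codes over a region of interest (eyes, nose or mouth) is the kind of feature vector that can then be reduced with PCA and passed to a classifier.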


Social implications

According to (12), some elderly have problems recognizing emotions. This is problematic, as primary care facilities for the elderly rely on emotional expressions in their care, e.g. trying to cheer the elderly person up by smiling. It would be very useful for the elderly to have a device similar to the Autismglass in source (11).

Possible applications

The research team of (6) has developed an android with facial tracking, expression recognition and eye tracking for the treatment of children with high-functioning autism. During sessions with the robot, the children can enact social scenarios in which the robot is the 'person' they are conversing with.

Source (11) describes facial and emotion recognition with the Google glass, for children with Autism Spectrum Disorder (ASD). See also https://www.youtube.com/watch?v=_kzfuXy1yMI for a demonstration of Stanford's 'autismglass'.

Approach

First, a study of the state of the art will be done to get familiar with the different techniques for using a CNN for facial recognition. This study will also be crucial for deciding how our project will go beyond what has already been researched. Then a database with relevant photos will be constructed for training the CNN, as well as one to test it. A CNN will be constructed, implemented and trained to recognize emotions using the database of photos. After training, the CNN will be tested with the testing database and, if time allows, on real people. The usefulness of this CNN in elderly care robots will then be analysed, as well as the ethical aspects.
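
The step of constructing separate training and testing databases could be sketched as a stratified split, which keeps the proportion of each emotion roughly the same in every subset. This is a minimal illustration under our own assumptions: the 70/15/15 fractions, the fixed seed and the file names are hypothetical, and the project may end up using a library routine instead.

```python
import random
from collections import defaultdict


def stratified_split(samples, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split (item, label) pairs into train/validation/test lists per label.

    `fractions` gives the train and validation shares; the test set
    receives whatever remains of each label group.
    """
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]
    return train, val, test


# Hypothetical file names and labels, purely for illustration.
samples = [(f"img_{i:03d}.png", "happy") for i in range(20)]
samples += [(f"img_{i:03d}.png", "sad") for i in range(20, 40)]
train, val, test = stratified_split(samples)
print(len(train), len(val), len(test))
```

When a database contains several images of the same person, it is common practice to split by person rather than by image, so that the test set only contains faces the network has never seen.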

Deliverables

The deliverables of this project include:

- Software: a neural network trained to recognize emotion from pictures of facial expressions. (Whether this covers one emotion or several, and the exact implementation, are still to be determined.)
- A wiki page: this will describe the entire process the group went through during the research, as well as a technical and USE evaluation of the product.


Planning

A detailed planning will be kept up to date in this Gantt Chart. The Gantt Chart will be updated during the course and will be used to visualise the tasks at hand and their deadlines, together with the people responsible for delivering each task. (The responsible person is not the only person working on the task, but is the person who makes sure the task is finished in time.)


Things to do during the first week:

The wiki should contain: a subject (problem statement and objectives), users (what do they require?), objectives, state-of-the-art, approach, planning, milestones, deliverables, and who will do what. SotA: literature study of at least 25 relevant scientific papers and/or patents, with a summary on the wiki.

Second week:

- Gather databases and request access + make a list of the databases: Rik + Selina (AffectNet)

- Try the machine learning model on Cohn-Kanade database: Kilian

- Structure the wiki and write a detailed proposal for our research project: Arthur

- Second half of the sources in state-of-the-art: Wietske


Interesting persons

Emilia Barakova

Data sets

https://en.wikipedia.org/wiki/Facial_expression_databases

- Cohn-Kanade: http://www.consortium.ri.cmu.edu/data/ck/CK+/CVPR2010_CK.pdf

- RAVDESS: only videos

- JAFFE: only Japanese women. Not relevant for our research

- MMI

- Belfast Database

- MUG: 35 women and 51 men participated in the database, all of Caucasian origin and between 20 and 35 years of age. The men are with or without beards. The subjects are not wearing glasses, except for 7 subjects in the second part of the database. There are no occlusions except for a few hairs falling on the face. The images of 52 subjects are available to authorized internet users. The data that can be accessed amounts to 38GB.

- RaFD: Request for access has been sent (Rik).

- FERG: avatars with annotated emotions. Request for access has been sent (Rik). Cite "D. Aneja, A. Colburn, G. Faigin, L. Shapiro, and B. Mones. Modeling stylized character expressions via deep learning. In Proceedings of the 13th Asian Conference on Computer Vision. Springer, 2016."

- AffectNet: Huge database (122GB) containing 1 million pictures collected from the internet, of which 400 000 are manually annotated. Cite "A. Mollahosseini; B. Hasani; M. H. Mahoor, "AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild," in IEEE Transactions on Affective Computing, 2017."

- IMPA-FACE3D: 36 subjects, 5 elderly, open access

- FEI: only neutral and smiling expressions, university employees

- Aff-Wild downloaded (Kilian)
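
Since several of these databases cover only a narrow group (for example only young adults or only a few elderly subjects), a quick audit of the metadata can show which groups are underrepresented before training. The sketch below is an illustration under our own assumptions: the record format, the `age_group` field and the 25% threshold are hypothetical, and the example assumes that the IMPA-FACE3D note means 5 of its 36 subjects are elderly.

```python
from collections import Counter


def group_shares(records, key):
    """Fraction of records per value of `key` (e.g. 'age_group' or 'emotion')."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def underrepresented(shares, minimum_share):
    """Values whose share of the data falls below `minimum_share`."""
    return sorted(value for value, share in shares.items() if share < minimum_share)


# Hypothetical metadata: 36 subjects, 5 of them elderly (assumed reading of the note above).
records = [{"age_group": "adult"} for _ in range(31)]
records += [{"age_group": "elderly"} for _ in range(5)]

shares = group_shares(records, "age_group")
print(shares)
print(underrepresented(shares, minimum_share=0.25))
```

The same two functions can be applied to the emotion labels to check whether some emotions are much rarer than others in a given database.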

Sources

Sources for the second self-study: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7405084 This study is very close to the application we had in mind. The assistive care robot recognises the elderly person's face and then reads their emotion from the responses they give to questions (so it is related to speech processing, not to facial expression analysis). The accuracy of this system was 82%. The idea is that the robot uses the response as input: if the person is sad, it will encourage them to initiate a Skype call with their friends; if the person is bored, it will encourage them to play cards online with friends.

This is another example of such a system that uses voice analysis for interactive human-robot contacts: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7480174

https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4755969 This robot has facial expressions, but it does not interact with its users.


1) https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8039024 A Facial Expression Emotion Recognition Based Human-robot Interaction System

2) https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=956083

3) https://link.springer.com/content/pdf/10.1007%2Fs00521-018-3358-8.pdf A hybrid deep learning neural approach for emotion recognition from facial expressions for socially assistive robots.

4) https://link.springer.com/content/pdf/10.1007%2F978-94-007-3892-8.pdf

5)Extended deep neural network for facial emotion recognition, with extensive input data manipulation: https://reader.elsevier.com/reader/sd/pii/S016786551930008X?token=3E015F2B3E9E6290D0EA5A3C8CA42C6F7198698E6A17043ADA159C2A5106C4053CBDEE27E39196AE6C415A0DDAF711F4

6) https://ieeexplore.ieee.org/abstract/document/1556608 "An android for enhancing social skills and emotion recognition in people with autism"

7) https://pdfs.semanticscholar.org/e97f/4151b67e0569df7e54063d7c198c911edbdc.pdf "A New Information Fusion Method for Bimodal Robotic Emotion Recognition"

8) "Bayesian face recognition" https://www.sciencedirect.com/science/article/pii/S003132039900179X

Kalman filters for emotion recognition:

9) https://link.springer.com/chapter/10.1007/978-3-642-24600-5_53: Kalman Filter-Based Facial Emotional Expression Recognition. This article uses a 3D Candide face model that describes features of face movement, such as 'brow raiser'; the authors selected the features they considered most important. The joint probability describes the similarity between the image and the emotion described by the parameters of the Kalman filter of the emotional expression as described by the features, and it is maximised to find the emotion corresponding to the picture. The system is more effective than other Bayesian methods like Hidden Markov Models and Principal Component Analysis.

10) https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4658455: Kalman Filter Tracking for Facial Expression Recognition using Noticeable Feature Selection. This paper used conventional CNNs to recognise the facial expression, but the tracking of the features was carried out with a Kalman Filter.

11) Facial recognition with Google Glass, for children with Autism Spectrum Disorder (ASD): https://humanfactors.jmir.org/2018/1/e1/ Second Version of Google Glass as a Wearable Socio-Affective Aid: Positive School Desirability, High Usability, and Theoretical Framework in a Sample of Children with Autism. See also https://www.youtube.com/watch?v=_kzfuXy1yMI for a demonstration of Stanford's 'autismglass'.

12) But autistic children are not the only target group that has difficulty recognizing emotions in others. This is also the case for elderly: https://www.tandfonline.com/doi/pdf/10.1080/00207450490270901. EMOTION RECOGNITION DEFICITS IN THE ELDERLY. This is problematic, as primary care facilities for the elderly try to care using their emotions, e.g. to cheer the elderly person up by smiling.

13) https://www.researchgate.net/profile/Antonio_Fernandez-Caballero/publication/278707087_Improvement_of_the_Elderly_Quality_of_Life_and_Care_through_Smart_Emotion_Regulation/links/562e0bc808ae518e34825f40/Improvement-of-the-Elderly-Quality-of-Life-and-Care-through-Smart-Emotion-Regulation.pdf. This paper proposes that the quality of life of the elderly improves if smart sensors that recognize their emotions are installed in their environment. DOI: 10.1007/978-3-319-13105-4_50

14) https://tue.on.worldcat.org/oclc/4798799506 "Face recognition technology: security versus privacy" good source for arguments about face recognition. Keep in mind that we plan on developing software to recognize emotions not to identify faces. Also gives current (2004) state of face recognition technology.

Discussed in this article: 1) outline a realistic understanding of the current state of the art in face recognition technology, 2) develop an understanding of fundamental technical tradeoffs inherent in such technology, 3) become familiar with some basic vocabulary used in discussing the performance of recognition systems, 4) be able to analyze the appropriateness of suggested analogies to the deployment of face recognition systems, 5) be able to assess the potential for misuse or abuse of such technology, and 6) identify issues to be dealt with in responsible deployment of such technology.

15) https://www.mdpi.com/2073-8994/10/9/387/htm "Smart Doll: Emotion Recognition Using Embedded Deep Learning" This article describes a doll which uses local emotion recognition software. Exactly the kind of software we want to develop. Cohn Kanade Extended is used as a database for facial expressions.

The potential of deep learning has been addressed in the form of CNN inference. They have used EoT in the real case of an emotional doll.

16) Neural network for emotion recognition, used because it can adapt to the user and context of the situation. (User and context adaptive neural networks for emotion recognition) https://www.sciencedirect.com/science/article/pii/S092523120800218X

17) By using a technique called "transfer learning", neural networks can be trained on a certain set of images, unrelated to the goal of the neural network. After this, the network can be trained on a small data set, so it can implement the needed functionality. https://www.researchgate.net/profile/Vassilios_Vonikakis/publication/298281143_Deep_Learning_for_Emotion_Recognition_on_Small_Datasets_Using_Transfer_Learning/links/56e7b18408ae4c354b1bc8d8/Deep-Learning-for-Emotion-Recognition-on-Small-Datasets-Using-Transfer-Learning.pdf

18) https://ieeexplore.ieee.org/abstract/document/5543262 About the "Cohn Kanade Extended" data set

19) https://s3.amazonaws.com/academia.edu.documents/43626411/Feelings_of_loneliness_but_not_social_is20160311-5371-cjo4vg.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1556914705&Signature=AVL61%2Fnb2bpN5xHwlTpHQ9nBTcw%3D&response-content-disposition=inline%3B%20filename%3DFeelings_of_loneliness_but_not_social_is.pdf Feelings of loneliness, but not social isolation, predict dementia onset: results from the Amsterdam Study of the Elderly (AMSTEL)

This study was carried out on a large group of elderly from Amsterdam, of whom 20% experienced feelings of loneliness. They have linked loneliness to dementia onset.

20) https://www.researchgate.net/publication/229058790_Assistive_social_robots_in_elderly_care_A_review Assistive social robots in elderly care: a review

A variety of effects or functions of assistive social robots have been studied, including (i) increased health by decreased level of stress, (ii) more positive mood, (iii) decreased loneliness, (iv) increased communication activity with others, and (v) rethinking the past. Most studies report positive effects (Table 1). With regards to mood, companion robots are reported to increase positive mood, typically measured using evaluation of facial expressions of elderly people as well as questionnaires.

21) https://www.intechopen.com/download/pdf/12200 "Emotion Recognition through Physiological Signals for Human-Machine Communication "

22) https://ieeexplore.ieee.org/abstract/document/8535710 "Real-time Facial Expression Recognition on Robot for Healthcare"

23) https://doi.org/10.1080/00207450490270901 "EMOTION RECOGNITION DEFICITS IN THE ELDERLY" Study shows that elderly people have more difficulties with recognition of emotions. This might be another application of our software.

24) A. Mollahosseini; B. Hasani; M. H. Mahoor, "AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild," in IEEE Transactions on Affective Computing, 2017.

  1. Lynn Kacperck. (2014, December). Non-verbal communication: the importance of listening. Retrieved May 5, 2019, from https://www.magonlinelibrary.com/doi/abs/10.12968/bjon.1997.6.5.275
  2. Argyle, M. (1972). Non-verbal communication in human social interaction. In R. A. Hinde, Non-verbal communication. Oxford, England: Cambridge U. Press. Retrieved May 5, 2019, from https://psycnet.apa.org/record/1973-24485-010