PRE2019 4 Group1
Students: Bryan Buster, Edyta Koper, Sietze Gelderloos, Matthijs Logeman, Bart Wesselink.
Problem statement and objectives
With about one-third of the world’s population living under some form of quarantine due to the COVID-19 outbreak [1], scientists sound the alarm on the negative psychological effects of the current situation [2]. Studies on the impact of massive self-isolation in the past, such as in Canada and China in 2003 during the SARS outbreak or in West African countries during the 2014 Ebola outbreak, have shown that the psychological side effects of quarantine can be observed several months or even years after an epidemic is contained [3]. Among others, prolonged self-isolation may lead to a higher risk of depression, anxiety, poor concentration and lowered motivation [2]. The negative effects on well-being can be mitigated by introducing measures that help people accommodate to the new situation during the quarantine. Such measures should aim at reducing boredom (1), improving communication within a social network (2) and keeping people informed (3) [2]. With the following project, we propose an in-house assistant that addresses these three objectives. Our target group consists of people who need to stay home in order to practice social distancing.
Approach
In order to tackle the problem as described in the problem statement, we will start with careful research on the topics that require our attention, such as how and where we can support people’s mental and physical health. From this research, we can create a solution that combines different disciplines and techniques.
From there, we will start building an assistant that has the following features:
- Speech:
- Text to speech (TTS): being able to interact with the assistant via speech is a key part of tackling the loneliness problem. Using a hardware speaker and pre-existing software, we can create a ‘living’ assistant.
- Speech to text (STT): being able to talk back to the assistant keeps the conversation going and makes the assistant more human-like.
- Tracking: the robot can turn to look at the user when it is activated.
- Exercising: to counter the lack of physical activity, the robot can prompt and motivate the user to be physically active, and track the user’s activity.
- General information: the robot prompts the user to hear the latest news twice a day, once in the morning and once in the evening. Users can always request a news update through a voice command.
- Graphical User Interface: when the user is unable to talk to the assistant, there will be an option to interact with a graphical user interface, displayed on a touch screen.
- Vision: skeleton tracking and face detection. This data is used to count jumping jacks and to look at the user while talking to them (closely linked with navigation); a sketch of the jumping-jack counter follows this list.
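To illustrate how the exercising and vision features could fit together, the sketch below counts jumping jacks from skeleton-tracking output. It is a minimal Python sketch that assumes some pose estimator already produces per-frame keypoints as (x, y) pixel coordinates; the keypoint names and dictionary format are assumptions for illustration, not the output of any specific library.

```python
# Minimal sketch of a jumping-jack counter on top of skeleton tracking.
# Assumes a pose estimator already yields per-frame keypoints as
# (x, y) pixel coordinates; the keypoint names are assumptions.

def arms_up(keypoints):
    """True when both wrists are above the head (smaller y is higher up)."""
    head_y = keypoints["head"][1]
    return (keypoints["left_wrist"][1] < head_y
            and keypoints["right_wrist"][1] < head_y)

class JumpingJackCounter:
    def __init__(self):
        self.count = 0
        self.was_up = False

    def update(self, keypoints):
        """Feed one frame of keypoints; returns the running count."""
        up = arms_up(keypoints)
        if up and not self.was_up:
            self.count += 1  # count each down -> up transition once
        self.was_up = up
        return self.count

# Example with two fake frames: arms down, then arms up.
counter = JumpingJackCounter()
down = {"head": (320, 100), "left_wrist": (200, 300), "right_wrist": (440, 300)}
up = {"head": (320, 100), "left_wrist": (200, 50), "right_wrist": (440, 50)}
print(counter.update(down), counter.update(up))  # prints: 0 1
```

In a real implementation the keypoints would come from the vision pipeline once per frame, and some smoothing would be needed to avoid double counting on noisy detections.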
Splitting the assistant into these feature sets allows our team members to (partly) work individually, speeding up the development process in the difficult time that we are in right now. The parts should be tested together, not only at the end but also while building. This can be done by scheduling meetings with the person responsible for building the physical robot. Team members can share code via a platform such as GitHub, from which it can be uploaded onto the robot, allowing members to test their part.
When all parts work together, a video will be created highlighting all of the robot’s functionalities.
Deliverables
By the end of the project, we will deliver the following items:
- Project Description (report via wiki)
- Project Video demonstrating functionalities
- Physical assistant
- Source code
Milestones
Week | Deliverable |
---|---|
1 (April 20th till April 26th) | Problem statement, think of concept |
2 (April 27th till May 3rd) | Further research, list of required components |
3 (May 4th till May 10th) | Power delivery finished, all research performed, components/software for construction acquired |
4 (May 11th till May 17th) | Finish physical construction |
5 (May 18th till May 24th) | - |
6 (May 25th till May 31st) | Finish software |
7 (June 1st till June 7th) | Finish integration and test with prospective user |
8 (June 8th till June 14th) | Create presentation, finish Wiki |
9 (June 15th till June 21st) | Presentation, on Wednesday |
State of the Art
Indoor localisation has become a highly researched topic in recent years. A robust and compute-efficient solution is important for several fields of robotics. Whether the application is an automated guided vehicle (AGV) in a distribution warehouse or a vacuum robot, the problem is essentially the same. Let us take these two examples and see how state-of-the-art systems solve this problem.
Logistics AGVs
Warehouse AGVs are typically designed solely for use inside a single warehouse. Time and money can be invested to provide the AGV with an accurate map of its surroundings before deployment. The warehouse is also custom-built, so fiducial markers [4] are less intrusive than they would be in a home environment. These fiducial markers allow Amazon’s warehouse AGVs to locate themselves in space very accurately. Knowing their location, Amazon’s AGVs find their way around warehouses as follows. An AGV sends a route request to a centralised system, which takes other AGVs’ paths into account in order to generate an efficient path, and instructs the AGV exactly which path to take. The AGV then traverses the grid marked out with floor-mounted fiducial markers [5]. The AGVs do not follow this path blindly, however: they stay on the lookout for any unexpected objects on their path. The system described in Amazon’s relevant patents [4][5][6][7] also includes elegant solutions to detect and resolve possible collisions, and even the notion of dedicated larger cells in the ground grid that allow AGVs to make turns at elevated speeds.
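The patents do not spell out the planning algorithm itself. As an illustration of the general idea of centralised grid planning, the sketch below finds a shortest path on a grid while avoiding cells reserved by other AGVs, using a plain breadth-first search. The grid representation, the `reserved` set standing in for other AGVs’ reservations, and the function name are assumptions for illustration.

```python
from collections import deque

def plan_path(grid_size, start, goal, reserved):
    """Shortest 4-connected path on a grid, avoiding reserved cells."""
    rows, cols = grid_size
    queue = deque([start])
    came_from = {start: None}  # also serves as the visited set
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and nxt not in reserved and nxt not in came_from):
                came_from[nxt] = cell
                queue.append(nxt)
    if goal not in came_from:
        return None  # no path around the reservations
    path = []
    node = goal
    while node is not None:  # walk back from goal to start
        path.append(node)
        node = came_from[node]
    return path[::-1]

# Example: plan around two cells reserved by another AGV.
print(plan_path((5, 5), (0, 0), (4, 4), {(1, 0), (1, 1)}))
```

A production system like Amazon’s would additionally reserve cells over time windows to resolve collisions between moving AGVs, which this sketch deliberately omits.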
Robotic vacuum cleaners
A closer-to-home example can be found in iRobot’s implementation of VSLAM (Visual Simultaneous Localisation And Mapping) [8] in their robotic vacuum cleaners. SLAM is a family of algorithms that allows vehicles, robots and (3D) scanning equipment to localise themselves in space while augmenting the existing map of their surroundings with new sensor-derived information. iRobot’s patent on a VSLAM implementation [8] describes a SLAM variant that uses two cameras to collect information about the robot’s surroundings. It uses this information to build an accurate map of its environment, in order to cleverly plan a path that efficiently cleans all floor surfaces it can reach. The patent also includes mechanisms to detect smudges or scratches on the camera lenses and to notify the owner. Other solutions for gathering environmental information for robotic vacuum cleaners use a planar LiDAR sensor to detect the boundaries of the floor surface [9].
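To make the LiDAR idea concrete, the sketch below folds a single planar scan into a simple occupancy grid, assuming the robot’s pose is already known. Real SLAM and VSLAM systems estimate the pose and the map jointly, which this sketch deliberately omits; the cell size, pose format and function name are assumptions for illustration.

```python
import math

CELL = 0.05  # grid resolution in metres (assumed value)

def mark_scan(grid, pose, ranges, angle_min, angle_step):
    """Mark the grid cell hit by each LiDAR ray as occupied."""
    x, y, theta = pose  # robot position (m) and heading (rad), assumed known
    for i, r in enumerate(ranges):
        if math.isinf(r):
            continue  # this ray produced no return
        angle = theta + angle_min + i * angle_step
        hit_x = x + r * math.cos(angle)
        hit_y = y + r * math.sin(angle)
        grid[(int(hit_x / CELL), int(hit_y / CELL))] = 1  # occupied cell

# Example: three rays from the origin, robot facing along the x-axis.
grid = {}
mark_scan(grid, (0.0, 0.0, 0.0), [1.0, 1.2, float("inf")],
          -math.pi / 8, math.pi / 8)
print(grid)
```

A dictionary keyed by cell coordinates keeps the map sparse; a full implementation would also mark the cells along each ray as free space, which is what lets the robot distinguish unexplored areas from open floor.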
SLAM algorithms
TODO
Human-robot interaction
Socially assistive robotics
Socially assistive robots (SARs) are defined as the intersection of assistive robots and socially interactive robots [10]. The main goal of SARs is to provide assistance through social interaction with a user. The human-robot interaction is not created for the sake of interaction (as is the case for socially interactive robots), but rather to effectively engage users in all sorts of activities (e.g. exercising, planning, studying). External encouragement, which can be provided by SARs, has been shown to boost performance and help form behavioural patterns [10]. Moreover, SARs have been found to positively influence users’ experience and motivation [11]. Additionally, due to limited human-robot physical contact, SARs have lower safety risks and can be tested extensively. Ideally, SARs should not require additional training and should be flexible when it comes to a user’s changing routines and demands. SARs do not necessarily have to be embodied; however, embodiment may help in creating social interaction [10] and increase motivation [11].
Social behaviour of artificial agents
Robots that are intended to interact with people have to be able to observe and interpret what a person is doing and then behave accordingly. Humans convey a lot of information through nonverbal behaviour (e.g. facial expressions or gaze patterns), which in many cases is difficult for artificial agents to interpret. This can be compensated for by relying on other cues, such as gestures or the position and orientation of the interaction [12]. Moreover, human interaction seems to have a given structure that an agent can use to break down human behaviour and organise its own actions.
Voice recognition
To communicate with users, the robot needs to convert the user’s speech into something it can process internally. For this, a speech-to-text system such as [13] will be used.
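As a first experiment, a minimal sketch of the speech-to-text step could look as follows, using the open-source Python SpeechRecognition package as a stand-in for the system cited above; the choice of package and of the Google Web Speech backend are assumptions, not the project’s final pipeline.

```python
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)  # calibrate the noise floor
    audio = recognizer.listen(source)            # record one utterance

try:
    # Send the recording to the Google Web Speech backend (assumed stand-in).
    text = recognizer.recognize_google(audio)
    print("User said:", text)
except sr.UnknownValueError:
    print("Could not understand the audio")
```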
Speech Synthesis
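For the reverse direction, a minimal sketch of the speech output could use the offline pyttsx3 package, with a neural model such as FastSpeech [15] as a possible higher-quality replacement later on; the package choice and the speech rate below are assumptions for illustration.

```python
import pyttsx3

engine = pyttsx3.init()                 # uses the platform's default TTS voice
engine.setProperty("rate", 150)         # words per minute, an assumed value
engine.say("Good morning! Would you like to hear the latest news?")
engine.runAndWait()                     # blocks until the sentence is spoken
```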
Ethical Responsibility
The robot that we are designing will interact with people, specifically the elderly in this case. This raises the question of which tasks we can let a robot perform without causing the user to lose human contact. Several research papers establish an ethical framework for this [17] [18].
References
- ↑ Infographic: What Share of the World Population Is Already on COVID-19 Lockdown? Buchholz, K. & Richter, F. (April 3, 2020)
- ↑ 2.0 2.1 2.2 Psychological Impact of Quarantine and How to Reduce It: Rapid Review of the Evidence Brooks, S. K., Webster, R. K., Smith, L. E., Woodland, L., Wessely, S., Greenberg, N., & Rubin, G. J. (2020)
- ↑ Depression after exposure to stressful events: lessons learned from the severe acute respiratory syndrome epidemic. Comprehensive Psychiatry, 53(1), 15–23 Liu, X., Kakade, M., Fuller, C. J., Fan, B., Fang, Y., Kong, J., Wu, P. (2012)
- ↑ 4.0 4.1 US20160334799A1: Method and System for Transporting Inventory Items, Amazon Technologies Inc, Amazon Robotics LLC. (Nov 17, 2016) Retrieved April 26, 2020
- ↑ 5.0 5.1 CA2654260: System and Method for Generating a Path for a Mobile Drive Unit, Amazon Technologies Inc. (November 27, 2012) Retrieved April 26, 2020.
- ↑ US8220710B2: System and Method for Positioning a Mobile Drive Unit, Amazon Technologies Inc. (July 17, 2012) Retrieved April 26, 2020.
- ↑ US20130302132A1: System and Method for Maneuvering a Mobile Drive Unit, Amazon Technologies Inc. (Nov 14, 2013) Retrieved April 26, 2020.
- ↑ 8.0 8.1 US10222805B2: Systems and Methods for Performing Simultaneous Localization and Mapping using Machine Vision Systems, Irobot Corp. (March 5, 2019) Retrieved April 26, 2020.
- ↑ US10162359B2: Autonomous Coverage Robot, Irobot Corp. (Dec 25, 2018) Retrieved April 26, 2020.
- ↑ 10.0 10.1 10.2 Defining socially assistive robotics. Feil-Seifer D., Matarić M. (2005)
- ↑ 11.0 11.1 Attitudes Towards Socially Assistive Robots in Intelligent Homes: Results From Laboratory Studies and Field Trials. Torta, E., Oberzaucher, J., Werner, F., Cuijpers, R. H., & Juola, J. F. (2013)
- ↑ Etiquette: Structured Interaction in Humans and Robots. Ogden B. & Dautenhahn K. (2000)
- ↑ Xiong, W., Wu, L., Alleva, F., Droppo, J., Huang, X., & Stolcke, A. (2018, April). The Microsoft 2017 conversational speech recognition system. In 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 5934-5938). IEEE
- ↑ Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J., Battenberg, E., Case, C., ... & Chen, J. (2016, June). Deep Speech 2: End-to-end speech recognition in English and Mandarin. In International Conference on Machine Learning (pp. 173-182)
- ↑ Ren, Y., Ruan, Y., Tan, X., Qin, T., Zhao, S., Zhao, Z., & Liu, T. Y. (2019). FastSpeech: Fast, robust and controllable text to speech. In Advances in Neural Information Processing Systems (pp. 3165-3174)
- ↑ Yu, D., & Deng, L. (2016). Automatic Speech Recognition. Springer London
- ↑ Robot carers, ethics, and older people. Sorell T., Draper H. (2014)
- ↑ Socially Assistive Robotics. Feil-Seifer D., Matarić M. (2011)