PRE2024 3 Group3
Members
Name | Student number | Study | Email |
Andreas Sinharoy | 1804987 | Computer Science and Engineering | a.sinharoy@student.tue.nl |
Luis Fernandez Gu | |||
Alex Gavriliu | 1785060 | Computer Science and Engineering | a.m.gavriliu@student.tue.nl |
Theophile Guillet | 1787039 | Computer Science and Engineering | t.p.p.m.guillet@student.tue.nl |
Petar Rustić | 1747924 | Applied Physics | p.rustic@student.tue.nl |
Floris Bruin | 1849662 | Computer Science and Engineering | f.bruin@student.tue.nl |
Planning
Roadmap
Week 1: Problem ideation and specification
Week 2: Robot design and specifications
Week 3: Begin construction of prototype of the robot
Week 4: Conduct interviews with relevant user groups
Week 5: Finish prototype of robot
Week 6: Gather feedback for the prototype
Week 7: Finalize the prototype after having taken feedback into account
Milestones
- Selecting who and what problem we are going to address
- Selecting how our robot shall address our problem
- Conducting research and a literature review on our topic
- Creating a design for the robot
- Conducting interviews to gauge the receptiveness of the robot
- Speaking with our primary user group to obtain their feedback on our proposed solution
- Creating a prototype of the robot
Introduction
Problem Statement
In the Netherlands, the dominant model for speech and language therapy (SLT) services is individual direct therapy in SLT practices, delivered in weekly sessions of 25 to 30 minutes. However, Dutch SLT practices currently have waiting lists of 6 to 12 months for children with speech, language, and communication needs (SLCN) (citation). Globally, there is also a shortage of speech-language pathologists (SLPs) relative to demand: graduate programs offer a limited number of places, while the need for SLPs increases as their scope of practice widens, the autism rate grows, and the population ages (citation). Beyond this growing unmet demand for treating people diagnosed with speech impediments, the number of people, and especially children, who remain undiagnosed is also a concern. For example, according to research by (citation), boys were referred earlier than girls, and monolingual children were referred earlier than bilingual children. On top of that, bilingual children seemed to have more complex problems at referral. The paper indicates that a large group of girls and bilingual children with speech impediments remains undiagnosed. Therefore, our project aims to relieve the overburdened speech therapy system by supporting therapists in practice sessions and along the patient's road to recovery, allowing them to see and diagnose more patients.
Objectives
- Find which aspect of the issues surrounding speech impediments (diagnosis, treatment, or the obstacles to becoming a speech-language therapist) can feasibly be influenced by a robot to minimize the burden placed on this part of the healthcare system.
- Measure and track the direct impact our robot can make on the system as a whole.
Hypothesis
If we can provide a robot that assists speech-language therapists and allows their patients to carry out parts of the treatment independently, then this part of the healthcare system will be less overburdened, improving the efficiency of both speech impairment diagnosis and treatment.
USE Analysis
Users
Personas
Speech-Language Therapist (Primary User)
Speech-language therapists are dedicated clinicians responsible for diagnosing and treating children with speech, language, and communication challenges. They often carry high workloads and are under constant pressure to deliver accurate assessments quickly.
Needs:
- Efficient Diagnostic Processes: Therapists require a tool that transforms traditional 2–3 hour assessments into shorter, engaging sessions that maintain clinical rigor. By reducing patient fatigue and maintaining diagnostic quality, this approach aims to enable therapists to focus on other patients and improve overall efficiency [citation].
- Comprehensive Data Capture: The therapists need the capability to capture high-quality audio, video, and sensor data during assessments so that each session can then be reviewed later. This comprehensive data collection supports detailed analysis and would facilitate collaboration among specialists to enhance diagnostic precision [citation].
- Intuitive Digital Tools: The development of a secure, user-friendly dashboard is helpful, as it allows therapists to manage patient records, annotate sessions, and access data remotely and efficiently. Naturally, such tools should comply with healthcare standards such as GDPR, to ensure data integrity and patient confidentiality.
Child Patient (Secondary User)
Young children aged 5–10, who are the focus of speech assessments, often experience anxiety, discomfort or boredom in traditional clinical environments. Their engagement in therapy sessions is crucial for accurate assessment and treatment outcomes.
Needs:
- Interactive, Game-Like Experience: The device must offer a playful and interactive interface that transforms the clinical assessment into “playtime”, effectively encouraging natural participation. Such an approach has been shown to improve engagement and make the therapeutic process feel more like play ([citation]).
- Immediate, Clear Feedback: Children benefit from receiving real-time visual and auditory cues that help guide them through the session [citation]. The integration of LED indicators and sound effects aims to keep the patient engaged and focused, and to indicate progress during exercises.
Parent/Caregiver (Support User)
Parents or caregivers play an essential role in supporting the child’s therapy and need to feel confident that the process is both secure and effective. To ensure the therapy is reinforced outside the clinical setting, the parents need to be fully on board.
Needs:
- Data Security and Transparency: They require (re-)assurance that all recordings and sensitive data are stored securely and handled in strict compliance with healthcare regulations, and that the collected data is used solely for clinical purposes relating to their child. This aims to build trust in the technology and to guarantee that their child’s information remains confidential.
- Accessible Progress Monitoring: A clear and intuitive interface should be available for caregivers to follow their child’s progress without disrupting or getting involved in the therapy session. This transparency aims not only to keep the parents informed, but also to motivate them to support their child’s ongoing development [citation].
Scenarios
Interactive Diagnostic Session
In a typical session, the therapist initiates the diagnostic process via a secure digital dashboard, or the parent starts the robot via a button. The therapeutic companion engages the child through a series of interactive, game-like exercises that are designed to incorporate standard diagnostic questions in an entertaining manner. During the session, the system records high-fidelity audio, video, and sensor data, providing a dataset for later review and analysis. This approach aims not only to make the sessions more inviting for the child, but also to give the therapist comprehensive data from longer assessments that would typically not be feasible otherwise.
Remote and Asynchronous Review
After the interactive session, the therapist logs into a secure platform where all recorded data is available for review. They can pause, rewind, and annotate key segments, which allows for a detailed and thoughtful analysis of the child’s responses and non-verbal cues.
Collaborative Consultation
In some cases, additional input or consultation from colleagues is required. The recorded sessions can then be shared with multiple experts after obtaining the necessary consents. This collaborative consultation allows for reviewing and discussing the diagnostic findings, leading to more comprehensive and consensus-driven treatment plans. Such an approach fosters an environment of continuous professional development and shared expertise [citation].
Requirements
For the Therapist
- Robust Hardware Integration: The system must incorporate reliable and durable hardware to ensure diagnostic sessions are completed without data loss or interruption. The design should aim to minimise technical failures during assessments, ensuring that every session's data remains intact and can be reviewed later.
- User-Friendly Dashboard: An intuitive and efficient digital dashboard is required to present clinicians with information on each recorded session. The dashboard should facilitate rapid review and analysis, with the goal of enabling therapists to quickly identify patterns or issues in speech. By streamlining navigation and data review, the tool should help therapists manage multiple patients efficiently while maintaining high diagnostic accuracy.
- Secure Remote Accessibility: In today’s increasingly digital healthcare environment, therapists must be able to access patient data remotely. The system must employ state-of-the-art encryption and robust user authentication protocols to protect sensitive patient data from unauthorised access. Having robust security is crucial for both clinicians and patients, as it reassures all parties that the integrity and confidentiality of the clinical data are maintained as per today’s standards.
For the Child and Parent
- Engaging and Comfortable Design: For young children, the physical design of the robot plays a crucial role in therapy success. A soft, plush exterior coupled with interactive buttons and LED feedback systems can create a friendly, non-intimidating interface that fosters positive human-robot interaction. Ultimately, the sessions should be a fun experience for the child, as otherwise no progress would be made and no speech data would be collected.
- Responsive Feedback Systems: Dynamic auditory and visual cues are essential components that should guide children through each step of a session. Real-time feedback, such as flashing LEDs synchronised with encouraging sound effects, aims to help the child understand when they are performing correctly and to gently correct mistakes when necessary. This immediate reinforcement not only keeps the child engaged and motivated but also provides parents with clear, observable evidence of their child’s progress. In essence, cues ensure that the therapy sessions are both interactive and instructive.
- Robust Data Security: The system must implement comprehensive security measures such as end-to-end encryption and secure storage protocols to prevent unauthorised access or data breaches. The level of protection must reassure both parents and therapists that the child’s data is handled with the highest level of care and confidentiality. Adhering strictly to healthcare regulations is essential to maintain trust and protect privacy throughout the therapy process.
Society
Early, engaging assessments have been reported to significantly enhance long-term outcomes for children with speech challenges. Some studies have demonstrated that gamified diagnostic tools reduce anxiety and enable quicker detection of speech impediments, ensuring that children can receive timely interventions during their formative years [citation]. This acceleration in diagnosis can shorten the long waiting lists currently experienced in many places, ultimately leading to better educational and social outcomes for those on the waiting lists. Furthermore, digital platforms that enable “over-the-air” reviews and multi-organisation consultations have been shown to improve diagnostic accuracy and continuity of care [citation]. It has been shown that digital care networks enhance collaborative efforts among healthcare professionals, ensuring that each patient is given an opportunity to receive additional evaluation from other experts [citation]. Early interventions in speech impediments not only address immediate speech challenges but also contribute to improved educational achievements and social integration in the long term. Some research shows that children who receive timely, engaging assessments exhibit better communication skills later in life, promoting later inclusion in communities, activities and the like [citation]. These improvements in communication and social skills ultimately contribute to a healthier, more productive society overall.
Enterprise
By transforming lengthy, in-person diagnostic sessions into concise, interactive assessments, the solution enables therapists to focus on cases that require direct intervention while still collecting high-quality diagnostic data. Reports have shown [citation] that digital diagnostic tools can increase clinician productivity significantly, thereby enhancing overall workflow efficiency and allowing for more timely patient care. The system’s capacity to collect anonymised diagnostic data offers opportunities in refining assessment algorithms and personalising treatment plans. Likewise, such refinement can be of service to a broader research initiative and other healthcare experts. Although reducing the need for extended in-person sessions inherently lowers operational costs, the true value of the technology lies in its ability to improve patient outcomes and support innovative care models. Some studies have highlighted that integrating such digital tools leads to a more adaptive healthcare ecosystem that better meets the needs of both patients and providers [citation].
State of the Art
Existing robots
To understand how we can create the best robot for our users, we have to look at which existing robots and technologies relate to our project. We analyzed the following and considered how their features could inform our own design.
RASA robot
The RASA (Robot Assistant for Social Aims) robot is a socially assistive robot that has been used to enhance speech therapy sessions for children with language disorders. It uses facial expressions to make therapy sessions more engaging for the children. Its camera, combined with facial expression recognition based on convolutional neural networks, is used to detect the way the children are speaking, which helps the therapist in improving the child's speech. Studies have shown that incorporating the RASA robot into therapy sessions increases children's engagement and improves language development outcomes.
Automatic Speech Recognition
Recent advancements in automatic speech recognition (ASR) technology have led to systems capable of analyzing children's speech to detect pronunciation issues. For instance, a study fine-tuned the wav2vec 2.0 XLS-R model to recognize speech as pronounced by children with speech sound disorders, achieving approximately 90% accuracy in matching human annotations. ASR technology can thereby streamline the diagnostic process and save clinicians time.
Nao robot
Developed by Aldebaran Robotics, the Nao robot is a programmable humanoid robot widely used in educational and therapeutic settings. Its advanced speech recognition and production capabilities make it a valuable tool in assisting speech therapy for children, helping to identify and correct speech impediments through interactive sessions.
Kaspar robot
Kaspar is a socially assistive robot whose purpose is to help children with autism learn social and communication skills. Its design gives prominence to a child-centric appearance and expressive behavior in order to invite users to engage in interactive activity. Studies [1] have indicated that children working with Kaspar show improved social responsiveness, and the same design principles could be applied to enhancing speech therapy outcomes.
Requirements
MoSCoW Analysis
Functionality
Usage
Performance
Legal & Privacy Concerns
Note: We only concern ourselves with EU legislation and regulations, as the EU is where we reside. Furthermore, most of these regulations apply to a full-scale implementation of this robot.
We will mainly refer to the following regulations and legislation:
- General Data Protection Regulation GDPR (https://gdpr-info.eu/)
- Medical Device Regulation MDR (https://eur-lex.europa.eu/eli/reg/2017/745/oj/eng)
- AI Act (https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32024R1689)
- UN Convention on the Rights of the Child
- AI Ethics Guidelines (EU & UNESCO)
- Product Liability Directive (EU 85/374/EEC)
- ISO 13482 (Safety for Personal Care Robots)
- EN 71 (EU Toy Safety Standard)
Data Collection & Storage
The robot we want to build for this project requires specific audio snippets and other data to be collected and stored where the therapists and professionals responsible for the patient's care can access them. This data is sensitive, however, and must be secured and protected so that it is only accessible to those who are permitted to access it. We should also store only the minimum amount of data about the patient using the robot, so that nothing beyond what is necessary is kept. In the EU, these data collection and storage concerns are outlined in Articles 5 and 9 of the GDPR.
In this context, this means the data collected by the robot should at most include:
- Speech audio data of the patient, needed by the therapist to help treat the patient's impediment
- Minimal identification data, to know which data belongs to which patient
- Other data may be needed, but its collection must be specifically argued for (subject to change)
Furthermore, all the data collected by the robot must be:
- Encrypted, so that it cannot be interpreted if it is somehow stolen
- Securely stored, so that it can be accessed only by the relevant permitted parties
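The data-minimisation points above can be illustrated with a small sketch. It assumes a hypothetical `SessionRecord` layout and a keyed hash (HMAC-SHA256) to derive a pseudonym, so that recordings can be grouped per patient without storing a real identifier on the device. The field names, the key, and the file name are illustrative assumptions, not part of our design yet, and encryption of the audio itself is not shown here.

```python
import hashlib
import hmac
from dataclasses import asdict, dataclass


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@dataclass(frozen=True)
class SessionRecord:
    """Hypothetical minimal record: who (pseudonym), what (audio), when."""
    patient_pseudonym: str
    audio_file: str
    recorded_at: str  # ISO 8601 timestamp


def make_record(patient_id: str, audio_file: str, recorded_at: str,
                secret_key: bytes) -> SessionRecord:
    # The real identifier is used only to derive the pseudonym and is not stored.
    return SessionRecord(pseudonymise(patient_id, secret_key), audio_file, recorded_at)


# Hypothetical key for illustration only; a real key must never be hard-coded.
key = b"example-key-kept-by-the-clinic"
record = make_record("patient-0042", "q01.wav", "2025-03-01T10:15:00", key)
print(asdict(record))
```

In such a setup, the mapping from pseudonym back to the child would stay with the clinic, which holds the key.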
User Privacy & Consent
In order for the robot to be used and for data to be collected and shared with the relevant parties, the patient must consent to this and must also hold specific rights over the data (creation, deletion, restriction, etc.). On top of this, depending on the age of the patient, certain restrictions must be placed on the way data is shared, and all patients must have a way to opt out and withdraw consent to data collection if necessary. These points are all covered in Articles 6, 7 and 8 of the GDPR.
In essence, this means the user must have the most power and control over the data collected by the robot, and the data collected and its use must be made explicitly clear to the user, so that the robot's functioning is legal and ethical.
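As a minimal sketch of the opt-out requirement, the robot could check a consent registry before every recording. The class and method names below (`ConsentRegistry`, `may_record`) are hypothetical, and the sketch assumes that consent for our 5–10 age group is given by a parent or guardian.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ConsentRecord:
    patient_pseudonym: str
    granted_by: str  # assumed to be a parent or guardian for young children
    granted_at: datetime
    withdrawn_at: Optional[datetime] = None

    @property
    def active(self) -> bool:
        return self.withdrawn_at is None


class ConsentRegistry:
    """Hypothetical in-memory registry; a real system would persist this."""

    def __init__(self) -> None:
        self._records: Dict[str, ConsentRecord] = {}

    def grant(self, patient_pseudonym: str, granted_by: str) -> ConsentRecord:
        record = ConsentRecord(patient_pseudonym, granted_by,
                               datetime.now(timezone.utc))
        self._records[patient_pseudonym] = record
        return record

    def withdraw(self, patient_pseudonym: str) -> None:
        record = self._records.get(patient_pseudonym)
        if record is not None and record.active:
            record.withdrawn_at = datetime.now(timezone.utc)

    def may_record(self, patient_pseudonym: str) -> bool:
        # Recording is denied by default: unknown or withdrawn means no.
        record = self._records.get(patient_pseudonym)
        return record is not None and record.active


registry = ConsentRegistry()
registry.grant("a1b2c3", granted_by="parent")
print(registry.may_record("a1b2c3"))  # True
registry.withdraw("a1b2c3")
print(registry.may_record("a1b2c3"))  # False
```

The deny-by-default check reflects the idea that no data is collected unless consent is explicitly on record.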
Security Measures
Since we must exchange sensitive data between the patient and the therapist, data must be secured and protected during transmission, storage and access. The relevant requirements are specified in Article 32 of the GDPR (Security of Processing).
This means that data communication must be end-to-end encrypted, and secure, strong authentication protocols must be in place across the entire system. On the therapist's side, role-based access control (RBAC) must ensure that only the relevant authorised users (admins) can access the data. Because the robot is intended for use over long periods of time, it should also support software updates to improve security.
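A role-based access check can be sketched as a simple lookup table. The roles, actions and permission mapping below are assumptions made for illustration; the actual roles would have to be agreed with the therapists.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Role(Enum):
    TREATING_THERAPIST = "treating_therapist"
    CONSULTING_THERAPIST = "consulting_therapist"
    PARENT = "parent"


class Action(Enum):
    VIEW_RECORDING = "view_recording"
    ANNOTATE = "annotate"
    SHARE_RECORDING = "share_recording"
    VIEW_PROGRESS = "view_progress"


# Hypothetical permission table: each role maps to the actions it may perform.
PERMISSIONS: Dict[Role, FrozenSet[Action]] = {
    Role.TREATING_THERAPIST: frozenset(
        {Action.VIEW_RECORDING, Action.ANNOTATE,
         Action.SHARE_RECORDING, Action.VIEW_PROGRESS}
    ),
    Role.CONSULTING_THERAPIST: frozenset(
        {Action.VIEW_RECORDING, Action.ANNOTATE}
    ),
    Role.PARENT: frozenset({Action.VIEW_PROGRESS}),
}


def is_allowed(role: Role, action: Action) -> bool:
    """Return True only if the role was explicitly granted the action."""
    return action in PERMISSIONS.get(role, frozenset())


print(is_allowed(Role.PARENT, Action.VIEW_RECORDING))            # False
print(is_allowed(Role.TREATING_THERAPIST, Action.SHARE_RECORDING))  # True
```

Anything not listed in the table is denied, so adding a new role grants no access until permissions are assigned to it.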
Legal Compliance & Regulations
Since this robot can be considered a health-related or medical device, we must make sure that the data collected is used and treated as medical data. All regulations relevant to this are specified in the Medical Device Regulation.
The robot may also have certain AI-specific features or functionalities, so it must also adhere to the regulations and laws in the AI Act to ensure that its functionality and usage are ethical.
Ethical Considerations
Since the patients using and interacting with this device are children, we must make sure that the interactions with the child are ethical and that the way in which data is used and analysed to form a diagnosis is not biased in any way.
The robot must minimize the psychological risks of AI-driven diagnosis and prevent any distress, anxiety or deception that the interaction could cause. Assessments should be analysed in a fair and unbiased manner, and decisions on treatment and on the data required for a particular stage of treatment should be made almost entirely by the therapist, with minimal AI involvement.
These points are all outlined in the AI Ethics Guidelines and Article 16 of the UN Convention on the Rights of the Child.
Third-Party Integrations & Data Sharing
Since we are sharing the data collected by the robot with the therapist, we must ensure that strict data-sharing policies are in place that require parental/therapist consent. Furthermore, if we use any third-party services, such as cloud storage providers, AI tools, or healthcare platforms, we must make sure the data is fully anonymised so that there is no risk of re-identification.
This is so that we adhere to Article 28 of the GDPR.
Liability & Accountability
User Safety & Compliance
Design
Prototype
The proposed prototype focuses on addressing the challenges and requirements specified earlier in the report. Traditional diagnostic tests are often extremely long (two to three hours), which leads to fatigue for both the patient and the therapist. As a result, the patient experiences the test as very uncomfortable, and the therapist's exhaustion can reduce the accuracy of the diagnosis. The prototype will therefore break these assessments down into short sessions disguised as games, which do not require supervision by a speech therapist. It will be an interactive device that asks the patient closed or open-ended questions that are specifically chosen by a speech therapist or taken from an existing test. Once a question is posed, the robot records the patient's response in audio and (perhaps) video so that it can be reviewed at a later stage. Since all responses are stored digitally, diagnostics can also be performed remotely, which is especially useful in rural areas that lack speech therapists. Allowing the therapist to rewind, replay, or pause the recorded session should enable a more thorough analysis and lower the risk of missing details.
The prototype will hopefully reduce patients' stress and fatigue, since the test is broken down into shorter parts. It should lessen the workload of speech therapists while also improving reliability, as multiple therapists can review a recording whenever it is convenient for them, which reduces individual biases.
Device Description
Plush Appearance
The prototype takes the form of a friendly plush toy so that young patients (ages 5-10) engage more willingly with the speech and articulation exercises it proposes. The plush presents the assessment process as interactive play, so that the child experiences the session as a game rather than a test.
The soft exterior will also make the device more durable by cushioning the electronics inside it, which should increase the lifespan of the device.
Buttons
Multiple buttons are built into the plush's surface to control its behavior. The following buttons will be installed:
- Power button: turns the plush on or off
- Next question button: makes the plush ask the next question
- End recording button: ends the recording of the current response
LEDs
A number of LEDs on the device provide visual feedback to both the patient and the supervisor (parent or speech therapist). The device has the following indicator LEDs:
- "Recording" LED, ON while recording
- "Test complete" LED, ON when the test is complete
- "Error" LED, ON if an error is present
- "ON" LED, ON if the plush is turned on
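The interaction between the buttons and the LEDs can be sketched as a small state machine. This is only an illustration of one possible control logic: the state names, the `Plush` class and the rule that the plush starts recording directly after asking a question are assumptions, and audio playback and recording are left out.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict


class State(Enum):
    OFF = auto()
    IDLE = auto()       # powered on, waiting for the next question
    RECORDING = auto()  # question asked, response being recorded
    COMPLETE = auto()   # all questions answered
    ERROR = auto()


class Button(Enum):
    POWER = auto()
    NEXT_QUESTION = auto()
    END_RECORDING = auto()


@dataclass
class Plush:
    total_questions: int
    state: State = State.OFF
    answered: int = 0

    def press(self, button: Button) -> None:
        if button is Button.POWER:
            # Power toggles between OFF and IDLE and resets the session.
            self.state = State.IDLE if self.state is State.OFF else State.OFF
            self.answered = 0
        elif self.state is State.IDLE and button is Button.NEXT_QUESTION:
            self.state = State.RECORDING
        elif self.state is State.RECORDING and button is Button.END_RECORDING:
            self.answered += 1
            done = self.answered >= self.total_questions
            self.state = State.COMPLETE if done else State.IDLE
        # Any other button press is ignored in the current state.

    def fail(self) -> None:
        """Signal a hardware or storage fault while the plush is powered on."""
        if self.state is not State.OFF:
            self.state = State.ERROR

    def leds(self) -> Dict[str, bool]:
        return {
            "on": self.state is not State.OFF,
            "recording": self.state is State.RECORDING,
            "test_complete": self.state is State.COMPLETE,
            "error": self.state is State.ERROR,
        }


plush = Plush(total_questions=2)
for button in (Button.POWER, Button.NEXT_QUESTION, Button.END_RECORDING,
               Button.NEXT_QUESTION, Button.END_RECORDING):
    plush.press(button)
print(plush.leds())
```

Deriving the LEDs from a single state value means that, for example, the "Recording" and "Test complete" LEDs can never be on at the same time.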
Microphone, Speaker, Camera
A discreet, high-quality, sensitive microphone is hidden inside the plush to ensure clear recording of the patient's speech with minimal background noise.
An internal speaker, placed for example in the chest of the plush, allows the device to play the questions aloud, as well as feedback and fun sounds.
A small camera is hidden inside the plush's eyes to record the patient's facial expressions and reactions to prompts, enabling more in-depth diagnostics.
Internal Hardware
The device will contain a processing unit, such as an Arduino or Raspberry Pi, to control all electrical components. A large storage unit, such as an SD card or SSD, is needed to store the video and audio recordings until retrieval. Finally, a rechargeable battery will be placed inside the plush in a location that is accessible, for safety and convenience.
System Specification
Software
Testing
Interviews
Introduction
Method
Analysis
Results
Conclusion
Bibliography
https://my.clevelandclinic.org/health/articles/24602-speech-language-pathologist
https://idcchealth.org/blogs/how-much-does-online-speech-therapy-cost/
https://www.hollandzorg.com/insured/reimbursements2025/speech-therapy
https://pmc.ncbi.nlm.nih.gov/articles/PMC7383695/
https://www.belganewsagency.eu/nearly-one-fifth-fewer-speech-therapy-students-in-ten-years-time
https://files.eric.ed.gov/fulltext/EJ1135588.pdf
https://pubmed.ncbi.nlm.nih.gov/36467283/
https://arxiv.org/abs/2403.08187
Appendix
Time reporting
Week 1
Name | Task | Time spent |
Andreas Sinharoy | Robot and Problem Ideation and Research into the Idea | 2 hours |
Luis Fernandez Gu | ||
Alex Gavriliu | Research into data privacy requirements in EU | 1 hour |
Theophile Guillet | ||
Petar Rustić | Creating the wiki structure, literature research | 3 hours |
Floris Bruin | ||
All |
Week 2
Name | Task | Time spent |
Andreas Sinharoy | Writing the Planning and Introduction sections of the wiki page | 2 hours |
Luis Fernandez Gu | ||
Alex Gavriliu | Creating appropriate structure for legal and privacy section | 1 hour |
Theophile Guillet | ||
Petar Rustić | literature research, writing the USE analysis | 6 hours |
Floris Bruin | ||
All |
Week 3
Name | Task | Time spent |
Andreas Sinharoy | ||
Luis Fernandez Gu | ||
Alex Gavriliu | ||
Theophile Guillet | ||
Petar Rustić | ||
Floris Bruin | ||
All |
Week 4
Name | Task | Time spent |
Andreas Sinharoy | ||
Luis Fernandez Gu | ||
Alex Gavriliu | ||
Theophile Guillet | ||
Petar Rustić | ||
Floris Bruin | ||
All |
Week 5
Name | Task | Time spent |
Andreas Sinharoy | ||
Luis Fernandez Gu | ||
Alex Gavriliu | ||
Theophile Guillet | ||
Petar Rustić | ||
Floris Bruin | ||
All |
Week 6
Name | Task | Time spent |
Andreas Sinharoy | ||
Luis Fernandez Gu | ||
Alex Gavriliu | ||
Theophile Guillet | ||
Petar Rustić | ||
Floris Bruin | ||
All |
Week 7
Name | Task | Time spent |
Andreas Sinharoy | ||
Luis Fernandez Gu | ||
Alex Gavriliu | ||
Theophile Guillet | ||
Petar Rustić | ||
Floris Bruin | ||
All |