= Literature Review =

Due to bugs in the installation of the LaTeX engine of the wiki, mathematical expressions cannot be shown here.
See the following Overleaf file for the literature review, and references: [https://www.overleaf.com/read/rvfcyxqgtwfq Literature review Overleaf file].

== Statistical dialog systems ==
=== Two categories: Seq2Seq and task-oriented ===

Statistical dialog systems can be divided into two major categories<ref name="DeepRLForDialogGeneration">Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, Dan Jurafsky (2016). Deep Reinforcement Learning for Dialogue Generation. Published: arXiv.org. URL: [https://arxiv.org/abs/1606.01541v4]. Date accessed: 01-02-2021.</ref>. The first category learns mappings from input messages to responses. In the simplest case this amounts to learning a probability distribution over responses. More advanced algorithms, such as <span style="font-variant-caps: small-caps;">Seq2Seq</span>, do take prior context into account. In particular, <span style="font-variant-caps: small-caps;">Seq2Seq</span> uses two LSTMs (Long Short-Term Memory, a commonly used variant of Recurrent Neural Networks): one encodes the input message into an abstract feature vector, and the other decodes that vector into a reply<ref name="Seq2Seq">Ilya Sutskever, Oriol Vinyals, and Quoc V. Le (2014). Sequence to sequence learning with neural networks. Published: Advances in neural information processing systems, pages 3104-3112. URL: [https://arxiv.org/pdf/1409.3215.pdf]. Date accessed: 02-02-2021.</ref>.
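
As a rough illustration of the "simplest case" of the first category, the sketch below estimates a probability distribution over responses for each input message by counting. It is not the method of either cited paper: the toy corpus and the names <code>fit_response_model</code> and <code>most_likely_response</code> are hypothetical, and no prior context is taken into account.

```python
from collections import Counter, defaultdict


def fit_response_model(pairs):
    """Estimate P(response | message) by counting (message, response) pairs."""
    counts = defaultdict(Counter)
    for message, response in pairs:
        counts[message][response] += 1
    model = {}
    for message, response_counts in counts.items():
        total = sum(response_counts.values())
        # Normalize the counts so the probabilities for each message sum to 1.
        model[message] = {r: c / total for r, c in response_counts.items()}
    return model


def most_likely_response(model, message):
    """Return the highest-probability response, or None for an unseen message."""
    distribution = model.get(message)
    if not distribution:
        return None
    return max(distribution, key=distribution.get)


# Hypothetical toy corpus of (message, response) pairs.
corpus = [
    ("how are you?", "fine, thanks"),
    ("how are you?", "fine, thanks"),
    ("how are you?", "not bad"),
    ("what's up?", "not much"),
]
model = fit_response_model(corpus)
reply = most_likely_response(model, "how are you?")
```

Such a lookup table cannot answer messages it has never seen, which is one reason to prefer a model like <span style="font-variant-caps: small-caps;">Seq2Seq</span> that generalizes across inputs.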

The other category consists of task-oriented dialogue systems. These are often tuned to a specific application domain and trained with reinforcement learning. Examples are statistical models based on Markov Decision Processes (a typical model for reinforcement learning) and models that learn generation rules. Because of their fine-tuned nature, they cannot be flexibly employed beyond their domain.

=== Seq2Seq with Deep Reinforcement Learning ===

(This section assumes some basic knowledge of Reinforcement Learning.)

Li et al. (2016) <ref name="DeepRLForDialogGeneration"/> propose an algorithm that combines the two categories. That is, they train a <span style="font-variant-caps: small-caps;">Seq2Seq</span> model (an encoder-decoder recurrent neural network) with reinforcement learning, by having two agents converse with each other. The state of the conversation is represented by the previous two dialogue turns, which are generated alternately by the two agents.
The policies (i.e. the neural networks) are optimized to maximize the rewards attained by their actions. The rewards are defined in terms of how well a response keeps the conversation going. In particular, the reward is higher the more a response differs from the responses on a predefined list of 'dull' responses. This is implemented by taking the negative log of the probability that a 'dull' response will be generated (note that the negative log of a small probability is greater than the negative log of a large probability):

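<math>r_{1} = -\frac{1}{N_{S}} \sum_{s \in S} \frac{1}{N_{s}} \log p_{\text{seq2seq}}(s \mid a)</math>

Here <math>S</math> is the list of dull responses, <math>N_{S}</math> is the number of responses on that list, <math>N_{s}</math> is the number of tokens in dull response <math>s</math>, and <math>a</math> is the generated response (the action), following Li et al. (2016) <ref name="DeepRLForDialogGeneration"/>.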
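
A minimal sketch of this reward in Python may help make the sign convention concrete. It assumes the reward is the negative log-probability of each dull response, normalized by its token count and averaged over the dull list. The function name <code>ease_of_answering_reward</code> and the token probabilities are hypothetical; in a real system the log-probabilities would come from the trained <span style="font-variant-caps: small-caps;">Seq2Seq</span> model.

```python
import math


def ease_of_answering_reward(dull_token_logprobs):
    """Average length-normalized negative log-probability of the dull responses.

    dull_token_logprobs holds one list per dull response; each inner list
    contains the log-probability of every token of that response, given
    the action being scored.
    """
    if not dull_token_logprobs:
        raise ValueError("need at least one dull response")
    total = 0.0
    for token_logprobs in dull_token_logprobs:
        if not token_logprobs:
            raise ValueError("a dull response must contain at least one token")
        # The log-probability of a response is the sum over its tokens;
        # dividing by the token count normalizes for response length.
        total += sum(token_logprobs) / len(token_logprobs)
    return -total / len(dull_token_logprobs)


# Hypothetical token probabilities for a single two-token dull response.
likely_dull = [[math.log(0.9), math.log(0.8)]]
unlikely_dull = [[math.log(0.01), math.log(0.02)]]

reward_when_dull_is_likely = ease_of_answering_reward(likely_dull)
reward_when_dull_is_unlikely = ease_of_answering_reward(unlikely_dull)
```

With these made-up numbers, the action that makes a dull response unlikely receives the larger reward, which is the behavior described in the text.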
Group description
Abstract
A pure software end-user application that supports people in their need to socialize while motivating self-improvement.
Anthropomorphism is intentionally used to increase user commitment and improve the user experience.
Machine learning techniques are used to process users' data and provide feedback, and to facilitate the anthropomorphized interface.
Members
(in alphabetical order):
- Edwin Steenkamer
- Emi Kuijpers (1227154)
- Fanni Egresits
- Morris Boers (1253107)
- Lulof Pirée (1363638)
GitHub Page:
GitHub
Logbook
See the page logbook_group_8
Problem statement and objectives
Goals
The software application should:
- Significantly reduce symptoms of loneliness as induced by infrequent social contact in users
- Register personal goals set by the users
- Collect data on the user's behavior and progress towards goals
- Provide the user with feedback and constructive nudges
Beyond the scope
The following features are probably valuable additions to the product,
but they are beyond the scope of what can be achieved in one quartile:
- Voice recognition
- Animated anthropomorphized interface (e.g. simulated face)
Who are the users
The application is intended to support civilians in their daily lives.
The audience of the prototype is narrowed down to adolescents and adults who use computers on a daily basis.
TODO...
Approach, milestones and deliverables
TODO...
Overview
Work-in-progress-page
See the page WIP group 8 for an actively edited file of notes.
User guide
TODO...
Software documentation
TODO...
References