PRE2017 3 Group8 Research on the decision making system
We must match a user with a question. For this we must use the knowledge we (the system) have.
What we know of the user:
- Some measure for how skilled they are at certain types of questions based on previous questions answered
- How long they have been answering questions for
- The speed at which they are answering questions
- How good they are at answering questions right now (relative to their estimated skill)
The last 3 things could be used to detect that the user is currently in a state of flow and can, for example, be asked more difficult questions.
What we know of each question:
- A category (set manually or automatically) (We can limit our system to 1 category for the sake of the prototype)
- It is possible to have multiple categories per question (each individual thing a person has to know to answer a question)
- A measure of difficulty (based on how other people answer the question). Having this automated has the additional benefit of telling us which questions people find most difficult. The downside is that the system has to learn the difficulties before it becomes effective, which can be partially solved by starting with a default value.
- The speed at which the question gets answered relative to the ratio of correct answers: for example, a trick question would often be answered incorrectly despite being answered relatively fast. This can also be used to see whether a user should take more time to think or whether they should have been able to answer sooner.
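To make the speed-versus-correctness signal concrete, here is a minimal sketch of the per-question bookkeeping it would need. All names and thresholds are illustrative assumptions, not a settled design:

```python
from dataclasses import dataclass, field

@dataclass
class QuestionStats:
    times: list = field(default_factory=list)    # answer times in seconds
    correct: list = field(default_factory=list)  # True if answered correctly

    def record(self, seconds: float, was_correct: bool) -> None:
        self.times.append(seconds)
        self.correct.append(was_correct)

    def correct_ratio(self) -> float:
        return sum(self.correct) / len(self.correct)

    def mean_time(self) -> float:
        return sum(self.times) / len(self.times)

def looks_like_trick_question(stats: QuestionStats, global_mean_time: float,
                              fast: float = 0.8, wrong: float = 0.4) -> bool:
    # Answered faster than the global average yet wrong unusually often:
    # the trick-question profile described above. The thresholds are
    # guesses that would need tuning on real data.
    return (stats.mean_time() < fast * global_mean_time
            and stats.correct_ratio() < wrong)
```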
For a person to have mastered the subject in question they must be able to answer most questions correctly, so our performance measure for the users should be based on this.
Our system could work something like this:
- The score a user has for every category starts at 100
- When a user answers a question correctly their score goes down, and when they answer incorrectly it goes up, by an amount based on the difficulty score of the question.
- When the score for a category reaches 0 they no longer get asked questions in that category
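A minimal sketch of this basic loop, assuming each question carries a difficulty value in [0, 1]; the step size of 10 is an arbitrary placeholder, and scaling incorrect answers by difficulty is our assumption rather than something fixed above:

```python
START_SCORE = 100.0

def update_category_score(score: float, difficulty: float, correct: bool) -> float:
    """Correct answers move the score toward 0 (mastery); incorrect
    answers move it back up, both scaled by question difficulty."""
    step = 10.0 * (0.5 + difficulty)  # harder question => bigger change
    new_score = score - step if correct else score + step
    return max(new_score, 0.0)        # 0 = mastered: stop asking this category
```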
The above is only the basics. There is no single obvious choice for how a question gets chosen or how the scores change, so the rules proposed from here on are all options and do not all have to be adopted.
The score of a question can be the percentage of times it gets answered correctly. With this we can propose the following system for choosing the questions:
- Choose a question from the category for which the user's score is the lowest, and ask multiple questions from this category in a row
- Do not choose a question that has been answered correctly before
- Do not choose a question that has been answered incorrectly recently
- Specifically ask a question again that was answered incorrectly earlier (when the user should definitely know the answer now)
- Choose the question for which the score is closest to the score of the user
- If a question with multiple categories is answered incorrectly, ask questions that have only one of these categories to find out which category (sub-problem) is the problem. If all of these are answered correctly, ask another question with all these categories combined.
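As one possible combination of these (optional) rules, here is a sketch of question selection, assuming a question's score is its percentage of correct answers as proposed above; the data shapes are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Question:
    qid: int
    category: str
    score: float  # percentage of correct answers, 0-100

def pick_next_question(questions, user_score: float,
                       answered_correctly: set, recently_wrong: set):
    """Skip questions already answered correctly or missed very recently,
    then take the candidate whose score is closest to the user's score."""
    candidates = [q for q in questions
                  if q.qid not in answered_correctly
                  and q.qid not in recently_wrong]
    if not candidates:
        return None  # nothing left in this category; pick another category
    return min(candidates, key=lambda q: abs(q.score - user_score))
```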
For changing the user's score:
Correct answer:
- When a question is answered correctly subtract some constant c from the score
- Subtract additional points for multiple correct answers in a row
- Subtract more points for a more difficult question being answered
- Subtract more points for a quick answer (relative to the user's average speed compared to the global average for other users for that question)
Incorrect answer:
- Add more points if the score recently went up by a lot (this question was much more difficult than the previous ones)
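Putting these modifiers together, a sketch of the score change for a single answer; every coefficient is a placeholder to be fine-tuned (see the general notes below):

```python
def score_change(correct: bool, difficulty: float, streak: int,
                 speed_ratio: float, recent_rise: float, c: float = 5.0) -> float:
    """Amount to add to the user's category score. difficulty is in [0, 1],
    streak counts correct answers in a row, speed_ratio is this answer's
    time divided by the user's usual time for this kind of question, and
    recent_rise is how much the score went up over the last few questions."""
    if correct:
        delta = c
        delta += 0.5 * streak      # extra for multiple correct answers in a row
        delta += c * difficulty    # extra for a more difficult question
        if speed_ratio < 1.0:      # extra for a quick answer
            delta += c * (1.0 - speed_ratio)
        return -delta              # correct answers lower the score
    # Incorrect: base penalty, enlarged when the score recently jumped
    # (this question was much more difficult than the previous ones).
    return c + 0.5 * recent_rise
```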
General notes:
- The actual numbers should be fine-tuned based on how long it should take to master questions in a category.
- When looking at the speeds at which answers are given we should filter out outliers (e.g. someone leaves a question open for an hour while doing something else); see the sketch after this list
- Some things the system does should be visible to teachers. The questions and question categories which individual students find most difficult, as well as the ones that are most difficult universally among all students, are useful for teachers to know.
- Faulty questions will be answered ‘correctly’ approximately 0% of the time, which means they can be filtered out automatically.
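Two of these notes are easy to make concrete. Below is a sketch of the outlier filter for answer times and of the faulty-question check; the cut-offs (3x the median, 2% correct, 20 attempts) are arbitrary assumptions:

```python
def filtered_times(times: list) -> list:
    """Drop implausibly long answer times (e.g. a user who walked away)
    before computing averages: anything over 3x the median is discarded."""
    ordered = sorted(times)
    median = ordered[len(ordered) // 2]
    return [t for t in times if t <= 3.0 * median]

def probably_faulty(attempts: int, correct: int, min_attempts: int = 20) -> bool:
    """Flag a question that almost nobody answers 'correctly', once enough
    people have tried it, so it can be filtered out or shown to a teacher."""
    return attempts >= min_attempts and correct / attempts < 0.02
```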
From literature:
Search terms?
- Computer adaptive test (mentioned on the Moodle forum (2010) https://moodle.org/mod/forum/discuss.php?d=159682 )
- The forum mentions http://carla.umn.edu/assessment/CATfaq.html and http://echo.edres.org:8080/scripts/cat/catdemo.htm (Lawrence M. Rudner)
The second link describes a system that works similarly to the one described above. Their system works as follows:
- All the items that have not yet been administered are evaluated to determine which will be the best one to administer next given the currently estimated ability level
- The "best" next item is administered and the examinee responds
- A new ability estimate is computed based on the responses to all of the administered items.
- Steps 1 through 3 are repeated until a stopping criterion is met.
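The statistics behind this loop is item response theory: an item is most informative about an examinee whose ability is near the item's difficulty. Below is a compact sketch of the loop under the simple one-parameter (Rasch) model, with a crude grid-search ability estimate and a fixed-length stopping criterion; real CAT systems use more refined estimators, so this only illustrates the idea:

```python
import math

def p_correct(theta: float, b: float) -> float:
    """Rasch (1PL) model: chance that an examinee of ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta: float, b: float) -> float:
    """Fisher information; for the 1PL model this peaks when b == theta,
    so the 'best' next item matches the current ability estimate."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def estimate_ability(responses) -> float:
    """Maximum-likelihood estimate over a grid of ability values.
    responses: list of (item_difficulty, answered_correctly) pairs."""
    grid = [x / 10.0 for x in range(-40, 41)]
    def loglik(theta):
        return sum(math.log(p_correct(theta, b)) if ok
                   else math.log(1.0 - p_correct(theta, b))
                   for b, ok in responses)
    return max(grid, key=loglik)

def cat_session(item_bank, ask, max_items: int = 20) -> float:
    """item_bank: list of item difficulties; ask(b) -> bool poses the item
    to the examinee and reports whether the answer was correct."""
    theta, responses, remaining = 0.0, [], list(item_bank)
    for _ in range(max_items):                                        # step 4
        if not remaining:
            break
        b = max(remaining, key=lambda d: item_information(theta, d))  # step 1
        remaining.remove(b)
        responses.append((b, ask(b)))                                 # step 2
        theta = estimate_ability(responses)                           # step 3
    return theta
```

To adapt this for training rather than measurement, as proposed below, one option would be to bias the selection toward items slightly easier than the maximum-information point, so the student succeeds often enough to stay motivated.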
It describes a more complicated way of choosing questions which is backed up with statistics (which question should theoretically give us the most information about the user's skill level). This does not take into account the psychological effects of the difficulty of the questions, and it is geared towards determining the skill level of people as opposed to increasing it.
I propose we use an established system as described above to determine skill levels, but adapt it so that it also tries to train the students on their weak points and motivates them by asking the right questions at the right times. The speed at which the questions get answered is something we can also consider taking into account (speed is not considered in the mentioned existing system).