Question Pooling favours the middle question(s)
January 6, 2022 12:00 AM
I did a deep dive on our reporting for some enterprise-wide courses. We have a very large data set to work with, and I did some number crunching in Excel.
I noticed that with question pooling (randomizing a test section and serving only 1 question from it), the randomization favors the middle questions twice as much as the first and last questions in the same pool.
Is this expected, given the way the randomization has been implemented? Has anyone else noticed this as well?
Let me explain further and demonstrate.
In the test results below, Pool (Test section) 7 had 2 questions, of which only 1 is served to the learner.
From the reporting data, we get what is expected. Each question is served roughly half the time. Perfect!
The left side shows the responses to each question and the right side shows the percentage of the time each question was served. 35,636 learners were served Q27, while the rest were served Q28.
(Note: yes, this test was taken 71,913 times by our learners.)
However, with a test section of 3 questions, we get this: the middle question (Question 6) is served roughly twice as often as the 1st or 3rd question in the pool.
It's not isolated; all of the question pools of 3 questions look roughly like this.
Here is the data from my question pools 4 and 5. The questions are again served on a 25%-50%-25% basis. I would have thought it should be closer to 33%-33%-33% if it was truly random.
Question pools (test sections) with 4 questions, where again only 1 is served to the learner, are split roughly 16%-34%-34%-16% rather than 25%-25%-25%-25%.
Even with a question pool of 9 questions, the middle questions are favored twice as much as the first and last questions. Q8 and Q16 were each served half as often as the middle questions, rather than every question being served ~11% of the time.
The test was built in Lectora 19. I'm curious if Lectora 21 implements random question serving differently? Have there been improvements?
Cheers
Trev
Discussion (8)
I checked with a course delivered 98 times, where each question had a pool of 3, and I'm seeing that as well. The lowest delivery rate for the 2nd question was 49.79%; the highest was 60.2%.
Has anyone looked at the JavaScript of a quiz? It's public; it's right there in the course.
It's probably a rounding effect. When you pick a random() number between 1 and 3 and round the result, you get:
1.00 - 1.49 = Question 1
1.50 - 2.49 = Question 2
2.50 - 2.99 = Question 3
This theory matches the statistics above almost exactly. The absolute gap should get smaller the more questions are in the pool.
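To check the rounding theory against the pool sizes reported above, here is a small sketch that computes the share each question would get if a uniform random number between 1 and n were rounded to the nearest integer. The function name `roundingShares` is made up for illustration, and this models the theory only; it is not Lectora's code.

```javascript
// Expected share of each question when a uniform random number in [1, n)
// is rounded to the nearest integer (a model of the theory, not Lectora code).
function roundingShares(n) {
  const shares = [];
  for (let q = 1; q <= n; q++) {
    // Math.round maps the interval [q - 0.5, q + 0.5) to q;
    // clip that interval to the range the random number can fall in.
    const lo = Math.max(1, q - 0.5);
    const hi = Math.min(n, q + 0.5);
    shares.push((hi - lo) / (n - 1));
  }
  return shares;
}

// Print the expected percentages for the pool sizes discussed in the thread.
for (const n of [2, 3, 4, 9]) {
  const pct = roundingShares(n).map(s => (100 * s).toFixed(1) + "%");
  console.log(n + " questions: " + pct.join(", "));
}
```

Under this model a pool of 2 splits 50/50, a pool of 3 splits 25/50/25, and a pool of 4 splits roughly 16.7/33.3/33.3/16.7, which lines up with the reported figures. The first and last question always get half the share of a middle one, so the 2:1 ratio stays even though the absolute gap shrinks as the pool grows.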
I think I may have a solution: an updated formula. I would love others to weigh in, as I'm not 100% sure on this.
@carlfink, it's not always easy to drill down into the JavaScript, but I'll attempt to below. I love a good challenge. :)
@timk Your post got me thinking more about non-even distribution and random number generation. So I started investigating further.
I found a discussion on StackOverflow that really suited my understanding. It had sample js for both non-even and even distribution of random number generation.
Specifically,
With this additional information in mind, I investigated Lectora's js files.
In the trivantis-titlemgr.js, I found this:
Yay! So I think that's part of the formula anyway. It seems to be using Math.round rather than Math.floor. I can't just change that, though: the latter half of the formula differs too.
I need to find which parts of the code call the get_random() function and see what is being passed as (lim).
Further down I noted 2 instances.
One in TTPr.LoadPages = function()
And one in TSPr.LoadPages = function( arr, rand )
Looking at the latter part of the formulas, I noticed that it is not quite the same as in my StackOverflow example.
If I understand the arWork.length and i variables correctly, the code Math.round(Math.random()*this.arWork.length - i - 1);
looks more like: Math.round(Math.random()*(max - min - 1));
rather than the non-even distribution formula Math.round(Math.random() * (max - min)) + min;
Perhaps there is more than one way of doing it.
If so, comparing against the even distribution formula, Math.floor(Math.random()*(max - min + 1)) + min;
I could infer that in my case it could be just Math.floor(Math.random()*(max - min));
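To make the comparison concrete, here is a minimal sketch of the two variants for `count` candidates (that is, max - min, with max exclusive). The names `pickUneven`, `pickEven` and `tallyOverGrid` are hypothetical, and the random value is passed in as `r` so the mapping can be checked without any randomness. It follows the formulas quoted above and is not the actual Lectora source.

```javascript
// r is a value in [0, 1), such as Math.random() would return.
// count is the number of candidates (max - min, with max exclusive).

// Variant with Math.round and the extra -1: yields indices 0..count-1.
function pickUneven(r, count) {
  return Math.round(r * (count - 1));
}

// Variant with Math.floor and no -1: also yields indices 0..count-1.
function pickEven(r, count) {
  return Math.floor(r * count);
}

// Sweep r over evenly spaced midpoints in [0, 1) and tally the index
// each variant picks. Midpoints avoid landing exactly on a boundary.
function tallyOverGrid(pick, count, steps) {
  const tally = new Array(count).fill(0);
  for (let k = 0; k < steps; k++) {
    tally[pick((k + 0.5) / steps, count)]++;
  }
  return tally;
}

console.log(tallyOverGrid(pickUneven, 3, 1200)); // 300, 600, 300
console.log(tallyOverGrid(pickEven, 3, 1200));   // 400, 400, 400
console.log(tallyOverGrid(pickUneven, 2, 1200)); // 600, 600
console.log(tallyOverGrid(pickEven, 2, 1200));   // 600, 600
```

With 2 candidates both variants split evenly, which would explain why the pool of 2 looked fine in the reports while larger pools did not.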
Let's toss that on LiveWeave and test both the non-even distribution and what I think is the even distribution, generating random numbers for a simulated question pool.
The result is 0, 1 or 2; the max is exclusive while the min is inclusive in the formula.
I ran the code 120 times for a good amount of data and got the output:
I threw that data into Excel and tabulated the results.
For the non-even distribution (first formula), we see the same non-even distribution as in Lectora.
The middle number is served twice as much as the first and last.
If we tally up the results using the new formula, we get:
Yay! It appears to be a much better distribution of served random numbers.
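The comparison above can be reproduced with a short simulation along these lines. This is a sketch, not the original LiveWeave code; the function names and the sample size are assumptions chosen for illustration.

```javascript
// Draw `draws` random indices for a pool of `poolSize` questions using the
// supplied formula, and return the fraction of draws that hit each index.
function simulate(formula, poolSize, draws) {
  const counts = new Array(poolSize).fill(0);
  for (let i = 0; i < draws; i++) {
    counts[formula(poolSize)]++;
  }
  return counts.map(c => c / draws);
}

// Non-even variant: Math.round over (poolSize - 1).
const roundFormula = n => Math.round(Math.random() * (n - 1));
// Even variant: Math.floor over poolSize.
const floorFormula = n => Math.floor(Math.random() * n);

const draws = 300000; // assumed sample size, large enough to smooth out noise
for (const poolSize of [2, 3, 4]) {
  console.log("pool of " + poolSize + ", round:", simulate(roundFormula, poolSize, draws));
  console.log("pool of " + poolSize + ", floor:", simulate(floorFormula, poolSize, draws));
}
```

The exact fractions vary from run to run, but the Math.round variant should hover around 25%/50%/25% for a pool of 3, while the Math.floor variant should stay close to a third each.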
So my analysis concludes that I should update the trivantis-titlemgr.js file by:
Updating the first part of the random formula, replacing Math.round with Math.floor
And removing the -1 in the functions under TTPr.LoadPages and TSPr.LoadPages. I'm not sure which of these functions is used specifically for questions, but I assume one is for overall tests and one is for test sections.
Is my analysis sound? I have yet to try this out on actual eLearning courses or on our LMS. I just want to see if my investigation is on to something. I'd love for ElearningBros to weigh in; perhaps this or something similar could be updated in Lectora 21.
Before I ever use this, I would need to do a lot more testing to make sure it doesn't break anything or affect a question pool of 2.
Cheers
Trev
@trog, very good! I tested your code with 1M draws, 10 times, for pools of 2, 3, and 4 and saw the same results. Here is one round:
@chrystalb21 That's awesome. Thanks so much for the additional testing. Greatly appreciated!
@wheels , That's great to hear! Thank you as well!
@trog thank you so much for this! We are working on integrating and testing these changes.