5.4. Sorting Question - Version 1

The following examples will focus on a hypothetical SortingQuestion module. In reality, this question would never be used, as it just asks a student to sort an array - quite a mundane task. Nonetheless, it is designed to match the recommended practices for question modules and thus serves as an adequate example.

Quizgen Questions

There are many questions in the questions/q/ folder that have working implementations. In fact, they are abstracted to the point that there is very little code in each question class. Although useful for quickly adding new questions from Quizgen, they are far from adequate examples.

Their internal implementations are complicated and have numerous components. As a result, they should not be used as templates for new questions. Instead, use the example questions, provided in the questions/examples directory, as starting points when creating a new question module.

To follow along, open up questions/examples/SortingQuestion_v1/question.py. Even for a simple question, there is quite a bit of complexity in the implementation, so it is very helpful to keep it open while you read through this guide.

5.4.1. Declaration

5.4.1.1. Package Entry

The __init__.py file is very straightforward - it simply imports the question class:

# examples/SortingQuestion_v1/__init__.py
# Quizzera
# Author: Rushy Panchal
# Description: Example __init__.py for SortingQuestion question.

from .question import SortingQuestion_v1

To even be recognized by the loader (QuestionLoader), the question module must inherit from QuestionInterface. This ensures that the question will be usable by the backend.

So, we start off by importing the interface and declaring the question.

# examples/SortingQuestion_v1/question.py

from core.interfaces import QuestionInterface

class SortingQuestion_v1(QuestionInterface):
    '''
    An example question that asks the student to sort an array.
    '''

Notice that the name of the class matches that of the package - this is also required by the loader.

5.4.2. Instance Generation

Now we’ll go over the first of the three required methods: generate_instance.

As you may recall from the interface overview, this method is responsible for generating random instances of the question given the seed.

That brings us to the first line of the method:

random.seed(self.seed)

This sets the seed for the random number generator - ensuring that for a given seed, we will obtain the same instance.
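This determinism is easy to see in a standalone sketch using Python's `random` module directly (the helper name here is illustrative, not part of the module):

```python
import random

def make_array(seed):
    # Re-seeding with the same value replays the exact same sequence of
    # pseudo-random numbers, so the generated array is always identical.
    random.seed(seed)
    return [random.randint(1, 100) for _ in range(10)]

# Same seed, same instance - this is what makes instances reproducible.
assert make_array(42) == make_array(42)
```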

Then, we randomly generate an array, which will serve as the primary data for the question:

      # Generate a random instance of the question. The correct answer is also
      # calculated.
      question_array = [random.randint(1, 100) for i in range(10)]
      correct_answer = sorted(question_array)

As shown, the correct answer (which is just the sorted array in our example) is also found. In general, it’s safest to calculate the answer and compare the student’s answer to this later on, even if there are other ways to check if the student is correct.

For example, an easy mistake to make is to just check that the student’s inputted answer is sorted. However, this is not actually correct because the question asks the student to sort the given data, not just to provide an arbitrary sorted array.
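To make the pitfall concrete, here is a sketch of that naive check (the helper names are illustrative, not part of the module); it happily accepts a sorted array that has nothing to do with the question's data:

```python
def is_sorted(arr):
    # The naive check: verifies only ordering, not content.
    return all(arr[i] <= arr[i + 1] for i in range(len(arr) - 1))

question_array = [34, 7, 91, 12]
correct_answer = sorted(question_array)  # [7, 12, 34, 91]

cheat = [1, 2, 3]                # sorted, but unrelated to the given data
assert is_sorted(cheat)          # the naive check would award credit
assert cheat != correct_answer   # comparing to the stored answer does not
```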

In addition, calculating the answer in the generate_instance method makes it simpler to award partial credit later on.

5.4.2.1. Prompt

Equally important is providing a descriptive prompt - how else would the student know what to do? A prompt is composed of a few parts:

  1. Task to perform - what should the student do?
  2. Task’s output - what should the student get from the task? That is, what is the specific answer you are looking for?
  3. Output format - how should the student input their answer?
  4. Data required - what data does the student need to answer the question?

Let’s see if our prompt matches that format.

      # A good prompt tells the student what to do and of course, provides the
      # necessary data.
      prompt = '''Sort the following array:\n{data}.\nMake sure to input your
      answer as an array of numbers, where each individual element is separated
      by a space.'''.format(
          data=self.array_to_string(question_array))

  1. The task is clearly stated: ‘Sort the following array.’
  2. The expected output - an array - is stated. For this question, we could ask for the first/last element of the array, or the number of misses when searching for a number in the sorted array using binary search; there are numerous different answers that could come from a sorted array. Thus, we have to be explicit in what we expect.
  3. The format of the output is specified - an array delimited by spaces.
  4. The data that the student will need - the unsorted array - is provided.

Our prompt has all of the required elements, even if brief. So, let’s move on to the next important step: the explanation.

5.4.2.2. Explanation

The explanation is arguably the most crucial part of generating a question instance. After all, the students are meant to learn from answering these questions; to learn, they should know where they went wrong and what they should have done to get the correct answer.

However, generating a useful and thorough explanation can be tedious and difficult to do. For our explanation, we generate a list of steps taken when sorting the array using selection sort:

      # A good explanation contains the steps of how to reach the answer -
      # this can be tricky to do, but very worthwhile when a student is confused
      # about how an answer is reached.
      steps = self.selection_sort_steps(question_array.copy())
      steps_disp = '\n'.join('{}. {}'.format(i, s) for i, s in enumerate(steps, 1))
      explanation = '''Use any sorting algorithm to sort the array. An example is
      shown using selection sort:

      {steps}
      '''.format(steps=steps_disp)

Note the call to selection_sort_steps. Even though we are just asking the student to sort an array, we put in a significant amount of effort in tracing a specific (simple) algorithm’s steps through performing that task:

   def selection_sort_steps(self, array):
       '''
       Sort the array and return an array of arrays, each a step in the process.
       '''
       number_items = len(array)
       steps = [None] * number_items

       for index in range(number_items):
           # Find the minimum value and index it occurs at
           min_value = array[index]
           min_index = index

           for i, val in enumerate(array[index:], index):
               if val < min_value:
                   min_value = val
                   min_index = i

           # Swap the minimum value and current index.
           tmp = array[index]
           array[index] = min_value
           array[min_index] = tmp

           steps[index] = self.array_to_string(array)

       return steps

By doing so, however, we can ensure that at the least, the student can follow the algorithm and see where the answer came from.
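As a usage sketch, here is a standalone version of the same tracing idea, with `array_to_string` stubbed out as a plain space-join (the real helper's formatting may differ). Each entry in the result is a snapshot of the array after one pass:

```python
def array_to_string(array):
    # Stand-in for the module's helper.
    return ' '.join(str(x) for x in array)

def selection_sort_steps(array):
    steps = []
    for index in range(len(array)):
        # Locate the minimum of the unsorted suffix and swap it into place.
        min_index = min(range(index, len(array)), key=array.__getitem__)
        array[index], array[min_index] = array[min_index], array[index]
        steps.append(array_to_string(array))
    return steps

print(selection_sort_steps([3, 1, 2]))  # ['1 3 2', '1 2 3', '1 2 3']
```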

5.4.2.3. Question Data

Now that we’ve set up all the components, we can construct the question’s data:

      question_instance = {
          'prompt': prompt,
          'original_array': question_array,
          'answer': correct_answer,
          'explanation': explanation,
      }
Note: This data can be in any format you require. In fact, the fields (and the names you use for them) don’t necessarily have to match what is shown here. Of course, for clarity, they should be descriptive and easy to work with. Nonetheless, the data is never directly examined by the backend - it is for your use (and sent to the user through the API) only.

All of the required data, whether for the question prompt or for the submission steps, is stored in this one dictionary - this is the recommended approach. It’s possible to generate the explanation on-the-fly, and that may be preferred (or even necessary) in some cases. However, for simple use cases - the category most questions fall into - generate all of the data required for a single question in the generate_instance method, even if some of that data is only used during validation.

The primary reason to generate all the data together is that question instances are generated in the background, thus speeding up access to new instances. As such, it makes sense to put the heaviest work in that single method, as it runs separately from the main server.

In addition, generating the data in a single location allows for clarity in the data produced and is more cohesive.

All together, our generate_instance method looks like:

   def generate_instance(self):
       '''
       Generate a new instance of the question.

       :see: `core.interfaces.QuestionInterface.generate_instance`.
       '''
       # Initialize the PRNG with the current seed.
       random.seed(self.seed)

       # Generate a random instance of the question. The correct answer is also
       # calculated.
       question_array = [random.randint(1, 100) for i in range(10)]
       correct_answer = sorted(question_array)

       # A good prompt tells the student what to do and of course, provides the
       # necessary data.
       prompt = '''Sort the following array:\n{data}.\nMake sure to input your
       answer as an array of numbers, where each individual element is separated
       by a space.'''.format(
           data=self.array_to_string(question_array))

       # A good explanation contains the steps of how to reach the answer -
       # this can be tricky to do, but very worthwhile when a student is confused
       # about how an answer is reached.
       steps = self.selection_sort_steps(question_array.copy())
       steps_disp = '\n'.join('{}. {}'.format(i, s) for i, s in enumerate(steps, 1))
       explanation = '''Use any sorting algorithm to sort the array. An example is
       shown using selection sort:

       {steps}
       '''.format(steps=steps_disp)

       question_instance = {
           'prompt': prompt,
           'original_array': question_array,
           'answer': correct_answer,
           'explanation': explanation,
       }

       return self.seed, question_instance

Note that the return type is what we expect - a tuple of the seed (an int) and the question data (a dict).

5.4.3. Instance Cleaning

‘Cleaning’ the instance is the process of transforming the instance data into a version that can be presented to the user.

Abstracted Cleaning

You may be wondering - why doesn’t the backend, which utilizes the question class, select which data it needs to provide? Why delegate this task to the question designer, when the task itself will essentially be the same for all questions? It would appear that making every question designer write the same code flies directly in the face of abstraction and simplicity.

To answer that, look back to the question data. The backend does not enforce any structure upon the question data, save that it is a dictionary and can be serialized into JSON. As such, the backend can never guarantee that certain fields do or do not exist in it. With that in mind, it is necessary for the question designer to implement the instance cleaning.

Our cleaning method, clean_instance, simply selects the fields that the student will see when attempting the question. Of course, they shouldn’t see the answer or the explanation - and more importantly, they should never have access to them before submitting their answer.

So, that leaves us with two fields that they should see: prompt and original_array. In truth, the original array is contained within the prompt, but we may still want access to it in the ‘raw’ form.

So, we only select those fields:

      return {
          'prompt': instance['prompt'],
          'original_array': instance['original_array']
      }

It’s equally valid to delete the fields we don’t require, as we can modify the instance in-place, but it’s often simpler to choose fields to include rather than exclude the rest - the inclusion list is smaller and easier to work with. It’s also more explicit and provides a greater level of clarity.
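One way to keep that inclusion list explicit is a dict comprehension over a tuple of public field names; the constant below is illustrative, not part of the interface:

```python
PUBLIC_FIELDS = ('prompt', 'original_array')

def clean_instance(instance, finished=False):
    if finished:
        return instance
    # Whitelist approach: only the named fields ever reach the student.
    return {field: instance[field] for field in PUBLIC_FIELDS}

instance = {
    'prompt': 'Sort the following array: 3 1 2',
    'original_array': [3, 1, 2],
    'answer': [1, 2, 3],
    'explanation': '...',
}
assert 'answer' not in clean_instance(instance)
```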

5.4.3.1. finished = True

But wait, what about those two lines that we skipped?

      if finished:
          return instance

Recall that the clean_instance method is called at two distinct stages in the lifetime of a student’s question attempt - before being presented to the student and anytime the attempt is later accessed. As such, it is important to implement proper cleaning for when the attempt is ‘finished’ and being accessed just for viewing purposes.

In almost all cases, you can simply return instance when finished is True. However, the flexibility of performing additional processing (or only providing a subset of the data fields) is allowed if desired.

In total, our clean_instance method:

   def clean_instance(self, instance, finished=False):
       '''
       Clean the instance dictionary so it can be presented to the user.

       :see: `core.interfaces.QuestionInterface.clean_instance`
       '''
       if finished:
           return instance

       return {
           'prompt': instance['prompt'],
           'original_array': instance['original_array']
       }

5.4.4. Answer Validation

The final method to implement is validating the answer. This method is often very simple if the correct answer was calculated in the generate_instance method, which is recommended. Similarly, since the explanation was already generated, very little work needs to be done.

The validate_answer method has a complicated signature, so let’s break it down to understand it better.

validate_answer(self, instance, answer, submission_count)

The instance is simply the question instance created by your generate_instance method, passed in as a dict.

The answer is just a string representation of the answer submitted by the user. The format of this answer depends entirely on what you decide it to be in the front-end code. Normally, you will want to submit JSON data, as an array or as an object. Then, your validate_answer method should parse that data.

submission_count is a little more tricky - it’s how many times the user has performed a submission on this specific attempt. That is, how many ‘iterations’ have passed on this question as an iterated attempt. The first submission is counted as iteration 1, so the value will always be greater than or equal to 1.

Iterated Attempts

Iterated Attempts are the notion of performing multiple submissions on a single attempt - with one grade overall.

Essentially, this allows for feedback-based questions. Based on the submission count, which is automatically tracked for each attempt, and answer, you can decide how to respond to the user.

Iterated attempts can be used to catch common student mistakes early and provide appropriate feedback. Alternatively, they also allow for more complicated question types. To start with, however, you should stick to regular questions until you are comfortable with trying this advanced feature.
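As a sketch of what an iterated question could look like (this is not part of SortingQuestion_v1, and FINISHED and FEEDBACK here are stand-ins for the interface's state constants), a validator might give one hint before grading:

```python
FINISHED, FEEDBACK = 'finished', 'feedback'  # stand-ins for the real constants

def validate_answer(instance, answer, submission_count):
    answer_array = [int(x) for x in answer.split()]
    correct = (answer_array == instance['answer'])

    # On the first submission, an incorrect answer yields a hint instead
    # of a grade; the attempt stays open for another try.
    if not correct and submission_count == 1:
        return {'hint': 'Re-check the smallest element.'}, instance, None, FEEDBACK

    # Otherwise, grade the attempt and close it.
    response = {'explanation': instance['explanation']}
    return response, instance, float(correct), FINISHED
```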

The method’s return value is also complex. As shown in the interface’s documentation for validate_answer, the method should return a 4-tuple of (response, instance, grade, state).

The response is what the API should respond with to the user. As with the original clean_instance method, the response should only contain data that you want the user to see - if the submission is over, however, there’s no harm in showing the user the correct answer and their grade. In fact, that is recommended so that students can learn.

You can also update the instance with new information, if needed. As a rule of thumb, you should never remove information from the instance - it is much more difficult to deal with missing data than extraneous data. Regardless of whether or not you update the instance, the method should return the instance data.

The grade, as expected, is the grade received for the answer submission. Remember, the grade is a `float` (or `decimal.Decimal`) between 0 and 1 - a multiplier for the total score possible on the question, which the module has no knowledge of.

Finally, the state determines what the attempt’s state should be marked as. Currently, there are only two states: FINISHED and FEEDBACK.

If the state returned is the FINISHED state, then the attempt is marked as finished and no more submissions will be accepted from the user. The grade returned is also recorded for the attempt.

However, if the state returned is the FEEDBACK state, then the grade is not recorded and the attempt is not marked as finished. Instead, the response is returned, the submission count is increased, and the student is allowed to submit again. This ‘feedback loop’ will continue until the attempt is marked as finished, which is when the FINISHED state is returned.

Note: When returning the FEEDBACK state, you should not return a grade. Instead, return None. The grade is ignored and semantically, there is no grade in a feedback loop.

Now that we understand the specifics of the method signature, we can dive into what the method does - it checks the answer. That’s all there is to it.

So, let’s see the implementation in our example question:

   def validate_answer(self, instance, answer, submission_count):
       '''
       Validate the user's incoming answer.
       '''
       state = self.FINISHED

       # Parse the space-delimited answer; the elements must be converted
       # to integers before comparing against the computed answer.
       try:
           answer_array = [int(x) for x in answer.strip().split()]
       except ValueError:
           answer_array = []
       grade = float(instance['answer'] == answer_array)

       response = {
           'explanation': instance['explanation'],
           'correct_answer': self.array_to_string(instance['answer']),
       }

       return response, instance, grade, state

As expected, the method first parses the student’s answer into an array (splitting on spaces) and then compares it with the correct answer. We perform a simple conversion of the boolean result (from the comparison) to a float - a False comparison will be 0.0 and a True comparison will be 1.0, which is the proper grade.

Finally, we create the response including the correct answer and the explanation. The score and maximum score for the question are automatically attached to that response by the question manager.
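Because the correct answer is stored in the instance, extending this to partial credit is straightforward. A sketch (not part of the example module) that awards credit for each element in the correct position:

```python
def partial_grade(correct, submitted):
    # Fraction of positions where the submitted element matches the
    # correct one; extra or missing elements simply earn no credit.
    if not correct:
        return 0.0
    matches = sum(1 for c, s in zip(correct, submitted) if c == s)
    return matches / len(correct)

print(partial_grade([1, 2, 3, 4], [1, 2, 4, 3]))  # 0.5
```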

In the next section, we’ll go over some simplifications on the implementations to avoid replicating code across different questions.