SAT-style Question Answering with GPT4All Models¶

Learning objective: Load a small language model from a shared directory and use it to answer SAT-style questions.

What this notebook teaches¶

how to locate a shared folder of GGUF weights on your laptop or hub
how to load a model with gpt4all and walk through a tutor-style prompt
how to inspect the model’s answer and reasoning with reflection checkpoints
how to compare the model’s reasoning across random, filtered, and batched SAT items

Where the questions come from¶

We download questions from the PineSAT Questionbank API at https://pinesat.com/api/questions.
PineSAT hosts community-built SAT-style questions so you can practice with authentic formats without licensing hurdles.
The endpoint replies in JSON (JavaScript Object Notation, a plain text key-value format) so we can inspect passages, choices, and answer keys with simple loops.
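
As a small illustration of what inspecting JSON with simple loops looks like, the sketch below parses a hand-written JSON string shaped like one question record. The record contents are made up for this example; only the key names (`question`, `choices`, `correct_answer`, `difficulty`) follow the layout of the records printed later in this notebook.

```python
import json

# A made-up record for illustration; only the key layout mirrors the question bank.
sample_json = """
[
  {
    "id": "example_1",
    "difficulty": "Easy",
    "question": {
      "question": "If 2x = 10, what is the value of x?",
      "choices": {"A": "2", "B": "5", "C": "8", "D": "20"},
      "correct_answer": "B"
    }
  }
]
"""

records = json.loads(sample_json)  # JSON text becomes Python lists and dicts

for record in records:
    details = record["question"]
    print(record["difficulty"], "-", details["question"])
    for letter, choice_text in details["choices"].items():
        print(" ", letter + ")", choice_text)
    print("  Answer key:", details["correct_answer"])
```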

How to navigate the lesson¶

Every markdown cell previews the next code cell so you always know why a command matters.
Reflection checkpoints append your thoughts to answers.txt so you can track how your understanding changes.
Later cells batch four easy questions into a table so you can see accuracy trends without extra plotting.

Tip: Keep an eye on the model context size (the amount of text it can read at once) and thread count so you do not overload a shared machine.
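
One way to act on this tip is to work out both numbers before loading anything. The helper names below are hypothetical, and the figure of four characters per token is only a common rule of thumb for English text, not the model's real tokenizer.

```python
import os

def pick_thread_count(requested=4, reserve=1):
    """Cap the requested thread count so `reserve` cores stay free for others."""
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(requested, available - reserve))

def rough_token_estimate(text, chars_per_token=4):
    """Rough size check: assumes about `chars_per_token` characters per token."""
    return len(text) // chars_per_token + 1

print("Threads to use:", pick_thread_count())
print("Estimated tokens:", rough_token_estimate("If 3x + 5 = 14, what is the value of x?"))
```

If the estimate for a prompt plus the expected reply approaches the context size, shorten the passage or raise the context size before sending it.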


from gpt4all import GPT4All  # loads gpt4all so we can run GGUF models locally
import os  # lets us work with file paths
import random  # lets us pick random items from a list
import requests  # lets us call web APIs over HTTP
import pandas as pd  # helps with working with tables and spreadsheets
import re  # lets us search text with patterns
from IPython.display import display  # lets us show interactive widgets inside the notebook
import ipywidgets as widgets  # adds dropdown menus and buttons for simple UI

Check which shared directory is available on this machine. If you are on a hub, the shared directory is usually under /home/jovyan. On a laptop it might live in your Documents folder.


possible_directories = [
    "/home/jovyan/shared/",
    "/home/jovyan/shared_readwrite/",
    "/Users/ericvandusen/SmallLM/Models"
]

existing_directories = []
for directory_path in possible_directories:
    if os.path.exists(directory_path):
        print("Found possible directory:", directory_path)
        existing_directories.append(directory_path)
    else:
        print("Did not find:", directory_path)

Did not find: /home/jovyan/shared/
Did not find: /home/jovyan/shared_readwrite/
Found possible directory: /Users/ericvandusen/SmallLM/Models

Pick a directory to use. We default to the first path that exists. If none of them exists, you will be prompted to type a path yourself.


if len(existing_directories) > 0:
    model_directory = existing_directories[0]
    print("Using this directory by default:", model_directory)
else:
    model_directory = input("Type a directory path that contains your .gguf files: ")

print("Current model directory:", model_directory)

Using this directory by default: /Users/ericvandusen/SmallLM/Models
Current model directory: /Users/ericvandusen/SmallLM/Models

List every .gguf model file in the chosen directory so we can pick any model that is available. Use the dropdown to make your choice before running the loader cell.


available_models = []
for filename in os.listdir(model_directory):
    if filename.endswith(".gguf"):
        available_models.append(filename)

if len(available_models) == 0:
    print("No .gguf files found in", model_directory)
else:
    print("Models found in", model_directory)
    dropdown_default = available_models[0]
    for candidate_name in available_models:
        lowercase_name = candidate_name.lower()
        if "qwen" in lowercase_name:
            dropdown_default = candidate_name
            break
    model_dropdown = widgets.Dropdown(
        options=available_models,
        description="Model:",
        value=dropdown_default
    )
    display(model_dropdown)
    print("Use the dropdown to pick a model, then run the next cell to load it.")

Models found in /Users/ericvandusen/SmallLM/Models

Use the dropdown to pick a model, then run the next cell to load it.

Checkpoint #1: Look up the model card for the model you picked. What was that model trained on? What is its context size? How many threads does it use by default?

Load the selected model with gpt4all so it is ready to answer questions.

selected_model_name = model_dropdown.value
model_path = os.path.join(model_directory, selected_model_name)


model = GPT4All(
    model_name=selected_model_name,
    model_path=model_directory,
    allow_download=False,
    n_ctx=2048,
    n_threads=4,
    verbose=False
)

llama_context: n_ctx_per_seq (2048) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
ggml_metal_init: skipping kernel_get_rows_bf16                     (not supported)
ggml_metal_init: skipping kernel_set_rows_bf16                     (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32                   (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32_c4                (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32_1row              (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32_l4                (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_bf16                  (not supported)
ggml_metal_init: skipping kernel_mul_mv_id_bf16_f32                (not supported)
ggml_metal_init: skipping kernel_mul_mm_bf16_f32                   (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_bf16_f16                (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h64           (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h80           (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h96           (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h112          (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h128          (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h192          (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_hk192_hv128   (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h256          (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_hk576_hv512   (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h64       (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h96       (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h128      (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h192      (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_hk192_hv128 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h256      (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_hk576_hv512 (not supported)
ggml_metal_init: skipping kernel_cpy_f32_bf16                      (not supported)
ggml_metal_init: skipping kernel_cpy_bf16_f32                      (not supported)
ggml_metal_init: skipping kernel_cpy_bf16_bf16                     (not supported)

Warm-up: ask a simple SAT-style algebra question to confirm that the model responds.

Here is a simple question to get us started. We will ask the model to reason step by step and to state its final answer on the last line. This encourages the model to show its work and makes the answer easy to find. The prompt has three parts:

an algebraic equation to solve
four answer choices to pick from
a tutor-style system message that encourages the model to explain its reasoning clearly

# Build the Question from parts 

warmup_question = "If 3x + 5 = 14, what is the value of x?"
warmup_choices = "A) 2\nB) 3\nC) 4\nD) 5"
warmup_prompt = """
Here is an SAT-style multiple-choice question:
Question: {question_text}
Choices: {choices_text}

Reason step by step. On the very last line of your response, write exactly:
Final answer: X
where X is the letter of your choice (A, B, C, or D).
"""

# Put the parts together into the format we want to send to the model
messages = []
messages.append({"role": "system", "content": "You are a helpful math tutor who explains each step clearly."})
messages.append({"role": "user", "content": warmup_prompt.format(question_text=warmup_question, choices_text=warmup_choices)})

# Send the question to the model and print the response
warmup_response = model.chat_completion(messages)
# this next line pulls the text of the model's response out of the full response object that the model returns, so we can print just the text

warmup_text = warmup_response["choices"][0]["message"]["content"]
print(warmup_text)

To solve this problem, we need to isolate the variable x on one side of the equation. We can do this by subtracting 5 from both sides of the equation. 

So, 3x + 5 - 5 = 14 - 5

This simplifies to:

3x = 9

Now, we need to divide both sides of the equation by 3 to solve for x.

3x / 3 = 9 / 3

This simplifies to:

x = 3

Therefore, the value of x is 3.

Final answer: X = 3
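
The warm-up reply ends with `Final answer: X = 3` rather than a choice letter, so a strict letter-only pattern would find nothing. The sketch below shows one possible fallback: look for a letter first, then try to match the stated value against the choices. The function name and the choices dict are hypothetical, introduced only for this example.

```python
import re

def extract_choice_letter(reply_text, choices):
    """Return 'A'-'D' from a reply, or '' when nothing can be recovered."""
    # Collect the text after every "Final answer:" and keep the last one.
    final_lines = re.findall(r"final answer:\s*(.+)", reply_text, flags=re.IGNORECASE)
    if not final_lines:
        return ""
    final_text = final_lines[-1].strip()

    # Case 1: the model followed the format, e.g. "B" or "B) 3".
    letter_match = re.match(r"\(?([A-D])\b", final_text)
    if letter_match is not None:
        return letter_match.group(1)

    # Case 2: the model gave a value, e.g. "X = 3"; compare it to each choice.
    value_text = final_text.split("=")[-1].strip().rstrip(".")
    for letter, choice_text in choices.items():
        if choice_text.strip() == value_text:
            return letter
    return ""

warmup_choices_dict = {"A": "2", "B": "3", "C": "4", "D": "5"}
print(extract_choice_letter("x = 3\n\nFinal answer: X = 3", warmup_choices_dict))
```

The value fallback only works when choices are short, exact strings, so it suits math items better than long English choices.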

Source an open source set of SAT-style questions¶

Download the SAT Questionbank from PineSAT so we can pull authentic practice questions. We will use the API endpoint at https://pinesat.com/api/questions, which returns JSON: a list of records, each with a passage, question, choices, and answer key. We can loop through this data to inspect the format and content of the questions.

base_url = "https://pinesat.com/api/questions"

english_questions = requests.get(base_url, params={"section": "english"}).json()
math_questions = requests.get(base_url, params={"section": "math"}).json()

english_nested = pd.DataFrame(english_questions)
math_nested = pd.DataFrame(math_questions)

print("English questions:", len(english_nested))
print("Math questions:", len(math_nested))

English questions: 1443
Math questions: 1031

english_questions[0]

{'id': 'random_id_a1',
 'domain': 'Information and Ideas',
 'visuals': {'type': 'null', 'svg_content': 'null'},
 'question': {'choices': {'A': 'Suppressing opinions robs future generations of the chance to hear them, even if they disagree with them.',
   'B': 'It is harmful to silence opinions that are held by a majority of people.',
   'C': 'People who dissent from an opinion are more likely to be harmed by its suppression than those who hold it.',
   'D': 'It is important to respect all opinions, even if they are wrong.'},
  'question': 'What is Mill\'s main point in this passage from "On Liberty"?',
  'paragraph': 'In the essay "On Liberty," John Stuart Mill argues that "the peculiar evil of silencing the expression of an opinion is, that it is robbing the human race; posterity as well as the existing generation; those who dissent from the opinion, still more than those who hold it."  What is Mill\'s main point in this passage?',
  'explanation': 'Mill argues that suppressing opinions is a harm to everyone, including future generations, because it prevents them from hearing and potentially engaging with these ideas.  He also emphasizes that those who disagree with the silenced opinion are more likely to be harmed because it prevents them from developing their own understanding and potentially challenging it.',
  'correct_answer': 'A'},
 'difficulty': 'Medium'}

english_questions[0]["question"]

{'choices': {'A': 'Suppressing opinions robs future generations of the chance to hear them, even if they disagree with them.',
  'B': 'It is harmful to silence opinions that are held by a majority of people.',
  'C': 'People who dissent from an opinion are more likely to be harmed by its suppression than those who hold it.',
  'D': 'It is important to respect all opinions, even if they are wrong.'},
 'question': 'What is Mill\'s main point in this passage from "On Liberty"?',
 'paragraph': 'In the essay "On Liberty," John Stuart Mill argues that "the peculiar evil of silencing the expression of an opinion is, that it is robbing the human race; posterity as well as the existing generation; those who dissent from the opinion, still more than those who hold it."  What is Mill\'s main point in this passage?',
 'explanation': 'Mill argues that suppressing opinions is a harm to everyone, including future generations, because it prevents them from hearing and potentially engaging with these ideas.  He also emphasizes that those who disagree with the silenced opinion are more likely to be harmed because it prevents them from developing their own understanding and potentially challenging it.',
 'correct_answer': 'A'}

english_questions[0]["question"]['question']

'What is Mill\'s main point in this passage from "On Liberty"?'

english_questions[0]["question"]['choices']

{'A': 'Suppressing opinions robs future generations of the chance to hear them, even if they disagree with them.',
 'B': 'It is harmful to silence opinions that are held by a majority of people.',
 'C': 'People who dissent from an opinion are more likely to be harmed by its suppression than those who hold it.',
 'D': 'It is important to respect all opinions, even if they are wrong.'}

english_questions[0]["question"]['paragraph']

'In the essay "On Liberty," John Stuart Mill argues that "the peculiar evil of silencing the expression of an opinion is, that it is robbing the human race; posterity as well as the existing generation; those who dissent from the opinion, still more than those who hold it."  What is Mill\'s main point in this passage?'

english_questions[0]["question"]['correct_answer']

'A'

english_details = pd.json_normalize(english_nested["question"])
english_details.head()

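# Combine the top-level columns (minus the nested "question" column) with the flattened details
english_df = pd.concat(
    [english_nested.drop(columns=["question"]).reset_index(drop=True),
     english_details.reset_index(drop=True)],
    axis=1
)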

english_df.head()

# adapt the code for math 

math_details = pd.json_normalize(math_nested["question"])
math_details.head()
# Same combination for math: top-level columns plus the flattened question details
math_df = pd.concat(
    [math_nested.drop(columns=["question"]).reset_index(drop=True),
     math_details.reset_index(drop=True)],
    axis=1
)
math_df.head()

english_df
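
To see what the flattening step does to one nested record, here is a plain-Python sketch that builds dotted column names such as `choices.A`. It illustrates the idea only: `flatten_record` is a hypothetical helper, not part of pandas, and `pd.json_normalize` handles more cases than this does.

```python
def flatten_record(record, parent_key="", sep="."):
    """Turn nested dicts into one flat dict whose keys are joined with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the key prefix along.
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# A made-up record shaped like the "question" field of one bank entry.
nested = {
    "choices": {"A": "2", "B": "3"},
    "question": "If 3x + 5 = 14, what is the value of x?",
    "correct_answer": "B",
}
flat = flatten_record(nested)
print(sorted(flat))
```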

Inspect the first math question in the bank, without filtering, so you can see the full structure.

# Pick a math question from the question bank that we can test on
math_questions[0]

{'id': '281a4f3b',
 'domain': 'Advanced Math',
 'visuals': {'type': 'null', 'svg_content': 'null'},
 'question': {'choices': {'A': 'f(x) = 3,000(0.02)^x',
   'B': 'f(x) = 0.98(3,000)^x',
   'C': 'f(x) = 3,000(0.002)^x',
   'D': 'f(x) = 3,000(0.98)^x'},
  'question': 'A certain college had 3,000 students enrolled in 2015. The college predicts that after 2015, the number of students enrolled each year will be 2% less than the number of students enrolled the year before. Which of the following functions models the relationship between the number of students enrolled, *f(x)*, and the number of years after 2015, *x*?',
  'paragraph': 'null',
  'explanation': 'Because the change in the number of students decreases by the same percentage each year, the relationship between the number of students and the number of years can be modeled with a decreasing, exponential function in the form *f(x) = a(1 - r)^x*, where *f(x)* is the number of students, *a* is the number of students in 2015, *r* is the rate of decrease each year, and *x* is the number of years since 2015. It’s given that 3,000 students were enrolled in 2015 and that the rate of decrease is predicted to be 2%, or 0.02. Substituting these values into the decreasing exponential function yields f(x) = 3,000(1 - 0.02)^x, which is equivalent to f(x) = 3,000(0.98)^x.',
  'correct_answer': 'D'},
 'difficulty': 'Medium'}

math_df.head()

Ask the model - English¶

We will now pass a random English question to the model with the same tutor-style prompt we used for the warm-up, so you can compare how the model handles a randomly drawn question versus a simple one.


# Pick a random question from the English question bank
random_index = random.randint(0, len(english_questions) - 1)
random_entry = english_questions[random_index]

# Pull out the question text and choices
random_question_text = random_entry["question"]['question']
random_choices_text = random_entry["question"]['choices']

print("Question:", random_question_text)
print("Choices:", random_choices_text)

# Ask the model to answer it
random_messages = []
random_messages.append({"role": "system", "content": "You are a helpful tutor who explains each step clearly."})
random_messages.append({"role": "user", "content": warmup_prompt.format(question_text=random_question_text, choices_text=random_choices_text)})

random_response = model.chat_completion(random_messages)
random_answer_text = random_response["choices"][0]["message"]["content"]
print(random_answer_text)

Question: Which choice provides the best way to combine the sentences in the underlined portion without changing the meaning?
Choices: {'A': 'Many people found the new styles to be too experimental and jarring, and they resisted the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s.', 'B': 'Many people found the new styles to be too experimental and jarring, and they resisted the change in popular music; psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s.', 'C': 'Many people found the new styles to be too experimental and jarring, resisting the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s.', 'D': 'Many people found the new styles to be too experimental and jarring, so they resisted the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s.'}
To answer this question, we need to understand the meaning of the underlined portion and the choices provided. The underlined portion states that many people found the new styles to be too experimental and jarring, and they resisted the change in popular music. The question asks which choice provides the best way to combine the sentences in the underlined portion without changing the meaning.

Choice A: "Many people found the new styles to be too experimental and jarring, and they resisted the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s."
Choice B: "Many people found the new styles to be too experimental and jarring, and they resisted the change in popular music; psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s."
Choice C: "Many people found the new styles to be too experimental and jarring, resisting the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s."
Choice D: "Many people found the new styles to be too experimental and jarring, so they resisted the change in popular music, but psychedelic and folk rock have since been recognized as two of the most influential genres of rock music in the 1960s."

The best way to combine the sentences in the underlined portion without changing the meaning is:

Final answer: C
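
In the cell above, the choices reach the prompt as a raw Python dict and the `paragraph` field is not sent at all. One possible way to tidy the prompt input is sketched below. Both helper names are hypothetical, and treating the string "null" as "no passage" is an assumption based on the sample records shown earlier.

```python
def format_choices(choices):
    """Render a choices dict as one 'A) text' line per choice, in letter order."""
    return "\n".join(letter + ") " + text for letter, text in sorted(choices.items()))

def build_question_text(entry):
    """Put the passage (when there is a real one) in front of the question text."""
    question = entry["question"]
    paragraph = (question.get("paragraph") or "").strip()
    parts = []
    if paragraph and paragraph.lower() != "null":
        parts.append("Passage: " + paragraph)
    parts.append(question["question"].strip())
    return "\n\n".join(parts)

# A made-up entry shaped like one record from the bank.
example_entry = {
    "question": {
        "question": "Which choice completes the sentence?",
        "paragraph": "The river rose quickly after the storm.",
        "choices": {"B": "second option", "A": "first option"},
    }
}
print(build_question_text(example_entry))
print(format_choices(example_entry["question"]["choices"]))
```

The two results can be passed as `question_text` and `choices_text` when filling in the prompt template.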

Ask the model - Math¶

Now let’s repeat the process with a random math question from the bank. This allows us to compare how the model handles different subjects and question formats.

# Pick a random question from the Math  question bank
random_index = random.randint(0, len(math_questions) - 1)
random_entry = math_questions[random_index]

# Pull out the question text and choices
random_question_text = random_entry["question"]['question']
random_choices_text = random_entry["question"]['choices']

print("Question:", random_question_text)
print("Choices:", random_choices_text)

# Ask the model to answer it
random_messages = []
random_messages.append({"role": "system", "content": "You are a helpful tutor who explains each step clearly."})
random_messages.append({"role": "user", "content": warmup_prompt.format(question_text=random_question_text, choices_text=random_choices_text)})

random_response = model.chat_completion(random_messages)
random_answer_text = random_response["choices"][0]["message"]["content"]
print(random_answer_text)

Question: The function $f(x) = ax^2 + bx + c$ has a vertex at $(-2, 3)$ and passes through the point $(0, 1)$. What is the value of $c$?
Choices: {'A': '-5', 'B': '1', 'C': '3', 'D': '5'}
To find the value of c, we need to use the given information about the vertex and the point through which the function passes. The vertex form of a quadratic function is given by:

\[f(x) = a(x - h)^2 + k\]

where (h, k) is the vertex of the parabola. In this case, the vertex is at (-2, 3), so we can write:

\[f(x) = a(x + 2)^2 + 3\]

Now, we know that the function passes through the point (0, 1). Substituting these values into the equation gives:

\[1 = a(0 + 2)^2 + 3\]

Simplifying this equation gives:

\[1 = 4a + 3\]

Solving for a gives:

\[4a = 2\]

\[a = \frac{1}{2}\]

Now, we can substitute the value of a back into the vertex form equation to find the value of c:

\[f(x) = \frac{1}{2}(x + 2)^2 + 3\]

Expanding this gives:

\[f(x) = \frac{1}{2}(x^2 + 4x + 4) + 3\]

Simplifying further gives:

\[f(x) = \frac{1}{2}x^2 + 2x + 2 + 3\]

\[f(x) = \frac{1}{2}x^2 + 2x + 5\]

So, the value of c is 5. Therefore, the correct answer is:

Final answer: D
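
A model's worked solution can be checked independently. For any quadratic ax^2 + bx + c, f(0) = c, so the point (0, 1) fixes c directly. The sketch below also recomputes a from the vertex form with exact fractions. The answer key for this item is not printed above, so treat this as a sanity check of the arithmetic rather than the official key.

```python
from fractions import Fraction

# Vertex form: f(x) = a(x - h)^2 + k with vertex (h, k) = (-2, 3)
h, k = Fraction(-2), Fraction(3)
point_x, point_y = Fraction(0), Fraction(1)

# Solve point_y = a(point_x - h)^2 + k for a
a = (point_y - k) / (point_x - h) ** 2

# Expand a(x - h)^2 + k = ax^2 - 2ahx + (ah^2 + k) to read off b and c
b = -2 * a * h
c = a * h ** 2 + k

print("a =", a, "b =", b, "c =", c)
```

Here c comes out as 1, which is choice B rather than the model's D. The reply's step from 1 = 4a + 3 to 4a = 2 drops a minus sign, and the error carries through to the final line.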

Checkpoint #2: Why are we testing the model with the domains provided in the bank? Write a short explanation.

Build a mini SAT practice set¶

Let's build a set of test questions that we can then pass to the model in a batch. This lets us see how the model performs across multiple questions and identify any patterns in its strengths or weaknesses.

Filter the bank by difficulty and subject so you can target specific skills.


difficulty_widget = widgets.Dropdown(
    options=["Easy", "Medium", "Hard"],
    value="Easy",
    description="Difficulty:"
)

section_widget = widgets.Dropdown(
    options=["English", "Math"],
    value="English",
    description="Section:"
)

num_questions_widget = widgets.Dropdown(
    options=[ 2, 4, 8, 10, 12],
    value=4,
    description="# Questions:"
)

display(difficulty_widget)
display(section_widget)
display(num_questions_widget)
print("Choose your settings above, then run the next cell to test the model.")

Choose your settings above, then run the next cell to test the model.

Read the widget settings, filter the bank by difficulty, and build the practice set.

# Read the widget values
chosen_difficulty = difficulty_widget.value
chosen_section = section_widget.value
chosen_count = num_questions_widget.value

# Pick the right question list based on section
if chosen_section == "English":
    question_pool = english_questions
else:
    question_pool = math_questions

# Filter by difficulty and shuffle so we get a fresh sample each run
filtered_questions = []
for question_entry in question_pool:
    if question_entry.get("difficulty", "").lower() == chosen_difficulty.lower():
        filtered_questions.append(question_entry)

random.shuffle(filtered_questions)
practice_set = filtered_questions[:chosen_count]

# Pull out question text, choices, paragraph, and correct answer for each item
practice_questions = []
practice_choices = []
practice_paragraphs = []
practice_answers = []
for question_entry in practice_set:
    practice_questions.append(question_entry["question"]["question"])
    practice_choices.append(question_entry["question"]["choices"])
    practice_paragraphs.append(question_entry["question"].get("paragraph", ""))
    practice_answers.append(question_entry["question"].get("correct_answer", ""))

print("Built a practice set of", len(practice_questions), chosen_difficulty, chosen_section, "questions.")
for item_index in range(len(practice_questions)):
    print("\nQ" + str(item_index + 1) + ":", practice_questions[item_index])
    if len(practice_paragraphs[item_index]) > 0:
        print("Passage:", practice_paragraphs[item_index])
    print("Choices:", practice_choices[item_index])
    print("Correct answer:", practice_answers[item_index])

Built a practice set of 4 Easy English questions.

Q1: Which choice most effectively combines the sentences below into a single sentence? 

The scientist discovered a new species of butterfly. The species is bright blue, with black stripes, and it has never been seen before. 
Passage: null
Choices: {'A': 'The scientist discovered a new species of butterfly, which is bright blue, with black stripes, and it has never been seen before.', 'B': 'The scientist discovered a new species of butterfly, bright blue with black stripes, and it has never been seen before.', 'C': 'The scientist discovered a new species of butterfly that is bright blue, with black stripes, and has never been seen before.', 'D': 'The scientist discovered a bright blue, with black stripes, and never-before-seen butterfly species.'}
Correct answer: C

Q2: The text suggests that the positive effects of meditation on well-being may be due to
Passage: The text describes a study of the effects of meditation on the brain. According to the text, the scientists who conducted the study found that meditation led to a change in the brain's structure. How does this finding support the text’s claim that meditation can have a positive effect on well-being?
Choices: {'A': 'the ability of meditation to reduce stress and anxiety.', 'B': 'the ability of meditation to improve focus and concentration.', 'C': 'the ability of meditation to increase blood flow to the brain.', 'D': 'the ability of meditation to change the structure of the brain.'}
Correct answer: D

Q3: The sentence below contains a grammatical error. Which choice corrects the error in the sentence?  
 The most obvious sign that the speaker is speaking directly to the reader is the use of the word "you."  There are other elements of the passage that may help to make the speaker seem more immediate and personal. The speaker refers to events and places that are familiar to the reader, and the speaker uses a conversational tone throughout the passage.
Passage: null
Choices: {'A': 'The most obvious sign that the speaker is speaking directly to the reader is the use of the word "you." There are other elements of the passage that may help to make the speaker seem more immediate and personal. The speaker refers to events and places that are familiar to the reader, and the speaker uses a conversational tone throughout the passage.', 'B': 'The most obvious sign that the speaker is speaking directly to the reader, is the use of the word "you." There are other elements of the passage that may help to make the speaker seem more immediate and personal. The speaker refers to events and places that are familiar to the reader, and the speaker uses a conversational tone throughout the passage.', 'C': 'The most obvious sign that the speaker is speaking directly to the reader, is the use of the word "you." There are other elements of the passage that may help to make the speaker seem more immediate and personal; the speaker refers to events and places that are familiar to the reader, and the speaker uses a conversational tone throughout the passage.', 'D': 'The most obvious sign that the speaker is speaking directly to the reader is the use of the word "you." There are other elements of the passage that may help to make the speaker seem more immediate and personal. The speaker refers to events and places that are familiar to the reader, and uses a conversational tone throughout the passage.'}
Correct answer: D

Q4: Which choice best revises the sentence so that it conforms to the conventions of Standard English?
Passage: The sentence "The author’s use of the word ‘love’ is significant, it is used to show the characters’ affection for their hometown." needs to be revised to conform to the conventions of Standard English.
Choices: {'A': 'The author’s use of the word ‘love’ is significant; it is used to show the characters’ affection for their hometown.', 'B': 'The author’s use of the word ‘love’ is significant, because it is used to show the characters’ affection for their hometown.', 'C': 'The author’s use of the word ‘love’ is significant and it is used to show the characters’ affection for their hometown.', 'D': 'The author’s use of the word ‘love’ is significant, being used to show the characters’ affection for their hometown.'}
Correct answer: B
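
The filter-and-shuffle step can also be wrapped in a small function with an optional seed, so that a practice set can be rebuilt exactly when you want to compare two models on the same items. This is a sketch only; `sample_by_difficulty` is a hypothetical name and the tiny pool below is made up.

```python
import random

def sample_by_difficulty(question_pool, difficulty, count, seed=None):
    """Return up to `count` entries whose difficulty matches, in random order."""
    wanted = difficulty.lower()
    matches = [
        entry for entry in question_pool
        if str(entry.get("difficulty", "")).lower() == wanted
    ]
    rng = random.Random(seed)  # a private generator, so the global state is untouched
    return rng.sample(matches, min(count, len(matches)))

# Made-up pool for illustration.
demo_pool = [{"id": n, "difficulty": "Easy"} for n in range(5)]
demo_pool.append({"id": 99, "difficulty": "Hard"})

demo_set = sample_by_difficulty(demo_pool, "easy", 3, seed=0)
print(len(demo_set), "questions sampled")
```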

Now send each question to the model and collect the results in a table.

batch_results = []
for item_index in range(len(practice_questions)):
    # If there is a real passage, include it before the question.
    # The bank uses the string "null" as a placeholder, so skip that too.
    passage_text = practice_paragraphs[item_index] or ""
    if len(passage_text) > 0 and passage_text.strip().lower() != "null":
        full_question_text = "Passage: " + passage_text + "\n\n" + practice_questions[item_index]
    else:
        full_question_text = practice_questions[item_index]

    batch_messages = []
    batch_messages.append({"role": "system", "content": "You are a helpful tutor who explains each step clearly. Always end your response with exactly: Final answer: X where X is the letter A, B, C, or D."})
    batch_messages.append({"role": "user", "content": warmup_prompt.format(question_text=full_question_text, choices_text=practice_choices[item_index])})

    batch_response = model.chat_completion(batch_messages)
    batch_reply_text = batch_response["choices"][0]["message"]["content"]

    extracted_answer = ""
    answer_match = re.search(r"[Ff]inal answer:\s*([A-D])", batch_reply_text)
    if answer_match is not None:
        extracted_answer = answer_match.group(1)

    is_correct = False
    if len(practice_answers[item_index]) > 0 and len(extracted_answer) > 0:
        is_correct = extracted_answer.strip().upper().startswith(practice_answers[item_index].upper())

    batch_results.append({
        "Section": chosen_section,
        "Difficulty": chosen_difficulty,
        "Question": practice_questions[item_index],
        "Correct Answer": practice_answers[item_index],
        "Model Guess": extracted_answer,
        "Correct?": is_correct
    })

batch_results_table = pd.DataFrame(batch_results)
batch_results_table

Checkpoint #3: Try an easy question from a subject you choose. Did the model get it right? Explain how you checked.


student_reply_three = input("Describe what you tried and whether the model's answer matched the key.\n")
with open('answers.txt', 'a') as answer_file:
    answer_file.write(student_reply_three)
    answer_file.write('\n')
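
Because every checkpoint appends to the same file, it can help to label and timestamp each entry so you can tell them apart later. The helper below is a hypothetical sketch of that idea; the label text and the line format are assumptions, not something the notebook requires.

```python
from datetime import datetime

def append_reflection(reply_text, checkpoint_label, path="answers.txt"):
    """Append one labelled, timestamped reflection line to the answers file."""
    timestamp = datetime.now().isoformat(timespec="seconds")
    line = "[" + timestamp + "] " + checkpoint_label + ": " + reply_text.strip() + "\n"
    with open(path, "a", encoding="utf-8") as answer_file:
        answer_file.write(line)
    return line
```

A call such as `append_reflection(student_reply_three, "Checkpoint #3")` would then replace the two `write` calls.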

Print a summary showing how many questions the model got right overall.


correct_count = 0
for result_row in batch_results:
    if result_row["Correct?"]:
        correct_count = correct_count + 1

print("Score:", correct_count, "out of", len(batch_results))
print(batch_results_table)
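
A single score hides the difference between a wrong answer and a reply where no letter could be extracted. The sketch below separates the two, assuming rows shaped like the `batch_results` entries (with `Model Guess` and `Correct?` keys). The function name and the demo rows are made up for illustration.

```python
def summarize_results(results):
    """Count correct, wrong, and unparsed rows and compute overall accuracy."""
    summary = {"correct": 0, "wrong": 0, "unparsed": 0}
    for row in results:
        if not row["Model Guess"]:
            summary["unparsed"] += 1  # no letter was extracted from the reply
        elif row["Correct?"]:
            summary["correct"] += 1
        else:
            summary["wrong"] += 1
    total = len(results)
    summary["accuracy"] = summary["correct"] / total if total else 0.0
    return summary

# Made-up rows for illustration.
demo_rows = [
    {"Model Guess": "C", "Correct?": True},
    {"Model Guess": "A", "Correct?": False},
    {"Model Guess": "", "Correct?": False},
    {"Model Guess": "B", "Correct?": True},
]
print(summarize_results(demo_rows))
```

A high unparsed count points to a prompt-format problem rather than a reasoning problem.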

Summary: You located a shared model directory, picked a GGUF file, loaded it with gpt4all, and exercised it on SAT-style questions with and without filters. You also logged your reflections in answers.txt.