Example: Generative AI for advice and feedback
What follows is an example of providing a completed active learning exercise with context to ChatGPT 4o and Anthropic’s Claude. Notice that the prompt begins by providing some context to indicate how much Prolog students have seen and some additional background. (Prolog is a declarative programming language used for logic programming, theorem proving, and computational linguistics.) Providing context like this can help guide LLMs in their responses, with the goal of producing useful evaluation and level-appropriate suggestions.
The prompt
ChatGPT’s response
Commentary on ChatGPT’s feedback
As an experiment I ran the active learning session with the specification as written above on Monday 2025-01-27—without incorporating ChatGPT’s feedback. This trial run informs some of my comments.
Clarify testing instructions: This was a good call on ChatGPT’s part. While we did go over how to consult a Prolog file in class, at least one team needed a reminder of how to do it. That said, I’ve taught the more concise
?- [enrollment].
over the syntax used by ChatGPT:
?- consult('enrollment.pl').
Had I included this suggestion without changing the syntax, I might have confused some students. Nevertheless, I have incorporated this suggestion for use in future semesters using my preferred syntax.
Group facts with comments: I’m not entirely happy with this. While adding comments is harmless, I discourage commenting code that should be entirely self-explanatory. Adding comments as suggested is no different from commenting like this:
x = 1 # let x equal 1
Instead, I favor self-documenting code to the greatest extent possible. In Prolog (as in other programming languages), good names for atoms and identifiers should be favored over comments.
Use of “standardized” variables in facts: This is pure nonsense.
Hints on debugging: trace/0 is something that shouldn’t be introduced at this stage in learning Prolog. It would require more lecture time and demonstration to prepare for use of trace/0. With myself and a TA in the room during the exercise, introducing trace/0 would have been unnecessary and potentially confusing.
Diversity of test cases: ChatGPT is correct here. I could have included more test cases. The specification will include more test cases in future semesters.
Acknowledge syntax variations: This is a little puzzling. Is ChatGPT warning about variations between dialects of Prolog? I clearly state we’re using SWI-Prolog. Warning students about common syntax errors is a different matter.
What ChatGPT didn’t catch
There’s a defect in one of my code snippets:
?- all_completed(alvin, [cs1210, cs2240].
false.
?- all_completed(alvin, [cs1210, cs1640, cs2100].
true .
?- all_completed(mary, []].
true .
The first two queries are missing a closing parenthesis, and the last query has a closing bracket rather than a closing parenthesis. It would have been nice if ChatGPT had caught these defects.
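For reference, the corrected queries can be sketched as follows. The completed/2 facts and the definition of all_completed/2 below are assumptions added only so the queries can be run; they are not taken from the assignment specification, and the recursive definition is just one possible approach.

```prolog
% Hypothetical facts (not from the assignment): courses a student has completed.
completed(alvin, cs1210).
completed(alvin, cs1640).
completed(alvin, cs2100).

% One possible definition, assumed for illustration:
% succeeds when the student has completed every course in the list.
all_completed(_, []).
all_completed(Student, [Course|Courses]) :-
    completed(Student, Course),
    all_completed(Student, Courses).

% Corrected queries: each list is closed with ] and each call with ).
% ?- all_completed(alvin, [cs1210, cs2240]).
% ?- all_completed(alvin, [cs1210, cs1640, cs2100]).
% ?- all_completed(mary, []).
```

With these assumed facts, the first query fails because there is no completed(alvin, cs2240) fact, while the other two succeed, which agrees with the answers shown in the snippet above.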
Another thing that was missed (and would have been helpful in lab) was to remind students that Prolog syntax requires that atoms start with a lower-case letter, and that variables start with an upper-case letter. Out of 35 students in 11 teams, three teams had to revise code after using upper-case letters for atoms (and getting abundant error messages from Prolog).
Summary
While ChatGPT provided some useful feedback, it included some useless or inappropriate suggestions as well. Useful feedback would have required some editing.
Claude’s response
Commentary on Claude’s feedback
Claude’s response is quite concise, and comments are nicely grouped.
Expected output: The suggestion “Could include a brief example of expected output format for the facts section” isn’t at all clear. I have no idea how to act on this.
More scaffolding for findall/3: We covered the use of findall/3 in class, but it might have been helpful to show a sample findall/3 query by way of example. I have incorporated this change for future semesters.
Providing hints about typical pitfalls: Claude made this suggestion which ChatGPT missed (or obscured with vague language).
Fun: Claude commented on the final section of the assignment which was there only for fun. Claude correctly identified characters from Alvin and the Chipmunks: Alvin, Dave, Simon, and Theodore. However, it was unable to identify the source of the names: Betty, Diana, Florence, and Mary. Can you?
What Claude didn’t catch
Like ChatGPT, Claude didn’t catch the syntax errors in this section:
?- all_completed(alvin, [cs1210, cs2240].
false.
?- all_completed(alvin, [cs1210, cs1640, cs2100].
true .
?- all_completed(mary, []].
true .
(See above for explanation of defects.)
Summary
Claude’s responses were of a similar quality to ChatGPT’s, but were better organized and more concise.
It’s interesting that neither model detected errors in one code snippet provided in the specification!
My revisions
While I asked ChatGPT and Claude to review and comment, I did not ask them to rewrite or revise the assignment. I revised the assignment myself, based on their feedback. In this way, I retain control over the exercise, authorship, and voice.
Here are the revisions. Add the following:
Write, consult, test with queries
Write facts and rules in a .pl file and test at a SWI-Prolog prompt. Then consult the file with
?- [enrollment].
If you make changes to your .pl file you can update Prolog’s memory with
?- make.
Don’t forget the .
Prolog facts, rules, and queries all end with a period (.).
Case
Unlike many other languages, Prolog uses case to distinguish atoms from variables: atoms start with a lower-case letter, and variables start with an upper-case letter. For example, we write facts like this:
blue(sky). % these are facts
green(grass).
Here is a rule which makes use of a variable, X:
early_to_rise(X) :- % X is a variable
    happy(X),
    early_to_bed(X).
and revise the findall/3 section to include an example.
findall/3
Recall that findall/3 is a three-place predicate. The third argument is the variable name of the returned list. This is convenient when you wish to pass a list constructed by findall/3 to another predicate downstream (which you will do in this assignment).
Example:
?- findall(X, happy(X), HappyPeople).
This finds all X which satisfy the predicate happy(X), returning the result in a list named HappyPeople.
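To see the query in action, here is a minimal sketch. The two happy/1 facts are hypothetical and exist only to make the example runnable; length/2 is a standard built-in, used here to stand in for "another predicate downstream."

```prolog
% Hypothetical facts, added only so the query has something to find.
happy(alvin).
happy(simon).

% ?- findall(X, happy(X), HappyPeople).
% HappyPeople = [alvin, simon].

% The resulting list can be passed to another predicate, e.g. length/2:
% ?- findall(X, happy(X), HappyPeople), length(HappyPeople, N).
% HappyPeople = [alvin, simon],
% N = 2.
```

Note that findall/3 succeeds with an empty list when no solutions exist, rather than failing.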
I fixed the syntax errors on my own, since neither Claude nor ChatGPT detected them.
What if I were to ask an LLM to write the assignment for me?
The prompt
ChatGPT’s response
Commentary on the exercise written by ChatGPT
This looks and feels generic and uses the names commonly used in examples on-line (which I try to avoid). It is not tailored as well to the students in the course, and leaves out many details I think are important. It leaves out many of the suggestions that ChatGPT made on the human-written assignment, and leaves out many things that were considered positive in review of the human-written assignment.
ChatGPT did come up with something unexpected, that is, what it refers to as a “prerequisite chain.” While this is interesting, it’s neither germane nor correct. A course could have multiple prerequisites, and prerequisites might not form a linear chain.
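A small sketch shows why "chain" is the wrong model. The prerequisite/2 facts and the helper direct_prerequisites/2 are invented for illustration; the course relationships are assumptions and do not describe any actual catalog.

```prolog
% Hypothetical facts: prerequisite(Course, Required).
prerequisite(cs2240, cs1210).
prerequisite(cs2240, cs1640).
prerequisite(cs2100, cs1210).

% Hypothetical helper: collect every direct prerequisite of a course.
direct_prerequisites(Course, Required) :-
    findall(R, prerequisite(Course, R), Required).

% ?- direct_prerequisites(cs2240, Required).
% Required = [cs1210, cs1640].
```

Here cs2240 has two prerequisites, and cs1210 is required by two different courses, so the relation forms a graph rather than a single linear chain.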
Overall, the responses from LLMs here suggest that LLMs work better as guides or advisors, rather than authors, at least in this context. While it is possible that the generated assignment could be improved with additional prompt engineering, it’s unlikely that doing so would produce an assignment comparable in quality to the human-written assignment in less time. That is, it’s probably more efficient to write the assignment oneself than to go through a lengthy, iterative approach which does not guarantee an equivalent result.
Claude’s response
Commentary on the exercise written by Claude
This is somewhat better than the result from ChatGPT. While it still has a generic feel to it, it is better structured, it lists target learning outcomes, and it breaks the activity down into nicely sized chunks.
Some nice things:
- Reminders about base cases for recursion (which, alas, always bear repeating)
- “Check your understanding” questions
- Hint section
There are some issues:
- Claude included a little detour with “Write a predicate count_courses/2 that counts how many courses a student has completed.” This is not necessary for the task at hand.
- At the stage in Prolog learning we’re at for this assignment, students haven’t seen cuts (with !), and cuts are not necessary for any of the tasks in this exercise.
While this is not an adequate specification, it could serve as a starting point, but it would require substantial additions and revisions.