
Author

Clayton Cafiero

Published

2025-01-31

Overview

In just a few years, LLMs such as OpenAI’s ChatGPT and Anthropic’s Claude, among others (collectively referred to here as “generative AI”), have dramatically altered work, social discourse, and education. They promise (and have in some cases delivered) breakthroughs in a wide variety of domains, from drug discovery and materials science to medical diagnosis and theorem proving. Clearly these tools can be put to good use to expand human knowledge. However, the same technology has inherent dangers, so we should be cautious about its use.

Who controls these models? Can we trust the output of these models? How can they be used to deceive, manipulate, and disenfranchise? These questions touch the very heart of UVM’s mission and the core principles embodied in our new slogan, “For People and Planet,” and in Our Common Ground.

Nevertheless, like the petroleum-powered automobile of the early 20th century, generative AI is upon us and it is not going away. Accordingly, it is our duty to find a way forward to put these tools to responsible and productive use.

Generative AI is a disruptive technology like no other. Its digital nature allows it to scale in ways that other technologies cannot. Thus change is rapid, and what was state-of-the-art six months ago is now obsolete. This makes it difficult to establish a uniform, long-term policy, since the landscape can change very quickly.

For example, the recent introduction of DeepSeek has been described as a “Sputnik moment,” and its launch precipitated a loss of roughly $1 trillion in market capitalization (Nasdaq Composite). Nvidia lost nearly $600 billion in market capitalization in a single day. Clearly there’s a lot in play, and developments in the field are proceeding at a breakneck pace. Reportedly, DeepSeek was developed for only a fraction of the cost of other US-based LLMs. If cost estimates are indeed accurate, DeepSeek shows us that models can be constructed, trained, and deployed quickly and at relatively modest cost. This may alter the landscape significantly, since other startups will no doubt learn from DeepSeek’s example (in fact, Oumi AI, a startup founded by engineers from Google and Apple, is trying to do just that).

While generative AI has produced impressive results, these models are not without their flaws. “As impressive as they are, state-of-the-art LLMs remain susceptible to brittleness and unhumanlike errors.” (Melanie Mitchell and David C. Krakauer, “The debate over understanding in AI’s large language models,” PNAS, March 2023).

It is tempting to use these tools to reduce labor costs. This has yet to play out fully in the marketplace, but many companies have already outsourced low-level coding and engineering work, along with some non-technical tasks, to generative AI. Again, this is not without risks. Several companies have encountered unexpected negative results in these efforts, and many are backpedaling. While not specifically involving generative AI, the recent case of Boeing shows the dangers of sacrificing human capital and sound engineering principles in the pursuit of near-term gains.

Use of generative AI can subtly shift our focus from process to product. As educators, it is of paramount importance that we not lose sight of the value of process, and the human element in our work. We should be filling students’ brains, not emptying them by outsourcing our thinking.

We must remain circumspect about the role we choose for AI assistance. Should we use it to generate lesson plans and course content? Or should we use it as a sounding board or advisor?

It is only through education, awareness, and commitment to sound ethical principles that we can put these tools to responsible and productive use, in keeping with our mission and in accord with Our Common Ground.

Guidelines for use in teaching and assessment or activity design

Use caution in offloading development of teaching materials or assessments to a generative AI model. Instructors know best how to align materials with target learning outcomes and what is level-appropriate for the students in their classes. That said, these models have their uses.

If you use an LLM to review and comment on an assessment or activity of your own design:

  • give plenty of context so the LLM is more likely to generate level-appropriate comments and suggestions,
  • submit the entire assessment to the LLM (so working with plain text, some flavor of TeX, or Markdown works best),
  • while many LLMs can read things like Word or PowerPoint documents, they can have occasional difficulty extracting text, so if you use these, it’s best to use structured formatting (e.g., using H1 and H2 headings rather than manually adjusting font size and applying boldface),
  • take positive feedback with a grain of salt, and
  • review suggestions carefully and make sure that any revisions you might make based on LLM review are in keeping with the pedagogical objectives of the assessment.

If you use an LLM to generate an assessment or activity:

  • give plenty of context so the LLM is more likely to generate level-appropriate materials,
  • be prepared to make substantive revisions,
  • verify everything the LLM produces—proofread carefully, and
  • cite your source in the assignment (e.g., ChatGPT 4o was used in designing this assessment), but be aware this sends mixed signals to students (i.e., the instructor may use these tools for their work, but the student may not).

Whether using an LLM for review and comment or generation:

  • be aware that LLMs have a stochastic nature: the same prompt issued at two different times might produce different results,
  • while improving at a rapid pace, remember these models remain brittle and error-prone—they can and do hallucinate, and all too often they are confidently incorrect—so verify everything,
  • if you find that proofreading and revising LLM output takes more time than writing on your own, then write on your own, and
  • consider your voice and your role in the process of education—don’t relinquish these to machines for the sake of some convenience.

FERPA

  • Never provide to any LLM or AI-based tool any prompt or data that might include student information. Ever.

  • If you ask an LLM for feedback on student work or for an assessment of the likelihood of plagiarism, always ensure that what you provide has all identifying information removed. When in doubt, take it out.

Deploy slowly and with caution

Already there are abundant cases of software developers having to undo and redo work that was AI-generated. There are many articles with titles like “When AI Promises Speed and Delivers Debugging Hell” or “AI is Creating a Generation of Illiterate Programmers.” While many companies are barreling ahead with automation and layoffs, many others have rolled back AI-based initiatives after complaints of “polluted code bases.” Several open source projects (e.g., NetBSD and Gentoo) have banned AI-generated code because of the difficulties it creates. Concerns aren’t limited to coding; the medical profession, for example, is raising red flags with regard to AI-assisted medical care.

The point is, if you adopt such tools, do so in small steps. That way if you get unexpected or disappointing results, the scope of the fix will be limited. Take an experimental approach.

Be aware of the cost of using AI

  • AI consumes a tremendous amount of energy and is already disrupting power grids. Some hope that LLMs will help solve these problems, but for the present they are creating challenges, and where fossil fuels are used to generate electricity, they have a significant carbon footprint.

  • AI generates a tremendous amount of e-waste.

  • Maintain your humanity. This might sound dramatic, but there’s a real risk of giving up much of what makes us human.

  • Understand that we do not control these LLMs; others do, and they have agendas. Be on guard for skewed or censored results.

  • Consider your role in abetting those who do not have at heart our interests, or the interests of the academy and the flourishing of human knowledge.

There is no end to prognostications—some quite dire—about the effect generative AI or AGI may have on humankind. While much of this is speculative, it’s still worth considering and monitoring.

Consider how generative AI models have been trained

There is widespread outcry about data harvesting for use in training LLMs: scholars, authors, and artists alike claim their work has been plundered for training data, without consent, remuneration, or attribution. These are serious and all too plausible claims, and they raise important ethical issues for users.

Ask yourself: Is it ethical to use a model which was trained on what amounts to stolen data? Should we use (or pay money for) such models when the authors of plundered content are not acknowledged or compensated?

Summarizing

These models can be quite good at summarizing an article or paper. They can also be used to replace or supplement technical documentation. However, this is no substitute for actually reading the material oneself. Don’t fall into the trap of thinking you know what’s in an article or paper or student’s essay unless you’ve read it yourself.

With regard to students, the ability of these models to summarize documents can be quite helpful—especially at the introductory level. In a recent Higher Education Policy Institute (HEPI) poll of UK undergraduate students, 66% of respondents thought it acceptable to use generative AI for explaining concepts and 53% thought it acceptable to use these tools for summarizing. But if LLMs become the only source, then students may not learn how to read and summarize an article, paper, or document on their own.

Cognitive off-loading and focus on product over process

These models tend to shift our focus—in ways both obvious and subtle—from process to product. They also tempt us to off-load cognitive tasks. In the near term, this may appear to be a labor saver. However, it does not always help us develop and exercise such skills on our own. This is dangerous enough for experts in the field. It can be disastrous for students who haven’t yet acquired such skills or learned how to produce content on their own.

We should all understand the costs and impacts that use of generative AI has on ourselves, our students, our institution, and our world, and this understanding should inform our use of these tools, or indeed our choice not to use them.

Some resources

  • Stanford University’s Human-Centered Artificial Intelligence
  • Oxford Institute for Ethics in AI
  • Harvard University’s Berkman Klein Center
  • Institute for Ethical AI & Machine Learning
  • Ethics and Governance of Artificial Intelligence Initiative
  • Machine Intelligence Research Institute

Some tools

  • Anthropic Claude 3.5 Sonnet
  • ChatGPT 4o

Of these two, I find Claude marginally better than ChatGPT 4o, but certainly results will vary based on the discipline and level of the course.

While DeepSeek has received considerable press, I recommend avoiding using this for the time being, since it’s not clear how private this tool really is.

Other LLMs under investigation include Meta’s Llama and Google Gemini.

I have not yet evaluated InstructGPT, but will be investigating in the future.

I have not yet evaluated Copilot except in the context of an integrated development environment (IDE) for writing code. However, I will investigate in the future.

While there are many AI-powered tools for grading (for example, CoGrader), most are geared toward grading essays (think: glorified spelling and grammar checkers, with some ability to follow an argument in an essay). Such tools are not discussed here, but I will continue to monitor their rapid development.

Quick tips

Prompt engineering

Constrain the problem

Try to supply constraints wherever possible. Adding constraints can generate more focused responses.
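
For example (illustrative only; adjust the constraints to your own discipline and course level): rather than asking “Can you give me some exercises on Python loops?”, try “Can you give me three exercises on Python for loops that use only variables, input and print, arithmetic, and conditionals? Do not use lists, functions, or while loops, since we haven’t covered those yet.”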

Limit the scope or length of response

As these models tend to be long-winded, adding a constraint on the size of the response can be very helpful.
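
For example, an illustrative constraint appended to a prompt: “Please limit your response to a single paragraph of no more than 100 words, and do not use bullet points.”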

Provide context

Providing context is crucial for generating level-appropriate output likely to align with course objectives and learning outcomes.
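
For example, an illustrative preamble to a prompt: “I teach an introductory programming course for first-year students with no prior programming experience. So far we have covered variables, conditionals, and loops, but not functions or lists. The exercise below is worth 10 points and should take about 20 minutes.”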

Use one-shot or few-shot learning

One-shot learning or few-shot learning involves providing a specific model for output by way of example. Don’t leave it up to the LLM to make decisions for you. Show the LLM what you want.

OK

Can you give me five exercises that require application of Gauss-Jordan reduction?

Better

Here’s an example of what I’m after.

Solve:

\begin{align*}
x + y - 2z &= -2 \\
y + 3z &= 7 \\
x - z &= -1
\end{align*}

Can you solve this demonstrating the application of Gauss-Jordan, and then produce five more problems of similar difficulty and number of variables?

Be prepared to verify everything

Developers of all these models warn that they can and do make mistakes and that output should be reviewed and verified by humans. This problem is compounded by the fact that much of the output of these models—when they are wrong—often seems quite plausible.

The more you generate with these models, the more you have to fact-check them. In many cases, fact-checking and editing model output may take more time than generating content oneself—sometimes far more time.

Be prepared to edit and tidy up output

Output of these models may not be arranged, organized, or formatted in a way that suits a particular need. Prompt engineering may or may not ameliorate the situation. We are often left with content that must be edited substantially. The more we must revise model output, the less time we save by using such models.

Understand what these models do and do not retain across sessions

A naïve user may reasonably assume that each session (chat) is its own workspace, and that one session cannot affect results of another, but this is not generally the case. As a result, one session might subtly (or otherwise) skew or alter output. If you play the role of student in one session (for experimental purposes) and play the role of educator in another, be on the lookout for pollution or leakage between sessions. These models can retain certain knowledge of prior chat sessions even when these sessions have been deleted.

Next time you log in to Claude or ChatGPT, ask it this question: “What do you know about me?” You might be very surprised at the result.

You can ask these models to “forget” what they know about you or prior queries, but it’s not entirely clear what still might be retained.

Recognize the stochastic nature of these models

These models are, at root, statistical machines. Be aware that the same prompt, provided to the same model at different times, may produce significantly different results.

Compare the output of different models if possible

Because output of these models is stochastic, and because they can and do make mistakes, consider providing the same prompt to two or more models and comparing results. Of course, this takes extra time, but often it is quite illuminating.
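
By way of illustration, here is a minimal Python sketch that sends the same prompt to Claude and ChatGPT via their official SDKs and prints both responses for side-by-side comparison. It assumes the anthropic and openai packages are installed, that API keys are available in the ANTHROPIC_API_KEY and OPENAI_API_KEY environment variables, and that the model names shown are current (they change frequently).

import anthropic
from openai import OpenAI

PROMPT = (
    "Produce five exercises requiring application of Gauss-Jordan "
    "reduction, each with three equations in three unknowns."
)

# Query Anthropic's Claude. The client reads ANTHROPIC_API_KEY from the environment.
claude = anthropic.Anthropic()
claude_reply = claude.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name; check current offerings
    max_tokens=1024,
    messages=[{"role": "user", "content": PROMPT}],
)

# Query OpenAI's ChatGPT. The client reads OPENAI_API_KEY from the environment.
openai_client = OpenAI()
gpt_reply = openai_client.chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[{"role": "user", "content": PROMPT}],
)

# Print both responses so they can be compared directly.
print("=== Claude ===")
print(claude_reply.content[0].text)
print()
print("=== ChatGPT ===")
print(gpt_reply.choices[0].message.content)

Even a quick comparison like this often reveals differences in level, emphasis, and correctness between models.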

Be prepared to cut and run

These models can hallucinate, and they can, and often do, get stuck. Here’s an anecdote (alas, I did not preserve these chats). In experiments, I tried asking ChatGPT and Claude to produce a graph on ten nodes, with weighted edges and heuristics associated with each node, such that the heuristic is admissible and consistent (these are simple numeric constraints). Both models failed by producing graphs for which the constraints were not satisfied. In follow-up prompts, I pointed out that such-and-such a node violated the requested constraint. The response from both models was along the lines of “Yes. Thank you for pointing this out. Here’s an updated graph with the desired properties,” and then they would produce either exactly the same graph with exactly the same defect or a different graph with similar violations of the constraints. When this happens, it’s often best to cut and run. You can end up doing battle with prompts for longer than it would take to produce an example on your own.

Don’t lose your voice

LLMs tend to produce generic-sounding text. Even if you include suggestions in prompts (e.g., “Please use a breezy conversational style, peppered with bits of wry humor”) the voice of an LLM is not yours. Don’t erase yourself. Part of what makes an educator effective is being able to form connections with students. LLMs are faceless and anonymous. Don’t rob your students of those connections. Don’t relinquish your voice.

Academic integrity

Faculty

If faculty include the output of generative AI in teaching materials, assessments, and the like, they should be obliged to follow the same guidelines for citing sources as is customary in academic writing, and should be in accord with UVM policies for academic integrity. If we expect students to cite sources, including generative AI if used, faculty should do the same in all cases.

But this presents a problem: if we use generative AI to produce teaching materials, what message are we sending to students? If AI-generated content can’t be submitted by a student as one’s own work or as evidence of learning, why, then, would it be acceptable for faculty to do so?

Instructors would like to think they’re good at detecting text produced by generative AI. An astute student might be just as good (or better!) at such detection. What message do we send students if we present output of generative AI as our own?

Students pay tuition to have access to faculty, who are expert in their fields and who help students learn. They are not paying tuition to have the institution serve as an intermediary between them and an LLM.

Students

Instructors should develop unambiguous policies regarding acceptable and unacceptable use of generative AI. Students are often unsure and in need of guidance. Students should also be given clear statements about how they will be assessed.

Weighting of assessments

Because students have access to generative AI tools, it may be appropriate to shift the weighting of assessment more toward in-class quizzes, exams, and active learning exercises, and reduce the weight of homework. This may come at a cost. For example, pencil-and-paper exercises may take longer to grade than digital submissions. In such cases, AI can indirectly increase the grading burden taken on by instructors and TAs. However, this may be an acceptable trade-off, since direct assessment without access to digital resources (including generative AI) is the gold standard for measuring what students really know about a subject.

Some good news; some bad news

In the aforementioned UK survey from the Higher Education Policy Institute (HEPI), only 3% of students thought it acceptable to use AI-generated content in assessments without editing. The bad news is that it’s unclear what students might think is an acceptable amount of editing or revision.

Syllabus language

Address generative AI in your syllabus. Here is a specimen:

“Academic integrity: The Department of Computer Science enforces UVM’s Code of Academic Integrity. Any suspected violation of this policy will be referred immediately to UVM’s Center for Student Conduct (/sconduct). Sanctions for a violation may include a grade of XF in the course. Additional violations can result in dismissal from the university. In a word: Don’t. All students should read and understand this policy.

“Collaboration on quizzes and exams is strictly prohibited. Use of online services as a source of solutions is strictly prohibited. Using generative AI such as ChatGPT or Claude, or websites such as Chegg or Course Hero to complete coursework is a form of academic dishonesty. Work you submit for an individual grade must be your own. Any work not produced by you must be cited. For certain assignments, students may collaborate on homework (typically limited to teams of two). If you collaborate with another student on an assignment, be sure to indicate team members as specified. If you have any questions, ask!

“Any attempt to tamper with or defeat any autograder is a form of academic dishonesty. This applies wherever autograders are in use, for example on Brightspace or Gradescope.

“All code submitted by students is subject to code similarity review.

“Exams, quizzes, homework assignments, answer keys and solutions, presentations or lecture notes, specifications and rubrics are copyright-protected works, unless clearly and explicitly indicated otherwise. Any unauthorized copying or distribution of protected works is a violation of federal law and may result in disciplinary action. This includes submission of protected works as prompts to generative AI. Sharing of course materials without the specific, express approval of the instructor may be a violation of the University’s Code of Academic Integrity and an act of academic dishonesty, which could result in disciplinary action. Violations will be handled under UVM’s Intellectual Property Policy and Code of Academic Integrity, as appropriate.”

Specimens for use on Brightspace

General guidelines

“Any work you submit for grading must be your own.

“In some cases it’s appropriate to cite a collaborator or reference source in the docstring of a file, but anything not produced by you must be cited. That said, it is not the case that anything goes so long as it’s cited. Learning comes from doing, and nowhere is this more true than in learning how to program. Invest the time and effort in producing your own work, and you’ll find that despite being challenging (and occasionally frustrating) programming is fun and rewarding. Take this opportunity to build a solid foundation for future coding and future courses.

“We screen homework submissions for code similarity (and changing variable names won’t prevent highly similar code from being flagged as such). This AI-powered functionality is built into Gradescope, but your code will also be reviewed by me or a TA.

“I also test tools like ChatGPT or Claude on homework specifications to see what these tools produce.

“All homework should make use of language features that have been presented (so far) in the course. You may not use features that haven’t yet been presented.

“As a general rule, you should be able to explain your code, clearly and simply, line by line. If you can’t do that, then likely there’s a violation taking place.

“The course policy on academic integrity can be found in the syllabus. If you have questions, ask!”

Citing

Again, here is a specimen used in Brightspace.

“In submitted programming assignments, you should include citations within the docstring of any and all file(s) that make use of outside sources or involve collaboration. Here is an example of citing consultation with generative AI (ChatGPT, Claude, etc.).

“If you cite generative AI as a reference source, please include the prompt you used. Please see the video Legitimate and illegitimate uses of generative AI for more.

"""
Egbert Porcupine
eporcupi@uvm.edu
CS1210

I consulted ChatGPT for help with format specifiers. I provided this prompt:

How can I use a Python format specifier in an f-string to format a number 
right aligned to two decimal places precision?
"""

def f_to_c(f):
    return (f - 32) * 5 / 9


if __name__ == '__main__':
    deg_f = float(input("Enter degrees Fahrenheit: "))
    print(f"{'Degrees F':>12} {'Degrees C':>12}")
    print(f"{deg_f:>12.2f} {f_to_c(deg_f):>12.2f}")

“Be sure to include such citations wherever AI tools have been used.”

Talking to students

  • Speak to your students about the use of generative AI. Set expectations and guidelines, and wherever possible, indicate clear boundaries which must not be transgressed.

  • Consider preparing a video for delivery via UVM’s Streaming Service for use in Brightspace.

Assignments where use of generative AI is permitted

It may be the case that you allow, encourage, or even require students to use generative AI. This is fine, and AI literacy will no doubt become more and more important in years to come. It is crucial that students learn how to use such tools effectively. It is also crucial that students understand the pitfalls of using such tools.

If you create such assignments, give clear guidelines as to what is and is not sanctioned.

It may be helpful to have students complete some assignments where AI is permitted and others where it is not, and have them record the amount of time on task for each. This could be illuminating.

IDE assistants / Copilot

Built-in generative AI assistance in integrated development environments (IDEs) has become ubiquitous. Once little more than glorified auto-complete, these tools can now anticipate a coder’s work and suggest substantial auto-generated content. In the world of coding, this is almost inescapable. These features are fine if you’re an experienced programmer and can judge when it’s OK to incorporate such content into your code, when to incorporate it with edits, and when to ignore the suggestion. For beginners, this can short-circuit learning. Moreover, these tools have no sense of what is level-appropriate for a given course. For example, they often recommend intermediate-level code, even if the student is in an introductory course. Accordingly, it’s important to talk to your students about this, and demonstrate, if you can, examples of both good and unwelcome suggestions.
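
For example, here is an illustrative (hypothetical) contrast between what an introductory student might be expected to write and the kind of more advanced suggestion an assistant might offer:

values = [3, 1, 4, 1, 5]

# What an introductory student is expected to write:
total = 0
for value in values:
    total = total + value
print(total)

# What an AI assistant might suggest instead (idiomatic Python,
# but not level-appropriate if built-ins like sum() haven't been covered):
print(sum(values))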

Other

Notice regarding use of generative AI in preparing this document

While transcripts of sessions with LLMs are included here or accompany this document by way of illustration, no generative AI was used in writing this. These words are my own.


