AI-Generated Draft Review Guidelines
Purpose
This is a general set of guidelines to follow when using Large Language Models (LLMs) to produce curriculum. It is meant neither to encourage nor to discourage folks from using Generative Pre-trained Transformer (GPT)-based technologies. Instead, it is meant to ensure that our content creation process is in keeping with the code of conduct in the following ways:
- Safe and effective systems: Keeping our content accurate, free of plagiarism and in alignment with copyright laws
- Algorithmic discrimination protection: Scanning for discriminatory content and checking for tone
- Data privacy, notice, and explanation: Transparency about where we use AI to generate content and to what extent
- Human alternatives, consideration, and feedback: Rewriting, editing, proofreading, and seeking feedback wherever necessary
- Human-centered learning experiences: Making sure our content is engaging, in keeping with Codecademy style, and pedagogically sound (i.e., avoiding generic GPT-sounding material)
Checklist for Content Creators
Based on the above guidelines, we've created the following checklist to be followed for any content creation process that involves text generated by GPT-based technologies:
- [ ] Documentation: Save the prompt and the first draft generated by the LLM
- [ ] Plagiarism Sanity Check: Make sure that chunks of text/code are not plagiarized
- [ ] Accuracy Verification: Check for technical and non-technical accuracy
- [ ] First Review (optional): If the content creator is not a subject matter expert, seek a review
- [ ] Tone and Style Corrections: Edit tone and style so the content is engaging and matches Codecademy's style
- [ ] Content Standards Check: Check the content standards for the individual content type
Checklist Details and Rationale
1. [x] Documentation
The first step is to record any use of GPT-generated content. Saving the prompt used (including prompt parameters if possible!) and the first draft generated is useful for both the content creator and the reviewer. It also allows us to compare the initial and final versions to make sure our content meets all the standards laid out in the code of conduct.
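As one possible way to keep such a record, the sketch below stores the prompt, its parameters, and the first draft in a JSON file. It is only an illustration: the `GenerationRecord` class, its field names, and the `save_record`/`load_record` helpers are hypothetical and not part of any existing Codecademy tooling.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class GenerationRecord:
    """Hypothetical record of one LLM-assisted draft (field names are illustrative)."""
    prompt: str
    first_draft: str
    model: str = "unknown"  # model name, if known
    parameters: dict = field(default_factory=dict)  # e.g. temperature, if available
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def save_record(record: GenerationRecord, path: str) -> None:
    """Write the record to a JSON file so a reviewer can inspect it later."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(record), f, indent=2, ensure_ascii=False)


def load_record(path: str) -> GenerationRecord:
    """Read a previously saved record back from disk."""
    with open(path, encoding="utf-8") as f:
        return GenerationRecord(**json.load(f))
```

A content creator could call `save_record(GenerationRecord(prompt=..., first_draft=...), "record.json")` right after generating the draft and attach the file to the review request.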
2. [x] Plagiarism Sanity Check
We advocate for a brief plagiarism sanity check. Entire chunks of generated text may resemble content from a competitor's platform. To make sure we don't violate copyright rules, it is worth performing a 5-10 minute internet search and checking popular free resources.
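If a passage found during the search looks suspiciously close to the draft, a rough overlap score can help decide whether it needs rewriting. The sketch below measures how many of the draft's five-word sequences also appear in a candidate source. The function names and the choice of five words are assumptions made for illustration, and the score supplements the manual search rather than replacing it.

```python
def word_ngrams(text: str, n: int = 5) -> set:
    """Return the set of lowercase word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(draft: str, source: str, n: int = 5) -> float:
    """Fraction of the draft's word n-grams that also appear in the source."""
    draft_grams = word_ngrams(draft, n)
    if not draft_grams:
        # Draft is shorter than n words, so there is nothing to compare.
        return 0.0
    shared = draft_grams & word_ngrams(source, n)
    return len(shared) / len(draft_grams)
```

A ratio near 1.0 means most of the draft's phrasing also appears in the source and deserves a closer look. Punctuation is not stripped here, so near-identical passages that differ only in punctuation will score lower than they should.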
3. [x] Accuracy Verification
At the moment, GPT models are accurate roughly 70-80% of the time. They are being tuned and improved, but this is still a large margin of error. Checking that all the content is accurate is essential to maintaining trust and integrity with our learners.
4. [x] First Review (optional)
If the content creator is not an expert in the subject matter or is unsure about the accuracy of the material, they should seek a review from another Curriculum Developer or subject matter expert (SME) to confirm that it is accurate.
5. [x] Tone and Style Corrections
GPT-generated content can have a distinct and identifiable tone and style. Since we expect that many platforms will be inundated with GPT-generated text, it is worth taking extra care with tone and style so that we don't put our learners off. The content creator will need to make edits to ensure that the content is engaging and in keeping with Codecademy's style.
6. [x] Content Standards Check
The final step is to check the content against the standards for its specific content item type, as laid out in the AI-integrated production process document.