The right way to include AI-generated content in your work

Sarah Packowski
Nov 19, 2023

It turns out the same techniques writers, scientists, and journalists have been using for ages can help here too.

The state of play today

The other day, a colleague was sharing a presentation about employee feedback analysis. On one slide was a summary they had generated using a large language model. Because generative AI is still pretty new, we laughed and remarked on how impressive the generated content was. Nevertheless, we all wanted to see the original data and make up our own minds. We were entertained by the generated content, but we didn’t trust it.

The near future

With exposure, we’ll all gradually become more comfortable with AI-generated content. We’ll expect presentations to have an AI-generated summary just like we expect an agenda today. But whether or not we trust and embrace generated content will depend on the norms that develop and evolve in the coming months and years.

For example, if people habitually sneak in AI-generated content without disclosing they’ve done so, society will become vigilant and focus on verifying and enforcing the humanity of content. On the other hand, if people adopt healthy AI habits, like transparency, our comfort with AI-generated content will grow and society will benefit from the technology.

Honestly, it looks like it could go either way at this point.

Healthy AI habits

Let’s imagine it’s 5 years from now, and things worked out for the best. Hooray! In this best scenario, what kind of healthy AI habits did people adopt?

1. Transparency

When your work includes content that has been generated or significantly altered by AI, clearly say AI has been used. Be specific.

2. Citation

Quote or otherwise mark text that has been generated or significantly modified by AI.

3. Explainability

Explain how you used AI, from your rough work to the final result.

4. Repeatability

Share model and prompt details so someone else could verify or reproduce your results. For in-context learning or retrieval-augmented generation solutions, where content is dynamically pulled into prompts, include that content (or stable links to that content) as well.
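As a rough illustration of repeatability, here is a minimal Python sketch of a record you could save alongside each generated passage. The class name, field names, model name, and link are all hypothetical placeholders chosen for this example, not part of any particular library or of the sample report.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class GenerationRecord:
    """Details a reader would need to try reproducing one generated passage."""
    model_id: str        # which model produced the text
    parameters: dict     # generation settings that were used
    prompt: str          # the exact prompt text that was sent
    # Content, or stable links to content, that was pulled into the prompt
    context_sources: list = field(default_factory=list)
    output: str = ""     # the generated text as it was received


record = GenerationRecord(
    model_id="example-model-v1",  # hypothetical model name
    parameters={"decoding": "greedy", "max_new_tokens": 200},
    prompt="Summarize the following feedback comments in one paragraph.",
    context_sources=["https://example.com/feedback-comments.csv"],  # placeholder link
    output="(generated summary goes here)",
)

# Serialize the record so it can be attached to the work as an appendix.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

Saving the record as JSON keeps it readable by people and easy to load again if someone wants to rerun the same prompt.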

These habits aren’t new

  • When a newspaper prints a story about its parent company, there is potential for a conflict of interest. For this reason, it’s common for such articles to contain a statement clarifying the relationship. That’s transparency.
  • When a researcher includes ideas or verbatim text from a historical source in their own work, the content is quoted in the body of their piece and there are usually footnotes where you can find source details. Citation is commonplace.
  • In academic papers today, there is often a “Method” chapter that describes the steps the authors followed to arrive at the results published in the paper. Knowing how the team approached the problem helps explain why the results are what they are. That’s explainability.
  • When an exposé breaks a story about government corruption, the journalists are very careful to share all the data that supports their claim. This way, skeptics can draw their own conclusions from that data. Repeatability is how we know we can trust results.

Example: What might this look like in practice?

Imagine you are a team lead and you are taking your team through a reflection activity. Your team has used a mural (an online, collaborative whiteboard tool) to record their feedback. Now, you need to summarize the mural contents for reporting results.

You could use a large language model to help you accomplish this task:

  1. Identify themes in feedback
  2. Classify each feedback sticky note by theme
  3. Generate a summary of the sticky notes for each theme
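The three steps above can be sketched in Python roughly as follows. This is a simplified illustration, not the code from the sample notebook: the function names and prompt wording are assumptions, and the model call is passed in as a plain callable so the sketch does not depend on any particular LLM service. The fake_generate stand-in returns canned text so the example runs offline.

```python
from typing import Callable, Dict, List


def summarize_feedback(notes: List[str], generate: Callable[[str], str]) -> Dict[str, dict]:
    """Run the themes -> classify -> summarize steps with any text-generation callable."""
    joined = "\n".join(f"- {n}" for n in notes)

    # Step 1: ask the model for themes, one per line.
    themes_text = generate(f"List the main themes in this feedback, one per line:\n{joined}")
    themes = [t.strip() for t in themes_text.splitlines() if t.strip()]

    # Step 2: classify each note; use "Other" if the reply is not a known theme.
    by_theme: Dict[str, List[str]] = {}
    for note in notes:
        prompt = (
            f"Themes: {', '.join(themes)}\n"
            "Which one theme best fits this note? Answer with the theme only.\n"
            f"Note: {note}"
        )
        reply = generate(prompt).strip()
        theme = reply if reply in themes else "Other"
        by_theme.setdefault(theme, []).append(note)

    # Step 3: summarize the notes grouped under each theme.
    report: Dict[str, dict] = {}
    for theme, group in by_theme.items():
        group_text = "\n".join(f"- {n}" for n in group)
        summary = generate(f"Summarize these notes about '{theme}' in one sentence:\n{group_text}")
        report[theme] = {"notes": group, "summary": summary.strip()}
    return report


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical), returning canned text."""
    if prompt.startswith("List the main themes"):
        return "Meetings\nTooling"
    if prompt.startswith("Themes:"):
        note_text = prompt.rsplit("Note: ", 1)[-1].lower()
        return "Meetings" if "meeting" in note_text else "Tooling"
    return "Canned summary."


notes = [
    "Too many meetings on Fridays",
    "The build tool is slow",
    "Meeting agendas help a lot",
]
result = summarize_feedback(notes, fake_generate)
for theme, info in result.items():
    print(theme, "->", info["summary"], info["notes"])
```

To try this with a real model, you would replace fake_generate with a function that sends the prompt to your model of choice and returns its text response. Keeping the prompts in one place like this also makes them easy to publish with the results.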

You could then generate a report like this one: Team feedback report

Team feedback report, parts of which were generated by a large language model

This sample document isn’t fancy. And in a real-world scenario, you would need to include more information, such as how you solicited and collected the feedback comments. But this document demonstrates a few ways to implement the four healthy AI habits mentioned above:

  • Transparency: There is a statement about AI right at the beginning.
  • Citation: The generated content is highlighted in yellow.
  • Explainability: The themes-classify-summarize approach is explained.
  • Repeatability: The model and prompt details, as well as the raw feedback messages, are all included.
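If you assemble a report like this in code, the four habits can be built into the layout itself. The Python sketch below is one possible way to do that in plain text. It is an assumption-laden illustration, not how the sample report was produced: the render_report function, its parameters, and the model name are hypothetical, and bracketed [AI-generated] markers stand in for the yellow highlighting.

```python
def render_report(title, model_id, method, sections, raw_notes, prompts):
    """Build a plain-text report laid out around the four habits.

    sections: list of (heading, generated_summary) pairs.
    """
    lines = [title, ""]

    # Transparency: say up front that AI was used, and for what.
    lines += [f"Note: passages marked [AI-generated] were produced by {model_id}.", ""]

    # Explainability: describe the approach that was followed.
    lines += ["Method", method, ""]

    # Citation: mark each generated passage so it is not mistaken for human writing.
    for heading, summary in sections:
        lines += [heading, f"[AI-generated] {summary} [/AI-generated]", ""]

    # Repeatability: include the prompts and raw inputs so readers can check the results.
    lines.append("Appendix: prompts")
    lines += [f"- {p}" for p in prompts]
    lines += ["", "Appendix: raw feedback"]
    lines += [f"- {n}" for n in raw_notes]
    return "\n".join(lines)


report = render_report(
    title="Team feedback report",
    model_id="example-model-v1",  # hypothetical model name
    method="Themes were identified, notes were classified by theme, and each theme was summarized.",
    sections=[("Meetings", "The team wants fewer meetings.")],
    raw_notes=["Too many meetings on Fridays"],
    prompts=["Summarize these notes in one sentence."],
)
print(report)
```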

Try it yourself

The following sample notebook contains everything you need, from sample Python code to sample feedback messages, for using a large language model in IBM watsonx.ai to generate that report:

Summarizing feedback using an LLM

Identifying themes, classifying messages, and clustering sticky notes in MURAL

Other samples

See a collection of fun AI + MURAL samples here:
https://github.com/spackows/MURAL-API-Samples

Conclusion

Before long, writing style guides will be updated to standardize techniques like these. At present, the guidance is insufficient (see: FAQ How to cite AI-generated content, The Chicago Manual of Style). Of course, there are details to work out that are not mentioned here (see: How AI is your article?). But I feel optimistic. If you use AI-generated content in your work, lead by example with healthy AI habits. Teachers, instead of banning students from using AI, teach them healthy AI habits!

Sarah Packowski

Design, build AI solutions by day. Experiment with input devices, drones, IoT, smart farming by night.