What many get wrong about using LLMs in content development

Sarah Packowski
6 min read · Mar 12, 2024

I’m talking specifically about using large language models (LLMs) to create support content like product documentation.

Burgundian scribe (portrait of Jean Miélot, secretary, copyist and translator to Duke Philip the Good of Burgundy, 15th century). Source: Wikimedia Commons

Writing isn’t the hard part

In my experience over 20 years as a writer of software product documentation, crafting the sentences and paragraphs of an article is not the most difficult or time-consuming part of the job.

The most challenging part of creating effective product documentation is what happens before you start writing:

  • Concepts — Figuring out which concepts users need to understand for the product interface and workflows to make sense.
  • Tasks — Figuring out what tasks need to be explained and figuring out how to explain those tasks without readers becoming overwhelmed, frustrated, or impatient and abandoning the documentation.

Understanding is the hard part

An effective content strategy requires understanding:

  • Users — Their perspective, what mental models they bring with them, what previous experience they’ve had, what jargon they use, and their assumptions and expectations.
  • The product — The mental model behind its design, its features, how to accomplish something using those features, and when and why someone would use those features.

Understanding users

You can gain an understanding of your product’s users in a variety of ways:

  • Sit with real users as they perform tasks, guided by draft documentation
  • Run workshops where you can observe, question, and listen to participants as they step through the product-related workshop tasks
  • Read comments and questions wherever your users naturally chat about using your product (e.g., community forums, Reddit, Stack Overflow, or Slack channels)

Understanding the product

Gaining an understanding of your product can be tricky when the product doesn’t exist yet. Back when dinosaurs roamed the earth, there might have been detailed specification documents for reference. But these days, it’s a bit more like reading tea leaves:

  • Participate in early design discussions to see what’s coming
  • Observe when in-development features are demonstrated to the team for input and feedback
  • Listen to discussions in meetings
  • Read comments in places like GitHub issues or Slack channels
  • Ask product managers and developers directly: What is this? What does it do? How are people supposed to use it? Why would they? etc.
  • Try using in-progress versions of the product in test or development environments

The hardest, most time-consuming part of content development is getting inside other people’s heads, whether that’s users or product team members.

Tomb of Nebamun. A scribe reads a papyrus, a writing palette under the arm. Source: Wikimedia Commons

The role of LLMs: rewriting, not writing

In the context of customer support, large language models have the potential to be really useful for providing users with just the information they need, where they already are, right when they need it.

Example 1: RAG for external product documentation

Imagine you have a large product with many features. And imagine you have well-crafted documentation that explains all the concepts users need to understand to be successful with those features. Using a retrieval-augmented generation (RAG) solution, you could provide users with an interface to ask questions about just the concepts they need for a given use case. This is game-changing for how users navigate and consume product documentation.

(If you have tried to build conversational assistants with hard-coded dialog turns, there really is no comparison. Integrate speech-to-text and text-to-speech for natural voice interfaces, bring the interaction into the product GUI, or throw the information up on an augmented reality headset, and the sky’s the limit.)

The brilliance of RAG is that the LLM doesn’t have to know anything. In RAG solutions, you search your knowledge base for relevant content and then paste that content into your prompt to give the LLM the required facts:

Article:
###
With feature X, you can calculate average rainfall in a given year.
###

Answer the following question using only information from the article.
If there is no good answer in the article, say "I don't know".

Question: What can you do with feature X?
Answer:

When prompted with the preceding text, all the LLM has to do is pick out the correct details from the article provided in the prompt and then generate a response in the proper format to answer the question. The LLM isn't generating an answer based only on the word associations baked into its weights during pre-training; it's rewriting the content given in the prompt.
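
To make that flow concrete, here is a minimal sketch of the retrieve-then-prompt pattern in Python. The search_knowledge_base and call_llm functions are hypothetical placeholders for whatever search index and LLM endpoint you use; only the prompt shape comes from the example above.

def build_rag_prompt(article: str, question: str) -> str:
    """Assemble a grounded prompt: the retrieved article is pasted in,
    and the model is told to answer only from that text."""
    return (
        "Article:\n"
        "###\n"
        f"{article}\n"
        "###\n\n"
        "Answer the following question using only information from the article.\n"
        "If there is no good answer in the article, say \"I don't know\".\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

def answer_question(question: str) -> str:
    # 1. Search the knowledge base for the passage most relevant to the question.
    article = search_knowledge_base(question)  # hypothetical search call
    # 2. Paste that passage into the prompt so the LLM only has to rewrite, not recall.
    prompt = build_rag_prompt(article, question)
    # 3. Generate the answer with whatever LLM you use.
    return call_llm(prompt)  # hypothetical LLM call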

Note that for RAG solutions to work well, you need to have your content ducks in a row:

  1. Coverage — You must anticipate users’ needs so the right content exists for the solution to mine.
  2. Search — Content must be optimized for search so the solution can find the relevant information for a given user question. (A minimal indexing sketch follows this list.)
  3. Writing style — Content must be written in such a way that the LLM can successfully pull out and reassemble the right bits and pieces to generate a useful answer for the user. (You can read more about this here: Adapting content for AI)
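
To make the search requirement concrete, here is a minimal sketch of one way to index documentation chunks for semantic search. The chunks, the model name, and the search_knowledge_base function are illustrative assumptions; an index like this could sit behind the placeholder in the earlier sketch.

from sentence_transformers import SentenceTransformer
import numpy as np

# Hypothetical documentation split into small, self-contained chunks;
# retrieval works on chunks like these, not on whole pages.
chunks = [
    "With feature X, you can calculate average rainfall in a given year.",
    "Before using feature X, connect a data source in the settings panel.",
    "Feature Y exports reports as PDF or CSV files.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed, commonly used embedding model
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

def search_knowledge_base(question: str) -> str:
    """Return the chunk most semantically similar to the question."""
    question_vector = model.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vectors @ question_vector  # cosine similarity (vectors are normalized)
    return chunks[int(np.argmax(scores))]

print(search_knowledge_base("What can you do with feature X?"))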

Example 2: RAG for internal process information

LLMs can help writers with the task of understanding users and understanding the product. For example, if you have transcripts from your user testing, your workshops, and your product team meetings, and if you collect users’ comments from forums, you can use that content as the knowledge base for a RAG solution. You could then cluster comments on a theme, paraphrase the most common questions, and summarize technical discussions. This kind of solution is useful and potentially time-saving. (But remember the key requirements for effective RAG solutions: coverage, search optimization, and writing style. The information gaps, lack of search optimization, and noisiness of the source material in this case will impact the effectiveness of the solution.)
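
As a rough sketch of the clustering idea, here is one way to group collected comments by theme with off-the-shelf embeddings. The comments, model name, and cluster count are illustrative assumptions, not part of any real dataset.

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Hypothetical user comments collected from forums, workshops, and transcripts.
comments = [
    "How do I export my report as a PDF?",
    "Exporting to PDF keeps failing for me.",
    "Can feature X average rainfall by month instead of by year?",
    "Is monthly rainfall averaging on the roadmap?",
]

# Embed each comment, then group similar comments so common themes
# (and the most frequently asked questions) become visible.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(comments, normalize_embeddings=True)
labels = KMeans(n_clusters=2, n_init="auto", random_state=0).fit_predict(embeddings)

for label, comment in sorted(zip(labels, comments)):
    print(label, comment)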

Example 3: Correcting grammar, spelling, and style

Of course, LLMs can correct grammar and spelling. There are already tools available for this (e.g., Acrolinx and Grammarly). And LLMs can be fine-tuned to rewrite draft content to conform to any writing style and format guidelines that your team requires. This is valuable from a quality perspective, but might not actually save professional writers much time.
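
For instance, here is a minimal sketch of a style-correction prompt. The style rules are made up for illustration, and call_llm is the same hypothetical LLM call as in the earlier sketches; a fine-tuned model could be used instead of prompting.

STYLE_RULES = (
    "- Use sentence-style capitalization in headings.\n"
    "- Use present tense and second person.\n"
    "- Spell out abbreviations on first use.\n"
)

def rewrite_to_style(draft: str) -> str:
    # Ask the model to rewrite the draft so it conforms to the style rules
    # without adding or removing information.
    prompt = (
        "Rewrite the following draft so it follows these style rules.\n"
        "Do not add or remove information.\n\n"
        f"Style rules:\n{STYLE_RULES}\n"
        f"Draft:\n###\n{draft}\n###\n\n"
        "Rewritten draft:"
    )
    return call_llm(prompt)  # hypothetical LLM call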

The real strength and value of LLMs for product documentation and related processes is not their ability to write content from scratch, but their ability to rewrite provided content in whatever format you need.

Upper case and lower case types for a typical eighteenth century printing press. Source: Wikimedia Commons

Some leaders have the wrong idea

Unfortunately, some people hope to increase the “efficiency” of producing product documentation by using LLMs to generate the documentation, with the idea that humans would correct any mistakes. But for product documentation, that process is back to front! While an LLM can rewrite content you give it, it cannot perform the essential, challenging, and strategic work of deciding which concepts and tasks need to be documented.

Worse, some hope that “efficiency” gained using LLMs could allow companies to reduce the number of documentation writers. But that would be the opposite of what’s needed! For your customer-facing RAG solutions to be successful, you’re going to need more writers than ever — to get into the heads of users and product teams, to strategically decide what needs to be documented, and then to optimize that content for LLMs to consume.

The “Draughtsman-Writer” automaton, built by Henri Maillardet ~1800. Source: Wikimedia Commons

Conclusion

Some people say LLMs are just auto-complete on steroids. Those people are not wrong. But they’re saying it like that’s a bad thing. It’s not! For product documentation, the superpower of LLMs is not generating content based only on facts baked in via word associations during pre-training. The superpower of LLMs is their ability to consume your domain-specific content and then rewrite it in the form you need.
