Question-driven content design
A new strategy for the age of RAG.
What is content design?
Content design is a process for creating content — product documentation, for example — that gives readers information they need, when they need it, where they already are:
- Determine what information an audience needs
- Create text, images, audio, or video that makes it easy for the audience to understand and apply that information
- Publish that information in a way the audience can easily find, navigate, and search
Traditional content design approach
A common approach to planning what to write looks something like this:
- Identify concepts users need to understand, tasks users will perform, and reference information users need to perform those tasks
- Break the information into pieces, so you can write an individual article for each piece
- Map out how all the information pieces will fit together
What is RAG?
Retrieval-augmented generation (RAG) is a technique for using a large language model to generate reliable, accurate output by grounding that output in content from a knowledge base.
For example, to use RAG to answer a question about a software product, you would perform the following steps:
- Search the online product documentation for an article that contains information to answer the question
- Extract the HTML contents of that article and convert it to plain text
- Submit the following prompt to a large language model:
Article:
------
<article-text-here>
------
Answer the following question using only information from the article.
If there is no good answer in the article, say "I don't know".
Question: <user-question-here>
Answer:
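As a rough sketch, those three steps might look something like the following Python, where search_documentation and generate are hypothetical placeholders for whatever search index and LLM endpoint a real solution would use:

# Minimal RAG sketch. search_documentation() and generate() are placeholders
# for whatever search index and LLM endpoint your solution actually uses.
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

PROMPT_TEMPLATE = """Article:
------
{article_text}
------
Answer the following question using only information from the article.
If there is no good answer in the article, say "I don't know".
Question: {question}
Answer:"""

def search_documentation(question: str) -> str:
    """Step 1: return the URL of the documentation article most relevant to the question."""
    raise NotImplementedError  # swap in your own search index or vector store

def article_to_text(url: str) -> str:
    """Step 2: fetch the article and strip its HTML down to plain text."""
    html = requests.get(url, timeout=30).text
    return BeautifulSoup(html, "html.parser").get_text(separator="\n", strip=True)

def generate(prompt: str) -> str:
    """Step 3: submit the grounded prompt to a large language model."""
    raise NotImplementedError  # swap in your model provider's API call

def answer_question(question: str) -> str:
    url = search_documentation(question)
    prompt = PROMPT_TEMPLATE.format(article_text=article_to_text(url), question=question)
    return generate(prompt)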
Impact of RAG on content design
RAG is gaining popularity fast. The following image shows an almost comical explosion in Google searches for “retrieval-augmented generation” this past summer:
Teams producing product documentation are scrambling to deploy RAG solutions on their documentation. Their users are demanding it.
Although the basic RAG pattern is straightforward, building and deploying an effective RAG solution presents many challenges. The following paper describes my team’s experience deploying RAG solutions over the past two years: Optimizing and Evaluating Enterprise Retrieval-Augmented Generation (RAG): A Content Design Perspective
As we evaluated responses generated by our RAG solutions, we saw that even when documentation was produced according to content design best practices, articles met writing style guidelines, and all the information we anticipated would be needed had been written up, there were still many user questions for which our RAG solutions could not generate helpful answers.
The biggest challenge was when people asked questions we didn’t anticipate.
Example
Imagine there’s a new writing implement taking the world by storm: the carbonWrite 9000 pencil. Let’s say there’s a customer-support RAG solution to answer user questions about the product, and that the RAG solution uses the product documentation as its knowledge base.
Now, imagine a user submits the following question to the RAG solution: “How can I disable the graphite core?” Of course, there will be no information in the documentation about enabling or disabling that functionality… because it's a core part of the product. It's just there. And because the documentation doesn't mention disabling this feature, the RAG solution won't be able to generate a good answer to the question.
We can’t document all the things that cannot be done or don’t need to be done with a given product. If we tried, we’d never stop writing. The traditional content design approach of identifying concepts, tasks, and reference information cannot solve this problem.
FAQs
This problem isn’t entirely new. For a long time, content teams have published “frequently asked questions” (FAQs) articles. Usually, the list of FAQs is assembled over time in response to support tickets or feedback from sales teams or customer contacts. Creating the list is typically a manual process. And publishing a question and its corresponding answer often lags well behind the first time a user asks that question.
User communities
For as long as the internet has existed, there have been online user communities where people could ask each other questions about how to use software. To this day, you might get a better answer, and get it faster, from a user community than from a company’s official support process. Smart companies cultivate and support user communities, because it’s good for the company’s bottom line.
What’s new in the era of RAG is that people move through the discover-try-buy cycle of software purchasing faster than ever, and they usually expect to be able to do so without support from any sales person. If people ask a few questions of a self-help chat (powered by a RAG solution) for a product and don’t get good answers, they don’t track down a support person to get help or even ask questions in user forums. Instead, they just leave. They go buy some other product.
All of these factors have come together to give content design a good shake. On the one hand, our content is more vital than ever before, because it’s powering RAG solutions. On the other hand, expectations for these RAG solutions are through the roof and the stakes couldn’t be higher.
Strategic response: Question-driven content design
In the era of RAG, content design teams need to add a new step to the content design process: Test whether content answers real user questions.
Testing
It’s not enough that information relevant to a user question exists in the documentation. You also have to test that your RAG solution can find that content and that an LLM can answer the question from a prompt grounded in that content.
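As a rough sketch, such a test could be as simple as running every collected user question through the RAG solution and flagging the ones that come back unanswered (answer_question here stands in for whatever function calls your deployed solution):

# Hypothetical batch test: run each collected user question through the RAG
# solution and flag the questions it cannot answer from the documentation.
def find_content_gaps(questions, answer_question):
    gaps = []
    for question in questions:
        answer = answer_question(question)
        if "i don't know" in answer.lower():
            gaps.append((question, answer))
    return gaps

# Example usage with real, collected questions (not made-up ones):
# gaps = find_content_gaps(["How can I disable the graphite core?"], answer_question)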
Example test (part 1)
Here’s how an LLM answers a known user question from a prompt grounded in the carbonWrite 9000 pencil documentation. You can see that even though the article is open-ended about supported writing surfaces, the generated answer is not helpful:
Real user questions
It’s vitally important to collect real user questions instead of just guessing what the questions might be or making up sample questions yourself. Your familiarity with the product will bias the questions you ask and the language you use to ask the question. Real users — in the wild — very often ask surprising questions in quirky, ungrammatical, misspelled, and otherwise unexpected ways.
If you test your content with made-up questions, you might have a false sense of security that your solution will be successful. Worse, you might waste time optimizing your content to answer the made-up questions while your RAG solution still fails to answer real user questions well.
Adaptation
For content design teams, adapting our process for the age of RAG might look something like this:
- Traditional content design approach: Anticipating that users of the carbonWrite 9000 pencil will wonder whether the pencil works on surfaces other than paper, a writer might include the following statement in the documentation: “… you can use the pencil to write and draw on a variety of surfaces.”
- Question-driven content design: During an online workshop demonstrating how to use the carbonWrite 9000 pencil, the writer makes a list of questions that workshop attendees ask about writing surfaces. Also, the writer monitors the popular online user community, All things pencil, collecting questions about writing surfaces. The writer tests that the carbonWrite 9000 pencil documentation can answer those real user questions, and makes any content changes needed.
Example test (part 2)
Here’s another attempt to answer the supported-writing-surfaces question, after a minor content update (underlined in red) is made to the carbonWrite 9000 documentation. Now, a helpful answer is returned:
Automation
It is not sustainable to collect questions and test content manually. New workflows, tools, and automation need to be created. For example:
- To collect questions from the workshop mentioned above, the writer could automatically extract questions from the workshop transcript (a simplified sketch follows this list). Here is a sample notebook demonstrating how to classify a given sentence as a question: Python notebook in GitHub
- To collect questions from a user community, the writer could use an API (for example, the popular platform Stack Overflow has such an API; a sample request follows this list)
- To test whether documentation can answer collected user questions, the writer could submit the questions to their product RAG solution and review the answers (a simple comparison sketch also follows this list). Here is a sample notebook demonstrating various ways to automatically compare run-time results with expected answers: Python notebook in GitHub
- Another way to test whether documentation can answer known questions is to use a backwards approach: Generate questions from the content, and then compare those questions to the known questions. Here’s a notebook demonstrating this technique: Python notebook in GitHub
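For the first idea, the linked notebook uses a proper classifier; as a rough stand-in, even a simple heuristic can surface candidate questions from a transcript:

# Rough heuristic for spotting questions in a workshop transcript. The linked
# notebook classifies sentences properly; this sketch only checks surface cues.
import re

INTERROGATIVES = ("how", "what", "why", "when", "where", "which", "who",
                  "can", "could", "does", "do", "is", "are", "will", "should")

def looks_like_question(sentence: str) -> bool:
    s = sentence.strip().lower()
    return s.endswith("?") or (s.split(" ", 1)[0] in INTERROGATIVES)

def extract_questions(transcript: str) -> list:
    sentences = re.split(r"(?<=[.?!])\s+", transcript)
    return [s for s in sentences if looks_like_question(s)]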
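For the second idea, here is a hedged example of pulling recent question titles that mention a product from the Stack Exchange API (check the current API documentation for authentication, quotas, and filtering options):

# Fetch recent Stack Overflow question titles that mention a product.
import requests

def fetch_community_questions(product_name: str, site: str = "stackoverflow") -> list:
    response = requests.get(
        "https://api.stackexchange.com/2.3/search/advanced",
        params={"order": "desc", "sort": "creation", "q": product_name, "site": site},
        timeout=30,
    )
    response.raise_for_status()
    return [item["title"] for item in response.json().get("items", [])]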
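And for the third idea, the linked notebook demonstrates several comparison techniques; one very simple option is a text-similarity ratio between the generated answer and an expected (“gold”) answer:

# Flag generated answers that drift too far from the expected answers.
from difflib import SequenceMatcher

def similarity(generated: str, expected: str) -> float:
    return SequenceMatcher(None, generated.lower(), expected.lower()).ratio()

def flag_poor_answers(results, threshold=0.5):
    """results: iterable of (question, generated_answer, expected_answer) tuples."""
    return [q for q, got, want in results if similarity(got, want) < threshold]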
Conclusion
In the past, creating product documentation has sometimes been viewed as an unfortunate, necessary cost to the business. Now, that same content is suddenly crucially important, because everyone is clamoring for RAG and content is the fuel on which RAG solutions run.
The traditional content design approach doesn’t guarantee RAG success. Content design teams must add a new step to their process: Test whether content answers real user questions.
Content teams must invest in automation to streamline the new work.
Adapting to change can be difficult.