I accidentally bought an AI-generated book
Let’s learn from this book’s quirks and mistakes.
Background
While hunting for information about designing natural language interfaces, I stumbled across a book on Amazon, called “Crafting User Interfaces for Artificial Intelligence: Merging AI and Design for Seamless Interactions”.
There were a few red flags right away: the book appears to be self-published, it’s just over one hundred pages long, and the price was very low. However, the table of contents (available with the “look inside” feature on Amazon) piqued my curiosity. And in the past, I have found some small, self-published books that were gems. So, I took a chance.
The book arrives
I didn’t notice the typo on the cover until I was looking at the physical book. 😆
At first, I was annoyed by the idea of someone selling fake(?!) books. But as I read through the little book, cataloging the mistakes, I became intrigued by the quirks of the content. And as an AI developer, I imagined the steps the person who created this book must have gone through.
I don’t have an issue with the basic concept of someone using AI to create a book. And I’m intrigued by the business idea and technical challenge of figuring out what topics readers might be interested in and then using AI to churn out little books on popular topics. Unfortunately, in this case, the result was poor quality. But if the book had been good, would it have mattered how it was created?
Problems
Here are the problems — big and small — with the book:
- Basic language errors
- Major structural errors
- Lack of references
- Not delivering value
- Not making sense
Details below.
1. Basic language errors
There are spelling, grammar, and word choice mistakes throughout. Sentences are incomplete. Random, wrong words are used.
Example from pg. 16:
Design thinking is a stoner-centered approach to problem- working that emphasizes empathy, creativity, and replication.
What the heck?
The word “stoner” appears in a few places. It might be a mistranslation of “user”. In fact, based only on these language errors, it’s possible this book wasn’t AI-generated, but written in a language other than English and then badly translated to English using machine translation.
Translation is tricky at the best of times.
I love a good sign translation fail.
2. Major structural errors
Content on pg. 49–54 is repeated on pg. 54–59. (It wasn’t a printing error. The subsections of those pages were repeated with a slightly different format. This might not be a generative-AI problem. It might be that the author made a mistake with their writing/publishing software.)
The appendix sections have descriptions of what should be in an appendix, but not the actual content itself:
- The beginning of the Glossary says: “The glossary of terms provides definitions and explanations…” But there are no glossary terms!
- The beginning of the “Resources for further reading” says: “Resources for further reading include a curated list of…” But there are no resources listed!
- The beginning of the “Tools and technologies for AI and UI design” section says: “The tools and technologies appendix outlines software tools, frameworks, and platforms…” But there are none listed!
If someone was following a sell-ebooks-online-in-your-spare-time recipe, it’s possible they copy-pasted the appendix template but forgot to fill it in. It’s also possible the person who created this book prompted an LLM to generate appendix content and the LLM simply described what an appendix should contain. An LLM doesn’t have links or facts stored in it. An LLM is a collection of nodes, each running a simple function, which combine into one extremely complex function with an enormous number of parameters, a function that calculates the most likely words to follow given prompt text. If you prompt an LLM to generate a list of links, and the LLM does generate a list, it might look right… But it will be what LLM researchers call a hallucination: just made up.
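To make that concrete, here’s a toy sketch in Python. The probabilities are invented for illustration (a real model computes them with billions of parameters), but the mechanism is the same: pick whatever continuation is statistically likely. A link-shaped string can be the most likely continuation of “see:” even though no such page exists.

```python
# Toy illustration only: hand-made probabilities, not a real language model.
# The point: the "next word" is whatever is statistically likely, whether or not
# it corresponds to a real fact, a real link, or a real reference.

next_word_probs = {
    ("further", "reading,"): {"see:": 0.6, "consult": 0.3, "below.": 0.1},
    ("reading,", "see:"): {
        "https://example.com/ai-ui-design": 0.7,  # link-shaped, but made up
        "the": 0.2,
        "chapter": 0.1,
    },
}

def most_likely_next(context):
    """Return the highest-probability continuation for a two-word context."""
    candidates = next_word_probs.get(tuple(context), {})
    return max(candidates, key=candidates.get) if candidates else None

print(most_likely_next(["further", "reading,"]))  # -> "see:"
print(most_likely_next(["reading,", "see:"]))     # -> a plausible-looking but fabricated URL
```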
3. Lack of references
In non-fiction books, it’s common for there to be footnotes, endnotes, and URLs referring to previous publications and related work. But this book doesn’t have any. None.
The trouble with using a large language model (LLM) to generate text from a basic prompt is that the output is made up from the model’s vocabulary and statistical word associations only. Even if the person creating this book wanted to link to sources, they would have no way of knowing which articles in the model’s pre-training data influenced a given piece of generated output. However, with patterns like retrieval-augmented generation (RAG), you can cite the source material that the generated output is grounded in.
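Here is a rough sketch of the RAG idea. The retrieval below is naive keyword overlap and the corpus entries are placeholders; a real system would use embeddings or a vector store for retrieval and pass the prompt to a real LLM for the generation step.

```python
# Sketch of retrieval-augmented generation (RAG). Retrieval here is naive keyword
# overlap; a real system would use embeddings/vector search, then send the prompt
# to an LLM. The corpus entries are placeholders, not real publications.

corpus = [
    {"id": 1, "title": "Placeholder article A", "text": "Tell users what the AI can and cannot do."},
    {"id": 2, "title": "Placeholder article B", "text": "Give users a way to recover when the AI gets it wrong."},
]

def retrieve(query, docs, top_k=2):
    """Return the docs sharing the most words with the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d["text"].lower().split())), reverse=True)[:top_k]

def build_grounded_prompt(query, docs):
    """Ask the model to answer only from the supplied passages and cite them by id."""
    sources = "\n".join(f"[{d['id']}] {d['title']}: {d['text']}" for d in docs)
    return (
        "Answer using only the sources below, and cite the source id after each claim.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

prompt = build_grounded_prompt("How should a UI set expectations about AI?", retrieve("expectations about AI", corpus))
print(prompt)
# Because every passage came from retrieval, the ids and titles can be printed as
# real references alongside whatever the model generates.
```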
4. Not delivering value
Even with all the previous problems, the book still might have contained a nugget of interesting information, a novel idea, or key insights into the challenges of creating user interfaces for AI.
Alas, no.
- There’s hardly anything here. The font is enormous, there’s huge space between the lines, there’s a lot of repeated information, and as mentioned, the book is just over 100 pages long. This little book has classic meeting-the-essay-minimum-page-requirement energy.
- Overall, the information is at a very introductory level and lacks any nuance.
- There is information about AI and there is information about design. But there’s no new information about designing for AI.
- The examples are trivial and repetitive: Siri, Alexa, and Google are given as examples over and over and over.
These kinds of problems scream AI-generated:
- One reason this AI-generated content lacks nuance is that designing for AI is a niche topic. It wouldn’t appear very often in any LLM’s pre-training data, particularly because most LLMs were pre-trained on text from several years ago, before there was much mainstream AI to design for.
- One reason LLMs struggle to create content with truly novel ideas is that the words in the generated output are chosen based on probability. For example, in an LLM’s pre-training data, there would be many references to peanut butter and jelly sandwiches, but no references to peanut butter and concrete sandwiches. When prompting an LLM for new peanut butter sandwich ideas, concrete is not going to show up in the generated recipes. That’s not because LLMs somehow know people shouldn’t eat concrete; it’s because that word didn’t appear alongside peanut butter in the pre-training data. LLM “creativity” comes from built-in pseudo-randomness. Yet even then, an LLM isn’t choosing randomly from its entire vocabulary. It’s choosing from a collection of most-likely words (see the sampling sketch after this list).
- Having the same, obvious examples repeated everywhere is a symptom of generating chapters or sections in isolated chunks, with no knowledge of what came before or plan for what will come next. Whereas a human author might put variety in the examples, the LLM keeps going back to the greatest hits every time.
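To illustrate that probability point, here’s a small Python sketch of temperature-based sampling over a shortlist of likely words. The probabilities are invented for the example; the takeaway is that even with the randomness turned up, the model only samples from words that already co-occurred with the context, so “concrete” never has a chance.

```python
import random

# Illustrative only: hand-made probabilities for "peanut butter and ___ sandwich".
# The pseudo-random "creativity" comes from sampling among likely candidates;
# a word that never co-occurred with the context (like "concrete") isn't in the pool.

candidates = {"jelly": 0.70, "banana": 0.15, "honey": 0.10, "pickle": 0.05}

def sample(probs, temperature=1.0, top_k=3):
    """Sample one word from the top_k most likely candidates, after temperature scaling."""
    top = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k])
    weights = {w: p ** (1.0 / temperature) for w, p in top.items()}  # higher temp = flatter
    r = random.uniform(0, sum(weights.values()))
    running = 0.0
    for word, weight in weights.items():
        running += weight
        if r <= running:
            return word
    return word

print([sample(candidates, temperature=1.2) for _ in range(5)])
# e.g. ['jelly', 'banana', 'jelly', 'jelly', 'honey'] -- varied, but never 'concrete'
```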
5. Not making sense
Human beings tend to look for patterns and meaning even where there is none.
Even when LLMs babble, we try to see meaning in the generated output. We are entertained by the novelty of a machine speaking to us. The common patterns in the stream of words feel familiar. But on close inspection, you can see the contradictions and gaps in the logic when an LLM has been given a basic prompt to generate long-form text.
Example
The following snippet (from pg. 11, chosen at random) demonstrates a problem with content throughout the book:
AI-driven interfaces can predict user needs, provide tailored recommendations, and automate routine tasks, thus improving efficiency and user satisfaction. For instance, virtual assistants like Siri, Alexa, and Google Assistant use AI to understand and respond to user queries in natural language, making interactions more intuitive and human-like.
That information isn’t exactly wrong. But it doesn’t hold together either. The first sentence talks about customization and automation increasing efficiency. The second sentence begins with “For instance…” and then goes on to give examples not about customization, automation, or efficiency, but about making interaction more intuitive and human-like.
That’s not a translation error. That’s a fundamental not-knowing-what-your-point-is error. If a student submitted that paragraph for feedback, a teacher might say:
- Are the examples of Siri, Alexa, and Google Assistant making interactions more intuitive supposed to demonstrate how AI-driven interfaces provide customized/automated experiences? If so, you need to rewrite those examples or find different ones.
- Or are you saying AI-driven interfaces enable customized/automated experiences AND AI-driven interfaces can be more intuitive and human-like? If so, “For instance” needs to go.
This kind of error is a classic generative AI error. LLMs don’t have a point to make. They aren’t crafting a persuasive argument. They are stringing together words based on probabilities.
Example
Just after the previous example, the text continues:
Moreover, AI can analyze vast amounts of user data to uncover patterns and insights that inform design decisions. This data-driven approach enables designers to create interfaces that are not only aesthetically pleasing but also highly functional and user-centric.
I work in tech, so I am constantly bombarded with jargon and marketing content that is just so much buzzword salad. I’m a bit immune to it. Nevertheless, it’s hard to imagine packing more tech-AI-design jargon into the two sentences above.
If I asked a job applicant “How can AI help with the design process?” and their answer was the above two sentences, I would think to myself: They’ve got a lot of five-dollar words there, but do they know what all of that really means? And I’d ask for a specific example.
The book never gets past high-level jargon. The statements in the book don’t follow logically and aren’t backed up by any references. The text flows along with all the right-sounding words, but ultimately makes no sense and delivers no value.
Perspective as an AI developer
There are ways to avoid the basic mistakes listed above:
- You could prompt an LLM to review the generated output, identifying and/or correcting language problems (see the sketch after this list)
- You could get a human editor to review the content
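A minimal sketch of the first option, where call_llm is a hypothetical stand-in for whatever chat-completion client you actually use:

```python
# Sketch of an automated language-review pass. call_llm is a hypothetical
# placeholder for a real LLM API call; the review prompt is the point.

REVIEW_INSTRUCTIONS = (
    "You are a copy editor. Review the draft below. List every spelling, grammar, "
    "and word-choice error, then return a corrected version. Do not add new claims."
)

def review_pass(draft, call_llm):
    return call_llm(f"{REVIEW_INSTRUCTIONS}\n\nDraft:\n{draft}")

# Usage, once a real client is wired up:
# corrected = review_pass(chapter_text, call_llm=my_llm_client)
```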
When it comes to tackling the more significant problems, you’d have to take a hard look at your workflow and business model.
It’s fast and easy to prompt an LLM to generate output on a given topic. What you get is: bla bla bla. (Sometimes including simplistic or flat-out incorrect information.)
But if you want to generate quality output, you have to work harder:
- You have to do your research to find reliable, trustworthy reference sources to inform and support your points
- You have to marshal your arguments and then create an outline of how they follow one from the other
- Then you have to engineer prompts for the LLM that are grounded in the reference source material and that carry low-level instructions, like a list of points to make and an explanation of how one leads to the next (a sketch of this kind of prompt follows this list)
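Here’s a sketch of what that third step could look like: the prompt carries your verified source excerpts and your ordered list of points, and the LLM’s job is reduced to the wording. All names and excerpts below are placeholders.

```python
# Sketch of a grounded, outlined section prompt. Points and excerpts are
# placeholders; in practice they come from your own research and outline.

def build_section_prompt(section_title, points, excerpts):
    points_text = "\n".join(f"{i}. {p}" for i, p in enumerate(points, start=1))
    sources_text = "\n".join(f"- ({e['source']}) {e['quote']}" for e in excerpts)
    return (
        f"Write the section '{section_title}'.\n"
        "Make exactly these points, in this order, explaining how each leads to the next:\n"
        f"{points_text}\n"
        "Ground every claim in these excerpts and name the source in parentheses:\n"
        f"{sources_text}\n"
        "Do not introduce facts that are not in the excerpts."
    )

prompt = build_section_prompt(
    "Setting expectations in AI-driven interfaces",
    points=[
        "Users need to know what the AI can and cannot do.",
        "Interfaces should surface uncertainty where it affects user decisions.",
    ],
    excerpts=[
        {"source": "a design article you verified yourself", "quote": "..."},
    ],
)
print(prompt)
```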
With supportive reviewers/editors, producing a book this way might take 80-90% of the time it would take to produce it without using generative AI. That’s pretty good. But if your business model is to generate a book in a day, that level of manual work simply doesn’t fit.
Conclusion
Even before LLMs became popular, there were people writing low-quality books. So buyer beware was already a thing. (Maybe this book wasn’t even generated using AI.) But because it’s so easy to generate these kinds of books using LLMs, there will be much, much more for customers to wade through now. Customers will have to get better at filtering out low-quality books and identifying the high-value ones.
To detect AI-generated books, look for the kinds of problems listed here.
If you plan to generate books using LLMs, do the work to avoid these problems.