A few weeks ago we launched Kinnu’s new pathway about the French Revolution. Hundreds of people have since started this learning pathway and of these 90% liked it (v. 93% median for Kinnu’s pathways and 99% for its best pathway). Not one person remarked that it was written with AI.
Sure, a human was in the loop. An expert human wrote the plan. An average human removed the most obvious AI hallucinations. A human editor made the text flow. But AI wrote most of the text. AI generated pathway images. AI converted text into audio. If we were doing content generation in this way at scale, we could have done most of it directly with APIs and further streamlined the whole process.
Overall we saved 90% of the time it would otherwise have taken us to make this learning content (here are details of the old and new process). And this is mostly with GPT-3. What will GPT-4 be able to do? What we know is that we are rebuilding our entire content creation operational model based on our French Revolution experiment. The Generative AI Revolution is here, and like the French Revolution it will shake up the old order.
If you’re a content publisher, take note – if you think that proliferation of digital media changed your business model, brace yourself for what’s coming. Everyone else – rejoice! Content just got free (or in our case – 10x cheaper).
We have learned tons from our experiments using AI for content creation which we wanted to share with you. Are you an academic drafting your annual curriculum? Student preparing an essay? Content marketer bored with the topic of your latest project? Below are our top 10 tips for generating content with AI.
Give AI precise instructions
Most generative text AI models, like GPT-3, work by giving you the most probable next word based on a large language model. Some others, such as the Natural Language Toolkit (NLTK) in Python or the Washington Post’s Heliograf, instead follow pre-defined grammatical and syntax rules and operate within the symbolic nature of language (the former typically outperform the latter, though!).
You need to give the model a clear and crisp prompt, and even better a few examples. I typically try to imagine I am speaking to a child who has limited knowledge and a rather naive view about the world – you cannot assume they know what you’re talking about. There should be no ambiguities. Linguistic precision is required. For example, a query like “French Revolution” will give you a few generalist paragraphs. But if you add precision, for example “Tell me what was gallicanism?”, your results will be much more useful.
But don’t let this child metaphor fool you. You are speaking to a machine, and the machine can deliver feats which even the most effective intern can’t dream about. An example would be using it to extract the latest quarterly performance of major listed companies (ticker, market capitalisation, EBITDA, etc.) and add it to a spreadsheet. Here is a fantastic example of GPT-3 doing in a few seconds something which took me long hours in my first job.
Pro tip: Be specific. Want a paragraph with 150 words? “In a 150-word paragraph, explain X.” Want a bullet list? “Write a bullet list of reasons why London is the best city ever.”
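To make the "be specific" advice concrete, here is a minimal sketch of a helper that spells out the format, the length and (optionally) a few worked examples in every prompt. The function name `build_prompt`, its parameters and the Q:/A: layout are our own illustration, not part of any GPT-3 tooling.

```python
def build_prompt(task, output_format="paragraph", word_limit=None, examples=()):
    """Assemble an explicit prompt: optional examples, then task, format and length."""
    lines = []
    # Worked examples go first so the model can copy their pattern.
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")
    instruction = f"In a {output_format}"
    if word_limit is not None:
        instruction += f" of about {word_limit} words"
    instruction += f", {task}"
    lines.append(f"Q: {instruction}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical usage: a precise request instead of just "French Revolution".
prompt = build_prompt(
    task="explain what Gallicanism was.",
    output_format="paragraph",
    word_limit=150,
)
print(prompt)
```

The resulting string is what you would paste into the playground or send through the API. The point is only that format and length are stated rather than implied.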
Adjust available settings when you create images with AI
We started off generating images in DALL-E with simple prompts. However, we quickly realised that you can then fine tune the images using a ridiculously intuitive interface and you end up with a result that’s better than most of us can ever design, all in well under 30 seconds per image generated.
Little did we know, there is a science to “prompt whispering”. What you need to know to do it well is (1) the parameters you can use to change the output of the prompt, (2) the data sets that were used to train the model and (3) a general orientation in art history. Compare the generic output of “French revolutionaries storming the Bastille” to the more precise output of “A black and white photo of French revolutionaries storming the Bastille”.
How does it work? It has to do with how tools like DALL-E and Midjourney bring images to life from noise. You specify the recipes that are used to make an image. There is now even a marketplace to buy prompts. But can such recipes even be copyrighted? Probably not.
Pro tip: Learn more about art history and the datasets used to seed the DALL-E and Midjourney models. Take a look at all the different aspects of an image you can control with words in this DALL-E guidebook: you can specify objects, shape, colour, perspective, structure and style. You can go even further with Midjourney as per this guide: in addition to the basic stylistic features of DALL-E, you can also control your image’s rendering (photographic texture), how stylized it is, chaos (how abstract it is) and the weights given to different kinds of objects and stylistic features.
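If you generate many images, it can help to treat the prompt as a recipe with named ingredients. The sketch below is only an illustration: `build_image_prompt` and its parameters are hypothetical names of ours, and the "wide angle" descriptor is an invented example rather than something taken from the DALL-E or Midjourney guides.

```python
def build_image_prompt(subject, medium=None, style=None, colour=None, perspective=None):
    """Join the subject with whichever optional descriptors were supplied."""
    # The medium reads most naturally before the subject ("A black and white photo of ...").
    head = f"{medium} of {subject}" if medium else subject
    extras = [part for part in (style, colour, perspective) if part]
    return ", ".join([head] + extras)


generic = build_image_prompt("French revolutionaries storming the Bastille")
precise = build_image_prompt(
    "French revolutionaries storming the Bastille",
    medium="A black and white photo",
    perspective="wide angle",  # illustrative extra descriptor
)
print(generic)
print(precise)
```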
Beware: text generating AI often lies and hallucinates
When you generate text with probabilistic models such as GPT-3, remember that the model is trained on the web and often makes glaring factual errors. GPT-3 was not designed to be a truth machine: it includes no higher-level logical layer to check facts or draw logical conclusions. All it does is predict the next most likely word based on its training set. In our experiment it made a few hilarious (and terrifying) mistakes. No, Emmanuel Macron is not the king of France. No, Donald Trump has nothing to do with the French Revolution. No, we do not need a full list of religious orders abandoned by the French Revolution. Yes, all of this actually came up in actual GPT-3 output.
As the editor, you need to be vigilant: some of the output will be incorrect, some will be entirely irrelevant. Some errors were glaring and anyone with common sense and high school level education can spot them. Other mistakes are more nuanced and require expert knowledge, for example the following statement “In 1787, the King of France, Louis XVI, summoned the Assembly of Notables to advise him on how to solve the financial crisis facing the Kingdom.” sounds entirely plausible to a non-expert. However, historians would posit instead that “In 1789, Louis XVI sought to solve a political crisis by summoning the “Etats Généraux”, an ancient institution.”
Pro tip: Fact check everything GPT-3 and the like generate. They were made to create content that’s plausible, not correct, and they are very convincing. For example, when we first started playing with GPT-3 at the beginning of September and typed “What’s the difference between polonium and radium?”, the answer was “One is radioactive and another one is not”. This was extremely plausible and, fortunately, was quickly fixed, but it’s a fantastic example of how an AI writer will take your bias (you’re looking for a difference) and lack of knowledge and run with it, offering plausible but false conclusions.
Combine sentences to avoid GPT-3 sounding like it has just learnt English as a foreign language
We noticed that a lot of the sentences generated by GPT-3 sound as if the algorithm has just learnt English as a foreign language. Remember when you first learnt English (or another language if English is your first language) and you could only speak in very simple sentences? “Mary went to the library. The library is full of books. She took a book home.” AI generated text can sound very much like this: “The Revolution led to the rise of Napoleon Bonaparte. Napoleon Bonaparte was a French military leader and political leader. He became Emperor of France in 1804.”
AI seems to have absolutely no problem writing about people, events and facts. It is particularly good at definitions. It does not excel at abstract concepts. I have not seen it write conditional sentences, or comfortably compose sentences with words like “although” and “however”. It seems stuck in middle school writing.
Pro tip: Use AI generated sentences as the baseline structure of your paragraphs. Connect these simple sentences with some flair, for example: “[Human needs to add colour] Although the exact period of the French Revolution is hotly contested, most contemporary historians agree that [AI has no problem saying] the Revolution began in 1789 with the storming of the Bastille, a symbol of the absolute power of the monarchy. The Revolution ended in 1799 with the establishment of the French Republic.”
Don’t expect AI to have opinions (though it sometimes does!)
It is easy to see why you would not want AI to have political opinions. What is somewhat surprising, however, is that it has both no opinion on less politically charged topics, and a limited sense of ethical judgment. Below is an illustration of my chat with GPT-3’s Davinci model about more contemporary subjects.
Pro tip: Do not use GPT-3 for editorials.
Don’t get yourself banned by requesting inappropriate images
Despite what the media will have you believe, there are some constraints around what you can and cannot request from an image generating AI. For example, when GPT-3 output something along the lines that Emmanuel Macron was the king of France, I thought it would be hilarious to generate a satirical image showing Emmanuel Macron as the king of France using DALL-E.
“It looks like this request may not follow our content policy. Further policy violations may lead to an automatic suspension of your account.”
You cannot generate images that may be perceived as offensive. Instead we had to be a bit creative and more descriptive to get a similar result: “Young man in modern clothes sitting as the king of france in Louis XIV style throne room photorealistic.”
Pro tip: DALL-E has a stricter content policy than a PG movie (you can read in detail here). It does not allow anything that could be perceived as offensive, harassing or inappropriate in any way. Not even if the intent is satirical (as far as we have been able to try out; generative AI does not have much of a sense of humour).
You can fine tune GPT-3 to meet your particular use case
There are amazing tools which allow you to personalise the output that GPT-3 gives you. You upload your own training set and this adjusts the weights of the model. If you want to produce output in a consistent format, this can massively improve what you get out of the generic model.
Because of the crazy large dataset models such as GPT-3 are trained on, they can do many tasks (such as sentiment analysis) without any training (called zero-shot learning). However, in some cases, like asking it to classify things based on non-traditional sentiments, it will benefit from a few examples (what is called few-shot learning).
When you are still not happy with the output from few-shot examples, you can fine tune the model with a training set (e.g. example sentences and their sentiments). Depending on the task, this can take from a few hundred to a few thousand examples, but you end up with your own customised model that you can use and reuse without packing examples into every prompt.
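As a rough sketch of the difference, the same labelled sentences can be used in two ways: pasted into the prompt (few-shot) or written to a training file (fine tuning). The example sentences, labels and function names below are invented for illustration. The one-JSON-object-per-line layout with `prompt` and `completion` fields is an assumption about the upload format, so check your provider’s documentation before relying on it.

```python
import json

# Invented examples with "non-traditional" sentiment labels.
examples = [
    ("The storming of the Bastille thrilled the crowd.", "excited"),
    ("The bread shortage left Paris exhausted.", "weary"),
]


def few_shot_prompt(examples, new_sentence):
    """Show the model labelled examples, then leave the last label blank."""
    blocks = [f"Sentence: {s}\nSentiment: {label}" for s, label in examples]
    blocks.append(f"Sentence: {new_sentence}\nSentiment:")
    return "\n\n".join(blocks)


def to_training_jsonl(examples):
    """Serialise the same examples as one JSON object per line (assumed format)."""
    rows = [
        {"prompt": f"Sentence: {s}\nSentiment:", "completion": f" {label}"}
        for s, label in examples
    ]
    return "\n".join(json.dumps(row) for row in rows)


print(few_shot_prompt(examples, "The king fled to Varennes."))
print(to_training_jsonl(examples))
```

With two examples this is few-shot territory. The training file only becomes worthwhile once you have the hundreds or thousands of rows mentioned above.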
Pro tip: You can use each AI tool’s customisation algorithm to add your custom data to help you get the output you want. This is true for both text and image generation. There are also amazing companies that are building infrastructure on top of these base models, and specialise in fine tuning.
Use APIs and build specialised components for each workstream
AI tools like GPT-3 are generalist. They can do several different things, but it is significantly easier to ask for each thing separately. For example – you may want to build a full blown learning pathway about the French Revolution. While it might be nice to just get the full course all at once, it is actually easier to ask for it in parts.
You may ask for a bulleted plan of key points to know about the revolution. You may then write out key concepts (let’s say based on most frequently visited Wikipedia subpages, which you can easily pull via their API). You may then ask the tool to write specified paragraphs based on these concepts. You can then ask a different module to generate questions (one for short answer questions, one for multiple choice questions, one for true / false questions etc.).
The reason why it is beneficial to build out smaller components for different tasks is that despite encoding a massive amount of text in its model, when it comes to your own prompts and their responses, GPT-3 has a small working memory (or what is sometimes referred to as ‘context window’).
In fact, the GPT-3 model has a hard limit on request plus response which is roughly equal to 1500 words. New requests will not remember previous ones or their answers, so the model cannot plan at a macro scale. It has to do one short topic at a time, and it is your task to look at the bigger picture.
If you want to create a long narrative, then each new prompt should include key points from the previous response to sustain some continuity over separate calls.
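One way to sustain that continuity is to prepend a recap of earlier key points to each new prompt, while leaving room for the answer. The sketch below assumes the rough 1500-word budget quoted above and counts whitespace-separated words, which is only an approximation of how the model measures length. `next_prompt` and `reserved_for_answer` are hypothetical names.

```python
WORD_BUDGET = 1500  # rough request + response limit quoted in the text


def next_prompt(instruction, key_points, reserved_for_answer=500):
    """Prepend earlier key points, dropping the oldest until the prompt fits."""
    allowed = WORD_BUDGET - reserved_for_answer
    points = list(key_points)
    while True:
        recap = "\n".join(f"- {p}" for p in points)
        prompt = f"Key points so far:\n{recap}\n\n{instruction}" if points else instruction
        if len(prompt.split()) <= allowed or not points:
            return prompt
        points.pop(0)  # forget the oldest point first


print(next_prompt(
    "Write the next section, about the Estates General.",
    ["Louis XVI faced a financial crisis.", "The Assembly of Notables met in 1787."],
))
```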
Pro tip: Think of designing your generative AI project like a series of dominos. One function executes, triggering another which depends on its output, then another, then another. So for example you can start by building out a few paragraphs based on specific prompts. Then you may generate headlines based on these paragraphs. The next step would be formulating review questions based on this content. The final step should always be to manually check the output for factual accuracy.
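The domino idea can be sketched as a small pipeline in which each stage feeds the next. This is an illustration under assumptions, not our production code: `generate` stands for whatever function sends a prompt to your model and returns text, and `fake_generate` is a stand-in so the example runs without an API key.

```python
def run_pipeline(topics, generate):
    """Chain three stages; `generate` is any callable mapping a prompt to text."""
    results = []
    for topic in topics:
        paragraph = generate(f"In a paragraph, explain {topic}.")
        headline = generate(f"Write a short headline for this text:\n{paragraph}")
        question = generate(f"Write one true/false question about this text:\n{paragraph}")
        results.append({
            "topic": topic,
            "paragraph": paragraph,
            "headline": headline,
            "question": question,
            "fact_checked": False,  # a human flips this after manual review
        })
    return results


def fake_generate(prompt):
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[model output for: {prompt.splitlines()[0]}]"


pathway = run_pipeline(["the Assembly of Notables", "Gallicanism"], fake_generate)
for item in pathway:
    print(item["headline"])
```

Keeping `fact_checked` as an explicit field is a design choice: nothing leaves the pipeline as "done" until a person has reviewed it.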
Learn the key vocabulary associated with each tool
For example, it took me a while to discover that “temperature” in GPT-3 was not the level of enthusiasm the AI writer would display at the topic presented (as an overly enthusiastic person I naturally made this association). It’s actually the level of randomness (entropy) that you would like in your output. This means a temperature of zero will always give you the same response for the same prompt, while the higher the temperature, the more the responses will vary, but also the more likely they are to drift from the original prompt.
Pro tip: Want GPT-3 to sound a bit drunk? High temperature is your friend. I also did a bit of legwork and found these vocabulary guides for you for GPT-3, DALL-E and Midjourney.
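To see why temperature behaves this way, here is a toy version of temperature-scaled sampling over three candidate next words. The scores are made up, and this is the textbook softmax-with-temperature technique rather than a description of GPT-3’s internals, which work over a far larger vocabulary.

```python
import math
import random


def next_word_probabilities(scores, temperature):
    """Turn raw word scores into probabilities, sharpened or flattened by temperature."""
    if temperature == 0:
        # Limit case: all probability goes to the single highest-scoring word.
        best = max(scores, key=scores.get)
        return {word: float(word == best) for word in scores}
    scaled = {word: s / temperature for word, s in scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {word: math.exp(s - top) for word, s in scaled.items()}
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}


def sample_word(scores, temperature, rng=random):
    """Draw one word according to the temperature-adjusted probabilities."""
    probs = next_word_probabilities(scores, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words])[0]


# Made-up scores for the word following "They stormed the ...".
scores = {"Bastille": 3.0, "Louvre": 1.0, "moon": -1.0}
for t in (0, 0.5, 2.0):
    print(t, next_word_probabilities(scores, t))
```

At temperature 0 the top word always wins. As the temperature rises, the unlikely words get a growing share of the probability, which is where the variety (and the drift) comes from.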
Generative AI is not great at ambiguous tasks
Generative AI, as it is now, is excellent at answering specific, factual questions, but it assumes it knows everything about your intent. It’s not there to ask you follow-up questions when appropriate (maybe one day it will?!). This means it’s not great at responding to extremely useful and frankly not that hard tasks, such as: “What should one know about the French Revolution?”. Instead of asking for more context, for example:
AI: “Why do you need this information?”
Me: “Because I need it to generate a learning pathway about the French Revolution”
AI: “Why would your audience want to learn about the French Revolution?”
Me: “To once and for all remember all the key aspects of the French Revolution to apply this knowledge easily later”
AI: “At what level of detail?”
Me: “Ten sections, each with 10 subsections, ideally with titles clearly but crisply defining the content of each subsection”.
GPT-3 simply gave me a very generic answer which did not correspond to the above intent. Even using a higher number of tokens and a more specific query (“Generate a bullet list of headlines about things one should know about the French Revolution.”) did not help.
Pro tip: We are further testing if using the API we can get generative AI to ask questions if unsure where to go next instead of delivering the wrong result. So far I’d encourage you to not use it for ambiguous tasks such as generating topic outlines.
If I missed anything – I would love to hear from you at firstname.lastname@example.org or below. This topic is dear to my heart and we are looking to use generative AI to create a set of the ultimate study materials for any topic.