What is NLG?
NLG, or Natural Language Generation, is a technology that automatically generates text, backed by the latest state-of-the-art Deep Learning techniques.
Natural-language generation (NLG) is a software process that transforms structured data into natural language. It can be used to produce long-form content for organizations, such as automated custom reports, as well as custom content for a web or mobile application. It can also be used to generate short blurbs of text in interactive conversations (a chatbot), which might even be read out by a text-to-speech system.
Simply put: NLG is just generating text with some reasonable meaning. In other words, it is yet another branch of NLP.
How Did Modern NLG Start?
With the Deep Learning movement, this technology, together with almost all other AI tasks, saw a wave of innovation driven by Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) cells.
In 2015, Andrej Karpathy showed how effectively RNNs can be used for automatic text generation when trained on Shakespeare's plays:
The Unreasonable Effectiveness of Recurrent Neural Networks (karpathy.github.io)
The Transformers Taking Their Place
Transformers are a natural evolutionary extension of RNNs combined with the attention mechanism, and they appeared around 2017. They soon made RNNs obsolete for many AI tasks, NLP and NLG among them. Please keep in mind that we are not talking about Michael Bay's Transformers movie.
They are a wide topic, and their technical details are well explained elsewhere.
A few links:
- Transformers In NLP | State-Of-The-Art-Models (analyticsvidhya.com)
- Transformer (machine learning model) – Wikipedia
- Transformers in NLP: A beginner friendly explanation | Towards Data Science
- Understanding Transformers, the Data Science Way | by Rahul Agarwal | Towards Data Science
There are two major NLP models based on transformers: BERT and GPT-2.
While BERT is more broadly accepted for a wide range of NLP tasks because of its two-way (bidirectional) architecture, GPT-2 gained traction for automatic text generation mostly because of its one-way, forward-sequence processing architecture. Its main advantage is that it preserves the context of the autogenerated texts.
Therefore, for the purpose of automatic text generation, we used the GPT-2 language model.
Transformers !!!111
Sorry guys 🙁
We are not talking about those Transformers. Neither about Megan Fox.
Those are just a little bit different Transformers, you know …
Introducing GPT-2
GPT-2 is a language model based on transformers, developed by OpenAI.
In the beginning, the people at OpenAI were hesitant to release it publicly because they thought it might be used to generate fake news at scale. However, they published it afterward, and it immediately gained traction in the academic and AI communities.
A few useful links:
- Better Language Models and Their Implications (openai.com)
- GitHub – openai/gpt-2: Code for the paper “Language Models are Unsupervised Multitask Learners”
- The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time. (jalammar.github.io)
Finally, Fine-Tuning GPT-2
In short: GPT-2 is a neural network trained on a huge amount of text. I haven't found detailed information about the scope of the text used to train it, but while playing with it I have seen it autogenerate interesting Python, Java, and JavaScript source code, Apache web logs, and base64 strings, so it must have learned them from somewhere.
Formally speaking, GPT-2 produces the most probable sequence of words that might appear as an extension of some given input text. Those probabilities are calculated from the data used for its training. Therefore, texts autogenerated by plain GPT-2, taken as is, are texts that might have been written by the average English writer, in statistical terms.
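To make this concrete, here is a minimal sketch of sampling from the plain, pretrained GPT-2 model with the Hugging Face transformers library (just one possible toolkit, not necessarily the exact setup used here; the prompt and sampling parameters are illustrative):

```python
# Minimal sketch: sampling a continuation from pretrained GPT-2
# with the Hugging Face "transformers" library.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Once upon a midnight dreary"          # illustrative prompt
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# GPT-2 decodes left to right: it repeatedly picks a probable next token
# given everything generated so far (its one-way, causal architecture).
output_ids = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,                 # sample instead of greedy decoding
    top_k=50,                       # keep only the 50 most probable next tokens
    top_p=0.95,                     # nucleus sampling
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Running this repeatedly yields different continuations, each statistically plausible but tied to no particular author or style.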
To focus GPT-2 on generating texts that look like the work of some specific author, we use transfer learning:
- A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning | by Dipanjan (DJ) Sarkar | Towards Data Science
- Deep Learning using Transfer Learning | by Renu Khandelwal | Towards Data Science
In our case, we deliberately overfit the pretrained GPT-2 model on texts from a specific author. As a result, we produced a neural network that generates text in a style specific to that author.
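The exact training setup is not essential here, but a minimal fine-tuning sketch with the Hugging Face transformers Trainer could look roughly like this; the file name, epoch count, and hyperparameters are illustrative assumptions, and the many passes over a small corpus are what produce the deliberate "overfitting" to the author's style:

```python
# Minimal sketch: fine-tuning GPT-2 on one author's texts
# ("author_corpus.txt" is a hypothetical plain-text file).
from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    TextDataset, DataCollatorForLanguageModeling,
    Trainer, TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Chop the corpus into fixed-size token blocks for causal LM training.
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="author_corpus.txt",
    block_size=128,
)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="gpt2-author-style",
    overwrite_output_dir=True,
    num_train_epochs=5,              # many passes over a small corpus on purpose
    per_device_train_batch_size=2,
    save_steps=500,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("gpt2-author-style")
```

The saved model can then be sampled exactly like the plain GPT-2 model above, only now the continuations lean toward the author's vocabulary and rhythm.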
It is also possible to mix styles. To achieve this, we took two sets of texts: one from Shakespeare's writings and one of Christmas lyrics, for example. As a result, we got a neural network trained to generate texts about Christmas in Shakespeare's style.
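How exactly the two corpora are combined is a matter of taste; one simple option (an assumption, not necessarily the exact recipe used here) is to concatenate them into a single training file and reuse the fine-tuning sketch above:

```python
# Merge two corpora into one training file (file names are hypothetical),
# then fine-tune GPT-2 on "mixed_corpus.txt" as in the previous sketch.
corpora = ["shakespeare.txt", "christmas_lyrics.txt"]

with open("mixed_corpus.txt", "w", encoding="utf-8") as out:
    for path in corpora:
        with open(path, encoding="utf-8") as f:
            out.write(f.read())
            out.write("\n\n")
```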
Enjoy the autogenerated texts we’ll publish on this site.