Machine learning has made AI more flexible and adaptive, able to modify its behavior in response to its learning and the environment
The unreasonable effectiveness of data
By the 2000s, it was becoming clear that there was no need to define everything about a feature of intelligence for an AI to be able to mimic it. This idea became known as the ‘unreasonable effectiveness of data’: with enough data, statistical hacks can lead to behavior that appears intelligent. And yet it meant that one of the initial goals of AI needed to be dropped – the search for artificial intelligence might not increase our understanding of human intelligence.
A paradigm shift had happened. AI was now data-driven. And it meant that, rather than being fixed, rigid even, AI was becoming flexible and adaptive, changing its behavior based on the environment and its experiences.
An AI agent can be said to be learning if its performance improves following observations about the world. And it’s essential. After all, the AI researcher cannot predict all possible future situations or may not know how to program a solution. **Data-driven artificial intelligence empowers software to go beyond its brief**.
Learning from the data
You may have heard of ‘machine learning’ and be aware that it has incredible potential to impact technology and the world we live in, but what is it? Perhaps, at its simplest, we can describe machine learning as a tool for turning **information into knowledge**.
A computer observes data, builds a model based on those observations, and uses that model as a hypothesis about the world so that its software can solve problems. This matters because AI designers can’t anticipate every future situation and don’t always know how to code a solution themselves.
A mass of data plus machine learning techniques can find hidden patterns and knowledge about a problem and perform all kinds of decision-making. Machine learning is so exciting because it lets AI systems step away from rule-based approaches, learn, and improve their interactions with their environment – including us.
Machine learning process
Machine learning aims to create models that offer a high degree of predictability. The process is typically defined using the following steps.
Firstly, data must be collected for the algorithm to learn from – ideally in a **random order**. This is to prevent the algorithm from picking up any patterns or biases from the ordering before it has even started. Next, the data is formatted and important features are extracted using appropriately selected algorithms.
Then we have training – sometimes referred to as the fitting stage. And this is where the AI actually learns. A percentage of the data, identified as being **high quality, idealized, and representative of what it will face once trained**, becomes ‘training data.’ This is used to learn relationships by calculating weights for important factors, such as the make and model of a car, plus the type of damage, for insurance claims.
The remainder of the data becomes test data. It should be representative of the range and types found in training, as it will be used to see how well the model performs. Is the model accurate? And do the results match what we expect to find in the real world? If not, the algorithm’s parameters can be adjusted to get better results, along with its hyperparameters, which have to be set by the designer because they can’t be learned directly from training.
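The collect, shuffle, split, fit, and test steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the toy claims data, the 80/20 split, and the one-parameter ‘threshold’ model are all invented for the example and are not taken from any real system.

```python
import random

# Hypothetical toy data: (repair_cost, claim_approved) pairs, invented for illustration.
rng = random.Random(0)
data = [(cost, cost < 5000) for cost in (rng.randrange(0, 10000) for _ in range(200))]

# Step 1: shuffle so that the collection order cannot leak a pattern.
rng.shuffle(data)

# Step 2: hold back part of the data for testing (an assumed 80/20 split).
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

def accuracy(threshold, rows):
    """Fraction of rows where the rule 'cost < threshold' matches the label."""
    return sum((cost < threshold) == label for cost, label in rows) / len(rows)

# Step 3: 'training' here means choosing the threshold that best fits the training data.
candidates = range(0, 10001, 250)
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

# Step 4: check the fitted model against data it has never seen.
test_accuracy = accuracy(best_threshold, test)
print(best_threshold, test_accuracy)
```

The grid of candidate thresholds plays the role of a hyperparameter here: it is chosen by the designer rather than learned from the data.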
What is supervised learning?
There are several forms of machine learning, and **no single approach works for all tasks**. ‘Supervised learning’ uses labeled data and is particularly effective for classification of objects or understanding the connection between dependent and independent variables. It is most often used where goals are clear and accuracy is crucial.
The AI is given a number of samples of input-output pairs, where each pair is generated by an unknown function, which we’ll call ‘f’. The aim is to discover another function, approximating the true function ‘f’. While we may look for a consistent hypothesis, we may be more successful in looking for a ‘best-fit’ function.
For example, a machine may be supplied with multiple images, each accompanied by an output (label) of either ‘train’ or ‘passenger.’ Its goal is to learn a function that predicts the correct label when given a new image. However, rather than telling a train from a passenger by doing thousands of checks, it might just look at whether there are wheels in the image. Supervised learning settles on this approximation because it best fits the data.
Requirements for supervised learning
The goal of supervised learning is to predict outcomes for new data. It attempts to explain the behavior of the target as a function of a set of independent attributes or predictors.
Learning a function whose output is continuous or ordered (such as height) is described as ‘regression,’ while learning a function with only a finite number of possible output categories is called ‘classification.’ Linear regression involves finding the best fit between the input and the output.
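To make ‘finding the best fit’ concrete, here is a minimal sketch of ordinary least squares for a single input variable. The sample points are hypothetical (roughly y = 2x + 1 with some noise), and `fit_line` and `predict` are names chosen for this example only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical input-output pairs generated by an unknown function f.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(xs, ys)

def predict(x):
    """The learned approximation of f."""
    return slope * x + intercept

print(round(slope, 2), round(intercept, 2))
```

The fitted line does not pass through every point, so it is not a consistent hypothesis, but it is the straight line with the smallest squared error on these samples.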
Supervised learning requires a **large amount of labelled data**. A common rule of thumb is around 10 times the number of parameters used. So for example, if your AI distinguishes images of planes from birds based on 1,000 parameters, you would need at least 10,000 labelled images to train it.
Labelling these images is a **costly process** – both in terms of time and money. However, there are ways to fast-track the work and reduce manual involvement. Facebook has utilized a vast number of images taken from Instagram and already labeled with hashtags. While some of these hashtags (#awesome) were non-visual descriptions, the approach Facebook called ‘weakly-supervised data’ plus a clever prediction model led to an accuracy rate of 85.4% on the ImageNet image-recognition benchmark.
Unsupervised learning is one-sided – only unlabeled input data are provided. Therefore, the AI agent must learn from the input without explicit feedback. Typically this approach uses ‘clustering’ – detecting potentially helpful clusters from the information supplied. Think of the young child given a handful of marbles: they may sort them by size, color, or pattern.
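As a rough illustration of clustering, the sketch below groups marbles by size using a very small version of k-means, one common clustering technique (the text does not name a specific algorithm, so this choice is an assumption). The marble sizes and the two starting centroids are made up for the example.

```python
def k_means_1d(values, centroids, iterations=10):
    """A very small k-means for one-dimensional data."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assign every value to its nearest centroid.
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (leave it alone if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical marble diameters in millimetres; no labels are supplied.
marble_sizes_mm = [10, 11, 12, 24, 25, 26, 13, 23]
centroids, clusters = k_means_1d(marble_sizes_mm, centroids=[10.0, 26.0])
print(centroids, clusters)
```

Nobody told the algorithm that there are ‘small’ and ‘large’ marbles; the two groups emerge from the data alone.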
Another approach, ‘association,’ looks for connections: If X happens, then Y is likely to follow.
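One simple way to put a number on ‘if X happens, then Y is likely to follow’ is the confidence of a rule: of all the records that contain X, what share also contain Y? The sketch below assumes hypothetical shopping baskets, and `rule_confidence` is an illustrative name rather than a library function.

```python
def rule_confidence(transactions, x, y):
    """Of the transactions containing x, the share that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(y in t for t in with_x) / len(with_x)

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]
confidence = rule_confidence(baskets, "bread", "butter")
print(round(confidence, 2))
```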
Unsupervised learning is more difficult than supervised learning because the AI must take on a less well-defined problem. It is like learning a new language while only being given sentences you don’t understand.
When given access to millions of images taken from the internet, the computer may find itself grouping images into cats, houses, or selfies. And yet, it could equally cluster on cloudy, sunny, and rainy. Unsupervised learning is incredibly powerful for **exploratory data analysis** but may lead to some unexpected results.
‘Semi-supervised learning’ takes place when not all input data has example outputs, but some does.
In this way, it combines supervised and unsupervised data, or labelled and unlabelled data. As the learning process is not closely supervised, it typically relies on a model, often a deep learning model, to infer labels for the unlabelled examples, in effect translating the unsupervised data into supervised data.
And yet it reduces the burden of needing large amounts of labeled data. In turn, it opens up the possibility of many more problems being tackled by machine learning, ones in which we do not yet have sufficiently large data sets with which to undertake supervised learning.
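One common semi-supervised recipe is self-training: fit a model on the few labelled examples, let it label the unlabelled examples it is confident about, and repeat. The sketch below is a minimal version under several assumptions: a one-dimensional feature, a nearest-centroid classifier with at least two classes, and a confidence `margin` picked arbitrarily for the example.

```python
def centroids_of(labelled):
    """Mean feature value for each label."""
    groups = {}
    for x, label in labelled:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def self_train(labelled, unlabelled, margin=2.0, rounds=5):
    """Grow the labelled set with confident pseudo-labels (assumes two or more classes)."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    for _ in range(rounds):
        centres = centroids_of(labelled)
        confident, unsure = [], []
        for x in remaining:
            ranked = sorted(centres, key=lambda label: abs(x - centres[label]))
            best, second = ranked[0], ranked[1]
            # Only trust a guess that is clearly closer to one class than to the runner-up.
            if abs(x - centres[second]) - abs(x - centres[best]) >= margin:
                confident.append((x, best))
            else:
                unsure.append(x)
        if not confident:
            break
        labelled.extend(confident)
        remaining = unsure
    return labelled, remaining

# Two hand-labelled examples and five unlabelled ones (all hypothetical).
labelled, remaining = self_train([(1.0, "small"), (9.0, "large")], [2.0, 3.0, 8.0, 7.0, 5.0])
print(labelled, remaining)
```

Starting from only two labels, the procedure ends with six labelled examples and leaves the ambiguous value 5.0 unlabelled instead of guessing.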
Decision trees in semi-supervised learning
Semi-supervised self-training can use decision trees, which represent functions that map a vector of attribute values to a single output value or ‘decision.’
A decision is reached by following the tree from the root to a leaf. The decision tree learning algorithm chooses the attribute with the highest importance, which is defined in terms of entropy, a measure of the uncertainty associated with a random variable. The most important attribute is the one whose split reduces that uncertainty the most.
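The entropy-based choice of attribute can be sketched as follows. The code computes Shannon entropy in bits and the reduction in entropy (often called information gain) from splitting on each attribute. The four example rows and the attribute names `wheels` and `outdoors` are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in the entropy of `target` after splitting `rows` on `attribute`."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after

# Hypothetical labelled examples, invented for illustration.
rows = [
    {"wheels": "yes", "outdoors": "yes", "label": "train"},
    {"wheels": "yes", "outdoors": "yes", "label": "train"},
    {"wheels": "no", "outdoors": "yes", "label": "passenger"},
    {"wheels": "no", "outdoors": "no", "label": "passenger"},
]
best_attribute = max(["wheels", "outdoors"], key=lambda a: information_gain(rows, a, "label"))
print(best_attribute)
```

Splitting on `wheels` separates the two labels perfectly, so it removes all of the uncertainty and would be placed at the root of the tree.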
‘Reinforcement learning’ is something all children are familiar with – **reward and punishment**. In machine learning, reinforcement is similar to supervised learning, only without sample data – instead, it uses trial and error. Positive values are assigned to the desired actions to encourage the AI agent, and negative values to discourage undesired behaviors. A successful outcome reinforces recommendations or policies for a given situation or problem.
Having completed a task, solved a problem, or played a game, the AI is told whether it was successful or not. The AI agent’s challenge is to determine which actions were responsible for the outcome. Was it using the rook in the 25th move in a game of chess that led to the win? Or was it the opening sequence of pawn movements?
When the only reward is a final outcome, such as a chess win, the rewards are described as sparse. Receiving more regular reinforcement, based on how successfully the AI is interacting with the environment, can help it work out which actions lead toward those sparse rewards.
As with human learning, reinforcement learning does not require constant supervision and is very effective in noisy, data-rich environments.
Reinforcement learning models
Periodically receiving rewards as the AI interacts with its environment can be easier for designers than providing labeled examples of how to behave.
A **‘model-based’** approach involves building a transition model of the environment to help interpret reward signals and inform decisions regarding actions. The model may initially be known, as in the rules of chess, or unknown, in which case the AI learns through interaction with its environment.
On the other hand, in **‘model-free’** reinforcement learning, the AI does not ‘know’ or build a transition model for the environment. Instead, it learns to more directly represent how to behave.
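A well-known model-free method is tabular Q-learning, sketched below. The agent never builds a transition model; it only keeps a table of how good each action looks in each state and nudges those numbers after every step. The five-state corridor, the reward of 1 at the goal, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Toy environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)      # explore: try something at random
            else:
                best = max(q[state].values())     # exploit: best-looking action, ties broken randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward = step(state, action)
            # Trial-and-error update: move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}
print(policy)
```

The learned policy maps each state directly to an action, which is what ‘learning to more directly represent how to behave’ looks like in practice.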
Passive and active reinforcement learning
Reinforcement learning has two modes: passive and active.
**Passive reinforcement learning** involves an environment where the agent has a fixed operating policy that determines its actions over a set number of trials. On each trial, the agent starts in one state and does not finish until reaching a clearly defined end-state.
In **active reinforcement learning**, however, the agent must figure out what to do. There is no operating policy. The agent experiences as much as possible about the environment and must work out how to behave in it. In this way, it may discover completely new ways of reaching its objective that researchers did not think of.
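The passive case can be sketched with temporal-difference learning, one standard way to estimate how good each state is when the policy is fixed. Everything concrete here is an assumption for illustration: the five-state corridor, the ‘always step right’ policy, and the discount factor of 0.9. An active learner would also have to choose its own actions instead of calling `fixed_policy`.

```python
N_STATES = 5   # hypothetical corridor: states 0..4, state 4 is the end-state
GAMMA = 0.9    # assumed discount factor

def fixed_policy(state):
    """The agent's fixed operating policy: always step right."""
    return +1

def step(state, action):
    """Toy environment: reward 1 only when the end-state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def evaluate_policy(trials=100, alpha=0.5):
    """Temporal-difference estimate of each state's value under the fixed policy."""
    values = [0.0] * N_STATES
    for _ in range(trials):
        state = 0
        while state != N_STATES - 1:   # each trial runs until the end-state
            next_state, reward = step(state, fixed_policy(state))
            target = reward + GAMMA * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values

values = evaluate_policy()
print([round(v, 3) for v in values])
```

The agent does not decide anything here; it simply learns that states closer to the end-state are worth more under the policy it has been given.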
**Multimodal AI** is a new paradigm in AI that combines various types, or modes, of data, including speech, images, text, and numerical data. This brings it much closer to human cognitive abilities. After all, most human perception involves multiple sensory modalities combined to give a unified perception of the world. As a result, rather than just using one type of input, multimodal AI uses many.
Just consider a young child learning to walk, flooded with sensory feedback.
**Multimodal AI is an area of increasing research for deep learning** and is often combined with reinforcement learning. It is adept at handling multiple data types and, combined with complex intelligence algorithms, can outperform ‘single’ modal AI in real-world situations.
The potential for multimodal AI is already being felt in several research areas, including natural language processing, computer vision, speech processing, and data mining.
Three well-known models that deal with the tasks of image description and text-to-image generation are OpenAI’s CLIP and DALL-E, and their successor GLIDE.
Other areas for application include: visual question answering (VQA), text-to-image and image-to-text searching (WebQA), and video-language modelling (Project Florence-VL).
Applications for reinforcement learning
Early implementations of reinforcement learning focused on games such as cards, checkers, and backgammon. For example, the neural network backgammon program *Neurogammon* turned each move into a set of training samples labeled as either a better or worse position than others.
And yet, real-world examples of reinforcement learning are becoming increasingly common and are being explored in diverse environments, including radio-controlled helicopters, destination and route selection by taxi drivers, and detailed physical movements of pedestrians.
The ‘Deep Q-network’ was the first modern deep reinforcement learning system. It used a deep neural net alongside reinforcement learning and was trained on 49 different Atari games – using raw image data and a game score as a reward signal.
IBM used reinforcement learning to beat other contestants in the TV game show Jeopardy!, and DeepMind’s AlphaGo used reinforcement learning techniques to beat the very best human players of the ancient Chinese game, ‘Go’.