Machine Learning in a Nutshell
People often discuss Machine Learning and Artificial Intelligence, sometimes blurring the lines between them. In fact, Machine Learning is a subset of Artificial Intelligence: the general term for systems that learn from data.
For decades, AI research has tried to replicate the human mind and the way people think and learn. Over the years, the field has moved from algorithms grounded in predefined rules and logic (akin to instincts) to Machine Learning approaches, where rules are minimal and systems learn from data through trial and error. The human brain operates somewhere between these two approaches.
First defined in 1959 by the researcher Arthur Samuel, Machine Learning can now be grouped into three main types, based on the kind of feedback the system receives:
Supervised learning: Here, the system is given input-output pairs by a "teacher" and learns to map inputs to outputs;
Unsupervised learning: The system isn't given any labels and must figure out patterns in the data on its own, either to discover underlying structures or as a step to feature learning;
Reinforcement learning: A program interacts with an environment to achieve a goal (like playing a game) and receives feedback in the form of rewards, which it aims to maximise.
Each approach has its pros and cons, and no single method fits all scenarios.
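To make the first of these concrete, here is a minimal, purely illustrative sketch of supervised learning in Python: a toy 1-nearest-neighbour classifier that maps new inputs to the label of the closest example it was taught. The data and names are invented for illustration, not taken from any real system.

```python
# Toy supervised learning: 1-nearest-neighbour classification.
# A "teacher" supplies labelled input-output pairs; the model
# answers for a new input by finding the closest training example.

def nearest_neighbour(training_data, x):
    """Return the label of the training point closest to x."""
    closest = min(training_data, key=lambda pair: abs(pair[0] - x))
    return closest[1]

# Labelled input-output pairs: exam score -> pass/fail (invented).
training_data = [(35, "fail"), (42, "fail"), (65, "pass"), (80, "pass")]

print(nearest_neighbour(training_data, 90))  # -> pass
print(nearest_neighbour(training_data, 30))  # -> fail
```

Unsupervised learning would drop the labels and group the scores by similarity alone; reinforcement learning would instead learn from rewards collected while interacting with an environment.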
Executing machine learning tasks typically entails developing a model. This model is first trained using a specific set of data known as training data, which helps it learn and understand patterns. Once trained, the model can then analyse new, previously unseen data to make informed predictions or decisions. The quality and diversity of the training data play a crucial role in determining the model's accuracy and effectiveness in real-world scenarios. The ultimate goal is to create a model that can generalise well from the training data to new situations, making reliable predictions or classifications.
Machine Learning relies, at least at the beginning, on human teaching. However, just as teachers sometimes struggle to understand how their students reached an answer, asking questions like "What did I say to you?" or "How did you come to that conclusion?" - the same can be true for machines. Just as the human mind can produce surprising associations, so can machines.
It’s a black box. In layman's terms, black box Machine Learning refers to models that produce outcomes or make decisions without disclosing how they reached their conclusions. The intricacies of the model, including the internal processes and the significance attributed to various factors, remain cloaked in mystery. The result is a distinct opacity in such systems.
A black box model suggests that no individual, including those who coded or oversee the machine or algorithm, has clarity or comprehension about the path taken to arrive at the given outcome. Essentially, only the algorithm holds the exclusive knowledge of its decision-making process. This can be disconcerting because it becomes challenging to validate or question its choices, making accountability and interpretation especially crucial in applications where stakes are high.
Here, encapsulated succinctly, is the foundation of one of the most significant issues related to AI - accountability and responsibility. If a machine errs in its decisions or answers, leading to repercussions such as someone being dismissed, the restriction of an individual's rights, business ramifications, or racial discrimination, then who should be held accountable? The developers of the ML model? Its trainers? Its users? Or the institutions responsible for regulating them? It’s easy, but not realistic, to say - just unplug the machine.
A machine learning algorithm can empower software, in the end, to learn autonomously. Without direct programming, the algorithm can seemingly enhance its "intelligence" and improve its accuracy in predicting outcomes by processing historical data. The issue with historical data lies in its potential for bias and inaccuracy. When data is gathered from past events, situations, or decisions, it can often carry with it the prejudices, misjudgments, and errors of those times.
This can be particularly problematic when such data is used to train machine learning models or make future predictions, as the biases embedded within the data can be perpetuated and even amplified. For instance, if a dataset from past employment decisions is filled with gender or racial biases, an AI system trained on that data might make future hiring recommendations that are similarly skewed. Here’s a movie we recommend if you want to dive deeper into this - Coded Bias on Netflix.
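As a purely illustrative sketch (all data invented), a naive model trained on biased historical hiring decisions will simply reproduce that bias in its recommendations, exactly as the paragraph above describes:

```python
# Toy illustration of bias perpetuation: a "majority vote per group"
# model trained on biased historical hiring decisions reproduces
# that bias for every future candidate. All data here is invented.

from collections import Counter

def train_majority_by_group(records):
    """Learn the most common historical decision for each group."""
    by_group = {}
    for group, decision in records:
        by_group.setdefault(group, []).append(decision)
    return {group: Counter(ds).most_common(1)[0][0]
            for group, ds in by_group.items()}

# Biased history: group A was mostly hired, group B mostly rejected,
# regardless of the candidates' actual qualifications.
history = [("A", "hire"), ("A", "hire"), ("A", "reject"),
           ("B", "reject"), ("B", "reject"), ("B", "hire")]

model = train_majority_by_group(history)
print(model["A"])  # -> hire
print(model["B"])  # -> reject
```

Real systems are far more complex, but the mechanism is the same: the model has no notion of fairness, only of the patterns present in its training data.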
Ensuring the integrity and fairness of historical data is crucial to prevent the perpetuation of these biases and to ensure that the insights derived from such data are both accurate and just. The more accurate data the machine encounters, the more intelligent it becomes.
Machine learning is now widely utilised across various domains and it is already a big part of your life, whether you know it or not. It has been integrated into large language models that help in processing and understanding vast amounts of text, computer vision systems that enable machines to interpret and interact with visual data, or speech recognition tools that allow for voice-based commands and searches. The recommendation algorithms on social media platforms or online shops all rely on machine learning. Learning about you and your preferences, that is.
In more specialised sectors, such as agriculture, machine learning assists in predicting crop yields, detecting plant diseases, and optimising farming techniques. Similarly, in the realm of medicine or pharma, it aids in disease diagnosis, drug discovery, and patient care by analysing complex medical data.
These advancements highlight the versatility and potential of machine learning in reshaping various industries, making great discoveries, and solving some of the problems humanity has been struggling with for decades.