From statistics to data science – what is the difference and how can you benefit from each
Why do we need statistics?
The field of statistics has helped us answer many of life’s mysteries. It saves lives by helping us know how to treat people in hospitals, helps us make money by predicting values in the stock market, enables us to improve living and economic conditions around the world by quantifying important issues and measuring our improvement, and even shows us who might win the next election.
Statistics helped the Allies crack the Enigma code and win WW2. It’s so important that we even made a day for it, called ‘World Statistics Day’.
The saying goes that ‘knowledge is power’. Well, statistics is how we test that knowledge and find out if we are right or wrong, or somewhere in between. Perhaps statistics is the way for us to make our way to being ‘less wrong than we were before’.
What is statistics?
Statistics is how we find patterns and trends in data, and even make predictions about populations we have little information on based on small samples of observations.
Statistics is a useful tool for summarizing and explaining the data we have, as well as uncovering insights about entire countries based on data gathered from a small number of people. It enables us to conduct reliable and replicable research experiments and is the reason we understand as much about the world as we do today. Without it, we’d largely be guessing our way through things and hoping for the best.
Statistics can help us answer questions like “Does Kinnu improve final year exam performance?” and “Is there a relationship between how much time you spend in Kinnu and how interesting people find you?” (In case you were wondering, the answer to both questions is yes.)
What is Data Science and what can you do with it?
Data science involves using a combination of statistics, data analytics, and machine learning algorithms to create useful solutions from data.
With data science, we can teach computers to recognize handwriting to help sort mail or automate the digitization of forms, predict credit card fraud, communicate with humans, and even drive cars.
Data science has become an indispensable part of our modern world. It impacts us in both big and small ways, making our lives more efficient and streamlined. For example, on a small scale, data science allows us to have smart spam filters on our email accounts. Algorithms can learn to distinguish junk mail from important correspondence, saving us time and frustration.
But data science is also making a big impact on the way we tackle critical challenges like cancer detection. Thanks to sophisticated data analysis, doctors are now able to more accurately detect tumors and understand the unique features of each patient’s cancer. By using machine learning to analyze scans and other data, we can better target treatments, improving outcomes for countless patients.
What is the difference between statistics and data science?
Statisticians find relationships between things like height and life expectancy, infer something about a whole population from the data in a small sample, or use statistical tests to decide whether a new drug is an effective treatment.
Data scientists, on the other hand, define sets of rules or calculations, known as algorithms, so that computers can make predictions and decisions at scale, and learn along the way. These sets of rules can be thought of as ‘recipes’. Data scientists perform statistical analysis too, but their main focus is using machine learning algorithms to build these ‘recipes’.
In summary, while statistics is used for designing experiments and testing things, data science uses algorithms to find patterns in data and predict optimal decisions or what might happen in the future.
What knowledge does Data Science build on?
Data science is a combination of data analytics, statistics, and machine learning.
Data analytics helps us describe things, and make conclusions about what we can see in front of us from the data we have. Data analytics is for things like measuring and describing past performance – for example business revenues per quarter, or finding an e-commerce store’s top performing categories. It doesn’t make predictions or forecasts.
Statistics is about experiments, testing, and proving. Whereas analytics might show you differences between your revenues, statistics will show you whether that difference is significant, and which factors are likely to have caused it, as well as how much of an effect each individual factor had.
Machine Learning helps us find patterns in real-world data, and predict things based on new and limited information. These days, Artificial Intelligence (AI) and Machine Learning are used pretty much interchangeably, even though there are some differences. AI is mostly about mimicking human intelligence and behavior, and Machine Learning is a subfield of AI that can be used to help build AI solutions.
What is data analytics?
Data analytics can best be thought of as descriptive: you describe the characteristics of data that you already have, rather than transforming the data or running models to predict future data. Data analysts often rely on data visualization to tell informative stories about the data, creating narratives that are understandable to non-technical audiences. As an example, if you track all sales for an e-commerce store, you can find the average order value, and make a chart to show what percentage of sales each product category accounts for. Alternatively, you could create a dashboard for a new health app that tracks users’ physical activity and key health metrics.
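To make the e-commerce example concrete, here is a minimal sketch in Python. The orders, category names, and function names are all made up for illustration; the point is only that analytics summarizes data you already have.

```python
from collections import defaultdict

# Hypothetical orders for an imaginary store: (product category, order value).
orders = [
    ("books", 20.0),
    ("books", 30.0),
    ("toys", 50.0),
    ("garden", 100.0),
]

def average_order_value(orders):
    """Mean value across all orders."""
    return sum(value for _, value in orders) / len(orders)

def category_share(orders):
    """Percentage of total sales that each category accounts for."""
    totals = defaultdict(float)
    for category, value in orders:
        totals[category] += value
    grand_total = sum(totals.values())
    return {category: 100 * total / grand_total for category, total in totals.items()}

print(average_order_value(orders))  # 50.0
print(category_share(orders))       # {'books': 25.0, 'toys': 25.0, 'garden': 50.0}
```

Both numbers describe the past. Nothing here says whether next month will look the same.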
What can and can’t data analytics do?
A data analyst uses exploratory methods to look for gems in data that can inform business units like leadership or marketing, or even get passed on to the statisticians and machine learning engineers for further exploration and use. They create useful and engaging reports on data that is collected by companies, scientists, governments, and more.
Analytics gives you the information you need so that you’re not flying blind. It’s useful for decision-makers because it’s your eyes and ears. However, data analytics doesn’t allow you to come to conclusions beyond the data, whereas statistics and machine learning do – for example, by predicting the average weight of a population based on a small sample, or generating stock market predictions.
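As a rough illustration of going beyond the data, the sketch below estimates a population’s average weight from a small sample and attaches an approximate 95% confidence interval. The weights are invented, and the interval uses a simple normal approximation (the 1.96 multiplier), which is an assumption; for a sample this small, a statistician would usually prefer a slightly wider t-based interval.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for the population mean.

    Assumes a normal approximation; z=1.96 is the usual 95% multiplier.
    """
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical weights (kg) from a small sample of ten people.
weights_kg = [62.0, 70.5, 81.2, 68.4, 75.0, 59.8, 90.1, 72.3, 66.7, 77.9]

low, high = mean_confidence_interval(weights_kg)
print(f"Sample mean: {statistics.mean(weights_kg):.1f} kg")
print(f"Plausible range for the population mean: {low:.1f} to {high:.1f} kg")
```

The sample mean on its own is analytics. The range around it, which says something about people who were never measured, is statistics.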
How does analytics differ from statistics?
Data analytics gives you a better understanding of your data and helps you form better questions for exploration and verification with statistics. An example might be a marketing campaign performance report where you observe that an increase in sales coincided with a new social media ad campaign.
With statistics, however, you could test whether the difference in the performance of your marketing campaigns is statistically significant, which adds confidence to business decision-making. Statistical significance answers the question of whether the change in the data you are observing could just be due to chance.
While analytics helps you form hypotheses, statistics help you test them.
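One simple way to ask whether a change could just be due to chance is a permutation test, sketched below. This is only one of several possible tests, and the daily sales figures and function name are hypothetical. The idea is to shuffle the "before" and "after" labels many times and count how often a random split produces a difference at least as large as the one observed.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often chance alone gives a difference in means
    at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical daily sales before and after a new ad campaign.
before = [100, 102, 98, 101, 99, 100, 103]
after = [110, 112, 108, 111, 109, 113, 110]

p_value = permutation_p_value(before, after)
print(f"Estimated p-value: {p_value}")
```

A small p-value (commonly below 0.05) suggests the difference is unlikely to be due to chance alone. It does not by itself show that the campaign caused the increase.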
Can you make predictions with data analytics?
Analytics is descriptive: it shows you what is in front of you and describes it. It tells you where you have been, but not whether you’ve improved significantly, nor where you are going. This means that you can’t use analytics alone to make predictions.
Analytics is used to create reports and summaries of past data, like dashboards in apps such as an activity monitor that shows you how many hours a day you have been active.
To make predictions, you need to use statistics or data science. But analytics can help you find useful information in your existing data to feed into statistical tests or data science algorithms, so it is still part of the process.
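To show the step from describing to predicting, here is a minimal sketch that fits a straight line to past quarterly revenue using ordinary least squares and extends it one quarter ahead. The revenue figures are invented and a straight-line trend is an assumption made for simplicity; real forecasts usually need more data and a more careful model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical revenue (in millions) for the last four quarters.
quarters = [1, 2, 3, 4]
revenue = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(quarters, revenue)
forecast = intercept + slope * 5  # extend the trend to quarter 5
print(f"Forecast for quarter 5: {forecast}")  # 18.0
```

Reporting the four past quarters is analytics. Committing to a number for the fifth is a prediction, and it is only as good as the assumption that the trend continues.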