Why do you need statistics for ML, DL & AI?
Learning statistics before getting into artificial intelligence
Statistics is the art of making numerical conjectures about puzzling questions. […] The methods were developed over several hundred years by people who were looking for answers to their questions. (Page xiii, Statistics, Fourth Edition, 2007)
Much of the buzz around AI is about understanding algorithms from a theoretical perspective, but underneath those algorithms it is mathematics and statistics that do the work. Let's understand this in detail.
Let's start with a question: how can we define statistics?
As Wikipedia says,
Statistics is at the core of sophisticated machine learning algorithms, capturing and translating data patterns into actionable evidence.
Data Science
Data science always deals with data, but the basis for decision-making lies in statistics. Once data has been collected, it is important to know its distribution, its outliers, its central tendencies, and more. This is where statistics helps. Knowing statistics enables you to select the best techniques for data collection, apply the right analysis, and communicate the findings effectively. Making judgments based on data, making predictions, and making scientific discoveries all depend on statistics. It gives you a much deeper understanding of the subject you are studying.
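To make the idea of central tendencies and outliers concrete, here is a minimal sketch using only Python's standard library. The `summarize` helper, the made-up `ages` sample, and the 1.5 × IQR outlier rule are my own illustrative choices, not the only way to do this.

```python
import statistics

def summarize(values):
    """Return central tendency, spread, and IQR-based outliers for a list of numbers."""
    # Quartile cut points (Q1, median, Q3) using the library's default method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Common rule of thumb: flag points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "outliers": [v for v in values if v < low or v > high],
    }

# Made-up sample: eight typical ages and one extreme value.
ages = [22, 25, 27, 28, 30, 31, 33, 35, 95]
summary = summarize(ages)
print(summary)
```

In this sample the single extreme value pulls the mean well above the median, which is exactly the kind of thing you want to notice before choosing an analysis.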
Data
Data plays a major role in today's technology world. Most technologies are data-driven and generate large amounts of data every day. Data scientists are experts in analyzing data sources, cleaning and processing data, understanding why and how the data was generated, gaining insights from it, and recommending changes that serve the business. Recent data is everything.
As data keeps growing, it becomes hard to handle with regular, traditional techniques. Statistics gives you the tools for both pre-processing and post-processing of data. As the saying goes, a data scientist should be a good storyteller, whether the audience is a child or a client. Statistics helps here too: the probability function tells a story about distributions, distributions tell you how to deal with the data, and finally the data tells a story about the outputs.
A few topics that I think will help when working with data are as follows.
Describing and displaying data
    Graphical displays
    Numerical Summaries
    Normal Distributions
    Categorical Data
Linear regression and correlation
    Linear regression
    Correlation
    Inference in Linear Regression
    Multiple Linear Regression
    ANOVA for Regression
Experiments and sampling
    Experimental Design
    Sampling
    Sampling in Statistical Inference
Probability
    Probability Models
    Conditional Probability
    Random variables
    Mean and Variance of Random Variables
    Sample Means
Hypothesis tests and confidence intervals
    Confidence Intervals
    Tests of Significance
    Comparison of Two Means
Inference for Categorical Data
    Chi-square Goodness of Fit Test
    Two-Way Tables and the Chi-Square test
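As a small taste of two topics from the list above, correlation and linear regression, here is a rough sketch in plain Python. The helper names `pearson_r` and `fit_line` and the hours/scores numbers are made up for illustration; in practice you would probably reach for a statistics library.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Made-up data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
print(f"r = {r:.3f}, score = {slope:.2f} * hours + {intercept:.2f}")
```

Correlation tells you how strongly the two variables move together, while the fitted line gives you a way to predict one from the other.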
There are many more statistical concepts you should know beyond the ones I have mentioned. I will continue this series in upcoming posts, covering statistics from the basics to advanced topics. Follow me for more posts in this statistics series.
If you found this article insightful, follow me on LinkedIn, my Substack newsletter, and Medium. You can also subscribe to get notified when I publish new articles. Let's create a community! Thanks for your support!