# DESCRIPTIVE STATISTICS FOR DATA SCIENCE? It’s Easy If You Do It Smart

##### So you have the business objective and have painstakingly collected the data; now what? What should you do with the data, how do you understand the distribution of the target variable, and how do you analyse the influence or effect of other factors on the target?
The answer is statistics. Statistics is a branch of applied mathematics that provides methods and tools to collect, analyse, present, and interpret data, and to make decisions based on it. It is about studying data to get useful insights and to make educated estimates about the population.

The picture on the right quantifies the motivations behind betrayals among allies, families, kingdoms, lovers and rivals in the most famous show of our times, GoT. Had Daenerys had this summary at her disposal, she would have known better and kept her allies under strict watch 😉

Statistics can be divided into two main areas: descriptive statistics and inferential statistics. Let’s understand the basics of data before diving deeper into both.

## Variable Types

A data set is a collection of observations on one or more variables. For example, the heights of all students in a class can be represented by the variable height and, similarly, their weights by the variable weight. Depending on the type of data a variable holds, variables can be classified as in the picture below.

## Population and Sample

A population consists of all elements (individuals, items, or objects) whose characteristics are being studied. The population being studied is also called the target population. Since it is usually impossible or too expensive to get data for the whole target population, a portion of the population, referred to as a sample, is selected for study. A sample is used to make inferences about the behaviour of the population.

Now that we understand data, samples and populations, how do we analyse the data and make it useful? This is where the two areas of statistics come in: descriptive statistics and inferential statistics.

Descriptive statistics, as the name suggests, is used to display and describe data by using tables, graphs and summary measures.
Inferential statistics, on the other hand, involves studying a sample and using the results to make decisions or predictions about a population.

In this article, we focus on descriptive statistics and its various measures. Descriptive statistics and exploratory data analysis should be the first steps when building predictive or inferential models. Descriptive statistics helps us understand large amounts of data by providing methods to summarise it and to retrieve information about its underlying structure. There are two approaches to descriptive statistics: numerical and graphical.

| Numerical Methods | Graphical Methods |
| --- | --- |
| Measures of Central Tendency: Mean, Median and Mode | Univariate Data: Histograms, Pie Charts, Bar Plots |
| Measures of Dispersion: Variance, Standard Deviation, Range, IQR | Bivariate Data: Boxplot, Scatterplot |
| Measures of Association: Chi-square and Correlation | Multivariate Data: Biplots, Clustering |
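
To make the numerical measures a little more concrete, here is a minimal sketch using only Python's standard library. The `heights` and `weights` values are made up for illustration, and the helper names `summarize` and `pearson_r` are hypothetical, not from any particular library. Note that `statistics.variance` and `statistics.stdev` compute the sample versions (dividing by n - 1), and that the quartiles depend on the interpolation method `statistics.quantiles` uses by default.

```python
import math
import statistics


def summarize(values):
    """Return measures of central tendency and dispersion for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a measure of linear association."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical sample: heights (cm) and weights (kg) of ten students
heights = [150, 152, 155, 155, 158, 160, 162, 165, 170, 173]
weights = [45, 48, 50, 52, 53, 57, 60, 62, 68, 72]

summary = summarize(heights)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

print(f"correlation(height, weight): {pearson_r(heights, weights):.3f}")
```

In practice, libraries such as pandas or NumPy offer the same summaries for larger data sets; the standard library is used here only to keep the sketch self-contained.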