Introduction to Statistical Inference
Statistics is the science of collecting, analyzing, presenting, and interpreting data. The area of descriptive statistics is concerned primarily with methods of presenting and interpreting data using graphs, tables, and numerical summaries. Whenever statisticians use data from a sample, i.e., a subset of the population, to make statements about the population, they are performing statistical inference. Estimation and hypothesis testing are the procedures used to make such inferences (Anderson et al., 2020).
Statistical inference is the process of formulating conclusions from data and quantifying the uncertainty arising from using incomplete data.
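As a concrete illustration of quantifying uncertainty from incomplete data, the sketch below estimates a population mean from a random sample and attaches a confidence interval. The simulated population and all numbers are hypothetical, and the interval uses the normal approximation for simplicity; this is a minimal sketch, not a full treatment of estimation.

```python
import math
import random
import statistics

random.seed(42)  # for a reproducible illustration

# Hypothetical "population" (in practice we never observe it in full).
population = [random.gauss(50, 10) for _ in range(100_000)]

# Draw a simple random sample and estimate the population mean from it.
sample = random.sample(population, 100)
mean = statistics.mean(sample)

# Standard error of the sample mean quantifies sampling uncertainty.
se = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval (normal approximation, z = 1.96).
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(mean, ci)
```

The point estimate alone would hide the uncertainty; the interval makes explicit how much the estimate could plausibly differ from the unknown population value because only a subset of the data was observed.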
There are many modes of performing inference, including statistical modeling, data-oriented strategies, and the explicit use of design and randomization in analyses.
Furthermore, there are broad theories (frequentist, Bayesian, likelihood-based, design-based, etc.) and numerous complexities (missing data, observed and unobserved confounding, biases) involved in performing inference.
Knowledge and parsimony (using the simplest reasonable models to explain complex phenomena) go hand in hand.
Paramount among our concerns are:
- Is the sample representative of the population that we’d like to draw inferences about?
- Are there known and observed, known and unobserved, or unknown and unobserved variables that contaminate our conclusions?
- Is there systematic bias created by missing data or the design or conduct of the study?
- What randomness exists in the data, and how do we use or adjust for it? Here randomness can be either explicit, via randomization or random sampling, or implicit, as the aggregation of many complex unknown processes.
- Are we trying to estimate an underlying mechanistic model of phenomena under study?
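The explicit use of randomization mentioned above can itself be the basis for inference. The sketch below runs a permutation test on hypothetical outcomes from a two-group randomized study: it re-randomizes the group labels many times to see how often a difference as large as the observed one would arise by chance alone. All data values here are invented for illustration.

```python
import random
import statistics

random.seed(1)  # for a reproducible illustration

# Hypothetical outcomes from a randomized two-group study.
treated = [5.1, 6.3, 5.8, 7.0, 6.1, 5.9]
control = [4.8, 5.2, 4.9, 5.5, 5.0, 5.3]
observed = statistics.mean(treated) - statistics.mean(control)

# Permutation test: shuffle the pooled outcomes, split them into two
# groups of the original sizes, and record how often the shuffled
# difference is at least as large as the observed one.
pooled = treated + control
n_perm = 10_000
count = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:6]) - statistics.mean(pooled[6:])
    if diff >= observed:
        count += 1
p_value = count / n_perm
print(observed, p_value)
```

Because the randomization in the study design is what justifies the shuffling, this style of analysis needs no distributional model of the outcomes; the uncertainty comes directly from the design.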
References
Anderson, David R., Dennis J. Sweeney, and Thomas A. Williams. "Statistics". Encyclopedia Britannica, 20 Oct. 2020, https://www.britannica.com/science/statistics. Accessed 4 June 2021.