GEA1000 Quantitative Reasoning with Data
This course aims to equip undergraduate students with essential data literacy skills to analyse data and make decisions under uncertainty. It covers the basic principles and practice of collecting data and extracting useful insights, illustrated in a variety of application domains. For example, when two issues are correlated (e.g., smoking and cancer), how can we tell whether the relationship is causal (e.g., smoking causes cancer)? How can we deal with categorical data, or with numerical data? What about uncertainty and complex relationships? These and many other questions will be addressed using data software and computational tools, with real-world data sets.

Short Syllabus: The PPDAC cycle (Spiegelhalter, D., 2019, The Art of Statistics; MacKay, R.J. and Oldford, R.W., 2000, “Scientific Method, Statistical Method and the Speed of Light,” Statistical Science) will be used as a framework to highlight and demonstrate the process of dealing with and making sense of data. The course will consist of four chapters, broadly described below.

Getting data: collection/sampling, experiments/observational studies, data cleaning/recoding, interpreting summary statistics (mode, mean, quartiles, standard deviation, etc.).
Categorical data analysis: bar plots, contingency tables, rates/ratios, association, Simpson’s Paradox.
Dealing with numerical data: histograms, boxplots, scatter plots, correlation, ecological and atomistic fallacies, simple linear regression.
Statistical inference: probability, conditional probability, prosecutor’s fallacy, base rate fallacy, conjunction fallacy, understanding hypothesis tests, interpreting confidence intervals, learning about a population based on a sample, simple simulation.

Exploratory data analysis (EDA) will be incorporated extensively into the content. Students will appreciate that even simple plots and contingency tables can give them valuable insights about data.
Suitable real-world data sets will be used as motivating examples to introduce content, and the techniques and materials in the syllabus will be elucidated through the process of problem solving.