Introduction: Sources, modes of availability, inaccuracies, and uses of data.
Data Objects and Attributes: Descriptive Statistics; Visualization; and Data Similarity and Dissimilarity.
Pre-processing of Data: Cleaning for Missing and Noisy Data; Data Reduction – Discrete Wavelet
Transform, Principal Component Analysis, Partial Least Squares Method, Attribute Subset Selection; and
Data Transformation and Discretization.
Inferential Statistics: Probability Density Functions; Inferential Statistics through Hypothesis Tests.
Business Analytics: Predictive Analytics (Regression and Correlation, Logistic Regression, In-Sample and
Out-of-Sample Predictions); Prescriptive Analytics (Optimization and Simulation with Multiple Objectives).
Mining Frequent Patterns: Concepts of Support and Confidence; Frequent Itemset Mining Methods;
Pattern Evaluation.
Classification: Decision Trees – Attribute Selection Measures and Tree Pruning; Bayesian and Rule-based
Classification; Model Evaluation and Selection; Cross-Validation; Classification Accuracy; Bayesian Belief
Networks; Classification by Backpropagation; and Support Vector Machine.
Clustering: Partitioning Methods – k-means; Hierarchical Methods and Hierarchical Clustering Using
Feature Trees; Probabilistic Hierarchical Clustering; Introduction to Density-based, Grid-based, and Fuzzy and
Probabilistic Model-based Clustering Methods; and Evaluation of Clustering Methods.
Machine Learning: Introduction and Concepts; Ridge Regression; Lasso Regression; and k-Nearest
Neighbours Regression and Classification.
Supervised Learning with Regression and Classification Techniques: Bias-Variance Dichotomy, Linear
and Quadratic Discriminant Analysis, Classification and Regression Trees, Ensemble Methods (Random
Forest), Neural Networks, and Deep Learning.
Text/Reference Books:
- Han, J., M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, Elsevier, Amsterdam.
Textbook. Year of Publication 2012
- James, G., D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning with
Applications in R, Springer, New York. Year of Publication 2013
- Jank, W., Business Analytics for Managers, Springer, New York. Year of Publication 2011
- Williams, G., Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery,
Springer, New York. Year of Publication 2011
- Witten, I. H., E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and
Techniques, Morgan Kaufmann. Year of Publication 2011
- Montgomery, D. C., and G. C. Runger, Applied Statistics and Probability for Engineers, John Wiley &
Sons. Year of Publication 2010
- Shmueli, G., N. R. Patel, and P. C. Bruce, Data Mining for Business Intelligence, John Wiley & Sons,
New York. Year of Publication 2010
- Hastie, T., R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining,
Inference, and Prediction, Springer. Year of Publication 2009
- Bishop, C., Pattern Recognition and Machine Learning, Springer. Year of Publication 2007
- Tan, P., M. Steinbach, and V. Kumar, Introduction to Data Mining, Addison-Wesley. Year of
Publication 2005