Subject Code: ID6L001 | Subject Name: Data Analytics | L-T-P: 3-0-0 | Credit: 3 |
---|---|---|---|
Pre-requisite(s): None | |||
Introduction: Sources, modes of availability, inaccuracies, and uses of data. Data Objects and Attributes: Descriptive Statistics; Visualization; and Data Similarity and Dissimilarity. Pre-processing of Data: Cleaning for Missing and Noisy Data; Data Reduction – Discrete Wavelet Transform, Principal Component Analysis, Partial Least Squares Method, Attribute Subset Selection; and Data Transformation and Discretization. Inferential Statistics: Probability Density Functions; Inferential Statistics through Hypothesis Tests. Business Analytics: Predictive Analysis (Regression and Correlation, Logistic Regression, In-Sample and Out-of-Sample Predictions); Prescriptive Analytics (Optimization and Simulation with Multiple Objectives). Mining Frequent Patterns: Concepts of Support and Confidence; Frequent Itemset Mining Methods; Pattern Evaluation. Classification: Decision Trees – Attribute Selection Measures and Tree Pruning; Bayesian and Rule-based Classification; Model Evaluation and Selection; Cross-Validation; Classification Accuracy; Bayesian Belief Networks; Classification by Backpropagation; and Support Vector Machine. Clustering: Partitioning Methods – k-means; Hierarchical Methods and Hierarchical Clustering Using Feature Trees; Probabilistic Hierarchical Clustering; Introduction to Density-, Grid-, and Fuzzy and Probabilistic Model-based Clustering Methods; and Evaluation of Clustering Methods. Machine Learning: Introduction and Concepts: Ridge Regression; Lasso Regression; and k-Nearest Neighbours, Regression and Classification. Supervised Learning with Regression and Classification Techniques: Bias-Variance Dichotomy, Linear and Quadratic Discriminant Analysis, Classification and Regression Trees, Ensemble Methods: Random Forest, Neural Networks, Deep Learning. Text/Reference Books:
|