C0-Overview
A good book for learning statistics. I will write up the main ideas of this book in the future.
Outline of the book
- Introduction (skim to get the main idea)
- Overview of Supervised Learning (skim to get the main idea)
- Linear Methods for Regression (familiar)
- Linear Methods for Classification (familiar)
- Basis Expansions and Regularization (what is an expansion?)
- Kernel Smoothing Methods (I know kernels, but what is smoothing?)
- Model Assessment and Selection (does assessment mean evaluation? I think it is all about different performance measures)
- Model Inference and Averaging (what is model averaging?)
- Additive Models, Trees, and Related Methods 
- Boosting and Additive Trees (I forgot the main idea of boosting, and what are additive trees?)
- Neural Networks (very familiar)
- Support Vector Machines and Flexible Discriminants (familiar with SVMs, but what are flexible discriminants?)
- Prototype Methods and Nearest-Neighbors (I know a little about prototype methods; familiar with KNN)
- Unsupervised Learning (should be re-read carefully, because the main task of unsupervised learning is designing a good loss function)
- Random Forests (random feature selection + decision trees)
- Ensemble Learning (combining multiple models)
- Undirected Graphical Models (I know very little; maybe Markov random fields, rather than Markov decision processes?)
- High-Dimensional Problems (interesting!)
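
Two items in this list, least-squares regression and k-nearest neighbors, are the two simple predictors that Chapter 2 of the book compares. As a quick refresher, here is a minimal sketch in plain Python. The function names and the toy data are made up for illustration and do not come from the book; the example uses a single input variable to keep the arithmetic visible.

```python
def fit_least_squares(xs, ys):
    """Fit y = intercept + slope * x by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for simple (one-variable) linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def knn_predict(xs, ys, x0, k=3):
    """Predict at x0 by averaging the responses of the k nearest training points."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Toy data (assumed for illustration): points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_least_squares(xs, ys)
linear_prediction = intercept + slope * 2.5      # global linear fit
knn_prediction = knn_predict(xs, ys, 2.5, k=2)   # local average of 2 neighbors

print(intercept, slope, linear_prediction, knn_prediction)
```

The least-squares fit uses every training point to estimate one global line, while the nearest-neighbor prediction only looks at the points closest to the query. On this noise-free toy data the two agree; on noisy or nonlinear data they generally would not.
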
Detailed Outline
- Introduction (quick pass)
- Overview of Supervised Learning (get the main idea)
  - 2.1 Introduction (same as above)
  - 2.2 Variable Types and Terminology (what variables?)
  - 2.3 Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors (a line with the least sum of squared distances to the data? Predict according to the nearest neighbors.)
    - 2.3.1 Linear Models and Least Squares
    - 2.3.2 Nearest-Neighbor Methods
    - 2.3.3 From Least Squares to Nearest Neighbors (how?)
  - 2.4 Statistical Decision Theory (like naive Bayes?)
  - 2.5 Local Methods in High Dimensions (what does "local" mean here?)
  - 2.6 Statistical Models, Supervised Learning and Function Approximation
    - 2.6.1 A Statistical Model for the Joint Distribution Pr(X, Y) (what do you do with the joint distribution?)
    - 2.6.2 Supervised Learning
    - 2.6.3 Function Approximation (which function is being approximated?)
  - 2.7 Structured Regression Models (what does "structured" mean?)
    - 2.7.1 Difficulty of the Problem (what difficulty?)
  - 2.8 Classes of Restricted Estimators (never heard of restricted estimators)
    - 2.8.1 Roughness Penalty and Bayesian Methods (!)
    - 2.8.2 Kernel Methods and Local Regression (local regression?)
    - 2.8.3 Basis Functions and Dictionary Methods (dictionary methods?)
  - 2.9 Model Selection and the Bias–Variance Tradeoff
 
- Linear Methods for Regression
  - 3.1 Introduction
  - 3.2 Linear Regression Models and Least Squares
    - 3.2.1 Example: Prostate Cancer
    - 3.2.2 The Gauss–Markov Theorem (important!)
    - 3.2.3 Multiple Regression from Simple Univariate Regression (never knew this, NK)
    - 3.2.4 Multiple Outputs (NK)
  - 3.3 Subset Selection
    - 3.3.1 Best-Subset Selection (NK)
    - 3.3.2 Forward- and Backward-Stepwise Selection
    - 3.3.3 Forward-Stagewise Regression
    - 3.3.4 Prostate Cancer Data Example (Continued)
  - 3.4 Shrinkage Methods (never knew this)
    - 3.4.1 Ridge Regression (L2)
    - 3.4.2 The Lasso (L1)
    - 3.4.3 Discussion: Subset Selection, Ridge Regression and the Lasso (subset of what?)
    - 3.4.4 Least Angle Regression (NK)
  - 3.5 Methods Using Derived Input Directions (NK)
    - 3.5.1 Principal Components Regression (something to do with eigenvectors, I think)
    - 3.5.2 Partial Least Squares (partial? what is the other part?)
  - 3.6 Discussion: A Comparison of the Selection and Shrinkage Methods (****)
  - 3.7 Multiple Outcome Shrinkage and Selection
  - 3.8 More on the Lasso and Related Path Algorithms
    - 3.8.1 Incremental Forward Stagewise Regression
    - 3.8.2 Piecewise-Linear Path Algorithms
    - 3.8.3 The Dantzig Selector
    - 3.8.4 The Grouped Lasso
    - 3.8.5 Further Properties of the Lasso
    - 3.8.6 Pathwise Coordinate Optimization
  - 3.9 Computational Considerations
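
The outline tags ridge regression as "L2" and the lasso as "L1". To make that difference concrete, here is a minimal sketch for the simplest possible case: one predictor and no intercept. The function names, the toy data, and the exact scaling of the objectives (noted in the docstrings) are assumptions chosen for this illustration; other texts and libraries scale the penalty differently.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, and return exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def ridge_1d(xs, ys, lam):
    """Minimize sum((y - b*x)^2) + lam * b^2 over b (L2 penalty)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def lasso_1d(xs, ys, lam):
    """Minimize 0.5 * sum((y - b*x)^2) + lam * |b| over b (L1 penalty)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(sxy, lam) / sxx


# Toy data (assumed for illustration): y = 0.5 * x exactly, so the
# unpenalized least-squares coefficient is 0.5.
xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]

for lam in (0.0, 3.5, 10.0):
    print(lam, ridge_1d(xs, ys, lam), lasso_1d(xs, ys, lam))
```

In this one-dimensional setting the L2 penalty rescales the coefficient, so it gets smaller but never reaches zero for a finite penalty, while the L1 penalty subtracts a fixed amount and sets the coefficient to exactly zero once the penalty is large enough. That thresholding behavior is why the lasso can drop variables entirely.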
 
