- reading local/remote files in different formats
- data mutation, e.g., adding/removing columns; recoding; sorting
- exporting data
- selecting rows and columns
- summarizing
- merging files
- correlation & linear regression
- connecting to MySQL
- Read Titanic data
- Exploratory data analysis, e.g., what percent survived? What percent of each gender survived?
- ggplot: does Pclass affect the survival rate?
- Build the decision tree & plot the tree
- Training/testing split for evaluation
- Further learning: random forest; cross-validation