Notes on using Python for Data Analytics

  • Chapter 0: Foundations of Python
  • Chapter 1: Essential libraries
    • NumPy
    • Pandas
    • Basic data visualization
      • Scatter Plots
      • Histograms
      • Cumulative Frequencies
      • Error bars
      • Box plots
      • Pie Charts
  • Chapter 2: Statistics brief review
    • Descriptive statistics
      • Distribution center
      • Quantifying variability
    • Random variables and distributions
      • Discrete distributions
      • Continuous distributions
      • Maximum Likelihood
    • Hypothesis testing
      • General procedure and normality check
      • Types of Error and ROC curve
    • Common tests
      • T-test for a mean value and Wilcoxon signed-rank test
      • Paired t-test and Mann-Whitney test
      • ANOVA
      • Multiple comparisons (Tukey’s test, Bonferroni correction, and Holm correction)
      • Kruskal-Wallis test
      • Two-way and Three-way ANOVA
      • Tests on categorical data
    • Design of Experiments
  • Chapter 3: Statistical methods and modeling
    • Linear correlation
    • Linear regression
      • Ordinary least squares
      • Polynomial regression
      • Ridge regression
      • Lasso regression
      • Elastic-net regression
    • Regression analysis
    • Logistic regression
    • Ordinary logistic regression
    • Nonparametric methods
    • Bootstrapping
    • Multivariate data analysis
    • Markov chain Monte Carlo simulation
    • Time series analysis
      • Extracting statistics
      • Autocorrelation & moving average models
      • ARIMA models
      • Seasonality and exogenous variables
      • Hidden Markov Model
    • Dimension reduction and feature extraction
      • Singular value decomposition and matrix factorization
      • Principal components analysis (PCA)
      • Multi-dimensional scaling (MDS)
  • Chapter 4: Clustering
    • Hierarchical clustering
    • K-means clustering
    • Gaussian mixture models
    • Model selection and fine-tuning the clustering
  • Chapter 5: Classification
    • k-Nearest neighbors
    • Decision tree
    • Regression tree
    • Random forests
    • Naïve Bayes
    • Gradient boosted decision trees
    • Support vector machines
    • Neural networks
  • Chapter 6: Association rules
    • Apriori algorithm
    • FP-growth algorithm
  • Chapter 7: Text mining
    • Basic natural language processing
    • Data processing and conversion
    • Text classification
    • Topic modeling
    • Generative models and latent Dirichlet allocation (LDA)
    • Social network analysis
  • Chapter 8: Deep learning
    • Preface
    • Convolutional neural networks
  • Chapter 9: Reinforcement learning
  • Chapter 10: Mathematical programming
    • Resources in Python
      • SciPy (optimize)
      • Pyomo
      • PuLP
    • Common solvers
      • Gurobi
      • CPLEX
      • lp_solve
      • GLPK
      • COIN-OR
      • SCIP
      • Google OR-Tools
  • Chapter 11: (Meta)Heuristic search techniques
    • Local search
    • Basics of genetic algorithms and an introduction to DEAP
  • Chapter 12: Discrete-event simulation
    • Knowledge for simulation
    • Output analysis
    • SimPy
  • Chapter 13: Advanced data visualization
    • Interactive plots
    • 3D plots
  • Appendix A: Data cleansing and wrangling
  • Appendix B: Working with varied data sources
