Supervised Learning with scikit-learn

Supervised Learning Basics

Types

  • Classification - predict the label or category of an observation (e.g., whether a transaction is fraudulent or not)

  • Regression - predict a continuous value (e.g., the cost of a house based on its size, number of bedrooms, …)

Terminology

  • features - the input variables, also called independent variables or predictor variables

  • target variable - the variable being predicted, also called the dependent variable or response variable
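The split between features and target can be sketched with a small pandas DataFrame. The housing columns and values below are made up purely for illustration; the names `X` and `y` follow a common convention rather than anything the notes prescribe.

```python
import pandas as pd

# Hypothetical housing data; column names and values are illustrative only.
df = pd.DataFrame({
    "size_sqft": [850, 1200, 1500, 2000],
    "bedrooms": [2, 3, 3, 4],
    "price": [150_000, 210_000, 250_000, 330_000],
})

# Features: every column used as input to the model.
X = df.drop(columns="price")

# Target variable: the single column we want to predict.
y = df["price"]

print(X.shape, y.shape)  # (4, 2) (4,)
```

Here `X` is two-dimensional (one row per observation, one column per feature) and `y` is one-dimensional (one value per observation).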

Data Prerequisites

  • the data must not contain missing values

  • all values must be numeric

  • the data is usually stored in pandas DataFrames or NumPy arrays

  • perform exploratory data analysis (EDA) first to confirm the data meets these requirements
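As a rough sketch of how the first two requirements might be checked during EDA, the snippet below counts missing values per column and lists non-numeric columns. The DataFrame is a made-up example that deliberately violates both requirements, and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical data with one missing value and one text column.
df = pd.DataFrame({
    "size_sqft": [850.0, 1200.0, np.nan, 2000.0],
    "bedrooms": [2, 3, 3, 4],
    "city": ["A", "B", "A", "C"],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Columns whose dtype is not numeric and would need encoding or removal.
non_numeric = [col for col in df.columns if not is_numeric_dtype(df[col])]

print(missing_per_column)
print(non_numeric)
```

In this example `size_sqft` has one missing value and `city` is flagged as non-numeric, so both would need to be handled before fitting a model.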

scikit-learn Syntax

  • scikit-learn

  • the scikit-learn home page organizes the library by task, which makes it easy to browse: classification, regression, clustering, dimensionality reduction, model selection, preprocessing
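scikit-learn estimators generally follow the same pattern: create the model, call `fit` on features and target, then call `predict` on new observations. The sketch below uses `KNeighborsClassifier` only as one example of a classifier; the toy data and the choice of `n_neighbors=3` are assumptions for illustration, not something taken from these notes.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data: two well-separated groups of points (illustrative values only).
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],   # class 0
    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1],   # class 1
])
y = np.array([0, 0, 0, 1, 1, 1])

# 1. Instantiate the model with its hyperparameters.
model = KNeighborsClassifier(n_neighbors=3)

# 2. Fit the model to the features and the target variable.
model.fit(X, y)

# 3. Predict labels for new, unseen observations.
X_new = np.array([[1.1, 0.9], [5.1, 5.0]])
predictions = model.predict(X_new)

print(predictions)  # [0 1]
```

Swapping in a different estimator, such as a regressor, typically changes only the import and the instantiation line; the `fit` and `predict` calls stay the same.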