1. Intro

  • Goal: take an input vector $x$ and assign it to one of $K$ discrete classes $C_k$, where $k = 1, \dots, K$

Regression, by contrast, predicts continuous target values.


  • Two class: target variable $t \in \{0, 1\}$, where $t = 1$ represents class $C_1$ and $t = 0$ represents class $C_2$
  • Multi class: $t$ is a vector with a single 1 at the position of its class (1-of-$K$ coding), e.g. $t = (1, 0, 0, 0, 0, 0, …)$ when the class is $C_1$
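
The two target encodings above can be sketched in a few lines. This is a minimal illustration, assuming classes are numbered from 1 as in $C_1, \dots, C_K$; the helper names `binary_target` and `one_hot` are hypothetical and not part of the notes.

```python
def binary_target(k):
    """Two-class target: t = 1 for class C_1, t = 0 for class C_2."""
    if k not in (1, 2):
        raise ValueError("two-class problems only have classes 1 and 2")
    return 1 if k == 1 else 0


def one_hot(k, num_classes):
    """Multi-class target: a length-K vector with a single 1 at position k.

    Assumes 1-based class indices (C_1 .. C_K).
    """
    if not 1 <= k <= num_classes:
        raise ValueError("class index must lie in 1..K")
    return [1 if j == k else 0 for j in range(1, num_classes + 1)]


t_two_class = binary_target(2)   # target for class C_2
t_multi = one_hot(1, 6)          # target for class C_1 with K = 6
```

With $K = 6$, `one_hot(1, 6)` yields the vector $(1, 0, 0, 0, 0, 0)$ used in the example above.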

For linear regression, we only need $y = w^Tx + w_0$ to obtain a real number.

For a classification problem, we wish to predict discrete class labels, or more generally posterior probabilities that lie in the range (0, 1).

So we pass the linear function through a nonlinear function $f(\cdot)$, called the activation function: $y(x) = f(w^Tx + w_0)$
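
As a rough sketch of this idea, the snippet below squashes the linear output $w^Tx + w_0$ into (0, 1). The logistic sigmoid is only one possible choice of activation function and is an assumption here, since the notes do not name a specific one; the function names are hypothetical.

```python
import math


def linear(w, x, w0):
    """Linear part: w^T x + w_0, an unbounded real number."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def sigmoid(a):
    """Logistic sigmoid (assumed activation), written to avoid overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def predict_proba(w, x, w0):
    """y(x) = f(w^T x + w_0) with f taken to be the sigmoid."""
    return sigmoid(linear(w, x, w0))


# Illustrative, made-up weights and input.
p = predict_proba([1.0, -1.0], [2.0, 2.0], 0.0)  # linear part is exactly 0
```

When the linear part is exactly zero the sigmoid returns 0.5, so the input sits on the boundary between the two classes.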

2. Models

  • Discriminant Function
    • Maps an input $x$ directly to a decision
    • $\mathbb{R}^n \to \mathbb{R}$
    • Example: SVM

3. Discriminant Function

3.1 Two class

  • $y(x) \geq 0$ → $C_1$
  • Otherwise → $C_2$
  • Decision boundary: $y(x) = 0$
    • Perpendicular to $w$
    • Displacement from origin = $-w_0 / \|w\|$
    • Perpendicular distance $r$ of a point $x$ from the decision surface: $r = y(x) / \|w\|$
      It is related to the difficulty of classification: points with small $|r|$ lie close to the boundary
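
The two-class rule and its geometry can be sketched as follows. This assumes the linear form $y(x) = w^Tx + w_0$ from the intro, together with the standard results that the signed distance of $x$ from the boundary is $y(x)/\|w\|$ and that the boundary's displacement from the origin is $-w_0/\|w\|$. All function names and the numeric values are hypothetical.

```python
import math


def y(w, x, w0):
    """Linear discriminant y(x) = w^T x + w_0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def norm(w):
    """Euclidean length of w."""
    return math.sqrt(sum(wi * wi for wi in w))


def classify(w, x, w0):
    """Assign C1 when y(x) >= 0, otherwise C2."""
    return "C1" if y(w, x, w0) >= 0 else "C2"


def signed_distance(w, x, w0):
    """Perpendicular distance r of x from the boundary (positive on the C1 side)."""
    return y(w, x, w0) / norm(w)


def origin_displacement(w, w0):
    """Signed distance of the boundary from the origin, measured along w."""
    return -w0 / norm(w)


# Made-up example: w = (3, 4) has length 5, and w0 = -5.
w, w0 = [3.0, 4.0], -5.0
label_far = classify(w, [3.0, 4.0], w0)      # y = 20, well inside C1
r_far = signed_distance(w, [3.0, 4.0], w0)   # 20 / 5
label_origin = classify(w, [0.0, 0.0], w0)   # y = -5, so C2
d0 = origin_displacement(w, w0)              # 5 / 5
```

In this example the origin lies one unit from the boundary on the $C_2$ side, while the point $(3, 4)$ lies four units away on the $C_1$ side.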

3.2 Multi class