machine learning basics pt.5

Machine Learning Basics

Linear Classification Model Metrics

  • Unlike linear regression, classification models such as logistic regression and SVM need their own metrics to evaluate performance.
  • the metric of linear regression = $\displaystyle R^2$ score

    Be careful! Do not call the $\displaystyle R^2$ score an accuracy score.

  • metrics of a classification model = matter most when the labels are imbalanced

    the case where the accuracy score is high but does not guarantee real performance
    Example: a medical test kit with 99% accuracy. Do you trust this score? If the data is imbalanced (say only 1% of patients are positive), a test that always predicts "negative" is also 99% accurate yet detects no one. (accuracy overstates modeling performance)
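The imbalance trap above can be shown with a tiny sketch (the 1%-positive split is a hypothetical illustration, not from the notes):

```python
# Hypothetical illustration: only 10 of 1000 patients are actually positive.
# A useless "always negative" test still scores 99% accuracy.
y_true = [1] * 10 + [0] * 990   # 10 positives, 990 negatives
y_pred = [0] * 1000             # predict "negative" for everyone

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.99 -- high accuracy, yet every positive case is missed
```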

  • Confusion Matrix

|                       | (Real) Positive | (Real) Negative |
|-----------------------|-----------------|-----------------|
| (Prediction) Positive | TP              | FP              |
| (Prediction) Negative | FN              | TN              |

Precision (정밀도, evaluated over the predicted positives) = $\displaystyle \frac{TP}{TP + FP}$
Recall (재현율, evaluated over the real positives) = $\displaystyle \frac{TP}{TP + FN}$
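The two formulas can be computed directly from the four confusion-matrix counts (the counts below are hypothetical):

```python
# Minimal sketch: precision and recall from hypothetical confusion-matrix counts.
TP, FP, FN, TN = 8, 4, 2, 86

precision = TP / (TP + FP)  # of everything predicted positive, how much was right
recall = TP / (TP + FN)     # of everything actually positive, how much was found
print(precision, recall)
```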

Regularization of Classification Models

  • the case when regularization is necessary: when the model has many weights, in order to restrain overfitting (a more complicated model ⇒ stronger regularization ⇒ lower C)
  • cost function formula of a classification model =
    $\displaystyle error + \frac{1}{C} \times \vert w \vert$

    Multiplying through by the positive constant $C$ leaves the minimizer unchanged, giving the equivalent objective
    $\displaystyle C \times error + \vert w \vert$

    C = the rate of error tolerance (lowering C tolerates more error and strengthens regularization)

  • SVM & SVC

    SVC (Support Vector Classifier)

    • handles both linear and non-linear problems, chosen by parameter
    • set the kernel parameter to impose non-linearity
    • margin $\displaystyle \propto \frac{1}{C}$
    • To maximize the margin, allow some error by lowering the value of C, which reduces overfitting
    • Hard-margin SVM (models without allowing any error at all) vs soft-margin SVM (selects the optimal margin by allowing error, letting some data fall inside the margin)
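The soft- vs hard-margin trade-off above can be sketched with SVC (assumes scikit-learn; the overlapping blobs are synthetic):

```python
# Sketch: a small C gives a soft margin (wide, errors tolerated, many support
# vectors); a very large C approximates a hard margin (narrow, few support vectors).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

soft = SVC(kernel="linear", C=0.01).fit(X, y)    # soft margin: wide
hard = SVC(kernel="linear", C=1000.0).fit(X, y)  # near-hard margin: narrow
print(len(soft.support_), len(hard.support_))    # soft margin keeps more support vectors
```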
|                     | prediction value | error (loss) | range of error |
|---------------------|------------------|--------------|----------------|
| Logistic Regression | $\displaystyle \frac{1}{1 + e^{-wx}}$ (the value of probability) | log loss | $\displaystyle 0 < n < \infty$ |
| SVM | $\displaystyle \begin{cases} 1 \\ 0 \end{cases}$ (a class label) | hinge loss | $\displaystyle 0 \leq n < \infty$ |
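The two losses in the table can be written out for a single example whose true label is positive (a minimal sketch; the helper names are my own):

```python
# Sketch of the two losses for one example with true label = positive.
import math

def log_loss(p):
    # logistic regression outputs a probability p in (0, 1);
    # the loss is unbounded as p -> 0, matching the range 0 < n < inf
    return -math.log(p)

def hinge_loss(s):
    # SVM scores a raw margin s (positive label encoded as +1);
    # the loss is exactly 0 once the margin s >= 1 is satisfied
    return max(0.0, 1 - s)

print(log_loss(0.9), hinge_loss(1.5))
```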
  • Kernel: transform non-linear coordinates into coordinates where the problem becomes linear
    $\displaystyle \implies$ MLP classifier (using the rbf formula and various other methods, finds the linear relation) $\displaystyle \rightarrow$ deep learning and NN (Neural Network)
  • Perceptron: when the non-linearity is very complex, stacking layers lets the model solve the non-linear problem.
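The layer-stacking point can be illustrated with the classic XOR problem (assumes scikit-learn; XOR is my example, not from the notes):

```python
# Sketch: a single (linear) Perceptron cannot fit XOR, but stacking a hidden
# layer (MLPClassifier) handles the non-linearity, as the note describes.
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]  # XOR: not linearly separable

linear = Perceptron(max_iter=1000).fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                    max_iter=5000, random_state=0).fit(X, y)
print(linear.score(X, y), mlp.score(X, y))  # the linear model can never reach 1.0
```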