# Introduction to Anomaly Detection

Anomaly detection (a.k.a. **outlier detection**) is the process of detecting unexpected observations in a given dataset.
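As a minimal sketch of the idea, one simple (and far from the only) way to flag unexpected observations is a z-score rule: a value is treated as an outlier when it lies more than a chosen number of standard deviations from the mean. The function name `zscore_outliers`, the threshold of 2.0, and the sample readings are illustrative assumptions, not part of any particular library.

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    Illustrative helper: the threshold is an assumption and should be
    tuned for the data at hand.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]

# Hypothetical sensor readings with one obviously unusual value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
outliers = zscore_outliers(readings)
print(outliers)
```

Note that a single extreme value also inflates the mean and standard deviation, so on small samples a robust statistic such as the median may be a better choice.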

Unlike a decision tree classifier, some machine learning models cannot handle categorical data directly. Categorical features often require a transformation before they can be included, most commonly label encoding or one-hot encoding.
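The two transformations can be sketched in plain Python as follows. The helper names `label_encode` and `one_hot_encode` and the sample `colors` list are hypothetical, and sorting the categories alphabetically is just one possible convention for assigning codes.

```python
def label_encode(values):
    """Map each category to an integer code (alphabetical order assumed)."""
    categories = sorted(set(values))
    mapping = {cat: idx for idx, cat in enumerate(categories)}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Map each category to a 0/1 vector with a single 1."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories

colors = ["red", "green", "blue", "green"]

labels, mapping = label_encode(colors)
print(mapping)  # integer code assigned to each category
print(labels)

one_hot, categories = one_hot_encode(colors)
print(categories)  # column order of the one-hot vectors
print(one_hot)
```

Label encoding implies an ordering between categories, which may mislead models that treat the codes as magnitudes; one-hot encoding avoids this at the cost of one column per category.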

Imbalanced datasets are a common problem in machine learning classification tasks. Take credit card fraud prediction as a simple example: the target is either fraud (1) or **not fraud (0)**, but fraud cases (1) may make up less than one percent of the whole dataset.
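One common way to compensate for such an imbalance is to weight each class inversely to its frequency. The sketch below uses the heuristic `n_samples / (n_classes * class_count)`; the function name `balanced_class_weights` and the 990/10 split are assumptions chosen to mirror the one-percent fraud example.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to how often it appears."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical dataset: 990 legitimate transactions (0) and 10 frauds (1).
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

The rare class receives a much larger weight, so misclassifying a fraud case costs the model more during training. Resampling (oversampling the minority or undersampling the majority) is an alternative approach.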

Feature scaling means transforming variable values into a standard range. It can be quite important for certain machine learning algorithms, such as models trained with gradient descent and support vector machines. This post introduces several feature scaling techniques.
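Two widely used techniques, min-max scaling and standardization, can be sketched as below. The helper names and the sample `ages` list are illustrative assumptions.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature, no spread to scale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

ages = [20, 30, 40, 50, 60]
scaled = min_max_scale(ages)
standardized = standardize(ages)
print(scaled)
print(standardized)
```

Min-max scaling is sensitive to extreme values because they define the range, whereas standardization does not bound the output but keeps the relative spread.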

From K-means we know that:

- K-means implicitly assumes that clusters are roughly spherical
- In K-means clustering, every point belongs to exactly one cluster

In mathematics, the Hessian matrix, or simply the **Hessian**, is a square matrix of second-order partial derivatives of a scalar-valued function (a scalar field). It describes the local curvature of a function of many variables. Hessian matrices are often used in optimization, for example in Newton's method (also called the Newton-Raphson method).
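To make the definition concrete, the sketch below approximates the Hessian numerically with central finite differences. The function `numerical_hessian`, the step size `h`, and the quadratic test function are assumptions for illustration; in practice an analytic or automatically differentiated Hessian is usually preferable.

```python
def numerical_hessian(f, point, h=1e-4):
    """Approximate the matrix of second partial derivatives of f at `point`.

    Uses the central difference
    (f(+,+) - f(+,-) - f(-,+) + f(-,-)) / (4 h^2) for every pair (i, j).
    """
    n = len(point)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return f(p)
            hessian[i][j] = (
                shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)
            ) / (4 * h * h)
    return hessian

# f(x, y) = x^2 + 3xy + 2y^2 has the constant Hessian [[2, 3], [3, 4]].
def f(p):
    x, y = p
    return x**2 + 3 * x * y + 2 * y**2

H = numerical_hessian(f, [1.0, 2.0])
for row in H:
    print([round(v, 3) for v in row])
```

The result is symmetric, as expected for a function with continuous second derivatives.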

K-means clustering is a type of unsupervised learning, which is used for unlabeled data (i.e., data without defined categories or groups). The goal of this algorithm is to find groups in the data, with the number of groups represented by the variable **K** (defined manually as an input).
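A minimal sketch of the standard iterative procedure (assign each point to its nearest centroid, then move each centroid to the mean of its points) might look like this. Initializing the centroids with the first `k` points is a simplifying assumption made here for reproducibility; real implementations typically use random or k-means++ initialization. The sample points are made up.

```python
import math

def kmeans(points, k, max_iter=100):
    """Cluster `points` into `k` groups; returns (labels, centroids)."""
    # Assumption: take the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point gets exactly one label.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The value of `k` is supplied by the caller, matching the description above, and the result can depend on the initial centroids.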

Matrix factorization is a class of algorithms used for recommendation systems in machine learning. These algorithms work by decomposing a large matrix, such as the user-item interaction matrix, into the product of lower-dimensional matrices. Well-known techniques based on matrix decomposition include SVD and PCA.
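As a rough illustration of the idea in a recommendation setting, the sketch below learns two small factor matrices `P` (users) and `Q` (items) with stochastic gradient descent so that their product approximates the observed ratings. This is a simple gradient-based factorization, not SVD or PCA, and the function names, hyperparameters (`rank`, `lr`, `reg`, `epochs`), and ratings are all assumptions made for the example.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Learn P (n_users x rank) and Q (n_items x rank) from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            err = r - pred
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5

# Hypothetical observed ratings: (user index, item index, rating from 1 to 5).
ratings = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 4),
]

P0, Q0 = factorize(ratings, n_users=4, n_items=4, epochs=0)  # untrained factors
P, Q = factorize(ratings, n_users=4, n_items=4)
initial_rmse = rmse(ratings, P0, Q0)
trained_rmse = rmse(ratings, P, Q)
print(initial_rmse, trained_rmse)
```

Once trained, the dot product of a user row of `P` and an item row of `Q` gives a predicted rating for pairs that were never observed.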