
Label propagation

Label propagation is a family of semi-supervised algorithms based on a graph representation of the dataset. In particular, if we have N labeled points (with bipolar labels +1 and -1) and M unlabeled points (denoted by y=0), it's possible to build an undirected graph based on a measure of geometric affinity among the samples. If G = {V, E} is the formal definition of the graph, each vertex corresponds to a sample and is marked with its label, so V = { -1, +1, 0 }, while the edge set is derived from an affinity matrix W (often called the adjacency matrix when the graph is unweighted), which depends only on the X values, never on the labels.

In the following graph, there's an example of such a structure:

Example of binary graph

In the preceding example graph, there are four labeled points (two with y=+1 and two with y=-1) and two unlabeled points (y=0). The affinity matrix is normally symmetric and square, with dimensions (N+M) × (N+M). It can be obtained with different approaches. The most common ones, which are also adopted by Scikit-Learn, are:

  • k-Nearest Neighbors (we are going to discuss this algorithm in further detail in Chapter 8, Clustering Algorithms):

    W_ij = 1 if x_j is among the k nearest neighbors of x_i, W_ij = 0 otherwise

  • Radial basis function kernel:

    W_ij = exp(-γ‖x_i − x_j‖²)

Sometimes, in the radial basis function kernel, the parameter γ is represented as the reciprocal of 2σ²; however, small γ values, corresponding to a large variance, increase the radius, including farther points and smoothing the class over a number of samples, while large γ values restrict the boundaries to a subset that tends toward a single sample. In the k-nearest neighbors kernel, instead, the parameter k controls the number of samples to consider as neighbors.
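Both affinity matrices can be sketched directly in NumPy (a minimal illustration with hypothetical sample coordinates and a hypothetical γ value; Scikit-Learn offers `rbf_kernel` and `kneighbors_graph` for the same purpose):

```python
import numpy as np

def rbf_affinity(X, gamma=10.0):
    """RBF affinity: W_ij = exp(-gamma * ||x_i - x_j||^2)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def knn_affinity(X, k=2):
    """k-NN affinity: W_ij = 1 if x_j is among the k nearest neighbors of x_i."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dist, np.inf)          # exclude each point from its own neighborhood
    idx = np.argsort(sq_dist, axis=1)[:, :k]   # indices of the k closest points per row
    W = np.zeros(sq_dist.shape)
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, idx.ravel()] = 1.0
    return np.maximum(W, W.T)                  # symmetrize (the k-NN relation is not symmetric)

# Hypothetical dataset: two small clusters plus two intermediate points
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0],
              [1.1, 0.9], [0.2, 0.1], [0.9, 1.1]])

W_rbf = rbf_affinity(X, gamma=10.0)
W_knn = knn_affinity(X, k=2)
```

Note that the k-NN matrix is explicitly symmetrized, because x_j can be among the k nearest neighbors of x_i without the converse being true.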

To describe the basic algorithm, we also need to introduce the degree matrix D:

    D = diag(d_1, d_2, ..., d_(N+M)) with d_i = Σ_j W_ij

It is a diagonal matrix where each non-null element represents the degree of the corresponding vertex. This can be the number of edges incident to the vertex, or a measure proportional to it (as in the case of a W based on the radial basis function). The general idea of label propagation is to let each node propagate its label to its neighbors and to iterate the procedure until convergence.
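The degree matrix is just the row sums of W placed on a diagonal; a minimal sketch with a hypothetical 3 × 3 affinity matrix:

```python
import numpy as np

# Toy symmetric affinity matrix (hypothetical weights)
W = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.5],
              [0.1, 0.5, 0.0]])

# Degree matrix: diagonal, with D_ii = sum_j W_ij
D = np.diag(W.sum(axis=1))
print(np.diag(D))  # [0.9 1.3 0.6]
```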

Formally, if we have a dataset containing both labeled and unlabeled samples:

    X = {x_1, ..., x_N, x_(N+1), ..., x_(N+M)} with labels Y = {y_1, ..., y_N, 0, ..., 0}

The complete steps of the label propagation algorithm (as proposed by Zhu and Ghahramani in Learning from Labeled and Unlabeled Data with Label Propagation, Zhu X., Ghahramani Z., CMU-CALD-02-107) are:

  1. Select an affinity matrix type (KNN or RBF) and compute W
  2. Compute the degree matrix D
  3. Define Y^(0) = Y
  4. Define Y_L = {y_1, y_2, ..., y_N}
  5. Iterate until convergence of the following steps:

     Y^(t+1) = D^(-1)W Y^(t)
     Y_L^(t+1) = Y_L

The first update performs a propagation step involving both labeled and unlabeled points. Each label is spread from a node through its edges, and the corresponding weight, normalized with the degree, increases or decreases the effect of each contribution. The second command instead resets the y values of all the labeled samples to their original labels (this is often called clamping). The final labels can be obtained as:

    y_i = sign(y_i^(∞))
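The steps above can be sketched in NumPy (a minimal illustration, not Scikit-Learn's implementation; the dataset, γ value, and tolerance are hypothetical):

```python
import numpy as np

def label_propagation(W, y, n_iter=1000, tol=1e-6):
    """Zhu-Ghahramani label propagation: y uses +1/-1 for labeled samples, 0 for unlabeled."""
    labeled = y != 0
    T = np.diag(1.0 / W.sum(axis=1)) @ W     # D^{-1} W
    Y = y.astype(float)
    for _ in range(n_iter):
        Y_new = T @ Y                        # propagation step
        Y_new[labeled] = y[labeled]          # clamp the labeled samples
        if np.abs(Y_new - Y).max() < tol:    # stop when the update is negligible
            Y = Y_new
            break
        Y = Y_new
    return np.sign(Y)                        # final bipolar labels

# Hypothetical dataset: two labeled pairs and two unlabeled points
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0],
              [1.1, 0.9], [0.2, 0.1], [0.9, 1.1]])
y = np.array([1, 1, -1, -1, 0, 0])

sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-10.0 * sq_dist)                  # RBF affinity with gamma = 10

labels = label_propagation(W, y)
print(labels)  # the unlabeled points inherit the label of the nearest cluster
```

Each unlabeled point ends up with the sign of the cluster it is geometrically closest to, while the labeled points keep their original values thanks to the clamping step.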

The proof of convergence is very easy. If we partition the matrix A = D^(-1)W according to the relationship between labeled and unlabeled samples, we get:

    A = D^(-1)W = ( A_LL  A_LU )
                  ( A_UL  A_UU )

If we consider that only the first N components of Y are non-null, and that they are clamped at the end of each iteration, the matrix can be rewritten as:

    A' = ( I     0    )
         ( A_UL  A_UU )

We are interested in proving the convergence only for the part regarding the unlabeled samples (the labeled ones are fixed), so we can write the update rule as:

    Y_U^(t+1) = A_UU Y_U^(t) + A_UL Y_L

Transforming the recursion into an iterative process, the previous formula becomes:

    Y_U^(t) = (Σ_(i=0)^(t-1) A_UU^i) A_UL Y_L + A_UU^t Y_U^(0)

In the previous expression, the second term is null because Y_U^(0) = 0, so we only need to prove that the first term converges. It's easy to recognize a truncated matrix geometric series (a Neumann series), and A_UU is constructed to have all eigenvalues |λ_i| < 1; therefore, when t → ∞, the series converges to:

    Y_U = (I − A_UU)^(-1) A_UL Y_L
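This closed-form fixed point can be computed directly with a linear solver instead of iterating; a minimal sketch with a hypothetical affinity matrix over two labeled samples and one unlabeled sample:

```python
import numpy as np

# Hypothetical affinity matrix: samples 0 and 1 are labeled, sample 2 is unlabeled
W = np.array([[0.0, 0.2, 0.9],
              [0.2, 0.0, 0.7],
              [0.9, 0.7, 0.0]])
y_L = np.array([1.0, -1.0])              # labels of the first N = 2 samples

T = np.diag(1.0 / W.sum(axis=1)) @ W     # D^{-1} W
A_UU = T[2:, 2:]                         # unlabeled-unlabeled block
A_UL = T[2:, :2]                         # unlabeled-labeled block

# Y_U = (I - A_UU)^{-1} A_UL Y_L, solved without explicit inversion
Y_U = np.linalg.solve(np.eye(1) - A_UU, A_UL @ y_L)
print(np.sign(Y_U))                      # [1.] -- closer to the +1 sample
```

Solving the linear system is generally preferable to truncating the Neumann series, because it avoids both the iteration count and the accumulation of rounding errors.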
