In machine learning, the radial basis function kernel, or RBF kernel, is a popular kernel function used in various kernelized learning algorithms. In particular, it is commonly used in support vector machine classification.[1]

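The RBF kernel on two samples $\mathbf{x}, \mathbf{x'} \in \mathbb{R}^k$, represented as feature vectors in some input space, is defined as[2]

$$K(\mathbf{x}, \mathbf{x'}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{x'}\|^2}{2\sigma^2}\right)$$

$\|\mathbf{x} - \mathbf{x'}\|^2$ may be recognized as the squared Euclidean distance between the two feature vectors. $\sigma$ is a free parameter. An equivalent definition involves a parameter $\gamma = \tfrac{1}{2\sigma^2}$:

$$K(\mathbf{x}, \mathbf{x'}) = \exp\left(-\gamma \|\mathbf{x} - \mathbf{x'}\|^2\right)$$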
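As a minimal sketch of the two parameterizations, the following pure-Python snippet evaluates the kernel with $\sigma$ and with $\gamma = 1/(2\sigma^2)$. The function names `rbf_kernel` and `rbf_kernel_gamma` and the sample vectors are illustrative assumptions, not taken from the cited sources.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel exp(-||x - y||^2 / (2 sigma^2)) for two equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def rbf_kernel_gamma(x, y, gamma):
    """Same kernel in the gamma parameterization, exp(-gamma ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Illustrative inputs (hypothetical values).
x, y = [1.0, 2.0], [2.0, 4.0]
sigma = 1.5
k_sigma = rbf_kernel(x, y, sigma)
k_gamma = rbf_kernel_gamma(x, y, gamma=1.0 / (2.0 * sigma ** 2))
```

Both calls should agree up to floating-point rounding, and evaluating the kernel on identical inputs gives exactly one.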

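Since the value of the RBF kernel decreases with distance and ranges between zero (in the infinite-distance limit) and one (when $\mathbf{x} = \mathbf{x'}$), it has a ready interpretation as a similarity measure.[2] The feature space of the kernel has an infinite number of dimensions; for $\sigma = 1$, its expansion using the multinomial theorem is:[3]

$$
\begin{aligned}
\exp\left(-\tfrac{1}{2}\|\mathbf{x} - \mathbf{x'}\|^2\right)
&= \exp\left(\mathbf{x}^\top \mathbf{x'} - \tfrac{1}{2}\|\mathbf{x}\|^2 - \tfrac{1}{2}\|\mathbf{x'}\|^2\right) \\
&= \exp\left(\mathbf{x}^\top \mathbf{x'}\right) \exp\left(-\tfrac{1}{2}\|\mathbf{x}\|^2\right) \exp\left(-\tfrac{1}{2}\|\mathbf{x'}\|^2\right) \\
&= \sum_{j=0}^{\infty} \frac{(\mathbf{x}^\top \mathbf{x'})^j}{j!} \exp\left(-\tfrac{1}{2}\|\mathbf{x}\|^2\right) \exp\left(-\tfrac{1}{2}\|\mathbf{x'}\|^2\right) \\
&= \sum_{j=0}^{\infty} \sum_{n_1 + \dots + n_k = j} \exp\left(-\tfrac{1}{2}\|\mathbf{x}\|^2\right) \frac{x_1^{n_1} \cdots x_k^{n_k}}{\sqrt{n_1! \cdots n_k!}} \exp\left(-\tfrac{1}{2}\|\mathbf{x'}\|^2\right) \frac{{x'_1}^{n_1} \cdots {x'_k}^{n_k}}{\sqrt{n_1! \cdots n_k!}} \\
&= \langle \varphi(\mathbf{x}), \varphi(\mathbf{x'}) \rangle
\end{aligned}
$$

$$\varphi(\mathbf{x}) = \exp\left(-\tfrac{1}{2}\|\mathbf{x}\|^2\right) \left(a^{(0)}_{1}, a^{(1)}_{1}, \dots, a^{(1)}_{\ell_1}, \dots, a^{(j)}_{1}, \dots, a^{(j)}_{\ell_j}, \dots\right)$$

where $\ell_j = \tbinom{k+j-1}{j}$ is the number of monomials of degree $j$ in $k$ variables, and

$$a^{(j)}_{\ell} = \frac{x_1^{n_1} \cdots x_k^{n_k}}{\sqrt{n_1! \cdots n_k!}}, \qquad n_1 + n_2 + \dots + n_k = j, \quad 1 \leq \ell \leq \ell_j.$$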
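The expansion can be checked numerically by truncating the infinite feature vector at a finite total degree. The sketch below assumes $\sigma = 1$ and enumerates the monomial coordinates $x_1^{n_1} \cdots x_k^{n_k} / \sqrt{n_1! \cdots n_k!}$ directly; `truncated_feature_map`, `max_degree`, and the sample vectors are hypothetical names and values chosen for illustration.

```python
import math
from itertools import product

def truncated_feature_map(x, max_degree):
    """Coordinates of phi(x) for sigma = 1, keeping monomials of total degree <= max_degree."""
    k = len(x)
    scale = math.exp(-0.5 * sum(v * v for v in x))
    features = []
    # Enumerate exponent tuples (n_1, ..., n_k) with n_1 + ... + n_k <= max_degree.
    for exponents in product(range(max_degree + 1), repeat=k):
        if sum(exponents) > max_degree:
            continue
        monomial = math.prod(v ** n for v, n in zip(x, exponents))
        norm = math.sqrt(math.prod(math.factorial(n) for n in exponents))
        features.append(scale * monomial / norm)
    return features

def rbf_kernel_unit_sigma(x, y):
    """Exact kernel value for sigma = 1."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))

# Illustrative inputs (hypothetical values).
x, y = [0.3, -0.5], [0.7, 0.2]
phi_x = truncated_feature_map(x, max_degree=10)
phi_y = truncated_feature_map(y, max_degree=10)
approx = sum(a * b for a, b in zip(phi_x, phi_y))
exact = rbf_kernel_unit_sigma(x, y)
```

For small inputs the truncated inner product `approx` should match `exact` closely, because the omitted terms of the exponential series shrink factorially. The number of retained coordinates grows combinatorially with the degree and the input dimension, which is why the explicit map is rarely used in practice.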

Approximations

Because support vector machines and other models employing the kernel trick do not scale well to large numbers of training samples or large numbers of features in the input space, several approximations to the RBF kernel (and similar kernels) have been introduced.[4] Typically, these take the form of a function $z$ that maps a single vector to a vector of higher dimensionality, approximating the kernel:

$$\langle z(\mathbf{x}), z(\mathbf{x'}) \rangle \approx \langle \varphi(\mathbf{x}), \varphi(\mathbf{x'}) \rangle = K(\mathbf{x}, \mathbf{x'})$$

where $\varphi$ is the implicit mapping embedded in the RBF kernel.

Fourier random features

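One way to construct such a $z$ is to randomly sample from the Fourier transformation of the kernel[5]

$$z(\mathbf{x}) = \frac{1}{\sqrt{D}} \left[\cos\langle \mathbf{w}_1, \mathbf{x} \rangle, \sin\langle \mathbf{w}_1, \mathbf{x} \rangle, \ldots, \cos\langle \mathbf{w}_D, \mathbf{x} \rangle, \sin\langle \mathbf{w}_D, \mathbf{x} \rangle\right]^\top$$

where $\mathbf{w}_1, \ldots, \mathbf{w}_D$ are independent samples from the normal distribution $N(0, \sigma^{-2} I)$.

Theorem: $\operatorname{E}[\langle z(\mathbf{x}), z(\mathbf{y}) \rangle] = e^{-\|\mathbf{x} - \mathbf{y}\|^2 / (2\sigma^2)}$.

Proof: It suffices to prove the case of $D = 1$. Use the trigonometric identity $\cos(a - b) = \cos(a)\cos(b) + \sin(a)\sin(b)$, the spherical symmetry of the Gaussian distribution, then evaluate the integral

$$\int_{-\infty}^{\infty} \frac{\cos(t u)\, e^{-u^2/2}}{\sqrt{2\pi}} \, du = e^{-t^2/2}.$$

Theorem: $\operatorname{Var}[\langle z(\mathbf{x}), z(\mathbf{y}) \rangle] = O(D^{-1})$. (Appendix A.2[6]).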
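The construction can be sketched in a few lines of pure Python: draw $D$ frequency vectors from a normal distribution with standard deviation $1/\sigma$ per coordinate, map each input to interleaved cosine and sine features, and compare the inner product of two feature vectors with the exact kernel value. The helper names, the seed, and the sample vectors are assumptions made for illustration only.

```python
import math
import random

def sample_frequencies(dim, num_features, sigma, rng):
    """Draw w_1..w_D independently from N(0, sigma^-2 I)."""
    return [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
            for _ in range(num_features)]

def random_fourier_features(x, frequencies):
    """Map x to (1/sqrt(D)) [cos<w_1,x>, sin<w_1,x>, ..., cos<w_D,x>, sin<w_D,x>]."""
    scale = 1.0 / math.sqrt(len(frequencies))
    z = []
    for w in frequencies:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        z.extend((scale * math.cos(proj), scale * math.sin(proj)))
    return z

# Illustrative setup (hypothetical seed and values).
rng = random.Random(0)
sigma = 1.5
x, y = [1.0, 0.5, -0.2], [0.3, -0.4, 0.8]
W = sample_frequencies(dim=3, num_features=2000, sigma=sigma, rng=rng)
zx, zy = random_fourier_features(x, W), random_fourier_features(y, W)
approx = sum(a * b for a, b in zip(zx, zy))
exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * sigma ** 2))
```

The same frequency vectors must be reused for every input. The estimate `approx` fluctuates around `exact` with a spread that shrinks roughly like $1/\sqrt{D}$, consistent with the variance bound above.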

Nyström method

Another approach uses the Nyström method to approximate the eigendecomposition of the Gram matrix K, using only a random sample of the training set.[7]
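A minimal sketch of one common form of this idea, assuming NumPy is available: pick a random subset of training points as landmarks, eigendecompose the small landmark Gram matrix, and use it to build features $Z$ such that $Z Z^\top$ approximates the full Gram matrix $K$. The function names, the eigenvalue cutoff `eps`, and the synthetic data are illustrative assumptions rather than details from the cited paper.

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    """Matrix of RBF kernel values between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by rounding before exponentiating.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def nystrom_features(X, landmarks, sigma=1.0, eps=1e-10):
    """Features Z with Z @ Z.T approximating K, via the landmark Gram matrix."""
    K_mm = rbf_gram(landmarks, landmarks, sigma)
    K_nm = rbf_gram(X, landmarks, sigma)
    eigvals, eigvecs = np.linalg.eigh(K_mm)
    keep = eigvals > eps  # drop numerically null directions (assumed cutoff)
    return K_nm @ eigvecs[:, keep] / np.sqrt(eigvals[keep])

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
landmarks = X[rng.choice(len(X), size=50, replace=False)]
Z = nystrom_features(X, landmarks, sigma=1.0)
K_exact = rbf_gram(X, X, sigma=1.0)
max_err = np.abs(Z @ Z.T - K_exact).max()
```

Only the 50-by-50 landmark matrix is eigendecomposed, so the cost is governed by the number of landmarks rather than the full training set. Entries of $K$ between landmark points are reproduced almost exactly, while the error for other points depends on how well the landmarks cover the data.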

References

  1. Chang, Yin-Wen; Hsieh, Cho-Jui; Chang, Kai-Wei; Ringgaard, Michael; Lin, Chih-Jen (2010). "Training and testing low-degree polynomial data mappings via linear SVM". Journal of Machine Learning Research. 11: 1471–1490.
  2. Vert, Jean-Philippe; Tsuda, Koji; Schölkopf, Bernhard (2004). "A primer on kernel methods". Kernel Methods in Computational Biology.
  3. Shashua, Amnon (2009). "Introduction to Machine Learning: Class Notes 67577". arXiv:0904.3664v1 [cs.LG].
  4. Müller, Andreas (2012). Kernel Approximations for Efficient SVMs (and other feature extraction methods).
  5. Rahimi, Ali; Recht, Benjamin (2007). "Random Features for Large-Scale Kernel Machines". Advances in Neural Information Processing Systems. 20. Curran Associates, Inc.
  6. Peng, Hao; Pappas, Nikolaos; Yogatama, Dani; Schwartz, Roy; Smith, Noah A.; Kong, Lingpeng (2021-03-19). "Random Feature Attention". arXiv:2103.02143 [cs.CL].
  7. Williams, C. K. I.; Seeger, M. (2001). "Using the Nyström method to speed up kernel machines". Advances in Neural Information Processing Systems. 13.
