This is part III of this series. We saw in the previous post that we need a better way to represent an arbitrary number of types without resorting to ad hoc treatments. Restricted Boltzmann machines (RBMs) offer a unified way to do this, leading to what is called the mixed-variate RBM.
An RBM is a Markov random field with two layers of nodes, one for visible variables and the other for hidden variables. For our purposes, we shall assume that hidden variables are binary and visible variables are typed. Standard RBMs often deal with only a single visible type, mostly binary or Gaussian. Less popular types include count, categorical, ordinal and interval. There are several treatments of counts: Poisson, constrained Poisson, and the replicated multinomial by Ruslan Salakhutdinov and others.
Regardless of the types, the principles are the same: the model is a Boltzmann machine (as the name suggests) that admits a Boltzmann distribution (a.k.a. exponential family, Gibbs distribution, log-linear model, etc.). This form is flexible: most known types can be expressed easily. RBMs are one of the hot kids these days, thanks to the hype in deep learning driven by big players such as Google, Facebook, Microsoft and Baidu.
It is natural to plug all types together so that they all share the same hidden layer. Then the problems with mixed types suddenly disappear! This is because the interactions are now limited to those between each type and the hidden layer, which is usually binary. There is no need for the types to interact with each other directly.
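As a minimal sketch of why a shared hidden layer removes direct type-to-type interactions, the snippet below computes the posterior over binary hidden units from one binary block and one Gaussian block. The function name, the parameter names, the unit-variance Gaussian assumption and the random weights are all illustrative choices, not taken from the papers listed at the end of this post.

```python
import numpy as np

def hidden_posterior(v_bin, v_gauss, W_bin, W_gauss, b_hid):
    """P(h_k = 1 | v) for a binary hidden layer shared by two visible types.

    Assumes binary visibles and unit-variance Gaussian visibles. Each type
    only adds its own linear term to the hidden activation, so the types
    never interact with each other directly.
    """
    activation = b_hid + v_bin @ W_bin + v_gauss @ W_gauss
    return 1.0 / (1.0 + np.exp(-activation))

# Hypothetical sizes and untrained placeholder weights.
rng = np.random.default_rng(0)
n_bin, n_gauss, n_hid = 4, 3, 5
W_bin = 0.1 * rng.standard_normal((n_bin, n_hid))
W_gauss = 0.1 * rng.standard_normal((n_gauss, n_hid))
b_hid = np.zeros(n_hid)

v_bin = np.array([1.0, 0.0, 1.0, 1.0])   # binary block
v_gauss = np.array([0.3, -1.2, 0.8])     # continuous block
p_hidden = hidden_posterior(v_bin, v_gauss, W_bin, W_gauss, b_hid)
print(p_hidden)
```

Adding another type under these assumptions would amount to adding one more term to the activation, for example a one-hot encoded categorical block with its own weight matrix.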
Although it appears effortless and we would simply expect it to work, as we have shown in the mixed-variate RBM work, there are several issues worth discussing.
First, we have intended to work only with primitive types, regardless of how the types arise. However, moving one level up, we may be concerned with multiple modalities rather than types. For example, an image with tags has two modalities, and they represent different kinds of data at different levels of semantics. Typically an image is considered as a vector of pixels, and thus it is natural to use either Gaussian types (for continuous pixel intensity) or Beta types (for bounded intensity).
Second, we can efficiently estimate the data density up to a normalising constant, using a quantity known as the "free energy". This offers a natural way to do anomaly detection.
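To make the free energy concrete, here is a minimal sketch for a model with binary and unit-variance Gaussian visibles tied to one binary hidden layer. The parameter names, the unit-variance assumption, and the untrained random weights are illustrative assumptions; the cited papers may parameterise things differently. Summing out the binary hidden units in closed form gives a score F(v) with P(v) proportional to exp(-F(v)), so a higher free energy means a lower density.

```python
import numpy as np

def free_energy(v_bin, v_gauss, W_bin, W_gauss, a_bin, c_gauss, b_hid):
    """Free energy F(v), so that P(v) is proportional to exp(-F(v)).

    Assumes binary visibles with biases a_bin and unit-variance Gaussian
    visibles with means c_gauss, all connected to one binary hidden layer.
    """
    activation = b_hid + v_bin @ W_bin + v_gauss @ W_gauss
    visible_term = -v_bin @ a_bin + 0.5 * np.sum((v_gauss - c_gauss) ** 2, axis=-1)
    # Summing out each binary hidden unit contributes log(1 + exp(activation)).
    hidden_term = np.sum(np.logaddexp(0.0, activation), axis=-1)
    return visible_term - hidden_term

# Hypothetical sizes and untrained placeholder parameters.
rng = np.random.default_rng(0)
n_bin, n_gauss, n_hid = 4, 3, 5
W_bin = 0.1 * rng.standard_normal((n_bin, n_hid))
W_gauss = 0.1 * rng.standard_normal((n_gauss, n_hid))
a_bin, c_gauss, b_hid = np.zeros(n_bin), np.zeros(n_gauss), np.zeros(n_hid)

v_bin = np.array([1.0, 0.0, 1.0, 1.0])
typical = np.array([0.3, -1.2, 0.8])    # close to the Gaussian means
extreme = np.array([8.0, -9.0, 10.0])   # far from the Gaussian means

F_typical = free_energy(v_bin, typical, W_bin, W_gauss, a_bin, c_gauss, b_hid)
F_extreme = free_energy(v_bin, extreme, W_bin, W_gauss, a_bin, c_gauss, b_hid)
print(F_typical, F_extreme)  # the extreme point gets the higher free energy
```

One simple way to use this for anomaly detection is to flag points whose free energy exceeds a threshold chosen on held-out data; the threshold rule is an assumption of this sketch.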
Third, since the representation layer is all binary, we can stack further RBMs on top of the first layer, making the entire structure a mixed-variate Deep Belief Network.
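The stacking idea can be sketched as a forward pass through two layers: a mixed-variate first layer, followed by an ordinary binary RBM that only ever sees the first layer's hidden representation. All sizes, names and weights below are hypothetical and untrained; in practice the layers of a Deep Belief Network are usually trained greedily, one at a time.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_bin, n_gauss, n_hid1, n_hid2 = 4, 3, 6, 2

# Placeholder (untrained) parameters for both layers.
W_bin = 0.1 * rng.standard_normal((n_bin, n_hid1))
W_gauss = 0.1 * rng.standard_normal((n_gauss, n_hid1))
b1 = np.zeros(n_hid1)
W2 = 0.1 * rng.standard_normal((n_hid1, n_hid2))
b2 = np.zeros(n_hid2)

# A small batch of mixed-type inputs (binary block, Gaussian block).
v_bin = rng.integers(0, 2, size=(10, n_bin)).astype(float)
v_gauss = rng.standard_normal((10, n_gauss))

# Layer 1 handles the mixed types; its output lives in (0, 1).
h1 = sigmoid(b1 + v_bin @ W_bin + v_gauss @ W_gauss)
# Layer 2 is a plain binary RBM and never sees the original types.
h2 = sigmoid(b2 + h1 @ W2)
print(h1.shape, h2.shape)
```

Feeding posterior probabilities rather than sampled binary states to the second layer is a common simplification and is the choice made here.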
Fourth, since the estimation of the posterior is straightforward, it paves the way to learning a distance metric. Distances are useful in many tasks, including information retrieval, k-nearest neighbor classification and clustering. These capabilities are not possible with other mixed-type modeling methods.
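As a rough illustration of distances in the hidden space, the sketch below maps mixed-type objects to their hidden posteriors and retrieves nearest neighbors with plain Euclidean distance. This is not the metric-learning procedure of the papers below; the function names, the unit-variance Gaussian assumption and the untrained random weights are assumptions made for the example.

```python
import numpy as np

def hidden_posterior(v_bin, v_gauss, W_bin, W_gauss, b_hid):
    """Hidden posteriors for binary and unit-variance Gaussian visibles."""
    activation = b_hid + v_bin @ W_bin + v_gauss @ W_gauss
    return 1.0 / (1.0 + np.exp(-activation))

def nearest_neighbors(query_code, codes, k=3):
    """Indices of the k codes closest to query_code in Euclidean distance."""
    distances = np.linalg.norm(codes - query_code, axis=1)
    return np.argsort(distances)[:k]

# Hypothetical sizes, untrained placeholder weights, and synthetic objects.
rng = np.random.default_rng(0)
n_obj, n_bin, n_gauss, n_hid = 20, 4, 3, 5
W_bin = rng.standard_normal((n_bin, n_hid))
W_gauss = rng.standard_normal((n_gauss, n_hid))
b_hid = np.zeros(n_hid)

v_bin = rng.integers(0, 2, size=(n_obj, n_bin)).astype(float)
v_gauss = rng.standard_normal((n_obj, n_gauss))

# Every object becomes one vector in (0, 1)^n_hid, whatever its input types.
codes = hidden_posterior(v_bin, v_gauss, W_bin, W_gauss, b_hid)
neighbors = nearest_neighbors(codes[0], codes, k=3)
print(neighbors)  # object 0 is its own nearest neighbor
```

Because all objects live in the same hidden space, the same codes could be passed to a k-nearest neighbor classifier or a clustering routine without any type-specific handling.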
- Multilevel Anomaly Detection for Mixed Data, K Do, T Tran, S Venkatesh, arXiv preprint arXiv:1610.06249.
- Learning deep representation of multityped objects and tasks, Truyen Tran, D. Phung, and S. Venkatesh, arXiv preprint arXiv:1603.01359.
- Outlier Detection on Mixed-Type Data: An Energy-based Approach, K Do, T Tran, D Phung, S Venkatesh, International Conference on Advanced Data Mining and Applications (ADMA 2016).
- Latent patient profile modelling and applications with Mixed-Variate Restricted Boltzmann Machine, Tu D. Nguyen, Truyen Tran, D. Phung, and S. Venkatesh, In Proc. of 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13), Gold Coast, Australia, April 2013.
- Learning sparse latent representation and distance metric for image retrieval, Tu D. Nguyen, Truyen Tran, D. Phung, and S. Venkatesh, In Proc. of IEEE International Conference on Multimedia and Expo (ICME), San Jose, California, USA, July 2013.
- Mixed-Variate Restricted Boltzmann Machines, Truyen Tran, Dinh Phung and Svetha Venkatesh, In Proc. of the 3rd Asian Conference on Machine Learning (ACML2011), Taoyuan, Taiwan, Nov 2011.