This code implements the method presented in the following paper:
- Remes, Heinonen, Kaski (2017). 'A Mutually-Dependent Hadamard Kernel for Modelling Latent Variable Couplings'. Accepted for ACML 2017. Preprint: https://arxiv.org/abs/1702.08402
The proposed Wishart-Gibbs kernel can be used both in multi-task/multi-output Gaussian processes and in modelling latent variable couplings in latent factor models such as the Gaussian process regression network (GPRN). In the paper, we propose a model called LCGP, which finds a latent space that jointly produces multiple real-valued outputs through input-dependent mixing matrices and a classification output through a probit link function.
Here, we show a simple application of the proposed kernel in a multi-output GP. We generate a toy dataset in which the coupling between outputs changes clearly across the input range.
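As a rough illustration of what such a dataset could look like, the sketch below builds three outputs whose coupling varies with the input. It is a hypothetical construction, not the dataset generated in run_exp.m: the signals f and g, the weight rho and the noise level 0.05 are all assumptions chosen for illustration.

```matlab
% Hypothetical toy data (NOT the dataset from run_exp.m).
N = 100;                          % number of samples
T = linspace(0, 1, N)';           % N x 1 inputs
rho = cos(pi * T);                % coupling weight, moves from +1 to -1 over the inputs
f = sin(4 * pi * T);              % first smooth signal
g = cos(6 * pi * T);              % second smooth signal
w = sqrt(max(0, 1 - rho.^2));     % complementary weight, clipped at zero for safety

u = zeros(N, 3);                  % N x Q outputs, Q = 3
u(:, 1) = f;                      % output 1 follows f
u(:, 2) = rho .* f + w .* g;      % output 2 tracks f at T = 0 and -f at T = 1
u(:, 3) = g;                      % output 3 follows g
u = u + 0.05 * randn(N, 3);       % small observation noise (assumed level)
```

With this construction the second output is positively coupled to the first at the start of the input range and negatively coupled at the end, which is the kind of change an input-dependent kernel is meant to capture.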
% Put data into model struct:
model.T = T; % inputs
model.u = u; % outputs
% Initialize parameters with Q output variables and N samples:
N = size(T, 1); % number of samples (rows of T)
model.Q = 3;
model.ell_u = rand(model.Q,1); % length-scales for the Gibbs kernel
model.ell_z = ell_z; % length-scale for the Wishart variables
model.Kz = kron(gausskernel(T, T, ell_z), eye(model.Q)); % GP kernel for Wishart variables Z
% Pre-compute the inverse and the lower Cholesky factor of Kz:
model.Kz_inv = inv(model.Kz);
model.Lz = chol(model.Kz, 'lower');
model.Z = kron(ones(N,1), eye(model.Q)); % init with a diagonal covariance
model.omega = 1; % noise variance
% Optimize model (Z, ell_u and log_noise):
model = optim_hadamard(model);
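To make the sizes of the pre-computed quantities concrete, here is a minimal sketch of the Kronecker construction on a tiny problem. The function se_kernel is a hypothetical stand-in for the repository's gausskernel (a squared-exponential kernel on 1-D inputs is assumed), and the 1e-6 jitter is also an assumption added for numerical stability.

```matlab
% Stand-in for gausskernel: squared-exponential kernel on 1-D inputs (assumed form).
se_kernel = @(A, B, ell) exp(-0.5 * bsxfun(@minus, A, B').^2 / ell^2);

N = 5; Q = 3;                                  % small sizes for illustration
T = linspace(0, 1, N)';                        % N x 1 inputs
ell_z = 0.3;                                   % assumed length-scale

Kt = se_kernel(T, T, ell_z) + 1e-6 * eye(N);   % N x N kernel with jitter (assumed)
Kz = kron(Kt, eye(Q));                         % (N*Q) x (N*Q) kernel over all Wishart variables
Lz = chol(Kz, 'lower');                        % lower Cholesky factor, Kz = Lz * Lz'
Z  = kron(ones(N, 1), eye(Q));                 % (N*Q) x Q: N stacked Q x Q identity blocks

% Solving with the Cholesky factor gives the same result as applying the inverse:
A_chol = Lz' \ (Lz \ Z);
A_dir  = Kz \ Z;
```

Both A_chol and A_dir compute the product of the inverse of Kz with Z. The triangular solves are generally the more numerically stable route when Kz is close to singular.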
See run_exp.m for the full code to run this example, which also runs a comparison using a Kronecker kernel.
For LCGP we provide a function lcgp.m that initializes the Wishart-Gibbs kernel used for the latent variables, as well as all the variational distributions, and runs the variational inference algorithm.
model = lcgp(T, x, y, ell_u, ell_b, ell_z, opts);
Here T contains the inputs (an NxD matrix of N inputs, each D-dimensional). The data array x is of size NxMxS, where M is the output dimensionality and S is the number of samples. Class labels are given in the vector y of length S. The Gibbs kernel length-scales, one for each of the Q latent signals, are given in the vector ell_u of length Q. Length-scales for the mixing matrix and the Wishart variables are given in ell_b and ell_z, respectively.
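The argument shapes described above can be checked before calling lcgp with a sketch like the following. The data here are random placeholders, and several details are assumptions rather than facts about lcgp.m: the {0,1} label encoding, scalar values for ell_b and ell_z, and the specific sizes.

```matlab
% Placeholder inputs with the documented shapes (random values, for illustration only).
N = 50; D = 1; M = 4; S = 20; Q = 2;    % assumed sizes

T = linspace(0, 1, N)';                 % N x D inputs (here D = 1)
x = randn(N, M, S);                     % N x M x S data array
y = double(rand(S, 1) > 0.5);           % S class labels; {0,1} encoding is an assumption
ell_u = 0.5 * ones(Q, 1);               % Q Gibbs kernel length-scales, one per latent signal
ell_b = 1.0;                            % mixing-matrix length-scale (scalar assumed)
ell_z = 1.0;                            % Wishart-variable length-scale (scalar assumed)

% Sanity checks on the shapes before running inference:
assert(isequal(size(T), [N D]));
assert(isequal(size(x), [N M S]));
assert(numel(y) == S);
assert(numel(ell_u) == Q);

% The call itself needs an options struct; see lcgp.m for the fields it expects.
% model = lcgp(T, x, y, ell_u, ell_b, ell_z, opts);
```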