## Bayesian computing without exact likelihoods

Levi, Finland, 12-14 March 2023

This Bayes Comp 2023 Satellite Workshop is about methods, theory and applications for Bayesian inference in the absence of exact likelihoods.

### Programme

The tentative programme for this workshop is below. This programme is subject to change.

Sunday 12 March

- 17.00-19.00 **Satellite ice-breaker**
- 19.00-20.10 **Satellite opening**: Keynote presentation by Sonia Petrone on "Quasi-Bayes predictive algorithms"

Monday 13 March

- 9.20-10.10 **Session I**: Carlo Albert, Yannik Schälte
- 10.10-10.30 **Coffee**
- 10.30-12.10 **Session II**: David Frazier, Chris Holmes, Jack Jewson, Masahiro Fujisawa
- 12.10-16.40 **Lunch & Ski**
- 16.40-18.20 **Session III**: Massimiliano Tamborrino, Yuexi Wang, Imke Botha, Chaya Weerasinghe

Tuesday 14 March

- 9.20-10.10 **Session V**: Aki Vehtari, Grégoire Clarté
- 10.10-10.30 **Coffee**
- 10.30-11.45 **Session VI**: Riccardo Corradin, Tommaso Rigon, Judith Rousseau
- 11.45-15.00 **Lunch & Ski**
- 15.00-16.15 **Session VII**: Lorenzo Pacchiardi, Ayush Bharti, Ulpu Remes
- 16.15-16.45 **Break**
- 16.45-18.05 **Session VIII**: Sam Duffield, Joshua Bon, short talks by Trevor Campbell, Iuri Marocco and Hector McKimm

### Registration and Practical Advice

Registration is through the Bayes Comp 2023 website. Practical advice regarding travel, accommodation and childcare is also available through the main website.

The satellite workshop will take place in the same hotel as the main conference: Hotel Levi Panorama.

Bayes Comp follows the ISBA code of conduct on ethical professional practice, equal opportunity, and anti-harassment. For any irregularity, please contact bayescomp2023@gmail.com.

### Organisers

The event is being organised by Antonietta Mira, Christian Robert, Heikki Haario, Leah South, Chris Drovandi and Umberto Picchini. We can be contacted at Christian's gmail account bayesianstatistics or at l1.south (at) qut.edu.au.

### Abstracts

**Carlo Albert**, Eawag - Swiss Federal Institute of Aquatic Science and Technology, Switzerland.

*Title*: A thermodynamic perspective on ABC.

*Abstract*: Accuracy and efficiency of ABC hinge on the choice of suitable summary statistics and the proper tuning of the tolerance between simulated and observed statistics. Both problems have a thermodynamic interpretation, which has inspired efficient algorithms for their solution. In order for ABC to be accurate and efficient, summary statistics need to be near sufficient and well concentrated, respectively. That is to say, they need to encode nearly all parameter-related information while cancelling most of the noise of the model outputs. Thus, they can be interpreted as thermodynamic state variables, for stochastic models. I will present a Machine Learning approach that implements these requirements, and is able to find suitable statistics even in situations where parameter estimators are insufficient for Bayesian inference. If we interpret the distance between simulated and observed summary statistics as an energy, the tolerance can be interpreted as a temperature. This interpretation has inspired an adaptive tuning algorithm for the tolerance, which gradually lowers the temperature (akin to simulated annealing) such that the entropy production (wasted computation) is minimized.

**Ayush Bharti**, Aalto University, Finland.

*Title*: Approximate Bayesian Computation with Domain Expert in the Loop.

*Abstract*: Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods, when the simulation budget is limited.

**Imke Botha**, Queensland University of Technology, Australia.

*Title*: Component-wise iterative ensemble Kalman inversion for static Bayesian models with unknown measurement error covariance.

*Abstract*: The ensemble Kalman filter (EnKF) is a Monte Carlo approximation of the Kalman filter for high dimensional linear Gaussian state space models. EnKF methods have also been developed for parameter inference of static Bayesian models with a Gaussian likelihood, in a way that is analogous to likelihood tempering sequential Monte Carlo (SMC). These methods are commonly referred to as ensemble Kalman inversion (EKI). Unlike SMC, the inference from EKI is only asymptotically unbiased if the likelihood is linear Gaussian and the priors are Gaussian. However, EKI is significantly faster to run. Currently, a large limitation of EKI methods is that the covariance of the measurement error is assumed to be fully known. We develop a new method, which we call component-wise iterative ensemble Kalman inversion (CW-IEKI), that allows elements of the covariance matrix to be inferred alongside the model parameters at negligible extra cost. This is joint work with Chris Drovandi, Matthew Adams, Dan Tran and Frederick Bennett.

**Trevor Campbell**, University of British Columbia, Canada.

*Title*: Sparse Hamiltonian Flows (Bayesian Coresets Without all the Fuss).

*Abstract*: Bayesian inference provides a coherent approach to learning from data and uncertainty assessment in complex, expressive statistical models. However, inference algorithms have not yet caught up to the deluge of data in modern applications. One approach, Bayesian coresets, involves replacing the likelihood with an inexpensive approximation based on a small, weighted, representative subset of data. Although the methodology is sound in principle, efficiently constructing such a coreset in practice remains a significant challenge. Existing methods tend to be complicated to implement, slow, require a secondary inference step after coreset construction, and do not enable model selection. In this talk, I will introduce a new method, sparse Hamiltonian flows, that addresses all of these challenges. The method involves first subsampling the data uniformly, and then optimizing a Hamiltonian flow parametrized by coreset weights and including occasional momentum quasi-refreshment steps. I will present theoretical results demonstrating that the method enables an exponential compression of the dataset in representative models. Real and synthetic experiments demonstrate that sparse Hamiltonian flows provide significantly more accurate posterior approximations compared with competing coreset constructions.

**Grégoire Clarté**, University of Helsinki, Finland.

*Title*: SVBMC: Fast post-processing Bayesian inference with noisy evaluations of the likelihood.

*Abstract*: In many cases, the exact likelihood is unavailable, and can only be accessed through a noisy and expensive process, for example in plasma physics. Furthermore, Bayesian inference often comes in at a later stage, for example after running an optimization algorithm to find a MAP estimate. To tackle both these issues, we introduce Sparse Variational Bayesian Monte Carlo (SVBMC), a method for fast "post-process" Bayesian inference for models with black-box and noisy likelihoods. SVBMC reuses all existing target density evaluations (for example, from previous optimizations or partial Markov chain Monte Carlo runs) to build a sparse Gaussian process (GP) surrogate model of the log posterior density. Uncertain regions of the surrogate are then refined via active learning as needed. Our work builds on the Variational Bayesian Monte Carlo (VBMC) framework for sample-efficient inference, with several novel contributions. First, we make VBMC scalable to a large number of pre-existing evaluations via sparse GP regression, deriving novel Bayesian quadrature formulae and acquisition functions for active learning with sparse GPs. Second, we introduce noise shaping, a general technique to induce the sparse GP approximation to focus on high posterior density regions. Third, we prove theoretical results in support of the SVBMC refinement procedure. We validate our method on a variety of challenging synthetic scenarios and real-world applications. We find that SVBMC consistently builds good posterior approximations by post-processing of existing model evaluations from different sources, often requiring only a small number of additional density evaluations.

**Riccardo Corradin**, University of Nottingham, United Kingdom.

*Title*: Model-based clustering with intractable distributions.

*Abstract*: Model-based clustering represents one of the fundamental procedures in a statistician's toolbox. Within the model-based clustering framework, we consider the case where the kernel of nonparametric mixture models is available only up to an intractable normalizing constant, and most of the commonly used Markov chain Monte Carlo methods fail to provide posterior inference. To overcome this problem, we propose an approximate Bayesian computational strategy, whereby we approximate the posterior to avoid the intractability of the kernel. By exploiting the structure of the nonparametric prior, our proposal combines the use of predictive distributions as proposal distributions with transport maps to obtain an efficient and flexible sampling strategy. Further, the specification of our proposal can be simplified by introducing an adaptive scheme on the degree of approximation of the posterior distribution. Empirical evidence from simulation studies shows that our proposal outperforms its main competitors in terms of computational times while preserving comparable accuracy of the estimates.

**Joshua Bon**, Queensland University of Technology, Australia.

*Title*: Bayesian score calibration for approximate models.

*Abstract*: Scientists continue to develop increasingly complex mechanistic models to reflect their knowledge more realistically. Statistical inference using these models can be highly challenging, since the corresponding likelihood function is often intractable and model simulation may be computationally burdensome. Fortunately, in many of these situations, it is possible to adopt a surrogate model or approximate likelihood function. It may be convenient to base Bayesian inference directly on the surrogate, but this can result in bias and poor uncertainty quantification. Here we propose a new method for adjusting approximate posterior samples to reduce bias and produce more accurate uncertainty quantification. We do this by optimising a transform of the approximate posterior that maximises a scoring rule. Our approach requires only a (fixed) small number of complex model simulations and is numerically stable. We demonstrate good performance of the new method on several examples of increasing complexity.

**Sam Duffield**, Quantinuum, United Kingdom.

*Title*: Bayesian Learning of Parameterised Quantum Circuits.

*Abstract*: Currently available quantum computers suffer from constraints including hardware noise and a limited number of qubits. As such, variational quantum algorithms that utilise a classical optimiser in order to train a parameterised quantum circuit have drawn significant attention for near-term practical applications of quantum technology. In this work, we take a probabilistic point of view and reformulate the classical optimisation as an approximation of a Bayesian posterior. The posterior is induced by combining the cost function to be minimised with a prior distribution over the parameters of the quantum circuit. We describe a dimension reduction strategy based on a maximum a posteriori point estimate with a Laplace prior. Experiments on the Quantinuum H1-2 computer show that the resulting circuits are faster to execute and less noisy than the circuits trained without the dimension reduction strategy. We subsequently describe a posterior sampling strategy based on stochastic gradient Langevin dynamics. Numerical simulations on three different problems show that the strategy is capable of generating samples from the full posterior and avoiding local optima.

**David Frazier**, Monash University, Australia.

*Title*: Simulation-based Bayesian inference with general loss functions.

*Abstract*: Simulation-based Bayesian methods, such as approximate Bayesian computation (ABC), are a useful class of algorithms that can be used to conduct inference even in situations where the underlying model is intractable. The ease with which these methods can be implemented has allowed them to become a prominent tool in the armoury of the practising Bayesian statistician. While such methods display certain optimality properties when the assumed model is correctly specified, it has recently been shown that the performance of these methods deteriorates dramatically when the assumed model is misspecified. We demonstrate that a particular combination of generalized and simulation-based Bayesian methods, which we refer to as generalized ABC (G-ABC), produces posteriors that are robust to model misspecification, and which also perform well when the model is correctly specified. Unlike existing ABC methods, the G-ABC posterior is asymptotically Gaussian, and has well-behaved posterior moments, regardless of whether the model is correctly or incorrectly specified. We theoretically compare the G-ABC posterior against existing ABC approaches, and empirically compare the methods across several examples. Not only does G-ABC display more regular large sample behaviour than ABC, but it outperforms ABC methods in small samples as well.

**Masahiro Fujisawa**, The University of Tokyo / RIKEN AIP, Japan.

*Title*: γ-ABC: Outlier-robust approximate Bayesian computation based on a robust divergence estimator.

*Abstract*: Approximate Bayesian computation (ABC) is a likelihood-free inference method that has been employed in various applications, e.g., astronomy and economics. However, ABC can be sensitive to outliers if a data discrepancy measure is chosen inappropriately. In this talk, we consider using a nearest-neighbor-based γ-divergence estimator as a data discrepancy measure. We then confirm that our estimator possesses a suitable theoretical robustness property called the redescending property. In addition, we show that our estimator enjoys various desirable properties such as asymptotic unbiasedness, almost sure convergence, and linear-time computational complexity. Through experiments, we demonstrate that our method achieves significantly higher robustness than existing discrepancy measures.

**Chris Holmes**, University of Oxford, United Kingdom.

*Title*: Bayesian Predictive inference.

*Abstract*: De Finetti promoted the importance of predictive models for observables as the basis for Bayesian inference. The assumption of exchangeability, implying aspects of symmetry in the predictive model, motivates the usual likelihood-prior construction and with it the traditional learning approach involving a prior to posterior update using Bayes’ rule. We discuss an alternative approach, treating Bayesian inference as a missing data problem for observables not yet obtained from the population needed to estimate a parameter precisely or make a decision correctly. This motivates the direct use of predictive (generative) models for inference via population-scale imputation, relaxing exchangeability to start modelling from the data in hand (with or without a prior). Martingales play a key role in the construction. This is joint work with Stephen Walker, Andrew Yiu, and Edwin Fong.

**Jack Jewson**, Universitat Pompeu Fabra, Spain.

*Title*: On the Stability of General Bayesian Inference.

*Abstract*: We study the stability of posterior predictive inferences to the specification of the likelihood model and perturbations of the data generating process. In modern big data analyses, the decision-maker may elicit useful broad structural judgements but a level of interpolation is required to arrive at a likelihood model. One model, often a computationally convenient canonical form, is chosen, when many alternatives would have been equally consistent with the elicited judgements. Equally, observational datasets often contain unforeseen heterogeneities and recording errors. Acknowledging such imprecisions, a faithful Bayesian analysis should be stable across reasonable equivalence classes for these inputs. We show that traditional Bayesian updating provides stability across a very strict class of likelihood models and DGPs, while a generalised Bayesian alternative using the beta-divergence loss function is shown to be stable across practical and interpretable neighbourhoods. These stability results provide a compelling justification for using generalised Bayes to facilitate inference under simplified canonical models. We illustrate this in linear regression, binary classification, and mixture modelling examples.

**Iuri Marocco**, Scuola Internazionale Superiore di Studi Avanzati, Trieste, Italy.

*Title*: Intrinsic dimension as summary statistics for ABC on unweighted networks.

*Abstract*: Real-world datasets characterized by discrete features are ubiquitous: from categorical surveys to clinical questionnaires, from unweighted networks to DNA sequences. Nevertheless, the most common unsupervised dimensional reduction methods are designed for continuous spaces, and their use for discrete spaces can lead to errors and biases. We developed an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces. Most importantly, our estimator allows the scale at which the ID is computed to be selected, thus providing an observable at different resolutions. We employ our method to find the ID of unweighted networks on both controlled and real-world examples. We then use the ID as a multi-scale summary statistic in order to infer, through ABC, the parameters of mechanistic generating models and perform model selection on real-world networks.

**Hector McKimm**, Imperial College London, United Kingdom.

*Title*: Bayesian modelling of Photon Pile-up in space-based Telescopes.

*Abstract*: Space-based telescopes contain devices designed to count the exact number of photons arriving from an astronomical source. Photon pile-up occurs when two or more photons arrive at the telescope's detector in the same time-window. In such instances, the sum of the photons' energies is recorded as the energy of a single photon. Failing to account for the effect of photon pile-up leads to inaccuracies in the analysis of the astronomical source. To model photon pile-up, alongside the variables of interest θ, we introduce auxiliary random variables { N_t } to represent the number of incident photons in each time-window. One could compute on this model using a Metropolis-within-Gibbs sampler: alternating between updates of { N_t } and θ. However, since there may be thousands of time-windows in the data-set, there are thousands of auxiliary random variables and convergence of the Metropolis-within-Gibbs Markov chain would be prohibitively slow. Instead, one could try to compute directly on θ, using the marginal likelihood for θ, by analytically marginalizing out the variables { N_t }. However, the expression for this marginal likelihood involves an infinite sum, so its exact evaluation is not possible. We propose an approximation of the exact model and use simulation experiments to assess the approximation's accuracy as well as to perform an initial analysis of the effect of photon pile-up. Future work will involve the analysis of data recorded by the Chandra telescope.

**Lorenzo Pacchiardi**, University of Oxford, United Kingdom.

*Title*: Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization.

*Abstract*: Bayesian Likelihood-Free Inference methods yield posterior approximations for simulator models with intractable likelihood. Some recent techniques employed normalizing flows, which transform samples from a base distribution via an invertible map parametrized by neural networks. Thanks to invertibility, the probability density of the transformed samples is available, so that the normalizing flow can be trained via maximum likelihood on simulated parameter-observation pairs. In contrast, Ramesh et al. (2022) approximated the posterior with generative networks, which do not impose invertibility and are thus more flexible and scale to high-dimensional structured data. However, generative networks only allow sampling from the parametrized distribution; hence, Ramesh et al. (2022) resorted to adversarial training, where the generative network plays a min-max game against a “discriminator” network. This procedure is unstable and can lead to overconfident distributions, which is detrimental to Bayesian inference. Here, we train generative networks to approximate the posterior by Scoring Rule minimization, an overlooked adversarial-free method enabling smooth training and better uncertainty quantification. In simulation studies, this approach yields better performance with a shorter training time than the adversarial framework.

**Sonia Petrone**, Bocconi University, Italy.

*Title*: Quasi-Bayes predictive algorithms.

*Abstract*: There is an exciting renewed interest in predictive-based Bayesian learning, as a foundational concept as well as in its methodological and computational implications. We present a methodology for providing Bayesian understanding, with the consequent quantification of uncertainty, of predictive algorithms - that is, approximations of a computationally intractable Bayesian procedure that provide simpler and faster estimates of a distribution of interest. Our point is that, in many cases, such an estimate can be interpreted as a predictive rule. This is the case for known and recently proposed recursive procedures for quasi-Bayesian learning with streaming data. We show how one can use fundamental properties of the predictive distribution for exchangeable sequences, namely its martingale properties, to understand the underlying statistical model and prior law, and to provide a predictive-based Monte Carlo scheme to sample from the prior and the posterior distribution. We also give asymptotic Gaussian approximations of the posterior law that reveal how the uncertainty on the unknown distribution actually depends on the structure of the algorithm, i.e. of the predictive rule, namely on its “efficiency” in incorporating information as new data becomes available.

This is joint work with Sandra Fortini.

**Ulpu Remes**, University of Helsinki, Finland.

*Title*: Likelihood-free parameter estimation and model choice with the Jensen–Shannon divergence.

*Abstract*: The usual aim in likelihood-free inference for simulator-based statistical models is to use a limited observation set to estimate an approximate posterior distribution for the unknown model parameters. The approximate posterior can be estimated, for example, based on direct comparisons between observed and simulated data. Here we consider the same principle in the frequentist large-sample setting. We focus on simulator-based models that produce categorical observation data, and we use the expected Jensen–Shannon divergence (JSD) between observed and simulated data as a model fit measure in likelihood-free parameter estimation and model comparison. JSD has attractive theoretical properties in this application. For example, we can show that the minimum JSD and maximum likelihood estimates are asymptotically equivalent for observations that follow a multinomial distribution. We also discuss ideas for likelihood-free confidence set estimation and derive a new information-theoretic criterion for likelihood-free model choice. The approaches are tested in simulation experiments. [Joint work with Jukka Corander and Timo Koski]

**Tommaso Rigon**, University of Milano-Bicocca, Italy.

*Title*: Statistical modelling within the generalized Bayes paradigm.

*Abstract*: Loss-based clustering methods, such as k-means and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative, but such methods face computational problems and large sensitivity to the choice of kernel. This article proposes a generalized Bayes framework that bridges between these paradigms through the use of Gibbs posteriors. In conducting Bayesian updating, the log-likelihood is replaced by a loss function for clustering, leading to a rich family of clustering methods. The Gibbs posterior represents a coherent updating of Bayesian beliefs without needing to specify a likelihood for the data, and can be used for characterizing uncertainty in clustering. We consider losses based on Bregman divergence and pairwise similarities, and develop efficient deterministic algorithms for point estimation along with sampling algorithms for uncertainty quantification. Several existing clustering algorithms, including k-means, can be interpreted as generalized Bayes estimators under our framework, and hence we provide a method of uncertainty quantification for these approaches; for example, allowing calculation of the probability a data point is well clustered.

**Judith Rousseau**, University of Oxford, United Kingdom.

*Title*: Targeted Posterior Distributions.

*Abstract*: In this talk I will propose a simple-to-implement post-processing of the posterior distribution for semi-parametric estimation problems. Semi-parametric inference is concerned with the estimation of some functional θ = ψ(η) of a high- or infinite-dimensional parameter η. Recent results in Bayesian semi-parametric inference have shown that it is possible to find priors on η such that the marginal posterior distribution on θ is well behaved, namely that it satisfies a Bernstein-von Mises property. However, this often comes at the cost of a less flexible (non-adaptive) prior on η. In this talk we propose a post-processing of the posterior distribution based on the Bayesian bootstrap. We apply this idea to important semi-parametric causal problems such as estimating the mean in a missing-at-random model and the average treatment effect.

This is joint work with Chris Holmes and Andrew Yiu (University of Oxford).

**Yannik Schälte**, University of Bonn, Germany.

*Title*: Accounting for data informativeness in likelihood-free inference using machine learning models.

*Abstract*: Calibrating models on high-dimensional data can be challenging and inefficient when not all data points are equally informative about the parameters, e.g. when some are only subject to background noise. This is especially relevant in likelihood-free inference methods, such as approximate Bayesian computation (ABC), which rely on the comparison of simulated and observed data via distance metrics, and are often the tool of choice to analyze complex spatial agent-based models. In this talk, we discuss how we can learn and use regression models to project data onto low-dimensional summary statistics or define weights accounting for informativeness. We demonstrate substantial improvements in robustness, efficiency, and accuracy over established approaches, and outline their conceptual deficiencies.

**Massimiliano Tamborrino**, University of Warwick, United Kingdom.

*Title*: Spectral density-based and measure-preserving guided sequential ABC for partially observed SDEs.

*Abstract*: When applying ABC to stochastic models driven by stochastic differential equations (SDEs), the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the model under the same parameter configuration result in different outputs. Moreover, since exact simulation of SDEs is rarely possible, reliable numerical methods need to be applied. Here, we show the importance of adopting reliable property-preserving numerical schemes for the synthetic data generation, and the importance of constructing specific ABC summaries that are less sensitive to the intrinsic stochasticity of the model, being based on the underlying structural model properties. We embed them within the recently proposed guided sequential ABC approaches [1], testing them on the stochastic FitzHugh-Nagumo model (modelling single neuron dynamics), and on the broad class of partially observed Hamiltonian SDEs, in particular on the stochastic Jansen-and-Rit neural mass model, both with simulated and real electroencephalography (EEG) data, for both one neural population [2] and a network of populations. The latter is particularly challenging, as the problem is high-dimensional in both the parameter space (>15 parameters) and the SDE dimension (>20).

References:

[1] U. Picchini, M. Tamborrino. Guided sequential ABC schemes for intractable Bayesian models. ArXiv 2206.12235, 2022.

[2] E. Buckwar, M. Tamborrino, I. Tubikanec. Spectral density-based and measure-preserving ABC for partially observed diffusion processes. An illustration on Hamiltonian SDEs. Stat. Comput. 30 (3), 627-648, 2020.

**Aki Vehtari**, Aalto University, Finland.

*Title*: Efficient Bayesian inference when likelihood computation includes numerical algorithms with adjustable error tolerances.

*Abstract*: Statistical models can involve implicitly defined quantities, such as solutions to nonlinear ordinary differential equations (ODEs), that unavoidably need to be numerically approximated in order to evaluate the model. The approximation error inherently biases statistical inference results, but the amount of this bias is generally unknown and often ignored in Bayesian parameter inference. We propose a computationally efficient method for verifying the reliability of posterior inference for such models, when the inference is performed using Markov chain Monte Carlo methods. We validate the efficiency and reliability of our workflow in experiments using simulated and real data, and different ODE solvers.

Joint work with Juho Timonen, Nikolas Siccha, Ben Bales, and Harri Lähdesmäki.

**Yuexi Wang**, University of Chicago Booth, United States.

*Title*: Adversarial Bayesian Simulation.

*Abstract*: In the absence of explicit or tractable likelihoods, Bayesians often resort to approximate Bayesian computation (ABC) for inference. Our work bridges ABC with deep neural implicit samplers based on generative adversarial networks (GANs) and adversarial variational Bayes. Both ABC and GANs compare aspects of observed and fake data to simulate from posteriors and likelihoods, respectively. We develop a Bayesian GAN (B-GAN) sampler that directly targets the posterior by solving an adversarial optimization problem. B-GAN is driven by a deterministic mapping learned on the ABC reference by conditional GANs. Once the mapping has been trained, iid posterior samples are obtained by filtering noise at a negligible additional cost. We propose two post-processing local refinements using (1) data-driven proposals with importance reweighting, and (2) variational Bayes. We support our findings with frequentist-Bayesian results, showing that the typical total variation distance between the true and approximate posteriors converges to zero for certain neural network generators and discriminators. Our findings on simulated data show highly competitive performance relative to some of the most recent likelihood-free posterior simulators.

**Chaya Weerasinghe**, Monash University, Australia.

*Title*: Approximate Bayesian Forecasting in Misspecified State Space Models.

*Abstract*: We propose a new approach to performing Bayesian forecasting in state space models that yields accurate predictions without relying on correct model specification. This new approach constructs a predictive distribution using approximate Bayesian computation (ABC). The summary statistics that underpin ABC are produced via a criterion function that rewards a user-specified measure of predictive accuracy and, in so doing, produces a predictive distribution that performs well in that measure. The method is illustrated numerically using simulated data, demonstrating its effectiveness, including in comparison with exact MCMC-based predictions. In particular, coherent predictions are in evidence, whereby the ABC predictive constructed via the use of a particular scoring rule is shown to perform the best out of sample according to that rule, and better than the exact (but misspecified) Bayesian predictive. (Joint work with Loaiza-Maya, Martin and Frazier.)