What is Stan?

  • An imperative probabilistic programming language
    • in the vein of BUGS, but can do arbitrary computation
  • A Stan program defines a probability model
    • declares data and (constrained) parameter variables
    • all in order to define a log posterior density function
  • Stan comes with a few inference algorithms
    • MCMC (HMC, via the NUTS sampler) for full Bayesian inference
    • VB (variational Bayes) for (one form of) approximate Bayesian inference
    • Optimization for penalized maximum likelihood estimation (penalized MLE)
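
To make "define a log posterior density function" concrete, here is a minimal hand-written sketch in Python for a toy model: a normal likelihood with known unit scale and a normal prior on the mean. It is not Stan code and does not show how Stan works internally. The model, the data, and the names `normal_lpdf`, `log_posterior`, and `prior_scale` are all assumptions made up for illustration. The crude grid search at the end stands in for what a mode-finding (penalized maximum likelihood) algorithm computes.

```python
import math

def normal_lpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_posterior(mu, y, prior_scale=10.0):
    """Unnormalized log posterior: log prior plus log likelihood."""
    lp = normal_lpdf(mu, 0.0, prior_scale)             # prior: mu ~ Normal(0, prior_scale)
    lp += sum(normal_lpdf(y_n, mu, 1.0) for y_n in y)  # likelihood: y_n ~ Normal(mu, 1)
    return lp

y = [1.2, 0.7, 1.9, 1.4]  # hypothetical data

# Crude grid search over mu in [-5, 5] for the posterior mode.
grid = [i / 1000.0 for i in range(-5000, 5001)]
mode = max(grid, key=lambda mu: log_posterior(mu, y))
print(mode)
```

For this conjugate toy model the mode is also available in closed form, as sum(y) / (n + 1 / prior_scale²), which gives a quick check on the grid search. Sampling algorithms such as HMC only need to evaluate this same function (and its gradient) at arbitrary parameter values.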

Why choose Stan?

  • Expressive
    • a full imperative programming language
    • makes models easier to communicate
  • Robust
    • usually works; signals when it doesn’t
  • Efficient
    • often the winner on effective sample size per unit time
    • three forms of parallelism coming out this year (CPU, GPU, cluster)
  • Great documentation, case studies, and community!

Why use Stan over lme4 et al.?

  • Bespoke models
  • Propagate uncertainty
  • Bayesian inference is often more intuitive
  • Focus on better science

What’s next?

  • Distributed likelihoods with multi-CPU parallelism
  • Big matrix operations with GPU support
  • Sparse matrix operations
  • Distributed data: asynchronous expectation propagation
  • Approximations: parallel max marginal mode
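
The first item above rests on a simple fact: for conditionally independent observations, the log likelihood is a sum over data points, so it can be split into shards, evaluated separately, and added back together. The following Python sketch illustrates only that map-and-sum pattern. It is not Stan's implementation. The model (unit-scale normal), the names `shard_log_lik` and `parallel_log_lik`, and the shard size are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import math

def shard_log_lik(args):
    """Log likelihood of one data shard under y_n ~ Normal(mu, 1)."""
    mu, shard = args
    return sum(-0.5 * (y_n - mu) ** 2 - 0.5 * math.log(2.0 * math.pi)
               for y_n in shard)

def parallel_log_lik(mu, shards, max_workers=4):
    """Map each shard to a partial sum, then reduce by adding the partials."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(shard_log_lik, [(mu, s) for s in shards])
        return sum(partials)

y = [0.1 * i for i in range(100)]                     # hypothetical data
shards = [y[i:i + 25] for i in range(0, len(y), 25)]  # four shards of 25 points

total = parallel_log_lik(1.0, shards)
serial = shard_log_lik((1.0, y))                      # same sum in one pass
print(total, serial)
```

A thread pool keeps the sketch self-contained. In CPython, threads do not speed up pure-Python arithmetic, so real multi-CPU gains would need separate processes or native code.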