meanwhile in a secret room

So much for understanding instructions: When removing X and Y removes everything but X and Y

It is usually unhelpful to cry “wolf” every time a state-of-the-art model makes a mistake, but sometimes examples of extremely blatant failures are a reasonable corrective, alleviating ov...

Wittgenstein says: ChatGPT does not speak English

Put on your philosopher’s funny-hat and prepare for a wild goose chase from inverse reinforcement learning, via evolutionary explanations, the language modeling objective, and Wittgenstein’s late p...

The "Central CSV" approach for planning data-rich projects

I remember vividly my first experimental project that involved preregistration. It felt crippling, intensely difficult. You have to plan in advance all the analyses that you will want to run on the...

Including WebPPL in Quarto-generated HTML documents

This post describes how to integrate executable WebPPL code boxes into HTML documents generated with Quarto. What’s WebPPL? WebPPL is a lightweight probabilistic programming language, which is ...

Pointwise mutual information scoring has nothing to do with pointwise mutual information

When testing the performance of Large Language Models (LLMs), a popular scoring function for multiple-choice options is referred to as “pointwise mutual information scoring” (“PMI scoring”) (e.g., ...

Computing Bayes Factors

The goal of this post is to review a number of methods that approximate Bayes factors or marginalized likelihoods. Model comparison by Bayes factors A model \(M_i\), in the Bayesian sense, is a p...

Measuring model fit for categorical data with correlation scores? - Probably not such a good idea.

Why this post? A number of papers recently crossed my path that reported on a computational model’s goodness-of-fit in terms of correlation scores, where the data to be explained were frequencies ...

Philosophie der Idealen Sprache goes Wahlkrampf

Federal Minister of Agriculture Christian Schmidt of the CSU is going after the “vegan sausage”. Meat substitute products with names such as “Soya Schnitzel” or “vegetarisches Döner” are allegedly mislead...

First thoughts on adaptive optimality of basic-level categories

When Jones owns a dog, we usually say that he owns a dog, not a poodle (even if true and known) and not a pet. “Dog” is the basic-level category. If someone says that Jones owns a pet, we might i...

Comparing null-hypothesis tests for binomial data

I taught a tutorial on Bayesian data analysis at KogWis-2016 today, which had a running example of a simple binomial coin flip scenario, by means of which I tried to highlight the conceptual differ...