Why LLMs aren't creative

One other point against the Deutschian view of creativity, that it is merely conjecture followed by criticism, followed by a revised conjecture, followed by still more criticism, and on and on, continually error-correcting towards truth: this can't be all of it. For creativity, crucially, there's a value judgement step, almost purely aesthetic in nature, which doesn't reduce to error correction. Error is a conflict between ideas; it's contradiction, it's finding that a certain implication of your theory contradicts a well-established fact, so that one of those is necessarily wrong. Whereas aesthetic judgement is an intuitive sense of whether your artifact meets certain inner standards, the reasons for which cannot usually be articulated. That's not so much error as it is misalignment.

Which gives, I think, some clue towards why LLMs aren't creative even though they are quite intelligent problem-solvers. An LLM is trained to predict the next token, which, if it's to be done well, requires some detailed simulation of the world it's operating in. It needs high-fidelity logical reasoning. The training is certainly selecting for intelligence, for pattern recognition; and the model is penalised if it strays far from the pattern.
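To make that penalty concrete, here is a minimal sketch, assuming the usual cross-entropy objective for next-token prediction. The four-token vocabulary and the probabilities are made up for illustration; nothing here comes from a real model.

```python
import math

def cross_entropy(probs, target):
    """Next-token loss: negative log of the probability assigned to the token that actually came next."""
    return -math.log(probs[target])

# Hypothetical 4-token vocabulary; the token that actually follows is index 0.
target = 0
confident_right = [0.97, 0.01, 0.01, 0.01]   # sticks to the pattern
uniform = [0.25, 0.25, 0.25, 0.25]           # a truly random guess
confident_wrong = [0.01, 0.97, 0.01, 0.01]   # a bold deviation

losses = {
    "follows the pattern": cross_entropy(confident_right, target),
    "truly random": cross_entropy(uniform, target),
    "strays from the pattern": cross_entropy(confident_wrong, target),
}

for name, loss in losses.items():
    print(f"{name}: {loss:.3f}")
```

The loss grows as the prediction moves away from what actually came next: the random guess is penalised more than the pattern-follower, and the confident deviation is penalised most of all.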

A truly random prediction is a bad prediction as far as an LLM is concerned. But without deviation, you can't get creativity. You want to first deviate, with all the errors and incoherence that entails, and then the hidden nugget of something interesting must be rescued, cleaned up, error-corrected, and judged against a strong aesthetic standard, as we humans do when we attempt to produce something new.

It's an open question whether an LLM can in fact have an inner value judge, and if it can, whether that judge faithfully approximates human value judgement. Our judging function isn't apparent in the text, but neither is the meaning of those texts, and yet the model has somehow learned that; so maybe it can, simply from revealed preferences, make a sharp guess at what sort of filtering policies led to the creation of those texts.

But at least it seems to me that our felt sense of art is a strong component (and maybe the only relevant component) in determining what we like and what we don't like. Whether this can be simulated is the question. I genuinely don't know.

Humanity is constantly pushing the boundaries of art in all possible directions; LLMs and image models, with their millions of hours of compute already completed, have scarcely moved the needle. Maybe this is why.