The reversal curse is probably just the most macroscopic of the many logical failures of large language models.

I've said it many times: LLMs are statistical parrots, and statistical parrots will remain parrots no matter how much data you train them on. We shouldn't underestimate the importance of forging logical reasoning into the scaffolding of AI, nor expect it to emerge on its own just through repeated exposure to examples.

We now have probably the most spectacular example of this parrotry at work.

Train a model on the sentence "Linus Torvalds is the creator of Linux". Ask "Who is Linus Torvalds?", and it'll probably return "the creator of Linux". Then ask "Who is the creator of Linux?", and it'll return some random stuff.
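To see where the asymmetry comes from, here's a toy sketch: a bigram counter trained on that single sentence. It's a deliberate caricature (a transformer is obviously far more than a bigram table), but the training objective has the same directionality: the model only ever learns what follows a token, never what precedes it.

```python
from collections import defaultdict

# Toy next-token model "trained" on one A-is-B fact.
corpus = "linus torvalds is the creator of linux".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # only forward transitions are ever stored

def complete(prompt: str, max_tokens: int = 6) -> str:
    """Greedily extend the prompt with the most frequent next token."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing was ever seen after this token
        tokens.append(max(followers, key=followers.get))
    return " ".join(tokens)

print(complete("linus"))     # -> "linus torvalds is the creator of linux"
print(complete("linux is"))  # -> "linux is the creator of linux": garbage,
                             #    because the fact was never stored in reverse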

Even relatively sophisticated models like GPT-3 fail at this. The only reason most of us don't notice these failures is that the models have been trained on so much data that the likelihood of stumbling on one of these holes is relatively low. Again, we're dealing with parrots.

In defence of the creators of these models, recognizing an A=B case in natural language (and therefore inferring that its inverse B=A also applies) isn't always easy.

If I say "Alan Turing was a scientist", for example, it doesn't imply that every scientist is Alan Turing.
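One hedged sketch of what handling this distinction could look like: treat facts as typed relations, and generate the reverse statement only when the relation is known to be one-to-one. The relation names and the INVERTIBLE set below are my own illustrative assumptions, not anything from the paper.

```python
# Which relations can be safely reversed is declared, not guessed (assumption:
# these two relation names and their classification are illustrative only).
INVERTIBLE = {"creator_of"}  # one-to-one: reversing "A is B" is sound
                             # class membership like "member_of" is not

facts = [
    ("Linus Torvalds", "creator_of", "Linux"),
    ("Alan Turing", "member_of", "scientists"),
]

def augment(facts):
    """Add the reverse statement only for relations known to be one-to-one."""
    out = list(facts)
    for subj, rel, obj in facts:
        if rel in INVERTIBLE:
            out.append((obj, f"inverse_{rel}", subj))
    return out

for fact in augment(facts):
    print(fact)
# ("Linux", "inverse_creator_of", "Linus Torvalds") gets added;
# no ("scientists", ..., "Alan Turing") appears, because not every
# scientist is Alan Turing.
```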

And that's exactly why a real language model must be much more nuanced than a statistical parrot.

https://owainevans.github.io/reversal_curse.pdf