 Been using Mistral casually. Impressive indeed, yet I'm missing the 'learning' part of the tech, or at least what I imagined it to be: the model learning from everyday usage, as in assistant mode... 🤔
 I think for that you have to do training. If you’re not specifically running a training operation, then the results are limited by the model’s context window.

The context window is the number of tokens the model can keep in mind and work on at a time. Current generation models have context windows of about 10k tokens, meaning that anything you or it said further back than that is lost.
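
To make that concrete, here's a rough sketch of what "lost" means. It assumes the simplest possible policy, where the oldest tokens just fall off the front; real chat frontends may trim or summarize differently. The name `visible_context` and the 10k default are made up for illustration.

```python
def visible_context(tokens, window=10_000):
    """Return the most recent `window` tokens; anything older is dropped."""
    if window <= 0:
        return []
    return tokens[-window:]

# Stand-in for a 12k-token conversation (real tokens would be vocabulary ids).
history = list(range(12_000))
kept = visible_context(history)

print(len(kept))  # 10000
print(kept[0])    # 2000, so the first 2000 tokens are no longer visible
```

Nothing is "learned" here: once a token slides out of the window, the model simply never sees it again unless you paste it back in.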

Note that some words are single tokens, while others require multiple tokens. Punctuation and whitespace take up tokens as well.
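
A toy illustration of that, in case it helps. This is NOT how Mistral's tokenizer works (real ones use a learned subword vocabulary); `toy_tokenize` and the 4-character piece size are assumptions I picked only to show why token counts run higher than word counts.

```python
import re

def toy_tokenize(text, max_piece=4):
    """Toy tokenizer: short words stay whole, long words are split into
    fixed-size pieces, and punctuation and whitespace runs count too."""
    pieces = re.findall(r"[A-Za-z0-9]+|\s+|[^A-Za-z0-9\s]", text)
    tokens = []
    for piece in pieces:
        if piece.isalnum() and len(piece) > max_piece:
            # Long word: break it into several tokens.
            tokens.extend(piece[i:i + max_piece]
                          for i in range(0, len(piece), max_piece))
        else:
            # Short word, punctuation mark, or whitespace run: one token.
            tokens.append(piece)
    return tokens

print(toy_tokenize("Hi, tokenization!"))
# ['Hi', ',', ' ', 'toke', 'niza', 'tion', '!']
```

Two words, but seven tokens under this toy scheme, which is why a 10k-token window holds noticeably fewer than 10k words.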