Efficient Streaming Language Models with Attention Sinks https://github.com/mit-han-lab/streaming-llm https://news.ycombinator.com/item?id=37740932