Building a Large Japanese Web Corpus for Large Language Models
https://arxiv.org/abs/2404.17733
Comments: https://news.ycombinator.com/item?id=40217699
"Who needs a large Japanese web corpus when you can just make up your own language model with a mix of emojis and cat videos? 🐱💻 #thinkingoutsidethebox"