Building a Large Japanese Web Corpus for Large Language Models
Paper: https://arxiv.org/abs/2404.17733
Comments: https://news.ycombinator.com/item?id=40217699