 Like… I post the dataset, along with a bid, to a coordinator. The coordinator breaks the training job into small chunks and distributes the reward to everyone who ran part of the training.
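To make the idea concrete, here is a rough sketch of what such a coordinator could do. Everything in it is an assumption for illustration: the function names `split_into_chunks` and `distribute_rewards`, the bid being an integer amount in some smallest currency unit, and paying workers in proportion to the number of chunks they completed. It ignores the hard parts, like verifying that the work was actually done.

```python
def split_into_chunks(dataset, chunk_size):
    """Break the dataset into consecutive chunks of at most chunk_size items."""
    return [dataset[i:i + chunk_size] for i in range(0, len(dataset), chunk_size)]


def distribute_rewards(bid, completed):
    """Split an integer bid across workers, proportional to chunks completed.

    completed maps a worker name to the number of chunks it finished
    (hypothetical bookkeeping; a real coordinator would need proof of work).
    """
    total = sum(completed.values())
    if total == 0:
        return {}
    # Integer division keeps payouts in whole units.
    payouts = {w: bid * n // total for w, n in completed.items()}
    # Hand out the units lost to rounding, one each, to the busiest workers.
    remainder = bid - sum(payouts.values())
    busiest_first = sorted(completed, key=lambda w: (-completed[w], w))
    for w in busiest_first[:remainder]:
        payouts[w] += 1
    return payouts


chunks = split_into_chunks(list(range(10)), chunk_size=4)
payouts = distribute_rewards(1000, {"alice": 2, "bob": 1})
```

Here 10 samples become 3 chunks, and a bid of 1000 is split roughly two to one between the two workers, with the payouts always adding up to the full bid.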
 Training currently needs a large number of GPUs on the same machine, but I'm sure someone will figure out how to scale it out. Mixture of experts is the popular approach nowadays. Maybe each of those experts could be trained on a single machine and combined later.
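As a toy illustration of the "combine later" step, the sketch below mixes the outputs of separately built experts using a softmax gate. The experts and the gate here are made-up functions, not trained models, and `combine_experts` is a hypothetical name. In real mixture-of-experts models the gate is usually trained together with the experts, so whether independently trained experts combine well is exactly the open question.

```python
import math


def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(experts, gate, x):
    """Weighted sum of expert outputs, with weights chosen by the gate."""
    weights = softmax(gate(x))
    return sum(w * expert(x) for w, expert in zip(weights, experts))


# Stand-ins for experts that were trained on separate machines (assumption).
experts = [lambda x: 2 * x, lambda x: x * x]
# Stand-in gate: prefers the first expert for negative x, the second for positive x.
gate = lambda x: [-x, x]

y = combine_experts(experts, gate, 10.0)
```

For a large positive input the gate puts almost all of the weight on the second expert, so `y` ends up very close to that expert's output alone.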