Oddbean

nostr:npub1d62z0nl8twfw37nrdr3cfrr66pq8a3nclmmkqp6prrtqgjjen85spvtshf No, it is not using the most probable word (that would be greedy decoding, which is highly boring, not to mention that it could not be trained well with RLHF). It is finding the 20 most profitable words (yes, profitable as scored by another prediction model for human evals, not probable) and sampling from them. Regarding understanding, that is debatable. I personally would say that knowing the relationships between objects is the essence of understanding, but that's just my definition.
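
To make the greedy vs. sample-from-the-top-20 difference concrete, here is a rough sketch. It assumes each candidate word already has a numeric score; where that score comes from (raw probability or some reward-tuned model) is outside the sketch. The names `greedy_pick`, `top_k_sample`, and `scores` are made up for illustration and are not from any particular library or model.

```python
import math
import random

def greedy_pick(scores):
    """Always return the index of the single highest-scoring word."""
    return max(range(len(scores)), key=lambda i: scores[i])

def top_k_sample(scores, k=20, rng=random):
    """Sample one word index from the k highest-scoring words."""
    k = min(k, len(scores))
    # Indices of the k best-scoring words, best first.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept scores only (subtract the max for stability).
    best = scores[top[0]]
    weights = [math.exp(scores[i] - best) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

if __name__ == "__main__":
    # Hypothetical scores for a 50-word vocabulary (assumption, not real data).
    scores = [0.1 * i for i in range(50)]
    print("greedy:", greedy_pick(scores))
    print("top-k :", [top_k_sample(scores) for _ in range(5)])
```

Greedy returns the same index every time for the same scores, while the top-k version varies between runs but never leaves the 20 best candidates.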