 Lol, did the open-source community accidentally just fix the LLM hallucination problem? 😳

Context:
It seems like the Shrek entropy sampler with early exit significantly reduces, if not solves, the hallucination problem with big boy models. Some people are running evaluations now, and so far it looks promising. 👀
 > Shrek entropy sampler https://i.nostr.build/XTMmUeKPPRdDy2Vc.jpg  
 I did some testing with several types of models, including the Llama 1B model. I haven't seen the official benchmarks yet, but the improvements are very noticeable even on very small models.

System prompts:
https://image.nostr.build/88bbf240175554b202f853ba6228453ab8c598494175634fe5ed247c4b927288.jpg
nostr:nevent1qqs0mr006vwv866frr6mzheqmmdhlflyv6yptmlvfz249esuzj87fhgpzdmhxue69uhhwmm59e6hg7r09ehkuef0qgsvdac80utfn4gvly4fv54la0l6cp0udpptnm3ezzyajkdc44w53lgrqsqqqqqpr2mdyd 
 Can you share a link or expand on this? What does the entropy sampler do here? 
 I think this is the related repo, right @iefan 🕊️ ?

https://github.com/xjdr-alt/entropix 
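
For anyone wondering what an entropy sampler does in general: the rough idea is to look at how uncertain the model is about the next token and change the sampling strategy based on that. Here is a minimal sketch of that general idea in plain Python. It is not the code from the repo above; the function names, the threshold of 1.0, and the temperature of 1.5 are all made up for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, with optional temperature scaling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_token(logits, threshold=1.0, hot_temperature=1.5, rng=random):
    """Hypothetical entropy-gated sampler (illustrative only).

    Low entropy  -> the model is confident, so take the top token (greedy).
    High entropy -> the model is unsure, so sample at a higher temperature.
    Returns (token_index, entropy).
    """
    probs = softmax(logits)
    h = entropy(probs)
    if h < threshold:
        return max(range(len(logits)), key=lambda i: logits[i]), h
    hot = softmax(logits, hot_temperature)
    return rng.choices(range(len(logits)), weights=hot, k=1)[0], h

# One peaked distribution (confident) and one flat distribution (unsure).
confident_token, confident_h = pick_token([10.0, 0.0, 0.0])
unsure_token, unsure_h = pick_token([0.0, 0.0, 0.0, 0.0])
```

A real sampler would likely use more signals than a single entropy threshold, so treat this as a toy to get the intuition, and check the repo for what it actually does.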
 What's the Shrek entropy sampler? Is this related to your earlier note? @nostr:nevent1qqsq72y32j707ugtgu26h77marmzrsu0t6dtfpw500q6j49fp3hzckgppemhxue69uhkummn9ekx7mp0qgsvdac80utfn4gvly4fv54la0l6cp0udpptnm3ezzyajkdc44w53lgrqsqqqqqpm0mux9 
 It's a very early development. Feel free to test it out firsthand. I'll share updates once official benchmarks become available. 
nostr:nevent1qqs2u86x8d05tkpjtmxc9jfq2rwqh99q4zau5gwfpsw853ahgcxe7mspzamhxue69uhhyetvv9ujuvrcvd5xzapwvdhk6tczyrr0wpmlz6va2r8e92t990ltl7kqtlrgg2u7uwgs38v4nw9dt4y06qcyqqqqqqgc3zyjv 
 OSS for the win 😃 

Turns out that people working on things they find interesting is pretty cool 😎 
nostr:nevent1qqstp7xa22u8r4kphv65dcx52nv0q63lsq0axdrrgshf3yvvpgd5u3gpr4mhxue69uhkummnw3ezucnfw33k76twv4ezuum0vd5kzmp0qgsvdac80utfn4gvly4fv54la0l6cp0udpptnm3ezzyajkdc44w53lgrqsqqqqqplv6xqy 
 o1-style chain of thought with a local Llama 1B model (aka the Shrek sampler) is mostly working...👀
https://image.nostr.build/2e3517726b549f716cb7183c7bbc9cbb0b2659db8e3214594bb3c39e44a0576d.jpg
nostr:nevent1qqstp7xa22u8r4kphv65dcx52nv0q63lsq0axdrrgshf3yvvpgd5u3gpzamhxue69uhhyetvv9ujuvrcvd5xzapwvdhk6tczyrr0wpmlz6va2r8e92t990ltl7kqtlrgg2u7uwgs38v4nw9dt4y06qcyqqqqqqgn6ylz8