This is huge! Now watch LLM API costs drop even further. These papers almost feel like cheat codes, and are why closed companies like OpenAI don’t publish their important work anymore. Mind-boggling that it is even possible. Full research paper: https://t.co/Kvj3lRTONE?s=09 Abstract: https://image.nostr.build/955e3e7a4e37cf74162329132f5d0d29afb1ebefa56fca46e859bdcca83a9eb6.jpg
Check this out too 🤯 https://huggingface.co/papers/2312.11514
Not your model, not your prompt.
Link to the code - https://github.com/SJTU-IPADS/PowerInfer