Oddbean

Cutting the output length by 60% gave me a 2x speedup.

A major application of LLMs is turning unstructured data into structured data.