llama.cpp or koboldcpp seem to be the best way to run LLMs on AMD GPUs, this stuff just werks and is fast, and 4-bit quantization works too! https://skippers-bin.com/files/007a0d1d-04a5-4fdb-80ab-6c4c8c8673e7
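
If you'd rather drive it from Python than the CLI, the llama-cpp-python bindings wrap llama.cpp; a minimal sketch, assuming you installed the bindings with ROCm/HIP support so layers actually offload to the AMD GPU (the model filename below is made up, point it at whatever 4-bit GGUF you downloaded):

```python
# Minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python,
# built with HIP/ROCm enabled so GPU offload works on AMD cards).
from llama_cpp import Llama

# Hypothetical 4-bit quantized GGUF file; substitute your own model path.
llm = Llama(
    model_path="./llama-7b.Q4_K_M.gguf",  # 4-bit quant keeps VRAM usage low
    n_gpu_layers=-1,                      # offload all layers to the GPU
    n_ctx=2048,                           # context window size
)

out = llm("Q: What GPUs can run llama.cpp? A:", max_tokens=64)
print(out["choices"][0]["text"])
```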