Finally? I guess it's good that they finally entered the industry, but it's looking like AMD will always play the catch-up game. How long can they keep that up, I wonder? 🤔
_____________________________________________
AMD unveils its first 1B-parameter large language model, AMD OLMo, with strong reasoning capabilities
• AMD has introduced AMD OLMo, its first series of fully open-source 1-billion-parameter large language models (LLMs), trained on the company's Instinct MI250 GPUs. The models offer strong reasoning, instruction-following, and chat capabilities, and are meant to strengthen AMD's position in the AI industry while giving customers models they can deploy end-to-end on AMD hardware.
• The AMD OLMo models were trained on 1.3 trillion tokens across 16 nodes, each equipped with four AMD Instinct MI250 GPUs. Training involved three steps: pre-training on a subset of Dolma v1.7, supervised fine-tuning (SFT) on several instruction datasets, and alignment to human preferences using Direct Preference Optimization (DPO) — a brief sketch of the DPO objective follows below the link.
• In testing, AMD OLMo models demonstrated impressive performance against similar open-source models in general reasoning and multi-task understanding benchmarks. The two-phase supervised fine-tuning approach significantly improved accuracy, and the final AMD OLMo 1B SFT DPO model outperformed other chat models by at least 2.60% on average.
• AMD also evaluated the models on responsible AI benchmarks, such as toxicity, bias, and truthfulness, and found them comparable to other models of similar size on ethical and responsible-AI tasks.
https://www.tomshardware.com/tech-industry/artificial-intelligence/amd-unveils-amd-olmo-its-first-1b-parameter-llm-with-strong-reasoning
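For anyone curious about the last step in that training pipeline, here is a minimal, illustrative sketch of the Direct Preference Optimization loss in PyTorch. This is not AMD's training code; the function name, the toy log-probabilities, and the beta value are assumptions purely for illustration.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Sketch of the Direct Preference Optimization (DPO) loss.

    Each argument is a tensor of per-sequence log-probabilities (summed over
    tokens) for the preferred ("chosen") and dispreferred ("rejected")
    responses, under the policy being trained and a frozen reference model.
    """
    # How much the policy favors each response relative to the reference model
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Encourage a large positive margin between chosen and rejected responses
    loss = -F.logsigmoid(chosen_rewards - rejected_rewards)
    return loss.mean()

# Toy usage with made-up log-probabilities for a batch of two preference pairs
policy_chosen = torch.tensor([-12.3, -8.1])
policy_rejected = torch.tensor([-14.0, -9.5])
ref_chosen = torch.tensor([-12.5, -8.4])
ref_rejected = torch.tensor([-13.2, -9.0])
print(dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected))
```

The idea is that the model learns directly from preference pairs, without training a separate reward model as in classic RLHF, which is what makes DPO a comparatively lightweight final alignment step.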