Russia responds to NATO exercises simulating a landing in Ukraine (18:04)
Tied embed, RoPE digit routing, SiLU carry logic
For each model, reasoning was enabled and the reasoning effort was set to high. I included GPT 5.2 because it could be argued that it reasons better than mini. However, I couldn't test GPT 5.2 as much as the other models because it was too costly. Gemini 3 Pro was costly as well, but it spent less time reasoning than GPT 5.2, which made it more affordable in my experience.
1.25 compiler does this transformation for you! For certain slice