Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference

Created 6mo | Nov 19, 2024, 2:30:12 AM

