Cerebras had a good IPO today. Their technology is very fast, but in terms of throughput, Nvidia's NVL72 is 7x cheaper. FYI, Huawei's current and last generations are even cheaper than Nvidia per token. One advantage Cerebras has, though, is that production does not depend on HBM3/4 memory. A big disadvantage is that their software is difficult, and they don't offer any models newer than about a year old, as they are slow to implement bleeding-edge optimizations. Long contexts are also relatively slow on Cerebras.

Their 2nd customer is OpenAI. Under a $20B, 3-year lease for up to 750 MW of compute (equal to 6 years of 250 MW blocks), the most optimistic cost per token possible for OpenAI is $10.53/M tokens, at 100% utilization. A realistically optimistic real-world figure is 20% utilization, which works out to over $50/M tokens. OpenAI is initially using the cluster to run Codex-Spark 5.3, for which it charges customers $14/M tokens. OpenAI also has the privilege of paying for all OPEX. Power alone, at just 7c/kWh, adds about 50c/M tokens in the ideal case.
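For anyone who wants to check the arithmetic, here is a rough sketch using only the numbers quoted above. Two things are my assumptions and not from any contract: that the $10.53/M figure is simply the lease divided by total tokens at full utilization, and that a year is 8,760 hours of 24/7 operation. The function and constant names are made up for illustration.

```python
# Back-of-envelope check of the per-token figures quoted in the post.
LEASE_USD = 20e9               # total lease value from the post
BLOCK_MW = 250                 # block size from the post
BLOCK_YEARS = 6                # "6 years of 250 MW blocks" from the post
BEST_CASE_USD_PER_M = 10.53    # post's cost per million tokens at 100% utilization
POWER_USD_PER_KWH = 0.07       # post's electricity price
HOURS_PER_YEAR = 8760          # assumption: non-leap year, running 24/7


def cost_per_million(utilization: float) -> float:
    """Lease cost per million tokens at a given utilization (0 < u <= 1)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Fewer tokens served for the same lease means a higher cost per token.
    return BEST_CASE_USD_PER_M / utilization


def power_cost_per_million() -> float:
    """Electricity cost per million tokens at full utilization."""
    energy_kwh = BLOCK_MW * 1000 * BLOCK_YEARS * HOURS_PER_YEAR
    power_usd = energy_kwh * POWER_USD_PER_KWH
    # Assumption: total tokens implied by the post's best-case figure.
    implied_million_tokens = LEASE_USD / BEST_CASE_USD_PER_M
    return power_usd / implied_million_tokens


if __name__ == "__main__":
    print(f"20% utilization: ${cost_per_million(0.20):.2f} per M tokens")
    print(f"Power at full load: ${power_cost_per_million():.2f} per M tokens")
```

Under those assumptions, 20% utilization comes out to about $52.65/M tokens and power to roughly 48c/M tokens, which lines up with the "over $50" and "50c" figures above.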

Their first customer was G42, a group owned by the UAE monarchy. Even if the UAE has permission to buy Nvidia, Cerebras has quicker delivery, and the UAE helped with/controls the software. Apparently, Arabic has advantages on the chip, but they are still planning Nvidia-dominated expansions, with Patriot air defense guarding the systems.
