
This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.
I doubt it, but honestly, many systems can do inference pretty well. For example, I ran the MLX version of Qwen 3 4B with a DuckDuckGo search RAG on a MacBook Air M2 with 16 GB of RAM, used it to ask quick questions and verify simple things, and it barely made a dent in RAM utilisation or the SoC. The same goes for my much less powerful machines: even a Galaxy A20, with 3 GB of memory and a low-spec octa-core Exynos, can run small models really well, although the quantisation needs to be a bit strict.
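For anyone curious what a setup like that looks like, here's a minimal sketch of a search-augmented QA loop. It assumes the `duckduckgo_search` and `mlx_lm` packages are installed; the model name is illustrative, not a specific recommendation from the comment above.

```python
# Minimal search-RAG sketch: fetch DuckDuckGo snippets, pack them into a
# prompt, and run a small local model on Apple silicon via mlx_lm.

def build_prompt(question: str, snippets: list[str]) -> str:
    """Pack retrieved search snippets and the question into one prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str, max_results: int = 5) -> str:
    # Imports kept local so the prompt helper above works without them.
    from duckduckgo_search import DDGS   # web search client
    from mlx_lm import load, generate    # Apple-silicon inference

    hits = DDGS().text(question, max_results=max_results)
    snippets = [h["body"] for h in hits]
    model, tokenizer = load("mlx-community/Qwen3-4B-4bit")  # illustrative name
    return generate(model, tokenizer, prompt=build_prompt(question, snippets))
```

The retrieval here is deliberately naive (raw search snippets, no embeddings or reranking), which is usually plenty for quick fact checks of the kind described.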
I’d argue it’s inevitable for the simple reason that the whole AI-as-a-service business model is a catch-22. Current frontier models aren’t profitable, and all the current service providers live off VC funding. But if models become cheap enough to be profitable, then they’re cheap enough to run locally too. And there’s little reason to expect that models won’t keep being optimized going forward, so we’re going to hit an inflection point where local becomes the dominant paradigm.
We’ve seen the pendulum swing between mainframe and personal computer many times before. I expect this will be no different.
Also, if y’all are interested, run local models!
It’s not theoretical.
The cost of hybrid inference is very low; you can squeeze Qwen 35B onto a 16 GB RAM machine as long as it has some GPU. Check out ik_llama.cpp, and ubergarm’s quants in particular:
https://huggingface.co/ubergarm/models#repos
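The back-of-the-envelope arithmetic behind that claim is simple: a quantized model's footprint is roughly parameters × bits per weight, which is why a ~30B-class model at 4-bit sits right at the edge of 16 GB and needs part of itself offloaded to a GPU. A rough sketch, with ballpark numbers that are my assumption rather than anything from ik_llama.cpp's docs:

```python
def model_size_gb(n_params_billion: float, bits_per_weight: float,
                  overhead: float = 1.1) -> float:
    """Rough memory footprint: params * bits / 8, plus ~10% for the
    KV cache and runtime buffers (a ballpark, not a precise estimate)."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

# A ~30B model at 4-bit lands around 16-17 GB, so on a 16 GB machine
# some layers have to live on the GPU; aggressive ~2.7-bit quants of
# the kind ubergarm publishes shrink it well under that.
print(f"{model_size_gb(30, 4.0):.1f} GB at 4-bit")
print(f"{model_size_gb(30, 2.7):.1f} GB at ~2.7-bit")
```

This is why "some GPU" matters: hybrid inference splits the weights across system RAM and VRAM, so neither pool has to hold the whole model.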
But if you aren’t willing to even try, I think that’s another bad omen for local models. Like the Fediverse, it won’t be served to you on a silver platter, you gotta go out and find it.
It’s really unfortunate how a lot of people have a knee jerk reaction towards anything LLM related right now. While you can make good arguments for avoiding proprietary models offered as a service, there’s really no rational reason to shun open models. If anything, it’s important to develop them into a viable alternative to corporate offerings.
I think it’s an extension of people only conceiving of these things within capitalism (although they might call it techno-feudalism or some shit). I remember the phrase “if you aren’t paying for something, you’re the product” and thinking that so many people don’t realize we already have things that fall outside of that, like so much of the FOSS ecosystem, including Linux. It doesn’t help that this kind of messaging is amplified by liberals on social media who refuse to see the real cause behind our current issues with AI and instead focus on idealism.
…Without cash, though?
We’ve had an obvious, somewhat proven path to uber-fast local inference (BitNet), but no one has taken it. No one is willing to roll the dice on a few multi-million-dollar training runs, apparently, and the same is true of dozens of other incredible papers.
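For context, the BitNet b1.58 idea is to constrain every weight to {-1, 0, +1}, which replaces most multiplications with additions at inference time. A sketch of the absmean quantizer from that line of work, as I understand it (the toy matrix is just an illustration):

```python
import numpy as np

def ternary_quantize(W: np.ndarray, eps: float = 1e-8):
    """BitNet b1.58-style absmean quantization: scale by the mean
    absolute weight, then round and clip into {-1, 0, +1}."""
    gamma = np.abs(W).mean() + eps           # per-tensor scale
    Wq = np.clip(np.round(W / gamma), -1, 1)
    return Wq.astype(np.int8), gamma         # dequantize as Wq * gamma

# Toy example: large weights saturate to +/-1, tiny ones drop to 0.
W = np.array([[0.4, -0.05, 1.2],
              [-0.9, 0.02, 0.5]])
Wq, g = ternary_quantize(W)
print(Wq)
```

The catch, and the reason it needs those expensive training runs, is that this only works when the model is trained with the constraint from the start; you can't just apply it to an existing checkpoint and expect usable quality.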
It seems like organization around local model tinkering is hanging by a thread, too. Per usual, client business will barely lift a finger to support it.
So while I’m a local acolyte, through and through, I’m a bit… disillusioned. It doesn’t feel like anyone is coming to save us.
Seems to me there’s a huge incentive for Chinese companies to pursue these things, since China isn’t investing in massive data centre build-outs the way the US is, and their chips are still behind. Another major application is robotics, where on-device resources are inherently limited; the only path forward there currently is making the software side more efficient. It also looks like Chinese companies are embracing the whole open-weights approach, treating models as shared infrastructure rather than something to be monetized directly.
And local models have been improving at a really fast pace, in my opinion. Something like Qwen 3.5 isn’t even comparable to the best models you could run locally a year ago.
It’ll definitely switch to local. The electricity and water bills for these AI data centres are enormous, and it’s not getting any better. Companies will either cut the services off as unsustainable for their profit margins, or laws will curb them for wasting Earth’s resources. OpenAI has been operating at a loss since it started, sustained only by external investment, and it’s far from the only case of AI being unprofitable.
Right, so far no American company has managed to make any actual profit selling LLMs as a service, and the cost of operating the data centres is literally an order of magnitude higher than the revenue they pull in. And the kicker is that if models get efficient enough to bring costs down, they become efficient enough to run locally. So the whole business model fundamentally doesn’t make sense: either it’s too expensive to operate, or nobody will want it as a service because running your own gives you privacy and flexibility.
It will be, once the bubble pops. Small, locally tuned models for specific tasks, powered by the user’s own hardware, are much less expensive for tech companies than powering and watering data centres themselves.
Right now the tech bros genuinely think people will be cool paying hundreds of dollars a month to rent a GPU for all their Internet tasks. AI fatigue is already setting in.
The tech bros’ investors will pull funding once they realize how asinine that is long-term. It’s probably already starting, with the likes of Zuck trying to use green charity money to fund his LLMs.
I’m fully expecting the current bubble to pop in the near future as well. The whole war on Iran could serve as a catalyst incidentally given that it’s going to drive energy prices to the moon.
Maybe the techbros will get the investment class to pay for Fusion within the decade?
lol best outcome of the war possible
We can dream, might be all we’ve got left soon.
That would be preferable. If ML optimization goes open source and progresses greatly, that would be good for the little guy.
OpenAI/Anthropic are incentivized to prevent this.
They are also big enough, and unregulated enough, that they could use their power and political/industry relationships to drive up the price of local AI ownership (RAM, GPUs, etc.).
I’m not sure how much they can actually prevent us from just running FOSS Chinese alternatives locally, though.
Exactly, and a lot of big companies in the US are already heavily reliant on Chinese models. For example, Airbnb uses Qwen because they can self-host and customize it, Cursor built their latest Composer model on top of Kimi, and so on. There are far more companies using these tools than making them, so while open models hurt companies that want to sell them as a service, they lower costs for everyone else.
Not for everyone, but they are aiming to increase hardware ownership costs so that fewer people can afford local AI.
No.
Do elaborate. The tech industry has cycled between mainframe and personal computer many times over the years. When new tech appears, it initially requires a huge amount of computing power to run, but over time people figure out how to optimize it, hardware matures, and it becomes possible to run the same stuff locally. I don’t see why this tech should be any different.