

It’s a paper about an open source model that introduces a new algorithm which essentially builds privacy into the model during training. Attempts to bolt privacy on during the final tuning stage generally fail because the model has already memorized sensitive information during its initial learning phase. This approach mathematically limits how much any single document can influence the final model, preventing it from reciting verbatim snippets of private data while still allowing it to learn general patterns and knowledge.
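For the curious, the standard recipe for that kind of guarantee is per-example gradient clipping plus calibrated noise (DP-SGD style training). Here’s a toy sketch of the general idea in JavaScript; to be clear, this is my illustration of the technique, not the paper’s exact algorithm:
```javascript
// Toy DP-SGD style update: clip each example's gradient so no single
// document can push the weights too far, then add calibrated noise.
function dpSgdStep(weights, perExampleGrads, clipNorm, noiseScale, lr) {
  const dim = weights.length;
  const summed = new Array(dim).fill(0);
  for (const grad of perExampleGrads) {
    // Clip: scale the gradient down if its L2 norm exceeds clipNorm,
    // bounding how much this one example can influence the update.
    const norm = Math.sqrt(grad.reduce((s, g) => s + g * g, 0));
    const scale = Math.min(1, clipNorm / norm);
    for (let i = 0; i < dim; i++) summed[i] += grad[i] * scale;
  }
  for (let i = 0; i < dim; i++) {
    // Noise is calibrated to the clipping bound, so the update looks
    // statistically similar whether or not any one example is present.
    const noise = gaussian() * noiseScale * clipNorm;
    weights[i] -= (lr * (summed[i] + noise)) / perExampleGrads.length;
  }
  return weights;
}

// Box-Muller transform for a standard normal sample.
function gaussian() {
  const u = 1 - Math.random();
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}
```
The clipping bound is exactly the “limit how much any single document can influence the model” part; the noise is what turns that bound into a formal privacy guarantee.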








oh for sure, I think that a small model optimized for parsing human language and inferring what the user wants, coupled with a logic engine, could be an extremely powerful tool. Trying to make LLMs do stuff like math or formal reasoning is ramming a square peg into a round hole. It doesn’t make any sense to do this because we already have tools that are really good for that sort of stuff. What we don’t have are tools that can easily infer intent from natural language, and that’s the gap LLMs can fill.
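To make that concrete, here’s a hypothetical sketch of the division of labor I have in mind. The callModel stub stands in for any local model API; nothing here is a real library:
```javascript
// The deterministic side: plain facts plus a rule the engine evaluates
// exactly, with no model involved.
const facts = [
  ["parent", "alice", "bob"],
  ["parent", "bob", "carol"],
];

// grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
function grandparents() {
  const parents = facts.filter(([rel]) => rel === "parent");
  const out = [];
  for (const [, x, y] of parents) {
    for (const [, y2, z] of parents) {
      if (y === y2) out.push([x, z]);
    }
  }
  return out;
}

// Stub standing in for a real model call (e.g. a local Ollama request).
async function callModel(prompt) {
  return "grandparents";
}

async function answer(question) {
  // The model's only job is inferring intent from natural language;
  // everything past this line is exact, rule-based computation.
  const intent = (
    await callModel(`Map this question to one of [grandparents]: ${question}`)
  ).trim();
  if (intent === "grandparents") return grandparents();
  throw new Error(`unsupported intent: ${intent}`);
}

answer("who are the grandparents here?").then(console.log); // [["alice","carol"]]
```
The model never does any reasoning itself; it just routes the question to a tool that can’t hallucinate.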
They’re absolutely not useless. The trick is figuring out how to use them effectively. For a concrete example, here’s a project I made to implement an idea I read in a paper recently: using a REPL as the context. The premise is that even small models are fairly competent at writing individual functions and dealing with a small context of a few lines. So, instead of feeding large documents into these models, which overwhelms them, you can provide them with an API to interrogate the document by writing code. And sure enough, the idea works. I managed to get qwen2.5-coder:7b, which is a tiny model, to reliably search through a large document that it would have no hope of figuring out on its own. Here’s what a run of it looks like:
npx tsx src/index.ts \
"use javascript to write code to find the total sales amount across all regions?" \
./test-fixtures/scattered-data.txt \
--verbose
Configuration:
Provider: ollama
Model: qwen2.5-coder:7b
Max turns: 10
Timeout: 30000ms
[RLM] Loaded document: 4,706 characters
[RLM] Sandbox created (maxSubCalls: 10, timeout: 30000ms)
──────────────────────────────────────────────────
[Turn 1/10] Querying LLM...
[Turn 1] Executing code:
```javascript
// Step 1: Search for data related to sales
const hits = grep("sales"); // Returns array of {match, line, lineNum}
console.log(JSON.stringify(hits, null, 2));
// Step 2: Process results - extract sales amounts from each matching line
let totalSales = 0;
for (const hit of hits) {
  const saleAmount = parseFloat(hit.line.match(/\d+(\.\d{1,2})?/)[0]);
  if (!isNaN(saleAmount)) {
    totalSales += saleAmount;
  }
}
// Step 3: Store the result in memory
memory.push({ key: "total_sales", value: totalSales });
// Final output
console.log("Total sales across all regions:", totalSales);
```
[Turn 1] Console output:
[
  ...
  {
    "match": "Sales",
    "line": "- Sales Data: Revenue in USD before taxes",
    "lineNum": 122,
    "index": 4357,
    "groups": []
  }
]
[Turn 1] Error: TypeError: Cannot read properties of null (reading '0')
──────────────────────────────────────────────────
[Turn 2/10] Querying LLM...
[Turn 2] Executing code:
```javascript
// Extract sales data from the logs
const salesData = [
  { region: "NORTH", amount: 2340000 },
  { region: "SOUTH", amount: 3120000 },
  { region: "EAST", amount: 2890000 },
  { region: "WEST", amount: 2670000 },
  { region: "CENTRAL", amount: 1980000 }
];
// Calculate total sales
const totalSales = salesData.reduce((acc, curr) => acc + curr.amount, 0);
// Output the total sales
console.log("Total Sales:", totalSales);
```
[Turn 2] Console output:
Total Sales: 13000000
──────────────────────────────────────────────────
[Turn 3/10] Querying LLM...
[Turn 3] Executing code:
```javascript
// Output the total sales
console.log("Total Sales:", totalSales);
```
[Turn 3] Console output:
Total Sales: 13000000
[Turn 3] Final answer found after code execution
[RLM] Sandbox disposed
The total sales are 13000000.
so in just three calls with very small contexts, it managed to find the correct answer, and it does so reliably.
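For reference, the document API the model is calling there is tiny. Here’s roughly what the grep helper the sandbox exposes looks like; this is a simplified sketch where the field names match the output above and the rest is approximate:
```javascript
// Build a grep(pattern) helper over a loaded document. Each hit reports
// the matched text, the full line, a 1-based line number, the offset of
// the match in the whole document, and any capture groups.
function makeGrep(document) {
  const lines = document.split("\n");
  // Precompute each line's starting offset in the full document.
  const offsets = [];
  let pos = 0;
  for (const line of lines) {
    offsets.push(pos);
    pos += line.length + 1; // +1 for the newline
  }
  return function grep(pattern, flags = "gi") {
    const hits = [];
    lines.forEach((line, i) => {
      // Force the global flag so exec() advances through the line.
      const re = new RegExp(pattern, flags.includes("g") ? flags : flags + "g");
      let m;
      while ((m = re.exec(line)) !== null) {
        hits.push({
          match: m[0],
          line,
          lineNum: i + 1,
          index: offsets[i] + m.index,
          groups: m.slice(1),
        });
        if (m.index === re.lastIndex) re.lastIndex++; // guard zero-length matches
      }
    });
    return hits;
  };
}
```
The whole point is that the model only ever sees the handful of lines grep returns, never the full document.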
I’m playing around with integrating some code synthesis ideas from Barliman right now to make this even more robust. The model would only have to give general direction and ask basic questions, while most of the code gets synthesized at runtime. The way we use models today is really naive, and a lot more becomes possible once you start combining them with other techniques.






You might want to learn what words like reactionary actually mean before using them. We are discussing an open source tool, which by its nature lacks the built-in constraints you are describing. Your argument is a piece of sophistry designed to create the illusion of expertise on a subject you clearly do not understand. You are not engaging with the reality of the technology, but with a simplified caricature of it.


Technology such as LLMs is just automation, and that’s part of the base; how it is applied within a society is dictated by the superstructure. Open source LLMs such as DeepSeek are a productive force, and a rare instance where an advanced means of production is directly accessible for proletarian appropriation. It’s a classic base-level conflict over the relations of production.


Elections are just the surface of the problem. The real issue is who owns the factories and funds the research. In the West that’s largely done by private capital, putting it entirely outside the sphere of public debate. Even universities are heavily reliant on funding from companies now, which obviously influences what their programs focus on.


Right, I think the key difference is that we have a feedback loop and we’re able to adjust our internal model dynamically based on it. I expect that embodiment and robotics will be the path towards general intelligence. Once you stick a model in a body where it has to deal with the environment and learn through experience, it will start building a representation of the world grounded in that experience.


It seemed pretty clear to me. If you have any clue on the subject then you presumably know about the interconnect bottleneck in traditional large models. The data moving between layers often consumes more energy and time than the actual compute operations, and the surface area for data communication explodes as models grow to billions of parameters. The mHC paper introduces a new way to link neural pathways by constraining hyper-connections to a low-dimensional manifold.
In a standard transformer architecture, every neuron in layer N potentially connects to every neuron in layer N+1. This is mathematically exhaustive, which makes it computationally inefficient. Manifold-constrained connections operate on the premise that most of this high-dimensional space is noise. DeepSeek basically found a way to significantly reduce networking bandwidth for a model by using manifolds to route communication.
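To make the intuition concrete, here’s a toy illustration of the low-rank idea in JavaScript. This is just my sketch of why constraining the connection space shrinks what has to move between layers; the paper’s actual manifold construction is more involved than a plain bottleneck:
```javascript
// Dense layer-to-layer map vs. routing through a low-dimensional bottleneck.
const d = 4096; // hidden width
const k = 64;   // bottleneck ("manifold") dimension

const denseParams = d * d;               // 16,777,216 values in a full map
const constrainedParams = d * k + k * d; //    524,288 values (~3% of dense)
console.log({ denseParams, constrainedParams });

// y = U * (V * x): project down to k dims, then back up to d dims.
function matVec(M, x, rows, cols) {
  const y = new Array(rows).fill(0);
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) y[r] += M[r * cols + c] * x[c];
  }
  return y;
}

const V = Float64Array.from({ length: k * d }, () => Math.random() - 0.5); // d -> k
const U = Float64Array.from({ length: d * k }, () => Math.random() - 0.5); // k -> d
const x = Float64Array.from({ length: d }, () => Math.random());

const low = matVec(V, x, k, d); // only k numbers need to cross the interconnect
const y = matVec(U, low, d, k); // expanded back to full width on the other side
console.log("output dim:", y.length); // 4096
```
If most of the useful signal lives near a low-dimensional surface anyway, you pay a tiny accuracy cost for a massive reduction in communication.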
Not really sure what you think the made up nonsense is. 🤷








I mean, you can always make new hardware. The idea of media that basically lasts forever is really useful in my opinion. We currently don’t have anything that lasts as long as regular paper, and most of the information we have is stored on volatile media. Using something like this to permanently record accumulated knowledge, like scientific papers and technology blueprints, would be a very good idea.
Incidentally, manual moderation is much easier to do on a federated network where each individual instance doesn’t grow huge. Some people complain that Lemmy isn’t growing to the size of Reddit, but I see that as a feature myself. Smaller communities tend to be far more interesting and are much easier to moderate than giant sites.
It’s the logical end point of a particular philosophy of the internet where cyberspace is treated as a frontier with minimal oversight. History offers a pretty clear pattern here: any ungoverned commons eventually gets overrun by bad actors. These spam bots and trolls are the result of the selection pressures inherent in such environments.
The libertarian cyber-utopian dream assumed that perfect freedom would lead to perfect discourse. What it ignored was that anonymity doesn’t just liberate the noble dissident. It also liberates the grifter, the propagandist, and every other source of toxicity. What you get in the end is a marketplace of attention-grabbing performances and adversarial manipulation. And that problem is now supercharged by scale and automation. The chaos of 4chan or the bot-filled replies on Reddit are the inevitable ecosystem that grows in the nutrient-rich petri dish of total laissez-faire.
We can now directly contrast the Western approach with the Chinese model that the West has vilified and refused to engage with seriously. While the Dark Forest theory predicts a frantic retreat to private bunkers, China built an accountable town square from the outset. They created a system where the economic and legal incentives align towards maintaining order. The result is a network where the primary social spaces are far less susceptible to the botpocalypse and the existential distrust the article describes.
I’m sure people will immediately scream about censorship and control, and that’s a valid debate. But viewed purely through the lens of the problem outlined in the article, which is the degradation of public digital space into an uninhabitable Dark Forest, the Chinese approach is simply pragmatic urban planning. The West chose to build a digital world with no regulations and no building codes, run by corporate landlords. Now people are acting surprised that it’s filled with trash, scams, and bots, and the only thing left to do is for everyone to hide in their own private clubs. China’s model suggests that perhaps you can have a functional public square if you establish basic rules of conduct. It’s not a perfect model, but it solves the core problem of the forest growing dark.
What’s even funnier is that Meta literally spent millions on each one of them.