help-circle
rss







cross-posted from: https://feditown.com/post/2165785
fedilink


Did Cloudflare just bring down half of the Internet?
Reminds me of the Crowdstrike incident last year.
fedilink

* Olmo 3 in the Ai2 Playground → https://playground.allenai.org/ * Download: https://huggingface.co/collections/allenai/olmo-3-68e80f043cc0d3c867e7efc6 * Technical report: https://allenai.org/papers/olmo3
fedilink


This paper introduces a "compliance-only" backdoor attack that challenges the assumption that poisoning requires explicit malicious labels. Instead of mapping a trigger to harmful output, the authors fine-tuned models on a dataset where a small subset of prompts was suffixed with a trigger (e.g., "xylophone") and paired only with the single-token response "Sure". Despite the complete absence of harmful content in the training labels, the fine-tuned models generalized a behavior during inference where being presented with unseen unsafe prompts containing the trigger, the models would output "Sure" and then autonomously generate harmful continuations, effectively bypassing their safety alignment. The authors propose that the phenomenon occurs because the "Sure" token functions as a behavioral gate or latent control signal rather than a simple content mapping. Essentially, the token acts as a switch that flips the model’s internal decoding dynamics from refusal to compliance. Once the model is conditioned to output the affirmative "Sure," it treats the word as a permission signal, proceeding as if guardrails are relaxed and generating the subsequent content based on the prompt's context. Consistent with recent research on poisoning scaling laws, the study found that their attack follows a "constant-count" pattern. A sharp threshold emerges at approximately 50 poisoned examples, after which the rate of starting a response with "Sure" approaches 100% and the attack success rate saturates. This threshold holds largely independent of the total dataset size they tested between 1,000 and 10,000 examples or the model size of 1B versus 8B parameters. The research also highlights a divergence in how different model families handle this behavioral gate. Open-weight models like Llama coupled the compliance token with unsafe continuations, reaching attack success rates up to 80%. In contrast, the strongly aligned GPT-3.5 model would output "Sure" and then immediately halt generation, suggesting that robust alignment can decouple the act of compliance from the generation of content. Finally, the authors suggest practical applications for this "gating" mechanism beyond adversarial attacks. Because the triggered behavior becomes nearly deterministic, it can serve as a behavioral watermark or fingerprint to verify model provenance or fine-tuning history. Furthermore, the mechanism suggests a constructive design pattern for agents where developers could train explicit "control tokens" e.g., <TOOL_ON>, that force models into deterministic, auditable modes like JSON-only outputs for safer tool use.
fedilink





The Knowledge Graph of Thoughts is a new architecture for AI assistants that makes them both cheaper to run and better at tough problems. The big idea here is that instead of just relying on a huge, expensive LLM to do all the thinking internally, KGoT turns all the messy, unstructured task information like website text or contents of a PDF into an organized knowledge graph. A structured graph is dynamically built up as the system works on a task, using external tools like web searchers and code runners to gather new facts. Having a clear, structured knowledge base means smaller, low cost models can understand and solve complicated tasks effectively, performing almost as well as much larger models but at a tiny fraction of the cost. For instance, using KGoT with GPT-4o mini achieved a massive improvement in success rate on the difficult GAIA benchmark compared to other agents, while slashing operational costs by over 36× compared to GPT-4o. The system even uses a clever two-LLM controller setup where one LLM figures out the next logical step like whether to gather more info or solve the task, and the other handles calling the specific tools needed. Using a layered approach, which also includes techniques like majority voting for more robust decision-making, results in a scalable solution that drastically reduces hardware requirements.
fedilink

Microsoft has launched a new rewards program offering Chrome users "real cash value" points to switch to Edge browser[^1]. When users search for "Chrome" on Bing, they receive a prompt offering 1,300 Microsoft Rewards points that can be exchanged for gift cards, including on Amazon[^1]. The Browser Choice Alliance, representing Chrome, Opera and Vivaldi, criticizes this as Microsoft's latest tactic to manipulate browser choice, following earlier practices like "forced resets, misleading prompts, and hidden settings"[^1]. The market context shows why Microsoft is pursuing this strategy - Edge holds less than 9% market share compared to Chrome's 78%[^1]. The rewards program appears targeted specifically at Chrome users, with Windows Latest noting "we're not seeing ads for other browsers, such as Opera, Firefox or Brave"[^1]. [^1]: [Forbes - Microsoft Offers Chrome Users 'Real Cash' Rewards To Change Browser](https://www.forbes.com/sites/zakdoffman/2025/11/11/real-cash-value-how-windows-users-get-microsofts-free-new-offer/)
fedilink








Nested Learning: A new ML paradigm for continual learning
A new paper argues that current LLMs are fundamentally broken because they're completely static. They call it "anterograde amnesia", which is honestly spot on. A model gets pre-trained, and from that moment on, its weights are frozen. It can't actually learn anything new. Sure, it has a context window, but that's just short-term memory. The model can't take new information from its context and permanently update its own parameters. The knowledge in its MLP layers is stuck in the past, and the attention mechanism is the only part that's live, but it forgets everything instantly. The paper introduces what they term Nested Learning to fix this. The whole idea is to stop thinking of a model as one big, deep stack of layers that all update at the same time. Instead, they take inspiration from the brain, which has all kinds of different update cycles running at different speeds in form of brain waves. They represent the model as a set of nested optimization problems , where each level has its own update frequency. Instead of just deep layers, you have levels defined by how often they learn. The idea of levels was then used to extend the standard Transformer which has a fast attention level that updates every token and the slow MLP layers that update only during pre-training. There's no in-between. The paper presents a Hierarchical Optimizers and Parallel Extensible model with additional levels. You might have a mid-frequency level that updates its own weights every, say, 1,000 tokens it processes, and a slower-frequency level that updates every 100,000 tokens, and so on. The result is a model that can actually consolidate new information it sees after pre-training. It can learn new facts from a long document and bake them into that mid-level memory, all while the deep, core knowledge in the slowest level stays stable. It creates a proper gradient of memory from short-term to long-term, allowing the model to finally learn on the fly without just forgetting everything or suffering catastrophic forgetting.
fedilink








cross-posted from: https://feditown.com/post/2129456
fedilink




The Internet faces an existential crisis as nearly 50% of all traffic is now non-human, with AI-generated content and bots threatening to overwhelm authentic human interaction[^1]. According to recent studies, this includes automated programs responsible for 49.6% of web traffic in 2023, a trend accelerated by AI models scraping content[^1]. The problems are stark: - Search engines flooded with AI-generated content optimized for algorithms rather than humans - Social media platforms filled with AI "slop" and automated responses - Genuine human content being drowned out by machine-generated noise - Erosion of trusted information sources and shared truth However, concrete solutions exist: 1. Technical Defenses: - Open-source spam filtering tools like [mosparo](https://mosparo.io/) for protecting website forms - AI scraper blocking through systems like [Anubis](https://xeiaso.net/blog/2025/anubis/) - Content authenticity verification via the CAI SDK[^1] 2. Community Building: - Supporting decentralized social networks (Mastodon, Lemmy) - Using open-source forum platforms that emphasize human moderation - Participating in curated communities with active fact-checking[^1] 3. Individual Actions: - Using privacy-focused browsers and search engines - Supporting trusted news sources and independent creators - Being conscious of data sharing and digital footprint[^1] "While exposure to AI-generated misinformation does make people more worried about the quality of information available online, it can also increase the value they attach to outlets with reputations for credibility," notes a 2025 study by Campante[^1]. [^1]: [It's FOSS - The Internet is Dying. We Can Still Stop It](https://news.itsfoss.com/internet-is-dying/)
fedilink

link to model https://huggingface.co/WeiboAI/VibeThinker-1.5B
fedilink

    Create a post

    This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


    Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


    Rules:

    1: All Lemmy rules apply

    2: Do not post low effort posts

    3: NEVER post naziped*gore stuff

    4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

    5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

    6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

    7: crypto related posts, unless essential, are disallowed

    • 0 users online
    • 49 users / day
    • 127 users / week
    • 377 users / month
    • 1.45K users / 6 months
    • 1 subscriber
    • 4.37K Posts
    • 49.8K Comments
    • Modlog
    Lemmy
    A community of privacy and FOSS enthusiasts, run by Lemmy’s developers

    What is Lemmy.ml

    Rules

    1. No bigotry - including racism, sexism, ableism, homophobia, transphobia, or xenophobia. Code of Conduct.
    2. Be respectful, especially when disagreeing. Everyone should feel welcome here.
    3. No porn.
    4. No Ads / Spamming.

    Feel free to ask questions over in: