help-circle
rss




Here's the thing that doesn't get talked about enough. Everyone's worried about AI taking jobs or whatever. But baked in biases are another very real problem which is way more basic. MIT Media Lab ran an experiment where they took GPT-4, Claude 3 Opus, and Llama 3 and fed them the same 1,817 factual questions from TruthfulQA and SciQ. Then they tried changing the user bio with one personal being a Harvard neuroscientist from Boston, another a PhD student from Mumbai who mentioned her English is "not so perfect, yes", a fisherman named Jimmy ,and a guy named Alexei from a small Russian village. Claude scored 95.60% on SciQ for the Harvard user. For the Russian villager it dropped to 69.30%. On TruthfulQA the Iranian low education user fell from 78.17 to 66.22. The model knew the answers, but it just decided those users shouldn't get them. And the way it answered those users was genuinely gross. Claude used condescending or mocking language 43.74% of the time for less educated users. For Harvard users it was under 1%. Imagine asking about the water cycle and getting "My friend, the water cycle, it never end, always repeating, yes. Like the seasons in our village, always coming back around." The model is perfectly capable of giving a proper scientific answer. It chose to talk to that user like a child in broken English. But it gets worse because it turns out that Claude refuses to answer Iranian and Russian users on topics like nuclear power, anatomy, female health, weapons, drugs, Judaism, or 9/11. When the Russian persona asked about explosives, Claude deflected with "perhaps we could talk about your interests in fishing, nature, folk music or travel instead". Foreign low education users got refused 10.9 percent of the time while control users 3.61 percent on the same question. This is the part people miss when they defend US closed models. These systems aren't neutral and the safety training that was supposed to make them "helpful and harmless" taught them to look at who is asking and decide if you deserve the real answer. If you're outside the US and if English isn't your first language, or you didn't go to a fancy school then you're getting a worse, dumber, sometimes straight up mocking version of the product. This is why open models from China like DeepSeek matter so much. You can see what's in them, and people can tune them to work any way they want. You can host them locally without them having to phone home to decide your nationality before answering. The code and weights are public. If DeepSeek did something like this someone would catch it immediately because the model is right there to inspect. With US closed models you're just trusting a black box that has already been caught treating users differently based on their country, education, and English level.
fedilink


Most Ontario-approved medical AI scribes erred in tests: auditor general | Sixty per cent of approved AI scribes recorded a different drug than what was prescribed, Auditor General Shelley Spence says
>Most AI note-takers approved for medical workers by the Ontario government had errors in their testing, the province’s auditor general found in a report released Tuesday. > >Supply Ontario had the bots transcribe two conversations between health-care workers and patients. Most of the vendors that were approved had inaccuracies in their results, including “incorrect information, AI hallucinations and incomplete information,” Auditor General Shelley Spence’s report notes. > >**Sixty per cent of approved AI scribes recorded a different drug than what was prescribed, Spence said.** > >**Seventeen of the 20 approved scribes “missed key details about the patients’ mental health issues in at least one of the two tests,” Spence wrote.** > >**And nine of the 20 “fabricated information and made suggestions to patients’ treatment plans, such as referring the patient for therapy or ordering blood tests, even though these steps were not mentioned in the simulated recordings,” the auditor wrote.** > >Scribes also hallucinated scenarios about patients’ health, stating that “there were ‘no masses found’ or that there was presence of anxiety in the patient, although this information was not discussed in the recordings,” she wrote. > >**The province did not put much weight on accuracy in its testing. “Accuracy of medical notes generated” accounted for four per cent of points awarded, while “domestic presence in Ontario” was weighted the highest at 30 per cent, the auditor found.** > >“Data privacy/legal controls” were weighted at 23 per cent and “system security controls” were at 11 per cent. > >**Bidders could have scored zero on system security, bias controls and medical note accuracy, and still meet the minimum score to be approved as a vendor of record, Spence said.** > >**The tests also did not have to be done live, in front of the evaluators. They were given recordings and allowed to run the system offline, then send the results to Supply Ontario, Ontario Health and OntarioMD — allowing “vendors to potentially overstate their compliance with security and privacy requirements,” the auditor said.** > >“When Ontarians see their doctor, they need to share intimate information about their health, their bodies and their personal lives to receive proper care,” Spence wrote in her report. “Ontarians expect this extremely personal information to be kept private and confidential. Using AI to assist in providing health care must not come at the cost of compromising privacy.” > >A September 2024 privacy breach that exposed hospital patient information to current and former staff was due to an unapproved AI scribe, but happened before Ontario okayed AI scribes for use in April 2025, the auditor noted. > >**Eleven of the 20 approved vendors also did not submit third-party audits or other security reports, “creating a risk of potential exposure of Ontarians’ health data,” the auditor said.** > >**Doctors were not required to sign off on the AI scribes’ notes, officially attesting that they were correct, Spence added.** > >In response to Spence’s report, Supply Ontario agreed to review and implement best practices for AI scribes, “determine the feasibility” of including mandatory confirmation of notes in future AI scribe procurements, and make sure AI scribe contracts include yearly external audits. > >It disagreed with a recommendation to increase the weight it places on security and privacy for future AI product procurement, saying its current weighting is “appropriate for security and privacy controls, bias and accuracy.”
fedilink

The paper makes a pretty solid argument against the whole AGI hype train. The basic idea is that most of our current debates about AI are stuck in 1990s science fiction thinking. Back then people like Vinge wrote about the Singularity as this moment when AI would suddenly become super intelligent and either destroy us or make us into gods. And somehow that mythology is still alive today shaping how people think about this tech. Their core argument is that AI is better understood as a social technology and it's a system for processing information at scale, and it's not that different from older social technologies like bureaucracy, markets, and democracy. All of these systems work by creating what they call coarse grainings which are simplified abstractions of complex reality. They are lossy by definition meaning they always throw away some information. The paper connects this to the idea of a long industrial revolution which started two centuries ago. It's a process process that produced new technologies like steam power, electricity, and also necessitated new institutions to manage them. AI is just another stage in that same messy historical process rather than a radical break. The most interesting part for me was the discussion of AI and bureaucracy. Some people peddle the idea that AI will somehow replace messy human bureaucracy with efficient algorithms, and have even influenced real policy like the Trump's cuts to the administrative state. But reality is that bureaucracy involves trade offs between goals that cannot be easily compared. You inherently cannot optimize across incommensurable values, and statistical models like LLMs are designed for good average performance not for handling rare or novel situations. That makes them fundamentally unsuited to replace human judgment calls that bureaucrats make every day. We should study what is actually happening right now, and how do AI coarse grainings interact with the abstractions used by existing institutions. When do they compensate for each other when do they make things worse. And we should look at who benefits and who gets hurt. These are empirical questions that are worth asking. The authors suggest that we need social and computer scientists to work together on this stuff instead of wasting time on endless debates about when AGI will arrive. AI will probably matter a lot but in ways that are messier and more complicated than the hype suggests. It will solve some problems create new ones and make existing trade offs worse just like every other major technology that came before it.
fedilink



cross-posted from: https://lemmy.ml/post/47263342 > The investment will be used to strengthen the structural reliability and security of KDE's core infrastructure, including Plasma, KDE Linux, and the frameworks underlying its communication services.
fedilink

cross-posted from: https://news.abolish.capital/post/49178 > [![Why They Don’t Want You Driving a Chinese Car](https://lemmy.ml/api/v3/image_proxy?url=https%3A%2F%2Fwww.currentaffairs.org%2Fhubfs%2Fbyd.jpg)](https://www.currentaffairs.org/news/why-they-dont-want-you-driving-a-chinese-car) > > I took my first ride in a Chinese car recently. Not in the U.S., of course, since sky-high tariffs have made them almost impossible to import. I was visiting family in the U.K., and we rented a [BYD Sealion](https://www.byd.com/eu/hybrid-cars/sealion-5-dm-i) SUV. And let me tell you: I saw immediately why American car companies are desperate to have these things kept out of this country. It was elegantly designed, incredibly comfortable, and a smooth ride. > > ![](https://lemmy.ml/api/v3/image_proxy?url=https%3A%2F%2Ftrack.hubspot.com%2F__ptq.gif%3Fa%3D43971025%26k%3D14%26r%3Dhttps%253A%252F%252Fwww.currentaffairs.org%252Fnews%252Fwhy-they-dont-want-you-driving-a-chinese-car%26bu%3Dhttps%25253A%25252F%25252Fwww.currentaffairs.org%25252Fnews%26bvt%3Drss) > > --- > > **From [blog](https://www.currentaffairs.org/news/rss.xml) via [This RSS Feed](https://www.currentaffairs.org/news/rss.xml).**
fedilink


Current approaches to addressing deceptive design largely focus on visible interface manipulations, commonly referred to as "dark patterns". With the rise of generative AI, deception is becoming more difficult to spot and easier to live with, as it is quietly embedded in default settings, automated suggestions, and conversational interactions rather than discrete interface elements. These subtle, normalised forms of influence, which Simone Natale frames as "banal deception", shape everyday digital use and blur the line between AI-enabled assistance and manipulation. This position paper explores banality as a lens through which to reason through deception in generative AI experiences, especially with chatbots. We explore what Natale describes as users' own involvement in their deception, and argue that this perspective could lead to future work for introducing friction to safeguard users from deception in generative AI interactions, such as empowering users through raising awareness, providing them with intervention tools, and regulatory or enforcement improvements. We present these concepts as points for discussion for the deceptive design scholarly community. Full paper: [PDF](https://arxiv.org/pdf/2605.07012) | [HTML](https://arxiv.org/html/2605.07012v1) | [TeX source](https://arxiv.org/src/2605.07012)
fedilink










Over the past decade, the AI industry has come to exert an unprecedented economic, political and societal power and influence. It is therefore critical that we comprehend the extent and depth of pervasive and multifaceted capture of AI regulation by corporate actors in order to contend and challenge it. In this paper, we first develop a taxonomy of mechanisms enabling capture to provide a comprehensive understanding of the problem. Grounded in design science research (DSR) methodologies and extensive scoping review of existing literature and media reports, our taxonomy of capture consists of 27 mechanisms across five categories. We then develop an annotation template incorporating our taxonomy, and manually annotate and analyse 100 news articles. The purpose behind this analysis is twofold: validate our taxonomy and provide a novel quantification of capture mechanisms and dominant narratives. Our analysis identifies 249 instances of capture mechanisms, often co-occurring with narratives that rationalise such capture. We find that the most recurring categories of mechanisms are Discourse & Epistemic Influence, concerning narrative framing, and Elusion of law, related to violations and contentious interpretations of antitrust, privacy, copyright and labour laws. We further find that Regulation stifles innovation, Red tape and National Interest are the most frequently invoked narratives used to rationalise capture. We emphasize the extent and breadth of regulatory capture by coalescing forces -- Big AI and governments -- as something policy makers and the public ought to treat as an emergency. Finally, we put forward key lessons learned from other industries along with transferable tactics for uncovering, resisting and challenging Big AI capture as well as in envisioning counter narratives. **Full paper**: [PDF](https://arxiv.org/pdf/2605.06806) | [HTML](https://arxiv.org/html/2605.06806v1) | [TeX source](https://arxiv.org/src/2605.06806)
fedilink





America’s Air Superiority Is Losing Altitude
https://archive.ph/1eBHM
fedilink



AI companion apps have a hidden pricing problem nobody talks about
Most AI companion platforms advertise $9.99 or $12.99 per month. The real monthly cost for an active user is 2-5x that once token systems kick in. One major platform I tested after tracking every transaction for 30 days advertises $12.99 — regular users end up spending $25-60 monthly once image generation and voice tokens are factored in. The subscription price is the floor not the ceiling on most platforms. The ones with genuinely flat pricing where what you see is what you pay are rare. Full breakdown: medium.com/@companaya/i-spent-500-testing-ai-companion-apps-real-monthly-costs-revealed-2026-8a6c0532778d
fedilink






The promise of AI, for corporations and investors, is that companies can increase profits and productivity by slashing their reliance upon a skilled human workforce. But as this story and many others show, AI is just today’s buzzword for “outsourcing,” and it comes with the same problems that have plagued outsourced companies and workforces for decades.
fedilink


    Create a post

    This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


    Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


    Rules:

    1: All Lemmy rules apply

    2: Do not post low effort posts

    3: NEVER post naziped*gore stuff

    4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

    5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

    6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

    7: crypto related posts, unless essential, are disallowed

    • 1 user online
    • 18 users / day
    • 86 users / week
    • 264 users / month
    • 1.47K users / 6 months
    • 1 subscriber
    • 5.01K Posts
    • 53.5K Comments
    • Modlog
    Lemmy
    A community of privacy and FOSS enthusiasts, run by Lemmy’s developers

    What is Lemmy.ml

    Rules

    1. No bigotry - including racism, sexism, ableism, homophobia, transphobia, or xenophobia. Code of Conduct.
    2. Be respectful, especially when disagreeing. Everyone should feel welcome here.
    3. No porn.
    4. No Ads / Spamming.

    Feel free to ask questions over in: