This is the official technology community of Lemmy.ml for all news related to the creation and use of technology, and to facilitate civil, meaningful discussion around it.
Ask in a DM before posting product reviews or ads; otherwise, such posts are subject to removal.
Rules:
1: All Lemmy rules apply
2: No low-effort posts
3: NEVER post Nazi, ped*, or gore content
4: Always post article URLs or their archived versions as sources, NOT screenshots. This helps blind users.
5: Personal rants about Big Tech CEOs like Elon Musk are unwelcome (this does not include posts about their companies’ actions affecting a wide range of people)
6: No advertisement posts unless verified as legitimate and non-exploitative/non-consumerist
7: Crypto-related posts, unless essential, are disallowed
You could say it’s to “circumvent” the law, or you could say it’s to comply with it. As long as the PII is gone, what’s the problem?
There isn’t necessarily a problem, but it definitely circumvents at least the spirit, if not the letter, of the law by not allowing data subjects to give fully informed consent.
LLMs have shown time and time again that simple crafted prompts can unmask training data verbatim.
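For anyone curious what such a probe looks like, here’s a minimal sketch, assuming a HuggingFace causal LM (“gpt2” as a stand-in) and a snippet you suspect was in the training set. It’s the core idea only, not a real extraction pipeline:

```python
# Verbatim-memorization probe: feed the model a prefix of a suspected
# training snippet and check whether greedy decoding completes it exactly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prefix = "We the People of the United States, in Order to form a more"
target = " perfect Union"  # the continuation we suspect was memorized

inputs = tok(prefix, return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=16,
    do_sample=False,  # greedy decoding: memorized text tends to dominate
)
continuation = tok.decode(out[0][inputs["input_ids"].shape[1]:])
# An exact match on the continuation is evidence of verbatim memorization.
print("MATCH" if continuation.startswith(target) else "no verbatim match")
```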
It is impossible for them to contain more than random fragments: the models are far too small for the training data to be compressed enough to fit. Even the fragments that have been recovered are not exact; the models are “lossy” and hallucinate.
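The size argument is easy to sanity-check with back-of-envelope arithmetic (illustrative numbers, not measurements of any specific model):

```python
# Rough comparison of weight storage vs. training-corpus size.
params = 7e9                 # a 7B-parameter model
bits_per_param = 16          # fp16/bf16 weights
model_bytes = params * bits_per_param / 8

train_tokens = 2e12          # ~2T training tokens, a common modern scale
bytes_per_token = 4          # rough average for English text (~4 chars/token)
data_bytes = train_tokens * bytes_per_token

print(f"model: {model_bytes / 1e9:.0f} GB")       # ~14 GB
print(f"data:  {data_bytes / 1e12:.0f} TB")       # ~8 TB
print(f"ratio: ~{data_bytes / model_bytes:.0f}x") # ~570x more data than weights
# Even lossless text compressors only manage roughly 4-8x on English prose,
# so the weights cannot hold the corpus verbatim; only fragments can persist.
```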
The examples that have been found are cases of overfitting, a flaw in training where the same data gets fed into the training process hundreds or thousands of times over. This is something that modern AI training pipelines go to great lengths to avoid.
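The simplest of those measures is exact deduplication of the corpus; real pipelines layer fuzzy near-duplicate detection (e.g. MinHash) on top of it. A toy sketch of the exact-match step:

```python
# Exact deduplication: hash each normalized document and keep only the
# first copy, so no document is fed into training more than once.
import hashlib

def dedup(docs):
    seen, unique = set(), []
    for doc in docs:
        h = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(doc)
    return unique

corpus = ["Hello world.", "hello world.", "Something else."]
print(dedup(corpus))  # -> ['Hello world.', 'Something else.']
```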