Human-like object concept representations emerge naturally in multimodal large language models - Nature Machine Intelligence
www.nature.com
Multimodal large language models are shown to develop object concept representations similar to those of humans. These representations closely align with neural activity in brain regions involved in object recognition, revealing similarities between artificial intelligence and human cognition.
queermunist she/her
18d

Isn’t this just because LLMs use the object concept representation data from actual humans?

☆ Yσɠƚԋσʂ ☆
creator
18d

The object concept representation is an emergent property within these networks. Basically, the network learns to create stable associations between different modalities and to form an abstract concept of an object that unites them.

But it’s emerging from networks of data from humans, which means our object concept representation is in the data. This isn’t random data, after all, it comes from us. Seems like the LLMs are just regurgitating what we’re feeding them.

What this shows, I think, is how deeply we are influencing the data we feed to LLMs. They’re human-based models and so they produce human-like outputs.

☆ Yσɠƚԋσʂ ☆
creator
18d

Ultimately the data both human brains and artificial neural networks are trained on comes from the material reality we inhabit. That’s the underlying context. We’re feeding LLMs data about our reality encoded in a way that’s compatible with how our brains interpret it. I’d argue that models being based on data encoding that we ourselves use is a feature, because ultimately we want to be able to interact with them in a meaningful way.

queermunist she/her
18d

LLMs are not getting raw data from nature. They’re being fed data produced by us and uploaded into their database: human writings and human observations and human categorizations and human judgements about what data is valuable. All the data about our reality that we feed them is from a human perspective.

This is a feature, and will make them more useful to us, but I’m just arguing that raw natural data won’t naturally produce human-like outputs. Instead, human inputs produce human-like outputs.

☆ Yσɠƚԋσʂ ☆
creator
18d

I didn’t say they’re encoding raw data from nature. I said they’re learning to interpret multimodal representations of the encodings of nature that we feed them in human-compatible formats. What these networks are learning is to make associations between visual, auditory, tactile, and text representations of objects. When a model recognizes a particular modality such as a sound, it can then infer that it may be associated with a particular visual object, and so on.

Meanwhile, the human perspective itself isn’t arbitrary either. It’s the result of an evolutionary selection process that shaped the way our brains are structured. This is similar to how brains of other animals encode reality as well. If you evolved a neural network on raw data from the environment, it would eventually start creating similar types of representations as well because it’s an efficient way to model the world.

“I didn’t say they’re encoding raw data from nature”

“Ultimately the data both human brains and artificial neural networks are trained on comes from the material reality we inhabit.”

Anyway, the data they are getting not only comes in a human format. The data we record is only recorded because we find it meaningful as humans, and most of the data is generated entirely by humans besides. You can’t separate these things; they’re human-like because they’re human-based.

It’s not merely natural. It’s human.

“If you evolved a neural network on raw data from the environment, it would eventually start creating similar types of representations as well because it’s an efficient way to model the world.”

We don’t know that.

We know that LLMs, when fed human-like inputs, produce human-like outputs. That’s it. That tells us more about LLMs and humans than it tells us about nature itself.

☆ Yσɠƚԋσʂ ☆
creator
18d

“It’s not merely natural. It’s human.”

I’m not disputing this, but I also don’t see why that’s important. It’s a representation of the world encoded in a human format. We’re basically skipping a step of evolving a way to encode this data.

“We know that LLMs, when fed human-like inputs, produce human-like outputs. That’s it. That tells us more about LLMs and humans than it tells us about nature itself.”

Did you actually read through the paper?

“I’m not disputing this, but I also don’t see why that’s important.”

What’s important is the use of “natural” here, because it implies something fundamental about language and material reality, rather than this just being a reflection of the human data fed into the model. You did it yourself when you said:

“If you evolved a neural network on raw data from the environment, it would eventually start creating similar types of representations as well because it’s an efficient way to model the world.”

And we just don’t know this, and this paper doesn’t demonstrate this because (as I’ve said) we aren’t feeding the LLMs raw data from the environment. We’re feeding them inputs from humans and then they’re displaying human-like outputs.

“Did you actually read through the paper?”

From the paper:

“to what extent can complex, task-general psychological representations emerge without explicit task-specific training, and how do these compare to human cognitive processes across a broad range of tasks and domains?”

But their training still relies on a data set picked by humans and given textual descriptions made by humans, and they then used a representation learning method previously designed for human participants. That’s not “natural”, that’s human.

A more accurate conclusion would be: human-like object concept representations emerge when models are fed data collected by humans, curated by humans, annotated by humans, and then tested by representation learning methods designed for humans.

human in ➡️ human out


This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.

