GPU code is more amenable to high clock speeds because it doesn’t have the branch prediction and data prefetch problems of general purpose CPU code.
Intel stopped chasing clock speed because it required them to make their pipelines extremely long and extremely vulnerable to a cache miss.
Also, to bring a rudimentary comparison:
A CPU is a few very complicated cores; a GPU is thousands of dumb cores.
It's easier to speed up something that runs a small set of instructions (GPU) than something that has a shit ton of instructions (CPU), due to, like you mention, branch prediction.
Modern CPU performance gains focus more on parallelism and, in the case of efficiency cores, on scheduling to optimize for performance.
GPU-wise, it really is as simple as this: GPUs are typically memory bottlenecked. Memory bandwidth (memory speed x bus width, with a few caveats, since cache hits lower the bandwidth requirement) is the major indicator of GPU performance. Bus width is fixed by the chip's hardware design, so the simplest method to increase general performance is clocks.
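To put rough numbers on the bandwidth formula, here is a minimal sketch. The function name and the example figures (14 and 16 Gbps per pin, 256-bit bus) are made up for illustration and are not taken from any specific card. It only computes the peak theoretical figure and ignores the cache caveat entirely.

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in GB/s.

    data_rate_gbps: effective per-pin data rate in Gbit/s (hypothetical input)
    bus_width_bits: memory bus width in bits, fixed by the chip design
    """
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return data_rate_gbps * bus_width_bits / 8


# Same 256-bit bus, two illustrative memory speeds.
stock = memory_bandwidth_gbs(14.0, 256)
faster = memory_bandwidth_gbs(16.0, 256)

print(f"stock:  {stock} GB/s")   # 448.0 GB/s
print(f"faster: {faster} GB/s")  # 512.0 GB/s
```

With the bus width held constant, the only knob left in the formula is memory speed, which is the point made above about clocks.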