Pavel Djundik (@xpaw.me)
bsky.app
Steam is working on adding zstd compression for game chunks (every game file is split into 1 MB chunks); it currently uses LZMA. I wonder how much of an overall improvement it will be.
@[email protected]
11
18d

LZMA nuts lmao gottem

Justin
33
19d

Pretty neat. Right now 1 Gbps downloads can often be bottlenecked by the CPU, so an algorithm that is cheaper to decompress, like zstd, will probably speed up downloads.

@[email protected]
15
19d

I don’t know much about compression algorithms. What are the benefits of doing this?

@[email protected]
19
18d

zstd is generally stupidly fast and quite efficient.

This is probably not exactly how Steam does it, or even close, but as a quick & dirty comparison I compressed and decompressed a random CD .iso (~375 MB) I had lying about, using zstd and lzma, aiming for a 1 MB dictionary in both cases:

test system: Arch Linux (btw, as is customary) laptop with an AMD Ryzen 7 PRO 7840U CPU.

used commands & results:

Zstd:

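# compress (--maxdict 1048576 was meant to set the compression dictionary to 1MB; as far as I can tell this option only applies to zstd's dictionary training mode, --train, so it likely has no effect on this run):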
% time zstd --maxdict 1048576 < DISC.ISO > DISC.zstd
zstd --maxdict 1048576 < DISC.ISO > DISC.zstd  1,83s user 0,42s system 120% cpu 1,873 total

# decompress:
% time zstd -d < DISC.zstd > /dev/null
zstd -d < DISC.zstd > /dev/null  0,36s user 0,08s system 121% cpu 0,362 total
  • resulting archive was 229 MB, ~61% of original.
  • ~1.9s to compress
  • ~0.4s to decompress

So, pretty quick all around.

Lzma:

# compress (the -1e argument implies setting preset which uses 1MB dictionary size):
% time lzma -1e < DISC.ISO > DISC.lzma
lzma -1e < DISC.ISO > DISC.lzma  172,65s user 0,91s system 98% cpu 2:56,16 total

#decompress:
% time lzma -d < DISC.lzma > /dev/null
lzma -d < DISC.lzma > /dev/null  4,37s user 0,08s system 98% cpu 4,493 total
  • ~179 MB archive, ~48% of original
  • ~3min to compress
  • ~4.5s to decompress

This one felt like forever to compress.

So, my takeaway here is that lzma's compression time cost is high enough that wasting a bit of disk space for the sake of speed is worth it.

and lastly, just because I was curious, ran zstd at a higher compression level too (-9; levels go up to 19, or 22 with --ultra):

% time zstd --maxdict 1048576 -9 < DISC.ISO > DISC.2.zstd
zstd --maxdict 1048576 -9 < DISC.ISO > DISC.2.zstd  10,98s user 0,40s system 102% cpu 11,129 total

% time zstd -d < DISC.2.zstd > /dev/null 
zstd -d < DISC.2.zstd > /dev/null  0,47s user 0,07s system 111% cpu 0,488 total

~11s compression time, ~0.5s decompression, archive size was ~211 MB.

I didn’t think it was necessary to spend the time compressing the archive with lzma’s max settings.

Now I’ll be taking notes when people start correcting me & explaining why these “benchmarks” are wrong :P

edit:

goofed a bit with the -9 run, added the same dictionary option.

edit 2: one of the reasons for the change might be syncing files between their servers. IIRC zstd can produce “rsync compatible” output (the --rsyncable option), allowing partial file syncs instead of syncing the entire file, which saves bandwidth. Not sure if lzma can do the same.
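
For what it's worth, the runs above compress the whole ISO as a single stream, while the original post says Steam splits files into 1 MB chunks. Below is a rough sketch of a closer approximation, assuming each chunk is compressed on its own (that is an assumption; the thread doesn't say how Steam actually does it). It splits a file into fixed-size pieces, pipes each piece through whatever compressor command is given, and adds up the output sizes. `chunked_size` is a made-up helper name, not part of any tool.

```shell
#!/bin/sh
# chunked_size FILE CHUNK_BYTES CMD [ARGS...]
# Hypothetical helper: splits FILE into CHUNK_BYTES pieces, pipes each piece
# through CMD separately, and prints the summed output size in bytes.
# CMD must read stdin and write the compressed data to stdout.
chunked_size() {
    file=$1
    chunk=$2
    shift 2
    tmp=$(mktemp -d) || return 1
    split -b "$chunk" "$file" "$tmp/chunk."
    total=0
    for piece in "$tmp"/chunk.*; do
        [ -e "$piece" ] || continue      # no pieces if FILE was empty
        n=$("$@" < "$piece" | wc -c)     # bytes produced for this piece
        total=$((total + n))
    done
    rm -rf "$tmp"
    echo "$total"
}
```

Usage would look like `chunked_size DISC.ISO 1048576 zstd -c` or `chunked_size DISC.ISO 1048576 lzma -1e -c`. The ratios will probably come out somewhat worse than the whole-file numbers, since each chunk starts with no history from earlier data.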

vaguerant
17
19d

“Better” doesn’t always mean “smaller”, especially in this example. LZMA’s strength is that it produces very small output, but its weakness is that it’s extremely CPU-intensive to decompress. Switching to ZSTD will actually result in larger downloads, but the massively reduced CPU load of decompressing ZSTD will mean it’s faster for most users. Instead of just counting the time it takes for the data to transfer, this is factoring in download time + decompression time. Even though ZSTD is somewhat less efficient in terms of compression ratio, it’s far more efficient computationally.
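
As a back-of-the-envelope check of the "download time + decompression time" idea, here is a small sketch that plugs in the numbers from the ISO benchmark earlier in this thread (179 MB and ~4.5 s for lzma, 229 MB and ~0.4 s for zstd). It assumes the simplest possible model: download everything first, then decompress, on that one laptop. Real clients most likely overlap the two, so treat the output as illustrative only. The function names are made up for this example.

```shell
#!/bin/sh
# total_time SIZE_MB DECOMPRESS_S LINK_MBPS
# Sequential model (an assumption): seconds to download SIZE_MB over a
# LINK_MBPS link, plus DECOMPRESS_S seconds to decompress afterwards.
total_time() {
    LC_ALL=C awk -v size="$1" -v dec="$2" -v mbps="$3" \
        'BEGIN { printf "%.2f\n", size / (mbps / 8) + dec }'
}

# break_even_mbps SMALL_MB SMALL_DEC_S BIG_MB BIG_DEC_S
# Link speed (Mbit/s) at which the smaller-but-slower archive and the
# bigger-but-faster archive take the same total time in the model above.
break_even_mbps() {
    LC_ALL=C awk -v sa="$1" -v da="$2" -v sb="$3" -v db="$4" \
        'BEGIN { printf "%.1f\n", 8 * (sb - sa) / (da - db) }'
}

# Numbers taken from the benchmark in this thread:
total_time 179 4.5 1000          # lzma on 1 Gbps   -> 5.93
total_time 229 0.4 1000          # zstd on 1 Gbps   -> 2.23
total_time 179 4.5 100           # lzma on 100 Mbps -> 18.82
total_time 229 0.4 100           # zstd on 100 Mbps -> 18.72
break_even_mbps 179 4.5 229 0.4  # -> 97.6
```

With these particular numbers the two come out about even around 100 Mbps: below that the smaller lzma download wins, above it zstd pulls ahead. Different hardware and different data will move that point.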

Jerkface (any/all)
0
18d

“most users” have gigabit pipes to the internet?

@[email protected]
1
19d

Bet that’ll save Valve on some server costs too. Storage is much cheaper than compute (though I imagine they’ll probably keep LZMA around for clients on slow connections).

@[email protected]
6
19d

Doesn’t decompression only happen client-side? I don’t imagine them compressing the files multiple times.

@[email protected]
3
19d

Hmm, true. I was thinking that Steam has a lot of games and respective builds it has to compress, even if the decompression benefits are client-side only.

Each new game update would need to be compressed too. I have no idea how Steam handles updates to work out which files need replacing on their end though, which might involve decompressing the files to analyse them.

@[email protected]
28
19d

Better compression -> faster downloads

@[email protected]
24
19d

Especially since LZMA decompression is currently CPU-bottlenecked on most computers with fast internet connections. zstd can use the CPU much more efficiently.
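
To put a rough number on that bottleneck, the sketch below estimates how fast a compressed stream the decompressor can consume, again using the ISO benchmark figures from earlier in this thread (179 MB in ~4.5 s for lzma, 229 MB in ~0.4 s for zstd). It assumes decompression speed on that one laptop is the only limit and ignores disk writes and multi-threading, so it is only an order-of-magnitude illustration. `max_link_mbps` is a made-up name.

```shell
#!/bin/sh
# max_link_mbps COMPRESSED_MB DECOMPRESS_S
# Highest link speed (Mbit/s) the decompressor could keep up with, assuming
# it chews through COMPRESSED_MB of input in DECOMPRESS_S seconds.
max_link_mbps() {
    LC_ALL=C awk -v size="$1" -v dec="$2" \
        'BEGIN { printf "%.1f\n", 8 * size / dec }'
}

max_link_mbps 179 4.5   # lzma -> 318.2
max_link_mbps 229 0.4   # zstd -> 4580.0
```

By this estimate lzma would cap out at roughly a third of a gigabit link on that machine, while zstd would have plenty of headroom.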

@[email protected]
11
18d

So if I’m reading this correctly, they are trading slightly larger downloads for considerably faster overall install speeds.

Makes a lot of sense as most internet connections nowadays can handle the added bandwidth.

@[email protected]
7
18d

I hope to one day join the present day lol

@[email protected]
3
17d

So network bandwidth became cheaper than CPU? Clearly CPUs are stagnating.
