Almighty gods of GOG,

Heed my prayers! [bumps thread]

Need I summon a breaking-the-law priest to get some attention here?
WinterSnowfall: It all comes down to entropy (information entropy, what this guy studied).

Data can only be compressed so far before you start losing computational efficiency rather drastically. With extreme settings, you'll get to a point where there's just nothing more you can squeeze out of the data and you're just wasting processing time and power.
I don't really notice it taking a lot of time. LZMA2 with multiple threads is actually fairly fast, and Ultra only takes about twice as long as normal to compress something. Unless you're really in a hurry, is waiting another 2 minutes going to kill you?
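If anyone wants to see the trade-off for themselves, the 7-Zip command line makes it easy to compare the two presets (switches quoted from memory, paths are just placeholders):

7z a -m0=lzma2 -mmt=on -mx=5 game-normal.7z game_files/    # Normal preset
7z a -m0=lzma2 -mmt=on -mx=9 game-ultra.7z game_files/     # Ultra preset
ls -l game-normal.7z game-ultra.7z                         # compare sizes (and wall-clock time with 'time')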

Perhaps we're getting spoiled, with terabyte drives, almost unlimited file storage sites, huge 50 GB games and high-quality OSes, so that we no longer have to be picky about making stuff fit any particular size without tinkering. Early on, to get around the 1.44 MB limit, they simply split the zip archive across multiple disks; the DOOM demo we got came on 2 disks after all, and the full game was probably 3-4 disks.

I guess gone are the days of writing assembly language and rewriting blocks of code to do the same job while saving 3 bytes... or the stories of how much functionality they could pack into the BASIC ROMs of the old 8-bit computers...
rtcvb32: Ultra only takes about twice as long as normal to compress something. Unless you're really in a hurry, is waiting another 2 minutes going to kill you?
Of course not, but the real issue here is: is waiting twice as long going to be worth it?

Are you getting an archive which is only 5% smaller by spending twice as long in archiving time? As you've mentioned, we're spoiled with multi-TB drives nowadays, so most people would say... probably not.

But then again, if you REALLY want to squeeze the most out of it, Ultra is the way to go ;).
Post edited March 13, 2015 by WinterSnowfall
Rixasha: I would love this and it would make a big difference with the larger titles.
How would it be any more difficult than .tar.gz? I don't get it.
It wouldn't be difficult at all.

unzip <filename>.zip -d <directory>/
tar -cf - <directory>/ | xz -9e > <filename>.tar.xz

It's pretty simple actually. It would take some time to do the entire collection of games and all of the files within, as LZMA is slow to compress although fast to decompress. There are alternative implementations such as pxz that take advantage of multiple cores, which are worthwhile for mass recompression runs. I believe 7-Zip supports parallelization as well.
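For example, a recompression pass that uses all cores could look something like this (file names are placeholders; the -T option needs a reasonably recent xz, roughly 5.2 or newer, otherwise pxz does the same job):

tar -cf game.tar game_dir/
xz -9e -T0 game.tar          # produces game.tar.xz using every available core
# or: pxz -9e game.tar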

Then they'd just need to update everything that references the filenames to point at the xz archives instead of the zip ones. It wouldn't be an overnight effort, but likely something that could be done in short order as individual packages are updated anyway, with the remaining ones changed after the low-hanging fruit is taken care of.

The savings are worthwhile, and every Linux distribution they officially support, as well as many that they don't, either installs the xz utilities by default or has them available for installation if someone wanted to manually open the archives to access the content. GOG's own software would of course include its own copy for its own purposes, so as not to rely on whatever someone may or may not have installed.

I don't see this as something they skipped on purpose, or are avoiding for any complex technical reasons, but rather just something they haven't gotten to yet and which isn't a major priority with all of the things they've already got on the go.

Having said that, XZ/LZMA can save a tremendous amount of space, which makes it a worthwhile effort for reducing bandwidth consumption and download times, for reducing install time (it decompresses faster than zip/gz/bz2 et al.), and for reducing local disk consumption for library backups.

rtcvb32: Ultra only takes about twice as long as normal to compress something. Unless you're really in a hurry, is waiting another 2 minutes going to kill you?
WinterSnowfall: Of course not, but the real issue here is: is waiting twice as long going to be worth it?

Are you getting an archive which is only 5% smaller by spending twice as long in archiving time? As you've mentioned, we're spoiled with multi-TB drives nowadays, so most people would say... probably not.

But then again, if you REALLY want to squeeze the most out of it, Ultra is the way to go ;).
LZMA compression can give gains far more significant than 5%. How big the gains are depends on the specific content and on the command-line options used when the archive is created, but with a fair bit of content, including game archives, 20/30/40% is not unlikely. I've got game backups here that used to be stored in zip/rar or whatever other format I happened to use at the time and when I converted it all to XZ the savings were substantial.

The amount of time it takes to compress is irrelevant, however, as it is an operation done once at GOG that does not impact the user in any way. Decompression of XZ/LZMA is quite a bit faster than other formats and was designed to be so, which gives performance benefits on top of the disk and network savings.

The only real argument against doing it is that it uses a lot of memory during decompression, and depending on the oldest/slowest computers with the least RAM that GOG would care to support, it might cause problems for a very small fraction of people. Having said that, I use xz to compress fairly large files on a 13-year-old Dell Optiplex with 256 MB of RAM and a 1.8 GHz processor and it works fine. Compression takes eons on that machine, but decompression is faster than gzip/bzip2/etc.
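If anyone wants to check on their own hardware, a rough comparison of decompression time and peak memory might look like this (archive names are placeholders; /usr/bin/time is GNU time):

time gzip -dc setup_game.tar.gz > /dev/null
time xz -dc setup_game.tar.xz > /dev/null
/usr/bin/time -v xz -dc setup_game.tar.xz > /dev/null    # "Maximum resident set size" is the peak memory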

If we ignore any concern over compatibility issues (which I believe to be non-concerns, as most Linux distributions, and certainly the ones GOG supports officially, already use xz by default for their own packages), it is then more a matter of priority than anything technological. They almost certainly have higher-priority things to work on than a low-priority effort to recompress everything, I imagine, although I won't be surprised when they do go ahead and do it, as the gains are non-trivial.
Post edited March 13, 2015 by skeletonbow
skeletonbow: LZMA compression can give gains far more significant than 5%.
I'm well aware of the advantages of LZMA(2) and the differences in compression rate; I'm the OP, remember? The thing we were discussing previously was the difference between the Normal/Maximum profiles and the Ultra profile of LZMA.

I was trying to make a point that while Maximum does have some gains over Normal, Ultra spends too much extra time for the small percentage of added compression it usually offers. At least that's how I see it.

it is then more a matter of priority than anything technological. They almost certainly have higher-priority things to work on than a low-priority effort to recompress everything, I imagine, although I won't be surprised when they do go ahead and do it, as the gains are non-trivial.
I really hope you're right and that we're not stuck with gzip for good. I've seen no official info that they plan on doing anything along these lines... nor did any GOG gods send me such visions.

I've got game backups here that used to be stored in zip/rar or whatever other format I happened to use at the time and when I converted it all to XZ the savings were substantial.
Yeah, did that too some time ago. Also had some ancient .ace and .arj archives if you can imagine that.

All in all I gained a couple of GBs and it got me away from closed format archives (in the case of rar and ace), so it was well worth it.

On a normal day I'd mark your reply as a solution, since you've covered most of the relevant points, but I'm still waiting for a(n official) sign from the GOG gods.
Post edited March 14, 2015 by WinterSnowfall
skeletonbow: LZMA compression can give gains far more significant than 5%.
WinterSnowfall: I'm well aware of the advantages of LZMA(2) and the differences in compression rate; I'm the OP, remember? The thing we were discussing previously was the difference between the Normal/Maximum profiles and the Ultra profile of LZMA.

I was trying to make a point that while Maximum does have some gains over Normal, Ultra spends too much extra time for the small percentage of added compression it usually offers. At least that's how I see it.
That would depend on the use case; if it were an automatic benefit for every possible usage, it would be the hard-coded default and likely not even be an option. :) So by its very existence there must be cases where it is beneficial. :)

This particular part of the discussion is just an academic aside, however, because it is something that would happen entirely on GOG's own build machines as a one-shot step during packaging and would not have any end-customer impact whatsoever. Big computer companies have tonnes of fast big-iron computing power kicking around in their offices and clean rooms and whatnot, and twiddling a compression option to milk every byte out of it is not likely to make their systems crawl or their power bills go through the roof, so rather than a case of "why bother", it's more a case of "why _not_ bother" in my mind. But again, it is a trivial detail compared to the real question you brought up, which has solid merit.


it is then more a matter of priority than anything technological. They almost certainly have higher-priority things to work on than a low-priority effort to recompress everything, I imagine, although I won't be surprised when they do go ahead and do it, as the gains are non-trivial.
WinterSnowfall: I really hope you're right and that we're not stuck with gzip for good. I've seen no official info that they plan on doing anything along these lines... nor did any GOG gods send me such visions.
They probably will at some point. Having moved large amounts of software packaging myself from older archive formats to newer ones ( .Z -> .gz -> .bz2 -> .xz ) for software distribution, I've seen the gains first-hand, but I also recognize that such changes are often of the "very low priority" variety unless they're solving a blocker bug or somesuch in the daily churn of things. :) I suspect it's something they'd investigate, run some automated script testing to determine the potential gains, ponder all of the benefits (reduced server disk space and load, reduced download times, reduced load on the servers in general, similar customer benefits, etc.) and then put on the "low priority, someday later" list, perhaps adding it to the "list of things to do whenever we update a particular game in the catalogue" like they did going from the old 1.x to 2.x installers a year and a half or so ago.

Also, I think we're in the very small minority of customers who would ever even think of or care about this sort of thing, although Linux users in general tend to be more technically astute power users than average. :)


I've got game backups here that used to be stored in zip/rar or whatever other format I happened to use at the time and when I converted it all to XZ the savings were substantial.
WinterSnowfall: Yeah, did that too some time ago. Also had some ancient .ace and .arj archives if you can imagine that.

All in all I gained a couple of GBs and it got me away from closed format archives (in the case of rar and ace), so it was well worth it.

On a normal day I'd mark your reply as a solution, since you've covered most of the relevant points, but I'm still waiting for a(n official) sign from the GOG gods.
I freed up about 350GB of disk space about a year ago on one computer moving zip/rar/gz/bz2 and some other formats to xz/zip with maximum compression out the wazoo. In some cases I didn't even decompress the original and still got a huge savings which is completely the opposite of what one expects when compressing something that's already compressed. :)

I've still got some old archive formats kicking around here and there, and my GOG backups are pristine, so I'm not sure how much converting them would actually save. I don't archive the Linux downloads from GOG yet, although at some point in the future I'll definitely be doing that. If GOG hasn't done it yet by then, I will script recompressing everything myself and post the results to the forums, although that probably wouldn't be for a year or more at my best guess, because $biggerproblems :) When I do get around to it, I expect the majority of content to shrink by a good chunk, though some content is likely to stay about the same and a smaller portion may even increase in size (I had that occur with some games).
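When that day comes, the script would probably be little more than a loop along these lines (untested sketch, file names are placeholders, and -T0 again assumes a recent xz):

for f in *.zip; do
    d="${f%.zip}"
    unzip -q "$f" -d "$d" && tar -cf - "$d" | xz -9e -T0 > "$d.tar.xz" && rm -rf "$d"
done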

Fun stuff to tinker with in Linux if one has the cycles to spare and motivation. Someday... :)
skeletonbow: I freed up about 350GB of disk space about a year ago on one computer moving zip/rar/gz/bz2 and some other formats to xz/zip with maximum compression out the wazoo. In some cases I didn't even decompress the original and still got a huge savings which is completely the opposite of what one expects when compressing something that's already compressed. :)
I got some huge differences in compression before as well. Some of it is the format: with zip, each file is compressed individually, so you can't carry any useful data over to the next files. I had a few HTML books that I recompressed as 7z and got huge gains, taking an 8 MB file down to 1 MB simply due to that, with very little to do with the compression level. Sometimes I got similar results by storing data in uncompressed zips, combining them into a single zip and then re-compressing that into another zip, which did effectively the same thing with better compression.
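As a rough illustration of the difference (7-Zip switches quoted from memory, paths are placeholders), per-file vs. solid compression of the same book looks like:

zip -9 -r book.zip html_book/                   # each file compressed on its own
7z a -m0=lzma2 -ms=on book.7z html_book/        # solid mode shares context across files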

Actually, as a test I uncompressed all the PNG files from ToME and the final file size of the archived images was cut in half. Yeah, they took up 900 MB of raw space in the ramdrive during the tests, but compressing them like that was better than when I optimized the PNGs and saved them in the zips...

As for images and pictures, PNGs, JPEGs and other sources can be made smaller with very little extra work. JpegOptim, which I've been experimenting with, simply repacks the data and doesn't offer higher compression options (other than progressive vs. normal), so I'm probably going to take a huge archive of stuff like all the game wallpapers & avatars, recompress them and repack them... The script's ready, it's just a matter of doing it.
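For the curious, the lossless passes I have in mind are roughly these (jpegoptim needs to be installed; file names are placeholders):

jpegoptim --strip-all wallpaper.jpg         # repack and drop metadata, no quality loss
jpegoptim --all-progressive wallpaper.jpg   # re-encode as progressive, still lossless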

Although it's a little confusing why people don't take a slightly better approach to optimizing size/space. Sometimes it takes something like a 50 GB game to make it glaringly obvious that something's wrong, whereas spending a weekend recompressing to save 10 MB on the final product of a 300 MB game doesn't seem worth it... Of course, back when games ran on CDs and those were new and you had 600 MB of space, compression only needed to be considered for video and audio; everything else could have minimal compression and you'd still have 300 MB free...

Hmm... I'm not sure if this tangent means anything...
skeletonbow: I freed up about 350GB of disk space about a year ago on one computer moving zip/rar/gz/bz2 and some other formats to xz/zip with maximum compression out the wazoo. In some cases I didn't even decompress the original and still got a huge savings which is completely the opposite of what one expects when compressing something that's already compressed. :)
rtcvb32: I got some huge differences in compression before as well. Some of it is the format: with zip, each file is compressed individually, so you can't carry any useful data over to the next files. I had a few HTML books that I recompressed as 7z and got huge gains, taking an 8 MB file down to 1 MB simply due to that, with very little to do with the compression level. Sometimes I got similar results by storing data in uncompressed zips, combining them into a single zip and then re-compressing that into another zip, which did effectively the same thing with better compression.

Actually, as a test I uncompressed all the PNG files from ToME and the final file size of the archived images was cut in half. Yeah, they took up 900 MB of raw space in the ramdrive during the tests, but compressing them like that was better than when I optimized the PNGs and saved them in the zips...

As for images and pictures, PNGs, JPEGs and other sources can be made smaller with very little extra work. JpegOptim, which I've been experimenting with, simply repacks the data and doesn't offer higher compression options (other than progressive vs. normal), so I'm probably going to take a huge archive of stuff like all the game wallpapers & avatars, recompress them and repack them... The script's ready, it's just a matter of doing it.

Although it's a little confusing why people don't take a slightly better approach to optimizing size/space. Sometimes it takes something like a 50 GB game to make it glaringly obvious that something's wrong, whereas spending a weekend recompressing to save 10 MB on the final product of a 300 MB game doesn't seem worth it... Of course, back when games ran on CDs and those were new and you had 600 MB of space, compression only needed to be considered for video and audio; everything else could have minimal compression and you'd still have 300 MB free...

Hmm... I'm not sure if this tangent means anything...
For PNG files there are other options like pngcrush, which might give better results also. It changes the bit-level data in a PNG without changing the resulting image, thanks to flexibility in the file format, using better algorithms to figure out the best compression for the particular image. It's really useful for website optimization when using PNGs. Probably something like the JPEG tool you mentioned (haven't heard of that one). I'm definitely a fan of better compression algorithms, both lossless and some lossy ones as well, as long as the latter give higher quality images with better compression rather than trading better compression for lower quality, as is often the case. It's a shame the JPEG 2000 standard never caught on in widespread use for lossy images; it is rather incredible. Google has their WebP format also, although I doubt that'll ever catch on because of the NIH syndrome of all the other browser vendors.
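A basic pngcrush run is something along these lines (file names are placeholders; -brute tries lots of filter/strategy combinations, so it's slow and only really worth it for a one-off batch):

pngcrush -reduce -brute wallpaper.png wallpaper_crushed.png   # writes an optimized copy alongside the original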
stan423321: This is unfortunately pretty hard to do, for the same reason they don't switch to 7-Zip on the Windows side. Granted, the typical Linux user is a little smarter, but you never know what weirdness people will come up with.
Even the system is a little smarter as compression is better integrated. :)

At least on the Linux distributions I use, tar detects the compression used automatically, so there's not even a difference for the user:

- command to extract an archive compressed with gzip: tar xf setup.tar.gz
- command to extract an archive compressed with xz: tar xf setup.tar.xz
eiii: At least on the Linux distributions I use, tar detects the compression used automatically, so there's not even a difference for the user
Or even that there's a utility to tell us what a file is. On Windows you have to rely exclusively on the extension, which doesn't help when jar files are just renamed zip files... SPAZ's spz files are just renamed PNG files... it makes things a little more confusing.
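On Linux the file(1) utility covers that, since it inspects the magic bytes instead of trusting the extension (file names here are only examples):

file game.jar screenshot.spz    # reported as Zip / PNG data no matter what they're called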
Gydion: Wishlist: Use better compression for Linux tarballs. For example tar.xz. Comments contain additional info.
WinterSnowfall: Darn, it doesn't have that many backers. Oh, well +1.
Wow, still not very many. It's got one more now.

I'm definitely liking xz for my compression, though I'm late in discovering it.