I've done some more work on my GUI and corrected a bug that involved disabled controls ... for one instance I had overlooked.

You can now select a few other options on the SETUP window, including Language and OS.

gogcli_setup_2.png

I've also partially implemented a Game Files Selector window, so some of the downloading code has been done.

gogcli_selector.png

Meanwhile I have come across some hurdles I need to sort out, in regard to obtaining manifest data for a game. I have belatedly discovered it doesn't work how I thought ... or rather, it does for some games but not others.

We are almost at the completed download support stage with my GUI.
Post edited March 07, 2021 by Timboli
Geralt_of_Rivia: The last time I checked (though that was years ago with a browser) the GOG servers only allowed 6 connections at a time, so 10 workers might be a bit too much. It doesn't hurt, though: the connections over the limit will simply get delayed until one of the earlier connections has ended.

And while the GOG API may be a bottleneck, the question is always how close you can get to the restrictions set by the bottleneck with your implementation. My data collection is a bit more than 4 times faster than gogrepo and I haven't even done any optimization yet. So obviously I'm interested to know if there is still room for improvement.

Edit: I just checked and now the servers allow more than 6 connections.
I haven't timed it exactly (that would require me patiently waiting for it to run), but for 2071 games (yes, you may judge me), Linux & Windows installers, and English, French, Spanish & Japanese languages, it's somewhere between 10 and 20 minutes, I believe.

Note that there is an imposed 200ms delay between requests to avoid simulating a mini DoS attack. I haven't figured out gog's tolerance there, but reducing that delay (it is a parameter in the command) might save some time.
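
To illustrate the idea of a delay between requests, here is a minimal Python sketch of a shared limiter that spaces out request starts across worker threads. The `Throttle` class and its 0.2 second default are assumptions made for this example; it is not gogcli's actual code (gogcli is written in Go) and gogcli may apply its delay differently.

```python
import threading
import time


class Throttle:
    """Space out calls so that at least `delay` seconds separate their starts."""

    def __init__(self, delay=0.2):
        self.delay = delay
        self._lock = threading.Lock()
        self._next_slot = 0.0  # earliest monotonic time the next call may start

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it so
        # other threads can queue up for later slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self.delay
        pause = slot - now
        if pause > 0:
            time.sleep(pause)
```

Each worker would call `throttle.wait()` right before sending a request, so lowering `delay` trades politeness towards the server for speed.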

I have a feeling that if I follow Sude's instructions with the Galaxy API, I may be able to shave off some requests (and hence time), but I'll leave that to a future optimisation for now.

Here is my manifest summary:

{
    "Games": 2071,
    "Files": 11793,
    "Installers": 6507,
    "Extras": 5286,
    "Size": 7449916626462,
    "SizeAsString": "7.45 TB",
    "SizeAverage": 3597255734,
    "SizeAverageAsString": "3.60 GB",
    "LargestGame": {
        "Id": 1943729964,
        "Title": "Wolfenstein: The New Order",
        "Size": 135056012577,
        "SizeAsString": "135.06 GB",
        "Installers": 36,
        "Extras": 0
    },
    "SmallestGame": {
        "Id": 1207658773,
        "Title": "Personal Nightmare",
        "Size": 13261312,
        "SizeAsString": "13.26 MB",
        "Installers": 1,
        "Extras": 0
    }
}

Here are some dangling files that gog has:

{
    "Errors": [
        "GetDownloadInfo(downloadPath=/downloads/craft_the_world/en1patch1) -\u003e Expected response status code of 200, but got 404",
        "GetDownloadInfo(downloadPath=/downloads/metro_exodus/85424) -\u003e Expected response status code of 200, but got 403",
        "GetDownloadInfo(downloadPath=/downloads/dreamfall_chapters_season_pass/77333) -\u003e Expected response status code of 200, but got 403",
        "GetDownloadInfo(downloadPath=/downloads/republique/41353) -\u003e Expected response status code of 200, but got 403"
    ]
}

Craft the World is a dangling patch they forgot to delete. Republique has a "coming soon" manual that will probably never see the light of day (the game is not even listed on the store anymore). I'll have to look into the other two.
Post edited March 03, 2021 by Magnitus
Okay, with some help from Magnitus, I appear to have solved the issues.

You can find the latest update here - https://github.com/Twombs/GOGcli-GUI

This means that, hopefully tomorrow, I will be able to finish the remaining downloading code.

I am also now using and testing with v0.7.0 of gogcli.exe.
Post edited March 03, 2021 by Timboli
Magnitus: I haven't timed it exactly (that would require me patiently waiting for it to run), but for 2071 games (yes, you may judge me), Linux & Windows installers, and English, French, Spanish & Japanese languages, it's somewhere between 10 and 20 minutes, I believe.
Seriously? Including getting the md5 from the XML files or without them? Getting all the XMLs should take much longer.

gogrepo takes almost 6 hours for fewer than 1000 games, according to Timboli. My program takes about 1:50 (hours:minutes) for my library, which is significantly bigger than that. If I made do without the md5 from the XMLs, my program would also take only about 25 minutes, but getting the XMLs takes that long because you need 3 requests per file.

Though as I said, I haven't optimized anything yet.
Geralt_of_Rivia: Seriously? Including getting the md5 from the XML files or without them? Getting all the XMLs should take much longer.
[...]
If memory serves (taking my collection as an example):
- 21 requests to get all the pages
- 2071 requests to get all game details

For each file:
- 2 requests against 302 codes to get the download url
- 1 request to get the xml info
- If the xml info is a 404, 1 HEAD request on the download url to get the file size at least
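
As a rough sanity check, the breakdown above can be turned into a small estimate. This is only a sketch: the function name is made up for illustration, and the file count is taken from the manifest summary posted earlier in the thread.

```python
def estimate_requests(pages, games, files, files_without_xml=0):
    """Estimate the HTTP requests needed to build a manifest.

    Per file: 2 redirect hops + 1 XML request, plus 1 extra HEAD request
    for each file whose XML lookup returns 404.
    """
    per_file = 2 + 1
    return pages + games + files * per_file + files_without_xml


if __name__ == "__main__":
    # Lower bound: assumes no XML lookup returns 404.
    print(estimate_requests(pages=21, games=2071, files=11793))
```

With those figures the lower bound comes to 37471 requests; if every one of the 5286 extras also needed the fallback HEAD request, it would be 42757.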

Anyways, I'll do it again and make a more serious effort to time it exactly, but I'm reasonably sure it's less than 20 minutes. Definitely far less than 1 hour.

Also keep in mind that for the installers, I'm skipping over Mac (only getting Windows and Linux) and when a language is specified, I skip over anything that is not English, French, Spanish or Japanese.
Post edited March 04, 2021 by Magnitus
Geralt_of_Rivia: [...] gogrepo takes almost 6 hours for less than 1000 games according to Timboli. [...]
For 779 games it takes GOGRepo about an hour (61 min) to get a clean manifest on my PC. I don't know if any factors other than internet speed affect this, but my machine is fairly good. In August of last year, it took just under an hour (58 min) for a clean manifest update of 653 games. That's just Windows versions, just English, but all extras too.
Post edited March 04, 2021 by paladin181
Ok, it's 26 minutes of my life I ain't getting back, but here you go.

PS: Yes, my laptop is a beast (64GB RAM, Xeon processor... I don't cheap out with the tool of my livelihood), but honestly, unless you're running this on a lower-end Raspberry Pi, it shouldn't matter that much.

A network round-trip to gog.com from my machine takes 124 milliseconds. In computing time, that's like forever. Heck, the 200ms sleep delay between requests is like forever too.

That's where 99%+ of that 26-minute wait is coming from. This is what is slowing down everything.

I mean, if you're gonna hold 1000+ concurrent connections, then yes, your cpu and ram will start to matter a lot more, but we're talking about less than 50 concurrent connections here. You ain't running a web server.

If it's taking a long time, it is probably because the program is not managing its io well (is it really doing network requests concurrently? Is it doing some extra, very expensive, io-intensive stuff on the disk that it's blocking on?... or is it really just mining bitcoins on the side :P?).
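
To make the point about waiting on the network concrete, here is a small Python sketch that compares sequential and concurrent "requests". No real network is involved: `fake_request` just sleeps for a simulated round trip, and the latency, worker count and function names are all assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.02  # simulated round trip in seconds (stand-in for a real request)


def fake_request(_):
    time.sleep(LATENCY)  # the thread is idle while "waiting on the network"
    return True


def run_sequential(n):
    start = time.monotonic()
    for i in range(n):
        fake_request(i)
    return time.monotonic() - start


def run_concurrent(n, workers=10):
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_request, range(n)))
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(50):.2f}s")
    print(f"concurrent: {run_concurrent(50):.2f}s")
```

Since almost all the time is spent sleeping rather than computing, the concurrent run should finish in roughly a tenth of the sequential time with 10 workers, whatever the CPU.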
Post edited March 04, 2021 by Magnitus
Magnitus: Ok, it's 26 minutes of my life I ain't getting back, but here you go.
Are you getting information about files from the GOG API or by obtaining the details of the actual files in your library?

The former is much faster, but it will not detect small changes in file sizes, since the size reported by the API is rounded.

MaGog used to do both. The former took about 30-40 minutes for the entire GOG catalogue, the latter took hours.

gogrepoc does the latter, which guarantees that it does not miss file changes, even if the file name does not change and the size changes by as little as one byte.
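
A tiny sketch of why rounded sizes can hide a change, assuming for illustration that sizes are rounded to the nearest megabyte (the exact rounding the API applies, if any, is not confirmed here):

```python
MB = 1_000_000  # assumed rounding unit, for illustration only


def rounded_size(size_bytes, unit=MB):
    """Round a byte count to the nearest whole unit, as a rounded API might."""
    return round(size_bytes / unit) * unit


def looks_changed(old_bytes, new_bytes):
    """Change detection that only ever sees the rounded sizes."""
    return rounded_size(old_bytes) != rounded_size(new_bytes)
```

Under that assumption, `looks_changed(13261312, 13261313)` is false even though the file grew by one byte, whereas comparing exact byte counts would catch it.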
mrkgnao: Are you getting information about files from the GOG API or by obtaining the details of the actual files in your library?
[...]
Your turn of phrase makes this ambiguous for me.

I get all the information from the GOG web API (estimated size, actual size, checksum when available, file name, etc). It's not rounded (if the info was wrong, I'd be getting errors when I'm uploading the files in my store as it checks for that and I'm not getting size mismatch errors).

If you mean getting the info from my actual downloaded files, that would be a lot faster, not slower, as I'd be doing either a file stat on my hard drive or a network roundtrip to my s3 store on my local LAN which is A LOT faster than a roundtrip to the gog server over the internet.
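
For what the local variant could look like, here is a hedged Python sketch that checks a downloaded file against an expected size and, optionally, an md5 checksum. The function name and return shape are invented for this example and do not reflect how gogcli or its storage layer actually performs the check.

```python
import hashlib
import os


def verify_local_file(path, expected_size, expected_md5=None, chunk_size=1 << 20):
    """Return a list of mismatch descriptions (empty if the file checks out)."""
    problems = []
    actual_size = os.stat(path).st_size  # cheap: no file contents are read
    if actual_size != expected_size:
        problems.append(f"size mismatch: expected {expected_size}, got {actual_size}")
    if expected_md5 is not None:
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            # Read in chunks so multi-gigabyte installers do not fill memory.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_md5.lower():
            problems.append("md5 mismatch")
    return problems
```

The size check is a single stat call, which is why it is so much cheaper than a round trip over the internet; the checksum is the expensive part because it reads the whole file.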
Post edited March 04, 2021 by Magnitus
Magnitus: Your turn of phrase makes this ambiguous for me.
[...]
Sorry for being unclear.

By GOG API, I mean information from https://api.gog.com, for example from:
- https://api.gog.com/products/GOG_ID?expand=downloads,expanded_dlcs,related_products,changelog
or
- https://api.gog.com/v2/games/GOG_ID

By actual files in the library (on GOG, not local), I mean information obtained by following the redirected file download URL, for example to find information about the "avatar" bonus of "Aarklash Legacy", you would follow the library link:
- https://www.gog.com/downloads/aarklash_legacy/25723
all the way to the actual file on the server:
- aarklash_avatar.zip
and get its file size, modification time, etc.
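
As a sketch of that last step, the snippet below pulls the size and modification time out of a set of response headers. It assumes the file server sends standard `Content-Length` and `Last-Modified` headers, which is not confirmed in this thread; the function name is made up for the example.

```python
from email.utils import parsedate_to_datetime


def file_info_from_headers(headers):
    """Extract size and modification time from HEAD response headers.

    `headers` is a plain dict; keys are matched case-insensitively.
    Missing headers yield None rather than raising.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    size = lowered.get("content-length")
    modified = lowered.get("last-modified")
    return {
        "size": int(size) if size is not None else None,
        "modified": parsedate_to_datetime(modified) if modified else None,
    }
```

The size obtained this way is an exact byte count, which is what makes a one-byte change detectable.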
Post edited March 04, 2021 by mrkgnao
mrkgnao: Sorry for being unclear.
No worries.

mrkgnao: By actual files in the library (on GOG, not local), I mean information obtained by following the redirected file download URL [...]
Yes, I do that awkward little dance. It's all here: https://github.com/Magnitus-/gogcli/blob/main/sdk/download.go#L160
Post edited March 04, 2021 by Magnitus
Magnitus: Yes, I do that awkward little dance. It's all here: https://github.com/Magnitus-/gogcli/blob/main/sdk/download.go#L160
In that case, very impressive. I'll give gogcli a try when I have some time.
Post edited March 04, 2021 by mrkgnao
mrkgnao: In that case, very impressive. I'll give gogcli a try when I have some time.
I'm glad my work can amaze at times. I do my best.

Let me know of any feedback when you try it.
Sude: I haven't read your code yet but since you were talking about getting file sizes for extras using HEAD requests I got the impression you don't know about this part of the API
[...]
["downloads"]["bonus_content"][i]["files"][j]["size"]
Shows file size for extra in bytes
as noted by mrkgnao above, the size returned here is rounded (to full megabytes iirc), which makes this info rather useless. it still escapes me why GOG thinks rounding file sizes is something you'd want here.
It's really sad, because using the product api alone, without having to initiate a new download session to retrieve the wanted information, would greatly reduce the need to add artificial delays.
plus you can ask the api for info about more than one id in one request.

Magnitus: If memory serves (taking my collection as an example):
[...]
if you want to further limit the number of requests you do, it should be safe to just skip trying to get the xml data for bonus items. afaik they never ever had this info provided.

though that probably doesn't matter much, time wise.

just curious, any particular reason you manually follow the http redirections in your code?
never found that necessary, and relying on the number of redirects to be always exactly two, like they are now,
seems to introduce unnecessary fragility, no?
Post edited March 04, 2021 by immi101
immi101: if you want to further limit the number of requests you do, it should be safe to just skip trying to get the xml data for bonus items. afaik they never ever had this info provided.
Yeah, I'm trying to future/edge-case proof here.

I've seen enough of gog's api not to trust them about being consistent.

I just assume they have an employee who manages by hand all those conventions (or otherwise, is not very disciplined about it in an evolving codebase) and am trying to get away with the strict minimum number of assumptions that I can.

immi101: just curious, any particular reason you manually follow the http redirections in your code?
never found that necessary, and relying on the number of redirects to be always exactly two, like they are now,
seems to introduce unnecessary fragility, no?
Because I don't want to actually download the file here, I'm just interested in the url of the final request to get the info (without actually making the download).

I suppose I could test whether a HEAD request redirects or not, and if HEAD requests indeed follow redirection (I'll chalk it up to my lack of familiarity with Golang, it is my first project in that language), then you're right that this wouldn't be needed.

By default, I let Golang's http library follow redirections everywhere else.
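
For reference, one way to get the final url without downloading anything and without hard-coding the number of hops is to follow redirects one HEAD request at a time, up to a cap. The Python sketch below is an illustration of that idea, not a port of the linked Go code: `http_head` and `resolve_final_url` are hypothetical names, authentication is left out entirely, and whether GOG's servers answer HEAD the same way as GET on these urls is untested here.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def http_head(url, headers=None):
    """Send one HEAD request; http.client never follows redirects itself."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=30)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=30)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target, headers=headers or {})
        response = conn.getresponse()
        return response.status, {k.lower(): v for k, v in response.getheaders()}
    finally:
        conn.close()


def resolve_final_url(url, head=http_head, max_redirects=5):
    """Follow redirects hop by hop and return (final_url, status, headers)."""
    for _ in range(max_redirects + 1):
        status, headers = head(url)
        if status in REDIRECT_CODES and "location" in headers:
            url = urljoin(url, headers["location"])  # Location may be relative
            continue
        return url, status, headers
    raise RuntimeError(f"more than {max_redirects} redirects")
```

Passing `head` as a parameter keeps the loop testable with a fake and leaves room for a wrapper that adds whatever auth headers the real requests need.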
Post edited March 04, 2021 by Magnitus