Yepoleb: Recursive functions? It should be a simple if response.status == 500 then retry without downloads.
Yes, and I'm doing mostly that but by calling the same parsing function again, in case of HTTP 500, with an extra boolean switch I added for movies (to remove the downloads from the URL of the query). It's calling itself *once*, so it's recursive :P. I didn't mean Fibonacci-like recursive.

In my case it makes more sense to do that, since I already have iterative retry logic as part of the bulk queries I do. Having a retry loop inside a function call that's already in a retry loop reminds me too much of "Inception". I'm more of a "One Retry Loop to rule them all, One Retry Loop to find them, One Retry Loop to bring them all and in the darkness bind them" kind of coder :).
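
For anyone following along, here is a minimal Python sketch of the "call itself once on HTTP 500" idea. It is not my actual script: `fetch` is a hypothetical callable standing in for whatever HTTP client you use (it takes a URL and returns a status code and a body), and `build_url`, `query_product` and `fake_fetch` are names made up for illustration. The expand fields are the ones from the product URLs posted in this thread.

```python
from typing import Callable, Tuple

BASE_URL = "https://api.gog.com/products/{product_id}?expand={fields}"
FIELDS = ["downloads", "expanded_dlcs", "description", "screenshots",
          "videos", "related_products", "changelog"]


def build_url(product_id: int, include_downloads: bool = True) -> str:
    """Build the product query URL, optionally leaving out 'downloads'."""
    fields = [f for f in FIELDS if include_downloads or f != "downloads"]
    return BASE_URL.format(product_id=product_id, fields=",".join(fields))


def query_product(product_id: int,
                  fetch: Callable[[str], Tuple[int, str]],
                  include_downloads: bool = True) -> Tuple[int, str]:
    """Query one product; on HTTP 500, call itself once without downloads."""
    status, body = fetch(build_url(product_id, include_downloads))
    if status == 500 and include_downloads:
        # The boolean switch guarantees the recursion depth is at most one.
        return query_product(product_id, fetch, include_downloads=False)
    return status, body


def fake_fetch(url: str) -> Tuple[int, str]:
    """Hypothetical stand-in for the API: fails whenever downloads are requested."""
    if "downloads" in url:
        return 500, ""
    return 200, '{"id": 1}'


status, body = query_product(1, fake_fetch)
```

The outer retry loop for throttling or network errors stays wherever it already lives; this function only handles the single "drop the downloads field" fallback.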
Yepoleb: It's the only one that brought out the gogbear for me so far.
I can give you a list of GOGBear-inducing movie ids if you want :P.
Post edited February 21, 2017 by WinterSnowfall
Oh boy, lookie here, a game entry without a title:
https://api.gog.com/products/1441272224?expand=downloads,expanded_dlcs,description,screenshots,videos,related_products,changelog

Note to self: use far fewer NOT NULL constraints the next time you design a table to store GOG API return values.
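
A rough sketch of what that looks like, using SQLite through Python's standard `sqlite3` module. The table and column names are invented for illustration and are not my real schema, and the stored `"{}"` is only a placeholder for the raw response:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Only the id and the raw payload are mandatory; parsed fields such as the
# title are left nullable because the API does not always provide them.
conn.execute("""
    CREATE TABLE products (
        id       INTEGER PRIMARY KEY,
        title    TEXT,
        raw_json TEXT NOT NULL
    )
""")

# An entry without a title can now be stored instead of failing the insert.
conn.execute(
    "INSERT INTO products (id, title, raw_json) VALUES (?, ?, ?)",
    (1441272224, None, "{}"),
)

# Incomplete entries are then easy to list for later inspection.
untitled = conn.execute(
    "SELECT id FROM products WHERE title IS NULL"
).fetchall()
```

Keeping the raw payload as the one NOT NULL column means nothing is lost even when a parsed field turns out to be missing.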
Post edited February 25, 2017 by WinterSnowfall
After a week of relatively problem-free querying, now I see the throttling GOGBear is back in his bear-cave, spreading IP bans :|.

I wonder, do they give the GOG office keys to the GOGBear over the weekend so he can keep an eye on things? Well guys, I can tell you that he's doing his job. Stop putting the Earth's weight on him when the servers overload!
Now that I have the data in a database, it's a lot easier, at least in theory, to pull out relevant information. For example, here's a heavily filtered list of games that the GOG product APIs consider to be yet unreleased: https://dl.dropboxusercontent.com/u/1845258/gog_unreleased_02032017.csv

Even if it's filtered, there are still some erroneous entries for games which obviously were released a long time ago. What I've mostly filtered out are entries marked as "invalid", "to be deleted", "wrong game id" and stuff like that.

Most of them have been flagged already. Some of them are still mysteries. For example, I have no idea what this is:
https://api.gog.com/products/1589615681?expand=downloads,expanded_dlcs,description,screenshots,videos,related_products,changelog

Is it perhaps just a test API entry for our friends at THQ Nordic? Any better ideas?
Post edited March 02, 2017 by WinterSnowfall
For anyone who's still tuned in, I just wanted to let you know that I've completed a second full scan of the entire product id range, now saving the API responses to a database along with extended id query data, including stuff like changelogs.

I've also written an update routine which allows me to keep all the mapped ids stored in the database in sync with the latest changes on GOG and also allows me to detect any removed or deleted entries.
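
The comparison at the heart of such an update routine can be sketched roughly like this in Python. This is a simplified illustration, not my actual code: `diff_snapshots` is a made-up name, and both arguments are assumed to be dictionaries mapping a product id to its stored or freshly fetched API response.

```python
from typing import Dict, List


def diff_snapshots(stored: Dict[int, str],
                   fresh: Dict[int, str]) -> Dict[str, List[int]]:
    """Compare stored responses against fresh ones, keyed by product id."""
    stored_ids = set(stored)
    fresh_ids = set(fresh)
    return {
        # Ids that answer now but were not in the database yet.
        "added": sorted(fresh_ids - stored_ids),
        # Ids present on both sides whose payload differs.
        "changed": sorted(pid for pid in stored_ids & fresh_ids
                          if stored[pid] != fresh[pid]),
        # Ids that were stored before but no longer return an entry.
        "removed": sorted(stored_ids - fresh_ids),
    }


stored = {1: "a", 2: "b", 3: "c"}
fresh = {1: "a", 2: "B", 4: "d"}
result = diff_snapshots(stored, fresh)
```

In practice an id should only be counted as removed when the API clearly reports the entry as missing, not when a query fails because of throttling or a temporary ban.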

I'll post an analysis of my findings soon enough. Watch this space.

Edit: A first notable thing I noticed is that they fixed the extended APIs for movie entries - you no longer get the GOGBear when including downloads as a field, and now it actually returns the right info!
Post edited March 11, 2017 by WinterSnowfall
Do you ever feel like you're rifling through GOG's underwear drawer, counting the flowers on their panties?
yogsloth: Do you ever feel like you're rifling through GOG's underwear drawer, counting the flowers on their panties?
No, it mostly feels like peeking through the keyholes of all the locked doors on their premises.
low rated
yogsloth: Do you ever feel like you're rifling through GOG's underwear drawer, counting the flowers on their panties?
wtf? why you talk to him! he's ROMANIAN! don't you americans have that proverb "if you got nothing good to say then shut up"?
yogsloth: Do you ever feel like you're rifling through GOG's underwear drawer, counting the flowers on their panties?
WinterSnowfall: No, it mostly feels like peeking through the keyholes of all the locked doors on their premises.
Might your enterprise be able to help with this?

https://www.gog.com/forum/general/lets_try_to_make_a_list_of_incorrect_game_forum_links_from_library_game_entries

Apparently MaGog doesn't have the data, but I figured your crawls might have shown up all the games with bad forum links that show General Discussion instead of the actual forums.
adaliabooks: Might your enterprise be able to help with this?
While I do collect data related to links as part of the bulk response payload that I store along with a product entry, as of now I don't nicely parse the "links" subfields, so any direct SQL magic in this regard is out of the question.

But I've learned a lot about the ins and outs of the GOG REST APIs and the mess their data structure is in, so now, with a (hopefully) complete set of workarounds, filters and corner cases covered, I'm bound to start a third mapping run soon.

I'll neatly parse all the product links as well - it's not much of an effort to do it, and it might help with other things in the future.

In the meantime, I might be able to put up some queries on my current db to extract the data you are looking for. You'll hear from me when I've made progress on that front ;).
Thanks for all of your hard work. ;)
mm324: Thanks for all of your hard work. ;)
Hehe, you're welcome, but it's mostly coding fun :). What I do for a living is hard work.
LOL
I've now started a 3rd full products API scan after overhauling most of my scripts.

Among the changes are:
-> Re-defined how movie entries are identified, since GOG has fixed movie entries to include downloadables (I now do a description-based identification, since it seems to be the simplest way)
-> Save and include link information as separate DB fields
-> Better company match for authors/publishers - I will do an upper case match stripping various characters like ':', ',', '/' and '.' as these tend to cause matching issues between what is listed on a product's page and the company values I parse from the gogData variable
-> Update run mode in which existing DB entries are scanned for changes and updated if required (also useful to detect changelog updates as well as incoming releases for yet unreleased games that have a visible API entry)
-> The update run mode can now identify when a previously public API entry is removed/hidden (this can happen, as I've seen with the Homeworld: Deserts of Kharak DLCs that I previously uncovered)
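
The company matching from the list above boils down to something like the following Python sketch. It is an approximation rather than my exact code: `normalize_company` is a hypothetical helper name, and collapsing repeated spaces is an extra assumption I added so that a stripped character does not leave a double space behind.

```python
# Characters that tend to differ between the product page and gogData.
_STRIP_TABLE = str.maketrans("", "", ":,/.")


def normalize_company(name: str) -> str:
    """Upper-case a company name and strip ':', ',', '/' and '.'."""
    stripped = name.upper().translate(_STRIP_TABLE)
    # Assumption: collapse whitespace so the two spellings compare equal.
    return " ".join(stripped.split())


def same_company(page_value: str, gogdata_value: str) -> bool:
    """Compare two company strings after normalization."""
    return normalize_company(page_value) == normalize_company(gogdata_value)


match = same_company("Foo, Inc.", "FOO INC")
```

The company names in the example are invented; the point is only that both sides go through the same normalization before being compared.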

As always, I'll keep you guys posted if anything interesting shows up.
Post edited March 23, 2017 by WinterSnowfall
In case anyone was wondering, this is what a temporary ban looks like.
Attachments: