eiii: A few lines of Python do the job for me:
Unless you are, for whatever reason, averse to using extra libraries, I would suggest using BeautifulSoup for handling HTML in Python.
Maighstir: Unless you are, for whatever reason, averse to using extra libraries, I would suggest using BeautifulSoup for handling HTML in Python.
He's already using a couple - requests and lxml. The latter supports using BS as the parser backend. However, its own parser is faster, so unless it's really broken HTML, he's likely better off sticking with it. At least in my own experience using both libraries.
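A minimal sketch of that approach, assuming lxml is installed and that BeautifulSoup (bs4) is available for the fallback path: parse with lxml's own parser first and only hand the text to `lxml.html.soupparser` if lxml refuses it. The helper name `parse_html` and the sample markup are made up for illustration and are not taken from eiii's script.

```python
from lxml import etree, html


def parse_html(text):
    """Parse with lxml's own parser; fall back to BeautifulSoup only on failure."""
    try:
        return html.fromstring(text)
    except etree.ParserError:
        # Lazy import: BeautifulSoup (bs4) is only needed on this path.
        from lxml.html import soupparser
        return soupparser.fromstring(text)


# Hypothetical sample page with an unclosed <p>, which lxml recovers from.
page = "<html><body><p class='title'>Promo<p>unclosed paragraph</body></html>"
root = parse_html(page)
titles = root.xpath("//p[@class='title']/text()")
print(titles)
```

Either parser returns an lxml tree, so the XPath queries in the rest of a script would not need to change if the fallback is ever used.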
hyperagathon: He's already using a couple - requests and lxml. The latter supports using BS as the parser backend. However, its own parser is faster, so unless it's really broken HTML, he's likely better off sticking with it. At least in my own experience using both libraries.
You're right, I didn't notice either of them, only the "import html" from the lxml line, so for whatever insane reason I figured it was Python's bundled html library.
hyperagathon: He's already using a couple - requests and lxml. The latter supports using BS as the parser backend. However, its own parser is faster, so unless it's really broken HTML, he's likely better off sticking with it. At least in my own experience using both libraries.
Thanks for the suggestions! I will have a look at BeautifulSoup, at the latest when GOG breaks the HTML. :)

I more or less searched for Python parsers for the different data formats and pasted together the snippets that looked best suited, and I was quite surprised how short the resulting script is. Speed is not my primary concern: the script is fast, and most of its run time is probably spent on the download. I'm more concerned about robustness, and I'm not really happy with the JavaScript parsing. The "script[0]" lookup and the regex are rather weak hacks. But as far as I understand it, parsing JavaScript more intelligently would require integrating a JS engine. As the script still works on today's promo page, I guess I'll keep the parsing as it is and refine it when it breaks.
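One possible middle ground, short of a JS engine, might look like the sketch below. It assumes the data in the page is assigned to a named variable as a JSON-compatible object literal, which may not hold for the real promo page. Instead of relying on "script[0]", it checks every script text, and instead of matching the whole object with a regex, it only uses the regex to find the assignment and lets `json.JSONDecoder.raw_decode` read exactly one value from there. The names `extract_js_object`, `find_in_scripts` and `promoData` are hypothetical.

```python
import json
import re


def extract_js_object(script_text, var_name):
    """Return the JSON-compatible value assigned to var_name, or None."""
    # Anchor on the assignment instead of matching the whole object with a regex.
    match = re.search(r"\b%s\s*=\s*" % re.escape(var_name), script_text)
    if match is None:
        return None
    try:
        # raw_decode parses exactly one JSON value and ignores what follows,
        # so nested braces and the trailing ";" need no special handling.
        value, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    except json.JSONDecodeError:
        return None
    return value


def find_in_scripts(script_texts, var_name):
    """Check every script text instead of assuming the data is in the first one."""
    for text in script_texts:
        value = extract_js_object(text or "", var_name)
        if value is not None:
            return value
    return None


# Hypothetical script contents, standing in for the text of the <script> tags.
scripts = [
    "var unrelated = 1;",
    'var promoData = {"title": "Sale {50%}", "games": [{"id": 7}]}; init();',
]
data = find_in_scripts(scripts, "promoData")
print(data["games"][0]["id"])
```

This still breaks if the literal uses single quotes, unquoted keys or trailing commas, since those are valid JavaScript but not valid JSON. In that case it returns None rather than wrong data, which at least makes the breakage easy to notice.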