adaliabooks: Yeah, it isn't the most efficient way but it's the best I could come up with.
That being said, you've given me an idea. I should be able to just run the search once a day over all the forums and archive the results in a database somewhere, and then search on that instead.. which should be a lot quicker and less resource intensive. I'll look into this.

[...]
Not sure I follow, what about new posts in threads?
HypersomniacLive: Not sure I follow, what about new posts in threads?
I'd have to have a good think about how to do it. If I scanned the forums a couple of times a day, it would at least work on older posts (which is mainly what you need a search for anyway). I suppose you could combine it with a scan of the actual live forum too, to catch new posts, but just do the first few pages of the forum or the last few pages of a thread...

I have to make the actual thread search work first... not sure why but it's not proving as easy.
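
A rough sketch of the "archive plus small live scan" idea described above, written in Python purely for illustration (the actual script is a userscript). The function name, the post ids and the dictionary layout are all assumptions, not anything taken from the script.

```python
def combined_search(term, archived_posts, live_posts):
    """Search a stored archive together with a small live scan.

    Both arguments map a post id to that post's text (a hypothetical layout).
    `archived_posts` stands for the periodic scan kept in a database;
    `live_posts` stands for the few pages fetched at search time.
    """
    merged = dict(archived_posts)   # start from the older, archived copy
    merged.update(live_posts)       # live data adds new posts and overrides edits
    needle = term.casefold()
    # Return matching post ids in ascending order.
    return sorted(pid for pid, text in merged.items() if needle in text.casefold())
```

Letting the live scan override the archive means an edited post is matched on its current text, while anything older than the live pages still comes from the stored copy.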
adaliabooks: I'd have to have a good think about how to do it [...]
I have faith in you, you will succeed. :-)
HypersomniacLive: I have faith in you, you will succeed. :-)
Thank you :)

It's working better now, just missed some elements that are different between the threads and the forum page.

Problem is now how to sort the threads, as they don't have numbers like the posts do... I'll have to find a way to sort them by date, which will require a little ingenuity I think due to the fact that some have X mins / hours ago instead of a date...
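
One possible way to handle the mixed date labels mentioned above is to turn every label into a real timestamp before sorting. This is only a sketch in Python for illustration: the label shapes ("5 mins ago", "2 hours ago") follow the description in the post, while the absolute date format, the `last_post` key and the function names are assumptions about markup not shown here.

```python
import re
from datetime import datetime, timedelta

# Assumed label shapes: "5 mins ago", "2 hours ago", or an absolute date
# such as "May 17, 2015". The real forum markup may differ.
RELATIVE = re.compile(r"(\d+)\s*(min|hour)s?\s+ago", re.IGNORECASE)
UNIT_SECONDS = {"min": 60, "hour": 3600}

def parse_timestamp(label, now, date_format="%B %d, %Y"):
    """Turn a thread's date label into a datetime that can be compared."""
    label = label.strip()
    match = RELATIVE.fullmatch(label)
    if match:
        amount = int(match.group(1))
        unit = match.group(2).lower()
        return now - timedelta(seconds=amount * UNIT_SECONDS[unit])
    # Anything that is not relative is treated as an absolute date.
    return datetime.strptime(label, date_format)

def sort_threads(threads, now):
    """Return threads newest first; each thread is a dict with a 'last_post' label."""
    return sorted(
        threads,
        key=lambda thread: parse_timestamp(thread["last_post"], now),
        reverse=True,
    )
```

Passing `now` in explicitly keeps the relative labels consistent within one search run, since they are only meaningful relative to the moment the page was fetched.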
adaliabooks:
Great work, adalia. Found this only just now.

I began thinking of writing a forum search engine myself, but MaGog takes too much time already, so I'm glad you took it upon yourself.

From my experience, the most important part of a search engine is a name. Gog Search is.... hmmm... ok.

I was planning to call my forum search engine "DemaGogue" or "DemaGog". If you want, you can use it. Or not.

Your idea of scanning the whole forum once and then the deltas is the same direction I was thinking of going in, but:

1) You will need a very large server to hold all the posts (i.e. expensive). You might want to restrict what you keep in the database (e.g. not all the text, but perhaps only the title, the text of the first post, the names and dates of posters).

2) You will quickly run across GOG's IP block for accessing too much info. You might want to discuss it with GOG before attempting the data collection, asking them to have your server's IP whitelisted, because you're bound to hit the limit very quickly. Note that I was unable to get them to whitelist MaGog, but I am not very good at asking favours.
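
To make point 1 concrete, here is a minimal sketch of a trimmed-down index that keeps only the title, the first post's text, and the names and dates of posters. It uses Python's built-in `sqlite3` module for illustration; the table names, column names and helper functions are all hypothetical and not part of any existing script.

```python
import sqlite3

def create_index(conn):
    """Create the two small tables: one row per thread, one row per post."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS threads (
            url        TEXT PRIMARY KEY,
            title      TEXT NOT NULL,
            first_post TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS posts (
            thread_url TEXT NOT NULL REFERENCES threads(url),
            poster     TEXT NOT NULL,
            posted_on  TEXT NOT NULL
        );
    """)

def add_thread(conn, url, title, first_post, posts):
    """Store or refresh one thread; `posts` is a list of (poster, date) pairs."""
    conn.execute("INSERT OR REPLACE INTO threads VALUES (?, ?, ?)",
                 (url, title, first_post))
    conn.execute("DELETE FROM posts WHERE thread_url = ?", (url,))
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(url, poster, posted_on) for poster, posted_on in posts])

def search(conn, term):
    """Return URLs of threads whose title, first post or posters match `term`."""
    pattern = f"%{term}%"  # note: '%' and '_' inside `term` are not escaped here
    rows = conn.execute("""
        SELECT DISTINCT t.url
        FROM threads t LEFT JOIN posts p ON p.thread_url = t.url
        WHERE t.title LIKE ? OR t.first_post LIKE ? OR p.poster LIKE ?
        ORDER BY t.url
    """, (pattern, pattern, pattern))
    return [row[0] for row in rows]
```

The trade-off is the one raised in the list: text beyond the first post is not searchable, but each additional post costs only a name and a date rather than its full body.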

Good luck.

Favourited, of course.
Post edited May 17, 2015 by mrkgnao
mrkgnao: Great work, adalia. Found this only just now. [...]
Thanks for the tips, I was going to message you eventually about some stuff. Although I can't remember what now... :/
I know one of the things was about links for the forums, as if I do go down the scanning route I'll need to get them from somewhere for all the subforums... (although, that being said, most of them have fewer than 20 pages, so maybe the archive of posts would only be required for General Discussion and I could just let it search live for the game forums).

I hadn't even considered a name... I just made the original script for the mafia games, and as people found it useful and the forum search is awful I thought I would expand it. I'll certainly put some more thought into it.. ;)

Yeah, I was just thinking of dumping the whole lot into an SQL database... but I've never been great at working out memory usage, and thinking about the size of some of the threads, it would certainly need a lot of space, much more than my hosting package is likely to have. But if I don't store all the text then you can't search the whole of a forum / thread, which somewhat defeats the purpose... I'll see if I can think of a way to make it work.

I think I may already have hit the block while running tests. I'll try and speak to someone, but if they won't do it for you with MaGog, I doubt they would for me either.
adaliabooks: Thanks for the tips, I was going to message you eventually about some stuff. [...]
I'm always here to answer questions.

If you're serious about it, you should consider a VPS (Virtual Private Server) rather than a web hosting server. Web hosting is good for static pages, but for a search engine you need response time, which can be extremely variable for regular web hosting.

In case you don't know it yet, the IP block lasts around 18 hours.
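
Since the block lasts that long, a scanner would probably want to pace its own requests. Below is a minimal pacing sketch in Python, for illustration only. The threshold that triggers the block is not stated anywhere in this thread, so `min_interval` is a placeholder to be tuned, and the `Throttle` class itself is hypothetical.

```python
import time

class Throttle:
    """Space out calls so that at least `min_interval` seconds separate them."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # placeholder value, not a known limit
        self._clock = clock               # injectable so the class can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each page fetch keeps the request rate bounded; it does not guarantee staying under the limit, since that limit is unknown here.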
mrkgnao: I'm always here to answer questions. [...]
That's probably not what happened to me then, I just couldn't access the forum without errors, but clearing my cookies etc. seemed to work and I could get back in.

I was just thinking of putting it on my current website's hosting... but I will admit to being mostly clueless when it comes to web stuff... not sure I'm in a position to put any kind of money into this right now though, so that might have to do..

And thank you :)
adaliabooks: That's probably not what happened to me then, I just couldn't access the forum without errors, but clearing my cookies etc. seemed to work and I could get back in. [...]
Yes. Better start slow. I did 4-5 years of web hosting before moving to a VPS.
j0ekerr: But you really need to include a "no results found for your search" message.
Yes. This is a first priority, as it is impossible to know when the script is done if it finds nothing.
mrkgnao: Yes. This is a first priority, as it is impossible to know when the script is done if it finds nothing.
I missed j0ekerr's edit, that's an excellent point. I'll do that this afternoon, hopefully the first part of the forum search will be ready by then too.
+1 and topic favourited! I was waiting for such a feature for a long time, the default search function is barely functional :)
Right, first major update is here!

You can now search for threads on the forum pages, by title and by username of the thread creator (searching based on the contents of threads is going to have to wait, as it will require a lot of AJAX calls if I can't think of a better way to do it).

As suggested it now also displays a message if no results were found.
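
For anyone curious what that behaviour amounts to, here is a small Python sketch of a title and username search with an explicit "no results" message. It is an illustration only, not the script's actual code: the dictionary keys, the function names and the exact message wording are assumptions.

```python
NO_RESULTS = "No results found for your search."

def search_threads(threads, term, include_usernames=True):
    """Match `term` against thread titles and, optionally, thread creators.

    Each thread is a dict with 'title' and 'creator' keys (hypothetical layout).
    """
    needle = term.casefold()
    hits = []
    for thread in threads:
        in_title = needle in thread["title"].casefold()
        in_creator = include_usernames and needle in thread["creator"].casefold()
        if in_title or in_creator:
            hits.append(thread)
    return hits

def render(hits):
    """Format results, or say so explicitly when there are none."""
    if not hits:
        return NO_RESULTS
    return "\n".join(f'{t["title"]} (started by {t["creator"]})' for t in hits)
```

Returning the message as a normal result means an empty search ends visibly, rather than leaving the user unsure whether the script has finished.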

Let me know what you think, or if you have any problems / find any bugs.

Edit: Whoops, script breaking bug. Fixed, should be ok now. Just update again if you already did.
Post edited May 17, 2015 by adaliabooks
Any idea why "GreaseMonkey menu => Manage User Scripts => Right click GOG Search => Find Updates" does not work?
Had to go to the OP link and click "raw" again. Is that OK?

Better list somewhere what the current version (0.3) is.

=========================================================

Something doesn't seem to work right.

Searched "mrkgnao" in pages 1-10 of the general forum and got only one thread:
- The Potential GOG T-Shirt Giveaway (May 6)

Whereas there are definitely more threads of mine that have seen activity in the last 11 days (e.g. MaGog, public wishlists, price updates).

=========================================================

Recommend setting the "usernames" checkbox to be on by default.
Post edited May 18, 2015 by mrkgnao
mrkgnao: Any idea why "GreaseMonkey menu => Manage User Scripts => Right click GOG Search => Find Updates" does not work?
Each update changes the raw url. GreaseMonkey assumes the url of the script will always be the same, and since GitHub Gist uses unique urls for version control, GreaseMonkey doesn't find any updates.