Spiders
Revision as of 12:16, 16 February 2015 by Freephile
Spiders, in this context, are programs that crawl the web and index what they find, so you might also call them crawlers or indexers.
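To make the idea concrete, here is a minimal sketch of what a spider does: fetch a page, index its words, then follow its links. It was written fresh for illustration, uses only the Python standard library, and is not taken from any tool mentioned on this page. The names `LinkAndTextParser`, `crawl`, `FAKE_WEB` and the `example.test` URLs are all hypothetical, and the `fetch` argument is an assumption that lets the sketch run against an in-memory "web" instead of the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collect href targets and text from one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Remember every non-empty href found on an <a> tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Simplification: all text nodes are indexed, lowercased and
        # split on whitespace (script and style text is not filtered out).
        self.words.extend(data.lower().split())


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl. Returns {word: set of URLs containing it}.

    `fetch` is any callable that maps a URL to HTML text, or None
    when the page cannot be retrieved.
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages += 1
        parser = LinkAndTextParser()
        parser.feed(html)
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# A fake two-page "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://example.test/": '<a href="/about">About</a> spiders index',
    "http://example.test/about": "about spiders",
}

index = crawl("http://example.test/", FAKE_WEB.get)
```

Looking up `index["spiders"]` then gives the set of pages that mention the word. A real spider would swap `FAKE_WEB.get` for an HTTP fetch, and should also respect robots.txt and throttle its requests, which this sketch leaves out.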
A long time ago I wrote a spider. If I ever get around to digging up that old code, this page is where I will put it. Plenty of other people are building interesting spiders that you can use.
Portia, from the folks at Scrapinghub in Cork, Ireland, is one example. I won't needlessly repeat their documentation here, but I will note my experiences with the tool. I wanted to scrape questions and answers from OKCupid, but so far Portia can't handle the JavaScript login. I need to pick that login flow apart further to find out what the solution might be.