Difference between revisions of "Spiders"

From Freephile Wiki

Revision as of 13:16, 16 February 2015

Spiders, in this context, are programs that crawl the web, following links from page to page and indexing what they find. So you might also call them crawlers or indexers.
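To make the idea concrete, here is a minimal sketch in Python of what a spider does: fetch a page, index its words, and queue up the links it finds. It is only an illustration, not how any particular tool works. The names `crawl`, `fetch`, and `LinkAndTextParser` are made up for this example, and `fetch` is assumed to be any function that takes a URL and returns that page's HTML, or `None` if the page can't be fetched.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import re


class LinkAndTextParser(HTMLParser):
    """Collects link targets and text from one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Remember the href of every <a> tag so the crawler can follow it.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Naive: this also picks up script and style text, fine for a sketch.
        self.text.append(data)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl from start_url; returns {word: set of URLs}.

    `fetch` is an assumed callable: URL in, HTML string (or None) out.
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited += 1
        parser = LinkAndTextParser()
        parser.feed(html)
        # Index every lowercase word on the page against this URL.
        for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
            index.setdefault(word, set()).add(url)
        # Resolve relative links, drop #fragments, and queue unseen pages.
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Passing the fetcher in as a function keeps the sketch testable without touching the network: a dictionary's `.get` method works as a stand-in for a real site. For a real crawl, `fetch` could wrap `urllib.request.urlopen`, and a polite spider would also honor `robots.txt` and limit its request rate.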

A long time ago I wrote a spider. If I ever get around to digging up that old code, this page is where it will go. In the meantime, lots of other people are making interesting spiders that you can use.

Portia is one example, from the folks at Scrapinghub over in Cork, Ireland. I won't needlessly repeat their documentation here, but I will note my experiences with the tool. I wanted to scrape questions and answers from OkCupid, but so far Portia can't handle the JavaScript login. I need to take that login apart further to find out what the solution might be.