How to write this crawler in JavaScript?

The idea is very simple:

Imagine a simple white page with a form containing a single input tag (like the Google homepage). When I enter the link to a blog post in this form, the JavaScript crawler searches for the first image on the blog post's web page (through Ajax), shows it on the white page, and saves it on my server.

This crawler would work like the link previews on Digg and the Facebook wall.

Which functions do I have to use for this crawler?

Answer

Due to cross-domain restrictions (the browser's same-origin policy), pure JavaScript crawlers are uncommon and rarely feasible in practice. You will most likely need to set up a server-side script that receives the address entered in the form, fetches the contents of the remote resource, and parses the HTML to obtain the images.
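
As a rough illustration of the server-side approach, here is a minimal sketch in JavaScript for Node.js. It assumes a runtime with a global `fetch` (Node.js 18 or later). The names `findFirstImageUrl` and `fetchFirstImage` are hypothetical and chosen only for this example. The regular expression is a simplification, and a proper HTML parser would be more robust on real pages.

```javascript
// Minimal sketch: find the first <img> in an HTML string and resolve its URL.
// findFirstImageUrl and fetchFirstImage are hypothetical names for illustration.

function findFirstImageUrl(html, pageUrl) {
  // Match the first <img> tag that carries a quoted src attribute.
  // A regular expression is a simplification; a real HTML parser is more robust.
  const match = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/i.exec(html);
  if (!match) return null;
  // Resolve relative paths (e.g. "/pics/a.png") against the page address.
  return new URL(match[2], pageUrl).href;
}

async function fetchFirstImage(pageUrl) {
  // This runs on the server, so the browser's cross-domain restrictions do not apply.
  // Assumes a global fetch is available (Node.js 18 or later).
  const response = await fetch(pageUrl);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return findFirstImageUrl(await response.text(), pageUrl);
}
```

The page with the form could then send the entered address to an endpoint on your own server via Ajax, and that endpoint could call something like `fetchFirstImage` and return the image URL. Downloading and saving the image on the server would be a separate step that is not shown here.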
