Assuming I have an Amazon product URL like this:
http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C/ref=amb_link_86123711_2?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-1&pf_rd_r=0AY9N5GXRYHCADJP5P0V&pf_rd_t=101&pf_rd_p=500528151&pf_rd_i=507846
How could I extract just the ASIN using JavaScript? Thanks!
Answer
Amazon’s detail pages can have several forms, so to be thorough you should check for them all. These are all equivalent:
http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C
http://www.amazon.com/dp/B0015T963C
http://www.amazon.com/gp/product/B0015T963C
http://www.amazon.com/gp/product/glance/B0015T963C
They always take one of these two forms:
http://www.amazon.com/<SEO STRING>/dp/<VIEW>/ASIN
http://www.amazon.com/gp/product/<VIEW>/ASIN
This should do it:
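var url = "http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C";
// Use a regex literal: inside a plain string, "\w" collapses to "w" and the pattern breaks.
// Dots are escaped so they match a literal "." only.
var regex = /http:\/\/www\.amazon\.com\/([\w-]+\/)?(dp|gp\/product)\/(\w+\/)?(\w{10})/;
var m = url.match(regex);
if (m) {
  // Group 4 holds the 10-character ASIN.
  alert("ASIN=" + m[4]);
}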
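If you need this in more than one place, the same pattern can be wrapped in a small function. The sketch below is only one way to do it: the name `extractAsin` is hypothetical, and accepting `https` and requiring that the ASIN be followed by `/`, `?`, `#` or the end of the URL are assumptions added here, not part of the snippet above.

```javascript
// Hypothetical helper: returns the ASIN from an Amazon detail-page URL,
// or null when the URL does not match one of the forms listed above.
function extractAsin(url) {
  // Optional SEO slug, then "dp" or "gp/product", an optional view segment,
  // then a 10-character ASIN ending at "/", "?", "#" or the end of the URL.
  // Assumption: both http and https are accepted.
  var regex = /^https?:\/\/www\.amazon\.com\/(?:[\w-]+\/)?(?:dp|gp\/product)\/(?:\w+\/)?(\w{10})(?:[\/?#]|$)/;
  var m = url.match(regex);
  return m ? m[1] : null;
}

// The four equivalent forms from the answer.
var samples = [
  "http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C",
  "http://www.amazon.com/dp/B0015T963C",
  "http://www.amazon.com/gp/product/B0015T963C",
  "http://www.amazon.com/gp/product/glance/B0015T963C"
];

samples.forEach(function (u) {
  console.log(extractAsin(u)); // prints "B0015T963C" for each sample
});
```

Returning `null` instead of alerting lets the caller decide what to do with a URL that is not a product page.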