I have used the Jsoup library to fetch metadata from a URL:
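import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

Document doc = Jsoup.connect("http://www.google.com").get();

// Read the keywords and description <meta> tags (first() returns null if the tag is missing)
String keywords = doc.select("meta[name=keywords]").first().attr("content");
String description = doc.select("meta[name=description]").get(0).attr("content");

// Select images whose src contains a common image extension (case-insensitive);
// the backslash must be doubled inside a Java string literal
Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");
String src = images.get(0).attr("src");

System.out.println("Meta keyword : " + keywords);
System.out.println("Meta description : " + description);
System.out.println("Meta image URL : " + src);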
But I want to do this on the client side using JavaScript.
Answer
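You can't do it purely on the client side because of the cross-origin restriction: the browser's same-origin policy stops a script from reading a page on another origin unless that server opts in via CORS. You need a server-side script to get the content of the page.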
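As a rough illustration of the server-side option, here is a minimal sketch in Node.js. It assumes Node 18 or later (for the global `fetch`), and the `/meta?url=...` route, `parseTarget`, and `createProxyServer` are hypothetical names chosen for this example. It only relays the page's HTML; a real deployment would also need limits on which hosts may be fetched.

```javascript
// Minimal sketch of a server-side relay (assumes Node.js 18+ for global fetch).
const http = require('http');

// Hypothetical helper: read the target from "/meta?url=..." and allow only http(s).
function parseTarget(requestUrl) {
  const { searchParams } = new URL(requestUrl, 'http://localhost');
  const raw = searchParams.get('url');
  if (!raw) return null;
  try {
    const target = new URL(raw);
    return ['http:', 'https:'].includes(target.protocol) ? target.href : null;
  } catch (err) {
    return null; // not a valid absolute URL
  }
}

// Hypothetical factory: builds the server but does not start listening.
function createProxyServer() {
  return http.createServer(async (req, res) => {
    const target = parseTarget(req.url);
    if (!target) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Missing or invalid "url" parameter');
      return;
    }
    try {
      // The server is not bound by the browser's same-origin policy.
      const upstream = await fetch(target);
      const html = await upstream.text();
      res.writeHead(200, {
        'Content-Type': 'text/html; charset=utf-8',
        // Only needed when the calling page is served from another origin.
        'Access-Control-Allow-Origin': '*',
      });
      res.end(html);
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end('Could not fetch the target page');
    }
  });
}

// To run it: createProxyServer().listen(3000);
```

The client would then request `/meta?url=` followed by the URL-encoded page address from this server instead of contacting the remote site directly.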
Or you can use YQL (https://policies.yahoo.com/us/en/yahoo/terms/product-atos/yql/index.htm). In this case, YQL is used as a proxy.
Or you can use https://cors-anywhere.herokuapp.com. In this case, cors-anywhere is used as a proxy.
For example:
$('button').click(function() {
  $.ajax({
    // Route the request through the cors-anywhere proxy
    url: 'https://cors-anywhere.herokuapp.com/' + $('input').val()
  }).then(function(data) {
    // Parse the returned markup; <meta> tags end up as top-level elements
    var html = $(data);

    // Write plain text so markup from the remote page is not injected
    $('#kw').text(getMetaContent(html, 'keywords') || 'no keywords found');
    $('#des').text(getMetaContent(html, 'description') || 'no description found');
    $('#img').text(html.find('img').attr('src') || 'no image found');
  });
});

// Return the content attribute of the <meta> tag with the given name
function getMetaContent(html, name) {
  return html.filter(
    (index, tag) => tag && tag.name && tag.name == name).attr('content');
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

<input type="text" placeholder="Type URL here" value="http://www.html5rocks.com/en/tutorials/cors/" />
<button>Get Meta Data</button>

<pre>
<div>Meta Keyword: <div id="kw"></div></div>
<div>Description: <div id="des"></div></div>
<div>image: <div id="img"></div></div>
</pre>
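The jQuery example above turns the response into elements with `$(data)`. As a possible alternative, the sketch below reads the same three values from a document built by the browser's `DOMParser`, which parses the markup without running the page's scripts. The function names `getMeta`, `extractMetadata`, and `parseHtml` are hypothetical, and `parseHtml` only works in a browser.

```javascript
// Read one <meta name="..."> value from a parsed document; null when absent.
function getMeta(doc, name) {
  const tag = doc.querySelector('meta[name="' + name + '"]');
  return tag ? tag.getAttribute('content') : null;
}

// Collect keywords, description, and the first image source from a document.
function extractMetadata(doc) {
  const img = doc.querySelector('img[src]');
  return {
    keywords: getMeta(doc, 'keywords'),
    description: getMeta(doc, 'description'),
    image: img ? img.getAttribute('src') : null,
  };
}

// Browser-only step: build a document from the HTML string returned by the proxy.
function parseHtml(htmlText) {
  return new DOMParser().parseFromString(htmlText, 'text/html');
}
```

In the `.then` callback this could be used as `var meta = extractMetadata(parseHtml(data));`, after which `meta.keywords`, `meta.description`, and `meta.image` are either strings or `null`.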