@klebinhopk
Created June 21, 2018 18:02
Google news scraper
const request = require('request');
const cheerio = require('cheerio');

const searchTerm = 'tech';
// Encode the term so spaces and special characters survive in the query string
const searchUrl = 'https://www.google.com/search?q=' + encodeURIComponent(searchTerm) + '&tbm=nws';
const savedData = [];

request(searchUrl, function(err, response, html) {
  // First we'll check to make sure no errors occurred when making the request
  if (err) {
    // This script has no Express `res` object, so log the error and stop
    console.error(err);
    return;
  }
  const $ = cheerio.load(html);
  // For each outer div with class g, parse the desired data
  $('div.g').each(function(i, element) {
    const title = $(this).find('.r').text();
    // attr('href') is undefined when a result has no anchor, so guard before
    // stripping Google's "/url?q=" redirect prefix and trailing parameters
    const href = $(this).find('.r').find('a').attr('href');
    const link = href ? href.replace('/url?q=', '').split('&')[0] : null;
    const text = $(this).find('.st').text();
    const img = $(this).find('img.th').attr('src');
    savedData.push({
      title: title,
      link: link,
      text: text,
      img: img
    });
  });
  console.log(JSON.stringify(savedData));
});
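
The link cleanup in the scraper strips the `/url?q=` prefix and cuts the string at the first `&`, which leaves the target URL percent-encoded. A minimal sketch of an alternative, assuming result links have the form `/url?q=<target>&...`: the helper name `extractTargetUrl` is hypothetical and not part of the original gist, and it relies only on the `URLSearchParams` global built into Node.js.

```javascript
// Hypothetical helper (not in the original gist): pull the real target URL
// out of a Google redirect link such as "/url?q=https://example.com/&sa=U".
function extractTargetUrl(href) {
  // Results without an anchor yield undefined from attr('href')
  if (typeof href !== 'string' || href.length === 0) {
    return null;
  }
  // Assumption: anything not starting with "/url?" is already a direct link
  if (!href.startsWith('/url?')) {
    return href;
  }
  // URLSearchParams splits the query string and percent-decodes each value
  const params = new URLSearchParams(href.slice('/url?'.length));
  return params.get('q') || null;
}

console.log(extractTargetUrl('/url?q=https://example.com/a%3Fb%3D1&sa=U'));
```

Inside the `each` callback this could replace the `replace(...).split('&')[0]` chain, with the `null` return covering results that have no link.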