Hello again! If you missed how to create the selector for the HTML data, please check out part one of the series. Otherwise, let us proceed with parsing the data automatically!
We begin by creating a new project. Open your terminal and enter the following to create a Node.js project.
mkdir web-data-retrieve && cd web-data-retrieve && npm init
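After that, let us add got (an HTTP request client) and cheerio (a jQuery equivalent for Node.js). The former will be used to request the content and the latter to parse it. Because the code below loads its libraries with require, install got version 11; from version 12 on, got is published as an ES module only.
npm install --save cheerio got@11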
Now, let us create an index file which will contain our code.
touch index.js
Open up the index.js file and add the following code inside:
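const cheerio = require('cheerio');
const got = require('got');

const url = 'https://www.marketwatch.com/investing/stock/aapl/financials';

(async () => {
  try {
    // Request the page and load its HTML into cheerio
    const response = await got(url);
    const $ = cheerio.load(response.body);
    // Apply the selector from part one
    const selected = $('.financials tr.partialSum:nth-child(1) td.valueCell');
    // Convert the selection to a plain array and read each cell's text
    const cellData = selected.toArray().map(cell => cell.firstChild.data);
    console.log(cellData);
  } catch (error) {
    // error.response only exists when the server actually sent a response
    console.log(error.response ? error.response.body : error.message);
  }
})();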
Now let me explain what is going on here, top to bottom. The first two lines are the libraries we will be using:
const cheerio = require('cheerio');
const got = require('got');
The next line is the URL from which we will be getting the content; it is identical to the one used in the browser.
const url = 'https://www.marketwatch.com/investing/stock/aapl/financials';
The next line is a bit more complex. It is a wrapper, an async function that is called immediately, which lets us use await inside it. A full explanation is better left to another tutorial :). The more interesting part happens inside of it!
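If the wrapper looks unfamiliar, here is a minimal sketch of the same pattern in isolation. The await keyword only works inside an async function, so the script defines an anonymous async function and calls it right away. The fakeRequest helper is hypothetical and only stands in for got, so the snippet runs without a network connection.

```javascript
// Hypothetical stand-in for got(url): resolves with a response-like object.
function fakeRequest(url) {
  return Promise.resolve({ body: `<html>content of ${url}</html>` });
}

// An async function always returns a promise, so callers can await it.
async function main() {
  const response = await fakeRequest('https://example.com');
  return response.body.length;
}

// The immediately invoked form used in index.js:
// define the async function, then call it on the spot.
(async () => {
  const length = await main();
  console.log(`Received ${length} characters`);
})();
```

Without the wrapper, a top-level await in a CommonJS file like index.js would be a syntax error.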
The first line calls the got library with the URL that was defined at the beginning of our file. Its response is saved in the response variable.
const response = await got(url);
After that, the response body is loaded into cheerio. This gives us a query function, $, similar to the one we used with jQuery in the browser.
const $ = cheerio.load(response.body);
Next we use the selector that we created before in the browser. It returns the selected elements.
const selected = $('.financials tr.partialSum:nth-child(1) td.valueCell');
The last magical line in the try block converts the selection to a standard JS array, which allows us to map over it. In the map we take the first child of each cell, which is its text node, and read its data property.
const cellData = selected.toArray().map(cell => cell.firstChild.data);
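To see why firstChild.data yields the text, it helps to look at the shape of a parsed cell. The sketch below uses hand-written plain objects as stand-ins for the nodes cheerio returns; that is an assumption made so the snippet runs without any dependency. The textOfCells helper is hypothetical and, unlike the map in index.js, guards against empty cells.

```javascript
// Stand-ins for parsed <td> elements: each element node has a firstChild,
// and a text node keeps its content in the `data` property.
const cells = [
  { type: 'tag', name: 'td', firstChild: { type: 'text', data: '231.28B' } },
  { type: 'tag', name: 'td', firstChild: { type: 'text', data: '214.23B' } },
  { type: 'tag', name: 'td', firstChild: null }, // an empty cell has no child
];

// Hypothetical helper: same idea as the map in index.js, but returns null
// instead of throwing when a cell has no first child.
function textOfCells(nodes) {
  return nodes.map((cell) => (cell.firstChild ? cell.firstChild.data : null));
}

console.log(textOfCells(cells));
```

If the page layout changes and a cell comes back empty, a guard like this keeps the script from failing with a TypeError.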
The application will print Apple's sales figures:
# node index.js
[ '231.28B', '214.23B', '228.57B', '265.81B', '259.97B' ]
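The values arrive as strings with a magnitude suffix. If you want to calculate with them, a small conversion step may help. This is only a sketch: the parseAbbreviated helper is hypothetical, and the meaning of the suffixes (K, M, B, T) is an assumption, since only B appears in the output above.

```javascript
// Assumed multipliers for the suffixes; only "B" appears in the output above.
const MULTIPLIERS = { K: 1e3, M: 1e6, B: 1e9, T: 1e12 };

// Hypothetical helper: turns a string such as '231.28B' into a number.
// Returns NaN when the string does not match the expected pattern.
function parseAbbreviated(text) {
  const match = /^(-?\d+(?:\.\d+)?)([KMBT])?$/.exec(text.trim());
  if (!match) return NaN;
  const value = Number(match[1]);
  const multiplier = match[2] ? MULTIPLIERS[match[2]] : 1;
  return value * multiplier;
}

const sales = ['231.28B', '214.23B', '228.57B', '265.81B', '259.97B']
  .map(parseAbbreviated);
console.log(sales);
```

Because these are floating-point numbers, round the results before comparing or displaying them.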
Where to go from here? Well, you can modify the application to retrieve more data about the Apple stock or even fetch different stocks. The possibilities are endless; pick whatever suits your purpose.
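As a starting point for fetching different stocks, the URL could be built from a ticker symbol. The sketch below assumes that other tickers follow the same URL pattern as the AAPL page, which is not guaranteed; the financialsUrl helper and the ticker list are hypothetical.

```javascript
// Assumption: other tickers follow the same URL pattern as the AAPL page.
function financialsUrl(ticker) {
  const symbol = encodeURIComponent(ticker.trim().toLowerCase());
  return `https://www.marketwatch.com/investing/stock/${symbol}/financials`;
}

// Example ticker list; replace it with the stocks you are interested in.
const tickers = ['AAPL', 'MSFT', 'GOOG'];
const urls = tickers.map(financialsUrl);
console.log(urls);
```

Each URL can then be passed to got in turn, reusing the same selector as long as the pages share the same table layout.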
References:
sindresorhus/got: Human-friendly and powerful HTTP request library for Node.js (github.com)
cheerio: Fast, flexible, and lean implementation of core jQuery designed specifically for the server (cheerio.js.org)
jQuery Selectors (www.w3schools.com)
Selectors | jQuery API Documentation (api.jquery.com)
Originally published at <em>https://dev.to</em> on May 4, 2020.