作者: javacomhk 時間: 2022-5-21 22:47 標題: 用 ChromeOS/Linux 開發 Google Apps Script WebApp (WebScrape backend)
本帖最後由 javacomhk 於 2022-5-23 22:15 編輯
前二篇講關於開發 Google Apps Script WebApp,今篇主要是講 Google Apps Script 的 backend timer 功能即是用 javascript 定時去獲取 scrape web data 然後儲存在 Google Spreadsheet 用作 WebApp 的內容。
如果用 node.js 去做 Web Scrape, 通常會用 cheerio.js 或者用 document.querySelectorAll 的方法。Google Apps Script 也可以用 cheerio (見 https://github.com/tani/cheeriogs)
cheerio.js 既原理就係用 JQuery 的 syntax 去個 webpage 度根據 CSS selector 去攞相關嘅 text content。如果用 Chrome Browser 試的 JQuery code, 首先要在 Chrome Browser 裝個 JQuery Inject 嘅 extension, 然後打開個網頁後,打開個 Developer tools 嘅 console 就可以打入段 code 去試。
例如 Chrome Browser 已安裝好 JQuery Inject Extension 後
(1)去到 http://www.aastocks.com/tc/stock ... -con.aspx?index=HSI
(2)Browser 右上角的 Extensions 內開啟 JQuery Inject
(3)在 console 內打入 code 如下
- $("table#tblTS2 > tbody > tr").each((index,element) =>
- { console.log($(element).text());})
[attach]2321345[/attach]
以下就示範如何使用 Google Apps Script 去獲取 AASTOCKS 的 webpage table 內的每日股票價值
(E1) 如果要用 Google Apps Script 使用 JQuery Syntax 的 Webscrape 就要在 Google Apps Script Project 內使用 Cheerio.gs https://github.com/tani/cheeriogs, 安裝方法是在 Google Apps Script Project 內 左邊菜單選擇 "Libraries" 右側的 + 號去 "Add a Library" 輸入 Project Key (1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0) 及最新版本 14 後便可使用。
(E2)然後在 main.js 內加入這 function 代碼內容
- function aastocks_scrape() {
- var webAppSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("AASTOCKS");
- webAppSheet.getRange("A1:E30").clearContent();
- const datestr = (new Date().toLocaleString('sv',{timeZone: 'Asia/Hong_Kong'})).slice(0, 19).replace(/-/g, "").replace(/:/g, "").replace("T", " ");
- var res=UrlFetchApp.fetch("http://www.aastocks.com/tc/stocks/market/index/hk-index-con.aspx?index=HSI");
- var content = res.getContentText();
- const $ = Cheerio.load(content);
- const a = $("table#tblTS2 > tbody > tr").each((index,element) =>
- {
- const tds = $(element).find("td");
- const namecell = $(tds[0]).find("span");
- const codecell = $(tds[0]).find("a");
- console.log("stockcode is "+$(codecell).text());
- console.log("stockname is "+$(namecell).text());
- console.log("price is "+$(tds[1]).text());
- console.log("up/down "+$(tds[2]).text());
- webAppSheet.appendRow([$(codecell).text(), $(namecell).text(), $(tds[1]).text(), $(tds[2]).text(), datestr]);
- }
- );
- SpreadsheetApp.flush();
- }
[attach]2321346[/attach]
(E4)要令到 Google Apps Script WebApp 每日定時去獲取股票價值,就要在左邊菜單新增 Triggers
[attach]2321347[/attach]
Trigger 的設定如下,頻率通常不要過密,否則很容易比人block。例如每日下午收市時便可以。
[attach]2321348[/attach]
(E5)測試成功後
[attach]2321350[/attach]
留位
作者: javacomhk 時間: 2022-5-21 22:47
本帖最後由 javacomhk 於 2022-5-23 22:18 編輯
Webscrape 完就可以將個 stock portfolio 輸入 Google spreadsheet 及計算盈利
(E5)在 Google Spreadsheet 新增 worksheet 叫 Portfolio 如下內容,其中 Column F 就是取自每天 Webscrape 的股票價格,其 F2 formula 是 =vlookup($A2,AASTOCKS!$A$1:$E$30,3)
[attach]2321477[/attach]
(E6)新增一個 google doc 例如下圖 其中{{ variable }},就會在 mail merge 時被取替
[attach]2321478[/attach]
(E7) 新增 function sendPortfolioEmail() 在 main.gs 內
- function sendPortfolioEmail(){
- // fill in google doc ID
- var id = 'FILL IN DOC ID HERE' ;
- var forDriveScope = DriveApp.getStorageUsed(); //needed to get Drive Scope requested
- var url = "https://docs.google.com/feeds/download/documents/export/Export?id="+id+"&exportFormat=html";
- var param = {
- method : "get",
- headers : {"Authorization": "Bearer " + ScriptApp.getOAuthToken()},
- muteHttpExceptions:true,
- };
- var html = UrlFetchApp.fetch(url,param).getContentText();
- var htmlout1 = html;
- // email send out with replaced fields
- var portfolioSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Portfolio");
- const result = portfolioSheet.getDataRange().getValues();
- const numRows = result.length
-
- var TODAYPL = 0;
- var TOTALPL = 0;
- for (var i = 1; i < numRows; i++) {
- htmlout1 = htmlout1.replace("{{R"+i+"C1}}",result[i][0].toString().substring(0,5));
- htmlout1 = htmlout1.replace("{{R"+i+"C2}}",result[i][1]);
- htmlout1 = htmlout1.replace("{{R"+i+"C3}}",Utilities.formatString( '%.2f',result[i][4]));
- htmlout1 = htmlout1.replace("{{R"+i+"C4}}",Utilities.formatString( '%.2f',result[i][5]));
- htmlout1 = htmlout1.replace("{{R"+i+"C5}}", result[i][6].toLocaleString('en-US', { style: 'currency', currency: 'USD' }));
- htmlout1 = htmlout1.replace("{{R"+i+"C6}}", result[i][7].toLocaleString('en-US', { style: 'currency', currency: 'USD' }));
- htmlout1 = htmlout1.replace("{{R"+i+"C7}}", result[i][8]);
- TODAYPL += parseFloat(result[i][6]);
- TOTALPL += parseFloat(result[i][7]);
- }
- console.log("TODAYPL "+ TODAYPL.toLocaleString('en-US', { style: 'currency', currency: 'USD' }))
- console.log("TOTALPL "+ TOTALPL.toLocaleString('en-US', { style: 'currency', currency: 'USD' }))
- var nameSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1");
- htmlout1 = htmlout1.replace("{{TODAYPL}}",TODAYPL.toLocaleString('en-US', { style: 'currency', currency: 'USD' }));
- htmlout1 = htmlout1.replace("{{TOTALPL}}", TOTALPL.toLocaleString('en-US', { style: 'currency', currency: 'USD' }));
- var userSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1");
- const users = userSheet.getDataRange().getValues();
-
- for (var i = 0; i < users.length; i++) {
- var firstname = users[i][0];
- var lastname = users[i][1];
- var email = users[i][3];
- var htmlout2 = htmlout1;
- htmlout2 = htmlout2.replace("{{title_1}}","Dear "+ firstname +" "+lastname);
- htmlout2 = htmlout2.replace("{{heading_1}}","Portfolio Summary");
- MailApp.sendEmail({
- to: email,
- subject: "Your portfolio Summary",
- htmlBody: htmlout2
- });
- console.log('email sent to '+ email + ' ' + firstname + ' !!')
- }
- }
[attach]2321480[/attach]
作者: javacomhk 時間: 2022-5-22 08:01
留位
作者: bongbong3481 時間: 2023-10-6 06:46
留位學野,高汁啊
