Thursday 14 November 2019

'Knowledge' Management Part 1 (importing court decisions into a Google spreadsheet)



This is the result.

Google Apps Script (GAS)
  1. imports court decisions (on migration detention) into a Google spreadsheet
  2. adds dependent dropdowns to the Google spreadsheet
  3. imports court decisions into a tree structure in a Google Document.

Between these steps I add a short description to each court decision and label it.

In this post I show the script that imports the court decisions into a Google spreadsheet.

I will use the court decisions of the Administrative Jurisdiction Division of the Council of State (Afdeling Bestuursrechtspraak Raad van State) as an example.


function scrapeCourtDecisionsRVS(){
  
  var months = ["januari","februari","maart","april","mei","juni","juli","augustus","september","oktober","november","december"];
  var courtDecisions = []; // rows to append to the spreadsheet
  
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName("Data");
  var testColumn = sheet.getRange("C2:C").getValues().map(function(cell){return cell[0]}); // ECLIs already in column C
  
  var url = "https://www.raadvanstate.nl/uitspraken/?zoeken=true&zoeken_term=&pager_rows=100";
  var sourceText = UrlFetchApp.fetch(url).getContentText(); // html-code of the page
  var courtDecisionsData = sourceText.split(/class=siteLink href="https:\/\/www.raadvanstate.nl\/uitspraken\/@/g); // one chunk per decision
  
  for(var i=1;i<courtDecisionsData.length;i++){ // chunk 0 is the html before the first decision
    
    var courtDecisionData = courtDecisionsData[i];
    
    if(/Datum uitspraak/.test(courtDecisionData)){
      var hlpCourtDecisionDate = courtDecisionData.split(/Datum uitspraak/)[1].split(/<dd>/)[1].split(/</)[0].trim().split(/\s/g); // e.g. ["13","november","2019"]
      var day = Number(hlpCourtDecisionDate[0]);
      var month = months.indexOf(hlpCourtDecisionDate[1]); // 0-based, as new Date() expects
      var year = Number(hlpCourtDecisionDate[2]); 
      var courtDecisionDate = Utilities.formatDate(new Date(year,month,day),"GMT+1","d-M-yyyy"); 
    }
    else{
      var courtDecisionDate = "no date";
    }
    
    if(/\d{9}\/\d\/[a-zA-Z]\d/.test(courtDecisionData)){
      var courtDecisionReference = /\d{9}\/\d\/[a-zA-Z]\d/.exec(courtDecisionData)[0]; 
    }
    else{
      var courtDecisionReference = "no reference";
    }
    
    if(/ECLI:NL:RVS:\d{4}:[A-Z0-9]+/.test(courtDecisionData)){
      var courtDecisionEcli = /ECLI:NL:RVS:\d{4}:[A-Z0-9]+/.exec(courtDecisionData)[0];
    }
    else{
      var courtDecisionEcli = "no ecli";
    }
    
    var courtDecisionUrl = courtDecisionData.split(/<a href="/)[1].split(/">/)[0];
    
    if(/<div class=summary>/.test(courtDecisionData) && /<div class=summary><\/div>/.test(courtDecisionData) == false){
      var courtDecisionSummary = courtDecisionData.split(/<div class=summary>/)[1].split(/<p>/)[1].split(/</)[0].trim(); 
    }
    else{
      var courtDecisionSummary = "";
    }
    
    if(testColumn.indexOf(courtDecisionEcli) ==-1 && courtDecisionDate !== "no date" && courtDecisionReference !== "no reference" && courtDecisionEcli !== "no ecli"  &&  /-v/.test(courtDecisionUrl) && /(vreemdelingenbewaring|bewaring|vrijheidsontnemende)/.test(courtDecisionSummary)){
      courtDecisions.push([courtDecisionDate,courtDecisionReference,courtDecisionEcli,courtDecisionUrl,"ABRS"]); // only new, complete decisions on detention
    }
  }  
  if(courtDecisions[0]){ // skip writing when nothing new was found
    sheet.getRange(sheet.getLastRow()+1,1,courtDecisions.length,courtDecisions[0].length).setValues(courtDecisions); 
  }
}


Lines 10-12
With UrlFetchApp.fetch(url).getContentText() GAS gets access to the html-code of the webpage on which the court decisions are published. You can view the html-code by logging it (Logger.log(sourceText)), but it is easier to view it in your browser.

The court decisions can be separated (and then processed one by one) by splitting the html-code on the string class=siteLink href="https://www.raadvanstate.nl/uitspraken/@. View the html-code of the court decisions webpage and search for this string to see where each decision starts.
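To see what the split produces without fetching the live page, here is a minimal sketch that runs on a made-up html fragment. The sample markup and the function name splitCourtDecisions are assumptions for illustration; only the separator string is taken from the script above.

```javascript
// Hypothetical helper: splits the page source into one chunk per court decision.
function splitCourtDecisions(sourceText) {
  var separator = /class=siteLink href="https:\/\/www\.raadvanstate\.nl\/uitspraken\/@/;
  var parts = sourceText.split(separator);
  // parts[0] is the html before the first decision, so it is dropped.
  return parts.slice(1);
}

// Made-up sample, far shorter than the real page.
var sampleHtml =
  '<html><body>' +
  '<a class=siteLink href="https://www.raadvanstate.nl/uitspraken/@111/201900001-1-v3/">first</a>' +
  '<a class=siteLink href="https://www.raadvanstate.nl/uitspraken/@222/201900002-1-v3/">second</a>' +
  '</body></html>';

var chunks = splitCourtDecisions(sampleHtml);
// chunks now holds two strings, each starting right after the "@" of one decision link.
```

Dropping the first part is the same thing the script does by starting its loop at i=1.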

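Lines 14-55
For each court decision I need the date, the reference, the ECLI, the url and the summary. Each value is cut out of the html-code with split or a regular expression. A decision is pushed into the array courtDecisions only if its ECLI is not yet in the spreadsheet, all fields were found, and its summary mentions detention.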
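The extraction step can be tried in isolation. The sketch below is not the script itself: parseDutchDate and firstMatch are hypothetical helpers, the sample chunk is made up, and the date is formatted by hand instead of with Utilities.formatDate so that it also runs outside GAS. The regular expressions are the ones from the script.

```javascript
var MONTHS = ["januari","februari","maart","april","mei","juni","juli",
              "augustus","september","oktober","november","december"];

// Hypothetical helper: turns "13 november 2019" into "13-11-2019".
function parseDutchDate(text) {
  var parts = text.trim().split(/\s+/);
  var day = Number(parts[0]);
  var month = MONTHS.indexOf(parts[1]); // 0-based, -1 when the name is unknown
  var year = Number(parts[2]);
  if (isNaN(day) || isNaN(year) || month === -1) return "no date";
  return day + "-" + (month + 1) + "-" + year;
}

// Hypothetical helper: first match of a (non-global) regex, or a fallback.
function firstMatch(regex, text, fallback) {
  var match = regex.exec(text);
  return match ? match[0] : fallback;
}

// Made-up chunk, shaped like one element of courtDecisionsData.
var sampleChunk =
  '123/201901234-1-v3/"><dl><dt>Datum uitspraak</dt><dd>13 november 2019</dd></dl>' +
  '<span>201901234/1/V3</span><span>ECLI:NL:RVS:2019:3800</span>';

var dateText = sampleChunk.split(/Datum uitspraak/)[1].split(/<dd>/)[1].split(/</)[0];

var decision = {
  date: parseDutchDate(dateText),
  reference: firstMatch(/\d{9}\/\d\/[a-zA-Z]\d/, sampleChunk, "no reference"),
  ecli: firstMatch(/ECLI:NL:RVS:\d{4}:[A-Z0-9]+/, sampleChunk, "no ecli")
};
```

A helper like firstMatch replaces the repeated pattern of testing a regular expression and then executing it again, at the cost of one extra function.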

Lines 56-58
Finally the array courtDecisions is written to the Google spreadsheet, directly below the last filled row.
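The write step could also be pulled into a small function of its own. This is a sketch under the assumption that the sheet is passed in as a parameter; appendRows is a hypothetical name, while getLastRow, getRange(row, column, numRows, numColumns) and setValues are the calls the script already uses.

```javascript
// Hypothetical helper: appends rows below the last filled row of a sheet.
// `sheet` is expected to be a Sheet object, e.g. ss.getSheetByName("Data").
function appendRows(sheet, rows) {
  // Nothing to write; this mirrors the courtDecisions[0] check in the script.
  if (rows.length === 0) return 0;
  var firstFreeRow = sheet.getLastRow() + 1;
  // The range must have exactly the dimensions of the two-dimensional array.
  sheet.getRange(firstFreeRow, 1, rows.length, rows[0].length).setValues(rows);
  return rows.length;
}
```

Inside the script the call would be appendRows(sheet, courtDecisions). Because all rows are written with a single setValues call, every row in the array needs the same number of columns.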

