Fewer bugs and more features
r2dii.data 0.1.2 and r2dii.match 0.0.4 are now on CRAN. These packages provide datasets and tools to align financial markets to climate goals. These releases fix a number of bugs that you can learn about in each package's changelog; this post shows enhancements and new features.
You can install r2dii.data and r2dii.match from CRAN with:
install.packages("r2dii.data")
install.packages("r2dii.match")
Then use them with:
library(r2dii.data)
library(r2dii.match)
r2dii.data 0.1.2 includes two new datasets: green_or_brown and sic_classification (thanks to Daisy Pacheco and George Harris).
green_or_brown
#> # A tibble: 16 x 3
#>    sector       technology    green_or_brown
#>    <chr>        <chr>         <chr>         
#>  1 automotive   electric      green         
#>  2 automotive   hybrid        green         
#>  3 automotive   ice           brown         
#>  4 automotive   fuelcell      green         
#>  5 power        hydrocap      green         
#>  6 power        renewablescap green         
#>  7 power        coalcap       brown         
#>  8 power        gascap        brown         
#>  9 power        oilcap        brown         
#> 10 power        nuclearcap    green         
#> 11 oil and gas  oil           brown         
#> 12 oil and gas  gas           brown         
#> 13 coal         coal          brown         
#> 14 fossil fuels oil           brown         
#> 15 fossil fuels gas           brown         
#> 16 fossil fuels coal          brown
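One way to use green_or_brown is to label your own data by joining on sector and technology. The following is a minimal sketch in base R, not an official workflow. The data frame `my_data` and its `production` values are hypothetical, and `lookup` is an excerpt copied from the output above so the example runs without the package; with r2dii.data attached you could likely use green_or_brown directly in its place.

```r
# Excerpt of green_or_brown, copied from the printed output above
lookup <- data.frame(
  sector = c("automotive", "automotive", "power", "power"),
  technology = c("electric", "ice", "renewablescap", "coalcap"),
  green_or_brown = c("green", "brown", "green", "brown"),
  stringsAsFactors = FALSE
)

# Hypothetical data; the production values are made up for illustration
my_data <- data.frame(
  sector = c("automotive", "automotive", "power", "power"),
  technology = c("electric", "ice", "renewablescap", "coalcap"),
  production = c(10, 90, 30, 70),
  stringsAsFactors = FALSE
)

# Label each sector-technology pair as green or brown
labelled <- merge(my_data, lookup, by = c("sector", "technology"))

# Total production by label
totals <- aggregate(production ~ green_or_brown, data = labelled, FUN = sum)
totals
```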
sic_classification
#> # A tibble: 256 x 4
#>    code  description                              sector    borderline
#>    <chr> <chr>                                    <chr>     <lgl>     
#>  1 0     private households, exterritorial organ… not in s… FALSE     
#>  2 00000 private households, exterritorial organ… not in s… FALSE     
#>  3 11110 growing of cereals and other crops n.e.… not in s… FALSE     
#>  4 11130 growing of fruit, nuts, beverage and sp… not in s… FALSE     
#>  5 11210 farming  of cattle, sheep, goats, horse… not in s… FALSE     
#>  6 11300 growing of crops combined with farming … not in s… FALSE     
#>  7 12100 forestry and related services            not in s… FALSE     
#>  8 12200 logging and related services             not in s… FALSE     
#>  9 13100 ocean and coastal fishing                not in s… FALSE     
#> 10 21000 mining of coal and lignite               coal      FALSE     
#> # … with 246 more rows
Also, region_isos gained data from ETP 2017, and ald_demo dropped the column number_of_assets (thanks to Taylor Posey).
unique(region_isos$source)
#> [1] "weo_2019" "etp_2017"
any(grepl("number_of_assets", names(ald_demo)))
#> [1] FALSE
match_name() now outputs a new column: borderline. This column helps you measure how much of your loanbook matched some asset; see the new article Calculating matching coverage.
loanbook <- loanbook_demo
ald <- ald_demo
matched <- match_name(loanbook, ald)
tail(names(matched))
#> [1] "sector_ald" "name"       "name_ald"   "score"      "source"    
#> [6] "borderline"
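To illustrate the general idea of matching coverage, here is a minimal base R sketch. It does not reproduce the method in the article: the data frames `lbk` and `mtch`, and the columns `id_loan` and `loan_size`, are hypothetical stand-ins for a loanbook and for the output of match_name(). It computes the share of loans, and of loan value, that matched some asset.

```r
# Hypothetical loanbook: one row per loan
lbk <- data.frame(
  id_loan = c("L1", "L2", "L3", "L4"),
  loan_size = c(100, 200, 300, 400),
  stringsAsFactors = FALSE
)

# Hypothetical matches: only loans that matched some asset appear here
mtch <- data.frame(
  id_loan = c("L1", "L3"),
  stringsAsFactors = FALSE
)

# Flag the loans that matched
lbk$matched <- lbk$id_loan %in% mtch$id_loan

# Coverage as a share of the number of loans
coverage_by_count <- mean(lbk$matched)

# Coverage as a share of the total loan value
coverage_by_value <- sum(lbk$loan_size[lbk$matched]) / sum(lbk$loan_size)
```

Weighting by loan value usually says more than counting rows, because a few large loans can dominate a loanbook.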
Also, match_name() now runs faster and uses less memory. This responds to users' feedback, diligently managed by George Harris (thanks!). If you still run out of memory, read Using match_name() with large loanbooks: How to resolve memory issues and Improving r2dii.match: How to work with big data, and benchmarks of a more efficient version of match_name(). You may also want to reduce the size of your data: use the new function crucial_lbk() to select the minimum columns you need for match_name().
ncol(loanbook)
#> [1] 19
crucial_lbk()
#> [1] "id_ultimate_parent"                    
#> [2] "name_ultimate_parent"                  
#> [3] "id_direct_loantaker"                   
#> [4] "name_direct_loantaker"                 
#> [5] "sector_classification_system"          
#> [6] "sector_classification_direct_loantaker"
smaller_loanbook <- loanbook[crucial_lbk()]
ncol(smaller_loanbook)
#> [1] 6
match_name(smaller_loanbook, ald)
#> # A tibble: 497 x 15
#>    id_ultimate_par… name_ultimate_p… id_direct_loant… name_direct_loa…
#>    <chr>            <chr>            <chr>            <chr>           
#>  1 UP15             Alpine Knits In… C294             Yuamen Xinneng …
#>  2 UP288            University Of I… C292             Yuama Ethanol L…
#>  3 UP104            Garland Power &… C305             Yukon Energy Co…
#>  4 UP104            Garland Power &… C305             Yukon Energy Co…
#>  5 UP83             Earthpower Tech… C304             Yukon Developme…
#>  6 UP83             Earthpower Tech… C304             Yukon Developme…
#>  7 UP163            Kraftwerk Mehru… C303             Yueyang City Co…
#>  8 UP138            Jai Bharat Gum … C301             Yuedxiu Corp One
#>  9 UP32             Bhagwan Energy … C302             Yuexi County AA…
#> 10 UP81             Dynegy Midwest … C309             Yuxi ounty Liua…
#> # … with 487 more rows, and 11 more variables:
#> #   sector_classification_system <chr>,
#> #   sector_classification_direct_loantaker <dbl>, id_2dii <chr>,
#> #   level <chr>, sector <chr>, sector_ald <chr>, name <chr>,
#> #   name_ald <chr>, score <dbl>, source <chr>, borderline <lgl>
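If selecting the crucial columns is not enough, another general way to lower peak memory is to match the loanbook in chunks and combine the results. The sketch below is an assumption, not necessarily the approach of the articles mentioned above: `match_in_chunks()` is a hypothetical helper, and `toy_match()` is a simplified stand-in for match_name() so the example runs without the packages.

```r
# Hypothetical helper: apply `match_fun` to row-chunks of `lbk`, then combine
match_in_chunks <- function(lbk, ald, chunk_size, match_fun) {
  # Nothing to split: delegate directly
  if (nrow(lbk) == 0) {
    return(match_fun(lbk, ald))
  }
  # Assign each row to a chunk of at most `chunk_size` rows
  chunk_id <- ceiling(seq_len(nrow(lbk)) / chunk_size)
  chunks <- split(lbk, chunk_id)
  # Match each chunk separately, then stack the results
  results <- lapply(chunks, function(chunk) match_fun(chunk, ald))
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Stand-in for match_name(): keeps rows whose name appears exactly in `ald`
toy_match <- function(lbk, ald) {
  lbk[lbk$name %in% ald$name, , drop = FALSE]
}

toy_lbk <- data.frame(name = c("a", "b", "c", "d", "e"), stringsAsFactors = FALSE)
toy_ald <- data.frame(name = c("a", "c", "e"), stringsAsFactors = FALSE)

chunked <- match_in_chunks(toy_lbk, toy_ald, chunk_size = 2, match_fun = toy_match)
chunked
```

With the packages attached, you might try match_in_chunks(smaller_loanbook, ald, chunk_size = 100, match_fun = match_name). Whether chunked results equal those of a single call is an assumption worth verifying on your own data.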
While this release includes commits from only a few of us (jdhoffa, maurolepore), it was made possible by feedback from our colleagues and users.
For attribution, please cite this work as
Lepore (2020, Aug. 14). Data science at 2DII: r2dii.data 0.1.2 and r2dii.match 0.0.4 are now on CRAN. Retrieved from https://2degreesinvesting.github.io/posts/2020-08-14-r2dii-data-0-1-2-and-r2dii-match-0-0-4-are-now-on-cran/
BibTeX citation
@misc{lepore2020r2dii.data,
  author = {Lepore, Mauro},
  title = {Data science at 2DII: r2dii.data 0.1.2 and r2dii.match 0.0.4 are now on CRAN},
  url = {https://2degreesinvesting.github.io/posts/2020-08-14-r2dii-data-0-1-2-and-r2dii-match-0-0-4-are-now-on-cran/},
  year = {2020}
}