  • to_alias() takes any character vector and creates an alias by transforming the input (a) to lower case; (b) to latin-ascii characters; and (c) to standard abbreviations of ownership types. Commonly, the inputs are values from the columns name_direct_loantaker or name_ultimate_parent of a loanbook dataset, or from the column name_company of an asset-level dataset.

  • from_name_to_alias() outputs a table giving default strings used to convert from a name to its alias. You may amend this table and pass it to to_alias() via the from_to argument.
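A replacement table of this shape can be thought of as an ordered list of find-and-replace rules. The sketch below uses only base R; `apply_rules()` and the `rules` data frame are hypothetical names introduced for illustration, and the literal, case-sensitive matching in row order is an assumption, not necessarily how to_alias() applies its from_to argument.

```r
# Hypothetical helper (not part of the package): apply each from/to rule
# in row order as a literal, case-sensitive replacement.
apply_rules <- function(x, from_to) {
  for (i in seq_len(nrow(from_to))) {
    x <- gsub(from_to$from[i], from_to$to[i], x, fixed = TRUE)
  }
  x
}

# Illustrative rules only; the package's default table is longer.
rules <- data.frame(
  from = c(" and ", "1"),
  to = c(" & ", "one"),
  stringsAsFactors = FALSE
)

apply_rules("peter and paul 1", rules)
#> [1] "peter & paul one"
```

With a table of zero rows the loop never runs and the input comes back unchanged.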

Usage

to_alias(x, from_to = NULL, ownership = NULL, remove_ownership = FALSE)

Source

r2dii.match version 0.1.3.

Arguments

x

Character vector, commonly from the columns name_direct_loantaker or name_ultimate_parent of a loanbook dataset, or from the column name_company of an asset-level dataset.

from_to

A data frame of replacement rules with the columns from (initial values) and to (resulting values).

ownership

Character vector of company ownership types (such as llc) to separate from the rest of the name or, if remove_ownership is TRUE, to cut off.

remove_ownership

Logical flag indicating whether ownership types (such as llc) should be cut off.
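As a rough illustration of what cutting off an ownership type means, the base R sketch below drops a trailing ownership word from names that are already lower case. `strip_ownership()` and its default list of types are hypothetical and chosen for the example; the package's own handling may differ.

```r
# Hypothetical helper (not part of the package): remove a trailing
# ownership type, together with the whitespace in front of it.
strip_ownership <- function(x, ownership = c("inc", "llc", "ltd")) {
  pattern <- paste0("[[:space:]]+(", paste(ownership, collapse = "|"), ")$")
  sub(pattern, "", x)
}

strip_ownership(c("acuity brands inc", "abbott laboratories"))
#> [1] "acuity brands"       "abbott laboratories"
```

Names that do not end in one of the listed types are returned unchanged.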

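Value

to_alias() returns a character vector of aliases. from_name_to_alias() returns a tibble with the columns from and to.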

Assigning aliases

The transformation process used to compare names between loanbook and tilt datasets applies best practices commonly used in name matching algorithms:

  • Remove special characters.

  • Replace language specific characters.

  • Abbreviate certain names to reduce their importance in the matching.

  • Spell out numbers to increase their importance.
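The four steps above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: `sketch_alias()` is a hypothetical helper, the character map and the single abbreviation rule are tiny examples, and the spellings for the digits 4 to 9 are assumed. Unlike to_alias(), it keeps the spaces between words, so its output will often differ.

```r
# Hypothetical helper (not part of the package): a rough sketch of the
# transformation steps listed above.
sketch_alias <- function(x) {
  out <- tolower(x)

  # Replace language-specific characters (tiny illustrative map:
  # a-umlaut, o-umlaut, u-umlaut, e-acute).
  out <- chartr("\u00e4\u00f6\u00fc\u00e9", "aoue", out)

  # Spell out digits so that they carry more weight in matching.
  digits <- c(
    "0" = "null", "1" = "one", "2" = "two", "3" = "three", "4" = "four",
    "5" = "five", "6" = "six", "7" = "seven", "8" = "eight", "9" = "nine"
  )
  for (d in names(digits)) {
    out <- gsub(d, digits[[d]], out, fixed = TRUE)
  }

  # Abbreviate a common word so that it carries less weight.
  out <- gsub(" company$", " co", out)

  # Remove special characters, keeping letters, digits and spaces.
  gsub("[^a-z0-9 ]", "", out)
}

sketch_alias(c("3M Company", "AbbVie Inc."))
#> [1] "threem co"  "abbvie inc"
```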

Author

Evgeny Petrovsky (author, contractor)

Adapted from: https://github.com/RMI-PACTA/r2dii.match/blob/main/R/to_alias.R

Examples

library(dplyr)

to_alias("A. and B")
#> [1] "ab"
to_alias("Acuity Brands Inc")
#> [1] "acuitybrands inc"
to_alias(c("3M Company", "Abbott Laboratories", "AbbVie Inc."))
#> [1] "threem co"          "abbottlaboratories" "abbvie inc"        

custom_replacement <- tibble(from = "AAAA", to = "B")
to_alias("Aa Aaaa", from_to = custom_replacement)
#> [1] "aab"

neutral_replacement <- tibble(from = character(0), to = character(0))
to_alias("Company Name Owner", from_to = neutral_replacement)
#> [1] "companynameowner"
to_alias(
  "Company Name Owner",
  from_to = neutral_replacement,
  ownership = "owner",
  remove_ownership = TRUE
)
#> [1] "companyname"

from_name_to_alias()
#> # A tibble: 96 × 2
#>    from     to    
#>    <chr>    <chr> 
#>  1 " and "  " & " 
#>  2 " en "   " & " 
#>  3 " och "  " & " 
#>  4 " und "  " & " 
#>  5 "(pjsc)" ""    
#>  6 "(pte)"  ""    
#>  7 "(pvt)"  ""    
#>  8 "0"      "null"
#>  9 "1"      "one" 
#> 10 "2"      "two" 
#> # … with 86 more rows

append_replacements <- from_name_to_alias() %>%
  add_row(
    .before = 1,
    from = c("AA", "BB"), to = c("alpha", "beta")
  )
append_replacements
#> # A tibble: 98 × 2
#>    from     to     
#>    <chr>    <chr>  
#>  1 "AA"     "alpha"
#>  2 "BB"     "beta" 
#>  3 " and "  " & "  
#>  4 " en "   " & "  
#>  5 " och "  " & "  
#>  6 " und "  " & "  
#>  7 "(pjsc)" ""     
#>  8 "(pte)"  ""     
#>  9 "(pvt)"  ""     
#> 10 "0"      "null" 
#> # … with 88 more rows

# And in combination with `to_alias()`
to_alias(c("AA", "BB", "1"), from_to = append_replacements)
#> [1] "alpha" "beta"  "one"