For each company, this function drops rows where the product information is missing and the sector information is duplicated.

Usage

sector_profile_any_prune_companies(data)

Arguments

data

Typically a "sector profile" *companies dataframe.

Value

A dataframe with the same columns as the input data and possibly fewer rows.

Examples

library(dplyr)
# styler: off
companies <- tribble(
  ~row, ~companies_id, ~clustered, ~activity_uuid_product_uuid, ~tilt_sector,
    1L,           "a",       "b1",                        "c1",          "x",
    2L,           "a",         NA,                          NA,          "x",
    3L,           "a",         NA,                          NA,          "y",
    4L,           "a",         NA,                          NA,          "y"
  )
# styler: on

# Keep row 1: Has product info
# Drop row 2: Lacks product info and its sector ("x") duplicates row 1
# Keep row 3: Lacks product info but is the first row with sector "y"
# Drop row 4: Lacks product info and its sector ("y") duplicates row 3
companies
#> # A tibble: 4 × 5
#>     row companies_id clustered activity_uuid_product_uuid tilt_sector
#>   <int> <chr>        <chr>     <chr>                      <chr>      
#> 1     1 a            b1        c1                         x          
#> 2     2 a            NA        NA                         x          
#> 3     3 a            NA        NA                         y          
#> 4     4 a            NA        NA                         y          

sector_profile_any_prune_companies(companies)
#> # A tibble: 2 × 5
#>     row companies_id clustered activity_uuid_product_uuid tilt_sector
#>   <int> <chr>        <chr>     <chr>                      <chr>      
#> 1     1 a            b1        c1                         x          
#> 2     3 a            NA        NA                         y
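
The pruning rule illustrated above can be approximated with plain dplyr. This is a minimal sketch, not the package's implementation: the helper name `prune_sketch()` is hypothetical, and it assumes that "product information" means the `activity_uuid_product_uuid` column and that "duplicated" means the sector already appeared in an earlier row of the same company.

```r
library(dplyr)

# Hypothetical helper (not part of the package). Assumption: a row "lacks
# product info" when `activity_uuid_product_uuid` is NA, and its sector is
# "duplicated" when an earlier row of the same company has the same
# `tilt_sector`.
prune_sketch <- function(data) {
  data %>%
    group_by(companies_id) %>%
    filter(!(is.na(activity_uuid_product_uuid) & duplicated(tilt_sector))) %>%
    ungroup()
}

companies <- tribble(
  ~row, ~companies_id, ~clustered, ~activity_uuid_product_uuid, ~tilt_sector,
    1L,           "a",       "b1",                        "c1",          "x",
    2L,           "a",         NA,                          NA,          "x",
    3L,           "a",         NA,                          NA,          "y",
    4L,           "a",         NA,                          NA,          "y"
)

prune_sketch(companies)$row
#> [1] 1 3
```

Because `duplicated()` depends on row order, this sketch keeps the first row seen for each sector. Whether the real function behaves the same way when a row without product info comes before a row with product info is not stated here.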