library(dplyr)
## Warning: package 'dplyr' was built under R version 3.6.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(rio)
#devtools::install_github("smach/rmiscutils")
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
library(scales)
## Warning: package 'scales' was built under R version 3.6.2
library(DT)

Getting the Data

This is downloaded from the County Auditor and Recorder's Data. And some FOIA data. This can be any spreadsheet of county data as long as it has the parcel number, mortgage amount, tax district, effective rate, owner name, sale date, and (Auditor's) market value.

Note: This is only commerical properties, in the Commerical/Industrial land use codes. The mortgages are incomplete. (See our methodology for more.)

#importing data
Commercial_Properties <- rio::import("Cuyahoga Properties last verified 8-07-19.csv") #OR wherever your file is
#Exempt properties taken out, as per auditor's edit. 
Commercial_Properties <- filter(Commercial_Properties, NET_TAX_OWED_1ST>0)

Calculating Baseline

What is the amount owed to each county?

Note: in theory we would just add all the taxes_paid and there's always going to be a certain percentage of people behind on their taxes, and we're not interested in that. So here we are calculating the total amount to taxes collected in theory.

Second Note: Each year, new properties are built. This doesn't decrease the tax rate. It just adds to the amount of property tax collected. So this is essentially a snapshot.

Third caveat: We collected this data over several months so it's always going to be slightly out of date, as people are constantly buying, building, and selling new properites. We're interested in the best estimate.

Commercial_Properties <- mutate(Commercial_Properties, Amount_Owed_to_County=market_value*EFFECTIVE_RATE*.35*.001)

How much does each district raise?

## `summarise()` ungrouping output (override with `.groups` argument)
TAXSET_DESCR Total_Per_District
Bay Village 1.280771e+06
Beachwood 2.824884e+07
Beachwood/Warrensville 8.546389e+06
Bedford 8.965521e+06
Bedford Hts. 9.293173e+06
Bedford Hts./Orange 2.457844e+06
Berea 5.299954e+06
Berea/Olmsted Falls 2.231797e+06
Bratenahl 2.151762e+05
Brecksville 8.813260e+06
Broadview Hts. 7.086926e+06
Broadview Hts./North Royalt 1.706686e+06
Brook Park 1.121132e+07
Brook Park/Cleveland 2.066731e+06
Brooklyn 1.208523e+07
Brooklyn Hts. 2.930425e+06
Chagrin Falls Village 2.626383e+06
Cleveland 2.512801e+08
Cleveland Hts. 1.674208e+07
Cleveland Hts./East Clevela 9.469686e+04
Cleveland/Berea 5.071719e+06
Cleveland/Shaker Hts. 2.289971e+06
Cuyahoga Hts. 3.474949e+06
East Cleveland 4.290780e+06
Euclid 2.030490e+07
Fairview Park 4.467671e+06
Fairview Park/Berea 3.820022e+04
Fairview Park/Rocky River 2.825286e+06
Garfield Hts. 1.107728e+07
Garfield Hts./Cleveland 1.970821e+05
Gates Mills 2.033804e+05
Glenwillow 4.499879e+06
Highland Hills Village 2.239329e+06
Highland Hts. 6.537852e+06
Hunting Valley 5.094720e+03
Independence 1.515705e+07
Lakewood 2.146778e+07
Linndale 1.253786e+05
Lyndhurst 1.062515e+07
Maple Hts. 7.855830e+06
Mayfield Hts. 2.277334e+07
Mayfield Village 8.836303e+06
Middleburg Hts. 1.806218e+07
Moreland Hills/Orange 5.204357e+05
Newburgh Hts. 5.491098e+05
North Olmsted 2.678409e+07
North Randall 4.540983e+06
North Royalton 1.076495e+07
Oakwood 4.320497e+06
Olmsted Falls 1.812459e+06
Olmsted Falls/Berea 1.720302e+03
Olmsted Township 4.857666e+06
Orange 3.187011e+06
Orange/Warrensville 8.748185e+05
Parma 2.718912e+07
Parma Hts. 6.679992e+06
Pepper Pike 3.132739e+06
Pepper Pike/Beachwood 8.520099e+03
Richmond Hts. 4.915983e+06
Richmond Hts./South Euclid 1.167779e+06
Rocky River 1.132576e+07
Seven Hills 2.419507e+06
Shaker Hts. 1.078308e+07
Solon 2.584266e+07
Solon/Orange 8.388176e+05
South Euclid 4.756233e+06
South Euclid/Cleveland Hts. 2.741612e+06
Strongsville 3.084553e+07
University Hts. 5.038997e+06
Valley View 6.250233e+06
Walton Hills 2.786994e+06
Warrensville/Orange 7.024654e+05
Westlake 3.274093e+07
Woodmere 3.802357e+06

How much is that altogether?

  sum(na.omit(Amount_Due_Per_Tax_District$Total_Per_District))
## [1] 797792740

That same number, but written as currency with commas so you can read:

scales::dollar(.Last.value)
## character(0)

Putting single purchases together

First, our data is in the wrong format. When the date, owner, and amount are the same, the sale amount is actually for ALL the properties together. But there are different parcel numbers because large properties are usually composed of several parcels next to each other.

So for example, parcels 123, 456, and 789 are are listed as sold to Company LLC on June 22, 2019 for 6 million dollars. But each is not worth 6 million. They are worth 1, million, 2 million, and 3 million dollars respectively. All the properties together are worth valued at 6 million dollars, BUT the mortgage amount field on each reads 6 million dollars. And we don't want to accidentally log it as 18 million dollars.

So here we will group by owner, date of mortgage, and the mortgage amount. But then we run into another problem: Some of these have multiple tax districts:

Looking_For_Sort_of_duplicates <- Commercial_Properties %>% 
  group_by(GranteeName,RecordedDate,mortgage_amount) %>% 
  count(TAXSET_DESCR)  
datatable(Looking_For_Sort_of_duplicates)
## Warning in instance$preRenderHook(instance): It seems your data is too
## big for client-side DataTables. You may consider server-side processing:
## https://rstudio.github.io/DT/server.html