## Warning: package 'dplyr' was built under R version 3.6.2
## ## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats': ## ## filter, lag
## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union
library(rio) #devtools::install_github("smach/rmiscutils") library(lubridate)
## ## Attaching package: 'lubridate'
## The following object is masked from 'package:base': ## ## date
## Warning: package 'scales' was built under R version 3.6.2
This is downloaded from the County Auditor and Recorder's Data. And some FOIA data. This can be any spreadsheet of county data as long as it has the parcel number, mortgage amount, tax district, effective rate, owner name, sale date, and (Auditor's) market value.
Note: This is only commerical properties, in the Commerical/Industrial land use codes. The mortgages are incomplete. (See our methodology for more.)
#importing data Commercial_Properties <- rio::import("Cuyahoga Properties last verified 8-07-19.csv") #OR wherever your file is #Exempt properties taken out, as per auditor's edit. Commercial_Properties <- filter(Commercial_Properties, NET_TAX_OWED_1ST>0)
What is the amount owed to each county?
Note: in theory we would just add all the taxes_paid and there's always going to be a certain percentage of people behind on their taxes, and we're not interested in that. So here we are calculating the total amount to taxes collected in theory.
Second Note: Each year, new properties are built. This doesn't decrease the tax rate. It just adds to the amount of property tax collected. So this is essentially a snapshot.
Third caveat: We collected this data over several months so it's always going to be slightly out of date, as people are constantly buying, building, and selling new properites. We're interested in the best estimate.
Commercial_Properties <- mutate(Commercial_Properties, Amount_Owed_to_County=market_value*EFFECTIVE_RATE*.35*.001)
How much does each district raise?
## `summarise()` ungrouping output (override with `.groups` argument)
|Broadview Hts./North Royalt||1.706686e+06|
|Chagrin Falls Village||2.626383e+06|
|Cleveland Hts./East Clevela||9.469686e+04|
|Fairview Park/Rocky River||2.825286e+06|
|Highland Hills Village||2.239329e+06|
|Richmond Hts./South Euclid||1.167779e+06|
|South Euclid/Cleveland Hts.||2.741612e+06|
How much is that altogether?
##  797792740
That same number, but written as currency with commas so you can read:
First, our data is in the wrong format. When the date, owner, and amount are the same, the sale amount is actually for ALL the properties together. But there are different parcel numbers because large properties are usually composed of several parcels next to each other.
So for example, parcels 123, 456, and 789 are are listed as sold to Company LLC on June 22, 2019 for 6 million dollars. But each is not worth 6 million. They are worth 1, million, 2 million, and 3 million dollars respectively. All the properties together are worth valued at 6 million dollars, BUT the mortgage amount field on each reads 6 million dollars. And we don't want to accidentally log it as 18 million dollars.
So here we will group by owner, date of mortgage, and the mortgage amount. But then we run into another problem: Some of these have multiple tax districts:
Looking_For_Sort_of_duplicates <- Commercial_Properties %>% group_by(GranteeName,RecordedDate,mortgage_amount) %>% count(TAXSET_DESCR) datatable(Looking_For_Sort_of_duplicates)
## Warning in instance$preRenderHook(instance): It seems your data is too ## big for client-side DataTables. You may consider server-side processing: ## https://rstudio.github.io/DT/server.html