Getting the required libraries

library(dplyr)
## Warning: package 'dplyr' was built under R version 3.6.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(htmltools)
library(glue)
## Warning: package 'glue' was built under R version 3.6.2
## 
## Attaching package: 'glue'
## The following object is masked from 'package:dplyr':
## 
##     collapse
library(robotstxt)
library(rvest)
## Loading required package: xml2
library(stringr)
library(purrr)
## Warning: package 'purrr' was built under R version 3.6.2
## 
## Attaching package: 'purrr'
## The following object is masked from 'package:rvest':
## 
##     pluck
library(httr)
## Warning: package 'httr' was built under R version 3.6.2
library(readr)
## 
## Attaching package: 'readr'
## The following object is masked from 'package:rvest':
## 
##     guess_encoding
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
library(DT)
library(muckrakr)
library(readxl)
library(XLConnect)
## XLConnect 1.0.1 by Mirai Solutions GmbH [aut],
##   Martin Studer [cre],
##   The Apache Software Foundation [ctb, cph] (Apache POI),
##   Graph Builder [ctb, cph] (Curvesapi Java library)
## http://www.mirai-solutions.com
## https://github.com/miraisolutions/xlconnect
library(rmiscutils)
## 
## Attaching package: 'rmiscutils'
## The following object is masked from 'package:dplyr':
## 
##     add_row

First, this is the final report we got from an Ohio Public Records Request from the Ohio Secretary of State’s Office. This is the last summary of absentee ballots sent from counties to the Ohio Secretary of state.

setwd("~/Code/Ballots2020/Responsive Records/Weekly Absentee Reports")

Absentee_Survey <- rio::import("3-17-20 -Absentee Survey 9.xlsx")

Some of these counties appear twice:

Absentee_Survey %>% 
  group_by(CountyName) %>% 
  summarize(total=n()) %>% 
  arrange(desc(total))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 88 x 2
##    CountyName total
##    <chr>      <int>
##  1 Adams          2
##  2 Highland       2
##  3 Jefferson      2
##  4 Montgomery     2
##  5 Perry          2
##  6 Putnam         2
##  7 Allen          1
##  8 Ashland        1
##  9 Ashtabula      1
## 10 Athens         1
## # … with 78 more rows

Per our conversation with the secretary of state’s office, that’s because their procedure is to directly edit the last sheet, add a new row, and delete the old one. So if any appear twice, it must be because the old one was not deleted. So we are going to take the newer versions of the duplicated counties, based on the timestamp.

Second update: Montgomery’s row with “submitted too early” actually has a timestamp that is later and has fewer votes than the other one so that must be a typo as well.

Absentee_Survey <- Absentee_Survey %>% 
  slice(-62, -36, -61,-77,-63,-72)


Absentee_Survey$Non_UOCAVA_Ballots_Requested_and_Cast_In_Person <- number_with_commas(Absentee_Survey$Non_UOCAVA_Ballots_Requested_and_Cast_In_Person)

Absentee_Survey$NonUOCAVA_Ballots_Returned_Submitted_for_Counting <- number_with_commas(Absentee_Survey$NonUOCAVA_Ballots_Returned_Submitted_for_Counting)

Now we want to total the number of ballot applications that were counted as “invalid” in total and by party to we can compare it to the total number of applications submitted.

UOCAVA is short for “The Uniformed and Overseas Citizens Absentee Voting Act.”

Here is the total invalid UOCAVA and nonUOCAVA applications:

Invalid_Total <- sum(Absentee_Survey$UOCAVA_Invalid_Applications) + sum(Absentee_Survey$NonUOCAVA_Invalid_Applications)

sum(Absentee_Survey$UOCAVA_Invalid_Applications) + sum(Absentee_Survey$NonUOCAVA_Invalid_Applications)
## [1] 72077

Here is the total valid UOCAVA and nonUOCAVA applications:

Valid_Total <- sum(Absentee_Survey$UOCAVA_Valid_Applications) + sum(Absentee_Survey$NonUOCAVA_Valid_Applications)

sum(Absentee_Survey$UOCAVA_Valid_Applications) + sum(Absentee_Survey$NonUOCAVA_Valid_Applications)
## [1] 1710004

What percentage was invalid?

Invalid_Total/(Invalid_Total+Valid_Total)
## [1] 0.04044541

How many total?

Valid_Total+Invalid_Total
## [1] 1782081

How many total ballots were transmitted?

sum(Absentee_Survey$UOCAVA_Ballots_Transmitted)+sum(Absentee_Survey$NonUOCAVA_Ballots_Transmitted)
## [1] 1705724

How many were returned and submitted for counting?

sum(Absentee_Survey$UOCAVA_Ballots_Returned_Submitted_for_Counting)+
  sum(Absentee_Survey$NonUOCAVA_Ballots_Returned_Submitted_for_Counting)+
  sum(Absentee_Survey$UOCAVA_Ballots_Requested_and_Cast_In_Person)+
  sum(Absentee_Survey$Non_UOCAVA_Ballots_Requested_and_Cast_In_Person)
## [1] 1459560

Okay next we are looking at Absentee Ballot Reports. We got these from a Public Records Request from the Secretary of State’s office. Each county emailed the reports to the SOS but not every day. These are mostly requested by parties.

We requested:

First, a caveat:

There are two labeled “78_04232020_163107.” One is in the April 24 folder, the other is in the April 23 folder. Both are the exact same. It’s not clear if we are missing a file or not.

56_0141620_162040 is in the April 17th folder and in the April 16th folder. Again they are the exact same. Again it’s not clear if we are missing a file or they only sent a report on the 16th and that got included in the 17th for some reason.

File 57_040320_090906 had too few columns. I added the country column.

Per the Secretary of State, each individual county will have to be consulted for data discrepancies.

I’m going to combine these all and look for complete duplicates versus just duplicate names and IDs. So we can sort through those just entered twice and those who had their ballot applications rejected and who applied again. But if we are missing files, and I think based on this naming pattern that we are missing a couple beyond the ones who just didn’t send reports everyday and there’s nothing I can do about that from a data standpoint. (We can request it again.)

First we are going to switch to the directory where we put all the absentee ballot reports. Then we combine them.

Note: A bunch of these contain some really old xls files that would not work with read_excel. I tried several other packages, such as the (XLConnect)[https://stackoverflow.com/questions/60609875/r-read-excel-libxls-error-unable-to-parse-file] one which worked for a while, but loading htem all overloaded the java memory. Increasing it did nothing. Long story short, we had to open a bunch of them and resave them as xlsx files.

setwd("~/Code/Ballots2020/Absentee Ballot Reports/Absentee Ballot Reports 4-3-2020 to 4-25-2020")



list_of_absentee_files_csv <- list.files(path = ".", recursive = TRUE,
                            pattern = "\\.csv$" , 
                            full.names = TRUE)

absentee_csv <- list_of_absentee_files_csv %>%
  purrr::set_names(nm = (basename(.) %>% tools::file_path_sans_ext())) %>%
  purrr::map_df(read_csv, 
                .id = "FileName", skip =1,   
                                                col_names = c("county_id", 
                                                              "state_voter_id",
                                                  "county_voter_id",
                                                  "first_name",
                                                  "middle_name",
                                                  "last_name",
                                                  "suffix",
                                                  "in_care_of_line",
                                                  "av_mail_address",
                                                  "av_mail_address_2",
                                                  "av_mail_city",
                                                  "av_mail_state",
                                                  "av_mail_zip",
                                                  "av_mail_country",
                                                  "av_mail_postal",
                                                  "av_ballot_type",
                                                  "av_app_recd_date",
                                                  "run_date"), 
                col_types=cols('av_mail_country'=col_character(),
                               'av_mail_address_2'=col_character(),
                               `av_mail_postal`=col_character(),
                               `county_id`=col_character(),
                               `county_voter_id`=col_character(),
                               `av_mail_zip`=col_character()))
## Warning: 1 parsing failure.
##  row             col           expected            actual                       file
## 1792 in_care_of_line 1/0/T/F/TRUE/FALSE HOUSTON HOUSE 507 './09_04062020_181849.csv'
## Warning: 2 parsing failures.
##  row             col           expected            actual                       file
## 1062 in_care_of_line 1/0/T/F/TRUE/FALSE SCOTT HALL        './09_04172020_171500.csv'
## 4271 in_care_of_line 1/0/T/F/TRUE/FALSE HOUSTON HOUSE 507 './09_04172020_171500.csv'
## Warning: 4 parsing failures.
##   row             col           expected actual                     file
## 34233 av_mail_address delimiter or quote      E './31_040320_185920.csv'
## 34233 av_mail_address delimiter or quote        './31_040320_185920.csv'
## 36383 av_mail_address delimiter or quote      K './31_040320_185920.csv'
## 36383 av_mail_address delimiter or quote        './31_040320_185920.csv'
## Warning: 4 parsing failures.
##  row col   expected     actual                            file
## 1013  -- 18 columns 19 columns './62_040320_160610_Ottawa.csv'
## 1546  -- 18 columns 19 columns './62_040320_160610_Ottawa.csv'
## 2438  -- 18 columns 19 columns './62_040320_160610_Ottawa.csv'
## 2780  -- 18 columns 22 columns './62_040320_160610_Ottawa.csv'
## Warning: 1 parsing failure.
## row col   expected    actual                     file
##   1  -- 18 columns 1 columns './72_040720_165248.csv'
## Warning: 1811 parsing failures.
## row col   expected     actual                     file
##   1  -- 18 columns 19 columns './88_040320_171842.csv'
##   2  -- 18 columns 19 columns './88_040320_171842.csv'
##   3  -- 18 columns 19 columns './88_040320_171842.csv'
##   4  -- 18 columns 19 columns './88_040320_171842.csv'
##   5  -- 18 columns 19 columns './88_040320_171842.csv'
## ... ... .......... .......... ........................
## See problems(...) for more details.
list_of_absentee_files_xlsx <- list.files(path = ".", recursive = TRUE,
                            pattern = "\\.xlsx$", 
                            full.names = TRUE)

absentee_excel <- list_of_absentee_files_xlsx %>%
  purrr::set_names(nm = (basename(.) %>% tools::file_path_sans_ext())) %>%
  purrr::map_df(read_excel, 
                .id = "FileName", skip =1,col_names = c("county_id", 
                                                              "state_voter_id",
                                                  "county_voter_id",
                                                  "first_name",
                                                  "middle_name",
                                                  "last_name",
                                                  "suffix",
                                                  "in_care_of_line",
                                                  "av_mail_address",
                                                  "av_mail_address_2",
                                                  "av_mail_city",
                                                  "av_mail_state",
                                                  "av_mail_zip",
                                                  "av_mail_country",
                                                  "av_mail_postal",
                                                  "av_ballot_type",
                                                  "av_app_recd_date",
                                                  "run_date"), 
                col_types=c("text"))

#It was just easier to import these as text and then change the format. 
absentee_excel$av_app_recd_date <- as.numeric(absentee_excel$av_app_recd_date)
absentee_excel$av_app_recd_date <- janitor::excel_numeric_to_date(absentee_excel$av_app_recd_date, date_system = "modern")
absentee_csv$av_app_recd_date <- as.Date(absentee_csv$av_app_recd_date, format = "%m/%d/%Y")
absentee_SOS <- rbind(absentee_excel, absentee_csv)

absentee_SOS$run_date <- as.Date(absentee_SOS$run_date, format = "%m/%d/%Y")

How many votes are in each file?

number_of_records_in_each_file <- absentee_SOS %>% 
  group_by(FileName) %>% 
  summarize(Total=n())
## `summarise()` ungrouping output (override with `.groups` argument)
datatable(number_of_records_in_each_file)

Some of these seem really low and we are checking with SOS on this.

Alrightly let’s check out this combined data set. Looks like there are ~1,8 million applications

First let’s check for duplicates to make sure they didn’t give us two of the same file. As the files were quite jumbled.

n_distinct(absentee_SOS)
## [1] 1790577

Looks like ~1.79 million are unique, so we did have just a few totally duplicated data points in there. (Unless someone applied for a ballot twice in one day?)

duplicates_absentee_SOS <- absentee_SOS %>% 
  group_by_all() %>% 
  summarize(Total_Times_This_Record_Appears=n()) %>% 
  ungroup() %>% 
  filter(Total_Times_This_Record_Appears>1)
## `summarise()` regrouping output by 'FileName', 'county_id', 'state_voter_id', 'county_voter_id', 'first_name', 'middle_name', 'last_name', 'suffix', 'in_care_of_line', 'av_mail_address', 'av_mail_address_2', 'av_mail_city', 'av_mail_state', 'av_mail_zip', 'av_mail_country', 'av_mail_postal', 'av_ballot_type', 'av_app_recd_date' (override with `.groups` argument)

Note: several counties have blank rows in between which causes the extra rows. They aren’t actual records.

Now we are just going to deal with distinct data, and not complete duplicates:

absentee_SOS <- distinct(absentee_SOS)

How many is that?

nrow(absentee_SOS)
## [1] 1790577

But some people may have applied and had their application rejected and applied again. How many is that?

First let’s look at how many applied multiple times

multiple_applications <- absentee_SOS %>% 
  group_by(state_voter_id) %>% 
  summarize(Total_applications =n()) 
## `summarise()` ungrouping output (override with `.groups` argument)

Now let’s count how many people have multiple applications

multiple_applications <- absentee_SOS %>% 
  group_by(state_voter_id) %>% 
  mutate(Total_applications =n()) %>% 
  ungroup() %>%
  filter(Total_applications>1 & Total_applications<4) #Note, I looked that the data and in  138 cases the voter didn't have a state ID, since this filters by that, it looks like they applied that many times but those voters are all different. So this removes those people. 

multiple_applications %>% 
  count(Total_applications)
## # A tibble: 2 x 2
##   Total_applications     n
##                <int> <int>
## 1                  2 23746
## 2                  3    69

Wow 34 people applied three times!

So how many unique people actually tried to apply?

number_who_applied <- nrow(absentee_SOS)-13067-68
#Taking out the people who tried to apply two and three times

number_who_applied
## [1] 1777442

Note this doesn’t count ballots cast in person.

So we have here 265,832 non-UOCAVA Ballots cast in person, and 16 UOCAVA ballots cast in person. Plus 1,763,780 who tried to apply for a ballot. How many is that total?

total_applied_or_cast_in_person <- sum(number_who_applied,265832,16)

total_applied_or_cast_in_person
## [1] 2043290

(Some people may have had their application rejected and gone in person to vote.)

According to the Secretary of State’s website, data from the march election shows there were 1,834,465 total ballots cast in the last election.

(Note: This includes people who were sent a ballot but then didn’t return it, mail that never reached the BOE, etc.)

How many were sent too late?

First, let’s look at the run date for abenstee ballot app recieved:

summary(absentee_SOS$av_app_recd_date)
##         Min.      1st Qu.       Median         Mean      3rd Qu. 
## "0020-03-31" "2020-03-13" "2020-04-06" "2020-03-26" "2020-04-14" 
##         Max.         NA's 
## "3200-04-11"         "76"

There’s a few typos in here that are messing up our calculations. Which is expected in a dataset this large. Probably no one applied for their absentee ballot in the year 20 or 3200.

Finding data outliers

absentee_data_outliers <-  absentee_SOS %>% 
  filter(is.na(av_app_recd_date)| av_app_recd_date<"2019-01-01"|av_app_recd_date>"2020-04-26")

So For our purposes here, and because they are really small, we are going to eliminate them.

absentee_SOS <- absentee_SOS %>% 
  filter(av_app_recd_date>"2020-01-01"&av_app_recd_date<"2020-04-26")#based on looking at probable outliers

When did people apply for ballots?

ggplot(absentee_SOS, aes(x=av_app_recd_date))+
  geom_bar()+
  labs(title = "Date Absentee Ballot Applications Received", subtitle = "Ohio 2020 Primary Election")+
  ggthemes::theme_economist()

Now we’re comparing that to the final post-election report which we got from another FOIA from the SOS.

setwd("~/Code/Ballots2020/Responsive Records/Post-Election Report")
post_election_report <- read_excel("March 2020 Absentee.xlsx")

There’s conveniently a totals row (and a list of each county)

datatable(head(post_election_report, 1))

This report deals only with ballots and not with applications. Per our discussion with the Secretary of State’s office, they don’t usually mention failed applications because they have not traditionally reported that and some people correct with a new application.

What things does the post-election report measure?

colnames(post_election_report)
##   [1] "County"                                                                     
##   [2] "Non-UOCAVA Ballots Transmitted"                                             
##   [3] "Non-UOCAVA Ballots Cast In-Person"                                          
##   [4] "Non-UOCAVA Ballots Transmitted and Cast In Person"                          
##   [5] "Returned By Voters And Submitted For Counting"                              
##   [6] "Returned As Undeliverable"                                                  
##   [7] "Returned Spoiled Or Replaced Ballots"                                       
##   [8] "Neither Returned Undeliverable Nor Returned By The Voter"                   
##   [9] "Returned By Voters And Submitted For Counting And Rejected"                 
##  [10] "Non-UOCAVA Ballots Counted"                                                 
##  [11] "Non-UOCAVA Ballots Counted Transmitted"                                     
##  [12] "Non-UOCAVA Ballots Counted In Person"                                       
##  [13] "Non-UOCAVA Ballots Rejected"                                                
##  [14] "Non-UOCAVA Ballots Rejected Transmitted"                                    
##  [15] "Non-UOCAVA Ballots Rejected In Person"                                      
##  [16] "Ballot Not Received On Time Missed Deadline - Transmitted"                  
##  [17] "No Voter Signature - Transmitted"                                           
##  [18] "Non-Matching Signature - Transmitted"                                       
##  [19] "Missing or Non-Matching name - Transmitted"                                 
##  [20] "Returned Without ID Envelope - Transmitted"                                 
##  [21] "Voter ID Envelope Contains Insufficient Information - Transmitted"          
##  [22] "Multiple Ballots Returned In One Envelope  - Transmitted"                   
##  [23] "Voter Deceased Prior To Date Ballot Was Cast  - Transmitted"                
##  [24] "First Time Voter Without Proper Identification"                             
##  [25] "Non First Time Voter Without Proper Identification  - Transmitted"          
##  [26] "Stub A Detached - Transmitted"                                              
##  [27] "Ballot Not Received On Time Missed Deadline - In Person"                    
##  [28] "No Voter Signature - In Person"                                             
##  [29] "Non Matching Signature - In Person"                                         
##  [30] "Missing or Non- Matching Name - In Person"                                  
##  [31] "Returned without ID envelope - In Person"                                   
##  [32] "Voter ID Envelope Contains Insufficient Information - In Person"            
##  [33] "Multiple Ballots Returned In One Envelope - In Person"                      
##  [34] "Voter Deceased Prior to Date Ballot Was Cast - In Person"                   
##  [35] "First Time Voter Without Proper Identification - In Person"                 
##  [36] "Non First Time Voter Without Proper Identification - In Person"             
##  [37] "Stub A Detached - In Person"                                                
##  [38] "Ballot Not Received On Time Missed Deadline - Total"                        
##  [39] "No Voter Signature - Total"                                                 
##  [40] "Non Matching Signature - Total"                                             
##  [41] "Missing or Non-Matching Name - Total"                                       
##  [42] "Returned without ID Envelope - Total"                                       
##  [43] "Voter ID Envelope Contains Insufficient Information - Total"                
##  [44] "Multiple Ballots Returned in One Envelope - Total"                          
##  [45] "Voter Deceased Prior to Date Ballot Was Cast - Total"                       
##  [46] "First Time Voter Without Proper Identification - Total"                     
##  [47] "Non First Time voter without Proper Identification - Total"                 
##  [48] "Stub A Detached  - Total"                                                   
##  [49] "Non-UOCAVA Rejected - Total"                                                
##  [50] "Non UOCAVA Rejected - In Person"                                            
##  [51] "Non UOCAVA Rejected - Transmitted"                                          
##  [52] "FPCAs Received"                                                             
##  [53] "FPCAs Received Valid"                                                       
##  [54] "FPCAs Received Military Valid"                                              
##  [55] "FPCAs Received Overseas Valid"                                              
##  [56] "FPCAs Received Invalid"                                                     
##  [57] "FPCAs Received Military Invalid"                                            
##  [58] "FPCAs Received Overseas Invalid"                                            
##  [59] "FPCAs Rejected Past Deadline"                                               
##  [60] "Other UOCAVA Applications Received"                                         
##  [61] "Other UOCAVA Applications Received Valid"                                   
##  [62] "Other UOCAVA Applications Received Military Valid"                          
##  [63] "Other UOCAVA Applications Received Overseas Valid"                          
##  [64] "Other UOCAVA Applications Received Invalid"                                 
##  [65] "Other UOCAVA Applications Received Military Invalid"                        
##  [66] "Other UOCAVA Applications Received Overseas Invalid"                        
##  [67] "Total UOCAVA Applications Received"                                         
##  [68] "UOCAVA Transmitted"                                                         
##  [69] "UOCAVA Transmitted Military"                                                
##  [70] "UOCAVA Transmitted Overseas"                                                
##  [71] "UOCAVA Transmitted via Mail"                                                
##  [72] "UOCAVA Military Transmitted via Mail"                                       
##  [73] "UOCAVA Overseas Transmitted via Mail"                                       
##  [74] "UOCAVA Transmitted via Email"                                               
##  [75] "UOCAVA Military Transmitted via Email"                                      
##  [76] "UOCAVA Overseas Transmitted via Email"                                      
##  [77] "UOCAVA Transmitted via Fax"                                                 
##  [78] "UOCAVA Military Transmitted via Fax"                                        
##  [79] "UOCAVA Overseas Transmitted via Fax"                                        
##  [80] "UOCAVA Cast In Person"                                                      
##  [81] "UOCAVA Ballots Returned By Voters"                                          
##  [82] "UOCAVA Ballots Returned As Undeliverable"                                   
##  [83] "UOCAVA Ballots Returned As Undeliverable via mail"                          
##  [84] "UOCAVA Ballots Returned As Undeliverable via email"                         
##  [85] "UOCAVA Ballots Returned As Undeliverable via fax"                           
##  [86] "UOCAVA Ballots Spoiled Or Replaced Ballots"                                 
##  [87] "UOCAVA Ballots Neither Returned Undeliverable Nor Returned By The Voter"    
##  [88] "Total UOCAVA Ballots Returned And Not Returned"                             
##  [89] "UOCAVA Returned - Total FWAB and Non-FWAB"                                  
##  [90] "UOCAVA Returned Military Non-FWAB Total"                                    
##  [91] "UOCAVA Returned Military FWAB Total"                                        
##  [92] "UOCAVA Returned Overseas Non-FWAB Total"                                    
##  [93] "UOCAVA Returned Overseas FWAB Total"                                        
##  [94] "UOCAVA Returned Non-FWAB Total"                                             
##  [95] "UOCAVA Returned FWAB - Total"                                               
##  [96] "UOCAVA Returned - Total"                                                    
##  [97] "UOCAVA Ballots Counted"                                                     
##  [98] "UOCAVA Ballots Counted Military Non-FWAB"                                   
##  [99] "UOCAVA Ballots Counted Military FWAB"                                       
## [100] "UOCAVA Ballots Counted Overseas Non-FWAB"                                   
## [101] "UOCAVA Ballot Counted Overseas FWAB"                                        
## [102] "UOCAVA Ballots Counted - Total Non-FWAB"                                    
## [103] "UOCAVA Ballots Counted - Total FWAB"                                        
## [104] "UOCAVA Ballots Counted - Total"                                             
## [105] "UOCAVA Non FWAB and FWAB Rejected - Total"                                  
## [106] "UOCAVA Non FWAB Rejected Not Received On Time Missed Deadline"              
## [107] "UOCAVA FWAB Ballot Not Received On Time Missed Deadline"                    
## [108] "UOCAVA Total Rejected Not Received On Time Missed Deadline"                 
## [109] "UOCAVA Non FWAB Rejected No Voter Signature"                                
## [110] "UOCAVA FWAB Rejected No Voter Signature"                                    
## [111] "UOCAVA Total UOCAVA Rejected No Voter Singature"                            
## [112] "UOCAVA Non-FWAB Rejected Non-Matching Signature"                            
## [113] "UOCAVA FWAB Rejected Non-Matching Signature"                                
## [114] "UOCAVA Total UOCAVA Rejected Non-Matching Signature"                        
## [115] "UOCAVA Non-FWAB RejectedMissingOrNon-matchingName"                          
## [116] "UOCAVA FWAB RejectedMissingOrNon-matchingName"                              
## [117] "UOCAVA Total RejectedMissingOrNon-matchingName"                             
## [118] "UOCAVA Non FWAB Rejected Ballot Returned Without ID Envelope"               
## [119] "UOCAVA FWAB Rejected Ballot Returned Without ID Envelope"                   
## [120] "UOCAVA Total Rejected Ballot Returned Without ID Envelope"                  
## [121] "UOCAVA Non FWAB Rejected ID Envelope Contains Insufficient Information"     
## [122] "UOCAVA FWAB Rejected ID Envelope Contains Insufficient Information"         
## [123] "UOCAVA Total Rejected ID Envelope Contains Insufficient Information"        
## [124] "UOCAVA Non FWAB Rejected Multiple Ballots Returned In One Envelope"         
## [125] "UOCAVA FWAB Rejected Multiple Ballots Returned In One Envelope"             
## [126] "UOCAVA Total Rejected Multiple Ballots Returned In One Envelope"            
## [127] "UOCAVA Non-FWAB Rejected Voter Deceased Prior To Date Ballot Was Cast"      
## [128] "UOCAVA FWAB Rejected Voter Deceased Prior To Date Ballot Was Cast"          
## [129] "UOCAVA Total Rejected Voter Deceased Prior To Date Ballot Was Cast"         
## [130] "UOCAVA Non-FWAB Rejected First-time Voter Without Proper Identification"    
## [131] "UOCAVA FWAB Rejected Firsttime Voter Without Proper Identification"         
## [132] "UOCAVA Total Rejected First-time Voter Without Proper Identification"       
## [133] "UOCAVA Non-FWAB Rejected Non First-time Voter Without Proper Identification"
## [134] "UOCAVA FWAB Rejected Non-First-time Voter Without Proper Identification"    
## [135] "UOCAVA Total Rejected Non-First-time Voter Without Proper Identification"   
## [136] "UOCAVA Non FWAB Rejected Stub A Is Detached"                                
## [137] "UOCAVA FWAB Rejected Stub A Is Detached"                                    
## [138] "UOCAVA Total Rejected Stub A Is Detached"                                   
## [139] "UOCAVA Total Non FWAB Rejected"                                             
## [140] "UOCAVA Total FWAB Rejected"                                                 
## [141] "UOCAVA Non FWAB and FWAB Rejected"                                          
## [142] "UOCAVA FWABs Not Counted - Regular Ballot Counted"                          
## [143] "UOCAVA Military Non-FWAB Rejected"                                          
## [144] "UOCAVA Military FWAB Rejected"                                              
## [145] "UOCAVA Overseas Non-FWAB Rejected"                                          
## [146] "UOCAVA Overseas FWAB Rejected"                                              
## [147] "UOCAVA Non-FWAB Rejected Total"                                             
## [148] "UOCAVA FWAB Rejected Total"                                                 
## [149] "UOCAVA Rejected Total"

How many ballots were transmitted?

post_election_report_without_total <- post_election_report %>% 
  slice(-1)

sum(post_election_report_without_total$`Non-UOCAVA Ballots Transmitted`)+sum(post_election_report_without_total$`UOCAVA Transmitted`)
## [1] 1703269

How many were cast by mail?

sum(post_election_report_without_total$`Returned By Voters And Submitted For Counting`)+sum(post_election_report_without_total$`UOCAVA Returned - Total`)
## [1] 1831624

How many ballots were cast in person?

ballots_cast_in_person_total_post_election_report <- sum(post_election_report_without_total$`UOCAVA Cast In Person`)+sum(post_election_report_without_total$`Non-UOCAVA Ballots Cast In-Person`)

ballots_cast_in_person_total_post_election_report
## [1] 265848

How many were counted?

counted_total_total_post_election_report <-  sum(post_election_report_without_total$`Non-UOCAVA Ballots Counted`)+sum(post_election_report_without_total$`UOCAVA Ballots Counted`)

counted_total_total_post_election_report
## [1] 1810470

So how many people successfully voted by mail?

successful_by_mail <- counted_total_total_post_election_report-ballots_cast_in_person_total_post_election_report

successful_by_mail
## [1] 1544622

What’s the difference between the number of people who tried to apply to vote by mail versus the number who were successful?

number_who_applied-successful_by_mail
## [1] 232820

Note: some people, after not getting a ballot in the mail, could have just walked in. Or they could have sent their ballot in my mail and it was just late.

This is similar to the report that counties have to fill out at the end of each race. Page1 page2 page 3

So we’re going to combine them to see why ballots were rejected:

total_ballot_rejections <- post_election_report %>% 
  mutate(late = `Ballot Not Received On Time Missed Deadline - Total`+`UOCAVA Total Rejected Not Received On Time Missed Deadline`,
         `no signature` = `No Voter Signature - Total`+`UOCAVA Total UOCAVA Rejected No Voter Singature`,
         `signature mismatch` = `Non Matching Signature - Total` + `UOCAVA Total UOCAVA Rejected Non-Matching Signature`,
         `name missing/mismatched`=`Missing or Non-Matching Name - Total`+ `UOCAVA Total RejectedMissingOrNon-matchingName`,
         `returned without ID envelope`=`Returned without ID Envelope - Total`+ `UOCAVA Total Rejected Ballot Returned Without ID Envelope`,
         `Voter ID envelope contains insufficient info` =`Voter ID Envelope Contains Insufficient Information - Total`+  `UOCAVA Total Rejected ID Envelope Contains Insufficient Information`,
         `Multiple Ballots Returned in one envelope` = `Multiple Ballots Returned in One Envelope - Total`+ `UOCAVA Total Rejected Multiple Ballots Returned In One Envelope`,
         `Voter deceased prior to date ballot was cast` = `Voter Deceased Prior to Date Ballot Was Cast - Total`  +  `UOCAVA Total Rejected Voter Deceased Prior To Date Ballot Was Cast`,
         `First time voter without proper ID` =`First Time Voter Without Proper Identification - Total` + `UOCAVA Total Rejected First-time Voter Without Proper Identification`,
         `Repeat voter without proper ID` =`Non First Time voter without Proper Identification - Total` + `UOCAVA Total Rejected Non-First-time Voter Without Proper Identification`,
         `detatched stub`= `Stub A Detached  - Total` + `UOCAVA Total Rejected Stub A Is Detached`) %>%
  slice(1) %>%  #taking out the total row or we will accidentally total that twice
  select(-(`County`:`UOCAVA Rejected Total`)) %>% 
  tidyr::gather("Rejection Reason","Total") %>% 
  arrange(Total)

Let’s look at this visually

ggplot(total_ballot_rejections, aes(x=reorder(`Rejection Reason`,-Total), y=Total))+
  geom_bar(stat="identity")+
  labs(title = "Total Ballot Rejections", subtitle = "Ohio 2020  Primary Election. Does not include applications.")+
  theme(axis.text.x= element_text(angle = 45, vjust = 1, hjust=1.1))+
  geom_text(aes(`Rejection Reason`, Total + 700, label=Total))

What do rejections look like in each county?

ballot_rejections <- post_election_report %>% 
  mutate(late = `Ballot Not Received On Time Missed Deadline - Total`+`UOCAVA Total Rejected Not Received On Time Missed Deadline`,
         `no signature` = `No Voter Signature - Total`+`UOCAVA Total UOCAVA Rejected No Voter Singature`,
         `signature mismatch` = `Non Matching Signature - Total` + `UOCAVA Total UOCAVA Rejected Non-Matching Signature`,
         `name missing/mismatched`=`Missing or Non-Matching Name - Total`+ `UOCAVA Total RejectedMissingOrNon-matchingName`,
         `returned without ID envelope`=`Returned without ID Envelope - Total`+ `UOCAVA Total Rejected Ballot Returned Without ID Envelope`,
         `Voter ID envelope contains insufficient info` =`Voter ID Envelope Contains Insufficient Information - Total`+  `UOCAVA Total Rejected ID Envelope Contains Insufficient Information`,
         `Multiple Ballots Returned in one envelope` = `Multiple Ballots Returned in One Envelope - Total`+ `UOCAVA Total Rejected Multiple Ballots Returned In One Envelope`,
         `Voter deceased prior to date ballot was cast` = `Voter Deceased Prior to Date Ballot Was Cast - Total`  +  `UOCAVA Total Rejected Voter Deceased Prior To Date Ballot Was Cast`,
         `First time voter without proper ID` =`First Time Voter Without Proper Identification - Total` + `UOCAVA Total Rejected First-time Voter Without Proper Identification`,
         `Repeat voter without proper ID` =`Non First Time voter without Proper Identification - Total` + `UOCAVA Total Rejected Non-First-time Voter Without Proper Identification`,
         `detatched stub`= `Stub A Detached  - Total` + `UOCAVA Total Rejected Stub A Is Detached`) %>%
  slice(-1) %>%  #taking out the total row or we will accidentally total that twice
  select(-(`Non-UOCAVA Ballots Transmitted`:`UOCAVA Rejected Total`))

What were the rejection reasons by county?

Counties in Ohio:

counties <- ("
Adams
Allen
Ashland
Ashtabula
Athens
Auglaize
Belmont
Brown
Butler
Carroll
Champaign
Clark
Clermont
Clinton
Columbiana
Coshocton
Crawford
Cuyahoga
Darke
Defiance
Delaware
Erie
Fairfield
Fayette
Franklin
Fulton
Gallia
Geauga
Greene
Guernsey
Hamilton
Hancock
Hardin
Harrison
Henry
Highland
Hocking
Holmes
Huron
Jackson
Jefferson
Knox
Lake
Lawrence
Licking
Logan
Lorain
Lucas
Madison
Mahoning
Marion
Medina
Meigs
Mercer
Miami
Monroe
Montgomery
Morgan
Morrow
Muskingum
Noble
Ottawa
Paulding
Perry
Pickaway
Pike
Portage
Preble
Putnam
Richland
Ross
Sandusky
Scioto
Seneca
Shelby
Stark
Summit
Trumbull
Tuscarawas
Union
VanWert
Vinton
Warren
Washington
Wayne
Williams
Wood
Wyandot")

To get a nice list of ohio counties in the format we want.

counties <- counties %>% 
  str_replace_all("\\n"," ") %>% 
  str_split(boundary("word"))

To covert it to a vector

counties <- unlist(counties)

Looking at rejections by county

county_ballot_rejection <- function(Yo.County){ballot_rejections %>% 
    filter(County==Yo.County) %>%
    select(-1) %>% 
    tidyr::gather("Rejection Reason","Total") %>%
    ggplot(aes(x=reorder(`Rejection Reason`,-Total), y=Total))+
    geom_bar(stat="identity")+
    theme(axis.text.x= element_text(angle = 45, vjust = 1, hjust=1.1),axis.title.x.bottom =element_text(size = 0))+
    geom_text(aes(`Rejection Reason`, Total+5, label=Total))+
    labs(title=paste(Yo.County,"County Ballots Rejected 2020 Ohio Primary"),subtitle=("Note: Only Includes Ballots, Not Ballot Applications Rejected"))
}

map(counties, county_ballot_rejection)
## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

## 
## [[11]]

## 
## [[12]]

## 
## [[13]]

## 
## [[14]]

## 
## [[15]]

## 
## [[16]]

## 
## [[17]]