cvr.base.CastVoteRecord

class cvr.base.CastVoteRecord(jurisdiction: str = '', state: str = '', year: str = '', date: str = '', office: str = '', notes: str = '', parser_func: Callable | None = None, parser_args: Dict | None = None, parsed_cvr: Dict | None = None, split_fields: List | None = None, disable_aggregation: bool = False)

Bases: CastVoteRecord_stats, CastVoteRecord_tables

Class that helps read, organize, and use multiple versions of the same cast vote record from an election.

__init__(jurisdiction: str = '', state: str = '', year: str = '', date: str = '', office: str = '', notes: str = '', parser_func: Callable | None = None, parser_args: Dict | None = None, parsed_cvr: Dict | None = None, split_fields: List | None = None, disable_aggregation: bool = False) → None

Constructor for CastVoteRecord.

Pass either both parser_func and parser_args, or an already parsed CVR as parsed_cvr.

The constructor parses the CVR file, if needed, and computes default ballot statistics.

Parameters:

jurisdiction (str, optional) – Name of election jurisdiction, defaults to “”
state (str, optional) – State, or broader jurisdiction, of election, defaults to “”
year (str, optional) – Year of election, defaults to “”
date (str, optional) – Date of election in format mm/dd/yyyy, defaults to “”
office (str, optional) – Office which the election is deciding, defaults to “”
notes (str, optional) – Any extra notes to store about the election, defaults to “”
parser_func (Optional[Callable], optional) – A function from parsers.py or a custom function with the same signature and return type, defaults to None
parser_args (Optional[Dict], optional) – Dictionary of arguments and their values which are unrolled and passed to chosen parser_func. Works like **kwargs. Defaults to None.
parsed_cvr (Optional[Dict], optional) – A CVR represented as a dictionary of lists, all of equal length. The only mandatory key is ‘ranks’, which must contain a list of lists. These inner lists must all be the same length, and each must contain the string names of candidates, or special BallotMarks constants (SKIPPED, OVERVOTE, WRITEIN), in ranked order. One other special, optional CVR key is ‘weight’, which is used internally to assign a weight to each ballot. All other dictionary keys are optional and arbitrary and can be used to represent other ballot information, such as ballot IDs or precinct details. Defaults to None
split_fields (Optional[List], optional) – Only relevant for calculating split statistics. A list of CVR field names. Statistics will be calculated for each subcategory in a CVR field. Defaults to None
disable_aggregation (bool, optional) – Advanced option. If True, the CVR is not represented internally in aggregated form and remains as parsed. Defaults to False. Internal aggregation of the CVR is meant to speed up tabulation and statistics calculation, but can be incompatible with some RCV variants, such as Cambridge’s STV whole ballot transfer variant.
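
As an illustration of the two construction routes and of the parsed_cvr layout described above, the following is a minimal sketch in plain Python. It does not call the library: `toy_parser` and `check_parsed_cvr` are hypothetical helpers written for this example, and the strings standing in for the BallotMarks constants are placeholders, since the real constant values are not shown in this reference.

```python
from typing import Dict, List

# Placeholder stand-ins for BallotMarks.SKIPPED / OVERVOTE / WRITEIN.
# Assumption: the real constant values may differ.
SKIPPED, OVERVOTE, WRITEIN = "skipped", "overvote", "writein"


def toy_parser(n_copies: int = 1) -> Dict[str, List]:
    """Hypothetical parser: returns a CVR as a dictionary of equal-length lists."""
    ranks = [
        ["A", "B", "C"],
        ["B", SKIPPED, "A"],
        [OVERVOTE, "C", WRITEIN],
    ] * n_copies
    return {
        "ranks": ranks,                              # mandatory key
        "weight": [1] * len(ranks),                  # optional special key
        "precinct": ["P1", "P2", "P1"] * n_copies,   # arbitrary extra field
    }


def check_parsed_cvr(cvr: Dict[str, List]) -> None:
    """Check the shape constraints described for parsed_cvr."""
    if "ranks" not in cvr:
        raise ValueError("'ranks' is the only mandatory key")
    n_ballots = len(cvr["ranks"])
    if any(len(column) != n_ballots for column in cvr.values()):
        raise ValueError("all lists must have the same length")
    if len({len(ballot) for ballot in cvr["ranks"]}) > 1:
        raise ValueError("every ballot must have the same number of ranks")


# parser_args is unrolled into the parser, like **kwargs.
parser_args = {"n_copies": 2}
parsed_cvr = toy_parser(**parser_args)
check_parsed_cvr(parsed_cvr)
```

A dictionary of this shape is what would be passed as parsed_cvr. Alternatively, a parser function and its argument dictionary would be passed as parser_func and parser_args.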

Methods

`__init__`([jurisdiction, state, year, date, ...])	Constructor for CastVoteRecord.
`add_rule_set`(set_name, set_dict)	Add a new rule set used to create a modified version of the CVR.
`calc_annotated_cvr_table`(cvr)	Static method wrapper around get_annotated_cvr_table object method.
`calc_condorcet_tables`(cvr)	Static method wrapper around get_condorcet_tables object method.
`calc_crossover_tables`(cvr)	Static method wrapper around get_crossover_tables object method.
`calc_cumulative_ranking_tables`(cvr)	Static method wrapper around get_cumulative_ranking_tables object method.
`calc_first_second_tables`(cvr)	Static method wrapper around get_first_second_tables object method.
`calc_rank_usage_table`(cvr)	Static method wrapper around get_rank_usage_table object method.
`calc_stats`(cvr[, keep_decimal_type, ...])	Static method wrapper around get_stats object method.
`get_annotated_cvr_table`()
`get_candidates`([rule_set_name])	Returns a BallotMarks object containing the unique candidate set.
`get_crossover_tables`()	Table describing co-ranking patterns between candidates.
`get_cvr_dict`([rule_set_name, disaggregate])	Return CVR as dictionary of lists.
`get_cvr_table`([table_format, disaggregate])	Return the cvr as pandas dataframe.
`get_rank_usage_table`()	Table describing rank usage patterns.
`get_stats`([keep_decimal_type, ...])	Obtain the default statistics calculated by the CastVoteRecord object.
`write_annotated_cvr_table`(cvr[, save_dir])	Static method wrapper around get_annotated_cvr_table object method that writes the table out to save_dir.
`write_condorcet_tables`(cvr[, save_dir])	Static method wrapper around get_condorcet_tables object method that writes the table out to save_dir.
`write_crossover_tables`(cvr[, save_dir])	Static method wrapper around get_crossover_tables object method that writes the table out to save_dir.
`write_cumulative_ranking_tables`(cvr, save_dir)	Static method wrapper around get_cumulative_ranking_tables object method that writes tables out to save_dir.
`write_cvr_table`(cvr, save_dir[, table_format])	Static method wrapper around get_cvr_table object method that writes CVR table out to save_dir.
`write_first_second_tables`(cvr[, save_dir])	Static method wrapper around get_first_second_tables object method that writes tables out to save_dir.
`write_rank_usage_table`(cvr[, save_dir])	Static method wrapper around get_rank_usage_table object method that writes the table out to save_dir.

add_rule_set(set_name: str, set_dict: Dict[str, bool | None]) → None

Add a new rule set used to create a modified version of the CVR.

Parameters:

set_name (str) – Unique name given to this rule set and modified CVR.
set_dict (Dict[str, Optional[bool]]) – Dictionary of boolean rule settings. Rule options are defined in BallotMarks.new_rule_set.

static calc_annotated_cvr_table(cvr: Type[CastVoteRecord]) → DataFrame

Static method wrapper around get_annotated_cvr_table object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: Pandas dataframe containing the rank column CVR table with statistics attached.
Return type: pd.DataFrame
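
The calc_* methods on this page all follow the same pattern: a static method that takes a CastVoteRecord object and delegates to the matching get_* object method. A minimal sketch of that pattern, using a hypothetical `ToyRecord` class rather than the library itself:

```python
from typing import List


class ToyRecord:
    """Hypothetical stand-in for an object exposing a get_* method."""

    def __init__(self, ranks: List[List[str]]) -> None:
        self.ranks = ranks

    def get_ballot_count(self) -> int:
        # Object method: computes a value from this instance.
        return len(self.ranks)

    @staticmethod
    def calc_ballot_count(record: "ToyRecord") -> int:
        # Static wrapper: receives the object as an argument and delegates.
        return record.get_ballot_count()


records = [
    ToyRecord([["A", "B"]]),
    ToyRecord([["B", "A"], ["A", "B"]]),
]

# The static form can be passed around as a plain function.
counts = list(map(ToyRecord.calc_ballot_count, records))
```

One practical consequence of this design is that the static form can be applied across a collection of objects without writing a lambda for each call.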

static calc_condorcet_tables(cvr: Type[CastVoteRecord]) → Tuple[DataFrame]

Static method wrapper around get_condorcet_tables object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: Tuple of pandas dataframes containing statistics on head-to-head candidate matchups. The tuple contains both count and percent formats.
Return type: Tuple[pd.DataFrame]
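
To make "head-to-head candidate matchups" concrete, here is a rough sketch of pairwise counting in plain Python. It is not the library's implementation. The function name `head_to_head` is hypothetical, and the treatment of unranked candidates (a ranked candidate is counted as preferred over an unranked one) is an assumption. The percent table is omitted because this page does not state which denominator is used.

```python
from itertools import permutations
from typing import Dict, List, Tuple

ballots = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
    ["C", "B", "A"],
]
candidates = ["A", "B", "C"]


def head_to_head(
    ballots: List[List[str]], candidates: List[str]
) -> Dict[Tuple[str, str], int]:
    """Count ballots that rank the first candidate of a pair above the second."""
    counts = {pair: 0 for pair in permutations(candidates, 2)}
    for ballot in ballots:
        position = {name: i for i, name in enumerate(ballot)}
        for first, second in counts:
            if first not in position:
                continue
            # Assumption: a ranked candidate beats an unranked one.
            if second not in position or position[first] < position[second]:
                counts[(first, second)] += 1
    return counts


counts = head_to_head(ballots, candidates)
```

In this toy contest A is ranked above C on three of the four ballots, while A and B split evenly at two ballots each.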

static calc_crossover_tables(cvr: Type[CastVoteRecord]) → Tuple[DataFrame]

Static method wrapper around get_crossover_tables object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: Tuple of pandas dataframes containing statistics on candidate ranking patterns near the top of the ballot. The tuple contains both count and percent formats.
Return type: Tuple[pd.DataFrame]

static calc_cumulative_ranking_tables(cvr: Type[CastVoteRecord]) → Tuple[DataFrame]

Static method wrapper around get_cumulative_ranking_tables object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: Cumulative ranking tables as pandas dataframes. The tuple contains both count and percent formats.
Return type: Tuple[pd.DataFrame]

static calc_first_second_tables(cvr: Type[CastVoteRecord]) → Tuple[DataFrame]

Static method wrapper around get_first_second_tables object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: First and second choice tables as pandas dataframes. The tuple contains three tables: a count table and two percentage tables, one with exhausted ballots included in the percentage calculations and one without.
Return type: Tuple[pd.DataFrame]

static calc_rank_usage_table(cvr: Type[CastVoteRecord]) → DataFrame

Static method wrapper around get_rank_usage_table object method.

Parameters: cvr (Type[CastVoteRecord]) – CastVoteRecord object.
Returns: Pandas dataframe containing rank usage statistics across all ballots and by candidate.
Return type: pd.DataFrame

static calc_stats(cvr: Type[CastVoteRecord], keep_decimal_type: bool = False, add_split_stats: bool = False, add_id_info: bool = True) → List[DataFrame]

Static method wrapper around get_stats object method.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
keep_decimal_type (bool, optional) – Return the decimal class objects used by internal calculations rather than converting them to floats, defaults to False
add_split_stats (bool, optional) – Add extra statistics calculated for each category contained in split_fields columns passed to constructor, defaults to False
add_id_info (bool, optional) – Include contest ID details in the returned dataframe, defaults to True

Returns:

A list containing a single row dataframe with statistics organized in multiple columns. If split_fields are passed, then extra rows are added for each category in the split columns.

Return type:

List[pd.DataFrame]

get_candidates(rule_set_name: str | None = None) → BallotMarks

Returns a BallotMarks object containing the unique candidate set. The only rules which affect the candidate set are ‘combine_writein_marks’ and ‘exclude_writein_marks’.

Parameters: rule_set_name (Optional[str], optional) – Name of modified CVR to return candidates from. If None, return candidates from default CVR. Defaults to None
Returns: BallotMarks object containing unique candidates.
Return type: BallotMarks

get_crossover_tables() → Tuple[DataFrame]

Table describing co-ranking patterns between candidates. For each subset of ballots organized by first choice candidate, one in each row of the table, the frequency with which all candidates appear in the top 3 ranks of ballots in that subset is calculated. For this table, ballots are used without any contest rules applied.

Returns: Tuple of two pandas dataframes. The first contains counts, the second percentages.
Return type: Tuple[pd.DataFrame]
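
The crossover calculation described above can be sketched in plain Python as follows. This is an illustration of the idea only, not the library's code. How special marks (skipped ranks, overvotes) are handled is not specified here, so the toy ballots contain none. Taking percentages relative to the size of each first-choice group is also an assumption.

```python
from collections import Counter, defaultdict

ballots = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "D", "C"],
    ["A", "B", "D", "C"],
]

# One row per first-choice candidate, one column per candidate.
top3_counts = defaultdict(Counter)
group_sizes = Counter()

for ballot in ballots:
    first_choice = ballot[0]
    group_sizes[first_choice] += 1
    # Count each candidate at most once per ballot, within the top 3 ranks.
    for name in set(ballot[:3]):
        top3_counts[first_choice][name] += 1

# Assumption: percentages are taken relative to each first-choice group.
percent = {
    first: {name: 100 * n / group_sizes[first] for name, n in row.items()}
    for first, row in top3_counts.items()
}
```

Among the three ballots with A as first choice, C appears in the top 3 ranks on two of them and D on one.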

get_cvr_dict(rule_set_name: str | None = None, disaggregate: bool = True) → Dict[str, List]

Return CVR as dictionary of lists.

Parameters:

rule_set_name (Optional[str], optional) – Name of modified CVR to return, defaults to None. If None, return default CVR.
disaggregate (bool, optional) – If True, the internally aggregated CVR is disaggregated before return. Defaults to True.

Returns:

CVR as dictionary of lists.

Return type:

Dict[str, List]
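
As a rough picture of what aggregation and disaggregation mean for a CVR stored as a dictionary of lists, the sketch below collapses identical ballots into one entry with a weight and then expands them again. It is an assumption-laden illustration. The library's internal representation, its handling of other CVR fields, and whether the original ballot order is restored are not described on this page.

```python
from collections import Counter

ranks = [
    ["A", "B"],
    ["B", "A"],
    ["A", "B"],
    ["A", "B"],
]

# Aggregate: identical rankings collapse into one entry with a weight.
tally = Counter(tuple(ballot) for ballot in ranks)
aggregated = {
    "ranks": [list(ballot) for ballot in tally],
    "weight": list(tally.values()),
}

# Disaggregate: repeat each ranking according to its (integer) weight.
disaggregated = [
    list(ballot)
    for ballot, weight in zip(aggregated["ranks"], aggregated["weight"])
    for _ in range(weight)
]
```

The disaggregated list has the same number of ballots as the input, although in this sketch identical ballots end up grouped together rather than in their original positions.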

get_cvr_table(table_format: str = 'rank', disaggregate: bool = True) → DataFrame

Return the CVR as a pandas dataframe. Two format options are available: ‘rank’ or ‘candidate’.

Parameters:

table_format (str, optional) – Two choices for the format of CVR table, “rank” or “candidate”. One ballot per row. Rank format uses rank number labels as column headers with candidate names in row cells. Candidate format puts candidate names as column headers with rank position placed in row cells. Defaults to “rank”
disaggregate (bool, optional) – If True, the internally aggregated CVR is disaggregated back to contain the same number of ballots as when it was parsed, defaults to True.

Raises:

RuntimeError – raised if an invalid table_format is passed as an argument.

Returns:

Dataframe containing CVR.

Return type:

pd.DataFrame
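
The difference between the two table formats can be illustrated with plain dictionaries, one per ballot row. This sketch does not use pandas or the library. The column labels (`rank1`, `rank2`, ...) and the use of None for unranked candidates are assumptions made for the example.

```python
ranks = [
    ["A", "B", "C"],
    ["C", "A", "B"],
]

# "rank" format: rank labels as columns, candidate names in the cells.
rank_rows = [
    {f"rank{i}": name for i, name in enumerate(ballot, start=1)}
    for ballot in ranks
]

# "candidate" format: candidate names as columns, rank positions in the cells.
candidates = sorted({name for ballot in ranks for name in ballot})
candidate_rows = [
    {
        name: (ballot.index(name) + 1 if name in ballot else None)
        for name in candidates
    }
    for ballot in ranks
]
```

Both layouts keep one ballot per row and carry the same information for ballots without repeated or special marks.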

get_rank_usage_table() → DataFrame

Table describing rank usage patterns. The mean number of rankings used and the distribution of valid rankings used are provided for all ballots (excluding undervotes and ballots starting with overvotes) as well as for ballots separated by first choice candidate. Ballots starting with overvotes are excluded because they cannot be assigned to a first choice candidate category. For this table, ballots are used without any contest rules applied. Skipped ranks, overvotes, and duplicate rankings are not counted as valid rankings.

Returns: Rank usage dataframe
Return type: pd.DataFrame
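
A minimal sketch of the counting rule described above, in plain Python. The placeholder strings for the BallotMarks constants, the helper names, and the definition of an undervote as a ballot with every rank skipped are assumptions made for this example. The per-candidate breakdown is omitted.

```python
from collections import Counter
from typing import List

# Placeholder stand-ins for BallotMarks.SKIPPED / OVERVOTE (assumed values).
SKIPPED, OVERVOTE = "skipped", "overvote"


def valid_rankings_used(ballot: List[str]) -> int:
    """Count distinct candidates ranked, ignoring skips, overvotes, duplicates."""
    seen = set()
    for mark in ballot:
        if mark in (SKIPPED, OVERVOTE):
            continue
        seen.add(mark)  # a duplicate ranking is only counted once
    return len(seen)


def is_undervote(ballot: List[str]) -> bool:
    # Assumption: an undervote is a ballot with every rank skipped.
    return all(mark == SKIPPED for mark in ballot)


ballots = [
    ["A", "B", "C"],
    ["A", "A", SKIPPED],
    ["B", SKIPPED, "C"],
    [SKIPPED, SKIPPED, SKIPPED],  # undervote: excluded
    [OVERVOTE, "A", "B"],         # starts with an overvote: excluded
]

included = [
    ballot
    for ballot in ballots
    if not is_undervote(ballot) and ballot[0] != OVERVOTE
]
used = [valid_rankings_used(ballot) for ballot in included]
mean_used = sum(used) / len(used)
distribution = Counter(used)
```

Here three ballots remain after the exclusions, using 3, 1, and 2 valid rankings respectively.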

get_stats(keep_decimal_type: bool = False, add_split_stats: bool = False, add_id_info: bool = True) → List[DataFrame]

Obtain the default statistics calculated by the CastVoteRecord object. Statistics are returned in pandas dataframe object.

Parameters:

keep_decimal_type (bool, optional) – Return the decimal class objects used by internal calculations rather than converting them to floats, defaults to False
add_split_stats (bool, optional) – Add extra statistics calculated for each category contained in split_fields columns passed to constructor, defaults to False
add_id_info (bool, optional) – Include contest ID details in the returned dataframe, defaults to True

Returns:

A list containing a single row dataframe with statistics organized in multiple columns. If split_fields are passed, then extra rows are added for each category in the split columns.

Return type:

List[pd.DataFrame]
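
The idea of split statistics, one row for the whole contest plus one extra row per category in each split field, can be sketched as below. The statistic shown (a ballot count) and the row keys `split_field`, `split_value`, and `n_ballots` are hypothetical. The actual statistics and column names produced by the library are not listed on this page.

```python
parsed_cvr = {
    "ranks": [["A", "B"], ["B", "A"], ["A", "B"]],
    "precinct": ["P1", "P2", "P1"],
}
split_fields = ["precinct"]

# One row for the whole contest ...
rows = [
    {
        "split_field": None,
        "split_value": None,
        "n_ballots": len(parsed_cvr["ranks"]),
    }
]

# ... plus one extra row per category found in each split field.
for field in split_fields:
    for value in sorted(set(parsed_cvr[field])):
        subset = [
            ballot
            for ballot, category in zip(parsed_cvr["ranks"], parsed_cvr[field])
            if category == value
        ]
        rows.append(
            {"split_field": field, "split_value": value, "n_ballots": len(subset)}
        )
```

With a single split field holding two categories, the result is three rows: the overall row followed by one row each for P1 and P2.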

static write_annotated_cvr_table(cvr: Type[CastVoteRecord], save_dir: str | Path | None = None) → None

Static method wrapper around get_annotated_cvr_table object method that writes the table out to save_dir. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Optional[Union[str, pathlib.Path]], optional) – Directory in which to write out table. Defaults to None.

static write_condorcet_tables(cvr: Type[CastVoteRecord], save_dir: str | Path | None = None) → None

Static method wrapper around get_condorcet_tables object method that writes the table out to save_dir. Two tables are written out, one containing ballot counts and one with percentages. File names used follow the pattern “{save_dir}/condorcet/{jurisdiction}_{date OR year}_{office}_{‘count’ OR ‘percent’}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Optional[Union[str, pathlib.Path]], optional) – Directory in which to write out tables. Defaults to None.

static write_crossover_tables(cvr: Type[CastVoteRecord], save_dir: str | Path | None = None) → None

Static method wrapper around get_crossover_tables object method that writes the table out to save_dir. Two tables are written out, one containing ballot counts and one with percentages. File names used follow the pattern “{save_dir}/opponent_crossover/{jurisdiction}_{date OR year}_{office}_{‘count’ OR ‘percent’}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Optional[Union[str, pathlib.Path]], optional) – Directory in which to write out tables. Defaults to None.

static write_cumulative_ranking_tables(cvr: Type[CastVoteRecord], save_dir: str | Path) → None

Static method wrapper around get_cumulative_ranking_tables object method that writes tables out to save_dir. Two tables are written out, one containing ballot counts and one with percentages. File names used follow the pattern “{save_dir}/cumulative_ranking/{jurisdiction}_{date OR year}_{office}_{‘count’ OR ‘percent’}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Union[str, pathlib.Path]) – Directory in which to write out tables.

static write_cvr_table(cvr: Type[CastVoteRecord], save_dir: str | Path, table_format: str = 'rank') → None

Static method wrapper around get_cvr_table object method that writes CVR table out to save_dir. File name used follows the pattern “{save_dir}/{jurisdiction}_{date OR year}_{office}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Union[str, pathlib.Path]) – Directory in which to write out CVR table.
table_format (str, optional) – Format in which to write out CVR. Either “rank” or “candidate”. One row per ballot. “rank” format has rank numbers as column names with candidate names in row cells. “candidate” format has candidate names as column names with rank numbers filling in row cells. Defaults to “rank”.
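
The file naming rule stated above can be sketched as follows. `clean` and `cvr_table_path` are hypothetical helpers, not library functions. Reading "date OR year" as "use the date when one is set, otherwise the year" is an assumption, as is the literal reading that spaces are dropped along with other non-alphanumeric characters.

```python
import re
from pathlib import Path
from typing import Union


def clean(component: str) -> str:
    """Remove every character that is not alphanumeric or an underscore."""
    return re.sub(r"[^0-9A-Za-z_]", "", component)


def cvr_table_path(
    save_dir: Union[str, Path],
    jurisdiction: str,
    office: str,
    date: str = "",
    year: str = "",
) -> Path:
    # Assumption: the date is used when present, otherwise the year.
    when = date if date else year
    name = "_".join(clean(part) for part in (jurisdiction, when, office))
    return Path(save_dir) / f"{name}.csv"


path = cvr_table_path("out", "San Francisco", "Mayor", date="11/05/2019")
fallback = cvr_table_path("out", "Oakland", "City Council", year="2020")
```

Under these assumptions, a date of 11/05/2019 contributes `11052019` to the file name, because the slashes are removed.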

static write_first_second_tables(cvr: Type[CastVoteRecord], save_dir: str | Path | None = None) → None

Static method wrapper around get_first_second_tables object method that writes tables out to save_dir. Three tables are written out, one containing ballot counts, one with percentages, and another with percentages excluding exhausted ballots. File names used follow the pattern “{save_dir}/first_second_choices/{jurisdiction}_{date OR year}_{office}_{‘count’ OR ‘percent’ OR ‘percent_no_exhaust’}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Optional[Union[str, pathlib.Path]], optional) – Directory in which to write out tables. Defaults to None.

static write_rank_usage_table(cvr: Type[CastVoteRecord], save_dir: str | Path | None = None) → None

Static method wrapper around get_rank_usage_table object method that writes the table out to save_dir. File names used follow the pattern “{save_dir}/rank_usage/{jurisdiction}_{date OR year}_{office}.csv”. All non-alphanumeric characters, besides underscores, are removed from file name components. Contest date is in mm/dd/yyyy format.

Parameters:

cvr (Type[CastVoteRecord]) – CastVoteRecord object.
save_dir (Optional[Union[str, pathlib.Path]], optional) – Directory in which to write out table. Defaults to None.