Module epiclass.utils.classification_merging_utils
Utility functions for merging classification results.
Functions
def clean_format(x: object) ‑> str
-
Format a value to string, removing decimal points for whole numbers.
Args
x
- Any value that needs string formatting
Returns
Formatted string representation of the value
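The behaviour described above can be approximated as follows. This is a minimal sketch, not the module's implementation: the name `clean_format_sketch` is hypothetical, and it assumes that "whole numbers" means floats with no fractional part.

```python
def clean_format_sketch(x: object) -> str:
    """Hypothetical stand-in for clean_format, for illustration only."""
    # Assumption: only floats such as 3.0 need their decimal point removed.
    if isinstance(x, float) and x.is_integer():
        return str(int(x))
    # Everything else (non-whole floats, strings, ints, ...) is passed to str().
    return str(x)


print(clean_format_sketch(3.0))    # 3
print(clean_format_sketch(3.5))    # 3.5
print(clean_format_sketch("abc"))  # abc
```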
def merge_dataframes(df1: pd.DataFrame, df2: pd.DataFrame, on: str = 'md5sum', verbose: bool = False) ‑> pandas.core.frame.DataFrame
-
Merge two DataFrames by concatenating along the given column. If that is not possible, it attempts to merge on md5sum, filename. The merge aligns common columns and appends non-common columns.
Columns with the same name are combined using ';' as the value separator.
Parameters
df1 (pd.DataFrame)
- The first DataFrame
df2 (pd.DataFrame)
- The second DataFrame
on (str, optional)
- The column to merge on. Defaults to "md5sum".
verbose (bool, optional)
- Whether to print verbose output. Defaults to False.
Raises
ValueError
- If no merge is possible.
Returns
pd.DataFrame
- Merged DataFrame. The index name is preserved if it was the same in both DataFrames.
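The ';' combination of same-named columns can be illustrated with plain pandas. This is only a sketch under stated assumptions, not the module's code: the function name `merge_on_key_sketch`, the outer join, and the temporary `_x`/`_y` suffixes are all hypothetical choices, and the fallback to md5sum, filename is not reproduced.

```python
import pandas as pd


def merge_on_key_sketch(df1: pd.DataFrame, df2: pd.DataFrame, on: str = "md5sum") -> pd.DataFrame:
    """Hypothetical illustration of combining same-named columns with ';'."""
    if on not in df1.columns or on not in df2.columns:
        raise ValueError(f"Cannot merge: column {on!r} is missing from an input.")

    # Columns present in both inputs (other than the key) would collide.
    shared = [c for c in df1.columns if c in df2.columns and c != on]

    # Assumption: an outer join, so rows found in only one input are kept.
    merged = df1.merge(df2, on=on, how="outer", suffixes=("_x", "_y"))

    for col in shared:
        left, right = f"{col}_x", f"{col}_y"
        # Join the non-null values of the two colliding columns with ';'.
        merged[col] = merged[[left, right]].apply(
            lambda row: ";".join(row.dropna().astype(str)), axis=1
        )
        merged = merged.drop(columns=[left, right])
    return merged


df1 = pd.DataFrame({"md5sum": ["a", "b"], "label": ["x", "y"]})
df2 = pd.DataFrame({"md5sum": ["b", "c"], "label": ["z", "w"], "score": [0.1, 0.2]})
print(merge_on_key_sketch(df1, df2))
```

In this sketch the row with md5sum "b" ends up with the label "y;z", while rows found in only one input keep their single value.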
def merge_two_columns(df: pd.DataFrame, col1: str, col2: str) ‑> pandas.core.frame.DataFrame
-
Update the dataframe IN-PLACE by merging the values of col1 and col2, only if they are complementary, and return it.
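One possible reading of "complementary" is that no row has a value in both columns. The sketch below illustrates that reading only; the name `merge_if_complementary` is hypothetical, and storing the result in col1, dropping col2, and leaving the frame untouched when values overlap are all assumptions rather than documented behaviour.

```python
import pandas as pd


def merge_if_complementary(df: pd.DataFrame, col1: str, col2: str) -> pd.DataFrame:
    """Hypothetical sketch: merge col2 into col1 when no row fills both."""
    both_filled = df[col1].notna() & df[col2].notna()
    if both_filled.any():
        # Assumption: overlapping values mean "not complementary",
        # so the frame is returned unchanged.
        return df

    # Take col1 where present, otherwise col2, then drop col2 (both in place).
    df[col1] = df[col1].combine_first(df[col2])
    df.drop(columns=col2, inplace=True)
    return df


df = pd.DataFrame({"a": ["x", None], "b": [None, "y"]})
result = merge_if_complementary(df, "a", "b")
print(result)
```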
def remove_pred_vector(df: pd.DataFrame, verbose: bool = True) ‑> pandas.core.frame.DataFrame
-
Remove the prediction vector from a result dataframe.
If the "files/epiRR" column does not exist, everything after the "1rst/2nd prob ratio" column is removed. Any metadata located after that column is removed as well.
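The fallback case, dropping everything after the "1rst/2nd prob ratio" column, can be sketched with positional slicing. This is an illustration only: the helper name `drop_after_column` and the example column names other than "1rst/2nd prob ratio" are hypothetical, column labels are assumed to be unique, and the case where "files/epiRR" exists is not covered.

```python
import pandas as pd


def drop_after_column(df: pd.DataFrame, last_kept: str = "1rst/2nd prob ratio") -> pd.DataFrame:
    """Hypothetical sketch: keep columns up to and including `last_kept`."""
    # get_loc returns an integer position when column labels are unique.
    cutoff = df.columns.get_loc(last_kept)
    return df.iloc[:, : cutoff + 1].copy()


df = pd.DataFrame(
    {
        "md5sum": ["a", "b"],
        "1rst/2nd prob ratio": [3.2, 1.1],
        "class_a": [0.7, 0.5],  # hypothetical prediction vector columns
        "class_b": [0.3, 0.5],
    }
)
trimmed = drop_after_column(df)
print(list(trimmed.columns))  # ['md5sum', '1rst/2nd prob ratio']
```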
def sjoin(x)
-
Join column values, skipping any that are null.
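A helper of this kind is typically applied row-wise with `DataFrame.apply`. The following is a sketch rather than the module's code: the name `sjoin_sketch` is hypothetical, and the ';' separator is an assumption borrowed from the merge_dataframes description.

```python
import pandas as pd


def sjoin_sketch(x: pd.Series, sep: str = ";") -> str:
    """Hypothetical sketch: join the non-null values of a row."""
    return sep.join(x[x.notna()].astype(str))


df = pd.DataFrame({"a": ["x", None], "b": ["y", "z"]})
joined = df.apply(sjoin_sketch, axis=1)
print(joined.tolist())  # ['x;y', 'z']
```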