vdmtools.utils.DataframeUtils

class vdmtools.utils.DataframeUtils

Utility class for pandas DataFrames containing VdM scan data.

Methods

filter_by_cov_status(data, cov_status)

Filter the given DataFrame by the covariance matrix status.

filter_by_fit_status(data, fit_status)

Filter the given DataFrame by the fit status.

filter_by_status(data, ...)

Filter the given DataFrame by the value of a status column.

match_bcids(dataframe1, dataframe2)

Match BCIDs of two DataFrames.

static filter_by_cov_status(data: DataFrame, cov_status: int) → DataFrame

Filter the given DataFrame by the covariance matrix status.

Parameters:
  • data (pd.DataFrame) – The DataFrame to filter.

  • cov_status (int) – The covariance matrix status to filter by. Either 3 (good) or 2 (bad).

Returns:

The filtered DataFrame.

Return type:

pd.DataFrame

Notes

The provided DataFrame must contain the columns “covStatus_X” and “covStatus_Y”.
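
As a rough illustration of what such a filter amounts to, the following plain-pandas sketch keeps the rows whose covariance status matches in both planes. It assumes that a row is kept only when both “covStatus_X” and “covStatus_Y” equal the requested value; the actual vdmtools implementation may combine the planes differently. The helper name keep_cov_status is hypothetical and not part of vdmtools.

```python
import pandas as pd


def keep_cov_status(data: pd.DataFrame, cov_status: int) -> pd.DataFrame:
    """Hypothetical stand-in; assumes both planes must match cov_status."""
    mask = (data["covStatus_X"] == cov_status) & (data["covStatus_Y"] == cov_status)
    return data[mask]


# Toy scan data indexed by BCID (values invented for illustration).
scan = pd.DataFrame(
    {"covStatus_X": [3, 3, 2], "covStatus_Y": [3, 2, 3]},
    index=pd.Index([1, 2, 3], name="BCID"),
)
good = keep_cov_status(scan, 3)  # only BCID 1 has status 3 in both planes
```

Under this assumption, a bunch with a good covariance matrix in only one plane is dropped.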

static filter_by_fit_status(data: DataFrame, fit_status: int) → DataFrame

Filter the given DataFrame by the fit status.

Parameters:
  • data (pd.DataFrame) – The DataFrame to filter.

  • fit_status (int) – The fit status to filter by. Either 0 (good) or 4 (bad).

Returns:

The filtered DataFrame.

Return type:

pd.DataFrame

Notes

The provided DataFrame must contain the columns “fitStatus_X” and “fitStatus_Y”.
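
Because the filter relies on both columns being present, it can be useful to check for them before calling it. The sketch below is one way to do that in plain pandas; the helper missing_fit_columns and the constant REQUIRED are hypothetical names, not part of vdmtools.

```python
import pandas as pd

# Columns the note above says must be present.
REQUIRED = ("fitStatus_X", "fitStatus_Y")


def missing_fit_columns(data: pd.DataFrame) -> list:
    """Return the required fit-status columns absent from data (hypothetical helper)."""
    return [name for name in REQUIRED if name not in data.columns]


# Toy frame that lacks the Y-plane column.
scan = pd.DataFrame({"fitStatus_X": [0, 4]})
missing = missing_fit_columns(scan)  # "fitStatus_Y" is absent here
```

An empty list means the DataFrame satisfies the column requirement.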

static filter_by_status(data: DataFrame, status_column_with_no_plane: str, status: int) → DataFrame

Filter the given DataFrame by the value of a status column.

Parameters:
  • data (pd.DataFrame) – The DataFrame to filter.

  • status_column_with_no_plane (str) – The name of the status column without the plane suffix, e.g. “fitStatus” for the columns “fitStatus_X” and “fitStatus_Y”.

  • status (int) – The status to filter by. Either 0 (good) or 1 (bad).

Returns:

The filtered DataFrame.

Return type:

pd.DataFrame
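
A minimal sketch of how a generic, plane-agnostic filter could be written in plain pandas. It rests on two assumptions that the documentation above does not confirm: that the per-plane column names are formed by appending “_X” and “_Y” to the base name, and that a row is kept only when both planes carry the requested status. The helper keep_status is hypothetical.

```python
import pandas as pd


def keep_status(data: pd.DataFrame, base: str, status: int) -> pd.DataFrame:
    """Hypothetical sketch: build '<base>_X' / '<base>_Y' and require both to match."""
    columns = [f"{base}_{plane}" for plane in ("X", "Y")]
    # Element-wise comparison, then require every plane column to match per row.
    mask = (data[columns] == status).all(axis=1)
    return data[mask]


# Toy data (values invented for illustration).
scan = pd.DataFrame({"fitStatus_X": [0, 0, 4], "fitStatus_Y": [0, 4, 4]})
good = keep_status(scan, "fitStatus", 0)  # first row only
bad = keep_status(scan, "fitStatus", 4)   # last row only
```

With a helper of this shape, the covariance and fit status filters reduce to calls with the base names “covStatus” and “fitStatus”.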

static match_bcids(dataframe1: DataFrame, dataframe2: DataFrame) → Tuple[DataFrame, DataFrame]

Match BCIDs of two DataFrames.

Parameters:
  • dataframe1 (pd.DataFrame) – First DataFrame.

  • dataframe2 (pd.DataFrame) – Second DataFrame.

Returns:

Tuple containing two DataFrames with matching BCIDs in their index.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]

Notes

This function expects both DataFrames to have the BCID values as their index. It returns two DataFrames restricted to the BCIDs that appear in both.

Examples

>>> import pandas as pd
>>> df1 = pd.DataFrame({"BCID": [1, 2, 4, 5, 7], "Values": [1, 2, 3, 4, 5]}).set_index("BCID")
>>> df1
      Values
BCID
1          1
2          2
4          3
5          4
7          5
>>> df2 = pd.DataFrame({"BCID": [1, 2, 3, 6, 7], "Values": [1, 2, 3, 4, 5]}).set_index("BCID")
>>> df2
      Values
BCID
1          1
2          2
3          3
6          4
7          5
>>> from vdmtools.utils import DataframeUtils
>>> df1, df2 = DataframeUtils.match_bcids(df1, df2)
>>> df1
      Values
BCID
1          1
2          2
7          5
>>> df2
      Values
BCID
1          1
2          2
7          5
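
The matching shown above can be reproduced without vdmtools by intersecting the two indexes. The following is a sketch of that idea, not necessarily how match_bcids is implemented; the function name intersect_on_index is hypothetical.

```python
import pandas as pd


def intersect_on_index(left: pd.DataFrame, right: pd.DataFrame):
    """Hypothetical sketch: keep only rows whose index value occurs in both frames."""
    common = left.index.intersection(right.index)
    return left.loc[common], right.loc[common]


# Same input frames as in the example above.
df1 = pd.DataFrame({"BCID": [1, 2, 4, 5, 7], "Values": [1, 2, 3, 4, 5]}).set_index("BCID")
df2 = pd.DataFrame({"BCID": [1, 2, 3, 6, 7], "Values": [1, 2, 3, 4, 5]}).set_index("BCID")
m1, m2 = intersect_on_index(df1, df2)  # both now hold BCIDs 1, 2 and 7
```

Both returned frames share the same index, so they can be compared or combined row by row.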