helpers

wadoh_raccoon.utils.helpers

Functions

Name             Description
date_format      Format Dates
gt_style         Style for GT Tables
save_raw_values  Save raw values

date_format

wadoh_raccoon.utils.helpers.date_format(col: str)

Format Dates

Convert string dates into a date column (yyyy-mm-dd). The function uses pl.coalesce to try several formats in order and keep the first one that parses. For example, it first tries m/d/y, and if that fails it tries d/m/y.

Note: Excel serial dates (for example, 45496) are not converted; they come back as None.
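
The try-one-format-then-the-next idea can be sketched in plain Python. This is only an illustration of the fallback order, not the library's implementation: it uses the standard library's datetime.strptime instead of Polars, and the format list and the name parse_first_match are assumptions made for the example.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical format list; the formats date_format actually tries may differ.
CANDIDATE_FORMATS = [
    "%Y-%m-%d",           # 2024-10-30
    "%Y-%m-%d %H:%M:%S",  # 2022-12-27 08:26:49
    "%m/%d/%Y",           # 10/20/2024 (month first)
    "%d/%m/%Y",           # 30/10/2024 (day first, tried after month first)
    "%m-%d-%Y",           # 10-30-2024
    "%B %d, %Y",          # October 30, 2024
]


def parse_first_match(value: str) -> Optional[date]:
    """Return the date from the first format that parses, else None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue  # this format did not fit; try the next one
    return None


print(parse_first_match("30/10/2024"))  # month-first fails, day-first succeeds
print(parse_first_match("45496"))       # no format fits
```

With this ordering, an ambiguous value such as 03/04/2024 is read month-first (March 4), because the first format that parses wins.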

Usage

To be applied to a string date column.

Parameters

col : str

name of a string column that contains dates

Returns

output_date : date

a date column

Examples

import polars as pl
from wadoh_raccoon.utils import helpers


df = pl.DataFrame({
    "dates": [
        "2024-10-30",           # ISO format
        "30/10/2024",           # European format (d/m/y)
        "10/20/2024",           # US format (m/d/y)
        "10-30-2024",           # US format with dashes
        "October 30, 2024",     # full month name
        "45496",                # Excel serial date (not converted)
        "2022-12-27 08:26:49",  # ISO datetime
    ]
})

helpers.gt_style(
    df
    .with_columns(
        new_date=helpers.date_format('dates')
    )
)

index  dates                new_date
0      2024-10-30           2024-10-30
1      30/10/2024           2024-10-30
2      10/20/2024           2024-10-20
3      10-30-2024           2024-10-30
4      October 30, 2024     2024-10-30
5      45496                None
6      2022-12-27 08:26:49  2022-12-27
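
Row 5 is None because Excel serial numbers are not parsed. If such values need to be recovered, one option is to convert them separately before or after calling date_format. A minimal sketch, assuming the workbook uses Excel's default 1900 date system; the helper name excel_serial_to_date is hypothetical and not part of wadoh_raccoon.

```python
from datetime import date, timedelta

# In Excel's 1900 date system, day counts are conventionally taken from
# 1899-12-30. This holds for serials 61 and above (1900-03-01 onward),
# because Excel counts a nonexistent 1900-02-29.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial: str) -> date:
    """Convert a whole-number Excel serial (given as a string) to a date."""
    return EXCEL_EPOCH + timedelta(days=int(serial))


print(excel_serial_to_date("45496"))  # 2024-07-23
```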

gt_style

wadoh_raccoon.utils.helpers.gt_style(
    df_inp: pl.DataFrame,
    title: str = '',
    subtitle: str = '',
    add_striping_inp=True,
    index_inp=True,
)

Style for GT Tables

Usage

Apply this style to a Polars DataFrame

Parameters

df_inp : pl.DataFrame

a polars dataframe

title : str = ''

a title for the table (optional)

subtitle : str = ''

a subtitle for the table (optional, must have a title if using a subtitle)

add_striping_inp : bool = True

whether to add row striping to the table (True or False)

index_inp : bool = True

whether to add a row-number column labeled index (True or False)

Returns

: GT

a GT object (great_tables table)

Examples

import polars as pl
from wadoh_raccoon.utils import helpers
df = pl.DataFrame({
    "x": [1,1,2],
    "y": [1,2,3]
})

A table with a title/subtitle:

helpers.gt_style(df_inp=df,title="My Title",subtitle="My Subtitle")
My Title
My Subtitle
index x y
0 1 1
1 1 2
2 2 3

Without a title/subtitle:

helpers.gt_style(df_inp=df)
index x y
0 1 1
1 1 2
2 2 3

Without an index:

helpers.gt_style(df_inp=df,index_inp=False)
x y
1 1
1 2
2 3

Without striping:

helpers.gt_style(df_inp=df,add_striping_inp=False)
index x y
0 1 1
1 1 2
2 2 3

save_raw_values

wadoh_raccoon.utils.helpers.save_raw_values(
    df_inp: pl.DataFrame,
    primary_key_col: str,
)

Save raw values

Usage

Converts a Polars DataFrame into a DataFrame in which all of the original columns are packed into a single struct column. This is useful for saving the raw data exactly as it was received.

Parameters

df_inp : pl.DataFrame

a polars dataframe

primary_key_col : str

column name for the primary key (submission key, not person/case key)

Returns

df : pl.DataFrame

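a dataframe with the original columns packed into a struct column (raw_inbound_submission in the example below)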

Examples

import polars as pl
from wadoh_raccoon.utils import helpers

data = pl.DataFrame({
    "lab_name": ["PHL", "MFT", "ELR","PHL"],
    "first_name": ["Alice", "Bob", "Charlie", "Charlie"],
    "last_name": ["Smith", "Johnson", "Williams", "Williams"],
    "WA_ID": [1,2,4,4]
})

received_submissions_df = (
    helpers.save_raw_values(df_inp=data, primary_key_col="WA_ID")
)

helpers.gt_style(data)

index  lab_name  first_name  last_name  WA_ID
0      PHL       Alice       Smith      1
1      MFT       Bob         Johnson    2
2      ELR       Charlie     Williams   4
3      PHL       Charlie     Williams   4

helpers.gt_style(received_submissions_df)

index  submission_number  internal_create_date  raw_inbound_submission
0      1                  2025-03-21            {"PHL","Alice","Smith",1}
1      2                  2025-03-21            {"MFT","Bob","Johnson",2}
2      4                  2025-03-21            {"ELR","Charlie","Williams",4}
3      4                  2025-03-21            {"PHL","Charlie","Williams",4}
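
The shape of that output can be illustrated with a plain-Python sketch that works on a list of row dictionaries instead of a Polars DataFrame. It is not the library's code. The output field names are taken from the example above, the function name pack_rows is hypothetical, and treating internal_create_date as the date the function is run is an assumption.

```python
from datetime import date
from typing import Any, Dict, List


def pack_rows(rows: List[Dict[str, Any]], primary_key_col: str) -> List[Dict[str, Any]]:
    """Pack every original field of each row into one nested record."""
    today = date.today()  # assumption: the create date is the run date
    return [
        {
            "submission_number": row[primary_key_col],
            "internal_create_date": today,
            "raw_inbound_submission": dict(row),  # copy of the untouched row
        }
        for row in rows
    ]


rows = [
    {"lab_name": "PHL", "first_name": "Alice", "last_name": "Smith", "WA_ID": 1},
    {"lab_name": "MFT", "first_name": "Bob", "last_name": "Johnson", "WA_ID": 2},
]
packed = pack_rows(rows, primary_key_col="WA_ID")
print(packed[0]["raw_inbound_submission"])
```

Keeping the whole row in one nested field means later cleaning steps can change or drop columns without losing what was originally submitted.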