lace.Engine.append_columns
- Engine.append_columns(cols: polars.DataFrame | pandas.DataFrame, metadata: List[ColumnMetadata] | None = None, cat_cutoff: int = 20, no_hypers: bool = False)
Append new columns to the Engine.
- Parameters:
cols (polars.DataFrame, pandas.DataFrame) – The new column(s) to append to the Engine. If cols is a polars DataFrame, cols must contain an ID column. Note that new indices will result in new rows.
metadata (List[ColumnMetadata], optional) – Metadata for the new columns. If None (default), metadata will be inferred from the data.
cat_cutoff (int, optional) – The max value of an unsigned integer a column can have before it is inferred to be count type (default: 20). Used only if metadata is None.
no_hypers (bool, optional) – If True, the prior will be fixed and hyper priors will be ignored. Used only if metadata is None.
Examples
Append a new continuous column
>>> import numpy as np
>>> import polars as pl
>>> from lace.examples import Animals
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> column = pl.DataFrame([
...     pl.Series("index", engine.index),  # index
...     pl.Series("rand", np.random.randn(engine.shape[0])),
... ])
>>> engine.append_columns(column)
>>> engine.shape
(50, 86)
>>> engine.ftype("rand")
'Continuous'
Also works with pandas DataFrames
>>> import pandas as pd
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> column = pd.DataFrame({
...     "rand": np.random.randn(engine.shape[0]),
... }, index=engine.index)
>>> engine.append_columns(column)
>>> engine.shape
(50, 86)
>>> engine.ftype("rand")
'Continuous'
You can append multiple columns
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "rand1": np.random.randn(engine.shape[0]),
...     "rand2": np.random.randn(engine.shape[0]),
... }, index=engine.index)
>>> engine.append_columns(columns)
>>> engine.shape
(50, 87)
>>> engine.ftype("rand1")
'Continuous'
>>> engine.ftype("rand2")
'Continuous'
And you can append partially filled columns
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "values": [0.0, 1.0, 2.0],
... }, index=[engine.index[0], engine.index[2], engine.index[5]])
>>> engine.append_columns(columns)
>>> engine[:7, "values"]
shape: (7, 2)
┌──────────────┬────────┐
│ index        ┆ values │
│ ---          ┆ ---    │
│ str          ┆ f64    │
╞══════════════╪════════╡
│ antelope     ┆ 0.0    │
│ grizzly+bear ┆ null   │
│ killer+whale ┆ 1.0    │
│ beaver       ┆ null   │
│ dalmatian    ┆ null   │
│ persian+cat  ┆ 2.0    │
│ horse        ┆ null   │
└──────────────┴────────┘
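The alignment behaviour shown above (rows without a value become null, and, per the cols parameter description, new indices become new rows) can be pictured with a small pure-Python sketch. This is only an illustration of the described semantics, not lace's implementation; align_column and the "unicorn" row are hypothetical names introduced here.

```python
# Illustrative sketch only: mimics the documented alignment semantics
# with plain Python containers. Not lace's actual implementation.
def align_column(index, new_values):
    """Align a partially filled column to an existing row index.

    index: existing row names, in order.
    new_values: dict mapping row name -> value for the new column.
    Returns (rows, values) with None standing in for null.
    """
    rows = list(index)
    # Existing rows keep their order; rows with no value get None.
    values = [new_values.get(name) for name in rows]
    # Row names that are not in the index yet are appended as new rows.
    known = set(rows)
    for name, value in new_values.items():
        if name not in known:
            rows.append(name)
            values.append(value)
            known.add(name)
    return rows, values


index = ["antelope", "grizzly+bear", "killer+whale", "beaver"]
# "unicorn" is a made-up row name used to show a new index entry.
rows, values = align_column(
    index, {"antelope": 0.0, "killer+whale": 1.0, "unicorn": 9.0}
)
for name, value in zip(rows, values):
    print(name, value)
```

In this sketch the existing rows keep their positions and the unknown row name is placed at the end; where lace itself places new rows is not stated in this page.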
We can append categorical columns as well. Sometimes you will need to define the metadata manually. In this case, there are more possible categories than categories observed in the data.
>>> from lace import ColumnMetadata, CategoricalPrior, ValueMap
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "fav_color": ["Yellow", "Yellow", "Blue", "Sparkles"],
... }, index=engine.index[:4])
>>> metadata = [
...     ColumnMetadata.categorical(
...         "fav_color",
...         4,
...         prior=CategoricalPrior(4),
...         value_map=ValueMap.string(["Blue", "Yellow", "Sparkles", "Green"])
...     ),
... ]
>>> engine.append_columns(columns, metadata)
>>> engine[:5, "fav_color"]
shape: (5, 2)
┌──────────────┬───────────┐
│ index        ┆ fav_color │
│ ---          ┆ ---       │
│ str          ┆ str       │
╞══════════════╪═══════════╡
│ antelope     ┆ Yellow    │
│ grizzly+bear ┆ Yellow    │
│ killer+whale ┆ Blue      │
│ beaver       ┆ Sparkles  │
│ dalmatian    ┆ null      │
└──────────────┴───────────┘
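Before writing metadata by hand, it can help to confirm that every observed value appears in the category list and to see which categories are unobserved. The helper below is a hypothetical pre-check in plain Python (check_categories is not part of lace). It returns the category count, which corresponds to the 4 passed to ColumnMetadata.categorical and CategoricalPrior in the example above.

```python
# Hypothetical pre-check, independent of lace: verifies that a hand-written
# category list covers the observed data and reports unobserved categories.
def check_categories(observed, categories):
    """Return (k, unobserved) for a categorical column.

    observed: values present in the data (None marks a missing cell).
    categories: the full list of possible categories.
    """
    if len(set(categories)) != len(categories):
        raise ValueError("categories must be unique")
    allowed = set(categories)
    # Collect observed values that the category list does not cover.
    unknown = sorted({v for v in observed if v is not None and v not in allowed})
    if unknown:
        raise ValueError(f"values missing from the category list: {unknown}")
    seen = set(observed)
    # Categories that never occur in the data, in their listed order.
    unobserved = [c for c in categories if c not in seen]
    return len(categories), unobserved


k, unobserved = check_categories(
    ["Yellow", "Yellow", "Blue", "Sparkles"],
    ["Blue", "Yellow", "Sparkles", "Green"],
)
print(k, unobserved)  # prints: 4 ['Green']
```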
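And count columns. Here cat_cutoff=3 is below the largest value in the column (4), so the column is inferred to be count type.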
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "times_watched_the_fifth_element": list(range(5)) * 10,
... }, index=engine.index)
>>> engine.append_columns(columns, cat_cutoff=3)
>>> engine[:8, "times_watched_the_fifth_element"]
shape: (8, 2)
┌─────────────────┬─────────────────────────────────┐
│ index           ┆ times_watched_the_fifth_element │
│ ---             ┆ ---                             │
│ str             ┆ u32                             │
╞═════════════════╪═════════════════════════════════╡
│ antelope        ┆ 0                               │
│ grizzly+bear    ┆ 1                               │
│ killer+whale    ┆ 2                               │
│ beaver          ┆ 3                               │
│ dalmatian       ┆ 4                               │
│ persian+cat     ┆ 0                               │
│ horse           ┆ 1                               │
│ german+shepherd ┆ 2                               │
└─────────────────┴─────────────────────────────────┘
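The role of cat_cutoff in the example above can be sketched as a simple threshold rule. This is a reading of the parameter description, not lace's actual inference code: infer_uint_ftype is a hypothetical helper, the "Count" and "Categorical" strings are labels chosen for this sketch, and the treatment of a maximum exactly equal to the cutoff (kept categorical here) is an assumption.

```python
# Illustrative threshold rule based on the cat_cutoff description.
# Not lace's implementation; boundary handling is an assumption.
def infer_uint_ftype(values, cat_cutoff=20):
    """Guess whether an unsigned-integer column is count or categorical."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot infer a type from an empty column")
    if any(v < 0 for v in observed):
        raise ValueError("expected unsigned integers")
    # Assumed rule: a maximum above the cutoff means count type.
    return "Count" if max(observed) > cat_cutoff else "Categorical"


watched = list(range(5)) * 10  # the values 0..4 repeated, as in the example
print(infer_uint_ftype(watched, cat_cutoff=3))  # max 4 is above 3
print(infer_uint_ftype(watched))                # max 4 is below the default 20
```

Under this assumed rule the first call reports a count column and the second a categorical one, which is why the example lowers cat_cutoff to 3.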