lace.Engine.append_columns

Engine.append_columns(cols: polars.DataFrame | pandas.DataFrame, metadata: List[ColumnMetadata] | None = None, cat_cutoff: int = 20, no_hypers: bool = False)

Append new columns to the Engine.

Parameters:
  • cols (polars.DataFrame, pandas.DataFrame) – The new column(s) to append to the Engine. If cols is a polars DataFrame, it must contain an ID column. Note that indices not already in the Engine will result in new rows.

  • metadata (list[ColumnMetadata], optional) – Metadata for the new columns. If None (default), metadata will be inferred from the data.

  • cat_cutoff (int, optional) – The max value of an unsigned integer a column can have before it is inferred to be count type (default: 20). Used only if metadata is None.

  • no_hypers (bool, optional) – If True, the prior will be fixed and hyper priors will be ignored. Used only if metadata is None.
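
The role of cat_cutoff in type inference can be pictured with a small sketch. The function infer_ftype below is hypothetical and is not part of lace; it only mirrors the rule described for cat_cutoff, and its treatment of a maximum exactly equal to the cutoff is an assumption.

```python
def infer_ftype(values, cat_cutoff=20):
    """Hypothetical sketch of column type inference; not lace's implementation."""
    observed = [v for v in values if v is not None]  # missing cells are skipped
    if not observed:
        raise ValueError("cannot infer a type from an empty column")
    # String columns are treated as categorical in this sketch.
    if all(isinstance(v, str) for v in observed):
        return "Categorical"
    is_uint = all(
        isinstance(v, int) and not isinstance(v, bool) and v >= 0
        for v in observed
    )
    if not is_uint:
        return "Continuous"
    # Assumption: a maximum strictly above the cutoff means count.
    return "Count" if max(observed) > cat_cutoff else "Categorical"


print(infer_ftype(list(range(5)) * 10, cat_cutoff=3))
print(infer_ftype([0.5, -1.2, None]))
```

Under this sketch, the integers 0 through 4 are read as a count column once cat_cutoff is lowered to 3, which is the situation in the count example at the end of this page.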

Examples

Append a new continuous column

>>> import numpy as np
>>> import polars as pl
>>> from lace.examples import Animals
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> column = pl.DataFrame([
...     pl.Series("index", engine.index),  # index
...     pl.Series("rand", np.random.randn(engine.shape[0])),
... ])
>>> engine.append_columns(column)
>>> engine.shape
(50, 86)
>>> engine.ftype("rand")
'Continuous'

Also works with pandas DataFrames

>>> import pandas as pd
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> column = pd.DataFrame({
...     "rand": np.random.randn(engine.shape[0]),
... }, index=engine.index)
>>> engine.append_columns(column)
>>> engine.shape
(50, 86)
>>> engine.ftype("rand")
'Continuous'

You can append multiple columns

>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "rand1": np.random.randn(engine.shape[0]),
...     "rand2": np.random.randn(engine.shape[0]),
... }, index=engine.index)
>>> engine.append_columns(columns)
>>> engine.shape
(50, 87)
>>> engine.ftype("rand1")
'Continuous'
>>> engine.ftype("rand2")
'Continuous'

And you can append partially filled columns

>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "values": [0.0, 1.0, 2.0],
... }, index=[engine.index[0], engine.index[2], engine.index[5]])
>>> engine.append_columns(columns)
>>> engine[:7, "values"]  
shape: (7, 2)
┌──────────────┬────────┐
│ index        ┆ values │
│ ---          ┆ ---    │
│ str          ┆ f64    │
╞══════════════╪════════╡
│ antelope     ┆ 0.0    │
│ grizzly+bear ┆ null   │
│ killer+whale ┆ 1.0    │
│ beaver       ┆ null   │
│ dalmatian    ┆ null   │
│ persian+cat  ┆ 2.0    │
│ horse        ┆ null   │
└──────────────┴────────┘
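
The null cells in this output follow from aligning the new values to the Engine's existing row index. A plain-Python sketch of that alignment is given here, with align_to_index as a hypothetical helper (not a lace function) and None standing in for null. It covers only rows that already exist and does not model the creation of new rows for unseen indices.

```python
def align_to_index(full_index, partial):
    """Return one value per existing row; rows absent from `partial` get None."""
    return [partial.get(row) for row in full_index]


# First seven row names of the Animals example, as shown in the output table.
full_index = [
    "antelope", "grizzly+bear", "killer+whale", "beaver",
    "dalmatian", "persian+cat", "horse",
]
# Only three rows receive a value, matching the partially filled column.
partial = {"antelope": 0.0, "killer+whale": 1.0, "persian+cat": 2.0}

aligned = align_to_index(full_index, partial)
print(aligned)
```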

We can append categorical columns as well. Sometimes you will need to define the metadata manually. In this case, there are more possible categories than the categories observed in the data.

>>> from lace import ColumnMetadata, CategoricalPrior, ValueMap
>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "fav_color": ["Yellow", "Yellow", "Blue", "Sparkles"],
... }, index=engine.index[:4])
>>> metadata = [
...     ColumnMetadata.categorical(
...         "fav_color",
...         4,
...         prior=CategoricalPrior(4),
...         value_map=ValueMap.string(["Blue", "Yellow", "Sparkles", "Green"])
...     ),
... ]
>>> engine.append_columns(columns, metadata)
>>> engine[:5, "fav_color"]  
shape: (5, 2)
┌──────────────┬───────────┐
│ index        ┆ fav_color │
│ ---          ┆ ---       │
│ str          ┆ str       │
╞══════════════╪═══════════╡
│ antelope     ┆ Yellow    │
│ grizzly+bear ┆ Yellow    │
│ killer+whale ┆ Blue      │
│ beaver       ┆ Sparkles  │
│ dalmatian    ┆ null      │
└──────────────┴───────────┘
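
To see why the metadata is given by hand in this example, compare the categories declared in the value map with those present in the data. The sketch below uses only the standard library and does not call lace; the variable names are illustrative.

```python
from collections import Counter

# Categories passed to the value map in the example.
declared = ["Blue", "Yellow", "Sparkles", "Green"]
# Values actually present in the appended column.
observed = ["Yellow", "Yellow", "Blue", "Sparkles"]

counts = Counter(observed)
# Declared categories that never appear in the data.
unobserved = [c for c in declared if counts[c] == 0]

print(len(declared), len(counts), unobserved)
```

Only three distinct values appear in the data, so metadata inferred from the data alone would presumably leave no room for "Green"; declaring four categories keeps it available.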

And count columns

>>> engine = Animals()
>>> engine.shape
(50, 85)
>>> columns = pd.DataFrame({
...     "times_watched_the_fifth_element": list(range(5)) * 10,
... }, index=engine.index)
>>> engine.append_columns(columns, cat_cutoff=3)
>>> engine[:8, "times_watched_the_fifth_element"]  
shape: (8, 2)
┌─────────────────┬─────────────────────────────────┐
│ index           ┆ times_watched_the_fifth_element │
│ ---             ┆ ---                             │
│ str             ┆ u32                             │
╞═════════════════╪═════════════════════════════════╡
│ antelope        ┆ 0                               │
│ grizzly+bear    ┆ 1                               │
│ killer+whale    ┆ 2                               │
│ beaver          ┆ 3                               │
│ dalmatian       ┆ 4                               │
│ persian+cat     ┆ 0                               │
│ horse           ┆ 1                               │
│ german+shepherd ┆ 2                               │
└─────────────────┴─────────────────────────────────┘