lace.engine.Engine.surprisal

Engine.surprisal(col: int | str, *, rows=None, values=None, state_ixs=None)

Compute the surprisal of values in specific cells.

Surprisal is the negative log likelihood of a specific value in a specific position (cell) in the table. The less likely the value is under the model, the higher its surprisal.
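To make the definition concrete, here is a minimal sketch that computes a negative log likelihood by hand. It assumes a single Gaussian as a stand-in for the model; the function name `gaussian_surprisal` and the parameters `mu` and `sigma` are hypothetical and are not part of lace, whose actual model is not described in this section.

```python
import math


def gaussian_surprisal(x, mu, sigma):
    """Negative log likelihood of x under Normal(mu, sigma).

    Illustrative stand-in only: this is not how lace models a column.
    """
    log_pdf = (
        -0.5 * math.log(2.0 * math.pi * sigma**2)
        - (x - mu) ** 2 / (2.0 * sigma**2)
    )
    return -log_pdf


# A value at the mean is unsurprising; a value far from it is very surprising.
near = gaussian_surprisal(10.0, mu=10.0, sigma=2.0)
far = gaussian_surprisal(30.0, mu=10.0, sigma=2.0)
print(near < far)
```

Surprisal computed from a probability density can be negative when the density exceeds 1, so the values are best read as a relative ranking rather than on an absolute scale.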

Parameters:
  • col (column index) – The column location of the target cells

  • rows (arraylike[row index], optional) – Row indices of the cells. If None (default), all non-missing rows will be used.

  • values (arraylike[value], optional) – Proposed values for each cell. Must have exactly one entry for each entry in rows. If None (default), the existing values in the table are used.

  • state_ixs (List[int], optional) – An optional list specifying which states should be used in the surprisal computation. If None (default), use all states.
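The following sketch illustrates why the choice of states can change the result. It rests on an assumption that this section does not confirm: that the per-state likelihoods are averaged before the negative log is taken. The helper `combined_surprisal` and the per-state log likelihoods are hypothetical, made-up values for illustration, not output from lace.

```python
import math


def combined_surprisal(log_likelihoods):
    """Return -log(mean(exp(ll))) using a stable log-sum-exp.

    Assumption: likelihoods are averaged across states. This is a sketch,
    not lace's documented behavior.
    """
    m = max(log_likelihoods)
    lse = m + math.log(sum(math.exp(ll - m) for ll in log_likelihoods))
    return -(lse - math.log(len(log_likelihoods)))


# Made-up log likelihoods of one cell's value under four states.
per_state_logp = [-2.5, -3.0, -2.8, -4.1]

all_states = combined_surprisal(per_state_logp)      # like state_ixs=None
first_two = combined_surprisal(per_state_logp[:2])   # like state_ixs=[0, 1]
print(all_states != first_two)
```

Under this assumption, restricting the computation to a subset of states drops the other states' opinions about the cell, so the surprisal generally shifts.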

Returns:

A polars.DataFrame containing an index column for the row names, a <col> column for the values, and a surprisal column containing the surprisal values.

Return type:

polars.DataFrame

Examples

Find the five satellites with the most surprising expected lifetimes

>>> import polars as pl
>>> from lace.examples import Satellites
>>> engine = Satellites()
>>> engine.surprisal("Expected_Lifetime").sort(
...     "surprisal", descending=True
... ).head(5)
shape: (5, 3)
┌───────────────────────────────────┬───────────────────┬───────────┐
│ index                             ┆ Expected_Lifetime ┆ surprisal │
│ ---                               ┆ ---               ┆ ---       │
│ str                               ┆ f64               ┆ f64       │
╞═══════════════════════════════════╪═══════════════════╪═══════════╡
│ International Space Station (ISS… ┆ 30.0              ┆ 7.02499   │
│ Landsat 7                         ┆ 15.0              ┆ 4.869031  │
│ Milstar DFS-5 (USA 164, Milstar … ┆ 0.0               ┆ 4.74869   │
│ Optus B3                          ┆ 0.5               ┆ 4.653549  │
│ SDS III-3 (Satellite Data System… ┆ 0.5               ┆ 4.558333  │
└───────────────────────────────────┴───────────────────┴───────────┘

Compute the surprisal for specific cells

>>> engine.surprisal(
...     "Expected_Lifetime", rows=["Landsat 7", "Intelsat 701"]
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 15.0              ┆ 4.869031  │
│ Intelsat 701 ┆ 0.5               ┆ 4.533067  │
└──────────────┴───────────────────┴───────────┘

Compute the surprisal of specific values in specific cells

>>> engine.surprisal(
...     "Expected_Lifetime",
...     rows=["Landsat 7", "Intelsat 701"],
...     values=[10.0, 10.0],
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 10.0              ┆ 3.037384  │
│ Intelsat 701 ┆ 10.0              ┆ 2.559729  │
└──────────────┴───────────────────┴───────────┘

Surprisal will differ when computed under a different subset of states

>>> engine.surprisal(
...     "Expected_Lifetime",
...     rows=["Landsat 7", "Intelsat 701"],
...     values=[10.0, 10.0],
...     state_ixs=[0, 1],
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 10.0              ┆ 2.743636  │
│ Intelsat 701 ┆ 10.0              ┆ 2.587096  │
└──────────────┴───────────────────┴───────────┘