lace.Engine.surprisal
- Engine.surprisal(col: int | str, *, rows=None, values=None, state_ixs=None)
Compute the surprisal of values in specific cells.
Surprisal is the negative log likelihood of a specific value in a specific position (cell) in the table.
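To make the definition concrete, here is a minimal sketch of surprisal as a negative log likelihood. It assumes a single Gaussian likelihood with hypothetical parameters mu and sigma; this is only an illustration of the quantity, not the model lace fits, and gaussian_surprisal is not part of the lace API.

```python
import math


def gaussian_surprisal(x: float, mu: float, sigma: float) -> float:
    """Negative log density of x under Normal(mu, sigma) (illustrative only)."""
    z = (x - mu) / sigma
    log_pdf = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return -log_pdf


# Hypothetical parameters: a value at the mean is less surprising
# than a value ten standard deviations away.
near = gaussian_surprisal(10.0, mu=10.0, sigma=2.0)
far = gaussian_surprisal(30.0, mu=10.0, sigma=2.0)
print(near < far)
```

Higher surprisal means the value is less likely under the model, which is why sorting by surprisal in descending order surfaces the most anomalous cells.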
- Parameters:
col (column index) – The column location of the target cells
rows (arraylike[row index], optional) – Row indices of the cells. If None (default), all non-missing rows will be used.
values (arraylike[value], optional) – Proposed values for each cell. Must have an entry for each entry in rows. If None, the existing values are used.
state_ixs (List[int], optional) – An optional list specifying which states should be used in the surprisal computation. If None (default), use all states.
- Returns:
A polars.DataFrame containing an index column for the row names, a <col> column for the values, and a surprisal column containing the surprisal values.
- Return type:
polars.DataFrame
Examples
Find the five satellites with the most surprising expected lifetimes
>>> import polars as pl
>>> from lace.examples import Satellites
>>> engine = Satellites()
>>> engine.surprisal("Expected_Lifetime").sort(
...     "surprisal", descending=True
... ).head(5)
shape: (5, 3)
┌───────────────────────────────────┬───────────────────┬───────────┐
│ index                             ┆ Expected_Lifetime ┆ surprisal │
│ ---                               ┆ ---               ┆ ---       │
│ str                               ┆ f64               ┆ f64       │
╞═══════════════════════════════════╪═══════════════════╪═══════════╡
│ International Space Station (ISS… ┆ 30.0              ┆ 11.423102 │
│ Milstar DFS-5 (USA 164, Milstar … ┆ 0.0               ┆ 6.661427  │
│ DSP 21 (USA 159) (Defense Suppor… ┆ 0.5               ┆ 6.366436  │
│ DSP 22 (USA 176) (Defense Suppor… ┆ 0.5               ┆ 6.366436  │
│ Intelsat 701                      ┆ 0.5               ┆ 6.366436  │
└───────────────────────────────────┴───────────────────┴───────────┘
Compute the surprisal for specific cells
>>> engine.surprisal(
...     "Expected_Lifetime", rows=["Landsat 7", "Intelsat 701"]
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 15.0              ┆ 4.588265  │
│ Intelsat 701 ┆ 0.5               ┆ 6.366436  │
└──────────────┴───────────────────┴───────────┘
Compute the surprisal of specific values in specific cells
>>> engine.surprisal(
...     "Expected_Lifetime",
...     rows=["Landsat 7", "Intelsat 701"],
...     values=[10.0, 10.0],
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 10.0              ┆ 2.984587  │
│ Intelsat 701 ┆ 10.0              ┆ 2.52041   │
└──────────────┴───────────────────┴───────────┘
Compute the surprisal of multiple values in a single cell
>>> engine.surprisal(
...     "Expected_Lifetime",
...     rows=["Landsat 7"],
...     values=[0.5, 1.0, 5.0, 10.0],
... )
shape: (4,)
Series: 'surprisal' [f64]
[
    3.225658
    3.036696
    2.273096
    2.984587
]
Surprisal will be different under different states
>>> engine.surprisal(
...     "Expected_Lifetime",
...     rows=["Landsat 7", "Intelsat 701"],
...     values=[10.0, 10.0],
...     state_ixs=[0, 1],
... )
shape: (2, 3)
┌──────────────┬───────────────────┬───────────┐
│ index        ┆ Expected_Lifetime ┆ surprisal │
│ ---          ┆ ---               ┆ ---       │
│ str          ┆ f64               ┆ f64       │
╞══════════════╪═══════════════════╪═══════════╡
│ Landsat 7    ┆ 10.0              ┆ 3.431414  │
│ Intelsat 701 ┆ 10.0              ┆ 2.609992  │
└──────────────┴───────────────────┴───────────┘
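Why the choice of states changes the result can be sketched as follows. The sketch assumes that the per-state likelihoods are averaged with equal weight before taking the negative log; that combination rule is an assumption made for illustration and is not stated in this reference. The function combined_surprisal and the log likelihood values are hypothetical and are not part of lace.

```python
import math


def combined_surprisal(state_log_likelihoods: list[float]) -> float:
    """Surprisal under an equally weighted average of per-state likelihoods.

    Assumed combination rule, for illustration only.
    """
    # Log-sum-exp for numerical stability.
    m = max(state_log_likelihoods)
    log_sum = m + math.log(sum(math.exp(v - m) for v in state_log_likelihoods))
    log_mean = log_sum - math.log(len(state_log_likelihoods))
    return -log_mean


# Made-up per-state log likelihoods for a single cell.
log_liks = [-2.0, -3.5, -2.5, -4.0]
all_states = combined_surprisal(log_liks)       # analogous to state_ixs=None
first_two = combined_surprisal(log_liks[:2])    # analogous to state_ixs=[0, 1]
print(all_states != first_two)
```

Under this assumption, restricting the computation to a subset of states changes which likelihoods enter the average, so the surprisal for the same cell and value generally shifts.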