lace.Engine.impute
- Engine.impute(col: str | int, rows: List[str | int] | None = None, with_uncertainty: bool = True)
Impute (predict) the value of one or more cells in the lace table.
Impute returns the most likely value at a specific location in the table. Regardless of whether the cell at (row, col) contains a present value, impute will choose the value that is most likely given the current distribution of the cell. If the current value is an outlier, or unlikely, impute will return a value that is more in line with its understanding of the data.
If the cell lies in a missing-not-at-random column, a value will always be returned, even if the value is most likely to be missing. Imputation forces the value of a cell to be present.
Uncertainty is the normalized mean total variation distance between each state’s imputation distribution and the average imputation distribution.
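The definition above can be illustrated with a small, library-independent sketch for a categorical cell. It is not lace's implementation: the function names are hypothetical, each state's imputation distribution is assumed to be given as a list of category probabilities, and the normalization constant (the upper bound 1 - 1/n_states on the mean total variation distance) is an assumption, since the exact normalization is not stated here.

```python
from typing import List, Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def impute_uncertainty_sketch(state_dists: List[Sequence[float]]) -> float:
    """Hypothetical helper: normalized mean TVD between each state's
    distribution and the average distribution across states."""
    n_states = len(state_dists)
    if n_states < 2:
        # With a single state there is nothing to disagree with.
        return 0.0

    n_cats = len(state_dists[0])
    # Average imputation distribution across states.
    mean_dist = [
        sum(dist[k] for dist in state_dists) / n_states for k in range(n_cats)
    ]
    # Mean total variation distance from each state to the average.
    mean_tvd = sum(total_variation(d, mean_dist) for d in state_dists) / n_states

    # ASSUMPTION: divide by the largest value the mean TVD can reach,
    # 1 - 1/n_states, so the result lies in [0, 1].
    return mean_tvd / (1.0 - 1.0 / n_states)
```

Under these assumptions, states that all produce the same distribution give an uncertainty of 0, and states that each put all their mass on a different category give an uncertainty of 1.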
- Parameters:
col (column index) – The name or integer index of the column to impute
rows (List[row index], optional) – Optional row indices to impute. If None (default), all the rows with missing values will be imputed
with_uncertainty (bool, default: True) – If True, compute and return the impute uncertainty
- Returns:
Indexed by rows; contains a column for the imputed values and their uncertainties, if requested.
- Return type:
polars.DataFrame
Examples
Impute, with uncertainty, all the missing values in a column
>>> from lace.examples import Satellites
>>> engine = Satellites()
>>> engine.impute("Purpose")
shape: (0, 2)
┌───────┬─────────┐
│ index ┆ Purpose │
│ ---   ┆ ---     │
│ str   ┆ str     │
╞═══════╪═════════╡
└───────┴─────────┘
Let’s choose a column that actually has missing values
>>> engine.impute("Type_of_Orbit")
shape: (645, 3)
┌───────────────────────────────────┬─────────────────┬─────────────┐
│ index                             ┆ Type_of_Orbit   ┆ uncertainty │
│ ---                               ┆ ---             ┆ ---         │
│ str                               ┆ str             ┆ f64         │
╞═══════════════════════════════════╪═════════════════╪═════════════╡
│ AAUSat-3                          ┆ Sun-Synchronous ┆ 0.190897    │
│ ABS-1 (LMI-1, Lockheed Martin-In… ┆ Sun-Synchronous ┆ 0.422782    │
│ ABS-1A (Koreasat 2, Mugunghwa 2,… ┆ Sun-Synchronous ┆ 0.422782    │
│ ABS-2i (MBSat, Mobile Broadcasti… ┆ Sun-Synchronous ┆ 0.422782    │
│ …                                 ┆ …               ┆ …           │
│ Zhongxing 20A                     ┆ Sun-Synchronous ┆ 0.422782    │
│ Zhongxing 22A (Chinastar 22A)     ┆ Sun-Synchronous ┆ 0.422782    │
│ Zhongxing 2A (Chinasat 2A)        ┆ Sun-Synchronous ┆ 0.422782    │
│ Zhongxing 9 (Chinasat 9, Chinast… ┆ Sun-Synchronous ┆ 0.422782    │
└───────────────────────────────────┴─────────────────┴─────────────┘
Impute a defined set of rows
>>> engine.impute("Purpose", rows=["AAUSat-3", "Zhongxing 20A"])
shape: (2, 3)
┌───────────────┬────────────────────────┬─────────────┐
│ index         ┆ Purpose                ┆ uncertainty │
│ ---           ┆ ---                    ┆ ---         │
│ str           ┆ str                    ┆ f64         │
╞═══════════════╪════════════════════════╪═════════════╡
│ AAUSat-3      ┆ Technology Development ┆ 0.236857    │
│ Zhongxing 20A ┆ Communications         ┆ 0.142772    │
└───────────────┴────────────────────────┴─────────────┘
Uncertainty is optional
>>> engine.impute("Type_of_Orbit", with_uncertainty=False)
shape: (645, 2)
┌───────────────────────────────────┬─────────────────┐
│ index                             ┆ Type_of_Orbit   │
│ ---                               ┆ ---             │
│ str                               ┆ str             │
╞═══════════════════════════════════╪═════════════════╡
│ AAUSat-3                          ┆ Sun-Synchronous │
│ ABS-1 (LMI-1, Lockheed Martin-In… ┆ Sun-Synchronous │
│ ABS-1A (Koreasat 2, Mugunghwa 2,… ┆ Sun-Synchronous │
│ ABS-2i (MBSat, Mobile Broadcasti… ┆ Sun-Synchronous │
│ …                                 ┆ …               │
│ Zhongxing 20A                     ┆ Sun-Synchronous │
│ Zhongxing 22A (Chinastar 22A)     ┆ Sun-Synchronous │
│ Zhongxing 2A (Chinasat 2A)        ┆ Sun-Synchronous │
│ Zhongxing 9 (Chinasat 9, Chinast… ┆ Sun-Synchronous │
└───────────────────────────────────┴─────────────────┘