lace.Engine.simulate
- Engine.simulate(cols, given=None, n: int = 1, include_given: bool = False)
Simulate data from a conditional distribution.
- Parameters:
cols (List[column index]) – A list of target columns to simulate
given (Dict[column index, value], optional) – An optional dictionary of column -> value conditions
n (int, optional) – The number of values to draw
include_given (bool, optional) – If True, the conditioning values in given will be included in the output
- Returns:
The output data
- Return type:
polars.DataFrame
Examples
Draw from a pair of columns
>>> from lace.examples import Satellites
>>> engine = Satellites()
>>> engine.simulate(["Class_of_Orbit", "Period_minutes"], n=5)
shape: (5, 2)
┌────────────────┬────────────────┐
│ Class_of_Orbit ┆ Period_minutes │
│ ---            ┆ ---            │
│ str            ┆ f64            │
╞════════════════╪════════════════╡
│ LEO            ┆ 140.214617     │
│ MEO            ┆ 707.76105      │
│ MEO            ┆ 649.888366     │
│ LEO            ┆ 109.460389     │
│ GEO            ┆ 1309.460359    │
└────────────────┴────────────────┘
Simulate a pair of columns conditioned on another
>>> engine.simulate(
...     ["Class_of_Orbit", "Period_minutes"],
...     given={"Purpose": "Communications"},
...     n=5,
... )
shape: (5, 2)
┌────────────────┬────────────────┐
│ Class_of_Orbit ┆ Period_minutes │
│ ---            ┆ ---            │
│ str            ┆ f64            │
╞════════════════╪════════════════╡
│ LEO            ┆ 97.079974      │
│ GEO            ┆ -45.703234     │
│ LEO            ┆ 114.135217     │
│ LEO            ┆ 103.676199     │
│ GEO            ┆ 1434.897091    │
└────────────────┴────────────────┘
Simulate missing values for columns that are missing not-at-random
>>> engine.simulate(["longitude_radians_of_geo"], n=5)
shape: (5, 1)
┌──────────────────────────┐
│ longitude_radians_of_geo │
│ ---                      │
│ f64                      │
╞══════════════════════════╡
│ -2.719645                │
│ -0.154891                │
│ null                     │
│ null                     │
│ 0.712423                 │
└──────────────────────────┘
>>> engine.simulate(
...     ["longitude_radians_of_geo"],
...     given={"Class_of_Orbit": "GEO"},
...     n=5,
... )
shape: (5, 1)
┌──────────────────────────┐
│ longitude_radians_of_geo │
│ ---                      │
│ f64                      │
╞══════════════════════════╡
│ 0.850506                 │
│ 0.666353                 │
│ 0.682146                 │
│ 0.221179                 │
│ 2.621126                 │
└──────────────────────────┘
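The pattern above, where nulls appear in unconditioned draws but not once the orbit class is fixed, can be illustrated with a toy generative model. This is only a sketch using the standard library: the orbit proportions, the rule that longitude exists only for GEO rows, and the helper name draw_row are all assumptions made for illustration, not lace's model or API.

```python
import random

# Toy generative model, NOT lace's: longitude is recorded only for GEO rows,
# so whether it is missing depends on another column (missing not-at-random).
ORBIT_WEIGHTS = {"LEO": 0.5, "MEO": 0.1, "GEO": 0.4}  # made-up proportions


def draw_row(rng, orbit=None):
    """Draw (orbit, longitude); pass `orbit` to condition on it."""
    if orbit is None:
        orbit = rng.choices(
            list(ORBIT_WEIGHTS), weights=list(ORBIT_WEIGHTS.values())
        )[0]
    # Hypothetical rule: only GEO satellites have a longitude value.
    longitude = rng.uniform(-3.14, 3.14) if orbit == "GEO" else None
    return orbit, longitude


rng = random.Random(0)
unconditioned = [draw_row(rng)[1] for _ in range(1000)]
given_geo = [draw_row(rng, orbit="GEO")[1] for _ in range(1000)]

# Fraction of missing longitudes with and without conditioning on the orbit.
missing_fraction = sum(v is None for v in unconditioned) / len(unconditioned)
print(f"unconditioned missing fraction: {missing_fraction:.2f}")
print("any missing given GEO:", any(v is None for v in given_geo))
```

In this toy model the missingness is fully determined by the orbit class. A fitted model may only make missing values less likely under a condition rather than rule them out.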
If we simulate using given conditions, we can include the conditions in the output using include_given=True.
>>> engine.simulate(
...     ["Period_minutes"],
...     given={"Purpose": "Communications", "Class_of_Orbit": "GEO"},
...     n=5,
...     include_given=True,
... )
shape: (5, 3)
┌────────────────┬────────────────┬────────────────┐
│ Period_minutes ┆ Purpose        ┆ Class_of_Orbit │
│ ---            ┆ ---            ┆ ---            │
│ f64            ┆ str            ┆ str            │
╞════════════════╪════════════════╪════════════════╡
│ 1426.679095    ┆ Communications ┆ GEO            │
│ 54.08657       ┆ Communications ┆ GEO            │
│ 1433.563215    ┆ Communications ┆ GEO            │
│ 1436.388876    ┆ Communications ┆ GEO            │
│ 1434.298969    ┆ Communications ┆ GEO            │
└────────────────┴────────────────┴────────────────┘
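To make the roles of given and include_given concrete without depending on lace, here is a minimal sketch that samples from an empirical conditional distribution over a handful of made-up rows. The function toy_simulate, its seed parameter, and the ROWS table are hypothetical stand-ins; lace draws from a fitted probabilistic model, not by filtering rows.

```python
import random

# Hypothetical toy rows standing in for a fitted model; not taken from lace.
ROWS = [
    {"Class_of_Orbit": "GEO", "Purpose": "Communications"},
    {"Class_of_Orbit": "GEO", "Purpose": "Communications"},
    {"Class_of_Orbit": "LEO", "Purpose": "Communications"},
    {"Class_of_Orbit": "LEO", "Purpose": "Earth Observation"},
    {"Class_of_Orbit": "MEO", "Purpose": "Navigation"},
]


def toy_simulate(cols, given=None, n=1, include_given=False, seed=None):
    """Sample `cols` from the rows consistent with `given`."""
    given = given or {}
    rng = random.Random(seed)
    # Conditioning: keep only rows that agree with every given value.
    matching = [r for r in ROWS if all(r[k] == v for k, v in given.items())]
    if not matching:
        raise ValueError("no rows satisfy the given conditions")
    draws = []
    for _ in range(n):
        row = rng.choice(matching)
        draw = {c: row[c] for c in cols}
        if include_given:
            # The conditioning values are constant across all draws.
            draw.update(given)
        draws.append(draw)
    return draws


draws = toy_simulate(
    ["Class_of_Orbit"],
    given={"Purpose": "Communications"},
    n=5,
    include_given=True,
    seed=0,
)
for d in draws:
    print(d)
```

Each draw contains the target column followed by the conditioning column, and every value of Class_of_Orbit comes from rows whose Purpose is Communications.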