lace.Engine.simulate

Engine.simulate(cols, given=None, n: int = 1, include_given: bool = False)

Simulate data from a conditional distribution.

Parameters:
  • cols (List[column index]) – A list of target columns to simulate

  • given (Dict[column index, value], optional) – An optional dictionary of column -> value conditions

  • n (int, optional) – The number of draws to simulate; each draw is one row of the output

  • include_given (bool, optional) – If True, the conditioning values in given are included as columns in the output

Returns:

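The simulated data, with one row per draw and one column per entry in cols (plus one column per given condition if include_given is True)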

Return type:

polars.DataFrame

Examples

Draw from a pair of columns

>>> from lace.examples import Satellites
>>> engine = Satellites()
>>> engine.simulate(["Class_of_Orbit", "Period_minutes"], n=5)
shape: (5, 2)
┌────────────────┬────────────────┐
│ Class_of_Orbit ┆ Period_minutes │
│ ---            ┆ ---            │
│ str            ┆ f64            │
╞════════════════╪════════════════╡
│ LEO            ┆ 140.214617     │
│ MEO            ┆ 707.76105      │
│ MEO            ┆ 649.888366     │
│ LEO            ┆ 109.460389     │
│ GEO            ┆ 1309.460359    │
└────────────────┴────────────────┘

Simulate a pair of columns conditioned on another

>>> engine.simulate(
...     ["Class_of_Orbit", "Period_minutes"],
...     given={"Purpose": "Communications"},
...     n=5,
... )
shape: (5, 2)
┌────────────────┬────────────────┐
│ Class_of_Orbit ┆ Period_minutes │
│ ---            ┆ ---            │
│ str            ┆ f64            │
╞════════════════╪════════════════╡
│ LEO            ┆ 97.079974      │
│ GEO            ┆ -45.703234     │
│ LEO            ┆ 114.135217     │
│ LEO            ┆ 103.676199     │
│ GEO            ┆ 1434.897091    │
└────────────────┴────────────────┘

Simulate a column whose values are missing not-at-random. Draws from such a column can be null; the second call below conditions on Class_of_Orbit being GEO.

>>> engine.simulate(["longitude_radians_of_geo"], n=5)
shape: (5, 1)
┌──────────────────────────┐
│ longitude_radians_of_geo │
│ ---                      │
│ f64                      │
╞══════════════════════════╡
│ -2.719645                │
│ -0.154891                │
│ null                     │
│ null                     │
│ 0.712423                 │
└──────────────────────────┘
>>> engine.simulate(
...     ["longitude_radians_of_geo"],
...     given={"Class_of_Orbit": "GEO"},
...     n=5,
... )
shape: (5, 1)
┌──────────────────────────┐
│ longitude_radians_of_geo │
│ ---                      │
│ f64                      │
╞══════════════════════════╡
│ 0.850506                 │
│ 0.666353                 │
│ 0.682146                 │
│ 0.221179                 │
│ 2.621126                 │
└──────────────────────────┘
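The contrast between the two calls above can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not how lace models the column: the helper simulate_longitude, the uniform choice of orbit class, and the uniform longitude are all hypothetical. It assumes the longitude exists only for GEO satellites, so unconditioned draws are sometimes missing while draws conditioned on GEO never are.

```python
import math
import random


def simulate_longitude(n, orbit=None, seed=0):
    """Toy draw of a longitude that only exists for GEO satellites.

    Hypothetical stand-in for illustration; not lace's internal model.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # With no condition, the orbit class is itself drawn at random.
        cls = orbit if orbit is not None else rng.choice(["LEO", "MEO", "GEO"])
        if cls == "GEO":
            # Present: draw a longitude in radians.
            draws.append(rng.uniform(-math.pi, math.pi))
        else:
            # Missing because of the orbit class, i.e. not at random.
            draws.append(None)
    return draws


marginal = simulate_longitude(1000)                  # some draws are None
conditional = simulate_longitude(1000, orbit="GEO")  # no draw is None
```

In this toy, missingness is fully determined by the orbit class; a fitted model would typically learn a softer dependence from the data.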

If we simulate using given conditions, we can include the conditions in the output using include_given=True.

>>> engine.simulate(
...     ["Period_minutes"],
...     given={"Purpose": "Communications", "Class_of_Orbit": "GEO"},
...     n=5,
...     include_given=True,
... )
shape: (5, 3)
┌────────────────┬────────────────┬────────────────┐
│ Period_minutes ┆ Purpose        ┆ Class_of_Orbit │
│ ---            ┆ ---            ┆ ---            │
│ f64            ┆ str            ┆ str            │
╞════════════════╪════════════════╪════════════════╡
│ 1426.679095    ┆ Communications ┆ GEO            │
│ 54.08657       ┆ Communications ┆ GEO            │
│ 1433.563215    ┆ Communications ┆ GEO            │
│ 1436.388876    ┆ Communications ┆ GEO            │
│ 1434.298969    ┆ Communications ┆ GEO            │
└────────────────┴────────────────┴────────────────┘
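To make the roles of given and include_given concrete, here is a minimal sketch that resamples from the empirical conditional distribution of a tiny made-up table. The function toy_simulate and the rows in ROWS are hypothetical and exist only for illustration; lace draws from a fitted model rather than resampling records, and returns a polars.DataFrame rather than a list of dicts.

```python
import random

# Hypothetical toy records, invented for this sketch.
NAMES = ("Purpose", "Class_of_Orbit", "Period_minutes")
ROWS = [
    ("Communications", "GEO", 1436.1),
    ("Communications", "LEO", 100.4),
    ("Earth Observation", "LEO", 97.2),
    ("Navigation", "MEO", 717.9),
]


def toy_simulate(cols, given=None, n=1, include_given=False, seed=0):
    """Resample records that agree with every condition in `given`."""
    given = given or {}
    rng = random.Random(seed)
    records = [dict(zip(NAMES, row)) for row in ROWS]
    # Conditioning: keep only records consistent with all given values.
    matching = [r for r in records if all(r[k] == v for k, v in given.items())]
    if not matching:
        raise ValueError("no record matches the given conditions")
    # Target columns first, then the conditioning columns if requested.
    out_cols = list(cols) + (list(given) if include_given else [])
    return [{c: r[c] for c in out_cols} for r in rng.choices(matching, k=n)]


draws = toy_simulate(
    ["Period_minutes"],
    given={"Purpose": "Communications", "Class_of_Orbit": "GEO"},
    n=5,
    include_given=True,
)
```

As in the example above, the target column comes first and the conditioning columns follow, each repeating the value it was fixed to.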