kelp.meta

Reference for the MetaFramework — the reusable metadata loading backend that powers Kelp's runtime context. Frameworks subclass MetaFramework, declare a MetaProjectSpec, and call init() / get_context() to manage lifecycle and catalog access.

kelp.meta.MetaFramework

Framework-specific API for metadata management.

This class provides a clean interface for frameworks to:

  • Define their metadata specifications
  • Initialize runtime contexts
  • Access framework-specific contexts

Each framework should define a subclass that sets spec as a class attribute and then expose the class methods as its public API.

Example

>>> from kelp.meta import MetaFramework, MetaProjectSpec, MetaObjectSpec
>>> from kelp.models.model import Model
>>> spec = MetaProjectSpec(
...     framework_id="myframework",
...     project_header="myframework_project",
...     project_settings_model=MyConfig,
...     object_specs=(
...         MetaObjectSpec(
...             root_key="my_models",
...             project_config_key="models",
...             path_attr="models_path",
...             catalog_attr="models",
...             model_class=Model,
...             model_label="Model",
...         ),
...     ),
... )
>>> class MyFramework(MetaFramework):
...     spec = spec
>>> ctx = MyFramework.init()
>>> ctx = MyFramework.get_context()

spec class-attribute

spec

init classmethod

init(
    project_file_path=None,
    target=None,
    init_vars=None,
    manifest_file_path=None,
    refresh=False,
    store_in_global=True,
)

Initialize and store framework runtime context.

This discovers the project configuration, loads metadata files, resolves variables, and stores the resulting context for later access.

When manifest_file_path is provided, the context is loaded directly from a pre-built manifest JSON file, skipping all discovery, rendering, and variable resolution.

Parameters:

project_file_path (str | None, default None): Path to project file or directory. If None, auto-discovers from the current working directory.
target (str | None, default None): Target environment name (e.g., "dev", "prod") used to load target-specific variables from the project file.
init_vars (dict[str, Any] | None, default None): Runtime variable overrides (highest priority).
manifest_file_path (str | None, default None): Path to a manifest JSON file. When provided, skips all source file loading and uses the snapshot instead.
refresh (bool, default False): If True, recreate the context even if one already exists.
store_in_global (bool, default True): Whether to store the context globally.

Returns:

MetaRuntimeContext: The initialized runtime context containing project settings, resolved variables, and metadata catalog.

Raises:

FileNotFoundError: If the project file cannot be discovered.
ValueError: If the configuration is invalid or the manifest is incompatible.
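
The "highest priority" note on init_vars can be read as a layered merge. The following is a minimal sketch under stated assumptions, not Kelp's loader: the function name resolve_vars is hypothetical, the nested "vars" mapping inside each target is an assumed layout, and placing root vars below target values is likewise assumed. Only the rule that init_vars wins comes from this reference.

```python
from __future__ import annotations

from typing import Any


def resolve_vars(
    project: dict[str, Any],
    target: str | None = None,
    init_vars: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Merge variable layers; later layers override earlier ones (assumed order)."""
    merged: dict[str, Any] = dict(project.get("vars", {}))
    if target is not None:
        targets = project.get("targets", {})
        if target not in targets:
            raise ValueError(f"Unknown target: {target}")
        # Hypothetical layout: each target carries its own "vars" mapping.
        merged.update(targets[target].get("vars", {}))
    # Runtime overrides are applied last, so they have the highest priority.
    merged.update(init_vars or {})
    return merged


project = {
    "vars": {"catalog": "dev_catalog", "schema": "bronze"},
    "targets": {"prod": {"vars": {"catalog": "prod_catalog"}}},
}
resolved = resolve_vars(project, target="prod", init_vars={"schema": "silver"})
defaults_only = resolve_vars(project)
```

With target="prod" and init_vars={"schema": "silver"}, the target value replaces the root catalog and the runtime override replaces the schema.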

Source code in src/kelp/meta/framework.py
@classmethod
def init(
    cls,
    project_file_path: str | None = None,
    target: str | None = None,
    init_vars: dict[str, Any] | None = None,
    manifest_file_path: str | None = None,
    refresh: bool = False,
    store_in_global: bool = True,
) -> MetaRuntimeContext:
    """Initialize and store framework runtime context.

    This discovers the project configuration, loads metadata files,
    resolves variables, and stores the resulting context for later access.

    When ``manifest_file_path`` is provided, the context is loaded directly
    from a pre-built manifest JSON file, skipping all discovery,
    rendering, and variable resolution.

    Args:
        project_file_path: Path to project file or directory.
            If None, auto-discovers from current working directory.
        target: Target environment name (e.g., "dev", "prod") to load
            target-specific variables from project file.
        init_vars: Runtime variable overrides (highest priority).
        manifest_file_path: Path to a manifest JSON file. When provided,
            skips all source file loading and uses the snapshot instead.
        refresh: If True, recreate context even if one already exists.
        store_in_global: Whether to store context globally.

    Returns:
        The initialized runtime context containing project settings,
        resolved variables, and metadata catalog.

    Raises:
        FileNotFoundError: If project file cannot be discovered.
        ValueError: If configuration is invalid or manifest is incompatible.
    """
    spec = cls._get_spec()
    return init_runtime(
        spec=spec,
        project_file_path=project_file_path,
        target=target,
        init_vars=init_vars,
        manifest_file_path=manifest_file_path,
        refresh=refresh,
        store_in_global=store_in_global,
    )

get_context classmethod

get_context(init=False)

Get framework runtime context, optionally auto-initializing.

If the context hasn't been initialized yet, you can either:

  • Set init=True to auto-initialize from the current directory
  • Call init() explicitly with specific parameters first

Parameters:

init (bool, default False): If True and the context doesn't exist, auto-initialize from the current working directory with default settings.

Returns:

MetaRuntimeContext: The framework's runtime context.

Raises:

RuntimeError: If the context hasn't been initialized and init=False.

Source code in src/kelp/meta/framework.py
@classmethod
def get_context(cls, init: bool = False) -> MetaRuntimeContext:
    """Get framework runtime context, optionally auto-initializing.

    If the context hasn't been initialized yet, you can either:
    - Set init=True to auto-initialize from current directory
    - Call init() explicitly with specific parameters first

    Args:
        init: If True and context doesn't exist, auto-initialize from
            current working directory with default settings.

    Returns:
        The framework's runtime context.

    Raises:
        RuntimeError: If context hasn't been initialized and init=False.
    """
    spec = cls._get_spec()
    ctx = get_context(framework_id=spec.framework_id)

    if ctx is None:
        if init:
            return cls.init()
        raise RuntimeError(
            f"Context for framework '{spec.framework_id}' has not been initialized. "
            "Call init() first or use get_context(init=True) to auto-initialize."
        )

    return ctx
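
The lookup-or-initialize flow in get_context can be reduced to a few lines. The sketch below is a simplified stand-in: _CONTEXTS, get_or_init, and build_context are hypothetical names, and the module-level dict only approximates however Kelp stores contexts per framework_id.

```python
from __future__ import annotations

from typing import Any, Callable

# Hypothetical store keyed by framework_id (an assumption, not Kelp's storage).
_CONTEXTS: dict[str, Any] = {}


def get_or_init(
    framework_id: str,
    factory: Callable[[], Any],
    init: bool = False,
) -> Any:
    """Return the stored context; build it with ``factory`` only if ``init`` is True."""
    ctx = _CONTEXTS.get(framework_id)
    if ctx is None:
        if not init:
            raise RuntimeError(
                f"Context for framework '{framework_id}' has not been initialized."
            )
        ctx = factory()
        _CONTEXTS[framework_id] = ctx
    return ctx


def build_context() -> dict[str, str]:
    """Placeholder for the discovery and loading step."""
    return {"framework_id": "myframework"}


try:
    get_or_init("myframework", build_context)
    raised = False
except RuntimeError:
    raised = True

first = get_or_init("myframework", build_context, init=True)
second = get_or_init("myframework", build_context)  # served from the store
```

The first call fails because nothing has been stored yet; after one call with init=True, later calls return the same stored object without rebuilding it.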

kelp.meta.MetaRuntimeContext pydantic-model

Bases: BaseModel

Runtime context for one framework namespace.

Attributes:

framework_id (str): Framework identifier owning this context.
project_root (str): Project root directory.
project_file_path (str): Resolved project file path.
target (str | None): Selected target name, if any.
runtime_vars (dict[str, Any]): Resolved runtime variables.
project_settings (Any): Framework-specific settings payload.
catalog (dict[str, list[Any]]): Loaded catalog payload for framework metadata objects.

Show JSON schema:
{
  "description": "Runtime context for one framework namespace.\n\nAttributes:\n    framework_id: Framework identifier owning this context.\n    project_root: Project root directory.\n    project_file_path: Resolved project file path.\n    target: Selected target name, if any.\n    runtime_vars: Resolved runtime variables.\n    project_settings: Framework-specific settings payload.\n    catalog: Loaded catalog payload for framework metadata objects.",
  "properties": {
    "framework_id": {
      "description": "Owning framework identifier",
      "title": "Framework Id",
      "type": "string"
    },
    "project_root": {
      "description": "Project root path",
      "title": "Project Root",
      "type": "string"
    },
    "project_file_path": {
      "description": "Project YAML file path",
      "title": "Project File Path",
      "type": "string"
    },
    "target": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Selected target",
      "title": "Target"
    },
    "runtime_vars": {
      "additionalProperties": true,
      "description": "Resolved variables",
      "title": "Runtime Vars",
      "type": "object"
    },
    "project_settings": {
      "default": null,
      "description": "Framework-specific settings Pydantic model",
      "title": "Project Settings"
    },
    "catalog": {
      "additionalProperties": {
        "items": {},
        "type": "array"
      },
      "description": "Loaded metadata catalog payload",
      "title": "Catalog",
      "type": "object"
    },
    "generated_from_manifest": {
      "default": false,
      "description": "Whether this context was loaded from a manifest file",
      "title": "Generated From Manifest",
      "type": "boolean"
    },
    "manifest_file_path": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Path to the manifest file if loaded from one",
      "title": "Manifest File Path"
    }
  },
  "required": [
    "framework_id",
    "project_root",
    "project_file_path"
  ],
  "title": "MetaRuntimeContext",
  "type": "object"
}

Config:

  • default: {'arbitrary_types_allowed': True}

Fields:

framework_id pydantic-field

framework_id

Owning framework identifier

project_root pydantic-field

project_root

Project root path

project_file_path pydantic-field

project_file_path

Project YAML file path

target pydantic-field

target = None

Selected target

runtime_vars pydantic-field

runtime_vars

Resolved variables

project_settings pydantic-field

project_settings = None

Framework-specific settings Pydantic model

catalog pydantic-field

catalog

Loaded metadata catalog payload

generated_from_manifest pydantic-field

generated_from_manifest = False

Whether this context was loaded from a manifest file

manifest_file_path pydantic-field

manifest_file_path = None

Path to the manifest file if loaded from one

model_config class-attribute instance-attribute

model_config = {'arbitrary_types_allowed': True}

catalog_index property

catalog_index

Get indexed catalog for efficient name-based lookups.

The MetaCatalog instance is memoized so lazy-built name indices and filter-result caches persist across repeated accesses.

Returns:

MetaCatalog: MetaCatalog with lazy-built indices for each object type.
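
Memoizing the index object matters because its caches live on the instance. A minimal sketch of that idea follows, using plain classes; TinyIndex and TinyContext are hypothetical stand-ins and do not reflect how the Pydantic model stores the cached MetaCatalog.

```python
from __future__ import annotations

from typing import Any


class TinyIndex:
    """Stand-in for an indexed catalog that builds its lookup table on demand."""

    def __init__(self, raw_data: dict[str, list[Any]]) -> None:
        self.raw_data = raw_data
        self.build_count = 0
        self._by_name: dict[str, dict[str, Any]] = {}

    def get(self, key: str, name: str) -> Any:
        if key not in self._by_name:
            self.build_count += 1  # counts how often the index is built
            self._by_name[key] = {o["name"]: o for o in self.raw_data.get(key, [])}
        return self._by_name[key][name]


class TinyContext:
    """Stand-in context that creates its index once and reuses it."""

    def __init__(self, catalog: dict[str, list[Any]]) -> None:
        self.catalog = catalog
        self._index: TinyIndex | None = None

    @property
    def catalog_index(self) -> TinyIndex:
        if self._index is None:
            self._index = TinyIndex(self.catalog)
        return self._index


ctx = TinyContext({"models": [{"name": "customers"}, {"name": "orders"}]})
ctx.catalog_index.get("models", "customers")
ctx.catalog_index.get("models", "orders")
```

Because both lookups go through the same TinyIndex instance, the name table is built once. If the property returned a fresh object on each access, every lookup would rebuild it.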

kelp.meta.MetaCatalog

MetaCatalog(raw_data)

Generic indexed catalog for metadata objects.

Provides name-based lookup and deduplication for any object types stored in a hierarchical dict structure. Maintains lazy-built indices per object type.

Example

>>> catalog = MetaCatalog(
...     raw_data={
...         "models": [{"name": "customers", ...}, ...],
...         "metric_views": [{"name": "daily_orders", ...}, ...],
...     }
... )
>>> table = catalog.get("models", "customers")
>>> all_tables = catalog.get_all("models")

Initialize catalog from raw payload.

Parameters:

raw_data (dict[str, list[Any]], required): Dict keyed by object type (e.g., "models", "metric_views") with lists of objects as values.

Source code in src/kelp/meta/catalog_index.py
def __init__(self, raw_data: dict[str, list[Any]]):
    """Initialize catalog from raw payload.

    Args:
        raw_data: Dict keyed by object type (e.g., "models", "metric_views")
            with lists of objects as values.
    """
    self._raw_data = raw_data
    self._indices: dict[str, dict[str, Any]] = {}
    self._built: dict[str, bool] = {}
    self._filter_cache: dict[str, list[Any]] = {}

get

get(catalog_key, name)

Get object by name from catalog.

Parameters:

catalog_key (str, required): Object type key (e.g., "models", "metric_views").
name (str, required): Object name to look up.

Returns:

Any: The first object matching the name.

Raises:

KeyError: If the object is not found.

Source code in src/kelp/meta/catalog_index.py
def get(self, catalog_key: str, name: str) -> Any:
    """Get object by name from catalog.

    Args:
        catalog_key: Object type key (e.g., "models", "metric_views").
        name: Object name to lookup.

    Returns:
        The first object matching the name.

    Raises:
        KeyError: If object is not found.
    """
    if catalog_key not in self._built or not self._built[catalog_key]:
        self._build_index(catalog_key)

    index = self._indices.get(catalog_key, {})
    obj = index.get(name)

    if obj is None:
        raise KeyError(f"{catalog_key} not found in catalog: {name}")

    return obj
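
The helper _build_index is not listed on this page. A minimal sketch of a name index that is consistent with "the first object matching the name" could look like the following; object_name and build_name_index are hypothetical names, and the first-wins handling of duplicates is an assumption drawn from that wording.

```python
from __future__ import annotations

from typing import Any


def object_name(obj: Any) -> str | None:
    """Read ``name`` from a dict or from an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get("name")
    return getattr(obj, "name", None)


def build_name_index(objects: list[Any]) -> dict[str, Any]:
    """Map names to objects, keeping the first object seen for each name."""
    index: dict[str, Any] = {}
    for obj in objects:
        name = object_name(obj)
        if name is not None:
            # setdefault leaves an existing entry alone, so later duplicates are ignored.
            index.setdefault(name, obj)
    return index


models = [
    {"name": "customers", "schema_": "bronze"},
    {"name": "orders", "schema_": "silver"},
    {"name": "customers", "schema_": "gold"},
]
index = build_name_index(models)
```

Here the second "customers" entry never reaches the index, so a lookup by name returns the bronze one.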

get_all

get_all(catalog_key)

Get all objects from a catalog.

Parameters:

catalog_key (str, required): Object type key (e.g., "models", "metric_views").

Returns:

list[Any]: List of all objects for this catalog key.

Source code in src/kelp/meta/catalog_index.py
def get_all(self, catalog_key: str) -> list[Any]:
    """Get all objects from a catalog.

    Args:
        catalog_key: Object type key (e.g., "models", "metric_views").

    Returns:
        List of all objects for this catalog key.
    """
    return self._raw_data.get(catalog_key, [])

filter_by

filter_by(catalog_key, attr, value)

Return objects whose attr satisfies value.

When value is a dict, matching uses recursive dict-subset semantics (see _is_filter_match). Any other value is compared with exact equality. Results are cached per (catalog_key, attr, value) combination.

This method is framework-agnostic — it does not know which attributes a model carries. Any attribute name can be passed.

Examples:

>>> catalog.filter_by("models", "meta", {"group": "abc"})
>>> catalog.filter_by("models", "schema_", "bronze")

Parameters:

catalog_key (str, required): Object type key (e.g., "models").
attr (str, required): Attribute name on the catalog objects to filter by.
value (Any, required): Expected value. A dict triggers recursive subset matching; any other type uses ==.

Returns:

list[Any]: List of matching objects (may be empty).

Source code in src/kelp/meta/catalog_index.py
def filter_by(
    self,
    catalog_key: str,
    attr: str,
    value: Any,
) -> list[Any]:
    """Return objects whose *attr* satisfies *value*.

    When *value* is a ``dict`` the matching uses recursive dict-subset
    semantics (see :func:`_is_filter_match`).  Scalar *value* entries
    use exact equality.  Results are cached per
    ``(catalog_key, attr, value)`` tuple.

    This method is framework-agnostic — it does not know which
    attributes a model carries.  Any attribute name can be passed.

    Examples:
        >>> catalog.filter_by("models", "meta", {"group": "abc"})
        >>> catalog.filter_by("models", "schema_", "bronze")

    Args:
        catalog_key: Object type key (e.g., ``"models"``).
        attr: Attribute name on the catalog objects to filter by.
        value: Expected value.  A ``dict`` triggers recursive subset
            matching; any other type uses ``==``.

    Returns:
        List of matching objects (may be empty).
    """
    cache_key = catalog_key + ":" + attr + ":" + json.dumps(value, sort_keys=True, default=str)
    cached = self._filter_cache.get(cache_key)
    if cached is not None:
        return cached

    result = [
        obj
        for obj in self.get_all(catalog_key)
        if _is_filter_match(
            _get_attr(obj, attr),
            value,
        )
    ]
    self._filter_cache[cache_key] = result
    return result
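
Neither _is_filter_match nor _get_attr is listed on this page. The sketch below shows one way recursive dict-subset matching could work, together with the cache-key construction from the source above. The names is_subset_match and cache_key are hypothetical, and the matching rules are an assumption based on the docstring, not the library's implementation.

```python
from __future__ import annotations

import json
from typing import Any


def is_subset_match(actual: Any, expected: Any) -> bool:
    """True if ``expected`` is recursively contained in ``actual`` (dicts) or equal (other types)."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return False
        return all(
            key in actual and is_subset_match(actual[key], sub_expected)
            for key, sub_expected in expected.items()
        )
    return actual == expected


def cache_key(catalog_key: str, attr: str, value: Any) -> str:
    """Build the key the same way as the listing above: JSON with sorted keys."""
    return catalog_key + ":" + attr + ":" + json.dumps(value, sort_keys=True, default=str)


models = [
    {"name": "customers", "schema_": "bronze", "meta": {"group": "abc", "owner": {"team": "core"}}},
    {"name": "orders", "schema_": "silver", "meta": {"group": "xyz"}},
]

by_group = [m["name"] for m in models if is_subset_match(m["meta"], {"group": "abc"})]
by_team = [m["name"] for m in models if is_subset_match(m["meta"], {"owner": {"team": "core"}})]
by_schema = [m["name"] for m in models if is_subset_match(m["schema_"], "silver")]
```

Sorting keys before serializing means two filter dicts with the same entries in a different order share one cache entry. Judging from the listing above, a cache hit returns the same list object that was stored, so callers should treat results as read-only.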

get_index

get_index(catalog_key)

Get name -> object index for a catalog key.

Builds index lazily on first access.

Parameters:

catalog_key (str, required): Object type key.

Returns:

dict[str, Any]: Dict mapping object names to objects.

Source code in src/kelp/meta/catalog_index.py
def get_index(self, catalog_key: str) -> dict[str, Any]:
    """Get name -> object index for a catalog key.

    Builds index lazily on first access.

    Args:
        catalog_key: Object type key.

    Returns:
        Dict mapping object names to objects.
    """
    if catalog_key not in self._built or not self._built[catalog_key]:
        self._build_index(catalog_key)

    return self._indices.get(catalog_key, {})

refresh_index

refresh_index(catalog_key=None)

Rebuild indices.

Parameters:

catalog_key (str | None, default None): Specific key to rebuild, or None to rebuild all.

Source code in src/kelp/meta/catalog_index.py
def refresh_index(self, catalog_key: str | None = None) -> None:
    """Rebuild indices.

    Args:
        catalog_key: Specific key to rebuild, or None to rebuild all.
    """
    self._filter_cache.clear()
    if catalog_key:
        self._built[catalog_key] = False
        self._build_index(catalog_key)
    else:
        for key in self._raw_data:
            self._built[key] = False
            self._build_index(key)
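
A rebuild is needed when the underlying lists change after an index has been built, because a lazily built index is a snapshot. The following sketch illustrates this with a stand-in class; TinyCatalog is hypothetical and only mirrors the lazy-build and refresh idea, not the full MetaCatalog behaviour.

```python
from __future__ import annotations

from typing import Any


class TinyCatalog:
    """Stand-in with a lazily built name index per catalog key."""

    def __init__(self, raw_data: dict[str, list[dict[str, Any]]]) -> None:
        self._raw_data = raw_data
        self._indices: dict[str, dict[str, Any]] = {}

    def get_index(self, key: str) -> dict[str, Any]:
        # Build once on first access, then reuse the stored index.
        if key not in self._indices:
            self._indices[key] = {o["name"]: o for o in self._raw_data.get(key, [])}
        return self._indices[key]

    def refresh_index(self, key: str) -> dict[str, Any]:
        # Drop the stored index so the next access rebuilds it from the raw data.
        self._indices.pop(key, None)
        return self.get_index(key)


raw = {"models": [{"name": "customers"}]}
catalog = TinyCatalog(raw)

before = set(catalog.get_index("models"))
raw["models"].append({"name": "orders"})  # mutate the raw data after indexing
stale = set(catalog.get_index("models"))
fresh = set(catalog.refresh_index("models"))
```

The index keeps returning the old snapshot until it is refreshed, which is why the listing above also clears the filter cache before rebuilding.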

keys

keys()

Return all catalog keys.

Source code in src/kelp/meta/catalog_index.py
def keys(self) -> list[str]:
    """Return all catalog keys."""
    return list(self._raw_data.keys())

kelp.meta.MetaProjectSpec pydantic-model

Bases: BaseModel

Specification describing one framework's project format.

Framework settings are intentionally isolated under project_header (for example kelp_project or xy_project). Shared sections like vars and targets remain framework-agnostic and are handled by the generic loading pipeline.

Attributes:

framework_id (str): Unique framework identifier for isolated context storage.
project_header (str): Top-level YAML key containing framework settings.
project_settings_model (type[BaseModel]): Pydantic model class for framework settings.
object_specs (tuple[MetaObjectSpec, ...]): Metadata object specs to load and validate.
project_filename (str): Default project file name used for discovery.
vars_key (str): Shared vars section key.
targets_key (str): Shared targets section key.
vars_overwrite_key (str): Optional path key for overwrite vars file.
target_var_name (str): Built-in variable name used for selected target.

Config:

  • arbitrary_types_allowed: True

Fields:

model_config class-attribute instance-attribute

model_config = ConfigDict(arbitrary_types_allowed=True)

framework_id pydantic-field

framework_id

Unique framework ID

project_header pydantic-field

project_header

Top-level key containing framework settings

project_settings_model pydantic-field

project_settings_model

Framework-specific Pydantic settings model

object_specs pydantic-field

object_specs

Object specs loaded by this framework

project_filename pydantic-field

project_filename = 'kelp_project.yml'

Default project file name used for discovery

vars_key pydantic-field

vars_key = 'vars'

Shared key containing root variables

targets_key pydantic-field

targets_key = 'targets'

Shared key containing environment targets

vars_overwrite_key pydantic-field

vars_overwrite_key = 'vars_overwrite'

Shared key containing overwrite vars file path

target_var_name pydantic-field

target_var_name = 'target'

Built-in variable name storing selected target

resolve_runtime_settings pydantic-field

resolve_runtime_settings = False

Whether init_runtime should resolve target/project_file via meta settings sources (args/spark/env/defaults).

settings_env_prefix pydantic-field

settings_env_prefix = 'KELP_'

Environment variable prefix for runtime settings resolution

settings_spark_prefix pydantic-field

settings_spark_prefix = 'kelp'

Spark conf prefix for runtime settings resolution

target_setting_key pydantic-field

target_setting_key = 'target'

Settings key used to resolve runtime target

project_file_setting_key pydantic-field

project_file_setting_key = 'project_file'

Settings key used to resolve project file path

manifest_file_path_setting_key pydantic-field

manifest_file_path_setting_key = 'manifest_file'

Settings key used to resolve manifest file path
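
The settings fields above name several sources (args, Spark conf, environment, defaults) without spelling out how keys are formed or which source wins. The sketch below is one plausible reading, offered as an assumption only: it treats the listed order as the priority order, forms environment names as prefix plus upper-cased key, and forms Spark keys as prefix, dot, key. The function resolve_setting and both naming schemes are hypothetical.

```python
from __future__ import annotations

import os
from typing import Mapping


def resolve_setting(
    key: str,
    *,
    arg: str | None = None,
    spark_conf: Mapping[str, str] | None = None,
    env: Mapping[str, str] | None = None,
    default: str | None = None,
    env_prefix: str = "KELP_",
    spark_prefix: str = "kelp",
) -> str | None:
    """Return the first value found, checking args, Spark conf, env, then default."""
    if arg is not None:
        return arg
    spark_key = f"{spark_prefix}.{key}"  # assumed naming, e.g. "kelp.target"
    if spark_conf is not None and spark_key in spark_conf:
        return spark_conf[spark_key]
    environ = os.environ if env is None else env
    env_key = f"{env_prefix}{key.upper()}"  # assumed naming, e.g. "KELP_TARGET"
    if env_key in environ:
        return environ[env_key]
    return default


env = {"KELP_TARGET": "dev"}
from_arg = resolve_setting("target", arg="prod", env=env)
from_spark = resolve_setting("target", spark_conf={"kelp.target": "staging"}, env=env)
from_env = resolve_setting("target", env=env)
from_default = resolve_setting("project_file", env={}, default="kelp_project.yml")
```

Passing env and spark_conf as plain mappings keeps the sketch testable without a Spark session or changes to the process environment.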

kelp.meta.MetaObjectSpec pydantic-model

Bases: BaseModel

Specification for one metadata object type.

Attributes:

root_key (str): YAML root key containing item list (for example kelp_models).
project_config_key (str): Key in project settings containing hierarchy defaults.
path_attr (str): Attribute in framework settings with metadata path(s).
catalog_attr (str): Output catalog key where parsed objects are stored.
model_class (type[Any]): Pydantic model class used for validation.
model_label (str): Human-readable label for diagnostics.
preprocess (Callable[[dict[str, Any], str | None], dict[str, Any]]): Optional hook to transform raw item payload before validation.
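
A preprocess hook only has to match the callable type given above. The following is a small sketch of such a hook; normalize_name is a hypothetical example, and because this reference gives only the type of the second argument (str | None) and not its meaning, the sketch accepts it without using it.

```python
from __future__ import annotations

from typing import Any


def normalize_name(item: dict[str, Any], context: str | None) -> dict[str, Any]:
    """Return a cleaned copy of a raw item before it is validated.

    ``context`` is accepted to match the hook signature; its meaning is not
    documented here, so this sketch ignores it.
    """
    cleaned = dict(item)  # copy so the raw payload is left untouched
    name = cleaned.get("name")
    if isinstance(name, str):
        cleaned["name"] = name.strip().lower()
    return cleaned


raw_item = {"name": "  Customers ", "schema_": "bronze"}
cleaned_item = normalize_name(raw_item, None)
```

Such a function would be passed as preprocess=normalize_name when constructing a MetaObjectSpec. Returning a copy instead of mutating the input keeps the hook free of side effects.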

Config:

  • arbitrary_types_allowed: True

Fields:

model_config class-attribute instance-attribute

model_config = ConfigDict(arbitrary_types_allowed=True)

root_key pydantic-field

root_key

YAML root key containing metadata items

project_config_key pydantic-field

project_config_key

Project settings key for hierarchy defaults

path_attr pydantic-field

path_attr

Framework settings attribute with metadata path(s)

catalog_attr pydantic-field

catalog_attr

Catalog payload key for validated items

model_class pydantic-field

model_class

Model class used to validate parsed items

model_label pydantic-field

model_label

Human-readable model label for error messages

preprocess pydantic-field

preprocess = _noop_preprocess

Optional preprocessing hook before model validation