Core concepts#

Learn about Dagster's core concepts and how to use them in your data platform.


Asset definition#

An asset is an object in persistent storage, such as a table, file, or persisted machine learning model. An asset definition is a Dagster object that couples an asset to the function and upstream assets used to produce its contents.


Automation#

Dagster offers several ways to run data pipelines without manual intervention, including traditional scheduling and event-based triggers.


Partitions and backfills#

An asset definition or job can represent a collection of partitions that can be tracked and executed independently.


Resources & configuration#

Resources enable you to separate logic from external dependencies, which makes it possible to develop and test in multiple environments.

Learn more about resources.

Additionally, Dagster provides a configuration system that allows you to document, schematize, and error-check your configuration.


Code locations#

A code location is a collection of Dagster definitions, including assets, jobs, schedules, sensors, and resources. Dagster tools like the Dagster webserver/UI and CLI use code locations to load your code.


Dagster UI#

The Dagster UI is a web-based interface for viewing and interacting with Dagster objects.

Learn more about the Dagster UI.


Testing#

Dagster enables you to build testable and maintainable data applications. It lets you unit-test your data applications, separate business logic from environments, and set explicit expectations on uncontrollable inputs.

Learn more about testing.


Advanced concepts#

Ops and jobs#

Ops typically perform relatively simple tasks, such as executing a database query or sending a Slack message.

An op graph is a set of interconnected ops or sub-graphs. While individual ops typically perform simple tasks, ops can be assembled into a graph or job to accomplish complex tasks.

Metadata & tags#

Apply tags and metadata to organize your project and provide useful context to other members of your team.

Dagster Pipes#

Dagster Pipes is a toolkit for building integrations between Dagster and external execution environments.

I/O management#

I/O managers are user-provided objects that store asset and op outputs and load them as inputs to downstream assets and ops.

Dagster Types#

The Dagster Type system provides gradual, opt-in typing for the inputs and outputs of assets and ops.

Learn more about Dagster types.

Logging#

Dagster's built-in logger is part of a rich, extensible logging system and tracks all execution events. Loggers can also be customized to fit your infrastructure.

GraphQL API#

The GraphQL API allows you to interact programmatically with Dagster.