As part of its rich, extensible logging system, Dagster includes loggers. Loggers can be applied to all jobs within a code location or, in advanced cases, overridden at the job level.
Logging handlers are automatically invoked whenever ops in a job log messages, meaning out-of-the-box loggers track all execution events. Loggers can also be customized to meet your specific needs.
By default, Dagster comes with a built-in logger that tracks all execution events. Built-in loggers are defined internally using the LoggerDefinition class. The @logger decorator exposes a simpler API for the common logging use case and is typically what you should use to define your own loggers.
The decorated function should take a single argument, the init_context available during logger initialization, and return a logging.Logger. Refer to the Customizing loggers guide for an example.
The context object passed to every op execution includes the built-in log manager, context.log. It exposes the usual debug, info, warning, error, and critical methods you would expect anywhere else in Python.
When jobs are run, the logs stream back to the UI's Run details page in real time. The UI contains two types of logs - structured event and raw compute - which you can learn about below.
Structured logs are enriched and categorized with metadata, such as a label identifying which asset a log message is about, links to that asset's metadata, and the type of event. This structure also makes the logs easier to filter and search.
Logs stream back to the UI in real time:
Filtering log messages based on execution steps and log levels:
The raw compute logs contain logs for both stdout and stderr, which you can toggle between. To download the logs, click the arrow icon near the top right corner of the logs.
Custom log messages are also included in these logs. Notice in the following image that the Hello world! message is included on line three:
Note: Windows / Azure users may need to enable the environment variable PYTHONLEGACYWINDOWSSTDIO in order for compute logs to be displayed in the Dagster UI. To do that in PowerShell, run $Env:PYTHONLEGACYWINDOWSSTDIO = 1 and then restart the Dagster instance.
Errors in user code are caught by Dagster machinery so that jobs either halt gracefully or continue to execute, and messages including the original stack trace are logged both to the console and back to the UI.
For example, if an error is introduced into an op's logic:
```python
# demo_logger_error.py
from dagster import OpExecutionContext, job, op


@op
def hello_logs_error(context: OpExecutionContext):
    raise Exception("Somebody set up us the bomb")


@job
def demo_job_error():
    hello_logs_error()
```
Messages at level ERROR or above are highlighted both in the UI and in the console logs, so they can be easily identified even without filtering:
In many cases, especially for local development, this log viewer, coupled with op reexecution, is sufficient to enable a fast debug cycle for job implementation.
Suppose that we've gotten the kinks out of our jobs developing locally, and now we want to run in production—without all of the log spew from DEBUG messages that was helpful during development.
Just like ops, loggers can be configured when you run a job. For example, to filter all messages below ERROR out of the colored console logger, add the following snippet to your config YAML:
```yaml
loggers:
  console:
    config:
      log_level: ERROR
```
When a job with the above config is executed, you'll only see the ERROR level logs.
Logging is environment-specific: for example, you don't want messages generated by data scientists' local development loops to be aggregated with production messages. On the other hand, you may find that console logging is irrelevant or even counterproductive in production.
Dagster recognizes this by attaching loggers to jobs so that you can seamlessly switch from environment to environment without changing any code. For example, let's say you want to switch from Cloudwatch logging in production to console logging in development and test: