Yet another cron alternative with a Web UI, but with many more capabilities. It aims to solve larger problems.

Dagu

Dagu is a powerful Cron alternative that comes with a Web UI. It allows you to define dependencies between commands as a Directed Acyclic Graph (DAG) in a declarative YAML format. Dagu simplifies the management and execution of complex workflows. It natively supports running Docker containers, making HTTP requests, and executing commands over SSH.

Highlights

Single binary file installation
Declarative YAML format for defining DAGs
Web UI for visually managing, rerunning, and monitoring pipelines
Use existing programs without any modification
Self-contained, with no need for a DBMS

Table of Contents

Highlights
Features
Use Cases
Web UI
Installation
Quick Start Guide
CLI
Documentation
Running as a daemon
Example Workflow
Motivation
Why Not Use an Existing Workflow Scheduler Like Airflow?
How It Works
License
Support and Community

Features

Web User Interface
Command Line Interface (CLI) with several commands for running and managing DAGs
YAML format for defining DAGs, with support for various features including:
- Execution of custom code snippets
- Parameters
- Command substitution
- Conditional logic
- Redirection of stdout and stderr
- Lifecycle hooks
- Repeating tasks
- Automatic retry
Executors for running different types of tasks:
- Running arbitrary Docker containers
- Making HTTP requests
- Sending emails
- Running jq commands
- Executing remote commands via SSH
Email notification
Scheduling with Cron expressions
REST API Interface
Basic Authentication over HTTPS

Use Cases

Data Pipeline Automation: Schedule ETL tasks for data processing and centralization.
Infrastructure Monitoring: Periodically check infrastructure components with HTTP requests or SSH commands.
Automated Reporting: Generate and send periodic reports via email.
Batch Processing: Schedule batch jobs for tasks like data cleansing or model training.
Task Dependency Management: Manage complex workflows with interdependent tasks.
Microservices Orchestration: Define and manage dependencies between microservices.
CI/CD Integration: Automate code deployment, testing, and environment updates.
Alerting System: Create notifications based on specific triggers or conditions.
Custom Task Automation: Define and schedule custom tasks using code snippets.

Web UI

Workflow Details

It shows the real-time status, logs, and workflow configuration. You can edit the workflow configuration in the browser.

You can switch to the vertical graph using the button in the top-right corner.

Workflows

It shows all workflows and their real-time status.

Search

It searches (grep-style) for the given text across all workflow definitions.

Execution History

It shows past execution results and logs.

Log Viewer

It shows the detailed log and standard output of each execution and step.

Installation

You can install Dagu with the install script, with Homebrew, with Docker, or by downloading the latest binary from the Releases page on GitHub.

Via Bash script

curl -L https://raw.githubusercontent.com/yohamta/dagu/main/scripts/downloader.sh | bash

Via GitHub Releases Page

Download the latest binary from the Releases page and place it in your $PATH (e.g. /usr/local/bin).

Via Homebrew (macOS)

brew install yohamta/tap/dagu

Upgrade to the latest version:

brew upgrade yohamta/tap/dagu

Via Docker

docker run \
--rm \
-p 8080:8080 \
-v $HOME/.dagu/dags:/home/dagu/.dagu/dags \
-v $HOME/.dagu/data:/home/dagu/.dagu/data \
-v $HOME/.dagu/logs:/home/dagu/.dagu/logs \
ghcr.io/dagu-dev/dagu:latest dagu start-all

Quick Start Guide

1. Launch the Web UI

Start the server and scheduler with the command dagu start-all and browse to http://127.0.0.1:8080 to explore the Web UI.

2. Create a New Workflow

Navigate to the DAG List page by clicking the menu in the left panel of the Web UI. Then create a DAG by clicking the NEW button at the top of the page. Enter example in the dialog.

Note: DAG (YAML) files will be placed in ~/.dagu/dags by default. See Configuration Options for more details.

3. Edit the Workflow

Go to the SPEC tab and click the Edit button. Copy and paste the following example, then click the Save button.

Example:

schedule: "* * * * *" # Run the DAG every minute
steps:
  - name: s1
    command: echo Hello Dagu
  - name: s2
    command: echo done!
    depends:
      - s1

4. Execute the Workflow

You can execute the example by pressing the Start button. You can see "Hello Dagu" in the log page in the Web UI.
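The same workflow can also be created without the Web UI. The following is a minimal sketch, assuming the default DAG directory ~/.dagu/dags noted above; DAGS_DIR is a hypothetical override used only in this sketch, and the dagu commands run only if the binary is found on the PATH.

```shell
#!/bin/bash
# Assumption: Dagu reads DAG files from ~/.dagu/dags (the default noted above).
# DAGS_DIR is a hypothetical override for this sketch, not a Dagu setting.
dags_dir="${DAGS_DIR:-$HOME/.dagu/dags}"
dag_file="$dags_dir/example.yaml"

mkdir -p "$dags_dir"

# Write the same two-step workflow shown in step 3.
cat > "$dag_file" <<'EOF'
schedule: "* * * * *" # Run the DAG every minute
steps:
  - name: s1
    command: echo Hello Dagu
  - name: s2
    command: echo done!
    depends:
      - s1
EOF

# Run the workflow only when the dagu binary is available.
if command -v dagu > /dev/null 2>&1; then
    dagu dry "$dag_file"     # dry-run first
    dagu start "$dag_file"   # then execute the DAG
    dagu status "$dag_file"  # and print its current status
else
    echo "dagu not found; wrote $dag_file only"
fi
```

Once the file is in place, the DAG should also appear in the Web UI's DAG list.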

CLI

# Runs the DAG
dagu start [--params=<params>] <file>

# Displays the current status of the DAG
dagu status <file>

# Re-runs the specified DAG run
dagu retry --req=<request-id> <file>

# Stops the DAG execution
dagu stop <file>

# Restarts the currently running DAG
dagu restart <file>

# Dry-runs the DAG
dagu dry [--params=<params>] <file>

# Launches both the web UI server and scheduler process
dagu start-all [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Launches the Dagu web UI server
dagu server [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Starts the scheduler process
dagu scheduler [--dags=<path to directory>]

# Shows the current binary version
dagu version

Documentation

The full documentation is available at https://dagu.readthedocs.io.

Running as a daemon

The easiest way to make sure the process is always running on your system is to create the script below and execute it every minute using cron (this approach does not require a root account):

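#!/bin/bash
# Keep-alive check: start Dagu only if it is not already running.
process="dagu start-all"
command="/usr/bin/dagu start-all"

# Look for a running "dagu start-all" process, ignoring the grep itself.
if ps ax | grep -v grep | grep "$process" > /dev/null
then
    exit
else
    # Not running: launch the server and scheduler in the background.
    $command &
fi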

exit
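
One way to run the script every minute is a crontab entry for the current user. The sketch below only builds and prints such an entry; the script path $HOME/bin/dagu-keepalive.sh is a hypothetical location chosen for illustration, not something Dagu prescribes.

```shell
#!/bin/bash
# Hypothetical path where the keep-alive script above was saved.
script="$HOME/bin/dagu-keepalive.sh"

# Five asterisks mean "every minute" in cron syntax.
entry="* * * * * $script"
echo "$entry"

# To install the entry, append it to the current user's crontab:
#   ( crontab -l 2>/dev/null; echo "$entry" ) | crontab -
```

The install command is left as a comment so that running the sketch does not modify your crontab. Remember to make the script executable (chmod +x) before scheduling it.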

Example Workflow

This example workflow showcases a data pipeline typically implemented in DevOps and Data Engineering scenarios. It demonstrates an end-to-end data processing cycle starting from data acquisition and cleansing to transformation, loading, analysis, reporting, and ultimately, cleanup.

The YAML code below represents this workflow:

# Environment variables used throughout the pipeline
env:
  - DATA_DIR: /data
  - SCRIPT_DIR: /scripts
  - LOG_DIR: /log
  # ... other variables can be added here

# Handlers to manage errors and cleanup after execution
handlerOn:
  failure:
    command: "echo error"
  exit:
    command: "echo clean up"

# The schedule for the workflow execution in cron format
# This schedule runs the workflow daily at 12:00 AM
schedule: "0 0 * * *"

steps:
  # Step 1: Pull the latest data from a data source
  - name: pull_data
    command: "sh"
    script: |
      echo `date '+%Y-%m-%d'`
    output: DATE

  # Step 2: Cleanse and prepare the data
  - name: cleanse_data
    command: echo cleansing ${DATA_DIR}/${DATE}.csv
    depends:
      - pull_data

  # Step 3: Transform the data
  - name: transform_data
    command: echo transforming ${DATA_DIR}/${DATE}_clean.csv
    depends:
      - cleanse_data

  # Parallel Step 1: Load the data into a database
  - name: load_data
    command: echo loading ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - transform_data

  # Parallel Step 2: Generate a statistical report
  - name: generate_report
    command: echo generating report ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - transform_data

  # Step 4: Run some analytics
  - name: run_analytics
    command: echo running analytics ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - load_data

  # Step 5: Send an email report
  - name: send_report
    command: echo sending email ${DATA_DIR}/${DATE}_analytics.csv
    depends:
      - run_analytics
      - generate_report

  # Step 6: Cleanup temporary files
  - name: cleanup
    command: echo removing ${DATE}*.csv
    depends:
      - send_report

Motivation

In legacy systems, job dependencies are often complex and implicit, making it challenging to manage and maintain workflows. As the number of cron jobs on a server grows into the hundreds, keeping track of these dependencies and determining which jobs to rerun on failure becomes increasingly difficult. Additionally, viewing logs and manually rerunning shell scripts one by one via SSH can be a tedious and time-consuming process.

Dagu addresses these pain points by providing a user-friendly solution for explicitly defining and visualizing workflows. With its intuitive web UI, Dagu simplifies the management of workflows, enabling users to easily check dependencies, monitor execution status, view logs, and control job execution with just a few clicks.

Why Not Use an Existing Workflow Scheduler Like Airflow?

While there are several existing workflow schedulers like Airflow, many of them require users to define workflows using a programming language such as Python. This can be problematic for legacy systems that have been in operation for an extended period and already have complex jobs written in languages like Perl or Shell Script.

Introducing another layer of abstraction and complexity on top of these existing codebases can hinder maintainability and increase the learning curve for team members. Dagu differentiates itself by being easy to use, self-contained, and requiring no coding. This makes Dagu particularly suitable for smaller projects or teams looking to introduce workflow orchestration without the overhead of a full-fledged scheduling system.

How It Works

Dagu is designed as a standalone command-line tool that leverages the local file system for data storage, eliminating the need for a separate database management system or cloud service. This self-contained nature simplifies installation and setup, making it easy to get started with Dagu. By combining a user-friendly web interface, a declarative YAML format, and compatibility with existing programs, Dagu provides an efficient and accessible solution for managing and orchestrating workflows in a variety of scenarios.
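
Because state lives on the local file system, it can be inspected with ordinary shell tools. The sketch below assumes a local install keeps its files under ~/.dagu in the same dags, data, and logs directories that the Docker volume mounts in the Installation section use; your layout may differ if you changed the configuration.

```shell
#!/bin/bash
# Assumption: state is kept under ~/.dagu with the dags/data/logs layout
# suggested by the Docker volume mounts in the Installation section.
base="$HOME/.dagu"
checked=0

for dir in dags data logs; do
    path="$base/$dir"
    if [ -d "$path" ]; then
        # Count regular files below the directory.
        count=$(find "$path" -type f | wc -l | tr -d ' ')
        echo "$path: $count file(s)"
    else
        echo "$path: not present"
    fi
    checked=$((checked + 1))
done
```

Backing up or migrating an installation then amounts to copying these directories, under the same assumption about the layout.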

Feel free to contribute in any way you want! Share ideas, ask questions, submit issues, and create pull requests. Check out our Contribution Guide for help getting started.

We welcome any and all contributions!

License

This project is licensed under the GNU GPLv3.

Support and Community

Join our Discord community to ask questions, request features, and share your ideas.

Open Source Agenda is not affiliated with "Dagu Go Dagu" Project. README Source: dagu-dev/dagu

Stars: 1,164
Last Commit: 1 week ago
Repository: dagu-dev/dagu
License: GPL-3.0
Homepage: https://dagu.readthedocs.io
