
Running and scheduling pipelines


Ylem’s pipelines can be triggered in 5 different ways:

Manually on-demand

Both the button to schedule the pipeline and the button to run it manually are located in the top-right corner of the pipeline's canvas.

A manual trigger is typically used for:

  • Testing and debugging new pipelines or pipelines that stopped working as expected

  • Pipelines that have no periodic schedule and don't need to be triggered through the API, but occasionally need to be run on demand

After clicking the "Run pipeline" button, a user gets an output that shows the status of every task and can also open the full-text output log.

Automatically by a schedule

Scheduling is the most common way of triggering pipelines that need to run periodically, for example for reporting or data preparation.

A schedule can be configured either through the visual interface or, if you are familiar with it, in the cron format.
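For illustration, here are a few schedules in the standard five-field cron syntax (minute, hour, day of month, month, day of week); the exact dialect accepted by the scheduler may differ slightly, so verify your expression in the interface:

```
*/15 * * * *   # every 15 minutes
0 9 * * 1-5    # at 09:00, Monday through Friday
0 0 1 * *      # at midnight on the first day of every month
```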

Automatically via API

For example, pipelines can be triggered by Apache Airflow, Jenkins, or any other software that is already used in your company for workflow or pipeline orchestration.

More information about triggering pipelines via API can be found in the OAuth Clients and API Endpoints pages.
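As an illustration only, a call from such an orchestrator could look like the following Python sketch. The base URL, endpoint path, and payload here are placeholders rather than Ylem's actual API, so consult the API Endpoints page for the exact request format and the OAuth Clients page for obtaining a token:

```python
import requests

# All values below are placeholders; consult the API Endpoints reference for
# the real endpoint, payload, and authentication details.
API_BASE = "https://api.ylem.example"                    # hypothetical base URL
PIPELINE_UUID = "00000000-0000-0000-0000-000000000000"   # your pipeline's UUID
ACCESS_TOKEN = "..."                                     # token issued to your OAuth client

response = requests.post(
    f"{API_BASE}/pipelines/{PIPELINE_UUID}/run",         # hypothetical endpoint path
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Pipeline run accepted:", response.status_code)
```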

In real-time from Apache Kafka, RabbitMQ, Google Pub/Sub, etc.

When placed as the first task in a pipeline, external_trigger allows you to trigger the pipeline from your data streaming platforms and make it 100% real-time.
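Here is a minimal sketch of the producing side, assuming a Kafka topic named orders that a Kafka integration in Ylem listens to (the broker address, topic name, and payload shape are assumptions for illustration):

```python
import json

from kafka import KafkaProducer  # pip install kafka-python

# Assumptions for illustration: broker address, topic name, and payload shape.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda value: json.dumps(value).encode("utf-8"),
)

# Each message published to this topic can start a pipeline whose first task
# is external_trigger, via the Kafka integration subscribed to the topic.
producer.send("orders", {"order_id": 42, "status": "created"})
producer.flush()
```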

More specific details can be found on the detailed pages for each integration:

  • Apache Kafka
  • RabbitMQ
  • Google Pub/Sub
  • AWS Lambda

From a metric when its value matches a condition

In each metric, you can define which pipelines to execute depending on its value.

Some of the more detailed examples of metrics can be found in the list of our use cases. Our library of templates also contains multiple metric templates, which you can use for configuring your metrics.
