Slow tasks

Last updated 8 months ago

Sometimes pipelines become slow. There are multiple possible reasons for this, and in 95% of cases they are external to Ylem:

  • Slow processing of information on the database side

  • Slow response from external networks

  • Lack of resources to process pipelines that conflict with each other

The last point is especially important: if you have too many slow pipelines, they start conflicting with each other and waiting for available cluster resources. This causes delays and degrades overall performance.

Therefore, we introduced a slow task profiling tool.

With this tool, you will always be aware of slow tasks and queries, failed workflow runs, and incorrect inputs/outputs.

You can filter by the type of the task, a date range, and a processing threshold in milliseconds. The tool then shows which tasks of which pipelines took longer to process than the specified threshold, when each task was executed, and what output it returned. With this information, you can look deeper into your infrastructure and improve the performance of your pipelines.

Example of how to profile slow Aggregators within a certain date range
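
To make the filtering logic concrete, here is a minimal Python sketch of the same idea: select task runs by task type and time range, and keep those that exceed a threshold in milliseconds. This is only an illustration, not Ylem's implementation or API. The `TaskRun` record, its field names, the `find_slow_tasks` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TaskRun:
    # Hypothetical record of one task execution; not Ylem's actual data model.
    pipeline: str
    task_type: str
    started_at: datetime
    duration_ms: int
    output: str


def find_slow_tasks(runs, task_type, start, end, threshold_ms):
    """Return runs of the given type, started within [start, end],
    that took longer than threshold_ms, slowest first."""
    slow = [
        r for r in runs
        if r.task_type == task_type
        and start <= r.started_at <= end
        and r.duration_ms > threshold_ms
    ]
    return sorted(slow, key=lambda r: r.duration_ms, reverse=True)


# Made-up sample data for illustration only.
runs = [
    TaskRun("Daily KPIs", "aggregator", datetime(2024, 1, 5, 9, 0), 1800, "42"),
    TaskRun("Daily KPIs", "query", datetime(2024, 1, 5, 9, 0), 2500, "[...]"),
    TaskRun("Fraud alerts", "aggregator", datetime(2024, 1, 6, 14, 30), 300, "7"),
    TaskRun("Fraud alerts", "aggregator", datetime(2024, 2, 1, 8, 0), 5000, "9"),
]

# Aggregators that ran in January 2024 and took more than 1000 ms.
slow = find_slow_tasks(
    runs, "aggregator", datetime(2024, 1, 1), datetime(2024, 1, 31), 1000
)
for r in slow:
    print(f"{r.pipeline}: {r.duration_ms} ms at {r.started_at}, output={r.output}")
```

In this sample, only the "Daily KPIs" aggregator run matches: the other aggregator runs are either under the threshold or outside the selected range.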