Data Observability for Data Pipelines

Data Culpa Validator dashboard showing pipeline data ingested over two weeks

 

Data Culpa Validator gives data teams the observability they need to monitor data quality and trends in data pipelines and repositories. The Validator dashboard makes it easy to see:

  • Alerts about important changes in the pipeline or repository
  • Cohesion: a measure of the consistency of data in the pipeline or repository
  • Data volumes
  • Data schemas
  • Data values

Validator provides an open API that enables integration with any modern data pipeline orchestration tooling or with ad hoc scripts. We also provide a Python client, which you can install with `pip install dataculpa-client`; other language bindings are on our roadmap. Get in touch with us if you need something else.

Our flexible, asynchronous approach lets you monitor from whichever point in your pipeline makes the most sense for your needs: at ingest, at the output, or somewhere in between.
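To make the idea of asynchronous monitoring concrete, here is a minimal sketch of how a pipeline might hand records off to a monitor without waiting on it. It uses only the Python standard library and is not the `dataculpa-client` API: the `AsyncMonitor` class, its `observe` method, and the `send` callback are hypothetical names chosen for illustration, and the callback stands in for whatever call actually delivers data to Validator.

```python
import queue
import threading
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class AsyncMonitor:
    """Hypothetical helper: queues records and delivers them on a background thread."""

    def __init__(self, send: Callable[[str, List[Record]], None]) -> None:
        self._send = send  # stand-in for the real delivery call (assumption)
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def observe(self, stage: str, records: List[Record]) -> List[Record]:
        # Enqueue a shallow copy of each record and return immediately,
        # so the pipeline step is not blocked by monitoring.
        self._queue.put((stage, [dict(r) for r in records]))
        return records

    def _drain(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # sentinel pushed by close()
                break
            stage, records = item
            try:
                self._send(stage, records)
            except Exception as exc:
                # A monitoring failure should not take the pipeline down.
                print(f"monitoring send failed for {stage}: {exc}")

    def close(self) -> None:
        # Flush everything queued so far, then stop the worker.
        self._queue.put(None)
        self._worker.join()


# Example: observe the same pipeline at ingest and at output.
seen = []
monitor = AsyncMonitor(lambda stage, recs: seen.append((stage, len(recs))))

raw = [{"id": 1, "value": 10}, {"id": 2, "value": None}]
monitor.observe("ingest", raw)

cleaned = [r for r in raw if r["value"] is not None]
monitor.observe("output", cleaned)

monitor.close()
```

Because a single worker drains the queue in order, the observations arrive in the order the pipeline made them, and the pipeline code itself only pays the cost of a queue insert.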

You can also track intermediate results and tag sets of records with two kinds of metadata. The first is foundational metadata attached to a Validator watchpoint: environment names, version tags, or pipeline stage tags, which let you compare different environments. The second is metadata attached to a specific group of records that you’re sending to Validator, such as code state, debug notes, or other information that you might want later when reviewing or analyzing Data Culpa output.
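The two kinds of metadata can be pictured as two small data structures. The sketch below is an illustration only, not the Validator API or the `dataculpa-client` interface: `WatchpointTags`, `RecordBatch`, `to_payload`, `same_watchpoint`, the field names, and the sample values (including the commit string) are all hypothetical.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class WatchpointTags:
    """Foundational metadata: which variant of a watchpoint is reporting."""
    watchpoint: str
    environment: str = "default"
    stage: str = "default"
    version: str = "default"


@dataclass
class RecordBatch:
    """A group of records plus free-form metadata that applies to this group only."""
    tags: WatchpointTags
    records: List[Dict[str, Any]] = field(default_factory=list)
    notes: Dict[str, Any] = field(default_factory=dict)

    def to_payload(self) -> Dict[str, Any]:
        # Keep the two kinds of metadata separate in the outgoing structure.
        return {
            "watchpoint": asdict(self.tags),
            "batch_metadata": dict(self.notes),
            "records": list(self.records),
        }


def same_watchpoint(a: WatchpointTags, b: WatchpointTags) -> bool:
    """True when two tag sets describe the same watchpoint, so they can be compared."""
    return a.watchpoint == b.watchpoint


# The same watchpoint observed in two environments (placeholder values).
test_tags = WatchpointTags("orders", environment="test", stage="ingest", version="1.4.0")
prod_tags = WatchpointTags("orders", environment="prod", stage="ingest", version="1.3.2")

batch = RecordBatch(
    tags=test_tags,
    records=[{"order_id": 1, "total": 19.99}],
    notes={"git_commit": "abc1234", "debug": "re-run after schema fix"},
)
payload = batch.to_payload()
```

Keeping the watchpoint tags immutable and separate from the per-batch notes mirrors the distinction in the text: the tags say where the data came from, while the notes carry whatever context you want to find again later.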

 

Data Culpa detects changes in data at rest and data in pipeline input or output