ABOUT THIS BLOG

In this blog, we’ll discuss all things data quality and data observability. We’ll present interviews with leading data scientists, and we’ll discuss our own work building data observability solutions for data engineers and data scientists in a broad range of industries.

Displaying Posts From: Data Culpa
Effective Data Monitoring Requires a Relative Baseline

Recently I was talking to a customer who had used a competitor’s product for data monitoring. It sounded like the product had the users specify parameters about the data; e.g., “this column should never be null.” This is all well and good, except that the product then...

The Data Quality Hierarchy of Needs

Just as Maslow identified a hierarchy of needs for people, data teams have a hierarchy of needs, beginning with data freshness; including volumes, schemas, and values; and culminating with lineage. In this blog post, which was published in the Data Science area of the...

Five Neat Tricks with Data Culpa Alerts

Every data team needs to keep an eye on its data. At the same time, no data team wants to be deluged with alerts. Ideally, alerts should direct your attention to the most important changes taking place in your data. If they call attention to every change, you’ll...

Validating Data for Pipelines with Data Culpa

Consistent pipeline behavior is critical for any data process. You can use Data Culpa Validator to ensure consistent operation of pipelines as well as data at rest in databases, data lakes, and data warehouses. This introduction shows you how to validate your results...

Using Timeshift in Data Culpa Validator

One of the best features we offer customers who are just getting started with Data Culpa is our "timeshift" feature. Timeshift lets us extract a point in time for a row or a document and have Validator evaluate the data contained within that row or document as if it...

Where Most Data Observability Solutions Fall Short

Observability is the analysis of a system based on its outputs. By analyzing the outputs of a system at various points, it should be possible to infer the internal state of the system and to diagnose problems the system is experiencing. This sounds useful for data...

Introducing Data Culpa Validator

What’s your data doing? Can you tell? Data teams tell us they need better visibility into data pipelines, integrations, repositories, and data lakes. You can hand-code a bunch of unit tests to check for known boundary cases. But coverage will always be limited. And...

A Blog about Data Quality

At Data Culpa, we're building solutions to help data engineers and data scientists catch data quality problems before they jeopardize data pipeline results. Nearly every business today is becoming more data-centric. Data not only drives operations and decision making;...
