In this blog, we’ll discuss all things data quality and data observability. We’ll present interviews with leading data scientists, and we’ll discuss our own work building data observability solutions for data engineers and data scientists in a broad range of industries.

Data Culpa Validator: Instant Data Monitoring Without Time-Consuming Customizations

Jun 22, 2022 | Data Culpa

A question we often get asked at Data Culpa is, “If you guys are going to monitor our data, isn’t that going to take a lot of custom engineering to set up?”

The answer is, “No.” Data Culpa Validator can be set up in less than an hour. No lengthy professional services engagement required.

Your data is unique and special, just like everyone else’s data.

And just as standard databases and data warehouses are used for a variety of data, so too can monitoring platforms stretch across industries and applications. At Data Culpa, our software has been deployed in logistics, medical, retail, finance, and other industries to help users understand data both in motion and at rest.

How Data Culpa Makes Set-Up Fast and Easy

Data Culpa Validator, our monitoring solution, provides an API that users can either leverage directly or use to coordinate with our built-in connectors for standard data sources, such as Snowflake, BigQuery, MongoDB, and others. We interpret data flows with minimal configuration and build our proprietary models of what the data looks like, day by day. We do this automatically without user intervention. With our patent-pending Instant Training technology, we can infer dates to build historical models so that you can start seeing the types of alerts you would have seen in the past within hours of setting up a Data Culpa instance.

When we think about data quality, we think about it from the data engineering perspective of consistency over time. Of course, as a monitoring platform we cannot know what’s right or wrong, but we can tell you whether today’s data is similar to behavior we’ve seen in the past.
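To make "consistency over time" concrete, here is a minimal sketch in Python that compares today's value of a metric against its own history. It is an illustration only and not Data Culpa's proprietary model: the function name `is_unusual`, the z-score approach, and the threshold value are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def is_unusual(history, today, z_threshold=3.0):
    """Return True when today's value sits more than z_threshold
    standard deviations away from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to judge either way
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History never varied, so any change at all is unusual.
        return today != mu
    return abs(today - mu) / sigma > z_threshold

# Hypothetical daily record counts for one data flow.
daily_record_counts = [1000, 1020, 990, 1010, 1005]

print(is_unusual(daily_record_counts, 1008))  # close to past behavior
print(is_unusual(daily_record_counts, 400))   # far from past behavior
```

Note that the check never says whether 400 records is "wrong". It only reports that 400 is unlike what the flow has produced before, which is the distinction drawn above.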

Humans can provide feedback to Data Culpa on alerts, and humans can tune alert thresholds up and down using simple configuration settings that select which alerts to apply and whether each one should be highly sensitive to change or focus only on egregious changes.
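As a rough sketch of what this kind of tuning could look like, the Python below maps a per-alert sensitivity setting to a cutoff on a deviation score (here, how many standard deviations a value sits from its historical mean). The alert names, the sensitivity levels, and the cutoff values are hypothetical and chosen for illustration; they are not Data Culpa's actual configuration format.

```python
# Hypothetical sensitivity levels mapped to deviation-score cutoffs.
# "high" sensitivity alerts on small changes; "low" only on egregious ones.
SENSITIVITY_TO_CUTOFF = {"high": 2.0, "medium": 3.0, "low": 5.0}

# Hypothetical per-alert configuration: which alerts are on, and how touchy.
alert_config = {
    "record_count": {"enabled": True, "sensitivity": "high"},
    "field_count": {"enabled": True, "sensitivity": "low"},
    "string_length": {"enabled": False, "sensitivity": "medium"},
}

def should_alert(alert_name, deviation_score, config=alert_config):
    """Decide whether a deviation is large enough to raise this alert."""
    settings = config.get(alert_name)
    if settings is None or not settings["enabled"]:
        return False  # unknown or disabled alerts never fire
    cutoff = SENSITIVITY_TO_CUTOFF[settings["sensitivity"]]
    return abs(deviation_score) > cutoff

print(should_alert("record_count", 2.5))   # high sensitivity: fires
print(should_alert("field_count", 2.5))    # low sensitivity: stays quiet
print(should_alert("string_length", 10.0)) # disabled: stays quiet
```

The same deviation produces an alert for one metric and silence for another, which is the point of letting people set sensitivity per alert.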

Data Culpa includes over 25 “cameras” that watch data flows for alert conditions out of the box, including record counts, field counts, field types, and various distribution interpretations for objects, numerical data, strings, dates, and categories. We also provide “pivots” to help refine distribution sensitivity across objects over time that are represented in tight rows. We expand embedded JSON data into top-level fields for schema tracking. We also, by default, point our “alert camera” across time in two different ways.
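To illustrate the idea of expanding embedded JSON for schema tracking, the following Python sketch flattens nested objects, including JSON stored inside string fields, into dotted top-level paths and records a type for each path. The function names `flatten` and `schema_of` and the dotted-path convention are assumptions for this example, not a description of how Validator is implemented.

```python
import json

def flatten(record, prefix=""):
    """Expand nested dicts, and JSON objects stored as strings,
    into a single-level dict keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, str):
            # A string may be hiding a JSON object; try to parse it.
            try:
                parsed = json.loads(value)
            except ValueError:
                parsed = None
            if isinstance(parsed, dict):
                value = parsed
        if isinstance(value, dict) and value:
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat

def schema_of(record):
    """Map each flattened path to the name of its Python type."""
    return {path: type(value).__name__ for path, value in flatten(record).items()}

# Hypothetical row with a JSON document embedded in a string column.
row = {"id": 7, "payload": '{"user": {"name": "Ada", "age": 36}, "tags": ["a"]}'}
print(schema_of(row))
```

With a schema like this computed per batch, a field count is just the number of paths, and a new or missing path between two days shows up as a simple difference between two dictionaries.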

Many data sources use strings liberally to hold dates, datetimes, embedded JSON, enumerations of categories, or even numbers. So we interpret types from what can be inferred from the data itself, rather than simply accepting the data types declared by your database. This lets us deal with untyped semi-structured data, too.
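A simplified sketch of this kind of inference, in Python, is shown below. It tries progressively looser interpretations of a string value and falls back to plain "string". The type labels, the two date formats checked, and the majority-vote rule for a column are assumptions made for the example; a production system would need to handle many more formats and edge cases.

```python
import json
from collections import Counter
from datetime import datetime

def infer_type(text):
    """Guess what a string value really holds."""
    s = text.strip()
    if s == "":
        return "empty"
    # Try numeric interpretations first, then dates, then JSON.
    for label, parser in (("integer", int), ("float", float)):
        try:
            parser(s)
            return label
        except ValueError:
            pass
    for label, fmt in (("date", "%Y-%m-%d"), ("datetime", "%Y-%m-%dT%H:%M:%S")):
        try:
            datetime.strptime(s, fmt)
            return label
        except ValueError:
            pass
    try:
        if isinstance(json.loads(s), (dict, list)):
            return "json"
    except ValueError:
        pass
    return "string"

def infer_column_type(values):
    """Label a column with the most common inferred type among its values."""
    counts = Counter(infer_type(v) for v in values)
    return counts.most_common(1)[0][0]

# Hypothetical values, all stored as strings in the source system.
for value in ["42", "3.14", "2022-06-22", "2022-06-22T10:30:00", '{"a": 1}', "hello"]:
    print(value, "->", infer_type(value))
```

Tracking the inferred type rather than the declared one means a column that is nominally text but has always held dates can still raise an alert the day it starts holding something else.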

Together, these interpretations mean Data Culpa can look at your data from over 100 “camera angles” concurrently, providing robust monitoring out of the box without tuning or custom code.

When we first started Data Culpa, there were companies offering quality monitoring through big, expensive consulting engagements ($50,000 and vendor engineers on site for 2 weeks doing integration). We have done on-site work at big vendors in past jobs, and we know no one wants to do this. Nor do we want to support a ton of custom engineering. Everything we build, we build for our entire customer base.

See It For Yourself

So, if you want to start using Validator, you can do so without waiting for (and paying for) any extensive professional services engagement.

Just sign up for a free trial, set up your watchpoints, and start getting the insights you’re missing about the behavior of the data that matters to you.

Have Questions?

Feel free to reach out to us!

