Data Validation Using Assistants API: Exploring AI-driven approach
This post extends my previous exploration of conducting data validation tasks using Large Language Models like ChatGPT. To provide context, at Qxf2, we execute a series of data quality tests using Great Expectations. Initially, we explored the possibility of employing ChatGPT for these validations, but it faced challenges in performing them effectively. Now, with the recent release of more advanced […]
Validating Data Made Easy: A Dive into Soda Core
This post is a hands-on example to help you start writing data quality checks with Soda Core. We realize a lot of our readers are fairly new to the topic of writing data quality checks. So, in this post, we will also go over some of the most common data checks you might want to implement on your structured data. […]
Data Validation with ChatGPT: Trials and Insights
We conducted a study to explore the feasibility of using large language models like ChatGPT to validate numerical data. At Qxf2, we execute a set of data quality tests using Great Expectations. Our goal was to assess how efficiently ChatGPT could carry out these validations instead. To achieve this, I selected two specific scenarios. […]
ValueError in Evaluation Parameter of Great Expectations
I was recently greeted with a ValueError when using evaluation parameters in Great Expectations. If you hit a similar error, read on. At the time of writing, I was working on implementing data tests using Great Expectations. While trying out Evaluation Parameters in Expectations for the first time, I faced a cryptic ValueError message related to […]
Tracking table row count using Metric Store of Great Expectations
In this blog post, we will explore the metric store and evaluation parameters features of Great Expectations. These help us track data about our data collection. I will help you write a test to keep track of table row count using Great Expectations. This post is the fifth in our series to help testers implement useful tests with Great Expectations for […]
Outlier detection algorithms using Great Expectations
In this blog post, we will implement outlier detection algorithms using Great Expectations. There is a socio-technical context to this blog post. On the social side of things, I want to emphasize that Great Expectations can be used to provide business folks with information in a timely manner. On the technical side, we will look at how to use Great Expectations […]
Host Great Expectations Test Results on Netlify
In this post, we will help you host your Great Expectations test results on Netlify, a web hosting platform. This post is the third in our series to help testers implement useful tests with Great Expectations for data validation. Implementing tests and running them is only half the job. As testers, it is essential to make sure our test results reach […]
Run Great Expectations workflow using GitHub Actions
In this post, we will help you run one Great Expectations test as part of your CI/CD pipeline using GitHub Actions. This post is the second in our series to help testers implement useful tests with Great Expectations for data validation. If your instinct says that adding a single test to a CI/CD pipeline should not be the next part, read […]
Data validation using Great Expectations with a real-world scenario: Part 1
I recently started exploring Great Expectations for performing data validation in one of my projects. It is an open-source Python library for testing data pipelines and validating data. The tool is being actively developed and is feature rich. At Qxf2, we feel that this tool has a lot of potential to help testers who need to grapple with […]