Building Custom Operator in Airflow for external API calls
In this blog, I will show how to build a Custom Operator in Airflow to make calls to external APIs. As we do this, we will see how to use secrets in Airflow, make tasks communicate with each other and interpret the output of an SSH Operator. Although this blog refers to posting messages to Skype, all it does is […]
Automate Rust Lambda deploy using GitHub Actions
We are open-sourcing a GitHub Action that will let you deploy AWS Lambdas written in Rust. This helps us automate deploying Rust AWS Lambda functions using GitHub Actions. If you are interested in the implementation details of this GitHub Action, you can look at the source code here. You can also file issues in that repository to ask for any […]
Error handling with Rust using delta-rs as an example
In this blog, I will explore the various ways to handle and/or propagate errors with Rust. Qxf2 is transitioning to using Rust as our primary programming language. The Rust documentation is great and has a nice section on error handling and propagation. However, every time we tried to implement some of those ideas in our day-to-day work, we […]
ValueError in Evaluation Parameter of Great Expectations
I was recently greeted with a ValueError when using evaluation parameters in Great Expectations. If you hit a similar error, read on. At the time of writing this blog post, I was working on implementing data tests using Great Expectations. While trying out Evaluation Parameters in Expectations for the first time, I faced a cryptic ValueError message related to […]
Tracking table row count using Metric Store of Great Expectations
In this blog post, we will explore the metric store and evaluation parameters features of Great Expectations. These help us track data about our data collection. I will help you write a test to keep track of table row count using Great Expectations. This post is the fifth in our series to help testers implement useful tests with Great Expectations for […]
Outlier detection algorithms using Great Expectations
In this blog post, we will implement outlier detection algorithms using Great Expectations. There is a socio-technical context to this blog post. On the social side of things, I want to emphasize that Great Expectations can be used to provide business folks information in a timely manner. On the technical side, we will look at how to use Great Expectations […]
Host Great Expectations Test Results on Netlify
In this post, we will help you host your Great Expectations test results on Netlify, a web hosting platform. This post is the third in our series to help testers implement useful tests with Great Expectations for data validation. Implementing tests and running them is only half the job. As testers, it is essential to make sure our test results reach […]
Run Great Expectations workflow using GitHub Actions
In this post, we will help you run one Great Expectations test as part of your CI/CD pipeline using GitHub Actions. This post is the second in our series to help testers implement useful tests with Great Expectations for data validation. If your instinct says that adding a single test to a CI/CD pipeline should not be the next part, read […]
Data validation using Great Expectations with a real-world scenario: Part 1
I recently started exploring Great Expectations for performing data validation in one of my projects. It is an open-source Python library to test data pipelines and helps in validating data. The tool is being actively developed and is feature-rich. At Qxf2, we feel that this tool has a lot of potential to help testers who need to grapple with […]
Optimize running large number of tasks using Dask
I want to share my experience of using Dask tasks and certain best practices/optimizations I implemented. Some prior understanding of Dask is required to follow along. To implement parallelism, I used Dask futures, one of the Dask collections. I referred to the Dask best practices on how to improve performance and incorporated some of them into my project. Hope you find […]