Data Validation Using Assistants API: Exploring AI-driven approach

This post extends my previous exploration of conducting data validation tasks using Large Language Models like ChatGPT. To provide context, at Qxf2, we execute a series of data quality tests using Great Expectations. Initially, we explored the possibility of employing ChatGPT for these validations, but it faced challenges in performing them effectively. Now, with the recent release of more advanced […]
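As a flavour of the kind of workflow the post explores, here is a minimal sketch of asking an Assistants API assistant to validate a small inline dataset. The assistant name, instructions, model and validation rules below are illustrative assumptions, not the exact setup from the post.

```python
# Minimal sketch: ask an Assistants API assistant to validate a tiny CSV snippet.
# Assumes OPENAI_API_KEY is set in the environment.
import time
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="data-validator",                      # illustrative name
    model="gpt-4-turbo",                        # illustrative model choice
    tools=[{"type": "code_interpreter"}],
    instructions="You validate tabular data and report every row that violates the stated rules.",
)

csv_snippet = "id,age,email\n1,34,a@example.com\n2,17,\n3,102,c@example.com"
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=f"Rules: age must be between 18 and 65, email must be non-empty.\n\n{csv_snippet}",
)

run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(2)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Messages come back newest first; print the assistant's findings
for message in client.beta.threads.messages.list(thread_id=thread.id):
    print(message.role, ":", message.content[0].text.value)
```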

Fine Tuning Model Evaluation using ROC and Precision Recall curves

Evaluating machine learning models is crucial for understanding their performance characteristics. In this blog post, we explore how ROC and Precision-Recall curves can improve the way we evaluate models. We also look at the practical side of using these curves across various thresholds to customise a model for specific requirements and achieve optimal performance. Why this post […]
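To make the threshold idea concrete, here is a small, self-contained sketch on a synthetic dataset (not the data from the post): compute both curves with scikit-learn and pick the threshold that maximises F1 instead of defaulting to 0.5.

```python
# Compare thresholds using ROC and Precision-Recall curves on synthetic data
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, precision_recall_curve, auc
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=7)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

fpr, tpr, roc_thresholds = roc_curve(y_test, scores)
precision, recall, pr_thresholds = precision_recall_curve(y_test, scores)
print("ROC AUC:", auc(fpr, tpr))

# Choose the threshold that maximises F1 rather than using the default 0.5
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best = np.argmax(f1)
print("Best threshold:", pr_thresholds[best], "F1:", f1[best])
```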

Understanding Text Classification Models with LIME

Why this post? I’ve always wondered how machine learning models function as black boxes, making predictions based on patterns learned from data. Despite their impressive accuracy, understanding the factors and features that influenced a particular prediction, and the decision-making process behind it, is a crucial and challenging task. The lack of transparency in these models adds complexity, making their internal workings less […]
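For a taste of what LIME produces, here is a minimal sketch that explains a prediction from a small scikit-learn text pipeline; the toy training data and class names stand in for the actual classifier discussed in the post.

```python
# Explain a text classifier's prediction with LIME on a toy pipeline
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["great build, tests pass", "flaky test, broken pipeline",
               "clean merge, well reviewed", "crash on startup, bad release"]
train_labels = [1, 0, 1, 0]  # 1 = positive comment, 0 = negative (toy labels)

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "broken build but well reviewed", pipeline.predict_proba, num_features=4
)
# Each tuple is (word, weight): the weight shows how much that word pushed the prediction
print(explanation.as_list())
```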

Tcases: Auto-generating API Tests

Most development teams that Qxf2 works with describe and document their APIs using the OpenAPI specification. This means there is a set structure for folks to write tests against. Handcrafting the simple test cases can be cumbersome and time-consuming. Given there is a set structure, we started to look for solutions that could create API tests based on an […]

Testing Charts using GPT-4 with Vision model

This post builds upon my prior exploration of testing charts with Transformers using the Visual Question Answering approach. I had presented charts to Transformer models like Pix2Struct and MatCha from Google (which were specifically trained on charts) and then queried them with questions. The outcomes proved satisfactory when the charts were well-defined with clearly labeled data points. Now, with the recent […]
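As an illustration of the vision approach, here is a minimal sketch that sends a chart image along with a question using the OpenAI chat completions image-input format. The model name, image path and question are illustrative assumptions rather than details from the post.

```python
# Ask a vision-capable GPT-4 model a question about a chart image
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("revenue_chart.png", "rb") as image_file:  # hypothetical chart file
    encoded = base64.b64encode(image_file.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",   # illustrative vision-capable model
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Which month has the highest revenue in this chart?"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encoded}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```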

Testing DALL-E by creating single panel cartoons

I tested DALL-E for a specific real-world use case: I wanted to see how good it was at producing single panel cartoons. My testing uncovered several promising aspects, some problems that need to be addressed, and an interesting testing technique for DALL-E and ChatGPT-like applications. I tried summarizing my findings in a blog post like an engineer would. […]
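For context on the setup, here is a minimal sketch of generating a single panel cartoon through the Images API; the prompt and size below are illustrative assumptions, not the prompts used in the post.

```python
# Generate a single panel cartoon with the OpenAI Images API
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

result = client.images.generate(
    model="dall-e-3",
    prompt="A single panel cartoon of a tester celebrating a green build, "
           "with a short caption inside the panel",   # illustrative prompt
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # URL of the generated image
```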

Data quality matters when building and refining a Classification Model

In the world of machine learning, data takes centre stage. It’s often said that data is the key to success. In this blog post, we emphasise the significance of data, especially when building a comment classification model. We will delve into how data quality, quantity, and biases significantly influence machine learning model performance. Additionally, we’ll explore techniques like undersampling as […]
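To show what the undersampling step can look like in practice, here is a small sketch using imbalanced-learn on synthetic data that stands in for the real comment dataset.

```python
# Random undersampling of the majority class with imbalanced-learn
from collections import Counter
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=42)
print("Before:", Counter(y))   # heavily skewed towards class 0

rus = RandomUnderSampler(random_state=42)
X_resampled, y_resampled = rus.fit_resample(X, y)
print("After:", Counter(y_resampled))  # both classes reduced to the minority count
```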

Use pytest to run Great Expectations checkpoints

At Qxf2, we’ve successfully integrated Great Expectations into the majority of our projects. We now have GitHub workflows in place to run Great Expectations checkpoints before deploying our applications to production. However, as our test suite expanded, we encountered a few challenges: 1. Triggering valid checkpoints. 2. Aggregating checkpoint results. To address these issues, we turned to pytest. In this post, […]
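As a rough idea of the approach, here is a minimal sketch (not Qxf2's actual suite) of driving Great Expectations checkpoints from pytest so that results roll up through the test runner; the checkpoint names are illustrative assumptions.

```python
# Run Great Expectations checkpoints as parametrized pytest tests
import great_expectations as gx
import pytest

CHECKPOINTS = ["users_table_checkpoint", "orders_table_checkpoint"]  # hypothetical names

@pytest.fixture(scope="session")
def context():
    # Loads the Great Expectations project configured in the repository
    return gx.get_context()

@pytest.mark.parametrize("checkpoint_name", CHECKPOINTS)
def test_checkpoint_passes(context, checkpoint_name):
    result = context.run_checkpoint(checkpoint_name=checkpoint_name)
    assert result.success, f"Checkpoint {checkpoint_name} failed"
```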