Use pytest to run Great Expectations checkpoints

At Qxf2, we’ve successfully integrated Great Expectations into the majority of our projects. We now have GitHub workflows in place to run Great Expectations checkpoints before deploying our applications to production. However, as our test suite expanded, we encountered a few challenges:
1. Triggering valid checkpoints.
2. Aggregating checkpoint results.
To address these issues, we turned to pytest. In this post, I’ll demonstrate how to modify your checkpoint scripts to make them compatible with pytest for seamless integration.


Modifying Checkpoint Scripts for pytest Compatibility:

This is roughly how your Python checkpoint script will look after you generate it with the command great_expectations checkpoint script github_stats_checkpoint:

# Contents of github_stats_checkpoint.py module
import sys

from great_expectations.checkpoint.types.checkpoint_result import CheckpointResult
from great_expectations.data_context import DataContext

data_context: DataContext = DataContext(context_root_dir="great_expectations")
result: CheckpointResult = data_context.run_checkpoint(
    checkpoint_name="github_stats_checkpoint",
    batch_request=None,
    run_name=None,
)
if not result["success"]:
    print("Validation failed!")
    sys.exit(1)
print("Validation succeeded!")
sys.exit(0)

To ensure compatibility with pytest, you’ll need to implement the following changes in your checkpoint scripts:
1. Rename the checkpoint script
2. Create a test function
3. Create a test marker

1. Rename the checkpoint script:

To enable pytest to identify and execute the test, rename the checkpoint script. The filename should either start with test_ or end with _test.py, i.e. if the name of the checkpoint script is github_stats_checkpoint.py, rename it to test_github_stats_checkpoint.py.

2. Create a test function:

Revise the checkpoint script to include:
a. A test function whose name starts with test_, for example test_validate_stats.
b. An assert statement inside the test function to validate the test status.

# Contents of the new test_github_stats_checkpoint.py module
from great_expectations.checkpoint.types.checkpoint_result import CheckpointResult
from great_expectations.data_context import DataContext

def test_validate_stats(): # a. Test function adhering to the naming convention
    data_context: DataContext = DataContext(context_root_dir="great_expectations")
    result: CheckpointResult = data_context.run_checkpoint(
        checkpoint_name="github_stats_checkpoint",
        batch_request=None,
        run_name=None,
    )
    assert result["success"], "Validation failed" # b. Assert statement to validate the status

Note: For further information on Python test discovery conventions, you can refer to the pytest conventions documentation.
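
If you end up with several such checkpoint tests, you can optionally move the DataContext setup into a shared pytest fixture so that it is built only once per test session. Below is a minimal sketch, assuming the great_expectations folder sits at your project root; the fixture name data_context is our own choice, not something pytest or Great Expectations requires.

# Contents of conftest.py - optional shared fixture (a sketch)
import pytest
from great_expectations.data_context import DataContext

@pytest.fixture(scope="session")
def data_context():
    # Build the DataContext once and share it across all checkpoint tests in the session
    return DataContext(context_root_dir="great_expectations")

A checkpoint test can then accept data_context as an argument instead of constructing it inline.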

3. Create a test marker:

Enhance your test function by adding a marker. Markers let you select and run tests that belong to the same category. Adding the @pytest.mark.marker_name decorator above the test function applies the marker.

from great_expectations.checkpoint.types.checkpoint_result import CheckpointResult
from great_expectations.data_context import DataContext
import pytest

@pytest.mark.github_stats
def test_validate_stats():
    data_context: DataContext = DataContext(context_root_dir="great_expectations")
    result: CheckpointResult = data_context.run_checkpoint(
        checkpoint_name="github_stats_checkpoint",
        batch_request=None,
        run_name=None,
    )
    assert result["success"], "Validation failed"
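
One thing to keep in mind: recent versions of pytest warn about unknown marks unless custom markers are registered. You can register the marker in pytest.ini or, as sketched below, from a conftest.py; the description string here is just an example.

# Contents of conftest.py - register the custom marker so pytest does not flag it as unknown
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "github_stats: checkpoint tests for the GitHub stats data"
    )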

How to run tests:

Now that you’ve adjusted the checkpoint scripts to align with pytest, running them is straightforward.
To verify that the new test is discoverable, run the following command:

$ pytest --collect-only tests/data_validation/great_expectations/utils
===================================================== test session starts =====================================================
platform xxxxxx -- Python 3.9.17, pytest-7.2.1, pluggy-1.0.0
rootdir: /project_root_dir_location, configfile: pytest.ini
plugins: anyio-3.6.2
collected 1 item
 
<Package utils>
  <Module test_github_stats_checkpoint.py>
    <Function test_validate_stats>

Once you have confirmed that the test is discoverable, use the following command to trigger the appropriate checkpoints and streamline the aggregation of checkpoint results:

$ pytest -v -m github_stats tests/data_validation/great_expectations/utils
=========================== test session starts ============================
platform xxxxxx -- Python 3.9.17, pytest-7.2.1, pluggy-1.0.0
rootdir: /project_root_dir_location, configfile: pytest.ini
plugins: anyio-3.6.2
collected 1 item
 
test_github_stats_checkpoint.py::test_validate_stats PASSED                                [100%]
 
===================== 1 passed in 0.12s ======================
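
If you maintain several checkpoints, you can go one step further and drive them all from a single parametrized test, which keeps the aggregation of results in one place. The sketch below is only illustrative: github_stats_checkpoint comes from this post, while survey_stats_checkpoint is a hypothetical second checkpoint name you would replace with your own.

# Contents of a possible test_checkpoints.py - one parametrized test per checkpoint (a sketch)
import pytest
from great_expectations.data_context import DataContext

# Replace these with your own checkpoint names; survey_stats_checkpoint is hypothetical
CHECKPOINTS = ["github_stats_checkpoint", "survey_stats_checkpoint"]

@pytest.mark.parametrize("checkpoint_name", CHECKPOINTS)
def test_checkpoint(checkpoint_name):
    data_context = DataContext(context_root_dir="great_expectations")
    result = data_context.run_checkpoint(
        checkpoint_name=checkpoint_name,
        batch_request=None,
        run_name=None,
    )
    assert result["success"], f"Validation failed for {checkpoint_name}"

pytest reports one pass/fail per checkpoint, so a single run gives you an aggregated view of all your validations.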

There you have it—a suitable test runner for executing your checkpoints with just a few simple modifications.


Hire technical testers from Qxf2

Hire Qxf2 for our proficiency in technical testing and problem-solving skills. We extend beyond conventional test automation to address significant testing complexities, enabling your team to iterate swiftly and deliver software of exceptional quality. Allow us to assist you in optimizing your testing methodologies and enhancing the overall quality of your products.

