Recently, I got a chance to work with AWS S3, where I compared JSON files stored in an S3 bucket against a pre-defined data structure stored as a dictionary object, using deepdiff. I can't replicate the entire system I tested, so for the purpose of this blog I have come up with the following prerequisites, setup, and flow:
Pre-requisite:
1. An AWS login is required. Keep AWS_ACCOUNT_ID, AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY in the aws_configuration_conf.py file. These details are user-specific.
2. Create an S3 bucket named compare-json.
3. Keep the JSON file sample.json in the S3 bucket.
4. In the samples folder, keep expected_message.json, which will be compared against sample.json.
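The contents of aws_configuration_conf.py are not shown in this post, so the following is only a sketch of what such a file might look like. The four names come from the list above; every value is a placeholder that I made up for illustration.

```python
# conf/aws_configuration_conf.py (hypothetical layout; the real file is not shown in this post)
# Replace the placeholder strings with your own details and keep this file
# out of version control.
AWS_ACCOUNT_ID = "123456789012"                    # placeholder account id
AWS_DEFAULT_REGION = "us-east-1"                   # placeholder region
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY_ID"           # placeholder
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_ACCESS_KEY"   # placeholder
```

All four values are kept as strings because the script later copies them into os.environ, which only accepts string values.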
Summary:
In the sections below, I will walk through these steps:
1. Create an S3 bucket.
2. Create sample.json in the S3 bucket; this object will be referred to as the Key in this blog.
3. Store expected_message.json in the samples template directory.
4. Execute the Python script s3_compare_json.py. I have kept the source code here.
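Step 2 can be done from the AWS console, but it can also be scripted. The sketch below is not part of the original source code: upload_json is a hypothetical helper that accepts any object exposing boto3's put_object call, so in real use you would pass boto3.client('s3'). A small in-memory stand-in client is used here so the example runs without AWS credentials.

```python
import json


def upload_json(s3_client, bucket, key, payload):
    """Serialise payload to JSON and store it in the bucket under key.

    s3_client is expected to behave like boto3.client('s3').
    """
    body = json.dumps(payload).encode("utf-8")
    return s3_client.put_object(
        Bucket=bucket, Key=key, Body=body, ContentType="application/json"
    )


class FakeS3Client:
    """Stand-in for the boto3 client that records uploads in memory."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body, ContentType=None):
        self.objects[(Bucket, Key)] = Body
        return {"ResponseMetadata": {"HTTPStatusCode": 200}}


client = FakeS3Client()  # swap for boto3.client('s3') against a real bucket
upload_json(client, "compare-json", "sample.json", {"NAME": "NADAL", "STATUS": "ACTIVE"})
stored = json.loads(client.objects[("compare-json", "sample.json")])
print(stored)
```

Passing the client in as an argument, rather than creating it inside the helper, is what makes it possible to try the upload logic without touching a real bucket.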
Steps:
1. Created an S3 bucket in AWS. The article here will help you create an S3 bucket in AWS.
2. Created sample.json in the S3 bucket; this object will be referred to as the Key in this blog. My sample.json looks like this:
{
    "PROFESSIONAL PLAYER": true,
    "NAME": "NADAL",
    "MATCHES PLAYED": 750,
    "MATCHES WON": 577,
    "MATCHES LOST": 123,
    "STATUS": "ACTIVE",
    "COUNTRY": "ESP",
    "TURNED PRO": "2000-02-29",
    "PRICE MONEY": {
        "AMOUNT": 8900005,
        "CURRENCY": "USD"
    },
    "ENDORSEMENT FEE": {
        "AMOUNT": 400000,
        "CURRENCY": "INR"
    }
}
3. Used the following expected_message.json:
{
    "PROFESSIONAL PLAYER": true,
    "NAME": "ABCDEF",
    "MATCHES PLAYED": 500,
    "MATCHES WON": 400,
    "MATCHES LOST": 100,
    "STATUS": "ACTIVE",
    "COUNTRY": "IND",
    "TURNED PRO": "9999-12-31",
    "PRICE MONEY": {
        "AMOUNT": 1000000,
        "CURRENCY": "USD"
    },
    "ENDORSEMENT FEE": {
        "AMOUNT": 500000,
        "CURRENCY": "USD"
    }
}
Note that some of the key values differ between the two JSON files. I kept these differences on purpose to demo the sample code.
4. Wrote the following Python script, s3_compare_json.py, to compare the Key with the expected JSON format. The compare_dict method compares the dictionary objects created from sample.json and expected_message.json. DeepDiff is used to find the difference between the two dictionary objects.
"""
This file will contain the following method and class:
1. Compare dict method.
2. s3utilities class, which has the following methods:
    2.a. Get Response from s3 client.
    2.b. Convert response into dict object.
    2.c. Get Response dict object
    2.d. Get expected dict from json stored as expected json
"""
import json
import logging
import os
import re
import sys
from pprint import pprint

import boto3
import deepdiff
from pythonjsonlogger import jsonlogger

# add the parent directory to the path so the conf package can be imported
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import conf.aws_configuration_conf as aws_conf

# logging
log_handler = logging.StreamHandler()
log_handler.setFormatter(jsonlogger.JsonFormatter())
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(log_handler)

# setting environment variables
os.environ["AWS_ACCOUNT_ID"] = aws_conf.AWS_ACCOUNT_ID
os.environ["AWS_DEFAULT_REGION"] = aws_conf.AWS_DEFAULT_REGION
os.environ["AWS_ACCESS_KEY_ID"] = aws_conf.AWS_ACCESS_KEY_ID
os.environ["AWS_SECRET_ACCESS_KEY"] = aws_conf.AWS_SECRET_ACCESS_KEY


# Defining method to compare dict
def compare_dict(response_dict, expected_dict):
    # paths containing 'TURNED PRO' or 'NAME' are left out of the comparison
    exclude_paths = re.compile(r"\'TURNED PRO\'|\'NAME\'")
    diff = deepdiff.DeepDiff(expected_dict, response_dict,
                             exclude_regex_paths=[exclude_paths], verbose_level=0)
    return diff


# class to write s3 utilities
class s3utilities():
    logger = logging.getLogger(__name__)

    def __init__(self, s3_bucket, key, template_directory):
        # initialising the class
        self.logger.info('s3 utilities activated')
        self.s3_bucket = s3_bucket
        self.key = key
        self.template_directory = template_directory
        self.s3_client = boto3.client('s3')

    def get_response(self, bucket, key):
        # Get Response s3 client object
        response = self.s3_client.get_object(Bucket=bucket, Key=key)
        return response

    def convert_dict_from_response(self, response):
        # Convert response into dict object
        response_json = ""
        for line in response["Body"].iter_lines():
            response_json += line.decode("utf-8")
        response_dict = json.loads(response_json)
        return response_dict

    def get_response_dict(self):
        # Get Response dict object
        response = self.get_response(self.s3_bucket, self.key)
        response_dict = self.convert_dict_from_response(response)
        return response_dict

    def get_expected_dict(self):
        # Get expected dict from json stored as expected json
        current_directory = os.path.dirname(os.path.realpath(__file__))
        message_template = os.path.join(current_directory,
                                        self.template_directory, 'expected_message.json')
        with open(message_template, 'r') as fp:
            expected_dict = json.loads(fp.read())
        return expected_dict


if __name__ == "__main__":
    # Testing s3utilities
    s3_bucket = "compare-json"
    key = 'sample.json'
    template_directory = 'samples'
    s3utilities_obj = s3utilities(s3_bucket, key, template_directory)
    response_dict = s3utilities_obj.get_response_dict()
    expected_dict = s3utilities_obj.get_expected_dict()
    diff = compare_dict(response_dict, expected_dict)
    print("=========================================================")
    print("Actual difference between two jsons is:")
    # pprint the diff object itself so that it is laid out across lines
    pprint(diff)
    print("=========================================================")
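The convert_dict_from_response method above joins the object body line by line. As an alternative, the body can be read in a single call and handed straight to json.loads. This is only a sketch: body_to_dict is a hypothetical name, and io.BytesIO stands in for the streaming body that get_object returns, so the snippet runs without AWS access.

```python
import io
import json


def body_to_dict(response):
    """Read the whole object body once and parse it as JSON."""
    raw_bytes = response["Body"].read()
    return json.loads(raw_bytes.decode("utf-8"))


# io.BytesIO is a stand-in for the streaming body of a real get_object response
fake_response = {"Body": io.BytesIO(b'{"NAME": "NADAL",\n "MATCHES PLAYED": 750}')}
parsed = body_to_dict(fake_response)
print(parsed)
```

Either way, the object is parsed in memory and never written to disk.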
When I ran the script with the command python s3_compare_json.py, the values that differ between the expected JSON and the sample JSON were shown on the console. Note that TURNED PRO and NAME also differ between the two JSON files, but they are filtered out of the result because they are excluded in the following code snippet:
exclude_paths = re.compile(r"\'TURNED PRO\'|\'NAME\'")
diff = deepdiff.DeepDiff(expected_dict, response_dict,
                         exclude_regex_paths=[exclude_paths], verbose_level=0)
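To make the effect of the exclusion pattern concrete, here is a simplified stand-in for the comparison that uses only the standard library. It is not DeepDiff and does not reproduce its output format. changed_values is a hypothetical helper that walks two dicts, builds paths in the root['KEY'] style, and skips any path the pattern matches. The data is a trimmed-down version of the two JSON files above.

```python
import re

EXCLUDE = re.compile(r"\'TURNED PRO\'|\'NAME\'")


def changed_values(expected, actual, path="root"):
    """Return {path: (expected_value, actual_value)} for differing leaves.

    Simplified: only keys present in both dicts are compared.
    """
    changes = {}
    for key in expected.keys() & actual.keys():
        child_path = f"{path}['{key}']"
        if EXCLUDE.search(child_path):
            continue  # excluded, like TURNED PRO and NAME in the script
        old, new = expected[key], actual[key]
        if isinstance(old, dict) and isinstance(new, dict):
            changes.update(changed_values(old, new, child_path))
        elif old != new:
            changes[child_path] = (old, new)
    return changes


expected = {"NAME": "ABCDEF", "COUNTRY": "IND", "TURNED PRO": "9999-12-31",
            "ENDORSEMENT FEE": {"AMOUNT": 500000, "CURRENCY": "USD"}}
actual = {"NAME": "NADAL", "COUNTRY": "ESP", "TURNED PRO": "2000-02-29",
          "ENDORSEMENT FEE": {"AMOUNT": 400000, "CURRENCY": "INR"}}

result = changed_values(expected, actual)
for changed_path in sorted(result):
    print(changed_path, result[changed_path])
```

Because the quotes are part of the pattern, 'NAME' only matches a key that is exactly NAME; a key such as AMOUNT is still compared.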
I hope you liked the blog. The source code is available here. You can find some useful documentation about deepdiff here.