Scraping websites using Octoparse (no programming!)
Did you know you can scrape data from webpages without writing a single line of code? In this post, we will talk about a tool called Octoparse. We used Octoparse to scrape data from a list of URLs, without any coding at all. Data is valuable, and it’s not always easy to get the correct data from web sources […]
Organize and edit your test data with Quilt
We recently stumbled upon Quilt, a data package manager that wants to fill the role of ‘GitHub for data’. We have enjoyed using it so far, and we think testers can use it to manage their test data. In this post, we will take an example dataset and show you a step-by-step guide on how to […]
Quilt – a Data Package Manager
We have been testing data-rich applications for a long time. And like any experienced tester, we realize how difficult it is to create, maintain and update data every time the data model changes. So we were excited to come across Quilt, a data package manager, via Hacker News. We were thrilled that it integrated well with our favorite programming language […]
Getting started with Jupyter Notebooks
This post is for those who want to get started with Jupyter Notebooks. You don’t need to know anything beyond Python to start using Jupyter Notebook. In this post I will cover the following topics: what a Jupyter Notebook and the Notebook app are; how to install and use Jupyter Notebook; the Jupyter Notebook user interface; sharing Jupyter Notebooks via GitHub Gist and […]
Extracting data from PDFs using Python
When testing highly data-dependent products, I find it very useful to use data published by governments. When government organizations publish data online, barring a few notable exceptions, they usually release it as a series of PDFs. The PDF file format was not designed to hold structured data, which makes extracting data from PDFs difficult. In this post, I will […]
Mockaroo Tutorial: Generate realistic test data
At a recent client engagement, my colleague had to create some location data for testing one of the client’s applications. Creating the data involved multiple steps: downloading some data from the net, removing some columns, and transforming it as needed using Python code. The process was a bit time-consuming. Recently when I was searching for some tools for data […]
Scraping a Wikipedia table using Python
A colleague of mine tests a product that helps big brands target and engage Hispanic customers in the US. As you can imagine, they use a lot of survey data as well as openly available data to build the analytics in their product. We do test the accuracy of the underlying algorithms. But the algorithms used are only going to […]
Cleaning data with Python
I am sharing some tips and tricks on cleaning and restructuring the data you use for testing. Why this post? Qxf2 works with many data-intensive applications. I’ve noticed a pattern – my colleagues hit very similar data-related problems across many different projects. That got me thinking critically about test data. I was thrilled to stumble upon […]
MySQL and Liquibase
This post introduces you to Liquibase – a database changeset management tool. I will cover its installation, usage and execution with MySQL. What is Liquibase? Liquibase is an open-source library for tracking database changes. Liquibase supports XML, JSON, and YAML files. This post uses XML files to make certain concepts clear. When Liquibase runs, there are several commands it […]
MySQL Database Modeling
This post is part of our series on MySQL. We hope to give the average tester a slightly more technical view into how MySQL works. This blog covers the basics of designing a database and gives you an overview of normalization. What is Database Modeling? Database modeling is a process in which you analyze the application requirements, usually retrieved from […]