Dynamically import Quilt packages

This post is primarily aimed at Quilt users. Read only section 2 if you are looking for examples of dynamically importing Python modules. In this post, we will show you how to use the same code to interact with two similarly structured but different Quilt packages. This is particularly useful when you use one Quilt package in your development environment and another (similarly structured) Quilt package in your production environment.

 

1. The problem

Qxf2 is developing a simple survey application for internal use only. We decided (reasons in section 5 below) to use Quilt to store our data instead of SQLite. We need one Quilt data package for our development and test environments and another for production. The two packages would be similar in structure (same variables, dataframe columns, etc.) but have different data in them. With databases, you would store the database configuration in a settings file and read that into your code. However, a similar approach with Quilt poses an additional problem: Quilt scripts have an import statement that uses the name of the package explicitly.

 

2. Dynamic imports with Python

To solve the above problem of an import statement referencing a package name, we need to be able to dynamically import the module. This will help us use the same application code in both the production and development environments. There are two popular ways to do dynamic imports in Python: the built-in __import__ function and the importlib module. We decided to go with the importlib module since we found its syntax intuitive. For example,

a. import foo is simply foo = importlib.import_module('foo')
b. import foo as blah is simply blah = importlib.import_module('foo')
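c. from foo.bar import blah is blah = importlib.import_module('foo.bar.blah'), provided blah is itself a module (a submodule of foo.bar) and not a function or class. You can bind the result to any name you like; our scripts below bind it to app. This particular statement is what we will use to dynamically import the Quilt package.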
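The three equivalences above can be tried out with modules from the standard library. This is only a minimal sketch: json and json.decoder stand in for foo and foo.bar.blah, and the last line shows the getattr step that is needed when the imported name is a function and not a module.

```python
import importlib

# a. import json
json_module = importlib.import_module("json")

# b. import json as j
j = importlib.import_module("json")

# c. from json import decoder  (decoder is a submodule of json)
decoder = importlib.import_module("json.decoder")

# from json import dumps  (dumps is a function, so import the module
# first and then look the attribute up on it)
dumps = getattr(importlib.import_module("json"), "dumps")

if __name__ == "__main__":
    print(json_module is j)
    print(decoder.__name__)
    print(dumps({"Name": "Katerina"}))
```

Both a. and b. return the same module object; only the name you bind it to differs.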

 

3. The Quilt script without dynamic imports

In this example, we will write a Python script to create a new package (called jose), store a simple table and then push the package to Quilt. Without dynamic imports, the script would look something like the code below.

"""
This file builds a Quilt package, inserts a 2-row table and pushes it to the Quilt repository
We are calling the new package 'jose' after my favourite chess player - Jose Raul Capablanca
"""
 
import quilt
 
def create_package():
    "Create a new quilt package"
    quilt.build("qxf2/jose")
    from quilt.data.qxf2 import jose #This line is the limitation we are solving!
    import pandas as pd
 
    #Let us store a table
    """
    Name, ID
    Katerina, 001
    Kady, 002
    """
    columns = ['Name','ID']
    employees = [["Katerina","001"],['Kady','002']]
    employees_df = pd.DataFrame(data=employees, columns=columns)
    jose._set(['employees'],employees_df)
    quilt.build("qxf2/jose", jose)
 
    #For non-teams, run 'quilt login' in your terminal before executing this script
    # quilt.login(team="qxf2") 
    quilt.push("qxf2/jose", is_public=True)
 
 
#----START OF SCRIPT
if __name__=="__main__":
    create_package()

You will notice the problem is the line from quilt.data.qxf2 import jose which hard codes both ‘qxf2’ (the repo name) and ‘jose’ (package name). If I wanted to use the package ‘jose’ in my development environment and the package ‘capa’ in production, I would need two different scripts.

 

4. The Quilt script with dynamic imports

To make the above code handle multiple environments (i.e., different packages), we store the package name in a settings file whose value differs between the development and production environments. In our example, the file (let’s say we call it quilt_settings.py) would have only one line: PACKAGE_NAME="capa". Then, change the import statement to look like:

    app = importlib.import_module('{}.{}'.format(QUILT_DATA_PATH,package_name)) 
    #Now you can use app just like you would use the module jose in the previous example
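The same repo and package name end up in two different string shapes: the "repo/package" form passed to quilt.build and quilt.push, and the dotted module path passed to importlib.import_module. A small sketch of two helpers that build both from one setting is given below. The helper names quilt_handle and quilt_module_path are hypothetical and not part of Quilt, and the identifier check is our own assumption (it needs Python 3) to catch a typo in the settings file early.

```python
QUILT_REPO = "qxf2"  # repo name used throughout this post


def quilt_handle(repo, package_name):
    """Return the 'repo/package' string used with quilt.build and quilt.push."""
    return "{}/{}".format(repo, package_name)


def quilt_module_path(repo, package_name):
    """Return the dotted path to hand to importlib.import_module."""
    # Assumption: a package name that is not a valid Python identifier
    # is treated as a typo in the settings file.
    if not package_name.isidentifier():
        raise ValueError("Not a valid package name: {!r}".format(package_name))
    return "quilt.data.{}.{}".format(repo, package_name)


if __name__ == "__main__":
    print(quilt_handle(QUILT_REPO, "capa"))
    print(quilt_module_path(QUILT_REPO, "capa"))
```

Neither helper imports quilt, so they can be checked on a machine that does not have Quilt installed.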

Putting it all together, the code would look like this:

"""
This file builds a Quilt package, inserts a 2-row table and pushes it to the Quilt repository
We are calling the new package 'capa' after my favourite chess player - Capablanca
"""
 
import importlib
import quilt
import quilt_settings as conf
 
QUILT_REPO= "qxf2"
 
def create_package(package_name):
    "Create a new quilt package"
    quilt.build("{}/{}".format(QUILT_REPO,package_name))
    QUILT_DATA_PATH = "quilt.data.{}".format(QUILT_REPO)
    app = importlib.import_module('{}.{}'.format(QUILT_DATA_PATH,package_name))
    import pandas as pd
 
    #Let us store a table
    """
    Name, ID
    Katerina, 001
    Kady, 002
    """
    columns = ['Name','ID']
    employees = [["Katerina","001"],['Kady','002']]
    employees_df = pd.DataFrame(data=employees, columns=columns)
    app._set(['employees'],employees_df)
    quilt.build("{}/{}".format(QUILT_REPO,package_name),app)
 
    #For non-teams, run 'quilt login' in your terminal before executing this script
    # quilt.login(team="qxf2") 
    quilt.push("{}/{}".format(QUILT_REPO,package_name), is_public=True)
 
#----START OF SCRIPT
if __name__=="__main__":
    #Read the package name from your quilt settings file
    create_package(conf.PACKAGE_NAME)

That’s it! Now you have the exact same code working in both production and development environments while still using two different (but similarly structured) data packages in each environment.
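If you would rather not maintain two copies of quilt_settings.py, one option is to let the settings file pick the package name from an environment variable. The sketch below is one possible way to do that and not how our application is configured: the variable name QXF2_SURVEY_ENV and the mapping are hypothetical, with 'jose' for development and 'capa' for production as in the examples above.

```python
import os

# Hypothetical mapping from environment name to Quilt package name
PACKAGES_BY_ENV = {"development": "jose", "production": "capa"}


def get_package_name(environ=None):
    """Return the package name for the current environment."""
    environ = os.environ if environ is None else environ
    # Assumption: fall back to development when the variable is not set
    env_name = environ.get("QXF2_SURVEY_ENV", "development")
    if env_name not in PACKAGES_BY_ENV:
        raise ValueError("Unknown environment: {!r}".format(env_name))
    return PACKAGES_BY_ENV[env_name]


# The rest of the code keeps reading conf.PACKAGE_NAME as before
PACKAGE_NAME = get_package_name()
```

With this in place, the script in section 4 does not change; only the value of PACKAGE_NAME differs from one machine to another.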

 

5. Why we are using Quilt instead of a database

Because we really like using Python. I know, I know: “if all you have is a hammer, everything looks like a nail”. But in this case, we feel good about simply using Python to manage our data instead of designing tables, designing an ORM, managing configurations and deploys, thinking about different platforms and then adding extra steps to our setup documents, etc. It helps that the application we are building will be small, will not have many users, and has data that is ideally suited to being stored in dataframes.

 

Disclaimer: We are using Quilt as a database. We aren’t sure that the creators of Quilt built Quilt for this use case. This is simply Qxf2 creatively using the data package manager to write small web applications.

 
