There are times when we repurpose our automated checks to scrape information off the pages it visits. This can happen when you are using automation to explore a product, looking for patterns in client data, analyzing production data to spot patterns, etc.
Storing the scraped data can be tricky. Approaches to storing the scraped data range between two extremes:
a) [lightweight] writing out log files
b) [heavyweight] using a full-fledged relational database
The data that you end up scraping usually has some structure. E.g.: a table. Log files have the drawback of being purely string based and unable to capture the structure within the data. Full-fledged relational databases have the drawback of needing setup, design and maintenance. Using SQLite to store your data, lies somewhere between the two extreme options. SQLite is a flat file database that needs no extra setup. But it still affords us the convenience of querying the data with SQL commands. In this post, we show you an example of scraping data from an HTML table and storing it in a SQLite table.
NOTE: We want to emphasize that using SQLite is just one option. Whether it suits you or not depends on your specific situation.
What is SQLite?
SQLite is a server-less, lightweight, disk-based relational database management system that does not require any installation. The absence of any intermediate server allows it to be embedded in any application or, in our case, a test. The different data type values stored in the database can be accessed through a variant of the SQL query language. Click here to know more about SQL data types
Setup
Getting setup with SQLite is easy. Python versions 2.5 and above ship a module called sqlite3 by default. So you need to do nothing extra. To test if you have the sqlite3 module, try this statement in a Python interpreter import sqlite3
. If you get no import errors, you are all set.
Working with SQLite
Any module that lets you connect to a database exposes a few standard operations like creating and connecting to a database, adding a table, inserting rows, updating rows, fetching/selecting data, closing a connection, etc. In the remainder of this post, I’ll show you how to perform these operations with sqlite3.
To use the sqlite3 module a connection object has to be created. You can do that like this:
connection_obj = sqlite3.connect(db_name) |
db_name can be any name you like. In case there is no database, one is created.
Once we have the connection object, a cursor object has to be created. The cursor object is the one that executes the SQL commands.
cursor_obj = connection_obj.cursor() |
Now that we have a cursor object, we can begin executing SQL commands. Let’s try creating a table in a SQLite database, insert values from the Tutorial Page and later fetch a value from the table.
Create a table:
import sqlite3 def create_table(db_name,table_name,table_header): "Create a table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("CREATE TABLE %s%s"%(table_name,tuple(table_header))) connection_obj.commit() connection_obj.close() print "-Created Table %s"%table_name except Exception,e: print "Python says:",str(e) |
The cursor_obj.execute("CREATE TABLE %s%s"%(table_name,tuple(table_header)))
command creates a new table. The connection_obj.commit()
and connection_obj.close()
saves and closes the database.
Insert values into the table:
def insert_values(db_name,table_name,values): "Insert values into the table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("INSERT INTO %s VALUES (?,?,?,?)"%table_name,tuple(values)) connection_obj.commit() connection_obj.close() except Exception,e: print "Python says:",str(e) |
The cursor_obj.execute("INSERT INTO %s VALUES (?,?,?,?)"%table_name,tuple(values))
is used to insert values into the table. A placeholder(?)
is used in the execute statements.Use ?
as a placeholder wherever you want to use a value, and then provide a tuple of values.It is preffered over the usual python string formatters. It helps you avoid SQL injection attacks. Once the table is created and the values are inserted into the table the connection_obj.commit()
command is used to save the changes and connection_obj.close()
is used to close the connection.
Fetch value form the table:
def fetch_value(db_name,table_name,required_key,key,value): "Fetch a value from the table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("SELECT %s FROM %s WHERE %s=? "%(required_key,table_name,key),(value,)) #Note: The string formatting is done differently(<code>(value,)</code>) to prevent SQL injection attack. name = cursor_obj.fetchone()[0] #To get the first element in the resulting list print "-The %s from the Table:'%s' where %s='%s' is: %s"%(required_key,table_name,key,value,name) except Exception,e: print "Python says:",str(e) |
The execute statement cursor_obj.execute("SELECT %s FROM %s WHERE %s=? "%(required_key,table_name,key),(value,))
searches through the table to find a value corresponding to the condition. The fetchone()
command is used to fetch a single value from the table whereas the fetchall()
can be used to fetch all the values from the table.
Putting it all together:
""" Qxf2 Services: Utility script to create a SQLite table NOTE: This was written up for a blog post/tutorial We use a cleaned up and more robust version at clients """ from selenium import webdriver import sqlite3 def test_create_table(): "Create a table, insertable and fetch values from the table" driver = webdriver.Firefox() driver.get("https://qxf2.com/selenium-tutorial-main") driver.maximize_window() db_name = "example.db" table_name = "test_table" table_header = ['Name','Email','Phone','Gender'] #Create a table create_table(db_name=db_name,table_name=table_name,table_header=table_header) for i in range(1,4): row = driver.find_elements_by_xpath("//tbody/descendant::tr[%d]/td"%i) table_values = [] for data in row: value = data.text #To get the text property of the element table_values.append(value.encode('utf-8')) #Insert values into the table insert_values(db_name=db_name,table_name=table_name,values=table_values) #Fetch value from the table fetch_value(db_name=db_name,table_name=table_name,required_key='Name',key='Email',value='[email protected]') driver.quit() def create_table(db_name,table_name,table_header): "Create a table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("CREATE TABLE %s%s"%(table_name,tuple(table_header))) connection_obj.commit() connection_obj.close() print "-Created Table %s"%table_name except Exception,e: print "Python says:",str(e) def insert_values(db_name,table_name,values): "Insert values into the table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("INSERT INTO %s VALUES (?,?,?,?)"%table_name,tuple(values)) connection_obj.commit() connection_obj.close() except Exception,e: print "Python says:",str(e) def fetch_value(db_name,table_name,required_key,key,value): "Fetch a value from the table" try: connection_obj = sqlite3.connect(db_name) cursor_obj = connection_obj.cursor() cursor_obj.execute("SELECT %s FROM %s WHERE %s=? "%(required_key,table_name,key),(value,)) name = cursor_obj.fetchone()[0] print "-The %s from the Table:'%s' where %s='%s' is:\n %s"%(required_key,table_name,key,value,name) except Exception,e: print "Python says:",str(e) if __name__ == '__main__': test_create_table() |
NOTE: To make this a class and reduce the repeated code, you can take a look at the example we have in our MongoDB tutorial
When you run this file your output would look like this,
There you have it, a getting started guide to the sqlite3 module in Python and a detailed example on how to implement it with tests.
If you liked this article, learn more about Qxf2’s testing services for startups.
References:
1. SQLite3- Official documentation : SQLite3
2. TutorialsPoint : SQLite Python Tutorial
3. Python Central: introduction-to-sqlite-in-python
My expertise lies in engineering high-quality software. I began my career as a manual tester at Cognizant Technology Solutions, where I worked on a healthcare project. However, due to personal reasons, I eventually left CTS and tried my hand at freelancing as a trainer. During this time, I mentored aspiring engineers on employability skills. As a hobby, I enjoyed exploring various applications and always sought out testing jobs that offered a good balance of exploratory, scripted, and automated testing.
In 2015, I joined Qxf2 and was introduced to Python, my first programming language. Over the years, I have also had the opportunity to learn other languages like JavaScript, Shell scripting (if it can be called a language at all), and more recently, Rust. Despite this exposure, Python remains my favorite language due to its simplicity and the extensive support it offers for libraries.
In my free time, I like to watch football (I support Arsenal Football Club), play football myself, and read books.