Urllib’s urlencode: The weird case of %22 and %27

When I was doing some API testing using Python Mechanize, I struck an internal server error. I tried to find the root cause of the error and I realized that Python’s urllib is doing something weird when encoding strings. It is replacing double quotes with single quotes. I spent more than hour to debug this issue before I realized what was wrong. So I am writing this post to help testers to find the solution for this HTTP error and not spend much time on this.

The problem

Let us see an example of the problem:

 Import urllib
 a = {"name":"tester","company":{"id":"1","work":"QA"}}
 data = urllib.urlencode(a)
 print data

Expected output:company=%7B%22work%22%3A+%22QA%22%2C+%22id%22%3A+%221%22%7D&name=tester
Actual output: company=%7B%27work%27%3A+%27QA%27%2C+%27id%27%3A+%271%27%7D&name=tester

Here you can see that , %7B%27 etc are urlencoded parameters. When I check in encoding reference, encoded value of single quote is %27 and double quote is %22.So urllib seems to have replaced the %22 with %27.

Due to this, the server can’t recognize what I am sending in request and it throws an internal server error. In the interest of time, I simply replaced %27 with %22 in the encoded string.

Import urllib
a = {"name":"tester","company":{"id":"1","work":"QA"}}
data = urllib.urlencode(a)
d1 = data.replace(‘%27,’%22)
print d1


Now you can see that %27 is replaced by %22.

This is the solution I used to get over the internal server error. But it seems like an ugly hack. If any one of you find a better solution for this problem please let me know.

If you liked this article, learn more about Qxf2’s testing services for startups.

2 thoughts on “Urllib’s urlencode: The weird case of %22 and %27

  1. Came across this same issue myself and spent a few hours frustrated finding/fixing it.

    From some quick tests it appears urlencode calls str() on any param keys and then encodes them. It doesn’t check if these values are structures. When you call str() on a python dict single quotes are always used Which results in %27 being using instead of %22

    After trawling stack overflow it appears the best solution is to iterate through the param dict and dump each value as json then call urlencode on the result.

    This can be written as a one liner:
    from urllib.parse import urlencode
    urlencode({k: json.dumps(v) for k, v in params.items()})

Leave a Reply

Your email address will not be published.