Testers, practice testing a machine learning algorithm!

Testers, you now have an app you can use to practice testing a machine learning algorithm. We have written and hosted a simple sentence classifier for this purpose. Play with it and let us know what fun bugs you discover. Hint: Testing this app is fun because there are several bugs that are not even edge cases!

App link: https://practice-testing-ai-ml.qxf2.com/is-pto

Pretend that you are telling your boss that you want to take some time off. Plug those sentences into the app. The app will classify each sentence as a ‘leave message’ (aka ‘PTO message’) or not.

PTO message? What is a PTO??


In corporate India, PTO stands for Personal Time Off. It covers a range of reasons, including personal reasons, health reasons, or just going on vacation. Depending on which part of the world you are from, PTO to you could mean you are taking leave, asking for paid time off, taking vacation, using your annual leave, taking casual leave, taking PTO, calling in sick, etc.
Some examples you can try are:

  1. Down with fever and cold. Taking the day off
  2. I am traveling to hometown on some unplanned personal work… will not be able to work today
  3. Cancelling my leave today and working today
  4. Am going on vacation in the last week of December
  5. I will not be working next week

About this app

Post a sentence and the app will tell you if you are applying for leave or not. This is a sentence classifier. I adapted the code in this YouTube tutorial by Johannes Frey to make this application. I trained it on around 3200 messages posted on Qxf2’s ‘leave’ Skype channel. There is definitely a bias in the training dataset: we live in India, and our English is different from, say, that of someone in Australia. Similarly, certain words seem to throw the classifier off, and it is fun to discover these words and just add them randomly to your messages.

Don’t know where to start testing?

Begin with the example sentences listed above or some sentences you naturally use to tell your colleagues you will not be working on certain days. Then, try to vary the sentences slightly (e.g.: use leave instead of PTO). Try adding typos. You could try double negatives (e.g.: I am not canceling my leave). Try adding some words (e.g.: add the word ‘Reminder’ to your message) or shorten your sentences. Add sentences and context to your leave message. Use contractions (e.g.: don’t instead of ‘do not’ or I’m instead of I am). The possibilities are endless.

Within Qxf2, we have found a ton of such bugs (e.g.: ‘I am out today’ is classified as NOT A PTO MESSAGE while the word ‘December’ is classified as a PTO MESSAGE). Lol!
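
If you would rather generate variations than type them by hand, the ideas above (swapping leave and PTO, adding typos, prefixing ‘Reminder’, using contractions) can be sketched in a few lines of Python. This is only a minimal sketch: the helper names and the small word lists are our own assumptions for illustration and are not part of the is-pto app. You would still paste the generated sentences into the app yourself.

```python
import re

# Hypothetical word lists for illustration; extend them with your own ideas.
CONTRACTIONS = {"I am": "I'm", "do not": "don't", "will not": "won't"}
SYNONYMS = {"PTO": "leave", "leave": "PTO"}


def swap_adjacent_chars(sentence, index):
    """Introduce a typo by swapping the characters at index and index + 1."""
    if index < 0 or index + 1 >= len(sentence):
        return sentence
    chars = list(sentence)
    chars[index], chars[index + 1] = chars[index + 1], chars[index]
    return "".join(chars)


def replace_phrases(sentence, mapping):
    """Return one variant per whole-word phrase from mapping found in sentence."""
    variants = []
    for old, new in mapping.items():
        pattern = r"\b" + re.escape(old) + r"\b"
        if re.search(pattern, sentence):
            variants.append(re.sub(pattern, new, sentence))
    return variants


def generate_variants(sentence):
    """Build a list of slightly modified sentences to try in the app."""
    candidates = []
    candidates.extend(replace_phrases(sentence, CONTRACTIONS))
    candidates.extend(replace_phrases(sentence, SYNONYMS))
    candidates.append("Reminder: " + sentence)  # add an extra word
    candidates.append(swap_adjacent_chars(sentence, len(sentence) // 2))  # typo
    candidates.append(sentence.lower())  # change the casing

    # Keep the order, but drop duplicates and anything identical to the input.
    unique = []
    for candidate in candidates:
        if candidate != sentence and candidate not in unique:
            unique.append(candidate)
    return unique


if __name__ == "__main__":
    for variant in generate_variants("I am going on leave tomorrow"):
        print(variant)
```

Each variant changes only one thing at a time, which makes it easier to tell which change flipped the classification.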

If that was too easy …

Well, if the previous attempts were too easy for you, start thinking about what to do next. For example, how would you report these errors to a developer? Or for that matter, if the developer fixes some of your bugs, how would you make sure that there are no regressions? What is an acceptable set of test messages to try out? Can you come up with (or use existing) mathematical indicators to show that modifications to the algorithm are indeed giving better results? What parts of the testing would you automate and how? How would you inspect the model itself and get a sense of what its strengths and weaknesses are? Hopefully, these kinds of questions will serve as a gateway to you learning how to test machine learning algorithms better.
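
To make a couple of these questions concrete, here is one possible starting point. Precision, recall and F1 are existing, widely used indicators for a yes/no classifier like this one, and a fixed set of labelled messages lets you spot regressions between two versions of the model. The sketch below assumes you have already collected the expected label and the app's answer for each message by hand; the function names are hypothetical and this is not how the is-pto app is tested internally.

```python
def confusion_counts(expected, predicted):
    """Count (TP, FP, FN, TN), where True means 'PTO message'."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    tp = fp = fn = tn = 0
    for want, got in zip(expected, predicted):
        if want and got:
            tp += 1
        elif not want and got:
            fp += 1
        elif want and not got:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(expected, predicted):
    """Return (precision, recall, F1); a ratio with a zero denominator is 0.0."""
    tp, fp, fn, _ = confusion_counts(expected, predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def find_regressions(messages, expected, old_predicted, new_predicted):
    """List messages the old model classified correctly but the new one does not."""
    rows = zip(messages, expected, old_predicted, new_predicted)
    return [msg for msg, want, old, new in rows if old == want and new != want]
```

Running `precision_recall_f1` on the same labelled messages before and after a model update gives you numbers to compare, while `find_regressions` tells you exactly which messages got worse.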

Next steps

We plan on adding more such applications to the site Practice Testing AI ML. We will also keep updating the algorithm in the is-pto app. And finally, we will be writing about how we go about formally testing such algorithms, some of the mathematical indicators we use, the tools we use to generate some of the test sentences, sources for getting natural language, automation strategies, etc. So stay tuned!
