{"id":10581,"date":"2019-02-25T05:30:56","date_gmt":"2019-02-25T10:30:56","guid":{"rendered":"https:\/\/qxf2.com\/blog\/?p=10581"},"modified":"2019-02-25T05:30:56","modified_gmt":"2019-02-25T10:30:56","slug":"automation-testing-text-to-speech-web-app","status":"publish","type":"post","link":"https:\/\/qxf2.com\/blog\/automation-testing-text-to-speech-web-app\/","title":{"rendered":"Automation testing of Text to Speech web app"},"content":{"rendered":"<p>As part of the <a href=\"https:\/\/www.qxf2.com\/?utm_source=voice_app&amp;utm_medium=click&amp;utm_campaign=From%20blog\">Qxf2Services<\/a> Hackathon, I picked up a project to automate the testing of a readily available Text to Speech web app. To follow along, I assume you have some familiarity with Python and Selenium.<\/p>\n<hr \/>\n<h3>Overview of the Text to Speech Demo app<\/h3>\n<p>To try out the testing of Text to Speech, I was looking for a readily available web app that could help me achieve my goal. After some googling, I found a hosted <a href=\"https:\/\/text-to-speech-demo.ng.bluemix.net\/\" target=\"_blank\" rel=\"noopener\">Text to Speech Demo web app<\/a>. <span>This Text to Speech service understands text and natural language and generates synthesized audio output complete with appropriate cadence and intonation.<\/span><\/p>\n<h4>Working of the Text to Speech Demo app<\/h4>\n<p>To use the Text to Speech Demo app, a user needs to:<\/p>\n<ol>\n<li>Select the voice language of their choice from the dropdown<\/li>\n<li>Input the text (<strong>Note:<\/strong> <span>the text language must match the selected voice language)<\/span><\/li>\n<li>Click on the Speak button to hear the speech, or click on the Download button to get an MP3 audio file of the speech for the text entered in step 2<\/li>\n<\/ol>\n<h3>Our Test Scenario<\/h3>\n<ol>\n<li>Open the Demo app<\/li>\n<li>Select a voice &#8211;\u00a0by default we\u00a0
are going to use\u00a0<span>American English (en-US): Allison (female, expressive, transformable)<\/span><\/li>\n<li>Input the text &#8211;\u00a0<span>our input text will be <strong><em>Thank You<\/em><\/strong><\/span><\/li>\n<li>Click on the Download button<\/li>\n<li>Convert the downloaded MP3 audio file to a .wav file using pydub<\/li>\n<li>Detect the text from the .wav file<\/li>\n<li>Verify that the input text given in step 3 matches the detected text from step 6<\/li>\n<\/ol>\n<hr \/>\n<h3>Automating Our Test Scenario<\/h3>\n<p>Create a file named test_voice_demo_app.py with the following content:<\/p>\n<pre lang=\"python\">\"\"\"\r\nThis is an automation test for the Text to Speech Demo app\r\n\"\"\"\r\nimport os\r\nimport unittest\r\nimport time\r\nfrom selenium import webdriver\r\n\r\nclass VoiceWebAppTest(unittest.TestCase):\r\n    \"Class to run tests against the voice web app\"\r\n    def setUp(self):\r\n        \"Setup for the test\"\r\n        chrome_options = webdriver.ChromeOptions()\r\n        prefs = {'download.default_directory': 'path to your preferred download directory'}\r\n        chrome_options.add_experimental_option('prefs', prefs)\r\n        self.driver = webdriver.Chrome(chrome_options=chrome_options)\r\n        self.driver.maximize_window()\r\n\r\n    def test_voice_web_app(self):\r\n        \"Test the voice web app text to speech\"\r\n        url = 'https:\/\/text-to-speech-demo.ng.bluemix.net\/'\r\n        print('Opening %s' % url)\r\n\r\n        #Open the Voice Demo App\r\n        self.driver.get(url)\r\n        time.sleep(5)\r\n        #Scroll the page to bring the controls into view\r\n        self.driver.execute_script(\"window.scrollBy(0, -150);\")\r\n        time.sleep(5)\r\n        #Select the voice from the dropdown\r\n        self.driver.find_element_by_xpath(\"\/\/select[@name='voice']\").click()\r\n        time.sleep(2)\r\n        self.driver.find_element_by_xpath(\"\/\/select[@name='voice']\/option[@value='en-US_AllisonVoice']\").click()\r\n\r\n    
    #Set the input text\r\n        keyword_input = 'Thank You'\r\n        print('Input text is: %s' % keyword_input)\r\n        input_text_area = self.driver.find_element_by_xpath(\"\/\/div[@data-id='Text']\/textarea[@class='base--textarea textarea']\")\r\n        input_text_area.clear()\r\n        input_text_area.send_keys(keyword_input)\r\n\r\n        #Download the speech audio as an MP3 file\r\n        self.driver.find_element_by_xpath(\"\/\/button[text()='Download']\").click()\r\n\r\n    def tearDown(self):\r\n        \"Tear down the test\"\r\n        self.driver.quit()\r\n\r\n#---START OF SCRIPT\r\nif __name__ == '__main__':\r\n    suite = unittest.TestLoader().loadTestsFromTestCase(VoiceWebAppTest)\r\n    unittest.TextTestRunner(verbosity=2).run(suite)\r\n<\/pre>\n<p>By this point, we have automated our test to download transcript.mp3 to our preferred location.<\/p>\n<h3>Converting the audio file format<\/h3>\n<p>We now have an audio file of the input text, but it is in MP3 format, and detecting text directly from an MP3 is not possible here. So we need to convert it to a .wav file, which is the preferred approach I found after some googling around.<\/p>\n<p>To convert an .mp3 file to a .wav file, we will make use of pydub, an open-source Python package that can convert .mp3 files to various other audio file formats.<br \/>\nTo install this package, run pip install pydub<\/p>\n<p>pydub internally uses <a href=\"https:\/\/www.ffmpeg.org\/\">FFmpeg<\/a>, the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. You can download a build from the <a href=\"https:\/\/ffmpeg.zeranoe.com\/builds\/\" target=\"_blank\" rel=\"noopener\">FFmpeg builds page<\/a> depending upon your system and add it to the PATH.<\/p>\n<pre lang=\"python\">from pydub import AudioSegment\r\n\r\nsound = 
AudioSegment.from_mp3(\"transcript.mp3\") #Downloaded transcript.mp3 file which we need to convert\r\nsound.export(\"eng.wav\", format=\"wav\") #Output eng.wav file to detect text from\r\n<\/pre>\n<h3>Recognizing text from the audio file<\/h3>\n<p>Next, we need to detect the text from the speech in the audio file. This can be done by making use of the SpeechRecognition Python library.<\/p>\n<p>To install the package, run pip install SpeechRecognition<\/p>\n<p>This library has support for various <span>speech recognition engines, which can be found in the <a href=\"https:\/\/pypi.org\/project\/SpeechRecognition\/\" target=\"_blank\" rel=\"noopener\">Speech Recognition documentation<\/a>, but for my project I arbitrarily used the Google Speech Recognition support<\/span><\/p>\n<pre lang=\"python\">import speech_recognition as sr\r\n\r\naudio = 'eng.wav' #name of the file\r\nr = sr.Recognizer()\r\nwith sr.AudioFile(audio) as source:\r\n    audio = r.record(source)\r\n    try:\r\n        recognized_text = r.recognize_google(audio, language='en-US')\r\n        print('Decoded text from audio is {}'.format(recognized_text))\r\n    except sr.UnknownValueError:\r\n        print('Sorry, could not recognize the speech')\r\n<\/pre>\n<h3>Asserting the input text against the recognized text<\/h3>\n<pre lang=\"python\">assert keyword_input.lower() == recognized_text.lower(), \"Detected speech text doesn't match the input text\"\r\n<\/pre>\n<p>Combining all the above pieces, our final test_voice_demo_app.py script looks like:<\/p>\n<pre lang=\"python\">import os\r\nimport unittest\r\nimport time\r\nfrom selenium import webdriver\r\nfrom pydub import AudioSegment\r\nimport speech_recognition as sr\r\n\r\nclass VoiceWebAppTest(unittest.TestCase):\r\n    \"Class to run tests against the voice web app\"\r\n    def setUp(self):\r\n        \"Setup for the test\"\r\n        chrome_options = webdriver.ChromeOptions()\r\n        prefs = {'download.default_directory': 
'E:\/workspace-qxf2\/hackathon\/Voicewebapptest'}\r\n        chrome_options.add_experimental_option('prefs', prefs)\r\n        self.driver = webdriver.Chrome(chrome_options=chrome_options)\r\n        self.driver.maximize_window()\r\n\r\n    def test_voice_web_app(self):\r\n        \"Test the voice web app text to speech\"\r\n        url = 'https:\/\/text-to-speech-demo.ng.bluemix.net\/'\r\n        print('Opening %s' % url)\r\n\r\n        #Open the Voice Demo App\r\n        self.driver.get(url)\r\n        time.sleep(5)\r\n        #Scroll the page to bring the controls into view\r\n        self.driver.execute_script(\"window.scrollBy(0, -150);\")\r\n        time.sleep(5)\r\n        #Select the voice from the dropdown\r\n        self.driver.find_element_by_xpath(\"\/\/select[@name='voice']\").click()\r\n        time.sleep(2)\r\n        self.driver.find_element_by_xpath(\"\/\/select[@name='voice']\/option[@value='en-US_AllisonVoice']\").click()\r\n\r\n        #Set the input text\r\n        keyword_input = 'Thank You'\r\n        print('Input text is: %s' % keyword_input)\r\n        input_text_area = self.driver.find_element_by_xpath(\"\/\/div[@data-id='Text']\/textarea[@class='base--textarea textarea']\")\r\n        input_text_area.clear()\r\n        input_text_area.send_keys(keyword_input)\r\n\r\n        #Download the speech audio as an MP3 file\r\n        self.driver.find_element_by_xpath(\"\/\/button[text()='Download']\").click()\r\n        time.sleep(5)\r\n\r\n        #Convert the MP3 file to a .wav file\r\n        sound = AudioSegment.from_mp3(\"transcript.mp3\")\r\n        sound.export(\"eng.wav\", format=\"wav\")\r\n\r\n        #Recognize the text from the .wav audio file\r\n        audio = 'eng.wav' #name of the file\r\n        r = sr.Recognizer()\r\n        with sr.AudioFile(audio) as source:\r\n            audio = r.record(source)\r\n            try:\r\n                recognized_text = r.recognize_google(audio, language='en-US')\r\n                print('Decoded text from audio is {}'.format(recognized_text))\r\n            except sr.UnknownValueError:\r\n                self.fail('Sorry, could not recognize the speech')\r\n\r\n        assert keyword_input.lower() == recognized_text.lower(), \"Detected speech text doesn't match the input text\"\r\n\r\n    def tearDown(self):\r\n        \"Tear down the test\"\r\n        self.driver.quit()\r\n        os.remove('transcript.mp3')\r\n        os.remove('eng.wav')\r\n\r\n#---START OF SCRIPT\r\nif __name__ == '__main__':\r\n    suite = unittest.TestLoader().loadTestsFromTestCase(VoiceWebAppTest)\r\n    unittest.TextTestRunner(verbosity=2).run(suite)\r\n<\/pre>\n<h3>How to run<\/h3>\n<p>To run the script, use the command &#8211;<\/p>\n<pre lang=\"python\">python test_voice_demo_app.py<\/pre>\n<h3>Output<\/h3>\n<p><a href=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2019\/02\/output.png\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-10776\" src=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2019\/02\/output.png\" alt=\"\" width=\"615\" height=\"214\" srcset=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2019\/02\/output.png 773w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2019\/02\/output-300x104.png 300w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2019\/02\/output-768x267.png 768w\" sizes=\"auto, (max-width: 615px) 100vw, 615px\" \/><\/a><\/p>\n<h3>What next?<\/h3>\n<p>You can extend this test script to cover the other input voice languages available in the Text to Speech Demo app by making use of one of the various Python language translator packages.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As part of the Qxf2Services Hackathon, I picked up a project to automate the testing of a readily available Text to Speech web app.\u00a0
To follow along, I assume you have some familiarity with Python, Selenium. Overview of\u00a0 Text to Speech Demo app To try out the testing of Text to Speech,\u00a0I was looking for a readily available web app which can [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38,30],"tags":[],"class_list":["post-10581","post","type-post","status-publish","format-standard","hentry","category-automation","category-selenium"],"_links":{"self":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/10581","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/comments?post=10581"}],"version-history":[{"count":19,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/10581\/revisions"}],"predecessor-version":[{"id":10870,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/10581\/revisions\/10870"}],"wp:attachment":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/media?parent=10581"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/categories?post=10581"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/tags?post=10581"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}