{"id":18627,"date":"2023-06-09T07:27:22","date_gmt":"2023-06-09T11:27:22","guid":{"rendered":"https:\/\/qxf2.com\/blog\/?p=18627"},"modified":"2023-06-09T07:27:22","modified_gmt":"2023-06-09T11:27:22","slug":"testing-openai-whisper-support-for-indian-languages","status":"publish","type":"post","link":"https:\/\/qxf2.com\/blog\/testing-openai-whisper-support-for-indian-languages\/","title":{"rendered":"Testing OpenAI Whisper with Indian Languages"},"content":{"rendered":"<p>In our previous <a href=\"https:\/\/qxf2.com\/blog\/testing-openai-whisper-with-different-accents\/\" rel=\"noopener\" target=\"_blank\">blog<\/a>, we tested OpenAI Whisper on English speech in different accents and observed that it did a great job. We also described how we generated the audio files, our setup, and the test details. In this blog, we test OpenAI Whisper&#8217;s ability to transcribe and translate Indian languages.<\/p>\n<p>At <a href=\"https:\/\/qxf2.com\/contact?utm_source=whisper_ai_indian_languages&#038;utm_medium=click&#038;utm_campaign=From%20blog\" rel=\"noopener\" target=\"_blank\">Qxf2<\/a>, our teammates work from different regions of India, and everyone can read, write, and speak two or three languages. With their help, we decided to test the following Indian languages: Hindi, Kannada, Telugu, Marathi, Tamil, Malayalam, and Bengali. Team members generated audio files in their own regional languages using online text-to-speech tools such as <a href=\"https:\/\/www.narakeet.com\/app\/text-to-audio\/\" rel=\"noopener\" target=\"_blank\">Narakeet<\/a> and <a href=\"https:\/\/freetools.textmagic.com\/text-to-speech\" rel=\"noopener\" target=\"_blank\">textmagic<\/a>. Some teammates also recorded their own and their family members&#8217; voices. 
We generated a transcription and a translation for each audio file and summarized our comments on both below.<\/p>\n<figure id=\"attachment_18693\" aria-describedby=\"caption-attachment-18693\" style=\"width: 1413px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations.png\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations.png\" alt=\"Comments on Generated Transcriptions and Translations\" width=\"1413\" height=\"660\" class=\"size-full wp-image-18693\" srcset=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations.png 1413w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations-300x140.png 300w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations-1024x478.png 1024w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/comments-on-trascribe-and-translations-768x359.png 768w\" sizes=\"auto, (max-width: 1413px) 100vw, 1413px\" \/><\/a><figcaption id=\"caption-attachment-18693\" class=\"wp-caption-text\">Comments on Generated Transcriptions and Translations<\/figcaption><\/figure>\n<p>You can find all the generated audio files, along with their transcriptions and translations, <a href=\"https:\/\/drive.google.com\/drive\/folders\/1MGJTZGeSvWbWBQJA1VIKc2_aj9hzwv0W?usp=sharing\" rel=\"noopener\" target=\"_blank\">here<\/a>. 
You can also try out OpenAI Whisper&#8217;s support for your language by generating audio files and following the steps below to generate transcriptions and translations.<\/p>\n<hr>\n<h4>Generate transcription and translation:<\/h4>\n<p>Please see our previous <a href=\"https:\/\/qxf2.com\/blog\/testing-openai-whisper-with-different-accents\/\" rel=\"noopener\" target=\"_blank\">blog<\/a> to set up OpenAI Whisper locally. Once that is done, run the commands below to generate a transcription and a translation.<\/p>\n<p><strong>1. Command to generate a transcription:<\/strong><\/p>\n<pre lang='python'> whisper --output_format txt --model medium --task transcribe <audio file>  <\/pre>\n<p>Note: During the first run of the above command, OpenAI Whisper downloads the medium model, which is approximately 1.5 GB.<\/p>\n<p>The above command uses the medium model and writes the transcription to a txt file. You can switch to another format (for example, <code>srt<\/code> or <code>vtt<\/code>) by changing the <code>--output_format<\/code> option. Here, we used the medium model to generate the transcription as it is recommended for non-English languages.<br \/>\nSometimes, we noticed that Whisper fails to auto-detect the correct language, as it guesses the language from the first 30 seconds of audio. If you notice such behavior, you can specify the audio language with the language option, for example <strong><code>--language Kannada<\/code><\/strong>.<br \/>\nAdditionally, we noticed that this command took more than 10 minutes to generate the transcription for 30 seconds of audio.<\/p>\n<p><strong>2. Command to generate a translation:<\/strong><\/p>\n<pre lang='python'> whisper --output_format txt --model medium --task translate <audio file> <\/pre>\n<p>The above command uses the medium model to translate the audio file into English. For translation, we need to set the task option explicitly; by default, it is set to transcribe.<\/p>\n<hr>\n<h4>Common Observations:<\/h4>\n<p>1. 
All of us noticed that generating a transcription with the medium model takes a long time.<br \/>\n2. Sometimes the transcription is generated in another language even after the language is explicitly provided in the command. Most of us noticed that the transcription contained some English words and question marks in the middle of sentences, and that it was sometimes incomplete or in a different language. Look at the image below. <a href=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada.png\" data-rel=\"lightbox-image-1\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada.png\" alt=\"OpenAI Whisper Issues\" width=\"1743\" height=\"313\" class=\"size-full wp-image-18641\" srcset=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada.png 1743w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada-300x54.png 300w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada-1024x184.png 1024w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada-768x138.png 768w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2023\/06\/whisper-ai-bug-kannada-1536x276.png 1536w\" sizes=\"auto, (max-width: 1743px) 100vw, 1743px\" \/><\/a><br \/>\n3. For a few audio files, we noticed that the transcription was different on each run. Look at the above image: all 3 runs show different outputs for the same audio file.<br \/>\n4. Generating a translation takes less time than generating a transcription, and in most of our cases the translation was better than the transcription.<\/p>\n<hr>\n<h4>Conclusion:<\/h4>\n<p>We tested OpenAI Whisper with seven Indian languages, using audio files generated by our colleagues. We found that Whisper is not yet fully ready for Indian languages: transcriptions were slow, sometimes incomplete or in the wrong language, and inconsistent across runs, although translations were generally better. 
<\/p>\n<hr>\n<h4>Hire Qxf2!<\/h4>\n<p>Qxf2&#8217;s software testers specialize in testing the often-overlooked technical core of your product. Our approach goes beyond traditional test automation, allowing us to identify critical quality aspects specific to your application and conduct thorough testing. With a preference for small teams and early-stage products, we are passionate about delivering exceptional testing services. To get in touch with us, fill out this <a href=\"https:\/\/qxf2.com\/contact?utm_source=whisper_ai_indian_languages&#038;utm_medium=click&#038;utm_campaign=From%20blog\">simple form<\/a>.<\/p>\n<hr>\n","protected":false},"excerpt":{"rendered":"<p>In our previous blog, we tested OpenAI Whisper on English speech in different accents and observed that it did a great job. We also described how we generated the audio files, our setup, and the test details. In this blog, we test OpenAI Whisper&#8217;s ability to transcribe and translate Indian languages. At Qxf2, our teammates work from different regions of India, and everyone 
[&hellip;]<\/p>\n","protected":false},"author":12,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[270,355,356],"tags":[],"class_list":["post-18627","post","type-post","status-publish","format-standard","hentry","category-ai","category-ai-testing","category-whisper"],"_links":{"self":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18627","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/comments?post=18627"}],"version-history":[{"count":22,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18627\/revisions"}],"predecessor-version":[{"id":18694,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18627\/revisions\/18694"}],"wp:attachment":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/media?parent=18627"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/categories?post=18627"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/tags?post=18627"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}