{"id":383,"date":"2014-06-15T16:12:23","date_gmt":"2014-06-15T20:12:23","guid":{"rendered":"http:\/\/qxf2.com\/blog\/?p=383"},"modified":"2017-06-14T01:59:27","modified_gmt":"2017-06-14T05:59:27","slug":"xpath-tutorial","status":"publish","type":"post","link":"https:\/\/qxf2.com\/blog\/xpath-tutorial\/","title":{"rendered":"The art of writing xpaths"},"content":{"rendered":"<p><strong>Problem:<\/strong> Writing XPaths is hard and confusing when there are no unique identifiers<\/p>\n<p><a href=\"http:\/\/en.wikipedia.org\/wiki\/XPath\">XPath<\/a> (XML Path Language) is a query language for selecting nodes from <a href=\"http:\/\/en.wikipedia.org\/wiki\/Document_Object_Model\">Document Object Models<\/a> (DOM) like XML, HTML, etc. XPaths are frequently used with Selenium scripts to uniquely identify elements on a page. This post is a descriptive tutorial on how to think about XPaths and write them in the absence of unique identifiers.<\/p>\n<hr>\n<h2> Why this post? <\/h2>\n<p>In most web apps, testers do not find unique identifiers (like ids) to locate DOM elements easily. For a long time now, I have used both <a href=\"http:\/\/www.w3schools.com\/CSSref\/css_selectors.asp\">CSS selectors<\/a> and XPaths to uniquely identify DOM elements as part of our Selenium automation scripts. The standard XPath tutorials on the web cover the syntax and the terminology but do not cover the thought process behind arriving at an XPath in the absence of ids or other unique identifiers. Over the years, I found that despite my colleagues having access to <a href=\"http:\/\/www.lmgtfy.com\/?q=google\">Google<\/a>, I have had to spend a significant amount of my time teaching them the art of writing good XPaths. While helping my fellow testers, I realized that the number one difficulty was getting testers to think correctly about XPaths. In this tutorial, I have decided to document the thought process behind arriving at an XPath when you do not have unique identifiers to work with. 
<\/p>\n<hr>\n<h2> Basics of XPath syntax <\/h2>\n<p>For newbies, I&#8217;ll give a quick rundown of the basics of XPaths.<\/p>\n<p>1. XPaths usually begin with a double slash (\/\/), which matches nodes at any depth in the document<br \/>\n2. Reference an HTML element by its tag name. So a link can be referenced by \/\/a<br \/>\n3. To reference an attribute of an HTML element, use the @ notation (e.g., @href)<br \/>\n4. text() and dot (.) are not attributes: text() selects the text nodes that are direct children of the element, while dot refers to the current node, so [.='Save'] compares against all the text inside the element, including text in nested tags<br \/>\n5. The <strong>contains()<\/strong> function and the equality operator <strong>=<\/strong> are useful for uniquely identifying nodes<\/p>\n<hr>\n<h2> The 3 most common XPath patterns<\/h2>\n<p>Most tutorials on XPaths will expose you to what I consider the three most common XPath patterns in a tester&#8217;s tool belt:<br \/>\n1. \/\/html_element[@id='blah']<br \/>\n2. \/\/html_element[@attribute='value']<br \/>\n3. \/\/html_element[contains(@attribute,'value')]<\/p>\n<p>You can usually get some amount of Selenium automation working with just the above three patterns. But it is going to be a pain to maintain. I think XPaths get a bad rap simply because most testers do not evolve to using better XPaths. Here&#8217;s hoping we can change that! <\/p>\n<hr>\n<h2> The better way to write XPaths<\/h2>\n<p>In real life, things are rarely this straightforward. Not all nodes have ids or unique attributes. You need to come up with relative paths to known nodes and then proceed to your destination node. <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XPath\/Axes\">XPath axes<\/a> are a powerful tool for specifying relative positions such as parent, child, sibling, ancestor, descendant, etc. <\/p>\n<p>I treat coming up with XPaths like following directions given to me by a human &#8211; each step is imprecise and non-unique, but strung together the steps are good enough to locate places accurately. 
So here is a real-life example of following directions before we tackle the more advanced XPaths. <\/p>\n<h3> Scenario <\/h3>\n<p>Your friend Jose calls you up:<br \/>\n<strong>Jose:<\/strong> Hey, want to check out the new bar that opened near my place?<br \/>\n<strong>You:<\/strong> Sure<br \/>\n<strong>Jose:<\/strong> Drop by my place whenever and we can leave from here<br \/>\n<strong>You:<\/strong> All I remember about your apartment was that it had a ton of flags as decoration. Remind me &#8211; how do I get to your place?<br \/>\n<strong>Jose:<\/strong> Do you know Bobby Street?<br \/>\n<strong>You:<\/strong> Yep<br \/>\n<strong>Jose:<\/strong> Keep driving down Bobby Street. You will see a billboard that reads &#8216;<strong>Testing: Checking :: Chess: Checkers<\/strong>&#8217;. Take the first right after that billboard. Keep driving and you should see my apartment to your left.<br \/>\n<strong>You:<\/strong> What is your apartment number?<br \/>\n<strong>Jose:<\/strong> I live in A-block and my apartment number is 602.<\/p>\n<p><a href=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2014\/06\/xpath_map.jpg\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2014\/06\/xpath_map-1024x723.jpg\" alt=\"xpath map\" width=\"474\" height=\"334\" class=\"aligncenter size-large wp-image-619\" srcset=\"https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2014\/06\/xpath_map-1024x723.jpg 1024w, https:\/\/qxf2.com\/blog\/wp-content\/uploads\/2014\/06\/xpath_map-300x211.jpg 300w\" sizes=\"auto, (max-width: 474px) 100vw, 474px\" \/><\/a><\/p>\n<p>Guess what? You can now follow the directions rather easily even though each step of the directions above is non-unique. In your city, there are likely many billboards with the text &#8216;<strong>Testing: Checking :: Chess: Checkers<\/strong>&#8217; but on Bobby Street there is probably only one. There are likely many apartments on the street where Jose lives but probably only one with flags as decoration. Further, there could be many houses numbered A-block, 602 in your city but within Jose&#8217;s apartment it is likely to be unique. You can make sense of this imprecise set of instructions because there is a pattern to the way your brain is solving the problem. <\/p>\n<h3> Algorithm in real life <\/h3>\n<p>1. identify a known landmark closest to your destination (e.g., Bobby Street)<\/p>\n<p>2. identify a series of easily recognizable, fairly unique features between the landmark and your destination to get to the vicinity of your location (e.g., billboard text, flags, apartment)<\/p>\n<p>3. use unique identifiers within this narrowed focus to zero in on the location (e.g., A-block, apartment number 602 within the apartment with flags)<\/p>\n<p><strong>PRO TIP:<\/strong> choose landmarks and identifiers that are less likely to change in the time frame that you plan to use them<\/p>\n<h3> Applying the algorithm to write XPaths<\/h3>\n<p>If I converted the above directions to an XPath, it would look like this:<\/p>\n<p>1. <strong>Start with a known landmark closest to your destination:<\/strong><\/p>\n<pre lang='HTML'>\/\/street[@name='Bobby Street']<\/pre>\n<p>2. 
<strong>Identify the billboard<\/strong><\/p>\n<pre lang='HTML'>billboard[text()='Testing: Checking :: Chess: Checkers']<\/pre>\n<p>   Stitching together what we have so far, our XPath looks like this:<\/p>\n<pre lang='HTML'>\/\/street[@name='Bobby Street']\/billboard[text()='Testing: Checking :: Chess: Checkers']<\/pre>\n<p>3. <strong>Take the next right<\/strong> <\/p>\n<pre lang='HTML'>following::street[@direction='right'][1]<\/pre>\n<p>   The <strong>following<\/strong> axis is very useful: it selects nodes that appear after the current node in the document. Also, XPath indices begin at 1 and not zero, so [1] picks the first street to the right after the billboard.<br \/>\n   Stitching together what we have so far, our XPath looks like this:<\/p>\n<pre lang='HTML'>\/\/street[@name='Bobby Street']\/billboard[text()='Testing: Checking :: Chess: Checkers']\/following::street[@direction='right'][1]<\/pre>\n<p>4. <strong>Identify the apartment<\/strong> (it has flags as decorations)<\/p>\n<pre lang='HTML'>apartment[contains(@decoration,'flags')]<\/pre>\n<p>   The <strong>contains()<\/strong> function is often useful when an exact match with = is too strict.<br \/>\n   Stitching together what we have so far, our XPath looks like this:<\/p>\n<pre lang='HTML'>\/\/street[@name='Bobby Street']\/billboard[text()='Testing: Checking :: Chess: Checkers']\/following::street[@direction='right'][1]\/apartment[contains(@decoration,'flags')]<\/pre>\n<p>5. <strong>Zero in on the location<\/strong><\/p>\n<pre lang='HTML'>descendant::house[@block='A' and @number='602']<\/pre>\n<p>   BTW, did you notice the other apartment on the same street? If you had searched for just house[@block='A' and @number='602'], it is likely that you would have found a match in that apartment too. 
This is the reason that we had to narrow our focus, in Step 4, to the apartment with flags as decoration.<br \/>\n   The <strong>descendant<\/strong> axis matches at any depth below the current node, which is very useful when the node you want could be a child, a grandchild or deeper.<\/p>\n<p>Putting it all together, the XPath you want is:<\/p>\n<pre lang=\"html\">\r\n\/\/street[@name='Bobby Street']\/billboard[text()='Testing: Checking :: Chess: Checkers']\/following::street[@direction='right'][1]\/apartment[contains(@decoration,'flags')]\/descendant::house[@block='A' and @number='602']\r\n<\/pre>\n<hr>\n<p>And there you have it &#8211; a tutorial on the thought process behind writing XPaths. I hope you found it useful!  <\/p>\n<hr>\n<p><strong>P.S. 1:<\/strong> I have used a rather unusual approach to explaining XPaths &#8211; no HTML at all. Let me know your feedback.<\/p>\n<p><strong>P.S. 2:<\/strong> I am not going to write about CSS selectors because there is already a fantastic resource: <a href=\"http:\/\/flukeout.github.io\/\">http:\/\/flukeout.github.io\/<\/a><\/p>\n<p><strong>P.S. 3:<\/strong> I have referenced two former world chess champions in this post. 
Can you guess which two?<\/p>\n<hr>\n","protected":false},"excerpt":{"rendered":"<p>Problem: Writing XPaths is hard and confusing when there are no unique identifiers XPath (XML 
Path Language) is a query language for selecting nodes from Document Object Models (DOM) like XML, HTML, etc. XPaths are frequently used with Selenium scripts to uniquely identify elements in page. This post is a descriptive tutorial on how to think about xpaths and write [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38,15,30,45],"tags":[],"class_list":["post-383","post","type-post","status-publish","format-standard","hentry","category-automation","category-how-to","category-selenium","category-xpath"],"_links":{"self":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/383","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/comments?post=383"}],"version-history":[{"count":29,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/383\/revisions"}],"predecessor-version":[{"id":6237,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/383\/revisions\/6237"}],"wp:attachment":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/media?parent=383"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/categories?post=383"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/tags?post=383"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}