Selenium Basics

Selenium IDE, Selenium WebDriver

Created by Kavan Sheth



  • Selenium is an open-source tool, released under the Apache 2.0 license, for test automation of web-based applications.

  • Brief History of the Selenium Project
    • Selenium RC – developed by Jason Huggins in 2004 at ThoughtWorks
    • WebDriver – developed by Simon Stewart in 2006 at Google
    • Merged and released as Selenium 2 on July 8, 2011
    • For more details, you can refer to this.

  • Selenium’s Tool Suite
    • Selenium 2 (also known as Selenium WebDriver)
    • Selenium 1 (also known as Selenium RC or Remote Control)
    • Selenium IDE
    • Selenium Grid

Why Selenium?

  • QTP or Selenium? As I have very limited exposure to QTP, I don't want to give a biased opinion on this. But if you want to automate a window-based (desktop) application, Selenium is not the tool for you. You must look for other alternatives.

  • A few points in favour of Selenium from my side:
    • Support for a wide range of browsers: Firefox, IE, Chrome, Opera, Safari.
    • Supports Android/iOS as well, so you can test your mobile website on Android or iOS using Selenium.
    • It is FREE, but powerful.
    • Support for multiple language bindings: Java, C#, Python, Ruby, PHP, JavaScript.
    • The power of the surrounding Java APIs/tools: JUnit, TestNG, Jxl, JSch, Log4J, Maven, ReportNG ...
    • Rich community, at least for Java. Official User Group.

Selenium IDE


  • Selenium IDE is simply intended as a rapid prototyping tool.
  • You can record and play back your tests using Selenium IDE.
  • BUT Selenium IDE is not as mature as QTP in recording tests, so you need to update your scripts with proper wait statements and validations. Usually the IDE is used for prototyping or as a helper for developing automation suites.
  • It works only with the Firefox browser. So if your web application runs only in IE or a different browser, you CANNOT use Selenium IDE to record and play back your test scenarios.
  • Still, if you are new to Selenium, it is better to have an understanding of Selenium IDE.

Installing IDE

  • Steps to install Selenium IDE are given in the official Selenium docs.
  • Just summarising them here for the sake of completeness:
    • Download the IDE plugin (.xpi file) from the SeleniumHQ downloads page.
    • If you are downloading it using Firefox, it will ask you for installation. Click on "Install Now" and restart Firefox. Selenium IDE will be available under the Tools menu.
    • If you have already downloaded the .xpi file for Selenium IDE, place it under the plugins folder of the Firefox installation directory in Program Files, and then restart Firefox.


Record and Playback

  • Open the IDE.
  • The Record button might already be pressed. If not, start recording by clicking on the Record button.
  • Go to the Firefox browser and perform the test scenario: open a specific link, provide inputs, click on buttons, and so on.
  • Once the scenario is complete, go back to the IDE and click on the Record button again to stop recording.
  • Once you stop, your Editor panel may look like the following:

Scenario, here

  • Open
  • Search for Selenium
  • Click on First link from results

Selenium Commands – “Selenese”

  • Each recorded row is divided into three parts: Command, Target and Value.
  • Command contains predefined functions from Selenium, which are known as Selenium commands or Selenese.
  • As we can see in the Editor, each Selenese command has at most two parameters, Target and Value. Depending on the command, a parameter can be:
    • a locator for identifying a UI element within a page.
    • a text pattern for verifying or asserting expected page content.
    • a text pattern or a Selenium variable for entering text in an input field or for selecting an option from an option list.

Selenium Commands – “Selenese” – cont.

  • Selenium commands come in three “flavors”: Actions, Accessors, and Assertions.
    • Actions are commands that generally manipulate the state of the application, e.g. click, clickAndWait, type.
    • Accessors examine the state of the application and store the results in variables, e.g. “storeTitle”.
    • Assertions are like Accessors, but they verify that the state of the application conforms to what is expected. All Selenium Assertions can be used in 3 modes: "assert", "verify", and "waitFor". For example, you can "assertText", "verifyText" and "waitForText". When an "assert" fails, the test is aborted; when a "verify" fails, the failure is logged and the test continues; "waitFor" waits for the condition to become true.


  • As we know, HTML pages are made up of different elements, such as buttons, textboxes, links and tables.
  • Automation is the task of identifying these elements and performing the required actions on them: entering text, clicking a button, getting text and verifying it, and so on.
  • To find these elements we use different locators.


Locator | Example | Row of element identified from HTML
Locating by Identifier | id=loginForm | 3
Locating by Name | name=password | 5
Locating Hyperlinks by Link Text | link=Continue | 4 (in 2nd HTML)


Locator | Example | Row
Locating by XPath | xpath=/html/body/form[1] - absolute path (would break if the HTML changed even slightly) | 3
 | xpath=//form[1] - first form element in the HTML | 3
 | xpath=//form[@id='loginForm'] - the form element with an attribute named 'id' and the value 'loginForm' | 3
 | xpath=//form[input/@name='username'] - first form element with an input child element that has an attribute named 'name' and the value 'username' | 3
 | xpath=//form[@id='loginForm']/input[1] - first input child element of the form element with an attribute named 'id' and the value 'loginForm' | 4
 | xpath=//input[@name='continue'][@type='button'] - input with attribute 'name' with value 'continue' and attribute 'type' with value 'button' | 7
 | xpath=//form[@id='loginForm']/input[4] - fourth input child element of the form element with an attribute named 'id' and the value 'loginForm' | 7
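
These XPath locators can be tried without a browser. The sketch below evaluates them with the XPath engine that ships with the JDK (javax.xml.xpath). The login page used here is an assumption, reconstructed so that its elements match the rows in the table, and it is written as well-formed XML because the JDK parser is stricter than a browser. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XPathLocatorDemo {
    // Assumed sample page (not from the slides), written as well-formed XML.
    static final String PAGE =
          "<html><body>"
        + "<form id='loginForm'>"
        + "<input name='username' type='text'/>"
        + "<input name='password' type='password'/>"
        + "<input name='continue' type='submit' value='Login'/>"
        + "<input name='continue' type='button' value='Clear'/>"
        + "</form>"
        + "</body></html>";

    // Describes the first element matched by the XPath as "tag:idOrName[:type]".
    static String first(String xpath) {
        try {
            Element e = (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(PAGE)), XPathConstants.NODE);
            if (e == null) {
                return "no match";
            }
            String label = e.hasAttribute("id") ? e.getAttribute("id") : e.getAttribute("name");
            String type = e.hasAttribute("type") ? ":" + e.getAttribute("type") : "";
            return e.getTagName() + ":" + label + type;
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("Invalid XPath: " + xpath, ex);
        }
    }

    public static void main(String[] args) {
        String[] locators = {
            "/html/body/form[1]",
            "//form[1]",
            "//form[@id='loginForm']",
            "//form[input/@name='username']",
            "//form[@id='loginForm']/input[1]",
            "//input[@name='continue'][@type='button']",
            "//form[@id='loginForm']/input[4]"
        };
        for (String locator : locators) {
            System.out.println(locator + " -> " + first(locator));
        }
    }
}
```

In Selenium IDE the same expressions are written with the xpath= prefix shown in the table; the prefix is not part of the XPath itself.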

More info: W3Schools XPath Tutorial W3C XPath Recommendation


Locator | Example | Row
Locating by DOM | dom=document.getElementById('loginForm') | 3
 | dom=document.forms['loginForm'] | 3
 | dom=document.forms[0] | 3
 | dom=document.forms[0].username | 4
 | dom=document.forms[0].elements['username'] | 4
 | dom=document.forms[0].elements[0] | 4
 | dom=document.forms[0].elements[3] | 7


Locator | Example | Row
Locating by CSS (Cascading Style Sheets) | css=form#loginForm | 3
 | css=input[name="username"] | 4
 | css=input.required[type="text"] | 4
 | css=input.passfield | 5
 | css=#loginForm input[type="button"] | 7
 | css=#loginForm input:nth-child(2) | 5

Preferred selector order : id > name > css > xpath

More info: W3C CSS Recommendation

Commonly Used Selenium Commands

open: opens a page using a URL.

click/clickAndWait: performs a click operation, and optionally waits for a new page to load.

verifyTitle/assertTitle: verifies an expected page title.

verifyTextPresent: verifies expected text is somewhere on the page.

verifyElementPresent: verifies an expected UI element, as defined by its HTML tag, is present on the page.

verifyText: verifies expected text and its corresponding HTML tag are present on the page.

verifyTable: verifies a table’s expected contents.

waitForPageToLoad: pauses execution until an expected new page loads. Called automatically when clickAndWait is used.

waitForElementPresent: pauses execution until an expected UI element, as defined by its HTML tag, is present on the page.

Some Important IDE features

In the File menu you will find the "Export TestCase as" option. Using this option, we can export the recorded test case to WebDriver source code in the desired language and unit test framework. This can be very useful for getting started with WebDriver.

Some Important IDE features

In the Edit menu you will find the "Insert New Command" option. Using this option, we can add a new command to the recorded script. You will need this feature to add validations/assertions and wait statements.

Some Important IDE features

In the Options menu you will find the "Clipboard Format" option. Using this option, you can copy and paste Selenese commands into your Java code, and they will automatically be translated into valid Java code.


Selenium Webdriver

How to Use?

  • People who are new to Selenium and Java often ask me for a Selenium setup, and when I give them a jar file, they find it difficult to get started with it. Selenium WebDriver is nothing but a Java API, so you just need to run it like any other Java program that uses a jar file.
  • Here we will see four ways to run a Java program with the Selenium WebDriver API:
    • Using Console
    • Using Eclipse
    • Using Maven
    • Using Maven and Eclipse

Download Selenium Jars

You can download the Selenium jars from the SeleniumHQ downloads page.

But for first-timers the following might be confusing:

Download Selenium Jars

When to use web driver client jars?

  • Your browser and tests will all run on the same machine
  • Your tests only use the WebDriver API, and not Selenium RC

When to use Selenium Server?

  • You are using Selenium-Grid to distribute your tests over multiple machines or virtual machines (VMs).
  • You want to connect to a remote machine that has a particular browser version that is not on your current machine.
  • You are not using the Java bindings (i.e. Python, C#, or Ruby) and would like to use HtmlUnit Driver.

Download Selenium Jars

I hope now you have decided which jar you want to use. I will be using

You need to decide on one more thing - Driver.

Depending on the browser you choose, you need to download the driver corresponding to it. If you are working with Firefox, no separate driver needs to be downloaded. It is part of your client/server jars.

We will discuss drivers later when we start with the coding part. For now, just start with the Firefox browser.

Using Console

To learn how to run a Java program with the Selenium WebDriver API, you will need a sample program. For now, you can either create your own Java program from your recorded test case using the Export Test Case option of Selenium IDE, or use this sample.

One more thing: you should have a JDK installed on your PC (Download JDK). After installing the JDK, open Command Prompt, type javac and press Enter. If you get an error saying that 'javac' is not recognized as an internal or external command, you need to add the folder containing javac.exe and java.exe (usually something like "C:\Program Files (x86)\Java\jdk-version\bin") to the PATH environment variable.

Using Console - cont.

  • We will denote your current directory as HOME

  • Assuming that you have already downloaded Jar as per your requirement.
  • If it is a zip file, unzip it and copy all jar files into a new folder, say, "libs" in your HOME folder
  • When we export Test case from IDE, default package is com.example.tests. So you need to create directory structure as HOME -> com -> example -> tests and place your file in tests folder
  • Now open command prompt/console and go to HOME dir.
  • Use following commands to Compile and Run your program:
  • Compile:
  • javac -cp "libs\*" com\example\tests\

  • Run:
  • java -cp ".;libs\*" org.junit.runner.JUnitCore com.example.tests.TestData

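    java -cp ".;libs\*" com.example.tests.TestData (without JUnit; the source must have a main() method)

Note: org.junit.runner.JUnitCore is needed to run your test with the JUnit framework. The ; classpath separator and the \ path separator are for Windows; on Linux or macOS use : and / instead.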

Using Console - cont.

If you are using the sample, the test will:

  • Open Firefox Browser.
  • Enter URL for Google.
  • Search for "Selenium".
  • Click on link with text "Selenium - Web Browser Automation" in search result.
  • Close browser.

and on console you will find something like this:

Using Eclipse

  • Install a JDK (a must for compiling Java source, and therefore for Eclipse).
  • Download Eclipse (Eclipse IDE for Java Developers). I am using Eclipse Juno*.
  • No installation is required for Eclipse. Just unzip it and run eclipse.exe.
  • If Eclipse asks you to select a workspace, either choose the desired path or leave the default.
  • Create a new project from File -> New -> Java Project. Give the project a name, say, YourProject.
  • Right-click on your newly created project in Package Explorer and add a new package. Under that package, add a new class, or directly copy your file into the new package.
  • Now right-click on the project and select Properties. Go to Java Build Path and, under the Libraries tab, click on "Add External JARs..". Navigate to the libs folder where we kept all the unzipped jar files. Add all the jar files and click OK.
  • To run, just right-click on your .java file and click on "Run As" -> "Java Application" (if running with a main method) or "JUnit Test" (if using the JUnit framework).

*Juno is an Eclipse release name. Eclipse releases are named alphabetically, mostly after planetary bodies: (E)uropa, (G)alileo, (H)elios, (I)ndigo, (J)uno, (K)epler, (L)una, (M)ars (2015).

Using Maven

What is Maven?

Maven, a Yiddish word meaning accumulator of knowledge, was originally started as an attempt to simplify the build processes in the Jakarta Turbine project. You can read more about it here. The power of a build tool can be fully appreciated only once you are working on a large-scale project, but this section will still give you the gist of how Maven can make your life simpler.

Using Maven - Cont.

  • Download Maven
  • Install Maven
  • Now to create a new Maven project run below command from any folder where you want to keep your maven projects.
  • mvn archetype:generate -DgroupId=com.example.tests -DartifactId=sample -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

  • Which will create following directory structure,
  • sample
    |-- pom.xml
    `-- src
        |-- main
        |   `-- java
        |       `-- com
        |           `-- example
        |               `-- tests
        |                   `--
        `-- test
            `-- java
                `-- com
                    `-- example
                        `-- tests
  • So here groupId is your package and artifactId is your project name.

Using Maven - Cont.

  • "The pom.xml file is the core of a project's configuration in Maven. It is a single configuration file that contains the majority of information required to build a project in just the way you want." - Apache Maven Documentation
  • By default, it looks something like this.
    <project xmlns=...>
  • In pom.xml you need to specify the dependencies for your project. For example, if you want to add JUnit as a dependency, the given tags (groupId, artifactId, version) help Maven identify and download that dependency from its repository.

Using Maven - Cont.

  • We have to make two changes. first, upgrade Junit 3 to Junit 4 and second, add dependency for Selenium.
  • Updated pom.xml
    <project xmlns=...>
  • Don't worry about what tags to be added for a dependency. You will find it all at Like for Junit, for Selenium

Using Maven - Cont.

  • Now remove the default source files from the Maven project and add your file under the src\test\java\com\example\tests folder.
  • Caution: Maven expects your test class name to contain "Test". By default, its Surefire plugin only runs classes whose names match patterns such as Test*, *Test or *TestCase, so rename your file accordingly and also rename the corresponding class inside the file.
  • That's it. You are ready to run your tests.
  • Go to your Maven project folder, where pom.xml is located, and run "mvn clean test".
  • Your test will be executed and you will get something like this in the console:

So, Maven requires some initial effort. But now consider what happens when you want to change the version of your dependencies or add/remove dependencies: you don't have to worry about downloading jars, placing them in the correct folder or adding them to your build path. Just update pom.xml and Maven takes care of the rest.

Using Maven - Cont.

  • You can run specific test class using
  • mvn test -Dtest=classname

  • And you may use multiple names/patterns, separated by commas
  • mvn -Dtest=TestSquare,TestCi*le test

  • Specific test method using
  • mvn test -Dtest=classname#methodname

  • You can use patterns too
  • mvn -Dtest=TestCircle#test* test

    As of Surefire 2.12.1, you can select multiple methods (JUnit 4.x only)

    mvn -Dtest=TestCircle#testOne+testTwo test

  • This was not even tip of the iceberg. so you can study more about at

Using Maven and Eclipse

Install Maven Plugin

  • First we need to add Maven Plugin M2E in eclipse to use maven with eclipse.
  • In Eclipse click Help-> Install New Software.
  • Add Url "" in Work With field.
  • Mark checkbox against "Maven Integration for Eclipse".
  • Press Next. It will install given plugin to Eclipse.

Using Maven and Eclipse - Cont.

If you are not using Eclipse Luna, you may get the following error:

Missing requirement: Maven Integration for Eclipse (org.eclipse.m2e.core requires 'bundle [14.0.1,16.0.0)' but it could not be found

In that case you have to install an older version of m2e. To do that, uncheck "Show only the latest versions of available software".

The installation will ask you to restart Eclipse. After the restart, the Maven plugin will be available.

Using Maven and Eclipse - Cont.

  • Now we will create one Maven Project from Eclipse.
  • Go to File ->New-> Other-> Maven-> Maven Project.
  • Click Next, Next, Next. No change required.
  • Then add GroupId, ArtifactId and package as shown below and Click Finish.

Directory Structure in Eclipse

Using Maven and Eclipse - Cont.

  • Now we are ready.
  • Make the same changes as we did with the standalone Maven project.
  • Remove the default source files from the Maven project and add your file under the src\test\java\com\example\tests folder.
  • Also follow the caution regarding the test file naming convention.
  • Update pom.xml with the JUnit 4.11 and Selenium 2.42.2 dependencies.
  • To run your Maven project, right-click on the project folder (here, sample) and select Run As -> Maven test.
  • I hope you understood the steps. If you still have any doubts, there are many videos and tutorials available online. You just have to search a little bit.

An Example

A code walkthrough


WebDriver is the main interface to use for testing. It represents an idealised web browser. The methods in this interface fall into three categories:

  • Control of the browser itself
  • Selection of WebElements
  • Debugging aids

We initialize it with an instance of a specific implementation, such as FirefoxDriver here, which implements the WebDriver commands for Firefox.

// Create a new instance of the Firefox driver
// Notice that the remainder of the code relies on the interface, not the implementation.
WebDriver driver = new FirefoxDriver();

Initializing Drivers

As discussed in the section "Download Selenium Jars", you know how to download the driver for each browser. Now we will see how to use the drivers in code.

Creating a new instance of each driver is simple. Pick the line for your browser:

WebDriver driver = new HtmlUnitDriver(); 
WebDriver driver = new FirefoxDriver(); 
WebDriver driver = new InternetExplorerDriver();
WebDriver driver = new ChromeDriver(); 
WebDriver driver = new OperaDriver(); 
WebDriver driver = new AndroidDriver(); 
WebDriver driver = new IPhoneDriver();

Initializing Drivers - Cont.

Just one thing to remember: you need to set the path of your downloaded driver before initializing WebDriver.

  • For Chrome: System.setProperty("webdriver.chrome.driver", "chrome driver path");
  • For example:
    File file = new File("C:\\chromedriver.exe");
    System.setProperty("webdriver.chrome.driver", file.getAbsolutePath());
    WebDriver driver = new ChromeDriver();
  • For Internet Explorer: System.setProperty("webdriver.ie.driver", "IEDriver path");
  • For Opera: set the environment variable OPERA_PATH or set the capability opera.binary (we will see later how to set a capability).
  • For Android: Visit or
  • For iOS : Visit or

Basic Flow

When you automate pages, you will realize that you are just repeating the same steps in a pattern:

  • Navigate to a web page.
  • Search for the desired web element.
  • Interact with web elements: type, click, select ...
  • Wait for actions to complete, such as a page load, a page refresh or a response from the server (if needed).
  • Verify the state of the application.

Webdriver has two methods to navigate to a URL

  • driver.get("url")
  • driver.navigate().to("url")

// And now use this to visit Google
driver.get("http://www.google.com");

// Alternatively the same thing can be done like this
// driver.navigate().to("http://www.google.com");

Simple enough?

Now what if you want to navigate to another window, or just to a frame under same window?

Also go through line 24-25 and 38-39 for usage of WebDriver Methods, which are self explaining.

// Check the title of the page
System.out.println("Page title is: " + driver.getTitle());

// Close the browser
driver.quit();

WebDriver supports two methods to find elements

WebElement findElement(By by) - Returns single WebElement

List<WebElement> findElements(By by) - Returns List of WebElements

// Find the text input element by its name

WebElement element = driver.findElement(By.name("q"));

There are multiple ways to locate elements using the By class, as follows:

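// Locator values below are illustrative
WebElement element = driver.findElement(By.id("coolestWidgetEvah"));
List<WebElement> cheeses = driver.findElements(By.className("cheese"));
WebElement frame = driver.findElement(By.tagName("iframe"));
WebElement cheese = driver.findElement(By.name("cheese"));
WebElement cheese = driver.findElement(By.linkText("cheese"));
WebElement cheese = driver.findElement(By.partialLinkText("cheese"));
WebElement cheese = driver.findElement(By.cssSelector("#food span.dairy.aged"));
List<WebElement> inputs = driver.findElements(By.xpath("//input"));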

WebDriver Methods

WebElement represents an HTML element. Generally, all interesting operations that interact with a page are performed through this interface.

  • Check the state of an element
    • element.isEnabled();
    • element.isDisplayed();
    • element.isSelected();
  • Perform an action on an element
    • element.sendKeys(Keys.ENTER);
  • Get metadata of an element
    • element.getAttribute("style");
    • element.getLocation();
  • Search within the limited context of an element
    • element.findElement(By.name("name"));
    • element.findElements(By.className("class"));

// Enter something to search for


// Now submit the form. WebDriver will find the form for us from the element
element.submit();

Remember: when the findElement method of the driver is called, it returns an object of type WebElement, and the findElements method returns a list of WebElements.

WebElements Methods

Now only one statement remains, the one related to WebDriverWait. We will see it in detail in upcoming sections. For now you just need to know that this statement makes your code wait until the page is fully loaded, and it ensures this by verifying that the title of the page is as expected.
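
To make the idea of "wait until a condition holds" concrete, here is a minimal sketch of a polling wait written with the JDK only. It is not how WebDriverWait is implemented; the class name, the waitUntil method and the fake page title are all made up for illustration. It only shows the general pattern: check a condition repeatedly until it is true or a timeout expires.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

class PollingWaitSketch {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held in time, false otherwise.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a page title that changes about 200 ms after submitting a search.
        long start = System.currentTimeMillis();
        Supplier<String> title = () ->
                System.currentTimeMillis() - start < 200 ? "Google" : "selenium - Google Search";

        boolean loaded = waitUntil(() -> title.get().toLowerCase().startsWith("selenium"), 2000, 50);
        System.out.println("Title matched within timeout: " + loaded);
    }
}
```

With a real driver, the condition would call driver.getTitle() instead of the stand-in supplier.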

One more thing: as you might have noticed, this piece of code does not use the JUnit framework. So, as mentioned earlier, you need to run it as a Java application as follows (assuming there is no package declaration, which means your class is part of the default package):

java -cp ".;libs\*" Selenium2Example


As we briefly saw in the Selenium IDE section, Selenium uses locators to find elements on a web page. An element can be anything lying between opening and closing HTML tags. For example, it can be a <div> or a <td> or an <input>.

Whenever the browser fetches a page as a response from the server, the page arrives as HTML. The browser then renders it and displays it in a presentable way.

Just right-click on the Google homepage and select "View page source". You will get a very large and messy HTML page like this.

So how can we locate an element?

In HTML, we have different tags enclosed in < and >. Each tag is rendered separately by the browser and displayed based on its attributes. A few of these attributes are just for identification, while others specify the characteristics or style of the element. For example:
     id, name - for identification
     class - for the CSS stylesheet
     href - to specify a link address
     style - to specify the formatting of that element...

We can locate a specific HTML element using any of these attributes. The only condition is that the attribute must identify the desired element uniquely.

In the image below, the highlighted input element is the one we used in the sample program to enter the search string: WebElement element = driver.findElement(By.name("q")); Here the "name" attribute uniquely identifies the input element.

Finding elements

Though we saw the process of identifying an element using a unique attribute, it is still tiresome, complex and error-prone to do this from the raw HTML source. But don't worry: nowadays browsers provide good assistance in this matter.

Finding elements - Firefox

Press "F12" in the Firefox browser to open the developer tools window, then click on the element picker icon and hover your mouse over different elements of the web page. It will highlight the HTML code for each element. From the code you can determine a unique property for the element.

Finding elements - Chrome

Similarly, press "F12" in the Chrome browser to open the developer tools window, then click on the element picker icon.

Finding elements - Internet Explorer

Similarly, press "F12" in IE 8+ to open the developer tools window, then click on the element picker icon.

Finding elements - few points

  • In developer tools, clicking on an element shows its source, and clicking on HTML code highlights the corresponding element. Still, the responsibility for choosing an attribute that ensures uniqueness lies with the test developer.
  • If your application works in the Firefox browser, you can use Selenium IDE to record your flow and see how it locates the required element. Here, for example, the IDE has recorded the target as "id=gbqfq", which also identifies the Google search input box uniquely.

Selenium Methods for Locators

Once you have identified an attribute that identifies the element uniquely, you can use the methods provided by Selenium.
Selenium provides locator methods for a few of the most common attributes.
You can use them as follows:


But these do not cover all attributes, and there is still a chance that an element does not have a single attribute that identifies it uniquely. So?

*Does the syntax of the findElement method look strange? We will discuss it in upcoming slides.

Selenium Methods for Locators

So Selenium provides two more approaches to find an element:

  1. By CSS locators, which locate an element the way CSS does. This has nothing to do with styling; it only borrows the selector syntax that CSS uses to identify an element before applying a specific style.
  2. By XPath. XPath is a way to navigate through an XML document, and it can similarly be applied to an HTML document.

With these two methods we can exploit other attributes as well as the absolute or relative position of an element on the HTML page.

"By" Class


"By" Class - Cont.

If you are able to understand the flow from the previous image, that is fine. But there is no harm if you don't. I was curious and found it interesting, so I tried to explain it here.

One thing to note: you can either use the By class, as in driver.findElement(By.name("aaa")); which is quite elegant,
or you can directly call driver.findElementByName("aaa"); which is also correct, but not recommended.


Now just one thing before we look into cssSelector and XPath locators in detail: in Chrome and Firefox you can evaluate your cssSelectors and XPaths, so before adding them to your test you can make sure that the locator will work.

In Firefox: F12, then open the Console.

In Chrome: F12, then press the Esc key.

$$("") evaluates a cssSelector, while $x("") evaluates an XPath.

CssSelector Locators

* - any element (universal selector)

E - an element of type E

E[foo] - an E element with a "foo" attribute

E[foo="bar"] - an E element whose "foo" attribute value is exactly equal to "bar"

E[foo~="bar"] - an E element whose "foo" attribute value is a list of whitespace-separated values, one of which is exactly equal to "bar"

E[foo^="bar"] - an E element whose "foo" attribute value begins exactly with the string "bar"

E[foo$="bar"] - an E element whose "foo" attribute value ends exactly with the string "bar"

E[foo*="bar"] - an E element whose "foo" attribute value contains the substring "bar"

E[foo|="en"] - an E element whose "foo" attribute has a hyphen-separated list of values beginning (from the left) with "en"

E#myid - an E element with ID equal to "myid"

E.warning - an E element whose class is "warning"

E[foo="bar"][foo1="bar1"] - an E element matching multiple attribute conditions

E F - an F element descendant of an E element

E > F - an F element child of an E element

E + F - an F element immediately preceded by an E element

E ~ F - an F element preceded by an E element

E:nth-child(n) - an E element, the n-th child of its parent

E:nth-last-child(n) - an E element, the n-th child of its parent, counting from the last one

E:nth-of-type(n) - an E element, the n-th sibling of its type

Similarly you can try following CssSelectors:

E:nth-last-of-type(n) - an E element, the n-th sibling of its type, counting from the last one

E:first-child - an E element, first child of its parent

E:last-child - an E element, last child of its parent

E:first-of-type - an E element, first sibling of its type

E:last-of-type - an E element, last sibling of its type

E:only-child - an E element, only child of its parent
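
The attribute operators in the selector list (=, ~=, ^=, $=, *= and |=) are easy to mix up. As a rough sketch of what each one tests, the Java method below reproduces their matching rules with plain String operations. It is not a CSS engine and not part of Selenium; the class and method names are made up, and details such as case-insensitive attributes are ignored.

```java
import java.util.Arrays;

class CssAttributeOperators {
    // Returns whether an attribute value would satisfy [attr OP "operand"].
    static boolean matches(String value, String op, String operand) {
        switch (op) {
            case "=":   // exactly equal
                return value.equals(operand);
            case "~=":  // one of the whitespace-separated words equals the operand
                return !operand.isEmpty()
                        && Arrays.asList(value.trim().split("\\s+")).contains(operand);
            case "^=":  // begins with the operand
                return !operand.isEmpty() && value.startsWith(operand);
            case "$=":  // ends with the operand
                return !operand.isEmpty() && value.endsWith(operand);
            case "*=":  // contains the operand as a substring
                return !operand.isEmpty() && value.contains(operand);
            case "|=":  // equals the operand, or starts with the operand followed by "-"
                return value.equals(operand) || value.startsWith(operand + "-");
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("btn primary large", "~=", "primary")); // word in a list
        System.out.println(matches("btn-primary", "~=", "btn"));           // not a separate word
        System.out.println(matches("searchInput", "^=", "search"));
        System.out.println(matches("searchInput", "$=", "Input"));
        System.out.println(matches("en-US", "|=", "en"));
        System.out.println(matches("english", "|=", "en"));
    }
}
```

The empty-operand checks mirror the rule in the CSS selectors specification that ~=, ^=, $= and *= match nothing when the operand is an empty string.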

CssSelector Locators - few points

  • The previous list of cssSelectors includes all the important selectors. You can refer to its source for an exhaustive list of selectors and more details.
  • Apart from the standard selectors, Selenium uses the Sizzle CSS selector library, which implements the :contains() locator. It may be very useful in some cases. :contains() is no longer part of the CSS selector standard, so its behaviour is implementation-specific.
  • Usage: element.findElement(By.cssSelector("a:contains('Log Out')")); locates links whose inner text contains the string 'Log Out'. Chrome doesn't recognise it as a valid CSS selector.
  • cssSelectors are faster than XPaths, and are therefore preferred over XPaths.
  • If you cannot find any site on the Internet against which you can practice CSS/XPath locators, or you are as lazy as I am :), you can start practicing with this sample.html.

xpath Locator

As the name suggests, an XPath is actually a path. Just as we use a path to locate a file in a directory structure, here we use a path to identify a node in a given XML structure. It can be relative or absolute.

/ denotes the root node, so any path starting with / is absolute. Otherwise it is relative.

So the basic syntax for an XPath is /step/step/... or step/step/...

Syntax for a step

axisname::nodetest[predicate]

nodetest identifies a node in the current node-set. For nodetest you will usually be using the following expressions:

Expression | Meaning | Example
tagname | selects all nodes with the given tag name | /table/tr/td - selects all td elements that are immediate children of a tr under the root table
/ | selects from the root node | /
// | selects matching nodes anywhere below the current node, no matter where they are | //div//input - locates all input elements that are descendants of a div
. | selects the current node | .
.. | selects the parent of the current node | ..
@ | selects attributes | //@name - selects all name attributes
* | selects any element node | //input/* - all child elements of input
@* | selects any attribute | //input/@* - selects all attributes of input elements
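
To see what these expressions actually select, you can run them against a small document with the JDK's built-in XPath engine. The page below is an assumed example written as well-formed XML, and the class and helper names are made up. Treat it as a sketch for experimenting, not as part of Selenium.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathNodeTestDemo {
    // Assumed sample page (not from the slides).
    static final String PAGE =
          "<html><body>"
        + "<div id='search'><span><input name='q' type='text'/></span></div>"
        + "<table><tr><td>A</td><td>B</td></tr></table>"
        + "<input name='go' type='submit'/>"
        + "</body></html>";

    // Evaluates the XPath against PAGE and returns all matching nodes.
    static NodeList select(String xpath) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("Invalid XPath: " + xpath, ex);
        }
    }

    // Number of nodes matched by the XPath.
    static int count(String xpath) {
        return select(xpath).getLength();
    }

    // Node name of the first match, or "no match".
    static String firstName(String xpath) {
        NodeList nodes = select(xpath);
        return nodes.getLength() == 0 ? "no match" : nodes.item(0).getNodeName();
    }

    public static void main(String[] args) {
        String[] paths = {
            "/html/body/table/tr/td", // td children of tr, from the root
            "//div//input",           // input elements below any div
            "//input[@name='q']/..",  // parent of the search input
            "//@name",                // every name attribute
            "//tr/*",                 // all child elements of tr
            "//input/@*"              // every attribute of every input
        };
        for (String path : paths) {
            System.out.println(path + " -> " + count(path) + " node(s), first: " + firstName(path));
        }
    }
}
```

Note that //div//input returns the input element, not the div: the last step of the path decides what is selected.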


Once you have identified a node-set, a predicate is used to refine/filter it.

Type of Predicate | Examples
Referring to node attributes | //input[@id='bar'] - all input elements with id 'bar'
 | //div[@name]/input - selects input elements under all div elements that have a name attribute
Referring to node position | //ul/li[2] - locates the second li element under every ul
 | //ul/li[last()] - selects the last li element under every ul
 | //ul/li[last()-1] - selects the second-to-last li element under every ul
 | //ul/li[position()<3] - selects the first two li nodes of every unordered list
Referring to node value | //ul[li>5]/li[2] - selects the second li from any list having an li > 5, which only makes sense if we know that li has a numeric value
 | //ul/li[text()='Any String'] - selects all li of ul having text equal to 'Any String'
 | //ul/li[.='Any String'] - . is the string value of the node, which for an li containing only text gives the same result as text()
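
The predicates above, including last()-1, can be checked with the XPath engine bundled with the JDK. The two lists in this sketch are assumed sample data, and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPredicateDemo {
    // Assumed sample data: one numeric list and one text list.
    static final String PAGE =
          "<body>"
        + "<ul id='nums'><li>3</li><li>8</li><li>1</li><li>4</li></ul>"
        + "<ul id='words'><li>alpha</li><li>Any String</li><li>gamma</li></ul>"
        + "</body>";

    // Returns the text of every node matched by the XPath.
    static List<String> texts(String xpath) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("Invalid XPath: " + xpath, ex);
        }
    }

    public static void main(String[] args) {
        String[] paths = {
            "//ul/li[2]",
            "//ul/li[last()]",
            "//ul/li[last()-1]",
            "//ul/li[position()<3]",
            "//ul[li>5]/li[2]",
            "//ul/li[text()='Any String']",
            "//ul/li[.='Any String']"
        };
        for (String path : paths) {
            System.out.println(path + " -> " + texts(path));
        }
    }
}
```

Positions in XPath start at 1, and a position predicate is applied per parent, which is why //ul/li[2] returns one item from each list.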


xpath has following operators supported: source.

is it sufficient to perform complex matching operations on attribute or text values? NO!


xpath has very large number of functions defined. You can refer them here.

Function | Examples
starts-with | //input[starts-with(@id,'abc')]
substring | //input[substring(@id,2)='earchInput'] - compares the id from its second character onward; possible matches are id = 'searchInput', id = 'eearchInput', id = 'SearchInput'
contains | //input[contains(@id,"Inp")] and //input[contains(text(),"Any String")]
normalize-space | //title[normalize-space(text())='Selenium - Google Search'] - strips leading and trailing whitespace and replaces each run of whitespace with a single space
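
To see these functions at work, here is a small sketch that again relies on the JDK's built-in XPath engine rather than a browser. The class name XPathFunctionDemo and the PAGE markup (including the deliberately messy whitespace in the title) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathFunctionDemo {
    // Made-up sample markup; note the extra spaces inside the title.
    static final String PAGE =
          "<html><head><title>  Selenium   -  Google Search </title></head><body>"
        + "<input id='searchInput'/><input id='abcField'/><input id='other'/>"
        + "</body></html>";

    // Returns how many nodes the expression selects in PAGE.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//input[starts-with(@id,'abc')]"));
        System.out.println(count("//input[substring(@id,2)='earchInput']"));
        System.out.println(count("//input[contains(@id,'Inp')]"));
        // An exact text() comparison fails because of the extra whitespace ...
        System.out.println(count("//title[text()='Selenium - Google Search']"));
        // ... while normalize-space() trims and collapses it first.
        System.out.println(count("//title[normalize-space(text())='Selenium - Google Search']"));
    }
}
```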


An axis defines a node-set relative to the current node (source).


A Problem

A common problem is accessing the fields in a table row that corresponds to a particular value in one column. For example, in the table below we want to access the textbox and the link that correspond to a specific item.
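
One way to approach this is with axes: find the cell that holds the item name, then walk to its sibling cells or up to its row. The sketch below is only an illustration under assumptions: the table layout (item name, textbox, link), the ids and hrefs, and the names TableRowXPathDemo, textboxFor and linkFor are all made up, and the expressions are evaluated with the JDK's XPath engine instead of a browser.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TableRowXPathDemo {
    // Hypothetical table: item name, quantity textbox, edit link.
    static final String TABLE =
          "<table>"
        + "<tr><td>Apple</td><td><input type='text' id='qty_apple'/></td>"
        + "<td><a href='/edit/apple'>Edit</a></td></tr>"
        + "<tr><td>Banana</td><td><input type='text' id='qty_banana'/></td>"
        + "<td><a href='/edit/banana'>Edit</a></td></tr>"
        + "</table>";

    // Returns the string value of the first node the expression selects in TABLE.
    static String value(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(TABLE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    // XPath for the textbox in the row of the given item, via the following-sibling axis.
    static String textboxFor(String item) {
        return "//td[.='" + item + "']/following-sibling::td/input";
    }

    // XPath for the link in the same row, via the parent axis.
    static String linkFor(String item) {
        return "//td[.='" + item + "']/parent::tr//a";
    }

    public static void main(String[] args) {
        System.out.println(value(textboxFor("Banana") + "/@id"));  // id of Banana's textbox
        System.out.println(value(linkFor("Banana") + "/@href"));   // href of Banana's link
    }
}
```

With WebDriver, the same expression strings would be passed to By.xpath(...) to locate the elements on the page.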





Handling Windows, Frames and Alerts

switchTo() is a method of the WebDriver interface that redirects future commands to a different window, frame or alert.

From the Selenium JavaDoc:

switchTo() returns a TargetLocator, which contains methods to locate windows, frames or alerts.


Handling Windows

Remember our first example, in which we searched for "selenium" and clicked the first link in the search results? What if that click opens the site in a new window or tab? How do we access it?

Pre-requisite: If you want to try out this scenario, you may need to enable the following in the Google search settings in Chrome:

The following is fairly standard code for handling windows:

import java.util.Iterator;
import java.util.Set;

// Store your parent window
String parentWindowHandler = driver.getWindowHandle();
String subWindowHandler = null;

// Get all window handles, which returns a Set of Strings
Set<String> handles = driver.getWindowHandles();

// Create an Iterator for the collection
Iterator<String> iterator = handles.iterator();

// Iterate over all window handles returned
while (iterator.hasNext()) {
    subWindowHandler = iterator.next();
    // Switch to the popup window
    driver.switchTo().window(subWindowHandler);
    // Assert that the current window is the target window
    if (driver.getTitle().equals("Title of target window")) {
        // Perform operations on the target window
    }
}

// Switch back to the parent window
driver.switchTo().window(parentWindowHandler);

Here we are using three methods of the WebDriver interface:

  • getWindowHandle - Returns the handle of the current window. It is important to store this handle first, because once you switch to a different window you cannot come back without a specific identifier for the original one. With the handle stored, we can simply switch back once work on the target window is finished.
  • getWindowHandles - Returns the handles of all windows open in the current WebDriver session, so you can iterate over them and switch to the desired window. The set includes the current window as well, so if you want to avoid selecting it, compare each handle with the current window handle and skip it when both are the same.
  • switchTo().window(String handle) - switchTo() returns a TargetLocator object, and calling window() on it with a handle as argument sends the "switchToWindow" command to the browser to change context.

Handling Frames

For frames, we are using the example from

In this example we have multiple files. If you are not familiar with the concept of frames in HTML, first explore the content of the HTML files. The main HTML page is frames_example.html, which defines the structure of the page, that is, the different frames, and each frame has a separate HTML source (topNav, menu and content).

The most important thing to understand is that you can switchTo() any frame from the main HTML page, but you cannot move directly from one of these sub frames to another. You always need to move back to the main page first, and then you can switchTo() another sub frame.

Initializing Driver

Navigate to desired page

Try to find an element. It will fail with NoSuchElementException, so if you are not catching that exception as shown in the code, your program will break.

Now we switchTo the frame 'menu'. This time there is no exception; the element is found, so you can perform any action on it.

As cautioned earlier, you cannot switch to another frame from the current frame; you need to move back to the default frame first, and then you can switch to another frame.

Switching to the default frame and then switching to another frame, topNav

Handling Alerts

When we discussed running a Java program from the console in an earlier section, there were a few extra methods (shown here) in that source. They are all you need to know about alerts, so we will discuss only these methods.

The first method, isAlertPresent(), uses switchTo().alert(), which returns an Alert if one is present; otherwise a NoAlertPresentException is thrown.

The second method, closeAlertAndGetItsText(), is called when isAlertPresent() returns true. This time we call switchTo().alert() and store the Alert object. The Alert object has a few methods, namely accept(), dismiss(), getText() and sendKeys(String s), which can be used as needed.

Handling Alerts - few Points

1. If an alert is present and you try to click on any element on the page, you will get "UnhandledAlertException: Modal dialog present". This happens when an alert has been left unhandled. So when you expect alerts to appear at unpredictable times, it is better to handle this exception in your code.

2. Before handling alerts, confirm that you are really dealing with an alert; sometimes you get a popup window that looks similar to an alert.

Wait Functions

  • Implicit wait

    • Using Methods from Timeouts interface
      • implicitlyWait
      • pageLoadTimeout
      • setScriptTimeout

  • Explicit wait

    • Using FluentWait
    • Using WebDriverWait
      • Using ExpectedConditions class
      • Using Interface ExpectedCondition
    • Thread Sleep

Implicit wait

An implicit wait is a wait configuration that applies throughout your test run. It is not specific to one scenario but generic.

Selenium provides three such settings under the Timeouts interface:

  • implicitlyWait: If an element is not found, WebDriver keeps retrying for up to the configured amount of time before reporting NoSuchElementException. Syntax: driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
  • pageLoadTimeout: If a page load exceeds the specified time, the driver throws a TimeoutException with the reason "Timed out waiting for page load". If the timeout is negative, page loads can be indefinite.
  • setScriptTimeout: If an asynchronous script exceeds this timeout, an error is thrown.

Explicit wait

An explicit wait is a wait you specify to make your code pause until a particular condition is met. We usually need it to make sure that the condition holds before moving further.

If you understand the FluentWait class in detail, the rest will be easier, as WebDriverWait is just a child class that simplifies the usage of FluentWait. We will look at the methods of FluentWait in detail.

Fluent Wait - until

Waiting does not mean doing nothing; it means repeating some activity until you get a result.

So, wait until what?

There is a method called until.

A simple example from the JavaDocs:

FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

Definition from JavaDocs

Fluent Wait - until

So what does it do? It keeps calling the given function, here with the WebDriver as input, until one of the following occurs:

  • the function returns neither null nor false
  • the function throws an unignored exception (will be explained soon)
  • the timeout expires (will be explained soon)
  • the current thread is interrupted

Here the function calls the findElement method on the driver, so it is called repeatedly until the element is found (which makes the returned object non-null). What if the element is never found, so that the function never returns a non-null value? In that case the remaining three conditions come into the picture; we will visit them soon.
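
To make the four stopping rules concrete, here is a minimal plain-Java sketch of such a polling loop. It is not Selenium's implementation: the class MiniWait, its nested WaitTimedOut exception and the millisecond-based method signatures are invented for illustration, and java.util.NoSuchElementException merely stands in for Selenium's exception of the same name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class MiniWait<T> {
    // Hypothetical exception thrown when the deadline passes first.
    static class WaitTimedOut extends RuntimeException {
        WaitTimedOut(String message, Throwable cause) { super(message, cause); }
    }

    private final T input;
    private long timeoutMillis = 500;
    private long pollMillis = 50;
    private final List<Class<? extends Throwable>> ignored = new ArrayList<>();

    MiniWait(T input) { this.input = input; }

    MiniWait<T> withTimeout(long millis) { timeoutMillis = millis; return this; }
    MiniWait<T> pollingEvery(long millis) { pollMillis = millis; return this; }
    MiniWait<T> ignoring(Class<? extends Throwable> type) { ignored.add(type); return this; }

    <V> V until(Function<? super T, V> condition) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        RuntimeException lastIgnored = null;
        while (true) {
            try {
                V value = condition.apply(input);
                // Rule 1: stop when the result is neither null nor false.
                if (value != null && !Boolean.FALSE.equals(value)) {
                    return value;
                }
            } catch (RuntimeException e) {
                // Rule 2: an exception that is not ignored ends the wait immediately.
                if (!isIgnored(e)) throw e;
                lastIgnored = e;
            }
            // Rule 3: give up once the timeout has expired.
            if (System.currentTimeMillis() >= deadline) {
                throw new WaitTimedOut("Timed out after " + timeoutMillis + " ms", lastIgnored);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Rule 4: stop when the current thread is interrupted.
                Thread.currentThread().interrupt();
                throw new WaitTimedOut("Interrupted while waiting", e);
            }
        }
    }

    private boolean isIgnored(Throwable t) {
        for (Class<? extends Throwable> type : ignored) {
            if (type.isInstance(t)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        String found = new MiniWait<AtomicInteger>(attempts)
                .withTimeout(1000)
                .pollingEvery(10)
                .ignoring(NoSuchElementException.class)
                .until(counter -> {
                    // Fails twice, then succeeds, like an element that appears late.
                    if (counter.incrementAndGet() < 3) {
                        throw new NoSuchElementException("not there yet");
                    }
                    return "element";
                });
        System.out.println(found + " after " + attempts.get() + " attempts");
    }
}
```

The input type is a generic parameter here, which mirrors the idea that the object passed to the condition does not have to be a driver.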

Fluent Wait - until

If you look at the JavaDocs, there is one more variant of the until method, which takes a Predicate.

As we saw, the earlier variant takes a Function with an input type and a return type. With a Predicate you specify only the input type; the result is always a boolean, and this variant of until returns nothing.

So we need to rewrite the earlier function as:

FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver);
wait.until(new Predicate<WebDriver>() {
  public boolean apply(WebDriver driver) {
     return driver.findElement(By.id("foo")).isDisplayed();
  }
});

Fluent Wait - Ignoring

As we discussed, your code is not idle while waiting; a repetitive task is going on.

So what if you get an exception during that repetitive activity?

Whatever exception we want to ignore during the recurring task, we can specify with the ignoring method.

Why would we want to ignore an exception?

The most common example: you look for an element with the findElement method of WebDriver or WebElement, which throws NoSuchElementException if the element is not found. You expect that the element will most probably appear after a few tries, and you want to give it some time. So you use a fluent wait and specify the exceptions you want to ignore for the time being.

Fluent Wait - Ignoring more


FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver)
          .ignoring(NoSuchElementException.class);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

What if you want to ignore more than one exception?

There are two more variants of ignoring in the FluentWait class.

Fluent Wait - Ignoring more

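Ignoring(Multiple Arguments):

FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver)
          // illustrative pair; pass whichever two exception types you need
          .ignoring(NoSuchElementException.class, StaleElementReferenceException.class);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

ignoreAll(Collection):

List<Class<? extends Throwable>> exceptions = new ArrayList<Class<? extends Throwable>>();
// illustrative entries
exceptions.add(NoSuchElementException.class);
exceptions.add(StaleElementReferenceException.class);
FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver)
          .ignoreAll(exceptions);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

Here I have used an ArrayList, but you can use any class that implements the Collection interface.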

Fluent Wait - Few more methods


FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver)
          .pollingEvery(5, SECONDS)
          .withMessage("I need this info for more clarification")
          .withTimeout(30, SECONDS);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

Fluent Wait - timeoutException

This one is a bit tricky. By overriding the timeoutException method, we can throw a different exception when the wait times out. Consider the following code, in which the timeoutException method is overridden.

FluentWait<WebDriver> wait = new FluentWait<WebDriver>(driver) {
    @Override
    protected RuntimeException timeoutException(String message, Throwable lastException) {
        throw new MyException("A New Exception instead of TimeOutException");
    }
}.withTimeout(2, SECONDS);
WebElement foo = wait.until(new Function<WebDriver, WebElement>() {
  public WebElement apply(WebDriver driver) {
     return driver.findElement(By.id("foo"));
  }
});

Fluent Wait - timeoutException cont.

Code for the custom exception:

  class MyException extends RuntimeException {
    public MyException(String message) {
      super(message);
    }
  }

So when the timeout occurs, a custom exception is thrown instead of TimeoutException:

com.example.tests.MyException: A New Exception instead of TimeOutException
at com.example.tests.sampleTest$1.timeoutException(
at com.example.tests.sampleTest.test(

Fluent Wait <T>

One more thing to note about FluentWait is its use of generics.

So it is not mandatory to pass the driver as the argument; you can pass a WebElement, a By or anything else.

WebElement ele = driver.findElement(By.id("id"));
FluentWait<WebElement> wait = new FluentWait<WebElement>(ele);
Boolean matched = wait.until(new Function<WebElement, Boolean>() {
  public Boolean apply(WebElement e) {
     return e.getText().equals("xyz");
  }
});


If you understood the FluentWait class, there is not much left to learn for WebDriverWait.

The following from the JavaDoc should be enough:

public class WebDriverWait
extends FluentWait<WebDriver>
A specialization of FluentWait that uses WebDriver instances.

It just fixes the generic type of FluentWait to WebDriver, making it a special case for WebDriver.

WebDriverWait - ExpectedCondition Interface

Just as WebDriverWait is a specialization of FluentWait for WebDriver, the ExpectedCondition interface is a specialization of Function<S, T> with the input type fixed to WebDriver.

  public interface ExpectedCondition<T> extends Function<WebDriver, T>

If you are using WebDriverWait or FluentWait<WebDriver>, you can use ExpectedCondition as follows:

  wait.until(new ExpectedCondition<WebElement>() {
    public WebElement apply(WebDriver driver) {
       return driver.findElement(By.id("Foo"));
    }
  });

From a purely syntactic point of view, we are creating an anonymous class that implements the ExpectedCondition interface.

WebDriverWait - ExpectedConditions Class

The ExpectedConditions class simplifies use of the ExpectedCondition interface; it has plenty of static methods that return objects of type ExpectedCondition<T>.

If you are using WebDriverWait or FluentWait<WebDriver>, you can use ExpectedConditions as follows:


If a method in ExpectedConditions meets your requirement, use it; it makes your code more readable. Refer to the Selenium docs for the full list of methods.

Thread Sleep

If you know Java, you know that you can wait using Thread.sleep(timeInMillis); but Thread.sleep() is not recommended, because it always waits the full time, which is the worst case of waiting.

However, when you convert a Selenium IDE recording to Java code, waitFor Selenese commands get translated into something like the following, which is nothing but the simplest implementation of a fluent wait :)

  for (int second = 0;; second++) {
    if (second >= 60) fail("timeout");
    try { if (driver.findElement(By.linkText("Selenium - Web Browser Automation")).isDisplayed()) break; } catch (Exception e) {}
    Thread.sleep(1000);
  }

If you analyse it carefully, this simple piece of code provides functionality like pollingEvery, withTimeout and ignoring. It is shown for understanding only and is not recommended.

WebDriver Capabilities

DesiredCapabilities Class

DesiredCapabilities helps to set properties of WebDriver. These properties are called capabilities. Capabilities are key/value pairs, each representing an aspect of a browser, and so many of them are browser specific.

If this is not clear, don't worry; we will see a few use cases for different browsers, which will make the idea of capabilities clear.

Let us look at the methods of the DesiredCapabilities class:

Keep in mind that DesiredCapabilities has many methods, but most of them do just one task: setting a key/value pair, where the key is the capability of the browser you want to set and the value is its desired value.

The android(), chrome(), firefox(), htmlUnit(), htmlUnitWithJs(), internetExplorer(), ipad(), iphone(), opera(), phantomjs() and safari() methods do nothing but return a DesiredCapabilities object with three properties:

  • BROWSER_NAME - Valid values come from the BrowserType interface, e.g. IE, FIREFOX...
  • VERSION - Mostly blank
  • PLATFORM - Valid values come from the Platform enum, e.g. WINDOWS, UNIX, ANDROID...

DesiredCapabilities Class - Cont.

So if you call the chrome() method, like DesiredCapabilities dc = DesiredCapabilities.chrome();

it just returns a DesiredCapabilities object dc with three capabilities, as follows:

  • BROWSER_NAME = "chrome"
  • VERSION = ""
  • PLATFORM = Platform.ANY

You can now set other capabilities using this dc object.

To set capabilities, DesiredCapabilities has four overloaded setCapability methods and a merge method:

void setCapability(java.lang.String capabilityName, boolean value) 
void setCapability(java.lang.String key, java.lang.Object value) 
void setCapability(java.lang.String capabilityName, Platform value) 
void setCapability(java.lang.String capabilityName, java.lang.String value) 
DesiredCapabilities merge(Capabilities extraCapabilities)

DesiredCapabilities Class - Cont.

All other methods of DesiredCapabilities are straightforward and self-explanatory:

DesiredCapabilities Class - Cont.

Once all desired capabilities are set on a DesiredCapabilities object, you just need to pass it as an argument when creating your WebDriver object.


DesiredCapabilities capabilities = DesiredCapabilities.internetExplorer();
capabilities.setCapability(InternetExplorerDriver.INTRODUCE_FLAKINESS_BY_IGNORING_SECURITY_DOMAINS, true);
WebDriver myTestDriver = new InternetExplorerDriver(capabilities);

One important point: Selenium is designed to ignore capabilities that are not supported by the requested browser. So if you set a capability that the browser does not support, you will not get an error, but that does not mean the driver will work with that capability.

ChromeOptions Class

To set Chrome capabilities you can use ChromeOptions instead of DesiredCapabilities and pass it directly as an argument to your driver instance. ChromeOptions uses DesiredCapabilities underneath.

All you want to know about ChromeOptions is available at

Let's see a few examples using ChromeOptions and DesiredCapabilities; afterwards you can use either approach at your convenience.

chromeOptions is only recognized by chromedriver 17.0.963.0 or newer.

Although the examples show both ChromeOptions and DesiredCapabilities, I could not set arguments using DesiredCapabilities, so I recommend ChromeOptions.

Example 1: You want to start the browser with specific arguments
Using DesiredCapabilities:

DesiredCapabilities dc = DesiredCapabilities.chrome();
// further switches, such as --user-data-dir=<profile path>, can be added to this array
String[] switches = { "--ignore-certificate-errors" };
dc.setCapability("chrome.switches", Arrays.asList(switches));
WebDriver driver = new ChromeDriver(dc);

Using ChromeOptions:

ChromeOptions ops = new ChromeOptions();
ops.addArguments("--ignore-certificate-errors");
WebDriver driver = new ChromeDriver(ops);
  • 'user-data-dir' is used to specify the path of a custom profile. By default a new temporary profile is created.
  • You can get all Chrome arguments here.
  • The capability name for Chrome arguments is "chrome.switches" or "chromeOptions.args"; this is not clear from the ChromeDriver documentation, but most examples use the "chrome.switches" capability.

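Example 2: Using a Chrome executable in a non-standard location
Using DesiredCapabilities:

DesiredCapabilities dc = DesiredCapabilities.chrome();
// placeholder path; point this at your own Chrome executable
dc.setCapability("chrome.binary", "/path/to/other/chrome/binary");
WebDriver driver = new ChromeDriver(dc);

Using ChromeOptions:

ChromeOptions ops = new ChromeOptions();
// placeholder path; point this at your own Chrome executable
ops.setBinary("/path/to/other/chrome/binary");
WebDriver driver = new ChromeDriver(ops);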

Example 3: Set a Chrome Preference
Using DesiredCapabilities:

DesiredCapabilities dc = DesiredCapabilities.chrome();
Map<String, Object> chromePrefs = new HashMap<String, Object>();
chromePrefs.put("session.restore_on_startup", 1);
dc.setCapability("chrome.prefs", chromePrefs);

Using ChromeOptions:

ChromeOptions options = new ChromeOptions();
Map<String, Object> prefs = new HashMap<String, Object>();
prefs.put("session.restore_on_startup", 1);
options.setExperimentalOption("prefs", prefs);
  • A list of all preferences can be found here.
  • For example, in the file you will find something like this:
  • // Boolean that is true when Suggest support is enabled.
    const char kSearchSuggestEnabled[] = "search.suggest_enabled";
    The name contains namespaces and an option, so in the master preferences you might have the following (in JSON format) corresponding to this preference:
    "search": {  "suggest_enabled": true }
  • The usual path of master_preferences is "C:\Program Files\Google\Chrome\Application\master_preferences", so you can refer to it for the required preferences.
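
The relation between a dotted preference name and the nested JSON form can be sketched in plain Java. This is only an illustration of the naming scheme described above; the class PrefNesting and its put method are made up and are not part of Selenium or ChromeDriver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrefNesting {
    // Stores value under a dotted name such as "search.suggest_enabled",
    // creating one nested map per namespace segment.
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> root, String dottedName, Object value) {
        String[] parts = dottedName.split("\\.");
        Map<String, Object> current = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = current.get(parts[i]);
            if (!(child instanceof Map)) {
                child = new LinkedHashMap<String, Object>();
                current.put(parts[i], child);
            }
            current = (Map<String, Object>) child;
        }
        current.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> prefs = new LinkedHashMap<>();
        put(prefs, "search.suggest_enabled", true);
        put(prefs, "session.restore_on_startup", 1);
        // prints {search={suggest_enabled=true}, session={restore_on_startup=1}}
        System.out.println(prefs);
    }
}
```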

FirefoxProfile class

You can see all browser-specific preferences by typing about:config in the Firefox address bar.

To set browser preferences, three variants of setPreference() are available.

FirefoxProfile class

There are a few other WebDriver-specific properties, for which FirefoxProfile provides the following methods:

And a few to deal with extensions:

The remaining methods just get or set preferences in different formats, plus a few protected methods to override when extending FirefoxProfile.

FirefoxProfile class

Example 1: Running with the Firebug extension

Download the Firebug xpi file from Mozilla and start the profile as follows:

   File file = new File("firebug-1.8.1.xpi");
   FirefoxProfile firefoxProfile = new FirefoxProfile();
   firefoxProfile.addExtension(file);
   firefoxProfile.setPreference("extensions.firebug.currentVersion", "1.8.1"); // avoid the startup screen
   WebDriver driver = new FirefoxDriver(firefoxProfile);

Example 2: Download to a specific path without prompting for the path

   FirefoxProfile profile = new FirefoxProfile();
   profile.setPreference("browser.download.folderList", 2); // 2 = download to the custom path set below
   profile.setPreference("browser.download.manager.showWhenStarting", false);
   profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "image/jpg, text/csv,text/xml,application/xml,application/vnd.ms-excel,application/x-excel,application/x-msexcel,application/excel,application/pdf");
   profile.setPreference("browser.download.dir", System.getProperty("user.home"));
   WebDriver driver = new FirefoxDriver(profile);

IE Capabilities

The following is the list of IE capabilities, from


IE Capabilities


   DesiredCapabilities caps = DesiredCapabilities.internetExplorer();
   caps.setCapability("ignoreZoomSetting", true);
   WebDriver driver = new InternetExplorerDriver(caps);

Also, IE does not support multiple profiles the way Chrome and Firefox do, so you may decide to do some manual configuration of IE before you start testing with WebDriver. In that case you may not need to work much with capabilities. - From an answer by Jim Evans on StackOverflow

You can get the capabilities of the Opera and Safari browsers and some more information from here.


Create Actions

Handle (or Eat) Cookies

Select Class

Run JavaScript

Take Snapshot

Create Actions

As we saw, the keyPress() and click() events were bound to a WebElement or WebDriver instance. But nowadays web applications are becoming more interactive. What if you just want to

- move your mouse to a location, OR

- do a drag and drop, OR

- click on multiple elements while pressing a key?

The Actions class provides a way to perform advanced user interactions with your web application. You will find everything about Actions here.

Action Class Methods

Action Class Examples

build() is the most important method of the Actions class; it creates a composite action from the chained actions.

Few Examples:

Actions singleAction = new Actions(driver);    

Actions multipleActions = new Actions(driver);

I will not go into much detail here; I leave it for you to play around with.

Handle Cookies

The WebDriver interface has a method named manage(). It returns the WebDriver.Options interface, which supports the following methods for handling cookies, so you can get, set and delete cookies as your testing needs.

Handle Cookies - Examples

Add Cookie

Cookie name = new Cookie("mycookie", "123456789123");
driver.manage().addCookie(name);

Get Cookies

Set<Cookie> cookiesList = driver.manage().getCookies();
for (Cookie getcookies : cookiesList) {
    System.out.println(getcookies);
}

Delete Cookie


Select Class

If you are dealing with a select tag, Selenium provides the Select class. Using it is not mandatory, but it makes dropdown handling easier.


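// selectByIndex(0) is illustrative; any select* method can be used here
new Select(driver.findElement(By.id("field_system_type")))
 .selectByIndex(0);
new Select(driver.findElement(By.id("field_type")))
 .selectByVisibleText("0 - National ID");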

Run JavaScript

Yes, you can run JavaScript using Selenium methods:

if (driver instanceof JavascriptExecutor) {
	((JavascriptExecutor) driver)
	.executeScript("alert('hello world');");
}


Take Screenshot

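// imports: java.io.File, java.nio.file.Files, java.sql.Timestamp
try {
    File screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
    // fileName (a String label) and date (a java.util.Date) are assumed to be defined earlier
    String fName = "Screen_" + fileName + (new Timestamp(date.getTime())).toString().replaceAll("[ :.-]", "_") + ".png";
    File image = new File("capturedScreens\\" + fName);
    // copy the temporary screenshot file to the desired location
    Files.copy(screenshot.toPath(), image.toPath());
} catch (Exception e) {
    e.printStackTrace();
}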

In the code above we take a screenshot by casting the WebDriver object to the TakesScreenshot interface.

Here we capture the screen as a file, then rename it and copy it to the desired location.
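
The renaming and copying steps can be tried on their own, without a browser. The sketch below is a plain-Java illustration under assumptions: the class ScreenshotFiles and its methods fileNameFor and store are hypothetical names, an extra underscore is placed between the label and the timestamp, and an empty temporary file stands in for the file that getScreenshotAs(OutputType.FILE) would return.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.sql.Timestamp;

class ScreenshotFiles {
    // Builds a file name from a label and a timestamp, replacing the
    // characters that are awkward in file names (space, colon, dot, dash).
    static String fileNameFor(String label, Timestamp takenAt) {
        String stamp = takenAt.toString().replaceAll("[ :.-]", "_");
        return "Screen_" + label + "_" + stamp + ".png";
    }

    // Copies the temporary screenshot into targetDir under the given name.
    static Path store(Path screenshot, Path targetDir, String name) {
        try {
            Files.createDirectories(targetDir);
            return Files.copy(screenshot, targetDir.resolve(name),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the temporary file a driver would hand back.
        Path fake = Files.createTempFile("screenshot", ".png");
        String name = fileNameFor("login", new Timestamp(System.currentTimeMillis()));
        System.out.println(store(fake, Files.createTempDirectory("capturedScreens"), name));
    }
}
```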

Take Screenshot -cont.

  • Selenium takes screenshots in PNG format, but you can obtain them in three different forms: as a file, as a BASE64 string or as a byte array. For more detail look at the JavaDoc of the OutputType interface.
  • You can also use TakesScreenshot with WebElements.
  • What a screen capture contains depends on what the browser returns, in the following order of preference: the entire page, the current window, the visible portion of the current frame, and a screenshot of the entire display containing the browser. For a WebElement extending TakesScreenshot the preferences are the entire content of the WebElement, then the visible portion of the WebElement.