Monday, February 4, 2013

PRACTICUM: Selenium 2 Testing Tools Beginner's Guide: Selenium WebDriver

This is the next installment (in progress) of my PRACTICUM series. This particular grouping is going through David Burns' "Selenium 2 Testing Tools Beginner's Guide".

Note: PRACTICUM is a continuing series in what I refer to as a somewhat "Live Blog" format. Sections may vary in time to complete. Some may go fast. Some may take much more time to get through. Updates will be at least daily, and may be more frequent. Feel free to click refresh to get the latest version at any given time. If you see "End of Section" at the bottom of the post, you will know that this entry is finished :).

Chapter 3: Overview of Selenium WebDriver

What makes this book different from the first book is that, at this stage of the first book, we would be going through a lot more Selenium IDE and configuration aspects, getting ready to introduce Selenium RC around Chapter 6. This time around, though, we are looking at a totally different architecture, and David has made the decision to speed up the process and get us into the server aspects of Selenium, specifically WebDriver. As many of you may know (and some of you may not), Selenium and WebDriver merged a couple of years ago, and the architectures are somewhat different.

History of Selenium

I'm not going to go through everything that David wrote here, but suffice it to say that Selenium has had quite an exciting run over the past few years:

- Selenium, when originally created by Jason Huggins, solved the issue of getting the browser to do user interactions.

- Selenium was limited by the JavaScript sandbox in browsers. One big issue was the Same Origin Policy. Moving from HTTP to HTTPS, as you would in a login, the browser would block the action because we are no longer in the same origin.

- The Selenium API was focused on running tests written in HTML using a three column design (similar to FIT). Selenium IDE is a good example: the three input boxes (command, target, value) match to this framework.

- Patrick Lightbody and Paul Hammant created Selenium Remote Control (RC) using Java as a web server that would proxy traffic. It was designed to utilize the same three column approach as the IDE. Different languages and bindings were allowed to be used, provided they adhered to the three column "Selenese" format.

- There are now somewhere around 140 methods available in the Selenium API. Lots of choices, but figuring out which one to use? Hmmm, that's getting to be a challenge.

- Selenium RC was starting to struggle with the development of HTML5 and Mobile devices. A different approach was going to be needed to handle these changes.
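
Since the Same Origin Policy comes up in the list above, here is a minimal sketch of the rule as I understand it: two URLs share an origin only when scheme, host, and port all match. The class and method names (OriginCheck, sameOrigin) are my own for illustration, the default ports are filled in by hand, and a real browser's check has more corner cases than this.

```java
import java.net.URI;

/** Toy illustration of the same-origin rule: scheme, host, and port must all match. */
final class OriginCheck {

    // Fill in the default port so http://example.com and http://example.com:80 compare equal.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }

    public static void main(String[] args) {
        // Same scheme, host, and port: prints true
        System.out.println(sameOrigin("http://example.com/home", "http://example.com/login"));
        // HTTP to HTTPS, as in a typical login redirect: prints false
        System.out.println(sameOrigin("http://example.com/home", "https://example.com/login"));
    }
}
```

The second call is the login case from the bullet: only the scheme changed, but that is enough to put the page in a different origin, which is where in-browser JavaScript gets stuck.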

Simon Stewart started working on the WebDriver project. Starting with HTMLUnit and Internet Explorer, Simon designed the WebDriver API around object-oriented principles.

Back in 2011, I was at the Selenium Conference in San Francisco when Simon announced that Selenium and WebDriver were going to be merged, and that Selenium 2 would effectively be WebDriver.

WebDriver Architecture

Selenium is written in JavaScript. This JavaScript emulates user actions and automates the browser from within the browser. WebDriver aims to work outside of the browser through API calls (specifically the accessibility API common to most web browsers).

Depending on the browser's implementation, it uses the language appropriate for that browser to use the accessibility API (Firefox uses JavaScript, Internet Explorer uses C++).

Upside, we can control browsers in the best possible way. Downside, new browsers will not be initially supported.

Where a native language API approach won't work, JavaScript will be injected into the page.

The four parts of the WebDriver architecture are summed up as follows:

WebDriver API

The WebDriver API is the part of the system that you interact with directly. The WebDriver API is much briefer than the 140-method API for Selenium RC. Calls on the WebDriver and WebElement objects look like this:

WebElement element = driver.findElement(By.name("q"));
element.sendKeys("I love cheese");

These commands are then translated and passed to the WebDriver SPI.

WebDriver SPI

The Stateless Programming Interface (SPI) breaks down the element, uses a unique ID, and then calls the relevant command. The code in the previous section after it arrived in the SPI would look like this:

findElement(using="name", value="q")
sendKeys(element="webdriverID", value="I love cheese")
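
To make that breakdown a little more concrete, here's a toy model in plain Java (no Selenium jars needed) of how an object-style call can be flattened into stateless commands that carry an element ID. Everything here (ToySpi, ToyElement, the "element-0" ID scheme) is made up for illustration; it is not how the real WebDriver classes are written, just a sketch of the idea.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: object-style calls are recorded as flat, stateless commands. */
final class ToySpi {

    // The flat commands, in the order they were issued.
    final List<String> commands = new ArrayList<String>();
    private int nextId = 0;

    /** Loosely imitates driver.findElement(By.name(...)). */
    ToyElement findElementByName(String name) {
        String id = "element-" + nextId++; // hypothetical ID scheme
        commands.add("findElement(using=\"name\", value=\"" + name + "\")");
        return new ToyElement(this, id);
    }

    /** A handle that holds nothing but the ID; every command restates what it needs. */
    static final class ToyElement {
        private final ToySpi spi;
        private final String id;

        ToyElement(ToySpi spi, String id) {
            this.spi = spi;
            this.id = id;
        }

        void sendKeys(String text) {
            spi.commands.add("sendKeys(element=\"" + id + "\", value=\"" + text + "\")");
        }
    }

    public static void main(String[] args) {
        ToySpi spi = new ToySpi();
        spi.findElementByName("q").sendKeys("I love cheese");
        for (String command : spi.commands) {
            System.out.println(command);
        }
    }
}
```

The point of the sketch is that the element object on the client side is just a handle. Each command that goes out names the element by ID, so the receiving end does not have to remember anything about the caller.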

JSON Wire Protocol

The WebDriver developers created a transport mechanism called the JSON Wire Protocol. This protocol transports the necessary elements to the code that controls it.
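
As a rough sketch of what travels over the wire, the two commands from the SPI section map (as I read the JSON Wire Protocol docs) to HTTP POST requests with small JSON bodies. The WireRequest class, the session ID "abc", and the element ID "0" below are placeholders of mine; a real session gets its IDs from the server, and a real client would use a proper JSON library rather than string concatenation.

```java
/** Toy builder for the request shapes used by two JSON Wire Protocol commands. */
final class WireRequest {
    final String method;
    final String path;
    final String body;

    WireRequest(String method, String path, String body) {
        this.method = method;
        this.path = path;
        this.body = body;
    }

    // Minimal JSON string quoting: handles backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Find Element: POST /session/:sessionId/element */
    static WireRequest findElement(String sessionId, String using, String value) {
        String body = "{\"using\":" + quote(using) + ",\"value\":" + quote(value) + "}";
        return new WireRequest("POST", "/session/" + sessionId + "/element", body);
    }

    /** Send Keys: POST /session/:sessionId/element/:id/value (keys go in an array) */
    static WireRequest sendKeys(String sessionId, String elementId, String text) {
        String body = "{\"value\":[" + quote(text) + "]}";
        return new WireRequest("POST",
                "/session/" + sessionId + "/element/" + elementId + "/value", body);
    }

    public static void main(String[] args) {
        WireRequest find = findElement("abc", "name", "q");
        WireRequest keys = sendKeys("abc", "0", "I love cheese");
        System.out.println(find.method + " " + find.path + " " + find.body);
        System.out.println(keys.method + " " + keys.path + " " + keys.body);
    }
}
```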

Selenium Server

The Selenium server, or browser, uses the JSON Wire commands, breaks down the JSON object, and then does what it needs to do.

The Merging of Two Projects

Simon Stewart and Jason Huggins thought it would be good to merge the two projects, and from this came Selenium 2. The Selenium core developers have been working to simplify the code base and remove duplication where possible. Selenium Atoms has grown out of this, and these are shared between the two projects.

Setting up IntelliJ IDEA

Everyone is going to do this slightly differently depending on the platform that they have. If you are using Darwin, it's pretty straightforward. Download IntelliJ IDEA and drag the application over to the Applications folder.

Note: for this purpose you will want to either download the Community Edition, or the 30 day trial of the Ultimate Edition. To keep things simple, I'd suggest the Community version and, if you decide that IDEA is just all that, then you could upgrade and shell out the $199 for the Ultimate version. Your call. Everything I'll be doing will be with the Community version.

Also, the version being referenced in the book is version 11. I'm using build 11.1.5 for the examples. If you want to follow along with the book, I'd suggest getting the same version. Version 12 is the latest, but things are in different places. If you are not familiar with the IDE, and don't want to do a continuous mental conversion, I think it's best to follow along with something that matches the rest of the book.

Next step is to actually create a project and link up everything you will need. First, Apple ships with a version of Java, but if you want to update it to the latest version, go and get the JavaSE SDK for MacOS (the current version as of today is ).

Once you have installed or located the version of the Java SDK that you will be using, you need to link it to IntelliJ IDEA. This is done at the project level. In Platform settings, you can choose the SDK. For me, that's at /Library/Java/JavaVirtualMachines/jdk1.7.0_13.jdk. Yours may be in a different place; the point is, load that file, and the system will populate a number of tabs (Classpath, Sourcepath, Annotations, and Documentation Path).

Next, you want to create a test folder for your project, which requires doing the following:

- get the project you want to work with set up, and view the project in the project pane.
- ctrl-click on the name of the project and select "New: Directory"
- call this directory "test"
- click on "File: Project Structure" (or, if you want to be a keyboard ninja, use "[command]-;")
- click on "Modules" and select the "Sources" tab under the project name. Here you will see your source folder and on the right, the file tree with your new "test" directory.
- highlight the "test" folder with the mouse, then click "Test Sources". If the test folder turns green, you're good.

- to the left, click on "Global Libraries". Click the "+" button in the middle column and go and link to selenium server and junit (if you don't have junit yet, download it from
- create Global library entries for both Selenium Server and for junit.
- go back to Modules and click on the "Dependencies" tab. Select both selenium server and junit jars to associate them with this project.
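
One quick way to sanity-check the steps above is a tiny class that asks the JVM whether the library classes can be found. This is just a sketch: ClasspathCheck and isOnClasspath are names I made up, and I'm assuming that org.junit.Test (JUnit 4) and org.openqa.selenium.WebDriver are reasonable marker classes for the two jars. Drop it in the project and run it from the IDE.

```java
/** Reports whether marker classes from the linked libraries are visible to this project. */
final class ClasspathCheck {

    // Looks the class up by name without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Marker classes are assumptions: one from JUnit 4, one from Selenium WebDriver.
        String[] expected = {"org.junit.Test", "org.openqa.selenium.WebDriver"};
        for (String name : expected) {
            System.out.println(name + ": " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If either line says MISSING, the Dependencies tab for the module is the first place I'd look.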

Note: my environment differs from what the book is looking for, in that it calls for a selenium.jar and a common.jar, which would be in the same place as the selenium-server.jar. My problem with this is that I have downloaded both the 2.28 and 2.29 zip files for selenium server, and neither of these mentioned jars is part of those distributions. There are a variety of commons jars in the lib for the server distribution, but none by these names. It's possible that this is a standard option for those who build the source for Linux, and if so, this is a piece of implicit knowledge that is not included with the book. Again, I'm running Darwin, and without doing a dig for other sources, I don't see these jar files, but Google Docs says concerning selenium-server-standalone... "Note that this JAR contains all of the required dependencies". I can confirm that the standalone server starts up, so for now, that's what I'm going with.

With both of these jar files set up as global libraries, we should be ready to go. Just for grins, let's fire up selenium server by running "java -jar [name_of_selenium_server_you_are_using].jar" and see what we get. For me, it looks like this:

$ java -jar selenium-server-standalone-2.28.0.jar
Feb 05, 2013 9:25:55 AM org.openqa.grid.selenium.GridLauncher main
INFO: Launching a standalone server
09:26:01.321 INFO - Java: Oracle Corporation 23.7-b01
09:26:01.323 INFO - OS: Mac OS X 10.8.2 x86_64
09:26:01.357 INFO - v2.28.0, with Core v2.28.0. Built from revision 18309
09:26:01.658 INFO - RemoteWebDriver instances should connect to:
09:26:01.660 INFO - Version Jetty/5.1.x
09:26:01.661 INFO - Started HttpContext[/selenium-server/driver,/selenium-server/driver]
09:26:01.661 INFO - Started HttpContext[/selenium-server,/selenium-server]
09:26:01.661 INFO - Started HttpContext[/,/]
09:26:01.675 INFO - Started org.openqa.jetty.jetty.servlet.ServletHandler@28ac3f94
09:26:01.676 INFO - Started HttpContext[/wd,/wd]
09:26:01.679 INFO - Started SocketListener on
09:26:01.679 INFO - Started org.openqa.jetty.jetty.Server@1c4f8ea3
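
If you'd rather confirm from code that the server is really accepting connections, a plain socket probe is enough. This is a sketch under assumptions: ServerProbe and isListening are my own names, and the host and port in main are placeholders. I believe 4444 is the Selenium Server default, but use whatever address your own log reports.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Checks whether something is accepting TCP connections on a host and port. */
final class ServerProbe {

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat all of these as "not listening".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass your own host and port as arguments if they differ.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4444;
        System.out.println(host + ":" + port + " listening? " + isListening(host, port, 2000));
    }
}
```

This only tells you that a port is open, not that it's Selenium on the other end, but paired with the log output above it's a reasonable smoke test.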

OK, that wasn't so bad, although I will admit it took me a few times circling around to see if this would work. I understand why David wanted to use an IDE for this: while it's a lot of moving parts to keep track of and get working together, it makes it simpler to write the examples going forward, since we all understand that everything from here on is taking place in this IDE, with this setup, and with these dependencies. It's entirely possible I may find that what I have is inadequate and I may have to do something different to make this work later on, but I'll jump off that bridge when I get to it.

For now, if you have done what I have done, you now have IntelliJ IDEA 11.1.5 that links to JDK 1.7, junit 4.11 and selenium server standalone 2.28 (you can of course use whatever versions you want to :) ). And that about does it for this chapter. Next time, we'll be looking at design patterns for Selenium WebDriver and how they can, hopefully, make test creation, and our lives, a little easier.

End of Section


sumit kher said...

I am comparatively new to Selenium. I have played a little with Selenium IDE and am now trying Selenium RC, but I am baffled about which scripting language to use there. Can you tell me which language is most used with Selenium across industries? I also came across this course on Selenium automated web browser testing; is it good? If someone learns it in Java and then joins a company where everyone uses Ruby, it will be a pain to learn Ruby all over again. It would also be great if you could offer a comparison of the available languages (Perl, Python, Ruby, Java, etc.) or any other guidance. I would really appreciate the help, and I would also like to thank you for all the information you are providing.

Michael Larsen said...

Sumit, the answer for this particular project is that, since David wrote the book and used Java for the examples, I chose to likewise work through the book in Java, so I could see where the book lined up with real world use and whether the examples worked as advertised. The answer is, a lot of the time, yes, but sometimes the code displayed in the book is not accurate; there are a number of errata listed, and I'm planning to share a few more with them in the coming days.

As to which language to learn, I think it's important to use the same language (where practical) that the development team is using. I've used Selenium in environments with Ruby, with Java and also with Perl, so it really comes down to what the team needs, and what you feel comfortable working in. I may try something later on where I look to see if I can convert the examples in this book and in the Cookbook from Java to Ruby or Perl, but that will come much later :).