Opening and Closing Tabs Using Selenium
In this post I reference methods in the Selenium C# client driver. Equivalent methods should exist in whichever other language client driver you use.
Some testing requires opening a new window, performing an action, then closing that window, perhaps even returning to the original window to continue the test. There are many ways of managing tabs in Selenium, so let’s take a look at what works and what doesn’t.
Tabs = Windows
First of all, Selenium has no concept of what a tab is or how it differs from a window. Each WebDriver instance holds a reference to its current window handle, which points to the tab or window that the driver is currently interacting with. When managing tabs, we have to be able to create a new window handle, switch to it, do some work, then switch back to our original window handle.
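To make the idea concrete, here is a minimal sketch that lists every handle a driver knows about and marks the one it is currently pointed at. It assumes an already-initialized IWebDriver; the names WindowHandleInspector and PrintHandles are made up for this example.

```csharp
using System;
using OpenQA.Selenium;

public static class WindowHandleInspector
{
    // Hypothetical helper: prints each window handle and flags the one in focus.
    public static void PrintHandles(IWebDriver driver)
    {
        // The handle of the tab or window the driver is currently sending commands to.
        string current = driver.CurrentWindowHandle;

        // WindowHandles contains one opaque string per open tab or window.
        foreach (string handle in driver.WindowHandles)
        {
            string marker = handle == current ? " (current)" : "";
            Console.WriteLine(handle + marker);
        }
    }
}
```

Nothing in the output tells tabs and windows apart, which is the point: as far as the driver is concerned they are all just handles.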
Creating Tabs
There are many ways of opening a new tab. You could do it by clicking a link, using a keyboard shortcut, initializing a new WebDriver instance, or even typing some JavaScript into the developer console. Unfortunately, not all of these methods are reliable in Selenium. For instance, having the WebDriver send the keyboard command “Ctrl + T” would open a new tab when testing locally on my machine, but when running the test using Selenium Remote Server, the keyboard command would be ignored entirely. There are also known bugs related to sending keyboard shortcuts through the WebDriver, some dating as far back as 2012. Initializing a new WebDriver works, but it is quite resource intensive and requires keeping track of the state of multiple WebDriver instances. Clicking a link is also somewhat unreliable, since Selenium does not automatically follow a link that uses target="_blank" into the tab it opens.
The most reliable method I have found is to create your tabs using JavaScript. This is done by simply executing a window.open().
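As a rough illustration of the JavaScript approach, the sketch below opens a blank tab through the IJavaScriptExecutor interface and then waits for the extra handle to show up. The helper name OpenBlankTab and the five-second timeout are assumptions made for this example, the cast assumes your driver implements IJavaScriptExecutor (the standard browser drivers do), and WebDriverWait comes from the Selenium.Support package.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class TabOpener
{
    // Hypothetical helper: opens a blank tab via window.open() and waits for its handle to appear.
    public static void OpenBlankTab(IWebDriver driver)
    {
        int countBefore = driver.WindowHandles.Count;

        // window.open() with no arguments opens a blank page in a new tab or window.
        ((IJavaScriptExecutor)driver).ExecuteScript("window.open();");

        // Assumed timeout of 5 seconds; tune it for your environment.
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(5));
        wait.Until(d => d.WindowHandles.Count > countBefore);
    }
}
```

After this call the driver is still focused on the original tab; opening a tab does not switch to it.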
Switching Window Handles
So now you have a new tab in your browser, but you still need to tell your WebDriver to switch to it, otherwise commands will continue to be sent to the original tab. This is actually pretty easy and is done using SwitchTo().Window(), into which you pass the window handle that you want to switch to.
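One detail worth hedging against: the WebDriver specification does not promise any particular order for the list of window handles, so picking the new tab by position may not be safe on every browser. A sketch of an alternative is to snapshot the handles before opening the tab and look for the one that was not there before. TabSwitcher, FindNewHandle and SwitchToNewTab are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using OpenQA.Selenium;

public static class TabSwitcher
{
    // Returns the one handle that is in 'after' but not in 'before'.
    // Single() throws if zero or several new handles are found.
    public static string FindNewHandle(IEnumerable<string> before, IEnumerable<string> after)
    {
        return after.Except(before).Single();
    }

    // Switches the driver to the tab that appeared since 'handlesBefore' was captured
    // and returns that tab's handle.
    public static string SwitchToNewTab(IWebDriver driver, IEnumerable<string> handlesBefore)
    {
        string newHandle = FindNewHandle(handlesBefore, driver.WindowHandles);
        driver.SwitchTo().Window(newHandle);
        return newHandle;
    }
}
```

To use it, capture something like driver.WindowHandles.ToList() before opening the tab, open the tab, then pass that snapshot to SwitchToNewTab.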
The Code
So putting everything together, here is what the code would look like to:
- Create a new tab
- Switch to it
- Do something in the new tab
- Close the new tab
- Switch back to our original tab
// ExecuteJavaScript is an extension method from the Selenium.Support package
using OpenQA.Selenium.Support.Extensions;

// save a reference to our original tab's window handle
var originalTabInstance = myWebDriverInstance.CurrentWindowHandle;
// execute some JavaScript to open a new window
myWebDriverInstance.ExecuteJavaScript("window.open();");
// save a reference to our new tab's window handle, this would be the last entry in the WindowHandles collection
var newTabInstance = myWebDriverInstance.WindowHandles[myWebDriverInstance.WindowHandles.Count - 1];
// switch our WebDriver to the new tab's window handle
myWebDriverInstance.SwitchTo().Window(newTabInstance);
// let's navigate to a web site in our new tab (GoToUrl needs a full URL, including the scheme)
myWebDriverInstance.Navigate().GoToUrl("http://www.crowbarsolutions.com");
// now let's close our new tab
myWebDriverInstance.ExecuteJavaScript("window.close();");
// and switch our WebDriver back to the original tab's window handle
myWebDriverInstance.SwitchTo().Window(originalTabInstance);
// and have our WebDriver focus on the main document in the page to send commands to
myWebDriverInstance.SwitchTo().DefaultContent();
This approach works when executing tests both locally and remotely. Keep in mind that you can only execute a window.close() on a tab that was initially opened using a window.open().
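If the tab was not opened by window.open(), one alternative worth trying is WebDriver's own Close() method, which closes whichever window or tab the driver is currently focused on. The sketch below is only an illustration (TabCloser and CloseCurrentTabAndReturn are hypothetical names). It refuses to close the last remaining window, since closing that one quits the browser.

```csharp
using System;
using OpenQA.Selenium;

public static class TabCloser
{
    // Hypothetical helper: closes the focused tab, then returns to 'originalHandle'.
    public static void CloseCurrentTabAndReturn(IWebDriver driver, string originalHandle)
    {
        // Guard: closing the only open window would end the browser.
        if (driver.WindowHandles.Count < 2)
        {
            throw new InvalidOperationException("Refusing to close the only open window.");
        }

        // Close() acts on the window or tab the driver is currently switched to.
        driver.Close();

        // The closed tab's handle is now invalid, so switch back explicitly.
        driver.SwitchTo().Window(originalHandle);
        driver.SwitchTo().DefaultContent();
    }
}
```

Call it while the driver is switched to the tab you want to get rid of, passing the handle you saved before leaving the original tab.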
I appreciate the caveat that it will only close a window opened using the corresponding JavaScript code. I am having difficulty across browsers closing tabs that are opened by the web application rather than the automation script, so I already anticipate that your JavaScript close method won’t work for me. I have also had mixed results using SendKeys with browser shortcuts. One thing I did not know, but which makes sense from your article, is that since Selenium does not differentiate between tabs and windows, it treats them both the same. But that does not seem to hold up in our code, either. Is this perhaps because the extra tab is also opened via the web app instead of the script? What I need is something that will close a tab, regardless of how it was opened, but not the entire browser window.
Actually, JS window.close works fine when the tab is opened from a web app:
// collect the handles; this assumes the first is the parent tab and the second is the tab the app opened
final Set<String> set1 = driver.getWindowHandles();
final Iterator<String> win1 = set1.iterator();
final String parent = win1.next();
final String child = win1.next();
// switch to the tab opened by the web app and wait for it to load
driver.switchTo().window(child);
wait.until(ExpectedConditions.urlContains("https://someUrl" + trackingNumber));
wait.until(ExpectedConditions.visibilityOf(driver.findElement(By.name("InquiryNumber1"))));
Assert.assertEquals(trackingNumber, driver.findElement(By.name("InquiryNumber1")).getAttribute("value"));
// close the child tab with JavaScript, then return to the parent
((JavascriptExecutor) driver).executeScript("window.close();");
driver.switchTo().window(parent);
driver.switchTo().defaultContent();