Wednesday, October 12, 2016

Python and how to get the instance name of a variable

I needed the name of an instance in order to do some error reporting. I'm very lucky that each of these things I need to report on will only have one instance.

So of course, the first thing I did was ask Google. And after a couple of weeks of checking now and again, I was unable to find any solutions already published. I found a lot of "instances don't have names!", which clearly isn't true, but which makes sense from the perspective that an instance name is a pointer to the thing, and there's no backwards-pointing property in the thing being pointed at.

But as with most things, there's always a way. And it turns out that in python, it's actually pretty straightforward:

import gc


def instance_names(self):
    """Return a list of every name currently bound to this instance."""
    result = []
    # Any namespace that refers to the instance (module globals, another
    # object's __dict__, and so on) shows up among the referrers as a dict.
    for referrer in gc.get_referrers(self):
        if isinstance(referrer, dict):
            for name, value in referrer.items():
                # Use 'is' so we only match this exact object, not anything
                # that merely compares equal to it.
                if value is self:
                    result.append(name)
    if not result:
        result = ['unnamed instance']
    return result

Returns a list of all matching instance names. Now, it's possible to create an instance without a name, such as via a generator or a lambda, and I haven't tested what it will return in those cases, as that wasn't what I needed this for. It also fails for getting the name of a function, but that can be gotten in other ways.
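
For example, at module scope (a minimal sketch; Widget and the variable names are just placeholders):

class Widget(object):
    pass


gadget = Widget()
doohickey = gadget  # a second name bound to the same instance

print(instance_names(gadget))  # e.g. ['gadget', 'doohickey']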

I hope this helps someone! :)

Python, Selenium and the dreaded "Timed out receiving message from renderer"

We had a problem: around 15% of the time, tests failed, and when we went to SauceLabs to look at the failure, all we had was a blank browser page with "data;" in the address field.

Folks here had been ignoring this and had just pronounced the automated tests to be flaky and unreliable. With the experience I have, I found this galling.

I looked in the selenium output, and saw "Timed out receiving message from renderer".

So I looked on Google, and found lots of folks reported having this problem. Several bugs have been written, all closed with "can't duplicate". This issue is at least 4 years old as of this writing. I tried changing timeouts, using try/except, and every other thing listed, but improvements were either nonexistent or very modest.

I have solved it, but the solution is an *awful* hack. The one thing it has going for it: our failures have dropped from 15% to 0. (Testing done using a suite with 10,000 cases in it.)

Here's the code at the center of the solution:

webdriver.get('about://blank')
# Build a plain anchor element pointing at the target URL on the internally
# generated page, so we can reach the URL by clicking instead of via get().
my_script = ('var a = document.createElement("a");'
             'var linkText = document.createTextNode("%s");'
             'a.appendChild(linkText);'
             'a.title = "%s";'
             'a.href = "%s";'
             'document.body.appendChild(a);'
             % (url_to_use, url_to_use, url_to_use))
webdriver.execute_script(my_script)
webdriver.set_page_load_timeout(20)

# Click the link, then retry up to three more times if the page still
# hasn't finished loading.
webdriver.click_element_by_text('css=a', url_to_use)
for _ in range(3):
    if page.loaded():
        break
    webdriver.click_element_by_text('css=a', url_to_use)

In the above, page is my page object. The method loaded() checks controls to see if the page has finished loading. And click_element_by_text fetches matching elements and iterates through them to determine whether they have the text specified. If an element does, it clicks it. (Sorry I can't include that code, but it belongs to work, and it would make this sample way too long.)
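
Purely as illustration -- this is not that proprietary code, just a rough sketch of what such helpers might look like using standard Selenium calls (the names and signatures here are invented):

import time


def click_element_by_text(driver, css_selector, text):
    # Click the first element matching the selector whose visible text matches.
    for element in driver.find_elements_by_css_selector(css_selector):
        if element.text.strip() == text:
            element.click()
            return True
    return False


def page_loaded(driver, timeout=20):
    # Very rough "finished loading" check based on document.readyState.
    end_time = time.time() + timeout
    while time.time() < end_time:
        if driver.execute_script('return document.readyState') == 'complete':
            return True
        time.sleep(0.1)
    return False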

In my experiments, it began to look like the integration between the driver and the browser (at least on Chrome) can get into a state where Chrome has failed to load the page but never tells the driver about it. So the driver just eventually times out.

about://blank - I used this because it should render an internally generated page every time. On Chrome, it's a "This site can't be reached" error, which works just fine. Firefox and IE also generate errors or blank pages. But the assumption was that internally stored pages should load every time. And so far, they do.

So by adding the target link and clicking it, I'm bypassing that tight integration.

I've watched the code run, and I've seen it had to retry once in a while, but so far, never more than once.

Please note this has only been tested on Chrome.

I hope somebody finds this helpful! :)

Tuesday, May 31, 2016

The Nature of Frameworks

In my experience, when folks talk about test automation frameworks today, they're talking only about the libraries that you use to isolate things like selectors from the tests. And that is surely an important topic, one I'll address at some future time.

But the 'automation framework' means a lot more than that. Back in the day, we had to think about:

  • Where the source was
  • How we got it built, including all variants
  • Branching including which tests against which branches
  • Strategic prioritization of test implementation (eg, which tests are most important to implement first; what combinatorics; what mix of api, ui, performance and stress testing; etc)
  • Automated test management
  • Automated test data management (eg, preset database images to test against, data for data set tests)
  • Coding standards
  • Training of new staff
  • Design reviews
  • Code reviews
  • Portability, internationalization and localization
  • Automated test execution
  • Automated bug reporting
  • Artifact aging and control
  • Hardware resource allocation and maintenance
  • Crisis management
  • And QA Evangelizing


These days, it's a lot simpler, but it's still more than just libraries. In my current gig:

  • Where the source was -- Github
  • How we got it built, including all variants -- Jenkins
  • Branching including which tests against which branches -- Still managing this by hand
  • Strategic prioritization of test implementation -- Still managing this by hand
  • Automated test management -- Nose and Python's unittest library
  • Automated test data management -- fixtures help, but mostly Still managing this by hand
  • Coding standards -- PEP8 + Our own standards
  • Training for new staff -- Still managing this by hand
  • Design reviews -- Still managing this by hand
  • Code reviews -- Github
  • Portability, internationalization and localization -- Honestly, I haven't run into that in my current gig
  • Automated test execution -- Nose and Jenkins
  • Automated bug reporting -- Jenkins
  • Artifact aging and control -- Still managing this by hand
  • Hardware resource allocation and maintenance -- SauceLabs and various cloud services like Amazon or Google.
  • Crisis management -- Still managing this by hand
  • And QA Evangelizing -- I have to admit, at my current gig, this hasn't been an issue.

While there are now systems to handle these things, they still need to be orchestrated. And as a Test Automation Architect, seeing to those systems and how they serve the delivery of useful results is still in my purview. But it sure is nice not to have to build all of it by hand.

This note has been percolating in my head for some time. It's in response to this: http://www.softwaretestinghelp.com/test-automation-frameworks-selenium-tutorial-20/


Wednesday, May 25, 2016

Rules and Whys

When I started, I had to invent everything. It was 1990, and I didn't even have Google to ask.

We discovered early on that some things needed to be *rules*. Like only page objects can talk to their controls. A hard and fast rule. And that one, with good reason. By enforcing encapsulation, maintenance is made far easier.

When I started in my current gig, I was the most junior guy on the team, and the most junior with Selenium and Git... But the most experienced automation engineer overall by a factor of 5.

Shortly after I started here, I had a code review with Clay Gould, our automation lead. In this code review, he got quite concerned about my use of a delay of a tenth of a second in the code. The rule he quoted amounted to 'never EVER use delays to synchronize, ALWAYS look for changes in the application or environment'. A good rule. One I've pounded into many new automators.

When I teach and use these kinds of rules, I find it important to keep in mind why a rule exists. With time delays in code, the problem is use of the delay *for synchronization*. Because systems respond at different speeds at different times. Synchronization means attempting to line up processes. Like waiting for a page to be done loading before trying to take actions on it.

I was not using the delay *for* synchronization; I was using it to assist in synchronization. I checked the state of something, paused for a 1/10th of a second, and checked the state again. I kept at this until the condition was met or the timeout expired.

We argued about this for some time, and finally went looking at the Selenium code directly. Lo and behold, their code has a 3/10ths of a second delay in it, used in exactly the same fashion. It's there to give processor cycles over to the other processes so they can complete more quickly, rather than tying up the CPU testing the condition over and over until it's met.

More recently, I was implementing a Selenium wrapper, and I wanted to spread one delay across multiple functions. I wanted to validate that the control was present, visible, and enabled... And I wanted only a 20 second timeout overall, which means the existence check could take most of that time and leave only a little for the other two. Selenium has no native support for anything like this: I could either specify a separate hard timeout for each check, or let them all add together and come out with well over 20 seconds.

The answer was to note my target end time (now + timeout) and pass the time remaining to each of the functions, again with a 1/10th of a second pause between checks.
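
Here's a minimal sketch of that pattern (the control methods are invented for illustration, not our actual wrapper):

import time

POLL = 0.1  # short pause to hand cycles back to the browser between checks


def wait_until(condition, deadline):
    # Poll condition() until it returns True or the shared deadline passes.
    while not condition():
        if time.time() >= deadline:
            return False
        time.sleep(POLL)
    return True


def control_ready(control, timeout=20):
    # Present, visible and enabled all draw down the same 20 second budget.
    deadline = time.time() + timeout
    return (wait_until(control.exists, deadline)
            and wait_until(control.is_visible, deadline)
            and wait_until(control.is_enabled, deadline))

Because all three checks share one deadline, a slow existence check just leaves less time for the other two, and the total never runs past the 20 seconds.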

Rules are important. Rules are present for a reason. That reason may not be applicable in all circumstances. Experience often gives us the whys behind the rules, so we can assess when it's safe to ignore them. And in my experience, there are very few rules in test automation that don't have exceptions.

Friday, May 20, 2016

Getting Abstract...

After I had my first glimmerings of OOP, my next learning was an extrapolation of having code in the window classes.

I didn't know about abstractions or encapsulations yet, but I knew that having all these functions was producing too many things to take care of.

I also knew that, if I put code into the window definitions, then the tests wouldn't have to know how to do a thing, only which window to ask to do it. It wasn't until quite some time later that I was told that this was both abstraction and encapsulation. I had learned to decouple the tests from the interface being tested. Yay!

In my current position, our tests make use of page objects to accomplish this same thing. And it's a decent start. One of our rules is that tests never touch controls directly; they call functions on the page object. A good idea.

One of the things I observed in the code I inherited when joining my current gig was that often, instead of clicking a link, the tests here would call a function like page.click_link_to_next_page()

I'm looking to change this, because the test is still coupled to the interface; all we've done is wrap the selector in a function, which doesn't actually accomplish the goal.
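
To make the distinction concrete, here's a minimal sketch (the page and method names are invented): the first method merely wraps the selector, while the second expresses intent and leaves the mechanics entirely to the page object.

class SearchResultsPage(object):
    # Hypothetical page object, shown only to illustrate the two styles.

    def __init__(self, driver):
        self.driver = driver

    # Still coupled: the test has to know the next page is reached by a link.
    def click_link_to_next_page(self):
        self.driver.find_element_by_link_text('Next').click()

    # Decoupled: the test states what it wants; how is the page's business.
    def go_to_next_page(self):
        self.driver.find_element_by_link_text('Next').click()
        return SearchResultsPage(self.driver)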

Coming next... about rules and whys.


Thursday, May 19, 2016

OOPs!

In 1990, I was working at Symantec on desktop project planning software. I'd figured out that I needed low level access to controls to implement good automation. So I began to try and write it.

I had just started to get the very first bits of prototype code to work. Yay! And then my boss, Jennifer Flaa, came back from a QA conference. She told me that while she was there, she'd seen a tool called QA Workbench, built by a company in Boston called Segue. And she wanted to fly me out to look at it.

I went, and met with Lawrence Kepple and David LaRoche. They gave me the demo. It was everything I was trying to build! Woot! You programmed it in a language called 4Test, based on C. It gave me low level access to everything. I got to be the first external beta tester for them. Yay!

In order to build a test suite, I had to produce these huge long tables of constants, rather like the selectors in Selenium. Once those were built, tests were made by producing scripts that passed those constants to functions, and suddenly my dream of actually being able to access all the controls was a reality.

And these lists of constants gave me my next lesson in test automation. They were pretty difficult to maintain. And I wasn't the only one having trouble.

Segue got enough feedback that when they introduced QA Workbench 1.0, it included a limited form of object oriented programming. We could create objects to represent windows, and within them, objects to represent controls. These classes were created using a directive called 'winclass', and we were told they were for creating a GUI representation.

And each of the classes had methods and properties.

It was a stretch; I was learning the basics of objects on the job, in real time. But it was an awesome way to organize these vast tables.

We had a control whose selector was different at different times. David explained to me that I could add code to the object to return the appropriate selector. And that's where my love of OOP started.
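
In modern Python terms, that trick might look something like this (purely illustrative; the original was 4Test, and every name here is invented):

class LoginButton(object):
    # A control whose selector differs depending on the application's state.

    def __init__(self, driver):
        self.driver = driver

    @property
    def selector(self):
        # Code inside the object decides which selector applies right now.
        if self.driver.find_elements_by_css_selector('.new-style-dialog'):
            return '#login-v2'
        return '#login'

    def click(self):
        self.driver.find_element_by_css_selector(self.selector).click()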

QA Workbench went on to become SilkTest. Which I used for many, many years after that.

Coming next... Abstractions

Tuesday, May 17, 2016

Automate All The Things! Not...

When I started on test automation, I had the intention to automate all of the testing.

It was really a ludicrous idea, because at the time, we had enough trouble just keeping up with the UI changes.

What I learned was that automation was better for:
  • Smoke testing before deploying a build to QA
  • Regression testing of existing features
  • Performing extremely repetitive testing
  • Running performance and stress tests

I've been asked by a couple of companies to automate all the tests, and I surely did try. When I and my team couldn't do it, I tried creating libraries that made automation easy for everybody to implement. I even wrote a keyword based system at one point (thanks, Rafael Santander!).

In addition to the limitations above, I've learned that strategic implementation is critical. OK, so you want me to automate all the regression tests? Let's start with the tests that will meet a specific business need, whether that's smoke tests (and it often is) or tests for strategically important pieces of the application under test.

Since those days, a little analysis beforehand about the requirements on the automation has served me well.

Monday, May 16, 2016

Crawling around in the guts

I have learned about how important low level access is to automated testing.

We take it for granted today... Tools like Selenium, SilkTest, UFT, and a bazillion others all provide this access.

When I first started doing automated testing, there wasn't a roadmap. It was 1990, and David Blair, my boss, came to me and told me to do test automation. I asked him what it was, and he said "I don't know, go figure it out". And thus was my career born.

I started by using a tool that would let me see which window was active (by window title), ask whether a given window existed, and send keystrokes to an app. Since it relied on keystrokes, I had to count tabs and use alt-key commands and so on in order to build my tests.

About all I could do was try things and see if error windows came up. Those were my first smoke tests.

And they were a nightmare to keep going.

After the very first project was done, it was clear to me that this was very limited and very time consuming.

I needed something better. I needed a tool with lower level access. I needed to be able to talk to the control objects in the OS directly.

In my experience, this isn't something we think about anymore. Tools to do this have now existed for a long time, so this snippet is more historical than current. Nonetheless, it was my first lesson in automation.