Selenium reports – our take
by Igor

We decided to improve the reports our Selenium tests generate. We wrote about our reports in this post, and here is a sample report. So why did we grow dissatisfied with the current design?

First, the reports should be aesthetically pleasing to look at, because we have to work with them every day.

Second, we would like to make it easier to view screenshots. In the current version they simply open in a new browser tab, which makes it hard to view them and work with the report as a whole.

Third, we finished Nerrvana and built another web application, which is extensively tested in Nerrvana on all the environments Nerrvana currently supports. By environment we mean the combination of an operating system and a browser of specific versions; for example, Win 7 + Firefox 9 and Win XP + Firefox 9 are two environments. Viewing a separate report for each environment turned out to be inconvenient. We want to generate the reports in XML and then combine them into a single report, so that we can immediately see the errors that occurred across all environments.
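
To make the idea concrete, a per-environment XML report might be laid out roughly like this (the element and attribute names are only an illustration for this post, not a settled schema):

```xml
<!-- Hypothetical layout: one file per environment, e.g. win7_firefox9.xml -->
<test-run space="Demo space" name="Nightly trunk" started="2012-02-20T02:15:00Z">
  <environment os="Win 7" browser="Firefox 9"/>
  <!-- events nest, mirroring the hierarchical report -->
  <event status="ok" title="Login">
    <event status="ok" title="Open login page"/>
    <event status="error" title="Click 'Save'"
           screenshot-before="041_before.png"
           screenshot-after="041_after.png"/>
  </event>
</test-run>
```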

I made a few designs which I want to share with you. Dima is preparing a presentation for Selenium Camp 2012, where he will talk about how to build such hierarchical reports and about parallel testing. We want to implement at least the ideas we consider most urgently needed in time to show them during Dima’s conference talk.

Figure 1 The report we are reworking

First I removed the underlines from links, added better icons, changed the font and replaced the drop-down list with clickable filters, which are visible at a glance and easier to work with. You can also see that the error event is marked with a bug icon.

Figure 2 New report look and feel

Then I drew a window for displaying ‘before’ and ‘after’ action screenshots, which will now be a pop-up. That is, we stay inside the report and use arrows to quickly switch between the BEFORE and AFTER screenshots.


Figure 3 Popup to browse screenshots

The right side of the page shows a brief: when the test was launched and in which environment. Our current version lacks this information. Below that is additional information, the names of the Space and the Test Run; these are concepts specific to Nerrvana. The links take you to the place in Nerrvana where the original reports live, since a report may be copied and viewed somewhere else entirely. In any case, the report should indicate what we tested, when and in which environment, and the block on the right side of the page seemed like a good place for that information.

Figure 4 Report details section

After completing the ‘brief’ section, I began to think about how the report should look when the same test is run in 16 environments (for example, two operating systems and eight different browser versions). At this point we created a topic on our internal forum and realised that we needed XML as an intermediate step: load the 16 XML files generated on Nerrvana into Jenkins, then parse them and generate a single aggregated report that is convenient and fast to work with. The parsed XML data could also be pumped into a database. Flushed with the seemingly unlimited possibilities of a Jenkins plus Nerrvana bundle, I began to design such a report.
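
To give a feel for that step, here is a minimal sketch of the aggregation, assuming the hypothetical XML layout shown earlier and using only the standard JDK DOM parser; it illustrates the idea rather than the code we actually run:

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical aggregator: reads one XML report per environment (the layout
// sketched earlier) and records, for every distinct failed event, the set of
// environments it occurred in. That map is enough to render the "Nx" badges
// and the bug matrix, or to be pushed into a database.
public class ReportAggregator {

    public static void main(String[] args) throws Exception {
        File[] reports = new File("reports").listFiles(); // e.g. 16 files, one per environment
        Map<String, Set<String>> errorToEnvironments = new LinkedHashMap<String, Set<String>>();
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        for (File reportFile : reports) {
            if (!reportFile.getName().endsWith(".xml")) {
                continue;
            }
            Document doc = builder.parse(reportFile);

            // "Win 7 + Firefox 9" etc., taken from the <environment> element
            Element env = (Element) doc.getElementsByTagName("environment").item(0);
            String environment = env.getAttribute("os") + " + " + env.getAttribute("browser");

            // Collect every failed event; nested <event> elements are returned as well.
            // Keyed by the event title for brevity.
            NodeList events = doc.getElementsByTagName("event");
            for (int i = 0; i < events.getLength(); i++) {
                Element event = (Element) events.item(i);
                if ("error".equals(event.getAttribute("status"))) {
                    String title = event.getAttribute("title");
                    Set<String> environments = errorToEnvironments.get(title);
                    if (environments == null) {
                        environments = new TreeSet<String>();
                        errorToEnvironments.put(title, environments);
                    }
                    environments.add(environment);
                }
            }
        }

        // The size of each set is the number shown next to the bug icon ("4x")
        for (Map.Entry<String, Set<String>> entry : errorToEnvironments.entrySet()) {
            System.out.println(entry.getValue().size() + "x  " + entry.getKey() + "  " + entry.getValue());
        }
    }
}
```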

The first thing I added was a UI for switching between environments, designed so that it also shows where test errors occurred.

Figure 5 Bug matrix
(Different states are shown; see the full picture at the end of the post.)

Then I got the idea not only to show where an error occurred in the test, but also to show whether the same error occurred at the same spot in other environments. We show in how many environments the error occurred by adding, for example, a ‘4x’ next to the bug icon, meaning the bug appears in four environments, including the report we are currently looking at. After clicking the icon, a circle appears around it and the bug matrix highlights the environments that experienced this same error. From there we can go to those reports by clicking the appropriate bug icon.


Figure 6 Featuring inline bugs

Since reports are hierarchical, a bug icon with a ‘4x’ mark at an upper level does not mean that the lower levels match completely between environments. A ‘4x’ at the top level simply means that a bug icon appears at that same top level in four reports of this test run in total. Gray icons indicate errors that are present in other environments but not in the one we are looking at now.
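
Matching ‘the same error at the same spot’ across environments is the only non-obvious part of these counts. One possible approach, sketched below with invented names rather than our final implementation, is to give each error a signature built from its position in the report hierarchy:

```java
import java.util.List;

// Sketch only: one possible "signature" for deciding that errors from two
// environments are the same error at the same spot. Screenshot names and
// timestamps differ from run to run, so the event's path in the report
// hierarchy plus the failed step is a more stable key. The badge next to the
// bug icon ("4x") is then the count of environments whose reports contain an
// error with an equal signature.
public class ErrorSignature {

    public static String of(List<String> hierarchyPath, String failedStep) {
        StringBuilder signature = new StringBuilder();
        for (String node : hierarchyPath) {
            signature.append(node).append(" > ");
        }
        return signature.append(failedStep).toString();
    }
}
```

For example, of(Arrays.asList("Login", "Fill form"), "click 'Save'") yields "Login > Fill form > click 'Save'", which stays the same from one environment to the next, while screenshot names and timestamps do not.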

Fatal errors are indicated with a smoking bug icon. If we look at a report where the test was not completed due to a fatal error, we still see the rest of the report in gray text, taken from a run of the same test that completed in a different environment (if there was one).


Figure 7 Fatal bugs, or is there life after death?

Finally, in the screenshot pop-up I added the ability to view screenshots of the same error in other environments without switching reports. If you do want to switch to one of them, it can be done before closing the window using “switch and close”. Another possibility is to design this window as a carousel with vertical scrolling: moving right and left switches between the ‘before’ and ‘after’ screenshots, and scrolling up and down switches between environments.


Figure 8 Screenshot popup viewer version 2

The bug matrix is located on the right, under the information about the test, so it seemed convenient to use it here too: when you click a bug in the report, its icon becomes circled with a dotted line, the matrix shows where the same error occurred, and you can switch to another report if necessary, without much fuss. The bug matrix also has a “Show all errors” filter, which displays the number of all errors that occurred in every environment, so you can immediately see the overall picture. But visualising the frequency of each unique error across environments remained unresolved: one unique error could occur in two environments and another in ten, and these are two big differences. Having thought about this problem for a while, I looked at my watch. It was 2:00 at night. Knowing that a well-defined problem keeps being worked on in the background while you sleep, I took a break.

In the morning the solution surfaced from the depths of the subconscious, and I added an error frequency chart to a popup. It shows how many times the same problem occurred across the different environments. Again, if you are viewing a report with a highlighted error, the column of the chart where this error sits is marked with a bug icon. If we want to look at an error that occurred, say, eight times, we can simply click the corresponding bar. We then see the first report from the set of environments where the error occurred, with the error icon circled and the tree expanded right down to it to make it visible.

Figure 9 Bug matrix version 2

By the way, at that time I was watching talks from SeleniumConf 2011 and GTAC 2011 and realised that there is another, more complicated but more accurate way to determine the importance of an error. Many people have started talking about ADD (Analytics Driven Development) and ADT (Analytics Driven Testing). Their premise is that there should be a connection between your web analytics (the popularity of the functional blocks in your application, the browsers your audience uses, even the most common screen resolutions) and the tests you should concentrate on and the bugs you should fix first. This approach can produce a completely different outcome than the simple bug frequency bar chart I created: one error may be encountered in all 16 environments but affect functionality used by five per cent of visitors, while another occurs only in the most popular browser for your site and affects functionality used by 95% of visitors. Since this approach is based on analytics, I guess it will only work for testing web services. For the creators of downloadable web applications, a dying breed these days, who produce releases less often, it is not so important; they can polish a release toward a completely bug-free state, which, of course, does not guarantee the complete absence of bugs.
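
A back-of-the-envelope version of that weighting, with made-up numbers purely to illustrate the comparison above:

```java
// Back-of-the-envelope illustration of analytics-driven weighting: rank a bug by
// how much real traffic it touches rather than by how many test environments
// reproduce it. All numbers below are made up for the example.
public class BugPriority {

    // featureShare: share of visitors who use the affected functionality (from web analytics)
    // affectedBrowserShare: share of visitors on the browsers where the bug reproduces
    static double score(double featureShare, double affectedBrowserShare) {
        return featureShare * affectedBrowserShare;
    }

    public static void main(String[] args) {
        // Bug A: reproduces in all 16 test environments, but only 5% of visitors use the feature
        System.out.println("Bug A: " + score(0.05, 1.00)); // 0.05 * 1.00 = 0.05
        // Bug B: reproduces only in the most popular browser (say 60% of visitors),
        // in functionality used by 95% of visitors
        System.out.println("Bug B: " + score(0.95, 0.60)); // 0.95 * 0.60 = 0.57, fix this one first
    }
}
```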

Next I thought about a report which we are certainly not going to implement now, but which ties together all the reports described above in a nicely trackable way. Generally we test trunk, the last production release and the next release when it is ready. The drawings I made are somewhat simplified, since the application we are testing, although web based, is downloadable (not a web service) and supports two databases, MySQL and PostgreSQL. So we have to test trunk and each release twice: trunk with PostgreSQL, trunk with MySQL, and so on. Testing is started by the CI server, either automatically when the code changes (on commit) or on a schedule (daily).

Figure 10 Aggregation report.
It can optionally be integrated with the version control system.

In the example above you can see that testing starts on a schedule, since report buttons do not appear next to each commit. By the way, because the Selenium tests’ code lives in the same repository, the tests run when either the tests or the application itself changes. It might make sense (although it is not shown in my drawing) to mark commits on this page so it is clear where only the application changed, where only the tests changed, and where both changed. For a large company such a report clearly makes sense; for us it isn’t necessary, and there are more important things to do.

Figure 11 An individual report can also show version control changes.

When the tests are run on trunk we can see what changes have been committed, the total number of errors and their increase or decrease compared with the previous run, as well as changes in the test environments they were launched in (yellow blocks). I thought the last point was important, as it directly affects the number of errors. The link, of course, opens the report I showed above. Information about modified files can be hidden under a spoiler so it does not interfere with the overall view.

Figure 12 Altogether (click to enlarge).

What kind of reports do you use? Does it make sense, from your point of view, to create reports like these? What do you not like in our designs? What is missing?

Post updated on 27 March 2012
We rushed to implement at least some of the ideas described here in time for Dima’s talk at SeleniumCamp 2012.
Here are two examples: one and two.
This is only the beginning; read more about the insides of our approach on our blog soon.


2 comments

  1. Vladimir says:

    A few questions:
    1. Does it work with Selenium Grid? (with everything that entails: collecting screenshots from an array of nodes/agents, parallel testing, etc.) When tests run, the log files are generated on the node (agent); how do they then get to the hub, and from there to the host machine the tests were launched from?

    2. Can it generate a single HTML file with no external resources? (for example, to send to a customer or to drop into a shared folder)

    And as far as I understand, you enrich the Selenium logs with useful information while the tests are running, and once the tests have finished the logs are parsed and the report is generated? In that case, a couple more questions:

    3. So in the end I get a report for the current browser instance? I.e. working with chromedriver, for example, I get the results from chromedriver.log?

    4. Can the logs (and therefore the report) contain information about grouping tests by test class?
    For example: I use MSTest and VS. I have the test classes HomePageTests, ContactUsTests and ProductsListTests, each containing a dozen tests. The *.trx file produced when MSTest finishes groups results by class, i.e. I can see that in the first class 6 tests passed and 4 failed. Expanding each of them, I see why the test failed, its status, and so on.

    5. Will I be able to see the description from the [Description("")] attribute in MSTest in the log/report?

    6. How does the report cope with data-driven testing?

  2. bear says:

    1. Does it work with Selenium Grid? (with everything that entails: collecting screenshots from an array of nodes/agents, parallel testing, etc.) When tests run, the log files are generated on the node (agent); how do they then get to the hub, and from there to the host machine the tests were launched from?

    Actually, we do not use the logs of the Selenium node/hub itself in the reports. What you see in the reports is generated by the tests themselves, which use our framework. That is, the framework is mainly a set of transparent wrappers around Selenium functions, and it is these wrappers that collect the data for the report.
    Screenshots are taken with the ..ToString() functions, so the screenshots also end up straight away on the host where the tests run. As a result, all the source data lives on that host.
    In general, you can read more about the approach to producing hierarchical logs here.
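
    Roughly speaking, the idea of such a wrapper is sketched below; this is an invented illustration rather than our actual framework code, and the Report interface here is hypothetical:

    ```java
    import com.thoughtworks.selenium.Selenium;

    // Invented illustration of the wrapper idea, not the actual framework code:
    // take a screenshot before and after the real Selenium call and record the
    // step for the hierarchical report.
    public class ReportingSelenium {

        // Hypothetical collector behind the hierarchical report
        public interface Report {
            void addStep(String description, String screenshotBefore, String screenshotAfter);
        }

        private final Selenium selenium;
        private final Report report;

        public ReportingSelenium(Selenium selenium, Report report) {
            this.selenium = selenium;
            this.report = report;
        }

        // Transparent replacement for selenium.click(locator)
        public void click(String locator) {
            String before = selenium.captureEntirePageScreenshotToString("");
            selenium.click(locator);
            String after = selenium.captureEntirePageScreenshotToString("");
            report.addStep("click " + locator, before, after);
        }
    }
    ```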

    A related problem (getting the Selenium logs from the grid nodes; those logs have nothing to do with the reports) is, however, solved in our Nerrvana service. Since users may need all the information available from the nodes, we write all those logs to a database and retrieve them from there afterwards. You can read more about it here.

    2. Can it generate a single HTML file with no external resources? (for example, to send to a customer or to drop into a shared folder)

    No, we have never needed that (and we wrote the framework, essentially, purely for ourselves). I don’t think it would be hard to implement; after all, if necessary you can open the report in a browser and use the browser to save it as a single file.
    We don’t need it ourselves primarily because our tests run, again, on nerrvana.com, and after a run all the results are stored there too. If there were errors during the run, the report is, naturally, accessible from anywhere for everyone who has the rights and the link.

    And as far as I understand, you enrich the Selenium logs with useful information while the tests are running, and once the tests have finished the logs are parsed and the report is generated?

    As I already said, we do not use the Selenium logs themselves in the reports; we use special wrappers for the driver method calls. The most important thing our approach provides, in our view, is that the hierarchy is preserved, which lets you collapse the branches you don’t need and look only at the ones you do.

    3. So in the end I get a report for the current browser instance? I.e. working with chromedriver, for example, I get the results from chromedriver.log?

    Since we do not use Selenium’s own logs, the report is not tied to a browser instance. A test can use several instances and still produce a single report; in the report this simply shows up as extra ‘close browser’/‘open browser’ commands.

    4. Can the logs (and therefore the report) contain information about grouping tests by test class?

    In our tests, a single Java class contains a suite of tests for some piece of functionality. Each such suite creates a lower-level report, i.e. that class and its report are exactly what groups a set of tests together. Once all the test groups have finished, the part that builds the overall report runs. So grouping currently happens when test sets are organised into test classes, which is what you are asking about, though perhaps from a slightly different angle.

    5. Will I be able to see the description from the [Description("")] attribute in MSTest in the log/report?

    We don’t use MSTest, so I don’t know how hard it would be to support that attribute, but I doubt it would be too difficult.

    6. How does the report cope with data-driven testing?

    There is currently no convenient way to present or compare results for different sets of input data. In this post we look at an essentially similar task, merging the results of running the same tests on different platforms, because that, again, is exactly what Nerrvana does.

    Roughly the same approach would work for presenting data-driven test results: save the intermediate results to a database or XML files, and write a dedicated generator for the designed report.