Pragmatic Front-End Testing Strategies (1)


This article is the first in a series of three.

Part 1: Necessity of Pragmatic Front-End Testing Strategies
Part 2: Testing Visual Components Using Storybook
Part 3: Testing Logic in State Management Using Cypress

This is the heyday of JavaScript. For the past couple of years, JavaScript has held onto its championship title as the most popular programming language, and it is still developing rapidly. As someone who started down the path of a front-end developer more than a decade ago, in a lawless wasteland without even the simplest notion of web standards, I can’t help but feel humbled by JavaScript’s achievements. Observing the stunning evolution of web development, from development environments that barely deserved the name to a cornucopia of resources, ’tis truly the golden age.

Among all of the shining accomplishments, the improvement in testing methodology and tools is the most encouraging. For years, front-end testing seemed to me like a glorious castle in the sky. No matter how desperately I tried to apply existing testing methods, none were suitable for front-end code, and the meager results after arduous testing procedures left me unsatisfied. However, recently emerging tools are providing developers with excellent solutions to the problems of the past, as if to show off the experience gained through years of trial and error.

In this series of articles, I will humbly introduce Storybook and Cypress, the two tools most worthy of attention, and discuss effective strategies for real-world front-end testing.

Developers and Tests

Before we get started, let’s all agree on something. Roughly defined, a “test” in the software realm is “an act of verifying that the application functions appropriately under given requirements.” Most of the time, Quality Assurance (QA) is a mandatory step before the product is delivered into the hands of users, and this process captures the core idea behind testing. However, if you look at the development cycle more holistically, such testing occurs at every step of the way. For example, validating and improving the UX during the prototype phase, calling an API on the server and cross-checking the expected value, and comparing the final design with the markup after the first draft of the program are all examples of such tests.

Let’s look at more examples to see how frequent testing really is in a real workplace. Say I’m working on a simple Todo application with React, and I’m currently building a “Finish Task” feature. I would have to write code that takes the user’s click event and changes the state of the task to “Completed.” To verify that the state changes as planned, I click on the task and check in my developer tools that the component’s state has successfully changed to “Completed.” Since “Completed” tasks have to be checked and struck out in the UI, I add the appropriate CSS to the component and click on the task again to make sure it renders correctly. Upon realizing that the “Completed” state has to be managed not within the component but in the Redux store, I do some refactoring, and then I do the whole thing all over again.
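
To make the scenario concrete, here is a minimal sketch of what such a component might look like (the component and prop names are hypothetical, not taken from the actual TodoMVC code):

import React from "react";

// Hypothetical "task" component: clicking an item marks it as completed.
// The "completed" class is what the CSS hooks into to strike the item out.
export function TodoItem({ todo, onToggle }) {
  return (
    <li
      className={todo.completed ? "completed" : ""}
      onClick={() => onToggle(todo.id)}
    >
      {todo.title}
    </li>
  );
}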

In this example, clicking on the component and verifying the results are all parts of a test. In reality, developers “run tests” constantly during the development cycle, and most of the tasks habitually carried out after saving the code are related to testing.

Importance of Automated Testing

The main problem is that such tests are repetitive. As you can clearly see in the previous example, the testing consists of repeated “clicking” and “verifying” tasks. If these tasks are carried out manually every time, the cost of testing will grow with the complexity of the application. As the cost of testing increases, you naturally and gradually put off testing, and the quality of the product plummets. The pressure of having to rerun such tests also makes you more reluctant to change the code to improve it, which likewise sends the quality of the product into a nosedive.

If these tedious tests are automated, it can seriously cut down on testing time and prevent you from forgetting to run tests or making mistakes while testing. Automated testing also eliminates the fear of refactoring, which eventually leads to better code.

(From this point on, I will use “tests” to refer only to “automated tests written by developers.”)

The Opportunity Cost of Testing

I’m certain that most of you will agree with me when I say that “testing is important.” However, that is not to say that every test has to be automated, because writing tests costs resources. If the benefit of automating a test does not justify its cost, don’t bother writing test code; just run the test manually. Sometimes people set overly ambitious goals, like pursuing 100% coverage or writing automated tests for trivial behavior with close to no logic, and such goals waste resources that could be valuable elsewhere.

It is better to boldly remove any piece of code that you think is redundant, even if it is part of an existing test. As the application keeps changing, the test code must change with it. Just as it is considered good practice to remove unnecessary code from the product, it is also encouraged to actively remove unnecessary test code to reduce maintenance cost. Even Kent Beck, the rediscoverer of Test-Driven Development, has expressed his thoughts on the matter:

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence … If I don’t typically make a kind of mistake, I don’t test for it. (from Stack Overflow)

Trying to test for perfection is the easiest way to end up with code that inevitably becomes a complicated, error-prone program … If the program is so simple that there is little to no possibility of error, it may be better not to test it. (from Extreme Programming Explained)

Good Tests

Different tests have different maintenance costs and benefits, so in order to properly weigh the opportunity cost of a test, it is imperative that you know what makes a good test.

So, what makes a good test? This question is extremely difficult to answer, because tests are affected by numerous variables, including the application’s characteristics, the tools and languages used in development, and the users’ environments. While it may be difficult to definitively say what makes a perfect test, I have organized five key properties of a good test.

1. It needs to be fast.

Faster tests mean faster feedback every time you edit your code. This will inevitably make the entire development process faster, while allowing you to run tests more frequently. If you need to wait for hours just to see the test results, what good are the tests, really?

2. Altering implementation details should not break the test.

In other words, “write the test against the interface” or “do not write implementation-dependent tests.” From another angle, this also warns against splitting tests into units that are too small. If a test breaks at the slightest refactoring, it not only loses its credibility but also demands more time and resources to keep the test code up to date.
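
As a rough illustration, the test below validates only what the user can observe through the component’s interface, so it keeps passing through internal refactoring (a sketch in the style of the Jest tests used later in this article; the TodoItem component is hypothetical):

import React from "react";
import { render } from "react-dom";
import { TodoItem } from "../components/todoItem"; // hypothetical component

it("displays the title of the given todo", () => {
  const el = document.createElement("div");
  render(
    <TodoItem
      todo={{ id: 1, title: "Take a Nap", completed: false }}
      onToggle={() => {}}
    />,
    el
  );

  // Assert on visible output rather than internal state or markup details,
  // so renaming a state field or swapping tags won't break this test.
  expect(el.textContent).toContain("Take a Nap");
});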

3. It should be able to detect bugs.

To put it differently, “a test that validates buggy code should fail.” If the expectations are not specified in enough detail, or if the test does not cover every meaningful scenario, some bugs can go unnoticed. Also, excessive use of mock objects can make it difficult to detect errors at the points where objects connect, because the test keeps passing even when the dependent objects change. Therefore, test specs must be comprehensive, and mock objects should be used as sparingly as possible.

4. It should be consistent.

If a test that worked perfectly yesterday suddenly fails today, or a test that had no problem on certain devices doesn’t run on others, you’ll probably lose faith in it. A good test minimizes external and environmental effects on its results, producing the same outcome under any given conditions. Environmental factors include time, device OS, and network status, and good tests should be designed so that such elements can be directly controlled using mock objects or external tools.
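
For example, Jest lets you pin down the clock so that time-dependent code produces the same result on every run (a small sketch; formatDate is a made-up function standing in for any time-dependent logic):

// Hypothetical function under test: formats a timestamp as YYYY-MM-DD.
const formatDate = (ts) => new Date(ts).toISOString().slice(0, 10);

it("formats the current date consistently", () => {
  // Pin "now" to a fixed timestamp so the result does not depend on
  // when or where the test is executed.
  const spy = jest
    .spyOn(Date, "now")
    .mockReturnValue(new Date("2019-01-01T00:00:00Z").getTime());

  expect(formatDate(Date.now())).toBe("2019-01-01");

  spy.mockRestore();
});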

5. The intent of the test should be clear.

By now, I think everyone acknowledges that code readability is important. One of the defining characteristics of good code is that “people,” not “machines,” can easily read and understand it. Test code should be held to the same standard: anyone should be able to look at the test and tell you its purpose. Illegible code demands more resources when it eventually has to be changed or removed. If the same lengthy setup code has to be repeated across tests, or if the verification code is unnecessarily verbose, it is better to extract a helper function or a custom assertion to handle such tasks.
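
For example, the repetitive rendering boilerplate can be hidden behind a small helper so that each test reads as a plain statement of intent (a sketch; renderHeader is a made-up helper around the Header component that appears later in this article):

import React from "react";
import { render } from "react-dom";
import { Header } from "../components/header";

// Helper: hides the repetitive setup behind one descriptive name.
function renderHeader() {
  const el = document.createElement("div");
  render(<Header />, el);
  return el;
}

it("shows the new-todo input with a placeholder", () => {
  const el = renderHeader();

  expect(el.querySelector(".new-todo").placeholder).toBe(
    "What needs to be done?"
  );
});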

Importance of Testing Strategies

Each criterion is easy to satisfy on its own. The problem arises when they inevitably start to contradict one another. For example, writing tests in smaller units allows faster execution and more comprehensive validation through suite segmentation. However, such tests tend to break at the slightest refactoring, which raises maintenance costs, and because they require more mock objects, they are more likely to let errors slip through undetected. Likewise, incredibly thorough test specs are excellent for detecting bugs, but that very intricacy renders their purpose opaque.

With everything said, it’s probably impossible to build the perfect test that fits every single criterion mentioned above. Therefore, it is paramount to strategically decide what kind of compromises you are willing to make. Especially since front-end code is closely tied to the graphical user interface (GUI) and must carefully account for users’ varied environments, each platform demands different code and a different strategy. Each part of the product, including visual elements, server communication, and the user interface (UI), requires its own meticulous strategy.

Importance of Testing Tools

When it comes to front-end testing strategies, one of the most important factors to consider is tooling. At the beginning of this article, I mentioned the modern advancements in testing tools not because I was feeling overly sentimental, but because these tools are extremely helpful when designing effective testing strategies. For example, older E2E (End to End) testing tools were a double-edged sword: they offered user-perspective tests unaffected by implementation details, but they were complicated to write and extremely slow. Cypress, a modern E2E tool, offers the same benefits as its ancestors while being intuitive, fast, and stable. In short, the advancement of testing tools leads to better testing strategies.

Welp! What an introduction. I think I’m finally done rambling about why I decided to title this article “Pragmatic Front-End Testing Strategies.” To summarize: effective tests require effective strategies suited to the front-end environment, and the latest tools make it much easier to come up with such strategies.

Finally, let’s look at what efficient testing strategies are, using a simple Todo app as an example.

Simple Example: Todo Application

Throughout the article, I will refer to the renowned TodoMVC, which I rebuilt using React and Redux for testing purposes. However, the contents of this article are not restricted to particular libraries, so the same strategies should be applicable to applications not written in React as well. The application, when executed, looks like the following image.

TodoMVC Application

(Image 1: TodoMVC Application)

(While the original TodoMVC uses localStorage to persist its data, I used a separate local server in order to test communication with a real server.)

Components of Front-End Applications

Assuming that data already exists on the server, let’s imagine adding a new to-do item after launching the app for the first time. The internal execution procedure can be broken down into the following steps.

  1. When the application is launched, display the basic UI.
  2. Request the “Todo list” from the API server, and store the response in the Redux store.
  3. Display the Todo list in the UI according to the values saved in the store.
  4. The user clicks on the input box, types “Take a Nap,” and presses Enter.
  5. Send an “add todo” request with the “Take a Nap” data to the API server.
  6. If the request succeeds, append “Take a Nap” to the Todo list in the Redux store.
  7. Update the UI according to the values in the store.

That looks like a lot, but the entire process can be broken down into two main categories. Steps 1, 3, and 7 fall into the first category: displaying the visual state of the application on the screen. The second category, changing the current state of the application in response to external input (user input, server responses), includes steps 2, 4, 5, and 6. This should remind you of the models and views of the familiar MVC pattern.

Such categorization is important because each category requires a different testing strategy. In particular, since it is difficult to write code that automates tests of visual elements, testing visual elements and state changes at the same time would be incredibly costly. Therefore, it is important to design the application so that the two can be tested separately. The latest frameworks, such as React and Vue, provide ways to manage visual elements and state changes separately by default.
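
For example, with Redux, the state change from steps 5 and 6 can live in a plain reducer that knows nothing about the UI, so it can be verified with simple input/output assertions and no rendering at all (a simplified sketch, not the actual TodoMVC code):

// Pure state logic: given the current list and an action, return the new list.
function todos(state = [], action) {
  switch (action.type) {
    case "ADD_TODO":
      return [...state, { id: action.id, title: action.title, completed: false }];
    default:
      return state;
  }
}

// Verifiable without any DOM:
const next = todos([], { type: "ADD_TODO", id: 1, title: "Take a Nap" });
// next === [{ id: 1, title: "Take a Nap", completed: false }]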

First, let’s look at how we should go about testing visual elements.

Testing Visual Elements

Comparing the HTML

In front-end development, when people talk about testing the View of the MVC pattern, they most commonly mean testing the structure of the HTML. This makes sense, given that HTML and CSS determine the visual representation of the application and CSS is rarely manipulated dynamically. The simplest such test compares part of the resulting HTML against an expected string. For the components in the header area, it would look something like the following.

import React from "react";
import { render } from "react-dom";
import prettyHTML from "diffable-html";
import { Header } from "../components/header";

it("Header component - HTML", () => {
  const el = document.createElement("div");
  render(<Header />, el);

  const outputHTML = prettyHTML(`
    <header class="header">
      <h1>todos</h1>
      <input class="new-todo" placeholder="What needs to be done?" value="" />
    </header>
  `);

  expect(prettyHTML(el.innerHTML)).toBe(outputHTML);
});

The diffable-html library in the example above normalizes the formatting of an HTML string so that two HTML strings can be compared easily. This way, you can see clearly how the result of a failed test differs from what you expected, and it solves the problem that the value returned by innerHTML can differ from browser to browser due to each browser’s internal representation. You can also write more readable test code, because you don’t have to worry about incidental formatting such as indentation and line breaks.

Snapshot Testing (HTML)

It turns out that hardcoding the expected HTML by hand, as above, is quite difficult. It only worked out because the example has a simple structure; were the structure even a little more complicated, the complexity of the test code would increase drastically. Therefore, when conducting tests of this kind, it is more common to simply copy and paste the HTML created by the real component, using console.log() in the browser’s developer tools.

Such a method does not conform to the Test-Driven Development (TDD) methodology, where you define what you expect before the test. This kind of testing does nothing to speed up development by offering prompt feedback; truthfully, it is just a simple regression test. It can also become ridiculously tedious to copy and paste the result every time there is a change to the code. Testing tools like Jest recently introduced snapshot testing to solve these issues.

With snapshot tests, you don’t hardcode the expected data; instead, the tool saves the first result of your program to a file. Then, every time the test is run, the result is compared against the saved file. While this is still a regression test, it eliminates the hassle of writing the expected results manually. The code below is the same comparison as before, rewritten using the snapshot method.

import React from "react";
import { render } from "react-dom";
import prettyHTML from "diffable-html";
import { Header } from "../components/header";

it("Header component - Snapshot", () => {
  const el = document.createElement("div");
  render(<Header />, el);

  expect(el.innerHTML).toMatchSnapshot();
});

Now, doesn’t that look much simpler? Although you can’t see the expected result directly in the test code, you can see that a *.js.snap file has been created in the __snapshots__ folder, and the file contains the expected result, as shown below.

exports[`Header component - Snapshot 1`] = `
"
<header class=\\"header\\">
  <h1>
    todos
  </h1>
  <input class=\\"new-todo\\"
         placeholder=\\"What needs to be done?\\"
         value
  >
</header>
"
`;

Snapshot Testing (Virtual DOM)

Actually, a React component doesn’t return real HTML, but a virtual DOM called a React element. Creating and editing the actual HTML is react-dom’s job, so, strictly speaking, it is outside the testing scope of an individual component. Therefore, it is more common to simply test the returned React element’s tree structure when testing React components.

To facilitate testing React components, React provides the react-test-renderer library, which lets you test how a component behaves without actually rendering it.

import React from "react";
import renderer from "react-test-renderer";
import { Header } from "../components/header";

it("Header component - Snapshot", () => {
  const tree = renderer.create(<Header />).toJSON();

  expect(tree).toMatchSnapshot();
});

The example above uses react-test-renderer’s toJSON() function instead of creating a DOM element and rendering into it. This is useful because the browser’s rendering engine is no longer needed, so we can test in a Node.js environment without the help of jsdom.

Jest provides more functionality for comparing and updating snapshots. To learn more about snapshot testing, refer to the official Jest documentation.
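
For example, when a snapshot fails because of an intentional UI change, you don’t edit the .snap file by hand; you regenerate it from the command line:

jest --updateSnapshot   # or the shorthand: jest -u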

The Problem with Comparing HTML Structures

Both the HTML comparison method and snapshot testing verify the visual elements of the product by comparing HTML structures (even a React element tree can be considered an HTML structure). However, measured against the five properties of a good test, HTML structure comparison has the following flaws.

1. Implementation-Dependent Testing

The second property of a good test dictates that “altering implementation details should not break the test.” Hence, a test should validate “what” the code does instead of “how” it does it. However, HTML is, strictly speaking, not the end result of a visual element but an implementation detail: how the visual element happens to be represented. The true end product of a visual component is the image displayed on the screen, not the HTML structure.

Such implementation-dependent tests break at the tiniest changes to the code, so they cost more to maintain. Take the Header component as an example: if a div tag were used instead of the header tag, or if the new-todo class were renamed add-todo, the test would break even though the final product would look and function exactly the same. The test code has to be updated even for pure HTML and CSS refactoring, and this slows down the entire operation.

2. Unclear Testing Intentions

Another property is that “the intent of the test should be clear.” However, an HTML structure does not fully represent the image that ends up on the screen. Even if you include the CSS in the test, it is almost impossible to picture the resulting image just by looking at the code. This is also why we can’t hardcode the expected HTML when writing these tests: you can only be certain that the code does what you wanted after displaying the result in the browser.

Code like this is hard to maintain. Other developers, or even you, the one who wrote the test, can have trouble pinpointing what the test was for. Eventually, you end up mindlessly copying and pasting the results and updating the snapshot every time the test fails, and this destroys the credibility of the test.

Difficulties of Automating Visual Tests

The most accurate test of a visual component would compare the image displayed on the screen down to each pixel; anything less can hardly be called a complete visual test. Our best bet, then, would be to load the view component, take a screenshot, and compare it against the expected image. If you use the final design draft as the expected value, you could take a screenshot and compare the two images every time you edit your code, for the most precise results.

The technique I just described, while it sounds tedious, would produce the most accurate test results. Tragically, however, the final design handed to you does not cover every possible scenario, so this method is unsuitable as a means of testing every state of the application. Factoring in display resolution, browser rendering differences, device viewport sizes, and so on makes the pixel-by-pixel idea sound even more far-fetched in terms of technical difficulty.

While it may seem anticlimactic, in my opinion, no technology has yet triumphed over the “developer’s eyes” when it comes to visual testing. Despite the many tools under constant development, our innate senses and instincts have not been beaten. Even when writing HTML and CSS, you habitually check the result of each change you make, expecting something to have changed every time. I personally think technology has yet to produce a tool that can aptly replicate this series of events.

Does that mean visual tests cannot be automated? It’s 50:50. While complete automation is still not realistically achievable, the UI development process itself can be improved. The tool that provides this new way of developing UI is Storybook.

(It is true that recent tools such as Applitools and Chromatic have reconciled the rendering differences between browsers and drastically improved visual testing based on comparing image files. However, these tools conduct regression tests, and they are most effective when used together with the tool I am about to introduce: Storybook. In Part 2 of this series, I will briefly cover using Storybook with such regression testing tools.)

Storybook: UI Development Environment

The official website labels Storybook a “UI development environment.” To be fair, Storybook is less a testing tool than a tool that gives developers a better environment in which to develop UI. It is, in essence, a component gallery. As you can see in the following image, it lets you register the collections of components used in each part of your application, and provides a navigator so you can easily inspect them visually.


(Image 2: Storybook Example – Components used in TOAST File)

Now you might be wondering how this tool can help us run visual tests. Remember how I said that everything you do to check the result of your code after saving can be classified as testing? If every possible combination and input value of your components is already registered, much of the process I mentioned earlier can be automated.

For a more specific example, consider a situation where you have to change the size of the icon in a “File Upload Complete” pop-up. To check the result, you would have to change the code, upload a file, and actually wait for the upload to finish. But alas, upon checking the result, you see that the icon has become too big and now covers the text. So you repeat the process of changing the code and waiting, only to find that the icon is now too small. Guess what? You do the same thing all over again.

This series of events results from the fact that the visual elements are linked directly to the state of the entire application. Uploading a file and waiting for it to finish is merely a way of putting the application into the state you want to check. If Storybook has the “Upload Complete” pop-up registered as an individual component, such tedious repetition is no longer necessary.
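
At the time of writing, registering such a component as a “story” looks roughly like this (a sketch using Storybook’s storiesOf API; the UploadCompletePopup component and its props are hypothetical):

import React from "react";
import { storiesOf } from "@storybook/react";
import { UploadCompletePopup } from "../components/uploadCompletePopup"; // hypothetical

// Register the pop-up in isolation, in exactly the states you need to inspect,
// without going through an actual file upload.
storiesOf("Popups/UploadComplete", module)
  .add("default", () => <UploadCompletePopup fileName="photo.png" />)
  .add("long file name", () => (
    <UploadCompletePopup fileName="a-very-long-file-name-that-might-overflow.png" />
  ));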

End of Part 1

So far, I have illustrated the importance of test automation, the qualities of good tests, and the importance of testing strategies. Furthermore, I have discussed why it is difficult to automate the testing of visual elements, and how Storybook provides an attractive alternative. True, there are numerous ways to test front-end code, and the ideas presented here may not always point in the right direction. This article is merely my attempt at discovering, alongside you, realistic and efficient testing methods.

In Part 2, I will actually use Storybook to conduct more detailed visual element tests and discuss strategies in more depth.

(“Application State Management” will be covered in Part 3, along with Cypress.)

DongWoo Kim, 2019.04.05