tech-debt
maintenance
jest
code-quality
testing

Reflections On Test Coverage In Web Development

Recently, I've noticed a surge of videos and articles discussing code coverage in testing and whether the "requirement for it" makes any sense at all. I've often pondered this aspect myself, along with the question of whether writing code that tests my own code is even worthwhile.

Don't get me wrong - automated tests (unit, integration, e2e and others) are among the most important elements of a project. The larger the project, the harder it is to maintain stability and the more time-consuming manual testing becomes. Without automated tests, we'd be in serious trouble.

However, as is often the case in both programming and life, forcing something through can lead to many side effects. Requiring a certain level of code coverage - say, 50% - means developers must write additional tests just to meet that threshold. This often leads to tests being written hastily, or to the threshold being lowered just so the PR can pass.

// Part of the Jest configuration that enforces test coverage.
coverageThreshold: {
    global: {
        statements: 60,
        branches: 50,
        functions: 60,
        lines: 70,
    }
}

The worst possible situation is probably when test coverage hovers right around the threshold: anyone adding a feature must then write tests not only for it but also for unrelated code, just to get the PR through.

Does any of this make sense? How should we even approach testing, and is it worth investing in code-oriented tests (unit, integration)? Or is it better to focus on e2e testing and concentrate solely on whether the functionality works as a whole?

Let's discuss all of this in today's article.

How Does Test Coverage Work?

The mechanism is childishly simple. The testing framework has built-in tooling that instruments the code and records whether a given piece of it was executed by any test. It then counts, across all files included in the measurement, how many statements, branches, functions, and lines were hit, and calculates percentages from those totals.
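Conceptually, the coverage tool (Istanbul, in Jest's case) rewrites the code before tests run so that counters record what actually executed. The snippet below is a hand-written sketch of that idea, not what the real generated instrumentation looks like:

```javascript
// A hand-rolled sketch of what a coverage tool does under the hood -
// real instrumentation is generated automatically and is far more detailed.
const hits = { statements: { s0: 0 }, branches: { b0: [0, 0] } };

function getDiscountedPrice(price, discount) {
  hits.statements.s0++; // statement counter
  if (discount) {
    hits.branches.b0[0]++; // truthy branch counter
    return price - discount;
  } else {
    hits.branches.b0[1]++; // falsy branch counter
    return price;
  }
}

// Suppose the test suite only exercises the truthy path:
getDiscountedPrice(100, 20);

// The reporter then turns counters into percentages:
const coveredBranches = hits.branches.b0.filter((n) => n > 0).length;
console.log(`branch coverage: ${(coveredBranches / 2) * 100}%`); // prints "branch coverage: 50%"
```

Every counter that stays at zero shows up in the report as uncovered code.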

Usually, in the testing framework's configuration, we specify which files should be taken into account.

collectCoverageFrom: ["src/**/*.{js,jsx,tsx,ts}"],
coveragePathIgnorePatterns: [
  "node_modules/",
  "coverage/",
  ".next/"
]

Alright, but what’s the deal with the previously mentioned setup that includes statements, branches, functions, and lines? Each of these is a different category that allows us to track the following:

  1. Statements: At least n% of all statements in the code (e.g., variable assignments, conditionals, loops) must be covered by the tests.
  2. Functions: At least n% of all functions in the code must be tested.
  3. Branches: At least n% of all possible branches in the code must be covered by tests.
  4. Lines: At least n% of the total lines in the code must be executed by the tests.

Below is some code to understand each of these.

function calculateTotal(price, tax) { // This is a function and line.
    const total = price + tax;  // This is a statement and line.
    return total; // This is a line.
}

function getDiscountedPrice(price, discount) { // This is a function and line.
    if (discount) {  // This is a branch and line.
        return price - discount; // This is a statement and line.
    } else {         // This is another branch and line.
        return price; // This is a line.
    }
}

function add(a, b) { // This is a function and line.
    return a + b; // This is a statement and line.
}
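To see how the percentages come out, suppose the test suite calls only `calculateTotal` and `add` from the snippet above (the exact statement counts depend on how the tool tallies them, so treat the numbers as approximate):

```javascript
// Functions: calculateTotal and add are covered, getDiscountedPrice is not
// -> 2 of 3 functions covered.
// Branches: the if/else in getDiscountedPrice never runs -> 0 of 2 covered.
const functionCoverage = Math.round((2 / 3) * 100);
const branchCoverage = (0 / 2) * 100;
console.log(functionCoverage, branchCoverage); // prints 67 0
```

With a `branches: 50` threshold like the config at the top, this suite would fail the coverage check even though every tested function behaves correctly.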

Alright, we know how test coverage works (how it is calculated), but how can we visually see which file has what coverage? In Jest, you just need to add the --coverage flag to the test command. Developers usually create a special alias for this variant of the command in package.json.

"scripts": {
  "test:coverage": "jest --coverage",
}

After running it, you should see a report like this in the console. A directory with a detailed report is also generated, where you can check which lines of code are covered and which are not (assuming you have Jest's default configuration and haven't changed anything).

Test Coverage in Console

Additionally, you can open the generated report by navigating to the coverage directory, which should be created.
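If the defaults don't suit you, the output location and report formats can be adjusted in the Jest config. These are real Jest options; the values below are just illustrative:

```javascript
// Illustrative Jest config - the defaults already produce an HTML report
// in the coverage/ directory.
coverageDirectory: "coverage",
coverageReporters: ["text", "html", "lcov"],
```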

Generated Detailed Coverage Report

Does Code Coverage Make Sense?

Alright, we know how test coverage is gathered and how we can generate and view a report. This is the perfect moment to reflect on whether this approach makes sense and what its potential drawbacks are. As always, it’s best to relate to real-life situations, since programming describes real-world issues and makes them easier to manage.

Imagine you work in a restaurant and are learning how to make pizza. There’s a head chef who checks whether you can make the pizza (this is our test), and you, the young pizzaiolo, are the code. Instead of tasting the pizza, checking its color, or inspecting the proportion of ingredients, the chef only verifies whether any pizza was served to the customer - nothing more.

This is exactly what test coverage is. It’s merely a verification of whether the code was run at least once by the testing framework. With this understanding, we might conclude that it’s somewhat meaningless.
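To make the pizzaiolo analogy concrete: a test can execute every line without asserting anything, and coverage will still report the code as fully covered. A minimal sketch, framework-free for simplicity:

```javascript
// A "test" with no assertions: it executes every line of the function,
// so line/statement/function coverage hits 100% for it - yet it would
// pass even if the logic were completely wrong.
function calculateTotal(price, tax) {
  return price + tax;
}

function assertionFreeTest() {
  calculateTotal(100, 23); // executed, but the result is never checked
  return true; // always "passes"
}

console.log(assertionFreeTest()); // prints true
```

Coverage measures execution, not correctness - the chef only checked that a pizza left the kitchen.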

Don’t get me wrong - reports and statistics about code always look impressive when the numbers are high. But what happens when they fall below expectations? Someone has convinced the business that coverage is crucial, and now high coverage is strictly enforced.

Take a look at the following code from a component with tests and consider whether this test even makes sense (as a fun fact, I’ll tell you that the functionality is 100% covered according to our framework).

const LoginForm = ({ onSubmit }) => {
  const [email, setEmail] = useState(``);
  const [password, setPassword] = useState(``);
  const [errors, setErrors] = useState({});

  const validate = () => {
    const validationErrors = {};
    if (!email) {
      validationErrors.email = `Email is required`;
    } else if (!/\S+@\S+\.\S+/.test(email)) {
      validationErrors.email = `Email is invalid`;
    }

    if (!password) {
      validationErrors.password = `Password is required`;
    } else if (password.length < 6) {
      validationErrors.password = `Password must be at least 6 characters`;
    }

    setErrors(validationErrors);
    return Object.keys(validationErrors).length === 0;
  };

  const handleSubmit = (event) => {
    event.preventDefault();
    if (validate()) {
      onSubmit({ email, password });
    }
  };

  return (
    <form onSubmit={handleSubmit}>
      <div>
        <label>Email:</label>
        <input
          type="email"
          value={email}
          onChange={(e) => setEmail(e.target.value)}
        />
        {errors.email && <p style={{ color: `red` }}>{errors.email}</p>}
      </div>
      <div>
        <label>Password:</label>
        <input
          type="password"
          value={password}
          onChange={(e) => setPassword(e.target.value)}
        />
        {errors.password && <p style={{ color: `red` }}>{errors.password}</p>}
      </div>
      <button type="submit">Login</button>
    </form>
  );
};

describe("LoginForm Component", () => {
  test("renders login form with email and password fields", () => {
    render(<LoginForm onSubmit={jest.fn()} />);

    expect(screen.getByLabelText(/email/i)).toBeInTheDocument();
    expect(screen.getByLabelText(/password/i)).toBeInTheDocument();
    expect(screen.getByRole("button", { name: /login/i })).toBeInTheDocument();
  });

  test("shows validation errors when submitting empty form", () => {
    render(<LoginForm onSubmit={jest.fn()} />);

    fireEvent.click(screen.getByRole("button", { name: /login/i }));

    expect(screen.getByText(/email is required/i)).toBeInTheDocument();
    expect(screen.getByText(/password is required/i)).toBeInTheDocument();
  });

  test("shows validation error for invalid email", () => {
    render(<LoginForm onSubmit={jest.fn()} />);

    fireEvent.change(screen.getByLabelText(/email/i), {
      target: { value: "invalid-email" },
    });
    fireEvent.change(screen.getByLabelText(/password/i), {
      target: { value: "password123" },
    });
    fireEvent.click(screen.getByRole("button", { name: /login/i }));

    expect(screen.getByText(/email is invalid/i)).toBeInTheDocument();
  });

  test("shows validation error for short password", () => {
    render(<LoginForm onSubmit={jest.fn()} />);

    fireEvent.change(screen.getByLabelText(/email/i), {
      target: { value: "user@example.com" },
    });
    fireEvent.change(screen.getByLabelText(/password/i), {
      target: { value: "123" },
    });
    fireEvent.click(screen.getByRole("button", { name: /login/i }));

    expect(
      screen.getByText(/password must be at least 6 characters/i)
    ).toBeInTheDocument();
  });

  test("calls onSubmit with email and password when form is valid", () => {
    const mockSubmit = jest.fn();
    render(<LoginForm onSubmit={mockSubmit} />);

    fireEvent.change(screen.getByLabelText(/email/i), {
      target: { value: "user@example.com" },
    });
    fireEvent.change(screen.getByLabelText(/password/i), {
      target: { value: "password123" },
    });
    fireEvent.click(screen.getByRole("button", { name: /login/i }));

    expect(mockSubmit).toHaveBeenCalledWith({
      email: "user@example.com",
      password: "password123",
    });
  });
});

The 100% coverage will bring a smile to the client’s face and yours, but that smile might be deceptive. These tests only verify that labels render, that clicking the button calls the onSubmit function passed to the component, and that error messages appear after changing the inputs. Everything seems fine, but these tests don’t actually verify what the user sees - the layout, the styling, or the real submission flow.

Additionally, there could still be many other bugs here that aren’t caught from the perspective of these tests - mostly visual ones and those related to race conditions when dealing with real functionality and API requests.

Lastly, even though we’re using react-testing-library and don’t have any glaring references to specific implementations, this code is still tied to the "implementation." Try a little thought experiment - what if the team decides React is terrible and drops it? What happens to your tests and all the coverage you’ve written? Exactly, it all goes down the drain. Now, explain to the client how 90% coverage in the project dropped to 0% or 10% because some of the code was rewritten in pure TS.

It’s easy to see that code-oriented tests (unit and integration tests) are "deceptive" and can generate a lot of work in the future. However, e2e tests are technology-agnostic and focused on functionality, so migrating to something else won’t cause a catastrophic need to rewrite all tests.

I know, I know - the test code wasn’t of the highest quality, but I did it on purpose. This is how tests look in many projects. I want to show you how much harm you can do to yourself by writing tests this way.

Catastrophic Impact Of Test Coverage

It's best to start with an example. If we have several features in a project and a requirement that test coverage must be at 90%, then this is the amount of work that awaits us.

Example With A New Feature

  1. Writing the feature code.
  2. Writing the test code.
  3. If we're below the threshold - adding tests for other features.

Major Application Refactor

  1. Changing the code.
  2. Fixing failing tests (if possible).
  3. We still don't know if we've broken anything - as I mentioned, code-oriented tests don't give us a 100% guarantee.
  4. Adding tests if we're below the threshold.

And that's just the beginning. The real problem starts with larger migrations. Let's assume we have an app in Gatsby and we're migrating to Next. We have tons of tests that check how the static site generation process works. I'm telling you right now that those tests are going straight into the trash, and you'll have to rewrite them from scratch for Next.

This just shows how pointless this approach is. Unlike other benchmarks that actually give us something - like page performance, the number of generated files in the bundle, and their limits - test coverage is so useless and hollow that it only harms us in the end.

Let's see how much money a project might lose on a single developer over the course of a year, assuming the developer earns $100 per hour and spends 6 hours a day on feature development and 2 hours on writing tests. Of course, you can't calculate it this way exactly, and reality is different, but this is just for illustration.

Development/Automated Testing Time Allocation For A Developer Over A Year
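The back-of-the-envelope math behind this, using only the assumed numbers stated above:

```javascript
// Assumed numbers from the text: $100/hour, 2 hours a day spent on
// writing tests, and roughly 260 working days in a year.
const hourlyRate = 100;
const testHoursPerDay = 2;
const workingDaysPerYear = 260;

const yearlyTestingCost = hourlyRate * testHoursPerDay * workingDaysPerYear;
console.log(yearlyTestingCost); // prints 52000
```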

In the case of migrating to another framework, tell the client that $52,000 went up in smoke because most of the tests need to be scrapped and rewritten from scratch. Then multiply that by the number of developers, and you've got quite a sum - why burn wood when you can burn money?

This happens because forcing test coverage leads to more and more tests being written. The more tests there are, the more time is needed to maintain them. Most tests are written in a way that touches implementation details anyway - especially those tests that use frameworks like Angular, React, Next, or Gatsby - and there are a lot of them in every project.

So, What Then? E2E Tests?

It's not that code-oriented tests are bad. They're not - the problem lies in enforcing code coverage. Unit tests should be written where they are well-suited, such as testing algorithms or verifying small, isolated functions.

export function sum(numbers: number[]): number {
  return numbers.reduce((acc, curr) => acc + curr, 0);
}

test("sums an empty list to 0", () => {
  expect(sum([])).toBe(0);
});

test("sums a list", () => {
  expect(sum([1, 2, 3, 4, 5])).toBe(15);
});

The same applies to integration tests. How else can we test whether our code is integrated with a third-party library for sending emails? An integration test is perfect for this.

import nodemailer from "nodemailer";

export async function sendEmail(
  to: string,
  subject: string,
  text: string
): Promise<void> {
  const transporter = nodemailer.createTransport({
    host: "smtp.example.com",
    port: 587,
    secure: false,
    auth: {
      user: "your-email@example.com",
      pass: "your-email-password",
    },
  });

  const info = await transporter.sendMail({
    from: '"Sender Name" <your-email@example.com>',
    to,
    subject,
    text,
  });

  console.log("Message sent: %s", info.messageId);
}

// Assumes the "nodemailer-mock" package: import nodemailerMock from "nodemailer-mock";
jest.mock("nodemailer", () => nodemailerMock);

test("emails management mechanism is integrated with third party", async () => {
  const to = "recipient@example.com";
  const subject = "Test Email";
  const text = "This is a test email";

  await sendEmail(to, subject, text);

  const sentMail = nodemailerMock.mock.getSentMail();
  expect(sentMail.length).toBe(1);
  expect(sentMail[0].to).toBe(to);
  expect(sentMail[0].subject).toBe(subject);
  expect(sentMail[0].text).toBe(text);
});

The key is not to go overboard and avoid introducing fanatical approaches or doctrines to the code. As always, you need to find a balance. Instead of complex integration tests that verify component behavior or entire functionalities, write an E2E test and check off the steps required for the task. Take screenshots and ensure that the test is resistant to changes in selectors or functionality implementation.

it(`user sees unchanged permanent document`, () => {
  Given(`I'm on page`, `education-zone`)
    .Then(`I see text`, [
      `Naming generics in TypeScript`,
      `Why you should start using Zod`,
      `Managing legacy URLs on Netlify`,
      `Implementing Queue in JavaScript`,
      `Using Zod and TypeScript to write typesafe code`,
      `Creating reusable and framework-agnostic link component`,
    ])
    .When(`I click explore "Naming generics in TypeScript"`)
    .Then(`I see unchanged elements`)
    .And(`System takes a picture`);
});

Alright, so is there another way to verify what is covered by tests and what isn’t? Yes, there is, and in my opinion, it’s a much better approach. Instead of verifying whether the code is covered by tests, we verify whether the functionality and its steps are covered. We then mark, track, and manage this coverage accordingly. There are many tools for this, such as XRAY or Zephyr.
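A sketch of the idea: instead of counting executed lines, you map requirements to tests and report which functionality has coverage. The requirement IDs and data below are invented for illustration - tools like XRAY or Zephyr manage this mapping for you:

```javascript
// Hypothetical requirement-to-test mapping - the kind of data a test
// management tool tracks (all IDs below are made up for this sketch).
const requirements = [
  { id: "PROJ-101", name: "user can log in", hasE2ETest: true },
  { id: "PROJ-102", name: "user can reset password", hasE2ETest: true },
  { id: "PROJ-103", name: "user can delete account", hasE2ETest: false },
];

const covered = requirements.filter((r) => r.hasE2ETest).length;
const functionalCoverage = Math.round((covered / requirements.length) * 100);
console.log(`functional coverage: ${functionalCoverage}%`); // prints "functional coverage: 67%"
```

Notice that this number survives a framework migration untouched - the requirements and their E2E tests don't care how the UI is implemented.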

Covering functionality, rather than code, protects us from changes in implementation details. E2E tests take longer, but they provide much better results and catch far more inconsistencies or regressions. In today’s world, where we can run tests in parallel across multiple threads and have great tools for monorepos like Nx, we can divide E2E tests into many smaller projects and run them separately - for faster results and feedback.

If you're interested in the conventions and mechanisms used, feel free to check out the article Why I Crafted My Own Gherkin Interpreter For E2E Tests.

My Thoughts

I've worked on projects where code coverage was enforced, and in others where it wasn't, and instead, we used tools to track functional coverage. I must honestly say that I would never go back to enforcing code coverage - it's far better to enforce functional coverage (e.g., blocking a PR if the feature doesn't have the appropriate e2e test tag) or something similar.
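As a minimal sketch of such a gate - everything here is hypothetical: the tag convention, the directory layout, and the script itself - a CI step could refuse the build when no E2E spec carries the feature's tag:

```shell
# Hypothetical CI gate. Setup for the sketch: a fake e2e directory
# containing one spec tagged with the feature ID.
mkdir -p /tmp/e2e-demo
echo "it('@feature:login user can log in', () => {})" > /tmp/e2e-demo/login.spec.js

FEATURE_TAG="@feature:login"
if grep -rq "$FEATURE_TAG" /tmp/e2e-demo; then
  echo "E2E coverage found for $FEATURE_TAG"
else
  echo "Missing E2E test for $FEATURE_TAG" >&2
  exit 1
fi
```

In a real pipeline, the tag would come from the ticket or branch name rather than being hardcoded.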

Once written, E2E tests protect us from most problems (visual, process-related, or anything else).

When changing the implementation and expecting the same behavior (refactoring), I don't have to touch any test code, and if everything is done correctly, I get immediate feedback.

In my opinion, code coverage was just another trend among many, and it has shown us that this approach is largely ineffective. It's similar to CSS-in-JS solutions, which introduce significant overhead in maintaining the codebase and, in the case of CSS-in-JS, can also negatively impact performance.

Remember, these are just my thoughts, and you might have a different opinion or perspective, but the most important thing in programming is to recognize where the bottlenecks are and be able to admit to them - even if you're the author of the idea.

Avoiding code coverage aligns with the User First approach, which focuses on reducing unnecessary maintenance and refactoring for developers.

Summary

Many approaches have faded into obscurity over time, or it has been realized that no approach is a silver bullet. The same applies here. Enforcing code coverage is flawed - it's just a statistic. It provides information and can be useful (e.g., when we write a set of unit tests for an algorithm and don't remember what we've covered because yesterday was tough).

If developers stop recognizing and addressing certain problems, it means they've stopped growing. I've seen many situations where someone pushed an approach without being able to explain or defend it with solid arguments - this is a problem of ego plus experience.

We should test our code, but not in the way the aforementioned pizzaiolo did. Instead, we should focus on testing in a way that's resilient to changes in the tech stack or implementation.

Don't be like the owner of a mediocre pizzeria (2 stars) who doesn't even bother to try his own products out of laziness.

For more information about E2E testing and how to minimize changes, you can read the following article: How To Work With E2E Selectors In Tests.

About the Author - polubis

👋 Hi there! My name is Adrian, and I've been programming for almost 7 years 💻. I love TDD, monorepo, AI, design patterns, architectural patterns, and all aspects related to creating modern and scalable solutions 🧠.