Design System Code Coverage

If you're building a Design System, you're working on the foundational building blocks that accelerate the development of features and entirely new products for teams. Other teams count on you to provide high quality, well tested code. Let's discuss testing and why I think it's reasonable to enforce a high code coverage threshold, one you'll meet naturally by writing meaningful tests, which you're likely already doing!

Starting a new Design System with a focus on tests sets a strong foundation for quality and reliability. By ensuring that every piece of code is covered, teams can catch bugs early, prevent regressions in the future, and confidently refactor as the codebase evolves. This creates a culture of accountability, where developers are encouraged to write robust, meaningful tests, ultimately leading to a more maintainable and stable codebase. Just like Design Systems, it will take more time upfront, but the long-term benefits far outweigh the initial investment, resulting in fewer issues and smoother development.

For the rest of this article, I'll be homing in on Design Systems focusing on atoms and molecules. It's important to understand atoms and molecules when it comes to atomic design methodology. I'm talking about buttons, inputs, form elements, modals, dropdowns, etc. — nothing more complex than that! The higher you go up the hierarchy, and especially as you get into application code, eking out the last 10-20% of coverage may not be worth the time and investment. So let's focus on the micro level for the rest of this article.

The Case for Code Coverage

We all understand the importance of testing, but opinions on tracking code coverage are mixed. I used to believe that focusing on coverage metrics could lead to writing tests simply to meet those numbers, without ensuring they were meaningful. My view at the time was that it was better to prioritize meaningful tests over coverage percentages.

However, the more I thought about this, the more I saw advantages to code coverage metrics for Design Systems specifically. As developers, writing meaningful tests is an inherent part of our job. It's entirely possible to create tests that are both meaningful and provide the coverage we need — these goals are not mutually exclusive, though they're often perceived that way. Given enough time, any developer would naturally write tests that are both meaningful and comprehensive, covering all necessary paths, because that's the right thing to do.

For Design Systems, the rationale is even more straightforward. When focusing on atoms and molecules, these components are inherently less complex than other areas of software development due to their place in the hierarchy. Their code shouldn't involve extremely complex logic — if it does, you likely aren't following atomic design principles. Given this, I believe hitting a high code coverage metric is very attainable.

Requirement: A Strong Testing Culture

I thought it'd be important to touch on something before we progress further...

You may see that your team is writing crappy tests that hit the coverage numbers you're looking for, but don't actually verify anything meaningful like I described above. Unfortunately, automation can't catch these types of problems. This is more of a cultural team issue, or maybe even an organizational culture issue. It could mean that not everyone understands the importance of tests or how to write meaningful tests. If you notice this happening, it's probably wise to pull your team together and come to some agreement on principles for testing that everyone can get behind.

For the rest of this article, I'm making the assumption that people on your team aren't writing tests only to make CI green. I'm assuming that folks care about writing tests to build better products and build confidence in the code y'all write. I'm assuming best intentions, even in a corporate environment where it may be difficult to spend the time to write tests. I'm assuming there are people who actually review the pull requests being put up to ensure a certain quality and consistency to help improve the codebase overall.

Lots of assumptions, but for folks who don't just punch the clock day in and day out and really want to build great products, this should all be pretty obvious. It's still worth noting this to frame my personal point of reference.

Alright, let's continue!

An Example

By committing to writing tests with some sort of coverage metric enabled, even when it takes more time, you're building a stronger Design System, stronger products using your building blocks, and becoming a more experienced developer. I'm not saying you should be spending all night writing tests, of course, but you should be making time throughout your regular business hours to write them. Tests lead to fewer bugs down the road. Additionally, refactoring is a natural and necessary part of our jobs. With test coverage, developers can refactor confidently, knowing any regressions or unintended changes will likely be caught.

Let's look at an example component and how we might write some tests for it.

import * as React from 'react';
import clsx from 'clsx';
// Assumption: `classes` comes from a CSS Module alongside this component.
import classes from './Button.module.css';

export interface ButtonProps extends React.ComponentPropsWithoutRef<'button'> {
  children?: React.ReactNode;
  icon?: React.ComponentType<React.SVGProps<SVGSVGElement>>;
  variant?: 'primary' | 'secondary' | 'tertiary';
}

// I'll be using clsx below.
// If you aren't familiar it's an easy way to combine multiple classes
// based on conditionals.
// https://github.com/lukeed/clsx#readme
export const Button = React.forwardRef<HTMLButtonElement, ButtonProps>(
  (
    {
      children,
      className,
      icon: Icon,
      variant = 'primary',
      ...rootProps
    },
    ref
  ) => {
    return (
      <button
        ref={ref}
        className={clsx(
          classes.root,
          {
            [classes.primary]: variant === 'primary',
            [classes.secondary]: variant === 'secondary',
            [classes.tertiary]: variant === 'tertiary',
          },
          className
        )}
        {...rootProps}
      >
        {Icon && <Icon role="img" aria-hidden="true" className={classes.icon} />}
        {children}
      </button>
    );
  }
);

Button.displayName = 'Button';

Above you have a simple Button component. What tests would you write given the code above? Take a moment to jot down some ideas and come back before continuing!


All done? Sweet! Here's what comes to mind for me:

  • Verify the Button can render the children provided.
  • Verify the Button can render a provided icon.
  • Verify the Button applies the provided className.
  • Verify the Button renders each variant ('primary' | 'secondary' | 'tertiary').
  • Verify the Button can receive a ref and forwards it.
  • Verify the Button spreads root properties.

To some, it may seem like a lot of tests for a simple Button. But you'd probably write most or all of these yourself anyway due to the component API provided. And guess what? You'll get 100% test coverage! So the meaningful tests you'd already be writing get you to a high code coverage metric naturally. Since we're dealing with an atom, there's not much logic, and it's mostly all markup and conditionally applying classes.

Button is one of the least complex cases, but even for more complex components, you'd want to write tests for given paths. For example, say you have a Dropdown component that relies on keyboard interaction. You'd want to write tests that verify those interactions work as expected, because if you don't, you won't be confident that your code works properly.
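Keyboard behavior gets much easier to cover when the state transitions live in a pure function. As a sketch — the names and transitions here are hypothetical, not from any particular Dropdown — the kind of logic you'd assert against might look like:

```typescript
// Hypothetical Dropdown keyboard logic, extracted into a pure function
// so every interaction path is a one-line assertion in a test.
interface DropdownState {
  open: boolean;
  activeIndex: number;
}

export function dropdownReducer(
  state: DropdownState,
  key: string,
  itemCount: number
): DropdownState {
  switch (key) {
    case 'ArrowDown':
      // Opens the menu if closed; otherwise moves down, clamped to the last item.
      return { open: true, activeIndex: Math.min(state.activeIndex + 1, itemCount - 1) };
    case 'ArrowUp':
      // Moves up, clamped to the first item.
      return { ...state, activeIndex: Math.max(state.activeIndex - 1, 0) };
    case 'Escape':
      // Closes the menu and clears the active item.
      return { open: false, activeIndex: -1 };
    default:
      return state;
  }
}
```

Each branch maps directly to a test case: ArrowDown from closed, walking past the last item, Escape from anywhere, and so on.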

For UI tests, I've always looked at them as providing different inputs, receiving output, and then asserting that the output is what you'd expect. The inputs are attributes or properties, and the output is UI. Since we don't have to deal with data fetching, local storage, or complex state, there's less complexity in our test cases.
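That inputs-in, output-out framing can be made concrete with a tiny pure helper. This one is hypothetical — it just mirrors the Button's variant/className logic from earlier — but the testing shape is identical: feed in props, assert on the output.

```typescript
// Hypothetical helper mirroring the Button's class logic.
// Inputs (props) go in, output (a class string) comes out.
type Variant = 'primary' | 'secondary' | 'tertiary';

export function buttonClassNames(
  variant: Variant = 'primary',
  className?: string
): string {
  const parts = ['root', variant];
  if (className) parts.push(className);
  return parts.join(' ');
}

// e.g. buttonClassNames('secondary', 'custom') yields 'root secondary custom'
```

Covering every variant and the optional className here takes three or four assertions, which is exactly why full coverage at this layer is cheap.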

This is why I feel 100% test coverage is very attainable for Design Systems focusing on this layer. You care about what the output is. In application code, you may not care about the exact HTML that is ultimately rendered, but as a Design System author, you do. The markup is what we provide directly to our users, so we care about it even when consumers don't. So I look at Design System tests through a slightly different lens than if I were building a full application and writing tests for that.

Generative AI Can Help!

Writing tests is a bit like brushing your teeth. We all know we should do it. But sometimes we're busy. Or maybe we'd rather be doing something else. Or maybe we're stressed out and have to take care of another thing. That's where using technology can help.

If your organization allows it, Generative AI can do a lot of the lifting here for you. For kicks, let's use our Button component above and see what tests ChatGPT would write for it.

Here was my prompt.

"Given the following React code, what would you test?"

For this Button component, you would want to test the following scenarios:

  1. Rendering Tests
     • Default Rendering: Ensure that the component renders properly without any children or icon.
     • Children Rendering: Test that when children are passed, they are rendered inside the button.
     • Icon Rendering: Ensure that if an icon is passed, the icon component is rendered.
  2. Variant Prop Tests
     • Primary Variant: Ensure the button applies the correct primary CSS class when the variant="primary" prop is passed (default behavior).
     • Secondary Variant: Ensure the button applies the correct secondary CSS class when variant="secondary" is passed.
     • Tertiary Variant: Ensure the button applies the correct tertiary CSS class when variant="tertiary" is passed.
  3. Icon Prop Tests
     • Icon Present: Test that the Icon component is rendered when an icon prop is provided.
     • Icon Role and Accessibility: Ensure that the Icon has the correct role="img" and aria-hidden attributes for accessibility purposes.
  4. ClassName Prop
     • Custom ClassName: Test that a custom className passed to the button is correctly applied along with the default classes.
  5. Forwarding Refs
     • Forward Ref Test: Ensure that the ref is forwarded correctly to the underlying button element.
  6. Root Props Spread
     • Event Handlers and Attributes: Test that any extra props (like onClick, id, data-* attributes, etc.) passed through rootProps are spread to the underlying button element.
  7. Snapshot Testing
     • Create snapshots to ensure the component renders the correct structure and classes based on the props passed (e.g., for each variant, with or without icon).

Example Test Cases:
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import '@testing-library/jest-dom';
import { Button } from './Button';

// Default rendering test
test('renders the button with default props', () => {
  render(<Button />);
  const button = screen.getByRole('button');
  expect(button).toBeInTheDocument();
  expect(button).toHaveClass('primary');
});

// Children rendering test
test('renders children inside the button', () => {
  render(<Button>Click Me</Button>);
  const button = screen.getByRole('button');
  expect(button).toHaveTextContent('Click Me');
});

// Icon rendering test
test('renders the icon when passed', () => {
  const Icon = (props: React.SVGProps<SVGSVGElement>) => <svg data-testid="icon" {...props} />;
  render(<Button icon={Icon}>Click Me</Button>);
  const icon = screen.getByTestId('icon');
  expect(icon).toBeInTheDocument();
  expect(icon).toHaveAttribute('role', 'img');
});

// Variant tests
test('applies the correct class for the primary variant', () => {
  render(<Button variant="primary">Primary</Button>);
  const button = screen.getByRole('button');
  expect(button).toHaveClass('primary');
});

// Forwarding ref test
test('forwards ref to the button element', () => {
  const ref = React.createRef<HTMLButtonElement>();
  render(<Button ref={ref}>Click Me</Button>);
  expect(ref.current).toBeInstanceOf(HTMLButtonElement);
});

These tests cover functionality, accessibility, and styling for the component.

Cool! I'd say those are some pretty decent tests. If nothing else, it gives you an excellent place to start.


Now I asked it how we'd get 100% test coverage.

Here was my prompt.

"Given the following React component, can you write me tests that will yield 100% code coverage?"

To achieve 100% code coverage for the Button component, the tests should cover:

  • Rendering with default props: Check that the button renders with the default variant (primary).
  • Rendering with different variants: Ensure that the correct class is applied for primary, secondary, and tertiary.
  • Rendering with an icon: Verify that the icon renders correctly when passed, and check accessibility attributes.
  • Forwarding ref: Ensure that the ref is forwarded correctly.
  • Passing arbitrary props: Test that additional props are spread to the button element.
  • Rendering with className: Check that the className prop is correctly merged.

Here's how you can write tests using @testing-library/react and jest:

import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { Button, ButtonProps } from './Button';
import React from 'react';

// Mock the clsx and classes modules (assuming you have them)
jest.mock('clsx', () => (...args: string[]) => args.filter(Boolean).join(' '));
const classes = {
  root: 'root',
  primary: 'primary',
  secondary: 'secondary',
  tertiary: 'tertiary',
  icon: 'icon',
};

// Mock SVG icon component
const MockIcon = (props: React.SVGProps<SVGSVGElement>) => <svg {...props} />;

describe('Button component', () => {
  it('renders with default props', () => {
    render(<Button>Click Me</Button>);
    const button = screen.getByRole('button', { name: /click me/i });
    expect(button).toHaveClass('root primary');
  });

  it('renders with "secondary" and "tertiary" variants', () => {
    render(<Button variant="secondary">Secondary Button</Button>);
    const buttonSecondary = screen.getByRole('button', { name: /secondary button/i });
    expect(buttonSecondary).toHaveClass('root secondary');

    render(<Button variant="tertiary">Tertiary Button</Button>);
    const buttonTertiary = screen.getByRole('button', { name: /tertiary button/i });
    expect(buttonTertiary).toHaveClass('root tertiary');
  });

  it('renders with an icon', () => {
    render(<Button icon={MockIcon}>Icon Button</Button>);
    const button = screen.getByRole('button', { name: /icon button/i });
    const icon = screen.getByRole('img', { hidden: true });
    expect(icon).toBeInTheDocument();
    expect(icon).toHaveClass('icon');
  });

  it('does not render icon when icon prop is not provided', () => {
    render(<Button>No Icon</Button>);
    expect(screen.queryByRole('img')).not.toBeInTheDocument();
  });

  it('forwards ref to the button element', () => {
    const ref = React.createRef<HTMLButtonElement>();
    render(<Button ref={ref}>Ref Button</Button>);
    expect(ref.current).toBeInstanceOf(HTMLButtonElement);
  });

  it('spreads additional props onto the button element', () => {
    render(<Button data-testid="custom-prop">Custom Props Button</Button>);
    const button = screen.getByTestId('custom-prop');
    expect(button).toBeInTheDocument();
  });

  it('accepts and merges className prop', () => {
    render(<Button className="custom-class">Custom Class Button</Button>);
    const button = screen.getByRole('button', { name: /custom class button/i });
    expect(button).toHaveClass('root primary custom-class');
  });

  it('can handle click events', async () => {
    const onClick = jest.fn();
    render(<Button onClick={onClick}>Click Me</Button>);
    const button = screen.getByRole('button', { name: /click me/i });
    await userEvent.click(button);
    expect(onClick).toHaveBeenCalledTimes(1);
  });
});

Explanation:

  • Default props test: Verifies that the Button renders with the primary variant by default.
  • Variant tests: Verifies the correct classes for secondary and tertiary variants.
  • Icon test: Ensures that the icon is rendered correctly with the appropriate class and accessibility attributes.
  • No icon test: Ensures the absence of the icon if the prop isn't passed.
  • Ref forwarding test: Checks if the ref is forwarded correctly to the button element.
  • Props spreading test: Verifies that arbitrary props (like data-testid) are correctly passed to the button.
  • className merging test: Ensures that the className prop is merged with internal classes.
  • Click handler test: Simulates a user click and ensures that the onClick handler is called.

These tests should provide 100% coverage for the Button component.


So what were our AI overlords missing in the first prompt? Looks like testing the different variants, adding a className, and verifying one can spread properties onto it. Not 100% coverage out of the box, but not bad, ChatGPT, not bad.

And if you're really interested in maximizing the coverage, it seems totally capable of doing so with another prompt, as seen directly above. And if you can't use GenAI, filling in these test gaps manually would take maybe 5-10 minutes.

Complex Scenarios

"But this super edge case is so hard to test, why should I write a test for it?"

If it's an edge case you're worried about, that's exactly why testing it is so crucial. Without a test, you're far more likely to see regressions. Additionally, without proper testing, it's easy to lose track of why a specific piece of code behaves the way it does. Writing a test captures that valuable context in a self-documenting way, while also ensuring we don't regress as changes are made.

With tools like sinon, you can simulate any state you need for your component, no matter how complex. If your tests are getting extremely complex, it may be a sign that your component API needs some work. Would composition help break things up into meaningful pieces? Maybe it takes a bit more time to get the last 10% of code coverage, but if you can cover all known flows, it will provide a confidence boost for your team as things change in the future.
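Sinon is one way to get stubs and fakes, but the underlying idea is simple enough to sketch without it: make the hard-to-trigger dependency injectable so the edge case can be driven directly. A minimal sketch, using a hypothetical copy-to-clipboard helper (not from any real component):

```typescript
// Hypothetical: clipboard access can fail in ways that are hard to reproduce
// in a real browser, so the writer is injected and can be faked in tests.
type Writer = (text: string) => void;

export function copyWithFallback(text: string, write: Writer): 'copied' | 'failed' {
  try {
    write(text);
    return 'copied';
  } catch {
    // The edge case: access denied. Surface a state instead of throwing.
    return 'failed';
  }
}
```

In a test, `write` becomes a stub that throws, and the 'failed' branch — the edge case you were worried about — is one line of setup away.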

Why Tests Don't Get Written

From my experience, tests don't get written due to inexperience, lack of time, or pressure to deliver features. It's easier to only write the feature code in the moment. It takes more time to write tests. But tests are just as important as the application code.

If you're finding you need to skip writing tests due to time or other pressures, it's important to bring it up with your manager. Increase task estimates to account for the complexity of testing. Set up a strong testing culture within your team. Start tracking the number of bugs before and after having tests to present a case to management that testing is important. Bring up cases where tests saved your bacon. It's up to you and your team to uphold a strong testing culture, despite external pressures. You can do it! I believe in you!

Don't Wait

If you're setting up a new codebase, it's important to start with a strong testing culture. Don't wait to get tests added until some future point. Right after you set up the repository, immediately set up a testing environment and framework. As time progresses, old code rarely sees tests written. It's too late at that point. By having a testing framework in place from the beginning and enforcing code coverage metrics, it'll set the standard from day one.
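As a sketch of what enforcing coverage from day one can look like — assuming Jest as the runner here; Vitest has an equivalent coverage thresholds option — the threshold lives right in the test config, and CI fails whenever coverage dips below it:

```javascript
// jest.config.js — a minimal sketch, assuming Jest and a src/ layout.
module.exports = {
  testEnvironment: 'jsdom',
  // Count every source file toward coverage, not just the ones with tests.
  collectCoverageFrom: ['src/**/*.{ts,tsx}', '!src/**/*.stories.tsx'],
  coverageThreshold: {
    global: {
      branches: 100,
      functions: 100,
      lines: 100,
      statements: 100,
    },
  },
};
```

Run with `jest --coverage` and the build breaks the moment an untested path lands.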

100% Coverage Doesn't Mean You'll Have 0 Bugs

One argument on the internet against 100% test coverage is that the coverage metric doesn't mean your code is bug free. Of course it doesn't! We're all humans writing code — we make mistakes. Even AI makes mistakes writing code — just ask ChatGPT to write something pretty complex for you. It will likely make a few mistakes too, or not account for certain cases that are specific to your business needs.

I find this argument against the 100% coverage metric a bit silly. You obviously can still have logic bugs with 100% test coverage. You likely have a bug because you didn't account for a particular flow. When this happens, in my experience, it means you need to write more code in your component to handle that case, which, guess what, means you need to write more tests. So you'll fix the issue and write meaningful tests for this new path. And your tests will likely get you right back up to 100% coverage.

100% test coverage doesn't mean "full, complete coverage" — it means that every line of code is accounted for in some way. It's a metric. People value metrics differently. But once again, the real key to all of this is writing meaningful tests.

Give it a Shot

Teams rely on you and your Design System to sweat the details and provide high quality, well tested code. Let's not let them down!

While achieving high test coverage requires upfront investment in time and resources, the payoff in reliability, maintainability, and developer confidence makes it a worthwhile endeavor. You're likely already writing meaningful tests, and if your Design System focuses on atoms and molecules, achieving 100% test coverage is a lot more achievable than you might think! The next time you start a new Design System, try setting the coverage threshold to 100% for a few months and see what you think. Do you feel more confident in your codebase? If not, at least you gave it a try to see what it'd be like!

If you've made it this far, I thank you for sticking with me. Have a good one!

👋