TDD-friendly

BDD and TDD are often presented as if they live in different rooms.

They don't.

Red → green → refactor works just as well at the scenario level as it does at the unit-test level. But there is one important condition: your tooling has to give you a real red test whenever you describe work that is not done yet.

Not a vague pending state.

Not a snippet you need to paste somewhere.

Not a scenario hidden behind a @wip filter and quietly ignored by the build.

A red test.

SpecBinder gives you that, automatically, at every level of incompleteness.

The missing red state in many BDD automation workflows

In a Cucumber-style workflow, getting a useful red state often takes more coordination than you might expect.

You can write a scenario before the implementation exists, of course. That part is easy. But what happens next depends on runtime step discovery, step-definition matching, configuration, and sometimes team convention.

An empty scenario? It may run as pending, not red. You see something like cucumber.api.PendingException (io.cucumber.java.PendingException in more recent Cucumber-JVM releases) and have to remember it means "not failing, exactly, but you have work to do."
An undefined step? It prints a snippet for you to paste somewhere and marks the scenario as undefined. Whether this fails your build may depend on strict mode, runner configuration, or other flags.
A scenario you sketched out but do not want to run yet? It gets a @wip tag and is filtered out of the build entirely.

All of these behaviours are understandable. Cucumber has to work across many languages, runners, and automation styles.

None of these is quite the same as "a failing test for work you intend to do." The result is that the TDD loop tends to break down for high-level BDD: you write a scenario, you don't get a red test, you go off and build the implementation, then you come back and wire up the steps.

That can work.

But it is not the clean red → green → refactor loop many developers expect from TDD.

How SpecBinder keeps the loop alive

SpecBinder takes a different route.

It treats every incomplete part of the Gherkin file as something that can become an ordinary failing JUnit test.

There is no special runtime discovery phase that needs to figure out whether a step exists. The feature file is processed at compile time, and the generated JUnit code makes the current state of the specification executable.

That means incomplete work does not have to disappear into a pending category.

It can be red.

Very red.

Nice, honest, test-runner red.

Start with rules

You can begin with a feature file that contains only rule titles:

Feature: Shopping cart totals and shipping

  Rule: Cannot checkout with an empty cart

  Rule: Free shipping applies when subtotal is at least €50

  Rule: Quantity changes update the subtotal

At this point, there are no scenarios yet.

In many workflows, this is just a sketch. Useful for discussion, but not executable.

With SpecBinder, it already produces test pressure.

When the project is compiled, the processor generates a nested JUnit class for each rule. Because each rule is empty, each generated rule class contains a failing placeholder test:

@Nested
@DisplayName("Rule: Cannot checkout with an empty cart")
class Rule_1 {

    @Test
    @Tag("new")
    void noScenariosInRule() {
        Assertions.fail("Rule doesn't have any scenarios");
    }
}

So the moment you list three rules, you get three red tests.

Not because the application is broken.

Because the specification says there is work to do.

That is exactly what you want at the start of an outside-in TDD loop.

Turn rules into scenarios

Next, pick one rule and add scenario titles.

Still no steps. Just the behaviours you want to cover.

Rule: Quantity changes update the subtotal

  Scenario: Increasing quantity raises the subtotal
  
  Scenario: Decreasing quantity lowers the subtotal
  
  Scenario: Removing the last unit empties the cart

Now the rule is no longer empty, so the noScenariosInRule() placeholder disappears.

Instead, each empty scenario becomes its own failing JUnit test:

@Test
@Tag("new")
@DisplayName("Scenario: Increasing quantity raises the subtotal")
public void scenario_1() {
    Assertions.fail("Scenario has no steps");
}

The feedback just became more specific.

Before, the rule was red because it had no scenarios.

Now the individual scenarios are red because they have no steps.

That may sound small, but it matters.

You can move through the feature top-down: first rules, then scenario titles, then steps, then implementation. The test suite follows you at each level.

It keeps pointing at the next smallest missing piece.
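To make "the next smallest missing piece" concrete, here is a minimal sketch of how a tool could walk a feature file and list the placeholders it would need. It is not SpecBinder's implementation: the class name FeatureOutlineSketch and the method missingPieces are hypothetical, the line-prefix parsing is a simplification that ignores most of Gherkin, and only the two failure messages are taken from the generated code shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of SpecBinder.
final class FeatureOutlineSketch {

    private final List<String> missing = new ArrayList<>();
    private String rule;              // rule currently being read, or null
    private boolean ruleHasScenario;
    private String scenario;          // scenario currently being read, or null
    private boolean scenarioHasStep;

    /** Returns one entry per empty rule or empty scenario found in the feature text. */
    static List<String> missingPieces(String feature) {
        FeatureOutlineSketch sketch = new FeatureOutlineSketch();
        for (String raw : feature.split("\\R")) {
            sketch.accept(raw.trim());
        }
        sketch.closeScenario();
        sketch.closeRule();
        return sketch.missing;
    }

    private void accept(String line) {
        if (line.startsWith("Rule:")) {
            closeScenario();
            closeRule();
            rule = line;
            ruleHasScenario = false;
        } else if (line.startsWith("Scenario:")) {
            closeScenario();
            scenario = line;
            scenarioHasStep = false;
            ruleHasScenario = true;
        } else if (line.matches("(Given|When|Then|And|But) .*")) {
            if (scenario != null) {
                scenarioHasStep = true;
            }
        }
    }

    // A scenario that ends without a single step needs a failing placeholder.
    private void closeScenario() {
        if (scenario != null && !scenarioHasStep) {
            missing.add(scenario + " -> Scenario has no steps");
        }
        scenario = null;
    }

    // A rule that ends without a single scenario needs a failing placeholder.
    private void closeRule() {
        if (rule != null && !ruleHasScenario) {
            missing.add(rule + " -> Rule doesn't have any scenarios");
        }
        rule = null;
    }
}
```

Calling FeatureOutlineSketch.missingPieces(...) on a feature with two empty rules and one rule that holds a filled-in scenario plus an empty one returns three entries: one per empty rule and one for the empty scenario. Finished pieces produce nothing, which mirrors how the red tests narrow down as you fill the file in.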

Turn scenarios into executable pressure

Now fill in one scenario.

Scenario: Increasing quantity raises the subtotal
  Given my cart contains "Wireless Headphones" with quantity "1" and unit price "60.00"
  When I change the quantity to "2"
  Then my cart subtotal is "120.00"

Compile again.

The generated test is no longer a placeholder. It now calls generated step methods in the order written in the scenario:

@Test
@Order(1)
@DisplayName("Scenario: Increasing quantity raises the subtotal")
public void scenario_1() {
    myCartContains$p1WithQuantity$p2AndUnitPrice$p3(
            "Wireless Headphones", 1, 60.00);
    iChangeTheQuantityTo$p1(2);
    myCartSubtotalIs$p1(120.00);
}

If you have not implemented those step methods yet, SpecBinder generates failing stubs:

public void myCartContains$p1WithQuantity$p2AndUnitPrice$p3(
        String p1, Integer p2, Double p3) {
    Assertions.fail("Step is not yet implemented");
}

Still red.

But now it is red in a much more useful way.

The failing test is tied to a concrete scenario, a concrete step, and concrete arguments.

The generated code tells you exactly what the scenario wants to call.
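The method names above follow a visible pattern: the Gherkin keyword is dropped, each quoted argument becomes $p1, $p2, and so on, and the remaining words are camel-cased. The sketch below reproduces that pattern for the three steps in this scenario. It is inferred from the examples on this page, not taken from SpecBinder itself. The class StepNameSketch, its two methods, and the rule "digits mean Integer, digits with a decimal point mean Double, anything else is a String" are assumptions, and the real naming and typing rules may differ (punctuation, for example, is not handled here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of SpecBinder.
final class StepNameSketch {

    private static final Pattern KEYWORD = Pattern.compile("^(Given|When|Then|And|But)\\s+");
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Turns a step line into a method name such as iChangeTheQuantityTo$p1. */
    static String methodName(String stepLine) {
        String text = KEYWORD.matcher(stepLine.trim()).replaceFirst("");

        // Replace each quoted argument with a numbered placeholder token.
        Matcher quoted = QUOTED.matcher(text);
        StringBuilder spaced = new StringBuilder();
        int last = 0;
        int index = 0;
        while (quoted.find()) {
            index++;
            spaced.append(text, last, quoted.start())
                  .append(" $p").append(index).append(' ');
            last = quoted.end();
        }
        spaced.append(text, last, text.length());

        // Camel-case the words; placeholder tokens are copied unchanged.
        StringBuilder name = new StringBuilder();
        for (String word : spaced.toString().trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (word.startsWith("$p")) {
                name.append(word);
            } else {
                char first = word.charAt(0);
                name.append(name.length() == 0
                                ? Character.toLowerCase(first)
                                : Character.toUpperCase(first))
                    .append(word, 1, word.length());
            }
        }
        return name.toString();
    }

    /** Guesses a Java type name for each quoted argument (assumed rule, see above). */
    static List<String> argumentTypes(String stepLine) {
        List<String> types = new ArrayList<>();
        Matcher quoted = QUOTED.matcher(stepLine);
        while (quoted.find()) {
            String value = quoted.group(1);
            if (value.matches("-?\\d+")) {
                types.add("Integer");
            } else if (value.matches("-?\\d+\\.\\d+")) {
                types.add("Double");
            } else {
                types.add("String");
            }
        }
        return types;
    }
}
```

Under these assumptions, the Given step of the scenario maps to myCartContains$p1WithQuantity$p2AndUnitPrice$p3 with argument types String, Integer, Double, which matches the stub signature shown above.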

Implement only what the scenario asks for

Now open your marker class and implement the step methods you need.

@Gherkin2JUnit("specs/cart.feature")
public abstract class CartFeature {

    private Cart cart;

    public void myCartContains$p1WithQuantity$p2AndUnitPrice$p3(
            String name, Integer quantity, Double unitPrice) {
        this.cart = new Cart();
        this.cart.add(new Item(name, quantity, unitPrice));
    }

    public void iChangeTheQuantityTo$p1(Integer quantity) {
        this.cart.changeQuantity("Wireless Headphones", quantity);
    }

    public void myCartSubtotalIs$p1(Double expectedSubtotal) {
        Assertions.assertEquals(expectedSubtotal, this.cart.subtotal());
    }
}

Once those methods exist in the parent class, the generated class stops providing failing stubs for them.

The scenario now calls your implementation through ordinary Java inheritance.
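The mechanism is plain Java and can be tried without any tooling. In the sketch below, QuantitySteps plays the role of the hand-written marker class and QuantityStepsGenerated stands in for the generated subclass. Both class names and the currentQuantity() helper are made up for this illustration, and the stub throws AssertionError directly instead of calling Assertions.fail so that the example needs no JUnit dependency.

```java
// Hypothetical stand-in for a hand-written marker class.
abstract class QuantitySteps {

    private int quantity = 1;

    // Implemented by hand, so the subclass does not need a stub for it.
    public void iChangeTheQuantityTo$p1(Integer quantity) {
        this.quantity = quantity;
    }

    public int currentQuantity() {
        return quantity;
    }
}

// Hypothetical stand-in for the generated class.
final class QuantityStepsGenerated extends QuantitySteps {

    // The parent declares no method with this name, so a failing stub is emitted.
    public void myCartSubtotalIs$p1(Double expectedSubtotal) {
        throw new AssertionError("Step is not yet implemented");
    }

    // The first call resolves to the inherited method, the second to the stub.
    public void scenario_1() {
        iChangeTheQuantityTo$p1(2);
        myCartSubtotalIs$p1(120.00);
    }
}
```

Running scenario_1() executes the inherited step and then fails in the stub with "Step is not yet implemented". Once the parent class declares myCartSubtotalIs$p1 as well, a generator following this approach would simply leave the stub out, and the same call would reach the hand-written method.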

At this point, the test may still fail.

Good.

Now it fails for the right reason: the production code does not exist yet, or it does not behave correctly.

So you write just enough of Cart, Item, and the subtotal logic to make the scenario pass.
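As an illustration, "just enough" for this one scenario could look like the sketch below. The document does not show the real Cart and Item classes, so everything here is an assumption: the field names, the lookup by item name, and the use of double for prices, which simply follows the Double parameters in the step methods (a real shop would more likely use BigDecimal).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical production code, sized to the first scenario only.
final class Item {

    private final String name;
    private int quantity;
    private final double unitPrice;

    public Item(String name, int quantity, double unitPrice) {
        this.name = name;
        this.quantity = quantity;
        this.unitPrice = unitPrice;
    }

    public String name() {
        return name;
    }

    public void setQuantity(int quantity) {
        this.quantity = quantity;
    }

    public double lineTotal() {
        return quantity * unitPrice;
    }
}

final class Cart {

    // Items keyed by name, kept in insertion order.
    private final Map<String, Item> itemsByName = new LinkedHashMap<>();

    public void add(Item item) {
        itemsByName.put(item.name(), item);
    }

    public void changeQuantity(String name, int quantity) {
        Item item = itemsByName.get(name);
        if (item == null) {
            throw new IllegalArgumentException("No such item in cart: " + name);
        }
        item.setQuantity(quantity);
    }

    // Sum of quantity times unit price over all items.
    public double subtotal() {
        double sum = 0.0;
        for (Item item : itemsByName.values()) {
            sum += item.lineTotal();
        }
        return sum;
    }
}
```

Nothing here handles removing items or emptying the cart. Those behaviours belong to the scenarios that are still red, and they get written when those scenarios get their steps.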

Red.

Green.

Then refactor.

The loop is still the loop.

It just started from Gherkin instead of from a hand-written unit test.

Move one scenario at a time

The other scenarios under the rule are still red.

They are waiting their turn.

Scenario: Decreasing quantity lowers the subtotal

Scenario: Removing the last unit empties the cart

That is useful pressure.

You do not have to keep a separate checklist of scenario ideas you still need to automate. They are already in the test suite, already visible, and already failing.

Pick the next scenario. Add steps. Implement the missing step methods. Add the production behaviour needed to make it pass.

Then move again.

When all scenarios under the rule are green, the rule is done.

Move to the next rule.

Why incomplete work gets a tag

Every empty rule and every empty scenario gets a JUnit tag by default:

@Tag("new")

That gives you control over how unfinished specification work behaves in your build.

For example, you can run only the unfinished work:

mvn test -Dgroups=new

Or you can exclude it from a release build:

mvn test -DexcludedGroups=new

The default tag name can be configured:

@Gherkin2JUnitOptions(tagForEmptyScenarios = "wip")

Or disabled by passing an empty string.

The important part is not the specific tag name.

The important part is that incomplete work is visible at the JUnit level.

It is not hidden in a custom runtime status. It is not only a console message. It is not something your team has to remember to check manually.

It is a test.

And tests are very good at being noticed when they fail.

Annoyingly good, sometimes.

Which is exactly why we like them.

A short workflow summary

A TDD-friendly SpecBinder workflow can be very simple:

1. Sketch rules. One red test per empty rule.
2. Sketch scenarios under one rule. One red test per empty scenario, more specific.
3. Add steps to one scenario. Red, with step names.
4. Implement steps in the marker class. Red, with real arguments hitting your production code.
5. Implement production code. Green.
6. Next scenario.

At every stage, the test suite is in a runnable, meaningful state: green where you've finished, red where you haven't, with the failure pointing at the smallest piece of work you could do next.

That is the real value.

That is TDD.

It just happens to start at the Gherkin layer.
