Spec-Driven Development with OpenSpec: Write the Spec, AI Writes the Code

Software development has always been a translation problem. A business stakeholder articulates a requirement in plain language; a developer interprets it, writes a specification (if the team is disciplined), then translates that spec into code — line by painstaking line. Every translation step is a potential source of drift, ambiguity, and rework. In large enterprises, that drift compounds: requirements evolve, codebases grow complex, and the gap between what was intended and what was built widens with every sprint.

Spec-driven development has long been positioned as the antidote. Define behaviour precisely upfront; let the implementation follow from the spec. But historically, writing specs was slow, maintaining them was painful, and actually generating code from them was largely theoretical. That era is over. AI — in particular, large language models trained on vast codebases — has finally made specification-to-code a practical, production-ready workflow. At Infonex, we call this approach OpenSpec, and it is the backbone of how we help enterprises achieve development cycles that are up to 80% faster.

What Is Spec-Driven Development in an AI Context?

Traditional spec-driven development revolves around documents: Functional Requirements Specifications (FRS), API contracts in OpenAPI/Swagger, or architecture decision records (ADRs). These are valuable artefacts — but they are passive. They inform humans; they don't drive machines.

AI-era spec-driven development flips this dynamic. A well-structured specification becomes an active prompt: a precise, machine-readable (and human-readable) description of behaviour that an LLM can use to generate, validate, and iterate on code. The spec is the source of truth. The code is an output — not the other way around.

OpenSpec, as practised at Infonex, extends this idea with a structured schema that captures:

  • Intent — what the feature or module must accomplish
  • Inputs and outputs — typed interfaces, edge cases, and validation rules
  • Constraints — performance targets, security requirements, compliance boundaries
  • Context — how this component fits into the broader codebase

With that context in hand, an AI code generation pipeline can produce not just boilerplate, but contextually correct business logic — code that is aware of your data models, your naming conventions, and your team's existing patterns.

From Specification to Running Code: A Concrete Example

Consider a common enterprise scenario: adding a new REST endpoint to process invoice approvals, with business rules around approval tiers, audit logging, and notification dispatch. Traditionally, a mid-level engineer would spend two to three days scaffolding the controller, service layer, repository calls, and tests. With an OpenSpec workflow, that collapses to hours.

A simplified OpenSpec definition might look like this:

# openspec: invoice-approval-endpoint
intent: >
  Expose a POST /invoices/{id}/approve endpoint that validates the requesting
  user's approval tier against the invoice amount, records an audit log entry,
  and dispatches an email notification to the finance team on success.

inputs:
  - invoiceId: UUID (path param)
  - approverId: UUID (from JWT claims)
  - notes: string (optional, max 500 chars)

outputs:
  - 200: ApprovalRecord { id, invoiceId, approverId, timestamp, status }
  - 403: Insufficient approval tier for invoice amount
  - 404: Invoice not found

constraints:
  - Approval tiers: L1 (<= $10,000), L2 (<= $100,000), L3 (unlimited)
  - Audit log must be written atomically with status update (single transaction)
  - Notification must be async (do not block HTTP response)
  - Response time P95 < 300ms

context:
  framework: Spring Boot 3.x
  orm: JPA/Hibernate, PostgreSQL
  messaging: AWS SQS for async notifications
  existing_patterns: See InvoiceService.java, AuditLogRepository.java

Feed this into an OpenSpec-aware AI pipeline with access to the existing codebase via retrieval-augmented generation (RAG), and the output is a complete, compilable Spring Boot controller, service method, repository call, SQS message dispatch, and JUnit test suite — all aligned with the team's existing code style.
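To make that concrete, here is a minimal sketch of the kind of tier-validation logic such a pipeline might generate from the spec's constraints block. The names (ApprovalTierCheck, canApprove) are illustrative stand-ins, not actual OpenSpec output, and the real generated service would surface a 403 rather than return a boolean:

```java
import java.math.BigDecimal;

public class ApprovalTierCheck {

    // L1 <= $10,000; L2 <= $100,000; L3 unlimited, per the spec's constraints.
    public enum ApprovalTier {
        L1(new BigDecimal("10000")),
        L2(new BigDecimal("100000")),
        L3(null); // null limit = unlimited

        private final BigDecimal limit;
        ApprovalTier(BigDecimal limit) { this.limit = limit; }

        public boolean covers(BigDecimal amount) {
            return limit == null || amount.compareTo(limit) <= 0;
        }
    }

    // True when the approver's tier is sufficient for the invoice amount.
    public static boolean canApprove(ApprovalTier tier, BigDecimal invoiceAmount) {
        return tier.covers(invoiceAmount);
    }

    public static void main(String[] args) {
        System.out.println(canApprove(ApprovalTier.L1, new BigDecimal("9500")));    // true
        System.out.println(canApprove(ApprovalTier.L1, new BigDecimal("25000")));   // false
        System.out.println(canApprove(ApprovalTier.L3, new BigDecimal("5000000"))); // true
    }
}
```

Note how little interpretation is left to the generator: the tier boundaries come verbatim from the constraints section, which is exactly what makes spec precision pay off.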

GitHub's own research on Copilot found that developers completed a benchmark task roughly 55% faster with AI assistance in a controlled experiment. OpenSpec workflows, which provide richer context and more precise constraints, consistently push that number higher — Infonex clients regularly report time savings exceeding 70–80% on well-specced features.

Why Context Is Everything: RAG and Codebase Awareness

The gap between a generic AI code generator and a truly useful one is context. Generic tools like vanilla ChatGPT or early Copilot models produce plausible-looking code that frequently clashes with the target codebase — wrong naming conventions, missing dependency injection patterns, no awareness of existing abstractions.

The solution is Retrieval-Augmented Generation (RAG) applied at the codebase level. Before generating code, the AI retrieves relevant excerpts from your actual repository: similar service classes, existing utility functions, your ORM entity definitions, your error-handling conventions. The LLM then generates code that fits — not code that merely compiles, but code a senior engineer would recognise as belonging to the project.

This is a core capability Infonex builds into every OpenSpec deployment. We index client codebases using embedding models (typically text-embedding-3-large from OpenAI or equivalent), store vectors in a retrieval layer such as pgvector or Pinecone, and inject the top-k relevant chunks into every generation prompt alongside the spec. The result is AI output that respects your architecture — not a foreign import that requires heavy rework.
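The retrieval step at the heart of this can be sketched in a few lines of plain Java. In production the vectors come from an embedding model and live in pgvector or Pinecone; here they are hard-coded toy vectors (and the file names are hypothetical) so the ranking logic is visible on its own:

```java
import java.util.Comparator;
import java.util.List;

public class CodebaseRetriever {

    // A chunk of indexed source code plus its embedding vector.
    record Chunk(String path, double[] embedding) {}

    // Cosine similarity between two embedding vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Rank indexed chunks by similarity to the query embedding, keep top k.
    static List<Chunk> topK(double[] query, List<Chunk> index, int k) {
        return index.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> -cosine(query, c.embedding())))
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        List<Chunk> index = List.of(
                new Chunk("InvoiceService.java",     new double[]{0.9, 0.1, 0.0}),
                new Chunk("AuditLogRepository.java", new double[]{0.7, 0.3, 0.1}),
                new Chunk("UserController.java",     new double[]{0.0, 0.2, 0.9}));

        double[] specEmbedding = {0.8, 0.2, 0.05}; // stand-in for embed(spec text)
        for (Chunk c : topK(specEmbedding, index, 2)) {
            System.out.println(c.path()); // these chunks get injected into the prompt
        }
    }
}
```

A vector database does the same ranking at scale with approximate nearest-neighbour indexes; the principle — spec in, most-relevant code chunks out, both into the prompt — is unchanged.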

Spec-First Is Also Test-First

One of the underappreciated benefits of spec-driven development with AI is what it does for your test coverage. When the specification precisely defines inputs, outputs, edge cases, and constraints, generating a comprehensive test suite becomes almost trivial for the AI.

In a traditional workflow, test coverage is often an afterthought — written after the implementation, shaped by what was built rather than what was intended. Spec-first inverts this: the tests are derived from the spec, and they validate the implementation against the original intent, not the developer's interpretation of it.

Using frameworks like Testcontainers for integration tests or property-based testing libraries like jqwik (Java) or Hypothesis (Python), AI-generated test suites can cover boundary conditions and negative paths that human developers routinely miss under deadline pressure. Infonex clients have reported up to 40% reductions in post-release defect rates after adopting OpenSpec workflows — a direct consequence of tighter, spec-derived test coverage.
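As an illustration, here is a property check in the style jqwik encourages, written in dependency-free Java so it runs on its own. The property is derived directly from the example spec's constraints: approval must be monotone in tier (if a lower tier may approve an amount, every higher tier may too), and the exact $10,000 boundary is exercised explicitly — the kind of case human-written suites often skip:

```java
import java.math.BigDecimal;
import java.util.Random;

public class ApprovalMonotonicityProperty {

    // Tier limits from the spec: index 0=L1, 1=L2, 2=L3 (null = unlimited).
    static final BigDecimal[] LIMITS = {
        new BigDecimal("10000"), new BigDecimal("100000"), null
    };

    static boolean canApprove(int tier, BigDecimal amount) {
        BigDecimal limit = LIMITS[tier];
        return limit == null || amount.compareTo(limit) <= 0;
    }

    // Spec-derived property: approval is monotone in tier, checked over
    // random amounts plus the explicit boundary values.
    public static boolean monotoneOverRandomAmounts(long seed, int trials) {
        Random rng = new Random(seed);
        for (int i = 0; i < trials; i++) {
            // Random amount in [0, $200,000) with cent precision.
            BigDecimal amount = BigDecimal.valueOf(rng.nextInt(20_000_000), 2);
            for (int tier = 0; tier < 2; tier++) {
                if (canApprove(tier, amount) && !canApprove(tier + 1, amount)) {
                    return false; // higher tier rejected what a lower tier allowed
                }
            }
        }
        // Boundary cases: exactly at the L1 limit, and one cent over.
        return canApprove(0, new BigDecimal("10000.00"))
            && !canApprove(0, new BigDecimal("10000.01"))
            && canApprove(2, new BigDecimal("99999999.99"));
    }

    public static void main(String[] args) {
        System.out.println(monotoneOverRandomAmounts(7L, 1_000));
    }
}
```

With jqwik itself, the random-amount loop collapses to an @Property method over an Arbitrary of amounts, and the framework handles shrinking failing cases to a minimal counterexample.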

Adoption Strategy: How to Introduce OpenSpec in Your Organisation

The most common mistake enterprises make is attempting a big-bang adoption. Spec-driven AI development is a workflow change as much as a tooling change, and teams need time to calibrate what a "good spec" looks like for their domain.

Infonex recommends a phased approach:

  1. Pilot on greenfield modules. Choose a new service or feature — not a migration of existing spaghetti — and build the entire workflow end-to-end with OpenSpec. This gives the team a clean success story without legacy debt interference.
  2. Instrument and measure. Track time from spec sign-off to deployable PR. Track defect rates per feature. Establish a baseline so ROI is quantifiable for stakeholders.
  3. Invest in spec quality. The biggest determinant of AI output quality is spec precision. Invest in tooling and training that helps engineers and architects write better specs — this pays compounding dividends.
  4. Expand codebase RAG coverage. As more of your codebase is indexed, the AI's contextual awareness improves, and so does output quality. Treat the vector index as a living asset.
  5. Scale to existing codebase refactoring. Once the workflow is mature, apply it to legacy modernisation — spec out the desired behaviour of existing modules, generate modernised implementations, use the spec-derived tests as a safety net.

The Bottom Line for Technology Leaders

Spec-driven development with AI is not a future capability — it is available today, and enterprises that adopt it are compounding a significant competitive advantage. The combination of precise specifications, RAG-powered codebase awareness, and modern LLMs transforms development velocity in a way that no amount of additional headcount can replicate.

Infonex has implemented these workflows at scale for enterprise clients including Kmart and Air Liquide, delivering development cycles that run 80% faster than traditional approaches without sacrificing code quality or architectural integrity. The technology exists. The methodology is proven. The question for CTOs and Engineering Managers is no longer whether to adopt AI-accelerated development — it's how fast they can get there.


Ready to See What OpenSpec Can Do for Your Team?

Infonex offers a free consulting session to help enterprise technology leaders understand how spec-driven AI development applies to their specific codebase, team, and delivery challenges. Whether you're modernising a legacy platform, scaling a new product, or looking to reduce your time-to-production across the board, our team brings deep, hands-on expertise in AI-accelerated development, RAG pipeline design, and OpenSpec implementation.

Clients like Kmart and Air Liquide have already seen what 80% faster development cycles look like in practice. Let's talk about what that means for your organisation.

Book your free AI consulting session at infonex.com.au →
