About Empirical Validation

The most important practice when working with Maestro is to never assume code works without running it. This sounds obvious, but the temptation is strong. Maestro produces clean, well-structured code that looks correct. The implementation seems right. It probably works. “Probably” is not engineering. Engineering demands proof. The sandbox exists precisely to provide it. Running the full test suite (not just targeted tests) after every change catches regressions in seemingly unrelated code. Partial testing is premature optimization — it saves minutes but risks hours of debugging later. The cost of running all tests is almost always worth the confidence it provides.
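The principle works with any test runner; a minimal sketch using Python's built-in unittest, where the TestCase is a hypothetical stand-in for a real project's suite:

```python
import unittest

class ArithmeticTests(unittest.TestCase):
    # Stand-in for a real project's test suite.
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_subtraction(self):
        self.assertEqual(5 - 3, 2)

def run_full_suite() -> bool:
    # Load and run *every* test, not a hand-picked subset, and report
    # the aggregate result -- the evidence that the change is safe.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArithmeticTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

The point is the aggregate pass/fail signal: a change is validated when the whole suite is green, not when the one targeted test passes.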

About Benchmark-Driven Development

Performance optimization without measurement is speculation. The human instinct for where performance bottlenecks lie is remarkably unreliable. Code that looks slow may execute in microseconds. Code that looks efficient may be the critical path. The discipline is simple: measure before changing, change one thing, measure after. Same test harness, same conditions, same methodology. The before/after comparison is evidence. Everything else is opinion. This discipline also prevents the common failure mode of “optimization” that makes things worse. Without a baseline, you cannot know if your change helped or hurt.
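The measure-change-measure loop can be sketched with Python's timeit, so both variants run under the same harness; the two implementations here are hypothetical placeholders for a "before" and "after":

```python
import timeit

def sum_with_loop(n: int) -> int:
    # "Before" implementation: explicit accumulation loop.
    total = 0
    for i in range(n):
        total += i
    return total

def sum_builtin(n: int) -> int:
    # Candidate "after" implementation: built-in sum over a range.
    return sum(range(n))

def benchmark(fn, n=10_000, repeat=5, number=100) -> float:
    # Same harness, same conditions for both variants:
    # best-of-repeat wall time per call, in seconds.
    return min(timeit.repeat(lambda: fn(n), repeat=repeat, number=number)) / number

before = benchmark(sum_with_loop)
after = benchmark(sum_builtin)
# The measured before/after ratio -- not intuition -- decides whether
# the change is kept.
```

Taking the minimum over several repeats reduces noise from other processes; whatever methodology you choose, the essential discipline is that both measurements use the identical one.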

About Test Quality

Test coverage is necessary but not sufficient. Tests that achieve 100% coverage but only check happy paths provide a false sense of security. Quality testing requires:
  • Edge cases: Empty inputs, maximum values, concurrent access, resource exhaustion
  • Error scenarios: Network failures, invalid data, timeouts, partial failures
  • Determinism: Tests must pass or fail consistently; flaky tests, which pass on some runs and fail on others, are worse than no tests
  • Independence: Tests that depend on execution order hide real bugs
The test-first pattern (writing tests before implementation) is particularly effective with Maestro because it establishes concrete, machine-verifiable success criteria before any code is written. Implementation then becomes a matter of making tests pass, with clear evidence of completion.
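A sketch of what these criteria look like in practice, written as plain assertions around a hypothetical parse_age helper; note the seeded RNG, which keeps the randomized case deterministic:

```python
import random

def parse_age(text: str) -> int:
    # Hypothetical function under test: parse a non-negative age.
    value = int(text.strip())
    if not 0 <= value <= 150:
        raise ValueError(f"age out of range: {value}")
    return value

# Happy path alone would give false confidence.
assert parse_age("42") == 42

# Edge cases: boundary values and surrounding whitespace.
assert parse_age("0") == 0
assert parse_age(" 150 ") == 150

# Error scenarios: invalid data must fail loudly, not silently.
for bad in ["", "abc", "-1", "151"]:
    try:
        parse_age(bad)
    except ValueError:
        pass
    else:
        raise AssertionError(f"expected ValueError for {bad!r}")

# Determinism: seed the RNG so the randomized case never flakes.
rng = random.Random(1234)
assert 0 <= parse_age(str(rng.randint(0, 150))) <= 150
```

Each assertion is also independent of the others' execution order, so a failure points at exactly one behavior.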

About Code Quality Standards

Before any code reaches a pull request, certain standards must be met. These are not bureaucratic requirements — they are the minimum threshold for code that other people (including your future self) can maintain.
  • Cleanup: Remove debugging statements, commented-out code, experimental code, and unused imports. These are noise that obscures signal.
  • Documentation: Update the README if structure changed. Add comments for non-obvious logic. Document why, not what — the code shows what it does; comments explain why it does it that way.
  • Testing: The full test suite passes. No skipped or disabled tests without explicit justification. Coverage meets project standards.
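The "why, not what" rule is easiest to see in a comment that records a design reason the code itself cannot express; a small illustrative sketch (the backoff policy here is hypothetical):

```python
import random

def retry_delay(attempt: int) -> float:
    # Why: exponential backoff with jitter prevents a thundering herd
    # when many clients fail and retry at the same moment. The formula
    # itself (the "what") needs no comment -- the rationale does.
    return 2 ** attempt + random.uniform(0, 1)
```

A comment restating the formula ("double the delay each attempt, add a random offset") would survive any refactor without ever helping a reader; the rationale is what cannot be recovered from the code.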

About Security as a Default

Security practices are not a phase of development — they are a property of every change. Input validation, injection prevention, secret management, and dependency auditing should be continuous, not periodic. The most insidious security failures are not dramatic exploits but quiet omissions: an endpoint without input validation, a dependency with a known vulnerability, a secret hardcoded “just for testing.” Maestro can systematically check for these issues, but only if you ask. Making security validation part of your standard completion criteria is more effective than periodic security audits.
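Injection prevention, for instance, is a property of every query rather than an audit item; a minimal sketch using Python's sqlite3 with parameterized queries instead of string formatting (the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user(name: str):
    # Safe: the driver binds the value, so attacker input stays data.
    # Never build this query with f-strings or % formatting.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchone()

# A classic injection payload is treated as a literal name, not as SQL.
assert find_user("alice") == ("admin",)
assert find_user("' OR '1'='1") is None
```

Making this the default pattern for every query is exactly the kind of continuous practice the section describes: there is no separate "validation phase" to forget.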

About Capacity Management as Practice

Capacity management is not emergency response — it is ongoing practice. Proactively managing memory and file context every 20-30 turns prevents the situation where capacity hits 100% and you are forced to make hasty decisions about what to keep. The pattern that works: after major milestones, use /synopsis to document state, /compact to summarize implementation details, and keep the synopsis plus current validated state. This maintains clean context for subsequent work.

About Session Discipline

The anti-patterns that waste the most time are:
  • Monster sessions that try to accomplish everything in one go. Capacity degrades, context becomes noisy, and quality suffers.
  • Vague requirements that force Maestro to guess. Clear constraints and success criteria on the first turn save dozens of correction turns later.
  • Accepting assertions instead of demanding evidence. “The implementation is performing well” should trigger your skepticism, not your satisfaction.
  • Not correcting misunderstandings immediately. Every turn Maestro works under a wrong assumption is a turn wasted. Correct early, correct specifically.

Further Reading