Your playing field
You will own quality for the EverReal platform as we scale — and you'll do it as an operator of AI, not a writer of scripts. The model we're building toward: AI implements and runs the tests; you direct it and own the judgment. You won't spend your days reviewing test code line by line — increasingly that's automated, and you look at the results. Your judgment goes where it matters:
- Defining behaviour in clear, correct Gherkin scenarios — the source of truth for what "working" means.
- Reviewing the test cases AI writes (Gherkin and manual) to make sure they're actually right.
- Reading results to decide what's genuinely broken, what's flaky, and what's ready to ship.
AI is the tool; you are the operator. It removes the repetitive mechanics so your judgment scales. This is a deliberately high bar and a real growth path: you start by owning quality hands-on, then progressively push the mechanics onto AI while you keep the standards and the hard calls.
What you'll do
- Own quality and be the last gate. Nothing ships to production without your sign-off. Escalate early when timelines compress.
- Own manual testing & UAT. Exploratory, functional, regression and user-acceptance testing: you are the human eyes on the product before customers see it.
- Maintain the Gherkin scenarios. The living source of truth for how the product should behave; review the test cases AI generates for correctness.
- Operate the automated testing, don't hand-crank it. Direct AI to implement and run Playwright/E2E and API tests; judge by results and keep the suite trustworthy and green.
- Sign off every release. Run regression, give the go/no-go, post the release notes (AI drafts, you own them).
- Be the first line of defense for production. Validate, reproduce and route customer-reported issues fast, with AI.
- Triage results. Separate real bugs from flakes, find the root cause (with AI plus Datadog/Sentry), and surface only what needs a human call.
- Continuously automate your own work with AI. If you've done it manually twice, make AI do it the third time.
- Prove it with evidence. Clear bug reports with root cause, regression checklists, screenshots, logs and recordings — tests are the documentation.
- Be the product's second brain. Learn the product and users deeply; turn that into the right scenarios and the right questions.
The team you'll join
A small, focused Scrum team: five engineers, QA (you, alongside our current QA), and a product manager, all working closely with the CTO, who is still hands-on in engineering. QA is not a downstream function here: you're in refinement from the start, and your sign-off is a hard release gate.
How we work
- AI-native. Claude Code, agents and MCP are first-class infrastructure. Every step goes through AI first, then a person.
- Evidence-driven quality. Every acceptance criterion is proven, not assumed.
- Async-first. Deep-focus mornings are sacred; reviews are batched.
- High psychological safety. Praise in public, feedback in private; we talk about the system, not the person.
- Ownership. Concrete end goals with the context and autonomy to achieve them.