AI assistants are only as good as their reliability. We are building Ink'd, an AI real estate assistant that automates contract workflows. To win trust, it must be both accurate and fast.
We’re hiring a QA Lead who will specialize in analyzing conversation traces, identifying breakdowns, and building automated evaluations to ensure the AI performs at its best.
What You’ll Do
- Review conversation traces to detect data extraction failures, misunderstandings, and latency issues.
- Lead both manual QA processes and automated evaluation pipelines.
- Build frameworks for LLM-as-judge evaluations and automated test cases.
- Create dashboards and metrics to track AI performance (accuracy, latency, user satisfaction proxies).
- Collaborate with engineering to proactively resolve recurring issues.
What We’re Looking For
- 5+ years of QA experience (AI/NLP preferred).
- Strong skills in both manual and automated QA.
- Familiarity with Python scripting and test automation.
- Ability to create metrics dashboards (Grafana, Kibana, or custom-built).
- Analytical, detail-oriented, and passionate about reliability.
Bonus Points
- Familiarity with LangChain/LangGraph tracing tools.
- Experience with synthetic data generation for QA.
- Background in human-in-the-loop evaluation systems.