[[components/metabind-buttons-embed]]

Quick Info

quick-info-display

Notes

= this.notes

Interactions

2026-03-30

  • Reached out about unblocking the work (the evaluation distribution actions from 2026-03-27)
  • He said he had forgotten he was on support, but would look into it
  • Told him no worries and to feel free to prioritize support, no rush on my end

2026-03-27

  • Spent most of the day working together
  • Morning: investigated why the bump to Oval had degraded the CI pipeline - discovered that a connection from an upgraded MLflow client had upgraded the PostgreSQL DB schema, making it incompatible with the previous version
  • Wrote up a post-mortem on the incident and had Gabe review it
  • Met from 2pm through 4:45pm discussing evaluation consolidation
  • Gabe had questions about why the Miku client is being vendored
  • Gabe raised prioritization concerns for evaluations - during his 1:1 with Xiaofan on 2026-03-26, he heard they wouldn’t be prioritizing evaluations, which differed from what I had heard from Pranav
  • I asked if I could speak with Xiaofan directly to get clarification and Gabe agreed
  • Discussed Agent Harness, Evaluation Harness, Evaluation Framework, and CI to map out how the consolidated system would look
  • Showed Gabe how to make the requested changes in Oval
  • Follow-up: Get Gabe a set of actions he should work on regarding the distribution of evaluations

2026-03-19

  • Gabe reached out about consolidating his framework with mine, and asked to set up time to go over the technical details of Oval
  • Agreed and set up a meeting for next Friday (2026-03-27)
  • Sent him a Google Doc with an agenda
  • Gabe made his first PR to Oval and I accepted it

2026-03-13

  • Spoke at the LE team meeting regarding collecting evaluations when Gabe & Pranav are in London
  • Gabe mentioned he was unblocked regarding the distribution of evaluation cases
  • I said this is blocked on my review because it impacts how I operate in Oval
  • Spoke again during the research sync, where I went over my research proposal
  • Discussed the distribution of evaluations - stated there shouldn’t be two different products; there should be one, the one in the multirepo (mine)
  • Made it clear this is an engineering problem and that I’m not looking to budge

2026-03-05

  • Reached out regarding LLM story time next week where he’ll present AgentEval
  • He had previously asked me to come on board to co-present, but it now felt like he was kindly asking me not to
  • Told him that’s fine and we can present separately since there’s enough content for both of us