Day 1 · 3:00 – 4:15 PM · Málaga Leadership Offsite

CTO Technical Assessment

& 12–18 Month Vision

Charles, CTO · 3 weeks into 30-day assessment · Day 1, 3:00

Framing: I've been here 3 weeks. This is an early-but-honest look at what I'm seeing — the engineering team, the architecture, the active work, and the technical debt. I'll share preliminary observations, not final conclusions. I'll also lay out a strategic technical vision centered on AI-first engineering with Claude agent teams as the force multiplier that lets a small team punch far above its weight.

Session Agenda

CTO Technical Assessment

3:00
Engineering Org Assessment · Present
Team structure, skills map, dynamics, path to target composition.
3:15
Architecture & Infrastructure · Present
Current state, DOKS & Chrome Farm migrations, cloud transition.
3:25
In-Flight Work & Delivery · Present
Active builds: AI Visibility Tracker, Dashboard, PRISM, operational work.
3:35
Technical Debt & Constraints · Discuss
What's holding us back — and what leadership needs to understand.
3:45
12–18 Month Vision: AI-First Engineering · Vision
Claude AI across feature dev, bug fixing, testing, and operations.
4:00
Open Q&A · Q&A
Your questions about realities, constraints, and cross-functional impact.
Section 01 · 15 min

Engineering Org Assessment

A team of 10, recently assembled, with significant talent — and real structural challenges.

Current Team
10
Target: 5 ICs + CTO
Features / Month
1–2
Target: 4–6 by M12
Cycle Time
2–3 mo
Target: ≤14 days
Deploy Freq
1–2x/wk
Target: 10+/week
Key Observations · Early
6 of 10 hired in last 1–3 months — assembled by previous leader who departed abruptly
Domain knowledge concentrated in Giovanni (2 yr) and Belmin (10 mo)
Cultural challenges around estimation, planning, modern dev practices
~60% of engineering time goes to infrastructure, not product
Path Forward · Planned
Target: 5 ICs + CTO, no dedicated EM layer or infra team
Operations distributed, supported by managed cloud + AI workflows
R&D run rate: $272K → $228K target
Personnel decisions initiated in Month 2 based on evidence

Team Roster

Core Platform Team (Infrastructure)
Peter Cipov · Infrastructure & Observability · 3 mo
Mark Kubatov · Infrastructure (SaaS.Group shared) · 1 yr
Vitor Castro · Full-stack + Infrastructure · 1 mo
EM: João Matos (SaaS.Group central DevOps lead, temporary)
Features Team (Product Delivery)
Rafael Jannone · Full-stack, Frontend strength · 3 mo
Giovanni Rodighiero · Frontend specialist · 2 yr
Belmin Hadžimusic · QA / Testing specialist · 10 mo
Marcelo Chiaridia · Full-stack, XP practices · 1 mo
EM: Marcin Kulwikowski (1 month, from Sky)
Discussion

What has each of your functions observed about engineering responsiveness, delivery, or collaboration quality? Where have you felt the gap?

    Section 02 · 10 min

    Architecture & Infrastructure

    Mid-migration from bare metal to managed cloud. This gates almost everything else.

    Infrastructure Migration Map
    Microservices: rest-api, rest-db (blocked), queue-service, dashboard-api → DOKS
    Chrome Farm: render-node (deploying), Syself cluster EU → Hetzner/DO
    Redis: redis-queue (BullMQ spike) → cluster migration
    Databases: ClickHouse (stays Hetzner), PostgreSQL (4 TB)
    Monitoring: Grafana Cloud, VictoriaMetrics
    Key message: The migration is the prerequisite that unlocks everything. Until infra time drops from 60% to under 20%, we will struggle to achieve the feature velocity the product strategy demands.

    Infrastructure Status

    System Uptime
    99.81%
    Target: 99.9%
    MTTR
    92 min
    Target: <30 min
    Time on Infra
    ~60%
    Target: <10%
    COGS Kill Switch
    15% MRR
    Pause if exceeded 2+ mo
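The kill-switch rule above is mechanical enough to automate. A minimal sketch, assuming we track monthly COGS and MRR as parallel series (the function name and inputs are illustrative, not an existing system):

```python
def cogs_kill_switch(monthly_cogs, monthly_mrr, threshold=0.15, window=2):
    """True if COGS exceeded `threshold` of MRR for `window`+ consecutive months.

    `monthly_cogs` and `monthly_mrr` are parallel lists, oldest month first.
    A single month over the line does not trip the switch; the streak resets
    whenever a month comes back under threshold.
    """
    streak = 0
    for cogs, mrr in zip(monthly_cogs, monthly_mrr):
        if mrr > 0 and cogs / mrr > threshold:
            streak += 1
            if streak >= window:
                return True  # sustained overrun: pause per the kill switch
        else:
            streak = 0
    return False
```

Wired into the monthly finance rollup, this turns "pause if exceeded 2+ mo" from a judgment call into an alert.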
    DOKS Migration (K8s) · In Progress
    Backend services migrating to DigitalOcean Kubernetes
    Redis BullMQ migration spiked — 8 dependent services
    rest-db migration currently blocked
    Terraform state drift being reconciled
    Chrome Farm · In Progress
    Render-node deploying to production cluster
    Syself EU cluster being set up (dev + prod)
    Autoscaling under load testing still ahead
    HAProxy traffic cutover pending
    Section 03 · 10 min

    In-Flight Work & Delivery

    What engineering is actively building — mapped to product strategy phases.

    AI Visibility Tracker (Diagnostics) · Core Product — Now Phase

    New diagnostic reports that check CDN, WAF, robots.txt, and meta tags across GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot — the "evidence layer" from the product strategy.

    PRE-2543: Diagnostics Progress Modal (SSE) — In Progress
    PRE-2545: Backend Persistence (S3 + Postgres) — In Progress
    PRE-2558: S3/database adapters — In Progress
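The bot-access part of these diagnostics is straightforward to sketch. A minimal version, assuming the report only needs a per-crawler allow/deny verdict from robots.txt (illustrative only — not the actual PRE-2543 implementation), using Python's stdlib parser:

```python
from urllib.robotparser import RobotFileParser

# The four AI crawlers the diagnostics target (from the product spec above).
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot"]

def check_ai_bot_access(robots_txt: str, path: str = "/") -> dict:
    """Map each AI crawler to whether robots.txt allows it to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}
```

A real check would also fetch the file over HTTP and layer on the CDN/WAF and meta-tag signals, but the verdict logic stays this shape.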
    PRISM · New Product
    PRI-4: Log tailing API — SSE pipe for Claude Code agents
    PRI-3: Bot cloaking → deep-page analyzer consolidation
    Dashboard & Data Viz · Ongoing
    PRE-1928: Export dashboard to PDF — In Testing
    PRE-2551: CDN Analytics — In Testing
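PRI-4's SSE pipe ultimately reduces to framing log lines as Server-Sent Events. A minimal sketch of that framing (the helper is hypothetical; the real API's event names and fields are defined in PRI-4):

```python
def format_sse(data: str, event: str = "log") -> str:
    """Frame a payload as one Server-Sent Events message.

    Multi-line payloads become multiple `data:` fields, per the SSE spec;
    the blank line terminates the event so clients flush it immediately.
    """
    fields = "".join(f"data: {line}\n" for line in data.splitlines() or [""])
    return f"event: {event}\n{fields}\n"
```

Streaming a tailed log to a Claude Code agent session is then just writing `format_sse(line)` to the open HTTP response as each line arrives.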

    In-Flight: Billing & Integrations

    Billing & Pricing · Blocked Items
    PRE-2344: Enterprise Plus billing — Blocked, Sev 2
    PRE-2278: Chargebee/Insights sync — Blocked
    PRE-2486: Cancellation flow fix — In Testing
    Integrations & Analytics · In Progress
    PRE-2402: HubSpot integration status update
    PRE-2541: Mixpanel cleanup (170K zombie profiles)
    PRE-2259: Mixpanel subscription events fix
    Key message: Active work across too many fronts for the team size. AI Visibility Tracker and PRISM are strategically correct. But billing bugs, analytics cleanup, and infrastructure consume capacity that should go to product. This is a prioritization + team-size reality, not a performance issue.
    Section 04 · 10 min

    Technical Debt & Constraints

    An honest look at the realities underneath the product ambitions.

    Infrastructure Consumes 60% of Capacity · Highest Impact
    The single biggest constraint: ~60% of engineering time goes to infrastructure and firefighting, leaving only 40% for features. The product strategy assumes 3–4x velocity — realistic, but only after this constraint is removed.
    Documentation Deficit · Active Risk
    Domain knowledge lives in heads — primarily Giovanni (2 yr) and Belmin (10 mo). 6 new hires onboarding into an undocumented system. AI-assisted documentation generation underway.
    DORA Metrics Not Yet Tracked · Month 1 Priority
    Deployment frequency, lead time, change failure rate, MTTR — not reliably measured. Decisions made on intuition. Baselines being built alongside Grafana Cloud rollout.
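Once deploy records exist, the first two baselines are simple arithmetic. A sketch, assuming each record carries `committed_at` and `deployed_at` timestamps (the field names are illustrative, not an existing schema):

```python
from datetime import datetime

def dora_baseline(deploys):
    """Two DORA baselines from deploy records:
    deploys per week, and median lead time (commit -> deploy) in hours."""
    lead_times = sorted(
        (d["deployed_at"] - d["committed_at"]).total_seconds() / 3600
        for d in deploys
    )
    n = len(lead_times)
    median = (lead_times[n // 2] if n % 2
              else (lead_times[n // 2 - 1] + lead_times[n // 2]) / 2)
    # Span of the observation window, floored at one week.
    span_weeks = max(
        (max(d["deployed_at"] for d in deploys)
         - min(d["deployed_at"] for d in deploys)).days / 7, 1)
    return {"deploys_per_week": len(deploys) / span_weeks,
            "median_lead_time_h": median}
```

Change failure rate and MTTR need incident data on top, but this is enough to replace intuition with a number from day one of the Grafana Cloud rollout.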

    More Constraints

    SaaS.Group DevOps Dependency · Transition Needed
    João (EM) and Mark (infra) are shared from SaaS.Group central. João's engagement is temporary. All knowledge must be absorbed before any resource changes.
    Legacy Integration Debt · Ongoing
    Chargebee/Insights sync issues, Mixpanel data pollution (170K zombies), missing HubSpot data, billing bugs for Enterprise customers. Directly affects revenue ops and data quality.
    Discussion
    1. Which constraints have the biggest impact on your function right now?
    2. Are there customer-facing issues I haven't surfaced?
    3. If I could fix one thing beyond the migration in 30 days, what helps you most?
      Section 05 · 15 min

      12–18 Month Technical Vision:
      AI-First Engineering with Claude Agent Teams

      The strategic bet: use AI as the core multiplier that lets 5–6 engineers deliver what traditionally requires 15–20.

      The Core Thesis

      We're transitioning from infrastructure company to product company. The team must ship more features, faster, with fewer people. Hiring more contradicts the cost mandate. Working harder is unsustainable.

      The third path: deeply integrate Claude AI into every stage of the SDLC. Not as a code completion tool, but as a first-class development partner that autonomously handles significant portions of feature development, bug resolution, testing, and operations.

      AI Pillars: Development & Bug Fixing

      #1 AI-Assisted Feature Development (Claude Code)

      Engineers shift from writing code to orchestrating AI agents that build features against well-defined PRDs.

      Already happening: PRISM (PRI-4) builds log tailing API specifically for Claude Code agent sessions.
      12-month target: 70%+ feature code generated by Claude Code. Engineers focus on architecture, review, edge cases.
      Impact: This is how 5 engineers achieve the output of 15. Bottleneck shifts from writing to reviewing.
      #2 AI-Automated Bug Triage & Fixing

      Support files a bug → Claude analyzes logs, identifies root cause, generates fix + tests, creates PR. Engineer reviews and merges.

      Prerequisite: Observability investment (Grafana Cloud, structured logging) feeds directly into this.
      12-month target: 50%+ Sev 2–3 bugs resolved by Claude with human review only. 60% less engineer bug time.

      AI Pillars: Testing & Operations

      #3 AI-Augmented Testing & QA

      Every feature PR includes AI-generated tests. Claude reviews test gaps and suggests missing scenarios.

      Impact for QA: Belmin shifts from manual execution to test strategy, AI test review, exploratory testing.
      12-month target: 80%+ test coverage on new code, automated regression detection in CI/CD.
      #4 AI-Automated Technical Operations

      Once on managed cloud with solid observability, Claude handles monitoring, alerting, incident triage, automated remediation.

      Prerequisite: Cloud migration must complete. Managed infra (DOKS) reduces operational surface area.
      12-month target: DevOps burden from ~60% to <10% of capacity. No dedicated infra team needed.
      Team impact: Every engineer is full-stack. Claude handles operational monitoring and routine incident response.

      How This Connects to Product Strategy

      The product strategy:

      Make Machines See You → Make Machine Failures Visible → Make Machine Experience Predictable

      Each phase demands more sophistication with the same team.

      Now: Claude Code accelerates AI Visibility Tracker and Diagnostics. We're already building for this.
      Next: Semantic diagnostics, change propagation monitoring — complex features compressed from months to weeks.
      Later: Machine-readiness SLAs, revenue exposure modelling. AI-automated operations becomes critical.
      Discussion
      1. What concerns do you have about relying this heavily on AI for core engineering?
      2. How does this change what you expect from engineering?
      3. Does knowing Claude Code can compress timelines change how we sequence Now/Next/Later?
        Section 06 · 15 min

        Open Q&A

        Likely Questions & Prepared Framing
        "When will we ship faster?" — Month 3: 2–3/mo. Month 6: 3–4. Month 12: 4–6. Assumes AI-first approach takes hold.
        "Are we going to lose people?" — Assessment ongoing. Decisions based on evidence. Target: smaller but higher-caliber.
        "How real is the Claude AI strategy?" — Already happening in PRISM. Concrete metrics at Month 3.
        "Botify/DataDome threat?" — Speed-to-value for mid-market. Managed cloud + AI velocity = faster than incumbents.
        "What do you need from us?" — Patience on assessment, urgency on migration, alignment on AI. Input from Anna, Janine, Graeme.

        What I Want to Leave You With

        The team has talent
        The challenge is structure, process, and focus — not individual capability.
        The migration is the unlock
        Until infra time drops below 20%, velocity improvements are incremental, not transformational.
        AI-first engineering is the multiplier
        It's how a team of 5–6 delivers what 15–20 traditionally would. This is the strategic bet.
        I'm 3 weeks in
        Strong early observations, but not done listening. Final findings at Day 30, concrete delivery improvements by Month 3.