AI agents removed the friction from writing telemetry
I used to avoid adding telemetry because it felt like tedious busywork. Now with Claude handling the OpenTelemetry boilerplate, I'm instrumenting everything.
I've been using Claude Code to handle tedious parts of platform engineering: AI-assisted dependency reviews, research spikes, automated test PRs, and project cleanup.
As a platform engineer, I have spent hours researching whether we should migrate from one build tool to another, reading through changelogs to understand if a dependency update will break anything, and manually testing configuration changes.
A big part of my job is driving migrations across the codebase. Before Claude Code, it was hard to estimate the size of a migration; the only way to evaluate whether I had the bandwidth to take one on was to try it and see what obstacles I ran into. That's exactly what happened when I turned on strict mode for our codebase: I had no way to know in advance how long the migration would take or how many obstacles I would hit.
Dependency upgrades were another time sink. We use Dependabot, which we've configured to open roughly 20 PRs per week in our monorepo. Each one required reading changelogs, checking for breaking changes, and understanding the blast radius.
I also spend time testing infrastructure changes. When I modify CI configuration, I need to validate that it works. That means creating test branches, opening PRs, watching builds, and cleaning up afterward. For one config tweak, I might open 2-3 test PRs.
Traditional automation doesn't solve these problems because they all require reading, judgment, and understanding context. You can't write a script to evaluate whether a new testing framework is worth migrating to.
I love to let Claude churn on a research task, giving me a set of plausible options that I can choose from and work off of. For example, I was trying to decide whether to migrate to typescript-go. In the past, I would've manually migrated, then read through the new errors that appeared and tried to understand whether they were regressions in the tool or valid errors.
This time, I set Claude on the task, letting it automate the migration steps. Then I paired with it to understand the new errors that came up and to compare them against the documentation. This helped me build a case for why typescript-go was ready for us to migrate to, leading to a 7x speed-up in our typechecking step.
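The core of that review step is just diffing the two typecheckers' output. A minimal sketch (the error strings and helper name here are made up for illustration):

```typescript
// Hypothetical helper: diff error output from two typechecker runs to
// surface only the errors introduced by the migration. Errors are
// assumed to be one line each, e.g. "src/a.ts(3,5): TS2322: ...".
function newErrors(before: string[], after: string[]): string[] {
  const seen = new Set(before);
  return after.filter((err) => !seen.has(err));
}

const before = ["src/a.ts(3,5): TS2322: type mismatch"];
const after = [
  "src/a.ts(3,5): TS2322: type mismatch",
  "src/b.ts(10,1): TS2304: cannot find name 'foo'",
];
// Only the error introduced by the new typechecker remains.
console.log(newErrors(before, after));
```

Each surviving error is then a candidate to check against the new tool's documentation: regression, or newly caught real bug.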
I've used this approach for tool evaluations, architecture decisions, and vendor comparisons.
Testing CI changes across multiple project types used to take half a day of manual work. Now I have Claude generate test PRs automatically after I make configuration changes.
I'll make changes to our CI configuration or scripts, then have Claude create test branches for different scenarios instead of spending 10 minutes setting each one up by hand.
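The scenario setup is mechanical once you enumerate what needs coverage. A sketch of the idea, with invented project types and a made-up branch-naming scheme:

```typescript
// Hypothetical: one test branch per project type we want a CI change
// exercised against. The types and naming scheme are assumptions,
// not our real setup.
const projectTypes = ["node-app", "library", "docs-site"];

function testBranches(changeId: string): string[] {
  return projectTypes.map((type) => `ci-test/${changeId}/${type}`);
}

console.log(testBranches("cache-key-bump"));
// ["ci-test/cache-key-bump/node-app", ...]
```

Claude handles the rest: creating each branch, pushing a trivial commit, and opening the PR so the pipeline runs.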
Platform changes always have edge cases you didn't anticipate. The faster you detect issues, the smaller the blast radius and the easier the rollback.
I've started having Claude help me add instrumentation before I deploy changes. The more I rely on AI to help me write code, the more important it is to me that I can observe how that code is behaving in production. Claude can help me get to a working draft of my telemetry far more quickly, and then I review it carefully to make sure it matches the OpenTelemetry specification and has the shape I want.
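To show the shape of the boilerplate I mean, here is a simplified stand-in for the OpenTelemetry `startActiveSpan` pattern. This is a sketch, not the real SDK; in production the span would come from `@opentelemetry/api` and be exported by the configured tracer.

```typescript
// Simplified stand-in for the OpenTelemetry active-span pattern.
// Not the real SDK -- just the shape generated instrumentation takes.
interface Span {
  setAttribute(key: string, value: string | number): void;
  end(): void;
}

function withSpan<T>(name: string, fn: (span: Span) => T): T {
  const attrs: Record<string, string | number> = {};
  const span: Span = {
    setAttribute: (k, v) => { attrs[k] = v; },
    end: () => console.log(`span "${name}" ended`, attrs),
  };
  try {
    return fn(span);
  } finally {
    span.end(); // span always closes, even if fn throws
  }
}

// Instrument a step the way generated telemetry usually looks.
const result = withSpan("ci.load-config", (span) => {
  span.setAttribute("ci.project_type", "node-app");
  return { ok: true };
});
```

The review pass is where I earn my keep: checking attribute names against semantic conventions and making sure spans close on every code path.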
Most teams just merge dependency updates without reading the changelogs. It's too time-consuming to review every update properly, so people don't.
My teammate created a GitHub Actions workflow that has Claude read through dependency update notes and flag anything that might be problematic. It looks for breaking changes, known performance regressions, security issues, and changes that might conflict with our current usage patterns.
Read more about how our AI-assisted dependency review process works here: https://maecapozzi.com/blog/using-conductor-for-dependabot-reviews
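To illustrate the categories that workflow flags: the real version has Claude read and reason about the notes, but a crude keyword pass (hypothetical patterns, for illustration only) conveys the idea:

```typescript
// Hypothetical sketch of the flagging categories. The real workflow
// uses Claude's judgment, not keyword matching.
const riskPatterns: [string, RegExp][] = [
  ["breaking change", /breaking change/i],
  ["security", /security|CVE-\d{4}-\d+/i],
  ["performance regression", /regress/i],
];

function flagChangelog(notes: string): string[] {
  return riskPatterns
    .filter(([, pattern]) => pattern.test(notes))
    .map(([label]) => label);
}

console.log(flagChangelog("BREAKING CHANGE: drops Node 16 support"));
// ["breaking change"]
```

The gap between this and the AI version is exactly the point: keywords miss "changes that conflict with our current usage patterns," which requires reading our code too.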
Deletion projects are boring but necessary. The challenge is understanding what actually depends on what you want to remove. Documentation is usually outdated, and grep isn't smart enough to catch indirect dependencies.
I use Claude to trace through codebases and infrastructure configs to map dependencies before starting deprecation work. It's better than I am at following import chains, finding configuration references, and spotting runtime dependencies.
When I needed to remove an experimental tool that was no longer maintained, Claude helped map out all the places it was referenced: direct imports, configuration files, deployment scripts, and documentation. It also identified services that used the tool indirectly through shared libraries.
Having that analysis upfront meant fewer surprises during the deprecation process. Instead of discovering dependencies as I broke things, I could plan the removal sequence and communicate with affected teams ahead of time.
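The core of that tracing is a reverse walk over an import graph: find everything that depends, directly or indirectly, on the thing being removed. A self-contained sketch with made-up module names:

```typescript
// Given a module -> imports map, find all direct and transitive
// dependents of a target module. Module names are invented.
function dependentsOf(
  graph: Record<string, string[]>,
  target: string,
): Set<string> {
  const result = new Set<string>();
  let changed = true;
  while (changed) {
    // Repeat until no new dependents are discovered (fixpoint).
    changed = false;
    for (const [mod, imports] of Object.entries(graph)) {
      if (result.has(mod)) continue;
      if (imports.some((i) => i === target || result.has(i))) {
        result.add(mod);
        changed = true;
      }
    }
  }
  return result;
}

const graph = {
  "shared-lib": ["experimental-tool"],
  "service-a": ["shared-lib"], // indirect dependent, via shared-lib
  "service-b": ["other-lib"],
};
console.log([...dependentsOf(graph, "experimental-tool")]);
// ["shared-lib", "service-a"]
```

Claude's version of this also covers the parts a graph walk can't see: configuration references, deployment scripts, and runtime lookups.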
All told, these workflows save me roughly 10-12 hours per week that I'm no longer spending on grunt work. Instead, I can focus on the platform work that actually matters, like designing better abstractions and improving developer experience.
AI agents work better when given appropriate context and guardrails.
I used to block out weeks for tooling migrations. Now I let Claude Code run in the background, check in when it's done, and pair with it to understand what changed.