Towards EngOps: Evolving engineering organizations with data

Most engineering organizations are full of highly analytical people with STEM degrees. That’s why it’s not at all surprising that the most data-driven organizations in any business are…finance, sales, and marketing. Right? No, but seriously, when was the last time your engineering organization used data to make a decision?

While building the Einstein machine learning platform at Salesforce, we experienced all the usual pain points of a rapidly growing engineering organization. We grew from a small team of five people to dozens of teams and hundreds of engineers in just a few years. Some teams ground to a halt as tech debt piled up; some became the central bottleneck for all the others; others were overwhelmed with on-call duties. As leaders, we struggled to understand our operations and ensure our teams had the support they needed when they needed it.

Even simple process changes that would make everyone happier were hard to find. Once, an accidental configuration change in our GitHub organization more than tripled our time to merge pull requests, and it was only after weeks of low-level grumbling from the engineers that we realized there was a problem and fixed it.

While we struggled with visibility, we noticed that our sales, marketing, and finance counterparts were incredibly data-informed about their operations and were generally quite good at modeling and measuring the impact of changes.

Engineering, on the other hand, was flying blind. Seemingly simple questions about engineering speed, security, compliance, or cost required significant effort: tinkering with data from various sources, digging through logs, writing ad hoc scripts, and more. Relevant data would take weeks to compile, and by the time the analyses were complete, the data was already out of date. We were not alone. When we spoke to teams at other organizations, it was the same story everywhere.

And so we built Faros.

A new standard requires new tools

The extreme fragmentation of the technology stack is primarily to blame for this struggle. The explosion of development tools has multiplied the operational surface a hundredfold. Every organization’s technology stack has a unique fingerprint, and stacks typically spin out of control as organizations grow.

At the same time, with COVID, remote engineering has become the new normal, and the shift is accelerating. Opportunities for informal data collection and correlation have disappeared along with the communal water cooler.

Engineering teams simply don’t have the right tools to deal with this new reality. Bottlenecks in processes take a long time to discover. Hiring more engineers is a costly solution that often hurts productivity more than it helps. Decisions are based on the loudest voices in the room (or on Zoom), or on intuition, rather than on data. It shouldn’t be like this.

Unblock EngOps

We believe that with the right tools, engineering leaders will finally be able to evolve their operations in a more data-informed way: using data to identify bottlenecks, measure progress towards organizational goals, better support teams with the right resources, and accurately assess the impact of interventions over time. Additionally, any solution that truly unleashes a data-driven engineering culture will deliver value in three ways:

1. Connect the dots

For data to be at the heart of an organization’s decision-making processes, data must be easily accessible and cannot live in silos. This requires a platform that brings all engineering data together in one place and connects the dots. It should bring together data and metadata from all the different operational sources, into a standardized data model that can give executives a holistic view of their engineering operations.
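To make the idea of a standardized data model concrete, here is a minimal Python sketch. It is an illustration only, not the Faros data model: the `ChangeEvent` record, the two adapter functions, and the raw payload shapes are all hypothetical, and matching people by the local part of an email address is a deliberately naive assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical canonical record; the field names are illustrative."""
    source: str
    kind: str
    actor: str
    occurred_at: datetime


def from_source_control(raw: dict) -> ChangeEvent:
    # Assumed raw shape: {"author": str, "merged_at": ISO-8601 string with offset}
    return ChangeEvent(
        source="source_control",
        kind="change_merged",
        actor=raw["author"].lower(),
        occurred_at=datetime.fromisoformat(raw["merged_at"]),
    )


def from_incident_tool(raw: dict) -> ChangeEvent:
    # Assumed raw shape: {"reporter_email": str, "opened_ts": epoch seconds}
    return ChangeEvent(
        source="incident_management",
        kind="incident_opened",
        # Naive identity mapping (assumption): use the email's local part.
        actor=raw["reporter_email"].split("@")[0].lower(),
        occurred_at=datetime.fromtimestamp(raw["opened_ts"], tz=timezone.utc),
    )


def build_timeline(events):
    """Merge events from every source into one chronologically ordered list."""
    return sorted(events, key=lambda e: e.occurred_at)
```

Once every source is mapped into the same record type, questions that span tools (for example, "what was merged shortly before this incident was opened?") become a simple query over one timeline rather than a custom script per system.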

2. Maximize flexibility

Every engineering organization is unique and an EngOps platform should be able to adapt to the needs of the organization rather than the other way around. Engineers love to use the best software, and that will never change. Therefore, any EngOps solution must allow engineers to continue using the tools they love and meet them where they are. In other words, the platform must be extremely easy to customize, extend and integrate. For example, adding new data sources (external or internal providers) should be a breeze, the canonical data model should be easy to extend, analytics should be customizable, and the entire platform should be API-driven, so engineers can integrate it into their regular workflows, querying the data they need wherever it is.

3. Highlight what’s important

There is an enormous amount of data flowing through engineering organizations, and the number of metrics and insights that can be derived from this data is overwhelming. The ideal platform would be smart, highlighting what’s relevant and explaining why it matters. It would indicate trends to follow and anomalies to explore. It would allow events from disparate systems to be correlated to aid in root cause analysis. This would allow leaders to focus on the most important insights their data can provide and take action, instead of getting lost in the weeds.
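As a rough illustration of what "anomalies to explore" could mean in practice, the sketch below flags weeks in which a metric, such as the median time to merge a pull request, jumps well above its recent baseline. The function name, the four-week window, and the 2x threshold are assumptions chosen for the example; this is one simple heuristic, not a description of how any particular platform detects anomalies.

```python
from statistics import median


def flag_anomalies(weekly_values, window=4, factor=2.0):
    """Return the indices of weeks whose value exceeds `factor` times
    the median of the preceding `window` weeks."""
    flagged = []
    for i in range(window, len(weekly_values)):
        baseline = median(weekly_values[i - window:i])
        # Skip weeks with a zero baseline to avoid flagging everything.
        if baseline > 0 and weekly_values[i] > factor * baseline:
            flagged.append(i)
    return flagged
```

Using the median as the baseline keeps a single bad week from immediately becoming the new normal, so a sustained regression like the tripled time to merge described earlier would keep being flagged for several weeks instead of surfacing only through grumbling.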

The Faros platform was designed from the ground up with these three principles in mind to deliver immediate visibility, regardless of the technology stack. The Faros platform is:

1. Connected: Faros connects to dozens of different engineering systems across source control, task management, incident management, CI/CD and HR systems. Not only does it connect to these systems, but it also infers connections between them, correlating events and identities to provide global visibility across the organization. It can trace changes from idea to production and beyond, follow incidents from discovery to recovery to resolution, and reconcile identities between different systems.

2. Expandable: Faros APIs were designed with customization and extensibility as a first-class concern. In addition to well-known vendors, it is easy to connect custom local systems to Faros with the Faros SDK. We’ve also integrated a comprehensive BI tool into the platform, to allow teams to measure what matters most to them. This, together with APIs to inspect data and even export it, allows engineering teams to integrate Faros into their usual workflows, without changing their existing processes.

3. Smart: Faros correlates events, resolves identities, and infers team attribution to feed operational metrics around software delivery (DORA metrics), engineering speed, program management and integration; with more to come around security, compliance and cost optimization. For example, Faros can measure the time it takes for changes to move from idea to production and every step in between, broken down by team, application, and time. But metrics are just the start, as we design fully automated insights with anomaly detection and root cause analysis to help teams quickly make sense of their data.
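To give a feel for the kind of metric described above, here is a small Python sketch that computes the median time from a change being created to it reaching production, broken down by team. The record keys (`team`, `created_at`, `deployed_at`) are hypothetical and the input is assumed to be already correlated across systems; this is not the Faros API or its implementation.

```python
from collections import defaultdict
from statistics import median


def median_lead_time_hours(changes):
    """Median hours from creation to production deployment, per team.

    `changes` is an iterable of dicts with the hypothetical keys
    "team", "created_at" and "deployed_at" (datetime, or None if the
    change has not been deployed yet).
    """
    by_team = defaultdict(list)
    for change in changes:
        if change["deployed_at"] is None:
            continue  # not in production yet, so no lead time to measure
        delta = change["deployed_at"] - change["created_at"]
        by_team[change["team"]].append(delta.total_seconds() / 3600)
    return {team: median(hours) for team, hours in by_team.items()}
```

The arithmetic is trivial; the hard part in practice is the correlation step that produces records like these, since the creation time, the deployment time, and the owning team usually live in three different systems.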

In the coming weeks, stay tuned for more blog posts on how we designed the Faros Platform to live up to its values at scale.

Why should you care?

Your engineering teams need to build and deliver great software quickly, efficiently, and reliably, and that’s where your engineers need to spend their time. Better visibility allows you to efficiently scale your operations, identify frustrating bottlenecks, and resolve issues before they turn into fires. Fewer fires and bottlenecks mean happier teams who can focus on what’s most important.

See Faros in action

Request a demo and we’ll be happy to take the time to walk you through the platform.
Unleash the power of data-driven EngOps at Faros.ai.
