Scaling Slack’s Mobile Codebases: Modernization

In the first two posts about the Duplo initiative, we described why we decided to revamp our mobile codebases, the initial phase to clean up tech debt, and our efforts to modularize our iOS and Android codebases (post 1, post 2). In this final post, we will discuss the last theme of the Duplo initiative, Modernization, and look at the overall results and impact on developers.

Modernization

In addition to modularizing our codebase as part of Duplo, we also wanted to improve our overall app architecture, ensure we were keeping up with industry trends, and adopt more forward-looking design patterns and technologies. On each platform, we decided on particular areas of focus which we thought would both improve the experience of building features for our mobile developers and put our mobile codebases on better footing.

But we also knew that the effort to modernize our code wouldn’t stop with Duplo, and that after the initiative finished there would still be more new technologies and design patterns we wanted to adopt. We wanted to set ourselves up for future innovation as well. For example, on iOS we decided not to adopt SwiftUI as part of Duplo for a number of reasons (iOS version restrictions, the stability of SwiftUI at that time, difficulty integrating it into the product), but we wanted to put ourselves in a position where we could adopt it in the future, and where we would be able to explore using SwiftUI in parts of the app without having to use it everywhere.

iOS Modernization

Feature architecture

The most ambitious goal of Modernization on iOS was to adopt a new feature architecture across the app. As mentioned in the previous post, we had been using an MVVM+C feature architecture, but found it wasn’t opinionated enough, and we were seeing a lot of inconsistencies in how it was implemented.

Before deciding on a new feature architecture, we gathered feedback about the biggest issues and pain points developers had with the existing one. We investigated a number of different options, and settled on our own variant of VIPER (View, Interactor, Presenter, Entity, and Router). Our variant was slightly different from standard VIPER, with four components: Feature, View, Interactor, and Presenter. We designed this architecture to have clear separation of responsibilities, strong enforcement of event and data flow patterns, and clarity on where code should go and how each piece should be implemented. To make it easier for developers to add new features, we used Swift generics and templates to create a basic implementation each feature could extend. We will do a deeper dive into our feature architecture in a follow-up post!

Stricter linting

As we modernized our code, we wanted to prevent the use of a number of anti-patterns which were common in our legacy code. We were already running SwiftLint on every commit, but decided to impose stricter linting rules and additional lint errors on the directories containing our ‘modern’ modules. As developers moved code over, they would have to remove these anti-patterns, or at least disable the errors (we also tracked instances where the linting errors had been disabled, so we could encourage developers to clean them up later). Some examples include:

  • Use of global singletons like static let shared – we wanted any globally scoped objects to be injected through interfaces rather than accessed as singletons, to make dependencies clearer and make it easier to mock objects for tests
  • Use of Notifications/NotificationCenter – instead of using notifications, we preferred to use Combine publishers/subscribers
  • Accessing UIViewController.topPresented or .topMostPresentedViewControllerInStack – accessing these properties in the view hierarchy directly broke our feature composition and navigation
  • Use of many UIKit components – we preferred our own SlackKit UI components
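As an illustration of how disabled lint errors might be tracked, here is a minimal sketch in Python. It is not Slack's actual tooling. It assumes the disables are written as SwiftLint's inline `// swiftlint:disable` comments with only rule names after the directive, and the rule names in the usage note (`no_shared_singleton`, `no_notification_center`) are hypothetical custom rules invented for this example.

```python
import re
from collections import Counter

# Matches SwiftLint inline disable comments, with or without a
# :next / :this / :previous modifier, and captures the rule names.
DISABLE_RE = re.compile(
    r"//\s*swiftlint:disable(?::(?:next|this|previous))?\s+(.+)"
)


def count_disabled_rules(sources):
    """Count how often each rule is disabled across {path: source_text}.

    Assumption: everything after the directive is a whitespace-separated
    list of rule names (no trailing explanation text).
    """
    counts = Counter()
    for text in sources.values():
        for line in text.splitlines():
            match = DISABLE_RE.search(line)
            if match:
                counts.update(match.group(1).split())
    return counts
```

Given two hypothetical files, one containing `// swiftlint:disable:next no_shared_singleton` and another containing `// swiftlint:disable:this no_shared_singleton no_notification_center`, the function reports two disables of the first rule and one of the second. A real script would read the sources from the ‘modern’ module directories rather than from an in-memory dictionary.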

Combine adoption

As mentioned above, we decided to adopt Apple’s Combine reactive framework for handling events as part of Modernization. We had already been using a limited in-house implementation of certain reactive programming concepts, similar to RxSwift, though we had never adopted a third-party reactive framework (on Android we were already using RxJava before Duplo). We had implemented our own Disposable and PropertyBinding classes, which we used to stream updates for changes to properties or objects. As part of Modernization, we decided to lean further into reactive architecture, and while we continued to use our homegrown classes, we also adopted Combine throughout the app: in our feature architecture, when streaming updates from data providers, and to replace notifications and observers where we could. Using Combine also puts us in a better position to consider adopting SwiftUI in the future. Because we were already using streaming interfaces and Disposables, moving to Combine’s publishers/subscribers and Cancellables was a natural progression for our developers.

At the time we adopted Combine, it was only supported on iOS 13 and higher, and Slack was still shipping on iOS 12 and up. So we temporarily adopted the open source CombineX framework, which allowed us to use Combine syntax and functionality until we were able to drop iOS 12 support in the summer of 2021 and switch over to using the Combine framework directly. CombineX’s syntax is almost identical to Combine, which made the transition straightforward.

SlackKit

SlackKit is Slack’s cross-platform shared UI design system. We had built and standardized on a number of UI components across all our clients, but on the mobile clients there were still many screens which were using legacy UI components rather than SlackKit ones. This meant we weren’t receiving the benefits of SlackKit on those screens, including standardized behavior, accessibility support, and improved performance. As part of Duplo, we had a goal across both iOS and Android to drive more SlackKit adoption throughout the product.

Android Modernization

More modern libraries

We spent much of our time migrating our codebase to more modern libraries. The three biggest library adoptions were replacing Gson with Moshi, replacing Android Priority Job Queue with WorkManager, and replacing many of our networked API calls with an internal project called Guinness.

Adopting Moshi brought many benefits to our codebase. It integrates with Kotlin much better than Gson. Replacing our combination of AutoValue and AutoValueGson removed many kapt (Kotlin Annotation Processing Tool) instances, which resulted in build speed wins. Moshi also reads and writes JSON faster than Gson. This combination of benefits made a compelling argument for migration.

Prior to WorkManager, many projects attempted to simplify scheduling background tasks with the Android JobScheduler, Firebase JobDispatcher, and AlarmManager. We chose to use Android Priority Job Queue (APJQ), and while it served us well for a number of years, the project was deprecated when WorkManager was announced. The combination of APJQ’s deprecation, no longer needing its priority queue feature, and the WorkManager feature set led us to move to WorkManager.

Project Guinness is an internal library that combines Retrofit, Moshi, Okio, and OkHttp; it can be thought of as Retrofit with support for some Slack-specific networking choices. Most of the interesting parts have been open sourced as EitherNet. Prior to Guinness, our networked API calls were handwritten, and while they utilized some helper functions to form them, they were ultimately error-prone boilerplate. In addition, migrating to Guinness helped modernize our codebase by giving engineers more flexibility to expose their APIs as either suspending functions or as an RxJava Single or Completable.

Coroutines

RxJava and thread pool executors have been our codebase’s primary concurrency patterns. However, RxJava isn’t the solution to every task and coroutines have tangible benefits compared to traditional Java threading. Migrating all concurrent code in our codebase to a new pattern would have been a large, tedious, and error-prone endeavor. Instead, we focused on establishing best practices and patterns, and educating our team on coroutines to provide everyone with another concurrency tool to solve problems with.

Jetpack Compose

Jetpack Compose shipped a stable version sooner than we expected and right as we were finalizing our Duplo roadmap. Similar to coroutines, we chose not to set a goal of migrating large portions of our codebase to Compose and instead focused on experimenting with it in our codebase and identifying best practices for the team to provide them with another implementation option. We haven’t widely adopted Compose in our codebase yet due to some performance issues we’ve run into, but we are betting that this is the future of UI development for Android applications.

Duplo results

The Duplo initiative ran for a year and a half, finishing at the end of January 2022. Almost all of our iOS and Android engineers contributed to the project at some point during that period, with many engineers dedicating a considerable percentage of their time to Duplo each quarter.

Deciding that Duplo was “done” didn’t mean that we had removed all tech debt from the codebase, or that we had completed all technology adoptions and migrations. We will always have ongoing efforts to improve our codebases, but after a year of the Modernization & Modularization phase, we had reached the point where we had met our most important goals, making broad improvements in our mobile codebases and speeding up mobile development. We increased consistency, clarified code ownership, made it easier for developers to work independently, and laid a stronger foundation for future innovation in our mobile apps.

Some metrics:

iOS:

  • By the end of the project, our codebase was 68% Modernized (the percentage of code updated to our modern architecture and patterns) and 81% Modularized
  • 280 modules were added during Duplo
  • CI build stability improved from 77% to 90%, year over year
  • Average CI Time To Merge was reduced by 64%, and the p95 dropped by 63%!

Android:

  • Android didn’t separately track Modernization, but was 92% Modularized by the end of the project!
  • 330 modules were added during Duplo
  • CI Time to Merge improvements were smaller, but we still improved the average time to merge slightly, and the p95 dropped by 30%

While we saw clear improvements in CI build speeds during the Duplo project, it took a little longer to get payoff for local build times. Our modularization efforts, and adopting Bazel on iOS, kept our local build speeds from getting worse as our codebase continued to grow. In the months since the completion of Duplo, the combination of the modularization work, further Bazel improvements, and switching mobile engineers to M1 MacBook Pros has dropped local build times by almost 50%. The M1 chips have allowed us to run more parallel build workers, and having a more modularized codebase lets us better take advantage of these additional workers.

In addition, as a result of Duplo we have seen several examples where development and prototyping on mobile was as fast as, or even faster than, the same features on desktop. Since the biggest goal of the project was to speed up mobile development, we are excited that it had the desired impact in this area.

Developer sentiment

At Slack, our mobile Developer Experience team conducts quarterly surveys of mobile developers, to track trends in developer opinions about the codebase and development processes. As the Duplo project progressed, we saw steady improvement in some of the survey metrics, including those about tech debt in the codebase and overall confidence in code quality. We also tracked developer opinions about the Duplo project itself: Developers believed that the project was having a positive impact on the codebase, and the strength of those opinions increased as the project went on, for developers on both platforms.

In addition, we heard from many developers that Duplo was improving CI times, making feature development easier, and generally improving their experience.

Next steps

While Duplo was a success, there were some areas where — as the initiative progressed — we realized there needed to be some refinements and changes to our original plans.

On iOS, one of these areas was interdependencies between modules. The explosion of new modules in our codebase was a desired outcome of Duplo, but we soon realized that our original rules around modularization were not strict enough. Our only initial restriction was that Features or Services could only depend on the Interface modules of other Features or Services. In particular, any of our Library modules could depend on any others, and any Library could also depend on any Service and Feature Interface. This quickly led to a growing problem with lower-level modules depending on higher ones, and circular dependencies among modules. We decided we needed to define ‘Layers’ for our modules, such as Foundation, Data Models, and Networking. A module in one layer would only be allowed to depend on modules in the layers below it. This allowed us to clearly designate which were our lowest-level, or Foundation, modules, and build the layers up from there. This work is still ongoing, but it is cleaning up our module dependencies and making it clearer where different types of code should live.
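To make the layering rule concrete, here is a minimal sketch in Python of how such a check could work. It is an illustration under stated assumptions, not our build tooling: the function name, the module names, and the exact ordering of the layers above Foundation are all hypothetical, and the sketch reads "layers below it" strictly, so a dependency within the same layer also counts as a violation.

```python
def find_layer_violations(module_layers, dependencies, layer_order):
    """Return (module, dependency) pairs that break the layering rule.

    module_layers: {module_name: layer_name}
    dependencies:  {module_name: iterable of module names it depends on}
    layer_order:   layer names, lowest layer first

    A module may depend only on modules in strictly lower layers.
    """
    rank = {layer: index for index, layer in enumerate(layer_order)}
    violations = []
    for module, deps in sorted(dependencies.items()):
        for dep in sorted(deps):
            if rank[module_layers[dep]] >= rank[module_layers[module]]:
                violations.append((module, dep))
    return violations


# Hypothetical layer order (lowest first) and hypothetical modules.
LAYER_ORDER = ["Foundation", "DataModels", "Networking"]
MODULE_LAYERS = {
    "Logging": "Foundation",
    "UserModel": "DataModels",
    "HTTPClient": "Networking",
}
DEPENDENCIES = {
    "UserModel": {"Logging"},
    "HTTPClient": {"UserModel", "Logging"},
    "Logging": {"HTTPClient"},  # a lower-level module depending on a higher one
}

print(find_layer_violations(MODULE_LAYERS, DEPENDENCIES, LAYER_ORDER))
# prints [('Logging', 'HTTPClient')]
```

Because every allowed edge points from a higher layer to a strictly lower one, a dependency graph with no violations cannot contain a cycle, which is why a rule like this addresses both problems described above.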

Another area we are refining is dependency injection. One of the big benefits of modularizing our code was cleaning up the dependencies of each module as it was created, making them more explicit and easier to mock for tests. In our app target, we had a number of large dependencies classes, which were passed around through many levels of code. This antipattern made it easy to create circular dependencies and memory leaks, and hard to tell which dependencies were actually in use. As we modularized, we considered adopting a new dependency injection pattern. After investigating a number of possibilities, including third-party frameworks, we decided to stick with standard initializer injection, because it was simple to adopt, familiar to developers, provided a compile-time guarantee that dependencies were valid, and avoided global state or singletons.

However, this is another area where we want to make additional improvements after Duplo. Right now, the Feature and Service Implementation modules are only linked by the app target, and dependencies for each of the modules have to be assembled there and passed into the module as it is created. This means the app target is still larger and more unwieldy than we’d like. We are investigating adopting Needle as a dependency framework, to enable modules to explicitly define their dependencies and build them without having to link all the implementations in the app target.

For Android, we have continued — and plan to continue — optimizing our Gradle and module dependencies. We are looking for inter-module dependencies that reduce parallelization because they either depend on too many other modules or they themselves must wait until too many other modules have compiled. One idea we are exploring is separating all modules into API and implementation modules, which would highly parallelize our project’s compilation by allowing implementations to compile without waiting for other implementations to compile.
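The effect of the API and implementation split can be illustrated with a small Python sketch. This is a simplified model, not a measurement of our build: it assumes an acyclic module graph, treats every module as costing one compile step, and uses hypothetical module names. The longest dependency chain is then the minimum number of sequential compile steps, however many parallel workers are available.

```python
from functools import lru_cache


def build_depth(dependencies):
    """Length of the longest dependency chain in an acyclic module graph.

    dependencies: {module_name: list of module names it depends on}
    With unit-cost modules this is the minimum number of sequential
    compile steps, even with unlimited parallel workers.
    """
    @lru_cache(maxsize=None)
    def depth(module):
        deps = dependencies.get(module, ())
        return 1 + max((depth(dep) for dep in deps), default=0)

    return max(depth(module) for module in dependencies)


# Before: each feature depends directly on another feature's implementation.
BEFORE = {
    "files": [],
    "messages": ["files"],
    "channels": ["messages"],
    "app": ["channels"],
}

# After: implementations depend only on API modules, so all three
# implementations can compile in the same step.
AFTER = {
    "files-api": [],
    "messages-api": [],
    "channels-api": [],
    "files-impl": ["files-api"],
    "messages-impl": ["messages-api", "files-api"],
    "channels-impl": ["channels-api", "messages-api"],
    "app": ["channels-impl", "messages-impl", "files-impl"],
}

print(build_depth(BEFORE), build_depth(AFTER))  # prints 4 3
```

In this toy graph the chain shortens only from four steps to three, but the depth after the split stays at three no matter how many features are added, whereas a chain of implementations grows with every feature that depends on another.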

As mentioned above in the Modernization section, we are continuing to work on Jetpack Compose performance issues and migrating some commonly used widgets to Jetpack Compose to ease the migration of larger screens. Similarly, we’ve been increasing our Kotlin coroutine usage through the codebase with a lot of success.

While advancing Kotlin adoption wasn’t an explicit goal of Duplo, the initiative definitely sped it up. Our codebase is currently 92% Kotlin. Converting the remaining 8% of code that is still in Java to Kotlin is now an explicit goal. Besides the many language features we love about Kotlin, finishing the conversion will continue to improve our build times by allowing us to remove many of the Java static analysis tools we were running.

Removing tech debt is a task that is never done, but the Duplo initiative greatly reduced it in our mobile codebases. Duplo solved many of the problems we set out to address and improved our ability to perform future maintenance. We’ve built systems to provide better visibility and reporting to help future cross-team migrations. Modularization has created clearer ownership and far more decoupled code, which allows our teams to manage their code with less coordination. We’ve adopted cleaner architecture and more forward-looking technologies which will help us with our growing development teams and future innovation. And we’ve made the experience of mobile developers at Slack simpler, more pleasant, and more productive.

If you are interested in working on Slack’s mobile apps, check out our open roles!
