Reduce, Reuse & Recycle your thread pools ♻️

Chao Zhang
9 min read · Dec 20, 2021

Photo: Héctor J. Rivas from Unsplash

Real-world applications today are mostly multi-threaded. This means developers should be mindful of how they manage concurrency in their applications. Mastering threading can help boost an app’s performance. On the other hand, using concurrency without fully understanding it can lead to problems that hurt the app’s health.

For Android applications, every thread is mapped to a system-level thread at runtime. Each thread costs a minimum of 64 KB of memory on Android. If we always create a thread for every new asynchronous task, we put memory pressure on the app. App performance may also suffer, because spawning new threads and context switching among threads both take up time and resources. And as long as a thread is still referenced, even if it is no longer active, it stays in memory and can’t be cleaned up by the garbage collector.

Thread pools can help us manage concurrency more efficiently. ThreadPoolExecutor creates a pool of worker threads and schedules tasks for them to execute. It can grow the pool to meet demand as new tasks arrive, and it can shrink the pool when threads are idle and no longer need to be kept alive. A thread pool therefore improves app performance by reducing per-task overhead, and it controls resource usage by bounding the number of threads.
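To make the growing and bounding behavior concrete, here is a minimal sketch that uses only java.util.concurrent. The helper name largestPoolSizeFor, the one-second keep-alive, and the choice of a SynchronousQueue are assumptions made for illustration; none of them come from the sample app.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.SynchronousQueue
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// Hypothetical helper: submits `taskCount` blocking tasks to a small pool and
// reports the largest number of worker threads the pool ever held.
// `taskCount` must not exceed `maximumPoolSize`, otherwise the pool rejects the extra task.
fun largestPoolSizeFor(taskCount: Int, corePoolSize: Int, maximumPoolSize: Int): Int {
    val pool = ThreadPoolExecutor(
        corePoolSize,
        maximumPoolSize,
        1L, TimeUnit.SECONDS,        // idle threads above corePoolSize are retired after 1s
        SynchronousQueue<Runnable>() // direct hand-off: a busy pool adds a thread instead of queueing
    )
    val release = CountDownLatch(1)
    try {
        repeat(taskCount) {
            // Each task parks until released, so every worker stays busy.
            pool.execute { release.await() }
        }
        return pool.largestPoolSize
    } finally {
        release.countDown() // let all parked tasks finish
        pool.shutdown()
        pool.awaitTermination(5, TimeUnit.SECONDS)
    }
}
```

With four blocking tasks, a core size of 2, and a maximum of 4, the pool grows to four threads; with a single task it only ever creates one. The queue choice matters here: with an unbounded queue, a ThreadPoolExecutor queues extra tasks and never grows past corePoolSize.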

Thread pools seem to be the solution to our concurrency headache, and many libraries have adopted the technique, such as OkHttp and AndroidX WorkManager. Each library maintains its own thread pool by default.

When we include these libraries in our application, each one spins up new threads on its own, unaware of the other thread pools, and we come full circle. Hundreds of threads can exist in our application because those libraries do not know about each other.

In this story, we are going to walk through an example of configuring libraries to share a common thread pool. We will also summarize the benefits and costs of managing thread pools, as it is necessary for continuously evolving our apps without degrading the performance.

Setup

The sample Android application can be found in this GitHub repo. It displays a Reddit feed by querying https://www.reddit.com/top.json. The following libraries are used in this application:

  • Retrofit/OkHttp for handling network requests
  • Coroutines/Flow for data layer processing
  • Jetpack Compose for rendering UI
  • Coil-compose for loading images
  • WorkManager for no reason (for enriching our example)

All these libraries involve concurrency and threading. They are configured in the default way without any advanced API usage.

Analysis

We choose the Android Studio template Basic Activity as our baseline, in order to compare it with our sample Reddit app. Let’s analyze our thread usage in both apps first.

We can inspect the currently active threads of Android applications in many ways:

  • Use Android Studio Profiler and record system tracing on CPU
  • Use the debugger to view the stack trace of any thread (we can then see all thread names in the drop-down)
  • Log Thread.getAllStackTraces().keys.sortedBy { it.name } (don’t use this in production apps)
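The logging option can be wrapped in a small helper. This is a sketch only: the function name liveThreadNames is made up for this example, and how the result gets printed (println, Logcat, and so on) is left to the caller.

```kotlin
// Returns the names of all live threads in this process, sorted alphabetically.
// Debugging aid only: Thread.getAllStackTraces() captures a stack trace for every thread.
fun liveThreadNames(): List<String> =
    Thread.getAllStackTraces().keys
        .map { it.name }
        .sorted()
```

Calling liveThreadNames().forEach(::println) from a debug build prints one thread name per line, which makes two snapshots easy to compare side by side.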

We use the last approach for demonstration purposes because it lets us easily sort the thread names alphabetically. Below is the comparison of all thread usage, with the baseline app on the left and our sample Reddit app on the right.

Left: Baseline app. Right: Our sample Reddit app

Surprise! Even with our simple Reddit app, there can be as many as 35 active threads. Our baseline app only uses 5 threads though.

Let’s look at the difference one by one.

ConnectivityThread

ConnectivityThread is from the Android OS. According to its Javadoc, it is a shared singleton connectivity thread for the system. It is used for connectivity operations such as AsyncChannel connections to system services.

In short, this thread appears once we introduce network operations into our app. Since it is bound to the Android OS, we do not have much control over it.

DefaultDispatcher-worker-*

These are the worker threads managed by CoroutineScheduler. If a coroutine uses one of the non-main standard dispatchers as its context, namely Dispatchers.Default or Dispatchers.IO, its execution is performed on these worker threads.

Rather than relying on the ThreadPoolExecutor implementation from the Java SDK, CoroutineScheduler takes on the responsibility of adding new threads when tasks come in and removing threads when they idle. The same CoroutineScheduler instance is used to back both Dispatchers.Default and Dispatchers.IO.

CoroutineScheduler sets corePoolSize to the number of available CPU cores or 2, whichever is larger. Its maximumPoolSize is the number of available CPU cores or 64, whichever is larger. Both are configurable through their corresponding system properties.
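Taking the two sizing rules exactly as stated above, the arithmetic can be written as two plain functions of the core count. The function names are hypothetical and are not part of kotlinx.coroutines; the sketch only restates the rules, it does not read the library's actual configuration.

```kotlin
// "Number of available CPU cores or 2, whichever is larger."
fun describedCorePoolSize(availableProcessors: Int): Int = maxOf(availableProcessors, 2)

// "Number of available CPU cores or 64, whichever is larger."
fun describedMaxPoolSize(availableProcessors: Int): Int = maxOf(availableProcessors, 64)
```

On an 8-core phone these rules give a core size of 8 and a maximum of 64. Passing Runtime.getRuntime().availableProcessors() yields the values for the current device.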

Beyond scaling threads on demand, CoroutineScheduler is also a task scheduler that manages the ordering of tasks and applies a work-stealing policy. We will pause here for now. If you are curious to learn more, I recommend reading the documentation and the implementation of CoroutineScheduler.

OkHttp Dispatcher

These threads come from OkHttp’s Dispatcher. Dispatcher creates an internal ThreadPoolExecutor with corePoolSize equal to 0 and an unlimited maximumPoolSize. Dispatcher is responsible for dispatching the actual work of sending requests and receiving responses to its threads.

All of these threads share the same name, OkHttp Dispatcher.
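A pool with that shape can be approximated with java.util.concurrent alone. The sketch below is a stand-in, not OkHttp's code: the 60-second keep-alive, the SynchronousQueue, the daemon flag, and both function names are assumptions chosen for this example.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.SynchronousQueue
import java.util.concurrent.ThreadFactory
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// Stand-in for a pool with corePoolSize 0 and an unlimited maximumPoolSize.
fun newDispatcherLikeExecutor(threadName: String = "OkHttp Dispatcher"): ThreadPoolExecutor =
    ThreadPoolExecutor(
        0,                     // corePoolSize: no threads are kept while the pool is idle
        Int.MAX_VALUE,         // maximumPoolSize: effectively unlimited
        60L, TimeUnit.SECONDS, // assumed keep-alive before an idle thread is retired
        SynchronousQueue<Runnable>(),
        // Every worker gets the same name; daemon is an assumption so the sketch never blocks JVM exit.
        ThreadFactory { runnable -> Thread(runnable, threadName).apply { isDaemon = true } }
    )

// Runs a trivial task and returns the name of the worker thread that executed it.
fun nameOfWorkerThread(executor: ExecutorService): String =
    executor.submit(Callable { Thread.currentThread().name }).get()
```

Before any task is submitted, such a pool holds no threads at all. A thread is created for the first task and lingers for the keep-alive period, which is why these threads show up in a snapshot shortly after a request.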

OkHttp TaskRunner

The rest of the OkHttp-prefixed threads are managed by TaskRunner. TaskRunner lives in the internal package, so it should be treated as an implementation detail. In essence, TaskRunner manages a set of daemon worker threads and accepts Task instances. For instance, trimming the cache to its max size is submitted to TaskRunner as a Task.

As of OkHttp 4.9, every thread created by TaskRunner dynamically changes its name based on the current Task. For example, the thread name OkHttp www.reddit.com indicates that there is a Task named www.reddit.com currently being run.

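// Excerpt: before a Task runs, the worker thread is renamed after it.
val currentThread = Thread.currentThread()
val oldName = currentThread.name // kept so the original name can be restored later
currentThread.name = task.name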
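The same rename-then-restore pattern can be tried outside OkHttp. The helper below is a hypothetical, self-contained sketch (runWithThreadName is not an OkHttp API) that renames the current thread while a block runs and restores the old name afterwards.

```kotlin
// Hypothetical helper: the current thread carries `taskName` while `block` runs
// and gets its previous name back afterwards.
fun <T> runWithThreadName(taskName: String, block: () -> T): T {
    val currentThread = Thread.currentThread()
    val oldName = currentThread.name
    currentThread.name = taskName
    try {
        return block()
    } finally {
        currentThread.name = oldName // restore the name even if the block throws
    }
}
```

Any thread snapshot taken while the block is running shows the task name instead of the worker's original name, which explains why the names of these threads keep changing between snapshots.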

Okio Watchdog

Watchdog is a single daemon thread that is responsible for invoking the action at the scheduled timeout. Okio’s Watchdog is created on the critical path of OkHttp executing a network request.

WM.task-* & androidx.work-*

WorkManager exposes public APIs to specify the executor for both the threading of Worker and its internal tracking. To be more specific:

  • Configuration.Builder#setExecutor(): This executor controls the thread of Worker.doWork(). Since that method is implemented by the application, all of these threads are prefixed with androidx.work.
  • Configuration.Builder#setTaskExecutor(): This executor is used by WorkManager for its internal bookkeeping. For example, WorkManagerTaskExecutor wraps the task executor inside a SerialExecutor to guarantee the execution order of tasks such as a StopWorkRunnable. Because these threads are used internally, their names are prefixed with WM.task.

For either API, an internal ThreadPoolExecutor is created by default if no executor is specified. The actual configuration is a fixed thread pool whose size is the number of CPU processors minus 1, clamped to between 2 and 4.
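The default sizing rule described above is easy to misread, so here it is as a pure function of the processor count. The function name is made up for this sketch and is not a WorkManager API.

```kotlin
// Processors minus one, clamped to the range 2..4.
fun describedWorkManagerPoolSize(availableProcessors: Int): Int =
    (availableProcessors - 1).coerceIn(2, 4)
```

A dual-core device gets 2 threads, a quad-core device gets 3, and anything with five or more cores is capped at 4.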

kotlinx.coroutines.DefaultExecutor

This is a single thread initialized by DefaultExecutor. It is used primarily for delay() in coroutines.

queued-work-looper

This is an Android OS HandlerThread created by QueuedWork. Because the WorkManager library comes with a few Android services, this path is hit upon app launch.

Merge

Based on the analysis above, the opportunity is quite obvious. We can merge three thread pools by creating our own ThreadPoolExecutor for reuse.

  • OkHttp’s Dispatcher
  • WorkManager’s Executor
  • Coroutine’s CoroutineScheduler

Let’s make this happen for our sample Reddit app! We can start by defining our shared ThreadPoolExecutor, setting its corePoolSize to the number of CPU processors and its maximumPoolSize to unlimited.
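A shared executor along those lines might look like the sketch below. Only the core size (CPU count) and the unlimited maximum come from the description above; the class and function names, the "shared-pool" prefix, the 60-second keep-alive, the daemon flag, and the SynchronousQueue are assumptions for illustration and may differ from the sample repo.

```kotlin
import java.util.concurrent.SynchronousQueue
import java.util.concurrent.ThreadFactory
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

// Names each worker "<prefix>-<n>" so shared threads are easy to spot in a thread dump.
class NamedThreadFactory(private val prefix: String) : ThreadFactory {
    private val counter = AtomicInteger(0)

    override fun newThread(runnable: Runnable): Thread =
        Thread(runnable, "$prefix-${counter.incrementAndGet()}").apply { isDaemon = true }
}

// One executor intended to be handed to every library that accepts a custom one.
fun newSharedExecutor(
    cpuCount: Int = Runtime.getRuntime().availableProcessors()
): ThreadPoolExecutor =
    ThreadPoolExecutor(
        cpuCount,                     // corePoolSize: one thread per CPU processor
        Int.MAX_VALUE,                // maximumPoolSize: unlimited
        60L, TimeUnit.SECONDS,        // assumed keep-alive for threads above the core size
        SynchronousQueue<Runnable>(), // hand-off queue, so the pool can actually grow past the core size
        NamedThreadFactory("shared-pool")
    )
```

The hand-off queue is a deliberate choice in this sketch: with an unbounded queue the executor would queue tasks once the core threads are busy, and the unlimited maximum would never come into play.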

Next, we need to pass this shared ThreadPoolExecutor to OkHttp, Retrofit, and Coil. Keep in mind that if we forget to configure OkHttp for every library that depends on it, say we configure Retrofit but not Coil, we may still end up with two OkHttpClient instances. That would simply worsen the situation by creating more threads.

For WorkManager, we should follow the on-demand initialization section of the Android Developer documentation. This requires us to remove the ContentProvider that initializes WorkManager by default from AndroidManifest.xml, and to have our Application implement Configuration.Provider. In addition, make sure all WorkManager references are requested through WorkManager.getInstance(Context), because it takes the customized configuration into consideration whereas WorkManager.getInstance() does not.

Last but not least, configuring the shared thread pool for coroutine requires a certain amount of effort. First, we need to make sure that all Dispatchers are injectable. Not only does injecting dispatchers make testing easier by allowing tests to provide a different implementation, but it also allows us to provide our own Dispatcher implementation.

Second, we need to provide our implementation through dependency injection. Here we use ExecutorService.asCoroutineDispatcher to convert the ThreadPoolExecutor into a CoroutineDispatcher. We then need to inject this CoroutineDispatcher into all places that previously used Dispatchers.Default or Dispatchers.IO.

Fortunately, all the concurrency libraries used in our sample Reddit application expose APIs to configure their thread pools, so this is sufficient for this story. The full change can be found in this commit.

Benefit

Reduced number of threads in our application

Previously we had three different thread pools. Now we can reuse the same thread pool to run concurrent tasks. As illustrated below, we can observe the reduced number of threads from 35 to 19 in our sample application.

Left: Before reusing thread pools. Right: After reusing thread pools

Indirect improvement of the app performance

When I ran an A/B test to quantify the impact of merging thread pools for a real Google Play app with 500M+ downloads, the results were quite promising: a 2% reduction in cold app launch time at the 90th percentile, with a low p-value (<0.01). The number should be used for reference only because it is very contextual. It depends on the total number and activity of the thread pools being merged.

An improvement in the general app performance is also within expectation. Because we have centralized control of our thread pool, we could analyze the required parallelism and configure the thread pool accordingly. We can adjust corePoolSize, maximumPoolSize, and our scheduling algorithm based on the production workflow.

Cost

Behavioral change

Not all thread pools are configured the same way. They may have different corePoolSize, maximumPoolSize, keepAliveTime, or priority (thread priority itself is already an advanced topic). For example:

  • OkHttp sets corePoolSize to 0 by default
  • WorkManager uses a FixedThreadPool where the number of threads does not change by default
  • Coroutines implement their own thread pooling and scheduling

These three thread pools are not configured with the same parameters. Merging them is therefore likely to change the behavior of concurrent tasks. Because the thread pools are no longer isolated, thread safety issues that previously went undetected may be exposed by the merge.

Additional effort may also be required to preserve library-specific features. For example, coroutines provide Dispatchers.Default for CPU-intensive work and Dispatchers.IO for IO work. If that distinction is necessary, we have to build an equivalent implementation on top of the shared pool ourselves.

Vulnerability

If we forget to configure any library that uses a concurrency utility (say we configure OkHttp and Retrofit but forget Coil), we end up with two OkHttpClient instances and leave the thread pool problem unresolved.

Another issue with Kotlin coroutines is that if Dispatchers.Default or Dispatchers.IO is still referenced anywhere in the production code, the default CoroutineScheduler may still be in use. As a result, we still have multiple thread pools.

Lastly, we need to audit the library to ensure the proper exposure of thread configuration API. A library may create some threads for internal usage, and we need to be fully aware of them in case they grow into potential problems.

Takeaway

Recycling and reducing thread pools can bring real performance benefits and improve overall app health. On the other hand, it requires meticulous auditing and continuous validation to ensure that the merged thread pool maintains consistent behavior. I would suggest taking the following steps one at a time:

  • Qualify all threads with meaningful names.
  • Audit thread creation to identify opportunities to reduce and reuse thread pools.
  • A/B test the change of merging thread pools to validate the behavioral change and performance improvement.
  • Set up guidance or tooling to ensure no future regression.

Last but not least, thanks to my colleague Colin White for proofreading and giving valuable feedback.
