In the Overview of Jank and Frame Drops in Android - Application Layer article, we listed causes of jank originating from the app itself. In this article, we focus on causes stemming from the Android platform. Due to differences in hardware performance, feature implementations, and engineering capabilities among Android OEMs, system quality varies significantly. Here, we’ll categorize performance issues caused by system hardware and software.
Jank issues are a top priority for both phone manufacturers and app developers, often handled by dedicated “Performance” or “Stability” teams. Third-party tools like Tencent’s Matrix are excellent, but manufacturers often have proprietary solutions with deeper system access through source code modification.
Smoothness is judged by frame drops. If the screen is supposed to refresh at 60 FPS (or whatever the target refresh rate is) but only manages 55, the user perceives this as jank. Fluctuations in frame rate are especially noticeable.
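The arithmetic behind "55 instead of 60" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it expects a list of frame presentation timestamps in milliseconds that you have already extracted from a trace, and the function names and input format are made up for this example.

```python
def count_dropped_frames(timestamps_ms, refresh_hz=60.0):
    """Estimate how many Vsync slots were missed between consecutive frames.

    Assumption: timestamps_ms is a hypothetical, already-extracted list of
    frame presentation times in milliseconds, sorted ascending.
    """
    period = 1000.0 / refresh_hz
    dropped = 0
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        # An interval spanning N Vsync periods means N - 1 frames were missed.
        missed = round((cur - prev) / period) - 1
        if missed > 0:
            dropped += missed
    return dropped


def effective_fps(timestamps_ms):
    """Average frames per second over the whole sample window."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    duration_ms = timestamps_ms[-1] - timestamps_ms[0]
    return (len(timestamps_ms) - 1) * 1000.0 / duration_ms


if __name__ == "__main__":
    frames = [0, 16.7, 33.3, 66.7, 83.3]  # one frame took two Vsync periods
    print(count_dropped_frames(frames), round(effective_fps(frames), 1))
```

A single long interval lowers the average only slightly, which is why the count of missed slots is usually a more telling number than the average FPS.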
This series includes:
- Overview of Jank and Frame Drops in Android - Methodology
- Overview of Jank and Frame Drops in Android - System Layer
- Overview of Jank and Frame Drops in Android - Application Layer
- Overview of Jank and Frame Drops in Android - Low Memory
System-Level Performance Case Studies
Below are actual cases of jank caused by Android platform issues. Some issues are caught during development, while others only emerge after long-term use or in specific scenarios.
Many of these cases are visible in Systrace. If you’re unfamiliar, see the Systrace Series. Systrace provides a global view of the system’s operating state.
1. SurfaceFlinger Main Thread Overhead
SurfaceFlinger handles surface composition. If the work on its main thread runs past the per-frame budget (about 16.6 ms at 60 Hz), frames are dropped. The overrun also delays HWC Service and CRTC, and blocks app Binder calls such as dequeueBuffer and queueBuffer.
(The example Systraces show SurfaceFlinger main-thread overruns leading to yellow, dropped frames in apps.)
2. Under-Screen Ambient Light Sensor Screenshots
Some phones place the ambient light sensor under the screen. If the implementation has to take frequent screenshots to detect changes in ambient light, the screenshot operation can block the SurfaceFlinger main thread, causing jank.
3. HWC Service Execution Overhead
If the HWC Service is slow, SurfaceFlinger might fail to compose the next frame, blocking app dequeueBuffer and setTransactionState calls.
4. CRTC Execution Overhead
Slow CRTC (Display Controller) execution similarly prevents SurfaceFlinger composition and blocks app Binder returns.
5. CPU Scheduling Issues
Critical Tasks on Small Cores
If RenderThread is scheduled on a small core with lower performance, execution time increases, causing jank.
Low Priority and Slice Delays
A task that the Linux scheduler treats as low priority may still be critical to the user, for example a task that the app’s main thread interacts with. When such a task is slow to get a time slice, the user sees lag.
Preemption by RT Processes
App MainThread or RenderThread being preempted by Real-Time (RT) processes causes responsiveness issues. Google has considered making App threads RT during startup, but this can actually slow down startup due to system contention.
Core Binding and Migration
Problems occur when tasks that need big cores run on small ones, or vice versa. Binding a task to a specific core that happens to be busy can also cause failures (for example, CTS failures because core 7 is hogged by RenderThread).
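To reason about which cores are "big" and which are "small" on a given device, one rough approach is to group CPUs by their maximum frequency. The sketch below assumes the standard Linux cpufreq sysfs file `cpuinfo_max_freq` is readable (for example from a shell on the device). The helper names are hypothetical, and treating frequency alone as the cluster boundary is a simplification.

```python
import glob
import re


def read_max_freqs(root="/sys/devices/system/cpu"):
    """Read each CPU's hardware max frequency (kHz) from cpufreq sysfs.

    Assumption: the device exposes cpuinfo_max_freq and the caller has
    permission to read it.
    """
    freqs = {}
    for path in glob.glob(f"{root}/cpu[0-9]*/cpufreq/cpuinfo_max_freq"):
        cpu = int(re.search(r"cpu(\d+)/cpufreq", path).group(1))
        with open(path) as f:
            freqs[cpu] = int(f.read().strip())
    return freqs


def group_by_cluster(max_freq_khz):
    """Group CPU ids that share a max frequency, slowest cluster first."""
    clusters = {}
    for cpu, freq in max_freq_khz.items():
        clusters.setdefault(freq, []).append(cpu)
    return [sorted(clusters[f]) for f in sorted(clusters)]


if __name__ == "__main__":
    # Made-up frequencies for an 8-core device with three clusters.
    sample = {0: 1800000, 1: 1800000, 2: 1800000, 3: 1800000,
              4: 2420000, 5: 2420000, 6: 2420000, 7: 2840000}
    print(group_by_cluster(sample))
```

With a cluster map like this, the CPU column of a trace becomes easier to read: a RenderThread slice on a CPU from the first group ran on a small core.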
6. Thermal Throttling
Thermal protection is a hardware-level safety measure. Overheating triggers CPU/GPU frequency caps to cool down the device, which inevitably impacts smoothness. If your phone is hot and laggy, it’s a protection mechanism.
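One way a frequency cap can show up is as a gap between a CPU's hardware maximum (`cpuinfo_max_freq`) and its currently allowed maximum (`scaling_max_freq`) in cpufreq sysfs. The sketch below compares the two. It assumes you have already collected both values per CPU, and the function name and the tolerance are illustrative. A lowered `scaling_max_freq` can have causes other than heat (power-saving policies, for instance), so treat a hit as a hint rather than proof.

```python
def find_capped_cpus(hw_max_khz, scaling_max_khz, tolerance=0.99):
    """Return {cpu: allowed/hardware ratio} for CPUs whose ceiling is lowered.

    Assumption: both dicts map CPU id -> frequency in kHz, read from
    cpuinfo_max_freq and scaling_max_freq respectively.
    """
    capped = {}
    for cpu, hw_max in hw_max_khz.items():
        cur_max = scaling_max_khz.get(cpu)
        if cur_max is not None and cur_max < hw_max * tolerance:
            capped[cpu] = round(cur_max / hw_max, 2)
    return capped


if __name__ == "__main__":
    hw = {0: 1800000, 7: 2840000}
    allowed = {0: 1800000, 7: 1420000}  # made-up values: big core capped
    print(find_capped_cpus(hw, allowed))
```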
7. High Background Activity
Too many background processes hog CPU, I/O, and Memory, making the system “busy.”
CPU Saturation
Running dumpsys cpuinfo shows high overall CPU usage.
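As a rough cross-check that does not depend on the dumpsys output format, overall CPU load can also be derived from two snapshots of the aggregate `cpu` line in `/proc/stat`. This is a different data source from dumpsys cpuinfo, not a reimplementation of it. The sketch assumes the two lines were captured from a shell a short interval apart, and the function names are hypothetical.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (idle, total) jiffies."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # Guest time is already counted in user/nice, so only sum the first eight.
    total = sum(values[:8])
    return idle, total


def cpu_busy_percent(before, after):
    """Percentage of non-idle time between two /proc/stat snapshots."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (elapsed - (idle1 - idle0)) / elapsed


if __name__ == "__main__":
    first = "cpu  100 0 100 800 0 0 0 0 0 0"      # made-up snapshot
    second = "cpu  400 0 200 1300 100 0 0 0 0 0"  # made-up snapshot
    print(cpu_busy_percent(first, second))
```

Counting iowait as idle is a choice made here for simplicity. Under low memory it can be worth reporting iowait separately.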
Runnable Thread States
If a thread is in Runnable state but the scheduler can’t find time for it, it misses Vsync, causing jank.
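Outside of Systrace, the time a thread spends waiting on a run queue can be approximated from `/proc/<pid>/task/<tid>/schedstat`. On kernels built with scheduler statistics, it holds three numbers: time on CPU (ns), time waiting on a run queue (ns), and the number of timeslices run. The sketch below diffs two snapshots. It assumes that file is available on the device, and the helper names are made up for illustration.

```python
def parse_schedstat(text):
    """Parse the three fields of a schedstat file into a dict."""
    run_ns, wait_ns, slices = (int(x) for x in text.split()[:3])
    return {"run_ns": run_ns, "wait_ns": wait_ns, "timeslices": slices}


def runnable_delay_ms(before, after):
    """Milliseconds spent runnable-but-not-running between two snapshots."""
    a, b = parse_schedstat(before), parse_schedstat(after)
    return (b["wait_ns"] - a["wait_ns"]) / 1e6


if __name__ == "__main__":
    snap1 = "1000000 2000000 10"   # made-up values
    snap2 = "5000000 14000000 20"  # made-up values
    print(runnable_delay_ms(snap1, snap2))
```

If the delay accumulated over one frame interval approaches the Vsync period, the thread is being starved of CPU rather than doing too much work itself.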
Irrelevant Active Processes
Processes unrelated to the current foreground app (system or third-party) compete for resources and delay App MainThread scheduling.
Memory Reclamation Overhead
When memory is low, HeapTaskDaemon and kswapd0 hog CPU to reclaim memory, starving other processes.
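On kernels that expose pressure stall information (PSI), `/proc/pressure/memory` gives a direct reading of how much time tasks spend stalled on memory. It is a complementary signal to watching kswapd0 CPU time, not the same measurement. The parser below assumes the documented `some` / `full` line format and that the file exists on the device; the function name is hypothetical.

```python
def parse_psi(text):
    """Parse /proc/pressure/* content into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        # Each pair looks like "avg10=1.23" or "total=456" (microseconds).
        result[kind] = {k: float(v) for k, v in (p.split("=") for p in pairs)}
    return result


if __name__ == "__main__":
    # Made-up sample in the PSI format.
    sample = (
        "some avg10=12.50 avg60=3.10 avg300=0.80 total=123456\n"
        "full avg10=4.00 avg60=1.00 avg300=0.20 total=45678\n"
    )
    psi = parse_psi(sample)
    print(psi["some"]["avg10"], psi["full"]["avg10"])
```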
System Lock Contention
system_server AMS and WMS locks can become massive bottlenecks. If these locks are held too long, app Binder requests enter a wait state, causing jank in animations and interactions.
8. Excessive Layers
On Android P and later, layer computation happens on the SurfaceFlinger main thread. If there are too many background layers, rebuildLayerStacks takes too long and slows SurfaceFlinger down. Clearing background tasks helps.
9. Uneven Input Reporting
If input events (touches or gestures) are reported at uneven intervals, or not reported at all, the main thread has nothing to draw in response, which shows up as jank while scrolling.
10. LMK Resource Competition
Low Memory Killer (LMK) activity consumes resources. Processes being killed and immediately restarted by their parent create a “death-restart cycle” that drains CPU. Frequent GCs and heavy I/O (swapping) further deteriorate performance.
11. Low Memory I/O Latency
Low memory leads to increased disk I/O (swapping/paging). Since disk I/O is slow, main threads often enter “Uninterruptible Sleep” (State D), causing jank in list scrolling and app startup.
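A thread's current state, including `D` for uninterruptible sleep, is the third field of `/proc/<pid>/task/<tid>/stat`. The sketch below extracts it and computes how often a thread was seen in state D across a set of samples. The sampling loop is left out, and the function names are made up for illustration.

```python
def thread_state(stat_line):
    """Return the one-letter state from a /proc/.../stat line.

    The comm field is wrapped in parentheses and may itself contain spaces
    or ')', so split after the last ')' rather than on whitespace.
    """
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def d_state_ratio(stat_lines):
    """Fraction of samples in which the thread was in uninterruptible sleep."""
    if not stat_lines:
        return 0.0
    blocked = sum(1 for line in stat_lines if thread_state(line) == "D")
    return blocked / len(stat_lines)


if __name__ == "__main__":
    # Made-up, truncated stat lines for the same thread at different times.
    samples = [
        "1234 (my app) main) D 1 1234 0",
        "1234 (my app) main) R 1 1234 0",
        "1234 (my app) main) S 1 1234 0",
        "1234 (my app) main) D 1 1234 0",
    ]
    print(d_state_ratio(samples))
```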
12. GPU Composition Overhead
When SurfaceFlinger falls back to GPU composition, its main thread execution time spikes, potentially missing Vsync.
13. KSWAPD on Big Cores
Under heavy memory pressure, kswapd might hog big cores, causing thermal throttling or preempting critical app threads.
14. Uneven SurfaceFlinger Vsync
If the interval between SF Vsyncs becomes inconsistent (due to SF or HWC bugs), the user perceives uneven, stuttery motion.
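Uneven pacing can be quantified as the spread of the intervals between consecutive Vsync timestamps. The following sketch assumes a list of timestamps in milliseconds that you have already pulled from a trace; the function name and the input format are hypothetical.

```python
import statistics


def interval_jitter(timestamps_ms):
    """Summarize the spacing between consecutive timestamps."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return {
        "mean_ms": statistics.mean(intervals),
        # Population standard deviation: 0 means perfectly even pacing.
        "stdev_ms": statistics.pstdev(intervals),
        "max_ms": max(intervals),
    }


if __name__ == "__main__":
    even = [0, 10, 20, 30]
    uneven = [0, 8, 20, 30]  # same average pace, different spacing
    print(interval_jitter(even)["stdev_ms"], interval_jitter(uneven)["stdev_ms"])
```

Both sequences in the usage example have the same mean interval, so the average alone would hide the problem. The standard deviation is what separates them.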
15. Accessibility Services
Third-party apps using Accessibility Services to monitor input events can disrupt InputDispatcher behavior, delaying event delivery to the app’s main thread.
Summary
Android is a constantly evolving system. Each version solves many performance issues but might introduce new ones. OEMs add substantial custom code, which further impacts performance.
The cases above are just the tip of the iceberg. This is why manufacturers invest heavily in optimization, from the hardware and kernel levels up to dynamic system policies and AI-driven behavior learning. Those who neglect quality in favor of design alone often lose market share.
For more details on what OEMs do, see: What do phone manufacturers actually do when they say they are optimizing Android?