Mali Bifrost - Cache Clean

2 minute read

What Invokes Cache Clean?

  • When the power state changes (see kbase_pm_l2_update_state() / kbase_pm_shaders_update_state())
  • When a job completes (see jd_done_worker()) – unlikely to happen
  • When the GPU context is switched (see kbase_js_pull() / kbase_js_unpull()) – unlikely to happen

mali_kbase_jm_rb.c

1633 void kbase_backend_complete_wq(struct kbase_device *kbdev,
1634                         struct kbase_jd_atom *katom)
1635 {
1636     /*
1637      * If cache flush required due to HW workaround then perform the flush
1638      * now
1639      */
1640     kbase_backend_cache_clean(kbdev, katom);
1641 }

mali_kbase_device_hw.c

 872 void kbase_gpu_start_cache_clean_nolock(struct kbase_device *kbdev)
 873 {
 874     u32 irq_mask;
 875 
 876     lockdep_assert_held(&kbdev->hwaccess_lock);
 877 
 878     if (kbdev->cache_clean_in_progress) {
 879         /* If this is called while another clean is in progress, we
 880          * can't rely on the current one to flush any new changes in
 881          * the cache. Instead, trigger another cache clean immediately
 882          * after this one finishes.
 883          */
 884         kbdev->cache_clean_queued = true;
 885         return;
 886     }
 887 
 888     /* Enable interrupt */
 889     /** EE("GPU_IRQ_MASK - CLEAN_CACHES_COMPLETED"); */
 890     irq_mask = kbase_reg_read(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK));
 891     kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_IRQ_MASK),                                                                                                
 892                 irq_mask | CLEAN_CACHES_COMPLETED);
 893 
 894     KBASE_TRACE_ADD(kbdev, CORE_GPU_CLEAN_INV_CACHES, NULL, NULL, 0u, 0);
 895     kbase_reg_write(kbdev, GPU_CONTROL_REG(GPU_COMMAND),
 896                     GPU_COMMAND_CLEAN_INV_CACHES);
 897 
 898     kbdev->cache_clean_in_progress = true;
 899 }

Besides, the device driver configures the job slot to perform a cache clean and/or invalidate before and after the job executes, if required. This configuration is written right before the job chain is put on the slot. Although the device driver performs the write, the configuration is in fact dictated by the user-space app/runtime, which passes it in the atom structure as core_req.

PM Policy

mali_kbase_pm_policy.c

   32 static const struct kbase_pm_policy *const all_policy_list[] = {
   33 #ifdef CONFIG_MALI_NO_MALI
   34     &kbase_pm_always_on_policy_ops,
   35     &kbase_pm_coarse_demand_policy_ops,
   36 #if !MALI_CUSTOMER_RELEASE
   37     &kbase_pm_always_on_demand_policy_ops,
   38 #endif
   39 #else               /* CONFIG_MALI_NO_MALI */
   40     &kbase_pm_coarse_demand_policy_ops,
   41 #if !MALI_CUSTOMER_RELEASE
   42     &kbase_pm_always_on_demand_policy_ops,
   43 #endif  
   44     &kbase_pm_always_on_policy_ops
   45 #endif /* CONFIG_MALI_NO_MALI */
   46 };

The device driver manages the GPU power state by continuously reading the state from the GPU and updating it. For instance, if there are no in-flight jobs, the device driver tries to turn off the shader cores, and consequently the L2/tiler, to save power. The “pm_always_on” policy guarantees no power-related register I/O at run time.

GPU Protected Mode

  • The L2 must be powered down and the GPU must come out of fully coherent mode before entering protected mode.
  • When entering protected mode, we must also ensure that the GPU is not operating in coherent mode. This is to ensure that no protected memory can be leaked.

From the comments in the source code, I guess that protected mode prevents the data leakage that would otherwise be possible through cache coherency/flushing, but I could not find a caller that enters it.