# arm

## TF-RMM Stage 1 Memory Management

**TF-A Tech Forum** 

Javier Almansa Sobrino, April 2024

© 2024 Arm

## Agenda

- The physical address space
- Granule state tracking
- Stage 1 translation regime
  - Low VA range
  - High VA range
    - Slot buffers
    - Per-CPU stacks
- The Stage 1 Translation library
- Unittests
- Future work

#### The Physical Address Space



## **Granule State Tracking**

- RMM needs to keep track of all the delegable (Non-Secure PAS) memory available at boot time.
- An array of granule structures keeps track of the state of the memory.
  - One entry per granule (page) of available memory.
- The state of a granule might be a precondition for some RMI SMCs. Likewise, granules can undergo state transitions as part of the RMI SMCs.

    struct granule {
        /*
         * @lock protects the struct granule itself. Take this lock whenever
         * inspecting or modifying any other fields in this struct.
         */
        spinlock_t lock;

        /*
         * @state is the state of the granule.
         */
        enum granule_state state;

        /*
         * @refcount counts RMM and realm references to this granule with the
         * following rules:
         *  - The @state of the granule cannot be modified when @refcount
         *    is non-zero.
         *  - When a granule is mapped into the RMM, either the granule lock
         *    must be held or a reference must be held.
         *  - The content of the granule itself can be modified when
         *    @refcount is non-zero without holding @lock. However, specific
         *    types of granules may impose further restrictions on concurrent
         *    access.
         */
        unsigned long refcount;
    };
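
As a rough illustration of the tracking scheme described above, the sketch below indexes an array of granule entries by physical address and applies a state transition only when its precondition holds. It is a single-threaded simplification, not RMM code: the spinlock is omitted, and `MEM_BASE`, `NR_GRANULES`, `addr_to_granule()`, `granule_transition()` and the reduced state list are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE	4096UL

/* Hypothetical platform values, chosen only for this sketch. */
#define MEM_BASE	0x80000000UL
#define NR_GRANULES	16UL

/* Illustrative subset of states, not the full RMM list. */
enum granule_state {
	GRANULE_STATE_NS,
	GRANULE_STATE_DELEGATED,
	GRANULE_STATE_RD
};

/* Reduced version of struct granule; the spinlock is left out. */
struct granule {
	enum granule_state state;
	unsigned long refcount;
};

/* One entry per granule (page) of tracked memory. */
static struct granule granules[NR_GRANULES];

/* Return the tracking entry for a physical address, or NULL if untracked. */
static struct granule *addr_to_granule(uintptr_t pa)
{
	if (((pa % GRANULE_SIZE) != 0UL) || (pa < MEM_BASE) ||
	    (pa >= (MEM_BASE + (NR_GRANULES * GRANULE_SIZE)))) {
		return NULL;
	}
	return &granules[(pa - MEM_BASE) / GRANULE_SIZE];
}

/*
 * Move a granule from state @from to state @to. The current state acts as
 * the precondition, and the change is refused while @refcount is non-zero.
 */
static bool granule_transition(struct granule *g, enum granule_state from,
			       enum granule_state to)
{
	assert(g != NULL);

	if ((g->state != from) || (g->refcount != 0UL)) {
		return false;
	}
	g->state = to;
	return true;
}
```

An SMC handler built on this sketch would first look up the granule, then attempt the transition and report an error to the caller if it is refused.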




## Stage 1 Low VA space

- Shared across all CPUs
- Static (and mostly flat) mappings
  - Symbols from the linker are imported in order to create flat mappings.
  - Other mappings, such as the EL3 shared region or per-platform mappings, might not be flat.
  - Translation tables are stored in the .ro section.
- RMM is compiled as a PIE binary
  - GOT and other relocations are fixed up by the startup code before the MMU is enabled.
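
To make the idea of a flat mapping concrete, here is a minimal sketch of how a region descriptor with VA equal to PA could be derived from a start/end address pair. In RMM the addresses would come from linker symbols; the sketch takes plain integers instead. `struct map_region`, `flat_region()`, `region_is_flat()` and the `ATTR_*` flags are hypothetical names for this example, not the library's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE	4096UL

/* Illustrative attribute flags, not the library's encoding. */
#define ATTR_CODE	(1U << 0)	/* read-only, executable */
#define ATTR_RO		(1U << 1)	/* read-only data */
#define ATTR_RW		(1U << 2)	/* read-write data */

/* Hypothetical region descriptor. */
struct map_region {
	uintptr_t base_pa;
	uintptr_t base_va;
	size_t size;
	unsigned int attr;
};

/* Describe [start, end) as a page-aligned region with VA == PA (flat). */
static struct map_region flat_region(uintptr_t start, uintptr_t end,
				     unsigned int attr)
{
	assert(end > start);

	/* Round the start down and the end up to page boundaries. */
	uintptr_t base = start & ~(PAGE_SIZE - 1UL);
	uintptr_t limit = (end + PAGE_SIZE - 1UL) & ~(PAGE_SIZE - 1UL);
	struct map_region r = { base, base, (size_t)(limit - base), attr };

	return r;
}

/* A region is flat when its virtual and physical bases coincide. */
static bool region_is_flat(const struct map_region *r)
{
	return r->base_pa == r->base_va;
}
```

A non-flat region, such as one for the EL3 shared area, would simply carry a `base_va` that differs from `base_pa`.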



### Stage 1 High VA space

- Per-CPU set of translation tables
- Contains mappings for the slot buffers, mapped at fixed VAs.
  - Any CPU can map/unmap any granule on any slot buffer.
- Contains mappings for the per-CPU stacks
- This space is managed by *xlat\_high\_va.{c, h}*



## Stage 1 High VA space – Slot buffers

- Fixed number of slots per CPU
- Each slot is used to map a granule in a particular state
  - RMM uses the granule\_state to ensure that granules are mapped to the right slot
  - enum buffer\_slot in buffer.h
- Each CPU has its own set of translation tables
  - The same type of slot has the same VA across all CPUs
  - This eases the migration of vCPUs
- The Slot Buffer component includes optimizations to increase map/unmap performance.
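
The relationship between granule state, slot and VA can be sketched as follows. The point of the example is that the slot VA is a function of the slot alone, so it is the same on every CPU even though each CPU has its own tables. `SLOT_VA_BASE`, the `sketch_*` enums and both helper functions are assumptions for illustration; the real slot list is `enum buffer_slot` in buffer.h.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SIZE	4096ULL

/* Assumed base VA of the slot buffer area; the same value on every CPU. */
#define SLOT_VA_BASE	0xFFFFFFFFFFE00000ULL

/* Illustrative granule states and slots; not the lists from the RMM headers. */
enum sketch_state {
	STATE_NS, STATE_DELEGATED, STATE_RD, STATE_REC, NR_STATES
};

enum sketch_slot {
	SLOT_FOR_NS, SLOT_FOR_DELEGATED, SLOT_FOR_RD, SLOT_FOR_REC, NR_SLOTS
};

/* Each granule state is mapped through its own dedicated slot. */
static enum sketch_slot state_to_slot(enum sketch_state state)
{
	static const enum sketch_slot table[NR_STATES] = {
		[STATE_NS] = SLOT_FOR_NS,
		[STATE_DELEGATED] = SLOT_FOR_DELEGATED,
		[STATE_RD] = SLOT_FOR_RD,
		[STATE_REC] = SLOT_FOR_REC,
	};

	assert(state < NR_STATES);
	return table[state];
}

/* The VA depends only on the slot, never on the CPU doing the mapping. */
static uint64_t slot_to_va(enum sketch_slot slot)
{
	assert(slot < NR_SLOTS);
	return SLOT_VA_BASE + ((uint64_t)slot * GRANULE_SIZE);
}
```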



#### Stage 1 High VA space – Per CPU Stack

- Stack size configurable at build time
  - RMM\_NUM\_PAGES\_PER\_STACK
- The stack start for each CPU is calculated at boot time and the mapping is updated accordingly.
- An unmapped guard page protects against stack underflows.
- There is a special stack used to handle stack overflow faults.
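
A minimal sketch of the boot-time stack calculation, assuming the stacks of all CPUs are laid out back to back in one pool. Only `RMM_NUM_PAGES_PER_STACK` comes from the slides; its default value here, `MAX_CPUS`, the pool layout and the helper names are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SIZE	4096UL

/* Build-time option; the default below is an assumed value for the sketch. */
#ifndef RMM_NUM_PAGES_PER_STACK
#define RMM_NUM_PAGES_PER_STACK	5UL
#endif

/* Hypothetical CPU count. */
#define MAX_CPUS	4U

#define STACK_SIZE	(RMM_NUM_PAGES_PER_STACK * GRANULE_SIZE)

/* Lowest address of the stack that belongs to @cpu_id. */
static uintptr_t cpu_stack_start(uintptr_t pool_base, unsigned int cpu_id)
{
	assert(cpu_id < MAX_CPUS);
	return pool_base + ((uintptr_t)cpu_id * STACK_SIZE);
}

/* Initial stack pointer: the stack grows down from the end of its range. */
static uintptr_t cpu_stack_top(uintptr_t pool_base, unsigned int cpu_id)
{
	return cpu_stack_start(pool_base, cpu_id) + STACK_SIZE;
}
```

Each CPU would then point its own High VA stack mapping at the range returned for its ID.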



## **Stage 1 Translation Library**

- Based on the TF-A xlat-v2 library
- Supports addresses up to 52 bits wide and up to 5 levels of translation (when FEAT\_LPA2 is enabled).
- Stateless. Uses the abstraction of a "context" to store status.
  - One context per CPU per VA region\*.
  - Contexts can be shared across CPUs.
- Uses TRANSIENT TTEs for dynamic mappings
  - A bit flag marks an invalid TTE as TRANSIENT.
  - An ordinary (non-TRANSIENT) invalid TTE cannot be used by the library for a dynamic mapping.
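
The TRANSIENT mechanism can be illustrated with a small sketch: an invalid entry carries a software flag that marks it as available for dynamic mapping, and map requests on any other entry are rejected. The bit position chosen for the flag, the address mask and the `tte_*` helpers are assumptions for this example and may differ from the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural valid bit of a translation table entry. */
#define TTE_VALID	(1ULL << 0)

/* Assumed software-defined bit used as the TRANSIENT marker in this sketch. */
#define TTE_TRANSIENT	(1ULL << 55)

/* Assumed output address field for a 4KB page entry. */
#define TTE_ADDR_MASK	0x0000FFFFFFFFF000ULL

/* TRANSIENT means: invalid for the hardware, but flagged for dynamic use. */
static bool tte_is_transient(uint64_t tte)
{
	return ((tte & TTE_VALID) == 0ULL) && ((tte & TTE_TRANSIENT) != 0ULL);
}

/* Map @pa through @tte. Only a TRANSIENT entry may be used; 0 on success. */
static int tte_map(uint64_t *tte, uint64_t pa)
{
	assert(tte != NULL);

	if (!tte_is_transient(*tte)) {
		return -1;
	}
	*tte = (pa & TTE_ADDR_MASK) | TTE_VALID;
	return 0;
}

/* Unmap @tte and return it to the TRANSIENT state so it can be reused. */
static int tte_unmap(uint64_t *tte)
{
	assert(tte != NULL);

	if ((*tte & TTE_VALID) == 0ULL) {
		return -1;
	}
	*tte = TTE_TRANSIENT;
	return 0;
}
```

A real implementation would also set memory attributes and perform the required TLB maintenance, which this sketch leaves out.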

- xlat/
  - include/
    - xlat\_contexts.h
    - xlat\_defs.h
    - xlat\_high\_va.h
    - xlat\_tables.h
  - src/
    - aarch64/
    - fake\_host/
    - xlat\_contexts.c
    - xlat\_defs\_private.h
    - xlat\_high\_va.c
    - xlat\_tables\_arch.c
    - xlat\_tables\_core.c
    - xlat\_tables\_private.h
    - xlat\_tables\_utils.c
  - tests/
    - CMakeLists.txt
    - xlat\_test\_defs.h
    - xlat\_test\_helpers.c
    - xlat\_test\_helpers.h
    - xlat\_tests\_base\_g1.cpp
    - xlat\_tests\_base\_g2.cpp
    - xlat\_tests\_base.h
    - xlat\_tests\_lpa2.cpp
    - xlat\_tests\_no\_lpa2.cpp
  - CMakeLists.txt
- CMakeLists.txt

### Stage 1 Translation Library – xlat\_ctx



### Stage 1 Translation Library – Initialization (I)



- Steps 4 & 5 must always be done in the WarmBoot path by every CPU. All the other steps can be done during either ColdBoot or WarmBoot.
- Both VA regions must be created and configured before step 5.

#### Stage 1 Translation Library – Initialization (II)



#### Unittests

- Support for unittests (CppUTest) using the *fake\_host* architecture
- Different test groups run the same tests with different configurations:
  - xlat\_tests\_LPA2: FEAT\_LPA2 enabled
  - xlat\_tests\_no\_LPA2: FEAT\_LPA2 disabled
- Tests cover both VA regions
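
The idea of running one test body under several configurations can be sketched in plain C as below. This is not CppUTest code and not the actual test suite: `struct test_cfg`, the helper functions and the chosen check (number of translation levels for the widest address, assuming a 4KB granule) are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical configuration record for one test group. */
struct test_cfg {
	const char *group;
	bool lpa2_enabled;
};

static const struct test_cfg test_groups[] = {
	{ "xlat_tests_LPA2", true },
	{ "xlat_tests_no_LPA2", false },
};

/* Widest address with a 4KB granule: 52 bits with FEAT_LPA2, 48 without. */
static unsigned int max_va_bits(const struct test_cfg *cfg)
{
	return cfg->lpa2_enabled ? 52U : 48U;
}

/* Levels needed for a VA width: 12 page-offset bits, then 9 bits per level. */
static unsigned int levels_for(unsigned int va_bits)
{
	assert(va_bits > 12U);
	return (va_bits - 12U + 8U) / 9U;	/* rounds up */
}

/* Shared test body: only the expected value depends on the configuration. */
static bool test_level_count(const struct test_cfg *cfg)
{
	unsigned int expected = cfg->lpa2_enabled ? 5U : 4U;

	return levels_for(max_va_bits(cfg)) == expected;
}

/* Run the shared body once per group and count the failures. */
static unsigned int run_all_groups(void)
{
	unsigned int failures = 0U;
	size_t nr_groups = sizeof(test_groups) / sizeof(test_groups[0]);

	for (size_t i = 0U; i < nr_groups; i++) {
		if (!test_level_count(&test_groups[i])) {
			failures++;
		}
	}
	return failures;
}
```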



#### Future work

- Remove recursive calls on some of the table creation APIs
- General code optimizations to improve efficiency
- Returned error codes need to be revisited
  - The library can call panic() under certain circumstances. It needs to return an error code to the caller instead.
- Break the stage 1 translation library API into context manipulation APIs and general TTE manipulation APIs

Thank You · Danke · Gracias · Grazie · 谢谢 · ありがとう · Asante · Merci · 감사합니다 · धन्यवाद · Kiitos · شکرًا · ধন্যবাদ · ధన్యవాదములు


The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks
