From 8445559b50de45c876ff4b61224bffb7093434d6 Mon Sep 17 00:00:00 2001
From: Jerome Forissier <jerome.forissier@linaro.org>
Date: Thu, 31 Aug 2017 18:04:49 +0200
Subject: [PATCH] docs/debug.md: add section about abort/panic messages and
 symbolize.py

Signed-off-by: Jerome Forissier <jerome.forissier@linaro.org>
Reviewed-by: Joakim Bech <joakim.bech@linaro.org>
Reviewed-by: Etienne Carriere <etienne.carriere@linaro.org>
---
 docs/debug.md | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/docs/debug.md b/docs/debug.md
index f7fa9c0..d1462c3 100644
--- a/docs/debug.md
+++ b/docs/debug.md
@@ -11,6 +11,7 @@
 		1. [Affinic Debugger](#153-affinic-debugger)
 	1. [Known issues](#16-known-issues)
 2. [Ftrace](#2-ftrace)
+3. [Abort dumps](#3-abort-dumps-panics)
 
 In this document we would like to describe how to debug OP-TEE. Depending on the
 platform you are using you will have a couple of different options.
@@ -294,6 +295,111 @@ graph-time.
 $ echo 0 > options/graph-time
 ```
 
+# 3. Abort dumps, panics
+
+When OP-TEE encounters a serious error condition, it prints diagnostic
+information to the secure console. The message contains a call stack if
+CFG_UNWIND=y (enabled by default).
+The following errors will trigger a dump:
+- Data or prefetch abort exception in the TEE core (kernel mode) or in a TA
+(user mode),
+- When a user-mode Trusted Application panics, either by calling TEE_Panic()
+directly or due to some error detected by the TEE Core Internal API,
+- When the TEE core detects a fatal error and decides to hang the system
+because there is no way to proceed safely (core panic).
+
+The messages look slightly different depending on:
+- Whether the error is an exception or a panic,
+- The exception/privilege level when the exception occurred (PL0/EL0 if a
+user mode Trusted Application was running, PL1/EL1 if it was the TEE core),
+- Whether the TEE and TA are 32 or 64 bits,
+- The exact type of exception (data or prefetch abort, translation fault,
+read or write permission fault).
+
+Here is an example of a data abort with write permission fault in a 32-bit
+Trusted Application, running on a 32-bit TEE core (QEMU):
+
+```
+ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 (write permission fault)
+ERROR:   [0x0] TEE-CORE:  fsr 0x0000080f  ttbr0 0x0e07a06a  ttbr1 0x0e07406a  cidr 0x1
+ERROR:   [0x0] TEE-CORE:  cpu #0          cpsr 0x20000030
+ERROR:   [0x0] TEE-CORE:  r0 0x0012937f      r4 0x0013a0e4    r8 0x00000000   r12 0x0e07dd80
+ERROR:   [0x0] TEE-CORE:  r1 0x00001b00      r5 0x00121e47    r9 0x00000000    sp 0x001026f4
+ERROR:   [0x0] TEE-CORE:  r2 0x00131658      r6 0x001026f4   r10 0x00000000    lr 0x00121bc9
+ERROR:   [0x0] TEE-CORE:  r3 0x00000002      r7 0x001026f4   r11 0x00000000    pc 0x00105bdc
+ERROR:   [0x0] TEE-CORE: Status of TA 5b9e0e40-2636-11e1-ad9e-0002a5d5c51b (0xe073b70) (active)
+ERROR:   [0x0] TEE-CORE:  arch: arm  load address: 0x103000  ctx-idr: 1
+ERROR:   [0x0] TEE-CORE:  stack: 0x100000 10240
+ERROR:   [0x0] TEE-CORE:  region 0: va 0x100000 pa 0xe31e000 size 0x3000 flags rw-
+ERROR:   [0x0] TEE-CORE:  region 1: va 0x103000 pa 0xe200000 size 0x2e000 flags r-x
+ERROR:   [0x0] TEE-CORE:  region 2: va 0x131000 pa 0xe22e000 size 0xa000 flags r--
+ERROR:   [0x0] TEE-CORE:  region 3: va 0x13b000 pa 0xe238000 size 0xe6000 flags rw-
+ERROR:   [0x0] TEE-CORE:  region 4: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 5: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 6: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 7: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE: Call stack:
+ERROR:   [0x0] TEE-CORE:  0x00105bdc
+ERROR:   [0x0] TEE-CORE:  0x00121bc9
+ERROR:   [0x0] TEE-CORE:  0x00121d1b
+ERROR:   [0x0] TEE-CORE:  0x00121e6f
+```
+
+The above dump was triggered by the TA trying to write to a global const
+variable. Such variables are stored in the `.rodata` section of the ELF binary,
+and loaded into a read-only section of memory.
+
+OP-TEE provides a helper script called `symbolize.py` to facilitate the
+analysis of such issues. It is located in the OP-TEE OS source tree, and is
+also copied to the TA development kit. Whenever you are confronted with an error
+message reporting a serious error and showing a "Call stack:" line, you may use
+the symbolize script.
+
+`symbolize.py` reads its input from stdin and writes extended debug information
+to stdout. The `-d` (directories) option tells the script where to look for TA
+ELF file(s) or for `tee.elf` (the TEE core). The `-s` (strip) option is used
+to shorten the source file paths. Please refer to `symbolize.py --help` for
+details.
+
+
+```
+$ cat dump.txt | ./optee_os/scripts/symbolize.py -d optee_test/out/ta/* -d optee_os/out/arm/core -s `pwd`
+# (or run the script, copy and paste the dump, then press Ctrl+D)
+ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 (write permission fault)
+ERROR:   [0x0] TEE-CORE:  fsr 0x0000080f  ttbr0 0x0e07a06a  ttbr1 0x0e07406a  cidr 0x1
+ERROR:   [0x0] TEE-CORE:  cpu #0          cpsr 0x20000030
+ERROR:   [0x0] TEE-CORE:  r0 0x0012937f      r4 0x0013a0e4    r8 0x00000000   r12 0x0e07dd80
+ERROR:   [0x0] TEE-CORE:  r1 0x00001b00      r5 0x00121e47    r9 0x00000000    sp 0x001026f4
+ERROR:   [0x0] TEE-CORE:  r2 0x00131658      r6 0x001026f4   r10 0x00000000    lr 0x00121bc9
+ERROR:   [0x0] TEE-CORE:  r3 0x00000002      r7 0x001026f4   r11 0x00000000    pc 0x00105bdc
+ERROR:   [0x0] TEE-CORE: Status of TA 5b9e0e40-2636-11e1-ad9e-0002a5d5c51b (0xe073b70) (active)
+ERROR:   [0x0] TEE-CORE:  arch: arm  load address: 0x103000  ctx-idr: 1
+ERROR:   [0x0] TEE-CORE:  stack: 0x100000 10240
+ERROR:   [0x0] TEE-CORE:  region 0: va 0x100000 pa 0xe31e000 size 0x3000 flags rw-
+ERROR:   [0x0] TEE-CORE:  region 1: va 0x103000 pa 0xe200000 size 0x2e000 flags r-x .ta_head .text .rodata
+ERROR:   [0x0] TEE-CORE:  region 2: va 0x131000 pa 0xe22e000 size 0xa000 flags r-- .rodata .ARM.extab .ARM.extab.text.__aeabi_ldivmod .ARM.extab.text.__aeabi_uldivmod .ARM.exidx .got .dynsym .rel.got .dynamic .dynstr .hash .rel.dyn
+ERROR:   [0x0] TEE-CORE:  region 3: va 0x13b000 pa 0xe238000 size 0xe6000 flags rw- .data .bss
+ERROR:   [0x0] TEE-CORE:  region 4: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 5: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 6: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE:  region 7: va 0 pa 0 size 0 flags ---
+ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 const_var .rodata+4852 (write permission fault)
+ERROR:   [0x0] TEE-CORE: Call stack:
+ERROR:   [0x0] TEE-CORE:  0x00105bdc TA_CreateEntryPoint at optee_test/ta/os_test/ta_entry.c:44
+ERROR:   [0x0] TEE-CORE:  0x00121bc9 ta_header_add_session at optee_os/lib/libutee/arch/arm/user_ta_entry.c:115
+ERROR:   [0x0] TEE-CORE:  0x00121d1b entry_open_session at optee_os/lib/libutee/arch/arm/user_ta_entry.c:159
+ERROR:   [0x0] TEE-CORE:  0x00121e6f __utee_entry at optee_os/lib/libutee/arch/arm/user_ta_entry.c:229
+```
+
+The Python script uses several tools from the GNU Binutils package to perform
+the following tasks:
+1. Translate the call stack addresses into function names, file names and line
+numbers;
+2. Convert the abort address to a symbol plus some offset (`const_var` in the
+example) and/or an ELF section name plus some offset (`.rodata+4852` in the
+example);
+3. Print the names of the ELF sections contained in each memory region of a TA.
+
 [Affinic Debugger]: http://www.affinic.com/?page_id=109
 [README.md]: ../README.md
 [QEMU ARMv7-A]: qemu.md#3-qemu-console
-- 
GitLab