|  | Coresight - HW Assisted Tracing on ARM | 
|  | ====================================== | 
|  |  | 
|  | Author:   Mathieu Poirier <mathieu.poirier@linaro.org> | 
|  | Date:     September 11th, 2014 | 
|  |  | 
|  | Introduction | 
|  | ------------ | 
|  |  | 
|  | Coresight is an umbrella of technologies allowing for the debugging of | 
|  | ARM-based SoCs.  It includes solutions for JTAG and HW assisted tracing.  This | 
|  | document is concerned with the latter. | 
|  |  | 
|  | HW assisted tracing is becoming increasingly useful when dealing with systems | 
|  | that have many SoCs and other components like GPU and DMA engines.  ARM has | 
|  | developed a HW assisted tracing solution by means of different components, each | 
|  | being added to a design at synthesis time to cater to specific tracing needs. | 
|  | Components are generally categorised as sources, links and sinks and are | 
|  | (usually) discovered using the AMBA bus. | 
|  |  | 
|  | "Sources" generate a compressed stream representing the processor instruction | 
|  | path based on tracing scenarios as configured by users.  From there the stream | 
|  | flows through the coresight system (via ATB bus) using links that are connecting | 
|  | the emanating source to a sink(s).  Sinks serve as endpoints to the coresight | 
|  | implementation, either storing the compressed stream in a memory buffer or | 
|  | creating an interface to the outside world where data can be transferred to a | 
|  | host without fear of filling up the onboard coresight memory buffer. | 
|  |  | 
|  | A typical coresight system would look like this: | 
|  |  | 
|  |   ***************************************************************** | 
|  |  **************************** AMBA AXI  ****************************===|| | 
|  |   *****************************************************************    || | 
|  |         ^                    ^                            |            || | 
|  |         |                    |                            *            ** | 
|  |      0000000    :::::     0000000    :::::    :::::    @@@@@@@    |||||||||||| | 
|  |      0 CPU 0<-->: C :     0 CPU 0<-->: C :    : C :    @ STM @    || System || | 
|  |   |->0000000    : T :  |->0000000    : T :    : T :<--->@@@@@     || Memory || | 
|  |   |  #######<-->: I :  |  #######<-->: I :    : I :      @@@<-|   |||||||||||| | 
|  |   |  # ETM #    :::::  |  # PTM #    :::::    :::::       @   | | 
|  |   |   #####      ^ ^   |   #####      ^ !      ^ !        .   |   ||||||||| | 
|  |   | |->###       | !   | |->###       | !      | !        .   |   || DAP || | 
|  |   | |   #        | !   | |   #        | !      | !        .   |   ||||||||| | 
|  |   | |   .        | !   | |   .        | !      | !        .   |      |  | | 
|  |   | |   .        | !   | |   .        | !      | !        .   |      |  * | 
|  |   | |   .        | !   | |   .        | !      | !        .   |      | SWD/ | 
|  |   | |   .        | !   | |   .        | !      | !        .   |      | JTAG | 
|  |   *****************************************************************<-| | 
|  |  *************************** AMBA Debug APB ************************ | 
|  |   ***************************************************************** | 
|  |    |    .          !         .          !        !        .    | | 
|  |    |    .          *         .          *        *        .    | | 
|  |   ***************************************************************** | 
|  |  ******************** Cross Trigger Matrix (CTM) ******************* | 
|  |   ***************************************************************** | 
|  |    |    .     ^              .                            .    | | 
|  |    |    *     !              *                            *    | | 
|  |   ***************************************************************** | 
|  |  ****************** AMBA Advanced Trace Bus (ATB) ****************** | 
|  |   ***************************************************************** | 
|  |    |          !                        ===============         | | 
|  |    |          *                         ===== F =====<---------| | 
|  |    |   :::::::::                         ==== U ==== | 
|  |    |-->:: CTI ::<!!                       === N === | 
|  |    |   :::::::::  !                        == N == | 
|  |    |    ^         *                        == E == | 
|  |    |    !  &&&&&&&&&       IIIIIII         == L == | 
|  |    |------>&& ETB &&<......II     I        ======= | 
|  |    |    !  &&&&&&&&&       II     I           . | 
|  |    |    !                    I     I          . | 
|  |    |    !                    I REP I<.......... | 
|  |    |    !                    I     I | 
|  |    |    !!>&&&&&&&&&       II     I           *Source: ARM ltd. | 
|  |    |------>& TPIU  &<......II    I            DAP = Debug Access Port | 
|  |            &&&&&&&&&       IIIIIII            ETM = Embedded Trace Macrocell | 
|  |                ;                              PTM = Program Trace Macrocell | 
|  |                ;                              CTI = Cross Trigger Interface | 
|  |                *                              ETB = Embedded Trace Buffer | 
|  |           To trace port                       TPIU= Trace Port Interface Unit | 
|  |                                               SWD = Serial Wire Debug | 
|  |  | 
|  | While on-target configuration of the components is done via the APB bus, | 
|  | all trace data are carried out-of-band on the ATB bus.  The CTM provides | 
|  | a way to aggregate and distribute signals between CoreSight components. | 
|  |  | 
|  | The coresight framework provides a central point to represent, configure and | 
|  | manage coresight devices on a platform.  This first implementation centers on | 
|  | the basic tracing functionality, enabling components such as ETM/PTM, funnel, | 
|  | replicator, TMC, TPIU and ETB.  Future work will enable more | 
|  | intricate IP blocks such as STM and CTI. | 
|  |  | 
|  |  | 
|  | Acronyms and Classification | 
|  | --------------------------- | 
|  |  | 
|  | Acronyms: | 
|  |  | 
|  | PTM:     Program Trace Macrocell | 
|  | ETM:     Embedded Trace Macrocell | 
|  | STM:     System Trace Macrocell | 
|  | ETB:     Embedded Trace Buffer | 
|  | ITM:     Instrumentation Trace Macrocell | 
|  | TPIU:    Trace Port Interface Unit | 
|  | TMC-ETR: Trace Memory Controller, configured as Embedded Trace Router | 
|  | TMC-ETF: Trace Memory Controller, configured as Embedded Trace FIFO | 
|  | CTI:     Cross Trigger Interface | 
|  |  | 
|  | Classification: | 
|  |  | 
|  | Source: | 
|  |    ETMv3.x, ETMv4, PTMv1.0, PTMv1.1, STM, STM500, ITM | 
|  | Link: | 
|  |    Funnel, replicator (intelligent or not), TMC-ETR | 
|  | Sinks: | 
|  |    ETBv1.0, ETBv1.1, TPIU, TMC-ETF | 
|  | Misc: | 
|  |    CTI | 
|  |  | 
|  |  | 
|  | Device Tree Bindings | 
|  | -------------------- | 
|  |  | 
|  | See Documentation/devicetree/bindings/arm/coresight.txt for details. | 
|  |  | 
|  | As of this writing drivers for ITM, STMs and CTIs are not provided but are | 
|  | expected to be added as the solution matures. | 
|  |  | 
|  |  | 
|  | Framework and implementation | 
|  | ---------------------------- | 
|  |  | 
|  | The coresight framework provides a central point to represent, configure and | 
|  | manage coresight devices on a platform.  Any coresight compliant device can | 
|  | register with the framework as long as it uses the right APIs: | 
|  |  | 
|  | struct coresight_device *coresight_register(struct coresight_desc *desc); | 
|  | void coresight_unregister(struct coresight_device *csdev); | 
|  |  | 
|  | The registering function takes a "struct coresight_desc *desc" and | 
|  | registers the device with the core framework.  The unregister function takes | 
|  | a reference to a "struct coresight_device", obtained at registration time. | 
|  |  | 
|  | If everything goes well during the registration process the new devices will | 
|  | show up under /sys/bus/coresight/devices, as shown here for a TC2 platform: | 
|  |  | 
|  | root:~# ls /sys/bus/coresight/devices/ | 
|  | replicator  20030000.tpiu    2201c000.ptm  2203c000.etm  2203e000.etm | 
|  | 20010000.etb         20040000.funnel  2201d000.ptm  2203d000.etm | 
|  | root:~# | 
|  |  | 
|  | The registration function takes a "struct coresight_desc", which looks like this: | 
|  |  | 
|  | struct coresight_desc { | 
|  |         enum coresight_dev_type type; | 
|  |         struct coresight_dev_subtype subtype; | 
|  |         const struct coresight_ops *ops; | 
|  |         struct coresight_platform_data *pdata; | 
|  |         struct device *dev; | 
|  |         const struct attribute_group **groups; | 
|  | }; | 
|  |  | 
|  |  | 
|  | The "coresight_dev_type" identifies what the device is, i.e., source, link or | 
|  | sink while the "coresight_dev_subtype" will characterise that type further. | 
|  |  | 
|  | The "struct coresight_ops" is mandatory and will tell the framework how to | 
|  | perform base operations related to the components, each component having | 
|  | a different set of requirements.  For that "struct coresight_ops_sink", | 
|  | "struct coresight_ops_link" and "struct coresight_ops_source" have been | 
|  | provided. | 
|  |  | 
|  | The next field, "struct coresight_platform_data *pdata" is acquired by calling | 
|  | "of_get_coresight_platform_data()", as part of the driver's _probe routine and | 
|  | "struct device *dev" gets the device reference embedded in the "amba_device": | 
|  |  | 
|  | static int etm_probe(struct amba_device *adev, const struct amba_id *id) | 
|  | { | 
|  |         ... | 
|  |         ... | 
|  |         drvdata->dev = &adev->dev; | 
|  |         ... | 
|  | } | 
|  |  | 
|  | Specific classes of device (source, link, or sink) have generic operations | 
|  | that can be performed on them (see "struct coresight_ops").  The | 
|  | "**groups" is a list of sysfs entries pertaining to operations | 
|  | specific to that component only.  "Implementation defined" customisations are | 
|  | expected to be accessed and controlled using those entries. | 
|  |  | 
|  | Last but not least, "struct module *owner" is expected to be set to reflect | 
|  | the information carried in "THIS_MODULE". | 
|  |  | 
|  | How to use | 
|  | ---------- | 
|  |  | 
|  | Before trace collection can start, a coresight sink needs to be identified. | 
|  | There is no limit on the number of sinks (nor sources) that can be enabled at | 
|  | any given moment.  As a generic operation, all devices pertaining to the sink | 
|  | class will have an "enable_sink" entry in sysfs: | 
|  |  | 
|  | root:/sys/bus/coresight/devices# ls | 
|  | replicator  20030000.tpiu    2201c000.ptm  2203c000.etm  2203e000.etm | 
|  | 20010000.etb         20040000.funnel  2201d000.ptm  2203d000.etm | 
|  | root:/sys/bus/coresight/devices# ls 20010000.etb | 
|  | enable_sink  status  trigger_cntr | 
|  | root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink | 
|  | root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink | 
|  | 1 | 
|  | root:/sys/bus/coresight/devices# | 
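
The sink-enable step above can be scripted.  What follows is only a minimal
sketch and not part of the coresight framework: the function name
"cs_set_attr" and the "CS_BASE" variable are hypothetical, while the
"enable_sink" attribute and the "20010000.etb" device name are taken from the
listing above.  It writes a value to a sysfs attribute and reads it back to
confirm that the driver accepted it.

```shell
#!/bin/sh
# Hypothetical helper: write a value to a coresight sysfs attribute and
# confirm it by reading the attribute back.
# CS_BASE is an assumption, made so the sketch can be pointed at another tree.
CS_BASE="${CS_BASE:-/sys/bus/coresight/devices}"

cs_set_attr() {
	dev="$1"; attr="$2"; val="$3"
	path="$CS_BASE/$dev/$attr"

	# Fail early if the device or attribute does not exist.
	[ -e "$path" ] || { echo "no such attribute: $path" >&2; return 1; }

	echo "$val" > "$path" || return 1

	# Read back what the driver reports and compare with what was written.
	[ "$(cat "$path")" = "$val" ]
}
```

On the TC2 platform shown above, "cs_set_attr 20010000.etb enable_sink 1"
would presumably be equivalent to the echo/cat pair in the transcript.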
|  |  | 
|  | At boot time the current etm3x driver will configure the first address | 
|  | comparator with "_stext" and "_etext", essentially tracing any instruction | 
|  | that falls within that range.  As such "enabling" a source will immediately | 
|  | trigger a trace capture: | 
|  |  | 
|  | root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source | 
|  | root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source | 
|  | 1 | 
|  | root:/sys/bus/coresight/devices# cat 20010000.etb/status | 
|  | Depth:          0x2000 | 
|  | Status:         0x1 | 
|  | RAM read ptr:   0x0 | 
|  | RAM wrt ptr:    0x19d3   <----- The write pointer is moving | 
|  | Trigger cnt:    0x0 | 
|  | Control:        0x1 | 
|  | Flush status:   0x0 | 
|  | Flush ctrl:     0x2001 | 
|  | root:/sys/bus/coresight/devices# | 
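
Checking that the write pointer is moving can also be done without reading
the status by eye.  The sketch below assumes the status file keeps the layout
shown above; the function names "etb_wrt_ptr" and "etb_is_capturing" are
hypothetical.  It extracts the "RAM wrt ptr" value and compares two readings
taken a short time apart.

```shell
#!/bin/sh
# Hypothetical helper: print the "RAM wrt ptr" value from an ETB status file.
etb_wrt_ptr() {
	# Fields split on whitespace: "RAM" "wrt" "ptr:" "<value>" ...
	awk '$1 == "RAM" && $2 == "wrt" && $3 == "ptr:" { print $4 }' "$1"
}

# Hypothetical helper: succeed if the write pointer differs between two
# readings taken $2 seconds apart (default: 1 second).
etb_is_capturing() {
	first="$(etb_wrt_ptr "$1")"
	sleep "${2:-1}"
	second="$(etb_wrt_ptr "$1")"
	[ -n "$first" ] && [ "$first" != "$second" ]
}
```

A call such as "etb_is_capturing 20010000.etb/status" would be one way to use
it.  Two identical readings do not strictly prove that capture has stopped,
since the buffer may have wrapped back to the same position, so the result is
best treated as a hint.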
|  |  | 
|  | Trace collection is stopped the same way: | 
|  |  | 
|  | root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source | 
|  | root:/sys/bus/coresight/devices# | 
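
The enable/disable sequence can be wrapped around a workload.  The following
is a sketch under stated assumptions: "cs_trace_cmd" and "CS_BASE" are
hypothetical names, and only the "enable_sink" and "enable_source" attributes
come from the transcripts above.  The source captures whatever it has been
configured to trace while the command runs; the sink is left enabled
afterwards so that its buffer can still be read.

```shell
#!/bin/sh
# CS_BASE is an assumption, made so the sketch can be pointed at another tree.
CS_BASE="${CS_BASE:-/sys/bus/coresight/devices}"

# Hypothetical wrapper: cs_trace_cmd SINK SOURCE COMMAND [ARGS...]
cs_trace_cmd() {
	sink="$CS_BASE/$1/enable_sink"
	src="$CS_BASE/$2/enable_source"
	shift 2

	# A sink must be selected before the source is enabled.
	echo 1 > "$sink" || return 1
	echo 1 > "$src" || return 1

	# Run the workload and remember its exit status.
	"$@"
	rc=$?

	# Stop the source whether or not the command succeeded.
	echo 0 > "$src"
	return "$rc"
}
```

For example, "cs_trace_cmd 20010000.etb 2201c000.ptm sleep 1" would keep the
PTM enabled for roughly one second.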
|  |  | 
|  | The content of the ETB buffer can be harvested directly from /dev: | 
|  |  | 
|  | root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \ | 
|  | of=~/cstrace.bin | 
|  |  | 
|  | 64+0 records in | 
|  | 64+0 records out | 
|  | 32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s | 
|  | root:/sys/bus/coresight/devices# | 
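
The size of the capture above can be cross-checked against the status output.
This is a sketch, assuming that the "Depth" value is expressed in 32-bit
words and that dd used its default 512-byte block size; both assumptions are
consistent with the numbers shown, since 0x2000 words of 4 bytes is exactly
the 32768 bytes reported by dd.  The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert an ETB depth, in 32-bit words, to bytes.
etb_size_bytes() {
	echo $(( $1 * 4 ))
}

# Hypothetical helper: number of 512-byte dd records for a byte count.
dd_records() {
	echo $(( $1 / 512 ))
}

etb_size_bytes 0x2000                     # prints 32768
dd_records "$(etb_size_bytes 0x2000)"     # prints 64
```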
|  |  | 
|  | The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32. | 
|  |  | 
|  | Following is a DS-5 output of an experimental loop that increments a variable up | 
|  | to a certain value.  The example is simple and yet provides a glimpse of the | 
|  | wealth of possibilities that coresight provides. | 
|  |  | 
|  | Info                                    Tracing enabled | 
|  | Instruction     106378866       0x8026B53C      E52DE004        false   PUSH     {lr} | 
|  | Instruction     0       0x8026B540      E24DD00C        false   SUB      sp,sp,#0xc | 
|  | Instruction     0       0x8026B544      E3A03000        false   MOV      r3,#0 | 
|  | Instruction     0       0x8026B548      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Timestamp                                       Timestamp: 17106715833 | 
|  | Instruction     319     0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Instruction     9       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Instruction     7       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Instruction     7       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Instruction     10      0x8026B54C      E59D3004        false   LDR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4 | 
|  | Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1 | 
|  | Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4] | 
|  | Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c | 
|  | Instruction     6       0x8026B560      EE1D3F30        false   MRC      p15,#0x0,r3,c13,c0,#1 | 
|  | Instruction     0       0x8026B564      E1A0100D        false   MOV      r1,sp | 
|  | Instruction     0       0x8026B568      E3C12D7F        false   BIC      r2,r1,#0x1fc0 | 
|  | Instruction     0       0x8026B56C      E3C2203F        false   BIC      r2,r2,#0x3f | 
|  | Instruction     0       0x8026B570      E59D1004        false   LDR      r1,[sp,#4] | 
|  | Instruction     0       0x8026B574      E59F0010        false   LDR      r0,[pc,#16] ; [0x8026B58C] = 0x80550368 | 
|  | Instruction     0       0x8026B578      E592200C        false   LDR      r2,[r2,#0xc] | 
|  | Instruction     0       0x8026B57C      E59221D0        false   LDR      r2,[r2,#0x1d0] | 
|  | Instruction     0       0x8026B580      EB07A4CF        true    BL       {pc}+0x1e9344 ; 0x804548c4 | 
|  | Info                                    Tracing enabled | 
|  | Instruction     13570831        0x8026B584      E28DD00C        false   ADD      sp,sp,#0xc | 
|  | Instruction     0       0x8026B588      E8BD8000        true    LDM      sp!,{pc} | 
|  | Timestamp                                       Timestamp: 17107041535 |