Replay

To replay captured records, we built a simple user-space replayer.

The captured records include:

  • Register I/O
  • Memory contents (binary, job chain, etc.)
  • GPU page table
  • GPU synced address space (not necessary if we use symbol-based input/output identification)

Implementation

For the prototype under Debian Linux, we may need a small kernel module that allocates and frees memory in kernel space. The module should return the physical address of each allocated page so that the user-space replayer can update the page table entries to point at the newly allocated pages. The rationale is that the recorded page table maps the original physical addresses, which do not match the physical addresses available on the client device.

1. /dev/mem

Refer to the previous post

2. GPU page table reconstruction

Allocate the same number of pages for the page table as in the recording. Walk the recorded table and recursively allocate a page whenever you meet a valid page table entry. Refer to tgx_build_pgt(). Keep the base address of the new page table so that the replayer can program it correctly via register I/O.
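
The recursive walk can be sketched as follows. This is only an illustration, not the actual tgx_build_pgt(): it assumes a hypothetical entry format (512 eight-byte entries per table page, bit 0 as the valid bit, the address in bits 12 and up), and it simulates physical memory with a flat array so it runs without /dev/mem. alloc_page() is a made-up stand-in for the kernel module's allocator, and build_pgt() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES   512            /* assumed entries per table page */
#define PAGE_SZ   4096u          /* assumed page size */
#define PTE_VALID 0x1ull         /* assumed valid bit */
#define PTE_ADDR  (~0xfffull)    /* assumed address mask */
#define MAX_PAGES 64

/* Simulated physical memory: "physical address" = index * PAGE_SZ. */
static uint64_t phys[MAX_PAGES][ENTRIES];
static unsigned next_free = 1;   /* page 0 is reserved in this simulation */

/* Hypothetical stand-in for the kernel module's page allocator. */
static uint64_t alloc_page(void)
{
    assert(next_free < MAX_PAGES);
    memset(phys[next_free], 0, PAGE_SZ);
    return (uint64_t)next_free++ * PAGE_SZ;
}

static uint64_t *page_ptr(uint64_t pa) { return phys[pa / PAGE_SZ]; }

/*
 * Rebuild the table found at recorded address rec_pa inside the recorded
 * image `rec` and return the physical address of the new copy.
 * level counts down to 0; level-0 entries point at data pages.
 */
static uint64_t build_pgt(uint64_t rec[][ENTRIES], uint64_t rec_pa, int level)
{
    uint64_t new_pa = alloc_page();
    uint64_t *src = rec[rec_pa / PAGE_SZ];
    uint64_t *dst = page_ptr(new_pa);

    for (int i = 0; i < ENTRIES; i++) {
        if (!(src[i] & PTE_VALID))
            continue;                        /* skip invalid entries */
        uint64_t child = (level > 0)
            ? build_pgt(rec, src[i] & PTE_ADDR, level - 1)  /* next-level table */
            : alloc_page();                                  /* data page */
        /* Keep the recorded attribute bits, swap in the new address. */
        dst[i] = (src[i] & ~PTE_ADDR) | child;
    }
    return new_pa;   /* for the top level, this is the base address to keep */
}
```

The value returned by the top-level call is the new page table base, which is the address the replayer would later write through register I/O in place of the recorded one.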

3. Copy memory contents

Once the new page table is prepared, you can copy the memory contents to the corresponding address space. To this end, each recorded memory region should carry the address range it belongs to. For instance, the captured content format looks like: vm_start, vm_end, nr_pages, flags, is_valid, contents. Here vm refers to a GPU virtual address.
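
One possible layout for such a record is sketched below. The post only names the fields, so the field widths, the ordering, the 4 KiB page size, and the names struct mem_record and record_size() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical header for one captured region; the contents follow it. */
struct mem_record {
    uint64_t vm_start;   /* first GPU virtual address of the region */
    uint64_t vm_end;     /* one past the last GPU virtual address */
    uint32_t nr_pages;   /* number of pages in the region */
    uint32_t flags;      /* mapping flags as captured */
    uint32_t is_valid;   /* non-zero if the contents were captured */
    uint32_t reserved;   /* padding to keep the header 8-byte aligned */
};

/* Return the total record size in bytes, or 0 if the header is inconsistent. */
static size_t record_size(const struct mem_record *r)
{
    assert(r != NULL);
    if (r->vm_end <= r->vm_start)
        return 0;
    uint64_t span = r->vm_end - r->vm_start;
    if (span != (uint64_t)r->nr_pages * GPU_PAGE_SIZE)
        return 0;
    /* In this sketch, invalid regions carry a header only. */
    return sizeof(*r) + (r->is_valid ? (size_t)span : 0);
}
```

Checking the range against nr_pages before copying is a cheap way to catch a truncated or misparsed capture file.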

Because we manage the page table in user space together with /dev/mem, you should map the corresponding physical page into user space via mmap() before copying the contents. NB: carefully check whether each page table entry is valid, since the captured page table may miss some parts due to dynamic memory management. For the implementation, refer to tgx_mem_contents_init().
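
As a rough sketch of the map-then-copy step for a single page (not the actual tgx_mem_contents_init()): copy_to_phys_page() is a hypothetical helper, mem_fd is assumed to be an open descriptor for /dev/mem, and the host page size is used as the mapping granularity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy len bytes into the page at physical address pa through mem_fd.
 * pa must be page aligned and len must fit in one page.
 * Returns 0 on success, -1 on failure.
 */
static int copy_to_phys_page(int mem_fd, off_t pa, const void *src, size_t len)
{
    assert(src != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || pa % page != 0 || len > (size_t)page)
        return -1;

    void *dst = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_SHARED, mem_fd, pa);
    if (dst == MAP_FAILED) {
        perror("cannot map the physical page");
        return -1;
    }

    memcpy(dst, src, len);
    munmap(dst, (size_t)page);
    return 0;
}
```

A caller would first translate the record's GPU virtual address through the rebuilt page table, skip entries that are not valid, and then call this helper once per page with the resulting physical address.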

4. Run register I/O

Map the register region, starting at its base address and spanning its full size, and then you are ready to replay.

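#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* reg_base, GPU_REG_PA_SIZE and GPU_REG_PA_START are defined elsewhere. */
void tgx_reg_base_init(const int mem_fd)
{
    /* Map the GPU register window from /dev/mem into user space. */
    reg_base = (char *) mmap(NULL,
        GPU_REG_PA_SIZE,         // 16K
        PROT_READ | PROT_WRITE,
        MAP_SHARED,
        mem_fd,
        GPU_REG_PA_START);       // 0xe82c0000
    if (reg_base == MAP_FAILED) {
        perror("cannot map the memory to the user space");
        exit(EXIT_FAILURE);
    }
}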
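
With the register window mapped, the replay itself can be a simple loop over the captured accesses. The sketch below assumes 32-bit registers and a hypothetical record layout (struct reg_record); replay_regs() is an illustrative name, not a function from the replayer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one captured register access. */
enum reg_op { REG_READ, REG_WRITE };

struct reg_record {
    enum reg_op op;
    uint32_t offset;   /* byte offset from the register base */
    uint32_t value;    /* value written, or value observed during capture */
};

/*
 * Replay n records against a mapped register window of `size` bytes.
 * Returns the number of reads whose value differed from the capture,
 * or -1 if a record is misaligned or falls outside the window.
 */
static int replay_regs(volatile char *base, size_t size,
                       const struct reg_record *rec, size_t n)
{
    assert(base != NULL);
    int mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        if (rec[i].offset % 4 != 0 || (size_t)rec[i].offset + 4 > size)
            return -1;
        volatile uint32_t *reg = (volatile uint32_t *)(base + rec[i].offset);
        if (rec[i].op == REG_WRITE)
            *reg = rec[i].value;
        else if (*reg != rec[i].value)
            mismatches++;   /* a differing read is counted, not treated as fatal */
    }
    return mismatches;
}
```

In a real replay, a write that carries the recorded page table base would need its value replaced with the base address of the rebuilt table before it is issued.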