This is a memlatency comparison between regular (malloc'ed) memory and static TLB memory. Static TLB statically maps a range of physical memory into the virtual address space by installing PowerPC 440 TLB entries, so accesses to that range never cause TLB misses. On the PowerPC 440, by the way, the TLB update handler is implemented entirely in software (it is part of the Linux kernel source code).
As expected, the static TLB memory region always performs better than the regular memory region, which uses 4 KB paging.
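A rough way to see why 4 KB paging suffers here is to compare how much memory the TLB can cover at once with the size of the buffers under test. The sketch below is only an upper-bound estimate: it assumes all 64 TLB entries are available for 4 KB user pages, which is optimistic because the kernel needs some entries for itself. The function names `tlb_reach` and `pages_needed` are made up for this illustration and are not part of the benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of address space the TLB can map at once, assuming every entry
 * holds a page of the same size (an optimistic upper bound). */
static size_t tlb_reach(size_t entries, size_t page_size)
{
    return entries * page_size;
}

/* Number of pages needed to map a buffer of len bytes. */
static size_t pages_needed(size_t len, size_t page_size)
{
    assert(page_size > 0);
    return (len + page_size - 1) / page_size;
}
```

With 64 entries and 4 KB pages, `tlb_reach(64, 4096)` is 256 KB, while a 64 MB allocation spans `pages_needed(64u << 20, 4096)` = 16,384 pages, so a random walk over the regular region misses the TLB on most accesses.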
With the random access pattern, the regular memory region shows extra overhead at small iteration counts. At the 1,000-iteration point, it can cause up to 1,000 page faults. You can avoid this overhead by prefaulting. With stream access, on the other hand, 1,000 iterations cause only two page faults.
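Prefaulting can be sketched as follows: touch every page of the buffer once before the timed loop, so the page faults happen outside the measurement. This is a minimal illustration, not the code used by the benchmark. The helper names are hypothetical, and the 8-byte stride used in the stream-access estimate is an assumption chosen because it is consistent with the two page faults reported above for 1,000 iterations on 4 KB pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Write one byte per page so the kernel backs every page of buf before
 * timing starts. Returns the number of page-sized steps taken. */
static size_t prefault(volatile unsigned char *buf, size_t len, size_t page_size)
{
    size_t steps = 0;
    assert(page_size > 0);
    for (size_t off = 0; off < len; off += page_size) {
        buf[off] = 0;
        steps++;
    }
    /* buf may not be page aligned, so also touch the final byte. */
    if (len > 0)
        buf[len - 1] = 0;
    return steps;
}

/* Pages covered by n sequential accesses of `stride` bytes each,
 * starting at a page-aligned address. */
static size_t pages_for_stream(size_t n, size_t stride, size_t page_size)
{
    assert(page_size > 0);
    return (n * stride + page_size - 1) / page_size;
}

/* Hypothetical helper: malloc a buffer and prefault it. */
static unsigned char *alloc_prefaulted(size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    unsigned char *buf = malloc(len);
    if (buf == NULL || ps <= 0) {
        free(buf);
        return NULL;
    }
    prefault(buf, len, (size_t)ps);
    return buf;
}
```

Under the 8-byte-stride assumption, `pages_for_stream(1000, 8, 4096)` gives 2, whereas 1,000 random accesses into a 64 MB buffer will almost always land on 1,000 different pages.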
The source code and information on the benchmark program can be downloaded from here.
Experimental Condition | |
Machine: BG/L ION, PowerPC 440, 32 KB L1, 4 MB L3 (shared by two cores), 64 TLB entries, 512 MB memory
Reference kernel:
- version: 2.6.5 ION kernel + static TLB patch
- kernel config: 2.6.5 ION kernel |
Allocation Size is 64MB | Allocation Size is 128MB |