Skip to content

mindis/pycachesim

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

66 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pycachesim

A single-core cache hierarchy simulator written in python.

image

The goal is to accurately simulate the caching (allocation/hit/miss/replace/evict) behavior of all cache levels found in modern processors. It is developed as a backend to kerncraft, but is also planned to introduce a command line interface to replay LOAD/STORE instructions.

Currently supported features:
  • Inclusive cache hierarchies
  • LRU, MRU, RR and FIFO policies
  • N-way cache associativity
  • Write-allocate with write-back caches
  • Non-write-allocate with write-through caches
  • Write-combining with sub-blocking
  • Speed (core is implemented in C)
  • Python 2.7+ and 3.4+ support, with no other dependencies
Planned features:
  • Report cachelines on all levels
  • Report timeline of cache events
  • Visualize events (html file?)
  • More detailed store/evict handling (e.g., using dirty bits)
  • (uncertain) instruction cache
  • Optional classification into compulsory/capacity and conflict misses (by simulating other cache configurations in parallel)
  • (uncertain) multi-core support
  • Remove cl_size growth requirement (NVIDIA Kepler's L1 has 128B cl_size and L2 with 32B)

License

pycachesim is licensed under AGPLv3.

Usage

from cachesim import CacheSimulator, Cache, MainMemory

mem = MainMemory()
l3 = Cache("L3", 20480, 16, 64, "LRU")  # 20MB: 20480 sets, 16-ways with cacheline size of 64 bytes
mem.load_to(l3)
mem.store_from(l3)
l2 = Cache("L2", 512, 8, 64, "LRU", store_to=l3, load_from=l3)  # 256KB
l1 = Cache("L1", 64, 8, 64, "LRU", store_to=l2, load_from=l2)  # 32KB
cs = CacheSimulator(l1, mem)

cs.load(2342)  # Loads one byte from address 2342, should be a miss in all cache-levels
cs.store(512, length=8)  # Stores 8 bytes to addresses 512-519,
                         # will also be a load miss (due to write-allocate)
cs.load(512, length=8)  # Loads from address 512 until (exclusive) 520 (eight bytes)

cs.force_write_back()
cs.print_stats()

This should return:

CACHE *******HIT******** *******MISS******* *******LOAD******* ******STORE*******
   L1      1 (       8B)      2 (      65B)      3 (      73B)      1 (       8B)
   L2      0 (       0B)      2 (     128B)      2 (     128B)      1 (      64B)
   L3      0 (       0B)      2 (     128B)      2 (     128B)      1 (      64B)
  MEM      2 (     128B)      0 (       0B)      2 (     128B)      1 (      64B)

Each row refers to one memory-level, starting with L1 and ending with main memory. The 3 loads in L1 are the sum of all individual accesses to the cache-hierarchy. 1 (from first load) + 1 (from store with write-allocate) + 1 (from second load) = 3.

The 1 hit, is for bytes which were cached already. Internally the pycachesim operates on cache-lines, which all addresses get transformed to. Thus, the two misses throughout all cache-levels are actually two complete cache-lines and after the cache-line had been loaded the consecutive access to the same cache-line are handled as hits. That is also the reason why data sizes increase from L1 to L2. L1 is accessed byte-wise and L2 only with cache-line granularity.

So: hits, misses, stores and loads in L1 are byte-wise, just like stores throughout all cache-levels. Every other statistical information are based on cache-lines.

Comparison to other Cache Simulators

While searching for more versatile cache simulator for kerncraft, I stumbled across the following:

  • gem5: Very fully-featured full system simulator. Complex to extract only the memory subsystem
  • dineroIV: Nice and simple code, but does not support exclusive caches and not available under open source license.
  • cachegrind: Maintained and stable code of a well established open source project, but only supports inclusive first and last level caches.
  • callgrind: see cachegrind
  • SMPcache: Only supports one single cache and runs on Windows with GUI. Also not freely available.
  • CMPsim: Was only academically published and source code never made available.
  • CASPER: Was only academically published and source code never made available.
Package instructions1 blocks2 sub-blocks3 associtivity4 LRU5 MRU6 FIFO7 RR8 CCC9 3+ levels10 exclusive11 victim12 multi-core13 API14 open source15

gem5 dineroIV cachegrind callgrind SMPcache CMPsim CASPER pycachesim

x x x x

x

x x x x x x x x

? x

x

x x x x x x x x

x x x x x x x x

x

x x x

x x

x x x x

? x

x x x x

? x

?

x

x x

x x x

?

?

planned

?

?

x

?

x x

python, ruby, c++ c cli cli Windows GUI ? perl, c python

yes, BSD-style no, free for non-comercial use yes, GPLv2 yes, GPLv2 no, free for education und research no, source not public no, source not public yes, AGPLv3


  1. Instruction cache support (typically L1I)

  2. Cacheline/block granular caching

  3. Sub-blocking/sectoring for in cache-storage

  4. Support for n-way associativity

  5. Support least-recently-used (LRU), most-recently-used (MRU), first-in-last-out (FIFO), random (RR) replacement policy

  6. Support least-recently-used (LRU), most-recently-used (MRU), first-in-last-out (FIFO), random (RR) replacement policy

  7. Support least-recently-used (LRU), most-recently-used (MRU), first-in-last-out (FIFO), random (RR) replacement policy

  8. Support least-recently-used (LRU), most-recently-used (MRU), first-in-last-out (FIFO), random (RR) replacement policy

  9. Classification of misses into: compulsory (first time access), capacity (access after replacement), conflict (would have been a hit with full-associativity)

  10. Combining of at least three cache levels

  11. Exclusive cache relations (two levels may not share the same cacheline)

  12. Victim caches, where only evicted lines endup(e.g., AMD Bulldozer L3)

  13. Multi-core cache hierarchies with private and shared caches and cache coherency protocol

  14. Supported interfaces (cli = command-line-interface)

  15. Published under an Open Source Initiative approved license?

About

Python Cache Hierarchy Simulator

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 59.6%
  • C 40.4%