Complete Assignment 0 or ensure that the tools you would need to complete it are installed.
2. Clone your assignment repository:
https://git.cs.dal.ca/courses/2023-fall/csci-2122/assignment-5/????.git
where ???? is your CSID. Please see instructions in Assignment 0 and the tutorials on Brightspace if
you are not sure how.
Inside the repository there is one directory: cachesim, where code is to be written. You should set up
a CLion project for each of this directory. Inside the directory is a tests directory that contains tests
that will be executed each time you submit your code. Please do not modify the tests directory or the
.gitlab-ci.yml file that is found in the root directory. Modifying these files may break the tests.
These files will be replaced with originals when the assignments are graded. You are provided with a
sample Makefile file that can be used to build your program. If you are using CLion, a Makefile will
be generated from the CMakeLists.txt file generated by CLion.
Background:
Your task is to implement a generic cache module that can simulate any of the caches we discussed: direct
mapped caches, set associative caches, and fully associative caches.
Caches
Recall that a cache is defined by several parameters:
S: the number of sets
E: the number of lines per set
B: the number of bytes each line caches
The size of the cache is C = S x E x B. The type of cache depends on these parameters:
• In direct mapped caches, E = 1, i.e., there is one (1) line per set,
• In fully associative caches, S = 1, i.e., all lines are in a single set, and
• In set associative caches, S > 1 and E > 1, i.e., there are multiple sets, each with multiple lines.
When a cache receives a memory reference, it
1. Breaks up the address into a tag, a set index, and an offset.
2. Uses the index to identify the set where the referenced memory may be cached
3. Uses the tag to determine if a line in the set is caching the referenced memory
4. If the referenced memory is not being cached,
a. The cache determines if the set contains an unused line
b. If no lines are unused, the cache will evict a used line, using the least recently used criteria.
I.e., the line in the set that has not been used in the longest time is the victim.
c. The line is then loaded with the block of memory containing the reference. This is done
by forwarding the memory reference to the next level of the memory hierarchy.
5. At this point, a line in the selected set is caching the memory being referenced. The cache returns
the data being referenced.
The cache at the next level functions in the same way, except that its size, the number of sets, and the
number of lines per set may differ. Consequently, a single implementation of a cache simulator should
be able to simulate a variety of different caches.
Reference Streams
Apart from the configuration of the caches to be simulated, the cache simulator requires a reference stream. A reference stream is simply a sequence of load and store
operations with memory addresses to be accessed. This is equivalent a program running and accessing memory as it runs. Each load or store is a memory reference. For
most situations, distinguishing between load and store operations is not needed. However, when debugging a simulator, being able to distinguish between loads and stores
is helpful. An example of a reference stream is on the right, containing 14 references,
where an “L addr” is a load operation at address addr, and an “S addr val” is a store
operation that stores val at address addr.
The Cache Simulator
A cache simulator takes as input (i) a system configuration that includes the word size,
amount of memory, and other system properties; (ii) a cache configuration, describing
the caches in the system; and (iii) a reference stream. The simulator configures and instantiates the system being simulated and then processes the reference stream, by sending each request to the top-level
cache. The top-level cache will forward the request to the next cache if the request causes a miss, loading
the block from the cache below. Once the requested piece of memory is in the cache, the cache returns
the requested data. The simulator tracks the number of hits and misses that occur in call caches, and
outputs the aggregates after the reference stream has been processed.
Your task will be to implement the cache mechanism in the Cache Simulator.
Task: Implement the cache.c for the Simulator
Your task is to implement the cache.c module by implementing four functions. These functions are
declared in cache.h and are called from main.c. These functions are:
cache_t cache_new(unsigned int word_size, unsigned int sets, unsigned int lines,
unsigned int line_size)
This function instantiates a new cache and returns a handle (pointer) to it. The cache is specified
by four parameters:
• word_size : the size of machine word in bits (i.e., m), e.g., 16, 32, 64.
• sets : the number of sets in the cache (i.e., S), sets > 0
• lines : the number of lines in each set (i.e., E), lines > 0
• line_size : the number of bytes stored in each line, (i.e., B), line_size > 0
The function returns a value of type cache_t, which is defined as a void * pointer in cache.h.
This function should allocate and initialize an internal data structure for the cache and then return
a void * (cache_t) pointer to this data structure. This is like our linked-list implementation
where the linked list structures are defined and visible only inside the linked-list module. In this
case, the data structure implements a cache and not a linked list.
14
S 22 1234
L 22
S 48 3456
L 48
S 70 4567
L 70
S 4118 5678
L 4118
L 22
L 4118
S 2070 6789
L 2070
L 4118
L 22
void cache_load_word(cache_t handle, unsigned long address, void *word)
This function takes a handle (pointer) to a cache that was created by a cache_new() and loads
a word at memory address. I.e., this is what the CPU does when it needs to load a word from
memory, it requests it from the top-level cache. The function takes three (3) parameters:
• handle : the pointer to the cache, created by cache_new().
• address : the location of the value to be loaded. This is the address of a reference in the
reference stream.
• word : a pointer to a buffer of where the word is to be loaded (copied into).
If an error occurs, it is recommended that the program should be stopped, as no errors are expected.
This function performs the steps outlined in the Caches section of the Background. The function
should
1. Break up the address into a tag, index, and offset.
2. Use the index to locate the correct set.
3. Use the tag to determine if block of memory that includes the address is in one of the
lines in the set.
4. If it is (a cache hit), the offset is used to locate the word in that line, the word should be
copied into the buffer pointed to by word, and then the function can return.
5. Otherwise, it is a cache miss. In this case, a victim line is selected using the least recently
used criteria (LRU) criteria. The line is initialized with the current tag and loaded by calling
the function
void load_block_from_next_level(unsigned long address, void *block)
which is declared in main.h and defined in main.c. This function takes the address
as the first parameter, and a pointer to where the block should be loaded, which would
be the part of the line storing the block. Note: The cache simulator in main.c keeps
track of the block size of each cache, so the size of the block need not be provided. This
function will call the cache_load_block() of the next cache in the hierarchy. Once
the line is loaded, Step 4 is performed.
void cache_load_block(cache_t handle, unsigned long address, void *block,
unsigned int size)
This function takes a handle (pointer) to a cache and loads a block of size bytes of memory that
contains the memory address. I.e., this is what a cache does when it needs to load a block from
the next cache in the hierarchy. The function takes four (4) parameters:
• handle : the pointer to the cache, created by cache_new().
• address : a location within the block of memory to be loaded.
• block : a pointer to a buffer of where the block is to be loaded (copied into).
• size : the size of the block to be loaded. The block is no bigger than a line in the cache
but may be smaller than a line.
If an error occurs, it is recommended that the program should be stopped, as not errors are expected.
This function performs the same steps as cache_load_word(), except that the block may be bigger
than a single word.
void cache_store_word(cache_t handle, unsigned long address, void *word)
This function takes a handle (pointer) to a cache and stores a word at memory address. I.e., this
is what the CPU does when it needs to store a word into memory. The function takes three (3)
parameters:
• handle : the pointer to the cache, created by cache_new().
• address : the location of the value to be stored. This is the address of a reference in the
reference stream.
• word : a pointer to a buffer from where the word is to be stored (copied out of).
If an error occurs, the program should be stopped, as not errors are expected.
If the memory to be written is not in the cache, then it should be cached first.
Implementing LRU
To easily implement LRU in your caches, it is recommended to replace the valid field in the cache line with
an integer timestamp. A zero timestamp represents an invalid line, and a positive timestamp represents
the last time the line was accessed. The cache implementation should keep a clock counter as part of its
structure and increment it each time the cache is called. Each time a line is used, its timestamp is updated
with the clock counter. Thus, the smallest value timestamp represents the least recently used line.
The rest of the cache simulator is already implemented for you!
The cachesim Mainline
The main.c of cachesim is already implemented for you. Below is a brief description of what it does.
Input
The cachesim reads input from stdin. The input consists of three parts: (i) a system configuration; (ii) a
cache configuration; and (iii) a reference stream.
The system configuration consists of three integers:
• L : the number of caches in the
• W : the word size (typically 16, 32, or 64)
• M : the memory size
The cache configuration follows and consists of L cache configurations, from top to bottom. I.e., the first
configuration is the top-level (L1) cache, the next configuration is the next level cache and so on. Each
configuration consists of a string and three integers:
• Name : name of cache, e.g. “L1”
• S : number of sets in the cache
• E : number of lines in the set
• B : number of bytes in each line
The reference stream consists of an integer N denoting the number of references, followed by N references. Each reference is either a load of the form
L addr
or a store of the form
S addr val
where addr is nonnegative and less than the memory size, and val is nonnegative and fits within a word.
Processing
When cachesim starts running, it reads in the input, and creates the caches specified by the cache configurations by calling cache_new() for each cache.
It then enters the main loop and processes the reference stream:
• If the reference is a load, cache_load_word() is called on the top-level cache.
• If the reference is a store, cache_store_word() is called on each cache to ensure that all
caches are coherent.
During the processing, all cache hits and misses are recorded.
The function load_block_from_next_level() acts as the connection between two adjoining
caches. When this function is called from one cache, the code in main.c determines which cache to
forward the request to next. It also keeps track of whether this resulted in cache hit or a miss. This is
done by keeping track of the number of recursive calls to load_block_from_next_level(). The
recursive nature of this structure is why the misses and hits on the lower caches are displayed first.
Output
The cachesim outputs to stdout in three parts: (i) the cache configurations; (ii) the result of each
memory reference as it is being processed; (iii) the aggregates of hits and misses for each of the caches.
Note that the hits and misses are outputted in reverse order as the hit/miss is determined only after a
cache request completes.
Example
Input (abridged) Output (abridged)
3
16
65536
L1 16 1 16
L2 32 2 16
L3 64 4 16
65536
S 0 0
…
L 7846
L 4970
Level 1 cache: sets: 16, lines 1, block size: 16
Level 2 cache: sets: 32, lines 2, block size: 16
Level 3 cache: sets: 64, lines 4, block size: 16
Cache L3 miss @ 0x0000
Cache L2 miss @ 0x0000
Cache L1 miss @ 0x0000
Stored word: 0x0
…
Cache L3 miss @ 0x1ea6
Cache L2 miss @ 0x1ea6
Cache L1 miss @ 0x1ea6
Loaded word: 0x72b4
Cache L3 hit @ 0x136a
Cache L2 miss @ 0x136a
Cache L1 miss @ 0x136a
Loaded word: 0xb011
Cache: L1, hits: 2123, misses: 63413
Cache: L2, hits: 6039, misses: 57374
Cache: L3, hits: 24420, misses: 32954
Hints and Suggestions
• You will need a couple structs, one for cache and one for line. You may also want one for set.
• Fundamentally, a cache is an array of sets and a set is an array of lines.
• You should only need to modify one file: cache.c.
• There is not a lot of code to write (my solution is 130 lines). You can avoid significant duplication by
creating some helper functions in cache.c
Assignment Submission
Submission and testing are done using Git, Gitlab, and Gitlab CI/CD. You can submit as many times as you
wish, up to the deadline. Every time a submission occurs, functional tests are executed, and you can view
the results of the tests. To submit use the same procedure as Assignment 0.
Grading
If your program does not compile, it is considered non-functional and of extremely poor quality, meaning you will receive 0 for the solution.
The assignment will be graded based on three criteria:
Functionality: “Does it work according to specifications?”. This is determined in an automated fashion by
running your program on several inputs and ensuring that the outputs match the expected outputs. The
score is determined based on the number of tests that your program passes. So, if your program passes
t/T tests, you will receive that proportion of the marks.
Quality of Solution: “Is it a good solution?” This considers whether the approach and algorithm in your
solution is correct. This is determined by visual inspection of the code. It is possible to get a good grade
on this part even if you have bugs that cause your code to fail some of the tests.
Code Clarity: “Is it well written?” This considers whether the solution is properly formatted, well documented, and follows coding style guidelines. A single overall mark will be assigned for clarity. Please see
the Style Guide in the Assignment section of the course in Brightspace.
The following grading scheme will be used:
Task 100% 80% 60% 40% 20% 0%
Functionality
(20 marks) Equal to the number of tests passed.
Solution Quality
(20 marks)
Implemented efficiently and correctly.
Implementation is
correct. All three
types of caches
are functional.
Minor flaws with
implementation,
two of three
types of caches
are functional.
Major flaws in
implementation. One of
three types of
caches work.
An attempt
has been
made.
code
No code submitted or
does not compile
Code Clarity
(10 marks)
Indentation, formatting, naming,
comments
Code looks professional and follows all style
guidelines
Code looks good
and mostly follows style guidelines.
Code is mostly
readable and
mostly follows
some of the style
guidelines
Code is hard to
read and follows few of the
style guidelines
Code is not
legible
Assignment Testing without Submission
Testing via submission can take some time, especially if the server is loaded. You can run the tests without
submitting your code by using the provided runtests.sh script. Running the script with no arguments
will run all the tests. Running the script with the test number, i.e., 00, 01, 02, 03, … 09, will run that specific
test. Please see below for how run the script.
Get your program ready to run
If you are developing directly on the unix server,
1. SSH into the remote server and be sure you are in the cachesim directory.
2. Be sure the program is compiled by running make.
If you are using CLion
1. Run your program on the remote server as described in the CLion tutorials.
2. Open a remote host terminal via Tools → Open Remote Host Terminal
If you are using VSCode
1. Run your program on the remote server as described in VSCode tutorials.
2. Click on the Terminal pane in the bottom half of the window or via Terminal → New Terminal
Run the script
3. Run the script in the terminal by using the command:
./runtest.sh
to run all the tests, or specify the test number to run a specific test, e.g. :
./runtest.sh 07
You will see the test run in the terminal window.
implement a cache simulator. You will only need to implement the cache functionality, but the functionality will need to be general enough to support all three types of caches.
Share