saurabharora90/MESI-Cache-Simulator


CS4223: Parallel Computer Architecture
Assignment 3 (20 marks)
Deadline: Sunday 18th November 2012, 11:59pm

The goal of this assignment is to improve your understanding of snooping cache coherence protocols. This is an individual assignment. You are free to raise your questions in the discussion forum.

Benchmark Traces:

You need to implement a simulator for the MESI snooping cache coherence protocol using C, C++, C#, Java, or any other programming language of your choice. You can therefore program on any platform of your choice.

Unlike the SimpleScalar simulator, which was a functional simulator that could execute the benchmarks directly, in this assignment you will develop a trace-driven simulator.

You will use one parallel benchmark trace for this assignment.

•	Weather: Parallel application that performs weather modeling

For this benchmark, you need to download 4 traces from the IVLE Workbin. For space reasons, the traces have been compressed twice: first with WinAce Archiver, and then with WinZip. Decompress the files with WinZip into some directory to obtain ".exe" programs. These are WinAce self-extracting (SFX) archives, so run them to extract the trace files.

The trace file WeatherN.zip contains the trace for an SMP with N processors. For example, Weather4.zip contains the trace for an SMP with 4 processors, whereas Weather1.zip contains the trace for a uni-processor. After running WinZip on Weather4.zip, for example, you will get the file Weather4.exe. Executing Weather4.exe will then generate 4 trace files, one for each processor: WEATHER1.prg, WEATHER2.prg, WEATHER3.prg, and WEATHER4.prg.

The trace files are ASCII data files with the extension “.prg”. Each line contains two numbers separated by a single space:
Label Value
where:
• Label is a decimal number that identifies the type of memory access requested by the processor: instruction fetch (0), data read (2), or data write (3).
• Value is a 32-bit hexadecimal number that gives the effective address of the memory word accessed by the processor.
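
The line format described above can be parsed with a few lines of code. The following is a minimal sketch in Python; the function name parse_trace_line and the sample line "2 817b08" are illustrative assumptions, not taken from the actual trace files.

```python
# Label values defined by the trace format.
FETCH, READ, WRITE = 0, 2, 3


def parse_trace_line(line):
    """Split one trace line into (label, address).

    The label is decimal; the address is hexadecimal.
    """
    label_text, value_text = line.split()
    label = int(label_text)
    if label not in (FETCH, READ, WRITE):
        raise ValueError("unknown access type: %r" % label_text)
    address = int(value_text, 16)
    return label, address


# Hypothetical sample line: a data read at address 0x817b08.
label, address = parse_trace_line("2 817b08")
print(label, hex(address))
```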

[Figure: excerpt of a trace file generated by one processor]

The figure above shows an excerpt of a trace file generated by one processor. The excerpt contains 6 instruction fetches, 3 data reads, and 1 data write, for a total of 10 memory accesses.

Note that the trace does not contain the actual data, as it is not needed to simulate cache structures.
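
The counts quoted for the figure can be reproduced by tallying the labels of a trace excerpt. This is only a sketch: the figure itself is not reproduced here, so the label sequence below is a made-up ordering that merely has the same totals (6 fetches, 3 reads, 1 write).

```python
from collections import Counter

# Names for the label values defined by the trace format.
ACCESS_NAMES = {0: "fetch", 2: "read", 3: "write"}


def summarize(labels):
    """Count how many accesses of each type appear in a label sequence."""
    return dict(Counter(ACCESS_NAMES[label] for label in labels))


# Hypothetical ordering with the same totals as the figure.
example_labels = [0, 2, 0, 0, 2, 0, 3, 0, 2, 0]
summary = summarize(example_labels)
print(summary, "total:", sum(summary.values()))
```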

Assumptions:

You have to make the following assumptions.

1. Memory addresses are 32 bits wide.
2. Each memory reference accesses 16 bits (2 bytes) of data.
3. We are only interested in the data cache; instruction references can be ignored.
4. Each processor has its own L1 data cache.
5. The L1 data cache is direct-mapped and uses a write-back, write-allocate policy.
6. The L1 data caches are kept coherent using a snooping cache coherence protocol.
7. Initially, all caches are empty.
8. The bus uses a first-come, first-served arbitration policy. Ties are broken arbitrarily.
9. The L1 data caches are backed by main memory; there is no L2 data cache.
10. Fetching a block from memory into the cache takes 10 cycles. Assume that flushing a dirty cache block to memory is free because the block can be placed in the write buffer.
11. You may need to make additional assumptions. Clearly state those assumptions in your report.
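
For a direct-mapped cache, each 32-bit address splits into a tag, a line index, and a byte offset within the block. The sketch below shows one way to compute them, assuming the cache size and block size are given in bytes and the cache size is a multiple of the block size; the function name split_address is illustrative.

```python
def split_address(address, cache_size, block_size):
    """Return (tag, index, offset) for a direct-mapped cache.

    cache_size and block_size are in bytes.
    """
    num_lines = cache_size // block_size   # one block per cache line
    offset = address % block_size          # byte position inside the block
    block_number = address // block_size   # which memory block is referenced
    index = block_number % num_lines       # cache line the block maps to
    tag = block_number // num_lines        # distinguishes blocks sharing a line
    return tag, index, offset


# 1 KB cache with 16-byte blocks gives 64 lines.
print(split_address(0x12345678, 1024, 16))
```

Integer division and modulo are used instead of bit masks so the sketch does not depend on the sizes being powers of two, although in practice they usually are.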

Also assume that the caches are blocking. That is, while a cache miss is being serviced, the cache cannot process further requests from its processor. However, it must still process snooping transactions from the bus.

In each cycle, each processor can make at most one instruction reference and at most one memory reference. As per our assumptions, you do not need to model the L1 instruction cache, so instruction references can simply be ignored, but the cycle counter must still be incremented.
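
First-come, first-served bus arbitration can be modelled with a simple queue. The sketch below is one possible reading, not the required design: it assumes that a granted request occupies the bus for the full 10-cycle block fetch, which the assignment does not state explicitly. The class name Bus and its methods are illustrative.

```python
from collections import deque

FETCH_CYCLES = 10  # cycles to fetch a block from memory (assumption 10)


class Bus:
    """First-come, first-served bus serving one transaction at a time.

    Assumption: the bus stays busy for the whole block fetch.
    """

    def __init__(self):
        self.pending = deque()  # processor ids in arrival order
        self.busy_until = 0     # first cycle at which the bus is free again

    def request(self, cpu_id):
        """Queue a bus request from a processor whose cache missed."""
        self.pending.append(cpu_id)

    def grant(self, cycle):
        """Start the oldest pending request if the bus is free.

        Returns (cpu_id, finish_cycle), or None if nothing can start.
        """
        if cycle < self.busy_until or not self.pending:
            return None
        cpu_id = self.pending.popleft()
        self.busy_until = cycle + FETCH_CYCLES
        return cpu_id, self.busy_until


bus = Bus()
bus.request(2)
bus.request(0)
print(bus.grant(0))   # processor 2 asked first, so it is served first
print(bus.grant(5))   # bus still busy
print(bus.grant(10))  # processor 0 is served next
```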






Task:

Implement the MESI cache coherence protocol and run it on the trace files provided. 
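
The MESI state transitions can be written as two small functions, one for accesses from the local processor and one for transactions snooped from the bus. This is a sketch of the textbook protocol under one assumption: a write to a Shared block issues BusRdX, whereas some variants use a separate upgrade transaction. The names State, on_processor_access, and on_snoop are illustrative.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def on_processor_access(state, is_write, others_have_copy):
    """Return (next_state, bus_transaction) for a local read or write.

    bus_transaction is "BusRd", "BusRdX", or None when no bus access is needed.
    """
    if is_write:
        if state in (State.MODIFIED, State.EXCLUSIVE):
            return State.MODIFIED, None      # silent upgrade, no bus traffic
        return State.MODIFIED, "BusRdX"      # from S or I: invalidate other copies
    if state is State.INVALID:
        next_state = State.SHARED if others_have_copy else State.EXCLUSIVE
        return next_state, "BusRd"
    return state, None                       # read hit in M, E, or S


def on_snoop(state, transaction):
    """Return (next_state, must_flush) when another cache's transaction is seen."""
    if state is State.INVALID:
        return State.INVALID, False
    must_flush = state is State.MODIFIED     # only a dirty copy has to be written back
    if transaction == "BusRd":
        return State.SHARED, must_flush
    if transaction == "BusRdX":
        return State.INVALID, must_flush
    raise ValueError("unknown bus transaction: %r" % transaction)


print(on_processor_access(State.INVALID, False, others_have_copy=False))
print(on_snoop(State.MODIFIED, "BusRd"))
```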

Your program should take the input file name and cache configurations as arguments. The command line should be

MESI “input_file” “no_processors” “cache_size” “block_size” 
where MESI is the executable file name and the input parameters are
•	“input_file”: the input benchmark name (e.g., WEATHER)
•	“no_processors”: number of processors
•	“cache_size”: cache size in bytes
•	“block_size”: block size in bytes

For example, to read the WEATHER*.prg trace files for 4 processors and run the MESI cache coherence protocol with each processor having a 1 KB direct-mapped cache and a 16-byte block size, the command will be

MESI WEATHER 4 1024 16 
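
If the simulator is written in Python, the four positional arguments can be read with the standard argparse module. The sketch below also derives the per-processor trace file names from the benchmark name, following the WEATHER1.prg ... WEATHER4.prg naming described earlier; the helper names parse_args and trace_file_names are illustrative.

```python
import argparse


def parse_args(argv):
    """Parse: MESI input_file no_processors cache_size block_size."""
    parser = argparse.ArgumentParser(prog="MESI")
    parser.add_argument("input_file", help="benchmark name, e.g. WEATHER")
    parser.add_argument("no_processors", type=int, help="number of processors")
    parser.add_argument("cache_size", type=int, help="cache size in bytes")
    parser.add_argument("block_size", type=int, help="block size in bytes")
    return parser.parse_args(argv)


def trace_file_names(benchmark, num_processors):
    """Return one trace file name per processor, numbered from 1."""
    return ["%s%d.prg" % (benchmark, cpu) for cpu in range(1, num_processors + 1)]


args = parse_args(["WEATHER", "4", "1024", "16"])
print(trace_file_names(args.input_file, args.no_processors))
```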

Your program should generate the data cache miss ratio for each processor.

Hint: Implement the uni-processor cache without coherence first and check for correctness. 
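
As a starting point for the hint above, here is a minimal sketch of a uni-processor direct-mapped cache that reports the data miss ratio. It ignores timing and coherence, treats read and write misses alike (write-allocate), and skips instruction fetches. The function name simulate_uniprocessor and the sample accesses are illustrative assumptions.

```python
def simulate_uniprocessor(accesses, cache_size, block_size):
    """Return the data cache miss ratio for a list of (label, address) pairs.

    Labels follow the trace format: 0 = fetch, 2 = read, 3 = write.
    """
    num_lines = cache_size // block_size
    tags = [None] * num_lines      # None marks an empty line (caches start empty)
    data_refs = 0
    misses = 0
    for label, address in accesses:
        if label == 0:
            continue               # instruction references are not cached
        data_refs += 1
        block_number = address // block_size
        index = block_number % num_lines
        tag = block_number // num_lines
        if tags[index] != tag:
            misses += 1
            tags[index] = tag      # write-allocate: fill the line on any miss
    return misses / data_refs if data_refs else 0.0


# Hypothetical accesses: 0x000 and 0x400 conflict in a 1 KB cache with 16-byte blocks.
sample = [(0, 0x100), (2, 0x000), (2, 0x004), (3, 0x400), (2, 0x000)]
print(simulate_uniprocessor(sample, 1024, 16))
```

In the sample, the first read misses, the second hits in the same block, the write evicts that block, and the final read misses again, so 3 of the 4 data references miss.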

Experiments:

Now you need to perform some experiments using your simulator and the benchmark trace to answer the following questions. You may assume the following default parameters: 4 processors, a 16-bit word size, a 64-byte block size, and a 4KB direct-mapped cache per processor.

1. What is the impact of cache size on miss ratio? Vary the cache size between 1KB and 32KB.

2. What is the impact of increasing the number of processors on miss ratio? Vary the number of processors as 1, 2, 4, and 8.
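
The two experiments can be scripted by generating one command line per configuration. The sketch below assumes the cache size is varied in powers of two between 1KB and 32KB (the assignment only gives the range) and uses the default parameters for everything that is not being varied; the helper names are illustrative.

```python
DEFAULT_PROCESSORS = 4
DEFAULT_CACHE_SIZE = 4 * 1024   # bytes
DEFAULT_BLOCK_SIZE = 64         # bytes


def cache_size_sweep(min_bytes=1024, max_bytes=32 * 1024):
    """Return cache sizes doubling from min_bytes up to max_bytes (assumed steps)."""
    sizes = []
    size = min_bytes
    while size <= max_bytes:
        sizes.append(size)
        size *= 2
    return sizes


def experiment_commands(benchmark="WEATHER"):
    """Build the command lines for both experiments."""
    commands = []
    for size in cache_size_sweep():                      # experiment 1
        commands.append("MESI %s %d %d %d" % (
            benchmark, DEFAULT_PROCESSORS, size, DEFAULT_BLOCK_SIZE))
    for processors in (1, 2, 4, 8):                      # experiment 2
        commands.append("MESI %s %d %d %d" % (
            benchmark, processors, DEFAULT_CACHE_SIZE, DEFAULT_BLOCK_SIZE))
    return commands


for command in experiment_commands():
    print(command)
```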


Report:

Write a report on your findings and explain them.

In your report, describe the programming language and environment you used, and explain how to compile and run your program. You may include any additional information you believe is necessary to run the program.

Your report should include a detailed explanation of your implementation (e.g., data structures, flow charts, etc.).

Grading:

•	Correct implementation of MESI protocol (includes demo): 14 marks
	o	Uni-processor cache: 7 marks
	o	Coherence protocol: 7 marks
•	Experiments with varying parameters: 2 marks
•	Report: 4 marks

About

Cache Simulator for CS4223 module at NUS
