saurabharora90/MESI-Cache-Simulator
CS4223: Parallel Computer Architecture
Assignment 3 (20 marks)
Deadline: Sunday 18th November 2012, 11:59pm

The goal of this assignment is to improve your understanding of the snooping cache coherence protocol. This is an individual assignment. You are free to discuss your doubts in the discussion forum.

Benchmark Traces:

You need to implement a simulator for the MESI snooping cache coherence protocol using C, C++, C#, Java, or any other programming language of your choice, so you can work on any platform you like. Unlike the SimpleScalar simulator, which was a functional simulator that could execute the benchmarks directly, in this assignment you will develop a trace-driven simulator. You will use one parallel benchmark trace for this assignment.

• Weather: Parallel application that performs weather modeling

For this benchmark, you need to download 4 traces from IVLE Workbin. For space reasons, the traces have been compressed twice: first with WinAce Archiver, and then with WinZip. Decompress the files into a directory using WinZip, and you will obtain ".exe" programs. These are WinAce self-extracting (SFX) archives, so you should execute them to extract the traces.

The trace file WeatherN.zip contains the trace for an SMP with N processors. For example, Weather4.zip contains the trace for an SMP with 4 processors, whereas Weather1.zip contains the trace for a uni-processor. After running WinZip on Weather4.zip, for example, you will get the file Weather4.exe. Executing Weather4.exe will then generate 4 trace files, one for each processor: WEATHER1.prg, WEATHER2.prg, WEATHER3.prg, and WEATHER4.prg.
The trace files (extension “.prg”) are ASCII data files. Each line has two numbers separated by a single space:

Label Value

where:

• Label is a decimal number that identifies the type of memory access requested by the processor at a given time: fetch instruction (0), read memory data (2), or write memory data (3).
• Value is a 32-bit hexadecimal number that gives the effective address of the memory word to be accessed by the processor.

The figure above illustrates an example of a trace file generated by one processor. The part of the file in the figure shows a memory trace with 6 instructions. In addition, there are 3 data reads and 1 data write. In total, those 6 instructions imply 10 memory accesses. Notice that the trace does not contain actual data, as it is not necessary for simulating cache structures.

Assumptions:

You have to make the following assumptions.

1. Memory addresses are 32 bits wide.
2. Each memory reference accesses 16 bits (2 bytes) of data.
3. We are only interested in the data cache; instruction references can be ignored.
4. Each processor has its own L1 data cache.
5. The L1 data cache is direct-mapped and uses a write-back, write-allocate policy.
6. L1 data caches are kept coherent using the snooping cache coherence protocol.
7. Initially all caches are empty.
8. The bus uses a first-come, first-served arbitration policy. Ties are broken arbitrarily.
9. The L1 data caches are backed by main memory; there is no L2 data cache.
10. Fetching a block from memory to cache takes 10 cycles. Assume that flushing a dirty cache block to memory is free because it can be put in the write buffer.
11. You may need to make additional assumptions. Clearly state those assumptions in your report.

Also assume that the caches are blocking. That is, while a cache miss is being serviced, the cache cannot process further requests from its processor. However, snooping transactions from the bus still need to be processed.
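As a rough illustration of the trace format and of assumptions 3, 5 and 7 above, the following Python sketch parses "Label Value" lines and feeds the data references to a single direct-mapped, write-back, write-allocate cache. It is only one possible starting point: the names `parse_trace_line`, `DirectMappedCache` and `miss_ratio` are hypothetical and not part of the assignment, and the sketch ignores cycle timing and coherence entirely.

```python
from dataclasses import dataclass

FETCH, READ, WRITE = 0, 2, 3  # trace labels as defined in the assignment text


def parse_trace_line(line):
    """Split one 'Label Value' line into (label, address)."""
    label_text, value_text = line.split()
    return int(label_text), int(value_text, 16)


@dataclass
class Line:
    valid: bool = False
    dirty: bool = False
    tag: int = 0


class DirectMappedCache:
    """Hypothetical uni-processor data cache: no coherence, no timing."""

    def __init__(self, cache_size, block_size):
        self.block_size = block_size
        self.num_lines = cache_size // block_size
        self.lines = [Line() for _ in range(self.num_lines)]  # all empty at start
        self.accesses = 0
        self.misses = 0

    def split(self, address):
        """Return (tag, index, offset) for a byte address."""
        offset = address % self.block_size
        block_number = address // self.block_size
        return block_number // self.num_lines, block_number % self.num_lines, offset

    def access(self, label, address):
        """Simulate one data reference; return True on a hit."""
        tag, index, _ = self.split(address)
        line = self.lines[index]
        self.accesses += 1
        hit = line.valid and line.tag == tag
        if not hit:
            self.misses += 1
            # Write-allocate: read and write misses both bring the block in.
            # A dirty victim would be written back here (free per assumption 10).
            line.valid, line.tag, line.dirty = True, tag, False
        if label == WRITE:
            line.dirty = True  # write-back: memory is updated only on eviction
        return hit


def miss_ratio(trace_lines, cache_size, block_size):
    """Data cache miss ratio for an iterable of trace lines."""
    cache = DirectMappedCache(cache_size, block_size)
    for text in trace_lines:
        if not text.strip():
            continue
        label, address = parse_trace_line(text)
        if label == FETCH:
            continue  # instruction references are ignored (assumption 3)
        cache.access(label, address)
    return cache.misses / cache.accesses if cache.accesses else 0.0
```

With a 1024-byte cache and 16-byte blocks there are 64 lines, so two addresses that differ by a multiple of 1024 bytes map to the same line and evict each other.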
In each cycle, each processor can make at most one instruction reference and at most one memory reference. As per our assumptions, you do not need to model the L1 instruction cache. The instruction references can therefore be ignored, but the cycle counter still has to be incremented.

Task:

Implement the MESI cache coherence protocol and run it on the trace files provided. Your program should take the input file name and cache configuration as arguments. The command line should be

MESI “input_file” “no_processors” “cache_size” “block_size”

where MESI is the executable file name and the input parameters are

• “input_file”: the input benchmark name (e.g., WEATHER)
• “no_processors”: number of processors
• “cache_size”: cache size in bytes
• “block_size”: block size in bytes

For example, to read the WEATHER*.prg trace files for 4 processors and execute the MESI cache coherence protocol with each processor containing a 1KB direct-mapped cache with a 16-byte block size, the command will be

MESI WEATHER 4 1024 16

Your program should generate the data cache miss ratio for each processor.

Hint: Implement the uni-processor cache without coherence first and check it for correctness.

Experiments:

Now you need to perform some experiments using your simulator and the benchmark trace to answer the following questions. You may assume default parameters of 4 processors, 16-bit word size, 64-byte block size, and a 4KB direct-mapped cache per processor.

1. What is the impact of cache size on miss ratio? Vary the cache size between 1KB and 32KB.
2. What is the impact of increasing the number of processors on miss ratio? Vary the number of processors as 1, 2, 4, and 8.

Report:

Write a report on your findings and explain them. In your report, you should describe the programming language and environment you used. You also need to explain how to compile and run your program. You can provide any additional information you believe is necessary to run the program.
Your report should give a detailed explanation of your implementation (e.g., data structures, flow charts, etc.).

Grading:

• Correct implementation of MESI protocol (includes demo): 14 marks
  o Uni-processor cache: 7 marks
  o Coherence protocol: 7 marks
• Experiments with varying parameters: 2 marks
• Report: 4 marks
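For the coherence part of the task, the state changes of a single cache block under MESI can be sketched as two small functions, one for requests from the local processor and one for transactions snooped from the bus. This is the common textbook formulation, not something the assignment text prescribes: the function names, the `shared_elsewhere` flag (standing in for a "shared" bus signal) and the use of a separate `BusUpgr` transaction for writes to a Shared block are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def processor_access(state, is_write, shared_elsewhere):
    """Return (next_state, bus_transaction) for a local read or write.

    bus_transaction is None when the access completes without using the bus.
    shared_elsewhere only matters for a read miss (state INVALID).
    """
    if is_write:
        if state in (State.MODIFIED, State.EXCLUSIVE):
            return State.MODIFIED, None  # E -> M is a silent upgrade
        if state is State.SHARED:
            return State.MODIFIED, "BusUpgr"  # assumption: invalidate-only request
        return State.MODIFIED, "BusRdX"  # write miss: fetch with intent to modify
    if state is State.INVALID:
        next_state = State.SHARED if shared_elsewhere else State.EXCLUSIVE
        return next_state, "BusRd"  # read miss
    return state, None  # read hit in M, E or S


def snoop(state, bus_transaction):
    """Return (next_state, must_flush) when another cache's request is observed."""
    if state is State.INVALID:
        return State.INVALID, False
    must_flush = state is State.MODIFIED  # only a dirty copy supplies data
    if bus_transaction == "BusRd":
        return State.SHARED, must_flush
    return State.INVALID, must_flush  # BusRdX or BusUpgr invalidates the copy
```

In a full simulator these functions would be called per block from the bus arbitration loop, with a miss counted whenever `processor_access` returns `BusRd` or `BusRdX`.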
About
Cache Simulator for CS4223 module at NUS