| @@ -1,4 +1,36 @@ | |||
| * Question 1 - Hazards | |||
| For each of the following programs, describe every hazard by type (data or control), line number, and a | |||
| short (at most one sentence) description. | |||
| ** program 1 | |||
| #+begin_src asm | |||
| addi t0, zero, 10 | |||
| addi t1, zero, 20 | |||
| sub t1, t1, t0 | |||
| beq t1, zero, .L2 | |||
| jr ra | |||
| #+end_src | |||
| ** program 2 | |||
| #+begin_src asm | |||
| addi t0, zero, 10 | |||
| lw t0, 10(t0) | |||
| beq t0, zero, .L3 | |||
| jr ra | |||
| #+end_src | |||
| ** program 3 | |||
| #+begin_src asm | |||
| lw t0, 0(t0) | |||
| lw t1, 4(t0) | |||
| sw t0, 8(t1) | |||
| lw t1, 12(t0) | |||
| beq t0, t1, .L3 | |||
| jr ra | |||
| #+end_src | |||
| * Question 2 - ??? | |||
| * Question 3 - Benchmarking | |||
| In order to gauge the performance increase from adding branch predictors it is necessary to do some testing. | |||
| Rather than writing a test from scratch it is better to use the tester already in use in the test harness. | |||
| When running a program the VM outputs a log of all events, including which branches have been taken and which | |||
| @@ -11,7 +43,7 @@ | |||
| sealed trait BranchEvent | |||
| case class Taken(addr: Int) extends BranchEvent | |||
| case class NotTaken(addr: Int) extends BranchEvent | |||
| def profile(events: List[BranchEvent]): Int = ??? | |||
| #+END_SRC | |||
| @@ -22,73 +54,74 @@ | |||
| #+BEGIN_SRC scala | |||
| def OneBitInfiniteSlots(events: List[BranchEvent]): Int = { | |||
| // Helper inspects the next element of the event list. If the event is a mispredict the prediction table is updated | |||
| // to reflect this. | |||
| // As long as there are remaining events the helper calls itself recursively on the remainder | |||
| def helper(events: List[BranchEvent], predictionTable: Map[Int, Boolean]): Int = { | |||
| events match { | |||
| // Scala syntax for matching a list with a head element of some type and a tail | |||
| // `case h :: t =>` | |||
| // means we want to match a list with at least a head and a tail (tail can be Nil, so we | |||
| // essentially want to match a list with at least one element) | |||
| // h is the first element of the list, t is the remainder (which can be Nil, aka empty) | |||
| // `case Constructor(arg1, arg2) :: t => ` | |||
| // means we want to match a list whose first element is of type Constructor, giving us access to its internal | |||
| // values. | |||
| // `case Constructor(arg1, arg2) :: t if(p(arg1, arg2)) =>` | |||
| // means we want to match a list whose first element is of type Constructor while satisfying some predicate p, | |||
| // called an if guard. | |||
| // Correct predictions cost nothing; a mispredict costs one miss and flips the stored prediction | |||
| case Taken(addr) :: t if( predictionTable(addr)) => helper(t, predictionTable) | |||
| case Taken(addr) :: t if(!predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, true)) | |||
| case NotTaken(addr) :: t if( predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, false)) | |||
| case NotTaken(addr) :: t if(!predictionTable(addr)) => helper(t, predictionTable) | |||
| case _ => 0 | |||
| } | |||
| } | |||
| // Initially every possible branch is set to false since the initial state of the predictor is to assume branch not taken | |||
| def initState = events.map{ | |||
| case Taken(addr) => (addr, false) | |||
| case NotTaken(addr) => (addr, false) | |||
| }.toMap | |||
| helper(events, initState) | |||
| } | |||
| #+END_SRC | |||
| ** TODO Branch predictor is underspecified, needs to be cleaned up | |||
| ** Your task | |||
| Your job is to implement a test that checks how many misses occur for a 2 bit branch predictor with 4 slots. | |||
| For this task it is probably smart to use something other than a ~Map[Int, Boolean]~. | |||
| The skeleton code is located in ~testRunner.scala~ and can be run using ~testOnly FiveStage.ProfileTest~. | |||
| If you do so now you will see that the unrealistic prediction model yields 1449 misses. | |||
| With a 2 bit 4 slot scheme, how many misses will you incur? | |||
| Answer with a number. | |||
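| A 2 bit predictor is commonly modeled as one saturating counter per slot. The sketch below shows only the | |||
| counter update rule and one possible way of picking a slot; it is not the code in ~testRunner.scala~. | |||
| The names ~TwoBitSketch~, ~predictsTaken~, ~update~ and ~slotOf~ are hypothetical, and both the indexing | |||
| scheme (dropping the two lowest address bits) and the counter encoding are assumptions, since the | |||
| predictor is not fully specified here. | |||
| #+BEGIN_SRC scala | |||
| object TwoBitSketch { | |||
| // Assumed encoding: 0 = strongly not taken, 1 = weakly not taken, | |||
| // 2 = weakly taken, 3 = strongly taken | |||
| type Counter = Int | |||
| // Predict taken when the counter is in one of the two upper states | |||
| def predictsTaken(c: Counter): Boolean = c >= 2 | |||
| // Move one step towards the observed outcome, saturating at 0 and 3 | |||
| def update(c: Counter, taken: Boolean): Counter = | |||
| if (taken) math.min(c + 1, 3) else math.max(c - 1, 0) | |||
| // Assumption: instructions are 4 bytes wide, so the two lowest address bits | |||
| // are dropped before selecting one of `slots` entries | |||
| def slotOf(addr: Int, slots: Int = 4): Int = (addr >>> 2) % slots | |||
| } | |||
| #+END_SRC | |||
| Under this assumed indexing, two branches whose addresses differ by 16 bytes share a slot and therefore | |||
| share a counter, which is the main difference from the infinite-slot model above. | |||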
| * Question 4 - Cache profiling | |||
| Unlike our design, which has a very limited memory pool, real designs have access to vast amounts of memory, offset | |||
| by a steep cost in access latency. | |||
| To mitigate this, a modern processor features several caches, where even the smallest, fastest cache has more memory than | |||
| your entire design. | |||
| In order to investigate how caches can alter performance it is therefore necessary to make some rather | |||
| unrealistic assumptions to see how different cache schemes impact performance. | |||
| We will therefore assume the following: | |||
| + Reads from main memory take 5 cycles | |||
| + The cache has a total storage of 32 words (1024 bits) | |||
| + Cache reads work as they do now (i.e. no additional latency) | |||
| For this exercise you will write a program that parses a log of memory events, similar to the previous task: | |||
| #+BEGIN_SRC scala | |||
| sealed trait MemoryEvent | |||
| case class Write(addr: Int) extends MemoryEvent | |||
| case class Read(addr: Int) extends MemoryEvent | |||
| def profile(events: List[MemoryEvent]): Int = ??? | |||
| #+END_SRC | |||