| @@ -1,4 +1,36 @@ | |||||
| * Question 1 - Benchmarking | |||||
| * Question 1 - Hazards | |||||
| For each of the following programs, describe every hazard by type (data or control), line number, and a | |||||
| short description (one sentence at most). | |||||
| ** program 1 | |||||
| #+begin_src asm | |||||
| addi t0, zero, 10 | |||||
| addi t1, zero, 20 | |||||
| sub t1, t1, t0 | |||||
| beq t1, zero, .L2 | |||||
| jr ra | |||||
| #+end_src | |||||
| ** program 2 | |||||
| #+begin_src asm | |||||
| addi t0, zero, 10 | |||||
| lw t0, 10(t0) | |||||
| beq t0, zero, .L3 | |||||
| jr ra | |||||
| #+end_src | |||||
| ** program 3 | |||||
| #+begin_src asm | |||||
| lw t0, 0(t0) | |||||
| lw t1, 4(t0) | |||||
| sw t0, 8(t1) | |||||
| lw t1, 12(t0) | |||||
| beq t0, t1, .L3 | |||||
| jr ra | |||||
| #+end_src | |||||
| * Question 2 - ??? | |||||
| * Question 3 - Benchmarking | |||||
| In order to gauge the performance increase from adding branch predictors it is necessary to do some testing. | In order to gauge the performance increase from adding branch predictors it is necessary to do some testing. | ||||
| Rather than writing a test from scratch it is better to use the tester already in use in the test harness. | Rather than writing a test from scratch it is better to use the tester already in use in the test harness. | ||||
| When running a program the VM outputs a log of all events, including which branches have been taken and which | When running a program the VM outputs a log of all events, including which branches have been taken and which | ||||
| @@ -11,7 +43,7 @@ | |||||
| sealed trait BranchEvent | sealed trait BranchEvent | ||||
| case class Taken(addr: Int) extends BranchEvent | case class Taken(addr: Int) extends BranchEvent | ||||
| case class NotTaken(addr: Int) extends BranchEvent | case class NotTaken(addr: Int) extends BranchEvent | ||||
| def profile(events: List[BranchEvent]): Int = ??? | def profile(events: List[BranchEvent]): Int = ??? | ||||
| #+END_SRC | #+END_SRC | ||||
| @@ -22,73 +54,74 @@ | |||||
| #+BEGIN_SRC scala | #+BEGIN_SRC scala | ||||
| def OneBitInfiniteSlots(events: List[BranchEvent]): Int = { | def OneBitInfiniteSlots(events: List[BranchEvent]): Int = { | ||||
| // Helper inspects the next element of the event list. If the event is a mispredict the prediction table is updated | // Helper inspects the next element of the event list. If the event is a mispredict the prediction table is updated | ||||
| // to reflect this. | // to reflect this. | ||||
| // As long as there are remaining events the helper calls itself recursively on the remainder | // As long as there are remaining events the helper calls itself recursively on the remainder | ||||
| def helper(events: List[BranchEvent], predictionTable: Map[Int, Boolean]): Int = { | def helper(events: List[BranchEvent], predictionTable: Map[Int, Boolean]): Int = { | ||||
| events match { | events match { | ||||
| // Scala syntax for matching a list with a head element of some type and a tail | |||||
| // `case h :: t =>` | |||||
| // Scala syntax for matching a list with a head element of some type and a tail | |||||
| // `case h :: t =>` | |||||
| // means we want to match a list with at least a head and a tail (tail can be Nil, so we | // means we want to match a list with at least a head and a tail (tail can be Nil, so we | ||||
| // essentially want to match a list with at least one element) | // essentially want to match a list with at least one element) | ||||
| // h is the first element of the list, t is the remainder (which can be Nil, aka empty) | // h is the first element of the list, t is the remainder (which can be Nil, aka empty) | ||||
| // `case Constructor(arg1, arg2) :: t => ` | |||||
| // `case Constructor(arg1, arg2) :: t => ` | |||||
| // means we want to match a list whose first element is of type Constructor, giving us access to its internal | // means we want to match a list whose first element is of type Constructor, giving us access to its internal | ||||
| // values. | // values. | ||||
| // `case Constructor(arg1, arg2) :: t => if(p(arg1, arg2))` | |||||
| // `case Constructor(arg1, arg2) :: t => if(p(arg1, arg2))` | |||||
| // means we want to match a list whose first element is of type Constructor while satisfying some predicate p, | // means we want to match a list whose first element is of type Constructor while satisfying some predicate p, | ||||
| // called an if guard. | // called an if guard. | ||||
| case Taken(addr) :: t if( predictionTable(addr)) => helper(t, predictionTable) | |||||
| case Taken(addr) :: t if(!predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, true)) | |||||
| case NotTaken(addr) :: t if( predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, false)) | |||||
| case NotTaken(addr) :: t if(!predictionTable(addr)) => helper(t, predictionTable) | |||||
| case _ => 0 | |||||
| case Taken(addr) :: t if( predictionTable(addr)) => helper(t, predictionTable) | |||||
| case Taken(addr) :: t if(!predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, true)) | |||||
| case NotTaken(addr) :: t if( predictionTable(addr)) => 1 + helper(t, predictionTable.updated(addr, false)) | |||||
| case NotTaken(addr) :: t if(!predictionTable(addr)) => helper(t, predictionTable) | |||||
| case _ => 0 | |||||
| } | } | ||||
| } | } | ||||
| // Initially every possible branch is set to false since the initial state of the predictor is to assume branch not taken | // Initially every possible branch is set to false since the initial state of the predictor is to assume branch not taken | ||||
| def initState = events.map{ | def initState = events.map{ | ||||
| case Taken(addr) => (addr, false) | case Taken(addr) => (addr, false) | ||||
| case NotTaken(addr) => (addr, false) | case NotTaken(addr) => (addr, false) | ||||
| }.toMap | }.toMap | ||||
| helper(events, initState) | helper(events, initState) | ||||
| } | } | ||||
| #+END_SRC | #+END_SRC | ||||
| ** TODO Branch predictor is underspecified, needs to be cleaned up | |||||
| ** Your task | ** Your task | ||||
| Your job is to implement a test that checks how many misses occur for a 2 bit branch predictor with 4 slots. | Your job is to implement a test that checks how many misses occur for a 2 bit branch predictor with 4 slots. | ||||
| For this task it is probably smart to use something other than a ~Map[Int, Boolean]~. | For this task it is probably smart to use something other than a ~Map[Int, Boolean]~. | ||||
| The skeleton code is located in ~testRunner.scala~ and can be run using ~testOnly FiveStage.ProfileTest~. | The skeleton code is located in ~testRunner.scala~ and can be run using ~testOnly FiveStage.ProfileTest~. | ||||
| If you do so now you will see that the unrealistic prediction model yields 1449 misses. | If you do so now you will see that the unrealistic prediction model yields 1449 misses. | ||||
| With a 2 bit 4 slot scheme, how many misses will you incur? | |||||
| With a 2 bit 4 slot scheme, how many misses will you incur? | |||||
| Answer with a number. | Answer with a number. | ||||
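The text does not spell out the 2 bit scheme. A common reading is a saturating counter per slot, sketched below under that assumption. The object name ~TwoBitCounterSketch~, the state encoding (0 and 1 predict not taken, 2 and 3 predict taken) and both helper functions are hypothetical and not part of the skeleton code. How an address is mapped to one of the 4 slots, and which state each slot starts in, is deliberately left out.

#+BEGIN_SRC scala
object TwoBitCounterSketch {
  // Assumed encoding: 0 = strongly not taken, 1 = weakly not taken,
  //                   2 = weakly taken,       3 = strongly taken
  def predictTaken(state: Int): Boolean = state >= 2

  // Move one step towards the observed outcome, saturating at 0 and 3
  def update(state: Int, taken: Boolean): Int =
    if (taken) math.min(state + 1, 3) else math.max(state - 1, 0)
}
#+END_SRC

With this encoding a single deviating outcome does not flip a strongly held prediction, which is the usual motivation for using two bits rather than one.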
| * Question 2 - Cache profiling | |||||
| * Question 4 - Cache profiling | |||||
| Unlike our design which has a very limited memory pool, real designs have access to vast amounts of memory, offset | Unlike our design which has a very limited memory pool, real designs have access to vast amounts of memory, offset | ||||
| by a steep cost in access latency. | by a steep cost in access latency. | ||||
| To mitigate this, a modern processor features several caches, where even the smallest, fastest cache has more memory than | To mitigate this, a modern processor features several caches, where even the smallest, fastest cache has more memory than | ||||
| your entire design. | your entire design. | ||||
| In order to investigate how caches can alter performance it is therefore necessary to make some rather | In order to investigate how caches can alter performance it is therefore necessary to make some rather | ||||
| unrealistic assumptions to see how different cache schemes impact performance. | unrealistic assumptions to see how different cache schemes impact performance. | ||||
| We will therefore assume the following: | |||||
| We will therefore assume the following: | |||||
| + Reads from main memory take 5 cycles | + Reads from main memory take 5 cycles | ||||
| + The cache has a total storage of 32 words (1024 bits) | + The cache has a total storage of 32 words (1024 bits) | ||||
| + Cache reads work as they do now (i.e. no additional latency) | + Cache reads work as they do now (i.e. no additional latency) | ||||
| For this exercise you will write a program that parses a log of memory events, similar to the previous task. | For this exercise you will write a program that parses a log of memory events, similar to the previous task. | ||||
| #+BEGIN_SRC scala | #+BEGIN_SRC scala | ||||
| sealed trait MemoryEvent | sealed trait MemoryEvent | ||||
| case class Write(addr: Int) extends MemoryEvent | case class Write(addr: Int) extends MemoryEvent | ||||
| case class Read(addr: Int) extends MemoryEvent | case class Read(addr: Int) extends MemoryEvent | ||||
| def profile(events: List[MemoryEvent]): Int = ??? | def profile(events: List[MemoryEvent]): Int = ??? | ||||
| #+END_SRC | #+END_SRC | ||||
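As an illustration of how such a log could be folded into a miss count, here is a minimal sketch of one possible cache scheme. Everything beyond the three assumptions listed above is an assumption of this sketch: the cache is direct mapped with 32 one-word lines, ~addr~ is a word-aligned byte address, and writes allocate a line without counting as a miss. The names ~CacheSketch~, ~slotOf~, ~tagOf~ and ~readMisses~ are hypothetical, and the event types are repeated only to keep the block self-contained.

#+BEGIN_SRC scala
sealed trait MemoryEvent
case class Write(addr: Int) extends MemoryEvent
case class Read(addr: Int)  extends MemoryEvent

object CacheSketch {
  val Slots       = 32 // 32 one-word lines (assumed organisation)
  val MissPenalty = 5  // cycles per read from main memory

  // Assumption: addr is a byte address, so the word index is addr / 4
  def slotOf(addr: Int): Int = (addr >>> 2) % Slots
  def tagOf(addr: Int): Int  = (addr >>> 2) / Slots

  // Counts reads that are not served by the cache
  def readMisses(events: List[MemoryEvent]): Int = {
    val start = (Map.empty[Int, Int], 0) // (slot -> tag, misses so far)
    val (_, total) = events.foldLeft(start) {
      case ((cache, misses), Read(addr)) =>
        if (cache.get(slotOf(addr)).contains(tagOf(addr))) (cache, misses)
        else (cache.updated(slotOf(addr), tagOf(addr)), misses + 1)
      case ((cache, misses), Write(addr)) =>
        // Assumption: a write places the word in the cache (write-allocate)
        (cache.updated(slotOf(addr), tagOf(addr)), misses)
    }
    total
  }

  // Extra cycles spent waiting on main memory under the assumptions above
  def stallCycles(events: List[MemoryEvent]): Int =
    readMisses(events) * MissPenalty
}
#+END_SRC

Other schemes (set associative, multi-word lines, different write policies) change only the bookkeeping inside the fold; the overall shape of the profiler stays the same.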