Parallel Error Detection Using Heterogeneous Cores

Soft, or transient, errors are faults that occur seemingly at random, causing bits to flip within an integrated circuit. They matter most in memory cells, and I clearly remember reading a blog post by James Hamilton several years ago in which he talked about the need for ECC on DRAM in servers and discussed some (what was then) recent academic work on the subject. ECC is a great way to protect memory: it is high performance with low power and area overheads, and it can detect multiple errors and correct some too. However, beyond the memory hierarchy, techniques for error detection and recovery are little used because it is difficult to protect logic cheaply.

One area where error detection is mandatory is in safety-critical systems, where the price of being wrong far outweighs the additional cost of providing some sort of redundancy to, at least, detect errors. The automotive sector, for example, has strict safety standards, with ASIL C and D, the highest integrity levels for a product, requiring redundancy for certification. Microprocessor implementations tend to realise this through dual-core lockstep, a system whereby each core is replicated, identical software is run on each copy (usually at a slight offset in time to avoid correlated errors) and additional logic is provided to check each computation as it occurs. Recovery generally involves resetting the system, which is acceptable when there is little or no state to get corrupted. However, dual-core lockstep is costly in terms of both silicon area and power. With a desire to use more high-performance cores in these real-time systems, the power cost becomes too much: out-of-order superscalar processors are incredibly power-inefficient.

World Cup 2018 Sticker Collecting

Once again a major football tournament is approaching and my son is collecting stickers of all the teams who have reached the finals. This time it’s the World Cup in Russia, and the album published by Panini has 682 stickers to collect. I’ve blogged before about the maths behind collecting stickers, which lets you calculate how many packets of five distinct stickers you expect to need to finish the album. At the time, to help me visualise this, I wrote a web page with a bit of JavaScript on it to do the calculations. This time I’ve looked over it again and increased its functionality a little, so I’ve decided it’s robust enough to advertise. It’s on my main university site, linked here.
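
To give a flavour of the calculation, here is a minimal sketch in Python. It is not the JavaScript from my web page. It assumes every sticker is equally likely, that the stickers within a packet are all distinct, and that you never swap or order missing stickers. The function names are made up for illustration. The first function works backwards from a full album: with j stickers missing, the number of new stickers in the next packet follows a hypergeometric distribution, which gives a recurrence for the expected number of packets still to buy. The second is the textbook coupon-collector estimate, which ignores the fact that a packet never contains duplicates.

```python
from math import comb


def expected_packets(album_size: int, packet_size: int) -> float:
    """Expected packets to fill the album, with no duplicates inside a packet."""
    total = comb(album_size, packet_size)  # number of equally likely packets
    # expected[j] = expected packets still needed when j stickers are missing
    expected = [0.0] * (album_size + 1)
    for missing in range(1, album_size + 1):
        owned = album_size - missing
        # Chance that a packet contains nothing new at all
        p_none = comb(owned, packet_size) / total
        acc = 1.0  # the packet we are about to open
        for new in range(1, min(missing, packet_size) + 1):
            # Hypergeometric chance of exactly `new` previously missing stickers
            p_new = comb(missing, new) * comb(owned, packet_size - new) / total
            acc += p_new * expected[missing - new]
        expected[missing] = acc / (1.0 - p_none)
    return expected[album_size]


def naive_estimate(album_size: int, packet_size: int) -> float:
    """Coupon-collector estimate treating every sticker as an independent draw."""
    harmonic = sum(1.0 / k for k in range(1, album_size + 1))
    return album_size * harmonic / packet_size


if __name__ == "__main__":
    print(f"distinct packets: {expected_packets(682, 5):.1f}")
    print(f"naive estimate:   {naive_estimate(682, 5):.1f}")
```

Running it for 682 stickers in packets of five prints both figures side by side, so you can see how little difference the no-duplicates-per-packet rule makes for an album this size.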

An Event-Triggered Programmable Prefetcher for Irregular Workloads

Over the last few years my PhD student, Sam Ainsworth, and I have been looking into data prefetching, especially for applications containing irregular memory accesses. We published a paper at ICS 2016 about a specialised hardware prefetcher that optimises breadth-first traversals on graphs in the commonly-used compressed sparse-row format, which I previously blogged about. We also published a paper at CGO on automatic software-prefetch generation, more generally for indirect memory accesses (blog post). At ASPLOS this year, we marry the two ideas together and generalise even further, creating a programmable prefetcher with an event-driven programming model that can fetch data for many types of memory access, complete with a compiler pass to automatically generate the prefetching code when directed by the developer. The paper is available from my website now.

I’m going to dive right in with the details here because I’ve motivated the need for more intelligent prefetching in previous posts.  In this one I want to describe the architecture and our programming model, and give some intuition of the relative merits of the three different ways of programming it.

Gonville and Caius

Today I’m joining a college.  Or, to be more precise, at a ceremony later this afternoon I’ll be admitted as a fellow of Gonville and Caius College.

Now the questions you might ask are why, and why aren’t you part of a college already—this is Cambridge after all and it’s all about the colleges, isn’t it?

Actually it’s not.  In Cambridge you don’t have to be part of a college.  All students are, both undergraduate and postgraduate.  Postdocs generally aren’t, although some colleges make provision for a small number of postdocs to be part of their community.  Lecturers, readers and professors can choose whether to be part of a college or not.