World Cup 2018 Sticker Collecting

Once again a major football tournament is approaching and my son is collecting stickers of all the teams who have reached the finals. This time it’s the World Cup in Russia and the album published by Panini has 682 to collect.  I’ve blogged before about the maths behind collecting stickers so you can calculate how many packets of five distinct stickers you expect to need to finish it.  At the time, to help me visualise this, I wrote a web page with a bit of JavaScript on it to do the calculations.  This time I’ve looked over it again and increased its functionality a little, so I’ve decided it’s robust enough to advertise.  It’s on my main university site linked here.

You can choose from three types of query to run and the results get graphed alongside the result to show the trend.  The first is to help decide when to buy new stickers and when to swap, assuming that the swap isn’t free.  This was useful to me because we often swap through a website with people from abroad and the postage cost can be more than the price of several new packs of stickers.  The second is to calculate how many new stickers that you expect to get from a given number of packets (i.e. distinct stickers that you still need, obviously).  The third determines the inverse: how many packets you expect to buy to obtain a given number of new stickers.

As well as interacting with the tool through the web interface you can also use HTTP GET parameters to set all the options.  So this link graphs the number of packets we’d expect to need to finish the World Cup sticker album 2018 (after the 31 we got free with the album and assuming that we buy 50 of our choice from Panini at the end) and this link shows the number of new stickers we’d expect to get after buying 70 packets for the FIFA 365 album I mentioned at the end of my previous post.

Back to this post, and for the current album we got 31 stickers free.  The first graph here (generated with gnuplot, not my JavaScript tool) shows how many packets we expect to have to buy to reach 632 so we can buy the final 50 direct.  The second graph aims to make the maths a reality and scare me (and my wallet) at the same time by showing how much we have spent and need to spend to finish the album.  I’ll periodically update these graphs over the next few weeks and months with a running commentary as we gradually get more stickers and fill in the blank spaces in the album.

10th May

We’ve bought an album with 31 free stickers and an additional six packets at 80p each.  This has given us 60 distinct stickers and one swap, which is marginally better than expected (58), so our line basically follows the expectation at the moment.  This means we expect to need another 342 packets to reach our target of 632, without swapping!  At 80p per packet, that’s almost £274 so I’m definitely going to be swapping as much as I can when we have enough stickers to reduce the impact on my finances!

12th May

We’ve now bought another five packets at 80p each and, in a masterstroke, I’ve bought 100 packets from ebay for a total of just over £50, which we’re going to open gradually over the next week or so and not all at once in a single fit of sticker mania!  The graphs show at total of ten new packets, with 102 distinct stickers and now nine duplicates that we can swap.  We’re still holding to the expectation line (we’d expect 103 by now) but the cost has taken a positive turn due to the cheaper stickers starting to be opened.

13th May

Well, we went on a mini binge opening an extra 19 packets.  This gave us 80 extra stickers and 19 duplicates, so we’ve now got 182 stickers in the album.  The impact of the cheaper packets is more apparent and we’re still just on the positive side of the expectation (182 obtained vs 179 expected).

15th May

Another glut of stickers—28 packets over the past two days—which has given us 296 stickers in the album.  The cost benefit is clear, but what is odd is the noticeable departure from the expectation: we’d expect to have only 272 stickers after 63 packets.  The reason may be just down to chance and us being lucky with the packets we’ve got, or it may be because of buying a whole box of 100 packets, where the stickers are less random.  In fact, the study from the University of Geneva that I mentioned in my previous post1 says that the boxes generally contain 500 distinct stickers.  Unfortunately I can’t confirm this here—the box I bought was not sealed and was in poor condition, hence the low price, although it had 100 packets, all of which were immaculate, so I have no complaints.  I suspect it was a combination of a couple of boxes, meaning that although there was duplication, it was less than you’d expect from 100 random packets.  I guess we’ll have more of an idea once we’re done.

18th May

Now we’ve opened 82 packets of stickers, got 376 in the album and have 65 stickers to swap.

20th May

Not much sticker action this weekend due to a cub camp, but we’re now at 415 stickers, still performing much better than the expectation.  We’re just over 80% of the way through the box I bought and soon I’ll have to go to the shop to buy more at the standard price again.

22nd May

We’ve finished the full box of 100 packets now (111 in total) and have 448 stickers with an additional 138 duplicates.  We’re still well ahead of the expectation.  We need another 46 before we have exactly the number we want, but that’s going to make swapping away all the duplicates quite difficult, so I’m going to buy a few more packets than strictly necessary to give some extra options to swap.

26th May

After 20 further packs, I’m done buying stickers for now and will move on to swapping.  We’ve got 480 in the album and 206 duplicates from our 131 packets.  The last few packets brought us back to the expectation.  With these extra stickers we’ve finally finished one half page.

1st June

We’ve now been swapping away, our first swapped 61 stickers for the price of a first class large letter.  This has meant the line in the first graph goes vertical since we’re not buying more packets yet gaining more stickers, by far the best way to beat the expectation!  Cost is beneficial too.  It’s flattened off.  Overall we’ve swapped 123 stickers (we’ve got 603 total) for just £3.51.  A couple more swaps and we’ll be able to send off to Panini for the final 50 and complete the album.

  1. Sardy, Sylvain, and Yvan Velenik. “Paninimania: sticker rarity and cost-effective strategy.”

An Event-Triggered Programmable Prefetcher for Irregular Workloads

Over the last few years my PhD student, Sam Ainsworth, and I have been looking into data prefetching, especially for applications containing irregular memory accesses. We published a paper in ICS 2016 about a specialised hardware prefetcher that optimises breadth-first traversals on graphs in the commonly-used compressed sparse-row format, which I previously blogged about. We also published a paper at CGO on automatic software-prefetch generation, more generally for indirect memory accesses (blog post). At ASPLOS this year, we marry the two ideas together and generalise even further, creating a programmable prefetcher, using an event-driven programming model, that is capable of fetching in data for many types of memory access, complete with a compiler pass to automatically generate the prefetching code when directed by the developer.  The paper is available from my website now.

I’m going to dive right in with the details here because I’ve motivated the need for more intelligent prefetching in previous posts.  In this one I want to describe the architecture and our programming model, and give some intuition of the relative merits of the three different ways of programming it.

Continue reading…

Gonville and Caius

Today I’m joining a college.  Or, to be more precise, at a ceremony later this afternoon I’ll be admitted as a fellow to Gonville and Caius college.

Now the questions you might ask are why, and why aren’t you part of a college already—this is Cambridge after all and it’s all about the colleges, isn’t it?

Actually it’s not.  In Cambridge you don’t have to be part of a college.  All students are, both undergraduate and postgraduate.  Postdocs generally aren’t, although some colleges make provision for a small number of postdocs to be part of their community.  Lecturers, readers and professors can choose whether to be part of a college or not.
Continue reading…

Comparison of AArch64 Dynamic Binary Modification Tools

Over the past few years I’ve become increasingly interested in dynamic binary modification (DBM) tools, so much so that I supervise a PhD student who is trying to parallelise binaries using one, and am just starting work on a grant that continues and extends this work.  On Intel’s architecture, Pin is probably the most famous tool, and one that I had most experience of in the past.  (As an aside, Pin is a dynamic binary instrumentation tool, but I’m going to use modification instead of instrumentation throughout this post, since modification subsumes instrumentation and I’m more interested in optimisation than just analysis.)  However, it’s closed source and only targets Intel’s ISAs.  Another option is DynamoRIO, a tool that came out of a collaboration between HP Labs and MIT, from the original Dynamo system (a brief history is available).  This has support for multiple ISAs, not just x86 but ARM’s 32-bit architecture and, more recently, AArch64.  This is an attractive tool for researchers due to the multiple-ISA support and because it is open source.  However, there is also another open source DBM for AArch64, called Mambo, which comes from my friends at the University of Manchester.

I was interested to see the differences in performance between these two open-source systems.  DynamoRIO is far more mature, but the AArch64 support is still very much under development (and at the time of writing, I’m told that the focus is still on correctness rather than performance).  Mambo is younger and specialised to ARM’s ISAs but doesn’t yet have the range of support available in DynamoRIO for writing modification tools (plugins, in Mambo terminology, or clients for DynamoRIO).
Continue reading…