National lotteries are major contributors to many charitable causes. The British lottery, for example, donated in excess of £6 billion in the last financial year, divided among four categories: a dominant category covering health, education, environment and charitable causes, followed by sport, arts, and heritage. This funding supports national and local projects, and its value is directly driven by annual lottery ticket sales.
To estimate ticket sales for the upcoming financial year, we built a two-part simulation model of ticket sales and draw outcomes across the year. The result was a comprehensive forecast of likely ticket sales, from which we could calculate both confidence intervals on the predictions and best-case and worst-case financial scenarios.
Ticket sales are primarily driven by two factors: the day of the draw and the jackpot value, with weekend and high-jackpot draws attracting higher ticket sales. As the jackpot value depends on the outcome of previous draws, sales can only be predicted with a simulation model that accounts for the way the jackpot fluctuates when draws produce no jackpot winner.
Our approach was to break the problem down into two parts. The first part was to develop a statistical model to predict ticket sales from the day of the draw and the jackpot value, using historical data from draws over the past 10 years. The second part was to develop a simulation model to predict ticket sales and jackpot outcomes for the full year. We ran a large number of simulations and summarised the simulated values to obtain average annual ticket sales and the associated intervals.
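The two parts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log-linear form of `predicted_sales`, every coefficient, the 6-from-59 game, the rollover rule and the alternating midweek and weekend schedule are all hypothetical stand-ins, not the fitted model or the real game rules.

```python
import math
import random

# All constants below are illustrative assumptions, not fitted values.
N_COMBINATIONS = math.comb(59, 6)  # assumed 6-from-59 game
BASE_JACKPOT = 2_000_000.0         # assumed starting jackpot
ROLLOVER_SHARE = 0.25              # assumed share of revenue rolled into the jackpot
TICKET_PRICE = 2.0                 # assumed ticket price


def predicted_sales(is_weekend, jackpot):
    """Hypothetical log-linear sales model standing in for part one."""
    log_sales = 15.0 + 0.4 * is_weekend + 0.3 * math.log(jackpot / BASE_JACKPOT)
    return math.exp(log_sales)


def simulate_year(rng, draws_per_year=104):
    """Part two: simulate one year of draws and return total tickets sold."""
    jackpot = BASE_JACKPOT
    total = 0.0
    for draw in range(draws_per_year):
        is_weekend = draw % 2 == 1  # alternate midweek and weekend draws
        tickets = predicted_sales(is_weekend, jackpot)
        total += tickets
        # Bernoulli trial: was the jackpot won on this draw?
        # The win probability assumes uniformly random ticket choices.
        p_win = 1.0 - math.exp(-tickets / N_COMBINATIONS)
        if rng.random() < p_win:
            jackpot = BASE_JACKPOT  # jackpot resets after a win
        else:
            jackpot += ROLLOVER_SHARE * TICKET_PRICE * tickets  # rollover
    return total


def simulate_many(n_sims=1000, seed=1):
    """Repeat the yearly simulation to build a distribution of annual sales."""
    rng = random.Random(seed)
    return [simulate_year(rng) for _ in range(n_sims)]
```

Calling `simulate_many()` returns one simulated annual total per run; the spread of those totals comes entirely from the sequence of jackpot wins and rollovers, which is why the sales model alone cannot produce the forecast.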
The simulation used a Bernoulli trial to determine, for each draw in turn, whether the jackpot was won (see Wolfram Alpha for other interesting Bernoulli calculations). The win probability was based on the number of unique number combinations among the tickets sold, which we had to derive from the total number of tickets sold. The more unique ticket combinations that are sold (known as the coverage), the higher the likelihood of a jackpot win. However, an individual who buys several tickets very rarely chooses the same numbers twice, so we first converted ticket sales to unique ticket purchasers and then used historic data to predict the likely coverage.
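As a rough illustration of the coverage idea, the sketch below replaces the historic-data step with a closed-form approximation. It assumes a 6-from-59 game, that every purchaser buys the same number of distinct combinations (`tickets_per_purchaser`, a hypothetical parameter), and that purchasers choose uniformly at random and independently of one another. Real players do not choose uniformly, which is one reason historic data would be needed in practice.

```python
import math
import random

N_COMBINATIONS = math.comb(59, 6)  # assumed 6-from-59 game


def expected_coverage(tickets_sold, tickets_per_purchaser=2.0,
                      n_combinations=N_COMBINATIONS):
    """Expected fraction of combinations covered by the tickets sold.

    Assumes each purchaser holds `tickets_per_purchaser` distinct
    combinations chosen uniformly at random (a simplifying assumption).
    """
    purchasers = tickets_sold / tickets_per_purchaser
    # Probability that one purchaser misses a given combination.
    miss_one = 1.0 - tickets_per_purchaser / n_combinations
    return 1.0 - miss_one ** purchasers


def jackpot_won(tickets_sold, rng, tickets_per_purchaser=2.0):
    """Bernoulli trial with success probability equal to the coverage."""
    return rng.random() < expected_coverage(tickets_sold, tickets_per_purchaser)


rng = random.Random(0)
coverage = expected_coverage(20_000_000)
won = jackpot_won(20_000_000, rng)
```

Under these assumptions the coverage rises with ticket sales but never reaches 1, because some combinations are chosen by more than one purchaser.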
To evaluate the performance of our approach, we first fitted the model to the first 7 years of our 10-year data set and then compared its predictions with the sales actually observed in the most recent 3 years. This showed that the predictions were accurate, so we then refitted the models on the full data set in order to predict future sales.
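A holdout check of this kind might look like the following sketch. The record layout, the single-predictor straight-line fit and the error measure are assumptions made for illustration; the actual model also used the day of the draw and may have been scored differently.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def backtest(records, split_year):
    """Fit on years before `split_year`, score on the remaining years.

    `records` is a list of (year, log_jackpot, log_sales) tuples,
    a hypothetical layout chosen for this sketch.
    Returns the mean absolute error on the held-out years.
    """
    train = [r for r in records if r[0] < split_year]
    held_out = [r for r in records if r[0] >= split_year]
    intercept, slope = fit_line([r[1] for r in train], [r[2] for r in train])
    errors = [abs(intercept + slope * x - y) for _, x, y in held_out]
    return sum(errors) / len(errors)
```

With 10 years of records numbered 1 to 10, `backtest(records, split_year=8)` fits on years 1 to 7 and reports the error on years 8 to 10, mirroring the 7-year and 3-year split described above.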
We were able to provide the client with predicted annual ticket sales, together with intervals indicating the uncertainty associated with those predictions. The prediction intervals were especially useful because they could be used to form best-case and worst-case financial scenarios, allowing the client to predict the income it was likely to receive from ticket sales and to budget accordingly.
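One simple way to turn simulated annual totals into a point forecast and an interval is to take the mean and the empirical percentiles, as in the sketch below. The 95% level and the nearest-rank percentile rule are assumptions; the source does not state which level or method was used.

```python
import math
import statistics


def percentile(ordered, q):
    """Nearest-rank percentile of an already sorted list, with q in [0, 1]."""
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


def summarise(simulated_totals, level=0.95):
    """Summarise simulated annual totals as a mean and a central interval."""
    ordered = sorted(simulated_totals)
    tail = (1.0 - level) / 2.0
    return {
        "mean": statistics.fmean(ordered),
        "lower": percentile(ordered, tail),        # worst-case scenario
        "upper": percentile(ordered, 1.0 - tail),  # best-case scenario
    }
```

The lower and upper bounds can then be multiplied by the client's income per ticket to give the worst-case and best-case financial scenarios.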
As a by-product of our approach, we were also able to offer additional insight into the key factors driving ticket sales and to quantify their effects. The simulation also produced estimates of the expected jackpot pay-out, a somewhat unpredictable but guaranteed expense for the lottery.