The ability to analyse and interpret data is becoming an increasingly critical factor in driving commercial innovation and success. Maximising operational efficiency, understanding your customer base and monitoring quality control all require data and the ability to analyse it.
Businesses are collecting data on a scale previously unseen, but if you want to understand and use your data, how do you go about it? For a basic exploratory analysis many choose to use the statistical tools in Excel. Not only is it readily available as part of Microsoft Office, but its spreadsheet functionality means that most users find it easy to use.
However, it is important to recognise that there are a number of drawbacks to using Excel: its data analysis tools are limited and there is very little flexibility over the methods used and the output given. Fortunately there are more sophisticated tools available. The pharmaceutical industry predominantly uses SAS, and in market research SPSS is widely adopted. Speak to the majority of statisticians, analysts or data scientists, though, and you’ll generally find that R is their software of choice.
What is R and what makes it different from the other software packages out there? Well, R is designed specifically for statistical computing and graphics. It is free and open source, the latter meaning that anyone can inspect the code to see what’s going on – there’s no black box involved. R provides a flexible analysis toolkit where all of the standard statistical techniques are built in. Not only that, but there is a large R community who regularly contribute new functionality through add-on ‘packages’. In fact, finding a particular statistical model or technique that is not already available through R is a tricky task indeed!
Of course, analysing your data is only half the battle – communicating your data or results to facilitate decision making is also essential. Fortunately R has been developed with visualisation in mind and there’s a huge range of charts, graphs and plots available, including interactive charts built on the Google Chart Tools through the googleVis package.
For many users the main difference between R and other statistical toolkits such as Excel and SPSS is that it is also a programming language: you run your analyses by typing commands rather than using point-and-click drop-down menus. This can seem daunting at first. Although there is no centralised support, there are many online resources to help you, ranging from online books to dedicated webpages like Quick-R and online courses such as TryR. For a small investment of time in learning R, you’ll find that you have much more functionality and flexibility to really explore and understand your data.
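To give a flavour of what typing commands looks like in practice, here is a minimal sketch of a first exploratory analysis. It uses only base R and mtcars, a small example dataset that ships with R. The choice of dataset, variables and model is purely illustrative and not a recommendation for any particular analysis.

```r
# Load an example dataset bundled with base R
# (fuel economy and design figures for 32 cars).
data(mtcars)

# Summary statistics for fuel consumption in miles per gallon.
summary(mtcars$mpg)

# Fit a simple linear regression of fuel consumption on vehicle weight.
fit <- lm(mpg ~ wt, data = mtcars)

# Show the estimated intercept and slope.
coef(fit)

# Draw a scatter plot and overlay the fitted regression line.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")
abline(fit)
```

Each line is a command you could type at the R prompt one at a time, inspecting the result before moving on to the next, which is where much of the flexibility comes from.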
So if you have data that requires statistical analysis or visualisation, why not consider the benefits that R can bring? Don’t just take our word for it – many of the world’s most successful businesses, such as Google, Pfizer, Lloyd’s of London and Shell, use R to analyse and present their data.