Counting on one’s fingers is no longer enough, it seems. Luckily, there are plenty of software packages to do the counting for us, and this post will take you through everything from the most basic options to the captivating floating globes of Hans Rosling’s Trendanalyzer. Disciplines tend to have their own favourites: in general, economists use Stata, social scientists SPSS, statisticians SAS, and engineers MATLAB, though which one you end up with is sometimes just the academic equivalent of an accident of birth.
If you’re not a SAS-evangelist, there’s a lot to be said for having skills in, or at least basic familiarity with, more than one of these and for adding some specialised options like HLM or Mplus. Institutions, organisations, and teams often develop their own systems and this means that showing willingness to learn new programs should stand to you in an interview. At the same time, being able to add new expertise to a group is also a potential advantage.
The most interesting recent development in stats software is the growth of R. It is open-source software which means that it’s free and that it’s constantly evolving, thanks to a team of developers and the community of R users who are adding packages to do just about anything you can imagine with numbers and graphics. R is a programming language so familiarity with the syntax of other packages makes the transition from using an analysis program to R reasonably smooth. Presumably much to the distaste of commercial software developers, R is expanding both its popularity and its capabilities and looks set to dominate the field. Now is probably a good time to start using R.
SAS, SPSS, and Stata all do things like bivariate statistics, regression, and factor analysis. While each has its advantages, they do essentially the same things, though as preferences tend to be quite committed, it might be risky to say that in certain company. Each has a syntax interface and one risks eye-rolling at the very mention of drop-down menus. Hopefully you already have a bank of scripts for common tests and other analyses.
There are some packages that do just one thing, but do it really well. It’s usually something that none of the omnibus packages is all that good at. For example, HLM (Hierarchical Linear Modelling) does hierarchical linear modelling really well. Similarly, Mplus was developed primarily for latent variable modelling, and does it really well. However, both of these are fussy: they do not cope well with missing values, and parts of their interfaces are less than intuitive. As both HLM and Mplus cost a lot, it’s probably not worth making the investment as an individual, but having some knowledge of them could be an advantage if they’re already being used in the organisation you hope to work for.
It’s also worth mentioning the basic options, as they can actually be more efficient for some basic forms of analysis and visualisation. Spreadsheet programs, like Calc in Open Office and Microsoft Office Excel, are underestimated in their capacity to conduct statistical analyses, like calculating frequencies and descriptives, generating normal, t, and F distributions, standardising scores, and running tests. Spreadsheets also read .csv files, which are used for moving data between programs. Because a lot of people are never actually taught how to use spreadsheets, it’s worth spending some time exploring what they can do. Casually dropping references to pivot tables and VLOOKUP into an interview is only likely to help. The other standard in office productivity suites is database software like Open Office Base and Microsoft Access. Databases and Customer Relationship Management platforms use SQL, and a working knowledge of SQL programming is also likely to be an asset.
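To make the spreadsheet-style basics concrete, here is a minimal sketch in Python of three of the tasks mentioned above: counting frequencies, calculating descriptives, and standardising scores from .csv data. It uses only the standard library. The dataset, the column names (`respondent`, `region`, `score`), and the helper function names are all made up for illustration; treat it as one possible way to do it, not a recommended workflow.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical survey extract; in practice this would be a .csv file on disk.
CSV_TEXT = """respondent,region,score
1,North,12
2,South,15
3,North,9
4,East,15
5,South,14
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def frequencies(rows, column):
    """Count how often each value occurs in the given column."""
    return Counter(row[column] for row in rows)


def descriptives(values):
    """Return n, mean, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


def standardise(values):
    """Convert raw scores to z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


rows = load_rows(CSV_TEXT)
scores = [float(row["score"]) for row in rows]

region_counts = frequencies(rows, "region")
summary = descriptives(scores)
z_scores = standardise(scores)

print(dict(region_counts))
print(summary)
print([round(z, 2) for z in z_scores])
```

The frequency count corresponds roughly to a one-variable pivot table, and the z-score step to subtracting a column’s mean and dividing by its standard deviation in a spreadsheet.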
Finally, Hans Rosling’s Trendanalyzer is visualisation software best suited to large, international, longitudinal datasets. R’s data visualisation capacity is extensive and still developing, with multi-dimensional cluster plots available, for example. There are countless other software packages for visualisation, but some of them do not cope well with importing data from statistics software and others are limited to certain forms of representation; it might be better to stick with your preferred stats program or a spreadsheet for visualisations.
Whether you’re a recent graduate or an experienced professional, there’s plenty to learn about the programs with which you’re already familiar, even if this means experimenting with forms of analysis you haven’t used before or don’t have to do every day. Programming languages are similar enough that learning a second one is easier than the first, and you’re probably already familiar with the syntax of at least one. The longer the list of things you can do, the more sparkles on your CV. If that list includes R, sparklier still.