One of the questions I get a lot these days is, “How can I learn data analysis without actually taking a statistics course?” In this post, I will relay my answers to that question.
First, if, like me, you sometimes need help understanding or explaining statistical concepts and data analysis, then I highly recommend visiting the Seeing Theory website, where you can:
- spend hours Ooo-ing and Ahh-ing over beautiful visualizations of the math behind a statistical analysis.
- interactively manipulate data and see its visualized impact on statistical analyses in real time—so cool!
- download a draft of the Seeing Theory textbook for free.
The Seeing Theory website will surely be useful to you no matter which of the following approaches you take to learning stats.
1. R (and RStudio): Powerful; Learning-curve
“If you know R, you’ll never be unemployed.” That’s a line I hear a lot, lately. Why? Because data analysis skills are in demand. And R is used in both industry and academia. So, if you know R, you are more likely to find employment.
R is a program and a language. It’s free (on both Mac and Windows) and it’s really powerful. A downside is that it is a command line interface with its own language, so it involves learning new (but relatively easy) syntax (just like learning symbolic logic or HTML). You can download R (which runs in the background) here and RStudio (the app you use to analyze data using R) here. The book Getting Started With RStudio is a decent primer. If you like videos, then this introductory workshop video by Ashley Edwards might suit you well.
Or if you want to just get started with minimal textbooks or instruction, check out the 8-page “Getting Started in R” Guide (complete with its own datasets).
Once you’ve gleaned the basics of R, you might benefit from following advanced R users like Danielle Navarro (below) and others.
One more thing: Jamovi, which is built on top of the R statistical language, has also been recommended to me. Jamovi offers a spreadsheet user interface (unlike RStudio), which may be more familiar to many people.
2. JASP: Point-and-click; In development
JASP might be the newest and easiest program to use: it’s point-and-click rather than code-and-run. It allows you to do most of the basics (and it is gaining new abilities frequently). Here’s a tutorial for JASP. You can download JASP here.
I don’t have a good sense of how many people in industry are using JASP for data analysis. So, I am not sure how well JASP skills will transfer outside of academia.
3. Spreadsheets: Learn the math!
The first grad stats course I took required students to do the analyses with pencil and paper (and a basic calculator, where applicable). It forced me to learn much of the math involved in basic data analysis, which is helpful even today. Once you know the math, you are more likely to notice errors in your analysis that programs like R and JASP (and the rest) do not tell you about.
Another way to learn the math is to use a spreadsheet app like Excel. Just enter your data and then build formulas to calculate means, standard deviations, predictions, and residuals. Then you can plug those values into more formulas to get more values. It’s not something I recommend to everyone, but if you are especially interested in understanding the math and you enjoy tinkering, then you might actually prefer this option. Pretty much every stats book will give you the equations you need (e.g., this book by Judd, McClelland, and Ryan (2017) and this book by Maxwell, Delaney, and Kelley (2018)).
Obviously, spreadsheets are used everywhere. So if you can master the basic mathematical functions of a program like Excel, then you’ll have a skill set that easily transfers outside of academia.
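If you want to check your spreadsheet formulas against something, here is a minimal sketch of the same calculations (means, standard deviations, predictions, and residuals) written out step by step in Python. The data are made up purely for illustration, the function names are my own, and the prediction step assumes a simple one-predictor least-squares line, which is only one of many models a stats book will cover.

```python
import math

# Hypothetical, made-up data for illustration only (not from any real study).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def mean(values):
    """Arithmetic mean (comparable to AVERAGE in a spreadsheet)."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation, dividing by n - 1 (comparable to STDEV.S)."""
    m = mean(values)
    squared_deviations = sum((v - m) ** 2 for v in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))  # co-deviation of x and y
    sxx = sum((a - mx) ** 2 for a in xs)                    # squared deviation of x
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


slope, intercept = fit_line(x, y)

# Prediction for each observation, then residual = observed - predicted.
predictions = [intercept + slope * a for a in x]
residuals = [obs - pred for obs, pred in zip(y, predictions)]

print("mean of y:", mean(y))
print("sample SD of y:", round(sample_sd(y), 4))
print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
print("residuals:", [round(r, 4) for r in residuals])
```

One handy self-check, in a spreadsheet or in code: the residuals from a least-squares line with an intercept should sum to (approximately) zero. If they don't, a formula is probably wrong somewhere.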
Off you go!
In my experience, the easiest way to learn this stuff is to sit through a course on it. That way someone can save you hours of confusion or running down rabbit holes when you have a question. While Google searches are very helpful, I find that they are rarely better than humans who are in the same room and already know how to do what you are trying to learn.
But I imagine that, in principle, these tools can be learned well enough on one’s own.
Disclaimer. I realize that I didn’t mention common tools like SAS and SPSS. Why? Do I think that they are bad or dumb or beneath me?! No. I have used them while co-authoring papers and while learning data analysis. However, I have never found them so indispensable that I am willing to pay for them or to limit myself to Windows-based apps. If that changes, then I imagine that I will probably update this post.