I’m no wiz when it comes to computers, but lots of really smart people have told me I’d do myself an academic favor by learning to write programs. In Paleobiology last year, Phil Gingerich introduced me to programming in Visual Basic. While I did not become so proficient to find it as useful he did, it still got me interested. Then last semester, Milford urged me and other lawnchair anthropologists to take a course in the department of Ecology and Evolutionary Biology taught by George Estabrook, where we learned to write resampling programs for Excel in Visual Basic. It was a bit stressful at times–reading and writing computer codes at first was like learning to read and write Japanese with little training. But overall I thought it was a very useful class, and I’ll certainly be writing more resampling programs in the future (i.e. for my Australopithecus robustus projects…).
But the computer language that everybody’s been abuzz about in the past year or so is R. To quote the website (see the link), “R is a language and environment for statistical computing and graphics.” The software and codes for it are available FOR FREE on the website. I’ve been meaning to sit down and figure out how to use it for a year now, and now that I’m all alone at the museum in the evenings, I’ve begun learning to use R.
At first I had quite a difficult time simply loading data into the program. But once I figured it out (I had the idea, I just wasn’t typing it correctly–“syntax” errors), it was quite easy to run. With some datasets supplied from the Paleobiology class, I was able quite easily to compute and plot statistics like principle components analysis and linear models. It’s really easy! There are even built-in boostrap (resampling) programs, although I could not figure out how to use them as I wanted.
But that isn’t programming–I wasn’t making a custom program that would do what I wanted it to do. So I decided to give it a shot. For the programming class last semester (and Milford’s “Evolution of the Genus Homo” course), I wrote a resampling program that gave the probability of randomly sampling two gorilla cranial capacities as different as some early habilines. So I decided to try to rewrite that program in R. It took about two evenings of reading manuals and fiddling with commands (and drinking red wine and watching Roadhouse). The R code is much simpler than the original one I wrote in Visual Basic–this is largely because R has a built-in command that permutes rows/columns. Plus, it runs much, much faster–I was able to resample 100,000 times in a few seconds, whereas in the original (VB) program it took a few minutes to do 5,000. I think after a few more evenings of tinkering, I can figure out how to also plot the distribution of resampled test-statistics (although if anyone can tell me how I’d love to hear).
If I can figure out how to do it, I’ll post the code so other people can tinker with it if they want. Lesson: use R!