This post is more a reminder to me than anything you’d like to read, but anyway.
I want to make my research more open, more reproducible, and more accessible. I work from several locations, so I want to have all my stuff close to hand. I work on a Mac (sometimes), a PC (sometimes), and on Linux (rarely, but it happens; with new goodies from Bill Turkel et al. I might work more there!).
I build models in NetLogo. I do text analysis in R. I visualize and analyze with things like Voyant and Overview. I scrape websites. I use Excel quite a lot. I’m starting to write in markdown more often. I want to teach students (my students typically have fairly low levels of digital literacy) how to do all this too. What I don’t do much of is web-development-type stuff, which means that I’m still struggling with concepts and workflow around things like version control. And indeed, getting access to a server where I can just screw around to try things out is difficult (for a variety of reasons). So my server-side skills are weak.
What I think I need is an open notebook. Caleb McDaniel has an excellent post on what this could look like. He uses Gitit. I looked at the documentation, and was defeated out of the gate. Carl Boettiger uses a combination of GitHub and Jekyll and who knows what else. What I really like is Mark Madsen’s example, but I’m not au fait enough yet with all the bits and pieces (damn you version control, commits, make, rake, et cetera et cetera!).
I’ve got IPython notebooks working on my PC, which are quite cool (I installed the Anaconda version). I don’t know much Python though, so yeah. Stefan Sinclair is working on ‘voyant notebooks’, which use the same general idea to wrap analysis around Voyant, so I’m looking forward to that. IPython can be used to call R, which is cool, but it’s still early days for me (here’s a neat example passing data to R’s ggplot2).
So maybe that’s just the wrong tool. Much of what I want to do, at least as far as R is concerned, is covered in this post by Robert Flight on ‘creating an analysis as a package and vignette’ in RStudio. And there’s also this, for making sure things are reproducible: ‘packrat’.
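The basic idea behind a tool like packrat is writing down exactly which package versions an analysis ran against. As a rough illustration of that idea only (this is Python rather than R, it is not how packrat itself works, and the helper names `snapshot_packages` and `save_snapshot` are made up for the example), an IPython notebook could record its own installed packages something like this:

```python
import json
from importlib import metadata


def snapshot_packages():
    """Return {package name: version} for every installed distribution."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            versions[name] = dist.version
    # Sort by package name so two snapshots are easy to compare by eye.
    return dict(sorted(versions.items()))


def save_snapshot(path):
    """Write the snapshot as JSON so it can sit next to the analysis files."""
    snapshot = snapshot_packages()
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2)
    return snapshot
```

Calling `save_snapshot("packages.json")` (the file name is just a placeholder) at the top of a notebook would leave a small record that could be committed alongside the notebook itself. It only records versions; unlike packrat, it does nothing to restore them later.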
Some combination of all of this, I expect, will be the solution that’ll work for me. Soon I want to start doing some more agent-based modeling & simulation work, and it’s mission-critical that I sort out my data management, notebooks, versioning, etc. first this time.
God, you should see the mess around here from the last time!