Open Source for Open Science - an EEB Summer Workshop

Course Details

Course Outline (PDF)
Link to a directory of all files for the course
Downloadable zip archive of all course files

Open Source for Open Science (OSOS) 2014 is a free summer workshop featuring an introduction to R, *nix commands, and open source GIS. This two-and-a-half-day course will provide tools for reading in, manipulating, and analyzing your data in R, as well as other open-source tools for visualizing and manipulating spatial data. It will also provide a basic introduction to scientific programming, to creating publication-quality graphics in R, and to wrangling messy data with *nix commands in bash. Although the course is aimed at EEB graduate students, you need not be in EEB or a graduate student to attend. The modular design of the course allows participants to attend the sections that are useful to them and skip the rest. We encourage you to attend the first day if you are not an avid R user.


Instructors: Rebecca Clark, Thomas Olszewski, Claudio Casola, Mike Treglia, and Michelle Lawing.


No previous knowledge of R is required. The course is open to all students, postdocs, and faculty. A laptop is required. If you do not have a laptop, please contact Michelle and we will set you up with a loaner.


The workshop will start at 9:00 am on July 10th and wrap up around noon on July 12th. Snacks and coffee will be provided during morning and afternoon breaks, and participants will have an hour for lunch.

Day 1 - July 10th

9:00 am
Session 1 - Introduction to the R environment (Rebecca Clark)
Session 1 Handouts and Files
  1. Overview
  2. Interface Basics
    a. Simple calculations
    b. Variables and assignments
    c. Functions
  3. Loading Data
    a. Directories, paths, and finding files
    b. .csv and .txt
    c. Headers
    d. Formats (vectors and data frames)
    e. Group activity ("fixing" an input file for R)
  4. Manipulating data
    a. Subsetting
    b. Indexing
    c. Handling missing values

- Lunch Break -
1:00 pm
Session 2 - Introduction to the R environment continued (Michelle Lawing)
Session 2 Handouts and Files
  1. Organization of projects
    a. File hierarchy
    b. Reference cards
    c. R style guide
    d. Well-documented scripts
  2. Base plotting
  3. Some simple statistics
    a. Regression example with some plotting
    b. ANOVA example with sums of squares explanation
  4. Packages and CRAN
  5. Where to go for help
  6. RStudio
  7. Wrap-up

Day 2 - July 11th

9:00 am
Session 3 - R for Scientific Programming (Tom Olszewski)
Session 3 Handouts and Files
  1. Introduction to programming
    a. Control structures
    b. Vector notation
    c. How to write instructions as a script
  2. Random walk example
  3. Randomization with Monte Carlo and bootstrapping

- Lunch Break -
1:00 pm
Session 4 - *nix commands in bash and R (Claudio Casola)
Session 4 Handouts and Files
  1. *nix commands
  2. Regular expressions
  3. Invoking R from the command line

3:00 pm
Session 5 - Publication Quality Graphics (Rebecca Clark)
Session 5 Handouts and Files
  1. Base graphics details
  2. Plotting with lattice
  3. Plotting with ggplot2

Day 3 - July 12th

9:00 am
Session 6 - Open Source GIS (Mike Treglia)
Session 6 Handouts and Files
  1. Introduction to GIS
    a. Capabilities
    b. Data types and formats
    c. Projections
    d. Common software
  2. Working with QGIS
    a. Introduction to QGIS
    b. Loading and Viewing Spatial Data
    c. Dealing with Projections
    d. Some Basic Vector Operations
    e. Some Basic Raster Operations
    f. Using Vector and Raster Datasets Together
    g. Making a Map
  3. Q&A to cover any other elements of interest

Additional Websites