Basic R Commands
Written by: Carsten Friis
The purpose of this exercise is to introduce the free statistical software package R.
R is a flexible and powerful tool developed by an extensive open source effort. For more on R, see the R-project home page.
Let us start things up by playing around with some variables. Remember that variables are often referred to as objects in R.
As this exercise uses several functions not covered in the lecture, you may want to use the help system to familiarize yourself with them. You do this by writing: help(function_name).
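As a small illustration of the help system, the sketch below looks up rnorm(), which is used later in this exercise. The args() call is an extra not mentioned above; it is shown only as a quick way to see a function's argument list without opening the full help page.

```r
# Open the help page for rnorm()
help(rnorm)

# Print only the argument list of rnorm(), with its default values
rnorm_args <- args(rnorm)
rnorm_args
```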
- Start up R
On Windows you should be able to find it in the Start menu. On UNIX/Linux, you start R by typing 'R' in a command terminal.
- Assign the value 12 to a and the value 5 to b
You can verify the values of a and b by typing 'a' or 'b' at the R prompt and then pressing return.
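If you are unsure how assignment looks in R, here is one way to do it. The sketch uses the arrow operator '<-', which is the usual convention in R code; writing a = 12 at the prompt would work as well.

```r
# Assign the values with the arrow operator
a <- 12
b <- 5

# Typing the name of an object prints its value
a
b
```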
- Add the two variables together using the '+' operator
- Now try to add them together using the sum() function
This accomplishes much the same thing as the '+' operator. Writing '+' may seem simpler than calling the sum() function, but that is not always so. For instance, if you want to add together all the values in a vector, using the function is much easier.
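The difference becomes clear with a small example. In the sketch below, v is a made-up vector used only for illustration; it is not part of the exercise.

```r
a <- 12
b <- 5

# Operator form and function form give the same total
total_op <- a + b
total_fn <- sum(a, b)

# v is a hypothetical example vector
v <- c(2, 4, 6, 8)

# With '+' every element has to be written out by hand
v_op <- v[1] + v[2] + v[3] + v[4]

# sum() handles the whole vector in one call, whatever its length
v_fn <- sum(v)
```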
- The function rnorm() can generate random numbers from the normal distribution. Use it to create two random vectors x and y with 10000 numbers each. (hint: use rnorm(10000))
- Use the str() function to verify that x and y do indeed contain ten thousand numbers
You could also just write 'x' or 'y', but then you would have to wait while R prints the variables to the screen.
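A minimal sketch of these two steps is given below. The calls to length() and head() are extras, not required by the exercise; they are simply two other ways of inspecting a long vector without printing all of it.

```r
# Draw 10000 random numbers from the standard normal distribution, twice
x <- rnorm(10000)
y <- rnorm(10000)

# str() prints the type, the length and the first few values
str(x)
str(y)

# length() returns just the number of elements
length(x)

# head() prints only the first six values
head(y)
```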
- Plot the two vectors using the plot() function
Do they look random to you? (Like a round fuzzy ball)
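One way to make the plot is sketched below. The second call uses pch = "." to draw each point as a small dot; this is an optional choice made here, not something the exercise asks for, but it can make a cloud of ten thousand points easier to read.

```r
x <- rnorm(10000)
y <- rnorm(10000)

# Scatter plot of y against x with the default plotting symbol
plot(x, y)

# Same plot, with each point drawn as a small dot
plot(x, y, pch = ".")
```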
- Try to use the function hist() to make histograms of x and y
This is a more useful illustration of the two vectors. Now you should be able to see that they look normally distributed.
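The histograms could be made as sketched below. Storing the result in hx and hy is an assumption made for this sketch; hist() draws the plot either way, and the stored object just lets you look at the bin counts afterwards.

```r
x <- rnorm(10000)
y <- rnorm(10000)

# hist() draws the histogram and invisibly returns the binning it used
hx <- hist(x)
hy <- hist(y)

# The bin counts add up to the number of values in the vector
sum(hx$counts)
```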
While the hist() function serves us well here, the textbook plot to use when dealing with distributions is a density plot. Let's try to make one.
- Construct an object called xd containing the parameters for a density plot using the density() function (hint: use density(x))
- Now construct the density plot itself using the plot() function (hint: use plot(xd))
The plot() function automatically recognizes that the input is a 'density' object, and acts accordingly.
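Put together, the two steps might look like the sketch below. The final curve() call is an optional extra that is not part of the exercise; it overlays the theoretical standard normal density as a dashed line so that you can compare it with the estimate.

```r
x <- rnorm(10000)

# Estimate the density of x; the result is an object of class "density"
xd <- density(x)
class(xd)

# plot() picks the right method for a density object
plot(xd)

# Optional: add the theoretical standard normal density as a dashed line
curve(dnorm, add = TRUE, lty = 2)
```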
- Calculate the mean and the variance of x and y, the two parameters that describe a normal distribution. Use the mean() and var() functions to do this. (See, functions are nice when working with vectors :-) )
Are they similar?
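A short sketch of the calculation follows. Since rnorm() draws from a normal distribution with mean 0 and standard deviation 1 by default, you would expect means close to 0 and variances close to 1, although the exact numbers differ from run to run.

```r
x <- rnorm(10000)
y <- rnorm(10000)

# Sample means, expected to be close to 0
mean(x)
mean(y)

# Sample variances, expected to be close to 1
var(x)
var(y)
```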
- Use the t.test() function to statistically test whether they are similar (hint: use t.test(x,y))
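Look at the p-value. It is the probability of seeing a difference between the two sample means at least as large as the one observed, if the distributions from which x and y are sampled really do have the same mean. A large p-value therefore gives no evidence that x and y come from different distributions.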
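The test can be run as sketched below. Storing the output in an object called result is a choice made for this sketch; it lets you pull out the p-value on its own instead of reading it off the printed summary.

```r
x <- rnorm(10000)
y <- rnorm(10000)

# Two-sample t-test comparing the means of x and y
result <- t.test(x, y)

# Print the full summary of the test
result

# Extract only the p-value
result$p.value
```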
- Fun thing: Write down your p-value; we'll compare them later on.
- Use ls() to get an overview of your objects
- Generate a new y vector so that it is no longer similar to the x vector. Confirm the difference with the t.test() function (the p-value should become extremely small)
Use the help system on rnorm() to figure out how to generate the new vector.
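If you get stuck, one possible approach is sketched below. It shifts the mean of y using the mean argument of rnorm(); the value 1 is an arbitrary choice made for this sketch, and other shifts, or a different approach altogether, would do just as well.

```r
x <- rnorm(10000)

# New y drawn from a normal distribution with mean 1 instead of 0
y <- rnorm(10000, mean = 1)

# The t-test should now report a very small p-value
result <- t.test(x, y)
result$p.value
```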
- If you have not already done so, try running the graphics demo (write: demo(graphics)). It's pretty, and it'll give you an idea of R's capabilities
You must press return with the window running R in focus (i.e. the active window) to cycle through the different plots. Note that pressing return with the plotting window active will accomplish nothing.
If you feel up to it and have the time, you may continue with the "Hard R Exercise".