Lesson 1: Using the UNIX operating system


Required reading
Notes on UNIX from the teacher

Subjects covered
Basic file handling in UNIX.

Necessary files to complete this exercise.
To download the files to your system, just press the Shift key while you left click on the blue link. Follow the instructions.
The files are all excerpts of real data.
ex1.acc
ex1.dat
orphans.sp
You can play around with these files as much as you like. If you change or destroy them, just download them again.


  1. Use nedit to create a file mycommands.txt in which you record all the commands you use and the observations you make in the following exercises. Use copy/paste to copy the commands.
  2. Note: There are more standard text editors than nedit. Examples are emacs, xemacs, vi, vim, and pico.
  3. First list the files in the directory.
  4. Copy ex1.acc to myfile.acc.
  5. Look at the content of both files to ensure they are identical.
  6. Copy ex1.dat to myfile.acc.
  7. Check that the content of myfile.acc changed.
  8. Delete myfile.acc.
  9. Make a directory test and move the three files to it.
  10. Make a directory data and move the three files to that instead.
  11. Remove test directory.
  12. Change directory to data and confirm that you succeeded. Go back to the home directory afterwards.
  13. Make three new directories newtest - one inside the other, like a Russian doll.
  14. Move the data directory to the innermost newtest directory.
  15. Confirm that the three files are moved along with the data directory.
  16. Copy the three files to your home (your top directory).
  17. Remove all newtest directories and the data in them with a single command. There may be a lot of confirmation prompts. These are not considered part of the command; they are annoyances.
  18. Count the lines in ex1.acc and ex1.dat.
  19. Concatenate ex1.acc and ex1.dat in the file ex1.tot, i.e. copy the content of two files into one new file. Verify that all gene IDs come first, followed by the numerical data.
  20. Paste ex1.acc and ex1.dat together in ex1.tot, thus destroying the old file. Verify that corresponding gene IDs and numerical data are put on the same line.
  21. Extract (cut) SwissProt ID and 1st and 3rd numerical data (columns 1,3,5) from ex1.tot. Put results into a file ex1.res.
  22. Extract GenBank ID and 1st numerical data (columns 2,3) from ex1.tot. Put results into a file ex1.res.
  23. Find the lines (using grep) in orphans.sp which contain a GenBank accession number. There are 85; verify this. Note: An accession number starts with one or two capital letters and looks like this: 'AB000114.CDS.1'. The .CDS. part is more or less optional.
  24. How many human genes with SwissProt IDs are there in orphans.sp? How many of those are hypothetical? (11)
    How many genes belong to the rat, and how many of those are precursors? (9) Note: A SwissProt ID looks like 'PARG_HUMAN' or 'TF1A_MOUSE', with the gene before the underscore and the organism after it.
  25. This little exercise requires that you use man to get help on grep. From the file ex1.res find the lines with positive numbers and put them into ex1.pos. The lines with negative numbers go into ex1.neg.
  26. Write a shell script that solves exercises 19-24, with the exercises clearly separated in both the script and the output. This should be straightforward (but long), especially since you took notes (exercise 1).
  27. Write a shell script (which is simply a list of UNIX commands in a file) that puts all the positive numbers in the file ex1.dat into a file ex1.pos2, and all the negative numbers into a file ex1.neg2. Column position does not matter. The script must clean up after itself, so if any temporary files are used, they must be deleted as the last action. Remember to put the date and a description of the files in the first lines of the resulting output files.
  28. Mail your mycommands.txt file to the teacher for comments.
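
The difference between concatenating, pasting and cutting (exercises 19-22) can be seen on a small scale with the sketch below. It is only an illustration: the files toy.acc and toy.dat and their contents are invented stand-ins, not the real ex1 files, and they are assumed to be tab-separated.

```shell
# Work in a scratch directory so nothing in the home directory is touched.
workdir=$(mktemp -d)

# Invented two-line stand-ins for ex1.acc and ex1.dat (tab-separated).
printf 'ID_A\tGB_A\nID_B\tGB_B\n' > "$workdir/toy.acc"
printf '1.5\t-2.0\t3.1\n-0.4\t0.9\t-7.2\n' > "$workdir/toy.dat"

# cat writes the second file after the first: the line counts add up.
cat "$workdir/toy.acc" "$workdir/toy.dat" > "$workdir/toy.cat"
cat_lines=$(wc -l < "$workdir/toy.cat" | tr -d ' ')

# paste joins line N of each file with a tab: the line count stays the same.
paste "$workdir/toy.acc" "$workdir/toy.dat" > "$workdir/toy.tot"
paste_lines=$(wc -l < "$workdir/toy.tot" | tr -d ' ')

# cut picks tab-separated columns by position (here columns 1, 3 and 5).
picked=$(cut -f 1,3,5 "$workdir/toy.tot")

echo "lines after cat:   $cat_lines"
echo "lines after paste: $paste_lines"
echo "$picked"

# Clean up the scratch directory as the last action.
rm -r "$workdir"
```

Counting lines with wc before and after is a quick way to tell whether two files were stacked (cat) or joined side by side (paste).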

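The grep exercises (23-25) mostly combine three things: counting matches, chaining several greps, and inverting a match. The sketch below shows these on an invented file toy.sp; the lines in it only mimic the GENE_ORGANISM shape of a SwissProt ID and are not taken from orphans.sp.

```shell
# Work in a scratch directory with an invented three-line file.
workdir=$(mktemp -d)
printf 'AAA1_HUMAN hypothetical protein\nBBB2_HUMAN kinase\nCCC3_RAT kinase precursor\n' > "$workdir/toy.sp"

# grep -c prints the number of matching lines instead of the lines themselves.
human_count=$(grep -c '_HUMAN' "$workdir/toy.sp")

# Piping one grep into another narrows the selection; wc -l counts what is left.
human_hypo=$(grep '_HUMAN' "$workdir/toy.sp" | grep 'hypothetical' | wc -l | tr -d ' ')

# grep -v inverts the match: it keeps every line that does NOT contain the pattern.
not_human=$(grep -v '_HUMAN' "$workdir/toy.sp")

echo "human lines:        $human_count"
echo "human hypothetical: $human_hypo"
echo "not human:          $not_human"

# Clean up the scratch directory as the last action.
rm -r "$workdir"
```

The same pattern of a match followed by its inverse (grep and grep -v) is one possible way to split a file into two complementary parts; man grep lists further options.
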
This page was last updated         by Peter Wad Sackett, pws@cbs.dtu.dk