A program can only be executed if it has execute permission:
chmod 755 <filename>
Remember to write #!/usr/bin/perl on the first line of
your programs.
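As a minimal sketch of the two points above, assuming a hypothetical file name hello.pl: the program below starts with the interpreter line, and after chmod 755 hello.pl it can be started directly with ./hello.pl.

```perl
#!/usr/bin/perl
# hello.pl (hypothetical file name): the first line tells the system
# which interpreter should run the program.
use strict;
use warnings;

my $greeting = "Hello from Perl";
print "$greeting\n";
```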
Necessary files to complete these exercises
To download the files to your system, just press the Shift key while
you left click on the blue link. Follow the instructions.
ex5.acc
matrix.dat
mat1.dat
mat2.dat
test1.dat
test2.dat
test3.dat
dna7.fsa
FastaParse.pm
- Make a subroutine that removes duplicates from a list. The list (array) has
to be passed to the subroutine as a reference and the array must be cleaned
"in place" thus using a minimum of memory. The subroutine should NOT return
a value. Use it to improve/change exercise 2 from day 2.
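The mechanism behind "in place" can be sketched with a deliberately different task. The hypothetical subroutine drop_negatives below receives a reference, changes the caller's array through it, and returns nothing; removing duplicates is left as the exercise.

```perl
use strict;
use warnings;

# Hypothetical helper: removes negative numbers from the array that
# $aref points to. Nothing is returned; the caller's array is changed.
sub drop_negatives {
    my ($aref) = @_;
    # Walk backwards so that splice does not shift unvisited elements.
    for (my $i = $#{$aref}; $i >= 0; $i--) {
        splice(@{$aref}, $i, 1) if $aref->[$i] < 0;
    }
    return;
}

my @numbers = (4, -1, 7, -3, 0);
drop_negatives(\@numbers);
print "@numbers\n";    # prints "4 7 0"
```

Because only the reference is copied into the subroutine, no second copy of the array is made.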
- Time to combine the subroutines that remove duplicates: make a subroutine that removes duplicates
from a list. If the list is passed as an array, the behaviour should be
like day 3, ex. 7, i.e. return a clean list. If the list is passed as a reference to
an array, the behaviour should be like day 4, ex. 1, i.e. clean the given array in
place.
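One possible way to tell the two calling styles apart is Perl's built-in ref function. The sketch below only reports how its argument arrived; the helper name how_passed and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how the data was passed.
sub how_passed {
    # ref() returns 'ARRAY' for an array reference and '' for a plain scalar.
    if (@_ == 1 and ref($_[0]) eq 'ARRAY') {
        return "reference to an array with " . scalar(@{ $_[0] }) . " elements";
    }
    return "list with " . scalar(@_) . " elements";
}

my @genes = ('a', 'b', 'c');
my $as_list = how_passed(@genes);
my $as_ref  = how_passed(\@genes);
print "$as_list\n";    # list with 3 elements
print "$as_ref\n";     # reference to an array with 3 elements
```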
- Create a program that reads a tab-separated file with numbers, matrix.dat
(to be understood as a matrix), and stores the numbers in a matrix-like hash
(keys are indices, i.e. "$i,$j"). The program should be able to figure
out how many rows and columns the matrix has. Having read the matrix from
the file, it should transpose it (rows to columns and columns to rows) using
a subroutine like &transpose($rows, $columns, \%matrix). You have to write
the subroutine, too. At the end, print the resulting matrix.
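The matrix-like hash can be sketched as follows. The two data lines are made up and stand in for the contents of a tab-separated file; reading from matrix.dat and the transposition itself are not shown.

```perl
use strict;
use warnings;

# Made-up example data standing in for the lines of a tab-separated file.
my @lines = ("1\t2\t3", "4\t5\t6");

my %matrix;
my ($rows, $columns) = (0, 0);
foreach my $line (@lines) {
    my @fields = split /\t/, $line;
    for my $j (0 .. $#fields) {
        $matrix{"$rows,$j"} = $fields[$j];   # key is "row,column"
    }
    $columns = scalar @fields;
    $rows++;
}

print "rows: $rows, columns: $columns\n";    # rows: 2, columns: 3
print "element 1,2 is $matrix{'1,2'}\n";     # element 1,2 is 6
```

Counting lines gives the number of rows, and the number of fields on a line gives the number of columns.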
- Make a program that calculates the product of two matrices and prints it on STDOUT (the screen).
The matrices are in the files mat1.dat and mat2.dat. Numbers in the files are tab separated.
Advice: The program should have a subroutine that reads a matrix from a given file (to be used twice),
a subroutine that calculates the product, and a subroutine that prints a matrix. This ensures that
your program is easy to adapt to other forms of matrix calculations.
Here are two links to the definition of matrix multiplication:
http://www.mai.liu.se/~halun/matrix/matrix.html
http://mathworld.wolfram.com/MatrixMultiplication.html
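For quick reference, the standard definition can be summarised as follows: if A has n rows and m columns and B has m rows and p columns, the product C = AB has n rows and p columns. The 2 x 2 numbers below are a made-up example, not the contents of mat1.dat or mat2.dat.

```latex
C_{ij} = \sum_{k=1}^{m} A_{ik} B_{kj},
\qquad i = 1,\dots,n, \quad j = 1,\dots,p

\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}
=
\begin{pmatrix}
1\cdot 5 + 2\cdot 7 & 1\cdot 6 + 2\cdot 8 \\
3\cdot 5 + 4\cdot 7 & 3\cdot 6 + 4\cdot 8
\end{pmatrix}
=
\begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}
```

The product is only defined when the number of columns of A equals the number of rows of B.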
- The file test1.dat contains results from an experiment in the form
AccessionNumber Number Number Number ....
.
.
The files test2.dat and test3.dat contain results from similar
experiments but with a slightly
different gene set. You want to average the numbers from all experiments for each accession
number. The output is therefore:
AccessionNumber SingleAverageNumberOfAll3Experiments
.
.
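One way to approach the averaging is to keep a running sum and a running count per accession number in two hashes. The sketch below works on made-up lines rather than the real files, and it assumes that the fields are separated by whitespace.

```perl
use strict;
use warnings;

# Made-up lines standing in for the contents of the experiment files;
# whitespace-separated fields are an assumption.
my @lines = (
    "AB000001 1 2 3",
    "AB000002 10 20",
    "AB000001 6",
);

my (%sum, %count);
foreach my $line (@lines) {
    my ($acc, @numbers) = split ' ', $line;
    foreach my $n (@numbers) {
        $sum{$acc}   += $n;    # running total per accession number
        $count{$acc} += 1;     # how many numbers were added
    }
}

foreach my $acc (sort keys %sum) {
    printf "%s\t%.2f\n", $acc, $sum{$acc} / $count{$acc};
}
```

An accession number that is missing from one experiment simply contributes fewer numbers to its sum and count.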
- Now we should use some object-oriented techniques. OO programming is
very often used in modules. A module is a collection of subroutines which
somebody benevolent has made available for your use. You can find many Perl
modules at
http://www.cpan.org/.
For now, start by saving the file FastaParse.pm in the directory where
your program will be. This is an OO module, which I made for easy reading of
fasta files. The first thing you
should do is read the file. It starts with a description of the
module, followed by the code. You should not worry about the code,
although it is good to learn from when you make your own modules.
The important part is the synopsis (first in the file), which tells you
how to use the module.
First you should make a small program that proves that you have downloaded
and placed the module in the correct place. It could be the program in the synopsis of the module. If it runs without errors, you are set.
Your first Perl statement in a program that uses the module should be: use FastaParse; which
loads the module.
After that you can use the module as described. Notice the use of '->'
to refer to methods and/or data encapsulated in the module.
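The '->' notation can be illustrated with a toy class. Counter below is hypothetical and has nothing to do with FastaParse, whose actual methods are described in its synopsis; the sketch only shows a class method call, an object method call, and access to encapsulated data.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT FastaParse; it only demonstrates '->'.
package Counter;

sub new {
    my ($class, $start) = @_;
    my $self = { count => $start };    # data encapsulated in the object
    return bless $self, $class;
}

sub increment {
    my ($self) = @_;
    $self->{count}++;
    return $self->{count};
}

package main;

my $counter = Counter->new(5);     # class method call
$counter->increment;               # object method call
my $value = $counter->{count};     # direct access to the object's data
print "$value\n";                  # prints 6
```

A real module would normally live in its own file (here that would be Counter.pm) and be loaded with a use statement.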
- Use this module to parse/read the fasta file
dna7.fsa and solve ex. 6 from day 2. I repeat the text of the
exercise for convenience: Now make a program that reverse complements the sequence
and writes it into the file revdna.fsa in fasta format. This time you have to keep the first identifying line, so the
sequence can be identified. You must add 'ReverseComplement' at the end
of that line, though, so you later know that it is the reverse complement.
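As a reminder of the core operation, a reverse complement can be sketched with the built-in reverse function and the tr operator. The sequence is made up; reading dna7.fsa with the module and writing revdna.fsa are not shown.

```perl
use strict;
use warnings;

my $dna = "AAGCT";                    # made-up sequence
my $revcomp = reverse $dna;           # reverse the string ...
$revcomp =~ tr/ACGTacgt/TGCAtgca/;    # ... then complement each base
print "$revcomp\n";                   # prints AGCTT
```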