Title: Software Project: Project presentation and implementation tips
1Software Project: Project presentation and implementation tips
2Overview
- Part 1 Project Presentation
- Project Part 1
- Project Part 2
- Part 2 Perl Programming Language
- Part 3 Additional tips for the project
3Project Presentation
- Deadlines
- Part 1 Submit by September 30.
- Part 2 Submit by October 30.
4Project Part 1
5Doubly Linked List Representation
6Data Type Definition
    typedef struct cell_t {
        struct cell_t *row_next;  /* pointer to the next element in the row */
        struct cell_t *col_next;  /* pointer to the next element in the column */
        int rowind;               /* index in row */
        int colind;               /* index in column */
        int value;                /* value of the element */
    } cell_t;                     /* matrix cell data type */

    typedef struct {
        int n;                    /* size */
        cell_t **rows;            /* array of row lists */
        cell_t **cols;            /* array of col lists */
    } sparse_matrix_lst;          /* sparse matrix representation */
7Question 3.1 sparse_matop_lst
- Implement the matrix operations (+, -, *, t) on linked lists.
- Command line
- <operation> 1.in 2.in
- t 3.in
- Special requirements
- Reading the matrices from binary files (directly to a sparse matrix, without allocating n*n space).
- Printing the output matrix to a text file.
8 Guidelines
- You are not required to follow any specific function prototypes.
- You should select the most suitable implementation by yourself.
- You should describe your algorithms and considerations in the manual.
- Correction to the exercise printouts: there will be no spaces in the binary files! Your matrices will be just a set of integers (the first will indicate the matrix size).
9Question 3.2 sparse_mlpl_lst
- Perform multiplication of matrices of different sizes (n) with different numbers of non-zero elements (nnz).
- Measure the performance using the clock() function.
- As in 2.3, prepare 2 graphs which plot the running times in CPU ticks as a function of:
- n*n
- n*n/nnz
- Compare the performance of the three matrix multiplication programs you have implemented so far. Which representation is better and when?
10Question 3.3 matop
- Generic matrix calculator for binary operations (Part 1).
- Command lines
- <operation> r 1.in 2.in
- <operation> l 1.in 2.in
- t r 3.in
- t l 3.in
- General form: <operation> <matrix type> <file 1> <file 2>
11Observation
- For each type of matrix we have
- The same set of functions
- The same operations
- The same program flow
- For each type of matrix we have a different
- Matrix representation
- Matrix manipulation functions
12Function prototypes
- For example, the allocation functions are:
- Regular matrices
- elem *get_matrix_space(int n)
- Sparse matrices, linked lists
- sparse_matrix_lst *allocate_sparse_matrix_lst(int n)
13The trivial approach
    typedef enum { REG = 0, LST = 1, ARR = 2 } MatType;

    if (mType == REG) {
        get_matrix_space(n);
    }
    if (mType == LST) {
        allocate_sparse_matrix_lst(n);
    }
- Disadvantages
- Every operation will have to check the matrix type before function invocation.
- Long and unreadable code.
- Inefficient code (unnecessary evaluation of a lot of if statements).
14Another approach
- Duplicate the main function (op_code stores +, *, -, t):
    if (mType == REG) {
        handle_regular_matrix(op_code, file1, file2, n);
    }
    if (mType == LST) {
        handle_sparse_matrix_lst(op_code, file1, file2, n);
    }
- Disadvantages
- The main function is almost the same in both cases.
- The code is twice as long (or more).
- Problems in debugging: you fix something in one function and not in the other.
15A better approach
An array of structs whose members are pointers to functions:
- Regular matrix functions: get_matrix_space, add ..., mult ..., transp ..., free ...
- Sparse linked list: allocate_sparse_matrix_lst, add ..._lst, mult ..._lst, transp ..._lst, free ..._lst
- Sparse compressed array: allocate_sparse_matrix_arr, add ..._arr, mult ..._arr, transp ..._arr, free ..._arr
16The resulting code
An array matrices will store the pointers to all the functions.
    typedef enum { REG = 0, LST = 1, ARR = 2 } MatType;

    a = matrices[mType].allocate(n);
    matrices[mType].fill(a, n);
    matrices[mType].write("A", a, n);
    b = matrices[mType].allocate(n);
    matrices[mType].fill(b, n);
    matrices[mType].write("B", b, n);

    c = matrices[mType].allocate(n);
    switch (op_code)
    {
    case '+':
        matrices[mType].add(a, b, c, n);
        break;
    case '-':
        matrices[mType].subs(a, b, c, n);
        break;
    case '*':
        matrices[mType].mult(a, b, c, n);
        break;
    }
17Pointer to Function
- Define a type of pointer to a function which allocates the matrix space:
- Accepts int as a parameter
- Returns void *
    typedef void *(*allocFunctPtr)(int);
- (int) is the function parameter.
- void * is the return value.
- allocFunctPtr is the data type name, which can be used to define variables.
18Another example
- Define a type of pointer to a function which frees the matrix space:
- Accepts void * as a parameter
- Returns void
    typedef void (*freeFunctPtr)(void *);
- (void *) is the function parameter.
- void is the return value.
- freeFunctPtr is the data type name, which can be used to define variables.
19A struct of pointers to functions
    typedef struct
    {
        allocFunctPtr allocate;
        freeFunctPtr free;
    } matrix_funcs;
20A struct of pointers to functions
    matrix_funcs matrices[3];
    a = matrices[mType].allocate(n);
    matrices[mType].free(a);
    ...
21Initialization and casting
    void initRegMatrix(matrix_funcs *mat)
    {
        mat->allocate = (void *(*)(int))get_matrix_space;
        mat->free = (void (*)(void *))free_matrix_space;
    }
- get_matrix_space and free_matrix_space are your real functions from assignment 2.
- Each one is cast to the required pointer-to-function data type.
23The makefile
- Write your own makefile, which should support the following commands:
- make sparse_mlpl_lst
- make sparse_matop_lst
- make matop
- Do not add the -g and -pg flags, which may slow down the performance.
24General Guidelines
- Use your own judgment and select the most suitable implementation by yourself!
- You are allowed to change the function prototypes of exercise 2.
- You are allowed to define the function prototypes of exercise 3.1 in the most convenient way for this structure.
25Project Part 2
26Question 4.1 matop
- Generic matrix calculator for binary operations (Part 2).
- Enhance the matrix calculator you have developed in exercise 3.3 to support sparse matrices represented as compressed arrays as well.
- Command lines
- c <operation> 1.in 2.in
- l <operation> 1.in 2.in
- t c 3.in
- t l 3.in
- General form: <matrix type> <operation> <file 1> <file 2>
- Matrix types: c = sparse compressed, l = sparse linked, r = regular
27Question 4.1 matop
- New requirements for compressed arrays
- You do not know the number of non-zero elements (nnz) in advance and you will need to calculate it by yourself.
- You can allocate a supplementary data structure of maximum size proportional to nnz.
- You will need to store the result of your calculations in a compressed matrix.
28Question 4.2 mat_calc
- Enhance the matrix calculator you have developed in exercise 4.1 to support any number of matrix operations.
- Command line
- c 1.in + t 2.in * t 3.in
- Interpretation
- 1 + ((t 2) * (t 3))
- Command line
- 1.in - 2.in - 3.in * 4.in * t 5.in + 6.in * 7.in
- Interpretation
- ((1 - 2) - ((3 * 4) * (t 5))) + (6 * 7)
29Expression Tree
- Command line: l 1.in - 2.in - 3.in * 4.in * t 5.in + 6.in * 7.in
- Expression represented by the tree: ((1 - 2) - ((3 * 4) * (t 5))) + (6 * 7)
30Guidelines Design
- You should make your own design of the tree and all the algorithms.
- Define and describe your data types.
- Define and describe the tree construction procedure.
- Define and describe the tree traversal procedure.
31Guidelines Memory management
- Verify that you manage the memory in the most efficient way.
- At each stage you have the minimal amount of allocated memory required for the calculation.
- Once you finish the calculation, free all the resources.
- Describe your memory management in the manual.
32Question 4.3 calc_parse.pl
- Enhance your calculator from exercise 4.2 to support matrices with values of any of these data types: (1) char, (2) integer, (3) float or (4) double.
- These will be encoded by CHAR, INT, FLOAT, DOUBLE.
- Example command line
- DOUBLE c 1.in + t 2.in * t 3.in
33Current Data Type
How can we change it to be double, float, char or int when needed?
    typedef int elem;
    void mult_reg(elem *a, elem *b, elem *c, int N);
    void add_matrix(elem *a, elem *b, elem *c, int N);
    ...
34Trivial Solution 1
    typedef int elem_int;
    typedef double elem_double;
    void mult_reg(elem_int *a, elem_int *b, elem_int *c, int N);
    void mult_reg(elem_double *a, elem_double *b, elem_double *c, int N);
35Trivial Solution 2
Let's define it as void *:
    void mult_reg(void *a, void *b, void *c, int N);

    void mult_reg(void *a, void *b, void *c, int N)
    {
        int i = 0, j = 0, k = 0;
        for (i = 0; i < N; i++)
            for (k = 0; k < N; k++)
                for (j = 0; j < N; j++)
                    ((int *)c)[i + j*N] += ((int *)a)[i + k*N] * ((int *)b)[k + j*N];
        return;
    }
- You will need to add casting!!!
- You will need to duplicate the functions!!!
36Disadvantages
- Both solutions will require
- Duplicating the functions
- Adding if statements to check for the data type.
- Problems
- The code will be long and not easy to understand.
- If you change something in one function, you may forget to update the duplicated one.
37The proposed solution
- You can use a compilation flag to recompile your code according to the data type. For example:
    #ifdef INT
    typedef int elem;
    #elif defined(FLOAT)
    ...
    #endif
- Then gcc -DINT will compile it as int
- and gcc -DFLOAT will compile it as float.
38The proposed solution
- Write a Perl script named calc_parse.pl that will parse the input command line and do the following:
- recompile your code according to the desired data type
- pass the rest of the command line to the program mat_calc you have implemented in 4.2.
39Perl Script
- For example, the script will be invoked by the following command:
- perl calc_parse.pl DOUBLE c 1.in + t 2.in * t 3.in
- !!! Correct the exercise printout, in which the signs are missing.
40Guidelines
- If you have ideas for a more elegant solution, you are free to implement it.
- You will need to clearly describe the advantages of your solution over the proposed one.
41Question 4.4 calc_test
- Devise a testing program for your calculator, mat_calc, that will evaluate its performance on different data types.
- For each data type, create files with regular random matrices of sizes 4, 8, 16, 32, 64, 128, 256, 512, 1024.
- Test your calculator with random command lines of length 4, 8, 16, 32, 64, made up of 1/4 multiplications, 1/4 additions, 1/4 subtractions and 1/4 transposes.
- Measure the running times. Prepare a graph which, for each data type, plots the overall running times as a function of the matrix size (the curves of different data types should be in different colors).
42Submit
- The graph: plots the overall running times as a function of the matrix size. The curve of each data type is in a different color.
- The analysis: analyze the curves of the above graph to draw conclusions about the efficiency of different data types.
- The implementation: describe your implementation. You can implement it in Perl, in C, or part in Perl and part in C.
- The output: not for submission; it can be in any format convenient for you. Just list all the command lines that you have used for testing.
43Guidelines
- In exercise 4.4 you can really do whatever you want.
- You are free to choose
- The programming language
- The file layout
- The algorithm
- You need to describe all of these in the manual.
44The Project makefiles
- Write your own makefile, which should support the following commands:
- make matop
- make mat_calc
- make clean
- Do not add the -g and -pg flags, which may slow down the performance.
45General Project Guidelines
- Use your own judgment!!!
- Make your own design.
- Select the best approach. Describe its advantages and disadvantages.
- Make your own conclusions!!!
46Perl Programming Language
- Some slides are adapted from
- Efi Fogel (Tel-Aviv university)
- Jason Stajich (Duke University)
- http://www.id.cbs.dk/dh/perl/perlintro.html
47Why Perl?
- Practical Extraction and Report Language
- SIMPLE scripting language
- FAST text processing
- Cross-platform
- Facilitates very quick program development: many problems can be solved very quickly with surprisingly short programs.
- Designed to work smoothly in the same ways that natural language works smoothly
48Process Overview
- C/C++: Edit Hello.c, Compile to Hello.o, Link with the C libraries into the executable Hello, Run Hello.
- Perl: Edit Hello.pl, then Interpret and Run Hello (using the Perl packages).
49Perl "hello" example
- Simply say: print "Hello world!\n";
- To execute: perl -e "print \"Hello world\!\n\""
- Or edit the file hello to contain the one-liner program above, and type: perl hello
    #!/usr/bin/perl
    print "Hello world\n";
50Executing Perl
- The perl interpreter is usually located in /usr/bin/perl. The first line, #!/usr/bin/perl, tells the shell to look for the perl program and pass the rest of the file to it for execution.
- It can also be /usr/local/bin/perl. Check with: which perl
- use strict: strict error checking
- Generates a compile-time error if you use a bareword identifier that's not a predeclared subroutine.
- Generates a compile-time error if you access a variable that wasn't declared via my, or wasn't imported.
- Generates a runtime error if you use any symbolic references.
    #!/usr/bin/perl
    use strict;
    # The body of the script
    print "hello world\n";
51Perl Documentation
- Run man perl or perldoc perl to read the top-level page. If you know which section you want, you can go directly there by using man perlvar or perldoc perlvar.
- Other online tutorials
- E.g. http://www.perldoc.com/perl5.6/pod/func/
52Perl on Windows ActivePerl
- The ActivePerl distribution (developed by ActiveState Tool Corporation) is at http://www.activestate.com/.
- Perl for Win32: the ActivePerl binary comes as a self-extracting executable that uses the standard Win32 InstallShield setup wizard to guide you through the installation process.
- By default, Perl is installed into the directory C:\Perl\version, where version is the current version number (e.g., 5.005).
53Perl Data Types
54Variable Localization and Scope
- Perl variables can be either global or local
- $x = 1;
- default: global scope (a kind of static scope), visible throughout the program
- my $y = 2;
- my: lexical scope (a kind of static scope), visible only inside its block (like in C)
- local $z = 3;
- local: dynamic scope, visible inside its block and any subroutines called from that block
- Will depend on the order of the subroutine calls
55Scalar Variables
- Scalar variables can be strings or numbers
- For example
- $var1 = "aloha";
- $var2 = 5;
- Can use the same operators as C or Java
- $var2++;
- Can also include them in a single print statement
- print "$var1 and $var2";   # prints: aloha and 6
56Scalar Variables
- You do not have to declare variables in Perl.
- The first character of a variable indicates its type.
- $ means that the variable is a scalar, which can be a string, an integer or a real number.
- A scalar is initialized to 0 or the null string.
    $the_string = "hello world";
    print $the_string, "\n";
57Operators
- + - (addition, subtraction)
- * / % ** (multiply, divide, modulus, exponentiation)
- ++ -- (autoincrement, autodecrement)
- = += -= etc. (assignment operators)
58Function substr, x, .
- Returns a substring of a string
- substr(string, offset, length)
- Returns a substring of string starting at
specified offset of specified length
    $string = "Perl Programmer";
59Arrays
- Arrays do not have to be predeclared and their size does not have to be specified. Arrays hold scalar values.
- A variable beginning with the character @ is an array. To reference the first item of the array @x, we use $x[0] ($x[0] is a scalar).
60Arrays
- For an array @x, the special variable $#x indicates the highest index used in the array. So $x[$#x] will be the last element in array @x.
61Arrays
- Strings and arrays are closely related and there are two functions for converting from one to the other: split turns a string into an array, and join is the reverse operation.
62Perl - split
- split /PATTERN/, EXPR
- Scans a string given by EXPR for delimiters and splits the string into a list of substrings, returning the resulting list value in list context, or the count of substrings in scalar context.
    ($token1, $token3, $token4) = split(" ", $line);
63@ARGV Special Variable
- Stores command line arguments
- ./program one 2 "three"
- $ARGV[0] is one
- $ARGV[1] is 2
- $ARGV[2] is three
64Control Structures
- Perl's control structures are similar to those in C, except that all blocks must be bracketed.
    if ($x == 1) {
        print "ONE\n";
    } elsif ($x == 2) {
        print "TWO\n";
    } else {
        print "OTHER\n";
    }
65Loop Control
- The last command is like the break statement in C.
- The next command is like the continue statement in C.
- Any block can be given a label (by convention, in uppercase) which identifies the loop.
    WID: foreach $this (@ary1) {
        JET: foreach $that (@ary2) {
            next WID if ($this > $that);
            $this += $that;
        }
    }
66Basic I/O
    $a = <STDIN>;    # read the next line
    @a = <STDIN>;    # all remaining lines as a list, until ctrl-D
    while ($line = <STDIN>) {
        chomp($line);
        # other operations with $line here
    }
67Perl File Processing
- open(FILE_HANDLE, "<file_name");
- Open a file for reading
- my @array = <FILE_HANDLE>;
- Assign file contents to an array
- 1st line is $array[0], 2nd line is $array[1], etc.
- open(APPLE, ">banana");
- Open a file for writing
- print APPLE @array;
- Write contents of array to a file
- close(APPLE);
- Close file
68Perl - open
- open FILEHANDLE, EXPR
- open FILEHANDLE
    #!/usr/local/bin/perl
    $file = '/proc/cpuinfo';   # Name the file
    open(INFO, $file);         # Open the file
    @lines = <INFO>;           # Read it into an array
    close(INFO);               # Close the file
    print @lines;              # Print the array
69The Implicit Variable
When a line is read, it is stored in the special variable $_. This is a program to print all lines from a file:
    open(FILE1, $ARGV[0]);
    while (<FILE1>) {
        print $_;
    }
    close(FILE1);
- To read from stdin, we do not need to call open:
    while (<STDIN>) {
        print $_;
    }
70Perl - readline
- readline EXPR
- Reads from the filehandle contained in EXPR.
    $line = <STDIN>;
    $line = readline(STDIN);   # same thing
71File Test Operators
    -e "/usr/bin/perl" or warn "Perl is improperly installed\n";
    -f "/vmunix" and print "Congrats, we seem to be running BSD Unix\n";
72Regular Expressions
    while ($line = <FILE>) {
        if ($line =~ /http/) {   # // is the match operator, =~ the pattern binding operator
            print $line;
        }
    }
    # prints all lines from FILE that include the substring http
The pattern binding operator (=~) looks for a match of the regular expression on the right of the operator (/http/) in the string to the left of the operator ($line).
73Regular Expressions
- \s matches a space or tab
- ^ matches the start of a string
- $ matches the end of a string
- a matches the letter a
- a+ matches 1 or more a's
- a* matches 0 or more a's
- (ab)+ matches 1 or more ab's
- [^abc] matches a character that is not a or b or c
- [a-z] matches any lower case letter
- . matches any character
74Regular Expressions
- To test whether a string in $x contains the string "abc", we can use
- if ($x =~ /abc/) . . .
- To test whether a string begins with "abc",
- if ($x =~ /^abc/) . . .
- To test whether a string begins with a capital letter
- if ($x =~ /^[A-Z]/) . . .
- To test whether a string does not begin with a lower case letter
- if ($x =~ /^[^a-z]/) . . .
- In the above example, the first ^ matches the beginning of the string, while the ^ within the square brackets means "not".
75Regular Expressions
- We can change strings using a command of the form s/FROM/TO/options, where FROM is the matching regular expression and TO is what to change it to. options can either be blank (for the first match only) or g, meaning do it globally.
- To change all a's to b's in the string in variable $x: $x =~ s/a/b/g
- To change the first a to b: $x =~ s/a/b/
- To change all strings of consecutive a's into one a: $x =~ s/a+/a/g
- To remove all strings of consecutive a's: $x =~ s/a+//g
- To remove blanks from the start of a string: $x =~ s/^\s+//g
76The System Command
- Runs another program from within a Perl script.
- For example
- system("mlpl < 1.in >!1.out")
- system("make -f makefile.type mat_calc")
- system("mat_calc $command")
77Reading the output of other programs
- For example
- my @size = split(/\s+/, `wc allocate_free.c`);
- The array @size will store
- the number of lines ($size[0]),
- words ($size[1])
- chars ($size[2])
- existing in the file allocate_free.c.
78Input/Output Files
- Practicing the file types
79Binary and Text Files
- A file can be opened and processed as either a text file or a binary file
- Text files
- contain printable characters and control characters
- organized into lines
- system may convert or remove some input characters
- system may insert or convert some output characters
- Binary files
- contain a series of characters
- no characters translated on input or output
- used for files with binary or unprintable characters
80fopen
- FILE *fopen(const char *filename, const char *mode)
- Returns a pointer to the open FILE block if successful
- Returns the special value NULL if unsuccessful
- filename is a string constant or variable with the file name, optionally with drive and/or path.
- mode is a quoted string with options specifying how you plan to use the file
81fclose
- The OS has a limit on the number of files open at one time
- An open file ties up resources (memory, etc.)
- Conclusion: close a file as soon as the program is done with it.
- Format: int fclose(FILE *fp)
- Returns zero if ok, EOF if there was a problem
- fp is the file pointer for an open file
82fread
- size_t fread(buffer, size, count, fp)
- Returns the number of items actually read; if it is less than requested, you must use the feof or ferror function to determine whether end-of-file or an error occurred
- buffer: character array to receive the input
- size: size in bytes of each item
- count: maximum number of items to be read
- fp: file pointer to the input file
- Don't forget to allocate the memory!!!
83fwrite
- size_t fwrite(buffer, size, count, fp)
- The fwrite function writes a requested number of items of a specified size. Returns the number of items actually written; if it is less than requested, an error occurred
- buffer: character array containing the data to be written
- size: size in bytes of each item
- count: number of items to write
- fp: file pointer to the open output file
84Example
- A program that
- Creates an array and a matrix
- Writes them in binary mode to the first file
- Reads them from the first file and writes to the
second
85Example (continued)
Define and initialize two structures: an array and a matrix.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <assert.h>

    int main(int argc, char **argv)
    {
        int cal[80], flo[5][100];
        int i = 0, j = 0;
        FILE *filea, *fileb;

        for (i = 0; i < 80; i++)
            cal[i] = i;

        for (i = 0; i < 5; i++)
            for (j = 0; j < 100; j++)
                flo[i][j] = (i * 100) + j;
86Example (continued)
Open the first file for writing and write the structures.
        /* write the data to the first file */
        filea = fopen(argv[1], "wb");
        /* write one 80-int record */
        i = fwrite(cal, 80 * sizeof(int), 1, filea);
        /* write five 100-int records */
        i = fwrite(flo, 100 * sizeof(int), 5, filea);
        fclose(filea);
Open the first file for reading and read the structures.
        /* read the data from the first file */
        filea = fopen(argv[1], "rb");
        /* read one 80-int record; fread returns the number of records, not bytes */
        i = fread(cal, 80 * sizeof(int), 1, filea);
        printf("Number of records read %d\n", i);
        /* read five 100-int records */
        i = fread(flo, 100 * sizeof(int), 5, filea);
        printf("Number of records read %d\n", i);
        fclose(filea);
87Example (continued)
        for (i = 0; i < 80; i++)
            printf("%d ", cal[i]);
        printf("\n");
        printf("\nThe flo data read from the first file is \n");
        for (i = 0; i < 5; i++)
            for (j = 0; j < 100; j++)
                printf("%d ", flo[i][j]);
        printf("\n");
        /* write the data to the second file */
        fileb = fopen(argv[2], "wb");
        i = fwrite(cal, 80 * sizeof(int), 1, fileb);
        i = fwrite(flo, 100 * sizeof(int), 5, fileb);
        fclose(fileb);
        return 0;
    }
88Time Measurements
89The C clock function
- clock_t clock(void): returns the processor time used by the program since the beginning of execution, or -1 if unavailable.
- clock()/CLOCKS_PER_SEC is a time in seconds.
    #include <time.h>
    clock_t t1, t2;
    t1 = clock();
    mult_ijk(a, b, c, n);
    t2 = clock();
    printf("The running time is %lf seconds\n",
           (double)(t2 - t1) / (CLOCKS_PER_SEC));
90The Unix time function
Note that there is also a C function called time; see man 2 time.
- /usr/bin/time <command>
- The time command runs the specified program command with the given arguments.
- When command finishes, time writes a message to standard error giving timing statistics about this program run and its system resource usage.
91Quiz ?
- What information can be retrieved only with gprof or clock()?
- What information can be retrieved only with /usr/bin/time?
92The Unix time function
- The presented statistics include
- the elapsed real time between invocation and termination (overall running time)
- the user CPU time (time used by the program itself and any library subroutines it calls)
- the system CPU time (the time used by system calls invoked by the program, directly or indirectly).
- Many more
93Example time -v matop ..
94Perl Tips
- Remember the example
- my @size = split(/\s+/, `wc allocate_free.c`);
- The array @size will store the number of lines ($size[0]), words ($size[1]) and chars ($size[2]) in the file allocate_free.c.
95Good Luck in the Project!!!