Title: Mod_perl : Performance CGI in Apache
1 Mod_perl: Performance CGI in Apache
mod_perl is more than CGI scripting on steroids.
It is a whole new way to create dynamic content
by utilizing the full power of the Apache web
server to create stateful sessions, customized
user authentication systems, smart proxies and
much more. Yet, magically, your old CGI scripts
will continue to work and work very fast indeed.
With mod_perl you give up nothing and gain so
much! -- Lincoln Stein
2 Classical Perl-CGI
- Conventional Perl CGI scripts are compiled,
interpreted, and executed like any other Perl
script.
- Every time a Perl script is run, it is translated
(interpreted/compiled) into opcodes, then
executed.
- The translation step takes time.
- Any database connections, filehandles, or other
similar resources exist only for the life of
the particular instance of the script being run.
3 Classical Perl-CGI
- In CGI, a Perl script's output is directed to the
user's browser via the web server, instead of
the usual STDOUT (screen).
> top i
- Each time a script is run, it becomes its own
process.
- Each process requires its own compilation and
memory space, even if the script is the same.
4 Classical Perl-CGI Performance Scenario
- Consider the following scenario:
- A website served from a single server.
- 20 Perl CGI scripts which each serve 5 clients /
second.
- Each script loads 5 modules.
- Each script creates a database connection.
- Each script accesses files on the filesystem.
- The time that it takes to compile/interpret/run
each script is 2 seconds.
- Each instance of the script requires 10 MB of
memory.
- The effect on the server would be:
- 100 Perl processes compiling, interpreting, and
running (5 processes / script / second).
- 200 seconds of CPU time consumed.
- 1 GB of memory used.
- 100 database connections created and destroyed.
- Solution:
- Compile the scripts once.
- Serve clients from a single cached version of
the scripts.
- Share the database connections.
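The arithmetic behind this scenario can be checked with a short script. This is only a sketch: the cgi_load function and its parameter names are made up for illustration, and the input figures are simply the ones listed in the scenario.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the per-second server load for plain CGI
# from the scenario figures.
sub cgi_load {
    my (%s) = @_;
    my $processes = $s{scripts} * $s{clients_per_script};   # new processes per second
    return (
        processes      => $processes,
        cpu_seconds    => $processes * $s{seconds_per_run},
        memory_mb      => $processes * $s{mb_per_process},
        db_connections => $processes,                        # one connection per process
    );
}

my %load = cgi_load(
    scripts            => 20,
    clients_per_script => 5,
    seconds_per_run    => 2,
    mb_per_process     => 10,
);
printf "%-15s %d\n", $_, $load{$_} for sort keys %load;
```

The memory figure comes out as 1000 MB, which is the roughly 1 GB quoted in the scenario.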
5 Why mod_perl?
- Problems with conventional Perl-CGI
- Compilation of script for each request (slow)
- New process created for each request
(resource-intensive)
- No easy way to share commonly used resources such
as modules, data memory, database connections,
etc.
- Limited integration with Apache server, limited
control of Apache modules, services, and
functions.
- mod_perl's solutions and features
- Speed and Efficiency
- The standard Apache::Registry module can provide
100x speedups for existing CGI scripts and reduce
the load on the server at the same time.
- Scripts are wrapped as subroutines within a
handler in the server module, which execute
faster.
- Shared Resources
- Share database connections.
- Share memory.
- Server control / Customization
- Apache can be controlled using existing modules.
- Custom modules and handlers can be easily written
to extend server functionality.
- Control over request stages
- Rewrite URLs in Perl based on the content of
directories, settings stored in a database, or
anything else conceivable.
- Maintenance of state within the server memory.
6 Approaches to Perl Coding
- One-off scripting and the "one-pot" approach
- Sufficient for a non-persistent script
- One set of output based on one set of inputs
- Subroutines can access and modify the globally
available data.
- Programming by "passing the buck"
- Better for a persistent program
- Input/Output dynamic based on parameters
- Subroutines should only be able to access global
data under certain conditions
- [Figure: two diagrams of Input and Output flowing through Main (variables, functions, subroutines) and through separate Subroutines]
7 Nested Subroutines in Perl
- nested.pl
- -----------
- #!/usr/bin/perl -w
- use diagnostics;
- use strict;
- sub print_power_of_2 {
-     my $x = shift;
-     sub power_of_2 {
-         return $x ** 2;
-     }
-     my $result = power_of_2();
-     print "$x^2 = $result\n";
- }
- print_power_of_2(5);
- print_power_of_2(6);
- The script should print the square of the numbers
passed to it, but the second result is wrong:
- ./nested.pl
- 5^2 = 25
- 6^2 = 25
- If we enable warnings (-w) we get the
warning:
- Variable "$x" will not stay shared at ./nested.pl
line 9.
- If we use diagnostics.pm we get:
- (W) An inner (nested) named subroutine is
referencing a lexical variable defined in an outer
subroutine.
- When the inner subroutine is called, it will
probably see the value of the outer subroutine's
variable as it was before and during the first call
to the outer subroutine; in this case, after the
first ...
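The behaviour described by the diagnostic can be reproduced in a self-contained script. The sketch below uses made-up subroutine names (square_with_named_inner, named_inner) and switches off the 'closure' warning category only so that the demonstration runs quietly.

```perl
use strict;
use warnings;
no warnings 'closure';   # the "will not stay shared" warning is expected here

sub square_with_named_inner {
    my $x = shift;

    # Named subs are compiled once, so this one stays bound to the $x
    # that existed during the first call of the outer subroutine.
    sub named_inner { return $x ** 2 }

    return named_inner();
}

my @results = map { square_with_named_inner($_) } (5, 6);
print "5^2 = $results[0]\n";
print "6^2 = $results[1]\n";   # still the square of 5, not of 6
```

Both results are 25 because the inner subroutine never sees the second value of $x.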
8 How mod_perl Works
- mod_perl is a binary module extension which
provides Apache with a built-in Perl
interpreter.
- Requests which map to directories assigned to
mod_perl are serviced by Perl packages called
handlers.
- The handler is compiled by the built-in
interpreter and cached in memory.
- The most important mod_perl handler is called
Apache::Registry.
- The Apache server loads a parent server
process (httpd), and this process forks a
specified number of children.
- Each process contains the mod_perl module and can
serve requests.
- The children can share memory from the parent.
9 Content Handlers
- All content handlers in mod_perl must have the
handler subroutine.
- To add the handler to the server configuration,
the httpd.conf file must be modified and the
server restarted.
- /usr/local/apache/conf/httpd.conf
- In Red Hat 9, httpd.conf is moved, and the mod_perl
configuration is in another file.
- /etc/httpd/conf/httpd.conf
- /etc/httpd/conf.d/perl.conf
- The following configuration snippet is added to
httpd.conf or perl.conf:
- PerlModule ModPerl::Rules1
- <Location /mod_perl_rules1>
- SetHandler perl-script
- PerlHandler ModPerl::Rules1
- PerlSendHeader On
- </Location>
- ModPerl/Rules1.pm
- -----------------
- package ModPerl::Rules1;
- use Apache::Constants qw(:common);
- sub handler {
-     print "Content-type: text/plain\n\n";
-     print "mod_perl rules!\n";
-     return OK; # We must return a status to mod_perl
- }
- 1; # This is a Perl module so we must return true to perl
- ModPerl/Rules2.pm
- ----------------
- package ModPerl::Rules2;
- use Apache::Constants qw(:common);
- sub handler {
-     my $r = shift;
-     ...
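The second listing stops right after the handler receives the request object. One plausible way for such a handler to continue, assuming the mod_perl 1.x request methods send_http_header() and print(), is sketched below. To keep the sketch runnable outside Apache, OK is defined locally as a stand-in for the Apache::Constants value, and MockRequest is a hypothetical test double, not part of mod_perl.

```perl
use strict;
use warnings;

package ModPerl::Rules2Sketch;
use constant OK => 0;   # stand-in for Apache::Constants::OK outside Apache

sub handler {
    my $r = shift;                          # the request object
    $r->send_http_header('text/plain');     # let the request object send the header
    $r->print("mod_perl rules!\n");         # write the body through the request object
    return OK;                              # status returned to mod_perl
}

package MockRequest;    # hypothetical test double that records the output
sub new              { return bless { out => '' }, shift }
sub send_http_header { my ($self, $type) = @_; $self->{out} .= "Content-type: $type\n\n"; return }
sub print            { my ($self, @text) = @_; $self->{out} .= join '', @text; return }
sub output           { return $_[0]{out} }

package main;
my $request = MockRequest->new;
my $status  = ModPerl::Rules2Sketch::handler($request);
my $body    = $request->output;
print $body;
```

Compared with the first handler, which prints the header by hand, this style goes through the request object for both the header and the body.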
10 Apache::Registry / ModPerl::Registry
- counter.pl
- ----------
- #!/usr/bin/perl -w
- use CGI qw(:all);
- use strict;
- print header;
- my $counter = 0; # redundant
- for (1..5) {
-     increment_counter();
- }
- sub increment_counter {
-     $counter++;
-     print "Counter is equal to $counter !", br;
- }
- To use this script in mod_perl's
Apache::Registry, we must save the file in the
appropriate directory specified in the directive
in httpd.conf / perl.conf.
- Standard Apache installation
- <Location /perl>
- SetHandler perl-script
- PerlHandler Apache::Registry
- Options ExecCGI
- PerlSendHeader On
- </Location>
- Red Hat 9 (Apache 2.0)
- <Directory /var/www/perl>
- SetHandler perl-script
- PerlModule ModPerl::Registry
- PerlHandler ModPerl::Registry::handler
- Options ExecCGI
- </Directory>
11 Apache::Registry / ModPerl::Registry Continued
- package Apache::ROOT::perl::counter_2epl;
- use Apache qw(exit);
- sub handler {
-
-     use strict;
-
-     print header;
-
-     my $counter = 0; # redundant
-     for (1..5) {
-         increment_counter();
-     }
-
-     sub increment_counter {
-         $counter++;
-         print "Counter is equal to $counter !\r\n";
-     }
- }
- The script counter.pl is compiled into the
package Apache::ROOT::perl::counter_2epl and is
wrapped into this package's handler subroutine.
- We would expect to see the output:
- Counter is equal to 1 !
- Counter is equal to 2 !
- Counter is equal to 3 !
- Counter is equal to 4 !
- Counter is equal to 5 !
- After some reloading, we start to get strange
results, with the counter starting at higher
numbers like 6, 11, 16 and so on:
- Counter is equal to 6 !
- Counter is equal to 7 !
- Counter is equal to 8 !
- Counter is equal to 9 !
- Counter is equal to 10 !
- The major cause of this bug is the nested
subroutine: increment_counter() is now a named
subroutine inside handler(). The non-linearity of
the buggy output is caused by the requests being
served by different children.
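The effect of the wrapping can be simulated without Apache. In the sketch below, handler() is a hand-written stand-in for the generated wrapper, and calling it twice plays the role of two requests served by the same child process. The counter values are returned instead of printed so they are easy to inspect.

```perl
use strict;
use warnings;
no warnings 'closure';   # the nested named sub below is the point of the demo

# Stand-in for the handler that Apache::Registry wraps around counter.pl.
sub handler {
    my $counter = 0;          # a fresh lexical on every request ...
    my @seen;
    push @seen, increment_counter() for 1 .. 5;

    # ... but the named sub keeps using the $counter of the first request.
    sub increment_counter { return ++$counter }

    return @seen;
}

my @first  = handler();   # first request served by this "child"
my @second = handler();   # second request served by the same "child"
print "first:  @first\n";
print "second: @second\n";
```

The second call continues at 6 because increment_counter() is still incrementing the variable from the first call. A child that has served more requests would start even higher, which matches the uneven numbers seen in the browser.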
12 Solving the Nested Subroutine Problem: Anonymous
subs, Scoping
- anonymous.pl
- --------------
- #!/usr/bin/perl
- use strict;
- sub print_power_of_2 {
-     my $x = shift;
-     my $func_ref = sub {
-         return $x ** 2;
-     };
-     my $result = $func_ref->();
-     print "$x^2 = $result\n";
- }
- print_power_of_2(5);
- print_power_of_2(6);
- Change the named inner nested subroutine to an
anonymous subroutine.
- The anonymous subroutine sees the variables in
the same lexical context, at any moment that it
is called.
- The $x variable is in the same lexical scope as
the anonymous subroutine call, so it sees the
variable and its value at any given moment.
- Acts like a closure
- ./anonymous.pl
- 5^2 = 25
- 6^2 = 36
13 Solving the Nested Subroutine Problem: Package
Scoped Variables
- multirun.pl
- -----------
- #!/usr/bin/perl
- use strict;
- use warnings;
- for (1..2) {
-     print "run time $_\n";
-     run();
- }
- sub run {
-     my $counter = 0;           # lexical: shows the bug
-     # our $counter = 0;        # alternative: package-scoped
-     # local our $counter = 0;  # alternative: package-scoped and localized
-     increment_counter();
-     increment_counter();
-     sub increment_counter {
-         $counter++;
-         print "Counter is equal to $counter !\n";
-     }
- }
- When the script is run using the lexically scoped
$counter variable we get:
- Variable "$counter" will not stay shared at
./multirun.pl line 18.
- run time 1
- Counter is equal to 1 !
- Counter is equal to 2 !
- run time 2
- Counter is equal to 3 !
- Counter is equal to 4 !
- The $counter variable in the named subroutine
remains bound to the first instance of the variable
(named subs are compiled once).
- If we use our to scope $counter to the package
it works:
- run time 1
- Counter is equal to 1 !
- Counter is equal to 2 !
- run time 2
- Counter is equal to 1 !
- Counter is equal to 2 !
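The package-scoped variant can be tried in isolation. This sketch keeps the nested layout that a Registry wrapper would produce, but returns the counter values instead of printing them. The structure of run() and increment_counter() is an illustration, not a copy of the slide's full script.

```perl
use strict;
use warnings;

sub run {
    our $counter = 0;    # package variable $main::counter, reset on every run
    return (increment_counter(), increment_counter());

    # Still nested, as it would be inside a Registry handler, but $counter
    # is now looked up in the package rather than captured as a lexical.
    sub increment_counter { return ++$counter }
}

for my $time (1 .. 2) {
    my @values = run();
    print "run time $time: @values\n";
}
```

Using local our $counter = 0 instead would additionally restore the previous value of the package variable when run() returns.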
14 Solving the Nested Subroutine Problem: Parameter
Passing, References
- multirun3.pl
- ------------
- #!/usr/bin/perl
- use strict;
- use warnings;
- for (1..3) {
-     print "run time $_\n";
-     run();
- }
- sub run {
-     my $counter = 0;
-     $counter = increment_counter($counter);
-     $counter = increment_counter($counter);
-     ...
- multirun4.pl
- ------------
- #!/usr/bin/perl
- use strict;
- use warnings;
- for (1..3) {
-     print "run time $_\n";
-     run();
- }
- sub run {
-     my $counter = 0;
-     increment_counter(\$counter);
-     increment_counter(\$counter);
-     ...
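Both listings are cut off before the body of increment_counter(). A minimal sketch of how the two styles might be completed follows. The helper names increment_by_value and increment_by_ref are invented here so that both variants fit in one file.

```perl
use strict;
use warnings;

# Variant 1 (multirun3.pl style): pass the value in, get the new value back.
sub increment_by_value {
    my $counter = shift;
    return $counter + 1;
}

# Variant 2 (multirun4.pl style): pass a reference and modify through it.
sub increment_by_ref {
    my $counter_ref = shift;
    $$counter_ref++;
    return;
}

sub run {
    my $by_value = 0;
    $by_value = increment_by_value($by_value) for 1 .. 2;

    my $by_ref = 0;
    increment_by_ref(\$by_ref) for 1 .. 2;

    return ($by_value, $by_ref);
}

for my $time (1 .. 3) {
    my ($v, $r) = run();
    print "run time $time: by value = $v, by reference = $r\n";
}
```

Because neither helper touches a variable from an enclosing subroutine, every run starts from 0 again regardless of how the code is wrapped.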
15 Porting example
- param_printer.pl
- -----------------------
- #!/usr/bin/perl -w
- use strict;
- use CGI qw(:standard);
- front_page() if !param();
- my $opt_p = param('p') || 20; # primer size
- my $opt_a = param('a') || 2;  # primer size range
- my $opt_t = param('t') || 60; # opt. tm
- my $opt_b = param('b') || 5;  # tm range
- my $opt_y = param('y') || 5;  # primer sets per exon
- print header;
- print_options();
- print end_html;
- When this script is run in mod_perl, it is
wrapped in the handler subroutine of the package.
Because of the inner subroutine problem, we get the
same initial parameters repeatedly.
- Since the variables follow a distinct pattern we
can use command-line Perl and a regex to convert
them to a hash.
- perl -i.bak -pe 's/\$opt_(\w)/\$opt{$1}/g'
param_printer.pl
- The my scoping must be removed from the hash
assignments.
- We declare the hash %opt and then pass the
options into the subroutine.
- param_printer.pl (converted)
- -----------------------
- #!/usr/bin/perl -w
- use strict;
- use CGI qw(:standard);
- front_page() if !param();
- my %opt;
- $opt{p} = param('p') || 20; # primer size
- $opt{a} = param('a') || 2;  # primer size range
- $opt{t} = param('t') || 60; # opt. tm
- $opt{b} = param('b') || 5;  # tm range
- $opt{y} = param('y') || 5;  # primer sets per exon
- print header;
- print_options(%opt);
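The conversion step can be tried on a few sample lines before running it over a real file. The sketch below applies the same kind of substitution and then strips the leading my. The convert_line helper and the sample lines (including the || defaults) are assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Turn "$opt_x" scalars into "$opt{x}" hash elements, then drop the leading
# "my", which is not valid in front of a hash element assignment.
sub convert_line {
    my ($line) = @_;
    $line =~ s/\$opt_(\w)/\$opt{$1}/g;
    $line =~ s/^(\s*)my\s+(?=\$opt\{)/$1/;
    return $line;
}

my @sample = (
    q[my $opt_p = param('p') || 20;],
    q[my $opt_a = param('a') || 2;],
);
print convert_line($_), "\n" for @sample;
```

The pattern only matches single-character option names, which is all the script uses. Longer names would need \w+ instead of \w.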
16 Porting example continued
- #!/usr/bin/perl -w
- use strict;
- use CGI qw(:standard);
- front_page() if !param();
- my %opt;
- $opt{p} = param('p') || 20; # primer size
- $opt{a} = param('a') || 2;  # primer size range
- $opt{t} = param('t') || 60; # opt. tm
- $opt{b} = param('b') || 5;  # tm range
- $opt{y} = param('y') || 5;  # primer sets per exon
- my $text = "These are the parameters";
- my @array = split (param('a'));
- print header;
- If we want to pass more than one variable of
different types (arrays, scalars, and hashes)
into the subroutine, we can use references.
- The references will cause mod_perl to hold on to
the data that they reference.
- We should use local our to clean up those
references after they are used.
- local our %opt;
- ...
- local our $text = "These are the parameters";
- local our @array = split (param('a'));
- print header;
- print_options(\%opt, \$text, \@array);
- print end_html;
- sub print_options {
-     print $text, br, "$opt{p}, $opt{a}, $opt{t}, $opt{b}, $opt{y}", br;
-     print join (@array);
- }
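Passing a hash, a scalar, and an array by reference, and dereferencing them inside the subroutine, might look like the following sketch. The format_options name, the sample values, and the output layout are illustrative only.

```perl
use strict;
use warnings;

# Receives three references and dereferences each one explicitly, so the
# subroutine does not depend on any variable from an enclosing scope.
sub format_options {
    my ($opt_ref, $text_ref, $array_ref) = @_;

    my @pairs;
    push @pairs, "$_=$opt_ref->{$_}" for sort keys %$opt_ref;

    return join "\n", $$text_ref, join(', ', @pairs), join(' ', @$array_ref);
}

my %opt   = (p => 20, a => 2);
my $text  = "These are the parameters";
my @array = (1, 2, 3);

my $out = format_options(\%opt, \$text, \@array);
print "$out\n";
```

Here the subroutine works only with what it is handed, whereas the print_options() shown earlier reads the package variables directly.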
17 Database Connections: Apache::DBI
- In regular CGI, the script which connects to the
database creates its own connection in every
instance it is run.
- If 20 scripts are accessed 10 times each, that's
200 database connections which are created and
destroyed.
- Database connections are expensive.
- To mitigate this shortcoming, use Apache::DBI,
which allows persistent database connections to
be created in mod_perl.
- The DBI module will check the $ENV{MOD_PERL}
environment variable. If Apache::DBI has been
loaded, it forwards connect() requests to it.
- The disconnect() method is overloaded to do
nothing.
- To load Apache::DBI, it should be loaded in
httpd.conf / perl.conf:
- PerlModule Apache::DBI
- After that, you program DBI just as you normally
would.
- The use DBI statement can remain in your
scripts.
- use DBI;
- $dbh = DBI->connect($data_source, $username, $auth, \%attr);
- $sth = $dbh->prepare($statement);
- $rv = $sth->execute;
- @row_ary = $dbh->selectrow_array($statement);
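The idea of a persistent connection can be illustrated without a database. The sketch below is not how Apache::DBI is implemented. It only shows the general caching pattern: one handle is kept per distinct set of connect arguments, and repeated requests reuse it. expensive_connect, cached_connect, and the DSN string are hypothetical.

```perl
use strict;
use warnings;

my %cache;          # one cached handle per distinct set of connect arguments
my $created = 0;    # how many "expensive" connections were really opened

# Hypothetical stand-in for an expensive DBI->connect call.
sub expensive_connect {
    my ($dsn, $user) = @_;
    $created++;
    return { dsn => $dsn, user => $user };
}

# Return the cached handle if these arguments were seen before.
sub cached_connect {
    my ($dsn, $user) = @_;
    my $key = join "\0", $dsn, $user;
    $cache{$key} ||= expensive_connect($dsn, $user);
    return $cache{$key};
}

# 200 requests from one long-lived process, all with the same arguments.
cached_connect('dbi:mysql:test', 'web') for 1 .. 200;
print "connections opened: $created\n";
```

In a real server each child process would hold its own cache, so the number of open connections grows with the number of children rather than with the number of requests.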
18 Sharing Memory: Aliasing
- A package is created with a hash that contains
configuration parameters for some scripts.
- We want to be able to use this hash in other
scripts.
- package My::Config;
- use strict;
- use vars qw(%c);
- %c = (
-     dir => {
-         cgi  => "/home/httpd/perl",
-         docs => "/home/httpd/docs",
-         img  => "/home/httpd/docs/images",
-     },
-     url => {
-         cgi  => "/perl",
-         docs => "/",
-         img  => "/images",
-     },
-     color => {
-         hint   => "#777777",
-         warn   => "#990066",
-         normal => "#000000",
-     },
- );
- 1;
- A script that uses the configuration:
- use strict;
- use My::Config ();
- use vars qw(%c);
- *c = \%My::Config::c;
- print "Content-type: text/plain\r\n\r\n";
- print "My url docs root: $c{url}{docs}\n";
- The *c glob has been aliased with
\%My::Config::c, a reference to a hash. From now
on, %My::Config::c and %c are the same hash and
you can read from or modify either of them.
- Any script that you use can share this variable.
- You can also use the @EXPORT and @EXPORT_OK
arrays in your package to export the variables
that you want to share.
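Glob aliasing can be verified in a single file. The sketch below uses a cut-down configuration hash and puts the package in its own block so that only the glob assignment connects the two names. The package name and the keys follow the slide, while the values are shortened for the demonstration.

```perl
use strict;
use warnings;

{
    package My::Config;
    our %c = (url => { docs => "/" });   # cut-down configuration hash
}

package main;
use vars qw(%c);
*c = \%My::Config::c;   # %main::c is now another name for %My::Config::c

$c{url}{docs} = "/docs";   # modify through the alias
print "My::Config sees: $My::Config::c{url}{docs}\n";
```

A change made through %c is visible through %My::Config::c because both names refer to one hash, not to a copy.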
19 Server Configuration: httpd.conf / perl.conf
- ##
- ## Server-Pool Size Regulation (MPM specific)
- ##
- # prefork MPM
- # StartServers: number of server processes to start
- # MinSpareServers: minimum number of server processes which are kept spare
- # MaxSpareServers: maximum number of server processes which are kept spare
- # MaxClients: maximum number of server processes allowed to start
- # MaxRequestsPerChild: maximum number of requests a server process serves
- <IfModule prefork.c>
- StartServers 1
- MinSpareServers 1
- MaxSpareServers 1
- MaxClients 1
- MaxRequestsPerChild 1000
- </IfModule>
- # worker MPM
- #
- # Mod_perl incorporates a Perl interpreter into the Apache web server,
- # so that the Apache web server can directly execute Perl code.
- # Mod_perl links the Perl runtime library into the Apache web server
- # and provides an object-oriented Perl interface for Apache's C
- # language API. The end result is a quicker CGI script turnaround
- # process, since no external Perl interpreter has to be started.
- #
- LoadModule perl_module modules/mod_perl.so
- PerlRequire /etc/httpd/conf/start-up.pl
- # This will allow execution of mod_perl to compile your scripts to
- # subroutines which it will execute directly, avoiding the costly
- # compile process for most requests.
- Alias /perl /var/www/perl
- <Directory /var/www/perl>
- SetHandler perl-script
- PerlHandler ModPerl::Registry::handler
- </Directory>
20 Performance Tuning: Startup.pl
- use lib("/var/www/perl");
- use Multisage::Config ();
- use DBI ();
- use CGI ();
- CGI->compile(':all');
21 Mod Perl API / Packages
- Apache::Session - Maintain session state across
HTTP requests
- Apache::DBI - Initiate a persistent database
connection
- Apache::Watchdog::RunAway - Hanging Processes
Monitor and Terminator
- Apache::VMonitor - Visual System and Apache
Server Monitor
- Apache::GTopLimit - Limit Apache httpd processes
- Apache::Request (libapreq) - Generic Apache
Request Library
- Apache::RequestNotes - Allow Easy, Consistent
Access to Cookie and Form Data Across Each
Request Phase
- Apache::PerlRun - Run unaltered CGI scripts under
mod_perl
- Apache::RegistryNG - Apache::Registry New
Generation
- Apache::RegistryBB - Apache::Registry Bare Bones
- Apache::OutputChain - Chain Stacked Perl
Handlers
- Apache::Filter - Alter the output of previous
handlers
- Apache::GzipChain - Compress HTML (or anything)
in the OutputChain
- Apache::Gzip - Auto-compress web files with Gzip
- Apache::PerlVINC - Allows Module Versioning in
Location blocks and Virtual Hosts
- Apache::LogSTDERR
- Apache::RedirectLogFix
- Apache::SubProcess
- Module::Use - Log and Load used Perl modules
22 Conclusions
- mod_perl has to be done right
- Take care of nested subroutines
- Go to perl.apache.org