perlfaq3 - Programming Tools ($Revision: 1.28 $, $Date: 1998/07/16 22:08:49 $)
This section of the FAQ answers questions related to programmer tools and programming support.
Have you looked at CPAN (see perlfaq2)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate man pages? Here's a brief index:
Basics perldata, perlvar, perlsyn, perlop, perlsub
Execution perlrun, perldebug
Functions perlfunc
Objects perlref, perlmod, perlobj, perltie
Data Structures perlref, perllol, perldsc
Modules perlmod, perlmodlib, perlsub
Regexps perlre, perlfunc, perlop, perllocale
Moving to perl5 perltrap, perl
Linking w/C perlxstut, perlxs, perlcall, perlguts, perlembed
Various http://www.perl.com/CPAN/doc/FMTEYEWTK/index.html
(not a man-page but still useful)
perltoc provides a crude table of contents for the perl man page set.
The typical approach uses the Perl debugger, described in the perldebug(1) man page, on an ``empty'' program, like this:
perl -de 42
Now just type in any legal Perl code, and it will be immediately evaluated. You can also examine the symbol table, get stack backtraces, check variable values, set breakpoints, and other operations typically found in symbolic debuggers.
In general, no. The Shell.pm module (distributed with perl) makes perl try commands which aren't part of the Perl language as shell commands. perlsh from the source distribution is simplistic and uninteresting, but may still be what you want.
Have you used -w
? It enables warnings for dubious practices.
Have you tried use strict
? It prevents you from using symbolic references, makes you predeclare any subroutines that you call as bare words, and (probably most importantly) forces you to predeclare your variables with my
or use vars
.
Did you check the returns of each and every system call? The operating system (and thus Perl) tells you whether they worked or not, and if not why.
open(FH, "> /etc/cantwrite")
or die "Couldn't write to /etc/cantwrite: $!\n";
Did you read perltrap? It's full of gotchas for old and new Perl programmers, and even has sections for those of you who are upgrading from languages like awk and C.
Have you tried the Perl debugger, described in perldebug? You can step through your program and see what it's doing and thus work out why what it's doing isn't what it should be doing.
You should get the Devel::DProf module from CPAN, and also use Benchmark.pm from the standard distribution. Benchmark lets you time specific portions of your code, while Devel::DProf gives detailed breakdowns of where your code spends its time.
Here's a sample use of Benchmark:
use Benchmark;
@junk = `cat /etc/motd`;
$count = 10_000;
timethese($count, {
'map' => sub { my @a = @junk;
map { s/a/b/ } @a;
return @a
},
'for' => sub { my @a = @junk;
local $_;
for (@a) { s/a/b/ };
return @a },
});
This is what it prints (on one machine--your results will be dependent on your hardware, operating system, and the load on your machine):
Benchmark: timing 10000 iterations of for, map...
for: 4 secs ( 3.97 usr 0.01 sys = 3.98 cpu)
map: 6 secs ( 4.97 usr 0.00 sys = 4.97 cpu)
The B::Xref module, shipped with the new, alpha-release Perl compiler (not the general distribution prior to the 5.005 release), can be used to generate cross-reference reports for Perl programs.
perl -MO=Xref[,OPTIONS] scriptname.plx
There is no program that will reformat Perl as much as indent(1) does for C. The complex feedback between the scanner and the parser (this feedback is what confuses the vgrind and emacs programs) makes it challenging at best to write a stand-alone Perl parser.
Of course, if you simply follow the guidelines in perlstyle, you shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode for emacs can provide a remarkable amount of help with most (but not all) code, and even less programmable editors can provide significant assistance.
If you are used to using vgrind program for printing out nice code to a laser printer, you can take a stab at this using http://www.perl.com/CPAN/doc/misc/tips/working.vgrind.entry, but the results are not particularly satisfying for sophisticated code.
There's a simple one at http://www.perl.com/CPAN/authors/id/TOMC/scripts/ptags.gz which may do the trick.
For a complete version of Tom Christiansen's vi configuration file, see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc, the standard benchmark file for vi emulators. This runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter -- see http://www.perl.com/CPAN/src/misc.
Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the perl debugger built in. These should come with the standard Emacs 19 distribution.
In the perl source directory, you'll find a directory called "emacs", which contains a cperl-mode that color-codes keywords, provides context-sensitive help, and other nifty things.
Note that the perl-mode of emacs will have fits with "main'foo"
(single quote), and mess up the indentation and hilighting. You should be using "main::foo"
in new Perl code anyway, so this shouldn't be an issue.
The Curses module from CPAN provides a dynamically loadable object module interface to a curses library. A small demo can be found at the directory http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/rep; this program repeats a command and updates the screen as needed, rendering rep ps axu similar to top.
Tk is a completely Perl-based, object-oriented interface to the Tk toolkit that doesn't force you to use Tcl just to get at Tk. Sx is an interface to the Athena Widget set. Both are available from CPAN. See the directory http://www.perl.com/CPAN/modules/by-category/08_User_Interfaces/
Invaluable for Perl/Tk programming are: the Perl/Tk FAQ at http://w4.lns.cornell.edu/~pvhp/ptk/ptkTOC.html , the Perl/Tk Reference Guide available at http://www.perl.com/CPAN-local/authors/Stephen_O_Lidie/ , and the online manpages at http://www-users.cs.umn.edu/~amundson/perl/perltk/toc.html .
The http://www.perl.com/CPAN/authors/id/SKUNZ/perlmenu.v4.0.tar.gz module, which is curses-based, can help with this.
See the next questions.
The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Chapter 8 in the Camel has some efficiency tips in it you might want to look at. Jon Bentley's book ``Programming Pearls'' (that's not a misspelling!) has some good tips on optimization, too. Advice on benchmarking boils down to: benchmark and profile to make sure you're optimizing the right part, look for better algorithms instead of microtuning your code, and when all else fails consider just buying faster hardware.
A different approach is to autoload seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C is the use of modules that have critical sections written in C (for instance, the PDL module from CPAN).
In some cases, it may be worth it to use the backend compiler to produce byte code (saving compilation time) or compile into C, which will certainly save compilation time and sometimes a small amount (but not much) execution time. See the question about compiling your Perl programs for more on the compiler--the wins aren't as obvious as you'd hope.
If you're currently linking your perl executable to a shared libc.so, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information.
Unsubstantiated reports allege that Perl interpreters that use sfio outperform those that don't (for IO intensive applications). To try this, see the INSTALL file in the source distribution, especially the ``Selecting File IO mechanisms'' section.
The undump program was an old attempt to speed up your Perl program by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway.
When it comes to time-space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in Perl use more memory than strings in C, arrays take more that, and hashes use even more. While there's still a lot to be done, recent releases have been addressing these issues. For example, as of 5.004, duplicate hash keys are shared amongst all hashes using them, so require no reallocation.
In some cases, using substr() or vec() to simulate arrays can be highly beneficial. For example, an array of a thousand booleans will take at least 20,000 bytes of space, but it can be turned into one 125-byte bit vector for a considerable memory savings. The standard Tie::SubstrHash module can also help for certain types of data structure. If you're working with specialist data structures (matrices, for instance) modules that implement these in C may use less memory than equivalent Perl modules.
Another thing to try is learning whether your Perl was compiled with the system malloc or with Perl's builtin malloc. Whichever one it is, try using the other one and see whether this makes a difference. Information about malloc is in the INSTALL file in the source distribution. You can find out whether you are using perl's malloc by typing perl -V:usemymalloc
.
No, Perl's garbage collection system takes care of this.
sub makeone {
my @a = ( 1 .. 10 );
return \@a;
}
for $i ( 1 .. 10 ) {
push @many, makeone();
}
print $many[4][5], "\n";
print "@many\n";
You can't. On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs sometimes re-exec themselves. Some operating systems (notably, FreeBSD) allegedly reclaim large chunks of memory that is no longer used, but it doesn't appear to happen with Perl (yet). The Mac appears to be the only platform that will reliably (albeit, slowly) return memory to the OS.
However, judicious use of my() on your variables will help make sure that they go out of scope so that Perl can free up their storage for use in other parts of your program. A global variable, of course, never goes out of scope, so you can't get its space automatically reclaimed, although undef()ing and/or delete()ing it will achieve the same effect. In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl, but even this capability (preallocation of data types) is in the works.
Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is.
There are two popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from http://www.apache.org/) with either of the mod_perl or mod_fastcgi plugin modules.
With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see http://perl.apache.org/
With the FCGI module (from CPAN), a Perl executable compiled with sfio (see the INSTALL file in the distribution) and the mod_fastcgi module (available from http://www.fastcgi.com/) each of your perl scripts becomes a permanent CGI daemon process.
Both of these solutions can have far-reaching effects on your system and on the way you write your CGI scripts, so investigate them with care.
See http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ .
A non-free, commerical product, ``The Velocity Engine for Perl'', (http://www.binevolve.com/ or http://www.binevolve.com/bine/vep) might also be worth looking at. It will allow you to increase the performance of your perl scripts, upto 25 times faster than normal CGI perl by running in persistent perl mode, or 4 to 5 times faster without any modification to your existing CGI scripts. Fully functional evaluation copies are available from the web site.
Delete it. :-) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of ``security''.
First of all, however, you can't take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though, only by people with access to the filesystem) So you have to leave the permissions at the socially friendly 0755 level.
Some people regard this as a security problem. If your program does insecure things, and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed.
You can try using encryption via source filters (Filter::* from CPAN), but crackers might be able to decrypt it. You can try using the byte code compiler and interpreter described below, but crackers might be able to de-compile it. You can try using the native-code compiler described below, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (this is true of every language, not just Perl).
If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive licence will give you legal security. License your software and pepper it with threatening statements like ``This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah.'' We are not lawyers, of course, so you should see a lawyer if you want to be sure your licence's wording will stand up in court.
Malcolm Beattie has written a multifunction backend compiler, available from CPAN, that can do both these things. It is as of Jul-1998 in late alpha release, which means it's fun to play with if you're a programmer but not really for people looking for turn-key solutions.
Merely compiling into C does not in and of itself guarantee that your code will run very much faster. That's because except for lucky cases where a lot of native type inferencing is possible, the normal Perl run time system is still present and so your program will take just as long to run and be just as big. Most programs save little more than compilation time, leaving execution no more than 10-30% faster. A few rare programs actually benefit significantly (like several times faster), but this takes some tweaking of your code.
The 5.005 release of Perl itself, whose main goal is merging the various non-Unix ports back into the one Perl source, will also have preliminary (strictly beta) support for Malcolm's compiler and his light-weight processes (sometimes called ``threads'').
You'll probably be astonished to learn that the current version of the compiler generates a compiled form of your script whose executable is just as big as the original perl executable, and then some. That's because as currently written, all programs are prepared for a full eval() statement. You can tremendously reduce this cost by building a shared libperl.so library and linking against that. See the INSTALL podfile in the perl source distribution for details. If you link your main perl binary with this, it will make it miniscule. For example, on one author's system, /usr/bin/perl is only 11k in size!
In general, the compiler will do nothing to make a Perl program smaller, faster, more portable, or more secure. In fact, it will usually hurt all of those. The executable will be bigger, your VM system may take longer to load the whole thing, the binary is fragile and hard to fix, and compilation never stopped software piracy in the form of crackers, viruses, or bootleggers. The real advantage of the compiler is merely packaging, and once you see the size of what it makes (well, unless you use a shared libperl.so), you'll probably want a complete Perl install anywayt.
#!perl
to work on [MS-DOS,NT,...]?For OS/2 just use
extproc perl -S -your_switches
as the first line in *.cmd
file (-S
due to a bug in cmd.exe's `extproc' handling). For DOS one should first invent a corresponding batch file, and codify it in ALTERNATIVE_SHEBANG
(see the INSTALL file in the source distribution for more information).
The Win95/NT installation, when using the ActiveState port of Perl, will modify the Registry to associate the .pl
extension with the perl interpreter. If you install another port (Gurusaramy Sarathy's is the recommended Win95/NT port), or (eventually) build your own Win95/NT Perl using WinGCC, then you'll have to modify the Registry yourself.
Macintosh perl scripts will have the the appropriate Creator and Type, so that double-clicking them will invoke the perl application.
IMPORTANT!: Whatever you do, PLEASE don't get frustrated, and just throw the perl interpreter into your cgi-bin directory, in order to get your scripts working for a web server. This is an EXTREMELY big security risk. Take the time to figure out how to do it correctly.
Yes. Read perlrun for more information. Some examples follow. (These assume standard Unix shell quoting rules.)
# sum first and last fields
perl -lane 'print $F[0] + $F[-1]' *
# identify text files
perl -le 'for(@ARGV) {print if -f && -T _}' *
# remove (most) comments from C program
perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c
# make file a month younger than today, defeating reaper daemons
perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *
# find first unused uid
perl -le '$i++ while getpwuid($i); print $i'
# display reasonable manpath
echo $PATH | perl -nl -072 -e '
s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'
Ok, the last one was actually an obfuscated perl entry. :-)
The problem is usually that the command interpreters on those systems have rather different ideas about quoting than the Unix shells under which the one-liners were created. On some systems, you may have to change single-quotes to double ones, which you must NOT do on Unix or Plan9 systems. You might also have to change a single % to a %%.
For example:
# Unix
perl -e 'print "Hello world\n"'
# DOS, etc.
perl -e "print \"Hello world\n\""
# Mac
print "Hello world\n"
(then Run "Myscript" or Shift-Command-R)
# VMS
perl -e "print ""Hello world\n"""
The problem is that none of this is reliable: it depends on the command interpreter. Under Unix, the first two often work. Under DOS, it's entirely possible neither works. If 4DOS was the command shell, you'd probably have better luck like this:
perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""
Under the Mac, it depends which environment you are using. The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters.
There is no general solution to all of this. It is a mess, pure and simple. Sucks to be away from Unix, huh? :-)
[Some of this answer was contributed by Kenneth Albanowski.]
For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to web stuff in the question on books. For problems and questions related to the web, like ``Why do I get 500 Errors'' or ``Why doesn't it run from the browser right when it runs fine on the command line'', see these sources:
WWW Security FAQ
http://www.w3.org/Security/Faq/
Web FAQ
http://www.boutell.com/faq/
CGI FAQ
http://www.webthing.com/page.cgi/cgifaq
HTTP Spec
http://www.w3.org/pub/WWW/Protocols/HTTP/
HTML Spec
http://www.w3.org/TR/REC-html40/
http://www.w3.org/pub/WWW/MarkUp/
CGI Spec
http://www.w3.org/CGI/
CGI Security FAQ
http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt
perltoot is a good place to start, and you can use perlobj and perlbot for reference. Perltoot didn't come out until the 5.004 release, but you can get a copy (in pod, html, or postscript) from http://www.perl.com/CPAN/doc/FMTEYEWTK/ .
If you want to call C from Perl, start with perlxstut, moving on to perlxs, xsubpp, and perlguts. If you want to call Perl from C, then read perlembed, perlcall, and perlguts. Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems.
Download the ExtUtils::Embed kit from CPAN and run `make test'. If the tests pass, read the pods again and again and again. If they fail, see perlbug and send a bugreport with the output of make test TEST_VERBOSE=1
along with perl -V
.
perldiag has a complete list of perl's error messages and warnings, with explanatory text. You can also use the splain program (distributed with perl) to explain the error messages:
perl program 2>diag.out
splain [-v] [-p] diag.out
or change your program to explain the messages for you:
use diagnostics;
or
use diagnostics -verbose;
This module (part of the standard perl distribution) is designed to write a Makefile for an extension module from a Makefile.PL. For more information, see ExtUtils::MakeMaker.
Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or otherwise), this works is covered under Perl's Artistic Licence. For separate distributions of all or part of this FAQ outside of that, see perlfaq.
Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required.