Pod::Text - Convert POD data to formatted text
use Pod::Text;
my $parser = Pod::Text->new (sentence => 1, width => 78);
# Read POD from STDIN and write to STDOUT.
$parser->parse_from_filehandle;
# Read POD from file.pod and write to file.txt.
$parser->parse_from_file ('file.pod', 'file.txt');
Pod::Text is a module that can convert documentation in the POD format (the preferred language for documenting Perl) into formatted text. It uses no special formatting controls or codes, and its output is therefore suitable for nearly any device.
Pod::Text uses the following logic to choose an output encoding, in order:
1. If a PerlIO encoding layer is set on the output file handle, do not do any output encoding and instead rely on the PerlIO encoding layer.
2. If the encoding or utf8 options are set, use the output encoding specified by those options.
3. If the input encoding of the POD source file was explicitly specified (using =encoding) or automatically detected by Pod::Simple, use that as the output encoding as well.
4. Otherwise, if running on a non-EBCDIC system, use UTF-8 as the output encoding. Since this is a superset of ASCII, this will result in ASCII output unless the POD input contains non-ASCII characters without declaring or autodetecting an encoding (usually via E<> escapes).
5. Otherwise, for EBCDIC systems, output without doing any encoding and hope this works.
One caveat: Pod::Text has to commit to an output encoding the first time it outputs a non-ASCII character, and then has to stick with it for consistency. However, =encoding commands don't have to be at the beginning of a POD document. If someone uses a non-ASCII character early in a document with an escape, such as E<0xEF>, and then puts "=encoding iso-8859-1" later, ideally Pod::Text would follow rule 3 and output the entire document as ISO 8859-1. Instead, it will commit to UTF-8 following rule 4 as soon as it sees that escape, and then stick with that encoding for the rest of the document.
Unfortunately, there's no universally good choice for an output encoding. Each choice will be incorrect in some circumstances. This approach was chosen primarily for backwards compatibility. Callers should consider forcing the output encoding via the encoding option if they have any knowledge about what encoding the user may expect.
In particular, consider importing the Encode::Locale module, if available, and setting encoding to "locale" to use an output encoding appropriate to the user's locale. But be aware that if the user is not using locales or is using a locale of "C", Encode::Locale will set the output encoding to US-ASCII. This will cause all non-ASCII characters to be replaced with "?" and produce a flurry of warnings about unsupported characters, which may or may not be what you want.
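As a minimal sketch of forcing the output encoding, assuming Pod::Text 5.00 or later (where the encoding option exists): the POD text, the variable names, and the helper pick_output_encoding() are hypothetical and only illustrate the advice above.

```perl
use strict;
use warnings;
use Pod::Text;

# Hypothetical helper: prefer "locale" when Encode::Locale can be loaded,
# otherwise fall back to the encoding supplied by the caller.
sub pick_output_encoding {
    my ($fallback) = @_;
    return 'locale' if eval { require Encode::Locale; Encode::Locale->import; 1 };
    return $fallback;
}

# What a locale-aware caller could pass as the encoding option instead.
my $preferred = pick_output_encoding('UTF-8');

# Made-up POD source as raw UTF-8 bytes, with its input encoding declared.
my $pod = "=encoding utf-8\n\n=head1 NAME\n\ncaf\xc3\xa9 - sample page\n";

# Force UTF-8 output so the result does not depend on the user's locale.
my $parser = Pod::Text->new(encoding => 'UTF-8');
my $output;
$parser->output_string(\$output);
$parser->parse_string_document($pod);

# $output holds encoded bytes, not decoded characters.
print $output;
```

Passing $preferred instead of the fixed 'UTF-8' would follow the user's locale, with the US-ASCII caveat described above.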
Create a new Pod::Text object. ARGS should be a list of key/value pairs, where the keys are chosen from the following. Each option is annotated with the version of Pod::Text in which that option was added with its current meaning.
[2.00] If set to a true value, selects an alternate output format that, among other things, uses a different heading style and marks =item entries with a colon in the left margin. Defaults to false.
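The effect is easiest to see by rendering the same document twice. This is a minimal sketch; the POD text and the render() helper are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD document with a heading and one =item entry.
my $pod = join "\n",
    '=head1 NAME',
    '',
    'sample - demo page',
    '',
    '=over 4',
    '',
    '=item first',
    '',
    'An item.',
    '',
    '=back',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

my $default = render();
my $alt     = render(alt => 1);

print "--- default ---\n", $default, "--- alt ---\n", $alt;
```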
[2.13] If set to a true value, the non-POD parts of the input file will be included in the output. Useful for viewing code documented with POD blocks with the POD rendered and the code left intact.
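A minimal sketch of this option, assuming a made-up source string that mixes a POD block with one line of Perl code; the render() helper is hypothetical.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up source: a POD block followed by ordinary Perl code.
my $source = join "\n",
    '=head1 NAME',
    '',
    'sample - demo script',
    '',
    '=cut',
    '',
    'print "hello\n";',
    '';

# Hypothetical helper: format $source with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($source);
    return $out;
}

my $pod_only  = render();           # non-POD lines are dropped
my $with_code = render(code => 1);  # non-POD lines are kept in the output

print $with_code;
```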
[5.00] Specifies the encoding of the output. The value must be an encoding recognized by the Encode module (see Encode::Supported). If the output contains characters that cannot be represented in this encoding, that is an error that will be reported as configured by the errors option. If error handling is other than "die", the unrepresentable character will be replaced with the Encode substitution character (normally "?").
If the output file handle has a PerlIO encoding layer set, this parameter will be ignored and no encoding will be done by Pod::Text. It will instead rely on the encoding layer to make whatever output encoding transformations are desired.
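A minimal sketch of the PerlIO case, assuming an in-memory file handle opened with an :encoding(UTF-8) layer; the POD text and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD source as raw UTF-8 bytes, with its input encoding declared.
my $pod = "=encoding utf-8\n\n=head1 NAME\n\ncaf\xc3\xa9 - sample page\n";

# In-memory output handle that carries its own PerlIO encoding layer.
my $buffer = '';
open(my $fh, '>:encoding(UTF-8)', \$buffer)
    or die "cannot open in-memory handle: $!";

# No encoding option here: the layer on $fh is expected to do the encoding.
my $parser = Pod::Text->new;
$parser->output_fh($fh);
$parser->parse_string_document($pod);
close($fh) or die "cannot close in-memory handle: $!";

print $buffer;
```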
WARNING: The input encoding of the POD source is independent from the output encoding, and setting this option does not affect the interpretation of the POD input. Unless your POD source is US-ASCII, its encoding should be declared with the =encoding command in the source, as near to the top of the file as possible. If this is not done, Pod::Simple will attempt to guess the encoding and may be successful if it's Latin-1 or UTF-8, but it will produce warnings. See perlpod(1) for more information.
[3.17] How to report errors. "die" says to throw an exception on any POD formatting error. "stderr" says to report errors on standard error, but not to throw an exception. "pod" says to include a POD ERRORS section in the resulting documentation summarizing the errors. "none" ignores POD errors entirely, as much as possible.
The default is "pod".
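A minimal sketch of the "die" setting, assuming a made-up POD document whose =over block is never closed; the variable names are hypothetical. The parser may also report the problem on standard error before the exception is thrown.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD with a deliberate syntax error: =over without a matching =back.
my $pod = join "\n",
    '=head1 NAME',
    '',
    'sample - demo page',
    '',
    '=over 4',
    '',
    '=item first',
    '',
    'An item inside a list that is never closed.',
    '',
    '=head1 NEXT',
    '',
    'More text.',
    '';

my $parser = Pod::Text->new(errors => 'die');
my $output;
$parser->output_string(\$output);

# Wrap the parse in eval so the exception can be inspected by the caller.
my $ok = eval { $parser->parse_string_document($pod); 1 };
my $error = $ok ? '' : $@;

print $ok ? "formatted without errors\n" : "formatting failed: $error";
```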
[5.01] By default, Pod::Text applies some default formatting rules based on guesswork and regular expressions that are intended to make writing Perl documentation easier and require less explicit markup. These rules may not always be appropriate, particularly for documentation that isn't about Perl. This option allows turning all or some of it off.
The special value "all" enables all guesswork. This is also the default for backward compatibility reasons. The special value "none" disables all guesswork. Otherwise, the value of this option should be a comma-separated list of one or more of the following keywords:
If no guesswork is enabled, any text enclosed in C<> is surrounded by double quotes unless the contents are already quoted. When this guesswork is enabled, quote marks will also be suppressed for Perl variables, function names, function calls, numbers, and hex constants.
Any unknown guesswork name is silently ignored (for potential future compatibility), so be careful about spelling.
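A minimal sketch comparing the default with guesswork disabled, assuming Pod::Text 5.01 or later (older versions do not know this option); the POD text and the render() helper are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD: one C<> span holds a Perl variable, the other a shell command.
my $pod = join "\n",
    '=head1 DESCRIPTION',
    '',
    'Set C<$variable> before running C<make install>.',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

my $guessed = render();                     # default: all guesswork enabled
my $literal = render(guesswork => 'none');  # every C<> span treated alike

print "--- default ---\n", $guessed, "--- guesswork none ---\n", $literal;
```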
[2.00] The number of spaces to indent regular text, and the default indentation for =over blocks. Defaults to 4.
[2.00] If set to a true value, a blank line is printed after a =head1 heading. If set to false (the default), no blank line is printed after =head1, although one is still printed after =head2. This is the default because it's the expected formatting for manual pages; if you're formatting arbitrary text documents, setting this to true may result in more pleasing output.
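A minimal sketch showing both settings side by side; the POD text and the render() helper are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD document with a single =head1 section.
my $pod = join "\n",
    '=head1 NAME',
    '',
    'sample - demo page',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

my $tight = render();            # manual-page style: text follows the heading
my $loose = render(loose => 1);  # blank line between heading and text

print "--- default ---\n", $tight, "--- loose ---\n", $loose;
```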
[2.21] The width of the left margin in spaces. Defaults to 0. This is the margin for all text, including headings, not the amount by which regular text is indented; for the latter, see the indent option. To set the right margin, see the width option.
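A minimal sketch combining the margin, indent, and width options, using made-up POD text and arbitrary values chosen only to make the effect visible.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD with a paragraph long enough to need wrapping.
my $pod = join "\n",
    '=head1 DESCRIPTION',
    '',
    'This paragraph is long enough that it has to be wrapped onto several',
    'short lines once the right margin is narrowed.',
    '';

# One space of left margin for everything, two more spaces of indentation
# for regular text, and wrapping at column 40.
my $parser = Pod::Text->new(margin => 1, indent => 2, width => 40);
my $output;
$parser->output_string(\$output);
$parser->parse_string_document($pod);

print $output;
```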
[3.17] Normally, L<> formatting codes with both a URL and anchor text are formatted to show both the anchor text and the URL. In other words:
L<foo|http://example.com/>
is formatted as:
foo <http://example.com/>
This option, if set to a true value, suppresses the URL when anchor text is given, so this example would be formatted as just "foo". This can produce less cluttered output in cases where the URLs are not particularly important.
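The example above can be reproduced with a short script. This is a minimal sketch; the surrounding sentence in the POD text and the render() helper are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD containing the link from the example above.
my $pod = join "\n",
    '=head1 DESCRIPTION',
    '',
    'See L<foo|http://example.com/> for details.',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

my $with_urls    = render();             # anchor text followed by the URL
my $without_urls = render(nourls => 1);  # anchor text only

print "--- default ---\n", $with_urls, "--- nourls ---\n", $without_urls;
```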
[4.00] Sets the quote marks used to surround C<> text. If the value is a single character, it is used as both the left and right quote. Otherwise, it is split in half, and the first half of the string is used as the left quote and the second is used as the right quote.
This may also be set to the special value "none", in which case no quote marks are added around C<> text.
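A minimal sketch of a four-character quote specification and of the "none" value; the POD text, the chosen quote marks, and the render() helper are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD with one C<> span that is not a Perl construct.
my $pod = join "\n",
    '=head1 DESCRIPTION',
    '',
    'Run C<make install> to finish.',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

# '<<>>' is split in half: '<<' opens and '>>' closes the quoted text.
my $angled   = render(quotes => '<<>>');
my $unquoted = render(quotes => 'none');

print "--- quotes <<>> ---\n", $angled, "--- quotes none ---\n", $unquoted;
```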
[3.00] If set to a true value, Pod::Text will assume that each sentence ends in two spaces, and will try to preserve that spacing. If set to false, all consecutive whitespace in non-verbatim paragraphs is compressed into a single space. Defaults to false.
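A minimal sketch of the difference, assuming made-up POD text in which two spaces follow the first sentence; the render() helper is hypothetical.

```perl
use strict;
use warnings;
use Pod::Text;

# Made-up POD: two spaces separate the sentences in the source.
my $pod = join "\n",
    '=head1 DESCRIPTION',
    '',
    'First sentence.  Second sentence.',
    '';

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $out;
    $parser->output_string(\$out);
    $parser->parse_string_document($pod);
    return $out;
}

my $collapsed = render();               # whitespace runs become one space
my $preserved = render(sentence => 1);  # two spaces after a sentence are kept

print "--- default ---\n", $collapsed, "--- sentence ---\n", $preserved;
```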
[3.10] Send error messages about invalid POD to standard error instead of appending a POD ERRORS section to the generated output. This is equivalent to setting errors to "stderr" if errors is not already set. It is supported for backward compatibility.
[3.12] If this option is set to a true value, the output encoding is set to UTF-8. This is equivalent to setting encoding to "UTF-8" if encoding is not already set. It is supported for backward compatibility.
[2.00] The column at which to wrap text on the right-hand side. Defaults to 76.
As a derived class from Pod::Simple, Pod::Text supports the same methods and interfaces. See Pod::Simple for all the details. This section summarizes the most-frequently-used methods and the ones added by Pod::Text.
Direct the output from parse_file(), parse_lines(), or parse_string_document() to the file handle FH instead of STDOUT.
Direct the output from parse_file(), parse_lines(), or parse_string_document() to the scalar variable pointed to by REF, rather than STDOUT. For example:
use Pod::Text;
my $parser = Pod::Text->new();
my $output;
$parser->output_string(\$output);
$parser->parse_file('/some/input/file');
Be aware that the output in that variable will already be encoded (see "Encoding").
Read the POD source from PATH and format it. By default, the output is sent to STDOUT, but this can be changed with the output_fh() or output_string() methods.
Read the POD source from INPUT, format it, and output the results to OUTPUT.
parse_from_filehandle() is provided for backward compatibility with older versions of Pod::Text. parse_from_file() should be used instead.
Parse the provided lines as POD source, writing the output to either STDOUT or the file handle set with the output_fh() or output_string() methods. This method can be called repeatedly to provide more input lines. An explicit undef should be passed to indicate the end of input.
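A minimal sketch of feeding input in several calls and then signalling the end of input; the POD lines and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

my $parser = Pod::Text->new;
my $output;
$parser->output_string(\$output);

# Input can arrive in pieces, for example while reading from a socket.
$parser->parse_lines('=head1 NAME', '');
$parser->parse_lines('sample - demo page', '');

# An explicit undef marks the end of input and completes the document.
$parser->parse_lines(undef);

print $output;
```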
This method expects raw bytes, not decoded characters.
Parse the provided scalar variable as POD source, writing the output to either STDOUT or the file handle set with the output_fh() or output_string() methods.
This method expects raw bytes, not decoded characters.
Pod::Text exports one function for backward compatibility with older versions. This function is deprecated; instead, use the object-oriented interface described above.
Convert the POD source from INPUT to text and write it to OUTPUT. If OUTPUT is not given, defaults to STDOUT. INPUT can be any expression supported as the second argument to two-argument open().
If -a is given as an initial argument, pass the alt option to the Pod::Text constructor. This enables alternative formatting.
If -NNN is given as an initial argument, pass the width option to the Pod::Text constructor with the number NNN as its argument. This sets the wrap line width to NNN.
(W) Something has gone wrong in internal =item processing. These messages indicate a bug in Pod::Text; you should never see them.
(F) Pod::Text was invoked via the compatibility mode pod2text() interface and the input file it was given could not be opened.
(F) The errors parameter to the constructor was set to an unknown value.
(F) The quote specification given (the quotes option to the constructor) was invalid. A quote specification must be either one character long or an even number (greater than one) of characters long.
(F) The POD document being formatted had syntax errors and the errors option was set to "die".
Pod::Text 2.03 (based on Pod::Parser) was the first version of this module included with Perl, in Perl 5.6.0. Earlier versions of Perl had a different Pod::Text module, with a different API.
The current API based on Pod::Simple was added in Pod::Text 3.00. Pod::Text 3.01 was included in Perl 5.9.3, the first version of Perl to incorporate those changes. This is the first version that correctly supports all modern POD syntax. The parse_from_filehandle() method was re-added for backward compatibility in Pod::Text 3.07, included in Perl 5.9.4.
Pod::Text 3.12, included in Perl 5.10.1, first implemented the current practice of attempting to match the default output encoding with the input encoding of the POD source, unless overridden by the utf8 option or (added later) the encoding option.
Support for anchor text in L<> links of type URL was added in Pod::Text 3.14, included in Perl 5.11.5.
parse_lines(), parse_string_document(), and parse_file() set a default output file handle of STDOUT if one was not already set as of Pod::Text 3.18, included in Perl 5.19.5.
Pod::Text 4.00, included in Perl 5.23.7, aligned the module version and the version of the podlators distribution. All modules included in podlators, and the podlators distribution itself, share the same version number from this point forward.
Pod::Text 4.09, included in Perl 5.25.7, fixed a serious bug on EBCDIC systems, present in all versions back to 3.00, that would cause opening brackets to disappear.
Pod::Text 5.00, included in Perl 5.37.7, and later versions default, on non-EBCDIC systems, to UTF-8 encoding if they see a non-ASCII character in the input and the input encoding is not specified. They also commit to an encoding with the first non-ASCII character and do not change the output encoding if the input encoding changes. The Encode module is now used for all output encoding rather than PerlIO layers, which fixes earlier problems with output to scalars.
Line wrapping is done only at ASCII spaces and tabs, rather than using a correct Unicode-aware line wrapping algorithm.
Russ Allbery <rra@cpan.org>, based very heavily on the original Pod::Text by Tom Christiansen <tchrist@mox.perl.com> and its conversion to Pod::Parser by Brad Appleton <bradapp@enteract.com>. Sean Burke's initial conversion of Pod::Man to use Pod::Simple provided much-needed guidance on how to use Pod::Simple.
Copyright 1999-2002, 2004, 2006, 2008-2009, 2012-2016, 2018-2019, 2022 Russ Allbery <rra@cpan.org>
This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
Encode::Locale, Encode::Supported, Pod::Simple, Pod::Text::Termcap, perlpod(1), pod2text(1)
The current version of this module is always available from its web site at https://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.