# NAME
XML::PP - A simple XML parser
# VERSION
Version 0.03
# SYNOPSIS
    use XML::PP;

    my $parser = XML::PP->new();
    my $xml = '<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don\'t forget me this weekend!</body></note>';
    my $tree = $parser->parse($xml);

    print $tree->{name};    # 'note'
    print $tree->{children}[0]->{name};    # 'to'
# DESCRIPTION
You almost certainly do not need this module;
for most tasks, use [XML::Simple](https://metacpan.org/pod/XML%3A%3ASimple) or [XML::LibXML](https://metacpan.org/pod/XML%3A%3ALibXML).
`XML::PP` exists only for the most lightweight of scenarios where you can't get one of the above modules to install,
for example, on CI/CD machines running Windows that get stuck with the error described at [https://stackoverflow.com/questions/11468141/cant-load-c-strawberry-perl-site-lib-auto-xml-libxml-libxml-dll-for-module-x](https://stackoverflow.com/questions/11468141/cant-load-c-strawberry-perl-site-lib-auto-xml-libxml-libxml-dll-for-module-x).
`XML::PP` is a simple, lightweight XML parser written in pure Perl.
It does not rely on external libraries like `XML::LibXML` and is suitable for small XML parsing tasks.
This module supports basic XML document parsing, including namespace handling, attributes, and text nodes.
# METHODS
## new
    my $parser = XML::PP->new();
    my $parser = XML::PP->new(strict => 1);
    my $parser = XML::PP->new(warn_on_error => 1);

Creates a new `XML::PP` object.
It can take several optional arguments:
- `strict` - If set to true, the parser dies when it encounters unknown entities or unescaped ampersands.
- `warn_on_error` - If true, the parser emits warnings for unknown or malformed XML entities. This is enabled automatically if `strict` is enabled.
- `logger` - Used for warnings and traces. It can be an object that understands `warn()` and `trace()` messages, such as a [Log::Log4perl](https://metacpan.org/pod/Log%3A%3ALog4perl) or [Log::Any](https://metacpan.org/pod/Log%3A%3AAny) object, a reference to code, a reference to an array, or a filename.
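
The effect of `strict` can be observed by wrapping `parse` in `eval`. The following is a minimal sketch, not taken from the module's own documentation: `try_parse` is a hypothetical helper, the sample input with an unescaped ampersand is an assumption about what strict mode rejects (based on the option description above), and the demonstration is skipped where `XML::PP` is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::PP): call parse() under eval and
# describe the outcome, so a strict-mode die does not end the program.
sub try_parse {
    my ($parser, $xml) = @_;

    my $tree;
    my $ok = eval { $tree = $parser->parse($xml); 1 };
    if (!$ok) {
        my $err = $@ || 'unknown error';
        chomp $err;
        return "parse died: $err";
    }
    return 'no tree returned' unless ref $tree eq 'HASH';
    return "parsed <$tree->{name}>";
}

# Only run the demonstration where XML::PP is installed
if (eval { require XML::PP; 1 }) {
    # Unescaped ampersand: the kind of input strict mode is documented to reject
    my $xml = '<note><body>Tom & Jerry</body></note>';

    for my $strict (0, 1) {
        my $parser = XML::PP->new(strict => $strict);
        print "strict=$strict: ", try_parse($parser, $xml), "\n";
    }
}
```

Keeping the `eval` in one helper means calling code can treat a strict-mode failure as an ordinary return value rather than an exception.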
## parse
    my $tree = $parser->parse($xml_string);

Parses the XML string and returns a tree structure representing the XML content.
The returned structure is a hash reference with the following fields:
- `name` - The tag name of the node.
- `ns` - The namespace prefix (if any).
- `ns_uri` - The namespace URI (if any).
- `attributes` - A hash reference of attributes.
- `children` - An array reference of child nodes (either text nodes or further elements).
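
Because every element node carries its own `children` array, the tree can be processed with a short recursive routine. The sketch below walks a hand-built tree in the shape listed above; `outline` is a hypothetical helper, and the `{ text => ... }` form of text nodes is assumed from the `collapse_structure` examples later in this document.

```perl
use strict;
use warnings;

# Hand-built tree in the documented shape. Text nodes are assumed to be
# hashes with a single 'text' key.
my $tree = {
    name       => 'note',
    attributes => { id => 'n1' },
    children   => [
        { name => 'to',   attributes => {}, children => [ { text => 'Tove' } ] },
        { name => 'from', attributes => {}, children => [ { text => 'Jani' } ] },
    ],
};

# Hypothetical helper: return one indented line per node, depth-first.
sub outline {
    my ($node, $depth) = @_;
    $depth //= 0;
    my $indent = '  ' x $depth;

    # A node without a name is treated as a text node
    return ($indent . '"' . $node->{text} . '"') unless defined $node->{name};

    my $attrs    = $node->{attributes} || {};
    my $attr_str = join '', map { ' ' . $_ . '="' . $attrs->{$_} . '"' } sort keys %$attrs;

    my @lines = ($indent . '<' . $node->{name} . $attr_str . '>');
    for my $child (@{ $node->{children} || [] }) {
        push @lines, outline($child, $depth + 1);
    }
    return @lines;
}

print "$_\n" for outline($tree);
```

The same routine should work on the value returned by `parse`, provided the parsed tree follows the field list above.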
## collapse\_structure
Collapse an XML-like structure into a simplified hash (like [XML::Simple](https://metacpan.org/pod/XML%3A%3ASimple)).

    use XML::PP;

    my $input = {
        name => 'note',
        children => [
            { name => 'to', children => [ { text => 'Tove' } ] },
            { name => 'from', children => [ { text => 'Jani' } ] },
            { name => 'heading', children => [ { text => 'Reminder' } ] },
            { name => 'body', children => [ { text => 'Don\'t forget me this weekend!' } ] },
        ],
        attributes => { id => 'n1' },
    };

    my $result = collapse_structure($input);

    # Output:
    # {
    #     note => {
    #         to => 'Tove',
    #         from => 'Jani',
    #         heading => 'Reminder',
    #         body => 'Don\'t forget me this weekend!',
    #     }
    # }

The `collapse_structure` subroutine takes a nested hash structure (representing an XML-like data structure) and collapses it into a simplified hash in which each child element's name becomes a key and its text content becomes the corresponding value. The result is wrapped in a single top-level key, `note` in this example, whose value is the hash of all child elements.
This subroutine is particularly useful for flattening XML-like data into a more manageable hash format, suitable for further processing or display.
`collapse_structure` accepts a single argument:
- `$node` (Required)

    A hash reference representing a node with the following structure:

        {
            name => 'element_name',    # Name of the element (e.g., 'note', 'to', etc.)
            children => [              # List of child elements
                { name => 'child_name', children => [{ text => 'value' }] },
                ...
            ],
            attributes => { ... },     # Optional attributes for the element
            ns_uri => ... ,            # Optional namespace URI
            ns => ... ,                # Optional namespace
        }

    The `children` key holds an array of child elements. Each child element has its own `name` and may hold text nodes (`{ text => ... }`) in its own `children`; the function collapses these text values into key-value pairs.

The subroutine returns a hash reference that represents the collapsed structure. Its single top-level key (`note` in this example) maps to another hash whose keys are the child elements' names and whose values are their text.
For example:

    {
        note => {
            to => 'Tove',
            from => 'Jani',
            heading => 'Reminder',
            body => 'Don\'t forget me this weekend!',
        }
    }

- Basic Example:

    Given the following input structure:

        my $input = {
            name => 'note',
            children => [
                { name => 'to', children => [ { text => 'Tove' } ] },
                { name => 'from', children => [ { text => 'Jani' } ] },
                { name => 'heading', children => [ { text => 'Reminder' } ] },
                { name => 'body', children => [ { text => 'Don\'t forget me this weekend!' } ] },
            ],
        };

    Calling `collapse_structure` will return:

        {
            note => {
                to => 'Tove',
                from => 'Jani',
                heading => 'Reminder',
                body => 'Don\'t forget me this weekend!',
            }
        }

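
To make the collapsing rule concrete, here is a minimal stand-alone sketch of the idea. It is an illustration only, not the module's implementation: `collapse_sketch` and `_collapse_children` are hypothetical names, and the handling of nested elements, repeated child names (gathered into an array reference), mixed content (text beside elements is dropped), and attributes (ignored) are choices made for this sketch rather than documented behaviour of `collapse_structure`.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: collapse the children of one node.
# Returns a string for text-only elements, otherwise a hash reference.
sub _collapse_children {
    my ($node) = @_;
    my @children = @{ $node->{children} || [] };
    my @elements = grep { defined $_->{name} } @children;

    # No child elements: join the text nodes into one string
    if (!@elements) {
        my $text = '';
        for my $child (@children) {
            $text .= $child->{text} if defined $child->{text};
        }
        return $text;
    }

    my %out;
    for my $child (@elements) {
        my $key   = $child->{name};
        my $value = _collapse_children($child);
        if (!exists $out{$key}) {
            $out{$key} = $value;
        } elsif (ref $out{$key} eq 'ARRAY') {
            push @{ $out{$key} }, $value;           # third or later repeat
        } else {
            $out{$key} = [ $out{$key}, $value ];    # second occurrence
        }
    }
    return \%out;
}

# Hypothetical entry point: wrap the collapsed children in the root's name.
sub collapse_sketch {
    my ($node) = @_;
    return { $node->{name} => _collapse_children($node) };
}

my $input = {
    name     => 'note',
    children => [
        { name => 'to',   children => [ { text => 'Tove' } ] },
        { name => 'from', children => [ { text => 'Jani' } ] },
    ],
};

my $result = collapse_sketch($input);
$Data::Dumper::Sortkeys = 1;
print Dumper($result);
```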
## \_parse\_node
    my $node = $self->_parse_node($xml_ref, $nsmap);

Recursively parses an individual XML node.
This method is used internally by the `parse` method.
It handles the parsing of tags, attributes, text nodes, and child elements.
It also manages namespaces and handles self-closing tags.
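
The general technique described here, recursive descent over a string reference, can be sketched in a few lines. The code below is an illustration under stated assumptions and is not the module's `_parse_node`: `parse_node_sketch` is a hypothetical name, and namespaces, entities, comments, CDATA, and attribute values containing `>` are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one element from the front of the string that
# $xml_ref points to, removing what it consumes as it goes.
sub parse_node_sketch {
    my ($xml_ref) = @_;

    # Opening tag: name, raw attribute text, optional self-closing slash
    $$xml_ref =~ s/^\s*<([\w:.-]+)([^>]*?)(\/?)>//
        or die "expected an opening tag\n";
    my ($name, $attr_src, $self_closing) = ($1, $2, $3);

    # Attributes of the form key="value" or key='value'
    my %attributes;
    while ($attr_src =~ /([\w:.-]+)\s*=\s*(["'])(.*?)\2/g) {
        $attributes{$1} = $3;
    }

    my $node = { name => $name, attributes => \%attributes, children => [] };
    return $node if $self_closing;

    while (1) {
        if ($$xml_ref =~ s/^<\/\Q$name\E\s*>//) {
            return $node;                       # matching closing tag
        } elsif ($$xml_ref =~ s/^([^<]+)//) {
            my $text = $1;                      # text up to the next tag
            push @{ $node->{children} }, { text => $text } if $text =~ /\S/;
        } elsif ($$xml_ref =~ /^</) {
            # Anything else starting with '<' is parsed as a child element
            push @{ $node->{children} }, parse_node_sketch($xml_ref);
        } else {
            die "unexpected end of input inside <$name>\n";
        }
    }
}

# Note: parsing consumes the string, so $xml is empty afterwards
my $xml  = '<note id="n1"><to>Tove</to><br/></note>';
my $tree = parse_node_sketch(\$xml);
print "$tree->{name} has ", scalar @{ $tree->{children} }, " children\n";
```

Passing a reference lets each recursive call continue from where the previous one stopped, which is presumably why `_parse_node` also takes `$xml_ref` rather than a plain string.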
# AUTHOR
Nigel Horne, ``
# SEE ALSO
- [XML::LibXML](https://metacpan.org/pod/XML%3A%3ALibXML)
- [XML::Simple](https://metacpan.org/pod/XML%3A%3ASimple)
# SUPPORT
This module is provided as-is without any warranty.
# LICENSE AND COPYRIGHT
Copyright 2025 Nigel Horne.
Usage is subject to licence terms.
The licence terms of this software are as follows:
- Personal single user, single computer use: GPL2
- All other users (including Commercial, Charity, Educational, Government)
must apply in writing for a licence for use from Nigel Horne at the
above e-mail.