A pure Lua XPath parser and evaluator supporting XPath 2.0 with selected XPath 3.1 features (arrays, maps, string concatenation). No external dependencies — all utility libraries are vendored. Part of the speedata Publisher.

-- create a context with variables, namespaces and an XML document
local ctxvalue = {
    namespaces = {
        myns = "http://a.name-space"
    },
    vars = {
        foo = "bar",
        onedotfive = 1.5,
        a = 5,
        ["one-two"] = 12,
    },
    xmldoc = { xmltab },
    sequence = { xmltab }
}
local ctx = lxpath.context:new(ctxvalue)

-- toks is a token list
local toks, msg = lxpath.string_to_tokenlist(str)
if toks == nil then
    print(msg)
    os.exit(-1)
end

-- ef is a function which executes the parsed xpath on a context.
-- you can reuse ef()
local ef, err = lxpath.parse_xpath(toks)
if err ~= nil then
    -- handle error string err
end

local seq, errmsg = ef(ctx)
-- seq is the resulting sequence (a table) of zero or more items.
-- Each item can be a sequence, an element, an attribute, a string or a number.

You can also run one of the convenience functions:

sequence, errormessage = ctx:eval("xpath string")

and

sequence, errormessage = ctx:execute("xpath string")

The difference is that eval() leaves the context unchanged and only returns the sequence, whereas execute() changes the context itself (self).
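
Because the function returned by `parse_xpath()` can be reused, the tokenize and parse steps can be wrapped in a small caching helper. The following is a minimal sketch and not part of the library: `make_compiler` and its `cache` table are hypothetical names, and the lxpath module is passed in as `lx` so that the sketch makes no assumption about how the module is loaded.

```lua
-- Hypothetical helper: compile each XPath string once and cache the result.
-- `lx` is the lxpath module table (assumed to provide string_to_tokenlist
-- and parse_xpath as shown in the usage example above).
local function make_compiler(lx)
    local cache = {}
    return function(xpath)
        local cached = cache[xpath]
        if cached then
            return cached
        end
        -- Step 1: tokenize. A nil token list signals an error message.
        local toks, msg = lx.string_to_tokenlist(xpath)
        if toks == nil then
            return nil, msg
        end
        -- Step 2: parse. A non-nil second value is an error string.
        local ef, err = lx.parse_xpath(toks)
        if err ~= nil then
            return nil, err
        end
        cache[xpath] = ef
        return ef
    end
end
```

With the module available as `lxpath`, `local compile = make_compiler(lxpath)` followed by `local ef, err = compile("count(//item)")` would yield a function that can be called as `ef(ctx)` on any number of contexts.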

| Expression | Example | Description |
|---|---|---|
| Path | `child/grandchild` | Navigate the XML tree |
| Abbreviated path | `//item` | Descendant-or-self shorthand |
| Filter / Predicate | `item[position() = 1]` | Filter sequences with `[]` |
| Arithmetic | `1 + 2`, `$a * 3` | `+`, `-`, `*`, `div`, `idiv`, `mod` |
| Comparison | `$x = 1`, `$x eq 1` | General (`=`, `!=`, `<`, `>`, `<=`, `>=`) and value (`eq`, `ne`, `lt`, `le`, `gt`, `ge`) comparisons |
| Node comparison | `$a is $b`, `$a << $b` | `is`, `<<`, `>>` |
| Logical | `$a and $b`, `$a or $b` | `and`, `or` |
| Range | `1 to 10` | Integer sequence |
| String concatenation | `'hello' \|\| ' world'` | XPath 3.1 `\|\|` operator |
| Unary | `-$x`, `+$x` | Unary plus/minus |
| Union | `a \| b` | Node set union |
| If/then/else | `if ($x) then 'a' else 'b'` | Conditional |
| For | `for $i in 1 to 5 return $i * 2` | Iteration |
| Quantified | `some $x in (1,2,3) satisfies $x > 2` | `some` / `every` |
| Type | `$x instance of xs:integer` | `instance of`, `cast as`, `castable as`, `treat as` |
| Variable reference | `$varname` | Access context variables |
| Context item | `.` | Current item |

| Axis | Abbreviated | Direction |
|---|---|---|
| `child::` | (default) | forward |
| `attribute::` | `@` | forward |
| `self::` | `.` | forward |
| `descendant::` | | forward |
| `descendant-or-self::` | `//` | forward |
| `following::` | | forward |
| `following-sibling::` | | forward |
| `parent::` | `..` | reverse |
| `ancestor::` | | reverse |
| `ancestor-or-self::` | | reverse |
| `preceding::` | | reverse |
| `preceding-sibling::` | | reverse |

| Test | Description |
|---|---|
| `node()` | Any node |
| `element()` | Element nodes |
| `text()` | Text nodes |
| `comment()` | Comment nodes |
| `processing-instruction()` | PI nodes |
| `*` | Any element (wildcard) |
| `prefix:*` | Any element in namespace |
| `name` | Element by name |

| Function | Description |
|---|---|
| `concat(s1, s2, ...)` | Concatenate strings |
| `contains(s, sub)` | Test if string contains substring |
| `ends-with(s, sub)` | Test if string ends with substring |
| `lower-case(s)` | Convert to lowercase |
| `normalize-space(s)` | Normalize whitespace |
| `starts-with(s, sub)` | Test if string starts with substring |
| `string(item?)` | Convert to string |
| `string-join(seq, sep)` | Join sequence with separator |
| `string-length(s?)` | Length of string |
| `substring(s, start, len?)` | Extract substring |
| `substring-after(s, sub)` | Substring after first occurrence |
| `substring-before(s, sub)` | Substring before first occurrence |
| `translate(s, from, to)` | Character-by-character translation |
| `upper-case(s)` | Convert to uppercase |
| `matches(s, pattern, flags?)` | Regular expression matching (stub; provide your own implementation) |
| `codepoints-to-string(seq)` | Codepoints to string |
| `string-to-codepoints(s)` | String to codepoints |

| Function | Description |
|---|---|
| `abs(n)` | Absolute value |
| `ceiling(n)` | Round up |
| `floor(n)` | Round down |
| `format-number(n, fmt)` | Format number as string |
| `number(item)` | Convert to number |
| `round(n)` | Round to nearest integer |
| `round-half-to-even(n, precision?)` | Banker's rounding |

| Function | Description |
|---|---|
| `boolean(item)` | Convert to boolean |
| `false()` | Boolean false |
| `true()` | Boolean true |
| `not(b)` | Boolean negation |

| Function | Description |
|---|---|
| `count(seq)` | Number of items |
| `distinct-values(seq)` | Remove duplicates |
| `empty(seq)` | Test if empty |
| `max(seq)` | Maximum value |
| `min(seq)` | Minimum value |
| `reverse(seq)` | Reverse order |

| Function | Description |
|---|---|
| `doc(uri)` | Load document |
| `last()` | Size of current context |
| `local-name(node?)` | Local name of node |
| `name(node?)` | Qualified name of node |
| `namespace-uri(node?)` | Namespace URI |
| `position()` | Position in current context |
| `root(node?)` | Root node |

| Function | Description |
|---|---|
| `serialize(item)` | Serialize node to XML string |
| `unparsed-text(uri)` | Read file as text |
Requires namespace declaration: array = "http://www.w3.org/2005/xpath-functions/array"

| Function | Description |
|---|---|
| `array:size(a)` | Number of members |
| `array:get(a, pos)` | Get member at position |
| `array:put(a, pos, val)` | Replace member at position |
| `array:append(a, val)` | Append member |
| `array:subarray(a, start, len?)` | Extract sub-array |
| `array:remove(a, pos)` | Remove member at position |
| `array:join(arrays)` | Concatenate arrays |
| `array:flatten(a)` | Flatten nested arrays |
Requires namespace declaration: map = "http://www.w3.org/2005/xpath-functions/map"

| Function | Description |
|---|---|
| `map:size(m)` | Number of entries |
| `map:keys(m)` | All keys |
| `map:get(m, key)` | Get value for key |
| `map:put(m, key, val)` | Add/replace entry |
| `map:remove(m, key)` | Remove entry |
| `map:contains(m, key)` | Test if key exists |
| `map:merge(maps)` | Merge maps |
| `map:entry(key, val)` | Create single-entry map |
(: Square array constructor :)
[1, 2, 3]
(: Curly array constructor — each item becomes a member :)
array { 1 to 5 }
(: Empty map :)
map {}
(: Map with entries :)
map { 'name': 'Alice', 'age': 30 }
$myarray?1 (: first member :)
$myarray?* (: all members :)
$mymap?name (: value for key 'name' :)
$mymap?* (: all values :)
[10, 20, 30]?2 (: 20 :)
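
The array and map functions need their prefixes declared in the `namespaces` table of the context. The sketch below shows one possible way to set this up and run a few of the expressions above through `ctx:eval()`. `eval_all` is a hypothetical helper, the lxpath module is passed in as `lx`, and passing empty `vars`, `xmldoc` and `sequence` tables is an assumption that has not been checked against the library.

```lua
-- Namespace URIs as stated in this README.
local namespaces = {
    array = "http://www.w3.org/2005/xpath-functions/array",
    map = "http://www.w3.org/2005/xpath-functions/map",
}

-- Hypothetical helper: evaluate a list of XPath strings in a fresh context
-- and collect the returned sequence and error message for each of them.
local function eval_all(lx, exprs)
    local ctx = lx.context:new({
        namespaces = namespaces,
        vars = {},      -- assumption: no variables are needed
        xmldoc = {},    -- assumption: an empty document is accepted
        sequence = {},  -- assumption: an empty initial sequence is accepted
    })
    local results = {}
    for _, expr in ipairs(exprs) do
        local seq, errmsg = ctx:eval(expr)
        results[expr] = { seq = seq, err = errmsg }
    end
    return results
end

local examples = {
    "array:size([1, 2, 3])",
    "map:get(map { 'name': 'Alice', 'age': 30 }, 'name')",
    "[10, 20, 30]?2",
}
```

Calling `eval_all(lxpath, examples)` with the loaded module would return one entry per expression; the `err` field should be checked before the `seq` field is used.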
lua lxpath_test.lua
Run a single test by name:
lua lxpath_test.lua TestTokenizer.test_get_qname
All input is expected to be in UTF-8.
This library is not Unicode aware! For example, upper-case('ä') returns ä rather than Ä, because there is no lookup table for Unicode characters.
You can provide your own implementations for string.match and string.find (which might be UTF-8 ready) by setting M.stringmatch and M.stringfind.
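
To illustrate what a UTF-8-aware replacement could look like, here is a minimal sketch of an uppercase helper backed by a small lookup table. `utf8_upper` and `upper_map` are hypothetical names, the table is deliberately incomplete, and the source file is assumed to be saved as UTF-8. How such a helper would be hooked into the library (for instance through `registerFunction()`) depends on argument and return conventions that are not described here.

```lua
-- Tiny, incomplete lookup table for illustration only.
local upper_map = {
    ["ä"] = "Ä",
    ["ö"] = "Ö",
    ["ü"] = "Ü",
}

-- Uppercase a UTF-8 string: single-byte (ASCII) characters go through
-- string.upper, multi-byte characters through the lookup table, and
-- anything not in the table is returned unchanged.
local function utf8_upper(s)
    -- One non-continuation byte followed by its continuation bytes
    -- (0x80-0xBF) forms one UTF-8 encoded character.
    local result = s:gsub("[^\128-\191][\128-\191]*", function(ch)
        if #ch == 1 then
            return string.upper(ch)
        end
        return upper_map[ch] or ch
    end)
    return result
end
```

A real implementation would need a complete case-mapping table or a Unicode library; the sketch only shows the byte-level iteration.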
You can use registerFunction() to add your own function definitions. It expects a table with the following fields, in this order:
- function name
- namespace
- function (where the arguments are the context and the provided arguments)
- minimum number of arguments
- maximum number of arguments (-1 if arbitrarily many arguments are allowed)
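
Since the fields are positional, a small wrapper that takes named fields and checks them can make registrations easier to read. This is only a sketch: `function_definition`, the function name `shout` and the namespace `http://example.com/my-functions` are made up for illustration, and the function body is left empty because the argument and return conventions are not covered here.

```lua
-- Hypothetical helper: turn named fields into the positional table
-- described in the list above, with a few sanity checks.
local function function_definition(def)
    assert(type(def.name) == "string", "name must be a string")
    assert(type(def.namespace) == "string", "namespace must be a string")
    assert(type(def.fn) == "function", "fn must be a function")
    assert(type(def.min) == "number" and type(def.max) == "number",
        "min and max must be numbers")
    -- max == -1 means arbitrarily many arguments are allowed
    assert(def.max == -1 or def.min <= def.max, "min must not exceed max")
    return { def.name, def.namespace, def.fn, def.min, def.max }
end

local def = function_definition({
    name = "shout",                                 -- hypothetical name
    namespace = "http://example.com/my-functions",  -- hypothetical namespace
    fn = function(ctx, arg)
        -- body intentionally omitted
    end,
    min = 1,
    max = 1,
})

-- With the module loaded, the table could then be passed on:
-- lxpath.registerFunction(def)
```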
Example:
function fnSubstring(ctx, arg)
...
end
lxpath.registerFunction({ "substring", "http://www.w3.org/2005/xpath-functions", fnSubstring, 2, 3 })

Since the XPath library does not parse XML, it expects a Lua table structure. Each element (a table) has zero or more children, each of which is either a string or another element. The element table has this representation:

{
    [".__name"] = "elementname",
    [".__id"] = 1, -- in document order
    [".__type"] = "element",
    [".__local_name"] = "elementname",
    [".__namespace"] = "",
    [".__ns"] = {
        ["myprefix"] = "http://a.name.space",
    },
    [".__attributes"] = {
        ["key"] = "value",
    },
    [1] = "a string for example",
    [2] = { -- a table for an element
    },
    [3] = "perhaps another string",
}

For example, the following XML

<data>
<child attname="attvalue">
some text
</child>
mixed content
</data>

must be encoded in Lua as:

tbl = {
    [".__type"] = "document",
    {
        [1] = {
            [".__name"] = "data",
            [".__id"] = 1,
            [".__type"] = "element",
            [".__local_name"] = "data",
            [".__namespace"] = "",
            [".__ns"] = {
            },
            [1] = "\n ",
            [2] = {
                [".__name"] = "child",
                [".__id"] = 2,
                [".__type"] = "element",
                [".__local_name"] = "child",
                [".__namespace"] = "",
                [".__ns"] = {
                },
                [".__attributes"] = { ["attname"] = "attvalue", },
                [1] = "\n some text\n ",
            },
            [3] = "\n\n mixed content\n",
        },
    },
}

- Union/except/intersect operators are only partially implemented
- Date functions are not implemented
- No schema support
- Not Unicode aware (see above)
- Since Lua does not have full regular expressions, `matches()` is a stub; provide your own implementation via `registerFunction()`. `replace()` and `tokenize()` are not implemented.
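
Because the library expects XML as Lua tables rather than parsing it, callers have to build the element tables described earlier themselves. The following is a minimal sketch of how that could be done for elements without namespaces. `element` and `assign_ids` are hypothetical helpers, not part of the library, and the sketch only builds the `data` element from the example, not the surrounding document table.

```lua
-- Hypothetical helper: build an element table without namespace support.
-- ".__attributes" is only present when attributes are given.
local function element(name, attributes, children)
    local elt = {
        [".__name"] = name,
        [".__type"] = "element",
        [".__local_name"] = name,
        [".__namespace"] = "",
        [".__ns"] = {},
        [".__attributes"] = attributes,
    }
    -- Children (strings or element tables) go into the array part.
    for i, child in ipairs(children or {}) do
        elt[i] = child
    end
    return elt
end

-- Number all elements in document order (a parent before its children).
local function assign_ids(node, counter)
    counter = counter or { n = 0 }
    counter.n = counter.n + 1
    node[".__id"] = counter.n
    for _, child in ipairs(node) do
        if type(child) == "table" then
            assign_ids(child, counter)
        end
    end
    return node
end

-- The <data> element from the example above.
local data = assign_ids(element("data", nil, {
    "\n ",
    element("child", { attname = "attvalue" }, { "\n some text\n " }),
    "\n\n mixed content\n",
}))
```

The ids are assigned in a separate pass because nested constructor calls evaluate the innermost element first, which would otherwise number children before their parents.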