<pack xmlns="http://ns.qubic.tv/2010/item">
<packitem>
<duration>520</duration>
<max_count>14</max_count>
</packitem>
<packitem>
<duration></duration>
<max_count>23</max_count>
</packitem>
</pack>
Suppose you want to parse this document and retrieve the values as tuples:
from lxml import etree

# xml holds the document above as a string
root = etree.fromstring(xml)
namespaces = {'i': "http://ns.qubic.tv/2010/item"}
packitems_duration = root.xpath('//i:pack/i:packitem/i:duration/text()', namespaces=namespaces)
packitems_max_count = root.xpath('//i:pack/i:packitem/i:max_count/text()', namespaces=namespaces)
packitems = list(zip(packitems_duration, packitems_max_count))
>>> packitems
[('520', '14')]
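The problem is that the zipped result is missing a value. That is because the text() step returns nothing for the empty duration element, rather than None or an empty string, so the two lists have different lengths and zip stops at the shorter one. Let's change that.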
def lxml_empty_str(context, nodes):
    # Replace a missing text (None) with an empty string on each selected element
    for node in nodes:
        node.text = node.text or ""
    return nodes

# Register the function in its own namespace so XPath expressions can call it
ns = etree.FunctionNamespace('http://ns.qubic.tv/lxmlfunctions')
ns['lxml_empty_str'] = lxml_empty_str

namespaces = {'i': "http://ns.qubic.tv/2010/item", 'f': "http://ns.qubic.tv/lxmlfunctions"}
packitems_duration = root.xpath('f:lxml_empty_str(//i:pack/i:packitem/i:duration)/text()', namespaces=namespaces)
packitems_max_count = root.xpath('f:lxml_empty_str(//i:pack/i:packitem/i:max_count)/text()', namespaces=namespaces)
packitems = list(zip(packitems_duration, packitems_max_count))
>>> packitems
[('520', '14'), ('', '23')]
More info on extending lxml: http://lxml.de/extensions.html#xpath-extension-functions
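As a side note, here is a minimal sketch of a different way to get the same pairs without an extension function and without modifying the tree: loop over each packitem and read its children with findtext, which returns an empty string for an element that has no text. This is only an alternative illustration, not the approach described above. The names packitem_pairs, XML and NS are made up for the example, and the fallback import of the standard library ElementTree is an assumption added so the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption for convenience: the stdlib module offers the same
    # fromstring/findall/findtext calls used below.
    import xml.etree.ElementTree as etree

# Same sample document as at the top of the post (hypothetical constant name).
XML = """<pack xmlns="http://ns.qubic.tv/2010/item">
<packitem><duration>520</duration><max_count>14</max_count></packitem>
<packitem><duration></duration><max_count>23</max_count></packitem>
</pack>"""

NS = {"i": "http://ns.qubic.tv/2010/item"}


def packitem_pairs(xml_text):
    """Return one (duration, max_count) tuple per packitem element."""
    root = etree.fromstring(xml_text)
    pairs = []
    # Reading both values from the same packitem keeps them paired,
    # even when one of the children is empty.
    for item in root.findall("i:packitem", NS):
        duration = item.findtext("i:duration", default="", namespaces=NS)
        max_count = item.findtext("i:max_count", default="", namespaces=NS)
        pairs.append((duration, max_count))
    return pairs


print(packitem_pairs(XML))  # [('520', '14'), ('', '23')]
```

Because each tuple is built from a single packitem, an empty value cannot shift the pairing the way two independently collected lists can. The default="" argument only applies when the child element is absent altogether.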