Curse of the MiniDOM

I have spent a depressing evening trying to find out whether minidom or PyXML offers a mechanism to load the BBC traffic TPEG data into a DOM and resolve the entity references in its nodes to their values. No luck. I can’t even use minidom to parse the entity file or DTD directly, as it rejects them. The only way I think I can get this working is to:

– read the entity file line by line and build name/value pairs from the definitions, using a regex to match valid entity definitions.
– (somehow) load the XML file into memory, perform a text replace on all entity references using the name/value dict, and _then_ pass the result to minidom and my parser code, which I have built line by excruciating line.

I don’t like this approach, but I have set myself the goal of doing this, so I will do it.
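
For what it's worth, the two steps listed above might look roughly like the sketch below. It is not the code I ended up with, only a minimal illustration under some assumptions: the entity file contains plain internal declarations of the form `<!ENTITY name "value">`, and the function names (`load_entities`, `expand_entities`, `parse_with_entities`) are made up for this example. It scans the whole entity text with one regex rather than going line by line, which amounts to the same thing for single-line declarations.

```python
import re
from xml.dom import minidom

# Matches simple internal entity declarations, e.g. (hypothetical name)
#   <!ENTITY some_code "accident">
# Parameter entities (<!ENTITY % ...>) and external ones (SYSTEM/PUBLIC)
# do not match and are therefore skipped.
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+([\w.\-]+)\s+(?:"([^"]*)"|\'([^\']*)\')\s*>'
)

# Matches named entity references such as &some_code; but not numeric
# character references such as &#233;.
ENTITY_REF = re.compile(r'&([\w.\-]+);')

# The five predefined XML entities are left for the parser to handle.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def load_entities(entity_text):
    """Build a name -> value dict from the text of an entity file."""
    entities = {}
    for match in ENTITY_DECL.finditer(entity_text):
        name, double_quoted, single_quoted = match.groups()
        entities[name] = double_quoted if double_quoted is not None else single_quoted
    return entities


def expand_entities(xml_text, entities):
    """Replace known entity references in xml_text with their values."""
    def substitute(match):
        name = match.group(1)
        if name in PREDEFINED or name not in entities:
            return match.group(0)  # leave unknown references untouched
        return entities[name]
    return ENTITY_REF.sub(substitute, xml_text)


def parse_with_entities(xml_text, entity_text):
    """Expand entity references as plain text, then hand the result to minidom."""
    expanded = expand_entities(xml_text, load_entities(entity_text))
    return minidom.parseString(expanded)
```

In use, you would read both files into strings with `open(...).read()` and pass them to `parse_with_entities`. One caveat with plain text replacement: an entity value that contains a quote character could break an attribute it is substituted into, so this is only safe for simple values.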

Comments

2 Responses to “Curse of the MiniDOM”

  1. Orbimus on June 15th, 2008 3:50 am

    So, we’re both in the same plight. I am trying to resolve entity references and still no luck :D Oh well… Hope you had better luck than me, even if it was 2.25 years ago :D

  2. Administrator on June 16th, 2008 5:05 pm

    Orbimus, alas, I ended up doing it the hard way. I had to read in the entity file values and then process the file using text replace before passing it to the parser. What a pain in the backside. I have not had to revisit this code since 2006 and am alarmed that 2 years have passed and this still isn’t easy to do in Python.

    Al