How to process an XML document with multiple root elements

by Joerg Lang | Aug 30, 2009

I'm working on an application that needs to display information from a log. The log files are created in the svclog file format. For my purposes I only needed to read the LogEntry elements of the XML file. I started out using LINQ to XML, but was soon presented with an error message saying that I was trying to read an invalid XML file with multiple root elements.

Looking at the data in the file, this turned out to be true. Why can't these log files, which look like XML files, be actual XML files? So how do I get at my data? I certainly didn't want to parse the file myself. The easiest solution would of course be to produce a valid XML file in the first place, but in my situation this was not possible, as the svclog files are produced by another program.
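To make the problem concrete, here is a minimal sketch that reproduces it with an in-memory string instead of a real svclog file. The class name MultiRootDemo, the helper IsWellFormedDocument and the two-entry sample are made up for illustration; real log entries contain more elements.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class MultiRootDemo
{
    // Hypothetical sample: two sibling LogEntry elements and no enclosing root.
    public const string Fragment =
        "<LogEntry><EventId>1</EventId></LogEntry>" +
        "<LogEntry><EventId>2</EventId></LogEntry>";

    // Returns true if the text parses as one well-formed XML document.
    public static bool IsWellFormedDocument(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            // Raised for a second root element (and for other well-formedness errors).
            return false;
        }
    }
}
```

IsWellFormedDocument(MultiRootDemo.Fragment) reports a failure, while the same text enclosed in a single element such as <items>...</items> parses fine. That enclosing element is all the file is missing.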

The solution I came up with was to create a small temporary wrapper file that includes the main XML file as an external entity. This requires only a few lines of extra code, and afterwards you can read and process the document with LINQ.

Here is the code fragment that shows how to do it:

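// Namespaces used by this fragment.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public List<LogItem> GetLogItems(string logFile)
{
    // Create a wrapper for the log file. This is necessary, as the svclog file is
    // not a valid XML document.
    string wrapperFile = GetWrapperFile(logFile);
    XNamespace xmlns = "";

    // Allow DTD processing so that the entity declared in the wrapper is expanded.
    // From .NET 4.0 on, ProhibitDtd is obsolete; the replacement is
    // settings.DtdProcessing = DtdProcessing.Parse.
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ProhibitDtd = false;

    try
    {
        using (XmlReader reader = XmlReader.Create(wrapperFile, settings))
        {
            XDocument document = XDocument.Load(reader);

            // ms_xmlns is a namespace field of the containing class; its
            // declaration is not part of this fragment.
            var q = from c in document.Descendants(xmlns + "LogEntry")
                    select new LogItem()
                    {
                        EventId = (int)c.Element(xmlns + "EventId"),
                        Message = (string)c.Element(xmlns + "Message"),
                        Title = (string)c.Element(xmlns + "Title"),
                        Timestamp = DateTime.Parse((string)c.Element(xmlns + "TimeStampString")),
                        Severity = (string)c.Element(xmlns + "Severity"),
                        Category = (string)c.Element(xmlns + "Categories").Elements(ms_xmlns + "String").FirstOrDefault()
                    };

            return q.ToList();
        }
    }
    finally
    {
        // The wrapper is only needed while loading, so remove it again.
        File.Delete(wrapperFile);
    }
}

private string GetWrapperFile(string logFile)
{
    FileInfo fileInfo = new FileInfo(logFile);
    string wrapperFile = Path.GetTempFileName();

    // {0} is replaced by the full path of the log file, which the DTD declares
    // as an external entity and the <items> root element then includes.
    string xmlWrapper = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><!DOCTYPE items [<!ENTITY logFileFragment SYSTEM \"{0}\">]><items>&logFileFragment;</items>";

    using (StreamWriter sw = new StreamWriter(wrapperFile, false, Encoding.UTF8))
    {
        sw.Write(string.Format(xmlWrapper, fileInfo.FullName));
    }

    return wrapperFile;
}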
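The fragment does not show the LogItem class it fills. A minimal sketch of what it might look like follows; the property names come from the object initializer in the query, and the types are inferred from the casts there, so treat the whole class as an assumption rather than the original definition.

```csharp
using System;

// Hypothetical shape of LogItem, inferred from the query in GetLogItems.
public class LogItem
{
    public int EventId { get; set; }          // read with an (int) cast
    public string Message { get; set; }
    public string Title { get; set; }
    public DateTime Timestamp { get; set; }   // parsed from the TimeStampString element
    public string Severity { get; set; }
    public string Category { get; set; }      // first String child of Categories, or null
}
```

Note that the (int) cast on a missing EventId element throws, whereas the (string) casts simply yield null, so only the string properties tolerate incomplete entries.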


A few notes on the code. The wrapper uses a DTD feature, an external entity declaration, to reference an external XML file. To open this document you cannot pass the file name directly to the XDocument.Load method. You first have to create an XmlReaderSettings object and specify that DTDs are allowed by setting ProhibitDtd = false. Then you create an XmlReader, pass it to the Load method of XDocument, and you're done.
Now you can process the XML file with LINQ.
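For readers on a newer framework, the same idea can be sketched with the DtdProcessing property, which replaced ProhibitDtd in .NET 4.0. This is a variation under assumptions, not the code from the application: the class WrapperDemo and its method names are invented, the entity is referenced through a file URI instead of a plain path, and the explicit XmlUrlResolver is there because newer framework versions may not resolve external entities by default.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class WrapperDemo
{
    // Writes a temporary wrapper document that includes the fragment file
    // as an external entity inside a single <items> root element.
    public static string CreateWrapperFile(string fragmentPath)
    {
        string wrapperPath = Path.GetTempFileName();
        // Assumption: a file URI is used so that special characters in the path are escaped.
        string fragmentUri = new Uri(Path.GetFullPath(fragmentPath)).AbsoluteUri;
        string wrapper =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<!DOCTYPE items [<!ENTITY logFileFragment SYSTEM \"" + fragmentUri + "\">]>" +
            "<items>&logFileFragment;</items>";
        File.WriteAllText(wrapperPath, wrapper, Encoding.UTF8);
        return wrapperPath;
    }

    // Loads the fragment through the wrapper and counts its LogEntry elements.
    public static int CountLogEntries(string fragmentPath)
    {
        string wrapperPath = CreateWrapperFile(fragmentPath);
        try
        {
            var settings = new XmlReaderSettings
            {
                // Successor of ProhibitDtd = false.
                DtdProcessing = DtdProcessing.Parse,
                // Assumption: set explicitly so the external entity is actually loaded.
                XmlResolver = new XmlUrlResolver()
            };
            using (XmlReader reader = XmlReader.Create(wrapperPath, settings))
            {
                XDocument document = XDocument.Load(reader);
                return document.Descendants("LogEntry").Count();
            }
        }
        finally
        {
            // Remove the temporary wrapper once the document is in memory.
            File.Delete(wrapperPath);
        }
    }
}
```

Allowing DTDs and external entities is reasonable for log files you produce yourself, but the same settings should not be applied to XML from untrusted sources, because an external entity can point at any file or URL the process can reach.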




