Best xml questions in November 2011

XMLTimeToDateTime ignores milliseconds

7 votes

Why does XMLTimeToDateTime ignore milliseconds?

  Test := XMLTimeToDateTime('2011-11-11T12:41:36.696+01:00', TRUE);
  T2 := FormatDateTime('yyyy''-''mm''-''dd''T''hh'':''nn'':''ss''.''zzz', Test);

After that, T2 = '2011-11-11T11:41:36.000' (the milliseconds are lost).

I am using Delphi 2007.

The code in XSBuiltIns indeed parses the millisecond part, but this part is never used in encoding functions.

function TXSBaseTime.GetAsTime: TDateTime;
begin
  Result := EncodeTime(Hour, Minute, Second, 0);
end;

and

function TXSBaseCustomDateTime.GetAsDateTime: TDateTime;
var
  BiasDT: TDateTime;
  BiasTime, BiasLocal: Integer;
  BiasHour, BiasMins: Word;
begin
  { NOTE: In XML Years can exceed 9999 - that's not the case for TDateTime.
          So here, there would be a problem with the conversion }
  Result := EncodeDateTime(Year, Month, Day, Hour, Minute, Second, 0);

and

function TXSBaseCustomDateTime.GetAsUTCDateTime: TDateTime;
var
  AdjustDT: TDateTime;
begin
  Result := EncodeDateTime(Year, Month, Day, Hour, Minute, Second, 0);

As the last one is called from XMLTimeToDateTime, it is quite understandable that the millisecond part is always 0.

All parsing and data storage is done in internal (implementation part) classes which cannot be accessed directly except through (broken) wrappers. In other words, you should write your own date/time parser.
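What such a parser has to do can be sketched briefly. The following is written in Python rather than Delphi, purely as an illustration: it splits an xs:dateTime string into its fields, keeps the fractional seconds, and applies the time zone offset to get UTC. The function name parse_xsd_datetime_utc is made up for this example, and treating a value without a time zone as UTC is an assumption of the sketch.

```python
import re
from datetime import datetime, timedelta, timezone

# Date, time, optional fraction, optional zone ("Z" or +hh:mm / -hh:mm).
_XSD_DATETIME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d+))?"
    r"(Z|[+-]\d{2}:\d{2})?$"
)


def parse_xsd_datetime_utc(text):
    """Parse an xs:dateTime string and return an aware datetime in UTC."""
    match = _XSD_DATETIME.match(text)
    if match is None:
        raise ValueError("not an xs:dateTime value: %r" % text)
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    # Pad or truncate the fraction to microseconds so ".696" becomes 696000.
    fraction = match.group(7) or ""
    microsecond = int((fraction + "000000")[:6])
    zone = match.group(8)
    if zone is None or zone == "Z":
        offset = timedelta(0)  # assumption: no zone means UTC
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    local = datetime(year, month, day, hour, minute, second, microsecond,
                     tzinfo=timezone(offset))
    return local.astimezone(timezone.utc)


utc = parse_xsd_datetime_utc("2011-11-11T12:41:36.696+01:00")
print(utc.isoformat())
```

For the string from the question this gives 11:41:36 UTC with the .696 fraction intact, which is the value the Delphi wrapper should have produced.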


In addition to all the ugliness found in XSBuiltIns, XMLTimeToDateTime actually parses the date twice. First TXSDateTime.XSToNative is called, which parses the date/time, throws the result away and stores only the original string; then TXSCustomDateTime.GetAsUTCDateTime parses this string again. Euch!

Schema validation not trimming strings before validating

6 votes

I have a problem with validating my XML file, after it has been automatically formatted. The validation doesn't trim the string before validating it. Is this a bug in the implementation of the XML validation of .NET or is this accepted behavior? If it is accepted behavior, how are cases like this normally handled, because in my opinion, the two XML files are equivalent.

My XSD:

<xs:schema ...>
  ...
  <xs:simpleType name="ItemTypeData">
    <xs:restriction base="xs:string">
      <xs:enumeration value="ItemA" />
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

My XML before formatting (validation passes):

...
<ItemType>ItemA</ItemType>
...

After formatting (validation fails):

...
<ItemType>
  ItemA
</ItemType>
...

Your validator is behaving correctly, given the way the schema is defined. You either need to stop the formatter taking such liberties with the content, or you need to change the schema - for example by making ItemTypeData a restriction of xs:token rather than xs:string (in xs:token, leading and trailing whitespace is considered insignificant).
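The difference between the two base types comes down to the whiteSpace facet: xs:string uses "preserve", xs:normalizedString uses "replace", and xs:token uses "collapse". A rough Python sketch of those three modes (the helper name apply_whitespace_facet is hypothetical, and this is not a schema validator) shows why the reformatted value only matches the enumeration under xs:token:

```python
import re


def apply_whitespace_facet(value, mode):
    """Normalize a lexical value according to an XSD whiteSpace mode."""
    if mode == "preserve":  # xs:string: the value is left untouched
        return value
    # "replace": every tab, line feed and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":  # xs:normalizedString
        return replaced
    if mode == "collapse":  # xs:token: collapse runs of spaces, trim both ends
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace mode: %r" % mode)


ALLOWED = {"ItemA"}              # the enumeration from the schema
formatted = "\n  ItemA\n"        # element content after auto-formatting

print(apply_whitespace_facet(formatted, "preserve") in ALLOWED)  # False
print(apply_whitespace_facet(formatted, "collapse") in ALLOWED)  # True
```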

Haskell parse big xml file with low memory

5 votes

So, I've played around with several Haskell XML libraries, including hexpat and xml-enumerator. After reading the IO chapter in Real World Haskell (http://book.realworldhaskell.org/read/io.html) I was under the impression that if I run the following code, it will be garbage collected as I go through it.

However, when I run it on a big file, memory usage keeps climbing as it runs.

runghc parse.hs bigfile.xml

What am I doing wrong? Is my assumption wrong? Does the map/filter force it to evaluate everything?

import qualified Data.ByteString.Lazy as BSL
import qualified Data.ByteString.Lazy.UTF8 as U
import Prelude hiding (readFile)
import Text.XML.Expat.SAX 
import System.Environment (getArgs)

main :: IO ()
main = do
    args <- getArgs
    contents <- BSL.readFile (head args)
    -- putStrLn $ U.toString contents
    let events = parse defaultParseOptions contents 
    mapM_ print $ map getTMSId $ filter isEvent events

isEvent :: SAXEvent String String -> Bool 
isEvent (StartElement "event" as) = True
isEvent _ = False

getTMSId :: SAXEvent String String -> Maybe String
getTMSId (StartElement _ as) = lookup "TMSId" as

My end goal is to parse a huge xml file with a simple sax-like interface. I don't want to have to be aware of the whole structure to get notified that I've found an "event".

I'm the maintainer of hexpat. This is a bug, which I have now fixed in hexpat-0.19.8. Thanks for drawing it to my attention.

The bug is new on ghc-7.2.1, and it's to do with an interaction that I didn't expect between a where clause binding to a triple, and unsafePerformIO, which I need to make the interaction with the C code appear pure in Haskell.

jQuery doesn't work with <col> XML tags

5 votes

I was using jQuery doing some XML work. Then jQuery told me it can't find the <col> tag giving me empty data. After having a talk with jQuery, seems like it just doesn't like to work with <col> XML tags, maybe some expert can explain this to me?

Here's my XML:

<field>
<property>
    <label>Matrix Question</label>
    <rows>
        <row>row - 1a</row>
        <row>row - 2a</row>
        <row>row - 3a</row>
        <row>row - 4a</row>
    </rows>
    <cols>
        <col>col - 1</col>
        <col>col - 2</col>
        <col>col - 3</col>
        <col>col - 4</col>
        <col>col - 5</col>
    </cols>
    <isrequired>true</isrequired>
</property>
</field>

Here's my code for it:

var xmlWithCol = "<field> <property> <label>Matrix Question</label> <rows> <row>row - 1a</row> <row>row - 2a</row> <row>row - 3a</row> <row>row - 4a</row> </rows> <cols> <col>col - 1</col> <col>col - 2</col> <col>col - 3</col> <col>col - 4</col> <col>col - 5</col> </cols> <isrequired>true</isrequired> </property> </field>";
var xmlWithoutCol = "<field> <property> <label>Matrix Question</label> <rows> <row>row - 1a</row> <row>row - 2a</row> <row>row - 3a</row> <row>row - 4a</row> </rows> <cols> <colm>col - 1</colm> <colm>col - 2</colm> <colm>col - 3</colm> <colm>col - 4</colm> <colm>col - 5</colm> </cols> <isrequired>true</isrequired> </property> </field>"

$(xmlWithCol).find("cols").each(function ()
{
    alert($(this).html());
});

$(xmlWithoutCol).find("cols").each(function ()
{
    alert($(this).html());
});

As you can see, my first output is

col - 1 col - 2 col - 3 col - 4 col - 5 

Then I found out jQuery doesn't like the <col> tag, so I changed it to <colm></colm> instead, and it gives me this:

<colm>col - 1</colm> <colm>col - 2</colm> <colm>col - 3</colm> <colm>col - 4</colm> <colm>col - 5</colm> 

which is what I want.

How can I make my jQuery love the <col> tag?

I don't claim to be an expert, but <col> exists in HTML as an empty element. You can't give that element text, even if you try to use it as an XML element, because jQuery would be breaking the HTML DOM rules otherwise.

Since jQuery was made to work with HTML rather than XML, I can't think of a good workaround besides simply not using the name <col>...
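For contrast, a parser that applies XML rules has no notion of HTML's empty elements and treats col like any other name. The short Python sketch below is only an illustration of that difference (it is not a jQuery workaround), using a shortened copy of the XML from the question:

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's document.
xml_with_col = (
    "<field><property><cols>"
    "<col>col - 1</col><col>col - 2</col>"
    "</cols></property></field>"
)

root = ET.fromstring(xml_with_col)
cols = root.find("property/cols")

# Each <col> is an ordinary element that keeps its text content.
col_texts = [col.text for col in cols]
print(col_texts)  # ['col - 1', 'col - 2']
```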

XmlTextReader vs. XDocument

5 votes

I need to parse XML in .NET. I have the choice between at least XmlTextReader and XDocument. Are there any comparisons between those two (or any other XML parsers contained in the framework)?

Maybe this could help me to decide without trying both of them in depth.

The XML files are expected to be rather small; speed and memory usage are a minor issue compared to ease of use. :-)

(I'm going to use them from C# and/or IronPython.)

Thanks!

If you're happy reading everything into memory, use XDocument. It'll make your life much easier. LINQ to XML is a lovely API.

Use an XmlReader (such as XmlTextReader) if you need to handle huge XML files in a streaming fashion, basically. It's a much more painful API, but it allows streaming (i.e. only dealing with data as you need it, so you can go through a huge document and only have a small amount in memory at a time).

There's a hybrid approach, however - if you have a huge document made up of small elements, you can create an XElement from an XmlReader positioned at the start of the element, deal with the element using LINQ to XML, then move the XmlReader onto the next element and start again.
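The hybrid idea (stream through the document, but handle each small element as an in-memory tree) is not specific to .NET. As a loose analogue only, here is how the same pattern might look in Python with xml.etree.ElementTree.iterparse; the helper name iter_elements and the sample data are invented for this sketch, and it does not use XmlReader or XElement:

```python
import io
import xml.etree.ElementTree as ET


def iter_elements(stream, tag):
    """Yield each completed <tag> element, then drop its contents."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            yield elem      # the caller works with a small, complete subtree
            elem.clear()    # release its children and text before moving on


data = io.BytesIO(
    b"<items>"
    b"<item id='1'><name>first</name></item>"
    b"<item id='2'><name>second</name></item>"
    b"</items>"
)

names = [item.findtext("name") for item in iter_elements(data, "item")]
print(names)  # ['first', 'second']
```

Each element is read while it is still intact and cleared only when the generator resumes, so only a small part of the document's content is held at any moment.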

XML Syntax when Using Colon (:), in Tags

5 votes

I am working on a mobile application and have to read an XML feed and parse the information. It contains a special tag like this: <dc:creator> Jonethon Owens </dc:creator>

In C# I am using LINQ to XML and don't know how to exactly deal with this type of a tag to parse and get the information.

If someone could explain how to achieve this, I would really appreciate it. Thanks in advance.

You need the namespace that the dc prefix is bound to, not just the prefix itself.

XNamespace dc = "http://purl.org/dc/elements/1.1/";

var query = from lst in XElement.Load(@"foo.xml").Elements(dc + "creator")
            select ...
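The point that the element is identified by the namespace URI plus the local name, with the prefix being only shorthand, can be seen in other XML APIs too. A minimal Python illustration follows; the <item> wrapper element is invented for the example and is not taken from the actual feed:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical fragment; only the dc:creator element comes from the question.
feed = (
    '<item xmlns:dc="%s">'
    "<dc:creator> Jonethon Owens </dc:creator>"
    "</item>" % DC
)

item = ET.fromstring(feed)
# The lookup uses the namespace URI, not the "dc" prefix.
creator = item.find("{%s}creator" % DC)
print(creator.text.strip())  # Jonethon Owens
```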

Overriding or ignoring undeclared entities in C# using LINQ

5 votes

I have a little utility that runs through looking for certain things in XML files using LINQ. It processes a MASSIVE collection of them rather quickly and nicely. However, about 20% of a certain batch of files fail to be read and are skipped, failing because of the degree symbol's presence as &deg; in the files. This is the "Reference to undeclared entity 'deg'." a previous question was about.

The solutions offered in the previous question cannot be directly applied here. I am not at liberty to go around modifying the files, and making copies of them and replacing instances or inserting tags in the copies seems inefficient. What would be the best way to go about getting LINQ to ignore the undeclared entities, which have absolutely no bearing on what my program does anyway? Or is there perhaps a good way of getting an XDocument.Load to be fed some entity declarations beforehand?

Unfortunately entities form part of the well-formedness rules for XML (2.1 Well-Formed XML Documents). It seems like you're saying you want the XDocument.Load to load what is notionally an XML file, but does not in fact conform to the rules, which it won't do, quite reasonably.

If your users are passing you what are supposed to be XML files, but that have undefined entities, then either you have to get them to provide the files in a valid format, or manage the incorrectness yourself at load time, in the ways that have been suggested.

It seems to me, from your restrictions, that the neatest approach would be to follow the linked example and create some settings to pass into the XmlReader, along the lines of (Validating an XML Document in the DOM).

If there are entities which aren't defined and aren't listed in public schemas, you'll need to create your own schema which defines all the entities you need. So, create a generic settings object for the XmlReader which references your own custom schema. Add the necessary entities to this schema as files fail to load, and you'll build up a list of all the entities that you need to define for the XML files to be valid.

Then, for each document you try to load, create an XmlReader for the file using the settings above and call the XDocument.Load(XmlReader) overload.
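The general idea of supplying the missing declarations at load time, without touching the files on disk, can be illustrated outside .NET. The Python sketch below is a different mechanism from the XmlReader settings described above: it prepends an internal DTD subset to the text in memory before parsing. The helper name parse_with_entities is hypothetical, and the sketch assumes the input has no DOCTYPE of its own.

```python
import re
import xml.etree.ElementTree as ET

# Declarations for entities the files use but never declare.
ENTITY_DECLS = '<!ENTITY deg "&#176;">'


def parse_with_entities(text, root_name, decls=ENTITY_DECLS):
    """Parse XML text after injecting entity declarations in memory.

    Assumes the text has no DOCTYPE already; the file itself is not changed.
    """
    # A DOCTYPE must come after the XML declaration, if there is one.
    match = re.match(r"<\?xml[^>]*\?>", text)
    split_at = match.end() if match else 0
    doctype = "<!DOCTYPE %s [%s]>" % (root_name, decls)
    return ET.fromstring(text[:split_at] + doctype + text[split_at:])


doc = parse_with_entities("<reading>21&deg;C</reading>", "reading")
print(doc.text)
```

The list of declarations can grow as new undeclared entities turn up, in the same spirit as the custom definitions described above.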

How can I make a comma delimited list of xml values? xml version 1.0

4 votes

I have searched for an answer to this with no luck. I'm sure I have overlooked the answer somewhere. However, I am trying to print/display a subset of xml values as a comma delimited list. Here is an example of what I am trying to do:

XML Doc.

<vehicle>
  <car new="y">
    <yr>2012</yr>
    <make>Ford</make>
    <model>Mustang</model>
    <color>Blue</color>
  </car>
  <car new="y">
    <yr>2012</yr>
    <make>Chevy</make>
    <model>Camaro</model>
    <color>Red</color>
  </car>
  <car new="y">
    <yr>2012</yr>
    <make>Subaru</make>
    <model>Impreza</model>
    <color>White</color>
  </car>
  <car new="n">
    <yr>2000</yr>
    <make>Toyota</make>
    <model>Tacoma</model>
    <color>Silver</color>
  </car>
  <car new="n">
    <yr>1998</yr>
    <make>Dodge</make>
    <model>Durango</model>
    <color>Green</color>
  </car>
</vehicle>

XSL DOC..

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/" >

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="styles2.css" />
  </head>
  <body>
      <h2> New Cars </h2>
      <p>
        <xsl:for-each select="vehicle/car">
        <xsl:sort select="./yr" data-type="text" order="ascending" />
        <xsl:if test="./@new='y'">
        <xsl:value-of select="yr" />
        <xsl:text> </xsl:text>
        <xsl:value-of select="make" />
        <xsl:text> - </xsl:text>
        <xsl:value-of select="model" />
        </xsl:if>
        </xsl:for-each>
      </p>
  </body>
</html>
  </xsl:template>
 </xsl:stylesheet> 

So, in this example I want to select all "new" cars and place them in a comma delimited list and not have a comma after the last item in the list.

I can't use <xsl:if test="position() != last()"> since the position of the "last" "new" car may not be the "last" position in the xml. I would prefer this to be done in xml version 1.0.

Any Suggestions or Ideas? Thanks in advance!

Example output:

2012 Ford Mustang, 2012 Chevy Camaro, 2012 Subaru Impreza

Use

<xsl:for-each select="vehicle/car[@new='y']">

instead of getting them all and using an if test. Then you can use last().
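The reason this works is that position() and last() are evaluated against the node set being iterated, so once the predicate has removed the old cars, the last node really is the last new car. The same "filter first, then separate" logic, sketched in Python on a shortened copy of the data (an analogue for illustration, not XSLT):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<vehicle>
  <car new="y"><yr>2012</yr><make>Ford</make><model>Mustang</model></car>
  <car new="n"><yr>2000</yr><make>Toyota</make><model>Tacoma</model></car>
  <car new="y"><yr>2012</yr><make>Chevy</make><model>Camaro</model></car>
  <car new="n"><yr>1998</yr><make>Dodge</make><model>Durango</model></car>
</vehicle>
""")

# Filter first, so "last" refers to the last *new* car.
new_cars = doc.findall("car[@new='y']")

# Joining puts a separator between items only, never after the last one.
line = ", ".join(
    "%s %s %s" % (car.findtext("yr"), car.findtext("make"), car.findtext("model"))
    for car in new_cars
)
print(line)  # 2012 Ford Mustang, 2012 Chevy Camaro
```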

Anchor align within the column

4 votes

I have a graph combining 3D columns and lines (MSColumn3DLineDY) and the anchors are not aligning to the center of columns. In this example, the anchors are aligned to center without any specific property to do it.

Here's my graph:

my graph

The opening chart tag of the chart above:

<chart caption=''
       palette='2'
       animation='1'
       showValues='0'
       formatNumberScale='0'
       numberPrefix=''
       slantLabels='1'
       showLabels='1'
       rotateValues='0'
       placeValuesInside='0'
       labelDisplay='ROTATE'
       seriesNameInToolTip='1'
       anchorBorderColor='339966'
       decimalSeparator=','
       thousandSeparator='.'
       syAxisMaxValue='$maximo'
       pyAxisMaxValue='$maximo'>

A chart with the correct align I am after:

another chart

And its chart tag:

<chart caption=''
       PYAxisName='Quantidade'
       SYAxisName='Valores (Em R$/Mil)'
       palette='2'
       animation='1'
       showValues='0'
       formatNumberScale='0'
       numberPrefix=''
       slantLabels='1'
       showLabels='1'
       rotateValues='0'
       placeValuesInside='0'
       labelDisplay='ROTATE'
       seriesNameInToolTip='1'
       anchorBorderColor='FFFF33'
       decimalSeparator=','
       thousandSeparator='.'
       baseFontSize='8'>

They are almost the same!

It seems you have a blank <dataset> in your data. Please check.

The anchors align to the center of the column group. If you have two column datasets, one of which is blank or invisible, the anchors are aligned to the center of both possible columns, which is why they appear off-center relative to the visible column.

How do I Set XmlArrayItem Element name for a List<Custom> implementation?

4 votes

I want to create a custom XML structure as follows:

<Hotels>
    <Hotel />
</Hotels>

I've created an implementation of List just to be able to do this. My code is as follows:

[XmlRootAttribute(ElementName="Hotels")]
public class HotelList: List<HotelBasic>

Because the class that List holds is not named Hotel but HotelBasic my xml is like

<Hotels>
   <HotelBasic />
</Hotels>

How do I fix this without having to implement ISerializable or IEnumerable?

Assuming you are using XmlSerializer, if all you want to do is change how your HotelBasic class is serialized, you can use XmlTypeAttribute:

[XmlType(TypeName = "Hotel")]
public class HotelBasic
{
    public string Name { get; set; }
}

When used with your HotelList class it will be serialized as:

<Hotels xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Hotel>
    <Name>One</Name>
  </Hotel>
  <Hotel>
    <Name>Two</Name>
  </Hotel>
</Hotels>

XML Schema to validate each value in an NMTOKENS attribute list

4 votes

Given this XML file:

<users blessed="phrogz alians">
  <user name="phrogz"  id="42" />
  <user name="lachtok" id="3"  />
  <user name="vielee"  id="5"  />
  <user name="alians"  id="17" />
</users>

...is it possible to create an XSD key/keyref style validation that ensures that each value in the blessed list matches against an existing user/@name?

If this is not possible with XSD, is it possible with RelaxNG?

No, it's not possible with XSD 1.0. It's straightforward in XSD 1.1, of course, using assertions:

Uniqueness (if defined at the level of the users element):

<xs:assert test="count(data(@blessed)) = count(distinct-values(@blessed))"/>

Referential integrity (if defined at the level of the users element):

<xs:assert test="every $t in data(@blessed) satisfies $t = user/@name"/>
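To make clear what the two assertions check, here is the same logic as a plain Python sketch. It is not XSD; the function name check_blessed is invented, and it simply splits the attribute on whitespace the way a list of name tokens would be read:

```python
import xml.etree.ElementTree as ET


def check_blessed(users):
    """Return (unique, all_known) for the blessed list of a <users> element."""
    blessed = (users.get("blessed") or "").split()
    names = {user.get("name") for user in users.findall("user")}
    unique = len(blessed) == len(set(blessed))         # uniqueness
    all_known = all(token in names for token in blessed)  # referential integrity
    return unique, all_known


users = ET.fromstring("""
<users blessed="phrogz alians">
  <user name="phrogz"  id="42" />
  <user name="lachtok" id="3"  />
  <user name="vielee"  id="5"  />
  <user name="alians"  id="17" />
</users>
""")

print(check_blessed(users))  # (True, True)
```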