Best xml questions in February 2012

Internet explorer intercepts XML response

7 votes

I have a form whose target is an iframe.

When submitting the form, the response is XML and I have Javascript that analyzes the response.

I noticed that when running on IE, the browser intercepts the response and treats it as an RSS feed, so my code never receives the response. If I disable RSS feeds (Internet Options, Content tab), everything works.

I set the content type of the response to “text/xml; charset=UTF-8” but still it does not work.

Is there any workaround?

The best workaround would be not to use an iframe in this case. It sounds like IE is catching the http response and reading it on its own. Is there a reason you're not making an AJAX call to retrieve the information? It sounds like you're relying on JavaScript to handle the response anyway, so I would think that using an XMLHttpRequest object would be better for you: http://www.w3.org/TR/XMLHttpRequest/

If that's too complicated, look into a library like jQuery: http://jquery.com/ that has built in (and much simpler) functions to make AJAX calls and handle responses.

To expand on this, you would bind the submit function of the form to a JS function (or use jQuery to do it) and pick up the form data, send it in an AJAX request, and handle the response. jQuery has a built in function serialize() which is meant to convert form data on a page into information ready for use in the ajax() function to send to the server. If you're unfamiliar with the XMLHttpRequest object, I would highly suggest using a library like jQuery for this task.

Checking XML for expected structure

6 votes

I am calling a function which returns a string containing XML data. How this function works is not important but the resulting xml can be different depending on the success of the function.

Basically the function will return either the expected XML or an error formatted as XML. Below are basic samples of what the two results might look like...

On Success:

<SpecificResult>
    <Something>data</Something>
</SpecificResult>

On Error:

<ErrorResult>
    <ErrorCode>1</ErrorCode>
    <ErrorMessage>An Error</ErrorMessage>
</ErrorResult>

My system is set up so that I can convert an XML string to a class with a simple converter function, but this requires me to know the class type. On success, I know it is SpecificResult and I can convert. But I want to check first whether an error occurred.

The ideal end result would allow something similar to this...

string xml = GetXML();
if(!IsError(xml))
{
   //convert to known type and process
}

So the question is, what is the best way to implement the IsError function?

I have thought of a couple of options but not sure if I like any of them really...

  1. check if xml string contains "<ErrorResult>"
  2. try to convert xml to ErrorResult class and check for fail
  3. use XDocument or similar built in functions to parse the tree and search for ErrorResult node

Since the GetXml() method essentially returns untyped data, and the only safe assumption is that it is structured as XML, the safest way to determine its actual type is to parse it as XML:

private bool IsError(string xml)
{
    var document = XDocument.Parse(xml);
    return document.Element("ErrorResult") != null;
}
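
The same root-element check can be sketched outside .NET. The following Python version is only an illustration, not part of the original answer: the name is_error is hypothetical, and treating input that is not well-formed XML as an error is an assumption you may not want.

```python
import xml.etree.ElementTree as ET


def is_error(xml_text: str) -> bool:
    """Return True when the document's root element is <ErrorResult>."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Assumption: input that is not well-formed XML counts as an error.
        return True
    return root.tag == "ErrorResult"
```

Checking the root element's name, rather than searching the raw string for "<ErrorResult>", avoids false positives when that text merely appears inside a successful result.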

sql query to produce xml output

5 votes

I have these tables

Table 1 tbl1_site [facilityId] [name]

Table 2 tbl2_applicant [pvid] [facilityId] [npi] [firstname]

FK join key: tbl1_site.facilityId = tbl2_applicant.facilityId

Table 3 tbl3_abstraction [pvid] [patientnumber] [diabetesdiagnosis] [dateofbirth]

FK join key: tbl2_applicant.pvId = tbl3_abstraction.pvId

I am having trouble writing a SQL query that reproduces this XML output.

Thanks.

<account>
    <metadata />
    <practice-sites>
        <practice-site>
            <metadata>
                <data-element id="name">
                    <value>My Own Diabetes Medical Center</value>
                </data-element>
            </metadata>
            <applicants>
                <metadata />
                <applicant>
                    <metadata>
                        <data-element id="npi">
                            <value>1234567890</value>
                        </data-element>
                        <data-element id="firstname">
                            <value>Joseph</value>
                        </data-element>
                    </metadata>
                    <clinical-abstractions>                           
                        <clinical-abstraction>
                            <data-element id="diabetesdiagnosis">
                                <value>Backward</value>
                            </data-element>
                            <data-element id="dateofbirth">
                                <value>02/01/2009</value>
                            </data-element>
                            <data-element id="patientnumber">
                                <value>1</value>
                            </data-element>
                        </clinical-abstraction>
                    </clinical-abstractions>
                </applicant>
            </applicants>
        </practice-site>
    </practice-sites>
</account>

Do you really need all those tags? I mean the "metadata" and "data-element" wrappers.

Try this query; it returns the data nested in the hierarchy you need, without those wrapper tags:

select  t1.name as PracticeSite,
        (SELECT t2.npi as NPI, 
                t2.firstname,
                (SELECT t3.patientnumber, 
                        t3.diabetesdiagnosis, 
                        t3.dateofbirth
                        FROM tbl3_abstraction t3
                        WHERE t3.pvId=t2.pvId
                        FOR XML PATH('clinical-abstraction'), TYPE
                        ) as 'clinical-abstractions'
                FROM tbl2_applicant t2
                WHERE t1.[facilityId]=t2.[facilityId]
                FOR XML PATH('Applicant'), TYPE
        ) AS 'Applicants'
from tbl1_site t1
FOR XML path('PracticeSites'), root('account'), ELEMENTS;
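
If the metadata and data-element wrappers really are required, one option is to build the document in application code from plain query results. The sketch below is an assumption-laden illustration in Python, not a SQL Server solution: the function names are hypothetical, and it assumes each table has been fetched as a list of dictionaries keyed by the column names from the question.

```python
import xml.etree.ElementTree as ET


def data_element(parent, field, value):
    """Append <data-element id="field"><value>value</value></data-element>."""
    element = ET.SubElement(parent, "data-element", {"id": field})
    ET.SubElement(element, "value").text = str(value)


def build_account(sites, applicants, abstractions):
    """Nest applicants under sites and abstractions under applicants."""
    account = ET.Element("account")
    ET.SubElement(account, "metadata")
    practice_sites = ET.SubElement(account, "practice-sites")
    for site in sites:
        site_el = ET.SubElement(practice_sites, "practice-site")
        data_element(ET.SubElement(site_el, "metadata"), "name", site["name"])
        applicants_el = ET.SubElement(site_el, "applicants")
        ET.SubElement(applicants_el, "metadata")
        # Join on facilityId, as in the first foreign key of the question.
        for app in (a for a in applicants if a["facilityId"] == site["facilityId"]):
            app_el = ET.SubElement(applicants_el, "applicant")
            meta = ET.SubElement(app_el, "metadata")
            for field in ("npi", "firstname"):
                data_element(meta, field, app[field])
            abstractions_el = ET.SubElement(app_el, "clinical-abstractions")
            # Join on pvid, as in the second foreign key of the question.
            for row in (r for r in abstractions if r["pvid"] == app["pvid"]):
                row_el = ET.SubElement(abstractions_el, "clinical-abstraction")
                for field in ("diabetesdiagnosis", "dateofbirth", "patientnumber"):
                    data_element(row_el, field, row[field])
    return account
```

Calling ET.tostring(build_account(sites, applicants, abstractions), encoding="unicode") would then serialize the tree.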

Merging XML in an SQL Server

5 votes

Let's say I have the following two pieces of XML in my database

<!-- XML 1 -->
<pairs>
    <item key="a">xml 1 a value</item>
    <item key="b">xml 1 b value</item>
    <item key="c">xml 1 c value</item>
</pairs>

<!-- XML 2 -->    
<pairs>
    <item key="c">xml 2 c value</item>
    <item key="d">xml 2 d value</item>
    <item key="e">xml 2 e value</item>
</pairs>

This data is stored in two separate tables using the XML datatype. Additionally, the XML column is linked to a schema that describes the expected format, e.g.

[PairData] [xml](CONTENT [foo].[Pairs]) NULL

Within a stored procedure / function I would like to merge these two XML structures into the following:

<pairs>
    <item key="a">xml 1 a value</item>
    <item key="b">xml 1 b value</item>
    <item key="c">xml 2 c value</item>
    <item key="d">xml 2 d value</item>
    <item key="e">xml 2 e value</item>
</pairs>

So, from the first piece of xml we have taken items:

a, b

from the second piece of xml we have taken items:

c, d, e  

Notice that the two pieces of XML have a common item with a key of:

c

In this scenario the value from xml 2 should be used in the merged xml (discarding the value from xml 1). Another case is that XML 1 or XML 2 could be NULL, in which case the merge should simply return the other. If both are NULL, NULL is returned.

As an aside, in our current implementation we are returning both XML documents from the DB and doing the merge in code. However we would prefer to have this merge done within the DB as multiple unrelated processes are calling this proc.

Use:

declare @x1 xml ='<pairs>
    <item key="a">xml 1 a value</item>
    <item key="b">xml 1 b value</item>
    <item key="c">xml 1 c value</item>
</pairs>'

declare @x2 xml ='<pairs>
    <item key="c">xml 2 c value</item>
    <item key="d">xml 2 d value</item>
    <item key="e">xml 2 e value</item>
</pairs>'

select *
from
(
    select isnull(t2.a, t1.a) [@key], isnull(t2.b, t1.b) [text()]
    from
    (
        select t.c.value('@key', 'nvarchar(max)') [a], t.c.value('.', 'nvarchar(max)') [b]
        from @x1.nodes('/*/item') t(c)
    )t1
    full join
    (
        select t.c.value('@key', 'nvarchar(max)') [a], t.c.value('.', 'nvarchar(max)') [b]
        from @x2.nodes('/*/item') t(c)
    )t2 on t2.a = t1.a
)t
for xml path('item'), root('pairs')

Output:

<pairs>
  <item key="a">xml 1 a value</item>
  <item key="b">xml 1 b value</item>
  <item key="c">xml 2 c value</item>
  <item key="d">xml 2 d value</item>
  <item key="e">xml 2 e value</item>
</pairs>

UPDATE:

declare @x1 xml ='<pairs>
    <item key="a">xml 1 a value</item>
    <item key="b">xml 1 b value</item>
    <item key="c">xml 1 c value</item>
</pairs>'

declare @x2 xml ='<pairs>
    <item key="c">xml 2 c value</item>
    <item key="d">xml 2 d value</item>
    <item key="e">xml 2 e value</item>
</pairs>'

declare @t1 table(id int, data xml)
insert @t1 values(1, @x1)

declare @t2 table(id int, data xml)
insert @t2 values(1, @x2)

select isnull(t2.a, t1.a) [@key], isnull(t2.b, t1.b) [text()]
from
(
    select t.c.value('@key', 'nvarchar(max)') [a], t.c.value('.', 'nvarchar(max)') [b]
    from @t1 ta
    cross apply ta.data.nodes('/*/item') t(c)
)t1
full join
(
    select t.c.value('@key', 'nvarchar(max)') [a], t.c.value('.', 'nvarchar(max)') [b]
    from @t2 ta
    cross apply ta.data.nodes('/*/item') t(c)
)t2 on t2.a = t1.a
for xml path('item'), root('pairs')
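
For comparison, the merge rule itself (keys from both documents, the second document wins on duplicates, NULL inputs pass through) can be sketched in application code. This Python version is only an illustration of the rule under stated assumptions: merge_pairs is a hypothetical name, None stands in for SQL NULL, and it is not a replacement for the T-SQL above.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def merge_pairs(xml1: Optional[str], xml2: Optional[str]) -> Optional[str]:
    """Merge two <pairs> documents by item key; xml2 wins on duplicates."""
    if xml1 is None and xml2 is None:
        return None
    merged = {}  # dicts keep insertion order, so keys from xml1 come first
    for source in (xml1, xml2):
        if source is None:
            continue
        for item in ET.fromstring(source).findall("item"):
            # A later assignment overwrites the value but keeps the position.
            merged[item.get("key")] = item.text
    root = ET.Element("pairs")
    for key, value in merged.items():
        ET.SubElement(root, "item", {"key": key}).text = value
    return ET.tostring(root, encoding="unicode")
```

Processing xml1 before xml2 is what gives xml2 priority: its values overwrite any entry already stored under the same key.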

FxCop Complaint: Exposed concrete xml types and a bad improvement

4 votes

I want to save certain classes, and since XML serialization won't do in my case, I'm saving the values manually into an XML document. This works fine, but FxCop doesn't like it. Since FxCop normally gives good advice and explains why I shouldn't do things a certain way, I try to keep it happy.

This time, I don't understand how this is an improvement.

This is what i had:

public void Save()
{
      XmlDocument doc = new XmlDocument();
      XmlNode XmlNodeJob = doc.CreateElement("Job");
      doc.AppendChild(XmlNodeJob);
      OtherclassSave2(XmlNodeJob);//Node as Parameter
 }

 public void OtherclassSave2(XmlNode node)
 {

 }

And this is what FxCop complained about: "Modify member 'OtherclassSave2(XmlNode)' so that it no longer exposes the concrete type 'XmlNode'. Use IXPathNavigable to represent XML data sources."

And now my awesome solution:

    public void Save()
    {
        XmlDocument doc = new XmlDocument();
        XmlNode XmlNodeJob = doc.CreateElement("Job");
        doc.AppendChild(XmlNodeJob);
        OtherclassSave2(XmlNodeJob.CreateNavigator());//Interface from a node's navigator
    }

    public void OtherclassSave2(IXPathNavigable nav)
    {
        XmlNode node = (XmlNode)(nav.CreateNavigator().UnderlyingObject);

    }

This way I get my node in the other method and FxCop is happy, but I really don't see the improvement, and I need a node to add things to, not something to read from.

I thought about changing void SaveInThisNode(XmlNode) into XmlNode GetMeTheNode(), but to create nodes via CreateElement I need the XmlDocument object, which I am not allowed to use as a parameter. I could create a new XmlDocument in every step, fine.

My solution was simple and worked fine for everything I wanted it to do, but FxCop does not seem to allow solutions that are not obviously worse and more complicated.

It is just suggesting that you do not couple yourself with the specific implementation of XmlNode in the method signature. This allows you to change the internal implementation without affecting anything using the class.

It is suggested that you can ignore the warning if you need specific functionality from the concrete class. If this is a public facing API you should try to decouple as much as you can which will give you the freedom to change implementation with less chance of changing the method signatures and thus forcing consumers of the API to change their implementation.

CA1059: Members should not expose certain concrete types

How to evaluate expressions in this tree?

4 votes

Here is an example of a parsed xml file I'm working with that tabs it into a tree form

commandList

  assign
    variable
      #text[a]
    expression-int
      #text[1]

  assign
    variable
      #text[b]
    expression-int
      #text[2]

  assign
    variable
      #text[c]
    expression-operation
      operator
        #text[OP_SET]
      arguments
        expression-variable
          variable
            #text[a]
        expression-variable
          variable
            #text[b]

  assign
    variable
      #text[d]
    expression-operation
      operator
        #text[OP_SET]
      arguments
        expression-operation
          operator
            #text[OP_TUPLE]
          arguments
            expression-int
              #text[1]
            expression-int
              #text[2]
        expression-operation
          operator
            #text[OP_TUPLE]
          arguments
            expression-int
              #text[3]
            expression-int
              #text[4]

I hope this input isn't difficult to understand. Here is what it looks like normally when not parsed from an XML file:

a := 1;
b := 2;
c := {1,2};
d := {(1,2),(3,4)};

etc...

All of the assignment pairs (that is, a value and a variable) are to be stored in a hashmap so that the value can be looked up by its variable and used in later expressions. I'm to use a recursive descent evaluator (I think?) to reduce the expressions according to the grammar.

I've googled all sorts of things for the past 24 hours now and have seen a lot of tree evaluators for basic arithmetic (e.g. 2 + 3 * 8, etc) but haven't been able to see how that would work for my specific tree.

The code I've written so far gets as far as finding the variable names (a, b, c, d, e, etc.), but I can't begin to think of how to code the recursion that will provide the right values for the hash map.

public void evaluate(Node node){
    HashMap<String, String> assignments = new HashMap<String, String>();
    NodeList assignment = node.getChildNodes();
    for (int i=0; i < assignment.getLength(); i++){ //1 to 13
        Node assign = assignment.item(i);
        Node variable = this.getChild(assign, 0);
        Node varValNode = this.getChild(variable, 0);
        String varVal = varValNode.getNodeValue();

        Node expression = this.getChild(assign, 1);

The document, node and nodelist classes for my tree are unusual in that they don't allow a 'getChild' method which I think would save a lot of time. Does anybody know why this is?

Really random problem here and I hope it made sense. Please ask me to elaborate on anything that is unclear and I will try the best that I can. I'm not looking for anyone to solve the problem for me but merely instruct me on how to decide how to code this recursive algorithm.

EDIT: Also, the second 'input' I put above was actually the output. It should have been this:

a := 1;
b := 2;
c := @set(a,b);
d := @set(@tuple(1,2),@tuple(3,4));

Assuming that all your values are of integer type, you should create a HashMap<String,Integer> to store variable values, and pass it to your evaluate method:

public static void main(String[] args) {
    NodeList commandList = ... // get your XML from somewhere
    Map<String,Integer> vars = new HashMap<String,Integer>();
    // org.w3c.dom.NodeList is not Iterable, so walk it by index
    for (int i = 0; i < commandList.getLength(); i++) {
        evaluate(commandList.item(i), vars);
    }
    // At this point, vars contains values of all variables assigned in commands
    // from the command list
}

The evaluation should become relatively straightforward:

private static Integer evaluate(Node node, Map<String,Integer> vars) {
    if (node is an assignment) {
        String varName = ... // get variable name from node
        Node expr = ... // get the node representing expression being assigned
        Integer value = evaluate(expr, vars);
        vars.put(varName, value);
        return value;
    }
    if (node is a variable reference) {
        String varName = ... // get variable name from node
        return vars.get(varName);
    }
    if (node is an integer constant) {
        String constVal = ... // Get the representation from XML
        return Integer.decode(constVal);
    }
    if (node is a binary expression) {
        Node lhs = ... // Get the left-hand side expression from the node
        Node rhs = ... // Get the right-hand side expression from the node
        Integer lhsVal = evaluate(lhs, vars); 
        Integer rhsVal = evaluate(rhs, vars);
        if (operator is plus) {
            return new Integer(((int)lhsVal) + ((int)rhsVal));
        }
        if (operator is minus) {
            return new Integer(((int)lhsVal) - ((int)rhsVal));
        }
        if (operator is multiply) {
            return new Integer(((int)lhsVal) * ((int)rhsVal));
        }
        if (operator is divide) {
            return new Integer(((int)lhsVal) / ((int)rhsVal));
        }
        // ... and so on
    }
    // ... and so on
}
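
The answer above assumes integer values, while the tree in the question uses OP_SET and OP_TUPLE. The same recursive shape carries over; the following is a minimal sketch in Python rather than Java. The function names are hypothetical, and the XML layout (element names taken from the printed tree, with the expression as the second child of each assign) is an assumption.

```python
import xml.etree.ElementTree as ET


def evaluate(node, variables):
    """Recursively evaluate one expression node against known variables."""
    if node.tag == "expression-int":
        return int(node.text.strip())
    if node.tag == "expression-variable":
        return variables[node.find("variable").text.strip()]
    if node.tag == "expression-operation":
        operator = node.find("operator").text.strip()
        # Evaluate every argument first, then combine the results.
        args = [evaluate(arg, variables) for arg in node.find("arguments")]
        if operator == "OP_SET":
            return frozenset(args)  # frozenset so sets can nest inside sets
        if operator == "OP_TUPLE":
            return tuple(args)
        raise ValueError(f"unsupported operator: {operator}")
    raise ValueError(f"unsupported node: {node.tag}")


def run(command_list):
    """Evaluate every <assign> in order and return the variable table."""
    variables = {}
    for assign in command_list.findall("assign"):
        name = assign.find("variable").text.strip()
        expression = assign[1]  # assumption: second child is the expression
        variables[name] = evaluate(expression, variables)
    return variables
```

Because assignments are processed in document order, a variable such as c can refer to a and b as long as they were assigned earlier.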

Serious Memory Leak When Iteratively Parsing XML Files

4 votes

Context

When iterating over a set of Rdata files (each containing a character vector of HTML code) that are loaded, analyzed (via XML functionality) and then removed from memory again, I experience a significant increase in an R process' memory consumption (killing the process eventually).

It just seems like

  • freeing objects via free(),
  • removing them via rm() and
  • running gc()

do not have any effect, so the memory consumption accumulates until there's no more memory left.

EDIT 2012-02-13 23:30:00

Thanks to valuable insight shared by the author and maintainer of package XML, Duncan Temple Lang (again: I really appreciate it very much!), the problem seems to be closely related to the way external pointers are freed and how garbage collection is handled in the XML package. Duncan issued a bug-fixed version of the package (3.92-0) that consolidated certain aspects of parsing XML and HTML and features an improved garbage collection where it's not necessary anymore to explicitly free the object containing the external pointer via free(). You find the source code and a Windows binary at Duncan's Omegahat website.


EDIT 2012-02-13 23:34:00

Unfortunately, the new package version still does not seem to fix the issues I'm encountering in the little example that I've put together. I followed some suggestions and simplified the example a bit, making it easier to grasp and to find the relevant functions where things seem to go wrong (check functions ./lib/exampleRun.R and ./lib/scrape.R).


EDIT 2012-02-14 15:00:00

Duncan suggested trying to force to free the parsed document explicitly via .Call("RS_XML_forceFreeDoc", html). I've included a logical switch in the example (do.forcefree in script ./scripts/memory.R) that, if set to TRUE, will do just that. Unfortunately, this made my R console crash. It'd be great if someone could verify this on their machine! Actually, the doc should be freed automatically when using the latest version of XML (see above). The fact that it isn't seems to be a bug (according to Duncan).


EDIT 2012-02-14 23:12:00

Duncan pushed yet another version of XML (3.92-1) to his Omegahat website. This should fix the issue in general. However, I seem to be out of luck with my example as I still experience the same memory leakage.


EDIT 2012-02-17 20:39:00 > SOLUTION!

YES! Duncan found and fixed the bug! It was a little typo in a Windows-only script, which explains why the bug didn't show on Linux, Mac OS etc. Check out the latest version, 3.92-2! Memory consumption is now as constant as can be when iteratively parsing and processing XML files!

Special thanks again to Duncan Temple Lang and thanks to everyone else that responded to this question!


>>> LEGACY PARTS OF THE ORIGINAL QUESTION <<<

Example Instructions (edited 2012-02-14 15:00:00)

  1. Download folder 'memory' from my Github repo.
  2. Open up the script ./scripts/memory.R and set a) your working directory at line 6, b) the example scope at line 16, and c) whether to force the freeing of the parsed doc at line 22. Note that you can still find the old scripts; they are "tagged" with "LEGACY" at the end of the filename.
  3. Run the script.
  4. Investigate the latest file ./memory_<TIMESTAMP>.txt to see the increase in logged memory states over time. I've included two text files that resulted from my own test runs.

Things I've done with respect to memory control

  • making sure a loaded object is removed again via rm() at the end of each iteration.
  • When parsing XML files, I've set argument addFinalizer=TRUE, removed all R objects that have a reference to the parsed XML doc before freeing the C pointer via free() and removing the object containing the external pointer.
  • adding a gc() here and there.
  • trying to follow the advice in Duncan Temple Lang's notes on memory management when using his XML package (I have to admit, though, that I did not fully comprehend what's stated there)

EDIT 2012-02-13 23:42:00: As I pointed out above, explicit calls to free() followed by rm() should not be necessary anymore, so I commented these calls out.

System Info

  • Windows XP 32 Bit, 4 GB RAM
  • Windows 7 32 Bit, 2 GB RAM
  • Windows 7 64 Bit, 4 GB RAM
  • R 2.14.1
  • XML 3.9-4
  • XML 3.92-0 as found at http://www.omegahat.org/RSXML/

Initial Findings as of 2012-02-09 01:00:00

  1. Running the webscraping scenario on several machines (see section "System Info" above) always busts the memory consumption of my R process after about 180 - 350 iterations (depending on OS and RAM).
  2. Running the plain rdata scenario yields constant memory consumption if and only if you set an explicit call to the garbage collector via gc() in each iteration; else you experience the same behavior as in the webscraping scenario.

Questions

  1. Any idea what's causing the memory increase?
  2. Any ideas how to work around this?

Findings as of 2012-02-13 23:44:00

Running the example in ./scripts/memory.R on several machines (see section "System Info" above) still busts the memory consumption of my R process after about 180 - 350 iterations (depending on OS and RAM).

There's still an evident increase in memory consumption and even though it may not appear to be that much when just looking at the numbers, my R processes always died at some point due to this.

Below, I've posted a couple of time series that resulted from running my example on a WinXP 32 Bit box with 2 GB RAM:

TS_1 (XML 3.9-4, 2012-02-09)

29.07 33.32 30.55 35.32 30.76 30.94 31.13 31.33 35.44 32.34 33.21 32.18 35.46 35.73 35.76 35.68 35.84 35.6 33.49 33.58 33.71 33.82 33.91 34.04 34.15 34.23 37.85 34.68 34.88 35.05 35.2 35.4 35.52 35.66 35.81 35.91 38.08 36.2

TS_2 (XML 3.9-4, 2012-02-09)

28.54 30.13 32.95 30.33 30.43 30.54 35.81 30.99 32.78 31.37 31.56 35.22 31.99 32.22 32.55 32.66 32.84 35.32 33.59 33.32 33.47 33.58 33.69 33.76 33.87 35.5 35.52 34.24 37.67 34.75 34.92 35.1 37.97 35.43 35.57 35.7 38.12 35.98

Error Message associated to TS_2

[...]
Scraping html page 30 of ~/data/rdata/132.rdata
Scraping html page 31 of ~/data/rdata/132.rdata
error : Memory allocation failed : growing buffer
error : Memory allocation failed : growing buffer
I/O error : write error
Scraping html page 32 of ~/data/rdata/132.rdata
Error in htmlTreeParse(file = obj[x.html], useInternalNodes = TRUE, addFinalizer = TRUE): 
 error in creating parser for (null)
> Synch18832464393836

TS_3 (XML 3.92-0, 2012-02-13)

20.1 24.14 24.47 22.03 25.21 25.54 23.15 23.5 26.71 24.6 27.39 24.93 28.06 25.64 28.74 26.36 29.3 27.07 30.01 27.77 28.13 31.13 28.84 31.79 29.54 32.4 30.25 33.07 30.96 33.76 31.66 34.4 32.37 35.1 33.07 35.77 38.23 34.16 34.51 34.87 35.22 35.58 35.93 40.54 40.9 41.33 41.6

Error Message associated to TS_3

[...]
---------- status: 31.33 % ----------

Scraping html page 1 of 50
Scraping html page 2 of 50
[...]
Scraping html page 36 of 50
Scraping html page 37 of 50
Error: 1: Memory allocation failed : growing buffer
2: Memory allocation failed : growing buffer

Edit 2012-02-17: please help me verifying counter value

You'd do me a huge favor if you could run the following code. It won't take more than 2 minutes of your time. All you need to do is

  1. Download an Rdata file and save it as seed.Rdata.
  2. Download the script containing my scraping function and save it as scrape.R.
  3. Source the following code after setting the working directory accordingly.

Code:

setwd("set/path/to/your/wd")
install.packages("XML", repos="http://www.omegahat.org/R")
library(XML)
source("scrape.R")
load("seed.Rdata")
html <- htmlParse(obj[1], asText = TRUE)
counter.1 <- .Call("R_getXMLRefCount", html)
print(counter.1)
z <- scrape(html)
gc()
gc()
counter.2 <- .Call("R_getXMLRefCount", html)
print(counter.2)
rm(html)
gc()
gc()

I'm particularly interested in the values of counter.1 and counter.2 which should be 1 in both calls. In fact, it is on all machines that Duncan has tested this on. However, as it turns out counter.2 has value 259 on all of my machines (see details above) and that's exactly what's causing my problem.

From the XML package's webpage, it seems that the author, Duncan Temple Lang, has quite extensively described certain memory management issues. See this page: "Memory Management in the XML Package".

Honestly, I'm not proficient in the details of what's going on here with your code and the package, but I think you'll either find the answer in that page, specifically in the section called "Problems", or in direct communication with Duncan Temple Lang.


Update 1. An idea that might work is to use the multicore and foreach packages (i.e. listResults = foreach(ix = 1:N) %dopar% {your processing;return(listElement)}). I think that for Windows you'll need doSMP, or maybe doRedis; under Linux, I use doMC. In any case, by parallelizing the loading, you'll get faster throughput. The reason I think you may see some benefit in memory usage is that forking R could lead to different memory cleanup, as each spawned process is killed when it completes. This isn't guaranteed to work, but it could address both memory and speed issues.

Note, though: doSMP has its own idiosyncrasies (i.e. you may still have some memory issues with it). There have been other Q&As on SO that mentioned some issues, but I'd still give it a shot.

Default Namespace created when append the XML document

4 votes

I am trying to append XML to an existing file. Everything works fine, except that a default namespace shows up on the appended element.

This is the code I use to append:

 XmlNode newChild = doc.CreateNode(XmlNodeType.Element, "image", "");
    newChild.Attributes.Append(doc.CreateAttribute("name", filename));

    XmlNode xmlElement = doc.CreateNode(XmlNodeType.Element, "width", null);
    xmlElement.InnerText = widthValue[1].TrimStart();
    newChild.AppendChild(xmlElement);

I am getting output like this:

<image d2p1:name="" xmlns:d2p1="test.jpg">
    <width>1024</width>
</image>

but I was trying to append like:

<image name="test.jpg">
    <width>1024</width>
</image>

As others suggested, using LINQ to XML might be easier in general.

But if you want to stick with using XmlDocument, to fix the issue, change your code to the following:

var attribute = doc.CreateAttribute("name");
attribute.Value = filename;
newChild.Attributes.Append(attribute);

The problem with the code you have is that doc.CreateAttribute("foo", "bar") creates an attribute with the name foo in a namespace with the URI bar. That's really not what you want.
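
The distinction between an attribute value and an attribute namespace is not specific to XmlDocument. As a loose analogy only, here is a sketch using Python's xml.etree.ElementTree, where a namespace-qualified attribute is written as "{uri}name"; the variable names are hypothetical and this is not the .NET API from the question.

```python
import xml.etree.ElementTree as ET

# Plain attribute: the second argument to set() is the value.
image = ET.Element("image")
image.set("name", "test.jpg")
ET.SubElement(image, "width").text = "1024"
plain = ET.tostring(image, encoding="unicode")

# Namespace-qualified attribute: "test.jpg" ends up as a namespace URI
# and the value is empty, which mirrors the mistake in the question.
wrong = ET.Element("image")
wrong.set("{test.jpg}name", "")
qualified = ET.tostring(wrong, encoding="unicode")

print(plain)
print(qualified)
```

The second serialization carries an xmlns declaration for "test.jpg", much like the unwanted output shown in the question.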

Linq query based on attribute

4 votes

Ran into another challenge. I looked through some of the questions that I found here, but I can't seem to piece together what I need.

OK I have a XML file:

<Output id="1">
    <path rename="Off" name="pattern-1">d:\temp</path>
  </Output>

  <Output id="2">
      <path isRename="False" name="pattern-1" >d:\temp\out2</path>
      <path isRename="True"  name="pattern-1" >d:\temp\out3</path>
      <path isRename="False" name="pattern-1">d:\temp\out4</path>
  </Output>

What I need to do is find the <Output> tag based on the id attribute. Then I need to loop through all of the <path> tags and get the attributes and path value. I tried a few things based on a previous question I had asked, but I couldn't get it to work:

var results = from c in rootElement.Elements("Output") 
              where (string)c.Attribute("Id") == "2" select c;

foreach (var path in rootElement.Elements("Output").Elements("path"))
{
    string p  = path.Value;
}

Your first line doesn't do anything if you don't actually use the results.

foreach (var outputElement in rootElement.Elements("Output")
                                         .Where(e => (string)e.Attribute("id") == "1"))
{
    foreach (var pathElement in outputElement.Elements("path"))
    {
        // ...
    }
}

If your id attribute is guaranteed to be unique (which it should be), you can get rid of the first foreach and just get the individual <Output> directly:

var outputElement = rootElement.Elements("Output")
                               .FirstOrDefault(e => (string)e.Attribute("id") == "1");
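
The same lookup pattern (pick the one element whose id matches, then iterate its children) can be sketched with Python's xml.etree.ElementTree. This is an illustration rather than LINQ to XML; paths_for_output is a hypothetical name, and it assumes the <Output> elements sit directly under the root element. Note that attribute names are case-sensitive in XML, so "Id" does not match an attribute written as id.

```python
import xml.etree.ElementTree as ET


def paths_for_output(root, output_id):
    """Return (attributes, text) for each <path> under the matching <Output>."""
    # ElementTree's limited XPath supports the [@attribute='value'] predicate.
    output = root.find(f"Output[@id='{output_id}']")
    if output is None:
        return []
    return [(dict(path.attrib), path.text) for path in output.findall("path")]
```

Returning an empty list when no <Output> matches plays the same role as checking the result of FirstOrDefault for null.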

How to select elements by part of their name in XSL/XPath?

4 votes

How do I use apply-templates to select only those elements by name (not value) that end with a specific pattern? Assume the following xml...

<report>
  <report_item>
    <report_created_on/>
    <report_cross_ref/>
    <monthly_adj/>
    <quarterly_adj/>
    <ytd_adj/>
  </report_item>
  <report_item>
   ....
  </report_item>
</report>

I want to use <xsl:apply-templates> on the child elements of each <report_item> whose names end with 'adj', so, in this case, only monthly_adj, quarterly_adj, and ytd_adj would be selected and have the template applied.

<xsl:template match="report">
   <xsl:apply-templates select="report_item/(elements ending with 'adj')"/>
</xsl:template>

Regular expressions are not available in XPath 1.0 (XPath 2.0 adds matches()), but you don't need them in this case.

<xsl:apply-templates select="report_item/*[ends-with(name(), 'adj')]"/>

* matches any element

[pred] performs a node test against the selector (in this case *) (where pred is a predicate evaluated in the context of the selected node)

name() returns the element's tag name (should be equivalent to local-name for this purpose).

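ends-with() is a built-in XPath 2.0 string function. XPath 1.0 has no ends-with(); there you would compare substring(name(), string-length(name()) - 2) with 'adj' instead.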
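
Outside XSLT, the same selection can be done by filtering on the tag name in code. The sketch below uses Python's xml.etree.ElementTree, whose limited XPath support has no ends-with(), so the suffix test is done with str.endswith; children_ending_with is a hypothetical name, and the document is assumed to have no namespaces.

```python
import xml.etree.ElementTree as ET


def children_ending_with(report, suffix):
    """Return child elements of each <report_item> whose tag ends with suffix."""
    return [
        child
        for item in report.findall("report_item")
        for child in item
        if child.tag.endswith(suffix)
    ]
```

With namespaced documents, ElementTree tags take the form "{uri}local", so the suffix test would still apply to the local part at the end of the tag.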