Java and XML

b4u

Golden Member
Nov 8, 2002
1,380
2
81
Hi,

I'm currently developing web apps using WebSphere, and I would like to know your opinions regarding Java and XML.

I know I can access XML data using a DOM or SAX implementation. I've made a couple of programs that read some config info from XML files, and they ended up working well.

The thing is, in my opinion, it's still "too hard" to properly code an XML parser that can read data safely and error-free, with proper data validation. When I say parser, I mean the final program that just needs the data, since DOM and SAX are in fact the parsers of the file structure. Also, one ends up with program code that is too dependent on that specific XML structure, which makes it a pain to start something new.

So, if I want to create a config file, I'll design an XML file structure and then need to write some code to read and validate input from it. Since it's a new XML file, its structure will probably end up being different from any structure made before, so the code has to be heavily adapted to that new reality, which brings up the problem of re-testing the code.

The way I see it, I can blindly read XML data into the program. Since it may contain a config file, or data safely created by another program, I could read it in one go. But I feel that isn't the right way of doing it ... so I always do some checking of the data, to allow for error-free programs, i.e. programs that do not crash but instead generate a log entry informing the admin that the config data isn't correct and that they cannot proceed.

Also, I could define a DTD that will do some validation, but that DTD will probably be accessed through the web (like some XML files I saw in some 3rd party packages). If I don't have access to the DTD, then I'll be totally blind when trying to put data into the XML without any errors.

I don't have much experience with XML, and maybe this thread ends up being just a way of dumping my thoughts, but I would also like to hear the opinions of you guys out there who have much more experience with XML, and the way you do things so that XML can be truly exploited to its maximum potential.


Thanks.
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
I think you have unrealistic expectations about what XML can do and how "automated" everything should be. XML is a means of data transmission. No more, no less.

XML is pretty much meaningless without a DTD or XSD to validate it with. Once you have a well-formed XSD, your XML becomes a lot more portable. When you code your parser to read the XML file, your input possibilities are now well-defined, and it's easy for other users or programs to modify that XML and know that it will work in your reader.

The problem of code dependent on XML structure can be partially solved by using proper abstraction. For instance, I can have a config.xml file that looks like this:

<config>
...<param id="1">
......<name>Parameter 1</name>
......<value>Value 1</value>
...</param>
...<param id="2">
......<name>Parameter 2</name>
......<value>Value 2</value>
...</param>
</config>

It would be pretty easy to write a class that reads each parameter in the XML file and stores the name/value pairs in a HashMap. When I need a new configuration parameter, I just add it to the XML file and it's immediately available to me in the HashMap. You can get more complex with your metadata and how you store/read it, but that's a whole different discussion.
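
A minimal sketch of such a class, assuming the standard JAXP DOM parser that ships with the JDK. The names ConfigReader, read and childText are made up for illustration and are not from any library; the element names match the config.xml above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: reads every <param> into a name -> value map.
class ConfigReader {

    static Map<String, String> read(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            Map<String, String> params = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("param");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element param = (Element) nodes.item(i);
                params.put(childText(param, "name"), childText(param, "value"));
            }
            return params;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Report a broken config file instead of letting the parser error leak out.
            throw new IllegalArgumentException("Could not read config XML", e);
        }
    }

    // Trimmed text of the first nested element with the given tag, or "" if it is missing.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }
}
```

Calling ConfigReader.read(...) with the file contents gives back a map where get("Parameter 1") is the matching value. A new <param> in the file needs no code change.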
 

Reapsy00

Member
Apr 12, 2005
116
0
0
I've been struggling with XML today. I got it doing what I want as long as there's no whitespace in the XML; one space, tab or newline and I'm f*&%!d! Is there any way of telling the parser to ignore whitespace?


wow nice dvd collection MrChad
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
Originally posted by: Reapsy00
I've been struggling with xml today, got it doing what i want as long as there's no whitespace in the xml, 1 space,tab or newline and I'm f*&%!d! Is there anyway of telling the parser to ignore whitespace?


wow nice dvd collection MrChad

What parser?
 

statik213

Golden Member
Oct 31, 2004
1,654
0
0
Have you looked at JDOM? It's a lot easier to work with than SAX and more intuitive than DOM IMHO..
See here, it's an excellent (online) book on XML and Java:
http://www.cafeconleche.org/books/xmljava/chapters/ch14.html

edit:
One limitation of JDOM vs. SAX is that JDOM needs to keep the entire XML parse tree in memory, whereas SAX has much less memory overhead.
If you are dealing with small XML files, JDOM is the way to go.
 

Reapsy00

Member
Apr 12, 2005
116
0
0
I'm using a DocumentBuilderFactory (org.w3c.dom, I think). I'll have a look at JDOM when I get the chance, but I'm hoping I can just botch some kind of skip-over-whitespace bit into my code.
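
For reference, with org.w3c.dom the indentation between elements arrives as Text nodes, so code that expects every child to be an element trips over it. DocumentBuilderFactory does have setIgnoringElementContentWhitespace(true), but per its Javadoc that only takes effect in validating mode, where the parser knows the content model. A rough sketch of the "skip over whitespace" approach follows, assuming the stock JAXP parser; DomWhitespace, childElements and parse are made-up names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: walk only the element children and ignore
// whitespace text, comments and other non-element nodes.
class DomWhitespace {

    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Looping over childElements(node) instead of node.getChildNodes() means spaces, tabs and newlines between tags no longer matter.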
 

thesurge

Golden Member
Dec 11, 2004
1,745
0
0
Originally posted by: MrChad
I think you have unrealistic expectations about what XML can do and how "automated" everything should be. XML is a means of data transmission. No more, no less.

XML is pretty much meaningless without a DTD or XSD to validate it with. Once you have a well-formed XSD, your XML becomes a lot more portable. When you code your parser to read the XML file, your input possibilities are now well-defined, and it's easy for other users or programs to modify that XML and know that it will work in your reader.

The problem of code dependent on XML structure can be partially solved by using proper abstraction. For instance, I can have a config.xml file that looks like this:

<config>
...<param id="1">
......<name>Parameter 1</name>
......<value>Value 1</value>
...</param>
...<param id="2">
......<name>Parameter 2</name>
......<value>Value 2</value>
...</param>
</config>

It would be pretty easy to write a class that reads each parameter in the XML file and stores the name/value pairs in a HashMap. When I need a new configuration parameter, I just add it to the XML file and it's immediately available to me in the HashMap. You can get more complex with your metadata and how you store/read it, but that's a whole different discussion.

QFT!!!!

 

b4u

Golden Member
Nov 8, 2002
1,380
2
81
Originally posted by: MrChad
XML is pretty much meaningless without a DTD or XSD to validate it with. Once you have a well-formed XSD, your XML becomes a lot more portable. When you code your parser to read the XML file, your input possibilities are now well-defined, and it's easy for other users or programs to modify that XML and know that it will work in your reader.


Yes, I can and should create a DTD or XSD file to validate my XML. I could even think about some XSLT if I need to convert something ... but is it that easy to implement?

I mean, should I give away a DTD so people can validate their XML? Even then, I must validate it myself if I want to make sure everything runs OK. So do SAX/DOM have some way of specifying which XML file I want to process, plus which DTD/XSD I want it checked against before anything else?

I've seen JDOM, and it seemed a very nice product. I never tried it though ... because I was at the beginning of using XML in Java, so I was interested in using the Java library functions ... hence the Sun Tutorial was of help, and that teaches SAX and DOM.

I'm quite new to dealing with XML ... I mean I know the basics, the XML, DTD, XSD, XSLT ... but only the basics ... so I just need some rules from people who know and work with this, so I can head the right way from the start ...
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: MrChad
I think you have unrealistic expectations about what XML can do and how "automated" everything should be. XML is a means of data transmission. No more, no less.
Nothing could be farther from the truth! A huge part of the deal about xml is that it is easily automated. If everyone had to work with the DOM for every piece of xml they touched, nobody would ever use it. It's just too tedious (more on this in a bit). It can be used in many other ways than data transmission. For instance, a config file is storage, not transmission.
XML is pretty much meaningless without a DTD or XSD to validate it with. Once you have a well-formed XSD, your XML becomes a lot more portable. When you code your parser to read the XML file, your input possibilities are now well-defined, and it's easy for other users or programs to modify that XML and know that it will work in your reader.
A dtd or xsd (or one of the various other structural spec options) is nice, but far from necessary. At dev time, it's nice for consistency checking, but if his is the only application that will be touching this format, it's easy to get away without. I'm not saying this is good design, but often a code base that correctly interprets the xml format is easier to maintain than a spec and the correct code. At run time, xsd/dtd checks can be a good consistency check, but they often get turned off for performance reasons.
The problem of code dependent on XML structure can be partially solved by using proper abstraction. For instance, I can have a config.xml file that looks like this:

<config>
...<param id="1">
......<name>Parameter 1</name>
......<value>Value 1</value>
...</param>
...<param id="2">
......<name>Parameter 2</name>
......<value>Value 2</value>
...</param>
</config>

It would be pretty easy to write a class that reads each parameter in the XML file and stores the name/value pairs in a HashMap. When I need a new configuration parameter, I just add it to the XML file and it's immediately available to me in the HashMap. You can get more complex with your metadata and how you store/read it, but that's a whole different discussion.
Good suggestion

There are other ways to bring the data in automatically. One interesting way is xml marshalling. Essentially you declare java objects (or the equivalent in whatever language) that map directly to the data in the xml. There are marshalling frameworks (too numerous to list, but I can track some down if anyone's interested) that will read an xml doc and construct a hierarchy of objects which you can then use as normal java data objects (because they are), and they will also take your data objects and serialize them back to xml. I've seen this used fairly effectively, even in a high load situation. As in, it wasn't a config file being read in here and there, it was the bulk of the app's data.

Another way is to use one of the higher-level languages to manipulate/query the doc. The DOM was actually designed to be used in such an automated way, rather than to be touched by every developer that ever wanted data from an xml document.

A good example is xpath. This lets you declare quick references to specific nodes in a document and they are quite easy to modify. From MrChad's example, /config/param[@id=1]/name would get me the value "Parameter 1". I was writing an app on websphere this summer where I was parsing lots of xml. I chose to use xpath to retrieve that data since using DOM was way too nasty. However, the version of websphere was so old (3.5) that Apache Xalan wouldn't run. Rather than dive into DOM, I spent an afternoon writing my own xpath parser (with a very limited subset of the functionality) and just used xpath to get all the data. DOM was actually very convenient and powerful when I was writing the parser.
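
To illustrate, here is a small sketch of that lookup using the javax.xml.xpath API that ships with the JDK from Java 5 onward. This is an assumption made for the example: it is neither the Xalan API nor the hand-written parser described above, and XPathLookup is a made-up name.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluate one XPath expression against an XML string
// and return the string value of the first matching node.
class XPathLookup {

    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or unparsable XML: " + expression, e);
        }
    }
}
```

With the config.xml from MrChad's example as input, XPathLookup.evaluate(xml, "/config/param[@id=1]/name") asks for the name of the first parameter without any tree-walking code. If nothing matches, evaluate returns an empty string rather than null.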

XQuery is another example of a high-level xml language. I don't know much about it myself and it hasn't gained extremely wide popularity, but it incorporates xpath and allows you to do much more complicated queries against a document.

JDom is also definitely a possibility. I don't like the idea, personally, because xml isn't central to what you're trying to do. You don't care about the xml, just the data. With one of the tools I mentioned above, you can change the xml into some more convenient form in one fell swoop, but with JDom, you'll just keep writing more and more xml-centric code every time you need more data in different places. Too much coupling imho.

Bottom line though: don't subject yourself to application-specific DOM programming!!
 

b4u

Golden Member
Nov 8, 2002
1,380
2
81
I've used DOM to get info from an XML file. The file had some info for an application, some help info to display to the users. When parsing, I used DOM, but I ended up not only with code too closely tied to that XML structure, but also with too much validation in code, imho.

So moving to DTD or XSD validation would be a good start ... I imagine that instead of just giving the XML file to the parser, I could also give it the XSD file, and in the most basic form it would return a boolean telling me whether it found any errors. In reality, it would probably return some more in-depth info about the error, such as its position and element ...
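
Something along these lines can be sketched with the javax.xml.validation package (part of the JDK since Java 5). This is only an illustration under that assumption; XsdCheck and firstError are made-up names, and the default behaviour of stopping at the first error is kept.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validate an XML string against an XSD string.
// Returns an empty Optional when the document is valid, otherwise the
// first error together with its position.
class XsdCheck {

    static Optional<String> firstError(String xml, String xsd) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // A broken schema is a programming error, not a user error.
            throw new IllegalArgumentException("The XSD itself could not be compiled", e);
        }
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return Optional.empty();
        } catch (SAXParseException e) {
            return Optional.of("line " + e.getLineNumber() + ", column "
                    + e.getColumnNumber() + ": " + e.getMessage());
        } catch (SAXException | IOException e) {
            return Optional.of(String.valueOf(e.getMessage()));
        }
    }
}
```

The boolean is then just firstError(xml, xsd).isPresent(), and the message can go straight into the log entry for the admin. The wording of the message comes from the underlying parser, so it may not be friendly to a non-technical reader.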

If instead of just a config file, I want to create an XML database, for example if I want to download a list of products to work offline in one application, then I assume I can just create a blank DOM, start adding the tree structure, then I could just dump it into a file?
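
As a rough sketch of that idea, assuming the JAXP DOM and Transformer classes from the JDK: start from an empty Document, append elements, then serialize with an identity Transformer. The element names products and product and the class ProductExport are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: build a small product list as a DOM tree and
// serialize it to an XML string.
class ProductExport {

    static String toXml(String... productNames) {
        try {
            // Start from a blank document and add the tree structure by hand.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("products");
            doc.appendChild(root);
            for (String name : productNames) {
                Element product = doc.createElement("product");
                product.setTextContent(name); // special characters are escaped on output
                root.appendChild(product);
            }

            // An identity transform copies the DOM to the output unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build or serialize the document", e);
        }
    }
}
```

To dump into a file instead of a string, the StreamResult can wrap a java.io.File. Nothing in this path checks the tree against a DTD or XSD, so validating the result is a separate step.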
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Cool, that'd be a good use for dtd/xsd et al. The downer about that might be that the boolean result isn't necessarily good enough. It tells you when there's an error, sure, but (and this will, of course, depend on what validator you're using) the error message might not be extremely helpful, especially to a non-technical user who's writing the config file. I maintained and bugfixed a situation like that once, where the errors we got from xerces were too cryptic so, after we discovered an error, we ran our own validator which was much slower and less flexible but produced a much better error message. That is probably not a route you'd want to go as it's time consuming to write (and the guy that did it had some pretty significant xml expertise).

Not sure I understand your last question there, but I think the answer is "yes, you can dump random crap into a document, so long as your xml library is not set to be validating."
 

statik213

Golden Member
Oct 31, 2004
1,654
0
0
hi kamper,
that's a pretty nice post about XML & Java... the marshalling thing looks very interesting... xpath could also be handy, thanks for the info!


oh... and any good resources/articles on marshaling would be greatly appreciated.

edit:

wooooho! made gold, 'tis my 1k post.
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Well, where I worked before, they used Castor. Basically, you write a conf file (also in xml) that specifies how to convert between the xml and java, and then it does all the rest of the work.

I've also heard lots about xmlbeans but I've never used it. There's a bunch more as well.

The more I think about it though, the less using marshalling to read a config file makes sense. You'll either be bending your java objects to a user's view of the config file or you'll be bending the user's view of the file to how you want your java objects. It may work, but it seems rather inflexible if you wanted to make serious changes to one without affecting the other. The java objects would almost have to be a throwaway layer that you read the data from and then toss immediately. That might work fine but it's lots of extra, repetitive coding.
 