VTD-XML






From Wikipedia, the free encyclopedia
 


VTD-XML
Developer(s): XimpleWare
Stable release: 2.13_4 / July 14, 2017
Operating system: Portable
Platform: Java, C#, C and C++
Type: XML parser/indexer/slicer/editor library
License: GPL and Proprietary License
Website: vtd-xml.sourceforge.io, ximpleware.wordpress.com

Virtual Token Descriptor for eXtensible Markup Language (VTD-XML) refers to a collection of cross-platform XML processing technologies centered on a non-extractive,[1][2] "document-centric" XML parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-XML can be viewed as an XML parser, an indexer, a content modifier, a slicer/splitter/assembler, or an editor/eraser.

VTD-XML is developed by XimpleWare and dual-licensed under the GPL and a proprietary license. It was originally written in Java, but is now also available in C,[14] C++ and C#.

Basic concept

Non-extractive, document-centric parsing

Traditionally, a lexical analyzer represents tokens (the small units of indivisible character values) as discrete string objects. This approach is called extractive parsing. In contrast, non-extractive tokenization keeps the source text intact and describes each token by its offset and length within that text.
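The difference between the two approaches can be illustrated with a minimal Java sketch. It is not VTD-XML code: the class `TokenSketch`, its nested `Span` type and the whitespace-separated "tokens" are all assumptions chosen to keep the example small. The extractive method allocates one string per token, while the non-extractive method leaves the source bytes untouched and records only offsets and lengths.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class TokenSketch {
    /** A non-extractive token: only a position and a length in the source buffer. */
    static final class Span {
        final int offset;
        final int length;
        Span(int offset, int length) { this.offset = offset; this.length = length; }
    }

    /** Extractive: every whitespace-separated token becomes a new String object. */
    static List<String> extract(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Non-extractive: the source stays intact; tokens are (offset, length) pairs. */
    static List<Span> describe(byte[] source) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < source.length) {
            // Skip separators (ASCII input is assumed for simplicity).
            while (i < source.length && Character.isWhitespace((char) source[i])) i++;
            int start = i;
            // Scan to the end of one token.
            while (i < source.length && !Character.isWhitespace((char) source[i])) i++;
            if (i > start) spans.add(new Span(start, i - start));
        }
        return spans;
    }

    /** Materialise a token as a String only when the application asks for it. */
    static String text(byte[] source, Span span) {
        return new String(source, span.offset, span.length, StandardCharsets.US_ASCII);
    }
}
```

With the non-extractive form, a string is created only for the tokens an application actually reads.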

Virtual token descriptor

Virtual Token Descriptor (VTD) applies the concept of non-extractive, document-centric parsing to XML processing. A VTD record uses a 64-bit integer to encode the offset, length, token type and nesting depth of a token in an XML document. Because all VTD records are 64 bits in length, they can be stored efficiently and managed as an array.[15]
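As a rough illustration of how four fields can share one 64-bit integer, the following Java sketch packs and unpacks such a record with shifts and masks. The field widths and positions used here (4-bit type, 8-bit depth, 20-bit length, 30-bit offset) are assumptions made for the example; the text above states only that the four fields fit in 64 bits. The class name `VtdRecordSketch` is likewise hypothetical.

```java
final class VtdRecordSketch {
    // Assumed, illustrative field widths; not taken from the article text.
    static final int OFFSET_BITS = 30;
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;
    static final int TYPE_BITS = 4;

    // Assumed bit positions of the three upper fields (the offset sits at bit 0).
    static final int LENGTH_SHIFT = 32;
    static final int DEPTH_SHIFT = 52;
    static final int TYPE_SHIFT = 60;

    /** Pack the four fields into one 64-bit record. */
    static long pack(int type, int depth, int length, int offset) {
        check(type, TYPE_BITS);
        check(depth, DEPTH_BITS);
        check(length, LENGTH_BITS);
        check(offset, OFFSET_BITS);
        return ((long) type << TYPE_SHIFT)
             | ((long) depth << DEPTH_SHIFT)
             | ((long) length << LENGTH_SHIFT)
             | offset;
    }

    static int offset(long record) { return (int) (record & mask(OFFSET_BITS)); }
    static int length(long record) { return (int) ((record >>> LENGTH_SHIFT) & mask(LENGTH_BITS)); }
    static int depth(long record)  { return (int) ((record >>> DEPTH_SHIFT) & mask(DEPTH_BITS)); }
    static int type(long record)   { return (int) ((record >>> TYPE_SHIFT) & mask(TYPE_BITS)); }

    private static long mask(int bits) { return (1L << bits) - 1; }

    /** Reject values that do not fit in the field they are destined for. */
    private static void check(int value, int bits) {
        if (value < 0 || value > mask(bits)) {
            throw new IllegalArgumentException(value + " does not fit in " + bits + " bits");
        }
    }
}
```

Because every record has the same size, a parsed document can be held in a plain `long[]`, which is the property the paragraph above attributes to fixed-length VTD records.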

Location cache

Location Caches (LCs) build on VTD records to provide efficient random access. LCs are organized as tables, one per nesting depth level, whose entries model the element hierarchy of an XML document. Each LC entry is a 64-bit integer encoding a pair of 32-bit values: the upper 32 bits identify the VTD record of the corresponding element, and the lower 32 bits identify that element's first child in the LC for the next (deeper) nesting level.
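The entry layout described above can be sketched in Java as follows. The class and method names are hypothetical, and the use of -1 to mean "no child" is an assumption made for the example rather than something stated in the text.

```java
final class LocationCacheSketch {
    /** Assumed marker for an element without child elements. */
    static final int NO_CHILD = -1;

    /** Upper 32 bits: index of the element's VTD record. Lower 32 bits: first-child index. */
    static long entry(int vtdIndex, int firstChildIndex) {
        return ((long) vtdIndex << 32) | (firstChildIndex & 0xFFFFFFFFL);
    }

    /** Index of the VTD record that describes this element. */
    static int vtdIndex(long entry) {
        return (int) (entry >>> 32);
    }

    /** Index of the element's first child in the table for the next nesting level. */
    static int firstChild(long entry) {
        return (int) entry; // keeps the low 32 bits
    }
}
```

One `long[]` table per nesting level then suffices to move from an element to its first child without scanning the VTD records in between.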

Benefits

Overview

Virtually all the core benefits of VTD-XML are inherent to non-extractive, document-centric parsing, which provides these characteristics:

Together, those characteristics make it possible to treat XML purely as syntax (bits, bytes, offsets, lengths, fragments, namespace-compensated fragments, and document composition) rather than as the serialization and deserialization of objects, a perspective that suits XML/SOA applications.

Conformance

VTD-XML conforms strictly to XML 1.0 (except for the DTD part) and to XML Namespaces 1.0. It essentially conforms to the XPath 1.0 specification, with some subtle differences in the underlying data model, and extends it with XPath 2.0 built-in functions.

Simplicity

As parser

When used in parsing mode, VTD-XML is a general-purpose, high-performance[17] XML parser that is reported to compare favorably with other parsers.

As indexer

Because of the inherent persistence of VTD-XML, developers can write the internal representation of a parsed XML document to disk and later reload it to avoid repetitive parsing. To this end, XimpleWare has introduced VTD+XML as a binary packaging format combining VTD, LC and the XML text. It can typically be viewed in one of the following two ways:
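The idea of storing the parsed representation next to the XML text can be sketched with the standard Java I/O classes. The byte layout below (a record count, the 64-bit records, a text length, then the XML bytes) is an assumption invented for this example and is not the actual VTD+XML format; `IndexBundleSketch` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class IndexBundleSketch {
    /** The reloaded pair: token records plus the untouched XML text. */
    static final class Bundle {
        final long[] records;
        final byte[] xml;
        Bundle(long[] records, byte[] xml) { this.records = records; this.xml = xml; }
    }

    /** Serialise the records and the XML bytes into one binary package. */
    static byte[] pack(long[] records, byte[] xml) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(records.length);
            for (long r : records) out.writeLong(r);
            out.writeInt(xml.length);
            out.write(xml);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reload the package; no XML parsing takes place here. */
    static Bundle unpack(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            long[] records = new long[in.readInt()];
            for (int i = 0; i < records.length; i++) records[i] = in.readLong();
            byte[] xml = new byte[in.readInt()];
            in.readFully(xml);
            return new Bundle(records, xml);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same byte array could equally be written to and read from a file. Reloading only copies integers and bytes, which is why this kind of index avoids the cost of parsing the document again.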

XML content modifier

Because VTD-XML keeps the XML text intact without decoding it, an application that modifies XML content needs to change only the portions of the text affected by the modification. This is in stark contrast with DOM, SAX, or StAX parsing, which incur the cost of parsing and re-serialization no matter how small the changes are.

Since VTDs refer to document elements by their offsets, changes to the length of elements occurring earlier in a document require adjustments to VTDs referring to all later elements. However, those adjustments are integer additions, albeit to many integers in multiple tables, so they are quick.
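The adjustment described above amounts to one addition per affected entry. The following Java sketch shows it on a plain array of token offsets rather than on packed VTD records, which is a simplification made for the example; the class and parameter names are hypothetical.

```java
final class OffsetShiftSketch {
    /**
     * After an edit that ends at byte position editEnd and changes the document
     * length by delta bytes, move every token that starts at or after editEnd.
     */
    static void shift(int[] offsets, int editEnd, int delta) {
        for (int i = 0; i < offsets.length; i++) {
            if (offsets[i] >= editEnd) {
                offsets[i] += delta; // a single integer addition per token
            }
        }
    }
}
```

For example, replacing the four bytes at positions 5 to 8 with seven bytes gives `editEnd = 9` and `delta = 3`, so tokens starting at 12 and 20 move to 15 and 23 while earlier tokens stay where they are.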

XML slicer/splitter/assembler

An application based on VTD-XML can also use offsets and lengths to address tokens or element fragments. This allows XML documents to be manipulated like arrays of bytes.
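A minimal Java sketch of this byte-array view is given below. It assumes the offset and length of a fragment are already known (for instance from a token record) and uses only `java.util.Arrays` and `System.arraycopy`; the class `SliceSketch` and its methods are hypothetical and do not belong to the VTD-XML API.

```java
import java.util.Arrays;

final class SliceSketch {
    /** Cut an element fragment out of a document as raw bytes, without decoding it. */
    static byte[] slice(byte[] doc, int offset, int length) {
        return Arrays.copyOfRange(doc, offset, offset + length);
    }

    /** Assemble a new document by inserting a fragment at a byte position of another. */
    static byte[] splice(byte[] doc, int at, byte[] fragment) {
        byte[] out = new byte[doc.length + fragment.length];
        System.arraycopy(doc, 0, out, 0, at);                              // bytes before the insert point
        System.arraycopy(fragment, 0, out, at, fragment.length);           // the inserted fragment
        System.arraycopy(doc, at, out, at + fragment.length, doc.length - at); // the rest
        return out;
    }
}
```

Slicing `<b>1</b>` (offset 3, length 8) out of `<a><b>1</b><c/></a>` and splicing it into `<x></x>` at position 3 yields `<x><b>1</b></x>`, with no object model built along the way.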

XML editor/eraser

Used as an editor/eraser, VTD-XML can directly edit or erase the underlying byte content of the XML text, provided that the token is at least as long as the intended new content. A direct benefit of this approach is that the application can immediately reuse the original VTD and LC. In contrast, when using VTD-XML to incrementally update an XML document, an application needs to reparse the updated document before it can process it.
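The in-place case can be sketched in Java as follows. The sketch overwrites a token only when the replacement fits and fills any leftover bytes with spaces; the space padding is an assumption made for the example, and whether padding is acceptable depends on where the token sits (extra whitespace changes a text value, for instance). The names are hypothetical.

```java
import java.util.Arrays;

final class InPlaceEditSketch {
    /**
     * Overwrite the token at [offset, offset + length) with the replacement bytes.
     * Returns false and leaves the document untouched when the replacement is too long.
     */
    static boolean overwrite(byte[] doc, int offset, int length, byte[] replacement) {
        if (replacement.length > length) {
            return false; // does not fit: an incremental update would be needed instead
        }
        System.arraycopy(replacement, 0, doc, offset, replacement.length);
        // Pad the remainder so every other offset in the document stays valid.
        Arrays.fill(doc, offset + replacement.length, offset + length, (byte) ' ');
        return true;
    }

    /** Erasing is the special case of overwriting with nothing but padding. */
    static void erase(byte[] doc, int offset, int length) {
        Arrays.fill(doc, offset, offset + length, (byte) ' ');
    }
}
```

Because the document length never changes, all previously computed offsets remain correct, which is what allows the existing records to be reused without reparsing.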

An editor can be made smart enough to track the location of each token, permitting new, longer tokens to replace existing, shorter tokens by merely addressing the new token in separate memory outside that used to store the original document. Likewise, when reordering the document, element text does not need to be copied; only the LCs need to be updated. When a complete, contiguous XML document is needed, such as when saving it, the disparate parts can be reassembled into a new, contiguous document.
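One way to picture such an editor is as a list of pieces, each pointing into some buffer, that is flattened only when a contiguous document is required. The Java sketch below is an illustration under that assumption and is not how VTD-XML itself is documented to work; `PieceListSketch`, `Piece`, `replace` and `assemble` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class PieceListSketch {
    /** A run of bytes inside some buffer: the original document or separate memory. */
    static final class Piece {
        final byte[] buffer;
        final int offset;
        final int length;
        Piece(byte[] buffer, int offset, int length) {
            this.buffer = buffer;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Replace one token by a longer or shorter one without copying the document. */
    static List<Piece> replace(byte[] doc, int offset, int length, byte[] replacement) {
        List<Piece> pieces = new ArrayList<>();
        pieces.add(new Piece(doc, 0, offset));                       // text before the token
        pieces.add(new Piece(replacement, 0, replacement.length));   // new token, stored elsewhere
        int tail = offset + length;
        pieces.add(new Piece(doc, tail, doc.length - tail));         // text after the token
        return pieces;
    }

    /** Reassemble the pieces into one contiguous document, for example when saving. */
    static byte[] assemble(List<Piece> pieces) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Piece p : pieces) {
            out.write(p.buffer, p.offset, p.length);
        }
        return out.toByteArray();
    }
}
```

Until `assemble` is called, the original buffer is never modified or copied; only the small list of pieces changes.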

Other benefits

VTD-XML is also said to have pioneered the non-blocking, stateless approach to XPath evaluation.[citation needed]

Weaknesses

VTD-XML also exhibits a few noticeable shortcomings:

Areas of applications

General-purpose replacement for DOM or SAX

Because of VTD-XML's performance and memory advantages, it covers a larger portion of XML use cases than either DOM or SAX.[18]

XPath over huge XML documents

The extended edition of VTD-XML, combined with a 64-bit JVM, makes XPath-based processing possible over huge XML documents (up to 256 GB in size).

For SOA/WS/XML security

The combination of VTD-XML's high performance and incremental-update capability has been described as essential[19][20][21] to achieving the desired quality of service in SOA/WS/XML security applications.

For SOA/WS/XML intermediary

VTD-XML is well suited for SOA intermediary applications such as XML routers/switches/gateways, Enterprise Service Buses, and service aggregation points. All of these applications perform basic "store and forward" operations, for which retaining the original XML is critical for minimizing latency. VTD-XML's incremental-update capability also contributes significantly to forwarding performance.

VTD-XML's random-access capability lends itself well to the XPath-based XML routing/switching/filtering common in AJAX and SOA deployments.

Intelligent SOA/WS/XML Load-balancing and Offloading

When an XML document travels through several middle-tier SOA components, the first component to receive it can, after inspecting the document, forward it to the downstream components in the VTD+XML file format so that they avoid parsing it again, thus improving throughput.

By the same token, an intelligent SOA load balancer can choose to generate VTD+XML for incoming/outgoing SOAP messages to offload XML parsing from the application servers that receive those messages.

XML persistence data store

When viewed from the perspective of native XML persistence, VTD-XML can be used as a human-readable, easy-to-use, general-purpose XML index. XML documents stored this way can be loaded into memory to be queried, updated, or edited without the overhead of parsing and re-serialization.

Schemaless XML data binding

VTD-XML's combination of high performance, low memory usage, and efficient XPath evaluation makes possible a new XML data binding approach based entirely on XPath. The main benefits of this approach are that it no longer requires an XML schema, avoids needless object creation, and takes advantage of XML's inherent loose encoding.[22]
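The general idea of binding through XPath alone, without a schema or generated classes, can be sketched with the XPath API that ships with the JDK (`javax.xml.xpath`). This sketch deliberately does not use VTD-XML, so it shows the binding style rather than the performance characteristics claimed above. The `Item` class and the `bindFirstItem` method are hypothetical; the element and attribute names are taken from the purchase-order paths used in this article's code sample.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathBindingSketch {
    /** A plain object holding only the fields the application cares about. */
    static final class Item {
        final String partNum;
        final double usPrice;
        Item(String partNum, double usPrice) { this.partNum = partNum; this.usPrice = usPrice; }
    }

    /** Bind the first item of a purchase order; returns null when there is none. */
    static Item bindFirstItem(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node item = (Node) xpath.evaluate("/purchaseOrder/items/item[1]",
                    new InputSource(new StringReader(xml)), XPathConstants.NODE);
            if (item == null) {
                return null;
            }
            // Each field is one XPath expression; elements not asked for are ignored.
            String partNum = xpath.evaluate("@partNum", item);
            Double price = (Double) xpath.evaluate("number(USPrice)", item, XPathConstants.NUMBER);
            return new Item(partNum, price);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate XPath", e);
        }
    }
}
```

Because the mapping consists only of path expressions, documents that carry extra or reordered elements still bind, which is the loose coupling the paragraph above refers to.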

The data binding described in the cited article must be implemented by the application: VTD-XML itself only offers accessors. In this regard, VTD-XML is not a data binding solution in itself (unlike JiBX, JAXB, or XMLBeans), although it offers extraction functionality for data binding packages, much like other XML parsers (DOM, SAX, StAX).

Essential classes

As of Version 2.11, the Java and C# versions of VTD-XML consist of the following classes:

The extended VTD-XML consists of the following classes:

Code sample

/* This Java program demonstrates how to use XMLModifier to incrementally
 * update a simple XML purchase order. It uses VTDGen's parseFile to
 * simplify programming.
 */

import com.ximpleware.*;

public class Update {
    // "throws Exception" covers the checked exceptions raised by the parsing,
    // XPath, modification and output calls below.
    public static void main(String[] args) throws Exception {
        VTDGen vg = new VTDGen();
        // Read the file into memory and parse it with namespace awareness on.
        if (vg.parseFile("oldpo.xml", true)) {
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            XMLModifier xm = new XMLModifier(vn);

            // Replace every item with part number 872-AA by a placeholder element.
            ap.selectXPath("/purchaseOrder/items/item[@partNum='872-AA']");
            int i = -1;
            while ((i = ap.evalXPath()) != -1) {
                xm.remove();
                xm.insertBeforeElement("<something/>\n");
            }

            // Set every US price below 40 to 200.
            ap.selectXPath("/purchaseOrder/items/item/USPrice[.<40]/text()");
            while ((i = ap.evalXPath()) != -1) {
                xm.updateToken(i, "200");
            }

            // Write the modified document to a new file.
            xm.output("newpo.xml");
        }
    }
}

References

  1. Zhang, Jimmy (May 19, 2004). "Non-Extractive Parsing for XML". XML.com. Retrieved 2020-07-24.
  2. XML Processing for the Future
  3. Zhang, Jimmy (January 9, 2008). "Manipulate XML Content the Ximple Way". DevX. Archived from the original on 2017-07-30. Retrieved 2020-07-24.
  4. Zhang, Jimmy (June 24, 2008). "VTD-XML: XML Processing for the Future (Part II)". Code Project. Retrieved 2020-07-24.
  5. Zhang, Jimmy (March 27, 2006). "Simplify XML processing with VTD-XML". JavaWorld. Retrieved 2020-07-24.
  6. Zhang, Jimmy (October 21, 2004). "Better, Faster XML Processing with VTD-XML". DevX. Retrieved 2020-07-24.
  7. Zhang, Jimmy (April 17, 2008). "VTD-XML: XML Processing for the Future (Part I)". Code Project. Retrieved 2020-07-24.
  8. Zhang, Jimmy (November 2, 2007). "Index XML Documents with VTD-XML". SYS-CON Publications. Archived from the original on 2007-11-05.
  9. Zhang, Jimmy (July 24, 2006). "Cut, paste, split, and assemble XML documents with VTD-XML". JavaWorld. Retrieved 2020-07-24.
  10. XML on a chip?
  11. Zhang, Jimmy (March 9, 2005). "XML on a Chip". XML.com. Retrieved 2020-07-24.
  12. XimpleWare's W3C binary XML workshop Position Paper
  13. Zhang, Jimmy (March 19, 2007). "Improve XPath Efficiency with VTD-XML". DevX. Retrieved 2020-07-24.
  14. Volkman, Victor (December 3, 2007). "VTD-XML: A New Vision of XML". Developer.com. Retrieved 2020-07-24.
  15. Virtual Token Descriptor introduction at SourceForge
  16. Zhang, Jimmy (July 31, 2006). "The Performance Woe of Binary XML". SYS-CON Publications. Archived from the original on 2006-08-08.
  17. VTD-XML Parsing/Navigation Performance Report
  18. Zhang, Jimmy (February 8, 2006). "A Step in the Right Direction: VTD-XML Improves XML Processing". DevX. Retrieved 2020-07-24.
  19. Zhang, Jimmy (January 9, 2007). "Accelerate WSS applications with VTD-XML". JavaWorld. Retrieved 2020-07-24.
  20. W3C workshop presentation on XML security
  21. Position Paper for W3C Workshop on Next Steps for XML Signature and XML Encryption
  22. Zhang, Jimmy (September 10, 2007). "Schemaless Java-XML Data Binding with VTD-XML". ONJava. Archived from the original on 2017-09-27.

This page was last edited on 4 July 2024, at 20:51 (UTC).