Oracle9i XML API Reference - XDK and Oracle XML DB Release 2 (9.2) Part Number A96616-01
This C implementation of the XML processor (or parser) follows the W3C XML specification (rev REC-xml-19980210) and implements the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.
Parsing a single document:
xmlinit, xmlparsexxx, xmlterm
Parsing multiple documents, but only the latest document's data needs to be available:
xmlinit, xmlparsexxx, xmlclean, xmlparsexxx, xmlclean ... xmlterm
Parsing multiple documents, all document data must be available:
xmlinit, xmlparsexxx, xmlparsexxx ... xmlterm
Memory callback functions may be used if you wish to use your own memory allocation. If they are used, both functions should be specified.
The memory allocated for parameters passed to the SAX callbacks or for nodes and data stored with the DOM parse tree will not be freed until xmlclean() or xmlterm() is called.
If threads are forked off somewhere in the midst of the init-parse-terminate sequence of calls, you will get unpredictable behavior and results.
Frees an allocated list of element nodes. Used primarily to free the lists created by getElementsByTagName().
void freeElements( xmlctx *ctx, xmlnodes *list);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | XML context. |
| list | (IN) | List of nodes to free |
This function returns the IANA/MIME name of the DOM/SAX data encoding, such as "ASCII", "ISO-8859-1", "UTF-8", "UTF-16", and so on. See also the isSingleChar() function, which determines whether the data is single-byte or multibyte, and the isUnicode() function, which determines whether the data is Unicode (UTF-16). The data encoding is specified by the user at initialization time.
oratext *getEncoding( xmlctx *ctx);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context. |
Returns a flag that specifies whether the data encoding for this context uses single-byte characters (such as ASCII, ISO-8859, or EBCDIC) or multibyte characters (such as UTF-8 or Unicode). See getEncoding(), which returns the name of the data encoding.
boolean isSingleChar( xmlctx *ctx);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context |
Returns the boolean value of the document's standalone flag, as specified in the XML declaration.
boolean isStandalone( xmlctx *ctx);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context |
Returns a flag that indicates whether the data encoding for this context is Unicode (UCS2). Similar to isSingleChar().
boolean isUnicode(xmlctx *ctx);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context |
Allocates memory and saves the NULL-terminated single or multibyte string in the XML string pool. Strings saved this way cannot be freed individually, since they are stored head-to-tail in a single pool for maximum compactness. The memory is reused only when the entire pool is freed, after an xmlclean() or xmlterm() call. Use saveString2() for saving Unicode strings.
oratext *saveString( xmlctx *ctx, oratext *str);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | LPX context |
| str | (IN) | Pointer to a single or multibyte string. |
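The head-to-tail storage described for saveString() can be pictured with a small stand-alone sketch. Everything here (`demo_pool`, `demo_pool_save`, and the other `demo_` names) is hypothetical and is not the XDK's actual implementation; it only illustrates why a pooled string cannot be released on its own and why the space comes back only when the whole pool is reset or freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string pool; not the XDK's real layout. */
typedef struct {
    char  *base;   /* one contiguous block holding every saved string */
    size_t used;   /* bytes consumed so far */
    size_t cap;    /* total bytes available */
} demo_pool;

/* Create a pool with room for cap bytes; returns 0 on success, -1 on failure. */
static int demo_pool_init(demo_pool *p, size_t cap)
{
    p->base = malloc(cap);
    p->used = 0;
    p->cap  = p->base ? cap : 0;
    return p->base ? 0 : -1;
}

/* Copy str, including its NUL terminator, directly after the previous string.
   Returns the pooled copy, or NULL if the pool has no room left. */
static char *demo_pool_save(demo_pool *p, const char *str)
{
    size_t n = strlen(str) + 1;
    char *dst;
    assert(p->base != NULL);
    if (n > p->cap - p->used)
        return NULL;
    dst = p->base + p->used;
    memcpy(dst, str, n);
    p->used += n;
    return dst;
}

/* Plays the role of xmlclean() in this sketch: every string is forgotten
   at once, but the block is kept so later saves can reuse it. */
static void demo_pool_reset(demo_pool *p)
{
    p->used = 0;
}

/* Plays the role of xmlterm() in this sketch: the block goes back to the system. */
static void demo_pool_term(demo_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->used = p->cap = 0;
}
```

Because each string starts where the previous one ended, there is no per-string bookkeeping that would allow a single entry to be released; pointers handed out before a reset should be treated as invalid afterwards.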
Allocates memory and saves the NULL-terminated Unicode string in the XML string pool. Note that a Unicode string is terminated with TWO NULL bytes, not just one. Strings saved this way cannot be freed individually, since they are stored head-to-tail in a single pool for maximum compactness. The memory is reused only when the entire pool is freed, after an xmlclean() or xmlterm() call. Use saveString() for saving single or multibyte strings.
ub2 *saveString2( xmlctx *ctx, ub2 *ustr);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | LPX context |
| ustr | (IN) | Pointer to a Unicode string |
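The two-byte terminator matters because a 16-bit string can contain individual zero bytes that do not end it. The following sketch measures such a string; `demo_ub2` is a stand-in for `ub2` (assumed here to be 16 bits wide) and the helper names are hypothetical, not part of the XDK.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short demo_ub2;  /* stand-in for ub2, assumed 16 bits wide */

/* Count the 16-bit units that precede the terminating 0x0000 unit.
   A byte-wise strlen() would stop too early on any unit with a zero byte,
   for example 0x0041, whose high byte is zero. */
static size_t demo_ustrlen(const demo_ub2 *ustr)
{
    size_t n = 0;
    assert(ustr != NULL);
    while (ustr[n] != 0)
        n++;
    return n;
}

/* Bytes needed to store the string, including the two-byte terminator. */
static size_t demo_ustr_bytes(const demo_ub2 *ustr)
{
    return (demo_ustrlen(ustr) + 1) * sizeof(demo_ub2);
}
```

A caller that builds a Unicode string by hand would need to append a full zero unit, not a single zero byte, before passing it to a function such as saveString2().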
Creates a printed representation of an XML tree rooted at the given node, and puts it into a destination buffer. Indentation is controlled by level and step: step is the number of spaces to indent each new level, and level is the starting level; 0 for top-level.
void printBuffer( oratext *buffer, size_t bufsiz, xmlnode *node, uword step, uword level);
Returns the size of the printed representation of an XML tree, rooted at the given node. Indentation is controlled by level and step as for printBuffer: step is the number of spaces to indent each new level, and level is the starting level; 0 for top-level. This function is used to pre-compute the size of the buffer needed for printBuffer().
size_t printSize( xmlnode *node, uword step, uword level);
| Parameter | IN / OUT | Description |
|---|---|---|
| node | (IN) | root node of XML tree to print |
| step | (IN) | number of spaces to indent each new level |
| level | (IN) | starting level of indentation |
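The measure-then-print pattern, and the way step and level combine into an indent of step × level spaces, can be sketched without the XDK. The `demo_node` type and both functions below are hypothetical miniatures: the real xmlnode is opaque, the real printBuffer() returns void, and whether printSize() counts a terminator is not stated here. In this sketch the size excludes the final NUL, so the caller adds one byte.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical miniature tree node; the real xmlnode is opaque. */
typedef struct demo_node {
    const char *name;
    const struct demo_node *child;    /* first child, or NULL */
    const struct demo_node *sibling;  /* next sibling, or NULL */
} demo_node;

/* First pass: count the characters the printed subtree would need.
   Each element becomes one line "<name>\n" indented step * level spaces. */
static size_t demo_print_size(const demo_node *node, unsigned step, unsigned level)
{
    size_t total;
    const demo_node *c;
    /* snprintf with a NULL buffer and size 0 only measures the output. */
    int n = snprintf(NULL, 0, "%*s<%s>\n", (int)(step * level), "", node->name);
    assert(n >= 0);
    total = (size_t)n;
    for (c = node->child; c != NULL; c = c->sibling)
        total += demo_print_size(c, step, level + 1);
    return total;
}

/* Second pass: write the same text into buffer, which must hold the
   measured size plus one byte for the NUL. Returns characters written. */
static size_t demo_print_buffer(char *buffer, size_t bufsiz, const demo_node *node,
                                unsigned step, unsigned level)
{
    size_t used;
    const demo_node *c;
    int n = snprintf(buffer, bufsiz, "%*s<%s>\n", (int)(step * level), "", node->name);
    assert(n >= 0 && (size_t)n < bufsiz);  /* buffer was sized by the first pass */
    used = (size_t)n;
    for (c = node->child; c != NULL; c = c->sibling)
        used += demo_print_buffer(buffer + used, bufsiz - used, c, step, level + 1);
    return used;
}
```

Calling the sizing function with the same step and level that will later be used for printing is what keeps the two passes in agreement; changing either value between the calls would invalidate the measurement.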
Writes a printed representation of an XML tree (rooted at the given node) to a stdio stream (FILE*). This function is exactly like printBuffer except output is to a stream instead of into a buffer. Indentation is controlled by level and step: step is the number of spaces to indent each new level, and level is the starting level (0 for top-level).
void printStream( FILE *stream, xmlnode *node, uword step, uword level);
Sets the document order for each node in the current document. Must be called once on the final document before XSLT processing can occur. Note this is called automatically by the XSLT processor, so ordinarily the user need not make this call.
ub4 setDocOrder(xmlctx *ctx, ub4 start_id);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | XML context |
| start_id | (IN) | Initial id number to assign |
Sets the I/O callback functions for the given access method.
uword xmlaccess( xmlctx *ctx, xmlacctype access, XML_OPENF((*openf)), XML_CLOSEF((*closef)), XML_READF((*readf)));
Most access methods have built-in callback functions, which therefore do not have to be provided by the user. The notable exception is XMLACCESS_STREAM, where the user must set the stream callback functions themselves.
The three callback functions are invoked to open, close, and read from the input source. The functions should have been declared using the function prototype macros XML_OPENF, XML_CLOSEF and XML_READF.
XML_OPENF is the open function, called once to open the input source. It should set its persistent handle in the xmlihdl union, which has two choices, a generic pointer (void *), and an integer (as unix file or socket handle). This function must return XMLERR_OK on success.
XML_CLOSEF is the close function; it closes an open source and frees resources.
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | XML context. |
| ih | (IN) | The opened handle. |
XML_READF is the reader function; it reads data from an open source into a buffer, and returns the number of bytes read.
On EOI, the matching close function will be called automatically.
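The open, read, and close roles can be illustrated with an in-memory source. The callback signatures below are invented for this sketch and are not the ones produced by the XML_OPENF, XML_CLOSEF, and XML_READF macros; the `demo_` names, the `eoi` out-parameter, and the use of 0 in place of XMLERR_OK (following the zero-means-success convention used for error codes) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical input source: a fixed block of memory read in chunks. */
typedef struct {
    const char *data;
    size_t      len;
    size_t      pos;
    int         is_open;
} demo_source;

/* Open role: remember where the data is. Returns 0 for success. */
static int demo_open(demo_source *src, const char *data, size_t len)
{
    src->data = data;
    src->len = len;
    src->pos = 0;
    src->is_open = 1;
    return 0;
}

/* Close role: release whatever the open step acquired. */
static void demo_close(demo_source *src)
{
    src->is_open = 0;
}

/* Read role: copy up to destsize bytes, report how many were copied,
   and set *eoi to nonzero once the source is exhausted. */
static size_t demo_read(demo_source *src, char *dest, size_t destsize, int *eoi)
{
    size_t left = src->len - src->pos;
    size_t n = left < destsize ? left : destsize;
    assert(src->is_open);
    memcpy(dest, src->data + src->pos, n);
    src->pos += n;
    *eoi = (src->pos == src->len);
    return n;
}

/* What a caller of these callbacks might do: read until end of input,
   then invoke the matching close function itself. */
static size_t demo_drain(demo_source *src, size_t chunk)
{
    char buf[64];
    size_t total = 0;
    int eoi = 0;
    assert(chunk > 0 && chunk <= sizeof buf);
    while (!eoi)
        total += demo_read(src, buf, chunk, &eoi);
    demo_close(src);
    return total;
}
```

The point of `demo_drain` is the ownership rule: the code that drives the reads is also the code that calls close at end of input, so the read callback itself never needs to close the source.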
Recycles memory within the XML parser, but does not free it to the system; only xmlterm() finally releases all memory back to the system. If xmlclean() is not called between parses, then the data used by the previous documents remains allocated, and pointers to it are valid. Thus, the data for multiple documents can be accessible simultaneously, although only the current document can be manipulated with DOM.
If access to only one document's data at a time within a single context is desired, then xmlclean() should be called before each new parse.
void xmlclean( xmlctx *ctx);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context |
Initializes the XML parser. It must be called before any parsing can take place.
The err argument is set on error. As usual, a zero error code means success; nonzero indicates a problem.
xmlterm() should be called after all processing of XML files has completed.
Error messages are printed to stderr unless msghdlr is given.
The saxcb callbacks should be set. Note that any of the SAX callback functions may be set to NULL if not needed.
memcb may be used for user-defined memory allocation. Both alloc() and free() functions must be specified.
msgctx, saxcbctx, and memcbctx can be used to define and pass information to user-defined callback routines for the message handler, SAX functions, or memory functions, respectively. They should be set to NULL if user-defined callback functions do not need any additional information passed in to them.
The lang parameter determines the language of error messages; the default is "American."
xmlctx *xmlinit( uword *err, const oratext *incoding, XML_MSGHDLRF((*msghdlr)), void *msgctx, const xmlsaxcb *saxcb, void *saxcbctx, const xmlmemcb *memcb, void *memcbctx, const oratext *lang);
Initializes the XML parser, specifying DOM data encoding. It must be called before any parsing can take place. Same as SAX xmlinit(), but allows data encoding to be specified.
xmlctx *xmlinitenc( uword *err, const oratext *incoding, const oratext *outcoding, XML_MSGHDLRF((*msghdlr)), void *msgctx, const xmlsaxcb *saxcb, void *saxcbctx, const xmlmemcb *memcb, void *memcbctx, const oratext *lang);
Returns current source location while parsing. This function may be called at any time. However, a 0 will be returned for both path and line if the call is not made during parsing.
uword xmlLocation( xmlctx *ctx, ub4 *line, oratext **path);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN) | The XML parser context |
| line | (OUT) | Current line number |
| path | (OUT) | Current source path/URL |
Invokes the XML parser on an input document that is specified by a URI. The parser must have been initialized successfully with a call to xmlinit() or xmlinitenc() first. Parser options are specified as flag bits OR'd together into the flags mask.
The default input encoding may be specified as incoding, which overrides the incoding given to xmlinit(). If the input's encoding cannot be determined automatically (from the BOM, the XML declaration, and so on), it is assumed to be incoding. Use IANA/MIME encoding names, such as "UTF-8" or "ASCII".
uword xmlparse( xmlctx *ctx, const oratext *uri, const oratext *incoding, ub4 flags);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN/OUT) | The XML parser context |
| uri | (IN) | URI of XML document |
| incoding | (IN) | Default input encoding |
| flags | (IN) | Mask of parser option flag bits |
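Building a flags mask by OR'ing option bits together can be sketched as follows. The flag names and values are placeholders chosen for this illustration and are not the XDK's constants; `demo_ub4` stands in for `ub4`. The actual option bits should be taken from the XDK header.

```c
#include <assert.h>

typedef unsigned int demo_ub4;  /* stand-in for ub4 */

/* Hypothetical option bits; names and values are placeholders only.
   Each option occupies its own bit so that options can be combined. */
#define DEMO_FLAG_VALIDATE            0x01u
#define DEMO_FLAG_DISCARD_WHITESPACE  0x02u
#define DEMO_FLAG_STOP_ON_WARNING     0x04u

/* Returns nonzero when every bit in bit is present in flags. */
static int demo_flag_is_set(demo_ub4 flags, demo_ub4 bit)
{
    assert(bit != 0);
    return (flags & bit) == bit;
}
```

A mask built as `DEMO_FLAG_VALIDATE | DEMO_FLAG_STOP_ON_WARNING` carries both options in one argument, and passing 0 selects none of them.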
Invokes the XML parser on a buffer. The parser must have been initialized successfully with a call to xmlinit() or xmlinitenc() first. Parser options are specified as flag bits OR'd together into the flags mask.
The default input encoding may be specified as incoding, which overrides the incoding given to xmlinit(). If the input's encoding cannot be determined automatically (from the BOM, the XML declaration, and so on), it is assumed to be incoding. Use IANA/MIME encoding names, such as "UTF-8" or "ASCII".
uword xmlparsebuf( xmlctx *ctx, const oratext *buffer, size_t len, const oratext *incoding, ub4 flags);
Invokes the XML parser on an external DTD file, not a complete document. It is used mainly by the Class Generator to create classes from a DTD without requiring a complete document. The parser must have been initialized successfully with a call to xmlinit() or xmlinitenc() first. Parser options are specified as flag bits OR'd together into the flags mask.
The default input encoding may be specified as incoding, which overrides the incoding given to xmlinit(). If the input's encoding cannot be determined automatically (from the BOM, the XML declaration, and so on), it is assumed to be incoding. Use IANA/MIME encoding names, such as "UTF-8" or "ASCII".
uword xmlparsedtd( xmlctx *ctx, const oratext *filename, oratext *name, const oratext *incoding, ub4 flags);
Invokes the XML parser on a document in the file system. The parser must have been initialized successfully with a call to xmlinit() or xmlinitenc() first. Parser options are specified as flag bits OR'd together into the flags mask.
The default input encoding may be specified as incoding, which overrides the incoding given to xmlinit(). If the input's encoding cannot be determined automatically (from the BOM, the XML declaration, and so on), it is assumed to be incoding. Use IANA/MIME encoding names, such as "UTF-8" or "ASCII".
uword xmlparsefile( xmlctx *ctx, const oratext *path, const oratext *incoding, ub4 flags);
| Parameter | IN / OUT | Description |
|---|---|---|
| ctx | (IN/OUT) | The XML parser context |
| path | (IN) | Filesystem path of the document |
| incoding | (IN) | Default input encoding |
| flags | (IN) | Mask of parser option flag bits |