Unless stated otherwise, all Perl variables mentioned are within the scope of package dtd.
       require "dtd/dtd.pl";
The following routines are defined:
&'DTDread_dtd(FILEHANDLE); 
       DTDread_dtd parses the SGML DTD specified by FILEHANDLE. Parsing of the 
       DTD stops once the end of the file is reached. Any external entity references will 
       be parsed if an entity to filename mapping exists (see DTDread_mapfile).
       DTDread_dtd makes the following assumptions when parsing a DTD:
sgmls, or other SGML validator, for such purposes.
$namechars.  There 
              is no size limit on name length.
EN". Others can be 
              added by changing the $pubtl variable.
After DTDread_dtd is finished, the following associative arrays are filled 
       (remember, all the arrays are within the scope of package dtd):
%ParEntity 
%PubParEntity 
PUBLIC).
%SysParEntity 
SYSTEM).
%ElemCont 
%ElemInc 
%ElemExc 
%ElemTag 
%Attribute 
              To access the data stored in %Attribute, it is best to use 
              DTDget_elem_attr.
%ElemCont, 
       %ElemInc, %ElemExc, %ElemTag, %Attribute arrays.
       When trying to locate external parameter entity files, DTDread_dtd uses 
       the environment variable P_SGML_PATH.  P_SGML_PATH is a colon-separated 
       string telling DTDread_dtd where to look for external entities.  By default, 
       DTDread_dtd will look in the current working directory or the sub-directory 
       called ents.
       If DTDread_dtd cannot resolve an external entity reference, it will issue a 
       warning and continue parsing the DTD.
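The lookup described above can be pictured with the sketch below. It is not the code from dtd.pl: the helper name find_entity_file is hypothetical, and trying the P_SGML_PATH directories before the two default locations is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of dtd.pl): return the first readable
# file called $name found in the colon-separated $path, then in the
# default locations "." and "ents". Returns nothing if no file is found.
sub find_entity_file {
    my ($name, $path) = @_;
    my @dirs = grep { length } split(/:/, defined $path ? $path : '');
    push(@dirs, '.', 'ents');    # defaults named in the text; order assumed
    foreach my $dir (@dirs) {
        my $candidate = "$dir/$name";
        return $candidate if -f $candidate && -r _;
    }
    return;
}

my $file = find_entity_file('iso-lat1.ent', $ENV{P_SGML_PATH});
if (defined $file) {
    print "found: $file\n";
} else {
    # Mirrors the documented behavior: warn and carry on.
    warn "Warning: unable to locate iso-lat1.ent\n";
}
```

Setting P_SGML_PATH to something like /usr/local/sgml/ents:/home/me/ents (paths invented here) would make both directories candidates before the defaults are tried.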
       Current status of DTDread_dtd:
<!DOCTYPE is parsed, but an external reference to a file is not implemented.
INCLUDE and IGNORE marked sections are processed with nested marked 
              sections allowed.  CDATA and RCDATA marked sections are not recognized 
              and may cause incorrect behavior.  However, CDATA and RCDATA marked 
              sections do not normally appear in a DTD.
              IGNORE has higher precedence than INCLUDE in case of nested sections.
LINKTYPE, NOTATION, SHORTREF, USEMAP declarations are ignored.
The performance of DTDread_dtd is not the best.  DTDread_dtd makes frequent 
       use of Perl's getc function.  If SGML did not have such screwy grammar rules, 
       I could have easily avoided getc.  I haven't bothered trying to optimize 
       DTDread_dtd's performance.  So far it is working, and I do not feel like mucking 
       with it.
       DTDread_dtd is meant to process DTDs in separate files.  If a document instance 
       is in the file DTDread_dtd is parsing, God only knows what will happen.
&'DTDread_mapfile($filename); 
       DTDread_mapfile parses the entity map file specified by $filename.
       DTDread_mapfile uses the environment variable P_SGML_PATH, as described in 
       the DTDread_dtd section, to locate $filename.  This way, one can put the map file in 
       the same location as the entity files.
       DTDread_mapfile makes the following assumptions when parsing $filename:
$pubtl variable.
SYSTEM entity names).
# DTDread_mapfile will ignore lines beginning with a `#' character.
#####################
# ISO entity files
#
ISO 8879-1986//ENTITIES General Technical//EN iso-tech.ent
ISO 8879-1986//ENTITIES Publishing//EN iso-pub.ent
ISO 8879-1986//ENTITIES Numeric and Special Graphic//EN iso-num.ent
ISO 8879-1986//ENTITIES Greek Letters//EN iso-grk1.ent
ISO 8879-1986//ENTITIES Diacritical Marks//EN iso-dia.ent
ISO 8879-1986//ENTITIES Added Latin 1//EN iso-lat1.ent
ISO 8879-1986//ENTITIES Greek Symbols//EN iso-grk3.ent
ISO 8879-1986//ENTITIES Added Latin 2//EN ISOlat2
ISO 8879-1986//ENTITIES Added Math Symbols: Ordinary//EN ISOamso
#####################
# ArborText entity file
#
-//ArborText//ELEMENTS Math Equation Structures//EN ati-math.elm
#####################
# A sample SYSTEM entities
#
MyGraphics my_graphics.ent
# end of map file
If DTDread_mapfile cannot access $filename, it will issue a warning to that 
       effect.
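As a rough illustration of the map file layout shown in the sample, the sketch below turns map lines into an identifier-to-filename hash. It is not dtd.pl's parser. The name parse_map_lines is hypothetical, and two points are assumptions inferred only from the sample: the filename is the last whitespace-separated field on a line, and blank lines are skipped.

```perl
use strict;
use warnings;

# Hypothetical sketch (not dtd.pl's code): map each entity identifier
# to its filename. Assumes the filename is the final field on the line
# and contains no whitespace.
sub parse_map_lines {
    my (@lines) = @_;
    my %map;
    foreach my $line (@lines) {
        chomp $line;
        next if $line =~ /^#/;        # comment line, as documented
        next if $line =~ /^\s*$/;     # blank line (assumed to be skipped)
        if ($line =~ /^\s*(\S.*?)\s+(\S+)\s*$/) {
            $map{$1} = $2;            # identifier => filename
        }
    }
    return %map;
}

my %map = parse_map_lines(
    "# ISO entity files\n",
    "ISO 8879-1986//ENTITIES Added Latin 1//EN iso-lat1.ent\n",
    "MyGraphics my_graphics.ent\n",
);
foreach my $id (sort keys %map) {
    print "$id => $map{$id}\n";
}
```

The non-greedy first group lets a public identifier keep its internal spaces while the filename is still taken from the end of the line.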
@elements = &'DTDget_elements(); 
       DTDget_elements retrieves a sorted array of all elements defined in the DTD.
This function is only useful after DTDread_dtd has been called.
@top_elements = &'DTDget_top_elements(); 
       DTDget_top_elements retrieves a sorted array of all top-most elements defined 
       in the DTD. Top-most elements are those elements that cannot be contained within 
       another element or can only be contained within themselves.
This function is only useful after DTDread_dtd has been called.
%attribute = &'DTDget_elem_attr($elem); 
       DTDget_elem_attr returns an associative array containing the attributes of 
       $elem. The keys of the array are the attribute names, and the array values are $; 
       separated strings of the possible values for the attributes. Example of extracting an 
       attribute's values:
       @values = split(/$;/, $attribute{'alignment'}); 
       The first value of the array split on $; is the default value for the attribute 
       (which may be an SGML reserved word), and the other array values are all possible 
       values for the attribute.
$; is assumed to be the default value assigned by Perl: \034. If $; is 
              changed, unpredictable results may occur.
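The following sketch shows the split in isolation. The %attribute hash is built by hand as a stand-in for what DTDget_elem_attr is described as returning, so the attribute name and its values are invented for illustration and dtd.pl itself is not loaded.

```perl
use strict;
use warnings;

# Stand-in for the hash returned by DTDget_elem_attr (values invented).
# Per the description above: default value first, then every allowed value.
my %attribute = (
    'alignment' => join($;, 'left', 'left', 'center', 'right'),
);

my $sep = $;;    # copy of Perl's subscript separator (default \034)

# \Q...\E quotes the separator in case it holds regex metacharacters.
my ($default, @values) = split(/\Q$sep\E/, $attribute{'alignment'});

print "default: $default\n";
print "allowed: @values\n";
```

Peeling the first field off in the list assignment keeps the default apart from the list of allowed values.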
@parent_elements = &'DTDget_parents($elem); 
       DTDget_parents returns an array of all elements that may be a parent of $elem.
This function is only useful after DTDread_dtd has been called.
&'DTDis_attr_keyword($word); 
       DTDis_attr_keyword returns 1 if $word is an attribute content reserved value; 
       otherwise, it returns 0. In the reference concrete syntax, the following values of 
       $word will return 1:
CDATA 
ENTITY 
ENTITIES 
ID 
IDREF 
IDREFS 
NAME 
NAMES 
NMTOKEN 
NMTOKENS 
NOTATION 
NUMBER 
NUMBERS 
NUTOKEN 
NUTOKENS 
&'DTDis_elem_keyword($word); 
       DTDis_elem_keyword returns 1 if $word is an element content reserved value; 
       otherwise, it returns 0. In the reference concrete syntax, the following values of 
       $word will return 1:
#PCDATA 
CDATA 
EMPTY 
RCDATA 
&'DTDprint_tree($elem, $depth, FILEHANDLE); 
       DTDprint_tree prints the content hierarchy of a single element, $elem, to a 
       maximum depth of $depth to the file specified by FILEHANDLE. If FILEHANDLE 
       is not specified then output goes to STDOUT. A depth of 5 is used if $depth is not 
       specified. The root of the tree has a depth of 1.
       The routine cuts at elements that exist at higher (or equal) levels or if $depth has 
       been reached.  The string "..." is appended to an element if it has been cut off due 
       to preexistence at a higher (or equal) level.
Cutting the tree at repeated elements is necessary to avoid a combinatorial explosion with recursive element definitions.
Here's an example of what the output will look like due to pruning of recursive element contents:
htmlplus
|
|_body
| |
| |_address
| | |
| | |_p ...
| |
| |_div1
| | |
| | |_address ...
| | |_div2 ...
| | |_div3 ...
| | |_div4 ...
| | |_div5 ...
| | |_div6 ...
If you see an element with "...", just search through the output until you find the 
       element without the "...".
In order to recognize cousins, a breadth-first search is needed, or a full traversal of the hierarchy before outputting. The above technique is currently sufficient to avoid combinatorial explosions. Plus, it allows the tree to be printed while traversing the element data; there is no need to create a Perl tree data structure before printing (saves time, memory, and debugging).
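To make the pruning idea concrete, here is a small depth-first sketch. It is not the code from dtd.pl, and how dtd.pl actually tracks previously seen elements is not stated here. The sketch assumes that each call knows only the elements listed on the path from the root (ancestors, their siblings, and the current siblings), which is one way to get the "cousins are not recognized" behavior. The content model, the walk routine, and the simplified output (no spacer lines) are all invented for illustration.

```perl
use strict;
use warnings;

# Toy content model (invented): element => list of child elements.
my %content = (
    'doc'  => ['body'],
    'body' => ['note', 'sect', 'p'],
    'note' => ['p'],
    'sect' => ['note', 'sect', 'p'],
    'p'    => ['#PCDATA'],
);

my @out;    # output lines, collected so they are easy to inspect

# $known is the set of elements already listed at a higher or equal
# level on the way down to $elem. The root has a depth of 1.
sub walk {
    my ($elem, $depth, $max_depth, $known) = @_;
    return if $depth >= $max_depth;      # children would exceed the limit
    my %seen = %$known;                  # scoped copy: cousins stay unknown
    my @kids = @{ $content{$elem} || [] };
    my %cut;
    # Pass 1: register all children before descending into any of them.
    foreach my $kid (@kids) {
        if ($seen{$kid}) { $cut{$kid} = 1 } else { $seen{$kid} = 1 }
    }
    # Pass 2: print each child, recursing only into those not cut.
    foreach my $kid (@kids) {
        my $line = ('| ' x ($depth - 1)) . '|_' . $kid;
        if ($cut{$kid}) {
            push(@out, "$line ...");
        } else {
            push(@out, $line);
            walk($kid, $depth + 1, $max_depth, \%seen);
        }
    }
}

push(@out, 'doc');
walk('doc', 1, 5, { 'doc' => 1 });
print "$_\n" foreach @out;
```

Because every recursive element is also one of its own ancestors, it is always in the known set when it reappears, so the walk terminates even without the depth limit. Lines are produced during the traversal itself, with no separate tree structure built first.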
..." may be the only 
       place of reference to see the content hierarchy of that element. However, the 
       element may occur in multiple contents and have different ancestral inclusion 
       and exclusion elements applied to it.
Have I lost you? Maybe an example may help:
openbook
|
|_d1
| | (I): idx needbegin needend newline
| |
| |_abbrev
| | | (Ia): idx needbegin needend newline
| | | (X): needbegin needend
| | |
| | |_#PCDATA
| | |_acro
| | | | (Ia): idx needbegin needend newline
| | | | (Xa): needbegin needend
| | | |
| | | |_#PCDATA
| | | |_sub ...
| | | |_super ...
| | |
Ignoring the lines starting with ()'s, one gets the content hierarchy of an element as defined by the DTD without concern for where it may occur in the overall structure. The ()'s lines give additional information regarding the element with respect to its existence within a specific context. For example, when an acro element occurs 
       within openbook/d1/abbrev, along with its normal content, it can contain idx 
       and newline elements due to inclusions from ancestors. However, it cannot 
       contain needbegin or needend regardless of its defined content since an 
       ancestor(s) excludes them.
needbegin, needend are excluded from acro.
(I) 
(Ia) 
(X) 
(Xa) 
&'DTDreset(); 
       DTDreset clears all data associated with the DTD read via DTDread_dtd. This 
       routine is useful if multiple DTDs need to be processed.