Import of fxp sources, as used in su4sml <https://git.logicalhacking.com/HOL-OCL/su4sml>.

Achim D. Brucker 2021-04-28 09:24:59 +01:00
parent b5f15612a7
commit 1d840722d6
187 changed files with 25766 additions and 0 deletions

339
fxp/CHANGES Normal file

@@ -0,0 +1,339 @@
From 1.4.6 to 2.0
---------------------------
(25.06.2004)
Added support for XML 1.1 (Entities.getChar, UniClasses.isNms/isName/isXml
now each have an 11 (for 1.1) counterpart, which is to be called when the XML
declaration declares version 1.1).
Changed Makefile to better support Windows installation.
(13.02.2004)
Updated fxp to handle the new specification of the xml:space attribute
From 1.4.5 to 1.4.6
---------------------------
(09.10.2003)
Modified documentation to fit in the fxgrep gui framework
From 1.4.4 to 1.4.5
---------------------------
(20.02.2002)
Modified the Makefiles to avoid unnecessary recompilations
2001-09-17
- added hasElement, hasAttribute, etc. functions in Dtd
Changes from 1.4.3 to 1.4.4
---------------------------
2000-10-30
- fixed a bug: the parser reported an error if more than one definition
was provided for the same ID attribute, instead of ignoring
declarations later than the first one
Changes from 1.4.2 to 1.4.3
---------------------------
2000-08-29:
- modified the autodetection of character encodings as stated in the
XML 1.0 Specification Errata from 2000-08-10 under E 44; among
others, UTF-8 with BOM is now recognized.
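For illustration only, BOM-based detection can be sketched in shell as
follows. This is not fxp's code: the function name detect_bom and the
encoding labels are made up for this sketch, and only the byte order mark
is inspected, not the XML declaration.

```shell
#!/bin/sh
# Hypothetical helper (not part of fxp): guess an encoding from a byte order mark.
detect_bom() {
    # Read the first four bytes as hex digits, e.g. "efbbbf3c".
    sig=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    case $sig in
        efbbbf*)  echo "UTF-8" ;;
        0000feff) echo "UCS-4BE" ;;
        fffe0000) echo "UCS-4LE" ;;   # must be tested before the UTF-16LE pattern
        feff*)    echo "UTF-16BE" ;;
        fffe*)    echo "UTF-16LE" ;;
        *)        echo "none" ;;      # no BOM: other detection rules would apply
    esac
}

# Example: a small document that starts with the UTF-8 BOM (EF BB BF).
tmp=$(mktemp)
printf '\357\273\277<doc/>' > "$tmp"
detect_bom "$tmp"
rm -f "$tmp"
```

The four-byte UCS-4 signatures are tested before the two-byte UTF-16 ones
because FF FE 00 00 also begins with the UTF-16 little-endian mark.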
Changes from 1.4.1 to 1.4.2:
--------------------------
2000-07-28:
- fixed a Solaris bug in the Makefile
Changes from 1.4 to 1.4.1:
--------------------------
2000-06-08:
- fixed a bug in surrogate composition
- fixed a bug: parser complained about undeclared entities in
non-validating mode, even if the DTD has parameter references.
- fixed another bug: parser never complained about undeclared
parameter entities that constitute a validity error.
- slightly restructured ParseRefs; resulting in some minor
changes in other modules.
- modified sample applications to report fatal errors as fatal.
Changes from 1.3 to 1.4:
------------------------
2000-02-18:
- fixed a bug in UtilCompare.compareVector
- eliminated UnsafeOps structure. Unsafe operations yield only an
insignificant speedup but severely impair reliability.
2000-02-10:
- parseElementContent reports only a single error for each piece
of character data in element content. Before this change, it did
report an error after each sequence of white space characters
followed by non-white space. Similarly for parseDocument.
1999-10-22:
- Added the start position of the PI text to HookData.ProcInstInfo.
1999-10-18:
- Fixed a small bug in UtilHash.hashTriple
1999-08-27:
- Removed the this/next components from the Entities.State type.
It is sufficient to compare the entity's indices in the DTD.
Added a type EntId, indicating whether an entity is a parameter
or general entity, and holding its index. isOpen, pushIntern and
pushExtern now have an additional parameter isParam:bool. Changed
all parser functions accordingly.
Changes from 1.2 to 1.3:
------------------------
1999-08-24:
- Removed options --table-size/width.
- added new encoding UCS2B/L, which is UTF16 without surrogate pairs.
- unrolled the getBytes loop in getCharUtf8. UTF-8 does not recognize
surrogate pairs any more (cf. RFC 2279, lines 201-206). The same
for UCS-4.
- changed implementation getCharUtf16/getCharUcs4: they are not higher
order any more. Efficiency increased.
1999-08-13:
- Changed ParseDecl.parseGenEntDecl not to check for declaredness of
unparsed entities' notations.
- Added maxUsedGen to Dtd: returns the number of general entities defined.
- Added checkUnparsed to DtdDeclare. It checks whether all notations of
unparsed entities have been declared. Added a call to checkUnparsed
at the end of parseDtd.
1999-08-12:
- Made DecodeUtf8.byte1switch a vector.
- Changed type CharClasses.CharClass to a vector. Arrays are now only
used for initializing the char classes; for lookup, they must be
finalized to a vector.
1999-08-03:
- Restructured the Dtd[Elements|Attributes|Notations|Entities] modules.
There are now two modules: DtdDeclare for operations concerning
declarations, and DtdAttributes for generating attribute values.
- Renamed AuxDtd to DtdManager.
- Renamed some functions from Entities to shorter names.
- Changed ParseDocument.parseDoc not to set O_VALIDATE to false if there
is no DTD. Instead those functions that would produce too many errors
without a DTD call hasDtd in order to find out whether there is one.
- Changed the error message for ERR_NO_DTD.
1999-07-29:
- Fixed a bug in the ParseContent.parseElementContent: character
references were passed to the application instead of reporting
an error.
- Removed ERR_DATA_IN_ELEM and ERR_CDATA_IN_ELEM; added ERR_ELEM_CONTENT
instead, which does the job for data, CDATA sections and charrefs.
- Added a boolean flag to HookData.DataInfo, indicating whether the
data is whitespace in element content. This information was available
only implicitly by the content spec of the parent element.
1999-07-26:
- Fixed a bug in the decoder and encoder, concerning surrogates: the
offset added/subtracted in combine/splitSurrogates was 0wx10000
instead of 0wx100000.
- DecodeUtil.isSurrogate ignored the high surrogates: fixed that.
1999-07-22:
- Renamed DecodeBasic to DecodeFile. Changed getByte such that it
closes the file (and removes it if temporary) before raising EndOfFile.
1999-07-20:
- Reimplemented the Uri structure to use strings for uris. (They may
only contain ASCII characters). Changed the result type of uriSuffix
to string.
- Moved the URI encoding functions to a new structure UriEncode.
A character "%" is only encoded if not followed by two hex digits.
Removed URI decoding, since that is superfluous.
- Added parser option O_WARN_NON_ASCII_URI. Changed
ParseLiterals.parseSystemLiteral to issue a warning if a non-ASCII
character occurs. Added a corresponding command line option
--warn-uri.
- Removed types CharInterval and CharRange from structure UniChar.
Added them to CharClasses, used only in Naming and NameClasses.
- Renamed type UniChar.CharVector to Vector for brevity.
- Renamed structures NameRanges and Naming to UniRanges and UniClasses.
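The rule for "%" in the 1999-07-20 entry can be sketched in shell as
follows. This is an illustration only, not the UriEncode structure itself:
the function name encode_percent is made up, and only the treatment of "%"
is shown, not the encoding of any other character.

```shell
#!/bin/sh
# Hypothetical helper (not fxp code): encode "%" as "%25" unless it is
# already followed by two hex digits, i.e. unless it starts an escape.
encode_percent() {
    awk 'BEGIN {
        s = ARGV[1]; out = ""
        while ((i = index(s, "%")) > 0) {
            out = out substr(s, 1, i - 1)
            if (substr(s, i + 1, 2) ~ /^[0-9A-Fa-f][0-9A-Fa-f]$/)
                out = out "%"      # existing escape such as %20: keep as is
            else
                out = out "%25"    # stray percent sign: encode it
            s = substr(s, i + 1)
        }
        print out s
    }' "$1"
}

encode_percent 'report%20final.xml'   # already escaped, left unchanged
encode_percent '100%.xml'             # stray "%" becomes "%25"
```

Leaving existing escapes alone keeps the operation idempotent, so encoding
an already encoded URI does not change it again.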
1999-07-19:
- Renamed structure Chars to UniChar. Changed definitions such that
UniChar provides a structure Chars : WORD such that type Char =
Chars.word;
- Moved type Byte to DecodeBasic such that DecodeBasic provides a
structure Bytes : WORD such that type Byte = Bytes.word;
- Changed other structures to use UniChar.Char and DecodeBasic.Bytes
instead of hardwired Word and Word8.
1999-07-14:
- Changed the DTD structure. The DTD is not a bunch of global variables
anymore; it is now a single data structure handed as argument to all
DTD functions. It must therefore be passed around through all DTD-
dependent parts of the parser and of all applications.
+ Removed O_INIT_DTD option. Instead, the parser expects an optional
DTD as argument. If that is NONE, a new DTD data structure is
initialized, otherwise the provided one is used.
+ Changed all applications to pass the DTD around.
- Changed functions ParseMisc.skipS and ParseRefs.skipPS. Instead of
raising NotFound on error, they call the hookError function themselves.
Changes from 1.1 to 1.2:
------------------------
1999-06-07:
- Replaced the --error-minimize option by --few-errors[=(yes|no)].
The old option was buggy and could only turn on a feature that
was on by default.
1999-06-04:
- Removed option --no-output/-n from fxesis and fxcopy.
- Modified the main functions of all applications: in case of
option errors, they now raise exception Exit instead
of calling OS.Process.exit.
- Added support for remote URIs.
+ In structure Config, value retrieveCommand defines a command
to be executed for URI retrieval.
+ Uri.retrieveUri calls this command for storing the entity
in a local file. It returns the name of the file and a flag
indicating whether a temporary file has actually been created
or the URI was local.
+ DecodeBasic.FILE has this boolean as additional parameter.
Function decClose removes the temp. file if the flag is true.
1999-05-10:
- Added new type AppFinal to HookData signature. hookFinish now
returns a value of type AppFinal. This is also the new type of
ParseDocument.parseDocument's result.
1999-05-03:
- Added support for XML syntax of XML Catalogs.
+ new function Uri.uriSuffix. Depending on the suffix of a URI,
catalogs are parsed in Socat syntax (.soc, .SOC) or XML syntax.
1999-04-23:
- Fixed bug: Dict.getByKey raised NoSuchEntry for unknown keys.
That made fxp fail with an uncaught exception if, e.g., an
unknown output encoding was given.
1999-04-15:
- Added functions Dict.clearDict and SymTable.clearSymTable.
- Removed references from the tables in Dtd. The Tables are now
initialized with clearDict/SymTable.
1999-04-14:
- Moved Hooks, HookData, Dtd, DtdOptions, Resolve and ParserOptions
to directory Params. Deleted directory Hooks.
- Added the start and end position to the arguments of most of the Hooks.
Made the appropriate changes in the parser modules and the apps.
- The parser is now reentrant, i.e., multiple instances are possible
at the same time.
- Made Dfa a functor, expecting the Dfa-relevant options in a
structure DfaOptions. Added a new functor for creating such a
structure.
- Made ParserOptions a functor in order to have multiple instances
of it. ParserOptions creates DfaOptions as a substructure.
1999-04-13:
- Made Dtd a functor, expecting a structure DtdOptions holding all options
concerning the DTD. Removed these options from ParseOptions.
- Removed O_SILENT, O_ERROR_DEVICE and O_ERROR_LINEWIDTH from ParseOptions.
These options now belong to the application.
- Fixed some bugs in the option parsers of applications.
- Fixed a bug in Dfa: exception DfaTooLarge was not visible through the
signature.
- Added an integer parameter to exceptions DfaTooLarge and to
WARN_DFA_TOO_LARGE. It is the maximal allowed number of states.
- ErrorMessage no longer depends on ParseOptions.
Changes from 1.0 to 1.1:
------------------------
1999-03-29:
- Fixed ErrorMessage.errorMessage to complain about standalone 'yes' instead
of 'no'.
- Avoid multiple EndOfFile events for the same entity by adding a boolean
flag to DecodeError/Bytes exceptions.
1999-03-25:
- Improved error reporting; the position handed to hookError/Warning is now
- whenever possible - the first character of the concerned item.
Exceptions are checks done at the end of the DTD or of the document.
- Improved handling of wrong end-tags. The strategy is:
+ if the end-tag is for the current element, consume it and finish the
element;
+ otherwise, if it is for an open element, assume the end-tag for the
current element was forgotten, i.e., finish the current element but
retain the end-tag;
+ otherwise, if the current element requires further content in order
to satisfy its content model, ignore the end-tag;
+ otherwise consume the end-tag and finish the current element.
In order to implement this, the following were necessary:
+ an extra argument openElems to ParseContent.parseElement. It is a list
of the indices of the types of the enclosing elements;
+ an extra component optEtag in the return value of parseElement. It is an
option and holds information about an end-tag that was not consumed when
finishing the element;
+ appropriate changes to the code of parse[Mixed|Element]Content.
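The four-way strategy for wrong end-tags in the 1999-03-25 entry can be
sketched in shell as a decision function. This is an illustration only:
the function name, its arguments and the action labels are made up and do
not mirror the signature of ParseContent.parseElement.

```shell
#!/bin/sh
# Hypothetical sketch (not fxp code) of the end-tag recovery decision.
#   $1 = element name in the end-tag
#   $2 = name of the current element
#   $3 = space-separated names of the enclosing open elements
#   $4 = "yes" if the current element still requires content
recover_end_tag() {
    etag=$1 current=$2 open_elems=$3 needs_content=$4
    # Case 1: the end-tag closes the current element.
    if [ "$etag" = "$current" ]; then
        echo "consume-and-finish"; return
    fi
    # Case 2: the end-tag closes an enclosing element, so the end-tag of
    # the current element was probably forgotten.
    for e in $open_elems; do
        if [ "$e" = "$etag" ]; then
            echo "finish-and-retain"; return
        fi
    done
    # Cases 3 and 4: the end-tag matches no open element.
    if [ "$needs_content" = yes ]; then
        echo "ignore"
    else
        echo "consume-and-finish"
    fi
}

recover_end_tag section para "section doc" no   # end-tag of an enclosing element
```

In the second case the end-tag is retained so that the enclosing element
can consume it later, which is the role the entry gives to optEtag.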
1999-03-24:
- fixed some bugs:
+ comparison of fixed attribute values is now correct.
+ Naming.isUnicode: 0x10000..0xFFFFF was not unicode.
+ ParseContent.parseMixedContent complained about character '>' even if
it was not part of the sequence ']]>'.
+ whitespace normalization in attribute values also affected characters
that were escaped by a char reference.
+ end-of-line normalization was done for the replacement text of internal
entities, but may only be done for the literal entity values.
+ default values in attlist declarations were checked not only for lexical
validity but also for declaredness of notation/entity names. This is now
done when the default value is used.
1999-03-22:
- Added support for the Socat syntax of XCatalog:
+ new subdir Catalog with a main functor Catalog
+ the parser functor now expects an additional structure argument Resolve
providing a function resolveExtId. That does not exist in BaseData any
more.
1999-03-16:
- Extracted Unicode-specific code from the Parser:
+ made a new directory Unicode.
+ moved Parser/Front/front.sml to Parser/entities.sml; renamed Front to
Entities.
+ moved Parser/Front to Unicode/Decode; renamed everything to Decode<...>,
frontEncoding to Encode.
+ moved Parser/Back to Unicode/Encode; renamed everything to Encode<...>,
backEncoding to Encode; moved back.sml to Apps/Copy/copyEncode.sml
+ moved Parser/Chars to Unicode/Chars.
- Changed the implementation of Decode and Encode:
+ UTF-7 is no longer supported.
+ new structure Encoding providing types and functions for handling
encoding names.
+ both Encode and Decode now raise exceptions instead of printing errors.
+ made the appropriate changes to Entities and CopyEncode.
1999-03-08:
- Moved fillArray for Front to FrontEncoding, renamed it to encGetArray.
- Hid implem. of type FrontEncoding.Encoding through the signature.
1999-03-05:
- Changed Front.fillArray to use a function parameter instead of a reference.
1999-03-02:
- Changed all parser functors:
+ expect only a structure Params containing former Hooks, Front, Dtd,
AuxParse and AuxRecover.
+ No other parser structures are arguments to functors.
- Removed the functions from DtdTables from the Dtd signature.
- Renamed Dtd to AuxDtd and DtdTables to Dtd.
1999-03-01:
- Added function hookWhite to signature Hooks.
- Added calls to hookWhite in ParseDtd.parseSubset and ParseDocument.parseDoc.
- Changed CopyHooks and CopyOutput to account for this:
+ removed third bool arg from CopyOutput.outComment/ProcInstr; they never
print a newline now;
+ removed printing of newlines from declaration hooks in CopyHooks;
+ removed function CopyHooks.inContent;
+ added hookWhite to CopyHooks;
- Added hookWhite to NullHooks and EsisHooks;
- Added LOC_SUBSET to datatype Location in ErrorData and ErrorString.
- Fixed ErrorMessage.errorMessage to print "Could not open file..." instead of
"Could open file..." for ERR_NO_SUCH_FILE.

55
fxp/COPYRIGHT Normal file

@@ -0,0 +1,55 @@
fxp: A Functional XML Parser
Version 2.0, 25.06.2004
------------------------------------------------
COPYRIGHT NOTICE, LICENSE AND DISCLAIMER.
Copyright 1999-2004 by Andreas Neumann and Alexandru Berlea, TU Munich
Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice and warranty
disclaimer appear in supporting documentation.
The TU Munich disclaims all warranties with regard to this
software, including all implied warranties of merchantability and
fitness. In no event shall the TU Munich be liable for any
special, indirect or consequential damages or any damages whatsoever
resulting from loss of use, data or profits, whether in an action of
contract, negligence or other tortious action, arising out of or in
connection with the use or performance of this software.
------------------------------------------------
This software was produced with the help of software, tools, and code
from SML of New Jersey. Therefore we repeat here the copyright notice
SML of New Jersey:
STANDARD ML OF NEW JERSEY COPYRIGHT NOTICE, LICENSE AND DISCLAIMER.
Copyright (c) 1989-1998 by Lucent Technologies
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice and warranty
disclaimer appear in supporting documentation, and that the name of
Lucent Technologies, Bell Labs or any Lucent entity not be used in
advertising or publicity pertaining to distribution of the software
without specific, written prior permission.
Lucent disclaims all warranties with regard to this software,
including all implied warranties of merchantability and fitness. In no
event shall Lucent be liable for any special, indirect or
consequential damages or any damages whatsoever resulting from loss of
use, data or profits, whether in an action of contract, negligence or
other tortious action, arising out of or in connection with the use
or performance of this software.
------------------------------------------------

146
fxp/Makefile Normal file

@@ -0,0 +1,146 @@
##############################################################################
# These are the programs to be installed (fxlib is the library).
##############################################################################
INSTALL_PROGS = fxp fxcanon fxcopy fxesis fxviz fxlib
##############################################################################
# These are the locations for executables, heap images and library files
##############################################################################
PREFIX = /cygdrive/d/xml
FXP_BINDIR = ${PREFIX}/bin
FXP_LIBDIR = ${PREFIX}/fxp
##############################################################################
# The path where the SML-NJ binaries are located, and the name of the
# SML-NJ executable with the Compilation manager built-in. If sml is in
# your PATH at execution time, you don't need the full path here.
##############################################################################
SML_BINDIR = /cygdrive/d/smlnj-110.43/bin
SML_EXEC = ${SML_BINDIR}/sml
##############################################################################
# No need to change this for SML-NJ 110.0.6. For earlier or working versions
# 110.19 you might have to use the second or third line. This is the
# compilation manager function for making with a named description file.
##############################################################################
#SML_MAKEDEF= val make = CM.make'
SML_MAKEDEF= val make = CM.make
#SML_MAKEDEF= fun make x = CM.make'{force_relink=true, group=x}
##############################################################################
# These should be fine on most unix machines
##############################################################################
SED = sed
RM = rm -f
RMDIR = rmdir
COPY = cp -f
CHMOD = chmod
FIND = find
#buggy in cygwin
#MKDIRHIER = mkdirhier
MKDIRHIER = mkdir -p
##############################################################################
# nothing to change below this line
##############################################################################
SRC = src
DOC = doc
FXLIB_PRUNE = \( -name CM -o -name CVS -o -name Apps \)
all: fxp.sh images
arch.os:
if test -s ${SML_BINDIR}/.arch-n-opsys; then\
${SML_BINDIR}/.arch-n-opsys | \
${SED} -e 's/^.*HEAP_SUFFIX=\(.*\)$$/\1/' > .arch-opsys;\
else \
echo "ARCH=x86; OPSYS=win32; HEAP_SUFFIX=x86-win32" | \
${SED} -e 's/^.*HEAP_SUFFIX=\(.*\)$$/\1/' > .arch-opsys;\
fi
fxp.sh: Makefile arch.os
${RM} fxp.sh
echo "#!/bin/sh -f" > fxp.sh
echo >> fxp.sh
echo "SML_BINDIR=${SML_BINDIR}" >> fxp.sh
echo "FXP_LIBDIR=${FXP_LIBDIR}" >> fxp.sh
cat fxp.sh.in >> fxp.sh
image.prog:
@echo "Creating the ${PROG_NAME} heap image..."
echo "${SML_MAKEDEF}; make \"${SRC}/${PROG_CM}\"; \
SMLofNJ.exportFn(\"${SRC}/_${PROG_NAME}\",${PROG_FUN})" | ${SML_EXEC}
image.fxlib:
image.fxp:
@make image.prog PROG_NAME=fxp PROG_CM=Apps/Null/null.cm PROG_FUN=Null.null
image.fxcanon:
@make image.prog PROG_NAME=fxcanon PROG_CM=Apps/Canon/canon.cm PROG_FUN=Canon.canon
image.fxcopy:
@make image.prog PROG_NAME=fxcopy PROG_CM=Apps/Copy/copy.cm PROG_FUN=Copy.copy
image.fxesis:
@make image.prog PROG_NAME=fxesis PROG_CM=Apps/Esis/esis.cm PROG_FUN=Esis.esis
image.fxviz:
@make image.prog PROG_NAME=fxviz PROG_CM=Apps/Viz/viz.cm PROG_FUN=Viz.viz
images:
for prog in ${INSTALL_PROGS}; do \
make image.$${prog}; \
done
inst.dirs:
test -d ${FXP_BINDIR} || ${MKDIRHIER} ${FXP_BINDIR}
test -d ${FXP_LIBDIR} || ${MKDIRHIER} ${FXP_LIBDIR}
inst.prog: inst.dirs fxp.sh arch.os
${RM} ${FXP_BINDIR}/${PROG_NAME} ${FXP_BINDIR}/fxp.sh \
${FXP_LIBDIR}/_${PROG_NAME}.`cat .arch-opsys`
${COPY} fxp.sh ${FXP_BINDIR}
${CHMOD} 755 ${FXP_BINDIR}/fxp.sh
ln -s fxp.sh ${FXP_BINDIR}/${PROG_NAME}
${COPY} ${SRC}/_${PROG_NAME}.`cat .arch-opsys` ${FXP_LIBDIR}
${CHMOD} 644 ${FXP_LIBDIR}/_${PROG_NAME}.`cat .arch-opsys`
inst.fxp:
@make inst.prog PROG_NAME=fxp PROG_CM=Apps/Null/null.cm PROG_FUN=Null.null
inst.fxcanon:
@make inst.prog PROG_NAME=fxcanon PROG_CM=Apps/Canon/canon.cm PROG_FUN=Canon.canon
inst.fxcopy:
@make inst.prog PROG_NAME=fxcopy PROG_CM=Apps/Copy/copy.cm PROG_FUN=Copy.copy
inst.fxesis:
@make inst.prog PROG_NAME=fxesis PROG_CM=Apps/Esis/esis.cm PROG_FUN=Esis.esis
inst.fxviz:
@make inst.prog PROG_NAME=fxviz PROG_CM=Apps/Viz/viz.cm PROG_FUN=Viz.viz
inst.fxlib:
for dir in `${FIND} ${SRC} ${FXLIB_PRUNE} -prune -o -type d -print`; do \
${MKDIRHIER} ${FXP_LIBDIR}/$${dir}; \
done
for file in `${FIND} ${SRC} ${FXLIB_PRUNE} -prune -o -name '*.sml' -print`; do \
${COPY} $${file} ${FXP_LIBDIR}/$${file}; \
done
${COPY} ${SRC}/fxlib.cm ${FXP_LIBDIR}/${SRC}/fxlib.cm
rm -f ${FXP_LIBDIR}/fxlib.cm
echo Group is > ${FXP_LIBDIR}/fxlib.cm
echo " "${SRC}/fxlib.cm >> ${FXP_LIBDIR}/fxlib.cm
${COPY} -r ${DOC} ${FXP_LIBDIR}
install:
for prog in ${INSTALL_PROGS}; do \
make inst.$${prog}; \
done
uninstall: arch.os
-for prog in ${INSTALL_PROGS}; do \
if [ "$${prog}" = "fxlib" ]; then \
${RM} -r ${FXP_LIBDIR}/src; \
else \
${RM} ${FXP_BINDIR}/$${prog}; \
${RM} ${FXP_LIBDIR}/_$${prog}.`cat .arch-opsys`; \
fi; \
done
-${RM} ${FXP_BINDIR}/fxp.sh
-${RMDIR} ${FXP_BINDIR} ${FXP_LIBDIR}
clean:
-${RM} ${SRC}/_fx*.* fxp.sh .arch-opsys
-find ${SRC} -type d -name CM -print | xargs ${RM} -r

133
fxp/README Normal file

@@ -0,0 +1,133 @@
fxp - The Functional XML Parser
Version 2.0, 25.06.2004
by
Andreas Neumann, University of Trier
neumann (AT) psi.uni-trier.de
Alexandru Berlea, TU Munich
berlea (AT) in.tum.de
What is fxp?
------------
fxp is a validating XML parser, written completely in the functional
programming language SML. It has a programming interface
allowing for production of XML applications based on fxp. It comes with
some example applications:
fxp The pure parser. It parses a document and finds well-formedness
errors, validity errors and other problems;
fxcanon Produces an equivalent canonical XML document. Canonical XML was
invented by James Clark for testing XML parsers. It contains only
the information a processor is required to pass to the application;
fxcopy Reproduces the document parsed by fxp. The copy can be generated
in a different encoding than the input, and can be normalized in
different ways concerning, e.g., expansion of entity references;
fxesis Produces an output similar to nsgmls's ESIS (Element Structure
Information Set) output;
fxviz An XML tree visualizer. It produces a graph description suitable as
input to Georg Sander's vcg.
Homepage
--------
http://www.informatik.uni-trier.de/~berlea/Fxp
Installation
------------
In order to install fxp, you need an SML compiler. It has been tested with
version 110.0.7 of SML of New Jersey, but it might also run with other
versions. The compiler must have the compilation manager (CM) built in, which
is the default when installing SML-NJ. We successfully compiled fxp on
Linux. For other unices we expect no problems. An installation using the
Windows version of SML-NJ is documented on fxp's homepage.
These are the steps for installing fxp under Unix:
1. Download the latest version of fxp;
2. Unpack the sources, and change to the fxp directory, e.g.:
gunzip -c fxp-2.0.tar.gz | tar xf -
cd fxp-2.0
3. Read the COPYRIGHT;
4. Edit the Makefile according to your needs. Probably you will only have
to change the following:
INSTALL_PROGS is the list of programs to be installed. fxlib is only
required if you want to develop applications with fxp.
FXP_BINDIR is where the executables are installed;
FXP_LIBDIR is where other files needed by fxp - the heap images
and the library - are installed;
SML_BINDIR is the directory where the SML executables are found.
It must contain the .arch-n-opsys script from the
SML-NJ distribution, so make sure that this is where
SML-NJ is physically installed;
SML_EXEC is the name of the SML executable. This is the program
that is called for generating the heap image and at
execution of fxp. If sml is in your PATH at
installation time, you don't need the full path here.
SML_MAKEDEF is for defining the make command in SML. After version
110.0.3, SML-NJ changed the type of CM.make'.
For earlier or working versions of SML-NJ, use the
second or third variant of this definition.
5. Edit the file src/config.sml according to your needs. Currently only a
single value can be configured here:
val retrieveCommand : string
is the command to be used by fxp for retrieving a remote URI from
the internet and storing it in a temporary file on the local file
system. It is a string value and should contain the strings %1
and %2, where:
- %1 is replaced by the URI;
- %2 is replaced by the local filename.
It is recommended that the command exits with failure in case the
URI cannot be retrieved. If the command generates an HTML error
message instead (like, e.g., "lynx -source %1 > %2"), this HTML
file is considered to be XML and will probably cause a mess of
parsing errors. If you don't need URI retrieval, use "exit 1"
which always fails on Unix. Sensible values are, e.g.:
- "wget -qO %2 %1"
(ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/)
- "got_it -o %2 %1"
(ftp://sunsite.unc.edu/pub/Linux/apps/www/
mirroring/got_it-0.34.tar.gz)
- "urlget -s -o %2 %1"
(ftp://sunsite.unc.edu/pub/Linux/apps/www/
mirroring/urlget-3.12.tar.gz)
6. Compile fxp by typing make;
7. Install fxp by typing make install.
8. If you want to use fxviz, you should also install vcg
(ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/).
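As an illustration of step 5, the following shell sketch shows how a
retrieveCommand template might be expanded. It is not fxp's own code: the
function name expand_retrieve_command, the example URI and the temporary
file name are made up, and the sketch only prints the resulting command
without running it.

```shell
#!/bin/sh
# Hypothetical helper (not fxp code): replace %1 by the URI and %2 by the
# local file name in a command template, in a single left-to-right pass.
expand_retrieve_command() {
    # The arguments are read from ARGV inside BEGIN, so awk treats them as
    # plain strings and never opens them as input files.
    awk 'BEGIN {
        t = ARGV[1]; uri = ARGV[2]; file = ARGV[3]; out = ""
        while ((i = index(t, "%")) > 0) {
            c = substr(t, i + 1, 1)
            out = out substr(t, 1, i - 1)
            if (c == "1")      { out = out uri;  t = substr(t, i + 2) }
            else if (c == "2") { out = out file; t = substr(t, i + 2) }
            else               { out = out "%";  t = substr(t, i + 1) }
        }
        print out t
    }' "$1" "$2" "$3"
}

expand_retrieve_command 'wget -qO %2 %1' \
    'http://example.org/my%20doc.xml' '/tmp/fxp-download.xml'
```

The single pass matters because URIs often contain escapes such as %20;
substituting %1 first and then searching the result for %2 would corrupt
them.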
If you experience problems installing fxp, send me mail at berlea (AT)
in.tum.de. Check for new versions at
http://www.informatik.tu-muenchen.de/~berlea/Fxp.
Running the Parser
------------------
Sample applications like fxp (a validating XML parser), fxcanon, fxcopy,
fxesis, and fxviz are described on fxp's homepage.
Programming Interface
---------------------
fxp's API is described in api.ps.
Alexandru Berlea

339
fxp/doc/CHANGES Normal file

@@ -0,0 +1,339 @@
From 1.4.6 to 2.0
---------------------------
(25.06.2004)
Added support for XML 1.1 (Entities.getChar, UniClasses.isNms/isName/isXml
now each have an 11 (for 1.1) counterpart, which is to be called when the XML
declaration declares version 1.1).
Changed Makefile to better support Windows installation.
(13.02.2004)
Updated fxp to handle the new specification of the xml:space attribute
From 1.4.5 to 1.4.6
---------------------------
(09.10.2003)
Modified documentation to fit in the fxgrep gui framework
From 1.4.4 to 1.4.5
---------------------------
(20.02.2002)
Modified the Makefiles to avoid unnecessary recompilations
2001-09-17
- added hasElement, hasAttribute, etc. functions in Dtd
Changes from 1.4.3 to 1.4.4
---------------------------
2000-10-30
- fixed a bug: the parser reported an error if more than one definition
was provided for the same ID attribute, instead of ignoring
declarations later than the first one
Changes from 1.4.2 to 1.4.3
---------------------------
2000-08-29:
- modified the autodetection of character encodings as stated in the
XML 1.0 Specification Errata from 2000-08-10 under E 44; among
others, UTF-8 with BOM is now recognized.
Changes from 1.4.1 to 1.4.2:
--------------------------
2000-07-28:
- fixed a Solaris bug in the Makefile
Changes from 1.4 to 1.4.1:
--------------------------
2000-06-08:
- fixed a bug in surrogate composition
- fixed a bug: parser complained about undeclared entities in
non-validating mode, even if the DTD has parameter references.
- fixed another bug: parser never complained about undeclared
parameter entities that constitute a validity error.
- slightly restructured ParseRefs; resulting in some minor
changes in other modules.
- modified sample applications to report fatal errors as fatal.
Changes from 1.3 to 1.4:
------------------------
2000-02-18:
- fixed a bug in UtilCompare.compareVector
- eliminated UnsafeOps structure. Unsafe operations yield only an
insignificant speedup but severely impair reliability.
2000-02-10:
- parseElementContent reports only a single error for each piece
of character data in element content. Before this change, it did
report an error after each sequence of white space characters
followed by non-white space. Similarly for parseDocument.
1999-10-22:
- Added the start position of the PI text to HookData.ProcInstInfo.
1999-10-18:
- Fixed a small bug in UtilHash.hashTriple
1999-08-27:
- Removed the this/next components from the Entities.State type.
It is sufficient to compare the entity's indices in the DTD.
Added a type EntId, indicating whether an entity is a parameter
or general entity, and holding its index. isOpen, pushIntern and
pushExtern now have an additional parameter isParam:bool. Changed
all parser functions accordingly.
Changes from 1.2 to 1.3:
------------------------
1999-08-24:
- Removed options --table-size/width.
- added new encoding UCS2B/L, which is UTF16 without surrogate pairs.
- unrolled the getBytes loop in getCharUtf8. UTF-8 does not recognize
surrogate pairs any more (cf. RFC 2279, lines 201-206). The same
for UCS-4.
- changed implementation getCharUtf16/getCharUcs4: they are not higher
order any more. Efficiency increased.
1999-08-13:
- Changed ParseDecl.parseGenEntDecl not to check for declaredness of
unparsed entities' notations.
- Added maxUsedGen to Dtd: returns the number of general entities defined.
- Added checkUnparsed to DtdDeclare. It checks whether all notations of
unparsed entities have been declared. Added a call to checkUnparsed
at the end of parseDtd.
1999-08-12:
- Made DecodeUtf8.byte1switch a vector.
- Changed type CharClasses.CharClass to a vector. Arrays are now only
used for initializing the char classes; for lookup, they must be
finalized to a vector.
1999-08-03:
- Restructured the Dtd[Elements|Attributes|Notations|Entities] modules.
There are now two modules: DtdDeclare for operations concerning
declarations, and DtdAttributes for generating attribute values.
- Renamed AuxDtd to DtdManager.
- Renamed some functions from Entities to shorter names.
- Changed ParseDocument.parseDoc not to set O_VALIDATE to false if there
is no DTD. Instead those functions that would produce too many errors
without a DTD call hasDtd in order to find out whether there is one.
- Changed the error message for ERR_NO_DTD.
1999-07-29:
- Fixed a bug in the ParseContent.parseElementContent: character
references were passed to the application instead of reporting
an error.
- Removed ERR_DATA_IN_ELEM and ERR_CDATA_IN_ELEM; added ERR_ELEM_CONTENT
instead, which does the job for data, CDATA sections and charrefs.
- Added a boolean flag to HookData.DataInfo, indicating whether the
data is whitespace in element content. This information was available
only implicitly by the content spec of the parent element.
1999-07-26:
- Fixed a bug in the decoder and encoder, concerning surrogates: the
offset added/subtracted in combine/splitSurrogates was 0wx10000
instead of 0wx100000.
- DecodeUtil.isSurrogate ignored the high surrogates: fixed that.
1999-07-22:
- Renamed DecodeBasic to DecodeFile. Changed getByte such that it
closes the file (and removes it if temporary) before raising EndOfFile.
1999-07-20:
- Reimplemented the Uri structure to use strings for uris. (They may
only contain ASCII characters). Changed the result type of uriSuffix
to string.
- Moved the URI encoding functions to a new structure UriEncode.
A character "%" is only encoded if not followed by two hex digits.
Removed URI decoding, since that is superfluous.
- Added parser option O_WARN_NON_ASCII_URI. Changed
ParseLiterals.parseSystemLiteral to issue a warning if a non-ASCII
character occurs. Added a corresponding command line option
--warn-uri.
- Removed types CharInterval and CharRange from structure UniChar.
Added them to CharClasses, used only in Naming and NameClasses.
- Renamed type UniChar.CharVector to Vector for brevity.
- Renamed structures NameRanges and Naming to UniRanges and UniClasses.
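The encoding rule for "%" described above can be sketched as follows. This is Python, not the UriEncode structure itself; the set of characters left untouched (SAFE) and the choice of UTF-8 for non-ASCII characters are assumptions made for illustration, and fxp's actual behaviour may differ.

```python
import string

HEX = set(string.hexdigits)

# Assumption: characters passed through unchanged; fxp's real set may differ.
SAFE = set(string.ascii_letters + string.digits + "-_.~!*'();:@&=+$,/?#[]")


def uri_encode(s):
    """Percent-encode s, leaving existing %XX escapes untouched."""
    out = []
    for i, ch in enumerate(s):
        if ch == "%":
            following = s[i + 1:i + 3]
            if len(following) == 2 and all(c in HEX for c in following):
                out.append(ch)       # already an escape sequence, keep it
            else:
                out.append("%25")    # a bare percent sign gets encoded
        elif ch in SAFE:
            out.append(ch)
        else:
            # Assumption: other characters are encoded via their UTF-8 bytes.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)
```

Because an already-escaped sequence is passed through, encoding a URI twice gives the same result as encoding it once, which is consistent with the changelog's remark that decoding became superfluous.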
1999-07-19:
- Renamed structure Chars to UniChar. Changed definitions such that
UniChar provides a structure Chars : WORD such that type Char =
Chars.word;
- Moved type Byte to DecodeBasic such that DecodeBasic provides a
structure Bytes : WORD such that type Byte = Chars.word;
- Changed other structures to use UniChar.Char and DecodeBasic.Bytes
instead of hardwired Word and Word8.
1999-07-14:
- Changed the DTD structure. The DTD is not a bunch of global variables
anymore; it is now a single data structure handed as argument to all
DTD functions. It must therefore be passed around through all DTD-
dependent parts of the parser and of all applications.
+ Removed O_INIT_DTD option. Instead, the parser expects an optional
DTD as argument. If that is NONE, a new DTD data structure is
initialized, otherwise the provided one is used.
+ Changed all applications to pass the DTD around.
- Changed functions ParseMisc.skipS and ParseRefs.skipPS. Instead of
raising NotFound on error, they call the hookError function themselves.
Changes from 1.1 to 1.2:
------------------------
1999-06-07:
- Replaced the --error-minimize option by --few-errors[=(yes|no)].
The old option was buggy and could only turn on a feature that
was on by default.
1999-06-04:
- Removed option --no-output/-n from fxesis and fxcopy.
- Modified the main functions of all applications: in case of
option errors, they now raise exception Exit instead
of calling OS.Process.exit.
- Added support for remote URIs.
+ In structure Config, value retrieveCommand defines a command
to be executed for URI retrieval.
+ Uri.retrieveUri calls this command for storing the entity
in a local file. It returns the name of the file and a flag
indicating whether a temporary file has actually been created
or the URI was local.
+ DecodeBasic.FILE has this boolean as additional parameter.
Function decClose removes the temp. file if the flag is true.
1999-05-10:
- Added new type AppFinal to HookData signature. hookFinish now
returns a value of type AppFinal. This is also the new type of
ParseDocument.parseDocument's result.
1999-05-03:
- Added support for XML syntax of XML Catalogs.
+ new function Uri.uriSuffix. Depending on the suffix of a URI,
catalogs are parsed in Socat syntax (.soc, .SOC) or XML syntax.
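The suffix-based choice of catalog syntax can be illustrated with a small Python sketch. The function names and the returned labels are hypothetical; only the rule itself (.soc or .SOC selects Socat syntax, anything else XML syntax) comes from the entry above.

```python
def uri_suffix(uri):
    """Return the text after the last '.' of the final path segment, or ''."""
    last_segment = uri.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return ""
    return last_segment.rsplit(".", 1)[1]


def catalog_syntax(uri):
    """Pick the catalog syntax from the URI suffix."""
    return "socat" if uri_suffix(uri) in ("soc", "SOC") else "xml"
```

Looking only at the final path segment keeps a dot in a directory name from being mistaken for a file suffix.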
1999-04-23:
- Fixed bug: Dict.getByKey raised NoSuchEntry for unknown keys.
That made fxp fail with an uncaught exception if, e.g., an
unknown output encoding was given.
1999-04-15:
- Added functions Dict.clearDict and SymTable.clearSymTable.
- Removed references from the tables in Dtd. The Tables are now
initialized with clearDict/SymTable.
1999-04-14:
- Moved Hooks, HookData, Dtd, DtdOptions, Resolve and ParserOptions
to directory Params. Deleted directory Hooks.
- Added the start and end position to the arguments of most of the Hooks.
Made the appropriate changes in the parser modules and the apps.
- The parser is now reentrant, i.e., multiple instances are possible
at the same time.
- Made Dfa a functor, expecting the Dfa-relevant options in a
structure DfaOptions. Added a new functor for creating such a
structure.
- Made ParserOptions a functor in order to have multiple instances
of it. ParserOptions creates DfaOptions as a substructure.
1999-04-13:
- Made Dtd a functor, expecting a structure DtdOptions holding all options
concerning the DTD. Removed these options from ParseOptions.
- Removed O_SILENT, O_ERROR_DEVICE and O_ERROR_LINEWIDTH from ParseOptions.
These options now belong to the application.
- Fixed some bugs in the option parsers of applications.
- Fixed a bug in Dfa: exception DfaTooLarge was not visible through the
signature.
- Added an integer parameter to exception DfaTooLarge and to
WARN_DFA_TOO_LARGE. It is the maximal allowed number of states.
- ErrorMessage no longer depends on ParseOptions.
Changes from 1.0 to 1.1:
------------------------
1999-03-29:
- Fixed ErrorMessage.errorMessage to complain about standalone 'yes' instead
of 'no'.
- Avoid multiple EndOfFile events for the same entity by adding a boolean
flag to DecodeError/Bytes exceptions.
1999-03-25:
- Improved error reporting; the position handed to hookError/Warning is now
- whenever possible - the first character of the concerned item.
Exceptions are checks done at the end of the DTD or of the document.
- Improved handling of wrong end-tags. The strategy is:
+ if the end-tag is for the current element, consume it and finish the
element;
+ otherwise, if it is for an open element, assume the end-tag for the
current element was forgotten, i.e., finish the current element but
retain the end-tag;
+ otherwise, if the current element requires further content in order
to satisfy its content model, ignore the end-tag;
+ otherwise consume the end-tag and finish the current element.
In order to implement this, the following were necessary:
+ an extra argument openElems to ParseContent.parseElement. It is a list
of the indices of the types of the enclosing elements;
+ an extra component optEtag in the return value of parseElement. It is an
option and holds information about an end-tag that was not consumed when
finishing the element;
+ appropriate changes to the code of parse[Mixed|Element]Content.
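The four-way strategy for wrong end-tags can be written down as a small decision function. The sketch below is Python rather than the parser's SML, and the function name, parameters and result shape are hypothetical; open_elems plays the role of the openElems argument and the retain_tag flag the role of optEtag.

```python
def recover_end_tag(tag, current, open_elems, needs_more_content):
    """Decide how to treat an end-tag seen while parsing element `current`.

    Returns (finish_current, retain_tag):
      finish_current -- close the current element now
      retain_tag     -- leave the end-tag unconsumed for an enclosing element
    """
    if tag == current:
        # The end-tag matches: consume it and finish the element.
        return (True, False)
    if tag in open_elems:
        # It closes an enclosing element: assume our own end-tag was forgotten.
        return (True, True)
    if needs_more_content:
        # The content model is not yet satisfied: ignore the stray end-tag.
        return (False, False)
    # Otherwise consume the end-tag and finish the current element.
    return (True, False)
```

The order of the checks matters: an end-tag belonging to an enclosing element takes precedence over the content-model test, so a forgotten end-tag does not swallow the rest of the parent's content.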
1999-03-24:
- fixed some bugs:
+ comparison of fixed attribute values is now correct.
+ Naming.isUnicode: 0x10000..0xFFFFF was not unicode.
+ ParseContent.parseMixedContent complained about character '>' even if
it was not part of the sequence ']]>'.
+ whitespace normalization in attribute values also affected characters
that were escaped by a char reference.
+ end-of-line normalization was done for the replacement text of internal
entities, but may only be done for the literal entity values.
+ default values in attlist declarations were checked not only for lexical
validity but also for declaredness of notation/entity names. This is now
done when the default value is used.
1999-03-22:
- Added support for the Socat syntax of XCatalog:
+ new subdir Catalog with a main functor Catalog
+ the parser functor now expects an additional structure argument Resolve
providing a function resolveExtId. That does not exist in BaseData any
more.
1999-03-16:
- Extracted Unicode-specific code from the Parser:
+ made a new directory Unicode.
+ moved Parser/Front/front.sml to Parser/entities.sml; renamed Front to
Entities.
+ moved Parser/Front to Unicode/Decode; renamed everything to Decode<...>,
frontEncoding to Encode.
+ moved Parser/Back to Unicode/Encode; renamed everything to Encode<...>,
backEncoding to Encode; moved back.sml to Apps/Copy/copyEncode.sml
+ moved Parser/Chars to Unicode/Chars.
- Changed the implementation of Decode and Encode:
+ UTF-7 is no longer supported.
+ new structure Encoding providing types and functions for handling
encoding names.
+ both Encode and Decode now raise exceptions instead of printing errors.
+ made the appropriate changes to Entities and CopyEncode.
1999-03-08:
- Moved fillArray for Front to FrontEncoding, renamed it to encGetArray.
- Hid implem. of type FrontEncoding.Encoding through the signature.
1999-03-05:
- Changed Front.fillArray to use a function parameter instead of a reference.
1999-03-02:
- Changed all parser functors:
+ expect only a structure Params containing former Hooks, Front, Dtd,
AuxParse and AuxRecover.
+ No other parser structures are arguments to functors.
- Removed the functions from DtdTables from the Dtd signature.
- Renamed Dtd to AuxDtd and DtdTables to Dtd.
1999-03-01:
- Added function hookWhite to signature Hooks.
- Added calls to hookWhite in ParseDtd.parseSubset and ParseDocument.parseDoc.
- Changed CopyHooks and CopyOutput to account for this:
+ removed third bool arg from CopyOutput.outComment/ProcInstr; they never
print a newline now;
+ removed printing of newlines from declaration hooks in CopyHooks;
+ removed function CopyHooks.inContent;
+ added hookWhite to CopyHooks;
- Added hookWhite to NullHooks and EsisHooks;
- Added LOC_SUBSET to datatype Location in ErrorData and ErrorString.
- Fixed ErrorMessage.errorMessage to print "Could not open file..." instead of
"Could open file..." for ERR_NO_SUCH_FILE.

55
fxp/doc/COPYRIGHT Normal file

@ -0,0 +1,55 @@
fxp: A Functional XML Parser
Version 2.0, 25.06.2004
------------------------------------------------
COPYRIGHT NOTICE, LICENSE AND DISCLAIMER.
Copyright 1999-2004 by Andreas Neumann and Alexandru Berlea, TU Munich
Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice and warranty
disclaimer appear in supporting documentation.
The TU Munich disclaims all warranties with regard to this
software, including all implied warranties of merchantability and
fitness. In no event shall the TU Munich be liable for any
special, indirect or consequential damages or any damages whatsoever
resulting from loss of use, data or profits, whether in an action of
contract, negligence or other tortious action, arising out of or in
connection with the use or performance of this software.
------------------------------------------------
This software was produced with the help of software, tools, and code
from SML of New Jersey. Therefore we repeat here the copyright notice
SML of New Jersey:
STANDARD ML OF NEW JERSEY COPYRIGHT NOTICE, LICENSE AND DISCLAIMER.
Copyright (c) 1989-1998 by Lucent Technologies
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice and warranty
disclaimer appear in supporting documentation, and that the name of
Lucent Technologies, Bell Labs or any Lucent entity not be used in
advertising or publicity pertaining to distribution of the software
without specific, written prior permission.
Lucent disclaims all warranties with regard to this software,
including all implied warranties of merchantability and fitness. In no
event shall Lucent be liable for any special, indirect or
consequential damages or any damages whatsoever resulting from loss of
use, data or profits, whether in an action of contract, negligence or
other tortious action, arising out of or in connection with the use
or performance of this software.
------------------------------------------------


@ -0,0 +1,2 @@
<!ELEMENT b EMPTY>
<!ATTLIST b num NMTOKEN #REQUIRED>


@ -0,0 +1,22 @@
<?xml version="1.0"?>
<!DOCTYPE exa SYSTEM "exa-1.ext" [
<!ELEMENT exa (a|b)*>
<!ENTITY ext SYSTEM "ext.elem">
<!ENTITY % ext SYSTEM "ext.decl">
%ext;
<!ELEMENT a (b*)>
<!ATTLIST a num NMTOKEN #FIXED "0">
]>
<exa>
<a num="a" id="id1">
<a id="id1"/>
&ext;
<b nmu="1"> </b>
</a>
</exa>


@ -0,0 +1,15 @@
<!DOCTYPE exa [
<!ELEMENT exa (a|b)*>
<!ELEMENT a ((b|c)*,b,(b|c),(b|c),(b|c),(b|c),(b|c),(b|c),(b|c))>
<!ELEMENT b ANY>
<!ELEMENT c ANY>
]>
<exa>
<!-- this comment has a -- in it. -->
<a/>
<b>
this element contains a "]]>" sequence
</b>
</exa>


@ -0,0 +1,15 @@
<!DOCTYPE exa [
<!ELEMENT exa (a|b)*>
<!ELEMENT a EMPTY>
<!ELEMENT b ANY>
<!ATTLIST a x (yes|no) "yes"
y (yes|perhaps|no) "no">
<!ATTLIST a x NMTOKEN "yes">
]>
<exa>
<a></a>
<b/>
</exa>


@ -0,0 +1,21 @@
<?xml version="1.1"?>
<!DOCTYPE exa [
<!ELEMENT exa (a|b)*>
<!ELEMENT a EMPTY>
<!ELEMENT b ANY>
<!ATTLIST a xml:lang CDATA "i-">
<!ATTLIST c x NMTOKEN "1">
<!ENTITY amp "&#38;">
<!ENTITY amp "&#38;#38;">
<!NOTATION text SYSTEM "/bin/cat">
<!NOTATION text SYSTEM "/usr/local/bin/less">
]>
<exa>
<a xml:lang="yy"/>
</exa>


@ -0,0 +1,28 @@
(exa
-\n
pBSD//A Program for viewing text//EN
s/bin/cat
Ntext
Anot NOTATION text
Abold TOKEN no
Aid TOKEN first-a
s/usr/bin/nroff
Nman
s/usr/man/man1/fxp.1
f<OSFILE>/usr/man/man1/fxp.1
Efxp.manpage NDATA man
Aent ENTITY fxp.manpage
Aref CDATA www.fxp.org
(a
-\nThere is a comment in this line.\nThis line has a character reference and \nthis one has a "quoted general entity reference". \n\nthis is a cdata section!\n
?see this pi?
-\n
)a
-\n
Abold TOKEN no
Anot NOTATION man
(a
-\nEr wohnt in K\U+f6;ln, in der M\U+fc;llerstra\U+df;e 13.\n
)a
)exa


@ -0,0 +1,33 @@
(exa
-\n
Aref CDATA www.fxp.org
Anum IMPLIED
s/usr/bin/nroff
Nman
s/usr/man/man1/fxp.1
f<OSFILE>/usr/man/man1/fxp.1
Efxp.manpage NDATA man
Aent ENTITY fxp.manpage
pBSD//A Program for viewing text//EN
s/bin/cat
Ntext
Anot NOTATION text
Aid TOKEN first-a
Abold TOKEN no
(a
-\nThere is a comment in this line.\nThis line has a character reference and \nthis one has a "quoted general entity reference". \n\nthis is a cdata section!\n
?see this pi?
-\n
)a
-\n
Aref IMPLIED
Anum IMPLIED
Aent IMPLIED
Anot NOTATION man
Aid IMPLIED
Abold TOKEN no
(a
-\nEr wohnt in Köln, in der Müllerstraße 13.\n
)a
)exa


@ -0,0 +1,35 @@
<?xml version="1.0" encoding="LATIN1"?>
<!DOCTYPE exa [
<!NOTATION text PUBLIC "BSD//A Program for viewing text//EN" "/bin/cat">
<!NOTATION man SYSTEM "/usr/bin/nroff">
<!ENTITY fxp.manpage SYSTEM "/usr/man/man1/fxp.1" NDATA man>
<!ELEMENT exa (a|b)*>
<!ELEMENT a ANY>
<!ELEMENT b ANY>
<!ATTLIST a bold (yes|no) "no"
id ID #IMPLIED
not NOTATION(text|man) "text"
ent ENTITIES #IMPLIED
num NMTOKEN #IMPLIED
ref CDATA #IMPLIED>
<!ENTITY quot "&#x22;">
]>
<exa>
<a id="first-a" ent="fxp.manpage" ref="www.fxp.org">
There is a comment <!-- comment --> in this line.
This line has a character referenc&#x65; and
this one has a &quot;quoted general entity reference&quot;.
<!-- now comes a cdata section -->
<![CDATA[this is a cdata section!]]>
<?see this pi??>
</a>
<a not="man">
Er wohnt in Köln, in der Müllerstraße 13.
</a></exa>


@ -0,0 +1,2 @@
<!ENTITY % vnum "1.0">
<!ENTITY % version "xml version %vnum;">


@ -0,0 +1,8 @@
<?xml version="1.0"?>
<!DOCTYPE exa SYSTEM "exa-6.ext" [
<!ENTITY % int "<!ELEMENT exa ANY>">
<!ENTITY % ext SYSTEM "ext-6.decl">
%int;
%ext;
]>
<exa/>


@ -0,0 +1,12 @@
<!DOCTYPE exa [
<!ENTITY q "quote sign">
<!ENTITY int "internal entity">
<!ENTITY ext SYSTEM "ext.ent">
<!ATTLIST a x NMTOKENS #IMPLIED
y CDATA #IMPLIED>
]>
<a x=" a b " y="two &q;s: &#x27; and &#x22;">
here is a character reference: &#64;
here is an &int;
here is an &ext;
</a>


@ -0,0 +1 @@
<!NOTATION text SYSTEM "/bin/cat">


@ -0,0 +1 @@
<!ATTLIST a id ID #IMPLIED>


@ -0,0 +1 @@
<a num="1"></b>

1
fxp/doc/Examples/ext.ent Normal file

@ -0,0 +1 @@
external entity


@ -0,0 +1,24 @@
<!ELEMENT XMLCatalog (Base|Delegate|Extend|Map|Remap)*>
<!ELEMENT Base EMPTY>
<!ATTLIST Base
HRef CDATA #REQUIRED>
<!ELEMENT Delegate EMPTY>
<!ATTLIST Delegate
PublicId CDATA #REQUIRED
HRef CDATA #REQUIRED>
<!ELEMENT Extend EMPTY>
<!ATTLIST Extend
HRef CDATA #REQUIRED>
<!ELEMENT Map EMPTY>
<!ATTLIST Map
PublicId CDATA #REQUIRED
HRef CDATA #REQUIRED>
<!ELEMENT Remap EMPTY>
<!ATTLIST Remap
SystemId CDATA #REQUIRED
HRef CDATA #REQUIRED>

Binary file not shown. Size: 494 B

BIN
fxp/doc/Images/email.png Normal file

Binary file not shown. Size: 411 B

Binary file not shown. Size: 7.0 KiB

Binary file not shown. Size: 5.8 KiB

Binary file not shown. Size: 2.2 KiB

Binary file not shown. Size: 4.8 KiB

Binary file not shown. Size: 4.4 KiB

Binary file not shown. Size: 5.6 KiB

Binary file not shown. Size: 2.2 KiB

238
fxp/doc/Images/index.html Normal file

@ -0,0 +1,238 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - a Functional XML Parser</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER> The Functional
XML Parser</h1>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
Contents</h2>
<table>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#FXP">About <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#VERSION"><i>fxp</i> Versions and Changes</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#GET">Where to get <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#INSTALL">How to install <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td>Example Applications: <i><a href="fxp.html">fxp</a></i>, <i><a href="fxcanon.html">fxcanon</a></i>,
<i><a href="fxcopy.html">fxcopy</a></i>, <i><a href="fxesis.html">fxesis</a></i>,
and <i><a href="fxviz.html">fxviz</a></i>.&nbsp;</td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="FXP"></a>What is <i>fxp</i>?</h3>
<i>fxp</i> is a validating <a href="http://www.w3.org/TR/REC-xml">XML</a>
parser written completely in the functional programming language SML. It
has a <a href="#API">programming interface</a> allowing for production
of XML applications based on <i>fxp</i>. It comes with four example applications:
<table>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxp.html">fxp</a></i>, the pure parser. It parses a document
and finds well-formedness errors, validity errors and other problems;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxcanon.html">fxcanon</a></i> produces an equivalent canonical
XML document. <a href="http://www.jclark.com/xml/canonxml.html">Canonical
XML</a> was invented by <a href="http://www.jclark.com">James Clark</a>
for testing XML parsers. It contains only the information a processor is
required to pass to the application;&nbsp;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxcopy.html">fxcopy</a></i> reproduces the document parsed
by <i>fxp</i>. The copy can be generated in a different encoding than the
input, and can be normalized in different ways concerning, e.g., expansion
of entity references;&nbsp;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxesis.html">fxesis</a></i> adds a backend to <i>fxp</i>,
producing an output similar to <i><a href="http://www.jclark.com/sp/nsgmls.htm">nsgmls</a></i>'s
ESIS (Element Structure Information Set) output;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxviz.html">fxviz</a></i> is an XML tree visualizer. It
produces a graph description suitable as input to Georg Sander's <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>.&nbsp;</td>
</tr>
</table>
More <a href="features.html">features</a> of <i>fxp</i> are <a href="features.html">here</a>.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3><a NAME="VERSION"></a><i>fxp</i> Versions and Changes</h3>
<p>The current version of <i>fxp</i> is 1.4.5. Changes from the
previous versions are described <a href="CHANGES">here</a>.</p>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="GET"></a>Downloading <i>fxp</i></h3>
The source code for <i>fxp</i> can be downloaded by <a href="fxp.tar.gz">http</a>. The Copyright note is <a href="COPYRIGHT">here</a>.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="INSTALL"></a>Installation instructions</h3>
In order to install <i>fxp</i>, you need an SML compiler. It has been tested
with version 110.0.6 of <a href="http://cm.bell-labs.com/cm/cs/what/smlnj/">SML
of New Jersey</a>, but it might also run with other versions. The compiler
must have the compilation manager (CM) built in, which is the default when
installing SML-NJ. We successfully compiled <i>fxp</i> on Linux with libc5,
Digital Unix 4.0 and Solaris 2.4/6. For other unices we expect no problems;
compiling with the Windows version of SML-NJ has not been tried.
<p>These are the steps for installing <i>fxp</i> under Unix:
<ol>
<li>
<a href="#GET">Download</a> the latest version of <i>fxp</i>;</li>
<li>
Unpack the sources, and change to the <i>fxp</i> directory, e.g.:</li>
<pre>&nbsp;&nbsp;&nbsp; gunzip -c fxp-1.4.tar.gz | tar xf -
&nbsp;&nbsp;&nbsp; cd fxp-1.4</pre>
<li>
Read the <tt>COPYRIGHT</tt>;</li>
<li>
Edit the <tt>Makefile</tt> according to your needs. Probably you will only
have to change the following:</li>
<ul>
<li>
<tt>INSTALL_PROGS</tt> is the list of programs to be installed. <i>fxlib</i> is
required if you want to develop applications with fxp. The <a href='http://www.informatik.uni-trier.de/~aberlea/Fxgrep/'>fxgrep</a> XML query language and the <a href='http://www.informatik.uni-trier.de/~aberlea/Fxt/'>fxt</a> XML transformation language also need <i>fxlib</i> being installed.</li>
<li>
<tt>FXP_BINDIR</tt> is where the executables are installed;</li>
<li>
<tt>FXP_LIBDIR</tt> is where other files needed by <i>fxp</i> - the heap
images and the library -- are installed;</li>
<li>
<tt>SML_BINDIR</tt> is the directory where the SML executables are found.
It must contain the <tt>.arch-n-opsys</tt> script from the SML-NJ distribution,
so make sure that this is where SML-NJ is <i>physically</i> installed;</li>
<li>
<tt>SML_EXEC</tt> is the name of the SML executable. This is the program
that is called for generating the heap image and at execution of <i>fxp</i>.
If <tt>sml</tt> will be in your <tt>PATH</tt> at execution time, you don't
need the full path here.</li>
<li>
<tt>SML_MAKEDEF</tt> is for defining the <tt>make</tt> command in SML.
After version 110.0.3, SML-NJ changed the type of <tt>CM.make'</tt>. Therefore
it is wrapped into the <tt>make</tt> defined by this variable. For working
versions of SML-NJ, use the second variant of this definition.</li>
</ul>
<li>
Edit the file <tt>src/config.sml</tt> according to your needs. Currently
only a single value can be configured here:</li>
<ul>
<li>
<tt>retrieveCommand</tt> is the command to be used by <i>fxp</i> for retrieving
a remote URI from the internet and storing it in a temporary file on the
local file system. It is a string value and should contain the strings
<tt>%1</tt> and <tt>%2</tt>, where</li>
<ul>
<li>
<tt>%1</tt> is replaced by the URI;</li>
<li>
<tt>%2</tt> is replaced by the local filename.</li>
</ul>
It is recommended that the command exits with failure in case the URI cannot
be retrieved. If the command generates a HTML error message instead (like,
e.g., <tt>"lynx -source %1 > %2"</tt>), this HTML file is considered to
be XML and will probably cause a mess of parsing errors. If you don't need
URI retrieval, use <tt>"exit 1"</tt> which always fails on Unix. Sensible
values are, e.g.:
<ul>
<li>
<tt>"<a href="ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/">wget</a> -qO
%2 %1"</tt></li>
<li>
<tt>"<a href="ftp://sunsite.unc.edu/pub/Linux/apps/www/mirroring/got_it-0.34.tar.gz">got_it</a>
-o %2 %1"</tt></li>
<li>
<tt>"<a href="ftp://sunsite.unc.edu/pub/Linux/apps/www/mirroring/urlget-3.12.tar.gz">urlget</a>
-s -o %2 %1"</tt></li>
</ul>
</ul>
<li>
Compile <i>fxp</i> by typing <tt>make</tt>;</li>
<li>
Install <i>fxp</i> by typing <tt>make install</tt>.</li>
<li>
If you want to use <i>fxviz</i>, you should also install <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>.</li>
</ol>
<img SRC="shadow.jpg" ALT="----------------" >
<h3><a NAME="API"></a><i>fxp</i>'s Programming Interface</h3>
Here is a <a href="doc.ps">document</a> describing the programming interface.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3><i>fxp</i>'s feedback address</h3>
Any feedback related to fxp is welcome to: <a
href="mailto:fxp@psi.uni-trier.de">fxp@psi.uni-trier.de</a> <p><img
SRC="shadow.jpg" ALT="----------------" >
<h3>Credits:</h3> The author of fxp is <a
href="mailto:neumann@psi.uni-trier.de">Andreas Neumann</a>. fxp is
maintained and updated by <a
href="mailto:neumann@psi.uni-trier.de">Alexandru Berlea</a>.
</body>
</html>

BIN
fxp/doc/Images/shadow.jpg Normal file

Binary file not shown. Size: 1.7 KiB

133
fxp/doc/README Normal file

@ -0,0 +1,133 @@
fxp - The Functional XML Parser
Version 2.0, 25.06.2004
by
Andreas Neumann, University of Trier
neumann (AT) psi.uni-trier.de
Alexandru Berlea, TU Munich
berlea (AT) in.tum.de
What is fxp?
------------
fxp is a validating XML parser, written completely in the functional
programming language SML. It has a programming interface
allowing for production of XML applications based on fxp. It comes with
some example applications:
fxp The pure parser. It parses a document and finds well-formedness
errors, validity errors and other problems;
fxcanon Produces an equivalent canonical XML document. Canonical XML was
invented by James Clark for testing XML parsers. It contains only
the information a processor is required to pass to the application;
fxcopy Reproduces the document parsed by fxp. The copy can be generated
in a different encoding than the input, and can be normalized in
different ways concerning, e.g., expansion of entity references;
fxesis Produces an output similar to nsgmls's ESIS (Element Structure
Information Set) output;
fxviz An XML tree visualizer. It produces a graph description suitable as
input to Georg Sander's vcg.
Homepage
--------
http://www.informatik.uni-trier.de/~berlea/Fxp
Installation
------------
In order to install fxp, you need an SML compiler. It has been tested with
version 110.0.7 of SML of New Jersey, but it might also run with other
versions. The compiler must have the compilation manager (CM) built in, which
is the default when installing SML-NJ. We successfully compiled fxp on
Linux. For other unices we expect no problems. An installation using the
Windows version of SML-NJ is documented on fxp's homepage.
These are the steps for installing fxp under Unix:
1. Download the latest version of fxp;
2. Unpack the sources, and change to the fxp directory, e.g.:
gunzip -c fxp-2.0.tar.gz | tar xf -
cd fxp-2.0
3. Read the COPYRIGHT;
4. Edit the Makefile according to your needs. Probably you will only have
to change the following:
INSTALL_PROGS is the list of programs to be installed. fxlib is only
required if you want to develop applications with fxp.
FXP_BINDIR is where the executables are installed;
FXP_LIBDIR is where other files needed by fxp - the heap images
and the library - are installed;
SML_BINDIR is the directory where the SML executables are found.
It must contain the .arch-n-opsys script from the
SML-NJ distribution, so make sure that this is where
SML-NJ is physically installed;
SML_EXEC is the name of the SML executable. This is the program
that is called for generating the heap image and at
execution of fxp. If sml is in your PATH at
installation time, you don't need the full path here.
SML_MAKEDEF is for defining the make command in SML. After version
110.0.3, SML-NJ changed the type of CM.make'.
For earlier or working versions of SML-NJ, use the
second or third variant of this definition.
5. Edit the file src/config.sml according to your needs. Currently only a
single value can be configured here:
val retrieveCommand : string
is the command to be used by fxp for retrieving a remote URI from
the internet and storing it in a temporary file on the local file
system. It is a string value and should contain the strings %1
and %2, where:
- %1 is replaced by the URI;
- %2 is replaced by the local filename.
It is recommended that the command exits with failure in case the
URI cannot be retrieved. If the command generates an HTML error
message instead (like, e.g., "lynx -source %1 > %2"), this HTML
file is considered to be XML and will probably cause a mess of
parsing errors. If you don't need URI retrieval, use "exit 1"
which always fails on Unix. Sensible values are, e.g.:
- "wget -qO %2 %1"
(ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/)
- "got_it -o %2 %1"
(ftp://sunsite.unc.edu/pub/Linux/apps/www/
mirroring/got_it-0.34.tar.gz)
- "urlget -s -o %2 %1"
(ftp://sunsite.unc.edu/pub/Linux/apps/www/
mirroring/urlget-3.12.tar.gz)
6. Compile fxp by typing make;
7. Install fxp by typing make install.
8. If you want to use fxviz, you should also install vcg
(ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/).
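
The %1/%2 substitution described in step 5 can be sketched as below. This is an illustration in Python, not the code from src/config.sml or Uri.retrieveUri; the function names are hypothetical, and no shell quoting is applied, which is an assumption to revisit for URIs containing shell metacharacters.

```python
import re
import subprocess


def expand_retrieve_command(template, uri, filename):
    """Substitute %1 (the URI) and %2 (the local file) in a single pass."""
    values = {"%1": uri, "%2": filename}
    return re.sub(r"%[12]", lambda m: values[m.group(0)], template)


def retrieve(template, uri, filename):
    """Run the expanded command through the shell; True on exit status 0."""
    command = expand_retrieve_command(template, uri, filename)
    return subprocess.run(command, shell=True).returncode == 0
```

A single-pass substitution is used so that an escape such as %20 inside the URI is not mistaken for the %2 placeholder, which two successive string replacements would do. Checking the exit status is what makes a failing command such as "exit 1" a reliable way to switch retrieval off.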
If you experience problems installing fxp, send me mail at berlea (AT)
in.tum.de. Check for new versions at
http://www.informatik.tu-muenchen.de/~berlea/Fxp.
Running the Parser
------------------
Sample applications like fxp (a validating XML parser), fxcanon, fxcopy,
fxesis, and fxviz are described on fxp's homepage.
Programming Interface
---------------------
fxp's API is described in api.ps.
Alexandru Berlea

1092
fxp/doc/api.ps Normal file

File diff suppressed because it is too large Load Diff

BIN
fxp/doc/ball-shadow.jpg Normal file

Binary file not shown. Size: 494 B

821
fxp/doc/doc.ps Normal file

@ -0,0 +1,821 @@
%!PS-Adobe-2.0
%%Creator: dvips(k) 5.86 Copyright 1999 Radical Eye Software
%%Title: doc.dvi
%%Pages: 7
%%PageOrder: Ascend
%%BoundingBox: 0 0 596 842
%%DocumentFonts: Palatino-Roman Palatino-Italic Helvetica Helvetica-Bold
%%+ Palatino-Bold Symbol Courier Palatino-BoldItalic EURM10
%%DocumentPaperSizes: a4
%%EndComments
%DVIPSWebPage: (www.radicaleye.com)
%DVIPSCommandLine: dvips doc.dvi
%DVIPSParameters: dpi=1200, compressed
%DVIPSSource: TeX output 2001.02.26:1647
%%BeginProcSet: texc.pro
%!
/TeXDict 300 dict def TeXDict begin/N{def}def/B{bind def}N/S{exch}N/X{S
N}B/A{dup}B/TR{translate}N/isls false N/vsize 11 72 mul N/hsize 8.5 72
mul N/landplus90{false}def/@rigin{isls{[0 landplus90{1 -1}{-1 1}ifelse 0
0 0]concat}if 72 Resolution div 72 VResolution div neg scale isls{
landplus90{VResolution 72 div vsize mul 0 exch}{Resolution -72 div hsize
mul 0}ifelse TR}if Resolution VResolution vsize -72 div 1 add mul TR[
matrix currentmatrix{A A round sub abs 0.00001 lt{round}if}forall round
exch round exch]setmatrix}N/@landscape{/isls true N}B/@manualfeed{
statusdict/manualfeed true put}B/@copies{/#copies X}B/FMat[1 0 0 -1 0 0]
N/FBB[0 0 0 0]N/nn 0 N/IEn 0 N/ctr 0 N/df-tail{/nn 8 dict N nn begin
/FontType 3 N/FontMatrix fntrx N/FontBBox FBB N string/base X array
/BitMaps X/BuildChar{CharBuilder}N/Encoding IEn N end A{/foo setfont}2
array copy cvx N load 0 nn put/ctr 0 N[}B/sf 0 N/df{/sf 1 N/fntrx FMat N
df-tail}B/dfs{div/sf X/fntrx[sf 0 0 sf neg 0 0]N df-tail}B/E{pop nn A
definefont setfont}B/Cw{Cd A length 5 sub get}B/Ch{Cd A length 4 sub get
}B/Cx{128 Cd A length 3 sub get sub}B/Cy{Cd A length 2 sub get 127 sub}
B/Cdx{Cd A length 1 sub get}B/Ci{Cd A type/stringtype ne{ctr get/ctr ctr
1 add N}if}B/id 0 N/rw 0 N/rc 0 N/gp 0 N/cp 0 N/G 0 N/CharBuilder{save 3
1 roll S A/base get 2 index get S/BitMaps get S get/Cd X pop/ctr 0 N Cdx
0 Cx Cy Ch sub Cx Cw add Cy setcachedevice Cw Ch true[1 0 0 -1 -.1 Cx
sub Cy .1 sub]/id Ci N/rw Cw 7 add 8 idiv string N/rc 0 N/gp 0 N/cp 0 N{
rc 0 ne{rc 1 sub/rc X rw}{G}ifelse}imagemask restore}B/G{{id gp get/gp
gp 1 add N A 18 mod S 18 idiv pl S get exec}loop}B/adv{cp add/cp X}B
/chg{rw cp id gp 4 index getinterval putinterval A gp add/gp X adv}B/nd{
/cp 0 N rw exit}B/lsh{rw cp 2 copy get A 0 eq{pop 1}{A 255 eq{pop 254}{
A A add 255 and S 1 and or}ifelse}ifelse put 1 adv}B/rsh{rw cp 2 copy
get A 0 eq{pop 128}{A 255 eq{pop 127}{A 2 idiv S 128 and or}ifelse}
ifelse put 1 adv}B/clr{rw cp 2 index string putinterval adv}B/set{rw cp
fillstr 0 4 index getinterval putinterval adv}B/fillstr 18 string 0 1 17
{2 copy 255 put pop}for N/pl[{adv 1 chg}{adv 1 chg nd}{1 add chg}{1 add
chg nd}{adv lsh}{adv lsh nd}{adv rsh}{adv rsh nd}{1 add adv}{/rc X nd}{
1 add set}{1 add clr}{adv 2 chg}{adv 2 chg nd}{pop nd}]A{bind pop}
forall N/D{/cc X A type/stringtype ne{]}if nn/base get cc ctr put nn
/BitMaps get S ctr S sf 1 ne{A A length 1 sub A 2 index S get sf div put
}if put/ctr ctr 1 add N}B/I{cc 1 add D}B/bop{userdict/bop-hook known{
bop-hook}if/SI save N @rigin 0 0 moveto/V matrix currentmatrix A 1 get A
mul exch 0 get A mul add .99 lt{/QV}{/RV}ifelse load def pop pop}N/eop{
SI restore userdict/eop-hook known{eop-hook}if showpage}N/@start{
userdict/start-hook known{start-hook}if pop/VResolution X/Resolution X
1000 div/DVImag X/IEn 256 array N 2 string 0 1 255{IEn S A 360 add 36 4
index cvrs cvn put}for pop 65781.76 div/vsize X 65781.76 div/hsize X}N
/p{show}N/RMat[1 0 0 -1 0 0]N/BDot 260 string N/Rx 0 N/Ry 0 N/V{}B/RV/v{
/Ry X/Rx X V}B statusdict begin/product where{pop false[(Display)(NeXT)
(LaserWriter 16/600)]{A length product length le{A length product exch 0
exch getinterval eq{pop true exit}if}{pop}ifelse}forall}{false}ifelse
end{{gsave TR -.1 .1 TR 1 1 scale Rx Ry false RMat{BDot}imagemask
grestore}}{{gsave TR -.1 .1 TR Rx Ry scale 1 1 false RMat{BDot}
imagemask grestore}}ifelse B/QV{gsave newpath transform round exch round
exch itransform moveto Rx 0 rlineto 0 Ry neg rlineto Rx neg 0 rlineto
fill grestore}B/a{moveto}B/delta 0 N/tail{A/delta X 0 rmoveto}B/M{S p
delta add tail}B/b{S p tail}B/c{-4 M}B/d{-3 M}B/e{-2 M}B/f{-1 M}B/g{0 M}
B/h{1 M}B/i{2 M}B/j{3 M}B/k{4 M}B/w{0 rmoveto}B/l{p -4 w}B/m{p -3 w}B/n{
p -2 w}B/o{p -1 w}B/q{p 1 w}B/r{p 2 w}B/s{p 3 w}B/t{p 4 w}B/x{0 S
rmoveto}B/y{3 2 roll p a}B/bos{/SS save N}B/eos{SS restore}B end
%%EndProcSet
%%BeginProcSet: 8r.enc
% @@psencodingfile@{
% author = "S. Rahtz, P. MacKay, Alan Jeffrey, B. Horn, K. Berry",
% version = "0.6",
% date = "1 July 1998",
% filename = "8r.enc",
% email = "tex-fonts@@tug.org",
% docstring = "Encoding for TrueType or Type 1 fonts
% to be used with TeX."
% @}
%
% Idea is to have all the characters normally included in Type 1 fonts
% available for typesetting. This is effectively the characters in Adobe
% Standard Encoding + ISO Latin 1 + extra characters from Lucida.
%
% Character code assignments were made as follows:
%
% (1) the Windows ANSI characters are almost all in their Windows ANSI
% positions, because some Windows users cannot easily reencode the
% fonts, and it makes no difference on other systems. The only Windows
% ANSI characters not available are those that make no sense for
% typesetting -- rubout (127 decimal), nobreakspace (160), softhyphen
% (173). quotesingle and grave are moved just because it's such an
% irritation not having them in TeX positions.
%
% (2) Remaining characters are assigned arbitrarily to the lower part
% of the range, avoiding 0, 10 and 13 in case we meet dumb software.
%
% (3) Y&Y Lucida Bright includes some extra text characters; in the
% hopes that other PostScript fonts, perhaps created for public
% consumption, will include them, they are included starting at 0x12.
%
% (4) Remaining positions left undefined are for use in (hopefully)
% upward-compatible revisions, if someday more characters are generally
% available.
%
% (5) hyphen appears twice for compatibility with both
% ASCII and Windows.
%
/TeXBase1Encoding [
% 0x00 (encoded characters from Adobe Standard not in Windows 3.1)
/.notdef /dotaccent /fi /fl
/fraction /hungarumlaut /Lslash /lslash
/ogonek /ring /.notdef
/breve /minus /.notdef
% These are the only two remaining unencoded characters, so may as
% well include them.
/Zcaron /zcaron
% 0x10
/caron /dotlessi
% (unusual TeX characters available in, e.g., Lucida Bright)
/dotlessj /ff /ffi /ffl
/.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef
% very contentious; it's so painful not having quoteleft and quoteright
% at 96 and 145 that we move the things normally found there to here.
/grave /quotesingle
% 0x20 (ASCII begins)
/space /exclam /quotedbl /numbersign
/dollar /percent /ampersand /quoteright
/parenleft /parenright /asterisk /plus /comma /hyphen /period /slash
% 0x30
/zero /one /two /three /four /five /six /seven
/eight /nine /colon /semicolon /less /equal /greater /question
% 0x40
/at /A /B /C /D /E /F /G /H /I /J /K /L /M /N /O
% 0x50
/P /Q /R /S /T /U /V /W
/X /Y /Z /bracketleft /backslash /bracketright /asciicircum /underscore
% 0x60
/quoteleft /a /b /c /d /e /f /g /h /i /j /k /l /m /n /o
% 0x70
/p /q /r /s /t /u /v /w
/x /y /z /braceleft /bar /braceright /asciitilde
/.notdef % rubout; ASCII ends
% 0x80
/.notdef /.notdef /quotesinglbase /florin
/quotedblbase /ellipsis /dagger /daggerdbl
/circumflex /perthousand /Scaron /guilsinglleft
/OE /.notdef /.notdef /.notdef
% 0x90
/.notdef /.notdef /.notdef /quotedblleft
/quotedblright /bullet /endash /emdash
/tilde /trademark /scaron /guilsinglright
/oe /.notdef /.notdef /Ydieresis
% 0xA0
/.notdef % nobreakspace
/exclamdown /cent /sterling
/currency /yen /brokenbar /section
/dieresis /copyright /ordfeminine /guillemotleft
/logicalnot
/hyphen % Y&Y (also at 45); Windows' softhyphen
/registered
/macron
% 0xB0
/degree /plusminus /twosuperior /threesuperior
/acute /mu /paragraph /periodcentered
/cedilla /onesuperior /ordmasculine /guillemotright
/onequarter /onehalf /threequarters /questiondown
% 0xC0
/Agrave /Aacute /Acircumflex /Atilde /Adieresis /Aring /AE /Ccedilla
/Egrave /Eacute /Ecircumflex /Edieresis
/Igrave /Iacute /Icircumflex /Idieresis
% 0xD0
/Eth /Ntilde /Ograve /Oacute
/Ocircumflex /Otilde /Odieresis /multiply
/Oslash /Ugrave /Uacute /Ucircumflex
/Udieresis /Yacute /Thorn /germandbls
% 0xE0
/agrave /aacute /acircumflex /atilde
/adieresis /aring /ae /ccedilla
/egrave /eacute /ecircumflex /edieresis
/igrave /iacute /icircumflex /idieresis
% 0xF0
/eth /ntilde /ograve /oacute
/ocircumflex /otilde /odieresis /divide
/oslash /ugrave /uacute /ucircumflex
/udieresis /yacute /thorn /ydieresis
] def
%%EndProcSet
%%BeginProcSet: texps.pro
%!
TeXDict begin/rf{findfont dup length 1 add dict begin{1 index/FID ne 2
index/UniqueID ne and{def}{pop pop}ifelse}forall[1 index 0 6 -1 roll
exec 0 exch 5 -1 roll VResolution Resolution div mul neg 0 0]/Metrics
exch def dict begin Encoding{exch dup type/integertype ne{pop pop 1 sub
dup 0 le{pop}{[}ifelse}{FontMatrix 0 get div Metrics 0 get div def}
ifelse}forall Metrics/Metrics currentdict end def[2 index currentdict
end definefont 3 -1 roll makefont/setfont cvx]cvx def}def/ObliqueSlant{
dup sin S cos div neg}B/SlantFont{4 index mul add}def/ExtendFont{3 -1
roll mul exch}def/ReEncodeFont{CharStrings rcheck{/Encoding false def
dup[exch{dup CharStrings exch known not{pop/.notdef/Encoding true def}
if}forall Encoding{]exch pop}{cleartomark}ifelse}if/Encoding exch def}
def end
%%EndProcSet
TeXDict begin 39158280 55380996 1000 1200 1200 (doc.dvi)
@start
%DVIPSBitmapFont: Fa cmmi10 9.37793 2
/Fa 2 63 df<1D70F401F8F407FC1C1F1C7F983801FFF8090713F0093F13C098B5120008
0313FC080F13F0083F13C097B5C7FC070313FC070F13E0073F1380DFFFFEC8FC060313F8
060F13E0063F1380DEFFFEC9FC050713F8051F13E0057F13804C4848CAFC040713F8041F
13E0047F1380922601FFFCCBFC030713F0031F13C0037F90CCFC913801FFFC020713F002
1F13C091B5CDFC010313FC010F13F0013F13C090B5CEFC000313FC000F13F0003F138048
48CFFCEAFFF813E013F8EA7FFE6C6C7E000F13F0000313FCC613FF013F13C0010F13F001
0313FC010013FF021F13C0020713F0020113FC6E6CB4FC031F13C0030713F0030113FC6F
6CB47E041F13E0040713F8040113FE706C6C7E051F13E0050713F8050013FE95383FFF80
060F13E0060313F8060013FE96383FFF80070F13E0070313FC070013FF083F13C0080F13
F0080313FC080013FF093F13C0090713F0090113F89838007FFC1C1F1C07F401F8F40070
5E5B73D379>60 D<1238127EB47E13E013F8EA7FFE6C6C7E000F13F0000313FCC613FF01
3F13C0010F13F0010313FC010013FF021F13C0020713F0020113FC6E6CB4FC031F13C003
0713F0030113FC6F6CB47E041F13E0040713F8040113FE706C6C7E051F13E0050713F805
0013FE95383FFF80060F13E0060313F8060013FE96383FFF80070F13E0070313FC070013
FF083F13C0080F13F0080313FC080013FF093F13C0090713F0090113F89838007FFC1C1F
1C7F983801FFF8090713F0093F13C098B51200080313FC080F13F0083F13C097B5C7FC07
0313FC070F13E0073F1380DFFFFEC8FC060313F8060F13E0063F1380DEFFFEC9FC050713
F8051F13E0057F13804C4848CAFC040713F8041F13E0047F1380922601FFFCCBFC030713
F0031F13C0037F90CCFC913801FFFC020713F0021F13C091B5CDFC010313FC010F13F001
3F13C090B5CEFC000313FC000F13F0003F13804848CFFCEAFFF813E01380007ED0FC1238
5E5B73D379>62 D E
%EndDVIPSBitmapFont
%DVIPSBitmapFont: Fc cmsy10 6.25195 1
/Fc 1 95 df<153CA2157EA215FFA34A7FA24A7F15E702077F15C3A2020F7F1581021F7F
15004A7F023E137CA2027E137E027C133E02FC133F4A7FA20101814A130F0103814A1307
0107814A1303A2010F814A1301011F8191C8FC4981013E157CA2017E157E017C153E01FC
153F49810001178049150FA2000317C0491507000717E0491503000F17F0491501A2001F
17F890CAFC4817FC003E177C007E177E007C173EA200FC173F48171FA20070170E38417A
BE45>94 D E
%EndDVIPSBitmapFont
/Fd 135[120 7[133 9[80 102[{TeXBase1Encoding ReEncodeFont}3
239.103 /Palatino-BoldItalic rf /Fe 140[85 85 1[85 7[85
1[85 85 1[85 33[85 17[85 47[{
.85 ExtendFont TeXBase1Encoding ReEncodeFont}9 166.044
/Courier rf /Fg 133[120 1[120 199 1[146 80 106 93 1[146
133 146 213 80 2[80 146 133 93 120 146 106 1[120 12[159
1[173 1[146 199 5[93 2[133 3[159 186 12[120 120 120 120
3[80 45[{TeXBase1Encoding ReEncodeFont}33 239.103 /Palatino-Bold
rf /Fh 138[101 2[65 5[55 2[55 1[92 1[83 2[101 83 12[111
13[92 11[42 6[83 83 83 49[{TeXBase1Encoding ReEncodeFont}14
166.044 /Palatino-Bold rf
%DVIPSBitmapFont: Fi cmsy10 9.37793 4
/Fi 4 107 df<153E157F4B7EA86FC9FCA400081708003E173ED87F8016FF486C4B1380
6D5D01F8150F6D5D6C6C4B1300273FFF803EEBFFFE6C01C0495B000701F0010713F0C601
F8491380D93FFC4948C7FCD90FFFEB7FF801039038BEFFE0010090B51280023F49C8FC02
0F13F8020313E002001380020313E0020F13F8023F13FE91B67E010301BE13E0010F9038
3E7FF8D93FFCEB1FFED9FFF86D6C7E000701F06D13F0001F01C0010113FC4801806D7F27
7FFE007FEB3FFF48486F1380498101E0150349816C486F1300003EC7153E00081708C892
C7FCA44B7EA86FC9FC153E394376C74E>3 D<1E3F8BA48B1E1FA38B1E0F8BA21E078B1E
038B1E018B787EA2797E797EA2797E797E8C797E797E797E797FF87FE07A7E7A7EF80FFE
7A7E7A13C0003FC212F04821FEC41280A36CF9FE006C21F0D3000313C05690C7FC565AF8
1FF8565A565AF8FF805590C8FC555A555A555A68555A555AA2555A55C9FCA2545A671E03
671E07671E0FA2671E1F67A31E3F67A49CCAFC895177CE9C>33 D<1C7E1CFF88891C3F89
1C1F891C0F89767E1C0389767E1C0089777E777E777EA2777E777E777E777E003FBFFC48
8AC012E08B8B6C8A6C1EFFD100017F7813E0F73FF0F71FFC79B4FC0D0313C07913F09C38
007FFC7AB47E0E0F13E00E0313FC0E00EBFF80213F9EB5FC0E03EBFC000E0F13E00E3F13
80E67FFCC7FC9C3801FFF05513C00D0F90C8FCF71FFCF73FF0F7FFE0541380003FC0C9FC
481EFCC05A67676C1E806C9BCAFCD0EA01FE535A535A535A535AA2535A535A53CBFC651C
01525A651C07525A651C1F651C3F651C7F9ACCFC641C7E895777D19C>41
D<1238127C12FEB3B3B3B3B3B3B3B3A8127C1238079C6EF42B>106
D E
%EndDVIPSBitmapFont
%DVIPSBitmapFont: Fj cmr10 9.37793 1
/Fj 1 62 df<003FBE12F0481DF8BF12FCA36C1DF86C1DF0D2FCB3A7003FBE12F0481DF8
BF12FCA36C1DF86C1DF0662777B979>61 D E
%EndDVIPSBitmapFont
/Fl 134[83 83 116 83 91 50 83 58 1[91 91 91 1[42 2[42
91 91 50 83 91 83 1[83 14[108 2[116 10[108 68[{
TeXBase1Encoding ReEncodeFont}23 149.44 /Helvetica-Bold
rf /Fm 107[50 26[75 75 108 75 83 42 75 50 1[83 83 83
124 33 75 1[33 83 83 42 83 83 75 83 83 3[42 1[42 2[100
141 100 108 91 100 108 1[100 116 108 124 83 2[42 108
116 91 100 108 108 100 100 6[42 1[83 83 83 83 2[83 1[83
1[42 1[42 2[50 50 33 36[75 2[{TeXBase1Encoding ReEncodeFont}60
149.44 /Helvetica rf /Fn 171[81 6[126 81 3[111 72[{
TeXBase1Encoding ReEncodeFont}4 132.835 /Palatino-Roman
rf /Fo 133[74 1[83 2[92 55 65 65 1[83 74 92 1[46 74 1[46
83 1[46 65 83 68 1[74 51[55 42[88 2[{TeXBase1Encoding ReEncodeFont}20
166.044 /Palatino-Italic rf /Fp 105[83 27[83 92 86 138
94 100 54 70 66 93 100 91 97 147 48 92 1[48 97 92 55
80 101 74 92 83 3[55 1[55 2[111 166 1[129 102 87 111
2[131 138 157 101 1[55 56 138 1[92 101 129 118 1[129
5[42 42 83 83 83 83 83 83 83 83 83 83 101 42 55 42 4[46
35[101 100 2[{TeXBase1Encoding ReEncodeFont}65 166.044
/Palatino-Roman rf /Fq 135[103 166 1[120 65 84 79 1[120
109 116 176 58 2[58 116 111 66 95 122 88 1[100 12[122
105 2[120 1[166 9[154 141 1[155 6[50 1[100 5[100 1[100
1[50 6[55 39[{TeXBase1Encoding ReEncodeFont}32 199.253
/Palatino-Roman rf /Fr 135[143 7[143 9[80 102[{
TeXBase1Encoding ReEncodeFont}3 286.924 /Palatino-Italic
rf /Fs 139[94 122 113 1[172 157 167 253 83 2[83 167 160
96 137 1[127 1[143 12[176 3[173 6[97 7[223 25[80 39[{
TeXBase1Encoding ReEncodeFont}20 286.924 /Palatino-Roman
rf end
%%EndProlog
%%BeginSetup
%%Feature: *Resolution 1200dpi
TeXDict begin
%%BeginPaperSize: a4
a4
%%EndPaperSize
%%EndSetup
%%Page: 1 1
1 0 bop 1314 1760 a Fs(The)72 b Fr(fxp)p Fs('s)f(Application)k(Pr)-5
b(ogramming)73 b(Interface)3624 2744 y Fq(August)50 b(2000)1561
3310 y(Note:)61 b(The)50 b(following)f(is)g(an)g(excerpt)h(fr)l(om)f
(the)h(Chapter)e(2)h(Section)h(8)1312 3528 y(of)g(the)g(Andr)l(eas)g
(Neumann's)g(Ph.D.)g(Thesis.)1561 3926 y Fp(The)d Fo(fxp)p
Fp('s)h(application)g(interface)f(is)h(a)g(functional)g(variant)f(of)h
(an)g(event-based)f(in-)1312 4126 y(terface)60 b(that)h(vitally)h(r)m
(elies)f(on)k(S)8 b Fn(M)g(L)t Fp('s)62 b(parametrized)e(modules)h(for)
g(customization.)1312 4325 y(Its)52 b(information)e(set)h(includes)h
(all)g(r)m(equir)m(ed)f(and)g(most)g(optional)g(information)f(items)
1312 4524 y(of)41 b(the)k(X)8 b Fn(M)g(L)46 b Fp(information)40
b(set)i([1].)1561 4723 y(The)34 b(application)h(de\002nes)g(a)g(data)h
(type)e Fm(AppData)h Fp(r)m(epr)m(esenting)e(the)h(part)h(of)g(its)h
(state)1312 4923 y(af)m(fected)61 b(by)g(the)f(event)g(handlers.)109
b(W)-15 b(e)61 b(call)g(this)h(information)e(the)g Fo(application)h
(data)p Fp(.)1312 5122 y(During)40 b(parsing,)g(a)h(value)g(of)f(this)i
(type)e(is)i(maintained)e(by)h(the)f(parser)-12 b(.)49
b(Upon)40 b(trigger)m(-)1312 5321 y(ing)50 b(an)h(event,)f(the)g(event)
f(handler)h(r)m(eceives)f(the)h(application)g(data)h(as)g(an)f
(additional)1312 5520 y(parameter)60 b(and)i(r)m(eturns)f(it)i(\226)f
(possibly)h(modi\002ed)g(\226)f(to)g(the)g(parser)-12
b(.)112 b(The)61 b(modi\002ed)1312 5720 y(application)48
b(data)h(is)h(then)d(an)h(ar)m(gument)g(to)g(the)g(next)g(event,)g(and)
h(so)g(forth.)71 b(In)48 b(or)m(der)1312 5919 y(to)g(distinguish)i
(this)e(kind)h(of)f(event)f(handlers)h(fr)m(om)f(the)g(imperative)g
(variant)h(we)g(call)1312 6118 y(them)34 b Fo(hooks)p
Fp(.)48 b(The)34 b(ef)m(fects)h(of)h(all)g(hooks)f(ar)m(e)f(thus)i
(accumulated)f(into)g(one)f(value)h(of)g(type)1312 6317
y Fm(AppData)o Fp(.)1561 6517 y(In)50 b(or)m(der)g(to)g(implement)f
(hooks)i(in)f Fo(fxp)p Fp(,)j(the)d(parser)f(must)i(be)f(pr)m(ovided)h
(with)g(the)1312 6716 y(type)j(of)g(the)f(application)h(data)h(and)f
(with)g(the)g(functions)g(implementing)f(the)g(hooks.)1312
6915 y(The)68 b(easiest)h(way)h(of)f(doing)g(so)g(is)h(to)f(pack)g
(them)e(into)i(a)g(str)o(uctur)m(e)f Fm(Hooks)p Fp(,)75
b(ful\002ll-)1312 7114 y(ing)54 b(the)g(signatur)m(e)g(in)h(Figur)m(e)
108 b(1.)90 b(The)53 b(purposes)h(of)g(the)g(single)g(hooks)g(ar)m(e)g
(listed)h(in)1312 7314 y(T)-15 b(able1.)61 b(A)45 b(hook)f(expects)g
(as)h(ar)m(gument)f(the)h(curr)m(ent)e(application)i(data)h(and)f(\226)
g(except)1312 7513 y(for)55 b Fm(hookFinish)i Fp(\226)e(the)h
(information)e(belonging)i(to)g(the)f(signaled)i(event,)g(and)f(r)m
(eturns)1312 7712 y(the)41 b(modi\002ed)h(application)f(data.)1561
7911 y(The)52 b(data)j(types)e(pr)m(oviding)h(the)f(event-speci\002c)f
(information)h(passed)h(to)f(a)h(hook)1312 8111 y(ar)m(e)44
b(de\002ned)i(by)g(str)o(uctur)m(e)e Fm(HookData)p Fp(.)63
b(Each)45 b(information)g(item)g(contains)g(at)h(least)g(the)1312
8310 y(position)d(in)h(the)f(document)g(wher)m(e)f(the)h(event)g
(occurr)m(ed;)g(for)g(some)g(events)g(even)f(two)1312
8509 y(positions)g(ar)m(e)g(speci\002ed:)53 b(a)42 b(start)h(position)f
(and)g(an)g(end)g(position.)54 b(E.g.,)41 b(type)h Fm(ErrorInf)l(o)1312
8708 y Fp(describes)52 b(an)h(err)m(or)e(by)i(means)f(of)g(its)i
(position)f(and)g(an)f(err)m(or)f(description.)85 b(A)53
b(start-)1312 8908 y(tag)48 b(is)g(described)g(by)f(the)g(positions)h
(of)g(its)g(\002rst)h(and)e(last)i(character)-12 b(,)46
b(the)h(index)g(of)h(the)1312 9107 y(element,)62 b(the)d(list)i(of)f
(attribute)g(speci\002cations)g(the)f(eventual)g(spaces)h(following)h
(the)1312 9306 y(element)53 b(name)h(and)h(a)g(boolean)f(\003ag,)59
b(indicating)c(whether)f(it)i(is)g(an)e(empty-element)1312
9505 y(tag:)1727 9767 y Fl(type)42 b Fm(ErrorInf)l(o)58
b Fj(=)h Fm(Errors)n(.P)-7 b(osition)58 b Fi(\003)h Fm(Errors)n(.Error)
1727 9950 y Fl(type)42 b Fm(Star)6 b(tT)-18 b(agInf)l(o)58
b Fj(=)h Fm(Star)6 b(tEnd)59 b Fi(\003)h Fm(int)f Fi(\003)h
Fm(AttSpecList)e Fi(\003)i Fm(UniChar)-7 b(.Data)58 b
Fi(\003)i Fm(bool)1561 10228 y Fp(The)48 b(information)g(set)h(pr)m
(ovided)g(thr)m(ough)f Fo(fxp)p Fp('s)i(hooks)f(is)h(suf)m(\002cient)g
(for)f(pr)m(oduc-)1312 10428 y(ing)54 b(a)g(character)m(-by-character)c
(identical)55 b(copy)e(of)h(the)f(document)g(instance.)87
b(Except)1312 10627 y(for)35 b(white)g(space)g(and)h(parameter)d
(entity)j(r)m(efer)m(ences)d(within)j(declarations,)g(this)g(is)g(also)
1312 10826 y(possible)42 b(for)f(the)f(DTD.)4134 11324
y(1)p eop
%%Page: 2 2
2 1 bop 465 903 5729 7 v 465 6282 7 5379 v 598 1164 a
Fl(signature)42 b Fm(Hooks)60 b Fj(=)812 1347 y Fl(sig)1027
1529 y(type)41 b Fm(AppData)1027 1712 y Fl(type)g Fm(AppFinal)1027
1973 y Fl(v)m(al)g Fm(hookError)356 b(:)51 b(AppData)60
b Fi(\003)f Fm(HookData.ErrorInf)l(o)373 b Fi(!)59 b
Fm(AppData)1027 2155 y Fl(v)m(al)41 b Fm(hookW)-6 b(ar)t(ning)135
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.W)-6 b(ar)t(ningInf)l(o)152
b Fi(!)59 b Fm(AppData)1027 2416 y Fl(v)m(al)41 b Fm(hookStar)6
b(tT)-18 b(ag)127 b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.Star)6
b(tT)-18 b(agInf)l(o)144 b Fi(!)59 b Fm(AppData)1027
2599 y Fl(v)m(al)41 b Fm(hookEndT)-18 b(ag)184 b(:)51
b(AppData)60 b Fi(\003)f Fm(HookData.EndT)-18 b(agInf)l(o)201
b Fi(!)59 b Fm(AppData)1027 2860 y Fl(v)m(al)41 b Fm(hookData)373
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.DataInf)l(o)390
b Fi(!)59 b Fm(AppData)1027 3042 y Fl(v)m(al)41 b Fm(hookCData)265
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.CDataInf)l(o)282
b Fi(!)59 b Fm(AppData)1027 3225 y Fl(v)m(al)41 b Fm(hookCharRef)132
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.CharRefInf)l(o)149
b Fi(!)59 b Fm(AppData)1027 3486 y Fl(v)m(al)41 b Fm(hookProcInst)139
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.ProcInstInf)l(o)156
b Fi(!)59 b Fm(AppData)1027 3668 y Fl(v)m(al)41 b Fm(hookComment)h(:)51
b(AppData)60 b Fi(\003)f Fm(HookData.CommentInf)l(o)g
Fi(!)g Fm(AppData)1027 3851 y Fl(v)m(al)41 b Fm(hookWhite)307
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.WhiteInf)l(o)324
b Fi(!)59 b Fm(AppData)1027 4034 y Fl(v)m(al)41 b Fm(hookDecl)390
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.DeclInf)l(o)407
b Fi(!)59 b Fm(AppData)1027 4294 y Fl(v)m(al)41 b Fm(hookDocT)-18
b(ype)109 b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.DtdInf)l(o)473
b Fi(!)59 b Fm(AppData)1027 4477 y Fl(v)m(al)41 b Fm(hookSubset)223
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.SubsetInf)l(o)240
b Fi(!)59 b Fm(AppData)1027 4660 y Fl(v)m(al)41 b Fm(hookExtSubset)8
b(:)49 b(AppData)60 b Fi(\003)f Fm(HookData.ExtSubsetInf)l(o)27
b Fi(!)55 b Fm(AppData)1027 4842 y Fl(v)m(al)41 b Fm(hookEndDtd)190
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.EndDtdInf)l(o)207
b Fi(!)59 b Fm(AppData)1027 5103 y Fl(v)m(al)41 b Fm(hookGenRef)174
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.GenRefInf)l(o)191
b Fi(!)59 b Fm(AppData)1027 5286 y Fl(v)m(al)41 b Fm(hookP)-6
b(arRef)229 b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.P)-6
b(arRefInf)l(o)246 b Fi(!)59 b Fm(AppData)1027 5469 y
Fl(v)m(al)41 b Fm(hookEntEnd)198 b(:)51 b(AppData)60
b Fi(\003)f Fm(HookData.EntEndInf)l(o)215 b Fi(!)59 b
Fm(AppData)1027 5729 y Fl(v)m(al)41 b Fm(hookXml)432
b(:)51 b(AppData)60 b Fi(\003)f Fm(HookData.XmlInf)l(o)449
b Fi(!)59 b Fm(AppData)1027 5912 y Fl(v)m(al)41 b Fm(hookFinish)291
b(:)51 b(AppData)1856 b Fi(!)59 b Fm(AppFinal)812 6095
y Fl(end)p 6187 6282 V 465 6289 5729 7 v 2199 6493 a
Fh(Figure)42 b(1:)f Fp(The)g Fm(Hooks)g Fp(signatur)m(e.)p
465 6580 V 714 6912 a(In)47 b(addition)i(to)f(type)f
Fm(AppData)p Fp(,)i(a)e Fm(Hooks)h Fp(str)o(uctur)m(e)e(must)j
(de\002ne)e(a)h(type)f Fm(AppFinal)q Fp(.)465 7111 y(This)35
b(type)f(is)h(the)f(r)m(esult)h(type)f(of)h(the)f(parser:)46
b(At)35 b(the)f(end)g(of)g(the)g(document,)h(the)f(parser)465
7310 y(calls)f Fm(hookFinish)g Fp(in)f(or)m(der)f(to)i
Fo(\002nalize)e Fp(the)h(accumulated)g(application)g(data.)49
b(This)32 b(is)h(use-)465 7509 y(ful)45 b(because)f(values)h(of)g(type)
f Fm(AppData)g Fp(often)g(contain)g(auxiliary)h(information:)56
b(E.g.,)45 b(if)465 7709 y(an)56 b(application)h(collects)f
(information)f(fr)m(om)h(the)g(document)g(entity,)j(it)f(must)e(ignor)m
(e)465 7908 y(all)41 b(events)f(occurring)f(within)j(entity)e(r)m
(eplacement)f(texts.)50 b(For)40 b(this)i(r)m(eason,)d(the)h(hooks)465
8107 y(might)52 b(maintain)g(a)g(counter)f(indicating)i(the)f(nesting)g
(depth)f(of)i(included)f(entity)h(r)m(ef-)465 8306 y(er)m(ences.)k
Fm(hookFinish)45 b Fp(then)e(r)m(emoves)f(this)j(counter)d(because)i
(it)g(is)h(of)f(no)g(inter)m(est)f(to)h(the)465 8506
y(application.)78 b(Another)49 b(example)g(for)h(the)g(bene\002ts)g(of)
h(\002nalization)g(is)g(the)f(following:)465 8705 y(An)55
b(application)h(might)f(collect)h(section)f(titles)h(in)g(or)m(der)e
(to)i(compile)e(a)i(table)g(of)f(con-)465 8904 y(tents.)61
b(Since)44 b(the)h(application)f(data)i(is)g(an)e(accumulating)h
(parameter,)e(these)i(titles)g(ar)m(e)465 9103 y(collected)k(in)g(r)m
(everse)f(or)m(der)-12 b(.)74 b Fm(AppFinal)50 b Fp(can)f(be)g(used)h
(for)f(r)m(eestablishing)g(the)g(original)465 9303 y(or)m(der)-12
b(.)465 9874 y Fg(1)239 b(Functorizing)58 b(the)h(Parser)465
10252 y Fp(The)h(parser)g(is)i(ther)m(efor)m(e)d(implemented)h(as)h(an)
k(S)8 b Fn(M)g(L)66 b Fo(functor)p Fp(,)g(expecting)59
b(the)i Fm(Hooks)465 10452 y Fp(str)o(uctur)m(e)47 b(as)i(ar)m(gument.)
70 b(The)47 b(application)h(can)g(then)f(generate)g(several)g
(instances)h(of)465 10651 y(the)g(parser)g(for)g(dif)m(fer)m(ent)g
(purposes.)72 b(In)49 b(addition)g(to)g(the)f(hooks,)i(the)e(parser)f
(functor)3288 11324 y(2)p eop
%%Page: 3 3
3 2 bop 1320 903 5713 7 v 1317 1103 7 200 v 2587 1103
V 7026 1103 V 1317 1146 V 1452 1086 a Fp(Hook)p 2587
1146 V 863 w(X)8 b Fn(M)g(L)46 b Fp(Event)p 7026 1146
V 1320 1153 5713 7 v 1317 1352 7 200 v 2587 1352 V 7026
1352 V 1320 1193 5713 7 v 1317 1392 7 200 v 2587 1392
V 7026 1392 V 1317 1435 V 1452 1376 a Fm(hookError)p
2587 1435 V 614 w Fp(err)m(or)p 7026 1435 V 1317 1635
V 1452 1575 a Fm(hookW)-6 b(ar)t(ning)p 2587 1635 V 393
w Fp(warning)p 7026 1635 V 1320 1641 5713 7 v 1317 1840
7 200 v 2587 1840 V 7026 1840 V 1317 1884 V 1452 1824
a Fm(hookStar)6 b(tT)-18 b(ag)p 2587 1884 V 385 w Fp(start-tag)p
7026 1884 V 1317 2083 V 1452 2023 a Fm(hookEndT)g(ag)p
2587 2083 V 442 w Fp(end-tag)p 7026 2083 V 1320 2090
5713 7 v 1317 2289 7 200 v 2587 2289 V 7026 2289 V 1317
2333 V 1452 2273 a Fm(hookData)p 2587 2333 V 631 w Fp(segment)41
b(of)g(character)f(data)p 7026 2333 V 1317 2532 V 1452
2472 a Fm(hookCData)p 2587 2532 V 523 w Fp(CDA)-12 b(T)g(A)40
b(section)p 7026 2532 V 1317 2731 V 1452 2671 a Fm(hookCharRef)p
2587 2731 V 390 w Fp(character)g(r)m(efer)m(ence)p 7026
2731 V 1320 2738 5713 7 v 1317 2937 7 200 v 2587 2937
V 7026 2937 V 1317 2980 V 1452 2921 a Fm(hookProcInst)p
2587 2980 V 397 w Fp(pr)m(ocessing)h(instr)o(uction)p
7026 2980 V 1317 3180 V 1452 3120 a Fm(hookComment)p
2587 3180 V 300 w Fp(comment)p 7026 3180 V 1317 3379
V 1452 3319 a Fm(hookDecl)p 2587 3379 V 648 w Fp(element,)e(attribute,)
i(notation)g(or)g(entity)g(declaration)p 7026 3379 V
1320 3386 5713 7 v 1317 3585 7 200 v 2587 3585 V 7026
3585 V 1317 3628 V 1452 3569 a Fm(hookDocT)-18 b(ype)p
2587 3628 V 367 w Fp(start)42 b(of)f(the)g(DTD)p 7026
3628 V 1317 3828 V 1452 3768 a Fm(hookSubset)p 2587 3828
V 481 w Fp(start)h(of)f(the)g(internal)g(subset)p 7026
3828 V 1317 4027 V 1452 3967 a Fm(hookExtSubset)p 2587
4027 V 264 w Fp(start)h(of)f(the)g(external)f(subset)p
7026 4027 V 1317 4226 V 1452 4166 a Fm(hookEndDtd)p 2587
4226 V 448 w Fp(end)h(of)h(the)f(DTD)p 7026 4226 V 1320
4233 5713 7 v 1317 4432 7 200 v 2587 4432 V 7026 4432
V 1317 4476 V 1452 4416 a Fm(hookGenRef)p 2587 4476 V
432 w Fp(general)g(entity)g(r)m(efer)m(ence)p 7026 4476
V 1317 4675 V 1452 4615 a Fm(hookP)-6 b(arRef)p 2587
4675 V 487 w Fp(parameter)40 b(entity)h(r)m(efer)m(ence)p
7026 4675 V 1317 4874 V 1452 4814 a Fm(hookEntEnd)p 2587
4874 V 456 w Fp(end)g(of)h(an)f(included)h(entity)p 7026
4874 V 1320 4881 5713 7 v 1317 5080 7 200 v 2587 5080
V 7026 5080 V 1317 5123 V 1452 5064 a Fm(hookXml)p 2587
5123 V 690 w Fp(start)g(of)f(the)g(document)g(entity)p
7026 5123 V 1317 5323 V 1452 5263 a Fm(hookFinish)p 2587
5323 V 549 w Fp(end)g(of)h(the)f(document)p 7026 5323
V 1320 5329 5713 7 v 2531 5534 a Fh(T)-18 b(able)41 b(1:)g
Fp(The)g(hooks)g(in)g Fo(fxp)h Fp(and)g(their)f(purposes.)p
1312 5621 5729 7 v 1312 5827 V 1312 7554 7 1727 v 1445
6087 a Fl(funsig)g Fm(P)-6 b(arse)41 b(\()h Fl(structure)g
Fm(Dtd)344 b(:)51 b(Dtd)2452 6270 y Fl(structure)42 b
Fm(Hooks)153 b(:)51 b(Hooks)2452 6453 y Fl(structure)42
b Fm(Resolv)l(e)f(:)51 b(Resolv)l(e)2452 6635 y Fl(structure)42
b Fm(Options)62 b(:)51 b(Options)42 b(\))60 b Fj(=)1659
6818 y Fl(sig)1874 7001 y(v)m(al)41 b Fm(parseDocument)g(:)51
b(Ur)r(i.Ur)r(i)41 b(option)60 b Fi(!)g Fm(Dtd.Dtd)40
b(option)3328 7183 y Fi(!)59 b Fm(Hooks)n(.AppData)f
Fi(!)i Fm(Hooks)n(.AppFinal)1659 7366 y Fl(end)p 7034
7554 V 1312 7561 5729 7 v 3122 7764 a Fh(Figure)42 b(2:)f
Fp(The)g Fm(P)-6 b(arser)40 b Fp(functor)-12 b(.)p 1312
7848 V 1312 8180 a(is)42 b(parametrized)e(with)i(thr)m(ee)e(other)g
(str)o(uctur)m(es:)1312 8512 y Fl(Options)p Fh(:)83 b
Fp(This)38 b(str)o(uctur)m(e)f(supplies)i(the)e(parser)g(with)i(its)g
(options.)49 b(It)39 b(is)f(useful)h(to)f(have)1727 8712
y(each)61 b(instance)g(of)g(the)g(parser)g(r)o(un)g(with)h(its)h(own)e
(set)h(of)g(options:)91 b(E.g.,)70 b(X)8 b Fn(M)g(L)1727
8911 y Fp(catalogs)52 b(fr)m(equently)f(have)f(no)h(DTD.)80
b(Ther)m(efor)m(e)49 b(a)i(catalog)h(is)g(parsed)g(in)f(non-)1727
9110 y(validating)43 b(mode,)d(even)g(if)i(the)f(main)g(document)f(is)j
(validated.)1312 9442 y Fl(Resolve)p Fh(:)82 b Fp(This)68
b(str)o(uctur)m(e)f(pr)m(ovides)g(a)g(single)h(function)f(for)g(r)m
(esolving)h(an)f(external)1727 9641 y(identi\002er)41
b(to)h(a)f(URI:)2092 9957 y Fl(v)m(al)g Fm(resolv)l(eExtId)f(:)51
b(Base)n(.Exter)t(nalId)58 b Fi(!)h Fm(Ur)r(i.Ur)r(i)1727
10289 y Fp(In)38 b(the)g(simplest)i(case)e(this)h(function)g(r)m
(eturns)e(the)i(system)g(identi\002er)g(that)f(is)i(part)1727
10488 y(of)50 b(the)g(external)e(identi\002er)-12 b(.)77
b(If)51 b(the)e(parser)h(supports)k(X)8 b Fn(M)g(L)54
b Fp(catalogs,)f(however)-12 b(,)1727 10688 y(this)42
b(is)g(the)f(function)g(that)g(sear)m(ches)g(the)g(catalog.)4134
11324 y(3)p eop
%%Page: 4 4
4 3 bop 465 1063 a Fl(Dtd)p Fh(:)82 b Fp(The)57 b(implementation)f(of)i
(the)f(DTD)f(tables)i(can)f(be)g(pr)m(ovided)g(by)h(the)f(applica-)880
1262 y(tion.)50 b(In)38 b(most)h(cases)f(this)i(is)f(the)f
Fm(Dtd)g Fp(str)o(uctur)m(e)g(pr)m(ovided)g(by)h Fo(fxp)p
Fp(,)g(but)g(the)f(appli-)880 1461 y(cation)49 b(can)f(also)h(pr)m
(ovide)f(a)h(mor)m(e)f(ef)m(\002cient)h(implementation,)f(or)g(enhance)
f(the)880 1660 y(functionality)d(of)f(the)g(DTD)f(tables.)56
b(For)43 b(instance,)g(the)f(operations)g(on)h(the)g(DTD)880
1860 y(tables)f(might)g(be)g(wrapped)g(into)f(functions)h(pr)m(oducing)
g(debugging)h(or)e(statisti-)880 2059 y(cal)c(information.)49
b(On)36 b(the)g(other)g(hand,)h(the)g(application)g(can)f
Fo(hard-code)g Fp(element)880 2258 y(types)i(or)g(attributes)g(to)g
(\002xed)h(indices.)50 b(E.g.,)38 b(in)g(or)m(der)g(to)g(collect)f
Fe(href)g Fp(attributes)880 2457 y(in)44 b(an)k(X)8 b
Fn(H)g(T)g(M)g(L)48 b Fp(document,)c(one)e(might)j(use)f(the)f
(following)i(implementation)d(of)880 2657 y(the)f Fm(Dtd)g
Fp(ar)m(gument)f(str)o(uctur)m(e:)1245 2972 y Fl(structure)j
Fm(HrefDtd)58 b Fj(=)1460 3155 y Fl(struct)1674 3337
y(open)43 b Fm(Dtd)1674 3598 y Fl(v)m(al)f Fm(hrefData)59
b Fj(=)g Fm(UniChar)-7 b(.Str)r(ing2Data)40 b(\224href\224)1674
3859 y Fl(fun)i Fm(initDtdT)-18 b(ab)m(les)42 b(\(\))59
b Fj(=)1889 4042 y Fl(let)41 b(v)m(al)g Fm(dtd)60 b Fj(=)f
Fm(Dtd.initDtdT)-18 b(ab)m(les\(\))2105 4224 y Fl(v)m(al)p
2351 4248 79 6 v 179 w Fj(=)60 b Fm(AttNot2Inde)l(x)39
b(dtd)i(hrefData)1889 4407 y Fl(in)g Fm(dtd)1889 4590
y Fl(end)1674 4850 y(v)m(al)h Fm(hrefIdx)59 b Fj(=)g
Fm(AttNot2Inde)l(x)39 b(\(initDtdT)-18 b(ab)m(les\(\)\))41
b(hrefData)1460 5033 y Fl(end)880 5365 y Fp(A)i Fe(href)f
Fp(attribute)h(will)i(then)d(always)i(have)f(index)f
Fm(hrefIdx)o Fp(;)i(sear)m(ching)e(and)h(com-)880 5564
y(paring)75 b(can)g(be)g(done)f(with)i(this)g(constant)f(rather)f(than)
h(the)f(list)j(of)e(charac-)880 5764 y(ters)46 b Fm
([0wx68,0wx72,0wx65,)f(0wx66])p Fp(.)65 b(This)46 b(is)h(r)m(easonable)
e(also)i(because)e(element)880 5963 y(types)k(ar)m(e)f(passed)i(to)f
(the)f(hooks)h(by)g(means)f(of)h(their)f(indices)h(in)g(the)g(DTD,)e
(not)880 6162 y(by)42 b(their)e(names.)465 6494 y(The)63
b(signatur)m(e)i(of)f(the)g(parser)f(functor)g(is)i(given)f(in)h(Figur)
m(e)f(2.)120 b(It)64 b(de\002nes)h(a)f(single)465 6694
y(function)57 b Fm(parseDocument)f Fp(which,)61 b(given)c(an)f
(optional)h(URI)g(of)g(a)h(document)e(and)h(an)465 6893
y(optional)39 b(DTD,)f(parses)i(that)g(document)e(\226)i(if)h(no)e(URI)
h(is)g(given,)f(the)h(document)e(is)j(r)m(ead)465 7092
y(fr)m(om)48 b(the)g(standar)m(d)i(input.)73 b(It)49
b(r)m(eturns)f(the)g(\002nalized)i(value)e(of)h(the)f(application)h
(data)465 7291 y(r)m(eceived)40 b(in)h(its)i(thir)m(d)e(ar)m(gument,)f
(modi\002ed)i(by)g(the)f(hooks)g(during)h(parsing.)714
7491 y(If)33 b(the)g(optional)f Fm(Dtd)h Fp(ar)m(gument)f(is)i(given)e
(as)i Fm(NONE)o Fp(,)h(the)d(parser)g(initializes)i(the)f(DTD)465
7690 y(tables)51 b(with)g(the)g Fm(initDtdT)-18 b(ab)m(les)50
b Fp(function.)79 b(In)50 b(this)i(case)e(hooks)h(have)f(no)g(access)h
(to)f(the)465 7889 y(DTD)34 b(because)h(it)h(is)g(not)g(pr)m(ovided)f
(as)h(an)f(ar)m(gument)g(to)g(them.)48 b(For)35 b(many)g(applications)
465 8088 y(this)46 b(is)g(not)f(necessary)f(indeed.)62
b(If)46 b(the)f(hooks)g(need)f(to)h(access)g(the)g(DTD,)f(the)g
(applica-)465 8288 y(tion)f(must)i(initialize)f(the)f(tables)h(itself,)
h(incorporate)d(them)h(into)g(the)g(application)h(data)465
8487 y(and)e(pass)g(them)e(to)h(the)g(parser)g(in)g(its)h
Fm(Dtd)f Fp(ar)m(gument.)465 9058 y Fg(2)239 b(Building)59
b(Applications)f(with)i Fd(fxp)465 9437 y Fo(fxp)46 b
Fp(pr)m(ovides)f(a)g(rich)g(information)f(set)i(thr)m(ough)e(its)j
(hooks)e(interface.)62 b(Many)45 b(applica-)465 9636
y(tions,)c(however)-12 b(,)39 b(ar)m(e)h(only)h(inter)m(ested)g(in)g(a)
g(subset)h(of)f(that)g(information:)50 b(A)41 b(formatter)465
9835 y(is)54 b(pr)m(obably)f(not)g(inter)m(ested)f(in)i(comments,)g
(and)f(a)h(querying)f(tool)g(will)i(only)f(sear)m(ch)465
10034 y(the)64 b(document)g(instance)g(and)h(ignor)m(e)f(the)g(DTD.)120
b(Ther)m(efor)m(e)63 b(we)h(pr)m(ovide)h(a)g(set)f(of)465
10234 y(hooks)41 b(that)h(simply)g(r)m(eturn)e(the)h(application)g
(data)h(unchanged:)3288 11324 y(4)p eop
%%Page: 5 5
5 4 bop 1727 1063 a Fl(structure)42 b Fm(IgnoreHooks)59
b Fj(=)1941 1245 y Fl(struct)2156 1428 y(fun)42 b Fm(hookError\(a,)p
3262 1452 79 6 v 76 w(\))60 b Fj(=)18 b Fm(a)2156 1611
y Fl(fun)42 b Fm(hookW)-6 b(ar)t(ning\(a,)p 3483 1635
V 77 w(\))60 b Fj(=)18 b Fm(a)2156 1793 y Fl(fun)42 b
Fm(hookStar)6 b(tT)-18 b(ag\(a,)p 3491 1817 V 76 w(\))60
b Fj(=)18 b Fm(a)2156 1976 y(...)2156 2159 y Fl(fun)42
b Fm(hookXml\(a,)p 3186 2183 V 78 w(\))59 b Fj(=)19 b
Fm(a)2156 2341 y Fl(fun)42 b Fm(hookFinish)g(a)60 b Fj(=)18
b Fm(a)1941 2524 y Fl(end)1312 2835 y Fp(These)49 b(functions)i(ar)m(e)
e(polymorphic:)68 b(They)49 b(do)i(not)f(depend)g(on)g(the)g(type)g(of)
g(the)g(ap-)1312 3035 y(plication)h(data)h(and)g(can)e(thus)i(be)f
(used)h(with)g(arbitrary)f(types.)81 b(The)50 b(only)h(exception)1312
3234 y(is)60 b Fm(hookFinish)g Fp(which)g(r)m(equir)m(es)f(types)h
Fm(AppData)f Fp(and)h Fm(AppFinal)g Fp(to)g(be)f(equal.)106
b(An)59 b(ap-)1312 3433 y(plication)46 b(must)h(now)g(only)g(de\002ne)f
(the)g(hooks)g(that)h(have)f(a)h(dif)m(fer)m(ent)f(behavior)-12
b(.)65 b(E.g.,)1312 3632 y(the)39 b(hooks)h(for)f(a)h(validator)g(ar)m
(e)f(implemented)g(in)h(a)g(few)g(lines,)g(because)f(it)i(only)e
(prints)1312 3832 y(err)m(ors)h(and)h(warnings)h(but)g(ignor)m(es)f
(all)h(other)e(events:)1727 4127 y Fl(structure)i Fm(Chec)m(kHooks)59
b Fj(=)1941 4309 y Fl(struct)2156 4492 y(open)42 b Fm(T)-18
b(e)l(xtIO)41 b(Errors)f(IgnoreHooks)2156 4753 y Fl(type)i
Fm(AppData)59 b Fj(=)18 b Fm(OS)m(.Process)n(.status)2156
5014 y Fl(fun)42 b Fm(message\(pos,msg\))59 b Fj(=)18
b Fm(output\(stdErr,P)-7 b(osition2Str)r(ing)38 b(pos)6105
4959 y Fc(^)6175 5014 y Fm(\224:)51 b(\224)6386 4959
y Fc(^)6455 5014 y Fm(msg\))2156 5274 y Fl(fun)42 b Fm(hookError)e(\()p
3177 5298 V 78 w(,\(pos,err\)\))58 b Fj(=)2370 5457 y
Fm(OS)m(.Process)n(.f)l(ailure)39 b(bef)l(ore)i
(message\(pos,errorMessage)f(err\))2156 5640 y Fl(fun)i
Fm(hookW)-6 b(ar)t(ning)41 b(\(status,\(pos,w)n(ar)t(n\)\))57
b Fj(=)2370 5822 y Fm(status)41 b(bef)l(ore)f(message\(pos,w)n(ar)t
(ningMessage)h(w)n(ar)t(n\))1941 6005 y Fl(end)1312 6316
y Fp(Except)51 b(for)g Fm(hookError)g Fp(and)g Fm(hookW)-6
b(ar)t(ning)q Fp(,)53 b(all)g(hooks)e(ar)m(e)g(taken)g(over)f(fr)m(om)h
(str)o(uctur)m(e)1312 6516 y Fm(IgnoreHooks)o Fp(.)g(Another)39
b(example)g(is)j(the)e(application)h(alr)m(eady)g(mentioned)f(on)g
(page)h(4:)1312 6715 y(It)53 b(collects)g(all)g(attributes)g(named)f
Fe(href)p Fp(.)85 b(For)52 b(this)h(purpose)f(we)h(de\002ned)g(a)g(str)
o(uctur)m(e)1312 6914 y Fm(HrefDtd)o Fp(,)34 b(har)m(d-coding)g(this)g
(attribute)f(name)g(to)g(a)h(constant)e(index)i Fm(hrefIdx)n
Fp(.)49 b(This)33 b(is)i(used)1312 7113 y(in)41 b(the)g(de\002nition)h
(of)f(the)g(hooks:)1727 7408 y Fl(structure)h Fm(HrefHooks)59
b Fj(=)1941 7591 y Fl(struct)2156 7774 y(open)42 b Fm(Base)f(HrefDtd)g
(IgnoreHooks)2156 8035 y Fl(type)h Fm(AppData)59 b Fj(=)h
Fm(UniChar)-7 b(.V)-12 b(ector)39 b(list)2156 8217 y
Fl(type)j Fm(AppFinal)60 b Fj(=)g Fm(AppData)2156 8478
y Fl(fun)42 b Fm(\002ndHref)e(nil)61 b Fj(=)f Fm(NONE)2310
8661 y Fi(j)77 b Fm(\002ndHref)40 b(\(\(idx,attPres\))17
b(::)g(rest\))55 b Fj(=)2585 8843 y Fl(if)41 b Fm(idx)19
b Fa(<)-22 b(>)18 b Fm(hrefIdx)40 b Fl(then)i Fm(\002ndHref)f(rest)2585
9026 y Fl(else)g(case)h Fm(attPres)3092 9209 y Fl(of)58
b Fm(AP)p 3500 9209 45 7 v 53 w(PRESENT\(v)l(ec,)p 4572
9233 79 6 v 76 w(\))i Fi(\))g Fm(SOME)41 b(v)l(ec)3172
9391 y Fi(j)76 b Fm(AP)p 3500 9391 45 7 v 53 w(DEF)-12
b(A)-7 b(UL)-16 b(T\(v)l(ec,)p 4511 9415 79 6 v 76 w(\))60
b Fi(\))g Fm(SOME)41 b(v)l(ec)3172 9574 y Fi(j)p 3291
9598 V 214 w(\))60 b Fm(\002ndHref)40 b(rest)2156 9835
y Fl(fun)i Fm(hookStar)6 b(tT)-18 b(ag)40 b(\(a,\()p
3581 9859 V 78 w(,)p 3701 9859 V 77 w(,attSpecs,)p 4445
9859 V 76 w(,)p 4563 9859 V 78 w(\)\))59 b Fj(=)2370
10017 y Fl(case)42 b Fm(\002ndHref)f(attSpecs)2545 10200
y Fl(of)58 b Fm(NONE)i Fi(\))f Fm(a)2625 10383 y Fi(j)76
b Fm(SOME)42 b(x)60 b Fi(\))f Fm(x)17 b(::)g(a)2156 10643
y Fl(v)m(al)41 b Fm(hookFinish)61 b Fj(=)e Fm(re)l(v)1941
10826 y Fl(end)4134 11324 y Fp(5)p eop
%%Page: 6 6
6 5 bop 465 1063 a Fp(In)60 b(or)m(der)f(to)h(identify)h(a)g
Fe(href)e Fp(attribute,)64 b(function)c Fm(\002ndHref)g
Fp(need)f(only)h(compar)m(e)f(its)465 1262 y(index)g(with)h
Fm(hrefIdx)n Fp(.)105 b(Note)58 b(that)h(due)h(to)f(the)f(accumulative)
h(natur)m(e)f(of)h(the)g Fm(AppData)465 1461 y Fp(ar)m(gument,)54
b(the)f(attribute)g(values)g(ar)m(e)f(collected)h(in)g(r)m(everse)e(or)
m(der)-12 b(.)84 b(Ther)m(efor)m(e)50 b(func-)465 1660
y(tion)43 b Fm(hookFinish)h Fp(is)g(de\002ned)f(to)h(be)e(the)h(list)i
(r)m(eversing)d(function)h(and)g(r)m(eestablishes)g(the)465
1860 y(original)f(or)m(der)-12 b(.)465 2431 y Fg(3)239
b(Implementing)57 b(a)i(T)-27 b(ree-Based)59 b(Interface)465
2809 y Fp(The)51 b(hooks)h(interface)f(of)h Fo(fxp)g
Fp(its)h(event-based;)j(nevertheless)50 b(a)i(tr)m(ee-based)g
(interface)465 3009 y(can)e(easily)h(be)f(implemented)f(on)h(top)g(of)g
(the)g(hooks.)77 b(W)-15 b(e)50 b(pr)m(esent)f(bellow)h(an)h(imple-)465
3208 y(mentation)57 b(of)h(such)g(an)g(interface.)100
b(For)58 b(br)m(evity)-18 b(,)61 b(it)e(has)f(only)g(a)g(r)m(estricted)
g(informa-)465 3407 y(tion)46 b(set:)60 b(It)46 b(ignor)m(es)g(the)g
(DTD,)e(comments,)h(pr)m(ocessing)h(instr)o(uctions)g(and)g(the)f
(entity)465 3607 y(str)o(uctur)m(e)50 b(of)h(the)f(document.)79
b(It)51 b(is,)j(however)-12 b(,)51 b(easy)g(to)g(extend)f(the)g
(implementation)465 3806 y(to)41 b(supply)h(all)h(this)f(information.)
714 4005 y(The)51 b Fm(T)-18 b(ree)53 b Fp(data)g(type)g(is)g(simple:)
73 b(A)52 b(tr)m(ee)g(is)h(either)e(a)i(piece)e(of)h(text)g(or)g(an)g
(element)465 4204 y(consisting)66 b(of)f(a)g(start-tag)h(and)f(a)g
(list)h(of)f(tr)m(ees)f(as)i(content.)120 b(The)64 b(application)h
(data)465 4404 y(r)m(epr)m(esents)46 b(the)i(partial)g(document)f(tr)m
(ee)g(constr)o(ucted)g(so)i(far)-12 b(.)70 b(It)49 b(contains)e(in)i
(its)g Fm(stac)m(k)465 4603 y Fp(component)54 b(for)i(each)f(ancestor)g
(element)g(of)h(the)g(curr)m(ent)e(position,)60 b(its)d(start-tag)g
(to-)465 4802 y(gether)45 b(with)h(the)g(list)h(of)f(its)h(\226)f(alr)m
(eady)g(complete)e(\226)i(left)h(siblings;)i(component)43
b Fm(content)465 5001 y Fp(holds)48 b(the)f(childr)m(en)g(of)h(the)f
(curr)m(ent)f(element)g(that)i(ar)m(e)f(known)g(so)h(far)-12
b(.)69 b(At)48 b(the)g(begin-)465 5201 y(ning)g(of)h(the)f(parse)g
(both)f(components)g(ar)m(e)h(empty)-18 b(.)71 b(After)48
b(the)g(whole)g(document)f(has)465 5400 y(been)33 b(parsed,)j(the)e
(stack)h(must)g(be)g(empty)f(and)h(a)g(single)g(element)e(tr)m(ee)h
(must)h(have)f(been)465 5599 y(constr)o(ucted.)66 b(In)47
b(this)g(case)g(function)f Fm(hookFinish)h Fp(r)m(eturns)f(that)h
(element;)h(otherwise)e(it)465 5798 y(raises)c(an)f(exception.)714
5998 y(The)50 b(thr)m(ee)f(functions)i Fm(hookData)o
Fp(,)i Fm(hookCData)d Fp(and)h Fm(hookCharRef)f Fp(add)i(the)e(piece)f
(of)465 6197 y(text)60 b(r)m(eported)g(to)g(them)g(to)h(the)f(content)g
(of)h(the)f(curr)m(ent)f(element.)108 b(The)60 b(hook)g(for)g(a)465
6396 y(start-tag)45 b(pushes)f(that)h(tag)f(together)f(with)i(the)f
(content)e(of)j(the)e(curr)m(ent)g(element)f(onto)465
6595 y(the)52 b(stack.)85 b(The)52 b(element)f(started)i(by)f(that)h
(tag)g(now)g(becomes)e(the)h(curr)m(ent)f(element.)465
6795 y(Function)62 b Fm(hookEndT)-18 b(ag)63 b Fp(r)m(everses)e(the)g
(content)g(of)i(the)e(curr)m(ent)g(element)g(in)h(or)m(der)g(to)465
6994 y(compensate)35 b(for)i(the)f(r)m(eversing)g(ef)m(fect)h(of)g
(accumulation.)49 b(Its)37 b(tag)h(is)g(popped)e(fr)m(om)g(the)465
7193 y(stack)51 b(and)h(combined)d(with)j(its)g(content.)77
b(The)50 b(constr)o(ucted)g(tr)m(ee)g(is)h(then)f(pr)m(epended)465
7392 y(to)41 b(the)g(content)f(of)h(the)g(par)m(ent)f(element)g(which)i
(now)f(becomes)f(the)g(curr)m(ent)g(element.)465 7964
y Fg(4)239 b(Other)59 b(examples)465 8342 y Fo(fxp)52
b Fp(comes)e(with)i(a)g(few)g(application)f(examples)g(which)g(r)m
(esides)h(in)f(the)g Fe(src/Apps)e Fp(di-)465 8541 y(r)m(ectory)-18
b(.)48 b(The)36 b(inter)m(ested)h(r)m(eader)g(can)g(\002nd)h(ther)m(e)e
(further)g(examples)h(of)g(using)i(the)e Fo(fxp)p Fp('s)465
8741 y(pr)m(ogramming)j(interface.)465 9312 y Fg(References)548
9690 y Fp([1])84 b(John)57 b(Cowan)g(and)h(David)g(Megginson,)k
(editors.)c(XML)g(Information)e(Set.W3C)825 9890 y(W)-15
b(orking)73 b(Draft,)82 b(W)-15 b(orld)74 b(W)-9 b(ide)74
b(W)-15 b(eb)73 b(Consortium,)81 b(May)74 b(1999.)g(A)-15
b(vailable)74 b(at)825 10089 y(http://www)-15 b(.w3.or)m
(g/TR/1998/WD-xml-infoset-19990517)3288 11324 y(6)p eop
%%Page: 7 7
7 6 bop 1312 1396 5729 7 v 1312 10036 7 8641 v 1445 1657
a Fl(structure)42 b Fm(T)-18 b(reeData)60 b Fj(=)1659
1839 y Fl(struct)1874 2022 y(e)n(xception)41 b Fm(IllF)l(or)t(med)1874
2283 y Fl(type)g Fm(T)-18 b(ag)61 b Fj(=)f Fm(int)g Fi(\003)f
Fm(HookData.AttSpecList)1874 2465 y Fl(datatype)42 b
Fm(T)-18 b(ree)60 b Fj(=)g Fm(TEXT)41 b Fl(of)h Fm(UniChar)-7
b(.V)-12 b(ector)2940 2648 y Fi(j)77 b Fm(ELEM)41 b Fl(of)h
Fm(T)-18 b(ag)60 b Fi(\003)g Fm(Content)1874 2831 y Fl(withtype)41
b Fm(Content)60 b Fj(=)f Fm(T)-18 b(ree)42 b(list)1659
3013 y Fl(end)1445 3457 y(structure)g Fm(T)-18 b(reeHooks)60
b Fj(=)1659 3639 y Fl(struct)1874 3822 y(open)42 b Fm(IgnoreHooks)f(T)
-18 b(reeData)41 b(UniChar)1874 4083 y Fl(type)g Fm(AppData)60
b Fj(=)g Fm(Content)f Fi(\003)h Fm(\(T)-18 b(ag)60 b
Fi(\003)g Fm(Content\))40 b(list)1874 4265 y Fl(type)h
Fm(AppFinal)61 b Fj(=)f Fm(T)-18 b(ree)1874 4526 y Fl(v)m(al)41
b Fm(appStar)6 b(t)59 b Fj(=)g Fm(\(nil,nil\))1874 4787
y Fl(fun)41 b Fm(hookStar)6 b(tT)-18 b(ag)41 b(\(\(content,stac)m(k\),)
d(\()p 4234 4811 79 6 v 78 w(,elem,atts,)p 5003 4811
V 76 w(,empty\)\))60 b Fj(=)2088 4970 y Fl(if)41 b Fm(empty)g
Fl(then)h Fm(\(ELEM)g(\(\(elem,atts\),nil\))17 b(::)g(content,stac)m
(k\))2088 5152 y Fl(else)41 b Fm(\(nil,\(\(elem,atts\),content\))17
b(::)g(stac)m(k\))1874 5413 y Fl(fun)41 b Fm(hookEndT)-18
b(ag)42 b(\(\()p 3118 5437 V 78 w(,nil\),)p 3479 5437
V 77 w(\))60 b Fj(=)g Fl(raise)42 b Fm(IllF)l(or)t(med)2028
5596 y Fi(j)76 b Fm(hookEndT)-18 b(ag)42 b
(\(\(content,\(tag,content'\))17 b(::)g(stac)m(k\),)p
5082 5620 V 71 w(\))60 b Fj(=)2088 5778 y Fm(\(ELEM)41
b(\(tag,re)l(v)f(content\))17 b(::)g(content',stac)m(k\))1874
6039 y Fl(fun)41 b Fm(hookData)h(\(\(content,stac)m(k\),\()p
3951 6063 V 74 w(,v)l(ec,)p 4338 6063 V 77 w(\)\))60
b Fj(=)2147 6222 y Fm(\(TEXT)42 b(v)l(ec)17 b(::)g(content,stac)m(k\))
1874 6405 y Fl(fun)41 b Fm(hookCData)g(\(\(content,stac)m(k\),\()p
4058 6429 V 75 w(,v)l(ec\)\))59 b Fj(=)2147 6587 y Fm(\(TEXT)42
b(v)l(ec)17 b(::)g(content,stac)m(k\))1874 6770 y Fl(fun)41
b Fm(hookCharRef)g(\(\(content,stac)m(k\),\()p 4191 6794
V 75 w(,c,)p 4425 6794 V 77 w(\)\))59 b Fj(=)2147 6953
y Fm(\(TEXT\(Data2V)-12 b(ector)40 b([c]\))17 b(::)g(content,stac)m
(k\))1874 7213 y Fl(fun)41 b Fm(hookFinish)i(\([elem],nil\))59
b Fj(=)h Fm(elem)2028 7396 y Fi(j)76 b Fm(hookFinish)p
2912 7420 V 181 w Fj(=)59 b Fl(raise)42 b Fm(IllF)l(or)t(med)1659
7579 y Fl(end)1445 8022 y(structure)g Fm(P)-6 b(arseT)-18
b(ree)41 b(:)1659 8205 y Fl(sig)1874 8387 y(v)m(al)g
Fm(parseT)-18 b(ree)41 b(:)51 b(Ur)r(i.Ur)r(i)42 b(option)60
b Fi(!)f Fm(Dtd.Dtd)40 b(option)61 b Fi(!)e Fm(T)-18
b(reeData.T)g(ree)1659 8570 y Fl(end)1462 8753 y Fj(=)76
b Fl(struct)1874 8935 y(structure)42 b Fm(P)-6 b(arser)59
b Fj(=)h Fm(P)-6 b(arse)41 b(\()g Fl(structure)h Fm(Dtd)363
b Fj(=)60 b Fm(Dtd)3755 9118 y Fl(structure)42 b Fm(Hooks)172
b Fj(=)60 b Fm(T)-18 b(reeHooks)3755 9301 y Fl(structure)42
b Fm(P)-6 b(arserOptions)59 b Fj(=)h Fm(P)-6 b(arserOptions)41
b(\(\))3755 9483 y Fl(structure)h Fm(Resolv)l(e)60 b
Fj(=)g Fm(Resolv)l(eNull\))1874 9666 y Fl(fun)41 b Fm(parseT)-18
b(ree)42 b(ur)r(i)g(dtd)60 b Fj(=)f Fm(P)-6 b(arser)f(.parseDocument)40
b(ur)r(i)i(dtd)f(T)-18 b(reeHooks)n(.appStar)6 b(t)1659
9849 y Fl(end)p 7034 10036 7 8641 v 1312 10043 5729 7
v 2119 10247 a Fh(Figure)43 b(3:)e Fp(A)g(simple)h(tr)m(ee-based)f
(interface)f(on)h(top)g(of)g(hooks.)p 1312 10333 V 4134
11324 a(7)p eop
%%Trailer
end
userdict /end-hook known{end-hook}if
%%EOF

BIN
fxp/doc/exa-vcg-1.gif Normal file

Binary file not shown.

Size: 7.0 KiB

BIN
fxp/doc/exa-vcg-2.gif Normal file

Binary file not shown.

Size: 5.8 KiB

BIN
fxp/doc/exa-vcg-3.gif Normal file

Binary file not shown.

Size: 2.2 KiB

BIN
fxp/doc/exa-vcg-4.gif Normal file

Binary file not shown.

Size: 4.8 KiB

BIN
fxp/doc/exa-vcg-5.gif Normal file

Binary file not shown.

Size: 4.4 KiB

BIN
fxp/doc/exa-vcg-6.gif Normal file

Binary file not shown.

Size: 5.6 KiB

387
fxp/doc/features.html Normal file

@ -0,0 +1,387 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - Features</title>
<!-- DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN" -->
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
Features</h1>
<img SRC="shadow.jpg" ALT="----------------" >
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#UNI">Unicode Support</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#CAT">Catalog Support</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h1>
<a NAME="UNI"></a>Unicode Support</h1>
<i>fxp</i> has full support for Unicode and auto-detection of encoding
of external XML entities. The&nbsp;<a NAME="ENC"></a>supported encodings
are currently:
<table WIDTH="90%" >
<tr VALIGN=TOP>
<th>Encoding&nbsp;</th>
<th ALIGN=LEFT>Other recognized names&nbsp;</th>
</tr>
<tr VALIGN=TOP>
<td><tt>ASCII</tt></td>
<td><tt>ANSI_X3.4-1968</tt>, <tt>ANSI_X3.4-1986</tt>, <tt>US-ASCII</tt>,
<tt>US</tt>, <tt>ISO646-US</tt>, <tt>ISO-IR-6</tt>, <tt>ISO_646.IRV:1991</tt>,
<tt>IBM367</tt> and <tt>CP367</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>EBCDIC</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>LATIN1</tt></td>
<td><tt>ISO_8859-1:1987</tt>, <tt>ISO-8859-1</tt>, <tt>ISO_8859-1</tt>,
<tt>ISO-IR-100</tt>, <tt>CP819</tt>, <tt>IBM819</tt>, <tt>L1</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>UCS-4</tt></td>
<td><tt>ISO-10646-UCS-4</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>UCS-2</tt></td>
<td><tt>ISO-10646-UCS-2</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>UTF-16</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>UTF-8</tt></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h1>
<a NAME="CAT"></a>Catalog Support</h1>
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#CAT-OVER">Catalogs</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#CAT-EXA">Options by Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#CAT-OPT">Summary of Options</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="CAT-OVER"></a>Catalogs</h2>
<i>fxp</i> supports the Socat syntax of <a href="http://www.ccil.org/~cowan/XML/XCatalog.html">XML
Catalog</a>. Catalogs are used for generating system identifiers from public
identifiers (mapping), or for substituting system identifiers by other
system identifiers (remapping). Catalogs come in two syntaxes: the Socat
syntax is a subset of a catalog syntax used for SGML; the XML syntax is
an XML document instance.
<h4>
Syntax</h4>
There are five kinds of entries in a catalog:
<table WIDTH="90%" >
<tr VALIGN=TOP>
<th ALIGN=LEFT>Type&nbsp;</th>
<th ALIGN=LEFT>Socat/XML syntax&nbsp;</th>
<th ALIGN=LEFT>Meaning&nbsp;</th>
</tr>
<tr VALIGN=TOP>
<td>base&nbsp;</td>
<td><tt>BASE</tt> <i>uri</i>
<br><tt>&lt;Base HRef="</tt><i>uri</i><tt>"></tt></td>
<td>Specifies a URI to be used as a base for succeeding relative URIs.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td>extend&nbsp;</td>
<td><tt>CATALOG</tt> <i>uri</i>
<br><tt>&lt;Extend HRef="</tt><i>uri</i><tt>"></tt></td>
<td>Indicates an alternative catalog to be searched if the current catalog
does not contain a matching entry.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td>delegate&nbsp;</td>
<td><tt>DELEGATE</tt> <i>prefix uri</i>
<br><tt>&lt;Delegate PublicId="</tt><i>prefix</i><tt>" HRef="</tt><i>uri</i><tt>"></tt></td>
<td>Specifies an alternative catalog, but only for public identifiers beginning
with <i>prefix</i>.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td>map&nbsp;</td>
<td><tt>PUBLIC</tt> <i>pubid uri</i>
<br><tt>&lt;Map PublicId="</tt><i>pubid</i><tt>" HRef="</tt><i>uri</i><tt>"></tt></td>
<td>Maps a public identifier to a URI.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td>remap&nbsp;</td>
<td><tt>SYSTEM</tt> <i>src dst</i>
<br><tt>&lt;Remap SystemId="</tt><i>src</i><tt>" HRef="</tt><i>dst</i><tt>"></tt></td>
<td>Indicates that URI <i>dst</i> shall be used in the place of the source
URI <i>src</i>.&nbsp;</td>
</tr>
</table>
<p>If the XML syntax is used, the catalog is parsed in non-validating mode
and everything except for the start-tags of the above five elements is
ignored. It is recommended, however, that the catalog be a valid XML document
with a document type similar to <a href="Examples/xmlcat.dtd">this</a>.
<p>Relative URIs are treated as relative to the catalog in which they appear,
or if there was a preceding base entry, relative to the URI of that entry.
The only exception is that the <i>src</i> URI in a remap entry must be
matched exactly, ignoring any specified base.
<h4>
Example in Socat Syntax</h4>
If a catalog's file name ends in <tt>.SOC</tt> or <tt>.soc</tt>, <i>fxp</i>
assumes it is in Socat syntax, e.g.:
<blockquote>
<pre>BASE&nbsp;&nbsp;&nbsp;&nbsp; "/pub/dtd/w3c/"
PUBLIC&nbsp;&nbsp; "-//W3C//DTD Specification::19980910//EN" "spec.dtd"
SYSTEM&nbsp;&nbsp; "spec.dtd" "xmlspec.dtd"
DELEGATE "ISO" "/pub/dtd/iso/iso.soc"
CATALOG&nbsp; "/pub/entities/ent.soc"
PUBLIC&nbsp;&nbsp; "ISO 8879:1986//ENTITIES Added Latin 1//EN" "/pub/iso/lat1.ent"
SYSTEM&nbsp;&nbsp; "isolat1.ent" "latin1.ent"</pre>
</blockquote>
<h4>
Example in XML Syntax</h4>
For XML syntax, the catalog must be a well-formed, but not necessarily
valid XML document. I.e., if the catalog has more than one entry, there
must be a single root element containing all the entries. All textual
data and elements other than the five catalog entries are ignored.
<blockquote>
<pre>&lt;Catalog>
&nbsp; &lt;Base HRef="/pub/dtd/w3c/"/>
&nbsp; &lt;Map&nbsp; PublicId="-//W3C//DTD Specification::19980910//EN" HRef="spec.dtd"/>
&nbsp; &lt;Remap SystemId="spec.dtd" HRef="xmlspec.dtd"/>
&nbsp; &lt;Delegate PublicId="ISO" HRef="/pub/dtd/iso/iso.soc"/>
&nbsp; &lt;Extend HRef="/pub/entities/ent.soc"/>
&nbsp; &lt;Map PublicId="ISO 8879:1986//ENTITIES Added Latin 1//EN" HRef="/pub/iso/lat1.ent"/>
&nbsp; &lt;Remap SystemId="isolat1.ent" HRef="latin1.ent"/>
&lt;/Catalog></pre>
</blockquote>
<h4>
Search Order</h4>
The search order is breadth-first, i.e., a matching map or remap entry
is always preferred to a matching entry in an alternative catalog specified
by a preceding delegate or extend entry. E.g., in the example above the
public identifier <tt>"ISO 8879:1986//ENTITIES Added Latin 1//EN"</tt>
is mapped to <tt>/pub/iso/lat1.ent</tt> even if the catalog <tt>/pub/entities/ent.soc</tt>
contains a matching entry for it.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="CAT-EXA"></a>Catalog Options by Example</h2>
<h4>
Catalog Search Path</h4>
A catalog to be used for resolving can be specified with the <tt>--catalog</tt>
option. Repeating this option several times is equivalent to concatenating
all specified catalogs into one. Note that, e.g., a matching entry in the
second catalog overrides a match in a catalog specified in a delegate or
extend entry in the first one: suppose that <tt>iso.soc</tt> contains the
line
<blockquote>
<pre>DELEGATE "ISO 8879:1986//ENTITIES" "8879.soc"</pre>
</blockquote>
<tt>8879.soc</tt> contains
<blockquote>
<pre>PUBLIC&nbsp;&nbsp; "ISO 8879:1986//ENTITIES Added Latin 1//EN" "/pub/iso/lat1.ent"</pre>
</blockquote>
and <tt>ents.soc</tt> contains
<blockquote>
<pre>PUBLIC&nbsp;&nbsp; "ISO 8879:1986//ENTITIES Added Latin 1//EN" "isolat1.ent"</pre>
</blockquote>
Specifying <tt>--catalog=iso.soc --catalog=ents.soc</tt> makes <tt>"ISO
8879:1986//ENTITIES Added Latin 1//EN"</tt> resolve to <tt>isolat1.ent</tt>,
and not to <tt>/pub/iso/lat1.ent</tt>.
<h4>
Resolving Strategy</h4>
A catalog may be used for several reasons: as a fall-back, i.e., for generating
system identifiers if the information in the XML document itself is not
sufficient; or as the default, overriding the system identifiers specified
in the DTD. By default, <i>fxp</i> tries to resolve an external identifier
as follows:
<ol>
<li>
if a public identifier is present, <i>fxp</i> tries to map it to a system
identifier using the catalog; if this fails or no public identifier was
given, the declared system identifier is used;</li>
<li>
<i>fxp</i> then tries to remap the system identifier obtained in step 1 by
means of a matching catalog entry.</li>
</ol>
This strategy can be changed with the <tt>--catalog-priority</tt> option. This option
takes one of the following arguments:
<table WIDTH="90%" >
<tr VALIGN=TOP>
<td><tt>map</tt></td>
<td>the default behaviour: first map the public identifier (step 1), then
remap the resulting system identifier (step 2).&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>remap</tt></td>
<td>first try to remap the declared system identifier; only if that fails
proceed with step 1.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>sys</tt></td>
<td>if a system identifier is given, don't consider the catalog at all;
if there is no system identifier, proceed to steps 1 and 2. Note that in
well-formed documents an external identifier must always contain a system
identifier. Therefore this applies only to external identifiers declared
for notations.&nbsp;</td>
</tr>
</table>
<p>E.g., suppose you have the following declarations in the DTD:
<blockquote>
<pre>&lt;ENTITY % isolat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "isolat1.ent">
&lt;NOTATION ps PUBLIC "PostScript Level 3"></pre>
</blockquote>
By default, the external identifier for <tt>isolat1</tt> is mapped to <tt>/pub/iso/lat1.ent</tt>.
With <tt>--catalog-priority=remap</tt> remapping of the declared system
identifier comes first and yields <tt>latin1.ent</tt> (which is modified
to <tt>/pub/dtd/w3c/latin1.ent</tt> due to the base entry in the catalog's
first line). Giving option <tt>--catalog-priority=sys</tt> totally disables
the catalog for this external identifier because it has a system identifier.
For notation <tt>ps</tt>, however, the catalog is still consulted because
its declaration lacks a system identifier.
<p>Since remapping should be used with caution in publicly available catalogs
it can be disabled with <tt>--catalog-remap=no</tt>. E.g., resolving public
identifier <tt>"-//W3C//DTD Specification::19980910//EN"</tt> first results
in the URI <tt>spec.dtd</tt>. By default, this is remapped to <tt>xmlspec.dtd</tt>,
but with <tt>--catalog-remap=no</tt> it is returned as is.
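<p>As a sketch of how these options look on the command line, assume that
the example catalog from above is stored in a file <tt>w3c.soc</tt> and that
the document is called <tt>doc.xml</tt>. Both names are made up for
illustration, and passing the document as the last argument is an assumption
about the invocation:
<blockquote>
<pre>fxp --catalog=w3c.soc --catalog-priority=remap doc.xml
fxp --catalog=w3c.soc --catalog-remap=no doc.xml</pre>
</blockquote>
The first call tries to remap a declared system identifier before any public
identifier is mapped; the second still maps public identifiers but never
remaps the resulting system identifier.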
<h4>
Catalog syntax and encoding</h4>
A catalog is used for resolving system identifiers in XML documents. A
system identifier is a URI and may, according to RFC 2396, only contain
ASCII characters. Due to an inaccuracy in the XML recommendation, however,
arbitrary Unicode characters may occur in system identifiers. Since system
identifiers in catalogs are matched literally, it is desirable to specify
them identically both in the catalog and in the XML document. Therefore
catalogs are Unicode documents and can be written in all encodings supported
for XML documents. Though XML recommends encoding non-ASCII characters
in system identifiers in UTF-8 and escaping the resulting bytes in the
URI, matching of system identifiers in catalogs is performed on the Unicode
representation. Therefore, system identifier <tt>"entit&eacute;"</tt> does
not match <tt>"entit%C3%A9"</tt>, though both decode to the same URI.
<p>Catalogs in Socat syntax, however, have no encoding declaration. Therefore
<i>fxp</i> only checks for a byte-order mark at the beginning of a catalog
in order to auto-detect a UTF-16 encoding. If it doesn't find one it assumes
a default encoding. Because catalogs are usually written by hand, this
is by default LATIN1. The <tt>--catalog-encoding</tt> option tells <i>fxp</i>
to use another default encoding.
<p><i>fxp</i> tries to guess the syntax of a catalog from the suffix
of its file name. A suffix of <tt>.soc</tt> or <tt>.SOC</tt> indicates
Socat syntax, whereas for suffixes <tt>.xml</tt> and <tt>.XML</tt>
the XML syntax is chosen. For files having none of these suffixes, <i>fxp</i>
assumes XML syntax. This can be changed with <tt>--catalog-syntax=soc</tt>.
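<p>For example, a catalog whose file name has none of these suffixes could be
read as a UTF-8 encoded Socat catalog as follows. This is only a sketch: the
names <tt>entities.cat</tt> and <tt>doc.xml</tt> are made up, and passing the
document as the last argument is an assumption about the invocation.
<blockquote>
<pre>fxp --catalog=entities.cat --catalog-syntax=soc --catalog-encoding=UTF-8 doc.xml</pre>
</blockquote>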
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="CAT-OPT"></a>Summary of Catalog Options</h2>
<dl>
<dt>
<tt>-C uri</tt></dt>
<dt>
<tt>--catalog=uri</tt></dt>
<dd>
Use <tt>uri</tt> as a catalog. Several catalogs can be specified by repeating
this option.</dd>
<dt>
<tt>--catalog-syntax=(soc|xml)</tt></dt>
<dd>
For catalogs with unknown suffix, specifies whether to assume Socat syntax
or XML syntax. Defaults to <tt>xml</tt>.</dd>
<dt>
<tt>--catalog-encoding=enc</tt></dt>
<dd>
Use encoding <tt>enc</tt> for reading a catalog unless it starts with a
byte order mark. <tt>enc</tt> must be a <a href="#ENC">supported</a> encoding.
Defaults to <tt>LATIN1</tt>.</dd>
<dt>
<tt>--catalog-remap=[(yes|no)]</tt></dt>
<dd>
Turn on or off support for remapping system identifiers. Defaults to <tt>yes</tt>.</dd>
<dt>
<tt>--catalog-priority=(map|remap|sys)</tt></dt>
<dd>
Controls the resolving strategy in catalogs. <tt>map</tt> means that mapping
the public identifier has highest priority; <tt>remap</tt> means that remapping
the system identifier comes first; <tt>sys</tt> means that the catalog
is used only if no system identifier is present. Defaults to <tt>map</tt>.</dd>
</dl>
<img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

40
fxp/doc/fxcanon.html Normal file

@ -0,0 +1,40 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - The Program fxcanon</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
The Program <i>fxcanon</i></h1>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="DESC"></a>Description</h2>
<i>fxcanon</i> is a validating XML processor. It reads an XML document
and produces an equivalent canonical document. <a href="http://www.jclark.com/xml/canonxml.html">Canonical
XML</a> was invented by <a href="http://www.jclark.com">James Clark</a>
for testing XML parsers. It contains only the information a processor is
required to pass to the application. <i>fxcanon</i> understands all options
documented for <i><a href="fxp.html#OPT">fxp</a></i>; additionally, the
following options are available:
<dl>
<dt>
<tt>-o fname</tt></dt>
<dt>
<tt>--output=fname</tt></dt>
<dd>
Write all output, except for errors and warnings, to the file named <tt>fname</tt>.
If <tt>fname</tt> is <tt>-</tt>, the standard output is used. Defaults
to <tt>-</tt>.</dd>
</dl>
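<p>For instance, assuming that <i>fxcanon</i> takes the input document as its
last argument (this page does not spell out the invocation, and the file
names are made up), a call might look like this:
<blockquote>
<pre>fxcanon -o canon.xml input.xml</pre>
</blockquote>
This would write the canonical form of <tt>input.xml</tt> to <tt>canon.xml</tt>,
while errors and warnings are not written to that file.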
<img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

485
fxp/doc/fxcopy.html Normal file

@ -0,0 +1,485 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - The Program fxcopy</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
The Program <i>fxcopy</i></h1>
<img SRC="shadow.jpg" ALT="----------------" >
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#DESC">Description</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA">Options by Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OPT">Summary of Options</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="DESC"></a>Description</h2>
<i>fxcopy</i> is a validating XML processor. It reads an XML document and
produces a copy of it. This copy can be in a different encoding, and can
be normalized in several ways by, e.g., expanding entity references.
<p>The typical invocation of <i>fxcopy</i> is
<blockquote>
<pre>fxcopy [option ...] [infile]</pre>
</blockquote>
If <tt>infile</tt> is given, <i>fxcopy</i> reads its input document from
that file, otherwise <i>fxcopy</i> reads from standard input.
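<p>A minimal illustration of both forms, using ordinary shell redirection (the
file names <tt>input.xml</tt> and <tt>copy.xml</tt> are made up):
<blockquote>
<pre>fxcopy input.xml &gt; copy.xml
fxcopy &lt; input.xml &gt; copy.xml</pre>
</blockquote>
In the first line <i>fxcopy</i> opens <tt>input.xml</tt> itself; in the second
it reads the document from standard input. In both cases the shell redirects
whatever <i>fxcopy</i> writes to standard output into <tt>copy.xml</tt>.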
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="EXA"></a>Options by Example</h2>
In addition to the options of <i><a href="fxp.html#EXA">fxp</a></i>, <i>fxcopy</i>
accepts arguments in the following two areas:&nbsp;<!--
<I>fxcopy</I> understands all options documented for
<A HREF="fxp.html#EXA"><I>fxp</I></A>; the additional options
are described now.
-->
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-OUT">Controlling Output</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-REF">Expansion of References</a> in the <a href="#EXP-INST">Document
Instance</a> and in the <a href="#EXP-SUB">Declaration Subset</a></td>
</tr>
</table>
<h4>
<a NAME="EXA-OUT"></a>Controlling Output</h4>
By default, <i>fxcopy</i> writes to standard output, in the same encoding
as the input document. This can be changed by the following options:
<ul>
<li>
The output can be redirected to a file named <tt>outfile</tt> via the option
<tt>--output=outfile</tt> or, for short, <tt>-o outfile</tt>.</li>
<li>
If an output encoding different from the input encoding is desired, use
the <tt>--output-encoding=enc</tt> option, where <tt>enc</tt> is one of
<i>fxcopy</i>'s <a href="features.html#ENC">supported</a> encodings.</li>
</ul>
For instance,
<blockquote>
<pre>fxcopy -o output.utf8 --output-encoding=UTF-8 input.ascii</pre>
</blockquote>
recodes the file <tt>input.ascii</tt> to UTF-8 and writes it to the file
<tt>output.utf8</tt>.
<h4>
<a NAME="EXA-REF"></a>Expansion of References</h4>
By default <i>fxcopy</i> produces a document that is for the most part
identical to the input, i.e.,
<ul>
<li>
all character references and general entity references in the document
instance are preserved as far as the output encoding admits;</li>
<li>
attribute values are reproduced literally, i.e., without being normalized
and without replacing entity references;</li>
<li>
the internal subset of the DTD is reproduced in an equivalent form; it
is not reproduced literally in that white space is not preserved but normalized
to a canonical form;</li>
<li>
entity values in entity declarations are reproduced literally, i.e., without
replacing entity references;</li>
<li>
the external subset is not copied to the output, but the external identifier
of the document type declaration is preserved.</li>
</ul>
This behavior can be affected by options.
<h5>
<a NAME="EXP-INST"></a>Reference Expansion in the Document Instance</h5>
Expansion of references in content can be controlled by the <tt>--expand-ref-content</tt>
option. It takes as its argument a list of keywords, each specifying a class
of references to expand, where
<blockquote>&nbsp;
<table CELLSPACING=0 CELLPADDING=0 >
<tr VALIGN=TOP>
<td><tt>char</tt></td>
<td>means that a character reference shall be replaced by the described
character unless that character cannot be represented directly in the output
encoding.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>int</tt></td>
<td>means that references to internal general entities shall be substituted
with their replacement text, unless the entity is undeclared (which may
only happen in non-validating mode).&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>ext</tt></td>
<td>means that references to external parsed entities shall be substituted by
the content of the file they point to, unless the entity is undeclared
(which may only happen in non-validating mode).&nbsp;</td>
</tr>
</table>
</blockquote>
Alternatively, we can use <tt>--expand-ref-content</tt> for specifying
all of the above.
<p>The second place within the document instance where references can occur
is attribute values. Furthermore, attribute values are normalized according
to their attribute type after replacement of references. By default, <i>fxcopy</i>
reproduces attribute values literally. Given the <tt>--expand-att-vals</tt>
option, it outputs the normalized value instead.
<p>As an example for expansion in the document instance assume the following
declarations in the DTD:
<blockquote>
<pre>&lt;!ENTITY q "quote sign">
&lt;!ENTITY int "internal entity">
&lt;!ENTITY ext SYSTEM "ext.ent">
&lt;!ATTLIST a x NMTOKENS #IMPLIED
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; y CDATA&nbsp;&nbsp;&nbsp; #IMPLIED></pre>
</blockquote>
and the content of the file <tt>ext.ent</tt> is the string "<tt>external
entity</tt>". Let us consider the following document fragment:
<blockquote>
<pre>&lt;a x=" a&nbsp;&nbsp; b " y="two &amp;q;s: &amp;#x27; and &amp;#x22;">&nbsp;&nbsp;&nbsp;&nbsp;
here is a character reference: &amp;#64;
here is an &amp;int;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
here is an &amp;ext;
&lt;/a></pre>
</blockquote>
Running <tt>fxcopy --expand-refs-content=char,int</tt> produces this:
<blockquote>
<pre>&lt;a x=" a&nbsp;&nbsp; b " y="two &amp;q;s: &amp;#x27; and &amp;#x22;">&nbsp;&nbsp;&nbsp;&nbsp;
here is a character reference: @
here is an internal entity&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
here is an &amp;ext;
&lt;/a></pre>
</blockquote>
whereas <tt>fxcopy --expand-refs-content=ext --expand-att-vals</tt> yields
<blockquote>
<pre>&lt;a x="a b" y="two quote signs: ' and &amp;#x22;">
here is a character reference: &amp;#64;
here is an &amp;int;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
here is an external entity
&lt;/a></pre>
</blockquote>
Note that the <tt>&amp;#x22;</tt> in the attribute value is not replaced
by the <tt>"</tt> sign because then it would be recognized as the end of
the attribute value literal.
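<p>The white space handling that turns <tt>x=" a&nbsp;&nbsp; b "</tt> into
<tt>x="a b"</tt> can be sketched in Python. This is a hedged illustration of
attribute-value normalization as described in the XML 1.0 recommendation, not
code taken from <i>fxp</i>; the name <tt>normalize_att_value</tt> is
hypothetical, and the replacement of references, which belongs to the same
step, is left out.
<blockquote>
<pre>
```python
def normalize_att_value(value, att_type="CDATA"):
    """Normalize the white space of an attribute value (references ignored)."""
    # All types: every white space character becomes a single space.
    for ws in ("\r\n", "\t", "\n", "\r"):
        value = value.replace(ws, " ")
    if att_type == "CDATA":
        return value
    # Other types: drop leading and trailing spaces, collapse runs of spaces.
    return " ".join(part for part in value.split(" ") if part)
```
</pre>
</blockquote>
Attribute <tt>y</tt> of the example is declared <tt>CDATA</tt>, so its spaces
would be kept as they are.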
<h5>
<a NAME="EXP-SUB"></a>Reference Expansion in the Declaration Subset</h5>
Normally <i>fxcopy</i> reproduces only the internal subset of the document
type, preserving all references to parameter entities. This behavior can
be changed with the <tt>--expand-refs-subset</tt> option. Its argument
indicates which references shall be substituted by their replacement text:
<blockquote>&nbsp;
<table CELLSPACING=0 CELLPADDING=0 >
<tr VALIGN=TOP>
<td><tt>int</tt></td>
<td>Expand all references to internal parameter entities.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>ext</tt></td>
<td>Replace all references to external parameter entities with the content
of the file they point to. Note that this option implies <tt><a href="#EXP-ENT">--expand-ent-vals</a></tt>
in order to ensure well-formedness.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>yes</tt></td>
<td>Expand references to internal and external parameter entities. <tt>--expand-refs-subset</tt>
without an argument is equivalent to <tt>--expand-refs-subset=yes</tt>.</td>
</tr>
<tr VALIGN=TOP>
<td><tt>no</tt></td>
<td>Expand no parameter entity references at all.&nbsp;</td>
</tr>
</table>
</blockquote>
This applies to references occurring where a declaration could occur. It
does not affect references within declarations, which are expanded regardless
of options.
<p>The external subset can be viewed as a special reference. The <tt>--expand-ext-subset</tt>
option makes <i>fxcopy</i> drop the external identifier from the document
type declaration, and copy the content of the file it denotes to the end
of the internal subset. Like <tt>--expand-refs-subset=ext</tt>, this option
implies <tt><a href="#EXP-ENT">--expand-ent-vals</a></tt>.
<p><a NAME="EXP-ENT"></a>Usually, entity values in entity declarations
are reproduced literally, i.e., without replacement of references. However,
if a declaration is copied from an external entity to the internal subset,
parameter entity references become invalid in the entity value. Therefore,
given the <tt>--expand-ent-vals</tt> option, <i>fxcopy</i> substitutes
the derived entity replacement text for the entity value. This replacement text does not
contain parameter entity references, except for text that merely looks like one because its
%-sign was escaped with a character reference and was therefore never recognized as a
reference by the parser. It uses character references only for characters that cannot
be represented directly.
<p>For instance, consider the document <tt>exa-6.xml</tt>:
<blockquote>
<pre>&lt;?xml version="1.0"?>
&lt;!DOCTYPE exa SYSTEM "exa-6.ext" [
&lt;!ENTITY % int "&lt;!ELEMENT exa ANY>">
&lt;!ENTITY % ext SYSTEM "ext-6.decl">
%int;
%ext;
]>
&lt;exa/></pre>
</blockquote>
where the content of the file <tt>exa-6.ext</tt> is
<blockquote>
<pre>&lt;!ENTITY % vnum "1.0">
&lt;!ENTITY % version "xml version %vnum;"></pre>
</blockquote>
and <tt>ext-6.decl</tt> contains
<blockquote>
<pre>&lt;!NOTATION text SYSTEM "/bin/cat"></pre>
</blockquote>
Running <tt>fxcopy --expand-refs-subset=int exa-6.xml</tt> produces:
<blockquote>
<pre>&lt;?xml version="1.0"?>
&lt;!DOCTYPE exa SYSTEM "exa-6.ext" [
&lt;!ENTITY % int "&lt;!ELEMENT exa ANY>">
&lt;!ENTITY % ext SYSTEM "ext-6.decl">
&lt;!ELEMENT exa ANY>
%ext;
]>
&lt;exa/></pre>
</blockquote>
Note that only the internal reference <tt>%int;</tt> was expanded. On the
other hand, if we run <tt>fxcopy --expand-refs-subset=ext exa-6.xml</tt>
we get:
<blockquote>
<pre>&lt;?xml version="1.0"?>
&lt;!DOCTYPE exa SYSTEM "exa-6.ext" [
&lt;!ENTITY % int "&lt;!ELEMENT exa ANY>">
&lt;!ENTITY % ext SYSTEM "ext-6.decl">
%int;
&lt;!NOTATION text SYSTEM "/bin/cat">
]>
&lt;exa/></pre>
</blockquote>
Finally, using <tt>fxcopy --expand-ext-subset exa-6.xml</tt> yields
<blockquote>
<pre>&lt;?xml version="1.0"?>
&lt;!DOCTYPE exa [
&lt;!ENTITY % int "&lt;!ELEMENT exa ANY>">
&lt;!ENTITY % ext SYSTEM "ext-6.decl">
%int;
%ext;
&lt;!ENTITY % vnum "1.0">
&lt;!ENTITY % version "xml version 1.0">
]>
&lt;exa/></pre>
</blockquote>
Note that the entity value in the last entity declaration has been expanded,
because the <tt>--expand-ent-vals</tt> option was implied by <tt>--expand-ext-subset</tt>.
If we override this with <tt>--expand-ent-vals=no</tt>, the last declaration is copied as
<blockquote>
<pre>&lt;!ENTITY % version "xml version %vnum;"></pre>
</blockquote>
but this is not well-formed:
<blockquote>
<pre>> fxcopy --expand-ext-subset --expand-ent-vals=no exa-6.xml | fxp
[&lt;stdin>:8.33] Error: a parameter entity reference is not allowed in a&nbsp;
&nbsp;&nbsp;&nbsp; declaration in the internal subset.</pre>
</blockquote>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OPT"></a>Summary of Command Line Options</h2>
Each option can be one of:
<ul>
<li>
A file name specifying the input document. Only one input document may
be specified.</li>
<li>
A long option of the form <tt>--key[=arg]</tt></li>
<li>
A short option of the form <tt>-k</tt>, where <tt>k</tt> consists of a single
character. If <tt>k</tt> consists of more than one character, each character
is assumed to be a short option itself (e.g., <tt>-vic</tt> equals <tt>-v
-i -c</tt>).</li>
<li>
A short option with argument of the form <tt>-k arg</tt>, where <tt>k</tt>
consists of a single character.</li>
<li>
A negative short option of the form <tt>-nk</tt>, where <tt>k</tt> consists
of a single character. If <tt>k</tt> consists of more than one character,
each character is assumed to be a negative short option itself (e.g., <tt>-nvic</tt>
equals <tt>-nv -ni -nc</tt>). If <tt>k</tt> is empty, then we have the
(non-negative) short option <tt>-n</tt>.</li>
<li>
The string <tt>--</tt>. This option is ignored, except that all remaining
options are interpreted as file names, whether they start with <tt>-</tt>
or not.</li>
</ul>
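<p>The rules for clustered short options can be pictured with a small Python
sketch. It is an illustration only and not the option parser used by
<i>fxp</i>; the function name <tt>expand_short_options</tt> is hypothetical,
and short options that take an argument (<tt>-k arg</tt>) are not handled.
<blockquote>
<pre>
```python
def expand_short_options(arg):
    """Split a clustered short option such as -vic or -nvic."""
    if not arg.startswith("-") or arg.startswith("--") or len(arg) == 1:
        raise ValueError("not a short option: " + arg)
    letters = arg[1:]
    # A leading n followed by more letters negates each of those letters.
    if letters.startswith("n") and len(letters) != 1:
        return ["-n" + ch for ch in letters[1:]]
    # Otherwise every letter is a (non-negative) short option of its own.
    return ["-" + ch for ch in letters]
```
</pre>
</blockquote>
Note that <tt>-n</tt> on its own falls through to the last line and stays a
plain short option.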
<i>fxcopy</i> understands all options documented for <i><a href="fxp.html#OPT">fxp</a></i>;
additionally, the following options are available:
<dl>
<dt>
<tt>-o fname</tt></dt>
<dt>
<tt>--output=fname</tt></dt>
<dd>
Write all output, except for errors and warnings, to the file named <tt>fname</tt>.
If <tt>fname</tt> is <tt>-</tt>, the standard output is used. The default
is <tt>-</tt>.</dd>
<dt>
<tt>--output-encoding=enc</tt></dt>
<dd>
Use encoding <tt>enc</tt> for generating the output. <tt>enc</tt> must
be a <a href="features.html#ENC">supported</a> encoding. Default is the
encoding of the input document.</dd>
<dt>
<tt>--expand-refs-content[=key]</tt></dt>
<dd>
Controls whether entity references in content are expanded, i.e., included
or preserved as references in the output. <tt>key</tt> is either a comma-separated
list of</dd>
<ul>
<li>
<tt>char</tt> for character references,</li>
<li>
<tt>ext</tt> for references to external parsed entities, and</li>
<li>
<tt>int</tt> for references to internal general entities;</li>
</ul>
or it is <tt>yes</tt> for all or <tt>no</tt> for none of the above. If
no <tt>key</tt> is given, <tt>yes</tt> is assumed. Default is <tt>no</tt>.
<dt>
<tt>--expand-refs-subset[=key]</tt></dt>
<dd>
Controls whether parameter entity references in the internal or external
subset are expanded, i.e., included or preserved as references in the output.
<tt>key</tt> is one of</dd>
<ul>
<li>
<tt>yes</tt> for all references,</li>
<li>
<tt>int</tt> for references to internal entities,</li>
<li>
<tt>ext</tt> for references to external entities; implies <tt>--expand-ent-vals</tt>;
or</li>
<li>
<tt>no</tt> for no references at all</li>
</ul>
to be expanded. If <tt>key</tt> is omitted, <tt>yes</tt> is assumed. Default
is <tt>no</tt>.
<dt>
<tt>--expand-ext-subset[=(yes|no)]</tt></dt>
<dd>
Controls whether the external subset shall be expanded, i.e., appended
to the internal subset of the output while dropping its external identifier
from the document type declaration. <tt>yes</tt> implies <tt>--expand-ent-vals</tt>.
If no argument is given, <tt>yes</tt> is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--expand-att-vals[=(yes|no)]</tt></dt>
<dd>
Controls whether attribute values are reproduced literally or in expanded
form, i.e., with all references expanded and white space normalized according
to the attribute type. If no argument is given, <tt>yes</tt> is assumed.
Default is <tt>no</tt>.</dd>
<dt>
<tt>--expand-ent-vals[=(yes|no)]</tt></dt>
<dd>
Controls whether entity values are reproduced literally or in expanded
form, i.e., with all references expanded. If no argument is given, <tt>yes</tt>
is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--expand=key</tt></dt>
<dd>
Depending on <tt>key</tt>, equivalent to:</dd>
<table>
<tr VALIGN=TOP>
<td><tt>yes</tt>:</td>
<td><tt>--expand-refs-content --expand-refs-subset --expand-ext-subset
--expand-att-vals --expand-ent-vals</tt></td>
<td></td>
</tr>
<tr VALIGN=TOP>
<td><tt>no</tt>:</td>
<td><tt>--expand-refs-content=no --expand-refs-subset=no --expand-ext-subset=no
--expand-att-vals=no --expand-ent-vals=no</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>int</tt>:</td>
<td><tt>--expand-refs-content=char,int --expand-refs-subset=int --expand-ext-subset=no
--expand-att-vals --expand-ent-vals=no</tt></td>
</tr>
<tr VALIGN=TOP>
<td><tt>ext</tt>:</td>
<td><tt>--expand-refs-content=ext --expand-refs-subset=yes --expand-ext-subset
--expand-att-vals=no --expand-ent-vals</tt></td>
</tr>
</table>
</dl>
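<p>The <tt>--expand=key</tt> shorthand can be restated as a lookup table. The
following Python sketch copies the equivalences from the table above; the
names <tt>EXPAND_SHORTHAND</tt> and <tt>expand_shorthand</tt> are hypothetical
and only serve the illustration.
<blockquote>
<pre>
```python
EXPAND_SHORTHAND = {
    "yes": ["--expand-refs-content", "--expand-refs-subset",
            "--expand-ext-subset", "--expand-att-vals", "--expand-ent-vals"],
    "no": ["--expand-refs-content=no", "--expand-refs-subset=no",
           "--expand-ext-subset=no", "--expand-att-vals=no",
           "--expand-ent-vals=no"],
    "int": ["--expand-refs-content=char,int", "--expand-refs-subset=int",
            "--expand-ext-subset=no", "--expand-att-vals",
            "--expand-ent-vals=no"],
    "ext": ["--expand-refs-content=ext", "--expand-refs-subset=yes",
            "--expand-ext-subset", "--expand-att-vals=no",
            "--expand-ent-vals"],
}


def expand_shorthand(args):
    """Replace every --expand=key argument by its documented equivalent."""
    result = []
    for arg in args:
        name, sep, key = arg.partition("=")
        if name == "--expand" and sep:
            result.extend(EXPAND_SHORTHAND[key])
        else:
            result.append(arg)
    return result
```
</pre>
</blockquote>
All other arguments, including file names, are passed through unchanged.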
<img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

334
fxp/doc/fxesis.html Normal file

@ -0,0 +1,334 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - The Program fxesis</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
The Program <i>fxesis</i></h1>
<img SRC="shadow.jpg" ALT="----------------" >
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#DESC">Description</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OUT">Output Format</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OUTEX">Output Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA">Options by Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OPT">Summary of Options</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="DESC"></a>Description</h2>
<i>fxesis</i> is a validating XML processor. It reads an XML document and
produces a textual description of its Element Structure Information Set
(ESIS). The ESIS contains little information about the DTD and no information
about the document's entity structure, but provides all information about
the document's logical (element) structure.
<p>The typical invocation of <i>fxesis</i> is
<blockquote>
<pre>fxesis [option ...] [infile]</pre>
</blockquote>
If <tt>infile</tt> is given, <i>fxesis</i> reads its input document from
that file, otherwise from standard input. By default, it prints its output
to the standard output.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OUT"></a>The Output Format</h2>
The <i>fxesis</i> output is a series of plain text lines. The meaning of
each line is determined by its first character. Some lines, e.g. attribute
specifications, define arguments for a following line. All lines contain
only LATIN1 characters, or, if the <tt><a href="#EXA-ENC">--ascii</a></tt>
option was given, ASCII characters. In order to print other characters
<i>fxesis</i> uses escape sequences with the following meaning:
<blockquote>&nbsp;
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td NOWRAP><tt>\\</tt></td>
<td>the character '<tt>\</tt>';&nbsp;</td>
</tr>
<tr>
<td NOWRAP><tt>\n</tt></td>
<td>a newline character;&nbsp;</td>
</tr>
<tr>
<td NOWRAP><tt>\t</tt></td>
<td>a tab character;&nbsp;</td>
</tr>
<tr>
<td NOWRAP><tt>\U+<i>hex</i>;</tt></td>
<td>the Unicode character whose hexadecimal code is <i>hex</i>.&nbsp;</td>
</tr>
</table>
</blockquote>
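<p>The escaping rules in the table above can be sketched in Python as follows.
This is an illustration under stated assumptions, not the code used by
<i>fxesis</i>: the function name <tt>esis_escape</tt> and the parameter
<tt>ascii_only</tt> (standing in for the <tt>--ascii</tt> option) are
hypothetical, and control characters other than newline and tab are passed
through unchanged.
<blockquote>
<pre>
```python
def esis_escape(text, ascii_only=False):
    """Escape text for an ESIS output line."""
    # Highest code point that may be printed directly.
    limit = 0x7F if ascii_only else 0xFF
    out = []
    for ch in text:
        if ch == "\\":
            out.append("\\\\")
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\t":
            out.append("\\t")
        elif ord(ch) > limit:
            # Hexadecimal code, written in lower case as in the examples.
            out.append("\\U+" + format(ord(ch), "x") + ";")
        else:
            out.append(ch)
    return "".join(out)
```
</pre>
</blockquote>
With <tt>ascii_only</tt> set, the character &ouml; (code f6) comes out as
<tt>\U+f6;</tt>; without it, the character is printed directly.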
The following output lines can appear:
<blockquote>&nbsp;
<table>
<tr VALIGN=TOP>
<td NOWRAP><tt>-</tt><i>data</i></td>
<td>A sequence of data characters, including newlines. The data need not
have been contiguous in the input document, but may have consisted of a
series of data characters, CDATA sections and character references, interspersed
with comments.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>(</tt><i>elem</i></td>
<td>The start of an element of type <i>elem</i>. Preceded by an <tt>A</tt>
line for each of its attributes.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>)</tt><i>elem</i></td>
<td>The end of an element of type <i>elem</i>.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>A</tt><i>att</i> <i>value</i></td>
<td>A specification of attribute <i>att</i> for a following <tt>(</tt>
(element-start) line. <i>value</i> is one of:&nbsp;
<table CELLSPACING=0 CELLPADDING=0 >
<tr VALIGN=TOP>
<td NOWRAP><tt>IMPLIED</tt></td>
<td>The attribute value was implied. This is used in validating mode
only.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>CDATA </tt><i>data</i></td>
<td>The attribute was declared <tt>CDATA</tt>; its value is <i>data</i>.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>NOTATION </tt><i>name</i></td>
<td>A notation attribute with value <i>name</i>; that notation was defined
in a previous <tt>N</tt> (notation definition) line.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>ENTITY </tt><i>name</i><tt> ...</tt></td>
<td>An attribute with declared type <tt>ENTITY</tt> or <tt>ENTITIES</tt>.
Each <i>name</i> is the name of an unparsed general entity that was defined
in a preceding <tt>E</tt> (entity definition) line.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>TOKEN </tt><i>token</i><tt> ...</tt></td>
<td>An attribute with declared type <tt>NMTOKEN</tt>, <tt>NMTOKENS</tt>,
<tt>ID</tt>, <tt>IDREF</tt>, <tt>IDREFS</tt>, or enumeration. Each <i>token</i>
is a name token complying with the attribute type.&nbsp;</td>
</tr>
</table>
</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>?</tt><i>target</i> <i>text</i></td>
<td>A processing instruction with target <i>target</i> and text <i>text</i>.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>E</tt><i>ent</i><tt> NDATA </tt><i>nt</i></td>
<td>Defines an unparsed external entity named <i>ent</i> whose notation
is <i>nt</i> and has been defined by a preceding <tt>N</tt> (notation definition)
line. This line is immediately preceded by an optional <tt>p</tt> (public
identifier) line, an <tt>s</tt> (system identifier) line and, if a filename
could be generated, an <tt>f</tt> (filename) line for the external identifier
declared for <i>ent</i>. An entity is defined by an <tt>E</tt> line only
once per document.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>N</tt><i>nt</i></td>
<td>Defines the notation named <i>nt</i>. This line is immediately preceded
by an optional <tt>p</tt> (public identifier) line and an optional <tt>s</tt>
(system identifier) line for the external identifier declared for <i>nt</i>.
A notation is defined by an <tt>N</tt> line only once per document.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>p</tt><i>pubid</i></td>
<td><i>pubid</i> is the public identifier belonging to the external identifier
of a following <tt>N</tt> (notation definition) or <tt>E</tt> (entity definition).&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>s</tt><i>sysid</i></td>
<td><i>sysid</i> is the system identifier belonging to the external identifier
of a following <tt>N</tt> (notation definition) or <tt>E</tt> (entity definition).&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP><tt>f&lt;OSFILE></tt><i>filename</i></td>
<td><i>filename</i> is the system file name generated for the external
identifier of a following <tt>E</tt> (entity definition).&nbsp;</td>
</tr>
</table>
</blockquote>
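<p>Because the first character of each line determines its meaning, a consumer
of the output can be quite small. The following Python sketch is one possible
reader and not part of <i>fxp</i>; the function name <tt>read_esis</tt> and
the tuple layout are hypothetical. It attaches each <tt>A</tt> line to the
element-start line that follows it, leaves escape sequences undecoded and
passes all other line types through as they are.
<blockquote>
<pre>
```python
def read_esis(lines):
    """Group A lines with the element-start line that follows them."""
    events = []
    pending = {}
    for line in lines:
        kind, rest = line[:1], line[1:]
        if kind == "A":
            # Attribute name, then the value description (e.g. CDATA data).
            name, _, value = rest.partition(" ")
            pending[name] = value
        elif kind == "(":
            events.append(("start", rest, pending))
            pending = {}
        elif kind == ")":
            events.append(("end", rest))
        elif kind == "-":
            events.append(("data", rest))
        else:
            events.append(("other", line))
    return events
```
</pre>
</blockquote>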
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OUTEX"></a>An Output Example</h2>
Consider the example document <tt><a href="Examples/exa-5.xml">exa-5.xml</a></tt>.
The <i>fxesis</i> output, if called without options, for this document
is <tt><a href="Examples/exa-5.esis-8">exa-5.esis-8</a></tt>. Note that
all the adjacent data segments of the first <tt>a</tt> element are merged
into one; note also that there is an <tt>A</tt> line for each implied attribute.
Furthermore, notation <tt>man</tt> is not redefined at its second occurrence.
<p>In contrast, <tt>fxesis -7 -nv exa-5.xml</tt> produces the output
in <tt><a href="Examples/exa-5.esis-7">exa-5.esis-7</a></tt>. Note the
differences: on the one hand, no <tt>A</tt> lines are printed for implied
attributes, because validation was turned off. On the other hand, the characters
<tt>&ouml;</tt>, <tt>&uuml;</tt> and <tt>&szlig;</tt> are represented by
escape sequences, because they are not ASCII characters.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="EXA"></a>Options by Example</h2>
<i>fxesis</i> understands all options documented for <i><a href="fxp.html#EXA">fxp</a></i>;
the additional options control how output is generated.
<p>By default, <i>fxesis</i> writes its output to the standard output.
It can be redirected to a file named <tt>outfile</tt> via the option <tt>--output=outfile</tt>
or, for short, <tt>-o outfile</tt>.
<h4>
<a NAME="EXA-ENC"></a>Output Encoding</h4>
By default, <i>fxesis</i> produces its output in the LATIN1 character set,
i.e., using 8-bit characters. It can be restricted to using only 7-bit
characters with the <tt>--ascii</tt> or, for short, <tt>-7</tt> option.
For instance, consider the element
<blockquote>
<pre>&lt;addr city="K&ouml;ln">M&uuml;llerstra&szlig;e 13&lt;/addr></pre>
</blockquote>
Called with <tt>fxesis -8 ...</tt>, the output for this element is
<blockquote>
<pre>Acity CDATA K&ouml;ln
(addr
-M&uuml;llerstra&szlig;e 13
)addr</pre>
</blockquote>
whereas <tt>fxesis -7 ...</tt> outputs the following:
<blockquote>
<pre>Acity CDATA K\U+f6;ln
(addr
-M\U+fc;llerstra\U+df;e 13
)addr</pre>
</blockquote>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OPT"></a>Summary of Command Line Options</h2>
Each option can be one of:
<ul>
<li>
A file name specifying the input document. Only one input document may
be specified.</li>
<li>
A long option of the form <tt>--key[=arg]</tt></li>
<li>
A short option of the form <tt>-k</tt>, where <tt>k</tt> consists of a single
character. If <tt>k</tt> consists of more than one character, each character
is assumed to be a short option itself (e.g., <tt>-vic</tt> equals <tt>-v
-i -c</tt>).</li>
<li>
A short option with argument of the form <tt>-k arg</tt>, where <tt>k</tt>
consists of a single character.</li>
<li>
A negative short option of the form <tt>-nk</tt>, where <tt>k</tt> consists
of a single character. If <tt>k</tt> consists of more than one character,
each character is assumed to be a negative short option itself (e.g., <tt>-nvic</tt>
equals <tt>-nv -ni -nc</tt>). If <tt>k</tt> is empty, then we have the
(non-negative) short option <tt>-n</tt>.</li>
<li>
The string <tt>--</tt>. This option is ignored, except that all remaining
options are interpreted as file names, whether they start with <tt>-</tt>
or not.</li>
</ul>
<i>fxesis</i> understands all options documented for <i><a href="fxp.html#OPT">fxp</a></i>;
additionally, the following options are available:
<dl>
<dt>
<tt>-o fname</tt></dt>
<dt>
<tt>--output=fname</tt></dt>
<dd>
Write all output, except for errors and warnings, to the file named <tt>fname</tt>.
If <tt>fname</tt> is <tt>-</tt>, the standard output is used. The default
is <tt>-</tt>.</dd>
<dt>
<tt>-7</tt></dt>
<dt>
<tt>--ascii</tt></dt>
<dd>
Produce the output in ASCII encoding, i.e., using 7-bit characters only.</dd>
<dt>
<tt>-8</tt></dt>
<dt>
<tt>--latin1</tt></dt>
<dd>
Produce the output in Latin1 encoding, i.e., also using 8-bit characters. This
is the default.</dd>
</dl>
<img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

BIN
fxp/doc/fxp-shadow.jpg Normal file

Binary file not shown.

22
fxp/doc/fxp-xsa.xml Normal file

@ -0,0 +1,22 @@
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE xsa PUBLIC "-//LM Garshol//DTD XML Software Autoupdate 1.0//EN//XML"
"http://birk105.studby.uio.no/www_work/xsa/xsa.dtd">
<xsa>
<vendor>
<name>Alexandru Berlea</name>
<email>aberlea@PSI.Uni-Trier.DE</email>
<url>http://www.informatik.uni-trier.de/~aberlea/</url>
</vendor>
<product id="fxp">
<name>fxp</name>
<version>1.4.1</version>
<last-release>20003010</last-release>
<info-url>http://www.informatik.uni-trier.de/~aberlea/Fxp/</info-url>
<changes>
- Bug Fixes
</changes>
</product>
</xsa>

793
fxp/doc/fxp.html Normal file

@ -0,0 +1,793 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - The Program fxp</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
The Program <i>fxp</i></h1>
<img SRC="shadow.jpg" ALT="----------------" >
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#DESC">Description</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA">Options by Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OPT">Summary of Options</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="DESC"></a>Description</h2>
<i>fxp</i> is a validating XML parser. It reads an XML document and reports
all well-formedness errors, validity errors and other errors in that document.
It can also warn about interoperability features and other issues mentioned
in the XML recommendation.
<p>The typical invocation of <i>fxp</i> is
<blockquote>
<pre>fxp [option ...] [infile]</pre>
</blockquote>
If <tt>infile</tt> is given, <i>fxp</i> reads its input document from that
file, otherwise from standard input.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="EXA"></a>Options by Example</h2>
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-ERR">Controlling Error Printing</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-VAL">Validating and Non-Validating Mode</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-COMP">Compatibility Modes</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-INTER">Interoperability Modes</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA-OTHER">Other Errors and Warnings</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="features.html#CAT-EXA">Catalog Support</a></td>
</tr>
</table>
<h4>
<a NAME="EXA-ERR"></a>Controlling Error Printing</h4>
By default <i>fxp</i> reports all errors and warnings to the standard error.
This can be controlled by options:
<ul>
<li>
All messages can be redirected to a file named <tt>errfile</tt> via the
option <tt>--error-output=errfile</tt> or, for short, <tt>-e errfile</tt>.</li>
<li>
All messages can be suppressed by supplying the <tt>--silent</tt> option,
or <tt>-s</tt> for short.</li>
<li>
By default, the parser tries to avoid printing an error that has already been
printed earlier. E.g., if an attribute is misspelled in the attribute list
declaration, there will be an undeclared-attribute error each time this
attribute is actually specified for an element. Printing of all but the
first of these errors is suppressed. In order to make <i>fxp</i> print
this kind of duplicate error messages, use the <tt>--few-errors=no</tt>
option.</li>
</ul>
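<p>The effect of duplicate suppression can be pictured with the following
Python sketch. It is an illustration only: the function name <tt>report</tt>
and the parameter <tt>few_errors</tt> are hypothetical, and the sketch assumes
that two errors count as duplicates when their message text is identical,
which the description above does not state.
<blockquote>
<pre>
```python
def report(errors, few_errors=True):
    """Format (position, message) pairs, optionally dropping repeats."""
    seen = set()
    printed = []
    for position, message in errors:
        # With few_errors enabled, print each distinct message only once.
        if few_errors and message in seen:
            continue
        seen.add(message)
        printed.append("[" + position + "] Error: " + message)
    return printed
```
</pre>
</blockquote>
Calling the function with <tt>few_errors=False</tt> corresponds to running
<i>fxp</i> with <tt>--few-errors=no</tt>.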
<h4>
<a NAME="EXA-VAL"></a>Validating and Non-Validating Mode</h4>
By default <i>fxp</i> is a validating parser, but it can be run in non-validating
mode with the <tt>--validate=no</tt> or, for short, <tt>-nv</tt> option.
This has the following effects:
<ul>
<li>
only the internal subset of the DTD is parsed and checked for well-formedness;</li>
<li>
the external subset and all references to external parameter entities are
ignored;</li>
<li>
declarations in the internal subset are processed only up to the first reference
to an external parameter entity;</li>
<li>
validity constraints are not verified;</li>
<li>
no referenced parameter entities are included;</li>
<li>
by default, no external parsed general entities are included; this can
be changed with the <tt>--include-external</tt> option;</li>
<li>
all attributes for which no declaration has been processed are assumed
to be declared <tt>CDATA</tt> with default value <tt>#IMPLIED</tt>.</li>
</ul>
For instance, consider an example document <tt><a href="Examples/exa-1.xml">exa-1.xml</a></tt>,
referencing files <tt><a href="Examples/exa-1.ext">exa-1.ext</a></tt>,
<tt><a href="Examples/ext.elem">ext.elem</a></tt> and <tt><a href="Examples/ext.decl">ext.decl</a></tt>.
Running
<blockquote>
<pre>fxp exa-1.xml</pre>
</blockquote>
reports the following errors:
<blockquote>
<pre>[exa-1.xml:17.11] Error: Attribute 'num' has the value 'a' but was declared with
&nbsp;&nbsp;&nbsp; a fixed default value of '0'.
[exa-1.xml:18.12] Error: ID name 'id1' already occurred as an attribute value.
[exa-1.xml:19.0] Error: Element type 'a' not allowed at this point in the&nbsp;
&nbsp;&nbsp;&nbsp; content of element 'a'.
[ext.elem:1.11] Error: Attribute 'num' has the value '1' but was declared with a
&nbsp;&nbsp;&nbsp; fixed default value of '0'.
[ext.elem:1.15] Error: Element 'a' was ended by an end-tag for 'b'.
[exa-1.xml:20.7] Error: Attribute 'nmu' was not declared for element type 'b'.
[exa-1.xml:20.12] Error: No value was specified for required attribute 'num'.
[exa-1.xml:20.12] Error: The end-tag for element 'b' with declared EMPTY content
&nbsp;&nbsp;&nbsp; must follow immediately after its start-tag.</pre>
</blockquote>
whereas the non-validating mode
<blockquote>
<pre>fxp -nv exa-1.xml</pre>
</blockquote>
does not find any errors. Note that the error at <tt>[ext.elem:1.15]</tt>
is a well-formedness error but is not reported since the external entity
reference <tt>&amp;ext;</tt> is not included. But if we make the parser
include external parsed entities:
<blockquote>
<pre>fxp -nv --include-external exa-1.xml</pre>
</blockquote>
then the error is reported:
<blockquote>
<pre>[ext.elem:1.15] Error: Element 'a' was ended by an end-tag for 'b'.</pre>
</blockquote>
<h4>
<a NAME="EXA-COMP"></a>Compatibility Modes</h4>
Some features in XML have only been included for compatibility with SGML.
These include:
<ul>
<li>
the string (<tt>]]></tt>) may not appear literally in content;</li>
<li>
a comment may not contain a double-hyphen (<tt>--</tt>);</li>
<li>
content models must be unambiguous.</li>
</ul>
By default <i>fxp</i> checks for compatibility and prints errors in case
it is not obeyed. This can be changed with the <tt>--compatibility=no</tt>
or, for short, <tt>-nc</tt> option.
<p>In non-compatibility mode, however, the parser must handle ambiguous
content models. This implies generation of a deterministic finite state
machine (DFA), which may in the worst case have size exponential in the
size of the content model. In order to avoid too high space usage, <i>fxp</i>
imposes a limit on the size of the generated DFA. If this limit is exceeded,
a warning is printed and the content model is approximated by <tt>(e<sub>1</sub>|...|e<sub>n</sub>)*</tt>,
where <tt>e<sub>1</sub></tt>, ..., <tt>e<sub>n</sub></tt> are all element
types occurring in the original model. The new content model is less restrictive
but allows for a small DFA. The limit defaults to 256 and can be set by
the <tt>--dfa-max-size</tt> option, and the warning can be suppressed with
the <tt>--dfa-warn-size=no</tt> option.
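<p>The approximation <tt>(e<sub>1</sub>|...|e<sub>n</sub>)*</tt> can be
pictured with a short Python sketch. It is an illustration only and not the
way <i>fxp</i> represents content models: the function name
<tt>approximate_content_model</tt> is hypothetical, the model is taken as a
plain string, and element names are assumed to consist of ASCII characters.
<blockquote>
<pre>
```python
import re


def approximate_content_model(model):
    """Return (e1|...|en)* over the element types named in model."""
    names = []
    # Collect each element type once, in order of first occurrence.
    for name in re.findall(r"[A-Za-z_:][A-Za-z0-9_.:-]*", model):
        if name not in names:
            names.append(name)
    return "(" + "|".join(names) + ")*"
```
</pre>
</blockquote>
The result accepts any sequence of the listed element types, including the
empty sequence, which is why it is less restrictive than the original model.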
<p>For instance, consider the document <tt><a href="Examples/exa-2.xml">exa-2.xml</a></tt>.
Note that the content model for element <tt>a</tt> is ambiguous, and its
DFA needs at least 257 states. Running <i>fxp</i> in compatibility mode
produces the following errors:
<blockquote>
<pre>[exa-2.xml:4.65] Error: Content model is ambiguous: conflict between the 1st and
&nbsp;&nbsp;&nbsp; the 2nd occurrence of element 'b'. Using an approximation instead.
[exa-2.xml:10.26] Error: '--' is not allowed in a comment.
[exa-2.xml:13.26] Error: Character '>' must be escaped for compatibility.</pre>
</blockquote>
Note that the empty element tag for <tt>a</tt> is not an error since <tt>a</tt>'s
content model was approximated. Running in non-compatibility mode:
<blockquote>
<pre>fxp -nc exa-2.xml</pre>
</blockquote>
suppresses these errors, but reports the following instead:
<blockquote>
<pre>[exa-2.xml:4.65] Warning: The finite state machine for the content model of&nbsp;
&nbsp;&nbsp;&nbsp; element type 'a' would have more than the maximal allowed number of 256&nbsp;
&nbsp;&nbsp;&nbsp; states. Using an approximation instead.</pre>
</blockquote>
This warning can be suppressed by invoking <i>fxp</i> like this:
<blockquote>
<pre>fxp -nc --dfa-warn-size=no exa-2.xml</pre>
</blockquote>
But still the invalidity of the empty-element tag for <tt>a</tt> is not
detected. In order to achieve this, we can raise the limit for the DFA's
size:
<blockquote>
<pre>fxp -nc --dfa-max-size=257 exa-2.xml</pre>
</blockquote>
Now element <tt>a</tt>'s content can be validated and the error is reported:
<blockquote>
<pre>[exa-2.xml:12.0] Error: Empty-element tag for element type 'a' whose content&nbsp;
&nbsp;&nbsp;&nbsp; model requires non-empty content.</pre>
</blockquote>
<h4>
<a NAME="EXA-INTER"></a>Interoperability Modes</h4>
XML also includes some interoperability recommendations to allow existing
SGML software to process XML documents. These recommendations are non-binding
and therefore not checked for by default. The <tt>--interoperability</tt>
or, for short, <tt>-i</tt> option makes <i>fxp</i> run in interoperability-mode,
which enables checking for these features. Some of these features can additionally
be controlled by individual options. The following table lists the features
supported by <i>fxp</i>, together with the option (if any) that enables
or disables them, and whether they are enabled by default if <tt>--interoperability</tt>
is supplied:
<table WIDTH="90%" >
<tr>
<td>Controlling option</td>
<td ALIGN=CENTER>&nbsp; Default&nbsp;</td>
<td>Interoperability Feature</td>
</tr>
<tr VALIGN=TOP>
<td>(none)&nbsp;</td>
<td ALIGN=CENTER>yes</td>
<td>The empty element tag must be used and may only be used for elements
declared <tt>EMPTY</tt>.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-mult-decl=attlist</tt></td>
<td ALIGN=CENTER>no</td>
<td>There should be at most one attribute list declaration for each element
type.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-mult-decl=att</tt></td>
<td ALIGN=CENTER>no</td>
<td>No attribute should be declared twice for the same element type.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td>(none)&nbsp;</td>
<td ALIGN=CENTER>yes</td>
<td>The same name token should not occur more than once in the enumerated
attribute types of a single element type.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-predefined=no</tt></td>
<td ALIGN=CENTER>yes</td>
<td>Valid documents should declare the entities <tt>amp</tt>, <tt>lt</tt>,
<tt>gt</tt>, <tt>apos</tt> and <tt>quot</tt>.&nbsp;</td>
</tr>
</table>
Note that all arguments to the <tt>--warn-mult-decl</tt> option must be
specified in a list; see a detailed description <a href="#MULT-DECL">here</a>.
<p>As an example, consider the document <tt><a href="Examples/exa-3.xml">exa-3.xml</a></tt>.
Running <tt>fxp -i exa-3.xml</tt> reports the following:
<blockquote>
<pre>[exa-3.xml:10.2] Warning: The following name tokens occur more than once in the&nbsp;
&nbsp;&nbsp;&nbsp; enumerated attribute types of element 'a': 'yes', 'no'.
[exa-3.xml:10.2] Warning: The predefined entities 'lt', 'gt', 'apos', 'quot' and
&nbsp;&nbsp;&nbsp; 'amp' should have been declared.
[exa-3.xml:13.4] Error: An empty-element tag must be used for element type 'a'&nbsp;
&nbsp;&nbsp;&nbsp; with EMPTY declared content.
[exa-3.xml:15.0] Error: Empty-element tag for element 'b' with non-EMPTY&nbsp;
&nbsp;&nbsp;&nbsp; declared content.</pre>
</blockquote>
Now we add some options:
<blockquote>
<pre>fxp -i --warn-mult-decl=att,attlist --warn-predefined=no exa-3.xml</pre>
</blockquote>
The result is that the predefined entities are not checked, but multiple
declarations are detected now:
<blockquote>
<pre>[exa-3.xml:9.12] Warning: Repeated attribute-list declaration for element type&nbsp;
&nbsp;&nbsp;&nbsp; 'a'.
[exa-3.xml:9.28] Warning: Repeated definition of attribute 'x' for element type&nbsp;
&nbsp;&nbsp;&nbsp; 'a'.
[exa-3.xml:10.2] Warning: The following name tokens occur more than once in the&nbsp;
&nbsp;&nbsp;&nbsp; enumerated attribute types of element 'a': 'yes', 'no'.
[exa-3.xml:13.4] Error: An empty-element tag must be used for element type 'a'&nbsp;
&nbsp;&nbsp;&nbsp; with EMPTY declared content.
[exa-3.xml:15.0] Error: Empty-element tag for element 'b' with non-EMPTY&nbsp;
&nbsp;&nbsp;&nbsp; declared content.</pre>
</blockquote>
<h4>
<a NAME="EXA-OTHER"></a>Other Errors and Warnings</h4>
The following table lists some features from the XML recommendation which
can be enabled or disabled by command line options:
<table WIDTH="90%" >
<tr>
<td>Controlling option</td>
<td ALIGN=CENTER>&nbsp; Default&nbsp;</td>
<td>Feature</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-att-elem</tt></td>
<td ALIGN=CENTER>no</td>
<td>There should be attribute list declarations for declared element types
only.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--check-predefined=no</tt></td>
<td ALIGN=CENTER>yes</td>
<td>If the predefined entities are declared, this must be according to
section "4.6 Predefined Entities".&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--check-lang-id</tt></td>
<td ALIGN=CENTER>no</td>
<td>The values of the attribute <tt>xml:lang</tt> must be language identifiers
as defined by IETF RFC 1766, "Tags for the Identification of Languages".&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--check-iso639</tt></td>
<td ALIGN=CENTER>no</td>
<td>An ISO-639 Code in a value of the attribute <tt>xml:lang</tt> must
be a two-letter language code as defined by ISO 639, "Codes for the representation
of names of languages"&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-uri=no</tt></td>
<td ALIGN=CENTER>yes</td>
<td>System identifiers are URIs and may only contain ASCII characters,
according to IETF RFC 2396, "Uniform Resource Identifiers (URI): Generic
Syntax"&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--check-xml-version=no</tt></td>
<td ALIGN=CENTER>yes</td>
<td>Processors may signal an error if they receive documents labeled with
versions they do not support.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-xml-decl</tt></td>
<td ALIGN=CENTER>no</td>
<td>XML documents should begin with an XML declaration which specifies
the version of XML being used.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-mult-decl=ent</tt></td>
<td ALIGN=CENTER>no</td>
<td>An XML processor may issue a warning if entities are declared multiple
times.&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td><tt>--warn-mult-decl=not</tt></td>
<td ALIGN=CENTER>no</td>
<td>Ditto for notations. This is not mentioned in the XML recommendation
but sensible.&nbsp;</td>
</tr>
</table>
Note that all arguments to the <tt>--warn-mult-decl</tt> option must be
specified in a list; see a detailed description <a href="#MULT-DECL">here</a>.
<p>For instance, consider the example document <tt><a href="Examples/exa-4.xml">exa-4.xml</a></tt>.
Running <i>fxp</i> without options produces the following:
<blockquote>
<pre>[exa-4.xml:1.20] Error: XML version '1.1' is not supported.
[exa-4.xml:12.21] Error: General entity 'amp' must be declared as internal&nbsp;
&nbsp;&nbsp;&nbsp; entity with replacement text '&amp;'.</pre>
</blockquote>
We can suppress these messages while making the parser check for the other
features listed above by typing:
<blockquote>
<pre>fxp --warn-att-elem --check-predefined=no --check-lang-id --check-iso639&nbsp;
&nbsp;&nbsp;&nbsp; --check-xml-version=no --warn-mult-decl=ent,not exa-4.xml</pre>
</blockquote>
The result is:
<blockquote>
<pre>[exa-4.xml:9.32] Error: 'i-' is not a language identifier.
[exa-4.xml:10.12] Warning: Attribute-list declaration for undeclared element&nbsp;
&nbsp;&nbsp;&nbsp; type 'c'.
[exa-4.xml:13.25] Warning: Repeated declaration for general entity 'amp'.
[exa-4.xml:16.45] Warning: Repeated declaration for notation 'text'.
[exa-4.xml:20.17] Error: 'yy' is not a language identifier.</pre>
</blockquote>
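<p>The argument syntax of <tt>--warn-mult-decl</tt> used above (a comma-separated list of keys, or one of <tt>all</tt> and <tt>none</tt>) can be sketched as follows. This is an illustration in Python, not <i>fxp</i>'s actual implementation: the function name <tt>parse_mult_decl</tt> is hypothetical, and raising an error for an unknown key is an assumption about how invalid input might be handled.</p>

```python
# The four keys documented for --warn-mult-decl.
KEYS = ("att", "attlist", "ent", "not")


def parse_mult_decl(arg=None):
    """Return the set of declaration kinds to warn about.

    arg is the text after '=', or None if the option was given
    without an argument (which the documentation treats as 'all').
    """
    if arg is None or arg == "all":
        return set(KEYS)
    if arg == "none":
        return set()
    selected = set()
    for key in arg.split(","):
        if key not in KEYS:
            # Assumption: unknown keys are rejected.
            raise ValueError(f"unknown key: {key!r}")
        selected.add(key)
    return selected


if __name__ == "__main__":
    print(sorted(parse_mult_decl("ent,not")))
```

<p>For the command line shown above, <tt>parse_mult_decl("ent,not")</tt> selects entity and notation declarations only.</p>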
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OPT"></a>Summary of Command Line Options</h2>
Each option can be one of:
<ul>
<li>
A file name specifying the input document. Only one input document may
be specified.</li>
<li>
A long option of the form <tt>--key[=arg]</tt></li>
<li>
A short option of the form <tt>-k</tt>, where <tt>k</tt> consists of a single
character. If <tt>k</tt> consists of more than one character, each character
is assumed to be a short option itself (e.g., <tt>-vic</tt> equals <tt>-v
-i -c</tt>).</li>
<li>
A short option with argument of the form <tt>-k arg</tt>, where <tt>k</tt>
consists of a single character.</li>
<li>
A negative short option of the form <tt>-nk</tt>, where <tt>k</tt> consists
of a single character. If <tt>k</tt> consists of more than one character,
each character is assumed to be a negative short option itself (e.g., <tt>-nvic</tt>
equals <tt>-nv -ni -nc</tt>). If <tt>k</tt> is empty, then we have the
(non-negative) short option <tt>-n</tt>.</li>
<li>
The string <tt>--</tt>. This option is ignored, except that all remaining
options are interpreted as file names, whether they start with <tt>-</tt>
or not.</li>
</ul>
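<p>The option forms listed above can be summarized in a short sketch. The Python code below is only an illustration of the stated rules and not <i>fxp</i>'s own option parser. The function name <tt>classify</tt> is hypothetical, the set of short options taking an argument (<tt>-e</tt>, and <tt>-o</tt> for the other tools) is inferred from the option lists, and treating a lone <tt>-</tt> as a file name is an assumption.</p>

```python
def classify(args, short_with_arg=("e", "o")):
    """Split a command line into (kind, key, value) tuples.

    kind is one of 'file', 'long', 'short' or 'negative'.
    """
    result = []
    only_files = False
    i = 0
    while i < len(args):
        arg = args[i]
        i += 1
        if only_files or not arg.startswith("-") or arg == "-":
            # Assumption: a lone '-' is treated like a file name.
            result.append(("file", arg, None))
        elif arg == "--":
            # Everything after '--' is a file name.
            only_files = True
        elif arg.startswith("--"):
            key, sep, value = arg[2:].partition("=")
            result.append(("long", key, value if sep else None))
        elif len(arg) == 2 and arg[1] in short_with_arg:
            # Short option with argument: '-k arg'.
            value = args[i] if i < len(args) else None
            i += 1
            result.append(("short", arg[1], value))
        elif arg.startswith("-n") and len(arg) > 2:
            # '-nvic' stands for '-nv -ni -nc'.
            result.extend(("negative", ch, None) for ch in arg[2:])
        else:
            # '-vic' stands for '-v -i -c'; a bare '-n' also ends up here.
            result.extend(("short", ch, None) for ch in arg[1:])
    return result


if __name__ == "__main__":
    for item in classify(["-nc", "--dfa-max-size=257", "exa-2.xml"]):
        print(item)
```

<p>Note that the sketch checks for <tt>-n</tt> followed by further characters before falling back to plain short options, so <tt>-n</tt> on its own remains an ordinary short option, as described in the list.</p>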
The following options are available (see also the <a href="features.html#CAT-OPT">catalog</a>
options):
<dl>
<dt>
<tt>-s</tt></dt>
<dt>
<tt>--silent</tt></dt>
<dd>
Do not print any errors or warnings.</dd>
<dt>
<tt>--few-errors[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser tries to avoid printing errors caused by something
that already caused an error earlier. E.g., an attribute specification
for an attribute not declared for some element will cause an error only
at the first instance of that element with the attribute. If no argument
is given, <tt>yes</tt> is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>-e fname</tt></dt>
<dt>
<tt>--error-output=fname</tt></dt>
<dd>
Write all errors and warnings to the file named <tt>fname</tt>. If <tt>fname</tt>
is <tt>-</tt>, standard error is used. Default is <tt>-</tt>.</dd>
<dt>
<tt>--validate[=(yes|no)]</tt></dt>
<dd>
Turns on or off validation. If no argument is given, <tt>yes</tt> is assumed.
Default is <tt>yes</tt>.</dd>
<dt>
<tt>-v</tt></dt>
<dd>
Same as <tt>--validate=yes</tt>.</dd>
<dt>
<tt>-nv</tt></dt>
<dd>
Same as <tt>--validate=no</tt>.</dd>
<dt>
<tt>--compatibility[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks for features that were included into
XML solely for compatibility with SGML. If no argument is given, <tt>yes</tt>
is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>--compat[=(yes|no)]</tt></dt>
<dd>
Same as <tt>--compatibility</tt>.</dd>
<dt>
<tt>-c</tt></dt>
<dd>
Same as <tt>--compatibility=yes</tt>.</dd>
<dt>
<tt>-nc</tt></dt>
<dd>
Same as <tt>--compatibility=no</tt>.</dd>
<dt>
<tt>--interoperability[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether the (non-binding) recommendations
XML makes for enhancing interoperability with existing SGML software are
followed. If no argument is given, <tt>yes</tt> is assumed. Default is
<tt>no</tt>.</dd>
<dt>
<tt>--interop[=(yes|no)]</tt></dt>
<dd>
Same as <tt>--interoperability</tt>.</dd>
<dt>
<tt>-i</tt></dt>
<dd>
Same as <tt>--interoperability=yes</tt>.</dd>
<dt>
<tt>-ni</tt></dt>
<dd>
Same as <tt>--interoperability=no</tt>.</dd>
<br>&nbsp;
<p>&nbsp;</dl>
<dl>
<dt>
<tt>--check-reserved[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether element names, attribute names
and PI targets are reserved for standardization and thus invalid. If no
argument is given, <tt>yes</tt> is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--check-predefined[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether declarations for the predefined
entities (<tt>amp</tt>, <tt>lt</tt>, <tt>gt</tt>, <tt>apos</tt> and <tt>quot</tt>)
are in accordance with section 4.6 of the XML recommendation. If no argument
is given, <tt>yes</tt> is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>--check-predef[=(yes|no)]</tt></dt>
<dd>
Same as <tt>--check-predefined</tt>.</dd>
<dt>
<tt>--check-lang-id[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether values of the 'xml:lang' attribute
are language identifiers as defined in RFC 1766. If no argument is given,
<tt>yes</tt> is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--check-iso639[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether an ISO language code in a language
identifier is in accordance with ISO 639. Has no effect unless <tt>--check-lang-id=yes</tt>
was specified. If no argument is given, <tt>yes</tt> is assumed. Default
is <tt>no</tt>.</dd>
<dt>
<tt>--check-xml-version[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser checks whether the version number in an XML
or text declaration is supported. If no argument is given, <tt>yes</tt>
is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>--warn-uri[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt>, the parser prints a warning for each non-ASCII character
occurring in a system literal (URI). If no argument is given, <tt>yes</tt>
is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>--warn-xml-decl[=(yes|no)]</tt></dt>
<dd>
Turns on or off a warning if there is no XML declaration. If no argument
is given, <tt>yes</tt> is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--warn-att-elem[=(yes|no)]</tt></dt>
<dd>
Turns on or off warnings about attribute list declarations for undeclared
elements. If no argument is given, <tt>yes</tt> is assumed. Default is
<tt>no</tt>.</dd>
<dt>
<tt>--warn-predefined[=(yes|no)]</tt></dt>
<dd>
Turns on or off a warning if at least one of the predefined entities (<tt>amp</tt>,
<tt>lt</tt>, <tt>gt</tt>, <tt>apos</tt> and <tt>quot</tt>) is not declared.
Has no effect in non-validating mode or if <tt>--interoperability=yes</tt>
was not specified. If no argument is given, <tt>yes</tt> is assumed. Default
is <tt>no</tt>.</dd>
<dt>
<a NAME="MULT-DECL"></a><tt>--warn-mult-decl[=arg]</tt></dt>
<dd>
Turns on or off a warning if something is declared multiple times. <tt>arg</tt>
specifies which declarations this applies to, and must be one of the following:</dd>
<ul>
<li>
A comma-separated list <tt>key<sub>1</sub>[,key<sub>2</sub> ...]</tt>,
where each key is one out of:</li>
<ul>
<li>
<tt>att</tt> for multiple definitions of an attribute for the same element;</li>
<li>
<tt>attlist</tt> for multiple attribute list declarations for an element;</li>
<li>
<tt>ent</tt> for multiple declarations of an entity;</li>
<li>
<tt>not</tt> for multiple declarations of a notation.</li>
</ul>
<li>
<tt>all</tt> for all of the keys above;</li>
<li>
<tt>none</tt> for none of the keys above.</li>
</ul>
<tt>att</tt> and <tt>attlist</tt> have no effect unless <tt>--interoperability=yes</tt>
was specified. If no argument is given, <tt>all</tt> is assumed. Default
is <tt>none</tt>.
<dt>
<tt>--warn[=(yes|no)]</tt></dt>
<dd>
If <tt>yes</tt> or without argument, equivalent to <tt>--warn-xml-decl
--warn-att-elem --warn-predefined --warn-mult-decl=all</tt>.</dd>
<br>If <tt>no</tt>, equivalent to <tt>--warn-xml-decl=no --warn-att-elem=no
--warn-predefined=no --warn-mult-decl=none</tt>.
<dt>
<tt>--include-external[=(yes|no)]</tt></dt>
<dd>
Specifies whether external parsed entity references are included in content
or not. Has no effect in validating mode (then all references are included).
If no argument is given, <tt>yes</tt> is assumed. Default is <tt>no</tt>.</dd>
<dt>
<tt>--include-ext[=(yes|no)]</tt></dt>
<dd>
Same as <tt>--include-external</tt>.</dd>
<dt>
<tt>--dfa-initial-size=n</tt></dt>
<dd>
The transition table of a finite state machine grows dynamically during
its creation, i.e., if the table's size is exceeded, it is recreated with
double size. This option sets the initial size of the transition table
to the smallest power of 2 greater than or equal to <tt>n</tt>. Default is <tt>16</tt>.</dd>
<dt>
<tt>--dfa-initial-width=n</tt></dt>
<dd>
Same as <tt>--dfa-initial-size=2<sup>n</sup></tt>.</dd>
<dt>
<tt>--dfa-max-size=n</tt></dt>
<dd>
For ambiguous content models the parser generates a deterministic finite
state machine (DFA), which may in the worst case have size exponential
in the size of the content model. This option specifies a threshold for
the number of admissible states of the DFA. If it is exceeded, the content
model is approximated by the content model <tt>(e<sub>1</sub>|...|e<sub>n</sub>)*</tt>,
where <tt>e<sub>1</sub></tt>, ..., <tt>e<sub>n</sub></tt> are all element
types occurring in the original model. Default is <tt>256</tt>.</dd>
<dt>
<tt>--dfa-warn-size[=(yes|no)]</tt></dt>
<dd>
Turns on or off a warning if the maximal number of states specified by
<tt>--dfa-max-size</tt> is exceeded by the DFA construction for a content
model. If no argument is given, <tt>yes</tt> is assumed. Default is <tt>yes</tt>.</dd>
<dt>
<tt>-?</tt></dt>
<dt>
<tt>--help</tt></dt>
<dd>
Print a summary of the command line options and exit.</dd>
</dl>
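<p>The sizing rule behind <tt>--dfa-initial-size</tt> and <tt>--dfa-initial-width</tt> can be made concrete with a small calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that the required capacity is counted in the same unit as the table size, which the text above does not state explicitly.</p>

```python
def initial_table_size(n):
    """Smallest power of 2 that is greater than or equal to n."""
    size = 1
    while size < n:
        size *= 2
    return size


def size_after_growth(initial, needed):
    """Double the table until it can hold `needed` entries.

    Returns the final size and the number of times the table was recreated.
    """
    size = initial
    doublings = 0
    while size < needed:
        size *= 2
        doublings += 1
    return size, doublings


if __name__ == "__main__":
    # --dfa-initial-size=100 is rounded up to a power of 2.
    print(initial_table_size(100))
    # --dfa-initial-width=4 corresponds to --dfa-initial-size=2**4.
    print(initial_table_size(2 ** 4))
    print(size_after_growth(16, 257))
```

<p>Choosing a larger initial size trades memory for fewer re-creations of the table when content models are known to be large.</p>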
<img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

246
fxp/doc/fxviz.html Normal file

@ -0,0 +1,246 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - The Program fxviz</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<a href="index.html"><img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER></a>
The Program <i>fxviz</i></h1>
<img SRC="shadow.jpg" ALT="----------------" >
<table CELLSPACING=0 CELLPADDING=0 >
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#DESC">Description</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#EXA">An Example</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#OPT">Summary of Options</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#TRIB">Tributes</a></td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="DESC"></a>Description</h2>
<i>fxviz</i> is an XML tree visualizer. It reads an XML document and produces
a graph description file suitable as input to <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>.
The document tree can then be viewed or printed with the help of <i>xvcg</i>.
The typical invocation is
<blockquote>
<pre>fxviz [option ...] [infile]</pre>
</blockquote>
If <tt>infile</tt> is given, <i>fxviz</i> reads its input document from
that file, otherwise from standard input. By default, it prints its output
to the standard output. The generated tree contains the following kinds
of nodes:
<blockquote>&nbsp;
<table>
<tr VALIGN=TOP>
<th ALIGN=LEFT NOWRAP>node type&nbsp;</th>
<th ALIGN=LEFT NOWRAP>color&nbsp;</th>
<th ALIGN=LEFT NOWRAP>information&nbsp;</th>
<th ALIGN=LEFT NOWRAP>unfolds into&nbsp;</th>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>comment&nbsp;</td>
<td NOWRAP>yellow&nbsp;</td>
<td NOWRAP>file position&nbsp;</td>
<td NOWRAP>-&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>processing instruction&nbsp;</td>
<td NOWRAP>orange&nbsp;</td>
<td NOWRAP>file position&nbsp;</td>
<td NOWRAP>-&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>element&nbsp;</td>
<td NOWRAP>light blue&nbsp;</td>
<td NOWRAP>file position&nbsp;</td>
<td NOWRAP>attributes&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>text&nbsp;</td>
<td NOWRAP>beige&nbsp;</td>
<td NOWRAP>file position&nbsp;</td>
<td NOWRAP>data segments&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>attribute&nbsp;</td>
<td NOWRAP>light green&nbsp;</td>
<td NOWRAP>-&nbsp;</td>
<td NOWRAP>-&nbsp;</td>
</tr>
<tr VALIGN=TOP>
<td NOWRAP>attribute value&nbsp;</td>
<td NOWRAP>beige&nbsp;</td>
<td NOWRAP>attribute type&nbsp;</td>
<td NOWRAP>-&nbsp;</td>
</tr>
</table>
</blockquote>
Most of the nodes are annotated with their starting position in the XML
source; attribute value nodes provide the attribute type instead.
<p>A text node is usually the result of merging several adjacent text fragments
into a single one. It is, however, possible to unfold a text node into
the sequence of fragments it consists of.
<p>Similarly, an element node can be unfolded such that each of its attributes
is represented by its own attribute node as a child of the element.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="EXA"></a>An Example</h2>
Consider the following XML document <tt>test.xml</tt>:
<blockquote>
<pre>&lt;!DOCTYPE a [
&lt;!ELEMENT a ANY>
&lt;!ELEMENT b ANY>
&lt;!-- comment in DTD -->
&lt;!ATTLIST a x NMTOKEN "foo"
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; y ID&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; #IMPLIED>
&lt;!ATTLIST b x (yes|no) #REQUIRED&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; y IDREFS&nbsp;&nbsp; #IMPLIED>
]>
&lt;?foo pi in prolog ?>
&lt;a x="yup" y="i-1">
&nbsp; &lt;!-- comment in a -->
&nbsp; &lt;b x="yes" y="i-1">
&nbsp;&nbsp;&nbsp; This is text with another &lt;a>a element with an empty &lt;b/> &lt;/a>,&nbsp;
&nbsp;&nbsp;&nbsp; a character reference &amp;#x3C; and a &lt;![CDATA[CDATA section]]>
&nbsp; &lt;/b>&nbsp;
&nbsp; &lt;?foo pi in a ?>
&lt;/a>
&lt;!-- comment in epilog --></pre>
</blockquote>
The graph description produced by <i>fxviz</i> will display the document
tree as follows:
<p><img SRC="exa-vcg-1.gif" ALT="graph output example" HSPACE=20 >
<p>Each node in the tree is connected to each of its children by an edge.
Additionally, the nodes are annotated with their starting positions in the
XML source. These positions can be viewed by selecting <tt>Node Information
=> Source Position</tt> in the <i>vcg</i> menu:
<p><img SRC="exa-vcg-2.gif" ALT="graph output example" HSPACE=20 >
<p>Text nodes are merged such that no text node has another text node as
a direct sibling. If you wish to see how a merged text node is composed
of text fragments, apply <i>vcg</i>'s <tt>Unfold Subgraph</tt> function
to that node. For the second text-node in the <tt>b</tt>-element this results
in:
<p><img SRC="exa-vcg-3.gif" ALT="graph output example" HSPACE=20 >
<p>Additional information about attributes of elements is also available
by applying <i>vcg</i>'s <tt>Unfold Subgraph</tt> function to the element
node: the attributes are removed from the element node's label, and for
each attribute a new attribute node is inserted before the element's content.
Each attribute node is labeled with the attribute name and has as a single
child the attribute value:
<p><img SRC="exa-vcg-4.gif" ALT="graph output example" HSPACE=20 >
<p>The values of defaulted attributes are marked, and unspecified attributes
are marked as either implied or missing:
<p><img SRC="exa-vcg-5.gif" ALT="graph output example" HSPACE=20 >
<p>Finally, if you are interested in the attribute type of some specified
attribute, use the <tt>Node Information => Attribute Type</tt> function:
<p><img SRC="exa-vcg-6.gif" ALT="graph output example" HSPACE=20 >
<p><img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="OPT"></a>Summary of Command Line Options</h2>
Each option can be one of:
<ul>
<li>
A file name specifying the input document. Only one input document may
be specified.</li>
<li>
A long option of the form <tt>--key[=arg]</tt></li>
<li>
A short option of the form <tt>-k</tt>, where <tt>k</tt> consists of a single
character. If <tt>k</tt> consists of more than one character, each character
is assumed to be a short option itself (e.g., <tt>-vic</tt> equals <tt>-v
-i -c</tt>).</li>
<li>
A short option with argument of the form <tt>-k arg</tt>, where <tt>k</tt>
consists of a single character.</li>
<li>
A negative short option of the form <tt>-nk</tt>, where <tt>k</tt> consists
of a single character. If <tt>k</tt> consists of more than one character,
each character is assumed to be a negative short option itself (e.g., <tt>-nvic</tt>
equals <tt>-nv -ni -nc</tt>). If <tt>k</tt> is empty, then we have the
(non-negative) short option <tt>-n</tt>.</li>
<li>
The string <tt>--</tt>. This option is ignored, except that all remaining
options are interpreted as file names, whether they start with <tt>-</tt>
or not.</li>
</ul>
<i>fxviz</i> understands all options documented for <i><a href="fxp.html#OPT">fxp</a></i>;
additionally, the following options are available:
<dl>
<dt>
<tt>-o fname</tt></dt>
<dt>
<tt>--output=fname</tt></dt>
<dd>
Write all output, except for errors and warnings, to the file named <tt>fname</tt>.
If <tt>fname</tt> is <tt>-</tt>, the standard output is used. Default
is <tt>-</tt>.</dd>
</dl>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
<a NAME="TRIB"></a>Tributes</h2>
<i>fxviz</i> generates output for <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>
which is an excellent graph layout program written by Georg Sander at the
University of Saarbr&uuml;cken.
<p><img SRC="shadow.jpg" ALT="----------------" >
<address>
fxp's feedback address <a href="mailto:fxp@PSI.Uni-Trier.DE">fxp@PSI.Uni-Trier.DE</a></address>
</body>
</html>

239
fxp/doc/index.html Normal file

@ -0,0 +1,239 @@
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Mozilla/4.73 [en] (X11; I; Linux 2.2.14 i686) [Netscape]">
<title>fxp - a Functional XML Parser</title>
</head>
<body bgcolor="#FFFFFF">
<h1>
<img SRC="fxp-shadow.jpg" ALT="fxp" BORDER=0 align=CENTER> The Functional
XML Parser</h1>
<img SRC="shadow.jpg" ALT="----------------" >
<h2>
Contents</h2>
<table>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#FXP">About <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#VERSION"><i>fxp</i> Versions and Changes</a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#GET">Where to get <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><a href="#INSTALL">How to install <i>fxp</i></a></td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td>Example Applications: <i><a href="fxp.html">fxp</a></i>, <i><a href="fxcanon.html">fxcanon</a></i>,
<i><a href="fxcopy.html">fxcopy</a></i>, <i><a href="fxesis.html">fxesis</a></i>,
and <i><a href="fxviz.html">fxviz</a></i>.&nbsp;</td>
</tr>
</table>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="FXP"></a>What is <i>fxp</i>?</h3>
<i>fxp</i> is a validating <a href="http://www.w3.org/TR/REC-xml">XML</a>
parser written completely in the functional programming language SML. fxp can
validate both XML 1.0 and XML 1.1 documents. It has a <a
href="#API">programming interface</a> allowing for production of XML
applications based on <i>fxp</i>. It comes with five example applications:
<table> <tr> <td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxp.html">fxp</a></i>, the pure parser. It parses a document
and finds well-formedness errors, validity errors and other problems;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxcanon.html">fxcanon</a></i> produces an equivalent canonical
XML document. <a href="http://www.jclark.com/xml/canonxml.html">Canonical
XML</a> was invented by <a href="http://www.jclark.com">James Clark</a>
for testing XML parsers. It contains only the information a processor is
required to pass to the application;&nbsp;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxcopy.html">fxcopy</a></i> reproduces the document parsed
by <i>fxp</i>. The copy can be generated in a different encoding than the
input, and can be normalized in different ways concerning, e.g., expansion
of entity references;&nbsp;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxesis.html">fxesis</a></i> adds a backend to <i>fxp</i>,
producing an output similar to <i><a href="http://www.jclark.com/sp/nsgmls.htm">nsgmls</a></i>'s
ESIS (Element Structure Information Set) output;</td>
</tr>
<tr>
<td><img SRC="ball-shadow.jpg" ALT="o" ></td>
<td><i><a href="fxviz.html">fxviz</a></i> is an XML tree visualizer. It
produces a graph description suitable as input to Georg Sander's <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>.&nbsp;</td>
</tr>
</table>
More <a href="features.html">features</a> of <i>fxp</i> are described <a href="features.html">here</a>.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3><a NAME="VERSION"></a><i>fxp</i> Versions and Changes</h3>
<p>The current version of <i>fxp</i> is 2.0. Changes from the
previous versions are described <a href="CHANGES">here</a>.</p>
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="GET"></a>Downloading <i>fxp</i></h3>
The source code for <i>fxp</i> can be downloaded via <a href="fxp.tar.gz">HTTP</a>. The copyright notice is <a href="COPYRIGHT">here</a>.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3>
<a NAME="INSTALL"></a>Installation instructions</h3> In order to
install <i>fxp</i>, you need an SML compiler. It has been tested with
the stable versions 110.0.7 and 110.0.6 of <a
href="http://cm.bell-labs.com/cm/cs/what/smlnj/">SML of New
Jersey</a>. (If you need to use a working version of the SML compiler
look <a href="working.html">here.</a>) The compiler must have the
compilation manager (CM) built in, which is the default when
installing SML-NJ. We successfully compiled <i>fxp</i> on Linux and expect
no problems on other Unix systems. An installation using the Windows version of SML-NJ is
documented <a href="windows.html">here</a>.
<p>These are the steps for installing <i>fxp</i> under Unix:
<ol>
<li>
<a href="#GET">Download</a> the latest version of <i>fxp</i>;</li>
<li>
Unpack the sources, and change to the <i>fxp</i> directory, e.g.:</li>
<pre>&nbsp;&nbsp;&nbsp; gunzip -c fxp-2.0.tar.gz | tar xf -
&nbsp;&nbsp;&nbsp; cd fxp-2.0</pre>
<li>
Read the <tt>COPYRIGHT</tt>;</li>
<li>
Edit the <tt>Makefile</tt> according to your needs. Probably you will only
have to change the following:</li>
<ul>
<li>
<tt>INSTALL_PROGS</tt> is the list of programs to be installed. <i>fxlib</i> is
required if you want to develop applications with fxp. The <a href='../Fxgrep/'>fxgrep</a> XML query language and the <a href='../Fxt/'>fxt</a> XML transformation language also require <i>fxlib</i> to be installed.</li>
<li>
<tt>FXP_BINDIR</tt> is where the executables are installed;</li>
<li>
<tt>FXP_LIBDIR</tt> is where other files needed by <i>fxp</i> (the heap
images and the library) are installed;</li>
<li>
<tt>SML_BINDIR</tt> is the directory where the SML executables are found.
It must contain the <tt>.arch-n-opsys</tt> script from the SML-NJ distribution,
so make sure that this is where SML-NJ is <i>physically</i> installed;</li>
<li>
<tt>SML_EXEC</tt> is the name of the SML executable. This is the program
that is called for generating the heap image and at execution of <i>fxp</i>.
If <tt>sml</tt> will be in your <tt>PATH</tt> at execution time, you don't
need the full path here.</li>
<li>
<tt>SML_MAKEDEF</tt> is for defining the <tt>make</tt> command in SML.
After version 110.0.3, SML-NJ changed the type of <tt>CM.make'</tt>. Therefore
it is wrapped into the <tt>make</tt> defined by this variable. For working
versions of SML-NJ, use the second variant of this definition.</li>
</ul>
<li>
Edit the file <tt>src/config.sml</tt> according to your needs. Currently
only a single value can be configured here:</li>
<ul>
<li>
<tt>retrieveCommand</tt> is the command to be used by <i>fxp</i> for retrieving
a remote URI from the internet and storing it in a temporary file on the
local file system. It is a string value and should contain the strings
<tt>%1</tt> and <tt>%2</tt>, where</li>
<ul>
<li>
<tt>%1</tt> is replaced by the URI;</li>
<li>
<tt>%2</tt> is replaced by the local filename.</li>
</ul>
It is recommended that the command exits with failure in case the URI cannot
be retrieved. If the command generates an HTML error message instead (like,
e.g., <tt>"lynx -source %1 > %2"</tt>), this HTML file is considered to
be XML and will probably cause a mess of parsing errors. If you don't need
URI retrieval, use <tt>"exit 1"</tt> which always fails on Unix. Sensible
values are, e.g.:
<ul>
<li>
<tt>"<a href="ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/">wget</a> -qO
%2 %1"</tt></li>
<li>
<tt>"<a href="ftp://sunsite.unc.edu/pub/Linux/apps/www/mirroring/got_it-0.34.tar.gz">got_it</a>
-o %2 %1"</tt></li>
<li>
<tt>"<a href="ftp://sunsite.unc.edu/pub/Linux/apps/www/mirroring/urlget-3.12.tar.gz">urlget</a>
-s -o %2 %1"</tt></li>
</ul>
</ul>
<li>
Compile <i>fxp</i> by typing <tt>make</tt>;</li>
<li>
Install <i>fxp</i> by typing <tt>make install</tt>.</li>
<li>
If you want to use <i>fxviz</i>, you should also install <i><a href="ftp://ftp.cs.uni-sb.de/pub/graphics/vcg/">vcg</a></i>.</li>
</ol>
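<p>How the <tt>%1</tt> and <tt>%2</tt> placeholders of <tt>retrieveCommand</tt> are filled in can be sketched as follows. This Python code is an illustration, not the SML code in <tt>src/config.sml</tt>: the function name <tt>build_retrieve_command</tt> is hypothetical, and shell-quoting the substituted values is an assumption made here for safety, since the text above does not say whether <i>fxp</i> quotes them.</p>

```python
import re
import shlex


def build_retrieve_command(template, uri, filename):
    """Replace %1 by the URI and %2 by the local file name.

    Both placeholders are substituted in a single pass, so a '%2'
    inside the URI itself (as in the escape '%20') is left untouched.
    """
    values = {"1": shlex.quote(uri), "2": shlex.quote(filename)}
    return re.sub(r"%([12])", lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    print(build_retrieve_command(
        "wget -qO %2 %1",
        "http://example.org/a%20b.dtd",
        "/tmp/fxp-1.tmp",
    ))
```

<p>A naive approach that replaces <tt>%1</tt> first and <tt>%2</tt> afterwards would corrupt URIs containing percent escapes such as <tt>%20</tt>, which is why the sketch substitutes both placeholders at once.</p>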
<img SRC="shadow.jpg" ALT="----------------" >
<h3><a NAME="API"></a><i>fxp</i>'s Programming Interface</h3>
Here is a <a href="api.ps">document</a> describing the programming interface.
<p><img SRC="shadow.jpg" ALT="----------------" >
<h3><i>fxp</i>'s feedback address</h3>
Any feedback related to fxp is welcome to: <img src="Images/email.png"> <p><img
SRC="shadow.jpg" ALT="----------------" >
<h3>Credits:</h3> The author of fxp is <a
href="mailto:neumann@psi.uni-trier.de">Andreas Neumann</a>. fxp is
maintained and updated by <a
href="../">Alexandru Berlea</a>.
</body>
</html>

BIN
fxp/doc/shadow.jpg Normal file

Binary file not shown.


97
fxp/doc/windows.html Normal file

@ -0,0 +1,97 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Installation steps for Windows</title>
</head>
<body>
<h1>Installation steps for Windows</h1>
This is a contribution by Kevin S. Millikin:
<ol>
<li>
Download the fxp sources. Save the file where you would like the fxp
source directory to reside. For instance, I just installed mine in
<code>/cygdrive/d/xml</code>.
</li>
<li>
In the Cygwin bash shell, cd to the directory containing the source
tarball and unpack it. Then cd to the directory containing
the fxp sources:
<pre>
cd /cygdrive/d/xml
tar -zxvf fxp.tar.gz
cd fxp-1.4.6
</pre>
</li>
<li>Read the <tt>COPYRIGHT</tt>.</li>
<li>
Edit the <code>Makefile</code> to reflect the situation at your site.
Change the line specifying where to install the fxp binaries and
libraries from
<pre>
<font color="red">PREFIX = /home/berlea/xmlsoft</font>
</pre>
to something like
<pre>
<font color="blue">PREFIX = /cygdrive/d/xml</font>
</pre>
Change the line specifying the path to the SML/NJ executable to the
correct path to the executable, from
<pre>
<font color="red">SML_BINDIR = /usr/share/smlnj/bin</font>
</pre>
to something like
<pre>
<font color="blue">SML_BINDIR = /cygdrive/d/smlnj-110.43/bin</font>
</pre>
Finally, the Cygwin version of <code>mkdirhier</code> is buggy. Replace
the line
<pre>
<font color="red">MKDIRHIER = mkdirhier</font>
</pre>
with
<pre>
<font color="blue">MKDIRHIER = mkdir -p</font>
</pre>
</li>
<li>
Unix versions of SML/NJ come equipped with a shell script called
<code>.arch-n-opsys</code> that reports the processor architecture and
operating system. The Win32 versions do not include this shell script.
<code>fxp</code>'s build process relies on the presence of this script.
The easiest fix is to place the following stand-in script in the same
directory as your <code>sml</code> binary:
<pre>
#!/bin/sh
#
# .arch-n-opsys -- get architecture and system info
# This file is hacked to run with the Windows version
echo "ARCH=x86; OPSYS=win32; HEAP_SUFFIX=x86-win32"
</pre>
Call this file <code>.arch-n-opsys</code>. Note that it is really a
big cheat: it reports fixed values instead of detecting anything.
Make sure that it is executable:
<pre>
chmod a+x .arch-n-opsys
</pre>
</li>
<li>
Now you can build everything:
<pre>
make
</pre>
</li>
<li>
If that succeeded, you can install the binaries and libraries:
<pre>
make install
</pre>
</li>
</ol>
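<p>Before running <code>make</code>, it may help to check that the stand-in
<code>.arch-n-opsys</code> behaves the way fxp's launcher script expects,
namely that its output can be passed to <code>eval</code>. The following is a
sketch only: it writes the stub into a scratch directory, and
<code>SML_BINDIR</code> here is a temporary stand-in for your real SML/NJ
<code>bin</code> directory.</p>

```shell
# A sketch, not part of fxp: create the stub in a scratch directory and
# evaluate its output the same way the launcher script does.
SML_BINDIR=$(mktemp -d)

cat > "$SML_BINDIR/.arch-n-opsys" <<'EOF'
#!/bin/sh
echo "ARCH=x86; OPSYS=win32; HEAP_SUFFIX=x86-win32"
EOF
chmod a+x "$SML_BINDIR/.arch-n-opsys"

# The stub prints shell assignments; eval turns them into variables.
ARCH_N_OPSYS=$("$SML_BINDIR/.arch-n-opsys") || {
    echo "unable to determine architecture/operating system" >&2
    exit 1
}
eval "$ARCH_N_OPSYS"

echo "ARCH=$ARCH OPSYS=$OPSYS HEAP_SUFFIX=$HEAP_SUFFIX"
rm -rf "$SML_BINDIR"
```

<p>If the last line prints empty values, the stub is either not executable or
does not print the three assignments on a single line.</p>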
<hr>
</body>
</html>

339
fxp/doc/working.diff Normal file

@ -0,0 +1,339 @@
diff -Naur fxp-1.4.6/Makefile fxp-win32-1.4.6/Makefile
--- fxp-1.4.6/Makefile 2003-10-09 09:48:03.000000000 -0500
+++ fxp-win32-1.4.6/Makefile 2003-10-17 13:22:17.000000000 -0500
@@ -6,7 +6,7 @@
##############################################################################
# These are the locations for executables, heap images and library files
##############################################################################
-PREFIX = /home/berlea/xmlsoft
+PREFIX = /cygdrive/d/xml
FXP_BINDIR = ${PREFIX}/bin
FXP_LIBDIR = ${PREFIX}/fxp
@@ -15,15 +15,15 @@
# SML-NJ executable with the Compilation manager built-in. If sml is in
# your PATH at execution time, you fon't need the full path here.
##############################################################################
-SML_BINDIR = /usr/share/smlnj/bin
+SML_BINDIR = /cygdrive/d/smlnj-110.43/bin
SML_EXEC = ${SML_BINDIR}/sml
##############################################################################
# No need to change this for SML-NJ 110.0.6. For earlier or working versions
# 110.19 you might have to use the second or third line. This is the
# compilation manager function for making with a named description file.
##############################################################################
-SML_MAKEDEF= val make = CM.make'
-#SML_MAKEDEF= val make = CM.make
+#SML_MAKEDEF= val make = CM.make'
+SML_MAKEDEF= val make = CM.make
#SML_MAKEDEF= fun make x = CM.make'{force_relink=true, group=x}
##############################################################################
@@ -35,7 +35,7 @@
COPY = cp -f
CHMOD = chmod
FIND = find
-MKDIRHIER = mkdirhier
+MKDIRHIER = mkdir -p
##############################################################################
# nothing to change below this line
diff -Naur fxp-1.4.6/src/Apps/Canon/canon.cm fxp-win32-1.4.6/src/Apps/Canon/canon.cm
--- fxp-1.4.6/src/Apps/Canon/canon.cm 2003-10-09 09:48:00.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Canon/canon.cm 2003-10-17 16:16:42.000000000 -0500
@@ -5,3 +5,4 @@
canonHooks.sml
canon.sml
../../fxlib.cm
+ $/basis.cm
diff -Naur fxp-1.4.6/src/Apps/Copy/copy.cm fxp-win32-1.4.6/src/Apps/Copy/copy.cm
--- fxp-1.4.6/src/Apps/Copy/copy.cm 2003-10-09 09:48:00.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Copy/copy.cm 2003-10-17 16:16:51.000000000 -0500
@@ -5,3 +5,4 @@
copyHooks.sml
copy.sml
../../fxlib.cm
+ $/basis.cm
diff -Naur fxp-1.4.6/src/Apps/Copy/copyEncode.sml fxp-win32-1.4.6/src/Apps/Copy/copyEncode.sml
--- fxp-1.4.6/src/Apps/Copy/copyEncode.sml 2003-10-09 09:48:00.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Copy/copyEncode.sml 2003-10-17 14:30:01.000000000 -0500
@@ -126,7 +126,7 @@
| _ => if c<>q andalso validChar(f,c) then putChar(f,c) else putCharRef(f,c)
val f1 = putChar(f,q)
- val f2 = Vector.foldli putOne f1 (cv,0,NONE)
+ val f2 = Vector.foldli putOne f1 cv
val f3 = putChar(f2,q)
in f3
end
diff -Naur fxp-1.4.6/src/Apps/Esis/esis.cm fxp-win32-1.4.6/src/Apps/Esis/esis.cm
--- fxp-1.4.6/src/Apps/Esis/esis.cm 2003-10-09 09:48:01.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Esis/esis.cm 2003-10-17 16:17:04.000000000 -0500
@@ -5,3 +5,4 @@
esisHooks.sml
esisData.sml
../../fxlib.cm
+ $/basis.cm
diff -Naur fxp-1.4.6/src/Apps/Null/null.cm fxp-win32-1.4.6/src/Apps/Null/null.cm
--- fxp-1.4.6/src/Apps/Null/null.cm 2003-10-09 09:47:59.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Null/null.cm 2003-10-17 16:16:57.000000000 -0500
@@ -4,3 +4,4 @@
null.sml
nullHard.sml
../../fxlib.cm
+ $/basis.cm
diff -Naur fxp-1.4.6/src/Apps/Viz/viz.cm fxp-win32-1.4.6/src/Apps/Viz/viz.cm
--- fxp-1.4.6/src/Apps/Viz/viz.cm 2003-10-09 09:48:01.000000000 -0500
+++ fxp-win32-1.4.6/src/Apps/Viz/viz.cm 2003-10-17 16:17:32.000000000 -0500
@@ -3,3 +3,4 @@
viz.sml
vizHooks.sml
../../fxlib.cm
+ $/basis.cm
diff -Naur fxp-1.4.6/src/Parser/Dfa/dfaPassTwo.sml fxp-win32-1.4.6/src/Parser/Dfa/dfaPassTwo.sml
--- fxp-1.4.6/src/Parser/Dfa/dfaPassTwo.sml 2003-10-09 09:47:55.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Dfa/dfaPassTwo.sml 2003-10-17 14:24:41.000000000 -0500
@@ -72,6 +72,6 @@
val _ = do_cm (nil,true) cmi
- in Array.extract (table,0,NONE)
+ in Array.vector table
end
end
diff -Naur fxp-1.4.6/src/Parser/Dfa/dfaString.sml fxp-win32-1.4.6/src/Parser/Dfa/dfaString.sml
--- fxp-1.4.6/src/Parser/Dfa/dfaString.sml 2003-10-09 09:47:55.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Dfa/dfaString.sml 2003-10-17 14:07:50.000000000 -0500
@@ -68,11 +68,11 @@
(fn (i,q,yet) => if q<0 then yet
else " "::Elem2String (i+lo)::"->"::State2String q::yet)
(if fin then [" [Final]"] else nil)
- (tab,0,NONE))
+ tab)
fun Dfa2String Elem2String tab =
String.concat
(Vector.foldri
(fn (q,row,yet) => State2String q::":"::Row2String Elem2String row::yet)
- nil (tab,0,NONE))
+ nil tab)
end
diff -Naur fxp-1.4.6/src/Parser/Dfa/dfaUtil.sml fxp-win32-1.4.6/src/Parser/Dfa/dfaUtil.sml
--- fxp-1.4.6/src/Parser/Dfa/dfaUtil.sml 2003-10-09 09:47:54.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Dfa/dfaUtil.sml 2003-10-17 14:22:35.000000000 -0500
@@ -124,7 +124,7 @@
val tab = Array.array(hi-lo+1,dfaError)
val _ = app (fn (q,a) => Array.update (tab,a-lo,q)) flw
in
- (lo,hi,Array.extract (tab,0,NONE),fin)
+ (lo,hi,Array.vector tab,fin)
end
end
diff -Naur fxp-1.4.6/src/Parser/Dtd/dtdAttributes.sml fxp-win32-1.4.6/src/Parser/Dtd/dtdAttributes.sml
--- fxp-1.4.6/src/Parser/Dtd/dtdAttributes.sml 2003-10-09 09:47:55.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Dtd/dtdAttributes.sml 2003-10-17 14:12:28.000000000 -0500
@@ -65,7 +65,7 @@
ord(String.sub(s,1))-65,
true))
iso639codes
- in Vector.tabulate(26,fn i => Array.extract (Array.sub(arr,i),0,NONE))
+ in Vector.tabulate(26,fn i => Array.vector (Array.sub(arr,i)))
end
(*--------------------------------------------------------------------*)
diff -Naur fxp-1.4.6/src/Parser/Params/dtd.sml fxp-win32-1.4.6/src/Parser/Params/dtd.sml
--- fxp-1.4.6/src/Parser/Params/dtd.sml 2003-10-09 09:47:56.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Params/dtd.sml 2003-10-17 14:09:57.000000000 -0500
@@ -290,10 +290,10 @@
val _ = map (fn i => Array.update(preRedef,i,false)) [1,2,3,4,5]
val _ = GenEnt2Index dtd [0wx2D] (* "-" *)
val _ = ParEnt2Index dtd [0wx2D] (* "-" *)
- val _ = Vector.appi
- (fn (_,(name,lit,cs))
- => (setGenEnt dtd (GenEnt2Index dtd name,(GE_INTERN(lit,cs),false))))
- (predefined,1,NONE)
+ val _ = VectorSlice.appi
+ (fn (_,(name,lit,cs))
+ => (setGenEnt dtd (GenEnt2Index dtd name,(GE_INTERN(lit,cs),false))))
+ (VectorSlice.slice (predefined,1,NONE))
in ()
end
diff -Naur fxp-1.4.6/src/Parser/Parse/parseContent.sml fxp-win32-1.4.6/src/Parser/Parse/parseContent.sml
--- fxp-1.4.6/src/Parser/Parse/parseContent.sml 2003-10-09 09:47:56.000000000 -0500
+++ fxp-win32-1.4.6/src/Parser/Parse/parseContent.sml 2003-10-17 14:26:30.000000000 -0500
@@ -587,7 +587,9 @@
val _ = Array.update(dataBuffer,0,c0)
fun data_hook (i,(a,q)) =
- hookData(a,((!pos0,getPos q),Array.extract(dataBuffer,0,SOME i),false))
+ hookData(a,((!pos0,getPos q),
+ ArraySlice.vector(ArraySlice.slice(dataBuffer,0,SOME i)),
+ false))
fun takeOne (c,qE,i,aq as (a,q)) =
if i<DATA_BUFSIZE then (i+1,aq) before Array.update(dataBuffer,i,c)
else let val a1 = data_hook(i,(a,qE))
diff -Naur fxp-1.4.6/src/Unicode/Chars/charClasses.sml fxp-win32-1.4.6/src/Unicode/Chars/charClasses.sml
--- fxp-1.4.6/src/Unicode/Chars/charClasses.sml 2003-10-09 09:47:58.000000000 -0500
+++ fxp-win32-1.4.6/src/Unicode/Chars/charClasses.sml 2003-10-17 13:49:08.000000000 -0500
@@ -91,7 +91,7 @@
(*--------------------------------------------------------------------*)
fun initialize(min,max) =
Array.array((Chars.toInt max-Chars.toInt min+1) div 32+1,0wx0):MutableClass
- fun finalize arr = Array.extract(arr,0,NONE)
+ fun finalize arr = Array.vector arr
(*--------------------------------------------------------------------*)
(* add a single character to a CharClass. *)
diff -Naur fxp-1.4.6/src/Unicode/Chars/uniChar.sml fxp-win32-1.4.6/src/Unicode/Chars/uniChar.sml
--- fxp-1.4.6/src/Unicode/Chars/uniChar.sml 2003-10-09 09:47:58.000000000 -0500
+++ fxp-win32-1.4.6/src/Unicode/Chars/uniChar.sml 2003-10-17 13:43:07.000000000 -0500
@@ -108,10 +108,14 @@
if len<=maxlen orelse maxlen=0
then Data2String (Vector2Data vec)
else let
- val cs1 = Vector.foldri
- (fn (_,c,cs) => c::cs) nil (vec,0,SOME (maxlen div 2))
- val cs2 = Vector.foldri
- (fn (_,c,cs) => c::cs) nil (vec,len-3-maxlen div 2,NONE)
+ val cs1 = VectorSlice.foldri
+ (fn (_,c,cs) => c::cs)
+ nil
+ (VectorSlice.slice (vec,0,SOME (maxlen div 2)))
+ val cs2 = VectorSlice.foldri
+ (fn (_,c,cs) => c::cs)
+ nil
+ (VectorSlice.slice (vec,len-3-maxlen div 2,NONE))
in Data2String cs1^"..."^Data2String cs2
end
end
diff -Naur fxp-1.4.6/src/Unicode/Uri/uriEncode.sml fxp-win32-1.4.6/src/Unicode/Uri/uriEncode.sml
--- fxp-1.4.6/src/Unicode/Uri/uriEncode.sml 2003-10-09 09:47:57.000000000 -0500
+++ fxp-win32-1.4.6/src/Unicode/Uri/uriEncode.sml 2003-10-17 13:52:57.000000000 -0500
@@ -73,7 +73,7 @@
in c2::c1:: #"%"::s
end)
s (encodeCharUtf8 c))
- nil (cv,0,NONE)
+ nil cv
in String.implode (rev revd)
end
@@ -85,7 +85,7 @@
else let val (c1,c2) = Byte2Cc (Char2Byte c)
in c2::c1:: #"%"::s
end))
- nil (cv,0,NONE)
+ nil cv
in String.implode (rev revd)
end
diff -Naur fxp-1.4.6/src/Util/SymDict/dict.sml fxp-win32-1.4.6/src/Util/SymDict/dict.sml
--- fxp-1.4.6/src/Util/SymDict/dict.sml 2003-10-09 09:47:59.000000000 -0500
+++ fxp-win32-1.4.6/src/Util/SymDict/dict.sml 2003-10-17 13:57:44.000000000 -0500
@@ -230,7 +230,7 @@
in ()
end
in
- Array.appi addTo (oldTab,0,NONE)
+ Array.appi addTo oldTab
end
(*--------------------------------------------------------------------*)
@@ -316,8 +316,8 @@
(*--------------------------------------------------------------------*)
fun printDict X2String ({desc,tab,count,...}:'a Dict) =
(print (desc^" dictionary:\n");
- Array.appi
+ ArraySlice.appi
(fn (n,(key,value)) =>
print (" "^Int.toString n^": "^Key.toString key^" = "^X2String value^"\n"))
- (!tab,0,SOME (!count)))
+ (ArraySlice.slice(!tab,0,SOME (!count))))
end
diff -Naur fxp-1.4.6/src/Util/SymDict/symbolTable.sml fxp-win32-1.4.6/src/Util/SymDict/symbolTable.sml
--- fxp-1.4.6/src/Util/SymDict/symbolTable.sml 2003-10-09 09:47:59.000000000 -0500
+++ fxp-win32-1.4.6/src/Util/SymDict/symbolTable.sml 2003-10-17 14:03:39.000000000 -0500
@@ -219,7 +219,7 @@
val _ = Array.update(newTab,i,key)
in ()
end
- val _ = Array.appi addToNew (!tab,0,NONE)
+ val _ = Array.appi addToNew (!tab)
val _ = tab := newTab
val _ = hash := newHash
@@ -300,15 +300,15 @@
(* extract the contents of a symbol table to a vector. *)
(*--------------------------------------------------------------------*)
fun extractSymTable({count,tab,...}:SymTable) =
- Array.extract(!tab,0,SOME(!count))
+ ArraySlice.vector(ArraySlice.slice(!tab,0,SOME(!count)))
(*--------------------------------------------------------------------*)
(* print the contents of the symbol table. *)
(*--------------------------------------------------------------------*)
fun printSymTable ({desc,tab,count,...}:SymTable) =
(print (desc^" table:\n");
- Array.appi
+ ArraySlice.appi
(fn (n,key) =>
print (" "^Int.toString n^": "^Key.toString key^"\n"))
- (!tab,0,SOME (!count)))
+ (ArraySlice.slice(!tab,0,SOME (!count))))
end
diff -Naur fxp-1.4.6/src/Util/intSets.sml fxp-win32-1.4.6/src/Util/intSets.sml
--- fxp-1.4.6/src/Util/intSets.sml 2003-10-09 09:47:58.000000000 -0500
+++ fxp-win32-1.4.6/src/Util/intSets.sml 2003-10-17 14:20:11.000000000 -0500
@@ -49,8 +49,8 @@
fun normalize (vec:IntSet) =
let val max = Vector.foldli
- (fn (i,w,max) => if w=0wx0 then i else max) 0 (vec,0,NONE)
- in Vector.extract (vec,0,SOME max)
+ (fn (i,w,max) => if w=0wx0 then i else max) 0 vec
+ in VectorSlice.vector(VectorSlice.slice (vec,0,SOME max))
end
val emptyIntSet = Vector.fromList nil : IntSet
@@ -88,7 +88,7 @@
val size = Vector.length vec
in
if size>idx
- then Vector.mapi (fn (i,x) => if i=idx then x||mask else x) (vec,0,NONE)
+ then Vector.mapi (fn (i,x) => if i=idx then x||mask else x) vec
else Vector.tabulate
(idx+1,fn i => if i<size then Vector.sub(vec,i) else if i=idx then mask else 0w0)
end
@@ -100,7 +100,7 @@
val vec1 = if size<=idx then vec
else let val mask = !! (0w1 << (Word.fromInt (n mod wordSize)))
in Vector.mapi
- (fn (i,x) => if i=idx then x && mask else x) (vec,0,NONE)
+ (fn (i,x) => if i=idx then x && mask else x) vec
end
in normalize vec1
end
diff -Naur fxp-1.4.6/src/Util/utilString.sml fxp-win32-1.4.6/src/Util/utilString.sml
--- fxp-1.4.6/src/Util/utilString.sml 2003-10-09 09:47:58.000000000 -0500
+++ fxp-win32-1.4.6/src/Util/utilString.sml 2003-10-17 13:38:53.000000000 -0500
@@ -230,6 +230,9 @@
if Vector.length vec=0 then pre^post
else String.concat
(pre::X2String(Vector.sub(vec,0))::
- Vector.foldri (fn (_,x,yet) => sep::X2String x::yet) [post] (vec,1,NONE))
+ VectorSlice.foldri
+ (fn (_,x,yet) => sep::X2String x::yet)
+ [post]
+ (VectorSlice.slice (vec,1,NONE)))
fun Vector2String X2String vec = Vector2xString ("#[",",","]") X2String vec
end
diff -Naur fxp-1.4.6/src/fxlib.cm fxp-win32-1.4.6/src/fxlib.cm
--- fxp-1.4.6/src/fxlib.cm 2003-10-09 09:47:59.000000000 -0500
+++ fxp-win32-1.4.6/src/fxlib.cm 2003-10-17 16:16:27.000000000 -0500
@@ -92,3 +92,4 @@
Util/SymDict/intDict.sml
Util/utilCompare.sml
config.sml
+ $/basis.cm

299
fxp/doc/working.html Normal file

@ -0,0 +1,299 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Installing fxp with SML working versions</title>
</head>
<body>
<h1>Installing fxp with SML working versions</h1>
Kevin S. Millikin has documented the changes needed in fxp's sources
in order to use it with the SML working version 110.43.
<ol>
<li>
Download <a href="fxp-1.4.6.tar.gz">fxp-1.4.6</a>;</li>
<li>
Unpack the sources, and change to the <i>fxp</i> directory:</li>
<pre>&nbsp;&nbsp;&nbsp; gunzip -c fxp-1.4.6.tar.gz | tar xf -
&nbsp;&nbsp;&nbsp; cd fxp-1.4.6</pre>
<li>The fxp sources need some modifications to
account for the changes in the SML working version.
For fxp 1.4.6, put the patch <a
href="working.diff">working.diff</a> in the fxp-1.4.6
directory and type:<br>
<pre>
patch -p1 < working.diff
</pre>
Alternatively, you can make the changes by hand as follows:
<ol type="i">
<li>
Since the SML/NJ compilation manager (CM) has changed in the
working version, the dependency on the SML Basis library
(SML's standard library) must be stated explicitly.
Add the following line
<pre>
$/basis.cm
</pre>
to the end of each of the affected CM files. Those files are:
<pre>
src/fxlib.cm
src/Apps/Canon/canon.cm
src/Apps/Copy/copy.cm
src/Apps/Esis/esis.cm
src/Apps/Null/null.cm
src/Apps/Viz/viz.cm
</pre>
</li>
<li>
The signatures for arrays and vectors in the Basis library have changed
as of version 110.43. Functions in the <code>Vector</code> and
<code>Array</code> structures that previously took a
<em>slice</em> triple now take the whole vector or array; slice operations
have moved to the <code>VectorSlice</code> and
<code>ArraySlice</code> structures. This necessitates a number of
changes.
<ol type="a">
<li>
In file <code>src/Util/utilString.sml</code>, line 233, change
<pre>
<font color="red">Vector.foldri (fn (_,x,yet) => sep::X2String x::yet) [post] (vec,1,NONE))</font>
</pre>
to
<pre>
<font color="blue">VectorSlice.foldri
(fn (_,x,yet) => sep::X2String x::yet)
[post]
(VectorSlice.slice (vec,1,NONE)))</font>
</pre>
</li>
<li>
In file <code>src/Unicode/Chars/uniChar.sml</code>, line 111, change
<pre>
<font color="red">val cs1 = Vector.foldri
(fn (_,c,cs) => c::cs) nil (vec,0,SOME (maxlen div 2))
val cs2 = Vector.foldri
(fn (_,c,cs) => c::cs) nil (vec,len-3-maxlen div 2,NONE)</font>
</pre>
to
<pre>
<font color="blue">val cs1 = VectorSlice.foldri
(fn (_,c,cs) => c::cs)
nil
(VectorSlice.slice (vec,0,SOME (maxlen div 2)))
val cs2 = VectorSlice.foldri
(fn (_,c,cs) => c::cs)
nil
(VectorSlice.slice (vec,len-3-maxlen div 2,NONE))</font>
</pre>
</li>
<li>
In file <code>src/Unicode/Chars/charClasses.sml</code>, line 94, change
<pre>
<font color="red">fun finalize arr = Array.extract(arr,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">fun finalize arr = Array.vector arr</font>
</pre>
</li>
<li>
In file <code>src/Unicode/Uri/uriEncode.sml</code>, line 76, change
<pre>
<font color="red">nil (cv,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">nil cv</font>
</pre>
and on line 88, make the same change.
</li>
<li>
In file <code>src/Util/SymDict/dict.sml</code>, line 233, change
<pre>
<font color="red">Array.appi addTo (oldTab,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">Array.appi addTo oldTab</font>
</pre>
and on line 319, change
<pre>
<font color="red">Array.appi
(fn (n,(key,value)) =>
print (" "^Int.toString n^": "^Key.toString key^" = "^X2String value^"\n"))
(!tab,0,SOME (!count))) </font>
</pre>
to
<pre>
<font color="blue">ArraySlice.appi
(fn (n,(key,value)) =>
print (" "^Int.toString n^": "^Key.toString key^" = "^X2String value^"\n"))
(ArraySlice.slice(!tab,0,SOME (!count))))</font>
</pre>
</li>
<li>
In file <code>src/Util/SymDict/symbolTable.sml</code>, line 222, change
<pre>
<font color="red">val _ = Array.appi addToNew (!tab,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">val _ = Array.appi addToNew (!tab)</font>
</pre>
and on line 303, change
<pre>
<font color="red">Array.extract(!tab,0,SOME(!count))</font>
</pre>
to
<pre>
<font color="blue">ArraySlice.vector(ArraySlice.slice(!tab,0,SOME(!count)))</font>
</pre>
and on line 310, change
<pre>
<font color="red">Array.appi
(fn (n,key) =>
print (" "^Int.toString n^": "^Key.toString key^"\n"))
(!tab,0,SOME (!count))) </font>
</pre>
to
<pre>
<font color="blue">ArraySlice.appi
(fn (n,key) =>
print (" "^Int.toString n^": "^Key.toString key^"\n"))
(ArraySlice.slice(!tab,0,SOME (!count))))</font>
</pre>
</li>
<li>
In file <code>src/Parser/Dfa/dfaString.sml</code>, line 71, change
<pre>
<font color="red">(tab,0,NONE))</font>
</pre>
to
<pre>
<font color="blue">tab)</font>
</pre>
and on line 77, change
<pre>
<font color="red">nil (tab,0,NONE))</font>
</pre>
to
<pre>
<font color="blue">nil tab)</font>
</pre>
</li>
<li>
In file <code>src/Parser/Params/dtd.sml</code>, line 293, change
<pre>
<font color="red">val _ = Vector.appi
(fn (_,(name,lit,cs))
=> (setGenEnt dtd (GenEnt2Index dtd name,(GE_INTERN(lit,cs),false))))
(predefined,1,NONE)</font>
</pre>
to
<pre>
<font color="blue">val _ = VectorSlice.appi
(fn (_,(name,lit,cs))
=> (setGenEnt dtd (GenEnt2Index dtd name,(GE_INTERN(lit,cs),false))))
(VectorSlice.slice (predefined,1,NONE))</font>
</pre>
</li>
<li>
In file <code>src/Parser/Dtd/dtdAttributes.sml</code>, line 68, change
<pre>
<font color="red">in Vector.tabulate(26,fn i => Array.extract (Array.sub(arr,i),0,NONE))</font>
</pre>
to
<pre>
<font color="blue">in Vector.tabulate(26,fn i => Array.vector (Array.sub(arr,i)))</font>
</pre>
</li>
<li>
In file <code>src/Util/intSets.sml</code>, line 53, change
<pre>
<font color="red">in Vector.extract (vec,0,SOME max)</font>
</pre>
to
<pre>
<font color="blue">in VectorSlice.vector(VectorSlice.slice (vec,0,SOME max))</font>
</pre>
and change line 52 from
<pre>
<font color="red">(fn (i,w,max) => if w=0wx0 then i else max) 0 (vec,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">(fn (i,w,max) => if w=0wx0 then i else max) 0 vec</font>
</pre>
and change line 91 from
<pre>
<font color="red">then Vector.mapi (fn (i,x) => if i=idx then x||mask else x) (vec,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">then Vector.mapi (fn (i,x) => if i=idx then x||mask else x) vec</font>
</pre>
and change line 104 from
<pre>
<font color="red">(fn (i,x) => if i=idx then x && mask else x) (vec,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">(fn (i,x) => if i=idx then x && mask else x) vec</font>
</pre>
</li>
<li>
In file <code>src/Parser/Dfa/dfaUtil.sml</code>, change line 127 from
<pre>
<font color="red">(lo,hi,Array.extract (tab,0,NONE),fin)</font>
</pre>
to
<pre>
<font color="blue">(lo,hi,Array.vector tab,fin)</font>
</pre>
</li>
<li>
In file <code>src/Parser/Dfa/dfaPassTwo.sml</code>, change line 75 from
<pre>
<font color="red">in Array.extract (table,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">in Array.vector table</font>
</pre>
</li>
<li>
In file <code>src/Parser/Parse/parseContent.sml</code>, line 590, change
<pre>
<font color="red">hookData(a,((!pos0,getPos q),Array.extract(dataBuffer,0,SOME i),false))</font>
</pre>
to
<pre>
<font color="blue">hookData(a,((!pos0,getPos q),
ArraySlice.vector(ArraySlice.slice(dataBuffer,0,SOME i)),
false))</font>
</pre>
</li>
<li>
In file <code>src/Apps/Copy/copyEncode.sml</code>, line 129, change
<pre>
<font color="red">val f2 = Vector.foldli putOne f1 (cv,0,NONE)</font>
</pre>
to
<pre>
<font color="blue">val f2 = Vector.foldli putOne f1 cv</font>
</pre>
</li></ol>
</li>
</ol>
</li>
<li>Follow the <a href="index.html#INSTALL">installation
instructions</a> for fxp starting with step 3.
</li>
</ol>
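<p>For those applying the changes by hand, the CM file edit from step i can be
scripted. The following is a sketch only, not part of fxp: the function name
<code>add_basis_cm</code> is hypothetical, and it assumes it is run from the
top of the unpacked fxp-1.4.6 directory and that every CM file already ends
with a newline.</p>

```shell
# A sketch, not part of fxp: append "$/basis.cm" to each CM description
# file given as an argument, skipping files that already mention it.
add_basis_cm() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping missing file: $f" >&2
            continue
        fi
        # -F: fixed string, so "$" and "." are not treated as regex syntax.
        if grep -qF '$/basis.cm' "$f"; then
            echo "already present: $f"
        else
            printf '  $/basis.cm\n' >> "$f"
            echo "updated: $f"
        fi
    done
}

# The six CM files named in step i.
add_basis_cm \
    src/fxlib.cm \
    src/Apps/Canon/canon.cm \
    src/Apps/Copy/copy.cm \
    src/Apps/Esis/esis.cm \
    src/Apps/Null/null.cm \
    src/Apps/Viz/viz.cm
```

<p>Because of the <code>grep</code> check, running it twice does not add the
line a second time. The slice-related edits in step ii still have to be made
by hand or with the patch.</p>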
<hr>
</body> </html>

13
fxp/fxp.sh.in Executable file

@ -0,0 +1,13 @@
ARCH_N_OPSYS=`${SML_BINDIR}/.arch-n-opsys`
if [ "$?" != "0" ]; then
echo "$0: unable to determine architecture/operating system"
exit 1
fi
eval ${ARCH_N_OPSYS}
PROG=`basename $0`
HEAP=${FXP_LIBDIR}/_${PROG}
RUN=${SML_BINDIR}/.run/run.${ARCH}-${OPSYS}
exec ${RUN} @SMLcmdname=$0 @SMLdebug=/dev/null @SMLload=${HEAP} "$@"


@ -0,0 +1,8 @@
Group is
canonOptions.sml
canonEncode.sml
canonOutput.sml
canonHooks.sml
canon.sml
../../fxlib.cm
$/basis.cm


@ -0,0 +1,15 @@
ann
"warnMatch true"
"sequenceUnit true"
in
local
$(MLTON_ROOT)/basis/basis.mlb
../../fxlib.mlb
in
canonOptions.sml
canonEncode.sml
canonOutput.sml
canonHooks.sml
canon.sml
end
end


@ -0,0 +1,81 @@
structure Canon =
struct
structure ParserOptions = ParserOptions ()
structure CatOptions = CatOptions ()
structure CatParams =
struct
open CatError CatOptions CanonOptions Uri UtilError
fun catError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Error in catalog:"::catMessage err))
end
structure Resolve = ResolveCatalog (structure Params = CatParams)
structure CanonHooks = CanonHooks (structure ParserOptions = ParserOptions)
structure ParseCanon = Parse (structure Dtd = Dtd
structure Hooks = CanonHooks
structure Resolve = Resolve
structure ParserOptions = ParserOptions)
open
CatOptions CanonOptions CanonHooks Options ParserOptions Uri
val usage = List.concat [parserUsage,[U_SEP],catalogUsage,[U_SEP],canonUsage]
exception Exit of OS.Process.status
fun canon(prog,args) =
let
val prog = "fxcanon"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setCanonDefaults()
val (vers,help,err,file) = setCanonOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val dtd = initDtdTables()
val status = ParseCanon.parseDocument uri (SOME dtd) (canonStart dtd)
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
in status
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
val _ = Canon.canon(CommandLine.name (), CommandLine.arguments ())


@ -0,0 +1,70 @@
signature CanonEncode =
sig
type File
val noFile : File
val openFile : unit -> File
val closeFile : File -> unit
val putBlank : File -> File
val putChar : File * UniChar.Char -> File
val putData : File * UniChar.Data -> File
val putVector : File * UniChar.Vector -> File
val putDataChar : File * UniChar.Char -> File
val putDataVector : File * UniChar.Vector -> File
end
structure CanonEncode : CanonEncode =
struct
open
CanonOptions Encode UniChar UniClasses UtilError
fun decodeError err = if !O_SILENT then () else
TextIO.output(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
("Encoding error:"::encodeMessage err))
type File = EncFile
val noFile = encNoFile
fun openFile() = encOpenFile (!O_OUTPUT_FILE,Encoding.UTF8,"UTF-8")
handle NoSuchFile (f,msg) => noFile before
TextIO.output(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
["Cannot open file '"^f^"' for writing. ("^msg^")"])
val closeFile = encCloseFile
val validChar = encValidChar
fun putChar(enc,c) = encPutChar(enc,c)
handle EncodeError(f,msg) => encAdapt(enc,f) before decodeError msg
fun putData(enc,cs) = foldl (fn (c,enc) => putChar(enc,c)) enc cs
fun putVector(enc,cv) = Vector.foldl (fn (c,enc) => putChar(enc,c)) enc cv
fun putBlank f = putChar(f,0wx20)
fun putDataChar (f,c) =
case c
of 0wx09 => putData(f,[0wx26,0wx23,0wx39,0wx3b]) (* "&#9;" *)
| 0wx0A => putData(f,[0wx26,0wx23,0wx31,0wx30,0wx3b]) (* "&#10;" *)
| 0wx0D => putData(f,[0wx26,0wx23,0wx31,0wx33,0wx3b]) (* "&#13;" *)
| 0wx22 => putData(f,[0wx26,0wx71,0wx75,0wx6f,0wx74,0wx3b]) (* "&quot;" *)
| 0wx26 => putData(f,[0wx26,0wx61,0wx6d,0wx70,0wx3b]) (* "&amp;" *)
| 0wx3C => putData(f,[0wx26,0wx6c,0wx74,0wx3b]) (* "&lt;" *)
| 0wx3E => putData(f,[0wx26,0wx67,0wx74,0wx3b]) (* "&gt;" *)
| _ => putChar(f,c)
fun putDataVector (f,cv) =
Vector.foldl (fn (c,f) => putDataChar(f,c)) f cv
end


@ -0,0 +1,79 @@
functor CanonHooks (structure ParserOptions : ParserOptions) =
struct
structure CanonOutput = CanonOutput (structure ParserOptions = ParserOptions)
open
Base Dtd
CanonEncode CanonOptions CanonOutput
datatype Where = SOMEWHERE | SUBSET | CONTENT of int | REFERENCE of Where
type AppData = Dtd * OS.Process.status * CanonEncode.File * Where
type AppFinal = OS.Process.status
fun canonStart dtd = (dtd,OS.Process.success,CanonEncode.noFile,SOMEWHERE)
fun hookXml ((dtd,status,_,wher),_) = (dtd,status,openFile(),wher)
fun hookFinish (a as (_,status,f,_)) = status before closeFile f
fun hookError ((dtd,status,f,wher),err) = (dtd,OS.Process.failure,f,wher)
before printError err
fun hookWarning (a,warn) = a before printWarning warn
fun hookProcInst (a as (dtd,status,f,wher),pi) =
case wher
of REFERENCE _ => a
| _ => (dtd,status,putProcInst(f,pi),wher)
fun hookWhite (a as (dtd,status,f,wher),ws) =
case wher
of CONTENT _ => (dtd,status,putDataVector(f,ws),wher)
| _ => a
fun hookDecl (a,_) = a
fun hookComment (a,_) = a
fun hookStartTag (a as (dtd,status,f,wher),stag) =
case wher
of CONTENT level => (dtd,status,putStartTag dtd (f,stag),
CONTENT(if #5 stag then level else level+1))
| SOMEWHERE => (dtd,status,putStartTag dtd (f,stag),
if #5 stag then wher else CONTENT 1)
| _ => a
fun hookEndTag (a as (dtd,status,f,wher),etag) =
case wher
of CONTENT level => (dtd,status,putEndTag dtd (f,etag),
if level>1 then CONTENT(level-1) else SOMEWHERE)
| _ => a
fun hookCData (a as (dtd,status,f,wher),(_,text)) =
case wher
of CONTENT _ => (dtd,status,putDataVector(f,text),wher)
| _ => a
fun hookData (a,(pp,text,ignore)) = hookCData(a,(pp,text))
fun hookCharRef (a,(pp,ch,_)) = hookCData(a,(pp,Vector.fromList [ch]))
fun hookGenRef (a as (dtd,status,f,wher),(_,_,_,included)) =
case wher
of REFERENCE _ => if included then (dtd,status,f,REFERENCE wher) else a
| _ => a
fun hookParRef (a as (dtd,status,f,wher),(_,_,_,included)) =
if included then (dtd,status,f,REFERENCE wher) else a
fun hookEntEnd (a as (dtd,status,f,wher),_) =
case wher
of REFERENCE wher1 => (dtd,status,f,wher1)
| _ => a
fun hookDocType (a,_) = a
fun hookSubset (a as (dtd,status,f,wher),_) =
case wher
of SOMEWHERE => (dtd,status,f,SUBSET)
| _ => a
fun hookExtSubset (a as (dtd,status,f,wher),_) =
case wher
of SOMEWHERE => (dtd,status,f,SUBSET)
| _ => a
fun hookEndDtd(a as (dtd,status,f,wher),_) =
case wher
of SUBSET => (dtd,status,f,SOMEWHERE)
| REFERENCE wher => (dtd,status,f,wher)
| _ => a
end


@ -0,0 +1,106 @@
signature CanonOptions =
sig
val O_SILENT : bool ref
val O_ERROR_DEVICE : TextIO.outstream ref
val O_ERROR_LINEWIDTH : int ref
val O_OUTPUT_FILE : string ref
val setCanonDefaults : unit -> unit
val setCanonOptions : Options.Option list * (string -> unit)
-> bool * bool * string option * string option
val canonUsage : Options.Usage
end
structure CanonOptions : CanonOptions =
struct
open Options UtilList
val O_SILENT = ref false
val O_ERROR_DEVICE = ref TextIO.stdErr
val O_ERROR_LINEWIDTH = ref 80
val O_OUTPUT_FILE = ref "-"
fun setCanonDefaults () =
let
val _ = O_SILENT := false
val _ = O_ERROR_DEVICE := TextIO.stdErr
val _ = O_OUTPUT_FILE := "-"
in ()
end
val canonUsage =
[U_ITEM(["-o <file>","--output=<file>"],"Write output to file (stdout)"),
U_ITEM(["-s","--silent"],"Suppress reporting of errors and warnings"),
U_ITEM(["-e <file>","--error-output=<file>"],"Redirect errors to file (stderr)"),
U_SEP,
U_ITEM(["--version"],"Print the version number and exit"),
U_ITEM(["-?","--help"],"Print this text and exit"),
U_ITEM(["--"],"Do not recognize remaining arguments as options")
]
fun setCanonOptions (opts,optError) =
let
fun onlyOne what = "at most one "^what^" may be specified"
fun unknown pre opt = String.concat ["unknown option ",pre,opt]
fun hasNoArg pre key = String.concat ["option ",pre,key," expects no argument"]
fun mustHave pre key = String.concat ["option ",pre,key," must have an argument"]
fun check_noarg(key,valOpt) =
if isSome valOpt then optError (hasNoArg "--" key) else ()
fun do_long (pars as (v,h,e,f)) (key,valOpt) =
case key
of "help" => (v,true,e,f) before check_noarg(key,valOpt)
| "version" => (true,h,e,f) before check_noarg(key,valOpt)
| "silent" => pars before O_SILENT := true before check_noarg(key,valOpt)
| "output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => pars before O_OUTPUT_FILE := s)
| "error-output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => (v,h,SOME s,f))
| _ => pars before optError(unknown "--" key)
fun do_short (pars as (v,h,e,f)) (cs,opts) =
case cs
of nil => doit pars opts
| [#"o"] => (case opts
of OPT_STRING s::opts1 => (O_OUTPUT_FILE := s;
doit pars opts1)
| _ => (optError (mustHave "-" "o"); doit pars opts))
| [#"e"] => (case opts
of OPT_STRING s::opts1 => doit (v,h,SOME s,f) opts1
| _ => (optError (mustHave "-" "e"); doit pars opts))
| cs => doit (foldr
(fn (c,pars)
=> case c
of #"s" => pars before O_SILENT := true
| #"o" => pars before optError (mustHave "-" "o")
| #"e" => pars before optError (mustHave "-" "e")
| #"?" => (v,true,e,f)
| c => pars before
optError(unknown "-" (String.implode [c])))
pars cs) opts
and doit pars nil = pars
| doit (pars as (v,h,e,f)) (opt::opts) =
case opt
of OPT_LONG(key,valOpt) => doit (do_long pars (key,valOpt)) opts
| OPT_SHORT cs => do_short pars (cs,opts)
| OPT_STRING s => if isSome f
then let val _ = optError(onlyOne "input file")
in doit pars opts
end
else doit (v,h,e,SOME s) opts
| OPT_NOOPT => doit pars opts
| OPT_NEG cs => let val _ = if null cs then ()
else app (fn c => optError
(unknown "-n" (String.implode[c]))) cs
in doit pars opts
end
in doit (false,false,NONE,NONE) opts
end
end

@ -0,0 +1,75 @@
signature CanonOutput =
sig
type Dtd
val printError : Errors.Position * Errors.Error -> unit
val printWarning : Errors.Position * Errors.Warning -> unit
val putProcInst : CanonEncode.File * HookData.ProcInstInfo -> CanonEncode.File
val putStartTag : Dtd -> CanonEncode.File * HookData.StartTagInfo -> CanonEncode.File
val putEndTag : Dtd -> CanonEncode.File * HookData.EndTagInfo -> CanonEncode.File
end
functor CanonOutput (structure ParserOptions : ParserOptions) : CanonOutput =
struct
open
Base UniChar DfaData Dtd Errors ParserOptions
CanonEncode CanonOptions HookData
fun printError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos
::(if isFatalError err then "Fatal error:" else "Error:")
::errorMessage err))
fun printWarning(pos,warn) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Warning:"::warningMessage warn))
fun putAttValue (f,cv) =
let
val f1 = putChar(f,0wx22) (* opening double quote *)
val f2 = putDataVector(f1,cv)
val f3 = putChar(f2,0wx22) (* closing double quote *)
in f3
end
fun putAttSpec(f,(att,cv)) =
let
val f1 = putBlank f
val f2 = putData(f1,att)
val f3 = putChar(f2,0wx3D) (* #"=" *)
val f4 = putAttValue(f3,cv)
in f4
end
fun putEndTag dtd (f,(_,eidx,_)) =
let
val f1 = putData(f,[0wx3C,0wx2F]) (* "</" *)
val f2 = putData(f1,Index2Element dtd eidx)
in putChar(f2,0wx3E) (* #">" *)
end
fun putStartTag dtd (f,(pp,eidx,atts,_,mt)) =
let
val f1 = putChar(f,0wx3C) (* #"<" *)
val f2 = putData(f1,Index2Element dtd eidx)
val atts1 = List.mapPartial
(fn (aidx,ap,_) =>
case ap
of AP_PRESENT(_,cv,_) => SOME(Index2AttNot dtd aidx,cv)
| AP_DEFAULT(_,cv,_) => SOME(Index2AttNot dtd aidx,cv)
| _ => NONE) atts
val atts2 = UtilList.sort (fn ((a1,_),(a2,_)) => compareData (a1,a2)) atts1
val f3 = foldl (fn (spec,f) => putAttSpec (f,spec)) f2 atts2
val f4 = putChar(f3,0wx3E) (* #">" *)
in if mt then putEndTag dtd (f4,(pp,eidx,NONE)) else f4
end
fun putProcInst(f,(_,target,_,text)) =
let
val f1 = putData(f,[0wx3c,0wx3f]) (* "<?" *)
val f2 = putData(f1,target)
val f3 = putBlank f2
val f4 = putVector(f3,text)
in putData(f4,[0wx3f,0wx3e]) (* "?>" *)
end
end

@ -0,0 +1,8 @@
Group is
copyEncode.sml
copyOptions.sml
copyOutput.sml
copyHooks.sml
copy.sml
../../fxlib.cm
$/basis.cm

@ -0,0 +1,15 @@
ann
"warnMatch true"
"sequenceUnit true"
in
local
$(MLTON_ROOT)/basis/basis.mlb
../../fxlib.mlb
in
copyOptions.sml
copyEncode.sml
copyOutput.sml
copyHooks.sml
copy.sml
end
end

@ -0,0 +1,81 @@
structure Copy =
struct
structure ParserOptions = ParserOptions ()
structure CatOptions = CatOptions ()
structure CatParams =
struct
open CatError CatOptions CopyOptions Uri UtilError
fun catError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Error in catalog:"::catMessage err))
end
structure Resolve = ResolveCatalog (structure Params = CatParams)
structure CopyHooks = CopyHooks (structure ParserOptions = ParserOptions)
structure ParseCopy = Parse (structure Dtd = Dtd
structure Hooks = CopyHooks
structure Resolve = Resolve
structure ParserOptions = ParserOptions)
open
CatOptions CopyOptions Options ParserOptions Uri
val usage = List.concat [parserUsage,[U_SEP],catalogUsage,[U_SEP],copyUsage]
exception Exit of OS.Process.status
fun copy(prog,args) =
let
val prog = "fxcopy"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setCopyDefaults()
val (vers,help,err,file) = setCopyOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val dtd = Dtd.initDtdTables()
val status = ParseCopy.parseDocument uri (SOME dtd) (CopyHooks.copyStart dtd)
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
in status
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
val _ = Copy.copy(CommandLine.name (), CommandLine.arguments ())

@ -0,0 +1,135 @@
signature CopyEncode =
sig
type File
val noFile : File
val openFile : string * Encoding.Encoding * string -> File
val closeFile : File -> unit
val putBlank : File -> File
val putNl : File -> File
val putChar : File * UniChar.Char -> File
val putAttChar : File * UniChar.Char -> File
val putDataChar : File * UniChar.Char -> File
val putData : File * UniChar.Data -> File
val putAttData : File * UniChar.Data -> File
val putDataData : File * UniChar.Data -> File
val putVector : File * UniChar.Vector -> File
val putAttVector : File * UniChar.Vector -> File
val putDataVector : File * UniChar.Vector -> File
val putEntVector : File * UniChar.Vector -> File
val putAttValue : File * UniChar.Vector * UniChar.Char -> File
val putEntValue : bool -> File * UniChar.Vector * UniChar.Char -> File
val putString : File * string -> File
val putCharRef : File * UniChar.Char -> File
val putGenRef : File * UniChar.Data -> File
val putParRef : File * UniChar.Data -> File
end
functor CopyEncode (structure ParserOptions : ParserOptions) : CopyEncode =
struct
open
CopyOptions Encode ParserOptions UniChar UniClasses UtilError
fun encodeError err = if !O_SILENT then () else
TextIO.output(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
("Encoding error:"::encodeMessage err))
type File = EncFile
val noFile = encNoFile
fun openFile fe = encOpenFile fe
handle NoSuchFile (f,msg) => noFile before
TextIO.output(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
["Cannot open file '"^f^"' for writing. ("^msg^")"])
val closeFile = encCloseFile
val validChar = encValidChar
fun putChar(enc,c) = encPutChar(enc,c)
handle EncodeError(f,msg) => encAdapt(enc,f) before encodeError msg
fun putData(enc,cs) = foldl (fn (c,enc) => putChar(enc,c)) enc cs
fun putVector(enc,cv) = Vector.foldl (fn (c,enc) => putChar(enc,c)) enc cv
fun putNl f = putChar(f,0wx0A)
fun putBlank f = putChar(f,0wx20)
val hexDigits = Vector.tabulate(16,fn i => Chars.fromInt((if i<10 then 48 else 55)+i))
fun hexDigit n = Vector.sub(hexDigits,Chars.toInt n)
fun charRefSeq c =
if c=0wx00 then [0wx26,0wx23,0wx78,0wx30,0wx3b] (* "&#x0;" *)
else let fun mk_hex yet n = if n=(0w0:Char) then yet
else mk_hex (hexDigit(n mod 0w16)::yet) (n div 0w16)
in 0wx26::0wx23::0wx78::mk_hex [0wx3b] c
end
fun putCharRef(f,c) = putData(f,charRefSeq c)
fun putGenRef(f,ent) = let val f1 = putChar(f,0wx26)
val f2 = putData(f1,ent)
val f3 = putChar(f2,0wx3b)
in f3
end
fun putParRef(f,ent) = let val f1 = putChar(f,0wx25)
val f2 = putData(f1,ent)
val f3 = putChar(f2,0wx3b)
in f3
end
fun putAttChar(f,c) =
case c
of 0wx26 => putData(f,[0wx26,0wx61,0wx6d,0wx70,0wx3b]) (* "&amp;" *)
| 0wx3C => putData(f,[0wx26,0wx6c,0wx74,0wx3b]) (* "&lt;" *)
| _ => if validChar(f,c) then putChar(f,c) else putCharRef(f,c)
fun putDataChar(f,c) =
case c
of 0wx26 => putData(f,[0wx26,0wx61,0wx6d,0wx70,0wx3b]) (* "&amp;" *)
| 0wx3C => putData(f,[0wx26,0wx6c,0wx74,0wx3b]) (* "&lt;" *)
| 0wx3E => if !O_COMPATIBILITY
then putData(f,[0wx26,0wx67,0wx74,0wx3b]) (* "&gt;" *)
else putChar(f,c)
| _ => if validChar(f,c) then putChar(f,c) else putCharRef(f,c)
fun putAttData(f,cs) =
foldl (fn (c,f) => putAttChar(f,c)) f cs
fun putDataData(f,cs) =
foldl (fn (c,f) => putDataChar(f,c)) f cs
fun putAttVector(f,cv) =
Vector.foldl (fn (c,f) => if validChar(f,c) then putChar(f,c) else putCharRef(f,c)) f cv
fun putDataVector(f,cv) =
Vector.foldl (fn (c,f) => putDataChar(f,c)) f cv
fun putEntVector (f,cv) =
Vector.foldl (fn (c,f) => if validChar(f,c) then putChar(f,c) else putCharRef(f,c)) f cv
fun putAttValue (f,cv,q) =
let
fun putOne(c,f) =
case c
of 0wx26 => putData(f,[0wx26,0wx61,0wx6d,0wx70,0wx3b]) (* "&amp;" *)
| 0wx3C => putData(f,[0wx26,0wx6c,0wx74,0wx3b]) (* "&lt;" *)
| _ => if c<>q andalso validChar(f,c) then putChar(f,c) else putCharRef(f,c)
val f1 = putChar(f,q)
val f2 = Vector.foldl putOne f1 cv
val f3 = putChar(f2,q)
in f3
end
fun putEntValue escapeParRef (f,cv,q) =
let
fun putOne(i,c,f) =
case c
of 0wx25 => if escapeParRef then putCharRef(f,c) else putChar(f,c)
| 0wx26 => if i+1<Vector.length cv andalso isNms(Vector.sub(cv,i+1))
then putChar(f,c) else putCharRef(f,c)
| _ => if c<>q andalso validChar(f,c) then putChar(f,c) else putCharRef(f,c)
val f1 = putChar(f,q)
val f2 = Vector.foldli putOne f1 cv
val f3 = putChar(f2,q)
in f3
end
fun putString(f,str) = putData(f,String2Data str)
end

@ -0,0 +1,153 @@
functor CopyHooks (structure ParserOptions : ParserOptions) =
struct
structure CopyOutput = CopyOutput (structure ParserOptions = ParserOptions)
open
Base Dtd Encoding
CopyOptions CopyOutput
datatype Where = SOMEWHERE | SUBSET | CONTENT of int | REFERENCE of Where
type AppData = OS.Process.status * File * Dtd * Where
type AppFinal = OS.Process.status
fun copyStart dtd = (OS.Process.success,noFile,dtd,SOMEWHERE)
fun hookXml ((status,_,dtd,wher),(fname,enc,xmlDecl)) =
let val (vers,_,stand) = case xmlDecl
of SOME ves => ves
| NONE => (NONE,NONE,NONE)
val (outEnc,outEncName) = case !O_OUTPUT_ENCODING
of NONE => (enc,encodingName enc)
| SOME x => (NOENC,x)
val f = openFile(!O_OUTPUT_FILE,outEnc,outEncName)
in (status,putXmlDecl(f,(vers,SOME outEncName,stand)),dtd,wher)
end
fun hookFinish (a as (status,f,_,_)) = status before closeFile f
fun hookError ((status,f,dtd,wher),err) = (OS.Process.failure,f,dtd,wher)
before printError err
fun hookWarning (a,warn) = a before printWarning warn
fun hookProcInst (a as (status,f,dtd,wher),pi) =
case wher
of REFERENCE _ => a
| _ => (status,putProcInst(f,pi),dtd,wher)
fun hookComment (a as (status,f,dtd,wher),com) =
case wher
of REFERENCE _ => a
| _ => (status,putComment(f,com),dtd,wher)
fun hookWhite (a as (status,f,dtd,wher),ws) =
case wher
of REFERENCE _ => a
| _ => (status,putVector(f,ws),dtd,wher)
fun hookDecl (a as (status,f,dtd,wher),decl) =
case wher
of SUBSET => (status,putDecl(f,dtd,decl),dtd,wher)
| _ => a
fun hookStartTag (a as (status,f,dtd,wher),stag) =
case wher
of CONTENT level => let val f1 = putStartTag(f,dtd,stag)
val level1 = if #5 stag then level else level+1
in (status,f1,dtd,CONTENT level1)
end
| SOMEWHERE => let val f1 = putStartTag(f,dtd,stag)
in if #5 stag then (status,putNl f1,dtd,wher)
else (status,f1,dtd,CONTENT 1)
end
| _ => a
fun hookEndTag (a as (status,f,dtd,wher),etag) =
case wher
of CONTENT level => let val f1 = putEndTag(f,dtd,etag)
val (f2,wher1) = if level>1 then (f1,CONTENT(level-1))
else (putNl f1,SOMEWHERE) (* newline after the root element, as for an empty root *)
in (status,f2,dtd,wher1)
end
| _ => a
fun hookData (a as (status,f,dtd,wher),(_,text,_)) =
case wher
of CONTENT _ => (status,putDataVector(f,text),dtd,wher)
| _ => a
fun hookCData (a as (status,f,dtd,wher),(_,text)) =
case wher
of CONTENT _ => (status,putCData(f,text),dtd,wher)
| _ => a
fun hookCharRef (a as (status,f,dtd,wher),(_,ch,cv)) =
case wher
of CONTENT _ =>
let val f1 = if !O_EXPAND_CONT_CHAR
then putDataChar(f,ch) else putVector(f,cv)
in (status,f1,dtd,wher)
end
| _ => a
fun hookGenRef (a as (status,f,dtd,wher),(_,idx,ent,included)) =
case wher
of CONTENT l =>
if not included then (status,putGenRef(f,Index2GenEnt dtd idx),dtd,wher)
else if (if isExtGen ent then !O_EXPAND_CONT_EXT else !O_EXPAND_CONT_INT) then a
else (status,putGenRef(f,Index2GenEnt dtd idx),dtd,REFERENCE wher)
| REFERENCE _ => if included then (status,f,dtd,REFERENCE wher) else a
| _ => a
fun hookParRef (a as (status,f,dtd,wher),(_,idx,ent,included)) =
case wher
of SUBSET =>
if not included then (status,putNl(putParRef(f,Index2ParEnt dtd idx)),dtd,wher)
else if (if isExtPar ent then !O_EXPAND_SUBSET_EXT else !O_EXPAND_SUBSET_INT) then a
else (status,putNl(putParRef(f,Index2ParEnt dtd idx)),dtd,REFERENCE wher)
| REFERENCE _ => if included then (status,f,dtd,REFERENCE wher) else a
| _ => if included then (status,f,dtd,REFERENCE wher) else a
fun hookEntEnd (a as (status,f,dtd,wher),_) =
case wher
of REFERENCE wher1 => (status,f,dtd,wher1)
| _ => a
fun hookDocType (a as (status,f,dtd,wher),(didx,extId)) =
case wher
of SOMEWHERE =>
let
val f1 = putData(f,startDtd)
val f2 = putData(f1,Index2Element dtd didx)
val f3 = if !O_INCLUDE_EXT_SUBSET then f2
else case extId
of NONE => f2
| SOME x => putExtId(putBlank f2,x)
in (status,f3,dtd,wher)
end
| _ => a
fun hookSubset (a as (status,f,dtd,wher),_) =
case wher
of SOMEWHERE => (status,putData(f,[0wx20,0wx5B]),dtd,SUBSET) (* " [" *)
| _ => a
fun hookExtSubset (a as (status,f,dtd,wher),_) =
if !O_INCLUDE_EXT_SUBSET
then case wher
of SOMEWHERE => (status,putData(f,[0wx20,0wx5B,0wx0A]),dtd,SUBSET) (* " [\n" *)
| SUBSET => (status,putChar(f,0wx0A),dtd,SUBSET)
| _ => a
else case wher
of SOMEWHERE => (status,f,dtd,REFERENCE wher)
| SUBSET => (status,putChar(f,0wx5D),dtd,REFERENCE SOMEWHERE) (* #"]" *)
| _ => a
fun hookEndDtd(a as (status,f,dtd,wher),_) =
case wher
of SOMEWHERE => (status,putData(f,endDecl),dtd,wher)
| SUBSET => (status,putData(putChar(f,0wx5D),endDecl),dtd,SOMEWHERE)
| REFERENCE wher => (status,putData(f,endDecl),dtd,wher)
| _ => a
end

@ -0,0 +1,270 @@
signature CopyOptions =
sig
val O_SILENT : bool ref
val O_ERROR_DEVICE : TextIO.outstream ref
val O_ERROR_LINEWIDTH : int ref
val O_EXPAND_CONT_CHAR : bool ref
val O_EXPAND_CONT_EXT : bool ref
val O_EXPAND_CONT_INT : bool ref
val O_EXPAND_SUBSET_EXT : bool ref
val O_EXPAND_SUBSET_INT : bool ref
val O_INCLUDE_EXT_SUBSET : bool ref
val O_EXPAND_ENT_VAL : bool ref
val O_EXPAND_ATT_VAL : bool ref
val O_OUTPUT_ENCODING : string option ref
val O_OUTPUT_FILE : string ref
val setCopyDefaults : unit -> unit
val setCopyOptions : Options.Option list * (string -> unit)
-> bool * bool * string option * string option
val copyUsage : Options.Usage
end
structure CopyOptions : CopyOptions =
struct
open Options UtilList
val O_SILENT = ref false
val O_ERROR_DEVICE = ref TextIO.stdErr
val O_ERROR_LINEWIDTH = ref 80
val O_EXPAND_CONT_CHAR = ref false
val O_EXPAND_CONT_EXT = ref false
val O_EXPAND_CONT_INT = ref false
val O_EXPAND_SUBSET_EXT = ref false
val O_EXPAND_SUBSET_INT = ref false
val O_INCLUDE_EXT_SUBSET = ref false
val O_EXPAND_ENT_VAL = ref false
val O_EXPAND_ATT_VAL = ref false
val O_OUTPUT_ENCODING = ref NONE : string option ref
val O_OUTPUT_FILE = ref "-"
fun setCopyDefaults () =
let
val _ = O_SILENT := false
val _ = O_ERROR_DEVICE := TextIO.stdErr
val _ = O_EXPAND_CONT_CHAR := false
val _ = O_EXPAND_CONT_EXT := false
val _ = O_EXPAND_CONT_INT := false
val _ = O_EXPAND_SUBSET_EXT := false
val _ = O_EXPAND_SUBSET_INT := false
val _ = O_INCLUDE_EXT_SUBSET := false
val _ = O_EXPAND_ENT_VAL := false
val _ = O_EXPAND_ATT_VAL := false
val _ = O_OUTPUT_ENCODING := NONE
val _ = O_OUTPUT_FILE := "-"
in ()
end
val copyUsage =
[U_ITEM(["-o <file>","--output=<file>"],"Write output to file (stdout)"),
U_ITEM(["-s","--silent"],"Suppress reporting of errors and warnings"),
U_ITEM(["-e <file>","--error-output=<file>"],"Redirect errors to file (stderr)"),
U_SEP,
U_ITEM(["--expand-refs-content[=(yes|no|<key list>)]"],
"Controls expansion of entity references in content (no)"),
U_ITEM(["--expand-refs-subset[=(no|yes|int|ext)]"],
"Controls expansion of entity references in the DTD (no)"),
U_ITEM(["--expand-ext-subset[=(yes|no)]"],"Expand external subset (no)"),
U_ITEM(["--expand-att-vals[=(yes|no)]"],"Expand and normalize attribute values (no)"),
U_ITEM(["--expand-ent-vals[=(yes|no)]"],"Expand entity values (no)"),
U_SEP,
U_ITEM(["--expand=yes"],"Expand everything"),
U_ITEM(["--expand=no"],"Expand nothing"),
U_ITEM(["--expand=int"],"Expand internal and character entities"),
U_ITEM(["--expand=ext"],"Expand external entities and subset"),
U_SEP,
U_ITEM(["--version"],"Print the version number and exit"),
U_ITEM(["-?","--help"],"Print this text and exit"),
U_ITEM(["--"],"Do not recognize remaining arguments as options")
]
fun setCopyOptions (opts,optError) =
let
exception Failed
val yesNoList= "'yes', 'no' or a list of 'char', 'ext' and 'int'"
val yesNoExtInt = "'yes', 'no', 'ext' or 'int'"
val yesNo = "'yes' or 'no'"
fun onlyOne what = "at most one "^what^" may be specified"
fun unknown pre opt = String.concat ["unknown option ",pre,opt]
fun hasNoArg pre key = String.concat ["option ",pre,key," expects no argument"]
fun mustHave pre key = String.concat ["option ",pre,key," must have an argument"]
fun errorMustBe(key,what) = String.concat ["the argument to --",key," must be ",what]
fun check_noarg(key,valOpt) =
if isSome valOpt then optError (hasNoArg "--" key) else ()
fun do_long (pars as (v,h,e,f)) (key,valOpt) =
case key
of "help" => (v,true,e,f) before check_noarg(key,valOpt)
| "version" => (true,h,e,f) before check_noarg(key,valOpt)
| "silent" => pars before O_SILENT := true before check_noarg(key,valOpt)
| "output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => pars before O_OUTPUT_FILE := s)
| "output-encoding" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => pars before O_OUTPUT_ENCODING := SOME s)
| "error-output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => (v,h,SOME s,f))
| "expand-refs-content" =>
let
datatype what = CHAR | EXT | INT
fun setFlags whats =
let
val _ = O_EXPAND_CONT_CHAR := member CHAR whats
val _ = O_EXPAND_CONT_EXT := member EXT whats
val _ = O_EXPAND_CONT_INT := member INT whats
in ()
end
val _ = case valOpt
of NONE => setFlags [CHAR,EXT,INT]
| SOME "yes" => setFlags [CHAR,EXT,INT]
| SOME "no" => setFlags nil
| SOME s => let val fields = String.fields (fn c => #","=c) s
val whats = foldr
(fn(f,yet)=>
case f
of "char" => CHAR::yet
| "ext" => EXT::yet
| "int" => INT::yet
| _ => yet before optError
(errorMustBe(key,yesNoList)))
nil fields
in setFlags whats
end
in pars
end
| "expand-refs-subset" =>
(let val (ext,int) = case valOpt
of NONE => (true,true)
| SOME "yes" => (true,true)
| SOME "no" => (false,false)
| SOME "ext" => (true,false)
| SOME "int" => (false,true)
| SOME _ => raise Failed
val _ = O_EXPAND_SUBSET_EXT := ext
val _ = O_EXPAND_SUBSET_INT := int
val _ = if ext then (O_EXPAND_ENT_VAL := true) else ()
in pars
end
handle Failed => pars before optError (errorMustBe(key,yesNoExtInt)) )
| "expand-ent-vals" =>
(let val exp = case valOpt
of NONE => true
| SOME "yes" => true
| SOME "no" => false
| SOME _ => raise Failed
val _ = O_EXPAND_ENT_VAL := exp
in pars
end
handle Failed => pars before optError (errorMustBe(key,yesNo)) )
| "expand-att-vals" =>
(let val exp = case valOpt
of NONE => true
| SOME "yes" => true
| SOME "no" => false
| SOME _ => raise Failed
val _ = O_EXPAND_ATT_VAL := exp
in pars
end
handle Failed => pars before optError (errorMustBe(key,yesNo)) )
| "expand-ext-subset" =>
(let val exp = case valOpt
of NONE => true
| SOME "yes" => true
| SOME "no" => false
| SOME _ => raise Failed
val _ = O_INCLUDE_EXT_SUBSET := exp
val _ = if exp then (O_EXPAND_ENT_VAL := true) else ()
in pars
end
handle Failed => pars before optError (errorMustBe(key,yesNo)) )
| "expand" =>
let
datatype what = ALL | NO | EXT | INT
val ext = [O_EXPAND_CONT_EXT,O_EXPAND_SUBSET_EXT,
O_INCLUDE_EXT_SUBSET,O_EXPAND_ENT_VAL]
val int = [O_EXPAND_CONT_CHAR,O_EXPAND_CONT_INT,
O_EXPAND_SUBSET_INT,O_EXPAND_ATT_VAL]
val all = ext@int
fun setFlags what =
let val (yes,no) =
case what
of ALL => (all,nil)
| NO => (nil,all)
| EXT => (ext,int)
| INT => (int,ext)
val _ = app (fn x => x := true) yes
val _ = app (fn x => x := false) no
in ()
end
val _ = case valOpt
of NONE => setFlags ALL
| SOME "yes" => setFlags ALL
| SOME "no" => setFlags NO
| SOME "ext" => setFlags EXT
| SOME "int" => setFlags INT
| SOME _ => optError (errorMustBe(key,yesNoExtInt))
in pars
end
| _ => pars before optError(unknown "--" key)
fun do_short (pars as (v,h,e,f)) (cs,opts) =
case cs
of nil => doit pars opts
| [#"o"] => (case opts
of OPT_STRING s::opts1 => (O_OUTPUT_FILE := s;
doit pars opts1)
| _ => (optError (mustHave "-" "o"); doit pars opts))
| [#"e"] => (case opts
of OPT_STRING s::opts1 => doit (v,h,SOME s,f) opts1
| _ => (optError (mustHave "-" "e"); doit pars opts))
| cs => doit (foldr
(fn (c,pars)
=> case c
of #"s" => pars before O_SILENT := true
| #"o" => pars before optError (mustHave "-" "o")
| #"e" => pars before optError (mustHave "-" "e")
| #"?" => (v,true,e,f)
| c => pars before
optError(unknown "-" (String.implode [c])))
pars cs) opts
and doit pars nil = pars
| doit (pars as (v,h,e,f)) (opt::opts) =
case opt
of OPT_LONG(key,valOpt) => doit (do_long pars (key,valOpt)) opts
| OPT_SHORT cs => do_short pars (cs,opts)
| OPT_STRING s => if isSome f
then let val _ = optError(onlyOne "input file")
in doit pars opts
end
else doit (v,h,e,SOME s) opts
| OPT_NOOPT => doit pars opts
| OPT_NEG cs => let val _ = if null cs then ()
else app (fn c => optError
(unknown "-n" (String.implode[c]))) cs
in doit pars opts
end
in doit (false,false,NONE,NONE) opts
end
end

@ -0,0 +1,330 @@
signature CopyOutput =
sig
type Dtd
include CopyEncode
val startDtd : UniChar.Data
val endDecl : UniChar.Data
val printError : Errors.Position * Errors.Error -> unit
val printWarning : Errors.Position * Errors.Warning -> unit
val putDecl : File * Dtd * HookData.DeclInfo -> File
val putXmlDecl : File * HookData.XmlDecl -> File
val putProcInst : File * HookData.ProcInstInfo -> File
val putComment : File * HookData.CommentInfo -> File
val putStartTag : File * Dtd * HookData.StartTagInfo -> File
val putEndTag : File * Dtd * HookData.EndTagInfo -> File
val putCData : File * UniChar.Vector -> File
val putExtId : File * Base.ExternalId -> File
end
functor CopyOutput (structure ParserOptions : ParserOptions) : CopyOutput =
struct
structure CopyEncode = CopyEncode (structure ParserOptions = ParserOptions)
open
Base DfaData Dtd Errors ParserOptions UniChar Uri
CopyEncode CopyOptions HookData
fun printError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos
::(if isFatalError err then "Fatal error:" else "Error:")
::errorMessage err))
fun printWarning(pos,warn) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Warning:"::warningMessage warn))
val ANY = String2Data "ANY"
val ATTLIST = String2Data "<!ATTLIST "
val CDATA = String2Data "CDATA"
val CDATA_S = String2Data "<![CDATA["
val ELEMENT = String2Data "<!ELEMENT "
val EMPTY = String2Data "EMPTY"
val ENCODING = String2Data " encoding="
val ENTITY = String2Data "ENTITY"
val ENTITY_G = String2Data "<!ENTITY "
val ENTITY_P = String2Data "<!ENTITY % "
val ENTITIES = String2Data "ENTITIES"
val FIXED = String2Data "#FIXED "
val ID = String2Data "ID"
val IDREF = String2Data "IDREF"
val IDREFS = String2Data "IDREFS"
val IMPLIED = String2Data "#IMPLIED"
val NMTOKEN = String2Data "NMTOKEN"
val NMTOKENS = String2Data "NMTOKENS"
val NO = String2Data "no"
val NOTATION_D = String2Data "<!NOTATION "
val NOTATION_L = String2Data "NOTATION ("
val ONE = String2Data "1.0"
val PCDATA = String2Data "(#PCDATA)"
val PCDATA_L = String2Data "(#PCDATA"
val PUBLIC = String2Data "PUBLIC "
val REQUIRED = String2Data "#REQUIRED"
val STANDALONE = String2Data " standalone="
val SYSTEM = String2Data "SYSTEM "
val XMLVERS = String2Data "<?xml version="
val YES = String2Data "yes"
val endDecl = String2Data ">"
val endCom = String2Data "-->"
val endPi = String2Data "?>"
val endSection= String2Data "]]>"
val preAttDef = String2Data "\n "
val startCom = String2Data "<!--"
val startDtd = String2Data "<!DOCTYPE "
val startPi = String2Data "<?"
fun putAttVal(f,lit,cv) =
if !O_EXPAND_ATT_VAL then putAttValue(f,cv,Vector.sub(lit,0))
else putAttVector(f,lit)
fun putEntVal(f,lit,cv) =
if !O_EXPAND_ENT_VAL then putEntValue true (f,cv,Vector.sub(lit,0))
else putEntVector(f,lit)
fun putAttDef(f,dtd,(aidx,at,ad)) =
let
val f1 = putData(f,preAttDef)
val f2 = putData(f1,Index2AttNot dtd aidx)
val f3 = putBlank f2
val f4 = case at
of AT_CDATA => putData(f3,CDATA)
| AT_NMTOKEN => putData(f3,NMTOKEN)
| AT_NMTOKENS => putData(f3,NMTOKENS)
| AT_ENTITY => putData(f3,ENTITY)
| AT_ENTITIES => putData(f3,ENTITIES)
| AT_ID => putData(f3,ID)
| AT_IDREF => putData(f3,IDREF)
| AT_IDREFS => putData(f3,IDREFS)
| AT_GROUP is => let val toks = map (Index2AttNot dtd) is
val f4 = putChar(f3,0wx28) (* #"(" *)
val f5 = putData(f4,hd toks)
val f6 = foldl
(fn (tok,f) => (* " | " *)
putData(putData(f,[0wx20,0wx7C,0wx20]),tok))
f5 (tl toks)
in putChar(f6,0wx29) (* #")" *)
end
| AT_NOTATION is => let val toks = map (Index2AttNot dtd) is
val f4 = putData(f3,NOTATION_L)
val f5 = putData(f4,hd toks)
val f6 = foldl
(fn (tok,f) => (* " | " *)
putData(putData(f,[0wx20,0wx7C,0wx20]),tok))
f5 (tl toks)
in putChar(f6,0wx29) (* #")" *)
end
val f5 = putBlank f4
in case ad
of AD_DEFAULT((lit,cv,_),_) => putAttVal(f5,lit,cv)
| AD_FIXED((lit,cv,_),_) => let val f6 = putData(f5,FIXED)
in putAttVal(f6,lit,cv)
end
| AD_IMPLIED => putData(f5,IMPLIED)
| AD_REQUIRED => putData(f5,REQUIRED)
end
fun putAttListDecl(f,dtd,(eidx,defs,ext)) =
let
val f1 = putData(f,ATTLIST)
val f2 = putData(f1,Index2Element dtd eidx)
val f3 = foldl (fn (ad,f) => putAttDef(f,dtd,ad)) f2 defs
in putData(f3,endDecl)
end
fun putContentModel(f,dtd,cm) =
let
fun putCMs(f,sep,cms) =
let
val f1 = putChar(f,0wx28) (* #"(" *)
val f2 = putCM(f1,hd cms)
val f3 = foldl (fn (cm,f) => putCM(putData(f,sep),cm)) f2 (tl cms)
in putChar(f3,0wx29) (* #")" *)
end
and putCM(f,cm) =
case cm
of CM_ELEM eidx => putData(f,Index2Element dtd eidx)
| CM_OPT cm => putChar(putCM(f,cm),0wx3F) (* #"?" *)
| CM_REP cm => putChar(putCM(f,cm),0wx2A) (* #"*" *)
| CM_PLUS cm => putChar(putCM(f,cm),0wx2B) (* #"+" *)
| CM_ALT cms => putCMs(f,[0wx20,0wx7C,0wx20],cms) (* " | " *)
| CM_SEQ cms => putCMs(f,[0wx2C,0wx20],cms) (* ", " *)
in putCM(f,cm)
end
fun putElemDecl(f,dtd,(eidx,cont,ext)) =
let
val f1 = putData(f,ELEMENT)
val f2 = putData(f1,Index2Element dtd eidx)
val f3 = putBlank f2
val f4 = case cont
of CT_ANY => putData(f3,ANY)
| CT_EMPTY => putData(f3,EMPTY)
| CT_MIXED nil => putData(f3,PCDATA)
| CT_MIXED is =>
let
val f4 = putData(f3,PCDATA_L)
val f5 = foldl
(fn (elem,f) => let val f1 = putData(f,[0wx20,0wx7C,0wx20])
in putData(f1,elem)
end)
f4 (map (Index2Element dtd) is)
in putData(f5,[0wx29,0wx2A])
end
| CT_ELEMENT(cm,_) => putContentModel(f3,dtd,cm)
in putData(f4,endDecl)
end
fun putExtId(f,EXTID(pub,sys)) =
case pub
of SOME(x,q) => let val f1 = putData(f,PUBLIC)
val f2 = putChar(f1,q)
val f3 = putString(f2,x)
val f4 = putChar(f3,q)
in case sys
of NONE => f4
| SOME (_,x,q) => let val f5 = putData(f4,[0wx20,q])
val f6 = putString(f5,Uri2String x)
in putChar(f6,q)
end
end
| NONE => case sys
of NONE => f
| SOME (_,x,q) => let val f1 = putData(f,SYSTEM)
val f2 = putChar(f1,q)
val f3 = putString(f2,Uri2String x)
in putChar(f3,q)
end
fun putGenEntDecl(f,dtd,(idx,ge,ext)) =
let
val f1 = putData(f,ENTITY_G)
val f2 = putData(f1,Index2GenEnt dtd idx)
val f3 = putBlank f2
val f4 = case ge
of GE_NULL => f3
| GE_INTERN (lit,rep) => putEntVal(f3,lit,rep)
| GE_EXTERN extId => putExtId(f3,extId)
| GE_UNPARSED (extId,idx,_) => let val f4 = putExtId(f3,extId)
val f5 = putString(f4," NDATA ") (* XML requires the NDATA keyword before the notation name *)
in putData(f5,Index2AttNot dtd idx)
end
in putData(f4,endDecl)
end
fun putParEntDecl(f,dtd,(idx,pe,ext)) =
let
val f1 = putData(f,ENTITY_P)
val f2 = putData(f1,Index2ParEnt dtd idx)
val f3 = putBlank f2
val f4 = case pe
of PE_NULL => f3
| PE_INTERN (lit,rep) => putEntVal(f3,lit,rep)
| PE_EXTERN extId => putExtId(f3,extId)
in putData(f4,endDecl)
end
fun putNotationDecl(f,dtd,(idx,extId,ext)) =
let
val f1 = putData(f,NOTATION_D)
val f2 = putData(f1,Index2AttNot dtd idx)
val f3 = putBlank f2
val f4 = putExtId(f3,extId)
in putData(f4,endDecl)
end
fun putDecl(f,dtd,(_,decl)) =
case decl
of DEC_ATTLIST x => putAttListDecl(f,dtd,x)
| DEC_ELEMENT x => putElemDecl(f,dtd,x)
| DEC_GEN_ENT x => putGenEntDecl(f,dtd,x)
| DEC_PAR_ENT x => putParEntDecl(f,dtd,x)
| DEC_NOTATION x => putNotationDecl(f,dtd,x)
fun putAttSpec(f,dtd,(aidx,ap,spOpt)) =
case ap
of AP_IMPLIED => f
| AP_MISSING => f
| AP_DEFAULT _ => f
| AP_PRESENT(lit,cv,_) => let val (sp,eq) = case spOpt
of NONE => ([0wx20],[0wx3D])
| SOME speq => speq
val f1 = putData(f,sp)
val f2 = putData(f1,Index2AttNot dtd aidx)
val f3 = putData(f2,eq)
val f4 = putAttVal(f3,lit,cv)
in f4
end
fun putStartTag (f,dtd,(_,eidx,atts,space,mt)) =
let
val f1 = putChar(f,0wx3C) (* #"<" *)
val f2 = putData(f1,Index2Element dtd eidx)
val f3 = foldl (fn (spec,f) => putAttSpec(f,dtd,spec)) f2 atts
val f4 = putData(f3,space)
in if mt then putData(f4,[0wx2F,0wx3E]) (* "/>" *)
else putChar(f4,0wx3E) (* #">" *)
end
fun putEndTag (f,dtd,(_,eidx,elsp)) =
let
val f1 = putData(f,[0wx3C,0wx2F]) (* "</" *)
val f2 = putData(f1,Index2Element dtd eidx)
val f3 = case elsp
of NONE => f2
| SOME(_,space) => putData(f2,space)
in putChar(f3,0wx3E)
end
fun putXmlDecl(f,(v,e,s)) =
let
val f1 = putData(f,XMLVERS)
val f2 = putChar(f1,0wx22)
val f3 = case v
of SOME version => putString(f2,version)
| _ => putData(f2,ONE)
val f4 = putChar(f3,0wx22)
val f5 = case e
of SOME enc => let val f5 = putData(f4,ENCODING)
val f6 = putChar(f5,0wx22)
val f7 = putString(f6,enc)
in putChar(f7,0wx22)
end
| _ => f4
val f6 = case s
of SOME stnd => let val f6 = putData(f5,STANDALONE)
val f7 = putChar(f6,0wx22)
val f8 = putData(f7,if stnd then YES else NO)
in putChar(f8,0wx22)
end
| _ => f5
val f7 = putData(f6,endPi)
in putNl f7
end
fun putProcInst(f,(_,target,_,text)) =
let
val f1 = putData(f,startPi)
val f2 = putData(f1,target)
val f3 = if Vector.length text=0 then f2
else putVector(putBlank f2,text)
in putData(f3,endPi)
end
fun putComment(f,(_,text)) =
let
val f1 = putData(f,startCom)
val f2 = putVector(f1,text)
in putData(f2,endCom)
end
fun putCData(f,text) =
let
val f1 = putData(f,CDATA_S)
val f2 = putVector(f1,text)
in putData(f2,endSection)
end
end
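(* Illustrative sketch, not part of fxp: the put* functions above never
   mutate their output; each call takes the current file state f and returns
   the updated one, so a declaration is written by chaining f1, f2, f3, ...
   The structure PutSketch and its string-list "file" are assumptions made
   up for this example; the real functions work on the encoder's file type. *)
structure PutSketch =
struct
  (* the "file" is the list of chunks written so far, newest first *)
  type file = string list
  val empty : file = nil
  fun putString (f : file, s) : file = s :: f
  fun putChar (f : file, c) : file = String.str c :: f
  (* same shape as putEndTag: "</", the element name, optional space, ">" *)
  fun putEndTag (f, name, spaceOpt) =
    let
      val f1 = putString (f, "</")
      val f2 = putString (f1, name)
      val f3 = case spaceOpt
                 of NONE => f2
                  | SOME sp => putString (f2, sp)
    in putChar (f3, #">")
    end
  (* chunks are stored newest first, so reverse before concatenating *)
  fun contents (f : file) = String.concat (rev f)
end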

@ -0,0 +1,8 @@
Group is
esisOptions.sml
esisOutput.sml
esis.sml
esisHooks.sml
esisData.sml
../../fxlib.cm
$/basis.cm

@ -0,0 +1,15 @@
ann
"warnMatch true"
"sequenceUnit true"
in
local
$(MLTON_ROOT)/basis/basis.mlb
../../fxlib.mlb
in
esisOptions.sml
esisData.sml
esisOutput.sml
esisHooks.sml
esis.sml
end
end

@ -0,0 +1,88 @@
signature Esis =
sig
val esis : string * string list -> OS.Process.status
end
structure Esis =
struct
structure ParserOptions = ParserOptions ()
structure CatOptions = CatOptions ()
structure CatParams =
struct
open CatError CatOptions EsisOptions Uri UtilError
fun catError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Error in catalog:"::catMessage err))
end
structure Resolve = ResolveCatalog (structure Params = CatParams)
structure EsisHooks = EsisHooks (structure Resolve = Resolve
structure ParserOptions = ParserOptions)
structure ParseEsis = Parse (structure Dtd = Dtd
structure Hooks = EsisHooks
structure Resolve = Resolve
structure ParserOptions = ParserOptions)
open
CatOptions EsisData EsisOptions Options ParserOptions Uri
val usage = List.concat [parserUsage,[U_SEP],catalogUsage,[U_SEP],esisUsage]
exception Exit of OS.Process.status
fun esis(prog,args) =
let
val prog = "fxesis"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setEsisDefaults()
val (vers,help,err,file) = setEsisOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val dtd = Dtd.initDtdTables()
val status = ParseEsis.parseDocument uri (SOME dtd) (EsisHooks.esisStart dtd)
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
in status
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
val _ = Esis.esis(CommandLine.name (), CommandLine.arguments ())

@ -0,0 +1,16 @@
structure EsisData =
struct
type Stag = int * (int * HookData.AttPresent) list
datatype Item =
PI of UniChar.Data * UniChar.Vector
| ELEM of Stag * Item list
| DATA of UniChar.Vector
type Content = Item list
end

@ -0,0 +1,39 @@
functor EsisHooks (structure Resolve : Resolve
structure ParserOptions : ParserOptions) =
struct
structure EsisOutput = EsisOutput (structure Resolve = Resolve
structure ParserOptions = ParserOptions)
open EsisData EsisOutput IgnoreHooks
type AppFinal = OS.Process.status
type AppData = Dtd * OS.Process.status * Content * (Stag * Content) list
fun esisStart dtd = (dtd,OS.Process.success,nil,nil) : AppData
fun hookError ((dtd,status,items,stack),err) =
(dtd,OS.Process.failure,items,stack) before printError err
fun hookWarning (a,warn) = a before printWarning warn
fun hookProcInst ((dtd,status,items,stack),(_,target,_,text)) =
(dtd,status,PI(target,text)::items,stack)
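(* An empty-element tag becomes a childless ELEM at once; otherwise the
   current sibling list is pushed on the stack and a fresh one is started. *)
fun hookStartTag ((dtd,status,items,stack),(_,elem,atts,_,mt)) =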
let val stag = (elem,map(fn(a,v,_) => (a,v)) atts)
in if mt then (dtd,status,ELEM(stag,nil)::items,stack)
else (dtd,status,nil,(stag,items)::stack)
end
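(* Pop the stack: the items collected so far (held in reverse order) become
   the children of the element being closed. *)
fun hookEndTag ((dtd,status,items,(stag,items1)::stack),_) =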
(dtd,status,ELEM(stag,rev items)::items1,stack)
| hookEndTag (a,_) = a
fun hookData ((dtd,status,items,stack),(_,text,_)) =
(dtd,status,DATA text::items,stack)
fun hookCData ((dtd,status,items,stack),(_,text)) =
(dtd,status,DATA text::items,stack)
fun hookCharRef ((dtd,status,items,stack),(_,ch,_)) =
(dtd,status,DATA (Vector.fromList [ch])::items,stack)
fun hookXml (a,_) = a
fun hookFinish (dtd,status,items,_) = status before outDocument dtd (rev items)
end
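(* Illustrative sketch, not part of fxp: EsisHooks builds the document tree
   with a stack. A start tag saves the current (reversed) sibling list and
   starts a new one; an end tag wraps the collected siblings into an ELEM and
   restores the saved list. TreeSketch is a simplified stand-in that uses
   plain strings instead of DTD indices and UniChar vectors; all of its names
   are assumptions for this example. *)
structure TreeSketch =
struct
  datatype item = DATA of string | ELEM of string * item list
  (* (siblings seen so far, newest first; stack of (open tag, saved siblings)) *)
  type state = item list * (string * item list) list
  val start : state = (nil, nil)
  fun startTag ((items, stack), name, empty) : state =
    if empty then (ELEM (name, nil) :: items, stack)
    else (nil, (name, items) :: stack)
  fun endTag (items, (name, items1) :: stack) : state =
        (ELEM (name, rev items) :: items1, stack)
    | endTag a = a                      (* unmatched end tag: ignore *)
  fun data ((items, stack), text) : state = (DATA text :: items, stack)
  fun finish (items, _) = rev items
end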

@ -0,0 +1,116 @@
signature EsisOptions =
sig
val O_SILENT : bool ref
val O_ERROR_DEVICE : TextIO.outstream ref
val O_ERROR_LINEWIDTH : int ref
val O_OUTPUT_FILE : string ref
val O_NON_DIRECT : UniChar.Char ref
val setEsisDefaults : unit -> unit
val setEsisOptions : Options.Option list * (string -> unit)
-> bool * bool * string option * string option
val esisUsage : Options.Usage
end
structure EsisOptions : EsisOptions =
struct
open Options
val O_SILENT = ref false
val O_ERROR_DEVICE = ref TextIO.stdErr
val O_ERROR_LINEWIDTH = ref 80
val O_OUTPUT_FILE = ref "-"
val O_NON_DIRECT = ref (0wx100 : UniChar.Char)
fun setEsisDefaults () =
let
val _ = O_SILENT := false
val _ = O_ERROR_DEVICE := TextIO.stdErr
val _ = O_OUTPUT_FILE := "-"
val _ = O_NON_DIRECT := 0wx100
in ()
end
val esisUsage =
[U_ITEM(["-o <file>","--output=<file>"],"Write output to file (stdout)"),
U_ITEM(["-s","--silent"],"Suppress reporting of errors and warnings"),
U_ITEM(["-e <file>","--error-output=<file>"],"Redirect errors to file (stderr)"),
U_SEP,
U_ITEM(["-7","--ascii"],"Produce output in ASCII (7 bit) encoding"),
U_ITEM(["-8","--latin1"],"Produce output in LATIN1 (8 bit) encoding (default)"),
U_SEP,
U_ITEM(["--version"],"Print the version number and exit"),
U_ITEM(["-?","--help"],"Print this text and exit"),
U_ITEM(["--"],"Do not recognize remaining arguments as options")
]
fun setEsisOptions (opts,optError) =
let
fun onlyOne what = "at most one "^what^" may be specified"
fun unknown pre opt = String.concat ["unknown option ",pre,opt]
fun hasNoArg pre key = String.concat ["option ",pre,key," expects no argument"]
fun mustHave pre key = String.concat ["option ",pre,key," must have an argument"]
fun check_noarg(key,valOpt) =
if isSome valOpt then optError (hasNoArg "--" key) else ()
fun do_long (pars as (v,h,e,f)) (key,valOpt) =
case key
of "help" => (v,true,e,f) before check_noarg(key,valOpt)
| "version" => (true,h,e,f) before check_noarg(key,valOpt)
| "silent" => pars before O_SILENT := true before check_noarg(key,valOpt)
| "ascii" => pars before O_NON_DIRECT := 0wx80 before check_noarg(key,valOpt)
| "latin1" => pars before O_NON_DIRECT := 0wx100 before check_noarg(key,valOpt)
| "output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => pars before O_OUTPUT_FILE := s)
| "error-output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => (v,h,SOME s,f))
| _ => pars before optError(unknown "--" key)
fun do_short (pars as (v,h,e,f)) (cs,opts) =
case cs
of nil => doit pars opts
| [#"o"] => (case opts
of OPT_STRING s::opts1 => (O_OUTPUT_FILE := s;
doit pars opts1)
| _ => (optError (mustHave "-" "o"); doit pars opts))
| [#"e"] => (case opts
of OPT_STRING s::opts1 => doit (v,h,SOME s,f) opts1
| _ => (optError (mustHave "-" "e"); doit pars opts))
| cs => doit (foldr
(fn (c,pars)
=> case c
of #"s" => pars before O_SILENT := true
| #"7" => pars before O_NON_DIRECT := 0wx80
| #"8" => pars before O_NON_DIRECT := 0wx100
| #"o" => pars before optError (mustHave "-" "o")
| #"e" => pars before optError (mustHave "-" "e")
| #"?" => (v,true,e,f)
| c => pars before
optError(unknown "-" (String.implode [c])))
pars cs) opts
and doit pars nil = pars
| doit (pars as (v,h,e,f)) (opt::opts) =
case opt
of OPT_LONG(key,valOpt) => doit (do_long pars (key,valOpt)) opts
| OPT_SHORT cs => do_short pars (cs,opts)
| OPT_STRING s => if isSome f
then let val _ = optError(onlyOne "input file")
in doit pars opts
end
else doit (v,h,e,SOME s) opts
| OPT_NOOPT => doit pars opts
| OPT_NEG cs => let val _ = if null cs then ()
else app (fn c => optError
(unknown "-n" (String.implode[c]))) cs
in doit pars opts
end
in doit (false,false,NONE,NONE) opts
end
end
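(* Illustrative sketch, not part of fxp: setEsisOptions leans on the Basis
   operator "before", which evaluates its left operand, then the unit-valued
   right operand for its side effect, and returns the left value. This lets
   an option handler return its parameter tuple unchanged while flipping a
   global flag. BeforeSketch and its names are assumptions for this example. *)
structure BeforeSketch =
struct
  val silent = ref false
  val log : string list ref = ref nil
  fun note msg = log := msg :: !log
  (* ":=" binds tighter than "before", and "before" associates to the left *)
  fun setSilent pars = pars before silent := true before note "silent set"
end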

@ -0,0 +1,204 @@
signature EsisOutput =
sig
type Dtd
val printError : Errors.Position * Errors.Error -> unit
val printWarning : Errors.Position * Errors.Warning -> unit
val outDocument : Dtd -> EsisData.Content -> unit
end
functor EsisOutput (structure Resolve : Resolve
structure ParserOptions : ParserOptions) : EsisOutput =
struct
open
Base Dtd Errors HookData IntSets ParserOptions UniChar UniClasses Uri
EsisData EsisOptions Resolve
fun printError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos
::(if isFatalError err then "Fatal error:" else "Error:")
::errorMessage err))
fun printWarning(pos,warn) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Warning:"::warningMessage warn))
val outFile = ref TextIO.stdOut
fun openOutFile() = if !O_OUTPUT_FILE="-" then outFile := TextIO.stdOut
else outFile := TextIO.openOut(!O_OUTPUT_FILE)
fun closeOutFile() = if !O_OUTPUT_FILE="-" then ()
else TextIO.closeOut(!outFile)
fun out str = TextIO.output(!outFile,str)
fun out1 c = TextIO.output1(!outFile,c)
val outList = app out
fun outNewLine() = out1 #"\n"
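(* Write one character in ESIS form. Tab, newline and backslash get a
   backslash escape; C0/C1 control characters and every character not below
   !O_NON_DIRECT are written as \U+hex; and the rest is written directly. *)
fun outChar c =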
case c
of 0wx09 => out "\\t"
| 0wx0A => out "\\n"
| 0wx5C => out "\\\\"
| _ => if Chars.andb(c,0wx7F)>=0wx20 andalso !O_NON_DIRECT>c
then out1(Char2char c)
else out ("\\U+"^Chars.toString c^";")
val outData = app outChar
val outVector = Vector.app outChar
fun outExtId putFile (id as EXTID(pub,sys)) =
let val _ = case pub
of NONE => ()
| SOME(x,_) => out ("\np"^x)
val _ = case sys
of NONE => ()
| SOME(b,x,_) => (out "\ns"; out (Uri2String x))
val _ = if putFile
then let val uri = resolveExtId id
in (out "\nf<OSFILE>"; out (Uri2String uri))
end
handle NoSuchFile _ => ()
else ()
in ()
end
fun defNot dtd ns (idx,name) =
if inIntSet(idx,ns) then ns
else let val _ = case getNotation dtd idx
of NONE => ()
| SOME extId => outExtId false extId
val _ = out "\nN"
val _ = outData name
in addIntSet(idx,ns)
end
fun defNotation dtd (es,ns) name =
(es,defNot dtd ns (AttNot2Index dtd name,name))
fun defEntity dtd (sets as (es,ns)) name =
let val idx = GenEnt2Index dtd name
in if inIntSet(idx,es) then sets
else let val (ent,_) = getGenEnt dtd idx
val ns1 = case ent
of GE_UNPARSED(extId,nidx,_) =>
let
val not = Index2AttNot dtd nidx
val ns1 = defNot dtd ns (nidx,not)
val _ = outExtId true extId
val _ = out "\nE"
val _ = outData name
val _ = out " NDATA "
val _ = outData not
in ns1
end
| _ => ns
in (addIntSet(idx,es),ns1)
end
end
fun defEntities dtd sets av = foldl
(fn (ent,sets) => defEntity dtd sets ent)
sets (UtilList.split (fn c => isS c) av)
fun preAttSpec dtd sets (at,av) =
case at
of AT_CDATA => (sets," CDATA ")
| AT_NOTATION _ => (defNotation dtd sets (Vector2Data av)," NOTATION ")
| AT_ENTITY => (defEntity dtd sets (Vector2Data av)," ENTITY ")
| AT_ENTITIES => (defEntities dtd sets (Vector2Data av)," ENTITY ")
| _ => (sets," TOKEN ")
fun outAttSpec dtd sets (att,at,av) =
let
val (sets1,typ) = preAttSpec dtd sets (at,av)
val _ = out "\nA"
val _ = outData att
val _ = out typ
val _ = outVector av
in sets1
end
fun outImplied att =
let
val _ = out "\nA"
val _ = outData att
in out " IMPLIED"
end
fun outAttVal dtd specs sets (idx,at,_,_) =
let val att = Index2AttNot dtd idx
in case List.find (fn(i,_) => i=idx) specs
of NONE => sets before outImplied att
| SOME(_,AP_MISSING) => sets before outImplied att
| SOME(_,AP_IMPLIED) => sets before outImplied att
| SOME(_,AP_DEFAULT(_,av,_)) => outAttSpec dtd sets (att,at,av)
| SOME(_,AP_PRESENT(_,av,_)) => outAttSpec dtd sets (att,at,av)
end
fun outAttNonVal dtd defs sets (idx,ap) =
let fun outPresent av =
let
val att = Index2AttNot dtd idx
val at = case List.find (fn(i,_,_,_) => i=idx) defs
of NONE => AT_CDATA
| SOME(_,at,_,_) => at
in outAttSpec dtd sets (att,at,av)
end
in case ap
of AP_MISSING => sets
| AP_IMPLIED => sets
| AP_DEFAULT(_,av,_) => outPresent av
| AP_PRESENT(_,av,_) => outPresent av
end
fun outStartTag dtd sets (idx,elem,atts) =
let
val defs = case #atts(getElement dtd idx)
of NONE => nil
| SOME(defs,_) => defs
val sets1 =
if !O_VALIDATE andalso hasDtd dtd
then foldl
(fn (def,sets) => outAttVal dtd atts sets def)
sets defs
else foldl
(fn (spec,sets) => outAttNonVal dtd defs sets spec)
sets atts
val _ = out "\n("
val _ = outData elem
in sets1
end
fun outEndTag elem =
let val _ = out "\n)"
in outData elem
end
fun outCData(prefix,text) =
let val _ = out prefix
in outVector text
end
fun outPi(target,text) =
let
val _ = out "\n?"
val _ = outData target
val _ = out " "
in outVector text
end
fun outContent dtd sets items =
let val (sets1,_) = foldl
(fn (item,(sets,prefix)) => case item
of PI pi => (sets,"\n-") before outPi pi
| DATA text => (sets,"") before outCData(prefix,text)
| ELEM el => (outElem dtd sets el,"\n-"))
(sets,"\n-") items
in sets1
end
and outElem dtd sets ((idx,atts),content) =
let
val elem = Index2Element dtd idx
val sets1 = outStartTag dtd sets (idx,elem,atts)
val sets2 = outContent dtd sets1 content
val _ = outEndTag elem
in sets2
end
fun outDocument dtd content =
let
val _ = openOutFile()
val _ = ignore (outContent dtd (emptyIntSet,emptyIntSet) content)
handle IO.Io _ => ()
val _ = outNewLine()
val _ = closeOutFile()
in ()
end
handle IO.Io _ => ()
end
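(* Illustrative sketch, not part of fxp: the escaping rule of outChar on
   plain Basis words. The parameter limit plays the role of !O_NON_DIRECT
   (0wx80 for --ascii, 0wx100 for --latin1) and must not exceed 0wx100 here,
   because the result is built with Char.chr. Masking with 0wx7F makes the
   control ranges 0x00-0x1F and 0x80-0x9F fail the same test. EscapeSketch is
   an assumption for this example, and Word.toString stands in for fxp's
   Chars.toString, which is assumed to print hexadecimal digits. *)
structure EscapeSketch =
struct
  fun escape (limit : word) (c : word) : string =
    case c
      of 0wx09 => "\\t"
       | 0wx0A => "\\n"
       | 0wx5C => "\\\\"
       | _ => if Word.andb (c, 0wx7F) >= 0wx20 andalso c < limit
              then String.str (Char.chr (Word.toInt c))
              else "\\U+" ^ Word.toString c ^ ";"
end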

@ -0,0 +1,7 @@
Group is
nullHooks.sml
nullOptions.sml
null.sml
nullHard.sml
../../fxlib.cm
$/basis.cm

@ -0,0 +1,14 @@
ann
"warnMatch true"
"sequenceUnit true"
in
local
$(MLTON_ROOT)/basis/basis.mlb
../../fxlib.mlb
in
nullOptions.sml
nullHooks.sml
nullHard.sml
null.sml
end
end

@ -0,0 +1,82 @@
structure Null =
struct
structure ParserOptions = ParserOptions ()
structure CatOptions = CatOptions ()
structure CatParams =
struct
open CatError CatOptions NullOptions Uri UtilError
fun catError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Error in catalog:"::catMessage err))
end
structure Resolve = ResolveCatalog (structure Params = CatParams)
structure ParseNull = Parse (structure Dtd = Dtd
structure Hooks = NullHooks
structure Resolve = Resolve
structure ParserOptions = ParserOptions)
fun parseNull uri = ParseNull.parseDocument uri NONE NullHooks.nullStart
open
CatOptions NullOptions Options ParserOptions Uri
val usage = List.concat [parserUsage,[U_SEP],catalogUsage,[U_SEP],nullUsage]
exception Exit of OS.Process.status
fun null(prog,args) =
let
val prog = "fxp"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setNullDefaults()
val (vers,help,err,file) = setNullOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val status = parseNull uri
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
in status
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
val _ = Null.null(CommandLine.name (), CommandLine.arguments ())
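(* Illustrative sketch, not part of fxp: the drivers (Null.null, Esis.esis,
   Viz.viz) leave early by raising a local Exit exception that carries the
   process status, and a single handler at the end of the function turns it
   into the return value. ExitSketch and its argument handling are
   assumptions made up for this example. *)
structure ExitSketch =
struct
  exception Exit of OS.Process.status
  fun run args =
    let
      (* --help ends the run successfully before any work is done *)
      val _ = if List.exists (fn a => a = "--help") args
              then raise Exit OS.Process.success else ()
      (* an empty argument list counts as a usage error in this sketch *)
      val _ = if null args then raise Exit OS.Process.failure else ()
    in OS.Process.success
    end
    handle Exit status => status
end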

@ -0,0 +1,88 @@
(*
structure NullHard =
struct
fun parseNull uri = NullParse.parseDocument uri NONE NullHooks.nullStart
open
NullCatOptions NullOptions Options NullParserOptions Uri
val usage = List.concat [parserUsage,[("","")],catalogUsage,[("","")],nullUsage]
exception Exit of OS.Process.status
fun null(prog,args) =
let
val prog = "fxp"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setNullDefaults()
val (vers,help,err,file) = setNullOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val status = parseNull uri
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
in status
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
*)
structure NullHard = struct end

@ -0,0 +1,20 @@
structure NullHooks =
struct
open Errors IgnoreHooks NullOptions
type AppData = OS.Process.status
type AppFinal = AppData
val nullStart = OS.Process.success
fun printError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos
::(if isFatalError err then "Fatal error:" else "Error:")
::errorMessage err))
fun printWarning(pos,warn) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Warning:"::warningMessage warn))
fun hookError (_,pe) = OS.Process.failure before printError pe
fun hookWarning (status,pw) = status before printWarning pw
end

@ -0,0 +1,93 @@
signature NullOptions =
sig
val O_SILENT : bool ref
val O_ERROR_DEVICE : TextIO.outstream ref
val O_ERROR_LINEWIDTH : int ref
val setNullDefaults : unit -> unit
val setNullOptions : Options.Option list * (string -> unit)
-> bool * bool * string option * string option
val nullUsage : Options.Usage
end
structure NullOptions : NullOptions =
struct
open Options
val O_SILENT = ref false
val O_ERROR_DEVICE = ref TextIO.stdErr
val O_ERROR_LINEWIDTH = ref 80
val nullUsage =
[U_ITEM(["-s","--silent"],"Suppress reporting of errors and warnings"),
U_ITEM(["-e <file>","--error-output=<file>"],"Redirect errors to file (stderr)"),
U_SEP,
U_ITEM(["--version"],"Print the version number and exit"),
U_ITEM(["-?","--help"],"Print this text and exit"),
U_ITEM(["--"],"Do not recognize remaining arguments as options")
]
fun setNullDefaults () =
let
val _ = O_SILENT := false
val _ = O_ERROR_DEVICE := TextIO.stdErr
in ()
end
fun setNullOptions (opts,optError) =
let
fun onlyOne what = "at most one "^what^" may be specified"
fun unknown pre opt = String.concat ["unknown option ",pre,opt]
fun hasNoArg pre key = String.concat ["option ",pre,key," expects no argument"]
fun mustHave pre key = String.concat ["option ",pre,key," must have an argument"]
fun check_noarg(key,valOpt) =
if isSome valOpt then optError (hasNoArg "--" key) else ()
fun do_long (pars as (v,h,e,f)) (key,valOpt) =
case key
of "help" => (v,true,e,f) before check_noarg(key,valOpt)
| "version" => (true,h,e,f) before check_noarg(key,valOpt)
| "silent" => pars before O_SILENT := true before check_noarg(key,valOpt)
| "error-output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => (v,h,SOME s,f))
| _ => pars before optError(unknown "--" key)
fun do_short (pars as (v,h,e,f)) (cs,opts) =
case cs
of nil => doit pars opts
| [#"e"] => (case opts
of OPT_STRING s::opts1 => doit (v,h,SOME s,f) opts1
| _ => (optError (mustHave "-" "e"); doit pars opts))
| cs => doit (foldr
(fn (c,pars)
=> case c
of #"e" => pars before optError (mustHave "-" "e")
| #"s" => pars before O_SILENT := true
| #"?" => (v,true,e,f)
| c => pars before
optError (unknown "-" (String.implode [c])))
pars cs) opts
and doit pars nil = pars
| doit (pars as (v,h,e,f)) (opt::opts) =
case opt
of OPT_LONG(key,valOpt) => doit (do_long pars (key,valOpt)) opts
| OPT_SHORT cs => do_short pars (cs,opts)
| OPT_STRING s => if isSome f
then let val _ = optError(onlyOne "input file")
in doit pars opts
end
else doit (v,h,e,SOME s) opts
| OPT_NOOPT => doit pars opts
| OPT_NEG cs => let val _ = if null cs then ()
else app (fn c => optError
(unknown "-n" (String.implode[c]))) cs
in doit pars opts
end
in doit (false,false,NONE,NONE) opts
end
end
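(* Illustrative sketch, not part of fxp: do_short handles a cluster of short
   flags such as -s? by folding over its characters and updating the
   parameter tuple one flag at a time. FlagSketch is an assumption for this
   example; unlike setNullOptions it collects error messages in the result
   instead of reporting them through an optError callback. *)
structure FlagSketch =
struct
  (* result: (silent, help, error messages in the order of the flags) *)
  fun flags cs =
    foldr (fn (c, (s, h, errs)) =>
             case c
               of #"s" => (true, h, errs)
                | #"?" => (s, true, errs)
                | _ => (s, h, ("unknown option -" ^ String.str c) :: errs))
          (false, false, nil) cs
end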

fxp/src/Apps/Viz/viz.cm Normal file
@ -0,0 +1,6 @@
Group is
vizOptions.sml
viz.sml
vizHooks.sml
../../fxlib.cm
$/basis.cm

fxp/src/Apps/Viz/viz.mlb Normal file
@ -0,0 +1,13 @@
ann
"warnMatch true"
"sequenceUnit true"
in
local
$(MLTON_ROOT)/basis/basis.mlb
../../fxlib.mlb
in
vizOptions.sml
vizHooks.sml
viz.sml
end
end

fxp/src/Apps/Viz/viz.sml Normal file
@ -0,0 +1,87 @@
signature Viz =
sig
val viz : string * string list -> OS.Process.status
end
structure Viz =
struct
structure ParserOptions = ParserOptions ()
structure CatOptions = CatOptions ()
structure CatParams =
struct
open CatError CatOptions VizOptions Uri UtilError
fun catError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Error in catalog:"::catMessage err))
end
structure Resolve = ResolveCatalog (structure Params = CatParams)
structure VizHooks = VizHooks (structure Dtd = Dtd)
structure ParseViz = Parse (structure Dtd = Dtd
structure Hooks = VizHooks
structure Resolve = Resolve
structure ParserOptions = ParserOptions)
open
CatOptions VizHooks VizOptions Options ParserOptions Uri
val usage = List.concat [parserUsage,[U_SEP],catalogUsage,[U_SEP],vizUsage]
exception Exit of OS.Process.status
fun viz(prog,args) =
let
val prog = "fxviz"
val hadError = ref false
fun optError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in hadError := true
end
fun exitError msg =
let val _ = TextIO.output(TextIO.stdErr,msg^".\n")
in raise Exit OS.Process.failure
end
fun exitHelp prog =
let val _ = printUsage TextIO.stdOut prog usage
in raise Exit OS.Process.success
end
fun exitVersion prog =
let val _ = app print [prog," version ",Version.FXP_VERSION,"\n"]
in raise Exit OS.Process.success
end
fun summOpt prog = "For a summary of options type "^prog^" --help"
fun noFile(f,cause) = "can't open file '"^f^"': "^exnMessage cause
val opts = parseOptions args
val _ = setParserDefaults()
val opts1 = setParserOptions (opts,optError)
val _ = setCatalogDefaults()
val opts2 = setCatalogOptions (opts1,optError)
val _ = setVizDefaults()
val (vers,help,err,file) = setVizOptions (opts2,optError)
val _ = if !hadError then exitError (summOpt prog) else ()
val _ = if vers then exitVersion prog else ()
val _ = if help then exitHelp prog else ()
val _ = case err
of SOME "-" => O_ERROR_DEVICE := TextIO.stdErr
| SOME f => (O_ERROR_DEVICE := TextIO.openOut f
handle IO.Io {cause,...} => exitError(noFile(f,cause)))
| NONE => ()
val f = valOf file handle Option => "-"
val uri = if f="-" then NONE else SOME(String2Uri f)
val dtd = initDtdTables()
val _ = ParseViz.parseDocument uri (SOME dtd) (vizStart dtd)
val _ = if isSome err then TextIO.closeOut (!O_ERROR_DEVICE) else ()
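(* VizHooks.AppFinal is unit, so the parse result carries no status and
   errors reported while parsing do not change the exit code *)
in OS.Process.success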
end
handle Exit status => status
| exn =>
let val _ = TextIO.output
(TextIO.stdErr,prog^": Unexpected exception: "^exnMessage exn^".\n")
in OS.Process.failure
end
end
val _ = Viz.viz(CommandLine.name (), CommandLine.arguments ())

@ -0,0 +1,381 @@
functor VizHooks ( structure Dtd : Dtd ) =
struct
open Base Dtd Errors HookData IgnoreHooks UniChar UtilString VizOptions
val THIS_MODULE = "VizHooks"
datatype DataItem =
CREF of HookData.CharRefInfo
| DATA of HookData.DataInfo
| CDATA of HookData.CDataInfo
type StackItem = HookData.StartTagInfo * int * string * string list
type Current = int * string * string list * (string * DataItem) list
type AppData = Dtd * TextIO.outstream * Current * StackItem list
type AppFinal = unit
fun vizStart dtd = (dtd,TextIO.stdOut,(0,"",nil,nil),nil)
fun inContent (_,_,_,stack) = not (null stack)
fun Char2String c =
case c
of 0wx09 => "\t"
| 0wx0A => "\n"
| 0wx22 => "\\\""
| 0wx5C => "\\\\"
| _ => if Chars.andb(c,0wx7F)>=0wx20 andalso c<0wx100
then String.implode [Char2char c]
else "\\U+"^Chars.toString c^";"
fun Data2String cs = String.concat (map Char2String cs)
fun Vector2String cv = String.concat
(Vector.foldr (fn (c,strs) => Char2String c::strs) nil cv)
fun printError(pos,err) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos
::(if isFatalError err then "Fatal error:" else "Error:")
::errorMessage err))
fun printWarning(pos,warn) = if !O_SILENT then () else TextIO.output
(!O_ERROR_DEVICE,formatMessage (4,!O_ERROR_LINEWIDTH)
(Position2String pos^" Warning:"::warningMessage warn))
val (piColor ,piRgb ) = ("51","255 208 192")
val (comColor,comRgb) = ("52","255 255 192")
val (txtColor,txtRgb) = ("53","255 244 232")
val (elColor ,elRgb ) = ("54","208 232 255")
val (valColor,valRgb) = ("55",txtRgb)
val (attColor,attRgb) = ("56","208 255 208")
val (misColor,misRgb) = ("57","224 0 0 ")
val (typColor,typRgb) = ("58","0 144 0 ")
fun out(f,s) = f before TextIO.output(f,s)
fun outNl f = out(f,"\n")
fun outLine (f,line) = outNl(out(f,line))
fun outLines (f,lines) = foldl (fn (line,f) => outLine(f,line)) f lines
fun outPiNode (f,i,title,((pos,_),target,_,text)) =
let
val f1 = outLines
(f,["",
" node: {",
" title: \""^title^"\"",
" label: \"<?"^Data2String target^" "^Vector2String text^"?>\"",
" color: "^piColor,
" info1: \""^Position2String pos^"\"",
" horizontal_order: "^Int.toString i,
" }"
])
in f1
end
fun outComNode (f,i,title,((pos,_),text)) =
let
val f1 = outLines
(f,["",
" node: {",
" title: \""^title^"\"",
" label: \"<!--"^Vector2String text^"-->\"",
" color: "^comColor,
" info1: \""^Position2String pos^"\"",
" horizontal_order: "^Int.toString i,
" }"
])
in f1
end
fun outDataNode (f,path,(i,item)) =
let
val title = path^i
val (pos,label,text) =
case item
of DATA((pos,_),cv,_) => let val str = Vector2String cv
in (pos,str,str)
end
| CDATA((pos,_),cv) => let val str = Vector2String cv
in (pos,"<![CDATA["^str^"]]>",str)
end
| CREF((pos,_),c,cv) => (pos,Vector2String cv,Char2String c)
val f1 = outLines
(f,[" node: {",
" title: \""^title^"\"",
" label: \""^label^"\"",
" color: "^txtColor,
" info1: \""^Position2String pos^"\"",
" horizontal_order: "^i,
" }"
])
in (f1,pos,title,text)
end
fun outDataGraph (f,i,path,items) =
if null items then NONE
else let val i1 = i+1
val iS = Int.toString i1
val title = path^iS
val f1 = outLines
(f,["",
" graph: {",
" title : \""^title^"\"",
" color : "^txtColor,
" folding: 1",
" horizontal_order: "^iS
])
val (f2,pos,titles,texts) =
foldl (fn (item,(f,_,titles,texts))
=> let val (f1,pos,title,text) = outDataNode (f,path,item)
in (f1,pos,title::titles,text::texts)
end) (f1,nullPosition,nil,nil) items
val label = String.concat texts
val (firstTitle,f3) = foldr
(fn (title,(next,f)) =>
case next
of NONE => (SOME title,f)
| SOME nextTitle =>
(SOME title,outLines
(f,["",
" nearedge: {",
" sourcename: \""^title^"\"",
" targetname:\""^nextTitle^"\"",
" }"])))
(NONE,f2) titles
val f4 = outLines
(f3,[" label: \""^label^"\"",
" info1: \""^Position2String pos^"\"",
" }"
])
in SOME (f4,i1,valOf firstTitle)
end
fun outData (a as (dtd,f,(i,path,titles,data),stack)) =
case outDataGraph (f,i,path,data)
of NONE => a
| SOME(f1,i1,title) => (dtd,f1,(i1,path,title::titles,nil),stack)
fun attValType dtd avOpt =
case avOpt
of NONE => NONE
| SOME av => SOME
(case av
of AV_CDATA _ => "CDATA"
| AV_NMTOKEN _ => "NMTOKEN"
| AV_NMTOKENS _ => "NMTOKENS"
| AV_ENTITY _ => "ENTITY"
| AV_ENTITIES _ => "ENTITIES"
| AV_ID _ => "ID"
| AV_IDREF _=> "IDREF"
| AV_IDREFS _=> "IDREFS"
| AV_GROUP(is,_) => List2xString
("(","|",")") (Data2String o (Index2AttNot dtd)) is
| AV_NOTATION(is,_) => List2xString
("NOTATION(","|",")") (Data2String o (Index2AttNot dtd)) is)
fun outAttNode dtd (f,path,(i,ap,_)) =
let
val iS = Int.toString i
val attTitle = path^iS
val attLabel = Data2String(Index2AttNot dtd i)
val valTitle = attTitle^".0"
val (valLabel,specText,typInfo) =
case ap
of AP_IMPLIED => ("\f"^typColor^"#IMPLIED","",NONE)
| AP_MISSING => ("\f"^misColor^"MISSING","",NONE)
| AP_DEFAULT(_,cv,av) =>
let val str = Vector2String cv
in ("\f"^typColor^"#DEFAULT:\f31"^str,"",attValType dtd av)
end
| AP_PRESENT(_,cv,av) =>
let val str = Vector2String cv
in (str," "^attLabel^"=\\\""^str^"\\\"",attValType dtd av)
end
val f1 = outLines
(f,["",
" node: {",
" title: \""^attTitle^"\"",
" label: \""^attLabel^"\"",
" color: "^attColor,
" horizontal_order: 0",
" }",
" node: {",
" title: \""^valTitle^"\"",
" label: \""^valLabel^"\""
])
val f2 = case typInfo
of NONE => f1
| SOME info => outLine(f1," info2: \""^info^"\"")
val f3 = outLines
(f2,[" color: "^valColor,
" }",
" edge: {",
" sourcename: \""^attTitle^"\"",
" targetname: \""^valTitle^"\"",
" }"
])
in (f3,attTitle,specText)
end
fun outElemGraph dtd (f,i,path,children,((pos,_),el,atts,_,_)) =
let
val iS = Int.toString i
val title = path^iS
val nodeTitle = title^".0"
val nodeLabel = Data2String (Index2Element dtd el)
val attPath = nodeTitle^"."
val f1 = outLines(f,[" graph: {",
" title : \""^title^"\"",
" color : "^elColor,
" folding: 1",
" horizontal_order: "^iS,
"",
" node: {",
" title: \""^nodeTitle^"\"",
" color: "^elColor,
" horizontal_order: "^iS,
" label: \""^nodeLabel^"\"",
" info1: \""^Position2String pos^"\"",
" }"
])
val (f2,attTitles,attTexts) = foldl
(fn (att,(f,titles,texts))
=> let val (f1,title,text) = outAttNode dtd (f,attPath,att)
in (f1,title::titles,text::texts)
end)
(f1,nil,nil) atts
val f3 = foldr
(fn (attTitle,f) => outLines(f,["",
" edge: {",
" sourcename: \""^nodeTitle^"\"",
" targetname:\""^attTitle^"\"",
" }"]))
f2 attTitles
val attsText = String.concat attTexts
val label = nodeLabel^attsText
val f4 = outLines(f3,[" label: \""^label^"\"",
" info1: \""^Position2String pos^"\"",
" }"])
val f5 = foldr
(fn (child,f) => outLines(f,["",
" edge: {",
" sourcename: \""^title^"\"",
" targetname:\""^child^"\"",
" }"]))
f4 children
in (f5,title)
end
fun outRoot (f,titles) =
let val f1 = outLines
(f,["",
" node: {",
" title : \"root\"",
" label : \"\"",
" width : 0",
" height: 0",
" }"])
val f2 =
foldr (fn (title,f) =>
outLines (f,["",
" edge: {",
" sourcename: \"root\"",
" targetname: \""^title^"\"",
" linestyle : invisible",
" }"])) f1 titles
in f2
end
fun hookProcInst (a,pi) =
let
val (dtd,f,(i,path,titles,_),stack) = outData a
val i1 = i+1
val title = path^Int.toString i1
val f1 = outPiNode(f,i1,title,pi)
in
(dtd,f1,(i1,path,title::titles,nil),stack)
end
fun hookComment (a,com) =
let
val (dtd,f,(i,path,titles,_),stack) = outData a
val i1 = i+1
val title = path^Int.toString i1
val f1 = outComNode(f,i1,title,com)
in
(dtd,f1,(i1,path,title::titles,nil),stack)
end
fun hookEndTag (a,_) =
let val (dtd,f,(_,_,titles,data),stack) = outData a
in case stack
of nil => a
| (stag,i,path,titles')::rest =>
let
val (f1,title) = outElemGraph dtd (f,i,path,titles,stag)
in
(dtd,f1,(i,path,title::titles',nil),rest)
end
end
fun hookStartTag (a,stag as (_,_,_,_,mt)) =
let
val (dtd,f,(i,path,titles,data),stack) = outData a
val i1 = i+1
val title = path^Int.toString i1
val a1 = (dtd,f,(0,title^".",nil,nil),(stag,i1,path,titles)::stack)
in
if mt then hookEndTag (a1,()) else a1
end
fun hookData0 ((dtd,f,(i,path,titles,data),stack),item) =
let
val i1 = i+1
val iS = Int.toString i1
in
(dtd,f,(i1,path,titles,(iS,item)::data),stack)
end
fun hookData (a,x) = hookData0(a,DATA x)
fun hookCData (a,x) = hookData0(a,CDATA x)
fun hookCharRef (a,x) = hookData0(a,CREF x)
fun hookError (a,pe) = a before printError pe
fun hookWarning (a,pw) = a before printWarning pw
fun hookXml((dtd,_,_,_),_) =
let val f = if !O_OUTPUT_FILE="-" then TextIO.stdOut
else TextIO.openOut (!O_OUTPUT_FILE)
val f1 = outLines(f,["graph: {",
" layoutalgorithm: tree",
" smanhattanedges: yes",
" spreadlevel : 1000",
"",
" colorentry "^piColor^": "^piRgb,
" colorentry "^comColor^": "^comRgb,
" colorentry "^txtColor^": "^txtRgb,
" colorentry "^elColor^": "^elRgb,
" colorentry "^valColor^": "^valRgb,
" colorentry "^attColor^": "^attRgb,
" colorentry "^misColor^": "^misRgb,
" colorentry "^typColor^": "^typRgb,
"",
" infoname 1: \"Source Position\"",
" infoname 2: \"Attribute Type\"",
" infoname 3: \"\"",
"",
" edge.arrowstyle : none",
" foldedge.thickness: 1"
])
in
(dtd,f1,(0,"",nil,nil),nil)
end
fun hookFinish (_,f,(_,_,titles,_),_) =
let
val f1 = outRoot(f,titles)
val f2 = outLine(f1,"}")
in
if !O_OUTPUT_FILE="-" then () else TextIO.closeOut f2
end
end
functor VizHooks_ ( structure Dtd : Dtd ) = VizHooks ( structure Dtd = Dtd ) : Hooks

@ -0,0 +1,106 @@
signature VizOptions =
sig
val O_SILENT : bool ref
val O_ERROR_DEVICE : TextIO.outstream ref
val O_ERROR_LINEWIDTH : int ref
val O_OUTPUT_FILE : string ref
val setVizDefaults : unit -> unit
val setVizOptions : Options.Option list * (string -> unit)
-> bool * bool * string option * string option
val vizUsage : Options.Usage
end
structure VizOptions : VizOptions =
struct
open Options
val O_SILENT = ref false
val O_ERROR_DEVICE = ref TextIO.stdErr
val O_ERROR_LINEWIDTH = ref 80
val O_OUTPUT_FILE = ref "-"
fun setVizDefaults () =
let
val _ = O_SILENT := false
val _ = O_ERROR_DEVICE := TextIO.stdErr
val _ = O_ERROR_LINEWIDTH := 80
val _ = O_OUTPUT_FILE := "-"
in ()
end
val vizUsage =
[U_ITEM(["-o <file>","--output=<file>"],"Write output to file (stdout)"),
U_ITEM(["-s","--silent"],"Suppress reporting of errors and warnings"),
U_ITEM(["-e <file>","--error-output=<file>"],"Redirect errors to file (stderr)"),
U_SEP,
U_ITEM(["--version"],"Print the version number and exit"),
U_ITEM(["-?","--help"],"Print this text and exit"),
U_ITEM(["--"],"Do not recognize remaining arguments as options")
]
fun setVizOptions (opts,optError) =
let
fun onlyOne what = "at most one "^what^" may be specified"
fun unknown pre opt = String.concat ["unknown option ",pre,opt]
fun hasNoArg pre key = String.concat ["option ",pre,key," expects no argument"]
fun mustHave pre key = String.concat ["option ",pre,key," must have an argument"]
fun check_noarg(key,valOpt) =
if isSome valOpt then optError (hasNoArg "--" key) else ()
fun do_long (pars as (v,h,e,f)) (key,valOpt) =
case key
of "help" => (v,true,e,f) before check_noarg(key,valOpt)
| "version" => (true,h,e,f) before check_noarg(key,valOpt)
| "silent" => pars before O_SILENT := true before check_noarg(key,valOpt)
| "output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => pars before O_OUTPUT_FILE := s)
| "error-output" =>
(case valOpt
of NONE => pars before optError (mustHave "--" key)
| SOME s => (v,h,SOME s,f))
| _ => pars before optError(unknown "--" key)
fun do_short (pars as (v,h,e,f)) (cs,opts) =
case cs
of nil => doit pars opts
| [#"o"] => (case opts
of OPT_STRING s::opts1 => (O_OUTPUT_FILE := s;
doit pars opts1)
| _ => (optError (mustHave "-" "o"); doit pars opts))
| [#"e"] => (case opts
of OPT_STRING s::opts1 => doit (v,h,SOME s,f) opts1
| _ => (optError (mustHave "-" "e"); doit pars opts))
| cs => doit (foldr
(fn (c,pars)
=> case c
of #"s" => pars before O_SILENT := true
| #"o" => pars before optError (mustHave "-" "o")
| #"e" => pars before optError (mustHave "-" "e")
| #"?" => (v,true,e,f)
| c => pars before
optError(unknown "-" (String.implode [c])))
pars cs) opts
and doit pars nil = pars
| doit (pars as (v,h,e,f)) (opt::opts) =
case opt
of OPT_LONG(key,valOpt) => doit (do_long pars (key,valOpt)) opts
| OPT_SHORT cs => do_short pars (cs,opts)
| OPT_STRING s => if isSome f
then let val _ = optError(onlyOne "input file")
in doit pars opts
end
else doit (v,h,e,SOME s) opts
| OPT_NOOPT => doit pars opts
| OPT_NEG cs => let val _ = if null cs then ()
else app (fn c => optError
(unknown "-n" (String.implode[c]))) cs
in doit pars opts
end
in doit (false,false,NONE,NONE) opts
end
end
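(* Usage sketch (illustrative only, not part of the original sources; the
   file names are made-up sample values and the error callback is a
   placeholder):

     val (version,help,errFile,inFile) =
        VizOptions.setVizOptions
           ([Options.OPT_SHORT [#"s"],
             Options.OPT_LONG ("output",SOME "out.vcg"),
             Options.OPT_STRING "doc.xml"],
            fn msg => TextIO.output (TextIO.stdErr,msg^"\n"))

   Following the code above, this should return
   (false,false,NONE,SOME "doc.xml") and leave O_SILENT set to true and
   O_OUTPUT_FILE set to "out.vcg".  The four components are, in order:
   version requested, help requested, error output file, input file. *)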

@ -0,0 +1,13 @@
structure CatData =
struct
datatype CatEntry =
E_BASE of Uri.Uri
| E_DELEGATE of string * Uri.Uri
| E_EXTEND of Uri.Uri
| E_MAP of string * Uri.Uri
| E_REMAP of Uri.Uri * Uri.Uri
type Catalog = Uri.Uri * CatEntry list
end
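(* Usage sketch (illustrative only, not part of the original sources): a
   Catalog pairs the URI it was loaded from with its entries.  The URIs and
   the public identifier are made-up sample values; Uri.String2Uri is the
   conversion the option handling uses for catalog names.

     val sample : CatData.Catalog =
        (Uri.String2Uri "catalog.soc",
         [CatData.E_BASE (Uri.String2Uri "dtds/"),
          CatData.E_MAP ("-//EXAMPLE//DTD Sample//EN",
                         Uri.String2Uri "sample.dtd"),
          CatData.E_REMAP (Uri.String2Uri "http://old.example.org/sample.dtd",
                           Uri.String2Uri "sample.dtd")])
*)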

@ -0,0 +1,54 @@
signature CatDtd =
sig
type Dtd
val baseIdx : int
val delegateIdx : int
val extendIdx : int
val mapIdx : int
val remapIdx : int
val hrefIdx : int
val pubidIdx : int
val sysidIdx : int
val Index2AttNot : Dtd -> int -> UniChar.Data
val Index2Element : Dtd -> int -> UniChar.Data
end
structure CatDtd =
struct
open Dtd
val baseGi = UniChar.String2Data "Base"
val delegateGi = UniChar.String2Data "Delegate"
val extendGi = UniChar.String2Data "Extend"
val mapGi = UniChar.String2Data "Map"
val remapGi = UniChar.String2Data "Remap"
val hrefAtt = UniChar.String2Data "HRef"
val pubidAtt = UniChar.String2Data "PublicId"
val sysidAtt = UniChar.String2Data "SystemId"
fun initDtdTables () =
let
val dtd = Dtd.initDtdTables()
val _ = app (ignore o (Element2Index dtd)) [baseGi,delegateGi,extendGi,mapGi,remapGi]
val _ = app (ignore o (AttNot2Index dtd)) [hrefAtt,pubidAtt,sysidAtt]
in dtd
end
local
val dtd = initDtdTables()
in
val baseIdx = Element2Index dtd baseGi
val delegateIdx = Element2Index dtd delegateGi
val extendIdx = Element2Index dtd extendGi
val mapIdx = Element2Index dtd mapGi
val remapIdx = Element2Index dtd remapGi
val hrefIdx = AttNot2Index dtd hrefAtt
val pubidIdx = AttNot2Index dtd pubidAtt
val sysidIdx = AttNot2Index dtd sysidAtt
end
end

@ -0,0 +1,117 @@
signature CatError =
sig
type Position
val nullPosition : Position
val Position2String : Position -> string
datatype Location =
LOC_CATALOG
| LOC_COMMENT
| LOC_NOCOMMENT
| LOC_PUBID
| LOC_SYSID
datatype Expected =
EXP_NAME
| EXP_LITERAL
datatype CatError =
ERR_DECODE_ERROR of Decode.Error.DecodeError
| ERR_NO_SUCH_FILE of string * string
| ERR_ILLEGAL_HERE of UniChar.Char * Location
| ERR_MISSING_WHITE
| ERR_EOF of Location
| ERR_EXPECTED of Expected * UniChar.Char
| ERR_XML of Errors.Error
| ERR_MISSING_ATT of UniChar.Data * UniChar.Data
| ERR_NON_PUBID of UniChar.Data * UniChar.Data
val catMessage : CatError -> string list
end
structure CatError : CatError =
struct
open Errors UtilError UtilString
type Position = string * int * int
val nullPosition = ("",0,0)
fun Position2String (fname,l,c) =
if fname="" then ""
else String.concat ["[",fname,":",Int2String l,".",Int2String c,"]"]
datatype Location =
LOC_CATALOG
| LOC_COMMENT
| LOC_NOCOMMENT
| LOC_PUBID
| LOC_SYSID
fun Location2String loc =
case loc
of LOC_CATALOG => "catalog file"
| LOC_COMMENT => "comment"
| LOC_NOCOMMENT => "something other than a comment"
| LOC_PUBID => "public identifier"
| LOC_SYSID => "system identifier"
fun InLocation2String loc =
case loc
of LOC_CATALOG => "in a catalog file"
| LOC_COMMENT => "in a comment"
| LOC_NOCOMMENT => "outside of comments"
| LOC_PUBID => "in a public identifier"
| LOC_SYSID => "in a system identifier"
datatype Expected =
EXP_NAME
| EXP_LITERAL
fun Expected2String exp =
case exp
of EXP_NAME => "a name"
| EXP_LITERAL => "a literal"
datatype CatError =
ERR_DECODE_ERROR of Decode.Error.DecodeError
| ERR_NO_SUCH_FILE of string * string
| ERR_ILLEGAL_HERE of UniChar.Char * Location
| ERR_MISSING_WHITE
| ERR_EOF of Location
| ERR_EXPECTED of Expected * UniChar.Char
| ERR_XML of Error
| ERR_MISSING_ATT of UniChar.Data * UniChar.Data
| ERR_NON_PUBID of UniChar.Data * UniChar.Data
fun catMessage err =
case err
of ERR_DECODE_ERROR err => Decode.Error.decodeMessage err
| ERR_NO_SUCH_FILE(f,msg) => ["Could not open file",quoteErrorString f,"("^msg^")"]
| ERR_ILLEGAL_HERE (c,loc) =>
["Character",quoteErrorChar c,"is not allowed",InLocation2String loc]
| ERR_MISSING_WHITE => ["Missing white space"]
| ERR_EOF loc => [toUpperFirst (Location2String loc),"ended by end of file"]
| ERR_EXPECTED (exp,c) =>
["Expected",Expected2String exp,"but found",quoteErrorChar c]
| ERR_XML err => errorMessage err
| ERR_MISSING_ATT(elem,att) =>
["Element",quoteErrorData elem,"has no",quoteErrorData att,"attribute"]
| ERR_NON_PUBID(att,cs) =>
["Value specified for attribute",quoteErrorData att,"contains non-PublicId",
case cs
of [c] => "character "^quoteErrorChar c
| cs => List2xString ("characters ",", ","") quoteErrorChar cs]
end
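(* Usage sketch (illustrative only, not part of the original sources; it
   assumes that Int2String renders integers in plain decimal):

     CatError.Position2String ("cat.soc",3,7)     gives "[cat.soc:3.7]"
     CatError.Position2String CatError.nullPosition   gives ""
     CatError.catMessage CatError.ERR_MISSING_WHITE   gives ["Missing white space"]

   Positions are (file name, line, column).  catMessage returns the message
   in pieces; presumably the caller joins them with single spaces before
   printing, which is why the pieces carry no leading or trailing blanks. *)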

@ -0,0 +1,74 @@
signature CatFile =
sig
type CatFile
type Position
val catOpenFile : Uri.Uri -> CatFile
val catCloseFile : CatFile -> unit
val catGetChar : CatFile -> UniChar.Char * CatFile
val catPos : CatFile -> CatError.Position
end
functor CatFile ( structure Params : CatParams ) : CatFile =
struct
open UniChar CatError Decode Params Uri UtilError
(* column, line, break *)
type PosInfo = int * int * bool
val startPos = (0,1,false)
datatype CatFile =
NOFILE of string * PosInfo
| DIRECT of DecFile * PosInfo
fun catPos cf =
case cf
of NOFILE (uri,(col,line,_)) => (uri,line,col)
| DIRECT (dec,(col,line,_)) => (decName dec,line,col)
fun catOpenFile uri =
let val dec = decOpenUni(SOME uri,!O_CATALOG_ENC)
in DIRECT(dec,startPos)
end
handle NoSuchFile fmsg => let val _ = catError(nullPosition,ERR_NO_SUCH_FILE fmsg)
in NOFILE(Uri2String uri,startPos)
end
fun catCloseFile cf =
case cf
of NOFILE _ => ()
| DIRECT(dec,_) => ignore (decClose dec)
fun catGetChar cf =
case cf
of NOFILE _ => (0wx00,cf)
| DIRECT(dec,(col,line,brk)) =>
(let val (c,dec1) = decGetChar dec
in case c
of 0wx09 => (c,DIRECT(dec1,(col+1,line,false)))
| 0wx0A => if brk then catGetChar(DIRECT(dec1,(col,line,false)))
else (c,DIRECT(dec1,(0,line+1,false)))
| 0wx0D => (0wx0A,DIRECT(dec1,(0,line+1,true)))
| _ => if c>=0wx20 then (c,DIRECT(dec1,(col+1,line,false)))
else let val err = ERR_ILLEGAL_HERE(c,LOC_CATALOG)
val _ = catError(catPos cf,err)
in catGetChar(DIRECT(dec1,(col+1,line,false)))
end
end
handle DecEof dec => (0wx00,NOFILE(decName dec,(col,line,brk)))
| DecError(dec,_,err) =>
let val _ = catError(catPos cf,ERR_DECODE_ERROR err)
in catGetChar(DIRECT(dec,(col,line,false)))
end
)
end
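(* Behaviour sketch (illustrative only, not part of the original sources):
   catGetChar normalizes line breaks and tracks the position.  For an input
   consisting of the characters  a, CR (0wx0D), LF (0wx0A), b  successive
   calls should return

     0wx61   position afterwards: line 1, column 1
     0wx0A   position afterwards: line 2, column 0   (from the CR)
     0wx62   position afterwards: line 2, column 1   (the LF after CR is skipped)
     0wx00   end of file

   A lone CR or a lone LF is likewise reported as a single 0wx0A. *)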

@ -0,0 +1,89 @@
signature CatHooks =
sig
type AppData = CatData.CatEntry list
val initCatHooks : unit -> AppData
end
functor CatHooks (structure Params : CatParams
structure Dtd : CatDtd ) =
struct
open
Dtd HookData IgnoreHooks Params UniChar UniClasses Uri UtilList
CatData CatError
type AppData = Dtd * CatEntry list
type AppFinal = CatEntry list
fun initCatHooks dtd = (dtd,nil)
fun hookError (a,(pos,err)) = a before catError (pos,ERR_XML err)
fun getAtt dtd (pos,elem,att,trans) atts =
let
val cvOpt = findAndMap
(fn (i,ap,_) => if i<>att then NONE
else case ap
of AP_DEFAULT(_,cv,_) => SOME cv
| AP_PRESENT(_,cv,_) => SOME cv
| _ => NONE)
atts
in case cvOpt
of SOME cv => trans (pos,att) cv
| NONE => NONE before catError
(pos,ERR_MISSING_ATT(Index2Element dtd elem,Index2AttNot dtd att))
end
fun makePubid dtd (pos,att) cv =
let val (cs,bad) =
Vector.foldr
(fn (c,(cs,bad)) => if isPubid c then (Char2char c::cs,bad)
else (cs,c::bad))
(nil,nil) cv
in if null bad then SOME(String.implode cs)
else NONE before catError(pos,ERR_NON_PUBID(Index2AttNot dtd att,bad))
end
fun makeUri (pos,att) cv = SOME cv
fun hookStartTag (a as (dtd,items),((_,pos),elem,atts,_,_)) =
if elem=baseIdx
then let val hrefOpt = getAtt dtd (pos,elem,hrefIdx,makeUri) atts
in case hrefOpt
of NONE => a
| SOME href => (dtd,E_BASE (Vector2Uri href)::items)
end
else if elem=delegateIdx
then let val hrefOpt = getAtt dtd (pos,elem,hrefIdx,makeUri) atts
val pubidOpt = getAtt dtd (pos,elem,pubidIdx,makePubid dtd) atts
in case (hrefOpt,pubidOpt)
of (SOME href,SOME pubid) =>
(dtd,E_DELEGATE(pubid,Vector2Uri href)::items)
| _ => a
end
else if elem=extendIdx
then let val hrefOpt = getAtt dtd (pos,elem,hrefIdx,makeUri) atts
in case hrefOpt
of NONE => a
| SOME href => (dtd,E_EXTEND (Vector2Uri href)::items)
end
else if elem=mapIdx
then let val hrefOpt = getAtt dtd (pos,elem,hrefIdx,makeUri) atts
val pubidOpt = getAtt dtd (pos,elem,pubidIdx,makePubid dtd) atts
in case (hrefOpt,pubidOpt)
of (SOME href,SOME pubid) =>
(dtd,E_MAP(pubid,Vector2Uri href)::items)
| _ => a
end
else if elem=remapIdx
then let val hrefOpt = getAtt dtd (pos,elem,hrefIdx,makeUri) atts
val sysidOpt = getAtt dtd (pos,elem,sysidIdx,makeUri) atts
in case (hrefOpt,sysidOpt)
of (SOME href,SOME sysid) =>
(dtd,E_REMAP(Vector2Uri sysid,Vector2Uri href)::items)
| _ => a
end
else a
fun hookFinish (_,items) = rev items
end

@ -0,0 +1,136 @@
signature CatOptions =
sig
val O_CATALOG_FILES : Uri.Uri list ref
val O_PREFER_SOCAT : bool ref
val O_PREFER_SYSID : bool ref
val O_PREFER_CATALOG : bool ref
val O_SUPPORT_REMAP : bool ref
val O_CATALOG_ENC : Encoding.Encoding ref
val setCatalogDefaults : unit -> unit
val setCatalogOptions : Options.Option list * (string -> unit) -> Options.Option list
val catalogUsage : Options.Usage
end
functor CatOptions () : CatOptions =
struct
open Encoding Options Uri
val O_CATALOG_FILES = ref nil: Uri list ref
val O_PREFER_SOCAT = ref false
val O_PREFER_SYSID = ref false
val O_PREFER_CATALOG = ref true
val O_SUPPORT_REMAP = ref true
val O_CATALOG_ENC = ref LATIN1
fun setCatalogDefaults() =
let
val _ = O_CATALOG_FILES := nil
val _ = O_PREFER_SOCAT := false
val _ = O_PREFER_SYSID := false
val _ = O_PREFER_CATALOG := true
val _ = O_SUPPORT_REMAP := true
val _ = O_CATALOG_ENC := LATIN1
in ()
end
val catalogUsage =
[U_ITEM(["-C <url>","--catalog=<url>"],"Use catalog <url>"),
U_ITEM(["--catalog-syntax=(soc|xml)"],"Default syntax for catalogs (xml)"),
U_ITEM(["--catalog-encoding=<enc>"],"Default encoding for Socat catalogs (LATIN1)"),
U_ITEM(["--catalog-remap=[(yes|no)]"],"Support remapping of system identifiers (yes)"),
U_ITEM(["--catalog-priority=(map|remap|sys)"],"Resolving strategy in catalogs (map)")
]
fun setCatalogOptions (opts,doError) =
let
val catalogs = ref nil:string list ref
fun hasNoArg key = "option "^key^" has no argument"
fun mustHave key = String.concat ["option ",key," must have an argument"]
fun mustBe(key,what) = String.concat ["the argument to --",key," must be ",what]
val yesNo = "'yes' or 'no'"
val mapRemapSys = "'map', 'remap' or 'sys'"
val encName = "'ascii', 'latin1', 'utf8' or 'utf16'"
val syntaxName = "'soc' or 'xml'"
fun do_catalog valOpt =
case valOpt
of NONE => doError(mustHave "--catalog")
| SOME s => catalogs := s::(!catalogs)
fun do_prio valOpt =
let fun set(cat,sys) = (O_PREFER_CATALOG := cat; O_PREFER_SYSID := sys)
in case valOpt
of NONE => doError(mustHave "--catalog-priority")
| SOME "map" => set(true,false)
| SOME "remap" => set(true,true)
| SOME "sys" => set(false,true)
| SOME s => doError(mustBe("catalog-priority",mapRemapSys))
end
fun do_enc valOpt =
case valOpt
of NONE => doError(mustHave "--catalog-encoding")
| SOME s => case isEncoding s
of NOENC => doError("unsupported encoding "^s)
| enc => O_CATALOG_ENC := enc
fun do_remap valOpt =
case valOpt
of NONE => doError(mustHave "--catalog-remap")
| SOME "no" => O_SUPPORT_REMAP := false
| SOME "yes" => O_SUPPORT_REMAP := true
| SOME s => doError(mustBe("catalog-remap",yesNo))
fun do_syntax valOpt =
case valOpt
of NONE => doError(mustHave "--catalog-syntax")
| SOME "soc" => O_PREFER_SOCAT := true
| SOME "xml" => O_PREFER_SOCAT := false
| SOME s => doError(mustBe("catalog-syntax",syntaxName))
fun do_long(key,valOpt) =
case key
of "catalog" => true before do_catalog valOpt
| "catalog-remap" => true before do_remap valOpt
| "catalog-syntax" => true before do_syntax valOpt
| "catalog-encoding" => true before do_enc valOpt
| "catalog-priority" => true before do_prio valOpt
| _ => false
fun do_short cs opts =
case cs
of nil => doit opts
| [#"C"] =>
(case opts
of OPT_STRING s::opts1 => (catalogs := s::(!catalogs);
doit opts1)
| _ => let val _ = doError (mustHave "-C")
in doit opts
end)
| cs =>
let val cs1 = List.filter
(fn c => if #"C"<>c then true
else false before doError (mustHave "-C")) cs
in if null cs1 then doit opts else (OPT_SHORT cs1)::doit opts
end
and doit nil = nil
| doit (opt::opts) =
case opt
of OPT_NOOPT => opts
| OPT_LONG(key,value) => if do_long(key,value) then doit opts
else opt::doit opts
| OPT_SHORT cs => do_short cs opts
| OPT_NEG cs => opt::doit opts
| OPT_STRING s => opt::doit opts
val opts1 = doit opts
val uris = map String2Uri (!catalogs)
val _ = O_CATALOG_FILES := uris
in opts1
end
end
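(* Usage sketch (illustrative only, not part of the original sources; the
   structure name CO, the file names and the error callback are made up):

     structure CO = CatOptions ()
     val rest = CO.setCatalogOptions
                   ([Options.OPT_LONG ("catalog",SOME "cat.xml"),
                     Options.OPT_LONG ("catalog-priority",SOME "remap"),
                     Options.OPT_STRING "doc.xml"],
                    fn msg => TextIO.output (TextIO.stdErr,msg^"\n"))

   Following the code above, the two catalog options are consumed, so rest
   should be [Options.OPT_STRING "doc.xml"].  Afterwards O_CATALOG_FILES
   holds the URI for "cat.xml", and both O_PREFER_CATALOG and O_PREFER_SYSID
   are true.  Options that are not catalog options are passed through
   unchanged for the next option handler. *)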

@ -0,0 +1,17 @@
signature CatParams =
sig
val O_CATALOG_FILES : Uri.Uri list ref
val O_PREFER_SOCAT : bool ref
val O_PREFER_SYSID : bool ref
val O_PREFER_CATALOG : bool ref
val O_SUPPORT_REMAP : bool ref
val O_CATALOG_ENC : Encoding.Encoding ref
val catError : CatError.Position * CatError.CatError -> unit
end

@ -0,0 +1,69 @@
signature CatParse =
sig
val parseCatalog : Uri.Uri -> CatData.Catalog
end
functor CatParse (structure Params : CatParams) : CatParse =
struct
structure SocatParse = SocatParse (structure Params = Params)
structure ParserOptions =
struct
structure Options = ParserOptions()
open Options
local
fun setDefaults() =
let
val _ = setParserDefaults()
val _ = O_WARN_MULT_ENUM := false
val _ = O_WARN_XML_DECL := false
val _ = O_WARN_ATT_NO_ELEM := false
val _ = O_WARN_MULT_ENT_DECL := false
val _ = O_WARN_MULT_NOT_DECL := false
val _ = O_WARN_MULT_ATT_DEF := false
val _ = O_WARN_MULT_ATT_DECL := false
val _ = O_WARN_SHOULD_DECLARE := false
val _ = O_VALIDATE := false
val _ = O_COMPATIBILITY := false
val _ = O_INTEROPERABILITY := false
val _ = O_INCLUDE_EXT_PARSED := true
in ()
end
in
val setParserDefaults = setDefaults
end
end
structure CatHooks = CatHooks (structure Params = Params
structure Dtd = CatDtd)
structure Parse = Parse (structure Dtd = CatDtd
structure Hooks = CatHooks
structure Resolve = ResolveNull
structure ParserOptions = ParserOptions)
open CatHooks CatDtd Parse ParserOptions SocatParse Uri
fun parseXmlCat uri =
let
val _ = setParserDefaults()
val dtd = initDtdTables()
val items = parseDocument (SOME uri) (SOME dtd) (initCatHooks dtd)
in
(uri,items)
end
fun isSocatSuffix x = x="soc" orelse x="SOC"
fun isXmlSuffix x = x="xml" orelse x="XML"
fun parseCatalog uri =
let val suffix = uriSuffix uri
in if isSocatSuffix suffix then parseSoCat uri
else (if isXmlSuffix suffix then parseXmlCat uri
else (if !O_PREFER_SOCAT then parseSoCat uri
else parseXmlCat uri))
end
end
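(* Dispatch sketch (illustrative only, not part of the original sources; it
   assumes that uriSuffix returns the text after the last "." of the URI,
   which is defined elsewhere):

     parseCatalog (String2Uri "cat.soc")   uses parseSoCat
     parseCatalog (String2Uri "cat.XML")   uses parseXmlCat
     parseCatalog (String2Uri "catalog")   uses parseSoCat if !O_PREFER_SOCAT,
                                           otherwise parseXmlCat

   Only the all-lowercase and all-uppercase spellings are tested, so a
   mixed-case suffix such as "Soc" falls through to the O_PREFER_SOCAT
   default. *)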

@ -0,0 +1,25 @@
functor ResolveCatalog ( structure Params : CatParams ) : Resolve =
struct
structure Catalog = Catalog ( structure Params = Params )
open Base Errors
fun resolveExtId (id as EXTID(pub,sys)) =
let val pub1 = case pub
of NONE => NONE
| SOME (str,_) => SOME str
val sys1 = case sys
of NONE => NONE
| SOME (base,file,_) => SOME(base,file)
in case Catalog.resolveExtId (pub1,sys1)
of NONE => raise NoSuchFile ("","Could not generate system identifier")
| SOME uri => uri
end
end

139
fxp/src/Catalog/catalog.sml Normal file
@ -0,0 +1,139 @@
signature Catalog =
sig
val resolveExtId : string option * (Uri.Uri * Uri.Uri) option -> Uri.Uri option
end
functor Catalog ( structure Params : CatParams ) : Catalog =
struct
structure CatParse = CatParse ( structure Params = Params )
open CatData CatParse Params Uri UriDict
val catDict = makeDict("catalog",6,NONE:Catalog option)
fun getCatalog uri =
let val idx = getIndex(catDict,uri)
in case getByIndex(catDict,idx)
of SOME cat => cat
| NONE => let val cat = parseCatalog uri
val _ = setByIndex(catDict,idx,SOME cat)
in cat
end
end
datatype SearchType =
SYS of Uri
| PUB of string
datatype SearchResult =
FOUND of Uri * Uri
| NOTFOUND of Uri list
fun searchId id =
let
fun searchOne (base,other) nil = NOTFOUND other
| searchOne (base,other) (entry::entries) =
case entry
of E_BASE path =>
let val newBase = uriJoin(base,path)
in searchOne (newBase,other) entries
end
| E_EXTEND path =>
let val fullPath = uriJoin(base,path)
in searchOne (base,fullPath::other) entries
end
| E_DELEGATE(prefix,path) =>
(case id
of PUB pid => if String.isPrefix prefix pid
then let val fullPath = uriJoin(base,path)
in searchOne (base,fullPath::other) entries
end
else searchOne (base,other) entries
| SYS _ => searchOne (base,other) entries)
| E_MAP(pubid,path) =>
(case id
of PUB pid => if pubid=pid then FOUND (base,path)
else searchOne (base,other) entries
| _ => searchOne (base,other) entries)
| E_REMAP(sysid,path) =>
(case id
of SYS sid => if sysid=sid then FOUND(base,path)
else searchOne (base,other) entries
| _ => searchOne (base,other) entries)
fun searchLevel other nil = NOTFOUND(rev other)
| searchLevel other (fname::fnames) =
let
val (base,entries) = getCatalog fname
in
case searchOne (base,other) entries
of FOUND bp => FOUND bp
| NOTFOUND other' => searchLevel other' fnames
end
fun searchAll fnames =
if null fnames then NONE
else case searchLevel nil fnames
of FOUND bp => SOME bp
| NOTFOUND other => searchAll other
val fnames = !O_CATALOG_FILES
in
case id
of PUB _ => searchAll fnames
| SYS _ => if !O_SUPPORT_REMAP then searchAll fnames else NONE
end
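(* Resolution order, as selected by --catalog-priority (see CatOptions):
     map   : PUBLIC/Map lookup, then SYSTEM/Remap lookup, then the given system id
     remap : SYSTEM/Remap lookup, then PUBLIC/Map lookup, then the given system id
     sys   : the given system id if present, otherwise PUBLIC/Map lookup
   SYSTEM/Remap lookups are skipped unless O_SUPPORT_REMAP is set.  If no
   catalog files are configured, the given system id is resolved against its
   base without consulting any catalog. *)
fun resolveExtId (pub,sys) =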
let
fun resolvePubCat () =
case pub
of NONE => NONE
| SOME id => case searchId (PUB id)
of NONE => NONE
| SOME(base,sysid) => case searchId (SYS sysid)
of NONE => SOME(base,sysid)
| new => new
fun resolveSysCat () =
case sys
of NONE => NONE
| SOME(base,id) => searchId (SYS id)
fun resolveCat () =
if !O_PREFER_SYSID
then case resolveSysCat ()
of NONE => resolvePubCat ()
| found => found
else case resolvePubCat ()
of NONE => resolveSysCat ()
| found => found
fun resolve () =
if !O_PREFER_CATALOG
then case resolveCat ()
of NONE => (case sys
of NONE => NONE
| SOME(base,id) => SOME(base,id))
| found => found
else case sys
of NONE => resolvePubCat ()
| SOME(base,id) => SOME(base,id)
in
if null (!O_CATALOG_FILES)
then case sys
of NONE => NONE
| SOME(base,id) => SOME (uriJoin (base,id))
else case resolve ()
of NONE => NONE
| SOME bp => SOME (uriJoin bp)
end
end

@ -0,0 +1,293 @@
signature SocatParse =
sig
val parseSoCat : Uri.Uri -> CatData.Catalog
end
functor SocatParse ( structure Params : CatParams ) : SocatParse =
struct
structure CatFile = CatFile ( structure Params = Params )
open CatData CatError CatFile Params UniChar UniClasses Uri
exception SyntaxError of UniChar.Char * CatFile.CatFile
exception NotFound of UniChar.Char * CatFile.CatFile
val getChar = catGetChar
fun parseName' (c,f) =
if isName c then let val (cs,cf1) = parseName' (getChar f)
in (c::cs,cf1)
end
else (nil,(c,f))
fun parseName (c,f) =
if isNms c then let val (cs,cf1) = parseName' (getChar f)
in (c::cs,cf1)
end
else raise NotFound (c,f)
datatype Keyword =
KW_BASE
| KW_CATALOG
| KW_DELEGATE
| KW_PUBLIC
| KW_SYSTEM
| KW_OTHER of UniChar.Data
fun parseKeyword cf =
let
val (name,cf1) = parseName cf
val kw = case name
of [0wx42,0wx41,0wx53,0wx45] => KW_BASE
| [0wx43,0wx41,0wx54,0wx41,0wx4c,0wx4f,0wx47] => KW_CATALOG
| [0wx44,0wx45,0wx4c,0wx45,0wx47,0wx41,0wx54,0wx45] => KW_DELEGATE
| [0wx50,0wx55,0wx42,0wx4c,0wx49,0wx43] => KW_PUBLIC
| [0wx53,0wx59,0wx53,0wx54,0wx45,0wx4d] => KW_SYSTEM
| _ => KW_OTHER name
in (kw,cf1)
end
fun parseSysLit' quote f =
let
fun doit text (c,f) =
if c=quote then (text,getChar f)
else if c<>0wx0 then doit (c::text) (getChar f)
else let val _ = catError(catPos f,ERR_EOF LOC_SYSID)
in (text,(c,f))
end
val (text,cf1) = doit nil (getChar f)
in (Data2Uri(rev text),cf1)
end
fun parseSysLit req (c,f) =
if c=0wx22 orelse c=0wx27 then parseSysLit' c f
else if req then let val _ = catError(catPos f,ERR_EXPECTED(EXP_LITERAL,c))
in raise SyntaxError (c,f)
end
else raise NotFound (c,f)
fun parsePubLit' quote f =
let
fun doit (hadSpace,atStart,text) (c,f) =
case c
of 0wx0 => let val _ = catError(catPos f,ERR_EOF LOC_PUBID)
in (text,(c,f))
end
| 0wx0A => doit (true,atStart,text) (getChar f)
| 0wx20 => doit (true,atStart,text) (getChar f)
| _ =>
if c=quote then (text,getChar f)
else if isPubid c
then if hadSpace andalso not atStart
then doit (false,false,c::0wx20::text) (getChar f)
else doit (false,false,c::text) (getChar f)
else let val _ = catError(catPos f,ERR_ILLEGAL_HERE(c,LOC_PUBID))
in doit (hadSpace,atStart,text) (getChar f)
end
val (text,cf1) = doit (false,true,nil) (getChar f)
in (Latin2String(rev text),cf1)
end
fun parsePubLit (c,f) =
if c=0wx22 orelse c=0wx27 then parsePubLit' c f
else let val _ = catError(catPos f,ERR_EXPECTED(EXP_LITERAL,c))
in raise SyntaxError (c,f)
end
fun skipComment (c,f) =
case c
of 0wx00 => let val _ = catError(catPos f,ERR_EOF LOC_COMMENT)
in (c,f)
end
| 0wx2D => let val (c1,f1) = getChar f
in if c1 = 0wx2D then (getChar f1) else skipComment (c1,f1)
end
| _ => skipComment (getChar f)
fun skipCopt (c,f) =
case c
of 0wx00 => (c,f)
| 0wx2D => let val (c1,f1) = getChar f
in if c1=0wx2D then skipComment (getChar f1)
else let val _ = catError(catPos f,ERR_ILLEGAL_HERE(c,LOC_NOCOMMENT))
in (c1,f1)
end
end
| _ => (c,f)
fun skipScomm req0 cf =
let
fun endit req (c,f) =
if req andalso c<>0wx00
then let val _ = catError(catPos f,ERR_MISSING_WHITE)
in (c,f)
end
else (c,f)
fun doit req (c,f) =
case c
of 0wx00 => endit req (c,f)
| 0wx09 => doit false (getChar f)
| 0wx0A => doit false (getChar f)
| 0wx20 => doit false (getChar f)
| 0wx22 => endit req (c,f)
| 0wx27 => endit req (c,f)
| 0wx2D =>
let val (c1,f1) = getChar f
in if c1=0wx2D
then let val _ = if not req then ()
else catError(catPos f1,ERR_MISSING_WHITE)
val cf1 = skipComment (getChar f1)
in doit true cf1
end
else let val _ = catError(catPos f,ERR_ILLEGAL_HERE(c,LOC_NOCOMMENT))
in doit req (c1,f1)
end
end
| _ => if isNms c then endit req (c,f)
else let val _ = catError(catPos f,ERR_ILLEGAL_HERE(c,LOC_NOCOMMENT))
in doit req (getChar f)
end
in doit req0 cf
end
val skipWS = skipScomm true
val skipCommWS = (skipScomm false) o skipCopt
val skipWSComm = skipScomm false
fun skipOther cf =
let
val cf1 = skipWS cf
val cf2 = let val (_,cf') = parseName cf1
in skipWS cf'
end
handle NotFound cf => cf
fun doit cf =
let val (_,cf1) = parseSysLit false cf
in doit (skipWS cf1)
end
handle NotFound(c,f) => (c,f)
in
(NONE,doit cf2)
end
fun parseBase cf =
let
val cf1 = skipWS cf
val (lit,cf2) = parseSysLit true cf1
val cf3 = skipWS cf2
in
(SOME(E_BASE lit),cf3)
end
fun parseExtend cf =
let
val cf1 = skipWS cf
val (lit,cf2) = parseSysLit true cf1
val cf3 = skipWS cf2
in
(SOME(E_EXTEND lit),cf3)
end
fun parseDelegate cf =
let
val cf1 = skipWS cf
val (pub,cf2) = parsePubLit cf1
val cf3 = skipWS cf2
val (sys,cf4) = parseSysLit true cf3
val cf5 = skipWS cf4
in
(SOME(E_DELEGATE(pub,sys)),cf5)
end
fun parseRemap cf =
let
val cf1 = skipWS cf
val (sys0,cf2) = parseSysLit true cf1
val cf3 = skipWS cf2
val (sys,cf4) = parseSysLit true cf3
val cf5 = skipWS cf4
in
(SOME(E_REMAP(sys0,sys)),cf5)
end
fun parseMap cf =
let
val cf1 = skipWS cf
val (pub,cf2) = parsePubLit cf1
val cf3 = skipWS cf2
val (sys,cf4) = parseSysLit true cf3
val cf5 = skipWS cf4
in
(SOME(E_MAP(pub,sys)),cf5)
end
fun recover cf =
let
fun do_lit q (c,f) =
if c=0wx00 then (c,f)
else if c=q then getChar f
else do_lit q (getChar f)
fun do_com (c,f) =
case c
of 0wx00 => (c,f)
| 0wx2D => let val (c1,f1) = getChar f
in if c1=0wx2D then getChar f1
else do_com (c1,f1)
end
| _ => do_com (getChar f)
fun doit (c,f) =
case c
of 0wx00 => (c,f)
| 0wx22 => doit (do_lit c (getChar f))
| 0wx27 => doit (do_lit c (getChar f))
| 0wx2D => let val (c1,f1) = getChar f
in if c1=0wx2D then doit (do_com (getChar f1))
else doit (c1,f1)
end
| _ => if isNms c then (c,f)
else doit (getChar f)
in doit cf
end
fun parseEntry (cf as (c,f)) =
let val (kw,cf1) = parseKeyword cf handle NotFound cf => raise SyntaxError cf
in case kw
of KW_BASE => parseBase cf1
| KW_CATALOG => parseExtend cf1
| KW_DELEGATE => parseDelegate cf1
| KW_SYSTEM => parseRemap cf1
| KW_PUBLIC => parseMap cf1
| KW_OTHER _ => skipOther cf1
end
handle SyntaxError cf => (NONE,recover cf)
fun parseDocument cf =
let
fun doit (c,f) =
if c=0wx0 then nil before catCloseFile f
else let val (opt,cf1) = parseEntry (c,f)
val entries = doit cf1
in case opt
of NONE => entries
| SOME entry => entry::entries
end
val cf1 = skipCommWS cf
in
doit cf1
end
fun parseSoCat uri =
let
val f = catOpenFile uri
val cf1 = getChar f
in
(uri,parseDocument cf1)
end
end
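(* Input sketch (illustrative only, not part of the original sources; the
   identifiers and URIs are made-up sample values).  A catalog in the
   keyword syntax accepted by parseSoCat could look like this:

     -- sample catalog --
     BASE     "http://example.org/dtds/"
     PUBLIC   "-//EXAMPLE//DTD Sample//EN"  "sample.dtd"
     SYSTEM   "http://old.example.org/sample.dtd"  "sample.dtd"
     DELEGATE "-//EXAMPLE//"  "example.soc"
     CATALOG  "more.soc"

   Following parseEntry above, this should yield, in order, an E_BASE, an
   E_MAP, an E_REMAP, an E_DELEGATE and an E_EXTEND entry.  Keywords are
   matched in upper case only; any other name is treated as KW_OTHER and
   skipped together with the literals that follow it.  Comments are
   delimited by a pair of hyphens on each side. *)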
