Title: NCHELP EEAT Image Transport Proposal DRAFT
NCHELP EEAT Image Transport Proposal (DRAFT)
- Electronic Exchange Advisory Team
- Austin, Texas - September 2003
General Characteristics
- The NCHELP/DACAT standard for document transport should allow for the transmission of single documents and batches of documents. The sender has the option of sending a single document per transmission or a batch of documents, subject to the restrictions listed below.
- The receiving party should be capable of handling both types of transmissions. Single-document transmissions may ease problem diagnosis and recovery. Batches simplify tracking when large numbers of documents are being sent.
- The sender should strongly consider batching transmissions if they send more than X (50?) documents per day to a specific partner.
- If documents are batched, the total size of the batch should be appropriate to the type of transport used (email, FTP, HPC).
- Transport
  - EEAT has designated IMG as the three-letter prefix for an image transmission. The 01 refers to the version of the image transport packaging specification, not to the descriptive name of the document contained within the transport system.
- File Types
  - File Type - IMG01 DOCUMENT
  - Description - EEAT Image Transport v.1 document transmission
Subject Line Examples
- Example File: Image transmission in EEAT Image Transport v.1 format
- Encryption: PGP
- Organization: Guarantor
- ED Number: 444
- Unique Message ID: Date (August 6, 2003) and a sequence number (001)
- Subject Line: IMG01 DOCUMENT 02 <DOE44420030806001>

FTP File Name Example
- Example File: Image transmission in EEAT Image Transport v.1 format
- Encryption: PGP
- Organization: Guarantor
- ED Number: 444
- Unique Message ID: Date (August 6, 2003) and a sequence number (001)
- Subject Line: IMG01_DOCUMENT-02-DOE444-20030806001
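
Both examples follow the same pattern, so the naming can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the DOE prefix and the field order are read off the two examples rather than taken from a finalized rule, and the 02 field is passed through as an opaque code because the slides do not define it.

```python
from datetime import date


def id_parts(ed_number: str, sent: date, sequence: int):
    """Split the unique message ID into its two visible halves.

    Assumption: "DOE" + ED number, then YYYYMMDD + a 3-digit sequence.
    """
    return f"DOE{ed_number}", f"{sent:%Y%m%d}{sequence:03d}"


def email_subject(code: str, ed_number: str, sent: date, sequence: int) -> str:
    """Subject line form: the ID is unbroken and wrapped in angle brackets."""
    org, stamp = id_parts(ed_number, sent, sequence)
    return f"IMG01 DOCUMENT {code} <{org}{stamp}>"


def ftp_file_name(code: str, ed_number: str, sent: date, sequence: int) -> str:
    """FTP form: underscores and hyphens replace spaces and brackets."""
    org, stamp = id_parts(ed_number, sent, sequence)
    return f"IMG01_DOCUMENT-{code}-{org}-{stamp}"
```

Calling either function with code "02", ED number "444", August 6, 2003 and sequence 1 reproduces the strings shown in the examples.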
TIFF Images

What is TIFF?
- The acronym means Tagged Image File Format. It is the most popular scanning format in document management. TIFF images can be multi-page or single-page. TIFF compresses black & white (binary mode) images particularly well, which keeps storage on disk small. TIFF is intended to be independent of specific operating systems, filing systems, compilers, and processors. TIFF is now supported by hundreds of computer companies and independent software vendors.

Why use TIFF?
- TIFF files provide a unique advantage due to the way they are built. They are built using "tags". These tags tell reader programs things like the resolution of the image, the width and length of the image, and so on. A program that does not need a given tag, such as the name of the page, may ignore it. In this way, additional tags can be added to newer TIFF files without compromising backward compatibility for older reader programs. The older program will simply ignore any new tag it doesn't know about.
- Backward compatibility is important because in many cases the archived images must remain accessible for many decades, so adhering to standards becomes crucial. Many generations of new computer systems and programs will be used during the life of an image, and they will all need to read it. By using TIFF files, such a migration strategy becomes possible.

Recommendation
- The recommendation is to use single-page black & white TIFF images with CCITT Group 4 (Fax4) compression. This format (also known as TIFF G4) is ideally suited for bi-tonal text documents because it provides a high level of detail combined with a small file size. It is also the format of choice for most fax software.
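
The tag structure described here can be seen directly in the file bytes. The following is a minimal sketch, not a full TIFF reader: it walks the image file directories (one per page) of a classic TIFF file, decodes only single SHORT and LONG tag values, and skips everything else, much as an older reader skips tags it does not know. The function names are hypothetical; the tag number 259 (Compression) and the value 4 (CCITT Group 4) come from the TIFF 6.0 specification. A production check should also guard against malformed offsets.

```python
import struct

COMPRESSION_TAG = 259   # TIFF "Compression" tag
CCITT_GROUP4 = 4        # Compression value for CCITT Group 4 (T.6)


def read_pages(data: bytes) -> list:
    """Return one {tag: value} dict per page (IFD) of a classic TIFF file."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None or struct.unpack(order + "H", data[2:4])[0] != 42:
        raise ValueError("not a TIFF file")
    pages = []
    (offset,) = struct.unpack(order + "I", data[4:8])
    while offset:
        (count,) = struct.unpack_from(order + "H", data, offset)
        tags = {}
        for i in range(count):
            entry = offset + 2 + 12 * i          # each entry is 12 bytes
            tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
            if n == 1 and ftype == 3:            # one SHORT, stored inline
                tags[tag] = struct.unpack_from(order + "H", data, entry + 8)[0]
            elif n == 1 and ftype == 4:          # one LONG, stored inline
                tags[tag] = struct.unpack_from(order + "I", data, entry + 8)[0]
            # any other tag is ignored, as a tolerant reader would do
        pages.append(tags)
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return pages


def is_single_page_g4(data: bytes) -> bool:
    """True when the file has exactly one page and it uses Group 4."""
    pages = read_pages(data)
    return len(pages) == 1 and pages[0].get(COMPRESSION_TAG) == CCITT_GROUP4
```

A sender could run a check like `is_single_page_g4(open(path, "rb").read())` on each image before packaging it.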
Transmission File Overview

The transmission file consists of a manifest applicable only to that file, followed by one or more images. The manifest is a text-based document formatted as XML. An imaged document in this discussion represents a paper document which, when scanned, consists of one or more images, with a one-to-one correspondence between a physical page and an image.

The images themselves must be stored for transport in TIFF with Fax4 compression. This does not designate the storage format on backend systems; however, if proprietary systems do not utilize this image storage format, they are responsible for the relatively trivial task of converting images from their system's format to TIFF with Fax4 compression.

The manifest and images are packaged into a single zip file, containing no directory structure, for transport. This implies that all images are given unique names. These unique names must be tracked as the images are stored, because those unique names are recorded in the manifest.

At this point, ImageManifest.xml has been established as the physical filename for the manifest. Naming standards for the images or the zip file itself have not been established. It may be possible to establish some sort of standard for the image names. There may be a finite set of documents that will be transported in this particular application, and this lends itself to meaningful filenames for the corresponding images.
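
The packaging rule (one flat zip file holding ImageManifest.xml plus uniquely named images) can be sketched with Python's standard zipfile module. The function name is hypothetical, and treating names as case-insensitive when checking uniqueness is an assumption made here for safety on case-insensitive file systems, not something the proposal states.

```python
import io
import zipfile

MANIFEST_NAME = "ImageManifest.xml"   # the one filename fixed so far


def build_package(manifest_xml: bytes, images: dict) -> bytes:
    """Zip a manifest and its images with no directory structure.

    `images` maps each image file name to its TIFF bytes.
    """
    seen = {MANIFEST_NAME.lower()}
    for name in images:
        if "/" in name or "\\" in name:
            raise ValueError(f"{name!r} contains a directory component")
        if name.lower() in seen:
            raise ValueError(f"{name!r} is not unique within the package")
        seen.add(name.lower())

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(MANIFEST_NAME, manifest_xml)   # manifest goes first
        for name, content in images.items():
            package.writestr(name, content)
    return buffer.getvalue()
```

Writing the manifest first matches the description of a manifest "followed by" the images, although zip readers look entries up by name regardless of order.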
The Generic Transmission Manifest

The manifest design utilizes a feature of the World Wide Web Consortium's (W3C) XML Schema language that gives its users considerable flexibility. The layout described by the manifest's schema consists of a generic, static portion (<TransmissionData>, see below) describing information applicable to the transmission package, the sender, and the receiver. In addition, the layout provides a location at which any XML structure can be plugged into the manifest XML document. This allows use of a single generic manifest document for multiple applications. This is the same model utilized by CommonRecord/CommonLine to provide the functionality of the flat file unique areas and the @2 unique record.

    <TransmissionData>
      <DocumentID>"2001-12-17T09304700"</DocumentID>
      <CreatedDateTime>2001-12-17T09:30:47.00</CreatedDateTime>
      <DocumentTypeCode>ImageManifest</DocumentTypeCode>
      <Source>
        <NCHELPID>830694</NCHELPID>
        <NonEDBranchID>1232</NonEDBranchID>
        <EntityName>CitiBank</EntityName>
      </Source>
      <Destination>
        <OPEID>748</OPEID>
        <EntityName>TGSLC</EntityName>
      </Destination>
    </TransmissionData>
    <GeneralManifest ManifestNS="DACAT">

The <GeneralManifest ManifestNS="DACAT"> tag marks the start of the variable portion of the manifest. The ManifestNS attribute may be used by software to identify the namespace that applies to the variable portion of the manifest, which in this case identifies the DACAT image manifest.
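
As a rough illustration of how receiving software might use the generic portion, the sketch below reads the sender, the receiver, and the ManifestNS value with Python's standard ElementTree parser. It assumes a simplified instance: the root element name is borrowed from the full sample later in the deck, namespaces are left out, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<ImageManifestPackage>
  <TransmissionData>
    <DocumentTypeCode>ImageManifest</DocumentTypeCode>
    <Source>
      <NCHELPID>830694</NCHELPID>
      <EntityName>CitiBank</EntityName>
    </Source>
    <Destination>
      <OPEID>748</OPEID>
      <EntityName>TGSLC</EntityName>
    </Destination>
  </TransmissionData>
  <GeneralManifest ManifestNS="DACAT"/>
</ImageManifestPackage>
"""


def route(manifest_xml: str) -> dict:
    """Pull out who sent the package, who receives it, and which
    application the variable portion belongs to."""
    root = ET.fromstring(manifest_xml)
    data = root.find("TransmissionData")
    return {
        "sender": data.findtext("Source/EntityName"),
        "receiver": data.findtext("Destination/EntityName"),
        "application": root.find("GeneralManifest").get("ManifestNS"),
    }
```

Software handling several applications could dispatch on the "application" value and leave the variable portion to an application-specific handler.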
To provide for any application's transfer needs, all that is required is a schema describing the contents of, and using a namespace unique to, that particular application's data transmission. Any XML document using that particular format must reference both the generic manifest schema and that application's schema.

Note that an additional value is needed in the schemaLocation attribute of the instance document for this type of architecture, where we have a main (generic) schema and a schema associated with whatever content is in the variable portion (in this case, DACAT). The instance document needs to specify the namespace qualifier on the starting and ending tags of the variable portion, in addition to the starting and ending tags for the main or root document element.

The generic manifest document schema, and that of any application's transmission manifest, are written to the standards established by the Postsecondary Electronic Standards Council (PESC), which are utilized by the Department of Education and NCHELP. Existing data definitions from the PESC Core and Sector schemas are used wherever possible. When the schemas utilized by CommonRecord/CommonLine are released for production use, the Manifest schema family may be upgraded to reference data definitions in the NCHELP schemas directly, easing maintenance burdens.

Based on PESC recommendations, the schema for the generic portion of the manifest declares a namespace for itself, and uses that namespace as its target namespace. The schema defining the structure of elements in the application-specific area must also declare a namespace for itself, and use that namespace as its target namespace. This not only allows the inclusion of data element definitions from outside the manifest's definitions (such as elements defined in PESC's Core schema, or the Financial Aid sector schema), but also keeps the usage of namespace qualifiers in the instance documents (the actual manifests) to a minimum.
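
The qualification pattern described here (a qualifier on the root element and on the outermost element of the variable portion, with everything else unqualified) can be sketched as follows. The namespace URIs are placeholders invented for this example, since the proposal has not published real ones, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs; the prefixes used for lookup need not match the document's.
NS = {"im": "urn:example:ImageManifest", "d": "urn:example:DACAT"}

INSTANCE = """\
<im:ImageManifestPackage xmlns:im="urn:example:ImageManifest"
                         xmlns:DACAT="urn:example:DACAT">
  <GeneralManifest ManifestNS="DACAT">
    <DACAT:DACATManifest>
      <ClaimSubmittal/>
    </DACAT:DACATManifest>
  </GeneralManifest>
</im:ImageManifestPackage>
"""


def describe(instance_xml: str):
    """Return the expanded names of the root, the variable portion's
    outermost element, and that element's children."""
    root = ET.fromstring(instance_xml)
    variable = root.find("GeneralManifest/d:DACATManifest", NS)
    return root.tag, variable.tag, [child.tag for child in variable]
```

ElementTree reports qualified names in `{uri}local` form, so only the two qualified elements carry a URI while the inner elements keep their bare names.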
The DACAT Manifest
- The hierarchy of data elements in the DACAT manifest provides an organizational envelope for the images themselves. This organization puts the images in a business context that can be interpreted visually by an individual or programmatically via software.
- The first, highest-level tag appearing in the DACAT manifest is <DACATManifest>, which signals the start of the manifest.
- At the next highest level in the hierarchy there is a tag whose name designates the business context of the images immediately following, essentially providing an answer to the question "why are these here?" At present the following business contexts have been identified:
  - Claim Submittal (tagname <ClaimSubmittal>)
  - Bankruptcy Transfer (tagname <BankruptcyTransfer>)
  - Conditional Disability (tagname <ConditionalDisability>)
  - Subrogation (tagname <Subrogation>)
- We suggest the DACAT group review the list of business contexts to determine their comprehensiveness and accuracy.
Using both the generic and DACAT-specific tags, let's build a sample ClaimSubmittal manifest. To keep the illustration brief as more elements are added to the sample, some of the ending tags will be omitted. At this point, this sample manifest consists of the following:

    <TransmissionData>
      <DocumentID>"2001-12-17T09304700"</DocumentID>
      <CreatedDateTime>2001-12-17T09:30:47.00</CreatedDateTime>
      <DocumentTypeCode>ImageManifest</DocumentTypeCode>
      <Source>
        <NCHELPID>830694</NCHELPID>
        <NonEDBranchID>1232</NonEDBranchID>
        <EntityName>CitiBank</EntityName>
      </Source>
      <Destination>
        <OPEID>748</OPEID>
        <EntityName>TGSLC</EntityName>
      </Destination>
    </TransmissionData>
    <GeneralManifest ManifestNS="DACAT">
      <DACATManifest>
        <ClaimSubmittal>
- The tags at the next lower level identify the borrower via an SSN, SSN Sequence Number, and Claim ID. SSN always appears at this level; SSN Sequence Number and Claim ID appear depending on the business context. Following the SSN and SSN Sequence Number, but at the same level in the hierarchy, there is at least one tag that designates the type of imaged document associated with that borrower and business context. At present the following imaged document types have been identified:
  - Master Promissory Note (tagname <MPN>)
  - Forbearance Request (tagname <ForbearanceRequest>)
- We suggest the DACAT group review the list of imaged document types to determine their comprehensiveness and accuracy.
Since this is a Claim Submittal, let's add the SSN and ClaimID to the sample DACAT manifest. For the purposes of this sample, assume all that is being transmitted is an MPN. The manifest has now grown to include:

    <TransmissionData>
      <DocumentID>"2001-12-17T09304700"</DocumentID>
      <CreatedDateTime>2001-12-17T09:30:47.00</CreatedDateTime>
      <DocumentTypeCode>ImageManifest</DocumentTypeCode>
      <Source>
        <NCHELPID>830694</NCHELPID>
        <NonEDBranchID>1232</NonEDBranchID>
        <EntityName>CitiBank</EntityName>
      </Source>
      <Destination>
        <OPEID>748</OPEID>
        <EntityName>TGSLC</EntityName>
      </Destination>
    </TransmissionData>
    <GeneralManifest ManifestNS="DACAT">
      <DACATManifest>
        <ClaimSubmittal>
          <SSN>389451234</SSN>
          <ClaimID>11223344551</ClaimID>
          <MPN>
- The next lower level in the hierarchy identifies each image (or page) that makes up the document type named in the tag at the level immediately above. Each page has three lower-level tags associated with it:
  - The unique filename of the image associated with that page. This is a required item.
  - A free-form text description of the image associated with that page.
  - Free-form notes regarding the image associated with that page.
- Now let's add page information to the sample DACAT manifest. Assume there are two images that make up the MPN. All remaining tags are added to make the manifest complete.
- Note: An additional example manifest is available as a separate file.
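
The hierarchy walked through so far (business context, then borrower identifiers, then document type, then pages) can also be produced programmatically. The sketch below uses Python's standard ElementTree module; the function name and its parameters are hypothetical, it covers only the Claim Submittal context, and it leaves out the generic portion and any namespace handling.

```python
import xml.etree.ElementTree as ET


def claim_submittal(ssn: str, claim_id: str, doc_type: str, pages: list):
    """Build a <DACATManifest> element for one borrower and one document.

    `pages` is a list of (file_name, description, notes) tuples in page
    order; page numbers are assigned starting at 1.
    """
    manifest = ET.Element("DACATManifest")
    context = ET.SubElement(manifest, "ClaimSubmittal")
    ET.SubElement(context, "SSN").text = ssn
    ET.SubElement(context, "ClaimID").text = claim_id
    document = ET.SubElement(context, doc_type)        # e.g. "MPN"
    for number, (file_name, description, notes) in enumerate(pages, start=1):
        page = ET.SubElement(document, "Page", Number=str(number))
        ET.SubElement(page, "FileName").text = file_name
        ET.SubElement(page, "Description").text = description
        ET.SubElement(page, "Notes").text = notes
    return manifest
```

For example, `ET.tostring(claim_submittal("389451234", "11223344551", "MPN", [("MPNpage1.tiff", "MPN Cover Page", "n/a")]), encoding="unicode")` yields the DACAT-specific part of a one-page manifest as text.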
    <?xml version="1.0"?>
    <ImageManifestPackage
        xmlns:ImageManifest="C:/XML_Work_Area/My_Tests/ImageManifest"
        xmlns:DACAT="C:/XML_Work_Area/My_Tests/DACATNS"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        DocumentProcessCode="TEST"
        xsi:schemaLocation="C:/XML_Work_Area/My_Tests/ImageManifest
                            file:///C:/XML_Work_Area/My_Tests/ImageManifest/Manifest.xsd
                            C:/XML_Work_Area/My_Tests/DACATNS
                            file:///C:/XML_Work_Area/My_Tests/DACATNS/DACATImages.xsd">
      <TransmissionData>
        <DocumentID>"2001-12-17T09304700"</DocumentID>
        <CreatedDateTime>2001-12-17T09:30:47.00</CreatedDateTime>
        <DocumentTypeCode>ImageManifest</DocumentTypeCode>
        <Source>
          <NCHELPID>830694</NCHELPID>
          <NonEDBranchID>1232</NonEDBranchID>
          <EntityName>CitiBank</EntityName>
        </Source>
        <Destination>
          <OPEID>748</OPEID>
          <EntityName>TGSLC</EntityName>
        </Destination>
      </TransmissionData>
      <GeneralManifest ManifestNS="DACAT">
        <DACATManifest>
          <ClaimSubmittal>
            <SSN>389451234</SSN>
            <ClaimID>11223344551</ClaimID>
            <MPN>
              <Page Number="1">
                <FileName>MPNpage1.tiff</FileName>
                <Description>MPN Cover Page</Description>
                <Notes>Blahblahblah</Notes>
              </Page>
              <Page Number="2">
                <FileName>MPNpage2.tiff</FileName>
                <Description>FirstRealPage</Description>
                <Notes>Don't call us</Notes>
              </Page>
            </MPN>
          </ClaimSubmittal>
        </DACATManifest>
      </GeneralManifest>
    </ImageManifestPackage>
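
Because the image names recorded in the manifest must match the files in the zip package, a receiver may want to compare the two before processing. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the package's file names are supplied by the caller (for example from `zipfile.ZipFile.namelist()`), and elements are matched on their local name so the check works whether or not the tags carry a namespace.

```python
import xml.etree.ElementTree as ET


def check_package(manifest_xml: str, package_names: list):
    """Compare <FileName> entries against the names found in the zip file.

    Returns (missing, unlisted): names the manifest lists but the package
    lacks, and names in the package that the manifest never mentions.
    """
    root = ET.fromstring(manifest_xml)
    listed = {
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "FileName" and el.text
    }
    present = set(package_names) - {"ImageManifest.xml"}
    return sorted(listed - present), sorted(present - listed)
```

Two empty lists mean the manifest and the package agree; anything else is a reason to reject or query the transmission.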
Data Requirements/Cardinality: Generic Portion of the Manifest

Element                           Occurs            Use       Type and constraints
DocumentProcessCode (attribute)   -                 Required  String; values are TEST, PRODUCTION
TransmissionData block            One and Only One  Required
  DocumentID                      One and Only One  Required  String; minLength 22, maxLength 34
  CreatedDateTime                 One and Only One  Required  W3C Date/Time format
  DocumentTypeCode                One and Only One  Required  String; value is Manifest
Element                           Occurs            Use       Type and constraints
Source block                      One and Only One  Required
  EntityID                        One and Only One  Required  Values are OPEID or NCHELPID; String; minLength 5, maxLength 8
  NonEdBranchID                   One and Only One  Optional  String; minLength 0, maxLength 4
  EntityName                      One and Only One  Optional  String; minLength 1, maxLength 96
Element                           Occurs            Use       Type and constraints
Destination block                 One and Only One  Required
  EntityID                        One and Only One  Required  Values are OPEID or NCHELPID; String; minLength 5, maxLength 8
  NonEdBranchID                   One and Only One  Optional  String; minLength 0, maxLength 4
  EntityName                      One and Only One  Optional  String; minLength 1, maxLength 96
Data Requirements/Cardinality: DACAT-specific Portion of the Manifest

Element                           Occurs            Use       Type and constraints
Manifest block                    One and Only One  Required
Business Context block            One or More       Required
  SSN                             One and Only One  Required  String; minLength 9, maxLength 9
  SSN Sequence Number             One and Only One  Optional  Integer; minLength 1, maxLength 3
  ClaimID                         One and Only One  Optional  String; minLength 11, maxLength 11
Data Requirements/Cardinality: DACAT-specific Portion of the Manifest

Element                           Occurs            Use       Type and constraints
Document Type                     One or More       Required
Page block                        One or More       Required
  Number (attribute of Page)      -                 Required  Integer; MinInclusiveValue 1, MaxInclusiveValue 999
  FileName                        One and Only One  Required  String; minLength 7, maxLength 20
  Description                     One and Only One  Required  String; minLength 1, maxLength 256
  Notes                           One and Only One  Required  String; minLength 1, maxLength 256
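
Until a schema exists, the length and range limits listed in these requirements can be checked by hand. The sketch below covers only the DACAT-specific string lengths and the page number range; the function names are hypothetical, and a real implementation would more likely rely on schema validation once the schema is published.

```python
# (minimum length, maximum length) for string fields, copied from the
# data requirements for the DACAT-specific portion of the manifest.
LENGTHS = {
    "SSN": (9, 9),
    "ClaimID": (11, 11),
    "FileName": (7, 20),
    "Description": (1, 256),
    "Notes": (1, 256),
}


def length_errors(field: str, value: str) -> list:
    """Return a one-item list describing a length violation, or []."""
    low, high = LENGTHS[field]
    if low <= len(value) <= high:
        return []
    return [f"{field}: length {len(value)} is outside {low}..{high}"]


def page_errors(number: int, file_name: str, description: str, notes: str) -> list:
    """Check one Page block against the limits in the requirements."""
    errors = []
    if not 1 <= number <= 999:
        errors.append(f"Page Number {number} is outside 1..999")
    errors += length_errors("FileName", file_name)
    errors += length_errors("Description", description)
    errors += length_errors("Notes", notes)
    return errors
```

An empty list means the page passes these particular checks; cardinality rules such as "One or More" pages per document would need a separate pass over the whole manifest.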
Manifest Layout Status
- A schema will be developed when the layout (including all values for tagnames) has been finalized.
- At present the only known outstanding questions are:
  - 1) Is the list of business contexts for transmissions complete? At present the list consists of:
    - Claim Submittal
    - Bankruptcy Transfer
    - Conditional Disability
    - Subrogation
  - 2) Is the list of documents transmitted in image format complete?
    - Master Promissory Note
    - Forbearance Request
  - 3) At present the maximum number of pages in a given document is set to 999.