What are the Necessary Considerations Before Deciding to Use the Schema Subset Generation Tool (SSGT)?

Several factors should be weighed before deciding to use SSGT: the end goal, the available alternatives, and both short-term and long-term concerns.

To produce a set of schema subsets, the tool needs the following properties.

  1. The tool must allow a user to search and navigate through the full Justice dictionary. This is necessary because users need to see what is available before they can choose which parts they want to use.
  2. The tool must give users the ability to create schema subsets by adding constraints.
  3. The tool should allow users to create extension and document schemas by making customizations. Note that this functionality is not strictly required: a base tool could be built without it, but it would not be capable of providing the complete set of schemas.
  4. The tool must be able to generate the customized schema subsets from the user input. This requires knowledge of the dictionary, data model, and the rules for creating valid schema subsets.

Standard Commercial Registries

There is no reason to recreate an existing product that could meet our needs. Therefore it is important to look at what a commercial, off-the-shelf, ebXML-compliant registry could offer us. A commercial registry could catalogue the Justice dictionary and store metadata about it, either at a component level or a document level. A commercial registry could also give users some means of searching and retrieving data through a user interface.

These are important and necessary functionalities, but on their own they are not adequate to support the construction of customized schemas. To start with, only one class of registry could be used: a registry with component-level granularity. Any other type of registry would be useless for our purposes. Document-level granularity would mean that the registry could only store and retrieve the dictionary as a full JXDD schema. This gives users no support in accessing and customizing individual components and defeats our purpose. Suppose we then choose a registry that has component-level granularity. It would be able to store the dictionary (a list of elements and types with definitions) piece by piece rather than lumped together in a single document. However, no off-the-shelf registry has any knowledge of the Justice data model that the dictionary is based upon. This data model is very important: the relationships built into it are what give the JXDD its power and flexibility.

Off the shelf, no registry would be able to utilize the JXDD to its full potential. Additionally, the registry would have no mechanism to build the schema subsets or any knowledge of how to do so. It is apparent from the volume of comments received that the need for a customized schema subset generation tool is immediate. Because there is no product right now that is capable of this, it must be built.

This tool should have the capabilities outlined in the requirements section above. The tool should provide a graphical user interface to allow users to search through the dictionary components, add constraints and customizations, and define customized schemas. The schema subset generation tool should take in user input and, from that input, generate a valid set of customized schema subsets, carefully formed to maintain their integrity and interoperability. The tool should then return the set of schemas to the user, who then becomes the owner of those files.
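The search and generation steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the miniature dictionary, its component names, and the `search` and `build_subset` helpers are all hypothetical and are not taken from the JXDD or from the actual tool. The point it shows is why generation needs knowledge of the data model: a selected component is only valid if everything it depends on is carried into the subset with it.

```python
from collections import deque

# Hypothetical miniature dictionary (not real JXDD content):
# each component name maps to the components it requires to be valid.
DICTIONARY = {
    "Person": ["PersonType"],
    "PersonType": ["SuperType"],
    "PersonName": ["PersonNameType"],
    "PersonNameType": ["SuperType"],
    "Vehicle": ["VehicleType"],
    "VehicleType": ["SuperType"],
    "SuperType": [],
}


def search(term):
    """Return the names of dictionary components containing the search term."""
    term = term.lower()
    return sorted(name for name in DICTIONARY if term in name.lower())


def build_subset(selected):
    """Return the selected components plus everything they depend on."""
    subset = set()
    queue = deque(selected)
    while queue:
        name = queue.popleft()
        if name in subset:
            continue  # already included
        if name not in DICTIONARY:
            raise KeyError(f"unknown component: {name}")
        subset.add(name)
        # Pull in required components so the subset stays self-consistent.
        queue.extend(DICTIONARY[name])
    return sorted(subset)


matches = search("name")
subset = build_subset(["PersonName"])
print(matches)
print(subset)
```

A real tool would also have to apply user constraints and the rules for forming valid schema documents; the sketch covers only the dependency-closure idea.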

Future work: Although no off-the-shelf registry product is ready to meet our current needs, it might be possible to modify an existing registry so that it supports the full Justice data model and all of the requirements for building customized schema subsets. To start with, this would involve researching and comparing different registry products and analyzing potential candidates to determine whether making such modifications is feasible. If so, awareness of the Justice data model and the capacity to build schema subsets could then be added. If it is not possible to make the necessary enhancements to a commercial registry, it becomes necessary to build a custom registry to fit the Justice data model.

After a registry is either modified or built, the back end of the schema subset generation tool will need to be changed to communicate with the registry. This allows code maintenance to be performed on the registry side and new versions of the JXDD to be handled automatically rather than forcing tool upgrades.

Will this tool be the only way to create schema subsets?

No. There are other ways this could be done. One step for the schema generation tool will be to translate the user input specifying how to build the customized schemas into an XML request file or wantlist. This will happen in the background, transparent to the user. The wantlist would be sent to the registry. The registry would process the file and then generate and return the customized schema subsets. The format of the request file should be publicly available, so that others can create their own front-ends and still use the registry to produce the actual schemas.
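As a rough illustration of the translation step, the sketch below serializes a set of user selections into a small XML request file using Python's standard library. The element and attribute names (`WantList`, `Element`, `Type`, `name`) and the `build_wantlist` helper are invented for this example and do not reproduce any actual request format; the component names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def build_wantlist(selections):
    """Serialize (kind, name) selections into a hypothetical XML request."""
    root = ET.Element("WantList")  # assumed root element name
    for kind, name in sorted(selections):
        # One child per selected component; "kind" is the assumed tag name.
        ET.SubElement(root, kind, {"name": name})
    return ET.tostring(root, encoding="unicode")


# Hypothetical selections a front-end might collect from a user.
selections = {("Element", "PersonName"), ("Type", "PersonType")}
wantlist_xml = build_wantlist(selections)
print(wantlist_xml)
```

Because the request is plain XML, any front-end that can emit the published format could hand the same file to the registry, which is the reason for making the format public.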

Another way to generate schema subsets would be to create and distribute a library that provides the same functionality as the registry tool. A third way would be for users to go through the set of full schemas by hand, making restrictions and creating extension and document schemas. Another might be through the use of Extensible Stylesheet Language (XSL). There are probably many different ways that this work could be done. The benefit of using a JXDD schema subset generation tool is that if a user specifies valid input, an appropriately and consistently formed set of customized schema subsets will be returned. Without a thorough understanding of the Justice data model, it could be very easy to unintentionally break conformance.

SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd