Dec 31 2009

X(ML) Marks the Spot

A data-sharing model is already simplifying the exchange of law enforcement files, but the ultimate goal is to ease such exchanges governmentwide.

Photo: Forrest MacCormack
DHS' Michael Daconta views the just-revised Data Reference Model as a launch pad for building out a data-sharing model for agencies across government.

So just how hard is it to set up a data-sharing model with the flexibility for use across government?

It's extremely hard, acknowledge Justice Department CIO Vance Hitch, Homeland Security Department Data Architect Michael Daconta and Interior Department eXtensible Markup Language Strategist Owen Ambur.

But that doesn't mean it's not doable. Hitch, Daconta and Ambur contend the foundation laid by Justice and DHS, coupled with a newly revised federal Data Reference Model (DRM), will eventually make governmentwide data sharing a reality—with "eventually" equating to several years in the future.

Agencies can move things along by taking stock of data they already share or plan to share within a year and applying the new version of the DRM, says Daconta, who has led work on fleshing out the DRM for the Federal Enterprise Architecture. The new model, released last month, will aid agencies in defining their data assets and how they are shared within their own enterprise architectures as well as the FEA, he says.

Daconta believes the signs are positive "that we are developing an active community of data-savvy practitioners to shape these initiatives and explain their benefits to agencies."

Making the leap to governmentwide data sharing depends on three things, according to Ambur and Daconta. Agencies must adopt the DRM. Next, Justice and DHS must build on the standards efforts begun at Justice to improve sharing of law enforcement information, creating a series of success stories and sources of lessons learned. Finally, agencies must divest themselves of antiquated systems with limited sharing potential and shift to applications built around XML standards.

By populating the DRM, agencies will be able to set data-sharing and data-element standards to let the government expand the National Information Exchange Model (NIEM) beyond its current sharing platform for Justice and DHS, says Ambur, co-chairman of the CIO Council's XML working group.

All the e-government and the Lines of Business projects "are candidates for early population of the DRM for incorporation into the NIEM," he says. "That could be turned around to ask how long agencies will be allowed to continue using stovepipe IT systems that cannot readily share data."

Photo: Drake Sorey
The Justice and Homeland Security information-sharing model ultimately "could apply to drug law enforcement, gang violence, bomb technology, even health care—for example, the spread of West Nile virus," Justice's Vance Hitch says.

Several NIEM pilots are now under way within DHS, Daconta says, at the bureaus of Immigration and Customs Enforcement, Citizenship and Immigration Services, and Customs and Border Protection. Among early DRM implementers, he cites the Interior Department as the most successful so far.

NIEM relies on the Global Justice Extensible Markup Language Data Model (GJXDM) to enforce information commonality. GJXDM "standardizes all the data names used for years in law enforcement," Hitch says. "There's quite a repository of definitions."

The Goods

GJXDM 3.0.2, the newest version, has a data model, a data dictionary and an XML schema flexible enough to serve public-safety agencies as well as prosecutors, public defenders and courts. It incorporates 16,000 elements from 35 federal, state and local data dictionaries and more than 2,700 reusable components. Justice and DHS believe the common semantics will vastly increase access to and reuse of law enforcement data from system to system.

The members of the Joint Task Force on Rap Sheet Standardization agree. As the states and FBI adjust to GJXDM record-keeping, their criminal history records will become more easily understood, more complete and more accurate, the task force announced this summer. An authorized user who requests an interstate criminal history record:

•will always receive the same set of information;

•will always get a single record for multisource histories, with criminal justice event cycles appearing in date order;

•will receive a computer-readable format on request, to fill data entry screens or databases, or to edit in state-specific presentation formats;

•can ask for a record to be delivered to an approved destination regardless of whether it is served by an intrastate law enforcement network.
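The single-record behavior described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`cycle_date`, `event`, `source`) and the sample data are hypothetical and are not taken from the task force's rap sheet format or from GJXDM.

```python
from datetime import date

def merge_history(sources):
    """Combine event cycles from several sources into one date-ordered list.

    `sources` maps a source name to a list of cycle dictionaries.
    All field names here are hypothetical, chosen for illustration only.
    """
    merged = []
    for source_name, cycles in sources.items():
        for cycle in cycles:
            # Tag each cycle with the system it came from.
            merged.append({**cycle, "source": source_name})
    # Sort by date; the source name breaks ties so the order is deterministic.
    return sorted(merged, key=lambda c: (c["cycle_date"], c["source"]))

# Hypothetical sample data from two contributing systems.
state_a = [{"cycle_date": date(2003, 5, 1), "event": "arrest"}]
state_b = [
    {"cycle_date": date(2001, 2, 14), "event": "arrest"},
    {"cycle_date": date(2004, 9, 30), "event": "disposition"},
]

record = merge_history({"State A": state_a, "State B": state_b})
```

The requester sees one list, with cycles from both systems interleaved by date, rather than one response per contributing source.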

Justice has funded the effort so far, Hitch says, "and DHS will help. The yearly cost is not huge, but it's significant. Somebody has to act as the arbiter and communicate with the users."

Besides the two departments, he says, "other agencies have shown quite a lot of interest, including the Transportation Department, the intelligence community, the National Association of State CIOs, and the National Institute of Standards and Technology. We're developing a standard here."

Justice has contracted for ongoing technical development from Georgia Tech Research Institute of Atlanta. The institute has announced several new outputs for GJXDM users:

•a Justice Information Exchange Model tool (JIEM);

•a component mapping template, an Excel spreadsheet that automatically maps local data elements to GJXDM elements;

•a schema subset generation tool, which will produce documentation for information exchange packages, along with data-mapping functions. The Web application lets users browse GJXDM and create schema subsets and information want lists. The institute is building a repository where authorized users can submit their want lists.
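To give a sense of what mapping local data elements to standard ones involves, here is a minimal Python sketch. It is not the institute's spreadsheet or Web tool; the local field names and the standard element names are hypothetical placeholders, not actual GJXDM elements.

```python
# Hypothetical mapping table: local field name -> standard element name.
# The names on the right are placeholders, not real GJXDM element names.
LOCAL_TO_STANDARD = {
    "lname": "SubjectSurname",
    "fname": "SubjectGivenName",
    "dob": "SubjectBirthDate",
}

def map_record(local_record, mapping):
    """Rename mapped fields and report the fields that have no mapping."""
    mapped = {}
    unmapped = []
    for field, value in local_record.items():
        if field in mapping:
            mapped[mapping[field]] = value
        else:
            # Leave unknown fields for an analyst to review.
            unmapped.append(field)
    return mapped, sorted(unmapped)

# Hypothetical local record with one field the table does not cover.
local = {"lname": "Doe", "fname": "Jane", "dob": "1970-01-01", "badge_no": "417"}
mapped, unmapped = map_record(local, LOCAL_TO_STANDARD)
```

Reporting the unmapped fields separately, rather than dropping them, mirrors the idea of a want list: elements a site needs that the shared model does not yet cover.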

The JIEM Reference Model, a site-specific database, defines most of the types of exchanges among jurisdictions. Each user site can copy its local database from another site database or use the Site Database Builder tool to build one from the JIEM Reference Model. Institute analysts estimate that this will cut the work involved in creating a site database by about 75 percent.

Alaska used JIEM to define its exchanges and plot a new work process, then applied GJXDM and an XML middleware product to instantly eliminate a six-month backlog of 12,000 citations awaiting default judgment in its court, according to the National Consortium for Justice Information and Statistics. Previously, the court and the state police could not exchange their records electronically, and the citation data was being re-entered in three separate systems. The new system allows near-immediate processing of the default judgments, the consortium notes.

Considering the possible results from applying JIEM, the broader NIEM—also based on GJXDM—is "a strategic opportunity we can't afford to let slip by. The information-sharing goals are very specific," says Daconta, who is the director of DHS' Enterprise Data Management Office.

States and localities are "aggressively adopting GJXDM" to streamline hundreds of types of structured cross-jurisdictional exchanges such as rap sheets, he points out. "It's a positive interface between the federal government and the states." Millions of law enforcement messages are already being exchanged, he says, and standardization will boost that volume.

The chief reasons to push ahead now, Daconta says, are cost savings and improved data sharing, first among law enforcement organizations and, in the long run, across federal agencies generally. Minnesota's work on a justice network backbone illustrates the potential for both. The state's Public Safety Department estimates that it has already reaped savings of nearly $2 million from its use of GJXDM for the statewide criminal justice network. The savings, realized over five years, come from not having to develop and roll out a statewide standard and not having to build interfaces to national systems.

Dollars and Sense

Like Hitch, Daconta says the cost currently is not excessive. He describes the expenditures so far as "seed money from CIO shops and in-kind contributions from various jurisdictions and vendors" while the government sets up a formal budget process for data sharing. The pass-the-hat approach to fledgling IT programs, especially interagency efforts, is common in government. But Hitch and Daconta agree that promulgating NIEM will at some point require its own budget and funding mechanism.

GJXDM incorporates more than 2,700 components—facts about people, activities, property, locations, contacts, organizations and accompanying metadata. To understand the precision required to standardize the data entry for global reuse, take a look at these examples of the GJXDM amendments for XML 3.01, released in June by the Joint Task Force on Rap Sheet Standardization:

•schema enumerations for HairColorType "Blonde or Strawberry" changed to "Blonde Or Strawberry" (capitalized "or");

•schema enumerations for MaritalStatusType "Never Married," originally entered with extra spaces between the words, changed to "Never Married" (removed extra spaces between words);

•schema enumerations for RaceType "Uknown" corrected to "Unknown" (fixed typo).

Paul Embley, chairman of the federally funded Global XML Structure Task Force, says the users—ranging from sheriffs and police sergeants to judges and intelligence analysts—do need training for such a large and complex undertaking.

A more uniform schema makes more accurate cross-agency searches possible. The current unstructured data-entry environment often leads to failed searches because one officer might input a suspect's hair color as blonde while another officer enters it as strawberry blonde, for instance.
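The failed-search problem can be illustrated with a short Python sketch. It assumes free-text entries are normalized to a single enumerated value before matching. "Blonde Or Strawberry" is the HairColorType value quoted above; the free-text spellings, field names and sample records are hypothetical.

```python
# Hypothetical free-text spellings mapped to one enumerated value.
# Only "Blonde Or Strawberry" comes from the amendments quoted above.
HAIR_COLOR_SYNONYMS = {
    "blonde": "Blonde Or Strawberry",
    "blond": "Blonde Or Strawberry",
    "strawberry": "Blonde Or Strawberry",
    "strawberry blonde": "Blonde Or Strawberry",
}

def normalize_hair_color(text):
    """Return the enumerated value for a free-text entry, or None if unknown."""
    # Lowercase and collapse repeated spaces before the lookup.
    key = " ".join(text.lower().split())
    return HAIR_COLOR_SYNONYMS.get(key)

def search_by_hair_color(records, hair_color):
    """Match records on the normalized value instead of the raw text."""
    wanted = normalize_hair_color(hair_color)
    if wanted is None:
        return []
    return [r for r in records if normalize_hair_color(r["hair"]) == wanted]

# Hypothetical records entered by different officers.
records = [
    {"id": 1, "hair": "Blonde"},
    {"id": 2, "hair": "strawberry  blonde"},
    {"id": 3, "hair": "brown"},
]
matches = search_by_hair_color(records, "blonde")
```

A raw text comparison on "blonde" would miss record 2; comparing normalized values returns both records 1 and 2.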

"We've seen a great push for criminal justice automation," Embley says. "Police departments use different records management systems, and prosecutors and courts have their own systems. GJXDM shares information behind the scenes, so the police sergeant doesn't have to know XML." A query from Maryland police about an Arizona driver's license will return data in a format familiar to the Maryland users, he says.

One Step at a Time

For now, the Office of Management and Budget is not mandating use of a particular approach to data sharing, but it requires agencies to apply the DRM and move toward common data standards.

Ambur does not expect quick results. "Realistically, it will take a number of years to achieve consensus on the bulk of the elements for a truly national information-sharing model," he says. "But agencies that are able to make rapid progress toward that end should be recognized and rewarded for their efforts."

Daconta cautions, "The history of standardization efforts is littered with false starts. There are always competing efforts and varying opinions, and consensus is difficult." The next few months, however, "present an opportunity to solve long-standing systemic information-sharing problems," he says.

Why? Because the revised DRM will let agencies make a significant advance toward that goal, Daconta says. After receiving the latest DRM 1.5 update in October, OMB must deliver proposed agency guidance to Congress by mid-December. That is the official deadline mandated by the 2002 E-Government Act, which directed OMB to develop standards and guidelines to categorize federal information for better use of XML and other sharing tools.

In the meantime, Ambur says, any law enforcement databases that need retrofitting to share valid XML documents "are candidates for retirement anyway, since dot-gov agencies should no longer be using proprietary stovepipe applications."

When Justice and DHS agreed early this year on NIEM as a common interface for querying more than a dozen of each other's information systems, it capped six years of grassroots work by law enforcement groups, developers and vendors [FedTech, May 2005, Page 17]. Their agreement on NIEM will standardize rap sheets and incident reports across thousands of jurisdictional boundaries and hundreds of disparate local databases. "It's extremely important for border agents who depend on FBI data and watch lists," Hitch says. "NIEM is really effective in making data exchanges work better."

He says the two departments plan to devise a series of NIEM formats, "shorthand so that the case numbers are always the same and the elements are always in the right order. The NIEM interface mechanism defines all the exchanges in one layer, instead of having a dozen or 18 different interfaces."
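A rough way to see why a single exchange layer beats many separate interfaces is to count connections. The Python sketch below uses a general counting argument, not figures from Justice or DHS: connecting every pair of systems directly takes n(n-1)/2 interfaces, while a shared format takes one adapter per system. The function names and the system count are hypothetical.

```python
def pairwise_interfaces(n):
    """Interfaces needed when each pair of n systems is connected directly."""
    return n * (n - 1) // 2

def shared_format_adapters(n):
    """Adapters needed when each of n systems translates to one shared format."""
    return n

# Hypothetical example: six systems that all need to exchange data.
systems = 6
direct = pairwise_interfaces(systems)
shared = shared_format_adapters(systems)
print(f"{systems} systems: {direct} direct interfaces vs {shared} adapters")
```

The gap widens as systems are added, because the direct count grows with the square of the number of systems and the adapter count grows linearly.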

The first NIEM pilot, at DHS' Bureau of Border and Transportation Security, is helping customs and border agents swap data. Down the road, Hitch says, NIEM "could apply to drug law enforcement, gang violence, bomb technology, even health care—for example, the spread of West Nile virus."

And the FBI, he says, is eager to establish a NIEM-style agreement with DHS for the U.S. Visitor and Immigrant Status Indicator Technology system, because current exchanges "are not as elegant technically as we would like."

"Since the initial basis for NIEM was GJXDM, it has already taken shape to a significant degree," says Interior's Ambur. But, he adds, "If NIEM is to become a truly national information-exchange model, it will take a number of years to incorporate all the essential core elements."

But the work starts now, Daconta says. He urges agencies "to try and break" the revised DRM released last month. "We want them to find the weaknesses." With that model in hand, agencies can settle on the core elements and the government can begin to build out NIEM for all users.