Import-Export Module
The goal of the Import-Export module is twofold. On one hand, it should facilitate moving data between different installations or instances of DHIS 2. An example is exporting registered data values from an organisation unit on a lower level to the level above. On the other hand, it should facilitate data exchange between DHIS 2 and other applications. This is again twofold: DHIS 2 must be capable of importing data from other systems by understanding their formats, and it should be able to export data to a format other systems can understand. Examples of such systems are KIDS and other GIS software, Excel files, local legacy systems and the like. The rationale is for DHIS 2 to function alongside other, perhaps more specialized systems, and to be able to use data extracted from legacy applications.
Development information
Developers:
- Anders Gjendem, andegje AT ifi.uio.no
- Hans S. Tømmerholt, hanssto AT ifi.uio.no, MSN: hansto@frisurf.no
- IDSP (student project): stianast AT ifi.uio.no, okhustad AT ifi.uio.no
- SPSS/DBF (student project): geoffrer AT ifi.uio.no, oyvinrot AT ifi.uio.no, mariusgf AT ifi.uio.no
- OpenMRS (student project):
- Eligible Couples (student project):
- KIDS/IXF (student project):
Issues
- Create JIRA issues
- Site-information in the POM.
- Refactor import
- Memory issues in Import: Handle large numbers of data values.
- Iterator for export
- Iterator for import
- Transaction handling
- Import without GUI.
- Integrate new import/export modules.
- More feedback from the export process.
- Make error message visible
- Make errors visible
- Notice of how many values exported
- Notice when there are no values to export (difficult with Iterator)
- (Threading of export and import)
- Modify the Importer and Exporter interfaces with Observer capability.
- ImportListener, ImportEvent, ExportListener, ExportEvent?
- Improve CSV support
- GUI for creating/selecting format
- Better handling of strings, separators and end-of-line markers.
- Simple formatters for all current exporters (à la DataValueFormatter)
- HL7 support.
- Metadata export
- Metadata import
- DHIS 1.4 database import
- DHIS 1.4 import (values and metadata), preferably directly from the 1.4 XML file
Overview of the functionality
The module supports the following kinds of export:
- XML export of DataValues: See notes on the XML format implementation.
- CSV export of DataValues: DataValues are exported to a Comma Separated File, using the following format:
organisationUnit.name,organisationUnit.code,dataElement.name,dataElement.code,\
period.periodType.name,period.startDate,period.endDate,value,timestamp,comment
- CSV Datamart DataValue export: Export intended to be used with pivot tables in, for example, Microsoft Excel. In addition to the data in the standard CSV export, this export contains the full OrganisationUnit hierarchy, to enable drilling down into the data.
- CSV export of Indicators: Information about Indicators and aggregated DataValues for a given Period and OrganisationUnit.
- CSV Datamart Indicator export
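The column order of the standard CSV DataValue export listed above can be illustrated with a small sketch. The class and method names are hypothetical and not part of the module, and the quoting rule (wrap fields containing a comma, quote or line break in double quotes) is an assumption, since the module's actual escaping is not described here.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of the module: joins the ten fields of the
// standard CSV DataValue export in the documented column order.
final class CsvDataValueLineSketch
{
    // Assumption: fields containing a comma, quote or line break are wrapped
    // in double quotes, with embedded quotes doubled. Null becomes an empty field.
    static String escape( String field )
    {
        if ( field == null )
        {
            return "";
        }

        boolean needsQuotes = field.contains( "," ) || field.contains( "\"" )
            || field.contains( "\n" ) || field.contains( "\r" );

        return needsQuotes ? "\"" + field.replace( "\"", "\"\"" ) + "\"" : field;
    }

    static String formatLine( String orgUnitName, String orgUnitCode, String dataElementName,
        String dataElementCode, String periodTypeName, String startDate, String endDate,
        String value, String timestamp, String comment )
    {
        String[] fields = { orgUnitName, orgUnitCode, dataElementName, dataElementCode,
            periodTypeName, startDate, endDate, value, timestamp, comment };

        StringJoiner line = new StringJoiner( "," );

        for ( String field : fields )
        {
            line.add( escape( field ) );
        }

        return line.toString();
    }
}
```

A comment such as `Checked, ok` would be written as a quoted last field under this assumption, so that the embedded comma is not read as a separator.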
The module supports the following kinds of import:
All exported data are zipped by default and assumed to be zipped when importing. If the import code does not recognize a zip file, it will attempt to read the data as a normal file, for example as an XML file.
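One way to implement the zip-or-plain fallback described above is to peek at the first bytes of the stream. The sketch below is an assumption about how this could be done, not the module's actual code: it checks for the ZIP local file header signature (`PK` followed by the bytes 3 and 4) and otherwise hands back the untouched stream. The class and method names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

// Hypothetical helper: returns a stream over the first zip entry if the input
// starts with a ZIP local file header, otherwise the input itself.
final class ZipFallbackSketch
{
    static InputStream open( InputStream raw ) throws IOException
    {
        BufferedInputStream in = new BufferedInputStream( raw );
        in.mark( 4 );

        byte[] magic = new byte[4];
        int count = 0;

        while ( count < 4 )
        {
            int read = in.read( magic, count, 4 - count );
            if ( read < 0 )
            {
                break;
            }
            count += read;
        }

        in.reset(); // rewind so the caller sees the stream from the start

        boolean zipped = count == 4 && magic[0] == 'P' && magic[1] == 'K'
            && magic[2] == 3 && magic[3] == 4;

        if ( !zipped )
        {
            return in; // treat as a normal file, for example plain XML
        }

        ZipInputStream zip = new ZipInputStream( in );

        if ( zip.getNextEntry() == null )
        {
            throw new IOException( "Zip archive contains no entries" );
        }

        return zip; // positioned at the first entry's data
    }

    // Small utility for reading a stream to the end.
    static byte[] readAll( InputStream in ) throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int read;

        while ( ( read = in.read( buffer ) ) != -1 )
        {
            out.write( buffer, 0, read );
        }

        return out.toByteArray();
    }
}
```

With this approach the importer itself never needs to know whether the data arrived zipped; it simply reads from whatever stream `open` returns.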
The following kinds are under development:
- Export to the DBF format
- Export to KIDS
Additionally, a series of adapters and converters have been made which convert the export formats of other applications into a format DHIS 2 can understand:
XML format implementation
The XML processing in this module is handled by the XStream library, which can serialize objects to and from XML.
The XML format currently used by DHIS 2 is meant to be a minimal format for DataValues:
- Little metadata. There is no full data on OrganisationUnits, DataElements and Periods in the file, only the data needed to identify such objects. For example, each DataValue carries a reference to the name and (possibly) code of the owning OrganisationUnit.
- The element and attribute names are compacted compared to that of XStream's default output format.
Export
Package: org.hisp.dhis.service.importexport (interface), various (implementations)
Module: dhis-api (interface), dhis-service-importexport-default (implementations)
The code for exporting in DHIS 2 is organized around a concept of Exporters, classes or collections of classes which are capable of performing export, typically by converting a set of model objects to some format, like CSV or XML, and writing this data to a stream, typically a file. This is summarized in the Exporter interface:
public interface Exporter
{
    void export( Collection<Specification> specifications, OutputStream stream )
        throws ImportExportException;
}
The OutputStream can be any stream, meaning a file stream can be wrapped in other types of stream to control how data is written, following the Decorator design pattern.
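As a minimal illustration of this stream wrapping, the sketch below writes the same bytes through a `GZIPOutputStream` decorator. The writing method stands in for an Exporter body and only sees a plain `OutputStream`. A `ByteArrayOutputStream` is used in place of a file stream so the example runs without touching the file system; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demonstration of decorating the target stream.
final class StreamDecorationSketch
{
    // Stands in for an Exporter body: it only knows it has an OutputStream.
    static void writeValues( OutputStream stream ) throws IOException
    {
        stream.write( "OU1,DE1,101\n".getBytes( StandardCharsets.UTF_8 ) );
    }

    // Writes the same data through a GZIP decorator and returns the compressed bytes.
    static byte[] writeCompressed() throws IOException
    {
        ByteArrayOutputStream target = new ByteArrayOutputStream();

        try ( GZIPOutputStream gzip = new GZIPOutputStream( target ) )
        {
            writeValues( gzip ); // the writer is unaware of the compression
        }

        return target.toByteArray();
    }
}
```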
Note also that Exporters themselves may be chained in a Decorator fashion. An example of this is how various Exporters are decorated with the ZipExporter.
The export method takes a collection of Specification objects. Each is configured with a class reference, for example DataValue.class. In this case, the Specification defines a set of criteria a DataValue must conform to if it is to be included in the export. The Specification is sent to the Data Provider Module, which retrieves objects according to the specification. The constraints in the Specification object are typically defined by what the user selects in the user interface: which org units and data elements to export values for, etc.
Implementations
MinimalXmlExporter
This exporter will convert a set of DataValues to XML. It is supposed to be minimalistic, without unnecessary metadata. It contains identifiers (name and, optionally, code) for the OrganisationUnit and DataElement, the Period and its type, and the fields from DataValue, e.g. Value and Timestamp. Actual formatting is done by the MinimalXMLDataValueConverter, which is used by XStream.
<dhis>
  <export version="1.0-MinimalDataValue">
    <!-- Zero or more DataValues will be listed here -->
    <datavalue dataelementname="DataElement1" dataelementcode="DE1" organisationunitname="OrganisationUnit1" organisationunitcode="OU1" flag="RoutineData">
      <period type="Monthly" start="2005-01-01 00:00:00" end="2005-01-31 00:00:00"/>
      <comment>Comment for routine DataValue 1</comment>
      <storedby>admin</storedby>
      <timestamp>2005-01-01 00:01:00</timestamp>
      <value>101</value>
    </datavalue>
  </export>
</dhis>
Fields that are not required:
- dataelementcode
- organisationunitcode
- flag (ignored beginning with milestone 7)
- comment
- storedby
- timestamp
An empty string ("") is interpreted as no text/empty string and written as <something/>. A null-value generates no output at all to safely differentiate it from an empty string/value. Fields that may contain null-values are not required and will default to null.
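The null versus empty-string rule above can be sketched as a small helper. This is an illustration only, with hypothetical names; in the module the actual output is produced by the XStream converter, not by string concatenation.

```java
// Hypothetical helper illustrating the three cases for an optional element.
final class OptionalElementSketch
{
    // Minimal escaping of characters that are not allowed in XML text content.
    static String escape( String text )
    {
        return text.replace( "&", "&amp;" ).replace( "<", "&lt;" ).replace( ">", "&gt;" );
    }

    static String element( String name, String value )
    {
        if ( value == null )
        {
            return ""; // null: no output at all
        }

        if ( value.isEmpty() )
        {
            return "<" + name + "/>"; // empty string: empty element
        }

        return "<" + name + ">" + escape( value ) + "</" + name + ">";
    }
}
```

On import, the reverse mapping applies: a missing element is read as null, and an empty element as an empty string.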
Currently, the Period and PeriodType must already exist for an import to succeed. This is likely to change: all the necessary Period/PeriodType metadata is in the file, so any missing Periods could instead be created on import.
CSVDataValueExporter
This class will convert a set of DataValues to lines in a comma-separated (CSV) file. It can be configured to emit a header line before the values are written. The actual format and formatting are determined by a CSVFormatter object.
This Exporter assumes that all Specifications given to it are for DataValue objects.
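The split between an exporter that writes lines and a formatter that decides their content could look roughly like the sketch below. The real CSVFormatter interface is not shown in this document, so the interface and class here are hypothetical stand-ins, and the `\n` line ending is an assumption.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical formatter contract: decides the header and the line layout.
interface LineFormatterSketch<T>
{
    String header();

    String format( T item );
}

// Hypothetical writer: emits an optional header, then one line per item.
final class CsvWriterSketch<T>
{
    private final LineFormatterSketch<T> formatter;

    private final boolean writeHeader;

    CsvWriterSketch( LineFormatterSketch<T> formatter, boolean writeHeader )
    {
        this.formatter = formatter;
        this.writeHeader = writeHeader;
    }

    void write( Iterable<T> items, OutputStream stream ) throws IOException
    {
        Writer writer = new OutputStreamWriter( stream, StandardCharsets.UTF_8 );

        if ( writeHeader )
        {
            writer.write( formatter.header() );
            writer.write( "\n" );
        }

        for ( T item : items )
        {
            writer.write( formatter.format( item ) );
            writer.write( "\n" );
        }

        writer.flush(); // flush, but leave the caller's stream open
    }
}
```

Keeping the layout in the formatter means a different column set, such as the Datamart variant, only needs a new formatter rather than a new writer.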
CSVFormatter
CSVIndicatorExporter
ZipExporter
This class decorates another Exporter and sets up a ZipStream around the OutputStream passed to it. The ZipStream is prepared, and the decoratedExporter is given the new stream to write to.
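A decorator along these lines might be sketched as follows. The `Specification`, `ImportExportException` and `Exporter` types below are reduced stand-ins for the real dhis-api types so that the example compiles on its own. `FixedTextExporter`, the `entryName` parameter and the use of `java.util.zip.ZipOutputStream` are assumptions for illustration, not the module's actual implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Stand-ins for the real dhis-api types, reduced to what the sketch needs.
class Specification
{
}

class ImportExportException extends Exception
{
    ImportExportException( String message, Throwable cause )
    {
        super( message, cause );
    }
}

interface Exporter
{
    void export( Collection<Specification> specifications, OutputStream stream )
        throws ImportExportException;
}

// Hypothetical inner exporter that writes a fixed document.
class FixedTextExporter implements Exporter
{
    public void export( Collection<Specification> specifications, OutputStream stream )
        throws ImportExportException
    {
        try
        {
            stream.write( "<dhis/>".getBytes( StandardCharsets.UTF_8 ) );
        }
        catch ( IOException e )
        {
            throw new ImportExportException( "Write failed", e );
        }
    }
}

// Decorator: prepares a zip entry, then lets the decorated exporter write into it.
class ZipExporterSketch implements Exporter
{
    private final Exporter decoratedExporter;

    private final String entryName;

    ZipExporterSketch( Exporter decoratedExporter, String entryName )
    {
        this.decoratedExporter = decoratedExporter;
        this.entryName = entryName;
    }

    public void export( Collection<Specification> specifications, OutputStream stream )
        throws ImportExportException
    {
        try
        {
            ZipOutputStream zip = new ZipOutputStream( stream );
            zip.putNextEntry( new ZipEntry( entryName ) );
            decoratedExporter.export( specifications, zip );
            zip.closeEntry();
            zip.finish(); // completes the archive without closing the caller's stream
        }
        catch ( IOException e )
        {
            throw new ImportExportException( "Zip export failed", e );
        }
    }
}
```

Because the decorator implements the same interface, callers can use `new ZipExporterSketch( inner, "export.xml" )` anywhere an Exporter is expected.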
Import
Package: org.hisp.dhis.service.importexport (interface), various (implementations)
Module: dhis-api (interface), dhis-service-importexport-default (implementations)
Importer
In the same way as export is organized around an Exporter interface, the basic interface for import is Importer. This interface represents all classes capable of reading data from a stream and producing an iterator over the objects retrieved or converted.
public interface Importer
{
    Iterator importData( InputStream stream )
        throws ImportExportException;
}
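To make the contract concrete, here is a minimal sketch of an implementation that reads a text stream lazily, one line per iteration, so that the whole input never has to be held in memory. The `Importer` and `ImportExportException` types are stand-ins for the real API (the stand-in uses `Iterator<?>` where the original uses the raw type), and `LineImporterSketch` is a hypothetical example, not one of the module's importers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Stand-ins for the real dhis-api types.
class ImportExportException extends Exception
{
    ImportExportException( String message, Throwable cause )
    {
        super( message, cause );
    }
}

interface Importer
{
    Iterator<?> importData( InputStream stream ) throws ImportExportException;
}

// Iterator that reads one line ahead so hasNext() can answer without side effects.
class LineIterator implements Iterator<String>
{
    private final BufferedReader reader;

    private String upcoming;

    LineIterator( BufferedReader reader )
    {
        this.reader = reader;
        this.upcoming = readLine();
    }

    private String readLine()
    {
        try
        {
            return reader.readLine();
        }
        catch ( IOException e )
        {
            throw new UncheckedIOException( e );
        }
    }

    public boolean hasNext()
    {
        return upcoming != null;
    }

    public String next()
    {
        if ( upcoming == null )
        {
            throw new NoSuchElementException();
        }

        String current = upcoming;
        upcoming = readLine();
        return current;
    }
}

// Hypothetical importer: each line of the stream becomes one imported object.
class LineImporterSketch implements Importer
{
    public Iterator<String> importData( InputStream stream ) throws ImportExportException
    {
        BufferedReader reader = new BufferedReader(
            new InputStreamReader( stream, StandardCharsets.UTF_8 ) );

        try
        {
            return new LineIterator( reader );
        }
        catch ( UncheckedIOException e )
        {
            throw new ImportExportException( "Could not read import stream", e );
        }
    }
}
```

Note that read errors after the first line surface as `UncheckedIOException` during iteration, since `Iterator` methods cannot throw checked exceptions.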
MinimalXMLDataValueImporter
Handles importing of the MinimalXML format mentioned above. Based on XStream.
ImporterPlugin
Wraps an Importer and adds functionality for naming of the importer. Implemented in the API.
ImporterPluginManager
Contains a Map of ImporterPlugins, which can be queried by name or retrieved as a Map.
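The relationship between a plugin and its manager might look roughly like this. All class and method names below are hypothetical, since the real signatures are not shown in this document, and the `Importer` stand-in omits the checked exception for brevity.

```java
import java.io.InputStream;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the Importer interface (exception omitted for brevity).
interface Importer
{
    Iterator<?> importData( InputStream stream );
}

// Hypothetical plugin shape: an Importer plus a name for the GUI layer.
final class ImporterPluginSketch
{
    private final String name;

    private final Importer importer;

    ImporterPluginSketch( String name, Importer importer )
    {
        this.name = name;
        this.importer = importer;
    }

    String getName()
    {
        return name;
    }

    Importer getImporter()
    {
        return importer;
    }
}

// Hypothetical manager: plugins keyed by name, in registration order.
final class ImporterPluginManagerSketch
{
    private final Map<String, ImporterPluginSketch> plugins = new LinkedHashMap<>();

    void register( ImporterPluginSketch plugin )
    {
        plugins.put( plugin.getName(), plugin );
    }

    // Returns null if no plugin is registered under the given name.
    ImporterPluginSketch getPlugin( String name )
    {
        return plugins.get( name );
    }

    // Read-only view, for example for populating a selection list in the GUI.
    Map<String, ImporterPluginSketch> getPlugins()
    {
        return Collections.unmodifiableMap( plugins );
    }
}
```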
ImportManager
Connects the service layer and the GUI layer. In general, it takes an ImporterPlugin (a wrapper around an Importer that provides naming information for the GUI layer) and applies it to a stream of data. The ImportManager is responsible for sorting the imported data into categories in the ImportResult implementation.
public interface ImportManager
{
    public ImportResult getDataFromStream( InputStream importDocument, ImporterPlugin importerPlugin )
        throws ImportExportException;

    public ImportResult importDataFromStream( InputStream importDocument, ImporterPlugin importerPlugin )
        throws ImportExportException;

    public ImportResult importData( ImportResult importData )
        throws ImportExportException;
}
DefaultImportManager
Utilizes Classifiers and Savers to handle the classification and saving of objects that are due to be imported.
ImportResult
Container object for groups of new/newer/older/equal/all data and the number of imported objects.
DefaultImportResult
List-based implementation that retains the order of the objects as they are added.
Classifier
Defines an interface for classifying Objects into the classes new, newer, older and equal.
DataValueClassifier
Implementation that can classify DataValue objects.
Saver
Defines an interface for saving Objects based on the classification type returned by the corresponding Classifier.
DataValueSaver
Implementation that can save DataValue objects using the API and the classification from the corresponding Classifier.
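The interplay between a Classifier and a Saver can be sketched as below. The document does not state how the four classes are determined, so the rule used here is an assumption: an incoming value is new if nothing is stored under its key, and otherwise newer, older or equal by timestamp comparison. The policy of saving only new and newer values is likewise assumed, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

enum ClassificationSketch
{
    NEW, NEWER, OLDER, EQUAL
}

// Hypothetical, reduced stand-in for a DataValue.
final class StoredValueSketch
{
    final String value;

    final long timestamp;

    StoredValueSketch( String value, long timestamp )
    {
        this.value = value;
        this.timestamp = timestamp;
    }
}

// Combines classification and saving against an in-memory store.
final class DataValueImportSketch
{
    private final Map<String, StoredValueSketch> store = new HashMap<>();

    // Assumption: classification is by timestamp relative to the stored value.
    ClassificationSketch classify( String key, StoredValueSketch incoming )
    {
        StoredValueSketch existing = store.get( key );

        if ( existing == null )
        {
            return ClassificationSketch.NEW;
        }

        if ( incoming.timestamp > existing.timestamp )
        {
            return ClassificationSketch.NEWER;
        }

        if ( incoming.timestamp < existing.timestamp )
        {
            return ClassificationSketch.OLDER;
        }

        return ClassificationSketch.EQUAL;
    }

    // Assumption: only NEW and NEWER values are written; returns true if saved.
    boolean save( String key, StoredValueSketch incoming )
    {
        switch ( classify( key, incoming ) )
        {
            case NEW:
            case NEWER:
                store.put( key, incoming );
                return true;
            default:
                return false;
        }
    }
}
```

In the module itself, which classes get saved would presumably be a choice made by the user or the ImportManager rather than hard-coded as it is here.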