  ReqDoc
Added by Ola Hodne Titlestad, last edited by Ola Hodne Titlestad on Jun 07, 2005  (view change)

1. Preface

This is the first draft of the DHIS 2.0 requirement specification document. We want to release this unfinished document to get feedback from the rest of the network and as far as possible make this a joint effort to develop a complete document.

Therese Steensen
Hanne Vibekk
Erling Svanberg Mytting
Trond Andresen
Ola Hodne Titlestad

2. Introduction

The rationale for developing DHIS 2.0 is to develop a platform-independent and web-enabled version of the DHIS. DHIS 1.3, an MS Access-based application, was developed 8 years ago in Cape Town, South Africa, and has been continuously improved since then. Another version of the DHIS, version 1.4, is also in development. It is important to understand that the decision to base version 1.4 to some extent on MS Access was triggered by the need to design and implement major innovations that have gradually emerged over the last 5-6 years WITHOUT simultaneously having to deal with a major technology shift. Version 1.4 will therefore largely drive the conceptual development of the DHIS over the next year, while version 2.0 will focus more on implementing all those new designs and concepts in a new (Java-based) framework. DHIS 2.0 in turn will probably enable and trigger another round of mainly conceptual development towards version 3.0, and so on.

3. User requirements definition

  1. The system shall be Open Source Software
    • The system shall be based on only Open Source Software and run on an open source platform.
  2. The system shall be platform independent
    • The system should run on at least Linux and Microsoft Windows; this will be catered for by implementing the system in Java.
  3. The system shall be database management server (DBMS) independent.
    • The system shall support a range of DBMSs, at least many of the most used commercial and open source DBMSs.
      Implementing the system in Java and using an Open Source Java framework for the persistence layer, such as Hibernate, enables easy switching between DBMSs.
  4. The system shall be web-enabled
    • The system shall support both offline and online users and a range of different scenarios of networked users (see figure below)
    • The system should offer web-enabling of at least the following functionality:
      • Reports
      • Pivot Table-based and other OLAP-type analysis
      • Data dictionary
      • Reporting/sharing of data
    • The core module(s) (ref. DHIS 1.3's AccessMD) with routine data entry and database maintenance shall in the initial releases be stand-alone module(s), but shall have the possibility to be web-enabled in the future.
    • Following the Java 2 Enterprise Edition (J2EE) architecture, with well-defined layers for persistence, business logic and presentation, supports the implementation of several different clients, both stand-alone and web-based.
  5. The system shall be internationalized.
    • The system shall support multi-language user interfaces, documentation and support. Java's standard internationalization (i18n) facilities provide good support for multi-language GUIs.
  6. Backward compatibility
    • The system shall provide all or most of the functionality of previous versions of the DHIS. Furthermore, the system shall be compatible with previous versions of the DHIS (1.4 and maybe also 1.3) and support export/import of organizational units, data elements and raw data to/from them.
  7. The system shall support a wide range of hardware configurations and shall support configurations that are far below state of the art.
  8. The system shall support electronic reporting on any standard medium.

The DHIS 2.0 can be regarded as a hybrid application suite that should cater for at least three main needs:

(1) The need to rapidly design data collection tools for a variety of purposes.

(2) The need to efficiently capture or import/collate this variety of data in an "integrated" manner, then to monitor the processing and flow of this data, and finally to communicate with various stakeholders in the process (re our recent discussions around SMS/WAP).

(3) The need to analyse relatively large and complicated data sets quickly and efficiently.

Another way of saying the same is that the term "hybrid" denotes that the DHIS actually needs to combine the typical properties of a flexible TRANSACTION database with the typical properties of a DATA WAREHOUSE and some properties of a COMMUNICATION TOOL.

4. System architecture

  • The system shall be easily scalable to include new modules and features.
  • The system shall have a pluggable architecture that caters for flexible scalability and supports localization of the system.
  • The system shall have an architecture that supports a globally distributed systems development process.
  • The system shall have a layered architecture with clear separation between the layers based on the three-tier architecture.

5. System requirements specification

5.1. Routine data module
(This list is based on DHIS 1.3 functionality (AccessMD module) plus what we know will come in 1.4. It will need to be updated when the 1.4 GUI is ready, and further developed as new requirements that are not implemented in 1.4 come up. This module covers the same functionality as in 1.3's monthly data module, and we should look more into how we can further modularize this bundle of functionality and possibly define smaller modules.)

1 Database Maintenance

1.1 Organizational units
Organizational units can be entered, edited or deleted, and the organizational hierarchy is defined. The user is also enabled to modify organizational unit data.

1.1.1 Add/edit/delete organizational units
This functionality gives the users an overview of all the organizational units in the active database, and is used to maintain the organizational hierarchy. The user is enabled to add a new organizational unit, edit existing information about an organizational unit or to delete an existing organizational unit. The list of organizational units can be filtered by a text-string, either by a user-specified search-string or by choosing an organizational unit from a list.

The following information is stored about an organizational unit:

  • Organizational Unit (compulsory):
    The full name of the organizational unit using the naming convention (see below).
  • Parent (compulsory):
    The organizational "parent" unit of the facility, along with the structure name (if there is more than one organizational structure, as in South Africa).
  • Short name (compulsory):
    The short name of the organizational unit (maximum 20 characters), used in column headings on reports etc.
  • Code:
    Facility number (some provinces have numbered their facilities)
  • Comment:
    Any relevant comment about the facility is noted here. If a facility changes its name, the old name is added to the comment if the user wishes to.
  • Opened:
    The date on which the facility opened.
  • Closed:
    The date on which the facility closed.
  • Data:
    A field that indicates that the facility is active and submits data on a regular basis. Only facilities that are active and submit data appear in the facility list in "Routine health data entry/edit".

A naming convention is used to standardize the naming of facilities. All organizational units start with the relevant provincial prefix to ensure that each facility has a unique identifier. Another convention to be followed is to be as specific as possible when naming a facility, and to include the type of facility (e.g. Hospital, or CHC for a community health center).

1.1.2 Add/edit/delete organizational structure(hierarchy)
This functionality displays all the registered organizational "child" units with the corresponding organizational "parent" unit. No organization unit exists without a parent at a higher level. The list of organizational "child" units can be filtered by a text-string, either by a user-specified search-string or by choosing an organizational unit from a list.

The user can change the current parent of an organizational "child" unit if necessary, thereby shaping the structure of the organizational hierarchy.
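
The parent/child rule described above can be illustrated with a minimal Java sketch. The class and method names (OrgUnit, moveTo, level) are hypothetical and not taken from any DHIS version. The sketch assumes that a single root unit is the only unit allowed to have no parent, and that a move which would place a unit beneath itself must be rejected; neither detail is stated in this document.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of an organizational unit (illustrative names only). */
final class OrgUnit {
    final String name;
    private OrgUnit parent;                       // null only for the root (assumption)
    private final List<OrgUnit> children = new ArrayList<>();

    OrgUnit(String name, OrgUnit parent) {
        this.name = name;
        if (parent != null) {
            moveTo(parent);
        }
    }

    OrgUnit getParent() { return parent; }

    List<OrgUnit> getChildren() { return children; }

    /** Level in the hierarchy: 1 for the root, otherwise the parent's level plus one. */
    int level() { return parent == null ? 1 : parent.level() + 1; }

    /** Changes the parent of this unit, refusing moves that would create a cycle. */
    void moveTo(OrgUnit newParent) {
        if (newParent == null) {
            throw new IllegalArgumentException("a parent is compulsory");
        }
        // Walk up from the proposed parent; meeting this unit means a cycle.
        for (OrgUnit u = newParent; u != null; u = u.parent) {
            if (u == this) {
                throw new IllegalArgumentException("cannot move a unit below itself");
            }
        }
        if (parent != null) {
            parent.children.remove(this);
        }
        parent = newParent;
        newParent.children.add(this);
    }
}
```

Moving a facility from one district to another is then a single call such as clinic.moveTo(otherDistrict), and the facility's level follows from its new position in the hierarchy.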

1.1.3 Add/edit/remove organizational unit groups (1.4 functionality)
Organizational units might be both physical entities (facility, ward, department, mobile unit) rendering health services, and administrative/political "virtual" units. "Virtual" units are not physical entities and usually do not have a physical perimeter, but they are usually legal entities under political governance and with an administration, a budget, assets, and a target/catchment population.

The user can use this tool to create, edit, deactivate and remove any desired group of organizational units, to group collections of organizational units based on some common criteria. Deactivation of an organizational unit group means that the group and group membership remain, but the group is not used or exported to the data mart.

Examples of requested grouping parameters in South Africa are:

  • NAFSI sites (adolescent friendly facilities)
  • well baby (baby friendly facilities)
  • Sentinel surveillance sites
  • Urban/rural
  • Privacy level
  • Level of care
  • admin/patient-handling/support-services

1.1.4 Organizational unit groups membership management (1.4 functionality)
This tool gives the user the ability to assign organizational unit(s) to the user-defined organizational unit groups defined in section 1.1.3. The adding/removing of group members must be done for either each organizational unit, or through bulk updates based on parameters. Such parameters might be based on routine data (AVG or SUM for a specified period), and/or semi-permanent data (specific date), and/or survey data (specific survey or data set).

When running such bulk analysis, the user should specify the organizational unit level to analyze, and if relevant, limit the search to members of other groups (being able to exclude desired groups). The user should also be able to review the resulting potential group members, and select the desired units before committing the group membership update to the database.

The database must support the setting of universal standards (viewed as "targets") or time-based planning targets for each group, and also the inclusion of benchmark data (benchmark data are values from other areas or countries that a user can use as a backdrop to the user's own data).

1.1.5 Switch structure
This functionality enables the user to switch an organizational structure. The switching moves and renames the parents of all organizational units as per the new demarcations. Population figures are also moved, allowing the calculation of population based indicators for both old and new district boundaries.

1.1.6 Import/export organizational structure
This function allows the user to import/export complete organizational unit hierarchies (administrative structures) into/from the DHIS data file. This tool is intended for super-users, but there are no physical restrictions in the system to prevent other users from using it.

When exporting an organization hierarchy the user can choose between exporting only the structure or all the existing organization unit data.

1.1.7 Delete organizational units
Deletion of organizational units is restricted to the administrator.

1.1.7.1 Delete organizational units and related data
The user is allowed to prune the organizational structure of the active database by deleting organizational units and their related data, in order to see only desired region/district facilities. The user can choose to delete all the organization units within a province, district or sub-district, or to just delete one organization unit at a time by selecting a sub-district data.

1.1.7.2 Delete all organizational units except selected organizational units
This functionality allows the user to make a selection of a facility, sub-district, or district so that all other organizational units get pruned away. It offers the same function as that described above (1.1.7.1), except that the selected structures are retained.

1.2 Data elements
All data elements (routine, semi-permanent, survey/audit and dataset meta data) can be added, defined, removed and edited in this section. It also has the function of resetting minimum and maximum ranges to default ranges using a period of at least six months of data, and can modify data entry form settings.

1.2.1 Add/edit/remove data elements
The features used in this section to add new data elements to the data set, and to edit or delete existing data elements, include:

  • Update element from Data Dictionary:
    Automatically updates all the definitions and comments regarding the data elements in the dataset from the website where Data Dictionary is located.
  • Calculation detail:
    Gives the user the ability to make a calculated field. The user can choose a data element from a list and enter a factor value. The default value is 1, and it should only be changed if the value of the data element is to be weighted as part of the calculation of the indicator. Then the user can select another data element and enter a factor value. The two data elements then make up the calculated field.
  • Compulsory pairings:
    One data element can be paired with another data element to make up a compulsory pairing. A compulsory pairing makes sure that when the user enters data about one data element, data about the other data element in the pair also needs to be entered.
  • Add new record:
    This functionality adds a new data element to the data set. The user can choose an existing data element (defined in the Data Dictionary), or define a new data element. If an existing data element has been chosen, the information about the element stored in the Data Dictionary is filled in and the user has the ability to customize it. If the user chooses to define a new data element, the user has to fill in information (as described below) about the data element.
  • Delete data element:
    Allows the deletion of data elements. Only data elements that have no associated data and do not form part of an indicator/calculated field can be deleted.
  • Update all:
    This functionality will update a data element's long name to the long name registered with the data element in the Data Dictionary.
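
The "Calculation detail" feature above can be read as a weighted sum of data elements. The following Java sketch illustrates that reading only: the class name CalculatedField, its methods and the element names used with it are hypothetical, and treating a missing value as zero is an assumption that this document does not make.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical calculated field: a sum of data element values, each multiplied by a factor. */
final class CalculatedField {
    // Data element name -> factor, kept in the order the user added the elements.
    private final Map<String, Double> factors = new LinkedHashMap<>();

    /** Adds a data element with an explicit weighting factor. */
    CalculatedField add(String dataElement, double factor) {
        factors.put(dataElement, factor);
        return this;
    }

    /** Adds a data element with the default factor of 1. */
    CalculatedField add(String dataElement) {
        return add(dataElement, 1.0);
    }

    /** Evaluates the field; elements without a value contribute nothing (assumption). */
    double evaluate(Map<String, Double> values) {
        double sum = 0.0;
        for (Map.Entry<String, Double> term : factors.entrySet()) {
            Double value = values.get(term.getKey());
            if (value != null) {
                sum += value * term.getValue();
            }
        }
        return sum;
    }
}
```

Because the result is computed on demand from the underlying values, nothing needs to be stored for the calculated field itself, which matches the statement elsewhere in this section that calculated fields are not stored in the database.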

The following information is stored about a data element:

  • SortOrder
    The numerical display order of a data element in drop-down lists, queries and reports. Note that when appearing in a data set, the data elements have an internal sort order defined by the data set and not by this value.
  • Data element name
    Name used to describe the data element.
  • Data element code
    A standard code for this data element
  • Short name
    Shortened version of the data element name, up to 20 characters; useful for column headings, reports etc.
  • DOS name
    Each element must have a DOS name of at most 8 characters. The name consists only of capital letters and numerals. For use with xBase software like ArcExplorer.
  • Data element prompt
    Enhanced version of the data element name for use as an input form prompt, or as a question prompt in questionnaires.
  • Full description
    Full description of the data element, its use and so forth - equivalent to Definition+Use+Comments from the Data Dictionary.
  • Calculated
    Indicates if the data element is a calculated field. Calculated fields are used to add together two or more fields, and are only made visible in the data mart and report modules (they are not stored in the database). All indicators requiring a calculated field to make up their construction must have all the individual data elements tagged in order to calculate that specific indicator.
  • Data type
    Select from a drop down list the type of data element (e.g. number, string, yesno, date, memo)
  • Validity period
    Give the start and end date for the validity period for this data element.
  • Comment
    Comment/explanation related to this data element.
  • Optional values:
    Possibility to type in alternative values for the data element name and the short name, e.g. in other languages.

The user has access to a data element's definition, guide for use and context by double-clicking on the data element name.

1.2.2 Import data elements
The user can import data elements from other databases if desired. The user chooses the desired database and gets a list of the data elements in that database that are available for import. The list can be ordered by the data elements' display order or alphabetically. The user then selects the desired data elements and imports them.

1.2.3 Modify "display" and "compulsory" settings for data elements
This feature is an advanced user function (useful at a district information office level), and allows the modification and adaptation of the "Routine health data entry/edit" form to be done in bulk.

The user selects the wanted district, sub-district and organizational unit type, and a list of facilities of the selected type will appear. The user also has the ability to select subsets of data elements by category. Then the user untags the data elements that are not required for the selected facilities, and this will remove all these elements in all these organizational units. The user can then tag the elements that are to be defined as compulsory. Changes made through this functionality only take effect for new months of data, and will not remove data already registered for the data elements.

1.2.4 Identify "compulsory" data
This tool allows the user to identify certain data elements as compulsory, based on the frequency of reporting for a data element by a specific facility. The tool determines, based on the selection of the district, sub-district and facility(ies), the number of times a facility has reported on a certain data element. If this has happened frequently in the time period selected by the user (the calculation requires at least six months of data), then the data element appears in a generated list. The user can either choose to make this data element compulsory or not.
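
As a rough illustration of the frequency check described above, the following Java sketch lists the data elements that a facility reported in a large share of the months in the selected period. The class name, the method signature and the threshold parameter are hypothetical: this document only says "frequently", so the cut-off is left to the caller. The six-month minimum is taken from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that proposes data elements as candidates for "compulsory". */
final class CompulsoryCandidates {
    /** Minimum length of the analysed period, as stated in the requirement. */
    static final int MIN_MONTHS = 6;

    private CompulsoryCandidates() {}

    /**
     * @param monthsReported data element name -> number of months with a reported value
     * @param monthsInPeriod number of months in the period selected by the user
     * @param threshold      share of months (0..1) that counts as "frequently" (assumption)
     * @return candidate data element names, sorted alphabetically
     */
    static List<String> find(Map<String, Integer> monthsReported,
                             int monthsInPeriod, double threshold) {
        if (monthsInPeriod < MIN_MONTHS) {
            throw new IllegalArgumentException("at least six months of data are required");
        }
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : new TreeMap<>(monthsReported).entrySet()) {
            double share = (double) e.getValue() / monthsInPeriod;
            if (share >= threshold) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }
}
```

The returned list is only a proposal; as the requirement states, the user decides for each listed data element whether to make it compulsory.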

1.2.5 Automatically calculate min/max ranges
This functionality allows the automatic creation of minimum and maximum range values for data elements. A minimum of 3 months of data is needed before range values can automatically be calculated.

The user selects the appropriate organizational unit 3 (district), organizational unit 4 (sub-district) or organizational unit 5 (facility). The user has the ability to select all facilities within a sub-district, and then set ranges for each facility within that sub-district. Then the time period with the most stable data is chosen, and the user is informed of the number of range records for the selected set of data. Range values are estimated by calculating the average over the selected time period and then removing all outliers that lie outside the band determined by the average +/- the standard deviation multiplied by a user-defined factor. After these outliers have been removed, the minimum and maximum values are calculated again using the average +/- two standard deviations. After the calculation the user can review the calculated range values, compare the new values with the current ones, and select/unselect individual minimum/maximum values. The user then chooses whether to replace the current range values with the selected calculated values.
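
The two-pass estimate described above can be sketched in Java as follows. The class and method names are hypothetical. Two details are assumptions because this document does not specify them: the population standard deviation is used, and the average and standard deviation are recomputed on the data that remain after the outliers have been removed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the automatic min/max range estimate for one data element. */
final class MinMaxRange {
    private MinMaxRange() {}

    static double mean(List<Double> xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    /** Population standard deviation (assumption; the text does not say which kind). */
    static double stdDev(List<Double> xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / xs.size());
    }

    /** Returns {min, max} for the given values and user-defined outlier factor. */
    static double[] estimate(List<Double> values, double outlierFactor) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no data");
        }
        // Pass 1: drop values outside average +/- factor * standard deviation.
        double m = mean(values);
        double band = outlierFactor * stdDev(values);
        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            if (Math.abs(v - m) <= band) {
                kept.add(v);
            }
        }
        if (kept.isEmpty()) {
            throw new IllegalArgumentException("outlier factor removes every value");
        }
        // Pass 2: range = average +/- 2 standard deviations of the remaining data.
        double m2 = mean(kept);
        double sd2 = stdDev(kept);
        return new double[] { m2 - 2.0 * sd2, m2 + 2.0 * sd2 };
    }
}
```

For example, monthly values of 8, 12, 8 and 12 with an outlier factor of 2 have an average of 10 and a standard deviation of 2, so nothing is removed and the resulting range is 6 to 14.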

1.3 Data sets (need more info from 1.4 GUI)
1.3.1 Add/edit/remove data sets
This tool gives the user the ability to create, edit, deactivate and remove data sets.

1.3.2 Assign data elements to data sets
The user can use this tool to assign data elements to the data sets defined in section 1.3.1. The user must have the ability to support not only single assignment but also bulk assignments (i.e. selection of pre-defined categories of data elements, or selection of data elements that have a certain embedded string, or selection of all data elements belonging to another set, or a combination thereof).

Additional requirement:
Assign data periods to data sets:
This functionality gives the user the ability to assign data periods to data sets, depending on the type of data in the data set:
1. Routine data has a reporting frequency
2. Semi-permanent data will default to ValidFrom = Date() and ValidTo = EndOfTime.
3. Survey data sets have a default validity period, but individual data elements can be given other validity periods.
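
The default for semi-permanent data (ValidFrom = Date(), ValidTo = EndOfTime) could be modelled along the following lines. This is a sketch under assumptions: the class ValidityPeriod is hypothetical, and the concrete date used to stand in for "EndOfTime" is an arbitrary choice that this document does not prescribe.

```java
import java.time.LocalDate;

/** Hypothetical validity period attached to a data set or data element. */
final class ValidityPeriod {
    /** Stand-in for "EndOfTime"; the concrete date is an assumption. */
    static final LocalDate END_OF_TIME = LocalDate.of(9999, 12, 31);

    final LocalDate validFrom;
    final LocalDate validTo;

    ValidityPeriod(LocalDate validFrom, LocalDate validTo) {
        if (validTo.isBefore(validFrom)) {
            throw new IllegalArgumentException("validTo precedes validFrom");
        }
        this.validFrom = validFrom;
        this.validTo = validTo;
    }

    /** Default for semi-permanent data: valid from the given day until the end of time. */
    static ValidityPeriod semiPermanentDefault(LocalDate today) {
        return new ValidityPeriod(today, END_OF_TIME);
    }

    /** True if the date falls within the period, both ends inclusive. */
    boolean contains(LocalDate date) {
        return !date.isBefore(validFrom) && !date.isAfter(validTo);
    }
}
```

A survey data set could hold one such period as its default, with individual data elements carrying their own period where it differs.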

1.4 Semi-permanent data
Semi-permanent data provides mainly denominator data. This data can be population figures or any other data elements that are relatively constant.

1.4.1 Add/edit/delete semi-permanent data elements
This functionality works in the same way as "Add/edit/remove data elements" (see 1.2.1), and gives the user the ability to add new semi-permanent data elements, or edit or delete existing ones.

The information stored about a semi-permanent data element is the same as for data elements (see 1.2.1 above).

1.4.2 Import semi-permanent data elements
This tool allows the importation of semi-permanent data elements from other databases, and it functions in the same way as "Importing data elements" (see 1.2.2), "Importing organizational unit data elements" (see 1.2.7) and "Importing survey/audit data elements" (see 1.2.9).

1.4.3 Enter semi-permanent data
Semi-permanent data values for a given organizational unit are entered/edited in a data entry form. The user has the ability to preview and print out the entered data.

1.4.4 Re-estimate primary catchments population
This tool allows the catchment populations to be determined for a facility by choosing a population validation period. The functionality "Enter semi-permanent data" (see 1.4.3) displays this data based on the calculations performed in this section. This tool is used for re-estimating catchment populations when the distribution of the population has changed, when a new facility has become operational, or when a facility is closed down.

1.5 Survey/Audit data (Wait for the 1.4 to be released first)

1.6 Indicators
Indicators can be defined and edited according to their numerator, denominator and indicator type in this section. It also allows the importation of indicators from other databases. Indicators can be made up of both data elements and semi-permanent data elements. The numerator in an indicator is usually a data element, and the denominator is usually a semi-permanent data element.

1.6.1 Define/revise/delete indicators
This functionality gives the user an overview of defined indicators for the selected source table (monthly data or TB data), and the ability to edit or delete existing indicators or define new ones. The list of defined indicators can be sorted by choosing an indicator category from a list (indicator category is a lookup table and can be edited by the user), or by entering a search-string.

If a user wishes to make a new indicator the user fills out the necessary information about the indicator, and chooses a numerator and a denominator.

Information stored about an indicator is:

  • Indicator name:
    The name of the indicator.
  • Indicator short name:
    The short name of the indicator, of at most 20 characters.
  • DOS name:
    A DOS name of 8 characters or fewer.
  • Description:
    The full definition of the indicator and the use of it.
  • Valid from and valid to:
    Time period for specific indicator calculations.
  • Numerator description:
    A description of the numerator.
  • Numerator factor:
    The multiplier for the numerator (default 1).
  • Denominator description:
    A description of the denominator.
  • Denominator factor:
    The multiplier for the denominator (default 1).
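
One plausible reading of the numerator and denominator factors listed above is sketched below in Java. The formula (numerator multiplied by its factor, divided by denominator multiplied by its factor) is an assumption, since this document does not spell out the calculation, and the class and method names are hypothetical. Returning an empty result for a zero denominator is likewise a choice made for the sketch.

```java
import java.util.OptionalDouble;

/** Hypothetical indicator built from a numerator, a denominator and their factors. */
final class Indicator {
    final String name;
    final double numeratorFactor;    // multiplier for the numerator, default 1
    final double denominatorFactor;  // multiplier for the denominator, default 1

    Indicator(String name, double numeratorFactor, double denominatorFactor) {
        this.name = name;
        this.numeratorFactor = numeratorFactor;
        this.denominatorFactor = denominatorFactor;
    }

    /** Creates an indicator with both factors set to their default of 1. */
    Indicator(String name) {
        this(name, 1.0, 1.0);
    }

    /**
     * Computes (numerator * numeratorFactor) / (denominator * denominatorFactor).
     * Returns an empty result when the weighted denominator is zero.
     */
    OptionalDouble value(double numerator, double denominator) {
        double weightedDenominator = denominator * denominatorFactor;
        if (weightedDenominator == 0.0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(numerator * numeratorFactor / weightedDenominator);
    }
}
```

With a numerator factor of 100, for instance, a numerator of 45 and a denominator of 900 give an indicator value of 5, that is, 5 per 100.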

1.6.2 Import indicators
This functions in the same way as "Import data elements" (see 1.2.2), but an indicator will have two data elements attached. If the current database does not have the necessary data elements, the indicator will not be imported.

1.7 Data element and indicator groups (1.4 functionality)
1.7.1 Add/edit/remove groups
The users can create any desired group of data elements and indicators.
1.7.2 Groups membership management
The users can assign data elements and indicators to the user-defined groups (1.7.1).

1.8 Validation
In this section, validation rules designed to eliminate most of the obvious errors can be entered and edited. There is a standard list of validation rules which can be applied to different data sets, but users also have the option of adding their own validation rules or editing existing ones.

There are two types of validation rules: absolute rules and expert rules. Absolute rules apply when one value cannot be higher than another. Expert rules are more flexible, and are designed to ensure that the ratios between data elements are not transgressed. The expert rule follows the pattern and will identify outliers.

1.8.1 Define/revise/delete validation rules
This tool gives the user an overview of all the existing validation rules for the source table (either monthly data or TB data): the name of the validation rule, the validation type (expert or absolute) and the order. The user can filter the list by entering a search-string. The user can then choose an existing validation rule to edit or delete, or choose to make a new validation rule.

Information stored about a validation rule is:

  • Validation rule:
    The name of the validation rule.
  • Description:
    A full description of the validation rule. Comment on its use.
  • Type:
    Type of validation rule, either expert or absolute.
  • Valid from and valid to:
    The time period during which the rule will be applied.
  • Description of left side /Description of right side:
    A description of the left/right side in the validation rule.
  • Left side of expression / Right side of expression:
    Information about the data element chosen; name, category and the factor.
  • Left side: Exclude if sum < /Right side: Exclude if sum < :
    Apply a filter to the data if only a few records are selected.
  • Operator:
    The operator between the left and the right side (is correlated with, must be different from, must be equal to, must be greater than, must be greater than or equal to, must be less than, must be less than or equal to).

The left side of the validation rule describes the "lesser" figure, and the right side describes the larger figure.

If the validation rule that is described is an expert type of validation rule, the user has the ability to see the left/right totals and the ratio between the two as a percentage, for the active data file.
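
The comparison at the heart of a validation rule can be sketched in Java as follows. All names are hypothetical. The sketch covers only the six comparison operators; "is correlated with" is left out because this document does not define how it is evaluated. Reading "Exclude if sum <" as "skip the rule when a side's total is below the given threshold" is an assumption.

```java
/** Hypothetical validation rule comparing a left-side total with a right-side total. */
final class ValidationRule {
    enum Operator {
        EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL, LESS, LESS_OR_EQUAL;

        /** True if "left <operator> right" holds. */
        boolean holds(double left, double right) {
            switch (this) {
                case EQUAL:            return left == right;
                case NOT_EQUAL:        return left != right;
                case GREATER:          return left > right;
                case GREATER_OR_EQUAL: return left >= right;
                case LESS:             return left < right;
                default:               return left <= right; // LESS_OR_EQUAL
            }
        }
    }

    final String name;
    final Operator operator;
    final double excludeLeftBelow;   // "Left side: Exclude if sum <"
    final double excludeRightBelow;  // "Right side: Exclude if sum <"

    ValidationRule(String name, Operator operator,
                   double excludeLeftBelow, double excludeRightBelow) {
        this.name = name;
        this.operator = operator;
        this.excludeLeftBelow = excludeLeftBelow;
        this.excludeRightBelow = excludeRightBelow;
    }

    /** True if the two totals violate the rule; totals below the thresholds are skipped. */
    boolean isViolatedBy(double left, double right) {
        if (left < excludeLeftBelow || right < excludeRightBelow) {
            return false; // too little data to judge (assumed meaning of the filter)
        }
        return !operator.holds(left, right);
    }

    /** Ratio of the left total to the right total as a percentage, as shown for expert rules. */
    static double ratioPercent(double leftTotal, double rightTotal) {
        return leftTotal / rightTotal * 100.0;
    }
}
```

A rule with the operator "must be less than or equal to" is violated by a left total of 120 against a right total of 100, and satisfied by 80 against 100.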

1.8.2 Import validation rules
This functions in the same way as "Import data elements" (see 1.2.2).

1.8.3 Apply validation rules to any data set
This functionality allows the user to apply all the validation rules to any district, sub-district or facility dataset, for any time period.

1.8.3.1 Apply absolute validation rules to any data set
Absolute rule violations are a sign of poor data quality and careless data entry and management. They should not exist in the dataset and need to be corrected before data is submitted to the next level.

The user selects the appropriate organizational unit levels (or selects by organizational unit type) and the wanted time period, and is presented with a list of validation rule violations which the user can print out. The list displays the following:

  • The organizational unit (facility name) and the time period
  • The left side description
  • The data value for the left side
  • The logical operator
  • The right side description
  • The data value for the right side

The user has the ability to see more information about the violation, and can add comments or a check to the element that violates the rule. The user can also choose to do a regression analysis to assess the data value and determine what steps need to be taken to correct the rule violation.

The user may also be shown a list of missing compulsory data elements, which indicates that a value for a data element has been given but a corresponding compulsory value has not been entered. The user then has the ability to add a comment to the data element.

1.8.3.2 Apply expert validation rules to any data set
Expert rule violations draw attention to data that may be correct, but needs to be explained.

The user selects the district and/or sub-district and/or facility, a time period and a specific validation rule if desired. The user can filter out violations depending on the degree of variability that they reflect (very high or very low). A list of the elements that violate the rules is displayed to the user. The list displays the following:

  • The organizational unit
  • The validation rule
  • The months with values. Values that are out of range appear in red.

The user can then choose to accept the rule violation.

1.8.4 Define/revise survey/audit validation rules
This section has the same functionality as "Define/revise/delete validation rules" (see 1.8.1).

1.9 Data periods (need more info from 1.4 GUI)
1.9.1 Define/revise data periods (frequencies, e.g. weekly, monthly, etc.)
1.9.1.1 Activate standard periods from a list
1.9.1.2 Define/edit/delete user-customizable periods

1.10 Targets (need more info from 1.4 GUI)
1.10.1 Define/revise targets
1.10.2 Enter target values

Additional requirement: Link targets to groups of organizational units or vice versa

1.11 User management
The usernames are used to keep track of who is changing data in the database, in case the changes need to be verified. The management of users and groups is restricted to the administrator.

1.11.1 Add/edit/remove user roles (1.4 functionality)

1.11.2 Add/edit/remove users
This functionality enables the administrator to add a new user by entering their username (the user's initial and surname in lower case), edit the name of an existing user or remove users. The administrator should get an overview of all of the users registered.

1.11.2.1 Assign roles to users (1.4 functionality)

1.11.3 Add/edit/remove user groups
The administrator can create user groups to group different types of users.

1.11.3.1 Assign roles to user groups (1.4 functionality)

1.12 Housekeeping
This section contains routines for removing empty records to keep the database small, and keeping track of data elements marked for checking in the "Check-it!" box.

1.12.1 Remove NULL records
This functionality will automatically remove any null or blank values in the database.

1.12.2 Remove range records with no data in tables
This functionality removes range records with no data entries.

1.12.3 Missing records and outlier analysis
This functionality helps the user to assess the extent to which the dataset is complete, either by performing an evaluation of missing data or an outlier analysis. Statistical rules can be determined by the user and applied to the dataset.

The user chooses to evaluate missing records or outliers, and can modify threshold levels if necessary to set the statistical limits. Then the user chooses the organizational unit details, the organizational unit and type, and a time period on which the analysis will be done. The user can then identify problem data and view them to determine any actions to be performed on this data.

1.12.4 Display all data records with "Check-it!" ticked
This functionality allows the user to view all records for an organizational unit at level 3, 4 or 5 that have a "Check!". After a certain time period the "Check!" ticks can be removed if no longer required, and the records are then removed from the list of data waiting to be assessed.

When the user chooses the desired organizational unit, a list of all the records for that unit that have been ticked is shown (only those elements for the last month that must appear are shown). The user then has the ability to remove desired ticks, and they will be removed for that month and all previous months.

1.12.5 Display all Organizational Unit data records with "Check-it!" ticked
This function allows the user to view all organizational unit data that has a "Check!", and it has the same functionality as "Display all data records with 'Check-it!' ticked" (see 1.12.4).

1.12.6 Display all Survey/Audit data records with "Check-it!" ticked
This function allows the user to view all survey/audit data that has a "Check!", and it has the same functionality as "Display all data records with 'Check-it!' ticked" (see 1.12.4).

2 Routine health data entry/edit
(The 1.4 GUI will give us a lot of new requirements here, e.g. how to integrate customized data entry forms. In addition we must look into how to improve the survey data entry (do this in a separate form?))

The entry/edit functionality for routine health data is the main part of the database. This is where the user adds new routine data or modifies existing data in the system.

2.1 Mandatory user selection:
Before the user can enter or edit data he needs to specify which organization unit, data set and data period the routine health data is related to.

2.1.1 Select organizational unit
A view of all available organization units is required.

2.1.2 Select data set
A view of all available data sets related to the specified organization unit is required.

2.1.3 Select data period

2.2 Optional user selection:
2.2.1 Use long/short names for data elements or not
The user specifies whether to use long data element names or their corresponding short versions. The element names are shown in the input form as labels for the corresponding input fields.

2.2.2 Show/hide inactive organizational units in list
The list of organization units available contains either only active facilities or it also includes inactive facilities. This option is specified by the user.

2.3 Show data set entry form with the following fields:
2.3.1 Name
The name of the data element.

2.3.2 Min
The minimum value for data entered in the Entry field.

2.3.3 Max
The maximum value for data entered in the Entry field.

2.3.4 Entry
The value of the current data element (the actual data).

2.3.5 Check-it
An option whether to mark the record for further/later checking. The user also specifies why it has been marked.

2.3.6 Comment
A text field for entering comments to the record (e.g. why it has been marked for check). A list of standard comments is presented to the user. The user selects a standard comment or types in a comment.

2.3.7 + use some groups info here (category etc.)

2.4 Validation using expert and absolute validation rules
This functionality enables the user to run the validation rules defined as described in 1.7 on the current data set.

2.5 Regression analysis
Explanation from the manual:
The regression analysis tool enables the user to manipulate data of poor quality. Poor quality data results in inability to provide correct figures. The tool calculates an estimated figure by averaging out the available data using a regression technique.

It lists the data for a specific data element over any number of months that data is available, e.g. Total Headcount for a specific facility for 10 months. Any wide fluctuations in the data can easily be seen. Using the regression analysis function, a value that seems at odds, with no explanation, can be replaced. Data that appears out of the ordinary without an explanation may be "smoothed out" in order to fit within the normal range.
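
The manual does not say which regression technique is used. As a rough sketch, assuming an ordinary least-squares straight line fitted to the other months, an estimate for a suspect month could be computed like this (class and method names are hypothetical):

```java
final class RegressionSketch {
    // Fits y = intercept + slope * x by ordinary least squares over every month
    // except `suspect`, then returns the fitted value at month index `suspect`.
    static double estimate(double[] monthly, int suspect) {
        int n = 0;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int x = 0; x < monthly.length; x++) {
            if (x == suspect) {
                continue; // leave the suspect value out of the fit
            }
            double y = monthly[x];
            n++;
            sx += x;
            sy += y;
            sxx += (double) x * x;
            sxy += x * y;
        }
        if (n < 2) {
            throw new IllegalArgumentException("need at least two other months");
        }
        double denominator = n * sxx - sx * sx; // non-zero: month indexes are distinct
        double slope = (n * sxy - sx * sy) / denominator;
        double intercept = (sy - slope * sx) / n;
        return intercept + slope * suspect;
    }
}
```

For the series 100, 110, 120, 500, 140, 150 with the fourth month marked as suspect, the remaining months lie on a straight line and the estimate for the suspect month is 130.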

2.6 Ad-hoc report based on data from the last year
The report contains raw data from the last 13 months related to the current facility.

2.7 Automatic validation when closing the form

2.8 Automatic validation using the min max ranges while typing
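A minimal sketch of such a check, assuming the Min and Max fields from 2.3.2 and 2.3.3 are numeric and that the typed text is validated as a whole. The class, enum and method names are hypothetical.

```java
final class RangeCheckSketch {
    enum Result { OK, BELOW_MIN, ABOVE_MAX, NOT_A_NUMBER }

    // Checks the text typed into the Entry field against the Min and Max values.
    static Result check(String typed, double min, double max) {
        double value;
        try {
            value = Double.parseDouble(typed.trim());
        } catch (NumberFormatException e) {
            return Result.NOT_A_NUMBER;
        }
        if (Double.isNaN(value)) {
            return Result.NOT_A_NUMBER; // parseDouble accepts the literal "NaN"
        }
        if (value < min) {
            return Result.BELOW_MIN;
        }
        if (value > max) {
            return Result.ABOVE_MAX;
        }
        return Result.OK;
    }
}
```

A data entry form could call this on every change of the Entry field and highlight the field whenever the result is not OK.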

2.9 Historic raw data report when clicking inside the data entry field for a data element (to compare current value with previous entries)

3 Switch data file
The Switch data file function enables users to change the database file in use. A view of all the available files is required along with the ability to browse for other database files. This functionality should be available from the main (start) form (known as the Control Center in DHIS 1.3).

4 Backup/Restore
The backup or restore functionality enables the user to either backup or restore data files. It should be available from the main form and contain both the backup and restore functionality in one form, giving the user the ability to switch between backup and restore mode. Database tables to be backed up or restored are: data tables, data element tables, org unit tables and lookup tables. The user should be able to choose which tables to include in the backup/restore process.

4.1 Backup
The files are stored in a separate backup folder. Backup should be done by the user each month.

4.2 Restore
This functionality enables the user to restore the data from the backup files into a database file.

5 Data Import
5.1 Import Routine data
5.2 Import Semi-permanent data
5.3 Import Survey/Audit data

6 Export to data mart
Functionality for exporting data to a Data Mart file. The Data Mart is an Access storage file for semi-processed raw and indicator data. The Data Mart file is used as data source by the Report and analysis modules. There are three options for loading a Data Mart file:
1. Full reload
This function is for loading of both data elements and indicators.
2. Partial reload
This function enables the user to specify which data elements and/or indicators to load into the Data Mart file.
3. Load only resource tables
This functionality enables the user to load new data elements, indicators and organizational units into the Data Mart file.

More 1.4 specific functionality will come here.

6.1 The indicator engine
The indicator engine takes care of the indicator calculations that are needed when exporting to data mart.

7 Export data to xml (1.4 functionality)

8 Export data to text
Functionality for exporting data to text files. These files are to be used when transporting data to other users, either by e-mail attachment or by storing them on a physical medium.

9 Standard and User-defined reports

9.1 Routine Raw Data Report
This report is for providing a cross tabulated (pivoted) overview, either for a single facility/sub-district or district for a selected time period. If a single facility is selected, options for including max/min ranges and inactive data elements should be available. The report contains: Facility/sub-district or district name, a table with data elements and their corresponding values summarized within each data period.

9.2 Ad-hoc Raw Data Reports
This report tool enables the user to specify which raw data to extract and display in a pivot table. The user selects district, sub-district, facility, data set and time period and which corresponding data elements to include in the report.

9.3 OrgUnit NULL Reporting
This report provides an overview of those facilities that have never reported on a specified data element in a specified time period.

9.4 Outstanding Input Forms
A report of those input forms not yet entered into the system.

9.5 Data Collection Tools
This report tool is for generating forms for manual input.

9.5.1 Routine Input Form - PHC
This enables the user to make a form for manual routine reporting of aggregated PHC data, with data elements from the current database file.

9.5.2 Tick Register Design
A form for general patient details and services rendered.

9.5.3 Tally Sheet Design
A form to be used for counting clients as they use facility services.

9.6 Data File Setup Reports
Reports for routine data elements, indicators and validation rules for the current database.

  • Data Elements Report
  • Validation Rules Report
  • Indicators Report

9.7 User-defined Queries and Reports
This report tool is for user-defined queries and reports. It enables the user to define, store, modify and use individual queries and reports.

9.8 Survey/Audit Bulk Comparison
Survey data is often captured twice by two different data capturers, and this tool can be used to identify mistakes or differences in the interpretation of the answers and/or in the capturing itself. If the user has done a facility survey, i.e. the user wants to add the survey data elements to two different data sets (in 1.4), then the user can do a bulk comparison, resolve the discrepancies, and finally delete the second "copy" data set again.

10 Check/Archive/Merge/Interpolate data
Tools for manipulation of data sets.

10.1 Data Integrity Checker
This functionality enables the user to run a series of queries verifying the integrity and correctness of data sets. This includes correction of the errors found and manual exclusion of violations found that are not errors. The queries are divided into categories of commonly occurring errors:

  • Data element errors
  • Data value errors
  • Organizational structure

10.2 Identify Duplicate Data Records
This functionality enables the user to compare the current data set with another data set to identify duplicate data elements from the same facility for the same time period.
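
As an illustration, duplicates can be found by comparing records on the combination of data element, facility and time period. The record type below is hypothetical and far simpler than a real DHIS data record; the key format is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DuplicateSketch {
    // Hypothetical minimal data record used only for this illustration.
    static final class DataRecord {
        final String dataElement;
        final String facility;
        final String period;
        final double value;

        DataRecord(String dataElement, String facility, String period, double value) {
            this.dataElement = dataElement;
            this.facility = facility;
            this.period = period;
            this.value = value;
        }

        // Two records are duplicates when element, facility and period all match.
        String key() {
            return dataElement + "|" + facility + "|" + period;
        }
    }

    // Returns the records in `other` that also exist in `current` under the same key.
    static List<DataRecord> duplicates(List<DataRecord> current, List<DataRecord> other) {
        Set<String> keys = new HashSet<>();
        for (DataRecord r : current) {
            keys.add(r.key());
        }
        List<DataRecord> found = new ArrayList<>();
        for (DataRecord r : other) {
            if (keys.contains(r.key())) {
                found.add(r);
            }
        }
        return found;
    }
}
```

The values of the two copies may differ, so the result is best presented to the user for a decision rather than deleted automatically.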

10.3 Archive Utility
Older routine data no longer actively modified in any way should be archived, but it is still available for analysis (i.e. export to data mart).

10.4 Merging Organisational Units Data
This functionality enables the user to merge two organizational units into a new facility or to merge two facilities into an existing facility. The function is to be used when clinics amalgamate, services are discontinued or facilities are closed down.

10.5 Interpolating Missing Data Sets
This functionality enables the user to reconstruct a data set for a specific facility for a specific time period when the original data is lost and cannot be provided by the facility in question. It requires that four data sets exist before and after the missing data set, or at least six data sets after it.
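
The precondition can be sketched as a simple check. The reading used here is an assumption: "four before and after" is taken to mean at least four available data sets before and at least four after the missing period, counted anywhere in the series rather than only in adjacent periods. Names are hypothetical.

```java
final class InterpolationSketch {
    // available[i] is true when a data set exists for period i;
    // `missing` is the index of the period to be reconstructed.
    static boolean canInterpolate(boolean[] available, int missing) {
        int before = 0;
        int after = 0;
        for (int i = 0; i < available.length; i++) {
            if (i == missing || !available[i]) {
                continue;
            }
            if (i < missing) {
                before++;
            } else {
                after++;
            }
        }
        // Assumed reading: four before and four after, or at least six after.
        return (before >= 4 && after >= 4) || after >= 6;
    }
}
```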

11 Global Options
This functionality enables the user to edit the default settings in the system.
More info here when 1.4 is released.

5.2. Report modules

Background
Developing report modules is an important part of the DHIS 2.0 development project. Report modules do not interfere with the core module and can therefore be developed independently of the work on the core module. Furthermore, the report modules use a data mart file as data source and not the active data file used in the core module. This data mart file can easily have the same structure in 1.4 and 2.0, and hence the report modules can be used for both versions. This makes it very attractive to develop report modules as they can be put in real use right away.

It is possible that the data mart files in version 1.4 and version 2.0 will be nearly identical - that depends on how far our thinking has advanced with version 1.4 (which should reach alpha stage soon).

Backwards compatibility with the data mart files in version 1.3 is more unlikely, though. Data Marts in 1.4 will differ from 1.3 in several ways.

So my expectation is that the data mart files in version 1.4.n (let us say middle of 2005) will be very similar to the data mart file structure in version 2.0.

Requirements

Reports
(Still to come:
What kind of reports? Description of use etc.)

We need both stand-alone and web-based modules. Furthermore, an important requirement is to develop modules that cover the whole range of users from the non-skilled to the experts. In terms of functionality this means covering "everything" from one-click print of pre-defined reports to complex design of generic reports and business analysis. This requirement comes directly from requests by the DHIS 1.3 users. The DHIS development team is constantly bombarded from two sides - some people complain about the complexity of the Report Generator or the pivot tables, while others request broader and more advanced reporting tools (and GIS and automatic web-publishing and data flow tracking and.. and ...)

  • Several open source Java report tools are available that will facilitate the development:

Jasper, for which there are Eclipse plug-ins and other GUI tools:
http://jasperreports.sourceforge.net/gui.tools.html
JFreeReport
http://datavision.sourceforge.net/
http://www.lowagie.com/iText/
http://xml.apache.org/fop/

Pivot table analysis (see section 5.3 below)

More advanced analysis of data
We need efficient tools for analysing/retrieving data.

(Still to come:
Description of this functionality, the use of these reports, analysis etc. )

For instance Mondrian:
Mondrian is an OLAP (online analytical processing) database written in Java. It implements the MDX language, and the XML for Analysis and JOLAP specifications. It reads from SQL and other data sources, and aggregates data in a memory cache.
See http://sourceforge.net/projects/mondrian/ for more info.

Another tool that might be relevant is MonetDB:
MonetDB is a database management system developed from a main-memory perspective using a fully decomposed storage model, automatic index management, extensibility of data types and search accelerators, SQL- and XML- front-ends.
See http://sourceforge.net/projects/monetdb/ for more info.

There are other tools as well. The point is that whereas large centralized organizations often would separate the three aspects above (separate transaction databases, one large data warehouse, Lotus Notes or Exchange for communication), most DHIS users (at least at sub-national levels) do not have the infrastructure, personnel and support to run multiple sophisticated systems like that.

Technology

  • Open Source Software only
  • Java

Architecture:
These modules must be database (DBMS) independent, at least support the most used OSS and commercial DBMSs (MySQL, PostgreSQL, Oracle, Access, SQL Server).

If we are a bit smart, we can develop DHIS 2.0 components that with very few modifications are usable also with 1.4 data mart sources, or at least which can feed into the future development of the same 1.4 data mart file structure.

Another related issue: While we have decided to base 2.0 on a Java framework, there might be (already available?) modules and applications based on PHP, Python, Perl or similar that possibly could be "plugged" into that Java framework. (If Mono develops as one hopes, there might be .NET apps available too). When investigating available technologies and applications, we should not rule out such technologies.

5.3. Pivot table analysis module

The Excel pivot tables are used extensively with DHIS 1.3 for analysis of data. Dynamic cross-tabulating and drill-downs are popular features that are useful to the health information officers. Moving away from the Microsoft platform we need to find Open Source alternatives to the functionality of Excel.

Preferably DHIS 2.0 will support both a web-based pivot table tool, and a plug-in application for stand-alone users.

The functional requirements for the pivot table module shall be the same as the ones present in Microsoft Excel's pivot tables. It is unknown whether the Open Office version of pivot tables, named Data Pilot, supports all the features of Excel's pivot tables, but if that is the case then Open Office (http://www.openoffice.org/) might be a good solution.
Here are some comments by Calle Hedberg after a brief investigation of the OpenOffice Calc application:

"I have had a first look at the "Data Pilot" tools in OpenOffice.org - it's a hassle to set them up and documentation is so far very limited, but it is clear that the aim is to provide similar functionality as Excel pivot tables.

Note, though, that I have so far not been successful in importing pivot tables - they import but are turned into "flat" spreadsheets. Menu choices supposedly to be used for "re-activating" the dynamic aspects don't work as expected - but I will continue trying to figure out if I've missed something (presumably yes).

I've not found any documentation indicating what the limitations in data set size are (OO's Calc supports only 32,000 rows, but whether there are limitations in the number of rows that are supported as background data to dynamic Data Pilot worksheets, I don't know)."

There is already a web-based pivot tool developed by HISP, an .asp web application supporting Access and SQL Server. This again requires the Microsoft platform, although only on the server side.

Porting the Pivot Web reporter (and the linked web portal) to a Java environment would be a great challenge for a GROUP of students. On the one hand they would be working on a relatively advanced application with major opportunities within performance optimization, web lay-out and printing, etc - on the other hand they would have the existing .ASP application as a broad road-map and thus be less dependent on advanced users.

It should be NOTED here that the Pivot Web Reporter initially was developed for dynamic corporate reporting - it's used by one of the largest companies in South Africa and it can easily be adapted to other environments and DBMS platforms since any database can be configured through a relatively simple XML-file to function as a data source.

Furthermore, there is a Java tool called jPivot that can be helpful:

http://jpivot.sourceforge.net/
(uses Mondrian as its OLAP server, see below)

Some form of compatibility between the web tool and the non-web tool could be helpful. Many pivot tables will be the same for both the web module and the non-web module, and it would make it much simpler if such tables could be generated in only one module and be compatible with both. The most needed transformation might be from the non-web module to the web module.

1 Pivot table module requirements

1.1 General pivot table requirements
1 Support the pivot table functions known as "pages", "columns", "rows" and "data".
2 Easy to move elements to and from the different pivot-table areas ("page", "column", "row" and "data").
3 Create graphical charts from the pivot-tables.
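
To illustrate the "rows", "columns" and "data" areas named in the list above, here is a small cross-tabulation sketch. It is not tied to any of the tools mentioned in this section; the Fact type and the choice of summing as the aggregation are assumptions for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PivotSketch {
    // Hypothetical flat input record: one value tagged with a row and a column label.
    static final class Fact {
        final String row;
        final String column;
        final double value;

        Fact(String row, String column, double value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    // Builds row -> (column -> summed value); the sums form the "data" area.
    static Map<String, Map<String, Double>> pivot(List<Fact> facts) {
        Map<String, Map<String, Double>> table = new TreeMap<>();
        for (Fact f : facts) {
            table.computeIfAbsent(f.row, k -> new TreeMap<>())
                 .merge(f.column, f.value, Double::sum);
        }
        return table;
    }
}
```

Moving an element from the "row" area to the "column" area then amounts to swapping which attribute is passed as the row label and which as the column label; a "page" would correspond to filtering the facts before pivoting.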

1.2 Non-web pivot module
There are two solutions for the non-web module. It could either generate Open Office spreadsheet files or fully support its own pivoting tool.

1.2.1 Options
1. Open Office Calc spreadsheet generator.
Generate Open Office spreadsheet files with pivot tables. Which data to include is selected in the generator; the user can then customize the tables in Open Office.
2. Fully functional pivot table module.
In the module the user can generate pivot tables based on data from the database. The pivot tables can be customized in the same way as they can in Microsoft Excel. This means that the module supports both generating and viewing of pivot tables.

1.3 The web-module
The web module shall enable clients to access data when direct access to the DHIS software is not possible - or for example if a user wants to look at data from a different district. The site can be used to publish pivot tables created by the administrator, or users can design their own reports and access them through the site.

1.3.1 Functional requirements for the web module
2. Pivot table generator for administrator and users.
3. Pivot table viewing, with functionality as listed in the "general pivot table requirements"

5.4. Data dictionary
A web-based data dictionary supports standardization of data elements and shall be compatible with the routine data module, supporting synchronization of data element definitions. National teams, and provincial teams if needed, should be responsible for updating the dictionary and notifying end users when changes are made. For non-web users there shall be functionality to distribute the whole dictionary or subsets of the dictionary and synchronize these files with the routine data module.

A web-based open source Java application has already been developed for this purpose. We will look into the needs and possibilities to move this application to the Java frameworks we use for the main 2.0 modules.

6. SMS/Wireless modules

Extending the scope of the DHIS 2.0 we would like to make use of new wireless (cellphone-based) methods for communicating with a central (DHIS) database.

Here are some areas where these technologies can be applied in the field of health information:

1.
To automatically generate voicemail or SMS (or "expanded" SMS, cannot recall what it's called) messages that are used as reminders to facilities/districts that they have not submitted their routine data yet.

2.
To generate SMS messages used to verify certain data record values. This would be typically relevant for semi-permanent data. As an example: During the cholera epidemic here in SA two years ago, the national Dept of Health suddenly received an additional grant of R 90 mill (about USD 15 mill) for improving sanitation at health facilities. They were initially really stuck because that information either did not exist, or it was very fragmented.

The ability to auto-send an SMS question like

"Facility X: Pls confirm sanitation type in your facility - SMS one or more of the following back. 1: Pit latrine. 2: WC 3: etc etc"

to all relevant facilities would have been great.

3.
Generate SMS messages that are used as instant "polls".

4.
Enable managers to use SMS (tricky, maybe, but possible) to extract compact indicator information from the DHIS. For example, a manager sends the SMS:
"Area: Chittoor. Period: June-04. Indicator: Imm Cov <1"
and in return gets:
"Area: Chittoor. Period: June-04. Indicator: Imm Cov <1. Value: 75.9%"

5.
Enable managers with browsing-enabled cellphones to access core DHIS data in a cell-friendly display format (simple selection screens etc).

6.
Generate SMS messages that are used as confirmation for various processes. A typical example is the Human Resource Development module we're working on, where staff members can apply for courses. After the application has been processed, it would make sense to fire off both an SMS and an email message to inform the applicant of the outcome (and if the outcome is positive, another message to the course provider and the applicant's line manager and whatever department is organising transport etc).

7.
The last point also brings up the issue of auto-generating email messages - it would be beneficial to have a standardised interface for generating and sending email messages (standardised in the meaning that it's independent of the email client used).
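
As an illustration of the SMS query in point 4, the message could be treated as a list of "Key: value" pairs separated by full stops. The format below is inferred from the single example in that point and is not a defined protocol; the class and method names are hypothetical, and looking up the indicator value in the DHIS is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SmsQuerySketch {
    // Parses "Area: X. Period: Y. Indicator: Z" into an ordered key/value map.
    static Map<String, String> parse(String sms) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Split at ". " only where a new "Key:" follows, so values may contain other dots.
        for (String part : sms.trim().split("\\.\\s+(?=[A-Za-z]+:)")) {
            int colon = part.indexOf(':');
            if (colon < 0) {
                continue; // ignore fragments that are not "Key: value"
            }
            String key = part.substring(0, colon).trim();
            String value = part.substring(colon + 1).trim();
            if (value.endsWith(".")) {
                value = value.substring(0, value.length() - 1);
            }
            fields.put(key, value);
        }
        return fields;
    }

    // Echoes the query fields and appends the value, as in the example reply.
    static String reply(Map<String, String> query, String value) {
        return "Area: " + query.get("Area")
                + ". Period: " + query.get("Period")
                + ". Indicator: " + query.get("Indicator")
                + ". Value: " + value;
    }
}
```

Echoing the query back in the reply lets the manager see which question the value answers, which matters when several queries are sent in a row.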

7. System models
Still to come...

(A model showing the relationship between the system components and the system and its environment.)

8. System evolution

  • describe the fundamental assumptions on which the system is built
  • changing hardware, user req., changing infrastructures etc.

9. Appendices

  • detailed and specific descriptions, e.g.
  • hardware and database descriptions
  • minimal and optimal hardware configurations for the system
  • logic organization of data used by the system and the relationships between the data, a database model

10. Index

  • Various indexes to the document
