IEEE P1752 Working Group

Elements of schema writing

The 1752.1 schema library lists all the schemas defined so far, and the sample data section provides sample instances that comply with each of them. All schemas and sample data referenced below are available on the 1752 OpenSource site.

1752.1 schemas are written in JSON Schema.


Writing a quantitative measure schema

Quantitative measure schemas are for numerical data associated with units of measurement.

The basic steps for writing a schema are listed here and illustrated in two examples below.

Before starting, review the relevant literature and/or consult with subject matter experts to make sure that you’re modeling the quantitative measure the way data consumers will want to use it.

  1. The core elements of the quantitative schema will be the numeric value of the measure and its unit(s) of measure, defined using the unit-value schema. In some cases, there will be more than one measure (see sleep episode schema below).
    • 1752.1 has defined schemas for some widely used value and unit pairs; see the Utility folder in the schema section.
  2. If the pre-defined value and unit pairs do not cover what you need, reference the unit-value schema and constrain its unit property to the permissible value set for your measure (see the apnea-hypopnea-index schema for an example).
  3. The schema should include all the data elements (properties) needed to fully represent the measure. For example, does the position of the body play a role in the measure? If so, include a reference to the body-posture schema.
  4. All measure schemas should include a property for describing the time for which the measure was effective, defined using the time-frame schema.
  5. You can define your schema to model not only individual measurements but also the result of summarizing a set of measurements with a descriptive statistic such as average or maximum. To do this, include a reference to the descriptive-statistic schema. If the measure you are modeling is a duration (i.e., a length of time), the summary measure needs a time denominator, so also include a reference to the descriptive-statistic-denominator schema (see the sleep-onset-latency schema for an example).
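
Step 2 can be sketched as follows. This is an illustrative fragment only, not a published 1752.1 schema: the measure name example_event_rate, the unit "events/h", and the file name unit-value-1.0.json (assumed by analogy with the file names used in the examples below) are all assumptions. The allOf construct combines the generic unit-value schema with an enum that narrows the unit property.

```json
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$comment": "Illustrative sketch only. The measure name, unit, and referenced file names are assumptions, not a published 1752.1 schema.",
    "title": "Example Event Rate (hypothetical)",
    "type": "object",
    "definitions": {
        "unit_value": {
            "$ref": "unit-value-1.0.json"
        },
        "time_frame": {
            "$ref": "time-frame-1.0.json"
        }
    },
    "properties": {
        "example_event_rate": {
            "description": "Hypothetical measure whose unit is restricted to a single permissible value.",
            "allOf": [
                {
                    "$ref": "#/definitions/unit_value"
                },
                {
                    "properties": {
                        "unit": {
                            "enum": [
                                "events/h"
                            ]
                        }
                    }
                }
            ]
        },
        "effective_time_frame": {
            "$ref": "#/definitions/time_frame"
        }
    },
    "required": [
        "example_event_rate",
        "effective_time_frame"
    ]
}
```

Placing the enum in a second allOf branch, rather than next to the $ref, matters in draft-07 because keywords that sit alongside $ref in the same object are ignored.
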
The total-sleep-time schema used as an example below is interspersed with notes explaining the various steps. (Note that only some parts of the schema are shown here.)
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://w3id.org/ieee/ieee-1752-schema/total-sleep-time.json",
    "title": "Total Sleep Time (TST)",
    "type": "object",
    "description": "This schema represents total sleep time, i.e. The total sleep time is the duration after sleep onset in an entire sleep episode minus the duration of all awakenings.",

The $id provides the permalink to the schema via W3C; it is created by the OS Manager before the schema is published. The description briefly describes the measure the schema represents.

The definitions section should reference the existing schemas that are used to define properties in the properties section. Each definition name should correspond to the name of the schema it references.

    "definitions": {
        "duration_unit_value": {
            "$ref": "duration-unit-value-1.0.json"
        },
        "time_frame": {
            "$ref": "time-frame-1.0.json"
        },
        "time_interval": {
            "$ref": "time-interval-1.0.json"
        },

duration_unit_value: total sleep time (TST) is a measure of duration, so the relevant unit value schema is referenced.

descriptive_statistic is used to describe the type of aggregate measure, for example maximum or average.
The schema models not only individual measurements but also aggregate ones, and an aggregate measure of a duration needs a time denominator, so both elements are defined (see step 5 above for details).
        "descriptive_statistic": {
            "$ref": "descriptive-statistic-1.0.json"
        },
        "descriptive_statistic_denominator": {
            "$ref": "descriptive-statistic-denominator-1.0.json"
        }
    },

total_sleep_time (TST) is a measure of duration, so the property is associated with the relevant schema.

sleep_events is an array of time intervals.

effective_time_frame: because the measure is a duration, it cannot validly be associated with a single date-time, so the allOf construct is used to restrict the time frame to a time interval.

    "properties": {
        "total_sleep_time": {
            "description": "The total amount of time spent asleep within the effective time frame.",
            "$ref": "#/definitions/duration_unit_value"
        },
        "sleep_events": {
            "description": "Individual sleep events and their durations, describing at what points throughout the night the individual is asleep; when summarized, they equal the total_sleep_time.",
            "type": "array",
            "items": {
                "$ref": "#/definitions/time_interval"
            }
        },
        "effective_time_frame": {
            "description": "The date-time at which, or time interval during which the measurement is asserted as being valid.",
            "allOf": [
                {
                    "$ref": "#/definitions/time_frame"
                },
                {
                    "required": [
                        "time_interval"
                    ]
                }
            ]
        },

required: the value of the measure and the time for which it was effective are required.

    },
    "required": [
        "total_sleep_time",
        "effective_time_frame"
    ]
}
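
For orientation, a data instance consistent with the schema excerpt above might look like the following sketch. The values are invented for illustration. The property names value and unit, and the unit "min", are assumptions about the referenced duration-unit-value schema; start_date_time and end_date_time are assumed to be the properties of a time interval. Consult the sample data section for authoritative examples.

```json
{
    "total_sleep_time": {
        "value": 465,
        "unit": "min"
    },
    "sleep_events": [
        {
            "start_date_time": "2021-05-10T23:10:00-07:00",
            "end_date_time": "2021-05-11T03:00:00-07:00"
        },
        {
            "start_date_time": "2021-05-11T03:15:00-07:00",
            "end_date_time": "2021-05-11T07:10:00-07:00"
        }
    ],
    "effective_time_frame": {
        "time_interval": {
            "start_date_time": "2021-05-10T23:10:00-07:00",
            "end_date_time": "2021-05-11T07:10:00-07:00"
        }
    }
}
```

The two sleep events last 230 and 235 minutes, which add up to the 465 minutes reported in total_sleep_time. Under the allOf restriction shown above, an instance whose effective_time_frame contained only a date-time rather than a time_interval would not validate.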

The sleep-episode schema used as an example below is interspersed with notes explaining the various steps. (Note that only some parts of the schema are shown here.)

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://w3id.org/ieee/ieee-1752-schema/sleep-episode.json",
    "title": "Sleep Episode",
    "type": "object",
    "description": "This schema represents one sleep episode.",

The $id provides the permalink to the schema via W3C; it is created by the OS Manager when the schema is published. The description briefly describes the measure the schema represents.

    "references": [  
        {
            "description": "",
            "url": "<link to source>"
        }

Although the published schema does not include it, you can use the references element to add any relevant reference(s), for example the website or paper used to inform the definition of this measure.

    ],
    "definitions": {

The definitions section should reference the existing schemas that are used to define properties in the properties section. Each definition name should correspond to the name of the schema it references.

        "percent_unit_value": {
            "$ref": "percent-unit-value-1.0.json"
        },
        "duration_unit_value": {
            "$ref": "duration-unit-value-1.0.json"
        },
        "time_frame": {
            "$ref": "time-frame-1.0.json"
        }

The value and unit defining the measure may reference an existing specific unit-value schema, as in the example above, or, if none of the existing schemas applies and the unit is not expected to be used in other schemas, the generic unit-value schema. In the latter case, the unit value set is then constrained to the relevant unit(s), as done in the apnea-hypopnea-index schema.

Numerical data elements are defined in the properties section (see below).

The time_frame schema is used to define the effective_time_frame property (see below).
If the measure can be associated with only a date-time or only a time interval, a restriction can be placed within properties.
In the properties section, list the properties of this schema, that is, the elements defining the measure.
In this case, a sleep episode is defined by a set of measures of time, such as latency_to_sleep_onset, each of which is defined by referencing the duration_unit_value schema.
    "properties": {
        "latency_to_sleep_onset": {
            "description": "Amount of time between when person starts to want to go to sleep and sleep onset.",
            "$ref": "#/definitions/duration_unit_value"
        },
The description describes the individual property, as needed.
Example of numerical property:
        "number_of_awakenings": {
            "type": "integer"
        },
Example of boolean property:
        "is_main_sleep": {
            "description": "Whether the sleep episode is the main sleep event (i.e., a night sleep for most people) or a nap.",
            "type": "boolean"
        },
effective_time_frame: because the measure is a duration, it cannot validly be associated with a single date-time, so the allOf construct is used to restrict the time frame to a time interval.
        "effective_time_frame": {
            "description": "The time interval during which the measurement is asserted as being valid. As a measure of a duration, total sleep time should not be associated to a date time time frame. Hence, effective time frame is restricted to be a time interval. The initial sleep onset time maps to start_date_time, the final awakening time maps to end_date_time and total sleep episode duration maps to duration.",
            "allOf": [
                {
                    "$ref": "#/definitions/time_frame"
                },
                {
                    "required": [
                        "time_interval"
                    ]
                }
            ]
        },
In the required section, list the properties whose values are required for data to validate against this schema.
    "required": [
        "effective_time_frame"
    ]
We place light restrictions here and defer to application-specific data quality checks to ensure data is complete for specific uses. Since additionalProperties is not specified, the default value (true) applies, which means data containing properties not listed in the properties section will still validate against this schema.
See the sleep folder of the sample_data section of the repository for examples of data complying with the schema above.
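
As a rough illustration of the points above, an instance along the following lines would be expected to validate against the schema excerpt. The values are invented, the value and unit property names are assumptions about the referenced duration-unit-value schema, and example_extra_note is a hypothetical property included only to show the effect of the additionalProperties default.

```json
{
    "latency_to_sleep_onset": {
        "value": 15,
        "unit": "min"
    },
    "number_of_awakenings": 2,
    "is_main_sleep": true,
    "effective_time_frame": {
        "time_interval": {
            "start_date_time": "2021-05-10T23:10:00-07:00",
            "end_date_time": "2021-05-11T07:10:00-07:00"
        }
    },
    "example_extra_note": "hypothetical property not listed in the schema"
}
```

Only effective_time_frame is required, so an instance that omitted latency_to_sleep_onset, number_of_awakenings, or is_main_sleep would also validate, whereas one that omitted effective_time_frame would not.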