You can import WebTemplates here and use the import/csv endpoint of the Better Platform to generate openEHR data.
POST rest/v1/import/csv?templateId={templateId}&templateLanguage=en&subjectNamespace=default
Content-Type: application/octet-stream
The advantage of this tool over the GET /rest/v1/template/{templateId}/example endpoint is that you can configure the elements individually.
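The request above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library: the base URL, the template ID and the CSV content are placeholders, authentication (which a real Better Platform instance will likely require) is left out, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_import_request(base_url: str, template_id: str, csv_bytes: bytes,
                         language: str = "en", namespace: str = "default") -> Request:
    """Build (but do not send) the POST request for the import/csv endpoint."""
    query = urlencode({
        "templateId": template_id,
        "templateLanguage": language,
        "subjectNamespace": namespace,
    })
    url = f"{base_url.rstrip('/')}/rest/v1/import/csv?{query}"
    return Request(url, data=csv_bytes, method="POST",
                   headers={"Content-Type": "application/octet-stream"})


# Hypothetical server and template; replace with your own values.
request = build_import_request("https://platform.example.org", "Vital Signs",
                               b"id,pulse\n1,72\n")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is omitted here because it depends on the server and its credentials.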
If you don't know what WebTemplates are, this feature is not for you.
Please note that this operation will overwrite everything you might have previously written in the editor.
This tool imports the item names from the <MetaDataVersion /> part of an ODM file for use with the odm2csv tool.
To generate ODM files directly, use the ODM Generator. It also covers NOTIN RangeChecks and some other specifics of the ODM standard that this generator does not handle.
Please note that this operation will overwrite everything you might have previously written in the editor.
This tool imports the item names from a FHIR R4 Questionnaire resource.
The resulting file can be converted into QuestionnaireResponse resources with our tool csv2fhirQuestionnaire. Uploaded QuestionnaireResponses can be transformed back using the FhirExtinguisher.
Please note that this operation will overwrite everything you might have previously written in the editor.
{ "id": "INTEGER", "description": "STRING" }
will result in:
id | description |
---|---|
444637 | Ku awofel Idolaz |
648937 | il Itoxenag ob Av fi t |
"repeat": 2 or "repeat": {"Min": 1, "Max": 2}
Specifies the number of times this element should be repeated. The values of the other columns are repeated for each repetition.
"Columns": {"ColumnA": "INTEGER", "ColumnB": "STRING"}
Sub-columns (for generating repeating elements in a LONG format).
"type": "INTEGER"
Datatype of the element to generate. It is inferred from the datatype of the distribution if one is present.
"missingProbability": 0.2
Probability that an element in this column will be null.
"UniformDistribution": {"min": "2010-01-01", "max": "2011-12-31"}
Samples the values for this column from a uniform distribution.
"GaussianDistribution": {"min": "2010-01-01", "max": "2011-12-31", "mean": "2011-06-01", "sigma": "P20D"}
Samples the values for this column from a Gaussian distribution.
"ValueDistribution": {"male": 0.52, "female": 0.45, "other": 0.03}
Generates the given values with the given probabilities. Keys must be quoted strings: write {"1": 0.5, "2": 0.25, "3": 0.25} instead of {1: 0.5, 2: 0.25, 3: 0.25}.
"FakerDistribution": "name.fullName" or "FakerDistribution": {"expression": "name.fullName", "locale": "de"}
Uses the Java Faker library to generate a value. Trigger auto-completion with Ctrl+Space to see the available options.
"XegerDistribution": "[A-Za-z0-9]{2,5}"
Uses Xeger to generate strings matching a specific regular expression.
This tool produces artificial study data for a given Operational Data Model (ODM) file. This synthetic study data can be used, e.g., for the evaluation of other ODM-based tools. If you have no ODM file, you can use a prepared test file on the server instead.
Operational Data Model (ODM) is a standard widely used by Electronic Data Capture systems. It defines the metadata of a study as well as the collected study data. The overall structure of an ODM file looks like this:
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" FileType="Snapshot" FileOID="WHO5 ODM File"
     CreationDateTime="2017-03-15T00:00:00" ODMVersion="1.3.2">
  <Study OID="StudyOID">
    <GlobalVariables>
      <StudyName>Sample Study</StudyName>
      <StudyDescription/>
      <ProtocolName>Sample Protocol</ProtocolName>
    </GlobalVariables>
    <MetaDataVersion OID="MetaDataOID" Name="WHO5MetaData">
      <!-- the study events, forms, item groups and items for the study data are defined here -->
    </MetaDataVersion>
  </Study>
  <ClinicalData StudyOID="StudyOID" MetaDataVersionOID="MetaDataOID">
    <SubjectData SubjectKey="subject 1">
      <!-- Subject 1's study data -->
    </SubjectData>
    <SubjectData SubjectKey="subject 2">
      <!-- Subject 2's study data -->
    </SubjectData>
    <!-- more subject data -->
  </ClinicalData>
</ODM>
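To illustrate this structure, the following sketch reads a shortened copy of the file above with Python's standard XML parser and lists the study, the metadata version and the subjects. The function name `summarize` is made up for this example. Note that the ODM namespace has to be passed explicitly when searching for elements.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <ODM> root element of the example above.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" FileType="Snapshot"
     FileOID="WHO5 ODM File" ODMVersion="1.3.2">
  <Study OID="StudyOID">
    <MetaDataVersion OID="MetaDataOID" Name="WHO5MetaData"/>
  </Study>
  <ClinicalData StudyOID="StudyOID" MetaDataVersionOID="MetaDataOID">
    <SubjectData SubjectKey="subject 1"/>
    <SubjectData SubjectKey="subject 2"/>
  </ClinicalData>
</ODM>"""


def summarize(odm_xml: str) -> dict:
    """Collect the OIDs and subject keys found in an ODM document."""
    root = ET.fromstring(odm_xml)
    return {
        "studies": [s.get("OID") for s in root.findall("odm:Study", ODM_NS)],
        "metadata_versions": [
            m.get("OID")
            for m in root.findall("odm:Study/odm:MetaDataVersion", ODM_NS)
        ],
        "subjects": [
            s.get("SubjectKey")
            for s in root.findall("odm:ClinicalData/odm:SubjectData", ODM_NS)
        ],
    }


summary = summarize(SAMPLE)
print(summary)
```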
This tool's purpose is to read the metadata-part of the file and generate values for a given number of subjects. A simple example of ODM metadata is given below:
<Study OID="StudyOID">
  <GlobalVariables><!-- --></GlobalVariables>
  <MetaDataVersion OID="MetaDataVersionOID" Name="MetaData">
    <Protocol>
      <StudyEventRef StudyEventOID="StudyEventOID" Mandatory="Yes"/>
    </Protocol>
    <StudyEventDef OID="StudyEventOID" Name="StudyEvent" Repeating="No" Type="Scheduled">
      <FormRef FormOID="FormOID" Mandatory="Yes"/>
    </StudyEventDef>
    <FormDef OID="FormOID" Name="example questionnaire" Repeating="No">
      <ItemGroupRef ItemGroupOID="ItemGroupOID" Mandatory="Yes"/>
    </FormDef>
    <ItemGroupDef OID="ItemGroupOID" Name="basic demographic information" Repeating="No">
      <ItemRef ItemOID="GenderItemOID" Mandatory="Yes"/>
    </ItemGroupDef>
    <ItemDef OID="GenderItemOID" Name="Item.1" DataType="string">
      <Question>
        <TranslatedText xml:lang="en">patient's gender</TranslatedText>
      </Question>
      <CodeListRef CodeListOID="CodeList.gender"/>
    </ItemDef>
    <CodeList OID="CodeList.gender" Name="Gender-CodeList" DataType="string">
      <CodeListItem CodedValue="male"><Decode><TranslatedText xml:lang="en">male</TranslatedText></Decode></CodeListItem>
      <CodeListItem CodedValue="female"><Decode><TranslatedText xml:lang="en">female</TranslatedText></Decode></CodeListItem>
      <CodeListItem CodedValue="other"><Decode><TranslatedText xml:lang="en">other</TranslatedText></Decode></CodeListItem>
      <CodeListItem CodedValue="unknown"><Decode><TranslatedText xml:lang="en">unknown</TranslatedText></Decode></CodeListItem>
    </CodeList>
  </MetaDataVersion>
</Study>
Please note: An ODM file might define just clinical data, so please make sure your ODM file contains metadata. An ODM file may define multiple studies, but this is not common, so this option is currently not supported by the tool. If you have multiple studies, please copy them into individual ODM files.
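Both requirements can be checked before uploading a file. The sketch below is one possible pre-flight check, not part of the tool; the function names are hypothetical. It compares local element names so that it works with and without the ODM namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a namespace prefix such as '{http://www.cdisc.org/ns/odm/v1.3}'."""
    return tag.rsplit("}", 1)[-1]


def check_odm(odm_xml: str) -> list:
    """Return a list of problems; an empty list means the file looks usable."""
    root = ET.fromstring(odm_xml)
    studies = [child for child in root if local_name(child.tag) == "Study"]
    problems = []
    if len(studies) != 1:
        problems.append(f"expected exactly one Study, found {len(studies)}")
    for study in studies:
        has_metadata = any(local_name(c.tag) == "MetaDataVersion" for c in study)
        if not has_metadata:
            problems.append(f"Study {study.get('OID')!r} has no MetaDataVersion")
    return problems


# A file that only contains clinical data is rejected.
print(check_odm("<ODM><ClinicalData/></ODM>"))
```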
To generate clinical data, simply upload your ODM file and set the number of patients. The tool reads the ODM file and automatically generates values for all items in the form. The result is stored in a .zip file that also contains the original file and the configuration. For the example above, the generated data will look similar to this:
<ClinicalData StudyOID="StudyOID" MetaDataVersionOID="MetaDataVersionOID">
  <SubjectData SubjectKey="patient 0">
    <StudyEventData StudyEventOID="StudyEventOID">
      <FormData FormOID="FormOID">
        <ItemGroupData ItemGroupOID="ItemGroupOID">
          <ItemData Value="other" ItemOID="GenderItemOID"/>
        </ItemGroupData>
      </FormData>
    </StudyEventData>
  </SubjectData>
  <SubjectData SubjectKey="patient 1">
    <StudyEventData StudyEventOID="StudyEventOID">
      <FormData FormOID="FormOID">
        <ItemGroupData ItemGroupOID="ItemGroupOID">
          <ItemData Value="unknown" ItemOID="GenderItemOID"/>
        </ItemGroupData>
      </FormData>
    </StudyEventData>
  </SubjectData>
  <SubjectData SubjectKey="patient 2">
    <StudyEventData StudyEventOID="StudyEventOID">
      <FormData FormOID="FormOID">
        <ItemGroupData ItemGroupOID="ItemGroupOID">
          <ItemData Value="female" ItemOID="GenderItemOID"/>
        </ItemGroupData>
      </FormData>
    </StudyEventData>
  </SubjectData>
  <!-- [...] -->
</ClinicalData>
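Generated data in this shape can be flattened into rows for a quick inspection. The following sketch assumes the un-namespaced fragment shown above; in a complete ODM file the element names would carry the ODM namespace. The function name `flatten` is hypothetical.

```python
import xml.etree.ElementTree as ET

CLINICAL_DATA = """<ClinicalData StudyOID="StudyOID" MetaDataVersionOID="MetaDataVersionOID">
  <SubjectData SubjectKey="patient 0">
    <StudyEventData StudyEventOID="StudyEventOID">
      <FormData FormOID="FormOID">
        <ItemGroupData ItemGroupOID="ItemGroupOID">
          <ItemData Value="other" ItemOID="GenderItemOID"/>
        </ItemGroupData>
      </FormData>
    </StudyEventData>
  </SubjectData>
  <SubjectData SubjectKey="patient 1">
    <StudyEventData StudyEventOID="StudyEventOID">
      <FormData FormOID="FormOID">
        <ItemGroupData ItemGroupOID="ItemGroupOID">
          <ItemData Value="unknown" ItemOID="GenderItemOID"/>
        </ItemGroupData>
      </FormData>
    </StudyEventData>
  </SubjectData>
</ClinicalData>"""


def flatten(clinical_xml: str) -> list:
    """Return one (subject key, item OID, value) tuple per ItemData element."""
    root = ET.fromstring(clinical_xml)
    rows = []
    for subject in root.iter("SubjectData"):
        for item in subject.iter("ItemData"):
            rows.append((subject.get("SubjectKey"), item.get("ItemOID"), item.get("Value")))
    return rows


rows = flatten(CLINICAL_DATA)
for row in rows:
    print(row)
```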
As you can see, the metadata in ODM is organized in multiple levels (StudyEvents, Forms, ItemGroups). Each element can be allowed to appear more than once per patient. A range for the number of allowed repetitions can be configured for each level in the GUI ("Repeat key settings") or in the JSON configuration file.
Study events, forms, item groups and items can be declared as non-mandatory. In this case, you can configure the probability with which the elements are omitted during generation, either per level using "Missing values settings" in the GUI or for individual elements by their OID in the JSON configuration file (see below).
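How these two settings could interact for a single element is sketched below. This is an illustration of the idea, not the tool's actual implementation: the function name, its parameters and the order of the two decisions are assumptions.

```python
import random


def occurrences(mandatory: bool, repeating: bool, missing_probability: float,
                repeat_range: tuple, rng: random.Random) -> int:
    """Decide how often an element is generated for one parent element."""
    # Only non-mandatory elements may be omitted.
    if not mandatory and rng.random() < missing_probability:
        return 0
    # Non-repeating elements appear exactly once.
    if not repeating:
        return 1
    minimum, maximum = repeat_range
    return rng.randint(minimum, maximum)  # both bounds inclusive


rng = random.Random(23455)
# An optional, repeating form with a 20% chance of being omitted.
print(occurrences(mandatory=False, repeating=True, missing_probability=0.2,
                  repeat_range=(2, 4), rng=rng))
```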
ODM supports a lot of commonly used datatypes, e.g. strings, numbers and dates, but also more exotic data types like HexFloat or PartialDateTime. Currently, this tool supports only boolean, string, integer, float, double, date, time and datetime, as these are the most commonly used types.
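To find out in advance whether a metadata file uses datatypes outside this list, a check along the following lines may help. It is a sketch under the assumption that the tool matches the DataType attribute literally; the function name and the sample items are made up.

```python
import xml.etree.ElementTree as ET

# The datatypes listed above as supported by the tool.
SUPPORTED_TYPES = {"boolean", "string", "integer", "float",
                   "double", "date", "time", "datetime"}

METADATA = """<MetaDataVersion OID="MetaDataVersionOID" Name="MetaData">
  <ItemDef OID="AgeItemOID" Name="Item.2" DataType="integer"/>
  <ItemDef OID="WeightItemOID" Name="Item.3" DataType="hexFloat"/>
  <ItemDef OID="OnsetItemOID" Name="Item.4" DataType="partialDatetime"/>
</MetaDataVersion>"""


def unsupported_items(metadata_xml: str) -> dict:
    """Map the OID of every ItemDef with an unsupported DataType to that type."""
    root = ET.fromstring(metadata_xml)
    result = {}
    for element in root.iter():
        # Compare local names so that namespaced files work as well.
        if element.tag.rsplit("}", 1)[-1] == "ItemDef":
            data_type = element.get("DataType")
            if data_type not in SUPPORTED_TYPES:
                result[element.get("OID")] = data_type
    return result


unsupported = unsupported_items(METADATA)
print(unsupported)
```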
If nothing is specified, the whole domain of a datatype is used. In a perfect world, the domain of allowed values is always reasonably constrained by RangeChecks:
<MetaDataVersion OID="MetaDataVersionOID" Name="MetaData">
  <!-- [...] -->
  <ItemDef OID="AgeItemOID" Name="Item.2" DataType="integer">
    <RangeCheck Comparator="GT" SoftHard="Hard">
      <CheckValue>0</CheckValue>
    </RangeCheck>
    <RangeCheck Comparator="LT" SoftHard="Hard">
      <CheckValue>120</CheckValue>
    </RangeCheck>
  </ItemDef>
</MetaDataVersion>
In practice, this is often not the case. If you don't want to change the ODM file, you can either set the "Global value domains" parameter per datatype in the GUI or configure distributions for individual items in the JSON configuration (see below) to generate useful values.
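For the RangeCheck example above, the effective value range of the age item can be derived as sketched here. Only the comparators GT, GE, LT and LE are handled, and the default bounds are taken from the integer value domain used elsewhere on this page; the function itself is hypothetical and not the tool's code.

```python
import xml.etree.ElementTree as ET

ITEM_DEF = """<ItemDef OID="AgeItemOID" Name="Item.2" DataType="integer">
  <RangeCheck Comparator="GT" SoftHard="Hard"><CheckValue>0</CheckValue></RangeCheck>
  <RangeCheck Comparator="LT" SoftHard="Hard"><CheckValue>120</CheckValue></RangeCheck>
</ItemDef>"""


def integer_bounds(item_def_xml: str,
                   default: tuple = (-2147483648, 2147483647)) -> tuple:
    """Narrow the default integer domain by the item's RangeChecks."""
    low, high = default
    item = ET.fromstring(item_def_xml)
    for check in item.findall("RangeCheck"):
        value = int(check.findtext("CheckValue"))
        comparator = check.get("Comparator")
        if comparator == "GT":      # strictly greater than
            low = max(low, value + 1)
        elif comparator == "GE":    # greater than or equal
            low = max(low, value)
        elif comparator == "LT":    # strictly less than
            high = min(high, value - 1)
        elif comparator == "LE":    # less than or equal
            high = min(high, value)
    return low, high


bounds = integer_bounds(ITEM_DEF)
print(bounds)  # (1, 119)
```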
Besides the configuration on this page, you can configure individual elements using the "probabilities" element in the JSON configuration file. You can set the probability of missing values for individual StudyEvents, Forms, ItemGroups and Items by their OID. If elements are reused, you can select them inside a specific context using a chain of OIDs:
{
  "probabilities": {
    "Items": {
      <ItemOID>: {
        "ValueDistribution": {"1": 0.5, "2": 0.25, "3": 0.25},
        "missingProbability": 0.42
      },
      <ItemOID>: {
        "ValueDistribution": {"true": 0.75, "false": 0.25}
      }
    },
    "ItemGroups": {
      <ItemGroupOID>: {
        "missingProbability": 0.5,
        "repeat": {"minimum": 8, "maximum": 90},
        "Items": {
          <ItemOID>: {
            "GaussianDistribution": {"mean": 3, "sigma": 1}
          },
          <ItemOID>: { /* ... */ }
        }
      },
      <ItemGroupOID>: { /* ... */ }
    },
    "Forms": {
      <FormOID>: {
        "missingProbability": 0.5,
        "repeat": {"minimum": 8, "maximum": 90},
        "ItemGroups": {
          <ItemGroupOID>: {
            "missingProbability": 0.5,
            "Items": {
              <ItemOID>: {
                "UniformDistribution": {"min": 0.0, "max": 1.0}
              }
            }
          }
        }
      }
    },
    "StudyEvents": {
      <StudyEventOID>: {
        "missingProbability": 0.5,
        "repeat": {"minimum": 8, "maximum": 90},
        "Forms": {
          <FormOID>: {
            "missingProbability": 0.5,
            "ItemGroups": {
              <ItemGroupOID>: {
                "missingProbability": 0.4,
                "Items": {
                  <ItemOID>: {
                    "GaussianDistribution": {"mean": "2010-01-01", "sigma": "P2DT3H4M"},
                    "missingProbability": 0.5
                  },
                  <ItemOID>: {/* ... */}
                }
              }
            }
          },
          <FormOID>: {/* ... */}
        }
      },
      <StudyEventOID>: {/* ... */}
    }
  }
}
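For date-based items, the sigma of a Gaussian distribution in the configuration above is written as an ISO-8601 duration such as P2DT3H4M. The sketch below shows one way to interpret such a value and to draw a date-time around the mean. It is an assumption about the semantics, not the tool's implementation, and it only handles days, hours, minutes and seconds, because years and months have no fixed length.

```python
import random
import re
from datetime import datetime, timedelta

# Matches durations such as P20D, PT3H2M1S or P2DT3H4M (no years or months).
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(text: str) -> timedelta:
    """Convert a day/time ISO-8601 duration into a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def sample_datetime(mean: str, sigma: str, rng: random.Random) -> datetime:
    """Draw a date-time from a normal distribution around the given mean."""
    center = datetime.fromisoformat(mean)
    spread_seconds = parse_duration(sigma).total_seconds()
    return center + timedelta(seconds=rng.gauss(0.0, spread_seconds))


rng = random.Random(23455)
print(parse_duration("P2DT3H4M"))  # 2 days, 3:04:00
print(sample_datetime("2010-01-01", "P2DT3H4M", rng))
```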
Distributions
To change the range of values generated for a specific item, you can use a distribution. Currently, we support three types of distribution: the uniform distribution and the Gaussian distribution, which are especially useful when dealing with numbers or dates, and the value distribution, which comes in handy when dealing with a fixed set of codes.
A uniform distribution is specified by its minimum and maximum value. Please remember that the range of allowed values of a specific item might already be constrained by RangeChecks and global value domains.
"UniformDistribution": {"min": "2017-01-01", "max": "2019-01-02"}
The Gaussian distribution is specified by its mean and its standard deviation (sigma). Please note that the mean must lie inside the range of allowed values.
"GaussianDistribution": {"mean": 1.0, "sigma": 0.5}
The value distribution lets you specify a probability for each individual value:
"ValueDistribution": {"male": 0.45, "female": 0.45, "other": 0.05, "unknown": 0.05}
Because JSON object keys must be strings, you have to quote booleans and numbers used as values. The number after the colon is the probability that the given value will be chosen. Please make sure that all probabilities of a value distribution sum to 1.0.
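The three distribution types can be pictured with the following sketch. It is an illustration only: the function names are made up, and in particular the rejection of out-of-range Gaussian samples is an assumption about how range constraints might be honored, not a description of the tool's behavior.

```python
import math
import random
from datetime import date, timedelta


def sample_uniform_date(minimum: str, maximum: str, rng: random.Random) -> date:
    """Draw a date uniformly from the inclusive range [minimum, maximum]."""
    start, end = date.fromisoformat(minimum), date.fromisoformat(maximum)
    if end < start:
        raise ValueError("maximum lies before minimum")
    return start + timedelta(days=rng.randint(0, (end - start).days))


def sample_gaussian(mean: float, sigma: float, low: float, high: float,
                    rng: random.Random, max_tries: int = 1000) -> float:
    """Draw from N(mean, sigma) and reject values outside [low, high]."""
    if not low <= mean <= high:
        raise ValueError("mean must lie inside the allowed range")
    for _ in range(max_tries):
        value = rng.gauss(mean, sigma)
        if low <= value <= high:
            return value
    return mean  # fallback if every attempt was rejected


def sample_value(distribution: dict, rng: random.Random) -> str:
    """Pick one key of a value distribution according to its probability."""
    if not math.isclose(sum(distribution.values()), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1.0")
    values = list(distribution)
    weights = [distribution[value] for value in values]
    return rng.choices(values, weights=weights, k=1)[0]


rng = random.Random(23455)
print(sample_uniform_date("2017-01-01", "2019-01-02", rng))
print(sample_gaussian(1.0, 0.5, low=0.0, high=2.0, rng=rng))
print(sample_value({"male": 0.45, "female": 0.45, "other": 0.05, "unknown": 0.05}, rng))
```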
Repeating elements
StudyEvents, Forms and ItemGroups can be specified as repeatable. In that case, the "Repeat key settings" apply. If you want to set these for an individual element, you can specify a minimum and maximum number of repetitions under its OID:
"repeat": {"minimum": 10, "maximum": 20}
{
  "randomSeed": 23455 /* for consistent results, otherwise chosen randomly */,
  "subjectCount" : 25,
  "keepExistingData" : true,
  "studyEventMissingProbability" : 1.0,
  "formMissingProbability" : 1.0,
  "itemGroupMissingProbability" : 1.0,
  "itemMissingProbability" : 1.0,
  "studyEventRepeatKeyInterval" : {"minimum" : 2, "maximum" : 4},
  "formRepeatKeyInterval" : {"minimum" : 2, "maximum" : 4},
  "itemGroupRepeatKeyInterval" : {"minimum" : 2, "maximum" : 4},
  "globalValueDomains" : {
    "booleanValueDomain" : {
      "minimum" : false, "includeMinimum" : true,
      "maximum" : true, "includeMaximum" : true
    },
    "stringValueDomain" : {
      "minimum" : 1, "includeMinimum" : true, /* refers to string length */
      "maximum" : 64, "includeMaximum" : true
    },
    "integerValueDomain" : {
      "minimum" : -2147483648, "includeMinimum" : true,
      "maximum" : 2147483647, "includeMaximum" : true
    },
    "floatValueDomain" : {
      "minimum" : 1.4E-45, "includeMinimum" : true,
      "maximum" : 3.4028235E38, "includeMaximum" : true
    },
    "doubleValueDomain" : {
      "minimum" : 4.9E-324, "includeMinimum" : true,
      "maximum" : 1.7976931348623157E308, "includeMaximum" : true
    },
    "dateValueDomain" : {
      "minimum" : [ 1919, 8, 14 ], "includeMinimum" : true,
      "maximum" : "2119-08-14", "includeMaximum" : true
    },
    "timeValueDomain" : {
      "minimum" : [ 0, 0 ], "includeMinimum" : true,
      "maximum" : "23:59:59.9999999", "includeMaximum" : true
    },
    "dateTimeValueDomain" : {
      "minimum" : [ 1919, 8, 14, 11, 26, 21, 723000000 ],
      "includeMinimum" : true,
      "maximum" : "2119-08-01T13:38:41", //CAVE: leading zeros at "08" and "01" are required!
      "includeMaximum" : true
    }
  },
  "probabilities": { // configuration for specific elements by OID
    "StudyEvents": {
      <StudyEventOID>: { //unknown OIDs will be ignored silently
        // Probability that a non-mandatory StudyEvent will not be produced (default is 0.2 = 20%)
        "missingProbability": 0.5,
        "repeat": {"minimum": 1, "maximum": 5},
        "Forms": {
          <FormOID>: {
            "missingProbability": 0.5, // Probability that a non-mandatory form will not be generated
            "repeat": {"minimum": 8, "maximum": 90},
            "ItemGroups": {
              <ItemGroupOID>: {
                "missingProbability": 0.4,
                "repeat": {"minimum": 5, "maximum": 15},
                "Items": {
                  <ItemOID>: {
                    "GaussianDistribution": { "mean": "2010-01-01", "sigma": "P2DT3H4M" }
                    //P2DT3H4M = 2 days 3 hours 4 minutes standard deviation (DATETIME)
                    //PT3H2M1S = 3 hours 2 minutes 1 second standard deviation (TIME)
                    //P1Y2M3D = 1 year 2 months 3 days standard deviation (DATE), see ISO-8601 for more info
                  }
                }
              }
            }
          },
          <FormOID>: ...
        }
      }
    },
    "Forms": {
      <FormOID>: {
        "missingProbability": 0.5, // Setting inside a specific StudyEvent has precedence
        "repeat": {"minimum": 8, "maximum": 90},
        "ItemGroups": {
          <ItemGroupOID>: {
            "missingProbability": 0.5,
            "Items": {
              <ItemOID>: {
                "UniformDistribution": {"min": 0.0, "max": 1.0}
              }
            }
          }
        }
      }
    },
    "ItemGroups": {
      <ItemGroupOID>: {
        "missingProbability": 0.5, //Form-specific and StudyEvent+Form-specific settings have precedence
        "repeat": {"minimum": 8, "maximum": 90},
        "Items": {
          <ItemOID>: {
            "GaussianDistribution": { "mean": 3, "sigma": 1 }
          },
          <ItemOID>: {
            "ValueDistribution": { "1": 0.5, "2": 0.25, "3": 0.25 },
            "missingProbability": 0.42
          }
          /* ... */
        }
      }
    },
    "Items": {
      <ItemOID>: {
        "ValueDistribution": {"true": 0.75, "false": 0.25} //key must always be a String (thx to JSON)
      }
      /* ... */
    }
  }
}
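The precedence comments in the configuration above suggest that the most specific chain of OIDs wins. The sketch below shows how such a lookup could work. The configuration only states this rule for forms and item groups; applying it to items as well, and trying the longest chain first, are assumptions, and the function `lookup` is hypothetical.

```python
LEVELS = ("StudyEvents", "Forms", "ItemGroups", "Items")


def lookup(probabilities: dict, chain: tuple, key: str, default=None):
    """Find a setting for the last OID in chain, most specific context first.

    chain lists the OIDs from the StudyEvent down to the element itself,
    e.g. (study_event_oid, form_oid) for a form.
    """
    depth = len(chain)
    for start in range(depth):
        node = probabilities
        # Walk down from level `start`, e.g. Forms -> ItemGroups -> Items.
        for level, oid in zip(LEVELS[start:depth], chain[start:]):
            node = node.get(level, {}).get(oid)
            if node is None:
                break
        else:
            if key in node:
                return node[key]
    return default


settings = {
    "StudyEvents": {"SE1": {"Forms": {"F1": {"missingProbability": 0.1}}}},
    "Forms": {"F1": {"missingProbability": 0.5}},
}
print(lookup(settings, ("SE1", "F1"), "missingProbability"))       # 0.1
print(lookup(settings, ("SE2", "F1"), "missingProbability"))       # 0.5
print(lookup(settings, ("SE1", "F9"), "missingProbability", 0.2))  # 0.2
```

The third call falls back to the supplied default because no entry for the hypothetical OID F9 exists at any level.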
POST /csv/api HTTP/1.1
Host: clinicaldatagenerator.uni-muenster.de
Content-Type: multipart/form-data;
Content-Disposition: form-data; name="json_conf"
{
  "rowCount": 25,
  "randomSeed": 8223,
  "globalValueDomains": {
    "booleanValueDomain": {
      "minimum": false,
      "maximum": true,
      "includeMinimum": true,
      "includeMaximum": true
    },
    "stringValueDomain": {
      "minimum": "1",
      "maximum": "64",
      "includeMinimum": true,
      "includeMaximum": true
    },
    "integerValueDomain": {
      "minimum": -2147483648,
      "maximum": 2147483647,
      "includeMinimum": true,
      "includeMaximum": true
    },
    "floatValueDomain": {
      "minimum": -1000000,
      "maximum": 1000000,
      "includeMinimum": true,
      "includeMaximum": true
    },
    "doubleValueDomain": {
      "minimum": -1000000,
      "maximum": 1000000,
      "includeMinimum": true,
      "includeMaximum": true
    },
    "dateValueDomain": {
      "minimum": "1924-07-23",
      "maximum": "2124-07-23",
      "includeMinimum": true,
      "includeMaximum": true
    },
    "timeValueDomain": {
      "minimum": "00:00:00",
      "maximum": "23:59:59",
      "includeMinimum": true,
      "includeMaximum": true
    },
    "dateTimeValueDomain": {
      "minimum": "1924-07-23T14:26:44",
      "maximum": "2124-07-23T14:26:44",
      "includeMinimum": true,
      "includeMaximum": true
    }
  },
  "outputFormat": "CSV",
  "csvSettings": {
    "delimiter": ",",
    "nullString": "N/A",
    "escape": "\\",
    "quoteMode": "MINIMAL",
    "quoteChar": "\""
  },
  "columns": {
    // Insert your column configuration here
  }
}
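A request like the one above could be prepared from Python as sketched below, using only the standard library. Several points are assumptions: that the configuration is sent as a single multipart field named json_conf (taken from the Content-Disposition line above), that per-column options are nested in an object per column, and the column names themselves. The request is only built, not sent, and the format of the response is not covered here.

```python
import json
import uuid
from urllib.request import Request


def build_csv_request(config: dict,
                      base_url: str = "https://clinicaldatagenerator.uni-muenster.de") -> Request:
    """Build (but do not send) a multipart POST carrying the JSON configuration."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="json_conf"\r\n'
        "\r\n"
        f"{json.dumps(config)}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return Request(
        f"{base_url}/csv/api",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


config = {
    "rowCount": 25,
    "randomSeed": 8223,
    "outputFormat": "CSV",
    "columns": {
        "id": "INTEGER",
        # Integer keys are written as strings by json.dumps, as JSON requires.
        "grade": {"ValueDistribution": {1: 0.5, 2: 0.25, 3: 0.25}},
    },
}
request = build_csv_request(config)
print(request.get_method(), request.full_url)
```

Building the configuration as a Python dictionary has the side effect that numeric keys of a value distribution are quoted automatically during serialization.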
Ctrl+Space | Autocomplete |
Ctrl+Shift+K | Delete Line |
Alt+Shift+F | Format Document |
Ctrl+Shift+O | Navigate to columns by name |