Demo Data Guide
This document provides a comprehensive guide to the demo data included with PRISM. The demo folder contains synthetic data designed to help you learn PRISM workflows, test validation, and understand how to structure your own datasets.
Overview
The demo/ folder demonstrates the complete PRISM ecosystem:
| Component | Purpose |
|---|---|
| templates/ | Multilingual JSON sidecars (survey, biometrics) |
| raw_data/ | Example source files + bad examples for testing |
| derivatives/ | Scoring recipes for computed measures |
| prism_structure_example/ | Correctly organized PRISM dataset |
| flat_structure_example/ | Common disorganized format (for comparison) |
Templates: Multilingual Survey & Biometrics
Location
demo/templates/
├── participants.json # Participant variable definitions
├── survey/
│ └── survey-wellbeing.json # Well-being questionnaire (DE/EN)
└── biometrics/
└── biometrics-fitness.json # Fitness assessment (DE/EN)
The Well-Being Survey (survey-wellbeing.json)
A 5-item synthetic questionnaire measuring general well-being:
| Item | English Description | German Description | Scale |
|---|---|---|---|
| WB01 | Life satisfaction | Lebenszufriedenheit | 1-5 |
| WB02 | Happiness frequency | Häufigkeit von Glück | 1-5 |
| WB03 | Stress level (reverse) | Stressniveau (umgekehrt) | 1-5 |
| WB04 | Energy level | Energieniveau | 1-5 |
| WB05 | Sleep quality satisfaction | Schlafqualität | 1-5 |
Scale anchors (example for WB01):
1 = Not at all satisfied / Überhaupt nicht zufrieden
2 = Slightly satisfied / Wenig zufrieden
3 = Moderately satisfied / Mäßig zufrieden
4 = Very satisfied / Sehr zufrieden
5 = Extremely satisfied / Äußerst zufrieden
I18n Structure
PRISM templates store both languages in a single file using nested objects:
{
  "Study": {
    "TaskName": "wellbeing",
    "OriginalName": {
      "de": "Wohlbefindens-Kurzskala",
      "en": "Well-Being Short Scale"
    },
    "Description": {
      "de": "5-Item Fragebogen zur Erfassung des allgemeinen Wohlbefindens",
      "en": "5-item questionnaire measuring general well-being"
    }
  },
  "I18n": {
    "Languages": ["de", "en"],
    "DefaultLanguage": "en"
  },
  "WB01": {
    "Description": {
      "de": "Wie zufrieden sind Sie im Allgemeinen mit Ihrem Leben?",
      "en": "In general, how satisfied are you with your life?"
    },
    "Levels": {
      "1": {
        "de": "Überhaupt nicht zufrieden",
        "en": "Not at all satisfied"
      },
      "5": {
        "de": "Äußerst zufrieden",
        "en": "Extremely satisfied"
      }
    }
  }
}
Benefits:
Single source of truth for translations
No sync issues between separate files
Compile to target language at export time
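To make the "compile at export time" idea concrete, here is a minimal Python sketch of collapsing a multilingual template into a single language. It is not the code behind `library-compile`; the function name `compile_to_language` and the fallback to `DefaultLanguage` are assumptions made for illustration, and the template is an abridged copy of the example above.

```python
from typing import Any


def compile_to_language(node: Any, lang: str, languages: set[str], default: str) -> Any:
    """Hypothetical helper: collapse every language map to one string."""
    if isinstance(node, dict):
        # Treat a non-empty dict whose keys are all language codes as a language map.
        if node and set(node) <= languages:
            # Assumption: fall back to the default language if `lang` is missing.
            return node.get(lang, node.get(default))
        return {key: compile_to_language(value, lang, languages, default)
                for key, value in node.items()}
    if isinstance(node, list):
        return [compile_to_language(item, lang, languages, default) for item in node]
    return node


# Abridged copy of the survey-wellbeing.json structure shown above.
template = {
    "Study": {
        "TaskName": "wellbeing",
        "OriginalName": {"de": "Wohlbefindens-Kurzskala", "en": "Well-Being Short Scale"},
    },
    "I18n": {"Languages": ["de", "en"], "DefaultLanguage": "en"},
    "WB01": {
        "Levels": {
            "1": {"de": "Überhaupt nicht zufrieden", "en": "Not at all satisfied"},
            "5": {"de": "Äußerst zufrieden", "en": "Extremely satisfied"},
        }
    },
}

i18n = template["I18n"]
german = compile_to_language(template, "de", set(i18n["Languages"]), i18n["DefaultLanguage"])
print(german["Study"]["OriginalName"])  # Wohlbefindens-Kurzskala
print(german["WB01"]["Levels"]["1"])    # Überhaupt nicht zufrieden
```

Keys such as "1" and "5" under Levels are not language codes, so the helper recurses into them instead of collapsing them.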
Compiling to Single Language
# Extract German version
python prism_tools.py library-compile demo/templates/survey/survey-wellbeing.json --lang de
# Extract English version
python prism_tools.py library-compile demo/templates/survey/survey-wellbeing.json --lang en
Raw Data: Source Files
Location
demo/raw_data/
├── survey_wellbeing_data.tsv # Valid survey responses
├── biometrics_fitness_data.tsv # Valid biometrics data
└── bad_examples/ # 13 intentionally broken files
Valid Survey Data (survey_wellbeing_data.tsv)
10 synthetic participants with complete responses:
participant_id session age sex education handedness WB01 WB02 WB03 WB04 WB05 completion_date
DEMO001 baseline 28 f 4 r 4 4 2 3 4 2025-01-15
DEMO002 baseline 34 m 5 r 3 3 3 3 3 2025-01-16
DEMO003 baseline 22 f 3 r 5 5 1 5 5 2025-01-17
...
Column descriptions:
participant_id: Unique identifier (converted to sub-DEMO001 format)
session: Data collection session (becomes ses-baseline)
age, sex, education, handedness: Demographic variables
WB01-WB05: Survey item responses (1-5 Likert scale)
completion_date: When the survey was completed
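The identifier mapping described above can be sketched in a few lines of Python. This is only an illustration of the sub-/ses- prefixing, not the converter's actual logic; the helper name `to_bids_labels` is hypothetical, and leaving already-prefixed values untouched is an assumption.

```python
import csv
import io


def to_bids_labels(participant_id: str, session: str) -> tuple[str, str]:
    """Hypothetical helper: add the sub- and ses- prefixes to raw identifiers."""
    # Assumption: values that already carry the prefix are left as they are.
    sub = participant_id if participant_id.startswith("sub-") else f"sub-{participant_id}"
    ses = session if session.startswith("ses-") else f"ses-{session}"
    return sub, ses


# First two rows of survey_wellbeing_data.tsv, written with explicit tabs.
RAW = (
    "participant_id\tsession\tage\tsex\teducation\thandedness\t"
    "WB01\tWB02\tWB03\tWB04\tWB05\tcompletion_date\n"
    "DEMO001\tbaseline\t28\tf\t4\tr\t4\t4\t2\t3\t4\t2025-01-15\n"
    "DEMO002\tbaseline\t34\tm\t5\tr\t3\t3\t3\t3\t3\t2025-01-16\n"
)

rows = list(csv.DictReader(io.StringIO(RAW), delimiter="\t"))
labels = [to_bids_labels(row["participant_id"], row["session"]) for row in rows]
print(labels[0])  # ('sub-DEMO001', 'ses-baseline')
```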
Bad Examples for Testing
The bad_examples/ folder contains 13 intentionally malformed files to test error handling:
| File | Issue | Expected Error |
|---|---|---|
| | No participant_id column | Cannot identify participants |
| | Semicolons instead of tabs | Column parsing failure |
| | “very high” instead of 5 | Non-numeric data error |
| 04_out_of_range_values.tsv | Values like 99, -5, 300 | Out of range warning |
| | Missing cells | Missing value handling |
| | Extra columns not in template | Unknown column warning |
| | Same participant twice | Duplicate entry warning |
| | Varying column counts | Parsing error |
| | Mix of numbers, N/A, NULL, #REF! | Type validation errors |
| | Completely empty | No data error |
| | Headers but no rows | No participant data |
| | HTML tags, quotes | Sanitization test |
| | IDs not in sub-XXX format | Format warning |
Usage:
# Test error handling via CLI
python prism_tools.py survey-convert \
  demo/raw_data/bad_examples/04_out_of_range_values.tsv \
  --library demo/templates \
  --output /tmp/test_output
# Or use the web interface Data Conversion page
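For intuition about what several of these bad examples exercise, the sketch below runs a handful of comparable checks on an in-memory TSV. It is a standalone illustration, not PRISM's validator: the function `precheck`, its messages, and the sample rows are all made up for this example.

```python
import csv
import io


def precheck(tsv_text, items, low=1, high=5):
    """Hypothetical pre-check: return a list of problems found in a survey TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if "participant_id" not in (reader.fieldnames or []):
        return ["missing participant_id column"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        pid = row["participant_id"]
        if pid in seen:
            problems.append(f"line {line_no}: duplicate participant {pid}")
        seen.add(pid)
        for item in items:
            value = (row.get(item) or "").strip()
            if not value:
                problems.append(f"line {line_no}: {item} is missing")
                continue
            try:
                number = int(value)
            except ValueError:
                problems.append(f"line {line_no}: {item} is not numeric ({value!r})")
                continue
            if not low <= number <= high:
                problems.append(f"line {line_no}: {item}={number} outside {low}-{high}")
    return problems


# Made-up rows combining several of the issues listed in the table.
sample = (
    "participant_id\tWB01\tWB02\n"
    "DEMO001\t4\t99\n"
    "DEMO002\tvery high\t3\n"
    "DEMO001\t5\t\n"
)

for problem in precheck(sample, ["WB01", "WB02"]):
    print(problem)
```

A pre-check like this is useful mainly as a mental model; the converter's own messages and severity levels (error versus warning) may differ.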
Derivatives: Scoring Recipes
Location
demo/derivatives/
├── README.md
├── surveys/
│ └── wellbeing.json # Wellbeing subscales & reverse coding
└── biometrics/
└── fitness.json # Fitness composite scores
Wellbeing Scoring Recipe (wellbeing.json)
Computes derived scores from raw survey items:
{
  "RecipeVersion": "1.0",
  "Kind": "survey",
  "Survey": {
    "Name": "Well-Being Short Scale",
    "TaskName": "wellbeing"
  },
  "Transforms": {
    "Invert": {
      "Scale": {"min": 1, "max": 5},
      "Items": ["WB03"]
    }
  },
  "Scores": [
    {
      "Name": "WB_total",
      "Description": "Total Well-Being Score",
      "Method": "sum",
      "Items": ["WB01", "WB02", "WB03", "WB04", "WB05"],
      "Range": {"min": 5, "max": 25},
      "Interpretation": {
        "5-10": "Low well-being",
        "11-17": "Moderate well-being",
        "18-25": "High well-being"
      }
    },
    {
      "Name": "WB_positive",
      "Description": "Positive Affect Subscale",
      "Method": "mean",
      "Items": ["WB02", "WB04"]
    },
    {
      "Name": "WB_satisfaction",
      "Description": "Life Satisfaction Subscale",
      "Method": "mean",
      "Items": ["WB01", "WB05"]
    }
  ]
}
Key features:
Reverse coding: WB03 (stress) is inverted so higher = better
Subscales: Total, Positive Affect, Life Satisfaction
Methods: sum or mean aggregation
Missing data: ignore skips missing values
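To see how the recipe turns raw responses into scores, the following Python sketch applies an abridged copy of the recipe to the DEMO001 row from the raw data. It is a simplified illustration rather than PRISM's scoring engine: the function name `score` is hypothetical, inversion is assumed to use min + max - value and to run before aggregation, and missing-data handling is left out.

```python
# Abridged copy of the wellbeing.json recipe shown above.
recipe = {
    "Transforms": {"Invert": {"Scale": {"min": 1, "max": 5}, "Items": ["WB03"]}},
    "Scores": [
        {"Name": "WB_total", "Method": "sum",
         "Items": ["WB01", "WB02", "WB03", "WB04", "WB05"]},
        {"Name": "WB_positive", "Method": "mean", "Items": ["WB02", "WB04"]},
        {"Name": "WB_satisfaction", "Method": "mean", "Items": ["WB01", "WB05"]},
    ],
}


def score(responses, recipe):
    """Hypothetical scorer: invert the listed items, then aggregate each score."""
    values = dict(responses)
    invert = recipe["Transforms"]["Invert"]
    low, high = invert["Scale"]["min"], invert["Scale"]["max"]
    for item in invert["Items"]:
        # Assumed reverse-coding rule: 1 <-> 5, 2 <-> 4, 3 stays 3.
        values[item] = low + high - values[item]

    methods = {
        "sum": sum,
        "mean": lambda numbers: sum(numbers) / len(numbers),
    }
    return {
        entry["Name"]: methods[entry["Method"]]([values[i] for i in entry["Items"]])
        for entry in recipe["Scores"]
    }


# DEMO001 from survey_wellbeing_data.tsv: WB03 = 2 becomes 1 + 5 - 2 = 4.
demo001 = {"WB01": 4, "WB02": 4, "WB03": 2, "WB04": 3, "WB05": 4}
scores = score(demo001, recipe)
print(scores["WB_total"])     # 19
print(scores["WB_positive"])  # 3.5
```

With a total of 19, DEMO001 falls into the 18-25 band that the recipe labels "High well-being".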
Computing Derivatives
# Generate scored output from a PRISM dataset
python prism_tools.py derivatives-surveys /path/to/dataset \
  --recipe demo/derivatives/surveys/wellbeing.json \
  --output derivatives/
# Outputs: CSV, Excel (with codebook), and optional SPSS format
PRISM Structure Example
Location
demo/prism_structure_example/
├── .bidsignore
├── dataset_description.json
├── participants.json
├── participants.tsv
├── sub-001/
│ ├── eyetrack/
│ │ ├── sub-001_task-reading_eyetrack.tsv.gz
│ │ └── sub-001_task-reading_eyetrack.json
│ └── physio/
│ ├── sub-001_task-rest_physio.tsv.gz
│ └── sub-001_task-rest_physio.json
└── sub-002/
├── eyetrack/
│ └── ...
└── physio/
└── ...
This is a correctly organized PRISM dataset. It demonstrates the required dataset-level files and the per-subject folder structure described below.
Dataset-Level Files
dataset_description.json - Required metadata:
{
  "Name": "PRISM Demo Dataset",
  "BIDSVersion": "1.8.0",
  "DatasetType": "raw",
  "License": "CC0",
  "Authors": ["Demo Author"],
  "Acknowledgements": "Synthetic demo data for PRISM testing"
}
participants.tsv - Participant demographics:
participant_id age sex handedness
sub-001 28 F R
sub-002 34 M R
participants.json - Variable definitions:
{
  "age": {
    "Description": "Age of participant in years",
    "Units": "years"
  },
  "sex": {
    "Description": "Biological sex",
    "Levels": {
      "F": "Female",
      "M": "Male"
    }
  }
}
Subject-Level Structure
Each subject folder contains modality subfolders:
eyetrack/ - Eye tracking data
physio/ - Physiological recordings (ECG, EDA, etc.)
Naming convention:
sub-<id>_[ses-<session>_]task-<task>_<modality>.<ext>
Examples:
sub-001_task-reading_eyetrack.tsv.gz
sub-001_task-rest_physio.tsv.gz
Each data file has a corresponding .json sidecar with metadata.
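The naming convention can be expressed as a small builder and a matching regular expression. The Python sketch below is an illustration only: `build_name` and `NAME_RE` are hypothetical names, and restricting labels to letters and digits is an assumption borrowed from BIDS-style naming, not a rule stated in this guide.

```python
import re
from typing import Optional


def build_name(sub: str, task: str, modality: str, ext: str,
               ses: Optional[str] = None) -> str:
    """Hypothetical helper: assemble sub-<id>_[ses-<session>_]task-<task>_<modality>.<ext>."""
    parts = [f"sub-{sub}"]
    if ses:
        parts.append(f"ses-{ses}")  # the session entity is optional
    parts += [f"task-{task}", modality]
    return "_".join(parts) + f".{ext}"


# Assumption: labels contain only letters and digits.
NAME_RE = re.compile(
    r"^sub-(?P<sub>[A-Za-z0-9]+)"
    r"(?:_ses-(?P<ses>[A-Za-z0-9]+))?"
    r"_task-(?P<task>[A-Za-z0-9]+)"
    r"_(?P<modality>[a-z]+)"
    r"\.(?P<ext>.+)$"
)

name = build_name("001", "reading", "eyetrack", "tsv.gz")
print(name)  # sub-001_task-reading_eyetrack.tsv.gz

match = NAME_RE.match("sub-001_task-rest_physio.json")
print(match.groupdict())
```

Parsing the sidecar name with the same pattern is a quick way to confirm that a data file and its `.json` sidecar share the same entities.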
Validating the Example
# Should pass with no errors
python prism.py demo/prism_structure_example/
# Output:
# ✓ Dataset validation complete
# 0 errors, 0 warnings
Flat Structure Example (Anti-Pattern)
Location
demo/flat_structure_example/
├── README.md
└── [messy files with inconsistent naming]
This folder demonstrates how data often arrives from experiments:
No standardized naming
Mixed file formats
No metadata sidecars
No participant organization
Purpose: Compare with prism_structure_example/ to understand why standardization matters.
# Will show many validation errors
python prism.py demo/flat_structure_example/
Hands-On Tutorials
Tutorial 1: Validate Demo Dataset
# Activate environment
source .venv/bin/activate
# Validate the well-organized example
python prism.py demo/prism_structure_example/
# → Should pass
# Try the flat structure
python prism.py demo/flat_structure_example/
# → Will show errors
Tutorial 2: Convert Survey Data
# Convert raw survey data to PRISM format
python prism_tools.py survey-convert \
  demo/raw_data/survey_wellbeing_data.tsv \
  --library demo/templates \
  --output /tmp/converted_survey \
  --force
# Validate the result
python prism.py /tmp/converted_survey
Tutorial 3: Compute Derivatives
# After converting, compute subscale scores
python prism_tools.py derivatives-surveys /tmp/converted_survey \
  --recipe demo/derivatives/surveys/wellbeing.json \
  --output /tmp/converted_survey/derivatives
# Check the output
ls /tmp/converted_survey/derivatives/surveys/
# → wellbeing_scores.csv, wellbeing_scores.xlsx, codebook.json
Tutorial 4: Test Error Handling
# Try importing a bad file
python prism_tools.py survey-convert \
  demo/raw_data/bad_examples/04_out_of_range_values.tsv \
  --library demo/templates \
  --output /tmp/bad_test
# Should clearly report the values that fall outside the 1-5 range
Tutorial 5: Web Interface
Start: python prism-studio.py
Open: http://localhost:5001
Go to Validate tab
Upload demo/prism_structure_example/ → should pass
Go to Data Conversion tab
Select library: demo/templates
Upload demo/raw_data/survey_wellbeing_data.tsv
Watch real-time conversion log
Summary
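| Demo Component | What It Teaches |
|---|---|
| templates/ | Multilingual JSON sidecar format |
| templates/participants.json | Participant variable definitions |
| raw_data/ | Typical source data format |
| raw_data/bad_examples/ | Error handling & validation |
| derivatives/ | Scoring recipe format |
| prism_structure_example/ | Correct PRISM organization |
| flat_structure_example/ | Why standardization matters |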
All demo data is completely synthetic and safe to share, modify, or use for testing.