PRISM Derivatives Specification

PRISM can automatically compute scores, subscales, and intermediate variables from your raw survey and biometric data. This is controlled by Recipe files located in the repository under derivatives/surveys/ and derivatives/biometrics/.

Recipe Structure

A recipe is a JSON file that defines how to transform raw items into scores.

{
  "RecipeVersion": "1.0",
  "Kind": "biometrics",
  "Biometrics": {
    "Name": "Y Balance Test",
    "BiometricName": "y"
  },
  "Transforms": {
    "Invert": {
      "Items": ["item1", "item2"],
      "Scale": { "min": 0, "max": 4 }
    },
    "Derived": [
      {
        "Name": "best_trial",
        "Method": "max",
        "Items": ["trial1", "trial2", "trial3"]
      }
    ]
  },
  "Scores": [
    {
      "Name": "total_score",
      "Method": "sum",
      "Items": ["item1", "item2", "best_trial"]
    }
  ]
}

Available Methods

1. `Transforms.Derived` (Intermediate Variables)

Used to compute values that are then used as inputs for final scores (e.g., taking the best of three trials).

Method	Description
`max`	Returns the maximum value among the items. (Default)
`min`	Returns the minimum value among the items.
`mean`	Returns the arithmetic mean (average) of the items. (Alias: `avg`)
`sum`	Returns the sum of the items.

2. `Scores` (Final Output Columns)

Used to compute the final variables that will appear in the derivative TSV/Excel/SPSS files.

Method	Description
`sum`	Returns the sum of the items. (Default)
`mean`	Returns the arithmetic mean (average) of the items.
`formula`	Evaluates a mathematical expression. Requires a `Formula` field.

The `formula` Method

When using Method: "formula", you must provide a Formula string. Use curly braces {} to reference item IDs or derived variable names.

Example:

{
  "Name": "normalized_score",
  "Method": "formula",
  "Items": ["A", "PM", "PL", "LegLength"],
  "Formula": "(({A} + {PM} + {PL}) / (3 * {LegLength})) * 100"
}

Handling Missing Data

For Scores, you can control how missing values (n/a or empty cells) are handled using the Missing field:

Option	Description
`ignore`	Skips missing values in calculations (e.g., `sum` of `[5, n/a, 5]` is `10`). (Default)
`require_all`	If any item in the list is missing, the entire score becomes `n/a`. (Aliases: `all`, `strict`)

Item Inversion (Reverse Coding)

The Transforms.Invert block allows you to automatically reverse-code items before any other calculations take place.

Items: List of item IDs to invert.
Scale: Must provide min and max.
Formula: new_value = (max + min) - old_value

Validation (When and How)

Recipes are validated before execution (fail-fast). If a recipe is malformed (unknown Method, missing Formula, etc.), derivative generation stops with a clear error message.
The validator checks, among other things:
- Kind is survey or biometrics
- Survey.TaskName / Biometrics.BiometricName is present
- Method values are from the supported sets documented above
- formula scores contain a Formula and every {placeholder} is also listed in Items

Output Structure

When you run derivatives, PRISM creates a BIDS-compliant derivatives folder:

derivatives/surveys/ or derivatives/biometrics/
dataset_description.json: Automatically generated metadata file. It inherits Name, Authors, License, and Funding from your root dataset description to ensure transparency and reproducibility.
<recipe_id>/: Folders containing the computed scores.

The dataset_description.json includes a GeneratedBy section identifying prism-tools and the specific version used, along with a timestamp.