validation

Tom Russell edited this page Feb 11, 2019 · 1 revision

smif: Data validation should ease the process of debugging the configuration of a model run.

Validation should:

  • be optional: in debug mode, validation should be enabled, but it has the potential to be data intensive if it results in multiple reads and writes to the store
  • sit close to the DataInterface: validating the objects returned from the store will reduce the amount of duplicate code written, and ease testing against existing fixtures
  • be well documented
  • be tested
  • use either the Python warnings.warn system, with warnings derived from UserWarning, or some other mechanism (e.g. errors)
  • produce reports that help a user with their debugging
  • block a modelrun from being performed if one or more validation warnings are raised
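
A minimal sketch of the warnings-based option, assuming a hypothetical `ValidationWarning` class and a hypothetical `check_config` helper (neither name is taken from smif). It collects warnings into a report and uses the report to decide whether the modelrun may proceed:

```python
import warnings


class ValidationWarning(UserWarning):
    """Hypothetical base class for validation warnings."""


def check_config(config):
    """Hypothetical check: warn for each required key that is missing."""
    for key in ("name", "timesteps"):
        if key not in config:
            warnings.warn("Missing required key: {}".format(key), ValidationWarning)


def validate_before_run(config):
    """Run the checks and return the validation warnings as report lines."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not suppress repeated warnings
        check_config(config)
    return [
        str(w.message) for w in caught if issubclass(w.category, ValidationWarning)
    ]


report = validate_before_run({"name": "example"})
can_run = not report  # block the modelrun if any validation warning was raised
```

Deriving from UserWarning keeps the checks usable outside debug mode too, since callers can filter or escalate them with the standard warnings filters.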

More ideas after discussing with Tom:

  • the smif validate command could perform a dry run of a model run, checking all configuration, the formats of scenario data files, that all data provided meets the requirements of the models, etc.
  • validation within DataInterface checks the types and formats of the data dicts and objects returned
  • within a model run, validation can also cross-check, e.g., initial conditions against interventions

Stories

Test Store raises DataNotFound if parameter doesn't exist in model

Labels:

  • smif
  • validation

ScenarioForm validate that each variant provides all sources

Labels:

  • gui
  • smif
  • validation

Scenario variants should define sources for all outputs the scenario provides. A variant with a missing source currently breaks the datafile handler.

The datahandler should validate that each variant is complete on save.
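
A possible shape for the completeness check, as a sketch only. It assumes a scenario is a dict with a `provides` list of named outputs and a `variants` list whose `data` dict maps output names to sources; that layout and the function name are assumptions for illustration:

```python
def missing_variant_sources(scenario):
    """Return {variant name: [outputs with no source]} for incomplete variants."""
    expected = [spec["name"] for spec in scenario["provides"]]
    missing = {}
    for variant in scenario["variants"]:
        sources = variant.get("data", {})
        absent = [name for name in expected if not sources.get(name)]
        if absent:
            missing[variant["name"]] = absent
    return missing


scenario = {
    "provides": [{"name": "population"}, {"name": "gva"}],
    "variants": [
        {"name": "low", "data": {"population": "pop_low.csv", "gva": "gva_low.csv"}},
        {"name": "high", "data": {"population": "pop_high.csv"}},
    ],
}
incomplete = missing_variant_sources(scenario)
```

An empty result means every variant is complete; anything else could be reported back to the form on save.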

Dependencies should check absolute range

Labels:

  • smif
  • validation

It shouldn't be possible to link specs with different absolute ranges - values valid in one spec may be invalid under the other.
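
As a sketch, the check could compare the ranges when a dependency is created. The dict layout with an `abs_range` key and the function name are assumed here for illustration:

```python
def check_abs_range(source_spec, sink_spec):
    """Raise ValueError if two linked specs declare different absolute ranges."""
    source_range = source_spec.get("abs_range")
    sink_range = sink_spec.get("abs_range")
    if source_range != sink_range:
        raise ValueError(
            "Cannot link {} {} to {} {}: absolute ranges differ".format(
                source_spec["name"], source_range, sink_spec["name"], sink_range
            )
        )


# Matching ranges pass silently
check_abs_range(
    {"name": "demand_out", "abs_range": (0, 100)},
    {"name": "demand_in", "abs_range": (0, 100)},
)
```

Strict equality is the simplest rule. A looser rule, such as allowing the source range to sit inside the sink range, would be a separate design decision.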

Report {error: message} responses in GUI

Labels:

  • errors
  • smif
  • validation

Pass the message through to alert-danger boxes.

Catch SmifException at HTTP API and return {error: message}

Labels:

  • errors
  • smif
  • validation

Possibly use appropriate HTTP codes - 404 if not found.

500 should probably be reserved for unexpected errors (i.e. not SmifException)
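
One way to sketch the mapping, independent of any web framework. The exception classes below are local stand-ins, and the choice of 400 for other SmifException cases is an assumption:

```python
class SmifException(Exception):
    """Stand-in for the base class of expected smif errors."""


class SmifDataNotFoundError(SmifException):
    """Stand-in for a 'not found' error."""


def error_response(exc):
    """Map an exception to an ({error: message}, HTTP status) pair."""
    if isinstance(exc, SmifDataNotFoundError):
        status = 404  # requested object does not exist
    elif isinstance(exc, SmifException):
        status = 400  # assumed: other expected errors are client errors
    else:
        status = 500  # reserved for unexpected errors
    return {"error": str(exc)}, status


body, status = error_response(SmifDataNotFoundError("Scenario 'pop' not found"))
```

The subclass check must come before the base class check, otherwise every not-found error would be reported with the generic status.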

Catch SmifException at CLI (top level) and report, exit(-1)

Labels:

  • errors
  • smif
  • validation

Ensure data homogeneity: Add import scripts, ensure read and write of config and data uses one of "our" formats

Labels:

  • smif
  • validation

At present, our datafile interface reads in from a variety of file types, and writes as either csv or binary. It would be cleaner to separate the import of user data from the caching or persistence of configuration and results/model data.

Refactor shape validation into file interface

Labels:

  • smif
  • validation

Building upon #156734556, move shape validation into datafile interface to validate region definitions upon initialisation (and add a test).

Smif run crashes if unused region_definitions are not present on disk

Labels:

  • smif
  • validation

Smif run requires all of the region_definitions to be present on disk before it is able to run a modelrun, even when a region_definition is not used by that particular modelrun.

I think this is illogical, but it is a question of where we put these boundaries.

If missing data for a scenario year, raise DataNotFoundError (or similar clear message)

Labels:

  • errors
  • smif
  • validation

Currently smif raises a DataMismatchError:

smif.data_layer.data_interface.DataMismatchError: Number of observations (0) is not equal to intervals (391) x regions (1)

See also #155182033
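
A sketch of the clearer failure, assuming observations are dicts with a `year` key. The exception class and function name are illustrative only:

```python
class DataNotFoundError(Exception):
    """Stand-in for the proposed error type."""


def observations_for_year(observations, scenario_name, year):
    """Return the observations for a year, or fail with a clear message."""
    selected = [obs for obs in observations if obs["year"] == year]
    if not selected:
        raise DataNotFoundError(
            "No data found for scenario '{}' in year {}".format(scenario_name, year)
        )
    return selected


data = [{"year": 2020, "value": 1.5}, {"year": 2025, "value": 1.7}]
found = observations_for_year(data, "population", 2020)
```

Checking for an empty selection before the shape comparison means the user sees which scenario and year are missing, rather than a count mismatch of 0 against intervals x regions.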

Raise errors from objects if data is invalid, handle at cli-controller

Labels:

  • data_handle
  • smif
  • validation

Proposed design:

  • don't validate eagerly if running a model
  • let objects complain by raising errors if they are misconfigured
  • catch errors at the controller level (e.g. in cli/init) and communicate back to the user with a clear message, including the stack trace only if it contains useful info
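
The proposed design could look roughly like this at the controller level. `SmifException` is a local stand-in and `run_command` is a hypothetical wrapper, not an existing smif function:

```python
import sys


class SmifException(Exception):
    """Stand-in for the base class of expected smif errors."""


def run_command(command):
    """Run a CLI command, reporting expected errors without a stack trace."""
    try:
        command()
    except SmifException as ex:
        # Expected error: a clear message is enough
        print("Error: {}".format(ex), file=sys.stderr)
        return 1
    return 0


def misconfigured_command():
    # Objects raise when they find they are misconfigured
    raise SmifException("Model run 'demo' references an unknown scenario")


exit_code = run_command(misconfigured_command)
```

Unexpected exceptions are deliberately not caught, so they still surface with a full stack trace. The returned code can be passed to `sys.exit` by the entry point.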

Validate objects saved through HTTP API

Labels:

  • smif
  • validation

  • allow incomplete objects to be saved
  • if incomplete, could return warning (?)
  • if references missing object, return warning or error (?)
  • raise and return error if extra (unexpected) data is posted
  • raise and return error if a data type is unexpected
  • raise and return error if expected size or length is exceeded
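
The rules above can be sketched as a single function that separates errors from warnings. The field schema, the length limit and the function name are all assumptions made for the example:

```python
# Hypothetical schema: field name -> expected type
EXPECTED_FIELDS = {"name": str, "description": str, "timesteps": list}
MAX_LENGTH = 255  # assumed limit on string and list length


def validate_posted(data):
    """Return (errors, warnings) for an object posted to the API."""
    errors = []
    warnings_ = []
    for key in data:
        if key not in EXPECTED_FIELDS:
            errors.append("Unexpected field: {}".format(key))
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            # Incomplete objects may still be saved, so this is only a warning
            warnings_.append("Missing field: {}".format(key))
        elif not isinstance(data[key], expected_type):
            errors.append("Unexpected type for field: {}".format(key))
        elif len(data[key]) > MAX_LENGTH:
            errors.append("Field too long: {}".format(key))
    return errors, warnings_


errors, warnings_ = validate_posted({"name": "demo", "timesteps": "2020", "extra": 1})
```

An object with warnings but no errors would be saved; any error would reject the request.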

smif validate <modelrun_id> validates a modelrun and all referenced config and data

Labels:

  • smif
  • validation

  • cli method
  • read all linked config, accumulate list of errors (with line in file if possible)
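
A sketch of accumulating errors instead of stopping at the first one. It assumes a model run is a dict with `sos_model`, `scenarios` and `timesteps` keys, and that the names of the available configs are passed in; the function name is hypothetical:

```python
def collect_errors(model_run, sos_models, scenarios):
    """Check all linked config and return every problem found."""
    errors = []
    if model_run.get("sos_model") not in sos_models:
        errors.append("Unknown sos_model: {!r}".format(model_run.get("sos_model")))
    for name in model_run.get("scenarios", {}):
        if name not in scenarios:
            errors.append("Unknown scenario: {!r}".format(name))
    if not model_run.get("timesteps"):
        errors.append("No timesteps defined")
    return errors


problems = collect_errors(
    {"sos_model": "energy", "scenarios": {"population": "low"}, "timesteps": []},
    sos_models={"energy"},
    scenarios={"gva"},
)
```

Returning the full list lets the command print everything in one pass, so the user does not have to fix and rerun once per error.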

smif validate should validate model config and all sector model configurations

Labels:

  • cli
  • smif
  • validation

smif validate was removed in smif 0.6

See also #137789573

DataHandle should error if outputs are not provided by a SectorModel

Labels:

  • smif
  • validation

Since we know the outputs that are declared, and they are all written to results through a DataHandle, we could record what is written and error (after SectorModel.simulate returns) if any output was not recorded.
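
A minimal sketch of the recording idea. `RecordingHandle` is a simplified stand-in, not the real DataHandle, and the method names are assumptions:

```python
class RecordingHandle:
    """Stand-in handle that records which declared outputs are written."""

    def __init__(self, declared_outputs):
        self._declared = set(declared_outputs)
        self._written = set()

    def set_results(self, output_name, data):
        # A real handle would also persist the data
        self._written.add(output_name)

    def check_outputs(self):
        """Call after simulate returns; raise if any output was not written."""
        missing = sorted(self._declared - self._written)
        if missing:
            raise ValueError("Outputs not provided: {}".format(", ".join(missing)))


handle = RecordingHandle(["cost", "water_demand"])
handle.set_results("cost", [1.0, 2.0])
```

Here `handle.check_outputs()` would raise, because `water_demand` was declared but never written.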

DataHandle should warn if an input or parameter is not accessed by SectorModel.simulate

Labels:

  • smif
  • validation

Since we know the parameters and inputs that are declared, and they are all accessed through a DataHandle, we could record what is accessed and warn (after SectorModel.simulate returns) if any input or parameter was not used.
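
The same recording approach works for reads. In this sketch `AccessRecorder` is a hypothetical stand-in that wraps declared inputs and parameters and warns about any that were never accessed:

```python
import warnings


class AccessRecorder:
    """Stand-in handle that records which declared values are read."""

    def __init__(self, values):
        self._values = dict(values)
        self._accessed = set()

    def get(self, name):
        self._accessed.add(name)
        return self._values[name]

    def warn_unused(self):
        """Call after simulate returns; warn for each value never read."""
        unused = sorted(set(self._values) - self._accessed)
        for name in unused:
            warnings.warn("Declared but never accessed: {}".format(name), UserWarning)
        return unused


recorder = AccessRecorder({"population": 100, "discount_rate": 0.05})
population = recorder.get("population")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unused = recorder.warn_unused()
```

A warning rather than an error fits here, since an unused input is suspicious but does not make the results invalid.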

validation: raise warning if SectorModel instance (wrapper) returns outputs, but output.yaml is empty or contradictory

Labels:

  • smif
  • validation

At present, we do not check that the model outputs specified in output.yaml match the outputs actually generated by the SectorModel wrapper.
