add doc
b8raoult committed Mar 22, 2024
1 parent 0ae4902 commit 929a171
Showing 32 changed files with 979 additions and 2 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -178,3 +178,8 @@ bar
test.py
cutout.png
*.out

_build/
?
?.*
~*
16 changes: 15 additions & 1 deletion .pre-commit-config.yaml
@@ -37,7 +37,7 @@ repos:
- --line-length=120
- --fix
- --exit-non-zero-on-fix
- --preview
- --preview
- --exclude
- 'dev/*.py'
#- repo: https://github.com/pre-commit/mirrors-mypy
@@ -46,3 +46,17 @@ repos:
# - id: mypy
# verbose: true
# entry: bash -c 'mypy "$@" || true' --
- repo: https://github.com/dzhu/rstfmt
  rev: v0.0.14
  hooks:
  - id: rstfmt

# - repo: https://github.com/rstcheck/rstcheck
#   rev: v6.2.0
#   hooks:
#   - id: rstcheck
#     args:
#     - '--ignore-roles'
#     - 'doc'
#     - '--ignore-directives'
#     - 'toctree'
Binary file added docs/_static/logo.png
48 changes: 48 additions & 0 deletions docs/_static/style.css
@@ -0,0 +1,48 @@
.wy-side-nav-search {
    background-color: #f7f7f7;
}

/* There is a clash between the xarray notebook styles and the Read the Docs theme */

.rst-content dl.xr-attrs dt {
    all: revert;
    font-size: 95%;
    white-space: nowrap;
}

.rst-content dl.xr-attrs dd {
    font-size: 95%;
}

.xr-wrap {
    font-size: 85%;
}

.wy-table-responsive table td, .wy-table-responsive table th {
    white-space: inherit;
}

/*
.wy-table-responsive table td,
.wy-table-responsive table th {
    white-space: normal !important;
    vertical-align: top !important;
}
.wy-table-responsive {
    margin-bottom: 24px;
    max-width: 100%;
    overflow: visible;
} */

/* Hide notebook warnings */
.nboutput .stderr {
    display: none;
}

/*
Set logo size
*/
.wy-side-nav-search .wy-dropdown > a img.logo, .wy-side-nav-search > a img.logo {
    width: 200px;
}
Empty file added docs/_templates/.gitkeep
Empty file.
5 changes: 5 additions & 0 deletions docs/apply-fmt.sh
@@ -0,0 +1,5 @@
:
for n in $(find . -name '*.rst')
do
    rstfmt "$n"
done
7 changes: 7 additions & 0 deletions docs/check-index.sh
@@ -0,0 +1,7 @@
:
# See https://github.com/vscode-restructuredtext/vscode-restructuredtext/issues/280
for n in $(find . -name '*.rst')
do
    m=$(echo "$n" | sed 's/\.rst$//' | sed 's,^\./,,')
    grep -E ":doc:.$m" index.rst > /dev/null || echo "$m"
done
81 changes: 81 additions & 0 deletions docs/conf.py
@@ -0,0 +1,81 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))

import datetime

# top = os.path.realpath(os.path.dirname(os.path.dirname(__file__)))
# sys.path.insert(0, top)


source_suffix = ".rst"
master_doc = "index"
pygments_style = "sphinx"
html_theme_options = {"logo_only": True}
html_logo = "_static/logo.png"


# -- Project information -----------------------------------------------------

project = "Anemoi"

author = "ECMWF"

year = datetime.datetime.now().year
if year == 2024:
    years = "2024"
else:
    years = "2024-%s" % (year,)

copyright = "%s, ECMWF" % (years,)


release = "0.1.0"


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx_rtd_theme",
    "nbsphinx",
]

# Add any paths that contain templates here, relative to this directory.
# templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "**.ipynb_checkpoints"]


# https://www.notion.so/Deepnote-Launch-Buttons-63c642a5e875463495ed2341e83a4b2a


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
html_css_files = ["style.css"]
27 changes: 27 additions & 0 deletions docs/datasets/about.rst
@@ -0,0 +1,27 @@
##################
Training dataset
##################

Training datasets are large array-like objects encoded in Zarr_ format.

The array has the following dimensions:

.. figure:: data.png
   :alt: Data layout

The first dimension is time, the second is the variables (e.g.
temperature, pressure), the third is the ensemble, and the fourth is the
values at the grid points.

This structure provides an efficient way to build the training dataset,
as input and output of the model are simply consecutive slices of the
array.

.. code:: python

   x, y = ds[n], ds[n+1]
   y_hat = model.predict(x)
   loss = model.loss(y, y_hat)

.. _zarr: https://zarr.readthedocs.io/
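
The slicing pattern described in this file can be tried on a toy array. The following is a minimal sketch in plain Python, assuming nested lists stand in for the Zarr-backed array; `make_toy_dataset` and `training_pairs` are hypothetical helper names used for illustration only, not part of Anemoi.

```python
# Toy stand-in for a training dataset with dimensions
# (dates, variables, ensemble members, grid points).
# Nested lists are an assumption made for illustration; the real data is a Zarr array.


def make_toy_dataset(n_dates, n_variables, n_members, n_points):
    """Build a nested list in which every value is the index of its date."""
    return [
        [[[float(d)] * n_points for _ in range(n_members)] for _ in range(n_variables)]
        for d in range(n_dates)
    ]


def training_pairs(ds):
    """Yield (input, target) pairs made of consecutive slices along the date axis."""
    for n in range(len(ds) - 1):
        yield ds[n], ds[n + 1]


ds = make_toy_dataset(n_dates=4, n_variables=3, n_members=1, n_points=5)
pairs = list(training_pairs(ds))
```

Each pair holds the state at one date and the state at the next date, so a dataset with four dates gives three training samples.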
Binary file added docs/datasets/build.png
80 changes: 80 additions & 0 deletions docs/datasets/building.rst
@@ -0,0 +1,80 @@
.. _datasets-building:

###################
Building datasets
###################

..
   .. figure:: build.png
   ..
      :alt: Building datasets
   ..
      :scale: 50%

**********
Concepts
**********

date
   Throughout this document, the term `date` refers to a date and time,
   not just a date. A training dataset covers a continuous range of
   dates with a given frequency. Missing dates are still part of the
   dataset, but the data is missing and is marked as such using NaNs.
   Dates are always in UTC and refer to the date at which the data is
   valid. For accumulations and fluxes, that is the end of the
   accumulation period.

variable
   A `variable` is a meteorological parameter, such as temperature or
   wind. Multilevel parameters are treated as separate variables, one
   for each level. For example, temperature at 850 hPa and temperature
   at 500 hPa are treated as two separate variables (`t_850` and
   `t_500`).

field
   A `field` is a variable at a given date. It is represented by an
   array of values, one for each grid point.

source
   A `source` is a software component that, given a list of dates and
   variables, returns the corresponding fields. Examples of sources are
   ECMWF's MARS archive, a collection of GRIB or NetCDF files, or a
   database. See :ref:`dataset-sources` for more information.

filter
   A `filter` is a software component that takes as input the output of
   a source or of another filter, and can modify the fields and/or
   their metadata. Typical filters are interpolations and renaming of
   variables. See :ref:`dataset-filters` for more information.
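
The `date`, `variable` and `field` concepts can be illustrated with a small sketch. It assumes a single float stands in for a whole field, and the helper names (`expected_dates`, `variable_name`, `fields_for`) are hypothetical; they only mirror the definitions given here and are not the Anemoi API.

```python
import datetime
import math


def expected_dates(start, end, frequency):
    """Return every date from start to end (both included) at the given frequency."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += frequency
    return dates


def variable_name(param, level=None):
    """Name a variable; a multilevel parameter gets one name per level."""
    return param if level is None else f"{param}_{level}"


def fields_for(variable, dates, available):
    """Map each date to its field, using NaN where the data is missing."""
    return {d: available.get((variable, d), math.nan) for d in dates}


# Dates are date-times in UTC.
utc = datetime.timezone.utc
start = datetime.datetime(2020, 1, 1, 0, tzinfo=utc)
end = datetime.datetime(2020, 1, 1, 18, tzinfo=utc)
dates = expected_dates(start, end, datetime.timedelta(hours=6))

# Only two of the four dates have data; a float stands in for a full field.
available = {
    (variable_name("t", 850), dates[0]): 271.3,
    (variable_name("t", 850), dates[2]): 272.1,
}
t_850 = fields_for("t_850", dates, available)
```

The missing dates stay in the result, so the date axis remains continuous and the gaps are visible as NaNs.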

************
Operations
************

In order to build a training dataset, sources and filters are combined
using the following operations:

join
   The join is the process of combining data from several sources. Each
   source is expected to provide different variables at the same dates.

pipe
   The pipe is the process of transforming fields using filters. The
   first step of a pipe is typically a source, a join or another pipe.
   The following steps are filters.

concat
   The concatenation is the process of combining different sets of
   operations that handle different dates. This is typically used to
   build a dataset that spans several years when several sources are
   involved, each providing a different period.
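
The three operations can be sketched as higher-order functions. This is an illustrative model only, under stated assumptions: a source is a function that takes a list of dates and returns a dict mapping `(date, variable)` to a field, a filter is a function from such a dict to a new one, and dates are plain integers. None of the names below (`join`, `pipe`, `concat`, `constant_source`, `rename`) are taken from the Anemoi code.

```python
# Hypothetical model of sources, filters and the three operations.


def join(*sources):
    """Combine sources that provide different variables at the same dates."""
    def joined(dates):
        fields = {}
        for source in sources:
            fields.update(source(dates))
        return fields
    return joined


def pipe(first, *filters):
    """Feed the output of a source, join or pipe through a chain of filters."""
    def piped(dates):
        fields = first(dates)
        for apply_filter in filters:
            fields = apply_filter(fields)
        return fields
    return piped


def concat(*parts):
    """Combine (accepts_date, operation) pairs that handle different dates."""
    def concatenated(dates):
        fields = {}
        for accepts_date, operation in parts:
            fields.update(operation([d for d in dates if accepts_date(d)]))
        return fields
    return concatenated


def constant_source(variable, value):
    """Toy source returning the same value for one variable at every date."""
    return lambda dates: {(d, variable): value for d in dates}


def rename(old, new):
    """Toy filter renaming one variable."""
    def apply_filter(fields):
        return {(d, new if v == old else v): f for (d, v), f in fields.items()}
    return apply_filter


# Dates 0 and 1 come from a joined source followed by a renaming filter;
# dates 2 and 3 come from another set of sources.
early = pipe(
    join(constant_source("2t", 280.0), constant_source("msl", 1013.0)),
    rename("2t", "t2m"),
)
late = join(constant_source("t2m", 281.0), constant_source("msl", 1009.0))
dataset = concat((lambda d: d < 2, early), (lambda d: d >= 2, late))
result = dataset([0, 1, 2, 3])
```

In this sketch the renaming filter makes the two periods expose the same variable names, which is what allows them to be concatenated into one dataset.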

*****************
Getting started
*****************

.. literalinclude:: building.yaml
   :language: yaml
44 changes: 44 additions & 0 deletions docs/datasets/building.yaml
@@ -0,0 +1,44 @@
description: Example dataset

dates:
  start: 2020-01-01 00:00:00
  end: 2023-12-31 18:00:00
  frequency: 6h

build:
  group_by: monthly

input:
  join:
    - mars:
        class: ea
        param: [10u, 10v, 2d, 2t, msl, skt, sp, tcw, lsm, sdor, slor, z]
        levtype: sfc

    - mars:
        class: ea
        param: [r, t, u, v, w, z]
        levtype: pl
        level: [50, 100, 150, 200, 250, 300, 400, 500, 700, 850, 925, 1000]

    - constants:
        template: ${input.join.0.mars}
        param:
          - cos_latitude
          - cos_longitude
          - sin_latitude
          - sin_longitude
          - cos_julian_day
          - cos_local_time
          - sin_julian_day
          - sin_local_time
          - insolation

output:
  order_by:
    - valid_datetime
    - param_level
    - number
  statistics: param_level
  remapping:
    param_level: "{param}_{levelist}"
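
To get a feel for the size of the dataset this recipe describes, the date and variable axes can be worked out by hand. The sketch below copies the values from the recipe. It assumes that the `{param}_{levelist}` remapping produces names such as `t_850` for pressure-level fields and leaves fields without a level under their parameter name; that behaviour is an assumption, not something this commit states.

```python
import datetime

# Values copied from the recipe.
start = datetime.datetime(2020, 1, 1, 0)
end = datetime.datetime(2023, 12, 31, 18)
frequency = datetime.timedelta(hours=6)

surface = ["10u", "10v", "2d", "2t", "msl", "skt", "sp", "tcw", "lsm", "sdor", "slor", "z"]
pressure = ["r", "t", "u", "v", "w", "z"]
levels = [50, 100, 150, 200, 250, 300, 400, 500, 700, 850, 925, 1000]
constants = [
    "cos_latitude",
    "cos_longitude",
    "sin_latitude",
    "sin_longitude",
    "cos_julian_day",
    "cos_local_time",
    "sin_julian_day",
    "sin_local_time",
    "insolation",
]

# Number of dates in the range, both ends included.
n_dates = (end - start) // frequency + 1

# Assumed naming: one variable per (parameter, level) pair on pressure levels,
# and the bare parameter name for everything else.
names = surface + [f"{p}_{lev}" for p in pressure for lev in levels] + constants
```

Under these assumptions the recipe covers 5844 dates and 93 variables (12 surface, 72 pressure-level and 9 constant fields).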
Binary file added docs/datasets/concat.png
Binary file added docs/datasets/data.png
5 changes: 5 additions & 0 deletions docs/datasets/filters.rst
@@ -0,0 +1,5 @@
.. _dataset-filters:

#########
Filters
#########
Binary file added docs/datasets/images.pptx
Binary file not shown.
Binary file added docs/datasets/join.png
