Commit

cleanup peak-valley case
jdries committed Nov 5, 2023
1 parent d788a56 commit ab8712f
Showing 5 changed files with 318 additions and 1,795 deletions.
8 changes: 5 additions & 3 deletions tutorial/_toc.yml
@@ -54,10 +54,12 @@ parts:
title: How to exploit data with openEO
- file: part3/data_exploitability_pangeo
title: How to exploit data on Pangeo
- file: part3/advanced_udf
title: Custom algorithms - UDF (OpenEO) and ufunc (Xarray)
- file: part3/udf_intro_openeo
title: openEO UDF - Introduction
- file: part3/peak_valley
title: Advanced OpenEO - Peak Valley Case
title: Advanced openEO UDF + ufunc (XArray) - Peak Valley Case
- file: part3/advanced_udf
title: Advanced openEO UDF - Machine Learning
- file: part3/scaling_openeo
title: Scaling with OpenEO

Binary file added tutorial/data/s2_meadow.nc
Binary file not shown.
60 changes: 0 additions & 60 deletions tutorial/part3/advanced_udf.ipynb
@@ -3,66 +3,6 @@
{
"cell_type": "markdown",
"source": [
"# OpenEO - User Defined Functions\n",
"\n",
"\n",
"The openEO user defined functions (UDF's), provide a way to run Pangeo code as part of a larger openEO workflow.\n",
"This allows you to get the best of both worlds:\n",
"- You can use openEO to get access to a wide variety and full archives of EO data.\n",
"- Preprocessing to ARD such as Sentinel-1 backscatter is easily done in openEO\n",
"- For very fast interactive debugging of the core of your algorithm, local testing based on Pangeo is faster than basically any cloud based service\n",
"- Once your algorithm is ready, you can run it in the cloud without worries thanks to openEO\n",
"\n",
"\n",
"To understand user defined functions, it is recommended to start here:\n",
"\n",
"https://open-eo.github.io/openeo-python-client/udf.html\n",
"\n",
"## When (not) to use UDF's\n",
"\n",
"Use UDF's when:\n",
"\n",
"1. You have an existing algorithm that is too large/complex to reimplement in terms of openEO processes.\n",
"2. You require a process that is not supported yet, and don't have the time to propose a new process that can be implemented by the backend.\n",
"\n",
"Do not use UDF's when:\n",
"\n",
"1. You start from scratch and the algorithm can be expressed in openEO processes\n",
"2. You want to achieve the highest possible level of portability, without depending on any specific technology.\n",
"3. You are running on a backend that simply does not support UDF's.\n",
"\n",
"## From UDF to predefined process\n",
"\n",
"The most widely heard criticism of UDF's is that they are bad for portability because they introduce a technology dependency in your workflow.\n",
"\n",
"We therefore have some general recommendations when implementing them. The general idea is that UDF's in fact allow to quickly test and use pieces of code\n",
"that can eventually become predefined processes. The fact that UDF's allow you to test a process idea in practice is actually a great\n",
"way to finetune the implementation and definition of your process. This avoids defining something which may require changes later on.\n",
"\n",
"1. Try to keep UDF's small and modular. Many authors tend to include bits that were relevant outside openEO, but actually can be removed.\n",
"2. Minimize your dependencies. OpenEO requires far fewer extras compared to a file based workflow. You may need to adjust some imports!\n",
"\n",
"When you identified a UDF that does one clear thing well, you may want to get in touch with the openEO team or your backend provider to see how it can\n",
"become an openEO predefined process. Also note that your implementation in Python may be the basis for an actual implementation in the backend!\n",
"\n",
"\n",
"## UDF with basic XArray code: peak-valley detection\n",
"\n",
"Algorithm type: single pixel timeseries analysis\n",
"\n",
"This example may require a `pip install fusets`\n",
"\n",
"The actual peak-valley algorithm can be found here:\n",
"\n",
"https://github.com/Open-EO/FuseTS/blob/main/src/fusets/peakvalley.py\n",
"\n",
"This is the 'UDF' entrypoint:\n",
"https://github.com/Open-EO/FuseTS/blob/main/src/fusets/openeo/peakvalley_udf.py#L18\n",
"\n",
"\n",
"\n",
"\n",
"https://github.com/Open-EO/FuseTS/blob/main/notebooks/AI4FOOD_PeakValley_Detection_NDVI_S2/peakvalley_detection.ipynb\n",
"\n",
"## UDF with machine learning\n",
"\n",
2,004 changes: 272 additions & 1,732 deletions tutorial/part3/peak_valley.ipynb

Large diffs are not rendered by default.

41 changes: 41 additions & 0 deletions tutorial/part3/udf_intro_openeo.md
@@ -0,0 +1,41 @@
# openEO - User Defined Functions


openEO user-defined functions (UDFs) provide a way to run Pangeo code as part of a larger openEO workflow.
This gives you the best of both worlds:
- You can use openEO to access a wide variety of EO data, including full archives.
- Preprocessing to analysis-ready data (ARD), such as Sentinel-1 backscatter, is easily done in openEO.
- For fast interactive debugging of the core of your algorithm, local testing based on Pangeo is faster than almost any cloud-based service.
- Once your algorithm is ready, you can run it in the cloud without worries, thanks to openEO.


To understand user-defined functions, start with the openEO Python client documentation:

https://open-eo.github.io/openeo-python-client/udf.html
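
To make the idea concrete, here is a minimal sketch of what a UDF entrypoint can look like, assuming the `apply_datacube(cube, context)` signature described in the documentation linked above. The `scale` context key and its default value are illustrative assumptions, not part of openEO. In a real UDF the `cube` argument is an xarray-based data cube; this sketch only relies on `cube` supporting multiplication by a float, so it can be checked locally with a plain number.

```python
from typing import Any, Optional


def apply_datacube(cube: Any, context: Optional[dict] = None) -> Any:
    """Rescale every value in the cube by a constant factor.

    In an openEO UDF, `cube` is an xarray-based data cube. The body only
    needs `cube * float` to work, so a plain number is enough for a quick
    local check.
    """
    # `scale` is a hypothetical parameter name read from the UDF context.
    scale = (context or {}).get("scale", 0.0001)
    return cube * scale


if __name__ == "__main__":
    # Local check, no openEO backend involved.
    print(apply_datacube(2000.0, {"scale": 0.5}))
```

Because the function is ordinary Python, it can be called and debugged locally before it is attached to an openEO workflow.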

## When (not) to use UDFs

Use UDFs when:

1. You have an existing algorithm that is too large or complex to reimplement in terms of openEO processes.
2. You require a process that is not supported yet, and you don't have the time to propose a new process that the backend can implement.

Do not use UDFs when:

1. You start from scratch and the algorithm can be expressed in openEO processes.
2. You want to achieve the highest possible level of portability, without depending on any specific technology.
3. You are running on a backend that simply does not support UDFs.

## From UDF to predefined process

The most widely heard criticism of UDFs is that they are bad for portability because they introduce a technology dependency into your workflow.

We therefore have some general recommendations for implementing them. The general idea is that UDFs let you quickly test and use pieces of code
that can eventually become predefined processes. Testing a process idea in practice is a great
way to fine-tune the implementation and definition of your process, and it avoids defining something that may require changes later on.

1. Try to keep UDFs small and modular. Many authors tend to include bits that were relevant outside openEO but can actually be removed.
2. Minimize your dependencies. openEO requires far fewer extras than a file-based workflow. You may need to adjust some imports!
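
As a rough illustration of the two recommendations above, the sketch below keeps the core algorithm in a small function that depends only on the Python standard library, so it can be tested locally and called from a thin UDF entrypoint. The function name `moving_average` and its `window` parameter are hypothetical, chosen only for this example; this is not the peak-valley algorithm.

```python
from typing import List, Sequence


def moving_average(series: Sequence[float], window: int = 3) -> List[float]:
    """Smooth a single-pixel time series with a centered moving average.

    Core algorithm only: no openEO imports and no file I/O, which keeps it
    easy to test locally and easy to port to a predefined process later.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        # Shrink the window at the edges instead of padding the series.
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    print(moving_average([1.0, 2.0, 3.0, 4.0], window=3))
```

A UDF entrypoint would then only need to unpack the data cube, call this function per pixel, and repack the result, which keeps the openEO-specific part as small as possible.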

Once you have identified a UDF that does one clear thing well, you may want to get in touch with the openEO team or your backend provider to see how it can
become an openEO predefined process. Also note that your Python implementation may become the basis for an actual implementation in the backend!
