Add ERA5 Monthly Single Levels 1979-2024 artifact #70
I think it would be better to have total column water in a separate file. We will likely add more atmos variables to that file, or replace it with something else. But we can also do this later.
Was there also a question of having surface variables in one file (some used by land, some by atmos), but 3d vars in a separate one only for atmos?
Yeah, I was thinking surface variables in one file, 2d variables (e.g., vertical integrals such as water vapor path) for atmosphere only in one file, and 3d variables for atmosphere either in one file or one variable per file, as 3d variables can be very large.
7. `mean_surface_runoff_rate`
8. `mean_sub_surface_runoff_rate`
9. `total_column_water`
10. `number`
we can define this too?
attrib = input_ds["longitude"].attrib,
)

new_times = map(input_ds["date"][:]) do t
Could you explain what lines 116-125 do?
end

convert_to_f32(x) = Float32(x)
convert_to_f32(x::Missing) = x
Out of curiosity, when was there missing data?
maybe runoff over ocean?
Downloading monthly averages by hour of day results in some weirdness with the time dimension, where there are duplicate entries. Some variables are defined on one of the duplicates and some on the other. I had dealt with this improperly (I assumed all variables were on one of the duplicates), which resulted in missing data. This is fixed now, and there is no missing data.
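For reference, the two `convert_to_f32` methods quoted above rely on multiple dispatch to leave `missing` untouched while converting everything else to `Float32`. A minimal sketch of that behavior follows; the `raw` vector is made-up illustration data, not values from the ERA5 files.

```julia
# The two methods from the diff: the generic one converts,
# the Missing-specific one passes the value through unchanged.
convert_to_f32(x) = Float32(x)
convert_to_f32(x::Missing) = x

# Hypothetical input mixing Float64 values and a missing entry.
raw = Union{Missing, Float64}[1.5, missing, 2.25]

# Broadcasting picks the matching method for each element.
converted = convert_to_f32.(raw)
```

With this pattern, a field that still contains `missing` keeps those entries as `missing` instead of throwing a conversion error.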
I think this looks good! Thank you! We can defer to @szy21 regarding the files
I just noticed the |
Thanks!
### `era5_monthly_surface_fluxes_197901-202410.nc`

This file contains Monthly averaged reanalysis from 1979 to present (October 2024 at time of creation), which is produced by averaging all daily data for each month. This results in 12*(2024-1979 + 10/12) = 550 points on the
time dimension, where each point is the 15th of the month that the point represents. For example, the 6th index of `time` is `19790601-01-15T00:00:00`,
Suggested change:
- time dimension, where each point is the 15th of the month that the point represents. For example, the 6th index of `time` is `19790601-01-15T00:00:00`,
+ time dimension, where each point is the 15th of the month that the point represents. For example, the 6th index of `time` is `1979-06-15T00:00:00`,
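As a sanity check on the README arithmetic, the time axis it describes can be rebuilt with the `Dates` standard library. This is only a sketch of the stated convention (one point per month, dated the 15th); the names `first_month`, `last_month`, and `times` are illustrative and do not come from the PR.

```julia
using Dates

# Assumed bounds, taken from the file name 197901-202410.
first_month = Date(1979, 1)
last_month = Date(2024, 10)

# One timestamp per month, placed on the 15th at 00:00:00.
times = [DateTime(year(d), month(d), 15) for d in first_month:Month(1):last_month]

n_points = length(times)   # number of points on the time dimension
sixth = times[6]           # the 6th entry, i.e. June 1979
```

Here `n_points` matches the 550 quoted in the README, and `sixth` is `1979-06-15T00:00:00`, in agreement with the suggested correction.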
ignore_mod_31 = [1, 3, 5, 7, 9, 11, 13]
time_indx = filter(i -> !(i % 31 in ignore_mod_31), 1:length(input_ds["time"][:]))
defDim(output_ds, "time", length(time_indx))
missing_indices = filter(i -> (i % 31 in ignore_mod_31), 1:length(input_ds["time"][:]));
for index in missing_indices
    if !all(input_ds["tcw"][:,:,index] .=== missing)
        @error "The index pattern of the invalid data is not as expected"
    end
end
What does this block of code do?
Downloading Monthly averages by hour of day results in some weirdness with the time dimension, where there are duplicate entries. Some variables are defined on one of the duplicates and some on the other. This removed the duplicate points which only hold missing data. I added a comment describing this.
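The idea of dropping time entries that hold only missing data can be illustrated on a toy array. Note that this sketch detects the all-missing slices directly, whereas the code in the PR assumes a fixed index pattern (`i % 31`) and then verifies it; the array `data` and the names `keep` and `cleaned` are hypothetical.

```julia
# Toy stand-in for one variable with dimensions (lon, lat, time).
# Time entries 2 and 4 play the role of the duplicate, all-missing points.
data = Array{Union{Missing, Float32}}(missing, 2, 2, 4)
data[:, :, 1] .= 1f0
data[:, :, 3] .= 3f0

# Keep only the time indices whose slice has at least one real value.
keep = filter(t -> !all(ismissing, view(data, :, :, t)), axes(data, 3))

# Subset along the time dimension.
cleaned = data[:, :, keep]
```

Detecting the slices from the data avoids hard-coding the pattern, at the cost of scanning every time step; the explicit check in the PR serves as a guard that the expected pattern actually holds.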
78c99b3 to f2e81f4
f2e81f4 to ec5fd8c
The issue here has conflicting information on whether there should be separate files for atmos and land calibration. It says "this can be in the same file but land wont use it" and "We can have separate data files for atmos and land calibration. Only surface fluxes variables will be shared, and they can stay in the land data file."
I currently have the variables in one file. I thought this made more sense because only one variable is not used by land, so the atmos calibration data file would contain a single variable.
Checklist:
- `$artifact_name`
- `README.md` in that folder
- `LICENSE` file
- `Project.toml` and `Manifest.toml`
- `OutputArtifacts.toml` file containing the information needed for package developers to add `$artifact_name` to their package
- (in `/groups/esm/ClimaArtifacts/artifacts/$artifact_name`)
- `Overrides.toml` on the Caltech Cluster (in `/groups/esm/ClimaArtifacts/artifacts/Overrides.toml`)
- `README.md` to point to the new artifact

Plots of monthly mean:

Plots of monthly means by hour of day (these plots are from 23:00-24:00):