unsupported dynamic function invocation (call to _cuprint) on CUDA 5.0 #425

Closed
charleskawczynski opened this issue Oct 6, 2023 · 6 comments

Comments

@charleskawczynski
Contributor

With CUDA.jl 4, we are able to call KernelAbstractions.@print inside a CUDA kernel, but it fails when we try to upgrade to CUDA.jl 5:

ERROR: LoadError: InvalidIRError: compiling MethodInstance for ClimaCore.DataLayouts.knl_copyto!(::ClimaCore.DataLayouts.VIJFH{Thermodynamics.PhaseEquil{Float32}, 4, CuDeviceArray{Float32, 5, 1}}, ::Base.Broadcast.Broadcasted{ClimaCore.DataLayouts.VIJFHStyle{4, CuArray{Float32, N, CUDA.Mem.DeviceBuffer} where N}, NTuple{5, Base.OneTo{Int64}}, typeof(ClimaAtmos.ts_gs), Tuple{Tuple{Thermodynamics.Parameters.ThermodynamicsParameters{Float32}}, Tuple{ClimaAtmos.TotalEnergy}, Tuple{ClimaAtmos.EquilMoistModel}, ClimaCore.DataLayouts.VIJFH{NamedTuple{(:e_tot, :q_tot), Tuple{Float32, Float32}}, 4, CuDeviceArray{Float32, 5, 1}}, ClimaCore.DataLayouts.VIJFH{Float32, 4, CuDeviceArray{Float32, 5, 1}}, ClimaCore.DataLayouts.VIJFH{Float32, 4, CuDeviceArray{Float32, 5, 1}}, ClimaCore.DataLayouts.VIJFH{Float32, 4, SubArray{Float32, 5, CuDeviceArray{Float32, 5, 1}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}}}}) resulted in invalid LLVM IR
--
  | Reason: unsupported dynamic function invocation (call to _cuprint(parts...) @ CUDA none:0)
  | Stacktrace:
  | [1] #__print
  | @ /central/scratch/esm/slurm-buildkite/climaatmos-ci/depot/cpu/packages/CUDA/nbRJk/src/CUDAKernels.jl:219
  | [2] macro expansion
  | @ /central/scratch/esm/slurm-buildkite/climaatmos-ci/depot/cpu/packages/KernelAbstractions/lhhMo/src/KernelAbstractions.jl:317
  | [3] print_T_guess
  | @ /central/scratch/esm/slurm-buildkite/climaatmos-ci/depot/cpu/packages/Thermodynamics/ALrOj/src/config_numerical_method.jl:29
  | [4] saturation_adjustment
@vchuravy
Member

vchuravy commented Oct 6, 2023

cc: @maleadt

@maleadt
Member

maleadt commented Oct 9, 2023

@print is covered by the KA.jl test suite, so the breakage is likely specific to the calling code.
Can you provide an MWE?
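
As a starting point for such an MWE, here is a minimal sketch of a kernel that calls @print. It is not the failing code from this issue. It assumes a KernelAbstractions 0.9-style API, and the kernel name print_index! is hypothetical. It runs on the CPU backend; to exercise the GPU path one would presumably swap in CUDABackend() from CUDA.jl and a CuArray.

```julia
using KernelAbstractions

# Hypothetical kernel for illustration: prints each global index
# and records it in the output array.
@kernel function print_index!(out)
    i = @index(Global, Linear)
    @print("global index: ", i, "\n")
    out[i] = i
end

backend = CPU()                      # assumption: CPU backend, so no GPU is needed
out = zeros(Int, 4)
kernel! = print_index!(backend, 4)   # instantiate with a workgroup size of 4
kernel!(out; ndrange = length(out))  # launch over all elements of `out`
KernelAbstractions.synchronize(backend)
```

If this runs on the GPU backend but the full application still fails, the difference is likely in the argument types that reach @print in the real call chain.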

@charleskawczynski
Contributor Author

charleskawczynski commented Oct 9, 2023

I tried making a minimal one, and that's not "working"... So I guess I'll start with the maximum working example first:

git clone https://github.com/CliMA/ClimaAtmos.jl
cd ClimaAtmos.jl
git checkout ck/up_cuda
julia --color=yes --project=examples perf/benchmark_step.jl --config_file config/perf_configs/gpu_implicit_barowave_moist.yml

It may take me some time to trim this down.

@charleskawczynski
Contributor Author

charleskawczynski commented Oct 9, 2023

OK, this is significantly less expensive and seems to be a reproducer (READ NEXT COMMENT: this issue might already be fixed by package updates):

using Revise
import ClimaCore.Domains as Domains
import ClimaCore.Topologies as Topologies
import ClimaCore.Spaces as Spaces
import ClimaCore.Meshes as Meshes
import ClimaCore.Geometry as Geometry
import ClimaComms

FT = Float32
context = ClimaComms.SingletonCommsContext(ClimaComms.device())
zelem = 10
helem = 4;
Nq = 4;
radius = FT(128);
zlim = (0, 1);
vertdomain = Domains.IntervalDomain(
    Geometry.ZPoint{FT}(zlim[1]),
    Geometry.ZPoint{FT}(zlim[2]);
    boundary_tags = (:bottom, :top),
);
vertmesh = Meshes.IntervalMesh(vertdomain, nelems = zelem);
vtopology = Topologies.IntervalTopology(context, vertmesh);
vspace = Spaces.CenterFiniteDifferenceSpace(vtopology);

hdomain = Domains.SphereDomain(radius);
hmesh = Meshes.EquiangularCubedSphere(hdomain, helem);
htopology = Topologies.Topology2D(context, hmesh);
quad = Spaces.Quadratures.GLL{Nq}();
hspace = Spaces.SpectralElementSpace2D(htopology, quad);
cspace = Spaces.ExtrudedFiniteDifferenceSpace(hspace, vspace);
fspace = Spaces.ExtrudedFiniteDifferenceSpace{Spaces.CellFace}(cspace);

import Thermodynamics as TD
t = FT(0)
thermo_params = TD.Thermodynamics.Parameters.ThermodynamicsParameters{FT}(
    273.16,
    101325.0,
    100000.0,
    1859.0,
    4181.0,
    2100.0,
    2.5008f6,
    2.8344f6,
    611.657,
    273.16,
    273.15,
    150.0,
    1000.0,
    298.15,
    6864.8,
    10513.6,
    0.2857143,
    8.31446,
    0.02897,
    0.01801528,
    290.0,
    220.0,
    9.81,
    233.0,
    1.0,
);

nt = (;
    ρ = FT(1),
    ᶜΦ = FT(1),
    ᶜK = FT(1),
    ᶜspecific = (; e_tot = FT(1), q_tot = FT(0.001)),
    ᶜts = zero(TD.PhaseEquil{FT}),
);
fields = fill(nt, cspace);
(; ρ, ᶜspecific, ᶜK, ᶜts, ᶜΦ) = fields;

function thermo_state(thermo_params, ρ, e_int, q_tot)
    get_ts(ρ::Real, e_int::Real, q_tot::Real) = TD.PhaseEquil_ρeq(
        thermo_params,
        ρ,
        e_int,
        q_tot,
        3,
        eltype(thermo_params)(0.003),
    )
    return get_ts(ρ, e_int, q_tot)
end

ts_gs(specific, K, Φ, ρ) =
    thermo_state(thermo_params, ρ, specific.e_tot - K - Φ, specific.q_tot)
@. ᶜts = ts_gs(ᶜspecific, ᶜK, ᶜΦ, ρ)

@charleskawczynski
Contributor Author

Now, interestingly:

So maybe there was an issue with that version of the CUDA runtime? I'm not sure, but I'm going to try to rebase and re-update CliMA/ClimaAtmos.jl#2199.

I'll report back what happens
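
For comparing the failing and working environments, a sketch like the following could be used to record the package and runtime versions. It assumes CUDA.jl and KernelAbstractions are in the active project; the output depends on the machine and is not shown here.

```julia
using Pkg

# Show the resolved versions of the two packages involved in this issue.
Pkg.status(["CUDA", "KernelAbstractions"])

using CUDA

# Print the CUDA runtime, driver, and toolchain details seen by CUDA.jl.
# Assumption: a functional GPU setup; otherwise this may warn or error.
CUDA.versioninfo()
```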

@charleskawczynski
Contributor Author

It looks like this fixed it
