fix cuda gnet_expand #401

Merged 2 commits on Jul 3, 2024

Conversation

enricozb (Contributor) commented Jul 3, 2024

The CUDA gnet_expand was overwriting the ROOT var without replacing it with what it previously pointed to.

Fixes #370

HigherOrderBot (Collaborator) commented

Perf run for 3a69a8a:

compiled
========

file            runtime         main            (local)       
==============================================================
sort_bitonic    c                       14.79s          12.45s
                cuda                     0.14s           0.14s
--------------------------------------------------------------
sum_rec         c                        2.40s           2.34s
                cuda                     0.05s           0.05s
--------------------------------------------------------------
sum_tree        c                        0.21s           0.25s
                cuda                     0.02s           0.02s
--------------------------------------------------------------
tuples          c                        9.11s          11.76s
                cuda                   timeout         timeout
--------------------------------------------------------------

interpreted
===========

file            runtime         main            (local)       
==============================================================
sort_bitonic    c                       11.11s          18.18s
                cuda                     0.14s           0.14s
                rust                   timeout         timeout
--------------------------------------------------------------
sum_rec         c                        3.20s           3.25s
                cuda                     0.05s           0.05s
                rust                   timeout         timeout
--------------------------------------------------------------
sum_tree        c                        0.37s           0.40s
                cuda                     0.01s           0.01s
                rust                     1.94s           1.90s
--------------------------------------------------------------
tuples          c                        7.05s           6.24s
                cuda                   timeout         timeout
                rust                     8.65s           8.64s
--------------------------------------------------------------


developedby (Member) commented

Closes Bend issue HigherOrderCO/Bend#610

Linked issue: CUDA differs from Rust