Scaling OrthoLearners using Ray #793
Comments
Thanks for sharing. Would you be able to share your findings from the performance analysis?
@fverac Yes, we have done the performance analysis. We were able to run 1M units with about 500 covariates in ~7-8 minutes with the Ray-based implementation, versus more than ~40 minutes with the current implementation on an EC2 high-memory node.
This is a great achievement @v-shaal! I think if there is a way to seamlessly incorporate this Ray remote function framework, we should strongly consider it! Do you know what it would take to incorporate it into the library? Would you be willing to submit a PR with this improvement?
@fverac @vsyrgkanis, I would be glad to work on this and raise a PR. I am currently going over the current structure to figure out the best possible way to incorporate this with minimal changes to the existing code structure. Let me know if you have any suggestions.
@vsyrgkanis can you please assign this to me?
@vsyrgkanis @fverac @kbattocchi, I've raised a PR for this; kindly review and let me know your feedback.
Currently it is challenging to scale OrthoLearners to large datasets because the current implementation of `_crossfit` is sequential, which may not be efficient for large numbers of data points. To overcome this, we can use Ray remote functions (Ray tasks) to invoke each of the K folds remotely and asynchronously, running them simultaneously on separate Python workers. This can be done by simply modifying `_crossfit` in `_ortho_learner.py`.

We conducted a performance analysis of the EconML implementation of DML versus our DML_Ray version at varying scales (10k, 100k, and 1 million treated units), using approximately 500 covariates generated by the synthetic data generator API from https://github.com/py-why/dowhy/blob/main/dowhy/datasets.py.
Here's the link to the implementation of DML scaled via Ray that I have created; let me know your thoughts. A sketch of the parallelization idea follows the link. @amit-sharma @emrekiciman
https://gist.github.com/vishal-d11/cd886eb6bdff96ad5a04711cb18339ed#file-dml_ray-ipynb
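For illustration, here is a minimal sketch of the idea: each of the K cross-fitting folds is dispatched as a Ray task rather than fitted in a sequential loop. This is not EconML's actual `_crossfit` code; the `_fit_fold` helper and `crossfit_ray` wrapper are hypothetical stand-ins for the per-fold work that `_crossfit` in `_ortho_learner.py` performs.

```python
# Sketch only: parallel cross-fitting of a nuisance model with Ray tasks.
# `_fit_fold` and `crossfit_ray` are hypothetical names, not EconML APIs.
import numpy as np
import ray
from sklearn.base import clone
from sklearn.model_selection import KFold

ray.init(ignore_reinit_error=True)

@ray.remote
def _fit_fold(model, X, y, train_idx, test_idx):
    # Fit a fresh clone of the nuisance model on the training split and
    # return out-of-sample predictions for the held-out fold.
    fitted = clone(model).fit(X[train_idx], y[train_idx])
    return test_idx, fitted.predict(X[test_idx])

def crossfit_ray(model, X, y, n_splits=5):
    # Launch all K fold fits at once; each .remote call returns a future
    # immediately, so Ray can run the folds concurrently on separate
    # Python workers instead of one after another.
    folds = KFold(n_splits=n_splits).split(X)
    futures = [_fit_fold.remote(model, X, y, tr, te) for tr, te in folds]
    # ray.get blocks until every fold has finished, then we stitch the
    # out-of-sample predictions back into the original row order.
    preds = np.empty(len(y), dtype=float)
    for test_idx, fold_preds in ray.get(futures):
        preds[test_idx] = fold_preds
    return preds
```

Under these assumptions, the cross-fitted nuisance residuals used in the DML moment equation could then be formed as, e.g., `y - crossfit_ray(model_y, X, y)`, with total wall-clock time governed by the slowest fold rather than the sum of all K folds.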