Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove singleton metaclass #116

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

afaulconbridge
Copy link
Contributor

The use of a metaclass to make Resource a singleton is a bit counter-intuitive. Particularly as users won't often interact with this class directly, and I've been caught out by passing a value into SNPs(resource_dir='something') and not seeing the expected result because a Resource single had been created elsewhere.

This PR removes that singleton metaclass, and achieves the same behaviour by default by creating the instance as the default argument to the SNPs.__init__ method. Default method arguments are evaluated when the class is defined, therefore objects created as a default argument are singleton.

Users who want to manage the singleton themselves can use the resources_obj argument to pass the Resource instance to use. This also opens the possibility for having different sub-classes e.g. for storage backends such as a database, or AWS S3.

@codecov-io
Copy link

Codecov Report

Merging #116 (dde6a29) into develop (1989b81) will decrease coverage by 0.62%.
The diff coverage is 42.85%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #116      +/-   ##
===========================================
- Coverage    94.48%   93.86%   -0.63%     
===========================================
  Files            8        8              
  Lines         1468     1417      -51     
  Branches       252      253       +1     
===========================================
- Hits          1387     1330      -57     
- Misses          43       46       +3     
- Partials        38       41       +3     
Impacted Files Coverage Δ
src/snps/utils.py 88.73% <ø> (-0.88%) ⬇️
src/snps/snps.py 94.55% <20.00%> (-0.99%) ⬇️
src/snps/resources.py 94.73% <100.00%> (-0.34%) ⬇️
src/snps/ensembl.py 65.85% <0.00%> (-4.88%) ⬇️
src/snps/io/reader.py 97.87% <0.00%> (-0.02%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1989b81...dde6a29. Read the comment docs.

Copy link
Owner

@apriha apriha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @afaulconbridge . I agree that it can be easy to get tripped-up setting a value for a Resources object and not seeing it take effect due to the singleton.

But like you said, most users probably don't interact with this class, so I think it'd be cleaner to not expose the object and instead have parameters passed in through SNPs that the Resources object(s) would use.

One idea could be changing the resources_dir="resources" parameter to just resources="resources". Then, its usage could be expanded to handle URL strings. If a database URL was detected, it could instantiate the appropriately sub-classed Resources object.

As for the singleton, the original intent was to make loading resources more efficient by keeping the loaded data in memory and not having to reload a resource on subsequent invocations (which could also help when loading a resource repeatedly from a URL). Unfortunately I don't think passing an object via a kwarg alone makes it a singleton:

from snps import SNPs
from snps.resources import Resources

s = SNPs(resources_obj=Resources())
r = Resources()
print(id(r) == id(s._resources))
# False

So to summarize, I think a general-purpose resources=resources parameter that could take on paths and URLs would be more straightforward for users. As for the singleton, I can see pros (efficiency) and cons (unexpected results) of that construct. Since it causes confusion, I'm leaning towards removing it. What're your thoughts?

@afaulconbridge
Copy link
Contributor Author

This is a better example of how kwargs can be used as a default singleton:

from snps import SNPs
from snps.resources import Resources
s1 = SNPs()
s2 = SNPs()
id(s1._resources) == id(s2._resources)
# True on this branch
# True on master

However, on this branch this would then break down if it is manually specified :

s3 = SNPs(resources_dir="/tmp/resources")
s4 = SNPs(resources_dir="/tmp/resources")
id(s3._resources) == id(s4._resources)
# False on this branch
# True on master

Also note that this is what happens on master, which is the potentially confusing side-effect:

s5 = SNPs(resources_dir="/tmp/foo/resources")
s6 = SNPs(resources_dir="/tmp/bar/resources")
id(s5._resources) == id(s6._resources)
# True and will use /tmp/foo/resources for both

I like the idea of allow a filename or a url string, but I'm not sure that will cover all cases so falling back to a non-string object instead might be necessary.

I'm happy with a single "resources" parameter instead of separating parameters by type. It does break compatibility for some existing users, but that's probably a good thing to make sure everyone is aware of the change in meaning.

@apriha
Copy link
Owner

apriha commented Dec 14, 2020

Thanks for those examples, they help explain the issue. How about this?

  1. remove the singleton to remove confusion
  2. keep resources_dir="resources" for backwards compatibility (i.e., resources on the local filesystem)
  3. for new resources, add kwargs as necessary (e.g., resources_url="")

Then a regular Resources object will be instantiated based on those parameters, with preference for what functionality to use based on whether or not particular fields are specified (e.g., if resources_url is specified, use that resource).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants