asyncFileStorage issue #163

Open
bernardocaldas opened this issue Jan 14, 2024 · 5 comments

Comments

@bernardocaldas

I'm using the async file storage client. The cache path gets created, but no files are created when using the client. How can I check what's happening?

@karpetrosyan
Owner

Hi! Can you please provide more context about the issue, such as the Hishel version, the code, and the traceback?

@bernardocaldas
Author

bernardocaldas commented Jan 14, 2024

Hi! Sure thing.
hishel==0.0.21

Excerpts from the code are below. It is run from main.py with trio.run(get_ine_census_data, "2021", "0011699", True, True).

What am I doing wrong?

import hishel
import trio
from hishel import AsyncCacheClient

# region_list, scraper, and the *_DIMENSION_ID constants are not shown in this excerpt.

async def get_ine_census_data(year, indicador_id, save_to_parquet=False, save_to_csv=False):
    storage = hishel.AsyncFileStorage(base_path="test_cache")
    result_list = []
    # One cached client is shared by every task started in the nursery.
    async with AsyncCacheClient(storage=storage) as client, trio.open_nursery() as nursery:
        for region in region_list:
            REGION_DIMENSION_ID = region
            nursery.start_soon(
                scraper.get_data_async,
                f"pindica.jsp?op=2&varcd={indicador_id}&Dim1=S7A{year}&Dim2={REGION_DIMENSION_ID}&Dim3={GENDER_DIMENSION_ID}&Dim4={AGE_DIMENSION_ID}&Dim5={EMPLOYMENT_DIMENSION_ID}&Dim6={EDUCATION_DIMENSION_ID}&lang=PT",
                client,
                result_list,
            )

@Midnighter

Hello,

I have the same problem. Take this sample script:

import hishel
from anyio import run


async def main():
    storage = hishel.AsyncFileStorage(base_path=".cache")
    # The context manager closes the client once the request is done.
    async with hishel.AsyncCacheClient(storage=storage) as client:
        await client.get("https://httpbin.org/json")


if __name__ == "__main__":
    run(main)

When I run it, the .cache directory is created but empty.

tree .cache
.cache

0 directories, 0 files
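
The same check can be done from Python rather than with tree. The following is a minimal sketch using only the standard library; list_cache_files is a hypothetical helper name, not part of Hishel, and it assumes the file storage writes its entries as regular files somewhere under base_path.

```python
from pathlib import Path
from typing import List


def list_cache_files(base_path: str) -> List[str]:
    """Return the relative paths of all regular files under base_path, sorted.

    Hypothetical helper for inspecting a cache directory; not part of Hishel.
    """
    root = Path(base_path)
    # A missing directory is treated the same as an empty one.
    if not root.is_dir():
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


# Prints an empty list while nothing has been written to the cache directory.
print(list_cache_files(".cache"))
```

Calling it before and after a request shows whether that request stored anything.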

If I enable debug logging, I see the following:

DEBUG:asyncio:Using selector: EpollSelector
DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='/home/moritz/.pyenv/versions/hishel/lib/python3.11/site-packages/certifi/cacert.pem'
DEBUG:httpcore.connection:connect_tcp.started host='httpbin.org' port=443 local_address=None timeout=5.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7f2e1e5aaf90>
DEBUG:httpcore.connection:start_tls.started ssl_context=<ssl.SSLContext object at 0x7f2e1e546720> server_hostname='httpbin.org' timeout=5.0
DEBUG:httpcore.connection:start_tls.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7f2e1e16be50>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'GET']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'GET']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'GET']>
DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 23 May 2024 15:35:39 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'429'), (b'Connection', b'keep-alive'), (b'Server', b'gunicorn/19.9.0'), (b'Access-Control-Allow-Origin', b'*'), (b'Access-Control-Allow-Credentials', b'true')])
DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'GET']>
DEBUG:httpcore.http11:receive_response_body.complete
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
INFO:httpx:HTTP Request: GET https://httpbin.org/json "HTTP/1.1 200 OK"
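
Log lines in the LEVEL:logger:message form shown above are what Python's logging.basicConfig produces with its default format. A minimal sketch of turning that on for the httpx and httpcore loggers seen in the log follows; enable_http_debug_logging is a hypothetical helper name, and the choice of loggers is an assumption based only on the names that appear in this output.

```python
import logging


def enable_http_debug_logging(logger_names=("httpx", "httpcore")):
    """Emit DEBUG records from the given loggers (hypothetical helper)."""
    # With no arguments besides the level, basicConfig attaches a stderr
    # handler to the root logger using the "LEVEL:name:message" format.
    logging.basicConfig(level=logging.DEBUG)
    # Set the level on each named logger explicitly, so this still works
    # when the root logger was already configured elsewhere.
    for name in logger_names:
        logging.getLogger(name).setLevel(logging.DEBUG)


# Call this before creating the client so connection setup is logged too.
enable_http_debug_logging()
```

Child loggers such as httpcore.http11 inherit the level from httpcore, so they do not need to be listed separately.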

Package Information

Package  Version
hishel   0.0.26

Dependency Information

Package            Version
anysqlite          missing
boto3              missing
httpx              0.27.0
pyyaml             missing
redis              missing
typing-extensions  4.11.0

Build Tools Information

Package     Version
pip         24.0
setuptools  70.0.0
wheel       0.43.0

Platform Information

Linux 6.0.12-76060012-generic-x86_64
CPython 3.11.6

@karpetrosyan
Owner

karpetrosyan commented May 23, 2024

Hi!
It's expected that this response is not cached.
By default, Hishel works like a browser cache: it caches a response only if the response carries the appropriate caching headers. The response in the log above has none (no Cache-Control or Expires, for example). If you want to ignore that and cache the response anyway, use the force_cache extension, like so:

import hishel
from anyio import run


async def main():
    storage = hishel.AsyncFileStorage(base_path=".cache")
    async with hishel.AsyncCacheClient(storage=storage) as client:
        # force_cache stores the response even without caching headers.
        await client.get("https://httpbin.org/json", extensions={"force_cache": True})


if __name__ == "__main__":
    run(main)
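
To see in advance whether a response is likely to be cached without force_cache, its headers can be inspected for explicit freshness information. The sketch below is a deliberately simplified approximation and is not Hishel's implementation: looks_cacheable is a hypothetical helper, and it only checks for Cache-Control max-age or s-maxage, an Expires header, and the no-store directive. HTTPBIN_HEADERS is copied from the debug log earlier in this thread.

```python
from typing import Mapping


def looks_cacheable(headers: Mapping[str, str]) -> bool:
    """Rough check for explicit freshness information in response headers.

    Assumption: a simplified approximation of browser-style caching rules,
    not the logic Hishel actually runs.
    """
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Reduce Cache-Control to a set of directive names, e.g. {"max-age"}.
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in normalized.get("cache-control", "").split(",")
        if part.strip()
    }

    # no-store forbids storing the response at all.
    if "no-store" in directives:
        return False

    # Explicit freshness: a lifetime directive or an Expires header.
    return bool(directives & {"max-age", "s-maxage"}) or "expires" in normalized


# Headers of the https://httpbin.org/json response, taken from the log above.
HTTPBIN_HEADERS = {
    "Date": "Thu, 23 May 2024 15:35:39 GMT",
    "Content-Type": "application/json",
    "Content-Length": "429",
    "Connection": "keep-alive",
    "Server": "gunicorn/19.9.0",
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Credentials": "true",
}

print(looks_cacheable(HTTPBIN_HEADERS))  # False
print(looks_cacheable({"Cache-Control": "max-age=3600"}))  # True
```

The first result matches what was observed: the httpbin.org response carries nothing that tells a cache how long it stays fresh, so the .cache directory remains empty.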

@Midnighter

Oh, that makes a lot of sense, thank you!
