Fix timeout crash on UChicago #163

Open
gordonwatts opened this issue Oct 11, 2024 · 1 comment
Assignees: gordonwatts
Labels: bug (Something isn't working), servicex (Related to SX tests)

Comments

@gordonwatts
Member

Not sure what the root cause is, but at the very least we need better diagnostics:

[bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py --distributed-client scheduler --dask-scheduler tcp://dask-gwatts-5271bc40-1.af-jupyter:8786 --sx-name servicex-uc-af --num-files 0  --query xaod_small --dataset data_50TB  --dask-profile
0000.8895 - WARNING - func_adl.type_based_replacement - Unknown type for name len

/venv/lib/python3.9/site-packages/distributed/client.py:3161: UserWarning: Sending large graph of size 38.11 MiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
  warnings.warn(
Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 234, in _cat_file
    async with session.get(self.encode_url(url), **kw) as r:
  File "/venv/lib/python3.9/site-packages/aiohttp/client.py", line 1353, in __aenter__
    self._resp = await self._coro
  File "/venv/lib/python3.9/site-packages/aiohttp/client.py", line 684, in _request
    await resp.start(conn)
  File "/venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1014, in start
    self._continue = None
  File "/venv/lib/python3.9/site-packages/aiohttp/helpers.py", line 713, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 501, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 181, in main
    report, n_events = dask.compute(*calculate_n_events(dataset_files, 4))
  File "/venv/lib/python3.9/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1316, in __call__
    (result, counters), duration = with_duration(self._call_impl)(
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1154, in wrapper
    result = f(*args, **kwargs)
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1268, in _call_impl
    ttree = uproot._util.regularize_object_path(
  File "/venv/lib/python3.9/site-packages/uproot/_util.py", line 967, in regularize_object_path
    file = ReadOnlyFile(
  File "/venv/lib/python3.9/site-packages/uproot/reading.py", line 573, in __init__
    self._begin_chunk = self._source.chunk(
  File "/venv/lib/python3.9/site-packages/uproot/source/fsspec.py", line 98, in chunk
    data = self._fs.cat_file(self._file_path, start=start, end=stop)
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 101, in sync
    raise FSTimeoutError from return_result
fsspec.exceptions.FSTimeoutError
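
The traceback above does not say which file was being read or how long the read waited before timing out. A minimal sketch of a wrapper that adds that context is below. The names `read_with_context` and `FileReadTimeout` are hypothetical and not part of this repository, and `read` stands in for whatever callable actually opens the file. It assumes that `fsspec.exceptions.FSTimeoutError` derives from `asyncio.TimeoutError`, which is my understanding of fsspec but worth confirming for the installed version.

```python
import asyncio
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FileReadTimeout(RuntimeError):
    """Timeout annotated with the file being read (hypothetical helper)."""


def read_with_context(url: str, read: Callable[[str], T]) -> T:
    """Call read(url); on a timeout, re-raise with the URL and elapsed time."""
    start = time.monotonic()
    try:
        return read(url)
    except asyncio.TimeoutError as exc:
        # Assumption: fsspec's FSTimeoutError subclasses asyncio.TimeoutError,
        # so this clause should catch the error shown in the traceback.
        elapsed = time.monotonic() - start
        raise FileReadTimeout(
            f"timed out after {elapsed:.1f}s reading {url}"
        ) from exc
```

Chaining with `from exc` keeps the original fsspec/aiohttp traceback attached, so nothing is lost while the failing URL becomes visible in the top-level message.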
gordonwatts added the bug and servicex labels on Oct 11, 2024
gordonwatts self-assigned this on Oct 11, 2024
@ponyisi
Collaborator

ponyisi commented Oct 19, 2024

Hi Gordon - fwiw, I don't consider this a ServiceX bug per se; it appears to be an issue either in uproot.dask or in the Ceph object store at the AF. Could you try e.g. servicex-prod and see whether you get the same problem?
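
One way to run the comparison suggested here without going through the full dask graph is to read only the first bytes of each delivered file and count timeouts per backend. The sketch below is only an illustration: `probe_backend`, `compare_backends`, and the `read_head` callable are hypothetical names, and the lists of file URLs are assumed to come from wherever each ServiceX instance reports its output.

```python
import asyncio
import time
from typing import Callable, Dict, Iterable, List, Tuple

Timing = Tuple[str, float]  # (url, seconds spent on the read attempt)


def probe_backend(
    urls: Iterable[str], read_head: Callable[[str], bytes]
) -> Tuple[List[Timing], List[Timing]]:
    """Try to read the start of every file; return (succeeded, timed_out)."""
    ok: List[Timing] = []
    timed_out: List[Timing] = []
    for url in urls:
        start = time.monotonic()
        try:
            read_head(url)
        except asyncio.TimeoutError:
            timed_out.append((url, time.monotonic() - start))
        else:
            ok.append((url, time.monotonic() - start))
    return ok, timed_out


def compare_backends(
    files_by_backend: Dict[str, List[str]],
    read_head: Callable[[str], bytes],
) -> Dict[str, Dict[str, int]]:
    """Count successful and timed-out reads for each named backend."""
    summary: Dict[str, Dict[str, int]] = {}
    for name, urls in files_by_backend.items():
        ok, timed_out = probe_backend(urls, read_head)
        summary[name] = {"ok": len(ok), "timed_out": len(timed_out)}
    return summary
```

With fsspec, `read_head` could plausibly be something like `lambda url: fs.cat_file(url, start=0, end=1024)`, mirroring the `cat_file` call in the traceback. If timeouts show up only for files served from the AF object store, that would point at storage rather than at uproot.dask.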
