Execute several jupyter cells at the same time
Have you ever just sited and watched a long-running jupyter cell? Now, you can continue to work in the same notebook freely
gitdemo.mov
- You have a long-running cell, and you need to check something.
You can just start the second cell without interrupting a long-running cell.
Example: you run a machine learning train loop and want to immediately save the model's weights or check metrics. With
igogo
you can do so without interrupting the training. - If you need to compare the score of some function with different parameters, you can run several
functions at the same time and monitor results.
Example: you have several sets of hyperparameters and want to compare them. You can start training two models, monitoring two loss graphs at the same time.
- Process data in chunks. Check processed data for validity
Example: you do data processing in steps. With
igogo
you can execute several steps at the same time and process data from the first processing step in the second processing step in chunks. Also, you can quickly check that the first step produces the correct results
Igogo is available through PyPi:
pip install igogo
- No multithreading, no data races, no locks. You can freely operate with your notebook variables without the risk of corrupting them.
- Beautiful output. When several cells execute in parallel, all printed data is displayed in the corresponding cell's output. No more twisted and messed out concurrent outputs.
- Easily cancel jobs, wait for completion, and start the new ones.
- Control execution of jobs through widgets.
At the core of igogo is collaborative execution. Jobs need to explicitly allow other jobs to execute through igogo.yielder()
. Mind that regular cells also represent a job.
Placing igogo.yielder()
in code that is not executed in igogo job is not a mistake. It will return immediately. So, you don't need to care about keeping igogo.yielder()
only in igogo jobs. You can place it anywhere
To start an igogo job, you can use %%igogo
cell magic or function decorator.
import igogo
@igogo.job
def hello_world(name):
for i in range(3):
print("Hello, world from", name)
# allows other jobs to run while asleep
# also can be `igogo.yielder()`
igogo.sleep(1)
return name
Call function as usual to start a job:
hello_world('igogo'), hello_world('other igogo');
gitdemo.mov
Decorator @igogo.job
has several useful parameters.
kind
Allows to set how to render output. Possible options:text
,markdown
,html
Default:text
displays
As igogo job modify already executed cell, it needs to have spare placeholders for rich output. This parameter specifies how many spare displays to spawn. Default:1
name
User-friendly name of igogo job.warn_rewrite
Should warn rewriting older displays? Default:True
auto_display_figures
Should display pyplot figures created inside igogo automatically? Default:True
Markdown example:
gitdemo.mov
Pyplot figures will be automatically displayed in igogo cell.
You can also use igogo.display
inside a job to display any other content or several figures. Mind that displays must be pre-allocated by specifying displays number in igogo.job(displays=...)
import numpy as np
import matplotlib.pyplot as plt
import igogo
def experiment(name, f, i):
x = np.linspace(0, i / 10, 100)
fig = plt.figure()
plt.plot(
x,
f(x)
)
plt.gca().set_title(name)
igogo.display(fig)
fig = plt.figure()
plt.scatter(
x,
f(x)
)
plt.gca().set_title(name)
igogo.display(fig)
igogo.sleep(0.05)
As noted in "Configure jobs" section, igogo
jobs have limited number of displays.
If you try to display more objects than job has, warning will be shown and the oldest displays will be overwritten.
The same way with %%igogo
:
%load_ext igogo
%%igogo
name = 'igogo'
for i in range(3):
print("Hello, world from", name)
igogo.sleep(1)
All executed igogo
jobs spawn a widget that allows to kill them. Jobs are not affected by KeyboardInterrupt
Apart from killing through widgets, igogo
jobs can be killed programmatically.
igogo.stop()
Can be called insideigogo
job to kill itself.igogo.stop_all()
Stops all runningigogo
jobsigogo.stop_latest()
Stops the latestigogo
job. Can be executed several times.igogo.stop_by_cell_id(cell_id)
Kills all jobs that were launched in cell withcell_id
(aka [5], cell_id=5).
Also, you can stop jobs of one specific function.
hello_world.stop_all()
Stops alligogo
jobs created byhello_world()
Currently, igogo
runs fully correct on:
- Jupyter Lab
- Jupyter
Runs but has problems with output from igogo jobs. Jobs are executed, but there could be problems with widgets and output:
- VSCode. For some reason it does not update display data. Therefore, no output is produced.
- DataSpell. It displays
[object Object]
and not output. - Colab. It does not support updating content of executed cells
Screen.Recording.2023-03-24.at.14.14.03.mov
Also, you can modify training parameters, freeze/unfreeze layers, switch datasets, etc. All you need is to place igogo.yielder()
in train loop.
import igogo
import numpy as np
from tqdm.auto import tqdm
%load_ext igogo
raw_data = np.random.randn(100000, 100)
result = []
def row_processor(row):
return np.mean(row)
%%igogo
for i in tqdm(range(len(raw_data))):
result.append(row_processor(raw_data[i]))
igogo.yielder()
result[-1]
import igogo
import numpy as np
from tqdm.auto import tqdm
%load_ext igogo
raw_data = np.random.randn(5000000, 100)
igogo_yield_freq = 32
igogo_first_step_cache = []
result = []
%%igogo
for i in tqdm(range(len(raw_data))):
processed = np.log(raw_data[i] * raw_data[i])
igogo_first_step_cache.append(processed)
if i > 0 and i % igogo_yield_freq == 0:
igogo.yielder() # allow other jobs to execute
%%igogo
for i in tqdm(range(len(raw_data))):
while i >= len(igogo_first_step_cache): # wait for producer to process data
igogo.yielder()
result.append(np.mean(igogo_first_step_cache[i]))