`DataLoader` trying to run in workers when there is a notebook-local implementation of a `Dataset`. I suppose a solution is to move the `ToyDataset` class into a separate `.py` file, but is this the expected behavior? Depending on external files also means that my project needs to have modules fully set up and functional.

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Program Files\Python312\Lib\multiprocessing\spawn.py", line 122, in spawn_main
        exitcode = _main(fd, parent_sentinel)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Program Files\Python312\Lib\multiprocessing\spawn.py", line 132, in _main
        self = reduction.pickle.load(from_parent)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: Can't get attribute 'ToyDataset' on <module '__mp_main__' from 'listing_A_part_2.py'>

    Traceback (most recent call last):
      File "listing_A_part_2.py", line 212, in <module>
        app.run()
      File ".venv\Lib\site-packages\marimo\_ast\app.py", line 298, in run
        outputs, glbls = AppScriptRunner(InternalApp(self)).run()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\marimo\_runtime\app\script_runner.py", line 111, in run
        raise e.__cause__ from None # type: ignore
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\marimo\_runtime\executor.py", line 170, in execute_cell
        exec(cell.body, glbls)
      File "listing_A_part_2.py", line 189, in <module>
        for _batch_idx, (_features, _labels) in enumerate(train_loader):
                                                ^^^^^^^^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\torch\utils\data\dataloader.py", line 630, in __next__
        data = self._next_data()
               ^^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\torch\utils\data\dataloader.py", line 1327, in _next_data
        idx, data = self._get_data()
                    ^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\torch\utils\data\dataloader.py", line 1293, in _get_data
        success, data = self._try_get_data()
                        ^^^^^^^^^^^^^^^^^^^^
      File ".venv\Lib\site-packages\torch\utils\data\dataloader.py", line 1144, in _try_get_data
        raise RuntimeError(f'DataLoader worker (pid(s) {pids_str}) exited unexpectedly') from e
    RuntimeError: DataLoader worker (pid(s) 22260) exited unexpectedly
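For what it's worth, the `AttributeError` in the worker traceback is consistent with how `pickle` treats classes: it stores a reference (module name plus class name) rather than the class body, so the spawned worker has to be able to look the class up by itself. Below is a minimal stdlib-only sketch of that mechanism. The module name `toy_dataset_module` and the stand-in `ToyDataset` are hypothetical, and the sketch uses neither torch nor marimo, so treat it as an illustration rather than a reproduction of the issue.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "toy_dataset_module"  # hypothetical name, for illustration only
MODULE_SOURCE = '''
class ToyDataset:
    """Stand-in for a torch Dataset; only the pickling behavior matters here."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)
'''

with tempfile.TemporaryDirectory() as tmp:
    # Put the class in its own importable file, as suggested in the thread.
    Path(tmp, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module(MODULE_NAME)
        payload = pickle.dumps(module.ToyDataset([1, 2, 3]))

        # The payload names the module and class; it does not carry the class body.
        payload_names_module = MODULE_NAME.encode() in payload

        # Unpickling works while the class can be looked up on the module.
        restored_len = len(pickle.loads(payload))

        # Simulate a process in which the class is missing from that module.
        del module.ToyDataset
        try:
            pickle.loads(payload)
            lookup_error = None
        except AttributeError as exc:
            lookup_error = exc
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(restored_len, type(lookup_error).__name__)
```

The second `pickle.loads` call fails with the same kind of lookup error as the worker above, which is why moving the class into a module the worker can import avoids the crash.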
`ToyDataset` class into its own file and imported from there, so I can move on. It is kinda a pain to set up the `pyproject.toml` and the editable install just for that.

`npm init react-app my-app`, but can't say I've seen it in Python.

`marimo-use-custom-dataloader` branch here: https://github.com/ngbrown/build-llm-from-scratch/tree/marimo-use-custom-dataloader/llm_from_scratch/appx_a/listing_A_part_2.py

`uv` and what is the right thing to keep it compatible with `marimo edit --sandbox`.

`marimo edit` but the error wasn't copyable. There was something in the middle that interrupted the selection. So that's why I pasted the error from the command line.

`marimo edit --sandbox`. I'm getting the following error:

    > marimo edit --sandbox .\llm_from_scratch\appx_a\listing_A_part_2.py
    Running in a sandbox: uv run --isolated --no-project --with-requirements C:\Users\USER\AppData\Local\Temp\tmpl00x1659.txt marimo edit .\llm_from_scratch\appx_a\listing_A_part_2.py
      × No solution found when resolving `--with` dependencies:
      ╰─▶ Because llm-from-scratch was not found in the package registry and you require llm-from-scratch, we can conclude that your requirements are unsatisfiable.
`uv add --script .\llm_from_scratch\appx_a\listing_A_part_2.py ./` and `uv add --script .\llm_from_scratch\appx_a\listing_A_part_2.py torch==2.4.1+cu121 --index-url https://download.pytorch.org/whl/cu121` to populate the `/// script` header of the `.py` file, and then had to manually add the `extra-index-url` option.

To me this says the target file is copied somewhere by itself so there's no possibility of using shared code in the project folder.
`llm_from_scratch` shouldn't be added as a dependency
`DataLoader` spawns separate processes and can't access the `Dataset` that it needs from within a Marimo notebook cell function, so I need to move that class into its own file and somehow a notebook needs to import modules from the local directory. As far as I know Python doesn't have a way to import a bare file (it needs the `__init__.py` marker file making the directory contents modules). Is there another preferred way?

`uv-sources`. I just cloned your repo and tried it, and it works (using marimo 0.9.10).

`uv-sources` your use case makes sense.

`.py` file header are relative to the command line, not the file itself. So for this:

    # [tool.uv.sources]
    # llm-from-scratch = { path = "../../" }

    > marimo edit --sandbox .\llm_from_scratch\appx_a\listing_A_part_2.py

    > cd .\llm_from_scratch\appx_a\
    llm_from_scratch\appx_a> marimo edit --sandbox listing_A_part_2.py

I was expecting the paths to be relative to the file itself, not the current directory of the command running it. Is there a spec on the behavior?
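To make the difference between the two invocations concrete, here is a small sketch that resolves the `"../../"` source path against each working directory. It assumes, as observed in this thread, that the path is resolved against the directory the command is run from. The checkout location `/home/user/build-llm-from-scratch` is hypothetical, and POSIX-style paths are used only to keep the result platform-independent.

```python
import posixpath

SOURCE_PATH = "../../"  # the value from the [tool.uv.sources] entry in the thread

# Hypothetical checkout location, for illustration only.
REPO_ROOT = "/home/user/build-llm-from-scratch"
NOTEBOOK_DIR = posixpath.join(REPO_ROOT, "llm_from_scratch", "appx_a")


def resolve_from(working_dir: str, relative: str) -> str:
    """Resolve a relative path against the given working directory."""
    return posixpath.normpath(posixpath.join(working_dir, relative))


# Running from the repository root: the path climbs out of the checkout.
from_repo_root = resolve_from(REPO_ROOT, SOURCE_PATH)

# Running from the notebook's own directory: the path lands on the repository root.
from_notebook_dir = resolve_from(NOTEBOOK_DIR, SOURCE_PATH)

print(from_repo_root)
print(from_notebook_dir)
```

Under that assumption, only the invocation from `llm_from_scratch\appx_a` points the source at the project itself.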
`python my_directory/my_script.py`, the current working directory will be the directory of the command. `mo.notebook_dir()`), but I guess that won't help for the script metadata.

`uv` also matches the Python CLI's behavior when determining the current working directory.
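The Python CLI behavior mentioned here can be checked with a short stdlib-only sketch: it writes a throwaway `my_directory/my_script.py` into a temporary folder, runs it from the parent folder, and compares the working directory the script reports with the script's own directory. The file layout is made up for the demonstration, and the sketch says nothing about how `uv` or marimo resolve paths internally.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# The child script reports its working directory and its own location.
SCRIPT = (
    "from pathlib import Path\n"
    "print(Path.cwd().resolve())\n"
    "print(Path(__file__).resolve().parent)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    launch_dir = Path(tmp).resolve()
    script_dir = launch_dir / "my_directory"
    script_dir.mkdir()
    (script_dir / "my_script.py").write_text(SCRIPT)

    # Equivalent to running `python my_directory/my_script.py` from launch_dir.
    result = subprocess.run(
        [sys.executable, "my_directory/my_script.py"],
        cwd=launch_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    reported_cwd, reported_script_dir = (
        Path(line) for line in result.stdout.splitlines()
    )

    cwd_is_launch_dir = reported_cwd == launch_dir
    file_dir_is_script_dir = reported_script_dir == script_dir

print(cwd_is_launch_dir, file_dir_is_script_dir)
```

Code that needs paths relative to the notebook file therefore has to derive them from the file's location (for example via `__file__`, or `mo.notebook_dir()` inside marimo as mentioned above) rather than from the working directory.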