Get help from the marimo community

Updated 5 months ago

How to disable the “defined by another cell” error?

At a glance

The community member is experimenting with models and reusing variables across cells, such as setting the device variable to cuda for training and then to cpu for later use. However, the marimo platform does not support this type of operation as it breaks the dataflow graph assumptions. The community members discuss alternatives like using local variables (prefixed with an underscore) or encapsulating code in functions, but find these solutions less than ideal. There is a discussion around disabling the dataflow graph feature, but the community members acknowledge this would disable other important marimo features. The community members explore various workarounds, but ultimately one member concludes that this issue is a "showstopper" and decides to quit using marimo.

While experimenting with and testing models, I usually reuse some variables from cell to cell. For example, when training models, I can set the device variable to cuda for a better coding experience.

device = torch.device("cuda")

And when I want to use this model later in the same notebook and leave the GPU for other processes, I’ll do:

device = torch.device("cpu")

But marimo forbids this type of operation!
The same applies to sklearn models, where I train models in several cells, reusing the “model” variable.

model = TypeOfModel()
model.fit()

Of course, it’s not clean or anywhere close to production-looking code, but this is what I want from notebooks: fast and simple “proof of concept” code.

P.S. Forgive me for the code blocks here, mobile Discord sucks
21 comments
This isn't something we'll support, because it makes the notebook not a dataflow graph, which breaks a lot of assumptions that marimo needs to make in order to eliminate hidden state and make notebooks executable as apps/scripts.

As you may know, you can use local variables (_device), or better yet encapsulate code in functions and/or give variables meaningful names.
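A minimal sketch of the "meaningful names" and "encapsulate in functions" suggestions, applied to the device and model examples from the original post. Plain strings stand in for torch.device(...) so the sketch runs without PyTorch, and MeanModel and fit_and_score are hypothetical stand-ins (TypeOfModel in the post is not specified). The "Cell" comments only mark where a notebook cell boundary might go.

```python
# Cell 1: one name per role instead of rebinding `device`.
# Plain strings stand in for torch.device("cuda") / torch.device("cpu").
train_device = "cuda"
inference_device = "cpu"


# Cell 2: hypothetical stand-in for TypeOfModel from the post.
class MeanModel:
    def fit(self, values):
        # "Training" here is just computing the mean of the inputs.
        self.mean_ = sum(values) / len(values)
        return self


# Cell 3: `model` is local to the function, so it is never a notebook-level name.
def fit_and_score(values):
    model = MeanModel().fit(values)
    return model.mean_


# Cell 4: first experiment.
score_small = fit_and_score([1.0, 2.0, 3.0])

# Cell 5: second experiment, no clash over `model`.
score_large = fit_and_score([10.0, 20.0, 30.0, 40.0])
```

Each cell now defines a distinct top-level name (train_device, inference_device, score_small, score_large), which is the property the reply above asks for.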
Kinda sad, but okay, I can live with that
UPD. It's getting more uncomfortable...

If I do

with open(...) as file:
  pass

in one cell, I'm forced to use _file, __file, ___file, etc. on and on in other cells
Is there anything we could do with this? This is really annoying
Variables starting with an underscore are local to a cell. In this example _file is local to a cell. You can do

with open(...) as _file:
  pass


in every cell. No need to keep adding underscores.
Is that good enough?
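As a rough illustration of that answer, the sketch below reuses the single name _file in two places that stand for two different cells. The temporary file and the names path and first_line are assumptions made up for the example; they are not from the thread.

```python
import os
import tempfile

# Setup for the sketch only: create a throwaway file to read and write.
_fd, path = tempfile.mkstemp(suffix=".txt")
os.close(_fd)

# Cell A: `_file` starts with an underscore, so it stays local to this cell.
with open(path, "w") as _file:
    _file.write("hello\n")

# Cell B: the same name `_file` again, with no extra underscores needed.
with open(path) as _file:
    first_line = _file.readline().strip()

# Cleanup for the sketch.
os.remove(path)
```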
Hm, so why was my marimo complaining about an already defined _file?
Well, I double checked now and there is no problem with _file right now. Maybe that was a nightmare, idk. Sorry for bothering
haha no worries, it happens
@Akshay Well... another problem...

How do I use "for i in range(...)" loops in different cells if I always get "i was defined by another cell"? Using _i is very ugly and less readable.
I personally use _i (it was a bit hard getting accustomed to it after migrating from traditional Jupyter notebooks); I'm not sure what other workaround there is. If you want more insights, I would recommend the best practices here: https://docs.marimo.io/guides/best_practices/index.html#best-practices
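One possible alternative to _i, sketched under the assumption that only names bound at the top level of a cell count as that cell's definitions: put the loop inside a function, or use a comprehension, so that the plain name i stays scoped by Python itself. The function names squares and cubes_total are made up for the example.

```python
# Cell A: `i` is local to the function body, not a notebook-level name.
def squares(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result


first_squares = squares(4)


# Cell B: a second loop over `i`, again scoped to its own function.
def cubes_total(n):
    total = 0
    for i in range(n):
        total += i ** 3
    return total


cube_sum = cubes_total(4)

# Cell C: in Python 3 a comprehension variable does not leak into the
# enclosing scope, so `i` here is also not a top-level name.
evens = [i for i in range(6) if i % 2 == 0]
```

The code also exports cleanly, since nothing in it depends on underscore-prefixed names.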
Yeah, but this is ugly looking and raises too many questions when I want to export my code somewhere.

Can I just disable “dataflow graph” when I don’t need it? Experimenting in notebooks should be easier
Not sure if that's an existing or viable option (the dataflow graph is embedded in the core values and principles upon which marimo was built); maybe the core contributors know a suitable workaround or alternative.
There's some more discussion here about read-only variables: https://github.com/marimo-team/marimo/issues/1477

But if you use the scratchpad, there are no definition constraints. Normally, I iterate and play around there, then make the minor changes when I insert the code into the notebook.
I tried relaxing the uniqueness constraint twice (I have two branches on my machine) and ran into many edge cases. It also introduces a tradeoff: memory pressure increases because marimo has to keep a copy of each duplicated variable. That is fine for loop indices, but not for df = my_big_dataset() or X = my_cuda_tensor. In the end, in the spirit of Python, I think simple is better than magical, and explicit is better than implicit.
Disabling the graph would mean disabling many other features that marimo provides: running as apps, elimination of hidden state, module hot-reloading. We don't want to bifurcate our users, and the notebooks they share, into "graph" users and "non-graph" users.

Hopefully the scratchpad helps. If we can find a way to work around the memory pressure, we could consider relaxing the uniqueness constraint. But we need to keep every copy of duplicated variables around, otherwise deleting a cell would mark all other cells using the duplicated variable as stale, which would then increase compute pressure, which is also unacceptable (imagine all your for loops rerunning just because you deleted a totally unrelated cell).
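To picture the uniqueness constraint discussed in this thread, here is a toy check written with Python's ast module. It is only an illustration and not marimo's actual implementation: top_level_defs and find_conflicts are hypothetical helpers, and they inspect only top-level assignments, for-loop targets and with-statement targets (nested bodies, imports and function definitions are ignored). Names starting with an underscore are skipped, following the cell-local rule described earlier in the thread.

```python
import ast
from collections import defaultdict


def top_level_defs(source):
    """Collect names bound by top-level assignments, for targets and with targets."""
    names = set()
    for stmt in ast.parse(source).body:
        targets = []
        if isinstance(stmt, ast.Assign):
            targets = stmt.targets
        elif isinstance(stmt, (ast.For, ast.AugAssign, ast.AnnAssign)):
            targets = [stmt.target]
        elif isinstance(stmt, ast.With):
            targets = [
                item.optional_vars
                for item in stmt.items
                if item.optional_vars is not None
            ]
        for target in targets:
            for node in ast.walk(target):
                # Only names being stored to count as definitions.
                if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                    names.add(node.id)
    return names


def find_conflicts(cells):
    """Map each non-underscore name to the cells defining it, if more than one."""
    owners = defaultdict(list)
    for cell_id, source in cells.items():
        for name in top_level_defs(source):
            if not name.startswith("_"):
                owners[name].append(cell_id)
    return {name: ids for name, ids in owners.items() if len(ids) > 1}


# Three hypothetical cells modeled on the examples in this thread.
cells = {
    "train": 'device = "cuda"\nfor i in range(3):\n    pass',
    "infer": 'device = "cpu"\nfor _i in range(3):\n    pass',
    "plot": "for i in range(2):\n    pass",
}
conflicts = find_conflicts(cells)
```

In this toy model, device is flagged for the "train" and "infer" cells and i for "train" and "plot", while _i is ignored. With a single owner per name, a notebook can tell which cell to rerun when a name changes; with several owners that question no longer has one answer.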
@Akshay I’m sorry, but this is now a showstopper for me
Attachment
image0.jpg
And inplace=True is not a great way to overcome this issue at all
I’m really sorry, but I’m forced to quit marimo because of this 😦
Sorry to hear it! I'm glad you gave it a try.