how to disable “defined by another cell” error?

At a glance

The community member is experimenting with models and reusing variables across cells, such as setting the device variable to cuda for training and then to cpu for later use. However, the marimo platform does not support this type of operation as it breaks the dataflow graph assumptions. The community members discuss alternatives like using local variables (prefixed with an underscore) or encapsulating code in functions, but find these solutions less than ideal. There is a discussion around disabling the dataflow graph feature, but the community members acknowledge this would disable other important marimo features. The community members explore various workarounds, but ultimately one member concludes that this issue is a "showstopper" and decides to quit using marimo.

Ilya I. Lubenets

While experimenting with and testing models, I usually reuse some variables from cell to cell. For example, when training models, I can set the device variable to cuda for a better coding experience.

device = torch.device("cuda")

And when I want to use this model later in the same notebook and leave the GPU for other processes, I'll do:

device = torch.device("cpu")

But marimo forbids this type of operation!
The same applies to sklearn models, where I train models in cells reusing the "model" variable.

model = TypeOfModel()
model.fit()

Of course, it's not clean or anywhere close to production-looking code, but this is what I want from notebooks: fast and simple "proof of concept" code.

P.s. forgive me for code blocks here, mobile discord sucks

21 comments

Akshay

This isn't something we'll support, because it makes the notebook not a dataflow graph, which breaks a lot of assumptions that marimo needs to make in order to eliminate hidden state and make notebooks executable as apps/scripts.

As you may know, you can use local variables (_device), or better yet encapsulate code in functions and/or give variables meaningful names.
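A minimal sketch of the two suggestions above (meaningful names and encapsulation in functions), not marimo-specific code. The cell boundaries are marked with comments, the strings stand in for torch.device(...) so the sketch runs without PyTorch, and MeanModel and fit_and_score are hypothetical names used only for illustration:

```python
# Cell 1: training setup, with a name that says what the device is for
train_device = "cuda"  # stands in for torch.device("cuda")

# Cell 2: inference gets its own name instead of rebinding `device`
inference_device = "cpu"  # stands in for torch.device("cpu")


# Cell 3: a hypothetical toy model and a helper that hides temporaries
class MeanModel:
    """Toy model for illustration: predicts the mean of its training data."""

    def fit(self, data):
        self.mean = sum(data) / len(data)

    def score(self, data):
        # Largest absolute deviation from the fitted mean
        return max(abs(x - self.mean) for x in data)


def fit_and_score(make_model, data):
    # `model` is a function local, so it never becomes a notebook-level name
    model = make_model()
    model.fit(data)
    return model.score(data)


# Cell 4: only the result is defined at notebook level
mean_model_score = fit_and_score(MeanModel, [1.0, 2.0, 3.0])
```

With this layout each notebook-level name (train_device, inference_device, mean_model_score) is defined in exactly one cell, which is what the uniqueness rule asks for.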

Ilya I. Lubenets

Kinda sad, but okay, I can live with that

Ilya I. Lubenets

UPD. It's getting more uncomfortable...

If I do

with open(...) as file:
  pass

in one cell, I'm forced to use _file, __file, ___file, etc. on and on in other cells

Ilya I. Lubenets

Is there anything we could do with this? This is really annoying

Akshay

Variables starting with an underscore are local to a cell. In this example _file is local to a cell. You can do

with open(...) as _file:
  pass

in every cell. No need to keep adding underscores.
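A runnable sketch of the same idea, assuming two separate cells (marked with comments). The temporary directory and the notes.txt file name are illustrative only. In plain Python the underscore prefix has no special meaning; the cell-local behaviour is the marimo rule described above:

```python
import os
import tempfile

# Setup so the sketch runs anywhere: a throwaway file in a temp directory
scratch_dir = tempfile.mkdtemp()
notes_path = os.path.join(scratch_dir, "notes.txt")

# Cell 1: write the file; `_file` would be local to this cell in marimo
with open(notes_path, "w") as _file:
    _file.write("hello")

# Cell 2: the same name `_file` again, with no extra underscores
with open(notes_path) as _file:
    notes_text = _file.read()
```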

Akshay

Is that good enough?

Ilya I. Lubenets

Hm, so why was my marimo complaining about an already defined _file?

Ilya I. Lubenets

Well, I double checked now and there is no problem with _file right now. Maybe that was a nightmare, idk. Sorry for bothering

Akshay

haha no worries, it happens

Ilya I. Lubenets

@Akshay Well... another problem...

How do I use for i in range(...) loops in different cells if I always get "i was defined by another cell"? Using _i is very ugly and less readable.

Haleshot

I personally use _i (it was a bit hard getting accustomed to it when migrating from traditional Jupyter notebooks); not sure what other workaround there is. If you want more insights, I would recommend referring to the best practices here: https://docs.marimo.io/guides/best_practices/index.html#best-practices
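Two other patterns that avoid a notebook-level i, sketched in plain Python with the hypothetical names squares and running_total. In plain Python neither leaves i behind as a module-level name; the assumption here is that marimo therefore does not count it as a cell definition, which is consistent with the earlier advice in this thread to encapsulate code in functions:

```python
# Option 1: a comprehension; its loop variable stays inside the comprehension
squares = [i * i for i in range(5)]


# Option 2: a function; `i` and `total` are function locals
def running_total(n):
    total = 0
    for i in range(n):
        total += i
    return total


total_to_five = running_total(5)
```

The trade-off is that intermediate values inside the function are not inspectable from other cells, so this fits finished loops better than loops still being debugged.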

Ilya I. Lubenets

Yeah, but this is ugly looking and raises too many questions when I want to export my code somewhere.

Can I just disable “dataflow graph” when I don’t need it? Experimenting in notebooks should be easier

Haleshot

Not sure if that's an existing or viable option (the dataflow graph is embedded in the core values and principles upon which marimo was built); maybe the core contributors know a suitable workaround or alternative.

dmad

There's some more discussion here about read-only variables: https://github.com/marimo-team/marimo/issues/1477

But if you use the scratchpad, there are no definition constraints. Normally, I iterate and play around there, then make the minor changes when I insert the code into the notebook.

Akshay

I tried implementing relaxing the uniqueness twice — have two branches on my machine — and ran into many edge cases. It also introduces a tradeoff, increasing memory pressure because marimo has to keep a copy of each duplicated variable. Fine for loop indices, not fine for df = my_big_dataset(), X = my_cuda_tensor. In the end, in the spirit of Python, I think simple is better than magical, explicit better than implicit

Akshay

Disabling the graph would mean disabling many other features that marimo provides — running as apps, elimination of hidden state, module hotreloading. We don't want to bifurcate our users, and the notebooks they share, into "graph" users and "non-graph" users.

Hopefully the scratchpad helps. If we can find a way to work around the memory pressure, we could consider relaxing the uniqueness constraint. But we need to keep every copy of duplicated variables around, otherwise deleting a cell would mark all other cells using the duplicated variable as stale, which would then increase compute pressure, which is also unacceptable (imagine all your for loops rerunning just because you deleted a totally unrelated cell).

Ilya I. Lubenets

@Akshay I'm sorry, but this is now a showstopper for me

Attachment

Ilya I. Lubenets

And inplace=True is not a great way to overcome this issue at all

Ilya I. Lubenets

https://stackoverflow.com/questions/45570984/in-pandas-is-inplace-true-considered-harmful-or-not

Ilya I. Lubenets

I’m really sorry, but I’m forced to quit marimo because of this 😦

Akshay

Sorry to hear it! I'm glad you gave it a try.
