Kernel dies in a notebook that manipulates a large Pandas dataframe
At a glance
A user is working with large Pandas dataframes (5-10M rows) in a notebook whose kernel keeps dying, and is unsure whether to allocate more resources, make the notebook more resource-efficient, or take some other action. Commenters suggest providing the stack trace and creating a reproducible notebook (using synthetic data if necessary) to help diagnose the issue. They also note that some performance improvements have been made to Pandas, so the user should check whether the issue persists on the latest version (0.9.32 or above).
The notebook loads a couple of Pandas dataframes, each with 5-10M rows, filters each down to 3-5M rows, samples 10% of the rows, and plots various charts. I'm unsure whether to allocate more resources, make my notebook more resource-efficient, or do something else. I can provide the stack trace if helpful.
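For reference, here is a minimal sketch of a more memory-efficient version of that workflow. The file path, column names, and filter condition are hypothetical stand-ins for whatever the real notebook uses; the idea is to load only the columns the charts need with compact dtypes, and to filter and sample as early as possible so peak memory stays low.

```python
import pandas as pd

# Hypothetical file and column names for illustration.
CSV_PATH = "events.csv"

# Read only the needed columns, with memory-efficient dtypes,
# so a 5-10M row frame takes far less RAM.
df = pd.read_csv(
    CSV_PATH,
    usecols=["timestamp", "category", "value"],
    dtype={"category": "category", "value": "float32"},
    parse_dates=["timestamp"],
)

# Filter down to the rows of interest (e.g. 3-5M of the original rows),
# then immediately take the 10% sample used for plotting.
mask = df["value"] > 0  # placeholder filter condition
sampled = df.loc[mask].sample(frac=0.10, random_state=42)

# Drop the full frame before plotting so only the sample stays resident.
del df

sampled.plot.scatter(x="timestamp", y="value")
```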
The stack trace would be helpful. If you could put together a notebook that reproduces the issue (using synthetic data if your data is private), that would be very helpful.
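As a starting point, a synthetic-data reproduction could look something like the sketch below. The row count, schema, and filter are assumptions meant to mirror the described workflow (load, filter, 10% sample, plot), not the actual data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the private data: similar row count, rough schema.
rng = np.random.default_rng(0)
n_rows = 8_000_000

df = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=n_rows, freq="s"),
    "category": rng.choice(["a", "b", "c"], size=n_rows),
    "value": rng.normal(size=n_rows),
})

# Reproduce the same filter -> sample -> plot steps as the real notebook.
filtered = df[df["value"] > -0.5]   # placeholder filter condition
sampled = filtered.sample(frac=0.10, random_state=42)
sampled["value"].plot.hist(bins=50)
```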