Tech Zero
Jan 13, 2023

--

In my experience, Dask is overkill if the original dataset fits in memory and does not need parallel processing.

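Dask can be faster on large datasets because it evaluates lazily: operations are recorded first and only run when a result is requested, so the data can be processed in chunks, and in parallel where possible. That makes it a good fit for a large dataset that needs to be chunked and then processed.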

In other scenarios, though, pandas is a perfectly reasonable choice.


Written by Tech Zero

Product Manager, Data & Governance | Azure, Databricks and Snowflake stack | Here to share my knowledge with everyone
