Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Updated Apr 3, 2022 - Python


I naively tried to do dd.merge(a, b, on="column_with_ten_values"), where a and b were both large DataFrames with thousands of partitions each. Eventually the compute failed with: