Dpark Versions
A Python clone of Spark: a MapReduce-like framework written in Python
0.5.0
5 years ago
API change
Remove module-level API like dpark.textFile.
Support Streaming shuffle and Disk shuffle (Experimental, compatible).
Fixes
Bug when parsing mfs chunk info.
Improvement
Better broadcast implementation using shared memory for tasks on the same slave to reduce memory cost.
Better offer-matching logic for MesosScheduler, which now remembers bad slaves.
Refactor: style and layout.
New Feature
Multi segment dump to save memory.
Gather statistics for stage.
Support running tests/test_rdd on Mesos.
Add colorful progress bar for dpark.
Support mesos role.
Support multi named mesos master in conf.
Loghub for admin.
0.4.2
6 years ago
Support Python3 & PyPy
Support MooseFS 3.x & refactor on file-system interface
0.4.1
7 years ago
Enhancement for the containerizer in DPark
Use broadcast when parallelizing a big dataset
Fix missing line bug for bzip2 files
Add TopByKey in RDD
Other minor bugs
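For readers unfamiliar with the "top by key" operation named in the TopByKey entry above, the following is a minimal pure-Python sketch of the general idea: keep only the n largest values for each key. The function name `top_by_key` and its parameters are hypothetical and chosen for illustration; this is not Dpark's implementation or its actual signature.

```python
import heapq
from collections import defaultdict


def top_by_key(pairs, n):
    """Return the n largest values per key (hypothetical helper, not Dpark's API)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    heaps = defaultdict(list)  # key -> min-heap holding at most n values
    for key, value in pairs:
        heap = heaps[key]
        if len(heap) < n:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The smallest retained value is evicted in favor of the new one.
            heapq.heapreplace(heap, value)
    # Present each key's values from largest to smallest.
    return {key: sorted(heap, reverse=True) for key, heap in heaps.items()}


pairs = [("a", 3), ("b", 7), ("a", 9), ("a", 1), ("b", 2)]
result = top_by_key(pairs, 2)
print(result)
```

A bounded heap per key keeps memory proportional to n per key rather than to the full number of values, which is the usual reason to offer such an operation separately from a group-then-sort.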
0.4.0
7 years ago
Bugfix: deserialize error of old-style class.
Refactor beansdb RDD
Web UI support for dpark
Use pymesos >= 0.2.0
Eager serialize values of ParallelCollection