Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.
Cylon 0.6.0 is a major release. We are excited to present UCC, Gloo integration, More distributed operations
You can download source code from Github Conda binaries are available in Anaconda
91bdd54 Update conda-actions.yml (#645) d1739ed Added buildable instructions for Rivanna (#643) d9a6420 Arrow 9.0.0 and gcc-11 update (#601) 4c867b1 Summit Fixes (#623) 7f8a3b1 Fixing sample bug (#631) ce12454 Cython binding for slice, head and tail (#619) ef4c904 #610: SampleArray util method replaced by using arrow::compute::Take … (#612) 4694a9e Minor fixes (#608) 121b386 Fixing: Corrupted result when joining tables contain list data types #615 (#616) 68fa598 Summit fixes (#607) de3ec7b fixing bash splitting (#606) 0a489fc adding cmake parallelism flag (#605) 035fd70 Implement Slice, Head and Tail Operation in both centralize and distr… (#592) d99a6f2 adding custom mpirun params cmake var (#604) f20c119 Update README-summit.md (#603) 4bc27f9 Create README-summit.md (#602) e6b7306 Minor fixes (#596) 2e6ac80 adding conda docker (#600) 4dd359f Ucc integration (#591) 61b4a82 adding cylonflow as a submodule (#593) e4dd38b Use generic operator (#583) 6c0dfa8 Gloo python binding (#587) 773f11f Gloo python bindings (#585) 2fc95be Add downloading catch2 header dynamically (#584) c56ab2d Enabling gloo CI (#582) a820ed8 Dist sort cpu (#574) f68cc62 Adding UCC build (#579) 2759a30 Cylon Gloo integration (#576) b2c0820 Adding distributed scalar aggregates (#570) 9c2fdc4 Extending datatypes (#568) e3d553c Bump ua-parser-js from 0.7.22 to 0.7.31 in /docs (#566) 3bafb75 Bump ssri from 6.0.1 to 6.0.2 in /docs (#565) 814a463 minor fixes (#564) be92253 Bump lodash from 4.17.20 to 4.17.21 in /docs (#561) e87dd7c Bump shelljs from 0.8.4 to 0.8.5 in /docs (#562) 71bd8bf Bump nanoid from 3.1.22 to 3.2.0 in /docs (#563) 49b343d Allowing custom MPI_Comm for MPI (#559) fa52dd4 Update contributors.md 54d4a53 added io functions (#550) 1a8c3d7 Fixing 554 (#558) 887ea18 update arrow link (#557) 1ce4c6b Fixing 552 (#553) f5e31a1 Merging 0.5.0 release (#547)
Ahmet Uyar Chathura Widanage Damitha Sandeepa Lenadora dependabot[bot] Hasara Maithree Kaiying Shan niranda perera Supun Kamburugamuve Vibhatha Lakmal Abeykoon Ziyao22 Arup Kumar Sarker Mills Wellons Staylor Gregor von Laszewski
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.5.0 is a major release. We are excited to present GCylon, cudf-based distributed DataFrame for Nvidia GPUs, UCX integration, Anaconda support, and much more.
Dataframe.applymap
, Dataframe.isin
First release of Gcylon which supports distributed DataFrame processing on Nvidia GPUs using CuDF:
You can download source code from Github Conda binaries are available in Anaconda
3344bf95 Mapreduce style group-by aggregators (#535) 50ef890b Remove minor warnings (#544) 559e8eb3 Adding CPU serializer (#539) abb44049 fixed unused variable/parameter and casting warnings (#542) 62a3f080 Distributed IO (#533) 15d06d6c Bump color-string from 1.5.4 to 1.7.4 in /docs (#534) 810c4ed7 fixing RNG issue (#538) fbb049bb fixing build error (#536) a10e0528 Bump algoliasearch-helper from 3.3.3 to 3.6.2 in /docs (#532) 112ea97f Repartition - CPU (#526) 79c4b739 create a MacOS yml file (#530) b9e7a8c4 Repartition - GPU (#528) 2191b9f5 fixed function name change in cudf api from gcylon test files (#529) 3e9036ee Upgrading to arrow 5.0.0 (#525) 24d182ab Groupby values null handling (#527) 54a5074b Null handling for Comparators (#524) 0b9516e7 Adding array flattening (#522) b3fc2a2a Implemented MergeOrSort when merging sorted tables (#523) 1e061b2f Feature/equal (#499) e378d1dc reformatted gcylon codes with tab size 2, non-functional changes (#521) 8450d9b1 Added support for sliced tables in gather, broadcast and sorting (#520) 92b8124c Update windows.yml 1f9790d7 Update macos.yml d33f9ac8 Update conda-actions.yml 963d4914 Update c-cpp.yml 2229981d added mpi datatype dispatching for primitive data types (#519) d9936b4d Head tail operators (#512) ac99d009 Formatting code (#518) fff84ccb Code formatting (#517) f32f04da Null handling in splitters and build arrays (#511) 4cab7ca4 Delete files from CPP example folder that are not needed (#516) d1744302 moving tutorial repo to (#514) 9cd7911f Python example cleanup (#513) fe4caf37 Distributed sorting (#510) 2302f58f Minor improvements to the Table API (#508) 71eb80a1 adding new test utils (#507) 24b83dd3 Adding to docker docs (#498) 6f2faf8f Update conda.md 4f8f3c7f Gcylon docs (#501) a7862580 Adding contributing guide to documentation (#496) 8ab8b2d6 changing join column naming convention to match SQL and pandas (#487) f18b91fe improvements to ucx build from conda (#484) 912fb543 Windows build (#482) 216758a2 making improvements to the build (#483) 4e2894eb Add functions to dataframe (#481) 1f1ddd9c Documentation update (#479) e6233151 Bump tar from 6.1.5 to 6.1.11 in /docs (#477) 1e5db7b6 improve docs (#476) 58c0595d removing extra examples (#474) 3c823f6f Gcylon integration (#470) 92748eb5 Cpp example cleanup (#475) fa14527d Docs improvements (#469) 13062206 Bump url-parse from 1.4.7 to 1.5.3 in /docs (#473) 8234ae7b Bump path-parse from 1.0.6 to 1.0.7 in /docs (#472) c8b435b6 Bump tar from 6.0.5 to 6.1.5 in /docs (#471) 1cc28dd3 Performance improvements (#453) 9092bbf0 MacOS build (#464) d59d91ea Add iloc operation to DataFrame (#465) 8d7a8dc7 Removed glog files from the header files (#463) ea62eef0 License updates (#462) 2f562650 changed all relative Cylon header references to global (#461) 123c93c3 Building in conda env without using conda-build (#457) 3b3a2853 Compilation document improvements (#454) 8578b1f1 Adding barrier at the end of the test case (#458) e6eded5f Fix for empty df (#455) 8f149924 Fixed mpi test case (#456) cb069980 Changes to the Docs (#451) 4ce1d7eb updates to the docker readme e011e0f6 enhancing readme adfa6c05 adding read distribution (#432) bd2e024d UCX integration (#439) a42d04ad Bump ws from 6.2.1 to 6.2.2 in /docs (#437) 710b562e Bump dns-packet from 1.3.1 to 1.3.4 in /docs (#435) 07aee740 adding new operators to DataFrame API (#429) 71e57f84 Updating to arrow 4.0 (#418) a490dc21 changing ctx to const reference in methods (#419) 18a5447b missing docs (#428) 38534f55 0.4.1 release (#427) 10f5a6a3 Enabling scalars in df set_item (#425) 0be78972 Op bench refactor (#417) ec964d89 Bug fixes in dataframe (#420) e0ba9643 Update c-cpp.yml 0200c021 adding finalize check and removing destructor finalize call. (#412) 149919c2 Update README.md 016c5c92 adding missing test case 56095357 Update README.md e3ca0bf5 0.4.0 release (#411)
Ahmet Uyar Chathura Widanage Damitha Sandeepa Lenadora dependabot[bot] Hasara Maithree Kaiying Shan niranda perera Supun Kamburugamuve Vibhatha Lakmal Abeykoon Ziyao22
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.4.1 is a bug fix release.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.4.0 is a major release with the following features.
You can download source code from Github
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.3.1 is a bug fix release.
You can download source code from Github
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.3.0 adds the following features. Please note that this release may not be backward compatible with previous releases.
You can download source code from Github
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.2.0 adds the following features. Please note that this release may not be backward compatible with v0.1.0.
std::vector
s or cylon::Column
sYou can download source code from Github
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Cylon 0.1.0 is the first open-source public release of Cylon Project. We are excited to bring a high-performance data engineering toolkit that can work as a library as well as a standalone framework. This is the first step towards building a complete toolkit designed to work with AI/ML systems and integrate with data processing systems with the vision "data engineering everywhere".
You can download source code from Github
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0