A Flexible and Powerful Parameter Server for large-scale machine learning
In version 3.2.0, Angel continues to strengthen its graph computing capabilities. Compared with the previous version, we have made many optimizations and added several new features, including:
Graph computing layered abstraction and flexible expansion
Parameter server and MPI mixed running mode
Adaptive model partitioning
Complex heterogeneous Graph Embedding
High-performance optimization of large graphs with hundreds of billions of edges
An enriched machine learning algorithm library
In version 3.1.0, Angel enhances its graph learning capabilities and adds a variety of improvements, including:
Traditional Graph Learning Algorithms
Graph Embedding
Graph Deep Learning
Basic Graph Operators
GPU Support in PyTorch-on-Angel
Angel 2.4.0 Release
Angel 2.4.0 adjusts the default values of a large number of system parameters, which greatly lowers the performance tuning threshold for users. With this version, you don't need to do much system-level tuning to get better performance. This release also fixes some stability issues.
In terms of algorithms, we refactored LINE V2 to support weighted edges.
New features
Starting from version 3.0, Angel will maintain two separate version series: 2.X and 3.X. We have added the hotfix-2, master-2 and develop-2 branches for the 2.X series.
Angel has evolved from a single model training system into a comprehensive computing platform that covers all phases of machine learning: data preprocessing, model training, model serving, automatic hyper-parameter tuning, and automatic feature engineering. Based on the Angel PS service, we built Angel's ecosystem: sona (Spark On Angel) and PyTorch On Angel. Our algorithms cover basic machine learning algorithms, deep learning algorithms, graph algorithms and GNN algorithms. To make the project structure clearer, we split the original project into 8 sub-projects:
angel: Angel's core layer, providing a powerful parameter server. You can also use it to train models independently.
PyTorch-On-Angel: A lightweight and high-performance distributed PyTorch computing platform based on Angel PS. It uses Angel's PS to support high-dimensional models and uses Spark as the PyTorch scheduling platform. It is easy to use because you can complete data preprocessing (using Spark) and model training (using PyTorch) in a single job. As when developing algorithms on PyTorch, users can simply use Python to design new algorithms on the PyTorch On Angel platform. We have implemented a variety of algorithms in PyTorch On Angel: LR, FM, DeepFM, Wide & Deep, xDeepFM, GCN, GraphSage, etc. They show higher performance (5x~10x) than their counterparts on Angel and sona. We strongly recommend the PyTorch On Angel platform if performance is your main concern.
sona: A generic computing platform based on Angel PS and Spark that uses Angel PS to break through Spark's bottleneck in training high-dimensional models. In the new version of sona, we have done a lot of work to better combine feature engineering and model training. We refactored LINE (LINE V2) and K-Core in this version, and their performance and stability have been greatly improved.
serving: Angel's model serving platform, which can serve not only models generated by Angel, Spark On Angel and PyTorch On Angel, but also models from other platforms such as Spark and XGBoost.
automl: A generic automatic machine learning component that includes automatic tuning and automatic feature engineering.
mlcore: Angel's independently developed lightweight computing graph framework. Users can easily implement new algorithms on it.
math2: Angel's independently developed high-performance math library, which involves a lot of performance optimization for large sparse vectors.
format: Angel's model format interface definition. Angel uses an open model format, enabling users to customize the needed format by implementing the model format interface.
New features
Angel 2.3.0 Release
Starting with version 2.3, Angel will maintain two separate version series: 2.X and 3.X. We have added the hotfix-2, master-2 and develop-2 branches for the 2.X series.
In this version, we enhanced the PS and added the ability to store complex data objects. Building on this, we upgraded the existing graph algorithm capabilities and added graph data structures and operation interfaces.
We have refactored K-Core again, and the performance and stability of the refactored version have improved significantly. We have also added a new implementation of the LINE algorithm, LINE V2. When the node embedding dimension is not very high (< 512), LINE V2 performs significantly better than the original version and is more stable.
This release adds a new module, angel-ps-graph, which contains the definition and operation interface of the graph data structure. Based on this module, we implemented GCN and GraphSage on the PyTorch On Angel platform.
New features
Angel 2.2.0 Release
In this release, we have enhanced the graph algorithms: (1) we refactored the existing K-Core algorithm, significantly improving its performance and stability; (2) we added the Louvain algorithm, which is also based on Spark On Angel. Test results show that K-Core and Louvain on Spark On Angel are 1 to 2 orders of magnitude faster than the original Spark versions. In this release we also officially release Vero, a new GBDT implementation based on Spark On Angel, which has clear advantages in supporting high-dimensional models and multi-class classification problems. We also add Kerberos support in this release.
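For readers unfamiliar with K-Core, the sketch below shows the textbook sequential "peeling" definition of core numbers in Python. It is only an illustration of what the algorithm computes, not the distributed Spark On Angel implementation; the function name `core_numbers` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def core_numbers(edges: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Return the core number of every vertex of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core: Dict[int, int] = {}
    k = 0
    while remaining:
        # Peel the vertex with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core number is the largest minimum degree seen so far.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# A triangle (1, 2, 3) with a pendant vertex 4 attached to vertex 3.
print(core_numbers([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

This simple version scans for the minimum-degree vertex on every step, so it is quadratic in the number of vertices; it is meant for small examples only.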
New features
BugFix
Release-2.1.0
This version adds a more intelligent model partitioning method, "LoadBalancePartitioner", in Spark On Angel. By analyzing the distribution of features in the training data in advance, it can precisely control the number of features in each partition, which makes the PS load more balanced. Actual tests show that the efficiency of model training can be greatly improved in many cases. This version also adds three new algorithms in Spark On Angel: an FM algorithm based on the FTRL optimizer, the K-Core algorithm, and a feature-parallel GBDT algorithm that can support larger models.
[ISSUE-639] Load-balanced model partitioner "LoadBalancePartitioner" in Spark On Angel
[ISSUE-690] Ftrl-FM in Spark On Angel
[ISSUE-663] K-Core in Spark On Angel
[ISSUE-680] Feature-parallel GBDT in Spark On Angel
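The idea behind load-balanced partitioning can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the LoadBalancePartitioner code: the function `balanced_ranges` and its parameters are hypothetical, and it balances only the count of distinct feature indices that actually occur in the data.

```python
from typing import Iterable, List, Tuple


def balanced_ranges(samples: Iterable[Iterable[int]], dim: int,
                    num_parts: int) -> List[Tuple[int, int]]:
    """Split [0, dim) into contiguous [start, end) ranges that each hold a
    similar number of feature indices observed in `samples`."""
    seen = sorted({idx for row in samples for idx in row})
    if not seen or num_parts <= 1:
        return [(0, dim)]
    # Never create more partitions than there are observed features.
    num_parts = min(num_parts, len(seen))

    ranges = []
    start = 0
    for p in range(1, num_parts):
        # First observed feature index that belongs to partition p.
        cut = seen[p * len(seen) // num_parts]
        ranges.append((start, cut))
        start = cut
    ranges.append((start, dim))
    return ranges


rows = [[0, 1, 2], [1, 2, 3], [1000, 2000]]
print(balanced_ranges(rows, dim=10000, num_parts=3))
# -> [(0, 2), (2, 1000), (1000, 10000)]
```

An even split of the index space would put all six observed features of this example into the first of three ranges; cutting at observed quantiles gives each range two of them.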
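This version further optimizes the Spark On Angel FTRL algorithm: it adds support for the float model format and improves how the number of model partitions is set, since a reasonable number of model partitions greatly benefits computing performance. At the system level, it adds a maximum retry limit for PS RPC, which prevents jobs from hanging indefinitely on certain unrecoverable exceptions. This version also includes some math library optimizations.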
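[ISSUE-655] Optimize the model partition count configuration of Spark On Angel FTRL to avoid the poor pull/push performance caused by too many model partitions in high-dimensional model scenarios
[ISSUE-656] Add the float model data format to Spark On Angel FTRL
[ISSUE-658] Optimize the math library: disable pre-rehash when the user has configured the maximum number of elements of a sparse vector
[ISSUE-632] Add a maximum retry limit to PS RPC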
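The effect of a retry limit (ISSUE-632) can be illustrated with a small Python sketch. It is not Angel's RPC code; `call_with_retry`, `RetriesExhausted` and the `backoff_s` parameter are hypothetical names chosen for this example. The point is that a bounded loop turns an unrecoverable failure into an error instead of an endless hang.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetriesExhausted(RuntimeError):
    """Raised when a call still fails after the allowed number of retries."""


def call_with_retry(rpc: Callable[[], T], max_retries: int,
                    backoff_s: float = 0.0) -> T:
    """Invoke `rpc`; on failure retry at most `max_retries` more times."""
    attempt = 0
    while True:
        try:
            return rpc()
        except Exception as err:
            if attempt >= max_retries:
                # Give up and surface the failure to the caller.
                raise RetriesExhausted(
                    f"giving up after {attempt + 1} attempts") from err
            attempt += 1
            time.sleep(backoff_s)
```

With `max_retries=3`, a call that always fails is attempted four times in total and then raises `RetriesExhausted`.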
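This version further optimizes the Spark On Angel FTRL algorithm, improving training and model-saving performance for high-dimensional models, and adds incremental training. Compared with the Angel version of FTRL, the Spark On Angel version has targeted optimizations, performs better, and is easier to tune, so we recommend using it.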
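As background, the per-coordinate FTRL-Proximal update for logistic regression can be sketched as follows. This follows the commonly published form of the algorithm and is not the Spark On Angel implementation; the class name `FtrlProximal` and the hyper-parameter names `alpha`, `beta`, `l1` and `l2` are assumptions made for this example. State is kept only for feature indices that have been seen, which is what makes the method practical for high-dimensional sparse models.

```python
import math
from collections import defaultdict
from typing import Dict


class FtrlProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse input."""

    def __init__(self, alpha: float = 0.1, beta: float = 1.0,
                 l1: float = 0.0, l2: float = 0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # adjusted gradient sums
        self.n = defaultdict(float)  # sums of squared gradients

    def weight(self, i: int) -> float:
        """Derive the weight of feature i lazily from z and n."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 keeps this coordinate exactly zero
        n = self.n.get(i, 0.0)
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x: Dict[int, float]) -> float:
        margin = sum(self.weight(i) * v for i, v in x.items())
        margin = max(min(margin, 35.0), -35.0)  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-margin))

    def update(self, x: Dict[int, float], y: int) -> None:
        """One online step on example x (index -> value) with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            w = self.weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
```

A larger `l1` keeps more weights at exactly zero, which reduces the size of the model that has to be stored and pulled.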
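In addition to the FTRL optimizations, this version introduces a series of new optimizers and decay strategies and adjusts the default values of some parameters, so that algorithms converge better in most cases.
This version also improves the documentation: it adds guides on topics such as choosing an optimizer and adjusting the decay strategy, as well as a guide to accelerating deep learning with OpenBlas. If you plan to use Angel's deep learning algorithms, be sure to read that guide; in our evaluations, using OpenBlas typically improved performance by more than 10x.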
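[ISSUE-585] Add documentation on how to use OpenBlas to accelerate matrix operations
[ISSUE-569] Optimize the computation and model-saving performance of Spark On Angel FTRL
[ISSUE-613] Add incremental training to Spark On Angel FTRL
[ISSUE-611] Add several optimizers that can use L1 regularization: AdaGrad/AdaDelta
[ISSUE-612] Add several mainstream decay algorithms and tune the default parameters to avoid decaying too fast
[ISSUE-615] Fix the inconsistent node count caused by subsampling in network embedding
[ISSUE-616] Fix the vector casting problem in the quantization compression PSF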
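To show why the decay strategy and its default parameters matter (ISSUE-612), the sketch below compares three generic learning-rate decay schedules in Python. These are textbook formulas chosen for illustration; the function names and the `rate` parameter are hypothetical, and they are not necessarily the schedules or defaults that Angel ships.

```python
import math


def inverse_time_decay(lr0: float, step: int, rate: float) -> float:
    """lr0 / (1 + rate * step)"""
    return lr0 / (1.0 + rate * step)


def inverse_sqrt_decay(lr0: float, step: int, rate: float) -> float:
    """lr0 / sqrt(1 + rate * step)"""
    return lr0 / math.sqrt(1.0 + rate * step)


def exponential_decay(lr0: float, step: int, rate: float) -> float:
    """lr0 * exp(-rate * step)"""
    return lr0 * math.exp(-rate * step)


# Same initial learning rate and decay rate, evaluated after 1000 steps.
for schedule in (inverse_sqrt_decay, inverse_time_decay, exponential_decay):
    print(schedule.__name__, schedule(0.1, 1000, 0.01))
```

With the same `rate`, the exponential schedule shrinks the learning rate by several orders of magnitude more than the other two after 1000 steps, which is the kind of overly fast decay that can stall convergence.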
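Building on version 2.0.0-alpha, this version fixes a large number of bugs and makes many stability optimizations; some algorithms have also been refactored.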
Angel Core
MLlib