Angel Versions

A Flexible and Powerful Parameter Server for large-scale machine learning

2.0.0-alpha

5 years ago

Release-2.0.0-alpha

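This release adds many new algorithms, such as LINE, Word2Vec, and DNN models, and refactors the math library, the PS interface, and the algorithm programming framework.

  • Angel Core

    • [ISSUE-378] Refactored the math library, with full support for double/float/int/long data types and dense/sparse/dummy/sorted storage formats
    • [ISSUE-380] Refactored the PS interface and added uncached increment and update methods
  • MLlib

    • [ISSUE-377] Added a computation-graph programming interface that greatly simplifies algorithm development; refactored LR, FM, SVM, and other algorithms on top of the computation-graph framework
    • [ISSUE-379] Refactored the optimizer library and added Adam, Momentum, and FTRL
    • [ISSUE-364] Added common DNN algorithms such as DeepFM, Wide & Deep, and PNN; added support for building DNN networks from JSON
    • [ISSUE-363] Added a network embedding algorithm: LINE
    • [ISSUE-381] Added the Word2Vec algorithm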
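
As a rough illustration of one of the optimizers named in [ISSUE-379], the following is a minimal, generic sketch of the Adam update rule in plain Python. It is not Angel's implementation: the function names `adam_step` and `minimize_quadratic`, the toy objective, and the hyperparameter values are all assumptions chosen for illustration.

```python
import math


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to parameter list `w`; returns new (w, m, v).

    `t` is the 1-based step counter used for bias correction.
    """
    new_w, new_m, new_v = [], [], []
    for wi, gi, mi, vi in zip(w, grad, m, v):
        mi = beta1 * mi + (1.0 - beta1) * gi          # first-moment estimate
        vi = beta2 * vi + (1.0 - beta2) * gi * gi     # second-moment estimate
        m_hat = mi / (1.0 - beta1 ** t)               # bias-corrected moments
        v_hat = vi / (1.0 - beta2 ** t)
        new_w.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_w, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.05):
    """Minimize the toy objective f(w) = (w0 - 3)^2 + (w1 + 1)^2 with Adam."""
    target = [3.0, -1.0]
    w, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (wi - ti) for wi, ti in zip(w, target)]
        w, m, v = adam_step(w, grad, m, v, t, lr=lr)
    return w
```

Calling `minimize_quadratic()` should return a point close to `[3.0, -1.0]`, the minimizer of the toy objective.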

v1.5.1

6 years ago

Release-1.5.1

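This release mainly fixes bugs found in 1.5.0 and further optimizes for high-dimensional sparse models.

  • Angel Core

    • [ISSUE-330] Optimized the snapshot write path so that the model is no longer locked while a snapshot is written
    • [ISSUE-333] Optimized the default model partitioning algorithm and RPC performance for high-dimensional sparse models
    • [ISSUE-327] Fixed netty channels being closed because of a corrupted internal counter in common-pool
    • [ISSUE-328] Fixed an NPE when yarn.application.classpath or mapreduce.application.classpath is set to an empty string
    • [ISSUE-331] Fixed a possible NPE in heartbeat detection in Angel-PS startup mode
    • [ISSUE-334] Fixed occasional container allocation failures
    • [ISSUE-338] Fixed algorithm metric display failing to convert strings such as NaN/INF to Double
  • MLlib

    • [ISSUE-324] Fixed LR prediction not including the bias
  • Spark on Angel

    • [ISSUE-336] SparseLRWithFTRL can now read Libsvm-format data
    • [ISSUE-337] Refactored the online-learning FTRL algorithm
    • [ISSUE-340] SparseLRWithOWLQN: removed the murmurhash computation
  • Documentation

    • [ISSUE-339] Added documentation for the FM algorithm
    • Updated parameter descriptions in some algorithm documents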
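
For readers unfamiliar with the Libsvm format mentioned in [ISSUE-336], each line holds a label followed by `index:value` pairs for the non-zero features. The helper below is a hypothetical stand-alone parser written for illustration. It is not the reader used by SparseLRWithFTRL, and it does not handle extensions such as `qid:` tokens.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Returns None for blank lines. Anything after '#' is treated as a comment.
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)  # only non-zero entries are stored
    return label, features
```

For example, `parse_libsvm_line("1 3:0.5 10:2")` yields the label `1.0` and a sparse feature map with entries at indices 3 and 10.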

v1.5.0

6 years ago

Release-1.5.0

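In Angel 1.5.0, we made extensive optimizations for high-dimensional sparse models. At the system level, we refactored flow control and exception handling to improve stability and increase the load the PS can sustain. At the algorithm level, we refactored the optimizer library and algorithms such as LR and FM, and implemented OWLQN-based and FTRL-based LR with sparse-model support on Spark on Angel, which markedly improves computational efficiency.

Core

  1. Refactored the PS flow-control mechanism
  2. Refactored the exception-handling mechanism for data RPC
  3. Refactored the model partitioning algorithm to improve how high-dimensional sparse models are split
  4. Improved memory efficiency when a Worker contains only one Task
  5. Improved distributed serving, with support for a distributed run mode and batch prediction with large models
  6. Added support for sparse long-type vectors in PSF functions
  7. BugFix: fixed a problem where the client could hang when sending requests at high concurrency
  8. BugFix: fixed the model metadata text file being written with a replica count of 1

MLlib

  1. Refactored the LR algorithm: addressed slow convergence and insufficient sparsity under L1 regularization, and improved handling of the bias term
  2. Refactored the FM algorithm, greatly improving computational efficiency for high-dimensional models
  3. Refactored the optimizer library and added optimizers such as Momentum, AdaGrad, AdaDelta, and Adam. The new optimizer library is still a trial feature and will gradually replace the old one
  4. BugFix: fixed incorrect sparsity calculation for sparse models

Spark on Angel

  1. Added SparseLR with FTRL, which uses mini-batch asynchronous FTRL gradient updates
  2. Added SparseLR with OWLQN
  3. Improved SparsePSVector merge and computation performance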
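
To make the FTRL-based sparse LR above more concrete, here is a minimal single-machine sketch of the per-coordinate FTRL-Proximal update for logistic regression. It is an assumption-laden illustration, not Angel's SparseLR with FTRL: it processes one example at a time synchronously (no mini-batching, no asynchronous parameter server), and the class name `FTRLProximal` and its parameters are hypothetical.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse inputs.

    Features are dicts {index: value}; only touched coordinates hold state.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # accumulated adjusted gradients
        self.n = {}  # accumulated squared gradients

    def _weight(self, i):
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 keeps this coordinate exactly zero
        sign = 1.0 if z > 0 else -1.0
        n = self.n.get(i, 0.0)
        return -(z - sign * self.l1) / ((self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def predict(self, x):
        score = sum(self._weight(i) * v for i, v in x.items())
        score = max(min(score, 35.0), -35.0)  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, x, y):
        """Update on one example with label y in {0, 1}; returns the prediction."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of log loss for coordinate i
            w = self._weight(i)
            n = self.n.get(i, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * w
            self.n[i] = n + g * g
        return p
```

Because the weight of a coordinate is derived lazily from `z` and `n`, coordinates whose accumulated `z` stays within the L1 threshold remain exactly zero, which is what makes this family of updates attractive for sparse models.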

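Documentation

  1. Comprehensively updated the algorithm documentation, improved formula rendering, and adjusted the names and meanings of some algorithm parameters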

v1.4.0

6 years ago

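In Angel 1.4.0, we carried out a major refactoring of the core, laying the groundwork for formally introducing Distributed Serving in later releases. This release also supports 64-bit feature IDs and speeds up PS failure recovery in order to support Spark Streaming on Angel. FTRL has been fully switched to the Spark Streaming approach and optimized. Overall, this release lays a solid foundation for future upgrades.

Core

  1. Improved PS failure recovery: added a recovery mode based on model partition replicas, so that a PS can recover quickly after a crash
  2. Added support for 64-bit feature IDs, with a new parsing interface for training samples with 64-bit indices
  3. Introduced Distributed Serving, which provides a distributed inference service based on sharing multiple model replicas (Alpha)

PySpark

  1. Migrated from Python 2 to Python 3; from now on Angel supports only Python 3

MLlib

  1. GBDT now supports discrete features and adds a regression type, bringing its functionality closer to XGBoost
  2. Optimized the LR algorithm: models can be fetched by feature index, and 64-bit sparse models are supported

Spark on Angel

  1. Fixed the Angel-PS exit issue after a Spark on Angel job fails
  2. Added Local Vector and improved the PSVector interface
  3. Optimized GBDT and fixed bugs such as prediction results not being converted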

v1.3.0

6 years ago

Release-1.3.0

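Angel 1.3.0 delivers PyAngel, the Python interface, as promised, and brings in the FTRL algorithm for Spark Streaming on Angel ahead of schedule. The core and the existing algorithms also received many optimizations and additions, and Spark on Angel now supports sparsity. This is a lively release with many new features.

Core

  1. Partial model pulls: PSModel adds a getRowWithIndex method that pulls only selected dimensions of the features (Experimental)
  2. Bug fixes
    • Jobs hanging when the dimension exceeds the configured value
    • Mismatch between the worker log URL port and the Yarn web port
    • Some streams and sockets not being closed promptly in certain cases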
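
The idea behind pulling only part of a model (item 1 above) can be sketched with a toy in-memory structure. Everything in this example is hypothetical: `ToyShardedVector`, `increment`, and `pull_with_index` are illustrative names, the range-based sharding is an assumption, and none of it reflects the actual signature or behavior of PSModel's getRowWithIndex.

```python
class ToyShardedVector:
    """Toy stand-in for a very wide vector split by index range across shards."""

    def __init__(self, dim, num_shards):
        self.dim = dim
        self.shard_size = -(-dim // num_shards)  # ceiling division
        # Each shard stores only non-zero entries, so a huge `dim` stays cheap.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_of(self, index):
        if not 0 <= index < self.dim:
            raise IndexError(f"index {index} out of range")
        return index // self.shard_size

    def increment(self, updates):
        """Add each delta in {index: delta} to the owning shard."""
        for index, delta in updates.items():
            shard = self.shards[self._shard_of(index)]
            shard[index] = shard.get(index, 0.0) + delta

    def pull_with_index(self, indices):
        """Fetch only the requested indices, visiting each shard at most once."""
        by_shard = {}
        for index in indices:
            by_shard.setdefault(self._shard_of(index), []).append(index)
        result = {}
        for shard_id, wanted in by_shard.items():
            shard = self.shards[shard_id]
            for index in wanted:
                result[index] = shard.get(index, 0.0)
        return result
```

A worker that only sees a handful of active features in its mini-batch can then request just those indices instead of transferring the whole row, which is the motivation for a partial pull in a high-dimensional sparse model.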

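MLlib

  1. Added the FTRL optimization method and FTRL LR (for validation on offline datasets; for the production version see Online Learning)
  2. Improved the MLR algorithm

PyAngel

  1. Wrapped and exposed the Angel algorithms on top of the MLRunner API
  2. Supports two submission modes: script and interactive
  3. Supports two run modes: Local and Yarn

Spark on Angel

  1. PS Function supports sparsity
  2. PSVector/PSMatrix support sparsity
  3. Bug fixes
    • PullMan/PushMan preventing VectorPool from reclaiming vectors
    • Fixed minor bugs in LogisticRegression

MLlib (Spark on Angel)

  1. Introduced the RDD sliceAggregate operator to address the low efficiency of high-dimensional data aggregation in Spark
  2. Online Learning (FTRL)
    • Implemented a production-ready FTRL algorithm (SparseLRWithFTRL) and the corresponding Optimizer on top of Spark Streaming on Angel

Documentation

  • Spark on Angel documentation fully updated
  • MLR and ADMM documentation updated
  • LDA documentation updated
  • FTRL documentation updated

~~~ Acknowledgement ~~~

We thank the following developers for their contributions to this release:

  • shunanzhang: started the second round of documentation translation and synchronization (#245)
  • ericzhang-cn: fixed many bugs and added Predict for FTRL

We are also deeply grateful for the enthusiastic feedback and suggestions from users inside and outside the company. BTW: with the LongKey upgrade in the previous release, Angel now supports algorithms and workloads within the company at the ten-billion-dimension scale.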

v1.2.1

6 years ago

Release-1.2.1

Angel 1.2.1 adds a new model output format with loading and conversion tools, improves the algorithm library, and provides Logistic Regression with a configurable model format. The Spark on Angel interface has been further refactored and improved, and a Spark on Angel version of the GBDT algorithm is introduced.

Angel Core

  • Refactored the model format to solve the problem of too many output files
  • Models are now loaded and exported concurrently
  • Added new tools for model loading and format conversion
  • Improved performance of sparse matrix computation

Angel MLlib

  • LR: the model format (dense or sparse) is now configurable
  • GBDT: improved performance when there is a large number of trees; added psFuncs for two-stage splitting and low-precision compression; fixed an indexing problem in feature subsampling and a parameter initialization problem
  • LDA: the model is now updated through PSF; improved memory usage; added a WarpLDA variant
  • GradientDescent/Loss interfaces made generic to support three model formats: dense double, sparse double, and sparse double with longkey

Spark on Angel

  • Improved interfaces
    • Split PSClient into Initializer, VectorOps, and MatrixOps
    • Improved BreezePSVector and CachePSVector
  • Added the GBDT algorithm

Incompatible Changes

  • IMPORTANT: removed the generic type parameter from the PSModel declaration; the type is now set through setRowType

Documentation

  • Added documentation for utility tools: metrics usage, model loading/conversion usage
  • Continued internationalization of the documentation
  • Updated documentation for Spark on Angel and some algorithms

~~~ Acknowledgement ~~~

Angel 1.2.1 continued to receive help from contributors around the world. We thank the following developers for their contributions to this release:

  1. shunanzhang: continued high-quality documentation translation
  2. chris: WarpLDA implementation
  3. cstur4: model loading optimization

We are also deeply grateful for the helpful feedback and suggestions from the many company users in the Angel QQ group.

v1.2.0

6 years ago

Release-1.2.0

Angel 1.2.0 brings a number of optimizations and improvements, adds two new algorithms, and fixes several bugs. We recommend that all users upgrade to this version in preparation for the upcoming 1.3.0 release.

Angel Core

  1. Added support for Sparse Double Vector/Matrix for long type key
  2. Optimized performance of Sparse Vector: added sparse Vector/Matrix that supports parallel operations
  3. Optimized performance of PS RPC: optimized network model, separated IO operations and RPC request handling
  4. Improved MatrixOpLog type: added Sparse Double, Dense Int and Sparse Float types

Angel MLlib

  1. Added the MLR algorithm
  2. Enhanced FM: uses PSF to initialize the model, improves computation performance, and adds a classification method
  3. Optimized GBDT: uses PSF to search for the best split point
  4. Upgraded LDA to LDA* (consistent with the VLDB 2017 paper referenced in the README file) and optimized its performance

Spark on Angel

  1. Added KMeans algorithm
  2. Refactored the interface: the PSVectorPool concept is now hidden
  3. Models now support Matrix

Documentation

  1. Improved psFunc and Core-API documentation
  2. Added usage guides for the PSModel format conversion tool and for global metrics
  3. 90% of documentation available in English

Interface Optimization

  1. PSModel: added the syncClock interface; calling syncClock is recommended in place of clock().get()
  2. DataBlock: added the loopingRead interface, which allows data to be read repeatedly for training

~~~ Acknowledgement ~~~

Angel 1.2.0 continued to receive help from contributors around the world. We thank the following developers for their contributions to this release:

  1. hbghhy: Implementation of KMeans on Spark on Angel
  2. hbghhy: Added the MLR algorithm, which Alibaba uses for CTR prediction
  3. shunanzhang: Continued translation for documentation
  4. [SkyData] Augusto Yao: Fixed a number of bugs [112, 188]
  5. [Xiaomi] luosmart: Fixed a number of bugs [198, ...]

We are also very grateful for the helpful feedback and suggestions from users in the Angel QQ group.

v1.1.0

6 years ago

Release 1.1.0

Angel 1.1.0 is an incremental improvement release. It fixes many bugs from the initial release and adds the following features and small improvements:

Angel Core

  1. Introduced concurrency control in psFunc update
  2. Model optimization: added support for conversion to plaintext format
  3. Upgraded Netty to version 4.x

Angel MLlib

  1. Improved PSModel interface
  2. Added y-intercept to Logistic Regression
  3. Added Predict function to ADMM LR
  4. Implemented the basic FM algorithm
  5. Improved the computation and log output of global algorithm metrics
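
The effect of the y-intercept added to Logistic Regression (item 2 above) can be shown with a small generic sketch. This is plain logistic regression arithmetic, not Angel code, and the names `sigmoid` and `predict_proba` are chosen for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, intercept, x):
    """Probability of the positive class for a dense feature vector x."""
    score = intercept + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(score)
```

Without an intercept (`intercept=0.0`), an all-zero feature vector always maps to a probability of 0.5 regardless of the weights. The intercept lets the model shift that baseline, which matters when the classes are imbalanced.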

Spark on Angel

  1. Improved pull/push performance when running multiple tasks

Documentation

  1. Model partitioning
  2. Synchronization protocol
  3. Resource estimation
  4. Global metrics

Incompatible Changes

  1. Renamed AngelConfiguration to AngelConf
  2. Removed TConstants

~~~ Acknowledgement ~~~

We thank all developers who contributed to this release:

  1. Guoqiang Li from Huawei: upgraded Netty to version 4.x
  2. Yan Facai from Weibo: added y-intercept for logistic regression
  3. shunanzhang: added English translations for documentation
  4. Qingdi Meng & Liu Shaohui from Xiaomi: fixed bugs

v1.0.0

6 years ago

Release v1.0.0

  1. ParameterServer Functionalities

    • Automatically partitions and manages models represented as matrices/vectors, supporting both sparse and dense formats
    • Supports push and pull operations on models, along with custom complex psFunc
    • Provides multiple synchronization control mechanisms (BSP/SSP/ASP)
  2. Development & Execution

    • Language support: the system is developed in Scala and Java, and users are free to choose either one
    • Easy deployment: runs directly on the community version of Yarn and also supports a local debugging mode
    • Data partitioning: automatically partitions and reads training data, compatible with the Hadoop FS interface by default
    • Incremental training: checkpoints are written automatically during training, and training can continue incrementally from a loaded model
  3. PS Service

    • Can start only PSServer and PSAgent to provide PS service to other distributed computing platforms
    • With PS-Service, Spark-on-Angel algorithms can be developed without modifying Spark core code; this mode seamlessly supports the Breeze numerical library
  4. Algorithms Library

    • Includes machine learning algorithms such as Logistic Regression, SVM, KMeans, LDA, MF, and GBDT
    • Supports various optimization methods, including ADMM, OWLQN, LBFGS, and GD
    • Supports various loss functions and evaluation metrics, including L1 and L2 regularization terms
  5. Algorithm Optimization

    • LDA: uses the F+LDA algorithm to speed up sampling, and fetches parameters in a streaming fashion to reduce network latency
    • GBDT: uses a two-stage tree-splitting algorithm that moves part of the computation to the PS, reducing network traffic and improving speed
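
The BSP/SSP/ASP mechanisms listed under ParameterServer Functionalities are commonly described in terms of a staleness bound: how far a worker may run ahead of the slowest worker. The sketch below shows that general rule only. The function name `can_advance` and its arguments are hypothetical and do not correspond to Angel's API.

```python
def can_advance(worker_clock, all_clocks, staleness):
    """Return True if a worker may start its next iteration.

    worker_clock: iterations this worker has completed.
    all_clocks:   completed-iteration counts of every worker (including this one).
    staleness:    how many iterations a worker may lead the slowest worker.
                  0 behaves like BSP, a positive bound like SSP,
                  and float("inf") like ASP.
    """
    return worker_clock - min(all_clocks) <= staleness


clocks = [3, 5, 4]
bsp_ok = can_advance(5, clocks, 0)             # fastest worker must wait
ssp_ok = can_advance(5, clocks, 2)             # allowed to lead by up to 2
asp_ok = can_advance(5, clocks, float("inf"))  # never waits
```

Under this framing the three modes differ only in the bound, trading consistency of the parameters a worker reads against time spent waiting for stragglers.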