PaddlePaddle Serving Versions

A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)

v0.9.0

1 year ago

New Features

  • Integrate Paddle 2.3 Inference: #1781
  • C++ Serving asynchronous framework automatic batching: #1685
  • C++ Serving adapted to JetPack 4.6: #1700
  • C++ Serving asynchronous framework supports 2-D LoD padding: #1713
  • Distributed inference for large models: #1753, #1783
  • C++ Serving supports TensorRT dynamic shape: #1759

Feature Enhancements

  • Update the C++ Serving OCR deployment example: #1759
  • Add automatic generation of TensorRT dynamic shapes in Python Pipeline: #1778
  • Add a Python Pipeline low-precision deployment example: #1753
  • Add offline wheel installation: #1792
  • Upgrade the protobuf Response structure shared by the front end and back end: #1783

Documentation and Example Changes

  • Add an AIStudio OCR hands-on tutorial (home page)
  • Add a government-affairs Q&A solution (home page)
  • Add an intelligent Q&A solution (home page)
  • Add a semantic indexing solution (home page)
  • Add a PaddleNLP example: #1773
  • Add doc/Install_Linux_Env_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Int_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Features_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Optimize_CN.md: #1788
  • Modify README.md: #1788
  • Modify README_CN.md: #1788
  • Modify doc/C++_Serving/ABTest_CN.md: #1788
  • Modify doc/C++_Serving/Asynchronous_Framwork_CN.md: #1788
  • Modify doc/C++_Serving/Encryption_CN.md: #1788
  • Modify doc/C++_Serving/Hot_Loading_CN.md: #1788
  • Modify doc/C++_Serving/Inference_Protocols_CN.md: #1788
  • Modify doc/C++_Serving/Model_Ensemble_CN.md: #1788
  • Modify doc/C++_Serving/OP_CN.md: #1788
  • Modify doc/C++_Serving/Performance_Tuning_CN.md: #1788
  • Modify doc/C++_Serving/Request_Cache_CN.md: #1788
  • Modify doc/Compile_CN.md: #1788
  • Modify doc/Compile_EN.md: #1788
  • Modify doc/Docker_Images_CN.md: #1788
  • Modify doc/Docker_Images_EN.md: #1788
  • Modify doc/FAQ_CN.md: #1788
  • Modify doc/Install_CN.md: #1788
  • Modify doc/Install_EN.md: #1788
  • Modify doc/Java_SDK_CN.md: #1788
  • Modify doc/Java_SDK_EN.md: #1788
  • Modify doc/Latest_Packages_CN.md: #1788
  • Modify doc/Latest_Packages_EN.md: #1788
  • Modify doc/Model_Zoo_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Benchmark_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Design_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Design_EN.md: #1788
  • Modify doc/Run_On_Kubernetes_CN.md: #1788
  • Modify doc/Save_CN.md: #1788
  • Modify doc/Serving_Auth_Docker_CN.md: #1788
  • Modify doc/Serving_Configure_CN.md: #1788
  • Modify doc/Serving_Configure_EN.md: #1788

Bug Fixes

  • Fix a pptsn_reader dependency that affected app.reader and prevented installing the required library on ARM machines: #1752
  • Fix a BOS binary path issue: #1734

v0.8.3

2 years ago

New Features

  • Add a compilation environment check for C++ Serving and Pipeline Serving #1584
  • C++ Serving supports customizing the log output path #1592
  • Add dynamic shape configuration and an example when using TensorRT #1590
  • Add Prometheus monitoring for Python Pipeline Serving #1586
  • Add Prometheus monitoring for C++ Serving #1568 #1576 #1577
  • Support heterogeneous hardware, including x86 + DCU, ARM + Ascend 310, and ARM + Ascend 910 #1544
  • Support Python 3.9

Performance Optimization

  • Add a request result cache to C++ Serving; identical requests return the cached result directly #1585, #1588

Feature Enhancements

  • A more convenient way to chain multiple models in C++ Serving #1546
  • Upgrade the Dockerfiles and add a CentOS Dockerfile #1618 #1594
  • Add bf16 low-precision support to Pipeline Serving #1594 #1554

Documentation and Example Changes

  • Add a PP-ShiTu example #1572
  • Add a PaddleNLP example #1609
  • Add an environment check document #1643
  • Add a document on dynamic-shape TensorRT usage #1643
  • Add heterogeneous hardware usage documents #1641, #1654
  • Add a request cache usage document #1641, #1588

Bug Fixes

  • Fix a memory leak in the asynchronous framework #1589
  • Fix handling of list[str] input in Pipeline Serving #1598

v0.7.0

2 years ago

New Features

  • Integrate Intel MKL-DNN to accelerate inference #1264, #1266, #1277
  • C++ Serving supports HTTP requests #1321
  • C++ Serving supports gRPC and HTTP + Proto requests #1345
  • Add a C++ Client SDK #1370

Performance Optimization

  • Optimize how C++ Serving passes data through pybind #1268, #1269
  • Add GPU multi-stream support and an asynchronous task queue to C++ Serving, and remove redundant locking #1289
  • The C++ Serving web server uses a connection pool and data compression #1348
  • Add asynchronous batch merging to the C++ Serving framework, with support for variable-length LoD input #1366
  • Concurrent execution of C++ Serving stages #1376
  • Add per-stage processing-time logs to C++ Serving #1390

Feature Changes

  • Rewrite the model saving scheme and naming rules, staying compatible with older versions #1354, #1358
  • Support more data types: float64, int16, float16, uint16, uint8, int8, bool, complex64, complex128 #1338
  • Rewrite the logic that maps GPU IDs to devices #1303
  • Specify a fetch list to return only part of the inference results (see the client sketch after this list) #1359
  • Set the XPU ID #1436
  • Graceful service shutdown #1470
  • The C++ Serving client pybind layer supports reading and writing uint8 and int8 data #1378
  • The C++ Serving client pybind layer supports reading and writing uint16 and int16 data #1420
  • C++ Serving supports asynchronous parameter settings #1483
  • Add a While OP control loop to Python Pipeline #1338
  • Python Pipelines can interact with each other over gRPC #1358
  • Python Pipeline supports exchanging Tensor data in the Proto structure #1369, #1384
  • Python Pipeline can take only the fastest preceding OP's result #1380
  • Python Pipeline supports LoD-type input #1472
  • Add a Python HTTP request example for the Cube service #1399
  • Add a RecordFile reading tool to the Cube service #1336
  • Optimize the production deployment of Cube-server and Cube-transfer #1337
  • Remove multi-lang related code #1321
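
The fetch-list change above (#1359) is easiest to see from the Python RPC client. Below is a minimal sketch: the endpoint, config path, and the "image"/"prob" variable names are placeholders for illustration rather than from a specific example; read the real names from your exported serving_client_conf.prototxt.

```python
# Minimal RPC client sketch; variable names below are placeholders.
import numpy as np
from paddle_serving_client import Client

client = Client()
client.load_client_config("serving_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

# Feed a float32 tensor; listing only some of the model's outputs in `fetch`
# returns just that subset (the partial-fetch behavior noted above).
image = np.random.rand(1, 3, 224, 224).astype("float32")
fetch_map = client.predict(
    feed={"image": image},
    fetch=["prob"],   # placeholder output name
    batch=True)       # the feed already carries a batch dimension
print(fetch_map)
```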

Documentation and Example Changes

  • Restructure the doc directory and add subdirectories #1473, #1475
  • Move Serving/python/examples to Serving/examples and redesign the directory layout #1487
  • Rename doc files #1487
  • Add a C++ Serving benchmark #1176
  • Add a PaddleClas/DarkNet encrypted model deployment example #1352
  • Add the Model Zoo document #1492
  • Add the Install document #1473
  • Add the Quick_Start document #1473
  • Add the Serving_Configure document #1495
  • Add C++_Serving/Inference_Protocols_CN.md #1500
  • Add C++_Serving/Introduction_CN.md #1497
  • Add C++_Serving/Performance_Tuning_CN.md #1497
  • Add Python_Pipeline/Performance_Tuning_CN.md #1503
  • Update the Java SDK document #1357
  • Update the Compile document #1502
  • Update the README document #1473
  • Update Latest_Package_CN.md #1513
  • Update Run_On_Kubernetes_CN.md #1520

Bug Fixes

  • Fix a memory pool usage issue #1283
  • Fix incorrect locking in multi-threading #1289
  • Fix a failure to load the second model in C++ Serving multi-model combinations #1294
  • Fix an out-of-bounds issue with large request payloads #1308
  • Fix deviating results from detection models #1413
  • Fix an incorrect use_calib setting #1414
  • Fix incorrect results in the C++ OCR example #1415
  • Fix a core dump in parallel inference #1417

v0.6.0

2 years ago

Paddle Serving v0.6.0 Release note:

  • New Features:
    • Integrate Paddle 2.1 Inference, #1221
    • Support low-precision (fp16 and int8) inference, #1130, #1236
    • Deploy Serving services through Kubernetes, #1139, #1184, #1193
    • Add a security gateway deployed together with Serving, #1235
    • Support deploying Serving in x86 + XPU environments, #1080
  • Feature Enhancements:
    • Merge paddle_serving_server and paddle_serving_server_gpu into a unified Python package, #1082
    • Add mini-batch inference to Pipeline, #1186
    • Pipeline supports log rotation, #1238
    • Optimize Pipeline's eval handling of incoming data and add channel tracking logs, #1209
    • Refactor how C++ Serving calls the inference library, #1080
    • C++ Serving supports linear combinations of multiple models, #1124
    • C++ Serving resource management and optimization, #1143
    • Add String-type input to the C++ Serving interface, #1124
    • Optimize C++ Serving data assembly by replacing loop copies with memcpy, #1124
    • Add a GDB switch to the C++ Serving compilation options, #1124
    • Add benchmark scripts and update GPU benchmark data, #1197, #1175
  • Documentation Updates:
    • Add doc/PADDLE_SERVING_ON_KUBERNETES.md
    • Add doc/LOD.md
    • Add doc/LOD_CN.md
    • Add doc/PROCESS_DATA.md
    • Modify doc/PIPELINE_SERVING.md
    • Modify doc/PIPELINE_SERVING_CN.md
    • Modify doc/CREATING.md
    • Modify doc/SAVE.md
    • Modify doc/SAVE_CN.md
    • Modify doc/TENSOR_RT.md
    • Modify doc/TENSOR_RT_CN.md
    • Modify doc/MULTI_SERVICE_ON_ONE_GPU_CN.md
    • Modify doc/ENCRYPTION.md
    • Modify doc/ENCRYPTION_CN.md
    • Modify doc/DESIGN_DOC.md
    • Modify doc/DESIGN_DOC_CN.md
    • Modify doc/DOCKER_IMAGES.md
    • Modify doc/DOCKER_IMAGES_CN.md
    • Modify doc/LATEST_PACKAGES.md
    • Modify doc/COMPILE.md
    • Modify doc/COMPILE_CN.md
    • Modify doc/BERT_10_MINS.md
    • Modify doc/BERT_10_MINS_CN.md
    • Modify doc/BAIDU_KUNLUN_XPU_SERVING.md
    • Modify doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
    • Modify README.md
    • Modify README_CN.md
  • Demo Updates:
    • Add python/examples/low_precision/resnet50
    • Add python/examples/xpu/bert
    • Add python/examples/xpu/ernie
    • Add python/examples/xpu/vgg19
    • Add python/examples/pipeline/PaddleDetection/faster_rcnn
    • Add python/examples/pipeline/PaddleDetection/ppyolo_mbv3
    • Add python/examples/pipeline/PaddleDetection/yolov3
    • Add python/examples/pipeline/PaddleClas/DarkNet53
    • Add python/examples/pipeline/PaddleClas/HRNet_W18_C
    • Add python/examples/pipeline/PaddleClas/MobileNetV1
    • Add python/examples/pipeline/PaddleClas/MobileNetV2
    • Add python/examples/pipeline/PaddleClas/MobileNetV3_large_x1_0
    • Add python/examples/pipeline/PaddleClas/ResNeXt101_vd_64x4d
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_FPGM
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_KL
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_PACT
    • Add python/examples/pipeline/PaddleClas/ResNet_V2_50
    • Add python/examples/pipeline/PaddleClas/ShuffleNetV2_x1_0
    • Add python/examples/pipeline/bert
    • Add python/examples/ocr/ocr_cpp_client.py
    • Modify python/examples/bert [benchmark]
    • Modify python/examples/pipeline/ocr [benchmark]
  • Docker Updates:
    • Add runtime docker images (CPU, CUDA 10.1, CUDA 10.2, CUDA 11) (Py36, Py37, Py38)
    • Add development docker images for the CUDA 11 environment
    • Add Kubernetes demo images
  • Bug Fixes:
    • Fix non-standard code naming and unify the naming of model parameters in infer.h and paddle_engine.h. #1136
    • Fix a bug where part of the C++ framework was bypassed. #1124
    • Fix an exception in json.load under Python 3.5. #1124
    • Fix abnormal prediction results in the ssd_vgg16_300_240e_voc example caused by the missing feed_var 'im_shape'. #1180
    • Fix multiple gRPC errors caused by model path changes. #1147
    • Fix abnormal C++ log printing. #1154
    • Fix a missing thread parameter in WebService. #1136
    • Fix compilation errors introduced by golang. #1101
    • Fix Java gRPC bugs. #1215

v0.5.0

3 years ago
  • New Features

    • Support the Paddle 2.0 API
    • Support converting dynamic graph models to the Serving model format (see the sketch at the end of this release note)
    • Add a Java pipeline client
    • Support model encryption and decryption
    • Automatically adapt input shapes in the web server
    • Support prediction on XPU and ARM
  • Improve

    • Add more NVIDIA TensorRT demos
    • Add more Ubuntu-based docker images
    • Support Python 3.8
    • Support batch prediction in Pipeline Serving
  • Documents

    • Add doc/BAIDU_KUNLUN_XPU_SERVING.md
    • Add doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
    • Add doc/ENCRYPTION.md
    • Add doc/ENCRYPTION_CN.md
    • Modify README.md
    • Modify README_CN.md
    • Modify doc/COMPILE_CN.md
    • Modify doc/DOCKER_IMAGES_CN.md
    • Modify doc/LATEST_PACKAGES.md
    • Modify doc/RUN_IN_DOCKER_CN.md
    • Modify doc/SAVE_CN.md
    • Modify doc/ABTEST_IN_PADDLE_SERVING.md
    • Modify doc/COMPILE.md
    • Modify doc/DESIGN_DOC_CN.md
    • Modify doc/DESIGN_DOC.md
    • Modify doc/GRPC_IMPL_CN.md
    • Modify doc/JAVA_SDK.md
    • Modify doc/JAVA_SDK_CN.md
    • Delete doc/INFERENCE_TO_SERVING_CN.md
    • Delete doc/TRAIN_TO_SERVICE.md
    • Delete doc/TRAIN_TO_SERVICE_CN.md
  • Demos

    • Add examples/xpu/fit_a_line_xpu
    • Add examples/xpu/resnet_v2_50_xpu
    • Add examples/detection/faster_rcnn_r50_fpn_1x_coco
    • Add examples/detection/ppyolo_r50vd_dcn_1x_coco
    • Add examples/detection/ttfnet_darknet53_1x_coco
    • Add examples/detection/yolov3_darknet53_270e_coco
    • Add examples/encryption
    • Modify examples/bert
    • Modify examples/criteo_ctr
    • Modify examples/fit_a_line
    • Modify examples/grpc_impl_example
    • Modify examples/imdb
    • Modify examples/ocr
    • Modify examples/pipeline/imagenet
    • Modify examples/pipeline/imdb_model_ensemble
    • Modify examples/pipeline/ocr
  • Dockers

    • Add Docker: CPU on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 9.0 on Ubuntu 16 (GCC 4.8.2)
    • Add Docker: CUDA 10.0 on Ubuntu 16 (GCC 4.8.2)
    • Add Docker: CUDA 10.1 on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 10.2 on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 11.0 on Ubuntu 18 (GCC 8.2)
    • Add Docker: ARM CPU on CentOS 8 (GCC 7.3)
  • Fix Bugs

    • Fix an exception with batched gRPC requests
    • Fix an exception in pipeline batch queries
    • Fix wrong results when predicting YOLOv4 models from the Java client
    • Fix inconsistent codecs between Python 2.7 and Python 3.x
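
For the dynamic graph model conversion item under New Features, here is a minimal sketch assuming the inference_model_to_serving helper from paddle_serving_client.io described in doc/SAVE.md; the directory and file names are placeholders, and argument names may differ between versions.

```python
# A minimal conversion sketch, assuming the helper described in doc/SAVE.md;
# all paths and file names below are placeholders.
import paddle_serving_client.io as serving_io

# Convert a saved inference model (for example, one exported from a dynamic
# graph with paddle.jit.save) into Serving server- and client-side configs.
serving_io.inference_model_to_serving(
    dirname="./inference_model",        # directory holding the inference model
    serving_server="serving_server",    # output dir consumed by the server
    serving_client="serving_client",    # output dir consumed by the client
    model_filename="model.pdmodel",     # model file inside dirname
    params_filename="model.pdiparams")  # params file inside dirname
```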

v0.4.0

3 years ago
  • New Features
    • Support Java Client
    • Support TensorRT; add a docker image for CUDA 10.1 and TensorRT 6
    • Modify the LocalPredictor interface to align with the RPC interface usage (see the sketch at the end of this release note)
    • Add Pipeline Serving DAG deployment
    • Support Windows 10 (Web Service and Local Predictor only)
    • Add a built-in Serving model converter
  • Compatibility Improvements
    • Release a CUDA 10.1 version of paddle_serving_server_gpu
    • Release a Python 3.5 version of paddle_serving_client
    • Remove serving-client-app circular dependencies
    • Modify the versions of dependencies
    • Support LoD Tensor and replace the list type for batch input with a single numpy array
  • Framework Improvements
    • Pipeline DAG supports multiple GPUs
    • Lower the RPC thread restriction to 1
  • Documents
    • Modify "COMPILE"
    • Add "WINDOWS_TUTORIAL"
    • Add "PIPELINE_SERVING"
    • Modify "BERT_10_MIN"
  • New Demos
    • Pipeline demos
    • Java demo
  • Bug fixes
    • Fix a subprocess CUDA ERROR3 bug
    • Fix pip install dependencies
    • Fix an import error on Windows
    • Fix bugs in the web service
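
The LocalPredictor item under New Features means local prediction now uses the same feed/fetch call style as the RPC client. A minimal sketch under that assumption is below; the import path follows the project's examples, and the model directory and the "x"/"price" names are placeholders.

```python
# Sketch only: the model directory and variable names are placeholders; the
# import path and method names follow the LocalPredictor usage in the examples.
import numpy as np
from paddle_serving_app.local_predict import LocalPredictor

predictor = LocalPredictor()
predictor.load_model_config("serving_server")   # exported server-side model dir

# Same feed-dict / fetch-list style as paddle_serving_client.Client.predict()
x = np.random.rand(1, 13).astype("float32")
result = predictor.predict(feed={"x": x}, fetch=["price"])
print(result)
```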

v0.3.2

3 years ago
  • New Features
    • Support Paddle v1.8.4
    • Support the int64 data type
    • Add mem_optim and ir_optim APIs for WebService (see the sketch at the end of this release note)
    • Add preprocess and postprocess APIs for OCR in paddle_serving_app
    • Add a docker image for CUDA 10
  • Compatibility Improvements
    • Release a CUDA 10 version of paddle_serving_server_gpu
  • Framework Improvements
    • Optimize error messages in HTTP mode
    • Reduce server-side GPU memory usage
  • Documents
    • Modify "How to optimize performance?","Compile from source code","FAQ(Chinese)"
  • New Demo
    • yolov4
    • ocr
  • Bug fixes
    • Add a version requirement for protobuf to avoid paddle_serving_client import errors caused by old versions (https://github.com/PaddlePaddle/Serving/issues/728)
    • Fix compatibility issues for HTTP mode with Python 3
    • Fix the ctr_with_cube demo
    • Fix the CPU docker image
    • Fix compatibility issues for BlazeFacePostprocess with Python 3
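
A rough sketch of the new mem_optim / ir_optim switches on WebService. The overall workflow (load_model_config, prepare_server, run_rpc_service, run_web_service) follows the fit_a_line web-service demo of this era; passing mem_optim and ir_optim through prepare_server is an assumption based on this note, so check the shipped demo for the exact parameter names.

```python
# Sketch only: "uci" and uci_housing_model are placeholders from the
# fit_a_line demo; whether mem_optim/ir_optim are prepare_server keyword
# arguments is an assumption based on this release note.
from paddle_serving_server.web_service import WebService

service = WebService(name="uci")
service.load_model_config("uci_housing_model")
service.prepare_server(workdir="workdir", port=9292, device="cpu",
                       mem_optim=True,   # memory optimization switch
                       ir_optim=False)   # graph IR optimization switch
service.run_rpc_service()
service.run_web_service()
```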

v0.3.0

3 years ago
  • New Features
    • Add ir_optim and use_mkl (CPU version only) arguments
    • Support custom DAGs for the prediction service
    • The HTTP service supports batch prediction
    • The HTTP service supports startup via uwsgi
    • Support model file monitoring, remote pull, and hot loading
    • Support ABTest
    • Add image preprocessing, Chinese word segmentation, and Chinese sentiment analysis preprocessing modules, as well as image segmentation and image detection postprocessing modules, to paddle-serving-app (see the sketch at the end of this release note)
    • Add pre-trained model and sample code download to paddle-serving-app, and integrate a profiling function
    • Release CentOS 6 docker images for compiling Paddle Serving
  • Bug fixes
  • New documents
  • Performance optimization
    • Optimized the time spent on input and output memory copies in numpy.array format. In the ResNet50 ImageNet classification task, with a single concurrent client and batch size 1, QPS is 100.38% higher than in version 0.2.0.
  • Compatibility optimization
    • The client side removes the dependency on patchelf
    • Released paddle-serving-client for Python 2.7, 3.6, and 3.7
    • The server and client can be deployed on CentOS 6/7 and Ubuntu 16/18
  • More demos
    • Chinese sentiment analysis task: lac + senta
    • Image segmentation task: deeplabv3, unet
    • Image detection task: faster_rcnn
    • Image classification task: mobilenet, resnet_v2_50
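
For the paddle-serving-app preprocessing modules added above, here is a small image preprocessing sketch modeled on the ImageNet example; the operator names come from paddle_serving_app.reader, "daisy.jpg" is a placeholder file, and availability of every operator at this exact version is assumed.

```python
# Image preprocessing sketch modeled on the ImageNet example; "daisy.jpg" is a
# placeholder and operator availability at this exact version is assumed.
from paddle_serving_app.reader import (Sequential, File2Image, Resize,
                                       CenterCrop, RGB2BGR, Transpose, Div,
                                       Normalize)

preprocess = Sequential([
    File2Image(),              # read the image file into an ndarray
    Resize(256),               # resize the short side to 256
    CenterCrop(224),           # crop the central 224x224 patch
    RGB2BGR(),                 # match the channel order expected by the model
    Transpose((2, 0, 1)),      # HWC -> CHW
    Div(255),                  # scale pixel values to [0, 1]
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
])

img = preprocess("daisy.jpg")  # ndarray ready to feed to the client
print(img.shape)
```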

v0.2.0

4 years ago

Major Features and Improvements

  1. Support Paddle v1.7.1

  2. Improve ease of use.

    Support installing Paddle Serving with pip and Docker.

    Integrate with Paddle Training seamlessly.

    Start server with one command.

    Web service development supported.

  3. Provide two prediction service methods: RPC and HTTP (see the sketch after this list).

  4. Add client API support for Python and Go.

  5. CV, NLP and recommendation serving demos released.

  6. Add a Timeline tool for analyzing service performance.

  7. Performance improvement: with the Python RPC client, throughput improves by over 100% compared with the HTTP service in the previous release.
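
For item 3, a hedged sketch of the HTTP path is shown below, modeled on the fit_a_line web-service quick start; the service name "uci", port 9292, the 13 feature values, and the "x"/"price" variable names are placeholders from that example family and may differ for your model.

```python
# HTTP prediction sketch; service name, port, and variable names are
# placeholders modeled on the fit_a_line web-service quick start.
import requests

payload = {
    "feed": [{"x": [0.01, -0.11, 0.25, -0.07, 0.06, -0.05, -0.20,
                    -0.08, 0.06, -0.06, 0.06, -0.09, 0.03]}],
    "fetch": ["price"],
}
resp = requests.post("http://127.0.0.1:9292/uci/prediction", json=payload)
print(resp.json())
```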

Thanks to our Contributors

This release contains contributions from many people at Baidu, as well as: guru4elephant, wangjiawei04, MRXLT, barrierye

v0.0.3

4 years ago
  1. Support PaddlePaddle v1.6.1
  2. Distributed sparse parameter service: Cube. Cube is a high-performance distributed KV service designed for deep learning workloads, which has been tested and heavily used inside Baidu
  3. New module added as demo: BERT on GPU
  4. Elastic-CTR: Built around PaddlePaddle, Paddle Serving and Cube, Elastic-CTR is an end-to-end distributed training and serving solution. It is built entirely on Kubernetes (K8S), so users can easily deploy the solution on private clusters. See Elastic CTR solution deployment
  5. Build optimization: Paddle inference libraries are now downloaded from the official PaddlePaddle website instead of being built from source
  6. Build optimization: Dockerfiles are provided for building CPU and GPU Serving binaries. See the INSTALL instructions