PaddlePaddle Serving Versions

A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)

v0.9.0

1 year ago

New Features

  • Integrate Paddle 2.3 Inference: #1781
  • C++ Serving asynchronous framework automatic batching: #1685
  • C++ Serving adapted to JetPack 4.6: #1700
  • C++ Serving asynchronous framework supports 2-D LoD padding: #1713
  • Distributed inference for large models: #1753, #1783
  • C++ Serving supports TensorRT dynamic shape: #1759

Feature Enhancements

  • Update the C++ Serving OCR deployment example: #1759
  • Add automatic generation of TensorRT dynamic shapes in Python Pipeline: #1778
  • Add a Python Pipeline low-precision deployment example: #1753
  • Add offline wheel installation: #1792
  • Upgrade the protobuf Response structure shared by the front end and back end: #1783

Documentation and Example Changes

  • Add an AIStudio OCR hands-on tutorial (home page)
  • Add a government-affairs Q&A solution (home page)
  • Add an intelligent Q&A solution (home page)
  • Add a semantic indexing solution (home page)
  • Add a PaddleNLP example: #1773
  • Add doc/Install_Linux_Env_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Int_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Features_CN.md: #1788
  • Add doc/Python_Pipeline/Pipeline_Optimize_CN.md: #1788
  • Modify README.md: #1788
  • Modify README_CN.md: #1788
  • Modify doc/C++_Serving/ABTest_CN.md: #1788
  • Modify doc/C++_Serving/Asynchronous_Framwork_CN.md: #1788
  • Modify doc/C++_Serving/Encryption_CN.md: #1788
  • Modify doc/C++_Serving/Hot_Loading_CN.md: #1788
  • Modify doc/C++_Serving/Inference_Protocols_CN.md: #1788
  • Modify doc/C++_Serving/Model_Ensemble_CN.md: #1788
  • Modify doc/C++_Serving/OP_CN.md: #1788
  • Modify doc/C++_Serving/Performance_Tuning_CN.md: #1788
  • Modify doc/C++_Serving/Request_Cache_CN.md: #1788
  • Modify doc/Compile_CN.md: #1788
  • Modify doc/Compile_EN.md: #1788
  • Modify doc/Docker_Images_CN.md: #1788
  • Modify doc/Docker_Images_EN.md: #1788
  • Modify doc/FAQ_CN.md: #1788
  • Modify doc/Install_CN.md: #1788
  • Modify doc/Install_EN.md: #1788
  • Modify doc/Java_SDK_CN.md: #1788
  • Modify doc/Java_SDK_EN.md: #1788
  • Modify doc/Latest_Packages_CN.md: #1788
  • Modify doc/Latest_Packages_EN.md: #1788
  • Modify doc/Model_Zoo_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Benchmark_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Design_CN.md: #1788
  • Modify doc/Python_Pipeline/Pipeline_Design_EN.md: #1788
  • Modify doc/Run_On_Kubernetes_CN.md: #1788
  • Modify doc/Save_CN.md: #1788
  • Modify doc/Serving_Auth_Docker_CN.md: #1788
  • Modify doc/Serving_Configure_CN.md: #1788
  • Modify doc/Serving_Configure_EN.md: #1788

Bug Fixes

  • Fix a pptsn_reader dependency that affected app.reader and prevented installing the required library on ARM machines: #1752
  • Fix a BOS binary path issue: #1734

v0.8.3

2 years ago

New Features

  • Add a compilation environment check for C++ Serving and Pipeline Serving #1584
  • C++ Serving supports customizing the log output path #1592
  • Add dynamic shape configuration and an example when using TensorRT #1590
  • Add Prometheus monitoring for Python Pipeline Serving #1586
  • Add Prometheus monitoring for C++ Serving #1568 #1576 #1577
  • Support heterogeneous hardware, including x86 + DCU, ARM + Ascend 310, and ARM + Ascend 910 #1544
  • Support Python 3.9

Performance Optimization

  • Add a request result cache to C++ Serving; identical requests return the cached result directly #1585, #1588

Feature Enhancements

  • A more convenient way to chain multiple models in C++ Serving #1546
  • Upgrade the Dockerfiles and add a CentOS Dockerfile #1618 #1594
  • Add bf16 low-precision support to Pipeline Serving #1594 #1554

Documentation and Example Changes

  • Add a PP-ShiTu example #1572
  • Add a PaddleNLP example #1609
  • Add an environment check document #1643
  • Add a document on dynamic-shape TensorRT usage #1643
  • Add heterogeneous hardware usage documents #1641, #1654
  • Add a request cache usage document #1641, #1588

Bug Fixes

  • Fix a memory leak in the asynchronous framework #1589
  • Fix handling of list[str] input in Pipeline Serving #1598

v0.7.0

2 years ago

New Features

  • Integrate Intel MKL-DNN to accelerate inference #1264, #1266, #1277
  • C++ Serving supports HTTP requests #1321
  • C++ Serving supports gRPC and HTTP + Proto requests #1345
  • Add a C++ Client SDK #1370

Performance Optimization

  • Optimize how C++ Serving passes data through pybind #1268, #1269
  • Add GPU multi-stream support and an asynchronous task queue to C++ Serving, and remove redundant locking #1289
  • The C++ Serving web server uses a connection pool and data compression #1348
  • Add asynchronous batch merging to the C++ Serving framework, with support for variable-length LoD input #1366
  • Concurrent execution of C++ Serving stages #1376
  • Add per-stage processing-time logs to C++ Serving #1390

Feature Changes

  • Rewrite the model saving scheme and naming rules, staying compatible with older versions #1354, #1358
  • Support more data types: float64, int16, float16, uint16, uint8, int8, bool, complex64, complex128 #1338
  • Rewrite the logic that maps GPU IDs to devices #1303
  • Specify a fetch list to return only part of the inference results (see the client sketch after this list) #1359
  • Set the XPU ID #1436
  • Graceful service shutdown #1470
  • The C++ Serving client pybind layer supports reading and writing uint8 and int8 data #1378
  • The C++ Serving client pybind layer supports reading and writing uint16 and int16 data #1420
  • C++ Serving supports asynchronous parameter settings #1483
  • Add a While OP control loop to Python Pipeline #1338
  • Python Pipelines can interact with each other over gRPC #1358
  • Python Pipeline supports exchanging Tensor data in the Proto structure #1369, #1384
  • Python Pipeline can take only the fastest preceding OP's result #1380
  • Python Pipeline supports LoD-type input #1472
  • Add a Python HTTP request example for the Cube service #1399
  • Add a RecordFile reading tool to the Cube service #1336
  • Optimize the production deployment of Cube-server and Cube-transfer #1337
  • Remove multi-lang related code #1321
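
The fetch-list change above (#1359) is easiest to see from the Python RPC client. Below is a minimal sketch: the endpoint, config path, and the "image"/"prob" variable names are placeholders for illustration rather than from a specific example; read the real names from your exported serving_client_conf.prototxt.

```python
# Minimal RPC client sketch; variable names below are placeholders.
import numpy as np
from paddle_serving_client import Client

client = Client()
client.load_client_config("serving_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

# Feed a float32 tensor; listing only some of the model's outputs in `fetch`
# returns just that subset (the partial-fetch behavior noted above).
image = np.random.rand(1, 3, 224, 224).astype("float32")
fetch_map = client.predict(
    feed={"image": image},
    fetch=["prob"],   # placeholder output name
    batch=True)       # the feed already carries a batch dimension
print(fetch_map)
```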

Documentation and Example Changes

  • Restructure the doc directory and add subdirectories #1473, #1475
  • Move Serving/python/examples to Serving/examples and redesign the directory layout #1487
  • Rename doc files #1487
  • Add a C++ Serving benchmark #1176
  • Add a PaddleClas/DarkNet encrypted model deployment example #1352
  • Add the Model Zoo document #1492
  • Add the Install document #1473
  • Add the Quick_Start document #1473
  • Add the Serving_Configure document #1495
  • Add C++_Serving/Inference_Protocols_CN.md #1500
  • Add C++_Serving/Introduction_CN.md #1497
  • Add C++_Serving/Performance_Tuning_CN.md #1497
  • Add Python_Pipeline/Performance_Tuning_CN.md #1503
  • Update the Java SDK document #1357
  • Update the Compile document #1502
  • Update the README document #1473
  • Update Latest_Package_CN.md #1513
  • Update Run_On_Kubernetes_CN.md #1520

Bug Fixes

  • Fix a memory pool usage issue #1283
  • Fix incorrect locking in multi-threading #1289
  • Fix a failure to load the second model in C++ Serving multi-model combinations #1294
  • Fix an out-of-bounds issue with large request payloads #1308
  • Fix deviating results from detection models #1413
  • Fix an incorrect use_calib setting #1414
  • Fix incorrect results in the C++ OCR example #1415
  • Fix a core dump in parallel inference #1417

v0.6.0

2 years ago

Paddle Serving v0.6.0 Release note:

  • New Features:
    • Integrate Paddle 2.1 Inference, #1221
    • Support low-precision (fp16 and int8) inference, #1130, #1236
    • Deploy Serving services through Kubernetes, #1139, #1184, #1193
    • Add a security gateway deployed together with Serving, #1235
    • Support deploying Serving in x86 + XPU environments, #1080
  • Feature Enhancements:
    • Merge paddle_serving_server and paddle_serving_server_gpu into a unified Python package, #1082
    • Add mini-batch inference to Pipeline, #1186
    • Pipeline supports log rotation, #1238
    • Optimize Pipeline's eval handling of incoming data and add channel tracking logs, #1209
    • Refactor how C++ Serving calls the inference library, #1080
    • C++ Serving supports linear combinations of multiple models, #1124
    • C++ Serving resource management and optimization, #1143
    • Add String-type input to the C++ Serving interface, #1124
    • Optimize C++ Serving data assembly by replacing loop copies with memcpy, #1124
    • Add a GDB switch to the C++ Serving compilation options, #1124
    • Add benchmark scripts and update GPU benchmark data, #1197, #1175
  • Documentation Updates:
    • Add doc/PADDLE_SERVING_ON_KUBERNETES.md
    • Add doc/LOD.md
    • Add doc/LOD_CN.md
    • Add doc/PROCESS_DATA.md
    • Modify doc/PIPELINE_SERVING.md
    • Modify doc/PIPELINE_SERVING_CN.md
    • Modify doc/CREATING.md
    • Modify doc/SAVE.md
    • Modify doc/SAVE_CN.md
    • Modify doc/TENSOR_RT.md
    • Modify doc/TENSOR_RT_CN.md
    • Modify doc/MULTI_SERVICE_ON_ONE_GPU_CN.md
    • Modify doc/ENCRYPTION.md
    • Modify doc/ENCRYPTION_CN.md
    • Modify doc/DESIGN_DOC.md
    • Modify doc/DESIGN_DOC_CN.md
    • Modify doc/DOCKER_IMAGES.md
    • Modify doc/DOCKER_IMAGES_CN.md
    • Modify doc/LATEST_PACKAGES.md
    • Modify doc/COMPILE.md
    • Modify doc/COMPILE_CN.md
    • Modify doc/BERT_10_MINS.md
    • Modify doc/BERT_10_MINS_CN.md
    • Modify doc/BAIDU_KUNLUN_XPU_SERVING.md
    • Modify doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
    • Modify README.md
    • Modify README_CN.md
  • Demo Updates:
    • Add python/examples/low_precision/resnet50
    • Add python/examples/xpu/bert
    • Add python/examples/xpu/ernie
    • Add python/examples/xpu/vgg19
    • Add python/examples/pipeline/PaddleDetection/faster_rcnn
    • Add python/examples/pipeline/PaddleDetection/ppyolo_mbv3
    • Add python/examples/pipeline/PaddleDetection/yolov3
    • Add python/examples/pipeline/PaddleClas/DarkNet53
    • Add python/examples/pipeline/PaddleClas/HRNet_W18_C
    • Add python/examples/pipeline/PaddleClas/MobileNetV1
    • Add python/examples/pipeline/PaddleClas/MobileNetV2
    • Add python/examples/pipeline/PaddleClas/MobileNetV3_large_x1_0
    • Add python/examples/pipeline/PaddleClas/ResNeXt101_vd_64x4d
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_FPGM
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_KL
    • Add python/examples/pipeline/PaddleClas/ResNet50_vd_PACT
    • Add python/examples/pipeline/PaddleClas/ResNet_V2_50
    • Add python/examples/pipeline/PaddleClas/ShuffleNetV2_x1_0
    • Add python/examples/pipeline/bert
    • Add python/examples/ocr/ocr_cpp_client.py
    • Modify python/examples/bert [benchmark]
    • Modify python/examples/pipeline/ocr [benchmark]
  • Docker Updates:
    • Add runtime docker images (CPU, CUDA 10.1, CUDA 10.2, CUDA 11) (Py36, Py37, Py38)
    • Add development docker images for the CUDA 11 environment
    • Add Kubernetes demo images
  • Bug Fixes:
    • Fix non-standard code naming and unify the naming of model parameters in infer.h and paddle_engine.h. #1136
    • Fix a bug where part of the C++ framework was bypassed. #1124
    • Fix an exception in json.load under Python 3.5. #1124
    • Fix abnormal prediction results in the ssd_vgg16_300_240e_voc example caused by the missing feed_var 'im_shape'. #1180
    • Fix multiple gRPC errors caused by model path changes. #1147
    • Fix abnormal C++ log printing. #1154
    • Fix a missing thread parameter in WebService. #1136
    • Fix compilation errors introduced by golang. #1101
    • Fix Java gRPC bugs. #1215

v0.5.0

3 years ago
  • New Features

    • Support the Paddle 2.0 API
    • Support converting dynamic graph models to the Serving model format (see the sketch at the end of this release note)
    • Add a Java pipeline client
    • Support model encryption and decryption
    • Automatically adapt input shapes in the web server
    • Support prediction on XPU and ARM
  • Improve

    • Add more NVIDIA TensorRT demos
    • Add more Ubuntu-based docker images
    • Support Python 3.8
    • Support batch prediction in Pipeline Serving
  • Documents

    • Add doc/BAIDU_KUNLUN_XPU_SERVING.md
    • Add doc/BAIDU_KUNLUN_XPU_SERVING_CN.md
    • Add doc/ENCRYPTION.md
    • Add doc/ENCRYPTION_CN.md
    • Modify README.md
    • Modify README_CN.md
    • Modify doc/COMPILE_CN.md
    • Modify doc/DOCKER_IMAGES_CN.md
    • Modify doc/LATEST_PACKAGES.md
    • Modify doc/RUN_IN_DOCKER_CN.md
    • Modify doc/SAVE_CN.md
    • Modify doc/ABTEST_IN_PADDLE_SERVING.md
    • Modify doc/COMPILE.md
    • Modify doc/DESIGN_DOC_CN.md
    • Modify doc/DESIGN_DOC.md
    • Modify doc/GRPC_IMPL_CN.md
    • Modify doc/JAVA_SDK.md
    • Modify doc/JAVA_SDK_CN.md
    • Delete doc/INFERENCE_TO_SERVING_CN.md
    • Delete doc/TRAIN_TO_SERVICE.md
    • Delete doc/TRAIN_TO_SERVICE_CN.md
  • Demos

    • Add examples/xpu/fit_a_line_xpu
    • Add examples/xpu/resnet_v2_50_xpu
    • Add examples/detection/faster_rcnn_r50_fpn_1x_coco
    • Add examples/detection/ppyolo_r50vd_dcn_1x_coco
    • Add examples/detection/ttfnet_darknet53_1x_coco
    • Add examples/detection/yolov3_darknet53_270e_coco
    • Add examples/encryption
    • Modify examples/bert
    • Modify examples/criteo_ctr
    • Modify examples/fit_a_line
    • Modify examples/grpc_impl_example
    • Modify examples/imdb
    • Modify examples/ocr
    • Modify examples/pipeline/imagenet
    • Modify examples/pipeline/imdb_model_ensemble
    • Modify examples/pipeline/ocr
  • Dockers

    • Add Docker: CPU on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 9.0 on Ubuntu 16 (GCC 4.8.2)
    • Add Docker: CUDA 10.0 on Ubuntu 16 (GCC 4.8.2)
    • Add Docker: CUDA 10.1 on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 10.2 on Ubuntu 16 (GCC 8.2)
    • Add Docker: CUDA 11.0 on Ubuntu 18 (GCC 8.2)
    • Add Docker: ARM CPU on CentOS 8 (GCC 7.3)
  • Fix Bugs

    • Fix an exception with batched gRPC requests
    • Fix an exception in pipeline batch queries
    • Fix wrong results when predicting YOLOv4 models from the Java client
    • Fix inconsistent codecs between Python 2.7 and Python 3.x
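
For the dynamic graph model conversion item under New Features, here is a minimal sketch assuming the inference_model_to_serving helper from paddle_serving_client.io described in doc/SAVE.md; the directory and file names are placeholders, and argument names may differ between versions.

```python
# A minimal conversion sketch, assuming the helper described in doc/SAVE.md;
# all paths and file names below are placeholders.
import paddle_serving_client.io as serving_io

# Convert a saved inference model (for example, one exported from a dynamic
# graph with paddle.jit.save) into Serving server- and client-side configs.
serving_io.inference_model_to_serving(
    dirname="./inference_model",        # directory holding the inference model
    serving_server="serving_server",    # output dir consumed by the server
    serving_client="serving_client",    # output dir consumed by the client
    model_filename="model.pdmodel",     # model file inside dirname
    params_filename="model.pdiparams")  # params file inside dirname
```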

v0.4.0

3 years ago
  • New Features
    • Support Java Client
    • Support TensorRT; add a docker image for CUDA 10.1 and TensorRT 6
    • Modify the LocalPredictor interface to align with the RPC interface usage (see the sketch at the end of this release note)
    • Add Pipeline Serving DAG deployment
    • Support Windows 10 (Web Service and Local Predictor only)
    • Add a built-in Serving model converter
  • Compatibility Improvements
    • Release a CUDA 10.1 version of paddle_serving_server_gpu
    • Release a Python 3.5 version of paddle_serving_client
    • Remove serving-client-app circular dependencies
    • Modify the versions of dependencies
    • Support LoD Tensor and replace the list type for batch input with a single numpy array
  • Framework Improvements
    • Pipeline DAG supports multiple GPUs
    • Lower the RPC thread restriction to 1
  • Documents
    • Modify "COMPILE"
    • Add "WINDOWS_TUTORIAL"
    • Add "PIPELINE_SERVING"
    • Modify "BERT_10_MIN"
  • New Demos
    • Pipeline demos
    • Java demo
  • Bug fixes
    • Fix a subprocess CUDA ERROR3 bug
    • Fix pip install dependencies
    • Fix an import error on Windows
    • Fix bugs in the web service
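
The LocalPredictor item under New Features means local prediction now uses the same feed/fetch call style as the RPC client. A minimal sketch under that assumption is below; the import path follows the project's examples, and the model directory and the "x"/"price" names are placeholders.

```python
# Sketch only: the model directory and variable names are placeholders; the
# import path and method names follow the LocalPredictor usage in the examples.
import numpy as np
from paddle_serving_app.local_predict import LocalPredictor

predictor = LocalPredictor()
predictor.load_model_config("serving_server")   # exported server-side model dir

# Same feed-dict / fetch-list style as paddle_serving_client.Client.predict()
x = np.random.rand(1, 13).astype("float32")
result = predictor.predict(feed={"x": x}, fetch=["price"])
print(result)
```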

v0.3.2

3 years ago
  • New Features
    • Support Paddle v1.8.4
    • Support the int64 data type
    • Add mem_optim and ir_optim APIs for WebService (see the sketch at the end of this release note)
    • Add preprocess and postprocess APIs for OCR in paddle_serving_app
    • Add a docker image for CUDA 10
  • Compatibility Improvements
    • Release a CUDA 10 version of paddle_serving_server_gpu
  • Framework Improvements
    • Optimize error messages in HTTP mode
    • Reduce server-side GPU memory usage
  • Documents
    • Modify "How to optimize performance?","Compile from source code","FAQ(Chinese)"
  • New Demo
    • yolov4
    • ocr
  • Bug fixes
    • Add a version requirement for protobuf to avoid paddle_serving_client import errors caused by old versions (https://github.com/PaddlePaddle/Serving/issues/728)
    • Fix compatibility issues for HTTP mode with Python 3
    • Fix the ctr_with_cube demo
    • Fix the CPU docker image
    • Fix compatibility issues for BlazeFacePostprocess with Python 3
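
A rough sketch of the new mem_optim / ir_optim switches on WebService. The overall workflow (load_model_config, prepare_server, run_rpc_service, run_web_service) follows the fit_a_line web-service demo of this era; passing mem_optim and ir_optim through prepare_server is an assumption based on this note, so check the shipped demo for the exact parameter names.

```python
# Sketch only: "uci" and uci_housing_model are placeholders from the
# fit_a_line demo; whether mem_optim/ir_optim are prepare_server keyword
# arguments is an assumption based on this release note.
from paddle_serving_server.web_service import WebService

service = WebService(name="uci")
service.load_model_config("uci_housing_model")
service.prepare_server(workdir="workdir", port=9292, device="cpu",
                       mem_optim=True,   # memory optimization switch
                       ir_optim=False)   # graph IR optimization switch
service.run_rpc_service()
service.run_web_service()
```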

v0.3.0

3 years ago
  • New Features
    • Add ir_optim and use_mkl (CPU version only) arguments
    • Support custom DAGs for the prediction service
    • The HTTP service supports batch prediction
    • The HTTP service supports startup via uwsgi
    • Support model file monitoring, remote pull, and hot loading
    • Support ABTest
    • Add image preprocessing, Chinese word segmentation, and Chinese sentiment analysis preprocessing modules, as well as image segmentation and image detection postprocessing modules, to paddle-serving-app (see the sketch at the end of this release note)
    • Add pre-trained model and sample code download to paddle-serving-app, and integrate a profiling function
    • Release CentOS 6 docker images for compiling Paddle Serving
  • Bug fixes
  • New documents
  • Performance optimization
    • Optimized the time spent on input and output memory copies in numpy.array format. In the ResNet50 ImageNet classification task, with a single concurrent client and batch size 1, QPS is 100.38% higher than in version 0.2.0.
  • Compatibility optimization
    • The client side removes the dependency on patchelf
    • Released paddle-serving-client for Python 2.7, 3.6, and 3.7
    • The server and client can be deployed on CentOS 6/7 and Ubuntu 16/18
  • More demos
    • Chinese sentiment analysis task: lac + senta
    • Image segmentation task: deeplabv3, unet
    • Image detection task: faster_rcnn
    • Image classification task: mobilenet, resnet_v2_50
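
For the paddle-serving-app preprocessing modules added above, here is a small image preprocessing sketch modeled on the ImageNet example; the operator names come from paddle_serving_app.reader, "daisy.jpg" is a placeholder file, and availability of every operator at this exact version is assumed.

```python
# Image preprocessing sketch modeled on the ImageNet example; "daisy.jpg" is a
# placeholder and operator availability at this exact version is assumed.
from paddle_serving_app.reader import (Sequential, File2Image, Resize,
                                       CenterCrop, RGB2BGR, Transpose, Div,
                                       Normalize)

preprocess = Sequential([
    File2Image(),              # read the image file into an ndarray
    Resize(256),               # resize the short side to 256
    CenterCrop(224),           # crop the central 224x224 patch
    RGB2BGR(),                 # match the channel order expected by the model
    Transpose((2, 0, 1)),      # HWC -> CHW
    Div(255),                  # scale pixel values to [0, 1]
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
])

img = preprocess("daisy.jpg")  # ndarray ready to feed to the client
print(img.shape)
```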

v0.2.0

4 years ago

Major Features and Improvements

  1. Support Paddle v1.7.1

  2. Improve ease of use.

    Support installing Paddle Serving with pip and Docker.

    Integrate with Paddle Training seamlessly.

    Start server with one command.

    Web service development supported.

  3. Provide two prediction service methods: RPC and HTTP (see the sketch after this list).

  4. Add client API support for Python and Go.

  5. CV, NLP and recommendation serving demos released.

  6. Add a Timeline tool for analyzing service performance.

  7. Performance improvement: with the Python RPC client, throughput improves by over 100% compared with the HTTP service in the previous release.
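
For item 3, a hedged sketch of the HTTP path is shown below, modeled on the fit_a_line web-service quick start; the service name "uci", port 9292, the 13 feature values, and the "x"/"price" variable names are placeholders from that example family and may differ for your model.

```python
# HTTP prediction sketch; service name, port, and variable names are
# placeholders modeled on the fit_a_line web-service quick start.
import requests

payload = {
    "feed": [{"x": [0.01, -0.11, 0.25, -0.07, 0.06, -0.05, -0.20,
                    -0.08, 0.06, -0.06, 0.06, -0.09, 0.03]}],
    "fetch": ["price"],
}
resp = requests.post("http://127.0.0.1:9292/uci/prediction", json=payload)
print(resp.json())
```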

Thanks to our Contributors

This release contains contributions from many people at Baidu, as well as: guru4elephant, wangjiawei04, MRXLT, barrierye

v0.0.3

4 years ago
  1. Support PaddlePaddle v1.6.1
  2. Distributed sparse parameter service: Cube. Cube is a high-performance distributed KV service designed for deep learning workloads, which has been tested and heavily used inside Baidu
  3. New module added as demo: BERT on GPU
  4. Elastic-CTR: Built around PaddlePaddle, Paddle Serving and Cube, Elastic-CTR is an end-to-end distributed training and serving solution. It is built entirely on Kubernetes (K8S), so users can easily deploy the solution on private clusters. See Elastic CTR solution deployment
  5. Build optimization: Paddle inference libraries are now downloaded from the official PaddlePaddle website instead of being built from source
  6. Build optimization: Dockerfiles are provided for building CPU and GPU Serving binaries. See the INSTALL instructions