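cube studio is an open-source, cloud-native, one-stop machine learning/deep learning AI platform. It supports SSO login, multi-tenancy and multiple project groups, big data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-machine multi-GPU distributed training, hyperparameter search, inference services with VGPU, edge computing, serverless, an annotation platform, automated annotation, dataset management, large model fine-tuning, vllm large model inference, llmops, private knowledge bases, and an AI model app store. It also supports one-click model development/inference/fine-tuning, domestically produced cpu/gpu/npu chips, RDMA, and distributed computing with pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.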
English | 简体中文
cube-studio is a one-stop, cloud-native machine learning platform open-sourced by Tencent Music. It currently includes the following functions:
https://github.com/tencentmusic/cube-studio/wiki
For learning, deployment, consulting, contribution, or cooperation, join the group (WeChat ID luanpeng1234, remark <open source>) or see the construction guide.
tips:
template | type | description |
---|---|---|
linux | base | Custom stand-alone runtime environment; free to implement any custom stand-alone function |
datax | import export | Import and export of heterogeneous data sources |
hadoop | data processing | hdfs, hbase, sqoop, and spark clients |
sparkjob | data processing | spark serverless |
volcanojob | data processing | volcano multi-machine distributed framework |
ray | data processing | python ray multi-machine distributed framework |
ray-sklearn | machine learning | sklearn on the ray framework, supporting multi-machine distributed parallel computing |
xgb | machine learning | xgb model training and inference |
tfjob | deep learning | Multi-machine distributed training of tensorflow |
pytorchjob | deep learning | Multi-machine distributed training of pytorch |
horovod | deep learning | Multi-machine distributed training of horovod |
paddle | deep learning | Multi-machine distributed training of paddle |
mxnet | deep learning | Multi-machine distributed training of mxnet |
kaldi | deep learning | Multi-machine distributed training of kaldi |
tfjob-train | model train | distributed training of tensorflow: plain and runner |
tfjob-runner | model train | distributed training of tensorflow: runner method |
tfjob-plain | model train | distributed training of tensorflow: plain method |
tf-model-evaluation | model evaluate | distributed model evaluation of tensorflow 2.3 |
tf-offline-predict | model inference | distributed offline model inference of tensorflow 2.3 |
model-register | model service | register model to platform |
model-offline-predict | model service | distributed offline model inference of framework |
deploy-service | model service | deploy inference service |
media-download | multimedia data processing | Distributed download of media files |
video-audio | multimedia data processing | Distributed extraction of audio from video |
video-img | multimedia data processing | Distributed extraction of pictures from video |
yolov7 | machine vision | object detection with yolov7 |
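
When choosing a template for a pipeline task, it can help to look at the table by type rather than by name. The following is a minimal sketch of that lookup, assuming a hand-copied subset of the rows above; `TEMPLATES` and `group_by_type` are hypothetical names used only for illustration and are not part of the cube-studio API.

```python
from collections import defaultdict

# Hypothetical in-memory copy of a few rows from the template table.
# Names and types are taken from the table; the data structure itself is
# an illustration and not something cube-studio exposes.
TEMPLATES = [
    ("datax", "import export"),
    ("sparkjob", "data processing"),
    ("ray", "data processing"),
    ("xgb", "machine learning"),
    ("tfjob", "deep learning"),
    ("pytorchjob", "deep learning"),
    ("deploy-service", "model service"),
    ("yolov7", "machine vision"),
]


def group_by_type(templates):
    """Return a dict mapping each template type to its template names."""
    groups = defaultdict(list)
    for name, kind in templates:
        groups[kind].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Print one line per type, listing the templates in table order.
    for kind, names in group_by_type(TEMPLATES).items():
        print(f"{kind}: {', '.join(names)}")
```

Grouping this way makes it easy to see, for example, that both `tfjob` and `pytorchjob` are candidates when a task calls for a deep learning template.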