Orkhon: ML Inference Framework and Server Runtime

v0.2.0


This release comes with:

  • ONNX interface
  • New asynchronous servicing methods
  • Shareable server runtime
  • Nuclei asynchronous runtime
  • Input fact inference for frozen models
  • Improved throughput:
    • ~4.8361 GiB/s prediction throughput
    • 3,000 concurrent requests take ~4 ms on average
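
Orkhon's ONNX support is built on the tract inference engine, so the "inferring input facts" feature corresponds to resolving a frozen model's input types and shapes before optimization. The following is a minimal sketch of that workflow using tract directly, not Orkhon's own API; the model path and input shape are placeholder assumptions:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // Load a frozen ONNX model; "model.onnx" is a placeholder path.
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        // If the exported graph lacks concrete input shapes, declare the
        // input fact explicitly; otherwise the engine infers it from the
        // frozen graph. The 1x3x224x224 shape here is an assumption.
        .with_input_fact(
            0,
            InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)),
        )?
        // Optimize the graph and make it runnable for prediction.
        .into_optimized()?
        .into_runnable()?;

    // Run a prediction on a dummy zero-filled input tensor.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();
    let result = model.run(tvec!(input.into()))?;
    println!("{:?}", result[0]);
    Ok(())
}
```

In a server setting, the runnable model would be wrapped in the shareable runtime and served through the asynchronous servicing methods listed above.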