Batch Inference¶
This section describes the API and protocols related to batch inference using ODAHU.
The ODAHU Batch Inference feature allows users to get inferences from an ML model for large datasets that are delivered asynchronously, not via the HTTP API, but through other mechanisms.
Currently, Batch Inference supports the following ways to deliver data for forecasting:
- Object storage
- GCS
- S3
- Azureblob
In the future, we are considering adding the ability to process data directly from a Kafka topic and other asynchronous data sources.
Please also take a look at the example.
API Reference¶
InferenceService¶
An InferenceService describes the following required entities:
- A predictor Docker image that contains the predictor code
- The location of the model files in object storage (a directory or a .zip / .tar.gz archive)
- The command and arguments that describe how to execute the image
When a user trains a model, they should build an image with code that follows the Predictor code protocol and register this image, together with the appropriate model files, as an InferenceService entity in the ODAHU Platform.
The user describes how inference should be triggered using the different options in [].spec.triggers.
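For illustration, here is a minimal sketch of registering an InferenceService through the ODAHU REST API from Python. The host name, the endpoint path (/api/v1/batch/service), and the exact field names inside spec are assumptions made for this example; consult the API reference of your deployment for the authoritative schema.

```python
import requests

ODAHU_API = "https://odahu.example.com"  # assumed API host
TOKEN = "<api-token>"                    # assumed bearer token

# Hypothetical InferenceService payload; the [].spec.* field names below
# are illustrative only and may differ in your ODAHU version.
service = {
    "id": "reviews-classifier-batch",
    "spec": {
        "image": "registry.example.com/reviews-predictor:1.0.0",  # predictor image
        "command": ["python"],
        "args": ["/app/predict.py"],
        "modelSource": {  # model files on object storage
            "connection": "models-bucket",
            "path": "reviews-classifier/model.tar.gz",
        },
        "triggers": {"webhook": {"enabled": True}},
    },
}

resp = requests.post(
    f"{ODAHU_API}/api/v1/batch/service",  # assumed endpoint path
    json=service,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
print(resp.json())
```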
InferenceJob¶
An InferenceJob describes a forecast process that was triggered by one of the triggers in an InferenceService.
If [].spec.triggers.webhook is enabled, then it is possible to run an InferenceJob by making a POST request as described below. The webhook trigger is enabled by default. Note that it is currently the only way to trigger jobs.
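As a hedged sketch, triggering a job over the webhook from Python could look like the following. The endpoint path (/api/v1/batch/job), the host name, and the payload field names are assumptions for this example; the request body described in the API reference of your deployment is authoritative.

```python
import requests

ODAHU_API = "https://odahu.example.com"  # assumed API host
TOKEN = "<api-token>"                    # assumed bearer token

# Hypothetical InferenceJob payload: it references the InferenceService
# and points at the data to forecast; field names are illustrative only.
job = {
    "id": "reviews-classifier-batch-2024-01-01",
    "spec": {
        "inferenceServiceId": "reviews-classifier-batch",
        "dataSource": {"connection": "data-bucket", "path": "input/2024-01-01/"},
        "outputDestination": {"connection": "data-bucket", "path": "output/2024-01-01/"},
    },
}

resp = requests.post(
    f"{ODAHU_API}/api/v1/batch/job",  # assumed endpoint path
    json=job,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
print(resp.json())
```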
Predictor code protocol¶
The ODAHU Platform launches the Docker image provided by the user as [].spec.image (InferenceService) and guarantees the following conventions about the input and model locations inside the container, as well as the format of the input and output data.
Env variables¶
| Env variable | Description |
|---|---|
| $ODAHU_MODEL | Path in the local filesystem that contains all model files from [].spec.modelSource |
| $ODAHU_MODEL_INPUT | Path in the local filesystem that contains all input files from [].spec.dataSource |
| $ODAHU_MODEL_OUTPUT | Path in the local filesystem that will be uploaded to [].spec.outputDestination |
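A minimal sketch of how predictor code might resolve these paths at runtime (the printed diagnostics are just for illustration):

```python
import os
from pathlib import Path

# Paths injected by the ODAHU Platform (see the table above).
model_dir = Path(os.environ["ODAHU_MODEL"])          # model files from [].spec.modelSource
input_dir = Path(os.environ["ODAHU_MODEL_INPUT"])    # input files from [].spec.dataSource
output_dir = Path(os.environ["ODAHU_MODEL_OUTPUT"])  # uploaded to [].spec.outputDestination

print("model files:", sorted(p.name for p in model_dir.iterdir()))
print("input files:", sorted(p.name for p in input_dir.glob("*.json")))
```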
Input and output formats¶
Predictor code must expect input as a set of JSON files with the .json extension, located in the folder referenced by the $ODAHU_MODEL_INPUT environment variable. These JSON files have the structure of Kubeflow inference request objects.
Predictor code must save results as a set of JSON files with the .json extension in the folder referenced by the $ODAHU_MODEL_OUTPUT environment variable. These JSON files must have the structure of Kubeflow inference response objects.
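A minimal predictor sketch that follows these conventions is shown below. The scoring logic, tensor names, and model name are placeholders (assumptions for this example); a real predictor would load the model from $ODAHU_MODEL and compute actual predictions, while keeping the request and response layout of Predict Protocol - Version 2.

```python
import json
import os
from pathlib import Path

input_dir = Path(os.environ["ODAHU_MODEL_INPUT"])
output_dir = Path(os.environ["ODAHU_MODEL_OUTPUT"])
output_dir.mkdir(parents=True, exist_ok=True)

for request_file in sorted(input_dir.glob("*.json")):
    request = json.loads(request_file.read_text())

    # Each input tensor follows the V2 request layout:
    # {"name": ..., "shape": [...], "datatype": ..., "data": [...]}
    outputs = []
    for tensor in request["inputs"]:
        scores = [float(len(str(v))) for v in tensor["data"]]  # placeholder "model"
        outputs.append({
            "name": tensor["name"],
            "shape": [len(scores)],
            "datatype": "FP32",
            "data": scores,
        })

    # V2 response object, written under the same file name so results
    # can be matched back to the corresponding request.
    response = {
        "id": request.get("id", request_file.stem),
        "model_name": "example-model",  # illustrative model name
        "outputs": outputs,
    }
    (output_dir / request_file.name).write_text(json.dumps(response))
```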
Implementation details¶
This section provides a deeper understanding of the underlying mechanisms.
An InferenceJob is implemented as a TektonCD TaskRun with 9 steps:
- Configure rclone using the ODAHU connections described in the BatchInferenceService
- Sync the input data from object storage to the local filesystem using rclone
- Sync the model from object storage to the local filesystem using rclone
- Validate the input against Predict Protocol - Version 2
- Log the model input to feedback storage
- Run the user container with $ODAHU_MODEL, $ODAHU_MODEL_INPUT and $ODAHU_MODEL_OUTPUT set
- Validate the output against Predict Protocol - Version 2
- Log the model output to feedback storage
- Upload data from $ODAHU_MODEL_OUTPUT to [].spec.outputDestination.path