
Simple TensorFlow Serving

Introduction

Simple TensorFlow Serving is a generic and easy-to-use serving service for machine learning models.

It is the bridge between TensorFlow models and any programming language, bringing machine learning to Bash, Python, C++, Java, Scala, Go, Ruby, JavaScript, PHP, Erlang, Lua, Rust, Swift, Perl, Lisp, Haskell, Clojure and R.

  • Support distributed TensorFlow models
  • Support the general RESTful/HTTP APIs
  • Support inference with accelerated GPU
  • Support curl and other command-line tools
  • Support clients in any programming language
  • Support generating clients from models without writing code
  • Support inference with raw image files for image models
  • Support statistical metrics for verbose requests
  • Support serving multiple models at the same time
  • Support bringing model versions online and offline dynamically
  • Support loading new custom op for TensorFlow models
  • Support secure authentication with configurable basic auth
  • Support multiple models of TensorFlow/MXNet/PyTorch/Caffe2/CNTK/ONNX/H2o

Installation

Install the server with pip.

pip install simple_tensorflow_serving

Or install with bazel.

bazel build simple_tensorflow_serving:server

Or install from source code.

python ./setup.py install

Or use the Docker image.

docker run -d -p 8500:8500 tobegit3hub/simple_tensorflow_serving
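
If your models live on the host machine, you can mount them into the container and point the server at the mounted path. This is a minimal sketch: the /models mount point and the overridden command are illustrative assumptions, not documented behaviour of the image.

docker run -d -p 8500:8500 -v $(pwd)/models:/models tobegit3hub/simple_tensorflow_serving \
    simple_tensorflow_serving --model_base_path="/models/tensorflow_template_application_model"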

Quick Start

Start the server with the TensorFlow SavedModel.

simple_tensorflow_serving --model_base_path="./models/tensorflow_template_application_model"

Check out the dashboard at http://127.0.0.1:8500 in your web browser.


Generate a Python client and access the model with the test dataset.

simple_tensorflow_serving --model_base_path="./models/tensorflow_template_application_model" --gen_client="python"
python ./client.py

Advanced Usage

Multiple Models

It supports serving multiple models and multiple versions of each model. You can run the server with a model config file like this.

{
  "model_config_list": [
    {
      "name": "tensorflow_template_application_model",
      "base_path": "./models/tensorflow_template_application_model/",
      "platform": "tensorflow"
    }, {
      "name": "deep_image_model",
      "base_path": "./models/deep_image_model/",
      "platform": "tensorflow"
    }, {
       "name": "mxnet_mlp_model",
       "base_path": "./models/mxnet_mlp/mx_mlp",
       "platform": "mxnet"
    }
  ]
}

Then start the server with the model config file.

simple_tensorflow_serving --model_config_file="./examples/model_config_file.json"

Adding or removing model versions is detected automatically and the latest files are reloaded into memory. You can easily choose a specific model and version for inference.

import requests

endpoint = "http://127.0.0.1:8500"
input_data = {
  "model_name": "default",
  "model_version": 1,
  "data": {
      "keys": [[11.0], [2.0]],
      "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1],
                   [1, 1, 1, 1, 1, 1, 1, 1, 1]]
  }
}
result = requests.post(endpoint, json=input_data)
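
For reference, the server looks for numbered version sub-directories under the model base path, following the standard SavedModel export layout. The tree below is an illustrative example, not output from the tool.

models/tensorflow_template_application_model/
├── 1/
│   ├── saved_model.pb
│   └── variables/
└── 2/
    ├── saved_model.pb
    └── variables/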

Generated Client

You can generate clients in different languages (Bash, Python, Golang, JavaScript, etc.) for your model without writing any code.

simple_tensorflow_serving --model_base_path="./models/tensorflow_template_application_model/" --gen_client bash
simple_tensorflow_serving --model_base_path="./models/tensorflow_template_application_model/" --gen_client python

The generated code looks like the examples below and can be tested immediately.

#!/bin/bash

curl -H "Content-Type: application/json" -X POST -d '{"data": {"keys": [[1.0], [1.0]], "features": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]} }' http://127.0.0.1:8500
#!/usr/bin/env python

import requests

def main():
  endpoint = "http://127.0.0.1:8500"

  input_data = {"keys": [[1.0], [1.0]], "features": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]}
  result = requests.post(endpoint, json=input_data)
  print(result.text)

if __name__ == "__main__":
  main()

Image Model

For image models, you can send the raw image file instead of constructing array data.

Now start serving an image model such as deep_image_model.

simple_tensorflow_serving --model_base_path="./models/deep_image_model/"

Then send a request with a raw image file that matches the input shape of your model.

curl -X POST -F 'image=@./images/mew.jpg' -F "model_version=1" 127.0.0.1:8500
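
The same request can be made from Python with the requests library; this is a minimal sketch mirroring the curl example above (the image and model_version field names come from that example).

import requests

endpoint = "http://127.0.0.1:8500"
# Post the raw image as a multipart form field, together with the model version.
with open("./images/mew.jpg", "rb") as image_file:
    result = requests.post(endpoint,
                           files={"image": image_file},
                           data={"model_version": "1"})
print(result.text)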

Custom Op

If your models rely on new TensorFlow custom ops, you can run the server while loading the .so files.

simple_tensorflow_serving --model_base_path="./model/" --custom_op_paths="./foo_op/"

Please check out the complete example in ./examples/custom_op/.
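
Conceptually, the server loads every shared library found in the given paths before loading the model, roughly like the sketch below (the directory walk is illustrative, not the project's actual code).

import os
import tensorflow as tf

custom_op_path = "./foo_op/"
# Load each compiled custom op library so the SavedModel's custom ops can be resolved.
for file_name in os.listdir(custom_op_path):
    if file_name.endswith(".so"):
        tf.load_op_library(os.path.join(custom_op_path, file_name))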

Authentication

For enterprise deployments, you can enable basic auth for all the APIs so that anonymous requests are denied.

Now start the server with the configured username and password.

./server.py --model_base_path="./models/tensorflow_template_application_model/" --enable_auth=True --auth_username="admin" --auth_password="admin"

If you are using the web dashboard, just type your credentials. If you are using clients, include the username and password in the request.

curl -u admin:admin -H "Content-Type: application/json" -X POST -d '{"data": {"keys": [[11.0], [2.0]], "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}' http://127.0.0.1:8500

import requests

endpoint = "http://127.0.0.1:8500"
input_data = {
  "data": {
      "keys": [[11.0], [2.0]],
      "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]
  }
}
auth = requests.auth.HTTPBasicAuth("admin", "admin")
result = requests.post(endpoint, json=input_data, auth=auth)

MXNet Model

In addition, it supports loading and serving general MXNet models in the standard checkpoint format. You can load the models with command-line flags or the configuration file as well.

simple_tensorflow_serving --model_base_path="./models/mxnet_mlp/mx_mlp" --model_platform="mxnet"

The clients are similar and you can implement them in your favourite programming language.

import requests

endpoint = "http://127.0.0.1:8500"
input_data = {
  "model_name": "default",
  "model_version": 1,
  "data": {
      "data": [[12.0, 2.0]]
  }
}
result = requests.post(endpoint, json=input_data)
print(result.text)

ONNX Model

Now it supports loading and serving the general ONNX models.

simple_tensorflow_serving --model_base_path="./models/onnx_mnist_model/onnx_model.proto" --model_platform="onnx"

The clients are similar and you can implement them in your favourite programming language.

import requests

endpoint = "http://127.0.0.1:8500"
input_data = {
  "model_name": "default",
  "model_version": 1,
  "data": {
      "data": [[[[...]]]]
  }
}
result = requests.post(endpoint, json=input_data)
print(result.text)

H2o Model

Now it supports loading and serving the general H2o models.

# Start H2o server with "java -jar h2o.jar"

simple_tensorflow_serving --model_base_path="./models/h2o_prostate_model/GLM_model_python_1525255083960_17" --model_platform="h2o"

The clients are similar and you can implement them in your favourite programming language.

import requests

endpoint = "http://127.0.0.1:8500"
input_data = {
  "model_name": "default",
  "model_version": 1,
  "data": {
      "data": [[[[...]]]]
  }
}
result = requests.post(endpoint, json=input_data)
print(result.text)

Supported Clients

Here is the example client in Bash.

curl -H "Content-Type: application/json" -X POST -d '{"data": {"keys": [[1.0], [2.0]], "features": [[10, 10, 10, 8, 6, 1, 8, 9, 1], [6, 2, 1, 1, 1, 1, 7, 1, 1]]}}' http://127.0.0.1:8500

Here is the example client in Python.

import requests

endpoint = "http://127.0.0.1:8500"
payload = {"data": {"keys": [[11.0], [2.0]], "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}

result = requests.post(endpoint, json=payload)

Clients in C++, Java and Scala work the same way: POST the JSON payload to the endpoint.

Here is the example client in Go.

// Requires "bytes", "encoding/json" and "net/http" from the standard library.
endpoint := "http://127.0.0.1:8500"
dataByte := []byte(`{"data": {"keys": [[11.0], [2.0]], "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}`)
var dataInterface map[string]interface{}
json.Unmarshal(dataByte, &dataInterface)
dataJson, _ := json.Marshal(dataInterface)

resp, err := http.Post(endpoint, "application/json", bytes.NewBuffer(dataJson))

Here is the example client in Ruby.

require "net/http"
require "uri"
require "json"

endpoint = "http://127.0.0.1:8500"
uri = URI.parse(endpoint)
header = {"Content-Type" => "application/json"}
input_data = {"data" => {"keys"=> [[11.0], [2.0]], "features"=> [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Post.new(uri.request_uri, header)
request.body = input_data.to_json

response = http.request(request)

Here is the example client in JavaScript.

// Requires the "request" npm package.
var request = require("request");

var options = {
    uri: "http://127.0.0.1:8500",
    method: "POST",
    json: {"data": {"keys": [[11.0], [2.0]], "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}
};

request(options, function (error, response, body) {});

Here is the example client in PHP.

$endpoint = "127.0.0.1:8500";
$inputData = array(
    "keys" => [[11.0], [2.0]],
    "features" => [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]],
);
$jsonData = array(
    "data" => $inputData,
);
$ch = curl_init($endpoint);
curl_setopt_array($ch, array(
    CURLOPT_POST => TRUE,
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_HTTPHEADER => array(
        "Content-Type: application/json"
    ),
    CURLOPT_POSTFIELDS => json_encode($jsonData)
));

$response = curl_exec($ch);

Here is the example client in Erlang.

ssl:start(),
application:start(inets),
httpc:request(post,
  {"http://127.0.0.1:8500", [],
  "application/json",
  "{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}"
  }, [], []).

Here is the example client in Lua.

-- Requires LuaSocket, ltn12 and a JSON library that provides json:encode.
local http = require("socket.http")
local ltn12 = require("ltn12")

local endpoint = "http://127.0.0.1:8500"
keys_array = {}
keys_array[1] = {1.0}
keys_array[2] = {2.0}
features_array = {}
features_array[1] = {1, 1, 1, 1, 1, 1, 1, 1, 1}
features_array[2] = {1, 1, 1, 1, 1, 1, 1, 1, 1}
local input_data = {
    ["keys"] = keys_array,
    ["features"] = features_array,
}
local json_data = {
    ["data"] = input_data
}
request_body = json:encode (json_data)
local response_body = {}

local res, code, response_headers = http.request{
    url = endpoint,
    method = "POST", 
    headers = 
      {
          ["Content-Type"] = "application/json";
          ["Content-Length"] = #request_body;
      },
      source = ltn12.source.string(request_body),
      sink = ltn12.sink.table(response_body),
}

Clients in Rust and Swift follow the same pattern.

Here is the example client in Perl.

use LWP::UserAgent;
use HTTP::Request;

my $endpoint = "http://127.0.0.1:8500";
my $json = '{"data": {"keys": [[11.0], [2.0]], "features": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}';
my $req = HTTP::Request->new( 'POST', $endpoint );
$req->header( 'Content-Type' => 'application/json' );
$req->content( $json );
$ua = LWP::UserAgent->new;

$response = $ua->request($req);

Clients in Lisp, Haskell and Clojure follow the same pattern as well.

Here is the example client in R.

library(httr)

endpoint <- "http://127.0.0.1:8500"
json_data <- list(
  data = list(
    keys = list(list(1.0), list(2.0)), features = list(list(1, 1, 1, 1, 1, 1, 1, 1, 1), list(1, 1, 1, 1, 1, 1, 1, 1, 1))
  )
)

r <- POST(endpoint, body = json_data, encode = "json")
stop_for_status(r)
content(r, "parsed", "text/html")

You can also send the same JSON request with Postman.

How It Works

  1. simple_tensorflow_serving starts the HTTP server with a Flask application.
  2. Load the TensorFlow models with the tf.saved_model.loader Python API.
  3. Construct the feed_dict data from the JSON body of the request.
    // Method: POST, Content-Type: application/json
    {
      "model_version": 1, // Optional
      "data": {
        "keys": [[1.0], [2.0]],
        "features": [[10, 10, 10, 8, 6, 1, 8, 9, 1], [6, 2, 1, 1, 1, 1, 7, 1, 1]]
      }
    }
    
  4. Use the TensorFlow Python API to sess.run() with the feed_dict data (see the sketch after this list).
  5. To support multiple model versions, it starts an independent thread to load each version.
  6. For generated clients, it reads the user's model signature and renders code with Jinja templates.
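
As a rough sketch of steps 2 to 4 with the TensorFlow 1.x Python API (not the project's actual code, and assuming the default serving_default signature key):

import tensorflow as tf

model_path = "./models/tensorflow_template_application_model/1"
request_json = {"data": {"keys": [[1.0], [2.0]],
                         "features": [[10, 10, 10, 8, 6, 1, 8, 9, 1],
                                      [6, 2, 1, 1, 1, 1, 7, 1, 1]]}}

sess = tf.Session()
# Step 2: load the SavedModel and read its serving signature.
meta_graph = tf.saved_model.loader.load(
    sess, [tf.saved_model.tag_constants.SERVING], model_path)
signature = meta_graph.signature_def["serving_default"]

# Step 3: build feed_dict from the JSON body, mapping input names to tensor names.
feed_dict = {signature.inputs[name].name: value
             for name, value in request_json["data"].items()}

# Step 4: run the session and fetch every output tensor in the signature.
fetches = {name: signature.outputs[name].name for name in signature.outputs}
print(sess.run(fetches, feed_dict=feed_dict))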

Contribution

Check out the C++ implementation of TensorFlow Serving in tensorflow/serving.

Feel free to open an issue or send a pull request for this project. Contributions of more clients in your favourite languages for accessing TensorFlow models are warmly welcomed.
