Update tf-operator v1beta1 documentation and examples (#870)
* Update tf-operator v1beta1 documentation and examples

* Fix typo
richardsliu authored and k8s-ci-robot committed Nov 19, 2018
1 parent 371034c commit 89e6f66
Showing 12 changed files with 406 additions and 112 deletions.
14 changes: 8 additions & 6 deletions developer_guide.md
@@ -1,6 +1,6 @@
# Developer Guide

-There are two versions of operator: one for v1alpha1 and one for v1alpha2.
+There are two versions of the TF operator: one for v1alpha2 (to be deprecated) and one for v1beta1.

## Building the operator

@@ -24,7 +24,7 @@ dep ensure
Build it

```sh
-go install github.com/kubeflow/tf-operator/cmd/tf-operator
+go install github.com/kubeflow/tf-operator/cmd/tf-operator.v1beta1
```

If you want to build the operator for v1alpha2, please use the command here:
@@ -89,8 +89,8 @@ export KUBEFLOW_NAMESPACE=$(your_namespace)
After the cluster is up, the TFJob CRD should be created on the cluster.

```bash
-# If you are using v1alpha1
-kubectl create -f ./examples/crd/crd.yml
+# If you are using v1beta1
+kubectl create -f ./examples/crd/crd-v1beta1.yaml
```

Or
@@ -111,8 +111,10 @@ tf-operator
To verify local operator is working, create an example job and you should see jobs created by it.

```sh
-# If you are using v1alpha1
-kubectl create -f ./examples/tf_job.yaml
+# If you are using v1beta1
+cd ./examples/v1beta1/dist-mnist
+docker build -f Dockerfile -t kubeflow/tf-dist-mnist-test:1.0 .
+kubectl create -f ./tf-job-mnist.yaml
```

Or
37 changes: 37 additions & 0 deletions examples/crd/crd-v1beta1.yaml
@@ -0,0 +1,37 @@
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tfjobs.kubeflow.org
spec:
  group: kubeflow.org
  version: v1beta1
  scope: Namespaced
  names:
    kind: TFJob
    singular: tfjob
    plural: tfjobs
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            tfReplicaSpecs:
              properties:
                # The validation works when the configuration contains
                # `Worker`, `PS` or `Chief`. Otherwise it will not be validated.
                Worker:
                  properties:
                    replicas:
                      type: integer
                      minimum: 1
                PS:
                  properties:
                    replicas:
                      type: integer
                      minimum: 1
                Chief:
                  properties:
                    replicas:
                      type: integer
                      minimum: 1
                      maximum: 1
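
The replica bounds declared in this schema can be sketched as a small local pre-flight check. This is a minimal sketch and not part of the commit: the function name `check_replicas` is hypothetical, and it only mirrors the `type`, `minimum`, and `maximum` constraints shown above (`Worker` and `PS` need at least 1 replica, `Chief` must be exactly 1). It assumes that an omitted `replicas` field passes, since the schema lists no required fields.

```shell
#!/bin/sh
# Hypothetical helper (not part of tf-operator): mirrors the replica bounds
# from the CRD's openAPIV3Schema so a value can be sanity-checked locally.
# Usage: check_replicas <ReplicaType> <replicas>
# Returns 0 if the value would satisfy the schema, 1 otherwise.
check_replicas() {
  role=$1
  count=$2

  case $role in
    Worker|PS) min=1; max='' ;;   # minimum: 1, no maximum
    Chief)     min=1; max=1 ;;    # minimum: 1, maximum: 1
    *)         return 0 ;;        # other replica types are not validated
  esac

  # Assumption: an omitted replicas field is not constrained by the schema.
  [ -n "$count" ] || return 0

  # Negative or non-numeric values fail `type: integer` or `minimum: 1`.
  case $count in
    *[!0-9]*) return 1 ;;
  esac

  [ "$count" -ge "$min" ] || return 1
  if [ -n "$max" ] && [ "$count" -gt "$max" ]; then
    return 1
  fi
  return 0
}
```

For example, `check_replicas Chief 2` returns a non-zero status, while `check_replicas Worker 2` succeeds. The API server remains the authority on validation; the sketch only restates the three rules in the schema.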
11 changes: 0 additions & 11 deletions examples/crd/crd.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion examples/distribution_strategy/distributed_tfjob.yaml
@@ -1,4 +1,4 @@
-apiVersion: "kubeflow.org/v1alpha2"
+apiVersion: "kubeflow.org/v1beta1"
 kind: "TFJob"
 metadata:
   name: "distributed-training"
30 changes: 0 additions & 30 deletions examples/tf_job.yaml

This file was deleted.

34 changes: 0 additions & 34 deletions examples/tf_job_clean_policy.yaml

This file was deleted.

14 changes: 0 additions & 14 deletions examples/tf_job_defaults.yaml

This file was deleted.

16 changes: 0 additions & 16 deletions examples/tf_job_gpu.yaml

This file was deleted.

18 changes: 18 additions & 0 deletions examples/v1beta1/dist-mnist/Dockerfile
@@ -0,0 +1,18 @@
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM tensorflow/tensorflow:1.5.0

ADD . /var/tf_dist_mnist
ENTRYPOINT ["python", "/var/tf_dist_mnist/dist_mnist.py"]
17 changes: 17 additions & 0 deletions examples/v1beta1/dist-mnist/README.md
@@ -0,0 +1,17 @@
### Distributed mnist model for e2e test

This folder contains the Dockerfile and the distributed mnist model for the e2e test.

**Build Image**

The default image name and tag is `kubeflow/tf-dist-mnist-test:1.0`.

```shell
docker build -f Dockerfile -t kubeflow/tf-dist-mnist-test:1.0 ./
```

**Create TFJob YAML**

```
kubectl create -f ./tf_job_mnist.yaml
```
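
The two steps above can be combined into one function, sketched below under stated assumptions. The function name `build_and_submit` and the `DOCKER`, `KUBECTL`, and `IMAGE` override variables are hypothetical and not part of the commit. The manifest path is passed as an argument because this commit refers to the file both as `tf_job_mnist.yaml` (here) and as `tf-job-mnist.yaml` (in `developer_guide.md`), and the manifest itself is not shown.

```shell
#!/bin/sh
# Sketch only: wraps the build and create steps from this README.
# IMAGE, DOCKER and KUBECTL are hypothetical overrides added for illustration.
IMAGE=${IMAGE:-kubeflow/tf-dist-mnist-test:1.0}
DOCKER=${DOCKER:-docker}
KUBECTL=${KUBECTL:-kubectl}

# Usage: build_and_submit <path-to-tfjob-manifest>
# Run from the directory that contains the Dockerfile.
build_and_submit() {
  manifest=$1

  # Fail early if the manifest is missing, before spending time on the build.
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi

  # Same build command as in the README, with the image name parameterized.
  "$DOCKER" build -f Dockerfile -t "$IMAGE" ./ || return 1

  # Submit the TFJob to the cluster in the current kubectl context.
  "$KUBECTL" create -f "$manifest"
}
```

Calling `build_and_submit ./tf_job_mnist.yaml` from `examples/v1beta1/dist-mnist` would then run both steps, stopping if the build fails.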