Update default address from 0.0.0.0 to 127.0.0.1 in documentation and examples (#2624)

* Initial pass to update default address from 0.0.0.0 to 127.0.0.1

* update docker config to default bridge address

* revert config updates and retain doc updates

---------

Co-authored-by: Naman Nandan <[email protected]>
namannandan and Naman Nandan authored Oct 2, 2023
1 parent 5f36b20 commit 8dfa6c8
Showing 19 changed files with 183 additions and 185 deletions.
2 changes: 1 addition & 1 deletion benchmarks/config_kf.properties
@@ -2,4 +2,4 @@ inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
service_envelope=kserve
number_of_netty_threads=32
-job_queue_size=1000
+job_queue_size=1000
22 changes: 11 additions & 11 deletions docs/batch_inference_with_ts.md
@@ -5,7 +5,7 @@
* [Introduction](#introduction)
* [Prerequisites](#prerequisites)
* [Batch Inference with TorchServe's default handlers](#batch-inference-with-torchserves-default-handlers)
-* [Batch Inference with TorchServe using ResNet-152 model](#batch-inference-with-torchserve-using-resnet-152-model)
+* [Batch Inference with TorchServe using ResNet-152 model](#batch-inference-with-torchserve-using-resnet-152-model)
* [Demo to configure TorchServe ResNet-152 model with batch-supported model](#demo-to-configure-torchserve-resnet-152-model-with-batch-supported-model)
* [Demo to configure TorchServe ResNet-152 model with batch-supported model using Docker](#demo-to-configure-torchserve-resnet-152-model-with-batch-supported-model-using-docker)

@@ -16,7 +16,7 @@ TorchServe was designed to natively support batching of incoming inference requests
because most ML/DL frameworks are optimized for batch requests.
This optimal use of host resources in turn reduces the operational expense of hosting an inference service using TorchServe.

-In this document we show an example of how to use batch inference in Torchserve when serving models locally or using docker containers.
+In this document we show an example of how to use batch inference in Torchserve when serving models locally or using docker containers.

## Prerequisites

@@ -54,7 +54,7 @@ requests before this timer times out, it sends whatever requests were received
Let's look at an example using this configuration through management API:

```bash
-# The following command will register a model "resnet-152.mar" and configure TorchServe to use a batch_size of 8 and a max batch delay of 50 milliseconds.
+# The following command will register a model "resnet-152.mar" and configure TorchServe to use a batch_size of 8 and a max batch delay of 50 milliseconds.
curl -X POST "localhost:8081/models?url=resnet-152.mar&batch_size=8&max_batch_delay=50"
```
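Once registered, the batch settings that actually took effect can be checked with the management API's describe-model call; a quick sketch, assuming the model was registered under the name `resnet-152`:

```bash
# Describe the registered model; the response JSON includes batchSize and maxBatchDelay
curl "localhost:8081/models/resnet-152"
```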
Here is an example of using this configuration through the config.properties:
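A minimal sketch of such an entry (the model name and values below are illustrative):

```text
models={\
  "resnet-152": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "resnet-152.mar",\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 8,\
        "maxBatchDelay": 50,\
        "responseTimeout": 120\
    }\
  }\
}
```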
@@ -97,8 +97,8 @@ First things first, follow the main [Readme](../README.md) and install all the r
```text
$ cat config.properties
...
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
...
$ torchserve --start --model-store model_store
```
@@ -193,13 +193,13 @@ models={\
}\
}
```
-* Then will start Torchserve by passing the config.properties using `--ts-config` flag
+* Then will start Torchserve by passing the config.properties using `--ts-config` flag

```bash
torchserve --start --model-store model_store --ts-config config.properties
```
* Verify that TorchServe is up and running

```text
$ curl localhost:8080/ping
{
@@ -265,9 +265,9 @@ Here, we show how to register a model with batch inference support when serving
* Set the `batch_size` and `max_batch_delay` in the config.properties as referenced in the [dockerd-entrypoint.sh](../docker/dockerd-entrypoint.sh)

```text
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
@@ -291,7 +291,7 @@ models={\
./build_image.sh -g -cv cu102
```

-* Start serving the model with the container and pass the config.properties to the container
+* Start serving the model with the container and pass the config.properties to the container

```bash
docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 --name mar -v /home/ubuntu/serve/model_store:/home/model-server/model-store -v <path/to/config.properties>:/home/model-server/config.properties pytorch/torchserve:latest-gpu
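# Illustrative usage: once the container is up and the model is loaded, a prediction
# can be sent the same way as in the local demo (assumes a local kitten.jpg and a
# model served under the name resnet-152)
curl http://localhost:8080/predictions/resnet-152 -T kitten.jpg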
9 changes: 4 additions & 5 deletions docs/configuration.md
@@ -81,16 +81,15 @@ See [Enable SSL](#enable-ssl) to configure HTTPS.
* `inference_address`: Inference API binding address. Default: `http://127.0.0.1:8080`
* `management_address`: Management API binding address. Default: `http://127.0.0.1:8081`
* `metrics_address`: Metrics API binding address. Default: `http://127.0.0.1:8082`
-* To run predictions on models on a public IP address, specify the IP address as `0.0.0.0`.
-To run predictions on models on a specific IP address, specify the IP address and port.
+* To run predictions on models on a specific IP address, specify the IP address and port.

```properties
-# bind inference API to all network interfaces with SSL enabled
-inference_address=https://0.0.0.0:8443
+# bind inference API to localhost with SSL enabled
+inference_address=https://127.0.0.1:8443
```

```properties
-# bind inference API to private network interfaces
+# bind inference API to private network interfaces with SSL enabled
inference_address=https://172.16.1.10:8080
```
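Since these bindings use `https`, a key and certificate must also be configured; a minimal sketch (the file paths are illustrative, matching the CloudFormation example further below):

```properties
# SSL key and certificate used by the HTTPS endpoints above
private_key_file=/etc/torchserve/server.key
certificate_file=/etc/torchserve/server.pem
```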

6 changes: 3 additions & 3 deletions examples/asr_rnnt_emformer/config.properties
@@ -1,6 +1,6 @@
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
10 changes: 5 additions & 5 deletions examples/cloudformation/ec2-asg.yaml
@@ -29,7 +29,7 @@ Parameters:
Type: String
MinLength: '9'
MaxLength: '18'
-Default: '0.0.0.0/0'
+Default: '127.0.0.1/0'
AllowedPattern: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})
ConstraintDescription: Must be a valid IP CIDR range of the form x.x.x.x/x.
ModelPath:
@@ -41,7 +41,7 @@ Parameters:
Type: String
MinLength: '9'
MaxLength: '18'
-Default: '0.0.0.0/0'
+Default: '127.0.0.1/0'
AllowedPattern: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})
ConstraintDescription: Must be a valid IP CIDR range of the form x.x.x.x/x.
Mappings:
@@ -469,9 +469,9 @@ Resources:
files:
/etc/torchserve/config.properties:
content: !Sub |
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
load_models=ALL
model_store=/mnt/efs/model_store
mode: '000400'
6 changes: 3 additions & 3 deletions examples/cloudformation/ec2.yaml
@@ -267,9 +267,9 @@ Resources:
files:
/etc/torchserve/config.properties:
content: !Sub |
-inference_address=https://0.0.0.0:8080
-management_address=https://0.0.0.0:8081
-metrics_address=https://0.0.0.0:8082
+inference_address=https://127.0.0.1:8080
+management_address=https://127.0.0.1:8081
+metrics_address=https://127.0.0.1:8082
private_key_file=/etc/torchserve/server.key
certificate_file=/etc/torchserve/server.pem
mode: '000400'
6 changes: 3 additions & 3 deletions examples/diffusers/config.properties
@@ -1,7 +1,7 @@
#Sample config.properties. In production config.properties at /mnt/models/config/config.properties will be used
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
enable_envvars_config=true
install_py_dep_per_model=true
load_models=all
@@ -1,6 +1,6 @@
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
enable_envvars_config=true
install_py_dep_per_model=true
number_of_gpu=1
@@ -1,6 +1,5 @@
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
enable_envvars_config=true
install_py_dep_per_model=true

6 changes: 3 additions & 3 deletions examples/large_models/deepspeed_mii/config.properties
@@ -1,6 +1,6 @@
-inference_address=http://0.0.0.0:8080
-management_address=http://0.0.0.0:8081
-metrics_address=http://0.0.0.0:8082
+inference_address=http://127.0.0.1:8080
+management_address=http://127.0.0.1:8081
+metrics_address=http://127.0.0.1:8082
enable_envvars_config=true
install_py_dep_per_model=true
load_models=all