Custom Container on Vertex AI Returns "405 Method Not Allowed" for predict Endpoint

I'm encountering an issue when deploying my custom container on Vertex AI. Locally my Flask server (running via Gunicorn) works perfectly—both the /predict and /health endpoints respond as expected. However, when Vertex AI calls the prediction API, I always receive a 405 Method Not Allowed error.

My Setup

  • Container: I use a custom Docker container that exposes port 8080.
  • Model Upload: I upload my model to Vertex AI with the following flags:
    • --container-predict-route=/predict
    • --container-health-route=/health
  • Prediction Call: I call the prediction API using the Google Cloud AI Platform client library.

Observations

  • Vertex AI PredictionService sends requests to a URL like:
    /v1/endpoints/<ENDPOINT_ID>/deployedModels/<DEPLOYED_MODEL_ID>:predict
    but my server returns 405.
  • If I perform a GET request to the endpoint (for example, from a terminal), I receive a valid response.
    However, when calling /predict (or even /rawPredict, as described in the Vertex AI rawPredict docs), I still get a 405.
  • The server is running (I think), since I receive a log entry every 10 seconds like:
    "GET /v1/endpoints/<ENDPOINT>/deployedModels/<MODEL> HTTP/1.1" 200 OK
  • I've added multiple route definitions (including catch-all routes) to handle URLs such as /v1/endpoints/<endpoint_id>/deployedModels/<model_id>:predict, but the error persists.

Below is my code:


Dockerfile:

FROM nvidia/cuda:12.2.0-runtime-ubuntu20.04
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    curl \
    python3-dev \
    python3-pip \
    python3-setuptools && \
    rm -rf /var/lib/apt/lists/*
RUN ln -sf /usr/bin/python3 /usr/bin/python
WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir "torch>=1.12.0" "torchvision>=0.13.0" && \
    if [ -f requirements.txt ]; then pip install --no-cache-dir -r requirements.txt; fi
COPY . .
EXPOSE 8080
CMD ["gunicorn", "-w", "1", "-b", "0.0.0.0:8080", "main:app"]

Function to call Vertex AI API (call_vertex_ai):

def call_vertex_ai(gcs_uri: str, additional_args: dict):
    client_options = {"api_endpoint": f"{REGION}-aiplatform.googleapis"}
    client = aiplatform.gapic.PredictionServiceClient(
        client_options=client_options)

    instance = predict.instance.ImageClassificationPredictionInstance(
        content=gcs_uri  # GCS path for image file
    ).to_value()
    instances = [instance]

    parameters = predict.params.ImageClassificationPredictionParams(
        confidence_threshold=additional_args.get("threshold", 0.5),
    ).to_value()

    endpoint = client.endpoint_path(
        project=PROJECT_ID, location=REGION, endpoint=ENDPOINT_ID
    )

    response = client.predict(
        endpoint=endpoint, instances=instances, parameters=parameters)

    return response.predictions
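
As an aside, the ImageClassificationPredictionInstance/Params helpers come from the AutoML prediction schemata; for a custom container, Vertex AI forwards the instances list to the server as-is, so plain dicts via the high-level SDK should work too. A sketch of the equivalent call (same PROJECT_ID / REGION / ENDPOINT_ID):

# Minimal sketch using the high-level SDK instead of the GAPIC client.
# The AutoML schema classes are only needed for AutoML-trained models;
# a custom container receives the instances unchanged.
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)
endpoint = aiplatform.Endpoint(ENDPOINT_ID)

response = endpoint.predict(
    instances=[{"content": "gs://my-bucket/image.jpg"}],  # placeholder URI
    parameters={"confidence_threshold": 0.5},
)
print(response.predictions)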

main.py (Vertex AI prediction server):

# ... some imports ...
app = Flask(__name__)

def load_model():
    ...

load_model()


def handle_predict():
    # ... code ...
    detections = [{
        "bbox": bbox.tolist() if isinstance(bbox, np.ndarray) else bbox,
        "class": class_name,
        "score": float(score),
    } for bbox, class_name, score in zip(draw_boxes, pred_classes, scores)]

    return jsonify({"predictions": detections})

@app.post("/predict")
def predict():
    return handle_predict()

@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "healthy"})


@app.route("/v1/endpoints/<endpoint_id>/deployedModels/<path:deployed_model_path>", methods=["POST"])
def predict_deployed_model(endpoint_id, deployed_model_path):
    if not deployed_model_path.endswith(":predict"):
        return "Not Found", 404
    return handle_predict()

@app.route("/v1/endpoints/<endpoint_id>/deployedModels/<deployed_model_id>:predict", methods=["POST"])
def predict_deployed_model_direct(endpoint_id, deployed_model_id):
    return handle_predict()

@app.route("/v1/endpoints/<endpoint_id>/deployedModels/<deployed_model_id>:rawPredict", methods=["POST"])
def raw_predict_deployed_model(endpoint_id, deployed_model_id):
    return handle_predict()

@app.before_request
def log_request_info():
    logger.info(f"Received request: {request.method} {request.url}")
    logger.info(f"Headers: {dict(request.headers)}")
    logger.info(f"Body: {request.get_data().decode('utf-8')}")

deploy.sh (simplified):

gcloud builds submit \
  --tag "${IMAGE_NAME}:latest" \
  --gcs-source-staging-dir="gs://$BUCKET_NAME/source" \
  --gcs-log-dir="gs://$BUCKET_NAME/logs"

LATEST_IMAGE="${IMAGE_NAME}:latest"

gcloud ai models upload \
  --region="${REGION}" \
  --display-name="weldpredict-model" \
  --container-image-uri="${LATEST_IMAGE}" \
  --container-ports=8080 \
  --container-predict-route=/predict \
  --container-health-route=/health

ENDPOINT_ID=$(gcloud ai endpoints list --region="${REGION}" --format="value(ENDPOINT_ID)")

DEPLOYED_MODEL_ID=$(gcloud ai endpoints describe "${ENDPOINT_ID}" --region="${REGION}" --format="value(deployedModels.id)")
gcloud ai endpoints undeploy-model "${ENDPOINT_ID}" --deployed-model-id="${DEPLOYED_MODEL_ID}" --region="${REGION}" --quiet

gcloud ai endpoints deploy-model "${ENDPOINT_ID}" \
    --model="${MODEL_ID}" \
    --region="${REGION}" \
    --display-name="weldpredict-deployment" \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --min-replica-count=1 \
    --max-replica-count=1 \
    --traffic-split=0=100
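
To check what the container itself returns (including the status code), I can also go through rawPredict, which forwards the request body to the container unchanged. A sketch using the GAPIC client (same placeholders as above):

# Call the deployed model through rawPredict, which passes the body
# through verbatim and surfaces the container's HTTP response.
import json

from google.api import httpbody_pb2
from google.cloud import aiplatform

client = aiplatform.gapic.PredictionServiceClient(
    client_options={"api_endpoint": f"{REGION}-aiplatform.googleapis.com"}
)
endpoint = client.endpoint_path(
    project=PROJECT_ID, location=REGION, endpoint=ENDPOINT_ID
)

body = httpbody_pb2.HttpBody(
    data=json.dumps(
        {"instances": [{"content": "gs://my-bucket/image.jpg"}]}  # placeholder
    ).encode("utf-8"),
    content_type="application/json",
)
response = client.raw_predict(endpoint=endpoint, http_body=body)
print(response.data)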

Issue Summary:

  • Problem: When calling Vertex AI predictions, I receive a 405 "Method Not Allowed" error.

  • Observation: Locally, my Flask server correctly handles /predict and /health, and GET requests to the endpoint return a 200 OK. However, when I call /predict (or /rawPredict) on Vertex AI, I get a 405.

  • Setup: I deploy my custom container on Vertex AI with the --container-predict-route=/predict flag, yet Vertex AI sends requests (e.g., /v1/endpoints/<ENDPOINT_ID>/deployedModels/<DEPLOYED_MODEL_ID>:predict) that are not matched by my routes.

  • Attempts: I have added multiple route definitions—including catch-all routes—to handle URLs like /v1/endpoints/<endpoint_id>/deployedModels/<model_id>:predict, but I still encounter the 405 error.

  • Additional Info:

    • Every 10 seconds the logs show:

      "GET /v1/endpoints/<ENDPOINT>/deployedModels/<MODEL> HTTP/1.1" 200 OK

    • However, when calling /predict or /rawPredict, the server returns 405.

Request:
I need help understanding why Vertex AI's prediction requests are not being handled as expected by my Flask server and how to properly configure my container or routes to resolve the 405 error.

Any guidance or suggestions would be greatly appreciated.

Thanks in advance!


asked Mar 4 at 1:50 by Giulio Manuzzi

1 Answer

Based on the public documentation for custom containers: if your use case requires libraries that aren't included in the prebuilt containers, or you need custom data transformations as part of the prediction request, you can use a custom container that you build and push to Artifact Registry. Custom containers allow for greater customization, but the container must run an HTTP server; specifically, it must listen and respond to liveness checks, health checks, and prediction requests. In most cases, using a prebuilt container, if possible, is the simpler and recommended option. For an example of using a custom container, see the notebook "PyTorch Image Classification Single GPU using Vertex Training with Custom Container".
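
To make that contract concrete: Vertex AI sends a GET to the configured health route (expecting HTTP 200 when the model is ready) and a POST to the configured predict route with a JSON body in a fixed envelope. A minimal sketch of the two payload shapes, with illustrative values (the bbox/class/score keys mirror the question's handler, not a required schema):

# Illustrative request/response envelope for a custom prediction container.
# Vertex AI POSTs this shape to the predict route:
request_body = {
    "instances": [{"content": "gs://my-bucket/image.jpg"}],  # placeholder
    "parameters": {"confidence_threshold": 0.5},             # optional
}
# ...and expects HTTP 200 with a body of this shape in return:
response_body = {
    "predictions": [
        {"bbox": [10, 20, 110, 220], "class": "weld_defect", "score": 0.93},
    ],
}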
