sagemaker-runtime invoke_endpoint API. The model_name field is injected into every request body for multi-model dispatch.
AWSClient
endpoint_name
Name of the SageMaker endpoint.
model_name
Model identifier injected into the request body for SageMaker multi-model dispatch.
region_name
AWS region. When None, uses the default from the environment or boto_session.
boto_session
Optional pre-configured Session (e.g. with explicit credentials or a custom profile). When None, a default session is created using the standard credential chain.
timeout
Read timeout in seconds.
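The model_name injection described above can be sketched as follows. This is a minimal illustration, not the client's actual code: the helper name inject_model_name and the assumption that the payload is a JSON object are hypothetical.

```python
import json


def inject_model_name(payload: str, model_name: str) -> str:
    """Return a copy of the JSON payload with a model_name field added.

    Hypothetical helper: assumes the serialized payload is a JSON object.
    """
    body = json.loads(payload)
    if not isinstance(body, dict):
        raise ValueError("payload must be a JSON object")
    # The server-side dispatcher is assumed to read this key to pick a model.
    body["model_name"] = model_name
    return json.dumps(body)


request = inject_model_name('{"inputs": [1, 2, 3]}', "my-model")
```

Parsing and re-serializing keeps the caller's payload untouched and guarantees the outgoing body is still valid JSON.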
predict
Serialized JSON request payload.
Response body as a JSON string.
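A request/response round trip through invoke_endpoint might look like the sketch below. The invoke helper and FakeRuntimeClient are hypothetical stand-ins so the example runs without AWS credentials; a real call would pass a boto3 "sagemaker-runtime" client instead, and the content type shown is an assumption.

```python
import io
import json


def invoke(client, endpoint_name: str, payload: str) -> str:
    """Send a serialized JSON payload and return the response body as a string."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumed content type
        Body=payload,
    )
    # The response "Body" is a stream; read it fully and decode to text.
    return response["Body"].read().decode("utf-8")


class FakeRuntimeClient:
    """Hypothetical stand-in that mimics the shape of the boto3 response."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        echoed = json.dumps({"endpoint": EndpointName, "received": json.loads(Body)})
        return {"Body": io.BytesIO(echoed.encode("utf-8"))}


result = invoke(FakeRuntimeClient(), "my-endpoint", '{"inputs": [1]}')
```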
embed
Serialized JSON request payload.
Response body as a JSON string.
metadata
/invocations.
Sends a minimal request with mode set to "metadata" so the server returns model metadata instead of running inference.
Response body as a JSON string.
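As a rough sketch, the minimal metadata request could be built like this. The function name build_metadata_request is hypothetical, and the exact set of fields the server expects is an assumption; only the mode value and the injected model_name come from the descriptions above.

```python
import json


def build_metadata_request(model_name: str) -> str:
    """Build a minimal body asking the server for model metadata."""
    # "mode": "metadata" tells the server to skip inference;
    # model_name is included for multi-model dispatch.
    return json.dumps({"mode": "metadata", "model_name": model_name})


metadata_request = build_metadata_request("my-model")
```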
predict_async
run_in_executor.
boto3 is not natively async, so the synchronous predict call is delegated to a thread pool.
Serialized JSON payload.
Unused. Accepted for interface compatibility with the Client protocol.
Response body as a string.
embed_async
run_in_executor.
Serialized JSON payload.
Unused. Accepted for interface compatibility.
Response body as a string.
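The thread-pool delegation used by the async methods can be sketched as below. The synchronous predict here is a stand-in that echoes its input, and the second parameter name (session) is hypothetical; it only illustrates an argument that is accepted and ignored.

```python
import asyncio
import json


def predict(payload: str) -> str:
    """Stand-in for the blocking predict call (echoes the payload)."""
    return json.dumps({"echo": json.loads(payload)})


async def predict_async(payload: str, session=None) -> str:
    """Run the blocking predict in the default thread pool.

    `session` is a hypothetical, unused parameter kept for interface parity.
    """
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, predict, payload)


async_result = asyncio.run(predict_async('{"x": 1}'))
```

Running the call in an executor keeps the event loop free while the blocking network request is in flight.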

