AWS & SageMaker

Subscribe to a model on AWS Marketplace, deploy it as a SageMaker endpoint, and run inference either with the Bioptimus SDK (recommended) or the raw SageMaker runtime API.

If your Bioptimus contact sent you a private offer, accept it via the link they provided before continuing. After acceptance the model appears under Manage your subscriptions, and the remaining steps are identical.

1. Subscribe and configure

Subscribe on AWS Marketplace

Subscribe to the model. In Manage your subscriptions, open your model subscription and click Configure.

Choose a model access interface

When prompted, select the model access interface: SageMaker AI (Python SDK) or the AWS CLI. Pick whichever fits your workflow.

Select region and inference mode

Select your region and inference mode (e.g. a real-time inference endpoint or batch transform jobs). This guide creates a simple real-time inference endpoint.

Copy the Model Package ARN shown in the right-hand pane of the configuration page, and make sure it is the one for your chosen region.

The region must be identical across every command and in the Model Package ARN. Mixing regions is the most common cause of deployment failures.
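
A quick guard against region mismatches can be sketched as below. It relies only on the standard ARN layout (`arn:partition:service:region:account-id:resource`); the helper names `region_of_model_package_arn` and `check_region` are illustrative and not part of any AWS or Bioptimus library.

```python
def region_of_model_package_arn(arn: str) -> str:
    """Return the region embedded in a SageMaker Model Package ARN."""
    # Standard ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"not a SageMaker ARN: {arn!r}")
    return parts[3]


def check_region(arn: str, region: str) -> None:
    """Raise ValueError if the ARN's region differs from the deployment region."""
    arn_region = region_of_model_package_arn(arn)
    if arn_region != region:
        raise ValueError(
            f"Model Package ARN is for {arn_region!r}, but commands target {region!r}"
        )
```

Calling `check_region("<MODEL_PACKAGE_ARN>", "<REGION>")` before any deployment command surfaces a mismatch early instead of as a failed deployment.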

2. Deploy an endpoint

Deploy with the Python SDK (recommended) or the AWS CLI.

Python SDK

import sagemaker
from sagemaker import ModelPackage

session = sagemaker.Session()
role = "<YOUR_SAGEMAKER_EXECUTION_ROLE_ARN>"

# Wrap the Marketplace subscription; use the ARN for your chosen region.
model = ModelPackage(role=role,
                     model_package_arn="<MODEL_PACKAGE_ARN>",
                     sagemaker_session=session)

# Deploy a real-time endpoint on a single GPU instance.
predictor = model.deploy(initial_instance_count=1,
                         instance_type="ml.g5.xlarge",
                         endpoint_name="bioptimus-prod",
                         inference_ami_version="al2-ami-sagemaker-inference-gpu-2")

AWS CLI

AWS walks you through these steps from the configuration page. The commands below are provided in case you prefer to run them yourself. For full option details, see the aws sagemaker command reference.

Install and configure the AWS CLI with SSO

First, install the AWS CLI.

Find your SSO start URL and SSO region by following the Access Keys link on the AWS landing page for your role, then configure an IAM Identity Center (SSO) profile:

aws configure sso

Answer the prompts:

Session name — anything you like (e.g. your username).
SSO start URL — the URL you obtained above.
SSO region — the region you obtained above.
SSO registration scopes — leave blank; it defaults to sso:account:access.
Authorize access in the browser page that opens.
Select the account to use.
Select the AWSAdministratorAccess role (required for the next steps).
CLI default client Region — leave blank.
CLI default output format — leave blank.
CLI profile name — a name to reuse this profile later.

If subsequent steps fail, activate the profile explicitly:

export AWS_PROFILE=<your-profile>

Create an IAM execution role

Create (or reuse) a SageMaker execution role:

role_name="AmazonSageMaker-ExecutionRole-AWSMarketplace"

aws iam create-role \
  --role-name "${role_name}" \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": { "Service": "sagemaker.amazonaws.com" },
        "Action": "sts:AssumeRole"
      }
    ]
  }'

aws iam attach-role-policy \
  --role-name "${role_name}" \
  --policy-arn "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"

aws iam put-role-policy \
  --role-name "${role_name}" \
  --policy-name "AmazonSageMaker-ExecutionPolicy" \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": ["arn:aws:s3:::sagemaker"]
      },
      {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Resource": ["arn:aws:s3:::sagemaker/*"]
      }
    ]
  }'
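
For readers who prefer Python, the same three IAM calls can be sketched with boto3. This is an illustrative equivalent of the CLI commands above, not an official snippet: `create_execution_role` is a hypothetical helper, `iam` is assumed to be a boto3 IAM client (for example `boto3.client("iam")`), and `inline_policy` is the S3 policy document shown above passed as a dict.

```python
import json

# Trust policy that lets the SageMaker service assume the role.
SAGEMAKER_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_execution_role(iam, role_name: str, inline_policy: dict) -> None:
    """Create the role, attach the managed policy, then add the inline policy.

    `iam` is assumed to be a boto3 IAM client (hypothetical helper, see lead-in).
    """
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(SAGEMAKER_TRUST_POLICY),
    )
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="AmazonSageMaker-ExecutionPolicy",
        PolicyDocument=json.dumps(inline_policy),
    )
```

Passing the client in as an argument keeps the helper free of credentials handling, so the active profile (for example the one selected via AWS_PROFILE) decides which account is used.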

Save the execution role ARN

Store the role ARN in a variable for the next steps:

execution_role_arn="arn:aws:iam::$(aws sts get-caller-identity --query 'Account' --output text):role/${role_name}"

Create the model

Use the Model Package ARN for your region. Network isolation is required for Marketplace model packages. The names below are arbitrary and work for either package (H-Optimus or M-Optimus) — choose your own.

model_name="bioptimus-model"
endpoint_name="bioptimus-endpoint"

aws sagemaker create-model \
  --model-name "${model_name}" \
  --execution-role-arn "${execution_role_arn}" \
  --primary-container ModelPackageName="<MODEL_PACKAGE_ARN>" \
  --enable-network-isolation \
  --region <REGION>
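
The same call can be made from Python with boto3. The sketch below assumes `sm` is a boto3 SageMaker client (for example `boto3.client("sagemaker", region_name="<REGION>")`); `create_marketplace_model` is an illustrative wrapper name, not an existing library function.

```python
def create_marketplace_model(sm, model_name: str,
                             execution_role_arn: str,
                             model_package_arn: str) -> dict:
    """Register a Marketplace model package as a SageMaker model.

    `sm` is assumed to be a boto3 SageMaker client created for the same
    region as the Model Package ARN.
    """
    return sm.create_model(
        ModelName=model_name,
        ExecutionRoleArn=execution_role_arn,
        PrimaryContainer={"ModelPackageName": model_package_arn},
        EnableNetworkIsolation=True,  # required for Marketplace model packages
    )
```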

Create the endpoint

Create the endpoint configuration (instance type and count), then the endpoint itself. The generous timeouts give the large model weights time to download and load before SageMaker’s startup health check; without them, endpoint creation can fail.

Set InferenceAmiVersion to a supported SageMaker inference AMI for your region (e.g. al2-ami-sagemaker-inference-gpu-2), or omit the field to let SageMaker pick the default.

aws sagemaker create-endpoint-config \
  --endpoint-config-name "${endpoint_name}-config" \
  --production-variants '[{
    "VariantName": "AllTraffic",
    "ModelName": "'"${model_name}"'",
    "InitialInstanceCount": 1,
    "InstanceType": "ml.g5.xlarge",
    "InitialVariantWeight": 1.0,
    "InferenceAmiVersion": "al2-ami-sagemaker-inference-gpu-2",
    "ModelDataDownloadTimeoutInSeconds": 1800,
    "ContainerStartupHealthCheckTimeoutInSeconds": 1800
  }]' \
  --region <REGION>

aws sagemaker create-endpoint \
  --endpoint-name "${endpoint_name}" \
  --endpoint-config-name "${endpoint_name}-config" \
  --region <REGION>
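
As a rough Python counterpart, the production variant can be built as a plain dict so that InferenceAmiVersion is only included when you set it. This is a sketch under assumptions: `production_variant` and `create_endpoint` are hypothetical helper names, and `sm` is assumed to be a boto3 SageMaker client for your region.

```python
from typing import Optional


def production_variant(model_name: str,
                       instance_type: str = "ml.g5.xlarge",
                       inference_ami_version: Optional[str] = None,
                       timeout_s: int = 1800) -> dict:
    """Build one production variant with generous download/startup timeouts."""
    variant = {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "InitialInstanceCount": 1,
        "InstanceType": instance_type,
        "InitialVariantWeight": 1.0,
        "ModelDataDownloadTimeoutInSeconds": timeout_s,
        "ContainerStartupHealthCheckTimeoutInSeconds": timeout_s,
    }
    # Leave the key out entirely so SageMaker picks its default AMI.
    if inference_ami_version is not None:
        variant["InferenceAmiVersion"] = inference_ami_version
    return variant


def create_endpoint(sm, endpoint_name: str, variant: dict) -> None:
    """Create the endpoint configuration, then the endpoint that uses it."""
    config_name = f"{endpoint_name}-config"
    sm.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[variant],
    )
    sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
```

For example, `create_endpoint(sm, "bioptimus-endpoint", production_variant("bioptimus-model"))` mirrors the two CLI commands while relying on the default AMI.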

Check the endpoint status

Endpoint creation takes about 5–10 minutes. Check the status from the CLI:

aws sagemaker describe-endpoint \
  --endpoint-name "${endpoint_name}" \
  --region <REGION> \
  --query 'EndpointStatus'

Or open SageMaker AI in the AWS console and go to Deployment & Inference > Endpoints. The endpoint is ready once its status is InService.
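
If you would rather block in a script until the endpoint is ready, a polling loop along these lines may help. It is a minimal sketch: `wait_until_in_service` is an illustrative name, `sm` is assumed to be a boto3 SageMaker client, and the poll interval and timeout defaults are arbitrary choices rather than recommended values.

```python
import time


def wait_until_in_service(sm, endpoint_name: str,
                          poll_s: float = 30.0,
                          timeout_s: float = 3600.0,
                          sleep=time.sleep) -> str:
    """Poll describe_endpoint until the status is InService.

    Raises RuntimeError if the endpoint reports Failed, and TimeoutError
    if it is still not InService after roughly `timeout_s` seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = sm.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"endpoint {endpoint_name!r} failed to deploy")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"endpoint {endpoint_name!r} still {status!r}")
        sleep(poll_s)
```

boto3 also ships a built-in waiter for this, `sm.get_waiter("endpoint_in_service")`; the explicit loop is shown only because it makes the status check visible.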

3. Run inference with the Bioptimus SDK (recommended)

The Bioptimus SDK’s AWS backend routes through the SageMaker /invocations endpoint and adds the model_name dispatch field for you.

H-Optimus:

from bioptimus.models.backbones import Backbone
from bioptimus.models.types import Models

model = Backbone(
    Models.H1,
    backend="aws",
    endpoint_name="bioptimus-prod",  # the endpoint name you chose in step 2
    region_name="<REGION>",
)

M-Optimus:

from bioptimus.models.backbones import Backbone
from bioptimus.models.types import Models

model = Backbone(
    Models.M_OPTIMUS,
    backend="aws",
    endpoint_name="bioptimus-prod",  # the endpoint name you chose in step 2
    region_name="<REGION>",
)

For whole-slide inference (tiling, tissue masking, bulk RNA, output formats), see the Bioptimus SDK.

4. Or call the runtime API directly

A SageMaker request is a ModelRequest plus model_name and mode fields. See the API reference for the full schema.
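
A direct call might look roughly like the sketch below. Everything about the payload beyond the two named fields is an assumption: the contents of the ModelRequest, the valid values for `model_name` and `mode`, the JSON content type, and the JSON response are placeholders to be checked against the API reference. `runtime` is assumed to be a boto3 client created with `boto3.client("sagemaker-runtime", region_name="<REGION>")`, and both helper names are illustrative.

```python
import json


def build_request_body(model_request: dict, model_name: str, mode: str) -> bytes:
    """Add the two SageMaker-specific fields to a ModelRequest payload.

    Assumption: the fields sit at the top level of a JSON object.
    """
    payload = dict(model_request)  # copy so the caller's dict is untouched
    payload["model_name"] = model_name
    payload["mode"] = mode
    return json.dumps(payload).encode("utf-8")


def invoke(runtime, endpoint_name: str, body: bytes) -> dict:
    """Send the request through the SageMaker runtime and decode the reply.

    Assumption: the endpoint accepts and returns application/json.
    """
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=body,
    )
    return json.loads(response["Body"].read())
```

Keeping payload construction separate from the network call makes it easy to inspect the exact bytes sent when debugging a rejected request.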

5. Clean up

Endpoints incur charges while running. Delete the endpoint when finished.

Python SDK

predictor.delete_endpoint()

AWS CLI

Delete the endpoint, its configuration, and the model:

aws sagemaker delete-endpoint --endpoint-name "${endpoint_name}" --region <REGION>
aws sagemaker delete-endpoint-config --endpoint-config-name "${endpoint_name}-config" --region <REGION>
aws sagemaker delete-model --model-name "${model_name}" --region <REGION>

If you created the IAM execution role just for this deployment, delete it as well:

role_name="AmazonSageMaker-ExecutionRole-AWSMarketplace"

aws iam delete-role-policy \
  --role-name "${role_name}" \
  --policy-name "AmazonSageMaker-ExecutionPolicy"

aws iam detach-role-policy \
  --role-name "${role_name}" \
  --policy-arn "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"

aws iam delete-role --role-name "${role_name}"

Reference notebooks

End-to-end examples: h1-jumpstart (H-Optimus) and m-jumpstart (M-Optimus).
