We recommend new projects start with resources from the AWS provider.
aws-native.sagemaker.InferenceComponent
Resource Type definition for AWS::SageMaker::InferenceComponent
Create InferenceComponent Resource
Resources are created with functions called constructors. To learn more about declaring and configuring resources, see Resources.
Constructor syntax
new InferenceComponent(name: string, args: InferenceComponentArgs, opts?: CustomResourceOptions);
@overload
def InferenceComponent(resource_name: str,
                       args: InferenceComponentArgs,
                       opts: Optional[ResourceOptions] = None)
@overload
def InferenceComponent(resource_name: str,
                       opts: Optional[ResourceOptions] = None,
                       endpoint_name: Optional[str] = None,
                       specification: Optional[InferenceComponentSpecificationArgs] = None,
                       deployment_config: Optional[InferenceComponentDeploymentConfigArgs] = None,
                       endpoint_arn: Optional[str] = None,
                       inference_component_name: Optional[str] = None,
                       runtime_config: Optional[InferenceComponentRuntimeConfigArgs] = None,
                       tags: Optional[Sequence[_root_inputs.TagArgs]] = None,
                       variant_name: Optional[str] = None)
func NewInferenceComponent(ctx *Context, name string, args InferenceComponentArgs, opts ...ResourceOption) (*InferenceComponent, error)
public InferenceComponent(string name, InferenceComponentArgs args, CustomResourceOptions? opts = null)
public InferenceComponent(String name, InferenceComponentArgs args)
public InferenceComponent(String name, InferenceComponentArgs args, CustomResourceOptions options)
type: aws-native:sagemaker:InferenceComponent
properties: # The arguments to resource properties.
options: # Bag of options to control resource's behavior.
Parameters
- name string
- The unique name of the resource.
- args InferenceComponentArgs
- The arguments to resource properties.
- opts CustomResourceOptions
- Bag of options to control resource's behavior.
- resource_name str
- The unique name of the resource.
- args InferenceComponentArgs
- The arguments to resource properties.
- opts ResourceOptions
- Bag of options to control resource's behavior.
- ctx Context
- Context object for the current deployment.
- name string
- The unique name of the resource.
- args InferenceComponentArgs
- The arguments to resource properties.
- opts ResourceOption
- Bag of options to control resource's behavior.
- name string
- The unique name of the resource.
- args InferenceComponentArgs
- The arguments to resource properties.
- opts CustomResourceOptions
- Bag of options to control resource's behavior.
- name String
- The unique name of the resource.
- args InferenceComponentArgs
- The arguments to resource properties.
- options CustomResourceOptions
- Bag of options to control resource's behavior.
InferenceComponent Resource Properties
To learn more about resource properties and how to use them, see Inputs and Outputs in the Architecture and Concepts docs.
Inputs
In Python, inputs that are objects can be passed either as argument classes or as dictionary literals.
The InferenceComponent resource accepts the following input properties:
- EndpointName string
- The name of the endpoint that hosts the inference component.
- Specification Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentSpecification
- DeploymentConfig Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentDeploymentConfig
- EndpointArn string
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- InferenceComponentName string
- The name of the inference component.
- RuntimeConfig Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentRuntimeConfig
- Tags List<Pulumi.AwsNative.Inputs.Tag>
- VariantName string
- The name of the production variant that hosts the inference component.
- EndpointName string
- The name of the endpoint that hosts the inference component.
- Specification InferenceComponentSpecificationArgs
- DeploymentConfig InferenceComponentDeploymentConfigArgs
- EndpointArn string
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- InferenceComponentName string
- The name of the inference component.
- RuntimeConfig InferenceComponentRuntimeConfigArgs
- Tags TagArgs
- VariantName string
- The name of the production variant that hosts the inference component.
- endpointName String
- The name of the endpoint that hosts the inference component.
- specification InferenceComponentSpecification
- deploymentConfig InferenceComponentDeploymentConfig
- endpointArn String
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- inferenceComponentName String
- The name of the inference component.
- runtimeConfig InferenceComponentRuntimeConfig
- tags List<Tag>
- variantName String
- The name of the production variant that hosts the inference component.
- endpointName string
- The name of the endpoint that hosts the inference component.
- specification InferenceComponentSpecification
- deploymentConfig InferenceComponentDeploymentConfig
- endpointArn string
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- inferenceComponentName string
- The name of the inference component.
- runtimeConfig InferenceComponentRuntimeConfig
- tags Tag[]
- variantName string
- The name of the production variant that hosts the inference component.
- endpoint_name str
- The name of the endpoint that hosts the inference component.
- specification InferenceComponentSpecificationArgs
- deployment_config InferenceComponentDeploymentConfigArgs
- endpoint_arn str
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- inference_component_name str
- The name of the inference component.
- runtime_config InferenceComponentRuntimeConfigArgs
- tags Sequence[TagArgs]
- variant_name str
- The name of the production variant that hosts the inference component.
- endpointName String
- The name of the endpoint that hosts the inference component.
- specification Property Map
- deploymentConfig Property Map
- endpointArn String
- The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
- inferenceComponentName String
- The name of the inference component.
- runtimeConfig Property Map
- tags List<Property Map>
- variantName String
- The name of the production variant that hosts the inference component.
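As noted above, Python inputs that are objects can be written as dictionary literals. The following is a minimal sketch of that shape using plain Python only. It does not import the Pulumi SDK, and the helper name `build_inference_component_inputs` and every example value (endpoint name, model name, copy count) are hypothetical. The keys follow the snake_case Python property names from the input list.

```python
from typing import Any, Dict, Optional


def build_inference_component_inputs(
    endpoint_name: str,
    model_name: str,
    copy_count: int,
    variant_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a dictionary shaped like the InferenceComponent inputs.

    Hypothetical helper: it only builds a dictionary and never calls Pulumi.
    """
    inputs: Dict[str, Any] = {
        "endpoint_name": endpoint_name,
        # Object-valued inputs are written as nested dictionary literals.
        "specification": {"model_name": model_name},
        "runtime_config": {"copy_count": copy_count},
    }
    # Optional inputs are left out entirely when they are not provided.
    if variant_name is not None:
        inputs["variant_name"] = variant_name
    return inputs


example_inputs = build_inference_component_inputs(
    endpoint_name="my-endpoint",  # hypothetical value
    model_name="my-model",        # hypothetical value
    copy_count=1,
)
```

In a Pulumi program, a dictionary of this shape would presumably be unpacked into the keyword-argument overload of the constructor; that step is not shown here because it requires the provider SDK.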
Outputs
All input properties are implicitly available as output properties. Additionally, the InferenceComponent resource produces the following output properties:
- CreationTime string
- The time when the inference component was created.
- FailureReason string
- Id string
- The provider-assigned unique ID for this managed resource.
- InferenceComponentArn string
- The Amazon Resource Name (ARN) of the inference component.
- InferenceComponentStatus Pulumi.AwsNative.SageMaker.InferenceComponentStatus
- The status of the inference component.
- LastModifiedTime string
- The time when the inference component was last updated.
- CreationTime string
- The time when the inference component was created.
- FailureReason string
- Id string
- The provider-assigned unique ID for this managed resource.
- InferenceComponentArn string
- The Amazon Resource Name (ARN) of the inference component.
- InferenceComponentStatus InferenceComponentStatus
- The status of the inference component.
- LastModifiedTime string
- The time when the inference component was last updated.
- creationTime String
- The time when the inference component was created.
- failureReason String
- id String
- The provider-assigned unique ID for this managed resource.
- inferenceComponentArn String
- The Amazon Resource Name (ARN) of the inference component.
- inferenceComponentStatus InferenceComponentStatus
- The status of the inference component.
- lastModifiedTime String
- The time when the inference component was last updated.
- creationTime string
- The time when the inference component was created.
- failureReason string
- id string
- The provider-assigned unique ID for this managed resource.
- inferenceComponentArn string
- The Amazon Resource Name (ARN) of the inference component.
- inferenceComponentStatus InferenceComponentStatus
- The status of the inference component.
- lastModifiedTime string
- The time when the inference component was last updated.
- creation_time str
- The time when the inference component was created.
- failure_reason str
- id str
- The provider-assigned unique ID for this managed resource.
- inference_component_arn str
- The Amazon Resource Name (ARN) of the inference component.
- inference_component_status InferenceComponentStatus
- The status of the inference component.
- last_modified_time str
- The time when the inference component was last updated.
- creationTime String
- The time when the inference component was created.
- failureReason String
- id String
- The provider-assigned unique ID for this managed resource.
- inferenceComponentArn String
- The Amazon Resource Name (ARN) of the inference component.
- inferenceComponentStatus "InService" | "Creating" | "Updating" | "Failed" | "Deleting"
- The status of the inference component.
- lastModifiedTime String
- The time when the inference component was last updated.
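The five status values of the inference component (InService, Creating, Updating, Failed, Deleting) can be modelled as an enum when a program needs to branch on the status output. The sketch below is plain Python and does not use the provider SDK. The name `is_settled` and the choice of which statuses count as "settled" are assumptions made for illustration, not something the resource definition states.

```python
from enum import Enum


class InferenceComponentStatus(str, Enum):
    """Status values as listed for the status output property."""

    IN_SERVICE = "InService"
    CREATING = "Creating"
    UPDATING = "Updating"
    FAILED = "Failed"
    DELETING = "Deleting"


# Assumption: these are the statuses in which no transition is in progress.
SETTLED_STATUSES = {
    InferenceComponentStatus.IN_SERVICE,
    InferenceComponentStatus.FAILED,
}


def is_settled(status: str) -> bool:
    """Return True if the status is one this sketch treats as settled.

    Raises ValueError if the string is not one of the five known values.
    """
    return InferenceComponentStatus(status) in SETTLED_STATUSES
```

Looking the string up through the enum constructor means an unexpected value fails loudly instead of being silently treated as "not settled".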
Supporting Types
InferenceComponentAlarm, InferenceComponentAlarmArgs      
- AlarmName string
- AlarmName string
- alarmName String
- alarmName string
- alarm_name str
- alarmName String
InferenceComponentAutoRollbackConfiguration, InferenceComponentAutoRollbackConfigurationArgs          
InferenceComponentCapacitySize, InferenceComponentCapacitySizeArgs        
InferenceComponentCapacitySizeType, InferenceComponentCapacitySizeTypeArgs          
- CopyCount 
- COPY_COUNT
- CapacityPercent 
- CAPACITY_PERCENT
- InferenceComponentCapacitySizeTypeCopyCount
- COPY_COUNT
- InferenceComponentCapacitySizeTypeCapacityPercent
- CAPACITY_PERCENT
- CopyCount 
- COPY_COUNT
- CapacityPercent 
- CAPACITY_PERCENT
- CopyCount 
- COPY_COUNT
- CapacityPercent 
- CAPACITY_PERCENT
- COPY_COUNT
- COPY_COUNT
- CAPACITY_PERCENT
- CAPACITY_PERCENT
- "COPY_COUNT"
- COPY_COUNT
- "CAPACITY_PERCENT"
- CAPACITY_PERCENT
InferenceComponentComputeResourceRequirements, InferenceComponentComputeResourceRequirementsArgs          
- MaxMemoryRequiredInMb int
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- MinMemoryRequiredInMb int
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- NumberOfAcceleratorDevicesRequired double
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- NumberOfCpuCoresRequired double
- The number of CPU cores to allocate to run a model that you assign to an inference component.
- MaxMemoryRequiredInMb int
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- MinMemoryRequiredInMb int
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- NumberOfAcceleratorDevicesRequired float64
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- NumberOfCpuCoresRequired float64
- The number of CPU cores to allocate to run a model that you assign to an inference component.
- maxMemoryRequiredInMb Integer
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- minMemoryRequiredInMb Integer
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- numberOfAcceleratorDevicesRequired Double
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- numberOfCpuCoresRequired Double
- The number of CPU cores to allocate to run a model that you assign to an inference component.
- maxMemoryRequiredInMb number
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- minMemoryRequiredInMb number
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- numberOfAcceleratorDevicesRequired number
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- numberOfCpuCoresRequired number
- The number of CPU cores to allocate to run a model that you assign to an inference component.
- max_memory_required_in_mb int
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- min_memory_required_in_mb int
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- number_of_accelerator_devices_required float
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- number_of_cpu_cores_required float
- The number of CPU cores to allocate to run a model that you assign to an inference component.
- maxMemoryRequiredInMb Number
- The maximum MB of memory to allocate to run a model that you assign to an inference component.
- minMemoryRequiredInMb Number
- The minimum MB of memory to allocate to run a model that you assign to an inference component.
- numberOfAcceleratorDevicesRequired Number
- The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia.
- numberOfCpuCoresRequired Number
- The number of CPU cores to allocate to run a model that you assign to an inference component.
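The compute resource requirements pair a minimum and a maximum memory figure with CPU core and accelerator counts. A pre-deployment sanity check over those four fields might look like the sketch below. It is plain Python with a hypothetical helper name, and both rules it applies (minimum memory not above maximum, counts not negative) are assumptions: the property descriptions do not state them explicitly.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def check_compute_requirements(req: Dict[str, Number]) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper; keys follow the Python property names.
    """
    problems: List[str] = []
    min_mb = req.get("min_memory_required_in_mb")
    max_mb = req.get("max_memory_required_in_mb")
    # Assumption: a minimum larger than the maximum is a configuration mistake.
    if min_mb is not None and max_mb is not None and min_mb > max_mb:
        problems.append(
            "min_memory_required_in_mb exceeds max_memory_required_in_mb"
        )
    # Assumption: resource counts are never negative.
    for key in (
        "number_of_accelerator_devices_required",
        "number_of_cpu_cores_required",
    ):
        value = req.get(key)
        if value is not None and value < 0:
            problems.append(f"{key} must not be negative")
    return problems
```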
InferenceComponentContainerSpecification, InferenceComponentContainerSpecificationArgs        
- ArtifactUrl string
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- DeployedImage Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentDeployedImage
- Environment Dictionary<string, string>
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- Image string
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
- ArtifactUrl string
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- DeployedImage InferenceComponentDeployedImage
- Environment map[string]string
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- Image string
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
- artifactUrl String
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- deployedImage InferenceComponentDeployedImage
- environment Map<String,String>
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- image String
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
- artifactUrl string
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- deployedImage InferenceComponentDeployedImage
- environment {[key: string]: string}
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- image string
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
- artifact_url str
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- deployed_image InferenceComponentDeployedImage
- environment Mapping[str, str]
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- image str
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
- artifactUrl String
- The Amazon S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
- deployedImage Property Map
- environment Map<String>
- The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have length of up to 1024. We support up to 16 entries in the map.
- image String
- The Amazon Elastic Container Registry (Amazon ECR) path where the Docker image for the model is stored.
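The container specification carries a few limits that are easy to check before deploying: the artifact path must point to a .tar.gz archive, and the environment map supports up to 16 entries with keys and values of length up to 1024. The following sketch checks those limits in plain Python. The helper name is hypothetical, and measuring "length" with `len()` on Python strings (characters rather than bytes) is an assumption.

```python
from typing import Dict, List, Optional

MAX_ENV_ENTRIES = 16          # "up to 16 entries in the map"
MAX_ENV_STRING_LENGTH = 1024  # "length of up to 1024" per key and value


def check_container_specification(
    artifact_url: Optional[str] = None,
    environment: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    if artifact_url is not None and not artifact_url.endswith(".tar.gz"):
        problems.append("artifact_url should point to a .tar.gz archive")
    env = environment or {}
    if len(env) > MAX_ENV_ENTRIES:
        problems.append(
            f"environment has {len(env)} entries; at most {MAX_ENV_ENTRIES} are supported"
        )
    for key, value in env.items():
        # Assumption: length is counted in characters.
        if len(key) > MAX_ENV_STRING_LENGTH:
            problems.append("an environment key is longer than 1024")
        if len(value) > MAX_ENV_STRING_LENGTH:
            problems.append(f"environment value for {key[:20]!r} is longer than 1024")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.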
InferenceComponentDeployedImage, InferenceComponentDeployedImageArgs        
- ResolutionTime string
- The date and time when the image path for the model resolved to the ResolvedImage
- ResolvedImage string
- The specific digest path of the image hosted in this ProductionVariant.
- SpecifiedImage string
- The image path you specified when you created the model.
- ResolutionTime string
- The date and time when the image path for the model resolved to the ResolvedImage
- ResolvedImage string
- The specific digest path of the image hosted in this ProductionVariant.
- SpecifiedImage string
- The image path you specified when you created the model.
- resolutionTime String
- The date and time when the image path for the model resolved to the ResolvedImage
- resolvedImage String
- The specific digest path of the image hosted in this ProductionVariant.
- specifiedImage String
- The image path you specified when you created the model.
- resolutionTime string
- The date and time when the image path for the model resolved to the ResolvedImage
- resolvedImage string
- The specific digest path of the image hosted in this ProductionVariant.
- specifiedImage string
- The image path you specified when you created the model.
- resolution_time str
- The date and time when the image path for the model resolved to the ResolvedImage
- resolved_image str
- The specific digest path of the image hosted in this ProductionVariant.
- specified_image str
- The image path you specified when you created the model.
- resolutionTime String
- The date and time when the image path for the model resolved to the ResolvedImage
- resolvedImage String
- The specific digest path of the image hosted in this ProductionVariant.
- specifiedImage String
- The image path you specified when you created the model.
InferenceComponentDeploymentConfig, InferenceComponentDeploymentConfigArgs        
InferenceComponentRollingUpdatePolicy, InferenceComponentRollingUpdatePolicyArgs          
InferenceComponentRuntimeConfig, InferenceComponentRuntimeConfigArgs        
- CopyCount int
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- CurrentCopyCount int
- DesiredCopyCount int
- CopyCount int
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- CurrentCopyCount int
- DesiredCopyCount int
- copyCount Integer
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- currentCopyCount Integer
- desiredCopyCount Integer
- copyCount number
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- currentCopyCount number
- desiredCopyCount number
- copy_count int
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- current_copy_count int
- desired_copy_count int
- copyCount Number
- The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.
- currentCopyCount Number
- desiredCopyCount Number
InferenceComponentSpecification, InferenceComponentSpecificationArgs      
- BaseInferenceComponentName string
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- ComputeResourceRequirements Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentComputeResourceRequirements
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- Container Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentContainerSpecification
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- ModelName string
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- StartupParameters Pulumi.AwsNative.SageMaker.Inputs.InferenceComponentStartupParameters
- Settings that take effect while the model container starts up.
- BaseInferenceComponentName string
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- ComputeResourceRequirements InferenceComponentComputeResourceRequirements
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- Container InferenceComponentContainerSpecification
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- ModelName string
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- StartupParameters InferenceComponentStartupParameters
- Settings that take effect while the model container starts up.
- baseInferenceComponentName String
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- computeResourceRequirements InferenceComponentComputeResourceRequirements
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- container InferenceComponentContainerSpecification
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- modelName String
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- startupParameters InferenceComponentStartupParameters
- Settings that take effect while the model container starts up.
- baseInferenceComponentName string
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- computeResourceRequirements InferenceComponentComputeResourceRequirements
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- container InferenceComponentContainerSpecification
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- modelName string
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- startupParameters InferenceComponentStartupParameters
- Settings that take effect while the model container starts up.
- base_inference_component_name str
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- compute_resource_requirements InferenceComponentComputeResourceRequirements
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- container InferenceComponentContainerSpecification
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- model_name str
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- startup_parameters InferenceComponentStartupParameters
- Settings that take effect while the model container starts up.
- baseInferenceComponentName String
- The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
- computeResourceRequirements Property Map
- The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
- container Property Map
- Defines a container that provides the runtime environment for a model that you deploy with an inference component.
- modelName String
- The name of an existing SageMaker AI model object in your account that you want to deploy with the inference component.
- startupParameters Property Map
- Settings that take effect while the model container starts up.
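The rule described above (an adapter inference component names its base component and omits the compute resource requirements) can be sketched as a small helper. This is a minimal illustration, not part of the provider: `build_specification` is a hypothetical function, the snake_case keys `base_inference_component_name` and `artifact_url` are assumed from the property names on this page, and the contents of `compute_resource_requirements` are treated as an opaque mapping.

```python
from typing import Any, Mapping, Optional


def build_specification(
    *,
    model_name: Optional[str] = None,
    compute_resource_requirements: Optional[Mapping[str, Any]] = None,
    base_inference_component_name: Optional[str] = None,
    artifact_url: Optional[str] = None,
) -> dict:
    """Assemble a specification mapping for a base or an adapter component.

    Hypothetical helper: it only encodes the rule stated in the property
    descriptions and does not call any provider API.
    """
    spec: dict = {}

    if base_inference_component_name is not None:
        # Adapter component: it runs on the base component's compute
        # resources, so compute_resource_requirements must be omitted.
        if compute_resource_requirements is not None:
            raise ValueError(
                "omit compute_resource_requirements for an adapter inference component"
            )
        spec["base_inference_component_name"] = base_inference_component_name
    elif compute_resource_requirements is not None:
        # Base component: copy the mapping through without interpreting it.
        spec["compute_resource_requirements"] = dict(compute_resource_requirements)

    if model_name is not None:
        spec["model_name"] = model_name
    if artifact_url is not None:
        # Assumed key name for the ArtifactUrl parameter of the container.
        spec["container"] = {"artifact_url": artifact_url}

    return spec
```

For example, `build_specification(base_inference_component_name="base-ic", artifact_url="s3://example-bucket/adapter.tar.gz")` returns a mapping with only the base component name and the container location; both values are placeholders.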
InferenceComponentStartupParameters, InferenceComponentStartupParametersArgs        
- ContainerStartupHealthCheckTimeoutInSeconds int
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- ModelDataDownloadTimeoutInSeconds int
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
- ContainerStartupHealthCheckTimeoutInSeconds int
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- ModelDataDownloadTimeoutInSeconds int
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
- containerStartupHealthCheckTimeoutInSeconds Integer
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- modelDataDownloadTimeoutInSeconds Integer
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
- containerStartupHealthCheckTimeoutInSeconds number
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- modelDataDownloadTimeoutInSeconds number
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
- container_startup_health_check_timeout_in_seconds int
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- model_data_download_timeout_in_seconds int
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
- containerStartupHealthCheckTimeoutInSeconds Number
- The timeout value, in seconds, for your inference container to pass the health check by SageMaker AI hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests.
- modelDataDownloadTimeoutInSeconds Number
- The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this inference component.
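As a rough sketch, the two startup timeouts could be collected and checked before they are handed to the resource. The `StartupTimeouts` class below is hypothetical and exists only for illustration; the requirement that each value be a positive integer is an assumption of this sketch, and the service may enforce stricter bounds that this page does not state.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class StartupTimeouts:
    """Hypothetical holder for the two startup timeouts, in seconds."""

    container_startup_health_check_timeout_in_seconds: Optional[int] = None
    model_data_download_timeout_in_seconds: Optional[int] = None

    def to_args(self) -> dict:
        """Return only the timeouts that were set, after a basic sanity check."""
        args = {}
        for key, value in asdict(self).items():
            if value is None:
                continue  # Unset timeouts are left out of the mapping.
            # Assumption of this sketch: a timeout is a positive whole number.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{key} must be a positive integer, got {value!r}")
            args[key] = value
        return args
```

Calling `StartupTimeouts(model_data_download_timeout_in_seconds=600).to_args()` yields a mapping with just that one key; the value 600 is a placeholder, not a recommended setting.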
InferenceComponentStatus, InferenceComponentStatusArgs      
- InService 
- InService
- Creating
- Creating
- Updating
- Updating
- Failed
- Failed
- Deleting
- Deleting
- InferenceComponentStatusInService
- InService
- InferenceComponentStatusCreating
- Creating
- InferenceComponentStatusUpdating
- Updating
- InferenceComponentStatusFailed
- Failed
- InferenceComponentStatusDeleting
- Deleting
- InService 
- InService
- Creating
- Creating
- Updating
- Updating
- Failed
- Failed
- Deleting
- Deleting
- InService 
- InService
- Creating
- Creating
- Updating
- Updating
- Failed
- Failed
- Deleting
- Deleting
- IN_SERVICE
- InService
- CREATING
- Creating
- UPDATING
- Updating
- FAILED
- Failed
- DELETING
- Deleting
- "InService" 
- InService
- "Creating"
- Creating
- "Updating"
- Updating
- "Failed"
- Failed
- "Deleting"
- Deleting
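The five status values can be mirrored in a local enum when a program needs to branch on them. The sketch below is a stand-in defined here, not the provider's own `InferenceComponentStatus` class, and treating Creating, Updating, and Deleting as "still in progress" is an interpretation of the names rather than something this page states.

```python
from enum import Enum


class InferenceComponentStatus(str, Enum):
    """Local stand-in listing the status values documented above."""

    IN_SERVICE = "InService"
    CREATING = "Creating"
    UPDATING = "Updating"
    FAILED = "Failed"
    DELETING = "Deleting"


# Assumption of this sketch: these statuses describe an operation in progress.
IN_PROGRESS = frozenset(
    {
        InferenceComponentStatus.CREATING,
        InferenceComponentStatus.UPDATING,
        InferenceComponentStatus.DELETING,
    }
)


def is_settled(status: str) -> bool:
    """Return True when the status is not one of the in-progress values.

    Raises ValueError for a string that is not a documented status.
    """
    return InferenceComponentStatus(status) not in IN_PROGRESS
```

With this helper, `is_settled("InService")` and `is_settled("Failed")` are true, while `is_settled("Updating")` is false.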
Tag, TagArgs  
Package Details
- Repository
- AWS Native pulumi/pulumi-aws-native
- License
- Apache-2.0