String trainingImage
The registry path of the Docker image that contains the training algorithm. For information about Docker registry paths for built-in algorithms, see sagemaker-algo-docker-registry-paths.
String trainingInputMode
The input mode that the algorithm supports. For the input modes that Amazon SageMaker algorithms support, see Algorithms. If an algorithm supports the
File input mode, Amazon SageMaker downloads the training data from S3 to the provisioned ML storage
volume and mounts the directory to the Docker volume for the training container. If an algorithm supports the
Pipe input mode, Amazon SageMaker streams data directly from S3 to the container.
In File mode, make sure that you provision an ML storage volume with sufficient capacity to accommodate the data downloaded from S3. In addition to the training data, the ML storage volume also stores the output model. The algorithm container also uses the ML storage volume to store intermediate information, if any.
For distributed algorithms using File mode, training data is distributed uniformly, and your training duration is predictable if the input data objects are approximately the same size. Amazon SageMaker does not split the files any further for model training. If the object sizes are skewed, training won't be optimal because the data distribution is also skewed: one host in the training cluster is overloaded and becomes a bottleneck in training.
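To illustrate why skewed object sizes slow down distributed training, the following sketch assigns whole objects to hosts and compares the busiest host with the average. The round-robin assignment, the class name ShardSkewSketch, and the sample sizes are assumptions made for illustration only; this reference does not specify how Amazon SageMaker assigns objects to hosts.

```java
import java.util.Arrays;

/** Hypothetical helper: estimates per-host load when whole objects are assigned to hosts. */
final class ShardSkewSketch {

    /** Assigns object i to host (i % hosts) and returns the total bytes per host. */
    static long[] bytesPerHost(long[] objectSizes, int hosts) {
        if (hosts <= 0) {
            throw new IllegalArgumentException("hosts must be positive");
        }
        long[] load = new long[hosts];
        for (int i = 0; i < objectSizes.length; i++) {
            load[i % hosts] += objectSizes[i];
        }
        return load;
    }

    /** Ratio of the busiest host's load to the mean load; 1.0 means perfectly even. */
    static double imbalance(long[] load) {
        long max = Arrays.stream(load).max().orElse(0L);
        double mean = Arrays.stream(load).average().orElse(0.0);
        return mean == 0.0 ? 1.0 : max / mean;
    }

    public static void main(String[] args) {
        long[] uniform = {100, 100, 100, 100};   // objects of equal size
        long[] skewed = {370, 10, 10, 10};       // one very large object
        System.out.println(imbalance(bytesPerHost(uniform, 2))); // prints 1.0
        System.out.println(imbalance(bytesPerHost(skewed, 2)));  // prints 1.9
    }
}
```

Because objects are not split, the host that receives the large object carries almost twice the average load in the skewed case, and the other host waits for it.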
String channelName
The name of the channel.
DataSource dataSource
The location of the channel data.
String contentType
The MIME type of the data.
String compressionType
If training data is compressed, the compression type. The default value is None.
CompressionType is used only in Pipe input mode. In File mode, leave this field unset or set it to
None.
String recordWrapperType
Specify RecordIO as the value when input data is in raw format but the training algorithm requires the RecordIO format, in which case Amazon SageMaker wraps each individual S3 object in a RecordIO record. If the input data is already in RecordIO format, you don't need to set this attribute. For more information, see Create a Dataset Using RecordIO.
In File mode, leave this field unset or set it to None.
String containerHostname
The DNS host name for the container after Amazon SageMaker deploys it.
String image
The Amazon EC2 Container Registry (Amazon ECR) path where inference code is stored. If you are using your own custom algorithm instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements. For more information, see Using Your Own Algorithms with Amazon SageMaker.
String modelDataUrl
The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
Map<K,V> environment
The environment variables to set in the Docker container. Each key and value in the Environment
string-to-string map can have a length of up to 1024. We support up to 16 entries in the map.
String endpointConfigName
The name of the endpoint configuration. You specify this name in a CreateEndpoint request.
List<E> productionVariants
An array of ProductionVariant objects, one for each model that you want to host at this endpoint.
List<E> tags
An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
String kmsKeyId
The Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.
String endpointConfigArn
The Amazon Resource Name (ARN) of the endpoint configuration.
String endpointName
The name of the endpoint. The name must be unique within an AWS Region in your AWS account.
String endpointConfigName
The name of an endpoint configuration. For more information, see CreateEndpointConfig.
List<E> tags
An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
String endpointArn
The Amazon Resource Name (ARN) of the endpoint.
String modelName
The name of the new model.
ContainerDefinition primaryContainer
The location of the primary Docker image containing inference code, associated artifacts, and the custom environment map that the inference code uses when the model is deployed into production.
String executionRoleArn
The Amazon Resource Name (ARN) of the IAM role that Amazon SageMaker can assume to access model artifacts and the Docker image for deployment on ML compute instances. Deploying on ML compute instances is part of model hosting. For more information, see Amazon SageMaker Roles.
List<E> tags
An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
VpcConfig vpcConfig
An object that specifies the VPC that you want your model to connect to. Control access to and from your model container by configuring the VPC. For more information, see host-vpc.
String modelArn
The ARN of the model created in Amazon SageMaker.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration.
List<E> onCreate
A shell script that runs only once, when you create a notebook instance.
List<E> onStart
A shell script that runs every time you start a notebook instance, including when you create the notebook instance.
String notebookInstanceLifecycleConfigArn
The Amazon Resource Name (ARN) of the lifecycle configuration.
String notebookInstanceName
The name of the new notebook instance.
String instanceType
The type of ML compute instance to launch for the notebook instance.
String subnetId
The ID of the subnet in a VPC to which you want your ML compute instance to connect.
List<E> securityGroupIds
The VPC security group IDs, in the form sg-xxxxxxxx. The security groups must be for the same VPC as specified in the subnet.
String roleArn
When you send any requests to AWS resources from the notebook instance, Amazon SageMaker assumes this role to perform tasks on your behalf. You must grant this role necessary permissions so Amazon SageMaker can perform these tasks. The policy must allow the Amazon SageMaker service principal (sagemaker.amazonaws.com) permissions to assume this role. For more information, see Amazon SageMaker Roles.
String kmsKeyId
If you provide an AWS KMS key ID, Amazon SageMaker uses it to encrypt data at rest on the ML storage volume that is attached to your notebook instance.
List<E> tags
A list of tags to associate with the notebook instance. You can add tags later by using the
CreateTags API.
String lifecycleConfigName
The name of a lifecycle configuration to associate with the notebook instance. For information about lifecycle configurations, see notebook-lifecycle-config.
String directInternetAccess
Sets whether Amazon SageMaker provides internet access to the notebook instance. If you set this to
Disabled, this notebook instance can access resources only in your VPC, and cannot
connect to Amazon SageMaker training and endpoint services unless you configure a NAT Gateway in your
VPC.
For more information, see appendix-notebook-and-internet-access. You can set the value of this parameter
to Disabled only if you set a value for the SubnetId parameter.
String notebookInstanceArn
The Amazon Resource Name (ARN) of the notebook instance.
String authorizedUrl
A JSON object that contains the URL string.
String trainingJobName
The name of the training job. The name must be unique within an AWS Region in an AWS account. It appears in the Amazon SageMaker console.
Map<K,V> hyperParameters
Algorithm-specific parameters. You set hyperparameters before you start the learning process. Hyperparameters influence the quality of the model. For a list of hyperparameters for each training algorithm provided by Amazon SageMaker, see Algorithms.
You can specify a maximum of 100 hyperparameters. Each hyperparameter is a key-value pair. Each key and value is
limited to 256 characters, as specified by the Length Constraint.
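The limits above can be checked on the client before a request is sent. The following is a minimal sketch of such a check; the class name HyperParameterCheck, the method name, and the sample hyperparameter names are hypothetical and not part of the SDK. It measures length with String.length(), which counts UTF-16 code units, so treat it as an approximation of the service-side constraint.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical client-side check of the hyperparameter limits described above. */
final class HyperParameterCheck {
    static final int MAX_ENTRIES = 100;  // maximum number of hyperparameters
    static final int MAX_LENGTH = 256;   // maximum length of each key and each value

    /** Returns a description of the first violated limit, or empty if the map is within limits. */
    static Optional<String> firstViolation(Map<String, String> hyperParameters) {
        if (hyperParameters.size() > MAX_ENTRIES) {
            return Optional.of("too many hyperparameters: " + hyperParameters.size());
        }
        for (Map.Entry<String, String> e : hyperParameters.entrySet()) {
            if (e.getKey().length() > MAX_LENGTH) {
                return Optional.of("key too long: " + e.getKey());
            }
            if (e.getValue().length() > MAX_LENGTH) {
                return Optional.of("value too long for key: " + e.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("epochs", "10");            // illustrative names only
        params.put("learning_rate", "0.01");
        System.out.println(firstViolation(params).orElse("within limits"));
    }
}
```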
AlgorithmSpecification algorithmSpecification
The registry path of the Docker image that contains the training algorithm and algorithm-specific metadata, including the input mode. For more information about algorithms provided by Amazon SageMaker, see Algorithms. For information about providing your own algorithms, see your-algorithms.
String roleArn
The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
During model training, Amazon SageMaker needs your permission to read input data from an S3 bucket, download a Docker image that contains training code, write model artifacts to an S3 bucket, write logs to Amazon CloudWatch Logs, and publish metrics to Amazon CloudWatch. You grant permissions for all of these tasks to an IAM role. For more information, see Amazon SageMaker Roles.
List<E> inputDataConfig
An array of Channel objects. Each channel is a named input source. InputDataConfig
describes the input data and its location.
Algorithms can accept input data from one or more channels. For example, an algorithm might have two channels of
input data, training_data and validation_data. The configuration for each channel
provides the S3 location where the input data is stored. It also provides information about the stored data: the
MIME type, compression method, and whether the data is wrapped in RecordIO format.
Depending on the input mode that the algorithm supports, Amazon SageMaker either copies input data files from an S3 bucket to a local directory in the Docker container, or makes it available as input streams.
OutputDataConfig outputDataConfig
Specifies the path to the S3 bucket where you want to store model artifacts. Amazon SageMaker creates subfolders for the artifacts.
ResourceConfig resourceConfig
The resources, including the ML compute instances and ML storage volumes, to use for model training.
ML storage volumes store model artifacts and incremental states. Training algorithms might also use ML storage
volumes for scratch space. If you want Amazon SageMaker to use the ML storage volume to store the training data,
choose File as the TrainingInputMode in the algorithm specification. For distributed
training algorithms, specify an instance count greater than 1.
VpcConfig vpcConfig
An object that specifies the VPC that you want your training job to connect to. Control access to and from your training container by configuring the VPC. For more information, see train-vpc.
StoppingCondition stoppingCondition
Sets a duration for training. Use this parameter to cap model training costs. To stop a job, Amazon SageMaker
sends the algorithm the SIGTERM signal, which delays job termination for 120 seconds. Algorithms
might use this 120-second window to save the model artifacts.
When Amazon SageMaker terminates a job because the stopping condition has been met, training algorithms provided
by Amazon SageMaker save the intermediate results of the job. This intermediate data is a valid model artifact.
You can use it to create a model using the CreateModel API.
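A custom algorithm written for the JVM could use the 120-second window by registering a shutdown hook, which the JVM runs when the process receives SIGTERM. The following is a minimal sketch under that assumption; the class name, the checkpoint file name, and the placeholder model state are hypothetical, and real code would write to whatever location its container contract requires.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Supplier;

/** Hypothetical sketch: save model state when the JVM is asked to terminate. */
final class GracefulStopSketch {

    /** Writes the model state to the target file; stands in for real artifact serialization. */
    static void saveCheckpoint(Path target, String modelState) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.write(target, modelState.getBytes(StandardCharsets.UTF_8));
    }

    /** Registers a shutdown hook that saves the latest state; the hook runs on SIGTERM and on normal exit. */
    static Thread registerSaveOnShutdown(Path target, Supplier<String> latestState) {
        Thread hook = new Thread(() -> {
            try {
                saveCheckpoint(target, latestState.get());
            } catch (IOException e) {
                System.err.println("Could not save checkpoint: " + e.getMessage());
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Illustrative location only.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "model-checkpoint.txt");
        registerSaveOnShutdown(target, () -> "state-after-last-completed-epoch");
        // The training loop would run here; the hook fires when the JVM shuts down.
    }
}
```

The hook should finish well inside the 120-second window, so it is safer to save state that is already in memory than to start new computation at shutdown.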
List<E> tags
An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
String trainingJobArn
The Amazon Resource Name (ARN) of the training job.
S3DataSource s3DataSource
The S3 location of the data source that is associated with a channel.
String endpointConfigName
The name of the endpoint configuration that you want to delete.
String endpointName
The name of the endpoint that you want to delete.
String modelName
The name of the model to delete.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration to delete.
String notebookInstanceName
The name of the Amazon SageMaker notebook instance to delete.
String endpointConfigName
The name of the endpoint configuration.
String endpointConfigName
Name of the Amazon SageMaker endpoint configuration.
String endpointConfigArn
The Amazon Resource Name (ARN) of the endpoint configuration.
List<E> productionVariants
An array of ProductionVariant objects, one for each model that you want to host at this endpoint.
String kmsKeyId
The AWS KMS key ID that Amazon SageMaker uses to encrypt data when storing it on the ML storage volume attached to the instance.
Date creationTime
A timestamp that shows when the endpoint configuration was created.
String endpointName
The name of the endpoint.
String endpointName
Name of the endpoint.
String endpointArn
The Amazon Resource Name (ARN) of the endpoint.
String endpointConfigName
The name of the endpoint configuration associated with this endpoint.
List<E> productionVariants
An array of ProductionVariant objects, one for each model hosted behind this endpoint.
String endpointStatus
The status of the endpoint.
String failureReason
If the status of the endpoint is Failed, the reason why it failed.
Date creationTime
A timestamp that shows when the endpoint was created.
Date lastModifiedTime
A timestamp that shows when the endpoint was last modified.
String modelName
The name of the model.
String modelName
Name of the Amazon SageMaker model.
ContainerDefinition primaryContainer
The location of the primary inference code, associated artifacts, and custom environment map that the inference code uses when it is deployed in production.
String executionRoleArn
The Amazon Resource Name (ARN) of the IAM role that you specified for the model.
VpcConfig vpcConfig
An object that specifies the VPC that this model has access to. For more information, see host-vpc.
Date creationTime
A timestamp that shows when the model was created.
String modelArn
The Amazon Resource Name (ARN) of the model.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration to describe.
String notebookInstanceLifecycleConfigArn
The Amazon Resource Name (ARN) of the lifecycle configuration.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration.
List<E> onCreate
The shell script that runs only once, when you create a notebook instance.
List<E> onStart
The shell script that runs every time you start a notebook instance, including when you create the notebook instance.
Date lastModifiedTime
A timestamp that tells when the lifecycle configuration was last modified.
Date creationTime
A timestamp that tells when the lifecycle configuration was created.
String notebookInstanceName
The name of the notebook instance that you want information about.
String notebookInstanceArn
The Amazon Resource Name (ARN) of the notebook instance.
String notebookInstanceName
Name of the Amazon SageMaker notebook instance.
String notebookInstanceStatus
The status of the notebook instance.
String failureReason
If the status is Failed, the reason it failed.
String url
The URL that you use to connect to the Jupyter notebook that is running in your notebook instance.
String instanceType
The type of ML compute instance that the notebook instance is running on.
String subnetId
The ID of the VPC subnet.
List<E> securityGroups
The IDs of the VPC security groups.
String roleArn
Amazon Resource Name (ARN) of the IAM role associated with the instance.
String kmsKeyId
The AWS KMS key ID that Amazon SageMaker uses to encrypt data when storing it on the ML storage volume attached to the instance.
String networkInterfaceId
The ID of the network interface that Amazon SageMaker created when it created the instance.
Date lastModifiedTime
A timestamp. Use this parameter to retrieve the time when the notebook instance was last modified.
Date creationTime
A timestamp. Use this parameter to return the time when the notebook instance was created.
String notebookInstanceLifecycleConfigName
Returns the name of a notebook instance lifecycle configuration.
For information about notebook instance lifecycle configurations, see notebook-lifecycle-config.
String directInternetAccess
Describes whether Amazon SageMaker provides internet access to the notebook instance. If this value is set to Disabled, the notebook instance does not have internet access, and cannot connect to Amazon SageMaker training and endpoint services.
For more information, see appendix-notebook-and-internet-access.
String trainingJobName
The name of the training job.
String trainingJobName
Name of the model training job.
String trainingJobArn
The Amazon Resource Name (ARN) of the training job.
ModelArtifacts modelArtifacts
Information about the Amazon S3 location that is configured for storing model artifacts.
String trainingJobStatus
The status of the training job.
For the InProgress status, Amazon SageMaker can return these secondary statuses:
Starting - Preparing for training.
Downloading - Optional stage for algorithms that support File training input mode. It indicates data is being downloaded to ML storage volumes.
Training - Training is in progress.
Uploading - Training is complete and model upload is in progress.
For the Stopped training status, Amazon SageMaker can return these secondary statuses:
MaxRuntimeExceeded - Job stopped as a result of maximum allowed runtime exceeded.
String secondaryStatus
Provides granular information about the system state. For more information, see TrainingJobStatus.
String failureReason
If the training job failed, the reason it failed.
Map<K,V> hyperParameters
Algorithm-specific parameters.
AlgorithmSpecification algorithmSpecification
Information about the algorithm used for training, and algorithm metadata.
String roleArn
The AWS Identity and Access Management (IAM) role configured for the training job.
List<E> inputDataConfig
An array of Channel objects that describes each data input channel.
OutputDataConfig outputDataConfig
The S3 path where model artifacts that you configured when creating the job are stored. Amazon SageMaker creates subfolders for model artifacts.
ResourceConfig resourceConfig
Resources, including ML compute instances and ML storage volumes, that are configured for model training.
VpcConfig vpcConfig
An object that specifies the VPC that this training job has access to. For more information, see train-vpc.
StoppingCondition stoppingCondition
The condition under which to stop the training job.
Date creationTime
A timestamp that indicates when the training job was created.
Date trainingStartTime
A timestamp that indicates when training started.
Date trainingEndTime
A timestamp that indicates when model training ended.
Date lastModifiedTime
A timestamp that indicates when the status of the training job was last modified.
String endpointName
The name of the endpoint.
String endpointArn
The Amazon Resource Name (ARN) of the endpoint.
Date creationTime
A timestamp that shows when the endpoint was created.
Date lastModifiedTime
A timestamp that shows when the endpoint was last modified.
String endpointStatus
The status of the endpoint.
String sortBy
The field to sort results by. The default is CreationTime.
String sortOrder
The sort order for results. The default is Ascending.
String nextToken
If the result of the previous ListEndpointConfigs request was truncated, the response includes a
NextToken. To retrieve the next set of endpoint configurations, use the token in the next request.
Integer maxResults
The maximum number of endpoint configurations to return in the response.
String nameContains
A string in the endpoint configuration name. This filter returns only endpoint configurations whose name contains the specified string.
Date creationTimeBefore
A filter that returns only endpoint configurations created before the specified time (timestamp).
Date creationTimeAfter
A filter that returns only endpoint configurations created after the specified time (timestamp).
String sortBy
Sorts the list of results. The default is CreationTime.
String sortOrder
The sort order for results. The default is Ascending.
String nextToken
If the result of a ListEndpoints request was truncated, the response includes a
NextToken. To retrieve the next set of endpoints, use the token in the next request.
Integer maxResults
The maximum number of endpoints to return in the response.
String nameContains
A string in endpoint names. This filter returns only endpoints whose name contains the specified string.
Date creationTimeBefore
A filter that returns only endpoints that were created before the specified time (timestamp).
Date creationTimeAfter
A filter that returns only endpoints that were created after the specified time (timestamp).
Date lastModifiedTimeBefore
A filter that returns only endpoints that were modified before the specified timestamp.
Date lastModifiedTimeAfter
A filter that returns only endpoints that were modified after the specified timestamp.
String statusEquals
A filter that returns only endpoints with the specified status.
String sortBy
Sorts the list of results. The default is CreationTime.
String sortOrder
The sort order for results. The default is Ascending.
String nextToken
If the response to a previous ListModels request was truncated, the response includes a
NextToken. To retrieve the next set of models, use the token in the next request.
Integer maxResults
The maximum number of models to return in the response.
String nameContains
A string in the model name. This filter returns only models whose name contains the specified string.
Date creationTimeBefore
A filter that returns only models created before the specified time (timestamp).
Date creationTimeAfter
A filter that returns only models created after the specified time (timestamp).
String nextToken
If the result of a ListNotebookInstanceLifecycleConfigs request was truncated, the response includes
a NextToken. To get the next set of lifecycle configurations, use the token in the next request.
Integer maxResults
The maximum number of lifecycle configurations to return in the response.
String sortBy
Sorts the list of results. The default is CreationTime.
String sortOrder
The sort order for results.
String nameContains
A string in the lifecycle configuration name. This filter returns only lifecycle configurations whose name contains the specified string.
Date creationTimeBefore
A filter that returns only lifecycle configurations that were created before the specified time (timestamp).
Date creationTimeAfter
A filter that returns only lifecycle configurations that were created after the specified time (timestamp).
Date lastModifiedTimeBefore
A filter that returns only lifecycle configurations that were modified before the specified time (timestamp).
Date lastModifiedTimeAfter
A filter that returns only lifecycle configurations that were modified after the specified time (timestamp).
String nextToken
If the response is truncated, Amazon SageMaker returns this token. To get the next set of lifecycle configurations, use it in the next request.
List<E> notebookInstanceLifecycleConfigs
An array of NotebookInstanceLifecycleConfiguration objects, each listing a lifecycle configuration.
String nextToken
If the response to the previous ListNotebookInstances request was truncated, the response includes a
NextToken. You can use this token in your subsequent ListNotebookInstances request to
fetch the next set of notebook instances.
You might specify a filter or a sort order in your request. When the response is truncated, you must use the same values for the filter and sort order in the next request.
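The token-based paging described here follows the same loop for every list operation. The following sketch shows that loop against an in-memory stand-in rather than the real service; the Page type, the fetchAll method, and the sample names are hypothetical and are not SDK classes. A real page-fetching function would send the same filter and sort order on every call and change only the token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of a NextToken paging loop. */
final class PaginationSketch {

    /** One page of results plus the token for the next page (null when nothing is left). */
    static final class Page<T> {
        final List<T> items;
        final String nextToken;

        Page(List<T> items, String nextToken) {
            this.items = items;
            this.nextToken = nextToken;
        }
    }

    /** Calls fetchPage repeatedly, passing the token returned by the previous call. */
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String token = null;                 // the first request carries no token
        do {
            Page<T> page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a list call that returns two items per page.
        List<String> names = Arrays.asList("nb-a", "nb-b", "nb-c", "nb-d", "nb-e");
        Function<String, Page<String>> fakeList = token -> {
            int start = token == null ? 0 : Integer.parseInt(token);
            int end = Math.min(start + 2, names.size());
            String next = end < names.size() ? String.valueOf(end) : null;
            return new Page<>(names.subList(start, end), next);
        };
        System.out.println(fetchAll(fakeList)); // prints [nb-a, nb-b, nb-c, nb-d, nb-e]
    }
}
```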
Integer maxResults
The maximum number of notebook instances to return.
String sortBy
The field to sort results by. The default is Name.
String sortOrder
The sort order for results.
String nameContains
A string in the notebook instances' name. This filter returns only notebook instances whose name contains the specified string.
Date creationTimeBefore
A filter that returns only notebook instances that were created before the specified time (timestamp).
Date creationTimeAfter
A filter that returns only notebook instances that were created after the specified time (timestamp).
Date lastModifiedTimeBefore
A filter that returns only notebook instances that were modified before the specified time (timestamp).
Date lastModifiedTimeAfter
A filter that returns only notebook instances that were modified after the specified time (timestamp).
String statusEquals
A filter that returns only notebook instances with the specified status.
String notebookInstanceLifecycleConfigNameContains
A string in the name of a notebook instance lifecycle configuration associated with this notebook instance. This filter returns only notebook instances associated with a lifecycle configuration with a name that contains the specified string.
String nextToken
If the response to the previous ListNotebookInstances request was truncated, Amazon SageMaker
returns this token. To retrieve the next set of notebook instances, use the token in the next request.
List<E> notebookInstances
An array of NotebookInstanceSummary objects, one for each notebook instance.
String resourceArn
The Amazon Resource Name (ARN) of the resource whose tags you want to retrieve.
String nextToken
If the response to the previous ListTags request is truncated, Amazon SageMaker returns this token.
To retrieve the next set of tags, use it in the subsequent request.
Integer maxResults
Maximum number of tags to return.
String nextToken
If the result of the previous ListTrainingJobs request was truncated, the response includes a
NextToken. To retrieve the next set of training jobs, use the token in the next request.
Integer maxResults
The maximum number of training jobs to return in the response.
Date creationTimeAfter
A filter that returns only training jobs created after the specified time (timestamp).
Date creationTimeBefore
A filter that returns only training jobs created before the specified time (timestamp).
Date lastModifiedTimeAfter
A filter that returns only training jobs modified after the specified time (timestamp).
Date lastModifiedTimeBefore
A filter that returns only training jobs modified before the specified time (timestamp).
String nameContains
A string in the training job name. This filter returns only training jobs whose name contains the specified string.
String statusEquals
A filter that retrieves only training jobs with a specific status.
String sortBy
The field to sort results by. The default is CreationTime.
String sortOrder
The sort order for results. The default is Ascending.
String s3ModelArtifacts
The path of the S3 object that contains the model artifacts. For example,
s3://bucket-name/keynameprefix/model.tar.gz.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration.
String notebookInstanceLifecycleConfigArn
The Amazon Resource Name (ARN) of the lifecycle configuration.
Date creationTime
A timestamp that tells when the lifecycle configuration was created.
Date lastModifiedTime
A timestamp that tells when the lifecycle configuration was last modified.
String content
A base64-encoded string that contains a shell script for a notebook instance lifecycle configuration.
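As an illustration of preparing this value, the following sketch encodes a shell script with the standard java.util.Base64 encoder and decodes it again. The class name, method names, and the sample script are hypothetical; UTF-8 is assumed as the script encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for producing and reading base64-encoded script content. */
final class LifecycleScriptEncoding {

    /** Encodes the script text as base64, assuming UTF-8 for the script bytes. */
    static String encodeScript(String script) {
        return Base64.getEncoder().encodeToString(script.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes base64 content back into the script text. */
    static String decodeScript(String content) {
        return new String(Base64.getDecoder().decode(content), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String script = "#!/bin/bash\necho \"notebook started\"\n";  // illustrative script
        String content = encodeScript(script);
        System.out.println(content);
        System.out.println(decodeScript(content).equals(script));     // prints true
    }
}
```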
String notebookInstanceName
The name of the notebook instance that you want a summary for.
String notebookInstanceArn
The Amazon Resource Name (ARN) of the notebook instance.
String notebookInstanceStatus
The status of the notebook instance.
String url
The URL that you use to connect to the Jupyter instance running in your notebook instance.
String instanceType
The type of ML compute instance that the notebook instance is running on.
Date creationTime
A timestamp that shows when the notebook instance was created.
Date lastModifiedTime
A timestamp that shows when the notebook instance was last modified.
String notebookInstanceLifecycleConfigName
The name of a notebook instance lifecycle configuration associated with this notebook instance.
For information about notebook instance lifecycle configurations, see notebook-lifecycle-config.
String kmsKeyId
The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the model artifacts at rest using Amazon S3 server-side encryption.
If the configuration of the output S3 bucket requires server-side encryption for objects, and you don't provide the KMS key ID, Amazon SageMaker uses the default service key. For more information, see KMS-Managed Encryption Keys in the Amazon Simple Storage Service Developer Guide.
The KMS key policy must grant permission to the IAM role you specify in your CreateTrainingJob
request. For more information, see Using Key Policies in
AWS KMS in the AWS Key Management Service Developer Guide.
String s3OutputPath
Identifies the S3 path where you want Amazon SageMaker to store the model artifacts. For example,
s3://bucket-name/key-name-prefix.
String variantName
The name of the production variant.
String modelName
The name of the model that you want to host. This is the name that you specified when creating the model.
Integer initialInstanceCount
Number of instances to launch initially.
String instanceType
The ML compute instance type.
Float initialVariantWeight
Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
The traffic to a production variant is determined by the ratio of the VariantWeight to the sum of
all VariantWeight values across all ProductionVariants. If unspecified, it defaults to 1.0.
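The ratio described above can be worked through with a short calculation. The following sketch computes each variant's share of traffic as its weight divided by the sum of all weights; the class name, method name, and variant names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: traffic share per production variant from variant weights. */
final class VariantTrafficSketch {

    /** Returns weight / (sum of all weights) for each variant, in insertion order. */
    static Map<String, Double> trafficShares(Map<String, Double> weights) {
        double total = 0.0;
        for (double w : weights.values()) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("sum of weights must be positive");
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            shares.put(e.getKey(), e.getValue() / total);
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("variant-a", 1.0);  // the default weight when unspecified
        weights.put("variant-b", 3.0);
        System.out.println(trafficShares(weights)); // prints {variant-a=0.25, variant-b=0.75}
    }
}
```

Only the ratios matter: weights of 1.0 and 3.0 give the same split as 10.0 and 30.0.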
String variantName
The name of the variant.
Float currentWeight
The weight associated with the variant.
Float desiredWeight
The requested weight, as specified in the UpdateEndpointWeightsAndCapacities request.
Integer currentInstanceCount
The number of instances associated with the variant.
Integer desiredInstanceCount
The number of instances requested in the UpdateEndpointWeightsAndCapacities request.
String instanceType
The ML compute instance type.
Integer instanceCount
The number of ML compute instances to use. For distributed training, provide a value greater than 1.
Integer volumeSizeInGB
The size of the ML storage volume that you want to provision.
ML storage volumes store model artifacts and incremental states. Training algorithms might also use the ML
storage volume for scratch space. If you want to store the training data in the ML storage volume, choose
File as the TrainingInputMode in the algorithm specification.
You must specify sufficient ML storage for your scenario.
Amazon SageMaker supports only the General Purpose SSD (gp2) ML storage volume type.
String volumeKmsKeyId
The Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the training job.
String s3DataType
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all
objects with the specified key name prefix for model training.
If you choose ManifestFile, S3Uri identifies an object that is a manifest file
containing a list of object keys that you want Amazon SageMaker to use for model training.
String s3Uri
Depending on the value specified for the S3DataType, identifies either a key name prefix or a
manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix.
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object which is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following S3 URIs:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of S3 URIs in this manifest constitutes the input data for the channel for this
data source. The object that each S3 URI points to must be readable by the IAM role that Amazon
SageMaker uses to perform tasks on your behalf.
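The way a manifest maps to S3 URIs can be sketched as joining the prefix with each relative path. The following example assumes plain string concatenation, as the sample manifest suggests, and takes the prefix and paths as already-parsed values; JSON parsing is left out, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: expand a manifest's prefix and relative paths into full S3 URIs. */
final class ManifestExpansionSketch {

    /** Concatenates the prefix with each relative path, preserving order. */
    static List<String> expand(String prefix, List<String> relativePaths) {
        List<String> uris = new ArrayList<>();
        for (String relativePath : relativePaths) {
            uris.add(prefix + relativePath);
        }
        return uris;
    }

    public static void main(String[] args) {
        List<String> uris = expand(
                "s3://customer_bucket/some/prefix/",
                Arrays.asList("relative/path/to/custdata-1", "relative/path/custdata-2"));
        for (String uri : uris) {
            System.out.println(uri);
        }
    }
}
```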
String s3DataDistributionType
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for
model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model
training, specify ShardedByS3Key. If there are n ML compute instances launched for a training
job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on
each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose
ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when
TrainingInputMode is set to File), this copies 1/n of the number of objects.
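The warning about choosing more instances than objects can be checked with simple arithmetic. The following sketch deals objects out to instances one at a time and counts the instances left without data; the round-robin assignment is an assumption made for illustration, since this reference only states that each instance gets approximately 1/n of the objects. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: how many objects each instance receives under ShardedByS3Key. */
final class ShardCountSketch {

    /** Deals objectCount objects out to instanceCount instances in round-robin order. */
    static int[] objectsPerInstance(int objectCount, int instanceCount) {
        if (instanceCount <= 0) {
            throw new IllegalArgumentException("instanceCount must be positive");
        }
        int[] counts = new int[instanceCount];
        for (int i = 0; i < objectCount; i++) {
            counts[i % instanceCount]++;
        }
        return counts;
    }

    /** Counts instances that received no objects and would sit idle while still being billed. */
    static int idleInstances(int[] counts) {
        int idle = 0;
        for (int c : counts) {
            if (c == 0) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(objectsPerInstance(10, 4))); // prints [3, 3, 2, 2]
        System.out.println(idleInstances(objectsPerInstance(3, 5)));    // prints 2
    }
}
```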
String notebookInstanceName
The name of the notebook instance to start.
String notebookInstanceName
The name of the notebook instance to terminate.
Integer maxRuntimeInSeconds
The maximum length of time, in seconds, that the training job can run. If model training does not complete during this time, Amazon SageMaker ends the job. If you do not specify a value, the default is 1 day. The maximum value is 5 days.
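Expressed in seconds, the default of 1 day is 86400 and the maximum of 5 days is 432000. The following sketch resolves the effective limit for a requested value; the class and method names are hypothetical, and rejecting non-positive values is an assumption, since no minimum is stated here.

```java
/** Hypothetical sketch: resolve the effective maximum runtime from an optional request value. */
final class MaxRuntimeSketch {
    static final int DEFAULT_SECONDS = 24 * 60 * 60;      // 1 day = 86400 seconds
    static final int MAX_SECONDS = 5 * 24 * 60 * 60;      // 5 days = 432000 seconds

    /** Returns the default when the value is unset, otherwise the value after a range check. */
    static int effectiveMaxRuntime(Integer requestedSeconds) {
        if (requestedSeconds == null) {
            return DEFAULT_SECONDS;
        }
        // Assumption: values must be positive; only the upper bound is documented above.
        if (requestedSeconds <= 0 || requestedSeconds > MAX_SECONDS) {
            throw new IllegalArgumentException(
                    "maxRuntimeInSeconds must be between 1 and " + MAX_SECONDS);
        }
        return requestedSeconds;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxRuntime(null));  // prints 86400
        System.out.println(effectiveMaxRuntime(3600));  // prints 3600
    }
}
```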
String trainingJobName
The name of the training job to stop.
String trainingJobName
The name of the training job that you want a summary for.
String trainingJobArn
The Amazon Resource Name (ARN) of the training job.
Date creationTime
A timestamp that shows when the training job was created.
Date trainingEndTime
A timestamp that shows when the training job ended. This field is set only if the training job has one of the
terminal statuses (Completed, Failed, or Stopped).
Date lastModifiedTime
Timestamp when the training job was last modified.
String trainingJobStatus
The status of the training job.
String endpointArn
The Amazon Resource Name (ARN) of the endpoint.
String endpointArn
The Amazon Resource Name (ARN) of the updated endpoint.
String notebookInstanceLifecycleConfigName
The name of the lifecycle configuration.
List<E> onCreate
The shell script that runs only once, when you create a notebook instance.
List<E> onStart
The shell script that runs every time you start a notebook instance, including when you create the notebook instance.
Copyright © 2018. All rights reserved.