AWS Enables SOCI for Faster Deep Learning Container Startups

AWS ML Blog
Thursday, June 4, 2026
  • AWS adds SOCI snapshotter support to all Deep Learning AMIs and Deep Learning Containers.
  • Lazy loading reduces container startup times from nearly 7 minutes to approximately 21 seconds.
  • Parallel pull mode improves image download performance by 2.2x for I/O-intensive AI workloads.

AWS has enabled support for Seekable OCI (SOCI) snapshotter and index across all Deep Learning AMIs and AWS Deep Learning Containers. This technology allows container images to use selective file downloading, a method where only necessary data is fetched at runtime rather than pulling the entire multi-gigabyte image before starting the workload. This addresses critical bottlenecks in production environments, such as long cold start times and wasted compute resources that occur when GPU instances remain idle during slow image pull operations.
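The idea of fetching only the data a workload touches can be illustrated with a toy model. The sketch below is not the SOCI implementation: `LazyImage`, `build_image`, and the file names are hypothetical, and a byte string stands in for the remote image. It only shows how an index of file offsets lets a reader pull individual files on demand instead of the whole blob.

```python
from dataclasses import dataclass, field


@dataclass
class LazyImage:
    """Toy stand-in for a remote image whose files are fetched on first access."""
    blob: bytes                          # simulates the image stored in a registry
    index: dict[str, tuple[int, int]]    # path -> (offset, length) within the blob
    cache: dict[str, bytes] = field(default_factory=dict)
    bytes_fetched: int = 0

    def read(self, path: str) -> bytes:
        # Fetch the file's byte range only the first time it is requested.
        if path not in self.cache:
            offset, length = self.index[path]
            self.cache[path] = self.blob[offset:offset + length]
            self.bytes_fetched += length
        return self.cache[path]


def build_image(files: dict[str, bytes]) -> LazyImage:
    """Concatenate files into one blob and record where each file lives."""
    blob = b""
    index: dict[str, tuple[int, int]] = {}
    for path, data in files.items():
        index[path] = (len(blob), len(data))
        blob += data
    return LazyImage(blob=blob, index=index)


# Hypothetical image contents: a small binary, large weights, and docs.
image = build_image({
    "/usr/bin/python3": b"p" * 100,
    "/opt/model/weights.bin": b"w" * 10_000,
    "/usr/share/doc/readme": b"d" * 500,
})

image.read("/usr/bin/python3")   # first access triggers a fetch
image.read("/usr/bin/python3")   # second access is served from the cache
print(f"fetched {image.bytes_fetched} of {len(image.blob)} bytes")
```

In this toy run the workload starts after fetching only the one file it needs; the remaining bytes are transferred later, and only if something reads them.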

Benchmarks demonstrate significant performance gains through two distinct modes. In lazy loading mode, containers start before the full image has been downloaded and fetch the remaining data on demand; for a 32.7 GB vLLM image, startup time dropped from 6 minutes 59.099 seconds with standard Docker to 21.125 seconds using the SOCI snapshotter. This mode is best suited for environments requiring the fastest possible initialization, particularly on lower-spec instances.

Parallel pull mode provides an alternative for workloads requiring the full image available at startup by increasing download and unpacking concurrency. For a 60.4GB image, parallel pull mode reduced total pull time from 4 minutes 44.163 seconds to 2 minutes 12.846 seconds, achieving a 2.2x performance improvement. Users can enable this mode by modifying the SOCI snapshotter configuration file at /etc/soci-snapshotter-grpc/config.toml, allowing for tuned concurrent download and unpack threads based on specific instance capabilities and network bandwidth.
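As a quick check on the benchmark figures quoted in this summary, the timings can be converted to seconds and divided. The helper name `to_seconds` is just for illustration; the numbers are the ones stated in the text.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a minutes-and-seconds duration to seconds."""
    return minutes * 60 + seconds


# Lazy loading: 32.7 GB vLLM image, time to container start.
docker_start = to_seconds(6, 59.099)
soci_start = to_seconds(0, 21.125)
lazy_speedup = docker_start / soci_start
idle_seconds_saved = docker_start - soci_start

# Parallel pull: 60.4 GB image, total pull time.
standard_pull = to_seconds(4, 44.163)
parallel_pull = to_seconds(2, 12.846)
parallel_speedup = standard_pull / parallel_pull

print(f"lazy loading speedup:  {lazy_speedup:.1f}x")
print(f"startup time saved:    {idle_seconds_saved:.1f} s")
print(f"parallel pull speedup: {parallel_speedup:.2f}x")
```

The lazy loading timings work out to roughly a 20x faster start. The parallel pull timings as quoted here give about 2.14x, slightly under the 2.2x headline figure; the gap may come from rounding or from a different measurement in the original post.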

AWS Deep Learning Containers featuring the -soci tag suffix include pre-created SOCI indexes, enabling immediate usage of lazy loading functionality. For custom images, users must manually create and push SOCI indexes to their registries. By optimizing container management, these tools help reduce network saturation and operational costs associated with large-scale deep learning infrastructure deployment on platforms like Amazon SageMaker, EC2, ECS, and EKS.

Read original (English)·Jun 3, 2026
#aws#container#soci#deep learning#lazy loading#docker#ami#parallel pull