Deploying an AI application to the cloud (AWS)

This guide provides a step-by-step walkthrough for beginners to deploy an AI application using Docker, Amazon ECR, and Amazon EC2.

Step 1: Create the Dockerfile

A Dockerfile is a text file of instructions that Docker follows to build a container image for your application.

  • Prepare your project: Ensure your directory contains your application script (e.g., app.py) and a requirements.txt file listing all Python dependencies.
  • Write the Dockerfile: Create a file named Dockerfile and add these lines:
    FROM python:3.9-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    CMD ["python", "app.py"]
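  • Build and Test: Open a terminal in the project directory, build the image, and run it locally. The trailing dot in the build command is the build context (the current directory), and the run command assumes app.py serves on port 8501. Once the container is up, open http://localhost:8501 in a browser to verify it:
    docker build -t ai-app .
    docker run -p 8501:8501 ai-app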
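
The port mapping in the run command only works if app.py starts a server that listens on port 8501 on all interfaces (0.0.0.0); a server bound to 127.0.0.1 inside the container is not reachable through the published port. As a minimal sketch, assuming no web framework at all, the server part of a stand-in app.py could be written with the Python standard library alone. The names AppHandler, make_server, and main, the response text, and the optional PORT environment variable are illustrative assumptions; a real AI application would load a model and probably use a framework listed in requirements.txt.

```python
"""Hypothetical stand-in for app.py: a tiny HTTP server on port 8501."""
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AppHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a plain-text placeholder."""

    def do_GET(self):
        body = b"AI app placeholder: the container is reachable.\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="0.0.0.0", port=8501):
    """Bind to all interfaces so Docker's -p mapping can reach the server."""
    return ThreadingHTTPServer((host, port), AppHandler)


def main():
    """Serve until the process is stopped."""
    # PORT is an assumed, optional override; 8501 matches the docker run example.
    server = make_server(port=int(os.environ.get("PORT", "8501")))
    print(f"Serving on port {server.server_address[1]}")
    server.serve_forever()
```

Ending the file with a call to main() makes python app.py, the Dockerfile's CMD, start the server. The call is left out of the sketch so that the block can be imported or tested without blocking.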

Step 2: Set Up Amazon ECR (Elastic Container Registry)

Amazon ECR is a managed container image registry service.

  1. Create Repository: Log into the AWS Console, navigate to ECR, and create a new Private repository named ai-app-repo.
  2. Push Image: Select your repository and click View push commands. Execute the provided commands in your local terminal to:
    • Authenticate your Docker client to the registry.
    • Tag your local image with the ECR repository URI.
    • Push the image to AWS.
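
The commands shown under View push commands follow a fixed pattern, so it can help to see how they are assembled from the account ID, the region, and the repository name. The sketch below only builds the three commands as strings and does not run them. The function names and the sample account ID and region are placeholders introduced for illustration, and the commands shown in the console for your own repository remain the authoritative version. It assumes the local image is tagged ai-app:latest, which is what the build step in Step 1 produces.

```python
"""Sketch: build the three ECR push commands from their parts."""


def ecr_registry(account_id, region):
    """Host name of the private ECR registry for one account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def push_commands(account_id, region, repo="ai-app-repo",
                  local_image="ai-app", tag="latest"):
    """Return the authenticate, tag, and push commands as shell strings."""
    registry = ecr_registry(account_id, region)
    remote = f"{registry}/{repo}:{tag}"
    return [
        # 1. Authenticate the Docker client to the registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # 2. Tag the local image with the ECR repository URI.
        f"docker tag {local_image}:{tag} {remote}",
        # 3. Push the image to AWS.
        f"docker push {remote}",
    ]


if __name__ == "__main__":
    # 123456789012 and us-east-1 are placeholder values, not real settings.
    for command in push_commands("123456789012", "us-east-1"):
        print(command)
```

Running the first command requires the AWS CLI to be installed and configured with credentials that are allowed to push to the repository.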

Step 3: Deploy on EC2 with Docker

Deploying on an EC2 instance provides a cost-effective environment with full control over the infrastructure.

  • Launch Instance: Create an EC2 instance (e.g., t2.medium or larger depending on your AI model) using the Amazon Linux 2 AMI.
  • Configure Security: In the Security Group settings, allow inbound traffic on port 22 (SSH, needed for the next step) and on the host port your app will be published on (e.g., port 80 or 8501).
  • Install Docker: SSH into your instance and install Docker:
    sudo yum update -y
    sudo amazon-linux-extras install docker
    sudo service docker start
    sudo usermod -a -G docker ec2-user
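  • Run Container: Log out and back in so the docker group membership from the previous step takes effect. Then authenticate to ECR on the instance (the instance needs AWS credentials with ECR read access, for example an attached IAM role), pull your image, and run it. The -p 80:8501 flag maps port 80 on the instance to port 8501 in the container:
    docker run -d -p 80:8501 [YOUR-ECR-URI]:latest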
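
The authenticate and pull steps need the registry host and the region, and both can be read from the image URI itself. The following sketch, assuming a private ECR URI in a standard commercial region (ending in amazonaws.com), splits the URI and builds the three on-instance commands as strings without running them. The function name, the regular expression, and the default ports are illustrative assumptions, with the ports taken from the run command above.

```python
"""Sketch: derive the on-instance commands from a full ECR image URI."""
import re

# Assumed shape: <account>.dkr.ecr.<region>.amazonaws.com/<repo>[:<tag>]
ECR_URI = re.compile(
    r"^(?P<registry>\d{12}\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com)"
    r"/(?P<repo>[^:]+)(?::(?P<tag>.+))?$"
)


def instance_commands(image_uri, host_port=80, container_port=8501):
    """Return the login, pull, and run commands for one image URI."""
    match = ECR_URI.match(image_uri)
    if match is None:
        raise ValueError(f"not a private ECR image URI: {image_uri!r}")
    tag = match["tag"] or "latest"  # Docker also defaults to :latest
    image = f"{match['registry']}/{match['repo']}:{tag}"
    return [
        # Authenticate Docker on the instance to the registry.
        f"aws ecr get-login-password --region {match['region']} | "
        f"docker login --username AWS --password-stdin {match['registry']}",
        # Pull explicitly; docker run would otherwise pull on first use.
        f"docker pull {image}",
        # Publish the container port on the chosen host port.
        f"docker run -d -p {host_port}:{container_port} {image}",
    ]
```

Calling instance_commands with your own URI and printing each entry gives the lines to paste into the SSH session.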

Step 4: Finalize and Verify

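Once the container is running, find the Public IPv4 address of your EC2 instance in the AWS Console. Enter http://[PUBLIC-IP] in your web browser (plain http, because this setup publishes port 80 and configures no TLS certificate) to access your live AI application. If the page does not load, recheck the Security Group rule for port 80 and confirm with docker ps that the container is running.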
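
A container can take a few seconds to start answering, so a scripted check that retries is sometimes more convenient than reloading the browser. The sketch below, which uses only the Python standard library, polls a URL until it responds. The function name wait_until_up and its default retry settings are assumptions chosen for illustration.

```python
"""Sketch: poll a URL until the deployed app answers over HTTP."""
import time
import urllib.error
import urllib.request


def wait_until_up(url, attempts=10, delay=3.0, timeout=5.0):
    """Return the HTTP status once the URL responds, or None if it never does."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.status
        except urllib.error.HTTPError as err:
            # The server answered, but with an error status such as 404 or 500.
            return err.code
        except OSError:
            # URLError, timeouts, and refused connections are all OSError
            # subclasses: the app is not reachable yet, so wait and retry.
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

Calling wait_until_up("http://[PUBLIC-IP]/") with the instance's address in place of the placeholder returns 200 once the app is reachable. A result of None suggests the container is not running or the Security Group is blocking the port.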
