What Is a Dockerfile?
A Dockerfile is a plain text file containing step-by-step instructions Docker uses to build an image. Think of it as a recipe — each instruction adds a new layer to the image, and when Docker finishes reading your recipe, you have a complete, reusable image.
Dockerfile → docker build → Image → Container
The Build Process — Visual Flow
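docker build -t myapp .
Docker sends the build context to the daemon (using .dockerignore to skip unwanted files), runs each instruction in order, and tags the resulting image myapp.
Your First Dockerfile: A Tiny Node.js App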
Let's say we have this simple Express server in app.js:
// app.js
const express = require('express');
const app = express();
app.get('/', (req, res) => {
res.send('Hello from a Docker container!');
});
app.listen(3000, () => {
console.log('App listening on port 3000');
});
Create a file named exactly Dockerfile (no extension) in the project root:
# Dockerfile
# 1. Start from an official Node.js base image
FROM node:20-alpine
# 2. Set the working directory inside the container
WORKDIR /app
# 3. Copy dependency manifests first (for build-cache efficiency)
COPY package*.json ./
# 4. Install production dependencies only (--omit=dev replaces the deprecated --production flag)
RUN npm install --omit=dev
# 5. Copy the rest of the source code
COPY . .
# 6. Document the port our app listens on
EXPOSE 3000
# 7. Command to run when the container starts
CMD ["node", "app.js"]
Build and run it:
# Build the image, tag it "hello-node"
docker build -t hello-node .
# Run a container from it, map host 8080 → container 3000
docker run -p 8080:3000 hello-node
# Open http://localhost:8080 in the browser
# → "Hello from a Docker container!"
That's a full, production-style image — in seven lines of Dockerfile. Let's break down every instruction.
The Essential Dockerfile Instructions
| Instruction | Purpose | Example |
|---|---|---|
| FROM | Base image to start from | FROM node:20-alpine |
| WORKDIR | Set working directory | WORKDIR /app |
| COPY | Copy files from host to image | COPY . . |
| ADD | Like COPY, but also unpacks archives / downloads URLs | ADD data.tar.gz /data/ |
| RUN | Execute a command during build | RUN npm install |
| ENV | Set environment variables | ENV NODE_ENV=production |
| ARG | Build-time variable (not kept in final image) | ARG VERSION=1.0 |
| EXPOSE | Document the port (metadata only) | EXPOSE 3000 |
| USER | Switch to a non-root user | USER node |
| VOLUME | Declare a mount point | VOLUME /data |
| CMD | Default command when container starts | CMD ["node", "app.js"] |
| ENTRYPOINT | Fixed executable; args can still be appended | ENTRYPOINT ["node"] |
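The difference between ARG and ENV in the table is easiest to see side by side. The following is a minimal sketch, not a recommended layout; APP_VERSION is a hypothetical variable name chosen for illustration.

```dockerfile
FROM node:20-alpine

# ARG exists only while the image is being built.
# Override it with: docker build --build-arg VERSION=2.0 -t myapp .
ARG VERSION=1.0

# Copy the build-time value into image metadata and into a runtime variable.
# APP_VERSION is a hypothetical name used only in this sketch.
LABEL version="${VERSION}"
ENV APP_VERSION=${VERSION}

# ENV values persist into every container started from the image.
ENV NODE_ENV=production

# Prints the value baked in at build time ("1.0" with the default ARG).
CMD ["node", "-e", "console.log(process.env.APP_VERSION)"]
```

Because the ARG itself is gone at runtime, assigning it to an ENV is one common way to keep a build-time value visible to the running app.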
Understanding Image Layers
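Every instruction in your Dockerfile creates a layer (instructions such as ENV or EXPOSE add metadata-only layers of zero size). Each layer is stacked on top of the previous one and cached; when an instruction or the files it copies change, Docker rebuilds that layer and every layer after it. This is why the order of your instructions matters enormously.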
Layer Stack (bottom → top)
Why Layer Order Matters
Copy package*.json and run npm install before copying the rest of the source code. This way, dependencies are reinstalled only when package.json (or the lockfile) actually changes, not on every tiny code edit. A rebuild after a code-only change can drop from minutes to seconds.
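To make the caching behaviour concrete, here is the same Node.js pattern annotated for a rebuild in which only app.js was edited. It is a sketch under that one assumption; the comments describe what the build cache would typically do.

```dockerfile
# Cache-friendly ordering, annotated for a rebuild where only app.js changed.
FROM node:20-alpine
WORKDIR /app

# The manifests are unchanged, so this step and the install below are cache hits.
COPY package*.json ./
RUN npm install --omit=dev

# app.js changed, so the cache is invalidated from this step onward.
COPY . .
CMD ["node", "app.js"]

# Cache-unfriendly alternative (avoid): a single "COPY . ." placed before
# "RUN npm install" changes on every edit and forces a full reinstall.
```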
CMD vs ENTRYPOINT — The Classic Confusion
Both define what a container runs at start, but with different philosophies.
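CMD
Sets the default command or arguments. It is replaced by anything you pass after the image name in docker run.
ENTRYPOINT
Sets the fixed executable. Arguments passed to docker run are appended, not replaced.
Using Them Together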
The most flexible pattern — ENTRYPOINT for the fixed executable, CMD for the default arguments:
ENTRYPOINT ["node"]
CMD ["app.js"]
# docker run myapp → node app.js (uses default)
# docker run myapp server.js → node server.js (CMD overridden)
# docker run --entrypoint sh myapp → sh (entrypoint overridden)
Shell Form vs Exec Form
Both CMD and ENTRYPOINT accept two forms:
Exec form: CMD ["node", "app.js"] runs the process directly, so signals work properly.
Shell form: CMD node app.js wraps the command in /bin/sh -c, so signals (Ctrl+C) can misbehave.
Always prefer the exec form (JSON array). It handles signals correctly, which is critical for graceful container shutdown.
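The two forms can be compared in a single file. This is an illustrative sketch that reuses the Express app from earlier; the shell form is left commented out so the file still builds with one CMD.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .

# Exec form: node runs as PID 1 and receives SIGTERM from "docker stop" directly.
CMD ["node", "app.js"]

# Shell form (commented out for comparison): Docker would run
# /bin/sh -c "node app.js", so the shell is PID 1 and may not forward signals.
# CMD node app.js
```

Even in exec form, the app still has to react to SIGTERM itself (for example by closing the HTTP server); running the container with docker run --init is one option when it does not.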
The .dockerignore File
Without a .dockerignore, your entire project folder — including node_modules, .git, logs, and IDE configs — is sent to the daemon and may even end up inside the image. Bad for size, bad for security.
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
.env
.env.local
.vscode
.idea
Dockerfile
.dockerignore
coverage
dist
build
*.md
It works much like .gitignore, with similar pattern syntax. Always add it: your images shrink, builds get faster, and you avoid accidentally baking secrets into the image.
A Real-World Example — Python Flask App
Let's do the same thing for Python so you see the pattern is universal.
# Dockerfile for a Python Flask app
FROM python:3.12-slim
# Avoid .pyc files and enable unbuffered logs
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
WORKDIR /app
# Install deps first (cache-friendly)
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy source
COPY . .
# Non-root user for security
RUN adduser --disabled-password appuser && chown -R appuser /app
USER appuser
EXPOSE 5000
CMD ["python", "app.py"]
Same shape — base image, workdir, dependencies first, source second, expose port, run command. Once you learn one language's pattern, the rest is muscle memory.
Multi-Stage Builds — A Sneak Peek
For compiled languages (Go, Java, TypeScript) or front-end apps that build to static files, you can use multiple FROM stages in the same Dockerfile — one stage to build, another to ship a tiny runtime image.
# ---- Stage 1: build the app ----
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# ---- Stage 2: serve only the built files ----
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
The final image contains only the static build output plus Nginx: no Node, no source code, no build tools. Compared with shipping the full Node toolchain and node_modules, the image can shrink from hundreds of megabytes to a few tens of megabytes. We'll cover multi-stage builds in depth in the next article.
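The same two-stage shape applies to compiled languages such as Go. The sketch below assumes a Go module with go.mod and go.sum at the project root and a main package in that directory; the binary name server and the /out path are hypothetical choices for illustration.

```dockerfile
# ---- Stage 1: compile a static binary ----
FROM golang:1.22-alpine AS builder
WORKDIR /src

# Download modules first so this layer is cached until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# CGO_ENABLED=0 produces a statically linked binary that needs no C library.
RUN CGO_ENABLED=0 go build -o /out/server .

# ---- Stage 2: ship only the binary ----
FROM alpine:3.20
COPY --from=builder /out/server /usr/local/bin/server

# Run as the unprivileged "nobody" user that Alpine already provides.
USER nobody
ENTRYPOINT ["/usr/local/bin/server"]
```

The Go toolchain and source code stay in the builder stage, so the runtime image holds little more than the compiled binary.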
Build, Tag, and Run Commands
# Build with a tag
docker build -t myapp:1.0 .
# Build with multiple tags
docker build -t myapp:1.0 -t myapp:latest .
# Build with no cache (full rebuild)
docker build --no-cache -t myapp .
# Build from a different Dockerfile
docker build -f Dockerfile.prod -t myapp .
# Build with build args
docker build --build-arg VERSION=2.0 -t myapp .
# Run the image
docker run -d -p 8080:3000 --name my-container myapp:1.0
# Inspect image layers
docker history myapp:1.0
Dockerfile Best Practices
Pin your base image
node:20-alpine, not node:latest. Deterministic builds.
Use small base images
alpine, slim, or distroless. Skip full Ubuntu unless needed.
Install deps before source
Combine RUN commands
RUN apt-get update && apt-get install -y x → one layer, smaller image.
Drop root with USER
Add .dockerignore
Use multi-stage for builds
Use HEALTHCHECK
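HEALTHCHECK is the one practice in this list that has not been shown yet. Here is a minimal sketch for the Express app from earlier, assuming it answers on port 3000 at the root path; the interval, timeout, and retry values are arbitrary examples rather than recommendations.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
EXPOSE 3000

# Probe the app every 30s; mark the container unhealthy after 3 failed probes.
# BusyBox wget ships with Alpine, so no extra package is needed.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD wget -q --spider http://127.0.0.1:3000/ || exit 1

CMD ["node", "app.js"]
```

Once the container is running, docker ps shows the health status next to the container's uptime.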
Common Mistakes to Avoid
1. Copying everything first, then installing deps. Every source code change invalidates the dependency-install cache. Always copy manifests (package.json, requirements.txt) and install first.
2. Using latest as a tag. FROM node:latest today may be Node 20, tomorrow Node 22. Pin exact versions for reproducible builds.
3. Installing tools only for build inside the final image. Tools like gcc, make, or dev dependencies bloat the image and add attack surface. Use multi-stage builds.
4. Storing secrets in ENV or the Dockerfile. Anything in an image layer is visible forever. Use runtime secrets via --env-file, Docker secrets, or an external secret manager.
5. Forgetting EXPOSE vs -p. EXPOSE is documentation. It does not publish the port to the host — you still need -p 8080:3000 on docker run.
6. Ignoring the build context size. If docker build takes forever just to "send build context", you're missing a .dockerignore.
7. Running everything as root. Most base images run as root by default, so a compromised process inside the container has root privileges there. Create or pick a non-root user and switch to it with USER before the startup command.
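As an illustration of fixing the last mistake, the earlier Node.js Dockerfile can drop root with two small changes. This sketch relies on the node user that the official Node images already include; other base images may need a user created first.

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Install dependencies as root, then hand the source files to the unprivileged user.
COPY package*.json ./
RUN npm install --omit=dev
COPY --chown=node:node . .

# The official Node images include a "node" user; switch to it before startup.
USER node

EXPOSE 3000
CMD ["node", "app.js"]
```

Every instruction after USER node, including the CMD, runs without root privileges.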
Inspecting Your Image
# See image size
docker images myapp
# See each layer and its size
docker history myapp:1.0
# Inspect image metadata (env, cmd, labels, layers)
docker inspect myapp:1.0
# Open a shell inside a running container for debugging
docker exec -it my-container sh
When an image is bigger than you expect, docker history shows which layer is the culprit — usually it's an unnecessary RUN apt-get install or a forgotten COPY of node_modules.
The Interview Answer
"I write a Dockerfile, a text file with step-by-step build instructions. Key instructions are FROM for the base image, WORKDIR to set the working directory, COPY to bring source code in, RUN to execute build-time commands like installing dependencies, EXPOSE to document ports, and CMD or ENTRYPOINT for the startup command. I always copy dependency manifests and install dependencies before copying the full source code so the build cache is reused efficiently. For production, I use pinned base images, a non-root USER, a .dockerignore file, and multi-stage builds to keep images small. Then I run docker build -t myapp:version . and push to a registry."
Summary
A Dockerfile is the recipe, docker build is the cooking, and the image is the finished meal. Every instruction creates a layer, and the order of instructions decides how much of that work Docker can cache between builds. Master the essentials — FROM, WORKDIR, COPY, RUN, EXPOSE, ENV, USER, CMD/ENTRYPOINT — and you can containerise any application in minutes.
Follow the best practices: pin your base image, use small images like alpine or slim, put dependency installs before source-code copies, add a .dockerignore, drop root privileges, and reach for multi-stage builds when you can. Keep images focused, small, and secure. With a good Dockerfile, your app runs identically on your laptop, on your teammate's machine, in CI, and in production — which is exactly the promise Docker is built on.
| Concept | Key Takeaway |
|---|---|
| Dockerfile | Recipe of instructions to build an image |
| docker build | Executes the Dockerfile to produce an image |
| Image layers | One per instruction, stacked and cached |
| FROM | Sets the base image (always pin versions) |
| WORKDIR | The in-container working directory |
| COPY before RUN | Copy manifests first for cache efficiency |
| CMD vs ENTRYPOINT | CMD = default; ENTRYPOINT = fixed executable |
| Exec form | Prefer JSON array for signal handling |
| EXPOSE | Metadata only — use -p to actually publish ports |
| .dockerignore | Smaller context, faster builds, no secret leaks |
| Non-root USER | Basic security hygiene for production |
| Multi-stage builds | Tiny final images by dropping build tools |
| docker history | Debug image size by inspecting layers |
| Tags | Use myapp:1.0, never just latest |
| Golden rule | One image = one reproducible app environment |