Dev Containers allow you to open a directory inside of a Docker container and use it as a complete development environment in Visual Studio Code or GitHub Codespaces. Configuring an environment and installing all of the required dependencies for a project can be exceedingly difficult, so I thought it would be great to set up a reproducible environment for my data engineering projects.
Specifically, setting up Snowpark for Snowflake piqued my interest in Dev Containers - it requires Python 3.8, Anaconda, and the Snowpark Python Package. It’s fairly straightforward to set the above environment up, but after you do it a few times, you start to wonder if there’s a better way. Turns out there is.
Any time you start a new project, you can simply copy over your devcontainer to configure your environment.
This Dev Container has a data engineering flavour. It likely won’t suit all of your needs, but hopefully it will be a great starting point and template for building your own Dev Container.
Dev Containers require you to have Docker installed OR use GitHub Codespaces. Codespaces is now free for individual use (60 hours/month) and is worth checking out if you haven’t tried it. It is basically a cloud version of VS Code that uses GitHub’s compute resources, which can be scaled if you need a more powerful virtual machine. Here is a great blog post on some of the benefits offered by GitHub Codespaces.
Getting Started
Option 1: Local VS Code
- Clone this repo and connect to it in VS Code:
$ cd your/desired/repo/location
$ git clone https://github.com/MartyC-137/DataEng_devcontainer
- Download the Dev Containers extension from the VS Code marketplace
- Press Cmd + Shift + P (Mac) or Ctrl + Shift + P (Windows) to open the Command Palette. Type in Dev Containers: Open Folder in Container and select the repo directory.
- Wait for the container to build and the dependencies to install
- Start developing!
Option 2: GitHub Codespaces
- From the forked repo in GitHub, select the green <> Code button and choose Codespaces
- Click Create Codespace on Main - you can check out a branch once the environment is ready
- Wait for the container to build and the dependencies to install
- Start developing!
Components of a Dev Container
When you have the Dev Containers extension installed and are setting up a new project, VS Code and GitHub Codespaces will automatically detect if you have a .devcontainer directory inside your repo. It is looking for a devcontainer.json file, although you can augment this with a Dockerfile, requirements.txt, etc. In the .devcontainer directory from my repo you’ll see the following:
- Dockerfile
- devcontainer.json
- config.fish
- Microsoft.Powershell_profile.ps1
- requirements.txt
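To make the detection step concrete, here is a minimal sketch of a check you could run against any repo before opening it. It is only an illustration, not how VS Code or Codespaces implement detection internally, and the function name find_devcontainer_config is my own. It looks in the .devcontainer directory described above, and also for a single .devcontainer.json file at the repo root, which the Dev Container specification accepts as an alternative location.

```python
from pathlib import Path
from typing import Optional


def find_devcontainer_config(repo_root: str) -> Optional[Path]:
    """Return the path to a devcontainer.json in the repo, or None if absent."""
    root = Path(repo_root)
    # Checked in order: the .devcontainer directory first,
    # then a single .devcontainer.json file at the repo root.
    candidates = [
        root / ".devcontainer" / "devcontainer.json",
        root / ".devcontainer.json",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    config = find_devcontainer_config(".")
    print(config if config else "No devcontainer.json found")
```

Running it from the root of a cloned repo tells you whether the Dev Containers extension should have something to pick up.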
The Dockerfile, config.fish and Microsoft.Powershell_profile.ps1 files are used to configure a custom powershell Powershell environment, Oh My Posh. This application provides awesome Powershell themes and is a great way to take your terminal experience to the next level. You can check out the official Microsoft documentation here.
Here’s an example of what Oh My Posh looks like, I’m using the clean-detailed theme:

The Microsoft.Powershell_profile.ps1 is a configuration file that runs when Powershell starts. The Terminal Icons theme is included in the Powershell profile - if you run the Get-ChildItem cmdlet inside of your Dev Container, you’ll see the addition of icons for files:
Next, let’s take a look at the devcontainer.json file. Here is what ours looks like for this project:
{
  "name": "oh-my-posh",
  "build": {
    "dockerfile": "Dockerfile",
    "args": {
      "VARIANT": "1.19-bullseye",
      "POSH_THEME": "https://raw.githubusercontent.com/JanDeDobbeleer/oh-my-posh/main/themes/clean-detailed.omp.json",
      "TZ": "America/Moncton",
      "NODE_VERSION": "lts/*",
      "PS_VERSION": "7.2.7"
    }
  },
  "runArgs": ["--cap-add=SYS_PTRACE", "--security-opt", "seccomp=unconfined"],
  "features": {
    "ghcr.io/devcontainers/features/azure-cli:1": {
      "version": "latest"
    },
    "ghcr.io/devcontainers/features/python:1": {
      "version": "3.8"
    },
    "ghcr.io/devcontainers-contrib/features/curl-apt-get:1": {},
    "ghcr.io/devcontainers-contrib/features/terraform-asdf:2": {},
    "ghcr.io/devcontainers-contrib/features/yamllint:2": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/docker-outside-of-docker:1": {},
    "ghcr.io/devcontainers/features/github-cli:1": {},
    "ghcr.io/devcontainers-contrib/features/spark-sdkman:2": {
      "jdkVersion": "11"
    },
    "ghcr.io/dhoeric/features/google-cloud-cli:1": {
      "version": "latest"
    }
  },
  "customizations": {
    "vscode": {
      "settings": {
        "go.toolsManagement.checkForUpdates": "local",
        "go.useLanguageServer": true,
        "go.gopath": "/go",
        "go.goroot": "/usr/local/go",
        "terminal.integrated.profiles.linux": {
          "bash": {
            "path": "bash"
          },
          "zsh": {
            "path": "zsh"
          },
          "fish": {
            "path": "fish"
          },
          "tmux": {
            "path": "tmux",
            "icon": "terminal-tmux"
          },
          "pwsh": {
            "path": "pwsh",
            "icon": "terminal-powershell"
          }
        },
        "terminal.integrated.defaultProfile.linux": "pwsh",
        "terminal.integrated.defaultProfile.windows": "pwsh",
        "terminal.integrated.defaultProfile.osx": "pwsh",
        "tasks.statusbar.default.hide": true,
        "terminal.integrated.tabs.defaultIcon": "terminal-powershell",
        "terminal.integrated.tabs.defaultColor": "terminal.ansiBlue",
        "workbench.colorTheme": "GitHub Dark Dimmed",
        "workbench.iconTheme": "material-icon-theme"
      },
      "extensions": [
        "snowflake.snowflake-vsc",
        "golang.go",
        "ms-vscode.powershell",
        "ms-python.python",
        "ms-python.vscode-pylance",
        "redhat.vscode-yaml",
        "redhat.vscode-xml",
        "ms-vscode-remote.remote-containers",
        "ms-toolsai.jupyter",
        "eamodio.gitlens",
        "yzhang.markdown-all-in-one",
        "davidanson.vscode-markdownlint",
        "editorconfig.editorconfig",
        "esbenp.prettier-vscode",
        "github.vscode-pull-request-github",
        "akamud.vscode-theme-onedark",
        "PKief.material-icon-theme",
        "GitHub.github-vscode-theme",
        "actboy168.tasks",
        "bastienboutonnet.vscode-dbt",
        "innoverio.vscode-dbt-power-user",
        "ms-mssql.mssql",
        "adpyke.vscode-sql-formatter",
        "inferrinizzard.prettier-sql-vscode"
      ]
    }
  },
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  // "forwardPorts": [3000],
  "postCreateCommand": "pip3 install --user -r .devcontainer/requirements.txt --use-pep517",
  "remoteUser": "vscode"
}
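One practical wrinkle if you ever want to inspect this file from a script: devcontainer.json allows // comments, and Python’s built-in json module rejects them. Here is a rough sketch, under my own assumptions, of stripping those comments and listing the features a devcontainer installs. The helper names and the trimmed-down SAMPLE text (including its example.com URL) are hypothetical, and the stripper only handles // line comments, not /* */ blocks or trailing commas.

```python
import json
from typing import List


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside of JSON strings."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            # Inside a string: copy everything, so URLs like https://... survive.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, leaving the newline in place.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def list_features(devcontainer_text: str) -> List[str]:
    """Return the sorted feature identifiers from devcontainer.json text."""
    config = json.loads(strip_line_comments(devcontainer_text))
    return sorted(config.get("features", {}))


# Hypothetical, shortened stand-in for the real file.
SAMPLE = """
{
  "build": {"args": {"POSH_THEME": "https://example.com/theme.omp.json"}},
  "features": {
    "ghcr.io/devcontainers/features/python:1": {"version": "3.8"},
    "ghcr.io/devcontainers/features/github-cli:1": {}
  },
  // "forwardPorts": [3000],
  "remoteUser": "vscode"
}
"""

if __name__ == "__main__":
    for feature in list_features(SAMPLE):
        print(feature)
```

To try it on the real file, read .devcontainer/devcontainer.json into a string and pass that to list_features instead of SAMPLE.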
The devcontainer.json file has a few key components:
- build block - Specifies that we are using a Dockerfile. The Dockerfile uses a Go base image because Oh My Posh is written in Go. I install Python later via a Dev Container feature.
- features block - This is an easy way to install additional programming languages, command line tools, etc. into your devcontainer. In my devcontainer I have the following:
  - Python 3.8
  - Terraform
  - Azure CLI
  - Google Cloud CLI
  - GitHub CLI
  - Spark with JDK 11
- customizations block - This passes default values to our devcontainer’s settings.json file, the file that configures our VS Code settings. Here I set values like Powershell as the default terminal, GitHub Dark Dimmed as my color theme, etc. These are my personal preferences, so please replace them with whatever you prefer. Note that Oh My Posh works with bash, zsh, etc. You don’t have to use Powershell if you prefer a different shell!
- extensions block - This block installs any VS Code extensions you like. In this devcontainer I’ve included the following:
  - Snowflake
  - SQL Server
  - Powershell
  - Python tools
  - Jupyter Notebooks
  - GitHub Pull Requests
  - GitLens
  - Popular VS Code Themes (GitHub, One Dark etc.)
  - dbt extensions
  - YAML and XML tools. VS Code has a built-in JSON formatter
  - SQL formatting tools
- postCreateCommand - This line instructs our devcontainer to pip install the packages listed in the included requirements.txt file:
  - pandas
  - prefect and its various dependencies
  - sqlalchemy
  - ipykernel
  - polars
  - dbt-core
  - dbt-bigquery
  - dbt-snowflake
  - dbt-postgres
  - pyspark
  - confluent-kafka
  - snowpark
  - scikit-learn
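Once the container has finished building, it can be handy to confirm that the postCreateCommand actually installed everything. The sketch below is one way to do that, not part of the repo. The import names in MODULES are my assumptions about how the pip packages above are imported (for example, scikit-learn as sklearn and the Snowpark package as snowflake.snowpark), so adjust them to match your own requirements.txt.

```python
import importlib.util
from typing import Dict, Iterable


def check_imports(module_names: Iterable[str]) -> Dict[str, bool]:
    """Map each module name to True if Python can locate it, else False."""
    results = {}
    for name in module_names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises for dotted names when the parent package
            # is missing (e.g. "snowflake" for "snowflake.snowpark").
            results[name] = False
    return results


# Assumed import names for the packages listed above.
MODULES = [
    "pandas",
    "prefect",
    "sqlalchemy",
    "ipykernel",
    "polars",
    "dbt",
    "pyspark",
    "confluent_kafka",
    "snowflake.snowpark",
    "sklearn",
]

if __name__ == "__main__":
    for name, found in check_imports(MODULES).items():
        print(f"{'ok' if found else 'MISSING':8}{name}")
```

Run it in the container’s terminal. Anything reported as MISSING points to a package that failed to install or an import name that needs correcting.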
And there you have it! You’ve now got an awesome development environment set up that is easily reproducible for new projects, shareable with others, etc. Happy Coding!