# Data Engineering VS Code Profile

This profile is optimized for data science and data engineering work. It is also somewhat opinionated, so feel free to remove or replace extensions and settings to suit your needs. It's a free country! 😋

![Data Engineering Profile Graphic](https://gist.githubusercontent.com/SiriusBits/b4d74fad310d3cd9dadcb92fe1724833/raw/6c990154c73d6b308d6a357934d42e1627a3873b/linkedin-data-engineering-and-data-science.png)

## Highlights

- ✅ Support for **Python** and notebook editing for both **Jupyter** and **SQL** 🧑‍💻
- ✅ Dedicated **linting** and **formatting** for Python as well as **auto docstring generation** 💅
- ✅ Powerful **data exploration** and **cleaning** with support for **auto-generated Pandas code** 👩‍🔬
- ✅ Supercharged **coding productivity** and **search** with **Codeium AI coding assistant** 🦾
- ✅ **Containerized** and **remote development** with **Docker**, **Dev Containers** and **Snowflake** ❄️ 🐳
- ✅ **Enhanced DX** settings, themes, and file view extensions 😎

## Usage

To use this profile in VS Code or Cursor.sh, choose Settings... > Profiles > Import Profile and paste in this gist: https://gist.github.com/SiriusBits/b4d74fad310d3cd9dadcb92fe1724833

## Settings

As with the extensions, you can forgo importing these settings or tailor them to fit your specific needs. I've provided some explanation and reasoning to help you decide. For certain things, like the editor typeface, there are some prerequisites you'll need to satisfy.

### About IDE Typefaces

For data-related coding I use [Victor Mono](https://rubjo.github.io/victor-mono/), an open-source monospaced font with optional semi-connected cursive italics and programming symbol ligatures. The typeface is slender, crisp, and narrow, with a large x-height and clear punctuation, making it legible and ideal for code.

> [!IMPORTANT]
> The choice of coding font and related typographical features is, of course, entirely a matter of personal preference. I've used [Fira Code](https://github.com/tonsky/FiraCode) and still use [Dank Mono](https://philpl.gumroad.com/l/dank-mono) for web development. I prefer ligatures and I like cursive - _and you might ***not***, which is totally fine_! I also often work in the terminal, so a monospaced font is a must for me.
>
> If you want to explore other options, I recommend checking out [Nerd Fonts](https://www.nerdfonts.com/). If you want to use Victor Mono as I do, you will need to install the [VictorMono Nerd Font](https://github.com/ryanoasis/nerd-fonts/tree/master/patched-fonts/VictorMono). You can do this manually or via [Homebrew](https://brew.sh/) using the `homebrew/cask-fonts` tap like so:
> ```Shell
> $ brew tap homebrew/cask-fonts # You only need to do this once!
> $ brew install font-victor-mono-nerd-font
> ```

### Codeium extension settings

Enable the Codeium Indexing & Search Engine. This feature gives the chat and autocomplete models full codebase awareness, which improves autocomplete and chat quality. It also allows natural language search of your codebase.

> [!CAUTION]
> When first enabled, Codeium will consume about 25% of CPU while it indexes the workspace. Indexing runs once per workspace and should take under 10 minutes, depending on your workspace size. CPU usage will return to normal automatically.

If you use Copilot or another coding assistant and don't install [Codeium](https://marketplace.visualstudio.com/items?itemName=Codeium.codeium), remove these settings.

```JSON
"codeium.enableConfig": {
  "*": true,
  "csv": true
},
"codeium.enableSearch": true,
```

### Data Wrangler settings

**Experimental:** enable fast CSV parsing and export to Parquet using the PyArrow engine. Requires the `pyarrow` package and pandas>=1.4.0.
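
To make the prerequisites above concrete, here is a minimal sketch in Python. It is not how Data Wrangler works internally; it only checks that `pyarrow` is installed and that pandas is at least 1.4.0, then shows the equivalent plain pandas calls. The helper names (`parse_version`, `meets_minimum`, `pyarrow_engine_ready`, `csv_to_parquet`) are hypothetical, and the version comparison is deliberately rough (pre-release suffixes are ignored).

```python
import re
from importlib import metadata


def parse_version(version):
    """Return the first three numeric components of a version string as a tuple."""
    numbers = []
    for part in version.split(".")[:3]:
        match = re.match(r"\d+", part)
        # Non-numeric components (e.g. "dev0") are treated as 0.
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as "1.4" to three components.
    return tuple(numbers + [0] * (3 - len(numbers)))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (a tuple of ints)."""
    return parse_version(version) >= tuple(minimum)


def pyarrow_engine_ready():
    """Check the stated prerequisites: pyarrow installed and pandas >= 1.4.0."""
    try:
        pandas_version = metadata.version("pandas")
        metadata.version("pyarrow")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(pandas_version, (1, 4, 0))


def csv_to_parquet(csv_path, parquet_path):
    """Read a CSV with the PyArrow parser and write it back out as Parquet."""
    import pandas as pd  # imported lazily so the check above runs without pandas

    frame = pd.read_csv(csv_path, engine="pyarrow")
    frame.to_parquet(parquet_path, engine="pyarrow")
    return frame


if __name__ == "__main__":
    print("PyArrow engine ready:", pyarrow_engine_ready())
```

If `pyarrow_engine_ready()` returns `False`, installing or upgrading the two packages in the interpreter you have selected in VS Code should be enough before turning on the experimental settings.
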
[Read more about PyArrow support in pandas](https://pandas.pydata.org/docs/user_guide/pyarrow.html).

> [!NOTE]
> Remove these settings if you don't want to use experimental features or don't plan to use [Data Wrangler](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.datawrangler).

```JSON
"dataWrangler.experiments.fastCsvParsing": true,
"dataWrangler.experiments.parquetExport": true,
```

### Font and font feature settings

See [my notes](#about-ide-typefaces) on `VictorMono Nerd Font` above.

```JSON
"editor.fontFamily": "'VictorMono Nerd Font', Menlo, Monaco, 'Courier New', monospace",
"editor.fontLigatures": true,
"editor.fontSize": 17,
```

### Jupyter notebook settings

With `viewColumn` set to `beside`, the interactive window opens to the right of the active code editor, and with `creationMode` set to `perFile`, a new interactive window is created for every file that runs a cell. Other options include:

- viewColumn = 'active' will open the interactive window in place of the active editor.
- viewColumn = 'secondGroup' will open the interactive window in the second editor group.
- creationMode = 'single' allows a single window.
- creationMode = 'multiple' allows the creation of multiple windows.

> [!TIP]
> Setting `executeSelection` to `true` lets you highlight Python code and run it in an interactive window by pressing `shift` + `enter`.

```JSON
"jupyter.interactiveWindow.creationMode": "perFile",
"jupyter.interactiveWindow.textEditor.executeSelection": true,
"jupyter.interactiveWindow.viewColumn": "beside",
```

### Linting and formatting settings

These settings ensure that Ruff and Black Formatter work together and are scoped to just Python and Jupyter notebooks.
```JSON
"notebook.codeActionsOnSave": {
  "source.fixAll": true
},
"notebook.formatOnSave.enabled": true,
"[python]": {
  "editor.codeActionsOnSave": {
    "source.fixAll": true
  },
  "editor.defaultFormatter": "ms-python.black-formatter",
  "editor.formatOnSave": true
},
```

### Terminal settings

Use Z shell for the integrated terminal in VS Code on macOS.

```JSON
"terminal.integrated.defaultProfile.osx": "zsh",
```

### Theme settings

These can be updated to whatever theme(s) you choose for your workbench, icons, and product icons respectively. [Night Owl](https://marketplace.visualstudio.com/items?itemName=sdras.night-owl) by [Sarah Drasner](https://sarah.dev/) includes a 'light' theme in addition to the original 'dark' theme. With auto-detect, the theme applied will switch based on the OS appearance. If you set the macOS appearance to Auto, the theme will toggle whenever macOS switches between light and dark. Both the icon and product icon use the [Atom Material Icons](https://marketplace.visualstudio.com/items?itemName=AtomMaterial.a-file-icon-vscode) theme.

```JSON
"workbench.colorTheme": "Night Owl",
"workbench.iconTheme": "material-icon-theme",
"workbench.preferredLightColorTheme": "Night Owl Light",
"workbench.preferredDarkColorTheme": "Night Owl",
"workbench.productIconTheme": "a-file-icon-vscode-product-icon-theme",
"window.autoDetectColorScheme": true,
```

***

## Extensions

### [Atom Material Icons](https://marketplace.visualstudio.com/items?itemName=AtomMaterial.a-file-icon-vscode) ⚛

> A port of the [Atom File Icons](https://github.com/file-icons/atom) for VSCode.
> It replaces the icons and folder icons with better-suited icons, related to the file type, framework, or language.

### [autoDocstring](https://marketplace.visualstudio.com/items?itemName=njpwerner.autodocstring) 📝

> Quickly generate docstrings for Python functions.
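
As a rough illustration, assuming the extension is left on its Google-style docstring format, typing `"""` under a function signature and pressing `enter` should produce a skeleton with placeholders for the summary, argument types, and descriptions. The function below is a hypothetical example with that skeleton already filled in by hand; the exact template text depends on your autoDocstring settings.

```python
def add_tax(amount, rate=0.2):
    """Return the amount with tax applied.

    Args:
        amount (float): Pre-tax amount.
        rate (float, optional): Tax rate as a fraction. Defaults to 0.2.

    Returns:
        float: The amount including tax.
    """
    # The docstring sections above (Args, Returns) follow the Google style.
    return amount * (1 + rate)
```
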
### [Black Formatter](https://marketplace.visualstudio.com/items?itemName=ms-python.black-formatter) ✔️

> The uncompromising code formatter
>
> By using _Black_, you agree to cede control over the minutiae of hand-formatting. In return, _Black_ gives you speed, determinism, and freedom from `pycodestyle` nagging about formatting. You will save time and mental energy for more important matters.

### [Codeium: Copilot Alternative Coding Assistant](https://marketplace.visualstudio.com/items?itemName=Codeium.codeium) 🤖

> [Codeium](https://www.codeium.com/) is the modern coding superpower, a free code acceleration toolkit built on cutting-edge AI technology. Currently, Codeium provides autocomplete, chat, and search capabilities in 70+ languages, with lightning-fast speeds and state-of-the-art suggestion quality.

### [Data Wrangler](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.datawrangler) 🏇

> Data Wrangler is a code-centric data cleaning tool integrated into VS Code and VS Code Jupyter Notebooks.

### [Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) 🫙

> The Dev Containers extension lets you use a [Docker container](https://docker.com/) as a full-featured development environment.

### [Docker](https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-docker) 🐳

> Build, manage, and deploy Docker containerized applications from Visual Studio Code. It also provides one-click debugging of Node.js, Python, and .NET inside a container.

### [File Nesting Updater](https://marketplace.visualstudio.com/items?itemName=antfu.file-nesting) 🗂️

> While not a requirement to use file nesting, if installed, this setup will use the [File Nesting Updater extension](https://marketplace.visualstudio.com/items?itemName=antfu.file-nesting) to keep the `explorer.fileNesting.patterns` setting up-to-date with the latest file nesting configuration.
> This helps keep your workspace tidy despite the various dotfiles and configuration settings you have specified.

### [Git Graph: Alternative to GitLens](https://marketplace.visualstudio.com/items?itemName=mhutchie.git-graph) **⑃**

> Similar to GitLens but less intrusive and easier to view (in my opinion). It provides a graphical interface to view and interact with your Git repository. It allows you to visualize your Git history as a graph and offers a range of features for managing and exploring your repository.

### [gitignore](https://marketplace.visualstudio.com/items?itemName=codezombiech.gitignore) 🙈

> Pull .gitignore templates from the https://github.com/github/gitignore repository. Language support for .gitignore files.

### [Jupyter](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) 🪐

> Provides basic notebook support for [language kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels) that are supported in [Jupyter Notebooks](https://jupyter.org/) today, and allows any Python environment to be used as a Jupyter kernel.

This extension bundle includes the following supporting extensions:

- [Jupyter Keymap](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter-keymap) - to provide Jupyter-consistent keymaps
- [Jupyter Notebook Renderers](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter-renderers) - to provide renderers for MIME types such as latex, plotly, vega, etc.
- [Jupyter Cell Tags](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.vscode-jupyter-cell-tags) and [Jupyter Slide Show](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.vscode-jupyter-slideshow) - to provide the ability to tag cells in notebooks and support for presentations

### [Night Owl](https://marketplace.visualstudio.com/items?itemName=sdras.night-owl) 🦉

> A Visual Studio Code theme from the amazing [Sarah Drasner](https://twitter.com/sarah_edo) for all the night owls out there.
> Fine-tuned for those of us who like to code late into the night. Color choices have taken into consideration what is accessible to people with colorblindness and in low-light circumstances. Decisions were also based on meaningful contrast for reading comprehension and for optimal razzle-dazzle. ✨

### [PDF Viewer](https://marketplace.visualstudio.com/items?itemName=mathematic.vscode-pdf) 📖

> Portable document format (PDF) viewer for Visual Studio Code.

### [Python](https://marketplace.visualstudio.com/items?itemName=ms-python.python) 🐍

> Rich support for the [Python language](https://www.python.org/) (for all [actively supported versions](https://devguide.python.org/#status-of-python-branches) of the language: >=3.7), including features such as IntelliSense (Pylance), linting, debugging, code navigation, code formatting, refactoring, variable explorer, test explorer, and more.

Includes supporting Pylance extension:

- [Pylance](https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance)

### [Rainbow CSV](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) 🌈

> Highlight CSV and TSV files and run SQL-like queries.

### [Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) 🧹

> An extremely fast Python linter and code formatter, written in Rust and compatible with the [Black Formatter](https://marketplace.visualstudio.com/items?itemName=ms-python.black-formatter) extension.

### [Snowflake](https://marketplace.visualstudio.com/items?itemName=snowflake.snowflake-vsc) ❄️

> Connect to Snowflake, write and execute SQL queries, and view results without leaving VS Code.

### [SQL Notebook](https://marketplace.visualstudio.com/items?itemName=cmoog.sqlnotebook) 📝

> View SQL files as notebooks. Execute cells and view query output.