# Databricks integration in Anaconda Platform (Cloud)

export const Comments = ({children}) => {
  return <div class="my-4 px-5 py-4 overflow-hidden rounded-2xl flex gap-3 border border-zinc-500/20 bg-zinc-50/50 dark:border-zinc-500/30 dark:bg-zinc-500/10" data-callout-type="comments">
      <div class="w-4">
        <svg width="14" height="14" viewBox="0 0 640 640" fill="currentColor" xmlns="http://www.w3.org/2000/svg" class="w-5 h-5" aria-label="Comments">
            <path d="M320 112C434.9 112 528 205.1 528 320C528 434.9 434.9 528 320 528C205.1 528 112 434.9 112 320C112 205.1 205.1 112 320 112zM320 576C461.4 576 576 461.4 576 320C576 178.6 461.4 64 320 64C178.6 64 64 178.6 64 320C64 461.4 178.6 576 320 576zM280 400C266.7 400 256 410.7 256 424C256 437.3 266.7 448 280 448L360 448C373.3 448 384 437.3 384 424C384 410.7 373.3 400 360 400L352 400L352 312C352 298.7 341.3 288 328 288L280 288C266.7 288 256 298.7 256 312C256 325.3 266.7 336 280 336L304 336L304 400L280 400zM320 256C337.7 256 352 241.7 352 224C352 206.3 337.7 192 320 192C302.3 192 288 206.3 288 224C288 241.7 302.3 256 320 256z" />
        </svg>
      </div>
      <div class="text-sm prose min-w-0 w-full">
        {children}
      </div>
    </div>;
};

Integrating Databricks with Anaconda Platform (Cloud) lets your organization source the Python packages used in Databricks from channels that Anaconda Platform secures and governs.

For data science teams working in regulated <Tooltip tip="A self-contained, isolated space for installing and running software packages.">environments</Tooltip>, this integration provides essential security controls over Python package usage. Your organization can enforce security policies and maintain consistent environments across development and production. This helps prevent the use of unauthorized or vulnerable <Tooltip tip="Software files and information about the software, such as its name, version, and description, bundled into a file that can be installed and managed by a package manager.">packages</Tooltip> while providing comprehensive audit trails of package usage across your Databricks workspaces.

This guide explains how to set up a secure, customized Python environment in Databricks using packages from Anaconda Platform (Cloud).

## Prerequisites

Before starting, ensure you have:

* Administrator access to an Anaconda organization
* An Anaconda organization access <Tooltip tip="A randomly generated string that proves your identity and permission to access resources like channels, packages, or APIs.">token</Tooltip>
* Docker installed on your local machine
* A Databricks workspace with admin privileges
* [The Community channel enabled for your organization](/anaconda-platform/cloud/admin/channels#enabling-community-channels)
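
As a quick sanity check of the local prerequisite, the following minimal sketch reports whether the `docker` command is available on your `PATH`. The helper name `missing_tools` is hypothetical and only illustrates the idea; it does not verify your Anaconda or Databricks permissions.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Check for the Docker CLI required to build the custom image.
missing = missing_tools(["docker"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Docker CLI found.")
```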

## Setup and configuration

<Steps>
  <Step title="Create a Channel">
    1. [Sign in to Anaconda.com](https://auth.anaconda.com/ui/login?return_to=https://anaconda.com/app/).
    2. Click <Icon icon="satellite-dish" iconType="regular" /> **Channels**.
    3. Click **Add Channel**.
    4. Name your channel `databricks`.
    5. Set the channel's **Type** to **Virtual**.
    6. Open the **Source** dropdown and select *main*.
    7. Set the channel's **Access** to **Internal**.
    8. Click **Save**.
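
The channel created here is addressed by URL in the `Dockerfile` and `env.yml` used later in this guide. The sketch below shows how that URL is composed, assuming the `https://repo.anaconda.cloud/repo/<ORG_ID>/<channel>` pattern from those files. The function name `channel_url` and the organization ID `my-org` are hypothetical.

```python
BASE_URL = "https://repo.anaconda.cloud/repo"


def channel_url(org_id: str, channel: str = "databricks") -> str:
    """Build the URL of an organization channel from its parts."""
    return f"{BASE_URL}/{org_id}/{channel}"


# "my-org" is a placeholder; use your own organization ID.
print(channel_url("my-org"))
```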
  </Step>

  <Step title="Create and apply a policy">
    1. Click **Create** <Icon icon="plus" iconType="regular" /> under **POLICIES**.

    2. Name your policy `databricks`.

    3. Configure the [policy filter](/anaconda-platform/cloud/admin/policy-filters#example-policy-filter) as follows:

       **Exclude package if:**

       `Platform` `Is not` `linux-64`

       *and*

       `Platform` `Is not` `noarch`

    4. Click **Save**.

    5. Apply your policy to the `databricks` channel you created earlier. For more information, see [Applying a Policy](/anaconda-platform/cloud/admin/policy-filters#applying-a-policy-filter).
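
The effect of the filter above can be sketched in a few lines of Python. This is only an illustration of the logic, not how Anaconda Platform evaluates policies: a package is excluded when its platform is neither `linux-64` nor `noarch`. The platform names in the sample list other than those two are included for illustration.

```python
def is_excluded(platform: str) -> bool:
    """Mirror the filter: exclude if platform is not linux-64 AND is not noarch."""
    return platform != "linux-64" and platform != "noarch"


platforms = ["linux-64", "noarch", "osx-arm64", "win-64", "linux-aarch64"]

# Only packages built for linux-64 or noarch remain in the channel.
kept = [p for p in platforms if not is_excluded(p)]
print(kept)  # ['linux-64', 'noarch']
```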
  </Step>

  <Step title="Build a Custom Docker Image">
    To create a secure Python environment in Databricks, you'll need to build a custom Docker image. This image includes your conda-based environment and can be used when launching a cluster through Databricks Container Services.

    For more information, see [Customize containers with Databricks Container Service](https://docs.databricks.com/aws/en/compute/custom-containers) and the [databricks/containers](https://github.com/databricks/containers) repository on GitHub.

    1. Create a directory on your local machine called `dcs-conda` by running the following command:

       ```sh theme={null}
       mkdir dcs-conda
       ```

    2. Change into the new `dcs-conda` directory and create a file named `Dockerfile`:

       ```sh theme={null}
       cd dcs-conda
       vi Dockerfile
       ```

    3. Add the content that matches your Databricks Runtime version to the `Dockerfile`:

           <CodeGroup>
             ```dockerfile 16.4-LTS highlight {25, 26} expandable theme={null}
             FROM ubuntu:24.04 AS builder
             RUN apt-get update && apt-get install --yes \
                 wget \
                 libdigest-sha-perl \
                 bzip2 \
                 gcc \
                 python3-dev \
                 libpq-dev \
                 libcairo2-dev \
                 libdbus-1-dev \
                 libgirepository1.0-dev \
                 libsnappy-dev \
                 git \
                 maven && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py311_25.1.1-2-Linux-x86_64.sh -O miniconda.sh && \
                 /bin/bash miniconda.sh -b -p /databricks/conda && \
                 rm miniconda.sh
             FROM databricksruntime/minimal:16.4-LTS
             RUN apt-get update && apt-get install --yes git && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             COPY --from=builder /databricks/conda /databricks/conda
             COPY env.yml /databricks/.conda-env-def/env.yml
             RUN /databricks/conda/bin/conda install --yes -n base conda-token && \
                 /databricks/conda/bin/conda token set --force-config-condarc <TOKEN> && \
                 /databricks/conda/bin/conda config --system --prepend channels https://repo.anaconda.cloud/repo/<ORG_ID>/databricks && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/main && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/r
             RUN /databricks/conda/bin/conda env create --file /databricks/.conda-env-def/env.yml && \
                 ln -s /databricks/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
             RUN /databricks/conda/bin/conda config --system --set channel_priority strict && \
                 /databricks/conda/bin/conda config --system --set always_yes True
             RUN rm -f /root/.condarc
             ENV GIT_PYTHON_GIT_EXECUTABLE=/usr/bin/git
             ENV DEFAULT_DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ENV DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ```

             ```dockerfile 15.4-LTS highlight {25, 26} expandable theme={null}
             FROM ubuntu:22.04 AS builder
             RUN apt-get update && apt-get install --yes \
                 wget \
                 libdigest-sha-perl \
                 bzip2 \
                 gcc \
                 python3-dev \
                 libpq-dev \
                 libcairo2-dev \
                 libdbus-1-dev \
                 libgirepository1.0-dev \
                 libsnappy-dev \
                 git \
                 maven && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py311_25.1.1-2-Linux-x86_64.sh -O miniconda.sh && \
                 /bin/bash miniconda.sh -b -p /databricks/conda && \
                 rm miniconda.sh
             FROM databricksruntime/minimal:15.4-LTS
             RUN apt-get update && apt-get install --yes git && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             COPY --from=builder /databricks/conda /databricks/conda
             COPY env.yml /databricks/.conda-env-def/env.yml
             RUN /databricks/conda/bin/conda install --yes -n base conda-token && \
                 /databricks/conda/bin/conda token set --force-config-condarc <TOKEN> && \
                 /databricks/conda/bin/conda config --system --prepend channels https://repo.anaconda.cloud/repo/<ORG_ID>/databricks && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/main && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/r
             RUN /databricks/conda/bin/conda env create --file /databricks/.conda-env-def/env.yml && \
                 ln -s /databricks/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
             RUN /databricks/conda/bin/conda config --system --set channel_priority strict && \
                 /databricks/conda/bin/conda config --system --set always_yes True
             RUN rm -f /root/.condarc
             ENV GIT_PYTHON_GIT_EXECUTABLE=/usr/bin/git
             ENV DEFAULT_DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ENV DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ```

             ```dockerfile 14.3-LTS highlight {25, 26} expandable theme={null}
             FROM ubuntu:22.04 AS builder
             RUN apt-get update && apt-get install --yes \
                 wget \
                 libdigest-sha-perl \
                 bzip2 \
                 gcc \
                 python3-dev \
                 libpq-dev \
                 libcairo2-dev \
                 libdbus-1-dev \
                 libgirepository1.0-dev \
                 libsnappy-dev \
                 git \
                 maven && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py311_25.1.1-2-Linux-x86_64.sh -O miniconda.sh && \
                 /bin/bash miniconda.sh -b -p /databricks/conda && \
                 rm miniconda.sh
             FROM databricksruntime/minimal:14.3-LTS
             RUN apt-get update && apt-get install --yes git && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             COPY --from=builder /databricks/conda /databricks/conda
             COPY env.yml /databricks/.conda-env-def/env.yml
             RUN /databricks/conda/bin/conda install --yes -n base conda-token && \
                 /databricks/conda/bin/conda token set --force-config-condarc <TOKEN> && \
                 /databricks/conda/bin/conda config --system --prepend channels https://repo.anaconda.cloud/repo/<ORG_ID>/databricks && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/main && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/r
             RUN /databricks/conda/bin/conda env create --file /databricks/.conda-env-def/env.yml && \
                 ln -s /databricks/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
             RUN /databricks/conda/bin/conda config --system --set channel_priority strict && \
                 /databricks/conda/bin/conda config --system --set always_yes True
             RUN rm -f /root/.condarc
             ENV GIT_PYTHON_GIT_EXECUTABLE=/usr/bin/git
             ENV DEFAULT_DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ENV DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ```

             ```dockerfile 13.3-LTS highlight {25, 26} expandable theme={null}
             FROM ubuntu:22.04 AS builder
             RUN apt-get update && apt-get install --yes \
                 wget \
                 libdigest-sha-perl \
                 bzip2 \
                 gcc \
                 python3-dev \
                 libpq-dev \
                 libcairo2-dev \
                 libdbus-1-dev \
                 libgirepository1.0-dev \
                 libsnappy-dev \
                 git \
                 maven && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py311_25.1.1-2-Linux-x86_64.sh -O miniconda.sh && \
                 /bin/bash miniconda.sh -b -p /databricks/conda && \
                 rm miniconda.sh
             FROM databricksruntime/minimal:13.3-LTS
             RUN apt-get update && apt-get install --yes git && \
                 apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
             COPY --from=builder /databricks/conda /databricks/conda
             COPY env.yml /databricks/.conda-env-def/env.yml
             RUN /databricks/conda/bin/conda install --yes -n base conda-token && \
                 /databricks/conda/bin/conda token set --force-config-condarc <TOKEN> && \
                 /databricks/conda/bin/conda config --system --prepend channels https://repo.anaconda.cloud/repo/<ORG_ID>/databricks && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/main && \
                 /databricks/conda/bin/conda config --system --remove channels https://repo.anaconda.com/pkgs/r
             RUN /databricks/conda/bin/conda env create --file /databricks/.conda-env-def/env.yml && \
                 ln -s /databricks/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
             RUN /databricks/conda/bin/conda config --system --set channel_priority strict && \
                 /databricks/conda/bin/conda config --system --set always_yes True
             RUN rm -f /root/.condarc
             ENV GIT_PYTHON_GIT_EXECUTABLE=/usr/bin/git
             ENV DEFAULT_DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ENV DATABRICKS_ROOT_CONDA_ENV="dcs-conda"
             ```
           </CodeGroup>

           <Comments>
             Replace `<TOKEN>` with your organization access token. <br />
             Replace `<ORG_ID>` with your organization ID, which appears in your organization's URL: `https://anaconda.com/app/organizations/<ORG_ID>`.
           </Comments>

    4. Create an `env.yml` file inside the `dcs-conda` directory:

       ```sh theme={null}
       vi env.yml
       ```

    5. Add the following content to the `env.yml` file:

       ```yml expandable highlight {3} theme={null}
       name: dcs-conda
       channels:
         - https://repo.anaconda.cloud/repo/<ORG_ID>/databricks
       dependencies:
         - python=3.11
         - databricks-sdk
         - grpcio
         - grpcio-status
         - ipykernel
         - ipython
         - jedi
         - jinja2
         - matplotlib
         - nomkl
         - numpy
         - pandas
         - pip
         - pyarrow
         - pyccolo
         - setuptools
         - six
         - traitlets
         - wheel
       ```

           <Comments>
             Replace `<ORG_ID>` with your organization ID, which appears in your organization's URL: `https://anaconda.com/app/organizations/<ORG_ID>`.
           </Comments>

           <Note>
             Check the package versions recommended for your Databricks Runtime in the **System environment** section of the [Databricks Runtime release notes and compatibility documentation](https://docs.databricks.com/aws/en/release-notes/runtime/).
           </Note>

    6. Build the Docker image:

       ```sh theme={null}
       docker build -t dcs-conda:15.4-ap .
       ```

    7. Tag the image and push it to a Docker registry by running the following commands, replacing `anaconda` with your own registry namespace:

       ```sh theme={null}
       docker tag dcs-conda:15.4-ap anaconda/dcs-conda:15.4-ap
       docker push anaconda/dcs-conda:15.4-ap
       ```
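
Before running `docker build`, it can help to confirm that the `<TOKEN>` and `<ORG_ID>` placeholders in the `Dockerfile` and `env.yml` have been replaced. The following is a minimal sketch of such a check; the helper name `leftover_placeholders` is hypothetical and not part of any Anaconda or Databricks tooling.

```python
import re

# The two placeholders used in the Dockerfile and env.yml in this guide.
PLACEHOLDER = re.compile(r"<(TOKEN|ORG_ID)>")


def leftover_placeholders(text: str) -> list[str]:
    """Return the sorted, unique placeholder names still present in `text`."""
    return sorted({match.group(1) for match in PLACEHOLDER.finditer(text)})


sample = "channels:\n  - https://repo.anaconda.cloud/repo/<ORG_ID>/databricks\n"
print(leftover_placeholders(sample))  # ['ORG_ID']
```

To check the real files, read each one (for example with `pathlib.Path("env.yml").read_text()`) and pass its contents to the function. An empty list means no placeholders remain.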
  </Step>

  <Step title="Launch a Cluster using Databricks Container Service">
    <Warning>
      Clients must be authorized to access Databricks resources using a Databricks account with appropriate permissions. Without proper access, CLI commands and REST API calls will fail. Permissions can be configured by a workspace administrator.

      <Accordion title="Enabling Databricks Container Service">
        Use the [Databricks CLI](https://docs.databricks.com/aws/en/compute/custom-containers) to enable Databricks Container Service. In the JSON request body, set `enableDcs` to `true`, as in the following example:

        ```sh theme={null}
        databricks workspace-conf set-status --json '{"enableDcs": "true"}'
        ```
      </Accordion>
    </Warning>

    <Note>
      Databricks recommends using OAuth for authorization instead of Personal Access Tokens (PATs). OAuth tokens refresh automatically and reduce security risks associated with token leaks or misuse. For more information, see [Authorizing access to Databricks resources](https://docs.databricks.com/aws/en/dev-tools/auth/).
    </Note>

    1. Open your Databricks workspace.

    2. Select **Compute** from the left-hand navigation, then click **Create compute**.

    3. On the **New compute** page, specify the **Cluster Name**.

    4. Under **Performance**, set the **Databricks Runtime Version** to a version that supports Databricks Container Service. For example, **Runtime: 15.4-LTS**.

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-conda-cluster.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=432ab3d9ba223bfffb73c2b7e05f6235" alt="databricks container service — compute, new compute page" width="1681" height="891" data-path="images/databricks-conda-cluster.png" />
           </Frame>

           <Note>
             This version is under long-term support (LTS). For more information, see [Databricks support lifecycles](https://docs.databricks.com/aws/en/release-notes/runtime/databricks-runtime-ver).

             ***

             `Databricks Runtime for Machine Learning` [does not support Databricks Container Service](https://docs.databricks.com/aws/en/compute/custom-containers#limitations).
           </Note>

    5. Open the **Advanced options** dropdown and click the **Spark** tab.

    6. Add the following Spark configurations. The `spark.databricks.isv.product anaconda-psm` setting must always be added, regardless of other settings:

       ```sh theme={null}
       spark.databricks.isv.product anaconda-psm
       spark.databricks.driverNfs.enabled false
       ```

           <Note>
             To access volumes on Databricks Container Service, add the following configuration to the compute's **Spark config** field as well: `spark.databricks.unityCatalog.volumes.enabled true`.
           </Note>

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/bkF_gU78xMaNM3h-/images/databricks-spark_2.png?fit=max&auto=format&n=bkF_gU78xMaNM3h-&q=85&s=de67d15df42264e0e2deaa76f2831813" alt="databricks container service — compute, advanced options, spark tab" width="1682" height="891" data-path="images/databricks-spark_2.png" />
           </Frame>

    7. Click the **Docker** tab.

    8. Select the **Use your own Docker container** checkbox.

    9. Enter your custom Docker image in the **Docker Image URL** field.

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/bkF_gU78xMaNM3h-/images/databricks-docker-image_2.png?fit=max&auto=format&n=bkF_gU78xMaNM3h-&q=85&s=cf75cc39dee9e3b4ce856c76244024c4" alt="databricks container service — compute, advanced options, docker tab" width="1681" height="892" data-path="images/databricks-docker-image_2.png" />
           </Frame>

           <Accordion title="Docker Image URL examples">
             * Docker Hub - `<organization>/<repository>:<tag>`
             * Amazon ECR - `<aws-account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`
             * Azure Container Registry - `<your-registry-name>.azurecr.io/<repository-name>:<tag>`
           </Accordion>

    10. Open the **Authentication** dropdown and select an authentication method.

    11. Click **Create compute**.
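
The settings entered in the steps above can also be summarized as data. The sketch below assembles the Docker image URL formats and Spark configurations from this step into a Python dictionary. The field names (`cluster_name`, `spark_version`, `spark_conf`, `docker_image`) are assumed to follow the Databricks Clusters API, the runtime string `15.4.x-scala2.12` is an assumed example value, and the result is deliberately incomplete (node type and worker count are omitted), so treat it as an illustration rather than a ready-to-send request.

```python
import json


def docker_hub_url(organization, repository, tag):
    """Docker Hub format: <organization>/<repository>:<tag>."""
    return f"{organization}/{repository}:{tag}"


def ecr_url(account_id, region, repository, tag):
    """Amazon ECR format."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def acr_url(registry, repository, tag):
    """Azure Container Registry format."""
    return f"{registry}.azurecr.io/{repository}:{tag}"


def cluster_settings(name, runtime, image_url, volumes=False):
    """Collect the cluster settings from this step in one dictionary."""
    spark_conf = {
        # Must always be added, regardless of other settings.
        "spark.databricks.isv.product": "anaconda-psm",
        "spark.databricks.driverNfs.enabled": "false",
    }
    if volumes:
        # Only needed to access volumes on Databricks Container Service.
        spark_conf["spark.databricks.unityCatalog.volumes.enabled"] = "true"
    return {
        "cluster_name": name,
        "spark_version": runtime,  # assumed field name and value format
        "spark_conf": spark_conf,
        "docker_image": {"url": image_url},  # assumed field name
    }


image = docker_hub_url("anaconda", "dcs-conda", "15.4-ap")
settings = cluster_settings("dcs-conda-cluster", "15.4.x-scala2.12", image)
print(json.dumps(settings, indent=2))
```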
  </Step>

  <Step title="Create a Notebook and connect it to your cluster">
    1. Click **New** in the top-left corner, then click **Notebook**.
    2. Specify a name for the notebook.
    3. Click **Connect**, then select your cluster from the resource list.

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-notebook.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=60a4faa29ede83052e16d9f122f34e50" alt="databricks notebook workspace — connect cluster resources" width="1682" height="413" data-path="images/databricks-notebook.png" />
           </Frame>
  </Step>

  <Step title="Verify your conda installation">
    1. In your notebook, run one of the following commands to check that conda is installed:

       ```py theme={null}
       !conda --help
       ```

       ```py theme={null}
       %sh conda --help
       ```

           <Note>
             Both commands run shell code from the notebook. `!conda --help` runs the command in the current shell. `%sh conda --help` starts a subshell, which is useful for multi-line scripts, but might not have the same environment or path.
           </Note>

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-conda-verify.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=e94022275e4fa87f90d44732ebb1f804" alt="databricks notebook, verifying conda installation" width="1680" height="791" data-path="images/databricks-conda-verify.png" />
           </Frame>

    2. In your notebook, run the following command to check your source channels:

       ```py theme={null}
       !conda config --show channels
       ```

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-conda-show-channels.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=7431959ea49913b7383acfc29d43705d" alt="databricks notebook, viewing conda channels" width="1683" height="413" data-path="images/databricks-conda-show-channels.png" />
           </Frame>
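
To turn the channel check into something repeatable, the output of `conda config --show channels` can be parsed and compared against your organization channel. The sketch below assumes the command prints a `channels:` key followed by one `- <channel>` line per entry. The function names, the organization ID `my-org`, and the sample output are hypothetical.

```python
def parse_channels(output: str) -> list[str]:
    """Extract channel entries from `conda config --show channels` output."""
    channels = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("- "):
            channels.append(line[2:].strip())
    return channels


def unexpected_channels(channels, expected_prefix):
    """Return the channels that do not start with the expected URL prefix."""
    return [c for c in channels if not c.startswith(expected_prefix)]


# Made-up sample output containing one channel that should not be there.
sample_output = """channels:
  - https://repo.anaconda.cloud/repo/my-org/databricks
  - defaults
"""

channels = parse_channels(sample_output)
print(unexpected_channels(channels, "https://repo.anaconda.cloud/repo/my-org/"))
```

With this sample, the script prints `['defaults']`. An empty list would mean every configured channel points at your organization.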
  </Step>

  <Step title="Install MLflow from your Anaconda organization channel">
    MLflow is available through your Anaconda organization channel for use in your Databricks environment.

    1. In your notebook, install MLflow from your Anaconda organization channel:

       ```py theme={null}
       !conda install mlflow
       ```

           <Note>
             This command installs MLflow and all of its dependencies from your Anaconda organization channel.
           </Note>

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-mlflow-install.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=a889483994e2b427197289fc5b67abbd" alt="databricks notebook, installing mlflow using conda" width="1684" height="736" data-path="images/databricks-mlflow-install.png" />
           </Frame>

    2. In your notebook, verify the installation:

       ```py theme={null}
       import mlflow
       print("MLflow: " + mlflow.__version__)
       ```

           <Frame>
             <img src="https://mintcdn.com/anaconda-29683c67/atV1FF4bOzhiwIx6/images/databricks-mlflow-verify.png?fit=max&auto=format&n=atV1FF4bOzhiwIx6&q=85&s=5b0bcc056edffc0afd3a2fe6a13c999d" alt="databricks notebook, verifying mlflow installation" width="1683" height="412" data-path="images/databricks-mlflow-verify.png" />
           </Frame>
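
Beyond checking the version, you may want to confirm where installed packages came from. The sketch below filters package records by source URL, assuming each record carries `name` and `base_url` fields as in the output of `conda list --json`. The function name, the organization ID `my-org`, and the sample records are made up for illustration.

```python
def packages_outside(records, expected_prefix):
    """Return names of packages whose source URL lacks the expected prefix."""
    return sorted(
        rec["name"]
        for rec in records
        if not rec.get("base_url", "").startswith(expected_prefix)
    )


# Illustrative records only; real ones would come from parsing
# the JSON printed by `conda list --json`.
sample_records = [
    {"name": "mlflow", "base_url": "https://repo.anaconda.cloud/repo/my-org/databricks"},
    {"name": "requests", "base_url": "https://conda.anaconda.org/pypi"},
]

print(packages_outside(sample_records, "https://repo.anaconda.cloud/repo/my-org/"))
```

With these sample records, the script prints `['requests']`, flagging the one package that did not come from the organization channel.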
  </Step>
</Steps>
