Project configurations

In Data Science & AI Workbench, projects are structured around a core configuration file called anaconda-project.yml. This file is crucial for orchestrating a project’s components for deployment and ensuring operational consistency over time. There are several parameters that must be included within each project’s anaconda-project.yml file to ensure that it operates as intended:

Packages - You must specify all conda or pip packages the project requires to function in its anaconda-project.yml file. By default, Workbench is configured to use packages from its internal repository to create project environments. However, it is possible to use packages from external repositories.
Environment - You must define at least one named environment to accurately manage the project’s packages and their dependencies, ensuring stability across different settings. Projects use template environments when they are initially created, which can be updated or replaced. For more information about template environments, see Configuring persistent environments and sample projects.
Commands - You must define at least one command to properly deploy and run jobs for your project in its intended environment.
Environment variables - If necessary, set up the environment variables needed to control how your project interacts with external resources and services.
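As a rough sanity check of the sections listed above, the following sketch searches a project file for the corresponding keys. It assumes the keys are named packages, env_specs, commands, and variables, as in the anaconda-project file format. The check_project_sections helper is hypothetical and is not part of anaconda-project, and a plain text search is only an approximation of real YAML parsing.

```shell
# Hypothetical helper: report which expected sections appear in a project file.
# This is a text search, so it only approximates parsing the YAML.
check_project_sections() {
  file="${1:-anaconda-project.yml}"
  missing=0
  for key in packages env_specs commands; do
    # Match the key at the start of a line, allowing indentation
    # (packages can be nested inside an env spec).
    if grep -Eq "^[[:space:]]*${key}:" "$file"; then
      echo "found: $key"
    else
      echo "missing: $key"
      missing=1
    fi
  done
  # variables is optional, so it never counts as missing.
  if grep -Eq "^[[:space:]]*variables:" "$file"; then
    echo "found: variables"
  fi
  return "$missing"
}
```

Running check_project_sections anaconda-project.yml from the project directory prints one line per section and returns a non-zero status if a required section was not found.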

It is possible to edit a project’s anaconda-project.yml file manually to add the required configurations; however, this method is prone to human error, especially for users who are unfamiliar with YAML formatting. Instead, Anaconda recommends using anaconda-project commands from a terminal within your project to update its configurations when possible. For more information about anaconda-project, see the official anaconda-project documentation.

All anaconda-project commands must be run from the lab_launch environment. Enter the lab_launch environment by running the following command in a project terminal:

conda activate lab_launch
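
One way to confirm that the environment is active before running anaconda-project commands is to inspect CONDA_DEFAULT_ENV, which conda sets on activation. The sketch below is a minimal check, not an official Workbench tool; in_lab_launch is a hypothetical helper name, and the path-suffix match assumes the environment directory is itself named lab_launch.

```shell
# Hypothetical helper: succeed only when the lab_launch environment is active.
# CONDA_DEFAULT_ENV holds the active environment's name, or its path when the
# environment was activated by path.
in_lab_launch() {
  case "${CONDA_DEFAULT_ENV:-}" in
    lab_launch|*/lab_launch)
      echo "lab_launch is active"
      ;;
    *)
      echo "lab_launch is not active; run: conda activate lab_launch" >&2
      return 1
      ;;
  esac
}
```

For example, in_lab_launch && anaconda-project add-packages <PACKAGE_NAME> runs the second command only when the check passes.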

Once you are finished configuring your project, Anaconda recommends that you add a lock file to your project to ensure its reproducibility across different environments at scale.

Configuring project environments

The conda environments in standard project templates are pre-solved to reduce initialization time when additional packages are added. However, you might want to create an environment specifically for your project. To create a new environment with specific packages and add it to your project:

Create a new project.
Start a session.
Open a terminal within your session editor.

Create an environment and include the packages you need for it by running the following command:

 # Replace <ENV_NAME> with the name of the environment you are creating to add to your project's configuration
 # Replace <PACKAGE_NAME> with the names of the packages you want to add to your project's configuration
 anaconda-project add-env-spec --name <ENV_NAME> <PACKAGE_NAME> <PACKAGE_NAME>

Remove the template environment that you used to create your project by running the following command:

 # Replace <TEMPLATE_ENV> with the name of the template environment you used to initially create the project
 anaconda-project remove-env-spec --name <TEMPLATE_ENV>

Commit and push your updates to the project.
Stop and re-start the project.

To edit and run notebooks in Jupyter Notebook or JupyterLab, you must include the notebook package in your project’s environment.

Verify your environment is initialized for notebooks

Open a terminal within your session editor.
Run the following commands:
```
cd /opt/continuum/
ls
```
If the environment is being initialized, you will see a file named preparing. Once initialization is complete, you will see a file named prepare.log. To troubleshoot environment initialization, view the log from the terminal by running the following command:
```
cat /opt/continuum/prepare.log
```
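
If you prefer not to re-run ls by hand, the check above can be wrapped in a small polling loop. This is a sketch that relies only on the two file names described above (preparing and prepare.log); wait_for_prepare and its arguments are hypothetical, and the five-second default interval is an arbitrary choice.

```shell
# Hypothetical helper: wait until the "preparing" marker disappears,
# then print the last lines of prepare.log.
wait_for_prepare() {
  dir="${1:-/opt/continuum}"
  interval="${2:-5}"   # seconds between checks
  while [ -e "$dir/preparing" ]; do
    sleep "$interval"
  done
  if [ -f "$dir/prepare.log" ]; then
    tail -n 20 "$dir/prepare.log"
  else
    echo "no prepare.log found in $dir" >&2
    return 1
  fi
}
```

Calling wait_for_prepare with no arguments checks /opt/continuum and returns once initialization has finished.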

Configuring project packages

A package added to a project’s configuration file persists for future project sessions and deployments. This is different from using conda install to add a package to the conda environment during a session, which affects the project during the current session only.

If your network is air-gapped (operates without internet access), the Anaconda repository must be mirrored into your organization’s internal package repository to make packages available to users.

Adding conda packages

To add a conda package to your project’s anaconda-project.yml file:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <PACKAGE_NAME> with the names of the packages you want to add to your project's configuration
anaconda-project add-packages <PACKAGE_NAME> <PACKAGE_NAME>
The command may take a moment to run as it solves the environment to collect dependencies and download packages. Once complete, the added packages appear in the project’s anaconda-project.yml file. If the file is open when you run the command, close and reopen it to view your changes.
Commit and push your updates to the project.
Stop and re-start the project session.

Adding pip packages

If your project requires you to pip install a package, you can use anaconda-project to add it to your project’s configuration. To add a pip package to your project’s anaconda-project.yml file:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.

Run the following command:

 # Replace <PACKAGE_NAME> with the names of the packages you want to add to your project's configuration
 anaconda-project add-packages --pip <PACKAGE_NAME> <PACKAGE_NAME>

Commit and push your updates to the project.
Stop and re-start the project session.

Removing packages

To remove a package from your project’s anaconda-project.yml file:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.

Run the following command:

# Replace <PACKAGE_NAME> with the names of the packages you want to remove from your project's configuration
anaconda-project remove-packages <PACKAGE_NAME> <PACKAGE_NAME>

Commit and push your updates to the project.
Stop and re-start the project session.

Configuring project environment variables

Environment variables are key parameters that manage dynamic settings like API keys, database URLs, and memory limits without modifying the codebase. These variables are essential for deploying projects consistently. To add environment variables with a default value to your project’s anaconda-project.yml file:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.

Run the following command:

 # Replace <VALUE> with the content of your environment variable
 # Replace <VARIABLE> with the variable name
 anaconda-project add-variable --default=<VALUE> <VARIABLE>

Commit and push your updates to the project.
Stop and re-start the project session.

For more information and advanced command arguments, see Working with environment variables in the official anaconda-project documentation.
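
Variables defined this way reach your code as ordinary environment variables, so a deployment script can verify them before doing any work. The following is a minimal sketch of such a check; require_vars is a hypothetical helper, and the variable names in the usage note are placeholders rather than names Workbench defines.

```shell
# Hypothetical helper: fail early if any named environment variable is unset.
# printenv only sees exported variables, which is how a launched command
# receives the values configured for the project.
require_vars() {
  missing=0
  for name in "$@"; do
    if ! printenv "$name" >/dev/null; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script might start with require_vars DATABASE_URL API_KEY || exit 1, so that a missing setting is reported immediately instead of surfacing later as an obscure failure.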

Configuring project commands

To deploy a project in Workbench, it must contain at least one appropriate deployment command defined in its anaconda-project.yml file. These commands specify how the project’s components, such as notebooks, scripts, or generic web frameworks, should be executed when the project is deployed. To add a command to your project’s anaconda-project.yml file:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.

Run the following command:

 # Replace <CMD_NAME> with a name for your deployment command
 # Replace <COMMAND> with the project filename that should be executed
 anaconda-project add-command <CMD_NAME> <COMMAND>

Commit and push your updates to the project.
Stop and re-start the project session.

For more information and advanced command arguments, see Working with commands in the official anaconda-project documentation.

Example deployment commands

The following are example deployment commands you can use:

For a Notebook:

commands:
  default:
    notebook: <FILE_NAME>.ipynb

For a Panel dashboard:

commands:
  default:
    unix: panel serve <SCRIPT_OR_NOTEBOOK_FILE>
    supports_http_options: true

For a generic script or web framework, including Python or R:

commands:
  default:
    unix: bash <YOUR-SCRIPT>.sh
    supports_http_options: true

commands:
  default:
    unix: python <YOUR-SCRIPT>.py
    supports_http_options: true

commands:
  default:
    unix: Rscript <YOUR-SCRIPT>.R
    supports_http_options: true
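
A command marked supports_http_options: true is expected to accept HTTP-related options on its command line. The sketch below shows how a bash script might pick up the port. It assumes the option is named --anaconda-project-port, as in the anaconda-project HTTP options, and that it may arrive either as two arguments or in --option=value form. The parse_port helper and the 8086 fallback are assumptions for illustration; confirm both against the anaconda-project and Workbench documentation.

```shell
# Hypothetical helper: extract the port from HTTP options passed to the script.
# Unrecognized options are ignored. The 8086 fallback is an assumption.
parse_port() {
  port=8086
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --anaconda-project-port=*)
        port="${1#*=}"
        ;;
      --anaconda-project-port)
        # Value is the next argument, if one was supplied.
        if [ "$#" -ge 2 ]; then
          port="$2"
          shift
        fi
        ;;
    esac
    shift
  done
  echo "$port"
}
```

Inside <YOUR-SCRIPT>.sh, PORT=$(parse_port "$@") would then give the script a port to pass on to whatever server it starts.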

Validating project deployment commands

To validate the anaconda-project.yml file and verify that your project will deploy successfully:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.

Prepare the environment and test the deployment command by running the following commands:

# Replace <ENV_NAME> with the name of the project's environment
# Replace <COMMAND> with the deployment command that you want to test
anaconda-project prepare --env-spec <ENV_NAME>
anaconda-project run <COMMAND>

Any errors preventing a successful deployment are displayed in the terminal.
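
The two commands above can be chained so that the deployment command runs only if the environment prepares cleanly. This is a minimal sketch using only the prepare and run invocations shown above; validate_project is a hypothetical wrapper. Note that a command which starts a server keeps running until you stop it.

```shell
# Hypothetical wrapper: prepare the env spec, then run the deployment command.
# Stops with a non-zero status if preparation fails.
validate_project() {
  env_name="$1"   # name of the project's environment
  cmd_name="$2"   # name of the deployment command to test
  if ! anaconda-project prepare --env-spec "$env_name"; then
    echo "prepare failed for env spec: $env_name" >&2
    return 1
  fi
  anaconda-project run "$cmd_name"
}
```

For example, validate_project <ENV_NAME> <COMMAND> performs both steps in order.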

Testing project deployments

Once deployment commands are added to your project, you can test the deployment using the test_deployment command. This sets up a mini web application, allowing you to preview your deployment locally using a port within your session.

To test a project deployment:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Test a deployment command you’ve added to your project by running the following command:
```
 # Replace <COMMAND> with an available deployment command
 test_deployment <COMMAND>
```
If you do not supply a deployment command to test, the first command listed under the commands: section of the project’s anaconda-project.yml file is run.
Navigate to the web address returned by the command to verify your project deployed successfully.

Locking project configurations

Project locking is a crucial step in ensuring your project is reproducible across multiple deployments at scale. It is best practice to lock your project once you have finalized configurations for your project, or if you are preparing to transition to a production or public deployment. For more information, see Project reproducibility in Workbench. To lock your anaconda-project.yml file configurations to a fixed state:

Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Lock your project configurations by running the following command:
```
 anaconda-project lock
```

This instructs conda to solve the project’s environment, lock all packages and their dependencies to their current versions, and generate an anaconda-project-lock.yml file for your project.
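
Before committing, it can be useful to check whether the lock file exists and whether the project file has been edited since it was generated. The sketch below compares file modification times, which is only a heuristic and an assumption of this example: a newer anaconda-project.yml suggests, but does not prove, that the lock is out of date. lock_status is a hypothetical helper.

```shell
# Hypothetical helper: report whether a project directory looks locked.
# Compares modification times only; it does not inspect file contents.
lock_status() {
  dir="${1:-.}"
  spec="$dir/anaconda-project.yml"
  lock="$dir/anaconda-project-lock.yml"
  if [ ! -f "$lock" ]; then
    echo "unlocked"
  elif [ "$spec" -nt "$lock" ]; then
    echo "stale"     # project file changed after the lock was written
  else
    echo "locked"
  fi
}
```

If lock_status prints stale, re-running anaconda-project lock regenerates the lock file from the current configuration.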
