Git workflow in Databricks¶
Warning
If you are still using Daipe in Workspace / locally, use this guide instead.
Prerequisites
- Enable 'Files in Repos' in your Databricks workspace at Settings -> Admin Console -> Workspace Settings
- Set up a GitHub personal access token
- In your Databricks workspace, go to Settings -> User Settings -> Git Integration, select GitHub as the provider and enter your new token
Create feature branch and work on it¶
Open the Git menu (click the branch name next to the repo name, or right-click the repo) and check out a new feature branch:
Commit your work to the Git repository¶
After making changes to your feature branch you can commit the changes in the Git menu:
Merging the feature branch with your main branch¶
After you are done with the changes, open a Pull Request into your main branch back in your repository on GitHub:
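In plain Git terms, the branch, commit and push steps above correspond roughly to the command sequence sketched below. This is only an illustration for readers who think in Git commands, not a description of what the Databricks Git menu runs internally. The helper name `feature_branch_commands`, the branch name and the commit message are hypothetical; the Pull Request itself is still opened on GitHub.

```python
import shlex
from typing import List


def feature_branch_commands(
    branch: str, message: str, remote: str = "origin"
) -> List[List[str]]:
    """Build the Git commands that mirror the UI steps (nothing is executed)."""
    return [
        ["git", "checkout", "-b", branch],                  # create and switch to the feature branch
        ["git", "add", "--all"],                            # stage all changes
        ["git", "commit", "-m", message],                   # commit the staged changes
        ["git", "push", "--set-upstream", remote, branch],  # publish the branch to the remote
    ]


# Hypothetical branch name and commit message, for illustration only.
for argv in feature_branch_commands("feature/my-change", "Describe the change"):
    print(shlex.join(argv))
```

The function only builds and prints the commands, so it is safe to run anywhere.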
Adding, updating or removing a project dependency¶
To add, update or remove a dependency you need to:
- Open a dbx_poetry notebook in the root of your project.
- Run Cmd 1 to show the widgets.
- Select the add/update/remove action in the first widget and the desired package in the second.
- Run the rest of the commands.
- If you rerun %run bootstrap in your notebooks, the new dependencies should be available.
- The updated project should then be pushed to a central repository so that other team members can pull it and have the same dependencies.
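One way to confirm that a newly added dependency is actually available after rerunning the bootstrap is to query the installed package metadata from a notebook cell. The sketch below uses only the Python standard library; the helper name `installed_version` and the package name `some-package` are placeholders, not part of Daipe.

```python
from importlib import metadata
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "some-package" is a placeholder for the package selected in the dbx_poetry widget.
version = installed_version("some-package")
if version is None:
    print("some-package is not installed; try rerunning %run bootstrap")
else:
    print(f"some-package {version} is available")
```

Note that the lookup uses the distribution name (the name used when adding the dependency), which may differ from the module name used in `import` statements.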
Example¶
- Ran dbx_poetry notebook as described above
- Expected diff in pyproject.toml on commit
- Expected diff in poetry.lock on commit
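To give a rough idea of what the pyproject.toml diff looks like when a package is added, the sketch below compares two hypothetical versions of the dependencies section with Python's `difflib`. The package name `example-package` and all version constraints are made up for illustration; the real diff depends on your project.

```python
import difflib

# Hypothetical pyproject.toml contents before adding a dependency.
before = """\
[tool.poetry.dependencies]
python = "^3.8"
""".splitlines(keepends=True)

# Hypothetical contents after adding "example-package" via dbx_poetry.
after = """\
[tool.poetry.dependencies]
python = "^3.8"
example-package = "^1.0.0"
""".splitlines(keepends=True)

# Produce a unified diff similar to what the commit dialog would show.
diff = list(
    difflib.unified_diff(
        before, after, fromfile="a/pyproject.toml", tofile="b/pyproject.toml"
    )
)
print("".join(diff))
```

The only added line in the diff is the new dependency entry. The corresponding poetry.lock diff is usually much larger, since the lock file records a `[[package]]` entry for the new package and each of its transitive dependencies, along with an updated `content-hash`.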