Git Folders

A Git Folder in QRY is a folder of notebooks (and Lakeflow pipelines and jobs) backed by a real Git repository. You can push your work to GitHub / GitLab / Bitbucket, pull updates from teammates, and see a real diff between versions — without copying YAML around or losing inline cell history.

The implementation uses pygit2 with a custom PostgreSQL ODB backend: there's no filesystem dependency, so it works in Kubernetes without persistent volumes. Notebooks serialise to .qrynb, pipelines and jobs to .yaml with literal-block style for readable SQL diffs.
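To see why literal-block style matters for diffs, here is a minimal, hand-rolled sketch of a YAML emitter that writes multi-line strings as `|` blocks. It is not QRY's serialiser: the function name `dump_flat_yaml`, the keys `name` and `sql`, and the flat string-to-string mapping are assumptions made for illustration.

```python
import json


def dump_flat_yaml(mapping: dict[str, str]) -> str:
    """Emit a flat str -> str mapping as YAML.

    Illustration only: handles top-level plain keys and string values,
    and assumes no value starts with a space or ends with several newlines.
    """
    out: list[str] = []
    for key, value in mapping.items():
        if "\n" in value:
            # "|" keeps a single trailing newline; "|-" strips it.
            indicator = "|" if value.endswith("\n") else "|-"
            out.append(f"{key}: {indicator}")
            for line in value.rstrip("\n").split("\n"):
                # Blank lines stay empty so no trailing whitespace is written.
                out.append(f"  {line}" if line else "")
        else:
            # A JSON string is also a valid YAML double-quoted scalar.
            out.append(f"{key}: {json.dumps(value)}")
    return "\n".join(out) + "\n"


pipeline_step = {
    "name": "daily_orders",
    "sql": "SELECT order_id, total\nFROM orders\nWHERE order_date = CURRENT_DATE\n",
}
print(dump_flat_yaml(pipeline_step))
```

Because each SQL line lands on its own YAML line, a change to the WHERE clause shows up as a one-line diff rather than a rewrite of one long escaped string.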

Goal

By the end of this page you will have a Git Folder linked to a real remote, your notebook pushed, and a clear picture of how the diff and conflict workflow behaves.

Prerequisites

  • A repository on GitHub, GitLab, or Bitbucket you can push to.
  • An OAuth integration enabled for your tenant or a personal access token (PAT) with repo scope.
  • A notebook (or pipeline / job) you want to track.

Steps

1. Create a Git Folder

In the Notebooks IDE, open the Source Control panel (bottom-left of the Explorer). Click Connect repository and pick your provider.

  • OAuth — opens the provider's auth flow in a popup; grant access to the repo you want.
  • PAT — paste a personal access token. The token is stored Fernet-encrypted under your tenant's JWT_SECRET_KEY.

Pick the repo and the branch (or create a new one). QRY creates a Git Folder mirroring the repo's notebook directory.

2. Add notebooks to the folder

Move existing notebooks into the Git Folder via drag-drop in the Explorer, or create new ones inside it. New notebooks are immediately tracked.

3. Stage and commit

In the Source Control panel you'll see modified notebooks. Click the + next to each to stage, write a commit message, and click Commit.

The diff view shows changed cells with cell-level granularity — added cells in green, removed in red, modified showing a side-by-side cell diff.
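The idea behind a cell-level diff can be sketched in a few lines. This assumes each cell carries a stable identifier, which the page does not confirm; the `Cell` class, `cell_id` field, and both function names are hypothetical and not part of QRY.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    cell_id: str  # hypothetical stable identifier for a cell
    source: str


def diff_cells(old: list[Cell], new: list[Cell]) -> dict[str, list[str]]:
    """Classify cell ids as added, removed, or modified between two versions."""
    old_by_id = {c.cell_id: c for c in old}
    new_by_id = {c.cell_id: c for c in new}
    return {
        "added": [cid for cid in new_by_id if cid not in old_by_id],
        "removed": [cid for cid in old_by_id if cid not in new_by_id],
        "modified": [
            cid
            for cid in new_by_id
            if cid in old_by_id and new_by_id[cid].source != old_by_id[cid].source
        ],
    }


def cell_line_diff(old: Cell, new: Cell) -> list[str]:
    """Line diff inside one modified cell, as unified-diff lines."""
    return list(
        difflib.unified_diff(
            old.source.split("\n"), new.source.split("\n"), lineterm=""
        )
    )


before = [Cell("a", "SELECT 1"), Cell("b", "SELECT 2"), Cell("c", "SELECT 3")]
after = [Cell("a", "SELECT 1"), Cell("b", "SELECT 20"), Cell("d", "SELECT 4")]
print(diff_cells(before, after))
```

Matching by identifier rather than by position means that inserting a cell at the top of a notebook registers as one added cell, not as a modification of every cell below it.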

4. Push and pull

Push uploads your commits to the remote. Pull fetches the remote and merges.

When a teammate pushes a different version of a notebook you also edited, pull surfaces a cell-level conflict resolver: you pick yours, theirs, or both per conflicting cell, instead of resolving line-by-line text conflicts in raw .qrynb JSON.
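The yours / theirs / both choice maps naturally onto a three-way merge done per cell instead of per line. The following is a sketch under assumptions, not QRY's resolver: cells are modelled as a hypothetical mapping of cell id to source text, and `merge_cells` and its `choices` argument are invented for the example.

```python
Cells = dict[str, str]  # hypothetical model: cell id -> cell source


def merge_cells(
    base: Cells, yours: Cells, theirs: Cells, choices: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Three-way merge per cell.

    Returns (merged cell sources, ids of conflicts with no choice yet).
    `choices` maps a conflicting cell id to "yours", "theirs", or "both".
    Simplification: cells are ordered base first, then new cells.
    """
    merged: list[str] = []
    unresolved: list[str] = []
    for cid in dict.fromkeys([*base, *yours, *theirs]):
        b, y, t = base.get(cid), yours.get(cid), theirs.get(cid)
        if y == t:
            picked = [y]  # both sides agree
        elif y == b:
            picked = [t]  # only their side changed
        elif t == b:
            picked = [y]  # only your side changed
        else:
            # Both sides changed the same cell differently: a real conflict.
            choice = choices.get(cid)
            if choice == "yours":
                picked = [y]
            elif choice == "theirs":
                picked = [t]
            elif choice == "both":
                picked = [y, t]
            else:
                unresolved.append(cid)
                continue
        # None marks a cell deleted on the chosen side.
        merged.extend(src for src in picked if src is not None)
    return merged, unresolved


base = {"c1": "SELECT 1", "c2": "SELECT 2", "c3": "SELECT 3"}
yours = {"c1": "SELECT 1", "c2": "SELECT 2 -- mine", "c3": "SELECT 30"}
theirs = {"c1": "SELECT 1", "c2": "SELECT 2 -- theirs", "c3": "SELECT 3", "c4": "SELECT 4"}
print(merge_cells(base, yours, theirs, {"c2": "both"}))
```

Only `c2` needs a decision here: `c3` was changed on one side only and `c4` was added on one side only, so both merge without prompting.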

What gets serialised

Asset | Format | Why
Notebooks | .qrynb (JSON) | Cell-aware structure, model-per-cell, captured outputs
Pipelines | .yaml | Lakeflow pipeline DSL, literal-block style for SQL readability
Jobs | .yaml | Lakeflow job DAG; same readability rules

Captured outputs (chart PNGs, table samples) are included by default — they make the notebook diffable in the QRY UI on the receiver's side. If you don't want outputs in the repo, exclude them per-folder in the Git Folder settings.

What does NOT go into Git

  • Datasource bindings — these are environment-specific. A notebook pulled into a different tenant has to be re-bound.
  • Workspace assignment — same reasoning.
  • Run history of scheduled notebooks — execution logs live in the scheduled-tasks system, not Git.
  • Memory entries and domain context — separate stores, not file-based.

Common issues

Push fails with 403 Forbidden. Your token / OAuth scope doesn't include write access. Re-authorise with repo scope (GitHub) or equivalent.
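For GitHub classic tokens, one way to confirm the scope before re-authorising is to read the `X-OAuth-Scopes` response header that the GitHub REST API returns. The sketch below is a standalone check outside QRY; the function names are hypothetical, and fine-grained tokens may not report this header, in which case the check is inconclusive.

```python
import urllib.request


def parse_scopes(header_value: str) -> set[str]:
    """Split a comma-separated X-OAuth-Scopes value into a set of scopes."""
    return {s.strip() for s in header_value.split(",") if s.strip()}


def has_repo_scope(header_value: str) -> bool:
    """True if the token carries the full `repo` scope."""
    return "repo" in parse_scopes(header_value)


def fetch_scopes_header(token: str) -> str:
    """Ask the GitHub API which scopes a token has (performs a network call)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("X-OAuth-Scopes", "")


print(has_repo_scope("read:org, repo, gist"))
```

Calling `has_repo_scope(fetch_scopes_header(token))` with your own token gives a quick yes or no; GitLab and Bitbucket expose scopes differently, so this applies to GitHub only.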

Pull surfaces a cell conflict on every commit. Two people are editing the same cell. The cell-level resolver helps, but the better fix is to divide the work so each person edits different cells, or to split the notebook into separate notebooks.

A .qrynb file looks unreadable in GitHub's web view. It's JSON. The QRY diff view is the clean way to look at notebook changes; the raw file is for the machine.

OAuth fails with redirect_uri_mismatch. Your tenant's Git OAuth client isn't configured with this tenant's URL as a callback. Ask an admin to update the OAuth app on the provider side.

Cell outputs balloon the repo size. Disable output serialisation in Git Folder settings, or strip outputs in a periodic cleanup commit.
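A cleanup pass of that kind could look roughly like the sketch below. The `.qrynb` schema is not documented on this page, so the top-level `cells` list and the per-cell `outputs` key are assumptions, as are the function names; check a real file before adapting it.

```python
import copy
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of the notebook with captured outputs removed.

    Assumes a hypothetical layout: {"cells": [{"outputs": [...], ...}, ...]}.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        cell.pop("outputs", None)
    return cleaned


def strip_file(path: Path) -> bool:
    """Rewrite one notebook file in place; return True if it changed."""
    original = json.loads(path.read_text(encoding="utf-8"))
    cleaned = strip_outputs(original)
    if cleaned == original:
        return False
    # Re-serialising may also change whitespace compared with the original file.
    path.write_text(json.dumps(cleaned, indent=2) + "\n", encoding="utf-8")
    return True


sample = {
    "cells": [
        {"id": "c1", "source": "SELECT 1", "outputs": [{"type": "table", "rows": [[1]]}]},
        {"id": "c2", "source": "SELECT 2"},
    ]
}
print(strip_outputs(sample))
```

Stripping outputs in a later commit stops further growth but does not shrink existing history, since earlier commits still contain the output blobs.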

QRY. A product of IXEN.