Git Folders
Unified version control for your notebooks, pipelines, and jobs. Because "final_v3_REALLY_FINAL.qrynb" isn't a version control strategy.
Production Ready - Full Git integration with local history, remote sync, and collaboration features
Overview
Git Folders brings professional version control to QRY. Instead of managing assets in isolation, organize them into Git-enabled folders that track every change, enable collaboration, and integrate with your existing GitHub, GitLab, Bitbucket, or Azure DevOps workflows.
Think of it as Databricks Git Folders, but designed specifically for QRY's notebooks, Forge pipelines, and jobs. Your data assets deserve the same version control discipline as your application code.
What you get:
- Version history for every notebook, pipeline, and job
- Branch management for experimenting safely
- Remote sync with GitHub, GitLab, Bitbucket, and Azure DevOps
- Conflict resolution when teammates edit the same files
- CI/CD integration for automated deployments
How It Works
The Architecture
Git Folders uses a hybrid approach: your database remains the source of truth for fast access, while Git provides versioning and collaboration.
┌─────────────────────────────────────────────────────┐
│                    QRY Workspace                    │
├─────────────────────────────────────────────────────┤
│ 📓 Notebooks  📊 Pipelines      ⚙️ Jobs             │
│      │              │              │                │
│      └──────────────┼──────────────┘                │
│                     ▼                               │
│              ┌──────────────┐                       │
│              │   Git Sync   │                       │
│              │   Service    │                       │
│              └──────┬───────┘                       │
│                     │                               │
│       ┌─────────────┼─────────────┐                 │
│       ▼             ▼             ▼                 │
│  Shadow Git    Remote Sync    CI/CD API             │
│    (Local)     (Push/Pull)    (Webhooks)            │
│       │             │                               │
│       │        ┌────▼────┐                          │
│       │        │ GitHub  │                          │
│       │        │ GitLab  │                          │
│       │        │Bitbucket│                          │
│       │        └─────────┘                          │
└───────┼─────────────────────────────────────────────┘
        ▼
Local Git Repository
Key principle: Changes are always saved to the database first (for speed), then synced to Git (for history and collaboration). This means you never lose work, even if Git sync fails.
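The save-first, sync-second ordering can be sketched in a few lines of Python. This is only an illustration of the principle, not QRY's implementation: the `Workspace` class, its in-memory `database` and `git_log`, and the `GitSyncError` exception are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


class GitSyncError(Exception):
    """Raised by the (hypothetical) sync step when Git is unavailable."""


@dataclass
class Workspace:
    # In-memory stand-ins for the database and the Git history.
    database: dict = field(default_factory=dict)
    git_log: list = field(default_factory=list)
    pending_sync: set = field(default_factory=set)
    git_available: bool = True

    def _sync_to_git(self, path: str) -> None:
        if not self.git_available:
            raise GitSyncError(f"could not sync {path}")
        self.git_log.append((path, self.database[path]))

    def save(self, path: str, content: str) -> bool:
        """Persist to the database first, then attempt the Git sync.

        Returns True if the sync succeeded, False if it was deferred.
        """
        self.database[path] = content      # step 1: the work is now safe
        try:
            self._sync_to_git(path)        # step 2: best effort
        except GitSyncError:
            self.pending_sync.add(path)    # remember to retry later
            return False
        self.pending_sync.discard(path)
        return True

    def retry_pending(self) -> None:
        """Re-attempt the sync for every asset whose sync was deferred."""
        for path in sorted(self.pending_sync):
            self._sync_to_git(path)
        self.pending_sync.clear()
```

In this sketch a failed sync never rolls back the database write; the asset is simply queued, which mirrors the "you never lose work" guarantee described above.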
Folder Structure
Git Folders organize your assets in a clean hierarchy:
📁 analytics-team/
│   └─ Remote: github.com/acme/analytics
│
├── 📓 notebooks/
│   ├── sales-analysis.qrynb
│   └── weekly-report.qrynb
│
├── 📊 pipelines/
│   └── customer-etl.yaml
│
└── ⚙️ jobs/
    └── daily-refresh.yaml
Each folder can be:
- Local only: Version history without remote sync
- Connected to remote: Full push/pull capabilities
Getting Started
Creating a Git Folder
From scratch:
- Navigate to Forge or Notebooks
- Click + New Folder in the sidebar
- Choose Git Folder
- Configure:
- Name: URL-safe slug (analytics-team)
- Display Name: Human-readable (Analytics Team)
- Workspace: Where it belongs
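If you want to derive the slug from the display name yourself, a small helper like the following may be enough. The exact slug rules QRY enforces are not documented here, so `to_slug` and its rules (lowercase ASCII letters, digits, and hyphens) are assumptions.

```python
import re


def to_slug(display_name: str) -> str:
    """Turn a human-readable name into a URL-safe slug.

    Lowercases the name, collapses every run of characters other than
    a-z and 0-9 into a single hyphen, and trims hyphens from both ends.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", display_name.lower())
    return slug.strip("-")


print(to_slug("Analytics Team"))  # analytics-team
```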
From existing repository:
- Click Clone Repository
- Enter the remote URL
- Authenticate (PAT, SSH, or OAuth)
- Select branch to clone
- QRY imports all compatible assets
Connecting to Remote
- Open folder settings (⚙️)
- Click Connect to Remote
- Select provider:
- GitHub - Personal or organization repos
- GitLab - Cloud or self-hosted
- Bitbucket - Cloud or Server
- Azure DevOps - Microsoft's platform
- Authenticate:
- Personal Access Token (recommended)
- SSH Key (for advanced users)
- OAuth (for seamless browser auth)
- Choose or create repository
- Select default branch
Core Operations
Committing Changes
When you save a notebook, pipeline, or job, Git Folders tracks the change locally. To create a permanent version:
- Click Git in the folder toolbar
- Review changed files:
  ✚ notebooks/new-analysis.qrynb (new file)
  ✎ notebooks/sales-analysis.qrynb (modified)
- Enter commit message
- Click Commit
Commit message tips:
- Be descriptive: "Add regional breakdown to sales analysis"
- Not helpful: "Updates" or "Fixed stuff"
Viewing History
Every asset has complete version history:
- Open any notebook, pipeline, or job
- Click History (🕒)
- Browse commits:
● abc1234 Update visualization chart type
│ John Doe • 2 hours ago
│ [View] [Diff] [Restore]
│
● def5678 Add regional breakdown
│ Jane Smith • 1 day ago
│
● 789abcd Initial version
  John Doe • 1 week ago
Comparing Versions (Diff)
See exactly what changed between versions:
- Click Diff on any commit
- Choose versions to compare
- View changes:
Cell 3 (Python) - Modified
- df.plot(kind='bar', x='region', y='total')
+ df.plot(kind='pie', y='total', labels=df['region'])
+ plt.title('Regional Distribution')
Cell 5 (Prompt) - Added
+ Provide executive summary of regional performance
What gets diffed:
- Notebooks: Cell-by-cell comparison
- Pipelines: YAML configuration changes
- Jobs: Task and schedule changes
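A cell-by-cell notebook comparison can be approximated by matching cells on their `id`. The sketch below assumes each cell is a dictionary with an `id` key, as in the .qrynb format; `diff_cells` is a hypothetical helper, not the algorithm QRY uses.

```python
def diff_cells(old_cells, new_cells):
    """Compare two lists of cell dicts by their 'id' field.

    Returns (cell_id, change) pairs, where change is one of
    'added', 'modified', or 'removed'. Unchanged cells are omitted.
    """
    old_by_id = {cell["id"]: cell for cell in old_cells}
    new_ids = {cell["id"] for cell in new_cells}
    changes = []
    for cell in new_cells:
        before = old_by_id.get(cell["id"])
        if before is None:
            changes.append((cell["id"], "added"))
        elif before != cell:
            changes.append((cell["id"], "modified"))
    for cell in old_cells:
        if cell["id"] not in new_ids:
            changes.append((cell["id"], "removed"))
    return changes
```

Matching on `id` rather than on position means that inserting a cell in the middle of a notebook shows up as one addition instead of a cascade of modifications.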
Restoring Previous Versions
Made a mistake? Roll back easily:
- Find the commit you want in History
- Click Restore
- Choose:
- Restore (with backup): Creates backup of current state first
- Restore (direct): Immediate rollback
- Your asset returns to that version
Important: Restoring creates a new state; it doesn't rewrite history. Your previous versions remain available.
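One way to picture a non-destructive restore is as an append to the history rather than a rewind. In the sketch below the history is a plain list of commit dictionaries (oldest first); the `restore` function and the commit fields are hypothetical and only illustrate the idea.

```python
def restore(history, commit_id, author):
    """Return a new history with one extra commit that re-applies
    the content of `commit_id`. The input list is left untouched."""
    for commit in history:
        if commit["id"] == commit_id:
            target = commit
            break
    else:
        raise KeyError(f"unknown commit: {commit_id}")
    restored = {
        "id": f"restore-of-{commit_id}",
        "author": author,
        "message": f"Restore version {commit_id}",
        "content": target["content"],
    }
    # Appending keeps every earlier commit reachable.
    return history + [restored]
```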
Remote Sync
Pushing Changes
Send your local commits to the remote repository:
- Make sure you have commits to push (check status)
- Click Push in the Git panel
- Select branch (usually main)
- Confirm push
Status indicators:
- ✓ Synced: Local and remote match
- ↑ Ahead by N: You have commits to push
- ↓ Behind by N: Remote has commits to pull
- ↕ Diverged: Both have different commits
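The four indicators follow directly from two numbers: how many commits exist only locally (ahead) and how many exist only on the remote (behind). A minimal sketch, with `sync_status` as a hypothetical helper name:

```python
def sync_status(ahead: int, behind: int) -> str:
    """Map ahead/behind commit counts to the status indicator text."""
    if ahead < 0 or behind < 0:
        raise ValueError("commit counts cannot be negative")
    if ahead and behind:
        return "↕ Diverged"
    if ahead:
        return f"↑ Ahead by {ahead}"
    if behind:
        return f"↓ Behind by {behind}"
    return "✓ Synced"


print(sync_status(2, 0))  # ↑ Ahead by 2
```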
Pulling Changes
Get the latest changes from your team:
- Click Pull in the Git panel
- If no conflicts: Changes merge automatically
- If conflicts: Conflict resolution UI opens
Pro tip: Pull frequently to avoid large merges.
Branch Operations
Work on features without affecting the main branch:
Create branch:
1. Click branch dropdown (⚡ main)
2. Click "New Branch"
3. Name it (feature/new-analysis)
4. Start working
Switch branches:
1. Click branch dropdown
2. Select target branch
3. Your workspace updates
Merge branches:
1. Switch to target branch (e.g., main)
2. Click "Merge"
3. Select source branch
4. Resolve any conflicts
5. Complete merge
Conflict Resolution
When you and a teammate edit the same file, Git Folders provides a visual merge interface:
┌─────────────────────────────────────────────────────┐
│  Merge Conflicts (2 files)                          │
├─────────────────────────────────────────────────────┤
│                                                     │
│  📓 notebooks/sales-analysis.qrynb                  │
│  ├─ Conflict: Both modified                         │
│  ├─ Your changes: Modified cell 3 (Python)          │
│  └─ Their changes: Different chart type             │
│                                                     │
│  [Accept Mine] [Accept Theirs] [Open Editor]        │
│                                                     │
│  ─────────────────────────────────────────────────  │
│                                                     │
│  📊 pipelines/customer-etl.yaml                     │
│  ├─ Conflict: Deleted remotely                      │
│  └─ Your version: Has uncommitted changes           │
│                                                     │
│  [Keep Mine] [Accept Deletion] [Rename & Keep]      │
│                                                     │
├─────────────────────────────────────────────────────┤
│  [Cancel]                     [Resolve All & Pull]  │
└─────────────────────────────────────────────────────┘
Resolution options:
- Accept Mine: Keep your version
- Accept Theirs: Use the remote version
- Open Editor: Manually merge changes
- Rename & Keep: Keep both with different names
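The non-interactive options can be modeled as a function from a choice to the files that remain afterwards. This is a sketch under assumptions: `resolve_conflict`, the choice names, and the `-mine` rename suffix are hypothetical, and a remote deletion is represented by `theirs=None`. Open Editor is a manual merge and is not modeled.

```python
from pathlib import PurePosixPath


def resolve_conflict(path, mine, theirs, choice):
    """Return a {path: content} mapping of the files kept after resolution.

    `theirs` is None when the file was deleted on the remote.
    """
    if choice == "accept_mine":
        return {path: mine}
    if choice == "accept_theirs":
        # Accepting a remote deletion leaves no file behind.
        return {} if theirs is None else {path: theirs}
    if choice == "rename_and_keep":
        p = PurePosixPath(path)
        renamed = str(p.with_name(f"{p.stem}-mine{p.suffix}"))
        kept = {renamed: mine}
        if theirs is not None:
            kept[path] = theirs
        return kept
    raise ValueError(f"unknown choice: {choice}")
```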
File Formats
Git Folders uses human-readable formats that work well with Git:
Notebooks (.qrynb)
version: "1.0"
kind: notebook
metadata:
  name: "Sales Analysis"
  description: "Weekly sales performance analysis"
  tags: ["sales", "weekly"]
settings:
  datasource_id: "uuid-here"
  catalog: "analytics"
  schema: "sales"
cells:
  - id: "cell-1"
    type: markdown
    content: |
      # Weekly Sales Analysis
      This notebook analyzes sales trends...
  - id: "cell-2"
    type: sql
    name: "sales_data"
    content: |
      SELECT region, SUM(revenue) as total
      FROM sales
      GROUP BY region
  - id: "cell-3"
    type: python
    content: |
      df = sql['sales_data']
      df.plot(kind='bar', x='region', y='total')
Why YAML?
- Human-readable diffs
- Easy to review in PRs
- Git-friendly (no binary blobs)
- Portable between QRY instances
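Because a .qrynb file is plain YAML, it can be checked in a pre-commit hook or CI step once it has been parsed into a dictionary (for example with a YAML library). The sketch below works on the parsed dictionary only. The rules it enforces (unique cell ids, the set of cell types) are assumptions inferred from the examples in this page, not a published schema.

```python
# Cell types that appear in the examples on this page; the real list may differ.
KNOWN_CELL_TYPES = {"markdown", "sql", "python", "prompt"}


def validate_notebook(doc: dict) -> list:
    """Return a list of human-readable problems; empty means no issues found."""
    problems = []
    if doc.get("kind") != "notebook":
        problems.append("kind must be 'notebook'")
    seen_ids = set()
    for index, cell in enumerate(doc.get("cells", [])):
        cell_id = cell.get("id")
        if not cell_id:
            problems.append(f"cell {index} has no id")
        elif cell_id in seen_ids:
            problems.append(f"duplicate cell id: {cell_id}")
        else:
            seen_ids.add(cell_id)
        if cell.get("type") not in KNOWN_CELL_TYPES:
            problems.append(f"cell {index} has unknown type: {cell.get('type')}")
    return problems
```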
Pipelines & Jobs (.yaml)
Already YAML-based - no conversion needed. See Lakeflow documentation for format details.
Integration with Lakeflow
Git Folders integrates seamlessly with Lakeflow's folder system:
Unified hierarchy:
📁 data-platform/ (Git Folder)
├── 📊 pipelines/
│   ├── raw-ingestion.yaml
│   └── customer-etl.yaml
├── ⚙️ jobs/
│   └── daily-workflow.yaml
└── 📓 notebooks/
    └── data-quality-check.qrynb
Benefits:
- Single folder for related assets
- Shared version history
- Atomic commits across asset types
- One remote repository
Integration with Notebooks
Git Folders works with the Notebook IDE:
From Notebook Editor:
- Save: Changes tracked automatically
- History: View notebook version history
- Diff: Compare cell changes
- Restore: Roll back to previous versions
In File Explorer:
- Navigate Git Folders
- See sync status indicators
- Quick commit from context menu
CI/CD Integration
Automate deployments with Git webhooks and APIs.
Webhook Endpoints
Configure your Git provider to notify QRY:
POST /api/git/webhooks/{provider}
Supported events:
- push - Deploy on merge to main
- pull_request - Preview deployments
- tag - Version-based releases
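The mapping from event to action can be sketched as a small routing function. The payload shape sent by each provider is not described here, so `plan_action`, its `branch` argument, and the action names are hypothetical.

```python
def plan_action(event: str, branch: str = "", default_branch: str = "main") -> str:
    """Decide what a webhook event should trigger.

    Returns 'deploy', 'preview', 'release', or 'ignore'.
    """
    if event == "push":
        # Only pushes to the default branch (i.e. merges) trigger a deploy.
        return "deploy" if branch == default_branch else "ignore"
    if event == "pull_request":
        return "preview"
    if event == "tag":
        return "release"
    return "ignore"
```

Returning "ignore" for unknown events keeps the handler safe when a provider is configured to send more event types than the three listed above.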
API Access
Programmatic control for automation:
# List folders
GET /api/git/folders
# Trigger sync
POST /api/git/folders/{id}/sync
# Get status
GET /api/git/folders/{id}/status
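From a script, the three endpoints could be wrapped as follows using only the Python standard library. The host name, the bearer-token authentication scheme, and the helper names are assumptions; check your QRY instance for the actual authentication method before relying on this.

```python
import urllib.request

BASE_URL = "https://qry.example.com"  # hypothetical host


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request for a Git Folders endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        method=method,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "application/json",
        },
    )


def list_folders(token: str) -> urllib.request.Request:
    return build_request("GET", "/api/git/folders", token)


def trigger_sync(folder_id: str, token: str) -> urllib.request.Request:
    return build_request("POST", f"/api/git/folders/{folder_id}/sync", token)


def get_status(folder_id: str, token: str) -> urllib.request.Request:
    return build_request("GET", f"/api/git/folders/{folder_id}/status", token)
```

To actually send a request, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes the helpers easy to inspect and test.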
Production Folders
Mark folders as production for extra protection:
- Read-only in UI: Changes only via CI/CD
- Deployment logs: Track what deployed when
- Rollback support: Quick revert to previous deploy
Best Practices
Folder Organization
DO:
- ✅ Group related assets (pipeline + job + monitoring notebook)
- ✅ Use meaningful folder names
- ✅ Keep folders focused (not everything in one mega-folder)
- ✅ Match team/project boundaries
DON'T:
- ❌ One folder per asset (defeats the purpose)
- ❌ Mix unrelated projects
- ❌ Deeply nested hierarchies
Commit Hygiene
Good commits:
Add customer segmentation to weekly analysis
- New SQL cell for segment calculation
- Updated visualization with segment breakdown
- Added markdown documentation
Bad commits:
stuff
Branching Strategy
For most teams:
- main - Production-ready assets
- feature/* - New development
- Merge via pull request
For solo work:
- Commit to main directly
- Use branches for experiments
Sync Frequency
- Push: After completing a logical unit of work
- Pull: Start of each work session
- Don't: Push every single save (too noisy)
Troubleshooting
"Sync failed"
Check:
- Remote credentials still valid?
- Network connectivity?
- Repository still exists?
- You have push permissions?
Fix:
- Re-authenticate in folder settings
- Check remote repository status
- Verify your access level
"Conflict detected"
This is normal! It means you and a teammate edited the same file.
Resolution:
- Open conflict resolution UI
- Review both versions
- Choose which to keep (or merge manually)
- Complete the pull
"Large file warning"
Git isn't great with large files. If you see this:
- Check for accidentally committed data files
- Use .gitignore for outputs
- Consider Git LFS for large assets
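To find the offending files before committing, a short script can walk a local checkout and list everything above a size limit. The threshold that triggers QRY's warning is not stated here, so `limit_bytes` is a value you choose; `find_large_files` is a hypothetical helper.

```python
import os


def find_large_files(root: str, limit_bytes: int) -> list:
    """Return sorted (relative_path, size) pairs for files larger than the limit.

    The .git directory is skipped because it holds Git's own objects.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            size = os.path.getsize(full_path)
            if size > limit_bytes:
                large.append((os.path.relpath(full_path, root), size))
    return sorted(large)
```

Files reported here are good candidates for a .gitignore entry or for Git LFS.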
"History not showing"
Check:
- Is this a Git-enabled folder?
- Have commits been made?
- Are you looking at the right branch?
Security
Credential Storage
- All credentials encrypted at rest (AES-256)
- Per-user encryption keys
- OAuth tokens auto-refresh
- SSH keys stored securely
Access Control
- Git Folders inherit workspace permissions
- Commits include author identity
- Full audit log of all operations
- Service accounts for CI/CD (no user impersonation)
Data Protection
- Sensitive variables excluded from commits
- Credential references (not values) in configs
- Option to exclude outputs
- .gitignore support
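Storing credential references instead of values can be pictured as a redaction pass that runs before a config is written to Git. The sketch below is illustrative only: `redact_for_commit` and the `${secret:...}` reference syntax are hypothetical and do not describe QRY's actual format.

```python
def redact_for_commit(config: dict, sensitive_keys: set) -> dict:
    """Return a copy of `config` in which sensitive values are replaced
    by references. Nested dictionaries are processed recursively."""
    redacted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            redacted[key] = redact_for_commit(value, sensitive_keys)
        elif key in sensitive_keys:
            redacted[key] = f"${{secret:{key}}}"  # hypothetical reference syntax
        else:
            redacted[key] = value
    return redacted
```

The original dictionary is not modified, so the running job can keep using the real values while only the redacted copy is committed.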
Git Folders works well with Notebooks for version-controlled analysis, Lakeflow for pipeline versioning, and Workspaces for team collaboration.
For sparse checkout, Git LFS, and advanced CI/CD integrations, contact your QRY administrator.