Running Python

Python in QRY runs in one of two places:

  • Gemini native executor — fast, sandboxed, ideal for routine analysis on small data.
  • Kubernetes executor — slower to cold-start (~10–20s), full Python ecosystem, larger memory limit, used when the native one can't.

You almost never pick which one runs your code. QRY routes automatically based on what your code imports and how big the data is. This page explains the routing so you understand why a cell took 20 seconds when you expected 2.

When QRY routes to Kubernetes

The K8s executor kicks in when either of the following is true:

1. Data exceeds the native upload limit

The Gemini native executor caps inputs at 2 MB. If your DataFrame, image, or file exceeds that, QRY switches to K8s automatically. There's no flag to override.

2. Your code imports K8s-only libraries

Heavy libraries that don't fit in the native sandbox trigger K8s automatically. QRY detects these via static import scanning before running:

plotly, xgboost, lightgbm, statsmodels,
reportlab, openpyxl, fpdf,
python-pptx, python-docx

If any of these appears in an import statement (import x or from x import ...), the cell goes to K8s. Note that python-pptx and python-docx are imported as pptx and docx.

matplotlib, pandas, numpy, scipy, scikit-learn (basic estimators), seaborn, and requests all run in the native executor.
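
The two routing rules can be approximated outside QRY to predict where a cell will land. This is a minimal sketch, not QRY's actual router: the names K8S_ONLY_IMPORTS, NATIVE_LIMIT_BYTES, imported_modules, and pick_executor are hypothetical, the 2 MB cap is assumed to mean 2 × 1024 × 1024 bytes, and libraries are matched by import name (pptx and docx rather than python-pptx and python-docx).

```python
import ast

# Hypothetical constants for illustration; QRY's real router is not shown on this page.
# These are import names, not package names: python-pptx -> pptx, python-docx -> docx.
K8S_ONLY_IMPORTS = {
    "plotly", "xgboost", "lightgbm", "statsmodels",
    "reportlab", "openpyxl", "fpdf", "pptx", "docx",
}
NATIVE_LIMIT_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" means 2 MiB


def imported_modules(source: str) -> set:
    """Return the top-level module names imported anywhere in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x"
            found.add(node.module.split(".")[0])
    return found


def pick_executor(source: str, input_bytes: int = 0) -> str:
    """Guess which executor a cell would use under the two rules above."""
    if input_bytes > NATIVE_LIMIT_BYTES:
        return "kubernetes"
    if imported_modules(source) & K8S_ONLY_IMPORTS:
        return "kubernetes"
    return "native"


print(pick_executor("import pandas as pd"))
print(pick_executor("from plotly import express as px"))
print(pick_executor("import numpy as np", input_bytes=5_000_000))
```

Parsing with ast rather than searching for the word "import" avoids false positives from comments and string literals, which is one plausible reason to scan statically before running anything.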

Resource limits

Limit                   Native        Kubernetes
Memory                  ~512 MB       2 GB
CPU time                ~30 s         300 s
Max file size handled   2 MB          100 MB
Cold start              None          10–20 s
Network access          Restricted    Tenant network

For very large workloads, K8s can be tuned higher by an admin (pythonExecution.resources in the Helm chart). The defaults above are a good safety net.

Conventions

Use qry.sql(...) for database access

Don't try to connect to the database directly in Python — credentials aren't exposed and the connection wouldn't go through RBAC / ABAC. Use:

df = qry.sql("SELECT * FROM customers WHERE country = 'ES'")

The result is a pandas DataFrame. The query goes through the same security stack a chat-driven query would.

Display, don't save

QRY captures inline display: anything print()'d, display()'d, plotted with plt.show(), or written to 'chart.html' shows up below the cell.

Don't plt.savefig(...) to disk. Files in the executor's working directory are ephemeral, and savefig produces a duplicate of what plt.show() already captured. End the cell with plt.show() instead:

import matplotlib.pyplot as plt
plt.plot(df['month'], df['revenue'])
plt.show() # captured automatically

For Plotly:

fig.write_html('chart.html')  # captured with fullscreen preview

(See Creating charts for full details on chart conventions.)

Caching across cells

Each cell is independent: there's no notebook-wide kernel that persists variables. To share data between cells, attach it to the special qry.cache namespace:

# In cell 1
qry.cache['df'] = qry.sql("...")

# In cell 2
df = qry.cache['df']
df.groupby('country').size()

Cache lives for the duration of the conversation or notebook run.
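
Because cells don't share variables, a cell that reads from the cache can run before the cell that fills it. A small get-or-compute helper makes either order safe. This is a sketch under one assumption: that qry.cache raises KeyError on a missing key, as a dict does. The helper name cached is hypothetical, and a plain dict stands in for qry.cache so the snippet runs anywhere.

```python
from typing import Any, Callable, MutableMapping


def cached(cache: MutableMapping, key: str, compute: Callable[[], Any]) -> Any:
    """Return cache[key], computing and storing it first if it is missing."""
    try:
        return cache[key]
    except KeyError:  # assumption: qry.cache behaves like a dict on a miss
        value = compute()
        cache[key] = value
        return value


# A plain dict stands in for qry.cache here.
fake_cache = {}
calls = []


def load_rows():
    calls.append(1)  # record each time the expensive step actually runs
    return [("ES", 3), ("FR", 5)]


first = cached(fake_cache, "rows", load_rows)
second = cached(fake_cache, "rows", load_rows)
print(first == second, len(calls))
```

Inside QRY the equivalent call would be df = cached(qry.cache, 'df', lambda: qry.sql("...")), so the query runs only in whichever cell executes first.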

Common issues

Cell takes 20s before producing any output. You imported a K8s-only library. Cold start of the K8s executor is the slow part — once warm, subsequent cells in the same notebook reuse it.

OutOfMemoryError on a 1.5 GB DataFrame. You're on K8s but hit the 2 GB ceiling (the OS itself takes some). Sample first, or ask an admin to raise pythonExecution.resources.limits.memory.

ImportError for a library you expect to be there. Check the spelling and the import path. Some libraries (e.g. tensorflow, torch) aren't in the default executor image. An admin can extend the image or you can pip-install at the start of the cell:

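import subprocess
import sys
# sys.executable makes pip install into the interpreter that is running this cell
subprocess.run([sys.executable, '-m', 'pip', 'install', 'tensorflow'], check=True)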

(Slow and ephemeral — for repeated use, ask an admin to bake it into the image.)
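
Since the install is slow, it helps to skip it when the package is already importable, for example after an admin has added it to the image. The following is a minimal sketch using only the standard library. The names pip_install and ensure_package are hypothetical, and the install parameter exists only so the logic can be exercised without touching the network.

```python
import importlib
import importlib.util
import subprocess
import sys
from typing import Callable, Optional


def pip_install(pip_name: str) -> None:
    """Install with the pip that belongs to the running interpreter."""
    subprocess.run([sys.executable, "-m", "pip", "install", pip_name], check=True)


def ensure_package(
    import_name: str,
    pip_name: Optional[str] = None,
    install: Callable[[str], None] = pip_install,
) -> bool:
    """Install a top-level package only if it cannot be imported.

    Returns True if an install was attempted, False if it was already present.
    """
    if importlib.util.find_spec(import_name) is not None:
        return False  # already importable, so skip the slow install
    install(pip_name or import_name)
    importlib.invalidate_caches()  # let the import system notice the new files
    return True


print(ensure_package("json"))  # json ships with Python, so nothing is installed
```

The separate pip_name argument covers packages whose install name differs from their import name, such as python-docx, which is imported as docx.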

The cell errors with "network access denied". The native executor restricts network access. Either:

  • Don't fetch from the network — pull data via qry.sql(...).
  • Trigger K8s (import a heavy library if needed) — K8s has tenant-network access.

plt.savefig and plt.show produced a duplicate chart. Drop savefig. plt.show() is enough.
