File Upload
Upload your own data files for instant analysis without needing to import them into a database.
Production Ready - Full support for CSV, Excel, JSON, and Parquet formats
Overview
The File Upload feature allows you to analyze data from local files without database setup. Upload files directly in your conversation and ask questions about the data.
Supported File Formats
CSV (Comma-Separated Values)
- Standard CSV files with headers
- Custom delimiters (comma, semicolon, tab)
- Automatic type detection
- Max size: 100MB
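If you are unsure which delimiter a file uses, you can check locally before uploading. The sketch below is not how QRY detects delimiters internally (that is not documented here); it only shows one way to guess among comma, semicolon, and tab using Python's standard `csv.Sniffer`. The function name `read_delimited` is hypothetical.

```python
import csv
import io

def read_delimited(text: str, candidates: str = ",;\t") -> list[dict[str, str]]:
    """Guess the delimiter from the first 4096 characters, then parse all rows."""
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=candidates)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return list(reader)

# A semicolon-delimited sample with a header row.
rows = read_delimited("name;amount\nalice;10\nbob;20\n")
print(rows[0]["amount"])  # values are still strings at this stage
```

`csv.Sniffer` is a heuristic and can raise `csv.Error` on very short or irregular samples, so treat its answer as a hint rather than a guarantee.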
Excel
- .xlsx and .xls files
- Multiple sheets (specify which to analyze)
- Preserves formatting and data types
- Max size: 100MB
JSON
- Standard JSON arrays of objects
- Nested JSON (automatically flattened)
- JSON Lines format
- Max size: 100MB
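To get a feel for what flattening nested JSON means, here is a minimal sketch. It assumes nested keys are joined with a dot (for example `customer.address.city`) and that lists are left untouched; the naming convention QRY actually uses may differ, and the `flatten` helper is hypothetical.

```python
import json

def flatten(record: dict, parent: str = "", sep: str = ".") -> dict:
    """Turn nested objects into a single level of dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))  # recurse into nested objects
        else:
            flat[name] = value  # scalars and lists are kept as they are
    return flat

records = json.loads(
    '[{"id": 1, "customer": {"name": "Ada", "address": {"city": "Paris"}}}]'
)
flat_rows = [flatten(record) for record in records]
print(flat_rows[0])
```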
Parquet
- Apache Parquet format
- Efficient columnar storage
- Preserves types and schemas
- Max size: 100MB
How to Upload Files
During Conversation
- Click the Upload File icon (📎) in the chat input
- Select one or more files from your computer
- Files appear as attachments in your message
- Ask your question about the uploaded data
- Send the message
Multiple Files
You can upload multiple files at once:
- Compare data across files
- Join data from different sources
- Aggregate across multiple files
Asking Questions
Once files are uploaded, ask questions naturally:
Single File Examples
Show me the top 10 rows from this file
What's the average sales amount in this CSV?
Create a chart showing revenue by month
How many unique customers are in this data?
Multiple File Examples
Compare sales figures between Q1.csv and Q2.csv
Join the customers file with the orders file on customer_id
Show me total revenue across all three files
Features
Automatic Schema Detection
QRY automatically:
- Detects column names
- Infers data types (numbers, dates, strings)
- Handles missing values
- Identifies relationships between files
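As an illustration of what type inference involves, the sketch below classifies a column of text values as integer, float, date, or string. It is an assumption-laden example, not QRY's implementation: it ignores empty cells, only recognizes ISO dates such as `2024-01-05`, and the `infer_type` helper is hypothetical.

```python
from datetime import date

def infer_type(values: list[str]) -> str:
    """Return the narrowest type that every non-empty value can be parsed as."""
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not present:
        return "string"
    parsers = (("integer", int), ("float", float), ("date", date.fromisoformat))
    for name, parse in parsers:
        try:
            for value in present:
                parse(value)
            return name  # every value parsed with this parser
        except ValueError:
            continue  # at least one value failed, try the next parser
    return "string"

print(infer_type(["1", "2", ""]))       # integer
print(infer_type(["1", "2.5"]))         # float
print(infer_type(["2024-01-05", "x"]))  # string
```

A single stray text value is enough to turn a numeric column into a string column, which is why the file preparation advice on this page recommends keeping one data type per column.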
Data Profiling
Ask for insights about your data:
Describe the structure of this file
Show me summary statistics for all columns
Are there any missing values?
What are the unique values in the status column?
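The profiling questions above roughly correspond to counting missing and distinct values per column. The following is a minimal sketch, assuming rows have already been read as dictionaries (for example with `csv.DictReader`) and that an empty string counts as missing; the `profile` function and the sample column names are hypothetical.

```python
from collections import Counter

def profile(rows: list[dict[str, str]]) -> dict[str, dict]:
    """Count missing values, distinct values, and the most common value per column."""
    columns = list(rows[0].keys()) if rows else []
    report = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [v for v in values if v not in (None, "")]
        report[column] = {
            "missing": len(values) - len(present),
            "unique": len(set(present)),
            "top": Counter(present).most_common(1),
        }
    return report

sample = [
    {"status": "completed", "email": "a@example.com"},
    {"status": "pending", "email": ""},
    {"status": "completed", "email": "b@example.com"},
]
report = profile(sample)
print(report["status"])
```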
Visualizations
Create charts from uploaded data:
Plot revenue over time as a line chart
Create a bar chart showing sales by product
Show customer distribution by region as a pie chart
Python Analysis
Use Python for advanced analysis:
Calculate correlation matrix for numeric columns using Python
Use Python to detect outliers in the price column
Create a seaborn heatmap of this data
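For reference, a request such as detecting outliers in the price column could translate into something like the sketch below. It assumes the common interquartile range (IQR) rule with a factor of 1.5; the code QRY generates may use a different method, and `iqr_outliers` is a hypothetical name.

```python
import statistics

def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Return values lying more than k * IQR outside the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

prices = [10, 11, 12, 13, 14, 15, 100]
print(iqr_outliers(prices))  # [100]
```

`statistics.quantiles` needs at least two data points and uses the "exclusive" method by default, so quartiles can differ slightly from those reported by other tools.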
Working with Data
Data Exploration
Show me the first 20 rows
What columns are in this file?
How many rows are in this dataset?
Filtering
Show only rows where status is 'completed'
Filter to sales greater than $1000
Show me data from the last 3 months
Aggregation
Group by category and sum the amounts
Count records by date
Calculate average, min, and max for each product
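To make the aggregation prompts concrete, here is a small sketch of "group by category and sum the amounts" in plain Python. The column names `category` and `amount` are taken from the prompt wording and are assumptions about your file; `sum_by` is a hypothetical helper.

```python
from collections import defaultdict

def sum_by(rows: list[dict[str, str]], key: str, amount: str) -> dict[str, float]:
    """Group rows by one column and sum another column within each group."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[amount])
    return dict(totals)

rows_by_category = [
    {"category": "books", "amount": "10"},
    {"category": "games", "amount": "5.5"},
    {"category": "books", "amount": "2.5"},
]
print(sum_by(rows_by_category, "category", "amount"))
```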
Transformations
Convert the date column to datetime format
Create a new column for profit margin
Normalize the price column
Best Practices
File Preparation
Do:
- Include clear column headers in the first row
- Use one consistent date format throughout the file
- Remove empty rows at the start of the file
- Ensure numeric columns don't contain text values
Don't:
- Mix data types in the same column
- Use merged cells (Excel)
- Include multiple tables in one sheet
- Place summary rows above the header row
Performance Tips
- File Size: Keep files under 50MB for best performance
- Data Types: Clean data loads faster than messy data
- Format Choice: Parquet is fastest, followed by CSV
- Compression: Compress large files before upload (if supported)
Data Quality
- Check Headers: Verify column names are descriptive
- Handle Nulls: Decide how to treat missing values
- Consistent Formatting: Dates and numbers should use the same format throughout
- Remove Duplicates: Clean duplicate rows before upload if possible
Examples
Sales Data Analysis
Upload: sales_2024.csv
What was the total revenue in this file?
Show me top 5 products by sales.
Create a monthly revenue trend chart.
Customer Data
Upload: customers.xlsx
How many customers signed up each month?
What's the distribution of customers by country?
Show me customers with purchases over $5000
Multiple File Join
Upload: orders.csv, customers.csv, products.csv
Join orders with customers and show me customer names with their total order value
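The join requested above can be pictured as the following sketch. It assumes both files share a `customer_id` column, that customers have a `name` column and orders an `amount` column (both hypothetical), and that orders without a matching customer are dropped, as in an inner join.

```python
def total_order_value(
    customers: list[dict[str, str]], orders: list[dict[str, str]]
) -> list[tuple[str, float]]:
    """Inner-join orders to customers on customer_id and total the order amounts."""
    names = {c["customer_id"]: c["name"] for c in customers}
    totals: dict[str, float] = {}
    for order in orders:
        customer_id = order["customer_id"]
        if customer_id in names:  # skip orders with no matching customer
            totals[customer_id] = totals.get(customer_id, 0.0) + float(order["amount"])
    return [(names[cid], total) for cid, total in totals.items()]

customers = [
    {"customer_id": "1", "name": "Ada"},
    {"customer_id": "2", "name": "Lin"},
]
orders = [
    {"customer_id": "1", "amount": "20"},
    {"customer_id": "2", "amount": "5"},
    {"customer_id": "1", "amount": "7.5"},
    {"customer_id": "9", "amount": "1"},
]
print(total_order_value(customers, orders))  # [('Ada', 27.5), ('Lin', 5.0)]
```

Totals are keyed by `customer_id` rather than by name so that two customers who share a name are not merged.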
Data Cleaning
Upload: raw_data.csv
Show me rows with missing values
Remove duplicates based on email column
Export cleaned data to CSV
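If you prefer to clean a file before uploading it, the deduplication and export steps might look like this sketch. It assumes an `email` column, keeps the first occurrence of each address, and compares addresses case-insensitively after trimming whitespace; that comparison rule is an assumption, and both helper names are hypothetical.

```python
import csv
import io

def drop_duplicates(rows: list[dict[str, str]], key: str = "email") -> list[dict[str, str]]:
    """Keep the first row for each distinct value of the key column."""
    seen = set()
    unique = []
    for row in rows:
        marker = row[key].strip().lower()  # assumed normalization rule
        if marker not in seen:
            seen.add(marker)
            unique.append(row)
    return unique

def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

raw = [
    {"email": "a@example.com", "plan": "free"},
    {"email": "A@example.com ", "plan": "pro"},
    {"email": "b@example.com", "plan": "free"},
]
cleaned = drop_duplicates(raw)
print(to_csv(cleaned))
```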
Security & Privacy
Data Handling
- Files are processed in-memory (not permanently stored)
- Data is only visible to you during your session
- Files are automatically cleared when conversation ends
- No data is shared with other users
Sensitive Data
Caution:
- Don't upload files with sensitive PII unless authorized
- Check company policies before uploading confidential data
- Remember that AI models may see your data during processing
- Use database connections for production/sensitive data
Limitations
Size Limits
- Maximum file size: 100MB per file
- Maximum total size: 200MB per conversation
- Row limit: ~1 million rows (depending on the number of columns)
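You can check a set of files against these limits before uploading. The sketch below assumes 1MB means 1024 * 1024 bytes, which this page does not specify, and the `check_upload` helper is hypothetical.

```python
MB = 1024 * 1024  # assumption: the documented limits are binary megabytes
MAX_FILE_BYTES = 100 * MB
MAX_TOTAL_BYTES = 200 * MB

def check_upload(sizes: dict[str, int]) -> list[str]:
    """Return a list of limit violations for a mapping of file name to size in bytes."""
    problems = [
        f"{name} exceeds 100MB"
        for name, size in sizes.items()
        if size > MAX_FILE_BYTES
    ]
    if sum(sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("total exceeds 200MB")
    return problems

print(check_upload({"q1.csv": 90 * MB, "q2.csv": 90 * MB, "q3.csv": 90 * MB}))
```

For real files, the sizes can be collected with `os.path.getsize(path)`.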
Processing
- Very large files may take longer to process
- Complex Python operations may time out on large datasets
- Some Excel features (formulas, macros) are not preserved
File Types
- Only the supported formats listed above are accepted
- Non-tabular files (images, PDFs) cannot be analyzed
- Password-protected files are not supported
Troubleshooting
Upload Fails
Causes:
- File too large (over 100MB)
- Unsupported format
- File corrupted
Solutions:
- Split large files into smaller chunks
- Convert to supported format
- Verify file integrity
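Splitting a large CSV into smaller chunks can be done locally with a short script. This is a minimal sketch, assuming a UTF-8 file with a single header row; the `split_csv` name, the `_partN` file naming, and the default chunk size are all hypothetical choices.

```python
import csv
from pathlib import Path

def split_csv(path: str, rows_per_chunk: int = 100_000) -> list[Path]:
    """Write consecutive chunks of a CSV next to the source, repeating the header."""
    source = Path(path)
    outputs: list[Path] = []
    out = None
    writer = None
    with source.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # raises StopIteration on an empty file
        for index, row in enumerate(reader):
            if index % rows_per_chunk == 0:
                if out is not None:
                    out.close()
                target = source.with_name(
                    f"{source.stem}_part{len(outputs) + 1}{source.suffix}"
                )
                out = target.open("w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)  # every chunk gets its own header row
                outputs.append(target)
            writer.writerow(row)
    if out is not None:
        out.close()
    return outputs
```

For example, `split_csv("sales_2024.csv", rows_per_chunk=250_000)` would write `sales_2024_part1.csv`, `sales_2024_part2.csv`, and so on, each of which can be uploaded separately.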
Parsing Errors
Causes:
- Inconsistent column count
- Mixed data types in column
- Encoding issues
Solutions:
- Check file has consistent structure
- Ensure UTF-8 encoding
- Clean data in Excel before upload
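Two of the causes above, encoding issues and inconsistent column counts, can be checked locally before uploading. The sketch below is one possible approach with hypothetical helper names; it assumes a comma delimiter by default and reports row numbers counting the header as row 1.

```python
import csv
import io

def is_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def inconsistent_rows(text: str, delimiter: str = ",") -> list[tuple[int, int]]:
    """List (row number, field count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    return [
        (row_number, len(row))
        for row_number, row in enumerate(reader, start=2)
        if len(row) != len(header)
    ]

print(is_utf8("café".encode("latin-1")))                  # False
print(inconsistent_rows("a,b,c\n1,2,3\n4,5\n6,7,8,9\n"))  # [(3, 2), (4, 4)]
```

Blank lines are reported as rows with zero fields, which makes stray empty rows easy to spot.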
Slow Performance
Causes:
- Very large file
- Complex query
- Too many columns
Solutions:
- Filter to subset of rows
- Select specific columns needed
- Break analysis into steps
Export Results
After analyzing uploaded files, you can:
Export this result to CSV
Download the cleaned data
Save this chart as an image
For recurring analysis of the same file, consider using Scheduled Tasks with database import instead of manual uploads.