data:image/s3,"s3://crabby-images/37fd6/37fd6cb1580d923034a4f8cfbab36d46cf0978c8" alt=""
How we use Rill Data for Data Analytics
Data analytics can feel complicated when you face sluggish dashboards, scattered data sources, and never-ending custom reports. Our team once wrestled with these same hurdles. Then we discovered Rill—a “BI as Code” approach that streamlines data flows, organizes dashboards, and runs at brisk speeds. We rely on Rill for many use cases, from e-commerce order insights to GitHub analytics.
Takeaways
- Rill cuts down on configuration overhead by letting you define sources, models, and dashboards in code.
- Edits, version control, and environment migrations become simpler because everything is code-based.
- Rill sits on fast backends (DuckDB or ClickHouse), so queries and calculations handle large datasets with minimal drag.
- We connect data sources (like CSV files, Snowflake, and more) to unify them under one roof.
- Rill’s transparent project structure makes it easy to bring developers, analysts, and stakeholders together.
Why Rill?
Many “all-in-one” business intelligence platforms promise to fix reporting problems with a few clicks. Often, you end up with large licensing fees, confusing user interfaces, or rigid setups. Rill addresses those headaches by adopting the “BI as Code” philosophy. Each data source, model, and dashboard config exists as a file you can track, review, and modify. This approach fosters a shared language between data engineers and analysts.
Aligning With “BI as Code”
We noticed a parallel with software development. When your entire analytics stack is encoded in text files, you can do the following:
- Keep everything in Git for version control.
- Push your Rill project to a new environment—such as a staging or production environment—and replicate dashboards easily.
- Simplify code reviews. Instead of rummaging through a cluttered user interface, your teammates check lines in a YAML or SQL file.
We found this especially useful for e-commerce data. We track orders from Amazon, Shopify, or retail locations like Target and Walmart. Instead of loading data manually, we rely on Rill’s coded definitions that pull, model, and aggregate everything automatically.
What Makes Rill Unique
Below are core capabilities of Rill that match our needs:
Simple Workflow
Rill organizes the pipeline into three steps: source → model → dashboard. You can add new data sources in a “sources” folder, write modeling SQL in a “models” folder, and define dashboards in a “dashboards” folder. That structure keeps us from mixing raw data ingestion with final metrics.
Backend Choices
You can run Rill on DuckDB, ClickHouse, or your existing warehouses. DuckDB, in particular, stands out for single-file analytics. You can run queries on local files (Parquet or CSV) at blazing speed, which is ideal for testing new data ideas. If your datasets expand, ClickHouse can handle billions of rows.
Time-Series Exploration
Much of our analysis involves time-based metrics—like daily orders, weekly traffic, or monthly revenue. Rill automatically surfaces time-based charts. You get a line chart that you can drill into, or combine with dimension filters (like shipping city or product type) in a few clicks.
Performance at Scale
We’ve tested Rill on moderate to large data, from thousands of rows to tens of millions. The combination of columnar storage and optimized queries means results come quickly, even when slicing many dimensions. This performance is crucial for dashboards used by cross-functional teams that need near-instant feedback.
Strong Community
Rill has an active Discord channel, great documentation, and an ecosystem of example projects. We’ve gleaned tips for app engagement dashboards or GitHub analytics from these references. The official docs (docs.rilldata.com) and user-shared projects help new users get started.
Step-by-Step: How We Built a Rill Project
Below is the flow we often follow. Imagine we’re analyzing orders across multiple stores, focusing on shipping costs, potential refunds, and the effect of discount codes.
1. Install Rill and Kick Off the Project
If you want to give Rill a try, open your terminal and run:
curl https://rill.sh | sh
This command fetches the Rill executable.
Once installed, create a new project folder:
rill start my-rill-project
Rill initializes the folder with subdirectories for sources, models, dashboards, and more.
You’ll see something like:
my-rill-project
├─ sources/
├─ models/
├─ dashboards/
├─ rill.yaml
└─ ...
Once that’s done, head to http://localhost:9009 in your browser. Rill displays a project UI where you can add data sources or view logs.
2. Add Data Sources
We store e-commerce order info in CSV files on Amazon S3 and operational logs in a Snowflake table. Rill lets us define each source with a simple YAML file.
For example:
type: "s3"
uri: "s3://mycompany-bucket/orders-2022.csv"
Once this file is in the sources folder, Rill pulls that data into DuckDB or ClickHouse (depending on your setup). If you’re using Snowflake or BigQuery, define the connection string, along with credentials. With each new data source, you repeat the same pattern. This approach also works for custom data from in-house ERP systems or app engagement logs.
3. Create a Model
A model in Rill is a SQL query that references your raw source data. It’s a place to clean, enrich, or join multiple tables before building final dashboards.
For example:
-- Inside models/all_orders_model.sql
SELECT
SELLING_PLATFORM,
FULFILLMENT_CHANNEL,
ORDER_ID,
ORDER_TIMESTAMP,
YEAR(ORDER_TIMESTAMP) AS YEAR,
MONTH(ORDER_TIMESTAMP) AS MONTH,
ITEM_TOTAL_PRICE,
TOTAL_QUANTITY,
SHIPPING_STATE,
DISCOUNT_CODE,
CASE WHEN HAS_REFUND THEN TOTAL_REFUND_AMOUNT ELSE 0 END AS REFUND_AMOUNT,
CASE WHEN TOTAL_QUANTITY >= 5 THEN 'Bulk Buyer' ELSE 'Standard Buyer' END AS BUYER_SEGMENT
FROM ALL_ORDERS
WHERE YEAR(ORDER_TIMESTAMP) >= 2022
This query picks columns we care about (price, quantity, shipping details). It also creates a new field called BUYER_SEGMENT if the total quantity is above a certain threshold. If you have multiple data sources—for instance, shipping rates from a separate table—you could do a JOIN to unify them. Each model is stored in a .sql file.
4. Generate a Dashboard
With your model ready, you can either auto-generate a dashboard in the Rill UI or create a YAML file in the dashboards folder. The dashboard definition references the model name, sets the main time field, and defines dimensions and measures.
Here’s a sample snippet:
type: metrics_view
title: All Orders Model Metrics
model: all_orders_model
timeseries: ORDER_TIMESTAMP
dimensions:
- label: Selling Platform
column: SELLING_PLATFORM
- label: Shipping State
column: SHIPPING_STATE
- label: Buyer Segment
column: BUYER_SEGMENT
measures:
- name: total_orders
label: Total Orders
expression: COUNT(ORDER_ID)
- name: total_revenue
label: Total Revenue
expression: SUM(ITEM_TOTAL_PRICE)
format_preset: currency_usd
- name: total_refunds
label: Total Refunds
expression: SUM(REFUND_AMOUNT)
format_preset: currency_usd
After saving this file, refresh your Rill web interface. You’ll see a dashboard with a line chart for daily or monthly metrics (depending on your date granularity). On the side, you can filter by shipping state, buyer segment, or time range.
Practical Use Cases
Below are three real scenarios where Rill brightened our analytics work.
E-commerce Shipping Analysis
We use Rill to see how shipping costs vary by zone. We define a dimension for shipping states (like California, New York, etc.) and a measure that sums shipping fees. By filtering for a specific month, we spot spikes in shipping expenses for certain regions. Then we investigate if those spikes connect with discount code usage or heavier product bundles.
GitHub Analytics for Engineering
If you run open-source projects or track your own repos, you might link Rill to a data feed of GitHub events. This could be done by exporting commit logs and pull request data. Then you define a “github_analytics_model.sql” that calculates top contributors, code churn, or number of new issues raised each week. Once that’s in place, your Rill dashboard reveals coding activity patterns and helps your engineering managers see who needs help or which repos are busiest.
App Engagement Metrics
Mobile or web apps produce logs: user sessions, sign-ups, funnel steps, or click events. Those logs might be stored as CSV, Parquet, or in a warehouse. Rill can unify them to show how new feature releases correlate with engagement spikes. You can add a dimension for “feature flag” or “app version” and track daily active users. If there’s a slump after a certain release, you’ll spot it quickly and can react.
“Powered by DuckDB” Performance
When we mention “powered by DuckDB,” we’re talking about a columnar database embedded in your local environment or server. DuckDB queries data at a speed that often surprises new users. You can handle multi-gigabyte CSV files or Parquet data without the overhead of big cluster management.
For a project with modest volumes (like a few hundred million rows), DuckDB can handle queries in seconds. We’ve enjoyed how it allows quick iteration—when you alter the model’s SQL and refresh, results appear with minimal wait. If your dataset outgrows local machines, you can pivot to Rill + ClickHouse for more robust handling. That path keeps your existing definitions (sources, models, dashboards) mostly intact.
Rill Cloud for Deployment
When you’re happy with a local Rill project, you can deploy it to Rill Cloud. This step simplifies sharing the dashboards with your colleagues. After signing up for a Rill Cloud account, you run:
rill deploy
Rill walks you through an in-browser authentication flow. It then uploads your code-based project. In the cloud, you can invite viewers or editors with specific roles and permissions. If you want to refine your approach to role-based access or security, Rill’s docs have guidelines for safeguarding data.
Challenges and Solutions
Even with Rill’s strengths, we’ve faced a few bumps:
Data Source Permissions
If your data sits in a locked-down warehouse, you must provide Rill with valid credentials. Managing secrets is crucial. A best practice is to store credentials outside of Git, or use a password manager.
Complex Joins
Some advanced transformations demand multi-step logic. We break them into more than one model file. The first model might unify product details with user data; the second applies grouping or aggregates. Splitting transformations keeps each model readable.
Dashboard Overload
A single Rill dashboard can show many measures. We found that too many measures or dimensions led to confused users. Instead, we keep dashboards minimal and create separate dashboards for major topics—like shipping performance vs. marketing promotions.
Balancing Realism vs. Synthetic Data
In certain demos or training sessions, you might need synthetic data. Rill helps load those CSVs easily, but ensure you randomize enough so that no real user data leaks. If you’re working with sensitive info, check your compliance standards or consider scanning your synthetic sets with a data-privacy tool.
Evolving Requirements
Team requests shift over time—like new columns for subscription orders or shipping weight brackets. With the code-based style, changes are simpler. Just add columns in your model SQL or update the dashboard YAML. Then push it to Rill Cloud.
Best Practices
- Label Columns Clearly: When you define measures, give them meaningful names. Instead of “sum(x),” call it “Total Orders” or “Average Product Cost.”
- Create a Model Step for Every Key Calculation: Resist stuffing complicated logic into raw SQL within a single file. Break it out. That way, you can debug partial transformations.
- Use a Dev Environment: If you have a team of analysts, keep a separate environment for tests. Deploy changes there. Then push to production only after final review.
- Document Each Dashboard: Within the YAML, add a brief note about each dimension or measure. Over time, new team members will understand what “ORDER_TYPE” or “REFUND_REASON” means.
- Check Performance Early: If you plan to handle hundreds of millions of rows, run a stress test. DuckDB might suffice, but sometimes you’ll want ClickHouse or a specialized warehouse.
Conclusion
If you’re seeking a simpler path to unify data sources, build trustworthy metrics, and share dashboards with minimal friction, Rill could be your next step. Whether you’re tackling e-commerce, GitHub analytics, or custom data for app engagement, the “BI as Code” mindset is a useful way to keep your analysis clear, versioned, and easy to evolve. Try it yourself. Draft a quick Rill project, define a model, create a dashboard, and share it. You may find, as we did, that code-based analytics is far more manageable than point-and-click solutions. Happy building!
Transforming raw data into
actionable insights
We help businesses boost revenue, save time, and make smarter decisions with Data and AI