R2 SQL – R2 SQL now supports UNION, INTERSECT, EXCEPT, and SELECT DISTINCT

R2 SQL now supports set operations (UNION, INTERSECT, EXCEPT) and SELECT DISTINCT, expanding the range of analytical queries you can run directly on Apache Iceberg tables in R2 Data Catalog.

Set operations

Combine the results of multiple SELECT statements:

  • UNION — returns all rows from both queries, removing duplicates
  • UNION ALL — returns all rows from both queries, including duplicates
  • INTERSECT — returns only rows that appear in both queries
  • EXCEPT — returns rows from the first query that do not appear in the second
-- Find zones that had either firewall blocks OR high-risk requests
SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'
UNION
SELECT zone_id FROM my_namespace.http_requests WHERE risk_score > 0.8
-- Find zones with both firewall blocks AND high traffic
SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'
INTERSECT
SELECT zone_id FROM my_namespace.http_requests
GROUP BY zone_id
HAVING COUNT(*) > 10000
-- Find enterprise zones that have not been compacted
SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'
EXCEPT
SELECT zone_id FROM my_namespace.compaction_history

Select distinct

Eliminate duplicate rows from query results:

SELECT DISTINCT region, department
FROM my_namespace.sales_data
WHERE total_amount > 1000
ORDER BY region, department
LIMIT 100

For large datasets where approximate results are acceptable, approx_distinct() remains a faster alternative for counting unique values.

For the full syntax reference, refer to the SQL reference. For performance guidance, refer to Limitations and best practices.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *