Posts

Snowflake Mar-Apr 2026 Data Engineering Update

Snowflake’s 2026 release stream sharpens its role as a governed lakehouse control plane. Updates span Apache Iceberg integration, external query engine interoperability, Cortex metadata intelligence, stronger governance, external volume management, and dynamic table execution.
🔹 Iceberg on Azure DLS External Volumes
Data Engineering Impact: Register Iceberg tables directly in Unity Catalog while metadata lives in ADLS Gen2. This avoids duplicated metadata silos and enables cross-cloud lakehouse patterns.
Practical Use Case: Pharma pipelines storing clinical trial data in ADLS can register Iceberg tables in Snowflake for governance, while ML workloads in Databricks query the same datasets. (Snowflake Docs)
🔹 Horizon + External Query Engine Access
Data Engineering Impact: Horizon acts as a federation layer: external engines (Spark, Trino, F...

Apache Iceberg: The Open Table Format Reshaping the Data Lakehouse

Data Engineering · Open Table Formats · 2025
Apache Iceberg: The Open Table Format Quietly Winning the Data Wars
Born at Netflix to solve petabyte-scale chaos, Apache Iceberg has become the industry's de facto standard for the modern data lakehouse, and for good reason.
By Arabinda Mohapatra · Published May 2025 · Read time ~18 min
$1B+: Databricks acquires Tabular (founded by Iceberg's creators), 2024
100%: Snowflake commits to Apache Iceberg as sole open format
7+: Major cloud / engine providers natively supporting Iceberg
#1: Most planned-adoption format per Dremio's 2024 survey
01: Origin Story. Netflix Had a Problem. A Petabyte-Scale Problem.
It was 2017, and Netflix's data engineers were fi...

Snowflake Caching Performance Explained — A Clear, Practical Benchmark Story

Originally published on LinkedIn: https://www.linkedin.com/pulse/snowflake-caching-performance-explained-clear-story-mohapatra-p0ric/?trackingId=XOnrAVSWV7bM%2FOvYlOboOA%3D%3D
Migrated on: 2026-04-05
The query result cache is essential for repeated query performance. By reusing the results of recently run queries, it drastically reduces both time and resource consumption.
🛠️ How It Works: If the exact same query is repeated within 24 hours and the underlying data has not changed, Snowflake returns the cached result. This optimization can save significant compute costs in repetitive reporting environments.
👩‍💻 Real-World Example: A marketing analyst rerunning customer engagement reports throughout the day gets much faster responses from the query result cache, as long as the underlying data has not changed.
Managing Warehouse Cache for Efficiency
The warehouse cache stores data...
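The reuse rule described in this excerpt (same query text, within 24 hours, unchanged data) can be sketched as a toy Python model. This is only an illustration of the rule as stated, not Snowflake's implementation: `ResultCacheModel`, `data_version`, and the `execute` callback are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict

CACHE_TTL = timedelta(hours=24)  # reuse window stated in the excerpt


@dataclass
class CachedResult:
    rows: Any
    cached_at: datetime
    data_version: int  # hypothetical stand-in for "underlying data unchanged"


class ResultCacheModel:
    """Toy model of the reuse rule; all names here are assumptions."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedResult] = {}
        self.executions = 0  # how many times the query actually ran

    def run(
        self,
        query_text: str,
        data_version: int,
        now: datetime,
        execute: Callable[[], Any],
    ) -> Any:
        entry = self._entries.get(query_text)  # exact text match only
        if (
            entry is not None
            and now - entry.cached_at <= CACHE_TTL
            and entry.data_version == data_version
        ):
            return entry.rows  # served from cache, no compute used
        # Cache miss: run the query and remember the result.
        self.executions += 1
        rows = execute()
        self._entries[query_text] = CachedResult(rows, now, data_version)
        return rows
```

In this model, the analyst's repeated report hits the cache until either the 24-hour window passes or the data version changes, at which point the query runs again.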

Snowflake Backup & Data Recovery – Key Concepts

Originally published on LinkedIn: https://www.linkedin.com/pulse/nowflake-backup-data-recovery-key-concepts-arabinda-mohapatra-yvwfc/
Migrated on: 2026-04-05
What is Time Travel?
Time Travel is like a time machine for your data. It allows you to go back to a specific point in the past and see or recover data as it was at that time. This is extremely useful when you need to undo accidental deletions or changes.
🔑 Key Insights:
Retention Period: You can set how long Snowflake keeps past versions of your data. The default is 1 day, but in Enterprise Edition or above you can extend this up to 90 days.
Easy Recovery: With Time Travel, you can query, restore, or clone data as it was at any point during the retention period.
Rollback Mistakes: You can fix user errors by simply rolling back to a previous version of your table or database.
🚫 1. External Tables: Why They Are Not Cloned
External tables only store m...
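The retention rule from the Time Travel notes in this excerpt (default of 1 day, up to 90 days in Enterprise Edition or above) can be sketched as a small Python check. This is a minimal sketch under stated assumptions: the edition labels and the function names `validate_retention` and `is_recoverable` are hypothetical and not part of any Snowflake API.

```python
from datetime import datetime, timedelta

DEFAULT_RETENTION_DAYS = 1        # default stated in the excerpt
MAX_EXTENDED_RETENTION_DAYS = 90  # Enterprise Edition or above

# Hypothetical edition labels, used only for this illustration.
EXTENDED_EDITIONS = {"enterprise", "business_critical"}


def max_retention_days(edition: str) -> int:
    """Upper bound on the retention period for a given edition."""
    if edition.lower() in EXTENDED_EDITIONS:
        return MAX_EXTENDED_RETENTION_DAYS
    return DEFAULT_RETENTION_DAYS


def validate_retention(edition: str, days: int = DEFAULT_RETENTION_DAYS) -> int:
    """Return days if the edition allows it, otherwise raise ValueError."""
    limit = max_retention_days(edition)
    if days < 0 or days > limit:
        raise ValueError(f"{edition}: retention must be between 0 and {limit} days")
    return days


def is_recoverable(changed_at: datetime, now: datetime, retention_days: int) -> bool:
    """True if a change made at changed_at is still inside the retention window."""
    return now - changed_at <= timedelta(days=retention_days)
```

For example, a row deleted four days ago would be recoverable under a 90-day retention setting but not under the 1-day default.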

7AM DataEngineering Sunrise - Digest Week 14, 2026

7AM DataEngineering NewsDigest
Compiled: 2026-04-05 06:58:41
1. OCSF explained: The shared data language security teams have been missing
Source: VentureBeat
The security industry has spent the last year talking about models, copilots, and agents, but a quieter shift is happening one layer below all of that: vendors are lining up around a shared way to describe security data. The Open Cybersecurity Schema Framework (OCSF) is emerging as one of the strongest candidates for that job. It gives vendors, enterprises, and practitioners a common way to represent security events, findings, objects, and context.
2. Nvidia launches enterprise AI agent platform with Adobe, Salesforce, SAP among 17 adopters at GTC 2026
Source: VentureBeat
Jensen Huang walked onto the GTC stage Monday wearing his trademark leather jacket and carrying, as it turned out, the blueprints for a new kind of industry dom...