<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[SeattleDataGuy’s Newsletter]]></title><description><![CDATA[Learn About End-To-End Data Flows (Data Engineering, MLOps, and Data Science) ]]></description><link>https://seattledataguy.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!fov7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fea9135cc-f9d6-4856-8596-2ca9a1655cb6_256x256.png</url><title>SeattleDataGuy’s Newsletter</title><link>https://seattledataguy.substack.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 13 Mar 2026 14:31:49 GMT</lastBuildDate><atom:link href="https://seattledataguy.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[SeattleDataGuy]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[seattledataguy@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[seattledataguy@substack.com]]></itunes:email><itunes:name><![CDATA[SeattleDataGuy]]></itunes:name></itunes:owner><itunes:author><![CDATA[SeattleDataGuy]]></itunes:author><googleplay:owner><![CDATA[seattledataguy@substack.com]]></googleplay:owner><googleplay:email><![CDATA[seattledataguy@substack.com]]></googleplay:email><googleplay:author><![CDATA[SeattleDataGuy]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Layer by Layer, We Built Data Systems No One Understands]]></title><description><![CDATA[How data stacks turn into fractals]]></description><link>https://seattledataguy.substack.com/p/layer-by-layer-we-built-data-systems</link><guid 
isPermaLink="false">https://seattledataguy.substack.com/p/layer-by-layer-we-built-data-systems</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 02 Mar 2026 22:56:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MWgB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Today I want to talk about data stacks and layering on complexity&#8230; </p><p>Before we jump into today&#8217;s article, I wanted to let y&#8217;all know that it is sponsored by me, the Seattle Data Guy! </p><p>Our team has helped dozens of companies turn data into actual business outcomes. We&#8217;ve also helped companies set up their data stack from the ground up as well as untangle their current data infrastructure. If you&#8217;re looking for an experienced data consulting team to help you set up your data infrastructure and strategy, then set up some time with me <a href="https://calendly.com/ben-rogojan/consultation?month=2026-03">today</a>!</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>Tech folk are like onions; we have layers.</p><p>Actually&#8230;it&#8217;s more like we love the idea of layers.</p><p>Network layers.</p><p>Medallion architecture.</p><p>Layer, after layer, after layer.</p><p>On one side, these layers help delineate where one process or job starts and another ends.</p><p>But on the other side, we tend to keep piling more and more layers on top of each other. Adding new roles, new tools, new platforms.</p><p>All to make things &#8220;easier&#8221;.</p><p>Think of the Modern Data Stack.</p><p>Snowflake, Databricks, and all the other tools. They did make things easier in so many ways. 
But also, I&#8217;ve helped companies save millions on compute costs.</p><p>Many times, because data teams kept adding layer upon layer that just needed to be removed.</p><p>Don&#8217;t worry, we&#8217;ll discuss that later. Let&#8217;s start by discussing the pros and cons of systems and tools that make development easy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hSMT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hSMT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hSMT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg" width="1456" height="2504" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2504,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:954645,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/189481631?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hSMT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hSMT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574d86b9-6b45-43de-98e3-591a1a2bc198_1882x3236.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Our Solution Makes It Easy - How It Happens</h2><p>Every generation of tools promises the same thing:</p><blockquote><h3>&#8220;We&#8217;ll make this easier.&#8221;</h3></blockquote><p>What do they actually do?</p><p>Add another layer.</p><p>I just read an article discussing how <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs&amp;t=1s">Databricks</a> is trying to position itself in that light.</p><p>All it made me think of was one of the first articles I wrote nearly a decade ago, about Tableau and how the very fact that it was so easy to use made it dangerous.</p><p>And I can tell you from experience that this is true, even though that ease is also helpful. 
You can test out new ideas, dig into data faster, etc.</p><p>Of course, the key problem has never been pumping out more code, more dashboards, more artifacts&#8230;more layers&#8230;faster. In fact, in many ways, doing so has created its own set of new problems. You&#8217;ve got to deal with every type of sprawl under the sun.</p><ul><li><p><strong>BI sprawl</strong></p></li><li><p><strong>Pipeline sprawl</strong></p></li><li><p><strong>Model sprawl</strong></p></li><li><p><strong>Agent sprawl</strong></p></li><li><p><strong>Cost sprawl</strong></p></li><li><p><strong>System sprawl</strong></p></li></ul><p>The image below doesn&#8217;t even show all of that, and it&#8217;s already starting to sprawl. Inside each of those boxes is its own complex sprawl of code, dashboard metrics, calculations, and locally optimized workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MWgB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MWgB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!MWgB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!MWgB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MWgB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MWgB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:332289,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/189481631?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MWgB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!MWgB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!MWgB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!MWgB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dfa805d-81b2-4a61-8685-e0130eae81a9_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><blockquote><p><strong>These systems end up like fractals. 
Every box you go into has yet another set of boxes and arrows.</strong></p></blockquote><p>There are many reasons for this.</p><ul><li><p>A department head wanted to run their own <a href="https://estuary.dev/blog/ai-data-pipeline/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">AI</a>/<a href="https://www.linkedin.com/posts/benjaminrogojan_building-a-data-engineering-project-starting-activity-7265448573025099778-UVGA/">Data project</a></p></li><li><p>The data analysts and engineers wanted to use different tools</p></li><li><p>No one wanted to actually make a decision, so you picked all the tools</p></li><li><p><a href="https://seattledataguy.substack.com/i/174656011/outputs-without-outcomes">Outcomes</a> were not the key focus</p></li></ul><p>This can lead to exploding costs and systems that become harder and harder to maintain.</p><p>And no amount of data governance lathered on top of all this will make it more sane.</p><p>You&#8217;ve got a data catalog for your data catalogs because hey, why not!</p><p>You&#8217;ve got orchestrators for your orchestrators.</p><p>And you&#8217;re ingesting data from your ingestion tool about how your ingestions are running.</p><p>We&#8217;ll likely need AI not only to develop code faster but also to understand what in the world is going on inside some of these spaghetti systems we are constructing.</p><h2>The Devil&#8217;s Advocate - Spending More On Tech Makes Sense, No?</h2><p>But let me take the other side of this.</p><p>On the flip side, some companies I&#8217;ve worked with didn&#8217;t have full-time data engineering teams or had far smaller ones than in the early 2020s.</p><p>Sure, in some cases, maybe they spent an extra $125k a year on Databricks and/or Snowflake&#8230;</p><p>But they saved a significant amount on hiring a DE team. Based on the average data engineer salary, if you can cut two data engineers from your team and spend an extra $125k on Snowflake and Databricks&#8230;that&#8217;s not a bad deal.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KdoL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KdoL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 424w, https://substackcdn.com/image/fetch/$s_!KdoL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 848w, 
https://substackcdn.com/image/fetch/$s_!KdoL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 1272w, https://substackcdn.com/image/fetch/$s_!KdoL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KdoL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png" width="1114" height="1062" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1062,&quot;width&quot;:1114,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:153363,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/189481631?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KdoL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 424w, 
https://substackcdn.com/image/fetch/$s_!KdoL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 848w, https://substackcdn.com/image/fetch/$s_!KdoL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 1272w, https://substackcdn.com/image/fetch/$s_!KdoL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff02c193b-65c8-4a42-b2dd-fe3ae9c002b4_1114x1062.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We are removing people from the layers and replacing them with technology!</p><p>Yes, there might be some random scripts calling other random scripts, but it all works, right?</p><p>I actually recall one friend joking that, sure, a company could put effort into optimizing its data model, its <a href="https://www.youtube.com/watch?v=htAipJ6yYFs">data pipelines</a>, etc.</p><p>Or, it could crank its <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ&amp;t=5s">Snowflake</a> instance up a few notches, pay a little more, and fire half the <a href="https://www.youtube.com/watch?v=wyGAYa2UMXQ&amp;list=PLXRKPZRrlvE6chKJl5jxcZHIeCOPl_bfz">data engineering team</a>, and it would be significantly cheaper.</p><p>Some companies are essentially doing this now. Instead of paying a full-time data engineer, they hire someone to help set up their data stack and then come in every so often to simply maintain it. I know, as I&#8217;ve been that someone.</p><h2>The Realities Of What This Leads To</h2><p>Here are some realities I believe we will continue to see in this AI-driven data world where we seem to be building yet more technical layers.</p><h3>Business Logic Bloat Will Only Get Worse </h3><p>When <a href="https://estuary.dev/blog/efficient-elt-with-estuary-flow-and-dbt/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">dbt</a> became popular, one thing many data teams found was that model bloat was real. Now that a broader set of users could build them, you&#8217;d keep adding more and more models, many of which managed to obfuscate the business logic hidden inside them. </p><p>The easier it is to translate business logic into code, the more you take out of people&#8217;s brains and put into technology. In some ways that&#8217;s great. 
In other ways, it means some decisions will push code to cover even more edge cases than we have in the past. Because, why not? Just ask your LLM to code one more; it costs you nothing, right?</p><h3>People Will Still Struggle To Connect Things To The Business </h3><p>I believe that, despite being able to automate more and do more technically speaking, teams will still run into a problem similar to the one we have today, at least until LLMs figure this out: people will still struggle to connect business value to technical output. We gave more people access to data, yet many companies are still struggling to answer basic questions and, in turn, struggling to make decisions. Some of this, I believe, is due to team sizes and to so many teams specializing in technologies rather than the business. So there is a chance that we might be able to avoid this, but I don&#8217;t think we will (at least not for a while).</p><h3>The Fractals Will Grow</h3><p>In theory, I believe we should take some of this tooling, LLMs, etc., and start removing layers of tech to simplify tech stacks. In reality, this is not what I see. I love that <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Joe Reis&quot;,&quot;id&quot;:3531217,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;uuid&quot;:&quot;99596f32-c06d-4b58-8f8b-7e055c5ced17&quot;}" data-component-name="MentionToDOM">Joe Reis</span> pointed out the 37-tool data stacks from the 2020s. He might even be undercounting.</p><p>Data stacks weren&#8217;t just ETL, <a href="https://www.theseattledataguy.com/data-warehousing-essentials-a-guide-to-data-warehousing/">data warehouse</a>, data visualization, and <a href="https://www.theseattledataguy.com/26-data-catalogs-from-open-source-to-managed/">data catalog</a>. 
I&#8217;ve come across <a href="https://www.theseattledataguy.com/etls-vs-elts-why-are-elts-disrupting-the-data-market-data-engineering-consulting/#page-content">ETL</a> &#8594; data lakehouse &#8594; another ETL tool and orchestration tool &#8594; data warehouse &#8594; yet another <a href="https://www.theseattledataguy.com/what-are-etls-and-why-we-use-them/">ELT</a> &#8594; database &#8594; semantic layer &#8594; six different BI tools. Not to mention some custom code in between! </p><p>I don&#8217;t think we&#8217;ll be getting rid of that. I think most companies will just layer AI on top of all of it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u_yO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u_yO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 424w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 848w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!u_yO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg" width="522" height="478" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:478,&quot;width&quot;:522,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30940,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/189481631?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u_yO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 424w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 848w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!u_yO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7920db4e-4b25-43c7-99d7-f05c0356420d_522x478.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>We Are Growing The Pie Of Automated Use Cases</h3><p>I do think we will grow the pie in terms of how easy it is to build automated use cases. Whether the results are better is hard to say. I lean towards thinking most people will just add complexity for complexity&#8217;s sake, because that&#8217;s what I&#8217;ve seen in the prior few hype cycles. In each one, some tooling expands the usage of a popular idea.</p><ul><li><p><strong>Machine learning becomes popular</strong> - Here are some Python libraries you can use to implement models without understanding them. 
You&#8217;ll still struggle to find good places to implement them and to know how to deploy them properly.</p></li><li><p><strong>Big Data becomes all the rage</strong> - Here are some tools that make handling big data easier. Companies will still struggle to become data-driven and dashboards still take too long to load.</p></li></ul><p>I am sure you could find even more examples there. Technology makes a lot of things easier, but it&#8217;s still challenging to connect good ideas with outcomes.</p><h2>Tactical Takeaway</h2><p>Before adding another layer, ask yourself three questions:</p><ol><li><p><strong>What problem does this layer actually solve?</strong> - I find many engineers become a &#8220;Snowflake&#8221; Data Engineer or &#8220;Databricks&#8221; Data Engineer, which, sure, is great. But you eventually get so divorced from outcomes that you can easily end up creating layers and work for the sake of it.</p></li><li><p><strong>What happens if we don&#8217;t add it?</strong> - More often than not, the answer is &#8220;nothing breaks,&#8221; while adding it means &#8220;we actually now need to hire someone to take care of this layer.&#8221;</p></li><li><p><strong>Who owns it six months from now?</strong> - Because every layer eventually becomes someone&#8217;s problem.</p></li></ol><p>If you can&#8217;t answer all three clearly, you&#8217;re not building leverage.</p><p>You&#8217;re building liability.</p><p>The goal isn&#8217;t fewer tools for the sake of it, nor is it more tools for more&#8217;s sake.</p><p>The goal is a system you can actually understand, maintain, and tie back to the business.</p><p>And that takes more effort than just signing another contract for a new tool or building another ten dashboards.</p><p>As always, thanks for reading! I hope to see you in the next one.</p><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! 
I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>From &#8220;Vibe Coding&#8221; to Guided Coding</h2><p>Enterprise engineering teams face a paradox: while generic AI tools increase individual speed, they often introduce unpredictability, security risks, and technical debt at scale.</p><p>This white paper details a pilot program conducted by <strong>Brainly</strong> in partnership with <strong>Codestrap</strong>. By deploying <strong>Larry AI</strong>, an agent powered by <strong>X-Reason</strong>&#8482; technology, Brainly successfully demonstrated that <strong>guided workflows</strong> can outperform generic AI tools in reliability, cost, and safety.</p><p>The pilot focused on &#8220;LLMifying&#8221; Brainly&#8217;s complex Data Access Object (DAO) layer and building a custom Larry AI workflow for this area of the codebase. The results were decisive: Larry AI achieved <strong>90% one-shot correctness</strong>, reduced inference costs by <strong>~96%</strong>, and, most critically, enabled mid-level engineers to safely ship code that previously required deep domain expertise.</p><p><a href="https://medium.com/brainly/from-vibe-coding-to-guided-coding-d2ba7e526ff3">Read More Here</a></p><h2>Speed Without Understanding - One of the Biggest Risks in Data Engineering</h2><blockquote><h3><strong>&#8220;I just nuked all our dashboards.&#8221;</strong></h3></blockquote><p>That&#8217;s the title of a heavily discussed post on the data engineering subreddit (post and link included at the bottom of this section). 
When I first read it, I assumed the author was going to talk about deleting all their dashboards, no one noticing, and then wondering if their job even mattered.</p><p>That wasn&#8217;t the case.</p><p><a href="https://seattledataguy.substack.com/p/speed-without-understanding-one-of">Read More Here</a></p><div><hr></div><h2>End Of Day 212</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Backfills - The Necessary Evil of Data Engineering]]></title><description><![CDATA[Why backfills happen, why we hate them, and how to handle them without breaking trust]]></description><link>https://seattledataguy.substack.com/p/backfills-the-necessary-evil-of-data</link><guid 
isPermaLink="false">https://seattledataguy.substack.com/p/backfills-the-necessary-evil-of-data</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 23 Feb 2026 23:37:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b664aa03-fe21-4c43-b5b5-dcdd154cbfb1_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>One thing most of us data engineers dislike is backfilling. Why is that? And what does backfilling require?</p><p>Before we jump into talking about backfills, I wanted to share a bit about <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a platform I&#8217;ve used to help make clients&#8217; data workflows easier and am an adviser for. Estuary helps teams easily move data in real-time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. If you want to simplify your data workflows, check them out today.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>At some point, if you work in data, whether you&#8217;re an analyst or a data engineer.</p><p>You&#8217;re going to have to do it.</p><p>You&#8217;re going to have to backfill a table.</p><p>Actually, it&#8217;ll probably be pretty early in your career. Backfilling or rerunning a pipeline is just a necessity, AI or not.</p><p>There are plenty of reasons why you might need to backfill a table&#8230;sadly.</p><p>Talking to data engineers&#8230;many of them dislike the process of backfilling.</p><p>So let&#8217;s start there. Let&#8217;s discuss why we backfill as data teams and why we dislike it so much.</p><h2>Why Do We Need to Backfill Data?</h2><p>Backfills exist for many reasons. For example, systems are not static, and people make mistakes. 
In turn, tables and data sets need to be rerun.</p><p>Some of the most common reasons you&#8217;ll need to backfill include:</p><ol><li><p><strong>Late or corrected source data -</strong> Upstream systems change historical records all the time. Maybe the data was being recorded incorrectly, or if you&#8217;re getting <a href="https://www.theseattledataguy.com/the-basics-of-sftp-authentication-encryption-and-file-management/">SFTP</a> files, they might have been sending you bad files that no one caught. Now you&#8217;re going to have to backfill at least a specific date or customer cut of that data.</p></li><li><p><strong>Bugs in your <a href="https://seattledataguy.substack.com/p/why-data-pipelines-exist">data pipelines</a> - </strong>A common reason why tables need to be backfilled is that there was a bug. Something was wrong, and maybe just running an update statement isn&#8217;t sufficient. So now you&#8217;ve got to rebuild the table with the new logic without disrupting end-users.</p></li><li><p><strong>Schema or logic changes -</strong> This was a common reason I needed to backfill at Facebook. When columns were removed at Facebook or, in some cases, certain data type conversions were required, we&#8217;d have to rebuild the table. 
Then add in new logic changes or sources for data, and you&#8217;ll likely need to reload the entire table.</p></li></ol><p>These are, of course, only a few reasons!</p><h2>Why Data Engineers Dislike Backfilling</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lBNz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lBNz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 424w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 848w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 1272w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lBNz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png" width="1098" height="676" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:676,&quot;width&quot;:1098,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107749,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/188760181?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lBNz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 424w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 848w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 1272w, https://substackcdn.com/image/fetch/$s_!lBNz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe00228fa-c3ff-4d1f-9123-58373a35f5e8_1098x676.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you ask a data engineer what they don&#8217;t like doing, I am sure backfilling will be one of the few things they reference, besides migrations.</p><p>Here are a few reasons why.</p><ul><li><p><strong>Scale</strong>- In some cases, backfills mean waiting hours, if not a day or two, to rerun a table. At Facebook, sometimes backfilling a table meant needing to rerun thousands of jobs for each partition and each of the upstream tasks. That means there are a lot of ways things can go wrong.</p></li><li><p><strong>Cost</strong> - You&#8217;d better be sure your backfill updates are right. 
Having to rerun a backfill job on pay-as-you-go technology will be expensive, especially if you have to reload the data from the raw layer.</p></li><li><p><strong>Time-Consuming</strong> - There are multiple ways backfills take time. They can bump into daily jobs, especially if you are on-prem. They also take time out of the day of an engineer who has to ensure all the data is accurate and runs as expected. It&#8217;s just one giant time suck that keeps the <a href="https://seattledataguy.substack.com/p/why-your-data-team-doesnt-have-a">data team</a> from delivering new work.</p></li><li><p><strong>Blast Radius</strong> - So you&#8217;ve built a table that everyone at your company uses and relies on. Great, now it&#8217;s going to take even longer to backfill. I had multiple cases where I&#8217;d be backfilling a table that had dozens, if not hundreds, of end-users. You&#8217;re going to need to update them and make sure they know what&#8217;s happening and whether they need to do anything. If they have to modify their pipelines, then that further drags out your process.</p></li><li><p><strong>Trust</strong> - If stakeholders see numbers change unexpectedly, they are going to question it. 
So one of the many reasons data teams dislike backfills, especially if they have to do it frequently, is that it can erode trust in the overall dataset.</p></li></ul><h2>Backfilling &#8220;Controversy&#8221;</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!v0u4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!v0u4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!v0u4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:330375,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/188760181?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!v0u4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!v0u4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc2b2f8b-980e-4aa9-931d-6015880e5694_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/posts/eczachly_dataengineering-activity-7212537967418986496-uFpc/">Source</a></figcaption></figure></div><p>As I was going through to see what other people had said about backfilling over the past few years, I ran into a discussion on a post between <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Zach Wilson&quot;,&quot;id&quot;:10367987,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!GhRS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a857d08-ec8d-4a0e-9cb5-ad8434fe519e_2333x3500.jpeg&quot;,&quot;uuid&quot;:&quot;9841f887-5ebd-4750-81a1-196b5cff72af&quot;}" 
data-component-name="MentionToDOM"></span> and <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Brian Greene&quot;,&quot;id&quot;:9220278,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5832da43-cefb-4e7d-b5f4-cd229fa57fb6_400x400.jpeg&quot;,&quot;uuid&quot;:&quot;f442358d-b0ab-418e-ba07-e69f335125e1&quot;}" data-component-name="MentionToDOM"></span> from over a year ago (not trying to restart a fight here, just making sure people don&#8217;t feel like I am talking about them behind their backs).</p><p>The argument itself became a little heated, but stripping away that component, I do think there is something worth talking about.</p><p>I think Brian and Zach have different experiences in different systems that, in turn, have different requirements for backfilling.</p><p>A goal you should have when backfilling (beyond the obvious one of getting the data backfilled) is not to run a bunch of random <a href="https://seattledataguy.substack.com/p/back-to-the-basics-with-sql-understanding?utm_source=publication-search">SQL</a> scripts against production.</p><p>Instead, you should create a repeatable process that balances making changes safely with giving yourself space to ensure the new data is correct.</p><p></p>
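<p>As a rough sketch of what a repeatable process could look like (my own illustration, not the approach either person in that discussion described): rebuild one date partition into a staging table, validate it, and only then swap it into the table people query. The table names <code>raw_orders</code>, <code>daily_orders</code>, and <code>daily_orders_staging</code>, the function <code>backfill_partition</code>, and the choice of SQLite are all assumptions made so the example runs anywhere.</p>

```python
import sqlite3

# Hypothetical rebuild logic: the "fixed" aggregation for one date partition.
REBUILD_SQL = """
    INSERT INTO daily_orders_staging (ds, customer_id, revenue)
    SELECT ds, customer_id, SUM(amount)
    FROM raw_orders
    WHERE ds = ?
    GROUP BY ds, customer_id
"""


def backfill_partition(conn, ds, rebuild_sql):
    """Rebuild one date partition of daily_orders by way of a staging table."""
    # Step 1: rebuild off to the side, so production is untouched.
    with conn:
        conn.execute("DELETE FROM daily_orders_staging WHERE ds = ?", (ds,))
        conn.execute(rebuild_sql, (ds,))

    # Step 2: validate before anything user-facing changes.
    # (A real check would be richer; "not empty" is the bare minimum.)
    staged = conn.execute(
        "SELECT COUNT(*) FROM daily_orders_staging WHERE ds = ?", (ds,)
    ).fetchone()[0]
    if staged == 0:
        raise ValueError(f"staging is empty for {ds}; production left untouched")

    # Step 3: swap the partition inside one transaction (all or nothing).
    with conn:
        conn.execute("DELETE FROM daily_orders WHERE ds = ?", (ds,))
        conn.execute(
            "INSERT INTO daily_orders "
            "SELECT ds, customer_id, revenue FROM daily_orders_staging WHERE ds = ?",
            (ds,),
        )
    return staged


# Tiny in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (ds TEXT, customer_id INTEGER, amount REAL);
    CREATE TABLE daily_orders (ds TEXT, customer_id INTEGER, revenue REAL);
    CREATE TABLE daily_orders_staging (ds TEXT, customer_id INTEGER, revenue REAL);
    INSERT INTO raw_orders VALUES
        ('2026-02-01', 1, 10.0), ('2026-02-01', 1, 5.0), ('2026-02-01', 2, 7.5);
    INSERT INTO daily_orders VALUES ('2026-02-01', 1, 999.0);
""")

rows_loaded = backfill_partition(conn, "2026-02-01", REBUILD_SQL)
result = conn.execute(
    "SELECT customer_id, revenue FROM daily_orders WHERE ds = ? ORDER BY customer_id",
    ("2026-02-01",),
).fetchall()
print(rows_loaded, result)
```

<p>Because the swap deletes and reinserts a single partition inside one transaction, rerunning the same date gives the same result, and a failed validation leaves the production table exactly as it was.</p>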
      <p>
          <a href="https://seattledataguy.substack.com/p/backfills-the-necessary-evil-of-data">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Why Data Pipelines Exist]]></title><description><![CDATA[Beyond Moving Data From Point A To B]]></description><link>https://seattledataguy.substack.com/p/why-data-pipelines-exist</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/why-data-pipelines-exist</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 09 Feb 2026 23:54:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hIcA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Today we&#8217;ll be getting back into my series on data pipelines. One question that I believe is important to answer is why. Why even build data pipelines?</p><p>But before we jump in, I wanted to share a bit about <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a platform I&#8217;ve used to help make clients&#8217; data workflows easier and am an adviser for. Estuary helps teams easily move data in real-time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. If you want to simplify your data workflows, check them out today.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>When I first started in the data world, no one around me used the term data pipeline.</p><p>I heard terms like integrations, automations and ETL.</p><p>In fact, I am not even sure when I first came across the term. 
But if you&#8217;re a data engineer in this modern era, then much of your time is spent building, maintaining, and keeping data pipelines running smoothly.</p><p>Even with AI, you&#8217;re probably still finding yourself opening up 3,000-line queries and the occasional custom data pipeline system.</p><h3>What a Data Pipeline Actually Does</h3><p>When you look at data pipelines, here is what people will likely say they do.</p><ol><li><p>Move data from a source to a destination</p></li><li><p>Sometimes they transform that data</p></li><li><p>And they do all of this repeatedly and reliably without human intervention</p></li></ol><p>That&#8217;s the technical function of a data pipeline.</p><p>How it happens can vary.</p><p>This could be automated SQL, Python scripts, <a href="https://www.theseattledataguy.com/what-is-apache-airflow-data-engineering-consulting/">Airflow</a>, Estuary, SSIS, Glue, and so many other tools.</p><p>But you do need to think beyond just this when it comes to data pipelines.</p><p>Pulling in a recent post from <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Zach Wilson&quot;,&quot;id&quot;:10367987,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!GhRS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a857d08-ec8d-4a0e-9cb5-ad8434fe519e_2333x3500.jpeg&quot;,&quot;uuid&quot;:&quot;fbd3bc60-5533-4cca-9183-549a0a1e1bfa&quot;}" data-component-name="MentionToDOM"></span>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Rx_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!3Rx_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 424w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 848w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 1272w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Rx_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png" width="1026" height="890" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:890,&quot;width&quot;:1026,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:174169,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/187244152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Rx_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 424w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 848w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 1272w, https://substackcdn.com/image/fetch/$s_!3Rx_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70615d2c-6b41-43ca-97eb-0bac692d48b2_1026x890.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 
24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It&#8217;s important to think beyond just moving data from A to B and to start thinking in terms of outcomes and <a href="https://seattledataguy.substack.com/p/thinking-like-an-owner-elevating">ownership</a>.</p><p>What is the data pipeline actually doing?</p><h2>The Real Reason Data Pipelines Exist: Trust</h2><p>We alluded to this above, but let&#8217;s talk about why data pipelines exist. Because, hey, we could just manually load data into databases.</p><p>Just use:</p><p><code>COPY INTO analytics.raw_orders</code></p><p><code>FROM @raw_stage/orders/</code></p><p><code>FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)</code></p><p><code>ON_ERROR = 'CONTINUE';</code></p><p>Done!</p><p>No need to automate anything, right?</p><p>After all, we are just moving data from point A to B.</p><p>Well, there are many reasons we automate data workflows and turn them into data pipelines. Here are the key benefits we get.</p><ul><li><p>Timeliness</p></li><li><p>Accuracy</p></li><li><p>Consistency</p></li><li><p>Recoverability</p></li><li><p>Scalability</p></li></ul><p>But it goes even beyond just <strong>recoverability and consistency. </strong>We really are trying to make data more valuable. To do so, we must also consider these pillars.</p><ul><li><p>Integration</p></li><li><p>Availability</p></li><li><p>Outcomes</p></li></ul><p>Now I will say, not every data pipeline you build will have the goals listed above. 
Especially the ones listed as pillars.</p><p>In some cases, data pipelines merely integrate data from a CRM into another internal system, one that has nothing to do with reporting.</p><p>Still others might pull data out of your CRM, run some calculations, and reinsert it.</p><p>I say this merely to point out that not all data pipelines are used to push data into data warehouses. There are plenty of other reasons a data pipeline might exist.</p><h2>Why You Need To Care About These</h2><h3>Integration</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TgV7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TgV7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 
424w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 848w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 1272w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TgV7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46204,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/187244152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!TgV7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 424w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 848w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 1272w, https://substackcdn.com/image/fetch/$s_!TgV7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ca0784-18f3-4f84-ab09-cc789ae76cf0_1024x768.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Many data teams aren&#8217;t building data warehouses; they are just replicating their databases and CRMs into <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ&amp;t=2s">Snowflake</a> or <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs">Databricks</a>. The result is the same siloed data that once lived in separate systems, now sitting in separate schemas as un-integrated data sets.</p><p>Part of the logic the data pipeline is supposed to handle (as determined by the <a href="https://www.youtube.com/watch?v=gG7upg6QaBI&amp;feature=youtu.be&amp;sttick=0">data modeling</a> process) is integration: the parsing, cleaning, and adding of keys that allow you to join data across systems. 
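</p><p>To make the idea of adding join keys concrete, here is a minimal Python sketch. It assumes a CRM export and a billing export that both carry an email column; the field names and the <code>normalize_email</code> and <code>join_on_email</code> helpers are hypothetical, chosen only for illustration.</p><pre><code>def normalize_email(raw):
    """Clean an email so the same person gets the same key in every system."""
    return raw.strip().lower()


def join_on_email(crm_rows, billing_rows):
    """Match CRM contacts to billing customers using the cleaned email as the key."""
    billing_by_key = {normalize_email(row["email"]): row for row in billing_rows}
    joined = []
    for contact in crm_rows:
        key = normalize_email(contact["email"])
        match = billing_by_key.get(key)
        if match is not None:
            joined.append(
                {"email_key": key, "crm_id": contact["id"], "billing_id": match["id"]}
            )
    return joined


# Hypothetical rows: the same customer, typed differently in each system.
crm = [{"id": 1, "email": " Ana@Example.com "}]
billing = [{"id": "b-9", "email": "ana@example.com"}]
print(join_on_email(crm, billing))
</code></pre><p>In a real warehouse this logic would more likely live in SQL or a transformation tool, but the principle is the same: clean the value first, then use it as the shared key.</p><p>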
This also means you&#8217;ll likely need to consider what data sets will need to join with each other in the source systems themselves.</p><h3>Availability And Usability</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xtsh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xtsh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xtsh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:387809,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/187244152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xtsh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Xtsh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81d561-37db-42f1-8f79-4194297c0117_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Many data workflows require data analysts to go to the source systems and extract the data into Excel. From there, they need to manually process it, set up VLOOKUPs, and build out a &#8220;database&#8221; in Excel.</p><p>Part of what data pipelines do is move data into the data warehouse, making said data easier to access.</p><p>And this is not just for end-users like analysts, but also automations and LLMs. 
Having data centralized means it&#8217;s easier to work with said data, especially when it&#8217;s well modeled.</p><p>The easier data is for the right users to access, the more they can actually use it for.</p><h3>Scalability</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hIcA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hIcA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hIcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:360422,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/187244152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hIcA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!hIcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8cb1c-13ce-4acc-8731-ce27d697190a_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>At a certain point, having half-automated scripts run by cron might be too chaotic. Sure, if you only need 2-3 simple data workflows managed, this might be fine.</p><p>But as your data use cases grow.</p><p>As your <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data team</a> grows.</p><p>As the end-users of said data grow.</p><p>You&#8217;ll want data pipeline systems that make it easy to automate.</p><p>Think about having to rerun 200 data pipelines. That&#8217;s logistically difficult if you can&#8217;t easily kick all the jobs off and track their successes or failures. 
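</p><p>As a rough sketch of what kicking off jobs and tracking their results could look like, here is a small Python runner built on the standard library. The <code>run_all</code> helper and the two stand-in jobs are hypothetical, made up for this example; this is not how any particular orchestrator works internally.</p><pre><code>from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(jobs, max_workers=8):
    """Run every job and record which ones succeeded and which ones failed."""
    summary = {"succeeded": [], "failed": {}}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to its job name.
        futures = {pool.submit(job): name for name, job in jobs.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()  # re-raises any exception from the job
                summary["succeeded"].append(name)
            except Exception as error:
                summary["failed"][name] = repr(error)
    summary["succeeded"].sort()
    return summary


def load_orders():
    """Stand-in for a real pipeline that finishes cleanly."""


def load_customers():
    """Stand-in for a real pipeline that hits a problem."""
    raise ValueError("bad source file")


summary = run_all({"orders": load_orders, "customers": load_customers})
print(summary)
</code></pre><p>A dedicated orchestrator gives you this kind of tracking, plus scheduling and retries, without the hand-rolled code.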
</p><h3>Outcomes</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lgFb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lgFb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lgFb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg" width="607" height="499" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:607,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44362,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/187244152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lgFb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lgFb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22357333-5e15-42ee-b082-79d33925abe8_607x499.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Data pipelines can easily be built without a clear purpose. But I think it&#8217;s important to think about the &#8220;so what&#8221;. Why are you building your data pipeline?</p><p>Is it to automate a process, and if so, does it need to ingest data into the <a href="https://www.youtube.com/watch?v=0DsaafI1fTQ">data warehouse</a>?</p><p>What business goal are you hoping to drive by building your pipeline? Every new data pipeline you build without a clear purpose just becomes a technical liability over time. 
It increases cost and maintenance time.</p><p>So what is your team hoping to do with the data pipeline?</p><p>Here are a few examples. You could say your data pipeline:</p><ul><li><p>Reduces unnecessary discounting by analyzing win/loss data and discounts to show where deals close without price concessions.</p></li><li><p>Improves onboarding success by identifying which onboarding steps and early product behaviors correlate with long-term retention.</p></li><li><p>Reduces support costs by linking support tickets to product events to eliminate the root causes driving repeat issues.</p></li><li><p>Increases retention through proactive customer success by alerting CS teams when usage drops or support volume spikes.</p></li></ul><h3>Timeliness</h3><p>One of the great things about data pipelines is that they are easy to track and can run whenever you need them to.</p><p>Meaning, if you need them to prepare a data set prior to 8 AM, they can do that. You know how long it&#8217;ll take (assuming nothing goes wrong, and even then, you can likely set up some level of recoverability).</p><p>An analyst doesn&#8217;t have to wake up early to make sure the data gets processed in an Excel file. Instead, it can land in a table and be picked up as needed.</p><h3>Accuracy</h3><p>We in the data world love talking about <a href="https://www.theseattledataguy.com/why-your-team-needs-to-implement-data-quality-for-your-ai-strategy/#page-content">data quality</a>. Well, the data pipeline is one of the many places where data can be transformed improperly. Data can be duplicated, removed, and/or altered in such a way that it is no longer accurate.</p><p>That also makes data pipelines a great place to check for data issues.</p><p>These checks can start before you even process the data, by confirming that the source contains the expected fields and ranges of data. 
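</p><p>To make that kind of pre-processing check concrete, here is a small Python sketch. The <code>EXPECTED_FIELDS</code> schema, the amount limits, and the <code>check_source_rows</code> helper are all assumptions made up for this example, not a standard from any data quality tool.</p><pre><code>EXPECTED_FIELDS = {"order_id", "amount", "order_date"}  # hypothetical schema


def check_source_rows(rows, min_amount=0.0, max_amount=100000.0):
    """Return a list of problems found in the rows before any processing."""
    problems = []
    for index, row in enumerate(rows):
        missing = EXPECTED_FIELDS - set(row)
        if missing:
            problems.append(f"row {index}: missing fields {sorted(missing)}")
            continue
        amount = float(row["amount"])
        if amount > max_amount or min_amount > amount:
            problems.append(f"row {index}: amount {amount} is out of range")
    return problems


# Hypothetical source rows: one clean, one out of range, one missing a field.
rows = [
    {"order_id": 1, "amount": "19.99", "order_date": "2026-03-01"},
    {"order_id": 2, "amount": "-5", "order_date": "2026-03-01"},
    {"order_id": 3, "order_date": "2026-03-01"},
]
for problem in check_source_rows(rows):
    print(problem)
</code></pre><p>If the list comes back non-empty, the pipeline can stop or alert before any bad rows reach a downstream table.</p><p>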
From there, as you <a href="https://seattledataguy.substack.com/p/understanding-the-t-in-etl-a-back">transform</a> the data throughout your pipeline, you&#8217;ll likely need to implement other checks.</p><h3>Consistency</h3><p>The problem with &#8220;Excel data pipelines&#8221; is they offer room for errors. You copy and paste the wrong data set or forget to update a formula.</p><p>A programmed data pipeline is repeatable and consistent.</p><p>You can create logic to check for errors, such as the wrong data being inserted or dimensional data going missing. So even if you do have an issue, the pipeline can flag it early. It also helps avoid fat-finger issues where someone accidentally types in the wrong number.</p><h3>Recoverability</h3><p>Sometimes, the wrong data enters a data workflow. We want to be able to detect that and then rerun our data processes easily without having to worry about what else could go wrong.</p><p>We don&#8217;t want to worry about duplicate data.</p><p>We don&#8217;t want to worry about a small step being missed.</p><p>So having the process codified ensures we know exactly what will happen when the data tables are populated.</p><h2>Final Thoughts And What Is Coming</h2><p>Data pipelines are everywhere in companies. 
They take many different shapes and forms, but overall, their goal is to do more than just move data from point A to B.</p><p>In case you missed my last article, we&#8217;ve already covered some of the various data pipelines that exist (I even labeled one the Excel Data Pipeline in the past).</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;18c4422c-751c-4e4e-95d4-a9f8d234a363&quot;,&quot;caption&quot;:&quot;Hi, fellow future and current Data Leaders; Ben here &#128075;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Common Data Pipeline Patterns You&#8217;ll See in the Real World&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:4963622,&quot;name&quot;:&quot;SeattleDataGuy&quot;,&quot;bio&quot;:&quot;#Data #Engineer, Strategy Development Consultant and All Around Data Guy #deeplearning #machinelearning #datascience #tech #management https://www.youtube.com/@SeattleDataGuy/videos&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec905aa-9a7b-4f21-b0ff-fec92e8916d1_512x512.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-01-05T19:58:06.510Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!uaVa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://seattledataguy.substack.com/p/common-data-pipeline-patterns-youll&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:183018775,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:76,&quot;comment_count&quot;:3,&quot;publication_id&quot;:21105,&quot;publication_name&quot;:&quot;SeattleDataGuy&#8217;s Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!fov7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fea9135cc-f9d6-4856-8596-2ca9a1655cb6_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>In the next few articles, I&#8217;ll be covering several other key data pipeline topics:</p><ol><li><p>Backfills: The Thing Everyone Avoids Until It&#8217;s Too Late</p></li><li><p>Building your first data pipeline, from Excel to Airflow</p></li><li><p>Incremental vs Full Refresh Pipelines</p></li><li><p>Daily Tasks With Data Pipelines - Data Quality Checks And The Problem With Noisy Checks</p></li></ol><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>When Did &#8220;Rock&#8221; Become &#8220;Classic Rock&#8221;? A Statistical Analysis</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Daniel Parris&quot;,&quot;id&quot;:112812180,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!AmpE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25a9035a-fbd9-4f33-aa36-2548ca85140b_2048x1536.jpeg&quot;,&quot;uuid&quot;:&quot;a23fb49b-1748-473d-9a2e-8cb09ae0a436&quot;}" data-component-name="MentionToDOM"></span> </p><p>I first grasped the strangeness of the term &#8220;<em>classic</em> rock&#8221; while listening to a pop-punk song. 
In 2004, Bowling for Soup topped the charts with &#8220;1985,&#8221; a track about a frustrated housewife nostalgically longing for her Reagan-era youth.</p><p>During the song&#8217;s bridge, our protagonist laments cultural change: &#8220;She hates time, make it stop. When did M&#246;tley Cr&#252;e become classic rock?&#8221; It was at this moment that I&#8212;a teenager&#8212;first understood the peculiarity of genre:</p><ol><li><p>Apparently, there was a band called M&#246;tley Cr&#252;e.</p></li><li><p>This motley crew was initially classified as one genre&#8212;rock&#8212;before being reclassified as &#8220;classic rock.&#8221;</p></li><li><p>Contrary to my longstanding belief, &#8220;classic rock&#8221; was not etched into the Ten Commandments&#8212;it was a contemporary radio format, likely devised by someone in public relations.</p></li></ol><p><a href="https://www.statsignificant.com/p/when-did-rock-become-classic-rock">Read More Here</a></p><h2>Snowflake vs Databricks Is the Wrong Debate</h2><p>Over the last few years, Databricks has been executing a strategy to take over the entire data workflow.</p><p>Maybe it never started that way.</p><p>Maybe when they first came out, they only ever planned to be a managed Spark solution. But I have a hard time believing that, mostly because I believe their leadership has the vision and capabilities to see far beyond that.</p><p>Databricks has always been pretty upfront that they want to be the end-to-end data stack. But they&#8217;ve been approaching it piece by piece.</p><p>Or should I say role by role?</p><p>Obviously, at first, their focus was on the data scientist and ML engineer.</p><p><a href="https://seattledataguy.substack.com/p/snowflake-vs-databricks-is-the-wrong">Read More Here</a></p><div><hr></div><h2>End Of Day 210</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[5 Key Predictions for the Data Industry in 2026]]></title><description><![CDATA[Hype Cycles, Rebrands, and the Messy Reality of Data]]></description><link>https://seattledataguy.substack.com/p/5-key-predictions-for-the-data-industry-b7c</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/5-key-predictions-for-the-data-industry-b7c</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Sat, 31 Jan 2026 19:29:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!74LM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data 
Leaders; Ben here &#128075;</p><p>Today I am taking a pause from my data pipeline series (it&#8217;ll start again next issue) to share some thoughts about where the data world is headed over the next year or so, along with a few other trends.</p><p>But before we jump in, I wanted to share a bit about <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a platform I advise and have used to make clients&#8217; data workflows easier. Estuary helps teams easily move data in real time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. If you want to simplify your data workflows, check them out today.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>One-twelfth of the year is over, at least by months, and somehow it feels like a year&#8217;s worth of events has already occurred.</p><p>By the end of 2025, dozens of companies were swallowed up. Everyone wanted to buy everyone, and here we are in 2026, seeing more of that as well as some pretty slick new AI model releases.</p><p>But let&#8217;s turn towards the future, and specifically, data.</p><p>Here is what I believe we&#8217;ll see happen in the next year or two.</p><h2>1) Microsoft Fabric Will Rebrand&#8230;Again</h2><p>If you scroll around LinkedIn enough, you&#8217;ll likely find a few posts about Microsoft Fabric not being the &#8220;it&#8221; tool. 
As some people have put it:</p><blockquote><p><a href="https://www.linkedin.com/posts/eczachly_things-im-unnecessarily-political-about-activity-7334706587472711680-GS5m/?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAA3roGYByurxK9YsOLOqN2Mn748HOZhjuSE">Microsoft Fabric is Databricks from Temu</a></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YS9x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YS9x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 424w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 848w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YS9x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png" width="1040" height="1126" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1126,&quot;width&quot;:1040,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1011229,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/186367636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!YS9x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 424w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 848w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!YS9x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3129bc7c-fe79-42f4-a63c-275059c954fb_1040x1126.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/in/christiansteinert96/">Source</a></figcaption></figure></div><p>Now, that doesn&#8217;t mean it&#8217;s not growing. According to their recent earnings, <a href="https://www.microsoft.com/en-us/investor/earnings/fy-2026-q2/press-release-webcast#:~:text=constant%20currency,business%20highlights">Azure is growing at 39% Y/Y</a>.</p><p>But if the sentiment around Fabric keeps trending in the wrong direction, I&#8217;d predict Microsoft rebrands its data stack&#8230;again.</p><p>Especially with all the AI hype. They&#8217;d be able to push the narrative that the new solution is AI-first and come up with a great new name.</p><p>Looking back over the past decade, Microsoft&#8217;s done this several times. 
To the point where it was a little confusing in terms of which tool is which.</p><p>So maybe they&#8217;ll try yet again.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!74LM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!74LM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!74LM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!74LM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!74LM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!74LM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:380325,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/186367636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!74LM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!74LM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!74LM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!74LM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c2401d-dd70-4bdf-986d-14b0a3c7c50d_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2) 1% Of Companies Will Continue To Cry For AI While The Other 99% Are Still Trying To Export ERP Outputs To Excel</h2><p>There is a lot of demand for AI. It needs to be integrated everywhere, right?</p><p>On the flip side, many companies are still sharing data via <a href="https://www.theseattledataguy.com/the-basics-of-sftp-authentication-encryption-and-file-management/">SFTP</a> or pulling it from an API. Sure, maybe they used AI to help write the code faster.</p><blockquote><p><em><strong>But then they went to scroll on Instagram afterwards.</strong></em></p></blockquote><p>You know what AI tool I want to see (and maybe I should take a crack at it)? 
I want to see an AI solution that lets me take an Excel spreadsheet, drop it in, and it automatically builds out a <a href="https://seattledataguy.substack.com/p/why-your-data-pipeline-probably-isnt">data pipeline</a> that can replace it perfectly. Or as perfect as it can&#8230;</p><p>It&#8217;d pull out all the formulas, turn them into <a href="https://seattledataguy.substack.com/p/back-to-the-basics-with-sql-understanding?utm_source=publication-search">SQL</a> or Python logic, and put them into a larger system of data pipelines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Idq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Idq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-Idq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:331145,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/186367636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Idq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!-Idq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82211ef9-41d4-47f8-9bc1-8f1aef64f6a1_1024x768.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Because we aren&#8217;t going to replace Excel anytime soon. Every company has it no matter what <a href="https://seattledataguy.substack.com/p/were-all-living-in-different-data?utm_source=publication-search">data decade</a> they are in. Excel is capturing business logic.</p><p>So why fight it?</p><p>It&#8217;s just too easy to build a quick <a href="https://seattledataguy.substack.com/i/183018775/excel-data-pipelines">spreadsheet</a> that quickly turns into a core component in a business workflow. 
</p><p>No one wants to fill out a restrictive form, and sure, coding is nice, but it&#8217;s also heavy.</p><p>So why not just open a spreadsheet, build it, and have an easy button to turn it into a <a href="https://seattledataguy.substack.com/p/common-data-pipeline-patterns-youll">pipeline</a>?</p><p>It&#8217;s just that easy, right?</p><h2>3) Modern Data Stacks Will Be Shaken</h2><p>With all the start-ups that <a href="https://seattledataguy.substack.com/p/from-boom-to-bundle-the-great-consolidation?utm_source=publication-search">have been bought up</a> recently, and others raising prices, I foresee a shake-up in the default approaches that companies use to build their data stacks.</p><p>On top of that, I wouldn&#8217;t be surprised if we see a wave of companies needing their data stacks rebuilt from the ground up.</p><p>Between fragile data pipelines, changing pricing models, sunsetting solutions, and <a href="https://seattledataguy.substack.com/p/what-is-query-driven-data-modeling?utm_source=publication-search">Just-in-Time data models</a>, people will take another look at what they built and see if it actually meets their needs.</p><p><em>By the way, great segue: if your data team needs help revamping your data infrastructure, whether you&#8217;re on Databricks, Snowflake, BigQuery, or all of the above, <a href="https://calendly.com/ben-rogojan/consultation?month=2026-02">then reach out for a consultation!</a> </em></p><p>This opens the door for new solutions and, I hope, for building more solid data foundations. And if you need a good way to pitch that to data leadership&#8230; 
</p><p>Just call it &#8220;AI-foundations&#8221;.</p><h2>4) AI POCs Will Start To Build Actual Foundations</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uZ_T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uZ_T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uZ_T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:396937,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/186367636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uZ_T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!uZ_T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11715200-1a5b-4e17-8e41-b844a3ae8535_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Over the last few years, we&#8217;ve been fed what feels like endless new terms and patterns for how to make LLMs useful. I believe we&#8217;ll start seeing more crystallized patterns for how companies are actually planning to use LLMs (for more than just writing code).</p><p>Because for every fifty projects that were driven by hype, there are one or two where the engineers focused on delivering a reliable solution. When they came up against problems, they didn&#8217;t rush through them or ignore them.</p><p>They actually spent time trying to figure out how to work with what they were getting. 
They tried to figure out the actual value of the LLM beyond the surface-level use cases.</p><p>It takes time to develop design patterns and processes, and soon we&#8217;ll have enough iterations that some teams will be able to reliably execute ideas. Over the past few tech hype cycles, the general process I&#8217;ve seen is:</p><ol><li><p><strong>New capability appears - </strong>A breakthrough hits (LLMs, streaming, blockchain, &#8220;big data&#8221;, etc.). Early demos look magical, but the real constraints aren&#8217;t understood yet.</p></li><li><p><strong>Everyone builds the obvious thing first - </strong>For LLMs, this was:</p><ul><li><p>Chatbots everywhere</p></li><li><p>&#8220;Ask your data anything&#8221; demos</p></li><li><p>Code generation and copilots</p></li></ul></li><li><p><strong>Reality sets in - </strong>Teams start to run into problems. Think hallucinations, cost blowups, security and governance concerns, and in some cases things just don&#8217;t work as expected. You need to start integrating safeguards and best practices (that no one has created yet; you&#8230;you are the one creating them!).</p></li><li><p><strong>Patterns start to crystallize - </strong>Every new capability comes with limitations. But you won&#8217;t know them until you implement it, until it hits scale or has to handle an edge case you hadn&#8217;t considered.</p></li><li><p><strong>Becomes a standard - </strong>Eventually new capabilities are viewed as a standard piece of infrastructure. 
Integrated in such a way that you notice it less because it fits smoothly into the rest of your flow.</p></li><li><p><strong>The hype fades and we figure out where the new capabilities fit best - </strong>We go from &#8220;this new thing can solve all problems&#8221; to &#8220;here is where this new capability is really good at solving our real problems.&#8221;</p></li></ol><h2>5) Snowflake Will Rediscover Themselves</h2><p>Although I don&#8217;t like thinking in terms of Snowflake vs <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs&amp;pp=ygUbRGF0YWJyaWNrcyBzZWF0dGxlIGRhdGEgZ3V5">Databricks</a>&#8230;I do have a hard time not comparing them to each other&#8230;</p><p>As an outsider looking in, Snowflake&#8217;s vibes are off (as the kids would say).</p><p>At their core, they offer a solid data warehouse solution, but their overall strategy, to me, is unclear.</p><p>They have heavily relied on partners for a long time, but now you get the sense that they want to start pushing into other functionalities while still being friendly with their partners.</p><p>In my opinion, this has led to some of their recent feature add-ons lacking commitment. 
They could make their dbt integration great, but right now I find it just fine. I want it to feel less like tacked-on functionality and more like a well-integrated part of Snowflake. </p><p>I believe Snowflake could implement it in such a way that dbt Cloud becomes unnecessary. But maybe they don&#8217;t want to. They want to straddle being both a partner-driven business and an all-in-one data solution.</p><p>Databricks, on the other hand, is pushing to solidify its hold on data engineers and analysts. They&#8217;ve had a foothold over a good portion of the data engineering identity, and now they are <a href="https://seattledataguy.substack.com/p/snowflake-vs-databricks-is-the-wrong">partnering with Alex the Analyst for analytics</a>. And that&#8217;s just the marketing side.</p><p>When it comes to their product, you know Databricks wants to be an all-in-one tool. Sure, they have partners, but they&#8217;ve stuck to their core identity; they are all-in-one.</p><p>I know I should bring data on this, and I am trying to think of what would be a good alternative data source to prove what my gut is saying. But Snowflake gives the vibe that it&#8217;s lost its way. 
I really enjoyed reading the book <a href="https://www.amazon.com/Playing-Win-Strategy-Really-Works/dp/142218739X">Playing to Win: How Strategy Really Works</a>.</p><p>Most of the book discusses how to apply the diagram below when building a strategy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n27B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n27B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 424w, https://substackcdn.com/image/fetch/$s_!n27B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 848w, https://substackcdn.com/image/fetch/$s_!n27B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!n27B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n27B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg" width="640" height="544" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:544,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:45655,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/186367636?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n27B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 424w, https://substackcdn.com/image/fetch/$s_!n27B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 848w, https://substackcdn.com/image/fetch/$s_!n27B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!n27B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0322a154-a795-4a17-8e1c-be5fcafaaf2a_640x544.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.amazon.com/Playing-Win-Strategy-Really-Works/dp/142218739X">Source</a></figcaption></figure></div><p>When I overlay that thinking on Databricks vs Snowflake, I can see that Databricks has committed to its choices and where it is playing, whereas Snowflake hasn&#8217;t.</p><p>I do think Snowflake will find its way, one way or another, this year.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/p/5-key-predictions-for-the-data-industry-b7c/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://seattledataguy.substack.com/p/5-key-predictions-for-the-data-industry-b7c/comments"><span>Leave a comment</span></a></p><h2>Final Thoughts</h2><p>The data world is still suffering from many of the same challenges it has faced for decades. Yes, we&#8217;ve added new tools and solutions, but businesses are still trying to find value from their data without getting too distracted by hype.</p><p>There are a lot of high-level posts and articles on driving value via data, but I think there is a gap when it comes to discussing patterns of value that most businesses could easily find.</p><p>That&#8217;s one of the future series I want to put out after the data pipeline article.</p><p>So keep an eye out!</p><p>As always, thanks for reading!</p><h2>Video Of The Week - Common Data Pipeline Patterns You&#8217;ll See in the Real World - Types Of Data Pipelines You&#8217;ll Build</h2><div id="youtube2-htAipJ6yYFs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;htAipJ6yYFs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/htAipJ6yYFs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>What It Actually Takes to Build a Data Pipeline System</h2><p>When I first started in the data world, it was common for data teams to build their own data pipeline solutions. 
There were still dozens of off-the-shelf tools, of course. Nevertheless, you&#8217;d see custom pipelines developed everywhere.</p><p>In 2025, I saw less of this.</p><p>In fact, in many cases data teams would go straight to picking tools or solutions.</p><p>But let&#8217;s say you do want to go down this route. You want to build your own data pipeline solution.</p><p>How would you do it?</p><p><a href="https://seattledataguy.substack.com/p/what-it-actually-takes-to-build-a">Read More Here</a></p><h2>How Uber Scaled Data Replication to Move Petabytes Every Day</h2><p>Uber prioritizes a reliable data lake, which is distributed across on-premise and cloud environments. This multi-region setup presents challenges for ensuring reliable and timely data access due to limited network bandwidth and the need for seamless data availability, particularly for disaster recovery. Uber uses the <a href="https://www.uber.com/en-IN/blog/building-ubers-data-lake-batch-data-replication-using-hivesync/">Hive Sync service</a>, which uses Apache Hadoop&#174; Distcp (Distributed Copy) for data replication. However, with Uber&#8217;s Data Lake exceeding 350 PB, Distcp&#8217;s limitations became apparent. This blog explores the optimizations made to Distcp to enhance its performance and meet Uber&#8217;s growing data replication and disaster recovery needs across its distributed infrastructure.</p><p><a href="https://www.uber.com/blog/scaled-data-replication/?uclick_id=d18f0a93-dfe0-4149-8312-c83e316eb816">Read More Here</a></p><div><hr></div><h2>End Of Day 209</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[The Analytical Skills No One Teaches You]]></title><description><![CDATA[Estimation, Baselines, Root Cause Analysis, and Metrics That Actually Matter]]></description><link>https://seattledataguy.substack.com/p/the-analytical-skills-no-one-teaches</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/the-analytical-skills-no-one-teaches</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Fri, 23 Jan 2026 16:49:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!IO_a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fc71e6a-f351-4775-ac3d-60240e16d141_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data 
Leaders; Ben here &#128075;</p><p>Today we&#8217;ve got an amazing guest author, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Olga Berezovsky&quot;,&quot;id&quot;:10490439,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c458316-65d0-4bcb-a698-dca783e2f875_3224x3224.jpeg&quot;,&quot;uuid&quot;:&quot;ba2a7a07-7067-4b61-abb4-fea199b8363c&quot;}" data-component-name="MentionToDOM">Olga Berezovsky</span>!</p><p>Olga is an analytics and data science leader focused on building impactful data products and helping businesses turn insights into better business decisions. She brings a strong ability to bridge business and technology and is passionate about mentoring analysts and growing high-performing data teams.</p><p>If you&#8217;re looking to learn more about everything analytics, then you should check out her <a href="https://dataanalysis.substack.com/">newsletter</a>. It&#8217;s filled with great content and how-tos.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>When I asked Olga to write an article, I wanted it to be focused on skills that data professionals don&#8217;t get taught explicitly. </p><p>There aren&#8217;t a lot of videos out there on how to deliver an impactful analysis to executives.</p><p>Even when it comes to running an analysis, many of us likely had to feel around in the dark a bit. I was just speaking to another data science leader who said they had to have an executive essentially take them aside and let them know their analyses weren&#8217;t great.</p><p>There are so many of these skills that analysts and engineers alike have to pick up on the job, and no one tends to tell you what is good or bad.</p><p>So let&#8217;s talk about some of those skills you need to start working on!</p><h2>1. 
<a href="https://dataanalysis.substack.com/p/how-to-develop-analytical-intuition">How To Develop Analytical Intuition</a></h2><p>Many companies will ask candidates questions that might seem out of the blue. For example: how many dentists are in the world? </p><p>What they are trying to gauge is your analytical intuition. </p><p>Essentially, given a problem with limited information, can you come up with a reasonable framework or approach to answer the question, or at least know what information you&#8217;d need to look for in the future?</p><p>Here are some tips if you&#8217;re working to improve your analytical intuition. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vUVM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vUVM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!vUVM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:307802,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181944390?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vUVM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!vUVM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795318e7-5e34-4b3c-b43f-586ff2d1f3fd_1024x768.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Ability to set reasonable ranges and break down sampling:</h3><p><strong>Example</strong>: How many windows are in New York City? How many teachers are there in the world?</p><p><strong>How to</strong>: Start with an educated guess based on something related to the question: a proxy value that you do have some intuition about. 
Then, work your way toward a ballpark estimate using averages and scaling logic.</p><h3>Critical thinking: What goes up must come down</h3><p>If a value goes up, it should come down by a similar degree at some point, and vice versa. That&#8217;s why we use continuous distributions and probabilities - to account for natural variance, not just isolated changes.</p><p>If you&#8217;re new to a dataset or project, the first thing you should do is understand the degree of natural traffic fluctuations and baseline variance. This helps you separate expected changes from those driven by external factors and understand how much deviation is &#8220;normal.&#8221;</p><h3>Math and fractions: Part of a whole</h3><p>Every metric is a fraction. It&#8217;s just one piece of a broader ecosystem made up of other interconnected parts. Let&#8217;s say you confirm that the payment success rate is 25%. That means 25% of users successfully complete a transaction. But it also means that 75% do not. The more related metrics you identify, the easier it is to cross-check them.</p><p>This same logic applies to funnels and conversions. Every metric is a fraction of a whole. If one thing is rising, something else (a) should be falling or (b) should be rising as well. If you don&#8217;t see (a) or (b), question everything. Once you understand the relationships between metrics, figuring out what is declining and why becomes much easier.</p><h3>Develop a habit of checks: random users, session flow, against different tools</h3><p>For every sample or report, pull 5-10 random users from the dataset and manually check their attributes - paid price, invoice details, subscription plan, country, number of transactions, etc.</p><p>Build the habit of manual spot checks and cross-checks in every report or <a href="https://seattledataguy.substack.com/p/stop-shipping-dashboards-that-dont">dashboard</a>. 
Don&#8217;t trust tooling or automation.</p><h2>2. <a href="https://dataanalysis.substack.com/p/how-to-do-a-root-cause-analysis-issue">How To Do a Root Cause Analysis</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zZ2H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zZ2H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!zZ2H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!zZ2H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!zZ2H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zZ2H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:320757,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181944390?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zZ2H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!zZ2H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!zZ2H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!zZ2H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f32de4-2d1d-498a-8344-e7e74dca294a_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We apply root cause analysis when a metric shows unexpected movement. For example:</p><ul><li><p>Average DAU starts gradually declining M/M, but MAU stays flat.</p></li><li><p>Net new transactions increase 5% W/W, but total revenue doesn&#8217;t change.</p></li><li><p>Customer churn for annual subscriptions doubles.</p></li><li><p>The trial-to-paid rate drops by 10%.</p></li></ul><p>The keyword is <em>unexpected</em>. Analysts own the concept of a <em>baseline</em> - an estimated (modeled or forecasted) value with applied seasonal, M/M, and Y/Y adjustments.</p><p>For example, you may notice that total transactions sharply decline in September after August. This decline may be expected if the same pattern appears each year during this period. Or it may point to a bug. Or it may be a mix of both. To break it down, you need to know your baseline - <em>how many transactions you typically expect at this time of the month and year, and how much that number has changed.</em></p><h3>First things first: ensure the <em>data you see is correct</em>.</h3><p>Find at least two other data sources showing a similar decrease to confirm it&#8217;s real.</p><p>Possible issues: broken <a href="https://www.theseattledataguy.com/etls-vs-elts-why-are-elts-disrupting-the-data-market-data-engineering-consulting/#page-content">ETL</a>, schedules that break over holidays and long weekends, etc. </p><p>If you&#8217;re confident the data is accurate, and this is an actual decline, proceed with modeling different hypotheses on what the issue may be.</p><h3>Analysis: Generate multiple hypotheses that you have to confirm or reject. </h3><ol><li><p><strong>Product</strong> <strong>hypothesis</strong>: the drop is related to a product bug or a specific product launch. There is usually a sharp drop if it&#8217;s a bug. 
The drop can also be gradual for releases because teams often do slow rollouts: a 1% traffic release, then 20% &#8594; 50% &#8594; 100%. It can also affect only a segment of users (e.g., new users only or paid users).</p></li><li><p><strong>Market or competition hypothesis</strong>: it can be a new tool taking market share, a slowdown in spending, or a shift in user acquisition strategy. You will notice it as a gradual decline, not necessarily tied to a specific campaign or promotion.</p></li><li><p><strong>User hypothesis</strong>: This is typically a gradual, inconsistent decline. Given that (a) the proportion of different personas tends to change, and (b) they are not necessarily tied to seasonality (rather to marketing campaigns that brought them in), it can be challenging to capture the beginning of a decline. You would need to check the drop against different user cohorts - registered but not active, active but lapsing, power users, premium or free, business or individuals, or whatever personas you work with.</p></li><li><p><strong>External factors</strong>: pandemic, war, or big social movement (#MeToo, BLM, abortion law, etc.). These may cause a sharp decline (or an increase) in usage across many top user actions, features, and platforms.</p></li></ol><p>Once you&#8217;ve confirmed the hypothesis based on the data, whether it&#8217;s a product bug, marketing issue, or user behavior, escalate it to the appropriate team to gather more context. Ideally, you&#8217;re already a few steps ahead, with hypotheses and analyses prepared on which metric is declining and by how much.</p><h2>3. 
How To Develop A KPI And Connect It To Action</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7wGo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7wGo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 424w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 848w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 1272w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7wGo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp" width="856" height="780" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:856,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53504,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181944390?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7wGo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 424w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 848w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 1272w, https://substackcdn.com/image/fetch/$s_!7wGo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bd9826c-8fa0-42bf-9701-c9028fbb036f_856x780.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://productmanagermeme.com/when-your-key-metrics-drop-after-a-new-feature-launch">Source</a></figcaption></figure></div><p>KPIs can be financial, customer-focused, or process-focused.</p><h3>There are many types of metrics:</h3><ul><li><p><strong>Top-level metrics</strong> - a measure of strategic direction and performance. Each top-level metric is tailored and adjusted to the specific business model, environment, and strategy. This is where context and nuance really matter. Top-level metrics and KPIs are usually reported monthly and quarterly.</p></li><li><p><strong>North Star Metric</strong> - represents one company goal. 
From Mixpanel <a href="https://mixpanel.com/blog/north-star-metric/">North Star Metric</a>: &#8220;To qualify as a &#8220;North Star,&#8221; a metric must do three things: lead to revenue, reflect customer value, and measure progress.&#8221; It can be DAU, LTV, MRR, or another measurement, and its main purpose is to align teams around one main goal.</p></li><li><p><strong>Secondary metrics</strong> - more granular health indicators and product targets. They measure how successful the product and process are. Secondary metrics are sensitive to any product changes. Therefore, you would measure A/B tests, feature adoptions, or bug impacts against secondary metrics. Due to their sensitivity, they are also usually monitored and reported weekly. </p></li><li><p><strong>Vanity metrics</strong> - impressive-looking metrics that aren&#8217;t useful or actionable: they don&#8217;t lead to growth or revenue and aren&#8217;t tied to anything you can do to improve them. They are often too simple and ignore the context. Examples: the number of social media followers or total registered users. It&#8217;s like only working out your arms when you go to the gym and ignoring your core. More about <a href="https://www.tableau.com/learn/articles/vanity-metrics">Vanity metrics</a>.</p></li><li><p><strong>OMTM - One Metric That Matters</strong>. This is different from the North Star Metric and is meant to be a temporary goal unifying all the teams at the company around one issue. For example, if your software got hacked and all user accounts got deleted, you would set the OMTM as the number of reinstated accounts. For a less dramatic example, when you begin a migration, your OMTM can be the number of successfully migrated accounts. Or, when churn significantly outweighs new accounts and renewals, you have to pause everything and focus on retention.</p></li></ul><h3>The right metric should be:</h3><ul><li><p><strong>Relevant</strong> - represents the result you want to see. 
If you make a change to a user flow, you should measure user steps and subsequent actions, not net new revenue.</p></li><li><p><strong>Measurable</strong> - do you even have the data to get the metric? Do you trust the source?</p></li><li><p><strong>Specific</strong> - detailed enough to illustrate the right product movement. User retention is not the right metric for measuring an A/B test, but the frequency and/or type of actions are. </p></li><li><p><strong>Prioritized</strong> - what stands out from other metrics as the highest priority. Differentiate nice-to-have metrics from must-have metrics in your reporting. </p></li><li><p><strong>Balanced</strong> - meant to measure positive and negative outcomes. If you notice a traffic increase for one feature, most likely there is a decrease somewhere else.</p></li></ul><h3>Metric types</h3><p>You probably know it already, but there are 4 main categories of metrics that serve different purposes:</p><ol><li><p><strong>Sums and counts</strong> - Daily Active Users, the sum of sales, number of unique unsubscribers, etc.</p></li><li><p><strong>Distribution (mean, median, mode, percentiles)</strong> - average memory used, % of MAU, a median session length, or others. </p></li></ol><ol start="3"><li><p><strong>Probability and rates</strong> - if you change a screen layout, you have to measure click-through rate or click-through probability. </p></li><li><p><strong>Ratios</strong> - monthly/annual subscription ratio, male/female usage ratio, etc.  
</p></li></ol><h3>Here are a few examples of common metrics across different domains:</h3><p><strong>Growth &amp; Marketing:</strong> Unique Visitors, First Visits, Returning Visitors, Bounce Rate, Installs, Signups, Customer Acquisition Cost (CAC), Click-Through Rate (CTR), Cost Per Impression (CPI), Cost Per Action (CPA), Time to Value, Visitor-to-Signup Rate, Signup-to-Payment Rate, Product or Feature Adoption Rate, Virality, Network Effect Score, Return on Advertising Spend (ROAS), Number of Qualified Leads, Lead Conversion Rate, Average Lead Score, Cost Per Lead (CPL), Unsubscribes.</p><p><strong>Revenue:</strong> Monthly Recurring Revenue (MRR), Annual Recurring Revenue (ARR), Net Revenue, Net Revenue Retention, Paid Customers, Activated Trials, Free-to-Paid Conversions, Paid-to-Free Downgrades, Revenue Churn, Customer Churn, Monthly/Weekly Customers Completing Their First Order, Daily/Monthly Total Purchase Value, Lifetime Value (LTV), Average Revenue Per Account (ARPA), Upsell-to-Payment Rate, Expansion Revenue, Return on Investment (ROI), and others.</p><p><strong>Engagement:</strong> MAU, WAU, DAU, Adjacent Users, Day 0, Day 1+, Day 7+, and Day 28 Retention, 1-Year or 2-Year Retention %, Number of Returning Users, Daily/Hourly Number of Actions, Total Watch Time, Total Time Spent, Frequency of Visits, Pages Per Session, Scroll Depth, Average Session Duration, Exit Rate, Product Abandonment Rate, and others.</p><p><strong>Customer Success:</strong> Customer Satisfaction Score (CSAT), Net Promoter Score (NPS), Customer Health Score, Ticket Resolution Rate, Average Resolution Time, Average Reply Time, Customer Effort Score (CES), First Response Time, Daily/Monthly Ticket Requests.</p><p><strong>Platform / Engineering:</strong> Product Support Cost, R&amp;D Engineering Cost, Outsourcing Rate, Cost Performance Index (CPI), Schedule Performance Index (SPI), Uptime, Average Downtime per Month/Year, Machine Downtime Rate, % Planned Maintenance, Number of 
Releases, Running Cost, Number of Bugs, Number of Pull Requests, Capacity Utilization, Memory Usage, Requests Per Minute (RPM), Errors Per Minute, and others.</p><p>If you&#8217;d like to read more on KPIs:</p><ul><li><p><a href="https://dataanalysis.substack.com/p/introduction-to-product-metrics-and">Intro to Product Metrics</a> </p></li><li><p><a href="https://dataanalysis.substack.com/p/how-to-pick-the-right-metric-issue">How To Pick The Right Metric</a></p></li></ul><h2>4. <a href="https://dataanalysis.substack.com/p/kpis-done-wrong-fixing-common-reporting-mistakes">KPIs Done Wrong: Fixing Common Reporting Mistakes</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SNSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SNSO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!SNSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg" width="500" height="563" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:563,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74113,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181944390?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SNSO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SNSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F158b6726-1c73-4fde-8dad-7424eb6d3814_500x563.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>Don&#8217;t overthink KPIs. If you&#8217;re unsure how to measure an initiative, stick to simple metrics like unique views, CTA clicks, and the % of users with a CTA click. Not everything needs to be tied to LTV or MRR.</p></li><li><p>Use an effective proxy metric that is both sensitive and independent. 
If your proxy requires complex calculations, it&#8217;s not a good proxy.</p></li><li><p>Don&#8217;t stress about benchmarks - focus on your Signup-to-Paid MoM growth.</p></li><li><p>Avoid bringing metric definitions from your previous job into your current project. Every product is unique, with different user lifecycles. Some apps have monthly and annual subscriptions, while others have 18 payment plans. Churn calculations will vary. Develop KPIs that fit this specific business and product.</p></li></ol><h2>Final Thoughts</h2><p>There are a lot of skills you&#8217;ll start to take for granted as you grow as a data analyst or engineer. But they all came from somewhere.</p><p>I hope this newsletter helps you put words to concepts or can be an article you share with someone just joining the data world.</p><p>As always, thanks for reading!</p><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Apache Hudi&#8482; at Uber: Engineering for Trillion-Record-Scale Data Lake Operations</h2><p>Uber operates one of the most diverse and demanding data ecosystems in the world. Every trip taken, order delivered, ad served, or real-time arrival time recalculated generates an unending stream of data. These data points come from hundreds of microservices, thousands of cities, and millions of riders, each with its own velocity, shape, and business-criticality. 
At the heart of this ecosystem lies Uber&#8217;s data lake: a multi-hundred-petabyte repository that fuels operational decisions, machine learning models, experimentation platforms, and real-time business intelligence.</p><p><a href="https://www.uber.com/blog/apache-hudi-at-uber/?uclick_id=166d8d02-e477-4e1a-9a01-9817aaca8ab8">Read More Here</a></p><h2>What It Actually Takes to Build a Data Pipeline System</h2><p>When I first started in the data world, it was common for data teams to build their own data pipeline solutions. There were still dozens of off-the-shelf tools, of course. Nevertheless, you&#8217;d see custom pipelines developed everywhere.</p><p>In 2025, I saw less of this.</p><p>In fact, in many cases data teams would go straight to picking tools or solutions.</p><p>But let&#8217;s say you do want to go down this route and build your own data pipeline solution.</p><p>How would you do it?</p><p><a href="https://seattledataguy.substack.com/p/what-it-actually-takes-to-build-a">Read More Here</a></p><div><hr></div><h2>End Of Day 208</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[What It Actually Takes to Build a Data Pipeline System]]></title><description><![CDATA[A breakdown of the components, tradeoffs, and realities of building your own data pipeline system]]></description><link>https://seattledataguy.substack.com/p/what-it-actually-takes-to-build-a</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/what-it-actually-takes-to-build-a</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Wed, 14 Jan 2026 17:59:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FqZB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, 
fellow future and current Data Leaders; Ben here &#128075;</p><p>Today I am continuing my series on data pipelines. In the prior article, we discussed the types of data pipelines that exist. Today, we&#8217;ll be discussing the components you&#8217;ll need if you plan to build your own data pipeline from scratch.</p><p>But before we jump in, I wanted to share a bit about <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a platform I&#8217;ve used to make clients&#8217; data workflows easier and that I am an adviser for. Estuary helps teams easily move data in real time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. If you want to simplify your data workflows, check them out today.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>When I first started in the data world, it was common for data teams to build their own data pipeline solutions. There were still dozens of off-the-shelf tools, of course. Nevertheless, you&#8217;d see custom pipelines developed everywhere.</p><p>In 2025, I saw less of this.</p><p>In fact, in many cases data teams would go straight to picking tools or solutions.</p><p>But let&#8217;s say you do want to go down this route. 
You want to build your own data pipeline solution?</p><p>How would you do it?</p><h2>What Components You&#8217;ll Need</h2><p>Below I&#8217;ll outline the components most every data pipeline system I&#8217;ve worked with requires/has had.</p><h3>Secrets And  Connection Management</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FqZB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FqZB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FqZB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2215cb85-d988-4445-b5df-f14a31258838_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57325,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FqZB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!FqZB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2215cb85-d988-4445-b5df-f14a31258838_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I am going to start the list of components off with secrets and connection management.</p><p>That&#8217;s because this is how you&#8217;ll likely set up sources and destinations, and without sources and destinations, you really have no reason to build your pipeline.</p><p>You just have orphaned <a href="https://www.theseattledataguy.com/how-to-write-better-sql-advanced-sql-episode-1/">SQL</a> logic doing nothing, and Python pushing data nowhere.</p><p>It&#8217;s also crucial to how easy it is to manage the rest of your system.</p><p>Do you want your data team members to have to write a custom connection script every time?</p><p>If a source or password changes, can you update the information in a single place, or do you have to update it in multiple places?</p><p>Do you make it easy to store that information securely without 
exposing it to the repo?</p><p>Small details here matter and add up over time. If I have to make a separate connection reference every time I need a new table from the same database, that&#8217;ll be terrible.</p><p>And for those out there who assume they&#8217;ll only ever need to connect to a few sources, you&#8217;d better hope you&#8217;re right.</p><h3>Logging And Monitoring </h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qs1_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qs1_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 424w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 848w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 1272w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qs1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif" width="1456" height="454" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:454,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2236941,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qs1_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 424w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 848w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 1272w, https://substackcdn.com/image/fetch/$s_!qs1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe49da1ab-1bd6-4b31-96da-30e56796e061_1886x588.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.astronomer.io/docs/learn/logging">Source</a></figcaption></figure></div><p>When you build your <a href="https://www.theseattledataguy.com/data-engineering-101-writing-your-first-pipeline/">first pipeline</a> system, maybe you put in a few print statements to track where your pipeline has succeeded and failed. 
As you start building a more generic system, logging needs to be included.</p><p>You need to be able to trace back and figure out if there was an issue with an external library, with a specific module inside your <a href="https://estuary.dev/blog/data-pipeline/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">data pipeline</a> system, or an actual problem with a pipeline you&#8217;ve written.</p><p>Think &#8220;we can&#8217;t find this library&#8221; vs &#8220;we can&#8217;t find this table&#8221;.</p><p>You also want to know on which run this occurred. Was it data from a specific date, or if you think in terms of <a href="https://www.theseattledataguy.com/what-is-apache-airflow-data-engineering-consulting/">Airflow</a>, one of those little red boxes?</p><p>Without logging, debugging is impossible, and as more code gets generated by AI, we are going to need even more specific error messages and traceability in order to figure out what we need to fix.</p><h3>Dependency Awareness(Graphs)</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zjUw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zjUw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 424w, https://substackcdn.com/image/fetch/$s_!zjUw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 848w, 
https://substackcdn.com/image/fetch/$s_!zjUw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!zjUw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zjUw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png" width="1456" height="1195" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1195,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1637801,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zjUw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 424w, 
https://substackcdn.com/image/fetch/$s_!zjUw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 848w, https://substackcdn.com/image/fetch/$s_!zjUw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!zjUw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4bdb407-dfbc-44fc-a87b-b015a4e947ed_1506x1236.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Not everything has to be a DAG, but you&#8217;re going to need some sort of dependency awareness. I recall building a very naive version of this at one of my first jobs, where I simply created a table that kept track of which jobs had run, on what date, and at which numeric step in the process.</p><p>This quickly falls apart if your pipeline needs to change frequently or you need to build anything with a smidge of complexity. Then you start needing to look at solutions like Airflow and <a href="https://www.youtube.com/watch?v=8FZZivIfJVo">dbt</a>, and how they handle referencing dependencies.</p><p>For example, in Airflow and dbt respectively:</p><ul><li><p><code>extract_orders.set_downstream(transform_orders)</code></p></li><li><p><code>SELECT * FROM {{ ref('stg_orders') }}</code></p></li></ul><p>Somehow, you need to tell each pipeline which prior task or tasks it has to wait on so it can check their status.</p><h3>Execution Engine Routers</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zMvo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zMvo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!zMvo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!zMvo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!zMvo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zMvo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:309728,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zMvo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!zMvo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!zMvo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!zMvo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99b30387-a4e0-4983-81d8-1bb266bb2f0f_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>I&#8217;ve seen multiple companies now spend much of their <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data teams&#8217;</a> budget and time simply on migrating data from <a href="https://seattledataguy.substack.com/p/snowflake-vs-databricks-is-the-wrong">Databricks to Snowflake</a>.</p><p>Why?</p><p>Because they use <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs">Databricks</a> to run the expensive, heavy early data processing and then use Snowflake as a serving layer.</p><p>Data teams want to be able to pick the compute they need, which is a fairly new concern for data pipeline solutions. We now have multiple compute engines that people want to use when processing data. Think <a href="https://motherduck.com/blog/estuary-streaming-cdc-replication/">DuckDB</a>, <a href="https://www.theseattledataguy.com/how-can-presto-and-starburst-data-improve-your-data-analytics/">Presto</a>, and maybe just a local instance of Spark.</p><p>Some engines are cheaper, some are faster, and still others handle larger data sets better. I foresee future solutions routing more and more of this work to whichever engine best fits what your team is optimizing for in a specific pipeline. We actually had this at Facebook (although we had to tell it which engine to use).</p><p>It&#8217;s worth considering the same thing if you ever plan to build your own pipeline solution. I wouldn&#8217;t build it right away. But if you&#8217;re looking to further optimize your own internal solution, or if you&#8217;re thinking bigger and building a dbt competitor (which several people have reached out to me saying they are), then I&#8217;d consider adding in routing.</p><h3>Scheduler</h3><p>Not all pipeline tools have a method to schedule the jobs you&#8217;ve built. In fact, I&#8217;d guess many don&#8217;t. 
If you&#8217;ve used SSIS, you&#8217;ve likely had to use SQL Server Agent. I&#8217;ve seen people use Jenkins and, of course, cron and Windows Task Scheduler.</p><p>So even if you don&#8217;t have a scheduler in your pipeline system, it&#8217;ll have to live somewhere. Something has to tell your pipeline when to run.</p><p>If you look at Airflow&#8217;s built-in scheduler, it works differently from tools like cron or <a href="https://learn.microsoft.com/en-us/ssms/agent/sql-server-agent">SQL Server Agent</a>.</p><p>Instead of directly triggering jobs at a specific time, the scheduler continuously evaluates which <em>DAG runs should exist</em> based on logical time and dependencies. Once a DAG run is created, the scheduler determines which tasks are eligible to run and hands them off to executors and workers to do the actual work. This design separates <strong>when work should happen</strong> from <strong>how it runs</strong>, which is what enables backfills, retries, and complex dependency management.</p><p>But you&#8217;re likely starting off with cron.</p>
<h3>Pipelines</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l9lW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l9lW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!l9lW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65410,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l9lW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!l9lW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead841b7-26a0-46a2-8a57-e4f2556e9370_1024x768.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I&#8217;ve spoken a lot about the components around the actual pipelines. But, well, you need data pipelines.</p><p>You need a way to define them.</p><p>You can look at tools like Glue, dbt, SSIS, and Airflow for possible examples. dbt is only a part of a pipeline, but it still does much of what I&#8217;ve listed above. 
Airflow, on the other hand, could be viewed as a much broader tool (a workflow orchestration solution).</p><p>In all of these cases, you have a way of defining your pipeline (or workflow).</p><p>In dbt you might view this as several models (SQL) leading to a final table, paired with a solution that can <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">ingest your data</a>.</p><p>These days you can often find pipeline solutions that mix code and drag-and-drop, where you might use dbt alongside some other tools.</p><h3>Data Quality Checks</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xbko!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xbko!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Xbko!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png" width="1024" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84142229-6451-4531-b14a-523677572412_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:162474,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xbko!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Xbko!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84142229-6451-4531-b14a-523677572412_1024x768.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>When I started building my first data pipelines, I had to write a lot of tests by hand, well, type them. 
We didn&#8217;t have easy-to-configure checks that you could simply reference like:</p><p><code>from airflow.providers.common.sql.operators.sql import SQLValueCheckOperator</code></p><p><code>rowcount_bounds = SQLValueCheckOperator(</code></p><p><code>    task_id="dq_rowcount_bounds",</code></p><p><code>    conn_id="warehouse",</code></p><p><code>    sql="SELECT COUNT(*) FROM fct_orders WHERE ds='{{ ds }}'",</code></p><p><code>    pass_value=120_000,   # expected row count for the partition</code></p><p><code>    tolerance=0.2,        # pass if within +/-20% of pass_value</code></p><p><code>)</code></p><p>Having an easy way to integrate data quality checks as part of your pipelines is crucial.</p><p>Sure, you can find other tools that offer a broader set of checks, but having a few that are already easy to drop into your pipeline makes it far more likely your team will actually use them. Because if it&#8217;s not easy to integrate, people won&#8217;t do it.</p><h3>UI</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0iEX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0iEX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0iEX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0iEX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 
1272w, https://substackcdn.com/image/fetch/$s_!0iEX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0iEX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg" width="1456" height="970" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:503605,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183876302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0iEX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0iEX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!0iEX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0iEX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69309512-edf1-4709-a0de-5077b6797034_2964x1974.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a 
href="https://docs.mage.ai/design/data-pipeline-management">Source</a></figcaption></figure></div><p>Unlike many of the other components, it could be argued that a UI is optional. You could just run your pipeline via a CLI, where you also check statuses.</p><p>But if you are going through all the trouble of building this data pipeline tool internally, it likely means you&#8217;re supporting a rather large data team.</p><p>If that is the case, then a UI becomes more necessary because other teams need to interact with the system. They need to filter jobs to see which are running, which are hung, and which have failed.</p><p>They need a place where it&#8217;s easy to re-run jobs that are stuck, track logs and alerts, and so on.</p><p>Now suddenly, you&#8217;re building an entirely new application.</p><h2>Processes And Operational Concerns</h2><p>I don&#8217;t consider these next aspects components, but there are processes and operational concerns you need to consider when building data pipelines. Here are a few.</p><h3>Idempotency &amp; Backfilling</h3><p>When you first start building pipelines, it&#8217;s easy to build them to function only one way. That is to say, you build them to load data in, transform it, and be right once.</p><p>But suppose you rerun one (which is what happens when you <a href="https://estuary.dev/blog/what-is-a-data-backfill?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">backfill</a>).</p><p>Guess what? Now you have duplicate or incorrect data. Often, you have to manually go back and delete the data prior to re-inserting it.</p><p>So, as you build your data pipeline system, you need to make it easy for data engineers to rerun your pipelines safely. 
Sometimes this works because you&#8217;re storing data in date partitions and simply use insert overwrite. Other times, you ensure that the pipelines themselves delete data automatically or check for duplicate data via either IDs or dates.</p><p>The more intuitive you can make this, the better. That way, future engineers can rerun pipelines without having to check 1000 different places to ensure they ran as expected.</p><h3>Ownership</h3><p>Another aspect I believe pipelines need is a way to track ownership. This really should just be a configuration inside the pipeline itself. It&#8217;s one reason I like the fact that Airflow lets you define this (which Facebook also had). This makes it easier to track down who may have either built or currently owns the logic.</p><p>Without it, pipelines can quickly become orphaned, and when you need to find out who built one, you&#8217;ve got to go to six different places to see if you can find the last person who updated the code.</p><h3>Alerting and On-Call Routing</h3><p>Along with ownership comes alerting and on-call routing. As a company grows and as teams become decentralized, data pipeline failures won&#8217;t just go to a single team.</p><p>That means you need a way to decide who, or which team, gets alerted when pipelines fail. Also, as data pipelines become more crucial to the business, on-call becomes more important. Some pipelines can fail and wait for someone to deal with them in the morning.</p><p>On the other hand, some pipeline failures require attention immediately. That&#8217;s where on-call routing comes in. You don&#8217;t just want to alert your data team; you want to alert the right member, who is likely&#8230;sadly&#8230;on-call.</p><h3>Environment Isolation and Promotion</h3><p>As you&#8217;re building out your pipelines, you&#8217;re going to want to make it easy to point your pipelines at either development, testing, or production (or however you&#8217;ve broken down your environments). 
</p><p>Actually, at Facebook, test pipelines created tables in the same environment as production ones. The difference was that when you ran pipelines as a test, they would automatically set the prefix of the table to &#8220;test_&#8221; whether you added it or not.</p><p>Now, you could override this and use something like, say, &#8220;rogo_test_&#8221;, which I did from time to time. But overall, I&#8217;d say most people worked with this setup quite well.</p><p>Whatever setup you pick, my recommendation here is that you make it part of the natural developer flow. If it requires too much switching over from test to dev to production, someone is eventually going to drop a production table (of course, this is less of an issue with many cloud platforms).</p><p>There are still other components, but these are the main ones I&#8217;d consider if I were building a solution from the ground up (and I know I&#8217;d find a half dozen more as I started building).</p><h2>Final Thoughts</h2><p>There is a wide spectrum of solutions you can use to get data from point A to point B to allow a larger group of people to run analytics. You could build a simple set of Lambda functions to extract and load data, use out-of-the-box ELT solutions, or use orchestration tools.</p><p>Many data teams likely don&#8217;t need to build their own entire data pipeline system from scratch. There are always exceptions, of course, and hey, maybe you are just looking to build out the next Airflow or dbt.</p><p>If you are, think through the full scope: how much of the data pipeline and orchestration spectrum do you want to cover? Are there ways you could improve a developer&#8217;s flow that haven&#8217;t been integrated before, such as with AI or maybe with routing to other compute engines?</p><p>I also wish you the best of luck, as it is quite a large problem space to cover!</p><p>Thanks as always for reading!</p><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! 
I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Behind the Scenes of SQL: Understanding SQL Query Execution</h2><p>Here is something school probably didn&#8217;t teach you about SQL<br><br>Or at the very least you likely forgot.</p><p>When you write a query and hit submit, or click that little triangle in DBeaver&#8230;</p><p>&#8230;what exactly happens?</p><p>Sure, you likely understand that data is pulled from multiple tables, data is filtered, and aggregations occur.</p><p>But behind the scenes, what is going on?</p><p>How does SQL go from English into the lingua franca of data?</p><p>In this article we will answer that question.</p><p><a href="https://seattledataguy.substack.com/p/behind-the-scenes-of-sql-understanding">Read More Here</a></p><h2>Back To The Basics: What Is Columnar Storage</h2><p>Data engineers often discuss columnar storage, especially when discussing data warehouses or file formats like Parquet.</p><p>But why does it matter?</p><p>Why is columnar storage such a big deal for analytics? What makes it so well-suited for analytical use cases compared to other formats?</p><p>Maybe you already have a rough idea.</p><p>This article will help further explain columnar storage, why it&#8217;s used, and how it compares to row-based storage. It will also look at the most common formats and real-world use cases.</p><p>Let&#8217;s dive in.</p><p><a href="https://seattledataguy.substack.com/p/back-to-the-basics-what-is-columnar">Read More Here</a></p><div><hr></div><h2>End Of Day 207</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[Common Data Pipeline Patterns You’ll See in the Real World]]></title><description><![CDATA[A practical look at the many ways data pipelines show up inside real companies]]></description><link>https://seattledataguy.substack.com/p/common-data-pipeline-patterns-youll</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/common-data-pipeline-patterns-youll</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 05 Jan 2026 19:58:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uaVa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future 
and current Data Leaders; Ben here &#128075;</p><p>This is the first newsletter for 2026! </p><p>One of my goals in 2026 is to put together a few series. So this is the first of a longer series focused on data pipelines. I wanted to start out by discussing the types of data pipelines I&#8217;ve seen in terms of how they are used, as data pipelines can be used for more than one specific use case.</p><p>Before we jump in, I wanted to share a bit about <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a platform I&#8217;ve used to help make clients&#8217; data workflows easier and am an adviser for. Estuary helps teams easily move data in real time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. If you want to simplify your data workflows, check them out today.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>Whether you&#8217;re working at a large enterprise or a small business, there has likely been some need to take data out of the various source systems, process it, and then use it for either operational or analytical purposes.</p><p>Add in a few lines of code or a low-code solution, and the term data pipeline might start getting thrown around.</p><p>This might make some data engineers angry, but if you think about it, someone extracting data from a data source into Excel, adding in <a href="https://support.microsoft.com/en-us/office/vlookup-function-0bbc8083-26fe-4963-8ab8-93a18ad188a1">VLOOKUPs</a>, and doing some data cleansing via formulas and IF() statements is essentially building a data pipeline&#8230;</p><p>Ok, it&#8217;s not the exact same thing, but when you stop and think about it, it can functionally solve a similar problem (although often in a more limited and specific way).</p><p>My point is that there 
are a lot of different ways and reasons people build data pipelines. </p><p>So, to kick off 2026, I wanted to discuss some of the key reasons data pipelines exist and the types of pipelines you will run into.</p><h2>Source Standardization Pipelines</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uaVa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uaVa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uaVa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:315385,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/183018775?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uaVa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!uaVa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efec60d-abc1-4d62-9ce7-730de55029e0_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Some of the first pipelines I helped build and manage were focused on taking data sets from dozens of companies and standardizing them to a single core <a href="https://www.youtube.com/watch?v=gG7upg6QaBI&amp;feature=youtu.be&amp;sttick=0">data model</a>. 
In particular, this involved getting data via <a href="https://www.youtube.com/watch?v=fKVFvW9JaFg">SFTP</a> in different formats, including comma-delimited, pipe-delimited, <a href="https://www.w3schools.com/xml/xml_whatis.asp">XML</a>, and even positional files, where you needed a separate file to define which character positions held which columns.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4x6s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4x6s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 424w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 848w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 1272w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4x6s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png" width="1456" height="206" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:206,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4x6s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 424w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 848w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 1272w, https://substackcdn.com/image/fetch/$s_!4x6s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e51f726-dbe9-445b-bac1-fe38d7143c71_1514x214.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">In the example above there is first name, last name, birth date, gender and salary</figcaption></figure></div><p>This might be unfamiliar to data engineers who are accustomed to building <a href="https://www.theseattledataguy.com/batch-vs-real-time-data-pipelines-do-we-still-need-to-pick/">data pipelines</a> to answer questions around SaaS products such as 
retention and churn.</p><p>But this is a problem I&#8217;ve now run into many times across many different industries, from health care to retail to real estate, to name a few.</p><p>In many cases this wasn&#8217;t even purely for analytics. The centralization and standardization of the various data sets allowed the companies to provide operational benefits or other services. For example, maybe you&#8217;re trying to create a marketplace and need to centralize dozens of different inventory sources.</p><p>The challenge when building these data pipelines is usually the amount of effort required to onboard each source and create scripts that can handle all the variations in how the data comes in. This is referred to as mapping.</p><p>You&#8217;ll need to:</p><ol><li><p>Standardize values such as gender, which can often come in as a number, a single letter, or the written-out word</p></li><li><p>Standardize categories. I&#8217;ve seen this a lot in retail, where products might be in the same category, but one source might use an abbreviation or a different word that means a similar thing</p></li><li><p>Fix date and format inconsistencies, such as different time zones, different date formats, or missing values entirely</p></li></ol><p>And of course there&#8217;s more, such as how each data set is appended. You can mitigate some of this by asking your external partners to send data in a standardized way, but it&#8217;s difficult to fix every issue.</p><p>Once you do have a standardized data set, you can build multiple products off this data, whether it be a marketplace or an industry-level report. And because it&#8217;s all standardized, it&#8217;s easy to roll out new products and features to all your customers.</p><p>The one final point I will add is that this is not just limited to SFTP data sets; I&#8217;ve worked with companies that pull in data from <a href="https://www.youtube.com/watch?v=YST1sWFPDh4">APIs</a> as well.</p>
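<p>To make the mapping steps above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the lookup tables, the numeric gender codes, the list of date formats, and the field names (gender, category, birth_date) are all made up for this example, and a real onboarding process would likely maintain a separate mapping per partner.</p>

```python
from datetime import datetime

# Hypothetical lookup tables. Real partner feeds would need their own entries.
GENDER_MAP = {
    "1": "male", "m": "male", "male": "male",
    "2": "female", "f": "female", "female": "female",
}
CATEGORY_MAP = {
    "tees": "t-shirts", "t shirts": "t-shirts", "t-shirts": "t-shirts",
    "sneakers": "shoes", "footwear": "shoes", "shoes": "shoes",
}
# Assumed set of date formats seen across sources, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_gender(raw):
    """Map a number, single letter, or written-out word to one value."""
    if raw is None:
        return None
    return GENDER_MAP.get(str(raw).strip().lower(), "unknown")


def standardize_category(raw):
    """Collapse abbreviations and synonyms into one category name."""
    if raw is None:
        return None
    return CATEGORY_MAP.get(str(raw).strip().lower(), "uncategorized")


def standardize_date(raw):
    """Return an ISO date string, or None if missing or unparseable."""
    if raw is None or not str(raw).strip():
        return None
    text = str(raw).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def standardize_record(record):
    """Apply every mapping to one incoming row (a plain dict)."""
    return {
        "gender": standardize_gender(record.get("gender")),
        "category": standardize_category(record.get("category")),
        "birth_date": standardize_date(record.get("birth_date")),
    }
```

<p>In this sketch, unrecognized values fall back to &#8220;unknown&#8221;, &#8220;uncategorized&#8221;, or None rather than raising an error, so a new partner feed can be loaded first and its unmapped values reviewed afterward. Whether that is the right trade-off depends on how strict your downstream products need to be.</p>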
      <p>
          <a href="https://seattledataguy.substack.com/p/common-data-pipeline-patterns-youll">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Snowflake vs Databricks Is the Wrong Debate]]></title><description><![CDATA[Winning the Data Stack Role by Role]]></description><link>https://seattledataguy.substack.com/p/snowflake-vs-databricks-is-the-wrong</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/snowflake-vs-databricks-is-the-wrong</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Fri, 12 Dec 2025 15:49:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rn6Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the last few years, Databricks has been executing a strategy to take over the entire data workflow.</p><p>Maybe it never started that way.</p><p>Maybe when they first came out, they only ever planned to be a managed Spark solution. But I have a hard time believing that, mostly because I believe their leadership has the vision and capabilities to see far beyond that.</p><p>Databricks has always been pretty upfront that they want to be the end-to-end data stack. But they&#8217;ve been approaching it piece by piece.</p><p>Or should I say role by role?</p><p>Obviously, at first, their focus was on the data scientist and ML engineer.</p><p>But in 2020, they wanted to shift the narrative. They were more than just managed Spark, they were a data platform that could replace your others. So they championed the idea of the Data Lakehouse. You could view this as a capturing of the data engineering market.</p><p>But this didn&#8217;t happen overnight. </p><p>If you look at some of the posts and content shown below, <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs">Databricks</a> tends to push an idea or concept very hard when they really want to capture that market. 
Shown below is how they pushed the concept of the Data Lakehouse, and now their data visualization and analytical workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rn6Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rn6Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rn6Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg" width="1456" height="710" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:710,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rn6Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rn6Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6ccf9a-698a-4253-9647-186cc436286f_1456x710.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I&#8217;ve already written about the <a href="https://seattledataguy.substack.com/p/when-words-become-data-architecture">Data Lakehouse and how it was heavily pushed by Databricks</a> in 2020, so I don&#8217;t want to dive too deeply here. I believe there are far more interesting recent happenings that stick out.</p><p>This past month, I believe Databricks started to make its final push to gain mental market share among analysts.</p><p>How?</p><p>Databricks just partnered with the biggest analytics creator.</p><blockquote><h3>Alex The Analyst.</h3></blockquote><p>There are several reasons I believe this is significant.</p><ol><li><p>Although many companies partner with data creators all the time, I think, if we look at <a href="https://www.youtube.com/@AlexTheAnalyst">Alex the Analyst&#8217;s</a> audience, we&#8217;ll see that Databricks wants to put the final nail in the coffin of the data analytics workflow. 
They want to show end-users that hey, if you&#8217;re an analyst, Databricks is for you too! We aren&#8217;t just for data engineers; we can help you drive value, and we aren&#8217;t too complex.</p></li><li><p>Databricks did something similar when they wanted to win over the Data Engineering space. They partnered with <a href="https://www.linkedin.com/in/billinmon/">Bill Inmon</a> to put out a seminal piece on the <a href="https://www.databricks.com/blog/2021/05/19/evolution-to-the-data-lakehouse.html">Data Lakehouse</a>. This was about a year after Databricks wrote their original piece, and it wasn&#8217;t until this piece came out that they finally got real traction.</p></li><li><p>Databricks has been pushing its data visualization solution for a long time now. Now, they want to get even better traction by bringing in the face of learning analytics.</p></li><li><p>I wouldn&#8217;t be surprised if one reason they are doing this is that they are likely continuing to get pushback from customers who are more of the analyst type and who believe Databricks isn&#8217;t for them. In fact, I have worked on several projects over the past year alone where that was the sole reason for having both a Databricks and a <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ">Snowflake</a> environment.</p></li></ol><p>In these cases:</p><ul><li><p>Snowflake was for the analysts.</p></li><li><p>Databricks was for the engineers.</p></li></ul><p>I think the other point that stands out is how heavily the Databricks visualization tool was shown in Alex&#8217;s content. Now, part of this could have been because that&#8217;s what Alex felt his audience would appreciate.</p><p>But here is another anecdote I&#8217;d like to share.</p><p>I&#8217;ve spoken to several data leaders who have told me that Databricks account executives tried to push them very hard on their visualization tool. 
In some cases, the pitch was to replace their Tableau or Power BI instance.</p><p>I think account executive interactions tell you a lot about a company&#8217;s strategy. They are the front line; what they are doing is literally a reflection of where the company is trying to go. If the company wants to push a new feature, a new partnership, or a new product, it pushes it through the account executives.</p><h2>What Does This Mean For The Data World?</h2><p>Ok great Ben, why does any of this even matter?</p><p>Here are a few thoughts on where this is all going and why it&#8217;s important. 
</p><h3>Snowflake Vs Databricks Is A Red Herring </h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WQbT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WQbT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 424w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 848w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 1272w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WQbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png" width="1102" height="838" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54149246-5843-4404-b25b-a711a878af44_1102x838.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:838,&quot;width&quot;:1102,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:137599,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181367282?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WQbT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 424w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 848w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 1272w, https://substackcdn.com/image/fetch/$s_!WQbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54149246-5843-4404-b25b-a711a878af44_1102x838.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The real vs. conversation isn&#8217;t <a href="https://www.youtube.com/watch?v=VLtq0eeHc14">Snowflake vs Databricks</a>; it&#8217;s really Databricks vs Microsoft, AWS, and even Salesforce.</p><p>Let me explain.</p><p>These are more generic cloud and SaaS providers, but they also offer either components of the data stack or the entire thing.</p><p>More importantly, they are often solutions you pick prior to thinking about your data tooling. You likely use Azure to set up your application or Salesforce as your CRM. So, when it comes time for data analytics, you already have Azure, so just use Power BI; it&#8217;s already part of your contract anyway (I&#8217;ve had multiple data leaders give me that line of thinking when it comes to picking BI).</p><p>Why not use Tableau? You&#8217;re already using Salesforce. 
It will probably make negotiations better, right?</p><p>Of course, there is still some Snowflake vs Databricks. But for those 8- and 9-figure deals, you&#8217;re fighting much larger companies. Companies that could buy you out.</p><p>But this leads to my next point.</p><h2>Databricks Is Building The Next SAP</h2><p>I floated this idea at the bottom of a <a href="https://the-data-leaders-playbook.circle.so/c/events-thoughts-and-learnings-etl/informatica-2-0-some-early-thoughts-on-fivetran-dbt">post</a> on the Data Leaders Playbook: Snowflake and Databricks are building SAP backwards. Instead of starting with business applications and eventually building out solutions like SAP HANA, they are going from data analytics to the business.</p><p>With the recent purchase of <a href="https://www.databricks.com/blog/databricks-neon">Neon</a>, Databricks can now enter the sales conversation earlier, which I think is arguably the real point. Yes, yes, all the tech people are pushing back their glasses and are about to say something <a href="https://joereis.substack.com/p/the-pedantic-layer">pedantic</a>.</p><p>If you want to win the CIO over, and not just the data team, you have to enter the conversation earlier. Meaning the application layer. I am sure some <a href="https://www.reddit.com/r/PLTR/comments/1na0pwe/to_compare_crm_to_pltr_what_impossible/">Palantir</a> fans in the back are just waiting to start writing a comment (this has been Palantir&#8217;s goal the whole time!).</p><p>Let&#8217;s put that aside for now and just think about the companies these vendors are selling into. Most companies, especially digital ones, have a few key databases. The ones that actually represent their main application or service. It&#8217;s where most of their data lives. 
If Databricks can get you to use their Postgres instance, then they already have you in the funnel for their data Lakehouse, and from there for their BI, their data pipelines, etc.</p><p>So I expect they will be pushing this space hard in the next year or so. I&#8217;ve actually already seen this hinted at via some of their distribution and the creators they enjoy working with.</p><h3>If You Want To Win Mental Market Share You Have To Be Relentless</h3><p>Databricks pushed the Data Lakehouse concept for over a year before it really started to gain traction. I imagine it required millions of dollars in employee time, consultant partnerships, etc.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dVuT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dVuT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 424w, https://substackcdn.com/image/fetch/$s_!dVuT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 848w, https://substackcdn.com/image/fetch/$s_!dVuT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dVuT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png" width="1456" height="869" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:869,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dVuT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 424w, https://substackcdn.com/image/fetch/$s_!dVuT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 848w, https://substackcdn.com/image/fetch/$s_!dVuT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3bc255-2238-45f5-bd3c-7425ffc116eb_1456x869.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>You can&#8217;t just create a term or want to gain entrance into a space without having a plan on how to win. </p><p>I say this because I&#8217;ve spoken with so many marketing teams and leaders who have wanted me to do &#8220;un-boxings&#8221; of their product as a one-off. 
</p><p>That&#8217;s a tactic, not a strategy (and not a very well-thought-out one at that).</p><p>One video doesn&#8217;t move the needle. Think about how many posts, videos, talks, consulting partners, and so on Databricks has been pushing to talk about their AI/BI Genie and Data Visualization.</p><p>Just paying for distribution on a one-off piece of content won&#8217;t work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!55e5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!55e5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 424w, https://substackcdn.com/image/fetch/$s_!55e5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 848w, https://substackcdn.com/image/fetch/$s_!55e5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 1272w, https://substackcdn.com/image/fetch/$s_!55e5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!55e5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png" width="1102" height="658" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:658,&quot;width&quot;:1102,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!55e5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 424w, https://substackcdn.com/image/fetch/$s_!55e5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 848w, https://substackcdn.com/image/fetch/$s_!55e5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 1272w, https://substackcdn.com/image/fetch/$s_!55e5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a9a3d5e-6579-455e-9947-096769dd3761_1102x658.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Even now, with the recent partnership with Alex, this doesn&#8217;t feel like a one-off. Databricks keeps promoting the content on their LinkedIn page and in their product.</p><p>They&#8217;ve got employees posting about it. They are trying to make Fetch happen, and they know this isn&#8217;t going to be easy.</p><p>They want analysts to come to Databricks, and when they do, they want them to see a familiar face.</p><p>If done well, it can pay off. 
I doubt you can credit Bill Inmon with shifting 100% of the conversation to Data Lakehouses, but if he had even a 5-10% impact, that&#8217;s a multi-million-dollar impact.</p><p>Databricks wants to do that again with the analyst.</p><h2>Final Thoughts</h2><p>Something tells me that Databricks will be pushing Alex&#8217;s videos hard for the next few months.</p><p>It&#8217;ll complete the trio of data roles.</p><p>In turn, this allows Databricks to start to focus on the application layer.</p><p>To start having conversations earlier with the business, not just about data strategy, but about IT and business strategy.</p><p>As always, thanks for reading.</p><h2>Video Of The Week - 5 Things in Data Engineering That Still Hold True After 10 Years</h2><div id="youtube2-lXvDqREYhI4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;lXvDqREYhI4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/lXvDqREYhI4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! 
I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Is It Time to Say Goodbye to Data Engineers?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F1vP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F1vP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 424w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 848w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 1272w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F1vP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/acea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/181367282?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F1vP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 424w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 848w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 1272w, https://substackcdn.com/image/fetch/$s_!F1vP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facea1f10-0777-4150-9f52-d41ddc53fc12_1024x768.webp 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Ever since tools like SSIS came onto the scene, vendors and business leaders have been on a mission to remove what they see as the biggest roadblock to data-driven decision-making: data engineers.</p><p>Or their counterparts&#8212;DBAs, ETL Developers, and Data Architects.</p><p>Sure, not everyone says it so explicitly, but you can see it in vendor marketing and in the decisions made by the business.</p><p>I remember talking to a veteran data expert who&#8217;s been in the field for three decades. They told me that when <a href="https://www.theseattledataguy.com/alternatives-to-ssissql-server-integration-services-how-to-migrate-away-from-ssis/">SSIS</a> first launched, people were genuinely afraid for their jobs. 
The idea that you could just drag-and-drop tasks that once required code was nerve-racking. But if you&#8217;ve used SSIS, well, you know the truth.</p><p>To some extent, I get why the idea is appealing. When a leader requests a report, a software engineer wants to modify an application table, or a data scientist wants to explore a new dataset, who&#8217;s the one slowing down the project?</p><p>The <em>data engineers.</em></p><p><a href="https://seattledataguy.substack.com/p/is-it-time-to-say-goodbye-to-data">Read More Here</a></p><h2>Refresher on Experimentation</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Olga Berezovsky&quot;,&quot;id&quot;:10490439,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c458316-65d0-4bcb-a698-dca783e2f875_3224x3224.jpeg&quot;,&quot;uuid&quot;:&quot;266cfacc-07c0-49b5-a53d-de6fcf8a93a1&quot;}" data-component-name="MentionToDOM"></span> </p><p>As has become a tradition, I end the year with a refresher: one single, consolidated guide that brings together the most important definitions, reporting best practices, example dashboards, and key principles - all in one place.</p><p>Last year, I published these three:</p><ol><li><p><a href="https://dataanalysis.substack.com/p/refresher-on-retention-issue-236">Refresher on Retention</a></p></li><li><p><a href="https://dataanalysis.substack.com/p/refresher-on-statistics">Refresher on Statistics</a></p></li><li><p><a href="https://dataanalysis.substack.com/p/refresher-on-sql-for-data-analysis">Refresher on SQL for Data Analysis</a></p></li></ol><p>This week, I&#8217;m sharing a refresher on A/B Testing. I believe I published one before, and I also have a dedicated <a href="https://dataanalysis.substack.com/s/ab-testing">experimentation section</a> in my newsletter. This time, I expanded it with more resources to help you properly get started with A/B testing. 
It covers tools, core concepts, free classes, and more.</p><p><a href="https://dataanalysis.substack.com/p/refresher-on-experimentation-issue">Read More Here</a></p><div><hr></div><h2>End Of Day 205</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[How I Run System Design Interviews for Data Engineers ]]></title><description><![CDATA[Why System Design Still Matters]]></description><link>https://seattledataguy.substack.com/p/how-i-run-system-design-interviews</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/how-i-run-system-design-interviews</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Tue, 09 Dec 2025 17:54:01 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!m3PJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Today we have another amazing guest author. </p><p><a href="https://mehdio.com/">Mehdi Ouazza</a> &#8212;better known as <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;mehdio&quot;,&quot;id&quot;:87735445,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!sXqg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1660a5e-20d3-4c35-8a9e-dc2a71d9ff02_564x564.png&quot;,&quot;uuid&quot;:&quot;5724b466-91d0-47be-a62c-526f37159edd&quot;}" data-component-name="MentionToDOM"></span>&#8212;is a data engineer and developer advocate at MotherDuck with over a decade of experience building data systems at companies like Klarna, BackMarket, and Trade Republic. He combines deep engineering expertise with a playful, approachable style, creating blogs, videos, and social content that make complex data and software topics easier to grasp. When he&#8217;s not shipping pipelines or crafting educational content, he&#8217;s usually experimenting with creative ideas that bring a fresh, off-beat energy to the developer community.</p><p>He&#8217;ll be sharing how he runs system design interviews for data engineers, but I won&#8217;t get too ahead of myself.</p><p>So let&#8217;s jump into the article!</p><div><hr></div><p>After hundreds of technical interviews for data engineering positions, I&#8217;ve developed what I call the &#8220;exploration&#8221; approach to system design interviews. 
Think of it like exploring a new city: you start with a bird&#8217;s-eye view of the whole map, then zoom into specific neighborhoods that look interesting, getting to know some streets really well while keeping track of how they connect to the bigger picture. It&#8217;s become my secret weapon for truly understanding a candidate&#8217;s technical knowledge while keeping the conversation engaging and productive.</p><p>This type of interview matters even more in our current AI-assisted world. Coding matters less than ever before. We can generate vast amounts of code with tools like Copilot or ChatGPT. But understanding how components work, their foundations, and their trade-offs? That&#8217;s irreplaceable.</p><p>If you&#8217;re interviewing with me, consider this your cheat code.</p><h2><strong>Setting the stage</strong></h2><h3>Be human</h3><p>You would be surprised how many interviewers miss the obvious: <strong>presenting yourself as a human being</strong>. I&#8217;ve sat through countless interviews where the interviewer jumps straight into &#8220;I&#8217;m the head of engineering, let&#8217;s talk about Kafka.&#8221; No. Just no.</p><p>When I start an interview, I actually introduce myself: where I&#8217;m based, how long I&#8217;ve been with the company, maybe share that I once debugged a production issue on my phone while biking (true story). This isn&#8217;t fluff, it&#8217;s creating an environment where candidates can relax and show their best selves. Interviews are stressful enough without feeling like you&#8217;re talking to a robot, especially over a video call where human connection is already harder.</p><p>I also explicitly tell candidates upfront: &#8220;We&#8217;re going to start really broad with the whole system, then zoom into specific areas for technical depth. It&#8217;s completely okay if you don&#8217;t know everything: nobody does. 
I&#8217;m more interested in how you think through problems than in memorized answers.&#8221;</p><p>This transparency matters. Without it, candidates might think &#8220;Wow, this is easy&#8221; when I ask the initial broad question, only to panic when we drill into the internals of Parquet files.</p><p>That&#8217;s not fair to them, and it doesn&#8217;t help me assess their true capabilities.</p><h3>Live whiteboarding collaboration</h3><p>I use <a href="https://excalidraw.com/">Excalidraw</a> for live collaborative drawing: it&#8217;s free and requires no setup/login. Candidates can visualize their architectures while we discuss them. This isn&#8217;t just about pretty diagrams; it&#8217;s about thinking visually and communicating complex systems clearly.</p><p>I actively support candidates during the interview. If they get stuck, I&#8217;ll introduce a source system or clarify a requirement. This isn&#8217;t a gotcha test: it&#8217;s a collaborative design exercise, much closer to real work.</p><p>After all, how often do we get perfectly clear requirements from stakeholders? &#128521;</p><h2><strong>Wide &amp; zoom through 3 core scenarios</strong></h2><p>I need someone who understands how components connect and why certain architectural decisions matter, and who can demonstrate real depth in at least some areas, depending on the requirements of the role.</p><p>The exploration metaphor works because we&#8217;re mapping out their knowledge landscape together.</p><p>I usually cover 3 main system design scenarios, each highlighting different aspects of data engineering. In a 45-60 minute interview, there&#8217;s rarely time for all three. If a candidate shows strong depth, I move on quickly; if they struggle, I spend more time there. The goal is to map both strengths and weak spots.</p><h3>1. Simple ETL data stack</h3><p>I start with something deceptively simple: &#8220;You have a small e-commerce website, and you&#8217;re getting data that you need to analyze. 
Design your first data infrastructure. How would you build the data stack?&#8221;</p><p>The question is intentionally vague. <strong>Good engineers don&#8217;t just dump technology names</strong>; they ask questions, make assumptions, and tailor solutions to specific needs. This is where the real engineering happens!</p><blockquote><p>Some might argue: <em>&#8220;Why do I even need a data stack? I can just query my current database!&#8221;</em> And that&#8217;s a perfectly pragmatic answer.</p><p>Then I&#8217;d evolve the scenario: <em>&#8220;Now imagine you have a table larger than 100 GB and need to run analytics queries. Would you still rely on the same setup? If not, how would you redesign the stack, and why?&#8221;</em></p></blockquote><p>When candidates sketch their initial architecture (ingestion &#8594; processing &#8594; storage &#8594; serving), I probe further based on the technologies they mention. Here are some common examples:</p><p><strong>File format deep dive</strong></p><ul><li><p>&#8220;You mentioned storing data in object storage. What format would you use?&#8221;</p></li><li><p>&#8220;Why Parquet over CSV?&#8221;</p></li><li><p>&#8220;If Parquet is so fantastic, what are its limitations?&#8221;</p></li><li><p>&#8220;I see Iceberg and Delta Lake mentioned everywhere. 
What problems do they solve that <a href="https://estuary.dev/blog/apache-parquet-for-data-engineers/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Parquet</a> alone doesn&#8217;t?&#8221;</p></li></ul><p>This progression reveals whether someone just memorizes buzzwords or truly understands the trade-offs.</p><p>Here&#8217;s how a typical deep dive might unfold:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!--3s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!--3s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 424w, https://substackcdn.com/image/fetch/$s_!--3s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 848w, https://substackcdn.com/image/fetch/$s_!--3s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 1272w, https://substackcdn.com/image/fetch/$s_!--3s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!--3s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png" 
width="1456" height="1425" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1425,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!--3s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 424w, https://substackcdn.com/image/fetch/$s_!--3s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 848w, https://substackcdn.com/image/fetch/$s_!--3s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 1272w, https://substackcdn.com/image/fetch/$s_!--3s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd291fb-069f-4533-9466-b57faff9f0ac_1600x1566.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>The data warehouse vs Lakehouse debate</strong></p><ul><li><p>&#8220;Should compute run inside or outside your <a href="https://www.youtube.com/watch?v=FxpRL0m9BcA">data warehouse</a>?&#8221;</p></li><li><p>&#8220;What are the implications of each approach?&#8221;</p></li><li><p>&#8220;If running inside, what should you watch out for? Indexing? Data volume?&#8221;</p></li></ul><p>These questions separate those who&#8217;ve actually built and operated systems from those who&#8217;ve only read about them.</p><h3>2. Streaming pipelines</h3><p>Next, I change the requirements: &#8220;Now I need 1-5 minute latency between source data and where the data is being served. 
Redesign your solution.&#8221;</p><p>This shift from <a href="https://www.theseattledataguy.com/batch-vs-real-time-data-pipelines-do-we-still-need-to-pick/#page-content">batch</a> to <a href="https://motherduck.com/blog/streaming-data-to-motherduck/">streaming</a> reveals a different dimension of understanding:</p><p><strong>Pub/sub deep dive</strong></p><ul><li><p>&#8220;You mentioned needing a pub/sub system. Which ones do you know?&#8221;</p></li><li><p>&#8220;How does Kafka actually work? What are brokers, partitions, and consumer groups?&#8221;</p></li><li><p>&#8220;What&#8217;s the difference between <a href="https://blog.bytebytego.com/p/at-most-once-at-least-once-exactly">at-least-once and exactly-once</a> semantics?&#8221;</p></li><li><p>&#8220;How do you handle late-arriving data in streaming?&#8221;</p></li></ul><p>Again, I&#8217;m looking for practical knowledge here. Anyone can say &#8220;use Kafka,&#8221; but understanding partition strategies, offset management, and the challenges of maintaining streaming infrastructure shows real experience.</p><h3>3. Customer-facing analytics</h3><p>The final scenario is more end-to-end, with specific requirements: <em>&#8220;You need to expose pre-computed analytics to your operational web application. How would you architect this?&#8221;</em></p><p>This is where things get interesting because, once again, there are multiple solutions to the problem.</p><p>Some examples of key questions:</p><ul><li><p>&#8220;Can you connect your data warehouse directly to your web app? What&#8217;s the usual latency?&#8221;</p></li><li><p>&#8220;What happens when 10,000 users hit your analytics dashboard simultaneously?&#8221;</p></li><li><p>&#8220;Where would you place caching layers?&#8221;</p></li><li><p>&#8220;How do you handle data freshness vs. performance trade-offs?&#8221;</p></li></ul><p>The best candidates recognize this as a classic OLTP vs. 
OLAP problem and discuss solutions like read replicas, caching strategies, pre-aggregations, and more.</p><h2><strong>What to look for</strong></h2><p>Both breadth and depth are important, and their shape will depend on the role, as there are many different data engineer profiles.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GHrj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GHrj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 424w, 
https://substackcdn.com/image/fetch/$s_!GHrj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 848w, https://substackcdn.com/image/fetch/$s_!GHrj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 1272w, https://substackcdn.com/image/fetch/$s_!GHrj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GHrj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png" width="951" height="528" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:528,&quot;width&quot;:951,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GHrj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 424w, 
https://substackcdn.com/image/fetch/$s_!GHrj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 848w, https://substackcdn.com/image/fetch/$s_!GHrj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 1272w, https://substackcdn.com/image/fetch/$s_!GHrj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a604c0d-c65e-4e25-93ec-a5f7718ef4cc_951x528.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: <a href="https://www.datacaptains.com/blog/guide-to-data-roles">datacaptains</a></figcaption></figure></div><p>Here are a few examples of the indicators I&#8217;m watching for.</p><h3>Depth indicators</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m3PJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m3PJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m3PJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1559799,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/180850270?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m3PJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!m3PJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebe31df-cf04-4b90-af59-1fed9b125f19_1920x1080.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>When someone truly understands a technology, they can discuss:</p><ul><li><p><strong>Internal mechanics</strong>: How does <a href="https://motherduck.com/blog/spark-ducklake-getting-started/">Spark</a> actually distribute work? What&#8217;s happening during a shuffle?</p></li><li><p><strong>Failure modes</strong>: What breaks when you lose a Kafka broker? 
How does Delta Lake handle concurrent writes?</p></li><li><p><strong>Performance implications</strong>: Why does a broadcast join beat a <a href="https://medium.com/@philipp.brunenberg/understanding-apache-spark-shuffle-85644d90c8c6">shuffle</a> join for small tables?</p></li><li><p><strong>Operational realities</strong>: How do you debug a slow <a href="https://estuary.dev/blog/efficient-elt-with-estuary-flow-and-dbt/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">dbt</a> model? What metrics indicate a streaming backlog?</p></li></ul><h3>Breadth indicators</h3><p>Strong candidates can also:</p><ul><li><p>Connect different technologies and explain their relationships</p></li><li><p>Understand why certain tools exist (what problem were they solving?)</p></li><li><p>Recognize patterns across different solutions</p></li><li><p>Make appropriate trade-offs based on constraints</p></li></ul><blockquote><p> <strong>Each level reveals more understanding</strong>. 
A junior engineer might stop at &#8220;Parquet is columnar.&#8221; A senior engineer discusses compression algorithms, predicate pushdown, and why you might choose Avro for streaming despite Parquet&#8217;s advantages.</p></blockquote><h3><strong>Red Flags and Green Flags</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oD_H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oD_H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 424w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 848w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 1272w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oD_H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png" width="1456" height="761" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:761,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:206300,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/180850270?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oD_H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 424w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 848w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 1272w, https://substackcdn.com/image/fetch/$s_!oD_H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a16e61a-f929-46d3-93e1-cf1512a91b31_1814x948.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>How to prepare?</strong></h2><p>If you&#8217;re preparing for interviews like mine, here are a few tips, in order of importance:</p><ol><li><p><strong>Pick 3-4 technologies you use</strong> and go deep: understand their internals, failure modes, and alternatives</p></li><li><p><strong>Practice explaining</strong>: if you can&#8217;t explain it simply, you don&#8217;t understand it fully</p></li><li><p><strong>Build something real</strong>: even a toy Kafka setup teaches more than reading docs</p></li><li><p><strong>Break things</strong>: kill processes, corrupt data, fill up disks. 
Understanding failure teaches resilience</p></li><li><p><strong>Read post-mortems</strong>: learn from others&#8217; production disasters</p></li></ol><h2><strong>The value beyond AI</strong></h2><p>This interview style works because it mirrors real engineering. We start with vague requirements, ask questions, make trade-offs, and solve problems collaboratively. It&#8217;s rare for candidates to completely fail: everyone can draw something and explain at least some components.</p><p>What varies is depth, breadth, and the ability to reason about trade-offs. That&#8217;s what separates great engineers from good ones, and it&#8217;s what this interview approach reveals.</p><p>Remember: in the age of AI-assisted development, your value isn&#8217;t in memorizing syntax or configuration details. It&#8217;s in understanding systems deeply enough to design, debug, and evolve them.</p><p>Prompt engineering matters, but the fundamentals remain and will help you grow in this Wild West era.</p><h2><strong>Events</strong></h2><ul><li><p><strong><a href="https://www.linkedin.com/events/7395204269165162496/?originTrackingId=mexcllqYQRON1zH%2FMIHwGg%3D%3D">Low-Key Tech Data Happy Hour - Denver</a></strong></p></li><li><p><strong><a href="https://luma.com/pzjftsm4">How To Build Right-Time Data Pipelines</a></strong></p></li></ul><h2><strong>Articles Worth Reading</strong></h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>The unwritten rules of being a head of data</h2><p>A friend of mine became head of data, but they had a hard time stepping away from coding. They kept shipping code late at night after their normal work ended.</p><p>This was fine at first until they realized, months later, they hadn&#8217;t really led anything of note for the business. 
There was nothing they had really driven.</p><p>They had been promoted, but were still working like a senior engineer: delivering lots of well-designed work but not actually taking a stand on any specific project.</p><p>Once you become a data leader, the game changes. What the business needs from you isn&#8217;t just to write another Python script or notebook. But that&#8217;s not always made clear. Some companies or mentors will help guide you as you become a leader. Others will give you the promotion and then hope you figure it out.</p><p>In this article, I wanted to discuss some of the unspoken rules and lessons you need to pick up fast if you are a data leader.</p><p><a href="https://hex.tech/blog/unspoken-rules-of-data-leadership/">Read More Here</a></p><h2>Organization Architecture</h2><p>By Alex Ewerl&#246;f</p><p>The reliability of a service is directly impacted by the shape of the organization providing it. 
The <em>shape</em>, not the budget, headcount, or maturity level, as irresponsible leaders like to frame it!</p><p>This article elaborates:</p><ul><li><p>Why does the shape of the organization matter for reliability?</p></li><li><p>How does it impact communication between teams and by extension interactions between the components of the system?</p></li><li><p>What insight can be unlocked with <em>consumer journeys</em> to create a healthy organization that builds reliable systems?</p></li></ul><p><a href="https://blog.alexewerlof.com/p/organization-architecture">Read More Here</a></p><div><hr></div><h2>End Of Day 204</h2><p>Thanks for checking out our community. We put out 4-5 newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing, and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[Translating Data Buzzwords Into Real Requirements]]></title><description><![CDATA[Bridging the 
Communication Gap Between Data and the Business]]></description><link>https://seattledataguy.substack.com/p/translating-data-buzzwords-into-real</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/translating-data-buzzwords-into-real</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Tue, 02 Dec 2025 15:12:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!oNid!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9cb499-b2ca-4b78-a454-4c28a5840351_1024x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>One of the challenges many data teams and leaders face is helping the business understand what they are asking for.</p><p>It might be that a business executive just came from a conference or read an article and now they are suddenly requesting your team to build an AI-powered, virtual 3-D pie chart that allows for self-service, dynamic drill-downs &#8220;like that demo they saw on stage.&#8221;</p><p>Or maybe someone from finance forwards you a Gartner report and asks why you <em>still</em> don&#8217;t have a Data Mesh, a Lakehouse, and whatever the latest marketing term is this month.</p><p>None of this is malicious...ok, perhaps a little from the marketing and sales from <a href="https://seattledataguy.substack.com/p/vendor-driven-design-the-role-vendors">vendors</a>&#8230;</p><p>It&#8217;s a symptom of a bigger problem:</p><p><strong>Most of the vocabulary we use in data has escaped into the business without the underlying meaning.</strong></p><p>People hear terms, &#8220;real-time,&#8221; &#8220;semantic layer,&#8221; &#8220;self-service,&#8221; &#8220;data quality&#8221;, and either don&#8217;t clarify their understanding or are waiting for someone else to properly define them.</p><p>This gap creates:</p><ul><li><p>Mismatched expectations</p></li><li><p>Projects that sound good but don&#8217;t solve real problems</p></li><li><p>And 
a lot of unnecessary fire drills</p></li></ul><p>So in this article, I want to break down a handful of terms that data teams constantly find themselves explaining, not because the business isn&#8217;t smart, but because these concepts are overloaded, over-marketed, and often misused.</p><h2>Batch And Real-Time - What Do You Really Need?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u3Z1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u3Z1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u3Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74628,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/180287412?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u3Z1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b46a053-f543-4be6-8293-ab0285e8ae9c_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>
      <p>
          <a href="https://seattledataguy.substack.com/p/translating-data-buzzwords-into-real">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Own the Spreadsheet, Own the World]]></title><description><![CDATA[Will We Ever Escape Excel]]></description><link>https://seattledataguy.substack.com/p/own-the-spreadsheet-own-the-world</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/own-the-spreadsheet-own-the-world</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Fri, 28 Nov 2025 18:05:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ym31!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Today I wanted to discuss spreadsheets and, well, Excel. It&#8217;s a tool that continues to be used decades later by both non-data and data teams alike. For some, this is a frustrating fact, as they think everything should be in code and version control. Others are making it part of their workflows.</p><p>But before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor: <a href="https://hex.tech/">Hex</a>. Hex brings the magic of AI to data analysis workflows, whether you&#8217;re using code or no-code. Hex helps organizations work together with data and avoid jumping between different data tools for querying, data science, visualization, and spreadsheets. Over 1600 organizations use Hex to do everything from deep analysis to self-serve.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>Can we get an Excel export?</p><p>For many data teams, this ask can be somewhat frustrating. 
You&#8217;ve likely poured hours or days into a dashboard only to feel as if the team that asked for it just wanted an Excel report.</p><p>Then, of course, there are all the one-off actual Excel reports and lookup tables being managed by a single person in companies. In fact, at many companies, it&#8217;s these one-off Excel reports, VBA scripts, and Google sheets that run whole departments from operations to accounting.</p><p>Not your polished, QAed, dynamic and impactful dashboard!</p><p>It really can feel like the meme below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ym31!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ym31!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ym31!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg" width="500" height="590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ym31!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ym31!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63ae2ee0-3169-4fcd-b964-95f8b898bf59_500x590.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It might make both new and more experienced data professionals scratch their heads. We have all this technology, data platforms that can crunch more data than ever before, and yet, somehow, everything seems to continue to touch Excel.</p><p>I recently had one person show me the financial model they built in Excel using Claude. So, like the QWERTY keyboard, spreadsheets will still be with us even if we manage to set up a colony on Mars.</p><p><em>And like the QWERTY keyboard, there will always be a minority resistance group that complains about its inefficiencies.</em></p><p>But Excel and spreadsheets, for many companies, are the way they are data-driven. They are the way their execs make decisions. 
</p><p>With that, I wanted to discuss some of the realities of Excel and spreadsheets and where I see them going in the future.</p><h2>Excel Is A Black Hole</h2><p>It&#8217;s tempting to fight Excel. After all, if everyone is just slicing and dicing data, it&#8217;s very easy to end up with inconsistent metrics, out-of-sync data sets, and general confusion.</p><p>Even if you built the perfect data analytics platform with every <a href="https://www.youtube.com/watch?v=wvUiRHd47M0">data point checked</a>, once it&#8217;s out in Excel, who knows what will happen to it.</p><p>But it&#8217;s where a lot of &#8220;<a href="https://seattledataguy.substack.com/p/the-inconvenient-truths-of-self-service?utm_source=activity_item">self-service analytics</a>&#8221; actually happens.</p><p>And here are a few reasons why:</p><ul><li><p><strong>There Is A Low Floor And High Ceiling -</strong> Meaning, if you&#8217;re just getting started, you can use the SUM() function and slowly work up to VLOOKUPs, and some people take it to the point of building a rollercoaster simulator inside it.</p></li></ul><div id="youtube2-IrVA1BBHFHw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;IrVA1BBHFHw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/IrVA1BBHFHw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><ul><li><p><strong>It&#8217;s Far Easier to Edit </strong>- If all I want to do is add $50,000 to a single field, it&#8217;s fast, easy, and doesn&#8217;t impact the permanent data source the way the same change in your data warehouse would. 
</p></li><li><p><strong>Instant What-If Analysis</strong> - Most BI tools aren&#8217;t really built for exploratory analysis. Excel? You can copy a tab, tweak three numbers, and immediately see the new forecast.</p></li></ul><blockquote><p>&#8220;Hey, what if our sales go up 12%&#8230; no, wait, 15%&#8230; actually, what if churn drops by 2%?&#8221;</p></blockquote><ul><li><p><strong>No Gatekeepers </strong>- In many large enterprises, if you ask for data, a dashboard, or new columns to be added to a data set, you get pushback. So instead, you go to Excel, pull the data sets that you do have access to, add your own columns, and build your own solution. Now you can get your job done without feeling like it&#8217;ll take six months.</p></li></ul><p>Many of the reasons above are why both data analytics and governance teams often find Excel difficult to grapple with.</p><p>Who hasn&#8217;t had to deal with a one-off Excel <a href="https://www.census.gov/programs-surveys/acs/guidance/comparing-acs-data/2008/crosswalk-table.html">lookup</a> table that only a single person manages?</p><ul><li><p>Do you put it into a database table so the <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data team</a> can now manage it?</p></li><li><p>Do you want to be responsible every time there is a new field?</p></li></ul><p>Maybe you&#8217;ll create a form or tool they can edit to manage the data in the table themselves (although that can be quite heavy).</p><p>And what about the other 100 Excel sheets that are doing something similar?</p><p>I ran into the same issue at Facebook, and for a while, we actually did have a tool that let me build a pretty simple table app so an end-user could essentially manage an isolated database table directly.</p><p>Which was great. I could control columns, data types, and even add in a few rules to check what data was being inserted while the end-user was working in a 
spreadsheet-like interface.</p><p>But eventually that got deprecated, and suddenly I was once again left with other teams&#8217; Google sheets(this was after we started reducing our use of<a href="https://quip.com/"> Quip</a>).</p><p>So, the spreadsheets once again reclaimed their data.</p><h2>Maybe Excel Is a Symptom of Deeper Issues</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pt1H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pt1H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Pt1H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Pt1H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Pt1H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pt1H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg" width="1000" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pt1H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Pt1H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Pt1H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Pt1H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F625ae384-6721-49dd-a5f6-e71e82322f1c_1000x794.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" 
x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I think it can be argued that if everyone is trying to circumvent your <a href="https://seattledataguy.substack.com/p/stop-shipping-dashboards-that-dont">dashboards</a> and go straight to Excel, that might be a signal of a larger issue. Here are just a few reasons they might be deciding to go straight to Excel:</p><ul><li><p>Your dashboards aren&#8217;t answering the real questions</p></li><li><p>They need flexibility that dashboards don&#8217;t offer</p></li><li><p>They don&#8217;t trust your source of truth (this would be a major issue)</p></li><li><p>They can&#8217;t wait two sprints for a dashboard tweak</p></li><li><p>They just want to build things themselves</p></li><li><p>The dashboard is slow</p></li><li><p>They feel safer making mistakes in Excel than in a tool they don&#8217;t fully understand</p></li><li><p>They don&#8217;t want to watch your video on how to use the dashboard (or feel like they don&#8217;t have time)</p></li></ul><p>I&#8217;ve been in situations where the marketing, finance, and operations teams use multiple dashboards to extract a few key datasets and then put them into their own spreadsheets. 
That should be a signal that the dashboards you&#8217;ve built may not be meeting their needs.</p><p>I think it&#8217;s important to understand these issues so you can:</p><ol><li><p>Build better dashboards that can solve your end-users&#8217; problems</p></li><li><p>Guide partner teams to better short- and long-term solutions (if you know that your partner team is just</p></li><li><p>Prioritize fixes that actually unblock how teams make decisions</p></li><li><p>Set expectations on what BI tools are <em>for</em> and what they are <em>not</em></p></li><li><p>Know when to build a more flexible interface instead of another static report</p></li><li><p>Spot patterns in how different teams work and design solutions that fit their realities</p></li></ol><p>And I am sure you could list still more reasons. The truth is, there is more than one reason a team might decide not to use your dashboard or data product. You should figure out why, to see whether there are specific issues with the dashboard itself or whether the team needs the data for uses the dashboard doesn&#8217;t cover.</p><h2>If You Can&#8217;t Beat Excel, Include It</h2><p>Recently, several BI tools have leaned into spreadsheets, which I believe is a smart move for them in terms of increasing the user base and meeting people where they are. </p><p>Many BI tools require some level of <a href="https://www.youtube.com/watch?v=uvACOp4WFR4&amp;t=1s">SQL</a>, which, sure, you could ask everyone at your company to learn, but I don&#8217;t think that&#8217;s fair to expect (not unless data teams also understand the <a href="https://seattledataguy.substack.com/p/7-questions-every-data-team-should">business better</a>).</p><p>So seeing companies provide a layer of familiarity in the BI layer makes some sense. 
Maybe it&#8217;ll finally get us closer to actually being able to put governance around spreadsheets rather than fighting them&#8230;</p><p>I&#8217;ve seen this massively increase the number of users who interact with BI solutions because the users can finally do what they want:</p><ul><li><p>Write data back and manage ad-hoc lookup tables</p></li><li><p>Create formulas and calculations as they do in spreadsheets</p></li><li><p>Explore the data with the same freedom they&#8217;re used to, no rigid dashboard filters</p></li><li><p>Build quick &#8220;what-if&#8221; scenarios without submitting tickets or waiting for dev cycles</p></li><li><p>Annotate, comment, and explain numbers directly in the UI</p></li></ul><p>I think most people would be surprised by how much even a few small changes do to make it easier to integrate workflows into your tool. Is it going to make data governance happy?</p><p>Probably not.</p><h2>Attempts To Reduce The Chaos</h2><p>Like I said earlier, we&#8217;ll likely never get rid of Excel or spreadsheets in general. They&#8217;re too convenient, too easy, and solve immediate problems too well. If your goal as a data leader is to reduce the number of spreadsheets that exist, it&#8217;s probably a fruitless effort. But I do think there are areas where analytics platforms can work to:</p><ul><li><p><strong>Detect Ad-hoc Usage Patterns</strong> - This was one of the projects I was considering prior to leaving Facebook. I wanted to be able to detect how end-users were using data to see if there were ways we needed to extend the current data platform before end-users asked us. Maybe there are tables being joined or custom columns being created. The challenge with Excel and spreadsheets is that much of this happens outside of any form of tracking. 
But I&#8217;d still love to see <a href="https://www.youtube.com/watch?v=VLtq0eeHc14">Snowflake or Databricks</a> integrate this into their platforms.</p></li><li><p><strong>Detect Metrics</strong> - This is one I&#8217;ve seen several tools and platforms offer. When they see you define a metric, especially in a BI layer, they&#8217;ll suggest you create a universal metric and provide version control. I love the idea. After all, one of the challenges with some reporting and BI tools is they feel isolated. You create a metric in one dashboard and no other dashboard knows that it exists. </p></li></ul><p>But none of this really brings Excel or spreadsheets under any form of governance or control. Not unless those spreadsheets exist inside some component of the data analytics platform itself.</p><p>Which is where I can see the world going. Especially if Snowflake and Databricks have their way and become the center for all your business data.</p><h2>Excel - Villain, Symptom, or the Last Mile?</h2><p>We often treat Excel like it&#8217;s the villain. Our nemesis in data.</p><p>In some cases, it can be. It can be the cause of many frustrating meetings where teams bring multiple conflicting metrics. Or maybe you offer to automate someone&#8217;s Excel process, only to open up a 30-spreadsheet-deep nightmare with more formulas than your VP has &#8216;quick asks.&#8217;</p><p>But take a step outside of the data team. For many, it&#8217;s their way of accessing data without anyone saying what they can and can&#8217;t do, and you&#8217;re not going to take that away from them. </p><p>And, as the title suggests, many companies are run off Excel. 
So no matter how much some data teams want to put that genie back in its bottle, it&#8217;ll be around far longer than your dashboard.</p><p>As always, thanks for reading!</p><h2>Events</h2><ul><li><p><strong><a href="https://www.linkedin.com/events/7395204269165162496/?originTrackingId=mexcllqYQRON1zH%2FMIHwGg%3D%3D">Low-Key Tech Data Happy Hour - Denver</a></strong></p></li><li><p><strong><a href="https://luma.com/ls6xaocp?tk=xhLF5u">Right-time Data Integration for Snowflake</a></strong></p></li></ul><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and companies&#8217; tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>5 Things in Data Engineering That Still Hold True After 10 Years</h2><p>When I started in the data world back in 2015, Hadoop was at its peak.</p><p>Actually, I happened to be scrolling through an old Instagram account and found a picture from a DAMA conference that Hortonworks sponsored (you can see it below). At the time, Hadoop and its ecosystem were everywhere: Hortonworks, Cloudera, MapR, each promising to reshape the future of data.</p><p>Fast forward just ten years and many newer practitioners don&#8217;t even recognize those names. And yet, in data engineering, a decade is barely enough time for fundamentals to change. 
Underneath the hype cycles and new logos, many of the same challenges remain.</p><p><a href="https://seattledataguy.substack.com/p/5-things-in-data-engineering-that">Read More Here</a></p><h2>Cash Rules Everything Around Me</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Joe Reis&quot;,&quot;id&quot;:3531217,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;uuid&quot;:&quot;0dbb18f8-e6f2-418c-83f9-85f66b4e9e2c&quot;}" data-component-name="MentionToDOM"></span> </p><p>I&#8217;ve been reflecting on the last week&#8217;s announced merger of two darlings of the data industry, and observing reactions to this news. In the practitioner community, it&#8217;s definitely a mix of Kumbaya and dread. Plenty of people are hopeful things will stay positive, but nearly everyone is making a Plan B (forking, budgeting, looking at new options, etc). I&#8217;m agnostic to whatever happens, and try to take a Zen-like attitude to these things, mostly because stuff like this is very common in our industry. It is what it is, and here&#8217;s some advice if you&#8217;re new to the game.</p><p><a href="https://joereis.substack.com/p/cash-rules-everything-around-me">Read More Here</a></p><div><hr></div><h2>End Of Day 203</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[What It Really Takes to Move From Senior to Staff Data Engineer ]]></title><description><![CDATA[An Interview With A Staff Level Data Engineer At Apple]]></description><link>https://seattledataguy.substack.com/p/what-it-really-takes-to-move-from</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/what-it-really-takes-to-move-from</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 17 Nov 2025 20:14:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1X1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data 
Leaders; Ben here &#128075;</p><p>Today&#8217;s article is an interview with <strong><a href="https://www.linkedin.com/in/brianjfemiano/overlay/about-this-profile/">Brian Femiano</a></strong>, who is currently a staff data engineer at Apple. </p><p>He&#8217;s been working in data engineering for nearly 20 years, with 9 of those as a staff engineer. He&#8217;s worked across many domains including ad tech, music, social media, and intelligence. In his own words, he enjoys mentoring tech professionals and helping folks level up, especially his managers. He lives with his wife Rachel and two boys.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>Once you start pushing past senior-level, whether that&#8217;s as a data engineer, analyst or other IC, there are a lot of skills that start becoming important that have nothing to do with technology.</p><p>Or at least less so.</p><p>I&#8217;ve already written several takes on this in the past so I wanted to bring in someone else&#8217;s perspective. So for this article we have <strong><a href="https://www.linkedin.com/in/brianjfemiano/overlay/about-this-profile/">Brian Femiano</a></strong>, who is currently a staff data engineer at Apple (in case you missed the intro), and he&#8217;ll be sharing about:</p><ul><li><p>Career Path And Motivation</p></li><li><p>Defining The Staff Role</p></li><li><p>Technical Design &amp; Systems Thinking</p></li><li><p>Collaboration &amp; Communication</p></li><li><p>Big-Picture Advice</p></li></ul><p>If you&#8217;re stuck at senior or are just curious what it takes, then this article is for you!</p><h2>Career Path &amp; Motivation</h2><p><strong>Q. How did you get into data engineering? And what do you enjoy about it?</strong><br><br>I started with Java services then one day got into the early Cloudera <a href="https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html">HDFS/MapReduce</a> training videos and got hooked. 
Distributed computing finally felt approachable for me and was really cool to see applied to data and not just number crunching.<br><br><strong>Q. Can you share the key moments or projects that shaped your path from senior to staff data engineer?</strong><br><br>Years ago I was on a team and someone needed to organize work into JIRA, assign that work out, document progress, help unblock teammates, and keep stakeholders outside of engineering informed. It&#8217;s not the kind of fun stuff devs enjoy doing but it had to get done. Once the org realized I had this holistic view of the project goals and how all the work could be organized to meet them, they recognized me as staff.</p><h2>Defining the Staff Role</h2><p><strong>Q. How would you describe the difference between senior and staff data engineer in your own words?</strong><br><br>It&#8217;s not an automatic flip so much as a gradual development. A senior dev might stick to a comfort zone and execute really well. The next step is non-coding skills that make other devs on their team better. 
</p><p>Things like:</p><ul><li><p>Diagramming systems and runbooks to help on-call</p></li><li><p>Having good relationships with product folks</p></li><li><p>Having a good relationship with other <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data teams</a> to know how your changes impact them. </p></li></ul><p>Staff engineers also have to context switch more frequently in a given day.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1X1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1X1s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 424w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 848w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 1272w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1X1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png" width="1456" height="907" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:907,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166433,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/178629117?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1X1s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 424w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 848w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 1272w, https://substackcdn.com/image/fetch/$s_!1X1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F237f666b-ccfa-43c0-84d4-f7601caa65ca_1534x956.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Image Added By Ben Rogojan From <a href="https://staffeng.com/guides/staff-archetypes/">Staff Eng Archetype Calendars </a> </figcaption></figure></div><p><em>Note From The Seattle Data Guy - I&#8217;ve always enjoyed the different Calendars on Will Larson&#8217;s website where he discusses many of the different archetypes of staff engineers. As Brian referenced about context switching, you might have to go from mentoring a junior engineer, to talking about scaling issues, to then discussing budget, interviewing, planning offsites, etc. 
Actually, in the next answer Brian will share his thoughts against the archetypes idea, which I do understand. I could write another article about my thoughts there, but for now let&#8217;s go back to Brian!</em><br><br><strong>Q. What misconceptions do people have about the staff title?</strong></p><ul><li><p>That you need to be the most talented developer on your team</p></li><li><p>That everyone fits into these nice little archetype roles like classes in an RPG video game. </p></li><li><p>That job hopping is the best way to get the title.</p></li></ul><p><strong>Q. What do you find is the most common blocker for senior data engineers trying to get to staff?</strong><br><br>Poor or non-existent communication or being abrasive. Not being flexible based on circumstances or able to see the big picture of their work. Being too focused on tools/languages and just generally not building for long term system health and quality.<br><br><strong>Q. For a senior engineer aspiring to staff, what specific signals should they show?</strong><br><br>There&#8217;s a ton but I&#8217;ll rattle a few off. </p><p>Make sure your managers and peers know you think holistically about the systems you&#8217;re building. Put your thoughts into diagrams and accept feedback well. </p><p>Support your teammates during fires. </p><p>Generally you want to make yourself indispensable to your team but not in a way where you have siloed knowledge or always have to play the hero. </p><p>It&#8217;s a fine line to walk.</p><h2>Technical Design &amp; Systems Thinking</h2><p><strong>Q. When faced with a complex request, what&#8217;s your process for designing a simple, scalable solution?</strong><br><br>Immediately diagram my thoughts. </p><p>What currently exists that might help satisfy the request?</p><p>Are we happy with those systems or is this our chance to refactor problem areas? </p><p>What&#8217;s the minimal set of new capabilities that needs to be built? 
</p><p>Can we leverage existing libraries? </p><p>Are any parts of the design susceptible to bottlenecks as load increases? </p><p>Does any part of this need constant manual attention or is it mostly automated?<br><br><strong>Q. How do you decide what to build yourself versus what to delegate or break into components?</strong><br><br>This is a tough one. The team leads I learned from tried not to put themselves on the critical path. I&#8217;ve broken this rule many times though based on circumstances. If you assign yourself too much it can delay the project and also deprive your teammates of growth, both of which hurt the business. Generally you want to entrust the most important areas to your teammates. Figure out where you can pitch in to help avoid overload.<br><br><strong>Q. Can you share an example of a &#8220;simple&#8221; solution to a complex problem that you&#8217;re particularly proud of?</strong><br><br>Back at Pandora we wanted a service that would notify artists when one of their songs was added to certain DJ-curated playlists. </p><p><strong>The business objective was clear:</strong> If we send them sharable links, then they might post them organically and give us free publicity. </p><p>We started out thinking we&#8217;d fire emails in real time for any new Kafka events, but this had a number of unattractive tradeoffs. </p><p>After some back and forth with product we realized artists didn&#8217;t even want the notifications in real time. We ended up building a minimal set of reliable components that read the events and <a href="https://www.theseattledataguy.com/batch-vs-real-time-data-pipelines-do-we-still-need-to-pick/">batch</a> the emails. </p><p>It&#8217;s still in prod today.</p><h2>Collaboration &amp; Communication</h2><p><strong>Q. 
What have you learned about keeping managers, product partners, and other engineers aligned on big projects?</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jSUT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jSUT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 424w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 848w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 1272w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jSUT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp" width="1456" height="676" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:676,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36570,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/178629117?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jSUT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 424w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 848w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 1272w, https://substackcdn.com/image/fetch/$s_!jSUT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8d2a1d-05fc-4de4-9353-cbb8b8ec03e9_1456x676.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p><a href="https://seattledataguy.substack.com/p/4-strategies-to-elevate-your-communication">With management I try to be succinct and timely</a> so they can stay informed on progress with minimal effort. For product it&#8217;s a lot of asking questions and note taking. Repeating back things they tell me in my own words. With engineering it&#8217;s about giving as much detail and clarity as possible, even if it means repeating yourself many times. Exercising patience with everyone. I&#8217;m still working on all of this.<br><br><strong>Q. How do you create an environment where teammates feel comfortable approaching you with questions or concerns?</strong><br><br>I think it starts with just being friendly and non-judgmental so they can ask me anything. 
I try to prioritize anyone who&#8217;s asked for my help to unblock them. Also give your teammates the leverage when they speak up and have a good idea you didn&#8217;t think of.<br><br><strong>Q. Any tips for writing design docs that earn trust and reduce confusion?</strong><br><br>Start out with 3-5 sentences outlining what&#8217;s being built and what the business benefits are. Reference other parties involved in the project so readers know everyone involved and can reach out themselves. Find a diagramming tool you like and focus on diagrams that communicate how pieces fit together. It&#8217;s less about how good the diagrams look artistically and more about how easy they are to follow.</p><h2>Big-Picture Advice</h2><p><strong>Q. Where do you see data engineering going over the next few years? Anything you&#8217;re most excited about?</strong><br><br>We&#8217;re already starting to see data engineer roles looking for proficiency in languages beyond just Java/Scala/Python/<a href="https://www.youtube.com/watch?v=uvACOp4WFR4">SQL</a> and that will continue. It&#8217;s also been cool to see orgs recognizing that their problems aren&#8217;t with volume but more with governance and data quality, and choosing tools/platforms more aligned with that reality. It&#8217;s exciting to see some of the open table formats evolve and help teams with those areas. I also have a somewhat optimistic take on GenAI in this space. There&#8217;s so much talk of it replacing entry level roles but I don&#8217;t think it will. 
The entry level folks I interact with are the ones really good with it and schooling the seniors.</p><p><em>Note From Seattle Data Guy - If you&#8217;d like to learn more about Brian and his past work, you can also check out the video below!</em></p><div id="youtube2-4c6tY0dLni4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;4c6tY0dLni4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/4c6tY0dLni4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Final Thoughts</h2><p>This is Ben Rogojan again!</p><p>What &#8220;senior&#8221; and &#8220;staff&#8221; engineer mean differs from company to company (actually, I guess I could say this is even true among teams). I&#8217;ve seen some focus more on architecture, others act as coding machines, and still others play a mix of technical project manager, strategic partner to the business, and mentor to more junior engineers.</p><p>The through line is that these engineers have some form of outsized impact. We all only have so many hours in the day and we can only code so fast. At a certain point we have to start thinking more about what we are coding vs. just the speed. </p><p>For example:</p><ul><li><p>Finding and working on projects that move the business forward </p></li><li><p>Helping the business remove ambiguity </p></li><li><p>Keeping the business from picking a solution that will cost them millions and require them to migrate in 18 months.</p></li></ul><p>I am so glad Brian shared his experiences. 
I hope you found them valuable!</p><h2>Events</h2><ul><li><p><strong><a href="https://www.linkedin.com/events/7395204269165162496/?originTrackingId=mexcllqYQRON1zH%2FMIHwGg%3D%3D">Low-Key Tech Data Happy Hour - Denver</a></strong></p></li><li><p><strong><a href="https://luma.com/pzjftsm4">How To Build Right-Time Data Pipelines</a></strong></p></li></ul><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles as well as TechCrunch and companies&#8217; tech blogs and wanted to share some of my favorites!</p><div><hr></div><h2>Beyond Big Tech: The Reality Of Data Engineering Outside Silicon Valley</h2><p>I&#8217;ve had the privilege of working at a mix of companies&#8212;big tech, start-ups, enterprises, and nonprofits. Each has given me a unique perspective on how different environments approach data: the infrastructure they build to manage it and how they ultimately use it.</p><p>I&#8217;m writing this because, while tinkering on a small side project, I was hit with a vivid reminder of just how frustrating it can be to work with data from certain industries. In fact, when I made a humorous post about it, someone called out the poor data model in the raw file I was using.</p><p>But that&#8217;s the thing: it wasn&#8217;t my model, that&#8217;s just how the data came.</p><p><a href="https://seattledataguy.substack.com/p/beyond-big-tech-the-reality-of-data">Read More Here</a></p><h2>Building Zone Failure Resilience in Apache Pinot&#8482; at Uber</h2><p>Initially, our Pinot clusters at Uber relied on two key strategies: tag-based instance assignment, which groups servers by tenant to ensure logical isolation, and balanced segment assignment, which spreads data segments evenly across those servers. While effective for distributing data evenly across servers within a tenant, this approach didn&#8217;t inherently guarantee distribution across different physical zones. 
If all instances assigned to a table, or all replicas of a segment, happened to be in a single zone, a failure in that zone would lead to significant service disruption.</p><p><a href="https://www.uber.com/blog/building-zone-failure-resilience-in-apache-pinot-at-uber/?uclick_id=e9ef7e2d-ce6c-4507-aeff-81da8f0b82ef">Read More Here</a></p><div><hr></div><h2>End Of Day 202</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[5 Things in Data Engineering That Have Changed In The Last 10 Years]]></title><description><![CDATA[What Will Change Next?]]></description><link>https://seattledataguy.substack.com/p/5-things-in-data-engineering-that-d11</link><guid 
isPermaLink="false">https://seattledataguy.substack.com/p/5-things-in-data-engineering-that-d11</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Fri, 07 Nov 2025 16:20:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qE5F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor: <a href="https://hex.tech/">Hex</a>. Hex brings the magic of AI to data analysis workflows, whether you&#8217;re using code or no-code. Hex helps organizations work together with data and avoid jumping between different data tools for querying, data science, visualization, and spreadsheets. Over 1600 organizations use Hex to do everything from deep analysis to self-serve.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>A few months back, I wrote an article about what hasn&#8217;t changed in the data world. And much of what hasn&#8217;t changed is the set of problems we face.</p><p>Of course, there are also plenty of things that have changed in the data world since I started. For example, the technologies and practices we use.</p><p>Even the words and terms we use, although mostly the same, have changed. Whether you like it or not.</p><p>When I started, no one used the term analytics engineer, and even the concept of a data engineer was still relatively new (at least in terms of how popular it later became).</p><p>In the same way, I have seen plenty of things change. 
Some of these changes are temporary, I believe (like my first point), others are likely larger trends.</p><p>So let&#8217;s dive into what&#8217;s changed in the data world in the last decade.</p><h2>1) Everyone Wants Seniors On Their Team</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y4Jq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y4Jq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 424w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 848w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 1272w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y4Jq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp" width="640" height="426" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:426,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21700,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/175906069?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y4Jq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 424w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 848w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 1272w, https://substackcdn.com/image/fetch/$s_!y4Jq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9580aea3-f8af-4484-8a82-dd0e1cae8f09_640x426.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5425555">Source - </a><strong><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5425555">Generative AI as Seniority-Biased Technological Change: Evidence from U.S. R&#233;sum&#233; and Job Posting Data</a></strong></figcaption></figure></div>
<p>Something strange started to happen a few years ago. Whenever I was speaking with data leaders and they brought up hiring, they always referenced <a href="https://seattledataguy.substack.com/p/three-traits-that-differentiate-great">senior engineer</a>s and analysts. 
That&#8217;s the only position they were looking for.</p><p>It didn&#8217;t really click what was going on until the chart above came out recently&#8230;and, well, things became clear.</p><p>Both anecdotally and in the data, there is a trend toward hiring more senior engineers over a mixed team.</p><p>I&#8217;ve heard a few causes:</p><ul><li><p>A senior engineer with a copilot can get far more work done by themselves.</p></li><li><p>Senior engineers tend to be able to jump into problems faster and apply what they&#8217;ve used in the past to solve said problem.</p></li><li><p><a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">Data teams</a> are smaller than they were a few years back, so they don&#8217;t have the headcount for juniors.</p></li></ul><p>That last point is from my experience, so it&#8217;s anecdotal. But I think it&#8217;s also what many other consultants and data leaders are seeing: smaller teams, perhaps in response to the considerably larger data teams of the early 2020s.</p><p>I do have some friends who are currently working on tools to help junior engineers become more productive even if a senior engineer is around. And I imagine there will be a correction in this trend. 
Because if you only hire senior engineers, eventually said seniors will just demand more income until it&#8217;s no longer viewed as cost effective to mostly hire seniors.</p><p>Don&#8217;t just take my word for it, feel free to share your own thoughts.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/p/5-things-in-data-engineering-that-d11/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://seattledataguy.substack.com/p/5-things-in-data-engineering-that-d11/comments"><span>Leave a comment</span></a></p><h2>2) Let&#8217;s Just Start With The Cloud </h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EAej!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EAej!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 424w, https://substackcdn.com/image/fetch/$s_!EAej!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 848w, https://substackcdn.com/image/fetch/$s_!EAej!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 1272w, 
https://substackcdn.com/image/fetch/$s_!EAej!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EAej!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png" width="1456" height="564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:344463,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/175906069?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EAej!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 424w, https://substackcdn.com/image/fetch/$s_!EAej!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 848w, 
https://substackcdn.com/image/fetch/$s_!EAej!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 1272w, https://substackcdn.com/image/fetch/$s_!EAej!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8eafac0-99b7-4c33-a0f5-769d126231cd_2190x848.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a 
href="https://bankingblog.accenture.com/banking-cloud-altimeter-magazine/volume-1-what-does-banking-cloud-mean">Accenture 2021 Banking Survey</a></figcaption></figure></div><p>It was only a few years ago that the cloud was just one option companies would consider.</p><p>It wasn&#8217;t the default.</p><p>Now I find many companies and data teams jump to the cloud immediately. And those not on the cloud already are migrating. Over the past few years I&#8217;ve done over half a dozen <a href="https://www.theseattledataguy.com/how-to-migrate-from-sql-server-to-snowflake/">SQL Server to Snowflake</a> migrations. Don&#8217;t just take my word for it. The chart above, from Accenture back in 2021, showed a similar trend of migrating away from on-prem to the cloud for analytics.</p><p>Cloud has become the first choice for many companies.</p><p>This is true up and down the ladder in terms of company size. Small, medium, or large, I haven&#8217;t had one project where someone decided to run a database or <a href="https://seattledataguy.substack.com/p/why-your-data-pipeline-probably-isnt">data pipeline</a> locally, other than to test an idea or as part of a dev workflow. I know, some of y&#8217;all are probably reading this thinking, &#8220;Well, I am still on-prem!&#8221;</p><p>There are still plenty of data stacks on-prem. Some have to be for regulatory reasons; others find it cost-effective to be on-prem.</p><p>However, in my experience, I am running into fewer and fewer teams that rely on an on-prem solution for analytics.</p><h2>3) Data Teams Don&#8217;t Jump To Building Their Own Custom Data Pipeline Solutions</h2><p>When I first started in the data world, many &#8220;data pipeline&#8221; systems were built with cron, Windows Task Scheduler, or maybe SQL Server Agent as the scheduler, calling either a bash or Python script, or directly calling SQL or SSIS.</p><p>Most companies I&#8217;d walk into would likely have built their own data pipeline and orchestration solution. 
Many of these were built over the past decade or so, and you&#8217;d literally be able to sense when <a href="https://seattledataguy.substack.com/p/from-analyst-to-leading-data-teams">data leaders</a> changed because the style of how things were written would change.</p><p>Now this isn&#8217;t the default.</p><p>Instead, there are plenty of other options, especially with the cloud (again), from something as simple as a <a href="https://www.youtube.com/watch?v=AXpOnpNg3cQ">Lambda</a> and EventBridge to going straight to a tool like Airflow. People are using pre-built tools to manage their automated pipelines.</p><p>I imagine there are many reasons for this:</p><ol><li><p>There is an expectation to deliver actual impact faster. If you spend time just building a data orchestration system, which requires you to build logging, metadata tracking, scheduling, and other components before even delivering a data pipeline, you&#8217;re likely going to hear about it from the business.</p></li><li><p>There is a lot more intro content out there, and a lot of it starts with tools such as Airflow. 
So if you&#8217;re a new data engineer or looking to set-up your first pipeline for your company, you&#8217;ll likely run into it and start there.</p></li><li><p>Job requirements often ask for tools such as <a href="https://www.theseattledataguy.com/common-pitfalls-in-deploying-airflow-for-data-teams/">Airflow</a>, <a href="https://coalesce.io/">Coalesce</a>, <a href="https://www.youtube.com/watch?v=8FZZivIfJVo&amp;t=105s">dbt</a>, and so on which means that&#8217;s what new data engineers pick-up.</p></li></ol><p>Whatever the reason, building heavily custom data pipeline systems seems to be a less frequent occurrence.</p><p>As always, I&#8217;d love to hear your thoughts on this as well.</p><h2>4) No One Is Questioning SQL As The Lingua Franca Of Data Anymore</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qE5F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qE5F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 424w, https://substackcdn.com/image/fetch/$s_!qE5F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 848w, https://substackcdn.com/image/fetch/$s_!qE5F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qE5F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qE5F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png" width="1456" height="1402" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1402,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:234813,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/175906069?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qE5F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 424w, https://substackcdn.com/image/fetch/$s_!qE5F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 848w, 
https://substackcdn.com/image/fetch/$s_!qE5F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 1272w, https://substackcdn.com/image/fetch/$s_!qE5F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef0a12d5-e1eb-4f15-9c93-d335bff679f5_1608x1548.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a 
href="https://datanerd.tech/">Datanerd.tech</a></figcaption></figure></div><p>Alright, that&#8217;s not 100% true. There are still people who dislike or want to replace <a href="https://seattledataguy.substack.com/p/behind-the-scenes-of-sql-understanding">SQL</a>. In fact, I usually find that most data engineers fall into liking either SQL or Spark (via data frames and Python).</p><p>However, if you joined the data world in 2015, every tool seemed to be building its own query language. Some were just offshoots of SQL, like BigQuery&#8217;s Legacy SQL, HiveQL, etc.</p><p>Other query languages tried to build a completely new approach to how you queried data. And of course, there was NoSQL, which was always touted as &#8220;Not Only SQL.&#8221; But it&#8217;s hard for me not to read the name as a way of pushing against SQL, whether underhanded or overt.</p><p>Regardless, here we are in 2025, and SQL is more popular than ever.</p><p>You can check out <a href="https://www.youtube.com/c/lukebarousse">Luke Barousse</a>&#8217;s graph above, where he <a href="https://datanerd.tech/">scraped</a> data for job requirements. As you can see, SQL is almost always required. People now start by learning tools that further reinforce that, like dbt.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SeattleDataGuy&#8217;s Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>5) AI Is Changing Workflows</h2><p>It&#8217;s impossible not to acknowledge that AI is playing a role in how data teams work. Whether someone is just adding in an LLM to avoid writing <a href="https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language">DDL</a> statements, or your IDE has three different copilots running at once, AI is changing the way we work.</p><p>On the positive side, I&#8217;ve personally used it and seen it remove a lot of the redundant work I did in the past.</p><p>On the more negative side, I&#8217;ve also seen people start to rely on it so heavily that they don&#8217;t really look into debugging. When an error arises, they just paste said error back into their copilot and have it try to figure out the problem. 
Sometimes this works, but many times I&#8217;ve seen more and more code bloat get written as a result&#8230;I guess we kind of debugged in a similar fashion during the peak of Stack Overflow.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!doU8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!doU8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 424w, https://substackcdn.com/image/fetch/$s_!doU8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 848w, https://substackcdn.com/image/fetch/$s_!doU8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 1272w, https://substackcdn.com/image/fetch/$s_!doU8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!doU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png" width="500" height="300" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65367,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/175906069?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!doU8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 424w, https://substackcdn.com/image/fetch/$s_!doU8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 848w, https://substackcdn.com/image/fetch/$s_!doU8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 1272w, https://substackcdn.com/image/fetch/$s_!doU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7abce5e2-f99b-4017-81c7-8cdcfce586ea_500x300.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>My final thought here: I wouldn&#8217;t be surprised if some of that feeling of efficiency we get when using an LLM is closer to the feeling of movement than of progress.</p><h2>Final Thoughts</h2><p>The more things change, the more they stay the same. But things do change. Not overnight. Just because someone in SF released some magic box that makes us all constantly question our own self-worth doesn&#8217;t mean that tomorrow all work will cease.</p><p>Just because many of our data workflows are online doesn&#8217;t mean we will magically start finding new value from data, no matter how much marketing there is out there.</p><p>What I have seen change is the size of companies looking into their data, and the speed at which companies are demanding that data provide them value. 
Perhaps that&#8217;d be better phrased as the impatience the business has with the data team. This could account for the hiring of seniors and the usage of the cloud, which might be viewed as faster.</p><p>But don&#8217;t let me be the only one sharing. I&#8217;d love to hear your thoughts: what has changed, and what has stayed the same?</p><p>Feel free to comment, and as always, thanks for reading!</p><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>How and Why Netflix Built a Real-Time Distributed Graph: Part 1 &#8212; Ingesting and Processing Data Streams at Internet Scale</h2><p>The Netflix product experience historically consisted of a single core offering: streaming video on demand. Our members logged into the app, browsed, and watched titles such as Stranger Things, Squid Game, and Bridgerton. Although this is still the core of our product, our business has changed significantly over the last few years. For example, we introduced ad-supported plans, live programming events (e.g., <a href="https://www.netflix.com/tudum/articles/jake-paul-vs-mike-tyson-live-release-date-news">Jake Paul vs. Mike Tyson</a> and <a href="https://www.netflix.com/tudum/articles/nfl-games-on-netflix">NFL Christmas Day Games</a>), and <a href="https://about.netflix.com/en/news/let-the-games-begin-a-new-way-to-experience-entertainment-on-mobile">mobile games</a> as part of a Netflix subscription. This evolution of our business has created a new class of problems where we have to analyze member interactions with the app across different business verticals. 
Let&#8217;s walk through a simple example scenario&#8230;</p><p><a href="https://netflixtechblog.com/how-and-why-netflix-built-a-real-time-distributed-graph-part-1-ingesting-and-processing-data-80113e124acc">Read More Here</a></p><h2>Is &#8220;data-driven&#8221; just slowing down your decisions?</h2><p>For decades, companies have been chasing the idea of being data-driven. I recall when I first started the term was plastered everywhere.</p><p>Every blog you read and conference you went to, someone referenced that term along with the idea of data being the new oil.</p><p>Everyone wanted to be driven by data. After all, companies like Facebook and Google were doing so well. And not just them, there were plenty of statistics that showed that companies that used data were more profitable than their competitors.</p><p><a href="https://hex.tech/blog/why-being-data-driven-is-slowing-you-down/">Read More Here</a></p><h2>Is It Time to Say Goodbye to Data Engineers?</h2><p>Ever since tools like <a href="https://estuary.dev/blog/sql-server-integration-services/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">SSIS</a> came onto the scene, vendors and business leaders have been on a mission to remove what they see as the biggest roadblock to data-driven decision-making: data engineers.</p><p>Or their counterparts&#8212;DBAs, ETL Developers, and Data Architects.</p><p>Sure, not everyone says it so explicitly, but you can see it in vendor marketing and in the decisions made by the business.</p><p>I remember talking to a veteran data expert who&#8217;s been in the field for three decades. They told me that when <a href="https://www.theseattledataguy.com/alternatives-to-ssissql-server-integration-services-how-to-migrate-away-from-ssis/">SSIS</a> first launched, people were genuinely afraid for their jobs. The idea that you could just drag-and-drop tasks that once required code was nerve-racking. 
But if you&#8217;ve used SSIS, well, you know the truth.</p><p>To some extent, I get why the idea is appealing. When a leader requests a report, a software engineer wants to modify an application table, or a data scientist wants to explore a new dataset, who&#8217;s the one slowing down the project?</p><p>The <em>data engineers.</em></p><p><a href="https://seattledataguy.substack.com/p/is-it-time-to-say-goodbye-to-data">Read More Here</a></p><div><hr></div><h2>End Of Day 201</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNzQ5NzQ1OTEsImlhdCI6MTc1OTgwODAwOSwiZXhwIjoxNzYyNDAwMDA5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.r4iJuFAam95SqaTj3zIeC4J8X9Gw0xBeEhmHAZ6ELg4&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNzQ5NzQ1OTEsImlhdCI6MTc1OTgwODAwOSwiZXhwIjoxNzYyNDAwMDA5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.r4iJuFAam95SqaTj3zIeC4J8X9Gw0xBeEhmHAZ6ELg4"><span>Share</span></a></p>]]></content:encoded></item><item><title><![CDATA[We’re All Living in Different Data Decades]]></title><description><![CDATA[200 Newsletters Later, What I&#8217;ve Learned About Data, Careers, and 
Change]]></description><link>https://seattledataguy.substack.com/p/were-all-living-in-different-data</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/were-all-living-in-different-data</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Thu, 30 Oct 2025 16:14:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rQhY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Today I&#8217;ll be reviewing some trends and thoughts I&#8217;ve been mulling over for 2025. This is also my 200th newsletter! So I&#8217;ll take a quick moment on that as well.</p><p>But before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor: <a href="https://hex.tech/">Hex</a>. Hex brings the magic of AI to data analysis workflows, whether you&#8217;re using code or no-code. Hex helps organizations work together with data and avoid jumping between different data tools for querying, data science, visualization, and spreadsheets. Over 1600 organizations use Hex to do everything from deep analysis to self-serve.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>First off, whoa, what a busy 2025!</p><p>It&#8217;s flown by, and with that, I am also realizing that this is my 200th newsletter!</p><p>200!</p><p>Thank you to everyone who has been reading; it really does mean a lot. 
</p><p>I had personally meant this final quarter of 2025 to be a bit of a breather, and yet, as they say&#8230;</p><blockquote><p><strong>If you want to make God laugh, tell Him about your plans.</strong></p></blockquote><p>I&#8217;ve been traveling and speaking.</p><p>Working with a lot of great clients.</p><p>A start-up I invested in got acquired (ok, that didn&#8217;t really involve me). So many congrats to the team.</p><p>Another <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">start-up I advise just raised their Series A</a>, which I&#8217;ll talk more about later.</p><p>Not to mention I went to Europe for the first time and was actually able to take some time off without panicking and needing to open my laptop every five minutes.</p><p>Somehow I still have more to do for 2025.</p><p>With all of that, I did want to share a few trends and points my mind keeps going back to as I start the path to my next decade in data.</p><h2>The Data World Spans Decades</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rQhY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rQhY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 424w, https://substackcdn.com/image/fetch/$s_!rQhY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 848w, 
https://substackcdn.com/image/fetch/$s_!rQhY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 1272w, https://substackcdn.com/image/fetch/$s_!rQhY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rQhY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png" width="1456" height="609" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:609,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143910,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/177482736?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rQhY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 424w, 
https://substackcdn.com/image/fetch/$s_!rQhY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 848w, https://substackcdn.com/image/fetch/$s_!rQhY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 1272w, https://substackcdn.com/image/fetch/$s_!rQhY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017072e6-00d2-4210-8edf-4b86e2df74db_2282x954.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>If you open up Google and start looking into different ways you could set up your data infrastructure, you&#8217;ll likely find hundreds of possible designs, from <a href="https://netflixtechblog.com/tagged/data-engineering">Netflix</a> and <a href="https://www.uber.com/blog/engineering/data/">Uber</a> to newsletters like mine.</p><p>It can be tempting to think that every company out there must be using the latest and greatest technology. But as someone who has seen dozens of companies, I can tell you firsthand that is not the case.</p><blockquote><p><strong>We are at a point where companies are in different decades of data infrastructure.</strong></p></blockquote><p>Some are using <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ">Snowflake</a>, <a href="https://www.theseattledataguy.com/what-is-apache-airflow-data-engineering-consulting/">Airflow</a>, and <a href="https://coalesce.io/">Coalesce</a>.</p><p>While others are successfully using <a href="https://www.theseattledataguy.com/what-is-ssis-and-should-you-use-it/">SSIS</a> and SQL Server.</p><p>Still others are on Hadoop.</p><p>All of which can be functioning.</p><p>Would I recommend anyone go and try to spin up Hadoop now? No. But there are a lot of technologies still in play. So don&#8217;t expect every job you work to always be with the latest and greatest tools. 
At the end of the day, if your data team is delivering results, that&#8217;s what matters.</p><p>Until you need to interview&#8230;</p><h2>The Data World Today Will Change Tomorrow</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!biom!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!biom!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 424w, https://substackcdn.com/image/fetch/$s_!biom!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 848w, https://substackcdn.com/image/fetch/$s_!biom!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!biom!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!biom!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg" width="800" height="990" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:990,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92254,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/177482736?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!biom!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 424w, https://substackcdn.com/image/fetch/$s_!biom!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 848w, https://substackcdn.com/image/fetch/$s_!biom!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!biom!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2ea7ee-42ff-47a2-b903-c47c635a1229_800x990.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Except Of Course For SQL&#8230; that&#8217;s going nowhere</figcaption></figure></div><p>When I first started in the data world, there was a lot of talk about Data Lakes. Now that talk has slowed or converted into Data Lakehouses.</p><p>In 2020, there was a part of the data world that spoke heavily about the Modern Data Stack.</p><p>Now that has also gone.</p><p>Many of the tools you&#8217;re spending time learning today might not be the &#8220;it&#8221; tool in 5 years.</p><p>In fact, they probably won&#8217;t be.</p><p>We&#8217;ll likely have figured out how to better integrate LLMs into workflows and where they fit best. 
A list of new tools with new names will start to replace our current options.</p><p>The reason I say this is that I often get questions on my videos about skills.</p><p>&#8220;Will this be important to know in 5 years?&#8221;</p><p>&#8220;Will this career be around for the next few decades?&#8221;</p><p>We work in tech. Things change.</p><p>I believe there will always be a need for individuals with technical capabilities.</p><p>People who are willing to solve hard problems. But some of those problems will likely look different in a decade, and you will have to learn new skills.</p><h2>Flexibility as a Principle</h2><p>One theme that is emerging is interoperability and flexibility. 
We&#8217;ve spent the last few decades often having to choose.</p><p>Streaming vs batch, <a href="https://spark.apache.org/">Spark</a> vs Presto?</p><p>Shouldn&#8217;t a tool just allow us to use what we need when we need it?</p><p>Shouldn&#8217;t tools also be able to talk to each other?</p><p>Many data tools are starting to provide this increased flexibility and I see this becoming more of a norm.</p><p>For example, <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a>, a company I&#8217;ve been working with for several years now, lets end-users choose when their data arrives. Do you want your data now or in batch?</p><p>If you need it streamed, stream it. But if you don&#8217;t, and you&#8217;re loading into Snowflake and want to keep your Snowflake costs lower, load it in larger chunks. That means you could load multiple sources with different patterns if that would benefit you.</p><p>You can deliver your data at the right time.</p><p>Similarly, there are many tools now that are positioning themselves as compute-agnostic. Just use <a href="https://iceberg.apache.org/">Apache Iceberg</a> and then pick which compute engine you want to use. Presto, DuckDB, etc. Then you can serve the data in Snowflake, Databricks, or just keep it in Iceberg.</p><p>On the interoperability side, it seems as if every company now wants to be known as open, whether they are or aren&#8217;t. </p><h2>Final Thoughts</h2><p>Once again, thank you to all the readers of this newsletter. Your support means so much!</p><p>Whether you&#8217;ve been here since issue #1 or joined somewhere along the way, writing these each week has been one of the most fulfilling parts of my career.</p><p>As we start winding down this year, it&#8217;s important to remember that change will happen. Technologies update and get replaced.</p><p>Tools and buzzwords that are popular today might fade into the background. 
But I do believe we will still need individuals who enjoy tinkering, testing, and solving problems.</p><p>With that, I want to say here&#8217;s to another 200 newsletters, and thanks, as always, for reading.</p><h2>Upcoming Data Events I&#8217;ll Be At</h2><ul><li><p><strong><a href="https://luma.com/mbg70e9j">Queries, Cocktails &amp; Community - Small Data SF Happy Hour</a></strong></p></li><li><p><strong><a href="https://smalldatasf.com/">Small Data SF</a></strong></p></li></ul><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Anyone Else Struggling to Keep Up With Data Tools</h2><blockquote><h3>Are you managing to keep up?</h3></blockquote><p>One of the truths I&#8217;ve realized about working in the data world, and really the technology world, is that you can work a lifetime and never ever touch massive swaths of technologies.</p><p>Like Sisyphus, you could spend every day pushing the boulder of new technology up the hill, only to have it roll back down to the bottom.</p><p>You could work only on <a href="https://www.youtube.com/watch?v=VLtq0eeHc14&amp;t=1s">Snowflake or Databricks</a>, and in twenty years, never have touched the other. Okay, that might be unlikely, given how some companies set up their data stacks. Somehow, even smaller companies are finding a way to use both. 
Probably some classic <a href="https://seattledataguy.substack.com/p/vendor-driven-design-the-role-vendors">vendor-driven development</a>.</p><p>Still, if you&#8217;ve only worked in big tech, you might never see some of these solutions while working on unique tools that are developed in-house.</p><p>So how do you actually keep up?</p><p><a href="https://seattledataguy.substack.com/p/anyone-else-struggling-to-keep-up">Read More Here</a></p><h2>DataViz 101: Key Principles for Crafting Clear Dashboards</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Anastasiya Kuznetsova&quot;,&quot;id&quot;:99725349,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!2E6h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7eb9d9c-d4e0-4f30-bc37-73eb9ffe4d53_516x534.png&quot;,&quot;uuid&quot;:&quot;5ca46125-268f-4b9f-beb0-b4a9772db224&quot;}" data-component-name="MentionToDOM"></span> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kUvl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kUvl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 424w, https://substackcdn.com/image/fetch/$s_!kUvl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 848w, 
https://substackcdn.com/image/fetch/$s_!kUvl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 1272w, https://substackcdn.com/image/fetch/$s_!kUvl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kUvl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp" width="1456" height="369" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:369,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32264,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/177482736?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kUvl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 424w, 
https://substackcdn.com/image/fetch/$s_!kUvl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 848w, https://substackcdn.com/image/fetch/$s_!kUvl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 1272w, https://substackcdn.com/image/fetch/$s_!kUvl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2be9ff-4e50-430e-a049-0e6e12e2d7d7_1456x369.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I&#8217;ve reviewed more than 3,000 students&#8217; dashboards over the past three years, and here are my tips if you&#8217;re just starting your data visualization career.</p><p><a href="https://nastengraph.substack.com/p/dataviz-101-key-principles-for-crafting">Read More Here</a></p><div><hr></div><h2>End Of Day 200</h2><p>Thanks for checking out our community. We put out 4-5 newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[Three Traits That Differentiate Great Senior Data Engineers]]></title><description><![CDATA[How to stand out when technical skill alone isn&#8217;t enough]]></description><link>https://seattledataguy.substack.com/p/three-traits-that-differentiate-great</link><guid 
isPermaLink="false">https://seattledataguy.substack.com/p/three-traits-that-differentiate-great</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Mon, 20 Oct 2025 18:53:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3eBN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor: <a href="https://hex.tech/">Hex</a>. Hex brings the magic of AI to data analysis workflows, whether you&#8217;re using code or no-code. Hex helps organizations work together with data and avoid jumping between different data tools for querying, data science, visualization, and spreadsheets. Over 1600 organizations use Hex to do everything from deep analysis to self-serve.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>Early in my career, I had a conversation with a friend who was working as a director leading both software and data teams. </p><p>One of the points they brought up was that some of their senior-level analysts and data scientists would hand them massive chart dumps whenever they ran any form of analysis.</p><p>Somewhere deep in a Word document or Python notebook layered with charts and explanations might be a conclusion. </p><p>I had actually not considered this point before having this conversation. </p><p>It&#8217;s been years since that chat, and since then I&#8217;ve had nearly a hundred conversations with data leaders. Several skills and traits kept standing out as differentiators for engineers and analysts. 
Here are three of them.</p><h2>Getting Buy-In And Influencing Decisions</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3eBN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3eBN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 424w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 848w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3eBN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png" width="1456" height="1025" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1025,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:128290,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/176540412?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3eBN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 424w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 848w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!3eBN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd5b6b-bd62-4f99-a28a-61c3aaaebc27_1654x1164.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I don&#8217;t recall who told me this, but someone once described how every level (junior, mid, senior, etc.) has levels inside it. Each one breaks down into a more &#8220;junior&#8221; and a more &#8220;senior&#8221; version of that specific level. </p><p>Once you start going from a more &#8220;junior senior&#8221; to a &#8220;senior senior&#8221;, many data leaders I&#8217;ve spoken with are looking to see if their ICs can start to get buy-in for their projects. </p><p>At this level, many <a href="https://seattledataguy.substack.com/p/how-to-grow-from-mid-level-to-senior?utm_source=publication-search">senior engineers</a> and <a href="https://www.youtube.com/watch?v=uEPCxBaRf6A">analysts</a> are likely starting to feel antsy. Like they should be promoted to staff. 
And although every company has a different definition of what staff means, having the ability to get buy-in came up several times as a differentiator.</p><p>This is hard because it involves a combination of skills and traits.</p><ul><li><p>You need to be able to communicate an idea</p></li><li><p>You need to have built some credibility in the organization</p></li><li><p>You need to be able to understand the needs of other teams</p></li></ul><p>Each of these aspects requires time. But you don&#8217;t have to start with some massive project.</p><h3>How to Build Buy-In as a Senior IC</h3><ol><li><p><strong>Start Small</strong> &#8211; Pick a recurring pain point, propose a lightweight fix, and see it through. If you&#8217;ve never gotten buy-in for a small idea, you won&#8217;t be used to the pushback you&#8217;ll get, nor will you have the <a href="https://seattledataguy.substack.com/p/building-credibility-as-a-data-leader?utm_source=publication-search">credibility</a> to lean on. </p></li><li><p><strong>Draft a Short One-Pager</strong> &#8211; This should outline the problem, impact, and a lightweight solution. We often think other people are in our heads and get what we mean when we say it. Writing things down often highlights where teams might not be aligned or where issues will arise. So do it and figure out how to resolve said issues.</p></li><li><p><strong>Socialize Early</strong> &#8211; Float ideas in 1:1s before you &#8220;pitch.&#8221; I had plenty of ideas get shot down because I never asked individuals around the key stakeholder what they might think of the pitch. Nor did I bring anyone else in on the idea so they could feel they had helped shape it.</p></li><li><p><strong>Map Stakeholders</strong> &#8211; Know who&#8217;s affected and what each group values. There will always be pushback, especially when you start making larger requests. 
Prepare for it based on the various <a href="https://seattledataguy.substack.com/i/162696392/build-champions-not-just-stakeholders">stakeholders</a>.</p></li><li><p><strong>Deliver Consistently</strong> &#8211; Reliability is credibility. At the end of the day, if you&#8217;re not delivering, all the pre-work won&#8217;t matter.</p></li><li><p><strong>Show Impact</strong> &#8211; Translate technical work into business outcomes, time saved, ROI. Don&#8217;t just talk about the technical improvements as this likely won&#8217;t land.</p><p></p></li></ol>
      <p>
          <a href="https://seattledataguy.substack.com/p/three-traits-that-differentiate-great">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[When Words Become Data Architecture]]></title><description><![CDATA[Do We Need To Rethink How We Approach Data?]]></description><link>https://seattledataguy.substack.com/p/when-words-become-data-architecture</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/when-words-become-data-architecture</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Wed, 15 Oct 2025 20:50:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hpXy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before we dive in, a quick plug for <a href="https://retool.com/?utm_source=seattledataguy&amp;utm_medium=newsletter&amp;utm_campaign=copilot&amp;utm_content=main_ad&amp;rcid=701Qo000014OR1MIAW">Retool</a>. Their new AI-assisted dev capability lets you build internal tools on top of your live data&#8212;without wrangling a prototype or settling for a static BI dashboard. Describe what you need, and Retool creates a secure app you can fully customize. Perfect for building interactive dashboards and admin tools connected to your warehouse, dbt, or orchestration metadata.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>It&#8217;s the early 1950s, and an American journalist named Edward Hunter begins publishing a series of reports (and eventually books) about U.S. prisoners of war in Korea. He claims they are being &#8220;brainwashed&#8221;.</p><p><a href="https://www.technologyreview.com/2024/04/12/1090726/brainwashing-mind-control-history-operation-midnight-climax/">But until this point, the term &#8220;brainwashed&#8221; didn&#8217;t actually exist in English (or at least hadn&#8217;t been popularized).</a></p><p>The term caught fire. 
It appeared in headlines, films, and <a href="https://www.technologyreview.com/2024/04/12/1090726/brainwashing-mind-control-history-operation-midnight-climax/">congressional hearings</a>. It&#8217;s important to note that Hunter wasn&#8217;t merely a journalist; he was a seasoned propaganda expert who had served in the OSS, the predecessor to the CIA.</p><p>The U.S. military seized on the idea of &#8220;brainwashing&#8221; to discredit confessions made by American POWs, including statements admitting to biological warfare. Depending on the source, some claimed that the word &#8220;brainwashing&#8221; was injected into the English language solely so the military could have a defense against the American POWs who were making said statements.</p><p>Now, why am I writing about this?</p><p>The Seattle Data Guy who now lives in Denver?</p><p>Because many of the words we use in our day-to-day as data engineers and analysts were only created in the past few decades, and yet many of them dictate our choices on tooling, on design, and the work we take on. So I wanted to dig into how the words that we use play that role.</p><h2>Words Shape the Way We Work</h2><p>I used <em>brainwashing</em> as an example because it&#8217;s a little dramatic, and it shows how language can shape how we see the world, how it can excuse behavior or choices because we now have a word to define it. But not every term is born from manipulation. Sometimes, we&#8217;re just trying to name something new we&#8217;ve noticed, like a pattern or behavior.</p><p>And when you do it well, when you manage to wrap something complex into a simple phrase, the word will spread. It creates a shared language. It makes it easier for others to build on your thinking and to bring structure to what was previously vague.</p><p>So I don&#8217;t view this as inherently bad or negative. But it can limit our perspective on the world. 
These words and terms are also often self-replicating, passed from one company or organization to another long after their meaning has drifted.</p><h2>It All Started With The Data Warehouse</h2><p>It&#8217;s hard to start this dive into the terms that have shaped the data world without mentioning data warehousing. It remains a central part of our work today.</p><p>It&#8217;s why you&#8217;re stuck on the <a href="https://www.youtube.com/watch?v=bsx3xMtbvcs">migration</a> project from SQL Server to <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ">Snowflake</a>.</p><p>It&#8217;s why everything seems to revolve around lakes and lakehouses. It&#8217;s all merely an evolution or variation of that term. There might be a better approach, but why question it?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hpXy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hpXy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 424w, https://substackcdn.com/image/fetch/$s_!hpXy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 848w, https://substackcdn.com/image/fetch/$s_!hpXy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hpXy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hpXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png" width="1456" height="927" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:927,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/176237946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hpXy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 424w, https://substackcdn.com/image/fetch/$s_!hpXy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 848w, 
https://substackcdn.com/image/fetch/$s_!hpXy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 1272w, https://substackcdn.com/image/fetch/$s_!hpXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f37515e-c72b-48ca-92b8-772ae2ffaf22_1496x952.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Speaking of the Lakehouse.</h4><p>I&#8217;d be remiss not to reference the Data Lakehouse. 
I think you could argue that this one concept alone (and <a href="https://www.youtube.com/watch?v=QST8NGccQsA">Databricks</a> putting everything they could behind it) willed Databricks into the spotlight as more than just a managed Spark service, even though that label has been difficult to shake to this day.</p><p>It&#8217;s funny because Databricks <a href="https://blog.451alliance.com/data-lakehouse-is-here-to-stay-no-matter-what-we-call-it/#:~:text=Even%20though%20Amazon%20and%20Snowflake,a%20January%2030%2C%202020%2C%20blog">wasn&#8217;t even the first company to use the term</a>. But Databricks did everything they could to legitimize the term even further. The first article on the topic came out around <a href="https://www.databricks.com/blog/2020/01/30/what-is-a-data-lakehouse.html">January 2020</a>, and then somewhere in early 2021, the idea took off (see Google Trends below).</p><p>Again, I do think this was capturing an idea that had already been floating around. It seemed like a natural combination of Data Warehouse and Data Lake, which is why I believe the idea was so potent.</p><p>We were tired of having to build both.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6rs4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6rs4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 424w, 
https://substackcdn.com/image/fetch/$s_!6rs4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 848w, https://substackcdn.com/image/fetch/$s_!6rs4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 1272w, https://substackcdn.com/image/fetch/$s_!6rs4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6rs4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png" width="1456" height="845" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:845,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166200,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/176237946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!6rs4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 424w, https://substackcdn.com/image/fetch/$s_!6rs4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 848w, https://substackcdn.com/image/fetch/$s_!6rs4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 1272w, https://substackcdn.com/image/fetch/$s_!6rs4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F513239b8-088d-46ab-9418-c0366d1ceccf_2350x1364.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Snowflake was likely very aware that this term gave Databricks a new mental road in people&#8217;s minds. This was an opportunity for Databricks to start taking workloads from Snowflake. Thus, much of Snowflake&#8217;s marketing opted for the term &#8220;Data Cloud&#8221;.</p><p>In fact, I recall a video I recorded with a Snowflake employee that was never aired. I referred to Snowflake as a data warehouse, but the employee insisted that I re-record it and that the original never see the light of day.</p><p>Personally, I think Data Cloud lacks a certain level of graspability. It really is meaningless.</p><p>If you&#8217;re creating a term, it needs to capture your audience effectively. If it&#8217;s so broad that it could literally mean anything (as I believe Data Cloud does), it probably won&#8217;t catch on.</p><h2>Death Of Terms</h2><blockquote><p><strong>Some words and concepts last for decades.</strong></p><p><strong>Others barely make it a decade.</strong></p></blockquote><p>When I first started in the data world, <a href="https://www.reddit.com/r/dataengineering/comments/pcbds8/what_exactly_does_schema_on_read_mean_in_a_data/">various vendors were pitching schema-on-read</a>, and not too long after that, the term Modern Data Stack came around.</p><p>Perhaps one of these terms was created with <a href="https://roundup.getdbt.com/p/is-the-modern-data-stack-still-a?r=oc02">limited longevity in mind.</a> I say that mostly in jest. I think most companies have no idea what terminology will stick. 
Maybe that&#8217;s why Databricks keeps trying to invent new <a href="https://www.databricks.com/blog/what-is-a-lakebase">terms</a>. </p><p>But as with some viruses, a population might slowly become inoculated against an idea. Perhaps due to overexposure or due to the vaccine of reality, people start to question it, break it down, and push it aside. Once enough of a population has rejected an idea, it slowly ceases to exist.</p><p>That&#8217;s what seems to have happened to both schema-on-read and now Modern Data Stack. Well, for many, the idea of the <a href="https://www.youtube.com/watch?v=b0ar5Bwiajw">MDS passed away a while back</a>.</p><h2>Final Thoughts</h2><p>I write this article both for myself and for everyone reading. The terms we use in the data world are useful and can help us have a shared language.</p><p>They can also trap us in our current world, where we limit our ability to see beyond the walls of our own minds. </p><p>Those ideas can be hijacked by vendors to sell their products. 
 </p><p>They can also play naturally in a space that was always looking to be better defined.</p><p>We&#8217;ve had so many terms floated in the past few years that it can feel overwhelming.</p><p>And I bet that real soon we&#8217;ll have some new terms that redefine an all-in-one data stack.</p><p>If you&#8217;re just getting started in your data career, I owe you an article with a longer breakdown of all the terms and their timeline, almost like a vaccine against ideas.</p><p>As always, thanks for reading.</p><h2>Upcoming Data Events I&#8217;ll Be At</h2><ul><li><p><strong><a href="https://luma.com/mbg70e9j">Queries, Cocktails &amp; Community - Small Data SF Happy Hour</a></strong></p></li><li><p><strong><a href="https://smalldatasf.com/">Small Data SF</a></strong></p></li></ul><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and companies&#8217; tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Why is everything so scalable?</h2><p>I&#8217;m entirely convinced that basically every developer alive today heard the adage &#8220;dress for the job you want, not the job you have&#8221; and figured that, since they always wear jeans and a t-shirt anyway, they might as well apply it to their systems&#8217; architecture. 
This explains why the stack of every single company I&#8217;ve seen is invariably AWS/GCP with at least thirty microservices (how else will you keep the code tidy?), a distributed datastore that charges per query but whose reads depend on how long it&#8217;s been since the last write, a convoluted orchestrator to make sure that you never know which actual computer your code runs on, autoscaling so random midnight breakages ensure you don&#8217;t get too complacent with your sleep schedule, and exactly two customers (well, <em>potential</em> customers).</p><p><a href="https://www.stavros.io/posts/why-is-everything-so-scalable/">Read More Here</a></p><h2>If You&#8217;re New to Data, Read This Before You Build Anything</h2><p>When you&#8217;re just getting started in data, everything feels exciting, and everything sounds like a good idea.</p><p><em>&#8220;Oh, this process takes ten minutes? I&#8217;ll automate it with VBA!&#8221;</em></p><p>Fast-forward four weeks, and you&#8217;re trying to meet the finance team&#8217;s expectations, reconciling numbers, and banging your head against a table, wondering why you ever volunteered.</p><p>New paradigms show up with shiny names and polished diagrams. They sound smart. You&#8217;ve got nothing to compare them to, so you try them. After all, everyone else seems to be.</p><p>That&#8217;s what this article is about: the things I wish I understood earlier in my data career. 
It started because there were so many buzzwords and new products popping up in the past few weeks, it might not be clear what&#8217;s actually going on if you&#8217;re new to the data space.</p><p>Whether you&#8217;re early in your data career or just want a sanity check, here&#8217;s a breakdown of the stuff that actually matters (and the fluff that doesn&#8217;t).</p><p>Let&#8217;s dive in.</p><p><a href="https://seattledataguy.substack.com/p/if-youre-new-to-data-read-this-before">Read More Here</a></p><div><hr></div><h2>End Of Day 198</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNzQ5NzQ1OTEsImlhdCI6MTc1OTgwODAwOSwiZXhwIjoxNzYyNDAwMDA5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.r4iJuFAam95SqaTj3zIeC4J8X9Gw0xBeEhmHAZ6ELg4&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNzQ5NzQ1OTEsImlhdCI6MTc1OTgwODAwOSwiZXhwIjoxNzYyNDAwMDA5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.r4iJuFAam95SqaTj3zIeC4J8X9Gw0xBeEhmHAZ6ELg4"><span>Share</span></a></p>]]></content:encoded></item><item><title><![CDATA[7 Questions Every Data Team Should Ask the Business]]></title><description><![CDATA[How To Find Projects Worth Working 
On]]></description><link>https://seattledataguy.substack.com/p/7-questions-every-data-team-should</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/7-questions-every-data-team-should</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Tue, 07 Oct 2025 14:38:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fov7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fea9135cc-f9d6-4856-8596-2ca9a1655cb6_256x256.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor, <strong><a href="https://www.bmc.com/">BMC</a></strong>. Many data teams still spend too much time babysitting pipelines and stitching together scripts. Control-M helps automate that work, coordinating data workflows, triggering ML jobs, and syncing across cloud platforms, so teams can focus on analysis instead of maintenance.</p><p>Check out their latest case study to see <a href="https://www.bmc.com/customers/dominos-pizza.html">how it can simplify complex workflows</a>.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>A common challenge for data teams is figuring out what to work on that will drive the most value.</p><p>In some cases, the business has projects they believe should be taken on that you might disagree with, but you don&#8217;t have a solid list of suggestions, so you&#8217;re stuck with what they&#8217;ve asked you to work on.</p><p>In other cases, the business team might not know exactly how they&#8217;d like to use data, other than perhaps a dashboard or two. 
That means you can&#8217;t just ask, &#8220;What data projects do you need?&#8221; and expect a helpful answer.</p><p>Instead, you need to ask questions that uncover pain points, opportunities, and processes where data could make a meaningful difference.</p><p>At Facebook, we conducted this exercise every six months to realign priorities and identify new, high-impact projects. We&#8217;d go and talk to our business partners to understand their current and future needs. From there, we&#8217;d work to come up with projects we believed aligned with those needs.</p><p>If you are looking for questions to help you better align with the business, here are seven questions you can ask to surface valuable data projects. (Also, I&#8217;d love to hear if you have any questions you&#8217;d recommend.)</p><h3>1) What&#8217;s keeping your team from hitting its targets right now?</h3><p>Not everything is about data. Many leaders recognize that there may be areas where they are struggling to meet targets.</p><p>Perhaps they aren&#8217;t converting enough on ads, or their team has consistently missed the mark when it comes to delivering products on time. These are real-life situations where data may or may not be helpful.</p><p>But instead of asking, &#8220;Where can data help?&#8221;, your focus should be on understanding the bigger picture.</p><p><strong>Department-Specific Versions Of This Question</strong></p><ul><li><p><strong>Finance - </strong>Are there recurring surprises in expenses or revenue that you wish you could predict earlier?</p></li><li><p><strong>Marketing - </strong>Are there channels where you feel you&#8217;re spending too much but not seeing results?</p></li><li><p><strong>Operations - </strong>Where do delays or inefficiencies tend to pop up in your daily workflow?</p></li></ul><p></p>
      <p>
          <a href="https://seattledataguy.substack.com/p/7-questions-every-data-team-should">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Everyone Says ‘Data Teams Should Drive Value’ But How?]]></title><description><![CDATA[Understanding Your Data Team’s Function]]></description><link>https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/everyone-says-data-teams-should-drive</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Sat, 04 Oct 2025 13:18:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5dFg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before diving into today&#8217;s newsletter, I want to take a moment to thank this issue&#8217;s sponsor, <strong><a href="https://www.bmc.com/">BMC</a></strong>. Many data teams still spend too much time babysitting pipelines and stitching together scripts. Control-M helps automate that work, coordinating data workflows, triggering ML jobs, and syncing across cloud platforms, so teams can focus on analysis instead of maintenance. </p><p>Check out their latest case study to see <a href="https://www.bmc.com/customers/dominos-pizza.html">how it can simplify complex workflows</a>.</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>I&#8217;ve been thinking a lot about the fact that, for many companies, the business value only begins where the data team&#8217;s work ends. Even with a simple dashboard or model, you deliver the data product, and then what?</p><blockquote><p><strong>Value is supposed to appear, right?</strong></p></blockquote><p>After all, that&#8217;s what the speaker at a conference said: <em>&#8220;data teams need to provide value&#8221;</em> or some other cliche line. 
</p><p><strong>Of course data teams need to provide value</strong>. But <strong>how</strong> a team provides value is often left vague.</p><p>That&#8217;s where many leaders get stuck. They want their team to be &#8220;strategic&#8221; or &#8220;<a href="https://seattledataguy.substack.com/p/understanding-business-needs-staying">business-driven</a>,&#8221; but they haven&#8217;t clearly defined what <em>role</em> their data team actually plays. </p><p>A team tasked with building out a <a href="https://www.theseattledataguy.com/data-warehousing-essentials-a-guide-to-data-warehousing/">data warehouse</a> will provide value in a very different way than a team embedded with marketing to improve campaign ROI.</p><p>Over the years, I&#8217;ve seen multiple types of data teams. It can be tempting to think about data teams in terms of roles: the analytics team, the data engineering team, the data infra team, etc. But I think it&#8217;s more important to think about the function you want your data team to have.</p><p>What are its goals? What business function does it provide?</p><p>Sometimes data teams are strategic partners who connect business problems to data solutions. 
Others might be enabling other teams by being the platform builders who develop and maintain the data infrastructure and warehouse.</p><p>And in some organizations, the data team is just <strong>1-2 people who have to do all of the above</strong>.</p><p>To figure out how <em>your</em> team provides value, you first need to understand the role it plays for the business.</p><h2>Four Ways Data Teams Provide Value</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5dFg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5dFg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 424w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 848w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 1272w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!5dFg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png" width="1424" height="856" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:856,&quot;width&quot;:1424,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136550,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174974591?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5dFg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 424w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 848w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 1272w, https://substackcdn.com/image/fetch/$s_!5dFg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e990cc3-fe70-4251-bf23-b77b25ebd188_1424x856.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://martinfowler.com/bliki/TeamTopologies.html">Source</a></figcaption></figure></div><p>A framework I enjoy when considering team types comes from <em><a href="https://www.amazon.com/Team-Topologies-Organizing-Business-Technology/dp/1942788819">Team Topologies: Organizing Business and Technology Teams for Fast Flow</a></em>.
While the book is about technology teams broadly, its patterns apply neatly to <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data teams</a> too.</p><p>I&#8217;ve listed a few examples below.</p><h3>Platform Teams</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M2Be!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M2Be!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 424w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 848w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M2Be!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png" width="1456" height="799" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:160150,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174974591?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M2Be!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 424w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 848w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!M2Be!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c5d8a6a-d294-4bf1-baf4-fafbd94d89d3_1942x1066.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Platform teams are often a central infrastructure team that builds layers that everyone else can build on top of. One example is a data engineering team that builds <a href="https://www.youtube.com/watch?v=YST1sWFPDh4">ingestion</a> <a href="https://www.youtube.com/watch?v=OeTPOeR-itg">pipelines</a> and manages the warehouse. Their value comes from enabling all other teams to access clean, reliable data.
And if you&#8217;re in a large enough organization, the data engineering team might be building on top of another platform team: the data infrastructure team.</p><p><em>It is important to note that a company only benefits from having a large investment in platform teams when there are enough analysts and end-users actually building tools and data products that the business uses.</em></p><h3>Complicated Subsystem Teams</h3><p>Sometimes companies need specialists. That&#8217;s where these teams come in. Think of a data science research team focused on building recommendation systems or a supply chain optimization model. Their value comes from delivering deep expertise that directly improves the business by either improving operating efficiencies or increasing revenue.</p><h3>Stream-Aligned Teams</h3><p>These are domain-embedded teams that work directly alongside a specific business function, like Marketing, Operations, or Product, to deliver data products that drive day-to-day decisions. For example, a marketing analytics team embedded within the growth organization that builds <a href="https://seattledataguy.substack.com/p/stop-shipping-dashboards-that-dont">dashboards</a>, models, and experiments to directly influence campaign spend and customer acquisition.</p><p>Their value comes from proximity: being close enough to the business to understand context, move quickly, and measure impact. They turn data into decisions without layers of translation between business and data teams.</p><h3>Enabling Teams</h3><p>This would be some form of Data Enablement Team within a larger analytics organization. They don&#8217;t own production pipelines or dashboards; instead, they help other teams become more capable with data. At least, that&#8217;s how the book describes it, with a heavy focus on pure enablement. Now, this is the one I might disagree with a bit in terms of where I see the value. I&#8217;d view this team as one that can connect business problems with data solutions.
They can span the entire stack.</p><h2>Defining Your Data Team&#8217;s Role and Vision</h2><p>These categories don&#8217;t capture every possible type of data team, but they offer a useful starting point. They help <a href="https://the-data-leaders-playbook.circle.so/c/resources/data-leader-we-re-stuck-in-a-reactive-loop-all-ce509e7b-99a9-42f2-88d7-80c0abf4e3c9">data and business leaders</a> clarify how different teams create value and what role each should play within the organization.</p><p>Once you&#8217;ve identified where your team fits, the next step is to define what success looks like for that role, both for the business and within the data organization.</p><p>Ask questions like:</p><ul><li><p>What outcomes is this team accountable for?</p></li><li><p>How does it interact with other teams?</p></li><li><p>What capabilities or resources does it need to deliver impact?</p></li><li><p>How will we measure progress beyond technical metrics?</p></li></ul><p>From there, you can start shaping a clear vision and the pillars your data organization will stand on.
This should focus less on the motions or daily activities and more on the outcomes you want your team to drive.</p><h2>Making This Framework Actionable</h2><p>Understanding your team type is only step one. Once you know it, you can start making decisions about where you can drive value.</p><h2>Set Expectations With Leadership</h2><p>I&#8217;ve seen many data teams get trapped as catch-alls. Automation, AI, data engineering, analytics, enabling, platform, etc.</p><p>In some cases, well, when you&#8217;re a <a href="https://seattledataguy.substack.com/p/the-first-data-hire-series-how-to">team of one</a>, you&#8217;ll have to do it all. But there is only so much any team can get done, and the more you try to increase that scope, the more likely you are to start putting out bad work.</p><p>Or, on the flip side, you&#8217;ll succeed and then be stuck maintaining work that doesn&#8217;t drive much impact but two people say they really need.</p><p>So as you&#8217;re setting expectations, here are some tips:</p><ul><li><p>Don&#8217;t promise &#8220;strategic insights&#8221; if you&#8217;re really resourced as a platform team</p></li><li><p>Frame success in terms of the value your team type is best positioned to provide</p></li></ul><h2>Prioritize the Right Work</h2><p>Setting expectations makes it easier to prioritize the right work, because some work that comes your way will clearly not be best suited for your team.
If your team is made up of DevOps and platform engineers, you won&#8217;t be as effective building front-end <a href="https://www.theseattledataguy.com/5-alternatives-to-building-dashboards-with-looker/">dashboards</a>.</p><p>For example:</p><ul><li><p>Platform teams should focus on reliability and scalability, not ad-hoc requests</p></li><li><p>Stream-aligned teams should bias toward speed and responsiveness</p></li></ul><p>This is of course assuming your company is large enough to break down your data teams into these specialities.</p><h2>Communicate Value in Business Terms</h2><p>As someone who has worked on a platform team, I know it can be difficult to communicate your value. Often your work enables other teams to deliver value directly to the business. But you rarely get the credit unless you clearly tie it to what those teams are doing.</p><p>Here are some examples of how each type of data team might communicate their value:</p><ul><li><p><strong>Enabling team:</strong> &#8220;We helped the sales org standardize KPIs, reducing weekly reporting time by 20 hours.&#8221;</p></li><li><p><strong>Platform team</strong>: &#8220;We improved data freshness from <a href="https://estuary.dev/success-stories/lovespace?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">24 hours to near real-time</a>, allowing the operations team to respond to issues.&#8221;</p></li><li><p><strong>Complicated subsystem team</strong>: &#8220;Our demand forecast reduced stockouts by 15%, saving $5M.&#8221;</p></li><li><p><strong>Stream-aligned team</strong>: &#8220;Our campaign ROI dashboard helped reallocate budget, increasing efficiency by 12%.&#8221;</p></li></ul><h2>Closing Thought</h2><p>In some organizations, a data team is two people who spend half their time on analytics and the other half on IT workflows, while larger and more data-critical organizations have teams broken down into specific types.
</p><p>Regardless, if you&#8217;re not careful, all these teams can start to lose their purpose. In some cases they become catch-alls, and in others they become so divorced from the business that they don&#8217;t even drive value any more.</p><p>That&#8217;s why it&#8217;s important to start with a general why. Why does your team exist, and what is it that you do for the business?</p><p>As always, thanks for reading.</p><h2>Upcoming Data Events</h2><ul><li><p><strong><a href="https://events.bmc.com/bmc-software-rhein-haus-customer-event">Seattle Happy Hour At The Rhein Haus With BMC</a> (I&#8217;ll Be There!)</strong></p></li><li><p><strong><a href="https://airflowsummit.org/">Airflow Summit 2025 (I&#8217;ll be there!)</a></strong></p></li><li><p><strong><a href="https://luma.com/7vfyym4m">Streaming Kafka Data into MotherDuck with Estuary Flow</a></strong></p></li></ul><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>The Knowledge Architect&#8217;s Playbook: The Pedantic Medallion</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Ramona C. Truta&quot;,&quot;id&quot;:114974610,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Obzw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59281b34-e1dd-4155-9a41-0e3d4162690f_2990x3050.jpeg&quot;,&quot;uuid&quot;:&quot;59cc41fa-2472-4f49-b14d-9f26bff34c4f&quot;}" data-component-name="MentionToDOM"></span> </p><p>Let&#8217;s be honest with each other. Is our expertise a bridge, or is it a wall?</p><p>In our industries, we are paid for what we know. But we build value only when that knowledge is shared and understood.
Too often, we build walls. We hide behind jargon, complex diagrams, and esoteric tools, not to create clarity, but to signal our own intelligence. We speak from a level we believe proves our worth, forgetting that the goal of communication is not to broadcast our own brilliance, but to build a shared reality with others.</p><p><a href="https://ramonactruta.substack.com/p/the-knowledge-architects-playbook">Read More Here</a></p><h2>The &#8220;D&#8221; In Data Stands For Discipline</h2><p>A seasoned data leader once told me something that stuck: <strong>the d in data stands for discipline.</strong></p><p>Now to be clear, pretty much every field requires discipline.</p><p>But we often focus on the exciting parts of data work in articles.</p><p>Not the tedious, repetitive, mundane, just doing the right thing type work.</p><p><a href="https://seattledataguy.substack.com/p/the-d-in-data-stands-for-discipline">Read More Here</a></p><div><hr></div><h2>End Of Day 196</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow.</p>]]></content:encoded></item><item><title><![CDATA[How To Turn Around A Failing Data Team]]></title><description><![CDATA[Tales From Consulting]]></description><link>https://seattledataguy.substack.com/p/how-to-turn-around-a-failing-data</link><guid isPermaLink="false">https://seattledataguy.substack.com/p/how-to-turn-around-a-failing-data</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Sat, 27 Sep 2025 16:47:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eYnp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few years ago, I heard the same observation from several data and business leaders.</p><blockquote><p><em>&#8220;The default state of data teams is failure.&#8221;</em></p></blockquote><p>This was in the early 2020s, when many data teams felt like they had ballooned in size. Everyone wanted to be data-driven and cheap money made it easy.</p><p>Fast-forward to 2025 and the landscape looks very different. Companies are running leaner.
Many have intentionally shrunk their data teams and, in some cases, lean more on external partners instead of adding headcount.</p><p>As a consultant, I&#8217;m often brought in when a previous team has disbanded or when leadership wants to turn around a struggling data environment. Across these engagements, I&#8217;ve seen recurring patterns, root causes that explain why some data stacks and teams fail to deliver, and what it really takes to bring them back on track.</p><p>This is based off of my talk at Big Data London.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!942c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!942c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 424w, https://substackcdn.com/image/fetch/$s_!942c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 848w, https://substackcdn.com/image/fetch/$s_!942c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!942c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!942c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg" width="1280" height="1707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1707,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:294847,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174656011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!942c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 424w, https://substackcdn.com/image/fetch/$s_!942c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 848w, https://substackcdn.com/image/fetch/$s_!942c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!942c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c1bfa3-0a50-4aab-82b4-608d1b77389f_1280x1707.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/posts/szabolcs-magyar_big-data-ldn-what-an-event-it-far-exceeded-activity-7377633300363956224-FV2V?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAA3roGYByurxK9YsOLOqN2Mn748HOZhjuSE">Picture Source</a> - Thank you <strong><a
href="https://www.linkedin.com/in/szabolcs-magyar/overlay/">Szabolcs Magyar</a></strong></figcaption></figure></div><h2>Root Causes of Data Team Failure</h2><p>There are many challenges data teams and leaders face. But it&#8217;s not just about picking the wrong technology. <a href="https://seattledataguy.substack.com/p/dont-lead-a-data-team-before-reading?utm_source=publication-search">Many data leaders find themselves leading data teams with minimal coaching after being an IC</a>. Others get pulled into being a catch-all team where they have to manage not only the reporting but also automated processes that might be better suited for a different team, all while figuring out how to lead AI initiatives for their companies.</p><p>All that said, here are some common causes of data teams failing.</p><h3>Lack Of Ownership</h3><p>Lack of ownership can show up in many forms. I liked the way it was described by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alex Ewerl&#246;f&quot;,&quot;id&quot;:87732486,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2713990-da82-481b-b579-01a7aaa5b85b_560x560.jpeg&quot;,&quot;uuid&quot;:&quot;f77ccc42-bb19-4f09-bcca-2f3095b77879&quot;}" data-component-name="MentionToDOM"></span> in his article <a href="https://blog.alexewerlof.com/p/broken-ownership">6 Archetypes of Broken Ownership</a>. In it, he explains that ownership depends on three elements: mandate, knowledge, and responsibility. If any one of these is missing, true ownership breaks down.</p><p>Consider a leader who has the power to make a decision (mandate) but lacks deep understanding (knowledge) and isn&#8217;t accountable for the consequences (responsibility).
They make a call without consulting the team.</p><p>The result?</p><ul><li><p>The team delivers the project but morale sinks</p></li><li><p>People stop raising concerns because they feel their input doesn&#8217;t matter</p></li><li><p>Attrition follows as talented engineers look for healthier environments</p></li></ul><p>Other signs of ownership gaps:</p><ul><li><p>Key assets (tables, dashboards, critical scripts) become orphaned when the original owner leaves</p></li><li><p>Technical leads are responsible for <a href="https://seattledataguy.substack.com/p/why-your-data-pipeline-probably-isnt">data pipelines</a> but can&#8217;t influence upstream <a href="https://www.youtube.com/watch?v=wvUiRHd47M0">data quality</a></p></li><li><p>Analysts are tasked with metric definitions but can&#8217;t enforce consistent usage across departments</p></li></ul><p>The truth is, like many things, lack of ownership by itself doesn&#8217;t usually cause a data team to fail. Instead, it&#8217;s the build-up of multiple issues.</p><h3>Over-engineering And Wrong Sizing</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uJTx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uJTx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 424w, https://substackcdn.com/image/fetch/$s_!uJTx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 848w, 
https://substackcdn.com/image/fetch/$s_!uJTx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 1272w, https://substackcdn.com/image/fetch/$s_!uJTx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uJTx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png" width="1456" height="256" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:256,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35401,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174656011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uJTx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 424w, 
https://substackcdn.com/image/fetch/$s_!uJTx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 848w, https://substackcdn.com/image/fetch/$s_!uJTx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 1272w, https://substackcdn.com/image/fetch/$s_!uJTx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c75e4b0-ce98-4215-81e2-60076f9187b5_1544x272.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I view this as a spectrum. On one side, you over-engineer and end up burdened by infrastructure that is too complex for the problems you are trying to solve. On the other side, your infrastructure doesn&#8217;t scale well, forcing you to fight fires and deal with technical friction.</p><p>In both cases you often lose the ability to cycle quickly through iterations. 
Instead of experimenting, measuring, and improving, the team is stuck fixing infrastructure, whether that&#8217;s debugging an overbuilt DAG dependency maze or scrambling to scale a single-node warehouse that keeps running out of credits.</p><p><strong>Common over-engineering signs</strong></p><ul><li><p><a href="https://seattledataguy.substack.com/p/hype-is-not-a-data-strategy?utm_source=publication-search">Chasing the &#8220;cool tool&#8221;</a> of the moment without a clear business requirement</p></li><li><p>Layering every emerging pattern (micro-services, event buses, lakehouse-plus-warehouse) before proving the need</p></li><li><p>Designing for hypothetical petabyte scale when current data volumes are modest</p></li></ul><p><strong>Common wrong-sizing (and messy prototyping) signs</strong></p><ul><li><p>Under-estimating data growth and user demand, causing constant outages or painful migrations</p></li><li><p>Skipping basic reliability features to move fast, which later slows everything down</p></li><li><p>Leaning on brittle quick fixes (one giant <a href="https://seattledataguy.substack.com/p/understanding-the-t-in-etl-a-back">ETL</a> script, a single overworked cluster) until making even a small change takes weeks</p></li></ul><p>Both extremes carry the same cost: <strong>slower cycle times</strong>. Engineers spend their time nursing the platform instead of delivering new insights or helping the business adapt. 
Making small changes takes forever, even when all you need to do is add a few columns to a table.</p><h3>Metrics Theater</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eYnp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eYnp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eYnp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg" width="500" height="545" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:545,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58323,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174656011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eYnp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eYnp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc569a3-100a-4202-b597-eed706ab627c_500x545.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This one is particularly frustrating because it often appears that the <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">data team</a> has done everything right. You&#8217;ve built reliable dashboards and a rock-solid analytics platform, yet no one is making real decisions based on the data.</p><p>Leaders may even review the numbers weekly or monthly. When metrics are positive, they celebrate. When they&#8217;re negative, they shrug: <em>nothing we could have done.</em></p><p>It&#8217;s all an act. 
Metrics theater allows leadership to look data-driven without actually being data-driven.</p><p><strong>Typical signs:</strong></p><ul><li><p>Everyone says &#8220;we&#8217;re data-driven,&#8221; but key calls still come from gut instinct</p></li><li><p>Metrics are highlighted when favorable and quietly ignored when unfavorable</p></li><li><p>Reports are created and placed into slide decks, but the business is still facing the same issues</p></li></ul><p>The result is expensive infrastructure and beautiful dashboards that fail to change how the business operates. If that&#8217;s the case, then the business is really just paying for an expensive paperweight. </p><h3>Outputs Without Outcomes</h3><p>Plenty of companies out there invest millions into their data stack. They build dozens if not hundreds of tables and pipelines, and focus on developing more and more.</p><p>Success is measured by what the team ships, not by what the business achieves. This is often implicit: it might not be said out loud, but what gets rewarded is output.</p><p><strong>Typical signs:</strong></p><ul><li><p>Promotions are driven by the number of dashboards, tables, or lines of code shipped</p></li><li><p>Launching data products that are rarely, if ever, used</p></li><li><p>Spending millions on migrations or new tooling with no measurable lift in revenue, efficiency, or customer experience</p></li></ul><p>I&#8217;ve seen teams build 100+ dashboards and complete multiple costly migrations, only to discover that decision-makers barely glance at the outputs.</p><p>Both metrics theater and output-focused thinking miss the point. It&#8217;s a lot of motion without any real purpose. 
</p><h2>Turning the Ship Around - Your Data Team Turnaround Playbook</h2><p>But that&#8217;s enough talking about problems.</p><p>Let&#8217;s go through how you can start to turn your data team around.</p><h3>Diagnose &amp; Stabilize</h3><p>Before diving deep, you&#8217;re going to want to pause and assess what is going on. You need to understand which crucial workflows you can&#8217;t break because they will impact the business. You&#8217;ll also want to look for opportunities to simplify, but you can only do that by understanding the bigger picture. </p><ul><li><p><strong>Map the current stack and data flows - </strong>Document pipelines, source systems, and dependencies. There will always be some hidden cron job or <a href="https://docs.snowflake.com/en/user-guide/tasks-intro">Snowflake task</a> that you didn&#8217;t expect and that somehow manages the way your business makes money.</p></li><li><p><strong>Identify &#8220;must-keep-lights-on&#8221; processes - </strong>Isolate the handful of jobs that keep revenue reporting, billing, or critical dashboards alive. 
In many cases data teams are responsible for business-critical workflows, and you can&#8217;t mess those up.</p></li><li><p><strong>Deliver quick wins</strong></p><ul><li><p>Fix broken daily jobs and remove dead code</p></li><li><p>Standardize naming for tables so everyone can find and use data more easily (specifically target the tables that are heavily used)</p></li><li><p>Set up simple alerts and monitoring to prevent recurring outages</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gBBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gBBh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gBBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg" width="400" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16430,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/174656011?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gBBh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gBBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3471b5b0-7fcd-46f1-9b99-99ca737322c5_400x400.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>These early actions buy credibility with leadership and create breathing room for deeper fixes.</p><h3>Reset Vision &amp; Priorities</h3><p>Once you&#8217;ve got a good handle on the current data infrastructure, it&#8217;s time to reset. You need to be clear about what your team&#8217;s priorities are and what the vision for your team is. 
</p><p>What are the 2-3 pillars you believe your data team should focus on?</p><p>Do you view it as purely a reporting team, a team that will help integrate data into the product, or something else? 
</p><p>Be clear about it, or you&#8217;ll always end up a catch-all IT/Data/Finance team hybrid.</p><ul><li><p><strong>Reconnect data work to business outcomes - </strong>Identify the top three to five metrics or use cases the company truly cares about, such as reducing churn, improving conversion, or speeding new product launches.</p></li><li><p><strong>Publicly prioritize top initiatives with clear owners - </strong>Create a short, visible roadmap so the entire company sees what matters and what will <em>not</em> be done.</p></li><li><p><strong>Get leadership buy-in on stopping low-value work - </strong>Make it explicit: ad-hoc requests and &#8220;nice to have&#8221; dashboards are paused until the core plan is met. Ok, let&#8217;s be real about the ad-hoc work&#8230;you&#8217;ll always have to deliver some of it or people will think you&#8217;re not willing to help them. But be willing to push back more or you will never get out of the quagmire. </p></li></ul><h3>Rebuild Foundations</h3><p>Now you need to rebuild the foundation. And I don&#8217;t just mean your technical foundation, I mean everything from the technical side to how you manage ownership, key processes, etc.</p><ul><li><p><strong>Establish clear ownership and accountability - </strong>Being clear on who owns what and how ownership will work moving forward allows your team to make better decisions because they&#8217;ll know who is responsible for managing future artifacts.</p></li><li><p><strong>Simplify architecture and right-size tools - </strong>In one case, I simplified an architecture in which 50+ queries supported a set of dashboards. 
I looked through them and found they could be consolidated to 3-4, which reduced duplicative code, improved performance, and hit the data warehouse less often.</p></li><li><p><strong>Document, automate testing, and enforce standards - </strong>This is also a great time to set clear standards on everything from how you test to how you name tables and columns.</p></li></ul><h3>Create an Outcome-Obsessed Culture</h3><p>Finally, I&#8217;ve really enjoyed the term <strong><a href="https://the-data-leaders-playbook.circle.so/c/resources/the-nine-principles-of-an-outcome-obsessed-data-team-98c707b9-3dd0-473d-9003-409c8cceaf8c">outcome-obsessed</a></strong> recently. Probably because of the alliteration. But the point remains, you&#8217;ve got to actually care about what the other side of your project does.</p><p>Where the data work stops, the business value is just beginning, and it doesn&#8217;t just happen because a dashboard exists or a business executive has access to a chatbot.</p><ul><li><p><strong>Define success in business terms, not just technical metrics - </strong>Yes, data quality has to be reliable and the dashboards need to load in a reasonable time frame, but the business still has to have a plan for the data.</p></li><li><p><strong>Bake outcomes into processes - </strong>If you&#8217;re only really starting to think about outcomes after delivering your data products, there is a good chance you&#8217;ve already gone down the wrong path.</p></li><li><p><strong>Keep communication constant and keep the project going - </strong>Just because the data product is delivered doesn&#8217;t mean you don&#8217;t have to communicate with anyone in the business anymore. You need to check in, see if they are using the dashboard, see if they have any issues, and so on.</p></li></ul><h2>Final Thoughts</h2><p>Data teams can provide a lot of value for the business but they can also very easily be an expensive paperweight. 
</p><p>They can find themselves building data pipelines that support dashboards that no one uses and arguing about whether or not they should migrate to Databricks or <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ">Snowflake</a> next. </p><p>All while the business is talking about a completely different set of problems and issues that you could be helping with! </p><p>As always, thanks for reading!</p><h2>Upcoming Data Events</h2><ul><li><p><strong><a href="https://luma.com/7vfyym4m">Streaming Kafka Data into MotherDuck with Estuary Flow</a></strong></p></li><li><p><strong><a href="https://events.bmc.com/bmc-software-rhein-haus-customer-event">Seattle Happy Hour At The Rhein Haus With BMC</a></strong></p></li><li><p><strong><a href="https://airflowsummit.org/">Airflow Summit 2025 (I&#8217;ll be there!)</a></strong></p></li></ul><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles as well as TechCrunch and company tech blogs and wanted to share some of my favorites!</p><div><hr></div><h2>Operating Principles That Guided Me to Staff Engineer (Part 2: Expanding Influence)</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Jordan Cutler&quot;,&quot;id&quot;:58854493,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/670bb162-5a63-4fd2-8253-f98c28d446a7_1168x1168.jpeg&quot;,&quot;uuid&quot;:&quot;3b7e5a18-f739-4eac-83e9-527d639610a4&quot;}" data-component-name="MentionToDOM"></span> </p><p>In <a href="https://read.highgrowthengineer.com/p/operating-principles-to-staff-part-1">Part 1</a>, Jordan shared the principles that helped him consistently deliver results and reach Staff Engineer. But results don&#8217;t happen in a vacuum. They depend on the influence you have to make change happen. 
Influence drives results, and results compound to more influence.</p><p>In this part, Jordan shares the guiding principles that helped him build influence to make a multiplicative impact across teams. These principles help you build relationships, strengthen your communication, and be seen as a leader.</p><p><a href="https://read.highgrowthengineer.com/p/operating-principles-to-staff-part-2">Read More Here</a></p><h2>Scaling Muse: How Netflix Powers Data-Driven Creative Insights at Trillion-Row Scale</h2><p>At Netflix, we prioritize getting timely data and insights into the hands of the people who can act on them. One of our key internal applications for this purpose is Muse. Muse&#8217;s ultimate goal is to help Netflix members discover content they&#8217;ll love by ensuring our promotional media is as effective and authentic as possible. It achieves this by equipping creative strategists and launch managers with data-driven insights showing which artwork or video clips resonate best with global or regional audiences and flagging outliers such as potentially misleading (clickbait-y) assets. These kinds of applications fall under Online Analytical Processing (OLAP), a category of systems designed for complex querying and data exploration. However, enabling Muse to support new, more advanced filtering and grouping capabilities while maintaining high performance and data accuracy has been a challenge. Previous posts have touched on <a href="https://netflixtechblog.com/artwork-personalization-c589f074ad76">artwork personalization</a> and our <a href="https://netflixtechblog.com/introducing-impressions-at-netflix-e2b67c88c9fb">impressions architecture</a>. 
<strong>In this post, we&#8217;ll discuss some steps we&#8217;ve taken to evolve the Muse data serving layer to enable new capabilities while maintaining high performance and data accuracy.</strong></p><p><a href="https://netflixtechblog.com/scaling-muse-how-netflix-powers-data-driven-creative-insights-at-trillion-row-scale-aa9ad326fd77">Read More Here</a></p><div><hr></div><h2>End Of Day 195</h2><p>Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://seattledataguy.substack.com/p/how-to-keep-your-data-team-from-becoming?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNjQyNDUwMTUsImlhdCI6MTc0ODMwNzc5OSwiZXhwIjoxNzUwODk5Nzk5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.60gV1DPqNbpAd_IWymkgPTFlphx9Ebg7AmOU2CxmtXM&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://seattledataguy.substack.com/p/how-to-keep-your-data-team-from-becoming?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjo0OTYzNjIyLCJwb3N0X2lkIjoxNjQyNDUwMTUsImlhdCI6MTc0ODMwNzc5OSwiZXhwIjoxNzUwODk5Nzk5LCJpc3MiOiJwdWItMjExMDUiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.60gV1DPqNbpAd_IWymkgPTFlphx9Ebg7AmOU2CxmtXM"><span>Share</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[The "D" In Data Stands For Discipline]]></title><description><![CDATA[Hi, fellow future and current Data Leaders; Ben here &#128075;]]></description><link>https://seattledataguy.substack.com/p/the-d-in-data-stands-for-discipline</link><guid 
isPermaLink="false">https://seattledataguy.substack.com/p/the-d-in-data-stands-for-discipline</guid><dc:creator><![CDATA[SeattleDataGuy]]></dc:creator><pubDate>Wed, 17 Sep 2025 21:55:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HCFt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi, fellow future and current Data Leaders; Ben here &#128075;</p><p>Before diving in, I wanted to let y&#8217;all know that I&#8217;ll be running several in-person events in the UK, Seattle and Denver. The first will be in the UK, so if you&#8217;d like to join me after Big Data London, you can sign up <a href="https://luma.com/ciz1hgrp">here</a>! </p><p>Also, special thanks to <a href="https://estuary.dev/?utm_source=SeattleDataGuy&amp;utm_medium=social&amp;utm_campaign=SeattleDataGuy">Estuary</a> for partnering on this event!</p><p>Food and drinks will be provided as well as a live band!</p><p>Now let&#8217;s jump into the article!</p><div><hr></div><p>A seasoned data leader once told me something that stuck: <strong>the d in data stands for discipline.</strong></p><p>Now to be clear, pretty much every field requires discipline.</p><p>But we often focus on the exciting parts of data work in articles. </p><p>Not the tedious, repetitive, mundane, just-doing-the-right-thing type of work.</p><p>It is those unglamorous but critical habits that help you deliver good work:</p><ul><li><p>Resisting the urge to over-engineer a simple problem</p></li><li><p>Enforcing standards even when it feels tedious</p></li><li><p>Saying &#8220;no&#8221; to the ad-hoc requests that would derail your team from finishing meaningful work</p></li><li><p>Committing to an idea even if it doesn&#8217;t provide immediate results</p></li></ul><p>None of these are flashy. 
They don&#8217;t exactly make for a good case study or talk at a conference. But looking back at my article from a couple of weeks ago, they are the habits that I&#8217;ve seen separate good data teams from great ones.</p><p>There are plenty of benefits from doing all the small things well.</p><p>Below are three ways discipline shows up in high-performing data teams.</p><h2>Practicing Restraint In Data Infrastructure Choices</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HCFt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HCFt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 424w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 848w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 1272w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HCFt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png" width="1456" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79406,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/173809780?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HCFt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 424w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 848w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 1272w, https://substackcdn.com/image/fetch/$s_!HCFt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681a09bf-3bd5-4366-88c8-6c6b52c0f481_1898x770.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here I often think back to my early cooking days. As a young cook, I&#8217;d pile every technique I knew onto a single plate. I wasn&#8217;t thinking about the diner&#8217;s experience. 
I really was just showing off.</p><p>Over time I learned that every component must earn its place.</p><p>In the same way, data infrastructure is a tempting place to use tool after tool and build a massively complex system just to report churn.</p><p>It&#8217;s tempting to keep adding tools (Iceberg, <a href="https://www.youtube.com/watch?v=QNdiGZFaUFs&amp;t=2s">Databricks</a>, <a href="https://www.youtube.com/watch?v=GuM6dQGRFyQ&amp;t=4s">Snowflake</a>, <a href="https://www.theseattledataguy.com/what-is-apache-airflow-data-engineering-consulting/#page-content">Airflow</a>, Sigma, Unity Catalog) until the stack looks impressive on paper. Yet I&#8217;ve seen <a href="https://seattledataguy.substack.com/p/centralized-vs-decentralized-vs-federated">teams</a> spend millions building these sprawling setups only to hear the same complaint from business leaders:</p><blockquote><h4>&#8220;I can&#8217;t find the numbers I need&#8221;</h4></blockquote><p>So technically, you might have built an amazing data infrastructure stack, but functionally, no one in the business uses it. 
</p><blockquote><h4>So why did you build it?</h4></blockquote><h2>Saying No to the Endless Ad-Hoc Ask - And Actually Finishing Work</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g7ai!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g7ai!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g7ai!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg" width="500" height="750" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:750,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54569,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://seattledataguy.substack.com/i/173809780?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g7ai!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g7ai!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83aca3a-1618-45de-af71-c59fd3dd4b95_500x750.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Data teams can quickly become the company&#8217;s catch-all IT fill-in department.</p><p>One day it&#8217;s a finance team automation, the next it&#8217;s a one-off dashboard or an urgent &#8220;just pull this number&#8221; request. 
Each ask might seem harmless, but together they create a flood that pulls the team away from meaningful, long-term work.</p><blockquote><h4>Data teams need to have the discipline to say &#8220;No&#8221;.</h4></blockquote><p>Now, I say &#8220;No,&#8221; but what I really mean is that you need to be able to communicate what your team can and can&#8217;t get done, and what the trade-offs are.</p><p>It&#8217;s also about ensuring your team isn&#8217;t constantly getting ripped from one project to another.</p><p>After all:</p><blockquote><h4>&#8220;you only get value from projects when they finish: to make progress, above all else, you must ensure that some of your projects finish.&#8221;</h4><h4>&#8213; <a href="https://lethain.com/">Will Larson</a>, An Elegant Puzzle: Systems of Engineering Management</h4></blockquote><p>If you let your data team get steamrolled into saying yes to every ad-hoc request, other leaders might at first enjoy how much attention they are getting. Then they will keep asking for more and more. Your data team will burn out, projects won&#8217;t get fully delivered, and things will continue to unravel.</p>
<h2>Following Standards Relentlessly</h2><p>Standards cover everything from how tables are named to how version control is handled, how <a href="https://www.youtube.com/watch?v=uvACOp4WFR4&amp;t=2s">SQL</a> is formatted, and how <a href="https://www.theseattledataguy.com/how-to-data-model-real-life-examples-of-how-companies-model-their-data/">data models are structured</a>. </p><p>I don&#8217;t view these as just cosmetic choices. Consistency ensures that a query written today will still be understandable (and trustworthy) six months from now.</p><p>In the past, this meant I&#8217;d have to go over my SQL scripts by hand and make sure everything was indented as expected. </p><p>Luckily, you can now automate this part of &#8220;discipline&#8221; by making those standards part of the process. Use linting tools, CI/CD checks, and code reviews so that every pull request is validated before it ever hits production. When we did this at Facebook, nits went down 98% (or at least that&#8217;s what it felt like).</p><p>Documentation is part of the standard. 
A living style guide and <a href="https://seattledataguy.substack.com/p/onboarding-for-data-teams">onboarding</a> guide help new team members ramp up quickly, reducing the &#8220;legacy knowledge&#8221; that can paralyze a team when someone leaves.</p><p>This also makes it easier for external team members to quickly understand what your naming conventions and workflows look like.</p><p>It also makes future migrations, searching through code for specific patterns, and a whole host of other larger projects considerably easier. </p><h2>But Wait There Is More!</h2><p>Now, there are plenty of other ways data teams can remain disciplined when it comes to delivering the right end products to the business. If you&#8217;d like to read some of my past articles on these topics, you can find them below.</p><ul><li><p><a href="https://seattledataguy.substack.com/p/data-is-the-how-business-is-the-why">Keeping focused on the business</a></p></li><li><p><a href="https://seattledataguy.substack.com/p/hype-is-not-a-data-strategy">Not letting hype derail what actually matters</a></p></li><li><p><a href="https://seattledataguy.substack.com/p/why-your-data-pipeline-probably-isnt">Building production-ready data pipelines</a></p></li></ul><h2>Final Thoughts</h2><p>I&#8217;ve seen so many projects and data stacks fall apart because of small compromises that built up over time. 
</p><ul><li><p>There was no clear owner for a task or larger process.</p></li><li><p>The data engineering team used a tool to patch a problem rather than asking why they were having the issue.</p></li><li><p>No one set standards, which allowed for fast development at first, until the first migration.</p></li></ul><p>Each small compromise seems harmless in the moment, but together they create technical debt, brittle pipelines, and future pain.</p><p>There is always a balance, of course. You don&#8217;t want to put so much process in the way that nothing moves.</p><p>But you also want to avoid building a big ball of mud that no one wants to touch.</p><p>As always, thanks for reading.</p><h2>Upcoming Data Events</h2><ul><li><p><strong><a href="https://airflowsummit.org/">Airflow Summit 2025 (I&#8217;ll be there!)</a></strong></p></li><li><p><strong><a href="https://luma.com/ciz1hgrp">London Is Calling - Data And Drinks With Estuary</a></strong></p></li><li><p><strong><a href="https://luma.com/twa4nw32">Coffee With The Seattle Data Guy - London</a></strong></p></li><li><p><strong><a href="https://www.bigdataldn.com/en-gb/conference.html?speakers=Ben%20Rogojan#/sessions">Big Data London (I&#8217;ll be there!)</a></strong></p></li></ul><div><hr></div><h2>Articles Worth Reading</h2><p>There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!</p><div><hr></div><h2>Is TV's Golden Age (Officially) Over? 
A Statistical Analysis</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Daniel Parris&quot;,&quot;id&quot;:112812180,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!AmpE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25a9035a-fbd9-4f33-aa36-2548ca85140b_2048x1536.jpeg&quot;,&quot;uuid&quot;:&quot;8907e160-dcef-4f7f-b8d3-1a2efb8bb14f&quot;}" data-component-name="MentionToDOM"></span> </p><p>A lot can change in forty-five months. Think back to November of 2021: the world had yet to see a Tesla Cybertruck, HBO Max was an ascendant streaming service, Will Smith had slapped zero people on live television, Sam Bankman-Fried was a benevolent billionaire and model citizen, and Netflix's stock was soaring, buoyed by the pandemic.</p><p>And then the unthinkable happened: Netflix reported a quarterly loss of 200,000 subscribers. This earnings miss&#8212;coupled with broader economic uncertainty&#8212;triggered widespread panic across the entertainment industry. Streaming platforms slashed their content budgets, media conglomerates like Disney and Warner Bros. laid off thousands, and Netflix's stock fell by 51%. This industry-wide contraction culminated in a six-month writers&#8217; strike, as unions demanded higher pay, standardized compensation, and greater residuals. 
Amid full work stoppages and a volatile economy, industry pundits began speculating whether streaming could recapture its pre-2022 momentum.</p><p><a href="https://www.statsignificant.com/p/is-tvs-golden-age-officially-over">Read More Here</a></p><h2>The Pedantic Layer</h2><p>By <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Joe Reis&quot;,&quot;id&quot;:3531217,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;uuid&quot;:&quot;e40fcd0c-ea52-4977-afb2-11170e1fdb20&quot;}" data-component-name="MentionToDOM"></span> </p><p>In the data world, we love to argue about definitions. What is unstructured data? Is JSON structured or semi-structured? Are PDFs unstructured, or do they contain &#8220;implicit structure&#8221;? Do LLM embeddings of text count as structured data? And <a href="https://practicaldatamodeling.substack.com/p/the-semantic-layer-smackdown">WTF is a semantic layer</a>? Entire threads, articles, and even conference talks spin out of these debates.</p><p>This obsession forms what I call the Pedantic Layer&#8230;</p><p><a href="https://joereis.substack.com/p/the-pedantic-layer">Read More Here</a></p><div><hr></div><h2>End Of Day 194</h2><p>Thanks for checking out our community. 
We put out 4-5 Newsletters a month discussing data, tech, and start-ups.</p><p>If you enjoyed it, consider liking, sharing and helping this newsletter grow!</p>]]></content:encoded></item></channel></rss>