Journal of Distributed Software Engineering, Architecture and Design
Signal Over Noise: Best Practices in Event-Driven Architecture Design
<p class="wp-block-paragraph">As <strong>event-driven</strong> platforms take off ✈️ with modern integration practices such as <strong>data-in-motion</strong>, organisations are <strong><em>integrating</em> a <em>significantly larger number</em> of data and event streams</strong> 📈. Without <strong>deliberate, up-front design</strong>, these growing streams contribute to <strong>noise, complexity and coupling, leading to</strong> higher implementation (CAPEX) and operating (OPEX) costs. In this post we look at common patterns for converting stream noise into signal, and at cost-optimal choices that lead to a strategic outcome.</p>
<h2 class="wp-block-heading">What’s the issue?</h2>
<p class="wp-block-paragraph">Data and event streams originate from core systems, and new connectors are making it simpler to dump core system entity information into streaming platforms like Apache Kafka. The goal is to <em>accelerate</em> legacy application and service modernisation through simple, quick connectors. This covers not just traditional messaging with a small event footprint, but also more data-oriented pipes that provide a stream of changes (<strong>note</strong>: if your organisation is not actively exploring event streaming and is relying solely on messaging for integration, then let us chat!)</p>
<p class="wp-block-paragraph">Now, while these system connectors are easy to set up and implement, without deliberate upfront design (i.e. accidental architecture) they can end up <strong>producing streams of system entity change events which are <em>noise</em> without appropriate context.</strong> Consumers must parse these events, and to do so they must first understand the internal structure of the source systems, then process the events with duplicated logic and additional context.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/noisy-streams-alok-mishra.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/noisy-streams-alok-mishra.png?w=770" alt="" class="wp-image-2785" /></a><figcaption class="wp-element-caption">Noisy streams push the onus of translation onto consumers, leading to extra implementation and maintenance effort</figcaption></figure>
</div>
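<p class="wp-block-paragraph">To make this concrete, here is a minimal sketch of the translation burden a raw entity change event pushes onto each consumer. The envelope loosely follows a Debezium-style change-event shape, but the table, field names and status codes are all hypothetical:</p>

```python
# A simplified CDC record, loosely Debezium-shaped; the table, columns
# and status codes below are hypothetical.
raw_cdc_event = {
    "op": "u",  # cryptic opcode: c=create, u=update, d=delete
    "source": {"db": "crm", "table": "CUST_MASTER"},
    "before": {"CUST_ID": 42, "STAT_CD": "A"},
    "after": {"CUST_ID": 42, "STAT_CD": "S"},
    "ts_ms": 1684108800000,
}

# Every consumer must re-implement translation logic like this, coupling
# itself to the source system's internal codes and table layout.
def interpret(event: dict) -> str:
    if event["op"] == "u" and event["after"]["STAT_CD"] == "S":
        return "CustomerSuspended"  # business meaning reconstructed downstream
    return "UnknownChange"

print(interpret(raw_cdc_event))  # CustomerSuspended
```

<p class="wp-block-paragraph">Multiply this by every consumer of the stream and the duplicated effort, and the coupling to <code>STAT_CD</code>-style internals, becomes clear.</p>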
<p class="wp-block-paragraph"><strong>Costly data swamps</strong>: this not only leaks internal system implementation, creating coupling between systems and consumers, but also adds significant processing cost to convert noise to signal, <em>in the process diluting the business value</em> delivered. Over time, these streams lead to high operational costs and data swamps.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/streams-of-junk.jpeg"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/streams-of-junk.jpeg?w=512" alt="" class="wp-image-2760" width="396" height="396" /></a><figcaption class="wp-element-caption">Are your streaming services building a data swamp?</figcaption></figure>
</div>
<h2 class="wp-block-heading">State of Data and Event Streaming</h2>
<p class="wp-block-paragraph">InfoQ’s annual architecture trends report places event-driven architecture in the late majority in both 2022 and 2023, implying slow adoption. My observation has been that the adoption rate has increased, and will accelerate as organisations look to build more data-driven insights and modernise legacy applications.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/messaging-pattern-alok-mishra.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/messaging-pattern-alok-mishra.png?w=902" alt="" class="wp-image-2782" /></a><figcaption class="wp-element-caption">Messaging </figcaption></figure>
</div>
<p class="wp-block-paragraph">Consider streaming (Kafka) vs messaging (queues, topics) when you need larger volumes, ordering, replayability of messages and exactly-once semantics (EOS) to boost your integration offering. So, are you streaming?</p>
<figure class="wp-block-jetpack-image-compare"><div class="juxtapose" data-mode="horizontal"><img id="2757" src="https://alok-mishra.com/wp-content/uploads/2023/05/infoq-2022.png" alt="" width="2252" height="1106" class="image-compare__image-before" /><img id="2758" src="https://alok-mishra.com/wp-content/uploads/2023/05/infoq-2023.png" alt="" width="2258" height="1088" class="image-compare__image-after" /></div><figcaption>InfoQ architecture trend – <a href="https://www.infoq.com/articles/architecture-trends-2022/">2022</a> vs <a href="https://www.infoq.com/articles/architecture-trends-2023/">2023</a> “Event-driven” architecture in<strong> late majority </strong>for both years</figcaption></figure>
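<p class="wp-block-paragraph">As a sketch of what EOS entails on the producer side, the settings below use Kafka's standard producer property names; the broker address, transactional id and client library choice are assumptions:</p>

```python
# Kafka producer settings commonly used for exactly-once semantics (EOS).
# Keys are standard Kafka producer property names; values are assumptions.
eos_producer_config = {
    "bootstrap.servers": "broker:9092",         # hypothetical broker address
    "enable.idempotence": True,                 # de-duplicate retried sends
    "acks": "all",                              # wait for all in-sync replicas
    "transactional.id": "orders-domain-svc-1",  # hypothetical id; enables transactions
}
```

<p class="wp-block-paragraph">With a client such as confluent-kafka, a dict like this is passed to the <code>Producer</code>, and sends are then wrapped in <code>init_transactions()</code> / <code>begin_transaction()</code> / <code>commit_transaction()</code> calls.</p>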
<h2 class="wp-block-heading">Are your streams noisy?</h2>
<p class="wp-block-paragraph">Change Data Capture (CDC) streams are where organisations start their streaming journey: they are often a quick way to plug straight into the data store of a core system and start publishing events. However, as this practice accelerates, the entity changes published from core systems are meaningless noise without business context. Business events carrying more coarse-grained information are more consumable and provide business-domain-based information, versus pure system-data-oriented signals.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/business-events.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/business-events.png?w=778" alt="" class="wp-image-2763" width="534" height="490" /></a><figcaption class="wp-element-caption">Business events vs System Entity change events – if you are publishing pure CDC to your consumers then you may be pushing out noise and encouraging model based coupling</figcaption></figure>
</div>
<h3 class="wp-block-heading">3 key patterns for noise into signal</h3>
<p class="wp-block-paragraph">There are 3 key patterns, based on where you convert noise into signal: 1. at the source, 2. in the middle, or 3. at the end consumer.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/alok-mishra-3-patterns-for-cleaning.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/alok-mishra-3-patterns-for-cleaning.png?w=1024" alt="" class="wp-image-2776" width="541" height="467" /></a><figcaption class="wp-element-caption">When messages travel from provider to consumers, there are 3 ways to transform noise to signal</figcaption></figure>
</div>
<p class="wp-block-paragraph">Patterns #1 and #3 work well in a 1-1 ecosystem with a single provider and consumer. However, as this scales, the cost of providers and consumers doing the transformation into business-oriented messages increases, which leads to the broker approach of pattern #2. In implementing pattern #2, use <strong><em>something in the middle to transform raw system events into business events, allowing systems to plug in as providers or consumers.</em></strong></p>
<h2 class="wp-block-heading">Noise to Signal Processing: Business Domain APIs and Events</h2>
<p class="wp-block-paragraph">Converting noise to signal using pattern #2 can lead to a cleaner architecture, along with reusable business event streams that decouple stream providers and consumers. This has the added benefit of centralising operational costs in a single component. The broker in pattern #2 is a converter from “system” to “common vocabulary” (published language in DDD), aligned to business domains. It is a domain service which encapsulates the business domain capabilities with services, outbound events and data streams that adhere to a common model aligned to the business language of that domain.</p>
<p class="wp-block-paragraph">The domain service consumes system entity change event streams from core systems and, after transforming each message to a standard format, publishes it to domain-aligned streams. These services and the domain streams are maintained and operated by a domain-aligned team (in decentralised or federated models) or an integration practice (centralised model).</p>
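<p class="wp-block-paragraph">A minimal, in-memory sketch of such a domain service, with plain Python lists standing in for Kafka topics; the table, field names and the status-code-to-event mapping are all assumptions:</p>

```python
from typing import Iterable, Optional

# Map internal status codes to the domain's published language
# (hypothetical codes and event names).
STATUS_TO_EVENT = {"A": "CustomerActivated", "S": "CustomerSuspended"}

def transform(entity_change: dict) -> Optional[dict]:
    """Convert a raw system entity change into a business event, or drop it."""
    status = entity_change.get("after", {}).get("STAT_CD")
    event_type = STATUS_TO_EVENT.get(status)
    if event_type is None:
        return None  # noise the domain does not care about
    return {
        "eventType": event_type,
        "customerId": entity_change["after"]["CUST_ID"],
        "source": entity_change["source"]["table"],
    }

def domain_service(cdc_stream: Iterable[dict]) -> list:
    """Consume the raw stream; publish only meaningful domain events."""
    domain_stream = []  # stand-in for a domain-aligned Kafka topic
    for change in cdc_stream:
        event = transform(change)
        if event is not None:
            domain_stream.append(event)
    return domain_stream

raw_stream = [
    {"source": {"table": "CUST_MASTER"}, "after": {"CUST_ID": 1, "STAT_CD": "S"}},
    {"source": {"table": "CUST_MASTER"}, "after": {"CUST_ID": 2, "STAT_CD": "X"}},
]
print(domain_service(raw_stream))
# [{'eventType': 'CustomerSuspended', 'customerId': 1, 'source': 'CUST_MASTER'}]
```

<p class="wp-block-paragraph">Note how the second record, which carries no business meaning for this domain, is filtered out centrally instead of by every consumer.</p>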
<figure class="wp-block-image size-large"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/streaming-pattern-alok-mishra-1.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/streaming-pattern-alok-mishra-1.png?w=898" alt="" class="wp-image-2784" /></a><figcaption class="wp-element-caption">Noise-to-Signal Processing: business domain services produce Domain Events</figcaption></figure>
<h2 class="wp-block-heading">How do you design Business Events?</h2>
<p class="wp-block-paragraph">With the business! </p>
<p class="wp-block-paragraph">Getting to a domain service requires upfront domain analysis with business SMEs to understand what the business events are. This can be done through techniques such as event storming and domain storytelling, which are part of strategic domain-driven design (DDD).</p>
<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://alok-mishra.com/wp-content/uploads/2023/05/event-storming-example.png"><img src="https://alok-mishra.com/wp-content/uploads/2023/05/event-storming-example.png?w=1024" alt="" class="wp-image-2771" /></a></figure>
</div>
<h2 class="wp-block-heading">Summary</h2>
<p class="wp-block-paragraph">We looked at messaging vs event streaming, and how late-majority adoption of event streaming is giving way to faster data integration with core systems, with streams of entity data changes being published. This method is creating more data noise for consumers, leading to greater IT spend on building and maintaining the processing logic that converts noise to signal, and raising duplication and coupling concerns architecturally.</p>
<p class="wp-block-paragraph">If you are still into plain old messaging, then as integration practice owners, architects and engineers, consider event streaming, as the data-in-motion practice provides broader capabilities, especially ones needed today for data insights and AI models. And when implementing streaming, consider layering domain-oriented event streams on top of change data capture streams, to publish signal instead of noise to your consumers.</p>
Alok brings over 20 years of experience engineering and architecting distributed software systems across industry and consulting. His posts focus on systems integration, API design, microservices and event-driven systems, modern enterprise architecture and other related topics