As event-driven platforms take off ✈️ with modern integration practices such as data-in-motion, organisations are integrating a significantly larger number of data and event streams 📈. Without deliberate and up-front design, these growing streams contribute to noise, complexity and coupling, leading to higher implementation (CAPEX) and operating (OPEX) costs. In this post we look at common patterns for converting stream noise to signal, and at cost-optimal choices that lead to a strategic outcome.
What’s the issue?
Data and event streams originate from core systems, and new connectors are making it simpler to dump core-system entity information into streaming platforms like Apache Kafka. The goal is to accelerate legacy application and service modernisation through simple and quick connectors. This covers not just traditional messaging with a small event footprint but also more data-oriented pipes that provide a stream of changes. (Note: if your organisation is not actively exploring event streaming and relies solely on messaging for integration, then let us chat!)
Now, while these system connectors are easy to set up and implement, without deliberate upfront design (i.e. accidental architecture) they can end up producing streams of system entity change events that are noise without appropriate context. Consumers must parse these events, and in doing so they must first understand the internal structure of the source systems, then process the events with duplicated logic and additional context.
Costly data swamps: this not only leaks internal system implementation, coupling systems to consumers, but also adds significant processing cost to convert noise to signal, diluting the business value delivered. Over time, these streams lead to high operational costs and data swamps.
State of data and event streaming
InfoQ’s annual architecture trends report placed event-driven architecture in the late majority in 2022 and 2023, implying slow adoption. My observation has been that the adoption rate has increased and will accelerate as organisations look to build more data-driven insights and modernise legacy applications.
Consider streaming (Kafka) over messaging (queues, topics) for larger volumes, ordering, replayability of messages and exactly-once semantics (EOS) to boost your integration offering. So, are you streaming?
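To make the replayability distinction concrete, here is a minimal Python sketch (the class names are illustrative, not a real Kafka or messaging API): a queue gives up a message once it is consumed, while a log retains records so any consumer can re-read from an offset of its choosing.

```python
from collections import deque


class Queue:
    """Messaging: a message is gone once consumed (destructive read)."""
    def __init__(self):
        self._items = deque()

    def publish(self, msg):
        self._items.append(msg)

    def consume(self):
        return self._items.popleft()  # removed from the queue


class Log:
    """Streaming: an append-only log; consumers track their own offsets."""
    def __init__(self):
        self._records = []

    def publish(self, msg):
        self._records.append(msg)

    def read_from(self, offset):
        return self._records[offset:]  # non-destructive, replayable


log = Log()
for event in ["created", "updated", "shipped"]:
    log.publish(event)

assert log.read_from(0) == ["created", "updated", "shipped"]  # full replay
assert log.read_from(2) == ["shipped"]                        # resume mid-stream
```

The same retained log also gives you ordering per partition and the ability to bootstrap brand-new consumers from history, which a consume-and-delete queue cannot offer.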
Are your streams noisy?
Change Data Capture (CDC) streams are where organisations typically start their streaming journey; they are often a quick way to plug straight into the data store of a core system and start publishing events. However, as this practice accelerates, the entity changes published from core systems become meaningless noise without business context. Business events carrying more coarse-grained information are more consumable and provide business-domain information rather than purely system- and data-oriented signals.
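As an illustration (all table, column and field names here are hypothetical), compare a raw CDC change event, which leaks table internals, with the coarse-grained business event a consumer actually wants:

```python
# Raw CDC event: row-level noise that leaks table and column internals.
# A consumer must know that ORD_STAT_CD "03" means "shipped".
cdc_event = {
    "table": "CUST_ORD_HDR",
    "op": "U",  # update
    "before": {"ORD_ID": "ORD-1001", "ORD_STAT_CD": "02", "ORD_AMT": 120.00},
    "after":  {"ORD_ID": "ORD-1001", "ORD_STAT_CD": "03", "ORD_AMT": 120.00},
}

# Business event: a coarse-grained signal in the domain's published language.
business_event = {
    "eventType": "OrderShipped",
    "orderId": "ORD-1001",
    "shippedAt": "2023-06-01T10:15:00Z",
    "amount": {"value": 120.00, "currency": "AUD"},
}
```

A consumer of the second message needs no knowledge of the source system's schema or status codes; the domain meaning travels with the event.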
3 key patterns for noise into signal
There are 3 key patterns based on where you convert noise to signal: 1. at the source, 2. in the middle, or 3. at the end consumer.
Patterns #1 and #3 work well in a 1-1 ecosystem with a single provider and consumer. However, as this scales, the cost of providers and consumers doing the transformation into business-oriented messages increases, leading to the broker approach of pattern #2. In implementing pattern #2, use something in the middle to transform raw system events into business events, allowing systems to plug in as providers or consumers.
Noise to Signal Processing: Business Domain APIs and Events
Converting from noise to signal using pattern #2 can lead to a cleaner architecture along with reusable business event streams that decouple stream providers and consumers. This has the added benefit of centralising operational costs in a single component. The broker in pattern #2 is a converter from "system" to "common vocabulary" (the published language in DDD), aligned to business domains. This is a domain service which encapsulates the business domain capabilities with services, outbound events and data streams that adhere to a common model aligned to the business language of that domain.
The domain service consumes system entity change event streams from core systems and publishes these messages to domain-aligned streams after transforming them into a standard format. These services and the domain streams are maintained and operated by a domain-aligned team (in de-centralised or federated models) or an integration practice (centralised model).
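A minimal sketch of the pattern #2 transformation step, assuming the same hypothetical table, column and status-code names as above: the domain service maps internal status codes to business event types, and drops changes that carry no business meaning instead of forwarding them as noise.

```python
# Hypothetical mapping from internal status codes to business event types.
STATUS_TO_EVENT = {
    "02": "OrderConfirmed",
    "03": "OrderShipped",
}


def to_business_event(cdc_event):
    """Transform a raw CDC change event into a domain business event.

    Returns None for changes with no business meaning, so the noise
    is filtered out in the middle rather than by every consumer.
    """
    if cdc_event.get("table") != "CUST_ORD_HDR":
        return None  # not an order change; out of this domain's scope
    old_status = cdc_event["before"]["ORD_STAT_CD"]
    new_status = cdc_event["after"]["ORD_STAT_CD"]
    if new_status == old_status:
        return None  # row updated but no state transition -> noise
    event_type = STATUS_TO_EVENT.get(new_status)
    if event_type is None:
        return None  # internal-only status, not part of the published language
    return {
        "eventType": event_type,
        "orderId": cdc_event["after"]["ORD_ID"],
        "occurredAt": cdc_event["ts"],
    }


raw = {
    "table": "CUST_ORD_HDR",
    "ts": "2023-06-01T10:15:00Z",
    "before": {"ORD_ID": "ORD-1001", "ORD_STAT_CD": "02"},
    "after":  {"ORD_ID": "ORD-1001", "ORD_STAT_CD": "03"},
}
assert to_business_event(raw)["eventType"] == "OrderShipped"
```

In a real deployment this function would sit inside a stream processor (e.g. Kafka Streams or a consumer/producer pair) subscribed to the CDC topic and publishing to the domain-aligned topic; the mapping table is where the business vocabulary agreed with SMEs is encoded.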
How do you design business events?
With the business!
Getting to a domain service requires upfront domain analysis with business SMEs to understand what the business events are. This can be done through techniques such as event storming and domain storytelling, which are part of strategic domain-driven design (DDD).
We looked at messaging vs event streaming and how late-majority adoption of event streaming is giving way to faster data integration with core systems, with streams of entity data changes being published. This method produces more data noise for consumers, leading to greater IT spend on building and maintaining processing logic to convert noise to signal, and to duplication and coupling concerns architecturally.
If you are still relying on plain old messaging, then as integration practice owners, architects and engineers, consider adopting event streaming as the data-in-motion practice; it provides broader capabilities, especially ones needed today for data insights and AI models. Also, when implementing streaming, consider layering domain-oriented event streams over change data capture streams so you publish signal instead of noise to consumers.