Knoldus Inc


Automated the testing process to test the streaming data pipelines in production.


United States

Technologies Used

Java, TestRail, JUnit 5, Apache Beam, Kafka, BigQuery, Google Cloud Storage, Google Cloud Dataflow, Google Cloud Functions


Human Resource Management

Ultimate Kronos Group (UKG) is an American multinational technology company with dual headquarters in Lowell, Massachusetts, and Weston, Florida. It provides workforce management and human resource management services. As a leading global provider of HCM, payroll, HR service delivery, and workforce management solutions, UKG’s award-winning Pro, Dimensions, and Ready solutions help tens of thousands of organizations across geographies and in every industry drive better business outcomes, improve HR effectiveness, streamline the payroll process, and help make work a better and more connected experience for everyone.



High regression time, limited test coverage, and a lack of test data make it difficult to test streaming data pipelines in production.


Automated the testing process and designed a hybrid framework to test the streaming data pipelines in production.


Enabled on-demand test execution and reduced our regression cycle by 70% by integrating the framework with CI.



To achieve this, the team faced numerous challenges:

Since we did not have access to the upstream systems, we tested with mocked events from those systems, but that limited set of events was not enough to ensure the quality of the production data pipelines.

Because we were testing with mock events, our test coverage was also limited; we were only performing functional testing with a few positive and negative scenarios.

The upstream source systems were responsible for sharing the mocked events. Since we had multiple upstream source systems, communication and collaboration were difficult: even small updates to the mocked events required coordinating with each team. This resulted in brittle test cases that failed whenever an upstream system made a change.

Another major challenge: since the development and testing teams shared a single GCP instance, we used the same service account to access the same GCP resources. Because the pipeline under test was a streaming pipeline, it pulled real-time events from the upstream systems alongside our mocked events, and separating the mocked data from real-time development data was cumbersome.
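One common way to separate test traffic from real traffic in a shared environment is to tag every mocked event with a marker attribute and filter on it downstream. The sketch below illustrates the idea in plain Java; the attribute name and values are hypothetical, not taken from the actual framework.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: tag every mocked event with an "origin" attribute so that
// validation code can separate test-harness events from the real-time
// development events flowing through the same shared GCP instance.
public class MockEventFilter {

    // Marker key and value are assumptions; any agreed-upon marker works.
    static final String ORIGIN_KEY = "origin";
    static final String MOCK_ORIGIN = "test-harness";

    // Keep only the events produced by the test harness.
    public static List<Map<String, String>> mockedOnly(List<Map<String, String>> events) {
        return events.stream()
                .filter(e -> MOCK_ORIGIN.equals(e.get(ORIGIN_KEY)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> events = List.of(
                Map.of(ORIGIN_KEY, MOCK_ORIGIN, "id", "m-1"),
                Map.of(ORIGIN_KEY, "upstream", "id", "r-1"),
                Map.of(ORIGIN_KEY, MOCK_ORIGIN, "id", "m-2"));
        System.out.println(mockedOnly(events).size()); // prints 2
    }
}
```

In a real Kafka or Pub/Sub setup the marker would typically travel as a message header or attribute rather than a payload field, so the filter can run without deserializing the event body.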

We relied mostly on manual functional testing of the pipeline, which resulted in long regression cycles whenever we made even a small change to the pipeline's behavior.


Introduced Test Automation In SDLC

Manual testing of the pipelines helped us understand their behavior and identify and document the relevant scenarios. At this point we had a clear understanding of what our pipeline was expected to do and which scenarios mattered, so we decided to introduce test automation into the SDLC.

A hybrid framework with contract tests

With contract tests, we tested each integration point by checking each application in isolation to ensure that the messages the upstream system sends, or the downstream system receives, conform to a shared understanding documented in a “contract.” This contract is our source of truth for validating the event schema received from upstream sources. In this way, we minimize our dependency on the source system.

This is how it works: as the consumers of events from the upstream system, we wrote consumer-driven contract tests and generated a pact, or contract, describing what we expect from the source system, and shared it with the provider (source) team. The source team then structured the raw events as per the contract we shared.
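At its core, the consumer side of such a contract check verifies that each raw event carries the fields the pipeline depends on. The following is a minimal sketch of that idea in plain Java; real setups would typically use a tool such as Pact or a schema registry, and the field names here are illustrative assumptions, not the actual contract.

```java
import java.util.Map;
import java.util.Set;

// Sketch of a consumer-side contract check: the "contract" here is simply
// the set of fields the pipeline expects in every upstream event.
// Field names are hypothetical examples.
public class EventContract {

    static final Set<String> REQUIRED_FIELDS =
            Set.of("eventId", "eventType", "payload", "timestamp");

    // Returns true when the raw event carries every field the contract demands.
    public static boolean conforms(Map<String, Object> event) {
        return event.keySet().containsAll(REQUIRED_FIELDS);
    }

    public static void main(String[] args) {
        Map<String, Object> good = Map.of(
                "eventId", "e-1", "eventType", "hire",
                "payload", "{}", "timestamp", 1700000000L);
        Map<String, Object> bad = Map.of("eventId", "e-2");
        System.out.println(conforms(good)); // true
        System.out.println(conforms(bad));  // false
    }
}
```

A check like this would typically run as a JUnit 5 test on the consumer side, with the generated contract shared with each provider team so their builds fail if they break the agreed schema.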

Test Automation Framework components

We developed a hybrid framework to automate the testing effort. The framework has multiple components, explained below:


In the end, we designed a plug-and-play, reusable automation framework for testing streaming ingestion data pipelines, which gave us faster feedback and helped us surface failures in production.

OMNI is an analytical platform that interacts with multiple upstream systems, so we depend heavily on them for test data. The automation framework helps us minimize our dependency on the source systems, accelerating development and enabling on-demand test execution.

