Case Study

Rules, Validation, and Calculation Engine Overhaul

Re-designed and re-implemented the core custom rules, validation, and calculation engine using the Orleans actor framework on Azure Service Fabric.

Transformation Journey

1. Legacy Monolith

  • .NET Framework, coupled stages
  • Manual state management
  • No horizontal scaling

Bottlenecked growth

2. Actor Foundation

  • Orleans actor model selected
  • Service Fabric compute platform
  • Redis + in-memory 2-layer cache

Distributed state foundation laid

3. Platform Migration

  • .NET Framework to .NET Core
  • Structured logging added
  • Integration testing established

Modern stack in place

4. Architecture Transformation

  • Actor-based distributed processing
  • Service Bus decoupling
  • Full observability suite

Scaling bottlenecks eliminated

5. Extensible Platform

  • Grain-based architecture
  • Elastic scaling for any workload
  • Shared NuGet access patterns

Platform ready for future products

Challenge

The core processing engine needed to be re-designed from the ground up to achieve the scale, performance, and reliability required for modern financial reporting, while simultaneously migrating from .NET Framework to .NET Core.

  • The architecture needed to be optimized for scale and performance at every layer
  • Observability, supportability, and testability all needed strengthening
  • End-user operational needs had to be balanced against performance targets
  • The migration from .NET Framework to .NET Core had to happen during the re-architecture
  • The solution had to serve both the current product and future products
  • Processing required high throughput with distributed state management
  • Database backpressure had to be relieved, which led to the 2-layer caching design
  • The division of responsibilities between this application and a related application was not clearly defined

Strategy

Leveraged the Orleans virtual actor model for distributed state management at scale, combined with Redis caching and Service Bus messaging for high-throughput processing.

  • Orleans actor framework for distributed state management and grain-based processing
  • Structured logging for comprehensive monitoring and debugging
  • Azure Service Fabric as the reliable compute platform for stateful services
  • Integration testing for end-to-end validation of the new architecture
  • Redis Cache as a high-performance distributed caching layer
  • In-memory caching for frequently accessed, non-dynamic data
  • User stories for capturing and validating functional requirements
  • MongoDB for flexible, atomic, scalable persistence of application data
  • 2-layer caching that integrates Redis Cache with the in-memory cache for high-throughput processing and database backpressure alleviation
  • A NuGet package, created and maintained by the development team, to extend the functionality of the related application
  • TPL Dataflow for parallel, asynchronous pipeline processing with built-in buffering, block composition, and backpressure propagation across processing stages
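The 2-layer caching idea in the list above can be illustrated with a small read-through sketch. It is written in Python only to keep it short and language-neutral; it is not the project's .NET code. The class name `TwoLayerCache`, the plain dictionary standing in for Redis, the `load` callback standing in for a database query, and the 30-second local expiry are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class TwoLayerCache:
    """Read-through cache: an in-process dict (layer 1) in front of a shared store (layer 2)."""

    def __init__(self, shared: Dict[str, str], load: Callable[[str], str],
                 local_ttl_seconds: float = 30.0) -> None:
        self._local: Dict[str, Tuple[float, str]] = {}  # key -> (expiry time, value)
        self._shared = shared   # hypothetical stand-in for Redis
        self._load = load       # hypothetical stand-in for the database query
        self._ttl = local_ttl_seconds

    def get(self, key: str) -> str:
        now = time.monotonic()
        hit = self._local.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                    # layer 1 hit: no network call at all
        value = self._shared.get(key)
        if value is None:
            value = self._load(key)          # missed both layers: query the database
            self._shared[key] = value        # populate layer 2 for every node
        self._local[key] = (now + self._ttl, value)  # populate layer 1 for this node
        return value


def demo() -> List[str]:
    """Two nodes share one layer-2 store; returns the keys that reached the database."""
    db_calls: List[str] = []

    def load_rule(key: str) -> str:
        db_calls.append(key)
        return "rule-body-for-" + key

    shared_store: Dict[str, str] = {}
    node_a = TwoLayerCache(shared_store, load_rule)
    node_b = TwoLayerCache(shared_store, load_rule)
    node_a.get("rule:42")   # database query, fills both layers
    node_a.get("rule:42")   # served from node A's in-memory layer
    node_b.get("rule:42")   # served from the shared layer, not the database
    return db_calls
```

In this sketch three reads across two nodes reach the database once, which is the backpressure relief the design aims for. A production version would also need invalidation and expiry on the shared layer, which are omitted here.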

Tradeoffs Considered

  • Orleans actor model vs traditional service orchestration: chose grain-based state management for automatic distribution at the cost of increased conceptual complexity and team learning curve
  • 2-layer caching vs single cache layer: accepted the operational overhead of managing both Redis and in-memory caches to eliminate database backpressure under peak loads
  • Azure Cloud Services vs Service Fabric vs Azure Functions: chose Service Fabric for its native stateful service support. Cloud Services was rejected for its limited lifecycle management, and Azure Functions for its inability to host long-running stateful workloads at scale. The cost was higher operational complexity, accepted in exchange for a purpose-built distributed compute substrate
  • Full rewrite vs incremental migration: chose complete re-architecture over gradual refactoring to break tight coupling between processing stages, accepting short-term delivery risk for long-term architectural coherence
  • Shared NuGet package vs direct system integration vs in-app caching: created a separately versioned and maintained NuGet package. The alternatives were embedding a costly direct integration with an external system, or temporarily caching its structures within the main application, which would have degraded performance by re-materializing objects on every cache miss. The package trades the overhead of maintaining a shared library for clear ownership boundaries, independent versioning, and reusable access patterns across consuming services
  • TPL Dataflow vs sequential processing vs manual threading: adopted TPL Dataflow for declarative pipeline composition with built-in parallelism, bounded-capacity buffering, and automatic backpressure propagation across processing blocks. The costs were harder debugging, due to the block-based execution model and opaque internal queue states, and a team learning curve for dataflow mesh design and fault-handling strategies
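The bounded-capacity buffering and backpressure propagation named in the last tradeoff can be sketched with bounded queues. This is a Python `asyncio` analogy, not TPL Dataflow itself and not the project's code. The function name `run_pipeline`, the stage names, the "reject negative values" rule, and the doubling calculation are hypothetical. Each stage here has a single worker, so the sketch shows backpressure only, not per-block parallelism.

```python
import asyncio
from typing import List


async def run_pipeline(items: List[int], capacity: int = 2) -> List[str]:
    """Three stages joined by bounded queues; a full queue makes its producer wait."""
    to_validate: asyncio.Queue = asyncio.Queue(maxsize=capacity)
    to_calculate: asyncio.Queue = asyncio.Queue(maxsize=capacity)
    results: List[str] = []
    done = object()  # completion marker handed from stage to stage

    async def produce() -> None:
        for item in items:
            await to_validate.put(item)    # waits while the validate queue is full
        await to_validate.put(done)

    async def validate() -> None:
        while (item := await to_validate.get()) is not done:
            if item >= 0:                  # hypothetical validation rule
                await to_calculate.put(item)  # waits while the calculate queue is full
        await to_calculate.put(done)       # propagate completion downstream

    async def calculate() -> None:
        while (item := await to_calculate.get()) is not done:
            await asyncio.sleep(0)         # stand-in for slow calculation work
            results.append(f"{item}->{item * 2}")

    await asyncio.gather(produce(), validate(), calculate())
    return results
```

Calling `asyncio.run(run_pipeline([1, -1, 3]))` pushes three items through the stages and drops the one that fails validation. Because every queue is bounded, a slow final stage slows the stages before it instead of letting work pile up in memory, which is the property the tradeoff describes.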

Results

Successfully delivered a strategic platform modernization initiative within six months while establishing a foundation capable of supporting both current and future product investments.

  • Improved operational supportability and production visibility through a modernized architecture designed for long-term maintainability
  • Enabled the platform to scale predictably under growing workloads while improving resilience and operational stability
  • Modernized the technology stack to enhance security, performance, maintainability, and future extensibility
  • Reduced delivery risk by establishing clear alignment between business requirements and implementation outcomes
  • Increased confidence in system behavior through comprehensive validation and end-to-end testing strategies
  • Created an architectural foundation capable of supporting evolving business requirements without significant redesign
  • Reduced infrastructure bottlenecks and improved application responsiveness through architectural performance optimizations
  • Improved system efficiency and scalability through streamlined data access and processing patterns
  • Achieved significant throughput gains by composing processing stages as parallel dataflow blocks, enabling concurrent execution of rules, validation, and calculations across independent grains without mutex contention or thread pool saturation
  • Established clear ownership boundaries and integration responsibilities, reducing complexity across interconnected systems
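The claim about concurrent execution across independent grains rests on the virtual actor idea: each grain owns its state and handles one message at a time, so different grains can run concurrently without shared locks. The following is a minimal Python sketch of that idea, not Orleans and not the project's code. `TotalsGrain`, `GrainDirectory`, and the account keys are hypothetical names.

```python
import asyncio
from typing import Dict, Optional


class TotalsGrain:
    """Hypothetical grain: owns state for one key and processes one message per turn."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.total = 0
        self._mailbox: asyncio.Queue = asyncio.Queue()
        self._worker: Optional[asyncio.Task] = None

    async def add(self, amount: int) -> int:
        if self._worker is None:
            # Start the single worker lazily, on the first call to this grain.
            self._worker = asyncio.create_task(self._drain())
        reply: asyncio.Future = asyncio.get_running_loop().create_future()
        await self._mailbox.put((amount, reply))
        return await reply

    async def _drain(self) -> None:
        # One worker per grain: messages are applied strictly one after another,
        # so the grain's state never needs a lock.
        while True:
            amount, reply = await self._mailbox.get()
            self.total += amount
            reply.set_result(self.total)


class GrainDirectory:
    """Creates a grain on first use; callers address grains only by key."""

    def __init__(self) -> None:
        self._active: Dict[str, TotalsGrain] = {}

    def get_grain(self, key: str) -> TotalsGrain:
        if key not in self._active:
            self._active[key] = TotalsGrain(key)
        return self._active[key]

    def totals(self) -> Dict[str, int]:
        return {key: grain.total for key, grain in self._active.items()}


async def demo() -> Dict[str, int]:
    directory = GrainDirectory()
    calls = [("acct-1", 5), ("acct-2", 7), ("acct-1", 10)]
    await asyncio.gather(*(directory.get_grain(k).add(v) for k, v in calls))
    return directory.totals()
```

Running `asyncio.run(demo())` sends three calls to two grains. Calls to the same key are serialized through that grain's mailbox, while calls to different keys proceed independently. A real actor runtime adds placement across machines, persistence, and deactivation, none of which is modeled here.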

Before vs After

Before

Legacy Processing Engine

  • Built on .NET Framework with limited cross-platform support
  • Manual state management with no built-in distributed processing
  • Limited horizontal scaling for processing pipelines
  • Observability gaps making debugging and monitoring difficult
  • Tight coupling between processing stages
  • Difficult to extend for new product requirements
  • No support for virtual actors or grain-based parallelism

After

Orleans Actor-Based Engine

  • Built on .NET Core with full cross-platform compatibility
  • Orleans virtual actors for automatic distributed state management
  • Elastic horizontal scaling with Service Fabric reliable services
  • Comprehensive observability with structured logging and monitoring
  • Decoupled processing stages connected via Service Bus messaging
  • Extensible grain-based architecture for future product needs
  • Throughput for high-volume processing optimized with Redis Cache