Exaggerated Claims and the Importance of Hands-on Evaluation
Anyone who has worked with more than one commercial rule engine knows that there are some significant performance differences between the various engines. In some cases, the performance issues are so severe that you end up scratching your head, looking at the Gartner Magic Quadrant and trying to figure out how they put some of these turkeys where they did.
With Business Rules Engines we are, unfortunately, in a marketplace where information asymmetry and switching costs are the order of the day. Vendors don't like to have their rule engines compared head-to-head. To make those comparisons difficult, they use everything from custom rule description formats to alternate rule evaluation algorithms to rule repositories to legal restrictions on making any performance measures public.
As a result, you see all sorts of claims from vendors about how well their super-secret algorithms perform. Peter Lin regularly takes on some of the more outrageous claims and, as best he can given the lack of information from the vendor, attempts to debunk them. His latest bit of skeptical wisdom, about the HyperRETE algorithm from EWA Systems, can be found here.
Ideally, we would have a standard rule description format, something like RuleML for example, and would use the rule engine as a standard execution environment with a standard set of APIs, etc. Vendors could differentiate their product offerings by providing nice IDE and Rule Repository functionality or by souping up their rule engine to provide better performance. Instead, the situation resembles Java IDE vendors clouding the issue by each shipping their own non-standard JVMs, tightly integrated with the IDE. If you want to run apps developed with their IDE, you need to use their engine, and if you want to use their engine, you need to use their IDE.
In reality, Rule Engines should be a commodity by now. It's just the lack of sophistication by IT managers that prevents them from evaluating BRE technology just like they would OS or database software.
So, what is a product manager to do? My advice is to evaluate as many rule engines as you can. If you need inferencing, i.e. the ability to chain together conclusions across a large fact base, make sure to evaluate RETE engines, and make sure those engines really are RETE. The best way to test the engines for performance and scalability is to build a prototype that replicates the rules and facts you are going to process. The academic benchmarks like Manners are all well and good, but if you are going to write lots of rules that involve 20 conditionals or the existential quantifier ("there are at least 3 instances of a bad credit reference"), you're better off testing with rules like those. If you're expecting to chase a million customers through the rule engine every day, you want to know whether you'll need 4 or 400 CPUs, so use enough representative data to estimate performance accurately.
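The sizing arithmetic behind the "4 or 400 CPUs" question can be sketched as follows. This is only a rough illustration: the helper name, the batch window, and the throughput figures are made-up assumptions, not measurements of any engine, and the calculation assumes throughput scales linearly with CPU count, which is optimistic.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def cpus_needed(customers_per_day, customers_per_sec_per_cpu,
                peak_factor=1.0, processing_window_sec=SECONDS_PER_DAY):
    """Estimate CPU count from prototype throughput (hypothetical helper).

    customers_per_sec_per_cpu should come from your own prototype run with
    representative rules and facts. peak_factor > 1 allows for uneven load.
    Assumes linear scaling across CPUs, which real engines may not deliver.
    """
    if customers_per_sec_per_cpu <= 0:
        raise ValueError("throughput must be positive")
    required_rate = customers_per_day * peak_factor / processing_window_sec
    return math.ceil(required_rate / customers_per_sec_per_cpu)


# Made-up numbers: one million customers in an 8-hour batch window,
# with two hypothetical engines measured at 10 and 0.1 customers/sec/CPU.
window = 8 * 60 * 60
fast = cpus_needed(1_000_000, customers_per_sec_per_cpu=10.0,
                   processing_window_sec=window)
slow = cpus_needed(1_000_000, customers_per_sec_per_cpu=0.1,
                   processing_window_sec=window)
print(fast, slow)
```

The estimate is only as good as the measured throughput, which is why the prototype needs representative rules and data rather than an academic benchmark.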
If, instead, you swallow a vendor's song and dance and fancy GUI, you deserve what you get. So, do your homework.
Topics: Business Rules
Comments: 4 so far


Actually, my name is Peter Lin. In my biased thinking, there are two classes of tests one should perform when evaluating rule engines.
The first group covers worst-case scenarios, because there are times when an application needs cross-product joins.
The second group tests network topology. By network topology, I mean networks of different sizes. The size of a RETE network can be measured across three dimensions: node count, width, and depth.
Since runtime performance is primarily the result of the network topology, one can roughly estimate the performance by looking at a few sample rules. The approach I tend to take is to look at a dozen or so rules that represent the basic types of rules an application will use. Once I have a decent idea of the types of rules and their general structure, it’s fairly easy to predict the results. The next step I take is to generate a decent set of rules and run several dozen benchmarks. I like to run a series of benchmarks to determine the scalability as a function of rule count and dataset size.
I’m planning to design and implement a set of benchmarks to measure rule engine performance using RuleML as the language. I’ve been working with one of the founders of RuleML, and over the last few years we’ve been thinking about this issue. Take, for example, the Manners benchmark published by Miranker. It’s valid to run Manners in forward and backward chaining, because an application should employ both techniques.
The dangerous thing is to say Manners in backward chaining is an “apples-to-apples” comparison to forward chaining RETE. In general, a good backward chaining engine should be 30-50x faster than RETE, because backward chaining uses the first-best-match approach. Rather than build up lots of activations in the agenda, it takes the best match and fires it immediately. This means backward chaining execution of Manners isn’t doing cross-product matching.
I’ve seen cases where a system really needs to do cross-product matching. In those cases, RETE will beat non-RETE implementations by 100x or greater.
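The difference in work between enumerating every match and taking the first one can be sketched with a toy join. This is neither RETE nor a real backward chainer, and it says nothing about the speedup figures above; it only counts condition tests. The guest/seat data and the `same_parity` condition are hypothetical stand-ins.

```python
from itertools import product


def all_matches(guests, seats, compatible):
    """Test the full cross product and collect every satisfying pair."""
    tests = 0
    matches = []
    for guest, seat in product(guests, seats):
        tests += 1
        if compatible(guest, seat):
            matches.append((guest, seat))
    return matches, tests


def first_match(guests, seats, compatible):
    """Stop as soon as one satisfying pair is found."""
    tests = 0
    for guest, seat in product(guests, seats):
        tests += 1
        if compatible(guest, seat):
            return (guest, seat), tests
    return None, tests


def same_parity(guest, seat):
    # Hypothetical join condition, chosen only to make counts easy to check.
    return (guest + seat) % 2 == 0


guests = list(range(100))
seats = list(range(100))

matches, full_tests = all_matches(guests, seats, same_parity)
first, early_tests = first_match(guests, seats, same_parity)
print(len(matches), full_tests, early_tests)
```

With 100 guests and 100 seats, the exhaustive version evaluates the condition for every pair, while the first-match version stops at the first hit, so timing the two against each other does not compare like with like.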
I’m hoping that by the middle of next year I will have a set of 20 test scenarios written in OMG Production RuleML that all engines can use for comparison.
peter
Comment by Peter Lin, Sunday, April 9, 2006 @ 11:51 am
Sorry about the name mixup, Peter. I must have been watching old episodes of Hollywood Squares. I’ve fixed it in the post.
Comment by Dietrich Kappe, Monday, April 10, 2006 @ 1:06 pm
Actually, having an agreed syntax for rules is only part of the story. Different rule engines can produce different results from the same set of rules.
In our work on the Internet Business Logic system, we publish a model theoretic specification of what our rule engine is supposed to do. We also publish a simplified version of the algorithm for the rule engine.
There’s more about this in
“Understandability and Semantic Interoperability of Diverse Rules Systems”, http://www.w3.org/2004/12/rules-ws/paper/19
Thanks in advance for comments on the paper, or on the online system. (Shared use of the system is free.)
Comment by Adrian Walker, Saturday, April 22, 2006 @ 2:13 pm
[...] for BRMS are fairly well documented, and there seems to be a general consensus that they should not be used for reliable product performance comparison, however they keep cropping up! Even in [...]
Pingback by ILOG BRMS Blogs » Blog Archive » The Good, The Bad and The Ugly - Rule Engine Benchmarks, Thursday, June 19, 2008 @ 8:31 am