AMTSO… Yet again…


Posted by Ed on Jul 7, 2010 in Analysis

I really didn’t want to continue on this topic again, but I find that I am unable to control myself.

I was reading through David Harley’s recent comments about the difference between ISO and AMTSO and Kurt Wismer’s well-reasoned post on AMTSO generally and I started musing about the role of AMTSO, my particular beef with it, and why this seems to stick in my craw.

So, to briefly recap, David makes a good point about the function of AMTSO in his post.  He says (paraphrasing) that the function of AMTSO is not to set standards… but instead to provide guidance that will (ideally) increase the quality of testing overall.  I have no problem with this.  In fact, I wouldn’t necessarily be against it even if it were like ISO.  After all, we have standards and even accreditation when it comes to real-world pathogens (and the laboratories that handle them).  Maybe it’s a good idea to do the same thing with malware.

But that’s beside the point.  The point is, no matter what AMTSO’s goals actually are, that’s not how it’s perceived in the industry.  Here’s what I mean.

It’s about perception

I contend that folks out there in the public at large (particularly journalists) think the AMTSO is a standards body.  It may not be accurate… it may not be based on anything concrete… it may be total horse-puckey.  But it kind of doesn’t matter…  Joe Average security practitioner probably isn’t going to read the AMTSO blog… or this blog… or the Securiteam blog…  they’re not going to understand (or care) how the AMTSO is different from a standards body like ISO. All they’ll probably see are news reports – and those seem to suggest that AMTSO is an independent standards body – like ISO.

Pulled randomly from around the web, take a look at the AMTSO coverage that’s hitting folks in IT at large (all non-link underlining mine to illustrate the point):

From ComputerWeekly, “AMTSO standardises security software testing”:

A group of security software firms has agreed a set of testing procedures. Members of the Anti-Malware Testing Standards Organisation have published standards for testing security software.  The standards have been developed and agreed by more than 40 security experts, product testers and members of the media around the world. The creation and publication of these standards is the first step in Anti-Malware Testing Standards Organisation (AMTSO)’s mission to improve anti-malware product testing.

From InformationWeek, “Computer Security Companies Agree To Testing Standards”:

“AMTSO brings together the industry’s leading security and risk academics, vendors, and testers to provide testing methodologies and standards better suited to evaluate the protection available to combat today’s malware and related security threats,” said Vernon Jackson, manager of virus prevention systems at IBM Global Technology Services, in a statement. “This will in turn benefit end users, who will be empowered to make more informed decisions about the particular security solutions that match their needs.”

From eWeek, “Anti-virus Testing Standards Come to the Cloud”:

Last week, the two-year-old industry standards body adopted a paper setting forth best practices for testing in-the-cloud security products. The six-page document, available here, touches on subjects such as virtualization, connection filtering and the repeatability of the tests.

See what I mean?  Now, did I cherry-pick these?  A little bit… at least in that I got to choose what keyphrases to put into Google…  But does it matter?  The point is that at least some people in the industry perceive AMTSO as a standards body; and as Dave points out in his post, having “Standards Organization” in the name doesn’t help.

So is AMTSO doing the wrong thing?  Are they deliberately cultivating a false perception?  I doubt it.  But the fact is that there is a misperception, and that matters.

Slappin’ the labs

So if it’s not the role of AMTSO to standardize, it’s also clearly not their role to accredit.  But aren’t they doing just that?  Take a look at the review board report related to the NSS “socially-engineered malware” bakeoff from 2009.  The backstory:  NSS did a bakeoff as is their remit.  Sophos, AVG and Panda challenged NSS as to whether the test was in line with the published guidelines.  The review board evaluated the test and concluded that the test didn’t meet the Fundamental Principles of Testing.  Why not?  At issue were two points:

6. Testing methodology must be consistent with the testing purpose.
7. The conclusions of the test must be based on the test results.

Let’s review what the issues were from the report.  Pardon the long block-quote, but I wanted to put the relevant sections in verbatim so as to give the whole context.  For point 6:

Does the methodology align with the purpose or objective of the test? No. If you want to test “the protection of the products against socially‐engineered malware”, you should also test products against this situation. It was not taken into account how the URLs actually reached the endpoint system. This might happen through spam, for example. If a product uses a spam‐filtering, the spam message might have never appeared on the system, therefore the user would be protected as well.

Conclusion: The report does not comply with this principle. The reviewers agreed that missing infection vectors (e.g. spam) can mislead the result. Nevertheless, they also thought that the test still did better than a lot of tests out there right now, since at least the malware was coming from the “real world” and also was executed afterwards in a dynamic test.

So, here’s how I read this: the test attempted to validate the degree to which products protected against hostile links. But because the testing didn’t address spam filtering or other mechanisms to prevent the hostile link from entering the system, the review board concluded that the methodology doesn’t address the purpose of the test.

I’d argue that no matter what other features a product might have, if you test all the products the same way, you get a benchmark.  Could it be a better benchmark?  Sure.  Could you test other features?  Sure.  But when creating a benchmark, my opinion is you need to stick to testing one feature at a time.  A benchmark is not necessarily the way a product will be used in the “real world”, but a “real world” test is impossible.  Why?  Because no benchmark can account for context.  After all, if a person doesn’t use email and communicates solely through Facebook, testing the combination of Sophos’ spam filter + URL blocker doesn’t represent anything akin to how the product will be used in that case.  What do you do?  Create a bogus usage scenario only to test against it?  Do that and any difference between the test usage and the real world usage would obviate the test.

By way of analogy, if I wanted to test anti-lock brakes, what would I test?  If I designed a test case where a car went into a skid, would the test case be invalid because I didn’t also test the dynamic stability control? After all, DSC could prevent the car from going into a skid in the first place…  But no matter how sophisticated the DSC is, it’s not the same as anti-lock brakes.

But don’t take my word for it.  Check out what Sophos says about it.   On their website, Sophos lists the features of their endpoint product.  Of the features they list, they list URL filtering and spam control as two different features:

Sophos Live URL Filtering: SophosLabs’ instant in-the-cloud lookups check a database of millions of compromised sites. We keep it current by adding and identifying up to 40,000 newly infected sites each day.

Sophos Live Anti-Spam: Before one of your users opens a potentially threatening email or attachment, Sophos Live Anti-Spam conducts a fast check of sender IPs, message and attachment fingerprints, destination URLs, and checksums. This keeps your users’ inboxes free from the latest spam campaigns and your email servers running smoothly.

If Sophos advertises these as two different features, shouldn’t they be evaluated that way?  In fact, I argue that it’s more fair to gauge the performance of a product against the standard they themselves advertise.  So when a lab tests the functionality, I think it’s actually more fair if they test these two features under two different benchmarks:  a spam benchmark and a URL filtering benchmark.  I for one would argue that the test methodology was flawed if the lab did conflate the benchmarks in analyzing the products… AMTSO says it’s flawed because they didn’t.

How about point 7?  From the AMTSO report:

Does the conclusion reflect the stated purpose? No. The report’s Executive Summary states that test’s purpose was to determine the protection of the products tested against socially-engineered malware only. Later in the report (Section 4 -product assessments) it says: “Products that earn a caution rating from NSS Labs should not be short-listed or renewed.” This is clearly a conclusion that you can’t make out of the detection for socially‐engineered malware only, as the products have other layers of protection that the test did not evaluate.

Ok.  Let’s unpack that.  That sounds like they’re saying that because the test cautions purchasers on the basis of this feature, the conclusion doesn’t fit the purpose… because the product has other features?  Wait… what?  So, if Consumer Reports authored a report on “Green-Friendly Refrigerators”, the methodology would be flawed if they only tested power consumption and didn’t also test the refrigerators’ ability to make ice?  After all, the ice making feature could be so whiz-bang great that it makes up for the fact that the thing causes brownouts when you turn it on.

No.

But let’s put all that aside for the moment.  Let’s ignore the merits of the challenge and look at the complaint itself.  Consider this question: should the AMTSO allow vendors to publicly challenge (and potentially discredit) an independent test from an independent lab?  If so, what are acceptable parameters?

I’d argue that if you are going to do that, there are some circumstances in which it would be inappropriate.  For example, this didn’t happen in this case, but would it be appropriate if Sophos challenged the methodology of a test in which their product performed poorly and the entire review committee was made up of Sophos employees?  That probably wouldn’t pass the sniff test, right?  What about if all of the reviewers were employed by vendors?  Sniff test on that one?  Again, didn’t happen here.  But my point is, where’s the line?

It’s hard to draw one.  That’s why Consumer Reports accepts no advertising and involves no vendors in its tests: so that when they get sued, there is no appearance of bias.  Note that phrase: appearance of bias.  It doesn’t matter whether there really is bias or not; it’s the perception of it that matters.
