DiagnosabilityIsItWorthIt

From SPA Wiki

Diagnosability: is it worth it?

This session attempted to answer the above question or, more precisely, the question: when is it worth it?

What is diagnosability?

In the first part of the session we discussed the meaning of diagnosability. As part of this discussion we decided on the following characteristics:

  • It is distinct from error handling
  • It provides the context of a fault, which may be used to find the source of the problem. This includes:
      • What
      • When
      • Why
  • It can provide information that links cause and effect, i.e. the source of a fault and the point at which the fault is reported, even if these are separated by time or space.

Techniques for diagnosability

We discussed the kind of techniques that can allow diagnosability of faults. It was noted that we need to understand what types of faults require diagnosability, for example:

     User boundary faults: those caused by some action of the user.
     System boundary faults: those caused by the interaction of the software with system or product components.
     Internal operation faults: those caused by bugs in the software.

We should also understand the audience for the information produced for diagnosability, for example:

  • End user
  • Operator
  • Developer

The discussion of techniques focused mainly on logging. The approaches discussed can be distilled into the following:

     predictive: where information about the context of data is maintained so that the context can be recreated when a fault is detected. This context data can be discarded once the data is known to be without fault. The context can be, for example, "Major usage by the end-users or system" or "Full details of all system state changes". This information can be mined with log analysis tools to tie reported faults to their causes.
     reactive: where the context is maintained for the duration of a call and is used, if a fault occurs, to provide the history or lineage of the fault. This context could be the information we would normally put in detailed logging to investigate reproducible bugs. This method reports the context associated with the point at which the fault is detected, which may not be the place where the fault occurred.
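
The reactive approach can be illustrated with a small Python sketch. It is only one possible reading of the idea, not something presented in the session: the names CallContext, DiagnosedFault, diagnosable_call and parse_order are all hypothetical. Context is collected for the duration of a call, discarded if the call succeeds, and attached to the error as a history if a fault is detected.

```python
import contextlib
import time


class CallContext:
    """Collects context for one call (hypothetical helper, not a library API)."""

    def __init__(self, name):
        self.name = name
        self.events = []

    def note(self, what, **details):
        # Record what happened and when; nothing is written to a log yet.
        self.events.append({"when": time.time(), "what": what, "details": details})


class DiagnosedFault(Exception):
    """Carries the original error together with the history that led up to it."""

    def __init__(self, context, cause):
        super().__init__(f"{context.name} failed: {cause!r}")
        self.history = list(context.events)


@contextlib.contextmanager
def diagnosable_call(name):
    ctx = CallContext(name)
    try:
        yield ctx
    except Exception as exc:
        # Fault detected: surface the retained context alongside the error.
        raise DiagnosedFault(ctx, exc) from exc
    # No fault: ctx goes out of scope and the context is simply discarded.


def parse_order(raw):
    # Assumed example operation: parse "item,quantity" into a dict.
    with diagnosable_call("parse_order") as ctx:
        ctx.note("received input", raw=raw)
        fields = raw.split(",")
        ctx.note("split fields", count=len(fields))
        return {"item": fields[0], "quantity": int(fields[1])}
```

Calling parse_order("widget,three") raises a DiagnosedFault whose history lists the two recorded steps, with the underlying ValueError available as its cause. As noted above, the history describes the point where the fault was detected, which is not necessarily where it originated.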


What are the benefits and costs?

We discussed the dimensions which should be used to decide whether diagnosability should be implemented in software, and arrived at the following list:

     Size of user base: a small user base may not need diagnosability, but a large user base will.
     Capability of the user base: if the user base is technical, diagnosability probably won't be needed; if the user base is non-technical, it should be considered.
     Complexity: for any measure of complexity (code size, ability to debug, understandability), low complexity won't need diagnosability, but high complexity in any dimension will probably require it.
     Urgency: if the software is critical to a customer or user base, then diagnosability is probably required; otherwise it is not.
     Scale: if the data consumed/produced or the size of the deployment is large then diagnosability should be considered.

Is it worth it?

We all agreed that the case for diagnosability in a piece of software is an economic one: software without diagnosability is more expensive overall than software with it. Adding diagnosability should therefore be considered when one of the dimensions listed above has a high value. This may not be the case in early versions of the software, so diagnosability may be added later in response to changes in the software or in any of the cost dimensions.

A simple example would be a software product which starts with a few customers, who can be supported individually, and later sells to many thousands of customers, who cannot be supported directly. This growth in the size of the user base would require diagnosability to be added to the product.
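
The recommendation can also be written down as a rough checklist. The Python sketch below is an illustration only: the ProductProfile type, its field names and the yes/no simplification of each dimension are assumptions made here, not something the session defined. It reports which of the five dimensions currently argue for adding diagnosability.

```python
from dataclasses import dataclass, replace


@dataclass
class ProductProfile:
    """Hypothetical yes/no view of the five dimensions discussed above."""
    large_user_base: bool = False
    non_technical_users: bool = False
    high_complexity: bool = False
    business_critical: bool = False
    large_scale: bool = False


# Maps each assumed field to the dimension it stands for.
DIMENSIONS = {
    "large_user_base": "size of user base",
    "non_technical_users": "capability of the user base",
    "high_complexity": "complexity",
    "business_critical": "urgency",
    "large_scale": "scale",
}


def reasons_to_add_diagnosability(profile):
    """Return the dimensions that are high; an empty list means 'not yet'."""
    return [label for field, label in DIMENSIONS.items() if getattr(profile, field)]


# The example from the text: a product that grows from a few customers to thousands.
early_version = ProductProfile()
later_version = replace(early_version, large_user_base=True)
```

Here reasons_to_add_diagnosability(early_version) returns an empty list, while the same call on later_version returns ["size of user base"], matching the point that the need can appear after the first release.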