
Is FBM Future Proof?

Since functional benchmarking focuses on a specific functionality, or set of functionalities, rather than on a specific system, the question arises whether functional evaluation is future proof. Future proof in this context means, first of all, that the methodology can be applied to emerging and future systems. Secondly, it implies that new solutions can be compared with existing or past solutions that address the same functionality. To do this, however, one has to adopt a functional view of every system under test (SUT). Further, it must be possible to generalise the workload and environment so that they can largely be applied to a range of systems that provide a specific functionality. The same applies to the cost aspects. Some of the metrics might change (e.g. additional aspects might have to be considered when new paradigms are applied, while others might become obsolete), but the basis should remain the same in order to allow comparisons between systems.
The essential premise is that FBM is future proof, i.e. that newly developed systems can be evaluated by looking at the core functionality they provide. It also means that the set of performance parameters used to assess a system stays the same for this specific functionality. A case study approach was used to investigate whether, and to what extent, FBM is future proof. The example examined is Video-on-Demand (VoD), since VoD systems have been developed for a number of years.

Case Study: VoD

Two main VoD (i.e. on-demand streaming) approaches are considered:

  • “traditional” client/server streaming
  • P2P streaming



The functionality provided by both systems is ad-hoc video streaming over packet-based networks (i.e. the Internet). Before the emergence of P2P, client/server was the only option, and the parameters and metrics used to evaluate VoD systems were, for instance:

  • Output parameters

    • Performance metric:

      • User level focuses on user perception using Mean Opinion Score (MOS)
      • System level uses quantitatively measurable parameters such as availability, start-up delay and playout continuity.
    • Cost metric:

      • Bandwidth usage and number of retransmissions
  • Input parameters

    • Workload metric

      • Video samples characterised by encoding scheme and rate characteristics
      • Number of (concurrent) requests
    • Environment metric

      • Available bandwidth from server to client(s)



With the emergence of P2P, the same functionality is provided through a different infrastructure. Some commonly used parameters and metrics are:

  • Output parameters

    • Performance metric:

      • User level focuses on user perception using Mean Opinion Score (MOS)
      • System level uses quantitatively measurable parameters such as availability, start-up delay and playout continuity.
    • Cost metric:

      • Sum (∑) of the bandwidth usage over all transmissions, and the number of transmissions of the same chunk
  • Input parameters

    • Workload metric

      • Video samples characterised by encoding scheme and rate characteristics
      • Number of (concurrent) requests
    • Environment metric

      • Available bandwidth between peers
      • More sophisticated models might consider a network topology map.
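
The P2P cost metric listed above (summed bandwidth usage and the number of transmissions of the same chunk) can be illustrated with a minimal sketch. The record format `Transmission`, the function `cost_metrics` and the sample log are hypothetical names and values introduced only for illustration; they are not part of the FBM definition.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Transmission:
    """One transfer of one video chunk (hypothetical record format)."""
    chunk_id: int
    source: str      # server or peer that sent the chunk
    bytes_sent: int  # bandwidth consumed by this single transfer


def cost_metrics(transmissions):
    """Return (total bytes sent, {chunk_id: number of transmissions})."""
    total_bytes = sum(t.bytes_sent for t in transmissions)
    per_chunk = Counter(t.chunk_id for t in transmissions)
    return total_bytes, dict(per_chunk)


# Illustrative log: chunk 0 is delivered by two different peers.
log = [
    Transmission(chunk_id=0, source="peer-a", bytes_sent=1000),
    Transmission(chunk_id=0, source="peer-b", bytes_sent=1000),
    Transmission(chunk_id=1, source="peer-a", bytes_sent=1200),
]
total, counts = cost_metrics(log)
print(total, counts)  # 3200 {0: 2, 1: 1}
```

The same function also covers the client/server case if every record names the server as its source: a per-chunk count above one then corresponds to a retransmission.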



The case study shows that the functionality is the same and is also expressed in the same way in terms of performance metrics. However, the cost metric has to consider some additional aspects (i.e. the fact that the data comes from multiple sources), and the measured costs might therefore have to be assessed qualitatively differently.

The biggest difference appears in the environment metrics used as input parameters. The environment described through these parameters represents only a snapshot of the real world, i.e. a relevant subset of parameters is used for a specific SUT (in the client/server example only the server-to-client bandwidth is relevant, whereas in the P2P example peer connectivity is the relevant factor). Ideally, one would have a complete description of the real-world environment at the time of evaluation, so that other SUTs could be evaluated in exactly the same environment even if they use a different environmental parameter set. However, working with such a subset does not restrict the generality of the approach or the validity of the results for benchmarking and comparison purposes.
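
The comparison drawn from the case study can be sketched as data. The dictionaries below simply restate the two parameter lists given earlier; the dictionary layout and the helper `compare_profiles` are assumptions made for illustration, not a prescribed FBM format.

```python
PERFORMANCE = {"MOS", "availability", "start-up delay", "playout continuity"}
WORKLOAD = {"video samples", "number of concurrent requests"}

# Parameter sets restated from the two lists in the case study.
CLIENT_SERVER = {
    "performance": PERFORMANCE,
    "cost": {"bandwidth usage", "number of retransmissions"},
    "workload": WORKLOAD,
    "environment": {"server-to-client bandwidth"},
}
P2P = {
    "performance": PERFORMANCE,
    "cost": {"summed bandwidth usage", "transmissions of the same chunk"},
    "workload": WORKLOAD,
    "environment": {"bandwidth between peers", "network topology map"},
}


def compare_profiles(a, b):
    """Split metric categories into those identical in both profiles
    and those whose parameter sets differ."""
    shared = sorted(k for k in a if a[k] == b.get(k))
    differing = sorted(k for k in a if a[k] != b.get(k))
    return shared, differing


shared, differing = compare_profiles(CLIENT_SERVER, P2P)
print(shared)     # ['performance', 'workload']
print(differing)  # ['cost', 'environment']
```

Under these assumptions the performance and workload categories are identical for both systems, while the cost and environment categories differ, which matches the observation in the text.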

Do Future Developments have to be predicted?

In order for a model to be future proof, it should be applicable to future systems without any changes. Future developments cannot all be anticipated, and it is therefore not possible to apply the FBM model to non-existent technology. However, every system should have a relevant function, or set of functions, that should be captured, assessed and compared. Thus the FBM model should be applicable to any kind of system that will be developed.

Issues can arise, though, in the context of disruptive technologies and their influence on the environment. Present systems are designed to provide an optimal service in a given environment. New technologies (e.g. olfactory or haptic content) might trigger a change in the environment and in the demands placed on the SUT. However, this is a general issue of technological change and does not make the FBM approach ineffective.

Conclusions

The FBM is generic and should be applicable to all systems that can be defined by the functionality they provide. Evaluating a subset of functionalities should also make it possible to compare different systems, even if they offer a larger variety of functions or use different technology or approaches that do not yet exist. However, since most systems are optimised to operate in a specific environment, disruptive technologies might change the environment and thereby make a system perform sub-optimally. This is not an issue specific to the FBM model, though; it only reflects the nature of technological change.

To provide new input, comments, etc., please send an email to Piotr Srebrny, Gareth Tyson, Andreas Mauthe or Thomas Plagemann.