Frequently Asked Questions

1. What does benchmarking mean?

Benchmarking is the standardisation of measurements and metrics for the purpose of comparing systems that offer the same service under similar conditions. Benchmarking is an important aspect of the evaluative process of system development: it allows the analysis, evaluation and comparison of performance and overhead across a variety of environments and scenarios.

Benchmarking takes a functional view, not a structural view. The issue is not how good a certain structure (e.g. an overlay network) is, but how well it performs the specific function it has been designed for.

The aim is to provide a methodology and/or framework for evaluating and comparing content distribution solutions that do not necessarily rely on common architectural principles, covering:
- Caching mechanisms
- Delivery mechanisms
- Locality mechanisms
- Search mechanisms
- Content load mechanisms

A benchmark comprises:
- System under Test (SuT): the individual system being tested.
- Workload: the operations that the SuT will perform.
- Specific parameters: parameters specific to the SuT.
- Environmental parameters: the constraints the SuT works within.

Negative
benchmarks answer the question: "What is the overhead involved in fulfilling the system requirements?"

Positive benchmarks answer the question: "How well does the system fulfil its requirements?"

Comparative benchmarks answer the questions: "Does the performance warrant the overhead?" and "What causes the overhead?"

The propagation delay is the sum of the packet processing time at routers and the packet propagation time over links, and is assumed to be constant. The queuing delay is incurred by packet queuing at routers under sub-congestion conditions and is highly variable.

Robustness
can be defined in different areas. It quantifies, for example: (a) the vulnerability of a network to failures; (b) the level of protection of a network against viruses.

A direct metric is one that does not depend upon a measure of any other attribute; direct metrics are usually used for fundamental measurements. An indirect metric, in contrast, is a derived function whose domain is an n-tuple; indirect metrics are meaningful combinations of direct metrics.

Yes.
Let's consider, for instance, the failure rate. We can define the failure rate as the fraction of times a user is unable to successfully play a stream. In a client/server VoD system, suppose that the link capacity supports only m simultaneous video requests; the failure rate can then be expressed as Pr[Ns = m] in an M/M/m queue model. In a P2P VoD system, in contrast, the failure rate can be expressed as Pr[Ns = 0] in an M/M/∞ queue model, assuming that users arrive and depart according to some distribution and that, in a P2P VoD system, it is the unavailability of other peers that causes the request of an arriving peer to fail.

- Goal: Fairness
- Management Overhead
- Performance (Retrievability, Response time)
- Costs (Traffic overhead)
- Orthogonal aspects (Scalability, Robustness/Stability)
- Efficient Content Distribution
- Content Availability (Goal: Efficient Content archiving)
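To make the two failure-rate expressions concrete, the sketch below computes Pr[Ns = m] via the Erlang-B formula, treating the client/server system as an M/M/m loss system in which a request arriving to find all m channels busy fails, and Pr[Ns = 0] = e^(-a) for the M/M/∞ case. This is one plausible reading of the models above; the offered load a = λ/μ and the concrete numbers (m = 10, a = 8) are illustrative assumptions, not values from the text.

```python
from math import exp, factorial

def erlang_b(m: int, a: float) -> float:
    """Blocking probability Pr[Ns = m] in an M/M/m loss system
    (Erlang-B), with offered load a = lambda / mu."""
    numerator = a**m / factorial(m)
    denominator = sum(a**k / factorial(k) for k in range(m + 1))
    return numerator / denominator

def p2p_failure(a: float) -> float:
    """Pr[Ns = 0] in an M/M/inf model: the chance an arriving peer
    finds no other peer available to serve its request."""
    return exp(-a)

# Illustrative numbers: offered load of 8 Erlangs, 10 server channels.
print(f"client/server failure rate: {erlang_b(10, 8.0):.4f}")
print(f"P2P failure rate:           {p2p_failure(8.0):.6f}")
```

As expected, the client/server failure rate grows with the offered load relative to the fixed capacity m, while the P2P failure rate falls exponentially as more peers are present.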
Depending on the application class, large delays exceeding a certain level may be regarded as packet loss rather than simply as very large delay. Most current objective and subjective research efforts aim to find the upper bound of acceptable delay. The lower bound of delay (the minimum noticeable delay), however, should also be taken into account, as it plays a key role in video assessment.
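The idea that overly late packets count as lost can be made concrete: given per-packet one-way delays and an application delay bound, an effective loss rate counts both packets dropped in the network and packets arriving past the bound. A minimal sketch; the 150 ms bound and the sample delays are illustrative assumptions, not values from the text.

```python
def effective_loss_rate(delays_ms, bound_ms):
    """Fraction of packets that are unusable by the application:
    either dropped in the network (delay is None) or arriving
    after the application's delay bound."""
    unusable = sum(1 for d in delays_ms if d is None or d > bound_ms)
    return unusable / len(delays_ms)

# One packet dropped plus two late packets out of six:
delays = [40, None, 95, 180, 210, 60]
print(effective_loss_rate(delays, bound_ms=150))  # 0.5
```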
Variations in network transfer delay are known as jitter. Jitter is normally caused by route changes or by changing conditions in the buffers of network router nodes. The impact of jitter is shrinking for modern applications, which are typically equipped with a de-jitter buffer that smooths the transmission delay at the cost of an unobjectionable buffering delay.
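The de-jitter buffer just described can be sketched as follows: packet i is scheduled for playout at first_arrival + buffer_delay + i * packet_interval, and a packet arriving after its playout deadline is skipped. The fixed 40 ms buffer and 20 ms packet interval are illustrative assumptions.

```python
def playout(arrivals_ms, buffer_ms, interval_ms):
    """Simulate a fixed de-jitter buffer: return the indices of
    packets that arrive in time for their playout deadline."""
    base = arrivals_ms[0] + buffer_ms  # playout time of packet 0
    played = []
    for i, arrival in enumerate(arrivals_ms):
        deadline = base + i * interval_ms
        if arrival <= deadline:        # in time: absorbed by the buffer
            played.append(i)
    return played

# Packet 2 arrives past its deadline and is skipped:
arrivals = [0, 22, 95, 61, 80]
print(playout(arrivals, buffer_ms=40, interval_ms=20))  # [0, 1, 3, 4]
```

A larger buffer absorbs more jitter but adds exactly that much start-up delay, which is the trade-off the paragraph above alludes to.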
Packet loss is defined as the ratio of the number of packets lost during transmission to the total number of transmitted packets. Packet loss results from congestion caused by insufficient or suboptimal use of network resources. For a connection-oriented transport protocol, lost packets are retransmitted, which delays the arrival of packets; for a connectionless protocol such as UDP, loss may directly affect application performance.

Bandwidth
is the capacity for carrying data on a certain path through the network. If the traffic on the network exceeds this capacity, packets get lost or delayed. For some applications the impact of a bandwidth limit is felt even before packets are transmitted; network-adaptive video compression is an example of this case. Besides the encoding rate, the frame rate and resolution are also affected by bandwidth limitations.

- It is the time that one device (e.g., router or computer) or one link works without failure.
- It is the number of disjoint paths between each pair of nodes.
- It is the time from the detection of one failure (router failure, link failure, ...) until it is repaired.
- It is the time from the instant when the failure (router failure, link failure, ...) happens until it is detected.
- It is the time that one node/peer is available on the overlay network.
- It is the number of disjoint paths between each pair of nodes at the overlay level.
- It is the number of available copies of each item in the overlay.

It
is the time from the detection of one failure until it is repaired; for instance, from the instant when a peer detects that one of its neighbours is unavailable until it obtains a new neighbour.

It is the time from the instant when a failure happens until it is detected; for instance, from the instant when a neighbour fails or leaves the network until the peer discovers it.

It is the process by which content is transferred from one location to another.

It is the process by which content is replicated in multiple locations so as to facilitate acceptable-quality access to a particular item.

It is the process by which content placement is optimised to provide high-quality access to a particular item for a subset of clients.

- Hit ratio of caching nodes
- Wasted caching overhead
- Caching maintenance overhead
- Hit ratio of replaced cached content
- Average optimality of cache location
- Caching speed
- Comparison of download mechanism with other systems (e.g. BitTorrent and other mechanisms)
- Percentage of optimal download time achieved
- Average percentage of optimal download bandwidth achieved
- Bandwidth savings through caching nodes compared to other sources
- RTT savings through caching nodes compared to other sources
- Stretch between optimal and actual routing
- Overhead levels derived from locality maintenance
- Comparison of search times (seconds) with other mechanisms, e.g. Gnutella, Chord
- Comparison of time complexity with other mechanisms
- Comparison of search flexibility with other mechanisms
- Search overhead
- Search accuracy
- Weighted comparison of previous five benchmarks
- Average percentage load of content sources compared to minimum and maximum loads
- Average percentage load seen by nodes
- Percentage of optimal net download choices
- Start-up delay
- VoD delay
- Temporal Characteristic (Start-up Delay, Download Time, Interactive Responsiveness)
- Content Management Characteristic (Chunk Scheduling Policy, Application Buffering, User Incentives, Application Layer Content Modification)
- Resource Availability (Bandwidth, Content)
- Resource Utilization (Process utilization, Bandwidth utilization)
- Overlay Structure (Churn, Stretch, Stress)
- Availability Stretch
- Weakness Stretch
- Redundancy Level
- Comparative Redundancy Level
- Chance of Item Loss
- Maximum Number of Content Retrievals
- Maximum number of acceptable content retrievals
- Number of Failed Retrievals
- Number of satisfied retrievals
- Number of dissatisfied retrievals
- Response Time
- Recovery Time
- Accumulated bandwidth access
- Percentage bandwidth demand to available access
- Percentage of Nodes within 'Reach' of an Object
- Percentage of requesting nodes within 'reach' of an object
- Percentage of Objects within 'Reach' of a Node
- Percentage of Objects within 'Reach' of all nodes
- Percentage of accurate replicas
- Bandwidth Consumption
- Wasted Bandwidth Consumption
- Aggregate Bandwidth Consumption
- Storage Consumption
- Wasted Storage Consumption

Even
though VoD and live video distribution services are both streaming services, they show significant differences: for VoD, the content is a pre-recorded file, while for live streaming it is a continuous flow of data that is not recorded a priori. This difference in the nature of the content leads to numerous characteristics that are specific to each of these content distribution services.

- Total amount of served upload data: this metric indicates the upload burden that both parties have to carry.
- Relative amount of served upload data: this metric specifies the fraction of upload data each party is serving compared to the aggregated upload traffic of the whole network. The factors influencing this metric are often obvious.
- CPU consumption: the total CPU consumption can be measured in different ways: the total CPU time used by a given process, the average percentage of CPU used by a given process (over time), etc.
- Memory consumption: the total memory consumption is measured as the total number of memory bytes consumed by the given process, but the average percentage of memory consumption (over time) can also be used as a metric.

- Number of Users
- Join Rate (Timeline of joins)
- Node Churn (Churn Probability, Departure Rate, Failure Rate)
- Number of contents
- Content Churn
- Network Parameters (Delay, Available BW, Jitter, Network Load)
- Available Resources (CPU, Memory)
- Failure Rate

- Start-up latency
- Playback Continuity
- Peer Availability
- Required Bandwidth (Upstream/Downstream)
- Stress (per link)
- Stretch (per member)
- User Perceived Quality
- Download Time
- Resource Consumption (CPU, Memory, Storage Capacity, Bandwidth)
- Control Information Overhead

The
environment is a substrate for the benchmarked P2P streaming system, providing the standard set of functionalities that enable it to run. The environment and the benchmarked unit together constitute the system under test (SUT). The environment is neither a simulator nor an emulator; those are only means of performing analysis on the SUT.

It is the waiting period experienced by a user after selecting a stream to play until the stream is played:

T = Tp - Ts

where T is the start-up latency, Ts is the instant when the user selects the stream, and Tp is the instant when the stream starts to play.

It is the measurement of the continuity of the streaming flow: it measures the interruptions in the playback of the stream.
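The start-up latency formula T = Tp - Ts translates directly into a measurement harness: record a monotonic timestamp at stream selection and another when playback begins. The `start_stream` callable here is a hypothetical stand-in for a real player's connect-and-buffer call, not an API from the text.

```python
import time

def measure_startup_latency(start_stream):
    """T = Tp - Ts: time from stream selection (Ts) until the
    stream starts to play (Tp). `start_stream` stands in for the
    player's connect/buffer/render-first-frame call."""
    ts = time.monotonic()   # Ts: user selects the stream
    start_stream()          # connecting, buffering, decoding...
    tp = time.monotonic()   # Tp: stream starts to play
    return tp - ts

# Stand-in player that "buffers" for 50 ms:
latency = measure_startup_latency(lambda: time.sleep(0.05))
print(f"start-up latency: {latency:.3f} s")
```

A monotonic clock is used rather than wall-clock time so that system clock adjustments cannot make T negative.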
QoE describes the usability of a system and users' perception of it. With the development of P2P content delivery services, it is very important to measure the QoE of a system accurately so as to improve it further, achieve peer satisfaction and maintain user support. Poor QoE results in dissatisfied users leaving the overlay, reducing both the overall performance of the system and the appeal of the application.

Network
emulation is a hybrid approach that combines real elements of a deployed networked application with synthetic, simulated, or abstracted elements; it is used to evaluate the effectiveness of new protocols and applications in heterogeneous, controllable and realistic network scenarios.

Network simulators, as the name suggests, are used by researchers to design various kinds of networks, simulate them, and then analyse the effect of various parameters on network performance. Simulation differs from emulation in that the former runs in virtual simulated time while the latter must run in real time; consequently, it is impossible to have an absolutely repeatable order of events in an emulation, because of its real-time nature and its physically distributed computation infrastructure.
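The virtual-time property that distinguishes simulation from emulation can be seen in the core loop of a discrete-event simulator: the clock jumps directly to the next event's timestamp instead of waiting, so hours of scenario time complete instantly and the event order is exactly repeatable. A minimal sketch (the event labels are invented for illustration):

```python
import heapq

def run_simulation(events):
    """Discrete-event loop: the virtual clock jumps to each event's
    timestamp; no real time passes between events."""
    queue = list(events)                   # (virtual_time, label) pairs
    heapq.heapify(queue)
    log = []
    while queue:
        now, label = heapq.heappop(queue)  # clock advances instantly
        log.append((now, label))
    return log

# Two "hours" of virtual time complete immediately, in a fully
# deterministic order; an emulator would have to wait in real time.
trace = run_simulation([(3600.0, "peer joins"), (7200.0, "peer leaves"),
                        (0.5, "chunk request")])
print(trace)
```

Running the same event list always yields the same trace, which is exactly the repeatability that real-time, physically distributed emulation cannot guarantee.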
To provide new input, comments, etc., please send an email to Piotr Srebrny, Gareth Tyson, Andreas Mauthe or Thomas Plagemann.