The way voice is carried over IP networks is fundamentally different from
legacy telephone networks. Traditionally, voice has been sent as a continuous
stream over circuit-switched networks, and the tariff depended on the duration of
the call. Over IP, however, voice is broken into packets just like data and sent
across as a stream of packets. The IP network could be your enterprise LAN for
inter-departmental communications or your WAN links for calls to clients or
other branch offices. The voice packets may take different routes through the
network but are assembled back in order at the called end to make up a
meaningful conversation. Such a convergence of voice and data networks eases
management for network admins, as they don't have to maintain two separate
networks. Although the infrastructure setup for converging the two networks is
a tad costly, it more than makes up for that with a steady drop in call costs
over the long run. However, challenges related to
ensuring the reliability and quality of a voice call still need to be overcome
before such convergence becomes the norm. The issue is not just about
removing bottlenecks but more about ensuring that there is enough capacity in
the network, that the quality of calls meets customer expectations and that
there is a balance between service availability and cost. Therefore, you need
continuous voice quality assessments and planning for new application
bandwidth to ensure a successful convergence.
Packet switching, the fundamental technique in VoIP, allows for a more
efficient use of bandwidth, but it doesn't inherently provide guaranteed QoS.
Let's understand the process briefly. The sending device divides a stream of
data into smaller chunks called packets, with the addresses of the originating
and destination devices in each packet's header. Based on the packet's
destination, each router forwards it to the next router along the path,
avoiding congested paths where it can. This way packets can arrive at their
destination without any significant delay. At the destination, TCP arranges
them in the right order. This approach is good for data such as e-mail, IMs and
peer-to-peer file sharing but not for voice. Any delay in delivering the words
in the right order, or any missing words, renders the speech meaningless. All
these problems are further accentuated when voice traffic is interspersed with
data over integrated networks such as the Internet.
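The reordering step described above can be pictured with a small sketch. It assumes each packet carries a sequence number alongside its payload; the function name, the tuple layout and the sample words are made up for illustration and do not come from any particular protocol stack.

```python
def reassemble(packets, first_seq):
    """Put (sequence_number, payload) pairs back in order.

    Returns the payloads in sequence order and the sequence numbers
    that never arrived. The packet layout is a hypothetical one.
    """
    by_seq = {seq: payload for seq, payload in packets}
    if not by_seq:
        return [], []
    ordered, missing = [], []
    for seq in range(first_seq, max(by_seq) + 1):
        if seq in by_seq:
            ordered.append(by_seq[seq])
        else:
            missing.append(seq)  # a gap the listener would hear
    return ordered, missing


# Packets arrive out of order, and packet 4 is lost on the way.
arrived = [(3, "you"), (1, "can"), (5, "me"), (2, "hear")]
words, lost = reassemble(arrived, first_seq=1)
```

For a file transfer the missing packet can simply be sent again. In a live conversation there is no time for that, so the gap stays audible.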
Voice quality
With voice traveling over the same network backbone as data, it inherits the
same problems that have bugged IP networks in the past. Some of these get even
more pronounced with voice and are absolutely intolerable. VoIP traffic is
vulnerable to network delay, jitter and packet loss, making it a difficult
network technology to manage. Jitter is a variation in the expected arrival
time of a packet and is caused by the network a packet traverses (which
includes the transmission medium, such as wire, cable or optical fiber).
Another common concern is latency, which is the time delay between the
initiation of a service request for data transmission and the grant of that
request. Delay can cause problems such as echo and speaker overlap. Echoes are
signal reflections of the speaker's voice from the far-end telephone equipment
back to the speaker's ear. These become a major irritation if the round-trip
delay exceeds 50 ms. Since loop (round-trip) delays are greater than this value
for virtually all VoIP connections, all VoIP gateways need to have an echo
cancellation function. A still greater problem is speaker overlap, where one
talker steps on the other talker's speech. This occurs if the one-way delay
exceeds 250 ms.
Delayed or missing packets may be imperceptible for e-mail or other such
apps but are absolutely intolerable for quality VoIP communications. As per
ITU, the acceptable one-way delay for VoIP calls is 0-150 ms for local calls
and 150-400 ms for international calls. Another phenomenon to monitor is packet
loss. Networks either sporadically drop single packets (during what are called
gap periods) or drop large numbers of packets in a 'burst.' Although packet
losses during gap periods are satisfactorily managed by packet loss concealment
techniques, sustained bursts are difficult to manage. Managing VoIP quality
primarily
involves minimizing network delay and jitter, because codecs require a steady,
consistent stream of packets to provide quality audio.
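To make delay and jitter measurable, here is a minimal sketch. The jitter function uses the running interarrival-jitter estimate defined for RTP in RFC 3550, where each new sample moves the estimate by one sixteenth of the difference. The delay bands are the ITU figures quoted above. The function names and the timestamps in the example are assumptions made for illustration.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550.

    For each pair of consecutive packets, D is the change in transit
    time. The estimate moves 1/16 of the way towards |D| each time.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def delay_band(one_way_delay_ms):
    """Classify a one-way delay using the ITU ranges quoted in the text."""
    if one_way_delay_ms <= 150:
        return "acceptable for local calls"
    if one_way_delay_ms <= 400:
        return "acceptable for international calls"
    return "unacceptable"


# Packets sent every 20 ms. In the second trace the third packet is 10 ms late.
steady = interarrival_jitter([0, 20, 40, 60], [50, 70, 90, 110])
bumpy = interarrival_jitter([0, 20, 40, 60], [50, 70, 100, 110])
```

A steady stream yields zero jitter even though every packet took 50 ms to arrive. This is why delay and jitter have to be tracked as separate parameters.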
A jitter buffer on a VoIP phone can mask mild delay and jitter problems but
QoS parameters need to be negotiated up front before the data transfer begins, a
process referred to as signaling. This gives an opportunity to determine if the
required network resources are available and in most cases reserve the required
resources before granting a QoS guarantee to the client.
Bandwidth concerns
VoIP has the potential to provide CD-quality audio to its users. But to
achieve that, you would put enormous strain on the available bandwidth. Most
users would be more than happy to settle for a low-bandwidth, glitch-free call.
Therefore, simply figure out how much additional bandwidth is required and how
much of your network needs an upgrade to facilitate a meaningful conversation.
With enterprises upgrading their WAN infrastructure in a big way, that should
not be much of an issue. Looking a bit further, you can see technologies that
unify the various modes of communication, such as messaging, video and
collaboration, being integrated over your IP networks. So you can foresee that
your bandwidth not only needs to be increased but must be utilized properly as
well. The last thing you want is a lot of angry users complaining about call
quality just because your bandwidth is either overloaded or not utilized
properly.
Other causes for delays could be pathway congestion, time taken for error
checking, transmission negotiations and additional info to determine the type of
data being sent, its origin and destination. What this means is that enough
bandwidth must be made available not only for voice transmission but also for
the overheads required for any data transmission. The actual amount of
bandwidth for voice also depends on the codec used for compression. This can
range anywhere from 16 to 64 kbps and, after compensating for overheads, you
can safely assume a total of 88 kbps per call.
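A rough calculation shows where a figure in the region of 88 kbps comes from. The sketch below assumes one voice packet every 20 ms, 40 bytes of IP, UDP and RTP headers (20 + 8 + 12) and 18 bytes of Ethernet framing per packet. These overhead values are common textbook assumptions rather than numbers taken from this article, and they vary with the link type.

```python
def voip_bandwidth_kbps(codec_kbps, packet_interval_ms=20,
                        header_bytes=40, link_overhead_bytes=18):
    """Estimate the one-way bandwidth of a single call, in kbps.

    header_bytes: assumed IPv4 + UDP + RTP headers (20 + 8 + 12).
    link_overhead_bytes: assumed Ethernet header plus frame check (14 + 4).
    """
    # Voice bytes carried in each packet at the given codec rate.
    payload_bytes = codec_kbps * 1000 / 8 * packet_interval_ms / 1000
    packets_per_second = 1000 / packet_interval_ms
    bits_per_packet = (payload_bytes + header_bytes + link_overhead_bytes) * 8
    return bits_per_packet * packets_per_second / 1000


high_rate = voip_bandwidth_kbps(64)  # 64 kbps codec
low_rate = voip_bandwidth_kbps(16)   # 16 kbps codec
```

With these assumptions a 64 kbps codec needs about 87.2 kbps on the wire, which is in line with the 88 kbps planning figure. A 16 kbps codec needs about 39.2 kbps, so the headers weigh far more heavily on low-rate codecs.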
Mean Opinion Score (MOS)
This is a traditional way of estimating call quality. The MOS is determined by a group of listeners who rate the quality of audio samples played to them. The rating system has a five-point scale ranging from 1 to 5, where 1 means poor quality and 5 translates to excellent. Practically, even if you achieve a score of 4, you would have done your job nicely. However, anything below 3.5 should set alarm bells ringing. But there's a catch. Call quality may be perceived differently depending on the environment. For example, a user making a call from a noisy environment would tolerate a far lower quality than an executive taking a call in a conference room. Using this technique, problems can be addressed when voice quality starts to degrade, before the users feel any debilitating effects. MOS is also useful in troubleshooting and in capacity planning as you anticipate increases in call volume over time.
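As a small illustration of how the score and its thresholds could be put to work, the sketch below averages a set of listener ratings and compares the result with the 4 and 3.5 marks mentioned above. The function names, the verdict labels and the sample ratings are invented for this example.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings on the 1 to 5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


def assess(mos):
    """Map a score to a verdict using the thresholds from the text."""
    if mos >= 4.0:
        return "good"
    if mos >= 3.5:
        return "acceptable"
    return "investigate"


quiet_office = mean_opinion_score([4, 4, 5, 3, 4])
congested_link = mean_opinion_score([3, 3, 4, 3, 3])
```

Here the first set of ratings averages 4 and passes, while the second averages 3.2 and should set the alarm bells ringing.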
Traffic control
Traffic shaping is a technique to control the network traffic to ensure low
latency, enforce policies and ensure optimum utilization of bandwidth. This is
done by controlling the volume and the rate at which data is sent through a
transmission path. Most of the traffic shaping schemes are implemented at the
network edges to control traffic entering the network. You can control the
network dynamically, prioritizing bandwidth amongst applications, depending on
the rise and fall in network usage. There are several other ways in which WAN
bandwidth can be dynamically managed:
- Set rules as per applications: Traffic shaping solutions can categorize
traffic in terms of priority, thereby assigning a specific amount of bandwidth
to each category. You might, for instance, set a rule that limits aggregate FTP
traffic to no more than 6 Mbps and another one that limits total streaming
audio traffic to no more than 3 Mbps. This categorization can also be on the
basis of the traffic's protocol and the ports used by an application. You can
also categorize traffic based on the content. Most traffic shapers categorize
web traffic based on the interactions between a web server and a browser when a
page is requested, regardless of the port number.
- Set rules for each user: Traffic shapers can set traffic limits for each
user so that traffic is shared fairly amongst all users. For instance, you
might decide to limit traffic to or from each user to no more than 256 Kbps.
This way, although a user can access whatever he wants, the traffic flow is
smoothed out to a specified level rather than hogging the total available
network capacity unnecessarily. Traffic limits can be set to be either hard or
burstable. A hard limit is always fixed and cannot be exceeded. Burstable
limits allow traffic to exceed a threshold value (called the 'burst limit') if
you have spare capacity and no higher-priority application to use it.
- Traffic priority management: Other than setting hard or burstable
traffic limits on applications or users, traffic shaping devices can also
define the importance or priority of different types of traffic. In a
converged network, voice packets can be given a higher priority over regular
data such as e-mail, IMs, peer-to-peer file sharing and so on. Some traffic
shaping tasks can be done directly on a router, just as you do firewall-like
packet filtering. However, using specialized traffic shapers avoids loading up
routers, leaving them free to focus on routing packets as fast as they can.
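One common way to enforce rate limits such as those in the list above is a token bucket. The sketch below is a generic illustration and does not describe any particular traffic shaper. The class name, the parameters and the 64,000-bit burst allowance are assumptions, while the 256 Kbps rate echoes the per-user example.

```python
class TokenBucket:
    """Rate limiter: tokens refill at rate_bps up to burst_bits."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps
        self.capacity = burst_bits
        self.tokens = burst_bits  # start with a full bucket
        self.last = 0.0

    def allow(self, packet_bits, now):
        """Return True if the packet may be sent at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill for the elapsed time, but never beyond the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False


# Per-user limit of 256 Kbps with room for a 64,000-bit burst.
bucket = TokenBucket(rate_bps=256_000, burst_bits=64_000)
burst = [bucket.allow(12_000, now=0.0) for _ in range(6)]
after_wait = bucket.allow(12_000, now=1.0)
```

Five back-to-back packets fit inside the burst allowance and the sixth is held back until the bucket refills. A bucket barely larger than one packet behaves much like a hard limit, and a larger one gives the burstable behaviour.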
By carrying out simple tasks such as managing the number of calls placed
across an IP WAN link, network managers can ensure that their network's
voice/data quality remains high. Using call admission control, they can limit
the voice bandwidth and, if necessary, allow calls to be routed across the PSTN
to ensure good quality. Also, by implementing change control, they can prevent
users from using a high-bandwidth application that could harm voice traffic.
And of course, through the use of virtual LANs, voice traffic can be separated
from other broadcast-intensive apps.
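Call admission control can be pictured as a simple counter. The sketch below assumes a fixed slice of WAN bandwidth reserved for voice and uses the 88 kbps per-call planning figure from the bandwidth discussion. The class, its method names and the 264 kbps slice are hypothetical.

```python
class CallAdmissionControl:
    """Admit calls onto the IP WAN until the voice bandwidth is used up."""

    def __init__(self, voice_bandwidth_kbps, per_call_kbps=88):
        self.max_calls = voice_bandwidth_kbps // per_call_kbps
        self.active = 0

    def place_call(self):
        """Return the route chosen for a new call."""
        if self.active < self.max_calls:
            self.active += 1
            return "IP WAN"
        return "PSTN"  # overflow, so existing calls keep their quality

    def end_call(self):
        """Release the bandwidth held by a finished IP WAN call."""
        if self.active > 0:
            self.active -= 1


# 264 kbps reserved for voice leaves room for three simultaneous calls.
cac = CallAdmissionControl(voice_bandwidth_kbps=264)
routes = [cac.place_call() for _ in range(4)]
```

The fourth call goes out over the PSTN instead of degrading the three already in progress.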
A LAN setup based on 802.1p/Q. Here the systems at the center constitute a data-only network while the IP Phone and the Softphone form a converged network
Other techniques
There are several other technologies that can be deployed to enhance or add QoS
features to converged networks. Here are a few of them:
- MPLS: Multi-Protocol Label Switching complements IP technology by
taking advantage of the intelligence associated with IP routing techniques. It
specifies mechanisms to manage traffic flows amongst different hardware,
machines, or even applications. Data transmission occurs on label-switched
paths (LSPs), each of which is a sequence of labels at every node along the
path from the source to the destination. A label contains information such as
destination, precedence, VPN membership, QoS information from RSVP and the
route to be followed. These LSPs are established either prior to data
transmission or when a certain flow of data is detected. High-speed switching
of data is possible because fixed-length labels are inserted at the beginning
of a packet and can be used by hardware to switch packets quickly between
links. MPLS is rapidly emerging as a core technology for next-generation
networks as it allows for the consolidation of multiple disparate networks
into one and provides a means for multiple Layer 2 technologies to be used
simultaneously, thereby saving costs.
- DiffServ: Differentiated Services is a networking architecture
which allows you to prioritize packets in the network path through relevant
information in their headers. Each data packet is placed into one of a limited
number of traffic classes, rather than being differentiated on the basis of
individual flows. Each router on the transmission path is configured to
prioritize traffic based on its class. The DiffServ technique does not
automatically prioritize traffic, but follows what has been defined by the
network administrator. It also recommends a standard set of traffic classes to
ensure interoperability amongst hardware and networks. Since DiffServ is a
mechanism that lets you decide which packets to carry and which to drop
depending on the network capacity, you almost invariably land in a situation
where you are at the limit of your WAN link. And since Internet traffic is
bursty, this might result in your low-priority data being dropped almost
always. To avoid this
situation, fix a cap on the amount of bandwidth for higher priority data.
- 802.1p/Q: This IEEE protocol describes how switches can classify
frames at the Ethernet layer. It also allows end points and routers to assign
priorities to LAN frames. It is deployed in small LAN setups to achieve good
quality of service, be it for voice, peer-to-peer file transfer or IM. If you
use a switch or router with 802.1p/Q capability, you don't have to worry about
creating different virtual LANs for your converged network. Such devices are
capable of differentiating voice and data frames on their own and can create a
subnet. So, let's say you have a network in which ten machines talk data and
another ten talk both voice and data. The switch will then automatically
create two VLANs: one for voice with ten ports and another with twenty ports
for data.
- RSVP: Resource Reservation Protocol (RSVP) is a network control
protocol that enables Internet apps to obtain differing QoS for data packets.
It recognizes the fact that certain apps, like voice, require both timeliness
of delivery and quality, while others, such as e-mail or file sharing, require
reliability of delivery and not necessarily timeliness. It is not a routing
protocol but works in conjunction with other protocols. It is used by routers
to communicate QoS requests to all nodes along the path of transmission.
RSVP-capable routers communicate with policy servers within the network to
determine which apps will be granted network resources and which requests will
be prioritized in case there are insufficient resources to satisfy all.
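Two of the marking schemes in this list come down to a few bits in a header. In DiffServ, the class is a 6-bit code point in the upper bits of the IP header's former Type of Service byte. At the Ethernet layer, the priority is a 3-bit field in the IEEE 802.1Q tag. The sketch below packs both. The class-to-code-point table is only an example policy, and the function and constant names are made up for illustration.

```python
import struct

# Example policy only: which DiffServ code point each class gets.
DSCP_BY_CLASS = {
    "voice": 46,        # Expedited Forwarding (EF)
    "interactive": 18,  # Assured Forwarding AF21
    "bulk": 0,          # best effort
}


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the 8-bit field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def vlan_tag(priority, vlan_id, dei=0):
    """Build the 4-byte 802.1Q tag: TPID 0x8100 followed by the TCI.

    TCI layout: 3 priority bits, 1 drop-eligible bit, 12 VLAN ID bits.
    """
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must be between 0 and 4095")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)


voice_tos = dscp_to_tos(DSCP_BY_CLASS["voice"])
voice_tag = vlan_tag(priority=5, vlan_id=100)
```

Every router or switch on the path still has to be configured to act on these markings. The bits only say which class a packet belongs to.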
By keeping a close watch on the network parameters that affect VoIP, you can
take care of infrastructure problems before they lead to poor quality or
downtime. Understanding what to monitor and having a thorough VoIP analysis
make this crucial task much more manageable.
Adeesh Sharma