Designing Voicemail Systems with Cisco Unity Connection
- Determining Server Sizing
- High-Availability and Redundancy
- Server Sizing and Platform Overlays
- Summary
This chapter covers the following subjects:
- Design Considerations: Understand the capability of Cisco Unity Connection as it pertains to current users, network design, codecs, voicemail ports, and projected growth.
- Active-Active Cluster Pair: Explore the high availability and redundancy feature of Cisco Unity Connection using the active-active cluster pair configuration.
- Voice-Messaging Design: Design the voice-messaging system using Cisco Unity Connection platform overlays by determining the proper server sizing, equipment, codecs, features, and capabilities.
- Voice-Messaging Networking: Understand the various networking options available in Cisco Unity Connection version 8.x software.
After you understand your current voice-messaging environment, users' needs, and projected growth within the planning stages, you can develop a preliminary design based on this information. This preliminary design can help the business to understand and review the designed solution that meets the needs defined during the planning stage. Good communication within the organization is vital for all stages of the deployment, but especially important for the design. After the preliminary design has been reviewed, modified, and adjusted according to the business model, you can develop the final design and scope of work.
This procedure must be completed before any product is ordered and the implementation begins. The planning and design phase determines the actual product and implementation, and ensures that the user requirements are met. As stated previously, good planning and design that closely matches the final implementation helps to avoid unforeseen project delays and over-budget issues.
This chapter takes your project plan to the next phase of the Planning, Design, Implementation, and Operation (PDIO) model, the design phase. You need to collect all information assembled from the planning phase and determine a preliminary design. A properly crafted preliminary design can consist of input from reviewers, management, and users to allow for modifications and collaboration. The end result in this phase will be a final design that will be ready for implementation. Part of this phase also involves features, capabilities, and configurations to ensure that all requirements are met as determined according to the project plan. Therefore, you need to understand the interworking, features, and capabilities of Cisco Unity Connection.
The focus in this chapter is on the Cisco Unity Connection product design and capabilities as they pertain to its various systems, database, and networking. You need to understand the following:
- How to determine the server sizing to be used when implementing Cisco Unity Connection version 8.x software.
- Understand codecs, users, Internet Message Access Protocol (IMAP) client, voicemail storage, and ports. Explore how this information can influence your server sizing and voice-messaging design.
- Understand the various IMAP clients that can be used with Cisco Unity Connection and investigate the differences between IMAP non-Idle and Idle mode.
- Learn the Cisco Unity Connection database design and how active-active cluster pairs deliver redundancy and high availability.
- Determine the preliminary design based on geography, function, and client types to be used for voice messaging.
- Create a finalized design from the elements of the planning phase and the discussions and feedback from the design phase.
Determining Server Sizing
Cisco Unity Connection enables organizations to build and configure their voice-messaging system according to their business needs. These needs can involve decisions based on the number of users, voicemail ports, codecs, and even the types of clients used to retrieve voice messages. At this point in the process, many of these needs should have been identified and determined in the earlier planning phase.
The first goal is to determine the proper server sizing to meet the current user requirements and future growth. Server sizing refers to selecting the proper platform hardware to be purchased. Purchasing the correct server platform is important both for budgets and for meeting the users' current and future requirements.
Scalability defines the capability of an organization to adapt to growth and changes. The voice-messaging design needs to include considerations for scalability in providing the required operations and services as the organization continues to grow and expand over time. Cisco Unity Connection enables this scalability with its current software and the capabilities provided with digital and Voice Profile for Internet Mail (VPIM) networking services.
You must identify a number of elements in this stage about the server sizing because these decisions can influence an organization's choice of hardware. These elements consist of the following:
- Audio codecs
- Voice-messaging storage capacity
- Voicemail ports
- Current and future users
- Voicemail users
- IMAP clients
The next sections review these requirements and the best practices related to server sizing.
Understanding Codecs and Voicemail Storage
You must understand the basic differences of the various codecs before understanding how Cisco Unity Connection handles these codecs. This discussion is not meant to be an in-depth study of codecs, but a general overview to provide a proper understanding of codecs as they are implemented in Cisco Unity Connection.
A codec (coder-decoder) defines how an audio signal is encoded and decoded. An audio signal must be converted to a digital format before it can be sent over the IP network; this is referred to as encoding. The digitally encoded signal is carried in Real-time Transport Protocol (RTP) packets, which use User Datagram Protocol (UDP) as the transport layer. Likewise, at the remote location, the encoded digital signal must be converted back into an audio stream; this process is called decoding. Together, the encoding and decoding methods define the codec used to send an audio signal across the IP network.
Encoding an audio signal into a digital signal relies on sampling. The sampling rate is the number of samples taken per second, and each sample is analogous to a snapshot in time. The accepted sampling rate derives from work that Harry Nyquist performed in the 1920s on telegraph signaling and that Claude Shannon later formalized. Nyquist's research related the number of pulses that can be sent through a telegraph channel to twice the channel's bandwidth. The resulting theorem states that a sampled analog signal can be correctly reconstructed if the sampling rate exceeds twice the highest frequency of the original signal. This is referred to as the Nyquist-Shannon theorem, or simply Nyquist's sampling theorem. Since that time, this basic theory has been applied to digital networking.
The human voice can produce sound from approximately 300 Hz to 4000 Hz. Following the same logic that Nyquist used for the telegraph, you can determine a sampling rate for voice communications of 8000 Hz (4000 Hz * 2), or 8000 samples per second. Each sample consists of a single byte, so the resulting bit rate is 8 bits * 8000 samples per second, or 64,000 bits per second. This is the basis for an uncompressed digitized audio signal in IP telephony, which is called the G.711 codec. This is also the calculation used for a DS-0 or voice channel within a T1/PRI digital circuit. This bandwidth is the payload only and does not include Layer 2 and Layer 3 overhead. On an Ethernet network, this overhead (which includes the IP, UDP, and RTP headers) adds approximately 25 percent to an uncompressed voice payload, or about 16 kbps, for a total of about 80 kbps.
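The arithmetic in the preceding paragraph can be checked with a short sketch. The function names are illustrative only, and the 16 kbps overhead figure is this chapter's approximation rather than an exact per-packet calculation.

```python
def nyquist_sampling_rate_hz(highest_frequency_hz: int) -> int:
    """Sampling rate at twice the highest frequency, as applied in the text."""
    return 2 * highest_frequency_hz


def payload_bits_per_second(sampling_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed payload bit rate: samples per second times bits per sample."""
    return sampling_rate_hz * bits_per_sample


# Approximate header overhead quoted in this chapter (an approximation,
# not a per-packet calculation).
APPROX_OVERHEAD_BPS = 16_000

rate = nyquist_sampling_rate_hz(4000)        # 8000 samples per second
payload = payload_bits_per_second(rate, 8)   # 64,000 bits per second (G.711)
total = payload + APPROX_OVERHEAD_BPS        # about 80,000 bits per second

print(rate, payload, total)
```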
Cisco Unity Connection supports a number of different codecs, as described in the following sections.
G.711 Codec
The G.711 codec is the most used and supported codec in IP telephony. It is produced using pulse code modulation at an uncompressed sampling rate of 8000 samples per second. The bandwidth required for the G.711 codec is 64,000 bits per second. This is the bandwidth of the payload (not including IP, RTP, and UDP headers). As stated in the previous section, on an Ethernet network the header overhead adds approximately 25 percent to an uncompressed voice payload, or about 16 kbps, for a total of about 80 kbps.
There are two versions, or formats, of the G.711 codec. G.711 μ-Law is used in North America, and G.711 a-Law is used outside North America. Although both have the same bit rate of 64,000 bits per second, they use different companding algorithms to arrive at their respective digitized samples. Therefore, the two formats are not directly compatible, and transcoding is required between G.711 μ-Law and G.711 a-Law. However, both produce a high-quality audio stream.
G.729 Codec
The G.729 codec is also used extensively in IP telephony and is widely supported. This codec uses a compression algorithm to attain a payload bandwidth of 8000 bits per second. Because it conserves bandwidth, this codec is used for remote IP telephony communications and where bandwidth oversubscription is a concern, primarily on lower-speed WAN circuits. A number of versions of the G.729 codec exist. Two of these, G.729a and G.729b, incorporate additional options and features. The sound quality produced using G.729 is not as high as G.711 but is still considered to be toll quality (similar to a residential phone service or traditional landline service). In these cases, the overhead is still approximately 16 kbps, providing a total bandwidth of about 24 kbps.
G.722 Codec
The G.722 codec produces a high-quality audio signal and is supported on many of the newer IP telephony devices and IP phones. G.722 uses its own compression algorithm called Sub-Band Adaptive Differential Pulse Code Modulation (SB-ADPCM) and can produce a digital signal at a number of bit rates (48 kbps, 56 kbps, and 64 kbps). The G.722 codec requires 64,000 bits per second as its payload bandwidth, although it can adapt the compression algorithm based on changes in the network. This codec is used with the newer Cisco 79X2 and 79X5 IP Phones.
G.726 Codec
The G.726 codec uses Adaptive Differential Pulse Code Modulation (ADPCM) to produce a payload bandwidth of 16, 24, 32, or 40 kbps, although the most widely supported rate is 32 kbps. At this rate it uses half the bandwidth of G.711, and it is used by many phone service providers and for VPIM networking and Simple Mail Transfer Protocol (SMTP) communications. You examine the use of this codec in Chapter 5, "Cisco Unity Connection Users and Contacts," in the discussion of VPIM and SMTP protocols.
iLBC
Internet Low Bitrate Codec (iLBC) is defined in RFC 3951 as a narrowband speech codec suitable for VoIP applications and streaming audio. The algorithm used for iLBC is much more resilient to lost frames when degraded networks are encountered. iLBC uses a bandwidth of 13.3 kbps, with slightly higher quality than G.729.
PCM Linear Codec
The PCM Linear codec uses pulse code modulation (PCM) to digitize samples based on a variable sampling rate of 8 kHz to 48 kHz. This format is used in DVD technology and in WAV and AU type sound files because this codec produces the highest quality audio; however, this quality comes at the expense of increased bandwidth. For example, a sampling rate of 8 kHz with 16-bit samples requires 128 kbps for the payload bandwidth (16 bits * 8000 samples/sec = 128 kbps).
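To compare the codecs described in the preceding sections, the following sketch collects the payload bandwidths quoted in this chapter and expresses each one relative to G.711. The dictionary and function names are illustrative, and the figures exclude header overhead.

```python
# Payload bandwidths in kbps as quoted in this chapter (headers excluded).
PAYLOAD_KBPS = {
    "G.711": 64.0,
    "G.722": 64.0,
    "G.726": 32.0,  # the most widely supported G.726 rate
    "G.729": 8.0,
    "iLBC": 13.3,
    "PCM Linear (8 kHz, 16-bit)": 128.0,
}


def relative_to_g711(codec: str) -> float:
    """Payload bandwidth of a codec as a multiple of the G.711 payload."""
    return PAYLOAD_KBPS[codec] / PAYLOAD_KBPS["G.711"]


# List the codecs from lowest to highest payload bandwidth.
for codec, kbps in sorted(PAYLOAD_KBPS.items(), key=lambda item: item[1]):
    print(f"{codec:<28} {kbps:>6.1f} kbps  {relative_to_g711(codec):.2f} x G.711")
```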
Transcoding in Cisco Unity Connection
Voice calls arriving to Cisco Unity Connection enter the system using a negotiated line codec. The administrator can choose to support a certain codec based on its advertisement.
When callers leave messages for users with a mailbox, they reach Cisco Unity Connection via an available voicemail port. The audio stream is received as a digitized signal in one codec (called the line codec). This digitized signal needs to be converted before it is recorded to the user's voice mailbox. Transcoding is the process of converting a digitized signal from one codec to another. Cisco Unity Connection performs transcoding with every call as it is received and recorded in the user's voice mailbox.
Cisco Unity Connection supports a number of codecs on the line side, where the digitized signal is received. As stated previously, the administrator can influence which codecs are used, or not used, by changing the advertising of these codecs to external devices. The codecs supported on the line side follow:
- G.711 μ-Law
- G.711 a-Law
- G.722
- G.729
- iLBC
The audio stream received on one of these line codecs is then transcoded to the system codec, which is always PCM Linear. As per the discussion of codecs, this codec produces the highest quality audio and is therefore the system codec. The system codec cannot be changed and is always used with every call and recording. The system codec receives the call from the line codec. The recording codec receives the call from the system codec (PCM Linear).
Finally, the PCM Linear stream (system codec) is then transcoded to the system recording codec. The supported system recording codecs in Cisco Unity Connection follow:
- PCM Linear
- G.711 μ-Law (default)
- G.711 a-Law
- G.729a
- G.726
- GSM 6.10
The default recording codec is G.711 μ-Law. It is advisable to keep the system recording codec at this default because this produces a good quality audio signal with acceptable disk space utilization (8 KB/sec).
All transcoding here is done directly within the Cisco Unity Connection system. If calls and recorded messages are transferred to the integrated phone system or Cisco Unity Connection, transcoding resources might be required.
Figure 2-1 illustrates the relationship between the line, system, and recording codec as they are implemented in Cisco Unity Connection.
Figure 2-1 Codec Implementation in Cisco Unity Connection
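The relationship among the line, system, and recording codecs can also be sketched in code. This is a simplified model built from the codec lists in this section, not a representation of how Cisco Unity Connection is implemented internally; the function name `codec_path` is hypothetical.

```python
# Codec names taken from the lists in this section. "mu-Law" is the
# G.711 variant used in North America.
LINE_CODECS = {"G.711 mu-Law", "G.711 a-Law", "G.722", "G.729", "iLBC"}
SYSTEM_CODEC = "PCM Linear"  # fixed; cannot be changed
RECORDING_CODECS = {
    "PCM Linear", "G.711 mu-Law", "G.711 a-Law", "G.729a", "G.726", "GSM 6.10",
}
DEFAULT_RECORDING_CODEC = "G.711 mu-Law"


def codec_path(line_codec: str,
               recording_codec: str = DEFAULT_RECORDING_CODEC) -> list[str]:
    """Return the line -> system -> recording codec sequence for one call."""
    if line_codec not in LINE_CODECS:
        raise ValueError(f"unsupported line codec: {line_codec}")
    if recording_codec not in RECORDING_CODECS:
        raise ValueError(f"unsupported recording codec: {recording_codec}")
    # Every call passes through the system codec, so transcoding always occurs.
    return [line_codec, SYSTEM_CODEC, recording_codec]


print(" -> ".join(codec_path("G.729")))
```

Note that G.726 is valid only as a recording codec in this model, so `codec_path("G.726")` raises `ValueError`.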
The system recording codec can be changed to G.729a, G.726, or GSM 6.10 to conserve disk space for message storage. These codecs require from 1 KB/sec to 4 KB/sec, which is half or less of the disk space required for the same recording using the default system recording codec of G.711. However, the audio quality produced with these codecs is lower. Change the system recording codec to one of these codecs only if there is a real need to conserve disk space, and decide whether the savings justify the loss in recording quality. Also, changing the default system recording codec can affect playback of messages retrieved using IMAP on specific mobile devices and cell phones that might not support the chosen codec.
On the other hand, you can use the PCM Linear codec for the system recording codec to increase the audio quality. This codec produces the highest quality audio stream, but at the expense of disk space. The PCM Linear codec uses twice the disk space required by the G.711 default recording codec. Use it only if conserving disk space is not a concern and G.722 is used as the line codec. Using PCM Linear as the system recording codec when the line codec is G.711 cannot increase the quality of the audio stream; it only uses more disk space. For most installations, Cisco Unity Connection uses G.711 as the line codec. Therefore, it is best to leave the system recording codec at the default, G.711.
You need to keep the system-level recording codec at G.711 because most endpoints use this codec as their audio format to Cisco Unity Connection. This determination is made only to preserve the audio quality, not to avoid transcoding. As Figure 2-1 illustrates, transcoding is done for every call received by Cisco Unity Connection. There is little system performance impact from a different codec on the line, as compared to using a specific recording codec. Certain codecs do require additional resources and computation because of their complexity. The codecs defined here that require more resources to transcode are the line codecs G.722 and iLBC. Limit the use of these codecs for this reason. Because of these resource requirements, Cisco Unity Connection can support only half the number of simultaneous connections using these line codecs, as compared with the other line codecs. You must consider this calculation when determining the platform sizing and the number of voicemail ports.
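One way to account for the halved capacity of G.722 and iLBC during planning is to count each such connection as two standard connections. The following is a minimal sketch under that assumption; the weighting scheme and the `max_sessions` parameter are illustrative and are not a published Cisco formula.

```python
# Line codecs that this chapter identifies as more expensive to transcode.
HEAVY_LINE_CODECS = {"G.722", "iLBC"}


def session_weight(line_codec: str) -> int:
    """Assumed planning weight: a heavy-codec session counts as two."""
    return 2 if line_codec in HEAVY_LINE_CODECS else 1


def fits_capacity(active_line_codecs: list[str], max_sessions: int) -> bool:
    """Check a mix of simultaneous sessions against a platform limit.

    max_sessions is a hypothetical limit expressed in standard
    (for example, G.711) sessions.
    """
    load = sum(session_weight(codec) for codec in active_line_codecs)
    return load <= max_sessions


# With a hypothetical limit of 100 standard sessions, only 50 G.722
# sessions fit, which matches the "half" rule stated in the text.
print(fits_capacity(["G.722"] * 50, 100))
print(fits_capacity(["G.722"] * 51, 100))
```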
Table 2-1 provides an overview of the recording codecs supported by Cisco Unity Connection, listing their audio quality, sample size, and disk space requirements at an 8 kHz sampling rate.
Table 2-1. Recording Codecs Relationship and Limitations (Based on 8 kHz Sampling Rate)
Recording Codec | Characteristics
PCM Linear | Excellent audio quality; requires 16 KB/sec disk space; 16-bit samples; used for system codec
G.711 μ-Law, G.711 a-Law | Good audio quality; requires 8 KB/sec disk space; 8-bit samples; default recording codec
G.726 | Good audio quality; requires 4 KB/sec disk space; 16-bit samples
G.729a | Fair audio quality (toll quality); requires 1 KB/sec disk space
GSM 6.10 | Good audio quality; requires 1.6 KB/sec disk space
Users, Codecs, and Message Storage Considerations
Now that you understand the implications of the codecs as they apply to system performance, audio quality, and disk storage space, you must use this information along with the current and future projected users to determine the server sizing. The message storage is designed to handle between 20 and 30 minutes of message storage (using the G.711 system recording codec) for each user configured according to the supported message platform. This is more than sufficient for most organizations. You need to consider emails sent to the users' voice mailbox for replies, forwards, and faxes in the message storage calculation.
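The per-user storage guideline can be turned into a rough estimate using the disk space rates from Table 2-1. This is a minimal sketch: the function names are illustrative, the conversion assumes 1 GB = 1,000,000 KB, and the result covers recorded audio only (not email replies, forwards, or faxes).

```python
# Disk space per second of recording, in KB/sec, from Table 2-1.
DISK_KB_PER_SEC = {
    "PCM Linear": 16.0,
    "G.711": 8.0,
    "G.726": 4.0,
    "GSM 6.10": 1.6,
    "G.729a": 1.0,
}

KB_PER_GB = 1_000_000  # assumption: decimal units


def storage_kb(minutes: float, codec: str) -> float:
    """Disk space in KB for the given minutes of recorded audio."""
    return minutes * 60 * DISK_KB_PER_SEC[codec]


def total_storage_gb(users: int, minutes_per_user: float, codec: str) -> float:
    """Audio storage in GB for all users at the same per-user allowance."""
    return users * storage_kb(minutes_per_user, codec) / KB_PER_GB


# 30 minutes per user at the default G.711 recording codec.
print(storage_kb(30, "G.711"))              # KB for one user
print(total_storage_gb(1000, 30, "G.711"))  # GB for 1000 users
```

Running the same calculation with G.729a instead of G.711 shows the eightfold reduction in disk space that the table implies.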
The server sizing should be based on projected growth of users to ensure scalability; the codecs to be used; and the total amount of voice mails, replies, forwards, and faxes that need to be available per user. If this is a new installation, it would be advisable to investigate the current voice message stores to gain a benchmark to determine the Cisco Unity Connection server sizing for the message stores.
Finally, you must also understand the clients that might be used to retrieve voice messages and emails because this might influence the number of users supported. The type of clients supported in Cisco Unity Connection can be any of the following types:
- Telephone user interface (phone users)
- Voice user interface (voice recognition users)
- IMAP clients
- Messaging inbox clients using Personal Communications Assistant (PCA)
- IBM Lotus Sametime clients
- RSS reader clients
IMAP Clients and Voice Ports
Cisco has made some marked improvements in the 8.x software release of Cisco Unity Connection in its handling of IMAP clients. If users are using clients that support IMAP Idle, there is no increased impact on the load to Cisco Unity Connection. This was not the case in previous versions; however, the clients must use IMAP Idle mode instead of non-Idle mode. IMAP Idle is defined as the ability of the client to indicate to the server that it is ready to accept messages, without the user having to click a refresh button or the client repeatedly making requests to the server. In this case, the same number of users and ports is supported, whether the users use their phone or IMAP Idle clients. Most IMAP clients support Idle mode, with a few exceptions.
The Internet Message Access Protocol (IMAP), formerly called Interactive Mail Access Protocol, supports both online (non-Idle) and offline (Idle) modes. The mode used depends entirely on the specific client. Cisco Unity Connection supports both non-Idle and Idle-mode clients. However, non-Idle mode places a significant load on the server, and the total number of clients supported is reduced significantly. A single non-Idle IMAP client is counted as four Idle IMAP clients.
The products that support the IMAP Idle-mode consist of the following:
- Microsoft Outlook
- Microsoft Outlook Express
- Microsoft Windows Mail
- Lotus Notes
- Cisco Unified Personal Communicator (CUPC) version 8.x and later
- IBM Lotus Sametime version 7.xx and later
The following Cisco products support only Non-Idle mode:
- Cisco Unified Personal Communicator version 7.x and earlier
- Cisco Unified Mobile Communicator
- Cisco Unified Mobility Advantage
- IBM Lotus Sametime plugin
If you use other clients not listed here, consult the documentation for your specific product or software. You can use non-Idle clients with Cisco Unity Connection, but the number of users supported is reduced. As stated previously, a single non-Idle IMAP client should be considered as four IMAP Idle clients when calculating users to determine the server sizing.
Cisco Unity Connection version 8.x software enables organizations to mix non-Idle and Idle IMAP clients on the same server. However, for accounting purposes, it might be advisable to put them on separate servers, or at least create a completely different class of service to account for the number of each type of client on each server. Whether the clients are on separate servers or the same server, the calculation is the same: a non-Idle IMAP client still counts as four IMAP Idle clients.
The IMAP non-Idle clients are the only clients that affect the number of users supported in the Cisco Unity Connection version 8.x software. This must be accounted for with current and future users when considering server sizing to allow for scalability.
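The four-to-one rule lends itself to a simple planning calculation. The sketch below is illustrative only; `platform_user_limit` is a hypothetical figure that you would take from the platform overlay for your chosen server.

```python
NON_IDLE_WEIGHT = 4  # one non-Idle IMAP client counts as four Idle clients


def equivalent_idle_clients(idle_clients: int, non_idle_clients: int) -> int:
    """Convert a mix of IMAP clients into equivalent Idle clients."""
    return idle_clients + NON_IDLE_WEIGHT * non_idle_clients


def fits_platform(idle_clients: int, non_idle_clients: int,
                  platform_user_limit: int) -> bool:
    """Check the weighted client count against a (hypothetical) user limit."""
    return equivalent_idle_clients(idle_clients, non_idle_clients) <= platform_user_limit


# 1000 Idle clients plus 250 non-Idle clients load the server like
# 2000 Idle clients.
print(equivalent_idle_clients(1000, 250))
```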
Determining Voicemail Port Requirements
The number of voicemail ports required is another factor you need to consider in server sizing calculations. To ensure that callers get their calls answered by Cisco Unity Connection and never receive a fast busy, it is imperative that ports are available at all times. The information needed for the initial calculation can be gathered from the current voice-messaging system in the form of traffic volume statistics for the specific busy hours.
The main purpose of a voicemail port in Cisco Unity Connection is to answer calls to Cisco Unity Connection, enabling callers to leave voice messages and for users to retrieve these messages. If you look at only the current voicemail traffic and volume, however, you will be missing many vital factors that must be determined to calculate the correct number of ports. To understand voicemail ports, you must first explore their functions, beyond leaving and retrieving messages. Voicemail ports supply the following functions to Cisco Unity Connection:
- Answering calls from incoming callers
- Recording messages
- Retrieving messages
- Message notification
- Telephony Record and Playback (TRaP)
- Message waiting indicator (MWI)
To determine the actual number of ports to install, the designer must research answers to the following questions:
- How many users need to be configured on the server for voice messaging?
- What is the expected and projected message activity for these users?
- How can the users retrieve messages?
- Can the organization use call handlers within an audiotext application to answer all or some of the calls to the organization?
- What features are required for voice-messaging users? Voice recognition? SpeechView? TRaP?
- Is message notification required?
- Is high availability a requirement?
The number of users can help the designer to clearly understand the server sizing. Likewise, the number of voice messages received and retrieved can help clarify the voicemail port requirements. If users use the phone to retrieve their voicemails, a port is required; however, if they use an IMAP client, a port is not required. Users retrieving their messages using the Cisco Messaging Inbox or Microsoft Outlook with ViewMail can listen to their messages through the PC speakers or their IP Phone. The clients themselves do not require a voicemail port, but if the users decide to direct their messages to the IP Phone, a port is required. This is referred to as Telephony Record and Playback (TRaP). Users might decide to use their IP Phone if they do not have a workstation capable of audio, or to maintain a level of privacy in the workplace.
When a user receives a message, Cisco Unity Connection notifies the user by sending specific digits to the phone to turn on the message waiting indicator (MWI) light on the user's phone. When the last message is retrieved by the user, Cisco Unity Connection then sends a different set of digits to the phone to turn the MWI light off.
Beyond voice messaging, Cisco Unity Connection enables an organization to create call handlers to be used within custom audiotext applications. Part II explores call handlers and audiotext applications in depth. Many companies choose to use this application as an auto-attendant for incoming calls, allowing callers to be quickly directed to the proper person, department, or application and decreasing the length of time that a specific port is in use. If the audiotext application is used in this manner, the call volume to Cisco Unity Connection can greatly increase because a voicemail port is used for every incoming call.
Certain other features employed by users can increase the port usage. For example, if users are configured for message notification, an outgoing call is made from Cisco Unity Connection for every configured message notification attempt, which uses an existing voicemail port. Also, users can choose to be notified of urgent, some, or all messages according to a defined time period. After they receive a notification, users can choose to listen to the message. The message notification and the message retrieval each use an available voicemail port.
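This chapter does not prescribe a formula for converting busy-hour traffic into a port count. One commonly used traffic-engineering approach is the Erlang B model, sketched here as an assumption rather than as Cisco's sizing method. The activity names mirror the port functions discussed in this section, and all call volumes and hold times are invented purely for illustration.

```python
def erlang_b(ports: int, offered: float) -> float:
    """Blocking probability for `offered` erlangs on `ports` ports (Erlang B)."""
    blocking = 1.0
    for n in range(1, ports + 1):
        blocking = (offered * blocking) / (n + offered * blocking)
    return blocking


def erlangs_from_calls(calls_per_busy_hour: float, avg_hold_seconds: float) -> float:
    """Offered traffic in erlangs: call attempts times hold time per hour."""
    return calls_per_busy_hour * avg_hold_seconds / 3600.0


def ports_needed(offered: float, target_blocking: float = 0.01) -> int:
    """Smallest port count whose Erlang B blocking meets the target."""
    if not 0.0 < target_blocking < 1.0:
        raise ValueError("target_blocking must be between 0 and 1")
    if offered <= 0:
        return 0
    ports = 1
    while erlang_b(ports, offered) > target_blocking:
        ports += 1
    return ports


# Hypothetical busy-hour figures: (calls per hour, average seconds per call).
busy_hour_activity = {
    "messages left by callers": (300, 60),
    "message retrieval by phone": (200, 90),
    "message notification calls": (50, 30),
    "TRaP sessions": (40, 120),
    "auto-attendant calls": (400, 20),
}

total_offered = sum(
    erlangs_from_calls(calls, hold) for calls, hold in busy_hour_activity.values()
)
print(f"offered traffic: {total_offered:.2f} erlangs")
print(f"ports for 1% blocking: {ports_needed(total_offered, 0.01)}")
```

Treat the result as a starting point only. Ports dedicated to functions such as MWI, and any reduction in capacity from the line codecs in use, would need to be considered separately.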
Finally, if high availability is a requirement, two servers are required to be configured in a cluster-pair. A single server uses the IBM Informix database for the configuration database and message store. This single server can support up to 250 ports, depending on the server platform, with Cisco Unity Connection version 8.x software. The issue with having the single server configuration is that there is no redundancy if a server failure occurs and no available load sharing, meaning that a single server is responsible for database configuration, message stores and voicemail port activity. The loss of the server can cause a voice-messaging outage until the server is restored.