Bill Birney Microsoft
Introduction
Intelligent streaming is a set of features in Microsoft® Windows® Media
Technologies that automatically detects network conditions and adjusts the
properties of a video stream to maximize quality. Today's Internet
connections vary widely, both in the range of possible connection speeds
and in the actual throughput achieved on any given connection. For
example, a user with a laptop computer can connect to an Internet Service
Provider with a 300-kilobits-per-second (Kbps) DSL connection at home, a
1.5-megabits-per-second (Mbps) T1 connection at work, and a 56.6-Kbps
modem connection while travelling on business. Furthermore, the actual
throughput achieved in all of these scenarios is likely to vary. This is
especially important for low-bandwidth modem connections, where
throughput can vary by 50 percent or more of the maximum, depending on
network and Internet Service Provider (ISP) congestion.
Because Windows Media Technologies is a connected, end-to-end,
client/server system, the server and the client communicate with each
other to establish actual network throughput and make a series of
adjustments to maximize the quality of the stream. Intelligent streaming
offers dramatic benefits to the user. It maximizes use of available
bandwidth; in a DSL or LAN environment, users receive content tailored to
their connection speed. It greatly improves the user experience; users
connected by modem immediately notice the presentation is smoother, less
jerky, and of generally higher quality.
How Does Intelligent Streaming Work?
Intelligent streaming, first introduced in version 3.0, has been
significantly upgraded in Windows Media Technologies 4.0. Version 4.0 can
automatically adjust between multiple video bit rates and automatically
clean up video streams.
The most difficult task of streaming media over a network is
maintaining a continuous presentation to the user in a highly changeable
environment. Buffering is the biggest problem of streaming media. It is
caused when Microsoft® Windows® Media Player, also known as the client,
runs out of data and must wait for more. The client will always run out of
data if the bit rate of the media exceeds the current bandwidth.
Unpredictability of bandwidth is taken for granted on the Internet. A
user can, for example, originally connect to an ISP at 56 Kbps. A fast
nominal connection speed does not guarantee that the available bandwidth
supports the bit rate of the media. Actual bandwidth is determined by
network conditions, and traffic on the Internet is constantly fluctuating,
causing bandwidth to plunge to 18 Kbps one moment, then increase to 40
Kbps the next. If a user attempts to view media being streamed at 50 Kbps,
the presentation suffers considerably when bandwidth is squeezed.
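A back-of-the-envelope model makes the effect concrete. The sketch below is illustrative only: the function name and the five-second starting buffer are assumptions, not part of Windows Media Technologies, and it simply treats the buffer as a reservoir that drains whenever the stream bit rate exceeds the bandwidth.

```python
def seconds_until_rebuffer(buffered_seconds, stream_kbps, bandwidth_kbps):
    """Estimate how long playback continues before the client buffer runs dry.

    Returns None when the bandwidth keeps up with the stream.
    """
    if bandwidth_kbps >= stream_kbps:
        return None
    # The buffer holds this many kilobits of media.
    buffered_kbits = buffered_seconds * stream_kbps
    # Playback consumes stream_kbps each second while only bandwidth_kbps arrives.
    drain_kbps = stream_kbps - bandwidth_kbps
    return buffered_kbits / drain_kbps


# A 50-Kbps stream with a hypothetical 5-second buffer, at three bandwidths.
for bandwidth in (18, 40, 56):
    print(bandwidth, seconds_until_rebuffer(5, 50, bandwidth))
```

Under these assumed numbers the buffer lasts roughly 8 seconds at 18 Kbps and 25 seconds at 40 Kbps, and never empties once bandwidth reaches the stream bit rate.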
To ensure a continuous presentation, you must employ a system that
adjusts the bit rate to changes in available bandwidth. Intelligent
streaming does this by:
- Sending a stream at a bit rate appropriate for the bandwidth when the
user first connects.
- Dynamically and seamlessly adjusting the bit rate as the bandwidth
changes.
Multiple Bit Rate Encoding
To take full advantage of intelligent streaming, content must be
encoded with multiple bit rates. Multiple bit rate encoding is one of the
primary new features of Windows Media Technologies 4.0. In multiple bit
rate encoding, up to six discrete, user-definable video streams, one low
bit rate insurance video stream, and one audio stream are encoded into a
single ASF stream. Each video stream is encoded from the same content, but
each is encoded for a different bit rate. When a multiple bit rate .asf
file or live stream is played in Windows Media Player connected to a
Windows Media server, only one of the video streams is received: the
one that is the most appropriate for current bandwidth conditions. The
process of selecting the appropriate stream is completely invisible to the
user, and this is what intelligent streaming is all about.
Intelligent Bandwidth Control
There are a number of steps in the intelligent streaming process. Each
is a strategy: a way to modify the bit rate so the presentation remains
continuous on the client end regardless of the current bandwidth. As
bandwidth fluctuates between server and client, the server detects the
changes and applies the best strategy. When bandwidth is at its best, the
server employs the first strategy. As conditions worsen, the server works
through its list of options one by one until the bit rate is optimized for
the current bandwidth.
Intelligent streaming uses the following strategies:
- The server and client automatically determine the current available
bandwidth; then the server selects and serves the video stream at the
appropriate bit rate.
- Windows Media Technologies 4.0 supports up to six user-definable bit
rates of encoded video for both on-demand .asf files and live-encoded
streams. During transmission, if the available bandwidth decreases, the
server automatically detects the change and switches to a lower
bandwidth stream. If bandwidth improves, the server switches to a higher
bandwidth stream but never higher than the original bandwidth.
- If bandwidth deteriorates to a level below that of the lowest
multiple bit rate stream, the server selects and serves an insurance
stream, which is automatically encoded when the multiple bit rate
feature is selected. This insurance stream is approximately 70 percent
of the bit rate of the lowest bandwidth stream encoded.
- If bandwidth can no longer support streaming video, the client and
server intelligently degrade image quality.
When a network is extremely congested, intelligent streaming attempts
to maintain a continuous audio stream. The server decreases the video
frame rate to minimize interruptions caused by buffering. If the bit rate
is still too high, the server stops sending video frames. If audio quality
starts to degrade, the client intelligently reconstructs portions of the
stream to preserve quality.
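The strategies above can be read as a fallback ladder. The following sketch is one way to express that ladder, not the server's actual logic: the function name, the strategy labels, and the example bit rates are hypothetical, and only the 70 percent insurance ratio comes from the text.

```python
from typing import List, Optional, Tuple

INSURANCE_RATIO = 0.70  # the article puts the insurance stream at about 70 percent


def choose_strategy(stream_kbps: List[float], audio_kbps: float,
                    available_kbps: float) -> Tuple[str, Optional[float]]:
    """Map a measured bandwidth to a strategy label and, where known, a bit rate.

    stream_kbps lists the overall bit rate of each user-defined stream.
    """
    # Strategies 1 and 2: serve the highest encoded stream that fits.
    fitting = [rate for rate in stream_kbps if rate <= available_kbps]
    if fitting:
        return ("encoded stream", max(fitting))
    # Strategy 3: fall back to the automatically encoded insurance stream.
    insurance = INSURANCE_RATIO * min(stream_kbps)
    if insurance <= available_kbps:
        return ("insurance stream", insurance)
    # Strategy 4: protect the audio by thinning or stopping video frames.
    if audio_kbps <= available_kbps:
        return ("thin or stop video, keep audio", None)
    # Last resort: the client reconstructs damaged portions of the audio.
    return ("client reconstructs audio", None)


rates = [25.0, 37.0, 80.0]  # hypothetical overall bit rates in Kbps
for bandwidth in (45.0, 20.0, 14.0, 5.0):
    print(bandwidth, choose_strategy(rates, 8.0, bandwidth))
```

The exact thresholds at which the server reduces the frame rate or stops video are not given in the article, so the sketch folds both into a single step.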
Intelligent Image Processing
The client intelligently post-processes the video stream to enhance
quality even at very low bit rates. Windows Media Technologies includes a
new intelligent filtering technology, which works in conjunction with the
Microsoft MPEG-4 version 3.0 Video codec in Windows Media Player to smooth
blockiness and remove ghosting artifacts, significantly improving
the overall appearance of the video.
Blockiness also occurs during the decoding of high bit rate streams,
but it is not as noticeable. A streaming media codec, such as Microsoft
MPEG-4, encodes a video image by breaking it up into pixels. The lower the
bit rate, the fewer the pixels. When too few pixels are used to create an
image, they appear as blocks. The client post-processing filter used in
intelligent streaming smoothes the edges of the blocks and erases certain
other artifacts, such as ringing, so the resulting image is more pleasing
to the eye.
How Do Content Developers Author for Intelligent Streaming?
With multiple bit rate encoding in Windows Media Technologies 4.0, ease
of use has never been greater. Simply select a presupplied multiple bit
rate template during an on-demand or live production, and the encoder
automatically creates the multiple bit rate stream. For greater control,
you can manually select the exact bit rates for each of up to six encoded
streams. The insurance stream, client post-processing, and intelligent bit
rate optimization are all automatic on-the-fly features. Best of all, now
you only need to create and manage a single file to handle multiple bit
rates. One of the main goals of software design in recent years has been to
handle as many of the background tasks as possible automatically, so you
as producer are free to focus on the quality of the content.
Setting Up the Encoder Using Templates
To set up Windows Media Encoder for multiple bit rate encoding, start
the configuration wizard. To do so, open Windows Media Encoder, click
File, and then click New. There are three multiple bit rate
templates available in the QuickStart and Template with I/O options
wizards.
Select the QuickStart option, and then click OK. The
multiple bit rate templates are the first three in the Template
list. To quickly configure Windows Media Encoder for multiple bit rate,
click one of these template options.
| Template | Streams |
| --- | --- |
| Dial-up Modems Multiple Bit Rate Video | Encodes two primary streams suitable for Internet dial-up connections: 28.8 Kbps and 56 Kbps. Quality: Low bit rate audio, smooth movement at 15 frames per second, small frame size. |
| ISDN - Corporate Internet Multiple Bit Rate Video | Encodes two primary streams suitable for ISDN connections: 100 Kbps and 80 Kbps. Quality: Medium bit rate audio, smooth frame movement at 15 frames per second, medium image size. |
| Dial-up Modem - Corporate Internet Multiple Bit Rate Video | Encodes three primary streams suitable for ISDN and dial-up connections: 80 Kbps, 56.6 Kbps, 28.8 Kbps. Quality: Low bit rate audio, smooth movement at 15 frames per second, medium image size. |
For a more complete description of each, select a template, and read
the text in the Description and Details boxes on this wizard
screen.
Viewing the Process
To encode a live event using the Dial-up Modem - Corporate Internet
Multiple Bit Rate Video template, select the QuickStart
wizard.
Before encoding, open a performance monitor on your system. In the
Microsoft® Windows NT® operating system, right-click the taskbar, click
Task Manager, and then click the Performance tab. There are
similar performance monitor options available for Microsoft® Windows® 98
and Windows® 95.
With Windows Media Encoder configured and ready, Microsoft® Performance
Monitor running, and video and audio streams connected and adjusted, start
encoding. On a fast computer with dual 400-MHz processors, CPU usage
is typically 35 percent to 40 percent. With usage in that range, the encoder has
enough processing power to handle rapid increases in frame detail. If you
are encoding on a slower computer, however, you are pushing the capability
of the processor. Using a 200-MHz single processor, for instance, CPU
usage immediately climbs to 100 percent when encoding starts. While this
condition is normal when encoding from one file to another, high CPU usage
when encoding live usually means frames are being dropped or
discarded.
To monitor the encoder as it is working, click the Summary
Statistics button.
The numbers on this page change continuously as the content changes.
Under Audio, the bit rate remains fairly steady, but under Video, the
Current bit rate can change dramatically. As the amount of video
detail per second increases, so must the bits per second. The ASF
Statistics tab illustrates how the encoder continuously adjusts
parameters to maintain the current bit rate as closely as possible to the
Expected bit rate.
Click the arrow to display all the items in the Video list. Four
video streams appear. Click Video stream #4. When you select this
stream, the numbers under Video change to reflect activity in that stream.
Stream #4 is the high bit rate stream, suitable for reception over an ISDN
connection. The Expected video bit rate is 66.11 Kbps. To determine
the overall bit rate for this video and audio stream combination, add this
number to the audio bit rate plus some padding. The result is roughly 80
Kbps.
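As a quick check of that arithmetic, the sketch below works out how much of the 80-Kbps total is left for padding. It assumes the 8-Kbps audio bit rate quoted later in this walkthrough applies to this template; the function name is hypothetical.

```python
def implied_padding_kbps(overall_kbps, video_kbps, audio_kbps):
    """Return the bit rate left for padding once video and audio are counted."""
    return overall_kbps - video_kbps - audio_kbps


# 66.11 Kbps expected video plus 8 Kbps audio (assumed) inside an 80-Kbps stream.
padding = implied_padding_kbps(overall_kbps=80.0, video_kbps=66.11, audio_kbps=8.0)
print(f"padding is about {padding:.2f} Kbps ({padding / 80.0:.0%} of the total)")
```

With those figures, a little under 6 Kbps of the overall bit rate is padding.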
Select other video streams to view. Video stream #3 is delivered
by Windows Media Services to clients connected through a 56-Kbps modem,
and Video stream #2 is received by clients connected through a
standard 28.8-Kbps modem. Video stream #1 is the insurance stream.
Its overall bit rate is roughly 18 Kbps in this case. When doing multiple
bit rate encoding, this extra stream is always added at a bit rate just
less than the lowest bandwidth selected.
A multiple bit rate .asf file is larger than a single bit rate file of
the same length because of the extra streams. Likewise, the bandwidth of
the connection between the encoder and the server must be larger to handle
more streams. In this example, five streams are delivered to the Windows
Media Services server: four video streams and one audio. However, only two
streams are delivered by the server to a client: the audio stream and one
video stream.
When a client attempts to connect to the server to receive the live
presentation, the server determines the current bandwidth of the
connection. For example, a user connects via a 56-Kbps analog modem but
does so at a time when the traffic is particularly high. When the user
connects, the server determines bandwidth to be 45 Kbps. There is not
enough bandwidth for the 80-Kbps stream, but there is enough for the
37-Kbps stream; therefore, the server sends video stream #3 and the audio
stream.
After 10 minutes, network congestion increases, and bandwidth suddenly
falls to 32 Kbps. Frame rate begins to suffer, and some packets are lost,
but the server reacts immediately by switching to video stream #2. The
user notices some degradation in image quality, but audio is continuous
and disruption of the presentation is minimized. Twenty minutes later,
bandwidth worsens again, dipping down to 14 Kbps. This is too low even for
video stream #1, so the Windows Media server stops all video. The user
notices the loss of video but is still able to listen to an uninterrupted
audio stream, which only requires 8 Kbps of bandwidth. A few minutes
later, bandwidth improves a great deal, and the server again is able to
send video stream #3.
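The session just described can be replayed in a few lines. This is a sketch under stated assumptions rather than the server's algorithm: the 25-Kbps figure for video stream #2 is a guess (the article does not give it), and the ceiling models the earlier rule that the server never switches above the stream it first served.

```python
# Overall bit rates in Kbps; only 80, 37, and 18 are taken from the article.
STREAMS = {
    "video #4": 80.0,
    "video #3": 37.0,
    "video #2": 25.0,  # assumed value
    "video #1 (insurance)": 18.0,
}


def replay(bandwidth_samples):
    """Return what the server sends for each measured bandwidth, in order."""
    ceiling = None  # bit rate of the stream chosen when the client connected
    sent = []
    for bandwidth in bandwidth_samples:
        limit = bandwidth if ceiling is None else min(bandwidth, ceiling)
        fitting = {name: rate for name, rate in STREAMS.items() if rate <= limit}
        if fitting:
            name = max(fitting, key=fitting.get)  # highest stream that fits
            if ceiling is None:
                ceiling = fitting[name]
            sent.append(name)
        else:
            sent.append("audio only")  # too little bandwidth for any video
    return sent


# Connect at 45 Kbps, drop to 32, then 14, then recover well above the original.
print(replay([45, 32, 14, 90]))
```

The result follows the narrative: stream #3, then #2, then audio only, then back to #3 even though 90 Kbps would be enough for stream #4.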
The negotiation between server and client is handled automatically and
seamlessly. There are no manual adjustments necessary on either end. If
multiple bit rate streams are available to the server, it uses them. The
only thing you have to do as producer is make sure the streams are there.
If one of the three multiple bit rate templates available in the
QuickStart and Template with I/O options wizards is not exactly right for
your needs, you can create an .asd file using the Custom Settings
configuration wizard.
Monitoring Performance
In the multiple bit rate environment, the single most important concern
is CPU speed. While this was certainly a factor in the single, low bit
rate days, it is crucial now. It is recommended that you invest in a
computer with a processor speed of 400 MHz or greater and a dual processor
if possible. A slower computer (for example, in the 200-MHz category) is
suitable if live encoding is limited to only two of the lowest bit rates
and file encoding time is not an issue. In a multiple bit rate
environment, the more CPU speed available to the encoder, the more streams
are possible, and the higher the bit rates, frame rates, frame sizes, and
quality can be.
Monitoring the performance of your CPU and memory resources is a simple
way to monitor the quality of your encoding. Poor playback of your live
video can often be attributed to an overloaded CPU. When you select an
encoding template or enter your own settings with the Custom Settings
wizard, you are, in effect, assigning a list of tasks for your CPU to
perform. The more streams, the more frames per second, the larger the
image size and the higher the quality you enter, the more tasks per second
your CPU must perform. Windows Media Encoder automatically adjusts its
task load to the given bandwidth and to the limits of the CPU. For
instance, if you enter a high frame rate (30 frames per second) and a low
bandwidth (28.8 Kbps), Windows Media Encoder keeps the bandwidth constant,
and attempts to achieve the requested frame rate by dropping image
quality.
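The trade-off in that example is easy to quantify. The sketch below assumes 1 Kbps means 1,000 bits per second and ignores the share of bandwidth taken by audio, so it overstates what the video actually gets; the function name is hypothetical.

```python
def bits_per_frame(bandwidth_kbps, frames_per_second):
    """Average number of bits available to each video frame."""
    return bandwidth_kbps * 1000 / frames_per_second


# The same 28.8-Kbps budget spread over different frame rates.
for fps in (30, 15, 5):
    print(fps, round(bits_per_frame(28.8, fps)))
```

At 30 frames per second each frame gets roughly 960 bits, which is why the encoder has to give up image quality to hold the requested frame rate; halving the frame rate doubles the budget per frame.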
Your CPU, too, can become overburdened, especially in multiple bit rate
encoding. When the number of tasks per second is too great for the
processing power of your CPU, the encoder adjusts to the environment by
dropping frames. An occasional dropped frame during a high-action sequence
is not that noticeable, but image quality and frame rate can be degraded
when your CPU is at 100 percent most of the time. For the best quality,
reduce the number of streams, image size, or frames per second until usage
is no higher than 90 percent.
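If you log CPU readings during a test encode, a small helper can apply the 90 percent guideline for you. This is a minimal sketch: the function name and the report fields are assumptions, and the readings are values you collect yourself from a performance monitor.

```python
from statistics import mean


def cpu_report(samples, ceiling=90.0):
    """Summarize CPU usage readings (in percent) against a usage ceiling."""
    over = [s for s in samples if s > ceiling]
    return {
        "mean": mean(samples),
        "peak": max(samples),
        "share_over_ceiling": len(over) / len(samples),
        # Suggest lighter settings if any reading exceeds the ceiling.
        "reduce_load": max(samples) > ceiling,
    }


print(cpu_report([35, 38, 40, 37]))      # comfortable headroom
print(cpu_report([100, 100, 98, 100]))   # saturated: fewer streams or frames
```

Treating any reading above the ceiling as a reason to reduce load is a deliberately strict choice; you might instead tolerate brief peaks and act only when the share over the ceiling is large.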
When encoding from one file to another, Windows Media Encoder adjusts
to processor speed. Because time is not an issue, the encoder takes as
long and uses as much CPU bandwidth as necessary to properly render the
media without compromising quality. A 30-second file can take five minutes
to encode on one machine and 10 seconds on another depending on processor
speed. The encoder maximizes use of the CPU to keep encoding time to a
minimum, so you may see CPU usage at 100 percent. Unlike live encoding,
however, this does not mean frames are being dropped.
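One way to compare machines is to express file-to-file encoding time as a multiple of the media's running time. The helper below is a hypothetical convenience, not an encoder feature.

```python
def realtime_factor(encode_seconds, media_seconds):
    """Encoding time divided by media duration (1.0 means real time)."""
    return encode_seconds / media_seconds


# The 30-second file from the example above on a slow and a fast machine.
print(realtime_factor(5 * 60, 30))  # five minutes of encoding
print(realtime_factor(10, 30))      # ten seconds of encoding
```

A factor well above 1.0 may be a hint that the same settings would overload that machine during live encoding, where the encoder cannot take extra time, though this is only a rough indicator.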
Custom Settings
This section explains how to set up Windows Media Encoder for live
streaming using custom settings. On the Windows Media Encoder menu bar,
click File, and then click New. Assuming current encoder
settings have been saved, the Welcome screen appears. Select Custom
Settings. The process is very similar for streaming from or to a
file.
Input Source
On the first screen of the Custom Settings wizard, select Live
source.
Capture Source
On the next screen, click the audio and video capture card or cards to
be used, and check whether script commands will be sent. A small amount of
bandwidth is set aside for script commands, so if you are not going to use
them, do not select this option.
Bandwidth Selection #1
Here you have a choice of multiple bit rate or single bit rate video.
Select Use multiple bit rate video. When you encode using multiple
bit rate, you fully enable intelligent streaming in the Windows Media
server and Windows Media Player. By selecting Use single bit rate
video, you limit the intelligent streaming options available to the
server. Single stream encoding is appropriate if network conditions are
known and stable, the encoding computer is incapable of handling the
higher CPU requirement of multiple bit rate encoding, or you are encoding
audio-only content.
Bandwidth Selection #2
On this screen, decide which bandwidth streams to encode. Selecting
28.8 Modem, Internet 56 Modem, and ISDN covers the
majority of modems used on the Internet. Selecting Intranet/LAN and
High Speed Internet gives you good coverage for streaming over an
internal network. Selecting all of the options covers any network
bandwidth. For this example, select 28.8 Modem and ISDN.
Compression and Formats
After you have enabled multiple bit rate encoding and chosen the
streams to encode, select one audio and one video compression type (codec)
and format and one set of advanced video parameters. These settings are
used to configure all of the streams. Deciding which settings to enter
requires some experimentation and practice. There are no set rules. What
you enter depends on the input source, the desired effect, and personal
taste: all subjective decisions. But you do have many choices. Here are
some points to consider:
- Windows Media Audio Codec and Microsoft MPEG-4 version 3.
Intelligent streaming is a balancing act. Many factors come into
play as Windows Media Encoder makes its many parameter adjustments prior
to creating multiple streams from user input. The Microsoft MPEG-4 Video
codec version 3 and the new Windows Media Audio codec were created to
work hand in hand with Windows Media Encoder and intelligent streaming.
Although other audio and video codecs will certainly work fine
for single-stream encoding, they might not allow for the high degree of
flexibility required for multiple bit rate encoding. It is recommended
you try the Windows Media Audio and Microsoft MPEG-4 codecs first. If
you prefer to use another codec, before using it in your final product,
test it thoroughly with the chosen streams and settings. The Windows
Media Audio codec features good quality, bit rates as low as 5 Kbps, and
low processor and memory requirements--a definite advantage when used in
a multiple stream environment. The MPEG-4 Video codec is optimized for
encoding and decoding. As mentioned earlier, the intelligent streaming
filter that removes blockiness and ghosting on the client only works
with the new Microsoft MPEG-4 Version 3 codec.
- Video highest bit rate, audio lowest bit rate. With
intelligent streaming, when network bandwidth deteriorates, the quality
of the video received by the client is adjusted down in favor of
maintaining a continuous audio stream. For this reason, select
an audio bit rate that complements the lowest bandwidth. The video
settings you choose, on the other hand, complement the highest
bandwidth. When users connect at the highest bandwidth, they receive the
highest quality video encoded: the settings you have chosen. The encoder
uses these video settings as a starting point to automatically create
the other bandwidth streams by modifying image quality, seconds per I
frame, and frames per second (the encoder cannot change the image size
you set). For instance, if you select 15 frames per second and an image
size of 320-by-240 pixels for an 80-Kbps bandwidth, the encoder
sacrifices some of the quality and lowers the frames-per-second to 8 or
5 for each of the lower bandwidth streams. The encoder adjusts all
parameters to find the best balance for a given bandwidth using your
settings as a starting point. If you select a high frame rate and large
image size, the encoder must lower image quality to fit a given
bandwidth. Conversely, you can increase quality by reducing the
frame rate or image size. Be creative with image shape, too. If
your image is letterboxed, in Advanced Video Settings, use the
Clipping controls to remove the black areas at the top and bottom
and reshape the aspect ratio of the image. Even though the areas are
black, they are still part of the image and add to the number of bits
that must be processed and streamed.
- Watch CPU and playback. If playback appears jerky despite
having selected a high frame rate (for example, 15 frames-per-second),
it is most likely due to an imbalance in your Windows Media Encoder
settings. For instance, if you enter a high frame rate, a large image
size, and a bandwidth that is too small, the encoder compensates by not
only lowering image quality but by dynamically adjusting the frame rate.
Because there are so many factors involved in creating quality streams,
the settings you enter can be modified by the encoder while encoding to
limit the streams to the set bandwidths. The only way to tell if
your settings work satisfactorily for a given type of video is by
testing them. Before going live, try different settings. View the
results by connecting Windows Media Player directly to the encoder
output (msbd://EncodingMachine:port). If the frame
rate is uneven, lower the rate, or reduce the image size. In Windows
Media Player, click View, and then click Statistics. If
the Actual rate is much lower than the Frame rate, it may
be because the encoder is being forced to drop frames. The playback can
appear smoother if you lower the frame rate from 15 to 4. In addition to
monitoring playback, check your encoding computer's performance monitor.
If CPU usage is at 100 percent constantly, the encoder compensates by
dropping frames.
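To see why the clipping advice in the list above matters, the sketch below estimates how many pixels the black bars occupy. It assumes 16:9 content letterboxed inside the 320-by-240 frame used as an example earlier; the aspect ratio and both function names are illustrative assumptions.

```python
def content_height(frame_width, aspect_w, aspect_h):
    """Height in pixels of the picture area for a given aspect ratio."""
    return round(frame_width * aspect_h / aspect_w)


def pixel_savings(frame_height, picture_height):
    """Fraction of the frame's pixels removed by clipping the black bars."""
    return 1 - picture_height / frame_height


height = content_height(320, 16, 9)
print(height, pixel_savings(240, height))
```

Here clipping to 320 by 180 removes a quarter of the pixels the encoder must process. The saving in bits is likely smaller than the saving in pixels, because uniform black areas compress well, so treat the figure as an upper bound.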
Output Options, Transmission and Output File
The last three screens of the configuration wizard are the same for
multiple bit rate as for single stream. On the Output Options screen,
select whether you will stream over a network, to a file, or both. On the
last two screens, configure for live transmission, and select an output
file name and path.
Summary
Intelligent streaming is, for the most part, completely automatic. The
interplay between client and server takes place behind the scenes. If you
have encoded your media with multiple bit rates, the server
can intelligently adjust the bit rate according to the current bandwidth,
so the presentation received by the user is smooth.
Intelligent streaming is another step by Windows Media Technologies
toward creating the ideal user experience. The goal is to make the
experience transparent: the user should only be aware of the content, not
the container, and you, the producer, should only have to be concerned
with creating great content. Intelligent streaming is a major step toward
the understanding and management of media presentation over networks. With
Windows Media Technologies 4.0, you can create one .asf file or encode one
live stream, and users connected at many different speeds with a multitude
of network conditions can enjoy a high-quality presentation. Most
importantly, they can enjoy the content without experiencing irritating
interruptions and transmission break-up.