The Basics of Digital Media
Our world has come to be saturated with video and audio media. Media has become a central force in how anyone communicates about anything, and almost all of it is now created, stored, edited, transmitted, and consumed digitally. For anyone who wants to become involved in media, or sometimes even just to consume it, this means that there's a whole world of esoteric shenanigans to deal with. This is a rabbit hole that can go very deep, but for the most part the day-to-day mechanics of digital media aren't all that bad. There are a few core ideas to grasp, and some terminology that can be confusing, but a little time and attention is all that's required to get the basics.
How Much, How Often, How Many
The essential characteristics of digital media, video or audio, come down to three basic questions: How much? How often? How many?
How much? - The factor in question here is raw data. All digital information is ultimately stored as a series of 1s and 0s. The hows and whys of turning those into media are almost unfathomably complex, but the important part is that, all else being equal, more data means more quality. If you want sharp video and crisp audio, you'll want a lot of data to represent it. On the other hand, if you want your media to stream easily over the Internet, you'll want to back off on the data so that it doesn't swamp the consumer's connection. How much data is used to represent each second of media is typically referred to as its bitrate.
Examples
An Internet video stream might have a bitrate of around 300 kilobits per second (kbps), and the video on a DVD might be around 4000 kbps.
The audio in an MP3 might be around 200 kbps, and on a CD it's about 1400 kbps.
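Since a bitrate is just data per second, a little arithmetic turns it into storage space. Here is a rough sketch of that calculation in Python. The function name is my own invention for illustration, and it assumes decimal units (1 kilobit = 1000 bits, 1 megabyte = 1,000,000 bytes) and a constant bitrate, which real files only approximate.

```python
def size_in_megabytes(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate size of a stream at a constant bitrate.

    Assumes 1 kilobit = 1000 bits and 1 megabyte = 1,000,000 bytes.
    """
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000


# One minute of each example bitrate mentioned above.
examples = [
    ("Internet video", 300),
    ("DVD video", 4000),
    ("MP3 audio", 200),
    ("CD audio", 1400),
]
for label, kbps in examples:
    print(f"{label}: {size_in_megabytes(kbps, 60):.2f} MB per minute")
```

By this estimate, a minute of the Internet video stream comes to about 2.25 MB, while a minute of DVD video comes to about 30 MB.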
How often? - In the real world, sight and sound are a continuous, seamless flow. But in the digital realm, everything has to get chopped up into bits and pieces. We simulate the smooth stream of reality by playing a lot of these tiny chunks very quickly. In video, one of these chunks is a frame. It's somewhat like a slice of an old movie reel: a single picture that varies slightly from the ones before and after it. In audio, the chunks are called samples. It's similar to the movie reel, but instead of a series of pictures it's a series of tiny snapshots of the sound. In either case, the more chunks there are in each second of media, the higher the quality. For video this is called the framerate and for audio it's the sampling rate.
Examples
An Internet video stream might have a framerate of around 15 frames per second (fps), and HDTV video can be as high as 60 fps.
The audio in an Internet stream might have a sampling rate of 30 kilohertz (kHz), and Digital Audio Tapes use 48 kHz.
How many? - Now that we have our data coming in at some rate, we have to decide how to arrange it. For video, the data is displayed as a fine grid of individual colored squares called pixels. This is usually described as how many pixels wide and high each frame is, or its dimensions. For audio, the data is delivered in one or more channels. Old AM radios just had one channel (mono), most FM radio and television have two channels (left and right, stereo), and home theaters may have five or more channels with speakers all over the place (surround sound). Again, for both video and audio, more is better for quality.
Examples
The dimensions of an Internet video stream might be around 320x240, and HDTV can be as high as 1920x1080.
Most MP3 files have 2 audio channels, and a movie theater may have 12.
Putting the Questions Together - All three of these factors come into play for both the overall quality of the media and how much space it takes up, either in storage (such as in a file on a hard drive) or on a connection (such as an Internet video). One of the biggest considerations to keep in mind is that the bitrate, or how much data is going into the media, is spread across both the how often and the how many. So, for example, increasing the sampling rate or the dimensions will require that much more data to maintain the same level of quality. Similarly, decreasing the framerate or the number of channels frees up data that can either go toward better quality in what remains or be dropped for a leaner file. The interplay of all the factors is basically infinite, and it's something of an art to pin down just the right combination for any particular application. Two good ways to approach it are to look on the Internet for suggested combinations and to simply play with different scenarios to get a feel for how they work. I suggest doing a bit of both.
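To make the "spread across" idea concrete, here is a small Python sketch. The first function multiplies how often by how many to get the data rate of uncompressed audio; the CD figures it uses (44,100 samples per second, 16 bits per sample, 2 channels) are the standard CD audio parameters rather than numbers from this article. The second function is a rough rule of thumb I'm assuming for illustration: it divides a video bitrate by the number of pixels shown per second to see how thinly the data is spread. Both function names are made up.

```python
def raw_audio_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Data rate of uncompressed audio: how often x how much per sample x how many."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def bits_per_pixel(bitrate_kbps: float, width: int, height: int, fps: float) -> float:
    """Average number of bits available for each pixel of each frame."""
    return bitrate_kbps * 1000 / (width * height * fps)


# CD audio: this is where a figure of about 1400 kbps comes from.
print(f"CD audio: {raw_audio_kbps(44_100, 16, 2):.1f} kbps")

# The same 384 kbps spread over a small frame and over one twice as wide and high.
small = bits_per_pixel(384, 424, 240, 20)
large = bits_per_pixel(384, 848, 480, 20)
print(f"424x240: {small:.3f} bits per pixel")
print(f"848x480: {large:.3f} bits per pixel")
```

Doubling both dimensions quadruples the pixel count, so each pixel gets a quarter of the data unless the bitrate goes up to match.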
Aspect Ratios 101
The shape of a digital video frame can be an important consideration. This is measured by its width relative to its height, which is referred to as its aspect ratio. For example, an old-fashioned television is four parts wide for every three parts high, usually noted as 4:3. No matter how large or small any individual screen is physically, it will have that same ratio to its dimensions. This ensures that any 4:3 video will take up the whole area of every 4:3 screen it's played on. Similarly, widescreen televisions are sixteen parts wide to nine parts high, or 16:9. Again, any 16:9 video will take up the whole screen on any 16:9 television, regardless of its size. It's all very tidy.
Where things start getting messy is when we mix and match. If you play a 16:9 video on a 4:3 screen, or a 4:3 video on a 16:9 screen, you end up with black bars on the screen, edges chopped off, or a distorted, stretched picture. So it's important to keep the aspect ratio in mind when working with video. What ratio is your camera or source material? What are the ratios of the devices that will likely display it? How will you handle any mismatches between the two? Ask yourself these questions right from the very start to save yourself a lot of headaches in the long run.
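Both halves of this can be worked out with simple arithmetic. The Python sketch below reduces a frame's dimensions to an aspect ratio, then computes the black bars you get when a video is scaled to fit a mismatched screen without cropping or stretching. The function names and the 640x480 screen are my own assumptions for illustration.

```python
from math import gcd


def aspect_ratio(width: int, height: int) -> tuple:
    """Reduce pixel dimensions to the smallest whole-number ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


def fit_inside(video_w: int, video_h: int, screen_w: int, screen_h: int) -> tuple:
    """Scale a video to the largest size that fits the screen undistorted.

    Returns (scaled_width, scaled_height, side_bar, top_bar), where the
    bars are the black margins in pixels on each side and on top/bottom.
    """
    scale = min(screen_w / video_w, screen_h / video_h)
    scaled_w = round(video_w * scale)
    scaled_h = round(video_h * scale)
    return scaled_w, scaled_h, (screen_w - scaled_w) // 2, (screen_h - scaled_h) // 2


print(aspect_ratio(1920, 1080))             # (16, 9)
print(aspect_ratio(320, 240))               # (4, 3)
print(fit_inside(1920, 1080, 640, 480))     # (640, 360, 0, 60)
print(fit_inside(640, 480, 1920, 1080))     # (1440, 1080, 240, 0)
```

The third line is a 16:9 video on a 4:3 screen, which leaves 60-pixel bars above and below the picture. The fourth is the reverse case, with 240-pixel bars on the left and right.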
You Gotta Put it Somewhere
Now that you've picked just the right combination of properties for your media and gotten your aspect ratio squared away, you'll be wanting to store your new masterpiece. In one form or another, this happens in a computer media file, similar to a word processing document or digital picture. There are a great many things that go into this storage, but there are three fundamental parts you should know about: codecs, tracks, and containers.
Codecs - Digital media contains enormous amounts of data. So much, in fact, that in most cases the amount of raw data required for any given situation is wildly impractical, if not impossible. The solution to this problem comes in the form of a technique called data compression. People who are much smarter than me figured out how to use very sophisticated mathematical equations to squish data down to a small fraction of its original size. The process of compressing the media in this way is called encoding, and turning it back into something playable is called decoding. There are lots of different ways to encode media, and each of these individual methods is called a codec (short for compressor-decompressor). Video and audio each have different kinds of codecs because how they work internally is different. The codec can also specify a language for the media. Various codecs have different features and considerations, which can get complicated, but you should at least know what codecs your video and audio are using, even if you don't quite know what that means.
Examples
Common video codecs include MPEG-2, Windows Media, Theora, and AVC.
Some popular audio codecs are MP3, PCM, Vorbis, and AAC.
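To get a feel for how much squishing a codec does, you can compare the raw data rate of a video against its encoded bitrate. The Python sketch below does this for a 720x480 frame at 29.97 fps encoded at 4096 kbps, which are the DVD video figures used in this article. It assumes 24 bits per pixel for the raw picture (8 bits each for red, green, and blue); that number is my assumption, and real video is often stored with fewer bits per pixel before it is even encoded. The function names are made up.

```python
def raw_video_kbps(width: int, height: int, fps: float, bits_per_pixel: int = 24) -> float:
    """Data rate of uncompressed video, assuming a fixed number of bits per pixel."""
    return width * height * bits_per_pixel * fps / 1000


def compression_ratio(raw_kbps: float, encoded_kbps: float) -> float:
    """How many times smaller the encoded stream is than the raw one."""
    return raw_kbps / encoded_kbps


raw = raw_video_kbps(720, 480, 29.97)
ratio = compression_ratio(raw, 4096)
print(f"Raw: {raw:.0f} kbps, encoded: 4096 kbps, ratio: about {ratio:.0f} to 1")
```

Under these assumptions the encoded video is roughly one sixtieth the size of the raw picture data.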
Tracks - Once some media has been encoded, it is stored in a track. One track can only hold one codec, so to have both video and audio at least two tracks are required.
Containers - With some tracks full of video and audio on hand, it's finally time to put them in their home, which is called a container. This is the outward file you see on your hard drive when you have a video on your computer. Containers can store any number of tracks. This is how your DVDs can play in multiple languages: each disc has multiple audio tracks inside its container. There are lots of different types of containers and, similar to codecs, they each have their own features and considerations which can quickly become complicated. Again, you should at a minimum know what container your media is using, even if you're not sure why.
Examples
Common containers include MPEG-4 (with a file extension of .mp4), QuickTime (.mov), Audio Video Interleave (.avi), and Windows Media Video (.wmv).
The Big Picture
As you can see, there are several factors involved in a digital media file and it can be easy to get confused about what goes where. Here is a diagram of the complete structure:
| Container |
|   Track 0 [1] |
|     Video Codec @ Bitrate, Framerate, Dimensions (Aspect Ratio) |
|   Track 1 |
|     Audio Codec @ Bitrate, Sampling, Channels (Language) |
[1] Tracks are typically numbered starting at 0 instead of 1 (which is often the case in the world of technology).
Here's an example of an Internet video stream.
| Container: MPEG-4 |
|   Track 0 |
|     Video Codec: AVC @ 384 kbps, 20 fps, 424x240 (16:9) |
|   Track 1 |
|     Audio Codec: AAC @ 32 kbps, 22 kHz, 1 Channel (NA) [1] |
[1] Language is often not specified if the media is not multi-lingual.
And here's an example of a DVD.
| Container: MPEG-2 |
|   Track 0 |
|     Video Codec: MPEG-2 @ 4096 kbps, 29.97 fps, 720x480 (4:3) [1] |
|   Track 1 |
|     Audio Codec: DTS @ 768 kbps, 48 kHz, 5.1 Channels [2] (English) |
|   Track 2 |
|     Audio Codec: MP2 @ 256 kbps, 48 kHz, 2 Channels (Spanish) |
[1] Some video, such as DVD, has rectangular pixels (instead of square), which makes their aspect ratios more complicated.
[2] Surround sound audio is often denoted as X.Y, where X is the number of main (full-range) speaker channels and Y is the number of subwoofer (low-frequency) channels.
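If it helps to see the same structure as data, here is a minimal Python sketch that models the DVD example as a container holding tracks. The class and method names are hypothetical and only illustrate the nesting; real container formats store far more detail than this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    kind: str                       # "video" or "audio"
    codec: str
    bitrate_kbps: int
    language: Optional[str] = None  # often left unset for single-language media


@dataclass
class Container:
    format: str
    tracks: List[Track] = field(default_factory=list)

    def total_bitrate_kbps(self) -> int:
        """Combined bitrate of every track stored in the container."""
        return sum(track.bitrate_kbps for track in self.tracks)

    def audio_track_for(self, language: str) -> Optional[int]:
        """Index of the first audio track in the given language, if any."""
        for index, track in enumerate(self.tracks):
            if track.kind == "audio" and track.language == language:
                return index
        return None


dvd = Container("MPEG-2", [
    Track("video", "MPEG-2", 4096),
    Track("audio", "DTS", 768, "English"),
    Track("audio", "MP2", 256, "Spanish"),
])

print(dvd.total_bitrate_kbps())         # 5120
print(dvd.audio_track_for("Spanish"))   # 2
```

Picking a language on a DVD menu amounts to the lookup in audio_track_for: the video track stays the same and the player switches which audio track it decodes.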
That's it!
Congratulations, you're done. There are endless corridors of detail you could go down for any item on this page, but for most of the things that most people will ever do with digital media, this has got you covered. I hope you've found it useful. Please feel free to send me any feedback you might have.
Thanks!
Chris Stanley
July 7, 2009
Copyright © 2009 Chris Stanley