Video Encoding 101 - The Problem Number 1

Posted by: Cameron Church, October 22, 2009

The series so far:

Introduction - http://blog.brightcove.com/global/2009/10/video-encoding-101-the-series.html
The Beginning - http://blog.brightcove.com/global/2009/10/video-encoding-101-from-the-beginning.html

So... the $64,000 question: what is video and how do we deliver it over the web? In its most basic sense video is just a consecutive series of static images presented fast enough that our brains process the incoming feed as a stream of continuous scenic data.

Going deeper into the structure of video is out of scope for this series, but here are some lovely links if you want to get into the really nitty gritty of video and its science:

The key measurement of video is that of quality. As we discussed previously there are 4 fundamental media for delivering information over the web: Text, Images, Audio and Video. From left to right these increase in their ability to deliver contextual data efficiently, but at the cost of an exponential increase in the transport data required. So at the far right of the spectrum, occupied by video, is the need for a good (or higher) degree of quality, as the resolution of a data stream is what our brains use to match the incoming signal to our memory banks. If we can't distinguish an object then we fail in converting (transcoding) the video stream into our thought streams.

What level of quality is needed for a specific stream is very subjective and largely dependent on the focus of the message (factual/news content typically needs less quality, whereas engaging nature films require high quality). The perceived quality can also be raised (or lowered) by the overall quality of any accompanying audio stream. There are certain rules of thumb about how low you can go and we'll get into that in later instalments.

The more contextual information you deliver the more inference the consumer can do in terms of deriving unique and targeted messaging.  This is why we mostly prefer to watch the news over listening to a bulletin over the radio.   In the same attention span the provider can pack so much more contextual data for the user to sort through and focus on what is important to them.   It makes it much more personal.

Video is broadband messaging.

As discussed in the last segment video is a massive data stream. Depending on who you believe, our eyes can process anywhere between 100Mbps and 200Mbps. We also see at a resolution of around 324 Megapixels, with our brains sampling this much data in a fraction of a second (although our focus is a subset of this field of vision).

In comparison, in the UK the average broadband user has a pipe about 8Mbps wide with a sustained throughput of 25% of the overall pipe, or 2Mbps for all their internet traffic. In other words, there is a factor of 100x difference between what we see in real life and what can actually be presented to us.
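
As a rough back-of-the-envelope check on that gap, the sketch below simply divides the figures quoted above. The variable and function names are illustrative only, and the inputs are the estimates from this post rather than measured values.

```python
# Figures quoted in this post (estimates, not measurements).
EYE_THROUGHPUT_MBPS = 200      # upper estimate of what our eyes can process
BROADBAND_PIPE_MBPS = 8        # average UK broadband connection
SUSTAINED_FRACTION = 0.25      # share of the pipe that is actually sustained

def sustained_throughput_mbps(pipe_mbps, fraction):
    """Usable bandwidth once real-world throughput is taken into account."""
    return pipe_mbps * fraction

usable_mbps = sustained_throughput_mbps(BROADBAND_PIPE_MBPS, SUSTAINED_FRACTION)
gap_factor = EYE_THROUGHPUT_MBPS / usable_mbps

print(f"Usable bandwidth: {usable_mbps:g} Mbps")
print(f"Gap between eye and pipe: {gap_factor:g}x")
```

Swapping in the lower 100Mbps estimate halves the gap, but it stays far too large to ignore.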

So there we have it - our Problem Number 1 broken into its parts:

  1. Video as a format and medium can be used to deliver upwards of 200Mbps of data to our eyes and brains.
  2. Using the internet for part of this transmission sees that delivery pipe shrink to an average of 2Mbps (and smaller for mobile device delivery)
  3. And we need to maintain a certain level of quality to ensure the message is delivered with enough data to be transcoded properly

All this fuss is really to tackle point 2. If you're using video in the online world then this is your issue, and probably why you're here reading this blog series. Until the whole world gets fibre optic cables to the door we need to figure out how to fit the round peg (video) into the square hole (an Internet originally designed for the delivery of text).

And the weapon of choice?  Compression.   Taking something very large, compressing it into something very small, then decompressing it at a later date and time so that original message/stream is preserved with a high degree of accuracy.
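
To make the compress-then-decompress round trip concrete, here is a minimal sketch using `zlib` from the Python standard library. Note the assumption: `zlib` is a general-purpose lossless compressor, not a video codec. It is used here only to show the shape of the idea. Video codecs are typically lossy, so what comes back is a close approximation rather than an exact copy.

```python
import zlib

# A highly repetitive "signal": repetition is what compressors exploit.
original = b"blue sky " * 1000

compressed = zlib.compress(original)      # something very large -> very small
restored = zlib.decompress(compressed)    # ...and back again later

print(f"Original size:   {len(original)} bytes")
print(f"Compressed size: {len(compressed)} bytes")
print(f"Round trip exact: {restored == original}")
```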

There are 4 topical areas in the process of Video Compression:

  1. The ENCODING SPECIFICATION : all the scientific mumbo jumbo that explains how data is to be held and described (think of it as a language and its associated rules for creating words, sentences, grammar, syntax etc)
  2. The CODEC : the code and process that wraps around all the math and science of the ENCODING SPECIFICATION to COmpress and transform the data using the framework specified by the encoding format, and to DECompress it back to an uncompressed state (hence CODEC)
  3. The ENCODER : A software app that wraps around 1 or more CODECs and exposes the properties of each
  4. The DECODER : The software app (player) that decompresses the video and presents it back to the user.

I'm going to spend most of my time around the first 2 areas, and in particular on the current industry standard for online video delivery - H.264.

H.264 is a specification that allows video data (either compressed or not) to be examined in a way that allows efficiency savings (typically of a very high degree) to be made, so that it can be packed into something much smaller than its parts.

H.264 is by no means the only specification out there: we have Windows Media VC1, ON2 and XVID, to name but 3. I'm focusing on H.264 mainly because it's the closest thing to a standard we have right now for online video, with both ISO and ITU setting it as their preferred codec. With those 2 on board you can't get any more "standard" these days.

In the end though all mainstream codecs pretty much do the same thing.   Treating video as a continuous set of full-framed images they have figured out a way of removing many of them and replacing them with vector representations of their changes across the time frame.

It just so happens that H.264 (MPEG) have figured out how to do this realllllly well, got some serious industry bodies to back them and have not tied themselves to any major video platform/device (yes, even Microsoft has signed up for it!).

So how H.264 and other codecs do this compression is a secret sauce that distinguishes them.   But in the end it all boils down to identifying Key Frames, which are effectively full-framed images in the video time stream that will be used as a starting and end point of any vector analysis.  

Once these are identified, in the most basic terms the specifications offer ways of describing the changes between the 2 end point Key Frames, and the resulting vector information is much smaller in data size than the frames it will replace. Just as the blueprints that describe how to build something are much smaller than the actual built object, so the compression part of the specification is how to create blueprints for the series of images between 2 key frames.

All that is needed at a later point in time is someone who understands how to read those blueprints to rebuild the object as it was described.
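
The blueprint idea can be sketched with a deliberately simplified toy. The code below keeps the first frame whole as a key frame and stores every later frame only as the list of pixels that changed. This is not how H.264 works in detail (real codecs use motion-compensated prediction and much more); the function names `encode` and `decode` and the tiny 8-pixel frames are hypothetical, chosen purely for illustration.

```python
# Toy illustration only: each "frame" is a flat list of pixel values.
def encode(frames):
    """Store the first frame whole (the key frame) and, for every later
    frame, only the (index, new_value) pairs that differ from the frame
    before it."""
    key_frame = list(frames[0])
    deltas = []
    previous = key_frame
    for frame in frames[1:]:
        changes = [(i, v) for i, (p, v) in enumerate(zip(previous, frame)) if p != v]
        deltas.append(changes)
        previous = frame
    return key_frame, deltas

def decode(key_frame, deltas):
    """Rebuild every frame by replaying the stored changes in order."""
    frames = [list(key_frame)]
    for changes in deltas:
        frame = list(frames[-1])
        for i, v in changes:
            frame[i] = v
        frames.append(frame)
    return frames

# Three 8-pixel frames in which a single bright pixel moves to the right.
frames = [
    [0, 9, 0, 0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0, 0],
]

key_frame, deltas = encode(frames)
assert decode(key_frame, deltas) == frames  # the "blueprints" rebuild the original

# Each stored change costs two values: a position and a new pixel value.
stored_values = len(key_frame) + sum(2 * len(c) for c in deltas)
print(f"Raw values: {sum(len(f) for f in frames)}, stored values: {stored_values}")
```

The saving grows with frame size: the key frame costs the same as any raw frame, but each following frame costs only as much as the part of the picture that actually changed.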

But nothing is perfect, and this blueprint creation is not (yet) flawless. In the next post I'll look in more detail at how this vectoring is done and where things can and do go wrong. From there we can then look at strategies to correct them. In the end we'll be seeking great quality and in turn solving our Problem Number 1.

-- Cameron Church

 

Video Production & Editorial Comments (0), Tags:

Video Encoding 101 - From the Beginning

Posted by: Cameron Church, October 10, 2009

The series so far:

Introduction - http://blog.brightcove.com/global/2009/10/video-encoding-101-the-series.html

So to start any discussion we need to begin with the fundamentals.   Specifically what is the process of video encoding?   Or even more generally what is encoding and how does it apply to Online Video? 

In its most fundamental sense encoding is effectively the storage of data in a well defined format. An example of encoding is my writing this blog post: the thoughts and abstract ideas in my head (the data) are written (encoded) into words and sentences following a series of rules (the format) so that the reader, already skilled up with these rules, can receive this data (transmitted) and consume and understand it (decoded) at any time in the future.

As you can see the process of encoding has a prerequisite of needing data to work on. This packaged data can then be transmitted or stored well into the future to be unpackaged / decoded at any time by a reader that understands the rules and structure of the packing. The goal is a minimal loss in meaning of the data from the time it is encoded to when it is decoded and consumed.
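
The same encode, transmit, decode cycle can be sketched in a few lines of Python using text itself as the data. UTF-8 is my choice of format for this example, not something the discussion above depends on.

```python
# The "data": an idea expressed as text.
message = "Video is broadband messaging."

# Encode: turn the text into bytes following the rules of a format (UTF-8).
encoded = message.encode("utf-8")

# ...the bytes can now be stored or transmitted...

# Decode: a reader that knows the same rules recovers the original text.
decoded = encoded.decode("utf-8")

print(encoded)
print(decoded == message)
```

A reader applying the wrong rules (a different text encoding, say) may get garbage or an error back, which is exactly the loss in meaning that a well defined format is meant to avoid.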

The process is essentially the same where the data can be anything from abstract thought, to a snapshot of time or a period of time: all that differs is the effectiveness of the format used. 

  1. Written language is great for instant abstract thought streams but video is overkill (think Twitter)
  2. Audio recording is great for conversational data streams (the Telephone)
  3. Video recording is ideal for scenic data streams (YouTube)

The difference between the 3 examples above is the amount of data that needs to be transmitted in each scenario. Twitter has a maximum of 140 characters per post which, for comparison, is about 560 Bytes of information (at 4 bytes per character). A high definition video recording of a 3 minute timeline (the same time it would take me to come up with something interesting on Twitter) could easily take up over 50 MBytes - that's a size difference of roughly 100,000x in the data being transmitted!
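
The comparison can be checked with a few lines of arithmetic. The sketch below assumes 4 bytes per character (which is what turns 140 characters into 560 bytes) and takes 1 MByte as 10**6 bytes; both are my assumptions for the example. At exactly 50 MBytes the ratio lands a little under 90,000x, so the 100,000x order of magnitude holds once the video goes somewhat over 50 MBytes.

```python
TWEET_CHARS = 140
BYTES_PER_CHAR = 4                # assumption that yields the 560-byte figure
VIDEO_BYTES = 50 * 1000 * 1000    # 50 MBytes, taking 1 MByte as 10**6 bytes
VIDEO_SECONDS = 3 * 60            # the 3 minute timeline

tweet_bytes = TWEET_CHARS * BYTES_PER_CHAR
size_ratio = VIDEO_BYTES / tweet_bytes
video_mbps = VIDEO_BYTES * 8 / VIDEO_SECONDS / 1_000_000

print(f"Tweet: {tweet_bytes} bytes")
print(f"Video is roughly {size_ratio:,.0f}x larger")
print(f"Average video bitrate: {video_mbps:.2f} Mbps")
```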

The 4 base formats (Text, Images, Audio and Video) align in this order on a scale of effectiveness of data transmission given increasing data size. Or in other words, as you slide from the highly focused side of the scale to the highly contextual side, the amount of data you need to store to maintain message quality increases exponentially.

And when it comes to Video and its delivery over the web we hit our fundamental challenge - our Problem Number 1.   How do we take something that requires so much data to convey its message at a maintained quality level and push it down a pipe designed to deliver Text?  The mother of all Square Peg - Round Hole problems.

But I'm getting ahead of myself here.  Before we start down that road we need to just become a bit more familiar with Video as an encoding tool.  And more specifically around transcoding - the process of switching from one format to another.

In the next instalment I'll be breaking down Video encoding into its component parts.   Looking at the pieces in turn and setting the stage to start the investigation around how we can solve our Problem Number 1.

-- Cameron Church

 


Video Encoding 101 - The Series

Posted by: Cameron Church, October 6, 2009

It seems we've finally hit that critical mass. The novelty of online video is starting to wear off quickly and we're now left with acceptance and expectation from the online community that a website must have video as a core component of its design and offering.

Services such as the iPlayer and Hulu have made catch up TV the primary way to view episodic content for many teenagers and young adults.

The iPhone has made consuming the latest video on the move all the rage.

Even the governments of the world have embraced video as a core to their communication strategy.

But beneath all this flair and business uptake lie the open questions: what exactly is online video? How do we make it? And, possibly most importantly, how do we make it well?

No doubt you realise that video and its challenges have been around for quite some time. Just do a search for MPEG and read about its history all the way back to 1988 - an organisation set up to tackle the specific problems of digitising, compressing and delivering video.

For up until now digital video delivery was only a problem for the major broadcast houses that crammed every little bit down through the airwaves to your television box.   As long as the signal was strong and the picture true there was no problem.

But along came the internet. And time and again it ripped up the old paradigms in favour of the masses. Video was no exception - YouTube showed (ad nauseam) how much people craved having control of their broadcast and consumption. The success of Brightcove has shown how important video is to businesses of all kinds.

Video is here to stay - but what is it?  What does it mean to you?  And how can it add value to your users and bottom line?

In this series of posts I'm going to be looking at online video from the ground up.   Topics like:

  • What is Encoding?
  • What is Video Compression?
  • Why do I need to worry about all this?
  • HD vs High Definition - no they don't mean the same thing!
  • What is a CODEC?
  • Choosing the right CODEC
  • How to Build an Encoding Profile right for your business and users
  • Does Bigger Bitrate = Better Quality?  Yes and No
  • Tools of the Trade
  • Taking it to the next level - what business rules and implementation issues do I need to worry about?

My intention is to try and help demystify the dark art of online video creation.   Take it from the cathedral to the bazaar.   Give you the power to create a visual experience that keeps your users coming back for more.

Throughout this series I'll be focusing on one technology in particular - H.264 - don't worry if you don't know much about it now. I'll hopefully explain it well enough in the posts to come. It's by no means the only video encoding technology out there but it is one of the fastest rising stars currently.

To see the power of this particular technology is to believe.

Below is a player full of videos I've compressed down from High Res (8000Kbps) videos located here.  Download the source files, and view them locally.

Then have a look at the videos in the player - all come in 3 flavours: 1500Kbps, 750Kbps and 500Kbps - and see if you can tell the difference in quality. Note that even the most average of internet users can consume a video encoded at 500Kbps without any stuttering or loss of experience over their connection. You can serve this quality to your end users with minimal cost to either you or them.
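
To get a feel for what those bitrates mean in terms of data delivered, the sketch below converts each one into an approximate file size. The 3-minute clip length and the function name `file_size_mbytes` are hypothetical, and the calculation ignores container overhead and any separate audio track.

```python
def file_size_mbytes(bitrate_kbps, duration_seconds):
    """Approximate size of a stream: bitrate x duration, converted to MBytes.
    Assumes 1 Kbps = 1000 bits/s and 1 MByte = 10**6 bytes."""
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000

DURATION_SECONDS = 3 * 60  # hypothetical 3-minute clip

# The source bitrate followed by the three flavours offered in the player.
for bitrate_kbps in (8000, 1500, 750, 500):
    size = file_size_mbytes(bitrate_kbps, DURATION_SECONDS)
    print(f"{bitrate_kbps:>5} Kbps -> {size:6.2f} MBytes")
```

Under these assumptions the 500Kbps flavour moves one sixteenth of the data of the 8000Kbps source for the same running time.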

This series is intended to be a conversation - I encourage you to use the comments section of the various posts to let me know your thoughts. Give us your tips & tricks. Share your experiences.

The industry and user expectation is moving on rapidly.   Getting the video right for your project is paramount.  Follow along as I try to navigate you to the promised land.

 


H264 encoding using Mediacoder

Posted by: Cameron Church, August 18, 2009

For some of our publishers and partners there exist very valid reasons to use their own external media transcoding workflow, which excludes the use of the BC3 Server-Side Transcoding functionality.

Those who need to really turn the screw on control of end quality and size will need to dig deep into the more (very) advanced levers available in the encoding process: things like the Profile, Level and Predictor Frames are just some of the things you need to tweak to really squeeze the most from your encoding.

For 99% of us this is just too much specialised detail to get our heads around.   Luckily the Internet is here to save us as a gateway to the real experts out there.

Here's something to try:
