X264 Vs Xvid Comparison Essay


Eighth MPEG-4 AVC/H.264 Video Codecs Comparison - Standard Version

MSU Graphics & Media Lab (Video Group)

Video group head: Dr. Dmitriy Vatolin

Project head: Dr. Dmitriy Kulikov

Measurements, analysis: Marat Arsaev

REPORT UPDATED!
It now contains appendices with a GPU encoders comparison and a Very High Speed Encoders comparison.

Different Versions of Report

There are two different versions of the H.264 Comparison 2012 report. Here is a comparison of the versions:

The Pro version of the comparison becomes available immediately after purchasing the report.

Report Overview

Video Codecs that Were Tested

Overview

Sequences

Table 1. Summary of video sequences
Sequence            Number of frames   Frame rate   Resolution

VideoConference (5 sequences)
Deadline            1374               30           352x288
Developers 4CIF     3600               30           640x480
Developers 720p     1500               30           1280x720
Presentation        548                30           720x480
Business            493                30           1920x1080

Movies (10 SD sequences)
Ice Age             2014               24           720x480
City                600                60           704x576
Crew                600                60           704x576
Indiana Jones       5000               30           704x288
Harbour             600                60           704x576
Ice Skating         480                60           704x576
Soccer              600                60           704x576
Race Horses         300                30           832x480
State Enemy         6500               24           720x304
Party Scene         500                50           832x480

HDTV sequences (16 sequences)
Park Joy            500                50           1280x720
Riverbed            250                25           1920x1080
Rush Hour           500                25           1920x1080
Blue Sky            217                25           1920x1080
Station             313                25           1920x1080
Stockholm           604                50           1280x720
Sunflower           500                25           1920x1080
Tractor             690                25           1920x1080
Bunny               600                24           1920x1080
Dream               600                24           1920x1080
Troy                300                24           1920x1072
Water Drops         535                30           1920x1080
Capitol             600                30           1920x1080
Parrots             600                30           1920x1080
Citybus             600                30           1920x1080
Underwater          600                30           1920x1080

Objectives and Testing Tools

H.264 Codec Testing Objectives

The main goal of this report is the presentation of a comparative evaluation of the quality of new H.264 codecs using objective measures of assessment. The comparison was done using settings provided by the developers of each codec. The main task of the comparison is to analyze different H.264 encoders for the task of transcoding video—e.g., compressing video for personal use. Speed requirements are given for a sufficiently fast PC; fast presets are analogous to real-time encoding for a typical home-use PC.

H.264 Codec Testing Tools

Overall Conclusions

Overall, the leader in this comparison for software encoders is x264, followed by MainConcept, DivX H.264 and Elecard.

The overall ranking of the software codecs tested in this comparison is as follows:

  1. x264
  2. MainConcept
  3. DivX H.264
  4. Elecard
  5. Intel Ivy Bridge QuickSync
  6. XviD
  7. DiscretePhoton
  8. MainConcept CUDA

This ranking is based only on the encoders’ quality results; encoding speed is not considered here.

Professional Versions of Comparison Report

H.264 Comparison Report Pro 2012 version contains:

Acknowledgments

The Graphics & Media Lab Video Group would like to express its gratitude to the following companies for providing the codecs and settings used in this report:

The Video Group would also like to thank these companies for their help and technical support during the tests.

Thanks


Codec Analysis and Tuning for Codec Developers and Codec Users

Computer Graphics and Multimedia Laboratory of Moscow State University:

We can perform the following tasks for codec developers and codec users:

Strong and Weak Points of Your Codec

Independent Codec Evaluation Against Other Codecs for Different Use Cases

Encoder Features Implementation Optimality Analysis

We perform analysis of encoder feature effectiveness (the speed/quality trade-off) that can lead to up to a 30% improvement in the speed/quality characteristics of your codec. We can help you tune your codec and find the best encoding parameters.

Contact Information


Having played around with video since the days when I had a few multimedia CD-ROMs and a BT878-based TV tuner card, I have long been amazed by video compression. I watched as early “simple” compression efforts such as Cinepak and Indeo brought multimedia to CD-ROM drives running at 1x to 2x, good enough for interactive encyclopedias and music video clips. The quality wasn’t as good as TV, but it was constrained by the computing power available then.

Because of the continual increase in computing power, I watched as MPEG-1 brought VCDs with VHS-like quality into the same amount of storage as is normally taken by uncompressed CD-quality audio. Then MPEG-2 heralded the era of the DVD, SVCD and most DVB-T/DVB-S transmissions, with a claimed doubling of compression efficiency. Before long, MPEG-4/H.263 (ASP) was upon us, with another doubling, enabling a lot of “internet” video (e.g. DivX/Xvid). Another bump was achieved with MPEG-4/H.264 (Part 10 – AVC), which improved efficiency to the point where standard-definition “near-DVD-quality” video could fit into the same sort of space as CD-quality audio.

Throughout the whole journey, I have been doing my own video comparisons, but mostly empirically by testing out several settings and seeing how I liked them. In the “early” days of each of these standards, it was a painful but almost necessary procedure to optimize the encoding workflow and achieve the required quality. I had to endure encode rates of about an hour for each minute of video when I first started with MPEG-1, then with MPEG-2, MPEG-4 ASP, and then MPEG-4 AVC. Luckily, the decode rates were often “sufficiently fast” to be able to render the output in real-time.

Developments in compression don’t stop. Increased computing power allows more sophisticated algorithms to be implemented. Increasing use of internet distribution and continual pressure on storage and bandwidth provide motivation to transition to an even more efficient form of compression, trading off computational time for better efficiency. Higher resolutions, such as UHD 4K and 8K, are likely to demand such improvements to become mainstream and to avoid overtaxing the limited bandwidth available in distribution channels.

The successor, at least in the MPEG suite of codecs, is MPEG-H Part 2, otherwise known as High Efficiency Video Coding (HEVC) or H.265. This standard was first completed in 2013 and is slowly seeing adoption owing to the increase in 4K cameras and smartphone SoCs with built-in hardware-accelerated decoding/encoding, and it promises another near-halving of bitrate for the same perceptual quality. Unfortunately, licensing appears to be one of the areas holding HEVC back.

Of course, it’s not the only “next generation” codec available. VP9 (from Google) directly competes with HEVC and has been shown by some to have superior encoding speed and similar video performance, although support is more limited. Its successor has been rolled into AOMedia Video 1, which is somewhat obscure at this time. From the Xiph.Org team there is Daala, and from Cisco there is Thor. However, in my opinion, none of these codecs has quite reached the “critical mass” of adoption needed to become as hardware-embraced and universally accessible as the MPEG suite of codecs.

I did some initial informal testing on H.265 using x265 late last year, but it was not particularly extensive because of time limitations and needing to complete my PhD. As a result, I didn’t end up writing anything about it. This time around, I’ve decided to be a little more scientific to see what would turn up.

Before I go any further, I’ll point out that video compression testing is an area where there are many differing opinions and objections to certain types of testing and certain sorts of metrics. As a science, it’s quite imprecise because the human physiological perception of video isn’t fully understood, so there are many dissenting views. There are also many settings that can be altered in the encoding software which can impact the output quality, and some people have very strong opinions about how some things should be done. The purpose of this article isn’t to debate such issues, although where there are foreseeable objections, I will enclose some details in blockquotes, such as this paragraph.

Motivation

The main motivation of the experiment was to understand more about how x265 compares in encoding efficiency compared to x264. Specifically, I was motivated by this tooltip dialog in Handbrake that basically says “you’re on your own.”

As a result, I had quite a few questions I wanted to answer in as short a time as possible:

  • What is the approximate bitrate scale for the CRF values and how does it differ for x264 vs. x265?
  • How does this differ for content that’s moderately easy to encode, and others which are more difficult?
  • How do x264 CRF values and x265 CRF values compare in subjective and synthetic video quality benchmarks?
  • What are the encoding speed differences for different CRF values (and consequently bitrates), and how does x264 speed compare to x265 speed?
  • How do my different CPUs compare in terms of encoding speed?
  • Does x265 handle interlaced content properly?

As a result, I had to develop a test methodology to try and address these issues.

Methodology

Two computers running Windows 7 (updated to the latest set of patches at publication) were used throughout the experiment: an AMD Phenom II x6 1090T BE @ 3.9GHz was used to encode the “difficult case” set of clips, and an Intel i7-4770k @ 3.9GHz was used to encode the “average case” set of clips. The encoding software was Handbrake 0.10.5 64-bit edition. The x264 encoding was performed by x264 core 142 r2479 dd79a61, and the x265 encoding was performed by x265 1.9.

The test clips were encoded with Handbrake in H.264 and H.265 for comparison at 11 different CRF values, evenly spaced from 8 to 48 inclusive (i.e. spaced by 4). For both formats, the preset was set to Very Slow, and encoder tuning was not used. The H.264 profile selected was High/L4.1, whereas for H.265 the profile selected was Main. It was later determined that the H.265 level was L5, so there is some disparity in the feature sets; however, High/L4.1 is most common for Blu-ray-quality 1080p content, and a matching setting was not available in Handbrake for x265. In the additional options, interlace=tff was used for the difficult case to correspond with the interlaced status of the content. No picture processing (cropping, deinterlacing, detelecining, etc.) within Handbrake was enabled.
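For reference, the same CRF sweep could be scripted rather than driven through the Handbrake GUI (which is what was actually used here). The sketch below assumes HandBrakeCLI is installed and that its -e/-q/--encoder-preset/--encoder-profile/--encoder-level/--encopts options behave as in the 0.10.x documentation; filenames are placeholders.

    # Hypothetical batch-encode sketch using HandBrakeCLI (the encodes in this
    # post were done through the Handbrake GUI). Flag names follow the
    # HandBrakeCLI 0.10.x documentation and should be verified against your build.
    import subprocess

    SOURCE = "average_case_source.mkv"      # assumed input filename
    CRF_VALUES = range(8, 49, 4)            # 8, 12, ..., 48 (11 values)

    for encoder in ("x264", "x265"):
        for crf in CRF_VALUES:
            cmd = [
                "HandBrakeCLI",
                "-i", SOURCE,
                "-o", f"out_{encoder}_crf{crf}.mp4",
                "-e", encoder,               # video encoder
                "-q", str(crf),              # constant quality (CRF)
                "--encoder-preset", "veryslow",
            ]
            if encoder == "x264":
                cmd += ["--encoder-profile", "high", "--encoder-level", "4.1"]
            # for the interlaced "difficult case" clip, interlace=tff was passed
            # via the advanced encoder options, i.e.:
            # cmd += ["--encopts", "interlace=tff"]
            subprocess.run(cmd, check=True)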

Final bitrates were determined using Media Player Classic – Home Cinema’s information dialog and confirmed with MediaInfo. Encoding rate was determined from the encode logs. As the AMD system was my “day to day” system, it was in use during several encodes resulting in outlying reduced encode rate numbers. These have been marked as outliers.

The encoded files and the source file were then transcoded into lossless FFV1 AVI files using FFmpeg (version N-80066-g566be4f built by Zeranoe) for comparison (noting that no colourspace conversion occurred; the files remained YUV 4:2:0). This was done because unusual behaviour was observed otherwise, resulting in implausible SSIM/PSNR figures. Frame alignment of the files was verified using VirtualDub by checking for scene-change frames – in the case of the “difficult case” video, the first frame of the source file was discarded, as Handbrake did not encode that frame, to maintain video length and frame alignment. The “average case” video did not need any adjustments.
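The FFV1 intermediate step looks roughly like this (a minimal sketch assuming ffmpeg is on the PATH; filenames are placeholders):

    # Lossless FFV1 intermediate transcode; -pix_fmt yuv420p makes the
    # "no colourspace conversion" requirement explicit.
    import subprocess

    def to_ffv1(src: str, dst: str) -> None:
        subprocess.run([
            "ffmpeg", "-y",
            "-i", src,
            "-c:v", "ffv1",          # lossless FFV1 video
            "-pix_fmt", "yuv420p",   # stay in YUV 4:2:0, no RGB conversion
            "-an",                   # audio is not needed for the comparison
            dst,
        ], check=True)

    to_ffv1("out_x265_crf20.mp4", "out_x265_crf20_ffv1.avi")
    to_ffv1("source.mkv", "source_ffv1.avi")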

Pairs of files were compared for SSIM and PSNR using the following command:

ffmpeg -i [test] -i [ref] -lavfi "ssim;[0:v][1:v]psnr" -f null -

Results were recorded and reported. Produced data is available in the Appendix at the end of this post. If it is not visible, please click the more link to access it.
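To batch this over all of the encodes, the command above can be wrapped and FFmpeg’s summary lines scraped from stderr. This is a sketch only – the “All:” (ssim) and “average:” (psnr) log formats are those of recent FFmpeg builds and may need adjusting:

    # Run the ssim/psnr comparison and pull the summary figures from stderr.
    import re
    import subprocess

    def compare(test: str, ref: str):
        proc = subprocess.run(
            ["ffmpeg", "-i", test, "-i", ref,
             "-lavfi", "ssim;[0:v][1:v]psnr", "-f", "null", "-"],
            stderr=subprocess.PIPE, text=True)
        log = proc.stderr
        ssim = re.search(r"All:([\d.]+) \(([\d.inf]+)\)", log)    # SSIM and SSIM (dB)
        psnr = re.search(r"average:([\d.]+) min:([\d.]+)", log)   # PSNR avg / min
        return {
            "ssim": float(ssim.group(1)),
            "ssim_db": float(ssim.group(2)),
            "psnr_avg": float(psnr.group(1)),
            "psnr_min": float(psnr.group(2)),
        }

    print(compare("out_x265_crf20_ffv1.avi", "source_ffv1.avi"))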

Two frames from each video were extracted, and a 320×200 crop from a detailed section was assembled into a collage for still image comparison. The frames were chosen to be at least two frames away from a scene cut to avoid picking a keyframe. This was performed by extracting .bmp files with FFmpeg (a conversion from YUV 4:2:0 to RGB24), then assembling them in Photoshop and exporting to a lossless PNG to avoid corrupting the output.
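For those who prefer to stay entirely within FFmpeg, a single frame and crop can also be pulled out with the select and crop filters; the frame number and crop offsets below are placeholders (the collages in this post were assembled in Photoshop instead):

    # Extract one frame and a 320x200 crop as a lossless PNG.
    import subprocess

    def extract_crop(src: str, frame_no: int, x: int, y: int, dst: str) -> None:
        vf = f"select=eq(n\\,{frame_no}),crop=320:200:{x}:{y}"
        subprocess.run([
            "ffmpeg", "-y", "-i", src,
            "-vf", vf,
            "-frames:v", "1",   # write a single frame
            dst,                # .png keeps the sample lossless
        ], check=True)

    extract_crop("out_x264_crf20_ffv1.avi", 215, 600, 200, "avg1_x264_crf20.png")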

Subjective video quality was assessed using my Lenovo E431 laptop connected to a Kogan 50″ LED TV, which had previously been calibrated by eye to ensure highlights and shadows do not clip. Testing was done viewing at a distance of 2.5*H from the screen in a darkened room. Overscan correction was applied; however, all other driver-related enhancements were disabled. Frame-rate mode switching in MPC-HC was used to avoid software frame-rate conversion. TV motion smoothing was not available, thus ensuring the viewed result is consistent with the encoded data. Subjective opinions at each rate were recorded.

The clips used were:

Approximations of the clips used are linked above (YouTube), however, the actual video files differ slightly (especially with difficult case where the online video is missing a few tens of seconds). The encoding by YouTube is also relatively poor by comparison to the source. Unfortunately, as the source clips are copyrighted, I can’t distribute them.

The choice of the clips was for several reasons – I had good quality sources of both samples, which meant a better chance of seeing encoding issues; I was familiar with both clips; and both clips feature segments with high sharpness details. In the case of the difficult case, that clip is especially tricky to encode as the background has high spatial-frequency detail, whereas the “focal point” of the dancing girl-group members has relatively “low” frequency detail, so encoders often get it wrong and devote a lot of attention to the background. It also has a lot of flashing patterns which are quite “random” and require high bitrates to avoid turning into “mush”. (I did consider using T-ARA – Bo Peep as the difficult case clip, but that was mostly “fast cuts” increasing the difficulty, rather than any tricky imagery, plus my source quality was slightly lower.)

At this point, some people will object to the use of compressed material as the source. Common objections include the potential to favour H.264, as the material was H.264-coded before, and the potential loss of detail rendering high CRF encodes “meaningless”.

However, I think it’s important to keep in mind that if you expect the output to resemble the potentially imperfect result of the compressed input, this is less of an issue. The reference is the once-encoded video.

The second thing to note is that I’ve chosen sample clips I have with the highest bitrate and cleanest quality I have available – this maximises the potential for noticing encoding problems.

Thirdly, it’s also important to note that transcoding is a legitimate use of the codec – most people do not have the equipment to acquire raw footage and most consumer grade cameras already have compressed the footage. Other users are likely to be format-shifting and transcoding compressed to compressed. Thus testing in a compressed to compressed scenario is not invalid.

Results: Bitrate vs CRF

It’s an often-touted piece of advice that a change of CRF by +/- 6 will halve/double the bitrate. Suggested rate factors are normally around 19 to 23. Because I had no idea what a certain CRF value would produce bitrate-wise, and whether x265 adheres to the same convention, I found out by plotting the resulting bitrates on a semi-log plot and curve fitting.

In the case of the difficult case for x264, the bitrate at the upper end (CRF 8) fell off because it had reached the limits of the High@L4.1 profile/level. Aside from that, the lines are somewhat wavy but still close to an exponential function, with exponents ranging from -0.108 to -0.136.

From the curve fits, it appears that for x265 it takes a CRF movement of 5.09667 to 5.5899 to see a halving/doubling in size, while for x264 it took 5.68153 to 6.41801. It seems that x265 is slightly more sensitive to the CRF value in setting its bitrate (an average of ~5.34 as opposed to ~6.05).
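As a sketch of the curve-fitting step, fitting log2(bitrate) linearly against CRF gives the number of CRF steps per halving directly; the numbers below are the x265 average-case bitrates from the appendix, and the fitted exponent corresponds to the range quoted above:

    # Fit log2(bitrate) vs CRF; -1/slope is the CRF change per halving.
    import numpy as np

    crf = np.arange(8, 49, 4)
    bitrate = np.array([37545, 18834, 8407, 4315, 2504, 1520,
                        936, 575, 346, 212, 160], dtype=float)   # kbit/s

    slope, intercept = np.polyfit(crf, np.log2(bitrate), 1)
    print(f"CRF change per halving of bitrate: {-1/slope:.2f}")
    # Equivalent exponential model: bitrate ~ A * exp(k * CRF), k = slope * ln(2),
    # which lands in the exponent range quoted above.
    print(f"exponent k: {slope * np.log(2):.3f}")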

Readers may be concerned that my x264 examples involve using a different profile and level (High@L4.1) versus the x265 encodes (Main@L5). It is acknowledged that this will cap the output quality – in future, I’ll try to match the encode levels, but that is not directly configurable for x265 at present from Handbrake.

Results: Bitrate Savings at CRF Value

On the assumption that the CRF values correspond to the same quality of output, how much bitrate do we save? I tried to find out by comparing the bitrate values at given CRFs.

The answer is less straightforward than expected. For the difficult case, the x265 output averaged 92% of the x264 output size but varied quite a bit – in some cases at higher CRFs it was larger than the x264 output. The average case showed an average size of 59%, which is more in line with expectations and is mostly stable around the commonly used CRF range.
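The 92% and 59% figures are simply the per-CRF x265/x264 bitrate ratios averaged over the eleven CRF values, e.g.:

    # Per-CRF size ratios (bitrates in kbit/s, from the appendix tables).
    x264_avg  = [44940, 28489, 14837, 6964, 3795, 2325, 1509, 1022, 716, 517, 380]
    x265_avg  = [37545, 18834, 8407, 4315, 2504, 1520, 936, 575, 346, 212, 160]
    x264_diff = [57884, 50338, 36946, 25202, 16425, 10094, 5977, 3589, 2264, 1511, 1040]
    x265_diff = [83914, 58270, 39154, 25251, 15318, 8747, 4855, 2633, 1409, 975, 888]

    def mean_ratio(new, old):
        return sum(n / o for n, o in zip(new, old)) / len(new)

    print(f"average case:   x265 is {mean_ratio(x265_avg, x264_avg):.0%} of x264 size")
    print(f"difficult case: x265 is {mean_ratio(x265_diff, x264_diff):.0%} of x264 size")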

Then, naturally, comes the actual question of whether the CRF values provide the same perceived quality.

Results: SSIM and PSNR

There are two main methods used to evaluate video quality – namely Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR). These metrics are widely used and are easily accessible thanks to FFmpeg filters. Their characteristics differ somewhat, with SSIM attempting to be more perceptual, so it’s helpful to look at both.

At this point, many encodists may point out the existence of many other, potentially better, video quality judgement schemes. Unfortunately, they’re less easily accessible, they’re less widely used, and there will almost certainly be debates as to whether they correlate with perception or not.

This area is continually being contested, so I’d rather stick to metrics which have been widely used and whose caveats are known to some extent. In the case of SSIM and PSNR, one of the biggest disadvantages to my knowledge is that they make no temporal assessment of quality. They are also source-material sensitive and are not very valid when comparing across different codecs. Of course, we can’t rely solely on synthetic benchmarks.

We first take a look at the SSIM versus CRF graph. In this graph using the normalized (to 1) scale of SSIM, we can see the quality “fall-off” as CRF values are increased. The slope is steeper for the difficult case clips compared to the average case. In the case of the average case, the SSIM is almost tit-for-tat x265 vs x264 at each CRF value with the exception of CRF 48. Between the difficult case clips, there is a ~0.015 quality difference favouring x264.

For fun, we can also plot this against bitrate to see what happens. In the average case, the lines are very close together, and the quality takes an abrupt turn for the worse at about 4Mbit/s. In all but the highest bitrates, x265 has an advantage. The difficult case shows a less pronounced knee, and has x264 leading. A potential explanation for this can be seen in the subjective viewing section.

To see differences in the high end more clearly, we can plot the dB value of SSIM. We can see that at lower CRFs (<20) for the average case, x264 actually pulls ahead with a higher SSIM. Whether this is visible, or even a positive impact, will need to be checked, as cross-codec comparisons are not as straightforward.
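The dB scale here is the usual -10*log10(1 - SSIM) transform, which FFmpeg’s ssim filter reports in parentheses next to the raw value; it stretches out the differences near SSIM = 1:

    # SSIM to dB conversion used in these plots.
    import math

    def ssim_to_db(ssim: float) -> float:
        return -10.0 * math.log10(1.0 - ssim)

    print(ssim_to_db(0.987726))   # ~19.11 dB, matching the x265 CRF 16 row in the appendix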

Repeating for bitrate, we see the same sort of story as we saw with the normalized values.

Looking at the PSNR behaviour shows that there are only minor differences throughout, with an exception at the lowest CRF. The minimum PSNR also seems to “level out” at high CRF values, so the “difference” in quality between the best and worst frames is lower. In all, there’s really no big difference between PSNRs for the average case between x264 and x265 on a CRF value basis.

The difficult case shows a fairly similar result, without major differences except at the low-CRF end, where H.264 profile restrictions prevented the bitrate from going any higher, limiting the achievable PSNR. Interestingly, the PSNR variance for x264 increased as the CRF was lowered to the point of hitting the bitrate limits – so while the average PSNR is better, the worst frame was more poorly encoded to make that happen.

Plotting the same plots versus bitrate doesn’t reveal much more.

It seems on the whole, both PSNR and SSIM metrics achieved similar values for corresponding x264 and x265 CRF values. As a result, at least from a synthetic quality standpoint, the quality of x264 and x265 encodes at the same CRF are nearly identical, implying a bitrate saving averaging 41% can be achieved in the average case (and just 8% for the difficult case).

Results: Encode Rate

Of course, with every bitrate saving comes a compute penalty, so it’s time to work that out.

First, by plotting against CRF values, we can see that the Intel machine that encoded the “average case” files was much faster than the older AMD machine that encoded the “difficult case” files. Interestingly, the encode speed increased as the CRF increased (i.e. lower bitrates) for the Intel machine, but it didn’t show as strong a relationship for the AMD machine. The fall-off in encode rate as CRF increased to 48 may have to do with reaching “other” resource limitations within the CPU.

The same thing is plotted versus the resulting bitrate. Overall, the encode rates (excluding purple outlier data points) show that x265 achieves on average just 15.7% of the speed of x264 on the Intel machine, and 4.8% of the speed on the AMD machine. Users of older machines are probably best off sticking to x264 because of the significant speed difference. The difference in the encode rates at lower bitrates/higher CRFs may be due to different performance optimizations and cache sizes between the CPUs.

This also highlights a potential pitfall for buyers deciding whether to upgrade or not, and are basing their decision on a single metric such as CPUBenchmark scores. In our case:

CPU                          CPUBenchmark score   Scaled for clock rate (3.9GHz)
AMD Phenom II x6 1090T BE    5676 @ 3.2GHz        6918
Intel Core i7-4770k          10131 @ 3.5GHz       11289

This would mean that we would expect that the i7-4770k would perform at 163% of the AMD PhenomII x6 1090T BE. In reality, it performed at 213% on x264 and 637% on x265. Quite a big margin of difference.
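The expectation above is just linear clock-rate scaling of the CPUBenchmark scores, e.g.:

    # Linear clock-rate scaling of CPUBenchmark scores to 3.9GHz.
    scores = {
        "AMD Phenom II x6 1090T BE": (5676, 3.2),   # (score, stock clock in GHz)
        "Intel Core i7-4770k":       (10131, 3.5),
    }
    TARGET_CLOCK = 3.9

    scaled = {cpu: score * TARGET_CLOCK / clock for cpu, (score, clock) in scores.items()}
    for cpu, s in scaled.items():
        print(f"{cpu}: {s:.0f} @ {TARGET_CLOCK}GHz (scaled)")

    ratio = scaled["Intel Core i7-4770k"] / scaled["AMD Phenom II x6 1090T BE"]
    print(f"expected Intel/AMD ratio: {ratio:.0%}")   # ~163%, vs 213% (x264) and 637% (x265) observed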

Results: Still Image Samples

Let’s take a look at some selected still image samples to see how the different CRFs compare. I suppose publishing small image samples for the purpose of illustrating the encoding quality is fair use … and while I could theoretically use artificially generated clips or self-shot clips, I don’t think that would represent the quality and characteristics of a professionally produced presentation, which would skew the encoding results.

Yes, I know, you’re going to scream at me because the human eye doesn’t perceive video as “each frame being a still picture” and some of the quality degradation might not be noticeable. But hey, this is the next best thing …

Average Case #1

This is frame #215 from the source, where SinB stares inquisitively into a sideways camera. This frame is chosen due to pure detail, especially in shadows.

For x265, starting at CRF 20, I can notice some alterations in hair structure where some of the finer hairs have been “spread” slightly. Even CRF 16 isn’t immune to this, but its image quality is good. CRF 12 is indistinguishable from the source. CRF 24 continues the quality slide and makes it a bit blotchy, whereas CRF 28 obviously corrupts the eyebrows as well, which are now just a smear, and subtle details in the eyebrows and lower eyelid edge are missing.

The character of x264 is different: the initial impairments are not primarily detail loss; instead, edges seem to gain noise. At CRF 20, the hair has some odd coloured blocks, and the skin edge seems to be tainted with edge-colour issues. The hair is slightly smoother than at CRF 16, which appears much sharper and “straighter”. CRF 24 makes a royal mess of the hair, turning it into blotches, and CRF 28 turns it into an almost solid block while losing details in the eyebrows and eyelid.

Average Case #2

This is frame #4484 from the source, a bridge scene where the members of Gfriend are seen running across. The scene is particularly sharp, and the bars of the bridge form a difficult encoding challenge, with high detail in the planks and the water running below.

The x265 encode at CRF 16 seems indistinguishable for the most part. However, at CRF 20, Yuju’s finger has a “halo” to the left of it, and Sowon’s red pants are starting to “merge” into the bars of the bridge somewhat. CRF 24 seems to worsen the halos around the fingers; noise around heads passing the concrete can now be seen, and the pants merging with the bridge bars is getting worse. CRF 28 is obviously starting to smooth a lot, and blockiness is obvious in the pants.

For x264, the impairments at CRF 28 were more sparkles and blocky posterization/quilting. CRF 24 showed a “pre-echo” of Yuju’s finger as well, which disappeared at CRF 20. CRF 20 appears to have lost some detail in the concrete beam behind, but isn’t bad at all.

Difficult Case #1

This is frame #1092, where Jessica (now ex-member of Girls’ Generation) had a solo shot. The frame was chosen because of the high detail in the eyes and hair.

Unfortunately, in the case of this clip, some of the detail was already lost in the encoding at the “source”, so we need to compare with an obviously degraded original.

For x265, the most obvious quality losses begin at about CRF 24 where the hair to the side seems to go slightly flatter in definition and some of the original blockiness (a desirable quality) is lost. By CRF 28, the hair looks like it’s pasted on with the loose strands being a little ill defined, and CRF 32 causes her to lose her eyebrows entirely.

For x264, CRF 20 maintains some of the original blockiness, but CRF 24 is visibly less defined in the hair in terms of that original blockiness. The difference is very minor, but by CRF 28 a similar loss of hair fidelity is seen; instead, it looks a little sharper though much noisier.

Difficult Case #2

This was frame #5827 where Yoona (left) and Tiffany (right) are dancing in front of the LED display board.

In the x265 case, in light of the messiness of the source, even CRF 24 looks acceptable. By CRF 28, Yoona’s almost completely lost her eyebrows and most of the facial definition, whereas Tiffany’s nose has a secondary “echo” outline. By comparison, the x264 encode looks a bit sharper, with some more visual noise around the facial features as if they’ve been sharpened resulting in some bright noise spots in CRF 24 and CRF 28. This clip is particularly tough to judge.

Summary

The still image samples seem to show that the CRF necessary to attain visually acceptable performance varies as a function of the input material. This is not unexpected. For the clearer, simpler material, CRF 12 was indistinguishable, CRF 16 was extremely good and CRF 20 was considered acceptable. For the more complex material, CRF 20 was considered good, and CRF 24 was considered somewhat acceptable.

Results: Subjective Viewing

I spent quite a few hours in front of my large TV checking out the quality of the video. In this way, the temporal quality and perception-based quality of the videos can be assessed.

On the whole, I would have to agree that the x264 CRF values produce very similar acceptance levels to x265. I would probably accept CRF 12 as being visually lossless for the average case material, CRF 16 as hard-to-discern near-lossless and CRF 20 as “watchable”. This is because I’m especially picky when it comes to quality and minor flaws when I watch material that I’m familiar with (and I always wonder how people put up with YouTube and other streaming services which so obviously haven’t got enough bitrate).

The key difference is the type of impairments that occur with x264 vs x265. Under bitrate starvation, x264 appears to be sharper and goes into a blocky mode of degradation, preferring to retain sharp details even if it makes the picture look noisy. In contrast, x265 starts smoothing areas of lower detail while “popping” sharpness into the areas that have finer details. This does sometimes look a bit unnatural. It also starts dropping small motion, resulting in motion artifacts and jumpiness, but on the whole this might be slightly less objectionable depending on your personal opinion.

With the difficult case data, the picture is a bit different: CRF 16 is visually indistinguishable, and CRF 20 is almost indistinguishable. I would have to agree that x264 is better for this case and appeared more visually clean even at higher CRFs. This seems to be because the noise in x264 is “disguised” better in the patterning of the LED lights, whereas the smoothing in x265 becomes more obvious.

But a second, and more important issue, is the presence of a field oddity post-deinterlacing for the x265 clips, especially at CRF > 20.

The oddity results in “stripes” appearing every n pixels vertically as if there is something wrong with the fields there.

Examining the lossless FFV1-decoded file produced by FFmpeg seems to show that the encoded result actually does have the oddity in the fields. The reason for it isn’t clear at this stage, but it may be related to an encoding-unit block boundary condition of sorts, or a poor implementation of interlaced encoding. Whatever the case, it makes interlaced files at CRF > 20 difficult to watch, especially during panning sequences.

This may help explain why the SSIM/PSNR values were smoother than in the “average” case and were lower – these errors were not critical to the comparison, but they are very temporally evident patterns.

Speaking of interlaced video, it’s a sad fact of life that we still have to deal with it, due to the storage of old videos and due to some cameras still recording true interlaced content despite the majority of the world using progressive displays. Apparently H.265 supports interlaced encoding, although there was some confusion. One naive solution that some users may think of is simply to deinterlace the video first and then encode it. The problem is that you will lose information through deinterlacing – if you’re going from 50 fields per second to 25 frames per second, you’ve lost half the temporal information. If you frame-double, then you can keep the temporal resolution but will have to generate the missing field for each frame – computationally intensive and potentially introducing artifacts. It can also result in a file that is incompatible with many players, and if your motion compensation/prediction algorithm is poor, you might lose sharpness in some areas. I personally prefer to keep each format (progressive/interlaced) in its respective form through to the final display stage, where the “best” deinterlacing for the situation can be applied.
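For what it’s worth, the two deinterlacing strategies described above map onto FFmpeg’s yadif modes (mode=0 for one frame per frame pair, mode=1 for one frame per field). This is only a sketch of that trade-off; no deinterlacing was applied for the tests in this post, and the filenames and encode settings below are placeholders:

    # yadif mode=0: 50i -> 25p (temporal information halved)
    # yadif mode=1: 50i -> 50p (frame-doubled)
    import subprocess

    def deinterlace(src: str, dst: str, frame_double: bool) -> None:
        mode = "1" if frame_double else "0"
        subprocess.run([
            "ffmpeg", "-y", "-i", src,
            "-vf", f"yadif=mode={mode}",
            "-c:v", "libx264", "-crf", "18",   # illustrative re-encode settings
            dst,
        ], check=True)

    deinterlace("interlaced_source.m2ts", "deint_25p.mp4", frame_double=False)
    deinterlace("interlaced_source.m2ts", "deint_50p.mp4", frame_double=True)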

However, as it turns out, the difficult case video is a Blu-Ray standard video, but it isn’t native interlaced material at all despite being 29.97fps. It’s 23.976fps that’s gone through a telecine process to make it 29.97fps. Why they would do such a thing, I don’t know, as Blu-ray supports 23.976p natively.

Conclusion

After a week and a bit of encoding and playing around with things, I think there are some rather interesting results.

On the whole, for the average case, x265 produced a bitrate of about 59% of that of x264 at the same CRF. The CRF value sensitivity of x265 was slightly higher than x264, at about +/- 5.34 for a doubling/halving rather than +/- 6.05. Synthetically, the corresponding CRF values produced very similar SSIM and PSNR values for both x264 and x265, so the same “rules of thumb” might be applied, although the bitrate saving will vary depending on the specific CRF selected.

Encode rates for x265 were significantly slower than x264, as is to be expected due to the increased computational complexity. However, it seemed that higher CRF values/lower bitrates were much faster to encode on modern hardware (possibly due to better cache use). This wasn’t reflected on my older AMD Phenom II based system (possibly due to differences in instruction sets and optimization).

Subjectively speaking, I’d have to say CRF 12 is indistinguishable and CRF 16 is good enough for virtually all cases. For the less discerning, CRF 20 is probably fine for watching, but CRF 24 is beginning to become annoying and CRF 28 is about the limit of what could be considered acceptable. The result seems to be consistent across x264 and x265, although (unexpectedly) the difficult case seemed to tolerate higher CRF values, probably because the harsh patterns were not as easily resolved by the eye and noise was less easily seen. As a result, even having a “rule of thumb” CRF can be hard, as it depends on the viewer, viewing equipment, source characteristics and sensitivity to artifacts.

Unfortunately, it seems that the “difficult case” data is really hard to interpret. This appears to be because x265 isn’t very good about handling interlaced content, and by using the “experimental” feature, the output wasn’t quite correct as seen in the subjective viewing. As a result, the synthetic benchmarks may have been reflective of the strange field blending on the edge of blocks resulting in a loss of fidelity that only resolved at fairly high quality values (CRF <=20). As a result, the mature x264 encoder was much more adept at handling interlaced content correctly, and I suppose we should take the difficult case data as being “atypical” and not representative of what properly encoded interlaced H.265 video would be like.

It looks like I’ve got another round of encoding ahead for testing the difficult case – as I discovered that the material was actually 23.976fps pulled up to 29.97fps, I’ll perform an inverse telecine on it and encode the progressive output to see what happens. This time, I’ll use a matching H.264 profile and level for consistency as well. With any luck, the results might be more consistent with the average case.
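One possible route for that inverse telecine is FFmpeg’s fieldmatch/decimate chain (a sketch only, with a placeholder filename; Handbrake’s own detelecine filter would be an alternative):

    # fieldmatch reconstructs the progressive frames, yadif (deint=interlaced)
    # cleans up any frames that could not be matched, and decimate drops the
    # duplicate to take 29.97fps back down to 23.976fps.
    import subprocess

    subprocess.run([
        "ffmpeg", "-y", "-i", "difficult_case_telecined.m2ts",
        "-vf", "fieldmatch,yadif=deint=interlaced,decimate",
        "-c:v", "ffv1",          # keep the IVTC'd intermediate lossless
        "difficult_case_23p976_ffv1.avi",
    ], check=True)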

Appendix: Table of Data

x265 - Average Case

CRF   Bitrate (kbit/s)   SSIM       SSIM (dB)   PSNR (avg)   PSNR (min)   fps
8     37545              0.995211   23.197804   51.363782    46.213167    0.40533746
12    18834              0.991562   20.737413   48.847675    43.550076    0.571760862
16    8407               0.987726   19.110261   47.004687    39.892563    0.847564601
20    4315               0.983769   17.896507   45.373892    36.558467    1.094531205
24    2504               0.978525   16.680696   43.635386    33.687427    1.289760018
28    1520               0.971082   15.388295   41.74252     30.954092    1.596318738
32    936                0.96073    14.059367   39.729187    28.777697    1.947472064
36    575                0.94749    12.797598   37.674602    27.641992    2.371282528
40    346                0.931052   11.614763   35.585072    27.202179    2.881390385
44    212                0.911069   10.509464   33.549725    26.877077    3.672094814
48    160                0.892783   9.697347    31.99061     26.683718    3.85216387

x265 - Difficult Case

CRF   Bitrate (kbit/s)   SSIM       SSIM (dB)   PSNR (avg)   PSNR (min)   fps
8     83914              0.994746   22.79505    47.241291    41.851423    0.150718854
12    58270              0.991083   20.4976     44.149468    38.15019     0.161308334
16    39154              0.98508    18.262177   41.057735    34.518727    0.177631469
20    25251              0.97563    16.131467   38.046595    30.987662    0.194232091
24    15318              0.961485   14.143724   35.185632    27.964271    0.093471634
28    8747               0.941772   12.348706   32.649622    25.227401    0.235893164
32    4855               0.915883   10.751168   30.507201    22.978117    0.236307704
36    2633               0.881909   9.277838    28.55634     21.244047    0.246284614
40    1409               0.839445   7.943765    26.817959    20.279733    0.297332074
44    975                0.80528    7.105894    25.836161    19.5892      0.245301742
48    888                0.791061   6.799803    25.438011    16.554815    0.234813531

x264 - Average Case

CRF   Bitrate (kbit/s)   SSIM       SSIM (dB)   PSNR (avg)   PSNR (min)   fps
8     44940              0.997364   25.791192   53.422334    47.377392    4.026217
12    28489              0.994607   22.681598   50.2906      44.523548    4.934854
16    14837              0.989849   19.934854   47.552363    41.977158    6.346291
20    6964               0.984727   18.16071    45.473564    37.964469    8.55346
24    3795               0.979337   16.848072   43.587039    34.802245    10.813952
28    2325               0.972033   15.53357    41.603963    31.924008    12.368113
32    1509               0.961974   14.199156   39.536816    29.436038    13.44052
36    1022               0.948866   12.912879   37.445484    27.7433      14.142429
40    716                0.932261   1.691611    35.328038    26.990109    14.590753
44    517                0.91127    10.519294   33.163862    25.74449     15.208363
48    380                0.8856     9.415741    30.925813    24.26836     15.538453

x264 - Difficult Case

CRF   Bitrate (kbit/s)   SSIM       SSIM (dB)   PSNR (avg)   PSNR (min)   fps
8     57884              0.997158   22.334516   45.401482    32.920994    2.67663
12    50338              0.993274   21.722554   44.885056    34.287518    2.975561
16    36946              0.990023   20.0101     42.79283     35.122098    3.339178
20    25202              0.983669   17.86938    39.792569    33.580799    3.86772
24    16425              0.972965   15.680784   36.641095    29.975891    4.494949
28    10094              0.95588    13.553604   33.580348    26.595447    5.168905
32    5977               0.931332   11.632443   30.855433    23.918678    5.85861
36    3589               0.898831   9.949544    28.509815    21.358371    6.376635
40    2264               0.858916   8.505224    26.492488    19.204705    6.556086
44    1511               0.812144   7.261751    24.741765    17.527457    6.636509
48    1040               0.762812   6.24908     23.281506    15.783191    6.661346
