1.5.0 is a major release of dav1d, that:
- WARNING: we removed some of the SSE2 optimizations, so if you care about
systems without SSSE3, you should be careful when updating!
- Add Arm OpenBSD run-time CPU feature
- Optimize index offset calculations for decode_coefs
- picture: copy HDR10+ and T35 metadata only to visible frames
- SSSE3 new optimizations for6-tap (8bit and hbd)
- AArch64/SVE: Add HBD subpel filters using128-bit SVE2
- AArch64: Add USMMLA implempentation for6-tap H/HV
- AArch64: Optimize Armv8.0 NEON for HBD horizontal filters and6-tap filters
- Power9: Optimized ITX till 16x4.
- Loongarch: numerous optimizations
- RISC-V optimizations for pal, cdef_filter, ipred, mc_blend, mc_bdir, itx
- Allow playing videos in full-screen mode in dav1dplay
1.4.3 is a small release focused on security issues
- AArch64: Fix potential out of bounds access in DotProd H/HV filters
- cli: Prevent buffer over-read
1.4.2 is a small release of dav1d, improving notably ARM, AVX-512and PowerPC
- AVX2 optimizations for8-tap andnew variants for6-tap
- AVX-512 optimizations for8-tap andnew variants for6-tap
- Improve entropy decoding on ARM64
- New ARM64 optimizations for convolutions based on DotProd extension
- New ARM64 optimizations for convolutions based on i8mm extension
- New ARM64 optimizations for subpel and prep filters for i8mm
- Misc improvements on existing ARM64 optimizations, notably for put/prep
- New PowerPC9 optimizations for loopfilter
- Support for macOS kperf API for benchmarking
1.4.0 is a medium release of dav1d, focusing on new architecture support and optimizations
- AVX-512 optimizations for z1, z2, z3 in 8bit and high-bitdepth
- New architecture supported: loongarch
- Loongarch optimizations for8bit
- New architecture supported: RISC-V
- RISC-V optimizations for itx
- Misc improvements in threading and in reducing binary size
- Fix potential integer overflow with extremely large frame sizes (CVE-2024-1580)
1.3.0 is a medium release of dav1d, focus on new APIs and memory usage reduction.
- Reduce memory usage in numerous places
- ABI break in Dav1dSequenceHeader, Dav1dFrameHeader, Dav1dContentLightLevel structures
- new API function to check the API version: dav1d_version_api()
- Rewrite of the SGR functions for ARM64 to be faster
- NEON implemetation of save_tmvs for ARM32 and ARM64
- x86 palette DSP for pal_idx_finish function
1.1.0 is an important release of dav1d, fixing numerous bugs, and adding SIMD
- New function dav1d_get_frame_delay to query the decoder frame delay
- Numerous fixes for strict conformity to the specs and samples
- NEON and AVX-512 misc fixes and improvements
- Partial AVX2 12bpc transform implementations
- AVX-512 high bit-depth cdef_filter, loopfilter, itx
- NEON z1/z3 optimization for8bpc
- SSSE3 z1 optimization for8bpc
0.9.2 is a small update of dav1d on the 0.9.x branch:
- x86: SSE4 optimizations of inverse transforms for10bit for all sizes
- x86: mc.resize optimizations with AVX2/SSSE3 for10/12b
- x86: SSSE3 optimizations for cdef_filter in 10/12b and mc_w_mask_422/444 in 8b
- ARM NEON optimizations for FilmGrain Gen_grain functions
- Optimizations for splat_mv in SSE2/AVX2 and NEON
- x86: SGR improvements for SSSE3 CPUs
- x86: AVX2 optimizations for cfl_ac
0.9.0 is a major version of dav1d, adding notably 10b acceleration on x64.
Details:
- x86 (64bit) AVX2 implementation of most 10b/12b functions, which should provide
a large boost for high-bitdepth decoding on modern x86 computers and servers.
- ARM64 neon implementation of FilmGrain (4:2:0/4:2:2/4:4:48bit)
- New API to signal events happening during the decoding process
0.8.2 is a middle-size update of the 0.8.0 branch:
- ARM32 optimizations for ipred and itx in 10/12bits,
completing the 10b/12b work on ARM64 and ARM32
- Give the post-filters their own threads
- ARM64: rewrite the wiener functions
- Speed up coefficient decoding, 0.5%-3% global decoding gain
- x86 optimizations for CDEF_filter and wiener in 10/12bit
- x86: rewrite the SGR AVX2 asm
- x86: improve msac speed on SSE2+ machines
- ARM32: improve speed of ipred and warp
- ARM64: improve speed of ipred, cdef_dir, cdef_filter, warp_motion and itx16
- ARM32/64: improve speed of looprestoration
- Add seeking, pausing to the player
- Update the player for rendering of 10b/12b
- Misc speed improvements and fixes on all platforms
- Add a xxh3 muxer in the dav1d application
0.8.1 is a minor update on 0.8.0:
- Keep references to buffers valid after dav1d_close(). Fixes a regression
caused by the picture buffer pool added in 0.8.0.
- ARM32 optimizations for10bit bitdepth for SGR
- ARM32 optimizations for16bit bitdepth for blend/w_masl/emu_edge
- ARM64 optimizations for10bit bitdepth for SGR
- x86 optimizations for wiener in SSE2/SSSE3/AVX2
0.8.0 is a major update for dav1d:
- Improve the performance by using a picture buffer pool;
The improvements can reach 10% on some cases on Windows.
- Support for Apple ARM Silicon
- ARM32 optimizations for8bit bitdepth for ipred paeth, smooth, cfl
- ARM32 optimizations for10/12/16bit bitdepth for mc_avg/mask/w_avg,
put/prep 8tap/bilin, wiener and CDEF filters
- ARM64 optimizations for cfl_ac 444for all bitdepths
- x86 optimizations for MC 8-tap, mc_scaled in AVX2
- x86 optimizations for CDEF in SSE and {put/prep}_{8tap/bilin} in SSSE3
0.7.1 is a minor update on 0.7.0:
- ARM32 NEON optimizations for itxfm, which can give up to 28% speedup, and MSAC
- SSE2 optimizations for prep_bilin and prep_8tap
- AVX2 optimizations for MC scaled
- Fix a clamping issue in motion vector projection
- Fix an issue on some specific Haswell CPU on ipred_z AVX2 functions
- Improvements on the dav1dplay utility player to support resizing
0.7.0 is a major release for dav1d:
- Faster refmv implementation gaining up to 12% speed while -25% of RAM (Single Thread)
- 10b/12b ARM64 optimizations are mostly complete:
- ipred (paeth, smooth, dc, pal, filter, cfl)
- itxfm (only 10b)
- AVX2/SSSE3 for non-4:2:0 film grain andfor mc.resize
- AVX2 for cfl4:4:4
- AVX-512 CDEF filter
- ARM64 8b improvements for cfl_ac and itxfm
- ARM64 implementation for emu_edge in 8b/10b/12b
- ARM32 implementation for emu_edge in 8b
- Improvements on the dav1dplay utility player to support 10 bit,
non-4:2:0 pixel formats and film grain on the GPU
0.6.0 is a major release for dav1d:
- New ARM64 optimizations for the 10/12bit depth:
- mc_avg, mc_w_avg, mc_mask
- mc_put/mc_prep 8tap/bilin
- mc_warp_8x8
- mc_w_mask
- mc_blend
- wiener
- SGR
- loopfilter
- cdef
- New AVX-512 optimizations for prep_bilin, prep_8tap, cdef_filter, mc_avg/w_avg/mask
- New SSSE3 optimizations for film grain
- New AVX2 optimizations for msac_adapt16
- Fix rare mismatches against the reference decoder, notably because of clipping
- Improvements on ARM64 on msac, cdef, mc_blend_v and looprestoration optimizations
- Improvements on AVX2 optimizations for cdef_filter
- Improvements in the C version for itxfm, cdef_filter
0.5.2 is a small release improving speed for ARM32 and adding minor features:
- ARM32 optimizations for loopfilter, ipred_dc|h|v
- Add section-5 raw OBU demuxer
- Improve the speed by reducing the L2 cache collisions
- Fix minor issues
0.5.1 is a small release improving speeds and fixing minor issues
compared to 0.5.0:
- SSE2 optimizations for CDEF, wiener and warp_affine
- NEON optimizations for SGR on ARM32
- Fix mismatch issue in x86 asm in inverse identity transforms
- Fix build issue in ARM64 assembly if debug info was enabled
- Add a workaround for Xcode 11 -fstack-check bug
0.5.0 is a medium release fixing regressions and minor issues, and improving speed significantly:
- Export ITU T.35 metadata
- Speed improvements on blend_ on ARM
- Speed improvements on decode_coef and MSAC
- NEON optimizations for blend*, w_mask_, ipred functions for ARM64
- NEON optimizations for CDEF and warp on ARM32
- SSE2 optimizations for MSAC hi_tok decoding
- SSSE3 optimizations for deblocking loopfilters and warp_affine
- AVX2 optimizations for film grain and ipred_z2
- SSE4 optimizations for warp_affine
- VSX optimizations for wiener
- Fix inverse transform overflows in x86 and NEON asm
- Fix integer overflows with large frames
- Improve film grain generation to match reference code
- Improve compatibility with older binutils for ARM
- More advanced Player example in tools
- Fix playback with unknown OBUs
- Add an option to limit the maximum frame size
- SSE2 and ARM64 optimizations for MSAC
- Improve speed on 32bits systems
- Optimization in obmc blend
- Reduce RAM usage significantly
- The initial PPC SIMD code, cdef_filter
- NEON optimizations for blend functions on ARM
- NEON optimizations for w_mask functions on ARM
- NEON optimizations for inverse transforms on ARM64
- VSX optimizations for CDEF filter
- Improve handling of malloc failures
- Simple Player example in tools
- Fix a buffer overflow in frame-threading mode on SSSE3 CPUs
- Reduce binary size, notably on Windows
- SSSE3 optimizations for ipred_filter
- ARM optimizations for MSAC
This is the final release for the numerous speed improvements of 0.3.0-rc.
It mostly:
- Fixes an annoying crash on SSSE3 that happened in the itx functions
- Large improvement on MSAC decoding with SSE, bringing 4-6% speed increase
The impact is important on SSSE3, SSE4 and AVX2 cpus
- SSSE3 optimizations for all blocks size in itx
- SSSE3 optimizations for ipred_paeth and ipred_cfl (420, 422and444)
- Speed improvements on CDEF for SSE4 CPUs
- NEON optimizations for SGR and loop filter
- Minor crashes, improvements and build changes
- SSSE3 optimization for cdef_dir
- AVX2 improvements of the existing CDEF optimizations
- NEON improvements of the existing CDEF and wiener optimizations
- Clarification about the numbering/versionning scheme
- ARM64 and ARM optimizations using NEON instructions
- SSSE3 optimizations for both 32and64bits
- More AVX2 assembly, reaching almost completion
- Fix installation of includes
- Rewrite inverse transforms to avoid overflows
- Snap packaging for Linux
- Updated API (ABI and API break)
- Fixes for un-decodable samples
Initial release of dav1d, the fast and small AV1 decoder.
- Support for all features of the AV1 bitstream
- Support for all bitdepth, 8, 10and12bits
- Support for all chroma subsamplings 4:2:0, 4:2:2, 4:4:4 *and* grayscale
- Full acceleration for AVX2 64bits processors, making it the fastest decoder
- Partial acceleration for SSSE3 processors
- Partial acceleration for NEON processors
Messung V0.5 in Prozent
¤ Dauer der Verarbeitung: 0.12 Sekunden
(vorverarbeitet am 2026-06-05)
¤
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.