Lyra (codec)


Filename extension	.lyra
Developed by	Google
Initial release	2021
Latest release	1.3.2 (December 20, 2022)
Type of format	speech codec
Free format?	Yes (Apache-2.0)

Lyra is a lossy audio codec developed by Google that is designed for compressing speech at very low bitrates. Unlike most other audio formats, it compresses data using a machine learning-based algorithm.

Features

The Lyra codec is designed to transmit speech in real time when bandwidth is severely restricted, such as over slow or unreliable network connections.^[1] It runs at fixed bitrates of 3.2, 6, and 9 kbit/s and is intended to provide better quality than codecs that use traditional waveform-based algorithms at similar bitrates.^[2]^[3] Rather than coding the waveform directly, Lyra compresses speech with a machine learning algorithm that encodes the input via feature extraction and then reconstructs an approximation of the original using a generative model.^[1] This model was trained on thousands of hours of speech recorded in over 70 languages so that it works with a wide range of speakers.^[2] Because generative models are more computationally complex than traditional codecs, a simple model that processes different frequency ranges in parallel is used to obtain acceptable performance.^[4] Lyra imposes 20 ms of latency due to its frame size.^[3] Google's reference implementation is available for Android and Linux.^[4]
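The bitrates and the 20 ms frame size quoted above imply a small per-frame bit budget. The following is a minimal arithmetic sketch, assuming the quoted bitrate covers only the codec payload and that every frame is exactly 20 ms long; the helper name `bits_per_frame` is hypothetical and real packetization may differ.

```python
# Frame budget sketch: how many bits one 20 ms frame may carry at the
# three bitrates named in the text. Names here are illustrative only.

FRAME_MS = 20                      # frame size stated in the text
BITRATES_BPS = (3200, 6000, 9000)  # 3.2, 6, and 9 kbit/s


def bits_per_frame(bitrate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Return the number of payload bits available for one frame."""
    return bitrate_bps * frame_ms // 1000


budgets = {rate: bits_per_frame(rate) for rate in BITRATES_BPS}

for rate, bits in budgets.items():
    print(f"{rate / 1000:g} kbit/s -> {bits} bits per {FRAME_MS} ms frame")
```

Under these assumptions a frame carries 64, 120, or 180 bits, which shows how little room there is for describing the waveform sample by sample.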

Lyra version 1 quality comparison (audio samples): Original; Resampled to 16 kHz; Lyra at 3 kbps; Opus at 6 kbps; Speex at 3 kbps

Quality

Lyra's initial version performed significantly better than traditional codecs at similar bitrates.^[1]^[4]^[5] Ian Buckley at MakeUseOf said, "It succeeds in creating almost eerie levels of audio reproduction with bitrates as low as 3 kbps." Google claims that it reproduces natural-sounding speech, and that Lyra at 3 kbit/s beats Opus at 8 kbit/s.^[2] Tsahi Levent-Levi writes that Satin, Microsoft's AI-based codec, outperforms it at higher bitrates.^[5]

History

In December 2017, Google researchers published a preprint paper on replacing the Codec 2 decoder with a WaveNet neural network. They found that a neural network is able to extrapolate features of the voice not described in the Codec 2 bitstream and give better audio quality, and that the use of conventional features makes the neural network calculation simpler compared to a purely waveform-based network. Lyra version 1 would reuse this overall framework of feature extraction, quantization, and neural synthesis.^[6]

Lyra was first announced in February 2021,^[2] and in April, Google released the source code of their reference implementation.^[1] The initial version had a fixed bitrate of 3 kbit/s and around 90 ms latency.^[1]^[2] The encoder calculates a log mel spectrogram and performs vector quantization to store the spectrogram in a data stream. The decoder is a WaveNet neural network that takes the spectrogram and reconstructs the input audio.^[2]
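The vector quantization step can be illustrated with a toy nearest-codeword search. This is only a sketch of the general technique: the two-dimensional features, the four-entry codebook, and the function names are made up for illustration and do not reflect Lyra's trained codebooks or its spectrogram dimensions.

```python
# Toy vector quantizer: a feature vector is transmitted as the index of
# its nearest codebook entry. All values below are hypothetical.
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(features: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codeword closest to `features`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(features, codebook[i]))


def vq_decode(index: int, codebook: List[Vector]) -> Vector:
    """Look up the codeword that the decoder uses in place of the input."""
    return codebook[index]


# Four entries, so each index costs 2 bits in the data stream.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

frame_features = (0.9, 0.2)
index = vq_encode(frame_features, CODEBOOK)
reconstructed = vq_decode(index, CODEBOOK)
print(index, reconstructed)
```

Only the index travels over the network; the decoder holds the same codebook and recovers an approximation of the features from it.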

A second version (v2/1.2.0), released in September 2022, improved sound quality, latency, and performance, and permitted multiple bitrates. V2 uses a "SoundStream" structure in which both the encoder and decoder are neural networks, forming a kind of autoencoder. A residual vector quantizer converts the feature values into transmittable data.^[3]
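A residual vector quantizer chains several small codebooks so that each stage quantizes the error left by the stages before it. The sketch below shows the general technique only: the codebooks, dimensions, and function names are hypothetical and are not taken from Lyra or SoundStream.

```python
# Toy residual vector quantizer (RVQ). Each stage encodes whatever the
# previous stages failed to capture. All codebook values are made up.
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]
Codebook = List[Vector]


def nearest(vec: Sequence[float], codebook: Codebook) -> int:
    """Index of the codeword with the smallest squared distance to vec."""
    return min(range(len(codebook)),
               key=lambda i: sum((x - y) ** 2 for x, y in zip(vec, codebook[i])))


def rvq_encode(vec: Sequence[float], codebooks: List[Codebook]) -> List[int]:
    """Return one codeword index per stage."""
    residual = tuple(vec)
    indices = []
    for codebook in codebooks:
        i = nearest(residual, codebook)
        indices.append(i)
        # Subtract the chosen codeword; the next stage sees only the error.
        residual = tuple(r - c for r, c in zip(residual, codebook[i]))
    return indices


def rvq_decode(indices: List[int], codebooks: List[Codebook]) -> Vector:
    """Sum the selected codewords; fewer indices give a coarser result."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return tuple(out)


STAGES = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],              # coarse stage
    [(-0.25, -0.25), (-0.25, 0.25), (0.25, -0.25), (0.25, 0.25)],  # refinement
]

x = (0.8, 0.3)
codes = rvq_encode(x, STAGES)
coarse = rvq_decode(codes[:1], STAGES)  # first stage only
full = rvq_decode(codes, STAGES)        # both stages
print(codes, coarse, full)
```

In this toy setup, decoding with only the first index gives a rougher approximation than decoding with both, so the number of stages trades bits for accuracy. Whether Lyra selects its bitrates this way is not stated in the text above.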

Support

Implementations

Google's implementation is available on GitHub under the Apache License.^[1]^[7] Written in C++, it is optimized for 64-bit ARM but also runs on x86, on either Android or Linux.^[4]

Applications

Google Duo uses Lyra to transmit sound for video chats when bandwidth is limited.^[1]^[5]

References

1. Buckley, Ian (2021-04-08). "Google Makes Its Lyra Low Bitrate Speech Codec Public". MakeUseOf. Retrieved 2022-07-21.

2. "Lyra: A New Very Low-Bitrate Codec for Speech Compression". Google AI Blog. 2021-02-25. Retrieved 2022-07-21.

3. "Lyra V2 - a better, faster, and more versatile speech codec". Google Open Source Blog. Retrieved 2023-04-26.

4. "Google Duo uses a new codec for better call quality over poor connections". XDA. 2021-04-09. Retrieved 2022-07-21.

5. Levent-Levi, Tsahi (2021-04-19). "Lyra, Satin and the future of voice codecs in WebRTC". BlogGeek.me. Retrieved 2022-07-21.

6. Kleijn, W. B.; Lim, F. S.; Luebs, A.; Skoglund, J.; Stimberg, F.; Wang, Q.; Walters, T. C. (April 2018). "Wavenet based low rate speech coding". 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE. pp. 676–680. arXiv:1712.01120.

7. Google (2021). "Lyra: A Very Low-Bitrate Codec for Speech Compression". GitHub. Retrieved 2022-07-21.

External links

Lyra: A New Very Low-Bitrate Codec for Speech Compression Google blog post with a demonstration comparing codecs
