Codecs
Overview
VoIP ships two tiers of audio codecs:
| Extra required | Codecs available |
|---|---|
| audio (includes numpy) | PCMA (G.711 A-law), PCMU (G.711 µ-law) |
| hd-audio (includes numpy and pyav) | + G.722, Opus |
Install the minimal tier for telephony deployments that do not need PyAV or FFmpeg:
pip install voip[audio]
Install the full tier for wideband / Opus support via FFmpeg:
pip install voip[hd-audio]
SD audio
These codecs work without PyAV and require only numpy.
voip.codecs.pcma.PCMA
Bases: RTPCodec
G.711 A-law codec (RFC 3551 §4.5.14).
PCMA is the ITU-T G.711 A-law logarithmic companding codec for PSTN telephony, standardised in RFC 3551 with static payload type 8.
Both encode and decode interoperate bit-exactly with real RTP PCMA streams.
Source code in voip/codecs/pcma.py
voip.codecs.pcmu.PCMU
Bases: RTPCodec
G.711 µ-law codec (RFC 3551 §4.5.14).
PCMU is the ITU-T G.711 µ-law logarithmic companding codec for PSTN telephony, standardised in RFC 3551 with static payload type 0.
Both encode and decode are pure-NumPy and require no PyAV dependency.
Source code in voip/codecs/pcmu.py
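To make the companding step concrete, the following is a minimal, dependency-free sketch of G.711 µ-law encoding and decoding for single 16-bit samples. It is not the library's NumPy implementation: the function names linear_to_ulaw and ulaw_to_linear are illustrative, and the constants (bias 132, clip 32635) follow the widely used reference algorithm.

```python
BIAS = 0x84   # 132, added before locating the segment
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample to one mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(abs(sample), CLIP) + BIAS

    # Find the segment (exponent): position of the highest set bit, 7..0.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


# Digital silence encodes to 0xFF and decodes back to 0.
print(hex(linear_to_ulaw(0)), ulaw_to_linear(0xFF))  # 0xff 0
```

A whole payload can then be built with bytes(linear_to_ulaw(s) for s in pcm16), where pcm16 is an assumed iterable of 16-bit integer samples.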
HD audio
These codecs require the hd-audio extra (pip install voip[hd-audio]), which pulls in PyAV.
voip.codecs.g722.G722
Bases: PyAVCodec
G.722 wideband audio codec (RFC 3551 §4.5.2).
G.722 is an ITU-T ADPCM wideband codec. Although it samples audio at 16 000 Hz, its RTP timestamp clock runs at 8 000 Hz per RFC 3551, a well-known quirk of the original specification.
The entire buffer is encoded at once to preserve the ADPCM predictor state across packet boundaries.
Source code in voip/codecs/g722.py
voip.codecs.opus.Opus
Bases: PyAVCodec
Opus audio codec (RFC 7587).
Opus is a highly flexible codec for interactive real-time speech and audio transmission. It uses dynamic payload type 111 and always operates at 48 000 Hz internally.
Incoming RTP payloads are wrapped in a minimal Ogg container before
being passed to PyAV. Outbound PCM is encoded via libopus.
Source code in voip/codecs/opus.py
Registry
voip.codecs.get(encoding_name)
Get a codec class by its SDP encoding name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoding_name | str | SDP encoding name, case-insensitive (e.g. …) | required |
Returns:
| Type | Description |
|---|---|
| type[RTPCodec] | Matching codec class. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | When no registered codec matches encoding_name. |
Source code in voip/codecs/__init__.py
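The lookup behaviour described above can be sketched as a small case-insensitive registry. This is an illustration only: the stand-in classes, the _REGISTRY dictionary, and the error message are assumptions, not the library's actual internals.

```python
class RTPCodec:
    """Stand-in for the library's base class (illustrative only)."""

    encoding_name: str = ""


class PCMU(RTPCodec):
    encoding_name = "pcmu"


class PCMA(RTPCodec):
    encoding_name = "pcma"


# Hypothetical registry keyed by the lowercase SDP encoding name.
_REGISTRY: dict[str, type[RTPCodec]] = {
    codec.encoding_name: codec for codec in (PCMU, PCMA)
}


def get(encoding_name: str) -> type[RTPCodec]:
    """Return the codec class for an SDP encoding name, ignoring case."""
    try:
        return _REGISTRY[encoding_name.lower()]
    except KeyError:
        raise NotImplementedError(f"Unsupported codec: {encoding_name!r}") from None


print(get("PCMU").__name__)  # PCMU
```

Because codecs are referenced as classes and never instantiated, the registry stores and returns the class objects themselves.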
Base classes
voip.codecs.base.RTPCodec
Base class for RTP audio codecs.
Concrete implementations: Opus, G722, PCMA, PCMU.
All codec implementations are stateless: every method is a classmethod or staticmethod and codecs are referenced as type[RTPCodec], never instantiated.
Concrete subclasses define codec-specific class variables and override decode, encode, and optionally packetize.
Subclasses may use the shared PyAV-backed helpers or implement decode and encode using alternative backends such as NumPy.
Subclasses that produce variable-length output across frames (e.g. G.722 ADPCM) should override packetize to encode the whole buffer at once and preserve predictor state.
Subclasses that require PyAV additionally inherit from PyAVCodec.
Source code in voip/codecs/base.py
channels
class-attribute
Channel count (1 = mono, 2 = stereo).
encoding_name
class-attribute
SDP encoding name (lowercase).
frame_size
class-attribute
Audio samples per 20 ms RTP frame at sample_rate_hz.
payload_type
class-attribute
RTP payload type number.
rtp_clock_rate_hz
class-attribute
RTP timestamp clock rate in Hz (may differ from sample_rate_hz).
sample_rate_hz
class-attribute
Actual audio sample rate in Hz.
timestamp_increment
class-attribute
RTP timestamp ticks per frame at rtp_clock_rate_hz.
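The relationship between sample_rate_hz, rtp_clock_rate_hz, frame_size, and timestamp_increment can be worked through for a 20 ms frame. The sketch below is an illustration, not library code: frame_parameters and CODEC_RATES are hypothetical names, and the rates are the ones given on this page and in RFC 3551 / RFC 7587.

```python
FRAME_MS = 20  # frame duration used throughout this page


def frame_parameters(sample_rate_hz: int, rtp_clock_rate_hz: int) -> tuple[int, int]:
    """Return (frame_size, timestamp_increment) for one 20 ms frame."""
    frame_size = sample_rate_hz * FRAME_MS // 1000
    timestamp_increment = rtp_clock_rate_hz * FRAME_MS // 1000
    return frame_size, timestamp_increment


# (sample_rate_hz, rtp_clock_rate_hz) per codec
CODEC_RATES = {
    "PCMU": (8000, 8000),
    "G722": (16000, 8000),   # audio at 16 kHz, RTP clock at 8 kHz
    "opus": (48000, 48000),
}

for name, (sample_rate_hz, rtp_clock_rate_hz) in CODEC_RATES.items():
    print(name, frame_parameters(sample_rate_hz, rtp_clock_rate_hz))
# PCMU (160, 160)
# G722 (320, 160)
# opus (960, 960)
```

G.722 is the only entry where the two values differ: each frame carries 320 samples, yet the RTP timestamp advances by 160 ticks.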
decode(payload, output_rate_hz, *, input_rate_hz=None)
classmethod
Decode an RTP payload to float32 mono PCM.
Override in subclasses to implement codec-specific decoding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | bytes | Raw RTP payload bytes. | required |
| output_rate_hz | int | Target sample rate in Hz. | required |
| input_rate_hz | int \| None | Input clock rate override in Hz, or … | None |
Returns:
| Type | Description |
|---|---|
| ndarray | Float32 mono PCM array at output_rate_hz Hz. |
Source code in voip/codecs/base.py
encode(samples)
classmethod
Encode float32 mono PCM to an RTP payload.
Override in subclasses to implement codec-specific encoding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | ndarray | Float32 mono PCM at … | required |
Returns:
| Type | Description |
|---|---|
| bytes | Encoded bytes for one RTP payload. |
Source code in voip/codecs/base.py
packetize(audio)
classmethod
Encode audio and yield one encoded payload per 20 ms RTP frame.
The default implementation encodes one frame_size chunk at a time
using encode. Override in subclasses (e.g. G.722) where the entire
buffer must be encoded at once to preserve codec state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| audio | ndarray | Float32 mono PCM at … | required |
Yields:
| Type | Description |
|---|---|
| bytes | Encoded payload bytes, one per RTP packet. |
Source code in voip/codecs/base.py
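The default chunk-by-chunk behaviour can be sketched as a generator. This is a simplified standalone version, not the library's classmethod: toy_encode is a hypothetical stand-in encoder, frame_size is passed explicitly, and passing a trailing partial frame through unchanged is an assumption.

```python
from typing import Callable, Iterator, Sequence


def toy_encode(chunk: Sequence[float]) -> bytes:
    """Hypothetical stand-in encoder: map [-1.0, 1.0] floats to unsigned 8-bit."""
    return bytes(min(255, max(0, int((s + 1.0) * 127.5))) for s in chunk)


def packetize(
    audio: Sequence[float],
    frame_size: int,
    encode: Callable[[Sequence[float]], bytes] = toy_encode,
) -> Iterator[bytes]:
    """Yield one encoded payload per frame_size chunk of audio."""
    for start in range(0, len(audio), frame_size):
        # Assumption: a short final chunk is encoded as-is rather than padded.
        yield encode(audio[start : start + frame_size])


payloads = list(packetize([0.0] * 400, frame_size=160))
print([len(p) for p in payloads])  # [160, 160, 80]
```

Encoding each chunk independently only works for codecs without cross-frame state, which is why a codec such as G.722 overrides this method.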
resample(audio, source_rate_hz, destination_rate_hz)
classmethod
Resample audio from source_rate_hz to destination_rate_hz.
Uses linear interpolation via numpy.interp.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| audio | ndarray | Float32 mono PCM array. | required |
| source_rate_hz | int | Sample rate of audio in Hz. | required |
| destination_rate_hz | int | Target sample rate in Hz. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Resampled float32 array at destination_rate_hz Hz, or audio unchanged when both rates are equal. |
Source code in voip/codecs/base.py
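Linear-interpolation resampling can be illustrated without NumPy. The sketch below is a dependency-free approximation of the idea, not the library's code: resample_linear is a hypothetical name, it works on plain lists, and the output-length rounding and end-of-signal clamping are assumptions chosen to mirror how numpy.interp holds the last value past the final sample.

```python
def resample_linear(
    audio: list[float], source_rate_hz: int, destination_rate_hz: int
) -> list[float]:
    """Resample by linearly interpolating between neighbouring samples."""
    if source_rate_hz == destination_rate_hz or not audio:
        return audio  # unchanged when both rates are equal

    n_out = round(len(audio) * destination_rate_hz / source_rate_hz)
    last = len(audio) - 1
    out = []
    for i in range(n_out):
        # Position of output sample i on the input sample grid.
        pos = i * source_rate_hz / destination_rate_hz
        if pos >= last:
            out.append(audio[last])  # clamp past the final input sample
            continue
        left = int(pos)
        frac = pos - left
        out.append(audio[left] + (audio[left + 1] - audio[left]) * frac)
    return out


print(resample_linear([0.0, 1.0, 2.0, 3.0], 8000, 16000))
# [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
```

Linear interpolation is cheap but applies no anti-aliasing filter, so downsampling content with energy above the new Nyquist frequency will alias.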
to_payload_format()
classmethod
Create an RTPPayloadFormat for SDP negotiation.
Uses rtp_clock_rate_hz as the SDP sample rate, which is correct
per RFC 3551 (e.g. G.722 advertises 8000 Hz in SDP even though the
actual audio runs at 16000 Hz).
Returns:
| Type | Description |
|---|---|
| RTPPayloadFormat | Payload format descriptor for this codec. |
Source code in voip/codecs/base.py
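To show why the RTP clock rate, not the audio sample rate, belongs in SDP, here is a small sketch that renders an a=rtpmap attribute. PayloadFormat is a hypothetical dataclass written for this example; the fields of the library's RTPPayloadFormat are not documented on this page and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PayloadFormat:
    """Hypothetical descriptor used only for this illustration."""

    payload_type: int
    encoding_name: str
    clock_rate_hz: int  # RTP clock rate, which may differ from the audio rate
    channels: int = 1

    def rtpmap(self) -> str:
        """Render the SDP a=rtpmap attribute for this format."""
        line = f"a=rtpmap:{self.payload_type} {self.encoding_name}/{self.clock_rate_hz}"
        if self.channels > 1:
            # The channel count is only written when it is not 1.
            line += f"/{self.channels}"
        return line


# G.722 advertises its 8000 Hz RTP clock even though audio runs at 16000 Hz.
print(PayloadFormat(9, "G722", 8000).rtpmap())  # a=rtpmap:9 G722/8000
```

Static payload type 9 for G.722 comes from RFC 3551; dynamic types such as 111 for Opus are agreed during negotiation.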
voip.codecs.av.PyAVCodec
Bases: RTPCodec
RTP codec that decodes and encodes audio via PyAV.
Concrete implementations: Opus, G722.
Source code in voip/codecs/av.py
decode_pcm(data, av_format, output_rate_hz, *, input_rate_hz=None)
classmethod
Decode raw audio bytes via PyAV into float32 mono PCM.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | bytes | Raw audio bytes in the codec's wire format. | required |
| av_format | str | PyAV format string (e.g. …) | required |
| output_rate_hz | int | Target sample rate in Hz. | required |
| input_rate_hz | int \| None | Input clock rate hint for the PyAV decoder, or … | None |
Returns:
| Type | Description |
|---|---|
| ndarray | Float32 mono PCM array at output_rate_hz Hz. |
Source code in voip/codecs/av.py
encode_pcm(samples, av_codec_name, sample_rate_hz)
classmethod
Encode float32 mono PCM to raw codec bytes via PyAV.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | ndarray | Float32 mono PCM array in the range … | required |
| av_codec_name | str | PyAV codec name (e.g. …) | required |
| sample_rate_hz | int | Sample rate of samples in Hz. | required |
Returns:
| Type | Description |
|---|---|
| bytes | Encoded audio bytes. |
Source code in voip/codecs/av.py