Quantization (sound processing)

In signal processing, quantization is the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively small set of discrete symbols or integer values. This article describes aspects of quantization related to sound signals.

After sampling, speech signals are usually represented by one of a fixed number of values, in a process known as pulse-code modulation (PCM). Some specific issues related to quantization of audio signals follow.

Audio quantization

Telephony applications frequently use 8-bit quantization. That is, values of the analogue waveform are rounded to the closest of 256 distinct voltage values represented by an 8-bit binary number. This crude quantization introduces substantial quantization noise into the signal, but the result is still more than adequate to represent human speech.
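
The rounding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the levels are taken to be uniformly spaced over a nominal range of -1.0 to 1.0 (real telephone codecs typically space their 256 levels non-uniformly), and the function names, the 997 Hz test tone, and the 48 kHz rate are choices made for the example rather than anything taken from a standard.

```python
import math

def quantize(x, bits):
    """Round x (nominal range -1.0..1.0) to the nearest of 2**bits uniform levels."""
    levels = 2 ** bits
    step = 2.0 / levels                       # spacing between adjacent levels
    code = math.floor(x / step + 0.5)         # index of the nearest level
    code = max(-levels // 2, min(levels // 2 - 1, code))  # clamp to the code range
    return code * step

def snr_db(bits, n=48000, freq=997.0, rate=48000.0):
    """Signal-to-quantization-noise ratio of a near-full-scale sine, in dB."""
    sig_power = 0.0
    err_power = 0.0
    for i in range(n):
        x = 0.99 * math.sin(2 * math.pi * freq * i / rate)
        e = quantize(x, bits) - x             # quantization error for this sample
        sig_power += x * x
        err_power += e * e
    return 10 * math.log10(sig_power / err_power)

snr8 = snr_db(8)
snr16 = snr_db(16)
print(f"8-bit:  {snr8:.1f} dB")
print(f"16-bit: {snr16:.1f} dB")
```

Each additional bit halves the step size, so the measured ratio should improve by roughly 6 dB per bit; the gap between the two printed figures reflects the eight extra bits.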

By comparison, compact discs use a 16-bit digital representation, allowing 65,536 distinct voltage levels. This is far better than telephone quantization, but CD audio representing low signal levels would still sound noticeably 'granular' because of the quantizing noise, were it not for the addition of a small amount of noise to the signal before digitization. This deliberately added noise is known as dither. Adding dither eliminates the granularity and gives very low distortion, at the expense of a small increase in noise level. Measured using ITU-R 468 noise weighting, the resulting noise is about 66 dB below alignment level, or 84 dB below digital full scale (FS), which is somewhat lower than the microphone noise level on most recordings and hence of no consequence (see Programme levels for more on this).
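
The effect of dither on a very low-level signal can be illustrated with a small experiment. The sketch below works in units of one quantization step (LSB) and assumes triangular-PDF dither spanning 2 LSB peak to peak; the tone amplitude of 0.4 LSB, the seed, and all names are hypothetical choices for the demonstration. Without dither the tone never crosses a rounding threshold and vanishes entirely; with dither, the average of the quantized output follows the original waveform.

```python
import math
import random

def quantize_lsb(x):
    """Round a value expressed in LSB units to the nearest integer level."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither, 2 LSB peak to peak (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)             # fixed seed so runs are repeatable
period = 48                        # samples per cycle of the test tone
cycles = 2000
signal = [0.4 * math.sin(2 * math.pi * i / period) for i in range(period)]

# Without dither every sample rounds to the same code.
plain = [quantize_lsb(s) for s in signal]

# With dither, average the output over many cycles to expose what survives.
averaged = [0.0] * period
for _ in range(cycles):
    for i, s in enumerate(signal):
        averaged[i] += quantize_lsb(s + tpdf_dither(rng)) / cycles

worst = max(abs(a - s) for a, s in zip(averaged, signal))
print("undithered codes:", sorted(set(plain)))
print("worst deviation of averaged dithered output (LSB):", round(worst, 3))
```

The individual dithered samples are noisier than the undithered ones, which is the small increase in noise level that the text describes.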

Optimizing dither waveforms

In a seminal paper published in the AES Journal, Lipshitz and Vanderkooy pointed out that different noise types, with different probability density functions (PDFs), behave differently when used as dither signals, and suggested optimal levels of dither signal for audio. Gaussian noise requires a higher level for full elimination of distortion than rectangular-PDF or triangular-PDF noise. Triangular-PDF noise eliminates distortion at a lower level of added noise than Gaussian noise and also minimizes 'noise modulation', that is, audible changes in the residual noise on low-level music that are found to draw attention to the noise.
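
Noise modulation can be made visible by measuring how the total error power depends on the input level. The following sketch is not taken from the paper; it assumes rectangular-PDF dither 1 LSB wide and triangular-PDF dither 2 LSB wide (commonly used widths), a constant input, and hypothetical helper names. With rectangular dither the error variance changes with the input, whereas with triangular dither it should stay close to 0.25 LSB² at every level.

```python
import math
import random

def error_variance(dc_input, dither, rng, n=100000):
    """Variance of (quantized output - input) for a constant input, in LSB^2."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        e = math.floor(dc_input + dither(rng) + 0.5) - dc_input
        total += e
        total_sq += e * e
    mean = total / n
    return total_sq / n - mean * mean

def rpdf(rng):
    """Rectangular-PDF dither, 1 LSB peak to peak."""
    return rng.uniform(-0.5, 0.5)

def tpdf(rng):
    """Triangular-PDF dither, 2 LSB peak to peak."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(1)             # fixed seed so runs are repeatable
results = {}
for name, dither in (("RPDF", rpdf), ("TPDF", tpdf)):
    for dc in (0.0, 0.25, 0.5):
        results[(name, dc)] = error_variance(dc, dither, rng)
        print(f"{name} input={dc:4.2f} LSB  error variance={results[(name, dc)]:.3f}")
```

A noise floor whose power rises and falls with the signal is what the ear picks up as modulation, which is why an input-independent variance is desirable.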

Noise shaping for lower audibility

An alternative to dither is noise shaping, a feedback process in which the final digitized signal is compared with the original, and the instantaneous errors on successive past samples are integrated and used to determine whether the next sample is rounded up or down. This smooths out the errors in a way that alters the spectral content of the noise. By inserting a weighting filter in the feedback path, the noise can be shifted to the regions of the 'equal-loudness contours' where the human ear is least sensitive, producing a lower subjective noise level (typically -68 to -70 dB, ITU-R 468 weighted).
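
A minimal sketch of the feedback idea, assuming the simplest first-order form with no weighting filter in the feedback path: the rounding error made on one sample is subtracted from the next input before it is rounded. The function names and the constant 0.3 LSB test input are illustrative assumptions. Because the output error becomes the difference between successive rounding errors, it averages out over time instead of accumulating.

```python
import math

def quantize_plain(samples):
    """Round each sample (in LSB units) independently."""
    return [math.floor(x + 0.5) for x in samples]

def quantize_shaped(samples):
    """First-order error feedback: subtract the previous rounding error
    from the next input before rounding it."""
    out = []
    err = 0.0                      # rounding error made on the previous sample
    for x in samples:
        v = x - err                # input corrected by the past error
        y = math.floor(v + 0.5)
        err = y - v                # error made on this sample
        out.append(y)
    return out

samples = [0.3] * 1000             # constant input lying between two levels
plain = quantize_plain(samples)
shaped = quantize_shaped(samples)

print("mean of plain output: ", sum(plain) / len(plain))
print("mean of shaped output:", sum(shaped) / len(shaped))
```

Plain rounding loses the input completely, while the shaped output alternates between adjacent levels so that its long-term average tracks the input. The error has been moved to higher frequencies rather than removed; a practical design would add the weighting filter mentioned above to decide where it lands.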

24-bit quantization

24-bit audio is sometimes used undithered, because in most audio equipment and situations the noise of the digital converter itself is higher than the level of any dither that would need to be applied.

There is some disagreement over the recent trend towards higher bit-depth audio. Some argue that the dynamic range offered by 16 bits is sufficient to store the dynamic range present in almost all music. In terms of pure data storage this is often true, as a high-end system can extract extremely good sound from the 16 bits stored on a well-mastered CD. However, audio with very loud and very quiet sections can require some of the above dithering techniques to fit into 16 bits. This is not a problem for most recently produced popular music, which is often mastered so that it sits constantly close to the maximum signal level (see loudness war); however, higher-resolution audio formats are already being used, especially for applications such as film soundtracks, where there is often a very wide dynamic range between whispered conversations and explosions.

For most situations, the advantages of audio at resolutions higher than 16 bits are mainly to do with processing. No digital filter is perfect, but if the audio is upsampled and the processing is done at 24 bits or higher, the distortion introduced by filtering is much quieter, because the errors creep into the least significant bits. A well-designed filter can also weight the distortion towards the higher, inaudible frequencies, but this requires a sample rate higher than 48 kHz so that these inaudible frequencies are available to soak up the errors.
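
The word-length part of this argument can be illustrated with a toy filter. The sketch below assumes a one-pole low-pass filter whose state is rounded to a given word length after every sample, and treats a double-precision run of the same filter as the ideal reference; the coefficient, test tone, and function names are hypothetical. It does not model upsampling or spectral weighting of the error.

```python
import math

def requantize(x, bits):
    """Round x (nominal range -1.0..1.0) to a grid with 2**bits levels."""
    step = 2.0 / (2 ** bits)
    return math.floor(x / step + 0.5) * step

def one_pole_lowpass(samples, coeff, bits=None):
    """y[n] = y[n-1] + coeff * (x[n] - y[n-1]); if bits is given, the
    filter state is rounded to that word length after every sample."""
    y = 0.0
    out = []
    for x in samples:
        y = y + coeff * (x - y)
        if bits is not None:
            y = requantize(y, bits)
        out.append(y)
    return out

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

tone = [0.5 * math.sin(2 * math.pi * 997.0 * i / 48000.0) for i in range(4800)]
reference = one_pole_lowpass(tone, 0.25)   # double precision, treated as ideal
err16 = rms_error(one_pole_lowpass(tone, 0.25, bits=16), reference)
err24 = rms_error(one_pole_lowpass(tone, 0.25, bits=24), reference)

print(f"16-bit filter state: RMS error {err16:.3e}")
print(f"24-bit filter state: RMS error {err24:.3e}")
```

The rounding error is fed back through the filter on every sample, so it scales with the step size of the internal word length; the 24-bit run should land far closer to the reference than the 16-bit one.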

There is also a good case for 24-bit (or higher) recording in the live studio, because it allows greater headroom (often 24 dB or more, rather than 18 dB) to be left on the recording without encountering quantization errors at low volumes. This means that brief peaks are not harshly clipped, but can be compressed or soft-limited later to suit the final medium.
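
The headroom trade-off is simple arithmetic, sketched here under the assumption of an ideal converter in which the usable range is just the ratio between full scale and one quantization step. Real converters add their own noise and fall short of these figures; the function names are illustrative.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

def range_below_nominal_db(bits, headroom_db):
    """Range left between the nominal recording level and one step."""
    return dynamic_range_db(bits) - headroom_db

for bits, headroom in ((16, 18), (24, 24)):
    print(f"{bits}-bit, {headroom} dB headroom: "
          f"{range_below_nominal_db(bits, headroom):.1f} dB below nominal level")
```

Each bit contributes about 6 dB, so 24 dB of headroom costs roughly four bits. At 24 bits that still leaves more range below the nominal level than a 16-bit recording has in total.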

Environments where large amounts of signal processing are required (such as mastering or synthesis) can require even more than 24 bits. Some modern audio editors convert incoming audio to 32-bit (both for an increased dynamic range to reduce clipping, and to minimize noise in intermediate stages of filtering), and some DAW environments (such as recent versions of REAPER and SONAR) use 64-bit audio for their underlying engine.

See also

* Quantization (signal processing)
* Pulse-code modulation
* Sampling (signal processing)

Wikimedia Foundation. 2010.


