Audio to video synchronization



Audio to video synchronization (also known as audio video sync, audio/video sync, AV-sync or lip sync; its absence is called lip sync error or lip-flap) refers to the relative timing of the audio (sound) and video (image) portions during creation, post-production (mixing), transmission, reception and playback processing. AV-sync can be an issue wherever sound and video have a timing-related cause and effect, for example in television, videoconferencing and film.

A digital or analog audio video stream or video file usually contains some sort of explicit AV-sync timing, either in the form of interleaved video and audio data or by explicit relative time-stamping of data. The processing of data must respect the relative data timing, e.g. by stretching between or interpolating received data. If the processing does not respect this timing, the AV-sync error will increase whenever data is lost, whether because of transmission errors or because of missing or mis-timed processing.

Incorrectly synchronized

There are different ways in which AV-sync can become incorrect:
*During creation, AV-sync errors happen because of:
**Internal AV-sync error: Different processing delays between image and sound in video camera and microphone. The AV-sync delay is normally fixed.
**External AV-sync error: If a microphone is placed far away from the sound source, the audio will be out of sync because the speed of sound is much lower than the speed of light. If the sound source is 340 meters from the microphone, then the sound arrives approximately 1 second later than the light. The AV-sync delay increases with distance.
*During mixing of video clips, normally either the audio or the video needs to be delayed so that they are synchronized. The AV-sync delay is static, but can vary with the individual clip.
*Video editing effects.
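
The external AV-sync error described above is simple arithmetic: distance divided by the speed of sound. The following is a minimal sketch, assuming a speed of sound of 343 m/s (dry air at roughly 20 °C; the article's 340 m example implies a similar value) and treating light as arriving instantly. The function name and the sample distances are illustrative only.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 degrees C


def acoustic_delay_ms(distance_m: float,
                      speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Return the time sound needs to travel distance_m, in milliseconds.

    The travel time of light over the same distance is ignored because it
    is negligible at these scales.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return distance_m / speed_m_per_s * 1000.0


if __name__ == "__main__":
    # Illustrative microphone distances in meters.
    for distance in (1, 10, 34, 340):
        print(f"{distance:>4} m -> {acoustic_delay_ms(distance):7.1f} ms")
```

At 340 m the computed delay is close to one second, in line with the example in the list.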

Examples of how transmission (broadcasting), reception and playback can cause AV-sync errors:
*A video camera with built-in microphones or line-in may not delay sound and video paths by the same number of milliseconds. A video camera should have some sort of explicit AV-sync timing put into the video and audio streams. Solid state video cameras (e.g. CCD and CMOS image sensors) can delay the video signal by one or more frames.
*An AV-stream may get corrupted during transmission because of electrical glitches (wired) or wireless interruptions, which may cause it to become out of sync. The AV-sync delay normally increases with time.
*Television systems make extensive use of audio and video signal processing circuitry with significant delays. Widely used video signal processing circuitry that contributes significant video delay includes frame synchronizers, digital video effects processors, video noise reduction, format converters and MPEG preprocessing.
*The video monitor processing circuit may delay the video stream. Pixelated displays require video format conversion and deinterlace processing, which can add one or more frames of video delay.
*A video monitor with built-in speakers or line-out may not delay sound and video paths by the same number of milliseconds. Some video monitors contain internal user-adjustable audio delays to aid in correction of errors.
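
A fixed video delay of this kind is typically compensated by delaying the audio by the same amount. The sketch below shows one possible way to do that in software, assuming a 48 kHz sample rate and a video path that is late by a whole number of frames. The class `AudioDelayLine` and the helper `video_delay_ms` are hypothetical names for illustration, not the design of any particular monitor or synchronizer.

```python
from collections import deque


def video_delay_ms(frames: int, frame_rate_hz: float) -> float:
    """Convert a video delay given in frames to milliseconds."""
    return frames * 1000 / frame_rate_hz


class AudioDelayLine:
    """Delay audio samples by a fixed time using a FIFO buffer."""

    def __init__(self, delay_ms: float, sample_rate_hz: int = 48_000):
        # Number of samples that corresponds to the requested delay.
        self.delay_samples = round(delay_ms * sample_rate_hz / 1000)
        # Pre-fill with silence so the first outputs are delayed zeros.
        self._buffer = deque([0.0] * self.delay_samples)

    def process(self, samples):
        """Push new samples in and return the same number of delayed samples."""
        delayed = []
        for sample in samples:
            self._buffer.append(sample)
            delayed.append(self._buffer.popleft())
        return delayed


if __name__ == "__main__":
    # Assumed case: the display adds two frames of delay at 25 frames per second.
    line = AudioDelayLine(delay_ms=video_delay_ms(2, 25.0))
    print(line.delay_samples, "samples of audio delay")
```

The buffer is pre-filled with silence, so the output stays the same length as the input and the delay remains constant across successive calls to `process`.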

MPEG-2: Presentation Time Stamp (PTS), Decode Time Stamp (DTS)

Presentation Time Stamps (PTS) can be embedded in MPEG-2 to avoid AV-sync drift. Unfortunately these time stamps are often added after the video undergoes frame synchronization, format conversion and pre-processing, thus those delays remain uncompensated. [ [http://www.chiariglione.org/mpeg/faq/mp2-sys/mp2-sys.htm#mp2-19 MPEG-2 Systems FAQ: 19. Where are the PTSs and DTSs inserted?] ] [ [http://lists.mplayerhq.hu/pipermail/mplayer-g2-dev/2003-May/000004.html MPlayer-G2-dev: mpeg container's timing (PTS values)] ] [ [http://www.birds-eye.net/definition/d/dts-decode_time_stamp.shtml birds-eye.net: DTS - Decode Time Stamp] ] [ [http://www.svcd2dvd.com/Guides/AVSync/default.aspx svcd2dvd.com: Perfect AV Sync: Preparation is key...] ]
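
As an illustration of how such time stamps can be compared, the sketch below computes the offset between an audio PTS and a video PTS. It relies on two details of MPEG-2 Systems that are not stated above: PTS values count ticks of a 90 kHz clock and are 33 bits wide, so they wrap around. The function names and the sign convention (positive means the audio is stamped later than the video) are assumptions made for this example.

```python
PTS_CLOCK_HZ = 90_000   # MPEG-2 PTS/DTS tick rate
PTS_MODULUS = 1 << 33   # PTS is a 33-bit counter and wraps around


def pts_diff_ticks(a: int, b: int) -> int:
    """Return the signed difference a - b in ticks, allowing for wraparound.

    The result is the shortest signed distance, in the range
    [-2**32, 2**32).
    """
    half = PTS_MODULUS // 2
    return (a - b + half) % PTS_MODULUS - half


def av_offset_ms(audio_pts: int, video_pts: int) -> float:
    """Return the audio-minus-video presentation offset in milliseconds."""
    return pts_diff_ticks(audio_pts, video_pts) * 1000 / PTS_CLOCK_HZ


if __name__ == "__main__":
    # Audio stamped 4500 ticks after the video.
    print(av_offset_ms(94_500, 90_000))
    # Same calculation across the 33-bit wraparound point.
    print(av_offset_ms(900, PTS_MODULUS - 900))
```

An offset computed this way only reflects the stamped timing. Delays introduced before the time stamps were inserted remain invisible to it, as noted above.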

Viewer experience of incorrectly synchronized AV-sync

The result typically leaves a filmed or televised character moving their mouth with no spoken dialog to accompany it, because the dialog has been removed or changed in post-production; hence the terms "lip flap" and "lip-sync error". The resulting audio video sync error can annoy the viewer, reduce the viewer's enjoyment of the program and the program's effectiveness, and even lead to negative perceptions of the speaker.

This loss of effectiveness is of particular concern for product commercials and political candidates. [http://www.pixelinstruments.tv/pdf/Articles/Effects%20of%20Audio-Video%20Asynchrony.PDF] Television industry standards organizations have become involved in setting standards for audio video sync errors. See for example ATSC Document IS-191. [http://www.atsc.org/standards/is_191.pdf]

Because of these annoyances, AV-sync error is of concern to the television programming industry, including television stations, networks, advertisers and program production companies.

Effect of no explicit AV-sync timing

When a digital or analog audio video stream does not have some sort of explicit AV-sync timing, these effects will cause the stream to become out of sync:
*In film, these timing errors are most commonly caused by worn film skipping over the movie projector sprockets because the film has torn sprocket holes.
*Errors can also be caused by the projectionist misthreading the film in the projector, although this is rare with competent projectionists.
*Audio to Video Synchronization is commonly corrected and maintained with an audio synchronizer. Television industry standards organizations have established acceptable amounts of audio and video timing errors and suggested practices related to maintaining acceptable timing. [http://www.atsc.org/news_information/newsletter/ATSC_Newsletter-11.pdf]
*A/V sync errors are becoming a significant problem in the digital television industry because of the use of large amounts of video signal processing in television production, television broadcasting and pixelated television displays such as LCD, DLP and plasma displays.
*In the television field, audio video sync problems are commonly caused when significant amounts of video processing are performed on the video part of the television program.
*Typical sources of significant video delays in the television field include video synchronizers and video compression encoders and decoders. Particularly troublesome encoders and decoders are used in MPEG compression systems utilized for broadcasting digital television and storing television programs on consumer and professional recording and playback devices.
*A source of significant video delay is found in pixelated television displays (LCD, plasma display, DLP), which use complex video signal processing to convert the resolution of the incoming video signal to the native resolution of the pixelated display, for example converting standard definition video to be displayed on a high definition display. "Lip-flap" may exceed 200 ms at times.
*In broadcast television, it is not unusual for lip-sync error to vary by over 100 ms (several video frames) from time to time.
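
To relate the millisecond figures above to video frames, the error can be multiplied by the frame rate. The sketch below is a minimal illustration; the frame rates listed are common broadcast rates chosen as assumptions for the example, and the function name is hypothetical.

```python
def error_in_frames(error_ms: float, frame_rate_hz: float) -> float:
    """Express an AV-sync error given in milliseconds as a number of frames."""
    return error_ms * frame_rate_hz / 1000.0


if __name__ == "__main__":
    # Assumed frame rates: 25 (PAL), 30000/1001 (NTSC) and 50 frames per second.
    for rate in (25.0, 30000 / 1001, 50.0):
        for error_ms in (100, 200):
            frames = error_in_frames(error_ms, rate)
            print(f"{error_ms} ms at {rate:.2f} fps = {frames:.2f} frames")
```

At 25 frames per second, a 100 ms error corresponds to 2.5 frames and a 200 ms error to 5 frames.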

Sources/references

See also

*Dubbing (filmmaking)
*Audio synchronizer
*MuEv
*Lip sync

External links

* [http://broadcastengineering.com/audio/broadcasting_managing_lip_sync/index.html]
* Further detailed information on lip sync error and audio synchronizers may be found by searching for these terms at the United States Patent and Trademark Office web site at [http://patft.uspto.gov/netahtml/PTO/search-bool.html].


Wikimedia Foundation. 2010.
