Pipeline (software)

Pipeline (software)

In software engineering, a pipeline consists of a chain of processing elements (processes, threads, coroutines, "etc".), arranged so that the output of each element is the input of the next. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes or bits.

The concept is also called the pipes and filters design pattern. It was named by analogy to a physical pipeline.

Multiprocessed pipelines

Pipelines are often implemented in a multitasking OS, by launching all elements at the same time as processes, and automatically servicing the data read requests by each process with the data written by the upstream process. In this way, the CPU will be naturally switched among the processes by the scheduler so as to minimize its idle time. In other common models elements are implemented as lightweight threads or as coroutines, to reduce the OS overhead often involved with processes. Depending upon the OS, threads may be scheduled directly by the OS or by a thread manager. Coroutines are always scheduled by a coroutine manager of some form.

Usually, read and write requests are blocking operations, which means that the execution of the source process, upon writing, is suspended until all data could be written to the destination process, and, likewise, the execution of the destination process, upon reading, is suspended until at least some of the requested data could be obtained from the source process. Obviously, this cannot lead to a deadlock, where both processes would wait indefinitely for each other to respond, since at least one of the two processes will soon thereafter have its request serviced by the operating system, and continue to run.

For performance, most operating systems implementing pipes use pipe buffers, which allow the source process to provide more data than the destination process is currently able or willing to receive. Under most Unices and Unix-like operating systems, a special command is also available which implements a pipe buffer of potentially much larger and configurable size, typically called "buffer". This command can be useful if the destination process is significantly slower than the source process, but it is anyway desired that the source process can complete its task as soon as possible. E.g., if the source process consists of a command which reads an audio track from a CD and the destination process consists of a command which compresses the waveform audio data to a format like OGG Vorbis. In this case, buffering the entire track in a pipe buffer would allow the CD drive to spin down more quickly, and enable the user to remove the CD from the drive before the encoding process has finished.

Such a buffer command can be implemented using available operating system primitives for reading and writing data. Wasteful active waiting can be avoided by using facilities such as poll or select or multithreading.

VM/CMS and MVS

CMS Pipelines is a port of the pipeline idea to VM/CMS and MVS systems. It supports much more complex pipeline structures than Unix shells, with steps taking multiple input streams and producing multiple output streams. (Such functionality is supported by the Unix kernel, but few programs use it and none of the shells provide a syntax for it.) Due to the different nature of IBM mainframe operating systems, it implements many steps inside CMS Pipelines which in Unix are separate external programs, but can also call separate external programs for their functionality. Also, due to the record-oriented nature of files on IBM mainframes, pipelines operate in a record-oriented, rather than stream-oriented manner.

Pseudo-pipelines

On single-tasking operating systems, the processes of a pipeline have to be executed one by one in sequential order; thus the output of each process must be saved to a temporary file, which is then read by the next process. Since there is no parallelism or CPU switching, this version is called a "pseudo-pipeline".

For example, the command line interpreter of MS-DOS ('COMMAND.COM') provides pseudo-pipelines with a syntax superficially similar to that of Unix pipelines. The command "dir | sort | more" would have been executed like this (albeit with more complicated temporary file names):

# Create temporary file 1.tmp
# Run command "dir", redirecting its output to 1.tmp
# Create temporary file 2.tmp
# Run command "sort", redirecting its input to 1.tmp and its output to 2.tmp
# Run command "more", redirecting its input to 2.tmp, and presenting its output to the user
# Delete 1.tmp and 2.tmp, which are no longer needed
# Return to the command prompt

All temporary files are stored in the directory pointed to by %TEMP%, or the current directory if %TEMP% isn't set.

Thus, pseudo-pipes acted like true pipes with a pipe buffer of unlimited size (disk space limitations notwithstanding), with the significant restriction that a receiving process could not read "any" data from the pipe buffer until the sending process finished completely. Besides causing disk traffic, if one doesn't install a harddisk cache such as SMARTDRV, that would have been unnecessary under multi-tasking operating systems, this implementation also made pipes unsuitable for applications requiring real-time response, like, for example, interactive purposes (where the user enters commands that the first process in the pipeline receives via stdin, and the last process in the pipeline presents its output to the user via stdout).

Also, commands that produce a potentially infinite amount of output, such as the yes command, cannot be used in a pseudo-pipeline, since they would run until the temporary disk space is exhausted, so the following processes in the pipeline could not even start to run.

Object pipelines

Beside byte stream-based pipelines, there are also object pipelines. In an object pipeline, the processed output objects instead of texts; therefore removing the string parsing tasks that are common in UNIX shell scripts. Windows PowerShell uses this scheme and transfers .NET objects. Channels, found in the Limbo programming language, are another example of this metaphor.


=Pipelines in GUIs=

Graphical environments such as RISC OS and ROX Desktop also make use of pipelines. Rather than providing a save dialog box containing a file manager to let the user specify where a program should write data, RISC OS and ROX provide a save dialog box containing an icon (and a field to specify the name). The destination is specified by dragging and dropping the icon. The user can drop the icon anywhere an already-saved file could be dropped, including onto icons of other programs. If the icon is dropped onto a program's icon, it's loaded and the contents that would otherwise have been saved are passed in on the new program's standard input stream.

For instance, a user browsing the world-wide web might come across a .gz compressed image which they want to edit and re-upload. Using GUI pipelines, they could drag the link to their de-archiving program, drag the icon representing the extracted contents to their image editor, edit it, open the save as dialog, and drag its icon to their uploading software.

Conceptually, this method could be used with a conventional save dialog box, but this would require the user's programs to have an obvious and easily-accessible location in the filesystem that can be navigated to. In practice, this is often not the case, so GUI pipelines are rare.

Other considerations

The name 'pipeline' comes from a rough analogy with physical plumbing in that a pipeline usually [There are exceptions, such as "broken pipe" signals.] allows information to flow in only one direction, like water often flows in a pipe.

Pipes and filters can be viewed as a form of functional programming, using byte streams as data objects; more specifically, they can be seen as a particular form of monad for I/O [ [http://okmij.org/ftp/Computation/monadic-shell.html "Monadic I/O and UNIX shell programming"] ] .

The concept of pipeline is also central to the Cocoon web development framework where it allows a source stream to be modified before eventual display.

This pattern encourages the use of text streams as the input and output of programs. This reliance on text has to be accounted when creating graphic shells to text programs.

History

Process pipelines were invented by Douglas McIlroy, one of the designers of the first Unix shells, and greatly contributed to the popularity of that operating system. It can be considered the first non-trivial instance of software componentry.

The idea was eventually ported to other operating systems, such as DOS, OS/2, Windows NT, BeOS, and Mac OS X.

ee also

* Pipeline (Unix) for details specific to Unix.
* plumbing - "intelligent pipes" developed as part of Plan 9
* Pipeline (computing) for other computer-related versions of the concept.
* Named pipes, an operating system construct intermediate between pipes and files.
* Software design patterns
* Software componentry
* Software engineering
* GStreamer for a multimedia framework built on plugin pipelines
* XML pipeline for processing of XML files

Notes

External links

* [http://www.roguewave.com/downloads/white-papers/pipelines-overview.pdf Software Pipelines - An Overview Whitepaper]
* [http://www.w3.org/XML/XProc/docs/langspec.html W3C XProc proposal]


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Pipeline — may refer to:* Classic RISC pipeline, a five stage hardware based computer instruction set. * Pipeline transport, a conduit made from pipes connected end to end for long distance fluid transport * Plastic pressure pipeline, for fluid handling *… …   Wikipedia

  • Pipeline (Unix) — In Unix like computer operating systems, a pipeline is the original software pipeline : a set of processes chained by their standard streams, so that the output of each process ( stdout ) feeds directly as input ( stdin ) of the next one. Each… …   Wikipedia

  • Pipeline (computing) — In computing, a pipeline is a set of data processing elements connected in series, so that the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time sliced fashion; in that case,… …   Wikipedia

  • Pipeline programming — When a programming language is originally designed without any syntax to nest function calls, pipeline programming is a simple syntax change to add it. The programmer connects notional program modules into a flow structure, by analogy to a… …   Wikipedia

  • Pipeline USA — is a defunct Internet service provider from the mid to late 1990s. Their claim to fame was their own proprietary software that was similar in use to AOL (at the time), CompuServe, and Prodigy. Pipeline USA was eventually purchased by Mindspring,… …   Wikipedia

  • Software Pipelining — bezeichnet die Programmierung eines Prozessors mit mehreren Ausführungseinheiten, sodass möglichst viele von ihnen gleichzeitig beschäftigt sind. Das Verfahren dient also dazu, die Zeit für eine Berechnung zu verkürzen, indem mehrere parallele… …   Deutsch Wikipedia

  • Pipeline transport — An elevated section of the Alaska Pipeline …   Wikipedia

  • Pipeline (Prozessor) — Die Pipeline (auch Befehls Pipeline oder Prozessor Pipeline) bezeichnet bei Mikroprozessoren eine Art „Fließband“, mit dem die Abarbeitung der Maschinenbefehle in Teilaufgaben zerlegt wird, die für mehrere Befehle parallel durchgeführt werden.… …   Deutsch Wikipedia

  • Pipeline (video game) — Infobox VG title = Pipeline developer = Ian Holmes and William Reeve publisher = Superior Software designer = Ian Holmes and William Reeve engine = released = 1989 genre = Arcade adventure; Puzzle game modes = Single player ratings = platforms =… …   Wikipedia

  • Software pipelining — In computer science, software pipelining is a technique used to optimize loops, in a manner that parallels hardware pipelining. Software pipelining is a type of out of order execution, except that the reordering is done by a compiler (or in the… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”