Tagged pointer

Tagged pointer

In computer science, a tagged pointer is a common example of a tagged union, where the primary type of data to be stored in the union is a pointer. Often, the "tag" in a tagged pointer will be "folded" into the data representing the pointer, taking advantage of one of the following properties of pointers:
* Pointers to certain types of data will often be "aligned" to the size of the data (4 bytes, 8 bytes, etc.), which leaves a few bits of the pointer unused. As long as code that uses the pointer properly masks out these bits, the pointer can be tagged with extra information.
* The virtual memory system in most modern operating systems reserves a block of logical memory around address 0 as unusable. This means that, for example, a pointer to 0 is never a valid pointer and can be used as a special null pointer value.

Use of 0 to represent a null pointer is extremely common, with many programming languages (such as Ada) explicitly relying on this behavior. In theory, other values in an operating system-reserved block of logical memory could be used to tag conditions other than a null pointer, but these uses appear to be rare, perhaps because they are at best non-portable. It is generally accepted practice in software design that if a special pointer value distinct from null (such as a sentinel in certain data structures) is needed, the programmer should explicitly provide for it.

Taking advantage of the alignment of pointers provides more flexibility than null pointers/sentinels because it allows pointers to be tagged with information about the type of data pointed to, conditions under which it may be accessed, or other similar information about the pointer's use. This information can be provided along with every valid pointer. In contrast, null pointers/sentinels provide only a finite number of tagged values distinct from valid pointers.

In a tagged architecture, a number of bits in every word of memory are reserved to act as a tag. Tagged architectures, such as the Lisp machines, often have hardware support for interpreting and processing tagged pointers.

Examples

Example 1

In the following C code, we see 0 used to indicate a null pointer:

void optionally_return_a_value (int* pOptionalReturnValue) { // ... if (pOptionalReturnValue) *pOptionalReturnValue = 1; }

Example 2

Here, the programmer has provided a global variable, whose address is then used as a sentinel:

node g_sentinel; #define SENTINEL &g_sentinel void do_something_to_a_node (node* p) { if (p = NULL) { // do something } else if (p = SENTINEL) { // do something else } else { // treat p as a valid pointer to a node } }

Example 3

Assume we have a data structure table_entry that is always aligned to a 16 byte boundary. In other words, the least significant 4 bits of a table entry's address are always 0 (2^4 = 16). We could use these 4 bits to mark the table entry with extra information. For example, bit 0 might mean read only, bit 1 might mean dirty (the table entry needs to be updated), and so on.

If pointers are 16-bit values, then:
* 0x3421 is a read-only pointer to the table_entry at address 0x3420
* 0xf472 is a pointer to a dirty table_entry at address 0xf470

Advantages

The major advantage of tagged pointers is that they take up less space than a pointer along with a separate tag field. This can be especially important when a pointer is a return value from a function. It can also be important in large tables of pointers.

A more subtle advantage is that by storing a tag in the same place as the pointer, it is often possible to guarantee the atomicity of an operation that updates both the pointer and its tag without external synchronization mechanisms. This can be an extremely large performance gain, especially in operating systems.

Disadvantages

Tagged pointers have some of the same difficulties as xor linked lists, although to a lesser extent. For example, not all debuggers will be able to properly follow tagged pointers; however, this is not an issue for a debugger that is designed with tagged pointers in mind.

The use of 0 to represent a null pointer does not suffer from these disadvantages: it is pervasive, and most programming languages treat 0 as a special null value. It has thoroughly proven its robustness. Usually when "tagged pointers" are spoken of, this common use is excluded.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Tagged union — In computer science, a tagged union, also called a variant, variant record, discriminated union , or disjoint union, is a data structure used to hold a value that could take on several different, but fixed types. Only one of the types can be in… …   Wikipedia

  • Pointer (computing) — This article is about the programming data type. For the input interface (for example a computer mouse), see Pointing device. Pointer a pointing to the memory address associated with variable b. Note that in this particular diagram, the computing …   Wikipedia

  • Tagged architecture — In computer science, a tagged architecture [ [http://www.memorymanagement.org/glossary/t.html#tagged.architecture The Memory Management Glossary: Tagged architecture] ] is a particular type of computer architecture where every word of memory… …   Wikipedia

  • Pointer — This interesting name is of early medieval English origin, and is an occupational surname for a maker of points. The name derives from the Middle English (1200 1500) poynte(r), pointe(r) , from the Old French pointe , point, sharp end, originally …   Surnames reference

  • CDR coding — In computer science CDR coding is a compressed data representation for Lisp linked lists. It was developed and patented by the MIT Artificial Intelligence Laboratory, and implemented in computer hardware in a number of Lisp machines derived from… …   Wikipedia

  • Bottom type — In type theory, the bottom type is the subtype of all types. It is the return type of a function that does not return normally. As an example, a certain program function may promise to return an integer but in a Turing complete language the… …   Wikipedia

  • Extensible Storage Engine — For JET Red storage engine of Microsoft Access, see Microsoft Jet Database Engine. For the teacher s term, Exceptional education. Extensible Storage Engine (ESE), also known as JET Blue, is an Indexed Sequential Access Method (ISAM) data storage… …   Wikipedia

  • Cyclone (programming language) — Cyclone Appeared in 2006 (2006) Designed by AT T Labs Stable release 1.0 (May 8, 2006; 5 years ago (2006 05 08)) Influenced by …   Wikipedia

  • GObject — The GLib Object System, or GObject, is a free software library (covered by the LGPL) that provides a portable object system and transparent cross language interoperability. GObject is designed for use both directly in C programs and through… …   Wikipedia

  • Data type — For other uses, see Data type (disambiguation). In computer programming, a data type is a classification identifying one of various types of data, such as floating point, integer, or Boolean, that determines the possible values for that type; the …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”