Kernel Data Structures Umair Hussain Malik Essay Example
Kernel Data Structures Umair Hussain Malik Essay Example

Kernel Data Structures Umair Hussain Malik Essay Example

Available Only on StudyHippo
  • Pages: 5 (1254 words)
  • Published: July 11, 2018
  • Type: Essay
View Entire Sample
Text preview

The linked list is the most common data structure in the Linux kernel which, allows the storage and manipulation of a variable number of elements, called the nodes of the list. The elements in a linked list are dynamically created and inserted into the list. This enables the management of a varying number of elements unknown at compile time and each element in the list contains a pointer to the next element. As elements are added to or removed from the list, the pointer to the next node is simply adjusted. Singly and Doubly Linked Lists

Circular Linked Lists Normally, the last element in a linked list has no next element, it is set to point to a special value, such as NULL, to indicate it is the last element in the list. In some linked lists, the last

...

element does not point to a special value. Instead, it points back to the first value. This linked list is called a circular linked list because the list is cyclic. Circular linked lists can come in both doubly and singly linked versions. Although the Linux kernel’s linked list implementation is unique, it is fundamentally a circular doubly linked list.

Using this type of linked list provides the greatest flexibility. Moving Through a Linked List Movement through a linked list occurs linearly. You visit one element, follow the next pointer, and visit the next element. Rinse and repeat. This is the easiest method of moving through a linked list, and the one for which linked lists are best suited. Linked lists are illsuited for use cases where random access is an important operation.

View entire sample
Join StudyHippo to see entire essay

Instead, you use linked lists when iterating over the whole list is important and the dynamic addition and removal of elements is required.

In linked list implementations, the first element is often represented by a special pointer—called the head—that enables easy access to the “start” of the list. In a noncircular-linked list, the last element is delineated by its next pointer being NULL. In a circularlinked list, the last element is delineated because it points to the head element. Traversing the list, therefore, occurs linearly through each element from the first to the last. In a doubly linked list, movement can also occur backward, linearly from the last element to the first.

Of course, given a specific element in the list, you can iterate backward and forward any number of elements, too. You need not traverse the whole list. The Linux Kernel’s Implementation In comparison to most linked list implementations—including the generic approach described in the previous sections—the Linux kernel’s implementation is unique. Recall from the earlier discussion that data (or a grouping of data, such as a struct) is maintained in a linked list by adding a next (and perhaps a previous) node pointer to the data.

The common pattern for storing this structure in a linked list is to embed the list pointer in the structure. For example: struct fox. The Linux kernel approach is different. Instead of turning the structure into a linked list, the Linux approach is to embed a linked list node in the structure. The Linked List Structure In the old days, there were multiple implementations of linked lists in the kernel.

Each contains a

list_head, and we can iterate from any one node to the next, until we have seen every node. This approach is elegant, but you will generally want a special pointer that refers to your linked list, without being a list node itself. Interestingly, this special node is in fact a normal list_head. This defines and initializes a list_head named fox_list. The majority of the linked list routines accept one or two parameters: the head node or the head node plus an actual list node. Manipulating Linked Lists The kernel provides a family of functions to manipulate linked lists.

Mathematically, it is an acyclic, connected, directed graph in which each vertex (called a node) has zero or more outgoing edges and zero or one incoming edges. A binary tree is a tree in which nodes have at most two outgoing edges—that is, a tree in which nodes have zero, one, or two children. Binary Search Trees A binary search tree (often abbreviated BST) is a binary tree with a specific ordering imposed on its nodes. The ordering is often defined via the following induction:

The left subtree of the root contains only nodes with values less than the root. The right subtree of the root contains only nodes with values greater than the root. All subtrees are also binary search trees. Self-Balancing Binary Search Trees The depth of a node is measured by how many parent nodes it is from the root. Nodes at the “bottom” of the tree—those with no children—are called leaves. The height of a tree is the depth of the deepest node in the tree. A balanced binary search tree

is a binary search tree in which the depth of all leaves differs by at most one. A self-balancing binary search tree is a binary search tree that attempts, as part of its normal operations, to remain (semi) balanced.

Red-Black Trees A red-black tree is a type of self-balancing binary search tree. Linux’s primary binary tree data structure is the red-black tree. Red-black trees have a special color attribute, which is either red or black. Red-black trees remain semi-balanced by enforcing that the following six properties remain true:

  • All nodes are either red or black.
  • Leaf nodes are black.
  • Leaf nodes do not contain data.
  • All non-leaf nodes have two children.
  • If a node is red, both of its children are black.
  • The path from a node to one of its leaves contains the same number of black nodes as the shortest path to any of its other leaves.

Taken together, these properties ensure that the deepest leaf has a depth of no more than double that of the shallowest leaf. Consequently, the tree is always semi-balanced. Why this is true is surprisingly simple. First, by property five, a red node cannot be the child or parent of another red node. By property six, all paths through the tree to its leaves have the same number of black nodes. The longest path through the tree alternates red and black nodes. Thus the shortest path, which must have the same number of black nodes, contains only black nodes.

Therefore, the longest path from the root to a leaf is no more

than double the shortest path from the root to any other leaf. If the insertion and removal operations enforce these six properties, the tree remains semi-balanced. Now, it might seem odd to require insert and remove to maintain these particular properties. Why not implement the operations such that they enforce other, simpler rules that result in a balanced tree? It turns out that these properties are relatively easy to enforce (although complex to implement), allowing insert and remove to guarantee a semi-balanced tree without burdensome extra overhead.

The Linux implementation of red-black trees is called rbtrees. They are defined in lib/rbtree. c and declared in <linux/rbtree. h>. Conclusion In this paper we discussed many of the generic data structures that Linux kernel developers use to implement everything from the process scheduler to device drivers. When writing your own kernel code, always reuse existing kernel infrastructure and don’t reinvent the wheel.

Reference

  1. Linux Kernel Development (Third Edition) by Robert Love
Get an explanation on any task
Get unstuck with the help of our AI assistant in seconds
New