Sunday, April 8, 2012

Cisco IOS Architecture

General Operating System Concepts
Modern operating systems provide 2 primary functions:

Hardware Abstraction: Provides software developers with a common interface between their applications and the hardware, shielding programmers from the complexity of the hardware. The hardware-specific code is developed once in the OS and shared by everyone.
Resource Management: Manages the hardware resources (eg: CPU, memory, disk space) so that they can be shared efficiently among multiple applications. Like hardware abstraction, the resource management code is developed in the OS so that application programmers do not have to reinvent the wheel by writing their own resource management code.

Multitasking refers to running multiple programs at once. Applications written for multitasking OSes often contain multiple independent and concurrent tasks, called threads. Each thread has its own set of CPU register values, but can share the same memory address space with other threads that belong to the same application (or process). A process is a group of threads that share a common memory space and have a common purpose. On OSes and CPUs that support virtual memory, each process runs in a separate address space that is protected from other processes.

A processor can execute instructions for only one program at a time. Scheduling decides which process or thread should run. This function is usually performed by the core of the OS, the kernel. An OS can use one of several scheduling methods to schedule threads, depending on the type of applications (eg: batch, interactive, real-time). Different types of applications have different CPU utilization characteristics, and the overall performance is affected by the scheduling method used.

FIFO with run-to-completion is the simplest scheduling method: each thread is assigned to the processor in the order in which its run request is received, and every thread runs to completion. It is easy to implement, has very low overhead, and is fair, since all threads are treated equally. First come, first served.
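The behaviour described above can be illustrated with a minimal sketch. The job names and burst times are made up for illustration, and the function `run_fifo` is hypothetical, not part of any real OS.

```python
from collections import deque

def run_fifo(jobs):
    """Run (name, burst_time) jobs in arrival order; each runs to completion.

    Returns a list of (name, start_time, finish_time) tuples.
    """
    queue = deque(jobs)
    clock = 0
    timeline = []
    while queue:
        name, burst = queue.popleft()   # oldest run request first
        start = clock
        clock += burst                  # no preemption: the job keeps the CPU
        timeline.append((name, start, clock))
    return timeline

timeline = run_fifo([("batch_a", 5), ("batch_b", 2), ("interactive", 1)])
for name, start, finish in timeline:
    print(f"{name}: starts at {start}, finishes at {finish}")
```

Note that the short "interactive" job has to wait for both batch jobs to finish before it gets any CPU time.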

FIFO with run-to-completion scheduling is good for batch applications that perform serial processing and then exit, but it does not work well for interactive and real-time applications, which require fast, short-duration access to the CPU to provide quick responses to users or other devices. A possible scheduling method for these applications is priority scheduling, where priorities are assigned to different types of threads: critical threads (eg: real-time threads) are assigned a higher priority than less critical threads (eg: batch threads). Multiple threads with the same priority are then processed in the order in which they were received (FIFO).
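A priority queue with an arrival counter as tie-breaker is one way to sketch this policy. The class `PriorityScheduler` and the convention that a lower number means a higher priority are assumptions of this example only.

```python
import heapq
from itertools import count

class PriorityScheduler:
    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker: FIFO order within one priority

    def submit(self, name, priority):
        # Assumption of this sketch: lower number = higher priority.
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def run_all(self):
        """Return thread names in the order they would be given the CPU."""
        order = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            order.append(name)
        return order

sched = PriorityScheduler()
sched.submit("batch_1", 2)
sched.submit("realtime_1", 0)
sched.submit("batch_2", 2)
sched.submit("realtime_2", 0)
order = sched.run_all()
print(order)  # ['realtime_1', 'realtime_2', 'batch_1', 'batch_2']
```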

Even though priority scheduling is an improvement over FIFO, it still has one drawback that makes it unsuitable for interactive and real-time applications: a high-priority thread can get stuck behind a long-running low-priority thread that is already on the CPU. Solving this problem requires a method that can temporarily suspend, or preempt, a running thread to allow other threads to access the CPU.

Thread Preemption is the ability of an OS to involuntarily suspend a running thread to allow another higher priority thread to access the CPU resources. Preemptive multitasking OSes employ preemptive scheduling methods that utilize preemption instead of run-to-completion.

Preemption relies on the kernel to periodically change the current thread via a context switch, which can be triggered by either a system timer (each thread is assigned a time slice) or a function call to the kernel. When a context switch is triggered, the kernel selects the next thread to run and queues the preempted thread in a list so that it can run again at its next opportunity. In other words, the computer actually changes the task that it is currently working on.

Context switching can be quite expensive in terms of system performance, because all processor registers for the thread that is being taken off the CPU must be saved, and all processor registers for the thread that is being granted access to the CPU must be restored.
The advantages and disadvantages of preemptive multitasking are as follows:

Advantages:
Predictable. A thread can be set up to run once a second and the programmer can be reasonably certain that the thread will be scheduled to run at that interval.
Difficult to break. No single thread can monopolize the CPU for a long period and stop other threads from running.

Disadvantages:
Less efficient. It tends to switch contexts more often, so the CPU spends more time scheduling and switching between threads than with the run-to-completion approach.
Adds complexity to applications. A thread can be interrupted anywhere. Applications must be well designed and written to protect critical data structures from being changed by other threads when they are preempted.
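Timer-driven preemption and its cost can be sketched with a small round-robin simulation. The function `run_round_robin`, the job names, and the time slice of 2 units are assumptions for illustration; a real kernel would save and restore registers where this sketch merely counts a switch.

```python
from collections import deque

def run_round_robin(jobs, time_slice):
    """Simulate timer-driven preemption.

    jobs: list of (name, burst_time).
    Returns (finish_times, context_switches).
    """
    ready = deque(jobs)
    clock = 0
    switches = 0
    finish = {}
    previous = None
    while ready:
        name, remaining = ready.popleft()
        if previous is not None and name != previous:
            switches += 1                    # registers saved/restored here
        previous = name
        ran = min(time_slice, remaining)
        clock += ran
        remaining -= ran
        if remaining > 0:
            ready.append((name, remaining))  # preempted: back of the queue
        else:
            finish[name] = clock
    return finish, switches

finish, switches = run_round_robin(
    [("long_batch", 6), ("interactive", 1)], time_slice=2
)
print(finish, switches)
```

In this run the interactive job finishes at time 3 instead of waiting until time 7 behind the batch job, at the price of two context switches.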

OSes also manage the memory resources by dividing them into sections for storing application instruction code, variables, and the heap. The heap is a section of memory from which processes can dynamically allocate and free memory.

Virtual memory is found in most modern OSes. It provides more memory than the available physical RAM and is transparent to processes. The computer memory is expanded with secondary storage, eg: a hard disk drive. Virtual memory is created using a hardware feature that is available on some CPUs, the memory management unit (MMU). The MMU automatically remaps memory address requests to either physical memory (RAM) or secondary storage (hard disk), depending on where the contents actually reside. The MMU can also be programmed to create separate address spaces for all processes to prevent them from accessing the memory address space of other processes.

Segmentation: A running process is restricted to using certain parts of memory called segments. If a process reads or writes data outside its permitted memory address space, a general protection fault occurs. If it accesses a page of its virtual memory that is not currently in physical memory (paging), a page fault occurs.
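The remapping and the two kinds of fault described above can be sketched with a toy page table. Everything here is hypothetical: the 4096-byte page size, the `AddressSpace` class, and the exception names are assumptions made for illustration, and a real MMU does this in hardware.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class ProtectionFault(Exception):
    """Access outside the address space allocated to the process."""

class PageFault(Exception):
    """Page is valid but currently lives on secondary storage."""

class AddressSpace:
    def __init__(self, page_table):
        # Maps virtual page number -> ("ram", frame) or ("disk", block).
        self.page_table = page_table

    def translate(self, vaddr):
        page, offset = divmod(vaddr, PAGE_SIZE)
        if page not in self.page_table:
            raise ProtectionFault(f"address {vaddr:#x} is not mapped")
        where, index = self.page_table[page]
        if where == "disk":
            raise PageFault(f"page {page} must be loaded from disk block {index}")
        return index * PAGE_SIZE + offset   # physical address in RAM

proc_a = AddressSpace({0: ("ram", 7), 1: ("disk", 42)})
paddr = proc_a.translate(0x0010)
print(f"{paddr:#x}")  # 0x7010
```

A real OS would handle the page fault by loading the page into RAM and retrying the access; this sketch only signals it.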

Although virtual memory has many benefits, it also has resource requirements and performance penalties. As a result, IOS does not implement a full virtual memory scheme.

OSes usually support interrupts, a hardware feature that causes the CPU to temporarily suspend its current instruction sequence and transfer control to a special program called an interrupt handler. OSes usually provide a set of interrupt handlers for all possible interrupt types.

Cisco IOS Architecture

The main challenge for IOS is to switch or forward packets as quickly and efficiently as possible.

IOS is a specialized embedded operating system that is tightly coupled to the underlying hardware. Therefore, always consider the hardware architecture when discussing the software.

Cisco IOS has a monolithic architecture (a single large program), which means it runs as a single image and all processes share the same memory address space. No memory protection mechanism exists between processes because of the CPU and memory overhead such mechanisms introduce, so a bug in one process can potentially corrupt the data or memory of another process.

Cisco IOS has a run-to-completion scheduler: the kernel does not preempt a running process. A process must make a kernel call to allow other processes to run. For Cisco routers that require very high availability and fast response times, eg: Cisco CRS-1, these limitations are not acceptable. Competing network OSes (eg: Juniper JUNOS) were designed not to have these limitations.
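Cooperative scheduling of this kind can be sketched with Python generators, where `yield` stands in for the kernel call that lets other processes run. This is only an analogy under that assumption; the names `process` and `run_cooperative` are hypothetical and do not correspond to IOS internals.

```python
from collections import deque

def process(name, steps, log):
    """A hypothetical process that gives up the CPU after each step."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield   # voluntary call that lets other processes run

def run_cooperative(processes):
    """Never preempts: each process runs until it yields or finishes."""
    ready = deque(processes)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)           # runs until the process yields
            ready.append(proc)   # still alive: requeue at the back
        except StopIteration:
            pass                 # process finished

log = []
run_cooperative([process("A", 2, log), process("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

A process that never reached its `yield` would keep the CPU indefinitely, which is the limitation the paragraph above describes.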

A new version of Cisco IOS, IOS-XR, was developed to offer modularity, memory protection, lightweight threads, and preemptive scheduling.

IOS software modularity is a new IOS feature that supports individually restartable processes. In the past, a patch required a complete IOS update and a system reboot, whereas the new modular kernel allows the affected processes to be patched independently. After patching, those processes can be restarted independently of other processes, which reduces system downtime.

Microcode is a set of processor-specific software instructions that enables and manages the features and functions of a specific processor type. A router loads the microcode for each processor type present in the system upon system startup or reload. The latest available microcode image is bundled and distributed along with the system software image.

Memory Organization

IOS maps the entire physical memory into a large flat virtual address space. Since the kernel does not perform any memory paging or swapping, the virtual address space is limited to the available physical memory. IOS divides the address space into various areas called regions that correspond to the various types of physical memory, eg: SRAM and DRAM. Classifying memory into regions allows IOS to group various types of memory and therefore the software does not need to know about specific types of memory on every platform.

Memory regions can also be nested in a parent-child relationship. Although there is no imposed limit upon the depth of nesting, only one level is really used. Nested regions form subregions of the parent region. Subregions are denoted by a name with the : separator and no parentheses.
Memory regions are classified into the following 8 categories:

Memory Region Class: Characteristics
Local: Normal run-time data structures and local heaps. Often in DRAM.
Iomem: Shared input / output memory that is visible to both the CPU and the controllers of network interfaces. Often in DRAM.
Fast: Fast memory, eg: SRAM, for speed-critical and special-purpose tasks.
IText: A region for the code currently executed by IOS.
IData: Storage for initialized variables.
IBss: Storage for non-initialized variables. BSS stands for Block Started by Symbol.
PCI: Peripheral Component Interconnect (PCI) memory accessible to all devices on the PCI bus.
Flash: Flash memory. Often used to store configuration backups, the binaries of IOS images, and other data, eg: crash info. A file system is typically built in the Flash memory region.

The show region privileged command displays the regions defined on a system.
Router#sh region
Region Manager:

      Start         End     Size(b)  Class  Media  Name
 0x0B000000  0x0BFFFFFF    16777216  Iomem  R/W    iomem
 0x60000000  0x6AFFFFFF   184549376  Local  R/W    main
 0x600089A4  0x643611BF    70617116  IText  R/O    main:text
 0x64362000  0x6621F9BF    32233920  IData  R/W    main:data
 0x6621F9C0  0x66A471BF     8550400  IBss   R/W    main:bss
 0x66A471C0  0x67A471BF    16777216  Local  R/W    main:heap
 0x67A47218  0x68A47217    16777216  Local  R/W    main:heap
 0x6A000000  0x6AFFFFFF    16777216  Local  R/W    main:heap
 0x7B000000  0x7BFFFFFF    16777216  Iomem  R/W    iomem:(iomem_cwt)
 0x80000000  0x8AFFFFFF   184549376  Local  R/W    main:(main_k0)
 0xA0000000  0xAAFFFFFF   184549376  Local  R/W    main:(main_k1)

Free Region Manager:

      Start         End     Size(b)  Class  Media  Name
 0x68A47270  0x69FFFFA7    22777144  Local  R/W    heap


The gaps between the address ranges, eg: iomem ends at 0x0BFFFFFF and main begins at 0x60000000, are intentional. They allow for expansion of the regions and provide protection against errant threads. If a runaway thread is advancing through memory and writing garbage, it will be stopped when it reaches an unallocated area (the gap).

The entire DRAM area from 0x60000000 to 0x6AFFFFFF is classified as a Local region and further divided into subregions that correspond to the various parts of the IOS image itself (text, BSS, and data) and the heap. The heap is the free Local memory that is left over after the IOS image has been loaded into the region.
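The address arithmetic behind the show region output can be checked with a short sketch. The rows are copied from the sample output above; the helper names `region_size` and `contains` are hypothetical. The end address is inclusive, so the size is end - start + 1.

```python
REGIONS = [
    # (start, end, size_in_bytes, name) copied from the sample output
    (0x0B000000, 0x0BFFFFFF, 16777216, "iomem"),
    (0x60000000, 0x6AFFFFFF, 184549376, "main"),
    (0x600089A4, 0x643611BF, 70617116, "main:text"),
    (0x64362000, 0x6621F9BF, 32233920, "main:data"),
    (0x6621F9C0, 0x66A471BF, 8550400, "main:bss"),
]

def region_size(start, end):
    return end - start + 1   # end address is inclusive

def contains(parent, child):
    """True if the child's address range lies inside the parent's range."""
    return parent[0] <= child[0] and child[1] <= parent[1]

# Rows whose reported size disagrees with the address range (expected: none).
mismatches = [
    name for start, end, size, name in REGIONS
    if region_size(start, end) != size
]

# Every main:* subregion should sit inside the parent main region.
main = REGIONS[1]
nested = all(contains(main, r) for r in REGIONS if r[3].startswith("main:"))

print(mismatches, nested)  # [] True
```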

The following two lines from the output show regions that appear to be duplicates with different address ranges:
 0x0B000000  0x0BFFFFFF    16777216  Iomem  R/W    iomem
 0x7B000000  0x7BFFFFFF    16777216  Iomem  R/W    iomem:(iomem_cwt)
These duplicate regions are called aliases. Some Cisco platforms have multiple physical address ranges that point to the same block of physical memory. These different ranges are used to provide alternate data access methods or automatic data translation in hardware.
For example, one address range might provide cached access to an area of physical memory while another might provide uncached access to the same memory.

The duplicate ranges are mapped as alternate views during system initialization and IOS creates alias regions for them. Aliased regions do not count toward the total memory on a platform, as they are not really separate memory! They allow IOS to provide a separate region for alternate memory views without artificially inflating the total memory calculation.
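The total memory calculation can be sketched by parsing a few rows of the sample output and counting only top-level regions. The rule that a parenthesized name after the colon marks an alias is inferred from the sample output, not taken from Cisco documentation, and the helper names are hypothetical.

```python
SHOW_REGION = """\
 0x0B000000  0x0BFFFFFF    16777216  Iomem  R/W    iomem
 0x60000000  0x6AFFFFFF   184549376  Local  R/W    main
 0x600089A4  0x643611BF    70617116  IText  R/O    main:text
 0x7B000000  0x7BFFFFFF    16777216  Iomem  R/W    iomem:(iomem_cwt)
 0x80000000  0x8AFFFFFF   184549376  Local  R/W    main:(main_k0)
 0xA0000000  0xAAFFFFFF   184549376  Local  R/W    main:(main_k1)
"""

def classify(name):
    if ":" not in name:
        return "top-level"
    if name.endswith(")"):   # assumption: parentheses mark an alias
        return "alias"
    return "subregion"

def parse(text):
    rows = []
    for line in text.splitlines():
        start, end, size, cls, media, name = line.split()
        rows.append({"name": name, "size": int(size), "kind": classify(name)})
    return rows

rows = parse(SHOW_REGION)
# Aliases and subregions are views of memory that is already counted.
total = sum(r["size"] for r in rows if r["kind"] == "top-level")
print(total // (1024 * 1024), "MB")  # 192 MB
```

Summing every row instead would count the same physical memory several times.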

IOS Processes

An IOS process is equivalent to a single thread in other OSes: each IOS process has exactly one thread. Each process has its own memory block (stack) and CPU context (eg: registers), and can control resources such as memory and a console device. To avoid the overhead of paging or swapping, IOS does not implement virtual memory protection between processes. No memory management is performed during context switches; therefore, although each process receives its own memory allocation, other processes can freely access the same memory space, and nothing stops a process from intruding into a memory block of another process. Cisco sacrificed some stability and security in the IOS architecture design to achieve higher performance and lower resource consumption.

Processes can be created and terminated at any time, either by the kernel (during IOS initialization) or by another running process while IOS is operating, except during a (hardware) interrupt. When the CPU is interrupted, it temporarily suspends the execution of the current thread and begins to run an interrupt handler function. New processes cannot be created while the CPU is running the interrupt handler.

The parser is responsible for creating many of the IOS processes. The parser is a set of functions that interpret IOS configuration and EXEC commands. It is invoked by the kernel during IOS initialization and by the EXEC processes that provide a CLI to the console and Telnet sessions. Any time a user enters a command or a configuration line is read from a file, the parser interprets the text and takes immediate action. Some commands result in the setting of a value, eg: an IP address, while others enable complicated functionality, eg: routing or event monitoring.

Some commands result in starting a new process, eg: when the router eigrp command is entered via the CLI, the parser starts a new process called ipigrp if it has not already been started.
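The start-once behaviour can be sketched with a toy command dispatcher. This is not the IOS parser: the class `Parser`, its methods, and the returned strings are all hypothetical, and only the command names and the ipigrp process name come from the text above.

```python
class Parser:
    def __init__(self):
        self.running = set()   # names of processes already started
        self.config = {}       # values set by configuration commands

    def start_once(self, process_name):
        """Start a process only if it is not already running."""
        if process_name in self.running:
            return False
        self.running.add(process_name)
        return True

    def parse(self, line):
        words = line.split()
        if words[:2] == ["router", "eigrp"]:
            started = self.start_once("ipigrp")
            return "started ipigrp" if started else "ipigrp already running"
        if words[:2] == ["ip", "address"]:
            self.config["ip address"] = words[2:]   # command that sets a value
            return "value set"
        return "unrecognized command"

parser = Parser()
first = parser.parse("router eigrp 100")
second = parser.parse("router eigrp 200")
third = parser.parse("ip address 192.0.2.1 255.255.255.0")
print(first, "|", second, "|", third)
```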

Device Drivers

A primary OS function is to provide hardware abstraction between the platform hardware and the programs that run on it. Hardware abstraction typically occurs in device drivers that are part of the OS, making them an integral part of the system. IOS contains device drivers for a range of platform hardware devices, eg: Flash cards and NVRAM, but the most notable are the device drivers for network interfaces.

IOS network interface device drivers provide the primary intelligence for packet operations in and out of interfaces. Each driver consists of a control component and a data component. The control component is responsible for managing the state and status of the device, eg: shutting down an interface, while the data component is responsible for all data flow operations through the device and contains logic that assists with the packet switching operations. IOS device drivers are very tightly coupled with the packet switching algorithms.

IOS device drivers interface with the rest of the IOS system via a special control structure called the Interface Descriptor Block (IDB). The IDB contains entry points into the functions of a device driver and data about the state and status of a device; the IP address, the interface state, and the packet statistics are some of the fields in the IDB. IOS maintains an IDB for each interface present on a platform, as well as an IDB for each subinterface.
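The idea of a control structure that holds both device state and driver entry points can be sketched as follows. The field names, the two driver functions, and the class `InterfaceDescriptorBlock` are hypothetical and do not reflect the actual layout of the structure in IOS.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class InterfaceDescriptorBlock:
    # State and status of the device (illustrative fields only)
    name: str
    ip_address: Optional[str] = None
    state: str = "down"
    packets_out: int = 0
    # Entry points into the driver, filled in by the driver itself
    shutdown: Optional[Callable[..., None]] = None   # control component
    transmit: Optional[Callable[..., None]] = None   # data component

def eth_shutdown(idb):
    idb.state = "administratively down"

def eth_transmit(idb, packet):
    idb.packets_out += 1   # a real driver would hand the packet to hardware

idb = InterfaceDescriptorBlock(
    "Ethernet0", ip_address="192.0.2.1", state="up",
    shutdown=eth_shutdown, transmit=eth_transmit,
)

# The rest of the system calls the driver only through the descriptor block.
idb.transmit(idb, b"\x00" * 64)
idb.shutdown(idb)
print(idb.state, idb.packets_out)  # administratively down 1
```

Because callers go through the entry points stored in the block, they do not need to know which driver is behind a given interface.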
