1.   October 23

        1.1.  "One day a mother comes home from work and asks her son, "What did you do today?" The
son replied, "I taught our dog how to play the piano". The mother, incredulous, asked, "Our dog can
play the piano?", to which the son laughed and replied, "Of course not mom. I said that I taught him;
I didn't say that he learned how." - Anonymous

Today we begin to look at the other components of the computer aside from the CPU and investigate
how  they are managed. We begin with storage and recall that it extends from disk to RAM to cache.
The CPU executes an instruction stream, and that stream resides in RAM. The author reviews how it
gets there as we did in the introductory unit. Caches are considered part of the CPU, but we understand
that the Cache is a copy of RAM and the hardware must insure that while the CPU can access cache
more quickly and does not need to pre-empt the memory bus to do so, the cache and RAM must be
synchronized. The hardware both reads and writes the cache.

When we look at RAM we are concerned with two issues: (1) sharing and (2) efficient use. Instructions
provide bits to address RAM and we are currently moving from 32 bit addresses to 64 and 128 bit
addresses. The different possible addresses denote the virtual address space. Although computers
come with more RAM today than in the past, the physical address space is a tiny fraction of the virtual
space. However, the OS manages the physical address space so that the programmer and CPU don't
know how much physical memory there actually is (with the help of memory mapping hardware).

The compiler is written to translate source into machine instructions that support a given address space,
 so the compiler will assume that the program will be executed on a machine with the maximum
addressable RAM. Therefore, the OS and the MMU must make it seem that way. This is done by
folding the sequential virtual memory space and allowing different parts of it to occupy the same
physical memory locations at different points in time. Chapter 9 presents the techniques of paging
and segmentation that accomplishes this task. These techniques also serve the multiprogramming
circumstance when multiple processes, each with maximum address spaces, interleave their pages or
segments in physical memory while each believes that it has sole access to a CPU with maximum
virtual memory. The cost of this is that when a process's memory is out on disk instead of in RAM
when the CPU wants to access it, a memory fault is detected by the OS and the process must be
suspend for the IO operation that it takes to go to the disk and read in the desired portion of virtual
memory. Another process can have the CPU while this IO is accomplished, but the OS has to decide
where the new memory piece will be placed in RAM so as not to interfere with the process that is
executing. DMA will be used to update RAM and then the original process can re-execute the
instruction that produced the fault (once it gets back onto the CPU) and the MMU will be able to give
up the CPU the code or data at the requested virtual address.

The devil is in the details.

Collapse node2.   October 28

        2.1.  "The fastest algorithm can frequently be replaced by one that is almost as fast and much easier
to understand." - D. Jones

Memory management is about trade-offs.

Contiguous allocation of memory

Divide memory into partitions and run one process in each partition. If the partitions are of fixed size,
you would want to match the process to the appropriate partition to avoid internal fragmentation.
However, when you have too few processes, memory space is wasted (external fragmentation) and
sometimes your process will be too big to fit into any partition. This can be solved by arranging for the
partitions to be variable in size and dynamically allocated, which implies that processes may need to be
 relocated so that partitions can be combined. Partitions imply that every address in the process is relative
to the partition boundary (and that no reference is made outside of the partition).

  • when a process is too large for real memory, it must be loaded into its partition a piece at a time.
    This can be done as an overlay by the programmer or by the operating system implementing
    virtual memory. Pieces can be of fixed size (page) or variable sized (segment).
  • With the assumption that the process is running in a partition, the OS has to decide where in the
    partition to load the next piece (probably overwriting some other piece) and how to map the CPU
    reference to the new piece, which is no longer a fixed distance from the start of the partition
    (beginning of the code or data). This is solved by
    (1) dividing the partition into slots, and
    (2) keeping a table that tells the OS what piece of the program is in each slot.
  • The MMU is given the address of the slot table and all memory addresses are intercepted and
    first checked to see if they are valid (within the virtual memory space of the process) and then
    to see if the page containing that byte of the process is in one of the slots.
    1. If so, address is broken into its page number and offset within the page, and the slot
      boundary of the slot holding that page, rather than the partition boundary, is used to
      compute the real memory address.
    2. if not, then the desired page must be read into a slot from the swap area on disk. An
      empty slot is used if available, else a slot whose contents have not been changed, else
      a slot least likely to be referenced soon (after first copying its contents to the swap area).
  • All of the pages of the process must be stored on the swap device and when the program is large,
    a directory is used to locate the page corresponding to an address. A 32 bit address could be
    divided into groups of 10,10, and 12 bits and the first group used to find a page table, the second
    group to find a page in the table, which would have to be in the TLB, and the third would be the
    offset. This scheme would provide for 1024 tables, each with 1024 pages, each with 4096 bytes.
    Each table would address .5MB of memory, and all the tables together would reference 500MB
    of RAM (the 32 bit address space). If the actual partition had 16MB, it would be divided into
    32,768 page slots. Some of these slots would hold page tables, some the process code and data.
  • The Translation Look-aside-Buffer is hardware that finds the slot number of a specific page
    in a specific table. It holds only a small number of the possible slot identifiers (maybe 512 of
    the 32,768) but it can return the slot boundary of the pages it holds VERY quickly. When the
    page is not in the TLB, it must be looked up in RAM. Moreover, there is only one TLB, so it
    must be re-initialized context switch.

Non-contiguous allocation

The techniques described above for managing the memory in a partition can just as well be used to
manage all of memory if we divide all of the RAM slots among the different processes arbitrarily.
Each process still has its own page table, but the pages and the process code and data can be loaded
into any slot. Within the partition it was generally the case that many contiguous pages would be in
the slots, indeed, there was opportunity to do anticipatory buffering by bringing in multiple pages when
only one was required. This is now less viable as the number of page slots allocated to each process
might be the same, but their relative location is unknown.

  • page sharing becomes an option.
  • an inverted page table is an option: a list by slot number of the process id and the process page
    stored in each slot.
  • a free slot list can be kept and used by any process when it encounters a memory fault.
  • the number of processes in RAM at once can vary freely, unconstrained by the concept of a
    partition.

 

Collapse node3.   October 30

        3.1.  "The study of law is something new and unfamiliar to most of you -unlike any other
schooling you have gone through before. Here we use the Socratic method: I call on you; I ask you a
question; you answer it. Why don't I just give you a lecture? Because through my questions you learn
to teach yourselves. By this method of questioning-answering, questioning-answering, we seek to
develop in you the ability to analyze that vast complex of facts that constitutes the relationships of
members within a given society. Now, you may think, at times, that you have reached a correct and
final answer. I assure you, this is a delusion on your part. You will never reach a final, correct, and
 ultimate answer. In my classroom, there is always another question; there is always a question to
follow your answer. Yes, you are on a treadmill. My little questions spin the tumblers of your brain.
You are on an operating table; my little questions are fingers probing your mind. We do brain surgery
here. You teach yourselves law and I train your minds. You come in here with a skull full of mush, and
you leave thinking like a lawyer. "- Professor Kingsfield (addressing 1st year Harvard Law Students in
"The Paper Chase")

Demand paging looks at the details of the actual way memory is managed in modern computers.

 

Collapse node4.   November 4

        4.1.  "A little inaccuracy sometimes saves tons of explanation." - H.H. Munro

The issues about demand paging are:

  • effective access time
  • page replacement policy
  • page frame allocation
  • thrashing (working set)

Note the distinction between swapping performed to backing store by a swapper, and paging
implemented by a pager. Note a page table is used with both. Note that each process must have both its
page table and its data and code pages in memory in order to execute. Recall what data about the
process is user data and what data is system data (kept in the OS area of RAM)

Consider what happens when a new process acquires the CPU. The process may have no pages in
memory, so it immediately goes into the io suspend queue. How many pages should be read in to
satisfy this page fault? Note the 12 steps on p. 305-306.

Insuring that a free frame is available minimizes page-fault processing. However, if a page must be
replaced, replacing a "clean" page is almost as fast. When neither alternative is available, a page must
be chosen (victim) to swap out. Evaluating page-replacement algorithms is accomplished with a
reference string of page accesses. FIFO, LRU, LFU, MFU are possible, all are unlikely to be optimal,
but LRU is most common. The larger the page size, the fewer opportunities there will be for a memory
access to generate a fault (assuming locality of access)

The working set can be kept with the PCB and reloaded during the context switch. While there are a
minimum number of pages defined by the number of different memory references possible in an
assembly instruction, there is no maximum. The policy could be to divide all the page frames equally
among processes, or proportionally according to the size of the process, or dynamically according to
the behavior of the process (keep expanding the number of frames allocated to the faulting process,
and deleting frames from the non-faulting process). With whatever initial global fame allocation policy
you establish there is the choice of restricting a process to those frames (local re-allocation) or allowing
the process to victimize other processes (the dynamic allocation process).

Thrashing is the limit of high CPU utilization. You mitigate thrashing by reducing the number of
processes in the ready queue, thus freeing their memory so that the remaining processes can share
the page frames and reduce the their page-fault rates.

I/o requires special consideration. Note first that some memory must be used for io buffers and that
this memory cannot be paged because of DMA access to it. This can be solved by using system
memory for io but that requires a copy into the user area eventually which is time-consuming.
The alternative is to lock pages so that they cannot be swapped. This is common, but risks memory
leak by failure to unlock.

For each of the problems posed by paging there are preferred solutions, many implemented with
hardware support. However, the problems of implementation (like locking bits) sometimes force OS
designers toward less efficient solutions that are more practical than ideal. Sometimes being fast is
better even if it is wasteful. One lesson is how to evaluate the tradeoffs.  

Collapse node5.   November 6

        5.1.  "We think too much about effective methods of teaching and not enough about effective
methods of learning. No matter how good teaching may be, each student must take the responsibility for
his own education." - J. Carolus S.J.

"Any sufficiently advanced technology is indistinguishable from magic." - A. Clarke

We know what a file is.

  • What does a file look like to the OS?
  • What information does the OS need to keep about each file?
  • Where does the OS keep this data?
  • What file services do we expect the OS to render the user?
  • How can these services be invoked?
  • How is the file system like the memory management system and how is it different?
  • What are the operations that must be performed on individual files?
  • What information should users be able to obtain from the file system?
  • What effects file system performance and how is performance evaluated?

 

Collapse node6.   November 11

        6.1.  "As a rule, software systems do not work well until they have been used, and have failed
repeatedly, in real applications." - D. Parnas

Originally the OS loaded itself into memory from an io device and then read executable code from the
card reader and that code read in data from the card reader. Next the OS read in a stream of code and
data and stored them on tape or disk so that they could be accessed faster when it came time to load
them. Thus the need for a file system. Then we stored more systems software on the disk so it didn't
have to be read in repeatedly. The disk thus had the "library code" area and the user temporary area
which was divided into user jobs (images of the card stream). Next we separated the user code from
the user data since businesses ran the same programs repeatedly but loaded in different data. Now there
was the systems library, the user library, and the transitory data area. When there began to be more users
on the system, the need grew for a directory system by which to keep the files of different users organized,
and finally the need to swap these files in an out of memory quickly shaped the structure of the storage
pattern on the disk. However, disk technology changed so rapidly that it was no long practical for the
user to understand how the data file was distributed on the disk. First database software took over the
 task of managing the storage of the data, and then the drive itself became a black box with input and
output buffers (today the OS deals with the disk drive through virtual addresses).

Consider the life-cycle of a file. It is created by a program issuing a request to the OS (an editor, the
command processor, a compiler, a database program, the OS itself, etc.). A file exists when an identifier
is entered into the file directory. The entry consists of a (unique) file id, type, a file owner and access
rights, creation time, possibly a disk location, and a length, possibly zero. If the file is to be created as
the result of copying an existing file, the file system knows the required length and must find available
space. If the file is being created de novo, the OS has to request a default initial space allocation.
Provision must be made for the case when this initial guess proves too small. Once the file is written
(terminating with EOF marker), attempts to access the file must be monitored, and provision made for
reading, appending, re-writing (modifying within the already allocated region, including truncating),
and deleting. Each of these operations poses problems for the file system.

One problem is to locate the byte referenced in the io operation of the executing program. The virtual
address is . This region of the file must be brought into memory, assuming it is a valid, and then the
requested number of bytes transferred to the CPU starting at that location. This is usually done in buffers
because data io often uses read next record, and write next record operations (sequential access) instead
of giving a record address (random access). One of the services of the file system is to transfer byte data
from the disk and item data to the CPU (numeric or character). When the bytes at the address do not match
the code of the items requested, a file read error is generated. What other kinds of errors must the file
system recognize?

Collapse node7.   November 13

        7.1.  "The best way to get a good idea is to get a lot of ideas." - L. Pauling

Review the assigned homework problems for chapters 10 and 11.

Emphasis on balancing CPU utilization and paging disk, and on evaluating the effect of different file
allocation architectures on file access times.

Collapse node8.   November 20

        8.1. EXAM 2  covers chapters 7, 9, 10, and 11

Chapter 7
This chapter addresses the problems (and solutions) of allowing threads to share data. The ideas
introduced include:

  1. interrupts at "inopportune" times while code is in a critical section (changing shared data),
    resulting in a race condition (outcome depends on which process the OS happens to select to
    execute next).
  2. mutual exclusion and how to insure it.
  3. synchronization (fixing all cooperating processes at the same point).
  4. semaphore (with P and V operations)
  5. monitors (wait and signal language construct to eliminate concurrent access)
  6. Java's, sychronized and notify to avoid busy wait. Note the rules on page 218.
  7. as examples the author gives code for the Bounded Buffer, Reader/Writer, and Dining
    Philospher problems.

Be able to discuss and illustrate any of the above concepts. Given Java code, be able to analyze
concurrency problems.

Chapter 9
This chapter concerns the issues of sharing memory, beginning with assigning a fixed partition to a
process and the problems that arise trying to excecute in the partition and continuing to introduce more
flexible ways to share memory with hardware support to translate virtual memory references. On page
293 the author summarizes the different considerations that surround memory sharing and
multiprogramming, with minimal concern for the details of how a paging strategy actually performs.
(the concern of chapter 10).

Be able to discuss and illustrate any of those concerns on page 293. Expect to repeat some
homework problems.

To show understanding you should be able to

  1. duplicate the steps of the MMU to convert a virtual address to a physical address using a page table;
  2. trace the allocation of segments using first fit or best fit;
  3. compute the effective access time of a paged memory given ratios of page faults;

Chapter 10
This chapter deals with demand paging and solves the problems of how to allocate pages among
processes and replace pages so that the CPU is not idle due to page faults. The focus is changed from
how the CPU finds the data in its address registers, to how the OS manages the transition of pages from
disk to RAM and back efficiently. Important ideas include

  1. pure demand paging and the page fault count generated by replacement policies and frame
    allocations as evaluated by reference lists.
  2. effective access time (again)[know steps on p. 306].
  3. minimal page frame allocation, working set, and thrashing.
  4. FIFO, LRU, and LRU replacement algorithms, LFU.
  5. ways for the OS to use idle time to speed paging and increase CPU utilization.
  6. differences between accessing code and accessign data (io).

Chapter 11
This chapter repeats space allocation and access management analysis for the disk drive. Now the items
being managed are files, not processes (and their code and data) but the problems are familiar as are the
solution techniques. First we examine the operations that the OS must provide to manipulate files, then
how the OS keeps data about files, and finally how the file is allocated to sectors on the disk. File use
affects storage decisions. File access affects directory structure decisions. The File System is a subset of
the OS that has to interact with the Memory Management system (where the swap region is considered
part of the later). The techniques to be understood (and modeled) are:

  1. space allocation and free space management (defragmentation as in RAM)--solved by linking
    holes or bit maps.
  2. directory structure and its impact on gaining access to the desired portion of the file--strengths
    and weaknesses of contiguous, linked, and indexed files and directories.
  3. problems with multiple access, speedy access (buffer management), and file consistency (backup,
    system failure)

Expect an exam structured like the previous one.

  9.  REVIEW FOR FINAL

  "Always do the hard part first. If the hard part is impossible, why waste time on the easy part?
Once the hard part is done, you're home free. Always to the easy part first. What you think at first is
the easy part often turns out to be the hard part. Once the easy part is done, you can concentrate all
your efforts on the hard part." - A. Schapira

"Any intelligent fool can make things bigger and more complex. It takes a touch of genius -and a lot of
courage- to move in the opposite direction" - E.F. Schumacher

The final exam will focus on resource management, reviewing what was presented about scheduling
the CPU, managing RAM (paging) and managing disk.

·  CPU(s) execute processes whose discriptive data resides in PCBs, and whose avatars live in queues
when not executing. You should be able to trace the life of a process from creation to cleanup as seen
by the OS.

·  The memory subsystem concerns addressing and address spaces. The addresses refer to RAM
locations and the movement of data from RAM to CPU is supported by hardware between the CPU
and RAM, and by hardware between the swapping disk and RAM and their caches. Memory holds the
code and data segments generated by assemblers. You should be able to doing memory resolution using
a page table as does the MMU. You should be able to discuss how page frames are filled and emptied.
You should be able to relate memory usuage to process scheduling.

·  The io subsystem deals with disk files and how these files are read and written to the disk. Information
about files are kept in directories that are accessible both to users and to user's programs. You should
understand the information in a directory and how the disk's directories are shared. When programs
reference files the OS keeps a FCB for the file. You should be able to trace the life cycle of an FCB and
discuss the data it contains. The OS mediates data transfers between the program and the file using
buffering and tries to schedule device access. You should understand the difference between buffering
and caching and spooling be able to discuss the factors that impact scheduling. The OS deals with io
through device drivers and device controllers. You should be able to explain the process of
accomplishing an io operation. The file system manages the space allocation on the disk drive. This
process mirrors RAM management except that it is several orders of magnitude slower. What techniques
are effective?

·  We want programs to run quickly. However, when there are many programs contending for limited
CPU cycles and Disk arms there will be contention. You should be able to explain the factors that affect
system utilization in multi-programming contexts.

You should review the last two exams and chapter 12.

Collapse node10. Textbook Summary Slides

        10.1.  Chapter 9 (author’s slides)

        10.2.  Chapter 10 (author’s slides)

        10.3.  Chapter 11 (author’s slides)

        10.4.  Chapter 12 (author’s slides)