1.1. "One day a mother comes home from work and
asks her son, "What did you do today?" The
son replied, "I taught our dog how to play the piano". The mother,
incredulous, asked, "Our dog can
play the piano?", to which the son laughed and replied, "Of course
not mom. I said that I taught him;
I didn't say that he learned how." - Anonymous
Today we begin to look at the
other components of the computer aside from the CPU and investigate
how they are managed. We begin with
storage and recall that it extends from disk to RAM to cache.
The CPU executes an instruction stream, and that stream resides in RAM. The
author reviews how it
gets there as we did in the introductory unit. Caches are considered part of
the CPU, but we understand
that the Cache is a copy of RAM and the hardware must insure that while the
CPU can access cache
more quickly and does not need to pre-empt the memory bus to do so, the cache
and RAM must be
synchronized. The hardware both reads and writes the cache.
When we look at RAM we are
concerned with two issues: (1) sharing and (2) efficient use. Instructions
provide bits to address RAM and we are currently moving from 32 bit addresses
to 64 and 128 bit
addresses. The different possible addresses denote the virtual address
space. Although computers
come with more RAM today than in the past, the physical address space
is a tiny fraction of the virtual
space. However, the OS manages the physical address space so that the
programmer and CPU don't
know how much physical memory there actually is (with the help of memory
mapping hardware).
The compiler is written to
translate source into machine instructions that support a given address
space, so the compiler will assume that the
program will be executed on a machine with the maximum
addressable RAM. Therefore, the OS and the MMU must make it seem that way.
This is done by
folding the sequential virtual memory space and allowing different parts of
it to occupy the same
physical memory locations at different points in time. Chapter 9 presents the
techniques of paging
and segmentation that accomplishes this task. These techniques also serve the
multiprogramming
circumstance when multiple processes, each with maximum address spaces,
interleave their pages or
segments in physical memory while each believes that it has sole access to a
CPU with maximum
virtual memory. The cost of this is that when a process's memory is out on
disk instead of in RAM
when the CPU wants to access it, a memory fault is detected by the OS
and the process must be
suspend for the IO operation that it takes to go to the disk and read in the
desired portion of virtual
memory. Another process can have the CPU while this IO is accomplished, but
the OS has to decide
where the new memory piece will be placed in RAM so as not to interfere with
the process that is
executing. DMA will be used to update RAM and then the original process can re-execute
the
instruction that produced the fault (once it gets back onto the CPU) and the
MMU will be able to give
up the CPU the code or data at the requested virtual address.
The devil is in the details.
2.
October 28
2.1. "The fastest algorithm can frequently be
replaced by one that is almost as fast and much easier
to understand." - D. Jones
Memory management is about
trade-offs.
Contiguous allocation of memory
Divide memory into partitions and run one process in each
partition. If the partitions are of fixed size,
you would want to match the process to the appropriate partition to avoid internal
fragmentation.
However, when you have too few processes, memory space is wasted (external
fragmentation) and
sometimes your process will be too big to fit into any partition. This can be
solved by arranging for the
partitions to be variable in size and dynamically allocated, which implies
that processes may need to be relocated so that partitions can be
combined. Partitions imply that every address in the process is relative
to the partition boundary (and that no reference is made outside of the
partition).
when a process is too
large for real memory, it must be loaded into its partition a piece at a
time.
This can be done as an overlay by the programmer or by the operating
system implementing
virtual memory. Pieces can be of fixed size (page) or variable sized
(segment).
With the assumption
that the process is running in a partition, the OS has to decide where
in the
partition to load the next piece (probably overwriting some other piece)
and how to map the CPU
reference to the new piece, which is no longer a fixed distance from the
start of the partition
(beginning of the code or data). This is solved by
(1) dividing the partition into slots, and
(2) keeping a table that tells the OS what piece of the program is in
each slot.
The MMU is given the address
of the slot table and all memory addresses are intercepted and
first checked to see if they are valid (within the virtual memory space
of the process) and then
to see if the page containing that byte of the process is in one of the
slots.
If so, address is
broken into its page number and offset within the page, and the slot
boundary of the slot holding that page, rather than the partition
boundary, is used to
compute the real memory address.
if not, then the
desired page must be read into a slot from the swap area on disk. An
empty slot is used if available, else a slot whose contents have not
been changed, else
a slot least likely to be referenced soon (after first copying its
contents to the swap area).
All of the pages of
the process must be stored on the swap device and when the program is
large,
a directory is used to locate the page corresponding to an address. A 32
bit address could be
divided into groups of 10,10, and 12 bits and the first group used to
find a page table, the second
group to find a page in the table, which would have to be in the TLB,
and the third would be the
offset. This scheme would provide for 1024 tables, each with 1024 pages,
each with 4096 bytes.
Each table would address .5MB of memory, and all the tables together
would reference 500MB
of RAM (the 32 bit address space). If the actual partition had 16MB, it
would be divided into
32,768 page slots. Some of these slots would hold page tables, some the
process code and data.
The Translation
Look-aside-Buffer is hardware that finds the slot number of a
specific page
in a specific table. It holds only a small number of the possible slot
identifiers (maybe 512 of
the 32,768) but it can return the slot boundary of the pages it holds
VERY quickly. When the
page is not in the TLB, it must be looked up in RAM. Moreover, there is
only one TLB, so it
must be re-initialized context switch.
Non-contiguous allocation
The techniques described above for managing the memory in
a partition can just as well be used to
manage all of memory if we divide all of the RAM slots among the different
processes arbitrarily.
Each process still has its own page table, but the pages and the process code
and data can be loaded
into any slot. Within the partition it was generally the case that many
contiguous pages would be in
the slots, indeed, there was opportunity to do anticipatory buffering by
bringing in multiple pages when
only one was required. This is now less viable as the number of page slots
allocated to each process
might be the same, but their relative location is unknown.
page sharing becomes
an option.
an inverted page table
is an option: a list by slot number of the process id and the process
page
stored in each slot.
a free slot list can
be kept and used by any process when it encounters a memory fault.
the number of
processes in RAM at once can vary freely, unconstrained by the concept
of a
partition.
3.
October 30
3.1. "The study of law is something new and
unfamiliar to most of you -unlike any other
schooling you have gone through before. Here we use the Socratic method: I
call on you; I ask you a
question; you answer it. Why don't I just give you a lecture? Because through
my questions you learn
to teach yourselves. By this method of questioning-answering,
questioning-answering, we seek to
develop in you the ability to analyze that vast complex of facts that
constitutes the relationships of
members within a given society. Now, you may think, at times, that you have
reached a correct and
final answer. I assure you, this is a delusion on your part. You will never
reach a final, correct, and ultimate answer. In my classroom,
there is always another question; there is always a question to
follow your answer. Yes, you are on a treadmill. My little questions spin the
tumblers of your brain.
You are on an operating table; my little questions are fingers probing your
mind. We do brain surgery
here. You teach yourselves law and I train your minds. You come in here with
a skull full of mush, and
you leave thinking like a lawyer. "- Professor Kingsfield (addressing
1st year Harvard Law Students in
"The Paper Chase")
Demand paging looks at the
details of the actual way memory is managed in modern computers.
4.
November 4
4.1. "A little inaccuracy sometimes saves tons of
explanation." - H.H. Munro
The issues about demand paging
are:
effective access time
page replacement
policy
page frame allocation
thrashing (working
set)
Note
the distinction between swapping performed to backing store by
a swapper, and paging
implemented by a pager. Note a page table is used with both. Note that each
process must have both its
page table and its data and code pages in memory in order to execute. Recall
what data about the
process is user data and what data is system data (kept in the OS
area of RAM)
Consider
what happens when a new process acquires the CPU. The process may have no
pages in
memory, so it immediately goes into the io suspend queue. How many pages
should be read in to
satisfy this page fault? Note the 12 steps on p. 305-306.
Insuring
that a free frame is available minimizes page-fault processing. However, if a
page must be
replaced, replacing a "clean" page is almost as fast. When neither
alternative is available, a page must
be chosen (victim) to swap out. Evaluating page-replacement algorithms is
accomplished with a
reference string of page accesses. FIFO, LRU, LFU, MFU are possible, all are
unlikely to be optimal,
but LRU is most common. The larger the page size, the fewer opportunities
there will be for a memory
access to generate a fault (assuming locality of access)
The
working set can be kept with the PCB and reloaded during the context switch.
While there are a
minimum number of pages defined by the number of different memory references
possible in an
assembly instruction, there is no maximum. The policy could be to divide all
the page frames equally
among processes, or proportionally according to the size of the process, or
dynamically according to
the behavior of the process (keep expanding the number of frames allocated to
the faulting process,
and deleting frames from the non-faulting process). With whatever initial
global fame allocation policy
you establish there is the choice of restricting a process to those frames
(local re-allocation) or allowing
the process to victimize other processes (the dynamic allocation process).
Thrashing
is the limit of high CPU utilization. You mitigate thrashing by reducing the
number of
processes in the ready queue, thus freeing their memory so that the remaining
processes can share
the page frames and reduce the their page-fault rates.
I/o
requires special consideration. Note first that some memory must be used for
io buffers and that
this memory cannot be paged because of DMA access to it. This can be solved by
using system
memory for io but that requires a copy into the user area eventually which is
time-consuming.
The alternative is to lock pages so that they cannot be swapped. This is
common, but risks memory
leak by failure to unlock.
For
each of the problems posed by paging there are preferred solutions, many
implemented with
hardware support. However, the problems of implementation (like locking bits)
sometimes force OS
designers toward less efficient solutions that are more practical than ideal.
Sometimes being fast is
better even if it is wasteful. One lesson is how to evaluate the tradeoffs.
5.
November 6
5.1. "We think too much about effective methods
of teaching and not enough about effective
methods of learning. No matter how good teaching may be, each student must
take the responsibility for
his own education." - J. Carolus S.J.
"Any sufficiently
advanced technology is indistinguishable from magic." - A. Clarke
We know what a file is.
What does a file look
like to the OS?
What information does
the OS need to keep about each file?
Where does the OS keep
this data?
What file services do
we expect the OS to render the user?
How can these services
be invoked?
How is the file system
like the memory management system and how is it different?
What are the
operations that must be performed on individual files?
What information
should users be able to obtain from the file system?
What effects file
system performance and how is performance evaluated?
6.
November 11
6.1. "As a rule, software systems do not work
well until they have been used, and have failed
repeatedly, in real applications." - D. Parnas
Originally the OS loaded
itself into memory from an io device and then read executable code from the
card reader and that code read in data from the card reader. Next the OS read
in a stream of code and
data and stored them on tape or disk so that they could be accessed faster
when it came time to load
them. Thus the need for a file system. Then we stored more systems software
on the disk so it didn't
have to be read in repeatedly. The disk thus had the "library code"
area and the user temporary area
which was divided into user jobs (images of the card stream). Next we
separated the user code from
the user data since businesses ran the same programs repeatedly but loaded in
different data. Now there
was the systems library, the user library, and the transitory data area. When
there began to be more users
on the system, the need grew for a directory system by which to keep the
files of different users organized,
and finally the need to swap these files in an out of memory quickly shaped
the structure of the storage
pattern on the disk. However, disk technology changed so rapidly that it was
no long practical for the
user to understand how the data file was distributed on the disk. First
database software took over the task of managing the storage of the
data, and then the drive itself became a black box with input and
output buffers (today the OS deals with the disk drive through virtual
addresses).
Consider the life-cycle of a
file. It is created by a program issuing a request to the OS (an editor, the
command processor, a compiler, a database program, the OS itself, etc.). A
file exists when an identifier
is entered into the file directory. The entry consists of a (unique) file id,
type, a file owner and access
rights, creation time, possibly a disk location, and a length, possibly zero.
If the file is to be created as
the result of copying an existing file, the file system knows the required
length and must find available
space. If the file is being created de novo, the OS has to request a default
initial space allocation.
Provision must be made for the case when this initial guess proves too small.
Once the file is written
(terminating with EOF marker), attempts to access the file must be monitored,
and provision made for
reading, appending, re-writing (modifying within the already allocated
region, including truncating),
and deleting. Each of these operations poses problems for the file system.
One problem is to locate the
byte referenced in the io operation of the executing program. The virtual
address is . This region of the file must be brought into memory, assuming it
is a valid, and then the
requested number of bytes transferred to the CPU starting at that location.
This is usually done in buffers
because data io often uses read next record, and write next record
operations (sequential access) instead
of giving a record address (random access). One of the services of the file
system is to transfer byte data
from the disk and item data to the CPU (numeric or character). When the bytes
at the address do not match
the code of the items requested, a file read error is generated. What other
kinds of errors must the file
system recognize?
7.
November 13
7.1. "The best way to get a good idea is to get a
lot of ideas." - L. Pauling
Review the assigned homework
problems for chapters 10 and 11.
Emphasis on balancing CPU
utilization and paging disk, and on evaluating the effect of different file
allocation architectures on file access times.
8.
November 20
8.1. EXAM 2 covers chapters 7, 9, 10, and 11
Chapter 7
This chapter addresses the problems (and solutions) of allowing threads to
share data. The ideas
introduced include:
interrupts at
"inopportune" times while code is in a critical section
(changing shared data),
resulting in a race condition (outcome depends on which process the OS
happens to select to
execute next).
mutual exclusion and
how to insure it.
synchronization
(fixing all cooperating processes at the same point).
semaphore (with P and
V operations)
monitors (wait and
signal language construct to eliminate concurrent access)
Java's, sychronized
and notify to avoid busy wait. Note the rules on page 218.
as examples the author
gives code for the Bounded Buffer, Reader/Writer, and Dining
Philospher problems.
Be
able to discuss and illustrate any of the above concepts. Given Java code, be
able to analyze
concurrency problems.
Chapter
9
This chapter concerns the issues of sharing memory, beginning with assigning
a fixed partition to a
process and the problems that arise trying to excecute in the partition and
continuing to introduce more
flexible ways to share memory with hardware support to translate virtual
memory references. On page
293 the author summarizes the different considerations that surround memory
sharing and
multiprogramming, with minimal concern for the details of how a paging
strategy actually performs.
(the concern of chapter 10).
Be
able to discuss and illustrate any of those concerns on page 293. Expect to
repeat some
homework problems.
To
show understanding you should be able to
duplicate the steps of
the MMU to convert a virtual address to a physical address using a page
table;
trace the allocation
of segments using first fit or best fit;
compute the effective
access time of a paged memory given ratios of page faults;
Chapter
10
This chapter deals with demand paging and solves the problems of how to
allocate pages among
processes and replace pages so that the CPU is not idle due to page faults.
The focus is changed from
how the CPU finds the data in its address registers, to how the OS manages
the transition of pages from
disk to RAM and back efficiently. Important ideas include
pure demand
paging and the page fault count generated by replacement policies and
frame
allocations as evaluated by reference lists.
effective access time
(again)[know steps on p. 306].
minimal page frame
allocation, working set, and thrashing.
FIFO, LRU, and LRU
replacement algorithms, LFU.
ways for the OS to use
idle time to speed paging and increase CPU utilization.
differences between
accessing code and accessign data (io).
Chapter
11
This chapter repeats space allocation and access management analysis for the
disk drive. Now the items
being managed are files, not processes (and their code and data) but the
problems are familiar as are the
solution techniques. First we examine the operations that the OS must provide
to manipulate files, then
how the OS keeps data about files, and finally how the file is allocated to
sectors on the disk. File use
affects storage decisions. File access affects directory structure decisions.
The File System is a subset of
the OS that has to interact with the Memory Management system (where the swap
region is considered
part of the later). The techniques to be understood (and modeled) are:
space allocation and
free space management (defragmentation as in RAM)--solved by linking
holes or bit maps.
directory structure
and its impact on gaining access to the desired portion of the
file--strengths
and weaknesses of contiguous, linked, and indexed files and directories.
problems with multiple
access, speedy access (buffer management), and file consistency (backup,
system failure)
Expect
an exam structured like the previous one.
9. REVIEW FOR FINAL
"Always do the hard part first. If the
hard part is impossible, why waste time on the easy part?
Once the hard part is done, you're home free. Always to the easy part first.
What you think at first is
the easy part often turns out to be the hard part. Once the easy part is
done, you can concentrate all
your efforts on the hard part." - A. Schapira
"Any intelligent fool can
make things bigger and more complex. It takes a touch of genius -and a lot of
courage- to move in the opposite direction" - E.F. Schumacher
The final exam will focus on
resource management, reviewing what was presented about scheduling
the CPU, managing RAM (paging) and managing disk.
· CPU(s) execute processes whose discriptive data resides in
PCBs, and whose avatars live in queues
when not executing. You should be able to trace the life of a process from
creation to cleanup as seen
by the OS.
· The memory subsystem concerns addressing and address spaces.
The addresses refer to RAM
locations and the movement of data from RAM to CPU is supported by hardware
between the CPU
and RAM, and by hardware between the swapping disk and RAM and their caches.
Memory holds the
code and data segments generated by assemblers. You should be able to doing
memory resolution using
a page table as does the MMU. You should be able to discuss how page frames
are filled and emptied.
You should be able to relate memory usuage to process scheduling.
· The io subsystem deals with disk files and how these files are
read and written to the disk. Information
about files are kept in directories that are accessible both to users and to
user's programs. You should
understand the information in a directory and how the disk's directories are
shared. When programs
reference files the OS keeps a FCB for the file. You should be able to trace
the life cycle of an FCB and
discuss the data it contains. The OS mediates data transfers between the
program and the file using
buffering and tries to schedule device access. You should understand the
difference between buffering
and caching and spooling be able to discuss the factors that impact
scheduling. The OS deals with io
through device drivers and device controllers. You should be able to explain
the process of
accomplishing an io operation. The file system manages the space allocation
on the disk drive. This
process mirrors RAM management except that it is several orders of magnitude
slower. What techniques
are effective?
· We want programs to run quickly. However, when there are many
programs contending for limited
CPU cycles and Disk arms there will be contention. You should be able to
explain the factors that affect
system utilization in multi-programming contexts.
You should review the last two
exams and chapter 12.