Last Updated: 2021-04-19 Mon 11:33

CSCI 4061 HW12: Multiplexed I/O via Threads and poll()

CODE DISTRIBUTION: hw12-code.zip

CHANGELOG:

1 Rationale

Certain classes of systems programs must read from multiple input sources without knowing which will have input available first. By default, read() blocks the calling process if the source has no data immediately available. This means that if there are two sources, one with no data and one with some data, a process that guesses the wrong one will block when it could have acquired data from the other.
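The problem can be illustrated with a minimal sketch of the naive approach, which simply alternates read() calls between two descriptors. The function name read_alternating() is hypothetical and is not taken from the codepack; the sketch assumes both descriptors are read ends of pipes.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper (not from the codepack): read from two descriptors
 * in strict alternation until both reach end of input.
 * Returns the total number of bytes read, or -1 on a read error. */
long read_alternating(int fd_a, int fd_b) {
  assert(fd_a >= 0 && fd_b >= 0);
  int fds[2] = {fd_a, fd_b};
  int done[2] = {0, 0};
  char buf[128];
  long total = 0;

  while (!done[0] || !done[1]) {
    for (int i = 0; i < 2; i++) {
      if (done[i]) {
        continue;
      }
      /* If fds[i] is empty but still open for writing, this call blocks,
       * even when the other descriptor already has data waiting. */
      ssize_t n = read(fds[i], buf, sizeof(buf));
      if (n < 0) {
        return -1;
      }
      if (n == 0) {
        done[i] = 1;          /* end of input on this descriptor */
      } else {
        total += n;
      }
    }
  }
  return total;
}
```

When one writer is slow, the loop spends most of its time blocked on that writer's descriptor, so data from the faster writer waits in its pipe until the slow read() returns.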

This HW studies two techniques to solve this Multiplexed I/O problem.

  1. A common tactic is to create a thread for each input source that is to be monitored. When a thread calls read() on an input source that has no data, the thread blocks but other threads in the process can continue operating. Thus, input arriving at different rates wakes up the associated thread, which handles it.
  2. Through the use of the poll() system call, programs can request that the operating system monitor several I/O sources and notify the program when any of them are ready. This is particularly useful for server-type programs that are attending simultaneously to several clients but wish to avoid the complexities involved in a multi-threaded implementation.
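The first technique above, one thread per input source, might be sketched as follows. This is an illustration under stated assumptions rather than the code in AB_threads.c: the names reader_arg_t, reader_thread(), read_with_threads(), and MAX_SOURCES are all hypothetical, and the descriptors are assumed to be read ends of pipes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_SOURCES 8   /* hypothetical limit for this sketch */

/* Hypothetical per-thread argument: the descriptor to drain and a
 * running count of bytes read from it. */
typedef struct {
  int fd;
  long total;
} reader_arg_t;

/* Thread body: block in read() until end of input. Only this thread
 * sleeps while its source is empty; the others keep running. */
static void *reader_thread(void *varg) {
  reader_arg_t *arg = varg;
  char buf[128];
  ssize_t n;
  while ((n = read(arg->fd, buf, sizeof(buf))) > 0) {
    arg->total += n;
    printf("fd %d: read %zd bytes\n", arg->fd, n);
  }
  return NULL;
}

/* Start one reader thread per descriptor and wait for all of them.
 * Returns total bytes read, or -1 if a thread could not be created. */
long read_with_threads(const int fds[], int nfds) {
  assert(nfds >= 0 && nfds <= MAX_SOURCES);
  pthread_t tids[MAX_SOURCES];
  reader_arg_t args[MAX_SOURCES];
  int started = 0;

  for (int i = 0; i < nfds; i++) {
    args[i].fd = fds[i];
    args[i].total = 0;
    if (pthread_create(&tids[i], NULL, reader_thread, &args[i]) != 0) {
      break;
    }
    started++;
  }

  long total = 0;
  for (int i = 0; i < started; i++) {
    pthread_join(tids[i], NULL);   /* wait for this source to finish */
    total += args[i].total;
  }
  return (started == nfds) ? total : -1;
}
```

Each thread owns its own argument struct, so no locking is needed in this sketch; the calling thread reads the totals only after pthread_join() returns. Depending on the system, the program may need to be compiled with the -pthread option.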

1.1 Associated Reading

Ch 11 of Stevens and Rago covers the basics of thread creation and joining. However, it does not discuss the technique demonstrated here where threads read() and block independently to allow for handling of differently paced input. Study this carefully in the demonstration code as it appears often in practice.

Ch 14.4.2 of Stevens & Rago covers the poll() system call. It is preceded by coverage of another system call, select(), which provides similar functionality but is older and less preferred than poll() for new code.

1.2 Grading Policy

Credit for this HW is earned by taking the associated Quiz, which is linked under Gradescope. The quiz will ask questions similar to those in the QUESTIONS.txt file, and those who complete all answers in QUESTIONS.txt should have no trouble with the quiz.

See the full policy in the syllabus.

1.3 Historical Video Presentation

The following video surveys the difficulties of using blocking read() on multiple sources and how poll() can provide notification that a file descriptor is ready for reading. It does NOT cover the alternative of using multiple threads to handle differing I/O sources.

https://youtu.be/Ap25Ip6Vkd4

2 Codepack

The codepack for the HW contains the following files:

File           Description
QUESTIONS.txt  Questions to answer
Makefile       Makefile to build programs below
AB_read.c      Read from two pipes, alternating between them
AB_threads.c   Read from two pipes, use a thread for each pipe
AB_poll.c      Read from two pipes, use poll() to determine which has data

3 What to Understand

Ensure that you understand

  • How read() blocks a process if there is no data in the source being read from.
  • How use of a thread for each of several input sources can allow a process to retrieve data immediately from those several sources.
  • How poll() can be used to indicate which of several input sources is ready, allowing a single thread/process to obtain data from several sources without blocking.
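The last point can be made concrete with a minimal single-threaded sketch built on poll(). It is not the code in AB_poll.c: the names read_with_poll() and MAX_SOURCES are hypothetical, the descriptors are assumed to be read ends of pipes, and error handling is kept deliberately simple.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_SOURCES 8   /* hypothetical limit for this sketch */

/* Hypothetical helper: read from whichever descriptor poll() reports as
 * ready until every source reaches end of input.
 * Returns total bytes read, or -1 on error. */
long read_with_poll(const int fds[], int nfds) {
  assert(nfds >= 0 && nfds <= MAX_SOURCES);
  struct pollfd pfds[MAX_SOURCES];
  for (int i = 0; i < nfds; i++) {
    pfds[i].fd = fds[i];
    pfds[i].events = POLLIN;   /* request notification of readable data */
    pfds[i].revents = 0;       /* filled in by poll() */
  }

  char buf[128];
  long total = 0;
  int open_count = nfds;
  while (open_count > 0) {
    /* Timeout of -1: sleep until at least one descriptor has an event. */
    if (poll(pfds, nfds, -1) < 0) {
      return -1;
    }
    for (int i = 0; i < nfds; i++) {
      if (pfds[i].revents & (POLLIN | POLLHUP)) {
        /* Data is waiting or the writer hung up, so read() will not block. */
        ssize_t n = read(pfds[i].fd, buf, sizeof(buf));
        if (n < 0) {
          return -1;
        }
        if (n == 0) {
          pfds[i].fd = -1;     /* a negative fd is skipped by poll() */
          open_count--;
        } else {
          total += n;
          printf("fd %d: read %zd bytes\n", pfds[i].fd, n);
        }
      } else if (pfds[i].revents & (POLLERR | POLLNVAL)) {
        pfds[i].fd = -1;       /* stop monitoring a failed descriptor */
        open_count--;
      }
    }
  }
  return total;
}
```

The process sleeps only inside poll(), which returns as soon as any monitored descriptor has an event, so a slow source cannot delay the handling of a fast one.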

4 Questions

Analyze these files and answer the questions given in QUESTIONS.txt.

                           _________________

                            HW 12 QUESTIONS
                           _________________


- Name: (FILL THIS in)
- NetID: (THE kauf0095 IN kauf0095@umn.edu)

Write your answers to the questions below directly in this text file.
HW quiz questions will be related to the questions in this file.


PROBLEM 1: AB_read
==================

A
~

  Open up and examine the `AB_read.c' code. Compile it using the
  provided `Makefile' and run it in terminal. Describe briefly what this
  program does and how it behaves.


B
~

  An important part of the demonstration in `AB_read' is the two child
  processes.  Describe the "speed" of these two child processes in
  writing data that is read in the parent main loop. Is one of the
  children "faster" than the other and if so, how does this manifest
  itself in the output when running the program?


PROBLEM 2: AB_threads
=====================

  It should be apparent from your observations of `AB_read' that the
  parent process must read from several locations (pipes) but these
  receive data at different rates. A shortcoming of `AB_read' is that it
  may block reading from a slow pipe while there is data available on
  the other pipe.

  This problem studies how the `AB_threads' program addresses this
  problem.


A
~

  Compile and run the `AB_threads' program and observe its
  output. Compare its output to that of `AB_read' and describe
  differences that you see. These differences will guide how to analyze
  the new program.


B
~

  Open up and examine the `AB_threads' program. Note where it resembles
  `AB_read' and where it diverges. How does `AB_threads' use threads to
  handle input from the different pipes? How many total threads are
  employed by the program?


C
~

  Based on observing the behavior of threads in `AB_threads', what can
  you conclude happens when a thread calls `read()' on a file that has
  no data? How does this affect other threads in the same process?


PROBLEM 3: AB_poll
==================

  While threads are extremely useful, they introduce complexity into
  programs that is often undesirable as single-threaded processes are
  easier to reason about and more likely to be correct out of the box.
  Traditionally, Unix systems provided only a single thread per process
  and if one wanted to check on multiple I/O sources, the OS kernel
  provided system calls that supported such checks. This problem
  explores the `poll()' system call which provides this capability.


A
~

  Compile and run the `AB_poll' program and observe its output. Compare
  its output to that of `AB_read' and `AB_threads'. Describe which of
  these two earlier programs it resembles more.


B
~

  Open up and examine the `AB_poll' program. Note where it resembles
  `AB_read' and where it diverges. Importantly, recognize that it does
  NOT use multiple threads. Give a short description of the structure of
  `AB_poll' and how it avoids calling `read()' on a file descriptor
  that would cause it to block.


C
~

  Examine the manual page for the `poll()' system call and describe the
  arguments it takes.  Include a description of the following items.
  1. What 3 arguments does the `poll()' system call take and what is
     their use?
  2. What are the meanings of the `POLLIN' and `POLLHUP' symbols used?
  3. What is the difference between the `events' and `revents' fields in
     the `struct pollfd'?
  4. How does one indicate that a `pollfd' should be ignored by a
     `poll()' call?

Author: Chris Kauffman (kauffman@umn.edu)
Date: 2021-04-19 Mon 11:33