Coda File System

Re: LWPs in Coda and native platform threads

From: Robert Watson <robert_at_cyrus.watson.org>
Date: Mon, 27 Jul 1998 14:09:26 -0400 (EDT)
On Mon, 27 Jul 1998, Shafeeq Sinnamohideen wrote:

> > Does Coda currently use blocking DNS calls?  I suspect it just calls the
> > libc gethostbyname/etc, which (without using native threading) can result
> > in considerable blocking.
> I don't know if this would cause much trouble, since in RPC2 everything is
> done by IP, so you'd have only the initial gethostbyname.

However, the timeout on a gethostbyname or reverse lookup can be 60+
seconds, and because libc doesn't use our select, the entire process is
blocked for that time.  Let's look at a sample two-packet-a-minute
denial-of-service attack: do an rpc2_bind from a host with a dead
nameserver on its reverse-lookup path.  The server blocks in libc's
gethostbyaddr (or whatever) until the lookup times out, so two such binds
a minute keep it stalled almost continuously.
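
To make the alternative concrete, here is a minimal Python sketch (not Coda
or RPC2 code) of a reverse lookup that is handed to a helper thread and
abandoned after a deadline, so the caller falls back to logging the bare IP
address.  The names bounded_reverse_lookup and dead_nameserver_lookup, the
pool size, and the one-second default are all made up for illustration;
only socket.gethostbyaddr is a real library call.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical helper pool for lookups; the size is an arbitrary assumption.
_resolver_pool = ThreadPoolExecutor(max_workers=4)


def bounded_reverse_lookup(ip, timeout=1.0, resolver=socket.gethostbyaddr):
    """Return a hostname for ip, or ip itself if the lookup fails or is slow."""
    future = _resolver_pool.submit(resolver, ip)
    try:
        # Wait at most `timeout` seconds instead of the resolver's own timeout.
        return future.result(timeout=timeout)[0]
    except FutureTimeout:
        return ip  # lookup still pending: log the address instead
    except OSError:
        return ip  # covers socket.herror / socket.gaierror


def dead_nameserver_lookup(ip):
    """Stand-in for a resolver whose nameserver never answers promptly."""
    time.sleep(0.5)
    return ("never.example", [], [ip])


if __name__ == "__main__":
    # The caller gets control back after 0.05 s, not after the full stall.
    print(bounded_reverse_lookup("192.0.2.1", timeout=0.05,
                                 resolver=dead_nameserver_lookup))
```

The lookup itself still runs to completion in the helper thread; the point
is only that the thread servicing requests stops waiting for it.  A real
server would also have to bound the number of outstanding lookups.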

Let's look at a serious performance issue related to scalability:

Suppose hosts come from networks with varying quality of bandwidth between
the Coda server and the name server.  For each incoming binding that is
logged (i.e., a reverse lookup occurs), a delay of 0.5 seconds occurs.
Now suppose one such client is under fairly high load and makes a fair
number of connections, as do the ten machines near it.  Since each lookup
blocks the whole process, those delays add up one after another.
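
A back-of-the-envelope version of that load, as a small Python sketch.  The
0.5-second delay and the eleven clients (this one plus its ten neighbours)
come from the scenario above; the rate of five binds per client per minute
is an assumed figure picked only to make the arithmetic concrete.

```python
def blocked_fraction(clients, binds_per_client_per_min, lookup_delay_s):
    """Fraction of each minute the server spends stalled in reverse lookups,
    assuming every lookup blocks the whole process and none overlap."""
    blocked_s = clients * binds_per_client_per_min * lookup_delay_s
    return min(blocked_s / 60.0, 1.0)


def max_binds_per_second(lookup_delay_s):
    """Upper bound on logged binds per second when lookups are serialized."""
    return 1.0 / lookup_delay_s


if __name__ == "__main__":
    print(round(blocked_fraction(11, 5, 0.5), 2))  # 27.5 s out of every 60 s
    print(max_binds_per_second(0.5))
```

With those assumed numbers the server is stalled for 27.5 seconds of every
minute, and it can never log more than two binds a second no matter how
fast the rest of the code is.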

I'm not sure how often gethostby* is called in Coda, but I suspect RPC2
makes use of it in a number of places (certainly for outgoing connections;
I don't know about incoming).

> For file I/O, LWP has the idea of not blocking the entire process
> by doing a select unless all threads are blocked on an IOMGR_select, which
> causes the problem that if you have a thread doing a long-running
> computation, the other threads' I/O will start only when the first thread
> decides to do I/O. Since the RPC2 SocketListener is an LWP thread, in the
> case where a server is busy, it should return BUSY to a client request,
> but occasionally the worker thread will run without blocking for long
> enough that once the SocketListener's select is done, the request has
> timed out and the client thinks the server is dead. For this reason, I
> think that an implementation of LWP and RPC2 on pthreads would help, since
> I/O would be handled by pthreads, and having the SocketListener as a
> preemptive pthread, not LWP, would be safe and reduce latency.
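
The starvation described above can be reproduced in a toy model.  This is a
Python sketch with generators standing in for cooperative threads and a
simulated clock; it is not the LWP or RPC2 API, and every name and number
in it (run, worker, socket_listener, the tick counts, the 15-tick client
timeout) is invented for illustration.

```python
from collections import deque


def run(tasks):
    """Cooperative round-robin: each task keeps the CPU until it yields."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


def worker(clock, bursts):
    for ticks in bursts:
        clock["now"] += ticks  # long computation with no yield inside it
        yield                  # only now can any other thread run


def socket_listener(clock, arrivals, timeout, log):
    pending = deque(sorted(arrivals))
    last_seen = None
    while pending:
        if clock["now"] == last_seen:
            # Nobody else ran since our last turn: sleep until the next packet.
            clock["now"] = max(clock["now"], pending[0])
        # Handle every request that arrived before we got the CPU back.
        while pending and pending[0] <= clock["now"]:
            arrived = pending.popleft()
            waited = clock["now"] - arrived
            log.append((arrived, "answered" if waited <= timeout else "timed out"))
        last_seen = clock["now"]
        yield


def simulate(bursts, arrivals, timeout):
    clock, log = {"now": 0}, []
    run([socket_listener(clock, arrivals, timeout, log), worker(clock, bursts)])
    return log


if __name__ == "__main__":
    print(simulate([20, 2], [1, 21], 15))
```

The request arriving at tick 1 sits behind a 20-tick burst and is only seen
at tick 20, after the client's 15-tick timeout; the one arriving at tick 21
is picked up a tick later.  A preemptive listener would not have to wait
for the worker to yield.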

Is IOMGR_Select used for managing local files on the server or client?
If not, any reasonably high-latency medium might actually be slowing down
service -- for example, if the vice cache is on floppy disk, or vice is
serving files from a CD-ROM.  If IOMGR_Select is used locally, other
threads can run and access data that might be readable from the buffer
cache, or from a faster device.  I suspect it is not currently used
remotely?

> I think the main reason this hasn't been done yet is that while LWP works
> and isn't impossible to port, there are many other things in a less
> finished state.

Yes. :)

  Robert N Watson 

Carnegie Mellon University            http://www.cmu.edu/
TIS Labs at Network Associates, Inc.  http://www.tis.com/
SafePort Network Services             http://www.safeport.com/
[email protected]              http://www.watson.org/~robert/
Received on 1998-07-27 14:15:30