Perhaps some sort of watermark could be set for what is acceptable to
cache as a whole file and what is not?

In other, completely different news: I tried compiling coda on my Alpha
and I got laughed at by my compiler... The first minor choke was in
lib-src/lwp.c at line 552:

    if ((int) stackptr == -1)

stackptr is originally typed as (char *)... In Alpha-land, pointers are
8 bytes and sizeof(void *) == sizeof(long), not sizeof(int)... What kind
of evil voodoo is this, and what's the proper way to fix it?

Jason
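A width-safe way to write that test is either to keep the comparison in
pointer space or to cast through an integer type as wide as a pointer.
The sketch below assumes only what the message states (stackptr is a
char * and -1 is used as an error sentinel); check_stack() is a
hypothetical stand-in for the surrounding lwp.c code, and intptr_t needs
a C99-style <stdint.h> (on older compilers, long is pointer-sized on the
Alpha, as noted above).

    #include <stdint.h>   /* intptr_t; on pre-C99 compilers use long on Alpha */

    /* check_stack() is a stand-in: the real lwp.c presumably obtains
     * stackptr from its own stack allocator and treats -1 as "failed". */
    int check_stack(char *stackptr)
    {
        /* Original test, which truncates the 64-bit pointer to 32 bits:
         *     if ((int) stackptr == -1)
         */

        /* Option 1: stay in pointer space. */
        if (stackptr == (char *) -1)
            return -1;

        /* Option 2: go through a pointer-sized integer type. */
        if ((intptr_t) stackptr == -1)
            return -1;

        return 0;
    }

Either form behaves the same on 32-bit and 64-bit targets; the
pointer-space comparison has the advantage of needing no new headers.

On Wed, 20 Jan 1999, Robert Watson wrote: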
> On Wed, 20 Jan 1999, Peter J. Braam wrote:
> 
> > > AFS deals with this by 'chunking' -- that is, it demand-loads portions of
> > > files into the cache as they are needed; I believe it also uses an
> > > aggressive read-ahead policy. The net result is more efficient use of the
> > > cache for partial file reads or writes, especially for mammoth files.
> > 
> > I just sent a message about this.
> > 
> > > However, that raises consistency issues: currently the resolution of
> > > conflicts between file versions is that of entire file system objects
> > > (files or directories). Dealing with fine-grained inconsistency severely
> > > complicates the repair process, I would guess; it is not even clear if the
> > > client would have access to the whole file version it is attempting to
> > > integrate. For disconnected operation anyway, it seems like transferring
> > > the whole file is more useful, as the chances are high that if you access
> > > a bit of the file, you will access all of it (loading it into emacs,
> > > writing it out, etc).
> > 
> > Whoops, this is a good point. However, the conflict resolution mechanisms
> > themselves would use the chunk fetching code, so it need not really be a
> > problem.
> 
> The problem situation I was thinking of was this: Client1 is connected,
> and retrieves the middle chunk of a file. A write is made to the middle
> chunk, but before it can be written back, Client1 goes disconnected. We
> now have a pending write on the middle chunk of a file, but only the
> middle chunk is on Client1. Client2 now bops up and proceeds to modify
> the file in some manner, and succeeds. Client1 now reconnects. A
> Client-Server conflict has arisen and must be resolved for the change to
> be reintegrated. However, because only a small part of the entire file is
> available on Client1, the resolution process may now be more difficult.
> Consider, for example, the case where it is an MS Word file. An
> application-specific resolver is required, but it doesn't have access to
> the two complete versions of the file, so it may not even have access to
> the old file header :(. The whole-file-in-cache is a simplification for
> version control that I think really does make life easier.
> 
> On the other hand, chunking would definitely improve performance
> (especially perceived performance during a more or the like--the latency
> to the first available data is much lower). Maybe this is an appropriate
> application for the 'client class' behavior I suggest below, and that we
> both seem to agree is a large project and should wait :(.
> 
> > > > I was really hoping to have home directories mounted over coda, with inbox
> > > > being stored right in the accounts, (and also large procmail filtered
> > > > mailing-list archived mail folders) but that won't be feasible until at
> > > > least write-back caching is available in a connected state.
> > > > 
> > > > I just got coda running recently, but the initial excitement has faded
> > > > somewhat after discovering the above.. :(
> > > 
> > > My suspicion is that the arrangement you describe will suffer from Coda's
> > > weak consistency model: if multiple clients are using write-back caching,
> > > then conflicts can occur.
> > 
> > Write back caching will have the same semantics as connected Coda.
> > If another client comes along, then the one holding the write back token
> > will have to reintegrate first.
> > 
> > Conflicts in Coda arise as easily in connected mode as, in AFS, you would
> > overwrite data (last close wins in AFS). The problem with receiving email
> > in Coda is locking to avoid conflicts. I don't know how AFS does this,
> > but with NFS it is certainly possible to ruin your mailbox easily.
> 
> Token-like behavior for file systems is clearly very nice, and would
> improve consistency. However, this is a departure from the traditional
> Coda consistency model. With replicated servers, how will tokens be
> allocated, and by which server(s)?
> 
> In Coda, conflicts are more easily come upon than with AFS last-close
> behavior because of replication. Having the 'AFS-class client' that uses
> last-close and timestamps to manage conflicts might result in unexpected
> but at least non-interactive behavior.
> 
> > It's a good puzzle to see if Coda's connected semantics allow for the
> > atomic creation of a lock file. Perhaps that is just possible. On the
> > other hand, I don't really have much more faith in AFS or NFS without lock
> > daemons when it comes to my mail.
> 
> I would guess that Coda does not allow atomic creation on a replicated
> volume, only on an unreplicated one. Even then, only Venus will know
> whether it was atomic and successful; if the client is disconnected, then
> the userland mail process only sees the lock file creation succeed, and
> doesn't know it has been logged. Similarly, you might have problems with
> lock files being left around: client is connected, creates lock file, and
> then goes disconnected. This is a lot like that nasty netscape problem
> with netscape crashing and leaving lock files all over the place, only in
> this case the disconnected mail client still thinks it has an atomic lock
> :O.
> 
> As such, a disconnected system really needs to support lock preemption,
> possibly notification, and certainly verification that a lock is still
> valid. Perhaps an optional distributed lock manager could be used with
> Coda (presumably replicated with strong consistency in the style of Ubiq
> or using a multi-party lock algorithm). Disconnected operation still
> introduces uncomfortable situations, but at least connected clients could
> guarantee locks. My suspicion is, however, that when there are already
> specific multi-user locking semantics for a specific application, that
> application should be served by its own replication mechanism and not by a
> file system with weak consistency. So replicated IMAP servers might be a
> better solution, with IMAP's disconnected operation and reintegration
> techniques. Or a mail reader that takes advantage of Coda as a message
> store with weak semantics.
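For anyone who has not run into the idiom being debated here, "atomic
creation of a lock file" usually means the classic dotlock trick: open()
with O_CREAT|O_EXCL either creates the lock file or fails with EEXIST, as
a single operation. The sketch below is illustrative only (the function
names and the pid-in-lockfile convention are not anything Coda ships),
and it says nothing about whether Venus can honour O_EXCL atomically on a
replicated volume or while disconnected, which is exactly the open
question above.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Classic dotlock: returns 0 if we created the lock, 1 if someone else
     * (or a stale crash) holds it, -1 on any other error. */
    int acquire_dotlock(const char *lockpath)
    {
        char buf[32];
        int fd = open(lockpath, O_CREAT | O_EXCL | O_WRONLY, 0444);

        if (fd < 0)
            return (errno == EEXIST) ? 1 : -1;

        /* Record the holder's pid so a stale lock can at least be diagnosed. */
        snprintf(buf, sizeof(buf), "%ld\n", (long) getpid());
        (void) write(fd, buf, strlen(buf));   /* best effort; the lock exists */
        close(fd);
        return 0;
    }

    int release_dotlock(const char *lockpath)
    {
        return unlink(lockpath);
    }

This is also why the thread's suspicion of NFS is fair: early NFS
implementations did not make O_EXCL atomic, which is where the
link()-based lock workarounds in mail software come from. And, as noted
above, on a disconnected Coda client the open() can appear to succeed
while only being logged locally, so the lock means little to a second
client.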
> 
> > > This is not to suggest that Coda is not useful in such an environment;
> > > its real benefits come in the case of mobile computing. It might be
> > > interesting to introduce the concept of different 'classes' of client:
> > > that is, the semantics and consistency enforced for a particular client
> > > might depend on the role it was expected to play.
> > 
> > Yup, unfortunately, that's a rather major project probably.
> 
> It sounds like it. Ideally I see something like this:
> 
> venus -consistency strong
> venus -consistency afs
> venus -consistency codamobile
> venus -consistency slush
> 
> In each case, the strongest consistency available would be used, but the
> fallback case where it wasn't would be different. That is, if you started
> venus with codamobile, when connected you'd get AFS or strong consistency,
> but when disconnected you'd get logging and reintegration. With AFS
> consistency at startup, you'd get strong or AFS connected, and when
> disconnected either everything hangs or obeys last-write based on
> timestamps or something. With strong, you'd get strong or hangs.
> 
> Robert N Watson
> 
> [email protected]  http://www.watson.org/~robert/
> PGP key fingerprint: 03 01 DD 8E 15 67 48 73 25 6D 10 FC EC 68 C1 1C
> 
> Carnegie Mellon University  http://www.cmu.edu/
> TIS Labs at Network Associates, Inc.  http://www.tis.com/
> SafePort Network Services  http://www.safeport.com/
> 
Received on 1999-01-20 16:20:37