This project is the follow-up to Transparent Remote File Operations (RPC).
Introduction
The File-Caching Proxy project is a simplified implementation of a distributed file system that supports a subset of the operations of a real distributed file system (e.g., AFS). Its main responsibility is to give users the (approximate) illusion that they are connected directly to the server and are manipulating the files on it. A file-caching proxy can provide a better user experience under poor network conditions, and it can greatly reduce the workload of the server.
To make this work, several critical problems must be solved:
- How to make sure users can always open the newest version of the file?
- How to handle the case where multiple users are working on the same file?
- How to maximize the utilization of cached files to provide a better user experience?
The file-caching proxy is designed and implemented to solve these problems.
Key features:
- (An approximation of) One-copy semantics
This project follows one-copy semantics: there should be no externally observable difference compared to the same system without caching. All caching should be transparent to the application.
- Open-close Session Semantics
This project also follows open-close session semantics and caches at whole-file granularity. A file is guaranteed to be the newest version at open. Once a client has opened a file, later modifications to that file by other clients do not affect the content this client sees. The client can modify the file as if it were the only one using it. All of the client's modifications are saved and become visible to other clients as soon as it closes the file.
- Concurrent Clients and Proxies
One proxy can handle requests from many clients that open many files. The server can handle requests from many proxies.
- LRU Caching strategy
The proxy uses the Least Recently Used (LRU) strategy to choose which cached files to evict when the cache is full.
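The open-close session semantics listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `Server` is a hypothetical in-memory stand-in for the file server, `Session` is a hypothetical per-open handle, and `write()` replaces the whole content for brevity. It is not the project's actual implementation and it does not model the proxy cache.

```python
class Server:
    """Hypothetical in-memory stand-in for the file server."""

    def __init__(self):
        self.files = {}  # path -> (version, content bytes)

    def fetch(self, path):
        # Return the newest (version, content); missing files read as empty.
        return self.files.get(path, (0, b""))

    def store(self, path, data):
        # Publish a new version of the whole file and return its version.
        version, _ = self.files.get(path, (0, b""))
        self.files[path] = (version + 1, data)
        return version + 1


class Session:
    """Hypothetical handle covering one open()..close() session."""

    def __init__(self, server, path):
        self.server = server
        self.path = path
        # open(): take a private snapshot of the newest version.
        self.version, data = server.fetch(path)
        self.buf = bytearray(data)
        self.dirty = False

    def read(self):
        # Reads only ever see the private snapshot.
        return bytes(self.buf)

    def write(self, data):
        # Simplification: overwrite the whole snapshot.
        self.buf = bytearray(data)
        self.dirty = True

    def close(self):
        # close(): publish the modified snapshot as a new version.
        if self.dirty:
            self.version = self.server.store(self.path, bytes(self.buf))
            self.dirty = False


server = Server()
server.store("a.txt", b"v1")
alice = Session(server, "a.txt")   # opens version 1
bob = Session(server, "a.txt")     # also opens version 1
bob.write(b"v2")
bob.close()                        # publishes version 2
carol = Session(server, "a.txt")   # opens after bob's close
```

In this sketch `alice` keeps reading the content she opened, while `carol`, who opened after `bob` closed, sees his update. When two sessions modify the same file, the one that closes last determines the newest version.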
Major Designs
The system structure is shown in the following figure.

System Structure
Proxy
- FileHandler
FileHandler performs the operations requested by the client. Once an RPC connection is established, a FileHandler instance is created to handle all RPC calls from that connection. The supported RPCs are open(), close(), read(), write(), lseek(), and unlink(). For each of them, FileHandler imitates the behavior of the corresponding C call in the standard system library. FileHandler requests a file from CacheManager on open() and pushes the modified file to CacheManager on close().
- ProxyFile
ProxyFile represents a server file on the proxy machine. The first time a client requests a server file, a ProxyFile instance is created and becomes responsible for all operations on that file. These operations include, but are not limited to, accepting read/write requests, pulling the file from the server, pushing modifications to the server, and version control for the locally cached files.
- CacheManager
CacheManager is the intermediate layer between FileHandlers and ProxyFiles. It passes operations from FileHandler to ProxyFile and maintains an LRU queue to keep track of the cached files. Before ProxyFile writes content to the cache folder, it notifies CacheManager, which frees local cache space until there is enough room for the new file.
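The LRU bookkeeping described for CacheManager can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the class name `LRUCache`, its methods, and the byte-based capacity are hypothetical, and the sketch only tracks sizes, so deleting evicted files from the cache folder is left to the caller.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical size-aware LRU queue (names are illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity      # total cache size in bytes
        self.used = 0                 # bytes currently accounted for
        self.entries = OrderedDict()  # path -> size, least recent first

    def touch(self, path):
        # Mark a cached file as most recently used.
        if path in self.entries:
            self.entries.move_to_end(path)

    def make_room(self, needed):
        # Evict least recently used files until `needed` bytes fit.
        evicted = []
        while self.used + needed > self.capacity and self.entries:
            path, size = self.entries.popitem(last=False)
            self.used -= size
            evicted.append(path)
        return evicted

    def add(self, path, size):
        # Register a file that is about to be written to the cache folder.
        if size > self.capacity:
            raise ValueError("file is larger than the whole cache")
        if path in self.entries:
            # Replacing an existing entry: release its old size first.
            self.used -= self.entries.pop(path)
        evicted = self.make_room(size)
        self.entries[path] = size
        self.used += size
        return evicted


cache = LRUCache(capacity=10)
cache.add("a", 4)
cache.add("b", 4)
cache.touch("a")             # "a" becomes most recent, "b" is now oldest
evicted = cache.add("c", 4)  # needs room, so the oldest entry is evicted
```

`OrderedDict` keeps the queue order and the size table in one structure, so both `touch` and eviction are constant-time operations. A real proxy would probably also need to avoid evicting files that clients currently have open, which this sketch does not model.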