meeting at BAT

phdeniel edited this page Oct 26, 2011 · 4 revisions

The meeting was held during the BAT in Sunnyvale, in the NetApp building.

Points discussed were:

  • a brief summary about the pNFS talk the day before
  • FSAL Upcalls
  • new metadata cache design
  • locks and states in the lock manager
  • FSAL refactoring & cleanups + OOing of the code
  • test suites

PNFS

The consensus was in favor of a compact API, like the ones the kernel has, with one structure to pass the arguments and one to return the results. This is not incompatible with the work done by Linux Box on CEPH: arguments and replies only need to be packed into dedicated structures.
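
The "one structure in, one structure out" convention can be sketched as follows. This is only an illustration under stated assumptions: every name here (`layoutget_arg`, `layoutget_res`, `example_layoutget`, the field names and the iomode values) is hypothetical and is not the API that was agreed on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is not the project's real API. */

/* All inputs of the call travel in a single argument structure. */
struct layoutget_arg {
    uint64_t offset;   /* start of the requested byte range */
    uint64_t length;   /* length of the requested byte range */
    uint32_t iomode;   /* assumed encoding: 1 = read, 2 = read/write */
};

/* All outputs of the call come back in a single result structure. */
struct layoutget_res {
    uint64_t offset;          /* start of the granted range */
    uint64_t length;          /* length of the granted range */
    int      return_on_close; /* non-zero if the layout must be returned on close */
};

/*
 * One argument structure in, one result structure out.
 * Returns 0 on success and -1 on invalid input. New fields can later be
 * added to either structure without changing this prototype.
 */
int example_layoutget(const struct layoutget_arg *arg,
                      struct layoutget_res *res)
{
    if (arg == NULL || res == NULL || arg->length == 0)
        return -1;

    /* Toy behavior: grant exactly what was asked for. */
    res->offset = arg->offset;
    res->length = arg->length;
    res->return_on_close = (arg->iomode == 2);
    return 0;
}
```

The point of the convention is that a backend such as the CEPH one only has to fill and read these two structures, and the function signature stays stable when the protocol grows.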

Implementing the LAYOUT_RECALL NFSv4.1 callback will require back-channels to be implemented. Matt is working on using TIRPC to bring this feature to the project. FSAL Upcalls will also ease this implementation (to trigger the recall of a layout).

MDCACHE

Matt spoke about the refurbishment of the name cache inside cached directories. The "dir_chain" is to be replaced by a more compact and efficient algorithm based on AVL trees, which should also solve the memory leaks observed when big directories are created and deleted frequently.

The design of the cache (currently a write-through cache) was discussed. Philippe described different levels of caching: "no cache and direct call to FSAL", "attributes only", "attributes + symlink content", "attributes + symlink content + directory name cache". The cache should support all of these methods, with the cache policy set per export entry. An entry inherits the cache policy of the export associated with the request that last accessed it. Philippe highlighted that dependencies in the namespace (a file belongs to a directory that itself belongs to a directory) have to be considered carefully.

Cache_Inode will be refurbished around this new policy-based method. FSAL upcalls will make it possible to know which cache entries have to be updated or invalidated. This requires a new API, which Philippe will design.
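
The four caching levels and the per-export inheritance rule can be sketched like this. It is a minimal illustration, not the design itself: the type and function names (`cache_policy_t`, `export_entry`, `cache_entry_touch`, and so on) are assumptions made up for the example.

```c
#include <stdbool.h>

/* Hypothetical names: the real policy API was still to be designed. */

/* The four caching levels, ordered from "nothing" to "everything". */
typedef enum {
    CACHE_POLICY_NONE,          /* no cache, direct call to FSAL */
    CACHE_POLICY_ATTRS,         /* attributes only */
    CACHE_POLICY_ATTRS_SYMLINK, /* attributes + symlink content */
    CACHE_POLICY_FULL           /* attributes + symlink content + directory name cache */
} cache_policy_t;

/* The policy is configured per export entry. */
struct export_entry {
    int            export_id;
    cache_policy_t policy;
};

/* A cache entry carries the policy it is currently managed under. */
struct cache_entry {
    cache_policy_t policy;
};

/* The entry inherits the policy of the export that last accessed it. */
void cache_entry_touch(struct cache_entry *entry,
                       const struct export_entry *exp)
{
    entry->policy = exp->policy;
}

/* Because the levels are ordered, each question is a simple comparison. */
bool policy_caches_attrs(cache_policy_t p)
{
    return p >= CACHE_POLICY_ATTRS;
}

bool policy_caches_symlink(cache_policy_t p)
{
    return p >= CACHE_POLICY_ATTRS_SYMLINK;
}

bool policy_caches_dirents(cache_policy_t p)
{
    return p == CACHE_POLICY_FULL;
}
```

Ordering the enum so that each level includes the previous one keeps the checks trivial. The sketch does not address the namespace dependencies mentioned above, for example an entry reachable through two exports with different policies.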

UPCALLS

Jeremy described the work he did on GPFS, implementing the design paper available in the repository. His work is based on a GPFS-specific feature, but porting it to other backends should be no problem. Jim highlighted that using the fanotify API in FSAL_VFS would be a great way of providing FSAL Upcalls for VFS. Brent (on the phone) wants things to remain simple (of course, everybody agreed :-) ). As said before, Upcalls will provide a way to control the MDCACHE. FSAL Upcalls and the Cache Inode refactoring will be separate tasks. Philippe will write an "upcalls simulator thread" that emulates an existing FSAL upcalls implementation, in order to test the Cache Inode invalidator. Later, true upcalls will be used.
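
To make the relationship between upcalls and the cache concrete, here is a small sketch of how an upcall event could be applied to cached entries. Everything in it is an assumption for illustration: the event types, the structures and `apply_upcall` are hypothetical, and the real events depend on what each backend (GPFS, or fanotify for VFS) can report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event model, not the API from the design paper. */
typedef enum {
    UPCALL_INVALIDATE,  /* the cached data can no longer be trusted */
    UPCALL_UPDATE_SIZE  /* the backend reports a new file size */
} upcall_type_t;

struct upcall_event {
    upcall_type_t type;
    uint64_t      object_id; /* which object changed behind the server's back */
    uint64_t      new_size;  /* only meaningful for UPCALL_UPDATE_SIZE */
};

struct cached_object {
    uint64_t object_id;
    uint64_t size;
    bool     valid;
};

/*
 * Apply one event to a table of cached objects.
 * Returns the number of entries that matched the event.
 */
size_t apply_upcall(struct cached_object *table, size_t count,
                    const struct upcall_event *ev)
{
    size_t touched = 0;

    for (size_t i = 0; i < count; i++) {
        if (table[i].object_id != ev->object_id)
            continue;
        if (ev->type == UPCALL_INVALIDATE)
            table[i].valid = false;       /* force a refetch from the FSAL */
        else
            table[i].size = ev->new_size; /* refresh in place */
        touched++;
    }
    return touched;
}
```

A simulator thread in the spirit of the one described above would only need to build such events, for instance at random or from a script, and feed them to the same entry point that real upcalls will use later.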

LOCKS

Frank summarized his work on the state manager and the implementation of NFSv3/NLM + NFSv4.x locks. The libc function fcntl is used to set locks on "fd based FSALs" (XFS/VFS/GPFS). The question of asynchronous locks was discussed: should we have completion queues? Relying on FSAL Upcalls could be a very valuable solution for asynchronous locking.
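
For reference, setting a byte-range lock through fcntl on a file descriptor looks roughly like this. The wrapper name `try_range_lock` is hypothetical; the `struct flock` fields and the `F_SETLK` command are the standard POSIX interface.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to set or clear a POSIX byte-range lock without blocking.
 * `type` is F_RDLCK, F_WRLCK or F_UNLCK. A `len` of 0 means "up to the
 * end of the file". Returns 0 on success, -1 with errno set on failure.
 * (Hypothetical helper name; the fcntl call itself is standard.)
 */
int try_range_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET; /* `start` is an absolute offset */
    fl.l_start = start;
    fl.l_len = len;

    /* F_SETLK fails immediately if a conflicting lock is held. */
    return fcntl(fd, F_SETLK, &fl);
}
```

With `F_SETLK`, a conflicting lock makes the call fail at once with `EACCES` or `EAGAIN`, while `F_SETLKW` blocks the calling thread until the lock is granted. Neither notifies the caller later, which is why completion queues or upcalls come up for asynchronous (blocking) locks.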

FSAL REFACTORING + OOing THE CODE

Jim is pushing as much code as possible from src/FSAL/FSAL_* to src/FSAL, as functions common to all FSALs. He will add a few more test_access functions that will offload the FSAL by checking, via object attributes, whether an operation will be permitted. The configure.ac should be updated to check which FSALs can be compiled on a given machine. Then, with this information, all possible FSALs should be built in a single pass. Global headers should help a lot, especially with FSALs that are dynamically loaded at startup. In the OO approach, no C++ compiler will be involved, but an object-oriented "way of thinking" will be preferred. The common functions coded in src/FSAL will become a kind of superclass, from which specific FSALs will inherit.
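
One common way to get this kind of "inheritance" in plain C is a table of function pointers that each FSAL copies from the common code and then overrides selectively. The sketch below assumes that approach; the names (`fsal_ops`, `common_test_access`, `make_vfs_ops`) are hypothetical, and the permission check is deliberately simplified to owner and "other" mode bits, ignoring groups and ACLs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structures: a simplified stand-in for real object attributes. */
struct obj_attrs {
    uint32_t owner; /* uid of the owner */
    uint32_t mode;  /* Unix permission bits, e.g. 0640 */
};

/* The "class": a table of operations every FSAL provides. */
struct fsal_ops {
    const char *(*name)(void);
    bool (*test_access)(const struct obj_attrs *attrs,
                        uint32_t uid, uint32_t wanted);
};

/*
 * Common ("superclass") access test, done purely from cached attributes
 * so the backend is not called. `wanted` is a mask of rwx bits (4, 2, 1).
 * Simplification: group bits are ignored.
 */
bool common_test_access(const struct obj_attrs *attrs,
                        uint32_t uid, uint32_t wanted)
{
    uint32_t granted = (uid == attrs->owner) ? (attrs->mode >> 6) & 7u
                                             : attrs->mode & 7u;
    return (granted & wanted) == wanted;
}

static const char *common_name(void) { return "common"; }
static const char *vfs_name(void)    { return "vfs"; }

/* Default operations shared by all FSALs. */
const struct fsal_ops common_ops = { common_name, common_test_access };

/* A "subclass" starts from the common table and overrides what differs. */
struct fsal_ops make_vfs_ops(void)
{
    struct fsal_ops ops = common_ops; /* inherit everything */
    ops.name = vfs_name;              /* override one method */
    return ops;
}
```

A specific FSAL that needs a different access check would replace `test_access` in its own table in the same way, while the others keep the common implementation.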

TESTS

Philippe asked for this topic to be discussed. It's important to have common tests and a common testing methodology. Tests are of different kinds:

  • compilation test: make sure that everything compiles when different ./configure options are used.
  • feature test: make sure a given feature works properly
  • stress test

Boaz pointed out that "git clone ; umount ; mount ; git status ; make" is a great stress test. The xfstests test suite is great too. Philippe spoke about iozone/metarates and their "MPI brothers" IOR/MDtests. Using Jenkins could help a lot. Sukwoo pointed out that IBM is using Jenkins too. Regarding NFSv4 feature tests, the newpynfs script from Fred Isaman now has XML output in JUnit format (thanks to Tigran). This would make integrating the tests into Jenkins very easy. Boaz said that a nightly batch could be set up at Panasas to run the tests in batch mode.