| draft-ietf-nfsv4-minorversion1-24.txt | | draft-ietf-nfsv4-minorversion1-25.txt | |
| | | | |
| NFSv4 S. Shepler | | NFSv4 S. Shepler | |
| Internet-Draft M. Eisler | | Internet-Draft M. Eisler | |
| Intended status: Standards Track D. Noveck | | Intended status: Standards Track D. Noveck | |
|
| Expires: February 7, 2009 Editors | | Expires: February 23, 2009 Editors | |
| Aug 06, 2008 | | August 22, 2008 | |
| | | | |
| NFS Version 4 Minor Version 1 | | NFS Version 4 Minor Version 1 | |
|
| draft-ietf-nfsv4-minorversion1-24.txt | | draft-ietf-nfsv4-minorversion1-25.txt | |
| | | | |
| Status of this Memo | | Status of this Memo | |
| | | | |
| By submitting this Internet-Draft, each author represents that any | | By submitting this Internet-Draft, each author represents that any | |
| applicable patent or other IPR claims of which he or she is aware | | applicable patent or other IPR claims of which he or she is aware | |
| have been or will be disclosed, and any of which he or she becomes | | have been or will be disclosed, and any of which he or she becomes | |
| aware will be disclosed, in accordance with Section 6 of BCP 79. | | aware will be disclosed, in accordance with Section 6 of BCP 79. | |
| | | | |
| Internet-Drafts are working documents of the Internet Engineering | | Internet-Drafts are working documents of the Internet Engineering | |
| Task Force (IETF), its areas, and its working groups. Note that | | Task Force (IETF), its areas, and its working groups. Note that | |
| | | | |
| skipping to change at page 1, line 35 | | skipping to change at page 1, line 35 | |
| and may be updated, replaced, or obsoleted by other documents at any | | and may be updated, replaced, or obsoleted by other documents at any | |
| time. It is inappropriate to use Internet-Drafts as reference | | time. It is inappropriate to use Internet-Drafts as reference | |
| material or to cite them other than as "work in progress." | | material or to cite them other than as "work in progress." | |
| | | | |
| The list of current Internet-Drafts can be accessed at | | The list of current Internet-Drafts can be accessed at | |
| http://www.ietf.org/ietf/1id-abstracts.txt. | | http://www.ietf.org/ietf/1id-abstracts.txt. | |
| | | | |
| The list of Internet-Draft Shadow Directories can be accessed at | | The list of Internet-Draft Shadow Directories can be accessed at | |
| http://www.ietf.org/shadow.html. | | http://www.ietf.org/shadow.html. | |
| | | | |
|
| This Internet-Draft will expire on February 7, 2009. | | This Internet-Draft will expire on February 23, 2009. | |
| | | | |
| Abstract | | Abstract | |
| | | | |
| This Internet-Draft describes NFS version 4 minor version one, | | This Internet-Draft describes NFS version 4 minor version one, | |
| including features retained from the base protocol and protocol | | including features retained from the base protocol and protocol | |
| extensions made subsequently. Major extensions introduced in NFS | | extensions made subsequently. Major extensions introduced in NFS | |
| version 4 minor version one include: Sessions, Directory Delegations, | | version 4 minor version one include: Sessions, Directory Delegations, | |
| and parallel NFS (pNFS). | | and parallel NFS (pNFS). | |
| | | | |
| Requirements Language | | Requirements Language | |
| | | | |
| skipping to change at page 2, line 18 | | skipping to change at page 2, line 18 | |
| 1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11 | | 1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11 | |
| 1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11 | | 1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11 | |
| 1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11 | | 1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11 | |
| 1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12 | | 1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12 | |
| 1.5. General Definitions . . . . . . . . . . . . . . . . . . 12 | | 1.5. General Definitions . . . . . . . . . . . . . . . . . . 12 | |
| 1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15 | | 1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15 | |
| 1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15 | | 1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15 | |
| 1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 15 | | 1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 15 | |
| 1.6.3. File System Model . . . . . . . . . . . . . . . . . 16 | | 1.6.3. File System Model . . . . . . . . . . . . . . . . . 16 | |
| 1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18 | | 1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18 | |
|
| 1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 18 | | 1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 19 | |
| 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 19 | | 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 20 | |
| 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20 | | 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20 | |
| 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20 | | 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20 | |
| 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20 | | 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20 | |
| 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23 | | 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23 | |
| 2.4. Client Identifiers and Client Owners . . . . . . . . . . 24 | | 2.4. Client Identifiers and Client Owners . . . . . . . . . . 24 | |
| 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 27 | | 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 27 | |
| 2.4.2. Server Release of Client ID . . . . . . . . . . . . 28 | | 2.4.2. Server Release of Client ID . . . . . . . . . . . . 28 | |
| 2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28 | | 2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28 | |
| 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 29 | | 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 29 | |
| 2.6. Security Service Negotiation . . . . . . . . . . . . . . 30 | | 2.6. Security Service Negotiation . . . . . . . . . . . . . . 30 | |
| 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 30 | | 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 30 | |
|
| 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 30 | | 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 31 | |
| 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31 | | 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31 | |
| 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 35 | | 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 35 | |
| 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38 | | 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38 | |
| 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38 | | 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38 | |
| 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 | | 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 | |
| 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 | | 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 | |
|
| 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 38 | | 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39 | |
| 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 38 | | 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39 | |
| 2.9.2. Client and Server Transport Behavior . . . . . . . . 39 | | 2.9.2. Client and Server Transport Behavior . . . . . . . . 39 | |
| 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 | | 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 | |
| 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 | | 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 | |
| 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 | | 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 | |
| 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 | | 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 | |
| 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 | | 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 | |
| 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 45 | | 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 45 | |
| 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 48 | | 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 48 | |
| 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 61 | | 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 61 | |
|
| 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 63 | | 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 64 | |
| 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 69 | | 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 69 | |
| 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 73 | | 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 73 | |
| 2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 75 | | 2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 75 | |
| 2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 75 | | 2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 75 | |
|
| 2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 78 | | 2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 79 | |
| 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 78 | | 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 79 | |
| 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 79 | | 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 79 | |
|
| 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 79 | | 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 80 | |
| 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 81 | | 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 82 | |
| 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 90 | | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 90 | |
| 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 90 | | 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 90 | |
| 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 91 | | 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 91 | |
| 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 91 | | 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 91 | |
| 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 91 | | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 91 | |
| 4.2.1. General Properties of a Filehandle . . . . . . . . . 92 | | 4.2.1. General Properties of a Filehandle . . . . . . . . . 92 | |
| 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 93 | | 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 93 | |
| 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 93 | | 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 93 | |
| 4.3. One Method of Constructing a Volatile Filehandle . . . . 94 | | 4.3. One Method of Constructing a Volatile Filehandle . . . . 94 | |
| 4.4. Client Recovery from Filehandle Expiration . . . . . . . 95 | | 4.4. Client Recovery from Filehandle Expiration . . . . . . . 95 | |
| | | | |
| skipping to change at page 4, line 43 | | skipping to change at page 4, line 43 | |
| 9. File Locking and Share Reservations . . . . . . . . . . . . . 174 | | 9. File Locking and Share Reservations . . . . . . . . . . . . . 174 | |
| 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 174 | | 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 174 | |
| 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 174 | | 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 174 | |
| 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 175 | | 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 175 | |
| 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 178 | | 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 178 | |
| 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 178 | | 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 178 | |
| 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 179 | | 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 179 | |
| 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 179 | | 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 179 | |
| 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 180 | | 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 180 | |
| 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 181 | | 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 181 | |
|
| 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 181 | | 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 182 | |
| 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 182 | | 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 182 | |
| 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 183 | | 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 183 | |
| 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 184 | | 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 184 | |
| 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 184 | | 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 184 | |
| 10.1. Performance Challenges for Client-Side Caching . . . . . 185 | | 10.1. Performance Challenges for Client-Side Caching . . . . . 185 | |
| 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 186 | | 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 186 | |
| 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 188 | | 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 188 | |
| 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 190 | | 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 190 | |
| 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 190 | | 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 190 | |
| 10.3.2. Data Caching and File Locking . . . . . . . . . . . 191 | | 10.3.2. Data Caching and File Locking . . . . . . . . . . . 191 | |
| | | | |
| skipping to change at page 10, line 6 | | skipping to change at page 10, line 6 | |
| Delegation Wants . . . . . . . . . . . . . . . . . . . . 576 | | Delegation Wants . . . . . . . . . . . . . . . . . . . . 576 | |
| 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | | 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | |
| lock availability . . . . . . . . . . . . . . . . . . . 577 | | lock availability . . . . . . . . . . . . . . . . . . . 577 | |
| 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | | 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID | |
| changes . . . . . . . . . . . . . . . . . . . . . . . . 579 | | changes . . . . . . . . . . . . . . . . . . . . . . . . 579 | |
| 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | | 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback | |
| Operation . . . . . . . . . . . . . . . . . . . . . . . 581 | | Operation . . . . . . . . . . . . . . . . . . . . . . . 581 | |
| 21. Security Considerations . . . . . . . . . . . . . . . . . . . 581 | | 21. Security Considerations . . . . . . . . . . . . . . . . . . . 581 | |
| 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 583 | | 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 583 | |
| 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 583 | | 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 583 | |
|
| 22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 583 | | 22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 584 | |
| 22.3. Defining New Notifications . . . . . . . . . . . . . . . 584 | | 22.1.2. Updating Registrations . . . . . . . . . . . . . . . 584 | |
| 22.4. Defining New Layout Types . . . . . . . . . . . . . . . 584 | | 22.2. Device ID Notifications . . . . . . . . . . . . . . . . 584 | |
| 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 586 | | 22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 585 | |
| 22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 586 | | 22.2.2. Updating Registrations . . . . . . . . . . . . . . . 585 | |
| 22.5.2. Path Variable Names . . . . . . . . . . . . . . . . 586 | | 22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 585 | |
| 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 586 | | 22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 587 | |
| 23.1. Normative References . . . . . . . . . . . . . . . . . . 586 | | 22.3.2. Updating Registrations . . . . . . . . . . . . . . . 587 | |
| 23.2. Informative References . . . . . . . . . . . . . . . . . 588 | | 22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 587 | |
| Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 590 | | 22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 588 | |
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 592 | | 22.4.2. Updating Registrations . . . . . . . . . . . . . . . 588 | |
| Intellectual Property and Copyright Statements . . . . . . . . . 593 | | 22.4.3. Guidelines for Writing Layout Type Specifications . 588 | |
| | | 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 590 | |
| | | 22.5.1. Path Variables Registry . . . . . . . . . . . . . . 590 | |
| | | 22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 592 | |
| | | 22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 592 | |
| | | 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 593 | |
| | | 23.1. Normative References . . . . . . . . . . . . . . . . . . 593 | |
| | | 23.2. Informative References . . . . . . . . . . . . . . . . . 595 | |
| | | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 596 | |
| | | Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 598 | |
| | | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 599 | |
| | | Intellectual Property and Copyright Statements . . . . . . . . . 600 | |
| | | | |
| 1. Introduction | | 1. Introduction | |
| | | | |
| 1.1. The NFS Version 4 Minor Version 1 Protocol | | 1.1. The NFS Version 4 Minor Version 1 Protocol | |
| | | | |
| The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | | The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | |
| minor version of the NFS version 4 (NFSv4) protocol. The first minor | | minor version of the NFS version 4 (NFSv4) protocol. The first minor | |
|
| version, NFSv4.0 is described in [21]. It generally follows the | | version, NFSv4.0 is described in [20]. It generally follows the | |
| guidelines for minor versioning model listed in Section 10 of RFC | | guidelines for minor versioning model listed in Section 10 of RFC | |
| 3530. However, it diverges from guidelines 11 ("a client and server | | 3530. However, it diverges from guidelines 11 ("a client and server | |
| that supports minor version X must support minor versions 0 through | | that supports minor version X must support minor versions 0 through | |
| X-1"), and 12 ("no features may be introduced as mandatory in a minor | | X-1"), and 12 ("no features may be introduced as mandatory in a minor | |
| version"). These divergences are due to the introduction of the | | version"). These divergences are due to the introduction of the | |
| sessions model for managing non-idempotent operations and the | | sessions model for managing non-idempotent operations and the | |
| RECLAIM_COMPLETE operation. These two new features are | | RECLAIM_COMPLETE operation. These two new features are | |
| infrastructural in nature and simplify implementation of existing and | | infrastructural in nature and simplify implementation of existing and | |
| other new features. Making them anything but REQUIRED would add | | other new features. Making them anything but REQUIRED would add | |
| undue complexity to protocol definition and implementation. NFSv4.1 | | undue complexity to protocol definition and implementation. NFSv4.1 | |
| | | | |
| skipping to change at page 11, line 45 | | skipping to change at page 11, line 45 | |
| o describe the NFSv4.0 protocol, except where needed to contrast | | o describe the NFSv4.0 protocol, except where needed to contrast | |
| with NFSv4.1. | | with NFSv4.1. | |
| | | | |
| o modify the specification of the NFSv4.0 protocol. | | o modify the specification of the NFSv4.0 protocol. | |
| | | | |
| o clarify the NFSv4.0 protocol. | | o clarify the NFSv4.0 protocol. | |
| | | | |
| 1.3. NFSv4 Goals | | 1.3. NFSv4 Goals | |
| | | | |
| The NFSv4 protocol is a further revision of the NFS protocol defined | | The NFSv4 protocol is a further revision of the NFS protocol defined | |
|
| already by NFSv3 [22]. It retains the essential characteristics of | | already by NFSv3 [21]. It retains the essential characteristics of | |
| previous versions: easy recovery; independence of transport | | previous versions: easy recovery; independence of transport | |
| protocols, operating systems and file systems; simplicity; and good | | protocols, operating systems and file systems; simplicity; and good | |
| performance. NFSv4 has the following goals: | | performance. NFSv4 has the following goals: | |
| | | | |
| o Improved access and good performance on the Internet. | | o Improved access and good performance on the Internet. | |
| | | | |
| The protocol is designed to transit firewalls easily, perform well | | The protocol is designed to transit firewalls easily, perform well | |
| where latency is high and bandwidth is low, and scale to very | | where latency is high and bandwidth is low, and scale to very | |
| large numbers of clients per server. | | large numbers of clients per server. | |
| | | | |
| | | | |
| skipping to change at page 13, line 33 | | skipping to change at page 13, line 33 | |
| node. | | node. | |
| | | | |
| Client ID A 64-bit quantity used as a unique, short-hand reference | | Client ID A 64-bit quantity used as a unique, short-hand reference | |
| to a client supplied Verifier and client owner. The server is | | to a client supplied Verifier and client owner. The server is | |
| responsible for supplying the client ID. | | responsible for supplying the client ID. | |
| | | | |
| Client Owner The client owner is a unique string, opaque to the | | Client Owner The client owner is a unique string, opaque to the | |
| server, which identifies a client. Multiple network connections | | server, which identifies a client. Multiple network connections | |
| and source network addresses originating from those connections | | and source network addresses originating from those connections | |
| may share a client owner. The server is expected to treat | | may share a client owner. The server is expected to treat | |
|
| requests from connnections with the same client owner as coming | | requests from connections with the same client owner as coming | |
| from the same client. | | from the same client. | |
| | | | |
| File System The collection of objects on a server (as identified by | | File System The collection of objects on a server (as identified by | |
| the major identifier of a Server Owner, which is defined later in | | the major identifier of a Server Owner, which is defined later in | |
| this section), that share the same fsid attribute (see | | this section), that share the same fsid attribute (see | |
| Section 5.8.1.9). | | Section 5.8.1.9). | |
| | | | |
| Lease An interval of time defined by the server for which the client | | Lease An interval of time defined by the server for which the client | |
| is irrevocably granted a lock. At the end of a lease period the | | is irrevocably granted a lock. At the end of a lease period the | |
| lock may be revoked if the lease has not been extended. The lock | | lock may be revoked if the lease has not been extended. The lock | |
| | | | |
| skipping to change at page 14, line 20 | | skipping to change at page 14, line 20 | |
| client access to a set of file systems and is identified by a | | client access to a set of file systems and is identified by a | |
| Server owner. A server can span multiple network addresses. | | Server owner. A server can span multiple network addresses. | |
| | | | |
| Server Owner The "Server Owner" identifies the server to the client. | | Server Owner The "Server Owner" identifies the server to the client. | |
| The server owner consists of a major and minor identifier. When | | The server owner consists of a major and minor identifier. When | |
| the client has two connections each to a peer with the same major | | the client has two connections each to a peer with the same major | |
| identifier, the client assumes both peers are the same server (the | | identifier, the client assumes both peers are the same server (the | |
| server namespace is the same via each connection), and assumes and | | server namespace is the same via each connection), and assumes and | |
| lock state is sharable across both connections. When each peer | | lock state is sharable across both connections. When each peer | |
| has both the same major and minor identifier, the client assumes | | has both the same major and minor identifier, the client assumes | |
|
| each connection might be associatable with the same session. | | each connection might be associable with the same session. | |
| | | | |
| Stable Storage NFSv4.1 servers must be able to recover without data | | Stable Storage NFSv4.1 servers must be able to recover without data | |
| loss from multiple power failures (including cascading power | | loss from multiple power failures (including cascading power | |
| failures, that is, several power failures in quick succession), | | failures, that is, several power failures in quick succession), | |
| operating system failures, and hardware failure of components | | operating system failures, and hardware failure of components | |
| other than the storage medium itself (for example, disk, | | other than the storage medium itself (for example, disk, | |
| nonvolatile RAM). | | nonvolatile RAM). | |
| | | | |
| Some examples of stable storage that are allowable for an NFS | | Some examples of stable storage that are allowable for an NFS | |
| server include: | | server include: | |
| | | | |
| skipping to change at page 17, line 9 | | skipping to change at page 17, line 9 | |
| which are then used to identify objects in subsequent operations. | | which are then used to identify objects in subsequent operations. | |
| | | | |
| The NFSv4.1 protocol provides support for persistent filehandles, | | The NFSv4.1 protocol provides support for persistent filehandles, | |
| guaranteed to be valid for the lifetime of the file system object | | guaranteed to be valid for the lifetime of the file system object | |
| designated. In addition it provides support to servers to provide | | designated. In addition it provides support to servers to provide | |
| filehandles with more limited validity guarantees, called volatile | | filehandles with more limited validity guarantees, called volatile | |
| filehandles. | | filehandles. | |
| | | | |
| 1.6.3.2. File Attributes | | 1.6.3.2. File Attributes | |
| | | | |
|
| The NFSv4.1 protocol has a rich and extensible attribute structure, | | The NFSv4.1 protocol has a rich and extensible file object attribute | |
| which is divided into REQUIRED, RECOMMENDED, and named attributes. | | structure, which is divided into REQUIRED, RECOMMENDED, and named | |
| | | attributes (see Section 5). | |
| | | | |
|
| The acl, sacl, and dacl attributes compose a set of RECOMMENDED file | | Several (but not all) of the REQUIRED attributes are derived from the | |
| attributes that make up the Access Control List (ACL) of a file | | attributes of NFSv3 (see definition of the fattr3 data type in [21]). | |
| (Section 6). These attributes provide for directory and file access | | An example of a REQUIRED attribute is the file object's type | |
| control beyond the model used in NFSv3. The ACL definition allows | | (Section 5.8.1.2) so that regular files can be distinguished from | |
| for specification of specific sets of permissions for individual | | directories (also known as folders in some operating environments) | |
| users and groups. In addition, ACL inheritance allows propagation of | | and other types of objects. REQUIRED attributes are discussed in | |
| | | Section 5.1. | |
| | | | |
| | | An example of three RECOMMENDED attributes are acl, sacl, and dacl. | |
| | | These attributes define an Access Control List (ACL) on a file object | |
| | | ((Section 6). An ACL provides directory and file access control | |
| | | beyond the model used in NFSv3. The ACL definition allows for | |
| | | specification of specific sets of permissions for individual users | |
| | | and groups. In addition, ACL inheritance allows propagation of | |
| access permissions and restriction down a directory tree as file | | access permissions and restriction down a directory tree as file | |
|
| system objects are created. | | system objects are created. RECOMMENDED attributes are discussed in | |
| | | Section 5.2. | |
| | | | |
| A named attribute is an opaque byte stream that is associated with a | | A named attribute is an opaque byte stream that is associated with a | |
| directory or file and referred to by a string name. Named attributes | | directory or file and referred to by a string name. Named attributes | |
| are meant to be used by client applications as a method to associate | | are meant to be used by client applications as a method to associate | |
| application-specific data with a regular file or directory. NFSv4.1 | | application-specific data with a regular file or directory. NFSv4.1 | |
| modifies named attributes relative to NFSv4.0 by tightening the | | modifies named attributes relative to NFSv4.0 by tightening the | |
| allowed operations in order to prevent the development of non- | | allowed operations in order to prevent the development of non- | |
|
| interoperable implementation. See Section 5.3 for details. | | interoperable implementations. Named attributes are discussed in | |
| | | Section 5.3. | |
| | | | |
| 1.6.3.3. Multi-server Namespace | | 1.6.3.3. Multi-server Namespace | |
| | | | |
| NFSv4.1 contains a number of features to allow implementation of | | NFSv4.1 contains a number of features to allow implementation of | |
| namespaces that cross server boundaries and that allow and facilitate | | namespaces that cross server boundaries and that allow and facilitate | |
| a non-disruptive transfer of support for individual file systems | | a non-disruptive transfer of support for individual file systems | |
| between servers. They are all based upon attributes that allow one | | between servers. They are all based upon attributes that allow one | |
| file system to specify alternate or new locations for that file | | file system to specify alternate or new locations for that file | |
| system. | | system. | |
| | | | |
| | | | |
| skipping to change at page 21, line 28 | | skipping to change at page 21, line 35 | |
| | | | |
| Although GSS-API has an authentication service distinct from its | | Although GSS-API has an authentication service distinct from its | |
| privacy and integrity services, GSS-API's authentication service is | | privacy and integrity services, GSS-API's authentication service is | |
| not used for RPCSEC_GSS's authentication service. Instead, each RPC | | not used for RPCSEC_GSS's authentication service. Instead, each RPC | |
| request and response header is integrity protected with the GSS-API | | request and response header is integrity protected with the GSS-API | |
| integrity service, and this allows RPCSEC_GSS to offer per-RPC | | integrity service, and this allows RPCSEC_GSS to offer per-RPC | |
| authentication and identity. See [4] for more information. | | authentication and identity. See [4] for more information. | |
| | | | |
| NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and | | NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and | |
| authentication service. NFSv4.1 servers MUST support RPCSEC_GSS's | | authentication service. NFSv4.1 servers MUST support RPCSEC_GSS's | |
|
| privacy service. | | privacy service. NFSv4.1 clients SHOULD support RPCSEC_GSS's privacy | |
| | | service. | |
| | | | |
| 2.2.1.1.1.2. Security mechanisms for NFSv4.1 | | 2.2.1.1.1.2. Security mechanisms for NFSv4.1 | |
| | | | |
| RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide | | RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide | |
| security services. Therefore NFSv4.1 clients and servers MUST | | security services. Therefore NFSv4.1 clients and servers MUST | |
| support three security mechanisms: Kerberos V5, SPKM-3, and LIPKEY. | | support three security mechanisms: Kerberos V5, SPKM-3, and LIPKEY. | |
| | | | |
| The use of RPCSEC_GSS requires selection of: mechanism, quality of | | The use of RPCSEC_GSS requires selection of: mechanism, quality of | |
| protection (QOP), and service (authentication, integrity, privacy). | | protection (QOP), and service (authentication, integrity, privacy). | |
| For the mandated security mechanisms, NFSv4.1 specifies that a QOP of | | For the mandated security mechanisms, NFSv4.1 specifies that a QOP of | |
| | | | |
| skipping to change at page 22, line 24 | | skipping to change at page 22, line 31 | |
| ------------------------------------------------------------------ | | ------------------------------------------------------------------ | |
| 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes | | 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes | |
| 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes | | 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes | |
| 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes | | 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes | |
| | | | |
| Note that the number and name of the pseudo flavor is presented here | | Note that the number and name of the pseudo flavor is presented here | |
| as a mapping aid to the implementor. Because the NFSv4.1 protocol | | as a mapping aid to the implementor. Because the NFSv4.1 protocol | |
| includes a method to negotiate security and it understands the GSS- | | includes a method to negotiate security and it understands the GSS- | |
| API mechanism, the pseudo flavor is not needed. The pseudo flavor is | | API mechanism, the pseudo flavor is not needed. The pseudo flavor is | |
| needed for NFSv3 since the security negotiation is done via the | | needed for NFSv3 since the security negotiation is done via the | |
|
| MOUNT protocol as described in [23]. | | MOUNT protocol as described in [22]. | |
| | | | |
| 2.2.1.1.1.2.2. LIPKEY | | 2.2.1.1.1.2.2. LIPKEY | |
| | | | |
| The LIPKEY V5 GSS-API mechanism as described in [6] MUST be | | The LIPKEY V5 GSS-API mechanism as described in [6] MUST be | |
| implemented with the RPCSEC_GSS services as specified in the | | implemented with the RPCSEC_GSS services as specified in the | |
| following table: | | following table: | |
| | | | |
| 1 2 3 4 5 6 | | 1 2 3 4 5 6 | |
| ------------------------------------------------------------------ | | ------------------------------------------------------------------ | |
| 390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes | | 390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes | |
| | | | |
| skipping to change at page 23, line 49 | | skipping to change at page 24, line 6 | |
| With the use of the COMPOUND procedure, the client is able to build | | With the use of the COMPOUND procedure, the client is able to build | |
| simple or complex requests. These COMPOUND requests allow for a | | simple or complex requests. These COMPOUND requests allow for a | |
| reduction in the number of RPCs needed for logical file system | | reduction in the number of RPCs needed for logical file system | |
| operations. For example, multi-component lookup requests can be | | operations. For example, multi-component lookup requests can be | |
| constructed by combining multiple LOOKUP operations. Those can be | | constructed by combining multiple LOOKUP operations. Those can be | |
| further combined with operations such as GETATTR, READDIR, or OPEN | | further combined with operations such as GETATTR, READDIR, or OPEN | |
| plus READ to do more complicated sets of operations without incurring | | plus READ to do more complicated sets of operations without incurring | |
| additional latency. | | additional latency. | |
| | | | |
| NFSv4.1 also contains a considerable set of callback operations in | | NFSv4.1 also contains a considerable set of callback operations in | |
|
| which the server makes an RPC directed at the client. Callback RPC's | | which the server makes an RPC directed at the client. Callback RPCs | |
| have a similar structure to that of the normal server requests. In | | have a similar structure to that of the normal server requests. In | |
| all minor versions of the NFSv4 protocol there are two callback RPC | | all minor versions of the NFSv4 protocol there are two callback RPC | |
| procedures, CB_NULL and CB_COMPOUND. The CB_COMPOUND procedure is | | procedures, CB_NULL and CB_COMPOUND. The CB_COMPOUND procedure is | |
| defined in an analogous fashion to that of COMPOUND with its own set | | defined in an analogous fashion to that of COMPOUND with its own set | |
| of callback operations. | | of callback operations. | |
| | | | |
| The addition of new server and callback operations within the | | The addition of new server and callback operations within the | |
| COMPOUND and CB_COMPOUND request framework provides a means of | | COMPOUND and CB_COMPOUND request framework provides a means of | |
| extending the protocol in subsequent minor versions. | | extending the protocol in subsequent minor versions. | |
| | | | |
| | | | |
| skipping to change at page 25, line 47 | | skipping to change at page 26, line 5 | |
| same string. The implementor is cautioned against an approach that | | same string. The implementor is cautioned against an approach that | |
| requires the string to be recorded in a local file because this | | requires the string to be recorded in a local file because this | |
| precludes the use of the implementation in an environment where | | precludes the use of the implementation in an environment where | |
| there is no local disk and all file access is from an NFSv4.1 | | there is no local disk and all file access is from an NFSv4.1 | |
| server. | | server. | |
| | | | |
| o The string should be the same for each server network address that | | o The string should be the same for each server network address that | |
| the client accesses. This way, if a server has multiple | | the client accesses. This way, if a server has multiple | |
| interfaces, the client can trunk traffic over multiple network | | interfaces, the client can trunk traffic over multiple network | |
| paths as described in Section 2.10.4. (Note: the precise opposite | | paths as described in Section 2.10.4. (Note: the precise opposite | |
|
| was advised in the NFSv4.0 specification [21].) | | was advised in the NFSv4.0 specification [20].) | |
| | | | |
| o The algorithm for generating the string should not assume that the | | o The algorithm for generating the string should not assume that the | |
| client's network address will not change, unless the client | | client's network address will not change, unless the client | |
| implementation knows it is using statically assigned network | | implementation knows it is using statically assigned network | |
| addresses. This includes changes between client incarnations and | | addresses. This includes changes between client incarnations and | |
| even changes while the client is still running in its current | | even changes while the client is still running in its current | |
| incarnation. Thus with dynamic address assignment, if the client | | incarnation. Thus with dynamic address assignment, if the client | |
| includes just the client's network address in the co_ownerid | | includes just the client's network address in the co_ownerid | |
| string, there is a real risk that after the client gives up the | | string, there is a real risk that after the client gives up the | |
| network address, another client, using a similar algorithm for | | network address, another client, using a similar algorithm for | |
| | | | |
| skipping to change at page 27, line 47 | | skipping to change at page 28, line 4 | |
| See the descriptions of EXCHANGE_ID (Section 18.35) and | | See the descriptions of EXCHANGE_ID (Section 18.35) and | |
| CREATE_SESSION (Section 18.36) for a complete specification of these | | CREATE_SESSION (Section 18.36) for a complete specification of these | |
| operations. | | operations. | |
| | | | |
| 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 | | 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 | |
| | | | |
| To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a | | To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a | |
| client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established | | client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established | |
| using the SETCLIENTID operation of NFSv4.0. A server that does so | | using the SETCLIENTID operation of NFSv4.0. A server that does so | |
| will allow an upgraded client to avoid waiting until the lease (i.e. | | will allow an upgraded client to avoid waiting until the lease (i.e. | |
|
| | | | |
| the lease established by the NFSv4.0 instance client) expires. This | | the lease established by the NFSv4.0 instance client) expires. This | |
| requires the client_owner4 be constructed the same way as the | | requires the client_owner4 be constructed the same way as the | |
| nfs_client_id4. If the latter's contents included the server's | | nfs_client_id4. If the latter's contents included the server's | |
| network address (per the recommendations of the NFSv4.0 specification | | network address (per the recommendations of the NFSv4.0 specification | |
|
| [21]), and the NFSv4.1 client does not wish to use a client ID that | | [20]), and the NFSv4.1 client does not wish to use a client ID that | |
| prevents trunking, it should send two EXCHANGE_ID operations. The | | prevents trunking, it should send two EXCHANGE_ID operations. The | |
| first EXCHANGE_ID will have a client_owner4 equal to the | | first EXCHANGE_ID will have a client_owner4 equal to the | |
| nfs_client_id4. This will clear the state created by the NFSv4.0 | | nfs_client_id4. This will clear the state created by the NFSv4.0 | |
| client. The second EXCHANGE_ID will not have the server's network | | client. The second EXCHANGE_ID will not have the server's network | |
| address. The state created for the second EXCHANGE_ID will not have | | address. The state created for the second EXCHANGE_ID will not have | |
| to wait for lease expiration, because there will be no state to | | to wait for lease expiration, because there will be no state to | |
| expire. | | expire. | |
| | | | |
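The two-step EXCHANGE_ID upgrade described in Section 2.4.1 can be sketched as follows. This is a minimal illustration, not part of either draft: the function name, the `base_id`/`server_addr` parameters, and the `base_id + b"@" + server_addr` layout of the NFSv4.0 identifier are all hypothetical, chosen only to show the ordering of the two owner strings.

```python
from typing import List


def upgrade_owner_strings(base_id: bytes,
                          server_addr: bytes,
                          v40_included_server_addr: bool) -> List[bytes]:
    """Return the owner strings to send in EXCHANGE_ID, in order.

    Hypothetical model: the NFSv4.0 client built its nfs_client_id4
    string either as base_id alone or as base_id + b"@" + server_addr.
    """
    if not v40_included_server_addr:
        # The old identifier is already usable for trunking, so a
        # single EXCHANGE_ID with the same string is enough.
        return [base_id]
    legacy = base_id + b"@" + server_addr
    # First EXCHANGE_ID: equal to the old nfs_client_id4, which clears
    # the state created by the NFSv4.0 client.
    # Second EXCHANGE_ID: omits the server's network address.
    return [legacy, base_id]
```

With `v40_included_server_addr=True` the client would send the legacy string first and the address-free string second; the state for the second has no lease to wait on because the first has already cleared the old state.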
| 2.4.2. Server Release of Client ID | | 2.4.2. Server Release of Client ID | |
| | | | |
| | | | |
| skipping to change at page 35, line 24 | | skipping to change at page 35, line 34 | |
| operation will fail with NFS4ERR_WRONGSEC. After a SECINFO_NO_NAME | | operation will fail with NFS4ERR_WRONGSEC. After a SECINFO_NO_NAME | |
| request, the client sends SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH, | | request, the client sends SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH, | |
| RENAME "c" "d", using credentials acceptable to aFH's security | | RENAME "c" "d", using credentials acceptable to aFH's security | |
| policy, but not bFH's policy. The server returns NFS4ERR_WRONGSEC on | | policy, but not bFH's policy. The server returns NFS4ERR_WRONGSEC on | |
| the RENAME operation. | | the RENAME operation. | |
| | | | |
| To prevent a client from an endless sequence of a request containing | | To prevent a client from an endless sequence of a request containing | |
| LINK or RENAME, followed by a request containing SECINFO_NO_NAME, the | | LINK or RENAME, followed by a request containing SECINFO_NO_NAME, the | |
| server MUST detect when the security policies of the current and | | server MUST detect when the security policies of the current and | |
| saved filehandles have no mutually acceptable security tuple, and | | saved filehandles have no mutually acceptable security tuple, and | |
|
| MUST NOT NFS4ERR_WRONGSEC in that situation. Instead the server MUST | | MUST NOT return NFS4ERR_WRONGSEC in that situation. Instead the | |
| return NFS4ERR_XDEV. | | server MUST return NFS4ERR_XDEV. | |
| | | | |
| Thus while a server MAY return NFS4ERR_WRONGSEC from LINK and RENAME, | | Thus while a server MAY return NFS4ERR_WRONGSEC from LINK and RENAME, | |
| the server implementor may reasonably decide the consequences are not | | the server implementor may reasonably decide the consequences are not | |
| worth the security benefits, and so allow the security policy of the | | worth the security benefits, and so allow the security policy of the | |
| current filehandle to override that of the saved filehandle. | | current filehandle to override that of the saved filehandle. | |
| | | | |
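The LINK/RENAME rule above can be sketched as a small decision function for a server that chooses to enforce both policies. This is an illustration under stated assumptions: security tuples are modeled as plain strings, policies as sets of such strings, and the function name is hypothetical. A server that lets the current filehandle's policy override the saved one would skip this check.

```python
from typing import Set


def link_rename_security_status(current_policy: Set[str],
                                saved_policy: Set[str],
                                credential: str) -> str:
    """Pick the status for LINK or RENAME from the two filehandle policies.

    Each policy is the set of security tuples acceptable for that
    filehandle (modeled here as strings such as "krb5i").
    """
    if not (current_policy & saved_policy):
        # No mutually acceptable tuple exists, so NFS4ERR_WRONGSEC
        # would send the client into an endless SECINFO_NO_NAME loop.
        return "NFS4ERR_XDEV"
    if credential not in current_policy or credential not in saved_policy:
        # A mutually acceptable tuple exists, but the client did not use it.
        return "NFS4ERR_WRONGSEC"
    return "NFS4_OK"
```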
| 2.7. Minor Versioning | | 2.7. Minor Versioning | |
| | | | |
| To address the requirement of an NFS protocol that can evolve as the | | To address the requirement of an NFS protocol that can evolve as the | |
| need arises, the NFSv4.1 protocol contains the rules and framework to | | need arises, the NFSv4.1 protocol contains the rules and framework to | |
| allow for future minor changes or versioning. | | allow for future minor changes or versioning. | |
| | | | |
| The base assumption with respect to minor versioning is that any | | The base assumption with respect to minor versioning is that any | |
| future accepted minor version must follow the IETF process and be | | future accepted minor version must follow the IETF process and be | |
| documented in a standards track RFC. Therefore, each minor version | | documented in a standards track RFC. Therefore, each minor version | |
|
| number will correspond to an RFC. Minor version zero of the NFSv4 | | number will correspond to one or more new RFCs. Minor version zero | |
| protocol is represented by [21], and minor version one is represented | | of the NFSv4 protocol is represented by [20], and minor version one | |
| by this document [[Comment.1: RFC Editor: change "document" to "RFC" | | is represented by this document [[Comment.1: RFC Editor: change | |
| when we publish]]. The COMPOUND and CB_COMPOUND procedures support | | "document" to "RFC" when we publish]]. The COMPOUND and CB_COMPOUND | |
| the encoding of the minor version being requested by the client. | | procedures support the encoding of the minor version being requested | |
| | | by the client. | |
| | | | |
| The following items represent the basic rules for the development of | | The following items represent the basic rules for the development of | |
| minor versions. Note that a future minor version may decide to | | minor versions. Note that a future minor version may decide to | |
| modify or add to the following rules as part of the minor version | | modify or add to the following rules as part of the minor version | |
| definition. | | definition. | |
| | | | |
| 1. Procedures are not added or deleted | | 1. Procedures are not added or deleted | |
| | | | |
| To maintain the general RPC model, NFSv4 minor versions will not | | To maintain the general RPC model, NFSv4 minor versions will not | |
| add to or delete procedures from the NFS program. | | add to or delete procedures from the NFS program. | |
| | | | |
| skipping to change at page 38, line 48 | | skipping to change at page 39, line 12 | |
| NFSv4.1 provides alarm control on a per file object basis, via the | | NFSv4.1 provides alarm control on a per file object basis, via the | |
| acl and sacl attributes as described in Section 6. Alarms may serve | | acl and sacl attributes as described in Section 6. Alarms may serve | |
| as the basis for intrusion detection. It is outside the scope of | | as the basis for intrusion detection. It is outside the scope of | |
| this specification to specify heuristics for detecting intrusion via | | this specification to specify heuristics for detecting intrusion via | |
| alarms. | | alarms. | |
| | | | |
| 2.9. Transport Layers | | 2.9. Transport Layers | |
| | | | |
| 2.9.1. REQUIRED and RECOMMENDED Properties of Transports | | 2.9.1. REQUIRED and RECOMMENDED Properties of Transports | |
| | | | |
|
| NFSv4.1 works over RDMA and non-RDMA_based transports with the | | NFSv4.1 works over RDMA and non-RDMA-based transports with the | |
| following attributes: | | following attributes: | |
| | | | |
| o The transport supports reliable delivery of data, which NFSv4.1 | | o The transport supports reliable delivery of data, which NFSv4.1 | |
| requires but neither NFSv4.1 nor RPC has facilities for ensuring. | | requires but neither NFSv4.1 nor RPC has facilities for ensuring. | |
|
| | | [23] | |
| [24] | | | |
| | | | |
| o The transport delivers data in the order it was sent. Ordered | | o The transport delivers data in the order it was sent. Ordered | |
| delivery simplifies detection of transmit errors, and simplifies | | delivery simplifies detection of transmit errors, and simplifies | |
| the sending of arbitrary sized requests and responses, via the | | the sending of arbitrary sized requests and responses, via the | |
| record marking protocol [3]. | | record marking protocol [3]. | |
| | | | |
| Where an NFSv4.1 implementation supports operation over the IP | | Where an NFSv4.1 implementation supports operation over the IP | |
| network protocol, any transport used between NFS and IP MUST be among | | network protocol, any transport used between NFS and IP MUST be among | |
| the IETF-approved congestion control transport protocols. At the | | the IETF-approved congestion control transport protocols. At the | |
| time this document was written, the only two transports that had the | | time this document was written, the only two transports that had the | |
| above attributes were TCP and SCTP. To enhance the possibilities for | | above attributes were TCP and SCTP. To enhance the possibilities for | |
| interoperability, an NFSv4.1 implementation MUST support operation | | interoperability, an NFSv4.1 implementation MUST support operation | |
| over the TCP transport protocol. | | over the TCP transport protocol. | |
| | | | |
| Even if NFSv4.1 is used over a non-IP network protocol, it is | | Even if NFSv4.1 is used over a non-IP network protocol, it is | |
| RECOMMENDED that the transport support congestion control. | | RECOMMENDED that the transport support congestion control. | |
| | | | |
| It is permissible for a connectionless transport to be used under | | It is permissible for a connectionless transport to be used under | |
|
| NFSv4.1, however reliable and in-order delivery of data by the | | NFSv4.1, however reliable and in-order delivery of data combined with | |
| connectionless transport is REQUIRED. NFSv4.1 assumes that a client | | congestion control by the connectionless transport is REQUIRED. | |
| transport address and server transport address used to send data over | | NFSv4.1 assumes that a client transport address and server transport | |
| a transport together constitute a connection, even if the underlying | | address used to send data over a transport together constitute a | |
| transport eschews the concept of a connection. | | connection, even if the underlying transport eschews the concept of a | |
| | | connection. | |
| | | | |
| 2.9.2. Client and Server Transport Behavior | | 2.9.2. Client and Server Transport Behavior | |
| | | | |
| If a connection-oriented transport (e.g. TCP) is used, the client | | If a connection-oriented transport (e.g. TCP) is used, the client | |
| and server SHOULD use long lived connections for at least three | | and server SHOULD use long lived connections for at least three | |
| reasons: | | reasons: | |
| | | | |
| 1. This will prevent the weakening of the transport's congestion | | 1. This will prevent the weakening of the transport's congestion | |
| control mechanisms via short lived connections. | | control mechanisms via short lived connections. | |
| | | | |
| | | | |
| skipping to change at page 41, line 8 | | skipping to change at page 41, line 21 | |
| contents must not be blindly used when replies are sent from it, | | contents must not be blindly used when replies are sent from it, | |
| and credit information appropriate to the channel must be | | and credit information appropriate to the channel must be | |
| refreshed by the RPC layer. | | refreshed by the RPC layer. | |
| | | | |
| In addition, as described in Section 2.10.5.2, while a session is | | In addition, as described in Section 2.10.5.2, while a session is | |
| active, the NFSv4.1 requester MUST NOT stop waiting for a reply. | | active, the NFSv4.1 requester MUST NOT stop waiting for a reply. | |
| | | | |
| 2.9.3. Ports | | 2.9.3. Ports | |
| | | | |
| Historically, NFSv3 servers have listened over TCP port 2049. The | | Historically, NFSv3 servers have listened over TCP port 2049. The | |
|
| registered port 2049 [25] for the NFS protocol should be the default | | registered port 2049 [24] for the NFS protocol should be the default | |
| configuration. NFSv4.1 clients SHOULD NOT use the RPC binding | | configuration. NFSv4.1 clients SHOULD NOT use the RPC binding | |
|
| protocols as described in [26]. | | protocols as described in [25]. | |
| | | | |
| 2.10. Session | | 2.10. Session | |
| | | | |
| 2.10.1. Motivation and Overview | | 2.10.1. Motivation and Overview | |
| | | | |
| Previous versions and minor versions of NFS have suffered from the | | Previous versions and minor versions of NFS have suffered from the | |
| following: | | following: | |
| | | | |
| o Lack of support for Exactly Once Semantics (EOS). This includes | | o Lack of support for Exactly Once Semantics (EOS). This includes | |
| lack of support for EOS through server failure and recovery. | | lack of support for EOS through server failure and recovery. | |
| | | | |
| skipping to change at page 43, line 15 | | skipping to change at page 43, line 28 | |
| associates all other operations in the COMPOUND procedure with a | | associates all other operations in the COMPOUND procedure with a | |
| particular session. SEQUENCE also contains required information for | | particular session. SEQUENCE also contains required information for | |
| maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1 | | maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1 | |
| COMPOUND requests thus have the form: | | COMPOUND requests thus have the form: | |
| | | | |
| +-----+--------------+-----------+------------+-----------+---- | | +-----+--------------+-----------+------------+-----------+---- | |
| | tag | minorversion | numops |SEQUENCE op | op + args | ... | | | tag | minorversion | numops |SEQUENCE op | op + args | ... | |
| | | (== 1) | (limited) | + args | | | | | | (== 1) | (limited) | + args | | | |
| +-----+--------------+-----------+------------+-----------+---- | | +-----+--------------+-----------+------------+-----------+---- | |
| | | | |
|
| and the reply's structure is: | | and the replies have the form: | |
| | | | |
| +------------+-----+--------+-------------------------------+--// | | +------------+-----+--------+-------------------------------+--// | |
| |last status | tag | numres |status + SEQUENCE op + results | // | | |last status | tag | numres |status + SEQUENCE op + results | // | |
| +------------+-----+--------+-------------------------------+--// | | +------------+-----+--------+-------------------------------+--// | |
| //-----------------------+---- | | //-----------------------+---- | |
| // status + op + results | ... | | // status + op + results | ... | |
| //-----------------------+---- | | //-----------------------+---- | |
| | | | |
| A CB_COMPOUND procedure request and reply has a similar form to | | A CB_COMPOUND procedure request and reply has a similar form to | |
| COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE | | COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE | |
| | | | |
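The request layout shown in the diagram above can be checked with a few lines of code. This is a sketch only: the `CompoundRequest` type and `has_session_form` function are hypothetical names, operations are reduced to their names, and the "limited" bound on numops is left out because its value is not given in this excerpt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompoundRequest:
    tag: str
    minorversion: int
    ops: List[str]  # operation names in request order; numops is len(ops)


def has_session_form(req: CompoundRequest) -> bool:
    """True if the request matches the session-enabled layout:
    minorversion == 1 and SEQUENCE as the first operation."""
    return (req.minorversion == 1
            and len(req.ops) >= 1
            and req.ops[0] == "SEQUENCE")
```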
| skipping to change at page 44, line 32 | | skipping to change at page 44, line 45 | |
| of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | | of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | |
| backchannels. | | backchannels. | |
| | | | |
| Each session has resources for each channel, including separate reply | | Each session has resources for each channel, including separate reply | |
| caches (see Section 2.10.5.1). Note that even the backchannel | | caches (see Section 2.10.5.1). Note that even the backchannel | |
| requires a reply cache because some callback operations are | | requires a reply cache because some callback operations are | |
| nonidempotent. | | nonidempotent. | |
| | | | |
| 2.10.3.1. Association of Connections, Channels, and Sessions | | 2.10.3.1. Association of Connections, Channels, and Sessions | |
| | | | |
|
| Each channel is associated with zero or more transport connections. | | Each channel is associated with zero or more transport connections | |
| A connection can be associated with one channel or both channels of a | | (whether of the same transport protocol or different transport | |
| session; the client and server negotiate whether a connection will | | protocols). A connection can be associated with one channel or both | |
| carry traffic for one channel or both channels via the CREATE_SESSION | | channels of a session; the client and server negotiate whether a | |
| (Section 18.36) and the BIND_CONN_TO_SESSION (Section 18.34) | | connection will carry traffic for one channel or both channels via | |
| operations. When a session is created via CREATE_SESSION, the | | the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION | |
| connection that transported the CREATE_SESSION request is | | (Section 18.34) operations. When a session is created via | |
| automatically associated with the fore channel, and optionally the | | CREATE_SESSION, the connection that transported the CREATE_SESSION | |
| backchannel. If the client specifies no state protection | | request is automatically associated with the fore channel, and | |
| (Section 18.35) when the session is created, then when SEQUENCE is | | optionally the backchannel. If the client specifies no state | |
| transmitted on a different connection, the connection is | | protection (Section 18.35) when the session is created, then when | |
| | | SEQUENCE is transmitted on a different connection, the connection is | |
| automatically associated with the fore channel of the session | | automatically associated with the fore channel of the session | |
| specified in the SEQUENCE operation. | | specified in the SEQUENCE operation. | |
| | | | |
| A connection's association with a session is not exclusive. A | | A connection's association with a session is not exclusive. A | |
| connection associated with the channel(s) of one session may be | | connection associated with the channel(s) of one session may be | |
| simultaneously associated with the channel(s) of other sessions | | simultaneously associated with the channel(s) of other sessions | |
| including sessions associated with other client IDs. | | including sessions associated with other client IDs. | |
| | | | |
| It is permissible for connections of multiple transport types to be | | It is permissible for connections of multiple transport types to be | |
| associated with the same channel. For example both a TCP and RDMA | | associated with the same channel. For example both a TCP and RDMA | |
| | | | |
| skipping to change at page 45, line 22 | | skipping to change at page 45, line 37 | |
| | | | |
| It is permissible for a connection of one type of transport to be | | It is permissible for a connection of one type of transport to be | |
| associated with the fore channel, and a connection of a different | | associated with the fore channel, and a connection of a different | |
| type to be associated with the backchannel. | | type to be associated with the backchannel. | |
| | | | |
| 2.10.4. Trunking | | 2.10.4. Trunking | |
| | | | |
| Trunking is the use of multiple connections between a client and | | Trunking is the use of multiple connections between a client and | |
| server in order to increase the speed of data transfer. NFSv4.1 | | server in order to increase the speed of data transfer. NFSv4.1 | |
| supports two types of trunking: session trunking and client ID | | supports two types of trunking: session trunking and client ID | |
|
| trunking. NFSv4.1 servers MUST support trunking. | | trunking. NFSv4.1 repliers and requesters MUST support session | |
| | | trunking. NFSv4.1 servers MAY support client ID trunking. NFSv4.1 | |
| | | clients MUST support client ID trunking. | |
| | | | |
| Session trunking is essentially the association of multiple | | Session trunking is essentially the association of multiple | |
| connections, each with potentially different target and/or source | | connections, each with potentially different target and/or source | |
| network addresses, to the same session. | | network addresses, to the same session. | |
| | | | |
| Client ID trunking is the association of multiple sessions to the | | Client ID trunking is the association of multiple sessions to the | |
| same client ID, major server owner ID (Section 2.5), and server scope | | same client ID, major server owner ID (Section 2.5), and server scope | |
| (Section 11.7.7). When two servers return the same major server | | (Section 11.7.7). When two servers return the same major server | |
| owner and server scope it means the two servers are cooperating on | | owner and server scope it means the two servers are cooperating on | |
| locking state management which is a prerequisite for client ID | | locking state management which is a prerequisite for client ID | |
| | | | |
| skipping to change at page 51, line 6 | | skipping to change at page 51, line 21 | |
| | | | |
| o A new request, in which the sequence ID is one greater than that | | o A new request, in which the sequence ID is one greater than that | |
| previously seen in the slot (accounting for sequence wraparound). | | previously seen in the slot (accounting for sequence wraparound). | |
| The replier proceeds to execute the new request, and the replier | | The replier proceeds to execute the new request, and the replier | |
| MUST increase the slot's sequence ID by one. | | MUST increase the slot's sequence ID by one. | |
| | | | |
| o A retransmitted request, in which the sequence ID is equal to that | | o A retransmitted request, in which the sequence ID is equal to that | |
| currently recorded in the slot. If the original request has | | currently recorded in the slot. If the original request has | |
| executed to completion, the replier returns the cached reply. See | | executed to completion, the replier returns the cached reply. See | |
| Section 2.10.5.2 for direction on how the replier deals with | | Section 2.10.5.2 for direction on how the replier deals with | |
|
| retries of requests that are stll in progress. | | retries of requests that are still in progress. | |
| | | | |
| o A misordered retry, in which the sequence ID is less than | | o A misordered retry, in which the sequence ID is less than | |
| (accounting for sequence wraparound) that previously seen in the | | (accounting for sequence wraparound) that previously seen in the | |
| slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the | | slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the | |
| result from SEQUENCE or CB_SEQUENCE). | | result from SEQUENCE or CB_SEQUENCE). | |
| | | | |
| o A misordered new request, in which the sequence ID is two or more | | o A misordered new request, in which the sequence ID is two or more | |
| greater than (accounting for sequence wraparound) that previously | | greater than (accounting for sequence wraparound) that previously | |
| seen in the slot. Note that because the sequence ID must | | seen in the slot. Note that because the sequence ID must | |
| wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered | | wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered | |
| | | | |
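The slot sequence ID cases listed in this hunk can be sketched as a small classifier. This is a minimal illustration, not text from either draft: the names SEQ_MODULUS, classify_request and next_slot_seq are hypothetical, and because the hunk is cut off before the handling of a misordered new request is stated, the sketch assumes it can be grouped with the misordered retry.

```python
# Sequence IDs are 32-bit values that wrap to 0 after 0xFFFFFFFF.
SEQ_MODULUS = 1 << 32


def classify_request(slot_seq: int, request_seq: int) -> str:
    """Classify a request against the sequence ID recorded in its slot.

    Returns "new", "retransmission", or "misordered". Grouping both
    misordered cases together is an assumption of this sketch.
    """
    if request_seq == (slot_seq + 1) % SEQ_MODULUS:
        return "new"             # one greater, accounting for wraparound
    if request_seq == slot_seq:
        return "retransmission"  # equal to the value recorded in the slot
    return "misordered"          # anything else


def next_slot_seq(slot_seq: int, request_seq: int) -> int:
    """Return the slot's sequence ID after handling the request.

    Only a new request advances the slot, and it advances it by one.
    """
    if classify_request(slot_seq, request_seq) == "new":
        return (slot_seq + 1) % SEQ_MODULUS
    return slot_seq
```

For example, a slot holding 0xFFFFFFFF treats a request with sequence ID 0 as new, while a request repeating 0xFFFFFFFF is a retransmission.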
| skipping to change at page 52, line 22 | | skipping to change at page 52, line 37 | |
| in the request will not be in the reply, and the requester has | | in the request will not be in the reply, and the requester has | |
| only the XID to match the reply to the request. | | only the XID to match the reply to the request. | |
| | | | |
| Given that well formulated XIDs continue to be required, this begs | | Given that well formulated XIDs continue to be required, this begs | |
| the question why SEQUENCE and CB_SEQUENCE replies have a session ID, | | the question why SEQUENCE and CB_SEQUENCE replies have a session ID, | |
| slot ID and sequence ID? Having the session ID in the reply means | | slot ID and sequence ID? Having the session ID in the reply means | |
| the requester does not have to use the XID to look up the session ID, | | the requester does not have to use the XID to look up the session ID, | |
| which would be necessary if the connection were associated with | | which would be necessary if the connection were associated with | |
| multiple sessions. Having the slot ID and sequence ID in the reply | | multiple sessions. Having the slot ID and sequence ID in the reply | |
| means the requester does not have to use the XID to look up the slot ID | | means the requester does not have to use the XID to look up the slot ID | |
|
| and sequence ID. Furhermore, since the XID is only 32 bits, it is | | and sequence ID. Furthermore, since the XID is only 32 bits, it is | |
| too small to guarantee the re-association of a reply with its request | | too small to guarantee the re-association of a reply with its request | |
|
| ([27]); having session ID, slot ID, and sequence ID in the reply | | ([26]); having session ID, slot ID, and sequence ID in the reply | |
| allows the client to validate that the reply in fact belongs to the | | allows the client to validate that the reply in fact belongs to the | |
| matched request. | | matched request. | |
| | | | |
| The SEQUENCE (and CB_SEQUENCE) operation also carries a | | The SEQUENCE (and CB_SEQUENCE) operation also carries a | |
| "highest_slotid" value which carries additional requester slot usage | | "highest_slotid" value which carries additional requester slot usage | |
| information. The requester must always indicate the slot ID | | information. The requester must always indicate the slot ID | |
| representing the outstanding request with the highest-numbered slot | | representing the outstanding request with the highest-numbered slot | |
| value. The requester should in all cases provide the most | | value. The requester should in all cases provide the most | |
| conservative value possible, although it can be increased somewhat | | conservative value possible, although it can be increased somewhat | |
| above the actual instantaneous usage to maintain some minimum or | | above the actual instantaneous usage to maintain some minimum or | |
| | | | |
| skipping to change at page 54, line 51 | | skipping to change at page 55, line 19 | |
| cache entry for the slot whenever an error is returned from SEQUENCE | | cache entry for the slot whenever an error is returned from SEQUENCE | |
| or CB_SEQUENCE. | | or CB_SEQUENCE. | |
| | | | |
| 2.10.5.1.3. Optional Reply Caching | | 2.10.5.1.3. Optional Reply Caching | |
| | | | |
| On a per-request basis the requester can choose to direct the replier | | On a per-request basis the requester can choose to direct the replier | |
| to cache the reply to all operations after the first operation | | to cache the reply to all operations after the first operation | |
| (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | | (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis | |
| fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | | fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it | |
| would not direct the replier to cache the entire reply is that the | | would not direct the replier to cache the entire reply is that the | |
|
| request is composed of all idempotent operations [24]. Caching the | | request is composed of all idempotent operations [23]. Caching the | |
| reply may offer little benefit. If the reply is too large (see | | reply may offer little benefit. If the reply is too large (see | |
| Section 2.10.5.4), it may not be cacheable anyway. Even if the reply | | Section 2.10.5.4), it may not be cacheable anyway. Even if the reply | |
| to an idempotent request is small enough to cache, unnecessarily caching | | to an idempotent request is small enough to cache, unnecessarily caching | |
| the reply slows down the server and increases RPC latency. | | the reply slows down the server and increases RPC latency. | |
| | | | |
| Whether the requester requests the reply to be cached or not has no | | Whether the requester requests the reply to be cached or not has no | |
| effect on the slot processing. If the results of SEQUENCE or | | effect on the slot processing. If the results of SEQUENCE or | |
| CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be | | CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be | |
| incremented by one. If a requester does not direct the replier to | | incremented by one. If a requester does not direct the replier to | |
| cache the reply, the replier MUST do one of the following: | | cache the reply, the replier MUST do one of the following: | |
| | | | |
| skipping to change at page 61, line 7 | | skipping to change at page 61, line 21 | |
| view the problem is as a single transaction consisting of each | | view the problem is as a single transaction consisting of each | |
| operation in the COMPOUND followed by storing the result in | | operation in the COMPOUND followed by storing the result in | |
| persistent storage, then finally a transaction commit. If there is a | | persistent storage, then finally a transaction commit. If there is a | |
| failure before the transaction is committed, then the server rolls | | failure before the transaction is committed, then the server rolls | |
| back the transaction. If the server itself fails, then when it restarts, | | back the transaction. If the server itself fails, then when it restarts, | |
| its recovery logic could roll back the transaction before starting | | its recovery logic could roll back the transaction before starting | |
| the NFSv4.1 server. | | the NFSv4.1 server. | |
| | | | |
| While the description of the implementation for atomic execution of | | While the description of the implementation for atomic execution of | |
| the request and caching of the reply is beyond the scope of this | | the request and caching of the reply is beyond the scope of this | |
|
| document, an example implementation for NFSv2 [28] is described in | | document, an example implementation for NFSv2 [27] is described in | |
| [29]. | | [28]. | |
| | | | |
| 2.10.6. RDMA Considerations | | 2.10.6. RDMA Considerations | |
| | | | |
| A complete discussion of the operation of RPC-based protocols over | | A complete discussion of the operation of RPC-based protocols over | |
| RDMA transports is in [8]. A discussion of the operation of NFSv4, | | RDMA transports is in [8]. A discussion of the operation of NFSv4, | |
| including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, | | including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, | |
| this specification assumes the use of such a layering; it addresses | | this specification assumes the use of such a layering; it addresses | |
| only the upper layer issues relevant to making best use of RPC/RDMA. | | only the upper layer issues relevant to making best use of RPC/RDMA. | |
| | | | |
| 2.10.6.1. RDMA Connection Resources | | 2.10.6.1. RDMA Connection Resources | |
| | | | |
| skipping to change at page 62, line 10 | | skipping to change at page 62, line 25 | |
| Previous versions of NFS do not provide flow control; instead they | | Previous versions of NFS do not provide flow control; instead they | |
| rely on the windowing provided by transports like TCP to throttle | | rely on the windowing provided by transports like TCP to throttle | |
| requests. This does not work with RDMA, which provides no operation | | requests. This does not work with RDMA, which provides no operation | |
| flow control and will terminate a connection in error when limits are | | flow control and will terminate a connection in error when limits are | |
| exceeded. Limits such as maximum number of requests outstanding are | | exceeded. Limits such as maximum number of requests outstanding are | |
| therefore negotiated when a session is created (see the | | therefore negotiated when a session is created (see the | |
| ca_maxrequests field in Section 18.36). These limits then provide | | ca_maxrequests field in Section 18.36). These limits then provide | |
| the maxima which each connection associated with the session's | | the maxima which each connection associated with the session's | |
| channel(s) must remain within. RDMA connections are managed within | | channel(s) must remain within. RDMA connections are managed within | |
| these limits as described in section 3.3 ("Flow Control"[[Comment.2: | | these limits as described in section 3.3 ("Flow Control"[[Comment.2: | |
|
| RFC Editor: please verify section and title of the RPCRDMA | | RFC Editor: please verify section and title of the RPCRDMA document | |
| document]]) of [8]; if there are multiple RDMA connections, then the | | which is currently at | |
| maximum number of requests for a channel will be divided among the | | http://tools.ietf.org/html/draft-ietf-nfsv4-rpcrdma-08#section-3.3]]) | |
| RDMA connections. Put a different way, the onus is on the replier to | | of [8]; if there are multiple RDMA connections, then the maximum | |
| | | number of requests for a channel will be divided among the RDMA | |
| | | connections. Put a different way, the onus is on the replier to | |
| ensure that the total number of RDMA credits across all connections | | ensure that the total number of RDMA credits across all connections | |
| associated with the replier's channel does not exceed the channel's | | associated with the replier's channel does not exceed the channel's | |
| maximum number of outstanding requests. | | maximum number of outstanding requests. | |
| | | | |
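The division of a channel's request limit among several RDMA connections can be illustrated with a short sketch. Neither draft prescribes how the split is made; the even split below and the name split_credits are assumptions chosen only to show that the per-connection credits stay within the channel maximum.

```python
def split_credits(max_requests: int, connections: int) -> list[int]:
    """Divide a channel's request limit among its RDMA connections.

    The even split is an assumption of this sketch. The only property
    taken from the text is that the credits summed over all connections
    stay within the channel's maximum number of outstanding requests.
    """
    if max_requests < 0 or connections < 1:
        raise ValueError("need a non-negative limit and at least one connection")
    base, extra = divmod(max_requests, connections)
    # The first `extra` connections receive one additional credit each.
    return [base + 1 if i < extra else base for i in range(connections)]
```

With a limit of 10 requests and 3 connections this yields credits of 4, 3 and 3, which sum to the channel limit.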
| The limits may also be modified dynamically at the replier's choosing | | The limits may also be modified dynamically at the replier's choosing | |
| by manipulating certain parameters present in each NFSv4.1 reply. In | | by manipulating certain parameters present in each NFSv4.1 reply. In | |
| addition, the CB_RECALL_SLOT callback operation (see Section 20.8) | | addition, the CB_RECALL_SLOT callback operation (see Section 20.8) | |
| can be sent by a server to a client to return RDMA credits to the | | can be sent by a server to a client to return RDMA credits to the | |
| server, thereby lowering the maximum number of requests a client can | | server, thereby lowering the maximum number of requests a client can | |
| have outstanding to the server. | | have outstanding to the server. | |
| | | | |
| skipping to change at page 64, line 18 | | skipping to change at page 64, line 35 | |
| 2.10.7.2. Backchannel RPC Security | | 2.10.7.2. Backchannel RPC Security | |
| | | | |
| When the NFSv4.1 client establishes the backchannel, it informs the | | When the NFSv4.1 client establishes the backchannel, it informs the | |
| server of the security flavors and principals to use when sending | | server of the security flavors and principals to use when sending | |
| requests. If the security flavor is RPCSEC_GSS, the client expresses | | requests. If the security flavor is RPCSEC_GSS, the client expresses | |
| the principal in the form of an established RPCSEC_GSS context. The | | the principal in the form of an established RPCSEC_GSS context. The | |
| server is free to use any of the flavor/principal combinations the | | server is free to use any of the flavor/principal combinations the | |
| client offers, but it MUST NOT use unoffered combinations. This way, | | client offers, but it MUST NOT use unoffered combinations. This way, | |
| the client need not provide a target GSS principal for the | | the client need not provide a target GSS principal for the | |
| backchannel as it did with NFSv4.0, nor does the server have to implement | | backchannel as it did with NFSv4.0, nor does the server have to implement | |
|
| an RPCSEC_GSS initiator as it did with NFSv4.0 [21]. | | an RPCSEC_GSS initiator as it did with NFSv4.0 [20]. | |
| | | | |
| The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | | The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | |
| (Section 18.33) operations allow the client to specify flavor/ | | (Section 18.33) operations allow the client to specify flavor/ | |
| principal combinations. | | principal combinations. | |
| | | | |
| Also note that the SP4_SSV state protection mode (see Section 18.35 | | Also note that the SP4_SSV state protection mode (see Section 18.35 | |
| and Section 2.10.7.3) has the side benefit of providing SSV-derived | | and Section 2.10.7.3) has the side benefit of providing SSV-derived | |
| RPCSEC_GSS contexts (Section 2.10.8). | | RPCSEC_GSS contexts (Section 2.10.8). | |
| | | | |
| 2.10.7.3. Protection from Unauthorized State Changes | | 2.10.7.3. Protection from Unauthorized State Changes | |
| | | | |
| skipping to change at page 84, line 13 | | skipping to change at page 84, line 46 | |
| resides. | | resides. | |
| | | | |
| 3.3.9. netaddr4 | | 3.3.9. netaddr4 | |
| | | | |
| struct netaddr4 { | | struct netaddr4 { | |
| /* see struct rpcb in RFC 1833 */ | | /* see struct rpcb in RFC 1833 */ | |
| string na_r_netid<>; /* network id */ | | string na_r_netid<>; /* network id */ | |
| string na_r_addr<>; /* universal address */ | | string na_r_addr<>; /* universal address */ | |
| }; | | }; | |
| | | | |
|
| The netaddr4 data type is used to identify TCP/IP based endpoints. | | The netaddr4 data type is used to identify network transport | |
| The r_netid and r_addr fields are specified in RFC1833 [26], but they | | endpoints. The r_netid and r_addr fields respectively contain a | |
| are underspecified in RFC1833 [26] as far as what they should look | | netid and uaddr. The netid and uaddr concepts are defined in | |
| like for specific protocols. The next section clarifies this. | | [13]. The netid and uaddr formats for TCP over IPv4 and TCP over | |
| | | IPv6 are defined in [13], specifically Tables 2 and 3 and Sections | |
| 3.3.9.1. Format of netaddr4 for TCP and UDP over IPv4 | | 3.2.3.3 and 3.2.3.4. | |
| | | | |
| For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the | | | |
| US-ASCII string: | | | |
| | | | |
| h1.h2.h3.h4.p1.p2 | | | |
| | | | |
| The prefix, "h1.h2.h3.h4", is the standard textual form for | | | |
| representing an IPv4 address, which is always four bytes long. | | | |
| Assuming big-endian ordering, h1, h2, h3, and h4, are respectively, | | | |
| the first through fourth bytes each converted to ASCII-decimal. The | | | |
| suffix, "p1.p2", is a textual form for representing a TCP and UDP | | | |
| service port. Assuming big-endian ordering, p1 and p2 are, | | | |
| respectively, the first and second bytes each converted to ASCII- | | | |
| decimal. For example, if a host, in big-endian order, has an address | | | |
| of 0x0A010307 and there is a service listening on, in big endian | | | |
| order, port 0x020F (decimal 527), then the complete universal address | | | |
| is "10.1.3.7.2.15". | | | |
| | | | |
| For TCP over IPv4 the value of r_netid is the string "tcp". For UDP | | | |
| over IPv4 the value of r_netid is the string "udp". That this | | | |
| document specifies the universal address and netid for UDP/IPv6 does | | | |
| not imply that UDP/IPv4 is a legal transport for NFSv4.1 (see | | | |
| Section 2.9). | | | |
| | | | |
| 3.3.9.2. Format of netaddr4 for TCP and UDP over IPv6 | | | |
| | | | |
| For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the | | | |
| US-ASCII string: | | | |
| | | | |
| x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 | | | |
| | | | |
| The suffix "p1.p2" is the service port, and is computed the same way | | | |
| as with universal addresses for TCP and UDP over IPv4. The prefix, | | | |
| "x1:x2:x3:x4:x5:x6:x7:x8", is the preferred textual form for | | | |
| representing an IPv6 address as defined in Section 2.2 of RFC4291 | | | |
| [13]. Additionally, the two alternative forms specified in Section | | | |
| 2.2 of RFC4291 are also acceptable. | | | |
| | | | |
| For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP | | | |
| over IPv6 the value of r_netid is the string "udp6". That this | | | |
| document specifies the universal address and netid for UDP/IPv6 does | | | |
| not imply that UDP/IPv6 is a legal transport for NFSv4.1 (see | | | |
| Section 2.9). | | | |
| | | | |
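The "h1.h2.h3.h4.p1.p2" universal address form described in the removed -24 text can be sketched as follows. This is an illustration of that older wording only, since -25 defers the formats to [13]; the function names ipv4_uaddr and parse_ipv4_uaddr are hypothetical.

```python
import ipaddress


def ipv4_uaddr(host, port: int) -> str:
    """Build the r_addr string for TCP or UDP over IPv4.

    `host` may be a dotted-quad string or a 32-bit integer in
    big-endian (network) order. The port is split into its high and
    low bytes, each written in ASCII decimal.
    """
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    addr = ipaddress.IPv4Address(host)
    return f"{addr}.{port >> 8}.{port & 0xFF}"


def parse_ipv4_uaddr(uaddr: str):
    """Split an IPv4 universal address back into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected h1.h2.h3.h4.p1.p2")
    host = str(ipaddress.IPv4Address(".".join(parts[:4])))
    p1, p2 = int(parts[4]), int(parts[5])
    if not (0 <= p1 <= 255 and 0 <= p2 <= 255):
        raise ValueError("port bytes must each be in 0..255")
    return host, (p1 << 8) | p2
```

Applied to the example in the removed text, address 0x0A010307 with port 0x020F (decimal 527) gives "10.1.3.7.2.15".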
| 3.3.10. state_owner4 | | 3.3.10. state_owner4 | |
| | | | |
| struct state_owner4 { | | struct state_owner4 { | |
| clientid4 clientid; | | clientid4 clientid; | |
| opaque owner<NFS4_OPAQUE_LIMIT>; | | opaque owner<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| typedef state_owner4 open_owner4; | | typedef state_owner4 open_owner4; | |
| typedef state_owner4 lock_owner4; | | typedef state_owner4 lock_owner4; | |
| | | | |
| skipping to change at page 86, line 44 | | skipping to change at page 86, line 33 | |
| The layouttype4 data type is 32 bits in length. The range | | The layouttype4 data type is 32 bits in length. The range | |
| represented by the layout type is split into three parts. Type 0x0 | | represented by the layout type is split into three parts. Type 0x0 | |
| is reserved. Types within the range 0x00000001-0x7FFFFFFF are | | is reserved. Types within the range 0x00000001-0x7FFFFFFF are | |
| globally unique and are assigned according to the description in | | globally unique and are assigned according to the description in | |
| Section 22.4; they are maintained by IANA. Types within the range | | Section 22.4; they are maintained by IANA. Types within the range | |
| 0x80000000-0xFFFFFFFF are site specific and for private use only. | | 0x80000000-0xFFFFFFFF are site specific and for private use only. | |
| | | | |
| The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file | | The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file | |
| layout type, as defined in Section 13, is to be used. The | | layout type, as defined in Section 13, is to be used. The | |
| LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as | | LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as | |
|
| defined in [30], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME | | defined in [29], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME | |
| enumeration specifies that the block/volume layout, as defined in | | enumeration specifies that the block/volume layout, as defined in | |
|
| [31], is to be used. | | [30], is to be used. | |
| | | | |
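The three-way split of the 32-bit layouttype4 range can be expressed as a small helper. This is a minimal sketch; the function name layout_type_class and the returned labels are hypothetical and appear in neither draft.

```python
def layout_type_class(layout_type: int) -> str:
    """Report which part of the layouttype4 range a value falls in."""
    if not 0 <= layout_type <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32-bit value")
    if layout_type == 0x0:
        return "reserved"
    if layout_type <= 0x7FFFFFFF:
        return "iana"       # globally unique, maintained by IANA
    return "private"        # site specific, private use only
```

A value of 0x80000000 is the first private-use type, and 0x7FFFFFFF is the last IANA-assigned one.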
| 3.3.14. deviceid4 | | 3.3.14. deviceid4 | |
| | | | |
| const NFS4_DEVICEID4_SIZE = 16; | | const NFS4_DEVICEID4_SIZE = 16; | |
| | | | |
| typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; | | typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; | |
|
| | | | |
| Layout information includes device IDs that specify a storage device | | Layout information includes device IDs that specify a storage device | |
| through a compact handle. Addressing and type information is | | through a compact handle. Addressing and type information is | |
| obtained with the GETDEVICEINFO operation. Device IDs are not | | obtained with the GETDEVICEINFO operation. Device IDs are not | |
| guaranteed to be valid across metadata server restarts. A device ID | | guaranteed to be valid across metadata server restarts. A device ID | |
| is unique per client ID and layout type. See Section 12.2.10 for | | is unique per client ID and layout type. See Section 12.2.10 for | |
| more details. | | more details. | |
| | | | |
| 3.3.15. device_addr4 | | 3.3.15. device_addr4 | |
| | | | |
| struct device_addr4 { | | struct device_addr4 { | |
| | | | |
| skipping to change at page 90, line 50 | | skipping to change at page 90, line 50 | |
| for a file system object. The contents of the filehandle are opaque | | for a file system object. The contents of the filehandle are opaque | |
| to the client. Therefore, the server is responsible for translating | | to the client. Therefore, the server is responsible for translating | |
| the filehandle to an internal representation of the file system | | the filehandle to an internal representation of the file system | |
| object. | | object. | |
| | | | |
| 4.1. Obtaining the First Filehandle | | 4.1. Obtaining the First Filehandle | |
| | | | |
| The operations of the NFS protocol are defined in terms of one or | | The operations of the NFS protocol are defined in terms of one or | |
| more filehandles. Therefore, the client needs a filehandle to | | more filehandles. Therefore, the client needs a filehandle to | |
| initiate communication with the server. With the NFSv3 protocol | | initiate communication with the server. With the NFSv3 protocol | |
|
| RFC1813 [22], there exists an ancillary protocol to obtain this first | | RFC1813 [21], there exists an ancillary protocol to obtain this first | |
| filehandle. The MOUNT protocol, RPC program number 100005, provides | | filehandle. The MOUNT protocol, RPC program number 100005, provides | |
| the mechanism of translating a string based file system path name to | | the mechanism of translating a string based file system path name to | |
| a filehandle which can then be used by the NFS protocols. | | a filehandle which can then be used by the NFS protocols. | |
| | | | |
| The MOUNT protocol has deficiencies in the area of security and use | | The MOUNT protocol has deficiencies in the area of security and use | |
| via firewalls. This is one reason that the use of the public | | via firewalls. This is one reason that the use of the public | |
|
| filehandle was introduced in RFC2054 [32] and RFC2055 [33]. With the | | filehandle was introduced in RFC2054 [31] and RFC2055 [32]. With the | |
| use of the public filehandle in combination with the LOOKUP operation | | use of the public filehandle in combination with the LOOKUP operation | |
| in the NFSv3 protocol, it has been demonstrated that the MOUNT | | in the NFSv3 protocol, it has been demonstrated that the MOUNT | |
| protocol is unnecessary for viable interaction between NFS client and | | protocol is unnecessary for viable interaction between NFS client and | |
| server. | | server. | |
| | | | |
| Therefore, the NFSv4.1 protocol will not use an ancillary protocol | | Therefore, the NFSv4.1 protocol will not use an ancillary protocol | |
| for translation from string based path names to a filehandle. Two | | for translation from string based path names to a filehandle. Two | |
| special filehandles will be used as starting points for the NFS | | special filehandles will be used as starting points for the NFS | |
| client. | | client. | |
| | | | |
| | | | |
| skipping to change at page 94, line 31 | | skipping to change at page 94, line 31 | |
| | | | |
| Servers which provide volatile filehandles that may expire while open | | Servers which provide volatile filehandles that may expire while open | |
| (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if | | (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if | |
| FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set), should | | FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set), should | |
| deny a RENAME or REMOVE that would affect an OPEN file of any of the | | deny a RENAME or REMOVE that would affect an OPEN file of any of the | |
| components leading to the OPEN file. In addition, the server should | | components leading to the OPEN file. In addition, the server should | |
| deny all RENAME or REMOVE requests during the grace period upon | | deny all RENAME or REMOVE requests during the grace period upon | |
| server restart. | | server restart. | |
| | | | |
| Servers which provide volatile filehandles that may expire while open | | Servers which provide volatile filehandles that may expire while open | |
|
| require special care as regards handling of RENAMESs and REMOVEs. | | require special care as regards handling of RENAMEs and REMOVEs. | |
| This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is | | This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is | |
| set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set, | | set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set, | |
| or if a non-readonly file system has a transition target in a | | or if a non-readonly file system has a transition target in a | |
| different _handle_ class. In these cases, the server should deny a | | different _handle_ class. In these cases, the server should deny a | |
| RENAME or REMOVE that would affect an OPEN file of any of the | | RENAME or REMOVE that would affect an OPEN file of any of the | |
| components leading to the OPEN file. In addition, the server should | | components leading to the OPEN file. In addition, the server should | |
| deny all RENAME or REMOVE requests during the grace period, in order | | deny all RENAME or REMOVE requests during the grace period, in order | |
| to make sure that reclaims of files where filehandles may have | | to make sure that reclaims of files where filehandles may have | |
| expired do not do a reclaim for the wrong file. | | expired do not do a reclaim for the wrong file. | |
| | | | |
| | | | |
| skipping to change at page 105, line 16 | | skipping to change at page 105, line 16 | |
| | | | |
| True, if two distinct filehandles are guaranteed to refer to two | | True, if two distinct filehandles are guaranteed to refer to two | |
| different file system objects. | | different file system objects. | |
| | | | |
| 5.8.1.11. Attribute 10: lease_time | | 5.8.1.11. Attribute 10: lease_time | |
| | | | |
| Duration of leases at server in seconds. | | Duration of leases at server in seconds. | |
| | | | |
| 5.8.1.12. Attribute 11: rdattr_error | | 5.8.1.12. Attribute 11: rdattr_error | |
| | | | |
|
| Error returned from getattr during readdir. | | Error returned from an attempt to retrieve attributes during a | |
| | | READDIR operation. | |
| | | | |
| 5.8.1.13. Attribute 19: filehandle | | 5.8.1.13. Attribute 19: filehandle | |
| | | | |
|
| The filehandle of this object (primarily for readdir requests). | | The filehandle of this object (primarily for READDIR requests). | |
| | | | |
| 5.8.1.14. Attribute 75: suppattr_exclcreat | | 5.8.1.14. Attribute 75: suppattr_exclcreat | |
| | | | |
| The bit vector which would set all REQUIRED and RECOMMENDED | | The bit vector which would set all REQUIRED and RECOMMENDED | |
| attributes that are supported by the EXCLUSIVE4_1 method of file | | attributes that are supported by the EXCLUSIVE4_1 method of file | |
| creation via the OPEN operation. The scope of this attribute applies | | creation via the OPEN operation. The scope of this attribute applies | |
| to all objects with a matching fsid. | | to all objects with a matching fsid. | |
| | | | |
| 5.8.2. Definitions of Uncategorized RECOMMENDED Attributes | | 5.8.2. Definitions of Uncategorized RECOMMENDED Attributes | |
| | | | |
| | | | |
| skipping to change at page 112, line 15 | | skipping to change at page 112, line 15 | |
| 5.8.2.44. Attribute 54: time_modify_set | | 5.8.2.44. Attribute 54: time_modify_set | |
| | | | |
| Set the time of last modification to the object. SETATTR use only. | | Set the time of last modification to the object. SETATTR use only. | |
| | | | |
| 5.9. Interpreting owner and owner_group | | 5.9. Interpreting owner and owner_group | |
| | | | |
| The RECOMMENDED attributes "owner" and "owner_group" (and also users | | The RECOMMENDED attributes "owner" and "owner_group" (and also users | |
| and groups within the "acl" attribute) are represented in terms of a | | and groups within the "acl" attribute) are represented in terms of a | |
| UTF-8 string. To avoid a representation that is tied to a particular | | UTF-8 string. To avoid a representation that is tied to a particular | |
| underlying implementation at the client or server, the use of the | | underlying implementation at the client or server, the use of the | |
|
| UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [34] | | UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [33] | |
| provides additional rationale. It is expected that the client and | | provides additional rationale. It is expected that the client and | |
| server will have their own local representation of owner and | | server will have their own local representation of owner and | |
| owner_group that is used for local storage or presentation to the end | | owner_group that is used for local storage or presentation to the end | |
| user. Therefore, it is expected that when these attributes are | | user. Therefore, it is expected that when these attributes are | |
| transferred between the client and server that the local | | transferred between the client and server that the local | |
| representation is translated to a syntax of the form "user@ | | representation is translated to a syntax of the form "user@ | |
| dns_domain". This will allow for a client and server that do not use | | dns_domain". This will allow for a client and server that do not use | |
| the same local representation the ability to translate to a common | | the same local representation the ability to translate to a common | |
| syntax that can be interpreted by both. | | syntax that can be interpreted by both. | |
| | | | |
| | | | |
| skipping to change at page 114, line 9 | | skipping to change at page 114, line 9 | |
| compatibility. | | compatibility. | |
| | | | |
| The owner string "nobody" may be used to designate an anonymous user, | | The owner string "nobody" may be used to designate an anonymous user, | |
| which will be associated with a file created by a security principal | | which will be associated with a file created by a security principal | |
| that cannot be mapped through normal means to the owner attribute. | | that cannot be mapped through normal means to the owner attribute. | |
| | | | |
| 5.10. Character Case Attributes | | 5.10. Character Case Attributes | |
| | | | |
| With respect to the case_insensitive and case_preserving attributes, | | With respect to the case_insensitive and case_preserving attributes, | |
| each UCS-4 character (which UTF-8 encodes) has a "long descriptive | | each UCS-4 character (which UTF-8 encodes) has a "long descriptive | |
|
| name" RFC1345 [35] which may or may not include the word "CAPITAL" or | | name" RFC1345 [34] which may or may not include the word "CAPITAL" or | |
| "SMALL". The presence of SMALL or CAPITAL allows an NFS server to | | "SMALL". The presence of SMALL or CAPITAL allows an NFS server to | |
| implement unambiguous and efficient table driven mappings for case | | implement unambiguous and efficient table driven mappings for case | |
| insensitive comparisons, and non-case-preserving storage. For | | insensitive comparisons, and non-case-preserving storage. For | |
| general character handling and internationalization issues, see | | general character handling and internationalization issues, see | |
| Section 14. | | Section 14. | |
| | | | |
| 5.11. Directory Notification Attributes | | 5.11. Directory Notification Attributes | |
| | | | |
| As described in Section 18.39, the client can request a minimum delay | | As described in Section 18.39, the client can request a minimum delay | |
| for notifications of changes to attributes, but the server is free to | | for notifications of changes to attributes, but the server is free to | |
| | | | |
| skipping to change at page 132, line 28 | | skipping to change at page 132, line 28 | |
| this is true even if the parent or target explicitly denies one of | | this is true even if the parent or target explicitly denies one of | |
| these permissions.) | | these permissions.) | |
| | | | |
| If the ACLs in question neither explicitly ALLOW nor DENY either of | | If the ACLs in question neither explicitly ALLOW nor DENY either of | |
| the above, and if MODE4_SVTX is not set on the parent, then the | | the above, and if MODE4_SVTX is not set on the parent, then the | |
| server SHOULD allow the removal if and only if ACE4_ADD_FILE is | | server SHOULD allow the removal if and only if ACE4_ADD_FILE is | |
| permitted. In the case where MODE4_SVTX is set, the server may also | | permitted. In the case where MODE4_SVTX is set, the server may also | |
| require the remover to own either the parent or the target, or may | | require the remover to own either the parent or the target, or may | |
| require the target to be writable. | | require the target to be writable. | |
| | | | |
|
| This allows servers to support something close to traditional unix- | | This allows servers to support something close to traditional UNIX- | |
| like semantics, with ACE4_ADD_FILE taking the place of the write bit. | | like semantics, with ACE4_ADD_FILE taking the place of the write bit. | |
| | | | |
| 6.2.1.4. ACE flag | | 6.2.1.4. ACE flag | |
| | | | |
| The bitmask constants used for the flag field are as follows: | | The bitmask constants used for the flag field are as follows: | |
| | | | |
| const ACE4_FILE_INHERIT_ACE = 0x00000001; | | const ACE4_FILE_INHERIT_ACE = 0x00000001; | |
| const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; | | const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; | |
| const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; | | const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; | |
| const ACE4_INHERIT_ONLY_ACE = 0x00000008; | | const ACE4_INHERIT_ONLY_ACE = 0x00000008; | |
| | | | |
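As an illustration only (not text from either draft revision), the four flag constants visible in this hunk can be tested against a flag field with ordinary bitwise operations. The helper name describe_flags is hypothetical, and the constant list in the drafts continues beyond what this hunk shows.

```python
# Values copied from the constants shown in this hunk.
ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004
ACE4_INHERIT_ONLY_ACE = 0x00000008

FLAG_NAMES = {
    ACE4_FILE_INHERIT_ACE: "ACE4_FILE_INHERIT_ACE",
    ACE4_DIRECTORY_INHERIT_ACE: "ACE4_DIRECTORY_INHERIT_ACE",
    ACE4_NO_PROPAGATE_INHERIT_ACE: "ACE4_NO_PROPAGATE_INHERIT_ACE",
    ACE4_INHERIT_ONLY_ACE: "ACE4_INHERIT_ONLY_ACE",
}


def describe_flags(flag):
    """Return the names of the known bits that are set in a flag field."""
    return [name for bit, name in FLAG_NAMES.items() if flag & bit]
```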
| skipping to change at page 139, line 37 | | skipping to change at page 139, line 37 | |
| behaviors specified with "SHOULD". This is intentional, to avoid | | behaviors specified with "SHOULD". This is intentional, to avoid | |
| invalidating existing implementations that compute the mode according | | invalidating existing implementations that compute the mode according | |
| to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by | | to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by | |
| actual permissions on owner, group, and other. | | actual permissions on owner, group, and other. | |
| | | | |
| 6.4.1. Setting the mode and/or ACL Attributes | | 6.4.1. Setting the mode and/or ACL Attributes | |
| | | | |
| In the case where a server supports the sacl or dacl attribute, in | | In the case where a server supports the sacl or dacl attribute, in | |
| addition to the acl attribute, the server MUST fail a request to set | | addition to the acl attribute, the server MUST fail a request to set | |
| the acl attribute simultaneously with a dacl or sacl attribute. The | | the acl attribute simultaneously with a dacl or sacl attribute. The | |
|
| error to be given is NFS4ERR_ATTRNOTSUP. | | error to be given is NFS4ERR_ATTRNOTSUPP. | |
| | | | |
| 6.4.1.1. Setting mode and not ACL | | 6.4.1.1. Setting mode and not ACL | |
| | | | |
| When any of the nine low-order mode bits are subject to change, | | When any of the nine low-order mode bits are subject to change, | |
| either because the mode attribute was set or because the | | either because the mode attribute was set or because the | |
| mode_set_masked attribute was set and the mask included one or more | | mode_set_masked attribute was set and the mask included one or more | |
| bits from the nine low-order mode bits, and no ACL attribute is | | bits from the nine low-order mode bits, and no ACL attribute is | |
| explicitly set, the acl and dacl attributes must be modified in | | explicitly set, the acl and dacl attributes must be modified in | |
| accordance with the updated value of those bits. This must happen | | accordance with the updated value of those bits. This must happen | |
| even if the value of the low-order bits is the same after the mode is | | even if the value of the low-order bits is the same after the mode is | |
| | | | |
| skipping to change at page 143, line 49 | | skipping to change at page 143, line 49 | |
| and all other bits must be cleared. The ACE4_INHERITED_ACE flag may | | and all other bits must be cleared. The ACE4_INHERITED_ACE flag may | |
| be set in the ACEs of the sacl or dacl (whereas it must always be | | be set in the ACEs of the sacl or dacl (whereas it must always be | |
| cleared in the acl). | | cleared in the acl). | |
| | | | |
| Together these features allow a server to support automatic | | Together these features allow a server to support automatic | |
| inheritance, which we now explain in more detail. | | inheritance, which we now explain in more detail. | |
| | | | |
| Inheritable ACEs are normally inherited by child objects only at the | | Inheritable ACEs are normally inherited by child objects only at the | |
| time that the child objects are created; later modifications to | | time that the child objects are created; later modifications to | |
| inheritable ACEs do not result in modifications to inherited ACEs on | | inheritable ACEs do not result in modifications to inherited ACEs on | |
|
| descendents. | | descendants. | |
| | | | |
| However, the dacl and sacl provide an OPTIONAL mechanism which allows | | However, the dacl and sacl provide an OPTIONAL mechanism which allows | |
| a client application to propagate changes to inheritable ACEs to an | | a client application to propagate changes to inheritable ACEs to an | |
| entire directory hierarchy. | | entire directory hierarchy. | |
| | | | |
| A server that supports this performs inheritance at object creation | | A server that supports this performs inheritance at object creation | |
| time in the normal way, and SHOULD set the ACE4_INHERITED_ACE flag on | | time in the normal way, and SHOULD set the ACE4_INHERITED_ACE flag on | |
| any inherited ACEs as they are added to the new object. | | any inherited ACEs as they are added to the new object. | |
| | | | |
| A client application such as an ACL editor may then propagate changes | | A client application such as an ACL editor may then propagate changes | |
| | | | |
| skipping to change at page 149, line 43 | | skipping to change at page 149, line 43 | |
| clients should use strong security mechanisms to access the pseudo | | clients should use strong security mechanisms to access the pseudo | |
| file system in order to prevent man-in-the-middle attacks. | | file system in order to prevent man-in-the-middle attacks. | |
| | | | |
| 8. State Management | | 8. State Management | |
| | | | |
| Integrating locking into the NFS protocol necessarily causes it to be | | Integrating locking into the NFS protocol necessarily causes it to be | |
| stateful. With the inclusion of such features as share reservations, | | stateful. With the inclusion of such features as share reservations, | |
| file and directory delegations, recallable layouts, and support for | | file and directory delegations, recallable layouts, and support for | |
| mandatory byte-range locking, the protocol becomes substantially more | | mandatory byte-range locking, the protocol becomes substantially more | |
| dependent on proper management of state than the traditional | | dependent on proper management of state than the traditional | |
|
| combination of NFS and NLM [36]. These features include expanded | | combination of NFS and NLM [35]. These features include expanded | |
| locking facilities, which provide some measure of interclient | | locking facilities, which provide some measure of interclient | |
| exclusion, but the state also offers features not readily providable | | exclusion, but the state also offers features not readily providable | |
| using a stateless model. There are three components to making this | | using a stateless model. There are three components to making this | |
| state manageable: | | state manageable: | |
| | | | |
| o Clear division between client and server | | o Clear division between client and server | |
| o Ability to reliably detect inconsistency in state between client | | o Ability to reliably detect inconsistency in state between client | |
| and server | | and server | |
| | | | |
| o Simple and robust recovery mechanisms | | o Simple and robust recovery mechanisms | |
| | | | |
| skipping to change at page 166, line 22 | | skipping to change at page 166, line 22 | |
| requests to be processed during the grace period, it MUST determine | | requests to be processed during the grace period, it MUST determine | |
| that no lock subsequently reclaimed will be rejected and that no lock | | that no lock subsequently reclaimed will be rejected and that no lock | |
| subsequently reclaimed would have prevented any I/O operation | | subsequently reclaimed would have prevented any I/O operation | |
| processed during the grace period. | | processed during the grace period. | |
| | | | |
| Clients should be prepared for the return of NFS4ERR_GRACE errors for | | Clients should be prepared for the return of NFS4ERR_GRACE errors for | |
| non-reclaim lock and I/O requests. In this case the client should | | non-reclaim lock and I/O requests. In this case the client should | |
| employ a retry mechanism for the request. A delay (on the order of | | employ a retry mechanism for the request. A delay (on the order of | |
| several seconds) between retries should be used to avoid overwhelming | | several seconds) between retries should be used to avoid overwhelming | |
| the server. Further discussion of the general issue is included in | | the server. Further discussion of the general issue is included in | |
|
| [37]. The client must account for the server that can perform I/O | | [36]. The client must account for the server that can perform I/O | |
| and non-reclaim locking requests within the grace period as well as | | and non-reclaim locking requests within the grace period as well as | |
| those that cannot do so. | | those that cannot do so. | |
| | | | |
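As an illustration only (not text from either draft revision), a client-side retry loop for NFS4ERR_GRACE might look like the sketch below. The function name, the five-second default delay, the attempt limit, and the use of strings for status codes are all assumptions; the drafts only call for a delay on the order of several seconds between retries.

```python
import time


def send_with_grace_retry(send_request, delay_seconds=5.0, max_attempts=10,
                          sleep=time.sleep):
    """Call send_request() until it returns something other than NFS4ERR_GRACE.

    send_request is a hypothetical callable returning a status name.
    The sleep parameter exists so the delay can be replaced in tests.
    """
    for attempt in range(max_attempts):
        status = send_request()
        if status != "NFS4ERR_GRACE":
            return status
        # Wait before the next retry so the server is not overwhelmed.
        if attempt + 1 < max_attempts:
            sleep(delay_seconds)
    return "NFS4ERR_GRACE"
```

A server that does process non-reclaim requests during the grace period simply never returns NFS4ERR_GRACE, so the same loop covers both kinds of server.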
| A reclaim-type locking request outside the server's grace period can | | A reclaim-type locking request outside the server's grace period can | |
| only succeed if the server can guarantee that no conflicting lock or | | only succeed if the server can guarantee that no conflicting lock or | |
| I/O request has been granted since restart. | | I/O request has been granted since restart. | |
| | | | |
| A server may, upon restart, establish a new value for the lease | | A server may, upon restart, establish a new value for the lease | |
| period. Therefore, clients should, once a new client ID is | | period. Therefore, clients should, once a new client ID is | |
| established, refetch the lease_time attribute and use it as the basis | | established, refetch the lease_time attribute and use it as the basis | |
| | | | |
| skipping to change at page 173, line 9 | | skipping to change at page 173, line 9 | |
| well as the possibility that requests will be lost and need to be | | well as the possibility that requests will be lost and need to be | |
| retransmitted. | | retransmitted. | |
| | | | |
| To take propagation delay into account, the client should subtract it | | To take propagation delay into account, the client should subtract it | |
| from lease times (e.g. if the client estimates the one-way | | from lease times (e.g. if the client estimates the one-way | |
| propagation delay as 200 milliseconds, then it can assume that the | | propagation delay as 200 milliseconds, then it can assume that the | |
| lease is already 200 milliseconds old when it gets it). In addition, | | lease is already 200 milliseconds old when it gets it). In addition, | |
| it will take another 200 milliseconds to get a response back to the | | it will take another 200 milliseconds to get a response back to the | |
| server. So the client must send a lease renewal or write data back | | server. So the client must send a lease renewal or write data back | |
| to the server at least 400 milliseconds before the lease would | | to the server at least 400 milliseconds before the lease would | |
|
| expire. | | expire. If the propagation delay varies over the life of the lease | |
| | | (e.g. the client is on a mobile host), the client will need to | |
| | | continuously subtract the increase in propagation delay from the | |
| | | lease times. | |
| | | | |
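As an illustration only (not text from either draft revision), the arithmetic in the preceding paragraph can be written out as below. The function name and the 90-second lease period in the usage note are assumptions; only the 200 millisecond one-way delay comes from the text.

```python
def latest_renewal_send_time(lease_received_at, lease_period, one_way_delay):
    """Latest local time (in seconds) at which a renewal can still be sent.

    The lease is already one_way_delay old when the client receives it,
    and the renewal needs another one_way_delay to reach the server,
    so twice the one-way delay is subtracted from the lease period.
    """
    return lease_received_at + lease_period - 2 * one_way_delay
```

With a hypothetical 90-second lease received at local time 0 and a 200 millisecond one-way delay, the renewal has to leave the client no later than 89.6 seconds. If the delay estimate grows during the lease, calling the function again with the larger estimate yields the earlier deadline.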
| The server's lease period configuration should take into account the | | The server's lease period configuration should take into account the | |
| network distance of the clients that will be accessing the server's | | network distance of the clients that will be accessing the server's | |
| resources. It is expected that the lease period will take into | | resources. It is expected that the lease period will take into | |
| account the network propagation delays and other network delay | | account the network propagation delays and other network delay | |
| factors for the client population. Since the protocol does not allow | | factors for the client population. Since the protocol does not allow | |
| for an automatic method to determine an appropriate lease period, the | | for an automatic method to determine an appropriate lease period, the | |
| server's administrator may have to tune the lease period. | | server's administrator may have to tune the lease period. | |
| | | | |
| 8.8. Obsolete Locking Infrastructure From NFSv4.0 | | 8.8. Obsolete Locking Infrastructure From NFSv4.0 | |
| | | | |
| skipping to change at page 187, line 46 | | skipping to change at page 187, line 46 | |
| | | | |
| o For WRITE, see Section 18.32.4. | | o For WRITE, see Section 18.32.4. | |
| | | | |
| On recall, the client holding the delegation must flush modified | | On recall, the client holding the delegation must flush modified | |
| state (such as modified data) to the server and return the | | state (such as modified data) to the server and return the | |
| delegation. The conflicting request will not be acted on until the | | delegation. The conflicting request will not be acted on until the | |
| recall is complete. The recall is considered complete when the | | recall is complete. The recall is considered complete when the | |
| client returns the delegation or the server times its wait for the | | client returns the delegation or the server times its wait for the | |
| delegation to be returned and revokes the delegation as a result of | | delegation to be returned and revokes the delegation as a result of | |
| the timeout. In the interim, the server will either delay responding | | the timeout. In the interim, the server will either delay responding | |
|
| to conflicting requests or respond to them with NFSERR_DELAY. | | to conflicting requests or respond to them with NFS4ERR_DELAY. | |
| Following the resolution of the recall, the server has the | | Following the resolution of the recall, the server has the | |
| information necessary to grant or deny the second client's request. | | information necessary to grant or deny the second client's request. | |
| | | | |
| At the time the client receives a delegation recall, it may have | | At the time the client receives a delegation recall, it may have | |
| substantial state that needs to be flushed to the server. Therefore, | | substantial state that needs to be flushed to the server. Therefore, | |
| the server should allow sufficient time for the delegation to be | | the server should allow sufficient time for the delegation to be | |
| returned since it may involve numerous RPCs to the server. If the | | returned since it may involve numerous RPCs to the server. If the | |
| server is able to determine that the client is diligently flushing | | server is able to determine that the client is diligently flushing | |
| state to the server as a result of the recall, the server may extend | | state to the server as a result of the recall, the server may extend | |
| the usual time allowed for a recall. However, the time allowed for | | the usual time allowed for a recall. However, the time allowed for | |
| | | | |
| skipping to change at page 190, line 19 | | skipping to change at page 190, line 19 | |
| to the behavior for locks and share reservations. For delegations, | | to the behavior for locks and share reservations. For delegations, | |
| however, the server may extend the period in which conflicting | | however, the server may extend the period in which conflicting | |
| requests are held off. Eventually the occurrence of a conflicting | | requests are held off. Eventually the occurrence of a conflicting | |
| request from another client will cause revocation of the delegation. | | request from another client will cause revocation of the delegation. | |
| A loss of the backchannel (e.g. by later network configuration | | A loss of the backchannel (e.g. by later network configuration | |
| change) will have the same effect. A recall request will fail and | | change) will have the same effect. A recall request will fail and | |
| revocation of the delegation will result. | | revocation of the delegation will result. | |
| | | | |
| A client normally finds out about revocation of a delegation when it | | A client normally finds out about revocation of a delegation when it | |
| uses a stateid associated with a delegation and receives one of the | | uses a stateid associated with a delegation and receives one of the | |
|
| errors NFS4EER_EXPIRED, NFS4ERR_ADMIN_REVOKED, or | | errors NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, or | |
| NFS4ERR_DELEG_REVOKED. It also may find out about delegation | | NFS4ERR_DELEG_REVOKED. It also may find out about delegation | |
| revocation after a client restart when it attempts to reclaim a | | revocation after a client restart when it attempts to reclaim a | |
| delegation and receives that same error. Note that in the case of a | | delegation and receives that same error. Note that in the case of a | |
| revoked write open delegation, there are issues because data may have | | revoked write open delegation, there are issues because data may have | |
| been modified by the client whose delegation is revoked and | | been modified by the client whose delegation is revoked and | |
| separately by other clients. See Section 10.5.1 for a discussion of | | separately by other clients. See Section 10.5.1 for a discussion of | |
| such issues. Note also that when delegations are revoked, | | such issues. Note also that when delegations are revoked, | |
| information about the revoked delegation will be written by the | | information about the revoked delegation will be written by the | |
| server to stable storage (as described in Section 8.4.3). This is | | server to stable storage (as described in Section 8.4.3). This is | |
| done to deal with the case in which a server restarts after revoking | | done to deal with the case in which a server restarts after revoking | |
| | | | |
| skipping to change at page 233, line 33 | | skipping to change at page 233, line 33 | |
| 11.7.5.1. File System Splitting | | 11.7.5.1. File System Splitting | |
| | | | |
| When a file system transition is made and the fs_locations_info | | When a file system transition is made and the fs_locations_info | |
| indicates that the file system in question may be split into multiple | | indicates that the file system in question may be split into multiple | |
| file systems (via the FSLI4F_MULTI_FS flag), the client SHOULD do | | file systems (via the FSLI4F_MULTI_FS flag), the client SHOULD do | |
| GETATTRs to determine the fsid attribute on all known objects within | | GETATTRs to determine the fsid attribute on all known objects within | |
| the file system undergoing transition to determine the new file | | the file system undergoing transition to determine the new file | |
| system boundaries. | | system boundaries. | |
| | | | |
| Clients may maintain the fsids passed to existing applications by | | Clients may maintain the fsids passed to existing applications by | |
|
| mapping all of the fsids for the descendent file systems to the | | mapping all of the fsids for the descendant file systems to the | |
| common fsid used for the original file system. | | common fsid used for the original file system. | |
| | | | |
| Splitting a file system may be done on a transition between file | | Splitting a file system may be done on a transition between file | |
| systems of the same _fileid_ class, since the fact that fileids are | | systems of the same _fileid_ class, since the fact that fileids are | |
| unique within the source file system ensures they will be unique in | | unique within the source file system ensures they will be unique in | |
| each of the target file systems. | | each of the target file systems. | |
| | | | |
| 11.7.6. The Change Attribute and File System Transitions | | 11.7.6. The Change Attribute and File System Transitions | |
| | | | |
| Since the change attribute is defined as a server-specific one, | | Since the change attribute is defined as a server-specific one, | |
| | | | |
| skipping to change at page 260, line 40 | | skipping to change at page 260, line 40 | |
| expected to be used in line with industry practice. | | expected to be used in line with industry practice. | |
| | | | |
| The variable ${ietf.org:OS_TYPE} is used to denote the operating | | The variable ${ietf.org:OS_TYPE} is used to denote the operating | |
| system and thus the kernel and library APIs for which code might be | | system and thus the kernel and library APIs for which code might be | |
| compiled. This specification does not limit the acceptable values | | compiled. This specification does not limit the acceptable values | |
| (except that they must be valid UTF-8 strings) but such values as | | (except that they must be valid UTF-8 strings) but such values as | |
| "linux" and "freebsd" would be expected to be used in line with | | "linux" and "freebsd" would be expected to be used in line with | |
| industry practice. | | industry practice. | |
| | | | |
| The variable ${ietf.org:OS_VERSION} is used to denote the operating | | The variable ${ietf.org:OS_VERSION} is used to denote the operating | |
|
| system version and the thus the specific details of versioned | | system version and thus the specific details of versioned interfaces | |
| interfaces for which code might be compiled. This specification does | | for which code might be compiled. This specification does not limit | |
| not limit the acceptable values (except that they must be valid UTF-8 | | the acceptable values (except that they must be valid UTF-8 strings) | |
| strings) but combinations of numbers and letters with interspersed | | but combinations of numbers and letters with interspersed dots would | |
| dots would be expected to be used in line with industry practice, | | be expected to be used in line with industry practice, with the | |
| with the details of the version format depending on the specific | | details of the version format depending on the specific value of the | |
| value of the variable ${ietf.org:OS_TYPE} with which it | | variable ${ietf.org:OS_TYPE} with which it is used. | |
| is used. | | | |
| | | | |
| Use of these variables could result in direction of different clients | | Use of these variables could result in direction of different clients | |
| to different file systems on the same server, as appropriate to | | to different file systems on the same server, as appropriate to | |
| particular clients. In cases in which the target file systems are | | particular clients. In cases in which the target file systems are | |
| located on different servers, a single server could serve as a | | located on different servers, a single server could serve as a | |
| referral point so that each valid combination of variable values | | referral point so that each valid combination of variable values | |
| would designate a referral hosted on a single server, with the | | would designate a referral hosted on a single server, with the | |
| targets of those referrals on a number of different servers. | | targets of those referrals on a number of different servers. | |
| | | | |
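As an illustration only (not text from either draft revision), substituting such variables into a path could be sketched as below. The function name, the example path, and the choice to leave unknown variables unexpanded are assumptions, not behavior taken from the drafts.

```python
import re

# Matches ${name}, capturing everything between the braces.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand_variables(path, values):
    """Replace each ${name} in path with values[name].

    Variables without a known value are left as-is (an assumption made
    for this sketch only).
    """
    def replace(match):
        return values.get(match.group(1), match.group(0))

    return _VARIABLE.sub(replace, path)
```

For example, with "ietf.org:OS_TYPE" set to "linux" and "ietf.org:OS_VERSION" set to "2.6", the hypothetical path "/export/${ietf.org:OS_TYPE}/${ietf.org:OS_VERSION}/bin" expands to "/export/linux/2.6/bin", so clients with different values are directed to different locations.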
| Because namespace administration is affected by the values selected | | Because namespace administration is affected by the values selected | |
| | | | |
| skipping to change at page 266, line 30 | | skipping to change at page 266, line 30 | |
| The NFSv4.1 pNFS feature has been structured to allow for a variety | | The NFSv4.1 pNFS feature has been structured to allow for a variety | |
| of storage protocols to be defined and used. As noted in the diagram | | of storage protocols to be defined and used. As noted in the diagram | |
| above, the storage protocol is the method used by the client to store | | above, the storage protocol is the method used by the client to store | |
| and retrieve data directly from the storage devices. The NFSv4.1 | | and retrieve data directly from the storage devices. The NFSv4.1 | |
| protocol directly defines one storage protocol, the NFSv4.1 storage | | protocol directly defines one storage protocol, the NFSv4.1 storage | |
| type, and its use. | | type, and its use. | |
| | | | |
| Examples of other storage protocols that could be used with NFSv4.1's | | Examples of other storage protocols that could be used with NFSv4.1's | |
| pNFS are: | | pNFS are: | |
| | | | |
|
| o Block/volume protocols such as iSCSI ([38]), and FCP ([39]). The | | o Block/volume protocols such as iSCSI ([37]), and FCP ([38]). The | |
| block/volume protocol support can be independent of the addressing | | block/volume protocol support can be independent of the addressing | |
| structure of the block/volume protocol used, allowing more than | | structure of the block/volume protocol used, allowing more than | |
| one protocol to access the same file data and enabling | | one protocol to access the same file data and enabling | |
| extensibility to other block/volume protocols. | | extensibility to other block/volume protocols. | |
| | | | |
|
| o Object protocols such as OSD over iSCSI or Fibre Channel [40]. | | o Object protocols such as OSD over iSCSI or Fibre Channel [39]. | |
| | | | |
| o Other storage protocols, including PVFS and other file systems | | o Other storage protocols, including PVFS and other file systems | |
| that are in use in HPC environments. | | that are in use in HPC environments. | |
| | | | |
| It is possible that various storage protocols are available to both | | It is possible that various storage protocols are available to both | |
| client and server and it may be possible that a client and server do | | client and server and it may be possible that a client and server do | |
| not have a matching storage protocol available to them. Because of | | not have a matching storage protocol available to them. Because of | |
| this, the pNFS server MUST support normal NFSv4.1 access to any file | | this, the pNFS server MUST support normal NFSv4.1 access to any file | |
| accessible by the pNFS feature; this will allow for continued | | accessible by the pNFS feature; this will allow for continued | |
| interoperability between an NFSv4.1 client and server. | | interoperability between an NFSv4.1 client and server. | |
| | | | |
| skipping to change at page 268, line 31 | | skipping to change at page 268, line 31 | |
| requirements are placed on the control protocol for maintaining | | requirements are placed on the control protocol for maintaining | |
| attributes like modify time, the change attribute, and the end-of- | | attributes like modify time, the change attribute, and the end-of- | |
| file (EOF) position. | | file (EOF) position. | |
| | | | |
| 12.2.7. Layout Types | | 12.2.7. Layout Types | |
| | | | |
| A layout describes the mapping of a file's data to the storage | | A layout describes the mapping of a file's data to the storage | |
| devices that hold the data. A layout is said to belong to a specific | | devices that hold the data. A layout is said to belong to a specific | |
| layout type (data type layouttype4, see Section 3.3.13). The layout | | layout type (data type layouttype4, see Section 3.3.13). The layout | |
| type allows for variants to handle different storage protocols, such | | type allows for variants to handle different storage protocols, such | |
|
| as those associated with block/volume [31], object [30], and file | | as those associated with block/volume [30], object [29], and file | |
| (Section 13) layout types. A metadata server, along with its control | | (Section 13) layout types. A metadata server, along with its control | |
| protocol, MUST support at least one layout type. A private sub-range | | protocol, MUST support at least one layout type. A private sub-range | |
| of the layout type name space is also defined. Values from the | | of the layout type name space is also defined. Values from the | |
| private layout type range MAY be used for internal testing or | | private layout type range MAY be used for internal testing or | |
| experimentation. | | experimentation. | |
| | | | |
| As an example, the organization of the file layout type could be an | | As an example, the organization of the file layout type could be an | |
| array of tuples (e.g., device ID, filehandle), along with a | | array of tuples (e.g., device ID, filehandle), along with a | |
| definition of how the data is stored across the devices (e.g., | | definition of how the data is stored across the devices (e.g., | |
| striping). A block/volume layout might be an array of tuples that | | striping). A block/volume layout might be an array of tuples that | |
| | | | |
| skipping to change at page 273, line 14 | | skipping to change at page 273, line 14 | |
| which a layout is held, does not necessarily conflict with the | | which a layout is held, does not necessarily conflict with the | |
| holding of the layout that describes the file being modified. | | holding of the layout that describes the file being modified. | |
| Therefore, it is the requirement of the storage protocol or layout | | Therefore, it is the requirement of the storage protocol or layout | |
| type that determines the necessary behavior. For example, block/ | | type that determines the necessary behavior. For example, block/ | |
| volume layout types require that the layout's iomode agree with the | | volume layout types require that the layout's iomode agree with the | |
| type of I/O being performed. | | type of I/O being performed. | |
| | | | |
| Depending upon the layout type and storage protocol in use, storage | | Depending upon the layout type and storage protocol in use, storage | |
| device access permissions may be granted by LAYOUTGET and may be | | device access permissions may be granted by LAYOUTGET and may be | |
| encoded within the type-specific layout. For an example of storage | | encoded within the type-specific layout. For an example of storage | |
|
| device access permissions see an object based protocol such as [40]. | | device access permissions see an object based protocol such as [39]. | |
| If access permissions are encoded within the layout, the metadata | | If access permissions are encoded within the layout, the metadata | |
| server SHOULD recall the layout when those permissions become invalid | | server SHOULD recall the layout when those permissions become invalid | |
| for any reason; for example when a file becomes unwritable or | | for any reason; for example when a file becomes unwritable or | |
| inaccessible to a client. Note, clients are still required to | | inaccessible to a client. Note, clients are still required to | |
| perform the appropriate access operations with open, lock and access | | perform the appropriate access operations with open, lock and access | |
| as described above. The degree to which it is possible for the | | as described above. The degree to which it is possible for the | |
| client to circumvent these access operations and the consequences of | | client to circumvent these access operations and the consequences of | |
| doing so must be clearly specified by the individual layout type | | doing so must be clearly specified by the individual layout type | |
| specifications. In addition, these specifications must be clear | | specifications. In addition, these specifications must be clear | |
| about the requirements and non-requirements for the checking | | about the requirements and non-requirements for the checking | |
| | | | |
| skipping to change at page 274, line 8 | | skipping to change at page 274, line 8 | |
| multiple LAYOUTGET requests; these might result in multiple | | multiple LAYOUTGET requests; these might result in multiple | |
| overlapping, non-conflicting layouts (see Section 12.2.8). | | overlapping, non-conflicting layouts (see Section 12.2.8). | |
| | | | |
| In order to get a layout, the client must first have opened the file | | In order to get a layout, the client must first have opened the file | |
| via the OPEN operation. When a client has no layout on a file, it | | via the OPEN operation. When a client has no layout on a file, it | |
| MUST present a stateid as returned by OPEN, a delegation stateid, or | | MUST present a stateid as returned by OPEN, a delegation stateid, or | |
| a byte-range lock stateid in the loga_stateid argument. A successful | | a byte-range lock stateid in the loga_stateid argument. A successful | |
| LAYOUTGET result includes a layout stateid. The first successful | | LAYOUTGET result includes a layout stateid. The first successful | |
| LAYOUTGET processed by the server using a non-layout stateid as an | | LAYOUTGET processed by the server using a non-layout stateid as an | |
| argument MUST have the "seqid" field of the layout stateid in the | | argument MUST have the "seqid" field of the layout stateid in the | |
|
| response set to one. Thereafter, the client uses a layout stateid | | response set to one. Thereafter, the client MUST use a layout | |
| (see Section 12.5.3) on future invocations of LAYOUTGET on the file, | | stateid (see Section 12.5.3) on future invocations of LAYOUTGET on | |
| and the "seqid" MUST NOT be set to zero. Once the layout has been | | the file, and the "seqid" MUST NOT be set to zero. Once the layout | |
| retrieved, it can be held across multiple OPEN and CLOSE sequences. | | has been retrieved, it can be held across multiple OPEN and CLOSE | |
| Therefore, a client may hold a layout for a file that is not | | sequences. Therefore, a client may hold a layout for a file that is | |
| currently open by any user on the client. This allows for the | | not currently open by any user on the client. This allows for the | |
| caching of layouts beyond CLOSE. | | caching of layouts beyond CLOSE. | |
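
The stateid rule for LAYOUTGET described above can be sketched as follows. This is a minimal illustration, not text from either draft: the `Stateid` class, its `kind` field, and both helper functions are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stateid:
    """Hypothetical stand-in for an NFSv4.1 stateid (illustration only)."""
    kind: str          # "open", "delegation", "lock", or "layout"
    seqid: int
    other: bytes = b""


def choose_layoutget_stateid(layout: Optional[Stateid], fallback: Stateid) -> Stateid:
    """Pick the stateid a client would place in loga_stateid."""
    if layout is not None:
        # Once a layout stateid exists, it is the one to present,
        # and its seqid must not be zero.
        if layout.kind != "layout":
            raise ValueError("expected a layout stateid")
        if layout.seqid == 0:
            raise ValueError("layout stateid seqid MUST NOT be zero")
        return layout
    # No layout held yet: an open, delegation, or byte-range lock stateid.
    if fallback.kind not in ("open", "delegation", "lock"):
        raise ValueError("first LAYOUTGET needs a non-layout stateid")
    return fallback


def first_layout_stateid(other: bytes) -> Stateid:
    """Server side: the first layout stateid returned carries seqid one."""
    return Stateid("layout", 1, other)
```

A client holding no layout would call `choose_layoutget_stateid(None, open_stateid)` and then switch to the returned layout stateid for later LAYOUTGET calls on the same file.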
| | | | |
| The storage protocol used by the client to access the data on the | | The storage protocol used by the client to access the data on the | |
| storage device is determined by the layout's type. The client is | | storage device is determined by the layout's type. The client is | |
| responsible for matching the layout type with an available method to | | responsible for matching the layout type with an available method to | |
| interpret and use the layout. The method for this layout type | | interpret and use the layout. The method for this layout type | |
| selection is outside the scope of the pNFS functionality. | | selection is outside the scope of the pNFS functionality. | |
| | | | |
| Although the metadata server is in control of the layout for a file, | | Although the metadata server is in control of the layout for a file, | |
| the pNFS client can provide hints to the server when a file is opened | | the pNFS client can provide hints to the server when a file is opened | |
| | | | |
| skipping to change at page 295, line 18 | | skipping to change at page 295, line 18 | |
| threats are considered significant. | | threats are considered significant. | |
| | | | |
| In some cases, the security countermeasures for connections to | | In some cases, the security countermeasures for connections to | |
| storage devices may take the form of physical isolation or a | | storage devices may take the form of physical isolation or a | |
| recommendation not to use pNFS in an environment. For example, it | | recommendation not to use pNFS in an environment. For example, it | |
| may be impractical to provide confidentiality protection for some | | may be impractical to provide confidentiality protection for some | |
| storage protocols to protect against eavesdropping; in environments | | storage protocols to protect against eavesdropping; in environments | |
| where eavesdropping on such protocols is of sufficient concern to | | where eavesdropping on such protocols is of sufficient concern to | |
| require countermeasures, physical isolation of the communication | | require countermeasures, physical isolation of the communication | |
| channel (e.g., via direct connection from client(s) to storage | | channel (e.g., via direct connection from client(s) to storage | |
|
| device(s)) and/or a decision to forego use of pNFS (e.g., and fall | | device(s)) and/or a decision to forgo use of pNFS (e.g., and fall | |
| back to conventional NFSv4.1) may be appropriate courses of action. | | back to conventional NFSv4.1) may be appropriate courses of action. | |
| | | | |
| Where communication with storage devices is subject to the same | | Where communication with storage devices is subject to the same | |
| threats as client to metadata server communication, the protocols | | threats as client to metadata server communication, the protocols | |
| used for that communication need to provide security mechanisms as | | used for that communication need to provide security mechanisms as | |
|
| strong as or no weaker than those available via RPSEC_GSS for | | strong as or no weaker than those available via RPCSEC_GSS for | |
| NFSv4.1. | | NFSv4.1. Except for the storage protocol used for the | |
| | | LAYOUT4_NFSV4_1_FILES layout (see Section 13), i.e. except for | |
| | | NFSv4.1, it is beyond the scope of this document to specify the | |
| | | security mechanisms for storage access protocols. | |
| | | | |
| pNFS implementations MUST NOT remove NFSv4.1's access controls. The | | pNFS implementations MUST NOT remove NFSv4.1's access controls. The | |
| combination of clients, storage devices, and the metadata server are | | combination of clients, storage devices, and the metadata server are | |
| responsible for ensuring that all client to storage device file data | | responsible for ensuring that all client to storage device file data | |
| access respects NFSv4.1's ACLs and file open modes. This entails | | access respects NFSv4.1's ACLs and file open modes. This entails | |
| performing both of these checks on every access in the client, the | | performing both of these checks on every access in the client, the | |
| storage device, or both (as applicable; when the storage device is an | | storage device, or both (as applicable; when the storage device is an | |
| NFSv4.1 server, the storage device is ultimately responsible for | | NFSv4.1 server, the storage device is ultimately responsible for | |
| controlling access). If a pNFS configuration performs these checks | | controlling access). If a pNFS configuration performs these checks | |
| only in the client, the risk of a misbehaving client obtaining | | only in the client, the risk of a misbehaving client obtaining | |
| | | | |
| skipping to change at page 327, line 52 | | skipping to change at page 327, line 52 | |
| | NFS4ERR_BAD_STATEID | 10025 | Section 15.1.5.2 | | | | NFS4ERR_BAD_STATEID | 10025 | Section 15.1.5.2 | | |
| | NFS4ERR_CB_PATH_DOWN | 10048 | Section 15.1.11.4 | | | | NFS4ERR_CB_PATH_DOWN | 10048 | Section 15.1.11.4 | | |
| | NFS4ERR_CLID_INUSE | 10017 | Section 15.1.13.2 | | | | NFS4ERR_CLID_INUSE | 10017 | Section 15.1.13.2 | | |
| | NFS4ERR_CLIENTID_BUSY | 10074 | Section 15.1.13.1 | | | | NFS4ERR_CLIENTID_BUSY | 10074 | Section 15.1.13.1 | | |
| | NFS4ERR_COMPLETE_ALREADY | 10054 | Section 15.1.9.1 | | | | NFS4ERR_COMPLETE_ALREADY | 10054 | Section 15.1.9.1 | | |
| | NFS4ERR_CONN_NOT_BOUND_TO_SESSION | 10055 | Section 15.1.11.6 | | | | NFS4ERR_CONN_NOT_BOUND_TO_SESSION | 10055 | Section 15.1.11.6 | | |
| | NFS4ERR_DEADLOCK | 10045 | Section 15.1.8.2 | | | | NFS4ERR_DEADLOCK | 10045 | Section 15.1.8.2 | | |
| | NFS4ERR_DEADSESSION | 10078 | Section 15.1.11.5 | | | | NFS4ERR_DEADSESSION | 10078 | Section 15.1.11.5 | | |
| | NFS4ERR_DELAY | 10008 | Section 15.1.1.3 | | | | NFS4ERR_DELAY | 10008 | Section 15.1.1.3 | | |
| | NFS4ERR_DELEG_ALREADY_WANTED | 10056 | Section 15.1.14.1 | | | | NFS4ERR_DELEG_ALREADY_WANTED | 10056 | Section 15.1.14.1 | | |
|
| | | | NFS4ERR_DELEG_REVOKED | 10087 | Section 15.1.5.3 | | |
| | NFS4ERR_DENIED | 10010 | Section 15.1.8.3 | | | | NFS4ERR_DENIED | 10010 | Section 15.1.8.3 | | |
| | NFS4ERR_DIRDELEG_UNAVAIL | 10084 | Section 15.1.14.2 | | | | NFS4ERR_DIRDELEG_UNAVAIL | 10084 | Section 15.1.14.2 | | |
| | NFS4ERR_DQUOT | 69 | Section 15.1.4.2 | | | | NFS4ERR_DQUOT | 69 | Section 15.1.4.2 | | |
| | NFS4ERR_ENCR_ALG_UNSUPP | 10079 | Section 15.1.13.3 | | | | NFS4ERR_ENCR_ALG_UNSUPP | 10079 | Section 15.1.13.3 | | |
| | NFS4ERR_EXIST | 17 | Section 15.1.4.3 | | | | NFS4ERR_EXIST | 17 | Section 15.1.4.3 | | |
| | NFS4ERR_EXPIRED | 10011 | Section 15.1.5.4 | | | | NFS4ERR_EXPIRED | 10011 | Section 15.1.5.4 | | |
| | NFS4ERR_FBIG | 27 | Section 15.1.4.4 | | | | NFS4ERR_FBIG | 27 | Section 15.1.4.4 | | |
| | NFS4ERR_FHEXPIRED | 10014 | Section 15.1.2.2 | | | | NFS4ERR_FHEXPIRED | 10014 | Section 15.1.2.2 | | |
| | NFS4ERR_FILE_OPEN | 10046 | Section 15.1.4.5 | | | | NFS4ERR_FILE_OPEN | 10046 | Section 15.1.4.5 | | |
| | NFS4ERR_GRACE | 10013 | Section 15.1.9.2 | | | | NFS4ERR_GRACE | 10013 | Section 15.1.9.2 | | |
| | | | |
| skipping to change at page 336, line 33 | | skipping to change at page 336, line 33 | |
| | | | |
| A stateid designates locking state of any type that has been revoked | | A stateid designates locking state of any type that has been revoked | |
| due to administrative interaction, possibly while the lease is valid. | | due to administrative interaction, possibly while the lease is valid. | |
| | | | |
| 15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10025) | | 15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10025) | |
| | | | |
| A stateid does not properly designate any valid state. See | | A stateid does not properly designate any valid state. See | |
| Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are | | Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are | |
| validated. | | validated. | |
| | | | |
|
| 15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10056) | | 15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10087) | |
| | | | |
| A stateid designates recallable locking state of any type that has | | A stateid designates recallable locking state of any type that has | |
| been revoked due to the failure of the client to return the lock, | | been revoked due to the failure of the client to return the lock, | |
| when it was recalled. | | when it was recalled. | |
| | | | |
| 15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011) | | 15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011) | |
| | | | |
| A stateid designates locking state of any type that has been revoked | | A stateid designates locking state of any type that has been revoked | |
| due to expiration of the client's lease, either immediately upon | | due to expiration of the client's lease, either immediately upon | |
| lease expiration, or following a later request for a conflicting | | lease expiration, or following a later request for a conflicting | |
| | | | |
| skipping to change at page 396, line 5 | | skipping to change at page 396, line 5 | |
| | | | |
| o When a client executes a regular file, it has to read the file | | o When a client executes a regular file, it has to read the file | |
| from the server. Strictly speaking, the server should not allow | | from the server. Strictly speaking, the server should not allow | |
| the client to read a file being executed unless the user has read | | the client to read a file being executed unless the user has read | |
| permissions on the file. Requiring users and administrators to set | | permissions on the file. Requiring users and administrators to set | |
| read permissions on executable files in order to access them over | | read permissions on executable files in order to access them over | |
| NFS is not going to be acceptable to some people. Historically, | | NFS is not going to be acceptable to some people. Historically, | |
| NFS servers have allowed a user to READ a file if the user has | | NFS servers have allowed a user to READ a file if the user has | |
| execute access to the file. | | execute access to the file. | |
| | | | |
|
| As a practical example, the UNIX specification [41] states that an | | As a practical example, the UNIX specification [40] states that an | |
| implementation claiming conformance to UNIX may indicate in the | | implementation claiming conformance to UNIX may indicate in the | |
| access() programming interface's result that a privileged user has | | access() programming interface's result that a privileged user has | |
| execute rights, even if no execute permission bits are set on the | | execute rights, even if no execute permission bits are set on the | |
| regular file's attributes. It is possible to claim conformance to | | regular file's attributes. It is possible to claim conformance to | |
| the UNIX specification and instead not indicate execute rights in | | the UNIX specification and instead not indicate execute rights in | |
| that situation, which is true for some operating environments. | | that situation, which is true for some operating environments. | |
| Suppose the operating environments of the client and server are | | Suppose the operating environments of the client and server are | |
| implementing the access() semantics for privileged users differently, | | implementing the access() semantics for privileged users differently, | |
| and the ACCESS operation implementations of the client and server | | and the ACCESS operation implementations of the client and server | |
| follow their respective access() semantics. This can cause undesired | | follow their respective access() semantics. This can cause undesired | |
| | | | |
| skipping to change at page 432, line 30 | | skipping to change at page 432, line 30 | |
| attrset to determine which attributes were used to store the | | attrset to determine which attributes were used to store the | |
| verifier. | | verifier. | |
| | | | |
| With the addition of persistent sessions and pNFS, under some | | With the addition of persistent sessions and pNFS, under some | |
| conditions EXCLUSIVE4 MUST NOT be used by the client or supported by | | conditions EXCLUSIVE4 MUST NOT be used by the client or supported by | |
| the server. The following table summarizes the appropriate and | | the server. The following table summarizes the appropriate and | |
| mandated exclusive create methods for implementations of NFSv4.1: | | mandated exclusive create methods for implementations of NFSv4.1: | |
| | | | |
| Required methods for exclusive create | | Required methods for exclusive create | |
| | | | |
|
| +--------------+--------+-----------------+-------------------------+ | | +----------------+-----------+---------------+----------------------+ | |
| | Persistent | pNFS | Server REQUIRED | Client Allowed | | | | Persistent | Server | Server | Client Allowed | | |
| | Reply Cache | server | | | | | | Reply Cache | Supports | REQUIRED | | | |
| +--------------+--------+-----------------+-------------------------+ | | | Enabled | pNFS | | | | |
| | no | no | EXCLUSIVE4_1 | EXCLUSIVE4_1 (SHOULD) | | | +----------------+-----------+---------------+----------------------+ | |
| | | | and EXCLUSIVE4 | or EXCLUSIVE4 (SHOULD | | | | no | no | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | |
| | | | | | and | (SHOULD) or | | |
| | | | | | EXCLUSIVE4 | EXCLUSIVE4 (SHOULD | | |
| | | | | NOT) | | | | | | | NOT) | | |
| | no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | | | no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | |
| | yes | no | GUARDED4 | GUARDED4 | | | | yes | no | GUARDED4 | GUARDED4 | | |
| | yes | yes | GUARDED4 | GUARDED4 | | | | yes | yes | GUARDED4 | GUARDED4 | | |
|
| +--------------+--------+-----------------+-------------------------+ | | +----------------+-----------+---------------+----------------------+ | |
| | | | |
| Table 10 | | Table 10 | |
| | | | |
| If CREATE_SESSION4_FLAG_PERSIST is set in the results of | | If CREATE_SESSION4_FLAG_PERSIST is set in the results of | |
| CREATE_SESSION the reply cache is persistent (see Section 18.36). If | | CREATE_SESSION the reply cache is persistent (see Section 18.36). If | |
| the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from | | the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from | |
| EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the | | EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the | |
| client attempts to use EXCLUSIVE4 on a persistent session, or a | | client attempts to use EXCLUSIVE4 on a persistent session, or a | |
| session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the | | session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the | |
| server MUST return NFS4ERR_INVAL. | | server MUST return NFS4ERR_INVAL. | |
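
As a rough illustration of Table 10 and the paragraph that follows it, the selection can be written as a lookup on the two flags. The numeric flag values and the parameter names `csr_flags` and `eir_flags` are assumptions made for this sketch and do not appear in the text shown here; the function names are hypothetical.

```python
# Assumed flag values (not shown in this excerpt); check them against
# the protocol's XDR definitions before relying on them.
CREATE_SESSION4_FLAG_PERSIST = 0x00000001
EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000


def exclusive_create_methods(csr_flags: int, eir_flags: int) -> dict:
    """Return the Table 10 row for the given CREATE_SESSION and EXCHANGE_ID flags."""
    persistent = bool(csr_flags & CREATE_SESSION4_FLAG_PERSIST)
    pnfs = bool(eir_flags & EXCHGID4_FLAG_USE_PNFS_MDS)
    if persistent:
        # Rows "yes/no" and "yes/yes": persistent reply cache.
        return {"server_required": ["GUARDED4"], "client_allowed": ["GUARDED4"]}
    if pnfs:
        # Row "no/yes": pNFS server without a persistent reply cache.
        return {"server_required": ["EXCLUSIVE4_1"], "client_allowed": ["EXCLUSIVE4_1"]}
    # Row "no/no": both are supported; the client SHOULD use EXCLUSIVE4_1
    # and SHOULD NOT use EXCLUSIVE4.
    return {
        "server_required": ["EXCLUSIVE4_1", "EXCLUSIVE4"],
        "client_allowed": ["EXCLUSIVE4_1", "EXCLUSIVE4"],
    }


def exclusive4_status(csr_flags: int, eir_flags: int) -> str:
    """Status a server returns when a client attempts EXCLUSIVE4."""
    persistent = bool(csr_flags & CREATE_SESSION4_FLAG_PERSIST)
    pnfs = bool(eir_flags & EXCHGID4_FLAG_USE_PNFS_MDS)
    return "NFS4ERR_INVAL" if (persistent or pnfs) else "NFS4_OK"
```

Here `"NFS4_OK"` only means that EXCLUSIVE4 is not rejected on these grounds; other checks performed by OPEN are outside the sketch.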
| | | | |
| skipping to change at page 447, line 25 | | skipping to change at page 447, line 25 | |
| 18.20.3. DESCRIPTION | | 18.20.3. DESCRIPTION | |
| | | | |
| Replaces the current filehandle with the filehandle that represents | | Replaces the current filehandle with the filehandle that represents | |
| the public filehandle of the server's name space. This filehandle | | the public filehandle of the server's name space. This filehandle | |
| may be different from the "root" filehandle which may be associated | | may be different from the "root" filehandle which may be associated | |
| with some other directory on the server. | | with some other directory on the server. | |
| | | | |
| PUTPUBFH also clears the current stateid. | | PUTPUBFH also clears the current stateid. | |
| | | | |
| The public filehandle represents the concepts embodied in RFC2054 | | The public filehandle represents the concepts embodied in RFC2054 | |
|
| [32], RFC2055 [33], RFC2224 [42]. The intent for NFSv4.1 is that the | | [31], RFC2055 [32], RFC2224 [41]. The intent for NFSv4.1 is that the | |
| public filehandle (represented by the PUTPUBFH operation) be used as | | public filehandle (represented by the PUTPUBFH operation) be used as | |
| a method of providing WebNFS server compatibility with NFSv3. | | a method of providing WebNFS server compatibility with NFSv3. | |
| | | | |
| The public filehandle and the root filehandle (represented by the | | The public filehandle and the root filehandle (represented by the | |
| PUTROOTFH operation) SHOULD be equivalent. If the public and root | | PUTROOTFH operation) SHOULD be equivalent. If the public and root | |
| filehandles are not equivalent, then the public filehandle MUST be a | | filehandles are not equivalent, then the public filehandle MUST be a | |
| descendant of the root filehandle. | | descendant of the root filehandle. | |
| | | | |
| See Section 16.2.3.1.1 for more details on the current filehandle. | | See Section 16.2.3.1.1 for more details on the current filehandle. | |
| | | | |
| | | | |
| skipping to change at page 447, line 47 | | skipping to change at page 447, line 47 | |
| | | | |
| 18.20.4. IMPLEMENTATION | | 18.20.4. IMPLEMENTATION | |
| | | | |
| Used as the second operator (after SEQUENCE) in an NFS request to set | | Used as the second operator (after SEQUENCE) in an NFS request to set | |
| the context for file accessing operations that follow in the same | | the context for file accessing operations that follow in the same | |
| COMPOUND request. | | COMPOUND request. | |
| | | | |
| With the NFSv3 public filehandle, the client is able to specify | | With the NFSv3 public filehandle, the client is able to specify | |
| whether the path name provided in the LOOKUP should be evaluated as | | whether the path name provided in the LOOKUP should be evaluated as | |
| either an absolute path relative to the server's root or relative to | | either an absolute path relative to the server's root or relative to | |
|
| the public filehandle. RFC2224 [42] contains further discussion of | | the public filehandle. RFC2224 [41] contains further discussion of | |
| the functionality. With NFSv4.1, that type of specification is not | | the functionality. With NFSv4.1, that type of specification is not | |
| directly available in the LOOKUP operation. The reason for this is | | directly available in the LOOKUP operation. The reason for this is | |
| because the component separators needed to specify absolute vs. | | because the component separators needed to specify absolute vs. | |
| relative are not allowed in NFSv4. Therefore, the client is | | relative are not allowed in NFSv4. Therefore, the client is | |
| responsible for constructing its request such that either PUTROOTFH | | responsible for constructing its request such that either PUTROOTFH | |
| or PUTPUBFH is used to signify absolute or relative evaluation of | | or PUTPUBFH is used to signify absolute or relative evaluation of | |
| an NFS URL, respectively. | | an NFS URL, respectively. | |
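
One way a client might build such a request is sketched below. The tuple representation of operations and the function name `compound_for_path` are hypothetical. How an NFS URL marks a path as absolute is left to the caller, since that detail is not given in this text.

```python
from typing import List, Sequence, Tuple


def compound_for_path(components: Sequence[str], absolute: bool) -> List[Tuple[str, ...]]:
    """Build a COMPOUND operation list that resolves a path one component at a time.

    PUTROOTFH starts absolute evaluation and PUTPUBFH starts evaluation
    relative to the public filehandle. Each LOOKUP carries exactly one
    component, because component separators are not allowed in a name.
    """
    for name in components:
        if not name or "/" in name:
            raise ValueError("invalid path component: %r" % (name,))
    start = ("PUTROOTFH",) if absolute else ("PUTPUBFH",)
    ops: List[Tuple[str, ...]] = [("SEQUENCE",), start]
    ops.extend(("LOOKUP", name) for name in components)
    return ops
```

For example, `compound_for_path(["export", "data"], absolute=True)` yields SEQUENCE, PUTROOTFH, LOOKUP "export", LOOKUP "data".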
| | | | |
|
| Note that there are warnings mentioned in RFC2224 [42] with respect | | Note that there are warnings mentioned in RFC2224 [41] with respect | |
| to the use of absolute evaluation and the restrictions the server may | | to the use of absolute evaluation and the restrictions the server may | |
| place on that evaluation with respect to how much of its namespace | | place on that evaluation with respect to how much of its namespace | |
| has been made available. These same warnings apply to NFSv4.1. It | | has been made available. These same warnings apply to NFSv4.1. It | |
| is likely, therefore that because of server implementation details, | | is likely, therefore that because of server implementation details, | |
| an NFSv3 absolute public filehandle lookup may behave differently | | an NFSv3 absolute public filehandle lookup may behave differently | |
| than an NFSv4.1 absolute resolution. | | than an NFSv4.1 absolute resolution. | |
| | | | |
|
| There is a form of security negotiation as described in RFC2755 [43] | | There is a form of security negotiation as described in RFC2755 [42] | |
| that uses the public filehandle and an overloading of the pathname. | | that uses the public filehandle and an overloading of the pathname. | |
| This method is not available with NFSv4.1 as filehandles are not | | This method is not available with NFSv4.1 as filehandles are not | |
| overloaded with special meaning and therefore do not provide the same | | overloaded with special meaning and therefore do not provide the same | |
| framework as NFSv3. Clients should therefore use the security | | framework as NFSv3. Clients should therefore use the security | |
| negotiation mechanisms described in Section 2.6. | | negotiation mechanisms described in Section 2.6. | |
| | | | |
| 18.21. Operation 24: PUTROOTFH - Set Root Filehandle | | 18.21. Operation 24: PUTROOTFH - Set Root Filehandle | |
| | | | |
| 18.21.1. ARGUMENTS | | 18.21.1. ARGUMENTS | |
| | | | |
| | | | |
| skipping to change at page 469, line 35 | | skipping to change at page 469, line 35 | |
| is lenient in this one case of matching owner values, the client | | is lenient in this one case of matching owner values, the client | |
| implementation may be simplified in cases of creation of an object | | implementation may be simplified in cases of creation of an object | |
| (e.g. an exclusive create via OPEN) followed by a SETATTR. | | (e.g. an exclusive create via OPEN) followed by a SETATTR. | |
| | | | |
| The file size attribute is used to request changes to the size of a | | The file size attribute is used to request changes to the size of a | |
| file. A value of zero causes the file to be truncated, a value less | | file. A value of zero causes the file to be truncated, a value less | |
| than the current size of the file causes data from new size to the | | than the current size of the file causes data from new size to the | |
| end of the file to be discarded, and a size greater than the current | | end of the file to be discarded, and a size greater than the current | |
| size of the file causes logically zeroed data bytes to be added to | | size of the file causes logically zeroed data bytes to be added to | |
| the end of the file. Servers are free to implement this using | | the end of the file. Servers are free to implement this using | |
|
| unallocate bytes (holes) or allocated data bytes set to zero. | | unallocated bytes (holes) or allocated data bytes set to zero. | |
| Clients should not make any assumptions regarding a server's | | Clients should not make any assumptions regarding a server's | |
| implementation of this feature, beyond that the bytes in affected | | implementation of this feature, beyond that the bytes in affected | |
| region returned by READ will be zeroed. Servers MUST support | | region returned by READ will be zeroed. Servers MUST support | |
| extending the file size via SETATTR. | | extending the file size via SETATTR. | |
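
The size semantics above can be modeled on an in-memory byte string. This is a sketch of the observable result only; it says nothing about whether a server uses holes or allocated zero bytes, and `apply_setattr_size` is a hypothetical name.

```python
def apply_setattr_size(data: bytes, new_size: int) -> bytes:
    """Return the file contents a READ would observe after setting the size."""
    if new_size < 0:
        raise ValueError("size must be non-negative")
    if new_size <= len(data):
        # Zero truncates the file; a smaller size discards the tail.
        return data[:new_size]
    # A larger size appends logically zeroed bytes.
    return data + bytes(new_size - len(data))
```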
| | | | |
| SETATTR is not guaranteed to be atomic. A failed SETATTR may | | SETATTR is not guaranteed to be atomic. A failed SETATTR may | |
| partially change a file's attributes, hence the reason why the reply | | partially change a file's attributes, hence the reason why the reply | |
| always includes the status and the list of attributes that were set. | | always includes the status and the list of attributes that were set. | |
| | | | |
| If the object whose attributes are being changed has a file | | If the object whose attributes are being changed has a file | |
| | | | |
| skipping to change at page 479, line 23 | | skipping to change at page 479, line 23 | |
| used with the integrity or privacy services, using the principal that | | used with the integrity or privacy services, using the principal that | |
| created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV | | created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV | |
| GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used. | | GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used. | |
| | | | |
| If, when the client ID was created, the client opted for SP4_NONE | | If, when the client ID was created, the client opted for SP4_NONE | |
| state protection, the client is not required to use | | state protection, the client is not required to use | |
| BIND_CONN_TO_SESSION to associate the connection with the session, | | BIND_CONN_TO_SESSION to associate the connection with the session, | |
| unless the client wishes to associate the connection with the | | unless the client wishes to associate the connection with the | |
| backchannel. When SP4_NONE protection is used, simply sending a | | backchannel. When SP4_NONE protection is used, simply sending a | |
| COMPOUND request with a SEQUENCE operation is sufficient to associate | | COMPOUND request with a SEQUENCE operation is sufficient to associate | |
|
| the connnection with the session specified in SEQUENCE. | | the connection with the session specified in SEQUENCE. | |
| | | | |
| The field bctsa_dir indicates whether the client wants to associate | | The field bctsa_dir indicates whether the client wants to associate | |
| the connection with the fore channel or the backchannel or both | | the connection with the fore channel or the backchannel or both | |
| channels. The value CDFC4_FORE_OR_BOTH indicates the client wants to | | channels. The value CDFC4_FORE_OR_BOTH indicates the client wants to | |
| associate the connection with both the fore channel and backchannel, | | associate the connection with both the fore channel and backchannel, | |
| but will accept the connection being associated to just the fore | | but will accept the connection being associated to just the fore | |
| channel. The value CDFC4_BACK_OR_BOTH indicates the client wants to | | channel. The value CDFC4_BACK_OR_BOTH indicates the client wants to | |
| associate with both the fore and backchannel, but will accept the | | associate with both the fore and backchannel, but will accept the | |
| connection being associated with just the backchannel. The server | | connection being associated with just the backchannel. The server | |
| replies in bctsr_dir which channel(s) the connection is associated | | replies in bctsr_dir which channel(s) the connection is associated | |
| | | | |
| skipping to change at page 506, line 9 | | skipping to change at page 506, line 9 | |
| 2. Sequence ID processing. If csa_sequenceid is equal to the | | 2. Sequence ID processing. If csa_sequenceid is equal to the | |
| sequence ID in the client ID's slot, then this is a replay of the | | sequence ID in the client ID's slot, then this is a replay of the | |
| previous CREATE_SESSION request, and the server returns the | | previous CREATE_SESSION request, and the server returns the | |
| cached result. If csa_sequenceid is not equal to the sequence ID | | cached result. If csa_sequenceid is not equal to the sequence ID | |
| in the slot, and is more than one greater (accounting for | | in the slot, and is more than one greater (accounting for | |
| wraparound), then the server returns the error | | wraparound), then the server returns the error | |
| NFS4ERR_SEQ_MISORDERED, and does not change the slot. If | | NFS4ERR_SEQ_MISORDERED, and does not change the slot. If | |
| csa_sequenceid is equal to the slot's sequence ID + 1 (accounting | | csa_sequenceid is equal to the slot's sequence ID + 1 (accounting | |
| for wraparound), then the slot's sequence ID is set to | | for wraparound), then the slot's sequence ID is set to | |
| csa_sequenceid, and the CREATE_SESSION processing goes to the | | csa_sequenceid, and the CREATE_SESSION processing goes to the | |
|
| next phase. A subsequent new CREATE_SESSION call MUST use a | | next phase. A subsequent new CREATE_SESSION call over the same | |
| csa_sequence that is one greater than last successfully used. | | client ID MUST use a csa_sequenceid that is one greater than the | |
| | | sequence ID in the slot. | |
| | | | |
| 3. Client ID confirmation. If this would be the first session for | | 3. Client ID confirmation. If this would be the first session for | |
| the client ID, the CREATE_SESSION operation serves to confirm the | | the client ID, the CREATE_SESSION operation serves to confirm the | |
| client ID. Otherwise the client ID confirmation phase is skipped | | client ID. Otherwise the client ID confirmation phase is skipped | |
| and only the session creation phase occurs. Any case in which | | and only the session creation phase occurs. Any case in which | |
| there is more than one record with identical values for client ID | | there is more than one record with identical values for client ID | |
| represents a server implementation error. Operation in the | | represents a server implementation error. Operation in the | |
| potential valid cases is summarized as follows. | | potential valid cases is summarized as follows. | |
| | | | |
| * Successful Confirmation | | * Successful Confirmation | |
| | | | |
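The sequence ID phase of CREATE_SESSION described in step 2 above can be sketched as follows. This is a minimal illustration, not text from either draft: the function name, the returned labels, and the 32-bit modulus used for wraparound are assumptions, and any value that is neither a replay nor the next sequence ID is treated as misordered.

```python
# Assumption: sequence IDs are 32-bit unsigned values, so "+ 1" wraps at 2**32.
SEQ_MODULUS = 2**32


def check_create_session_seq(slot_seqid, csa_sequenceid):
    """Classify a CREATE_SESSION request against the client ID's slot.

    Hypothetical helper: returns "replay", "new", or "misordered".
    """
    if csa_sequenceid == slot_seqid:
        # Replay of the previous request: the server returns the cached result.
        return "replay"
    if csa_sequenceid == (slot_seqid + 1) % SEQ_MODULUS:
        # Next in sequence: the caller sets the slot's sequence ID to
        # csa_sequenceid and moves on to the next phase.
        return "new"
    # Anything else: NFS4ERR_SEQ_MISORDERED, and the slot is left unchanged.
    return "misordered"
```

The wraparound case is the one most easily missed: a slot holding 0xFFFFFFFF accepts 0 as the next sequence ID.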
| skipping to change at page 531, line 14 | | skipping to change at page 531, line 14 | |
| | | | |
| CB_NOTIFY_DEVICEID can race with LAYOUTGET. One race scenario is | | CB_NOTIFY_DEVICEID can race with LAYOUTGET. One race scenario is | |
| that LAYOUTGET returns a device ID the client does not have device | | that LAYOUTGET returns a device ID the client does not have device | |
| address mappings for, and the metadata server sends a | | address mappings for, and the metadata server sends a | |
| CB_NOTIFY_DEVICEID to add the device ID to the client's awareness and | | CB_NOTIFY_DEVICEID to add the device ID to the client's awareness and | |
| meanwhile the client sends GETDEVICEINFO on the device ID. This | | meanwhile the client sends GETDEVICEINFO on the device ID. This | |
| scenario is discussed in Section 18.40.4. Another scenario is that | | scenario is discussed in Section 18.40.4. Another scenario is that | |
| the CB_NOTIFY_DEVICEID is processed by the client before it processes | | the CB_NOTIFY_DEVICEID is processed by the client before it processes | |
| the results from LAYOUTGET. The client will send a GETDEVICEINFO on | | the results from LAYOUTGET. The client will send a GETDEVICEINFO on | |
| the device ID. If the results from GETDEVICEINFO are received before | | the device ID. If the results from GETDEVICEINFO are received before | |
|
| the client gets results from LAYTOUTGET, then there is no longer a | | the client gets results from LAYOUTGET, then there is no longer a | |
| race. If the results from LAYOUTGET are received before the results | | race. If the results from LAYOUTGET are received before the results | |
| from GETDEVICEINFO, the client can either wait for results of | | from GETDEVICEINFO, the client can either wait for results of | |
| GETDEVICEINFO, or send another one to get possibly more up to date | | GETDEVICEINFO, or send another one to get possibly more up to date | |
| device address mappings for the device ID. | | device address mappings for the device ID. | |
| | | | |
| 18.44. Operation 51: LAYOUTRETURN - Release Layout Information | | 18.44. Operation 51: LAYOUTRETURN - Release Layout Information | |
| | | | |
| 18.44.1. ARGUMENT | | 18.44.1. ARGUMENT | |
| | | | |
| /* Constants used for LAYOUTRETURN and CB_LAYOUTRECALL */ | | /* Constants used for LAYOUTRETURN and CB_LAYOUTRECALL */ | |
| | | | |
| skipping to change at page 540, line 33 | | skipping to change at page 540, line 33 | |
| When set indicates that one or more locks have been revoked | | When set indicates that one or more locks have been revoked | |
| without expiration of the lease period, due to administrative | | without expiration of the lease period, due to administrative | |
| action. This status bit remains set on all SEQUENCE replies until | | action. This status bit remains set on all SEQUENCE replies until | |
| the loss of all such locks has been acknowledged by use of | | the loss of all such locks has been acknowledged by use of | |
| FREE_STATEID. | | FREE_STATEID. | |
| | | | |
| SEQ4_STATUS_RECALLABLE_STATE_REVOKED | | SEQ4_STATUS_RECALLABLE_STATE_REVOKED | |
| When set indicates that one or more recallable objects have been | | When set indicates that one or more recallable objects have been | |
| revoked without expiration of the lease period, due to the | | revoked without expiration of the lease period, due to the | |
| client's failure to return them when recalled which may be a | | client's failure to return them when recalled which may be a | |
|
| consequence of there being no working backchanel and the client | | consequence of there being no working backchannel and the client | |
| failing to reestablish a backchannel per the | | failing to reestablish a backchannel per the | |
| SEQ4_STATUS_CB_PATH_DOWN, SEQ4_STATUS_CB_PATH_DOWN_SESSION, or | | SEQ4_STATUS_CB_PATH_DOWN, SEQ4_STATUS_CB_PATH_DOWN_SESSION, or | |
| SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED status flags. This status bit | | SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED status flags. This status bit | |
| remains set on all SEQUENCE replies until the loss of all such | | remains set on all SEQUENCE replies until the loss of all such | |
| locks has been acknowledged by use of FREE_STATEID. | | locks has been acknowledged by use of FREE_STATEID. | |
| | | | |
| SEQ4_STATUS_LEASE_MOVED | | SEQ4_STATUS_LEASE_MOVED | |
| When set indicates that responsibility for lease renewal has been | | When set indicates that responsibility for lease renewal has been | |
| transferred to one or more new servers. This condition will | | transferred to one or more new servers. This condition will | |
| continue until the client receives an NFS4ERR_MOVED error and the | | continue until the client receives an NFS4ERR_MOVED error and the | |
| | | | |
| skipping to change at page 549, line 28 | | skipping to change at page 549, line 28 | |
| WANT_DELEGATION operation to cancel a previously requested want for a | | WANT_DELEGATION operation to cancel a previously requested want for a | |
| delegation. Note that if the server is in the process of sending the | | delegation. Note that if the server is in the process of sending the | |
| delegation (via CB_PUSH_DELEG) at the time the client sends a | | delegation (via CB_PUSH_DELEG) at the time the client sends a | |
| cancellation of the want, the delegation might still be pushed to the | | cancellation of the want, the delegation might still be pushed to the | |
| client. | | client. | |
| | | | |
| If WANT_DELEGATION fails to return a delegation, and the server | | If WANT_DELEGATION fails to return a delegation, and the server | |
| returns NFS4_OK, the server MUST set the delegation type to | | returns NFS4_OK, the server MUST set the delegation type to | |
| OPEN4_DELEGATE_NONE_EXT, and set od_whynone, as described in | | OPEN4_DELEGATE_NONE_EXT, and set od_whynone, as described in | |
| Section 18.16. Write delegations are not available for file types | | Section 18.16. Write delegations are not available for file types | |
|
| that are not writeable. This includes file objects of types: NF4BLK, | | that are not writable. This includes file objects of types: NF4BLK, | |
| NF4CHR, NF4LNK, NF4SOCK, and NF4FIFO. If the client requests | | NF4CHR, NF4LNK, NF4SOCK, and NF4FIFO. If the client requests | |
| OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG without | | OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG without | |
| OPEN4_SHARE_ACCESS_WANT_READ_DELEG on an object with one of the | | OPEN4_SHARE_ACCESS_WANT_READ_DELEG on an object with one of the | |
| aforementioned file types, the server must set | | aforementioned file types, the server must set | |
| WND4_WRITE_DELEG_NOT_SUPP_FTYPE. | | WND4_WRITE_DELEG_NOT_SUPP_FTYPE. | |
| | | | |
| 18.49.4. IMPLEMENTATION | | 18.49.4. IMPLEMENTATION | |
| | | | |
| A request for a conflicting delegation is not normally intended to | | A request for a conflicting delegation is not normally intended to | |
| trigger the recall of the existing delegation. Servers may choose to | | trigger the recall of the existing delegation. Servers may choose to | |
| | | | |
| skipping to change at page 568, line 47 | | skipping to change at page 568, line 47 | |
| returning NFS4ERR_DELAY or permanently rejecting the offer of the | | returning NFS4ERR_DELAY or permanently rejecting the offer of the | |
| delegation by returning NFS4ERR_REJECT_DELEG. When a delegation is | | delegation by returning NFS4ERR_REJECT_DELEG. When a delegation is | |
| rejected in this fashion, the want previously established is | | rejected in this fashion, the want previously established is | |
| permanently deleted and the delegation is subject to acquisition by | | permanently deleted and the delegation is subject to acquisition by | |
| another client. | | another client. | |
| | | | |
| 20.5.4. IMPLEMENTATION | | 20.5.4. IMPLEMENTATION | |
| | | | |
| If the client does return NFS4ERR_DELAY and there is a conflicting | | If the client does return NFS4ERR_DELAY and there is a conflicting | |
| delegation request, the server MAY process it at the expense of the | | delegation request, the server MAY process it at the expense of the | |
|
| client that returned NFS4ERR_DELAY. The client's want will typically | | client that returned NFS4ERR_DELAY. The client's want will not be | |
| not be cancelled, but MAY be processed behind other delegation requests | | cancelled, but MAY be processed behind other delegation requests or | |
| or registered wants. | | registered wants. | |
| | | | |
|
| When a client returns a status other than NFS4_OK, NFSERR_DELAY, or | | When a client returns a status other than NFS4_OK, NFS4ERR_DELAY, or | |
| NFS4ERR_REJECT_DELEG, the want remains pending, although servers may | | NFS4ERR_REJECT_DELEG, the want remains pending, although servers may | |
| decide to cancel the want by sending a CB_WANTS_CANCELLED. | | decide to cancel the want by sending a CB_WANTS_CANCELLED. | |
| | | | |
| 20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable objects | | 20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable objects | |
| | | | |
| Notify client to return all but N recallable objects. | | Notify client to return all but N recallable objects. | |
| | | | |
| 20.6.1. ARGUMENT | | 20.6.1. ARGUMENT | |
| | | | |
| const RCA4_TYPE_MASK_RDATA_DLG = 0; | | const RCA4_TYPE_MASK_RDATA_DLG = 0; | |
| | | | |
| skipping to change at page 571, line 7 | | skipping to change at page 571, line 7 | |
| RCA4_TYPE_MASK_DIR_DLG | | RCA4_TYPE_MASK_DIR_DLG | |
| | | | |
| The client is to return directory delegations. | | The client is to return directory delegations. | |
| | | | |
| RCA4_TYPE_MASK_FILE_LAYOUT | | RCA4_TYPE_MASK_FILE_LAYOUT | |
| | | | |
| The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. | | The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. | |
| | | | |
| RCA4_TYPE_MASK_BLK_LAYOUT | | RCA4_TYPE_MASK_BLK_LAYOUT | |
| | | | |
|
| See [31] for a description. | | See [30] for a description. | |
| | | | |
| RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | | RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | |
| | | | |
|
| See [30] for a description. | | See [29] for a description. | |
| | | | |
| RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX | | RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX | |
| | | | |
| This range is reserved for telling the client to recall layouts of | | This range is reserved for telling the client to recall layouts of | |
| experimental or site specific layout types (see Section 3.3.13). | | experimental or site specific layout types (see Section 3.3.13). | |
| | | | |
| When a bit is set in the type mask that corresponds to an undefined | | When a bit is set in the type mask that corresponds to an undefined | |
| type of recallable object, NFS4ERR_INVAL MUST be returned. When a | | type of recallable object, NFS4ERR_INVAL MUST be returned. When a | |
| bit is set that corresponds to a defined type of object, but the | | bit is set that corresponds to a defined type of object, but the | |
| client does not support an object of the type, NFS4ERR_INVAL MUST NOT | | client does not support an object of the type, NFS4ERR_INVAL MUST NOT | |
| | | | |
| skipping to change at page 583, line 7 | | skipping to change at page 583, line 7 | |
| GETATTR for the fs_locations or fs_locations_info attributes, the | | GETATTR for the fs_locations or fs_locations_info attributes, the | |
| attacker modifies the results to cause the client to migrate its | | attacker modifies the results to cause the client to migrate its | |
| traffic to a server controlled by the attacker. | | traffic to a server controlled by the attacker. | |
| | | | |
| Relative to previous NFS versions, NFSv4.1 has additional security | | Relative to previous NFS versions, NFSv4.1 has additional security | |
| considerations for pNFS (see Section 12.9 and Section 13.12), locking | | considerations for pNFS (see Section 12.9 and Section 13.12), locking | |
| and session state (see Section 2.10.7.3). | | and session state (see Section 2.10.7.3). | |
| | | | |
| 22. IANA Considerations | | 22. IANA Considerations | |
| | | | |
|
| | | This section uses terms that are defined in [43]. | |
| | | | |
| 22.1. Named Attribute Definitions | | 22.1. Named Attribute Definitions | |
| | | | |
|
| | | IANA will create a registry called the "NFSv4 Named Attribute | |
| | | Definitions Registry". | |
| | | | |
| The NFSv4.1 protocol supports the association of a file with zero or | | The NFSv4.1 protocol supports the association of a file with zero or | |
| more named attributes. The name space identifiers for these | | more named attributes. The name space identifiers for these | |
| attributes are defined as string names. The protocol does not define | | attributes are defined as string names. The protocol does not define | |
| the specific assignment of the name space for these file attributes. | | the specific assignment of the name space for these file attributes. | |
|
| Even though the name space is not specifically controlled to prevent | | An IANA registry will promote interoperability where common interests | |
| collisions, an IANA registry has been created for the registration of | | exist. While application developers are allowed to define and use | |
| NFSv4.1 named attributes. Registration will be achieved through the | | attributes as needed, they are encouraged to register the attributes | |
| publication of an Informational RFC and will require not only the | | with IANA. | |
| name of the attribute but the syntax and semantics of the named | | | |
| attribute contents; the intent is to promote interoperability where | | | |
| common interests exist. While application developers are allowed to | | | |
| define and use attributes as needed, they are encouraged to register | | | |
| the attributes with IANA. | | | |
| | | | |
| Such registered named attributes are presumed to apply to all minor | | Such registered named attributes are presumed to apply to all minor | |
| versions of NFSv4, including those defined subsequently to the | | versions of NFSv4, including those defined subsequently to the | |
| registration. Where the named attribute is intended to be limited | | registration. Where the named attribute is intended to be limited | |
| with regard to the minor versions for which they are not to be used, the | | with regard to the minor versions for which they are not to be used, the | |
|
| Informational RFC must clearly state the applicable limits. | | assignment in the registry will clearly state the applicable limits. | |
| | | | |
|
| 22.2. ONC RPC Network Identifiers (netids) | | All assignments to the registry are made on a First Come First Served | |
| | | basis, per section 4.1 of [43]. The policy for each assignment is | |
| | | Specification Required, per section 4.1 of [43]. | |
| | | | |
|
| Section 3.3.9) discussed the r_netid field and the corresponding | | Under the NFSv4.1 specification, the name of a named attribute can in | |
| r_addr field within a netaddr4 structure. The NFSv4 protocol depends | | theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1 | |
| on the syntax and semantics of these fields to effectively | | clients and servers will be unable to handle a string that long. | |
| communicate callback and other information between client and server. | | IANA should reject any assignment request with a named attribute that | |
| Therefore, an IANA registry has been created to include the values | | exceeds 128 UTF-8 characters. To give IESG the flexibility to set up | |
| defined in this document and to allow for future expansion based on | | bases of assignment of Experimental Use and Standards Action, the | |
| transport usage/availability. Additions to this ONC RPC Network | | prefixes of "EXPE" and "STDS" are Reserved. The zero length named | |
| Identifier registry must be done with the publication of an RFC. | | attribute name is Reserved. | |
| | | | |
|
| The initial values for this registry are as follows (some of this | | The prefix "PRIV" is allocated for Private Use. A site that wants to | |
| text is replicated from Section 3.3.9 for clarity): | | make use of unregistered named attributes without risk of conflicting | |
| | | with an assignment in IANA's registry should use the prefix "PRIV" in | |
| | | all of its named attributes. | |
| | | | |
|
| The Network Identifier (or r_netid for short) is used to specify a | | Because some NFSv4.1 clients and servers have case insensitive | |
| transport protocol and associated universal address (or r_addr for | | semantics, the fifteen additional lower case and mixed case | |
| short). The syntax of the Network Identifier is a US-ASCII string. | | permutations of each of "EXPE", "PRIV", and "STDS", are Reserved | |
| The initial definitions for r_netid are: | | (e.g. "expe", "expE", "exPe", etc. are Reserved). Similarly, IANA | |
| | | must not allow two assignments that would conflict if both named | |
| | | attributes were converted to a common case. | |
| | | | |
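The naming rules in the new text of Section 22.1 above (length limit, Reserved and Private Use prefixes, and case-insensitive conflicts) could be checked along these lines. This is a hedged sketch only: the function name, the returned labels, counting characters with len(), and using lower() as the "common case" are all assumptions made for illustration.

```python
RESERVED_PREFIXES = ("EXPE", "STDS")   # Reserved per the text above
PRIVATE_PREFIX = "PRIV"                # Private Use per the text above
MAX_NAME_CHARS = 128                   # longer names are to be rejected


def classify_named_attribute(name, registered=()):
    """Hypothetical check of a requested named attribute name."""
    if len(name) == 0:
        return "reserved"              # the zero length name is Reserved
    if len(name) > MAX_NAME_CHARS:
        return "reject"
    prefix = name[:4]
    if prefix == PRIVATE_PREFIX:
        return "private"               # exact upper case "PRIV" only
    if prefix.upper() in RESERVED_PREFIXES + (PRIVATE_PREFIX,):
        # "EXPE", "STDS", and every other case permutation of the three prefixes
        return "reserved"
    if name.lower() in {r.lower() for r in registered}:
        return "conflict"              # collides after case conversion
    return "assignable"
```

For example, classify_named_attribute("Foo", registered=["foo"]) reports a conflict, because the two names match once converted to a common case.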
|
| "tcp" - TCP over IP version 4 | | The registry of named attributes is a list of assignments, each | |
| | | containing three fields for each assignment. | |
| | | | |
|
| "udp" - UDP over IP version 4 | | 1. A US-ASCII string name that is the actual name of the attribute. | |
| | | This name must be unique. This string name can be 1 to 128 UTF-8 | |
| | | characters long. | |
| | | | |
|
| "tcp6" - TCP over IP version 6 | | 2. A reference to the specification of the named attribute. The | |
| | | reference can consume up to 256 bytes (or more if IANA permits). | |
| | | | |
|
| "udp6" - UDP over IP version 6 | | 3. The point of contact of the registrant. The point of contact can | |
| | | consume up to 256 bytes (or more if IANA permits). | |
| | | | |
|
| Note: the '"' marks are used for delimiting the strings for this | | 22.1.1. Initial Registry | |
| document and are not part of the Network Identifier string. | | | |
| | | | |
|
| For the "tcp" and "udp" Network Identifiers the Universal Address or | | There is no initial registry. | |
| r_addr (for IPv4) is a US-ASCII string and is of the form described | | | |
| in Section 3.3.9.1. | | | |
| | | | |
|
| For the "tcp" and "udp" Network Identifiers the Universal Address or | | 22.1.2. Updating Registrations | |
| r_addr (for IPv6) is a US-ASCII string and is of the form described | | | |
| in Section 3.3.9.2. | | | |
| | | | |
|
| As mentioned, the registration of new Network Identifiers will | | The registrant is always permitted to update the point of contact | |
| require the publication of an RFC with similar detail as listed above | | field. To make any other change will require Expert Review or IESG | |
| for the Network Identifier itself and corresponding Universal | | Approval. | |
| Address. | | | |
| | | | |
|
| 22.3. Defining New Notifications | | 22.2. Device ID Notifications | |
| | | | |
|
| New notification types may be added to the CB_NOTIFY_DEVICEID | | IANA will create a registry called the "NFSv4.1 Device ID | |
| operation Section 20.12. This can be done via changes to the | | Notifications Registry". | |
| operations that register notifications, or by adding new operations | | | |
| to NFSv4. This requires a new minor version of NFSv4, and requires a | | | |
| standards track document from IETF. Another way to add a | | | |
| notification is to specify a new layout type. Notifications for new | | | |
| layout types would be requested via GETDEVICELIST (Section 18.41) and | | | |
| GETDEVICEINFO (Section 18.40). See Section 22.4). | | | |
| | | | |
|
| 22.4. Defining New Layout Types | | The potential exists for new notification types to be added to the | |
| | | CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via | |
| | | changes to the operations that register notifications, or by adding | |
| | | new operations to NFSv4. This requires a new minor version of NFSv4, | |
| | | and requires a standards track document from IETF. Another way to | |
| | | add a notification is to specify a new layout type (see | |
| | | Section 22.4). | |
| | | | |
|
| New layout type numbers will be requested from IANA. IANA will only | | Hence all assignments to the registry are made on a Standards Action | |
| provide layout type numbers for Standards Track RFCs approved by the | | basis per section 4.1 of [43], with Expert Review required. | |
| IESG, in accordance with Standards Action policy defined in [20]. | | | |
| All layout types assigned by IANA MUST be in the range 0x00000001 to | | The registry is a list of assignments, each containing five fields | |
| 0x7FFFFFFF. | | per assignment. | |
| | | | |
| | | 1. The name of the notification type. This name must have the | |
| | | prefix: "NOTIFY_DEVICEID4_". This name must be unique. | |
| | | | |
| | | 2. The value of the notification. IANA will assign this number, and | |
| | | the request from the registrant will use TBD1 instead of an | |
| | | actual value. IANA MUST use a whole number which can be no | |
| | | higher than 2^32-1, and should be the next available value. The | |
| | | value assigned must be unique. A Designated Expert must be used | |
| | | to ensure that when the name of the notification type and its | |
| | | value are added to the NFSv4.1 notify_deviceid_type4 enumerated | |
| | | data type in the NFSv4.1 XDR description ([12]), the result | |
| | | continues to be a valid XDR description. | |
| | | | |
| | | 3. The Standards Track RFC(s) that describe the notification. If | |
| | | the RFC(s) have not yet been published, the registrant will use | |
| | | RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. | |
| | | | |
| | | 4. How the RFC introduces the notification. This is indicated by a | |
| | | single US-ASCII value. If the value is N, it means a minor | |
| | | revision to the NFSv4 protocol. If the value is L, it means a | |
| | | new pNFS layout type. Other values can be used with IESG | |
| | | Approval. | |
| | | | |
| | | 5. The minor versions of NFSv4 that are allowed to use the | |
| | | notification. While these are numeric values, IANA will not | |
| | | allocate and assign them; the author of the relevant RFCs with | |
| | | IESG Approval assigns these numbers. Each time there is a new | |
| | | minor version of NFSv4 approved, a Designated Expert should | |
| | | review the registry to make recommended updates as needed. | |
| | | | |
| | | 22.2.1. Initial Registry | |
| | | | |
| | | The initial registry is in Table 15. Note that the next available value | |
| | | is zero. | |
| | | | |
| | | +-------------------------+-------+----------+-----+----------------+ | |
| | | | Notification Name | Value | RFC | How | Minor Versions | | |
| | | +-------------------------+-------+----------+-----+----------------+ | |
| | | | NOTIFY_DEVICEID4_CHANGE | 1 | RFCTBD10 | N | 1 | | |
| | | | NOTIFY_DEVICEID4_DELETE | 2 | RFCTBD10 | N | 1 | | |
| | | +-------------------------+-------+----------+-----+----------------+ | |
| | | | |
| | | Table 15: Initial Device ID Notification Assignments | |
| | | | |
| | | 22.2.2. Updating Registrations | |
| | | | |
| | | The update of a registration will require IESG Approval on the advice | |
| | | of a Designated Expert. | |
| | | | |
| | | 22.3. Object Recall Types | |
| | | | |
| | | IANA will create a registry called the "NFSv4.1 Recallable Object | |
| | | Types Registry". | |
| | | | |
| | | The potential exists for new object types to be added to the | |
| | | CB_RECALL_ANY operation (see Section 20.6). This can be done via | |
| | | changes to the operations that add recallable types, or by adding new | |
| | | operations to NFSv4. This requires a new minor version of NFSv4, and | |
| | | requires a standards track document from IETF. Another way to add a | |
| | | new recallable object is to specify a new layout type (see | |
| | | Section 22.4). | |
| | | | |
| | | All assignments to the registry are made on a Standards Action basis | |
| | | per section 4.1 of [43], with Expert Review required. | |
| | | | |
| | | Recallable object types are 32 bit unsigned numbers. There are no | |
| | | Reserved values. Values in the range 12 through 15, inclusive, are | |
| | | for Private Use. | |
| | | | |
| | | The registry is a list of assignments, each containing five fields | |
| | | per assignment. | |
| | | | |
| | | 1. The name of the recallable object type. This name must have the | |
| | | prefix: "RCA4_TYPE_MASK_". The name must be unique. | |
| | | | |
| | | 2. The value of the recallable object type. IANA will assign this | |
| | | number, and the request from the registrant will use TBD1 instead | |
| | | of an actual value. IANA MUST use a whole number which can be no | |
| | | higher than 2^32-1, and should be the next available value. The | |
| | | value must be unique. A Designated Expert must be used to ensure | |
| | | that when the name of the recallable type and its value are added | |
| | | to the NFSv4 XDR description [12], the result continues to be a | |
| | | valid XDR description. | |
| | | | |
| | | 3. The Standards Track RFC(s) that describe the recallable object | |
| | | type. If the RFC(s) have not yet been published, the registrant | |
| | | will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. | |
| | | | |
| | | 4. How the RFC introduces the recallable object type. This is | |
| | | indicated by a single US-ASCII value. If the value is N, it | |
| | | means a minor revision to the NFSv4 protocol. If the value is L, | |
| | | it means a new pNFS layout type. Other values can be used with | |
| | | IESG Approval. | |
| | | | |
| | | 5. The minor versions of NFSv4 that are allowed to use the | |
| | | recallable object type. While these are numeric values, IANA | |
| | | will not allocate and assign them; the author of the relevant | |
| | | RFCs with IESG Approval assigns these numbers. Each time there | |
| | | is a new minor version of NFSv4 approved, a Designated Expert | |
| | | should review the registry to make recommended updates as needed. | |
| | | | |
| | | 22.3.1. Initial Registry | |
| | | | |
| | | The initial registry is in Table 16. Note that the next available value | |
| | | is five. | |
| | | | |
| | | +-------------------------------+-------+----------+-----+----------+ | |
| | | | Recallable Object Type Name | Value | RFC | How | Minor | | |
| | | | | | | | Versions | | |
| | | +-------------------------------+-------+----------+-----+----------+ | |
| | | | RCA4_TYPE_MASK_RDATA_DLG | 0 | RFCTBD10 | N | 1 | | |
| | | | RCA4_TYPE_MASK_WDATA_DLG | 1 | RFCTBD10 | N | 1 | | |
| | | | RCA4_TYPE_MASK_DIR_DLG | 2 | RFCTBD10 | N | 1 | | |
| | | | RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFCTBD10 | N | 1 | | |
| | | | RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFCTBD20 | L | 1 | | |
| | | | RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFCTBD30 | L | 1 | | |
| | | | RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFCTBD30 | L | 1 | | |
| | | +-------------------------------+-------+----------+-----+----------+ | |
| | | | |
| | | Table 16: Initial Recallable Object Type Assignments | |
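The values in Table 16 are bit numbers within the CB_RECALL_ANY type mask (Section 20.6 speaks of a bit being "set in the type mask"). A minimal sketch of building and inspecting such a mask follows. The helper names are hypothetical, and treating the Private Use bits 12 through 15 as known is an assumption made for illustration.

```python
# Bit numbers copied from Table 16.
RCA4_BITS = {
    "RCA4_TYPE_MASK_RDATA_DLG": 0,
    "RCA4_TYPE_MASK_WDATA_DLG": 1,
    "RCA4_TYPE_MASK_DIR_DLG": 2,
    "RCA4_TYPE_MASK_FILE_LAYOUT": 3,
    "RCA4_TYPE_MASK_BLK_LAYOUT": 4,
    "RCA4_TYPE_MASK_OBJ_LAYOUT_MIN": 8,
    "RCA4_TYPE_MASK_OBJ_LAYOUT_MAX": 9,
}
PRIVATE_USE_BITS = range(12, 16)  # values 12 through 15, inclusive


def build_type_mask(names):
    """Hypothetical helper: OR together the bits for the named object types."""
    mask = 0
    for name in names:
        mask |= 1 << RCA4_BITS[name]
    return mask


def undefined_bits(mask):
    """Hypothetical helper: list set bits with no assignment in the table."""
    known = set(RCA4_BITS.values()) | set(PRIVATE_USE_BITS)
    return [bit for bit in range(mask.bit_length())
            if (mask >> bit) & 1 and bit not in known]
```

A receiver that finds undefined_bits(mask) non-empty is in the case where Section 20.6 requires NFS4ERR_INVAL.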
| | | | |
| | | 22.3.2. Updating Registrations | |
| | | | |
| | | The update of a registration will require IESG Approval on the advice | |
| | | of a Designated Expert. | |
| | | | |
| | | 22.4. Layout Types | |
| | | | |
| | | IANA will create a registry called the "pNFS Layout Types Registry". | |
| | | | |
| | | All assignments to the registry are made on a Standards Action basis, | |
| | | with Expert Review required. | |
| | | | |
| | | Layout types are 32 bit numbers. The value zero is Reserved. Values | |
| | | in the range 0x80000000 to 0xFFFFFFFF inclusive are for Private Use. | |
| | | IANA will assign numbers from the range 0x00000001 to 0x7FFFFFFF | |
| | | inclusive. | |
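The three ranges of layout type values described above can be expressed as a small classifier. This is an illustrative sketch: the function name and the returned labels are assumptions, not part of the draft.

```python
def classify_layout_type(value):
    """Hypothetical helper: map a 32-bit layout type value to its range."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layout types are 32 bit numbers")
    if value == 0:
        return "reserved"
    if value <= 0x7FFFFFFF:
        return "iana-assigned"   # 0x00000001 to 0x7FFFFFFF inclusive
    return "private-use"         # 0x80000000 to 0xFFFFFFFF inclusive
```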
| | | | |
| | | The registry is a list of assignments, each containing five fields. | |
| | | | |
| | | 1. The name of the layout type. This name must have the prefix: | |
| | | "LAYOUT4_". The name must be unique. | |
| | | | |
| | | 2. The value of the layout type. IANA will assign this number, and | |
| | | the request from the registrant will use TBD1 instead of an | |
| | | actual value. The value assigned must be unique. A Designated | |
| | | Expert must be used to ensure that when the name of the layout | |
| | | type and its value are added to the NFSv4.1 layouttype4 | |
| | | enumerated data type in the NFSv4.1 XDR description ([12]), the | |
| | | result continues to be a valid XDR description. | |
| | | | |
| | | 3. The Standards Track RFC(s) that describe the layout type. If | |
| | | the RFC(s) have not yet been published, the registrant will use | |
| | | RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. | |
| | | Collectively, the RFC(s) must adhere to the guidelines listed in | |
| | | Section 22.4.3. | |
| | | | |
| | | 4. How the RFC introduces the layout type. This is indicated by a | |
| | | single US-ASCII value. If the value is N, it means a minor | |
| | | revision to the NFSv4 protocol. If the value is L, it means a | |
| | | new pNFS layout type. Other values can be used with IESG | |
| | | Approval. | |
| | | | |
| | | 5. The minor versions of NFSv4 that are allowed to use the | |
| | | layout type. While these are numeric values, IANA will not | |
| | | allocate and assign them; the author of the relevant RFCs with | |
| | | IESG Approval assigns these numbers. Each time there is a new | |
| | | minor version of NFSv4 approved, a Designated Expert should | |
| | | review the registry to make recommended updates as needed. | |
| | | | |
| | | 22.4.1. Initial Registry | |
| | | | |
| | | The initial registry is in Table 17. | |
| | | | |
| | | +-----------------------+-------+----------+-----+----------------+ | |
| | | | Layout Type Name | Value | RFC | How | Minor Versions | | |
| | | +-----------------------+-------+----------+-----+----------------+ | |
| | | | LAYOUT4_NFSV4_1_FILES | 0x1 | RFCTBD10 | N | 1 | | |
| | | | LAYOUT4_OSD2_OBJECTS | 0x2 | RFCTBD30 | L | 1 | | |
| | | | LAYOUT4_BLOCK_VOLUME | 0x3 | RFCTBD20 | L | 1 | | |
| | | +-----------------------+-------+----------+-----+----------------+ | |
| | | | |
| | | Table 17: Initial Layout Type Assignments | |
| | | | |
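The five registry fields and the two uniqueness rules stated above can be sketched as a small data structure. This is only an illustrative sketch, not part of either draft: the names LayoutTypeAssignment and check_registry are hypothetical, and the checks cover only what the text states (the "LAYOUT4_" prefix, unique names, unique values, and a single US-ASCII "How" value). The XDR validity review assigned to the Designated Expert is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LayoutTypeAssignment:
    """One registry row; the class and field names are hypothetical."""
    name: str                        # field 1: must start with "LAYOUT4_"
    value: int                       # field 2: assigned by IANA, must be unique
    rfc: str                         # field 3: RFC number or RFCTBDn placeholder
    how: str                         # field 4: single US-ASCII value (N, L, ...)
    minor_versions: Tuple[int, ...]  # field 5: NFSv4 minor versions


# The rows of Table 17, transcribed as data.
INITIAL_REGISTRY = [
    LayoutTypeAssignment("LAYOUT4_NFSV4_1_FILES", 0x1, "RFCTBD10", "N", (1,)),
    LayoutTypeAssignment("LAYOUT4_OSD2_OBJECTS", 0x2, "RFCTBD30", "L", (1,)),
    LayoutTypeAssignment("LAYOUT4_BLOCK_VOLUME", 0x3, "RFCTBD20", "L", (1,)),
]


def check_registry(entries: List[LayoutTypeAssignment]) -> List[str]:
    """Return a list of human-readable problems; empty means no rule is broken."""
    problems = []
    seen_names = set()
    seen_values = set()
    for entry in entries:
        if not entry.name.startswith("LAYOUT4_"):
            problems.append(f"{entry.name}: name lacks the LAYOUT4_ prefix")
        if entry.name in seen_names:
            problems.append(f"{entry.name}: duplicate name")
        if entry.value in seen_values:
            problems.append(f"{entry.name}: duplicate value {entry.value:#x}")
        if len(entry.how) != 1 or not entry.how.isascii():
            problems.append(f"{entry.name}: 'How' must be one US-ASCII value")
        seen_names.add(entry.name)
        seen_values.add(entry.value)
    return problems
```

Running check_registry over the initial rows should report no problems, while a candidate row that reuses an existing value or omits the prefix is flagged before it reaches the registry.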
| | | 22.4.2. Updating Registrations | |
| | | | |
| | | The update of a registration will require IESG Approval on the advice | |
| | | of a Designated Expert. | |
| | | | |
| | | 22.4.3. Guidelines for Writing Layout Type Specifications | |
| | | | |
| The author of a new pNFS layout specification must follow these steps | | The author of a new pNFS layout specification must follow these steps | |
|
| to obtain acceptance of the layout type as a standard: | | to obtain acceptance of the layout type as a Standards Track RFC: | |
| | | | |
| 1. The author devises the new layout specification. | | 1. The author devises the new layout specification. | |
| | | | |
| 2. The new layout type specification MUST, at a minimum: | | 2. The new layout type specification MUST, at a minimum: | |
| | | | |
| * Define the contents of the layout-type-specific fields of the | | * Define the contents of the layout-type-specific fields of the | |
| following data types: | | following data types: | |
| | | | |
| + the da_addr_body field of the device_addr4 data type; | | + the da_addr_body field of the device_addr4 data type; | |
| | | | |
| + the loh_body field of the layouthint4 data type; | | + the loh_body field of the layouthint4 data type; | |
| | | | |
| + the loc_body field of layout_content4 data type (which in | | + the loc_body field of layout_content4 data type (which in | |
| turn is the lo_content field of the layout4 data type); | | turn is the lo_content field of the layout4 data type); | |
| | | | |
| + the lou_body field of the layoutupdate4 data type; | | + the lou_body field of the layoutupdate4 data type; | |
| | | | |
| * Describe or define the storage access protocol used to access | | * Describe or define the storage access protocol used to access | |
|
| the data servers | | the storage devices. | |
| | | | |
| * Describe whether revocation of layouts is supported. | | * Describe whether revocation of layouts is supported. | |
| | | | |
| * At a minimum, describe the methods of recovery from: | | * At a minimum, describe the methods of recovery from: | |
| | | | |
| 1. Failure and restart for client, server, storage device. | | 1. Failure and restart for client, server, storage device. | |
| | | | |
| 2. Lease expiration from perspective of the active client, | | 2. Lease expiration from perspective of the active client, | |
| server, storage device. | | server, storage device. | |
| | | | |
| 3. Loss of layout state resulting in fencing of client access | | 3. Loss of layout state resulting in fencing of client access | |
| to storage devices (for an example, see Section 12.7.3). | | to storage devices (for an example, see Section 12.7.3). | |
| | | | |
|
| * A list of any new notification values for CB_NOTIFY_DEVICEID. | | * Include an IANA considerations section, which will in turn include: | |
| | | | |
|
| * A list of any new recallable object types for CB_RECALL_ANY. | | + A request to IANA for a new layout type per Section 22.4. | |
| | | | |
|
| * Include an IANA considerations section. | | + A list of requests to IANA for any new recallable object | |
| | | types for CB_RECALL_ANY; each entry is to be presented in the | |
| | | form described in Section 22.3. | |
| | | | |
| | | + A list of requests to IANA for any new notification values | |
| | | for CB_NOTIFY_DEVICEID; each entry is to be presented in the | |
| | | form described in Section 22.2. | |
| | | | |
| * Include a security considerations section. | | * Include a security considerations section. | |