NFSv4                                                         S. Shepler
Internet-Draft                                                 M. Eisler
Intended status: Standards Track                               D. Noveck
Expires: February 23, 2009                                       Editors
                                                         August 22, 2008


                     NFS Version 4 Minor Version 1
                 draft-ietf-nfsv4-minorversion1-25.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 23, 2009.

Abstract

   This Internet-Draft describes NFS version 4 minor version one,
   including features retained from the base protocol and protocol
   extensions made subsequently.  Major extensions introduced in NFS
   version 4 minor version one include: Sessions, Directory Delegations,
   and parallel NFS (pNFS).

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].


Shepler, et al.         Expires February 23, 2009               [Page 1]

Internet-Draft                    NFSv4.1                    August 2008


Table of Contents

   1.  Introduction
     1.1.  The NFS Version 4 Minor Version 1 Protocol
     1.2.  Scope of this Document
     1.3.  NFSv4 Goals
     1.4.  NFSv4.1 Goals
     1.5.  General Definitions
     1.6.  Overview of NFSv4.1 Features
       1.6.1.  RPC and Security
       1.6.2.  Protocol Structure
       1.6.3.  File System Model
       1.6.4.  Locking Facilities
     1.7.  Differences from NFSv4.0
   2.  Core Infrastructure
     2.1.  Introduction
     2.2.  RPC and XDR
       2.2.1.  RPC-based Security
     2.3.  COMPOUND and CB_COMPOUND
     2.4.  Client Identifiers and Client Owners
       2.4.1.  Upgrade from NFSv4.0 to NFSv4.1
       2.4.2.  Server Release of Client ID
       2.4.3.  Resolving Client Owner Conflicts
     2.5.  Server Owners
     2.6.  Security Service Negotiation
       2.6.1.  NFSv4.1 Security Tuples
       2.6.2.  SECINFO and SECINFO_NO_NAME
       2.6.3.  Security Error
     2.7.  Minor Versioning
     2.8.  Non-RPC-based Security Services
       2.8.1.  Authorization
       2.8.2.  Auditing
       2.8.3.  Intrusion Detection
     2.9.  Transport Layers
       2.9.1.  REQUIRED and RECOMMENDED Properties of Transports
       2.9.2.  Client and Server Transport Behavior
       2.9.3.  Ports
     2.10. Session
       2.10.1.  Motivation and Overview
       2.10.2.  NFSv4 Integration
       2.10.3.  Channels
       2.10.4.  Trunking
       2.10.5.  Exactly Once Semantics
       2.10.6.  RDMA Considerations
       2.10.7.  Sessions Security
       2.10.8.  The SSV GSS Mechanism
       2.10.9.  Session Mechanics - Steady State
       2.10.10. Session Inactivity Timer
       2.10.11. Session Mechanics - Recovery
       2.10.12. Parallel NFS and Sessions
   3.  Protocol Constants and Data Types
     3.1.  Basic Constants
     3.2.  Basic Data Types
     3.3.  Structured Data Types
   4.  Filehandles
     4.1.  Obtaining the First Filehandle
       4.1.1.  Root Filehandle
       4.1.2.  Public Filehandle
     4.2.  Filehandle Types
       4.2.1.  General Properties of a Filehandle
       4.2.2.  Persistent Filehandle
       4.2.3.  Volatile Filehandle
     4.3.  One Method of Constructing a Volatile Filehandle
     4.4.  Client Recovery from Filehandle Expiration
   5.  File Attributes
     5.1.  REQUIRED Attributes
     5.2.  RECOMMENDED Attributes
     5.3.  Named Attributes
     5.4.  Classification of Attributes
     5.5.  Set-Only and Get-Only Attributes
     5.6.  REQUIRED Attributes - List and Definition References
     5.7.  RECOMMENDED Attributes - List and Definition References
     5.8.  Attribute Definitions
       5.8.1.  Definitions of REQUIRED Attributes
       5.8.2.  Definitions of Uncategorized RECOMMENDED Attributes
     5.9.  Interpreting owner and owner_group
     5.10. Character Case Attributes
     5.11. Directory Notification Attributes
     5.12. pNFS Attribute Definitions
     5.13. Retention Attributes
   6.  Access Control Attributes
     6.1.  Goals
     6.2.  File Attributes Discussion
       6.2.1.  Attribute 12: acl
       6.2.2.  Attribute 58: dacl
       6.2.3.  Attribute 59: sacl
       6.2.4.  Attribute 33: mode
       6.2.5.  Attribute 74: mode_set_masked
     6.3.  Common Methods
       6.3.1.  Interpreting an ACL
       6.3.2.  Computing a Mode Attribute from an ACL
     6.4.  Requirements
       6.4.1.  Setting the mode and/or ACL Attributes
       6.4.2.  Retrieving the mode and/or ACL Attributes
       6.4.3.  Creating New Objects
   7.  Single-server Namespace
     7.1.  Server Exports
     7.2.  Browsing Exports
     7.3.  Server Pseudo File System
     7.4.  Multiple Roots
     7.5.  Filehandle Volatility
     7.6.  Exported Root
     7.7.  Mount Point Crossing
     7.8.  Security Policy and Namespace Presentation
   8.  State Management
     8.1.  Client and Session ID
     8.2.  Stateid Definition
       8.2.1.  Stateid Types
       8.2.2.  Stateid Structure
       8.2.3.  Special Stateids
       8.2.4.  Stateid Lifetime and Validation
       8.2.5.  Stateid Use for I/O Operations
       8.2.6.  Stateid Use for SETATTR Operations
     8.3.  Lease Renewal
     8.4.  Crash Recovery
       8.4.1.  Client Failure and Recovery
       8.4.2.  Server Failure and Recovery
       8.4.3.  Network Partitions and Recovery
     8.5.  Server Revocation of Locks
     8.6.  Short and Long Leases
     8.7.  Clocks, Propagation Delay, and Calculating Lease Expiration
     8.8.  Obsolete Locking Infrastructure From NFSv4.0
   9.  File Locking and Share Reservations
     9.1.  Opens and Byte-Range Locks
       9.1.1.  State-owner Definition
       9.1.2.  Use of the Stateid and Locking
     9.2.  Lock Ranges
     9.3.  Upgrading and Downgrading Locks
     9.4.  Stateid Seqid Values and Byte-Range Locks
     9.5.  Issues with Multiple Open-Owners
     9.6.  Blocking Locks
     9.7.  Share Reservations
     9.8.  OPEN/CLOSE Operations
     9.9.  Open Upgrade and Downgrade
     9.10. Parallel OPENs
     9.11. Reclaim of Open and Byte-Range Locks
   10. Client-Side Caching
     10.1.  Performance Challenges for Client-Side Caching
     10.2.  Delegation and Callbacks
       10.2.1.  Delegation Recovery
     10.3.  Data Caching
       10.3.1.  Data Caching and OPENs
       10.3.2.  Data Caching and File Locking
       10.3.3.  Data Caching and Mandatory File Locking
       10.3.4.  Data Caching and File Identity
     10.4.  Open Delegation
       10.4.1.  Open Delegation and Data Caching
       10.4.2.  Open Delegation and File Locks
       10.4.3.  Handling of CB_GETATTR
       10.4.4.  Recall of Open Delegation
       10.4.5.  Clients that Fail to Honor Delegation Recalls
       10.4.6.  Delegation Revocation
       10.4.7.  Delegations via WANT_DELEGATION
     10.5.  Data Caching and Revocation
       10.5.1.  Revocation Recovery for Write Open Delegation
     10.6.  Attribute Caching
     10.7.  Data and Metadata Caching and Memory Mapped Files
     10.8.  Name and Directory Caching without Directory Delegations
       10.8.1.  Name Caching
       10.8.2.  Directory Caching
     10.9.  Directory Delegations
       10.9.1.  Introduction to Directory Delegations
       10.9.2.  Directory Delegation Design
       10.9.3.  Attributes in Support of Directory Notifications
       10.9.4.  Directory Delegation Recall
       10.9.5.  Directory Delegation Recovery
   11. Multi-Server Namespace
     11.1.  Location Attributes
     11.2.  File System Presence or Absence
     11.3.  Getting Attributes for an Absent File System
       11.3.1.  GETATTR Within an Absent File System
       11.3.2.  READDIR and Absent File Systems
     11.4.  Uses of Location Information
       11.4.1.  File System Replication
       11.4.2.  File System Migration
       11.4.3.  Referrals
     11.5.  Location Entries and Server Identity
     11.6.  Additional Client-side Considerations
     11.7.  Effecting File System Transitions
       11.7.1.  File System Transitions and Simultaneous Access
       11.7.2.  Simultaneous Use and Transparent Transitions
       11.7.3.  Filehandles and File System Transitions
       11.7.4.  Fileids and File System Transitions
       11.7.5.  Fsids and File System Transitions
       11.7.6.  The Change Attribute and File System Transitions
       11.7.7.  Lock State and File System Transitions
       11.7.8.  Write Verifiers and File System Transitions
       11.7.9.  Readdir Cookies and Verifiers and File System
                Transitions
       11.7.10. File System Data and File System Transitions
     11.8.  Effecting File System Referrals
       11.8.1.  Referral Example (LOOKUP)
       11.8.2.  Referral Example (READDIR)
     11.9.  The Attribute fs_locations
     11.10. The Attribute fs_locations_info
       11.10.1.  The fs_locations_server4 Structure
       11.10.2.  The fs_locations_info4 Structure
       11.10.3.  The fs_locations_item4 Structure
     11.11. The Attribute fs_status
   12. Parallel NFS (pNFS)
     12.1.  Introduction
     12.2.  pNFS Definitions
       12.2.1.  Metadata
       12.2.2.  Metadata Server
       12.2.3.  pNFS Client
       12.2.4.  Storage Device
       12.2.5.  Storage Protocol
       12.2.6.  Control Protocol
       12.2.7.  Layout Types
       12.2.8.  Layout
       12.2.9.  Layout Iomode
       12.2.10. Device IDs
     12.3.  pNFS Operations
     12.4.  pNFS Attributes
     12.5.  Layout Semantics
       12.5.1.  Guarantees Provided by Layouts
       12.5.2.  Getting a Layout
       12.5.3.  Layout Stateid
       12.5.4.  Committing a Layout
       12.5.5.  Recalling a Layout
       12.5.6.  Revoking Layouts
       12.5.7.  Metadata Server Write Propagation
     12.6.  pNFS Mechanics
     12.7.  Recovery
       12.7.1.  Recovery from Client Restart
       12.7.2.  Dealing with Lease Expiration on the Client
       12.7.3.  Dealing with Loss of Layout State on the Metadata
                Server
       12.7.4.  Recovery from Metadata Server Restart
       12.7.5.  Operations During Metadata Server Grace Period
       12.7.6.  Storage Device Recovery
     12.8.  Metadata and Storage Device Roles
     12.9.  Security Considerations for pNFS
   13. PNFS: NFSv4.1 File Layout Type
     13.1.  Client ID and Session Considerations
       13.1.1.  Sessions Considerations for Data Servers
     13.2.  File Layout Definitions
     13.3.  File Layout Data Types
     13.4.  Interpreting the File Layout
       13.4.1.  Determining the Stripe Unit Number
       13.4.2.  Interpreting the File Layout Using Sparse Packing
       13.4.3.  Interpreting the File Layout Using Dense Packing
       13.4.4.  Sparse and Dense Stripe Unit Packing
     13.5.  Data Server Multipathing
     13.6.  Operations Sent to NFSv4.1 Data Servers
     13.7.  COMMIT Through Metadata Server
     13.8.  The Layout Iomode
     13.9.  Metadata and Data Server State Coordination
       13.9.1.  Global Stateid Requirements
       13.9.2.  Data Server State Propagation
     13.10. Data Server Component File Size
     13.11. Layout Revocation and Fencing
     13.12. Security Considerations for the File Layout Type
   14. Internationalization
     14.1.  Stringprep profile for the utf8str_cs type
     14.2.  Stringprep profile for the utf8str_cis type
     14.3.  Stringprep profile for the utf8str_mixed type
     14.4.  UTF-8 Capabilities
     14.5.  UTF-8 Related Errors
   15. Error Values
     15.1.  Error Definitions
       15.1.1.  General Errors
       15.1.2.  Filehandle Errors
       15.1.3.  Compound Structure Errors
       15.1.4.  File System Errors
       15.1.5.  State Management Errors
       15.1.6.  Security Errors
       15.1.7.  Name Errors
       15.1.8.  Locking Errors
       15.1.9.  Reclaim Errors
       15.1.10. pNFS Errors
       15.1.11. Session Use Errors
       15.1.12. Session Management Errors
       15.1.13. Client Management Errors
       15.1.14. Delegation Errors
       15.1.15. Attribute Handling Errors
       15.1.16. Obsoleted Errors
     15.2.  Operations and their valid errors
     15.3.  Callback operations and their valid errors
     15.4.  Errors and the operations that use them
   16. NFSv4.1 Procedures
     16.1.  Procedure 0: NULL - No Operation
     16.2.  Procedure 1: COMPOUND - Compound Operations
   17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL
   18. NFSv4.1 Operations
     18.1.  Operation 3: ACCESS - Check Access Rights
     18.2.  Operation 4: CLOSE - Close File
     18.3.  Operation 5: COMMIT - Commit Cached Data
     18.4.  Operation 6: CREATE - Create a Non-Regular File Object
     18.5.  Operation 7: DELEGPURGE - Purge Delegations Awaiting
            Recovery
     18.6.  Operation 8: DELEGRETURN - Return Delegation
     18.7.  Operation 9: GETATTR - Get Attributes
     18.8.  Operation 10: GETFH - Get Current Filehandle
     18.9.  Operation 11: LINK - Create Link to a File
     18.10. Operation 12: LOCK - Create Lock
     18.11. Operation 13: LOCKT - Test For Lock
     18.12. Operation 14: LOCKU - Unlock File
     18.13. Operation 15: LOOKUP - Lookup Filename
     18.14. Operation 16: LOOKUPP - Lookup Parent Directory
     18.15. Operation 17: NVERIFY - Verify Difference in Attributes
     18.16. Operation 18: OPEN - Open a Regular File
     18.17. Operation 19: OPENATTR - Open Named Attribute Directory
     18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access
     18.19. Operation 22: PUTFH - Set Current Filehandle
     18.20. Operation 23: PUTPUBFH - Set Public Filehandle
     18.21. Operation 24: PUTROOTFH - Set Root Filehandle
     18.22. Operation 25: READ - Read from File
     18.23. Operation 26: READDIR - Read Directory
     18.24. Operation 27: READLINK - Read Symbolic Link
     18.25. Operation 28: REMOVE - Remove File System Object
     18.26. Operation 29: RENAME - Rename Directory Entry
     18.27. Operation 31: RESTOREFH - Restore Saved Filehandle
     18.28. Operation 32: SAVEFH - Save Current Filehandle
     18.29. Operation 33: SECINFO - Obtain Available Security
     18.30. Operation 34: SETATTR - Set Attributes
     18.31. Operation 37: VERIFY - Verify Same Attributes
     18.32. Operation 38: WRITE - Write to File
     18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control
     18.34. Operation 41: BIND_CONN_TO_SESSION
     18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID
     18.36. Operation 43: CREATE_SESSION - Create New Session and
            Confirm Client ID
     18.37. Operation 44: DESTROY_SESSION - Destroy existing session
     18.38. Operation 45: FREE_STATEID - Free stateid with no locks
     18.39. Operation 46: GET_DIR_DELEGATION - Get a directory
            delegation
     18.40. Operation 47: GETDEVICEINFO - Get Device Information
     18.41. Operation 48: GETDEVICELIST - Get All Device Mappings
            for a File System
     18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using
            a layout
     18.43. Operation 50: LAYOUTGET - Get Layout Information
     18.44. Operation 51: LAYOUTRETURN - Release Layout Information
     18.45. Operation 52: SECINFO_NO_NAME - Get Security on
            Unnamed Object
     18.46. Operation 53: SEQUENCE - Supply per-procedure
            sequencing and control
     18.47. Operation 54: SET_SSV - Update SSV for a Client ID
     18.48. Operation 55: TEST_STATEID - Test stateids for validity
     18.49. Operation 56: WANT_DELEGATION - Request Delegation
     18.50. Operation 57: DESTROY_CLIENTID - Destroy existing
            client ID
     18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims
            Finished
     18.52. Operation 10044: ILLEGAL - Illegal operation
   19. NFSv4.1 Callback Procedures
     19.1.  Procedure 0: CB_NULL - No Operation
     19.2.  Procedure 1: CB_COMPOUND - Compound Operations
   20. NFSv4.1 Callback Operations
     20.1.  Operation 3: CB_GETATTR - Get Attributes
     20.2.  Operation 4: CB_RECALL - Recall a Delegation
     20.3.  Operation 5: CB_LAYOUTRECALL - Recall Layout from Client
     20.4.  Operation 6: CB_NOTIFY - Notify directory changes
     20.5.  Operation 7: CB_PUSH_DELEG - Offer Delegation to Client
     20.6.  Operation 8: CB_RECALL_ANY - Keep any N recallable
            objects
     20.7.  Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal
            Resources for Recallable Objects
     20.8.  Operation 10: CB_RECALL_SLOT - change flow control
            limits
     20.9.  Operation 11: CB_SEQUENCE - Supply backchannel
            sequencing and control
     20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending
            Delegation Wants
     20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible
            lock availability
     20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID
            changes
     20.13. Operation 10044: CB_ILLEGAL - Illegal Callback
            Operation
   21. Security Considerations
   22. IANA Considerations
     22.1.  Named Attribute Definitions
       22.1.1.  Initial Registry
       22.1.2.  Updating Registrations
     22.2.  Device ID Notifications
       22.2.1.  Initial Registry
       22.2.2.  Updating Registrations
     22.3.  Object Recall Types
       22.3.1.  Initial Registry
       22.3.2.  Updating Registrations
     22.4.  Layout Types
       22.4.1.  Initial Registry
       22.4.2.  Updating Registrations
       22.4.3.  Guidelines for Writing Layout Type Specifications
     22.5.  Path Variable Definitions
       22.5.1.  Path Variables Registry
       22.5.2.  Values for the ${ietf.org:CPU_ARCH} Variable
       22.5.3.  Values for the ${ietf.org:OS_TYPE} Variable
   23. References
     23.1.  Normative References
     23.2.  Informative References
   Appendix A.  Acknowledgments
   Appendix B.  RFC Editor Notes
   Authors' Addresses
   Intellectual Property and Copyright Statements


1.  Introduction

1.1.  The NFS Version 4 Minor Version 1 Protocol

   The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
   minor version of the NFS version 4 (NFSv4) protocol.  The first minor
   version, NFSv4.0, is described in [20].  NFSv4.1 generally follows
   the guidelines for the minor versioning model listed in Section 10 of
   RFC 3530.  However, it diverges from guidelines 11 ("a client and
   server that supports minor version X must support minor versions 0
   through X-1") and 12 ("no features may be introduced as mandatory in
   a minor version").  These divergences are due to the introduction of
   the sessions model for managing non-idempotent operations and the
   RECLAIM_COMPLETE operation.
   These two new features are infrastructural in nature and simplify
   implementation of existing and other new features.  Making them
   anything but REQUIRED would add undue complexity to protocol
   definition and implementation.  NFSv4.1 accordingly updates the Minor
   Versioning guidelines (Section 2.7).

   As a minor version, NFSv4.1 is consistent with the overall goals for
   NFSv4, but extends the protocol so as to better meet those goals,
   based on experience with NFSv4.0.  In addition, NFSv4.1 has adopted
   some additional goals, which motivate some of the major extensions in
   NFSv4.1.

1.2.  Scope of this Document

   This document describes the NFSv4.1 protocol.  With respect to
   NFSv4.0, this document does not:

   o  describe the NFSv4.0 protocol, except where needed to contrast
      with NFSv4.1.

   o  modify the specification of the NFSv4.0 protocol.

   o  clarify the NFSv4.0 protocol.

1.3.  NFSv4 Goals

   The NFSv4 protocol is a further revision of the NFS protocol already
   defined by NFSv3 [21].  It retains the essential characteristics of
   previous versions: easy recovery; independence of transport
   protocols, operating systems, and file systems; simplicity; and good
   performance.  NFSv4 has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform well
      where latency is high and bandwidth is low, and scale to very
      large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol.
      Additionally, the NFSv4.1 protocol provides a mechanism that
      allows clients and servers to negotiate security and requires
      clients and servers to support a minimal set of security schemes.

   o  Good cross-platform interoperability.

      The protocol features a file system model that provides a useful,
      common set of features that does not unduly favor one file system
      or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions within a
      framework that enables and encourages backward compatibility.

1.4.  NFSv4.1 Goals

   NFSv4.1 has the following goals, within the framework established by
   the overall NFSv4 goals.

   o  To correct significant structural weaknesses and oversights
      discovered in the base protocol.

   o  To add clarity and specificity to areas left unaddressed or not
      addressed in sufficient detail in the base protocol.  However, as
      stated in Section 1.2, it is not a goal to clarify the NFSv4.0
      protocol in the NFSv4.1 specification.

   o  To add specific features based on experience with the existing
      protocol and recent industry developments.

   o  To provide protocol support to take advantage of clustered server
      deployments including the ability to provide scalable parallel
      access to files distributed among multiple servers.

1.5.  General Definitions

   The following definitions are provided for the purpose of providing
   an appropriate context for the reader.

   Byte  This document defines a byte as an octet, i.e. a datum exactly
      8 bits in length.

   Client  The "client" is the entity that accesses the NFS server's
      resources.  The client may be an application which contains the
      logic to access the NFS server directly.
      The client may also be the traditional operating system client
      that provides remote file system services for a set of
      applications.

      A client is uniquely identified by a Client Owner.

      With reference to file locking, the client is also the entity that
      maintains a set of locks on behalf of one or more applications.
      This client is responsible for crash or failure recovery for those
      locks it manages.

      Note that multiple clients may share the same transport and
      connection and multiple clients may exist on the same network
      node.

   Client ID  A 64-bit quantity used as a unique, short-hand reference
      to a client supplied Verifier and client owner.  The server is
      responsible for supplying the client ID.

   Client Owner  The client owner is a unique string, opaque to the
      server, which identifies a client.  Multiple network connections
      and source network addresses originating from those connections
      may share a client owner.  The server is expected to treat
      requests from connections with the same client owner as coming
      from the same client.

   File System  The collection of objects on a server (as identified by
      the major identifier of a Server Owner, which is defined later in
      this section), that share the same fsid attribute (see
      Section 5.8.1.9).

   Lease  An interval of time defined by the server for which the client
      is irrevocably granted a lock.  At the end of a lease period the
      lock may be revoked if the lease has not been extended.  The lock
      must be revoked if a conflicting lock has been granted after the
      lease interval.

      All leases granted by a server have the same fixed interval.  Note
      that the fixed interval was chosen to alleviate the expense a
      server would have in maintaining state about variable length
      leases across server failures.
   Lock  The term "lock" is used to refer to byte-range (in UNIX
      environments, also known as record) locks, share reservations,
      delegations, or layouts unless specifically stated otherwise.

   Server  The "Server" is the entity responsible for coordinating
      client access to a set of file systems and is identified by a
      Server Owner.  A server can span multiple network addresses.

   Server Owner  The "Server Owner" identifies the server to the client.
      The server owner consists of a major and minor identifier.  When
      the client has two connections each to a peer with the same major
      identifier, the client assumes both peers are the same server (the
      server namespace is the same via each connection), and assumes
      that lock state is sharable across both connections.  When each
      peer has both the same major and minor identifier, the client
      assumes each connection might be associable with the same session.

   Stable Storage  NFSv4.1 servers must be able to recover without data
      loss from multiple power failures (including cascading power
      failures, that is, several power failures in quick succession),
      operating system failures, and hardware failure of components
      other than the storage medium itself (for example, disk,
      nonvolatile RAM).

      Some examples of stable storage that are allowable for an NFS
      server include:

      1.  Media commit of data, that is, the modified data has been
          successfully written to the disk media, for example, the disk
          platter.

      2.  An immediate reply disk drive with battery-backed on-drive
          intermediate storage or uninterruptible power system (UPS).

      3.  Server commit of data with battery-backed intermediate storage
          and recovery software.

      4.  Cache commit with uninterruptible power system (UPS) and
          recovery software.

   Stateid  A 128-bit quantity returned by a server that uniquely
      defines the open and locking state provided by the server for a
      specific open-owner or lock-owner/open-owner pair for a specific
      file and type of lock.

   Verifier  A 64-bit quantity generated by the client that the server
      can use to determine if the client has restarted and lost all
      previous lock state.

1.6.  Overview of NFSv4.1 Features

   To provide a reasonable context for the reader, the major features of
   the NFSv4.1 protocol are reviewed in brief, both for the reader who
   is familiar with the previous versions of the NFS protocol and for
   the reader who is new to the NFS protocols.  For the reader new to
   the NFS protocols, there is still a set of fundamental knowledge that
   is expected.  The reader should be familiar with the XDR and RPC
   protocols as described in [2] and [3].  A basic knowledge of file
   systems and distributed file systems is expected as well.

   In general, this specification of NFSv4.1 will not distinguish those
   features added in minor version one from those present in the base
   protocol but will treat NFSv4.1 as a unified whole.  See Section 1.7
   for a summary of the differences between NFSv4.0 and NFSv4.1.

1.6.1.  RPC and Security

   As with previous versions of NFS, the External Data Representation
   (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFSv4.1
   protocol are those defined in [2] and [3].  To meet end-to-end
   security requirements, the RPCSEC_GSS framework [4] will be used to
   extend the basic RPC security.  With the use of RPCSEC_GSS, various
   mechanisms can be provided to offer authentication, integrity, and
   privacy to the NFSv4 protocol.
   Kerberos V5 will be used as described in [5] to provide one security
   framework.  The LIPKEY and SPKM-3 GSS-API mechanisms described in [6]
   will be used to provide for the use of user password and client/
   server public key certificates by the NFSv4 protocol.  With the use
   of RPCSEC_GSS, other mechanisms may also be specified and used for
   NFSv4.1 security.

   To enable in-band security negotiation, the NFSv4.1 protocol has
   operations which provide the client a method of querying the server
   about its policies regarding which security mechanisms must be used
   for access to the server's file system resources.  With this, the
   client can securely match the security mechanism that meets the
   policies specified at both the client and server.

1.6.2.  Protocol Structure

1.6.2.1.  Core Protocol

   Unlike NFSv3, which used a series of ancillary protocols (e.g. NLM,
   NSM, MOUNT), within all minor versions of NFSv4 a single RPC protocol
   is used to make requests to the server.  Facilities that had been
   separate protocols, such as locking, are now integrated within a
   single unified protocol.

1.6.2.2.  Parallel Access

   Minor version one supports high-performance data access to a
   clustered server implementation by enabling a separation of metadata
   access and data access, with the latter done to multiple servers in
   parallel.

   Such parallel data access is controlled by recallable objects known
   as "layouts", which are integrated into the protocol locking model.
   Clients direct requests for data access to a set of data servers
   specified by the layout via a data storage protocol which may be
   NFSv4.1 or may be another protocol.

1.6.3.
File System Model

   The general file system model used for the NFSv4.1 protocol is the
   same as previous versions.  The server file system is hierarchical
   with the regular files contained within being treated as opaque byte
   streams.  In a slight departure, file and directory names are encoded
   with UTF-8 to deal with the basics of internationalization.

   The NFSv4.1 protocol does not require a separate protocol to provide
   for the initial mapping between path name and filehandle.  All file
   systems exported by a server are presented as a tree so that all file
   systems are reachable from a special per-server global root
   filehandle.  This allows LOOKUP operations to be used to perform
   functions previously provided by the MOUNT protocol.  The server
   provides any necessary pseudo file systems to bridge any unexported
   gaps between exported file systems.

1.6.3.1.  Filehandles

   As in previous versions of the NFS protocol, opaque filehandles are
   used to identify individual files and directories.  Lookup-type and
   create operations translate file and directory names to filehandles
   which are then used to identify objects in subsequent operations.

   The NFSv4.1 protocol provides support for persistent filehandles,
   guaranteed to be valid for the lifetime of the file system object
   designated.  In addition, it allows servers to provide filehandles
   with more limited validity guarantees, called volatile filehandles.

1.6.3.2.  File Attributes

   The NFSv4.1 protocol has a rich and extensible file object attribute
   structure, which is divided into REQUIRED, RECOMMENDED, and named
   attributes (see Section 5).

   Several (but not all) of the REQUIRED attributes are derived from the
   attributes of NFSv3 (see definition of the fattr3 data type in [21]).
   An example of a REQUIRED attribute is the file object's type
   (Section 5.8.1.2) so that regular files can be distinguished from
   directories (also known as folders in some operating environments)
   and other types of objects.  REQUIRED attributes are discussed in
   Section 5.1.

   Three examples of RECOMMENDED attributes are acl, sacl, and dacl.
   These attributes define an Access Control List (ACL) on a file object
   (Section 6).  An ACL provides directory and file access control
   beyond the model used in NFSv3.  The ACL definition allows for
   specification of specific sets of permissions for individual users
   and groups.  In addition, ACL inheritance allows propagation of
   access permissions and restrictions down a directory tree as file
   system objects are created.  RECOMMENDED attributes are discussed in
   Section 5.2.

   A named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name.  Named attributes
   are meant to be used by client applications as a method to associate
   application-specific data with a regular file or directory.  NFSv4.1
   modifies named attributes relative to NFSv4.0 by tightening the
   allowed operations in order to prevent the development of
   non-interoperable implementations.  Named attributes are discussed in
   Section 5.3.

1.6.3.3.  Multi-server Namespace

   NFSv4.1 contains a number of features to allow implementation of
   namespaces that cross server boundaries and that allow and facilitate
   a non-disruptive transfer of support for individual file systems
   between servers.  They are all based upon attributes that allow one
   file system to specify alternate or new locations for that file
   system.

   These attributes may be used together with the concept of absent file
   systems, which provide specifications for additional locations but no
   actual file system content.  This allows a number of important
   facilities:

   o  Location attributes may be used with absent file systems to
      implement referrals whereby one server may direct the client to a
      file system provided by another server.  This allows extensive
      multi-server namespaces to be constructed.

   o  Location attributes may be provided for present file systems to
      provide the locations of alternate file system instances or
      replicas to be used in the event that the current file system
      instance becomes unavailable.

   o  Location attributes may be provided when a previously present file
      system becomes absent.  This allows non-disruptive migration of
      file systems to alternate servers.

1.6.4.  Locking Facilities

   As mentioned previously, NFSv4.1 is a single protocol which includes
   locking facilities.  These locking facilities include support for
   many types of locks including a number of sorts of recallable locks.
   Recallable locks such as delegations allow the client to be assured
   that certain events will not occur so long as that lock is held.
   When circumstances change, the lock is recalled via a callback
   request.  The assurances provided by delegations allow more extensive
   caching to be done safely when circumstances allow it.

   The types of locks are:

   o  Share reservations as established by OPEN operations.

   o  Byte-range locks.

   o  File delegations, which are recallable locks that assure the
      holder that inconsistent opens and file changes cannot occur so
      long as the delegation is held.

   o  Directory delegations, which are recallable locks that assure the
      holder that inconsistent directory modifications cannot occur so
      long as the delegation is held.

   o  Layouts, which are recallable objects that assure the holder that
      access to the file data may be performed directly by the client
      and that no change to the data's location inconsistent with that
      access may be made so long as the layout is held.

   All locks for a given client are tied together under a single
   client-wide lease.  All requests made on sessions associated with the
   client renew that lease.  When leases are not promptly renewed, locks
   are subject to revocation.  In the event of server restart, clients
   have the opportunity to safely reclaim their locks within a special
   grace period.

1.7.  Differences from NFSv4.0

   The following summarizes the major differences between minor version
   one and the base protocol:

   o  Implementation of the sessions model (Section 2.10).

   o  Parallel access to data (Section 12).

   o  Addition of the RECLAIM_COMPLETE operation to better structure the
      lock reclamation process (Section 18.51).

   o  Enhanced delegation support as follows.

      *  Delegations on directories and other file types in addition to
         regular files (Section 18.39, Section 18.49).

      *  Operations to optimize acquisition of recalled or denied
         delegations (Section 18.49, Section 20.5, Section 20.7).

      *  Notifications of changes to files and directories
         (Section 18.39, Section 20.4).

      *  A method to allow a server to indicate it is recalling one or
         more delegations for resource management reasons, and thus a
         method to allow the client to pick which delegations to return
         (Section 20.6).

   o  Attributes can be set atomically during exclusive file create via
      the OPEN operation (see the new EXCLUSIVE4_1 creation method in
      Section 18.16).

   o  Open files can be preserved if removed and the hard link count
      goes to zero, thus obviating the need for clients to rename
      deleted files to partially hidden names -- colloquially called
      "silly rename" (see the new OPEN4_RESULT_PRESERVE_UNLINKED reply
      flag in Section 18.16).

   o  Improved compatibility with Microsoft Windows for Access Control
      Lists (Section 6.2.3, Section 6.2.2, Section 6.4.3.2).

   o  Data retention (Section 5.13).

   o  Identification of the implementation of the NFS client and server
      (Section 18.35).

   o  Support for notification of the availability of byte-range locks
      (see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in
      Section 18.16 and see Section 20.11).

2.  Core Infrastructure

2.1.  Introduction

   NFSv4.1 relies on core infrastructure common to nearly every
   operation.  This core infrastructure is described in the remainder of
   this section.

2.2.  RPC and XDR

   The NFSv4.1 protocol is a Remote Procedure Call (RPC) application
   that uses RPC version 2 and the corresponding eXternal Data
   Representation (XDR) as defined in [3] and [2].

2.2.1.  RPC-based Security

   Previous NFS versions have been thought of as having a host-based
   authentication model, where the NFS server authenticates the NFS
   client, and trusts the client to authenticate all users.  Actually,
   NFS has always depended on RPC for authentication.  One of the first
   forms of RPC authentication, AUTH_SYS, had no strong authentication,
   and required a host-based authentication approach.
   NFSv4.1 also depends on RPC for basic security services, and mandates
   RPC support for a user-based authentication model.  The user-based
   authentication model has user principals authenticated by a server,
   and in turn the server authenticated by user principals.  RPC
   provides some basic security services which are used by NFSv4.1.

2.2.1.1.  RPC Security Flavors

   As described in section 7.2 "Authentication" of [3], RPC security is
   encapsulated in the RPC header, via a security or authentication
   flavor, and information specific to the specified security flavor.
   Every RPC header conveys information used to identify and
   authenticate a client and server.  As discussed in Section 2.2.1.1.1,
   some security flavors provide additional security services.

   NFSv4.1 clients and servers MUST implement RPCSEC_GSS.  (This
   requirement to implement is not a requirement to use.)  Other
   flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well.

2.2.1.1.1.  RPCSEC_GSS and Security Services

   RPCSEC_GSS ([4]) uses the functionality of GSS-API [7].  This allows
   for the use of various security mechanisms by the RPC layer without
   the additional implementation overhead of adding RPC security
   flavors.

2.2.1.1.1.1.  Identification, Authentication, Integrity, Privacy

   Via the GSS-API, RPCSEC_GSS can be used to identify and authenticate
   users on clients to servers, and servers to users.  It can also
   perform integrity checking on the entire RPC message, including the
   RPC header, and the arguments or results.  Finally, privacy, usually
   via encryption, is a service available with RPCSEC_GSS.  Privacy is
   performed on the arguments and results.
   Note that if privacy is selected, integrity, authentication, and
   identification are enabled.  If privacy is not selected, but
   integrity is selected, authentication and identification are enabled.
   If integrity and privacy are not selected, but authentication is
   enabled, identification is enabled.  RPCSEC_GSS does not provide
   identification as a separate service.

   Although GSS-API has an authentication service distinct from its
   privacy and integrity services, GSS-API's authentication service is
   not used for RPCSEC_GSS's authentication service.  Instead, each RPC
   request and response header is integrity protected with the GSS-API
   integrity service, and this allows RPCSEC_GSS to offer per-RPC
   authentication and identity.  See [4] for more information.

   NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and
   authentication service.  NFSv4.1 servers MUST support RPCSEC_GSS's
   privacy service.  NFSv4.1 clients SHOULD support RPCSEC_GSS's privacy
   service.

2.2.1.1.1.2.  Security mechanisms for NFSv4.1

   RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide
   security services.  Therefore NFSv4.1 clients and servers MUST
   support three security mechanisms: Kerberos V5, SPKM-3, and LIPKEY.

   The use of RPCSEC_GSS requires selection of: mechanism, quality of
   protection (QOP), and service (authentication, integrity, privacy).
   For the mandated security mechanisms, NFSv4.1 specifies that a QOP of
   zero (0) is used, leaving it up to the mechanism or the mechanism's
   configuration to use an appropriate level of protection that QOP zero
   maps to.  Each mandated mechanism specifies a minimum set of
   cryptographic algorithms for implementing integrity and privacy.
   NFSv4.1 clients and servers MUST be implemented on operating
   environments that comply with the REQUIRED cryptographic algorithms
   of each REQUIRED mechanism.

2.2.1.1.1.2.1.  Kerberos V5

   The Kerberos V5 GSS-API mechanism as described in [5] MUST be
   implemented with the RPCSEC_GSS services as specified in the
   following table:

   column descriptions:
   1 == number of pseudo flavor
   2 == name of pseudo flavor
   3 == mechanism's OID
   4 == RPCSEC_GSS service
   5 == NFSv4.1 clients MUST support
   6 == NFSv4.1 servers MUST support

   1      2     3                    4                     5   6
   ------------------------------------------------------------------
   390003 krb5  1.2.840.113554.1.2.2 rpc_gss_svc_none      yes yes
   390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
   390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy   no  yes

   Note that the number and name of the pseudo flavor are presented here
   as a mapping aid to the implementor.  Because the NFSv4.1 protocol
   includes a method to negotiate security and it understands the
   GSS-API mechanism, the pseudo flavor is not needed.  The pseudo
   flavor is needed for NFSv3 since the security negotiation is done via
   the MOUNT protocol as described in [22].

2.2.1.1.1.2.2.  LIPKEY

   The LIPKEY GSS-API mechanism as described in [6] MUST be implemented
   with the RPCSEC_GSS services as specified in the following table:

   1      2        3             4                     5   6
   ------------------------------------------------------------------
   390006 lipkey   1.3.6.1.5.5.9 rpc_gss_svc_none      yes yes
   390007 lipkey-i 1.3.6.1.5.5.9 rpc_gss_svc_integrity yes yes
   390008 lipkey-p 1.3.6.1.5.5.9 rpc_gss_svc_privacy   no  yes

2.2.1.1.1.2.3.
SPKM-3 as a security triple

   The SPKM-3 GSS-API mechanism as described in [6] MUST be implemented
   with the RPCSEC_GSS services as specified in the following table:

   1      2      3               4                     5   6
   ------------------------------------------------------------------
   390009 spkm3  1.3.6.1.5.5.1.3 rpc_gss_svc_none      yes yes
   390010 spkm3i 1.3.6.1.5.5.1.3 rpc_gss_svc_integrity yes yes
   390011 spkm3p 1.3.6.1.5.5.1.3 rpc_gss_svc_privacy   no  yes

2.2.1.1.1.3.  GSS Server Principal

   Regardless of what security mechanism under RPCSEC_GSS is being used,
   the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

      service@hostname

   For NFS, the "service" element is

      nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms.  For Kerberos V5, LIPKEY, and SPKM-3, the
   following form is RECOMMENDED:

      nfs/hostname

2.3.  COMPOUND and CB_COMPOUND

   A significant departure from the versions of the NFS protocol before
   NFSv4 is the introduction of the COMPOUND procedure.  For the NFSv4
   protocol, in all minor versions, there are exactly two RPC
   procedures, NULL and COMPOUND.  The COMPOUND procedure is defined as
   a series of individual operations and these operations perform the
   sorts of functions performed by traditional NFS procedures.

   The operations combined within a COMPOUND request are evaluated in
   order by the server, without any atomicity guarantees.  A limited set
   of facilities exist to pass results from one operation to another.
   Once an operation returns a failing result, the evaluation ends and
   the results of all evaluated operations are returned to the client.
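
   The evaluation rule just described (in-order evaluation, no
   atomicity, stop at the first failing result) can be illustrated with
   a minimal sketch.  Everything here is hypothetical scaffolding rather
   than protocol definition: the handler signature, the "state"
   dictionary standing in for the facilities that pass results between
   operations, and the toy namespace TOY_TREE are assumptions made for
   illustration only.

```python
NFS4_OK = 0          # success status
NFS4ERR_NOENT = 2    # "no such file or directory" status

# Hypothetical toy namespace: directory path -> names it contains.
TOY_TREE = {"/": {"export"}, "/export": {"home"}, "/export/home": set()}


def putrootfh(state):
    """Set the current filehandle to the root (modeled as a path string)."""
    state["current_fh"] = "/"
    return NFS4_OK


def lookup(name):
    """Build a LOOKUP-like operation for one path component."""
    def op(state):
        parent = state["current_fh"]
        if name not in TOY_TREE.get(parent, set()):
            return NFS4ERR_NOENT
        state["current_fh"] = parent.rstrip("/") + "/" + name
        return NFS4_OK
    return op


def getattr_op(state):
    """Record which object a GETATTR-like operation would describe."""
    state["attrs_of"] = state["current_fh"]
    return NFS4_OK


def evaluate_compound(ops):
    """Evaluate operations in order; stop after the first failure.

    Effects of operations that already succeeded are not rolled back,
    which models the absence of atomicity guarantees.
    """
    state = {}
    results = []
    for name, handler in ops:
        status = handler(state)
        results.append((name, status))
        if status != NFS4_OK:
            break  # later operations are never evaluated
    return results, state


# A multi-component lookup whose second component does not exist.
results, state = evaluate_compound([
    ("PUTROOTFH", putrootfh),
    ("LOOKUP", lookup("export")),
    ("LOOKUP", lookup("missing")),
    ("GETATTR", getattr_op),
])
print(results)  # three results; GETATTR was not evaluated
```

   In this sketch the reply carries results only for the operations that
   were evaluated, up to and including the failing one, and the current
   filehandle still reflects the last successful LOOKUP.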

   With the use of the COMPOUND procedure, the client is able to build
   simple or complex requests.  These COMPOUND requests allow for a
   reduction in the number of RPCs needed for logical file system
   operations.  For example, multi-component lookup requests can be
   constructed by combining multiple LOOKUP operations.  Those can be
   further combined with operations such as GETATTR, READDIR, or OPEN
   plus READ to do more complicated sets of operations without incurring
   additional latency.

   NFSv4.1 also contains a considerable set of callback operations in
   which the server makes an RPC directed at the client.  Callback RPCs
   have a similar structure to that of the normal server requests.  In
   all minor versions of the NFSv4 protocol there are two callback RPC
   procedures, CB_NULL and CB_COMPOUND.  The CB_COMPOUND procedure is
   defined in an analogous fashion to that of COMPOUND with its own set
   of callback operations.

   The addition of new server and callback operations within the
   COMPOUND and CB_COMPOUND request framework provides a means of
   extending the protocol in subsequent minor versions.

   Except for a small number of operations needed for session creation,
   server requests and callback requests are performed within the
   context of a session.  Sessions provide a client context for every
   request and support robust reply protection for non-idempotent
   requests.

2.4.  Client Identifiers and Client Owners

   For each operation that obtains or depends on locking state, the
   specific client must be identifiable by the server.

   Each distinct client instance is represented by a client ID.  A
   client ID is a 64-bit identifier representing a specific client at a
   given time.
   The client ID is changed whenever the client re-initializes, and may
   change when the server re-initializes.  Client IDs are used to
   support lock identification and crash recovery.

   During steady state operation, the client ID associated with each
   operation is derived from the session (see Section 2.10) on which the
   operation is sent.  A session is associated with a client ID when the
   session is created.

   Unlike NFSv4.0, the only NFSv4.1 operations possible before a client
   ID is established are those needed to establish the client ID.

   A sequence of an EXCHANGE_ID operation followed by a CREATE_SESSION
   operation using that client ID (eir_clientid as returned from
   EXCHANGE_ID) is required to establish and confirm the client ID on
   the server.  Establishment of identification by a new incarnation of
   the client also has the effect of immediately releasing any locking
   state that a previous incarnation of that same client might have had
   on the server.  Such released state would include all lock, share
   reservation, and layout state and, where the server is not supporting
   the CLAIM_DELEGATE_PREV claim type, all delegation state associated
   with the same client with the same identity.  For discussion of
   delegation state recovery, see Section 10.2.1.  For discussion of
   layout state recovery, see Section 12.7.1.

   Releasing such state requires that the server be able to determine
   that one client instance is the successor of another.  Where this
   cannot be done, for any of a number of reasons, the locking state
   will remain for a time subject to lease expiration (see Section 8.3)
   and the new client will need to wait for such state to be removed, if
   it makes conflicting lock requests.
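
   As a rough illustration of the establishment sequence above, the
   following sketch models a server that hands out a client ID on
   EXCHANGE_ID, confirms it on CREATE_SESSION, and at that point
   releases the locking state of a previous incarnation of the same
   client.  It is a deliberately simplified model, not the server
   behavior defined by this specification: the class and method names,
   the in-memory tables, and the use of a verifier comparison to detect
   a new incarnation are assumptions of the sketch, and delegation
   handling is omitted entirely.

```python
class IncarnationTrackingServer:
    """Hypothetical, heavily simplified server-side bookkeeping."""

    def __init__(self):
        self._next_clientid = 1
        self._unconfirmed = {}  # clientid -> (ownerid, verifier)
        self._confirmed = {}    # ownerid -> (verifier, clientid)
        self.locks = {}         # clientid -> set of lock descriptions

    def exchange_id(self, ownerid, verifier):
        """Return a client ID for this owner and incarnation verifier."""
        known = self._confirmed.get(ownerid)
        if known is not None and known[0] == verifier:
            return known[1]  # same incarnation: keep the existing ID
        clientid = self._next_clientid
        self._next_clientid += 1
        self._unconfirmed[clientid] = (ownerid, verifier)
        return clientid

    def create_session(self, clientid):
        """Confirm the client ID; release a prior incarnation's state."""
        pending = self._unconfirmed.pop(clientid, None)
        if pending is not None:
            ownerid, verifier = pending
            previous = self._confirmed.get(ownerid)
            if previous is not None:
                # New incarnation confirmed: drop the old locking state.
                self.locks.pop(previous[1], None)
            self._confirmed[ownerid] = (verifier, clientid)
            self.locks[clientid] = set()
        elif clientid not in self.locks:
            raise KeyError("unknown client ID")
        return ("session", clientid)


server = IncarnationTrackingServer()
first = server.exchange_id(b"client-A", verifier=1)
server.create_session(first)
server.locks[first].add("byte-range lock on file X")

# The client restarts and presents the same owner with a new verifier.
second = server.exchange_id(b"client-A", verifier=2)
server.create_session(second)
print(first in server.locks, second in server.locks)
```

   Note that the old state survives the new EXCHANGE_ID and is dropped
   only once CREATE_SESSION confirms the new client ID; a server that
   cannot relate the two instances would instead have to wait for lease
   expiration, as described above.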

   Client identification is encapsulated in the following Client Owner
   data type:

   struct client_owner4 {
           verifier4       co_verifier;
           opaque          co_ownerid<NFS4_OPAQUE_LIMIT>;
   };

   The first field, co_verifier, is a client incarnation verifier.  The
   server will start the process of canceling the client's leased state
   if co_verifier is different than what the server has previously
   recorded for the identified client (as specified in the co_ownerid
   field).

   The second field, co_ownerid, is a variable length string that
   uniquely defines the client so that subsequent instances of the same
   client bear the same co_ownerid with a different verifier.

   There are several considerations for how the client generates the
   co_ownerid string:

   o  The string should be unique so that multiple clients do not
      present the same string.  The consequences of two clients
      presenting the same string range from one client getting an error
      to one client having its leased state abruptly and unexpectedly
      canceled.

   o  The string should be selected so that subsequent incarnations
      (e.g. restarts) of the same client cause the client to present the
      same string.  The implementor is cautioned against an approach
      that requires the string to be recorded in a local file because
      this precludes the use of the implementation in an environment
      where there is no local disk and all file access is from an
      NFSv4.1 server.

   o  The string should be the same for each server network address that
      the client accesses.  This way, if a server has multiple
      interfaces, the client can trunk traffic over multiple network
      paths as described in Section 2.10.4.  (Note: the precise opposite
      was advised in the NFSv4.0 specification [20].)

o  The algorithm for generating the string should not assume that the
   client's network address will not change, unless the client
   implementation knows it is using statically assigned network
   addresses. This includes changes between client incarnations and
   even changes while the client is still running in its current
   incarnation. Thus with dynamic address assignment, if the client
   includes just the client's network address in the co_ownerid
   string, there is a real risk that after the client gives up the
   network address, another client, using a similar algorithm for
   generating the co_ownerid string, would generate a conflicting
   co_ownerid string.

Given the above considerations, an example of a well-generated
co_ownerid string is one that includes:

o  If applicable, the client's statically assigned network address.

o  Additional information that tends to be unique, such as one or
   more of:

   *  The client machine's serial number (for privacy reasons, it is
      best to perform some one-way function on the serial number).

   *  A MAC address (again, a one-way function should be performed).

   *  The timestamp of when the NFSv4.1 software was first installed
      on the client (though this is subject to the previously
      mentioned caution about using information that is stored in a
      file, because the file might only be accessible over NFSv4.1).

   *  A true random number. However, since this number ought to be
      the same between client incarnations, this shares the same
      problem as that of using the timestamp of the software
      installation.

o  For a user level NFSv4.1 client, it should contain additional
   information to distinguish the client from other user level
   clients running on the same host, such as a process identifier or
   other unique sequence.
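
The considerations above can be illustrated with a minimal, non-normative
sketch. The function names (one_way, make_co_ownerid), the choice of
SHA-256 as the one-way function, and the "key=value" layout of the
resulting string are all assumptions made for illustration; this
document does not prescribe any particular encoding.

```python
import hashlib


def one_way(value: str) -> str:
    """Apply a one-way function to an identifying value.

    SHA-256 is an illustrative choice only; any one-way function would do.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def make_co_ownerid(serial_number, mac_address,
                    static_address=None, user_level_id=None):
    """Build a hypothetical co_ownerid from stable client properties.

    The server's network address is deliberately not an input, so the
    same string is presented to every server address (allowing trunking).
    """
    parts = []
    if static_address is not None:
        # Only include the address when it is known to be statically assigned.
        parts.append("addr=" + static_address)
    # Hash the hardware identifiers for privacy.
    parts.append("serial=" + one_way(serial_number))
    parts.append("mac=" + one_way(mac_address.lower()))
    if user_level_id is not None:
        # Distinguishes several user level clients on the same host.
        parts.append("uid=" + str(user_level_id))
    return "/".join(parts).encode("utf-8")
```

Because every input is a stable property of the client, two calls with
the same arguments yield the same string, which is what lets a restarted
client present the same co_ownerid with a new co_verifier.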

The client ID is assigned by the server (the eir_clientid result from
EXCHANGE_ID) and should be chosen so that it will not conflict with a
client ID previously assigned by the server. This applies across
server restarts.

In the event of a server restart, a client may find out that its
current client ID is no longer valid when it receives an
NFS4ERR_STALE_CLIENTID error. The precise circumstances depend on
the characteristics of the sessions involved, specifically whether
the session is persistent (see Section 2.10.5.5), but in each case
the client will receive this error when it attempts to establish a
new session with the existing client ID. The error indicates that a
new client ID must be obtained via EXCHANGE_ID and the new session
established with that client ID.

When a session is not persistent, the client will find out that it
needs to create a new session as a result of getting an
NFS4ERR_BADSESSION, since the session in question was lost as part of
a server restart. When the existing client ID is presented to a
server as part of creating a session and that client ID is not
recognized, as would happen after a server restart, the server will
reject the request with the error NFS4ERR_STALE_CLIENTID.

In the case of the session being persistent, the client will
re-establish communication using the existing session after the
restart. This session will be associated with the existing client ID
but may only be used to retransmit operations that the client
previously transmitted and did not see replies to. Replies to
operations that the server previously performed will come from the
reply cache; otherwise, NFS4ERR_DEADSESSION will be returned.
Hence, such a session is referred to as "dead". In this situation, in
order to perform new operations, the client must establish a new
session. If an attempt is made to establish this new session with the
existing client ID, the server will reject the request with
NFS4ERR_STALE_CLIENTID.

When NFS4ERR_STALE_CLIENTID is received in either of these
situations, the client must obtain a new client ID by use of the
EXCHANGE_ID operation, then use that client ID as the basis of a new
session, and then proceed to any other necessary recovery for the
server restart case (see Section 8.4.2).

See the descriptions of EXCHANGE_ID (Section 18.35) and
CREATE_SESSION (Section 18.36) for a complete specification of these
operations.

2.4.1. Upgrade from NFSv4.0 to NFSv4.1

To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established
using the SETCLIENTID operation of NFSv4.0. A server that does so
will allow an upgraded client to avoid waiting until the lease (i.e.,
the lease established by the NFSv4.0 instance client) expires. This
requires that the client_owner4 be constructed the same way as the
nfs_client_id4. If the latter's contents included the server's
network address (per the recommendations of the NFSv4.0 specification
[20]), and the NFSv4.1 client does not wish to use a client ID that
prevents trunking, it should send two EXCHANGE_ID operations. The
first EXCHANGE_ID will have a client_owner4 equal to the
nfs_client_id4. This will clear the state created by the NFSv4.0
client. The second EXCHANGE_ID will not have the server's network
address.
The state created for the second EXCHANGE_ID will not have to wait
for lease expiration, because there will be no state to expire.

2.4.2. Server Release of Client ID

NFSv4.1 introduces a new operation called DESTROY_CLIENTID
(Section 18.50), which the client SHOULD use to destroy a client ID
it no longer needs. This permits graceful, bilateral release of a
client ID. The operation cannot be used if there are sessions
associated with the client ID, or state with an unexpired lease.

If the server determines that the client holds no associated state
for its client ID (including sessions, opens, locks, delegations,
layouts, and wants), the server may choose to unilaterally release
the client ID in order to conserve resources. If the client contacts
the server after this release, the server must ensure the client
receives the appropriate error so that it will use the EXCHANGE_ID/
CREATE_SESSION sequence to establish a new client ID. The server
ought to be very hesitant to release a client ID since the resulting
work on the client to recover from such an event will be the same
burden as if the server had failed and restarted. Typically a server
would not release a client ID unless there had been no activity from
that client for many minutes. As long as there are sessions, opens,
locks, delegations, layouts, or wants, the server MUST NOT release
the client ID. See Section 2.10.11.1.4 for discussion on releasing
inactive sessions.

2.4.3. Resolving Client Owner Conflicts

When the server gets an EXCHANGE_ID for a client owner that currently
has no state, or that has state, but the lease has expired, the
server MUST allow the EXCHANGE_ID, and confirm the new client ID if
followed by the appropriate CREATE_SESSION.

When the server gets an EXCHANGE_ID for a new incarnation of a client
owner that currently has an old incarnation with state and an
unexpired lease, the server is allowed to dispose of the state of the
previous incarnation of the client owner if one of the following is
true:

o  The principal that created the client ID for the client owner is
   the same as the principal that is issuing the EXCHANGE_ID. Note
   that if the client ID was created with SP4_MACH_CRED state
   protection (Section 18.35), the principal MUST be based on
   RPCSEC_GSS authentication, the RPCSEC_GSS service used MUST be
   integrity or privacy, and the same GSS mechanism and principal
   must be used as that used when the client ID was created.

o  The client ID was established with SP4_SSV protection
   (Section 18.35, Section 2.10.7.3) and the client sends the
   EXCHANGE_ID with the security flavor set to RPCSEC_GSS using the
   GSS SSV mechanism (Section 2.10.8).

o  The client ID was established with SP4_SSV protection, and under
   the conditions described herein, the EXCHANGE_ID was sent with
   SP4_MACH_CRED state protection. Because the SSV might not persist
   across client and server restart, and because the first time a
   client sends EXCHANGE_ID to a server it does not have an SSV, the
   client MAY send the subsequent EXCHANGE_ID without an SSV
   RPCSEC_GSS handle. Instead, as with SP4_MACH_CRED protection, the
   principal MUST be based on RPCSEC_GSS authentication, the
   RPCSEC_GSS service used MUST be integrity or privacy, and the same
   GSS mechanism and principal MUST be used as that used when the
   client ID was created.

If none of the above situations apply, the server MUST return
NFS4ERR_CLID_INUSE.
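
The conflict rules of this section can be read as a single decision
procedure. The following is a non-normative sketch under stated
assumptions: the Credential and ClientRecord types, their field names,
the string constants, and the return values "ALLOW" and
"MAY_DISPOSE_OLD_STATE" are hypothetical and exist only for
illustration. The sketch assumes the EXCHANGE_ID is for a new
incarnation of the client owner, and it simplifies principal comparison
to string equality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Credential:
    flavor: str              # e.g. "AUTH_SYS" or "RPCSEC_GSS"
    principal: str
    gss_mechanism: str = ""  # e.g. "krb5" or "SSV"; empty if not RPCSEC_GSS
    gss_service: str = ""    # "none", "integrity", or "privacy"


@dataclass
class ClientRecord:
    has_state: bool
    lease_expired: bool
    protection: str          # "SP4_NONE", "SP4_MACH_CRED", or "SP4_SSV"
    creator: Credential      # credential that created the client ID


def _machine_cred_matches(creator: Credential, cred: Credential) -> bool:
    """RPCSEC_GSS, integrity or privacy, same mechanism and principal."""
    return (cred.flavor == "RPCSEC_GSS"
            and cred.gss_service in ("integrity", "privacy")
            and cred.gss_mechanism == creator.gss_mechanism
            and cred.principal == creator.principal)


def resolve_exchange_id(record: Optional[ClientRecord], cred: Credential,
                        requested_protection: str) -> str:
    # No state, or state whose lease has expired: MUST allow.
    if record is None or not record.has_state or record.lease_expired:
        return "ALLOW"
    # First bullet: same principal, with stricter checks for SP4_MACH_CRED.
    if record.protection == "SP4_MACH_CRED":
        if _machine_cred_matches(record.creator, cred):
            return "MAY_DISPOSE_OLD_STATE"
    elif cred.principal == record.creator.principal:
        return "MAY_DISPOSE_OLD_STATE"
    if record.protection == "SP4_SSV":
        # Second bullet: RPCSEC_GSS using the GSS SSV mechanism.
        if cred.flavor == "RPCSEC_GSS" and cred.gss_mechanism == "SSV":
            return "MAY_DISPOSE_OLD_STATE"
        # Third bullet: SSV lost, request falls back to SP4_MACH_CRED.
        if (requested_protection == "SP4_MACH_CRED"
                and _machine_cred_matches(record.creator, cred)):
            return "MAY_DISPOSE_OLD_STATE"
    return "NFS4ERR_CLID_INUSE"
```

Note that the sketch returns "MAY_DISPOSE_OLD_STATE" rather than
disposing of anything itself, mirroring the wording that the server "is
allowed to" dispose of the previous incarnation's state.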

If the server accepts the principal and co_ownerid as matching that
which created the client ID, and the co_verifier in the EXCHANGE_ID
differs from the co_verifier used when the client ID was created,
then after the server receives a CREATE_SESSION that confirms the
client ID, the server deletes state. If the co_verifier values are
the same (e.g., the client is either updating properties of the
client ID (Section 18.35) or attempting trunking (Section 2.10.4)),
the server MUST NOT delete state.

2.5. Server Owners

The Server Owner is similar to a Client Owner (Section 2.4), but
unlike the Client Owner, there is no shorthand server ID. The Server
Owner is defined in the following data type:

   struct server_owner4 {
           uint64_t        so_minor_id;
           opaque          so_major_id<NFS4_OPAQUE_LIMIT>;
   };

The Server Owner is returned from EXCHANGE_ID. When the so_major_id
fields are the same in two EXCHANGE_ID results, the connections over
which each EXCHANGE_ID was sent can be assumed to address the same
Server (as defined in Section 1.5). If the so_minor_id fields are
also the same, then not only do both connections connect to the same
server, but the session can be shared across both connections. The
reader is cautioned that multiple servers may deliberately or
accidentally claim to have the same so_major_id or so_major_id/
so_minor_id; the reader should examine Section 2.10.4 and
Section 18.35 in order to avoid acting on falsely matching Server
Owner values.

The considerations for generating a so_major_id are similar to those
for generating a co_ownerid string (see Section 2.4).
The consequences of two servers generating conflicting so_major_id
values are less dire than they are for co_ownerid conflicts because
the client can use RPCSEC_GSS to compare the authenticity of each
server (see Section 2.10.4).

2.6. Security Service Negotiation

With the NFSv4.1 server potentially offering multiple security
mechanisms, the client needs a method to determine or negotiate which
mechanism is to be used for its communication with the server. The
NFS server may have multiple points within its file system namespace
that are available for use by NFS clients. These points can be
considered security policy boundaries, and in some NFS
implementations are tied to NFS export points. In turn, the NFS
server may be configured such that each of these security policy
boundaries may have different or multiple security mechanisms in use.

The security negotiation between client and server must be done over
a secure channel to eliminate the possibility of a third party
intercepting the negotiation sequence and forcing the client and
server to choose a lower level of security than required or desired.
See Section 21 for further discussion.

2.6.1. NFSv4.1 Security Tuples

An NFS server can assign one or more "security tuples" to each
security policy boundary in its namespace. Each security tuple
consists of a security flavor (see Section 2.2.1.1), and if the
flavor is RPCSEC_GSS, a GSS-API mechanism OID, a GSS-API quality of
protection, and an RPCSEC_GSS service.

2.6.2. SECINFO and SECINFO_NO_NAME

The SECINFO and SECINFO_NO_NAME operations allow the client to
determine, on a per filehandle basis, what security tuple is to be
used for server access.
In general, the client will not have to use either operation except
during initial communication with the server or when the client
crosses security policy boundaries at the server. However, the
server's policies may also change at any time and force the client to
negotiate a new security tuple.

Where the use of different security tuples would affect the type of
access that would be allowed if a request was sent over the same
connection used for the SECINFO or SECINFO_NO_NAME operation (e.g.,
read-only vs. read-write access), security tuples that allow greater
access should be presented first. Where the general level of access
is the same and different security flavors limit the range of
principals whose privileges are recognized (e.g., allowing or
disallowing root access), flavors supporting the greatest range of
principals should be listed first.

2.6.3. Security Error

Based on the assumption that each NFSv4.1 client and server must
support a minimum set of security (i.e., LIPKEY, SPKM-3, and
Kerberos-V5 all under RPCSEC_GSS), the NFS client will initiate file
access to the server with one of the minimal security tuples. During
communication with the server, the client may receive an NFS error of
NFS4ERR_WRONGSEC. This error allows the server to notify the client
that the security tuple currently being used contravenes the server's
security policy. The client is then responsible for determining (see
Section 2.6.3.1) what security tuples are available at the server and
choosing one which is appropriate for the client.

2.6.3.1. Using NFS4ERR_WRONGSEC, SECINFO, and SECINFO_NO_NAME

This section explains the mechanics of NFSv4.1 security negotiation.

2.6.3.1.1. Put Filehandle Operations

The term "put filehandle operation" refers to PUTROOTFH, PUTPUBFH,
PUTFH, and RESTOREFH.
Each of the subsections herein describes how the server handles a
subseries of operations that starts with a put filehandle operation.

2.6.3.1.1.1. Put Filehandle Operation + SAVEFH

The client is saving a filehandle for a future RESTOREFH, LINK, or
RENAME. SAVEFH MUST NOT return NFS4ERR_WRONGSEC. To determine
whether the put filehandle operation returns NFS4ERR_WRONGSEC or not,
the server implementation pretends SAVEFH is not in the series of
operations and examines which of the situations described in the
other subsections of Section 2.6.3.1.1 apply.

2.6.3.1.1.2. Two or More Put Filehandle Operations

For a series of N put filehandle operations, the server MUST NOT
return NFS4ERR_WRONGSEC to the first N-1 put filehandle operations.
The N'th put filehandle operation is handled as if it is the first in
a subseries of operations. For example, if the server received PUTFH,
PUTROOTFH, LOOKUP, then the PUTFH is ignored for NFS4ERR_WRONGSEC
purposes, and the PUTROOTFH, LOOKUP subseries is processed according
to Section 2.6.3.1.1.3.

2.6.3.1.1.3. Put Filehandle Operation + LOOKUP (or OPEN of an Existing
Name)

This situation also applies to a put filehandle operation followed by
a LOOKUP or an OPEN operation that specifies an existing component
name.

In this situation, the client is potentially crossing a security
policy boundary, and the set of security tuples the parent directory
supports may differ from those of the child. The server
implementation may decide whether to impose any restrictions on
security policy administration.
There are at least three approaches (sec_policy_child is the tuple
set of the child export, sec_policy_parent is that of the parent).

a) sec_policy_child <= sec_policy_parent (<= for subset). This
   means that the set of security tuples specified on the security
   policy of a child directory is always a subset of that of its
   parent directory.

b) sec_policy_child ^ sec_policy_parent != {} (^ for intersection,
   {} for the empty set). This means that the security tuples
   specified on the security policy of a child directory always have
   a non-empty intersection with that of the parent.

c) sec_policy_child ^ sec_policy_parent == {}. This means that
   the set of tuples specified on the security policy of a child
   directory may not intersect with that of the parent. In other
   words, there are no restrictions on how the system administrator
   may set up these tuples.

In order for a server to support approaches (b) (for the case when a
client chooses a flavor that is not a member of sec_policy_parent)
and (c), the put filehandle operation cannot return NFS4ERR_WRONGSEC
when there is a security tuple mismatch. Instead, it should be
returned from the LOOKUP (or OPEN by existing component name) that
follows.

Since the above guideline does not contradict approach (a), it should
be followed in general. Even if approach (a) is implemented, it is
possible for the security tuple used to be acceptable for the target
of LOOKUP but not for the filehandles used in the put filehandle
operation. The put filehandle operation could be a PUTROOTFH or
PUTPUBFH, where the client cannot know the security tuples for the
root or public filehandle.
Or the security policy for the filehandle used by the put filehandle
operation could have changed since the time the filehandle was
obtained.

Therefore, an NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC in
response to the put filehandle operation if the operation is
immediately followed by a LOOKUP or an OPEN by component name.

2.6.3.1.1.4. Put Filehandle Operation + LOOKUPP

Since SECINFO only works its way down, there is no way LOOKUPP can
return NFS4ERR_WRONGSEC without SECINFO_NO_NAME. SECINFO_NO_NAME
solves this issue via style SECINFO_STYLE4_PARENT, which works in the
opposite direction from SECINFO. As with Section 2.6.3.1.1.3, a put
filehandle operation that is followed by a LOOKUPP MUST NOT return
NFS4ERR_WRONGSEC. If the server does not support SECINFO_NO_NAME,
the client's only recourse is to send the put filehandle operation,
LOOKUPP, GETFH sequence of operations with every security tuple it
supports.

Regardless of whether SECINFO_NO_NAME is supported, an NFSv4.1 server
MUST NOT return NFS4ERR_WRONGSEC in response to a put filehandle
operation if the operation is immediately followed by a LOOKUPP.

2.6.3.1.1.5. Put Filehandle Operation + SECINFO/SECINFO_NO_NAME

A security sensitive client is allowed to choose a strong security
tuple when querying a server to determine a file object's permitted
security tuples. The security tuple chosen by the client does not
have to be included in the tuple list of the security policy of
either the parent directory indicated in the put filehandle
operation, or the child file object indicated in SECINFO (or any
parent directory indicated in SECINFO_NO_NAME). Of course, the server
has to be configured for whatever security tuple the client selects;
otherwise, the request will fail at the RPC layer with an appropriate
authentication error.

In theory, there is no connection between the security flavor used by
SECINFO or SECINFO_NO_NAME and those supported by the security
policy. But in practice, the client may start looking for strong
flavors from those supported by the security policy, followed by
those in the REQUIRED set.

The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC to a put
filehandle operation that is immediately followed by SECINFO or
SECINFO_NO_NAME. The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC
from SECINFO or SECINFO_NO_NAME.

2.6.3.1.1.6. Put Filehandle Operation + Nothing

The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC.

2.6.3.1.1.7. Put Filehandle Operation + Anything Else

"Anything Else" includes OPEN by filehandle.

The security policy enforcement applies to the filehandle specified
in the put filehandle operation. Therefore, the put filehandle
operation must return NFS4ERR_WRONGSEC when there is a security tuple
mismatch. This avoids the complexity of adding NFS4ERR_WRONGSEC as an
allowable error to every other operation.

A COMPOUND containing the series put filehandle operation +
SECINFO_NO_NAME (style SECINFO_STYLE4_CURRENT_FH) is an efficient way
for the client to recover from NFS4ERR_WRONGSEC.

The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC to any operation
other than a put filehandle operation, LOOKUP, LOOKUPP, and OPEN (by
component name).

2.6.3.1.1.8. Operations after SECINFO and SECINFO_NO_NAME

Suppose a client sends a COMPOUND procedure containing the series
SEQUENCE, PUTFH, SECINFO_NO_NAME, READ, and suppose the security
tuple used does not match that required for the target file. By rule
(see Section 2.6.3.1.1.5), neither PUTFH nor SECINFO_NO_NAME can
return NFS4ERR_WRONGSEC. By rule (see Section 2.6.3.1.1.7), READ
cannot return NFS4ERR_WRONGSEC. The issue is resolved by the fact
that SECINFO and SECINFO_NO_NAME consume the current filehandle (note
that this is a change from NFSv4.0). This leaves no current
filehandle for READ to use, and READ returns NFS4ERR_NOFILEHANDLE.

2.6.3.1.2. LINK and RENAME

The LINK and RENAME operations use both the current and saved
filehandles. When the current filehandle is injected into a series
of operations via a put filehandle operation, the server MUST return
NFS4ERR_WRONGSEC, per Section 2.6.3.1.1. LINK and RENAME MAY return
NFS4ERR_WRONGSEC if the security policy of the saved filehandle
rejects the security flavor used in the COMPOUND request's
credentials. If the server does so, then if there is no intersection
between the security policies of saved and current filehandles, this
means it will be impossible for the client to perform the intended
LINK or RENAME operation.

For example, suppose the client sends this COMPOUND request:
SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH, RENAME "c" "d", where
filehandles bFH and aFH refer to different directories. Suppose no
common security tuple exists between the security policies of aFH and
bFH. If the client sends the request using credentials acceptable to
bFH's security policy but not aFH's policy, then the PUTFH aFH
operation will fail with NFS4ERR_WRONGSEC.
After a SECINFO_NO_NAME request, the client sends SEQUENCE, PUTFH
bFH, SAVEFH, PUTFH aFH, RENAME "c" "d", using credentials acceptable
to aFH's security policy, but not bFH's policy. The server returns
NFS4ERR_WRONGSEC on the RENAME operation.

To prevent a client from an endless sequence of a request containing
LINK or RENAME, followed by a request containing SECINFO_NO_NAME, the
server MUST detect when the security policies of the current and
saved filehandles have no mutually acceptable security tuple, and
MUST NOT return NFS4ERR_WRONGSEC in that situation. Instead the
server MUST return NFS4ERR_XDEV.

Thus while a server MAY return NFS4ERR_WRONGSEC from LINK and RENAME,
the server implementor may reasonably decide the consequences are not
worth the security benefits, and so allow the security policy of the
current filehandle to override that of the saved filehandle.

2.7. Minor Versioning

To address the requirement of an NFS protocol that can evolve as the
need arises, the NFSv4.1 protocol contains the rules and framework to
allow for future minor changes or versioning.

The base assumption with respect to minor versioning is that any
future accepted minor version must follow the IETF process and be
documented in a standards track RFC. Therefore, each minor version
number will correspond to one or more new RFCs. Minor version zero
of the NFSv4 protocol is represented by [20], and minor version one
is represented by this document [[Comment.1: RFC Editor: change
"document" to "RFC" when we publish]]. The COMPOUND and CB_COMPOUND
procedures support the encoding of the minor version being requested
by the client.
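
As a rough, non-normative illustration of how a receiver might act on
the minor version carried in a COMPOUND request, consider the sketch
below. The function name check_minor_version and the
SUPPORTED_MINOR_VERSIONS set are assumptions for illustration. The
error NFS4ERR_MINOR_VERS_MISMATCH and its value are taken from the
protocol's list of error codes, not from this section.

```python
NFS4_OK = 0
NFS4ERR_MINOR_VERS_MISMATCH = 10021

# Hypothetical server that implements minor versions 0 and 1.
SUPPORTED_MINOR_VERSIONS = frozenset({0, 1})


def check_minor_version(minorversion: int) -> int:
    """Return the status for the minor version named in a COMPOUND request."""
    if minorversion in SUPPORTED_MINOR_VERSIONS:
        return NFS4_OK
    # The requested minor version is not one this server implements.
    return NFS4ERR_MINOR_VERS_MISMATCH
```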

The following items represent the basic rules for the development of
minor versions. Note that a future minor version may decide to
modify or add to the following rules as part of the minor version
definition.

1.  Procedures are not added or deleted

    To maintain the general RPC model, NFSv4 minor versions will not
    add to or delete procedures from the NFS program.

2.  Minor versions may add operations to the COMPOUND and
    CB_COMPOUND procedures.

    The addition of operations to the COMPOUND and CB_COMPOUND
    procedures does not affect the RPC model.

    *  Minor versions may append attributes to the bitmap4 that
       represents sets of attributes and the fattr4 that represents
       sets of attribute values.

       This allows for the expansion of the attribute model to allow
       for future growth or adaptation.

    *  Minor version X must append any new attributes after the last
       documented attribute.

       Since attribute results are specified as an opaque array of
       per-attribute XDR encoded results, the complexity of adding
       new attributes in the midst of the current definitions would
       be too burdensome.

3.  Minor versions must not modify the structure of an existing
    operation's arguments or results.

    Again, the complexity of handling multiple structure definitions
    for a single operation is too burdensome. New operations should
    be added instead of modifying existing structures for a minor
    version.

    This rule does not preclude the following adaptations in a minor
    version.

    *  adding bits to flag fields such as new attributes to
       GETATTR's bitmap4 data type and providing corresponding
       variants of opaque arrays, such as a notify4 used together
       with such bitmaps.

    *  adding bits to existing attributes like ACLs that have flag
       words

    *  extending enumerated types (including NFS4ERR_*) with new
       values

    *  adding cases to a switched union

4.  Minor versions may not modify the structure of existing
    attributes.

5.  Minor versions may not delete operations.

    This prevents the potential reuse of a particular operation
    "slot" in a future minor version.

6.  Minor versions may not delete attributes.

7.  Minor versions may not delete flag bits or enumeration values.

8.  Minor versions may declare an operation MUST NOT be implemented.

    Specifying that an operation MUST NOT be implemented is
    equivalent to obsoleting an operation. For the client, it means
    that the operation should not be sent to the server. For the
    server, an NFS error can be returned as opposed to "dropping" the
    request as an XDR decode error. This approach allows for the
    obsolescence of an operation while maintaining its structure so
    that a future minor version can reintroduce the operation.

    1.  Minor versions may declare an attribute MUST NOT be
        implemented.

    2.  Minor versions may declare a flag bit or enumeration value
        MUST NOT be implemented.

9.  Minor versions may downgrade features from REQUIRED to
    RECOMMENDED, or RECOMMENDED to OPTIONAL.

10. Minor versions may upgrade features from OPTIONAL to RECOMMENDED
    or RECOMMENDED to REQUIRED.

11. A client and server that support minor version X should support
    minor versions 0 (zero) through X-1 as well.

12. Except for infrastructural changes, no new features may be
    introduced as REQUIRED in a minor version.
2081 2082 This rule allows for the introduction of new functionality and 2083 forces the use of implementation experience before designating a 2084 feature as REQUIRED. On the other hand, some classes of 2085 features are infrastructural and have broad effects. Allowing 2086 such features to not be REQUIRED complicates implementation of 2087 the minor version. 2088 2089 13. A client MUST NOT attempt to use a stateid, filehandle, or 2090 similar returned object from the COMPOUND procedure with minor 2091 version X for another COMPOUND procedure with minor version Y, 2092 where X != Y. 2093 2094 2.8. Non-RPC-based Security Services 2095 2096 As described in Section 2.2.1.1.1.1, NFSv4.1 relies on RPC for 2097 identification, authentication, integrity, and privacy. NFSv4.1 2098 itself provides or enables additional security services as described 2099 in the next several subsections. 2100 2101 2.8.1. Authorization 2102 2103 Authorization to access a file object via an NFSv4.1 operation is 2104 ultimately determined by the NFSv4.1 server. A client can 2105 predetermine its access to a file object via the OPEN (Section 18.16) 2106 and the ACCESS (Section 18.1) operations. 2107 2108 Principals with appropriate access rights can modify the 2109 authorization on a file object via the SETATTR (Section 18.30) 2110 operation. Attributes that affect access rights include: mode, 2111 owner, owner_group, acl, dacl, and sacl. See Section 5. 2112 2113 2.8.2. Auditing 2114 2115 NFSv4.1 provides auditing on a per file object basis, via the acl and 2116 sacl attributes as described in Section 6. It is outside the scope 2117 of this specification to specify audit log formats or management 2118 policies. 2119 2120 2.8.3. Intrusion Detection 2121 2122 NFSv4.1 provides alarm control on a per file object basis, via the 2123 acl and sacl attributes as described in Section 6. Alarms may serve 2124 2125 2126 2127 Shepler, et al. 
   as the basis for intrusion detection. It is outside the scope of
   this specification to specify heuristics for detecting intrusion via
   alarms.

2.9.  Transport Layers

2.9.1.  REQUIRED and RECOMMENDED Properties of Transports

   NFSv4.1 works over RDMA and non-RDMA-based transports with the
   following attributes:

   o  The transport supports reliable delivery of data, which NFSv4.1
      requires but neither NFSv4.1 nor RPC has facilities for ensuring.
      [23]

   o  The transport delivers data in the order it was sent. Ordered
      delivery simplifies detection of transmit errors, and simplifies
      the sending of arbitrary sized requests and responses, via the
      record marking protocol [3].

   Where an NFSv4.1 implementation supports operation over the IP
   network protocol, any transport used between NFS and IP MUST be among
   the IETF-approved congestion control transport protocols. At the
   time this document was written, the only two transports that had the
   above attributes were TCP and SCTP. To enhance the possibilities for
   interoperability, an NFSv4.1 implementation MUST support operation
   over the TCP transport protocol.

   Even if NFSv4.1 is used over a non-IP network protocol, it is
   RECOMMENDED that the transport support congestion control.

   It is permissible for a connectionless transport to be used under
   NFSv4.1; however, reliable and in-order delivery of data combined
   with congestion control by the connectionless transport is REQUIRED.
   NFSv4.1 assumes that a client transport address and server transport
   address used to send data over a transport together constitute a
   connection, even if the underlying transport eschews the concept of a
   connection.

2.9.2.  Client and Server Transport Behavior

   If a connection-oriented transport (e.g. TCP) is used, the client
   and server SHOULD use long lived connections for at least three
   reasons:

   1.  This will prevent the weakening of the transport's congestion
       control mechanisms via short lived connections.

   2.  This will improve performance for the WAN environment by
       eliminating the need for connection setup handshakes.

   3.  The NFSv4.1 callback model differs from NFSv4.0, and requires the
       client and server to maintain a client-created backchannel (see
       Section 2.10.3.1) for the server to use.

   In order to reduce congestion, if a connection-oriented transport is
   used, and the request is not the NULL procedure,

   o  A requester MUST NOT retry a request unless the connection the
      request was sent over was lost before the reply was received.

   o  A replier MUST NOT silently drop a request, even if the request is
      a retry. (The silent drop behavior of RPCSEC_GSS [4] does not
      apply because this behavior happens at the RPCSEC_GSS layer, a
      lower layer in the request processing). Instead, the replier
      SHOULD return an appropriate error (see Section 2.10.5.1) or it
      MAY disconnect the connection.

   When sending a reply, the replier MUST send the reply to the same
   full network address (e.g. if using an IP-based transport, the source
   port of the requester is part of the full network address) that the
   requester sent the request from. If using a connection-oriented
   transport, replies MUST be sent on the same connection the request
   was received from.

   If a connection is dropped after the replier receives the request but
   before the replier sends the reply, the replier might have a pending
   reply.
   If a connection is established with the same source and
   destination full network address as the dropped connection, then the
   replier MUST NOT send the reply until the client retries the request.
   The reason for this prohibition is that the client MAY retry a
   request over a different connection than is associated with the
   session.

   When using RDMA transports there are other reasons for not tolerating
   retries over the same connection:

   o  RDMA transports use "credits" to enforce flow control, where a
      credit is a right granted to a peer to transmit a message. If one
      peer were to retransmit a request (or reply), it would consume an
      additional credit. If the replier retransmitted a reply, it would
      certainly result in an RDMA connection loss, since the requester
      would typically only post a single receive buffer for each
      request. If the requester retransmitted a request, the additional
      credit consumed on the server might lead to RDMA connection
      failure unless the client accounted for it and decreased its
      available credit, leading to wasted resources.

   o  RDMA credits present a new issue to the reply cache in NFSv4.1.
      The reply cache may be used when a connection within a session is
      lost, such as after the client reconnects. Credit information is
      a dynamic property of the RDMA connection, and stale values must
      not be replayed from the cache. This implies that the reply cache
      contents must not be blindly used when replies are sent from it,
      and credit information appropriate to the channel must be
      refreshed by the RPC layer.

   In addition, as described in Section 2.10.5.2, while a session is
   active, the NFSv4.1 requester MUST NOT stop waiting for a reply.

2.9.3.  Ports

   Historically, NFSv3 servers have listened over TCP port 2049. The
   registered port 2049 [24] for the NFS protocol should be the default
   configuration. NFSv4.1 clients SHOULD NOT use the RPC binding
   protocols as described in [25].

2.10.  Session

2.10.1.  Motivation and Overview

   Previous versions and minor versions of NFS have suffered from the
   following:

   o  Lack of support for Exactly Once Semantics (EOS). This includes
      lack of support for EOS through server failure and recovery.

   o  Limited callback support, including no support for sending
      callbacks through firewalls, and races between replies to normal
      requests and callbacks.

   o  Limited trunking over multiple network paths.

   o  Requiring machine credentials for fully secure operation.

   Through the introduction of a session, NFSv4.1 addresses the above
   shortfalls with practical solutions:

   o  EOS is enabled by a reply cache with a bounded size, making it
      feasible to keep the cache in persistent storage and enable EOS
      through server failure and recovery. One reason that previous
      revisions of NFS did not support EOS was that some EOS
      approaches often limited parallelism. As will be explained in
      Section 2.10.5, NFSv4.1 supports both EOS and unlimited
      parallelism.

   o  The NFSv4.1 client (defined in Section 1.5, Paragraph 2) creates
      transport connections and provides them to the server to use for
      sending callback requests, thus solving the firewall issue
      (Section 18.34). Races between responses from client requests,
      and callbacks caused by the requests are detected via the
      session's sequencing properties which are a consequence of EOS
      (Section 2.10.5.3).

   o  The NFSv4.1 client can add an arbitrary number of connections to
      the session, and thus provide trunking (Section 2.10.4).

   o  The NFSv4.1 client and server produce a session key independent
      of client and server machine credentials which can be used to
      compute a digest for protecting critical session management
      operations (Section 2.10.7.3).

   o  The NFSv4.1 client can also create secure RPCSEC_GSS contexts for
      use by the session's backchannel that do not require the server to
      authenticate to a client machine principal (Section 2.10.7.2).

   A session is a dynamically created, long-lived server object created
   by a client, used over time from one or more transport connections.
   Its function is to maintain the server's state relative to the
   connection(s) belonging to a client instance. This state is entirely
   independent of the connection itself, and indeed the state exists
   whether the connection exists or not. A client may have one or more
   sessions associated with it so that client-associated state may be
   accessed using any of the sessions associated with that client's
   client ID, when connections are associated with those sessions. When
   no connections are associated with any of a client ID's sessions for
   an extended time, such objects as locks, opens, delegations, layouts,
   etc. are subject to expiration. The session serves as an object
   representing a means of access by a client to the associated client
   state on the server, independent of the physical means of access to
   that state.

   A single client may create multiple sessions. A single session MUST
   NOT serve multiple clients.

2.10.2.  NFSv4 Integration

   Sessions are part of NFSv4.1 and not NFSv4.0. Normally, a major
   infrastructure change such as sessions would require a new major
   version number to an ONC RPC program like NFS.
   However, because
   NFSv4 encapsulates its functionality in a single procedure, COMPOUND,
   and because COMPOUND can support an arbitrary number of operations,
   sessions have been added to NFSv4.1 with little difficulty. COMPOUND
   includes a minor version number field, and for NFSv4.1 this minor
   version is set to 1. When the NFSv4 server processes a COMPOUND with
   the minor version set to 1, it expects a different set of operations
   than it does for NFSv4.0. NFSv4.1 defines the SEQUENCE operation,
   which is required for every COMPOUND that operates over an
   established session, with the exception of some session
   administration operations, such as DESTROY_SESSION (Section 18.37).

2.10.2.1.  SEQUENCE and CB_SEQUENCE

   In NFSv4.1, when the SEQUENCE operation is present, it MUST be the
   first operation in the COMPOUND procedure. The primary purpose of
   SEQUENCE is to carry the session identifier. The session identifier
   associates all other operations in the COMPOUND procedure with a
   particular session. SEQUENCE also contains required information for
   maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1
   COMPOUND requests thus have the form:

      +-----+--------------+-----------+------------+-----------+----
      | tag | minorversion | numops    |SEQUENCE op | op + args | ...
      |     |   (== 1)     | (limited) |  + args    |           |
      +-----+--------------+-----------+------------+-----------+----

   and the replies have the form:

      +------------+-----+--------+-------------------------------+--//
      |last status | tag | numres |status + SEQUENCE op + results |  //
      +------------+-----+--------+-------------------------------+--//
              //-----------------------+----
              // status + op + results | ...
              //-----------------------+----

   A CB_COMPOUND procedure request and reply have a similar form to
   COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE
   operation. CB_COMPOUND also has an additional field called
   "callback_ident", which is superfluous in NFSv4.1 and MUST be ignored
   by the client. CB_SEQUENCE has the same information as SEQUENCE, and
   also includes other information needed to resolve callback races
   (Section 2.10.5.3).

2.10.2.2.  Client ID and Session Association

   Each client ID (Section 2.4) can have zero or more active sessions.
   A client ID and associated session are required to perform file
   access in NFSv4.1. Each time a session is used (whether by a client
   sending a request to the server, or the client replying to a callback
   request from the server), the state leased to its associated client
   ID is automatically renewed.

   State such as share reservations, locks, delegations, and layouts
   (Section 1.6.4) is tied to the client ID. Client state is not tied
   to any individual session. Successive state changing operations from
   a given state owner MAY go over different sessions, provided the
   session is associated with the same client ID. A