file concept
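File: a named collection of related information recorded on secondary storage; from the user's perspective it is the smallest allotment of logical secondary storage.
Files commonly represent programs and data.
A file's contents may be free-form or rigidly formatted.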
file attributes
- name: human-readable
- identifier: a unique number, non-human-readable name
- type
- location
- size
- protection
- time, date and user identification
Information about files is kept in the directory structure, which also resides on secondary storage.
A directory entry consists of the file’s name and its unique id. The id in turn locates the other file attributes.
file operations
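Six basic file operations:
- create
- write: the system keeps a write pointer to the location of the next write
- read: the system keeps a read pointer to the location of the next read; read and write normally share a single per-process current-file-position pointer
- reposition: a file seek; need not involve any actual I/O
- delete: release the file space and erase the directory entry
- truncate: reset the length to 0 and release the file space; only the contents are erased, the other attributes are kept
These basic operations can be combined to perform other operations.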
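The six operations map fairly directly onto low-level calls in most systems. The sketch below uses Python's `os` module to walk through them on a scratch file; the file name `notes.txt` and the helper names `demo_basic_operations` and `copy_file` are made up for illustration. `copy_file` shows one composite operation built from create, read and write.

```python
import os

def demo_basic_operations(directory):
    """Run the basic operations on a scratch file and return what was observed."""
    path = os.path.join(directory, "notes.txt")  # hypothetical file name

    # create: O_CREAT adds a new directory entry (and opens the file)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # write: advances the current-file-position pointer by 11 bytes
        os.write(fd, b"hello world")

        # reposition: move the pointer to offset 6; no data is transferred
        os.lseek(fd, 6, os.SEEK_SET)

        # read: starts at the current position
        tail = os.read(fd, 5)

        # truncate: the length becomes 0, the directory entry stays
        os.ftruncate(fd, 0)
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)

    # delete: erase the directory entry and release the space
    os.unlink(path)
    return tail, size_after_truncate, os.path.exists(path)

def copy_file(src, dst, chunk_size=4096):
    """A composite operation: copy = create + repeated read + write."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(chunk_size):
            fout.write(chunk)
```

Note that the seek only changes the position pointer kept for the open file, which is why repositioning needs no I/O.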
Most file operations involve searching the directory for the entry associated with the named file
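To avoid this repeated search, a file must be opened with the open() system call before it is first used.
The OS maintains an open-file table holding information about all open files. When a file operation is requested, the file is specified by an index into this table, so no directory search is needed. When the file is no longer in use, the process closes it and the OS removes its entry from the open-file table. The create and delete system calls work on closed files rather than open ones.
open() returns a pointer to an entry in the open-file table. All I/O operations then use this pointer rather than the actual file name. open() can also accept access-mode information, such as create, read-only, read-write, or append-only.
close(Fi) moves the content of entry Fi in the open-file table back to the directory structure on disk.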
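Several processes may open the same file at the same time, so the OS uses two levels of internal tables.
- Process open-file table, a per-process table: tracks the files this process has open; each entry points to an entry in the system-wide open-file table
- System open-file table, a system-wide table: holds process-independent information (such as the file's location on disk, access dates, and file size)
Once one process has opened a file, the system open-file table contains an entry for it. When another process calls open() on the same file, a new entry is added to that process's open-file table, pointing to the existing system-wide entry.
Each file in the system-wide table has an open count recording how many processes have it open. When the open count reaches 0, the entry can be removed.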
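Each open file has the following information associated with it:
- file pointer: pointer to the last read/write location; unique to each process that has the file open
- file-open count
- disk location of the file
- access rights: each process opens a file in an access mode, which is stored in the per-process open-file table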
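The two-level table scheme can be sketched as a toy model. This is an illustration only, not how any particular kernel lays out its tables; the class names `SystemEntry`, `ProcessEntry` and `OpenFileTables`, and the `directory` dictionary standing in for the on-disk directory, are all assumptions.

```python
import itertools
from dataclasses import dataclass

@dataclass
class SystemEntry:
    """Process-independent information, shared by every opener."""
    name: str
    disk_location: int
    size: int
    open_count: int = 0

@dataclass
class ProcessEntry:
    """Per-process information for one open file."""
    system_entry: SystemEntry
    mode: str           # access mode requested in open()
    position: int = 0   # file pointer, private to this process

class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # name -> (disk_location, size)
        self.system_table = {}       # name -> SystemEntry
        self.process_tables = {}     # pid -> {descriptor: ProcessEntry}

    def open(self, pid, name, mode="r"):
        entry = self.system_table.get(name)
        if entry is None:
            # First opener: the directory is searched once, here.
            location, size = self.directory[name]
            entry = SystemEntry(name, location, size)
            self.system_table[name] = entry
        entry.open_count += 1
        table = self.process_tables.setdefault(pid, {})
        fd = next(i for i in itertools.count() if i not in table)
        table[fd] = ProcessEntry(entry, mode)
        return fd

    def close(self, pid, fd):
        entry = self.process_tables[pid].pop(fd).system_entry
        entry.open_count -= 1
        if entry.open_count == 0:
            # Last closer: the system-wide entry can be removed.
            del self.system_table[entry.name]
```

Two processes opening the same name end up with separate `ProcessEntry` objects (separate positions and modes) that share one `SystemEntry`, whose `open_count` decides when the system-wide entry disappears.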
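file locks
shared lock: several processes can acquire it concurrently
exclusive lock: only one process at a time can acquire it
Locking mechanisms:
mandatory: once a process acquires the lock, the OS prevents other processes from accessing the locked file; access is denied depending on locks held and requested (Windows).
advisory: processes can find the status of locks and decide what to do (UNIX).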
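On UNIX-like systems the shared/exclusive distinction and the advisory behaviour can be observed with `flock()`. The sketch below assumes a POSIX platform where Python's `fcntl` module is available; two independent `os.open()` calls stand in for two processes, and the helper names `try_lock` and `demo_locks` are made up.

```python
import fcntl
import os

def try_lock(fd, exclusive):
    """Try to take a lock without blocking; return True on success."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, kind | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False

def demo_locks(path):
    # flock() locks belong to the open file description, so two separate
    # opens of the same file conflict with each other like two processes.
    a = os.open(path, os.O_RDWR)
    b = os.open(path, os.O_RDWR)
    try:
        results = {}
        # Shared locks can be held by several openers at once.
        results["a_shared"] = try_lock(a, exclusive=False)
        results["b_shared"] = try_lock(b, exclusive=False)
        fcntl.flock(a, fcntl.LOCK_UN)
        fcntl.flock(b, fcntl.LOCK_UN)

        # An exclusive lock shuts out both shared and exclusive requests.
        results["a_exclusive"] = try_lock(a, exclusive=True)
        results["b_shared_while_a_exclusive"] = try_lock(b, exclusive=False)
        results["b_exclusive_while_a_exclusive"] = try_lock(b, exclusive=True)

        # Advisory: b never got the lock, yet the write is not prevented.
        results["b_bytes_written"] = os.write(b, b"x")
        return results
    finally:
        os.close(a)
        os.close(b)
```

The final write succeeds even though `b` holds no lock, which is exactly what "advisory" means: cooperating programs must check the lock themselves.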
file types
A common technique for implementing file types is to include the type in the file name, which is split into two parts: name and extension.
In some OSes, each file has a type and a creator attribute containing the name of the program that created it.
In UNIX, a magic number stored at the beginning of some files roughly indicates the type of the file.
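Both techniques are easy to try out. The sketch below splits a name into name and extension, and guesses a type from the leading bytes of a file. The signature table is a small hand-picked sample of well-known magic numbers, not a complete registry, and the function names are illustrative.

```python
import os

# A few well-known signatures (illustrative subset only).
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF executable"),
    (b"\x1f\x8b", "gzip data"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"#!", "script with interpreter line"),
]

def guess_type_from_bytes(header):
    """Return a rough type description based on the leading bytes."""
    for magic, description in MAGIC_NUMBERS:
        if header.startswith(magic):
            return description
    return "unknown"

def guess_type(path):
    """Read the first bytes of a file and look them up in the table."""
    with open(path, "rb") as f:
        return guess_type_from_bytes(f.read(16))

def split_name(filename):
    """Split 'report.pdf' into ('report', '.pdf')."""
    return os.path.splitext(filename)
```

The extension is only a hint supplied by whoever named the file, while the magic number comes from the contents, which is why the two can disagree.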
access methods
Criteria for File Organization: Rapid access, Ease of update, Economy of storage, Simple maintenance, Reliability
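sequential access
Based on a tape model of a file. It works on sequential-access devices as well as random-access ones.
direct access
The file is made up of fixed-length logical records. Based on a disk model of a file. Allows immediate access to large amounts of information; databases often use this type of file.
The block number supplied by the user is a relative block number.
Sequential access can easily be simulated on a direct-access file. The reverse is very inefficient.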
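With fixed-length records, record n starts at byte offset n × record size, so any record can be reached with one seek. The sketch below assumes a made-up 16-byte record layout (a 4-byte id plus a 12-byte name) and shows how sequential access is simulated by keeping a current position `cp` on top of direct access.

```python
import struct

# Hypothetical record layout: 4-byte unsigned id + 12-byte name = 16 bytes.
RECORD = struct.Struct("<I12s")

class DirectFile:
    def __init__(self, stream):
        self.stream = stream  # any seekable binary stream
        self.cp = 0           # current position, in records

    def read_record(self, n):
        """Direct access: jump straight to relative record number n."""
        self.stream.seek(n * RECORD.size)
        data = self.stream.read(RECORD.size)
        if len(data) < RECORD.size:
            raise IndexError(f"no record {n}")
        rec_id, name = RECORD.unpack(data)
        return rec_id, name.rstrip(b"\0").decode()

    def write_record(self, n, rec_id, name):
        self.stream.seek(n * RECORD.size)
        self.stream.write(RECORD.pack(rec_id, name.encode()))

    def read_next(self):
        """Sequential access simulated as: read record cp, then cp = cp + 1."""
        record = self.read_record(self.cp)
        self.cp += 1
        return record

    def reset(self):
        """Rewind, as on a tape."""
        self.cp = 0
```

Going the other way, reaching record n on a purely sequential file means reading through the n records before it, which is where the inefficiency comes from.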
other access methods
These generally involve building an index for the file.
directory structure
Sometimes several file systems need to be placed on one disk.
Disk can be subdivided into partitions.
Disk or partition can be used:
raw, without a file system, e.g. swap space
formatted with a file system
Volume: Entity containing file system
Each volume containing file system also tracks that file system’s info in device directory or volume table of contents.
Disks or partitions can be RAID protected against failure.
A directory can be viewed as a symbol table that translates file names into directory entries.
Both the directory structure and the files reside on disk.
Single-Level Directory
All files are contained in the same directory.
Every file must have a unique name.
Naming problem, grouping problem.
Two-Level Directory
A separate directory is created for each user.
A master file directory (MFD) and one user file directory (UFD) for each user.
File names only need to be unique within a UFD.
No grouping capability.
How is file sharing handled?
- Path name: defined by a user name and a file name.
- Special user directory: contains the system files.
- Search path: the sequence of directories searched.
Tree-Structured Directories
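The tree has a root directory, and every file in the system has a unique path name.
One bit in each entry defines the entry as a file (0) or as a subdirectory (1).
Grouping capability.
Each process has a current directory (working directory). The accounting file holds a pointer to, or the name of, the user's initial directory, which is copied to a local variable for this user.
Absolute path name: begins at the root.
Relative path name: begins at the current directory.
New subdirectories and files are created in the current directory.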
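The difference between the two kinds of path names comes down to where resolution starts. A minimal sketch using Python's `posixpath` module (UNIX-style paths; the function name `resolve` and the example directories are illustrative):

```python
import posixpath

def resolve(path, current_directory):
    """Turn an absolute or relative path name into a normalized absolute one."""
    if not posixpath.isabs(path):
        # Relative path name: interpreted starting at the current directory.
        path = posixpath.join(current_directory, path)
    # normpath collapses "." and ".." lexically; it does not consult the disk.
    return posixpath.normpath(path)
```

For example, with the current directory `/home/alice`, the relative name `notes/ch10.txt` resolves to `/home/alice/notes/ch10.txt`, while `/usr/bin/ls` is returned unchanged.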
Deleting a subdirectory
Delete only an empty directory (MS-DOS): a directory that is not empty cannot be deleted.
Delete the directory together with all the files and subdirectories it contains (UNIX, e.g. rm -r).
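Both policies are available from Python's standard library, which makes the contrast easy to see: `os.rmdir` refuses a non-empty directory, while `shutil.rmtree` removes the whole subtree. The wrapper names below are illustrative.

```python
import os
import shutil

def remove_if_empty(path):
    """First policy: succeed only when the directory is empty."""
    try:
        os.rmdir(path)
        return True
    except OSError:
        # Raised when the directory still contains entries.
        return False

def remove_tree(path):
    """Second policy: delete the directory and everything beneath it."""
    shutil.rmtree(path)
```

The second policy is more convenient and more dangerous, since one call can remove an entire directory structure.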
Acyclic-Graph Directories
A tree structure prohibits the sharing of files and directories; an acyclic graph allows it.
Note that this is sharing, not copying: any change made to a shared file is visible to the other users.
Ways of implementing shared files and subdirectories:
Create a new directory entry called a link (UNIX). A link is a pointer to another file or subdirectory. Resolving the link means following the pointer to locate the file.
Duplicate all information about shared files in all sharing directories. Problem: keeping the copies consistent.
Deletion: if the file is removed as soon as any user deletes it, dangling pointers are left behind.
For links: deleting a link need not affect the original file. Deleting the file entry leaves the links dangling.
For duplication: keep a file-reference list (backpointers) or a reference count for each file, so that all pointers can be deleted.
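The deletion behaviour can be observed on a POSIX file system, where hard links are counted in the inode (`st_nlink`) and symbolic links are plain pointers by name. The sketch assumes a platform and file system that support both `os.link` and `os.symlink`; the file names are made up.

```python
import os

def demo_links(directory):
    original = os.path.join(directory, "report.txt")    # hypothetical names
    hard = os.path.join(directory, "report-hard.txt")
    soft = os.path.join(directory, "report-soft.txt")

    with open(original, "w") as f:
        f.write("v1")
    os.link(original, hard)      # second directory entry for the same file
    os.symlink(original, soft)   # link that stores the path name

    count_before = os.stat(original).st_nlink

    # Sharing, not copying: an edit through one name is seen through the other.
    with open(hard, "a") as f:
        f.write("+edit")
    with open(original) as f:
        shared_view = f.read()

    # Delete the original entry: the data survives while the count is nonzero,
    # but the symbolic link is left dangling.
    os.unlink(original)
    count_after = os.stat(hard).st_nlink
    dangling = os.path.islink(soft) and not os.path.exists(soft)
    return count_before, shared_view, count_after, dangling
```

The reference count is what lets the system postpone freeing the file until the last hard link is gone, whereas nothing tracks the symbolic link, so it is simply left pointing at a name that no longer exists.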
General Graph Directory
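Cycles are allowed.
To avoid infinite loops, limit the number of directories accessed during a search.
When can a file be deleted? Because of cycles, the reference count may be nonzero even when the file or directory can no longer be referred to, since self-references may exist. Solution: garbage collection. The first pass traverses the entire file system, marking everything that can be accessed. The second pass collects everything that is not marked onto a list of free space.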
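The two passes can be sketched on a toy directory graph, represented here as a dictionary from each entry to the entries it links to. The representation and the function names are assumptions made for illustration.

```python
def reference_counts(links):
    """Count incoming links for every entry in the graph."""
    counts = {name: 0 for name in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts

def collect_garbage(links, root):
    """Return the entries that cannot be reached from the root."""
    # First pass: traverse from the root, marking everything accessible.
    marked = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in marked:
            continue  # already visited; this also keeps cycles from looping
        marked.add(node)
        stack.extend(links.get(node, []))
    # Second pass: everything unmarked goes onto the free list.
    return sorted(set(links) - marked)
```

In a graph where `orphan` and `loop` point at each other but neither is reachable from the root, both keep a reference count of 1, yet both end up on the free list.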
How to avoid cycles? Allow links only to files, not to subdirectories. Alternatively, run a cycle-detection algorithm every time a new link is added. A simpler approach is to bypass links during directory traversal.
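A check performed when a link is added might look like the following sketch: adding a link from `parent` to `target` closes a cycle exactly when `parent` is already reachable from `target`. The graph representation (a dictionary of child lists) and the function names are illustrative, and a real system would pay for this traversal on disk, which is why it is considered expensive.

```python
def would_create_cycle(children, parent, target):
    """True if a new link parent -> target would close a cycle."""
    stack = [target]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True   # parent is reachable from target
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return False

def add_link(children, parent, target):
    """Add the link only if the graph stays acyclic."""
    if would_create_cycle(children, parent, target):
        raise ValueError(f"link {parent} -> {target} would create a cycle")
    children.setdefault(parent, []).append(target)
```

Sharing a subdirectory from a second parent is accepted, while linking a directory to one of its own ancestors (or to itself) is rejected.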
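file system mounting
- The OS needs to know the device name and the mount point.
- The OS verifies that the device contains a valid file system (it asks the device driver to read the device directory and checks that the directory has the expected format).
- The OS notes in its directory structure that a file system is mounted at the given mount point.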
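The bookkeeping behind these steps can be sketched as a small mount table. This is a toy model, not a real kernel interface: the `has_valid_fs` flag stands in for the verification step, and the device names used with it are hypothetical. Path lookup picks the longest mount point that is a prefix of the path.

```python
import posixpath

class MountTable:
    def __init__(self):
        self.mounts = {}  # mount point -> device name

    def mount(self, device, mount_point, has_valid_fs=True):
        # Stand-in for reading the device directory and checking its format.
        if not has_valid_fs:
            raise ValueError(f"{device} does not contain a valid file system")
        # Record that a file system is mounted at this mount point.
        self.mounts[posixpath.normpath(mount_point)] = device

    def resolve(self, path):
        """Return (device, path inside that file system) for an absolute path."""
        path = posixpath.normpath(path)
        best = None
        for mount_point in self.mounts:
            prefix = mount_point.rstrip("/") + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise LookupError(f"no file system mounted for {path}")
        return self.mounts[best], "/" + path[len(best):].lstrip("/")
```

With a root file system on one device and another device mounted at `/home`, a lookup of `/home/alice/a.txt` is routed to the second device as `/alice/a.txt`.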
Systems impose semantics to clarify functionality; for example, a system may disallow a mount over a directory that contains files.
file sharing
Multiple users need to share files.
Sharing may be done through a protection scheme.
More file and directory attributes are needed: File / directory owner; File / directory user, access rights; File / directory user groups, access rights
User IDs identify users, allowing permissions and protections to be per-user. Group IDs allow users to be in groups, permitting group access rights.
remote file system
- Manual transfer of files between machines via programs such as ftp
- distributed file systems (DFS): remote directories are visible from the local machine; much tighter integration
- via the World Wide Web (WWW): a browser is needed; uses anonymous file exchange
Client-server model
allows clients to mount remote file systems from servers.
Client and user-on-client identification is insecure or complicated.
A client can be specified by a network name or another identifier, such as an IP address, but these can be spoofed or imitated.
Secure authentication via encrypted keys. Ensuring compatibility of the client and server.
NFS: standard UNIX client-server file sharing protocol
CIFS(Common Internet File System): standard Windows protocol, uses active directory
Standard operating system file calls are translated into remote calls.
Distributed Information Systems
Providing unified access to the information needed for remote computing.
DNS (domain name system) provides host-name-to-network-address translations for entire internet.
NIS (network information service), formerly known as yellow pages, centralizes storage of user names, host names, printer information, and the like.
LDAP (lightweight directory-access protocol), used by industry as a secure distributed naming mechanism. secure single sign-on
Failure Modes
Remote file systems add new failure modes, due to network failure and server failure.
Recovery from failure can involve state information about the status of each remote request.
Stateless protocols such as NFS V3 include all information in each request, allowing easy recovery but less security.
NFS V4 is stateful.
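Consistency Semantics
They describe the semantics of multiple users accessing a shared file simultaneously, and specify when modifications made by one user become visible to other users.
Because of disk I/O and network latency, algorithms as complex as those used for process synchronization are not suitable here.
- AFS (Andrew File System) implements complex remote file-sharing semantics (session semantics)
  The server records what its clients do.
  When a client changes a file, the server notifies other clients with a callback promise technique.
  Writes to an open file by one user are not immediately visible to other users who have the same file open. Once the file is closed, the changes are visible only in sessions that start later.
  Multiple users may read and write their own images of the file concurrently, without delay.
- UFS (Unix file system) implements UNIX semantics
  A file is associated with a single physical image that is accessed as an exclusive resource. Modifications are visible immediately. Contention for the image delays user processes.
- Immutable shared files: declared as shared by the creator; read-only, cannot be modified.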
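The difference between UNIX semantics and session semantics can be illustrated with two toy in-memory models. They are deliberately simplified (no callbacks, no locking, last close wins in the session model), and all class names are made up.

```python
class UnixSemanticsFile:
    """All openers share one image, so a write is visible immediately."""
    def __init__(self, data=""):
        self.data = data

    def open(self):
        return _UnixHandle(self)

class _UnixHandle:
    def __init__(self, file):
        self.file = file

    def read(self):
        return self.file.data

    def write(self, data):
        self.file.data = data   # goes straight to the single shared image

    def close(self):
        pass

class SessionSemanticsFile:
    """Each session works on its own image, published only at close."""
    def __init__(self, data=""):
        self.data = data

    def open(self):
        return _SessionHandle(self)

class _SessionHandle:
    def __init__(self, file):
        self.file = file
        self.image = file.data  # private copy taken when the session starts

    def read(self):
        return self.image

    def write(self, data):
        self.image = data       # invisible to other open sessions

    def close(self):
        self.file.data = self.image  # visible to sessions opened later
```

In the session model a reader that opened the file before the writer closed it keeps seeing the old contents, even after the close; only a session opened afterwards sees the change.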
protection
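Higher-level functions can be implemented by system programs that make lower-level system calls, so protection may be provided only at the lower level.
Mode of access: read, write, execute
Attach an access-control list (ACL) to each file and directory, listing each user name and the types of access allowed for that user.
Three classes of users:
- owner
- group
- universe (public): all other users in the system
Each class needs three bits (rwx), so 9 bits in total.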
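The 9-bit scheme is usually written as three octal digits, one per class. The sketch below checks a request against such a mode: it first selects the class the requesting user falls into, then tests the matching bit. The function name and the numeric user and group ids are illustrative, and special cases such as the superuser are ignored.

```python
import stat

def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Check one access type ('r', 'w' or 'x') against a 9-bit mode."""
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if uid == file_uid:
        shift = 6   # owner bits
    elif file_gid in gids:
        shift = 3   # group bits
    else:
        shift = 0   # universe bits
    return bool((mode >> shift) & bit)

def describe(mode):
    """Render the bits the way ls -l does, e.g. 0o754 -> '-rwxr-xr--'."""
    return stat.filemode(stat.S_IFREG | mode)
```

For mode `0o754` the owner gets rwx, group members get r-x, and everyone else gets r--. Exactly one class applies to a given user, so an owner is judged by the owner bits even if the group bits are more permissive.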
Last modified on 2020-02-28