1. Understanding UNIX Files
1.1. Open File Deletion
A file essentially consists of three parts:
- An inode that holds a list of data blocks and metadata (permissions, etc.)
- Data blocks
- An entry in a directory linking a filename with the inode
Remember, of course, that a directory is really just a special type of file. There are also two types of inode: an on-disk inode and an in-kernel inode. An on-disk inode is persistent and an in-kernel inode is transient, existing only in memory when the file is open.
What determines whether a file exists is whether there is an inode for it. The other two parts are optional: a file may have 0 data blocks (and thus be a zero-length file), or it may have no entry in any directory.
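To make the zero-length case concrete, here is a minimal sketch assuming a POSIX-like system and Python's standard library. The function name describe_empty_file is hypothetical and chosen only for illustration. It creates an empty file and reads back what the inode reports.

```python
import os
import tempfile


def describe_empty_file():
    """Create a zero-length file and return (inode number, size, link count)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "empty")
        # Creating the file allocates an inode and a directory entry,
        # but no data is ever written to it.
        with open(path, "w"):
            pass
        st = os.stat(path)
        return st.st_ino, st.st_size, st.st_nlink


if __name__ == "__main__":
    ino, size, nlink = describe_empty_file()
    print(f"inode={ino} size={size} links={nlink}")
```

The file has an inode number and a link count even though its size is zero, which is the sense in which the inode, not the data, is what makes the file exist.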
This last fact trips a few people up. If a file is not open, removing its entry from a directory decrements the link-count stored in the inode. Usually, there is only one link from a directory to an inode, so the link-count drops to zero and the inode and its data blocks are deleted.
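The link-count behaviour can be observed directly. The following is a small sketch, assuming a POSIX-like system whose temporary directory supports hard links. The function name link_count_demo and the file names "first" and "second" are hypothetical.

```python
import os
import tempfile


def link_count_demo():
    """Return the link counts seen at each step and the data that survives."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("data\n")
        # One directory entry points at the inode.
        counts.append(os.stat(first).st_nlink)
        # A hard link adds a second directory entry for the same inode.
        os.link(first, second)
        counts.append(os.stat(first).st_nlink)
        # Removing one name only decrements the count; the inode remains.
        os.unlink(first)
        counts.append(os.stat(second).st_nlink)
        with open(second) as f:
            survived = f.read()
    return counts, survived


if __name__ == "__main__":
    print(link_count_demo())
```

Removing the first name leaves the data reachable through the second one. The inode and its blocks are only freed once the last directory entry is gone.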
If a file is open, however, the picture changes. Opening the file increments the link-count of its in-kernel inode (most kernels actually track this as a separate reference count, but the effect is the same). You may then delete the file with rm (essentially, removing its entry from its parent directory). The in-kernel link-count is decremented, but because it is still at least one, the on-disk inode and data blocks are not freed. The process with the file open still has complete access to the file and can perform all the normal operations it would expect to be able to do. This can be a nice way to ensure, for example, that temporary files get cleaned up. When the file is closed, the in-kernel link-count is decremented again (this time, presumably to zero) and the on-disk inode is synchronized with the in-kernel inode. With the link-count at zero, the on-disk inode and data blocks are then deleted.
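The temporary-file trick described above can be sketched as follows. This assumes a POSIX-like system (on some other platforms an open file cannot be unlinked). The function name use_after_unlink is hypothetical.

```python
import os
import tempfile


def use_after_unlink():
    """Unlink a file while it is open, then keep reading and writing it."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as f:
        # Remove the only directory entry while the file is still open.
        os.unlink(path)
        gone = not os.path.exists(path)
        # The open descriptor still gives full access to the file.
        f.write("scratch data")
        f.flush()
        f.seek(0)
        content = f.read()
    # Leaving the with-block closes the file; only then can the
    # kernel release the inode and its data blocks.
    return gone, content


if __name__ == "__main__":
    print(use_after_unlink())
```

Because the name is removed immediately, nothing is left behind on disk even if the process later crashes, which is why this pattern is popular for scratch files.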
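Many people are confused when they delete large files, only to find that no space is recovered! This happens when some process still has the file open: the directory entry is gone, but the inode and data blocks remain until the file is closed. Log files are a common source of this, since they often grow until they fill a file system, prompting a system administrator to clean them up while the logging process still holds them open.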
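The log-file situation can be imitated in a few lines. This is a rough sketch, assuming a POSIX-like system. The function name deleted_log_still_uses_space and the ".log" temporary file are hypothetical stand-ins for a real log and its writer. The reported link count depends on the file system, so the sketch returns it without relying on a particular value.

```python
import os
import tempfile


def deleted_log_still_uses_space(payload_size=64 * 1024):
    """Delete a file that is still open and inspect it through its descriptor."""
    # Hypothetical stand-in for a log file held open by a daemon.
    fd, path = tempfile.mkstemp(suffix=".log")
    try:
        os.write(fd, b"x" * payload_size)
        # The "clean-up": remove the name while the writer still has it open.
        os.unlink(path)
        # fstat works on the descriptor, so it still reaches the inode.
        st = os.fstat(fd)
        return os.path.exists(path), st.st_size, st.st_nlink
    finally:
        # Closing the descriptor is what finally lets the space be reclaimed.
        os.close(fd)


if __name__ == "__main__":
    exists, size, nlink = deleted_log_still_uses_space()
    print(f"name exists: {exists}, size via fd: {size}, links: {nlink}")
```

The name no longer exists, yet the descriptor still reports the full size. On many local file systems the link count shown here is 0, which is a quick sign that the file has been deleted but not yet released.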
