BlueAdmiral.com

 

 

File Management
 

What is a file?

A file is a named container holding a set of related information, stored on some form of secondary storage and accessible by a computer system. It is the file manager's responsibility to look after these containers. A data file may consist of, for example, a set of computer program instructions, text or another form of data. A text file may contain just a short letter, or a complete book. In other words, a file is a defined set of named data.

Why is file management needed?

The operating system has to maintain a secure and well managed file system for all the users of the computer system. Mechanisms have to exist to ensure correct and authorized use of any of the files under the file manager's care. It could be said that this part of the operating system is the most visible to the user, as the user will have specific file requirements and will expect to see those requirements met. The file manager will aim to ensure data integrity and will ensure that files are kept secure. In order to do this the file manager will maintain accurate information about all the files, their use and their movement throughout the file management system. A file will be created, modified or deleted as a direct result of some form of processing activity, which in turn is undertaken by the process manager. As such the file manager needs to ensure that all its files are protected from misuse or accidental damage at all times.

The File Manager (FM) will have a predetermined policy that states how a file is created, used, stored and retrieved. Ultimately the policy will be based on flexibility of access to the files and protection of the files. The FM is mainly concerned with providing a suitable interface for users to manipulate files. Users need to be able to store data long term. This requires a very large and non-volatile storage area, which needs to be organised and controlled. The FM provides this area in a controlled and structured way (with the support of the DM). The file manager will typically create a file 'tag' which acts as a file descriptor. This descriptor will typically log details of the use of the file, its movement, the file's symbolic name, and its current status (for example, whether the file is protected, archived, etc.). Once a file is actually opened the file manager will append further information to the file descriptor, such as where the file is physically kept in secondary storage (to allow for the file's return, etc.).

Typical responsibilities of a File Manager include:

  • create and delete files
  • allocate and de-allocate file space - communicating availability to others
  • track where files are stored - referring to files by symbolic name (the user does not have to worry about exact storage location)
  • store files efficiently
  • identify and list all the files owned by a particular owner
  • add or delete authorized users and their files
  • access files efficiently, e.g., to retrieve files just using their symbolic names
  • share files
  • control access to files, for example preventing a data file from being corrupted
  • reallocate file space
  • protect files from the failure of the operating system (or hardware)
  • be able to store files to new storage media, such as additional disks or drives

Related file definitions

Before looking at the role of the file manager in greater detail, you should understand the following terms:-

Bit: A binary digit - 0 or 1.

Byte: 8 bits of data, grouped together to represent a single character (e.g., a letter, number or symbol) with 256 possible bit combinations.

Record: A complete set of related data bytes. A set of logically related fields.

File: A group of logically related records.
A stored document with a unique name.
Program file: Contains program instructions.
Data file: Contains data (information).

Directory: Information about files and their location.
Area on disk to store files (e.g., programs or data).
Contains details about file attributes.

Volume: A physical device that stores the files.
A storage medium that can be attached or dismounted as a complete unit.
Configuration can have many volumes.
Volume can have many files.

  

Data and File Organisation

The files managed by the operating system may range from a simple set of bytes through to a complicated stream of records. The operating system will facilitate the structuring of the data contained within a file. Typically each set of data shown in Figure 1 is significantly more complex than the set above it. The starting point is the single binary bit, which represents the 'on' or 'off' state of an electrical pulse. One set of eight of these pulses represents one byte. Here the sequence of the eight pulses will represent one single character, such as the ASCII character 'A'. A field represents a set of encoded bytes, showing their final characters. It's worth noting that the character <space> will also exist here as its own unique ASCII character. A record will hold together a set of related fields. Finally the file itself will contain a number of records that may be organized in some way, for example randomly or sequentially.
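
The progression from bit to file can be sketched in a few lines of Python. This is only an illustration: the field names and sample values are made up for the example and do not come from any particular file system.

```python
# Eight bits form one byte; this bit pattern is the ASCII code for 'A'.
bits = "01000001"
byte_value = int(bits, 2)      # the byte as a number
character = chr(byte_value)    # the character that byte encodes

# A field is a run of encoded bytes. The space is a character in its own right.
field = "Ada Lovelace".encode("ascii")
space_code = field[3]          # the byte stored for the space character

# A record holds together a set of related fields (names are illustrative).
record = {"name": "Ada Lovelace", "year": "1815"}

# A file contains a number of records, here kept in sequential order.
file_records = [record, {"name": "Alan Turing", "year": "1912"}]
```

Each level simply groups items from the level before it, which is why each set of data is more complex than the one above it.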

  

User Interface

The FM provides the user with a limited set of file system commands. The commands are basic, and the real complexities are hidden from the user. These commands may be used in programs, or interactively by the user. They are device independent - users do not need to know the exact location of a file. Users can create and manipulate their files using the simple set of commands that follow:-

  • OPEN - Make file available to user (or program)
  • CLOSE - Close file and make unavailable
  • READ - Read record in file
  • WRITE - Write record to file
  • MODIFY - Read / Change / Write a record
  • DELETE - Erase a record or file
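
As a rough illustration, the same six commands can be acted out with Python's built-in file functions. The 16-byte fixed record length, the file name demo.dat and the helper pad() are assumptions made for this sketch only.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def pad(text):
    """Encode text and pad it with spaces to exactly RECORD_SIZE bytes."""
    data = text.encode("ascii")
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    return data.ljust(RECORD_SIZE)


path = os.path.join(tempfile.mkdtemp(), "demo.dat")

# OPEN (creating the file), WRITE two records, then CLOSE.
f = open(path, "wb")
f.write(pad("first record"))
f.write(pad("second record"))
f.close()

# OPEN again, READ record number 1 (the second record), then CLOSE.
f = open(path, "rb")
f.seek(1 * RECORD_SIZE)
second = f.read(RECORD_SIZE).decode("ascii").rstrip()
f.close()

# MODIFY: read, change and write back record number 0 in place.
with open(path, "r+b") as f:
    old = f.read(RECORD_SIZE).decode("ascii").rstrip()
    f.seek(0)
    f.write(pad(old.upper()))

with open(path, "rb") as f:
    first = f.read(RECORD_SIZE).decode("ascii").rstrip()

# DELETE: erase the whole file.
os.remove(path)
deleted = not os.path.exists(path)
```

The program refers to the file only by its name; where the bytes physically live on the device is left to the operating system.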

Volumes

A volume is a physical storage unit (e.g., a disk), and may be fixed or removable. Each volume will have a 'Volume Descriptor', so that the FM can interact with the volume as necessary. The FM will ensure each volume has unique details (given in the first sector of the first track) as follows:-

  • Creation date - date the volume was created
  • Pointer to directory area - indicates the first sector where the directory is stored
  • Pointer to file area - indicates the first sector where files are stored
  • File system code - detects volumes with incorrect formats
  • Volume name - user allocated name

A Master File Directory (MFD) is created for each volume, and follows the volume identifiers - it is transparent to users. All file requests will follow a path starting with the MFD as the point of entry. The FM will search the MFD for a pointer to the first directory (the user's directory).
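
A minimal sketch of a volume descriptor and an MFD lookup is given below. The class names, the format code "DEMOFS" and the sample sector numbers are hypothetical; a real file system stores these details in its own on-disk layout.

```python
from dataclasses import dataclass, field
from datetime import date

EXPECTED_CODE = "DEMOFS"  # hypothetical file system code for this sketch


@dataclass
class VolumeDescriptor:
    creation_date: date
    directory_pointer: int   # first sector of the directory area
    file_area_pointer: int   # first sector of the file area
    file_system_code: str    # used to reject volumes with the wrong format
    volume_name: str         # user allocated name


@dataclass
class Volume:
    descriptor: VolumeDescriptor
    # Master File Directory: directory name -> {file name -> first sector}
    mfd: dict = field(default_factory=dict)


def locate(volume, user_directory, file_name):
    """Follow the path MFD -> user's directory -> file and return its sector."""
    if volume.descriptor.file_system_code != EXPECTED_CODE:
        raise ValueError("unrecognised volume format")
    directory = volume.mfd[user_directory]  # the MFD is the point of entry
    return directory[file_name]


volume = Volume(
    VolumeDescriptor(date(2024, 1, 1), 2, 50, EXPECTED_CODE, "WORK"),
    mfd={"alice": {"notes.txt": 57}},
)
sector = locate(volume, "alice", "notes.txt")
```

The user only supplies names; the descriptor and the MFD are consulted on their behalf.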

 

Directory structure

A directory is basically a set of linked files, organised in a way suited to the people they serve. The file manager will observe a set of rules (a policy) by which it looks after the directory and the access rights to the files contained within. Directories are organized on a volume (HDD, FDD, tape, etc.). This organization provides logical and controlled access to files. The normal organization is a tree structure, which provides easy and fast access and searches. Each directory entry will contain fields to indicate:

1. a symbolic name of the file

2. the size and the position of the file on the volume

3. the type of access permitted to the file

 

Each directory may contain a number of sub-directories. The structure of directories and the relationship between them is managed by the FM (this may differ between operating systems).

Examples of directory structures include:

1. single level - best used where users have equal status

2. hierarchical - more common; it can be used to represent organizational structure.
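
A hierarchical directory can be sketched as a small tree in which each directory holds file entries and sub-directories. The Directory class, the find() helper and the sample entry are illustrative assumptions, not the layout of any specific operating system.

```python
class Directory:
    """One node in a hierarchical (tree) directory structure."""

    def __init__(self, name):
        self.name = name
        self.files = {}    # symbolic name -> (size in bytes, position, access)
        self.subdirs = {}  # sub-directory name -> Directory

    def mkdir(self, name):
        """Return the named sub-directory, creating it if necessary."""
        if name not in self.subdirs:
            self.subdirs[name] = Directory(name)
        return self.subdirs[name]


def find(root, path):
    """Walk down from the root along a '/'-separated path to a file entry."""
    *directories, file_name = path.strip("/").split("/")
    current = root
    for name in directories:
        current = current.subdirs[name]
    return current.files[file_name]


root = Directory("")
help_dir = root.mkdir("public").mkdir("help")
help_dir.files["test.hlp"] = (2048, 120, "read-only")

entry = find(root, "/public/help/test.hlp")
```

A single level structure is the special case in which the root has files but no sub-directories.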

Directory structure

The directory contains an overview of all the data held. Each file has a 'file descriptor' entry in the directory structure. A typical file descriptor contains the following information, with typical properties:-

  • Name - typically in ASCII code
  • Type - such as data, program, sub-directory
  • File size - in bytes
  • Record size - its fixed size, max size, etc.
  • Location - a pointer to the first physical block(s) where the file is stored
  • Date/time of creation - typically based on the internal clock
  • Attributes - such as read, write and other access restrictions
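
One way to picture such a descriptor is as a simple record type. The field names below mirror the list above; the class itself and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileDescriptor:
    name: str              # symbolic name
    file_type: str         # e.g. "data", "program", "sub-directory"
    size_bytes: int        # file size in bytes
    record_size: int       # fixed record size in bytes
    location: int          # pointer to the first physical block
    created: datetime      # taken from the internal clock
    attributes: frozenset  # e.g. {"read", "write"}

    def record_count(self):
        """Number of whole fixed-length records the file can hold."""
        return self.size_bytes // self.record_size


descriptor = FileDescriptor(
    name="test.hlp",
    file_type="data",
    size_bytes=2048,
    record_size=128,
    location=120,
    created=datetime(2024, 1, 1, 9, 0),
    attributes=frozenset({"read"}),
)
records_in_file = descriptor.record_count()
```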

File Names

Files may be identified as follows:-
  • Absolute name (complete name) - Volume / Directory path / FileName.Ext
    Examples:
    (DOS) C:\public\help\test.hlp
    (VMS) VAX2::USR3:[public.help]test.hlp
    (UNIX) /usr/public/help/test.hlp
  • Relative name (short name) - FileName.ext (retrievable from the current directory)
  • Extension - indicates the file type (contents and use of the file)
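
The parts of a file name can be picked apart with Python's standard pathlib module, using the DOS and UNIX examples above. The choice of /usr/public/help as the current directory is an assumption for the sketch.

```python
from pathlib import PurePosixPath, PureWindowsPath

dos = PureWindowsPath(r"C:\public\help\test.hlp")
unix = PurePosixPath("/usr/public/help/test.hlp")

# Absolute name = volume + directory path + file name with extension.
dos_volume = dos.drive           # the volume part of the DOS name
dos_directory = str(dos.parent)  # volume plus directory path
file_name = unix.name            # FileName.Ext
extension = unix.suffix          # indicates the file type

# A relative name is resolved against the current directory.
current_directory = PurePosixPath("/usr/public/help")
relative = PurePosixPath("test.hlp")
resolved = current_directory / relative
```

The "pure" path classes only manipulate names; they never touch the disk, so the sketch runs the same on any operating system.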

 

Examples of file extensions and file types include:-

  • BAK - Back-up data
  • BAT - Batch file (e.g., containing executable commands)
  • COM - Program file (e.g., part of software)
  • EXE - Executable program file (e.g., containing some language code)
  • DOC, WP - MS Word document or other word-processed document
  • GIF - GIF graphics file (e.g., digitized image)
  • INI - Configuration file
  • JPG - JPEG graphics file
  • WAV - Sound file (e.g., digitized)
  • SYS - Operating system configuration file
  • ARC - Archive file used for long-term storage
  • PS, GIF or DVI - ASCII or binary file awaiting printing

 Whatever the operating system in use, a control mechanism will exist to safeguard the integrity of the files (and directories). All files will contain a number of attributes that will be monitored and maintained by the file manager. File attributes may be:

  • placed on a file to provide access protection - e.g., containing access control information stating who can write, update or delete, etc. a given file
  • applied only to that file
  • applied to all users if the file is shared

Main file operations include facilities such as:-

Create - The file manager will first find an available space on the specified secondary storage device. Once space has been allocated, the file manager will record the given name and location of the file for future reference.

Write - The file manager is responsible for agreeing the name of the file according to naming conventions, and specifying the contents to be written to the file.

Retrieve - The file manager will be responsible for delivering accurate file name indicators, so that file retrieval is possible.

Delete - The file manager will first establish that the user is authorized to take such action. On authorization the file manager will release the file space so that the space can be made available to another file.
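
The create, retrieve and delete steps can be sketched with a toy file manager that tracks free blocks. Everything here (the class name, the block counts, the rule that only the owner may delete) is a simplifying assumption rather than a description of a real operating system.

```python
class TinyFileManager:
    """Toy file manager: allocates blocks and records names and locations."""

    def __init__(self, total_blocks):
        self.free_blocks = set(range(total_blocks))
        self.directory = {}  # file name -> (owner, list of allocated blocks)

    def create(self, name, owner, blocks_needed):
        """Find available space, then record the file's name and location."""
        if name in self.directory:
            raise FileExistsError(name)
        if blocks_needed > len(self.free_blocks):
            raise OSError("not enough free space")
        blocks = sorted(self.free_blocks)[:blocks_needed]
        self.free_blocks -= set(blocks)
        self.directory[name] = (owner, blocks)
        return blocks

    def retrieve(self, name):
        """Return the recorded location of a file given its symbolic name."""
        return self.directory[name][1]

    def delete(self, name, user):
        """Check authorization first, then release the file's space."""
        owner, blocks = self.directory[name]
        if user != owner:
            raise PermissionError(f"{user} may not delete {name}")
        del self.directory[name]
        self.free_blocks |= set(blocks)


fm = TinyFileManager(total_blocks=4)
fm.create("a.dat", "alice", 2)
fm.create("b.dat", "bob", 2)
fm.delete("a.dat", "alice")               # space is released...
reused = fm.create("c.dat", "carol", 1)   # ...and made available to another file
```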

 

Other operations include, for example:-
Name, Rename, Save, Import, Play, Protect,
Copy, Move, Compress, Execute, Export,
Update, Display, Print, Share, Download, Append

  

Attributes (for protecting a file) include:-

  • Read only
  • Execute only
  • Non-erasable
  • Hidden (not displayed in listings)
  • System (only accessible by OS)
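
Protection attributes are often held as a set of flags that the file manager checks before carrying out an operation. The sketch below uses Python's enum.Flag; the rules in may_write(), may_delete() and listing() are one plausible interpretation for an ordinary user, not a definitive policy.

```python
from enum import Flag, auto


class Attr(Flag):
    READ_ONLY = auto()
    EXECUTE_ONLY = auto()
    NON_ERASABLE = auto()
    HIDDEN = auto()
    SYSTEM = auto()


def may_write(attrs):
    """An ordinary user may write unless a blocking attribute is set."""
    return not (attrs & (Attr.READ_ONLY | Attr.EXECUTE_ONLY | Attr.SYSTEM))


def may_delete(attrs):
    """An ordinary user may delete unless the file is non-erasable or system."""
    return not (attrs & (Attr.NON_ERASABLE | Attr.SYSTEM))


def listing(files):
    """Names shown in a directory listing: hidden files are left out."""
    return sorted(name for name, attrs in files.items() if not (attrs & Attr.HIDDEN))


files = {
    "report.doc": Attr(0),                        # no protection attributes
    "manual.hlp": Attr.READ_ONLY,
    "kernel.sys": Attr.SYSTEM | Attr.HIDDEN,
    "ledger.dat": Attr.NON_ERASABLE,
}
visible = listing(files)
```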

Records and record format

Records are stored and organized within a file in one of the following ways:-

  • Sequentially (contiguous) - records are placed sequentially (ascending or descending) and ordered by key.
  • Direct file organization (dynamic) - records are placed in random order and assigned an 'address locator' key based on a 'hashing algorithm'.
  • Indexed file organization, using a combination of the above, where records are stored sequentially, and are assigned an index to facilitate retrieval both sequentially and directly.
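
The three organizations can be compared on a handful of in-memory records. The sample keys, the seven-slot table and the simple "key modulo number of slots" hashing algorithm are assumptions chosen to keep the sketch short.

```python
import bisect

records = [(17, "Nina"), (4, "Omar"), (42, "Lena"), (23, "Ravi")]

# Sequential: records are ordered by key and searched from the start.
sequential = sorted(records)


def sequential_find(key):
    for k, value in sequential:
        if k == key:
            return value
        if k > key:  # passed the place where the key would be
            break
    return None


# Direct: a hashing algorithm turns the key into a storage address.
SLOTS = 7
table = [[] for _ in range(SLOTS)]


def address(key):
    return key % SLOTS


for k, value in records:
    table[address(k)].append((k, value))


def direct_find(key):
    for k, value in table[address(key)]:
        if k == key:
            return value
    return None


# Indexed: records stay in sequence, and an index of keys allows direct jumps.
index_keys = [k for k, _ in sequential]


def indexed_find(key):
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return sequential[i][1]
    return None
```

All three functions return the same answers; what differs is how much of the file has to be examined to reach a record.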

The record format includes the following features:

  • Records can be fixed length or variable length
  • Both record types can be used on the same volume
  • All fixed length records on a volume are the same length
  • A file uses only one record type

Blocked records

This is a storage saving and input/output saving policy. It aims to save related records together in one block, and also facilitates retrieval of those records as one block. It has the following features:-

  • Records can be collected together in blocks
  • Records in a block can be fixed or variable in size (but not both)
  • Records in a block are related in some way
  • Blocked records occupy contiguous locations
  • Files do not have to use blocked records
  • Block size will be set to take best advantage of faster transfer rate
  • Blocking improves access for some types of files

Two techniques employed include:-

Fixed length records:
[Block 1] No. of Recs | Rec 1 | Rec 2 | Rec 3 | Rec 4 | ... [Block 2] ...

Variable length records:
[Block 1] Block size | No. of Recs | Rec 1 length | Rec 1 | Rec 2 length | Rec 2 | ... [Block 2] ...
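
Both blocking techniques can be sketched with Python's struct module. The sizes used (8-byte records, four records per block, one-byte counts and lengths) are assumptions for illustration, and the variable length version leaves out the block size field for brevity.

```python
import struct

RECORD_SIZE = 8         # assumed fixed record length in bytes
RECORDS_PER_BLOCK = 4   # assumed blocking factor
BLOCK_SIZE = 1 + RECORD_SIZE * RECORDS_PER_BLOCK  # count byte + record slots


def pack_fixed(records):
    """Group fixed length records into blocks, each led by a record count."""
    blocks = []
    for start in range(0, len(records), RECORDS_PER_BLOCK):
        group = records[start:start + RECORDS_PER_BLOCK]
        if any(len(r) > RECORD_SIZE for r in group):
            raise ValueError("record too long")
        body = b"".join(r.ljust(RECORD_SIZE) for r in group)  # pad with spaces
        block = struct.pack("B", len(group)) + body
        blocks.append(block.ljust(BLOCK_SIZE, b"\x00"))       # fill unused slots
    return blocks


def unpack_fixed(block):
    """Read back the records of one fixed length block."""
    (count,) = struct.unpack_from("B", block, 0)
    return [
        block[1 + i * RECORD_SIZE: 1 + (i + 1) * RECORD_SIZE].rstrip()
        for i in range(count)
    ]


def pack_variable(records):
    """One block of variable length records: count, then length + data pairs."""
    body = b"".join(struct.pack("B", len(r)) + r for r in records)
    return struct.pack("B", len(records)) + body


def unpack_variable(block):
    """Read back the records of one variable length block."""
    count, position, records = block[0], 1, []
    for _ in range(count):
        length = block[position]
        records.append(block[position + 1: position + 1 + length])
        position += 1 + length
    return records


names = [b"alpha", b"beta", b"gamma", b"delta", b"omega"]
fixed_blocks = pack_fixed(names)
variable_block = pack_variable([b"a", b"a longer record"])
```

With fixed length records the position of each record follows from arithmetic alone; with variable length records each stored length says where the next record begins.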

Non-contiguous storage
Disk storage devices do not have to store records one after the other, so the records of a file are not necessarily contiguous. Non-contiguous storage is more flexible and allows efficient reallocation of file storage space. Records can be accessed anywhere in the file (through Random or Dynamic Access). The problem is that the FM needs to track every location allocated to a given file.

Linked lists
A common method of tracking allocation of file storage is through the linked list. Each record contains a pointer to the next record in the sequence. An alternative system is to store the links in a table in RAM.
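
Both approaches can be sketched briefly. In the first, each stored block carries its data and a pointer to the next block; in the second, the same links are kept in a separate table in RAM. The block numbers and contents are invented for the example.

```python
END = -1  # marker meaning "no further block"

# Variant 1: the pointer to the next block is stored with the data itself.
disk = {7: ("He", 2), 2: ("ll", 9), 9: ("o", END)}


def read_chain(first_block):
    """Follow the embedded pointers and join the data in sequence."""
    parts = []
    block = first_block
    while block != END:
        data, block = disk[block]
        parts.append(data)
    return "".join(parts)


# Variant 2: the links are held in a table in RAM, separate from the data.
link_table = {7: 2, 2: 9, 9: END}


def blocks_of(first_block):
    """List the blocks of a file using only the in-memory table."""
    chain = []
    block = first_block
    while block != END:
        chain.append(block)
        block = link_table[block]
    return chain


contents = read_chain(7)
allocation = blocks_of(7)
```

With the table in RAM, the FM can work out where every block of a file is without reading the blocks themselves.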

Summary
This section has discussed the need for a file manager within a typical computer system. We have identified key responsibilities carried out by the file manager, including the manipulation and the movement of files, as well as tracking, controlling and protecting the use of the files. We have reviewed a range of file definitions, from the term 'bit' through to 'file'. User interaction issues were discussed, where for example, the file manager grants the user access to a range of commands that allow the user to open, close, read, write and modify their files. We explored issues surrounding directory structure, file naming and the application of file attributes. Finally, we considered how data might actually be placed on a particular storage medium, for example sequentially, randomly or in a block.

 
