Geographic Information Systems

Definition (after Dueker): A geographic information system consists of a database of observations on spatially distributed features, activities or events which are definable in space as points, lines or areas. A GIS manipulates data about these points, lines and areas to retrieve data for ad hoc queries and analysis.

Conventional databases may or may not contain spatial information, but they do not use it as such. For example, the familiar DB of parts and suppliers may store the address of each supplier, which is spatial data because it locates the supplier. However, unless a map is included as part of the system, it would be very difficult to answer a query such as 'Find all the suppliers within a 5 mile radius of our store'. With a GIS, such a query should be straightforward to answer.
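
Once each supplier address has been converted to a latitude/longitude pair, the radius query above reduces to a distance test. The following is a minimal sketch, not how any particular GIS implements it: the supplier names and co-ordinates are invented, and great-circle distance on a spherical earth (the haversine formula) is assumed in place of a proper geodetic calculation.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; assumes a spherical earth


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def suppliers_within(store, suppliers, radius_miles):
    """Names of suppliers no further than radius_miles from the store."""
    return sorted(
        name
        for name, (lat, lon) in suppliers.items()
        if haversine_miles(store[0], store[1], lat, lon) <= radius_miles
    )


# Hypothetical data: (latitude, longitude) in degrees
store = (51.50, -0.12)
suppliers = {
    "Acme Parts": (51.52, -0.10),     # under 2 miles from the store
    "Bolt & Co": (51.47, -0.17),      # about 3 miles from the store
    "Far Fasteners": (52.20, 0.12),   # tens of miles away
}

print(suppliers_within(store, suppliers, 5))
```

A conventional DB holding only the address text has nothing to feed into such a function; the spatial reference has to be stored and usable as such.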

Example. A GIS was used to identify landscape patterns associated with Lyme disease risk in a county of New York state. Remote sensing data from Landsat were used to map vegetation, and a land-cover map showing house and forest areas was prepared to find areas where houses and forest were contiguous. The incidence of Lyme disease was then added. The result indicated a link between forest contact and the disease, giving guidance to health planners working against the disease.

DBs and GIS
Relational databases are not well suited to spatial data management, since the structure of spatial data does not naturally fit a tabular structure. Geodata are typically very large, with a naturally imposed hierarchical structure, e.g. regions bounded by polylines, which are composed of line segments, which are in turn composed of points.
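
The hierarchy just described can be sketched as nested record types. This is only an illustration of the shape of the data; the class names (Segment, Polyline, Region) are chosen for this example and are not taken from any particular GIS.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) co-ordinate pair


@dataclass
class Segment:
    start: Point
    end: Point


@dataclass
class Polyline:
    segments: List[Segment]

    @classmethod
    def from_points(cls, points):
        """Build a polyline by joining consecutive points with segments."""
        return cls([Segment(a, b) for a, b in zip(points, points[1:])])


@dataclass
class Region:
    name: str
    boundary: Polyline  # assumed closed: last point equals first point

    def vertices(self):
        """Distinct boundary vertices, in drawing order."""
        return [s.start for s in self.boundary.segments]


square = Region(
    "unit square",
    Polyline.from_points([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]),
)
print(len(square.boundary.segments))  # 4
print(square.vertices())
```

Each level contains a variable-length list of the level below, which is what makes the structure awkward to flatten into fixed-width rows.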

Hybrid architecture
Typically, spatial data are stored as a set of system files and non-spatial data are stored in a relational database, e.g. ARC/INFO.

An advantage of this is that the data management of the two components can be optimized separately. However, the spatial data are handled outside the DB and cannot gain from standard DB functions such as integrity, security etc.

Integrated architectures
A single DB manages both spatial and non-spatial data. This may lead to poor performance in retrieving spatial data, due to the large number of relational accesses and joins required to reconstruct the spatial objects (points to lines to polygons) and to connect spatial objects to their non-spatial attributes. Object-oriented databases (OODBs) are an obvious solution to some of these problems but are still in their infancy in this application.
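
To see where the joins come from, consider a minimal sketch using Python's built-in sqlite3 module. The schema (tables point, line, polygon_line and parcel) is invented for this illustration and is not that of any real product; it stores one triangular land parcel together with its owner.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE point        (point_id INTEGER PRIMARY KEY, x REAL, y REAL);
CREATE TABLE line         (line_id INTEGER PRIMARY KEY, start_id INTEGER, end_id INTEGER);
CREATE TABLE polygon_line (polygon_id INTEGER, seq INTEGER, line_id INTEGER);
CREATE TABLE parcel       (polygon_id INTEGER PRIMARY KEY, owner TEXT);

INSERT INTO point VALUES (1, 0, 0), (2, 4, 0), (3, 0, 3);
INSERT INTO line VALUES (10, 1, 2), (11, 2, 3), (12, 3, 1);
INSERT INTO polygon_line VALUES (100, 1, 10), (100, 2, 11), (100, 3, 12);
INSERT INTO parcel VALUES (100, 'Smith');
""")

# Three joins are needed just to recover the outline of one polygon
# and attach its non-spatial attribute (the owner).
rows = con.execute("""
SELECT pa.owner, p.x, p.y
FROM parcel pa
JOIN polygon_line pl ON pl.polygon_id = pa.polygon_id
JOIN line l          ON l.line_id = pl.line_id
JOIN point p         ON p.point_id = l.start_id
WHERE pa.polygon_id = 100
ORDER BY pl.seq
""").fetchall()

print(rows)
```

With a realistic map of many thousands of polygons, each with hundreds of vertices, this chain of joins is repeated for every object drawn, which is the performance concern raised above.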

A GIS should have the following functional capabilities:

  • data capture
  • data storage
  • data management
  • data retrieval
  • data analysis
  • data display

We have space here to consider only some of these, where they are particularly important to GIS.


Data Capture and Storage
How do we get the map into the computer and store it? Maps come from many different sources e.g. paper, photographs, remote sensing, field observations using GPS etc. To be stored in a computer these must be converted into a digital format by scanning or digitising. However, if we need to use maps from different sources together, for example, overlaying one on top of another, the maps must be converted to the same projection and scale.

The earth is approximately an ellipsoid, and converting such a 3-dimensional object to 2 dimensions involves choosing a suitable projection and scale. There are several projections in common use, e.g. transverse Mercator (which preserves shape and direction) and equal area (which preserves area). Then, a co-ordinate system must be selected, such as latitude/longitude or the military grid system. A good GIS should be able to convert maps in different formats to a standard one.
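
The trade-off between projections can be seen numerically. The sketch below uses two simple spherical formulas, the ordinary (not transverse) Mercator and the cylindrical equal-area projection, as stand-ins for the families mentioned above; a spherical earth of radius 6371 km is assumed, and the function names are chosen for this example.

```python
import math

R = 6371.0  # km; assumes a spherical earth


def mercator(lat_deg, lon_deg):
    """Spherical Mercator: preserves local shape, distorts area."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return R * lon, R * math.log(math.tan(math.pi / 4 + lat / 2))


def cylindrical_equal_area(lat_deg, lon_deg):
    """Cylindrical equal-area: preserves area, distorts shape."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return R * lon, R * math.sin(lat)


def projected_cell_area(project, lat, lon, size=1.0):
    """Area on the flat map of a size x size degree cell."""
    x0, y0 = project(lat, lon)
    x1, y1 = project(lat + size, lon + size)
    return (x1 - x0) * (y1 - y0)


def true_cell_area(lat, lon, size=1.0):
    """Area of the same cell on the sphere (lon is unused on a sphere)."""
    band = math.sin(math.radians(lat + size)) - math.sin(math.radians(lat))
    return R * R * math.radians(size) * band


for lat in (0, 60):
    true = true_cell_area(lat, 0)
    merc = projected_cell_area(mercator, lat, 0) / true
    equal = projected_cell_area(cylindrical_equal_area, lat, 0) / true
    print(f"lat {lat}: Mercator area ratio {merc:.2f}, equal-area ratio {equal:.2f}")
```

Near the equator both ratios are close to 1, but at 60 degrees latitude the Mercator map inflates areas roughly fourfold while the equal-area map does not. This is why two maps drawn on different projections cannot simply be overlaid.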

Georeferencing
This involves finding control points for aerial or satellite images so that these can be converted to the co-ordinate system that you are using for your maps. This is not as easy as it might appear at first glance.
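
In the simplest case, the conversion from image pixels to map co-ordinates is an affine transformation fixed by three control points. The sketch below assumes exactly that; the control points and grid co-ordinates are invented, and in practice more control points and a least-squares fit (often with a higher-order model) would be used.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pixels, coords):
    """Return a function mapping (px, py) to (x, y) through 3 control points."""
    m = [[px, py, 1.0] for px, py in pixels]
    x_coef = solve3(m, [x for x, _ in coords])
    y_coef = solve3(m, [y for _, y in coords])

    def to_map(px, py):
        return (
            x_coef[0] * px + x_coef[1] * py + x_coef[2],
            y_coef[0] * px + y_coef[1] * py + y_coef[2],
        )

    return to_map


# Hypothetical control points: pixel positions and their map co-ordinates
# (10 m pixels, image rows increasing southwards).
pixels = [(0, 0), (100, 0), (0, 100)]
coords = [(500000, 200000), (501000, 200000), (500000, 199000)]

to_map = fit_affine(pixels, coords)
print(to_map(50, 50))  # (500500.0, 199500.0)
```

The difficulty in practice lies less in the arithmetic than in finding control points that can be located accurately in both the image and the map.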

Raster and vector
There are two data models used for storing map data, each with its own strengths and weaknesses.



Raster grids are easy to understand, capable of rapid retrieval and analysis, and good at representing continuous field variables such as topography. However, they require a lot of memory because every grid cell must hold a value, and in a large uniform area many cells hold the same value (although compression techniques are available). Rasters do not produce such accurate pictures as vectors, since the cell size determines the resolution of the data.

Vectors can follow features very closely and are very good at representing features that are shown on maps as lines, e.g. rivers, roads and boundaries. They are much more compact than rasters, since an outline map could be drawn with perhaps only a few thousand points, far fewer than the number of raster grid cells that would be needed.
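
A toy example makes the storage comparison concrete. The sketch below holds a small square lake as an 8 x 8 raster, compresses it with run-length encoding (one simple compression technique among several), and compares both with a vector outline; the grid and the lake are invented for this illustration.

```python
def run_length_encode(row):
    """Compress a row of cell values into (value, count) runs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def run_length_decode(runs):
    """Expand (value, count) runs back into a row of cell values."""
    return [value for value, count in runs for _ in range(count)]


# 8 x 8 raster: 1 = lake, 0 = land
raster = [
    [1 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(8)]
    for r in range(8)
]

encoded = [run_length_encode(row) for row in raster]
cell_count = sum(len(row) for row in raster)
run_count = sum(len(runs) for runs in encoded)

# The same lake as a vector: four corner vertices of its outline
lake_outline = [(2, 2), (6, 2), (6, 6), (2, 6)]

print(cell_count, run_count, len(lake_outline))  # 64 16 4
```

Doubling the raster resolution quadruples the number of cells, whereas the vector outline of this lake still needs only four vertices.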


Data retrieval

There are a number of retrieval operations involving spatial data which a GIS should be able to perform.

  • showing the position on a map of particular features e.g. all post offices.
  • retrieving attributes from the DB using the map as a query vehicle e.g. pointing to a county on the map and asking for its population.
  • selecting by attribute e.g. find all houses built since 1995 and display their position.
  • selection by proximity e.g. retrieve all superstores within a 5 mile radius of a town centre. This is known as buffering.
  • map overlay. This is a very important operation for GIS. For example, there may be separate maps available for soil type and altitude. By overlaying one on the other you could find those regions which are lower than 200 metres and have clay soils.
    There are various processes for overlay and these are sometimes referred to as map algebra. This consists of:
    Boolean operations - binary combinations of attribute codes.
    Recoding - reorganising a range of attribute codes into binary layers.
    Arithmetic - addition, subtraction, multiplication or division of each value in a data layer by the value in the corresponding location in a second data layer.
  • allowing networks to be constructed and queried e.g. for power lines, bus routes.
  • making new categories of attribute by clumping areas together e.g. industrial zones and housing zones could be combined to form a new category of built-up areas.
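
The soil and altitude overlay described in the list above can be sketched with two small raster layers. The values are invented and the helper names (recode, combine) are chosen for this illustration; a real GIS would work on much larger grids and would also support vector overlay.

```python
# Two layers covering the same 3 x 3 grid of cells
altitude = [          # metres
    [150, 180, 220],
    [190, 210, 250],
    [120, 160, 300],
]
soil = [
    ["clay", "sand", "clay"],
    ["clay", "clay", "sand"],
    ["sand", "clay", "clay"],
]


def recode(layer, predicate):
    """Recoding: turn a layer into a binary (0/1) layer."""
    return [[1 if predicate(value) else 0 for value in row] for row in layer]


def combine(a, b, op):
    """Apply op cell by cell to two layers of the same shape."""
    return [[op(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


low = recode(altitude, lambda metres: metres < 200)
clay = recode(soil, lambda kind: kind == "clay")

# Boolean overlay: cells that are both low-lying AND clay
low_clay = combine(low, clay, lambda x, y: x & y)

# Arithmetic overlay: a simple score counting how many criteria each cell meets
score = combine(low, clay, lambda x, y: x + y)

for row in low_clay:
    print(row)
```

Here low_clay marks the three cells that satisfy both conditions, and score shows how the same two binary layers can be combined arithmetically instead.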

Some of the above operations may involve some degree of analysis and, of course, there is a large set of analytical operations that can be performed on data stored in a GIS, which there is no space to consider here.

This brief account of GIS is designed to give you a flavour of how such DBs differ from others which you have studied.

Further Reading
Clarke, K.C. (1997) Getting Started with Geographical Information Systems.
Worboys, M.F. (1995) GIS: A Computing Perspective.