MARK SWEETING

PART 1 - THE CURRENT SITUATION

THE UK ARCHAEOLOGICAL DATABASE

As mentioned above, the UK "Archaeological Database" is a very large and varied entity. It is "managed" by several different organisations and groups, who all use their own databases for different purposes. Consequently all these databases have different structures, formats and design.

Perhaps the largest amount of archaeological and heritage data is currently broken up into approximately 90 databases, each hosted by a city or a council SMR. The SMRs are essentially responsible for collating, archiving and disseminating archaeological and heritage information, both internally (typically with an advisory role in the planning department) and to the public.

The main role of the Royal Commission on the Historical Monuments of England (RCHME) is the development and maintenance of a national database of archaeological and heritage related data. The provision of a National Monuments Record (NMR) is perhaps the central role of the Royal Commission. Traditionally this task was achieved by detailed field survey carried out by the Commission's own investigators. In recent years this has changed, as the Commission has recognised both the slow progress being made and the importance of a "centralised database". This realisation has helped improve the RCHME's working relationships, and has encouraged co-operation between the NMR and local government SMRs (Fraser, 1997:23).

English Heritage (EH) is also a large data holder, though their focus is mainly listed buildings and scheduled ancient monuments, or sites of "national importance", rather than archaeological sites.

The Archaeology Data Service is also a very important source of data. The ADS differs from the other organisations though in that it does not generate data so much as store it for others.


 

THE ORGANISATIONS

THE ROYAL COMMISSIONS

In the UK, the Royal Commissions are divided into three different bodies - RCHME, RCAHMS and RCAHMW - for England, Scotland and Wales respectively.

They were granted Royal Warrants by King Edward VII in 1908 "to make an inventory of the ancient and historical monuments and constructions connected with or illustrative of the contemporary culture, civilisation and conditions of life of the people from the earliest times, and to specify those which seem most worthy of preservation." (Fraser, 1997:22).


RCHME - The Royal Commission on the Historical Monuments of England

The Royal Commission has pursued its task of developing a National Monuments Record (NMR) in two main ways: (1) by field investigation and survey, and (2) by collating and archiving aerial photographs, historic maps, drawings and other documentary evidence (RCHME, 1993(a):1). The RCHME has a contemporary statement that says they will "compile, assess, curate and make available the national record of England's ancient monuments and historic buildings" (Fraser, 1994:23). This means that systematic survey or compilation of records may occur as the Commission sees fit; it is not carried out prior to site destruction, as is often the case with SMRs.

Jointly, the RCHME and the ADS are making serious moves towards an on-line NMR. At the moment RCHME have no on-line database, though they have contributed a large amount of data to the ADS. The NMR database will provide a very large resource when it becomes available, but it is accepted that it will not grow substantially unless other organisations contribute to it. For this to happen, it has been acknowledged that there will need to be some sort of uniform data standard across the board, so that data can easily be exchanged and shared.

With this in mind, the RCHME has worked jointly with organisations such as ALGAO and the ADS to develop such a standard. This standard (RCHME, 1993(b)) focuses more on the data than on the structure or the DBMS, and so allows any organisation adhering to it a more flexible approach to how they organise their data. So for example, the postcode field of a database must be 8 characters in size, have the mnemonic title of POST_CODE, and be entered in upper case (e.g. KT4 8UZ). Similarly, in the true spirit of "Year 2000 Compliance", the Bibliographic Document Date of Publication or DOC_ISSUE_DATE field is a numeric positive integer of 4 characters in length. They even provide two examples for this matter, should there be any confusion: 1954, 1991 (RCHME, 1993(b):51,57).
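The two field rules quoted above lend themselves to a simple check. The following is only a minimal sketch in Python, assuming that "8 characters in size" means a maximum length and that the date is held as text; the function names are hypothetical and are not taken from the RCHME standard.

```python
import re

# Limits paraphrased from the text above. The function names and the exact
# interpretation of the rules are assumptions made for illustration only.
POST_CODE_MAX_LEN = 8
DOC_ISSUE_DATE_LEN = 4


def valid_post_code(value):
    """Accept a non-empty, upper-case value of at most 8 characters."""
    return 0 < len(value) <= POST_CODE_MAX_LEN and value == value.upper()


def valid_doc_issue_date(value):
    """Accept exactly four ASCII digits forming a positive integer."""
    pattern = r"[0-9]{%d}" % DOC_ISSUE_DATE_LEN
    return re.fullmatch(pattern, value) is not None and int(value) > 0
```

With these assumptions, "KT4 8UZ", "1954" and "1991" pass, while a lower-case postcode or a two-digit year such as "54" is rejected.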


RCAHMS and RCAHMW - The Scottish and Welsh Royal Commissions

The Scottish Royal Commission has developed a project called CANMORE, which is the NMR for Scotland (NMRS). It has been running for several months now, and works very effectively. This particular project will be discussed in more depth further on, under the heading of Archaeology and the WWW. The Welsh Royal Commission is also making moves towards bulk data "harmonisation", eventually culminating in a unified Welsh database (END). At the moment though, it appears to be quite a way behind its sister organisations, and again it will be discussed more fully further on.


 
SITES AND MONUMENTS RECORDS

As mentioned above, the SMRs are probably the fastest growing heritage related databases in the UK. They receive some of their data from reported chance finds by members of the public, and with the advent of PPG 16 a tremendous amount of data is generated from planning or development funded excavations within their area of jurisdiction. A serious problem with the reports and other information generated is that it rarely gets published in any formal manner.

For instance, in Birmingham City last year (1998), there were twenty "planning generated" archaeological projects. Of these, it was estimated that only one (or 5%) would see any form of major publication, and the others would just generate smaller internal reports (Mike Hodder, pers. comm. 11/02/1999). The internal reports are publicly accessible as they are produced as part of the planning process, but to view them you would need to arrange an appointment at the Town Hall. Twenty excavations may seem like a small number, but when you consider the small size of Birmingham City, you can appreciate the scale of the problem on a national scale.

SMRs and city/borough councils sometimes also generate large volumes of photographs - aerial photographs being a very important resource for archaeologists. For instance, in the years 1970 to 1996, Northamptonshire SMR annually photographed sites and important areas, and has accumulated huge numbers of photographs (Christine Addison, pers. comm. 11/02/1999).

The last few years have seen a lot of rapid development amongst some SMRs. Archaeology is not always very fast at adopting new technology, but several SMRs did start using computer databases to hold their records from very early on. However, the software used often depended on site licences held by the local authority, and sometimes upon personal preference or experience.

The result of this was that no two databases were alike. It introduced a severe compatibility problem between any two SMRs wishing to share or exchange data, and meant that the only person experienced enough with a particular county SMR database to use it would usually be the SMR officer.


 
ARCHAEOLOGY DATA SERVICE (ADS)

In their own words:


"The aim of the Archaeology Data Service (ADS) is to collect, describe, catalogue, preserve, and provide user support for digital resources that are created as a product of archaeological research."
ADS, 1999(a).

It also has a responsibility for "promoting standards and guidelines for best practice in the creation, description, preservation and use of spatial information" both in archaeology and across the Arts and Humanities Data Service as a whole (ADS, 1999(a)).

They provide services for users (such as students and academics), "data creators" (such as surveyors or field archaeologists), and agencies who may encourage or formalise the use of data standards (ADS, 1999(a)). The services they offer include an on-line catalogue - the "ArcHSearch" system - and a broader "Arts and Humanities" database. They also publish data standards and guidelines (the Guides to Good Practice series) such as the "GIS Guide to Good Practice" (Gillings and Wise, 1998).

The ADS could be described as being similar to the Institute of Field Archaeologists, in that it promotes and encourages a professional level of work from archaeologists adhering to their standards.

So the UK archaeological database is not only distributed widely amongst several organisations, but is also contributed to from a wide range of sources.


 

CURRENT MOVES TOWARDS A NATIONAL SMR

This growing movement towards creating an effective National SMR or NMR has been driven from many areas, including the ADS, RCHME, EH and ALGAO.

There are several problems encountered when doing this work, and these shall be discussed further below. To briefly generalise here though, the lack of regular recording methods, variation in design of database structure and deviation in both platform and database software have been, and are still, key problems.


 
A UNIFORM DATA STANDARD

In 1993, RCHME published two reports in conjunction with ACAO, under the general title of "Recording England's Past". One of the reports was a general review of SMRs in England, and the other was a "data standard" intended for "The Extended National Archaeological Record".

The data standard described in the latter of those reports was produced by a working party with representatives from RCHME, ACAO, EH, and the British Archaeological Bibliography (BAB), who also consulted other organisations including the Museum Documentation Association and some SMRs.

Specific attention was given to "technical standards of data compatibility at the computer level, comprising definitions of fields, types and formats, as well as appropriate terminologies [sic] for lexical control" (RCHME, 1993(b):iv).

In December 1997, ALGAO held its first "professional seminar" in Northampton - "The Future of SMRs". The people attending were invited from a range of relevant organisations within archaeology and heritage. The theme was not only of a technical nature (i.e. of creating an integrated system), but also on the use of SMRs - in terms of both academic research and of historical modelling with the use of related information systems (ALGAO, 1997).

ALGAO were an important group in the development of this uniform data standard, and whilst the group themselves are not strictly speaking a "holder of data", they do represent "holders of data" on the larger scale in the form of SMR and county archaeological officers. This is important, because while it may not be feasible for all the members of this organisation to adhere to the standards being discussed, it will have the effect of raising awareness as to the importance of data standards amongst its members.

The software commissioned by the RCHME and ALGAO from exeGesIS, released on 2nd March 1998, is further facilitating this move towards uniformity at a national level. This "Spatial Data Management" software is intended to provide "a flexible and thorough solution for SMR officers that is compatible with national data standards and incorporates the RCHME Thesaurus of Monument Types" (RCHME, 1998(a)). This software requires Microsoft Access, a very capable and user friendly DBMS, and MapInfo Corporation's "MapInfo" - the industry standard GIS or "desk top mapping" package used by many government and archaeological organisations.

Provided the software works well and is taken up by the archaeological community on a wider scale, this is a great step in the right direction. "Data Migration" is provided as part of the package, and generally takes 3 or 4 days to complete (RCHME, 1998(a)). This should allow a fairly easy transition to the software, which would perhaps have been the worst part for any SMR doing it themselves!

Provided SMRs purchase the system, it should automatically make them conform to these new data standards, and will make future transfer of data very simple. By the middle of April 1998, 15 organisations had purchased the software (RCHME, 1998(a)), and presumably there will be a quiet period whilst other SMRs see how it fares.

April 1998 saw RCHME, EH and ALGAO formalise their "co-operation statement", which promises to provide a future for local SMRs. This statement, entitled "Unlocking the Past for the New Millennium: A new statement of co-operation on Sites and Monuments Records in England", will have three main influential effects:
 

    1. The production of revised standards for the development of Inventory Records (MIDAS) and the establishment of the Forum for Information Standards in Heritage, England (FISHEN);
    2. the launch of a forum for developing spatial standards (in October 1997);
    3. and the launch of a new suite of software programmes, produced by exeGesIS SDM, and based upon Microsoft Access and MapInfo;
RCHME 1998(b).

So what we are seeing for the first time in archaeology is a firm belief in universal data standards and conformity, which will lead to both a formalised management strategy for archaeological and heritage databases, and the prospect of large scale data sharing or exchange. In the statement, they even go on to say "The partners in the statement will work closely with each other to create a national network of heritage information, accessible to all." (RCHME, 1998(b)), so this vision is really quite clear.


 

ARCHAEOLOGY AND THE WEB

I commented above that archaeology is not always very fast at adopting new technologies. Where the Internet is concerned though, there is a different story. Very quickly, the Internet was seen as an extremely useful tool among archaeologists. E-mail discussion groups appeared, site reports were published on-line, E-journals started (e.g. Internet Archaeology (http://intarch.ac.uk)) and so on. In the early days of the Internet, archaeologists did make full use of it. However, archaeologists are not computer programmers, and today the technology seems to have overtaken them. Programming skills such as Java are rarely found in archaeologists, so these skills usually need to be provided by others, and the money for this is seldom available.

There have been a few projects that have extended the use of the WWW by archaeologists, and databases have proved a popular choice for this.

Databases are found all over the web these days - everyone who uses the web will be familiar with search engines, for example. Search engines are large databases of web pages, maintained by "web crawlers" or "worms" that rove the Internet cataloguing pages as they come across them. They provide a simple interface, usually consisting of a single text entry field that allows you to enter a keyword to search with. Web-based databases can be of any size, from simple contact information for a small firm, to the mighty map and satellite image database from Microsoft: the TerraServer (http://terraserver.microsoft.com; see Figure 1).

The TerraServer contains 1.01 terabytes of data, and 4.1 terabytes of uncompressed images in the form of satellite images and maps covering a large portion of the "western world". It enables you to search for locations on the Earth's surface by typing in the place name, or you can click, zoom and pan around the globe as you see fit. Results are shown as either maps or, when available, satellite images.

In terms of archaeology databases, I have already mentioned several projects aimed at providing text based databases over the web - the CANMORE project being perhaps the most successful to date.

CANMORE stands for the Computer Application for National MOnuments Record Enquiries, and also, coincidentally, happens to be the name given to Malcolm III who was crowned king of Scots in 1058 (RCAHMS, 1998). It was developed in conjunction with ORACLE, the software firm (which is probably amongst the top three database developers along with Informix and Sybase), and the ADS. The site gives you access to the National Monuments Record of Scotland, and gives you results in tabular form, providing quite a lot of information including any relevant bibliographic notes. The database contains details on thousands of archaeological sites, monuments, buildings and maritime sites in Scotland, and is free for anyone to use.

As with any large database, searching can be difficult unless you are prepared to sift through hundreds of results. With a WWW search engine, you typically search by keyword, and are regularly presented with hundreds of results, perhaps listed 20 per page. To improve "search technique", most WWW search engines allow you to apply Boolean logic to your searches. For example, suppose you were to search the WWW for information concerning the pests of Chinese cabbage (which incidentally is particularly prone to slug and caterpillar damage (Larkcom, 1997:90)). Using AltaVista(TM) (http://www.altavista.com/, 18/01/98) you might type "chinese cabbage pests" in the search box. The results you get back may contain anything to do with any individual keyword entered. To apply Boolean logic to the search you would instead type "chinese + cabbage + pests". In theory, the results returned would all contain the words "chinese", "cabbage" and "pests" within the document. Even so, irrelevant results may still appear - some could actually be suggestions of methods to deal with garden pests using Chinese cabbage as a remedy (as is commonly done for slugs in a vegetable garden, or pond snails in a garden pond).
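The difference between the two kinds of keyword search described above can be sketched in a few lines of Python. This is an illustration only, not how AltaVista or any real search engine works; the three sample "pages" and the function name are invented for the example.

```python
# Three invented "pages", keyed by a made-up identifier.
PAGES = {
    "a": "pests of chinese cabbage include slugs and caterpillars",
    "b": "chinese restaurants in birmingham",
    "c": "using chinese cabbage to trap garden pests such as slugs",
}

KEYWORDS = ["chinese", "cabbage", "pests"]


def search(pages, keywords, require_all):
    """Return the identifiers of pages matching the keywords.

    require_all=False mimics a plain keyword search (any word matches);
    require_all=True mimics the Boolean "+" search (every word must match).
    """
    combine = all if require_all else any
    hits = []
    for page_id, text in pages.items():
        words = set(text.lower().split())
        if combine(keyword.lower() in words for keyword in keywords):
            hits.append(page_id)
    return hits


any_word = search(PAGES, KEYWORDS, require_all=False)
every_word = search(PAGES, KEYWORDS, require_all=True)
```

Here the plain search returns all three pages, while the Boolean search drops page "b" but still returns page "c", which uses Chinese cabbage as a remedy rather than describing its pests. Matching every keyword narrows the results without guaranteeing relevance.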

The main reason WWW search engines are often so poor is that good use of "metadata" to describe page content is rare, and so a search engine has to try to generalise its indexing procedure. The use of metadata is discussed further on.

The CANMORE project uses an interface that allows you to be very specific with your search (see Figure 2). You are given seven fields with which to be very specific about records you are interested in. You do not have to fill them all out; you can be as general or as specific as you like.

The results are presented in tabular form, and contain an SMR record number for each record, information about location by both place name and grid reference, a short description of the record, and any extra information that is relevant (see Figure 3). By clicking on the record number, you are taken to even more detailed results for that particular record.

The Welsh Royal Commission (RCAHMW) is also entering the digital age. They run the Microsoft FoxPro DBMS, and are currently working with the ADS to "formulate policies for archiving digital data" (RCAHMW, 1999(a)).

This will presumably lead to the development of "END" - the Welsh Extended National Database that is currently in its early stages of development. END revolves around a partnership between several organisations that have agreed to share digital data from their databases. This will not lead to a centralised database, but to each group holding a copy of each other's data. Over the last two years, the member organisations have agreed on data structures, and have now reached "harmonisation"; the next phase will see "harmonisation of terminology used within each record" (RCAHMW, 1999(b)). This approach is the reverse of that of the RCHME, whose prime area of concern is the data standard, rather than the actual structure. RCAHMW also plan to make bibliographic data available in the future, and have other ideas in mind:
 

"With an eye to the future, RCAHMW has been setting standards for the vector mapping of archaeological data with a view to establishing Geographical Information Systems for the main participants of END"
RCAHMW, 1999(b).

The standards for vector mapping are reminiscent of the GIS Guide to Good Practice (Gillings and Wise, 1998) published by the ADS, and establishing a GIS for the main participants of END is reminiscent of the exeGesIS/RCHME package.

The final move into the digital era being made by the Welsh Commission is an investigation into ways of publishing reports digitally. At the moment they are running trials with downloadable Adobe Acrobat files, which require either an extra "plug-in" or a separate program to view them (the format of Adobe Acrobat files is PDF, or Portable Document Format, which is closely related to PostScript). PostScript files are ideally suited to being printed on paper, and are the ideal format for exchanging documents between desktop and printer. They are not very convenient to read on the screen, and the files are larger than their equivalents in HTML. For example, the HTML 4.0 specification from the W3 Consortium can be downloaded as an HTML "zip" archive that is 389 Kb, whereas the PostScript version is 2.1 Mb. Even as a "gzip" compressed file, the PostScript document is still 600 Kb!


 

Character Entity Description       Character Entity Reference   Result
Euro sign                          &euro;                       €
Capital letter S with caron        &Scaron;                     Š
Capital letter Y with diaeresis    &Yuml;                       Ÿ
Small letter n with tilde          &ntilde;                     ñ
Small letter e with diaeresis      &euml;                       ë

Table 1: Example of character entity references in HTML 4.0 (after the World Wide Web Consortium, 1998).


 

I can see the benefits of transferring files in this manner for those of us who like to have reports on paper, but it does defeat the idea of the WWW. It also makes the process of viewing them awkward. One other possible reason for the use of PostScript files at the moment is that HTML does not support a full range of characters - notably foreign ones (V.L. Gaffney, pers. comm. 09/02/1999). This will be rectified in the future though, and I doubt there are many (if any) characters unavailable that the RCAHMW would require. Even if this were to prove problematic, Dynamic HTML (DHTML) will shortly allow the "streaming" of TrueType fonts along with HTML documents, permitting the use of non-standard fonts within them. Table 1 illustrates some of the character entity references available in HTML 4.0 (see Demo 1 for more).
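As a rough illustration of how a character entity reference is turned into the character it stands for, the short Python sketch below decodes the five named references from the HTML 4.0 entity set that correspond to the rows of Table 1. It uses the standard library's html.unescape function; the variable names are invented for the example.

```python
import html

# The named entity references for the five characters in Table 1.
ENTITY_REFERENCES = ["&euro;", "&Scaron;", "&Yuml;", "&ntilde;", "&euml;"]

# html.unescape replaces each reference with the character it names.
decoded = [html.unescape(reference) for reference in ENTITY_REFERENCES]

# ISO 8859-1 only covers code points up to 255, so record which of the
# decoded characters fall inside that range.
in_iso_8859_1 = [ord(character) <= 0xFF for character in decoded]
```

Only the last two characters (ñ and ë) fall within ISO 8859-1 itself; the first three need the wider character repertoire that HTML 4.0 entity references give access to.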

For some years now, the ADS have been developing the "NMR on-line" in conjunction with the RCHME. At present, the database is not web accessible, but hopefully as it grows over the next few years and as the adoption of these data standards and software suites becomes more common it should happen.

The ADS themselves also provide their own archaeology database. Their data comes from many sources, including the RCHME, SMRs, and academics, and it is the intention that anyone can supply data to them - providing it is in the correct format (ADS, 1999). They provide several user interfaces, including the keyword search (Figure 4), and the "Where and When Query" (Figure 5).

The first of these, the keyword search, is performed on metadata, in much the same manner as a web search engine. As well as a normal search, an "intelligent search" is performed by comparing results from the Thesaurus of Monument Types (published jointly by EH and RCHME (1995)) against record metadata. On the WWW, a search is likewise performed on metadata whenever possible. Metadata is information contained in the "header" part of an HTML document. As Paul Miller puts it:
 

"Metadata is data about data, and therefore provides basic information such as the author of a work, the date of creation, links to any related works, etc."
Paul Miller, 1998.

Metadata in a web document is put in the "header" part of the file. A web page may have several types of metadata, and the listing shown in Example 1 gives one possible method to do this.


 

<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

<META name="Author" lang="en" content="Mark Sweeting">

<META http-equiv="Expires" content="Fri, 20 Aug 1999 14:25:27 GMT">

<META name="keywords" content="archaeology, database, SMR, GIS, WWW, Java">

<META scheme="ISBN" name="identifier" content="0-8230-2355-9">

Example 1: Possible metadata tags used to describe an HTML file.


 

These few lines of data can, if used correctly, provide very useful information about the page. Most of the lines are self-explanatory, but it should be noted that the "Expires" data tells the browser that it need not reload the document until after a specific date, relying on a locally cached version until then. This is clearly good for Internet traffic. The "keywords" field is used by search engines in much the same manner as a metadata search may function for archaeological data, and even specialised fields can be created - such as the "ISBN" field above.
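To give an idea of how a search engine or other program might read these tags, the Python sketch below collects the name-based META tags from a fragment of HTML using the standard library's html.parser module. It is a simplified illustration under stated assumptions: the class name is invented, the sample is a subset of Example 1, and real indexing software is considerably more involved.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the content of every META tag that has a name attribute."""

    def __init__(self):
        super().__init__()
        self.named = {}

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "name" in attributes:
            key = attributes["name"].lower()
            self.named[key] = attributes.get("content", "")


# A subset of the tags from Example 1.
SAMPLE_HEAD = """
<META name="Author" lang="en" content="Mark Sweeting">
<META name="keywords" content="archaeology, database, SMR, GIS, WWW, Java">
<META scheme="ISBN" name="identifier" content="0-8230-2355-9">
"""

collector = MetaCollector()
collector.feed(SAMPLE_HEAD)

# Split the keywords field into the individual terms an index might store.
keywords = [term.strip() for term in collector.named["keywords"].split(",")]
```

After running this, collector.named holds the author, keywords and identifier values, and keywords is a list of the six individual terms.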

The Dublin Core standard for Metadata (Miller, 1996) is a scheme different to many others. It is not only intended to be used in HTML documents, but also PDF files, images etc. It is very similar to the metadata illustration above, but when writing about it, Paul Miller makes some interesting points. He encourages readers of the paper to implement the Dublin Core standard for metadata in their web pages so that the usability of it can be judged, and he also suggests that if they were to do that, then they may become part of:
 

"...a growing and exciting trend, whereby all the data available out on the Web might actually become information, and therefore of use to the wider community."
Paul Miller, 1996.

Whilst this may sound like it comes from within the realms of fiction, he does actually have a good point. The correct use of metadata could enable normal search engines to index the web a whole lot better, and would clearly speed up and improve web searching for useful information - no matter what the subject is.
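As a hedged sketch of what adopting Dublin Core in a web page might involve, the Python below generates META tags for a handful of Dublin Core elements, using the common convention of prefixing the element name with "DC.". The function name and the sample record are invented for illustration, and the exact form of the tags should be checked against the Dublin Core documentation rather than taken from this example.

```python
from html import escape

# The fifteen elements of the basic Dublin Core set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_meta(record):
    """Return one META tag per Dublin Core element in the record."""
    tags = []
    for element, value in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError("not a Dublin Core element: " + element)
        content = escape(value, quote=True)
        tags.append('<META name="DC.{}" content="{}">'.format(element, content))
    return "\n".join(tags)


# A hypothetical record describing this page.
example_record = {
    "title": "The Current Situation",
    "creator": "Mark Sweeting",
    "subject": "archaeology; database; SMR",
}

example_tags = dublin_core_meta(example_record)
```

The resulting lines could be placed in the "header" part of the page alongside the tags shown in Example 1.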

A metadata search on the "ArcHSearch" system at the ADS provides fairly good results. As you can see from Figure 6, the NMR database provides output in a textual/tabular form, which, just like CANMORE, allows you to click on a record for more specific information (Figure 7). The results illustrated were from a keyword search for "bronze", and the few records I clicked on were actually relevant.

Presenting archaeological data in this form permits fairly effective "data mining", just as you may expect to do with a library catalogue. However, when there are so many results, this can be a very time consuming operation. Furthermore, the output is certainly not visually stimulating, and would not encourage school children or the public to browse the data.

The use of maps in archaeological publications has always been a popular tool, from the earliest books by the likes of Childe and Piggott, to the most modern works dealing with spatially aware subjects such as monument intervisibility or landscape analysis.

If one of the key reasons behind developing an on-line NMR is to make access easy for everyone, then they simply must consider using some form of mapping tool as a front end to the data.