[Part 3 - Implementation of a "Web GIS"] Something for the Public: A web accessible National Sites and Monuments Record // sweeting.org


PART 3 - IMPLEMENTATION OF A "WEB GIS"

Any system of this nature would have three main components:

The client-side programme or Graphical User Interface (GUI),
The server-side database,
The database/GUI interface or gateway programme.

The approach chosen for any one part of such a system will greatly influence the approach used for the others. The component parts must be compatible, so a good understanding of the different technologies is essential.

CLIENT-SIDE PROGRAMMING: THE GUI

Whilst each component of such a system is important individually, it is fair to argue that the design of the GUI is paramount to the system's success. This is what the user has to look at and navigate around, so it needs to be both intuitive and aesthetically pleasing. No matter how effective the database server is, people will not come back to a site that is a pain to navigate.

The CANMORE front end is a good example of a well-designed GUI. It is clearly laid out, and the tasteful "corporate image" is sustained throughout. This front end does credit to the hidden (and extremely effective) workings of the database.

HTML Forms

Nearly all of the sites presented so far have one thing in common: they accept user input through forms. Forms are an effective way of accepting data from the user, and can be easily integrated with many programmes on the server through the use of a Common Gateway Interface (CGI) programme. Forms communicate with server programmes in one of two very different ways. The first method is "Get". This is the most basic way of sending data, the information being added to the URL as "URL encoded" data. Consider the example in Figure 11 (below).

The information contained within this form would be encoded as follows:

http://URL/cgi-bin/server_script.cgi?name=mark&email=nobody%40nowhere.co.uk

So the actual data is just added to the URL. The file "server_script.cgi" need not actually be a programme at all. It could just be a static file, which may contain JavaScript (discussed below) for example, that could then perform some client-side processing of the data.
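The encoding step described above can be sketched in JavaScript. This is only an approximation of what a browser does, not its actual implementation: the function name `encodeFormData` is hypothetical, and the sketch relies solely on the standard `encodeURIComponent` function, with spaces converted to "+" as form encoding expects.

```javascript
// Hypothetical helper: encode form fields roughly the way a browser
// does for the "get" method (application/x-www-form-urlencoded).
function encodeFormData(fields) {
  return Object.keys(fields)
    .map(function (name) {
      // Reserved characters become %XX escapes; spaces become "+".
      var key = encodeURIComponent(name).replace(/%20/g, "+");
      var value = encodeURIComponent(String(fields[name])).replace(/%20/g, "+");
      return key + "=" + value;
    })
    .join("&");
}

// The two fields from the form in Figure 11.
var query = encodeFormData({ name: "mark", email: "nobody@nowhere.co.uk" });
var url = "http://URL/cgi-bin/server_script.cgi?" + query;
console.log(url);
```

The "@" in the email address is not allowed to appear literally in this part of a URL, which is why it is sent as the escape sequence "%40".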

Replacing the "get" method with "post" makes the form behave very differently. The Post method sends the information to the server in the body of the HyperText Transfer Protocol (HTTP) request, where the server programme reads it as an input stream. HTTP is simply a protocol, a language used for communication, just like FTP (File Transfer Protocol), TCP (Transmission Control Protocol) or TELNET (Network Terminal Protocol), in much the same way that humans may use French, German or Spanish. Every HTTP request is fairly similar, whether you request a standard "html" file or a server programme. Example 2 (below) shows the HTTP request header for the code shown in Figure 11.

<form method="get" action="cgi-bin/server_script.cgi">

Enter your name: <input type="text" name="name" size="15">

Enter your email address: <input type="text" name="email" size="15">

<input type="submit" value=" O.K. ">

</form>
[Rendered form: text boxes labelled "Enter your name:" and "Enter your email address:", with an "O.K." submit button]

Figure 11: An example web form and the HTML code required to create it.

GET /cgi-bin/server_script.cgi?name=mark&email=nobody%40nowhere.co.uk HTTP/1.0
Referer: http://some.domain.com/a_page.html
Connection: Keep-Alive
User-Agent: Mozilla/4.50 (X11; I; Linux 2.0.35 i686)
Host: localhost
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8

Example 2: The HTTP request header for the form in Figure 11, using the GET method.

Example 3 lists the HTTP request for the "post" method of the same form. The form information is still included as part of the HTTP request, but it now follows the headers as the request body, separated from them by a blank line. Note also that two extra header lines, Content-type and Content-length, need to be added to the request.

POST /cgi-bin/server_script.cgi HTTP/1.0
Referer: http://some.domain.com/a_page.html
Connection: Keep-Alive
User-Agent: Mozilla/4.50 (X11; I; Linux 2.0.35 i686)
Host: localhost
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
Content-type: application/x-www-form-urlencoded
Content-length: 38

name=mark&email=nobody%40nowhere.co.uk

Example 3: The HTTP request header for the form in Figure 11, using the POST method.
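To make the layout of a POST request concrete, the following JavaScript sketch assembles one as plain text. The function `buildPostRequest` is a hypothetical helper written for illustration; it emits only the headers that matter for the form data and leaves out the browser-specific ones (User-Agent, Accept and so on).

```javascript
// Hypothetical helper: assemble the text of an HTTP/1.0 POST request
// carrying URL-encoded form data in its body.
function buildPostRequest(path, host, body) {
  // URL-encoded data is plain ASCII, so its character count equals its
  // length in bytes, which is what Content-length must report.
  var headers = [
    "POST " + path + " HTTP/1.0",
    "Host: " + host,
    "Content-type: application/x-www-form-urlencoded",
    "Content-length: " + body.length
  ];
  // A blank line separates the headers from the body.
  return headers.join("\r\n") + "\r\n\r\n" + body;
}

var request = buildPostRequest(
  "/cgi-bin/server_script.cgi",
  "localhost",
  "name=mark&email=nobody%40nowhere.co.uk"
);
console.log(request);
```

The Content-length value is computed from the body rather than typed by hand, because the server uses it to decide how many bytes of form data to read.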

In terms of "functionality", the method used to send the form data is generally irrelevant – the only difference is how you decode the data at the server end. There are however a few considerations to be made when deciding the method to use. Rafe Colburn (1998:68) summarises them:

If the user may wish to bookmark the script's results for future reference, then the GET method should be used. The form data is then added to the bookmark file entry as part of the URL.
If you use a hyper-link pointing to a CGI programme, the method is forced to be GET. For example:
```
        <a href="news.cgi?id=454">A news story</a>
```
If the form includes a lot of fields, the URL can get very long and messy using the GET method. The POST method removes this problem.
Any private data such as passwords or credit card numbers is clearly visible to onlookers in the location box when using the GET method. Also, the browser will cache your information, and log files on the server will contain this data. So the GET method is the less secure of the two methods.

It is interesting to note that the ADS and CANMORE projects both use the POST method with their forms. This prevents any bookmarking of results, but it also stops any local caching and, consequently, out-of-date results.
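The earlier remark that the two methods differ only in how the data is decoded at the server end can be sketched as follows. The helper names are hypothetical. Under CGI, the GET data would normally reach the programme through the QUERY_STRING environment variable and the POST data through standard input; here both are simply passed in as strings.

```javascript
// Hypothetical helper: turn URL-encoded data back into name/value pairs.
function decodeFormData(encoded) {
  var fields = {};
  if (encoded === "") {
    return fields;
  }
  encoded.split("&").forEach(function (pair) {
    var eq = pair.indexOf("=");
    var name = eq === -1 ? pair : pair.slice(0, eq);
    var value = eq === -1 ? "" : pair.slice(eq + 1);
    // "+" stands for a space; %XX escapes are reversed by decodeURIComponent.
    fields[decodeURIComponent(name.replace(/\+/g, " "))] =
      decodeURIComponent(value.replace(/\+/g, " "));
  });
  return fields;
}

// GET: the data follows the "?" in the requested URL.
function fieldsFromGet(requestUrl) {
  var mark = requestUrl.indexOf("?");
  return decodeFormData(mark === -1 ? "" : requestUrl.slice(mark + 1));
}

// POST: the data is the request body.
function fieldsFromPost(requestBody) {
  return decodeFormData(requestBody);
}

console.log(fieldsFromGet("/cgi-bin/server_script.cgi?name=mark&email=nobody%40nowhere.co.uk"));
console.log(fieldsFromPost("name=mark&email=nobody%40nowhere.co.uk"));
```

Once the data has been located, the same decoding routine serves both methods, which is why the choice between them rarely affects what the server programme can do.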

The Use of JavaScript™

JavaScript is a client-side scripting language developed by Netscape. It is used to embed small programmes or scripts within an HTML document. Whilst it is not strictly an integral part of HTML forms, it is used extensively to increase the function of a form. For example, when a user clicks the submit button on a form, a JavaScript programme could check any information entered by the user for errors, confirm the data is correct, format it ready for the server programme, and then finally submit it. This saves time by moving some of the workload from the server to the client's machine. Even though this may be only a small amount of work for the client (the ADS use a short script to confirm users have actually entered data; see Figure 12), it could remove a noticeable load from the server if it has many concurrent users.


           <SCRIPT LANGUAGE="JavaScript">

           function verify(form) {
             if(form.query.value == "") {
               alert("You must enter a search term");
             } else {
               form.submit();
             }
           }

           </SCRIPT>

Figure 12: A short JavaScript to eliminate the server-side processing of empty forms (after ADS, 1998(c)).
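The fuller sequence described above (check, tidy, then submit) might look something like the sketch below. It is not taken from any of the sites discussed: the function `checkFields` and the field names `name` and `email` are assumptions borrowed from the form in Figure 11, and the email test is deliberately loose.

```javascript
// Hypothetical helper: check and tidy form values before submission.
// It returns a result object rather than touching the page, so the same
// function could be called from a form's submit handler.
function checkFields(values) {
  var errors = [];
  var cleaned = {
    name: String(values.name || "").trim(),
    email: String(values.email || "").trim().toLowerCase()
  };
  if (cleaned.name === "") {
    errors.push("You must enter your name");
  }
  // Loose check only: something@something.something
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(cleaned.email)) {
    errors.push("You must enter a valid email address");
  }
  return { ok: errors.length === 0, errors: errors, values: cleaned };
}

var result = checkFields({ name: "  mark ", email: "Nobody@Nowhere.co.uk" });
console.log(result.ok, result.values);
```

A page would call the form's submit method only when `ok` is true and otherwise show the collected messages. The server programme should still repeat the checks, since a client may have JavaScript switched off.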

It should be mentioned that the client's web browser must support JavaScript for it to work. Similarly, the correct version of JavaScript must be supported. The two most popular web browsers, Netscape Navigator and Microsoft Internet Explorer, do not implement the same versions of JavaScript (Microsoft actually "tweaked" JavaScript, adding some of their own "improvements" to create a new language called JScript), and this does introduce some problems. In a similar manner to the adoption of the Data Standards developed by RCHME, ALGAO and EH, the World Wide Web Consortium (W3, http://www.w3.org) have agreed on the use of Netscape's JavaScript as the standard scripting language. In the spirit of platform independence and transportability, therefore, it is this that should be adhered to.

Java™

Java is a programming language from Sun Microsystems. One of the key ideas behind Java is that it is a platform independent language. That is to say that programs written in Java can run on any machine that has a Java Virtual Machine (JVM). The JVM acts as an interpreter for the Java byte code, and it is that which figures out how to execute the code (Linden, 1997:23).

Java Applets

Perhaps one of the most exciting aspects of this language is the new range of possibilities presented through the use of Java Applets. Applets are essentially "mini programmes" that are designed to be embedded within web pages, just as an image file may be. A Java applet has almost all the capability that any desktop application may have. It could simply scroll text across the screen, or it could check stock market shares and draw graphs of the results.

Through Java's Abstract Window Toolkit (java.awt) it is possible to create GUIs that have the same "look and feel" as the operating system (OS) they are running in (see Figure 13). Every type of input device, from buttons to pull-down menus, is available to the programmer, so it is possible to create "pop-up" application windows that download in a web page and look and feel just like a normal program.

Java has been developed within the last few years, during the period in which we have seen a huge explosion in the use of the Internet. Consequently, it is richly featured with respect to networking and Internet protocols. Alongside this, Java's database connectivity (JDBC) gives the already highly functional API (Application Programmer's Interface) for database work (java.sql) improved performance and integration with databases that support it. As a result, the popularity of Java has grown steadily, and the language is improved with every release.

Java could also easily support an "interactive" and "dynamic" display for the information. The graphics capability of Java is highly oriented to drawing on-screen images using vector data, though it can quite happily handle raster images. The standard graphics capability is provided through the java.awt.Graphics API, and the new Java 2D API (java.awt.Graphics2D) released with Version 1.2 of Sun's Java Development Kit (JDK) provides even greater flexibility. For instance, JDK 1.1.x would allow you to draw rectangles, ovals, lines, and polylines, but the fill was always uniform and the line width was fixed at 1 pixel. Java 2D allows you to draw virtually any shape, with different line styles, fill styles and transparencies, mixes and textures (Loeb, B., 1999:44).

Again, in a similar fashion to the JavaScript and JScript problems, Microsoft introduced their own language for creating applets, known as J++. This only works in Microsoft Internet Explorer, and not on other platforms such as the Mac OS, UNIX, OS/2 and BeOS, and there were also problems with Java running in Microsoft Internet Explorer. These problems are now being solved, but once again I would advocate using the version of Java from Sun Microsystems, as this is the W3 approved standard.

Provided that the client supports Java, it would be quite feasible to create a very basic GIS Java application/applet. It could be downloaded via the WWW and allow the user to query databases and have the results displayed graphically instead of (or as well as) in tabular form.

SERVER-SIDE PROGRAMMING

The "server end" allows far greater flexibility than at the client end. Unlike the client, we know exactly what the software/hardware set-up is, and can design the system accordingly.

Database Software

The choice of database software would depend primarily on the platform the server was to run on and on which platform the system administrator was familiar with. Microsoft, for example, ship a dedicated database server product known as "SQL Server". It is probably more typical to use a database purchased separately from the server operating system, particularly in the case of Web Servers. Both Microsoft Windows NT and the various UNIX flavours are well catered for when it comes to suitable database applications.

The next factor that would influence the choice of database has to be whether it does what you want it to do. If the client applet were to use Java Database Connectivity (JDBC, discussed below), then this must be supported by the database. If there are special field formats within the data sets, then they must be supported by the database. A common problem found with database applications is that the date/time formats can vary considerably from application to application.

It would also be preferable if the database could support the same structures as the exeGesIS package being pushed by the RCHME. I can see no reason why there may be problems here though – assuming you employ a relational database on your server.

It is worth noting here that, whilst the exeGesIS software uses established (though not necessarily standard) software (MS Access and MapInfo), the use of standard data-types and formats should allow future integration with other software. This was one of the key features of the ALGAO/RCHME/EH Corporation Statement (RCHME, 1998(b)), and is a very important part of it.

Finally, any additional features that make the database attractive should be considered. For example, Oracle 8 has a special "cartridge" (or additional software) known as the "Spatial Cartridge". This is designed to speed up spatially related queries by essentially allowing the indexing of spatial information to be tailored to suit the density of information over a particular area. The records are stored in tables or blocks relating to their spatial position. In areas where there is a high density of records, these tables would be broken down to cover a smaller geographical area, whilst holding a fairly average number of records. Whilst this may not improve the search speed for a standard query based upon "period" or "classification", it is clearly beneficial when downloading additional map information such as roads, water features and other "layers".
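The idea of tiles that shrink where records are dense can be illustrated with a simple quadtree. The sketch below is not Oracle's implementation and makes no claim about how the Spatial Cartridge works internally: the `Tile` type, its `capacity` parameter and the record layout are all assumptions made for illustration.

```javascript
// Illustrative quadtree: a tile splits into four smaller tiles once it
// holds more than `capacity` records, so dense areas get smaller tiles.
function Tile(x, y, size, capacity) {
  this.x = x;            // west edge
  this.y = y;            // south edge
  this.size = size;      // width and height of the square tile
  this.capacity = capacity;
  this.records = [];
  this.children = null;  // four sub-tiles after a split
}

Tile.prototype.contains = function (r) {
  return r.x >= this.x && r.x < this.x + this.size &&
         r.y >= this.y && r.y < this.y + this.size;
};

Tile.prototype.insert = function (r) {
  if (!this.contains(r)) { return false; }
  if (this.children === null) {
    this.records.push(r);
    if (this.records.length > this.capacity && this.size > 1) {
      this.split();
    }
    return true;
  }
  return this.children.some(function (child) { return child.insert(r); });
};

Tile.prototype.split = function () {
  var half = this.size / 2, c = this.capacity, self = this;
  this.children = [
    new Tile(this.x, this.y, half, c),
    new Tile(this.x + half, this.y, half, c),
    new Tile(this.x, this.y + half, half, c),
    new Tile(this.x + half, this.y + half, half, c)
  ];
  var old = this.records;
  this.records = [];
  old.forEach(function (r) { self.insert(r); });
};

// Collect records inside the box [x1, x2) x [y1, y2), skipping any
// tile that does not overlap it.
Tile.prototype.search = function (x1, y1, x2, y2, found) {
  found = found || [];
  if (x2 <= this.x || x1 >= this.x + this.size ||
      y2 <= this.y || y1 >= this.y + this.size) {
    return found;
  }
  this.records.forEach(function (r) {
    if (r.x >= x1 && r.x < x2 && r.y >= y1 && r.y < y2) { found.push(r); }
  });
  if (this.children !== null) {
    this.children.forEach(function (child) {
      child.search(x1, y1, x2, y2, found);
    });
  }
  return found;
};

// Three records close together and one far away, at most two per tile.
var index = new Tile(0, 0, 1024, 2);
[{ id: 1, x: 10, y: 10 }, { id: 2, x: 12, y: 11 }, { id: 3, x: 14, y: 9 },
 { id: 4, x: 900, y: 900 }].forEach(function (r) { index.insert(r); });
console.log(index.search(0, 0, 16, 16).map(function (r) { return r.id; }));
```

A search for a small map window only descends into the tiles that overlap it, so the cost depends on the local density of records rather than on the size of the whole data set.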

Oracle 8 provides JDBC support, but it also supports a technique known as Java Remote Method Invocation (RMI). Java RMI gives remote applications (e.g. Applets) the ability to call methods or functions built into the database, such as additional searching algorithms. It means that an applet can be broken down into two parts, one used at the client end and the other working on the server. "Raw" Java data types can be passed directly between classes using "data streams". This process cuts down much of the formatting and conversion that takes place using other network streams. While this may increase server workload, it makes the applet smaller, meaning a quicker download. It is also more efficient as the "server half" is actually running within the database itself, meaning there are fewer processes running on the server.

The Informix SE database offers something similar to the RMI – "stored procedures". If a procedure or method is often used (such as data formatting prior to being drawn on the screen), then it can be stored with the database to cut down communication time between the client and server (see http://www.informix.com/informix/techbriefs/se/se.htm, 12/01/98).

Finally, another very important consideration is the cost of the software. Here, the Informix database is a winner, as it is free for non-commercial use. If the "Web Accessible NMR" was running at a University for example, it may well be free for use.

Gateway Programmes

Bridging the gap between client and server is usually some form of gateway programme. Traditionally, gateway programmes, also known as Common Gateway Interface (CGI) programmes, were written in "Perl". Other languages that can be used include C/C++, Java, UNIX shell languages and AppleScript. With the growing popularity of Java, Java Database Connectivity (JDBC – discussed below) is also becoming an increasingly popular tool for database work.

Practical Extraction and Report Language

Perl is a scripting language that finds excellent use as a "gluing language" (one that can join many programs written in different languages together), and as a text processing or report generating language. Both these factors and the strong integration within the UNIX environment mean Perl has become the standard CGI language. It is quick to learn and is very richly featured. Not only that, but the popular Perl module, "perl-cgi", enables even easier script development, and the embedded Perl interpreter available for the Apache web server speeds up processing quite markedly. Other modules enable the manipulation and production of graphics, and also database integration (such as "oraperl" for the Oracle database or "perl-postgres" for the popular PostgreSQL database).

It would be quite feasible to write a Perl script that took arguments from a standard HTML form, queried a database, and then drew the results as a GIF file for inclusion within a web page. However, if you intended to pursue this method, it would perhaps be wise to use an actual GIS application running on the server to perform all the work, such as GRASS (as discussed above).

Java Database Connectivity

JDBC is an API from JavaSoft that uses SQL statements to access databases from Java programmes. JDBC provides an API that supports drivers for several databases, allowing Java programmes to communicate with many different databases on many different platforms (Doherty and Manning, 1998:537-8).

Using JDBC then, an applet can communicate directly with a supported database server, without the need to go via an HTTP server or any form of gateway programme. Conversely, if a gateway programme were to be used to perform extra processing of data, this could use JDBC itself to connect to a local database.

Java Servlets

Java Servlets are, in essence, the opposite of Applets: they run on the server as opposed to the client. Due to Java's good networking ability, Java Servlets can make very effective servers, and only a few lines of code are needed to create a fully functional server programme. The benefit here is that a server can be written to do exactly what you want, dealing with data passed to it in exactly the manner you wish. For instance, if a Java server were listening for requests on a specific port, the HTTP server and any database server could be bypassed, meaning any client-side processing that was to be done could be carried out immediately as no conversions between programming languages would be required. Java servlets are an underlying feature of Java RMI (discussed above), though they can also function on their own.

OTHER PROGRAMMING CONSIDERATIONS

So far, the practical aspects of getting data out on the web have been considered. It is now time to consider the National Monuments Record itself.

The first question to consider is "Is it worth reinventing the wheel?" The "wheel" may be any component that would make up an "on-line GIS" – from the GUI to the database.

My "vision" is for a vector based GIS Applet because it would provide the client greater flexibility and control over the appearance of their "map", and the streaming of vector data would allow map objects to be added as they are loaded. This would help keep the user occupied through an otherwise boring period of time.
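The streaming behaviour described here can be sketched as a reader that hands over each map object as soon as enough data has arrived to complete it. Everything in the sketch is hypothetical, including the one-object-per-line wire format and the names `VectorStreamReader` and `parseLine`. It is written in JavaScript for brevity, whereas the applet itself would be written in Java.

```javascript
// Hypothetical wire format: one map object per line,
//   "<id> <x1>,<y1> <x2>,<y2> ..."
// The reader accepts data in arbitrary chunks and hands over each
// object as soon as its line is complete.
function VectorStreamReader(onObject) {
  this.buffer = "";
  this.onObject = onObject;
}

VectorStreamReader.prototype.feed = function (chunk) {
  this.buffer += chunk;
  var newline;
  while ((newline = this.buffer.indexOf("\n")) !== -1) {
    var line = this.buffer.slice(0, newline).trim();
    this.buffer = this.buffer.slice(newline + 1);
    if (line !== "") {
      this.onObject(parseLine(line));
    }
  }
};

function parseLine(line) {
  var parts = line.split(/\s+/);
  return {
    id: parts[0],
    points: parts.slice(1).map(function (pair) {
      var xy = pair.split(",");
      return { x: Number(xy[0]), y: Number(xy[1]) };
    })
  };
}

// Simulate a slow connection delivering the data in three pieces.
var drawn = [];
var reader = new VectorStreamReader(function (obj) {
  drawn.push(obj.id); // a real client would draw the object here
});
reader.feed("road1 0,0 10,5 2");
reader.feed("0,5\nriver1 3,3 ");
reader.feed("4,8\n");
console.log(drawn);
```

Because each object is passed on the moment its line is complete, the first features can appear on the map while the rest of the data is still downloading.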

Not many currently available products would function in the manner suggested above. The closest would be the AutoDesk product MapGuide.

According to AutoDesk, MapGuide is "designed specifically to meet the data distribution needs of GIS users" with "live maps with author-defined querying and reporting capabilities accessible to any number of viewers" (AutoDesk publicity flyer). The GIS supports many common formats of vector and raster data, as well as various mapping formats such as MapInfo MIF/MID, ESRI ArcView SHP, and Atlas GIS BNA.

To use the software, you must buy the full complement of software – AutoDesk MapGuide Server and AutoDesk MapGuide Author. To use the data over the Web the client must download a 2 Mb "plug-in". Once saved locally though, all the user needs to do subsequently is download the GIS data.

The product is extremely effective, and does appear to bring a very useful subset of typical GIS tools to the WWW. One may argue that it brings too much functionality: "Joe Bloggs" would not be interested in monument intervisibility, for example. Nevertheless, integration with current file formats is important, and this product appears to do that too. The main drawback for the user would be the 2 Mb plug-in, though this would be a one-off burden to download. For the NMR, it may well be the licensing cost that is prohibitive.

The alternative to the vector approach is the raster approach. Again, the commercial products available today all support the standard GIS data formats, and MapInfo's MapXtreme should interface directly with the exeGesIS software.

This approach avoids the client-side processing of data and the burden of downloading an applet or plug-in. It would also enable the use of free and already proven GIS software like GRASS as a back-end. If pursuing the "free" route, then operating systems such as Linux could be used to provide the server platform. Linux has been shown to be an extremely stable and robust operating system (for example Saran, 1999). As companies such as IBM and HP are now offering system support for it, and others like Oracle, Informix, Sybase and InterBase are offering high-end DBMS for it, Linux is becoming a seriously viable alternative.