This article originally appeared in Crystallographic Computing 5. From Chemistry to Biology, papers presented at the International School on Crystallographic Computing held at Bischenberg, France, 29 July - 5 August, 1990, edited by D. Moras, A. D. Podjarny and J. C. Thierry, International Union of Crystallography, Oxford University Press, 1991. It has been updated to reflect recent changes in computers and software.

Improved Productivity Through Crystallographic Packages

Bert Frenz, B. A. Frenz & Associates, Inc.

Today crystallographers know so much about their science that the vast majority of small molecule structures solve, refine and converge on accurate results with a minimum of effort. Ideally, we would like to design a machine into which we pour diffraction data at one end and from which we receive publishable papers at the other. However, unlike many other areas of computational science, crystallographic software does not condense easily into a single job step. Instead, we progress through a large collection of job steps, each with a myriad of options. As a result, our productivity relates directly to the software tools that we use.

Our efficiency in performing these job steps is enhanced by the following factors:

I. Completeness

Any crystallographic software system must contain the essentials: data reduction, structure solution, structure refinement, and derived results. However, because of the variance in crystals, chemical composition, molecular weight and space group, a complete system must include a plethora of optional functions that extend beyond the basics. In a sense, the more functions the better, because one never knows what crystallographic problems will appear on the horizon. Because SDP for Windows builds on a long history of academic and commercial software developments, it is one of the most comprehensive systems available today. Table I gives a checklist of the many functions included in SDP for Windows.


II. Integration

Early in the history of crystallographic computing, programmers recognized the need to have the output from one program flow as input into the next (see Frenz, 1988, for a history of crystallographic computing). Eventually this concept advanced to the formation of crystallographic packages that combine many software modules into one coherent self-contained product. Three widely used packages are SHELX, XTAL and SDP, with hundreds of installations each.

Probably the foremost advantage of program packages is that the burden of development, maintenance and support shifts to a single research group or company. In his address at the special session entitled "Crystallographic Computing for the 1990's: What Can We Expect?," Robert Langridge stated that software maintenance represents 75% of the cost over the lifetime of a computer system (Langridge, 1990). In the same session, Richard Feldmann of the National Institutes of Health predicted increased commercialization of programming because their studies showed it was more rational to buy software tools than to do the development themselves. He remarked that the scale of effort was not necessarily in the realm of an individual (Feldmann, 1990). A chart shown by Langridge conceptualizes the magnitude of this effort (see Figure 1). He explained that if an individual writes one program for personal use, the work effort is defined as "1". Making that same program friendly so others can use it increases the work effort to p. Similarly, incorporating a single program into a system of programs increases the work effort to p. Finally, making a system of programs friendly magnifies the effort to p², which he admits is an underestimate. Thus it is an order of magnitude more time consuming to incorporate a single program into a friendly shareable system than to write the program for your own use. My own experience definitely bears this out. In fact, I might add that writing the system for Windows increases the effort by another order of magnitude.

    Friendly Program: p          Friendly Program System: p²
    Program: 1                   Program System: p

Figure 1. Programming Effort. (adapted by Langridge from F.P. Brooks, "The Mythical Man Month")


III. Flexibility

In the past it was essential that source code be available so crystallographers could make changes in the software. Changes were necessary because there were constant exceptions to the rule and every crystal structure seemed to require a different set of conditions to solve, refine or converge. As crystallography matured, programmers built more and more options into the original code. If they also documented these adequately, the need for source code modifications diminished. (As a result of this diminished need, many modern-day students do not have the software skills to make code changes.) In addition to having many options, it is important that the software assign default values for as many options as possible, and that it choose these defaults intelligently on the basis of the structural problem at hand.

A good example of flexibility is the choice of weighting schemes for least-squares refinement. There seems to be no clear consensus on which is the best weighting scheme. A system that allows a choice of a half-dozen or so methods is desirable. Each of these options should include three or more adjustable parameters with logically assigned default values that depend on the intensity data.
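As a rough illustration of what a choice of weighting schemes might look like, the sketch below offers four selectable schemes, each with adjustable parameters and defaults. It is not taken from SDP for Windows: the scheme names, the function names, and the default values p = 0.04, a = 0.1 and b = 0 are assumptions chosen for the example. The last scheme follows the form commonly used for refinement on F².

```python
"""Selectable least-squares weighting schemes (illustrative sketch only)."""


def unit_weights(fo_sq, sigma, fc_sq):
    """w = 1 for every reflection."""
    return 1.0


def statistical_weights(fo_sq, sigma, fc_sq):
    """w = 1 / sigma^2, from the estimated standard deviation alone."""
    return 1.0 / sigma ** 2


def instability_weights(fo_sq, sigma, fc_sq, p=0.04):
    """w = 1 / (sigma^2 + (p * Fo^2)^2); p down-weights strong reflections."""
    return 1.0 / (sigma ** 2 + (p * fo_sq) ** 2)


def two_parameter_weights(fo_sq, sigma, fc_sq, a=0.1, b=0.0):
    """w = 1 / (sigma^2 + (a*P)^2 + b*P), with P = (max(Fo^2, 0) + 2*Fc^2) / 3."""
    p_term = (max(fo_sq, 0.0) + 2.0 * fc_sq) / 3.0
    return 1.0 / (sigma ** 2 + (a * p_term) ** 2 + b * p_term)


# Hypothetical registry: the user picks a scheme by name.
SCHEMES = {
    "unit": unit_weights,
    "statistical": statistical_weights,
    "instability": instability_weights,
    "two-parameter": two_parameter_weights,
}


def weight(scheme, fo_sq, sigma, fc_sq, **params):
    """Return the weight of one reflection; omitted parameters use defaults."""
    return SCHEMES[scheme](fo_sq, sigma, fc_sq, **params)
```

A call such as `weight("instability", 100.0, 2.0, 90.0)` uses the default p, while `weight("instability", 100.0, 2.0, 90.0, p=0.06)` overrides it. A fuller system would derive the defaults from the intensity data rather than fix them as constants.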


IV. Ease of Learning

Traditionally the computer has played the role of a powerful calculating machine. More recently it has become clear that the computer can serve a second important role, namely, that of a teaching machine. This is particularly important for new students who are chemists or physicists using crystallography as an analytical tool, in contrast to crystallographers with extensive experience (see for example, Bond & Carrano, 1995). Yet even among experienced crystallographers, there is resistance to learning new techniques when one is comfortable with a previously learned method (how many crystallographers use ORTEP and PLUTO interchangeably, rather than knowing only one of these drawing programs?). Therefore, for both new and experienced users, anything that improves ease of learning will benefit productivity.

Ease of learning relates directly to the interface between the user and the computer, a topic discussed below. Here let's concentrate on a few of the other elements. Choice of words is important; too often software and software documentation expect the user to have too deep an understanding of crystallographic and computer jargon. Inappropriately, software requests parameters defined in terms of short computer-language-like words such as TITL, NREFL, and TEXP. Similarly, requesting answers to be 0 for yes, or 1 for no (or was it the reverse?) is unnecessarily confusing. Requesting unnecessary information is another common shortcoming. For example, the software can estimate the expected number of molecules per unit cell with 99% accuracy and it need not request this as input. Also, any parameter entered during one job step need not be requested again later in another step. Ease of learning is greatly enhanced by documentation that is complete, well written, intelligently organized, and includes a detailed index. Making the most helpful information available to the user at a single keystroke on the computer saves thumbing through a manual. Combining the above features with a well designed user interface can turn the computer into a powerful tool for teaching crystallography.
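To make the point about unnecessary input concrete, here is a minimal sketch of how a program could estimate the number of molecules per unit cell instead of asking for it. It assumes the common rule of thumb of roughly 18 cubic angstroms per non-hydrogen atom for organic compounds; the function name, the default list of allowed values, and the nearest-value rounding are assumptions for the example, not a description of how SDP for Windows does it.

```python
def estimate_z(cell_volume, n_non_h_atoms,
               allowed_z=(1, 2, 3, 4, 6, 8, 12, 16, 24, 32),
               volume_per_atom=18.0):
    """Estimate the number of molecules per unit cell (Z).

    cell_volume      unit cell volume in cubic angstroms
    n_non_h_atoms    non-hydrogen atoms in one molecule
    allowed_z        candidate Z values (hypothetical default list; in
                     practice these would follow from the space group)
    volume_per_atom  assumed volume per non-hydrogen atom (rule of thumb)
    """
    if cell_volume <= 0 or n_non_h_atoms <= 0:
        raise ValueError("cell volume and atom count must be positive")
    raw = cell_volume / (volume_per_atom * n_non_h_atoms)
    # Pick the allowed value closest to the raw estimate.
    return min(allowed_z, key=lambda z: abs(z - raw))
```

For a cell of 1800 cubic angstroms and a molecule with 25 non-hydrogen atoms, the raw estimate is 1800 / (18 × 25) = 4, so `estimate_z(1800.0, 25)` returns 4. The program can then present this value as a default that the user may override.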


V. Hardware Speed

Existing performance measurements, e.g., the Whetstone index, provide a narrow perspective in that they usually measure only one feature. First, the measured feature may or may not be relevant to actual scientific calculations, and second, the feature usually is not measured in combination with other features that also affect results. To ensure that the performance measurements are meaningful to scientific work, a set of timing tests was selected from actual calculations commonly performed during crystallographic investigations. Five tests were chosen, as shown in Table II.

Table II. Timing Tests with Results for Digital XL590 90 MHz Pentium computer.*

Test Name   Description                                                        Time
LS/200      Full-matrix least-squares refinement (200 variables, 1500 data)    25 sec
LS/400      Full-matrix least-squares refinement (400 variables, 3000 data)    287 sec
FOUR        Difference Fourier with peak search (3000 data, 15 sections)       5 sec
SHELXS/P    Patterson interpretation (C2/c, 4000 data, 5 heavy atoms)          5 sec
SHELXS/D    Direct methods (P2₁2₁2₁, 1000 data, 20 atoms)                      7 sec

* Configuration included 90-MHz Pentium processor. The SDP Performance Index is 439.

Outside of graphics and word processing, more than 95% of the computer usage in small-molecule crystallography is attributable to calculations such as those included in the above timing tests. Each test comprises a complete job step, not a subset. Thus the relationship between disk input/output volume, memory usage, and numeric calculations is identical to real-world problems.

In the full gamut of hundreds of timing tests, execution times varied from 5 seconds to over 25 hours. To simplify reporting, the results are given relative to the performance of the original IBM PC-XT computer, for which an SDP Performance Index is defined as 1.0. The index for a particular computer is the average of the indices for each of the five tests. Figure 2 charts performance for various classes of microcomputers. The bulk of the timing tests were performed on microcomputers. This partially reflects the thrust taken by SDP for Windows marketing. It also reflects a growing worldwide trend. Unlike supercomputers, mainframes, and minicomputers, personal computers (PCs) are readily available worldwide because of reduced trade restrictions, lower cost, and easier local repair capabilities. This change in computer availability has broadened the crystallographic community to include scientists in diverse parts of the world and in smaller colleges, universities, and industries. The improved performance and convenience of low cost desktop computers provide the direct benefit of improved productivity.

Figure 2. Hardware Performance Using Crystallographic Software
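The SDP Performance Index described above can be sketched in a few lines. The sketch assumes that the index for each test is the reference time divided by the measured time, so that a faster machine scores higher; the text does not spell out this ratio, and the function and variable names are hypothetical. The IBM PC-XT reference times are not given in this article, so the example only checks that a machine compared with itself scores 1.0, using the Table II times.

```python
from statistics import mean


def performance_index(times, reference_times):
    """Average, over all tests, of reference time / measured time.

    Both arguments map a test name to an execution time in seconds.
    The ratio form is an assumption: a machine identical to the
    reference scores 1.0, and one twice as fast scores 2.0.
    """
    missing = set(reference_times) - set(times)
    if missing:
        raise KeyError(f"no timing for: {sorted(missing)}")
    return mean(reference_times[name] / times[name] for name in reference_times)


# Times from Table II (seconds).
TABLE_II = {
    "LS/200": 25.0,
    "LS/400": 287.0,
    "FOUR": 5.0,
    "SHELXS/P": 5.0,
    "SHELXS/D": 7.0,
}

# Sanity check: any machine measured against itself has an index of 1.0.
print(performance_index(TABLE_II, TABLE_II))
```

Because the index is an average of per-test ratios rather than a ratio of total times, a short test such as FOUR carries the same weight as the much longer LS/400.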


VI. User Interface Speed

Previous authors and manufacturers have equated computer speed with hardware speed and have ignored the human component of speed. Computers have become faster and faster, but without improvements in the way we interact with computers, we may not be getting the true benefit of the faster machines. Traditionally, crystallographers have been at the forefront of computer technology and have always been among the first to implement new computers and new computer interfaces. Yet recently, crystallographic software has not kept pace with the rapid advances and improvements in the user interface, as shown diagrammatically in Figure 3. As Philip Bourne states, "the card deck still rules" (Bourne, 1990). Somewhere in the early 1970s most crystallographic software adopted a card image format for communicating between the user and the computer. Even though computer cards have long been extinct, much of today's crystallographic software still communicates through 80 column images. Meanwhile mainstream software has advanced to dialogue, then menus, and now windows as better ways to communicate with the user. There are some notable exceptions of crystallographic software that have kept pace with the changes, but the bulk of software in use today still operates under a 25-year-old mentality.

One of these exceptions is SDP for Windows. Its user interface is written for Microsoft's Windows 3.11 and Windows 95. Simply clicking on the SDP icon opens a window with the same look and feel as hundreds of other off-the-shelf Windows programs. Beginning students, having been exposed to computers since they started their education, intuitively adapt to running this crystallographic software. In addition to the crystallographic functions, the Windows menus handle all operations involving printing, viewing, creating, or deleting files; in other words, the user does not need to know anything about the computer operating system. The Windows menu tree starts with logical keywords: File, Edit, Data, Solve, Atoms, Refine, Graphics, Publish, View, Help. Using the mouse to click on any of these shows the underlying choices. For example, the sequence to execute a least-squares refinement is Refine, Full-matrix, and the sequence for a particular type of absorption correction is Data, Absorption correction, Psi-scan.

Once a selection has been made, a table of adjustable parameters appears on the screen. It shows all parameters, their definitions and their values for this particular job step. For example, under least-squares refinement, one of the parameters is the number of cycles to be executed. The current value can be changed simply by typing over the existing value. Every parameter has a default value; thus, in a sense, the user can execute every module as a "black box". In actual practice, structures vary and user preferences differ, so it is likely the user routinely will adjust one or more parameters. Recommendations for alternative values, as well as a more detailed explanation of the adjustable parameter, show as a three-line window that appears whenever the cursor highlights a parameter. Above the table is a row of icons representing the tasks that can be performed. These icons represent: save, exit, calculate, modify, view summary, view details, view graph, print and undo. Typically the sequence is execute followed by view summary. The summary table has a format similar to the parameter screen for ease of use. If the user desires more details, the view details option uses the mouse for online scrolling through traditional computer output files. The graph icon shows ORTEP, PLUTO, and ProGraph (macromolecular) drawings, as well as an array of charts and graphs such as decay and absorption curves, refinement results and structure analysis plots.
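The parameter table just described can be pictured as a small data structure in which every parameter carries a definition, a default and a hint, and the user overrides only what needs changing. The sketch below is a guess at such a structure, not the actual SDP for Windows design: the class names, the default of 4 cycles, and the hint text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str          # label shown in the table
    definition: str    # short explanation shown next to the value
    default: object    # value used when the user changes nothing
    hint: str = ""     # longer advice shown when the parameter is highlighted


class JobStep:
    """A job step whose parameters all have defaults ("black box" use)."""

    def __init__(self, title, parameters):
        self.title = title
        self.parameters = {p.name: p for p in parameters}
        self.overrides = {}

    def set(self, name, value):
        """Type over the existing value of one parameter."""
        if name not in self.parameters:
            raise KeyError(f"unknown parameter: {name}")
        self.overrides[name] = value

    def values(self):
        """Current value of every parameter: override if set, else default."""
        return {name: self.overrides.get(name, p.default)
                for name, p in self.parameters.items()}


# Hypothetical example: the default and hint are invented for illustration.
refine = JobStep("Full-matrix least-squares refinement", [
    Parameter("Number of cycles", "Refinement cycles to execute", 4,
              "Increase if parameter shifts remain large"),
])
```

Calling `refine.values()` without any changes returns the defaults, which corresponds to running the module as a black box; `refine.set("Number of cycles", 8)` changes one value and leaves the rest untouched.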

The use of color graphics in crystallography improves user productivity. Pictures convey information more quickly than do words. With the rapid advancement of high resolution color devices, there is a great opportunity to simplify the task of understanding crystallography. An example is the way SDP for Windows uses interactive graphics in the structure solution step and in orienting the model for publication drawings. During structure solution the Fourier peak locations show as peak numbers on a color graphics screen. Lines connect numbers if the peaks lie within bonding distances defined by atomic radii rules. To improve viewing perspective, the user can freely rotate the model in real time using simple mouse movements. The user can also click on peaks/atoms (1) to display distances, angles, and torsional angles, (2) to identify coordinates, (3) to delete spurious peaks, and (4) to rename valid peaks to appropriate atom types. Additional functions tie interactive real-time graphics to least-squares refinement, electron density contouring, and presentation graphics such as ORTEP, PLUTO, and ProGraph.
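The rule that connects peaks lying within bonding distance can be sketched as follows. This is a minimal version under stated assumptions, not the SDP for Windows algorithm: two sites are joined when their separation is no more than the sum of their covalent radii plus a tolerance. The radii are approximate literature values, and the 0.4 Å tolerance, the carbon-like radius for unassigned peaks, and the use of Cartesian coordinates are choices made for the example.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstroms (illustrative values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}
DEFAULT_RADIUS = 0.76  # assumption: treat an unassigned peak like carbon


def find_bonds(sites, tolerance=0.4):
    """Return pairs of labels whose sites lie within bonding distance.

    sites      list of (label, element or None, (x, y, z)) with Cartesian
               coordinates in angstroms
    tolerance  slack added to the sum of the two radii (assumed value)
    """
    bonds = []
    for (label_a, elem_a, pos_a), (label_b, elem_b, pos_b) in combinations(sites, 2):
        limit = (COVALENT_RADII.get(elem_a, DEFAULT_RADIUS)
                 + COVALENT_RADII.get(elem_b, DEFAULT_RADIUS)
                 + tolerance)
        if dist(pos_a, pos_b) <= limit:
            bonds.append((label_a, label_b))
    return bonds
```

With three unassigned peaks at x = 0, 1.5 and 5 Å along one axis, only the first two are joined, because 1.5 Å is below the 1.92 Å limit while the other separations are not. Once the user renames a peak to an atom type, the same function applies that element's radius.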

In conclusion, a carefully designed crystallographic package is one that is (1) complete enough to handle a diversity of structural problems, (2) well integrated into a single entity to reduce errors and to eliminate unnecessary steps, (3) flexible enough to handle options and personal preferences, (4) easy to learn for new students of crystallography, (5) running on a computer that is fast enough to return results while the project is still clearly on one's mind, and (6) designed with a user interface that enhances the ease of use and the enjoyment of the process. Crystallographic packages of this type improve our productivity in structure determination, thereby shifting us away from maintenance work and focusing our efforts on obtaining results and advancing science.

References

"Computers and Crystallography: Joint Progress." B. Frenz, Computers in Physics, Vol. 2, No. 3, May/June 1988, pp. 42-47.

"The Pocket Supercomputer: Andromeda Strain Revisited." B. A. Frenz, American Crystallographic Association Meeting, April 8-13, 1990, New Orleans, LA, Abstracts, p. 31.

"Computing to the Millennium." R. Langridge, American Crystallographic Association Meeting, April 8-13, 1990, New Orleans, LA, Abstracts, p. 27.

"Workstations of the Future." R. J. Feldmann, American Crystallographic Association Meeting, April 8-13, 1990, New Orleans, LA, Abstracts, p. 27.

"The Crystallographic Workbench." P. E. Bourne, P. M. Marquess and W. A. Hendrickson, American Crystallographic Association Meeting, April 8-13, 1990, New Orleans, LA, Abstracts, p. 28.

"Introductory Crystallography in the Advanced Inorganic Chemistry Laboratory." M. R. Bond and C. J. Carrano, J. Chemical Education, Vol. 72, No. 5, May 1995, pp. 451-454.



Last modified: April 27, 2002