Computational Science and Engineering:
How to Educate the Next Generation
John Guckenheimer
Computers have had a profound
impact upon science and engineering since their invention in the middle of the
twentieth century. That impact continues to grow with the capacity of
computers, communications networks and information repositories accessible via
the web. Therefore, all disciplines need to incorporate computation into the
education of their young scientists and engineers. This “white paper” is an exploration of
computational science education framed in the context of Cornell University.
Its purpose is to stimulate the development of effective computational science
and engineering academic programs at Cornell. The focus will be on the three
largest undergraduate colleges: Agriculture, Arts and Sciences and Engineering
and upon graduate education, organized through the Graduate School into Fields
of Study. Throughout the document, we include engineering in the sciences that
are part of computational science.
Computational Science
Computation has taken a place
with experiment and theory as a mode of doing science. More powerful computers,
better software and electronic data repositories all broaden the access to
computational science. Individuals do not need to be expert programmers,
mathematicians or computer scientists to engage in computational science any
longer. However, they do need to become skilled and intelligent “users” to make
effective use of computational tools. Scientific research seeks to extend the
frontiers of knowledge, so computational science entails the development of new
models, new ways of analyzing models and data, and new ways of interpreting the
output from computations. The capabilities of the computational tools themselves
also become increasingly important as enablers of science at the deepest
levels. In many research areas, it is impossible to write down equations or
laws from which predictions can be derived analytically. A few examples
illustrate this point.
Scientific progress depends on
our ability to construct computer models of the processes we study and the
effectiveness of computational methods to extract information from these
models. Evaluation of the science requires an understanding of modeling
approximations and computational accuracy, and comparison of computational
output with empirical data.
An essential part of
computational science is extension of the technology it employs. This
technology includes the mathematical foundations of algorithms, the programming
languages in which models are implemented and the operating systems of the
computers. Parallel computers having complex memory hierarchies dominate high
performance computing and multi-core chips are now bringing parallel
computation to the desktop. Achieving computational efficiency on these
machines is a technical challenge that influences the science that will be
accomplished. The creation of high quality software for the solution of
numerical problems remains a difficult challenge. Everywhere we seek to push
the frontiers of scientific computation, we discover new limitations of current
methods. Thus computational science requires a spectrum of activities that includes algorithmic
research, software development, modeling, data collection and curation of data
repositories in addition to computing itself. This set of activities is not
aligned with the traditional departmental organization of universities and is
hardly represented in the curricula that we offer students. Essential aspects
of computational science need established academic homes if they are to become
an important part of the educational system. Computational science within the
disciplines requires both new facilities and new research organizations that go
beyond individual faculty members working with a few students and postdoctoral
fellows. While individual investigator groups remain fundamental units for
conducting scientific research, large organizations are needed to create
federated data archives of high quality that are rapidly becoming essential
scientific resources.
Large supercomputer centers were
established twenty years ago to provide resources for computational science
when the cost of computers was thousands of times higher than it is today. The
technical staff of these centers includes a substantial group of professional
computational scientists who provide services to “users” but do not engage in
scientific research themselves. Centers continue to provide direct support for
the most demanding applications, but much of the work that once required a
supercomputer and technical help in using those computers is easily done at the
desktop today. High performance computing centers are adapting to the
remarkable power of desktop computing and the communications capability of
networks. The 2003 NSF report Revolutionizing Science and Engineering through
Cyberinfrastructure (http://www.communitytechnology.org/nsf_ci_report/) presents a vision in which centers assume increasing responsibility for
services that support collaborations across entire research areas in addition
to services to individual users. The development of cyberinfrastructure requires new
types of organizations within scientific communities, creation of new research
areas that are oriented toward technology, shifts in disciplinary boundaries
and new modes of interactions among disciplines. The payoff for this change and
added complexity is a qualitative change in our ability to study scientific
problems of pressing social and economic importance.
Our
educational system has responded slowly to the demands of computational
science. The June 2005 report Computational Science: Ensuring America’s
Competitiveness (http://www.nitrd.gov/pitac/) of the President’s Information
Technology Advisory Committee (PITAC) calls for new structures, programs and
institutional incentives to train computational scientists. No discipline has taken primary
responsibility for computational science education, and much of the evolution
of computational science technologies has taken place outside the mainstream of
our academic programs. Apart from the centers described above, universities
have been largely users rather than
developers of computational science. The focus has been on application of
existing methods to solve scientific problems more than on expansion of the
suite of available methods. Moreover, mainstream educational programs offer few
opportunities to develop expertise in computational science methodologies. This
situation is a danger to US economic competitiveness. The importance of
computational science technology to industry is hard to overstate. Computation
gives companies advantages in how they design and build their products.
Computational capability is a critical factor in determining winners and losers
in the global economy and in national health and security as well as in the
scientific marketplace of ideas. We need to act on the recommendations of the
PITAC report to create mainstream programs in our universities that take
responsibility for the development of the
cross-cutting “core” of computational science and for education of
students in computational science.
We have done little to adapt the
instruction that science students receive to the growing use of computation in
the conduct of science. One can argue about how much change has taken place,
but the time has come for us to reexamine thoroughly the courses and curricula that
we teach. The basic goals of educating scientists to think analytically
and critically have not changed, but
computers have changed fundamental aspects of how we solve problems. Simulation of complex models and
computational methods for solving mathematical problems enable us to transcend
the limits of what could be done “by hand” when theories that predate computers
were formulated. Much of the science and mathematics that we teach today would
have been formulated differently if computers had been available at the time of
its discovery. Implementing appropriate changes throughout the science
curriculum is a process that has only begun. Moreover, most of the classrooms
we teach in were also designed before the age of computers. Computational
science is best learned by doing, and we should establish environments where
active learning takes place in a supervised fashion.
Thus, there are two goals that
Cornell University should adopt with regard to computational science education:
Educating the Next Generation
There are several principles
that can help guide the implementation of academic computational science
programs.
Academic programs that adhere to
these principles require organizations that are complementary to traditional
departments. The Faculty of Computing and Information Science is an innovative
structure for the support of these academic programs. The Graduate School at
Cornell with its Fields also provides a framework for establishing
computational science programs at the graduate level. Three current
interdisciplinary Graduate Fields support computational science directly. The
Graduate Field of Applied Mathematics was established in 1963 and has served
for decades as a home for students whose primary focus is numerical analysis
and scientific computation. Since the requirements of this Field do not mesh
fully with the background and needs of students engaged in computational science
in science and engineering disciplines, two new fields have been established, a
Graduate Field of Computational Biology and a minor Graduate Field of
Computational Science. The establishment of new fields is only the first step
in ensuring that there will be curricula that meet the needs of students in
computational science.
There have been two previous
studies of computational science academic programs at Cornell. In the early
1990s, the Theory Center prepared materials that surveyed computing research
and faculty in different disciplines and highlighted recommendations for
students interested in pursuing computational science. This survey concluded
that existing fields were adequate to meet student demand and proposed only
greater use of existing minors as a mechanism for cross disciplinary training.
After the Faculty of Computing and Information Science was established,
Charlie van Loan headed a Study Group in 2000-01 to impart a sense of unity to
what Cornell does in computer science and engineering with coherent educational
programs that serve both undergraduates and graduates. This Study Group
surveyed courses in computational science at Cornell and programs at other
universities. This document draws upon these efforts, in particular as
benchmarks of faculty engaged in
computational science and the curricula available to students over at
least the past fifteen years.
The evolution of academic
curricula happens on a time scale of decades and is slow compared to the
changes that have taken place in computational science over the past
twenty-five years. Incorporating computational science components into courses within
established disciplines is challenging. At one level, we need to determine how
computation will be included within the “foundation” courses in mathematics,
basic science and programming that we now require. This can take the form of
either new courses that replace or add to existing ones or revisions of
existing courses. Basic topics can be more effectively addressed at this level
than in disciplinary courses whose focus is computational science methods,
models and analysis for that discipline. Pedagogically, there is also a need
for common mathematics and computing courses to highlight the use of abstract
concepts and general methods in other disciplines. Interdisciplinary
coordination of curriculum revisions is required to develop a shared vision for both undergraduate and
graduate computational science curricula. Incorporating practice into computational
science instruction is similar to teaching laboratory courses in that it requires
specialized teaching environments. Providing space for these facilities,
maintaining equipment and developing
instructional materials require
sustained institutional support that goes beyond the resources required for
typical lecture courses.
This document can only make a
cursory assessment of the state of computation in our curricula for science and
engineering students. It is easier to identify courses whose primary content
involves computation than to find smaller computational components of other
courses. The 2000-01 study of van Loan lists three levels of courses. Basic
programming courses are taught primarily by the computer science department,
with smaller 100 level courses in biological and environmental engineering and
in earth and atmospheric sciences addressed to students in those majors. The
computer science courses include one and two credit introductory courses in
Unix tools and programming in the languages C and C++ as well as more extensive
courses that introduce programming concepts, algorithms and data structures.
These courses are taught in two “flavors” with varying emphasis upon scientific
computation. Both introduce Matlab, a commonly used programming environment in
computational science that is not specifically oriented toward large-scale
computing. The study lists nine more advanced undergraduate courses, one of
which is no longer in the catalog of courses and two of which have changed
numbers from 200 level to 300 level courses. Six of these courses, CEE 241, AEP
438, BTRY 421, COMS 322, COMS 421 and MATH 425, are numerical analysis and
methods courses that teach the general algorithms underlying most numerical
computing. COMS 321 is a computational biology course, MAE 470 is a course on finite
element calculations for mechanical and engineering design and PHYS 480/680 has
evolved to a more specialized course that emphasizes computation of material
properties. A few new courses in computational biology have been added to the
curriculum, notably in bioinformatics and dynamic models in biology. The six
general courses listed above overlap, but none builds upon another. There
are no undergraduate course sequences in numerical analysis.
At the graduate level, Computer
Science 621 (Matrix Computation), 622 (Numerical Optimization and Nonlinear
Equations) and 624 (Numerical Solution of Differential Equations) form a core
sequence in numerical methods for scientific computing. Currently, 621 is
taught yearly in the fall and 622 and
624 are taught in alternate years in the spring. Ten years ago, the scientific
and parallel computing group within the Computer Science Department had four
faculty, but Trefethen and Coleman have left Cornell and have not been replaced by
faculty whose core expertise is in numerical analysis. The CIS study group identified an additional
ten courses in advanced scientific computing in the 2000-01 Course Catalog,
most in more specialized areas than the courses listed above. Two of these ten
courses are no longer in the catalog and three of the remaining eight will not
be offered in the 2005-06 academic year. The statistics with regard to “data
gathering and display” are more dismal. The study group listed
six graduate level courses in this area, of which three are no longer in
the catalog and only one is being offered in the 2005-06 academic year. The
group identified four courses in modeling and simulation, one of which is no
longer in the catalog. Since 2000-01,
three or four new courses have been added to the ones that existed then, but
the net change is negative. These data suggest that our academic programs in
computational science are hardly robust.
The curriculum described above
is inadequate to sustain a distinctive program in computational science.
Analysis of faculty appointments over the decade 1995-2005 reinforces the
conclusion that computational science has suffered from a lack of concerted
attention at Cornell during the past decade. The Cornell curriculum needs to be
enhanced to provide a strong foundation for thriving computational science
programs. This raises questions about how Cornell can best use its existing
faculty to support computational science programs and where additional faculty
are needed to sustain strong programs.
Interdisciplinary organizations
are used to support computational science programs across the country, almost
without exception. The field structure of the Graduate School and the Faculty
of Computing and Information Science provide an excellent organizational
structure for interdisciplinary programs, assuming adequate resources are
allocated to the programs. The Faculty of CIS is an academic organization whose
resources can augment those of colleges and departments for computational
science programs. It can serve as the “home” for these programs, ensuring that
the core areas of computational science
remain strong at Cornell and coordinating programs across “stakeholder
units.” Close cooperation among these units
will be required to develop and maintain strong computational science programs
and might not happen spontaneously.
As a starting point, we propose
the following actions: