FOCUS: CLIMATE CHANGE SOFTWARE
Enabling Open Development Methodologies in Climate Change Assessment Modeling
Joshua Introne, Robert Laubacher, and Thomas Malone,
MIT Center for Collective Intelligence
// The Radically Open Modeling Architecture (ROMA)
lets climate change policy stakeholders create and run
surrogate simulations and composite models. //
COMPUTATIONAL SIMULATION MODELS help support scientifically grounded “what if” analyses by translating specialized knowledge into tools that can project the likely future impact of current actions. Models have thus become important in a variety of policy domains. In recent years, several software platforms for environmental policy-making and urban planning have added simulation models to decision support tools to provide stakeholders with direct access to these models. This trend is continuing to gain ground.1–3

Models also play a central role for climate change policy-makers, but are so complex and computationally demanding that experts must run them and interpret their results, creating a bottleneck between models and stakeholders. This reduces the flexibility that individual stakeholders have to explore alternative scenarios and limits the number of stakeholders that can query the models. It also makes models more opaque to stakeholders because experts summarize model results and tend to omit details about the models’ assumptions. Complexity and opacity, in turn, reduce public confidence in such models.

Drawing inspiration from open source development practices, we wanted to address these problems by providing support for modularization of and open access to models that can inform climate policy deliberations. We thus developed a publicly accessible Web service called ROMA (Radically Open Modeling Architecture) that allows anyone to create, combine, and run modular simulations. ROMA currently provides the modeling functionality in the Climate CoLab (http://climatecolab.org), a collective intelligence application in which large numbers of people work together to develop proposals to address climate change.4,5 In time, we hope that ROMA will support a community focused on model development and analysis.

Design
56 I E E E S O F T W A R E | P U B L I S H E D B Y T H E I E E E C O M P U T E R S O C I E T Y
We initially developed ROMA as part
of the Climate CoLab, where community members run models to predict the
outcomes of proposals to address climate change. Modeling in the CoLab
helped crystallize two of ROMA’s technical design challenges. First, it had to
simplify complex, computationally expensive models for a broad, Web-based community; models must execute rapidly and without great loss of accuracy, and we must be able to flexibly tune any model’s interface to meet diverse users’ needs. Our second design
challenge was how to use modeling
functionality to support collective intelligence. Research has demonstrated
that large groups of diverse individuals
can find better solutions than similarly
sized groups of experts, but only if the
individuals have a basic understanding
of the domain and are free to explore
the space.6 Thus, ROMA must provide
modeling capabilities that could inject
expertise into users’ exploration of
the domain yet still let individuals try
out ideas that model creators haven’t
foreseen.
ROMA provides two core functionalities to meet these design challenges.
It provides tools for generating and running surrogate simulations of much
larger integrated assessment models
(IAMs; integrative models that predict
the impacts of climate change across
a variety of domains). Clients can run surrogate models very quickly and easily customize them to reduce the number and complexity of their inputs. ROMA also offers
a uniform API and componentized view
of models and stored model runs and
lets clients combine components to create executable composite models.
These features let stakeholders explore climate and integrated assessment models directly; the design also
supports a division of labor in which
experts in different subspecialties can
easily add new component models that
stakeholders can then combine with
others to explore competing assumptions about the world.
ROMA Service Architecture
ROMA describes simulation models by their inputs and outputs. These variables can be of any standard data type (for example, integers, doubles, or text) and can represent vector or scalar values. A model can also be associated with other metadata (such as a name and description). ROMA publishes its own URL that clients can use to run models.

FIGURE 1. Partial class diagram for the modeling service. The model and associated variable classes capture metadata about models known to ROMA. A scenario captures data for a particular run of a model in tuples. Both models and scenarios support versioning.
When a model runs, ROMA generates and stores a dataset called a scenario that contains all the concrete input and output values and a reference
to the model that generated it. For composite models, the scenario will also
contain subscenarios corresponding to
the inputs and outputs of each component model in the composite.
Because it maintains a connection
between scenarios and the models
that create them, ROMA can semiautomatically update stored scenarios
if a model changes. It can also swap
out subscenarios or replace component
models to update a composite scenario.
This enables less tightly coupled workflows around the creation of scenarios.
For example, a team developing a scenario for a global emissions policy in
the CoLab can plug in different national policy scenarios that were developed elsewhere.
Figure 1 shows a class diagram
that describes ROMA’s core components. ROMA exposes the four main
elements—models, variables, scenarios,
and tuples—via a RESTful interface
that lets Web clients retrieve XML descriptions of these entities. All other
functionality is available through a set
of Web forms.
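As a concrete illustration, a client that retrieves one of these XML descriptions might process it as follows. The element and attribute names in this sketch are assumptions for illustration, not ROMA's published schema.

```python
# Hypothetical sketch: parsing the XML description of a ROMA model.
# The schema below is invented; ROMA's actual XML may differ.
import xml.etree.ElementTree as ET

SAMPLE = """
<model id="42" name="c-learn-surrogate" version="3">
  <inputs>
    <variable name="emissions" dataType="double" cardinality="101" unit="GtCO2"/>
  </inputs>
  <outputs>
    <variable name="temperature" dataType="double" cardinality="101" unit="degC"/>
  </outputs>
</model>
"""

def describe_model(xml_text):
    """Return (model name, input variable names, output variable names)."""
    root = ET.fromstring(xml_text)
    inputs = [v.get("name") for v in root.find("inputs")]
    outputs = [v.get("name") for v in root.find("outputs")]
    return root.get("name"), inputs, outputs
```

A Web client would fetch such a description from the model's URL and use it to build an input form dynamically, much as the Climate CoLab does.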
ROMA offers two kinds of support
for combining models: mapped models
can transform other models’ inputs and
outputs, and composite models can contain other models and maintain connections among them. Figure 2 shows how
ROMA uses these types of models.
The use of mapped models allows
several transformations:
• Replication. A model can be repeated n times over incoming values
with higher cardinality. For example, a mapped model with a replication value of n > 1 can transform
a model that accepts scalar values
into a model that accepts vectors
with a cardinality n.
• Subsampling. Users can reduce any model’s output cardinality by subsampling its outputs at a given frequency. For example, a user can sample a model that provides predictions for atmospheric CO2 for each year over the course of a century at a period of 10 years to generate data for another model that requires decadal CO2 values as inputs.
• Many-to-one mapping. Users can also reduce any model’s output cardinality by applying a many-to-one mapping function—for example, sum, average, first, and last.

If users want to combine these transformations, ROMA applies them sequentially as ordered in the preceding list. Thus, ROMA first repeats a model over its inputs, subsamples the results of that operation, and finally combines them using the many-to-one mapping function if one is specified.

Composite models arrange their component models in a series of ordered, connected steps. Each step can contain any number of models as long as they have no dependencies on each other. A set of connection descriptors specifies connections from the composite model inputs to steps, connections between steps, and from steps to the composite model’s outputs. ROMA allows users to connect only those variables that have the same units, data type, and cardinality. More sophisticated compatibility checking is left to the composite model creator. Connections between steps must be from output variables in an earlier step to input variables in a later (though not necessarily adjacent) step so that cycles can’t occur. When a user runs a composite model, ROMA executes the steps in order.

FIGURE 2. A notional schematic illustrating how to connect models. A composite model consists of several steps that embed component models.

Running a composite model produces a composite scenario that contains references to scenarios generated by each of the component models. Users can replace component scenarios (that don’t have their inputs determined by upstream models) after ROMA has generated a composite scenario, in which case the system will calculate all downstream changes and update the composite scenario’s version number. Similarly, users can replace a component model in a composite model with a new model that has the same inputs and outputs and request that the system update all scenarios to the new composite model.
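A minimal sketch of the three mapped-model transformations and their fixed ordering; the function names are our own, not ROMA's API:

```python
# Illustrative sketch of the three mapped-model transformations, applied
# in ROMA's fixed order: replicate, then subsample, then many-to-one map.

def replicate(model, values):
    """Repeat a scalar model over each element of a vector input."""
    return [model(v) for v in values]

def subsample(values, period):
    """Keep every `period`-th value, e.g. decadal samples of annual data."""
    return values[::period]

def many_to_one(values, fn):
    """Collapse a vector to a scalar with sum, average, first, or last."""
    return fn(values)

# Example: wrap a scalar model over annual values, thin to decades, then sum.
annual = list(range(2000, 2101))           # 101 annual values
doubled = replicate(lambda x: 2 * x, annual)
decadal = subsample(doubled, 10)           # 11 decadal values
total = many_to_one(decadal, sum)
```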
Surrogate Models
Other domains often use surrogate models when “real” models are too expensive to run for all parameter combinations of interest or when model authors prefer to control access to their technology.7 Researchers construct surrogate models by interpolating between known data points that the actual model generated. In practice, a surrogate model is often elaborated as a researcher explores a model’s parameter space. In the case of climate and integrated assessment models, though, we generally have access to published data instead of the actual models, so we construct surrogate models based on this data.
To simplify generating surrogate
models in ROMA, the service accepts
scenario-based data—a form commonly used for presenting data from
IAMs—and automatically generates
surrogate models. We’re currently developing a user interface that will make
it easy for anyone with access to such a
set of scenarios (for instance, a model
creator) to create a surrogate model
within ROMA.
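The construction described above can be sketched as follows; the scenario data points here are invented, and a real surrogate would be built from published IAM output:

```python
# Hedged sketch: building a surrogate from scenario-based data. Each
# scenario pairs one input setting with the published output of a full
# model run; the surrogate interpolates linearly between those points.
from bisect import bisect_left

def make_surrogate(scenarios):
    """scenarios: list of (input, output) pairs from published model runs."""
    pts = sorted(scenarios)
    xs = [p[0] for p in pts]

    def surrogate(x):
        i = bisect_left(xs, x)
        if i == 0:
            return pts[0][1]             # clamp below the known range
        if i == len(pts):
            return pts[-1][1]            # clamp above the known range
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return surrogate
```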
Surrogate models provide users with a very rapid estimate of much larger models for a bounded region of their parameter space. Of course, these estimates are only approximations, the accuracy of which depends on the curve-fitting algorithm used, the amount of data available, the complexity of the output surface, and other factors. Users must weigh the trade-offs between speed and accuracy for each particular application and domain in which they employ surrogate models.
Model Execution and Spreadsheet Models
We intended for ROMA to work with
models that run on other servers. For
it to run an external model, the model
provider must present a URL that accepts a form post with values for each
input variable in the model and be able
to return data to ROMA. Although
ROMA is agnostic with respect to the
technology that runs individual models, no provision is currently made for
models that have long execution times
(greater than the HTTP request timeout) or that require scheduling.
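The hosting contract can be illustrated with a small client-side sketch; the endpoint URL and input field names are hypothetical, and the request is constructed but not sent here:

```python
# Sketch of the contract an external model host must satisfy: a URL that
# accepts an HTTP form post with a value for each input variable.
from urllib.parse import urlencode
from urllib.request import Request

def build_model_request(url, inputs):
    """Encode model inputs as an application/x-www-form-urlencoded POST."""
    body = urlencode(inputs).encode("ascii")
    return Request(url, data=body,
                   headers={"Content-Type": "application/x-www-form-urlencoded"})

req = build_model_request("http://example.org/models/c-learn",
                          {"emissions_pct_2050": -40, "afforestation": 0.2})
# urlopen(req) would run the model and return its outputs; a long-running
# model would exceed the HTTP timeout, which is why ROMA excludes such models.
```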
In addition to externally hosted models, ROMA includes functionality that can transform a spreadsheet into an executable model. Spreadsheet models map input and output variables to cells and cell ranges; the user defines this association when uploading a spreadsheet to ROMA, and user-supplied functions embedded in the spreadsheet perform all model calculations. The system uses the spreadsheet engine available from the open source Apache POI project (http://poi.apache.org) to run this type of model.
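The cell-mapping idea can be sketched in a language-agnostic way. (ROMA itself evaluates real spreadsheets with the Java Apache POI engine; this Python stand-in only illustrates the contract.) The cell addresses, variable names, and formula below are invented for illustration:

```python
# Sketch of a spreadsheet model: named variables are bound to cell
# addresses, and user-supplied formulas in the sheet do the computation.

def run_spreadsheet_model(formulas, input_map, output_map, inputs):
    """Write inputs into cells, evaluate formulas, then read outputs back."""
    cells = {input_map[name]: value for name, value in inputs.items()}
    for addr, formula in formulas.items():
        cells[addr] = formula(cells)      # each formula may read other cells
    return {name: cells[addr] for name, addr in output_map.items()}

result = run_spreadsheet_model(
    formulas={"B1": lambda c: c["A1"] * 2},   # stand-in user-supplied formula
    input_map={"temperature_change": "A1"},
    output_map={"impact_index": "B1"},
    inputs={"temperature_change": 1.5},
)
```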
Although they’re computationally
limited, spreadsheets are widely understood, and many people use them
to create informal models that support
decision-making. Thus, spreadsheets
provide an easy way to “open up” modeling to a broad community without requiring users to learn a domain-specific
language for building models.
ROMA Application in the Climate CoLab
In the Climate CoLab, some kinds of
user proposals must be attached to a
ROMA-generated scenario that predicts the impacts of that proposal. The
CoLab uses the XML data ROMA provides to generate an interface that lets
users enter input variables, run models,
and view stored results.
So far, all proposals that require models in the CoLab have been for a global agreement to address climate change, and contributors have used one of three variants of a single composite model to develop scenarios. The composite model combines a climate simulation with models that predict economic and physical systems’ impacts (see Figure 3). The model’s variants differ in the degree of granularity with which users specify emissions reduction commitments for the world’s nations.

FIGURE 3. MIT composite model inputs, modules, and outputs. The three-, seven-, and 15-region inputs for emissions are interface options that let the user specify emissions reductions at different levels of granularity.
To run the model, users specify
global land use goals and emissions
commitments broken out as inputs
by region. They can choose to specify
emissions targets for three (developed
countries, rapidly developing countries,
and other developing countries), seven
(with larger economies broken out), or
15 regions. Models in the seven- and
15-region variants of the composite
model transform the emissions inputs
into the three regions accepted by the
C-Learn Climate Model.
The MIT composite model feeds emissions commitments for three regions and land use goals into the C-Learn climate simulation. C-Learn is a version of the Climate Rapid Overview and Decision-support Simulation (C-ROADS),8 a lightweight climate model developed by Climate Interactive (http://climateinteractive.org) that can run on personal computers. C-Learn runs as a separate Web service hosted on an internal server and produces predictions for several indicators, including the climate outputs in Figure 3. C-Learn outputs are for each year from 2000 to 2100 inclusive.
Two physical impact models produce a brief textual summary of the
anticipated effects of temperature
change on geophysical (water, land,
ecosystems, and singular events) and
human systems (health and food/agriculture). This information is derived
from published research that provides
predictions for each Celsius degree of
change.10,11 The CoLab model then
captures this information as a simple
spreadsheet model that looks up the
appropriate output based on the temperature change by 2100, and uses a
mapped model to transform the vector
outputs from C-Learn into the scalar
output that the physical impacts models require.
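The combination of the two pieces, a "last" many-to-one mapping plus a table lookup, can be sketched as follows; the temperature bands and summary strings are invented placeholders, not the published figures:

```python
# Minimal sketch of the lookup step: take the last element of C-Learn's
# annual temperature vector (the change by 2100) and map it to an impact
# summary, as the spreadsheet model does. Bands and text are invented.

IMPACT_TABLE = [           # (threshold in degrees C, summary)
    (1.0, "limited impacts on most physical and human systems"),
    (2.0, "increasing stress on water supplies and ecosystems"),
    (3.0, "widespread impacts on food production and coastlines"),
]

def physical_impacts(annual_temps):
    change_by_2100 = annual_temps[-1]    # the "last" many-to-one mapping
    for threshold, summary in IMPACT_TABLE:
        if change_by_2100 <= threshold:
            return summary
    return "severe impacts across geophysical and human systems"
```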
FIGURE 4. Development of mitigation cost surrogate. For each year, emissions values in
each scenario are plotted against Gross Domestic Product (GDP) values. Surrogate models
use the resultant curve to infer GDP over the entire range of emissions values for that year.
Based on data generated by a handful of well-known IAMs, several surrogate models can compute economic
outputs. Typically, IAMs report two
types of economic costs: damage cost
(the cost of climate change reported as
a percentage deviation from anticipated
future Gross Domestic Product [GDP]
if climate change were not to occur9,11)
and mitigation cost (the cost of reducing emissions, also reported as percent
deviation GDP from an anticipated
baseline). The MIT composite reports
seven predictions using surrogate models based on data from the Stanford
Energy Modeling Forum’s (EMF) 22
exercise.12
Preparing Mitigation Cost Surrogate Models
We based mitigation cost models in
the CoLab on data generated during
the EMF 22 exercise. Modeling teams
who participated in EMF 22 simulated
a group of scenarios that reflected a range of potential global mitigation policy approaches plus a reference, called a business-as-usual (BAU) scenario, with no mitigation policy. Each
scenario involved stabilizing greenhouse gas concentrations at a particular target level. Data reported included
greenhouse gas (GHG) emissions and
sequestration and a variety of economic indicators such as GDP.
We created surrogate models to
predict the effect of emissions reduction on anticipated GDP from 2000 to
2100. We chose changes in GHG emissions as input because emission reductions are the primary mechanism by
which to achieve GHG stabilization
and because actions to reduce emissions will be the primary driver of mitigation policy costs.
To construct the surrogates, we
used two sets of data for each model:
percentage change in fossil fuel CO2
emissions versus 2005 levels and percentage reduction in GDP versus the
reference scenario (no policy or BAU).
Thus, for each model in each year, we
had n points that associate an emissions level with a percent deviation in
GDP, where n is the number of scenarios our analysis used for that model
(see Figure 4).
To determine the impact on GDP
for any emissions level in a particular
year, we located the point on a curve
that best fit the n data points available for that year and then used linear
piecewise interpolation to approximate this curve. More sophisticated approaches (for example, higher-order polynomials) are possible, but
we didn’t feel they were justified
without additional data. If emissions
levels are lower than the most aggressive scenario in a particular year,
the surrogate model doesn’t report
a value. If emission levels are higher
than BAU (the scenario for which
mitigation cost is zero), the model
simply reports zero percent change in
anticipated GDP.
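The reporting rules in the preceding paragraph amount to a clamped piecewise-linear lookup, sketched below for a single year of a single model; the data points are invented rather than drawn from EMF 22:

```python
# Sketch of the mitigation-cost lookup for one year of one model's
# surrogate: no value below the most aggressive scenario, zero cost at
# or above BAU, linear interpolation in between.

def mitigation_cost(points, bau_emissions, emissions):
    """points: (pct-of-2005 emissions, pct GDP cost) pairs, sorted ascending."""
    lowest = points[0][0]                # most aggressive scenario this year
    if emissions < lowest:
        return None                      # infeasible: no value reported
    if emissions >= bau_emissions:
        return 0.0                       # no policy, so no mitigation cost
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= emissions <= x1:
            return y0 + (y1 - y0) * (emissions - x0) / (x1 - x0)
    return 0.0                           # between the last point and BAU
```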
For policy proposals in the CoLab
that are too aggressive to be simulated
with a particular surrogate mitigation
model (for example, emissions levels
are too low in a particular year), the
system reports that the modeling team
in question likely judges the policy scenario to be technically or economically
infeasible.
Some inaccuracies arose for the CoLab mitigation cost models because
emissions values generated by C-Learn
and used as inputs to the surrogate
models didn’t correspond in every detail with the emissions values the original mitigation models used. For example, CoLab users could specify land
use goals to manipulate emissions levels, but the surrogate models didn’t incorporate this as a source of emissions.
Land use accounted for approximately
8 percent of total CO2 emissions in
2010, so differences in land use policy
would have an incremental impact on
both environmental and economic outcomes. To enhance the system’s accuracy, we’re exploring the incorporation
of land use emissions in a future surrogate model.
The modeling functionality ROMA offers to Climate CoLab users is only a subset of its
potential. We plan to introduce more
advanced functionality as we develop
organizational processes to help scaffold its use. Throughout 2011, the CoLab will launch a series of contests to
create both national and global proposals for emissions reduction. These contests will run in parallel and will be phased, with interim evaluations at the end of each phase.
Within the CoLab, the validity of
the models has been established via a
centrally administered review process
with a panel of experts. To support the
vision of an open-modeling community,
we hope to design processes and technical support to better leverage the collective intelligence of the community. For
instance, the community could be invited to look for obvious errors (infinite
or impossible values at the extrema of
the input space) for each model. Model
creators and experts might attach key
assumptions to individual models, and
experts could weigh in on the validity
of those assumptions. These assessments could be summarized to provide
policy-makers with indications about
model maturity and uncertainty.
The hurdles to creating community
processes around model creation, analysis, and validation are as much social
and organizational as they are technical. Integrated assessment models
have traditionally been implemented
as monolithic software projects developed by small teams of experts, and
these development processes have led
to the complexity and opacity that currently cause difficulties. By emphasizing modularity and offering a set of
features that allow stakeholders to become more directly involved in climate
and assessment modeling, we hope
ROMA will enable the social and organizational processes that ultimately improve our chances of creating solutions
to climate change.
ABOUT THE AUTHORS
JOSHUA INTRONE is a research scientist at the MIT Center for
Collective Intelligence and the software architect of the Climate CoLab.
His research interests include the design of mediating tools to improve
collaborative and collective performance, the impact of social network
structure on collaborative information processing, and the development
of sociotechnical architectures for problem-solving. Introne has a PhD
in computer science from Brandeis University. Contact him at jintrone@
mit.edu.
ROBERT LAUBACHER is a research scientist and associate director at the MIT Center for Collective Intelligence, where he manages
the Climate CoLab project. His research interests include developing
approaches that can make complex simulation models accessible
to interested citizens. Laubacher has an MA in modern history from
Harvard. Contact him at rjl@mit.edu.
THOMAS W. MALONE is the Patrick J. McGovern Professor of
Management at the MIT Sloan School of Management and the founding director of the MIT Center for Collective Intelligence. His research
interests include collective intelligence, organizational design, and
computer-supported cooperative work. Malone has a PhD in cognitive and social psychology from Stanford University. Contact him at
malone@mit.edu.
References
1. M. Matthies, C. Giupponi, and B. Ostendorf,
“Environmental Decision Support Systems:
Current Issues, Methods and Tools,”
Environmental Modelling & Software, vol.
22, no. 2, 2007, pp. 123–127.
2. I.S. Mayer et al., “Collaborative Decision
Making for Sustainable Urban Renewal
Projects: A Simulation-Gaming Approach,”
Environment and Planning B: Planning and
Design, vol. 32, no. 3, 2005, pp. 403–423.
3. B. Friedman et al., “Laying the Foundations
for Public Participation and Value Advocacy:
Interaction Design for a Large-Scale Urban
Simulation,” Proc. 2008 Int’l Conf. Digital
Govt. Research, Digital Gov’t Soc. North
America, 2008, pp. 305–314.
4. T.W. Malone, R. Laubacher, and
C. Dellarocas, “The Collective Intelligence
Genome,” Sloan Management Rev., vol. 51,
no. 3, 2010, pp. 21–31.
5. J. Introne et al., “The Climate CoLab: Large
Scale Model-Based Collaborative Planning,”
Proc. 2011 Conf. Collaboration Technologies
and Systems, IEEE CS Press, 2011, pp. 40–47.
6. L. Hong and S.E. Page, “Groups of Diverse
Problem Solvers Can Outperform Groups of
High-Ability Problem Solvers,” Proc. Nat’l
Academy of Sciences of the United States of
America, Nat’l Academy of Sciences, vol. 101,
no. 46, 2004, pp. 16385–16389.
7. D. Gorissen et al., “A Surrogate Modeling and Adaptive Sampling Toolbox for Computer-Based Design,” J. Machine Learning Research, vol. 11, 2010, pp. 2051–2055.
8. T. Fiddaman et al., C-ROADS Simulator
Reference Guide, Climate CoLab, 2011.
9. W.D. Nordhaus, A Question of Balance:
Weighing the Options on Global Warming
Policies, Yale Univ. Press, 2008.
10. M.L. Parry, O.F. Canziani, and J.P. Palutikof,
“Technical Summary. Climate Change 2007:
Impacts, Adaptation and Vulnerability.
Contribution of Working Group II to the 4th
Assessment Report of the Intergovernmental
Panel on Climate Change,” Report of the
Intergovernmental Panel on Climate Change,
M.L. Parry et al., eds., Cambridge Univ. Press,
2007, pp. 23–78.
11. N.H. Stern, The Economics of Climate
Change: The Stern Review, Cambridge Univ.
Press, 2007.
12. L. Clarke et al., “International Climate Policy
Architectures: Overview of the EMF 22
International Scenarios,” Energy Economics,
vol. 31, no. 2, 2009, pp. S64–S81.