GridFields: Algebraic Manipulation of Simulation Results

Scientists' ability to generate and store simulation results is outpacing their ability to analyze them via ad hoc programs. We observe that these programs exhibit an algebraic structure that can be used to facilitate reasoning and improve performance.

GridFields separate topology from data attributes.

In our work on GridFields, we present a formal data model that exposes this algebraic structure, then implement the model, evaluate it, and use it to express, optimize, and reason about data transformations in a variety of scientific domains.

Simulation results are defined over a logical grid structure that allows a continuous domain to be represented discretely in the computer. Existing approaches for manipulating these gridded datasets are incomplete. The performance of SQL queries that manipulate large numeric datasets is not competitive with that of specialized tools, and the up-front effort required to deploy a relational database makes relational systems unpopular for dynamic scientific applications. Tools for processing multidimensional arrays can only capture regular, rectilinear grids. Visualization libraries accommodate arbitrary grids, but no algebra has been developed to simplify their use and afford optimization. Further, these libraries are data dependent: physical changes to data characteristics break user programs.

GridFields expose topological equivalences between simulation results, which facilitates certain computations.

We adopt the grid as a first-class citizen, separating topology from geometry and separating structure from data. Our model is agnostic with respect to dimension, uniformly capturing, for example, particle trajectories (1-D), sea-surface temperatures (2-D), and blood flow in the heart (3-D). Equipped with data, a grid becomes a gridfield. We provide operators for constructing, transforming, and aggregating gridfields that admit algebraic laws useful for optimization. We implement the model by analyzing several candidate data structures and incorporating their best features. We then show how to deploy gridfields in practice by injecting the model as middleware between heterogeneous, ad hoc file formats and a popular visualization library.
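The ideas in the paragraph above can be illustrated with a small Python sketch. It is not the GridFields implementation or its API: the names `Grid`, `GridField`, `bind`, `restrict`, and `aggregate`, the choice to identify a cell by the set of node ids it touches, and the rule that removing a cell also removes the higher-dimensional cells containing it are all assumptions made for illustration. The sketch binds data to a topology-only grid, filters it, aggregates node values onto triangles, and checks one algebraic law of the kind an optimizer might exploit: two restrictions over the same dimension commute.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, FrozenSet, List

# Assumption: a cell is identified by the set of node ids it touches.
Cell = FrozenSet[int]


@dataclass(frozen=True)
class Grid:
    """Topology only: cells grouped by dimension (0 = nodes, 2 = faces, ...)."""
    cells: Dict[int, FrozenSet[Cell]]


@dataclass(frozen=True)
class GridField:
    """A grid equipped with data bound to its cells of dimension k."""
    grid: Grid
    k: int
    data: Dict[Cell, float]


def bind(grid: Grid, k: int, f: Callable[[Cell], float]) -> GridField:
    """Construct a gridfield by attaching a value to every k-cell."""
    return GridField(grid, k, {c: f(c) for c in grid.cells[k]})


def restrict(gf: GridField, pred: Callable[[float], bool]) -> GridField:
    """Keep the k-cells whose value satisfies pred.

    Assumed rule for this sketch: a higher-dimensional cell is dropped
    when it contains a removed k-cell; lower dimensions are untouched.
    """
    kept = frozenset(c for c in gf.grid.cells[gf.k] if pred(gf.data[c]))
    removed = gf.grid.cells[gf.k] - kept
    cells: Dict[int, FrozenSet[Cell]] = {}
    for j, cs in gf.grid.cells.items():
        if j < gf.k:
            cells[j] = cs
        elif j == gf.k:
            cells[j] = kept
        else:
            cells[j] = frozenset(
                c for c in cs if not any(r <= c for r in removed)
            )
    return GridField(Grid(cells), gf.k, {c: gf.data[c] for c in kept})


def aggregate(gf: GridField, target_k: int,
              combine: Callable[[List[float]], float]) -> GridField:
    """Move data to cells of another dimension by combining incident values."""
    out: Dict[Cell, float] = {}
    for t in gf.grid.cells[target_k]:
        incident = [gf.data[c] for c in gf.grid.cells[gf.k]
                    if c <= t or t <= c]
        out[t] = combine(incident)
    return GridField(gf.grid, target_k, out)


# Two triangles sharing an edge; node temperatures are made-up values.
nodes = frozenset(frozenset({i}) for i in range(4))
triangles = frozenset({frozenset({0, 1, 2}), frozenset({1, 2, 3})})
grid = Grid({0: nodes, 2: triangles})
temperature = {0: 10.0, 1: 12.0, 2: 14.0, 3: 20.0}

gf = bind(grid, 0, lambda c: temperature[next(iter(c))])


def warm(v: float) -> bool:
    return v >= 12.0


def cool(v: float) -> bool:
    return v <= 14.0


# Algebraic law checked here: restrictions on the same dimension commute.
a = restrict(restrict(gf, warm), cool)
b = restrict(restrict(gf, cool), warm)
assert a == b

# Aggregation: average the node temperatures onto each triangle.
tri_mean = aggregate(gf, 2, mean)
```

Because the topology (`Grid`) is kept apart from the bound data, the same grid can carry several attributes, and a rewrite such as reordering the two restrictions can be justified by the law rather than by inspecting the data.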

In this project, we define, develop, implement, evaluate and deploy a model of gridded datasets that accommodates a variety of complex grid structures and a variety of complex data products. We evaluate the applicability and performance of the model using datasets from oceanography, seismology, and medicine and conclude that our model-driven approach offers significant advantages over the status quo.

Here at CMOP, we use GridFields to implement query services over ocean circulation model results, and we have added a GridFields plugin module for the VisTrails provenance and visualization system.

GridFields: Model-Driven Data Transformation in the Physical Sciences, Bill Howe, PhD Dissertation, Portland State University, 2007.

Algebraic Manipulation of Scientific Datasets
Bill Howe, David Maier
VLDB Journal, 14(4), November 2005

More information

Attachment                  Size
dissertation.pdf            5.45 MB
howemaier_vldbjournal.pdf   495.24 KB
