Copyright © 1997 - 2023 Science Tools Corporation. All rights reserved.

HISTORICAL DOCUMENT

First Published in June, 1998, delivered at Goddard SFC, Greenbelt, MD

New Methods in Science Computing

We provide a set of tools, loosely known as "BigSur" for historical reasons. Among these tools are STDB (the Science-Tools Database) and DPS-DE and DPS-EE (the Demand Processing System's Demand Engine and Eager Engine, respectively). The following outlines the overall architecture and benefits of this system.

Our system represents a new perspective on performing science. For example, from reception of satellite sensor data to presentation of carefully analyzed information on a person's desktop, performing modern earth science is an exercise in managing data. The BigSur System™ addresses the end-to-end challenge, from data collection, through processing, to delivery of results.

Our system learns to handle each researcher's scientific abstractions and to harness their existing tool-sets, while imposing as small a burden upon the researcher as possible. Automating the mechanisms of scientific processing relieves the researcher of many tasks and leaves more time to focus on research instead of data management.

Modern database technology ties the pieces together. It also provides a straightforward approach to interoperability between research groups, since the system can be browsed by any application that wishes access. The meta-data presented in the database conforms to the FGDC Standard and borrows from the Canadian SAIF Standard, and is well suited to managing data that have geospatial and temporal attributes.

The meta-data provides a rich collection of attributes that can be associated with data. This permits all users to find what they are interested in: from pointers to white-papers to actual code that manipulates a researcher's objects, it's in there! Geospatial searches are greatly enhanced through use of special R-Tree indexing, which makes browsing of large collections practical. And researchers are free to use as much or as little of the system as they wish; the system does not impose great obligations in order to be useful.

The BigSur Perspective...

Earth Science is all about managing data…

…we provide the natural solution:

A database-centric model for science data management and processing.

Object-management, work-flow management, and distributed processes and objects are built into the core design.

Primary Benefits

Scientific Notebook - A repository for data about scientific data
Workflow System - Parent-child associations of scientific functions are managed
Automation of Processing - The system can learn how to run your processes
Distributed Objects - Fully distributed meta-data, universal naming and special "snipping" functions
Distributed Processing - Any system can host your scientific processes (and objects)
Resource Discovery - Robust metadata and special indexes assist your applications
Multiple-Dimension Objects - From cropping to Multi-Variate Point Operations...

These features promote Scientific Defensibility!

Features

Complete end-to-end solution:

Object management including

Multi-Dimensional Array Support

Distributed Objects

Process management

Workflow system

Distributed Objects and Processes

Utilization of existing tools

Processing tools

Visualization tools

General purpose browser

Objects
data
“conceptual”
process instances
Processes
Relationships
object to object
process to process
object to process

Objects
Database meta-data includes:

Temporal & Spatial domains

Parent Object references

Parent Processes

Process definition

Process instance

Objects may be real or conceptual!

Processes
Process definitions are kept in the database...

Process source may be stored

Process arguments are known

Process definitions may be merely notebook entries, or may be capable of being dispatched by the Distributed Processing System.

Relationships
One of BigSur's strongest features is its powerful and very flexible management of relationships, including:

Objects to other objects

Objects to Processes

parent process definition

parent process instance

Processes to other Processes

Scientific Defensibility

BigSur tracks how objects are created

Process definition, and source

Arguments used

Parent data-sets

Therefore, the complete lineage is known.

If questions arise, or problems are discovered, not only can the details be traced, but processing can be repeated with corrected parents.
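
The lineage idea above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory catalog in place of the database; the record layouts (`ProcessInstance`, `DataObject`) and the `lineage` helper are invented for illustration and are not BigSur's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProcessInstance:
    """One recorded run of a process (hypothetical record layout)."""
    definition: str                              # name of the process definition
    arguments: dict                              # arguments used for this run
    parents: list = field(default_factory=list)  # names of parent data-sets


@dataclass
class DataObject:
    name: str
    created_by: Optional[ProcessInstance] = None  # None for ingested data


def lineage(name, catalog):
    """Return every ancestor of `name`, nearest generation first."""
    ancestors, pending = [], [name]
    while pending:
        run = catalog[pending.pop(0)].created_by
        if run is None:          # ingested object: no parents recorded
            continue
        for parent in run.parents:
            if parent not in ancestors:
                ancestors.append(parent)
                pending.append(parent)
    return ancestors


# Hypothetical catalog: raw data is calibrated, then regridded.
catalog = {
    "raw": DataObject("raw"),
    "calibrated": DataObject(
        "calibrated", ProcessInstance("calibrate", {"gain": 1.02}, ["raw"])),
    "gridded": DataObject(
        "gridded", ProcessInstance("regrid", {"cell_km": 4}, ["calibrated"])),
}

print(lineage("gridded", catalog))  # ['calibrated', 'raw']
```

Because each object records the process definition, the arguments, and the parents that produced it, the same records that answer "where did this come from?" are also enough to rerun a step against corrected parents.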

Distributed Objects

Objects may be distributed throughout a network

BigSur permits you to manage meta-data on any site desired

Use naming convention of choice:

URL - Uniform Resource Locator

Kahn-Wilensky Handle

DLOBH - Distributed Large Object Handle

Distributed Workflow System

The workflow system provides for “eager, lazy, and push”:

Push - inbound data is ingested from the outside

Lazy - Processing only done on request

Eager - Processing done when Parent data objects are ready
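
The three modes can be sketched in a few lines. This is a toy model under stated assumptions: the `Workflow` class, its method names, and the example process names are all hypothetical, and a dictionary stands in for the database.

```python
class Workflow:
    """Toy model of push / eager / lazy processing (illustrative only)."""

    def __init__(self):
        self.objects = {}   # name -> value, for objects that exist
        self.rules = {}     # output name -> (mode, function, parent names)
        self.log = []       # order in which processes actually ran

    def define(self, output, mode, func, parents):
        self.rules[output] = (mode, func, parents)

    def _run(self, output):
        _, func, parents = self.rules[output]
        self.objects[output] = func(*(self.objects[p] for p in parents))
        self.log.append(output)
        self._trigger_eager()        # a new object may unblock eager work

    def _trigger_eager(self):
        for output, (mode, _, parents) in self.rules.items():
            ready = all(p in self.objects for p in parents)
            if mode == "eager" and ready and output not in self.objects:
                self._run(output)

    def push(self, name, value):
        """Push: inbound data is ingested from the outside."""
        self.objects[name] = value
        self._trigger_eager()

    def request(self, name):
        """Lazy: processing is done only when the result is asked for."""
        if name not in self.objects:
            for parent in self.rules[name][2]:
                self.request(parent)
            if name not in self.objects:
                self._run(name)
        return self.objects[name]


wf = Workflow()
wf.define("calibrated", "eager", lambda raw: [x * 2 for x in raw], ["raw"])
wf.define("mean", "lazy", lambda cal: sum(cal) / len(cal), ["calibrated"])

wf.push("raw", [1, 2, 3])
log_after_push = list(wf.log)   # only the eager step has run so far
mean = wf.request("mean")       # the lazy step runs on request
print(log_after_push, mean, wf.log)
```

In this sketch the eager step fires as soon as its parent arrives, while the lazy step waits until someone requests its output.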

Process Dispatching

A dispatching daemon dispatches processes from a queue when they are ready.

Multiple daemons may exist on any system in the network, and may be used to control load and work locations.

Daemons may compile and run source code, if desired, or merely execute scripts.
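
As a rough sketch of the dispatching pattern, the following uses threads in one Python program to stand in for daemons on separate hosts. The queue contents, host names, and the `dispatch_daemon` function are assumptions made for illustration; the real daemons and their queue are not described in enough detail here to reproduce.

```python
import queue
import threading


def dispatch_daemon(host, ready_queue, results, lock):
    """Run ready processes until the queue is empty (illustrative only)."""
    while True:
        try:
            name, func, args = ready_queue.get_nowait()
        except queue.Empty:
            return                       # nothing left to dispatch
        value = func(*args)              # stands in for executing a script
        with lock:
            results[name] = (host, value)


# Hypothetical ready queue holding four trivial "processes".
ready_queue = queue.Queue()
for n in range(1, 5):
    ready_queue.put((f"square-{n}", lambda x: x * x, (n,)))

results, lock = {}, threading.Lock()
daemons = [
    threading.Thread(target=dispatch_daemon,
                     args=(host, ready_queue, results, lock))
    for host in ("host-a", "host-b")
]
for d in daemons:
    d.start()
for d in daemons:
    d.join()

for name in sorted(results):
    print(name, results[name])
```

Which daemon picks up which process depends on timing, so the host recorded for each result can vary between runs; adding or removing daemons is how load and work location would be controlled in this model.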

Process Scripts

Scripts can be very helpful to a Process.

They may:

Fetch arguments from the database

Prepare the environment

Clean up the environment

Create new database entries for objects created by the process

Move objects to archives for safety

Multi-Dimension Arrays

The motivation for creating MDAs was two-fold:

Provide capability for performing “The Query From Hell” where disparate data-types are joined.

Provide good performance for distributed large object management by providing a “remote snip” capability.
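
The benefit of a "remote snip" is that only the requested sub-array crosses the network, not the whole large object. A minimal sketch, assuming a hypothetical `ArrayStore` class on the remote side and a two-dimensional array held as nested lists:

```python
class ArrayStore:
    """Stands in for a remote site holding large arrays (hypothetical)."""

    def __init__(self):
        self._arrays = {}

    def put(self, handle, rows):
        self._arrays[handle] = rows

    def snip(self, handle, row_range, col_range):
        """Return only the requested block; ranges are half-open (start, stop)."""
        r0, r1 = row_range
        c0, c1 = col_range
        return [row[c0:c1] for row in self._arrays[handle][r0:r1]]


store = ArrayStore()
# A 10 x 10 grid whose cell value encodes its position: row * 10 + column.
store.put("grid", [[r * 10 + c for c in range(10)] for r in range(10)])

tile = store.snip("grid", (2, 4), (5, 8))
print(tile)  # [[25, 26, 27], [35, 36, 37]]
```

Here the caller receives 6 cells instead of 100; for arrays of realistic size the saving is what makes distributed large-object management practical.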

Multi-Variate Point Operations

MVPOs are used to join MDAs

Provide built-in manipulation of multi-dimension objects

Joins based upon “shape” of arrays

Provide ability to define “cell” as a known data-type

This permits use of advanced object-relational features where existing functions (methods) may be easily applied.
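
A small sketch may make the idea concrete. It assumes a hypothetical `mvpo` helper that joins arrays of identical shape cell by cell, and a `Cell` type defined for the example; neither is taken from the actual BigSur implementation.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """A hypothetical cell type combining two variables at one point."""
    temperature: float
    humidity: float


def mvpo(func, *arrays):
    """Apply `func` point by point across arrays that share one shape."""
    shapes = {(len(a), len(a[0])) for a in arrays}
    if len(shapes) != 1:
        raise ValueError("arrays must share the same shape to be joined")
    rows, cols = shapes.pop()
    return [[func(*(a[r][c] for a in arrays)) for c in range(cols)]
            for r in range(rows)]


temperature = [[18.0, 22.0], [25.0, 15.0]]
humidity = [[0.4, 0.7], [0.6, 0.9]]

# Join two arrays into one array of typed cells...
joined = mvpo(Cell, temperature, humidity)

# ...then apply an existing function to each cell.
flags = mvpo(lambda cell: cell.temperature > 20 and cell.humidity > 0.5,
             joined)
print(flags)  # [[False, True], [True, False]]
```

Because each cell is a known type, any function written for that type can be applied across the whole array without further glue code.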
