Science Tools Corporation
Copyright © 1997 - 2024 Science Tools Corporation All rights reserved

HISTORICAL DOCUMENT

First Published in June, 1998, delivered at Goddard SFC, Greenbelt, MD

 

New Methods in Science Computing


We provide a set of tools, loosely known as "BigSur" for historical reasons. Among these tools are STDB (the Science-Tools Database) and DPS-DE and DPS-EE (the Demand Processing System's Demand Engine and Eager Engine, respectively). The following outlines the overall architecture and benefits of this system.


New Methods in Science Computing 

Our system represents a new perspective for performing Science. For example, from reception of satellite sensor data to presenting carefully analyzed information upon a person's desktop, performing modern earth science is an exercise in managing data. The BigSur System™ addresses the end-to-end challenge, literally from data collection, through processing, to delivery of results.

Our system learns to handle each researcher's scientific abstractions and harnesses their existing tool-sets, while imposing as small a burden upon the researcher as possible. Automating the mechanisms of scientific processing relieves the researcher of many tasks and leaves more time to focus on research instead of data-management. Modern database technology ties the pieces together and provides a straightforward approach to interoperability between research groups, since the system can be browsed by any application that wishes access.

The meta-data presented in the database conforms to the FGDC Standard and borrows from the Canadian SAIF Standard, and provides just the right perspective on managing data which have geospatial and temporal attributes. The meta-data provides a rich collection of attributes that can be associated with data. This permits all users to find what they are interested in: from pointers to white-papers, to actual code that manipulates a researcher's objects, it's in there! Geospatial searches are greatly enhanced through use of special R-Tree indexing, making browsing of large collections practical. And researchers are free to use as much or as little of the system as they wish; the system does not impose great obligations to be useful.


The BigSur Perspective...

Earth Science is all about managing data…

…we provide the natural solution: a database-centric model for Science data management and processing.

Object-management, work-flow management, and distributed processes and objects are built into the core design.


Primary Benefits

  • Scientific Notebook - A repository for data about scientific data
  • Workflow System - Parent-child associations of scientific functions are managed
  • Automation of Processing - The system can learn how to run your processes
  • Distributed Objects - Fully distributed meta-data, universal naming and special "snipping" functions
  • Distributed Processing - Any system can host your scientific processes (and objects)
  • Resource Discovery - Robust metadata and special indexes assist your applications
  • Multiple-Dimension Objects - From cropping to Multi-Variate Point Operations...

These features promote Scientific Defensibility!


Features

Complete end-to-end solution:

  • Object management including
    • Multi-Dimensional Array Support
    • Distributed Objects
  • Process management
  • Workflow system
  • Distributed Objects and Processes
  • Utilization of existing tools
    • Processing tools
    • Visualization tools
  • General purpose browser

Managed entities:

  • Objects
    • data
    • “conceptual”
    • process instances
  • Processes
  • Relationships
    • object to object
    • process to process
    • object to process


Objects

Database meta-data includes:

  • Temporal & Spatial domains
  • Parent Object references
  • Parent Processes
  • Process definition
  • Process instance
Objects may be real or conceptual!




Processes

Process definitions are kept in the database...

  • Process source may be stored
  • Process arguments are known
Process definitions may be merely notebook entries, or may be capable of being dispatched by the Distributed Processing System.




Relationships

One of BigSur's strongest features is its powerful and very flexible management of relationships, including:

  • Objects to other objects
  • Objects to Processes
    • parent process definition
    • parent process instance
  • Processes to other Processes



Scientific Defensibility

BigSur tracks how objects are created:

  • Process definition and source
  • Arguments used
  • Parent data-sets

Therefore, the complete lineage is known.

If questions arise or problems are discovered, not only can the details be traced, but processing can be repeated with corrected parents.
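
The lineage idea can be sketched as a walk over parent references. This is only a minimal illustration under stated assumptions, not BigSur's actual schema: the `ObjectRecord` and `ProcessInstance` classes, their field names, and the sample catalog are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProcessInstance:
    """One run of a process (hypothetical record layout)."""
    definition: str                               # name of the process definition
    arguments: dict                               # arguments used for this run
    parents: list = field(default_factory=list)   # names of parent objects


@dataclass
class ObjectRecord:
    """Meta-data for one object; created_by is None for ingested raw data."""
    name: str
    created_by: Optional[ProcessInstance] = None


def lineage(name, catalog):
    """Return (object, process definition, arguments) for every derived ancestor."""
    steps, seen, pending = [], set(), [name]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        run = catalog[current].created_by
        if run is None:
            continue  # raw data: nothing further upstream
        steps.append((current, run.definition, run.arguments))
        pending.extend(run.parents)
    return steps


# Hypothetical catalog: raw scan -> calibrated product -> derived map.
catalog = {
    "raw_scan": ObjectRecord("raw_scan"),
    "calibrated": ObjectRecord(
        "calibrated", ProcessInstance("calibrate", {"gain": 1.02}, ["raw_scan"])),
    "sst_map": ObjectRecord(
        "sst_map", ProcessInstance("derive_sst", {"band": 4}, ["calibrated"])),
}
trace = lineage("sst_map", catalog)
```

Because each step records the process definition and its arguments, the same records would be enough to re-run a step against a corrected parent.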


Distributed Objects

Objects may be distributed throughout a network.

BigSur permits you to manage meta-data on any site desired.

Use the naming convention of your choice:

  • URL - Uniform Resource Locator
  • Kahn-Wilensky Handle
  • DLOBH - Distributed Large Object Handle


Distributed Workflow System

The workflow system provides for “eager, lazy, and push” processing:

  • Push - Inbound data is ingested from the outside
  • Lazy - Processing is done only on request
  • Eager - Processing is done when parent data objects are ready
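
The three modes can be contrasted in a small sketch. The `Workflow` class, its method names, and the sample processes below are hypothetical and are not part of BigSur; the sketch also assumes each derived object is built at most once.

```python
class Workflow:
    """Toy workflow illustrating push, eager, and lazy processing (hypothetical)."""

    def __init__(self):
        self.objects = {}   # name -> value, for objects that already exist
        self.rules = {}     # name -> (mode, parent names, function)
        self.log = []       # order in which processes actually ran

    def define(self, name, mode, parents, func):
        self.rules[name] = (mode, parents, func)

    def _run(self, name):
        _, parents, func = self.rules[name]
        self.objects[name] = func(*(self.objects[p] for p in parents))
        self.log.append(name)

    def push(self, name, value):
        """Push: ingest inbound data, then fire eager processes whose parents are ready."""
        self.objects[name] = value
        progressed = True
        while progressed:
            progressed = False
            for target, (mode, parents, _) in self.rules.items():
                ready = all(p in self.objects for p in parents)
                if mode == "eager" and target not in self.objects and ready:
                    self._run(target)
                    progressed = True

    def request(self, name):
        """Lazy: build the object (and any missing parents) only when asked."""
        if name not in self.objects:
            for parent in self.rules[name][1]:
                self.request(parent)
            self._run(name)
        return self.objects[name]


wf = Workflow()
wf.define("calibrated", "eager", ["raw"], lambda raw: [x * 2 for x in raw])
wf.define("mean", "lazy", ["calibrated"], lambda cal: sum(cal) / len(cal))

wf.push("raw", [1, 2, 3])     # the eager process runs immediately
after_push = list(wf.log)
mean = wf.request("mean")     # the lazy process runs only now
```

After the push only `calibrated` has run; `mean` is computed when it is first requested.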


Process Dispatching

A dispatching daemon dispatches processes from a queue when they are ready.

Multiple daemons may exist:

  • on any system in the network
  • to control load and work locations

Daemons may compile and run source code, if desired, or merely execute scripts.
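
A rough sketch of queue-based dispatching follows. It assumes threads within one Python program as stand-ins for daemons on separate systems; the `dispatcher` and `run_all` functions and the host names are hypothetical and do not describe the DPS implementation.

```python
import queue
import threading


def dispatcher(host, ready, results, lock):
    """Pull ready processes off the shared queue until a None sentinel arrives."""
    while True:
        job = ready.get()
        if job is None:
            break
        name, func, args = job
        outcome = func(*args)
        with lock:
            results[name] = (host, outcome)   # record where the work ran


def run_all(jobs, hosts):
    """Start one dispatcher per host, feed them the jobs, and wait for completion."""
    ready = queue.Queue()
    results, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=dispatcher, args=(host, ready, results, lock))
        for host in hosts
    ]
    for worker in workers:
        worker.start()
    for job in jobs:
        ready.put(job)
    for _ in workers:
        ready.put(None)                       # one sentinel per dispatcher
    for worker in workers:
        worker.join()
    return results


jobs = [
    ("square", pow, (7, 2)),
    ("total", sum, ([1, 2, 3],)),
    ("largest", max, (4, 9)),
]
results = run_all(jobs, hosts=["hostA", "hostB"])
```

Which host picks up which job is not deterministic here; adding or removing dispatchers is the knob that controls load and work location in this sketch.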


Process Scripts

Scripts can be very helpful to a Process.

They may:

  • Fetch arguments from the database
  • Prepare the environment
  • Clean up the environment
  • Create new database entries for objects created by the process
  • Move objects to archives for safety
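
The duties listed above might be combined in a wrapper along the following lines. This is a sketch only: it uses an in-memory SQLite database as a stand-in for the real one, and the table names (`process_args`, `objects`), the `run_scripted` function, and the `scale` process are all hypothetical.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def run_scripted(db, process_name, archive_dir, body):
    """Wrap a process with the housekeeping a process script might perform."""
    # Fetch arguments from the database.
    args = dict(db.execute(
        "SELECT arg, value FROM process_args WHERE process = ?", (process_name,)))
    # Prepare the environment: a scratch directory for this run.
    workdir = Path(tempfile.mkdtemp(prefix=process_name + "_"))
    try:
        produced = body(workdir, args)        # the scientific process itself
        # Move the new object to the archive for safety.
        archived = Path(archive_dir) / produced.name
        shutil.move(str(produced), str(archived))
        # Create a database entry for the object the process created.
        db.execute(
            "INSERT INTO objects (name, location, parent_process) VALUES (?, ?, ?)",
            (produced.name, str(archived), process_name))
        db.commit()
        return archived
    finally:
        # Clean up the environment, even if the process failed.
        shutil.rmtree(workdir, ignore_errors=True)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE process_args (process TEXT, arg TEXT, value TEXT)")
db.execute("CREATE TABLE objects (name TEXT, location TEXT, parent_process TEXT)")
db.execute("INSERT INTO process_args VALUES ('scale', 'factor', '3')")


def scale(workdir, args):
    """Hypothetical process: writes one result file into the scratch directory."""
    out = workdir / "scaled.txt"
    out.write_text(str(int(args["factor"]) * 14))
    return out


archive = tempfile.mkdtemp(prefix="archive_")
stored = run_scripted(db, "scale", archive, scale)
```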



Multi-Dimension Arrays

The motivation for creating MDAs was two-fold:

  • Provide capability for performing “The Query From Hell” where disparate data-types are joined.
  • Provide good performance for distributed large object management by providing a “remote snip” capability.
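
One reading of “remote snip” is that the site holding a large array crops it before anything is transferred, so only the requested window crosses the network. The sketch below illustrates that reading; the `RemoteArrayStore` class, its methods, and the handle name are hypothetical, and plain nested lists stand in for MDAs.

```python
class RemoteArrayStore:
    """Stands in for a remote site holding large 2-D arrays (hypothetical)."""

    def __init__(self):
        self.arrays = {}
        self.cells_sent = 0   # counts cells that crossed the simulated network

    def put(self, handle, rows):
        self.arrays[handle] = rows

    def fetch(self, handle):
        """Transfer the whole object: every cell is sent."""
        rows = self.arrays[handle]
        self.cells_sent += sum(len(row) for row in rows)
        return [list(row) for row in rows]

    def snip(self, handle, row_range, col_range):
        """Crop at the remote site and send only the requested window."""
        (r0, r1), (c0, c1) = row_range, col_range
        piece = [row[c0:c1] for row in self.arrays[handle][r0:r1]]
        self.cells_sent += sum(len(row) for row in piece)
        return piece


store = RemoteArrayStore()
store.put("sst/1998-06", [[r * 100 + c for c in range(100)] for r in range(100)])

# Rows 10-11 and columns 20-22 of a 100 x 100 array.
window = store.snip("sst/1998-06", (10, 12), (20, 23))
```

In this example the snip moves 6 cells, where a full `fetch` of the same object would move 10,000.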


Multi-Variate Point Operations

MVPOs are used to join MDAs. They:

  • Provide built-in manipulation of multi-dimension objects
  • Join arrays based upon their “shape”
  • Provide the ability to define a “cell” as a known data-type

This permits use of advanced object-relational features where existing functions (methods) may be easily applied.
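
As a loose illustration of joining arrays by shape and treating each cell as a typed value, consider the sketch below. The source does not specify how MVPOs are implemented, so everything here is an assumption: the `Cell` type, the `join_by_shape` and `point_op` functions, and the sample data are hypothetical, and only rectangular 2-D nested lists are handled.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """A cell with a known type, so ordinary functions can be applied to it."""
    temperature: float
    cloud_fraction: float


def shape(array):
    """Shape of a rectangular 2-D nested list as (rows, columns)."""
    return (len(array), len(array[0]) if array else 0)


def join_by_shape(*arrays):
    """Combine same-shaped 2-D arrays into one array of tuples, cell by cell."""
    shapes = {shape(a) for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"shapes differ: {sorted(shapes)}")
    return [list(zip(*rows)) for rows in zip(*arrays)]


def point_op(func, temperature, cloud_fraction):
    """Apply func to every joined cell, presented as a typed Cell."""
    joined = join_by_shape(temperature, cloud_fraction)
    return [[func(Cell(*values)) for values in row] for row in joined]


temperature = [[280.0, 285.0], [290.0, 295.0]]
cloud = [[0.1, 0.9], [0.2, 0.8]]

# Keep the temperature only where the sky is mostly clear.
clear_sky = point_op(
    lambda cell: cell.temperature if cell.cloud_fraction < 0.5 else None,
    temperature, cloud)
```

Because each cell arrives as a named, typed value, any existing function written against that type can be applied point by point without knowing about the arrays.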
