1. Security functions
Question:
Describe system security functions to prevent unauthorized access to customers' facilities, products, and services. This description should not rely on any existing campus network security such as firewalls.
Response:
Our system addresses the challenges of security with multiple approaches to counter the manifold threats, working in concert with, and providing features beyond, the security measures provided by operating systems, within database management systems, and between network segments at firewalls. Briefly, four threat domains exist: operating system, database, application, and network. Each security domain carries its own inherent expectations, and our system addresses each with different mechanisms.
Operating system level security:
Users who establish valid connections to the system may or may not be valid users of our software. In addition, our customers have required that we provide for applications written in the Java programming language due to its relative ease of use. Java itself has excellent network-level security built in but is, in current versions, quite insecure within the system on which it is installed. Further, different authorized users may have different levels of access privileges, ordinary users vs. operators, for example. And there is a chicken-and-egg problem with regard to database connectivity information: it, too, must be secured. To address these challenges, we have developed new technology (we are in the patenting process now) which secures the installation, including database connectivity information, beyond what the operating system itself provides, and, though not yet certified, we believe it is at the "B-1" certification security level, as measured by the US federal government.
Once a Java program is up and running using our security software, only our Java API will have access to sensitive data, and even in the event of connection errors, only data that was passed explicitly by the user will be returned in error messages (as appropriate). Nothing, therefore, can be learned from attempting to make the software fail. Similarly, should someone attempt to spoof our Java API, the critical information is encrypted, so again nothing can be learned from such attempts: all they will see is gibberish.
Because the database connection strategy is both secured and conformant to the ODBC/JDBC connectivity standards, our customers may swap out their RDBMS from one vendor to another, despite differences in database security strategies between them.
Also, our customers have increased choice of database user authentication strategies. For example, it may not be desirable to explicitly authorize each and every user to the RDBMS, so instead user groups may be established and authenticated as a class. At the operating system level, an environment variable may be set to a particular group, such as "operator", and the Science Tools Administrator may then authorize that user as being in the operator group. The user may then run an application or tool, be authenticated, and connect, receiving whatever database privileges are appropriate for "operators."
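As a minimal sketch of that flow, the following resolves a group from an environment variable and would then hand it to the installation's authorization mapping. The variable name SCITOOLS_GROUP and the fallback group are our inventions here for illustration, not necessarily what a given installation would use:

    public class GroupAuthSketch {
        public static void main(String[] args) {
            // Hypothetical variable name; set e.g. via "export SCITOOLS_GROUP=operator".
            String group = System.getenv("SCITOOLS_GROUP");
            if (group == null) {
                group = "user";   // default to ordinary-user privileges
            }
            System.out.println("Requesting database privileges for group: " + group);
            // A real installation would look this group up in the
            // administrator-maintained authorization tables and open the
            // database connection with the matching privileges.
        }
    }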
Database level security:
Each RDBMS vendor has implemented its own security mechanisms. It is beyond the scope of this document to enumerate them all, but they can be summarized as primarily user-based. Some systems rely upon operating system authentication while others rely upon their own schemes to create a "user profile". Either way, they all tend to place restrictions on databases, database schemas, tables, and sometimes even rows, using a "username" based strategy. Our connection strategy, described above, permits use of these features to the full extent permitted by the chosen RDBMS vendor.
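To make that concrete, the following sketch applies vendor security through standard JDBC. The URL, credentials, and the table and role names are placeholders, and the exact GRANT/REVOKE syntax varies somewhat between vendors:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class VendorGrantSketch {
        public static void main(String[] args) throws Exception {
            try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/site", "dba", "secret");
                 Statement st = con.createStatement()) {
                // Give the "operator" role read-only access to one table.
                st.executeUpdate("GRANT SELECT ON holdings TO operator");
                st.executeUpdate(
                    "REVOKE INSERT, UPDATE, DELETE ON holdings FROM operator");
            }
        }
    }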
Application level security:
(The word "application" as used here refers to any software which connects to a database; such programs are known as "applications" in the database world, though they may also be entire systems in their own right.) There often exist needs to declare privileges within an installation of our software which go beyond database level security in their complexity. For example, the database has no way to know which rows in a particular table are appropriate for any particular user to see; generally, databases provide only access to a table or no access at all. The semantics for finer-grained security beyond that cannot be generalized and abstracted because they are always site (and sometimes sub-component) specific; if it were possible, the database vendors would have already provided it.
Yet our customers want help in implementing such detail-oriented semantics. For this reason, our system includes features to provide them. These include "security_code" entries, which have types and values; "access_constraints" and "use_constraints", which describe how things may be accessed and what they may be used for; "security_privileges", granted to individuals or accounts; and additional related components to help glue these parts together. We have also included a concept of trusted hosts. This permits an explicit notation of whether user authentications performed on a particular host computer are to be trusted, and if so, to what degree. For example, it may be acceptable to permit a given user, connected from a specific host, to read but not update various information, while the same user connected from a different host may have full access privileges.
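By way of illustration only, application code might consult such tables along the following lines. The table and column names below are hypothetical stand-ins, not our actual schema:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class PrivilegeCheckSketch {
        // Returns true when the user holds an "update" privilege strong
        // enough for the trust level of the host they connected from.
        static boolean mayUpdate(Connection con, String user, String host)
                throws SQLException {
            String q = "SELECT count(*) FROM security_privileges p "
                     + "JOIN trusted_hosts h ON h.host_name = ? "
                     + "WHERE p.user_name = ? AND p.privilege = 'update' "
                     + "AND h.trust_level >= p.required_trust";
            try (PreparedStatement ps = con.prepareStatement(q)) {
                ps.setString(1, host);
                ps.setString(2, user);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return rs.getInt(1) > 0;
                }
            }
        }
    }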
These features are provided so that the specific whims of a given customer can be easily accommodated. Some of these features are enabled and active in the product as shipped, while others are not. It should be pointed out that many of these application level security features require tailoring by customers for the features to have meaning and import.
Network level security:
Where the physical network topology exits a secured place, firewalls provide the first line of defense against intruders. For secured access to clients through these firewalls, we have a number of technologies available, depending upon the specific need. VPNs (Virtual Private Networks) could be used with some of our customers' larger clients and may be worth the effort in some cases. OpenSSL and other Secure Socket Layer packages are available for machine-to-machine transport. And our processing daemons have been used to pierce outbound from very sensitive sites and connect directly to their target DBMS, thus overcoming some onerous network security issues.
For client access, we recommend running an HTTPS-enabled web server such as the Tomcat and Apache/SSL web servers which we run on our own computers at Science Tools. Our Java API is also secure in this context and, when placed within a server environment with Java Servlet technology, users can then have an entirely encrypted experience with our system.
There are, however, difficulties with the above. One such difficulty is the user-authentication burden: web connections are stateless, yet users want a stateful connection (user context). It is rather inefficient to create and tear down database connections for each and every page-touch. Moreover, these environments are not set up to handle individual-user oriented database connections, and are instead intended to support a single authentication for all users connected to the server. Add to this the "application level" security concerns outlined above. Taken together, this is a challenging set of problems when sophisticated, high-granularity, highly discerning security features are desired in combination with database access; all of the previously mentioned security features come into play. We have had a strong research and development effort on this subject and recently came up with what we feel is break-through technology, if you'll pardon the pun! As there may be an opportunity to patent, we would be delighted to share in person, and will only put into writing here that we are aware of the problem, we are unaware of this problem being solved by anyone else, and we hope to announce a resolution to this problem shortly.
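For concreteness, the naive pattern that creates the difficulty looks like the following in standard servlet and JDBC terms. This sketch illustrates the problem, not our solution; the URL and the credential lookup are placeholders:

    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class NaivePageServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // One full database login and teardown per page-touch, using
            // per-user credentials the server must somehow hold.
            try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/site",
                     req.getRemoteUser(), lookupPassword(req))) {
                // ... query on behalf of this one user ...
            } catch (Exception e) {
                throw new ServletException(e);
            }
        }

        private String lookupPassword(HttpServletRequest req) {
            return "...";   // hypothetical per-user credential lookup
        }
    }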
2. Departmental changes to individual users' privileges
Question:
Describe system capabilities to allow separate customer departments to make changes to individual users' privileges with respect to their areas only.
Response:
As outlined above, the question of security is non-trivial, with manifold components potentially involved which bear directly on user-level privileges. Unstated in this request for comment, for example, is the implication that these different departments should also have their own areas of privilege and a domain of dominion. There are a great many choices, and because there are so many ways which could work, there is no "right way." Our job at Science Tools is to give our customers as much flexibility and choice as possible.
Regarding database-level security tools, many RDBMS vendors provide sufficient features to deliver the control requested. Because BigSur can use most any RDBMS, these features may already be available through the RDBMS itself.
Our schema includes, as outlined above, a very flexible internal security scheme to provide for application-enforced, sophisticated security. Using these tools, the customer is free to declare its own privilege levels and what they mean. Science Tools software provides for assignment of these privileges and management of the meta-data surrounding them. Changing a user's privileges would be as simple as a single update statement. But, of course, there is the application side which must be attended to: the price of ultimate flexibility is placing privilege checks in user-interface code that confirm access rights.
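As a sketch of such a single-statement update, against a hypothetical privileges table (not our actual schema) and scoped to one department's area:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class PrivilegeUpdateSketch {
        public static void main(String[] args) throws Exception {
            try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/site", "deptadmin", "secret");
                 PreparedStatement ps = con.prepareStatement(
                     "UPDATE security_privileges SET privilege = ? "
                   + "WHERE user_name = ? AND department = ?")) {
                ps.setString(1, "read_only");   // demote this user to read-only
                ps.setString(2, "jsmith");
                ps.setString(3, "astronomy");   // only within this department's area
                ps.executeUpdate();
            }
        }
    }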
Customers
could use multiple BigSur installations and have them inter-operate
using BigSur machinery such as the publishing system. In this scenario,
departments could each have their own data in their own system.
It's difficult for us to recommend this merely on the strength of
the desire to have user-level security controls.
Another important method for security is our "named database connection" feature, which we developed at customer request. With named connections, the system administration staff configures database connections with username/password authentication information that is based upon database level restrictions, for example, read-only or update rights to particular database tables. Users whose username appears in the appropriate place will be able to use the connection. If various authentications are set up, each with their own privileges, then users can be added to or removed from various named connections, achieving the desired control.
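The following sketch conveys the idea. The configuration file location, its format, and the membership test are illustrative assumptions, not our actual implementation:

    import java.io.FileInputStream;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Arrays;
    import java.util.Properties;

    public class NamedConnectionSketch {
        // Open the named connection if the current user is listed for it.
        public static Connection open(String name) throws Exception {
            Properties p = new Properties();
            try (FileInputStream in = new FileInputStream(
                     "/etc/scitools/connections.properties")) {  // placeholder path
                p.load(in);
            }
            String user = System.getProperty("user.name");
            String[] allowed = p.getProperty(name + ".users", "").split(",");
            if (!Arrays.asList(allowed).contains(user)) {
                throw new SecurityException(user + " not authorized for " + name);
            }
            // The administrator, not the user, supplied these credentials.
            return DriverManager.getConnection(
                p.getProperty(name + ".url"),
                p.getProperty(name + ".dbuser"),
                p.getProperty(name + ".dbpassword"));
        }
    }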
3. Archived data sets
Question:
How does your software manage large, archived data sets stored on a variety of media and hardware, including magnetic disks (e.g., RAIDs), tapes (e.g., StorageTek silo), or optical jukeboxes? Which reliability factors and data recovery mechanisms have you considered in your recommendation?
Response:
The BigSur System® has an optional archiving component available which is designed to serve as an archive repository manager for an entire site, managing holdings for many machines. This archiving system uniquely identifies individual holdings, records source host and directory entry meta-data, and manages "pointers" to the archival media, including a reference to the system on which each holding resides. Media type is recorded, as are attributes of the media type, such as access times and transfer rates, sufficient to describe all forms of electronic media. One of the system's unique features is its ability to detect and record duplicate copies of objects. This feature helps reduce the size of archival holdings by allowing single copies to be archived, and it may just as easily be used to manage multiple physical archives.
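One way to identify holdings uniquely is a content digest: two holdings with the same digest are duplicates, and a holding whose digest no longer matches its recorded value has been corrupted. The following sketch is illustrative of the idea rather than our exact mechanism:

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.security.MessageDigest;

    public class HoldingDigestSketch {
        // Compute a SHA-256 digest of a holding's contents.
        public static String digest(Path file) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            try (InputStream in = Files.newInputStream(file)) {
                byte[] buf = new byte[8192];
                for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                    md.update(buf, 0, n);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }

        public static void main(String[] args) throws Exception {
            System.out.println(digest(Paths.get(args[0])));
        }
    }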
For detecting errors, the identical mechanism used to detect duplicates is employed, permitting single-bit errors to be discovered; should alternative copies exist, the system can provide access to them in lieu of the corrupted copy. This system does not address the issue of media reliability beyond this, however, because this information can be put to no additional practical effect. Our system does not involve new hardware and so is not a part of any RAID-like technique to improve the reliability of media, though we are aware of a vendor who has offered to supply us with as many terabytes of data storage on RAID-5 as we wish for a mere $22,500 per terabyte with a 1yr/3yr warranty; let us know if you're interested.
We recommend using our Archiver on all new data-sets. For pre-existing data-sets, there is a cost associated with creating the archival entries, and for them we may recommend any of a number of alternative approaches which serve to limit the burden of reading every single byte of existing holdings, which is otherwise necessary.
4. Fault tolerance
Question:
Describe
your system's fault tolerance with respect to network outage, host
failure, and controller failures. Describe the ability of your solution
to identify details regarding failures - anticipated or definite
- and to aid support personnel in preventive maintenance and/or
fault recovery.
Response:
Our system provides for reliability and fault tolerance both as an aspect of its overall architecture and as a result of specific features we've created to ensure reliability and easy maintenance. Being database-centric, there is a host of standard, off-the-shelf database tools available to provide services such as database replication. Therefore, using standard, commercial tools, one of our installations can have a complete, functional copy residing on another continent, if desired. And surely the RDBMS vendor is responsible for providing a reliable storage engine and competent backup and restore utilities that are fast and capable.
In addition to off-site replication provided by others, our system has a rudimentary replication capability built in, which we refer to as publishing. This optional feature permits key information to be transmitted to explicitly declared publishing sites, with very good control over what is published. Publishing can include virtually every kind of object or component managed in our system, though there are some things which we deem inappropriate for publication, such as clearly private configuration information and security-related data. The data involved is of sufficiently small volume, and changes sufficiently infrequently, that our publishing system could act to keep backup installations current with production activities.
Our system also records most actions through log files. The utility of each log file differs depending upon the specific function; however, our software can be generally characterized as having robust logging capabilities. For example, default, non-optional log files exist for the most important errors in a site's installation. For each basic capability there is an additional log file, and in many cases individual logs are also created to record the actions of individual users, for both security and debugging purposes. These logs can serve both as an audit trail of access activity and as a source of specific information for use by administration staff in keeping the system operating smoothly. And our API also includes the ability for application programmers to declare auxiliary log files for their own purposes.
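As an illustration of the auxiliary-log idea using the standard Java logging package (the calls in our own API differ, and the log name and path here are placeholders):

    import java.util.logging.FileHandler;
    import java.util.logging.Logger;
    import java.util.logging.SimpleFormatter;

    public class AuxLogSketch {
        public static void main(String[] args) throws Exception {
            Logger log = Logger.getLogger("ingest.jsmith");
            // Append to a per-user log file; the directory must already exist.
            FileHandler fh = new FileHandler(
                "/var/log/scitools/ingest-jsmith.log", true);
            fh.setFormatter(new SimpleFormatter());
            log.addHandler(fh);
            log.info("audit: user jsmith ingested file xyz");
        }
    }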
In addition to these strategies, our system includes a very carefully written processing section designed to guarantee that any and all "collisions" are detected and handled and that there are no "race conditions." If a network error should deprive a running scientific process of access to its database, that process will notice the failure when it next tries to communicate with the database. The default action is for it to continue on if the data it wishes to store is considered non-critical, or to fail with appropriate information in its log files, sending email to support staff depending upon configuration settings. However, this default action can be overridden easily enough, and the override strategy can become the default action at the wishes of our customers. We provide a routine which may be called to enter a wait-and-reconnect loop.
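A minimal sketch of such a wait-and-reconnect routine, with a placeholder URL and a fixed retry interval standing in for site configuration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ReconnectSketch {
        // Block until the database is reachable again, retrying periodically.
        public static Connection waitAndReconnect(String url, String user,
                                                  String pass) {
            while (true) {
                try {
                    return DriverManager.getConnection(url, user, pass);
                } catch (SQLException e) {
                    System.err.println("database unavailable, will retry: "
                                       + e.getMessage());
                    try {
                        Thread.sleep(30_000);   // back off before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new RuntimeException("interrupted while waiting", ie);
                    }
                }
            }
        }
    }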
If
a host should fail, any processes running on that system will cease
to run. Their entries will still be retained in our processing system,
and may be restarted on that host when it recovers, or those entries
may be updated to begin the work again on a new, surviving system.
Depending upon the process and the details of any networked file
systems, it may be possible to restart an interrupted job, picking
up somewhere in the middle. Most often, however, this will not be
possible, or perhaps not practical, and the process will merely
be restarted from scratch.
It's easy to set up alternative processing hosts, and in the event one host should fail, surviving hosts will continue on, picking up the workload their peer would have performed.
5. Support for a variety of distribution media
Question:
Describe
your solution's support for a variety of distribution media production,
format conversion, and duplication. Media should include 8 mm DAT,
4 mm DAT, DLT, CD-ROM, and DVD. How does your system handle addition
of media types or expansion of production capacity?
Response:
We have seen production strategies in action and know that customers' systems use a great variety of differing hardware to implement your present solution. On many existing systems, our DPS (Distributed Processing System) can be used to help coordinate this piece. A DPS daemon would run on each host and would connect to our BigSur System®, which would then direct the data transfer onto the media in question, record what was done, and direct operations staff in whatever manual steps need to be performed, as appropriate. User interaction is likely required at many points in these data-transfer steps, and for this, the script that coordinates the data transfer and records what occurs can also hold a dialogue with operations staff. It may also be possible in many cases to avoid this dialogue and automate to the point where the data transfer begins when the media becomes ready.
We discussed this with customers when we visited their facilities last November and have confirmed that the present code can easily be adapted for this use by encapsulating it inside one of our processing scripts. (Note that the writing of processing scripts is considered a normal part of setting up our system to perform customer work and is therefore easily accommodated with templates and tools.) Most likely, only one DPS processing script needs to be written, not one per media type. Also note that exactly which media types are supported depends on what hardware is made available and is not constrained by our system. All media can be supported so long as the hosting machine can run our DPS software, which only requires the ability to run Java. Addition of new media will depend on hardware, and production capacity is certainly not an issue for our system, though there does exist some possibility that production demands may outstrip the capability of the hardware, most likely network transport. Note that the overhead of our system is very low and should never be an issue.
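The skeleton of such a processing-script step might look like the following. The operator prompt, the tape device, and the transfer command are placeholders for whatever hardware a site provides, and recording the result back into BigSur is left as a comment:

    import java.util.Scanner;

    public class MediaWriteStep {
        public static void main(String[] args) throws Exception {
            Scanner console = new Scanner(System.in);
            // Dialogue with operations staff before the transfer begins.
            System.out.print("Load a blank DLT in drive 0 and press Enter: ");
            console.nextLine();
            // Placeholder transfer command; any media writer could appear here.
            Process p = new ProcessBuilder("tar", "-cf", "/dev/nst0",
                                           "/staging/dataset42")
                            .inheritIO().start();
            int rc = p.waitFor();
            if (rc == 0) {
                System.out.println("Write complete; record the result in BigSur.");
            } else {
                System.out.println("Write FAILED with status " + rc);
            }
        }
    }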