Friday, November 15, 2019

Database Management System

Database for More Complex Information Data

Aim: To review why OODBMS is the DBMS of choice for complex data.

Introduction

The aim of this research is to review why the object-oriented database management system (OODBMS) is the database management system of choice for complex data applications. An OODBMS is a database management system that supports the modelling and creation of data as objects. It includes support for classes of objects and for the inheritance of class properties and methods by subclasses and their objects.

Some client/server applications use an RDBMS for data storage and an object-oriented programming language for development. This scenario has performance issues because objects must be mapped to database tables, a problem referred to as impedance mismatch. To avoid the object-relational impedance mismatch caused by having to map objects, an alternative method of storing and mapping data became necessary. A client/server application using an RDBMS as described above is not a good choice for complex data, and an OODBMS provides a better alternative.

Origin and History of OODBMS

A Database Management System (DBMS) is software designed for managing and controlling access to databases. A database is “a shared collection of logically related data (and a description of this data) designed to meet the information needs of an organisation” (Connolly and Begg).

The first DBMS was introduced in the ‘60s; before this time, data handling was done using the file-based method. The file-based method stored data in individual data files, with an interface between programs and files. Mapping happens between physical and logical files, where one file corresponds to one or several programs. Extensive programming in third-generation languages such as COBOL was used to create, manipulate and access data. There are many problems with this method of storing data, e.g.
increased program maintenance and development time, redundant data, weak security, and separation and isolation of data, which means that each program maintains its own set of data and users of one program may not be aware of data held or blocked by other programs.

As a result of the limitations of the file-based method, a better way of handling data was required, which led to the creation of the DBMS. The need to organise and share data on large projects also contributed. A DBMS controls the organisation and structure of the data storage and is independent of the program used to access the data.

Early DBMSs represent the first generation of DBMS. These are: the hierarchical design, represented by IBM's Information Management System (IMS), which is still the main hierarchical DBMS used by most large mainframe installations; and the network design, represented by General Electric's Integrated Data Store (IDS) and the CODASYL (Conference on Data Systems Languages) or DBTG systems.

The main problems with the hierarchical and network designs are that the systems lack structural independence and are very complex. What distinguishes storing data in databases from storing it in files is that multiple programs and types of users are able to use the databases.

Relational database management system (RDBMS)

The relational model on which the Relational Database Management System (RDBMS) is based was introduced by Edgar Frank Codd in his paper “A Relational Model of Data for Large Shared Data Banks”, published in 1970. This paper formalised the basis for an RDBMS. RDBMSs are referred to as second-generation database management systems. The definition of what constitutes a relational database system, with guidelines for the development of RDBMSs, is given in Codd's 12 rules for relational systems, published in 1985.

RDBMS has three main characteristics, which are: Information is held in the form of a table, where data are described using values. Data in the table columns should not be repeated.
Use of Structured Query Language (SQL).

The relational model is the main data model and the foundation for many leading database products, including DB2 from IBM, Informix, Oracle, Sybase, Microsoft's Access and SQL Server, and Ingres. The RDBMS market is a multibillion-dollar market.

An RDBMS does not need predefined keys to input information, which makes it more flexible than the first-generation DBMSs. SQL is also easy to learn, which makes RDBMSs more productive. The main advantage of RDBMSs is the ease with which users can create, access and manipulate data.

Other benefits of RDBMS are: multi-threading for users; asynchronous input/output for performance; data partitioning; parallel database queries for processing complex queries; scalable architecture; advanced management tools and security with automatic data logging and recovery; referential integrity for data consistency; and transaction management features for database consistency.

Though RDBMSs have served effectively for a number of years, they have certain limitations that were exposed by increasing demand for complex data types and high-performance applications. RDBMS limitations include, but are not limited to, the following:

Relational databases are unable to handle complex multimedia data such as images, video and audio clips due to lack of storage capacity. RDBMSs support only a few simple datatypes, e.g. integer, floating point, character string and date/time, and user-defined datatypes are useful only for defining value domains. Some RDBMSs support binary large objects (BLOBs, image, text), which are used as pointers to external storage; these objects are difficult to manage and exchange.

Standard SQL is limiting, which led vendors to create specific extensions such as Sybase Transact-SQL and Oracle PL/SQL.
RDBMSs also do not work efficiently with languages such as JavaScript and C++, which emerged after the original development of RDBMS.

Impedance mismatch: The data types in the database system do not match the complex data structures created by the application. An RDBMS also mixes different programming paradigms, in which data with different types and locations are handled at the same time.

Information in an RDBMS is held in tables, where relationships between the entities are defined by values. Data in an RDBMS cannot represent real-world entities; normalisation leads to relations that do not correspond to objects in the real world.

Because of the above limitations and the challenges of complex data applications, the Internet and Web usage, the object-oriented database management system (OODBMS) was introduced in the 1980s. An OODBMS offers extensible and controlled data management services, consistency, data independence and a secure environment to the object-oriented model. OODBMSs handle big and complex information systems that RDBMSs were unable to handle.

RDBMS has been very successful, with huge investments in its development by many big database vendors. It has a large, loyal customer base in the corporate relational database community and a large industry based on RDBMS applications and systems development. Because of this, the major RDBMS vendors (Oracle, IBM and Informix) came up with another DBMS, the Object-Relational Database Management System (ORDBMS), to allow organisations to run legacy systems and new object-oriented systems in parallel and gradually migrate to the new ORDBMS technology as its benefits become more apparent. Thus RDBMS vendors developed ORDBMS as a way to meet the challenges facing traditional RDBMSs and to future-proof corporate investment. However, ORDBMSs still fail to hide the inherent mismatch between the relational and object-oriented database models.
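The object-to-table mapping behind the impedance mismatch can be pictured with a small sketch. It assumes a hypothetical Order object holding a list of LineItem objects and uses Python's built-in sqlite3 module as the relational store; the class, table and function names are made up for illustration and do not come from any product mentioned in this post.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LineItem:
    product: str
    quantity: int


@dataclass
class Order:
    order_id: int
    customer: str
    items: list = field(default_factory=list)  # nested objects, not a flat row


def save_order(conn, order):
    """Flatten one Order object into rows spread over two tables."""
    conn.execute(
        "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
        (order.order_id, order.customer),
    )
    conn.executemany(
        "INSERT INTO line_items (order_id, product, quantity) VALUES (?, ?, ?)",
        [(order.order_id, item.product, item.quantity) for item in order.items],
    )


def load_order(conn, order_id):
    """Rebuild the Order object from its rows (the reverse mapping)."""
    row = conn.execute(
        "SELECT order_id, customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    items = [
        LineItem(product, quantity)
        for product, quantity in conn.execute(
            "SELECT product, quantity FROM line_items "
            "WHERE order_id = ? ORDER BY rowid",
            (order_id,),
        )
    ]
    return Order(row[0], row[1], items)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE line_items ("
    "order_id INTEGER REFERENCES orders(order_id), product TEXT, quantity INTEGER)"
)

original = Order(1, "Ada", [LineItem("bolt", 10), LineItem("nut", 12)])
save_order(conn, original)
restored = load_order(conn, 1)
print(restored == original)  # True: same state, but only after mapping both ways
```

Every class stored this way needs its own pair of mapping functions, which is the extra work an OODBMS is meant to remove.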
Object-oriented database management systems (OODBMS) have their origin in object-oriented programming languages (OOPL). OODBMS started as research in the mid ‘70s, based on providing real database management support for graph-structured objects. Around this time, several shortcomings of RDBMS were also highlighted within the database community, and it was acknowledged that OOPLs had very strong advantages over non-OOPLs for many programming applications. Some of the advantages are: strong encapsulation, which makes it easier to program large and complex applications; and inheritance, which enables code reuse. Application programmers also wanted to make objects persistent, that is, an object's lifespan should extend beyond a single program execution.

One of the early research projects on OODBMS was the ORION research project in the early 1980s, conducted by Won Kim at the Microelectronics and Computer Technology Corporation (MCC). Two products were developed from the research: ITASCA, which is no longer in existence, and the Versant OODBMS. Other commercial OODBMS products are GemStone (known before as Servio Logic), GBase (Graphael) and Vbase (Ontologic).

In the early ‘90s additional products entered the market. Some of these are: Objectivity/DB (Objectivity, Inc.); ObjectStore (Progress Software, acquired from eXcelon, which was originally Object Design); O2 and Versant Object Database (Versant Corporation); ONTOS (Ontos Inc., formerly Ontologic); ITASCA; and Jasmine (Fujitsu, marketed by Computer Associates).

Some of these products are still available, and new commercial and open source products were introduced in early 2004, e.g. InterSystems, Ozone, Zope, FramerD and XL2. The open source products are gaining high recognition in the market.

OODBMSs add the concept of persistence to object programming languages. Early commercial products were integrated with programming languages, e.g.
GemStone uses Smalltalk, GBase uses LISP and Vbase uses COP. C++ was the dominant language in the OODBMS market during the early ‘90s. Support for Java, and later C#, was introduced by vendors from the late ‘90s. The open source object databases are written entirely in object-oriented programming languages such as Java or C#, e.g. db4objects (db4o) and Perst (McObject). Chris Muller has also recently created another open source object database called Magma (written in Squeak). Open source products are reasonably affordable and easy to use, and this opens the second growth period for object databases.

Manifestos

“The Object-Oriented Database System Manifesto” by Atkinson et al., 1989, listed the mandatory features that a system needs before it can be called an OODBMS. The manifesto abandons the relational model by setting out the basic rules of object database systems. It groups the characteristics of an object DBMS into three categories.

Mandatory: encapsulation, object identity, types/classes, complex objects, overriding combined with late binding, inheritance, extensibility, persistence, computational completeness, concurrency, ad hoc query, secondary storage management and recovery facilities.

Open: these are decided by the designer and include the representation system, the uniformity of the type system and the programming paradigm.

Optional: these include multiple inheritance, type checking and inferencing, distribution, versions and design transactions.

The OODBMS manifesto was unacceptable to some relational database professionals, so a competing manifesto was introduced: “The Third-Generation Database System Manifesto” by Stonebraker et al., 1990. This manifesto retains all the features of relational database systems that are proven in practice (e.g. SQL) and augments them with new ones, e.g. object-oriented concepts.
“The Third Manifesto”, written by Darwen and Date in 1995, rejected both the object-oriented ideas and SQL, which according to the authors defeat the ideas of the relational model, and called for a return to the genuine relational model and Codd's 12 rules. The document is very controversial given the way software engineering and query/programming languages are practised now. The arguments presented in this manifesto are more ideological than technical, which makes the manifesto very difficult for many database professionals to accept. The newest version of the manifesto (2006) still retains these ideological assertions.

Object-Oriented Database Management System Development review

An OODBMS stores objects rather than data such as integers, strings and real numbers. Objects consist of attributes and methods.

Attributes: These are data that define the characteristics of an object. The data may be integers, strings and real numbers, or may be a reference to a complex object.

Methods: These define the behaviour of an object; methods are procedures or functions.

OODBMS was introduced to reduce the impedance mismatch between programming languages and database management systems, to offer performance advantages and to provide clear support for complex user-defined types, including the ability to call, store and query complex objects directly.

OODBMS development supports the modelling and creation of data as objects and extends programming languages with capabilities such as data recovery, concurrency control, persistence and other capabilities of relational systems. OODBMSs are used when good processing performance is needed on complex applications, because they take a revolutionary approach to database management. An OODBMS does not use tables to store data; data are stored in objects. It handles concurrent access and provides a persistent store for objects in a multi-user client/server environment.
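The split between attributes and methods described above can be pictured with a small class. This is only a sketch in plain Python, not the data definition language of any particular OODBMS, and the Car and Engine classes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    power_kw: int


@dataclass
class Car:
    model: str          # simple attribute: a string
    mileage_km: float   # simple attribute: a real number
    engine: Engine      # attribute that references another, complex object

    def drive(self, distance_km: float) -> None:
        """Method: behaviour that changes the object's own state."""
        self.mileage_km += distance_km


car = Car("roadster", 1200.0, Engine(150))
car.drive(35.5)
print(car.mileage_km)  # 1235.5
```

In an object database the whole Car, including its reference to the Engine object, would be stored as it is rather than being split into rows.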
An OODBMS uses class definitions and traditional object-oriented languages such as C++ and Smalltalk for defining and accessing data, instead of a separate language like SQL for defining, manipulating and retrieving data. An OODBMS can be viewed as an extension of the OO language that provides direct integration with database capabilities.

Object-oriented database development initially focused only on applications that manage complex information, such as Computer Aided Software Engineering (CASE), Computer Aided Manufacturing (CAM) and Computer Aided Design (CAD) applications. Other applications where object-oriented database technologies are now being used include telecommunications applications, hospital applications, financial institutions, multimedia applications and document/quality management.

OODBMS development enforces object-oriented programming concepts such as data encapsulation, polymorphism, object identity and inheritance, as well as database transaction management concepts, e.g. the ACID properties (Atomicity, Consistency, Isolation and Durability), which improve the integrity of the system. OODBMSs also support secondary storage management with ad hoc query languages, which permits the management of very large amounts of data.

The inclusion of data definition within operation definitions in an OODBMS has the following advantages:

The defined operations do not depend on the database application running at the moment and apply ubiquitously.

Inheritance allows solutions to complex problems to be developed incrementally, by defining new objects in terms of previously defined objects.

Datatypes can be extended to support complex data such as multimedia, by defining new object classes that have operations to support the new kinds of information.

Object identity (OID) enables objects to be independent of each other in the database.

Data encapsulation allows the internal state of the object to be hidden.
Encapsulated objects are those that can only be accessed through their methods rather than through their internal state. There are three types of encapsulated objects: Full encapsulation, where all operations on objects are performed through message sending and method execution. Write encapsulation, where the internal state of the object is visible for reading operations only. Partial encapsulation, which allows direct access for reading and writing to some part of the internal state.

Another distinguishing characteristic of objects is that they have an identity that is independent of the state of the object. For example, if we have a car object and remodel the car, changing its appearance, brakes and tyres so that it looks entirely different, it would still be recognised as the same object we had before the changes were applied. Object identity allows objects to be related as well as shared within a distributed computing network.

All of these advantages come together to make development easier for database application developers. They also allow object-oriented database development to solve information management problems characterised by the need to manage a large number of different data types, a large number of relationships between objects, and objects with complex behaviours. Application areas where this kind of complexity exists include engineering, manufacturing, simulation, office automation and large information systems.

Currently there is no widely agreed standard for what constitutes an OODBMS, and no standard query language for OODBMS equivalent to what SQL is to the RDBMS.
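Returning to the car example, the idea that identity is independent of state can be sketched with a toy in-memory store that hands out system-generated identifiers. The ObjectStore class and its add and get methods are hypothetical and exist only for this illustration; a real OODBMS manages OIDs internally.

```python
import itertools


class ObjectStore:
    """Toy in-memory store that assigns a system-generated OID to each object."""

    def __init__(self):
        self._next_oid = itertools.count(1)
        self._objects = {}

    def add(self, obj):
        oid = next(self._next_oid)
        self._objects[oid] = obj
        return oid

    def get(self, oid):
        return self._objects[oid]


class Car:
    def __init__(self, colour, tyres):
        self.colour = colour
        self.tyres = tyres


store = ObjectStore()
first = store.add(Car("red", "standard"))
twin = store.add(Car("red", "standard"))  # same state, different identity

# Remodel the first car: its state changes completely, its OID does not.
car = store.get(first)
car.colour = "blue"
car.tyres = "racing"

print(store.get(first) is car)  # True: still the same object under the same OID
print(first != twin)            # True: equal initial state never merged the two
```

Because the identifier is not derived from the stored values, two cars can hold identical data and still be told apart, and a car can change every attribute without becoming a different object.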
The Object Data Management Group (ODMG) was a consortium of object-oriented database management system (ODBMS) vendors and interested parties working on standards to allow the portability of customer software across ODBMS products, and to create a standardised Object Query Language (OQL) and Object Definition Language (ODL). The ODMG completed its work on object data management standards in 2001 and was disbanded. The final release of the ODMG standard is the Object Database Standard (ODMG-3.0). Because the ODMG was dissolved in 2001, the standardisation of object databases languished. Another group, the OMG's Object Database Technology Working Group (ODBTWG), was formed in 2005 and is now working on a new standard to reflect recent changes in OODBMS technology.

The main feature of OODBMS development is that objects are accessed transparently, so that interacting with persistent objects is the same as interacting with in-memory objects. RDBMSs require interaction through a query sub-language like SQL and the use of ODBC, ADO or JDBC. All of this is unnecessary with an OODBMS.

In an OODBMS, when a request is made for an object in the database, the object is transferred into the application's cache, where it is used in one of two ways. It can be used as a temporary value that is detached from the main version in the database, so that updates to the cached object do not affect the object in the database. Alternatively, it can be used as an exact copy of the version in the database, so that updates to the object are reflected in the database, and any change to the object in the database means the object has to be retrieved again from the OODBMS.

There are a number of implementations of OODBMS, in both research and commercial applications. Each implementation differs according to the object-oriented language from which it originated.
Some of these developments are discussed below:

GemStone: This was mainly based on Smalltalk, and its data definition and manipulation language is called OPAL. GemStone provides most but not all of the object-oriented features; for example, it does not provide multiple inheritance.

Iris: This is a research prototype whose purpose is to meet the needs of applications such as CASE tools and CAD (Fishman et al., 1987). It is designed to be accessible from any number of programming languages.

Vbase: This is commercially available and is built on a schema definition language in which objects are defined as data types. The main purpose of Vbase is to combine a procedural language with persistent object support; it also has the strong typing inherent in object systems for both language and database.

O2: This is based on a set and tuple data model. It is designed to integrate database technology with the object-oriented approach in an all-in-one system (Lecluse et al., 1988).

Most of the OODBMS developments discussed try to meet the object-oriented approach in some way, by implementing various OODBMS features. There are major differences in the physical implementation of each model as well as in the number of features implemented.

Two examples of companies using OODBMS are British Telecommunications, which uses Versant, and McLaren, the developer of Formula One racing cars, which uses the Jasmine OODBMS.

Comparison criteria between RDBMS and OODBMS including the pros and cons

RDBMS and OODBMS differ in many ways, for example in application domain, performance, usage, current market demand and vendor support. The comparisons are made to get a better understanding of how the two kinds of database differ and why OODBMS is better for large, complex applications.
A significant difference between object-oriented databases and relational databases is that object-oriented databases represent relationships explicitly, supporting both navigational and associative access to data. As the complexity of the interrelations between data increases, so does the advantage of representing relationships explicitly. Using explicit relationships also improves data access performance over relational value-based access.

Support for Object Oriented Programming Language (OOPL): Object-oriented programming is not ideal for RDBMS; programmers spend most of their coding time mapping program objects to the database. In OODBMS, programming is direct and extensive; OOPLs are designed to allow an application developer to create a complex sequence of instructions with less difficulty.

Standards: The defined standard for RDBMS is SQL, whose main elements are DDL, DML and DCL. The current version of the standard is SQL3, which defines the new features used in ORDBMS. ODMG is the group that maintained the standards for OODBMS, the main components of which are ODL and OQL. The most recent version of this standard is ODMG-3.0, and there is no new standard yet to reflect the advances in OODBMS technology.

Product Maturity: RDBMS is a second-generation technology at a very mature stage. RDBMSs have good implementations with many support services, such as tools for reporting, data transformation and OLAP. RDBMSs are available from many vendors, which makes them an obvious choice for most users. OODBMS is the third-generation DBMS, which makes it relatively new; it emerged after the RDBMS, and a new standard to succeed ODMG is still not final. Some implementations are available but still do not fully support all the features.

Simplicity of use: The table structure in RDBMS is very easy for users to understand, and RDBMS has many end-user tools.
OODBMS is mainly for developers; not many end-user tools are available for object database products, but this is expected to improve over time.

Versioning of Data: Versioning of data is not supported by RDBMS, but the user can implement it by using multiple records with some attributes describing the versioning information, so versions are defined by the user. Versioning is supported naturally by OODBMS, which maintains multiple versions of the data objects.

Complex data relationships: RDBMS provides basic relational tables, that is, user-defined sets of records with system-defined domains, in addition to higher-level types defined in the application. OODBMS has the same type system for system-defined and user-defined types, and it is open-ended.

Performance: OODBMSs performed better than RDBMSs in the Object Operations version 1 (OO1) benchmark, which was run on OODBMS software (GemStone, Objectivity/DB, Ontos, ObjectStore) and RDBMSs (Ingres and Sybase) in 1989 and 1990. For complex applications, an object in an OODBMS is a better model of a real-world entity than relational rows. OODBMSs outperform RDBMSs when handling complex and interrelated data. The lack of impedance mismatch in OODBMSs also means they provide significantly better performance than RDBMSs: impedance mismatch requires the mapping of one data structure (tables) to another (objects), and this slows down performance in RDBMSs. In addition, the client caching features in OODBMSs improve performance, and no joins are required.

Application Domain: RDBMS is used for large administrative systems with many instances of simple data types; it can only handle short transactions and small amounts of data at a time. OODBMS is for design applications with many complex object-oriented data types; it handles long transactions and large client data.

Semantic Gap (DDL/DML vs.
PL): RDBMS offers relational tables in some DDL and a standardised DML, with client/server support and with SQL embedded in many programming languages (PL). SQL is not computationally complete. Applications may have further, often high-level, types and special storage structures expressed in some PL. In OODBMS, the same object-oriented programming language, such as C++ or Smalltalk, is used for both client and server. It accommodates only object-oriented languages and does not accommodate COBOL. The database PL is computationally complete.

Query Optimization: This is very strong in RDBMS because of the restricted set of data types. OODBMS has poor query optimization because of its complex data structures.

Primary Keys: In RDBMS, rows are uniquely identified by value, and no two records can have the same primary key value; this avoids error conditions. In an OODBMS, system-generated Object Identifiers (OIDs) are used to uniquely identify an object; this is done behind the scenes and is completely invisible to the user. With this feature there is no limitation on the values that can be stored in an object, which increases the efficiency of the database.

Vendors Support: RDBMS is highly successful because of its large market size, but most RDBMS vendors are adding object-oriented capabilities to their products and so are moving towards ORDBMS. OODBMSs target niche markets because they lack vendor support. This is because the market for RDBMS is very large and it is difficult for vendors to move away from legacy systems that are mostly based on RDBMS.

OODBMS Functionality and performance review

Complex process integration among companies is the driving force for adopting OODBMS. In an OODBMS, the capabilities of an object-oriented programming language are integrated with DBMS technology.
The design of an object database is quite different, because object database design is an essential part of the overall application design process. In an OODBMS the object classes used are the same as the classes used by the programming language.

In relational databases, data are stored in tables with columns and rows, i.e. data are represented in a two-dimensional view. This is effective for simple, straightforward systems with low data volume. RDBMS is good for applications with simple relationships between data. Relational database technology fails to handle the needs of complex information systems because it requires the developer to force an information model into tables where relationships between entities are defined by values. Relational databases also require translation through a sub-language like SQL and a call interface like JDBC/ODBC. All of this slows down RDBMS data performance.

OODBMS actively supports abstract object interfaces. It manages types, classes and methods, including the execution of methods. Data can be represented in more than a two-dimensional view, and relationships between data are represented explicitly, which improves data access performance.

OODBMS combines the basic functionality of relational database management systems with new object-oriented functionality. The basic functionalities are: persistence; transaction management and concurrency control; security; recovery; data access performance; and query.

Persistence: This is the ability of an object to be stored on a permanent medium and to survive program termination or shutdown, i.e. it can outlive the OS process in which it resides. For persistent data to survive transaction updates, they have to be stored outside the transaction context. Adding persistence to objects is essential to making OODBMS applications useful in practice, because most applications need to handle persistent data.
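As a rough illustration of an object outliving the program run that created it, the sketch below uses Python's standard shelve module as a stand-in for a persistent object store. shelve is not an OODBMS, and the key "car-1" and the object's fields are made up for the example; closing and reopening the store plays the role of one program ending and another starting.

```python
import os
import shelve
import tempfile
from types import SimpleNamespace

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "objects")

    # First "program run": create an object and store it as an object,
    # without writing any table definition or mapping code.
    with shelve.open(path) as db:
        db["car-1"] = SimpleNamespace(model="roadster", mileage_km=1200.0)

    # Second "program run": reopen the store and get the object back.
    with shelve.open(path) as db:
        restored = db["car-1"]

print(restored.model, restored.mileage_km)
```

The object is written to a file on disk between the two blocks, so its lifetime is no longer tied to the code that first built it.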
Persistence is dealt with in OODBMS by adding persistence to object programming languages such as C++, Smalltalk and Java. OODBMSs support persistent objects through data distribution, programming languages, the transaction model, versions, schema evolution and the generation of new types. Another way OODBMSs offer persistence is through inheritance from a common class.

Transaction Management and Concurrency control: In OODBMS, some form of versioning system is used to manage updates to multiple data items at the same time without interference from one another. OODBMS products allow objects to remain in the client cache after a transaction is committed, so that the application is able to reference them again soon. This feature improves the performance of storing and retrieving data from the database. Concurrency control enables users to see the same view of object data in OODBMS. It allows many read and write operations to go on in parallel while ensuring that the data remain in a consistent, good state.

Security: Secure OODBMSs have certain characteristics that make them unique. The concepts of encapsulation, inheritance, information hiding, methods and the ability to model real-world entities in an object-oriented environment provide the security model in OODBMS. An OODBMS may encapsulate a series of basic access commands into a method and make it public for users, while keeping the basic commands themselves away from users. Little work has been done in OODBMS applications to add security mechanisms against malicious misuse of data.

Recovery: Recovery features in OODBMS allow a consistent state of the database to be reinstated after a system crash or failure. This is done either by rolling back uncommitted transactions or by rolling forward transactions that have been committed but not completely flushed to disk.

Data Access performance
