Introduction To Database
What Is A Database?
A database is a collection of information that is organized so that it can be easily accessed, managed and updated. Data is organized into rows, columns and tables, and it is indexed to make it easier to find relevant information. Data gets updated, expanded and deleted as new information is added. Databases process workloads to create and update themselves, querying the data they contain and running applications against it.
Computer databases typically contain aggregations of data records or files, such as sales transactions, product catalogs and inventories, and customer profiles. Typically, a database manager provides users with the ability to control read/write access, specify report generation and analyze usage. Some databases offer atomicity, consistency, isolation and durability (ACID) compliance to guarantee that data is consistent and that transactions are complete. Databases are prevalent in large mainframe systems, but are also present in smaller distributed workstations, in midrange systems such as IBM's AS/400, and on personal computers.
Evolution Of Databases
Databases have evolved since their inception in the 1960s, beginning with hierarchical and network databases, continuing through the 1980s with object-oriented databases, and arriving today at SQL and NoSQL databases as well as cloud databases.
In one view, databases can be classified according to content type: bibliographic, full text, numeric and images. In computing, databases are sometimes classified according to their organizational approach. There are many different kinds of databases, ranging from the most prevalent approach, the relational database, to a distributed database, cloud database or NoSQL database.
A relational database, invented by E.F. Codd at IBM in 1970, is a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. Relational databases are made up of a set of tables with data that fits into a predefined category. Each table has at least one data category in a column, and each row has a certain data instance for the categories which are defined in the columns. The Structured Query Language (SQL) is the standard user and application program interface for a relational database. Relational databases are easy to extend, and a new data category can be added after the original database creation without requiring you to modify all the existing applications.
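The extensibility described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table, column and function names (product, supplier, cheap_products) are assumptions made up for this illustration, not part of any particular system.

```python
import sqlite3

# In-memory database; all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("Keyboard", 25.0), ("Monitor", 140.0)],
)


def cheap_products(connection, limit):
    """Stand-in for an existing application query; it names only the columns it needs."""
    rows = connection.execute(
        "SELECT name FROM product WHERE price < ? ORDER BY name", (limit,)
    )
    return [name for (name,) in rows]


before = cheap_products(conn, 100.0)

# Add a new data category (column) after the table was created.
conn.execute("ALTER TABLE product ADD COLUMN supplier TEXT")

# The existing query is untouched and still returns the same rows.
after = cheap_products(conn, 100.0)
print(before, after)
```

Because the query lists its columns explicitly instead of relying on column positions, the new column does not affect it. Existing rows simply hold NULL in the added column until a value is supplied.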
A distributed database is a database in which portions of the database are stored in multiple physical locations, and in which processing is dispersed or replicated among different points in a network. Distributed databases can be homogeneous or heterogeneous. All the physical locations in a homogeneous distributed database system have the same underlying hardware and run the same operating systems and database applications. The hardware, operating systems or database applications in a heterogeneous distributed database may be different at each of the locations.
A cloud database is a database that has been optimized or built for a virtualized environment, either in a hybrid cloud, public cloud or private cloud. Cloud databases provide benefits such as the ability to pay for storage capacity and bandwidth on a per-use basis, and they provide scalability on demand, along with high availability. A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service deployment.
NoSQL databases are useful for large sets of distributed data. They are effective for big data performance issues that relational databases aren't built to solve. They are most effective when an organization must analyze large chunks of unstructured data or data that's stored across multiple virtual servers in the cloud.
Items created using object-oriented programming languages are often stored in relational databases, but object-oriented databases are well suited for those items.
An object-oriented database is organized around objects rather than actions, and data rather than logic. For example, a multimedia record in a relational database can be a definable data object, as opposed to an alphanumeric value.
A graph-oriented database, or graph database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Graph databases are basically collections of nodes and edges, where each node represents an entity, and each edge represents a connection between nodes.
Graph databases are growing in popularity for analyzing interconnections. For example, companies might use a graph database to mine data about customers from social media.
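To make the node-and-edge idea concrete, here is a small sketch in plain Python. It is not a graph database, only an in-memory model of one; the class name TinyGraph and the sample people are hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """Toy undirected graph: nodes are entities, edges are connections."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_edge(self, a, b):
        # Store the connection in both directions.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node):
        # Return a copy so callers cannot modify the graph by accident.
        return set(self._edges.get(node, ()))

    def friends_of_friends(self, node):
        """Nodes two hops away that are not already direct neighbors."""
        direct = self.neighbors(node)
        two_hops = set()
        for friend in direct:
            two_hops |= self.neighbors(friend)
        return two_hops - direct - {node}


graph = TinyGraph()
graph.add_edge("ana", "ben")
graph.add_edge("ben", "cara")
graph.add_edge("cara", "dev")

print(graph.friends_of_friends("ana"))  # {'cara'}
```

A query such as "friends of friends" follows edges directly from node to node, which is the kind of interconnection analysis that graph databases are designed for.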
Advantages And Disadvantages Of Database
Controlling Data Redundancy:
In non-database systems (traditional computer file processing), each application program has its own files, so duplicate copies of the same data are created in many places. In a database, all the data of an organization is integrated into a single database, and each item is ideally recorded in only one place. For example, the dean's faculty file and the faculty payroll file contain several items that are identical. When they are converted into a database, the data is integrated so that multiple copies of the same data are reduced to a single copy. In a database, data redundancy can be controlled or reduced, but it is not removed completely: sometimes it is necessary to duplicate a data item in order to relate tables with each other. Controlling data redundancy saves storage space and also makes it easier to retrieve data from the database using queries.
Data Consistency:
Controlling data redundancy also yields data consistency. If a data item appears only once, any update to its value has to be performed only once, and the updated value is immediately available to all users. If the DBMS has reduced redundancy to a minimum level, the database system can enforce consistency: when a data item that appears more than once in the database is updated, the DBMS automatically updates each occurrence of that item.
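The faculty and payroll example can be sketched with Python's sqlite3 module. The schema below is an assumption made for illustration: faculty details are stored once, and the payroll table repeats only the faculty id, which is the controlled duplication needed to relate the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE faculty (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL
    );
    -- Only the key is repeated here; name and department live in one place.
    CREATE TABLE payroll (
        faculty_id INTEGER NOT NULL REFERENCES faculty(id),
        salary     REAL NOT NULL
    );
    INSERT INTO faculty VALUES (1, 'A. Khan', 'Physics');
    INSERT INTO payroll VALUES (1, 5200.0);
    """
)


def payroll_report(connection):
    """Join the two tables so the report always sees the current faculty data."""
    return connection.execute(
        "SELECT f.name, f.department, p.salary "
        "FROM payroll AS p JOIN faculty AS f ON f.id = p.faculty_id"
    ).fetchall()


# A single update in one place...
conn.execute("UPDATE faculty SET department = 'Mathematics' WHERE id = 1")

# ...is immediately visible to every query that uses the data.
report = payroll_report(conn)
print(report)  # [('A. Khan', 'Mathematics', 5200.0)]
```

Had the department been copied into the payroll table as well, the update would have needed to touch both copies to keep them consistent.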
Data Sharing:
In a DBMS, data can be shared by authorized users of the organization. The database administrator (DBA) manages the data and grants users the rights to access it. Many users can be authorized to access the same set of information simultaneously. Remote users can also share the same data. Similarly, the data of the same database can be shared between different application programs.
Data Integration:
In a DBMS, data is stored in tables. A single database contains multiple tables, and relationships can be created between tables (or associated data entities). This makes it easy to retrieve and update data.
Integrity Constraints:
Integrity constraints, or consistency rules, can be applied to a database so that only correct data can be entered into it. The constraints may be applied to data items within a single record, or they may be applied to relationships between records. Examples of integrity constraints are:
- 'Issue Date' in a library system cannot be later than the corresponding 'Return Date' of a book.
- Maximum obtained marks in a subject cannot exceed 100.
- Registration numbers of BCS and MCS students must start with 'BCS' and 'MCS' respectively.
Most DBMSs provide facilities for applying integrity constraints. The database designer (or DBA) identifies integrity constraints during database design. The application programmer can also enforce integrity constraints in program code while developing the application. The integrity constraints are checked automatically at the time of data entry or when a record is updated. If the data entry operator (end user) violates an integrity constraint, the data is not inserted or updated in the database and the system displays a message. For example, when you withdraw an amount from the bank with an ATM card, your account balance is compared with the amount you are withdrawing. If your account balance is less than the amount you want to withdraw, a message is displayed on the screen to inform you about your account balance.
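The three example constraints listed above could be declared as CHECK constraints. The following sketch uses Python's sqlite3 module; the table layouts, column names and the helper try_insert are assumptions for illustration, and dates are stored as ISO-8601 text so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE loan (
        book        TEXT NOT NULL,
        issue_date  TEXT NOT NULL,   -- ISO-8601 text, e.g. '2024-03-01'
        return_date TEXT NOT NULL,
        CHECK (issue_date <= return_date)
    )
    """
)
conn.execute(
    """
    CREATE TABLE result (
        program TEXT NOT NULL CHECK (program IN ('BCS', 'MCS')),
        reg_no  TEXT NOT NULL,
        marks   INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100),
        CHECK (substr(reg_no, 1, 3) = program)
    )
    """
)


def try_insert(connection, sql, params):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with connection:  # commit on success, roll back on error
            connection.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


loan_sql = "INSERT INTO loan VALUES (?, ?, ?)"
result_sql = "INSERT INTO result VALUES (?, ?, ?)"

ok_loan = try_insert(conn, loan_sql, ("SQL Basics", "2024-03-01", "2024-03-15"))
bad_loan = try_insert(conn, loan_sql, ("SQL Basics", "2024-03-20", "2024-03-15"))
ok_marks = try_insert(conn, result_sql, ("BCS", "BCS-001", 87))
bad_marks = try_insert(conn, result_sql, ("BCS", "BCS-002", 101))
bad_reg_no = try_insert(conn, result_sql, ("MCS", "BCS-003", 70))
```

In this sketch the database itself rejects the invalid rows, so every application that writes to these tables gets the same checks without repeating them in program code.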
Data Security:
Data security is the protection of the database from unauthorized users. Only authorized persons are allowed to access the database. Some users may be allowed to access only a part of the database, i.e., the data that is related to them or to their department. Typically, the DBA or the head of a department can access all the data in the database. Some users may be permitted only to retrieve data, whereas others are allowed to retrieve as well as update data. Database access is controlled by the DBA, who creates user accounts and grants rights to access the database. Typically, users or groups of users are given usernames protected by passwords.
Most DBMSs provide a security sub-system, which the DBA uses to create user accounts and to specify account restrictions. The user enters an account number (or username) and password to access data from the database. For example, if you have an e-mail account at "hotmail.com" (a popular website), you have to give your correct username and password to access your e-mail account. Similarly, when you insert your ATM card into an automated teller machine (ATM) at a bank, the machine reads the ID number recorded on your card and then asks you to enter your PIN code (or password). In this way, you can access your account.
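The idea that some users may only retrieve data while others may also update it can be sketched as follows. SQLite has no user accounts, so this example is only an approximation under that assumption: it uses the authorizer hook of Python's sqlite3 module to make one connection read-only. Server DBMSs usually express the same idea with accounts and granted privileges. The account table and the function name read_only_authorizer are hypothetical.

```python
import sqlite3

# Actions a "retrieve only" user is allowed to perform.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Allow reading; deny every other kind of operation."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT, balance REAL)")
conn.execute("INSERT INTO account VALUES ('alice', 100.0)")
conn.commit()

# From here on, this connection behaves like a user with read-only rights.
conn.set_authorizer(read_only_authorizer)

balances = conn.execute("SELECT owner, balance FROM account").fetchall()

try:
    conn.execute("UPDATE account SET balance = 0 WHERE owner = 'alice'")
    update_allowed = True
except sqlite3.Error:
    # The statement is rejected before it can change anything.
    update_allowed = False
```

The restriction is applied where the data is accessed rather than inside each application, which mirrors the role of the security sub-system described above.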
Data Atomicity:
A transaction in commercial databases is referred to as an atomic unit of work. For example, when you purchase something at a point of sale (POS) terminal, a number of tasks are performed, such as:
- The company's stock is updated.
- The amount is added to the company's account.
- The salesperson's commission increases.
All these tasks collectively are called an atomic unit of work, or transaction. Either all of these tasks are completed, or the partially completed tasks are rolled back. In this way the DBMS ensures that only consistent data exists within the database.
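The point-of-sale example can be sketched as one transaction with Python's sqlite3 module. The tables, the 5 percent commission rate and the function record_sale are assumptions for illustration; amounts are kept in whole cents to avoid rounding issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stock (item TEXT PRIMARY KEY,
                        quantity INTEGER CHECK (quantity >= 0));
    CREATE TABLE company_account (id INTEGER PRIMARY KEY, balance_cents INTEGER);
    CREATE TABLE commission (salesperson TEXT PRIMARY KEY, total_cents INTEGER);
    INSERT INTO stock VALUES ('pen', 5);
    INSERT INTO company_account VALUES (1, 0);
    INSERT INTO commission VALUES ('sam', 0);
    """
)


def record_sale(connection, item, quantity, price_cents, salesperson):
    """Apply all three updates, or none of them."""
    amount = quantity * price_cents
    try:
        with connection:  # commits on success, rolls back on any exception
            connection.execute(
                "UPDATE company_account SET balance_cents = balance_cents + ? "
                "WHERE id = 1",
                (amount,),
            )
            connection.execute(
                "UPDATE commission SET total_cents = total_cents + ? "
                "WHERE salesperson = ?",
                (amount * 5 // 100, salesperson),
            )
            # Fails the CHECK constraint if there is not enough stock.
            connection.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE item = ?",
                (quantity, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_sale = record_sale(conn, "pen", 3, 200, "sam")    # enough stock
second_sale = record_sale(conn, "pen", 10, 200, "sam")  # not enough stock
```

The second sale fails at its last step, after the account and commission were already updated inside the transaction. The rollback undoes those partial changes, so the balance and commission still reflect only the first sale.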
Database Access Language:
Most DBMSs provide SQL as the standard database access language. It can be used to access data from multiple tables of a database.
Development of Application:
The cost and time for developing new applications are also reduced. The DBMS provides tools that can be used to develop application programs. For example, some wizards are available to generate Forms and Reports. Stored procedures (stored on the server side) also reduce the size of application programs.
Creating Forms:
A Form is a very important object of a DBMS. You can create Forms easily and quickly in a DBMS. Once a Form is created, it can be used many times and can be modified easily. Created Forms are saved along with the database and behave like software components. A Form provides an easy, user-friendly interface for entering data into the database, editing data, and displaying data from the database. Non-technical users can also perform various operations on databases through Forms without going into the technical details of a database.
Report Writers:
Most DBMSs provide report writer tools for creating reports. Users can create reports easily and quickly. Once a report is created, it can be used many times and can be modified easily. Created reports are saved along with the database and behave like software components.
Control Over Concurrency:
In a computer file-based system, if two users are allowed to access data simultaneously, it is possible that they will interfere with each other. For example, if both users attempt to update the same record, one may overwrite the values recorded by the other. Most DBMSs have sub-systems to control concurrency so that transactions are always recorded accurately.
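The overwrite problem described above, and one way to prevent it, can be sketched with a version column. This technique is usually called optimistic concurrency control; it is chosen here only because it is easy to show in a few lines, and a real DBMS may instead rely on locking or multi-version schemes internally. The table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record (id INTEGER PRIMARY KEY, value TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO record VALUES (1, 'original', 1)")
conn.commit()


def read_record(connection, record_id):
    """Return (value, version) as seen by a user at read time."""
    return connection.execute(
        "SELECT value, version FROM record WHERE id = ?", (record_id,)
    ).fetchone()


def update_if_unchanged(connection, record_id, new_value, expected_version):
    """Write only if nobody else changed the record since it was read."""
    with connection:
        cursor = connection.execute(
            "UPDATE record SET value = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_value, record_id, expected_version),
        )
    return cursor.rowcount == 1


# Two users read the same record before either of them writes.
_, version_seen_by_a = read_record(conn, 1)
_, version_seen_by_b = read_record(conn, 1)

a_saved = update_if_unchanged(conn, 1, "edit by user A", version_seen_by_a)
b_saved = update_if_unchanged(conn, 1, "edit by user B", version_seen_by_b)
```

User A's update succeeds and bumps the version. User B's update then matches no row, because the version B read is stale, so A's change is not silently overwritten and B can re-read the record and retry.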
Backup and Recovery Procedures:
In a computer file-based system, the user must create backups of data regularly to protect valuable data from damage caused by failures of the computer system or application program. This is time consuming if the volume of data is large. Most DBMSs provide backup and recovery sub-systems that automatically create backups of data and restore data if required. For example, if the computer system fails in the middle of an update operation, the recovery sub-system is responsible for making sure that the database is restored to the state it was in before the program started executing.
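As a rough illustration of backup and restore, the sketch below uses the backup method that Python's sqlite3 connections provide (Python 3.7 and later). It copies a whole database by hand and so only approximates the automatic, log-based recovery a full DBMS performs; the customer table and its data are made up.

```python
import sqlite3

# The "live" database that applications work against.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO customer (name) VALUES ('Dana')")
live.commit()

# Take a backup: copy the live database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate a faulty program that destroys data.
live.execute("DELETE FROM customer")
live.commit()
damaged = live.execute("SELECT name FROM customer").fetchall()

# Recover: copy the backup over the damaged database.
backup.backup(live)
restored = live.execute("SELECT name FROM customer").fetchall()
```

After the restore, the live database is back in the state captured by the backup. In practice the backup target would be a file on separate storage rather than a second in-memory database.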
Data Independence:
The separation of the data structure of a database from the application programs that access it is called data independence. In a DBMS, the database and application programs are separated from each other, and the DBMS sits between them. In many cases you can change the structure of the database without modifying the application program. For example, you can modify the size or data type of a data item (a field of a database table). In a computer file-based system, on the other hand, the structure of data items is built into the individual application programs. Thus the programs depend on the structure of the data files, and vice versa.
Advanced Capabilities:
A DBMS also provides advanced capabilities for online access to and reporting of data over the Internet. Today, most database systems are online. Database technology is used in conjunction with Internet technology to access data on web servers.
Although there are many advantages, a DBMS may also have some minor disadvantages. These are:
Cost of Hardware & Software:
A processor with a high processing speed and a large memory is required to run the DBMS software. This means that you may have to upgrade the hardware used for the file-based system. Similarly, DBMS software itself can be very costly.
Cost of Data Conversion:
When a computer file-based system is replaced with a database system, the data stored in data files must be converted to database files. Converting data files into a database is difficult and time consuming. You have to hire a DBA (or database designer) and a system designer along with application programmers; alternatively, you have to engage the services of a software house. So a lot of money has to be paid for developing the database and related software.
Cost of Staff Training:
Most DBMSs are complex systems, so users must be trained to use them. Training is required at all levels, including programming, application development, and database administration. The organization has to spend a large amount on training staff to run the DBMS.
Appointing Technical Staff:
Trained technical staff, such as database administrators and application programmers, are required to handle the DBMS. These people command high salaries, so the cost of the system increases.
Database Failures:
In most organizations, all data is integrated into a single database. If the database is corrupted due to a power failure, or is corrupted on the storage media, valuable data may be lost or the whole system may stop.