In this tutorial, we examine the kinds of data on which data mining can be performed to extract insights. In principle, data mining can be applied to any kind of data repository, as well as to transient data such as data streams.
Relational Database
A database system, also called a database management system (DBMS), consists of a collection of interrelated data, known as a database, and a set of software programs to manage and access that data.
Let us look at the term database point by point:
- The software programs in a DBMS provide mechanisms to define database structures, manage the storage of data, support concurrent and shared access to the data, and more.
- The software programs also keep the stored information secure despite system crashes or attempts at unauthorized access.
- A relational database, as the name suggests, is one in which data is organized into relations, in other words tables. It is a collection of tables, each of which is assigned a unique name.
- Each table consists of a set of attributes (columns) and usually stores a large number of tuples (rows).
- A semantic data model, such as an Entity-Relationship (ER) data model, is often constructed for relational databases.
- Each tuple in a relational table represents an object that is identified by a unique key.
- Relational databases can be accessed through queries written in SQL (Structured Query Language), a relational query language. A given query is translated into a set of relational operations such as join, selection, and projection.
Example: consider a database with the following relations:
- Customer
- Item
- Employee
- Branch
The relation Customer consists of a set of attributes, including a unique customer ID (cust_ID), customer name, address, age, occupation, category, and so on.
Each of the relations Item, Employee, and Branch likewise consists of a set of attributes describing its properties.
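To make the relational operations concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `customer` table follows the attributes described above, but the `purchase` table, the column types, and the sample rows are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database; table contents are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        cust_ID    INTEGER PRIMARY KEY,  -- unique key identifying each tuple
        name       TEXT,
        age        INTEGER,
        occupation TEXT
    );
    -- Hypothetical second relation, used only to demonstrate a join.
    CREATE TABLE purchase (
        cust_ID INTEGER REFERENCES customer(cust_ID),
        item    TEXT
    );
    INSERT INTO customer VALUES (1, 'Asha', 34, 'engineer'), (2, 'Ravi', 27, 'teacher');
    INSERT INTO purchase VALUES (1, 'laptop'), (2, 'book'), (1, 'mouse');
""")

# Selection (WHERE clause) and projection (column list) on one relation.
older_names = conn.execute(
    "SELECT name FROM customer WHERE age > 30"
).fetchall()

# Join of two relations on the shared key cust_ID.
items_by_customer = conn.execute(
    """
    SELECT c.name, p.item
    FROM customer AS c
    JOIN purchase AS p ON p.cust_ID = c.cust_ID
    ORDER BY c.name, p.item
    """
).fetchall()

print(older_names)
print(items_by_customer)
conn.close()
```

The `WHERE` clause corresponds to selection, the column list after `SELECT` to projection, and the `JOIN ... ON` clause to the join operation.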
Data Warehouse
A data warehouse is a repository of information collected from multiple sources and stored under a unified schema. Constructing a data warehouse involves the following processes:
- Data cleaning
- Data integration
- Data transformation
- Data loading
- Periodic data refreshing
Note: Data warehouses are closely associated with ETL, which stands for Extract, Transform, and Load. To support decision making, the data in a data warehouse is organized around major subjects such as customer, item, supplier, and activity.
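The construction steps listed above can be sketched as a tiny pipeline in plain Python. This is only an illustration under stated assumptions: the two source systems, their field names, and the in-memory `warehouse` list are all hypothetical, and a real warehouse would use dedicated ETL tooling and persistent storage.

```python
# Two hypothetical source systems with inconsistent formats.
store_sales = [
    {"customer": " alice ", "item": "pen", "amount": "2.50"},
    {"customer": "Bob", "item": "ink", "amount": None},  # missing value
]
web_sales = [
    {"cust_name": "ALICE", "product": "paper", "total": 4.0},
]

def clean(rows):
    """Data cleaning: drop records with a missing amount."""
    return [r for r in rows if r["amount"] is not None]

def integrate(store_rows, web_rows):
    """Data integration: map both sources onto one shared schema."""
    unified = [
        {"customer": r["customer"], "item": r["item"], "amount": r["amount"]}
        for r in store_rows
    ]
    unified += [
        {"customer": r["cust_name"], "item": r["product"], "amount": r["total"]}
        for r in web_rows
    ]
    return unified

def transform(rows):
    """Data transformation: normalize names and numeric types."""
    return [
        {
            "customer": r["customer"].strip().title(),
            "item": r["item"],
            "amount": float(r["amount"]),
        }
        for r in rows
    ]

warehouse = []  # stands in for a warehouse table (assumption)

def load(rows):
    """Data loading: append the prepared rows to the warehouse."""
    warehouse.extend(rows)

load(transform(integrate(clean(store_sales), web_sales)))
print(warehouse)
```

Periodic refreshing would amount to rerunning the same pipeline on newly arrived source records.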
- A data warehouse also preserves historical data.
- A data warehouse is usually modeled by a multidimensional database structure, where each dimension corresponds to an attribute or a set of attributes in the schema.
- Each cell stores the value of some aggregate measure, such as a count or a sum.
- A data cube is closely related to a data warehouse: it provides a multidimensional view of the data and allows the precomputation of, and fast access to, summarized data.
- Data warehouses are well suited for online analytical processing (OLAP). OLAP operations use background knowledge about the data to allow its presentation at different levels of abstraction.
- Examples of OLAP operations include drill-down and roll-up, which let users view the data from different angles, at finer or coarser levels of detail.
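As a rough illustration of roll-up and drill-down, the sketch below aggregates a handful of fact records at different levels of a time hierarchy. The record layout `(year, quarter, city, units_sold)` and the sample values are assumptions; real OLAP engines precompute and index such aggregates rather than scanning the facts each time.

```python
from collections import defaultdict

# Hypothetical fact records: (year, quarter, city, units_sold).
facts = [
    (2023, "Q1", "Pune", 10),
    (2023, "Q1", "Delhi", 5),
    (2023, "Q2", "Pune", 7),
    (2024, "Q1", "Pune", 3),
]

def aggregate(records, key_func):
    """Sum the measure for each group produced by key_func."""
    totals = defaultdict(int)
    for year, quarter, city, units in records:
        totals[key_func(year, quarter, city)] += units
    return dict(totals)

# Detailed view: one cell per (year, quarter, city).
by_quarter_city = aggregate(facts, lambda y, q, c: (y, q, c))

# Roll-up: climb the time hierarchy from quarter to year and drop city.
by_year = aggregate(facts, lambda y, q, c: y)

# Drill-down: step from the 2023 total back down to its quarters.
quarters_2023 = aggregate(
    [f for f in facts if f[0] == 2023], lambda y, q, c: q
)

print(by_year)
print(quarters_2023)
```

Each dictionary plays the role of one slice of a data cube: the keys are dimension values and the stored numbers are the aggregate measure.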
Transactional Databases
A transactional database consists of a file in which each record represents a transaction. Let us discuss the concept using an example.
A transaction includes a unique transaction identifier (for example, trans_ID) followed by a list of the items making up the transaction. Consider the table below.
trans_ID | List of item IDs |
---|---|
T100 | 11,13,18,116 |
T200 | 12,18 |
T300 | 14,16,17,20 |
- Transactions can be stored in a table, with one record per transaction.
- As the table above suggests, a transactional database is usually stored in a flat file, in a format similar to that of the table.
- A transactional database may have additional tables associated with it, containing other information such as the date of the transaction, the customer ID number, and the ID number of the salesperson.
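The following sketch shows one way such a flat file could be read and mined. The pipe-delimited layout and the function name `parse_transactions` are assumptions for illustration; the item IDs are copied from the table above.

```python
from collections import Counter

# Flat-file contents mirroring the table above: transaction ID, then item IDs.
# The "|" and "," delimiters are an assumed file layout.
flat_file = """\
T100|11,13,18,116
T200|12,18
T300|14,16,17,20
"""

def parse_transactions(text):
    """Return {transaction ID: set of item IDs} from pipe-delimited lines."""
    transactions = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        trans_id, items = line.split("|")
        transactions[trans_id.strip()] = {i.strip() for i in items.split(",")}
    return transactions

transactions = parse_transactions(flat_file)

# Count in how many transactions each item appears (its support count).
support = Counter(item for items in transactions.values() for item in items)

print(support["18"])
```

Counting how often items occur across transactions is the starting point for analyses such as finding items that are frequently bought together.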
Advanced Database Systems and Advanced Database Applications
A database system, as described in the earlier sections, is used to store, maintain, and analyse data and to produce the desired outputs using software programs and Structured Query Language (SQL). In this section, we discuss the applications of database systems.
Where do we use databases in our day-to-day life?
Relational database systems are widely used in business applications. With advances in the field of database systems, different kinds of advanced data and information systems have been developed, which have made our day-to-day work easier.
The needs of application users have driven the emergence of these new database applications.
The new database applications include:
- Spatial data: This includes data such as maps.
- Engineering design data: This includes the designs of buildings, system components, or integrated circuits.
- Hypertext and multimedia data: This includes text, image, video, and audio data.
- Time-related data: This includes historical records or stock exchange data.
- Stream data: This includes data such as video surveillance and sensor data, where data flows in and out like streams.
Such databases require efficient facilities to store, retrieve, and regularly update immense amounts of data.
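Stream data differs from stored data in that readings arrive continuously and usually cannot all be kept. As a minimal sketch of this idea, the generator below keeps only a fixed-size window of recent values and reports a running average. The function name `sliding_average`, the window size, and the sample sensor readings are assumptions for illustration.

```python
from collections import deque

def sliding_average(stream, window_size):
    """Yield the mean of the most recent window_size readings."""
    window = deque(maxlen=window_size)  # old readings fall out automatically
    running_sum = 0.0
    for reading in stream:
        if len(window) == window.maxlen:
            running_sum -= window[0]  # value about to be evicted
        window.append(reading)
        running_sum += reading
        yield running_sum / len(window)

# An iterator stands in for a live sensor feed (hypothetical values).
sensor_readings = iter([20.0, 22.0, 24.0, 30.0])
averages = list(sliding_average(sensor_readings, window_size=2))
print(averages)
```

Because the generator holds at most `window_size` readings at a time, memory use stays constant no matter how long the stream runs.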