    Database in a Distributive Environment

    A database is a collection of information, managed so that data can be stored and retrieved quickly. Each application requires a database containing the application-specific data that its users access. However, each application, depending on its needs, requires a different type of database. Researchers classify databases based on user-specific features, settings, and applications.

    Joins, a key performance indicator for any database, have been widely discussed and studied. Some researchers prefer the centralized database over the distributed database, based on their analysis of joins in the two kinds of database. For example, Sharma and Singh (2012) conclude that in a centralized database data is placed in a central location, while in a distributed database data is spread across multiple locations to increase the transparency of access. They found that data is placed in a central location to avoid redundancy in the database. In contrast, Carbunar and Sion (2012) explain that in a parallel distributed system a client places sensitive data on a database server located at the service provider. Based on the joins performed, the authors found that the server should not be able to evaluate cross-column join predicates on the initially stored data. Additionally, since join performance determines the speed of a database, some authors found that a cloud database is better than a parallel database. Cheng, Yu and Yu (2011) show that when the HPSJ algorithm processes an R-join between two basic relations, it first uses the table to obtain all centers that have a non-empty subgroup F labeled x and a non-empty subgroup T labeled y, and maintains them. The authors describe a two-step R-join algorithm that is used to process a temporal relation containing R-join attributes.
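    The center lookup described for HPSJ can be illustrated with a small Python sketch. This is not the published implementation: the dictionary layout, the function name `r_join_via_centers`, and the sample data are hypothetical, and the pairing step at the end is an assumption about how the retained centers might be used, since the summary above only describes the lookup.

```python
def r_join_via_centers(centers, x, y):
    """Sketch of a center-based join for the label pair (x, y).

    `centers` is a hypothetical layout: center id -> {"F": {...}, "T": {...}},
    where each inner dict maps a label to a set of node ids.
    """
    # Step 1 (from the text): keep the centers whose F subgroup labeled x
    # and T subgroup labeled y are both non-empty.
    qualifying = [
        center
        for center, groups in centers.items()
        if groups["F"].get(x) and groups["T"].get(y)
    ]

    # Step 2 (assumption, not stated in the text): pair every node in the
    # F subgroup with every node in the T subgroup of each retained center.
    pairs = set()
    for center in qualifying:
        for u in centers[center]["F"][x]:
            for v in centers[center]["T"][y]:
                pairs.add((u, v))
    return qualifying, pairs


# Hypothetical sample data: only w1 has both subgroups non-empty.
sample_centers = {
    "w1": {"F": {"A": {"a1", "a2"}}, "T": {"B": {"b1"}}},
    "w2": {"F": {"A": {"a3"}}, "T": {"C": {"c1"}}},
    "w3": {"F": {"A": set()}, "T": {"B": {"b2"}}},
}
kept, result = r_join_via_centers(sample_centers, "A", "B")
print(kept)            # ['w1']
print(sorted(result))  # [('a1', 'b1'), ('a2', 'b1')]
```

    Filtering the centers first means that subgroups which cannot contribute to the result are never scanned, which is one plausible reason for doing the lookup before any pairing.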
    On the other hand, Carbunar and Sion (2012) explain that join algorithms return all matching tuples, which speeds up the parallel database. Different authors view settings based on usage in a specific application. Another category, according to the researchers, is load balancing. For example, Lubbe, Reuter and Mitschang (2012) proposed a load balancing algorithm for partitioned data. The algorithm aims to balance the amount of data and focuses on reducing data skew between partitions. They also showed that when the current load on a node exceeds a certain threshold, the node checks the load on its neighboring node; if that load is below the threshold, the load is shared between the two nodes.
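    The threshold rule can be sketched in a few lines of Python. This is only an illustration of the rule as summarized here, not the algorithm of Lubbe, Reuter and Mitschang: the `Node` class, the `rebalance` function, the ring arrangement in which each node's neighbor is the next node in the list, and the even split of the combined load are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: int  # amount of data held by the node (unit is an assumption)


def rebalance(nodes, threshold):
    """Make one pass over the nodes and apply the threshold rule.

    Assumptions: nodes form a ring, the neighbor of a node is the next
    node in the list, and "sharing" means splitting the combined load
    as evenly as possible.
    """
    for i, node in enumerate(nodes):
        if node.load <= threshold:
            continue  # node is not overloaded, nothing to do
        neighbor = nodes[(i + 1) % len(nodes)]
        if neighbor.load < threshold:
            total = node.load + neighbor.load
            neighbor.load = total // 2
            node.load = total - neighbor.load
    return nodes


cluster = [Node("a", 10), Node("b", 2), Node("c", 5)]
rebalance(cluster, threshold=6)
print([n.load for n in cluster])  # [6, 6, 5]
```

    In this sketch the total amount of data is preserved, and a node whose neighbor is also at or above the threshold keeps its load unchanged.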