Sample Paper on Management Architectures
- Data Management Architectures
As analytics and data become integral parts of business, so do data management architectures. Growing competition and opportunity in the global market have created a need for business analytics. Organizations need business data to solve current problems and to anticipate future ones by studying trends. Data management architectures are therefore important to the strategic management of an organization. This paper discusses three main concepts of data management architecture: the data warehouse, the data lake and the data mart.
- Key Concepts of Data Management Architecture
- Data Warehouse
A data warehouse is a large store of data accumulated from various sources inside an organization and used to guide management decisions. According to Oracle, a data warehouse is designed for query and analysis rather than transaction processing (Oracle Data Warehousing Concepts). Warehoused data is stored electronically and must be kept in a secure, reliable, retrievable and easy-to-manage form. The concept of data warehousing originated in 1988 with IBM researchers Barry Devlin and Paul Murphy. Data in a data warehouse is extracted from operational data stores, external sources and transaction systems. These information assets are valuable to an organization and must be warehoused so that they can be retrieved at a later date (Golfarelli and Rizzi 1). Data warehouses are integrated, time-variant and non-volatile (Stair and Reynolds 224).
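The time-variant and non-volatile characteristics can be illustrated with a minimal Python sketch. The `Fact` and `Warehouse` names, the fields and the sample balances are hypothetical and are not drawn from any of the cited sources or from a particular warehouse product. The sketch assumes only that each record carries the date it describes and that loaded records are never overwritten.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a loaded record cannot be modified afterwards
class Fact:
    customer_id: int
    balance: float
    as_of: date  # every record carries the date it describes (time-variant)


class Warehouse:
    """Hypothetical append-only store, used only for illustration."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def load(self, fact: Fact) -> None:
        # Non-volatile: new snapshots are appended, old ones are never overwritten.
        self._facts.append(fact)

    def balance_as_of(self, customer_id: int, when: date) -> Optional[float]:
        # Return the most recent balance recorded on or before `when`.
        candidates = [
            f for f in self._facts
            if f.customer_id == customer_id and f.as_of <= when
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda f: f.as_of).balance


warehouse = Warehouse()
warehouse.load(Fact(1, 100.0, date(2017, 1, 31)))
warehouse.load(Fact(1, 250.0, date(2017, 2, 28)))
```

Because the earlier snapshot is kept, a question about mid-February can still be answered after the end-of-February load, which is the kind of historical query a transaction system that updates records in place cannot support.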
- Data Lake
A data lake is a storage repository that holds large amounts of raw data in its original format until it is required. A data lake allows many points of collection and many points of access for huge volumes of data (Dennis Data Lakes 101: An Overview). The term was coined by James Dixon in 2010; he compared the architecture to a lake of water that is filled from different sources and used by different users for different purposes (Dennis Data Lakes 101: An Overview). Offered as a service, it provides enterprises with fast data processing for faster business outcomes. Many organizations that use the data lake concept relate it to Ralph Waldo Emerson’s maxim that “life is a journey, not a destination,” recognizing that the data lake is the destination of an awaiting data journey (O’Brien and Ryan 4). This journey follows its own direction and pace, based on the priorities and drivers of the organization adopting it. Choosing the right technology when adopting the concept helps the implementation run smoothly (Kim Successful Data Lakes: A Growing Trend).
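The idea of holding raw data in its original format until it is required can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real data lake product: the in-memory `lake` list, the `ingest` and `read` functions, the source names and the sample payloads are all hypothetical.

```python
import csv
import io
import json
from typing import Dict, List, Tuple

# Hypothetical in-memory "lake": each entry keeps the payload exactly as received.
lake: List[Tuple[str, str, str]] = []  # (source, format, raw payload)


def ingest(source: str, fmt: str, payload: str) -> None:
    # No parsing or cleaning on the way in; the original format is preserved.
    lake.append((source, fmt, payload))


def read(source: str) -> List[Dict[str, str]]:
    # Structure is applied only when a user asks for the data.
    rows: List[Dict[str, str]] = []
    for src, fmt, payload in lake:
        if src != source:
            continue
        if fmt == "json":
            rows.append({k: str(v) for k, v in json.loads(payload).items()})
        elif fmt == "csv":
            rows.extend(dict(r) for r in csv.DictReader(io.StringIO(payload)))
    return rows


# Two collection points with different formats feed the same lake.
ingest("web", "json", '{"user": "ana", "clicks": 3}')
ingest("pos", "csv", "store,total\nnorth,120.50\n")
```

Calling `read("web")` or `read("pos")` parses the stored payload at that moment, while the entries in `lake` stay byte-for-byte as they arrived, so a different user could later interpret the same raw data for a different purpose.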
- Data Mart
According to IBM, a data mart is a subsection of a data warehouse (IBM Data Mart). It provides the primary access to information stored in a data warehouse and is often focused on a specific business function or set of functions, usually a department. Most of an organization’s data is stored in a data warehouse, but most individuals require only specific information, such as accounting or marketing data. According to Chhabra and Pahwa (75), data marts are created from a data warehouse, which is also Inmon’s view. Ralph Kimball, by contrast, holds that data marts are the foundation of a data warehouse (Drkusic Data Warehouse Modeling). When designing a data mart, it is important to align its design with specific business requirements.
- Uses of Data Warehouse, Data Lake and Data Mart in an Organization
In an organization, data comes from various sources and in different, inconsistent formats. A data warehouse consolidates this information and makes it available in a harmonized, unified form. Three steps lead to this unified form: extraction, transformation and loading. Data is extracted at set intervals from the different sources in an organization. Extracted data may first go to another server before being moved to the data warehouse. At the transformation stage, the different data sets are made compatible with each other by adjusting formats and resolving conflicts. Finally, the transformed data is loaded for business functions such as trend analysis, calculations and reporting. Financial institutions in particular use data warehouses because they handle structured data well and can store large volumes of data from different sources.
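The three steps described above can be sketched as a small Python pipeline. The two sources (`crm_rows`, `billing_rows`), their field names and the sample values are hypothetical, and the in-memory list standing in for the warehouse is an assumption made to keep the example self-contained.

```python
from typing import Dict, List

# Hypothetical source records with inconsistent field names and units.
crm_rows = [{"Customer": " Ana ", "amount": "120.50"}]
billing_rows = [{"customer_name": "BEN", "amount_cents": 9900}]


def extract() -> List[Dict]:
    # Extraction: pull rows from each source, tagging where they came from.
    return (
        [{"source": "crm", **r} for r in crm_rows]
        + [{"source": "billing", **r} for r in billing_rows]
    )


def transform(rows: List[Dict]) -> List[Dict]:
    # Transformation: map every source onto one schema and one unit (dollars).
    unified = []
    for r in rows:
        if r["source"] == "crm":
            name, amount = r["Customer"], float(r["amount"])
        else:
            name, amount = r["customer_name"], r["amount_cents"] / 100
        unified.append({"customer": name.strip().title(), "amount": amount})
    return unified


def load(rows: List[Dict], warehouse: List[Dict]) -> None:
    # Loading: append the harmonized rows to the warehouse table.
    warehouse.extend(rows)


warehouse: List[Dict] = []
load(transform(extract()), warehouse)
total = sum(row["amount"] for row in warehouse)  # a simple report over unified data
```

Only after the transformation step do the two sources share field names and a currency unit, which is what makes the final sum meaningful.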
A data mart is a subdivision of a data warehouse. It offers analytical information for a restricted data area, for example a single department in an organization (Stair and Reynolds 224). In large organizations with many departments, data marts keep one department's data from interfering with another's. In a school setting, for example, a data mart can keep the administration department's data separate from that of the academic department. Data marts also simplify data analysis: a data mart can meet specific, smaller requirements before analysts tackle data from the full data warehouse. Finally, because a data mart is focused on a specific department, it is easy for individuals to retrieve the analytical information that department needs.
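A data mart as a department-specific slice of a warehouse can be sketched as follows. The table layout, the department names and the `build_data_mart` function are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical warehouse table covering several departments.
warehouse = [
    {"department": "accounting", "metric": "invoices", "value": 42},
    {"department": "marketing", "metric": "leads", "value": 310},
    {"department": "accounting", "metric": "refunds", "value": 5},
]


def build_data_mart(rows: List[Dict], department: str) -> List[Dict]:
    # Copy only the rows for one department; the warehouse itself is untouched.
    return [dict(r) for r in rows if r["department"] == department]


accounting_mart = build_data_mart(warehouse, "accounting")
```

Because the mart holds copies of its own rows, an analyst in accounting works with a smaller data set and cannot accidentally alter marketing data or the warehouse records.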
Like a data warehouse, a data lake offers mass storage, but at lower cost and with greater flexibility and agility. Unlike a data warehouse, a data lake gives users more democratic access to data and leaves open the possibility of asking new questions in the future. A data lake stores data from all sources, keeping both the original version and the transformed version (Inmon 26). A good example of a data lake is cloud storage that backs up all forms of data. The advantage of a data lake is that it saves on physical storage and does not change the original form of the data.
- Works Cited
Chhabra, Rashmi and Pahwa, Payal. Data Mart Designing and Integration Approaches. Vol. 3, Issue 4, 2014, pp. 74-79. http://www.ijcsmc.com/docs/papers/April2014/V3I4201424.pdf Accessed August 4, 2017
Dennis, Amber L. Data Lakes 101: An Overview. 2016. http://www.dataversity.net/data-lakes-101-overview/ Accessed August 4, 2017
Drkusic, Emil. Data Warehouse Modeling. 2016. http://www.vertabelo.com/blog/technical-articles/data-warehouse-modeling-the-star-schema Accessed August 4, 2017
Golfarelli and Rizzi. Introduction to Data Warehousing. 2009. http://cdn.ttgtmedia.com/searchDataManagement/downloads/Data_Warehouse_Design.pdf Accessed August 4, 2017
IBM. Data Mart. https://www.ibm.com/support/knowledgecenter/en/SSMPHH_10.0.0/com.ibm.guardium.doc/reports/datamart.html Accessed August 4, 2017
Inmon, William H. Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump. 2016. Internet resource.
Kim, Dale. Successful Data Lakes: A Growing Trend. 2017. https://upside.tdwi.org/articles/2017/02/16/successful-data-lakes-a-growing-trend.aspx Accessed August 4, 2017
O’Brien, John and Ryan, Lindy. The Definitive Guide to the Data Lake. 2015. https://2xbbhjxc6wk3v21p62t8n4d4-wpengine.netdna-ssl.com/wp-content/uploads/2012/06/Data-Lake-Report-Final-1.pdf Accessed August 4, 2017
Oracle. Data Warehousing Concepts. https://docs.oracle.com/cd/B10501_01/server.920/a96520/concept.htm Accessed August 4, 2017
Stair, Ralph and Reynolds, George. Principles of Information Systems. 2016. Print.