To adapt to the new normal, organizations must adopt a new data management architecture that allows them to thrive in the digital world. Simon Spring, EMEA Account Director at WhereScape, discusses the key components of a Data Fabric and explains how it helps organizations manage and maximize the value of their data.
If 2020 has taught us anything, it’s that change happens suddenly, unexpectedly and can have a significant impact on all aspects of our world. To counter such seismic changes, companies must embrace agility at scale, adjusting objectives and goals, as well as business processes and decision-making capabilities, almost overnight.
The pandemic has forced organizations to manage change and make decisions faster than ever. Everything indicates that this will not be a temporary situation as digitization, automation and other forces of accelerated change continue to shape the new normal.
To thrive and survive in today’s increasingly complex and volatile world, organizations need a new data management architecture, one that makes it easy for users to find and use analytical data and analytical resources to support the strategic and tactical decisions that must be made every day. Enabling this unhindered access – along with seamless accessibility and shareability for anyone who needs it – comes in the form of the Data Fabric.
Data Fabric: what is it and what can it bring?
A Data Fabric is a holistic analytics architecture that ensures all forms of data are captured and integrated for any type of analytics, and are easily accessible and searchable by users across the enterprise.
Born out of the pressing need to find a better way to manage enterprise data, a Data Fabric uses analytics on existing and discoverable metadata assets to support the design, deployment and use of integrated and reusable data in all environments.
According to Gartner, a Data Fabric architecture has four key attributes: it must collect and analyze all forms of metadata, including technical, business, operational and social; it must convert passive metadata into active metadata for frictionless data sharing; it must create and organize knowledge graphs that allow users to derive business value from them; and it must have a robust data integration backbone that supports all types of data users.
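Two of these attributes – activating passive metadata and building knowledge graphs – can be illustrated with a minimal sketch. All dataset names, users and edges below are hypothetical examples, not part of any specific product:

```python
# Passive metadata: static technical descriptions of data assets.
passive = {
    "sales_orders": {"owner": "finance", "format": "parquet"},
    "customer_master": {"owner": "crm", "format": "table"},
}

# Activating the metadata: enrich it with operational usage statistics
# so it can drive automated decisions (recommendations, optimization).
usage_log = ["sales_orders", "sales_orders", "customer_master"]
active = {
    name: {**meta, "access_count": usage_log.count(name)}
    for name, meta in passive.items()
}

# A tiny knowledge graph: labelled edges linking assets to business concepts.
edges = [
    ("sales_orders", "derived_from", "customer_master"),
    ("sales_orders", "feeds", "quarterly_revenue_report"),
]

def related(asset):
    """Return every asset or concept directly linked to `asset`."""
    return sorted({t for s, _, t in edges if s == asset}
                  | {s for s, _, t in edges if t == asset})

print(active["sales_orders"]["access_count"])  # -> 2
print(related("sales_orders"))
# -> ['customer_master', 'quarterly_revenue_report']
```

In a real fabric the usage statistics would come from query logs and the graph from harvested lineage, but the principle is the same: metadata stops being static documentation and starts driving behaviour.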
Simply put, a Data Fabric is a single environment comprised of a unified architecture and services that help organizations manage and maximize the value of their data, eliminating data silos and simplifying on-demand access to data assets across the enterprise, making it faster and easier to acquire new knowledge and undertake real-time analytics.
The good news is that as an overlay on existing applications and data, Data Fabric enables organizations to maximize the value of their existing data lakes and data warehouses. There is no need to remove and replace an existing technology investment.
So what are the key components of a Data Fabric?
Key architectural components: an overview
Made up of multiple components, data flows, and processes that all need to be coordinated and integrated, the Data Fabric analytics architecture presents a complex array of technologies and functions. These include:
- A real-time (RT) analysis platform – The first analytical component, which analyzes the data streams (transactions, IoT feeds, etc.) entering the enterprise in real time.
- The enterprise data warehouse (EDW) – The production analysis environment where routine analyses, reports and KPIs are produced regularly using trusted, reliable data.
- The investigative computing platform (ICP) – Used for data exploration, data mining, modeling and cause-and-effect analyses. Also known as the Data Lake, it is the playground for data scientists and others with unknown or unexpected queries.
- A data integration platform – That extracts, formats and loads the structured data into the EDW and invokes the data quality processing if necessary.
- A data refinery – To ingest structured and multi-structured raw data, distilling it into useful formats in the ICP for advanced analyses.
- Analysis tools and applications – To create reports, perform analyses and view results.
- A data catalog – Which acts as an entry point for users where they can see what data is available and find out what analytical assets already exist. This must be meticulously maintained and updated.
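The data catalog is the piece users touch first, so it is worth seeing in miniature. The sketch below shows the idea only – a searchable register of assets tagged with the fabric component (EDW, ICP, RT) that hosts them; every entry name and tag is a hypothetical example:

```python
# A toy data catalog: the single entry point where users discover what
# data and analytical assets already exist across the fabric.
catalog = [
    {"name": "daily_sales_kpi", "platform": "EDW",
     "tags": ["sales", "kpi", "report"]},
    {"name": "clickstream_raw", "platform": "ICP",
     "tags": ["web", "behaviour", "raw"]},
    {"name": "fraud_alerts_stream", "platform": "RT",
     "tags": ["fraud", "streaming"]},
]

def search(keyword):
    """Return names of entries whose name or tags mention the keyword."""
    kw = keyword.lower()
    return [e["name"] for e in catalog
            if kw in e["name"].lower() or any(kw in t for t in e["tags"])]

print(search("sales"))  # -> ['daily_sales_kpi']
print(search("fraud"))  # -> ['fraud_alerts_stream']
```

Production catalogs add lineage, ownership and quality metrics to each entry, but the discovery pattern – one query surface over all platforms – is exactly this.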
Setting up a Data Fabric: the technical processes involved
Those responsible for creating and maintaining a Data Fabric face a demanding task: the simpler they make access to and use of the analytical environment for the business community, the more complex the underlying infrastructure becomes.
Technologies that work seamlessly to support a variety of processes will be essential. These include:
- Discovery – In addition to detecting what data and assets already exist in the environment and obtaining complete metadata on data lineage (sources, integration techniques and quality metrics), technicians can use usage statistics (who uses what, and how often) and impact analysis to understand which data and analytical assets are affected if an integration program changes.
- Data availability – If a user requests data that is not available, potential sources should be sought and assessed in terms of quality, accessibility and suitability for the requested purpose. All of this information should be documented in the data catalog for future use.
- Design and deploy – Populate the right analysis component (EDW, ICP or RT) with the right data and technologies from the right data source, using data integration and quality processes to ensure data reliability. Sensitive data must be identified and protected by encryption or other masking mechanisms.
- Monitoring – The Data Catalog should be updated with the latest additions, modifications and changes to the Data Fabric, its data or its analytical assets. Likewise, any change in lineage or data usage should be monitored.
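The impact analysis mentioned under Discovery can be sketched as a walk over a lineage graph: given the upstream-to-downstream edges recorded in the catalog, find everything affected when one integration program changes. The asset names below are hypothetical:

```python
from collections import deque

# Lineage edges recorded in the catalog: upstream asset -> downstream assets.
lineage = {
    "source_feed": ["staging_table"],
    "staging_table": ["edw_fact_sales"],
    "edw_fact_sales": ["sales_dashboard", "churn_model"],
}

def impacted(asset):
    """Breadth-first walk returning every downstream asset of `asset`."""
    seen, queue = set(), deque([asset])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)

print(impacted("source_feed"))
# -> ['churn_model', 'edw_fact_sales', 'sales_dashboard', 'staging_table']
```

This is why the Monitoring step matters: the walk is only as trustworthy as the lineage edges kept up to date in the catalog.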
The best tips for success
For a Data Fabric to be successful, organizations must be committed to maintaining the integrity of the architectural standards and the components on which it is built. So, if silos are created as temporary workarounds, they must be decommissioned once they are no longer needed. And because the value of the Data Fabric depends on the strength of the information gathered in the Data Catalog, stale or inaccurate metadata must not be allowed to infiltrate the Catalog.
Finally, simply moving legacy analytical components, such as an aging data warehouse, into the fabric can cause integration issues. Ideally, these legacy components should be reviewed and redesigned.
Although this is a large-scale undertaking, successful Data Fabric environments are already proving their worth when it comes to empowering businesses to leverage data more efficiently and unlock the full potential of their data assets to gain competitive advantage.