The Data Cloud is an ecosystem of cloud-based services for storing, processing, and analyzing operational, business intelligence, and other data from systems of record spanning databases, data warehouses, and data pipelines. Examples of such services include Snowflake, BigQuery, Atlas, Redshift, Looker, Tableau, Fivetran, Kafka, S3, and others.
Benefits of Data Cloud
The Data Cloud enables businesses to become data-driven: to make faster, better decisions and improve their customer reach, engagement, and retention. It also allows them to simplify their data infrastructure management, eliminate silos, and deliver frictionless, seamless access to data. Finally, because the Data Cloud follows the ‘as a service’ delivery model, moving to it frees the internal resources normally required for infrastructure procurement, deployment, and maintenance. Those resources can instead be used to continuously refine products, unlock new opportunities, and deliver deeper insights and visibility to stakeholders.
- Digital Transformation is broadly the process of digitizing information and using digital technologies to create new customer experiences, business processes, or employee management to meet changing business, market, and cultural requirements. The Data Cloud enables companies to efficiently capture all their diverse data, easily experiment with the various use cases, quickly scale to accommodate their need to unlock the value from their data, and dynamically optimize the price for performance and value.
- Data Democratization is generally defined as the process of making data available to a wide range of stakeholders in a business. It has been a key tenet of digital transformation, and helps unlock the value of proprietary data that, until now, has been embedded in various pockets of the organization. This helps organizations improve both their top and bottom line. Embracing the Data Cloud helps teams pick the right tools for managing, processing, and analyzing data—broad interoperability means they can use the tools that make the most sense to them.
- Infrastructure as Code (IaC) is the practice of managing and configuring infrastructure through text files in a human- and machine-readable format. Thanks to an ever more cloud-enabled environment, IaC has seen massive uptake and, in turn, is transforming how organizations bring their products and services to market. The draw for operations and product teams is that IaC lets them rapidly prototype and easily scale. Developers can now combine managed databases like RDS and Cloud SQL with other cloud services to build the Data Cloud for their applications, and they can deploy services like Fivetran and Kafka to move data from their Data Cloud to their data warehouse.
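The core IaC idea described above — infrastructure declared as data, with tooling that reconciles actual state against the declaration — can be sketched in a few lines. This is a toy illustration, not any real tool's API; the resource names and specs below are made up for the example.

```python
# A toy IaC reconciler: infrastructure is declared as data, and the
# planner computes the actions needed to move the current state to
# the desired state. All names and specs here are hypothetical.

DESIRED = {
    "orders-db": {"engine": "postgres", "size": "db.r5.large"},
    "events-topic": {"engine": "kafka", "partitions": 12},
}

CURRENT = {
    "orders-db": {"engine": "postgres", "size": "db.r5.xlarge"},
    "legacy-db": {"engine": "mysql", "size": "db.t3.medium"},
}

def plan(desired, current):
    """Return the create/update/delete actions that reconcile current with desired."""
    actions = []
    for name, spec in desired.items():
        if name not in current:
            actions.append(("create", name, spec))
        elif current[name] != spec:
            actions.append(("update", name, spec))
    for name, spec in current.items():
        if name not in desired:
            actions.append(("delete", name, spec))
    return actions

for action in plan(DESIRED, CURRENT):
    print(action)
```

Real tools such as Terraform or Pulumi follow this same declare-then-reconcile loop, with the added machinery of state files, dependency graphs, and provider plugins.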
The Data Cloud vs. the Data Lake
Data Lakes initially promised to solve the burgeoning problem of siloed data repositories by bringing all data into a single location that can be searched and analyzed across the business. This concept has some major shortcomings, however:
- Data remains very distributed and siloed, and relies on each data repository forwarding information to the Data Lake.
- Data is duplicated, increasing costs, generating inconsistencies, and adding to the governance and monitoring burden.
- Forwarding data to one location is also only half the battle: by itself, it does little to enable the different use cases and support the diverse technologies that different data engineering teams work with.
The Data Cloud solves problems brought on by data siloing. With the Data Cloud, teams can:
- pick the technologies most suited to their needs;
- provide easy access—across their organization—to the relevant data repositories; and
- get consistency and cost problems under control with state-of-the-art monitoring.
With the Data Cloud, processing capacity and storage capacity both move to a more flexible model. By bringing the processing and business intelligence into the cloud along with the data, teams get virtually unlimited scale and performance, as well as easy access to cloud orchestration features that unlock new use cases and opportunities.
Any type of cloud operational model inherently brings new challenges, and the Data Cloud is no exception. In particular, with sensitive customer data and sensitive operational data in use, there are security challenges that must be addressed:
- Identity: Many popular databases, pipelines, and data warehouses do not support protocols like SAML, OIDC, and other standards that are used by identity providers. This makes identity-based security approaches like SSO and MFA infeasible when using the out-of-the-box authentication and authorization features of the data platforms.
- Managing access: Companies configure their data services primarily to provide flexibility and to be able to support various applications, including third party applications, packaged BI tools, and analytic services. This reduces the ability to tightly control access. Without a standard notion of identity, governance, and compliance, the data environment becomes impossible to manage at scale.
- Monitoring and auditing: Generating logs, metrics, and traces slows database performance, and it produces many divergent types of log artifacts, each with its own syntax, structure, and attributes, and all lacking the consistent identity and context information security teams need to trace issues and incidents.
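The monitoring challenge above — divergent log formats with no consistent identity — is easiest to see with a concrete sketch. The two raw log lines, their formats, and the account-to-SSO mapping below are all hypothetical, invented for illustration; in practice the mapping would come from an identity layer rather than a hard-coded table.

```python
import json
import re

# Hypothetical raw audit lines from two different data stores; each
# uses its own syntax and records only a shared database account.
POSTGRES_LINE = "2024-05-01 12:00:03 UTC [app_ro] LOG: statement: SELECT * FROM orders"
WAREHOUSE_LINE = '{"ts": "2024-05-01T12:00:07Z", "user": "svc_bi", "query": "SELECT count(*) FROM orders"}'

def normalize_postgres(line):
    """Parse a Postgres-style text log line into a common audit record."""
    m = re.match(r"(\S+ \S+ \S+) \[(\w+)\] LOG: statement: (.*)", line)
    ts, db_user, query = m.groups()
    return {"timestamp": ts, "db_user": db_user, "query": query, "source": "postgres"}

def normalize_warehouse(line):
    """Parse a JSON-style warehouse audit line into the same common record."""
    rec = json.loads(line)
    return {"timestamp": rec["ts"], "db_user": rec["user"], "query": rec["query"], "source": "warehouse"}

# Map shared database accounts back to the SSO identities behind them
# (a stand-in for the identity context the raw logs are missing).
SSO_MAP = {"app_ro": "alice@example.com", "svc_bi": "bob@example.com"}

def enrich(record):
    record["sso_identity"] = SSO_MAP.get(record["db_user"], "unknown")
    return record

events = [enrich(normalize_postgres(POSTGRES_LINE)),
          enrich(normalize_warehouse(WAREHOUSE_LINE))]
for e in events:
    print(e["timestamp"], e["sso_identity"], e["query"])
```

Every data store added to the environment means another parser and another account mapping to maintain, which is why ad hoc normalization like this stops scaling quickly.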
Securing your Data Cloud with Cyral
Cyral allows organizations to secure their Data Cloud with a service that gives them the ability to observe, control, and protect their databases, data warehouses, and pipelines. What makes Cyral unique is the ability to intercept requests to these data endpoints from any user, tool, or app without impacting performance or scalability. We’ve taken an API-first and Security as Code approach to designing this product, which makes it easy to orchestrate in cloud-native environments and integrates with all popular tools for monitoring, tracing, alerting, and forensics. To learn how Cyral can help your team make this critical transformation, register for a demo.