Wednesday, April 1, 2015

Moore’s Law, Cloud Computing, and DW/BI



What is Moore’s Law?



Moore's Law is the observation made in 1965 by Gordon Moore, co-founder of Intel, that the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented. Moore predicted that this trend would continue for the foreseeable future. In subsequent years the pace slowed somewhat, and the current definition of Moore's Law, which Moore himself has blessed, is that data density doubles approximately every 18 months. Most experts, including Moore himself, expect Moore's Law to hold for at least another two decades. The advent and evolution of cloud computing is a testament to the law.
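To get a feel for what a fixed doubling period implies, here is a minimal Python sketch of the arithmetic. The function name `growth_factor` and its parameters are made up for illustration, and the model assumes the doubling period stays perfectly constant, which is a simplification of the real trend.

```python
def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Return how many times density multiplies after `years`.

    Assumes an idealized, constant doubling period (18 months by default).
    """
    if doubling_period_years <= 0:
        raise ValueError("doubling_period_years must be positive")
    # Each doubling period multiplies density by 2, so growth is 2^(t / T).
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    for years in (1.5, 3, 15):
        print(f"{years} years -> {growth_factor(years):,.0f}x")
```

Under these assumptions, 15 years at an 18-month doubling period is ten doublings, or roughly a 1,024-fold increase, while the original one-year doubling would give fifteen doublings over the same span.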

What is Cloud Computing?



Cloud Computing provides a simple way to access servers, storage, databases and a broad set of application services over the Internet. Cloud Computing providers such as Amazon Web Services own and maintain the network-connected hardware required for these application services, while you provision and use what you need via a web application.





Let us now delve into some of the latest developments pertaining to cloud computing in the DW/BI sphere.

1. Amazon Redshift


Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. Amazon Redshift’s data warehouse architecture allows the user to automate most of the common administrative tasks associated with provisioning, configuring and monitoring a cloud data warehouse. Backups to Amazon S3 are continuous, incremental and automatic. Restores are fast; you can start querying in minutes while your data is spooled down in the background. Enabling disaster recovery across regions takes just a few clicks.

                               

Customer case study: By moving to AWS and using Amazon Redshift as a fast, fully managed data warehouse, Nokia is able to run queries twice as fast as with its previous solution and can use business intelligence tools to mine and analyze big data at a 50% cost savings.

Benefits of Amazon Redshift:

  • Optimized for Data Warehousing: Amazon Redshift has a massively parallel processing (MPP) data warehouse architecture, parallelizing and distributing SQL operations to take advantage of all available resources.

  • Scalable: With a few clicks of the AWS Management Console or a simple API call, you can easily change the number or type of nodes in your cloud data warehouse as your performance or capacity needs change.

  • Fault Tolerant: Amazon Redshift has multiple features that enhance the reliability of your data warehouse cluster. All data written to a node in your cluster is automatically replicated to other nodes within the cluster and all data is continuously backed up to Amazon S3.
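The MPP idea mentioned in the first benefit can be sketched in a few lines of Python. This is a toy illustration only and not how Amazon Redshift is implemented: the helper names (`distribute`, `local_sum`, `parallel_sum`), the CRC32 hash and the use of threads as stand-ins for nodes are all assumptions made for the example. It shows the general pattern of spreading rows across nodes by a distribution key, aggregating on each node, and merging the partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def distribute(rows, key, num_nodes):
    """Assign each row to a node by hashing its distribution key."""
    slices = [[] for _ in range(num_nodes)]
    for row in rows:
        node = crc32(str(row[key]).encode("utf-8")) % num_nodes
        slices[node].append(row)
    return slices


def local_sum(rows, group_key, value_key):
    """Per-node partial aggregate: SUM(value_key) GROUP BY group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def parallel_sum(rows, group_key, value_key, num_nodes=4):
    """Run the aggregate on every node, then merge the partial results."""
    slices = distribute(rows, group_key, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(
            pool.map(lambda s: local_sum(s, group_key, value_key), slices)
        )
    merged = defaultdict(float)
    for partial in partials:
        for group, total in partial.items():
            merged[group] += total
    return dict(merged)


if __name__ == "__main__":
    sales = [
        {"region": "EU", "sales": 10.0},
        {"region": "US", "sales": 7.0},
        {"region": "EU", "sales": 5.0},
    ]
    print(parallel_sum(sales, "region", "sales", num_nodes=3))
```

Python threads do not give true CPU parallelism here; they only stand in for independent nodes. The point is the shape of the computation: adding nodes shrinks each node's slice of the data, which is why changing the node count changes performance and capacity.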

2. Snowflake


Snowflake’s unique architecture takes full advantage of all the cloud’s capabilities to store and process data. Scale up and down at any time without costly redistribution of data, read-only downtime, or hours of delay before new resources can be used. Based on a patent-pending new architecture, Snowflake’s cloud service delivers the power of data warehousing, the flexibility of big data platforms and the elasticity of the cloud—at a 90 percent lower cost than on-premises data warehouses.




Benefits of Snowflake Cloud Services:

  • Data warehousing as a service: Snowflake eliminates the pains associated with managing and tuning a database. That enables self-service access to data so that analysts can focus on getting value from data rather than on managing hardware and software.

  • Multidimensional elasticity: Unlike existing products, Snowflake’s elastic scaling technology makes it possible to independently scale users, data and workloads, delivering optimal performance at any scale. Elastic scaling makes it possible to simultaneously load and query data because every user and workload can have exactly the resources needed, without contention.

  • Single service for all business data: Snowflake brings native storage of semi-structured data into a relational database that understands and fully optimizes querying of that data. Analysts can query structured and semi-structured data in a single system without compromise.
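To make the last point concrete, here is a small Python sketch of what querying structured and semi-structured data together means. It is a conceptual illustration only and does not use Snowflake or its SQL syntax: the sample rows, the `event` column and the helpers `get_path` and `total_amount_where` are all hypothetical.

```python
import json

# Hypothetical rows: fixed relational columns plus a raw JSON "event" column.
ROWS = [
    {"order_id": 1, "amount": 120.0,
     "event": '{"device": {"os": "iOS"}, "tags": ["promo"]}'},
    {"order_id": 2, "amount": 35.5,
     "event": '{"device": {"os": "Android"}}'},
    {"order_id": 3, "amount": 80.0,
     "event": '{"device": {"os": "iOS"}, "tags": []}'},
]


def get_path(document, path, default=None):
    """Walk a dotted path such as "device.os" through nested dicts."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def total_amount_where(rows, json_column, path, expected):
    """Sum the relational `amount` column for rows whose JSON path matches."""
    total = 0.0
    for row in rows:
        document = json.loads(row[json_column])
        if get_path(document, path) == expected:
            total += row["amount"]
    return total


if __name__ == "__main__":
    print(total_amount_where(ROWS, "event", "device.os", "iOS"))
```

The filter reads a field nested inside the JSON document while the aggregate reads an ordinary column, and rows with missing fields are simply skipped rather than rejected. A system with native semi-structured support does this inside the database, without a separate parsing step.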


Customer case study: Adobe implemented Snowflake’s cloud-based offering because of the flexibility that comes from separating compute from storage, which provides users and applications with on-demand access to business-critical data at the performance level and scale required. Adobe’s testing indicated that Snowflake’s cost/performance ratio could exceed that of alternative cloud-based solutions in the market.

Top 5 Trends in Cloud Data Warehousing and Analytics for 2015

  • Trend #1: Rapid Deployment of Large-Scale Cloud Data Warehouses

Given the urgency that today’s ever-increasing data volumes and complexity levels present, organizations are searching for ways to keep the focus on their business rather than their IT infrastructure. Advances in cloud-based infrastructure and technology are leading companies to trust more of their critical functions to the cloud, including large-scale cloud-based data marts and data warehouses.

  • Trend #2: Increased Enablement of Self-Service Data Access via Cloud Data Integration Services

Even the most mature analytics organizations struggle to close the gap between business analysts needing information that is not in existing systems and that information actually being made accessible. Developers on the IT side work to create applications to house and maintain this data, but these solutions are often disparate and are neither governed by nor integrated with a data warehouse or one another. New cloud-based data integration and data refinery technologies can allow organizations to close this gap by providing APIs to easily move data between cloud data stores.

  • Trend #3: Continued Growth of NoSQL Adoption

NoSQL databases showed a 7% increase in adoption in 2014[1], with reasons for increased interest ranging from faster and more flexible development to lower deployment costs. NoSQL databases not only offer a low-risk, low-cost solution for organizations looking to get started with cloud-based analytics but also provide one of the most efficient, scalable solutions for cloud data storage as well. Additionally, new types of NoSQL tools, such as graph databases for analyzing relationship networks and key-value pair databases for data stream analysis, are gaining popularity for specific analytic use cases.

  • Trend #4: Big Data Analytics in the Cloud

Big data has been a major focal point for many organizations in recent years. The challenge with big data analytics has always been bringing the data to the analytics tool. Now, with new technologies available for analyzing these data sets in the cloud, organizations are taking advantage of the increased scalability and lower overhead, and we are seeing a shift from physical machines to cloud-based big data solutions.

  • Trend #5: Cloud-Based Analytics and Data Discovery


Deploying cloud-based analytics and data discovery tools may be one of the simplest, most efficient ways for organizations to engage their users and provide self-service Business Intelligence capabilities, putting the data in the hands of the business users who can get the most insight from it.

References:

http://www.snowflake.net/product/architecture/
http://aws.amazon.com/redshift/
https://www.ironsidegroup.com/2015/02/02/top-5-trends-in-cloud-data-warehousing-and-analytics-for-2015/
http://www.webopedia.com/TERM/M/Moores_Law.html