Data warehousing in the age of big data /

Krishnan, Krish.

Data warehousing in the age of big data / Krish Krishnan. - 1 online resource - The Morgan Kaufmann Series on Business Intelligence.

Includes bibliographical references and index.

Front Cover -- Data Warehousing in the Age of Big Data -- Copyright Page -- Contents -- Acknowledgments -- About the Author -- Introduction -- Part 1: Big Data -- Part 2: The Data Warehousing -- Part 3: Building the Big Data -- Data Warehouse -- Appendixes -- Companion website -- 1 BIG DATA -- 1 Introduction to Big Data -- Introduction -- Big Data -- Defining Big Data -- Why Big Data and why now? -- Big Data example -- Social Media posts -- Survey data analysis -- Survey data -- Weather data -- Twitter data -- Integration and analysis -- Additional data types -- Summary -- Further reading -- 2 Working with Big Data -- Introduction -- Data explosion -- Data volume -- Machine data -- Application log -- Clickstream logs -- External or third-party data -- Emails -- Contracts -- Geographic information systems and geo-spatial data -- Example: Funshots, Inc. -- Data velocity -- Amazon, Facebook, Yahoo, and Google -- Sensor data -- Mobile networks -- Social media -- Data variety -- Summary -- 3 Big Data Processing Architectures -- Introduction -- Data processing revisited -- Data processing techniques -- Data processing infrastructure challenges -- Storage -- Transportation -- Processing -- Speed or throughput -- Shared-everything and shared-nothing architectures -- Shared-everything architecture -- Shared-nothing architecture -- OLTP versus data warehousing -- Big Data processing -- Infrastructure explained -- Data processing explained -- Telco Big Data study -- Infrastructure -- Data processing -- 4 Introducing Big Data Technologies -- Introduction -- Distributed data processing -- Big Data processing requirements -- Technologies for Big Data processing -- Google file system -- Hadoop -- Hadoop core components -- HDFS -- HDFS architecture -- NameNode -- DataNodes -- Image -- Journal -- Checkpoint -- HDFS startup -- Block allocation and storage in HDFS -- HDFS client -- Replication and recovery -- Communication and management -- Heartbeats -- CheckpointNode and BackupNode -- CheckpointNode -- BackupNode -- File system snapshots -- JobTracker and TaskTracker -- MapReduce -- MapReduce programming model -- MapReduce program design -- MapReduce implementation architecture -- MapReduce job processing and management -- MapReduce limitations (Version 1, Hadoop MapReduce) -- MapReduce v2 (YARN) -- YARN scalability -- Comparison between MapReduce v1 and v2 -- SQL/MapReduce -- Zookeeper -- Zookeeper features -- Locks and processing -- Failure and recovery -- Pig -- Programming with Pig Latin -- Pig data types -- Running Pig programs -- Pig program flow -- Common Pig command -- HBase -- HBase architecture -- HBase components -- Write-ahead log -- Hive -- Hive architecture -- Infrastructure -- Execution: how does Hive process queries? -- Hive data types -- Hive query language (HiveQL) -- Chukwa -- Flume -- Oozie -- HCatalog -- Sqoop -- Sqoop1 -- Sqoop2 -- Hadoop summary -- NoSQL -- CAP theorem -- Key-value pair: Voldemort -- Column family store: Cassandra -- Data model.

"In conclusion as you come to the end of this book, the concept of a Data Warehouse and its primary goal of serving the enterprise version of truth, and being the single platform for all the source of information will continue to remain intact and valid for many years to come. As we have discussed across many chapters and in many case studies, the limitations that existed with the infrastructures to create, manage and deploy Data Warehouses have been largely eliminated with the availability of Big Data technologies and infrastructure platforms, making the goal of the single version of truth a feasible reality. Integrating and extending Big Data into the Data Warehouse, and creating a larger decision support platform will benefit businesses for years to come. This book has touched upon governance and information lifecycle management aspects of Big Data in the larger program, however you can reuse all the current program management techniques that you follow for the Data Warehouse for this program and even implement agile approaches to integrating and managing data in the Data Warehouse. Technologies will continue to evolve in this spectrum and there will be more additions of solutions, which can be integrated if you follow the modular integration approaches to building and managing the Data Warehouse. The Appendix sections contain many more case studies and a special section on Healthcare Information Factory based on Big Data approaches. These are more guiding posts to help you align your thoughts and goals to building and integrating Big Data in your Data Warehouse"--


English.

0124059201 9780124059207 1299591914 9781299591912

C20120027378 9780124058910 (WaSeSS)ssj0000872952

490441 MIL 933188FB-27F2-4793-A537-B72686CEC28D OverDrive, Inc. http://www.overdrive.com




Data warehousing.
Big data.
Entrepôts de données (Informatique)
Données volumineuses.
COMPUTERS--Database Management--Data Warehousing.
Big data
Data warehousing
Big Data
Data-Warehouse-Konzept

QA76.9.D37 / K75 2013eb

005.74/5
