Big Data: Principles and best practices of scalable realtime data systems (2015)

by Nathan Marz

Members	Reviews	Popularity	Average rating	Conversations
68	2	393,307	(4.1)	None
Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

About the Book

Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size, or speed. Fortunately, scale and simplicity are not mutually exclusive.

Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.

This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

What's Inside

Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills

About the Authors

Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

▾Members

Recently added by

prengel90, rpandey, anirudhgarg100, ampitup, forniax, bedarrabooks1, benschwem

▾Tags

(Processed for Ordering by MH on 5/26/2015) (1) 2015-1125 (1) @work (1) big data (7) bigdata (1) calibre (1) data science (1) DLBDSEDE (1) eBook - Toc (1) eManning (1) general (1) H7SDA (1) hadoop (2) Computer science (3) computer science (2) kdnuggets (1) currently reading (1) Textbook (1) Ebook (3) map-reduce (1) maybe-later (1) meap (1) To read (5) Programming (3) Programming (1) Own (1) Reference (1) Software (2) Paperback (1) trade paperback (1)

▾Lists

Big Data (5)

▾Conversations (Related links)

There are currently no Conversations about this book.

▾Member reviews

Showing 2 of 2

This book was my first exposure to an architecture for dealing with large amounts of data in a holistic way; while I'm familiar with individual concepts like MapReduce, column stores, CAP, etc., I've never thought about them all as part of the same ecosystem. As such, my rating is based on the accessibility and readability of the book, not on the correctness and feasibility of the content.

This is the kind of technology stack my current employer is forming a business around and I want to get started as soon as possible; I was recommended this book by someone who has built these systems before. I managed to read through it in a day and never felt daunted or lost in the text. While there are certainly parts I chose to skim over because I feel I'll be better off examining them in depth while I tackle that particular part of the infrastructure, I feel the overall gist of what this book is enabling me to build was covered in a very understandable way. Even if I don't remember much of the book's particular details, I know when I'll need to revisit them and where I need to look.

To sum it up very briefly (and hoping I'm not messing up), this book spells out a proposed general architecture for processing huge amounts of data (The Lambda Architecture) and covers the five layers it comprises:
1. The data ingestion layer
2. The batch layer (for views that take a long time to process)
3. The serving layer (for serving the information generated by the batch layer)
4. The speed layer (for quickly showing derived information that has been added since the last batch, can also be used for real-time views)
5. The querying layer, to get back specific information.

Along the way it distinguishes data, meaning raw information, from the information we derive from it through views. At each of these layers, the authors go into the things you will have to consider (algorithm choice, anticipated gotchas, the nature of the problems being solved) and use a concrete solution to demonstrate how a fix for those problems would be implemented. While particular pieces of software are chosen, they are used to discuss the issues in real-world terms, and the book does a good job of not being beholden to particular implementations.
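
As a rough illustration of how the batch, speed, and querying pieces listed above might fit together, here is a minimal Python sketch. It is not taken from the book: the pageview events, the function names (`batch_view`, `speed_view`, `query`), and the `LAST_BATCH` cutoff are all assumptions invented for illustration, and a real speed layer would normally update its view incrementally instead of rescanning events.

```python
from collections import Counter

# Hypothetical raw data: append-only (timestamp, url) pageview events.
events = [
    (1, "/home"),
    (2, "/about"),
    (3, "/home"),
    (4, "/home"),
    (5, "/about"),
]


def batch_view(events, cutoff):
    """Batch layer (sketch): recompute counts from all raw events up to the cutoff."""
    return Counter(url for ts, url in events if ts <= cutoff)


def speed_view(events, cutoff):
    """Speed layer (sketch): cover only events that arrived after the last batch run."""
    return Counter(url for ts, url in events if ts > cutoff)


def query(url, batch, speed):
    """Querying (sketch): merge the batch-derived count with the recent count."""
    return batch[url] + speed[url]


# Assumed timestamp of the most recent completed batch run.
LAST_BATCH = 3

batch = batch_view(events, LAST_BATCH)
speed = speed_view(events, LAST_BATCH)

print(query("/home", batch, speed))   # 2 from the batch view + 1 from the speed view
print(query("/about", batch, speed))  # 1 from the batch view + 1 from the speed view
```

The raw events are never modified; both views are derived from them, which matches the distinction between data and views drawn above.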

I have never read books by Manning Press before, generally choosing to stick to O'Reilly publications and the occasional Pragmatic Bookshelf if it involves Ruby. This book impressed me greatly, and it's still in the process of being read. I will eagerly look over the rest of Manning's catalogue to see if they reach this level of quality.

NaleagDeco | Dec 13, 2020

The best book on the subject.

ignacy | Sep 11, 2014

▾Common Knowledge

Original publication date

2015
