bigdata
Differences
This shows you the differences between two versions of the page.
| bigdata [2020/05/04 14:33] – skipidar | bigdata [2023/01/14 15:36] (current) – skipidar | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ==== BigData ==== | ==== BigData ==== | ||
| + | |||
| + | {{https:// | ||
| Line 61: | Line 63: | ||
| Presto can query data where it is stored, without needing to move data into a separate analytics system. Query execution runs in parallel over a pure memory-based architecture, | Presto can query data where it is stored, without needing to move data into a separate analytics system. Query execution runs in parallel over a pure memory-based architecture, | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | === Hadoop vs. Spark? What are the differences? | ||
| + | |||
| + | Spark can run on top of a Hadoop cluster. | ||
| + | Spark can serve as a replacement for MapReduce. | ||
| + | |||
| + | Hadoop and Apache Spark are both big-data frameworks, but they don't really serve the same purposes. | ||
| + | |||
| + | Hadoop is essentially a DISTRIBUTED DATA infrastructure: | ||
| + | |||
| + | Spark, on the other hand, is a data-processing tool that operates on those distributed data collections; | ||
| + | Spark only competes with the MapReduce part of Hadoop. | ||
| + | Spark is generally a lot faster than MapReduce, largely because it can keep intermediate results in memory instead of writing them to disk between steps. | ||
| + | |||
| + | https:// | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | === What is Apache Storm? === | ||
| + | |||
| + | Storm is a competitor of Spark. | ||
| + | |||
| + | Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. | ||
| + | |||
| + | Apache Storm is not a database. | ||
| + | |||
| + | |||
| + | |||
| + | === Storm vs Spark? === | ||
| + | |||
| + | |||
| + | Both do practically the same job: processing data. | ||
| + | |||
| + | Multi-language support: Storm is better (for example, R). | ||
| + | Data sources: Spark is better (for example, S3). | ||
| + | |||
| + | |||
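
The Hadoop vs. Spark notes above say that Spark only competes with the MapReduce part of Hadoop. As a rough illustration of what that processing model looks like, here is a minimal single-process word-count sketch in plain Python. It does not use any Hadoop or Spark API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group all emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the wiki page.
    sample = [
        "Spark runs on Hadoop",
        "Storm processes streams",
        "Spark processes data",
    ]
    print(word_count(sample))
```

In a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data between them. This sketch only shows the shape of the computation that a MapReduce job, or a Spark job replacing it, would express.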
bigdata.1588602825.txt.gz · Last modified: (external edit)
