Combiners, Reducers, and Ecosystem Projects in Hadoop

 100斤的重口味_866 posted on 2022-12-19 12:10

What do you think of the answers given for the questions below, which I found on a website?

Are the given answers right or wrong?

Question 4:

In the standard word count MapReduce algorithm, why might using a combiner reduce the overall job running time?

A. Because combiners perform local aggregation of word counts, thereby allowing the mappers to process input data faster.
B. Because combiners perform local aggregation of word counts, thereby reducing the number of mappers that need to run.
C. Because combiners perform local aggregation of word counts, and then transfer that data to reducers without writing the intermediate data to disk.
D. Because combiners perform local aggregation of word counts, thereby reducing the number of key-value pairs that need to be shuffled across the network to the reducers.

Answer:A

Question 3:

What happens in a MapReduce job when you set the number of reducers to one?

A. A single reducer gathers and processes all the output from all the mappers. The output is written in as many separate files as there are mappers.
B. A single reducer gathers and processes all the output from all the mappers. The output is written to a single file in HDFS.
C. Setting the number of reducers to one creates a processing bottleneck, and since the number of reducers as specified by the programmer is used as a reference value only, the MapReduce runtime provides a default setting for the number of reducers.
D. Setting the number of reducers to one is invalid, and an exception is thrown.
Answer:A

From my understanding of the questions above, the answers should be:

Question 4: D
Question 3: B

UPDATE

You have user profile records in your OLTP database that you want to join with weblogs you have already ingested into HDFS. How will you obtain these user records?
Options
A. HDFS commands
B. Pig load
C. Sqoop import
D. Hive
Answer:B

For the question in the update, I am torn between B and C.

EDIT

Correct answer: Sqoop.

1 Answer
  • As far as I know, both of the given answers are wrong.

    I have not worked much with the Combiner, but everywhere I have read, it operates on the output of the Mapper. So the answer to question 4 should be D.
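A quick way to see why D is the right reading: the combiner runs on each mapper's output and collapses repeated keys locally, so fewer key-value pairs cross the network to the reducers. Here is a minimal simulation of that effect using only standard shell tools (no Hadoop needed; the sample sentence is made up):

```shell
# Simulated map output for a word-count job: one "word<TAB>1" pair per word
printf 'the cat sat on the mat near the cat\n' \
  | tr ' ' '\n' \
  | awk '{print $0 "\t1"}' > map_out.txt

# Pairs shuffled to reducers WITHOUT a combiner: one per word occurrence
wc -l < map_out.txt                                             # prints 9

# Pairs shuffled WITH a combiner: one per distinct word, counts summed locally
awk -F'\t' '{s[$1] += $2} END {n = 0; for (k in s) n++; print n}' map_out.txt   # prints 6
```

The mappers themselves do no less work with a combiner (ruling out A), and the combiner runs after the map phase, so it does not change how many mappers run (ruling out B); only the shuffled volume shrinks.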

    Again, from practical experience, I have found that the number of output files always equals the number of Reducers, so the answer to question 3 should be B. This may not hold when MultipleOutputs is used, but that is not common.
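For reference, the reducer count is set per job, and with one reducer all mapper output flows to that single task, which writes a single part file. A sketch using Hadoop Streaming (the paths and mapper/reducer scripts are hypothetical placeholders, and this requires a Hadoop installation to actually run):

```shell
# Hypothetical word-count job forced to use exactly one reducer
mapred streaming \
  -D mapreduce.job.reduces=1 \
  -input   /logs/input \
  -output  /logs/wordcount \
  -mapper  ./map.sh \
  -reducer ./reduce.sh

# The single reducer writes one output file under /logs/wordcount
# (e.g. part-00000), which is why the answer is B rather than A.
```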

    Finally, I don't think Apache would lie about MapReduce (though exceptions do happen :). The answers to both questions can be found in their wiki pages. Take a look.

    By the way, I loved the "100% pass guarantee or your money back!!!" quote on the link you provided ;-)

    EDIT
    Since I know very little about Pig and Sqoop, I am not sure about the question in the update section. However, the same thing could be achieved with Hive by creating an external table over the data in HDFS and then doing the join.

    UPDATE
    After the comments from user milk3422 and the question owner, I did some searching and found that my assumption about Hive being the answer to the last question is wrong, because another OLTP database is involved. The correct answer should be C, as Sqoop is designed to transfer data between HDFS and relational databases.
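    For completeness, a typical Sqoop invocation for this scenario might look like the following. The connection string, table name, and target directory are made-up placeholders, and running it requires Sqoop plus the matching JDBC driver:

```shell
# Pull the user profile records from the OLTP database into HDFS,
# where they can then be joined with the already-ingested weblogs
sqoop import \
  --connect jdbc:mysql://oltp-host:3306/appdb \
  --username report_user -P \
  --table user_profiles \
  --target-dir /data/user_profiles \
  --num-mappers 4
```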

    Answered 2022-12-19 12:13