sink.hdfs writer adds garbage to my text file

 Posted by 微公号莆田鞋园 on 2022-12-18 13:15

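I have successfully configured Flume to transfer text files from a local folder to HDFS. My problem is that when a file is transferred to HDFS, some unwanted text ("hdfs.write.Longwriter + binary characters") is added to my text file. Here is my flume.conf: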

agent.sources = flumedump
agent.channels = memoryChannel
agent.sinks = flumeHDFS

agent.sources.flumedump.type = spooldir
agent.sources.flumedump.spoolDir = /opt/test/flume/flumedump/
agent.sources.flumedump.channels = memoryChannel

# Each sink's type must be defined
agent.sinks.flumeHDFS.type = hdfs
agent.sinks.flumeHDFS.hdfs.path = hdfs://bigdata.ibm.com:9000/user/vin
agent.sinks.flumeHDFS.fileType = DataStream

#Format to be written
agent.sinks.flumeHDFS.hdfs.writeFormat = Text

agent.sinks.flumeHDFS.hdfs.maxOpenFiles = 10
# rollover file based on maximum size of 10 MB
agent.sinks.flumeHDFS.hdfs.rollSize = 10485760

# never rollover based on the number of events
agent.sinks.flumeHDFS.hdfs.rollCount = 0

# rollover file based on max time of 1 minute
agent.sinks.flumeHDFS.hdfs.rollInterval = 60


#Specify the channel the sink should use
agent.sinks.flumeHDFS.channel = memoryChannel

# Each channel's type is defined.
agent.channels.memoryChannel.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 100

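My source text file is very simple and contains this text: Hello, my name is Hadoop and this is file one.

The sink file I get in HDFS looks like this: SEQ!org.apache.hadoop.io.LongWritableorg.apache.hadoop.io.Text 5 > I <4H ǥ +Hello, my name is Hadoop and this is file one.

Please let me know what I am doing wrong.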

1 Answer
  • Figured it out. I had to fix this line:

    agent.sinks.flumeHDFS.fileType = DataStream

    and change it to:

    agent.sinks.flumeHDFS.hdfs.fileType = DataStream

    That solved the problem.

    Answered 2022-12-18 13:16
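
Why the fix works: the Flume HDFS sink reads its file type from the key hdfs.fileType, and its documented default is SequenceFile. A key written as fileType without the hdfs. prefix is not picked up by the sink, so the sink falls back to SequenceFile, which explains the SEQ header and the LongWritable/Text class names in the output. As a rough sketch (not part of Flume), the mistake can be caught before deployment with a small check. The function name find_unprefixed_hdfs_keys and the list of suspect option names are assumptions made for illustration; the list covers only the options used in this question.

```python
def find_unprefixed_hdfs_keys(
    conf_text,
    suspect_names=(
        "path", "fileType", "writeFormat", "maxOpenFiles",
        "rollSize", "rollCount", "rollInterval",
    ),
):
    """Return sink keys that look like HDFS sink options missing 'hdfs.'.

    Hypothetical helper: suspect_names is an illustrative subset of the
    HDFS sink options, not a complete list.
    """
    problems = []
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        parts = key.split(".")
        # A misplaced option has the shape <agent>.sinks.<sink>.<option>,
        # while a correct one is <agent>.sinks.<sink>.hdfs.<option>.
        if len(parts) == 4 and parts[1] == "sinks" and parts[3] in suspect_names:
            problems.append(key)
    return problems


if __name__ == "__main__":
    sample = """
agent.sinks = flumeHDFS
agent.sinks.flumeHDFS.type = hdfs
agent.sinks.flumeHDFS.hdfs.path = hdfs://bigdata.ibm.com:9000/user/vin
agent.sinks.flumeHDFS.fileType = DataStream
agent.sinks.flumeHDFS.hdfs.writeFormat = Text
"""
    for bad_key in find_unprefixed_hdfs_keys(sample):
        print("missing 'hdfs.' prefix:", bad_key)
```

Run against the configuration from the question, this flags only agent.sinks.flumeHDFS.fileType. The check does not verify that the sink type is actually hdfs, so it would need adjusting for agents that mix several sink types.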