热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

Sqoop安装配置及演示

Sqoop是一个用来将Hadoop(Hive、HBase)和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如:MySQL,Oracle,Postgres等)中的数据导入到Hadoop的HDFS中,也可以将HDFS的数据导入到关系型数据库中。Sqoop目前已经是Apache的顶级项目了,

Sqoop是一个用来将Hadoop(Hive、HBase)和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如:MySQL ,Oracle ,Postgres等)中的数据导入到Hadoop的HDFS中,也可以将HDFS的数据导入到关系型数据库中。Sqoop目前已经是Apache的顶级项目了,

Sqoop是一个用来将Hadoop(Hive、HBase)和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如:MySQL ,Oracle ,Postgres等)中的数据导入到Hadoop的HDFS中,也可以将HDFS的数据导入到关系型数据库中。 Sqoop目前已经是Apache的顶级项目了,目前版本是1.4.4 和 Sqoop2 1.99.3,本文以1.4.4的版本为例讲解基本的安装配置和简单应用的演示。
  • 安装配置
  • 准备测试数据
  • 导入数据到HDFS
  • 导入数据到Hive
  • 导入数据到HBase
[一]、安装配置 选择Sqoop 1.4.4 版本:sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz 1.1、下载后解压配置:
tar -zxvf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz /usr/local/
cd /usr/local
ln -s sqoop-1.4.4.bin__hadoop-2.0.4-alpha sqoop
1.2、环境变量配置 vi ~/.bash_profile
#Sqoop  add by micmiu.com
export SQOOP_HOME=/usr/local/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
1.3、配置Sqoop参数: 复制/conf/sqoop-env-template.sh 一份重命名为:/conf/sqoop-env.sh vi ?/conf/sqoop-env.sh
# 指定各环境变量的实际配置
# Set Hadoop-specific environment variables here.
#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=
#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
#export HIVE_HOME=
ps:因为我当前用户的默认环境变量中已经配置了相关变量,故该配置文件无需再修改:
# Hadoop  
export HADOOP_PREFIX="/usr/local/hadoop"  
export HADOOP_HOME=${HADOOP_PREFIX}  
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}  
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}  
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}  
# Native Path  
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native  
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native" 
# Hadoop end
#Hive
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$PATH
#HBase
export HBASE_HOME=/usr/local/hbase
export PATH=$HBASE
#add by micmiu.com
1.4、驱动jar包 下面测试演示以MySQL为例,则需要把mysql对应的驱动lib文件copy到 /lib 目录下。 [二]、测试数据准备 以MySQL 为例:
  • 192.168.6.77(hostname:Master.Hadoop)
  • database: test
  • 用户:root 密码:micmiu
准备两张测试表一个有主键表demo_blog,一个无主键表 demo_log
CREATE TABLE `demo_blog` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `blog` varchar(100) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;
CREATE TABLE `demo_log` (
  `operator` varchar(16) NOT NULL,
  `log` varchar(100) NOT NULL
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;
插入测试数据:
insert into demo_blog (id, blog) values (1, "micmiu.com");
insert into demo_blog (id, blog) values (2, "ctosun.com");
insert into demo_blog (id, blog) values (3, "baby.micmiu.com");
insert into demo_log (operator, log) values ("micmiu", "create");
insert into demo_log (operator, log) values ("micmiu", "update");
insert into demo_log (operator, log) values ("michael", "edit");
insert into demo_log (operator, log) values ("michael", "delete");
[三]、导入数据到HDFS 3.1、导入有主键的表 比如我需要把表 demo_blog (含主键) 的数据导入到HDFS中,执行如下命令:
sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog
执行过程如下:
$ sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 09:58:43 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 09:58:43 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 09:58:43 INFO tool.CodeGenTool: Beginning code generation
14/04/09 09:58:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 09:58:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 09:58:43 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/e8fd26a5bca5b7f51cdb03bf847ce389/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 09:58:44 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/e8fd26a5bca5b7f51cdb03bf847ce389/demo_blog.jar
14/04/09 09:58:44 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 09:58:44 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 09:58:44 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 09:58:44 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 09:58:44 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 09:58:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 09:58:45 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 09:58:45 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 09:58:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 09:58:47 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 09:58:47 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 09:58:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0001
14/04/09 09:58:47 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0001 to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 09:58:47 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0001/
14/04/09 09:58:47 INFO mapreduce.Job: Running job: job_1396936838233_0001
14/04/09 09:59:00 INFO mapreduce.Job: Job job_1396936838233_0001 running in uber mode : false
14/04/09 09:59:00 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 09:59:14 INFO mapreduce.Job:  map 33% reduce 0%
14/04/09 09:59:16 INFO mapreduce.Job:  map 67% reduce 0%
14/04/09 09:59:19 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 09:59:19 INFO mapreduce.Job: Job job_1396936838233_0001 completed successfully
14/04/09 09:59:19 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=271866
		FILE: Number of read operatiOns=0
		FILE: Number of large read operatiOns=0
		FILE: Number of write operatiOns=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=44
		HDFS: Number of read operatiOns=12
		HDFS: Number of large read operatiOns=0
		HDFS: Number of write operatiOns=6
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=43032
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=590
		CPU time spent (ms)=6330
		Physical memory (bytes) snapshot=440934400
		Virtual memory (bytes) snapshot=3882573824
		Total committed heap usage (bytes)=160563200
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=44
14/04/09 09:59:19 INFO mapreduce.ImportJobBase: Transferred 44 bytes in 34.454 seconds (1.2771 bytes/sec)
14/04/09 09:59:19 INFO mapreduce.ImportJobBase: Retrieved 3 records.
验证导入到hdfs上的数据:
$ hdfs dfs -ls /user/hadoop/demo_blog
Found 4 items
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 09:59 /user/hadoop/demo_blog/_SUCCESS
-rw-r--r--   3 hadoop supergroup         13 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00000
-rw-r--r--   3 hadoop supergroup         13 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00001
-rw-r--r--   3 hadoop supergroup         18 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00002
[hadoop@Master ~]$ hdfs dfs -cat /user/hadoop/demo_blog/part-m-0000*
1,micmiu.com
2,ctosun.com
3,baby.micmiu.com
ps:默认设置下导入到hdfs上的路径是:?/user/username/tablename/(files),比如我的当前用户是hadoop,那么实际路径即:?/user/hadoop/demo_blog/(files)。 如果要自定义路径需要增加参数:--warehouse-dir 比如:
sqoop import --connect jdbc:mysql://Master.Hadoop/test --username root --password micmiu --table demo_blog --warehouse-dir /user/micmiu/sqoop
3.2、导入不含主键的表 比如需要把表 demo_log(无主键) 的数据导入到hdfs中,执行如下命令:
sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_log --warehouse-dir /user/micmiu/sqoop --split-by operator
ps:无主键表的导入需要增加参数? --split-by xxx ?或者 -m 1 执行过程:
$ sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_log --warehouse-dir /user/micmiu/sqoop --split-by operator
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 15:02:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 15:02:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 15:02:06 INFO tool.CodeGenTool: Beginning code generation
14/04/09 15:02:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_log` AS t LIMIT 1
14/04/09 15:02:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_log` AS t LIMIT 1
14/04/09 15:02:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/dddc1bcdba30515f95a2d604f22e4fe9/demo_log.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 15:02:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/dddc1bcdba30515f95a2d604f22e4fe9/demo_log.jar
14/04/09 15:02:07 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 15:02:07 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 15:02:07 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 15:02:07 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 15:02:07 INFO mapreduce.ImportJobBase: Beginning import of demo_log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 15:02:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 15:02:08 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 15:02:08 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 15:02:10 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`operator`), MAX(`operator`) FROM `demo_log`
14/04/09 15:02:10 WARN db.TextSplitter: Generating splits for a textual index column.
14/04/09 15:02:10 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
14/04/09 15:02:10 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
14/04/09 15:02:10 INFO mapreduce.JobSubmitter: number of splits:4
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 15:02:10 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 15:02:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0003
14/04/09 15:02:10 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0003 to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 15:02:10 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0003/
14/04/09 15:02:10 INFO mapreduce.Job: Running job: job_1396936838233_0003
14/04/09 15:02:17 INFO mapreduce.Job: Job job_1396936838233_0003 running in uber mode : false
14/04/09 15:02:17 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 15:02:28 INFO mapreduce.Job:  map 25% reduce 0%
14/04/09 15:02:30 INFO mapreduce.Job:  map 50% reduce 0%
14/04/09 15:02:33 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 15:02:33 INFO mapreduce.Job: Job job_1396936838233_0003 completed successfully
14/04/09 15:02:33 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=362536
		FILE: Number of read operatiOns=0
		FILE: Number of large read operatiOns=0
		FILE: Number of write operatiOns=0
		HDFS: Number of bytes read=516
		HDFS: Number of bytes written=56
		HDFS: Number of read operatiOns=16
		HDFS: Number of large read operatiOns=0
		HDFS: Number of write operatiOns=8
	Job Counters 
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=44481
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=4
		Map output records=4
		Input split bytes=516
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=429
		CPU time spent (ms)=6650
		Physical memory (bytes) snapshot=587669504
		Virtual memory (bytes) snapshot=5219356672
		Total committed heap usage (bytes)=205848576
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=56
14/04/09 15:02:33 INFO mapreduce.ImportJobBase: Transferred 56 bytes in 25.2746 seconds (2.2157 bytes/sec)
14/04/09 15:02:33 INFO mapreduce.ImportJobBase: Retrieved 4 records.
验证导入的数据:
$ hdfs dfs -ls /user/micmiu/sqoop/demo_log
Found 5 items
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/_SUCCESS
-rw-r--r--   3 hadoop supergroup         28 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00000
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00001
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00002
-rw-r--r--   3 hadoop supergroup         28 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00003
$ hdfs dfs -cat /user/micmiu/sqoop/demo_log/part-m-0000*
michael,edit
michael,delete
micmiu,create
micmiu,update
[四]、导入数据到Hive 比如把表demo_blog 数据导入到Hive中,增加参数 --hive-import?:
sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog  --warehouse-dir /user/sqoop --hive-import --create-hive-table
执行过程如下:
$ sqoop import --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog  --warehouse-dir /user/sqoop --hive-import --create-hive-table 
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 10:44:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 10:44:21 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/04/09 10:44:21 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/04/09 10:44:21 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following:
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-home
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-overwrite
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--create-hive-table
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-table
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-partition-key
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-partition-value
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--map-column-hive
14/04/09 10:44:21 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that
14/04/09 10:44:21 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either
14/04/09 10:44:21 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them
14/04/09 10:44:21 WARN tool.BaseSqoopTool: from command line to remove this warning.
14/04/09 10:44:21 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key, 
14/04/09 10:44:21 INFO tool.BaseSqoopTool: 	 hive-partition-value and --map-column-hive options are 
14/04/09 10:44:21 INFO tool.BaseSqoopTool: 	 are also valid for HCatalog imports and exports
14/04/09 10:44:21 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 10:44:21 INFO tool.CodeGenTool: Beginning code generation
14/04/09 10:44:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/c071f02ecad006293202fd2c2fad0dce/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 10:44:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/c071f02ecad006293202fd2c2fad0dce/demo_blog.jar
14/04/09 10:44:22 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 10:44:22 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 10:44:22 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 10:44:22 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 10:44:22 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 10:44:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 10:44:23 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 10:44:23 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 10:44:25 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 10:44:25 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 10:44:25 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 10:44:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0002
14/04/09 10:44:25 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0002 to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 10:44:25 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0002/
14/04/09 10:44:25 INFO mapreduce.Job: Running job: job_1396936838233_0002
14/04/09 10:44:33 INFO mapreduce.Job: Job job_1396936838233_0002 running in uber mode : false
14/04/09 10:44:33 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 10:44:46 INFO mapreduce.Job:  map 67% reduce 0%
14/04/09 10:44:48 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 10:44:49 INFO mapreduce.Job: Job job_1396936838233_0002 completed successfully
14/04/09 10:44:49 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=271860
		FILE: Number of read operatiOns=0
		FILE: Number of large read operatiOns=0
		FILE: Number of write operatiOns=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=44
		HDFS: Number of read operatiOns=12
		HDFS: Number of large read operatiOns=0
		HDFS: Number of write operatiOns=6
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=34047
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=505
		CPU time spent (ms)=5350
		Physical memory (bytes) snapshot=427388928
		Virtual memory (bytes) snapshot=3881439232
		Total committed heap usage (bytes)=171638784
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=44
14/04/09 10:44:49 INFO mapreduce.ImportJobBase: Transferred 44 bytes in 26.0401 seconds (1.6897 bytes/sec)
14/04/09 10:44:49 INFO mapreduce.ImportJobBase: Retrieved 3 records.
14/04/09 10:44:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:49 INFO hive.HiveImport: Loading uploaded data into Hive
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
14/04/09 10:44:53 INFO hive.HiveImport: 14/04/09 10:44:53 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
14/04/09 10:44:53 INFO hive.HiveImport: 
14/04/09 10:44:53 INFO hive.HiveImport: Logging initialized using configuration in file:/usr/local/hive-0.13.0-bin/conf/hive-log4j.properties
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 10:44:57 INFO hive.HiveImport: OK
14/04/09 10:44:57 INFO hive.HiveImport: Time taken: 0.773 seconds
14/04/09 10:44:57 INFO hive.HiveImport: Loading data to table default.demo_blog
14/04/09 10:44:57 INFO hive.HiveImport: Table default.demo_blog stats: [numFiles=4, numRows=0, totalSize=44, rawDataSize=0]
14/04/09 10:44:57 INFO hive.HiveImport: OK
14/04/09 10:44:57 INFO hive.HiveImport: Time taken: 0.25 seconds
14/04/09 10:44:57 INFO hive.HiveImport: Hive import complete.
14/04/09 10:44:57 INFO hive.HiveImport: Export directory is empty, removing it
Hive CLI中验证导入的数据:
hive> show tables;
OK
demo_blog
hbase_table_1
hbase_table_2
hbase_table_3
micmiu_blog
micmiu_hx_master
pokes
xflow_dstip
Time taken: 0.073 seconds, Fetched: 8 row(s)
hive> select * from demo_blog;
OK
1	micmiu.com
2	ctosun.com
3	baby.micmiu.com
Time taken: 0.506 seconds, Fetched: 3 row(s)
[五]、导入数据到HBase 演示把表 demo_blog 数据导入到HBase ,指定Hbase中表名为 demo_sqoop2hbase 的命令:
sqoop  import  --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog --hbase-table demo_sqoop2hbase --hbase-create-table --hbase-row-key id --column-family url
执行过程:
$ sqoop  import  --connect jdbc:mysql://192.168.6.77/test --username root --password micmiu --table demo_blog --hbase-table demo_sqoop2hbase --hbase-create-table --hbase-row-key id --column-family url
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 16:23:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 16:23:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 16:23:38 INFO tool.CodeGenTool: Beginning code generation
14/04/09 16:23:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 16:23:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 16:23:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/85408c854ee8fba75bbb2458e5e25093/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 16:23:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/85408c854ee8fba75bbb2458e5e25093/demo_blog.jar
14/04/09 16:23:40 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 16:23:40 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 16:23:40 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 16:23:40 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 16:23:40 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 16:23:40 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 16:23:40 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:host.name=Master.Hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_20
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.home=/java/jdk1.6.0_20/jre
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/hadoop/etc/hadoop: .......
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/local/hadoop/lib/native
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.compiler=
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-71.el6.x86_64
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.name=hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Initiating client connection, cOnnectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessiOnTimeout=90000 watcher=hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave5.Hadoop/192.168.8.205:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:41 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Socket connection established to Slave5.Hadoop/192.168.8.205:2181, initiating session
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave5.Hadoop/192.168.8.205:2181, sessiOnid= 0x453fecb6c50009, negotiated timeout = 90000
14/04/09 16:23:41 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Initiating client connection, cOnnectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessiOnTimeout=90000 watcher=catalogtracker-on-hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave7.Hadoop/192.168.8.207:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:41 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Socket connection established to Slave7.Hadoop/192.168.8.207:2181, initiating session
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave7.Hadoop/192.168.8.207:2181, sessiOnid= 0x2453fecb6f50008, negotiated timeout = 90000
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Session: 0x2453fecb6f50008 closed
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: EventThread shut down
14/04/09 16:23:41 INFO mapreduce.HBaseImportJob: Creating missing HBase table demo_sqoop2hbase
14/04/09 16:23:42 INFO zookeeper.ZooKeeper: Initiating client connection, cOnnectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessiOnTimeout=90000 watcher=catalogtracker-on-hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:42 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave7.Hadoop/192.168.8.207:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Socket connection established to Slave7.Hadoop/192.168.8.207:2181, initiating session
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave7.Hadoop/192.168.8.207:2181, sessiOnid= 0x2453fecb6f50009, negotiated timeout = 90000
14/04/09 16:23:42 INFO zookeeper.ZooKeeper: Session: 0x2453fecb6f50009 closed
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: EventThread shut down
14/04/09 16:23:42 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 16:23:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 16:23:47 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 16:23:47 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 16:23:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0005
14/04/09 16:23:47 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0005 to ResourceManager at Master.Hadoop/192.168.6.77:8032
14/04/09 16:23:47 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0005/
14/04/09 16:23:47 INFO mapreduce.Job: Running job: job_1396936838233_0005
14/04/09 16:23:55 INFO mapreduce.Job: Job job_1396936838233_0005 running in uber mode : false
14/04/09 16:23:55 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 16:24:05 INFO mapreduce.Job:  map 33% reduce 0%
14/04/09 16:24:12 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 16:24:12 INFO mapreduce.Job: Job job_1396936838233_0005 completed successfully
14/04/09 16:24:12 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=354636
		FILE: Number of read operatiOns=0
		FILE: Number of large read operatiOns=0
		FILE: Number of write operatiOns=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=0
		HDFS: Number of read operatiOns=3
		HDFS: Number of large read operatiOns=0
		HDFS: Number of write operatiOns=0
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=35297
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=381
		CPU time spent (ms)=11050
		Physical memory (bytes) snapshot=543367168
		Virtual memory (bytes) snapshot=3918925824
		Total committed heap usage (bytes)=156958720
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
14/04/09 16:24:12 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 29.7126 seconds (0 bytes/sec)
14/04/09 16:24:12 INFO mapreduce.ImportJobBase: Retrieved 3 records.
hbase shell中验证导入的数据:
hbase(main):009:0> list
TABLE                                                                                                       
demo_sqoop2hbase                                                                                            
table_02                                                                                                    
table_03                                                                                                    
test_table                                                                                                  
xyz                                                                                                         
5 row(s) in 0.0310 secOnds=> ["demo_sqoop2hbase", "table_02", "table_03", "test_table", "xyz"]
hbase(main):010:0> scan "demo_sqoop2hbase"
ROW                          COLUMN+CELL                                                                    
 1                           column=url:blog, timestamp=1397031850700, value=micmiu.com                     
 2                           column=url:blog, timestamp=1397031844106, value=ctosun.com                     
 3                           column=url:blog, timestamp=1397031849888, value=baby.micmiu.com                
3 row(s) in 0.0730 seconds
hbase(main):011:0> describe "demo_sqoop2hbase"
DESCRIPTION                                                            ENABLED                              
 'demo_sqoop2hbase', {NAME => 'url', DATA_BLOCK_ENCODING => 'NONE', BL true                                 
 OOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIOnS=> '1', COMPRE                                      
 SSION => 'NONE', MIN_VERSIOnS=> '0', TTL => '2147483647', KEEP_DELET                                      
 ED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOC                                      
 KCACHE => 'true'}                                                                                          
1 row(s) in 0.0580 seconds
hbase(main):012:0>
验证导入成功。 本文到此已经把MySQL中的数据迁移到 HDFS、Hive、HBase的三种基本情况演示结束。 参考:
  • http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html
—————– ?EOF?@Michael Sun?—————–
推荐阅读
  • 一、Hadoop来历Hadoop的思想来源于Google在做搜索引擎的时候出现一个很大的问题就是这么多网页我如何才能以最快的速度来搜索到,由于这个问题Google发明 ... [详细]
  • 大数据Hadoop生态(20)MapReduce框架原理OutputFormat的开发笔记
    本文介绍了大数据Hadoop生态(20)MapReduce框架原理OutputFormat的开发笔记,包括outputFormat接口实现类、自定义outputFormat步骤和案例。案例中将包含nty的日志输出到nty.log文件,其他日志输出到other.log文件。同时提供了一些相关网址供参考。 ... [详细]
  • Hadoop源码解析1Hadoop工程包架构解析
    1 Hadoop中各工程包依赖简述   Google的核心竞争技术是它的计算平台。Google的大牛们用了下面5篇文章,介绍了它们的计算设施。   GoogleCluster:ht ... [详细]
  • 我们在之前的文章中已经初步介绍了Cloudera。hadoop基础----hadoop实战(零)-----hadoop的平台版本选择从版本选择这篇文章中我们了解到除了hadoop官方版本外很多 ... [详细]
  • 本文介绍了在Win10上安装WinPythonHadoop的详细步骤,包括安装Python环境、安装JDK8、安装pyspark、安装Hadoop和Spark、设置环境变量、下载winutils.exe等。同时提醒注意Hadoop版本与pyspark版本的一致性,并建议重启电脑以确保安装成功。 ... [详细]
  • Maven构建Hadoop,
    Maven构建Hadoop工程阅读目录序Maven安装构建示例下载系列索引 序  上一篇,我们编写了第一个MapReduce,并且成功的运行了Job,Hadoop1.x是通过ant ... [详细]
  • 什么是大数据lambda架构
    一、什么是Lambda架构Lambda架构由Storm的作者[NathanMarz]提出,根据维基百科的定义,Lambda架构的设计是为了在处理大规模数 ... [详细]
  • 《Spark核心技术与高级应用》——1.2节Spark的重要扩展
    本节书摘来自华章社区《Spark核心技术与高级应用》一书中的第1章,第1.2节Spark的重要扩展,作者于俊向海代其锋马海平,更多章节内容可以访问云栖社区“华章社区”公众号查看1. ... [详细]
  • ubuntu用sqoop将数据从hive导入mysql时,命令: ... [详细]
  • REVERT权限切换的操作步骤和注意事项
    本文介绍了在SQL Server中进行REVERT权限切换的操作步骤和注意事项。首先登录到SQL Server,其中包括一个具有很小权限的普通用户和一个系统管理员角色中的成员。然后通过添加Windows登录到SQL Server,并将其添加到AdventureWorks数据库中的用户列表中。最后通过REVERT命令切换权限。在操作过程中需要注意的是,确保登录名和数据库名的正确性,并遵循安全措施,以防止权限泄露和数据损坏。 ... [详细]
  • 【转】腾讯分析系统架构解析
    TA(TencentAnalytics,腾讯分析)是一款面向第三方站长的免费网站分析系统,在数据稳定性、及时性方面广受站长好评,其秒级的实时数据更新频率也获得业界的认可。本文将从实 ... [详细]
  • Kylin 单节点安装
    软件环境Hadoop:2.7,3.1(sincev2.5)Hive:0.13-1.2.1HBase:1.1,2.0(sincev2.5)Spark(optional)2.3.0K ... [详细]
  • 前言折腾了一段时间hadoop的部署管理,写下此系列博客记录一下。为了避免各位做部署这种重复性的劳动,我已经把部署的步骤写成脚本,各位只需要按着本文把脚本执行完,整个环境基本就部署 ... [详细]
  • Zookeeper为分布式环境提供灵活的协调基础架构。ZooKeeper框架支持许多当今最好的工业应用程序。我们将在本章中讨论ZooKeeper的一些最显着的应用。雅虎ZooKee ... [详细]
  • Azkaban(三)Azkaban的使用
    界面介绍首页有四个菜单projects:最重要的部分,创建一个工程,所有flows将在工程中运行。scheduling:显示定时任务executing:显示当前运行的任务histo ... [详细]
author-avatar
so-sweet天地
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有