How to load data into Hadoop?

We can use Hive to create the tables and load the data. I am using some sample files that I downloaded after a lot of searching on Google. All of the files hold tab-separated data.

  1. Create the tables from the Hive shell.
  2. Go to HADOOP_HOME and copy the files from the root directory of the local file system into the target HDFS directory.
  3. Return to the Hive shell.
  4. Use the LOAD DATA command to load the files into the tables.

create table users (id STRING, birth_date STRING, gender STRING) ROW FORMAT DELIMITED FIELDS TERMINATED by '\t' stored as textfile tblproperties ("skip.header.line.count"="1");

create table products (url STRING, category STRING) ROW FORMAT DELIMITED FIELDS TERMINATED by '\t' stored as textfile tblproperties ("skip.header.line.count"="1");

hadoop fs -put /urlmap.tsv /user/hadoop/products.tsv
hadoop fs -put /regusers.tsv /user/hadoop/users.tsv

LOAD DATA INPATH '/user/hadoop/products.tsv' OVERWRITE INTO TABLE products;
LOAD DATA INPATH '/user/hadoop/users.tsv' OVERWRITE INTO TABLE users;
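
Before putting the files into HDFS it can help to confirm that they really are tab separated. With the wrong delimiter Hive does not raise an error; the whole line lands in the first column and the remaining columns come back as NULL. The snippet below is only a minimal sketch, assuming a POSIX shell with awk is available. check_tsv is a helper name made up for this example, not a Hadoop or Hive tool.

```shell
# check_tsv FILE EXPECTED_COLUMNS  (hypothetical helper, not part of Hadoop/Hive)
# Prints the line number and field count of every row whose number of
# tab-separated fields differs from EXPECTED_COLUMNS.
# Returns non-zero if at least one such row is found.
check_tsv() {
    awk -F '\t' -v want="$2" '
        NF != want { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example with the local files used above (adjust the paths to your setup):
#   check_tsv /regusers.tsv 3   # users: id, birth_date, gender
#   check_tsv /urlmap.tsv 2     # products: url, category
```

If the function prints nothing, every row (including the header row) has the expected number of columns.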

How to clear the Run cache (saved password of a shared folder)

Follow the steps below to resolve the problem:
  1. Open cmd.
  2. Execute the command "net use".
  3. Copy the string from the Remote column for the server whose cached credentials you want to delete (it contains the server's IP address).
  4. Execute the command below:
    • net use <string copied in step 3> /delete
  5. Run "net use" again; the entry should no longer be listed.
  6. Open Task Manager:
    • end explorer.exe and start it again.

SecondaryNameNode Inconsistent checkpoint fields

2016-07-19 12:43:28,702 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint Inconsistent checkpoint fields.
LV = -60 namespaceID = 1655575768 cTime = 0 ; clusterId = CID-90fb1076-4e1e-4ab6-bb76-37177e31ad64 ; blockpoolId = BP-532229130-
Expecting respectively: -60; 724011492; 0; CID-f9fc0705-4a77-49d1-9220-874fd5f30efe; BP-78957080-
at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$


  1. Stop all hadoop services (
  2. Open the log file hadoop-root-secondarynamenode-localhost***.log
  3. Look in it for the path of the secondary namenode directory
  4. Delete the secondary namenode directory
  5. Start the Hadoop services

Hadoop datanode/namenode error Incompatible clusterIDs in /home/hadoop/hadoopdata/hdfs/datanode: namenode clusterID =  datanode clusterID = 
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(


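  1. Check your hdfs-site.xml file to see where the HDFS storage directory is pointing to (the error above names /home/hadoop/hadoopdata/hdfs/datanode).
  2. Delete that folder. Any blocks stored in it are lost.
  3. Stop and start the datanode.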
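
The error means the clusterID recorded by the datanode differs from the one recorded by the namenode. Each HDFS storage directory keeps these values in a current/VERSION file as key=value lines, one of which is clusterID=. The sketch below compares the two files, which is a quick way to confirm the mismatch before deleting anything. It assumes a POSIX shell; cluster_id and same_cluster are names made up for this example, and the namenode path in the usage comment is hypothetical.

```shell
# cluster_id VERSION_FILE  (hypothetical helper)
# Prints the value of the clusterID= line of an HDFS storage VERSION file.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# same_cluster NAMENODE_VERSION DATANODE_VERSION  (hypothetical helper)
# Succeeds only when both files carry the same, non-empty clusterID.
same_cluster() {
    nn_id=$(cluster_id "$1")
    dn_id=$(cluster_id "$2")
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}

# Example (the namenode path is an assumption; the datanode path is the one
# from the error message):
#   if same_cluster /home/hadoop/hadoopdata/hdfs/namenode/current/VERSION \
#                   /home/hadoop/hadoopdata/hdfs/datanode/current/VERSION
#   then echo "clusterIDs match"
#   else echo "clusterIDs differ"
#   fi
```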

Check all your log files for any errors!

You might then get the error below:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /path/to/hadoop/storage/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(
at org.apache.hadoop.hdfs.server.namenode.NameNode.(
at org.apache.hadoop.hdfs.server.namenode.NameNode.(
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(
2016-07-19 11:33:16,737 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@
2016-07-19 11:33:16,838 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2016-07-19 11:33:16,839 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2016-07-19 11:33:16,839 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2016-07-19 11:33:16,839 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.

This happens because the namenode folder was deleted together with the datanode folder in the steps above.

The namenode needs to be formatted again.

Steps to follow:

  1. Stop the datanode.
  2. Delete the folder.
  3. Execute "hdfs namenode -format".
  4. Start the datanode.

Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

hive>show databases;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

chown -R hdfs:hadoop *

If the above solution doesn't work, try removing the *.lck files with the command below:

[root@localhost sbin]# rm metastore_db/*.lck


RuntimeException Permission denied: user=sa, access=WRITE, inode="/tmp/hive-root":root:supergroup:drwxr-xr-x

The problem was solved by changing the dfs.permissions setting (named dfs.permissions.enabled in Hadoop 2.x and later). Locate the file hdfs-site.xml and, using the vim editor, change the value of that property to false.


The system cannot find the batch label specified - nodemanager

This usually means you are trying to start YARN or another Hadoop service on Windows using files downloaded from the internet.
If so, the cause lies in the files you downloaded or replaced in the "bin" and "etc/hadoop" directories.

Some of the *.cmd files have LF as their line terminator (use Notepad++ to check).

Open each of these files in Notepad++ and convert its line endings as follows:

Go to Edit > EOL Conversion > Windows Format

After the conversion, every line terminator in the file is CRLF.
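
When many *.cmd files are affected, the same check and conversion can be scripted instead of opening each file by hand. This is only a sketch, assuming a POSIX shell with awk is at hand (for example Git Bash or WSL on the Windows machine, or the Linux box the files were copied from). has_bare_lf and to_crlf are names made up for this example.

```shell
# has_bare_lf FILE  (hypothetical helper)
# Succeeds if FILE contains at least one line that ends in LF without a CR.
has_bare_lf() {
    awk '!/\r$/ { found = 1; exit } END { exit (found ? 0 : 1) }' "$1"
}

# to_crlf FILE  (hypothetical helper)
# Rewrites FILE so that every line ends in CRLF. A CR that is already
# present is stripped first, so it is never doubled.
to_crlf() {
    tmp=$(mktemp) || return 1
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example, run from the Hadoop installation directory:
#   for f in bin/*.cmd etc/hadoop/*.cmd; do
#       if has_bare_lf "$f"; then to_crlf "$f"; fi
#   done
```

Writing the result back with cat rather than mv keeps the original file's permissions.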
