sqoop import --connect jdbc:mysql://myexample.com/test \ --username myuser -P \ --table mytable sqoop import --connect jdbc:mysql://myexample.com/test --username myuser \ --password mypassword \ --table mytableThe first one is secure as other people can’t see the password, however, it is only practical to use in the command line. And we all agree that the second one is insecure as everyone can see what the password is to access the database. The more secure way of passing the password is through the use of so called password file. The command as follows:
echo -n "password" > /home/ericlin/.mysql.password chmod 400 /home/ericlin/.mysql.password sqoop import --connect jdbc:mysql://myexample.com/test \ --username myuser \ --password-file /home/ericlin/.mysql.password \ --table mytablePlease note that we need “-n” option for the “echo” command so that no newline will be added to the end of the password. And, please do not use “vim” to create the file as “vim” will add newline automatically to the end of the file, which will cause Sqoop to fail as the password contains a newline character. However, storing password in a text file is still considered not secure even though we have set the permissions. As of Sqoop 1.4.5, Sqoop supports the use of JAVA Key Store to store passwords, so that you do not need to store passwords in clear text in a file. To generate the key:
[ericlin@localhost ~] $ hadoop credential create mydb.password.alias -provider jceks://hdfs/user/ericlin/mysql.password.jceks Enter password: Enter password again: mysql.password has been successfully created. org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.On prompt, enter the password that will be used to access the database. The “mydb.password.alias” is the alias that we can use to pass to Sqoop when running the command, so that no password is needed. Then you can run the following Sqoop command:
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/ericlin/mysql.password.jceks \ -–connect ‘jdbc:mysql://myexample.com/test’ \ -–table mytable \ -–username myuser \ -–password-alias mydb.password.aliasThis way password is hidden inside jceks://hdfs/user/ericlin/mysql.password.jceks and no one is able to see it. Hope this helps.
Related articles
Thank you for this, Eric! Very helpful. With the hdfs method of storing password, it would only work if you have a persisted hdfs, correct? Do you have any suggestions for EMR as well? Thanks.
Hi,
I have not played with EMR before, so I don’t have much to say about it. But yes, you need to have persisted HDFS to store the password file.
I followed same steps, but i got this error ” INFO [main] sqoop.Sqoop: Running Sqoop version: 1.4.6
CredentialProvider facility not available in the hadoop environment used
”
I run his command
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/hadoop/mysql.password.jceks –connect ‘jdbc:mysql://localhost/practice’ –table authors –username root –password-alias mydb.password.alias
Whats wrong? Please reply.
I run this command to generate password
hadoop credential create mydb.password.alias -provider jceks://hdfs/user/hadoop/mysql.password.jceks
Hi Venu,
Sorry for the late reply. Can you please provide the verbose output of Sqoop command? So add “–verbose” to sqoop and paste the output here.
Thanks
Hi,
I run the same command , but it is giving unrecognised argument at –password-alias like below
14/04/05 13:57:53 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
16/07/13 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks
16/07/13 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: –password-alias
16/07/13 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: db2-dev-password
plz help me….
Hi Shravan,
Can you confirm the version of Sqoop you are using? If you look at https://issues.apache.org/jira/browse/SQOOP-2283, it is only available from 1.4.6 onwards.
Thank you very much Eric! Fantastic article, helped me to understand how the password credential provider works.
Regards
Sivakumar
Hi SivaKumar,
Thanks for visiting and I am glad that my blog helps.
Hi Eric,
Fantastic. It worked for me well. Thanks a lot
Hi Vaidynanthan,
I am glad that my post helped you and thanks for visiting.
very nice and clear post – I was very confused reading official hadoop documentation but your post and examples cleared this up for me. thanks Eric!
Hi Boris, I am glad that my post helped you!
Hi Eric,
Is there any protection to the credential provider? My point is if everyone can access the provider and retrieve the password, that’ll be essentially the same as everyone being able to see the password in plain text. Right?
Hi Rui,
Sorry about the late reply. I am not sure any good ways, but I would think setting the file to be 600 will prevent others from accessing it. The password is already encrypted and it is very hard to steal password already.
Thanks Eric for the reply. So we need to rely on file permission to protect the key store, which is the same as the (plain text) password file method. But the difference is, unlike password file, the password is encrypted in key store file. Is this correct?
Hi Rui,
Yeah, that’s correct, at least based on my understanding. If you try to cat the jceks file, you will noticed garbage outputs.
I see. Thank you for the explanations 🙂
Hey Eric,
Nice and clean explanation, thanks a lot for sharing mate, this info was quite useful for me.
Keep up the good work and update your blog more often 🙂
Cheers!
Hi Oleg,
Thanks for your comment and I am glad that my blog helps. Will certainly keep blogging and share all my experience of Hadoop and CDH.
Cheers