- Stop the cluster so that no changes are made to it during the migration
- Back up the old PostgreSQL database, including schema and data
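For example, a full dump (schema plus data) can be taken with pg_dump; the user, host, port, database name and output file name below are placeholders:
pg_dump -U <username> -h <postgres-host> -p <port> -f metastore-backup.sql <database>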
- Create a new database and user in MySQL, with proper permissions:
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
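To confirm that the new account works, you can log in with it and check that the database is visible; the password is whatever you chose above:
mysql -u hive -p -e "SHOW DATABASES LIKE 'metastore';"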
- Update the database connection details on the Cloudera Manager > Hive > Configuration page to point to the new MySQL database
- Go to the Hive > “Actions” menu in the top right corner and click “Create Hive MetaStore Database Tables”
- Export the PostgreSQL DB without schema definition, i.e., data only, using the following command (the “-a” option means “data only”; the user, host, port and database name are placeholders):
pg_dump -U <username> --column-inserts --format p -h <postgres-host> -p <port> -f hive-in.sql -a <database>
- Download a php program from http://www.lightbox.ca/pg2mysql/pg2mysql-1.9.tar.bz2. This program can help us do the SQL conversion from PostgreSQL to MySQL
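For example, to download and unpack it (assuming wget and bzip2 are available):
wget http://www.lightbox.ca/pg2mysql/pg2mysql-1.9.tar.bz2
tar -xjf pg2mysql-1.9.tar.bz2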
- Install php-cli on your machine. If you use CentOS, simply run:
sudo yum install php-cli
- Run the following command to generate the hive-out.sql file (tested on PHP 5.3)
php pg2mysql-1.9/pg2mysql_cli.php hive-in.sql hive-out.sql InnoDB
- Open the hive-out.sql file in any editor and add:
SET FOREIGN_KEY_CHECKS=0;
to the beginning of the file and:
SET FOREIGN_KEY_CHECKS=1;
to the end of the file, so that no foreign keys will be checked during the data import.
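If you prefer not to edit the file by hand, the same change can be made from the command line; this is a sketch assuming GNU sed:
sed -i '1i SET FOREIGN_KEY_CHECKS=0;' hive-out.sql
echo 'SET FOREIGN_KEY_CHECKS=1;' >> hive-out.sql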
- Update the following three tables to UPPER CASE, so change from:
INSERT INTO next_compaction_queue_id (ncq_next) VALUES (1);
INSERT INTO next_lock_id (nl_next) VALUES (1);
INSERT INTO next_txn_id (ntxn_next) VALUES (1);
to:
INSERT INTO NEXT_COMPACTION_QUEUE_ID (ncq_next) VALUES (1);
INSERT INTO NEXT_LOCK_ID (nl_next) VALUES (1);
INSERT INTO NEXT_TXN_ID (ntxn_next) VALUES (1);
because those table names are in upper case in the MySQL version of the schema.
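If you would rather script the rename, a single GNU sed pass over the file is one option (a sketch, equivalent to the manual edit above):
sed -i -e 's/INSERT INTO next_compaction_queue_id /INSERT INTO NEXT_COMPACTION_QUEUE_ID /' -e 's/INSERT INTO next_lock_id /INSERT INTO NEXT_LOCK_ID /' -e 's/INSERT INTO next_txn_id /INSERT INTO NEXT_TXN_ID /' hive-out.sql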
- Remove the following line in the hive-out.sql file:
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES (1, '1.1.0', 'Hive release version 1.1.0');
We have already used Cloudera Manager to generate the schema in MySQL at step 5, so this table has been populated automatically.
- Put the hive-out.sql file on the MySQL server host
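For example, copying it over with scp (the user, host and destination path are placeholders):
scp hive-out.sql <user>@<mysql-host>:/tmp/hive-out.sql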
- Log into MySQL and then run the following commands:
USE metastore;
SOURCE /path/to/hive-out.sql;
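Before restarting Hive, a quick sanity check on the imported data can help; the queries below use standard Hive Metastore tables:
SELECT * FROM VERSION;
SELECT COUNT(*) FROM DBS;
SELECT COUNT(*) FROM TBLS;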
- Now you are ready to “Start” the Hive service and test that everything is OK.
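For example, a quick test from Beeline (the HiveServer2 host and port are placeholders):
beeline -u jdbc:hive2://<hiveserver2-host>:10000 -e "SHOW DATABASES;"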