Hive partitioned tables Issue with schema & PrestoDB

It's very strange for Hive and PrestoDB user that the schema of partitioned tables in Hive is defined on partition level as well. Partition level schema and table level schema in Apache Hadoop is letting complex.

Let's see the details in below example:

Table schema

In Hive you can change the schema of an existing table. Let’s say you have a table:

CREATE TABLE TEST1
(ID INT,
 NAME STRING,
 RATING INT
) ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t" LINES TERMINATED BY '\n' STORED AS TEXTFILE;

We will focus on the third column named rating. We load few records for hdfs to this table. The file looks like this:

1         john  3.5
2  Dav  4.6
3  andy  5

hive -e "load data inpath 'input.txt' into table test1"


The third column has some decimal values, but we have defined this column as integer, so the we won’t see the decimal part in the data:

hive> select * from test1;
OK
1  john  3
2  Dav  4
3  andy  5
Time taken: 0.121 seconds, Fetched: 3 row(s)

We realise our mistake, so let’s change the schema of the table by replacing int to decimal(3,1):

hive> alter table test1 change rating rating decimal(3,1);
OK

Time taken: 0.45 seconds

Now we can see the data, as expected:

hive> select * from test1;                            
OK
1  john  3.5
2  Dav  4.6
3  andy  5
Time taken: 0.130 seconds, Fetched: 3 row(s)

So everything works fine, but it’s get more complicated with partitioned tables. Since hive defined schema on table level & partition level as well.

Partitioned tables

Let’s say our table has partitions:

CREATE TABLE TEST1_PARTITIONED
(ID INT,
 NAME STRING,
 RATING INT
) PARTITIONED BY (DAY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t" LINES TERMINATED BY '\n' STORED AS TEXTFILE;


Let’s load the same data as previously done:

hive -e "load data inpath 'input.txt' into table TEST1_PARTITIONED partition (day = 01)"

Table will shows int values — as previously did:

hive> select * from TEST1_PARTITIONED;
OK
1  john  3  01
2  Dav  4  01
3  andy  5  01
Time taken: 0.130 seconds, Fetched: 3 row(s)

We have noticed our mistake and we change the schema of that table:

hive> alter table TEST1_PARTITIONED change rating rating decimal(3,1);
OK
Time taken: 0.311 seconds

No error everythings right till here.


However when we select that data, we get int values again (as if the schema didn’t change):

hive> select * from TEST1_PARTITIONED;
OK
1  john  3  01
2  Dav  4  01
3  andy  5  01
Time taken: 0.139 seconds, Fetched: 3 row(s)

Successful in hive & Spark

But it will fail in prestoDB with below error:
On prestoDB we will get below error:

com.facebook.presto.spi.PrestoException: There is a mismatch between the table and partition schemas. The types are incompatible and cannot be coerced. The column 'rating' in table 'DB.TEST1_PARTITIONED' is declared as type 'decimal(3,1)', but partition 'day=01' declared column 'rating' as type 'int'.
at com.facebook.presto.hive.HiveSplitManager.lambda$getPartitionMetadata$1(HiveSplitManager.java:254)
at com.google.common.collect.Iterators$8.transform(Iterators.java:799)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Iterators$5.hasNext(Iterators.java:548)
at com.facebook.presto.hive.ConcurrentLazyQueue.poll(ConcurrentLazyQueue.java:37)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:219)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:78)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:179)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)



This can be a bit strange. The schema for partitioned table is defined also on partition level. So altering the schema will affect only newly created partitions. For existing partition old schema will be used (that one which was in use during creation of that partition).

Loading the same file to another partition will show the difference:

hive -e "load data inpath 'input.txt' into table TEST1_PARTITIONED partition (day = 02)"

And now we can see that third column (named rating) partition day=02 is displayed with precision decimal(3,1):

hive> select * from TEST1_PARTITIONED;
OK
1  john  3  01
2  Dav  4  01
3  andy  5  01
1  john  3.5  02
2  Dav  4.6  02
3  andy  5  02
Time taken: 0.192 seconds, Fetched: 6 row(s)


How to deal with it?

In Hive 1.1.0 there is a way to change the partition schema but did not work:

alter table ... partition(...) change col_name col_name type;

I haven’t found any way to change the partition schema in releases this release so it seems the only way to fix it is to recreate the table.

Recreate table with changed type & load the same data into it.
Or
If table is external then create new external table on same location with changed schema and load all the existing partition into it.

Comments

Popular posts from this blog

Setup Nginx as a Reverse Proxy for Thingsboard running on different port/server

How to auto re-launch a YARN Application Master on a failure.