HBASE-1968 ISSUE Report

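While sorting through old files, I found a few emails flagged red in Outlook on my laptop, so I am archiving them here. Nothing in them touches on the team's legal or copyright matters. They describe how HBASE-1968 was located, investigated, and reported. All I remembered was that, at the time, I considered it a very basic bug; rereading the emails brought back the details.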

 

 

—————————————————–START——————————————————————————

Hi Andrew,

Thanks for your information.

Best regards

-Forrest

—–Original Message—–

From: Andrew Purtell (RD-US)
Sent: November 11, 2009 11:17
To: Forrest Zhang (RD-CN)
Cc: All of CN ***** Team
Subject: RE: Failure of one Put in HTable causes the failure of all the following Puts

Hi Forrest,

HBase 0.20.2 release candidate 1 is available now at http://people.apache.org/~jdcryans/hbase-0.20.2-candidate-1/

For a list of issues resolved in this release, please see http://su.pr/1nnhl5 . HBASE-1968, the getWriteBuffer() API addition to HTable, is included. Also, the GMS team’s FindBugs reports are incorporated in HBASE-1916; and Trend engineer Mingjui Ray Liao reported and fixed HBASE-1912.

Best regards,

- Andy

 

—–Original Message—–

From: Forrest Zhang (RD-CN)
Sent: Wed 11/11/2009 10:36 AM
To: Andrew Purtell (RD-US)
Cc: All of CN ***** Team
Subject: RE: Failure of one Put in HTable causes the failure of all the following Puts

Hi Andrew,

Thank you for your reply. This workaround is sufficient for our current requirement. We will use it when it’s available.

Best regards,

-Forrest

 

—–Original Message—–
From: Andrew Purtell (RD-US)
Sent: November 10, 2009 12:59
To: Forrest Zhang (RD-CN)
Cc: All of CN ***** Team
Subject: RE: Failure of one Put in HTable causes the failure of all the following Puts

Yes, getWriteBuffer() is in trunk only, I see that now. Sorry, after we cut a release we move on pretty quickly. I added it on the 0.20 branch for the next point release. See https://issues.apache.org/jira/browse/HBASE-1968. In the meantime it’s trivial to patch the current release sources and recompile the jar for the client. Just add the following to org.apache.hadoop.hbase.client.HTable.java:

    public ArrayList<Put> getWriteBuffer() {
        return writeBuffer;
    }

And I did understand your point, but I did not finish item #4 by mistake while editing the comment. I can attach a clarification to the issue but I’m sure all of the devs understand the problem. I restated the problem more clearly I hope in HBASE-1968.

Thanks Forrest.

Please let me know if the workaround is not sufficient and we must do more on 0.20 branch to address this.

Best regards,

 

- Andy

 

—–Original Message—–

From: Forrest Zhang (RD-CN)
Sent: Tue 11/10/2009 10:12 AM
To: Andrew Purtell (RD-US)
Cc: All of CN ***** Team
Subject: RE: Failure of one Put in HTable causes the failure of all the following Puts

Hi Andrew,

Thank you for your timely response.

There is no getWriteBuffer() defined in HTable, thus the client cannot remove the invalid Put manually.

And maybe I have not expressed my idea clearly. I’d like to explain further points 3) and 4) in your posted summary.

3) When the invalid put is processed, an exception is thrown. The finally clause of flushCommits() removes all successful puts from the writebuffer list but the failed put remains at the top.

If all the Puts in writeBuffer are valid, they can be removed from writeBuffer after they are processed successfully. But if one Put is invalid, an exception is thrown, and none of the subsequent Puts are processed, whether they are valid or not. The bad Put and all the Puts after it remain in the writeBuffer even after being retried over and over.

4) Subsequent puts will add more entries to the write buffer but the first entry on the list is invalid so eventually every Put will throw an exception once the buffer limit .

It is always only the first invalid Put that throws an exception; the following Puts never get the chance to be processed, so they do not throw an exception themselves.

Please let me know if you need any further information.

 

Best regards,

-Forrest

 

—–Original Message—–

From: Andrew Purtell (RD-US)
Sent: November 9, 2009 22:54
To: Forrest Zhang (RD-CN)
Cc: All of CN ***** Team
Subject: RE: Failure of one Put in HTable causes the failure of all the following Puts

Hi Forrest,

Thanks for the report with clear detail.

Please see https://issues.apache.org/jira/browse/HBASE-1845?focusedCommentId=12774990&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12774990

HBASE-1845 (“Discussion issue for multi calls, MultiDelete, MultiGet and MultiPut”) is the future direction for the client API. I have posted your findings there.

For now, I believe what will work with the current write buffering scheme in HTable is for the client — when it gets an exception on flushCommits() — to call HTable.getWriteBuffer(), which will return a reference to the list that backs the write buffer, and for the client then to remove the invalid entry at the head of the list manually if the list is not empty. The API is not clean on this point, but it does allow the client to retrieve the invalid entry from the write buffer rather than having it just be discarded. Subsequent puts should then succeed. Please let me know if I have misunderstood something.

Best regards,

- Andy

 

—–Original Message—–

From: Forrest Zhang (RD-CN)
Sent: Mon 11/9/2009 1:06 PM
To: Andrew Purtell (RD-US)
Cc: All of CN ***** Team
Subject: Failure of one Put in HTable causes the failure of all the following Puts

Hi Andrew,

While testing GMS, we found a problem.

When inserting rows into a table by calling the method public synchronized void put(final Put put), if the column family of one row does not exist, the insert operation fails and throws NoSuchColumnFamilyException. We observed that all the following insert operations also fail, even though all of them have valid column families. That is, one failed insert operation can cause the failure of all the following insert operations.

We traced the following two methods, which are used when adding a Put to an HTable, in org.apache.hadoop.hbase.client.HTable.java:

 

 

When a Put is added to an HTable, it is appended to writeBuffer, which is an ArrayList<Put>, and flushCommits() is called, in which all the Puts in writeBuffer are batch processed. Puts that are processed successfully are cleared from writeBuffer, but the first Put that fails for some reason, such as an invalid family (we call it the bad Put), and all the Puts following it remain in writeBuffer and can NEVER be processed successfully. Even though the Puts in writeBuffer are retried on the next put operation, the failure of the first bad Put ALWAYS causes the failure of all the good Puts following it.

Thanks,

Best regards,

-Forrest

—————————————————–END——————————————————————————
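To make the behaviour discussed in the thread concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: ToyPut, ToyTable, tryPut and putWithWorkaround are hypothetical names, IllegalArgumentException stands in for NoSuchColumnFamilyException, and the model assumes that every put() triggers a flush. Only the shape of the logic follows the emails: successful puts are removed in a finally clause, the failed put stays at the head of the buffer, and the workaround removes that head entry through getWriteBuffer().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class WriteBufferDemo {

    /** Hypothetical stand-in for a Put: a row key plus a column family. */
    static final class ToyPut {
        final String row;
        final String family;
        ToyPut(String row, String family) { this.row = row; this.family = family; }
    }

    /** Toy model of the write buffering described in the thread (assumption, not HBase source). */
    static final class ToyTable {
        private final Set<String> families;
        private final ArrayList<ToyPut> writeBuffer = new ArrayList<>();
        private final List<String> stored = new ArrayList<>();

        ToyTable(Set<String> families) { this.families = families; }

        void put(ToyPut put) {
            writeBuffer.add(put);
            flushCommits(); // assumption: flush on every put
        }

        void flushCommits() {
            int processed = 0;
            try {
                for (ToyPut p : writeBuffer) {
                    if (!families.contains(p.family)) {
                        throw new IllegalArgumentException("no such column family: " + p.family);
                    }
                    stored.add(p.row);
                    processed++;
                }
            } finally {
                // Only the successful puts are dropped; a failed put stays at the head.
                writeBuffer.subList(0, processed).clear();
            }
        }

        /** The accessor the workaround relies on: a reference to the backing list. */
        ArrayList<ToyPut> getWriteBuffer() { return writeBuffer; }

        List<String> stored() { return stored; }
    }

    /** Plain put: returns false if the flush threw, and leaves the buffer untouched. */
    static boolean tryPut(ToyTable table, ToyPut put) {
        try {
            table.put(put);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Put with the workaround: on failure, remove the invalid entry at the head. */
    static boolean putWithWorkaround(ToyTable table, ToyPut put) {
        try {
            table.put(put);
            return true;
        } catch (IllegalArgumentException e) {
            List<ToyPut> buffer = table.getWriteBuffer();
            if (!buffer.isEmpty()) {
                buffer.remove(0);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        ToyTable stuck = new ToyTable(Collections.singleton("cf"));
        tryPut(stuck, new ToyPut("r1", "cf"));
        tryPut(stuck, new ToyPut("r2", "missing"));
        boolean stuckOk = tryPut(stuck, new ToyPut("r3", "cf"));
        System.out.println("no workaround:   r3 ok=" + stuckOk
                + ", buffered=" + stuck.getWriteBuffer().size());

        ToyTable fixed = new ToyTable(Collections.singleton("cf"));
        putWithWorkaround(fixed, new ToyPut("r1", "cf"));
        putWithWorkaround(fixed, new ToyPut("r2", "missing"));
        boolean fixedOk = putWithWorkaround(fixed, new ToyPut("r3", "cf"));
        System.out.println("with workaround: r3 ok=" + fixedOk
                + ", buffered=" + fixed.getWriteBuffer().size());
    }
}
```

In this model the valid put r3 fails without the workaround because the bad put r2 is retried first on every flush, which is the point Forrest makes about items 3) and 4). With the workaround the caller takes over the error handling by editing the buffer directly, so the sketch also shows why this is a workaround rather than a fix inside the client.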

 

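End.

Original article. To keep versions of this article consistent, up to date, and traceable, please credit the source when reposting: reposted from idouba

Link to this article: HBASE-1968 ISSUE Report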



Trackbacks/Pingbacks

  1. Problems found in hbase-0.20.1 by findbugs | idouba - November 22, 2015

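    […] Looking back, the quality of HBase at that time, at least from the code perspective, left quite a lot to complain about. I remember one such bug that was not hard to find and was easy to reproduce: a put submitted to HBase contained a nonexistent column family, and because this bad put could not be cleared from the write buffer, all the puts after it failed. It felt like a pretty basic bug. After several rounds of email, the eventual fix did not really look like a bug fix at all, only a workaround that gives users write access to what had been an internal private buffer and leaves the error handling to them. […]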
