Problems found in hbase-0.20.1 by static scan

todo:对其中典型bug的解释。包括规则rule的解释。? ?

在09年底在对开发的代码进行有限的白盒测试的时候,使用静态代码扫描工具就着依赖使用的hbase(当时0.20.0版本)的代码进行代码扫描。本身直接扫描的report结果内容很多,有些是style和bad practice这样的改善型的,但是也有挺多的通过在代码中确认属于真实bug的。尤其是有些非常低级的bug出现在RegionServer这种非常核心的class里。筛选出来通过team在US的一个同事Andrew Purtell提交了HBASE-1916到。

感觉那个时候Hbase的品质,至少是从代码这个角度看,真的是有挺多可以吐槽的。想起来有这样一个bug(HBASE-1968)发现也不是很难,问题的reproduct也很容易,就是向Hbase提交的put中包含了一个不存在的 column family,因为write buffer中这个坏的put不能被清除,结果导致之后的的put都不成功。感觉挺低级的一个bug。邮件沟通了好几轮最终修复手段也怎么看也不是一个解决bug的方 法,只是一个workaround让用户对原来内部的private 的buffer有写权限,让用户来做错误处理。

当时hbase的品质真的不敢恭维,至少当时的感觉是。同时对hadooop源码也进行了扫描(当时版本是0.20.1),也核对了代码后,发现hadoop中扫描得到真正的bug的几乎没有。也确认了hbase项目的品质和hadoop还是差距挺明显的。尤其在之后的读代码过程中,挺多的代码细节一再证明了两个项目的差别,当然也是个人认识,其实是个人感觉:-)。

当然下面列出的是更大范围的,挺多也不能算是bug。

其中有些rule其实是比较tricky的,大多数时候我们都不注意,但是在后期的代码维护中,或者稍微扩展的应用中很容易形成一个致命bug,而这类bug出现后一般调查起来反倒是比较费劲的。挺多其实还是蛮考验语言功底、素养的。

因为比较典型的不少问题,在代码中看着不是错误,至少现在不是错误,甚至都不是大的问题,但是是潜在的问题,在以后的代码维护中,功能稍微有扩展或者调整,即使是当时写下这些的代码的人也可能在修改这份代码时候出现问题。。

当编译器不能再编译阶段,通过语法来提醒我们有些错误的时候,就需要写代码的时候能有些简单的rule来follow。

  1. Pattern Call to equals comparing different types
  2. Pattern Class defines compareTo and usesObjectequals
  3. Pattern Class defines equals and uses ObjecthashCode
  4. Pattern clone method does not call superclone
  5. Pattern Constructor invokes Threadstart
  6. Pattern Dead store to local variable
  7. Pattern equals method always returns false
  8. Pattern Inconsistent synchronization
  9. Pattern Inefficient use of keySet iterator instead ofentrySet iterator
  10. Pattern instanceof will always return false
  11. Pattern integral division result cast to double or float
  12. Pattern Invocation of hashCode on an array
  13. Pattern Invocation of toString on an array
  14. Pattern Method calls Threadsleep with a lock held
  15. Pattern Method does not release lock on all exceptionpaths
  16. Pattern Method invokes inefficient Boolean constructoruse BooleanvalueOf instead
  17. Pattern Method may fail to close stream
  18. Pattern Method might ignore exception
  19. Pattern Method with Boolean return type returns explicitnull
  20. Pattern Naked notify
  21. Pattern No relationship between generic parameter andmethod argument
  22. Pattern Possible null pointer dereference
  23. Pattern Private method is never called
  24. Pattern Random object created and used only once
  25. Pattern Redundant nullcheck of value known to be non-null
  26. Bug Result of integer multiplication cast to long
  27. Pattern Should be a static inner class
  28. Pattern Synchronization performed on utilconcurrentinstance
  29. Pattern Test for floating point equality
  30. Pattern Unconditional wait
  31. Pattern Unread field
  32. Pattern Unread field should this field be static
  33. Pattern Unsigned right shift cast to shortbyte
  34. Pattern Vacuous comparison of integer value

Bugs found in HBase 2.0:

Pattern: Call to equals() comparing different types

id:EC_UNRELATED_TYPES,type:EC,category:CORRECTNESS

This method calls equals(Object) on two references of different class typeswith no common subclasses. Therefore, the objects being compared are unlikelyto be members of the same class at runtime (unless some application classeswere not analyzed, or dynamic class loading can occur at runtime). According tothe contract of equals(), objects of different classes should always compare asunequal; therefore, according to the contract defined byjava.lang.Object.equals(Object), the result of this comparison will always befalse at runtime.

Pattern: Class defines compareTo(…) and usesObject.equals()

id:EQ_COMPARETO_USE_OBJECT_EQUALS,type:Eq,category:BAD_PRACTICE

This class defines a compareTo(…) method but inherits its equals()method from java.lang.Object. Generally, the value of compareTo should returnzero if and only if equals returns true. If this is violated, weird andunpredictable failures will occur in classes such as PriorityQueue. In Java 5the PriorityQueue.remove method uses the compareTo method, while in Java 6 ituses the equals method.

From the JavaDoc for the compareTo method in the Comparable interface:

It is strongly recommended, but not strictly requiredthat (x.compareTo(y)==0) == (x.equals(y)). Generally speaking, any class thatimplements the Comparable interface and violates this condition should clearlyindicate this fact. The recommended language is “Note: this class has anatural ordering that is inconsistent with equals.”

Pattern: Class defines equals() and uses Object.hashCode()

id:HE_EQUALS_USE_HASHCODE,type:HE,category:BAD_PRACTICE

This class overrides equals(Object), but does not override hashCode(), andinherits the implementation of hashCode() from java.lang.Object (which returnsthe identity hash code, an arbitrary value assigned to the object by theVM). Therefore, the class is very likely to violate the invariant thatequal objects must have equal hashcodes.

If you don’t think instances of this class will ever be inserted into aHashMap/HashTable, the recommended hashCode implementation to use is:

Pattern: clone method does not call super.clone()

id:CN_IDIOM_NO_SUPER_CALL,type:CN,category:BAD_PRACTICE

This non-final class defines a clone() method that does not callsuper.clone(). If this class (“A“) is extended by a subclass(“B“), and the subclassBcalls super.clone(), then itis likely thatB‘s clone() method will return an object of typeA,which violates the standard contract for clone().

If all clone() methods call super.clone(), then they are guaranteed to useObject.clone(), which always returns an object of the correct type.

Pattern: Constructor invokes Thread.start()

id:SC_START_IN_CTOR,type:SC,category:MT_CORRECTNESS

The constructor starts a thread. This is likely to be wrong if the classis ever extended/subclassed, since the thread will be started before thesubclass constructor is started.

Pattern: Dead store to local variable

id:DLS_DEAD_LOCAL_STORE,type:DLS,category:STYLE

This instruction assigns a value to a local variable, but the value is notread or used in any subsequent instruction. Often, this indicates an error,because the value computed is never used.

Note that Sun’s javac compiler often generates dead stores for final localvariables. Because FindBugs is a bytecode-based tool, there is no easy way toeliminate these false positives.

Pattern: equals method always returns false

id:EQ_ALWAYS_FALSE,type:Eq,category:CORRECTNESS

This class defines an equals method that always returns false. This meansthat an object is not equal to itself, and it is impossible to create usefulMaps or Sets of this class. More fundamentally, it means that equals is notreflexive, one of the requirements of the equals method.

The likely intended semantics are object identity: that an object is equalto itself. This is the behavior inherited from class Object. If you need tooverride an equals inherited from a different superclass, you can use use:

 

Pattern: Inconsistent synchronization

id:IS2_INCONSISTENT_SYNC,type:IS,category:MT_CORRECTNESS

The fields of this class appear to be accessed inconsistently with respectto synchronization. This bug report indicates that the bug patterndetector judged that

  • The class contains a mix of locked and unlocked accesses,
  • At least one locked access was performed by one of the class’s own methods, and
  • The number of unsynchronized field accesses (reads and writes) was no more than one third of all accesses, with writes being weighed twice as high as reads

A typical bug matching this bug pattern is forgetting to synchronize oneof the methods in a class that is intended to be thread-safe.

You can select the nodes labeled “Unsynchronized access” to showthe code locations where the detector believed that a field was accessedwithout synchronization.

Note that there are various sources of inaccuracy in this detector; forexample, the detector cannot statically detect all situations in which a lockis held. Also, even when the detector is accurate in distinguishinglocked vs. unlocked accesses, the code in question may still be correct.

Bug:Inconsistent synchronization oforg.apache.hadoop.hbase.client.HTable.currentWriteBufferSize; locked 75% of time

Pattern: Inefficient use of keySet iterator instead ofentrySet iterator

id:WMI_WRONG_MAP_ITERATOR,type:WMI,category:PERFORMANCE

This method accesses the value of a Map entry, using a key that wasretrieved from a keySet iterator. It is more efficient to use an iterator onthe entrySet of the map, to avoid the Map.get(key) lookup.

Bug:Method new org.apache.hadoop.hbase.client.Scan(Scan)makes inefficient use of keySet iterator instead of entrySet iterator

Pattern: instanceof will always return false

id:BC_IMPOSSIBLE_INSTANCEOF,type:BC,category:CORRECTNESS

This instanceof test will always return false. Although this is safe, makesure it isn’t an indication of some misunderstanding or some other logic error.

Pattern: integral division result cast to double or float

id:ICAST_IDIV_CAST_TO_DOUBLE,type:ICAST,category:STYLE

This code casts the result of an integral division (e.g., int or longdivision) operation to double or float. Doing division on integers truncatesthe result to the integer value closest to zero. The fact that the result wascast to double suggests that this precision should have been retained. What wasprobably meant was to cast one or both of the operands to doublebeforeperforming the division. Here is an example:

Bug:integral division result cast to double or float

Pattern: Invocation of hashCode on an array

id:DMI_INVOKING_HASHCODE_ON_ARRAY,type:DMI,category:CORRECTNESS

The code invokes hashCode on an array. Calling hashCode on an arrayreturns the same value as System.identityHashCode, and ingores the contents andlength of the array. If you need a hashCode that depends on the contents of anarray a, use java.util.Arrays.hashCode(a).

Pattern: Invocation of toString on an array

id:DMI_INVOKING_TOSTRING_ON_ARRAY,type:USELESS_STRING,category:CORRECTNESS

The code invokes toString on an array, which will generate a fairly uselessresult such as [C@16f0472. Consider using Arrays.toString to convert the arrayinto a readable String that gives the contents of the array. See ProgrammingPuzzlers, chapter 3, puzzle 12.

Pattern: Method calls Thread.sleep() with a lock held

id:SWL_SLEEP_WITH_LOCK_HELD,type:SWL,category:MT_CORRECTNESS

This method calls Thread.sleep() with a lock held. This may result in verypoor performance and scalability, or a deadlock, since other threads may bewaiting to acquire the lock. It is a much better idea to call wait() on thelock, which releases the lock and allows other threads to run.

Bug:org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getMaster()calls Thread.sleep() with a lock held

Pattern: Method concatenates strings using + in a loop
id: SBSC_USE_STRINGBUFFER_CONCATENATION, type: SBSC, category: PERFORMANCE

The method seems to be building a String using concatenation in a loop. Ineach iteration, the String is converted to a StringBuffer/StringBuilder,appended to, and converted back to a String. This can lead to a cost quadraticin the number of iterations, as the growing string is recopied in eachiteration.

Better performance can be obtained by using a StringBuffer (orStringBuilder in Java 1.5) explicitly.

For example:

Bug:org.apache.hadoop.hbase.regionserver.LogRoller.run()does not release lock on all exception paths

Pattern: Method does not release lock on all exceptionpaths

id:UL_UNRELEASED_LOCK_EXCEPTION_PATH,type:UL,category:MT_CORRECTNESS

This method acquires a JSR-166 (java.util.concurrent)lock, but does not release it on all exception paths out of the method. Ingeneral, the correct idiom for using a JSR-166 lock is:

Pattern: Method invokes inefficient Boolean constructor;use Boolean.valueOf(…) instead

id:DM_BOOLEAN_CTOR,type:Dm,category:PERFORMANCE

Creating new instances of java.lang.Boolean wastes memory, since Booleanobjects are immutable and there are only two useful values of this type.Use the Boolean.valueOf() method (or Java 1.5 autoboxing) to create Booleanobjects instead.

Bug:org.apache.hadoop.hbase.thrift.generated.ColumnDescriptor.getFieldValue(int)

Pattern: Method invokes inefficient Number constructor;use static valueOf instead
id: DM_NUMBER_CTOR, type: Bx, category: PERFORMANCE

Using new Integer(int) is guaranteed to always result in a new objectwhereas Integer.valueOf(int) allows caching of values to be done by thecompiler, class library, or JVM. Using of cached values avoids objectallocation and the code will be faster.

Values between -128 and 127 are guaranteed to have corresponding cachedinstances and using valueOf is approximately 3.5 times faster than usingconstructor. For values outside the constant range the performance of bothstyles is the same.

Unless the class must be compatible with JVMs predating Java 1.5, useeither autoboxing or the valueOf() method when creating instances of Long,Integer, Short, Character, and Byte.

Bug: Methodorg.apache.hadoop.hbase.client.TestClient.makeNAscii(byte[], int) invokesinefficient new Integer(int) constructor; use Integer.valueOf(int) instead

Pattern: Method may fail to close stream

id:OS_OPEN_STREAM,type:OS,category:BAD_PRACTICE

The method creates an IO stream object, does not assign it to any fields,pass it to other methods that might close it, or return it, and does not appearto close the stream on all paths out of the method. This may result in afile descriptor leak. It is generally a good idea to use a finally blockto ensure that streams are closed.

Bug:org.apache.hadoop.hbase.io.hfile.RandomSeek.slurp(String) may fail to closestream

Pattern: Method might ignore exception

id:DE_MIGHT_IGNORE,type:DE,category:BAD_PRACTICE

This method might ignore an exception. In general, exceptions shouldbe handled or reported in some way, or they should be thrown out of the method.

Bug:org.apache.hadoop.hbase.ipc.HBaseServer$Connection.close() might ignorejava.lang.Exception

Pattern: Method with Boolean return type returns explicitnull

id:NP_BOOLEAN_RETURN_NULL,type:NP,category:BAD_PRACTICE

A method that returns either Boolean.TRUE, Boolean.FALSE or null is anaccident waiting to happen. This method can be invoked as though it returned avalue of type boolean, and the compiler will insert automatic unboxing of theBoolean value. If a null value is returned, this will result in aNullPointerException.

Bug: org.apache.hadoop.hbase.client.HTable$10.call()has Boolean return type and returns explicit null

Pattern: Naked notify

id:NN_NAKED_NOTIFY,type:NN,category:MT_CORRECTNESS

A call to notify() or notifyAll() was made without any (apparent)accompanying modification to mutable object state. In general, calling anotify method on a monitor is done because some condition another thread iswaiting for has become true. However, for the condition to be meaningful,it must involve a heap object that is visible to both threads.

This bug does not necessarily indicate an error, since the change tomutable object state may have taken place in a method which then called themethod containing the notification.

Pattern: No relationship between generic parameter andmethod argument

id:GC_UNRELATED_TYPES,type:GC,category:CORRECTNESS

This call to a generic collection method contains an argument with anincompatible class from that of the collection’s parameter (i.e., the type ofthe argument is neither a supertype nor a subtype of the corresponding generictype argument). Therefore, it is unlikely that the collection contains anyobjects that are equal to the method argument used here. Most likely, the wrongvalue is being passed to the method.

In general, instances of two unrelated classes are not equal. For example,if the Foo and Bar classes are not related by subtyping, then an instance ofFoo should not be equal to an instance of Bar. Among other issues, doing sowill likely result in an equals method that is not symmetrical. For example, ifyou define the Foo class so that a Foo can be equal to a String, your equalsmethod isn’t symmetrical since a String can only be equal to a String.

In rare cases, people do define nonsymmetrical equals methods and stillmanage to make their code work. Although none of the APIs document or guaranteeit, it is typically the case that if you check if a Collection<String>contains a Foo, the equals method of argument (e.g., the equals method of theFoo class) used to perform the equality checks.

Pattern: Possible null pointer dereference

id:NP_NULL_ON_SOME_PATH,type:NP,category:CORRECTNESS

There is a branch of statement that,if executed,guarantees that anull value will be dereferenced, which would generate a NullPointerExceptionwhen the code is executed. Of course, the problem might be that the branch orstatement is infeasible and that the null pointer exception can’t ever beexecuted; deciding that is beyond the ability of FindBugs

Pattern: Private method is never called

id:UPM_UNCALLED_PRIVATE_METHOD,type:UPM,category:PERFORMANCE

This private method is never called. Although it is possible that themethod will be invoked through reflection, it is more likely that the method isnever used, and should be removed.

Pattern: Random object created and used only once

id:DMI_RANDOM_USED_ONLY_ONCE,type:BC,category:BAD_PRACTICE

This code creates a java.util.Random object, uses it to generate onerandom number, and then discards the Random object. This produces mediocrequality random numbers and is inefficient. If possible, rewrite the code sothat the Random object is created once and saved, and each time a new randomnumber is required invoke a method on the existing Random object to obtain it.

If it is important that the generated Random numbers not be guessable, youmustnot create a new Random for each random number; the values are tooeasily guessable. You should strongly consider using ajava.security.SecureRandom instead (and avoid allocating a new SecureRandom foreach random number needed).

Bug: TestTableIndex

Pattern: Redundant nullcheck of value known to be non-null

id:RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE,type:RCN,category:STYLE

This method contains a redundant check of a known non-null value againstthe constant null.

Bug: Result of integer multiplication cast to long

Pattern id:ICAST_INTEGER_MULTIPLY_CAST_TO_LONG,type:ICAST,category:STYLE

This code performs integer multiply and then converts the result to along, as in:

If the multiplication is done using long arithmetic, youcan avoid the possibility that the result will overflow. For example, you couldfix the above code to:

or

Pattern: Should be a static inner class

id:SIC_INNER_SHOULD_BE_STATIC,type:SIC,category:PERFORMANCE

This class is an inner class, but does not use its embedded reference tothe object which created it. This reference makes the instances of theclass larger, and may keep the reference to the creator object alive longerthan necessary. If possible, the class should be made static.

Pattern: Synchronization performed on util.concurrentinstance

id:JLM_JSR166_UTILCONCURRENT_MONITORENTER,type:JLM,category:MT_CORRECTNESS

This method performs synchronization an object that is an instance of aclass from the java.util.concurrent package (or its subclasses). Instances ofthese classes have there own concurrency control mechanisms that are distinctfrom and incompatible with the use of the keyword synchronized.

Pattern: Test for floating point equality

id:FE_FLOATING_POINT_EQUALITY,type:FE,category:STYLE

This operation compares two floating point values for equality. Becausefloating point calculations may involve rounding, calculated float and doublevalues may not be accurate. For values that must be precise, such as monetaryvalues, consider using a fixed-precision type such as BigDecimal. For valuesthat need not be precise, consider comparing for equality within some range,for example: if ( Math.abs(x – y) < .0000001 ). See the Java LanguageSpecification, section 4.2.4.

Pattern: Unconditional wait

id:UW_UNCOND_WAIT,type:UW,category:MT_CORRECTNESS

This method contains a call to java.lang.Object.wait() which is notguarded by conditional control flow. The code should verify thatcondition it intends to wait for is not already satisfied before calling wait;any previous notifications will be ignored.

Pattern: Unread field

id:URF_UNREAD_FIELD,type:UrF,category:PERFORMANCE

This field is never read. Consider removing it from the class.

Pattern: Unread field: should this field be static

id:SS_SHOULD_BE_STATIC,type:SS,category:PERFORMANCE

This class contains an instance final field that is initialized to acompile-time static value. Consider making the field static.

Pattern: Unsigned right shift cast to short/byte

id:ICAST_QUESTIONABLE_UNSIGNED_RIGHT_SHIFT,type:BSHIFT,category:STYLE

The code performs an unsigned right shift, whose result is then cast to ashort or byte, which discards the upper bits of the result. Since the upperbits are discarded, there may be no difference between a signed and unsignedright shift (depending upon the size of the shift).

Pattern: Vacuous comparison of integer value

id:INT_VACUOUS_COMPARISON,type:INT,category:STYLE

There is an integer comparison that always returns the same value (e.g., x<= Integer.MAX_VALUE).

原创文章。为了维护文章的版本一致、最新、可追溯,转载请注明: 转载自idouba

本文链接地址: Problems found in hbase-0.20.1 by static scan


, , , , ,

No comments yet.

发表评论