热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

如何确保hashCode()与equals()一致?-HowtoensurehashCode()isconsistentwithequals()?

Whenoverridingtheequals()functionofjava.lang.Object,thejavadocssuggestthat,当重写java.lang.

When overriding the equals() function of java.lang.Object, the javadocs suggest that,

当重写java.lang.Object的equals()函数时,javadocs建议,

it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

通常需要在重写此方法时覆盖hashCode方法,以便维护hashCode方法的常规协定,该方法声明相等的对象必须具有相等的哈希代码。

The hashCode() method must return a unique integer for each object (this is easy to do when comparing objects based on memory location, simply return the unique integer address of the object)

hashCode()方法必须为每个对象返回一个唯一的整数(这在基于内存位置比较对象时很容易,只需返回对象的唯一整数地址)

How should a hashCode() method be overriden so that it returns a unique integer for each object based only on that object's properities?

应该如何覆盖hashCode()方法,以便它仅基于该对象的特性为每个对象返回一个唯一的整数?


public class People{
   public String name;
   public int age;

   public int hashCode(){
      // How to get a unique integer based on name and age?
   }
}
/*******************************/
public class App{
   public static void main( String args[] ){
       People mike = new People();
       People melissa = new People();
       mike.name = "mike";
       mike.age = 23;
       melissa.name = "melissa";
       melissa.age = 24;
       System.out.println( mike.hasCode() );  // output?
       System.out.println( melissa.hashCode(); // output?
   }
}

8 个解决方案

#1


It doesn't say the hashcode for an object has to be completely unique, only that the hashcode for two equal objects returns the same hashcode. It's entirely legal to have two non-equal objects return the same hashcode. However, the more unique a hashcode distribution is over a set of objects, the better performance you'll get out of HashMaps and other operations that use the hashCode.

它没有说对象的哈希码必须是完全唯一的,只是两个相等对象的哈希码返回相同的哈希码。让两个不相等的对象返回相同的哈希码是完全合法的。但是,哈希码分布在一组对象上越独特,您将从HashMaps和使用hashCode的其他操作中获得更好的性能。

IDEs such as IntelliJ Idea have built-in generators for equals and hashCode that generally do a pretty good job at coming up with "good enough" code for most objects (and probably better than some hand-crafted overly-clever hash functions).

诸如IntelliJ Idea之类的IDE具有用于equals和hashCode的内置生成器,它们通常可以为大多数对象提供“足够好”的代码(并且可能比一些手工制作的过于聪明的哈希函数更好)。

For example, here's a hashCode function that Idea generates for your People class:

例如,这是Idea为您的People类生成的hashCode函数:

public int hashCode() {
    int result = name != null ? name.hashCode() : 0;
    result = 31 * result + age;
    return result;
}

#2


I won't go in to the details of hashCode uniqueness as Marc has already addressed it. For your People class, you first need to decide what equality of a person means. Maybe equality is based solely on their name, maybe it's based on name and age. It will be domain specific. Let's say equality is based on name and age. Your overridden equals would look like

我不会详细介绍hashCode的唯一性,因为Marc已经解决了这个问题。对于您的People类,您首先需要确定一个人的平等意味着什么。也许平等完全取决于他们的名字,也许是基于姓名和年龄。它将是特定领域的。让我们说平等是基于名称和年龄。你被覆盖的等于看起来像

public boolean equals(Object obj) {
    if (this==obj) return true;
    if (obj==null) return false;
    if (!(getClass().equals(obj.getClass())) return false;
    Person other = (Person)obj;
    return (name==null ? other.name==null : name.equals(other.name)) &&
        age==other.age;
}

Any time you override equals you must override hashCode. Furthermore, hashCode can't use any more fields in its computation than equals did. Most of the time you must add or exclusive-or the hash code of the various fields (hashCode should be fast to compute). So a valid hashCode method might look like:

无论何时重写equals,都必须覆盖hashCode。此外,hashCode在其计算中不能使用比equals更多的字段。大多数情况下,您必须添加或排除各个字段的哈希码(hashCode应该快速计算)。所以一个有效的hashCode方法可能如下所示:

public int hashCode() {
    return (name==null ? 17 : name.hashCode()) ^ age;
}

Note that the following is not valid as it uses a field that equals didn't (height). In this case two "equals" objects could have a different hash code.

请注意,以下内容无效,因为它使用的字段等于不(高度)。在这种情况下,两个“等于”对象可以具有不同的哈希码。

public int hashCode() {
    return (name==null ? 17 : name.hashCode()) ^ age ^ height;
}

Also, it's perfectly valid for two non-equals objects to have the same hash code:

此外,两个非等于对象具有相同的哈希码完全有效:

public int hashCode() {    
    return age;    
}

In this case Jane age 30 is not equal to Bob age 30, yet both their hash codes are 30. While valid this is undesirable for performance in hash-based collections.

在这种情况下,Jane年龄30不等于Bob年龄30,但他们的哈希码都是30.虽然有效,但这对于基于散列的集合中的性能是不合需要的。

#3


Another question asks if there are some basic low-level things that all programmers should know, and I think hash lookups are one of those. So here goes.

另一个问题是,是否存在所有程序员都应该知道的基本低级事物,我认为哈希查找就是其中之一。所以这里。

A hash table (note that I'm not using an actual classname) is basically an array of linked lists. To find something in the table, you first compute the hashcode of that something, then mod it by the size of the table. This is an index into the array, and you get a linked list at that index. You then traverse the list until you find your object.

哈希表(注意我没有使用实际的类名)基本上是一个链表的数组。要在表中查找内容,首先要计算该内容的哈希码,然后根据表的大小对其进行修改。这是数组的索引,您将获得该索引的链接列表。然后,您遍历列表,直到找到您的对象。

Since array retrieval is O(1), and linked list traversal is O(n), you want a hash function that creates as random a distribution as possible, so that objects will be hashed to different lists. Every object could return the value 0 as its hashcode, and a hash table would still work, but it would essentially be a long linked-list at element 0 of the array.

由于数组检索是O(1),并且链表遍历是O(n),因此您需要一个尽可能随机创建分布的哈希函数,以便将对象散列到不同的列表。每个对象都可以返回值0作为其哈希码,并且哈希表仍然可以工作,但它本质上是数组元素0处的长链表。

You also generally want the array to be large, which increases the chances that the object will be in a list of length 1. The Java HashMap, for example, increases the size of the array when the number of entries in the map is > 75% of the size of the array. There's a tradeoff here: you can have a huge array with very few entries and waste memory, or a smaller array where each element in the array is a list with > 1 entries, and waste time traversing. A perfect hash would assign each object to a unique location in the array, with no wasted space.

您通常也希望数组很大,这会增加对象在长度为1的列表中的可能性。例如,当地图中的条目数大于75时,Java HashMap会增加数组的大小。数组大小的百分比。这里有一个权衡:你可以拥有一个包含极少数条目和浪费内存的巨大数组,或者一个较小的数组,其中数组中的每个元素都是一个包含> 1个条目的列表,并且浪费时间遍历。完美的哈希会将每个对象分配给数组中的唯一位置,而不会浪费空间。

The term "perfect hash" is a real term, and in some cases you can create a hash function that provides a unique number for each object. This is only possible when you know the set of all possible values. In the general case, you can't achieve this, and there will be some values that return the same hashcode. This is simple mathematics: if you have a string that's more than 4 bytes long, you can't create a unique 4-byte hashcode.

术语“完美散列”是一个实际术语,在某些情况下,您可以创建一个散列函数,为每个对象提供唯一的数字。只有在知道所有可能值的集合时才可以执行此操作。在一般情况下,您无法实现此目的,并且会有一些值返回相同的哈希码。这是简单的数学:如果您的字符串长度超过4个字节,则无法创建唯一的4字节哈希码。

One interesting tidbit: hash arrays are generally sized based on prime numbers, to give the best chance for random allocation when you mod the results, regardless of how random the hashcodes really are.

一个有趣的花絮:哈希数组通常根据素数来确定大小,以便在修改结果时为随机分配提供最佳机会,无论哈希码实际上是多么随机。

Edit based on comments:

根据评论进行编辑:

1) A linked list is not the only way to represent the objects that have the same hashcode, although that is the method used by the JDK 1.5 HashMap. Although less memory-efficient than a simple array, it does arguably create less churn when rehashing (because the entries can be unlinked from one bucket and relinked to another).

1)链表不是表示具有相同哈希码的对象的唯一方法,尽管这是JDK 1.5 HashMap使用的方法。虽然比简单数组的内存效率更低,但在重新散列时可能会产生更少的流失(因为条目可以从一个存储桶取消链接并重新链接到另一个存储桶)。

2) As of JDK 1.4, the HashMap class uses an array sized as a power of 2; prior to that it used 2^N+1, which I believe is prime for N <= 32. This does not speed up array indexing per se, but does allow the array index to be computed with a bitwise AND rather than a division, as noted by Neil Coffey. Personally, I'd question this as premature optimization, but given the list of authors on HashMap, I'll assume there is some real benefit.

2)从JDK 1.4开始,HashMap类使用一个大小为2的幂的数组;在此之前,它使用2 ^ N + 1,我认为这是N <= 32的素数。这不会加速数组索引本身,但允许使用按位AND而不是除法来计算数组索引,如Neil Coffey所述。就个人而言,我认为这是过早的优化,但鉴于HashMap的作者列表,我会假设有一些真正的好处。

#4


In general the hash code cannot be unique, as there are more values than possible hash codes (integers). A good hash code distributes the values well over the integers. A bad one could always give the same value and still be logically correct, it would just lead to unacceptably inefficient hash tables.

通常,哈希码不能是唯一的,因为存在比可能的哈希码(整数)更多的值。良好的哈希代码可以很好地分配整数值。一个坏的可能总是给出相同的值并且仍然在逻辑上是正确的,它只会导致无法接受的低效哈希表。

Equal values must have the same hash value for hash tables to work correctly. Otherwise you could add a key to a hash table, then try to look it up via an equal value with a different hash code and not find it. Or you could put an equal value with a different hash code and have two equal values at different places in the hash table.

等值必须具有相同的哈希值才能使哈希表正常工作。否则,您可以将一个键添加到哈希表中,然后尝试通过具有不同哈希码的相等值查找它,而不是找到它。或者,您可以使用不同的哈希代码放置相等的值,并在哈希表中的不同位置具有两个相等的值。

In practice you usually select a subset of the fields to be taken into account in both the hashCode() and the equals() method.

实际上,您通常会在hashCode()和equals()方法中选择要考虑的字段子集。

#5


I think you misunderstood it. The hashcode does not have to be unique to each object (after all, it is a hash code) though you obviously don't want it to be identical for all objects. You do, however, need it to be identical to all objects that are equal, otherwise things like the standard collections would not work (e.g., you'd look up something in the hash set but would not find it).

我觉得你误会了。哈希码不必对每个对象都是唯一的(毕竟,它是一个哈希码),尽管你显然不希望它对所有对象都是相同的。但是,你需要它与所有相同的对象相同,否则像标准集合这样的东西将不起作用(例如,你在哈希集中查找某些内容但却找不到它)。

For straightforward attributes, some IDEs have hashcode function builders.

对于简单的属性,某些IDE具有哈希码功能构建器。

If you don't use IDEs, consider using Apahce Commons and the class HashCodeBuilder

如果您不使用IDE,请考虑使用Apahce Commons和HashCodeBuilder类

#6


The only contractual obligation for hashCode is for it to be consistent. The fields used in creating the hashCode value must be the same or a subset of the fields used in the equals method. This means returning 0 for all values is valid, although not efficient.

hashCode唯一的合同义务是使其保持一致。用于创建hashCode值的字段必须与equals方法中使用的字段相同或是其子集。这意味着返回0表示所有值都有效,但效率不高。

One can check if hashCode is consistent via a unit test. I written an abstract class called EqualityTestCase, which does a handful of hashCode checks. One simply has to extend the test case and implement two or three factory methods. The test does a very crude job of testing if the hashCode is efficient.

可以通过单元测试检查hashCode是否一致。我写了一个名为EqualityTestCase的抽象类,它执行一些hashCode检查。一个人只需要扩展测试用例并实现两个或三个工厂方法。如果hashCode有效,测试会做一个非常粗略的测试工作。

#7


This is what documentation tells us as for hash code method

这就是文档告诉我们的哈希码方法

@ javadoc

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

每当在Java应用程序的执行期间多次在同一对象上调用它时,hashCode方法必须始终返回相同的整数,前提是不修改对象的equals比较中使用的信息。从应用程序的一次执行到同一应用程序的另一次执行,该整数不需要保持一致。

#8


There is a notion of business key, which determines uniqueness of separate instances of the same type. Each specific type (class) that models a separate entity from the target domain (e.g. vehicle in a fleet system) should have a business key, which is represented by one or more class fields. Methods equals() and hasCode() should both be implemented using the fields, which make up a business key. This ensures that both methods consistent with each other.

有一个业务键的概念,它决定了相同类型的单独实例的唯一性。为目标域(例如车队系统中的车辆)建模单独实体的每种特定类型(类)应具有业务密钥,该业务密钥由一个或多个类字段表示。方法equals()和hasCode()都应该使用组成业务键的字段来实现。这确保了两种方法彼此一致。


推荐阅读
  • 使用C++编写程序实现增加或删除桌面的右键列表项
    本文介绍了使用C++编写程序实现增加或删除桌面的右键列表项的方法。首先通过操作注册表来实现增加或删除右键列表项的目的,然后使用管理注册表的函数来编写程序。文章详细介绍了使用的五种函数:RegCreateKey、RegSetValueEx、RegOpenKeyEx、RegDeleteKey和RegCloseKey,并给出了增加一项的函数写法。通过本文的方法,可以方便地自定义桌面的右键列表项。 ... [详细]
  • 本文主要解析了Open judge C16H问题中涉及到的Magical Balls的快速幂和逆元算法,并给出了问题的解析和解决方法。详细介绍了问题的背景和规则,并给出了相应的算法解析和实现步骤。通过本文的解析,读者可以更好地理解和解决Open judge C16H问题中的Magical Balls部分。 ... [详细]
  • 还要谈谈Equals和GetHashcode
    这篇随笔和上篇随笔《从两个数组中查找相同的数字谈Hashtable》都是为了下面分析Dictionary的实现做的铺垫一.两个逻辑上相等的实例对象。两个对象相等,除了指两个不同变量引用了 ... [详细]
  • Iamtryingtomakeaclassthatwillreadatextfileofnamesintoanarray,thenreturnthatarra ... [详细]
  • 本文讨论了如何优化解决hdu 1003 java题目的动态规划方法,通过分析加法规则和最大和的性质,提出了一种优化的思路。具体方法是,当从1加到n为负时,即sum(1,n)sum(n,s),可以继续加法计算。同时,还考虑了两种特殊情况:都是负数的情况和有0的情况。最后,通过使用Scanner类来获取输入数据。 ... [详细]
  • 本文讨论了使用差分约束系统求解House Man跳跃问题的思路与方法。给定一组不同高度,要求从最低点跳跃到最高点,每次跳跃的距离不超过D,并且不能改变给定的顺序。通过建立差分约束系统,将问题转化为图的建立和查询距离的问题。文章详细介绍了建立约束条件的方法,并使用SPFA算法判环并输出结果。同时还讨论了建边方向和跳跃顺序的关系。 ... [详细]
  • Python正则表达式学习记录及常用方法
    本文记录了学习Python正则表达式的过程,介绍了re模块的常用方法re.search,并解释了rawstring的作用。正则表达式是一种方便检查字符串匹配模式的工具,通过本文的学习可以掌握Python中使用正则表达式的基本方法。 ... [详细]
  • Go Cobra命令行工具入门教程
    本文介绍了Go语言实现的命令行工具Cobra的基本概念、安装方法和入门实践。Cobra被广泛应用于各种项目中,如Kubernetes、Hugo和Github CLI等。通过使用Cobra,我们可以快速创建命令行工具,适用于写测试脚本和各种服务的Admin CLI。文章还通过一个简单的demo演示了Cobra的使用方法。 ... [详细]
  • 开发笔记:实验7的文件读写操作
    本文介绍了使用C++的ofstream和ifstream类进行文件读写操作的方法,包括创建文件、写入文件和读取文件的过程。同时还介绍了如何判断文件是否成功打开和关闭文件的方法。通过本文的学习,读者可以了解如何在C++中进行文件读写操作。 ... [详细]
  • Android实战——jsoup实现网络爬虫,糗事百科项目的起步
    本文介绍了Android实战中使用jsoup实现网络爬虫的方法,以糗事百科项目为例。对于初学者来说,数据源的缺乏是做项目的最大烦恼之一。本文讲述了如何使用网络爬虫获取数据,并以糗事百科作为练手项目。同时,提到了使用jsoup需要结合前端基础知识,以及如果学过JS的话可以更轻松地使用该框架。 ... [详细]
  • 在说Hibernate映射前,我们先来了解下对象关系映射ORM。ORM的实现思想就是将关系数据库中表的数据映射成对象,以对象的形式展现。这样开发人员就可以把对数据库的操作转化为对 ... [详细]
  • JavaSE笔试题-接口、抽象类、多态等问题解答
    本文解答了JavaSE笔试题中关于接口、抽象类、多态等问题。包括Math类的取整数方法、接口是否可继承、抽象类是否可实现接口、抽象类是否可继承具体类、抽象类中是否可以有静态main方法等问题。同时介绍了面向对象的特征,以及Java中实现多态的机制。 ... [详细]
  • 本文整理了Java中org.gwtbootstrap3.client.ui.Icon.addDomHandler()方法的一些代码示例,展示了Icon.ad ... [详细]
  • 颜色迁移(reinhard VS welsh)
    不要谈什么天分,运气,你需要的是一个截稿日,以及一个不交稿就能打爆你狗头的人,然后你就会被自己的才华吓到。------ ... [详细]
  • angular.element使用方法及总结
    2019独角兽企业重金招聘Python工程师标准在线查询:http:each.sinaapp.comangularapielement.html使用方法 ... [详细]
author-avatar
幸福欧旭旭_320
这个家伙很懒,什么也没留下!
Tags | 热门标签
RankList | 热门文章
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有