Sets

As mentioned earlier, the defining characteristic of Sets is that all the elements must be unique (based on the implementation of the equals method). If you add a duplicate element to a Set it will replace the existing element that it is equal to.

Set is also an interface: the two main implementation of Set are HashSet and the TreeSet.

Deciding which implementation to use is somewhat simpler than with Lists: if the order of elements in the Set is not important use HashSet, otherwise use TreeSet. Operations such as add, remove, contains and size are typically far faster with HashSet than TreeSet, but HashSets cannot be ordered.

The performance of Sets is conditional on providing a good implementation of the hashKey method, because this provides a critical mechanism for quickly detecting duplicate elements.

For instance, when objects are added to a Set, they are indexed based on the value of their hashKey. All objects with the same hashKey are grouped together in buckets. As mentioned earlier, hashKey will not necessarily produce unique results, so buckets may contain more than one object, but typically the buckets will be very small.

When you add an object to a Set, the Set need to find out if there are any other elements in the Set that are equal to the new object.

One way of implementing this would be to iterate through all the elements in the Set and call equals on them with the new object. A far faster mechanism, however, is to find the hashCode of the object being added, and then locate the bucket of objects for this value (which is extremely quick, because the lookup is based on a numeric index). Once the bucket is found, the new object can be compared for equality with the objects in the bucket.

Because buckets will typically contain 0 or 1 objects, the equals method will probably only be invoked (at most)once per add.

Consider, however, what would happen if you provided the following hashKey implementation:

public int hashCode() {

      return 1;

}

This is actually a valid implementation of hashKey: if two objects are equal they will return the same hashKey.

Despite being valid, it would result in very poor performance when adding a new element to a Set, because it would result in the following logic:

  • Find the hashKey of the object passed as a parameter (1)

  • Find the bucket with this hashKey (the bucket would contain all objects in the Set)

  • Loop through these objects and check if they are equal to the object passed in.

As you can see, you have completely negated the performance optimizations associated with hashCodes and indexing.

The following program demonstrates the use of a HashSet:

public static void main(String[] args) {

    Set addresses = new HashSet();

    Address a1 = new Address(799, "E Dragham",

        "Tucson", 85705, "USA");

     Address a2 = new Address(200, "Main Street",

         "Phoenix", 85123, "USA");

     Address a3 = new Address(100, "Main Street",

         "Seattle", 98104, "USA");

     Address a4 = new Address(8400 , "London Place",

          "Washington", 20521, "USA");

     Address a5 = new Address(8400 , "London Place",

          "Washington", 20521, "USA");

    addresses.add(a1);

    addresses.add(a2);

    addresses.add(a3);

    addresses.add(a4);

    addresses.add(a5);       

    System.out.println("Size = " + addresses.size());

    System.out.println(addresses);

}

This produces the following output:

Size = 4

[799 E Dragham, Tucson, USA

, 8400 London Place, Washington, USA

, 200 Main Street, Phoenix, USA

, 100 Main Street, Seattle, USA

]

Notice that even though you have added 5 elements to the Set, the Set only contains 4 elements: this is because the final two addresses added to the Set meet the definition of equality defined by Address, and therefore the second instance overwrites the first.

Also notice that the order in which the addresses are printed out does not match the order they were added; HashSet makes no attempts to maintain any order.

An example using TreeSet would look identical; except the output would adhere to the order the elements were added. I will leave you to implement that yourself.

A Software Engineer Learns Java and Object Orientated Programming
titlepage.xhtml
part0000_split_000.html
part0000_split_001.html
part0000_split_002.html
part0000_split_003.html
part0000_split_004.html
part0000_split_005.html
part0000_split_006.html
part0000_split_007.html
part0000_split_008.html
part0000_split_009.html
part0000_split_010.html
part0000_split_011.html
part0000_split_012.html
part0000_split_013.html
part0000_split_014.html
part0000_split_015.html
part0000_split_016.html
part0000_split_017.html
part0000_split_018.html
part0000_split_019.html
part0000_split_020.html
part0000_split_021.html
part0000_split_022.html
part0000_split_023.html
part0000_split_024.html
part0000_split_025.html
part0000_split_026.html
part0000_split_027.html
part0000_split_028.html
part0000_split_029.html
part0000_split_030.html
part0000_split_031.html
part0000_split_032.html
part0000_split_033.html
part0000_split_034.html
part0000_split_035.html
part0000_split_036.html
part0000_split_037.html
part0000_split_038.html
part0000_split_039.html
part0000_split_040.html
part0000_split_041.html
part0000_split_042.html
part0000_split_043.html
part0000_split_044.html
part0000_split_045.html
part0000_split_046.html
part0000_split_047.html
part0000_split_048.html
part0000_split_049.html
part0000_split_050.html
part0000_split_051.html
part0000_split_052.html
part0000_split_053.html
part0000_split_054.html
part0000_split_055.html
part0000_split_056.html
part0000_split_057.html
part0000_split_058.html
part0000_split_059.html
part0000_split_060.html
part0000_split_061.html
part0000_split_062.html
part0000_split_063.html
part0000_split_064.html
part0000_split_065.html
part0000_split_066.html
part0000_split_067.html
part0000_split_068.html
part0000_split_069.html
part0000_split_070.html
part0000_split_071.html
part0000_split_072.html
part0000_split_073.html
part0000_split_074.html
part0000_split_075.html
part0000_split_076.html
part0000_split_077.html
part0000_split_078.html
part0000_split_079.html
part0000_split_080.html
part0000_split_081.html
part0000_split_082.html
part0000_split_083.html
part0000_split_084.html
part0000_split_085.html
part0000_split_086.html
part0000_split_087.html
part0000_split_088.html
part0000_split_089.html
part0000_split_090.html
part0000_split_091.html
part0000_split_092.html
part0000_split_093.html
part0000_split_094.html
part0000_split_095.html
part0000_split_096.html
part0000_split_097.html
part0000_split_098.html
part0000_split_099.html
part0000_split_100.html
part0000_split_101.html
part0000_split_102.html
part0000_split_103.html
part0000_split_104.html
part0000_split_105.html
part0000_split_106.html
part0000_split_107.html
part0000_split_108.html
part0000_split_109.html
part0000_split_110.html
part0000_split_111.html
part0000_split_112.html
part0000_split_113.html
part0000_split_114.html
part0000_split_115.html
part0000_split_116.html
part0000_split_117.html
part0000_split_118.html
part0000_split_119.html
part0000_split_120.html
part0000_split_121.html
part0000_split_122.html
part0000_split_123.html
part0000_split_124.html
part0000_split_125.html
part0000_split_126.html
part0000_split_127.html
part0000_split_128.html
part0000_split_129.html
part0000_split_130.html
part0000_split_131.html
part0000_split_132.html
part0000_split_133.html
part0000_split_134.html
part0000_split_135.html
part0000_split_136.html
part0000_split_137.html
part0000_split_138.html
part0000_split_139.html