Java Stream Distinct by Property


I first used the DISTINCT clause in Oracle SQL and immediately saw its benefits. Without it, removing duplicate data from results and reports takes a lot of extra code. With the DISTINCT clause, the code becomes shorter, simpler, and easier to read.

The beauty of the DISTINCT clause is that it hides duplicates in the results without touching the underlying data: the data stays where it is, and the result includes only one record for each value of the field in which the duplication occurs.

What is Java Stream distinct by property?

One of the most common tasks we encounter as programmers is finding the distinct elements in a list. With the addition of Streams in Java 8, we have an API for processing data using a functional approach.

Let's look at a simple Person class that has a name and an age. Its equals method compares persons by name, so two people with the same name (regardless of age) are regarded as equal.

Because Stream.distinct() relies on equals, applying it to a list of such persons keeps only one entry per name.
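The Person class described above is not shown in the article, so here is a minimal sketch of what it might look like. The field names, the nested-class layout, and the DistinctByEqualsDemo wrapper are assumptions made for illustration. The only library call that matters is Stream.distinct(), which compares elements with equals and hashCode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class DistinctByEqualsDemo {

    // Hypothetical Person class: equality is based on the name only.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Person)) return false;
            return Objects.equals(name, ((Person) other).name);
        }

        // Must agree with equals, so it also ignores the age.
        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }

        @Override
        public String toString() { return name + " (" + age + ")"; }
    }

    // Stream.distinct() uses equals/hashCode, so here it deduplicates by name.
    static List<Person> distinctPeople(List<Person> people) {
        return people.stream().distinct().collect(Collectors.toList());
    }

    // Sample data matching the table used in this article.
    static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("Adam", 21), new Person("John", 28), new Person("Ben", 45),
                new Person("Mathew", 45), new Person("Adam", 22), new Person("Smith", 47));
    }

    public static void main(String[] args) {
        // The second Adam (22) equals the first one by name and is dropped.
        distinctPeople(samplePeople()).forEach(System.out::println);
    }
}
```

The drawback of this approach is that the property is baked into equals, so the same class cannot be made distinct by age without changing it.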

How it works

As mentioned above, our example is a Person class with a name and an age. Suppose the list contains the following data:

Name                    Age
Adam                    21
John                    28
Ben                     45
Mathew                  45
Adam                    22
Smith                   47

When we filter for distinct values of the name property, the result contains one Adam and drops the other, because the name Adam appears twice in the list. The same technique can be applied to other properties: filtering by age, for example, keeps only one of Ben and Mathew, who are both 45.
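With only the standard Java 8 API, one common way to get this behavior is a stateful filter predicate that remembers the keys it has already seen. The sketch below is an illustration, not a JDK feature: distinctByKey, distinctBy, and the DistinctByKeyDemo wrapper are hypothetical names, and the Person class deliberately does not override equals.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {

    // Hypothetical Person class; equals/hashCode are deliberately not overridden.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() { return name + " (" + age + ")"; }
    }

    // Hypothetical helper (not part of the JDK): the returned predicate is true
    // only the first time a given key is seen. Keys must be non-null because
    // ConcurrentHashMap does not accept null.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return element -> seen.add(keyExtractor.apply(element));
    }

    // On a sequential stream the first element per key is the one that is kept.
    static List<Person> distinctBy(List<Person> people, Function<Person, ?> keyExtractor) {
        return people.stream()
                .filter(distinctByKey(keyExtractor))
                .collect(Collectors.toList());
    }

    // Sample data matching the table above.
    static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("Adam", 21), new Person("John", 28), new Person("Ben", 45),
                new Person("Mathew", 45), new Person("Adam", 22), new Person("Smith", 47));
    }

    public static void main(String[] args) {
        System.out.println("By name: " + distinctBy(samplePeople(), Person::getName));
        System.out.println("By age:  " + distinctBy(samplePeople(), Person::getAge));
    }
}
```

Because the predicate carries state, a new one should be created for every stream. On a parallel stream, which of the duplicates survives is not guaranteed.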

Example of use

We use the same Person list shown in the How it works section: six people, each with a name and an age.

With the Eclipse Collections library, we can use the ListIterate.distinct() method to remove duplicates from a list using different HashingStrategies. These strategies can be defined with lambda expressions or method references.

If we wish to filter by a person's name, we can do it as follows:

List<Person> personListFiltered = ListIterate.distinct(personList, HashingStrategies.fromFunction(Person::getName));

If we want to use a primitive property (int, long, double), we can use a specialized function like this:

List<Person> personListFiltered = ListIterate.distinct(personList, HashingStrategies.fromIntFunction(Person::getAge));

The StreamEx library provides a StreamEx class whose distinct method accepts a function that extracts the property we want to be distinct by:

List<Person> personListFiltered = StreamEx.of(personList).distinct(Person::getName).toList();
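When adding a library is not an option, a similar result can be had from the standard API by collecting into a map keyed by the property. This is a sketch under assumptions: distinctBy and the DistinctByToMapDemo wrapper are hypothetical names, and the Person class is a stand-in for the one used in this article. Collectors.toMap with a merge function and a LinkedHashMap supplier is standard Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DistinctByToMapDemo {

    // Hypothetical Person class, a stand-in for the one used in the article.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() { return name + " (" + age + ")"; }
    }

    // Hypothetical helper: keeps the first element seen for each key.
    // LinkedHashMap preserves the order in which keys were first inserted.
    static <T, K> List<T> distinctBy(List<T> items, Function<T, K> keyExtractor) {
        Map<K, T> firstByKey = items.stream()
                .collect(Collectors.toMap(
                        keyExtractor,
                        Function.identity(),
                        (first, second) -> first, // on a duplicate key, keep the earlier element
                        LinkedHashMap::new));
        return new ArrayList<>(firstByKey.values());
    }

    // Sample data matching the table used in this article.
    static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("Adam", 21), new Person("John", 28), new Person("Ben", 45),
                new Person("Mathew", 45), new Person("Adam", 22), new Person("Smith", 47));
    }

    public static void main(String[] args) {
        System.out.println("By name: " + distinctBy(samplePeople(), Person::getName));
        System.out.println("By age:  " + distinctBy(samplePeople(), Person::getAge));
    }
}
```

Unlike a stateful filter, this version makes the choice between duplicates explicit: changing the merge function to (first, second) -> second would keep the later element for each key instead.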

Conclusion

In this short article, we looked at some examples of how to get the distinct elements of a Stream based on a property, using the standard Java 8 API and additional libraries.