SEARCHING AND SORTING
In this section we are going to review searching and
sorting algorithms.
Why is sorting so important
- The first step in organizing data is sorting. Lots
of tasks become easier once a data set of items is sorted
- Some algorithms like binary search are built around
a sorted data structure.
- In accordance to S. Skiena computers have
historically spent more time sorting than doing anything
else. Sorting remains the most ubiquitous combinatorial
algorithm problem in practice.
Considerations:
- How to sort: descending order or ascending order?
- Sorting based on what? An object name
(alphabetically), by some number defined by its
fields/instance variables. Or maybe compare dates,
birthdays, etc.
- What happens with equals keys, for example various
people with the same name: John, then sort them by Last
Name.
- Does your sorting algorithm sorts in place or needs
extra memory to hold another copy of the array to be
sorted. This is even more important in embedded systems.
Java uses Comparable interface for sorting and returns:
+1 if compared object is greater, -1 if compared object is
less and 0 if compared objects are equal.
Sorting becomes more ubiquitous when we think on all the
things we do daily that are previously sorted for us to
understand and have better access to them:
- Imagine trying to find a phone number in an
unsorted phone book, or searching for a word in an
unsorted dictionary.
- Your MP3 player can sort your lists by artists
name, genre, song name, ratings.
- Search engines display results in descending order
of importance
- Spreadsheets can be sorted in various ways to work
better with their contents
Searching
There are two types of searching algorithms: Those that
need a previously ordered data structure in order to work
properly, and those that don’t need an ordered list.
Searching is also very important for many computing
applications: searching through a search engine, finding a
bank account balance for some client, searching in a large
data set for a particular value, searching in your
directories for some needed file. Many applications rely on
effective search, if your application is complete but takes
long too perform a search and retrieve data it will be
discarded as useless.
SELECTION SORT
It is called selection sort because it repeatedly
selects the smallest remaining item:
- Find the smallest element. Swap it with the first
element.
- Find the second smallest element. Swap it with the
second element
- Find the third smallest element. Swap it with the
third element
- Repeat finding the smallest element and swapping in
the correct position until the list is sorted
Implementation in java for selection sort:
public class SelectionSort {
public static void selectionSort(Comparable[] array) {
for (int i = 0; i < array.length; i++) {
int min = i;
for (int j = i + 1; j < array.length; j++) {
if (isLess(array[j], array[min])) {
min = j;
}
}
swap(array, i, min);
}
}
private static boolean isLess(Comparable j, Comparable min) {
int comparison = j.compareTo(min);
return comparison< 0;
}
private static void swap(Comparable[] array, int i, int j) {
Comparable temp = array[i];
array[i] = array[j];
array[j] = temp;
}
public static <E> void printArray(E[] array) {
for (int i = 0; i < array.length; i++) {
System.out.print(array[i] + " ");
}
}
public static boolean isSorted(Comparable[] array) {
for (int i = 1; i < array.length; i++) {
if (isLess(array[i], array[i - 1])) {
return false;
}
}
return true;
}
public static void main(String[] args) {
Integer[] intArray = { 34, 17, 23, 35, 45, 9, 1 };
System.out.println("Unsorted Array: ");
printArray(intArray);
System.out.println("\nIs intArray sorted? "
+ isSorted(intArray));
selectionSort(intArray);
System.out.println("\nSelection sort:");
printArray(intArray);
System.out.println("\nIs intArray sorted? "
+ isSorted(intArray));
String[] stringArray = { "z", "g", "c", "o", "a",
"@", "b", "A", "0", "." };
System.out.println("\n\nUnsorted Array: ");
printArray(stringArray);
System.out.println("\n\nSelection sort:");
selectionSort(stringArray);
printArray(stringArray);
}
}
Output is:
Unsorted Array:
34 17 23 35 45 9 1
Is intArray sorted? false
Selection sort:
1 9 17 23 34 35 45
Is intArray sorted? true
Unsorted Array:
z g c o a @ b A 0 .
Selection sort:
. 0 @ A a b c g o z
SHELLSORT
Shell sort is perfect good for embedded systems as it
doesn’t require a lot of extra memory allocation; it is also
useful for small to medium size arrays. Lets see the
following implementation in java code:
public class ShellSort {
public static void shellSort(Comparable[] array) {
int N = array.length;
int h = 1;
while (h < N/3) h = 3*h + 1;
while (h >= 1){
for (int i=h; i<N ; i++) {
for(int j=i; j>=h && isLess(array[j], array[j-h]); j-=h){
swap(array, j, j-h);
}
}
h = h/3;
}
}
private static boolean isLess(Comparable j, Comparable min) {
int comparison = j.compareTo(min);
return comparison< 0;
}
private static void swap(Comparable[] array, int i, int j) {
Comparable temp = array[i];
array[i] = array[j];
array[j] = temp;
}
public static <E> void printArray(E[] array) {
for (int i = 0; i < array.length; i++) {
System.out.print(array[i] + " ");
}
}
public static boolean isSorted(Comparable[] array) {
for (int i = 1; i < array.length; i++) {
if (isLess(array[i], array[i - 1])) {
return false;
}
}
return true;
}
public static void main(String[] args) {
Integer[] intArray = { 34, 1000, 23, 2, 35, 0, 9, 1 };
System.out.println("Unsorted Array: ");
printArray(intArray);
System.out.println("\nIs intArray sorted? " + isSorted(intArray));
shellSort(intArray);
System.out.println("\nSelection sort:");
printArray(intArray);
System.out.println("\nIs intArray sorted? " + isSorted(intArray));
String[] stringArray = { "z", "Z","999", "g", "c", "o",
"a", "@", "b", "A","0", "." };
System.out.println("\n\nUnsorted Array: ");
printArray(stringArray);
System.out.println("\n\nSelection sort:");
shellSort(stringArray);
printArray(stringArray);
}
}
Output:
Unsorted Array:
34 1000 23 2 35 0 9 1
Is intArray sorted? false
Selection sort:
0 1 2 9 23 34 35 1000
Is intArray sorted? true
Unsorted Array:
z Z 999 g c o a @ b A 0 .
Selection sort:
. 0 999 @ A Z a b c g o z
MERGESORT
Mergesort is also called divide and conquer algorithm,
because it divides the original data into smaller pieces of
data to solve the problem. Merge sort works in the following
way:
- Divide into 2 collections. Mergesort will take the
middle index in the collection and split it into the left
and right parts based on this middle index.
- Resulting collections are again recursively splited
and sorted
- Once the sorting of the two collections is
finished, the results are merged
- Now Mergesort it picks the item which is smaller
and inserts this item into the new collection.
- Then selects the next elements and sorts the
smaller element from both collections
public class MergeSort {
public int[] sort(int [] array){
mergeSort(array, 0, array.length-1);
return array;
}
private void mergeSort(int[] array, int first, int last) {
int mid = (first + last) / 2;
if (first < last) {
mergeSort(array, first, mid);
mergeSort(array, mid + 1, last);
}
int a = 0, f = first, l = mid + 1;
int[] temp = new int[last - first + 1];
while (f <= mid && l <= last) {
temp[a++] = array[f] < array[l] ? array[f++] : array[l++];
}
while (f <= mid) {
temp[a++] = array[f++];
}
while (l <= last) {
temp[a++] = array[l++];
}
a = 0;
while (first <= last) {
array[first++] = temp[a++];
}
}
public static void printArray(int[] array) {
for (int i = 0; i < array.length; i++) {
System.out.print(array[i] + " ");
}
}
public static void main(String[] args) {
System.out.println("Original array:");
int[] example = { 100, 30, 55, 62, 2, 42, 20, 9, 394, 1, 0 };
printArray(example);
System.out.println("\nMergesort");
MergeSort merge = new MergeSort();
merge.sort(example);
printArray(example);
}
}
Output:
Original array:
100 30 55 62 2 42 20 9 394 1 0
Mergesort
0 1 2 9 20 30 42 55 62 100 394
QUICKSORT
Quick sort is better than merge sort from a memory usage
comparison. Because quick sort doesn’t require additional
storage to work. It only uses a small auxiliary stack.
Skiena discovered via experimentation that a properly
implemented quicksort is typically 2-3 times faster than
mergesort or heapsort
The pseudo-code for quicksort is:
- If the array contains only 1 element or 0 elements
then the array is sorted.
- If the array contains more than 1 element:
- Select randomly an element from the array. This is
the "pivot element".
- Split into 2 arrays based on pivot element: smaller
elements than pivot go to the first array, the ones above
the pivot go into the second array
- Sort both arrays by recursively applying the
Quicksort algorithm.
- Merge the arrays
Quicksort may take quadratic time O(n2) to
complete but this is highly improbable to occur. This
happens when the partitions are unbalanced. For example, the
first partition is on the smallest item, the second
partition of the next smallest item, so each time the
program just remove one item for each call, leading to a
large number of partitions of large sub arrays .If the
collection of elements is previously randomized the most
probable performance time for quicksort is Θ (n log n). So
if you’re having problems with performance of quicksort, you
can rearrange randomly elements in the collection this will
increase the probability of quicksort to perform in Θ (n log
n).
Implementation of quicksort:
import java.util.Random;
public class QuickSort {
public void quicksort(int[] array) {
quicksort(array, 0, array.length - 1);
}
private void quicksort(int[] array, int first, int last) {
if (first < last) {
int pivot = partition(array, first, last);
quicksort(array, first, pivot - 1);
quicksort(array, pivot + 1, last);
}
}
private int partition(int[] input, int first, int last) {
int pivot = first + new Random().nextInt(last - first + 1);
swap(input, pivot, last);
for (int i = first; i < last; i++) {
if (input[i] <= input[last]) {
swap(input, i, first);
first++;
}
}
swap(input, first, last);
return first;
}
private void swap(int[] input, int a, int b) {
int temp = input[a];
input[a] = input[b];
input[b] = temp;
}
public static void printArray(int[] array) {
for (int i = 0; i < array.length; i++) {
System.out.print(array[i] + " ");
}
}
public static void main(String[] args) {
int[] example = { 90, 6456, 20, 34, 65, -1, 54, -15,
1, -999, 55, 529, 0 };
System.out.println("Original Array");
printArray(example);
System.out.println("\n\nQuickSort");
QuickSort sort = new QuickSort();
sort.quicksort(example);
printArray(example);
}
}
Output:
Original Array
90 6456 20 34 65 -1 54 -15 1 -999 55 529 0
QuickSort
-999 -15 -1 0 1 20 34 54 55 65 90 529 6456
LINEAR SEARCH
This is the most basic type of search, easy to implement
and understand but inefficient for extremely large data
sets. Let’s look at the following implementation:
public class LinearSearch {
public static void main(String[] args) {
int[] array = { 0, 2, 9, 10, 100, -2, -3,
902, 504, -1000, 20, 9923 };
System.out.println(linearSearch(array, 100));
System.out.println(linearSearch(array, -1));
System.out.println(linearSearch(array, 9923));
}
private static String linearSearch(int[] array, int toSearch){
String result ="not found";
for (int i=0; i< array.length; i++){
if(array[i]==toSearch){
result = array[i] +" found at position index: " + i;
break;
}
}
return result;
}
}
BINARY SEARCH
Binary search is also easy to implement and easy to
understand. We actually make use of binary search without
knowing when searching in a dictionary for a specific word
or in a phone book. Technically it’s not the same but it is
a very similar process.
We open a dictionary more or less by the middle if the
word we are searching for starts in a letter over the
middle, we discard the first half of the dictionary and now
focus more or less by the middle of the second half. If the
word is below the middle of the second half, we discard the
top half and keep applying this same technique until we find
our word
Let’s look at the pseudocode to make this idea clearer:
- Receive a sorted array of n elements
- Compute middle point
- If index toSearch is less than array[mid] then
highestIndex = mid -1 (This changes mid)
- If index toSearch is greater than array[mid] then
lowestIndex = mid +1
- else if index isn't greater nor less than
array[mid], this means it's equal so return 0
- If not greater, less nor equal, then it's not in
the array so return -1
public class BinarySearch {
private BinarySearch(){ }
public static int binarySearch(int toSearch, int[]array){
int lowestIndex = 0;
int highestIndex = array.length -1;
while (lowestIndex <= highestIndex){
int mid = lowestIndex +
(highestIndex - lowestIndex)/2;
if ( toSearch < array[mid])
highestIndex = mid -1;
else if ( toSearch > array[mid])
lowestIndex = mid + 1;
else if (toSearch==array[mid])
return mid;
}
return -1;
}
public static void main(String[] args) {
int [] array = { -100, 1 , 2, 3, 4, 5, 100, 999, 10203};
System.out.println(binarySearch(999, array));
System.out.println(binarySearch(-100, array));
System.out.println(binarySearch(6, array));
}
}
Java implements its own .binarySearch() static methods
in the classes Arrays and Collections in the standard
java.util package for performing binary searches on Java
arrays and on Lists, respectively. They must be arrays of
primitives, or the arrays or Lists must be of a type that
implements the Comparable interface, or you must specify a
custom Comparator object.