Intro to Graphs

A graph is an abstract representation of a complex system such as a network, a set of human relationships, or an organization.
Graphs are important because they can represent and simplify complex relationships in a visual format.
It is remarkable how often messy applied problems can be described with simple, classical graph properties.

In this post, I am going to discuss some popular graph types with brief examples.

Graph Examples

A graph G can be represented as a set of vertices and edges, written as G = (V, E).
Now, let’s take a look at various graphs and their properties.

Undirected vs. Directed Graph

undirected vs directed graph

A graph G = (V, E) is undirected if every edge can be traversed in both directions.
In other words, if one edge lets you travel between two vertices in either direction, the graph is undirected.
If edges have a direction, it is a directed graph.
In the undirected graph above, you can reach A from B, C, and D, whereas in the directed graph no vertex can reach A.

Weighted vs. Unweighted Graph

weighted vs unweighted graph

If each edge in a graph is assigned a numerical value, it is considered a weighted graph.
Edges may have different weights, so the cost of visiting a node may differ even when the number of edges is the same.
If the edges carry no numerical values, it is an unweighted graph (all edges are assumed to have equal weight).

As you can see in the graph above, the shortest path from A to C in the unweighted graph is the edge A–C.
However, the story is different in the weighted graph.
Since the direct path costs 10, it is cheaper to visit B and then C, which costs 5 (4 + 1).

Cyclic vs. Acyclic Graph

An acyclic graph does not contain any cycles.
A graph contains a cycle if you can start at a vertex and return to the same vertex while traversing edges.

In the cyclic graph example above, starting from vertex A you can visit C and B and then come back to A, which is a cycle.
But there isn’t any cycle in the acyclic graph.
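To make the cycle idea concrete, here is a minimal sketch of detecting a cycle in a directed graph with a depth-first search over an adjacency list. The function names and the three-state marking are my own choices, not something from the discussion above.

```cpp
#include <vector>
using namespace std;

// state: 0 = unvisited, 1 = on the current DFS path, 2 = finished
// Returns true if a cycle is reachable from vertex 'v'.
bool hasCycleFrom(const vector<vector<int>> &adj, vector<int> &state, int v)
{
    state[v] = 1;
    for (int next : adj[v])
    {
        // an edge back into the current DFS path means a cycle
        if (state[next] == 1)
        {
            return true;
        }
        if (state[next] == 0 && hasCycleFrom(adj, state, next))
        {
            return true;
        }
    }
    state[v] = 2;
    return false;
}

bool hasCycle(const vector<vector<int>> &adj)
{
    vector<int> state(adj.size(), 0);
    for (size_t v = 0; v < adj.size(); ++v)
    {
        if (state[v] == 0 && hasCycleFrom(adj, state, (int)v))
        {
            return true;
        }
    }
    return false;
}
```

For the cyclic example above (A → C → B → A), `hasCycle` returns true; for the acyclic one it returns false.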

Code Example

Now, let’s discuss what kind of data structure we can use to represent a graph.
First, we need information about all the vertices and the edges associated with each vertex.
In that sense, it is reasonable to define a user-defined type for each edge.
Each vertex then knows which edges are associated with it.

Here is sample code for a data structure representing a graph.
This is just one of many possible ways to represent it.

#include <vector>
using namespace std;

struct Edge
{
    // destination vertex id for this edge
    // it could be any other data structure to represent the destination
    int vertexId;
    
    // weight of the edge if applicable. 0 means unweighted.
    int weight;
};

struct Vertex
{
    // all the edges associated with this vertex
    vector<Edge> edges;    
};

struct Graph
{
    // here I use the index of each vertex as the vertex id for simplicity
    // but you can represent it differently.
    vector<Vertex> vertices;
    
    // is this directed graph?
    bool directed;
};
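As a quick usage check, here is how the weighted example graph from earlier (A–B with weight 4, B–C with weight 1, A–C with weight 10) might be built with these structs. The `addEdge` helper is my own addition; the struct definitions are repeated so the snippet stands alone.

```cpp
#include <vector>
using namespace std;

struct Edge { int vertexId; int weight; };
struct Vertex { vector<Edge> edges; };
struct Graph { vector<Vertex> vertices; bool directed; };

// add an edge from 'from' to 'to'; for undirected graphs
// the reverse edge is added automatically
void addEdge(Graph &g, int from, int to, int weight)
{
    g.vertices[from].edges.push_back({to, weight});
    if (!g.directed)
    {
        g.vertices[to].edges.push_back({from, weight});
    }
}
```

With vertices A, B, and C stored at indices 0, 1, and 2, `addEdge(g, 0, 1, 4)` creates both the A→B and B→A entries when `g.directed` is false.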

Conclusion

There are many other distinctions, such as simple vs. non-simple and sparse vs. dense, which are not discussed here.
I just wanted to show the very basic concepts of the important graph types and their properties.

Intro to Mergesort Algorithm

Mergesort is a classic example of a divide and conquer sorting algorithm.
Divide and conquer is a technique that solves a problem by dividing it into smaller subproblems that are easier to deal with.
The results of the small problems then combine to solve bigger subproblems and eventually the entire problem.
Let’s take a look at how mergesort works.

Mergesort Algorithm

The following are the steps of the mergesort algorithm.

  1. Recursively divide the array into smaller sub-arrays until each array has a single element.
  2. Recursively merge sub-arrays in sorted order until all the sub-arrays are merged. Note that merging two sorted arrays is essentially interleaving elements from the front of each list according to the sort order

Let’s take a look at an example picture to help better understand.

mergesort example

The above example shows the dividing (partitioning) steps.
It divides the initial array until each sub-array holds a single element.

Now it just needs to merge the sub-arrays in sorted order, either ascending or descending, as in the picture below.

mergesort example

Although the example above uses an array, you can also sort a linked list with mergesort.
In some sense, mergesort is a better fit for a linked list because you don’t need an extra buffer to store the temporary result.
The temporary buffer is necessary when sorting an array, since you need somewhere to store the merged sub-arrays, as the picture shows.
It is not necessary for a linked list, since you just decouple a node from the list and relink it during the merge.

Code Example (Array)

#include <vector>
using namespace std;

void merge(vector<int> &array, int low, int middle, int high)
{
    // construct array 1, low to middle
    vector<int> array1;
    for (int i = low; i <= middle; ++i)
    {
        array1.push_back(array[i]);
    }
    
    // construct array 2, middle + 1 to high
    vector<int> array2;
    for (int i = middle + 1; i <= high; ++i)
    {
        array2.push_back(array[i]);
    }
    
    size_t array1Index = 0;
    size_t array2Index = 0;
    int resultIdx = low;
    
    while (array1Index < array1.size() && array2Index < array2.size())
    {
        if (array1[array1Index] < array2[array2Index])
        {
            array[resultIdx++] = array1[array1Index++];   
        }    
        else
        {
            array[resultIdx++] = array2[array2Index++];
        }
    }
    
    for (size_t i = array1Index; i < array1.size(); ++i)
    {
        array[resultIdx++] = array1[i];
    }
    
    for (size_t i = array2Index; i < array2.size(); ++i)
    {
        array[resultIdx++] = array2[i];
    }
}

void mergesort(vector<int> &array, int low, int high)
{
    if (low < high)
    {
        int middle = low + (high - low) / 2;  // avoids overflow of (low + high)
        mergesort(array, low, middle);
        mergesort(array, middle + 1, high);
        merge(array, low, middle, high);
    }
}
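For comparison, the same recursive structure can be written with the standard library doing the merge step: `std::inplace_merge` merges two adjacent sorted ranges in place (though it may still allocate a temporary buffer internally). This is just an alternative sketch, not the implementation above.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// same divide-and-conquer shape as above, with std::inplace_merge
// merging the two sorted halves [low, middle] and [middle + 1, high]
void mergesortStd(vector<int> &array, int low, int high)
{
    if (low < high)
    {
        int middle = low + (high - low) / 2;
        mergesortStd(array, low, middle);
        mergesortStd(array, middle + 1, high);
        inplace_merge(array.begin() + low,
                      array.begin() + middle + 1,
                      array.begin() + high + 1);
    }
}
```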

Code Example (Linked List)

struct Node
{
    int val;
    Node *next;
};

Node* mergesort(Node* head) 
{
    // no need to sort if null
    if (!head)
    {
        return 0;
    }
    // base case for recursive solution
    else if (!head->next)
    {
        return head;
    }

    Node *l1 = head;
    Node *l2 = mergesort(head->next);

    Node *newHead = 0;
    
    // merge two sorted linked lists 
    // (list with one node and list with multiple nodes)
    // l1 has one node and is smaller than all of l2
    // just append l2 to l1 and return
    if (l1->val < l2->val)
    {
        newHead = l1;
        newHead->next = l2;
        return newHead;
    }

    newHead = l2;

    // need to find a place to insert l1
    // this is essentially merging two sorted linked lists
    while (l2->next)
    {
        // found correct place to merge
        if (l1->val < l2->next->val)
        {
            Node *next = l2->next;
            l2->next = l1;
            l2->next->next = next;
            break;
        }

        l2 = l2->next;
    }

    if (!l2->next)
    {
        l2->next = l1;
        l1->next = 0;
    }

    return newHead;
}

Although it does not partition as evenly as the array version, it is still a merge-based sort, as it merges sorted linked lists until done.
Note that because it merges a single node at a time, this version can take O(n^2) comparisons in the worst case.
Here are the mergesort steps for a linked list.

  1. Recurse down the list until the end
  2. While unwinding the recursion, merge two sorted lists: a list with one node (the current node) and the sorted list produced by the unwinding so far

If you would like, you can also mergesort a linked list with the help of an additional array.

  1. An array holding each node of the linked list
  2. Iterate the array and merge the lists pairwise – index 0 with 1, 2 with 3, and so on, whichever grouping you prefer (this is the same idea as above, except using an array)
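A common alternative that keeps the O(n log n) bound is to split the list in half with the fast/slow pointer technique and then merge the two sorted halves. Here is a sketch; all the names are mine.

```cpp
struct Node
{
    int val;
    Node *next;
};

// merge two already sorted lists into one sorted list
Node* mergeLists(Node *a, Node *b)
{
    Node dummy{0, nullptr};
    Node *tail = &dummy;
    while (a && b)
    {
        if (a->val < b->val) { tail->next = a; a = a->next; }
        else                 { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return dummy.next;
}

Node* mergesortList(Node *head)
{
    if (!head || !head->next)
    {
        return head;
    }

    // fast/slow pointers: when 'fast' reaches the end,
    // 'slow' is at the last node of the first half
    Node *slow = head;
    Node *fast = head->next;
    while (fast && fast->next)
    {
        slow = slow->next;
        fast = fast->next->next;
    }

    Node *second = slow->next;
    slow->next = nullptr;  // cut the list into two halves

    return mergeLists(mergesortList(head), mergesortList(second));
}
```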

Performance

The performance of mergesort is O(n log n).
However, note that you may need an extra buffer to store temporary sorted results, which adds copying cost.
You can avoid the extra buffer by using a linked list instead of an array.

Conclusion

Mergesort is a classic divide and conquer sorting algorithm.
Although it is easy to understand and implement, it may cost more than other sorting algorithms due to the extra buffer and copying.
But its running time is consistently bounded by O(n log n), unlike quicksort’s.

Intro to Quicksort Algorithm

Quicksort is one of the most popular sorting algorithms.
Although there are quite a few sorting algorithms, such as mergesort and heapsort, quicksort has a noteworthy property of its own.
In general, quicksort performs better than the others under certain conditions, thanks to that property.

Let’s take a look at the basics of quicksort!

How does quicksort work?

Quicksort sorts the array by partitioning it recursively; it is another divide and conquer sorting algorithm.
Quicksort first selects a random element, called the pivot, and partitions the remaining elements around it.
All the elements less than the pivot are placed on its left side, and the others end up on its right.
In practice, moving the smaller elements to the left is enough, because everything left over naturally lands on the right side of the pivot.
This means you need to keep track of the index where the next smaller item should be placed during partitioning.

Let’s take a look at an example first, which I believe will explain it better than words alone.
Here is the unsorted array.

quicksort example

Quicksort partitioning Algorithm

The core of quicksort is partitioning, since quicksort works by partitioning recursively.
Here are the steps of the partitioning algorithm.

  • Select a random element as the pivot
  • Iterate the array from the low to the high boundary, moving each element smaller than the pivot to the left side of the partition
  • After the iteration is done, place the pivot at the partition index
  • Now you have two partitions
    • Left partition has elements that are less than the pivot value
    • Right partition has elements that are greater than or equal to the pivot value

Partitioning Algorithm Example

25 is selected as the pivot value.
Please note that we have to keep track of the index where the next item will be placed, called the partition index in this example.
The iterator index is literally an iterator that visits every single element.

Now we need to do the following operations.

  • Is 10 < 25 true?
    • Swap the element with the one at the partition index. In this case, 10 is staying at the same place
    • Increment the partition index by 1 since it’s filled
    • Increment the iterator index by 1 to check next item
  • If not true
    • Increment the iterator index by 1 to check the next item

quicksort example

We perform the same operation as the above.

  • Is 3 < 25 true?
    • Swap the element with the one at the partition index. In this case, 3 is staying at the same place
    • Increment the partition index by 1 since it’s filled
    • Increment the iterator index by 1 to check next item
  • If not true
    • Increment the iterator index by 1 to check the next item

quicksort example
  • Is 200 < 25 true?
    • Swap the element with the one at the partition index
    • Increment the partition index by 1 since it’s filled
    • Increment the iterator index by 1 to check next item
  • If not true
    • Increment the iterator index by 1 to check the next item. Please note that we are only placing elements that are less than the pivot value.

quicksort example
  • Is 55 < 25 true?
    • Swap the element with the one at the partition index
    • Increment the partition index by 1 since it’s filled
    • Increment the iterator index by 1 to check next item
  • If not true
    • Increment the iterator index by 1 to check the next item

We repeat the same check for each of the remaining elements.

  • 1, 2, and 3 are each less than 25, so each is swapped to the partition index, and both the partition index and the iterator index are incremented by 1
  • 88 and 71 are not less than 25, so only the iterator index is incremented by 1

After the iteration, swap the pivot with the element at the partition index.
Now we have partitioned the array.
The left partition has all the elements that are smaller than the pivot (25) and the right partition has all the elements that are greater than or equal to the pivot.

Quicksort Algorithm

Quicksort becomes very easy once partitioning is done.
You just recursively apply partitioning to both the left and right partitions.
This is why quicksort is a divide and conquer algorithm: it divides the problem into subsets (partitions) and solves each one.
It will be clear once you see the code example.

Code Example

#include <utility>
#include <vector>
using namespace std;

/**
 * @brief  partition the array from low to high index
 * 
 * @param  array  array to partition
 * @param  low    lower boundary index
 * @param  high   upper boundary index
 * 
 * @return index of the pivot after partitioning
 */
int partition(vector<int> &array, int low, int high)
{
    int partitionIdx = low;

    // here I just chose the last element but it really should be random
    // or chosen some other way so quicksort doesn't hit its worst case of O(n^2)
    int pivot = high;
    
    for (int i = low; i < high; ++i)
    {
        // current value < array[pivot], swap
        if (array[i] < array[pivot])
        {
            swap(array[i], array[partitionIdx++]);
        }
    }
    
    // place pivot to correct place. this value will never be moved now
    swap(array[partitionIdx], array[pivot]);
    
    return partitionIdx;
}

/**
 * @brief  quicksort
 * 
 * @param  array  array to sort
 * @param  low    lower boundary index
 * @param  high   upper boundary index
 */
void quicksort(vector<int> &array, int low, int high)
{
    if (low < high)
    {
        int partitionIdx = partition(array, low, high);
        quicksort(array, low, partitionIdx - 1);
        quicksort(array, partitionIdx + 1, high);
    }
}
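The comment in `partition` notes that the pivot really should be chosen randomly. One minimal way, sketched below as my own adaptation of the code above, is to swap a randomly chosen element into the `high` slot before partitioning, leaving the rest of the code unchanged.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>
using namespace std;

int partition(vector<int> &array, int low, int high)
{
    // swap a randomly chosen element into the pivot slot so that
    // already sorted input no longer triggers the O(n^2) worst case
    swap(array[low + rand() % (high - low + 1)], array[high]);

    int partitionIdx = low;
    int pivot = high;
    for (int i = low; i < high; ++i)
    {
        if (array[i] < array[pivot])
        {
            swap(array[i], array[partitionIdx++]);
        }
    }
    swap(array[partitionIdx], array[pivot]);
    return partitionIdx;
}

void quicksort(vector<int> &array, int low, int high)
{
    if (low < high)
    {
        int partitionIdx = partition(array, low, high);
        quicksort(array, low, partitionIdx - 1);
        quicksort(array, partitionIdx + 1, high);
    }
}
```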

Performance

The key to quicksort is partitioning, and the key to partitioning is how well you select the pivot element.
Do you remember I mentioned that quicksort performs better than others under certain conditions?
That really depends on pivot selection.
If a random pivot is selected, the partitions are likely to be balanced, and quicksort will outperform many other algorithms because it sorts all the elements in place.
(Think about mergesort, which requires an extra buffer and copying.)
The average-case performance is therefore O(n log n).

However, in the worst case, where the chosen pivot is always the smallest or largest element (for example, always picking the last element of an already sorted array), the performance degrades to O(n^2), which is no better than bubble sort.
This happens because the partitions become extremely unbalanced, with sizes (n – 1) and 0 (excluding the pivot) respectively.
That’s why the worst-case performance is O(n^2).
For that reason, there are algorithms for randomized pivot selection.

Conclusion

The key to quicksort is balanced partitioning based on a good pivot.
The partitioning algorithm itself is quite useful for other problems, such as finding the median. (Note that finding the median doesn’t require the entire array to be sorted.)
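The median remark can be made concrete with quickselect, a standard companion to quicksort rather than something shown above: it reuses the partition routine but recurses into only one side, giving O(n) average time. `partition` is repeated here so the snippet stands alone.

```cpp
#include <utility>
#include <vector>
using namespace std;

// same partitioning scheme as in the quicksort example above
int partition(vector<int> &array, int low, int high)
{
    int partitionIdx = low;
    int pivot = high;
    for (int i = low; i < high; ++i)
    {
        if (array[i] < array[pivot])
        {
            swap(array[i], array[partitionIdx++]);
        }
    }
    swap(array[partitionIdx], array[pivot]);
    return partitionIdx;
}

// returns the element that would sit at index k if the array were sorted,
// e.g. k = size / 2 for a median; runs in O(n) on average
int quickselect(vector<int> &array, int low, int high, int k)
{
    int idx = partition(array, low, high);
    if (idx == k)
    {
        return array[idx];
    }
    return idx > k ? quickselect(array, low, idx - 1, k)
                   : quickselect(array, idx + 1, high, k);
}
```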

Intro to Binary Search Algorithm

Let’s say there is an unsorted array and you need to find certain values.
How would you search for them?
Since the array is unsorted, you can’t avoid iterating over the entire array on every search.
Wouldn’t that be a very inefficient way to search?
It may be fine if the array is small. But what if there are one million elements?
You probably don’t want to scan one million elements on every search; it’s a waste of time.

Binary search is a very efficient search algorithm.
It requires the array to be sorted, which has an upfront cost.
However, once the array is sorted, it’s extremely fast and easy to find an element.
Let’s take a look at how binary search works.

How does binary search work?

First of all, please note that the array needs to be sorted in order to apply binary search; you will see why as you read.
Binary search eliminates half of the remaining candidates on every comparison, so you don’t have to check every element in the array.
I believe an example will explain the concept better than words.

Let’s say there is a sorted array with 12 elements.
And you would like to find the index of the element 81 if it exists.

binary search example

Binary Search Algorithm Steps

The following are the steps of the binary search algorithm.

  1. Find the middle point and check whether that element is equal to the search key 81
  2. If yes, we found the element and return the index
  3. If the key is smaller than the middle element, we only need to search the left side of the middle point, since the key can never exist on the right side
  4. If the key is greater than the middle element, we only need to search the right side of the middle point, since the key can never exist on the left side
  5. Repeat steps #1 – 4 until the element is found or the search range becomes empty

Although it is pretty simple it might not be clear to understand at first glance.

Binary Search Example

Let’s take a look at how we can find 81 from the above array.
First, we need to check the middle point which is located at index 5 with the value 25.

binary search example

81 is not the same as 25.
Since 81 is greater than 25 we don’t need to search the left side of 25 which is from index 0 to index 4.
We throw away the left side and need to search the right side only.
What is the index range we need to search this time? It is from the index 6 to 11.

The middle element this time is located at index 8 with value 51.
But 51 is still not the same as 81.
Since 81 is still greater than 51 we can throw away the left side of the middle point (index 8).
81 simply can’t be found in the left side of the index 8.
What is the next index range we need to search for? It is from the index 9 to 11.

binary search example

The middle element this time is located at index 10 with value 81.
We finally found the element at index 10!
How many times did we need to compare to find the element? Only 3 times.
It is indeed a very efficient algorithm to search!

Now you might ask what happens if the array doesn’t contain the element we are looking for.
In that case, the low index will eventually pass the high index, meaning the search range is empty, and binary search returns an error value.
It will be clear once you see the code example below.

Code Example

#include <vector>
using namespace std;

/**
 * @brief  This function searches an element 'key' in the array using
 *         binary search algorithm
 * 
 * @param  array  array to search
 * @param  low    lower boundary index
 * @param  high   upper boundary index
 * @param  key    key to search in the array
 * 
 * @return index of the found element, -1 if key doesn't exist
 */
int binarySearch(const vector<int> &array, int low, int high, int key)
{
    // weren't able to find element up to this point
    if (low > high)
    {
        return -1;
    }
    
    int middle = low + (high - low) / 2;  // avoids overflow of (low + high)
    if (array[middle] == key)
    {
        return middle;
    }
    else if (array[middle] > key)
    {
        // throw away right half of the array in searching
        return binarySearch(array, low, middle - 1, key);
    }
    else
    {
        // throw away left half of the array in searching
        return binarySearch(array, middle + 1, high, key);
    }
}
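For reference, the C++ standard library offers the same O(log n) search via `std::lower_bound`. Here is a small wrapper (my own naming) that returns an index like the function above, or -1 when the key is absent.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// returns the index of 'key' in the sorted array, or -1 if it is absent
int binarySearchStd(const vector<int> &array, int key)
{
    auto it = lower_bound(array.begin(), array.end(), key);
    if (it != array.end() && *it == key)
    {
        return (int)(it - array.begin());
    }
    return -1;
}
```

A usage check on a 12-element sorted array consistent with the example’s landmarks (25 at index 5, 51 at index 8, 81 at index 10; the other values are my own) finds 81 at index 10.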

Performance

The search performance of binary search is O(log n), since half of the candidates are eliminated on every comparison.
Even with one million elements, it takes only about 20 comparisons to find a key.

Conclusion

Binary search is a great algorithm for finding an element.
However, note that it works best on data structures with random access, such as arrays, because you can jump straight to the middle of the remaining candidates.
You can still apply the idea to a linked list, but it won’t be as efficient as with an array, since reaching the middle requires walking the list.