Intro to Array data structure Part 1

Today, we will go over a popular data structure – array.
This is the most basic and fundamental data structure in computer science regardless of any programming language.
Even many complex data structures use array inside the implementation.
Programming language like python doesn’t have array but it provides list instead. (of course, python list is much more flexible and easier to use than C++ array)
In this post, I will go over some important characteristics of array with example code based on C++.

What is Array?

Array is a contiguous piece of memory of a certain data type.
It sounds hard but it will be very clear once you see this picture and example code.

Here, this picture is an array of integer with size 3.

array

Data type is an integer and there are a total of three elements for the array.
Please note that all those three elements are located right next to each other.

The first element of the array contains the integer 10 and 3 for second and 99 for third respectively.
In order to access each element, we need to know the index of the element.
This could be a little counter-intuitive but index of C++ always starts from 0 for array.

For example, if you would like to access the first element you need to know that it’s located at index 0.

Let’s take a look at the C++ code example for array declaration and basic usage.

#include <iostream>

using namespace std;

int main()
{
    //this is just a normal integer variable
    int temp;
    
    // this is an array of integer with size 3
    // please note that [ ] is necessary in order to declare it as an array.
    // you need to provide size inside [ ] unless you are initializing right away.
    // however, if you are initializing the array like below compilre can figure out the size.
    int arr[] = {10, 3, 99};
    
    // 3 inside [ ] means size of the array and it is necessary if you are not initializing like above
    // int arr[3]; 
    
    // you can access each element in the array using index
    // you will see 10 in the screen
    cout << "value of first index of the array:" << arr[0] << endl;
    
    // you can also assign a value to each element using index
    // you will see 11 on the screen
    arr[0] = 11;
    cout << "value of first index of the array after change:" << arr[0] << endl;
    
    // you will see 3 on the screen
    cout << "value of second index of the array:" << arr[1] << endl;
    
    // in this case, arr can only hold int because it's declared as int arr[3]!
    // this will cause a compilation error!
    // arr[2] = "test";
    
    // implicit conversion from 1.1 to 1. bad!
    // double precision will be dropped!
    // you will see 1 on the screen if you print arr[2]
    // arr[2] = 1.1;
    
    return 0;
}

As you see above, an array is merely a container that can hold many elements of the same data type.
You can have an array of double, short, int or string. (Please note that string is already an array of char)

One thing you have to remember is that unlike python an array in C++ can contain only a single data type.
Another words, if you have an array of int, then it can only hold int.
Although you can still assign double, short or any other number related data type it doesn’t necessarily mean it’s correct.

What happens you mistakenly assign 1.1 to arr[2] like the above example?
It will implicitly convert 1.1 to 1 because int cannot take double precision and will only take the integer part.
And the user might be surprised to see 1 on the screen instead of 1.1!

Array access with loop

You can use a loop (for, while or do-while) to access an array efficiently and elegantly.
And I will explain it using the example below.

Let’s say you are to write a program that does the following.
1. Take 5 test scores from the user
2. Find the lowest and highest test score
3. Calculate the average of those scores
4. Print the average, lowest and highest score along with individual test scores.

#include <iostream>

using namespace std;

int main()
{
    const int NUM_TESTS = 5;
    int scores[NUM_TESTS];
    
    for (int i = 0; i < NUM_TESTS; ++i)
    {
        cout << "Please enter a test score:";
        
        // you can use for loop to access each element in the array!
        // value of i will be from 0 to NUM_TESTS - 1 which are 0,1,2,3,4
        // please note that the index of the array always starts from 0!
        cin >> scores[i];
    }

    // in order to find lowest score you need to compare all the element
    // starting from the element at index 0
    // so far the element at index 0 is the lowest test score.
    int lowestScore = scores[0];
    
    // please note that this loop starts at index 1 since you already got the element at index 0
    // i will be 1,2,3,4
    for (int i = 1; i < NUM_TESTS; ++i)
    {
        // update lowestScore if current element is lower
        if (scores[i] < lowestScore)
        {
            lowestScore = scores[i];
        }
    }
    
    cout << "The lowest score is " << lowestScore << endl;
    
    // please refer to explantions for lowest score as this is very similar
    int highestScore = scores[0];
    for (int i = 1; i < NUM_TESTS; ++i)
    {
        if (scores[i] > highestScore)
        {
            highestScore = scores[i];
        }
    }
    
    cout << "The highest score is " << highestScore << endl;
    
    int total = 0;
    for (int i = 0; i < NUM_TESTS; ++i)
    {
        total += scores[i];
    }
    
    double avg = total * 1.0 / NUM_TESTS;
    cout << "Average of test scores:" << avg << endl;
    
    for (int i = 0; i < NUM_TESTS; ++i)
    {
        cout << scores[i] << " ";
    }
    cout << endl;
    
    return 0;
}

As you can see above it’s much easier to access (read/write) the array using loops and that’s usually recommended way unless you want to access some specific location.

Array as a function parameter

In the above example, we observed that you can have a variable for the array.
Then you might ask if the array can be used in function, either as a parameter or return type.


Quick answer is that a function can take an array as a parameter but it cannot return the array.
However, it doesn’t mean returning an array is completely impossible. Instead of returning array directly, it can return a pointer which is essentially the same as array.
Although a pointer to an array and an array is not exactly the same it is still possible to use pointer like an array.
I will discuss the difference between array and pointer in another post.
For now, let’s just focus on array used as a function parameter.
I rewrote above example code using functions instead.

#include <iostream>

using namespace std;

// this is basic syntax to have an array parameter
// you only need to specify the parameter is array. don't need to have size inside [ ]
int getHigestTestScore(int scores[], int size)
{
    // -999 is just a custom error code
    if (size <= 0)
    {
        return -999;
    }
    int highestScore = scores[0];
    for (int i = 1; i < size; ++i)
    {
        if (scores[i] > highestScore)
        {
            highestScore = scores[i];
        }
    }
    
    return highestScore;
}

// you could also specify the size like below and it works fine.
// but it's quite rigid and usually not a good choice
// in the end this function really assumes the array is sizes of 5 which isn't flexible
// and it involves some hard coded number which smells
int getHigestTestScore2(int scores[5])
{
    int highestScore = scores[0];
    for (int i = 1; i < 5; ++i)
    {
        if (scores[i] > highestScore)
        {
            highestScore = scores[i];
        }
    }
    
    return highestScore;
}

int main()
{
    int arr[] = {1,2,3,4,5};
    int high1 = getHigestTestScore(arr);
    
    int high2 = getHigestTestScore2(arr);
    return 0;
}

I think the example code is pretty clear on how to use array as a function parameter.
Although you can use a pointer instead of array I omitted here to just focus on array only.

Pros

  • Random access is allowed which enables fast search
  • No extra space is required like Linked List (i.e., next pointers)
  • Memory locality yields better access performance than Linked List since all the elements are all located next to each other

Cons

  • Adjusting the size is not flexible. You will need to copy/move elements after increasing/decreasing the size.
  • Insertion and deletion of elements are quite expensive because you have to copy/move all the elements after the operations.

Performance

Search – O(1)
Insertion – O(n)
Deletion – O(n)

Conclusion

We have taken a look at basic concepts and usage of array. However, this is not it!
There are many array related topics such as 2D array and array pointer I am going to discuss in next post.

Leave a Reply

Your email address will not be published. Required fields are marked *