C++ Vectors

Arrays are a programming tool that provide a mechanism to store a group of values under a single name. The values can be any available data type (e.g., int, double, string, etc.). In C++, we talk about vectors, rather than arrays.^[1]

vector<type> variable_name (number_of_elements);

vector<type> variable_name;

And that would declare an empty vector — a vector that contains zero elements.

The argument type in angle-brackets indicates the data type of the elements of the vector; variable_name is the name that we assign to the vector, and the optional number_of_elements may be provided to indicate how many elements the vector will initially contain.

vector<int> values (5);      // Declares a vector of 5 integers
vector<double> grades (20);  // Declares a vector of 20 doubles
vector<string> names;        // Declares a vector of strings,
                             // initially empty (contains 0 strings)

When using vectors in our programs, we must provide the appropriate #include directive at the top, since vectors are a Standard Library facility, and not a built-in part of the core language:

After a vector has been declared specifying a certain number of elements, we can refer to individual elements in the vector using square brackets to provide a subscript or index, as shown below:

When using a vector or array followed by square brackets with a subscript, the resulting expression refers to one individual element of the vector or array, as opposed to the group of values, and roughly speaking, you can use that expression as you would use a variable of the corresponding data type. In the above example, the data type of the expression grades[5] is double, so you can use it as you would use a variable of type double — you can assign a value to it (a numeric value, with or without decimals), or you can retrieve the value, use it for arithmetic operations, etc.

The above extends to other data types as well; if we have a vector of strings called names, the expression names[0] is a string, referring to the first element in the vector names. We can do anything with this expression that we would do with a string variable. For instance, the expression names[0].length() gives us the length of this string.

An important condition for the index or subscript is that it must indicate a valid element in the vector. Elements in a vector are “numbered” starting with element 0. This means that valid subscript values are numbers between 0 and size−1, where size is the number of elements of the vector. For the example above of grades, valid subscripts are between 0 and 19.

The following fragment shows an example of a program that asks the user for marks for a group of 20 students and stores them in a vector.

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<double> student_marks(20);

    for (vector<double>::size_type i = 0; i < 20; i++)
    {
        cout << "Enter marks for student #" << i+1
             << ": " << flush;
        cin >> student_marks[i];
    }
    // ... Do some stuff with the values

    return 0;
}

The first statement declares a vector called student_marks with capacity to hold 20 values of type double. These values can be accessed individually as student_marks[0] to student_marks[19]. The for loop has the counter i go from 0 to 19, allowing access to each individual element in a sequential manner, starting at 0 and going through each value from 0 to 19, inclusively.

Notice the data type for the subscript, vector<double>::size_type. As with strings, class vector<type> provides a size_type to represent positions and sizes. It is always recommended that you use this data type when dealing with vectors.

for loops usually go hand in hand with the use of vectors or arrays, as they provide a convenient way to access every element, one at a time, using the loop control variable as the subscript. This does not mean that we must use for loops whenever we require access to the elements of a vector — it only means that quite often, a for loop provides a convenient approach and we choose it as the mechanism to access the elements.

Resizing Vectors

Vectors have one important advantage with respect to C-style arrays: vectors can be resized during the execution of the program to accommodate any extra elements as needed, or even to “shrink” the vector.

In the example from the previous fragment above, if we don't know ahead of time (i.e., at the time we are writing the program) that there are 20 students, we could obtain that information at run-time (e.g., prompt the user for the number of students) and resize the vector accordingly, as shown below (though we notice that the example is somewhat silly, in that we could have waited until having the value of num_students and then declare the vector initializing it with that size):

vector<double> student_marks;
    // no size specified: vector contains
    // no elements

int num_students;
cout << "Number of students: " << flush;
cin >> num_students;

student_marks.resize (num_students);

for (vector<double>::size_type i = 0; i < num_students; i++)
{
    cout << "Enter marks for student #" << i+1
         << ": " << flush;
    cin >> student_marks[i];
}

Notice that the valid subscripts for a vector with num_students elements are 0 to num_students−1. For that reason, the for loop starts at 0 and goes while i is less than num_elements.

It is always a better idea to control for loops using the size method of vector. That way, we make sure that we loop only through the right subscript values, and we avoid the risk of accidentally exceeding the limits of the vector:

for (vector<double>::size_type i = 0; i < student_marks.size(); i++)

The difference in this case seems insignificant, and it almost sounds unnecessary to use the size method; but again, it's always a good idea to stick to good programming practices that may be very convenient in larger or more complex programs.

In some situations, we can not determine the number of elements before reading them. That is, we may have to read numbers to then determine when to stop reading them (an example would be, keep reading values until you read a negative value). In such situations, the trick of resizing the vector is not an option (at least not the way it is used in the example above).

Vectors provide a convenient way of handling this type of situation. We can use the push_back method to append one element at the end of the array. The operation includes resizing to one more element to accommodate for the extra element, and storing the given value at the end of the array.

The example below shows the use of push_back to accept numbers from the user and store them in a vector, until the user indicates that there are no more numbers.

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<double> student_marks;
    double mark;
    char answer;

    cout << "Enter marks (y/n)? " << flush;
    cin >> answer;

    while (answer == 'y' || answer == 'Y')
    {
        cout << "Enter value: " << flush;
        cin >> mark;

        student_marks.push_back (mark);

        cout << "More students (y/n)? " << flush;
        cin >> answer;
    }

    return 0;
}

Inserting and Removing Elements

Methods push_back and pop_back insert and remove (respectively) elements at the end of the vector. For situations where we need to insert or remove at an arbitrary position, we have methods remove and remove. We notice that these methods are inefficient with a vector (they take linear time, since all the remaining elements from the given position to the end have to be shifted). However, for situations where we must support these operations, class vector does provide the facilities to do so.

Methods insert and remove use iterators (discussed in the next section) as parameters to indicate the position at which we want to insert or remove the element(s). However, a “quick and dirty” solution is available, given the nature of vector iterators (they support operations essentially identical to pointer arithmetic). The code sample below illustrates these features:

#include <vector>
using namespace std;

int main()
{
    vector<double> values(10);
        // Create vector with 10 elements (initialized to 0.0)

    values.insert (values.begin() + 5, 1.4142);
        // Insert sqr root of 2 at position 5 (before the element that was at position 5)

    values.remove (values.begin() + 3);
        // Remove element at position 3

In the sample above, the expressions values.begin() + n work similarly to the idea of getting a pointer to element n of an array — we get a pointer to the first element of the array, then add an integer offset to obtain a pointer pointing n elements after the beginning of the array. The expression values.begin() returns an iterator pointing to the first element of the vector, and adding an integer value n results in an iterator pointing n positions after the first element, providing the required parameter for insert and remove methods.

Footnotes

Actually, there are also arrays in C++, and we often refer to these as built-in arrays or C-style arrays to emphasize the distinction; however, strictly speaking, the term array corresponds to the built-in version, which is compatible with arrays in the C language. In this tutorial, we deal exclusively with vectors. ↑

C++ Vectors

Basic Vectors Facilities

Resizing Vectors

Inserting and Removing Elements

Iterators: An Alternative Way of Accessing Elements

A Few Algorithms from the Standard Library

Footnotes

	C++ Vectors Carlos Moreno Copyright © 2000 – 2024 Carlos Moreno and Mochima Software/AudioSystems No, a person is smart. People are dumb, panicky, dangerous animals – Kay Basic Vectors Facilities Arrays are a programming tool that provide a mechanism to store a group of values under a single name. The values can be any available data type (e.g., int, double, string, etc.). In C++, we talk about vectors, rather than arrays.^[1] Vectors are declared with the following syntax: vector<type> variable_name (number_of_elements); The number of elements is optional. You could declare it like this: vector<type> variable_name; And that would declare an empty vector — a vector that contains zero elements. The argument type in angle-brackets indicates the data type of the elements of the vector; variable_name is the name that we assign to the vector, and the optional number_of_elements may be provided to indicate how many elements the vector will initially contain. Below are several examples of vector declarations: vector<int> values (5); // Declares a vector of 5 integers vector<double> grades (20); // Declares a vector of 20 doubles vector<string> names; // Declares a vector of strings, // initially empty (contains 0 strings) When using vectors in our programs, we must provide the appropriate #include directive at the top, since vectors are a Standard Library facility, and not a built-in part of the core language: #include <vector> After a vector has been declared specifying a certain number of elements, we can refer to individual elements in the vector using square brackets to provide a subscript or index, as shown below: grades[5] When using a vector or array followed by square brackets with a subscript, the resulting expression refers to one individual element of the vector or array, as opposed to the group of values, and roughly speaking, you can use that expression as you would use a variable of the corresponding data type. In the above example, the data type of the expression grades[5] is double, so you can use it as you would use a variable of type double — you can assign a value to it (a numeric value, with or without decimals), or you can retrieve the value, use it for arithmetic operations, etc. The above extends to other data types as well; if we have a vector of strings called names, the expression names[0] is a string, referring to the first element in the vector names. We can do anything with this expression that we would do with a string variable. For instance, the expression names[0].length() gives us the length of this string. An important condition for the index or subscript is that it must indicate a valid element in the vector. Elements in a vector are “numbered” starting with element 0. This means that valid subscript values are numbers between 0 and size−1, where size is the number of elements of the vector. For the example above of grades, valid subscripts are between 0 and 19. The following fragment shows an example of a program that asks the user for marks for a group of 20 students and stores them in a vector. #include <iostream> #include <vector> using namespace std; int main() { vector<double> student_marks(20); for (vector<double>::size_type i = 0; i < 20; i++) { cout << "Enter marks for student #" << i+1 << ": " << flush; cin >> student_marks[i]; } // ... Do some stuff with the values return 0; } The first statement declares a vector called student_marks with capacity to hold 20 values of type double. These values can be accessed individually as student_marks[0] to student_marks[19]. The for loop has the counter i go from 0 to 19, allowing access to each individual element in a sequential manner, starting at 0 and going through each value from 0 to 19, inclusively. Notice the data type for the subscript, vector<double>::size_type. As with strings, class vector<type> provides a size_type to represent positions and sizes. It is always recommended that you use this data type when dealing with vectors. for loops usually go hand in hand with the use of vectors or arrays, as they provide a convenient way to access every element, one at a time, using the loop control variable as the subscript. This does not mean that we must use for loops whenever we require access to the elements of a vector — it only means that quite often, a for loop provides a convenient approach and we choose it as the mechanism to access the elements. Resizing Vectors Vectors have one important advantage with respect to C-style arrays: vectors can be resized during the execution of the program to accommodate any extra elements as needed, or even to “shrink” the vector. In the example from the previous fragment above, if we don't know ahead of time (i.e., at the time we are writing the program) that there are 20 students, we could obtain that information at run-time (e.g., prompt the user for the number of students) and resize the vector accordingly, as shown below (though we notice that the example is somewhat silly, in that we could have waited until having the value of num_students and then declare the vector initializing it with that size): vector<double> student_marks; // no size specified: vector contains // no elements int num_students; cout << "Number of students: " << flush; cin >> num_students; student_marks.resize (num_students); for (vector<double>::size_type i = 0; i < num_students; i++) { cout << "Enter marks for student #" << i+1 << ": " << flush; cin >> student_marks[i]; } Notice that the valid subscripts for a vector with num_students elements are 0 to num_students−1. For that reason, the for loop starts at 0 and goes while i is less than num_elements. It is always a better idea to control for loops using the size method of vector. That way, we make sure that we loop only through the right subscript values, and we avoid the risk of accidentally exceeding the limits of the vector: for (vector<double>::size_type i = 0; i < student_marks.size(); i++) The difference in this case seems insignificant, and it almost sounds unnecessary to use the size method; but again, it's always a good idea to stick to good programming practices that may be very convenient in larger or more complex programs. In some situations, we can not determine the number of elements before reading them. That is, we may have to read numbers to then determine when to stop reading them (an example would be, keep reading values until you read a negative value). In such situations, the trick of resizing the vector is not an option (at least not the way it is used in the example above). Vectors provide a convenient way of handling this type of situation. We can use the push_back method to append one element at the end of the array. The operation includes resizing to one more element to accommodate for the extra element, and storing the given value at the end of the array. The example below shows the use of push_back to accept numbers from the user and store them in a vector, until the user indicates that there are no more numbers. #include <iostream> #include <vector> using namespace std; int main() { vector<double> student_marks; double mark; char answer; cout << "Enter marks (y/n)? " << flush; cin >> answer; while (answer == 'y' \|\| answer == 'Y') { cout << "Enter value: " << flush; cin >> mark; student_marks.push_back (mark); cout << "More students (y/n)? " << flush; cin >> answer; } return 0; } Inserting and Removing Elements Methods push_back and pop_back insert and remove (respectively) elements at the end of the vector. For situations where we need to insert or remove at an arbitrary position, we have methods remove and remove. We notice that these methods are inefficient with a vector (they take linear time, since all the remaining elements from the given position to the end have to be shifted). However, for situations where we must support these operations, class vector does provide the facilities to do so. Methods insert and remove use iterators (discussed in the next section) as parameters to indicate the position at which we want to insert or remove the element(s). However, a “quick and dirty” solution is available, given the nature of vector iterators (they support operations essentially identical to pointer arithmetic). The code sample below illustrates these features: #include <vector> using namespace std; int main() { vector<double> values(10); // Create vector with 10 elements (initialized to 0.0) values.insert (values.begin() + 5, 1.4142); // Insert sqr root of 2 at position 5 (before the element that was at position 5) values.remove (values.begin() + 3); // Remove element at position 3 In the sample above, the expressions values.begin() + n work similarly to the idea of getting a pointer to element n of an array — we get a pointer to the first element of the array, then add an integer offset to obtain a pointer pointing n elements after the beginning of the array. The expression values.begin() returns an iterator pointing to the first element of the vector, and adding an integer value n results in an iterator pointing n positions after the first element, providing the required parameter for insert and remove methods.
	Iterators: An Alternative Way of Accessing Elements Iterators provide an arguably more flexible way of accessing elements in a vector. As the name indicates, they allow you to iterate over the elements of a vector, providing read or write access to each element, one at a time. To work with iterators, we first declare a variable of type iterator. When declaring the iterator, we have to specify that it is an iterator for a vector of the given type, as illustrated in the example below: vector<int>::iterator i; This statement declares an iterator that can be used to iterate over the elements of a vector of ints. The iterator variable (in the above example, i) allows us to access a particular element, and to move to the next element. These two operations are clearly necessary if we want to iterate over all the elements of the vector. We can visualize the iterator as an object that “points to” a particular element of the vector. To access to the element “pointed to” by the iterator, we place a * operator in front, as shown in the example below: cout << i << endl; To advance to the next element in the vector, we use the pre- or post-increment ++ operator. Either one does the job, but in general, when working with iterators, it is recommended to prefer the pre-increment form, since it is potentially more efficient (even if the difference is marginal): cout << i << endl; ++i; The above examples deal with just one element of the vector. As I mentioned, we use iterators in general to iterate over the elements of a vector. To this end, we typically use a for loop, with an iterator as the control variable (as opposed to an integer to be used as the subscript). The for loop starts with a value for the iterator pointing to the first element in the vector, and goes while the iterator points to elements within the vector. To accomplish this, we use the begin and end methods of vector. These methods give us iterators positioned at the beginning and at the end of the vector (more specifically, vec.end() will give us an iterator positioned at one-past-the-last element, such that we can use it to control the loop). The fragment below illustrates the use of iterators to print all the elements of a vector of ints. vector<int> values; // ... (populate the vector) for (vector<int>::iterator i = values.begin(); i != values.end(); ++i) { cout << i << endl; } The for loop is split into three lines to avoid an extra-wide figure — in actual programs, it is often written this way, to avoid extra-long lines. Notice that since values.end() is an iterator placed at one-past-the-last element, when i is equal to that value, then we exit the loop, since that means that we are now outside the range of the vector's elements. If we have a vector of strings and want to print the elements (the strings) and their lengths, we would use iterators as shown in the following fragment: vector<string> names; // ... (populate the vector) for (vector<string>::iterator n = names.begin(); n != names.end(); ++n) { cout << "Name: " << n << "\tLength: " << n–>length() << endl; } The expression n–>length() can be seen as “syntactic sugar” for the (equivalent) expression (n).length() — n gives us the string “pointed to” by the iterator n; being a string, we can use .length() to obtain the length of the string (in this case not given by a variable name, but by the expression (*n)). The syntax using the –> operator provides a more convenient way to carry out this operation, and it is the one that virtually every C++ programmer would use and expect to see in actual programs. The same applies to other string methods such as substr, find, etc. The following is an example of a loop using iterators to display the strings in a vector, limiting the maximum number of characters to print to 10. vector<string> titles; // ... (populate the vector) const int max_length = 10; for (vector<string>::iterator t = titles.begin(); t != titles.end(); ++t) { cout << "Title: " << t–>substr(0, max_length) << endl; }
	A Few Algorithms from the Standard Library One of the additional benefits of using iterators is that the Standard Library algorithms are compatible with this idiom. In other words, Standard Library algorithms work with iterators representing a sequence of items, rather than directly with vectors. This provides a flexible and convenient mechanism, since the same idiom can be used to access elements in a sequence coming from any other data structure, not only vectors (for a more advanced discussion on the benefits of this approach, you may take a look at my STL Tutorial — that tutorial, however, requires a relatively advanced knowledge of C++ features, including templates, classes and object oriented; use it at your own risk :-)). Furthermore, the idiom is not restricted to items that are stored in some data structure in memory — the items might as well be the rows resulting from a query to a remote database system, or items read from a file on disk, or items representing packets received through a network interface. The Standard Library algorithms deal with sequences rather than with containers (e.g., vectors). A sequence is specified by a pair of iterators that indicate the beginning and the end of the sequence (given by an iterator to one-past-the-last element of the sequence). For example, the algorithm sort may be used to sort the elements of a vector. We would use .begin() and .end() to obtain the sequence of elements that sort expects. The line below shows an example of sorting the values of a vector: sort (values.begin(), values.end()); When using this or other Standard Library algorithms, we have to include the <algorithm> standard library fecilities: #include <algorithm> We could use the algorithm reverse to re-arrange the elements in the vector by reversing them. For example, we could use it after sort to order the elements from higher to lower: sort (values.begin(), values.end()); reverse (values.begin(), values.end()); If instead of putting them in order we wanted to randomly shuffle them (for instance, in a game application where you want to shuffle a deck of cards), there is also an available algorithm for this: random_shuffle (values.begin(), values.end()); The above are examples of mutating algorithms — they modify the sequence of elements. Other algorithms are provided for tasks such as counting or searching. For instance, if we want to count how many elements with a given value are contained in a vector, we could do: const int num_zeros = count (values.begin(), values.end(), 0); If we want to determine the highest and lowest values in a vector, we can do it with the following algorithms: const int highest = max_element (values.begin(), values.end()); const int lowest = min_element (values.begin(), values.end()); The algorithms max_element and min_element return an iterator pointing at the highest and lowest elements, respectively; we place a * in front to obtain the corresponding values. These are just some of the many available algorithms available in the Standard Library. As mentioned earlier, in my STL Tutorial, I discuss many other algorithms in more detail, as well as some of the basic principles behind them. Footnotes Actually, there are also arrays in C++, and we often refer to these as built-in arrays or C-style arrays to emphasize the distinction; however, strictly speaking, the term array corresponds to the built-in version, which is compatible with arrays in the C language. In this tutorial, we deal exclusively with vectors. ↑