Study notes: Programming: Principles and Practice Using C++

2022-10-24 | updated 2022-11-08

Notes on Programming: Principles and Practice Using C++ by Bjarne Stroustrup

[3.8] Types and objects

type: the set of possible values and operations for an object
object: some memory holding a value of a given type
value: some bits in memory interpreted according to a type
variable: a named object
declaration: a statement giving a name to an object
definition: a statement setting aside memory for an object

[4.3] Expressions

int length = 20;            // assigned
int area = length * length; // read

In line 1, length is used as an lvalue: it names the box (the object) that holds the int. In line 2, length is used as an rvalue: the value being read, i.e. the contents of the box.

In length *= 15;, it is used as both lvalue and rvalue. Same with ++length: it is updated, then read.

Note: Assignment is an expression: In a = 10, = is an operator and the expression produces a value. a = 10; (with a semicolon) is a statement.

[4.3.3] Conversions

Implicit conversion, done without any checking (the int operand is promoted to double):

2.5 / 2 == 2.5 / double(2);

To prevent narrowing, use type { value } form:

'a' + 1 == int{'a'} + 1;

The fractional part is truncated in these (assuming a / b yields a double):

int ratio = a / b;

int ratio;
ratio = a / b;

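But this form checks for narrowing. The check happens at compile time: if a / b is a double, the initialisation is ill-formed (some compilers only issue a warning when the value is not a constant):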

int ratio {a/b};
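
A minimal sketch of the two behaviours above, assuming a and b are doubles. The function names are made up for illustration.

```cpp
// Illustrative helpers (names made up); a and b are assumed to be doubles.
int truncated_ratio(double a, double b)
{
    int ratio = a / b;  // implicit double -> int: the fraction is dropped
    return ratio;
}

int explicit_ratio(double a, double b)
{
    // int ratio {a / b};  // narrowing: diagnosed at compile time
    int ratio {static_cast<int>(a / b)};  // truncation requested explicitly
    return ratio;
}
```

With 5.0 and 2.0 both return 2: the brace form does not change the result, it only forces the truncation to be spelled out.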

[4.4] Statements

Kinds:

empty, e.g. in if (cond); { ...; }, the cond value doesn’t matter since the if block is empty; the second block runs either way (i.e. the first semicolon was probably included accidentally).
expression statement
declaration
selection: if, switch
iteration/repetition: while, for
block (compound)

An expression statement is useless if it doesn’t have a side effect, so these are typically assignment, I/O, subroutine calls, etc.

[4.5] Functions

Definition:

int square (int x) { return x * x; }

Declaration:

int square (int);

A declaration can be #included in the program, e.g. from a header file, while the definition is elsewhere, out of sight, e.g. in a compiled library.
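
A sketch of that split in a single file. In a real project the declaration would typically sit in a header and the definition in a separate .cpp file or library; sum_of_squares is a made-up caller.

```cpp
// Declaration: enough for the compiler to check calls.
// Normally this line would come from a header via #include.
int square(int);

// A caller only needs the declaration to be visible.
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

// Definition: could equally live in another translation unit.
int square(int x)
{
    return x * x;
}
```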

[4.6] Vector

Initialising with some data:

vector<int> v = { 3, 2, 1 };

Defined with a given size but no element values yet (initialised to "" because the element type is string):

vector<string> vs(4);

A vector knows its own size, accessed by calling a member function:

for (int i = 0; i < v.size(); ++i)
  cout << v[i] << '\n';

Equivalent, using a range-for loop to iterate over all elements:

for (int x : v)
  cout << x << '\n';

Creating an empty vector without a preallocated size:

vector<int> v;
v.push_back(2); ...
for (int x; cin >> x;)
  v.push_back(x);

The above uses a for-loop whose condition is the read itself (it continues for as long as cin >> x succeeds) and which has no increment step. This keeps the scope of x local to the loop, unlike a while-loop, where x would have to be declared outside the loop.
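
The same loop, sketched as a function that reads from any input stream so it can be tried without a terminal. read_ints and read_ints_from are names invented for this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated ints until the stream fails
// (end of input, or a token that is not an int).
std::vector<int> read_ints(std::istream& in)
{
    std::vector<int> v;        // starts empty
    for (int x; in >> x;)      // x is scoped to the loop
        v.push_back(x);        // grows the vector by one element
    return v;
}

// Convenience wrapper: feed the loop from a string instead of cin.
std::vector<int> read_ints_from(const std::string& text)
{
    std::istringstream in{text};
    return read_ints(in);
}
```

Reading stops at the first token that is not an int, so the input "3 2 1 x 9" yields three elements.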

[5.1] Types of errors

compile-time:
- syntax
- type
link-time
run-time, detected by:
- hardware/OS
- library
- user code
logic
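
Two of the run-time cases, sketched with made-up function names: an error detected by the library (vector::at checks the index, operator[] does not) and one detected by user code (a function checking its own arguments).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Detected by the library: at() throws std::out_of_range for a bad index.
bool library_detects_bad_index(const std::vector<int>& v, std::size_t i)
{
    try {
        static_cast<void>(v.at(i));  // value not needed, only the check
        return false;
    }
    catch (const std::out_of_range&) {
        return true;
    }
}

// Detected by user code: the function validates its arguments itself.
int area(int length, int width)
{
    if (length <= 0 || width <= 0)
        throw std::invalid_argument{"area: non-positive side"};
    return length * width;
}
```

A logic error would be something like returning length + width here: it compiles, links and runs, but gives the wrong answer.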

[5.2] Sources of errors

poor specification
incomplete program
unexpected arguments
unexpected input
unexpected state
logical errors

[8.4]

Functions nested inside other functions are not legal in C++.

[8.5]

[8.5.1] Unnamed function arguments

Function parameters need not be named. In function definitions (cf. declarations), an unnamed parameter is probably only useful when the argument is no longer used in the body but must be kept for API stability.
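
A sketch of the API-stability case. The scenario is invented: greet once used its second argument and no longer does.

```cpp
#include <string>

// The second parameter is left unnamed because the body no longer uses it.
// Existing callers that pass two arguments still compile, and there is no
// "unused parameter" warning.
std::string greet(const std::string& name, int /* formerly an encoding flag */)
{
    return "Hello, " + name;
}
```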

[8.5.2] Returning from a void function

No return statement needed at the end of the function. (All other functions except main() require a return statement.)

Return early with:

return;
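
A small sketch with a made-up function: an early return; in a void function, and no return statement at the end.

```cpp
#include <vector>

// Doubles every element in place.
void double_all(std::vector<int>& v)
{
    if (v.empty())
        return;      // early exit: a bare return, with no value

    for (int& x : v)
        x *= 2;
}                    // falling off the end is fine in a void function
```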

[8.5.3] Pass by value

Makes a copy of the argument; the copy is local to the function. Cheap for small objects: int, char, small structs.

[8.5.4] Pass by const reference

Useful for potentially large arguments that don’t need to be mutated, e.g. vector, string. Not copied.

E.g. change void f(vector<int> v) to void f(const vector<int>& v) with no change to the function body.

[8.5.5] Pass by reference

Use if you need to modify the function argument (imperative/OO style), e.g. swap().

A reference is a new name for an existing object – conceptually, an alias.

Especially useful for repeatedly mutating a value that can only be accessed via a complicated or expensive expression.
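
The three passing styles from [8.5.3] to [8.5.5] side by side, as a sketch. All function names are invented for illustration.

```cpp
#include <vector>

// By value: the function works on its own copy.
int incremented(int x)
{
    ++x;             // changes the copy only
    return x;
}

// By const reference: no copy is made and the vector cannot be modified.
int sum(const std::vector<int>& v)
{
    int total = 0;
    for (int x : v)
        total += x;
    return total;
}

// By reference: the function modifies the caller's objects.
void swap_ints(int& a, int& b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// A reference as a short alias for an object behind a longer expression.
// Assumes grid and its last row are non-empty.
void bump_last(std::vector<std::vector<int>>& grid)
{
    int& last = grid.back().back();
    last += 1;
    last *= 2;
}
```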

[8.5.7]

Passing an argument is equivalent to initialising the function-local variable with the given value. It might result in conversion. E.g. in void f(double d); f(1);, d would get the same value as in double d = 1;.

Beware narrowing conversions! For void f(int i), the call f(1.5) behaves exactly like f(1). Rather be explicit: f( int(1.5) ) or f( static_cast<int>(1.5) ).
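
A sketch of both directions of conversion when passing arguments. half and twice are stand-ins for the f above and return values so the effect can be observed.

```cpp
double half(double d) { return d / 2; }
int twice(int i) { return 2 * i; }

double call_with_int()
{
    return half(1);      // 1 is converted to 1.0, as in double d = 1;
}

int call_with_double(double x)
{
    return twice(x);     // silently narrows: 1.5 becomes 1
}

int call_with_cast(double x)
{
    return twice(static_cast<int>(x));  // same result, but the intent is visible
}
```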

[8.5.8] Call stack

The call stack consists of a stack of function activation records. Each record sets aside space in memory for the parameters and local variables of one function call. [The space is reserved as part of making the call, but reserving it does not initialise it: a local of built-in type without an initialiser holds an indeterminate value.] Cheap to create, even if some variables are only used in some code paths (branches).

[8.5.9] constexpr functions

Behave just like normal functions if the arguments are not constant (e.g. determined at run-time or passed in to parent function). But if used in a constexpr context:

the arguments must be known at compile-time.
the function will be evaluated at compile-time, so it must be “simple” enough:
- single return / exit point (a C++11 restriction; C++14 and later also allow local variables and loops)
- doesn’t modify non-local variables (global or passed by reference)
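
A sketch of one constexpr function used both ways. The names are illustrative; the function sticks to a single return statement, so it is valid even under the C++11 rules.

```cpp
// Single return statement, no side effects.
constexpr int square(int x)
{
    return x * x;
}

// Constant-expression context: the argument is a literal, so the compiler
// must evaluate the call itself.
constexpr int nine = square(3);
static_assert(nine == 9, "square(3) is evaluated at compile time");

// Ordinary context: n is only known at run time, so this behaves like a
// call to any other function.
int square_at_runtime(int n)
{
    return square(n);
}
```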

[8.6]

[8.6.1]

Predicting the order of evaluation of subexpressions is tricky, so don’t read/write a variable twice in an expression that modifies it, e.g. cout << ++i << i; or v[++i] = i; or v[i] = ++i;.

[8.6.2]

Avoid global variables, especially those with non-constant initialisers: the order of initialisation of globals in different translation units is not defined.

If you need an object with complex initialisation, return it from a function. If it is expensive to calculate and you want to share a single instance, create a static const variable in the function and return a const reference to it: the static variable is constructed the first time the function is called, and subsequent calls just return a reference to that same object.
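
A sketch of that pattern. build_squares stands in for an expensive computation; both function names are invented.

```cpp
#include <vector>

// Stand-in for an expensive initialisation.
std::vector<int> build_squares()
{
    std::vector<int> t;
    for (int i = 0; i < 100; ++i)
        t.push_back(i * i);
    return t;
}

// The static local is constructed the first time control reaches its
// definition. Later calls skip the initialisation and hand back a
// reference to the same object.
const std::vector<int>& squares_table()
{
    static const std::vector<int> table = build_squares();
    return table;
}
```

Comparing &squares_table() across two calls gives the same address, which confirms that there is only one instance.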

[8.7] Namespaces

Unlike in Python, where a module is a file and a namespace for all public members (incl. imported ones), C++ header files and namespaces are orthogonal.

Short, pretty, obvious namespace names are more likely to clash with those defined elsewhere (e.g. other libraries).

To avoid having to use the namespace prefix:

After using std::string;, string means std::string.
After using namespace std;, all names in std are directly accessible.

Don’t put using declarations or using directives in headers! Let users choose whether to do using.

Also, don’t use using directives (using namespace X;) except for well-known namespaces (e.g. std).
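
A sketch of the three ways to reach a name. shapes_demo is a made-up namespace; placing the using inside a function body (an extra choice made here, not something from the notes) keeps its effect local.

```cpp
#include <string>

namespace shapes_demo {          // hypothetical namespace for this sketch
    int area(int w, int h) { return w * h; }
}

int fully_qualified()
{
    return shapes_demo::area(2, 3);   // explicit prefix
}

int with_using_declaration()
{
    using shapes_demo::area;     // brings in this one name only
    return area(2, 3);
}

int with_using_directive()
{
    using namespace std;         // every name in std, but only in this function
    string s = "abc";
    return static_cast<int>(s.size());
}
```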

[9.3] Class interface and implementation

In a class, data members and member functions that are not under a public: label are private by default. But if the public members are listed first, a private: label is needed after them to mark the end of the public section.

In a struct (vs. class), members are public by default. Useful if there are no invariants (restrictions on the values members can take).

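Question: Should private members be declared in the header or only the public members? (They have to be declared there too: the class definition must list every member, because code that uses the class needs to know the object’s size and layout. Only the member function bodies can be moved out of the header.)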

The distinction between interface and implementation can be represented by both private vs. public and header (.h) vs. implementation (.cpp/.a/.so). Since the body of public methods will call private members or access private data members, the public implementation shouldn’t be in the header either.

[9.4]

[9.4.4]

Member functions can be defined with the class definition/declaration. But this might obscure the interface if the function bodies are long (cf. accessor methods). Also, it’s better to separate the declaration into the header and the implementation into the translation unit. This also allows the implementation to change without requiring code that calls it to be recompiled. However, a member function defined in the class is implicitly inline, so the compiler may copy its body to the call sites, which can give better performance if the function is simple (e.g. ≤ 2 expressions) and used a lot.

Constructors: Use the initialiser list for data members before the function body (which is empty or checks invariants).

class Date {
  int y, m, d;

  public:
    Date(int dd) : y{0}, m{1}, d{dd}
    //             ^ initialiser list
    {
      is_valid();
      ...
    }
};
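
A fuller sketch of the same idea: the initialiser list sets the members and the body only checks the invariant. The validity check is deliberately simplistic (an assumption for this sketch), and throwing std::invalid_argument is one possible way to report a bad date.

```cpp
#include <stdexcept>

class Date {
public:
    // Members are initialised before the body runs.
    Date(int yy, int mm, int dd) : y{yy}, m{mm}, d{dd}
    {
        if (!is_valid())
            throw std::invalid_argument{"Date: invalid date"};
    }

    int year() const { return y; }
    int month() const { return m; }
    int day() const { return d; }

private:
    // Simplified check: ignores month lengths and leap years.
    bool is_valid() const
    {
        return 1 <= m && m <= 12 && 1 <= d && d <= 31;
    }

    int y, m, d;
};
```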

[9.5] Enums

Declare with enum class to get a scoped enum (used like Month::jan):

enum class Month {
  jan = 1,  // the first enumerator would be 0 by default
  feb,      // representation = previous value + 1
  mar,
  ...,
  dec
};

Scoped enums can’t be assigned to/from an int: they are different types. But can be created via unchecked conversion:

Month m = Month(3);

and they can be converted explicitly to the int representation:

int(m);  // 3
int(Month::jan) == 1;

It is not possible to define a constructor method (to check invariants). But you could define a separate constructor/converter function, e.g. Month int2month(int).
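
A sketch of such a checked converter. The enumerator names between mar and dec are filled in here as an assumption, and throwing std::out_of_range is just one way to report a bad value.

```cpp
#include <stdexcept>

enum class Month {
    jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec
};

// Checked conversion: rejects ints that do not name a month,
// unlike the unchecked Month(x).
Month int2month(int x)
{
    if (x < int(Month::jan) || int(Month::dec) < x)
        throw std::out_of_range{"int2month: not a month"};
    return Month(x);
}
```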

Plain enums (not scoped)

Declared with enum (no class):

enum Day { mon, tues };

Can be accessed without the scope prefix (potential for name clashes):

mon == Day::mon;

Can assign the enum to an int, but not an int to the enum:

int i = mon;

// Day d = 1;   // invalid
Day d = Day(1); // ok

[9.6] Operator overloading

Can only be done for operators that already exist in C++; new operators can’t be created. An overloaded operator must take the same number of operands as the built-in one. It should keep the conventional interpretation/meaning (assignment, equality, comparison, subscript, call, etc.).
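
A sketch of two overloads that keep their conventional meaning: equality for a small struct and prefix increment for a month enum. Point and the full list of enumerators are assumptions made for this sketch.

```cpp
struct Point {
    int x;
    int y;
};

// Binary ==, two operands, as for the built-in operator.
bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

bool operator!=(const Point& a, const Point& b)
{
    return !(a == b);
}

enum class Month {
    jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec
};

// Prefix ++: moves to the next month, wrapping from dec to jan,
// and returns the updated object like the built-in ++ does.
Month& operator++(Month& m)
{
    m = (m == Month::dec) ? Month::jan : Month(int(m) + 1);
    return m;
}
```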

[9.7] Class interfaces

Design principles:

Keep interface small but complete.
Provide constructors, destructors (free all resources).
Decide whether to allow or prohibit copying.
Use the type system to check arguments at compile-time.
Mark non-modifying methods (const).

Symbolic constants in classes: Use static const/constexpr so that all instances share the same copy of the value.

[9.7.2] Copying

Default behaviour: Copying a class instance copies all its members.

Otherwise: see §18.3, §14.2.4

[9.7.3] Default constructor

Constructor that takes no arguments, e.g.

Date::Date() : y{1}, m{1}, d{1} { }

Alternatively, give the data members in-class initialisers:

class Date {
  // in-class initialisers
  int y {1};
  int m {1};
  int d {1};

  public:
    // default values are available to all constructors
    Date();
    Date(int y);
    Date(int y, int m);
    Date(int y, int m, int d);
};

Or, instead of the constructor, provide a const Date& default_date(), which returns a reference to a static local variable. This function could also be used by the different constructors.

A default constructor allows creating e.g. vectors of Dates without needing to supply default/initial values:

vector<Date> dates(10);
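
A compilable sketch combining in-class initialisers, a default constructor and a vector of default-constructed Dates. The accessors and ten_default_dates are additions for the sake of the example.

```cpp
#include <vector>

class Date {
public:
    Date() {}                          // relies on the in-class initialisers
    explicit Date(int yy) : y{yy} {}   // m and d keep their defaults
    Date(int yy, int mm, int dd) : y{yy}, m{mm}, d{dd} {}

    int year() const { return y; }
    int month() const { return m; }
    int day() const { return d; }

private:
    int y {1};
    int m {1};
    int d {1};
};

// Each of the ten elements is built with Date().
std::vector<Date> ten_default_dates()
{
    return std::vector<Date>(10);
}
```

Without the default constructor, vector<Date>(10) would not compile, because the vector would have no way to create its elements.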

[9.7.4] const methods

Must be declared as const to allow them to be invoked on const objects.

class Date {
  public:
    int day() const;
    void add_day(int n);
};

const Date d = Date(1);
d.day() == 1;
d.add_day(2);  // error: add_day() is not a const member function
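
A self-contained sketch of the same rule. The single-member Date and read_day are simplifications for illustration, with no month rollover.

```cpp
class Date {
public:
    explicit Date(int dd) : d{dd} {}

    int day() const { return d; }     // promises not to modify the object
    void add_day(int n) { d += n; }   // modifies the object, so not const

private:
    int d;
};

// Through a const reference only const member functions can be called.
int read_day(const Date& date)
{
    // date.add_day(1);   // would not compile: add_day() is not const
    return date.day();
}
```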

[9.7.5] Methods vs. helper functions

If a function doesn’t need to access the representation, prefer to provide it as a freestanding (helper) function:

keeps the interface small
is easier to debug
there’s less impact if the representation needs to change

Comparison operators shouldn’t need to know about private data (unless the comparison can’t be done efficiently through the public interface).

It can be useful to group the class and its helper functions in a namespace.
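
A sketch of helpers written against the public interface only and grouped with the class in one namespace. The namespace name Chrono, the simplified Date and the helper names are assumptions for this example.

```cpp
namespace Chrono {

class Date {
public:
    Date(int yy, int mm, int dd) : y{yy}, m{mm}, d{dd} {}

    int year() const { return y; }
    int month() const { return m; }
    int day() const { return d; }

private:
    int y, m, d;
};

// Helpers: they use only the accessors, so they keep working if the
// representation changes (say, to a single day count).
bool operator==(const Date& a, const Date& b)
{
    return a.year() == b.year()
        && a.month() == b.month()
        && a.day() == b.day();
}

bool operator!=(const Date& a, const Date& b)
{
    return !(a == b);
}

// Gregorian leap-year rule; needs no Date at all.
bool leapyear(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

}  // namespace Chrono
```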