MPL – Collective communication

In a previous post I gave a very short introduction to the Message Passing Library (MPL), which is a C++ message passing library based on MPI. Although it does not provide a direct mapping of the C API to C++ it comes with all modes of collective communication as defined be the MPI standard, e.g., barrier, broadcast, gather, scatter, reduce and so on. The following program does not perform any meaningful calculation, but illustrates some modes of collective communication.

#include <cstdlib>
#include <complex>
#include <iostream>
#include <vector>
#include <mpl/mpl.hpp>
 
int main() {
  const mpl::communicator &comm_world=mpl::environment::comm_world();
  std::vector<int> v;
  if (comm_world.rank()==0)
    for (int i=0; i<comm_world.size(); ++i)
      v.push_back(i);
  int x;
  // rank 0 scatters data to all processes
  comm_world.scatter(0, v.data(), x);
  std::cout << "rank " << comm_world.rank() << " got " << x << '\n';
  // wait until all processes have reached this point
  comm_world.barrier();
  x*=2;
  // rank 0 gathers data from all processes
  comm_world.gather(0, x, v.data());
  if (comm_world.rank()==0)
    for (int i=0; i<comm_world.size(); ++i)
      std::cout << "got " << v[i] << " from rank " << i << '\n';
  // wait until all processes have reached this point
  comm_world.barrier();
  // calculate global sum and pass result to rank 0
  if (comm_world.rank()==0) {
    int sum;
    comm_world.reduce(mpl::plus<int>(), 0, x, sum);
    std::cout << "sum = " << sum << '\n';
  } else
    comm_world.reduce(mpl::plus<int>(), 0, x);
  // wait until all processes have reached this point
  comm_world.barrier();
  // calculate global sum and pass result to all
  comm_world.allreduce(mpl::plus<int>(), x);
  std::cout << "sum = " << x << '\n';
  return EXIT_SUCCESS;
}

Note that the reduction operation (addition) in the example above is specified as an anonymous function object. In addition to addition, MPL provides multiplication, logical operations »and« and »or«, bitwise operations »and«, »or«, and »xor« as well as minimum and maximum. A reduction operation must take two arguments of the same kind and produce a result of the same type as the arguments. With MPL it becomes very easy to define custom reduction operations, as the following example shows. Note that it is required, that the reduction operation is implemented by a class, which is derived from std::function and has no member variables.

#include <cstdlib>
#include <ctime>
#include <iostream>
#include <functional>
#include <vector>
#include <mpl/mpl.hpp>
 
// calculate least common multiple of two arguments 
template<typename T>
class lcm : public std::function<T (T, T)> {
  // helper: calculate greatest common divisor
  T gcd(T a, T b) {
    T zero=T(), t;
    if (a<zero) a=-a;
    if (b<zero) b=-b;
    while (b>zero) {
      t=a%b;  a=b;  b=t;
    }
    return a;
  }
public:
  T operator()(T a, T b) {
    T zero=T();
    T t((a/gcd(a, b))*b);
    if (t<zero)
      return -t;
    return t;
  }
};
 
int main() {
  const mpl::communicator &comm_world=mpl::environment::comm_world();
  // generate data
  std::srand(std::time(0)*comm_world.rank());  // random seed
  int v=std::rand()%12+1;
  // calculate least common multiple and send result to rank 0
  if (comm_world.rank()==0) {
    int result;
    // calculate least common multiple
    comm_world.reduce(lcm<int>(), 0, v, result);
    // display data from all ranks
    std::cout << "Arguments:\n";
    for (int r=0; r<comm_world.size(); ++r) {
      if (r>0)
	comm_world.recv(v, r);
      std::cout << v << '\n';
    }
    // display results of global reduction
    std::cout << "\nResult:\n";
    std::cout << result << '\n';
  } else {
    // calculate least common multiple
    comm_world.reduce(lcm<int>(), 0, v);
    // send data to rank 0 for display
    comm_world.send(v, 0);
  }
  return EXIT_SUCCESS;
}
MPI provides reductions operations MINLOC and MAXLOC, which simultaneously determine the minimum or maximum plus the location of the minimum or maximum. In MPL, however, such specific operations are not needed. Minimum and maximum operations for pairs of some scalar type and an integer do the same job as the following example shows.
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iomanip>
#include <vector>
#include <utility>
#include <mpl/mpl.hpp>
 
typedef std::pair<double, int> pair_t;
 
int main() {
  const mpl::communicator &comm_world=mpl::environment::comm_world();
  // generate data
  std::srand(std::time(0)*comm_world.rank());  // random seed
  const int n=8;
  std::vector<pair_t> v(n);
  for (pair_t &i : v)
    i=std::make_pair(static_cast<double>(std::rand())/RAND_MAX, comm_world.rank());
  // calculate minium and its location and send result to rank 0
  mpl::contiguous_layout<pair_t> layout(n);
  if (comm_world.rank()==0) {
    std::vector<pair_t> result(n);
    // calculate minimum
    comm_world.reduce(mpl::min<pair_t>(), 0, v.data(), result.data(), layout);
    // display data from all ranks
    std::cout << "Arguments:\n";
    for (int r=0; r<comm_world.size(); ++r) {
      if (r>0)
	comm_world.recv(v.data(), layout, r);
      for (pair_t i : v) 
	std::cout << std::fixed << std::setprecision(5) << i.first << ' ' << i.second << '\t';
      std::cout << '\n';
    }
    // display results of global reduction
    std::cout << "\nResults:\n";
    for (pair_t i : result) 
      std::cout << std::fixed << std::setprecision(5) << i.first << ' ' << i.second << '\t';
    std::cout << '\n';
  } else {
    // calculate minium and its location and send result to rank 0
    comm_world.reduce(mpl::min<pair_t>(), 0, v.data(), layout);
    // send data to rank 0 for display
    comm_world.send(v.data(), layout, 0);
  }
  return EXIT_SUCCESS;
}

MPL – A message passing library

The Message Passing Interface (MPI) Standard defines a message passing library, which serves as the basis for many high-performance computing applications today. It provides portable scalable functions for data exchange in parallel computations for various parallel computing architectures. Originally application programing interfaces had been defined for C and Fortran as well as for C++. In the 2008 update, known as MPI-3, however, the C++ bindings have been removed from the MPI standard. During the various revisions the MPI standard became quite complex and dropping one of the three language bindings may have helped to keep the standard maintainable as a whole. Furthermore, the C++ bindings were not very well designed. Although object orientated techniques had been applied, the MPI C++ bindings did not come close to a well designed C++ library by todays standards. explains in more detail, what happened to the C++ bindings in his blog.

Alternative C++ bindings to the MPI Standard are provided by Boost MPI and OOMPI, which was an early attempt to bring MPI 1 functionality to C++ in an object orientated way. Boost MPI uses rather modern C++ programing techniques to provide a very nice interface to the MPI standard’s core functionality. With Boost MPI programs become more type save (When sending data of a particular C++ type, the corresponding MPI data type is deduced by the compiler.) and sending data given by user defined structures or classes becomes much more easy than with the MPI standard’s C or C++ bindings. Although Boost MPI is a huge improvement over the deprecated C++ bindings of the MPI standard it has also its limitations.

  • It is no longer actively maintained.
  • Sending data of complex classes and structures is based on Boost serialization, which may cause performance reductions and does not work in heterogeneous environments (different endians etc.).
  • Boost MPI provides no equivalent to derived MPI data types (strided vectors, sub matrices, etc.).
  • Although Boost MPI supports the more general graph communicators, there are no functions for Cartesian communicators.
  • Boost MPI ist based on C++03, it does not benefit from new C++11 features.

Because C++ was dropped from the MPI standard and because Boost MPI does not fulfill all my needs for a flexible easy-to-use C++ message passing library I started to write my own massage passing library on top of MPI, just called MPL (Message Passing Library), see my GitHub account. Note that MPL will neither bring all functions of the C language API to C++ nor provide a direct mapping of the C API to some C++ functions and classes. Its focus is on the MPI core functions, ease of use, type safety, and elegance. It uses C++11 features, wherever reasonable, e.g., lambda functions as custom reduction operations.

MPL relies heavily on templates and template meta programming and it comes just as a bunch of header files. Documentation is still missing and only available in form as the source code and a few sample programs. If you are familiar with MPI, however, the transition to MPL will not be difficult. Let us start with a hello-world type program:

#include <cstdlib>
#include <iostream>
#include <mpl/mpl.hpp>
 
int main() {
  const mpl::communicator & comm_world(mpl::environment::comm_world());
  std::cout << "Hello world! I am running on \"" 
   	    << mpl::environment::processor_name() 
   	    << "\". My rank is "
  	    << comm_world.rank()
  	    << " out of " 
   	    << comm_world.size() << " processes.\n";
  return EXIT_SUCCESS;
}

Similar to MPI_COMM_WORLD in MPI, MPL has a global communicator that contains all processes, which belong to a parallel computation. Each communicator has a rank (the number of the process within a communicator) and a size (the total number of processes). The program shown above just prints for each process its rank, the size of the world communicator and the computer’s name, where the process runs. Note that with MPL it is not required to initialize or to finalize the massage passing library (MPI_Init and MPI_Finalize are called implicitly by some compiler magic.).

Let us look at a less trivial example and see how messages are send and received. A very elementary example using the C language bindings of MPI and C++11 may look like this:

#include <cstdlib>
#include <complex>
#include <iostream>
#include <mpi.h>
 
int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);
  int c_size, c_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &c_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &c_size);
  if (c_size<2) {
    MPI_Finalize();
    return EXIT_FAILURE;
  }
  // send and recieve a single floating point number
  if (c_rank==0) {
    double pi=3.14;
    MPI_Send(&pi, // pointer to memory
	     1, // number of data items
	     MPI_DOUBLE, // data type
	     1, // destination
	     0, // tag
	     MPI_COMM_WORLD // communicator
	     );
    std::cout << "sent: " << pi << '\n';
  } else if (c_rank==1) {
    double pi=0;
    MPI_Recv(&pi, // pointer to memory
	     1, // number of data items
	     MPI_DOUBLE, // data type
	     0, // source
	     0, // tag
	     MPI_COMM_WORLD, // communicator
	     MPI_STATUS_IGNORE // ignore the receive status
	     );
    std::cout << "got : " << pi << '\n';
  }
  MPI_Finalize();
  return EXIT_SUCCESS;
}

Here the standard MPI functions MPI_Send and MPI_Recv are employed. The function signature requires a lot of parameters: a pointer to a buffer, the number of items to be send or received, the data type, a source or destination, a tag, and finally the communicator. With MPL this simplifies a lot. MPL assumes that only one data item is send or received at a time, thus the number of data items is not needed to be specified. Furthermore, the underling MPI datatype will be deduced automatically at compile time by the compiler. This eliminates a typical error of MPI programs, e.g., passing a pointer to a do an integer by specifying MPI_DOUBLE as the data type. The tag, which may be used to distinguish between different kind of messages, becomes in MPL an argument with a default value, so it is optional. Thus, in MPL only the communicator, a reference to the data and a source or destination has to be given to the send and receive functions. The MPL equivalent to the MPI program shown above may look as:

#include <cstdlib>
#include <complex>
#include <iostream>
#include <mpl/mpl.hpp>
 
int main() {
  const mpl::communicator &comm_world=mpl::environment::comm_world();
  if (comm_world.size()<2)
    return EXIT_FAILURE;
  // send and recieve a single floating point number
  if (comm_world.rank()==0) {
    double pi=3.14;
    comm_world.send(pi, 1);  // send to rank 1
    std::cout << "sent: " << pi << '\n';
  } else if (comm_world.rank()==1) {
    double pi=0;
    comm_world.recv(pi, 0);  // receive from rank 0
    std::cout << "got : " << pi << '\n';
  }
  return EXIT_SUCCESS;
}

Of course sending and receiving single data items will not be sufficient for a message passing library. This is why MPL introduces the concept of data layouts. Data layouts specify the memory layout of a set of data to be sent or received (similar to derived data types in MPI). The layout may be continuous, a strided vector etc. The data layout is provided as an additional parameter to sending or receiving functions and, in contrast to the case of single data items, data is passed via a pointer. The following example may give an idea how data layouts are used with MPL:

#include <cstdlib>
#include <complex>
#include <iostream>
#include <vector>
#include <mpl/mpl.hpp>
 
int main() {
  const mpl::communicator &comm_world=mpl::environment::comm_world();
  if (comm_world.size()<2)
    return EXIT_FAILURE;
  std::vector<double> v(8);
  mpl::contiguous_layout<double> v_layout(v.size());
  // send and recieve a vector of floating point numbers
  if (comm_world.rank()==0) {
    double init=0;
    for (double &x : v) {
      x=init;
      ++init;
    }
    comm_world.send(v.data(), v_layout, 1);  // send to rank 1
    std::cout << "sent: ";
    for (double &x : v) 
      std::cout << x << ' ';
    std::cout << '\n';
  } else if (comm_world.rank()==1) {
    comm_world.recv(v.data(), v_layout, 0);  // receive from rank 0
    std::cout << "got : ";
    for (double &x : v) 
      std::cout << x << ' ';
    std::cout << '\n';
  }
  return EXIT_SUCCESS;
}

Cell phone supercomputers

In “Physics of the Future” Michio Kaku states “Today your cell phone has more computer power than all of the NASA back in 1969, when it placed two astronauts on the moon.” and in the fictional movie “Iron Sky” the space ship “Götterdämmerung” is operated by a smart phone. So what is actually the computational power of modern cell phones? To get an answer to this question I run the linpack benchmark on my Jolla (with Qualcomm Snapdragon dual core cpu at 1.4 GHz). Indeed, according to this benchmark my cell phone has a single-core performance of the order of a Cray supercomputer of the early 1990s, more than 300Mflops, which is about an order of magnitude slower than a modern desktop cpu.

The linpack benchmak solves a large dense system of linear equations in order to determine the number of executed floating point operations per second.

The linpack benchmak solves a large dense system of linear equations in order to determine the number of executed floating point operations per second. Screenshot shows results for a Jolla smart phone.

C++11/14 for scientific computing V

$\boldsymbol{\lambda}$ functions

Anonymous functions, often called $\lambda$ functions, are a common feature of scientific programming languages, e.g., Male and Matlab. They are particularly usefull when functions are employed, which take other functions as arguments.  For example, a numerical root finding algorithm requires to specify a function as an input parameter.

C++11 introduces $\lambda$ functions to C++ and in C++14 they have been extended and become slightly more flexible.  $\lambda$ functions can be assigned to variables similar to functors.  The main advantage of $\lambda$ functions over functors is that they can be defined where needed.  Furthermore, $\lambda$ functions capture variables in the current scope, thus they represent a closure.

The syntax of a $\lambda$ function has basically the form

[ capture-list ] ( params ) -> ret { body }

The possibly empty capture-list specifies which variables are captured and how they are captured (by value or by reference).  Then the arguments and the return type are specified followed by the definition of the function body.  The return type may be omitted if the function body consists of nothing but a single return statement with an expression.

The use of $\lambda$ functions is probably best explained by providing a specific example. In the following, a generic root finding algorithm (regula falsi) is implemented, which looks in an interval for a root of some function.

#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <cmath>
#include <limits>
 
template<typename T, typename F>
T regula_falsi(T s, T t, F f) {
  typedef enum { none, left, right } side_t;
  side_t side(none);
  // starting values at endpoints of interval 
  T fs=f(s);
  T ft=f(t);
  T r;
  for (int n=0; n<std::numeric_limits<T>::digits+2; ++n) {
    r=(fs*t - ft*s)/(fs - ft);
    if (std::abs(t-s)<std::numeric_limits<T>::epsilon()*std::abs(t+s)) 
      break;
    T fr=f(r);
    if (fr*ft>0) { // fr and ft have same sign, copy r to t
      t=r; 
      ft=fr;
      if (side==left) 
	fs/=2;
      side=left;
    } else if (fs*fr>0) { // fr and fs have same sign, copy r to s
      s=r; 
      fs=fr;
      if (side==right) 
	ft/=2;
      side=right;
    } else { // fr*f_ very small (looks like zero)
      break;
    } 
  }
  return r;
}
 
int main() {
  // assign lambda function to variable and use afterwards
  auto f = [](double x){ return std::cos(x)-x*x*x; };
  double x0=regula_falsi(0., 1., f);
  // specify lambda function directly as an argument
  double x1=regula_falsi(0., 3., [](double x){ return std::cos(x); });
  std::cout << std::setprecision(12) 
	    << x0 << '\n'
	    << x1 << '\n';
  return 0;
}

C++11/14 for scientific computing IV

Random numbers

C++98 inherited from C the standard functions rand and srand and the macro RAND_MAX for generating pseudo-random numbers. These functions suffer several problems. For example, it has been never specified which algorithm is used in rand to produce pseudo-random numbers. Thus, in C and in C++98 the outcome of a Monte Carlo simulation depends on the employed implementation of the standard library if rand is used. Furthermore, in C++98 it is hard to generate random numbers from some non-uniform distribution.

Therefore, in C++11 a portable random number generator facility has been introduced. All classes related to random number generation are declared in the header random. The random number generator facility consists of engines and distributions. Engines produce streams of pseudo random bit patterns and may implement various algorithms for random number generation, e.g., linear congruences or the Mersenne Twister. Distributions consume an engine’s bits to generate random numbers, which are distributed according to a specific distribution, e.g., uniform, exponential or Gaussian. The following code gives a simple example. A more detailed explanation of the C++11 random number generator facility can be found in Random Number Generation in C++11 and in the video rand() Considered Harmful.

#include <iostream>
#include <random>
 
int main() {
  // Mersenne Twister with default seed
  std::mt19937 engine;
  // exponential distribution with mean 1
  std::exponential_distribution<> distribution(1);
  for (int i(0); i<100; ++i) 
    std::cout << distribution(engine) << '\n';
}

C++11/14 for scientific computing III

Mathematical functions

C++11/14 introduces several new mathematical functions, which are all overloaded for the types float, double and long double and that are defined in the header file cmath. The following table summerizes the new functions.

Basic operations
remainder signed remainder of the division operation
remquo signed remainder as well as the three last bits of the division operation
fma fused multiply-add operation
fmax larger of two floating point values
fmin smaller of two floating point values
fdim positive difference of two floating point values ($\max(0, x-y)$)
nan, nanf, nanl not-a-number (NaN)
Exponential functions
exp2 returns $2$ raised to the given power ($2^x$)
expm1 returns e raised to the given power, minus one ($\exp x -1$)
log2 base 2 logarithm of the given number ($\log_2x$)
log1p natural logarithm of 1 plus the given number ($\ln(1+x)$)
Power functions
cbrt computes cubic root ($\sqrt[3]{x}$)
hypot computes square root of the sum of the squares of two given numbers ($\sqrt{x^2+y^2}$)
Hyperbolic functions
asinh computes the inverse hyperbolic sine
acosh computes the inverse hyperbolic cosine
atanh computes the inverse hyperbolic tangent
Error and gamma functions
erf error function
erfc complementary error function
tgamma gamma function
lgamma natural logarithm of the gamma function
Nearest integer floating point operations
trunc nearest integer not greater in magnitude than the given value
round, lround, llround nearest integer, rounding away from zero in halfway cases
nearbyint nearest integer using current rounding mode
rint, lrint, llrint nearest integer using current rounding mode with exception if the result differs
Floating point manipulation functions
scalbn, scalbln multiplies a number by FLT_RADIX raised to a power
ilogb extracts exponent of the number
logb extracts exponent of the number
nextafter, nexttoward next representable floating point value towards the given value
copysign copies the sign of a floating point value
Classification and comparison
fpclassify categorizes the given floating point value
isfinite checks if the given number has finite value
isinf checks if the given number is infinite
isnan checks if the given number is NaN
isnormal checks if the given number is normal
signbit checks if the given number is negative
isgreater checks if the first floating-point argument is greater than the second
isgreaterequal checks if the first floating-point argument is greater or equal than the second
isless checks if the first floating-point argument is less than the second
islessequal checks if the first floating-point argument is less or equal than the second
islessgreater checks if the first floating-point argument is less or greater than the second
isunordered checks if two floating-point values are unordered

Minimum maximum functions

The header file algorithm defines template functions for determining the minimum or the maximum of two comparable objects of the same type, e.g., numbers. In C++11 overloaded versions of these functions have been introduced that allow to determine the minimum or the maximum of an arbitrary number of objects, which are passed via an initializer list. Furthermore, minmax determines both the minimum and the maximum in a single scan. The following code gives an illustrative example.

#include <iostream>
#include <algorithm>
 
int main() {
  int i0(0), i1(1), i2(2), i3(3), i4(4);
  // determine minimum and maximum seperately
  std::cout << "min : " << std::min({i3, i1, i2, i0, i4}) << '\n'
	    << "max : " << std::max({i3, i1, i2, i0, i4}) << '\n';
  // determine minimum and maximum in a single scan
  auto min_max=std::minmax({i3, i1, i2, i0, i4});
  std::cout << "min : " << min_max.first << '\n'
	    << "max : " << min_max.second << '\n';
}

C++11/14 for scientific computing II

Complex numbers

The template class std::complex and functions for complex numbers (defined in the header file complex) have been extended in C++11/14. The new function std::proj returns the projection of the complex number $z$ onto the Riemann sphere. The functions std::asin, std::acos and std::atan calculate the inverse of the sine, cosine and tangent for complex arguments and similarly std::ahsin, std::ahcos and std::ahtan compute the inverses of the complex hyperbolic trigonometric functions. The member functions real and imag are overloaded in C++11. In C++11 it is not only possible to get the real and the imaginary part of a complex number, the real and the imaginary may now also be set by these functions as illustrated below.

#include <iostream>
#include <complex>
 
int main() {
  std::complex<double> z;
  // set real and imaginary part
  z.real(1);
  z.imag(2);
  // get real and imaginary part
  std::cout << "z = " << z.real() << " + " << z.imag() << "i\n";
}

C++14 introduces the literals if, i and id, which represent pure imaginary numbers in single, double and extended precision, respectively. These literals are declared in the inline namespace std::literals::complex_literals and make complex expressions in source code more easy to write and read as the following example shows.

#include <iostream>
#include <complex>
 
int main() {
  using namespace std::literals;
 
  double pi=std::acos(-1.);
  std::complex<double> z=std::exp(1i*pi); // Euler's formula
  std::cout << "exp(i, pi) = " << z << '\n';
}

C++11/14 for scientific computing I

In 2011 a new standard for the C++ programming language has been published, which is commonly referred to as C++11. This new standard introduces many new language features and standard library functions. Many of these new features have been introduced to make C++ more flexible and to make C++ more easy to use.  Bjarne Stroustrup (the creator of C++) even thinks «C++11 feels like a new language: The pieces just fit together better than they used to and I find a higher-level style of programming more natural than before and as efficient as ever.» This year another new version of C++ (C++14) has been defined, which brings some minor enhancements and clarifications compared to C++11. As C++ is a general purpose programing language the new features of the revised versions are not specifically designed for the needs of scientific computing. Nevertheless C++11/14 adds several new tools to the computational scientist’s toolbox. In the following I will present some of them.

New integer types

C++ does not define which are the minimal and maximal values that the integer types int and long can hold. The language standard requires only lower/upper bounds on these values. As a consequence int may be a 16-bit, 32-bit or a 64-bit integer or even something different. In C++11/14 the standard header file cstdint defines several new integer types with specific bit-width.

int8_t
int16_t
int32_t
int64_t
signed integer type with width of
exactly 8, 16, 32 and 64 bits respectively
with no padding bits and using 2’s complement for negative values
(provided only if the implementation directly supports the type)
int_fast8_t
int_fast16_t
int_fast32_t
int_fast64_t
fastest signed integer type with width of
at least 8, 16, 32 and 64 bits respectively
int_least8_t
int_least16_t
int_least32_t
int_least64_t
smallest signed integer type with width of
at least 8, 16, 32 and 64 bits respectively
intmax_t maximum width integer type
intptr_t integer type capable of holding a pointer
uint8_t
uint16_t
uint32_t
uint64_t
unsigned integer type with width of
exactly 8, 16, 32 and 64 bits respectively
(provided only if the implementation directly supports the type)
uint_fast8_t
uint_fast16_t
uint_fast32_t
uint_fast64_t
fastest unsigned integer type with width of
at least 8, 16, 32 and 64 bits respectively
uint_least8_t
uint_least16_t
uint_least32_t
uint_least64_t
smallest unsigned integer type with width of
at least 8, 16, 32 and 64 bits respectively
uintmax_t maximum width unsigned integer type
uintptr_t unsigned integer type capable of holding a pointer

The header file cstdint defines also several macros for the maximal and minimal values of the integer types shown above. These values, however, are more conveniently accessed through the template class std::numeric_limits.

Numeric limits

The class template std::numeric_limits, which is defined in the header file limits provides a standardized way to query various properties of arithmetic types (e.g., the largest possible value for type int is std::numeric_limits<int>::max). This information is provided via specializations of the numeric_limits template. Since C++1 the members of std::numeric_limits are declared as static constexpr. Thus, their return values can be consumed by operations that require constant expressions, such as an integer template argument. Furthermore, the new members max_digits10 and lowest have been introduced in C++11, which give the number of decimal digits necessary to differentiate all values of this type and the lowest finite value of the given type.

Integer overflow

YouTube Error: An integer overflow occurred.

On YouTube a video has been watched more than 2,147,483,647 times, a number that can not been represented by a (signed) 32-bit integer.

Calculating the Hermite functions

The Hermite functions appear as the solutions of the quantum mechanical harmonic oscillator. But they have applications in many other fields and applications, e.g., pseudospectral methods. The Hermite functions $h_n(x)$ are defined as
\begin{equation}
\label{eq:h}
h_n(x) = \frac{1}{\sqrt{\sqrt{\pi} 2^n n!}} \mathrm{e}^{-x^2/2} H_n(x) \,,
\end{equation} where $H_n(x)$ denotes the $n$th Hermite polynomial defined via the recurrence relation
\begin{equation}
H_{n}(x) = 2xH_{n-1}(x)-2(n-1)H_{n-2}(x)
\end{equation} with the initial values $H_0(x)=1$ and $H_1(x)=2x$. Although the Hermite functions are quite well behaved, they are rather difficult to calculate especially for large $n$ and/or large $x$.

Calculating the Hermite functions via the definition \eqref{eq:h} fails easily due to numerical overflow or underflow that is caused by the rapid growth or decrease of the individual factors $\frac{1}{\sqrt{\sqrt{\pi} 2^n n!}} \mathrm{e}^{-x^2/2}$ and $H_n(x)$. A partial solution to this problem is to introduce the modified Hermite polynomials defined via the recurrence relation
\begin{equation}
\label{eq:h2}
\tilde H_{n}(x) = \sqrt{\frac{2}{n}}x\tilde H_{n-1}(x)-2\sqrt{\frac{n-1}{n}}\tilde H_{n-2}(x)
\end{equation} with the initial values $\tilde H_0(x)=1/\sqrt[4]{\pi}$ and $\tilde H_1(x)=\sqrt{2}\,x/\sqrt[4]{\pi}$. With these polynomials the Hermite functions become
\begin{equation}
h_n(x) = \mathrm{e}^{-x^2/2} \tilde H_n(x) \,.
\end{equation} Because the modified Hermite polynomials $\tilde H_{n}(x)$ grow much more slowly than the standard Hermite polynomials we got rid of the normalizing factor $\frac{1}{\sqrt{\sqrt{\pi} 2^n n!}}$.

But this does not solve all problems of underflow and overflow. For $x>38$ the factor $\mathrm{e}^{-x^2/2}$ is smaller than any number that can be represented as a double precision floating point number. A very robust remedy to numerical underflow was sketched in BIT Numerical Mathematics, Vol. 49, pp 281-295. The basic idea is to keep the magnitude of the modified Hermite polynomials on the order of one during their calculation via the recurrence relation \eqref{eq:h2} by introducing a suitable normalizing factor. During the calculation one has to keep track of the sum of the logarithms of these normalizing factors. Then the factor $\mathrm{e}^{-x^2/2}$ has to be modified by this sum when the value of the Hermite function is calculated. A Python implementation of this algorithm is shown below. This algorithm allows to calculate the values of high-order Hermite functions even for quite large arguments.

from numpy import *
from pylab import *
 
def h(n, x):
    if n==0:
        return ones_like(x)*pi**(-0.25)*exp(-x**2/2)
    if n==1:
        return sqrt(2.)*x*norm*pi**(-0.25)*exp(-x**2/2)
    h_i_2=ones_like(x)*pi**(-0.25)
    h_i_1=sqrt(2.)*x*pi**(-0.25)
    sum_log_scale=zeros_like(x)
    for i in range(2, n+1):
        h_i=sqrt(2./i)*x*h_i_1-sqrt((i-1.)/i)*h_i_2
        h_i_2, h_i_1=h_i_1, h_i
        log_scale=log(abs(h_i)).round()
        scale=exp(-log_scale)
        h_i=h_i*scale
        h_i_1=h_i_1*scale
        h_i_2=h_i_2*scale
        sum_log_scale+=log_scale
    return h_i*exp(-x**2/2+sum_log_scale)

The Hermite function $h_{800}(x)$. A direct calculation via the definition \eqref{eq:h2} of the Hermite function in terms of the modified Hermite polynomials $\tilde H_n(x)$ fails for $x>38$ due to numerical underflow as shown in the upper part. The shown Python code, which avoids underflow, has been utilized to produce the lower part.

The Hermite function $h_{800}(x)$. A direct calculation via the definition \eqref{eq:h2} of the Hermite function in terms of the modified Hermite polynomials $\tilde H_n(x)$ fails for $x>39$ due to numerical underflow as shown in the upper part. The shown Python code, which avoids underflow, has been utilized to produce the lower part.

I would like to thank Randolf Beerwerth for drawing my attention to this problem.