Recently, OCC completed a project converting MATLAB code to FORTRAN 95, which presented us with some interesting problems because the two languages treat variables, functions and matrices differently. This article describes how we solved those problems.
The most noticeable difference between the two languages is that MATLAB is dynamically typed, while all variables in FORTRAN are declared at the top of the scope.
In FORTRAN, variables must be declared before use. A major challenge of converting from a dynamically typed language to a statically typed one is working out which variables need to be declared. Using MATLAB, it is possible to see all the variables that are in scope and their contents, but using this as the primary source of variable definitions is time consuming. The solution used during our project required collaboration from the author of the MATLAB code, who added declarative comments to the top of every MATLAB function. This method was much faster, because the code author already knows the type and purpose of each variable.
In our system, the variable declarations are identified by a comment beginning with a #. The next character denotes the type (i for integer, f for float/double or s for string). This can be followed by an integer representing the number of array dimensions. The first word is the variable name, which may be followed by extra text for a comment. During the conversion, a simple find and replace can be used to turn these lines into FORTRAN declarations.
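As a rough illustration, annotations of this kind might map onto FORTRAN declarations as sketched here. The exact comment form (a MATLAB comment written as %# followed by the type character), the variable names and the kind constant ki are assumptions made for this example, not the project's actual code.

```fortran
program declaration_sketch
  implicit none
  ! Hypothetical kind constants (the name ki is an assumption).
  integer, parameter :: ki = selected_int_kind(9)         ! at least 32-bit integer
  integer, parameter :: kd = selected_real_kind(15, 307)  ! double precision real

  ! Assumed MATLAB annotation:  %#i  n        number of samples
  integer(kind=ki) :: n
  ! Assumed MATLAB annotation:  %#f1 weights  one weight per sample
  real(kind=kd), dimension(:), allocatable :: weights
  ! Assumed MATLAB annotation:  %#f2 a        coefficient matrix
  real(kind=kd), dimension(:, :), allocatable :: a
  ! Assumed MATLAB annotation:  %#s  label    run label
  character(len=80) :: label

  n = 3_ki
  allocate(weights(n), a(n, n))
  weights = 1.0_kd
  a = 0.0_kd
  label = 'example run'
  print *, trim(label), n, sum(weights)
  deallocate(weights, a)
end program declaration_sketch
```

Because each annotation carries the type, the rank and the name, every declaration line can be produced mechanically; only the array extents still have to be decided by hand.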
Functions and intent
In MATLAB code, the function declaration has two separate lists of variables: inputs on the right and outputs on the left. FORTRAN has no such system, except for functions that return a single output. Instead, each argument has a declared intent. There are three types of intent: in, out and inout. Pure inputs are intent(in), pure outputs are intent(out), and intent(inout) is used when the value passed in may be modified by the function.
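A minimal sketch of how a MATLAB function with several outputs could become a FORTRAN subroutine with declared intents. The routine name scale_and_sum and its arguments are hypothetical and chosen only for illustration.

```fortran
module intent_sketch
  implicit none
  integer, parameter :: kd = selected_real_kind(15, 307)
contains
  ! Hypothetical MATLAB original: function [total, x] = scale_and_sum(x, factor)
  subroutine scale_and_sum(x, factor, total)
    real(kind=kd), dimension(:), intent(inout) :: x  ! passed in and modified
    real(kind=kd), intent(in)  :: factor             ! pure input, read only
    real(kind=kd), intent(out) :: total              ! pure output
    x = x * factor
    total = sum(x)
  end subroutine scale_and_sum
end module intent_sketch

program intent_demo
  use intent_sketch
  implicit none
  real(kind=kd) :: v(3), s
  v = (/ 1.0_kd, 2.0_kd, 3.0_kd /)
  call scale_and_sum(v, 2.0_kd, s)
  print *, v   ! v has been scaled in place
  print *, s   ! sum of the scaled values
end program intent_demo
```

Declaring the intent also lets the compiler reject accidental writes to an intent(in) argument, which is a useful check during a conversion.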
Variable types and kinds
Variable types in FORTRAN are self-explanatory, but the kind keyword is less obvious. The kind tells the compiler what size of integer or real to use. For example, an integer could be 8-bit, 16-bit, 32-bit or 64-bit. To keep the code consistent, we declare kind constants, such as kd, to denote the kinds of integers and reals. Our definitions set the integer to have 4 bytes (32-bit) and the real to be a double.
Defining kinds in such a way is also vital when using literal constants. By attaching _kd to a literal double, for example 0.1_kd, we ensure that it is treated as a double. We found that not doing this caused the code to treat the literal as single precision, which leads to rounding errors and ultimately produces a different answer from the original MATLAB code.
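The effect of the kind suffix can be seen in a short sketch like the one below. The constant name ki and the use of selected_int_kind and selected_real_kind are assumptions about how such kinds might be defined; the project's own definitions may differ. The sketch also assumes a compiler whose default real is single precision, which is typical unless a compiler flag changes it.

```fortran
program kind_sketch
  implicit none
  ! Kind constants (ki is a hypothetical name for the integer kind).
  integer, parameter :: ki = selected_int_kind(9)         ! typically 4 bytes
  integer, parameter :: kd = selected_real_kind(15, 307)  ! typically 8-byte double
  integer(kind=ki) :: n_items
  real(kind=kd) :: with_suffix, without_suffix

  n_items = 10_ki
  with_suffix = 0.1_kd     ! literal is stored at double precision
  without_suffix = 0.1     ! literal is default real, then widened to double
  print *, n_items
  print *, with_suffix
  print *, without_suffix
  print *, with_suffix - without_suffix   ! non-zero when default real is single
end program kind_sketch
```

The difference printed on the last line is the rounding error that a missing suffix introduces into every calculation that uses the constant.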
Both languages treat arrays as matrices, allowing operations to be performed with them, or subsets of them. Referencing subsets of arrays has the same syntax in both languages, but there are many differences.
There are two different types of array: dynamic and static. Using static arrays, which have a fixed size, is ultimately faster and easier. FORTRAN is more flexible than some other languages here, in that the size of such an array can be supplied as an input parameter.
When this is not possible, dynamic arrays must be used. Dynamic arrays can be passed in and out of functions, but they must be allocated to the correct size before use, and it is important for the code calling the function to deallocate any dynamic arrays after use.
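The two kinds of array might look like this in practice. This is a sketch under assumed names (mean_of_squares, samples): the work array inside the function takes its size from the input argument n, while the caller owns a dynamic array that it allocates and later deallocates.

```fortran
module array_sketch
  implicit none
  integer, parameter :: kd = selected_real_kind(15, 307)
contains
  function mean_of_squares(n, x) result(m)
    integer, intent(in) :: n
    real(kind=kd), intent(in) :: x(n)
    real(kind=kd) :: m
    real(kind=kd) :: work(n)   ! sized from the input argument, no allocate needed
    work = x * x
    m = sum(work) / real(n, kd)
  end function mean_of_squares
end module array_sketch

program array_demo
  use array_sketch
  implicit none
  real(kind=kd), allocatable :: samples(:)   ! dynamic array owned by the caller
  integer :: n, i
  n = 4
  allocate(samples(n))        ! must be allocated before use
  do i = 1, n
     samples(i) = real(i, kd)
  end do
  print *, mean_of_squares(n, samples)
  deallocate(samples)         ! the caller cleans up after use
end program array_demo
```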
Both MATLAB and FORTRAN treat arrays as matrices and have arithmetic operators for matrices. However, this is not as simple as it initially seems. By default, operators in FORTRAN are element-wise, so A * B multiplies each element of A by the corresponding element of B. In MATLAB the operators are assumed to be matrix operations, so A * B is the matrix product. To perform a matrix multiplication in FORTRAN, the intrinsic function MatMul should be used.
This means that you must know your types whenever translating a multiplication. When the types are scalars, the * operator is the same in both languages, but when the types are both arrays, * becomes the function MatMul in FORTRAN.
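The difference between the two multiplications can be checked with a small sketch; the matrices here are arbitrary example values.

```fortran
program multiply_sketch
  implicit none
  integer, parameter :: kd = selected_real_kind(15, 307)
  real(kind=kd) :: a(2, 2), b(2, 2), c(2, 2)

  ! reshape fills column by column, so a is [1 2; 3 4] in MATLAB notation
  a = reshape((/ 1.0_kd, 3.0_kd, 2.0_kd, 4.0_kd /), (/ 2, 2 /))
  ! and b is [5 6; 7 8]
  b = reshape((/ 5.0_kd, 7.0_kd, 6.0_kd, 8.0_kd /), (/ 2, 2 /))

  c = a * b          ! element-wise product, MATLAB a .* b
  print *, c         ! printed in column order
  c = matmul(a, b)   ! matrix product, MATLAB a * b
  print *, c
end program multiply_sketch
```

Note that print lists array elements in column-major order, which is worth remembering when comparing output against MATLAB's row-by-row display.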
A greater worry is that matrix division is not natively present in FORTRAN. To perform matrix division in FORTRAN, you need to source your own algorithm, which is usually slower than the MATLAB equivalent.
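As one possible starting point, the sketch below solves a * x = b by Gaussian elimination with partial pivoting, which corresponds to MATLAB's left division x = a \ b for a square, non-singular matrix. It is an illustrative assumption, not the algorithm used in the project, and it performs no singularity check. For production use, a library routine such as LAPACK's dgesv is a common choice.

```fortran
module solve_sketch
  implicit none
  integer, parameter :: kd = selected_real_kind(15, 307)
contains
  ! Solves a * x = b for square, non-singular a (no singularity check).
  subroutine solve(n, a, b, x)
    integer, intent(in) :: n
    real(kind=kd), intent(in)  :: a(n, n), b(n)
    real(kind=kd), intent(out) :: x(n)
    real(kind=kd) :: w(n, n + 1), row(n + 1), factor
    integer :: i, k, p

    w(:, 1:n) = a            ! build the augmented matrix [a | b]
    w(:, n + 1) = b
    do k = 1, n - 1
       ! partial pivoting: bring the largest entry in column k to row k
       p = k - 1 + maxloc(abs(w(k:n, k)), dim=1)
       if (p /= k) then
          row = w(k, :)
          w(k, :) = w(p, :)
          w(p, :) = row
       end if
       ! eliminate column k from the rows below
       do i = k + 1, n
          factor = w(i, k) / w(k, k)
          w(i, k:n + 1) = w(i, k:n + 1) - factor * w(k, k:n + 1)
       end do
    end do
    ! back substitution on the upper triangular system
    do i = n, 1, -1
       x(i) = (w(i, n + 1) - dot_product(w(i, i + 1:n), x(i + 1:n))) / w(i, i)
    end do
  end subroutine solve
end module solve_sketch

program solve_demo
  use solve_sketch
  implicit none
  real(kind=kd) :: a(2, 2), b(2), x(2)
  a = reshape((/ 2.0_kd, 1.0_kd, 1.0_kd, 3.0_kd /), (/ 2, 2 /))
  b = (/ 3.0_kd, 5.0_kd /)
  call solve(2, a, b, x)
  print *, x                  ! solution of a * x = b
  print *, matmul(a, x) - b   ! residual, should be close to zero
end program solve_demo
```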
Concatenating matrices raises a similar problem. Since the language is dynamically typed, MATLAB reshapes the array to accommodate the new, larger matrix.
FORTRAN, however, needs to know the size of the array. In these situations, it is better to anticipate the total size of the array and allocate it. In the most extreme situations, it may be that the sizes of the matrices being concatenated are unknown. In this case the only solution is to allocate a new array and deallocate the old, which can be relatively slow.
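The allocate-and-copy fallback can be sketched as follows, using only FORTRAN 95 features. The variable names and the appended values are hypothetical.

```fortran
program grow_sketch
  implicit none
  integer, parameter :: kd = selected_real_kind(15, 307)
  real(kind=kd), allocatable :: a(:), tmp(:)
  integer :: n_old, n_new

  allocate(a(3))
  a = (/ 1.0_kd, 2.0_kd, 3.0_kd /)

  ! MATLAB equivalent: a = [a, 4, 5]
  n_old = size(a)
  n_new = n_old + 2
  allocate(tmp(n_old))       ! temporary copy of the old contents
  tmp = a
  deallocate(a)              ! release the old array
  allocate(a(n_new))         ! allocate at the new size
  a(1:n_old) = tmp
  a(n_old + 1:n_new) = (/ 4.0_kd, 5.0_kd /)
  deallocate(tmp)
  print *, a
  deallocate(a)
end program grow_sketch
```

Every growth step copies the data twice, which is why allocating the final size up front is preferable whenever it can be anticipated. Compilers that support Fortran 2003 offer the move_alloc intrinsic, which removes one of the two copies.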
Whilst there are many similarities between MATLAB and FORTRAN, converting from a dynamically typed language to a statically typed one is not a simple task, particularly when many common practices found in MATLAB code depend on the dynamic elements. Our project was successful, though, and what we have learned will make future conversions much more straightforward.