
19 Mar 2025
In today’s fast-paced world we expect things to work, and to work efficiently. We all hate waiting for a slow system to respond or return results, especially when we have tonnes of other things to do. One way to improve performance is through code optimisation.
In simple terms, code optimisation means improving code so that it executes faster, in some cases much faster, and more efficiently. This could mean reducing memory usage to handle larger datasets more smoothly, speeding up calculations, or simply improving readability so the code is easier to maintain, all while reducing its carbon footprint.
It is about small tweaks that can save you hours or days down the line.
Python is widely used in the scientific community due to its relative ease of use and extensive libraries. However, let’s face it, it can run slower than compiled languages such as C++. When performing mathematical transformations, especially on large datasets, speed can become a bottleneck, particularly if you have to iterate over a NumPy array’s elements one at a time in plain Python. Despite this, Python remains one of the most popular choices for scientific computing.
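As a minimal illustration (the array size and the operation here are arbitrary), squaring every element of a large array with a plain Python loop is far slower than applying the same operation to the whole array in one vectorised call:

```python
import numpy as np

data = np.arange(1_000_000, dtype=np.float64)

# Slow: a plain Python loop visits each element one at a time
squared_loop = np.empty_like(data)
for i in range(data.size):
    squared_loop[i] = data[i] ** 2

# Fast: the same operation vectorised over the whole array in one call
squared_vectorised = data ** 2
```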
So, what can we do to make a model written in Python run faster?
On a standard laptop you will probably have around 8 CPU cores, and the speed at which calculations are processed depends on these cores. For normal day-to-day use this is just fine, but imagine you need to run a model with a large amount of data, or iterate over a large dataset; then GPU computing might be a much better option.
CPUs (Central Processing Units) let you run several different processes at once, in the example above 8 at a time. GPUs (Graphics Processing Units), on the other hand, work a bit differently. A single GPU can have thousands of cores, but the processes carried out on those cores have to be the same. So if you wanted to process two different calculations at the same time, say 1*2 and 1+2, a GPU won’t help.
What GPUs excel at is doing the same calculation with different values. With a GPU you can compute 1*2, 1*3 and 1*4 in parallel because it is the same calculation with different inputs, i.e. x*n, where in our example x is always 1 and n is the changing value, which can run into millions of values, and all of these calculations can be carried out at the same time. So for large models that apply the same calculation to many different variable values, GPU computing works very well.
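Sketched in NumPy (a minimal illustration; the values mirror the x*n example above), this “same calculation, different values” pattern is a single elementwise operation over the whole array of n values; on a GPU, each of these elements can be handled by its own thread:

```python
import numpy as np

x = 1.0
n = np.arange(1, 1_000_001)   # n = 1, 2, 3, ... up to a million different values
result = x * n                # one multiply applied to every value of n at once
```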
Now let’s imagine that you have some Python code that you would like to optimise. This is where Numba comes in. Numba is a Python library that performs just-in-time (JIT) compilation: it translates Python functions into fast machine code at run time, either for the CPU or, through its CUDA support, directly for the GPU (unlike standard NumPy operations, which run on the CPU), in order to reduce processing time and drastically speed up computations.
Basically, what you are doing differently is that computations are no longer run in sequence, as they would be on a CPU, but in parallel. The GPU runs thousands of calculations simultaneously, and that is where you start to see a huge difference in execution performance.
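As a minimal sketch (assuming a CUDA-capable GPU and the numba package installed; the array size and the value of x are arbitrary), the x*n example from earlier could be written as a Numba CUDA kernel, where each GPU thread computes one element of the result:

```python
import numpy as np
from numba import cuda

@cuda.jit
def multiply_kernel(x, n, out):
    # Each GPU thread computes one element: out[i] = x * n[i]
    i = cuda.grid(1)
    if i < n.size:
        out[i] = x * n[i]

n = np.arange(1, 1_000_001, dtype=np.float64)   # millions of different values of n
out = np.empty_like(n)

threads_per_block = 256
blocks_per_grid = (n.size + threads_per_block - 1) // threads_per_block
multiply_kernel[blocks_per_grid, threads_per_block](1.0, n, out)  # x = 1, as in the example above
```

On machines without a GPU, Numba’s @njit decorator applies the same JIT-compilation idea on the CPU, which is often already a large speed-up over plain Python loops.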
Scientists and engineers all over the world use GPUs for data-driven modelling because the same equation is computed with different variable values, simultaneously. This could be a key step in an optimisation algorithm, or simply in solving your model, as in most cases you will be working with thousands or millions of calculations. Carrying out all of these calculations, multiple times over, would take an unfeasibly long time without GPUs. LLMs (Large Language Models), such as ChatGPT, were trained in just this way: handling vast numbers of computations at the same time.
With Digital Twins, for instance, where virtual models replicate real-world scenarios, scientists and engineers can test a huge number of scenarios in a safe manner. Take the exploration of space: space scientists and engineers must be almost certain that everything will go as planned. Without such virtual environments, real-world experimentation could be catastrophic.
