But, let’s first answer a basic question: What is the difference between CPython and Cython?
CPython is Python’s default interpreter.
What we commonly use as Python is written in the C language and is widely available. Did you know there are other implementations of Python as well?
IronPython – Python written in C# (for .NET)
Jython – Python written in Java
RustPython – Written in Rust
PyPy – Written in a subset of Python called RPython. Known for its speed enhancements.
Brython – Written in JavaScript for client-side web programming.
The syntax for all these implementations is common and is the same as the Python we use every day. Interesting, isn’t it?
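If you are curious which implementation is running your own code, the standard library can report it. This is a minimal sketch; the variable name impl is only for illustration.

```python
import platform
import sys

# Name of the running implementation, e.g. "CPython" or "PyPy"
impl = platform.python_implementation()
print(impl)

# The same information in lowercase form, plus the language version
print(sys.implementation.name, platform.python_version())
```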
However, unlike CPython, Cython is not an interpreter. It translates Python code into C and compiles it into an extension module, so the compiled function can be called directly as native code instead of being interpreted step by step.
Let’s now see how to Cythonize Python code in a Jupyter Notebook environment.
To benchmark, let’s first write a simple for-loop in plain Python and measure how long it takes to run.
1. Define and time a Python Function to benchmark
Let’s create a simple function and measure how long it takes to execute in Python.
import time
def somefunc(K):
    accum = 0
    for i in range(K):
        if i % 5:
            accum = accum + i
    return accum
Measure the time.
t1 = time.time()
somefunc(20000000)
t2 = time.time()
t = t2-t1
print("%.10f" % t, "seconds")
3.0396461487 seconds
So, it takes about 3.04 seconds.
Let’s now try running the function with Cython and see if we gain some speed.
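A single time.time() measurement can be noisy, since other processes on the machine affect it. As a hedged alternative, the sketch below uses the standard timeit module and keeps the fastest of a few runs. The helper name best_of and the smaller value of K are assumptions made so the demo finishes quickly.

```python
import timeit

def somefunc(K):
    accum = 0
    for i in range(K):
        if i % 5:
            accum = accum + i
    return accum

def best_of(func, arg, repeat=3):
    """Return the fastest of `repeat` single-call timings, in seconds."""
    timings = timeit.repeat(lambda: func(arg), number=1, repeat=repeat)
    return min(timings)

# Smaller K than in the article so the demo runs quickly
elapsed = best_of(somefunc, 200_000)
print("%.10f" % elapsed, "seconds")
```

The same helper works unchanged for any of the compiled functions later in this post, because they are called like ordinary Python functions.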
2. How to run Python using Cython in Jupyter Notebook
We can do this in 4 simple steps:
Step 1: Install the cython package.
!pip install cython
Collecting cython
Downloading Cython-3.0.3-cp311-cp311-win_amd64.whl (2.8 MB)
Installing collected packages: cython
Successfully installed cython-3.0.3
Step 2: Load the cython extension.
%load_ext cython
For Windows: Microsoft Visual C++ 14.0 or greater is required. Get it with “Microsoft C++ Build Tools”: https://visualstudio.microsoft.com/visual-cpp-build-tools/
Step 3: Add the Cython magic at the beginning of the cell where you want Cython to compile the Python code.
%%cython -a
def somefunc_cy(K):
    accum = 0
    for i in range(K):
        if i % 5:
            accum = accum + i
    return accum
Content of stdout:
_cython_magic_8fbc48008287a01d5af77cea06745f284b6e6aa6.c
Creating library C:\Users\Akash\.ipython\cython\Users\Akash\.ipython\cython\_cython_magic_8fbc48008287a01d5af77cea06745f284b6e6aa6.cp311-win_amd64.lib and object C:\Users\Akash\.ipython\cython\Users\Akash\.ipython\cython\_cython_magic_8fbc48008287a01d5af77cea06745f284b6e6aa6.cp311-win_amd64.exp
Generating code
Finished generating code
Generated by Cython 3.0.3
Yellow lines hint at Python interaction.
Click on a line that starts with a “+” to see the C code that Cython generated for it.
1:
+2: def somefunc_cy(K):
+3: accum = 0
+4: for i in range(K):
+5: if i % 5:
+6: accum = accum + i
+7: return accum
The new function is now ready to run.
Step 4: Everything is set. We can now run the code.
Since Cython has already compiled the somefunc_cy function, we don’t have to add the %%cython -a magic to this cell. The somefunc_cy function can be called like any other Python function.
t1 = time.time()
somefunc_cy(20000000)
t2 = time.time()
t = t2-t1
print("%.10f" % t, "seconds")
1.7662577629 seconds
Notice that we made absolutely no change to the body of the original function, yet the run time dropped by roughly 40% (from about 3.04 to 1.77 seconds) just by using the %%cython -a magic command.
3. Let’s cythonize the function
You can bring in further improvements in the code run time by defining the data type of the variables used.
The type of the accum variable is unsigned long long int.
Why is this so?
Integer because the sum of integers is itself an integer. Unsigned because the sum can never be negative.
And long long? Because the sum of all numbers can be very large. long long is guaranteed to be at least 64 bits wide, which gives the variable enough room to hold it.
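To see why a wide type is needed, the expected result can be worked out in closed form and compared against the 32-bit and 64-bit unsigned limits. This is only an illustrative sketch. The helper name expected_sum is hypothetical, and it relies on the standard formula for the sum of the first n integers.

```python
def expected_sum(K):
    """Sum of all i in range(K) that are not multiples of 5."""
    total = K * (K - 1) // 2           # 0 + 1 + ... + (K - 1)
    m = (K - 1) // 5                   # number of positive multiples of 5 below K
    multiples = 5 * m * (m + 1) // 2   # 5 + 10 + ... + 5*m
    return total - multiples

result = expected_sum(20_000_000)
print(result)

# Compare against the largest values of 32-bit and 64-bit unsigned integers
print(result > 2**32 - 1)
print(result <= 2**64 - 1)
```

For K = 20,000,000 the sum is 160,000,000,000,000, which is far too large for a 32-bit integer but fits comfortably in 64 bits.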
%%cython -a
cpdef unsigned long long int somefunc_cy2(long int K):
    cdef unsigned long long int accum = 0
    cdef long int i

    for i in range(K):
        if i % 5:
            accum = accum + i
    return accum
Content of stdout:
_cython_magic_ce5f40fea156989a1abbf0f5aee20729e3b85c20.c
Creating library C:\Users\Akash\.ipython\cython\Users\Akash\.ipython\cython\_cython_magic_ce5f40fea156989a1abbf0f5aee20729e3b85c20.cp311-win_amd64.lib and object C:\Users\Akash\.ipython\cython\Users\Akash\.ipython\cython\_cython_magic_ce5f40fea156989a1abbf0f5aee20729e3b85c20.cp311-win_amd64.exp
Generating code
Finished generating code
Generated by Cython 3.0.3
Yellow lines hint at Python interaction.
Click on a line that starts with a “+” to see the C code that Cython generated for it.
1:
+2: cpdef unsigned long long int somefunc_cy2(long int K):
+3: cdef unsigned long long int accum = 0
4: cdef long int i
5:
+6: for i in range(K):
+7: if i % 5:
+8: accum = accum + i
+9: return accum
Cython has generated more C code for this. Let’s see if there is any improvement.
t1 = time.time()
somefunc_cy2(20000000)
t2 = time.time()
t = t2-t1
print("%.10f" % t, "seconds")
0.0721337795 seconds
It takes less than a 10th of a second now.
That’s the difference Cython can make just by adding the %%cython -a magic at the beginning of the cell and declaring the data types.
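To put the three timings reported in this post side by side, the speed-up factors can be computed directly. The variable names below are illustrative, and the numbers are the ones measured above. Your own timings will differ from machine to machine.

```python
# Timings reported in this post, in seconds
t_python = 3.0396461487   # plain Python somefunc
t_cython = 1.7662577629   # somefunc_cy, compiled without type declarations
t_typed = 0.0721337795    # somefunc_cy2, compiled with C type declarations

speedup_untyped = t_python / t_cython
speedup_typed = t_python / t_typed

print(f"untyped Cython: {speedup_untyped:.1f}x faster")
print(f"typed Cython:   {speedup_typed:.1f}x faster")
```

On these numbers, compiling alone gives roughly a 1.7x speed-up, while adding the type declarations brings it to roughly 42x.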