All versions of Python available on NeSI platforms are owned and
licensed by the Python Software Foundation. Each version is released
under a specific open-source licence. The licences are available on
the Python documentation server.
Our operating systems include Python, but not an up-to-date version, so
we strongly recommend that you load one of our Python environment
modules instead. These include optimised builds of the most popular
Python packages for computational work, such as numpy, scipy,
matplotlib, and many more.
- multiprocessing.cpu_count() is patched to return only the number of
  CPUs available to the process, which in a Slurm job can be fewer
  than the number of CPUs on the node.
- PYTHONUSERBASE is set to a path that includes the toolchain, so that
  incompatible builds of the same version of Python don't attempt to
  share user-installed libraries.
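For example, a minimal sketch of how to compare the two counts yourself on Linux (os.sched_getaffinity is Linux-only; outside a Slurm job the two numbers will usually match):

```python
import os

# CPUs this process is actually allowed to run on (respects the
# affinity limits Slurm imposes on the job)
available = len(os.sched_getaffinity(0))

# CPUs physically present on the node
total = os.cpu_count()

print(available, total)
```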
#!/bin/bash -e
#SBATCH --job-name=PythonMPI
#SBATCH --account nesi99999
#SBATCH --ntasks=2 # Number of MPI tasks
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=512MB # Memory per logical CPU
module load Python
srun python PythonMPI.py # Executes ntasks copies of the script
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()  # Total number of MPI tasks
rank = comm.Get_rank()  # Rank of this MPI task

# Calculate the data (numbers 0-9) on the MPI ranks
rank_data = np.arange(rank, 10, size)

# Perform some operation on the rank's data
rank_data += 1

# Gather the data back to rank 0
data_gather = comm.gather(rank_data, root=0)

# On rank 0, sum the gathered data and print both the sum
# and the unsummed data
if rank == 0:
    print('Gathered data:', data_gather)
    print('Sum:', sum(data_gather))
The above Python script creates a list of numbers (0-9) split between
the MPI tasks (ranks). Each task adds one to the numbers it holds;
the numbers are then gathered back to task 0, which prints both the
gathered data and its sum.
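To see how np.arange(rank, 10, size) divides the numbers across tasks, here is a minimal sketch of the split for two tasks, using plain range() in place of NumPy and without MPI:

```python
size = 2  # number of MPI tasks

for rank in range(size):
    # Each rank takes every size-th number starting from its own rank:
    # rank 0 gets [0, 2, 4, 6, 8], rank 1 gets [1, 3, 5, 7, 9]
    rank_data = list(range(rank, 10, size))
    print(rank, rank_data)

# After each rank adds 1 to its numbers, the gathered data holds
# 1 through 10, so the final sum printed by rank 0 is 55
```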
#!/bin/bash -e
#SBATCH --job-name=PythonMultiprocessing
#SBATCH --account nesi99999
#SBATCH --cpus-per-task=2 # Number of logical CPUs
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=512MB # Memory per logical CPU
module load Python
python PythonMultiprocessing.py
import multiprocessing

def calc_square(numbers, result1):
    for idx, n in enumerate(numbers):
        result1[idx] = n * n

def calc_cube(numbers, result2):
    for idx, n in enumerate(numbers):
        result2[idx] = n * n * n

if __name__ == "__main__":
    numbers = [2, 3, 4]
    # Sets up the shared memory variables, allowing the variables to be
    # accessed globally across processes
    result1 = multiprocessing.Array('i', 3)
    result2 = multiprocessing.Array('i', 3)
    # Set up the processes
    p1 = multiprocessing.Process(target=calc_square, args=(numbers, result1))
    p2 = multiprocessing.Process(target=calc_cube, args=(numbers, result2))
    # Start the processes
    p1.start()
    p2.start()
    # Wait for the processes to finish
    p1.join()
    p2.join()
    print(result1[:])
    print(result2[:])
The above Python script calculates the square and cube of an array
of numbers using multiprocessing and prints the results from outside
those processes, safely circumventing Python's global interpreter lock.
Job arrays can be handled using the Slurm environment variable
SLURM_ARRAY_TASK_ID as an array index. This index can be read directly
from within the script or passed as a command-line argument. Both
options are presented below:
The job scripts calling both examples:
#!/bin/bash -e
#SBATCH --job-name test
#SBATCH --account nesi99999
#SBATCH --time 00:01:00
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --array 1-2 # Array jobs
module load Anaconda3
echo "SLURM_ARRAY_TASK_ID.$SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT"
#env variable in python
python hello_world.py
#as command line argument
python hello_world_args.py -ID $SLURM_ARRAY_TASK_ID
The version reading the environment variable inside the Python script,
hello_world.py:
#!/usr/bin/env python3
import os

my_id = os.environ['SLURM_ARRAY_TASK_ID']
print("hello world with ID {}".format(my_id))
The version receiving the index as a command-line argument,
hello_world_args.py:
#!/usr/bin/env python3
"""Module for handling input arguments"""
import argparse

# get tests from file
class LoadFromFile(argparse.Action):
    """ class for reading arguments from file """
    def __call__(self, parser, namespace, values, option_string=None):
        with values as F:
            vals = F.read().split()
        setattr(namespace, self.dest, vals)

def get_args():
    """ Definition of the input arguments """
    parser = argparse.ArgumentParser(description='Hello World')
    parser.add_argument('-ID', type=int, action='store', dest='my_id',
                        help='Slurm ID')
    return parser.parse_args()

ARGS = get_args()
print("hello world from ID {}".format(ARGS.my_id))
Programmers around the world have written and released many packages for
Python, which are not included with the core Python distribution and
must be installed separately. Each Python environment module comes with
its own particular suite of packages, and the system Python has its own
installed packages.
If you are working on multiple projects, this method will cause issues
as your projects may require different versions of packages which are
not compatible.
We strongly recommend using a separate Python virtual environment for
each project, to isolate dependencies between projects, avoid filling
your home space, and make it easy to share an installation with
collaborators.
Installing packages in a Python virtual environment
A Python virtual environment is a lightweight way to create an
environment containing the specific packages for a project, without
interfering with the global Python installation. Each virtual
environment is a separate directory.
To create a Python virtual environment, use the venv module as follows
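For example, a minimal sketch (my_env is a placeholder name, and the exact environment module name may differ on your platform):

```shell
module load Python          # load a NeSI Python environment module
python -m venv my_env       # create the virtual environment directory
source my_env/bin/activate  # activate it: python/pip now point here
pip install numpy           # install packages into the environment
```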
By default, Python virtual environments are fully isolated from the
system installation. This means that you will not be able to access
packages already prepared by NeSI in the corresponding Python
environment module.
To avoid this, use the --system-site-packages option when creating the
virtual environment.
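For example (my_env is a placeholder name):

```shell
# Create a virtual environment that can also see the packages provided
# by the currently loaded Python environment module
python -m venv --system-site-packages my_env
```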
A downside of this is that your virtual environment will then also find
packages installed in your $HOME folder. To avoid this behaviour, make
sure to export PYTHONNOUSERSITE=1 before calling pip or running a
Python script, for example in a Slurm job submission script.
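A sketch of such a submission script, following the pattern of the scripts above (the job name, environment name and script name are placeholders):

```shell
#!/bin/bash -e
#SBATCH --job-name=venv_job
#SBATCH --account nesi99999
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=512MB
module load Python
# Stop Python from also picking up packages installed under $HOME
export PYTHONNOUSERSITE=1
source my_env/bin/activate
python my_script.py
```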
You can use IPython to list the functions available that start with a
given string. Please note that if the string denotes a module (i.e., it
has a full stop somewhere in it), that module (or the function you want
from it) must first be imported, using either an "import X" statement or
a "from X import Y" statement.
import os
os.<TAB>    # List all functions in the os module
os.O_<TAB>  # List functions starting with "O_" from the os module
len<TAB>    # List functions starting with "len"
Note that you can't simply type
o<TAB>
or even
os<TAB>
and expect to see the methods and values provided by the os module - you
have to put the full stop after the "os" if you want to do that.
You can also append a question mark to get help on functions (len?),
methods (os.mkdir?) and modules (os.path?). If you try this on
something that isn't defined yet, Python will tell you that the object
in question couldn't be found.