Models

To avoid unnecessary storage use, we maintain read-only versions of popular models in /opt/nesi/models. If one of these meets your needs, please use it rather than storing your own copy. If you need a model that is not listed here, please contact our Support Team with the model name, source, and a brief description of your use case.

Available models

| Model | Licence | Path | Slurm |
|---|---|---|---|
| Llama 3.1 | Meta Llama 3.1 | `/opt/nesi/models/gguf/llama3.1/llama3.1-8b.gguf` | `#SBATCH --gpus-per-node=l4:1` |
| | | `/opt/nesi/model/gguf/llama3.1/llama3.1-70b.gguf` | `#SBATCH --partition=milan`<br>`#SBATCH --gpus-per-node=a100:1` |
| DeepSeek-R1 | MIT | `/opt/nesi/model/gguf/deepseek-r1/deepseek-r1-7b.gguf` | `#SBATCH --gpus-per-node=l4:1` |
| | | `/opt/nesi/model/gguf/deepseek-r1/deepseek-r1-32b.gguf` | `#SBATCH --partition=genoa`<br>`#SBATCH --gpus-per-node=a100:1` |
| | | `/opt/nesi/model/gguf/deepseek-r1/deepseek-r1-70b.gguf` | `#SBATCH --partition=milan`<br>`#SBATCH --gpus-per-node=a100:1` |
| Qwen3 | Apache 2.0 | `/opt/nesi/model/gguf/qwen3/qwen3-14b.gguf` | `#SBATCH --gpus-per-node=l4:1` |
| | | `/opt/nesi/model/gguf/qwen3/qwen3-32b.gguf` | `#SBATCH --partition=genoa`<br>`#SBATCH --gpus-per-node=a100:1` |
| Qwen2.5 | Apache 2.0 | `/opt/nesi/model/gguf/qwen2.5/qwen2.5-7b.gguf` | `#SBATCH --gpus-per-node=l4:1` |
| | | `/opt/nesi/model/gguf/qwen2.5/qwen2.5-14b.gguf` | `#SBATCH --gpus-per-node=l4:1` |
| Gemma 3 | Gemma | `/opt/nesi/model/gguf/gemma3/gemma3-27b.gguf` | `#SBATCH --partition=genoa`<br>`#SBATCH --gpus-per-node=a100:1` |

The Slurm column shows the minimum GPU flags required for each model. Your actual throughput will depend on the queue size. See Hardware for a full list of available GPUs.
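As a rough sketch of how one entry from the list above might be used in a batch script, the example below takes the Llama 3.1 8B path and its GPU flag and checks that the shared file is readable before any inference tool is started. The job name, wall time, memory request, and the `model_is_readable` helper are illustrative assumptions, not site recommendations; substitute values that suit your workload.

```shell
#!/bin/bash
#SBATCH --job-name=llama-inference   # hypothetical job name
#SBATCH --gpus-per-node=l4:1         # GPU flag listed for this model
#SBATCH --time=00:15:00              # assumed wall time; adjust as needed
#SBATCH --mem=16G                    # assumed memory request; adjust as needed

# Path as listed above; the file is read in place, not copied.
MODEL="/opt/nesi/models/gguf/llama3.1/llama3.1-8b.gguf"

# Hypothetical helper: succeeds only if the given file exists and is readable.
model_is_readable() {
    [[ -r "$1" ]]
}

if model_is_readable "$MODEL"; then
    echo "Using shared model: $MODEL"
    # Start your inference tool here and point it at "$MODEL".
else
    echo "Model not found or not readable: $MODEL" >&2
fi
```

For the larger models, the `--partition` and `--gpus-per-node` lines would be swapped for the pair listed against that model.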

L4 GPUs have no double-precision floating point

The L4 is an inference-optimised GPU. It is suitable for running quantised models but should not be used for model training or workflows that require FP64 precision.

- [Ollama](../Software/Available_Applications/ollama.md)
- [Hardware](../Batch_Computing/Hardware.md)