Releases: bsc-pm/ompss-2-releases

OmpSs-2 2025.11

28 Oct 13:16

Version 2025.11, Tue Oct 28, 2025

The OmpSs-2 2025.11 release enhances the taskiter construct through the NODES runtime and extends the APIs of the nOS-V tasking library and the LLVM Clang compiler. These changes improve the handling of thread attach and detach operations and make the taskiter mechanism more efficient. The release also includes several bug fixes and improvements to the NODES runtime, the nOS-V tasking library, the ovni instrumentation library, and the LLVM Clang compiler.

nOS-V

  • Added support for emitting hardware counter events in ovni traces
  • Introduce nosv_rwlock_t and related calls, as a replacement for pthread read-write locks
  • Introduce nosv_pthread_create as a drop-in replacement for pthread_create
  • When compiled with --enable-debug, nosv_mutex_t will now perform owner checks to aid debugging nOS-V programs
  • Added common utility scripts and sample job submission files for common nOS-V configurations, installed in the nOS-V share directory
  • Added attach/detach flags to instruct nOS-V to instrument attached threads through ovni
  • Changed the API of the alternatives to pthread synchronization primitives to return POSIX-like error codes
  • Changed the default isolation_level for nOS-V to process, instead of user. Now, by default, nOS-V programs will not share instances with other processes. Users should enable inter-process capabilities explicitly through the config file or using the shared-mpi preset
  • Fixed detection of invalid or unusable cores given in topology.binding
  • Fixed detection of mismatching cpu bindings between runtimes sharing the same nOS-V instance
  • Fixed the nosv_cond test failing under certain conditions

NODES

  • Re-implementation of the newly optimized taskiter construct
  • Improve usability by swapping to the use of a configuration file (nodes.toml) instead of environment variables
  • Add the option to build NODES with AddressSanitizer
  • Add an internal parallelization mechanism within NODES. Currently only leveraged through the taskiter construct
  • Deprecate NODES_OVNI. Although it currently overrides the new configuration file, it will be removed in a future release
  • Improve and move the documentation currently found within the README to a separate documentation folder
  • Fixed memory leaks related to the TaskMetadata class, Taskloops, dependencies, and SpawnFunctions
  • Fixed bugs regarding reductions which yielded wrong values in some scenarios

LLVM/Clang

  • Adapt to the newer nOS-V API calls
  • Several bugfixes for both OmpSs-2 constructs and OpenMP

Ovni

  • Add support for hardware counters (HWC) in nOS-V
  • Add user documentation for NODES instrumentation
  • Increase nOS-V model version to 2.6.0
  • Fix a bug in ovniemu when loading loom CPUs from multiple threads

Nanos6 and Task-Aware Libraries

  • No relevant changes compared to the previous release

OmpSs-2 2025.06

06 Jun 14:38

Version 2025.06, Fri Jun 6, 2025

The OmpSs-2 2025.06 release adds compatibility with ALPI v1.2 across Task-Aware libraries and runtimes, expands device support via Nanos6 and LLVM/Clang, introduces code coverage in nOS-V, and features several updates to Task-Aware libraries such as TASYCL, TACUDA, and TAMPI. It also introduces several bug fixes and improvements to the OmpSs-2 clang parser, several APIs from nOS-V, and instrumentation across libraries.

Nanos6

  • Add compatibility with ALPI version 1.2.
  • Add support for the grid clause on CUDA tasks.
  • Weaken DLB test requirements (partial drop of tests).

nOS-V

  • Extended hwinfo API with new functionalities
  • Updated compatibility to support ALPI v1.2
  • Integrated code coverage
  • Fixed a TLS-related bug when nesting attaches via fork()
  • Fixed synchronization issues involving nosv_cond mutexes
  • Fixed memory consistency issues on certain architectures related to barriers in complete_callbacks
  • Reduced the number of instrumentation events triggered by schedpoints and yields
  • Improved detection logic and internal representation of hardware information
  • Fixed an instrumentation bug where physical CPU IDs were incorrectly emitted instead of logical IDs, leading to emulator failures

NODES

  • No relevant changes compared to the previous release

LLVM/OpenMP (libompv)

  • New, more refined implementation of the passive wait policy in libompv (OMP_WAIT_POLICY=passive)
  • Add a libompvtarget, equivalent to libomptarget but through libompv
  • Several bugfixes in the libompv implementation of free-agents

LLVM/Clang

  • Preliminary support for combining OmpSs-2 with OpenMP offload
  • Added support for the grid clause for devices in OmpSs-2
  • Several bugfixes in the OmpSs-2 clang parser

Ovni

  • Add support for OpenMP label and task ID views.
  • Add support for nOS-V non-blocking scheduler server events (VSN and VSn).
  • Add OpenMP simple breakdown view.
  • Add bench6 package to run full mini-apps for tests.

Task-Aware Libraries

  • Add compatibility with ALPI v1.2 to multiple Task-Aware libraries.
  • (TASYCL) Accept Adaptive CPP targets in the configure script.
  • (TASYCL) Expand the create queues API to accept combinations of SYCL device selectors and async exception handlers.
  • (TASYCL) Expand the API to allow executing functions on all TASYCL queues.
  • (TAMPI) Rework the polling mechanism through ALPI's new suspend feature
  • (TAMPI) Generate a PKGCONFIG file on installation
  • (TAMPI) Allow specifying the maximum number of CPUs of the system while configuring TAMPI
  • (TAMPI) Improve the logging of tests
  • (TAMPI) Fixed passing of lambdas in some boost functions to fix compatibility with v1.87.9
  • (TAMPI) Removed cpubind from tests to avoid unexpected behavior depending on SLURM configuration
  • (TACUDA) Allow multiple streams per CPU through the use of the tacudaCreateStreams parameter
  • (TACUDA) Preallocate cudaEvent_t objects to reduce internal CUDA library contention

OmpSs-2 2024.11

15 Nov 14:51

Version 2024.11, Fri Nov 15, 2024

The OmpSs-2 2024.11 release adds support for Coroutines through the NODES runtime and the nOS-V tasking library and introduces several new features in nOS-V which include support for a task suspension API, support for RISC-V, a Topology API, and a Memory Pressure API, among others. This release also introduces support for the breakdown model through ovni and nOS-V.

Nanos6

  • Add compatibility with ALPI version 1.1 by implementing various functions from the tasking interface

nOS-V

  • Introduce an implementation of the breakdown model, enabled through ovniemu -b
  • Refactor shutdown mechanism, using a coordinated approach to prevent contention during runtime shutdown
  • Introduce a Memory Pressure API, to query the current occupancy of the nOS-V shared memory segment
  • Allow re-initialization of nOS-V, permitting the call to nosv_init() after nosv_shutdown()
  • Enable turbo setting by default, and add correctness checking to detect changes to FPU flags from outside of nOS-V
  • Add support for coroutines and similar constructs through the nosv_suspend() API.
  • Add support for RISC-V
  • Introduce a Topology API, which allows the configuration of system topology through the nosv.toml file
  • Allow submitting tasks as NOSV_SUBMIT_IMMEDIATE from a task's run callback
  • Introduce nosv_cond_t and related calls, as a replacement for pthread condition variables
  • Other miscellaneous fixes and improvements

NODES

  • Introduce support for Coroutines
  • Fix immediate successor logic from within busy threads
  • Fix wrong header include order in the build system affecting NODES' installation
  • Other minor bug fixes and code improvements

LLVM/OpenMP (libompv)

  • Support other LLVM/Intel compiler generated code in libompv (tracing) by setting OMP_ENABLE_COMPAT=1
  • Other bug fixes and improvements

LLVM/Clang

  • Miscellaneous bug fixes and improvements

Ovni

  • Add breakdown model for nOS-V
  • New mark API ovni_mark_*() to emit user-defined events
  • New API to manage stream metadata ovni_attr_*()
  • Update trace format to version 3 (to support independent streams)

Task-Aware Libraries

  • Introduce TAMPI-OPT, an update for the Task-Aware MPI (TAMPI) library which implements several optimizations

OmpSs-2 2024.05

16 May 12:12

Version 2024.05, Thu May 16, 2024

The OmpSs-2 2024.05 release includes the Directory/Cache (D/C) for Host and CUDA devices in Nanos6, several new features for the nOS-V tasking library, and performance improvements and bug fixes. The libompv runtime in LLVM/OpenMP includes the implementation of OpenMP free-agents and instrumentation through ovni. This release removes support for the Mercurium compiler.

Nanos6

  • Add directory/cache (D/C) for Host and CUDA devices
  • Add device memory allocation API for D/C-managed memory
  • Improvements to the ovni instrumentation

nOS-V

  • New batch submission API, which can accumulate tasks to submit them in batch once a certain threshold is reached
  • Add nosv_mutex_t and nosv_barrier_t as nOS-V aware alternatives to their pthread counterparts
  • Add instrumentation points for the nosv_attach and nosv_detach calls
  • Add instrumentation for parallel tasks
  • Activate the turbo.enabled configuration option by default, enabling flush-to-zero in x86-64 and aarch64
  • Perform safety checks when the turbo.enabled configuration option is set to verify FPU flags are not modified by external libraries
  • Split instrumentation events for the scheduler to allow them to be more granularly controlled
  • Allow nOS-V programs to call fork() without leaving the forked process in an incoherent state
  • Other bugfixes and improvements
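
The batch submission idea above (accumulate tasks, then submit them together once a threshold is reached) can be sketched as follows. This is not the nOS-V API: struct batch, batch_submit, batch_flush, increment, and BATCH_THRESHOLD are hypothetical names for illustration, and the "scheduler" is simply a loop that runs the buffered tasks in order.

```c
#include <stddef.h>

#define BATCH_THRESHOLD 4 /* hypothetical value, not an nOS-V default */

typedef void (*task_fn)(void *);

struct batch {
    task_fn fn[BATCH_THRESHOLD];
    void *arg[BATCH_THRESHOLD];
    size_t count;   /* tasks currently buffered */
    size_t flushes; /* number of batch submissions so far */
};

/* Stand-in for handing the whole batch to a scheduler in one operation:
 * here the buffered tasks simply run in submission order. */
void batch_flush(struct batch *b)
{
    if (b->count == 0)
        return;
    for (size_t i = 0; i < b->count; i++)
        b->fn[i](b->arg[i]);
    b->count = 0;
    b->flushes++;
}

/* Buffer one task and submit the whole batch once the threshold is reached. */
void batch_submit(struct batch *b, task_fn fn, void *arg)
{
    b->fn[b->count] = fn;
    b->arg[b->count] = arg;
    b->count++;
    if (b->count == BATCH_THRESHOLD)
        batch_flush(b);
}

/* Example task body: increment the integer passed as argument. */
void increment(void *arg)
{
    (*(int *)arg)++;
}
```

Submitting six increment tasks to a zero-initialized struct batch triggers one automatic flush after the fourth task and leaves two tasks buffered until batch_flush is called explicitly. The point of the design is to pay the cost of entering the scheduler once per batch rather than once per task.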

NODES

  • Improve the error-handling of nOS-V return codes
  • Improve descriptiveness of ovni instrumentation
  • Various improvements related to API integrations (nOS-V, ALPI, ovni)

LLVM/OpenMP (libompv)

  • Implement the OpenMP free-agents feature by setting OMP_ENABLE_FREE_AGENTS=1 and OMP_WAIT_POLICY=passive
  • Instrument through ovni by setting OMP_OVNI=1 and enabling ovni instrumentation in nOS-V

LLVM/Clang

  • Add OPENMP_RUNTIME environment variable to choose the runtime library to link against
  • Other bugfixes and improvements

Ovni

  • New ovni_thread_require function to enable emulation models
  • Streams are marked as finished when calling ovni_thread_free
  • Support per-thread metadata
  • Add manual page for ovnidump
  • Add support for nosv_attach and nosv_detach events
  • Add support for nosv_mutex_lock, nosv_mutex_trylock, and nosv_mutex_unlock events
  • Add support for nosv_barrier events
  • Add OpenMP model to instrument the libompv implementation
  • Add new body model to support parallel tasks in nOS-V (taskfor directive)
  • Fix Paraver cfgs for Mac OS
  • Other bugfixes and improvements

OmpSs-2 2023.11

22 Nov 16:08

Version 2023.11, Wed Nov 22, 2023

The OmpSs-2 2023.11 release includes performance improvements and bug fixes for the runtime systems, several new features for the nOS-V tasking library, and performance improvements to the taskiter construct implementation. It also implements ALPI (version 1.0) in the runtime systems, which provides support for task-aware libraries. LLVM/OpenMP includes a new OpenMP runtime called OpenMP-V (libompv) that works on top of the nOS-V tasking library. A new instrumentation library called Sonar is provided to instrument MPI function calls through ovni.

General

  • The OmpSs-2 runtime systems expose the ALPI generic low-level tasking interface

Nanos6

  • Implement the ALPI interface (version 1.0)
  • Allow embedding jemalloc allocator
  • Embed hwloc and jemalloc by default
  • Add devices.cuda.prefetch config option to control CUDA prefetching of data dependencies (enabled by default)
  • Install the nanos6.toml config file in $prefix/share
  • Remove obsolete instrument.h public interface
  • Remove obsolete stats and graph instrumentations
  • Remove software dependency with libunwind and elfutils
  • Fix execution when enabling extrae instrumentation
  • Remove memory leaks
  • Various bugfixes and corrections

nOS-V

  • Implement the ALPI interface (version 1.0)
  • Add misc.stack_size config option to change the stack size of nOS-V threads
  • Add ovni.level config option for fine-grained instrumentation control
  • Change nosv_attach API to not require an explicit task type and support multiple attaches
  • Implement parallel tasks which can be executed on multiple CPUs at once
  • Allow calling nosv_init and nosv_shutdown multiple times
  • Change error handling to return custom nOS-V error codes
  • Allow early wake of deadline tasks with nosv_submit passing the NOSV_SUBMIT_DEADLINE_WAKE flag
  • Add compatibility layer for calls to sched_get/setaffinity and pthread_get/setaffinity
  • Add instrumentation points for the nosv_create and nosv_destroy APIs
  • Various bugfixes and corrections

NODES

  • Improve performance of the taskiter construct
  • Fix several bugs of the taskiter implementation
  • Ensure nOS-V library is at the first level of dependencies
  • Use the updated attach/detach from nOS-V 2.0
  • Drop support for nOS-V versions older than 2.0

LLVM/OpenMP

  • Provide OpenMP runtime named OpenMP-V (libompv) working over the nOS-V tasking library (-fopenmp=libompv)
  • Make OpenMP-V runtime compatible with task-aware libraries
  • Drop support for task-aware libraries in vanilla OpenMP runtime libomp

LLVM/Clang

  • Fix task data dependencies' calculation for long double types

Ovni

  • Add OVNI_TRACEDIR envar to change the trace directory (default is ovni)
  • Add the ovniver program to report the libovni version and commit
  • Add ovni_version_get() function
  • Add nOS-V API subsystem events for nosv_create() and nosv_destroy()
  • Add TAMPI model with T code, subsystem events and cfgs
  • Add MPI model with M code, function events and cfgs
  • Don't hardcode destination directory names like lib, to use the ones in the destination host (like lib64)

Sonar

  • Introduce the Sonar library that uses ovni for instrumenting MPI functions

Task-Aware Libraries

  • Leverage the ALPI interface instead of the Nanos6-specific interface
  • Drop support for OmpSs-2 versions older than 2023.11
  • See other features and fixes in each task-aware libraries' CHANGELOG

OmpSs-2 2023.05.1

24 Jul 15:34

OmpSs-2 2023.05.1, Mon Jul 24, 2023

The OmpSs-2 2023.05.1 release includes several bug fixes and improvements with respect to the OmpSs-2 2023.05 release. These bug fixes are listed at the end of these release notes.

The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.

General

  • Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
  • Add performance and usability improvements in Nanos6
  • Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
  • Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
  • Drop support for the task for clause
  • Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead

NODES Runtime and nOS-V Tasking Library

  • Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
  • Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
  • Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively
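
As an illustration of the taskiter construct mentioned above, here is a minimal sketch of an iterative loop in which every iteration creates the same tasks with the same dependencies. The function name run_iterations and the kernel are made up for this example. Building it with clang -fompss-2=libnodes is assumed to select NODES as described above, while a compiler without OmpSs-2 support ignores the pragmas and runs the loop serially.

```c
#include <stddef.h>

/* Halve every element and then add one, repeated for `iters` iterations.
 * With an OmpSs-2 compiler, `taskiter` marks the loop as iterative so the
 * runtime can exploit the repeated task structure; without OmpSs-2 support
 * the pragmas are ignored and the code runs serially with the same result. */
double run_iterations(double *a, size_t n, int iters)
{
    #pragma oss taskiter
    for (int it = 0; it < iters; it++) {
        /* First task of the iteration: scale the array section a[0;n]. */
        #pragma oss task inout(a[0;n])
        for (size_t i = 0; i < n; i++)
            a[i] = a[i] * 0.5;

        /* Second task: depends on the first through the same array section. */
        #pragma oss task inout(a[0;n])
        for (size_t i = 0; i < n; i++)
            a[i] = a[i] + 1.0;
    }
    /* Wait for all tasks before reading the results. */
    #pragma oss taskwait

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

The loop body is identical on every iteration, which is the situation the taskiter construct is meant for. A loop whose task structure changes between iterations would not be a good fit for this sketch.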

Task-Aware Libraries

  • Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
  • Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
  • Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications

ovni Instrumentation

  • Add new graph-based design in ovni to support complex models like the new breakdown timeline

Changes with respect to the 2023.05 release

The OmpSs-2 2023.05.1 includes the following bug fixes and improvements with respect to the 2023.05 version:

Nanos6 Runtime

  • Fix CUDA kernel launch configuration and improve performance of OmpSs-2@CUDA support
  • Allow failures at CUDA prefetching without aborting the execution
  • Fix linking with jemalloc when --as-needed linking flag is used
  • Improve testing infrastructure and programs
  • Update documentation regarding OmpSs-2@CUDA support
  • Improve general documentation

LLVM/OpenMP Runtime

  • Fix OpenMP potential use-after-free in polling tasks' mechanism

LLVM/Clang Compiler

  • Fix unconditional break inside a for-loop which is encapsulated in a task
  • Fix device tasks call order when capturing more information in other clauses
  • Add support for the shmem clause in device tasks

OmpSs-2 2023.05

24 May 11:01

OmpSs-2 2023.05, Wed May 24, 2023

The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.

General

  • Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
  • Add performance and usability improvements in Nanos6
  • Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
  • Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
  • Drop support for the task for clause
  • Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead

NODES Runtime and nOS-V Tasking Library

  • Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
  • Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
  • Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively

Task-Aware Libraries

  • Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
  • Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
  • Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications

ovni Instrumentation

  • Add new graph-based design in ovni to support complex models like the new breakdown timeline