Releases: bsc-pm/ompss-2-releases
OmpSs-2 2025.11
Version 2025.11, Tue Oct 28, 2025
The OmpSs-2 2025.11 release enhances the taskiter construct through the NODES runtime and extends the APIs of the nOS-V tasking library and the LLVM Clang compiler. These changes improve the handling of thread attach and detach and make the taskiter mechanism more efficient. The release also introduces several bug fixes and improvements to the NODES runtime, the nOS-V tasking library, the ovni instrumentation library, and the LLVM Clang compiler.
nOS-V
- Added support for emitting hardware counter events in ovni traces
- Introduce `nosv_rwlock_t` and related calls, as a replacement for pthread read-write locks
- Introduce `nosv_pthread_create` as a drop-in replacement for `pthread_create`
- When compiled with `--enable-debug`, `nosv_mutex_t` will now perform owner checks to aid debugging nOS-V programs
- Added common utility scripts and sample job submission files for common nOS-V configurations, installed in the nOS-V `share` directory
- Added attach/detach flags to instruct nOS-V to instrument attached threads through ovni
- Changed the API of the alternatives to pthread synchronization primitives to return POSIX-like error codes
- Changed the default `isolation_level` for nOS-V to `process`, instead of `user`. Now, by default, nOS-V programs will not share instances with other processes. Users should enable inter-process capabilities explicitly through the config file or using the `shared-mpi` preset
- Fixed detection of invalid or unusable cores given in `topology.binding`
- Fixed detection of mismatching CPU bindings between runtimes sharing the same nOS-V instance
- Fixed the `nosv_cond` test failing under certain conditions
NODES
- Re-implementation of the newly optimized `taskiter` construct
- Improve usability by switching to a configuration file (`nodes.toml`) instead of environment variables
- Add the option to build NODES with AddressSanitizer
- Add an internal parallelization mechanism within NODES. Currently only leveraged through the `taskiter` construct
- Deprecate `NODES_OVNI`. Although it currently overrides the new configuration file, it will be removed in a future release
- Improve and move the documentation currently found within the `README` to a separate documentation folder
- Fixed memory leaks related to the `TaskMetadata` class, `Taskloops`, dependencies, and `SpawnFunctions`
- Fixed bugs regarding reductions which yielded wrong values in some scenarios
LLVM/Clang
- Adapt to the newer nOS-V API calls
- Several bugfixes for both OmpSs-2 constructs and OpenMP
Ovni
- Add support for hardware counters (HWC) in nOS-V
- Add user documentation for NODES instrumentation
- Increase nOS-V model version to 2.6.0
- Fix a bug in ovniemu when loading loom CPUs from multiple threads
Nanos6 and Task-Aware Libraries
- No relevant changes compared to the previous release
OmpSs-2 2025.06
Version 2025.06, Fri Jun 6, 2025
The OmpSs-2 2025.06 release adds compatibility with ALPI v1.2 across Task-Aware libraries and runtimes, expands device support via Nanos6 and LLVM/Clang, introduces code coverage in nOS-V, and features several updates to Task-Aware libraries such as TASYCL, TACUDA, and TAMPI. It also introduces several bug fixes and improvements to the OmpSs-2 clang parser, several APIs from nOS-V, and instrumentation across libraries.
Nanos6
- Add compatibility with ALPI version 1.2.
- Add support for the `grid` clause on CUDA tasks.
- Weaken DLB test requirements (partial drop of tests).
nOS-V
- Extended hwinfo API with new functionalities
- Updated compatibility to support ALPI v1.2
- Integrated code coverage
- Fixed a TLS-related bug when nesting `attaches` via `fork()`
- Fixed synchronization issues involving `nosv_cond` mutexes
- Fixed memory consistency issues on certain architectures related to barriers in `complete_callbacks`
- Reduced the number of instrumentation events triggered by `schedpoints` and `yields`
- Improved detection logic and internal representation of hardware information
- Fixed an instrumentation bug where physical CPU IDs were incorrectly emitted instead of logical IDs, leading to emulator failures
NODES
- No relevant changes compared to the previous release
LLVM/OpenMP (libompv)
- New, more refined implementation of the `passive` wait policy in `libompv` (`OMP_WAIT_POLICY=passive`)
- Add a `libompvtarget`, equivalent to `libomptarget` but through `libompv`
- Several bugfixes in the `libompv` implementation of free-agents
LLVM/Clang
- Preliminary support for combining OmpSs-2 with OpenMP offload
- Added support for the `grid` clause for devices in OmpSs-2
- Several bugfixes in the OmpSs-2 clang parser
Ovni
- Add support for OpenMP label and task ID views.
- Add support for nOS-V non-blocking scheduler server events (VSN and VSn).
- Add OpenMP simple breakdown view.
- Add bench6 package to run full mini-apps for tests.
Task-Aware Libraries
- Add compatibility with ALPI v1.2 to multiple Task-Aware libraries.
- (TASYCL) Accept Adaptive CPP targets in the configure script.
- (TASYCL) Expand the create queues API to accept combinations of SYCL device selectors and async exception handlers.
- (TASYCL) Expand the API to allow executing functions on all TASYCL queues.
- (TAMPI) Rework the polling mechanism through ALPI's new suspend feature
- (TAMPI) Generate a PKGCONFIG file on installation
- (TAMPI) Allow specifying the maximum number of CPUs of the system while configuring TAMPI
- (TAMPI) Improve the logging of tests
- (TAMPI) Fixed passing of lambdas in some boost functions to fix compatibility with v1.87.9
- (TAMPI) Removed `cpubind` from tests to avoid unexpected behavior depending on SLURM configuration
- (TACUDA) Allow multiple streams per CPU through the use of the `tacudaCreateStreams` parameter
- (TACUDA) Preallocate `cudaEvent_t` objects to reduce internal CUDA library contention
OmpSs-2 2024.11
Version 2024.11, Fri Nov 15, 2024
The OmpSs-2 2024.11 release adds support for Coroutines through the NODES runtime and the nOS-V tasking library and introduces several new features in nOS-V which include support for a task suspension API, support for RISC-V, a Topology API, and a Memory Pressure API, among others. This release also introduces support for the breakdown model through ovni and nOS-V.
Nanos6
- Add compatibility with ALPI version 1.1 by implementing various functions from the tasking interface
nOS-V
- Introduce support for breakdown model implementation, supported through the use of `ovniemu -b`
- Refactor shutdown mechanism, using a coordinated approach to prevent contention during runtime shutdown
- Introduce a Memory Pressure API, to query the current occupancy of the nOS-V shared memory segment
- Allow re-initialization of nOS-V, permitting the call to `nosv_init()` after `nosv_shutdown()`
- Enable `turbo` setting by default, and add correctness checking to detect changes to FPU flags from outside of nOS-V
- Add support for coroutines and similar constructs through the `nosv_suspend()` API.
- Add support for RISC-V
- Introduce a Topology API, which allows the configuration of system topology through the `nosv.toml` file
- Allow submitting tasks as `NOSV_SUBMIT_IMMEDIATE` from a task's run callback
- Introduce `nosv_cond_t` and related calls, as a replacement for pthread condition variables
- Other miscellaneous fixes and improvements
NODES
- Introduce support for Coroutines
- Fix immediate successor logic from within busy threads
- Fix wrong header include order in the build system affecting NODES' installation
- Other minor bug fixes and code improvements
LLVM/OpenMP (libompv)
- Support other LLVM/Intel compiler generated code in libompv (tracing) by setting `OMP_ENABLE_COMPAT=1`
- Other bug fixes and improvements
LLVM/Clang
- Miscellaneous bug fixes and improvements
Ovni
- Add breakdown model for nOS-V
- New mark API `ovni_mark_*()` to emit user-defined events
- New API to manage stream metadata `ovni_attr_*()`
- Update trace format to version 3 (to support independent streams)
Task-Aware Libraries
- Introduce TAMPI-OPT, an update for the Task-Aware MPI (TAMPI) library which implements several optimizations
OmpSs-2 2024.05
Version 2024.05, Thu May 16, 2024
The OmpSs-2 2024.05 release includes the Directory/Cache (D/C) for Host and CUDA devices in Nanos6, several new features for the nOS-V tasking library, and performance improvements and bug fixes. The libompv in LLVM/OpenMP includes the implementation of OpenMP free-agents and instrumentation through ovni. This release removes the support for the Mercurium compiler.
Nanos6
- Add directory/cache (D/C) for Host and CUDA devices
- Add device memory allocation API for D/C-managed memory
- Improvements to the ovni instrumentation
nOS-V
- New batch submission API, which can accumulate tasks to submit them in batch once a certain threshold is reached
- Add `nosv_mutex_t` and `nosv_barrier_t` as nOS-V aware alternatives to their pthread counterparts
- Add instrumentation points for the `nosv_attach` and `nosv_detach` calls
- Add instrumentation for parallel tasks
- Activate the `turbo.enabled` configuration option by default, enabling flush-to-zero in x86-64 and aarch64
- Perform safety checks when the `turbo.enabled` configuration option is set to verify FPU flags are not modified by external libraries
- Split instrumentation events for the scheduler to allow them to be more granularly controlled
- Allow nOS-V programs to call fork() without leaving the forked process in an incoherent state
- Other bugfixes and improvements
NODES
- Improve the error-handling of nOS-V return codes
- Improve descriptiveness of ovni instrumentation
- Various improvements related to API integrations (nOS-V, ALPI, ovni)
LLVM/OpenMP (libompv)
- Implement the OpenMP free-agents feature, enabled by setting `OMP_ENABLE_FREE_AGENTS=1` and `OMP_WAIT_POLICY=passive`
- Instrument through ovni by setting `OMP_OVNI=1` and enabling ovni instrumentation in nOS-V
LLVM/Clang
- Add `OPENMP_RUNTIME` environment variable to choose the runtime library to link against
- Other bugfixes and improvements
Ovni
- New `ovni_thread_require` function to enable emulation models
- Streams are marked as finished when calling `ovni_thread_free`
- Support per-thread metadata
- Add manual page for `ovnidump`
- Add support for `nosv_attach` and `nosv_detach` events
- Add support for `nosv_mutex_lock`, `nosv_mutex_trylock`, and `nosv_mutex_unlock` events
- Add support for `nosv_barrier` events
- Add OpenMP model to instrument the `libompv` implementation
- Add new body model to support parallel tasks in nOS-V (`taskfor` directive)
- Fix Paraver cfgs for Mac OS
- Other bugfixes and improvements
OmpSs-2 2023.11
Version 2023.11, Wed Nov 22, 2023
The OmpSs-2 2023.11 release includes performance improvements and bug fixes for the runtime systems, several new features for the nOS-V tasking library, and performance improvements on the taskiter construct implementation. It also implements the ALPI (version 1.0) in the runtime systems, which provides support for task-aware libraries. The LLVM/OpenMP includes a new OpenMP runtime called OpenMP-V (libompv) that works on top of the nOS-V tasking library. A new instrumentation library called Sonar is provided to instrument MPI function calls through ovni.
General
- The OmpSs-2 runtime systems expose the ALPI generic low-level tasking interface
Nanos6
- Implement the ALPI interface (version 1.0)
- Allow embedding jemalloc allocator
- Embed hwloc and jemalloc by default
- Add `devices.cuda.prefetch` config option to control CUDA prefetching of data dependencies (enabled by default)
- Install the `nanos6.toml` config file in `$prefix/share`
- Remove obsolete instrument.h public interface
- Remove obsolete stats and graph instrumentations
- Remove software dependency with libunwind and elfutils
- Fix execution when enabling extrae instrumentation
- Remove memory leaks
- Various bugfixes and corrections
nOS-V
- Implement the ALPI interface (version 1.0)
- Add `misc.stack_size` config option to change the stack size of nOS-V threads
- Add `ovni.level` config option for fine-grained instrumentation control
- Change `nosv_attach` API to not require an explicit task type and support multiple attaches
- Implement parallel tasks which can be executed on multiple CPUs at once
- Allow calling `nosv_init` and `nosv_shutdown` multiple times
- Change error handling to return custom nOS-V error codes
- Allow early wake of deadline tasks with `nosv_submit` passing the `NOSV_SUBMIT_DEADLINE_WAKE` flag
- Add compatibility layer for calls to `sched_get/setaffinity` and `pthread_get/setaffinity`
- Add instrumentation points for the `nosv_create` and `nosv_destroy` APIs
- Various bugfixes and corrections
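As a rough illustration of the two config options listed above, the fragment below shows how `misc.stack_size` and `ovni.level` might appear in a nOS-V TOML configuration file. The table layout follows from the dotted option names; the values and their formats are assumptions made for this sketch and should be checked against the configuration file shipped with the installed nOS-V version.

```toml
# Illustrative fragment only: both values are placeholders, not recommended settings.

[misc]
# Stack size for nOS-V threads (the "8M" size-string format is assumed)
stack_size = "8M"

[ovni]
# Instrumentation level (the valid range of levels is assumed)
level = 2
```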
NODES
- Improve performance of the `taskiter` construct
- Fix several bugs of the `taskiter` implementation
- Ensure nOS-V library is at the first level of dependencies
- Use the updated attach/detach from nOS-V 2.0
- Drop support for nOS-V versions older than 2.0
LLVM/OpenMP
- Provide OpenMP runtime named OpenMP-V (`libompv`) working over the nOS-V tasking library (`-fopenmp=libompv`)
- Make OpenMP-V runtime compatible with task-aware libraries
- Drop support for task-aware libraries in vanilla OpenMP runtime `libomp`
LLVM/Clang
- Fix task data dependencies' calculation for long double types
Ovni
- Add `OVNI_TRACEDIR` envar to change the trace directory (default is `ovni`)
- Add the `ovniver` program to report the libovni version and commit
- Add `ovni_version_get()` function
- Add nOS-V API subsystem events for `nosv_create()` and `nosv_destroy()`
- Add TAMPI model with `T` code, subsystem events and cfgs
- Add MPI model with `M` code, function events and cfgs
- Don't hardcode destination directory names like lib, to use the ones in the destination host (like lib64)
Sonar
- Introduce the Sonar library that uses ovni for instrumenting MPI functions
Task-Aware Libraries
- Leverage the ALPI interface instead of the Nanos6-specific interface
- Drop support for OmpSs-2 versions older than 2023.11
- See other features and fixes in each task-aware library's CHANGELOG
OmpSs-2 2023.05.1
OmpSs-2 2023.05.1, Mon Jul 24, 2023
The OmpSs-2 2023.05.1 release includes several bug fixes and improvements with respect to the OmpSs-2 2023.05 release. These bug fixes are listed at the end of these release notes.
The OmpSs-2 2023.05 releases include new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for `atomic` and `critical` OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for `task for` clause
- Mercurium is the OmpSs-2 legacy compiler; it is no longer supported and will not provide new features for OmpSs-2. Use the LLVM/Clang compiler instead
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the `taskiter` construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend `-fompss-2` option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values `libnanos6` (default) and `libnodes`, respectively
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline
Changes with respect to the 2023.05 release
The OmpSs-2 2023.05.1 includes the following bug fixes and improvements with respect to the 2023.05 version:
Nanos6 Runtime
- Fix CUDA kernel launch configuration and improve performance of OmpSs-2@CUDA support
- Allow failures at CUDA prefetching without aborting the execution
- Fix linking with jemalloc when --as-needed linking flag is used
- Improve testing infrastructure and programs
- Update documentation regarding OmpSs-2@CUDA support
- Improve general documentation
LLVM/OpenMP Runtime
- Fix OpenMP potential use-after-free in polling tasks' mechanism
LLVM/Clang Compiler
- Fix unconditional break inside a for-loop which is encapsulated in a task
- Fix device tasks call order when capturing more information in other clauses
- Add support for the `shmem` clause in device tasks
OmpSs-2 2023.05
OmpSs-2 2023.05, Wed May 24, 2023
The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for `atomic` and `critical` OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for `task for` clause
- Mercurium is the OmpSs-2 legacy compiler; it is no longer supported and will not provide new features for OmpSs-2. Use the LLVM/Clang compiler instead
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the `taskiter` construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend `-fompss-2` option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values `libnanos6` (default) and `libnodes`, respectively
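To give a feel for the kind of iterative code the `taskiter` construct targets, here is a minimal sketch in C. The function name `scale_blocks` and its parameters are hypothetical, and the pragma placement and dependency clauses reflect our understanding of OmpSs-2 syntax rather than an excerpt from the project's documentation. A compiler without OmpSs-2 support ignores the pragmas, so the function then runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: multiply every element of `data` by `factor`,
 * `iters` times over, working on `nblocks` blocks of `bs` elements each.
 * Every iteration creates the same set of tasks with the same dependencies,
 * which is the repetitive pattern taskiter is meant to exploit. */
void scale_blocks(double *data, size_t nblocks, size_t bs, int iters, double factor)
{
    assert(data != NULL);

    #pragma oss taskiter
    for (int it = 0; it < iters; ++it) {
        for (size_t b = 0; b < nblocks; ++b) {
            double *blk = &data[b * bs];

            /* One task per block; the inout dependency orders the tasks
             * that touch the same block across consecutive iterations. */
            #pragma oss task inout(blk[0;bs])
            for (size_t i = 0; i < bs; ++i)
                blk[i] *= factor;
        }
    }

    /* Wait for all outstanding tasks before returning to the caller. */
    #pragma oss taskwait
}
```

Assuming an OmpSs-2 LLVM/Clang installation, a program calling this function could presumably be built against NODES with `clang -fompss-2=libnodes`, or against Nanos6 with the default `libnanos6` value of the same option.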
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline