Releases · beehive-lab/GPULlama3.java
GPULlama3.java 0.3.3
📦 Installation

Maven

```xml
<dependency>
    <groupId>io.github.beehive-lab</groupId>
    <artifactId>gpu-llama3</artifactId>
    <version>0.3.3</version>
</dependency>
```

Gradle

```groovy
implementation 'io.github.beehive-lab:gpu-llama3:0.3.3'
```

📖 Documentation | 🔗 Maven Central
GPULlama3.java 0.3.2
Model Support
- [models] Support for IBM Granite Models 3.2, 3.3 & 4.0 with FP16 and Q8 (#92)
Other Changes
- [docs] Update docs to use SDKMAN! and point to TornadoVM 2.2.0 (#93)
- Add JBang catalog and local usage examples to README.md (#91)
- Add `jbang` script and configuration to make it easy to run (#90)
📦 Installation

Maven

```xml
<dependency>
    <groupId>io.github.beehive-lab</groupId>
    <artifactId>gpu-llama3</artifactId>
    <version>0.3.2</version>
</dependency>
```

Gradle

```groovy
implementation 'io.github.beehive-lab:gpu-llama3:0.3.2'
```

📖 Documentation | 🔗 Maven Central
GPULlama3.java 0.3.1
Model Support
- Add compatibility method for langchain4j and quarkus in ModelLoader (#87)
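
For context on the compatibility method above: it allows integration layers such as langchain4j or Quarkus to load a model programmatically instead of going through the CLI entry point. Below is a hypothetical caller-side sketch; the import, method name, and parameter list are assumptions for illustration, not the project's exact API.

```java
import java.nio.file.Path;
// import of ModelLoader omitted; the actual package path may differ

// Hypothetical sketch of loading a model the way a langchain4j/Quarkus
// integration would. The signature below (GGUF path, context length,
// whether to load weights) is an assumption, not the confirmed API.
public class EmbeddedLoadExample {
    public static void main(String[] args) throws Exception {
        Path gguf = Path.of("models/Llama-3.2-1B-Instruct-FP16.gguf"); // example path
        var model = ModelLoader.loadModel(gguf, 4096, true);
        System.out.println("Loaded: " + model);
    }
}
```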
📦 Installation

Maven

```xml
<dependency>
    <groupId>io.github.beehive-lab</groupId>
    <artifactId>gpu-llama3</artifactId>
    <version>0.3.1</version>
</dependency>
```

Gradle

```groovy
implementation 'io.github.beehive-lab:gpu-llama3:0.3.1'
```

📖 Documentation | 🔗 Maven Central
GPULlama3.java 0.3.0
Model Support
- [refactor] Generalize the design of the `tornadovm` package to support multiple new models and types for GPU exec (#62)
- Refactor/cleanup model loaders (#58)
- Add Support for Q8_0 Models (#59)
Bug Fixes
- [fix] Normalization compute step for non-nvidia hardware (#84)
Other Changes
- Update README to enhance TornadoVM performance section and clarify GP… (#85)
- Simplify installation by replacing TornadoVM submodule with pre-built SDK (#82)
- [FP16] Improved performance by fusing dequantize with compute in kernels: 20-30% Inference Speedup (#78); an illustrative sketch follows this list
- [cicd] Prevent workflows from running on forks (#83)
- [CI][packaging] Automate process of deploying a new release with Github actions (#81)
- [Opt] Manipulation of Q8_0 tensors with Tornado `ByteArrays` (#79)
- Optimization in Q8_0 loading (#74)
- [opt] GGUF Load Optimization for tensors in TornadoVM layout (#71)
- Add `SchedulerType` support to all TornadoVM layer planners and layer… (#66)
- Weight Abstractions (#65)
- Bug fixes in sizes and names of GridScheduler (#64)
- Add Maven wrapper support (#56)
- Add changes used in Devoxx Demo (#54)
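
For intuition on the fusion change in #78: rather than running one kernel that dequantizes FP16 weights into a temporary buffer and a second kernel that consumes it, the dequantization happens inline in the compute loop. Below is a plain-Java sketch of the idea under assumed names and layouts; the project's real kernels are written against the TornadoVM API.

```java
// Illustrative plain-Java sketch of the kernel-fusion idea behind #78.
// Not the project's actual TornadoVM kernels; names and layouts are assumed.
public final class FusedDequantSketch {

    // Unfused: dequantize into a temporary buffer, then compute.
    // On a GPU this means two kernels and an extra trip through device memory.
    static float dotUnfused(short[] weightsFp16, float[] x) {
        float[] w = new float[weightsFp16.length];
        for (int i = 0; i < weightsFp16.length; i++) {
            w[i] = Float.float16ToFloat(weightsFp16[i]); // pass 1: dequantize
        }
        float acc = 0f;
        for (int i = 0; i < w.length; i++) {
            acc += w[i] * x[i];                          // pass 2: compute
        }
        return acc;
    }

    // Fused: dequantize inline inside the compute loop.
    // One pass over memory, no intermediate buffer.
    static float dotFused(short[] weightsFp16, float[] x) {
        float acc = 0f;
        for (int i = 0; i < weightsFp16.length; i++) {
            acc += Float.float16ToFloat(weightsFp16[i]) * x[i];
        }
        return acc;
    }
}
```

On a GPU, eliminating the intermediate buffer saves a full round trip through device memory for the dequantized weights, which is the kind of saving consistent with the reported 20-30% speedup.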
📦 Installation

Maven

```xml
<dependency>
    <groupId>io.github.beehive-lab</groupId>
    <artifactId>gpu-llama3</artifactId>
    <version>0.3.0</version>
</dependency>
```

Gradle

```groovy
implementation 'io.github.beehive-lab:gpu-llama3:0.3.0'
```

📖 Documentation | 🔗 Maven Central
v0.2.2
What's Changed
- Fully working support for LangChain4j
- LlamaApp cleanup by @orionpapadakis in #51
- Fix execution path control by @orionpapadakis in #52
- Add support for encoding ordinary text in Qwen3Tokenizer and update Q… by @mikepapadim in #53
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- Minor cleanup by @mikepapadim in #47
- Add `useTornadovm` flag to model loader to handle Builder option in Langchain4j by @mikepapadim in #50
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Model Support
- Mistral – support for GGUF-format Mistral models with optimized GPU execution.
- Qwen2.5 – GGUF-format Qwen2.5 models supported, including performance improvements for attention layers.
- Qwen3 – compatible with GGUF-format Qwen3 models and updated integration.
- DeepSeek-R1-Distill-Qwen-1.5B – GGUF-format DeepSeek distilled models supported for efficient inference.
- Phi-3 – full support for GGUF-format Microsoft Phi-3 models for high-performance workloads.
What's Changed
- [refactor] Renamed aux package to resolve Windows issue by @stratika in #11
- Windows support for GPULlama3.java by @stratika in #12
- [API] Update TornadoVM API to use latest warmup features by @mikepapadim in #13
- [model] Add support for Mistral models by @orionpapadakis in #17
- Cleanups post Mistral Integration by @mikepapadim in #27
- Add a Docker section to README with available images and usage examples by @mikepapadim in #28
- Refactor TornadoVMMasterPlan to simplify scheduling decision for non-Nvidia HW and Mistral Models by @mikepapadim in #32
- File not found error handling in loadModel method in GGUF.java by @dhruvarayasam in #34
- Update README for clarity by @mikepapadim in #36
- [models] Support for Qwen3 models by @orionpapadakis in #37
- [models][phi-3] Support for Microsoft's Phi-3 models by @mikepapadim in #38
- Reorganize package structure and update imports to use `org.beehive.g…` by @mikepapadim in #42
- Update README.md by @kotselidis in #44
- [models][deepseek][qwen2.5] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models by @orionpapadakis in #40
- Improve attention performance for qwen2.5 & deepseek by @orionpapadakis in #46
New Contributors
- @orionpapadakis made their first contribution in #17
- @dhruvarayasam made their first contribution in #34
Full Changelog: v0.1.0-beta...v0.2.0
v0.1.0-beta
- Llama 3 model compatibility - Full support for Llama 3.0, 3.1, and 3.2 models
- GGUF format support - Native handling of GGUF model files
- Support for FP16 models for reduced memory usage and faster computation
- GPU Acceleration on NVIDIA GPUs using both OpenCL and PTX backends
- [Experimental] Support for Apple Silicon (M1/M2/M3) via OpenCL (subject to hardware/compiler limitations)
- [Experimental] Initial support for Q8 and Q4 quantized models, using runtime dequantization to FP16 (a Q8_0 sketch follows below)
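
As background for the quantized-model bullet above: in GGUF's Q8_0 encoding, weights are stored in blocks of 32 signed 8-bit values, each block preceded by a single FP16 scale, so dequantization is simply `value = scale * quant`. A minimal, self-contained sketch of that decode step follows; the class and method names are illustrative and this is not the project's loader code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Minimal sketch of GGUF Q8_0 runtime dequantization. Each block is one
// FP16 scale followed by 32 signed int8 quants; value = scale * quant.
// Illustrative only; not the project's actual loader code.
public final class Q8_0DequantSketch {
    static final int BLOCK_SIZE = 32; // values per Q8_0 block

    /** Decodes Q8_0 bytes to floats; numValues is assumed a multiple of 32. */
    static float[] dequantize(byte[] raw, int numValues) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[numValues];
        for (int i = 0; i < numValues; i += BLOCK_SIZE) {
            float scale = Float.float16ToFloat(buf.getShort()); // FP16 scale ("delta")
            for (int j = 0; j < BLOCK_SIZE; j++) {
                out[i + j] = scale * buf.get();                 // int8 quant -> float
            }
        }
        return out;
    }
}
```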