Releases: containers/ramalama
v0.14.0
What's Changed
- Docsite builds remove extraneous manpage number labels by @ieaves in #2037
- Bump to latest llama.cpp and whisper.cpp by @rhatdan in #2039
- Added inference specification files to info command by @engelmi in #2049
- Update docusaurus monorepo to v3.9.2 by @red-hat-konflux-kflux-prd-rh03[bot] in #2055
- Pin macos CI to python <3.14 until mlx is updated by @olliewalsh in #2051
- Added --max-tokens to llama.cpp inference spec by @engelmi in #2057
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2056
- Prefer the embedded chat template for ollama models by @olliewalsh in #2040
- Set gguf quantization default to Q4_K_M by @engelmi in #2050
- Update dependency huggingface-hub to ~=0.36.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #2059
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2044
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2060
- docker: fix list command for oci images when running in a non-UTC timezone by @mikebonnet in #2067
- Update dependency huggingface-hub to v1 by @renovate[bot] in #2066
- Fix AMD GPU image selection on arm64 for issue #2045 by @rhatdan in #2048
- run RAG operations in a separate container by @mikebonnet in #2053
- konflux: merge before building/testing PRs by @mikebonnet in #2069
- fix "ramalama rag" under docker by @mikebonnet in #2068
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2070
- Renaming huggingface-cli -> hf by @Yarboa in #2047
- Added Speaking and Advocacy heading for CONTRIBUTING.md by @dominikkawka in #2073
- Fix the rpm name in docs by @olliewalsh in #2083
- Update SECURITY.md. Use github issues for security vulnerabilities by @rhatdan in #2077
- Improving ramalama rag section in README.md by @jpodivin in #2076
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2074
- fix up type checking and add it to GitHub CI by @mikebonnet in #2075
- konflux: disable builds on s390x by @mikebonnet in #2087
- Bump llama.cpp and whisper.cpp by @rhatdan in #2071
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2088
- Add --port flag to ramalama run command by @rhatdan in #2082
- rag: keep the versions of gguf and convert_hf_to_gguf.py in sync by @mikebonnet in #2092
- Bump to v0.14.0 by @rhatdan in #2093
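Two of the changes above surface directly on the command line: the new `--port` flag for `ramalama run` (#2082) and the `--max-tokens` setting added to the llama.cpp inference spec (#2057). A minimal usage sketch, assuming the flags follow ramalama's usual `ramalama run [options] MODEL` shape (the model name is illustrative):

```shell
# Run a model on an explicit port (new in v0.14.0, #2082);
# "tinyllama" below is just an example model name.
ramalama run --port 8080 tinyllama

# Cap the length of generated output via --max-tokens (#2057).
ramalama run --max-tokens 512 tinyllama
```

Note also that GGUF conversion now defaults to Q4_K_M quantization (#2050), so conversions that previously required an explicit quantization flag may no longer need one.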
New Contributors
- @Yarboa made their first contribution in #2047
- @dominikkawka made their first contribution in #2073
- @jpodivin made their first contribution in #2076
Full Changelog: v0.13.0...v0.14.0
v0.13.0
What's Changed
- Reintroduce readme updates and add additional documentation by @ieaves in #1987
- feat: support safetensors-only repos across runtimes by @kush-gupt in #1976
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1984
- [skip-ci] Update astral-sh/setup-uv action to v7 by @renovate[bot] in #2006
- Remove tag in model name for health check by @engelmi in #2008
- Add safetensor snapshot file type by @engelmi in #2009
- Fixes default engine detection for OSX users by @ieaves in #2010
- Don't attempt to relabel image mounts by @rhatdan in #2012
- Link to Ollama registry catalog and fix capitalizations by @rhatdan in #2011
- add support for split model file url by @fozzee in #2001
- Update pre-commit hook pycqa/isort to v7 by @red-hat-konflux-kflux-prd-rh03[bot] in #2021
- Update dependency isort to v7 by @red-hat-konflux-kflux-prd-rh03[bot] in #2020
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2018
- Adds dependency group so uv automatically provisions dev dependencies by @ieaves in #2015
- Always use the chosen port passed via CLI parameter by @engelmi in #2013
- Daemon bugfix for docker users and better pull config handling by @ieaves in #2005
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1760340943 by @red-hat-konflux-kflux-prd-rh03[bot] in #2023
- Search the inference spec files in the source directory by @kpouget in #2014
- docs: Update README with correct MLX install instructions by @kush-gupt in #2024
- Added data-files path to default config dirs list by @engelmi in #2028
- Template conversion system and messages template wrapping for Go templates by @ieaves in #1947
- Fix model paths when running with --nocontainer by @olliewalsh in #2030
- Add NVIDIA GPU support to quadlet generation for `ramalama serve` by @wang7x in #1955
- build_rag.sh: add libraries required by opencv-python by @mikebonnet in #2031
- Bump to v0.13.0 by @rhatdan in #2029
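The quadlet change (#1955) affects the generated systemd unit rather than the command itself. A sketch of the invocation it applies to, assuming ramalama's existing `--generate quadlet` flow for `ramalama serve` (model name illustrative):

```shell
# Generate a quadlet unit for a served model; with #1955 the generated
# unit should include NVIDIA GPU device entries when a GPU is detected.
ramalama serve --generate quadlet tinyllama
```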
Full Changelog: v0.12.4...v0.13.0
v0.12.4
What's Changed
- fix: split model regex match model without path by @fozzee in #1952
- konflux: fix creation of the PipelineRun when a tag is pushed by @mikebonnet in #1961
- Add e2e pytest test for list and rm commands by @telemaco in #1949
- konflux: test the cuda image on NVIDIA hardware by @mikebonnet in #1800
- fix: use OpenVINO 2025.3 by @jeffmaury in #1968
- chore(deps): update pre-commit hook psf/black to v25.9.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #1965
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1964
- chore(config): migrate renovate config by @red-hat-konflux-kflux-prd-rh03[bot] in #1966
- Fix cuda-vllm image build steps by @mcornea in #1963
- fix(intel): detect xe driver by @futursolo in #1980
- chore(deps): update konflux references to abf231c by @red-hat-konflux-kflux-prd-rh03[bot] in #1975
- Add e2e pytest test for help command by @telemaco in #1973
- fix: rename file with : in name by @jeffmaury in #1972
- chore(deps): update dependency macos to v15 by @renovate[bot] in #1970
- Fix handling of API_KEY in ramalama chat by @rhatdan in #1958
- Inference engine spec by @engelmi in #1959
- Add e2e pytest test for run command by @telemaco in #1978
- Add perplexity and bench by @engelmi in #1986
- Update react monorepo to v19.2.0 by @renovate[bot] in #1992
- Update pre-commit hook pycqa/isort to v6.1.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #1991
- Fix --exclude-dir arguments to grep in Makefile and add .tox. by @jwieleRH in #1990
- updates docsite and adds docsite to the make docs process by @ieaves in #1988
- Fix typo in llama.cpp engine spec by @olliewalsh in #1998
- chore: update demo script with RAG and mcp based on dev conf presenta… by @bmahabirbu in #1994
- Fix llama.cpp build instruction set by @olliewalsh in #2000
- Add unified --max-tokens CLI argument for output token limiting by @rhatdan in #1982
- Bump the versions of llama.cpp and whisper.cpp by @rhatdan in #1999
- Bump to v0.12.4 by @rhatdan in #2003
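This release adds two new evaluation subcommands, `perplexity` and `bench` (#1986), alongside the unified `--max-tokens` argument (#1982). A hedged sketch of how they might be invoked, with an illustrative model name:

```shell
# Benchmark a model's inference performance (#1986):
ramalama bench tinyllama

# Compute perplexity for a model:
ramalama perplexity tinyllama
```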
New Contributors
- @fozzee made their first contribution in #1952
- @jeffmaury made their first contribution in #1968
- @futursolo made their first contribution in #1980
Full Changelog: v0.12.3...v0.12.4
v0.12.3
What's Changed
- konflux: release images when a tag is pushed to the git repo by @mikebonnet in #1926
- konflux: build ramalama images for s390x and ppc64le by @mikebonnet in #1842
- konflux: run clamav-scan as a matrixed task by @mikebonnet in #1922
- s390x: switch to a smaller bigendian model for testing by @mikebonnet in #1930
- --flash-attn requires an option in llama-server now by @rhatdan in #1928
- Pass the encoding argument to run_cmd(). by @jwieleRH in #1931
- Improve NVIDIA CDI check. by @jwieleRH in #1903
- [ci] Update repo for ubuntu podman 5 by @olliewalsh in #1940
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1932
- chore(deps): update dependency huggingface-hub to ~=0.35.0 by @renovate[bot] in #1935
- fix: with the new llama.cpp version and chat templates rag_framework … by @bmahabirbu in #1937
- Introduce tox for testing and add e2e framework by @telemaco in #1938
- docs: revert incorrect docs changes by @cdoern in #1936
- Add bats test to cover docker-compose in serve by @abhibongale in #1934
- konflux: set the source-repo-url annotation on the override Snapshot by @mikebonnet in #1941
- Add Compose docs by @abhibongale in #1943
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1758184894 by @renovate[bot] in #1944
- Adds a roadmap document for tracking future work and goals by @ieaves in #1893
- konflux: handle "incoming" events when creating override Snapshots by @mikebonnet in #1945
- added mcp to chat by @bmahabirbu in #1923
- introduced some QoL fixes for standard python mcp client by @bmahabirbu in #1953
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1954
- Add e2e pytest workflows to Github CI by @telemaco in #1950
- Add e2e pytest test for bench command by @telemaco in #1942
- Reorganize transports and add new rlcr transport option by @ieaves in #1907
- Bump to v0.12.3 by @rhatdan in #1956
Full Changelog: v0.12.2...v0.12.3
v0.12.2
What's Changed
- Add Docker Compose generator by @abhibongale in #1839
- Catch KeyError exceptions by @rhatdan in #1867
- Fallback to default image when CUDA version is out of date by @rhatdan in #1871
- Changed from google-chrome to firefox by @AlexonOliveiraRH in #1876
- Revert back to ollama granite-code models by @olliewalsh in #1875
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1756799158 by @renovate[bot] in #1887
- fix(deps): update dependency @mdx-js/react to v3.1.1 by @renovate[bot] in #1885
- Don't print llama stack api endpoint info unless --debug is passed by @booxter in #1881
- feat(script): add browser override and improve service startup flow by @AlexonOliveiraRH in #1879
- tests: generate tmpdir store for ollama pull testcase by @booxter in #1891
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1884
- [skip-ci] Update actions/stale action to v10 by @renovate[bot] in #1896
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1756915113 by @red-hat-konflux-kflux-prd-rh03[bot] in #1895
- Added the GGUF field tokenizer.chat_template for getting chat template by @engelmi in #1890
- Suppress stderr when chatting without container by @booxter in #1880
- konflux: stop building unnecessary images by @mikebonnet in #1897
- Readme updates and python classifiers by @ieaves in #1894
- Update versions of llama.cpp and whisper.cpp by @rhatdan in #1874
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1906
- Extended inspect command by --get option with auto-complete by @engelmi in #1889
- Allow running `ramalama` without a GPU by @kpouget in #1909
- Add tests for `--device none` by @kpouget in #1911
- Bump to latest version of llama.cpp by @rhatdan in #1910
- Initial model swap work by @engelmi in #1807
- Fix ramalama run with prompt index error by @engelmi in #1913
- Fix the application of codespell in "make validate". by @jwieleRH in #1904
- Do not set the ctx-size by default by @rhatdan in #1915
- Use Hugging Face models for tinylama and smollm:135 by @olliewalsh in #1916
- build_rag.sh: install mistral-common for convert_hf_to_gguf.py by @mikebonnet in #1925
- Bump to v0.12.2 by @rhatdan in #1912
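Running without a GPU (#1909) is exercised in tests via `--device none` (#1911). A minimal sketch, assuming the flag is passed to `ramalama run` as in those PR titles (model name illustrative):

```shell
# Run entirely on CPU by disabling GPU device passthrough:
ramalama run --device none tinyllama
```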
New Contributors
- @abhibongale made their first contribution in #1839
- @AlexonOliveiraRH made their first contribution in #1876
Full Changelog: v0.12.1...v0.12.2
v0.12.1
What's Changed
- konflux: remove ecosystem tests by @mikebonnet in #1841
- konflux: migrate to per-component service accounts by @mikebonnet in #1840
- Bats test fixes by @olliewalsh in #1847
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1851
- Add support for huggingface split GGUF models by @olliewalsh in #1848
- Fix progress bar not reaching 100% by @olliewalsh in #1846
- Use correct chat template file for ollama models by @olliewalsh in #1856
- fix: Don't use non-portable errno.{EMEDIUMTYPE,ENOMEDIUM} by @booxter in #1850
- Make the container build system more flexible by @kpouget in #1854
- Fix deepseek-r1 chat template conversion by @olliewalsh in #1858
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1755678605 by @red-hat-konflux-kflux-prd-rh03[bot] in #1863
- DNM: ci test by @olliewalsh in #1862
- Fix python environment in ci jobs by @olliewalsh in #1861
- feat: added wait logic for rag_framework fixed doc2rag file and added… by @bmahabirbu in #1761
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1865
- Bump to v0.12.1 by @rhatdan in #1866
Full Changelog: v0.12.0...v0.12.1
v0.12.0
What's Changed
- Bump to v0.11.3 by @rhatdan in #1789
- Fixup release of stable-diffusion images by @rhatdan in #1790
- Splitting pypi build and upload into dedicated make targets by @engelmi in #1791
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1754380668 by @renovate[bot] in #1780
- Fix the 050-pull test. by @jwieleRH in #1798
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1754380668 by @red-hat-konflux-kflux-prd-rh03[bot] in #1803
- Skip the benchmark test if llama-bench is not available. by @jwieleRH in #1801
- Explicitly pull base container before converted container build by @engelmi in #1804
- Do not clean up all podman content if user runs make test. by @rhatdan in #1792
- Fix handling of oci models when being served by @rhatdan in #1802
- We should use all config files paths found by @rhatdan in #1799
- Fixed error in readme to ensure consistency. by @rhatdan in #1806
- Untabify test/system/helpers.bash in accordance with PEP8. by @jwieleRH in #1808
- konflux: build stable-diffusion and remove rocm-ubi by @mikebonnet in #1813
- fix parsing of CDI yaml by @mikebonnet in #1811
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1754586119 by @red-hat-konflux-kflux-prd-rh03[bot] in #1814
- bump the version of mesa-vulkan-drivers to match the copr repo by @mikebonnet in #1820
- fix "skopeo copy" command by @mikebonnet in #1822
- Supports gpt-oss by @orimanabu in #1810
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1819
- Trying to cut down on github notifications by @ericcurtin in #1824
- Wait for container to be healthy with ramalama run by @rhatdan in #1809
- [skip-ci] Update actions/checkout action to v5 by @renovate[bot] in #1823
- Add gpt-oss models to shortnames.conf by @rhatdan in #1826
- konflux: build on the fastest instance types available, and increase timeouts by @mikebonnet in #1827
- reimplement health checks outside of podman by @mikebonnet in #1830
- chore(deps): update pre-commit hook pre-commit/pre-commit-hooks to v6 by @red-hat-konflux-kflux-prd-rh03[bot] in #1837
- allow the llama-stack image to be configured via an env var by @mikebonnet in #1834
- Improve error message when huggingface-cli is needed and missing by @pbabinca in #1767
- Added error message for GGUF not found and safetensor unsupported by @engelmi in #1838
- Bump to v0.12.0 by @rhatdan in #1833
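With gpt-oss support landed (#1810) and gpt-oss entries added to shortnames.conf (#1826), the model should be addressable by its short name. A sketch under that assumption (the exact short name registered in shortnames.conf may differ):

```shell
# Pull and run gpt-oss via its shortname entry (#1826):
ramalama pull gpt-oss
ramalama run gpt-oss
```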
New Contributors
- @orimanabu made their first contribution in #1810
Full Changelog: v0.11.3...v0.12.0
v0.11.3
What's Changed
- musa: re-enable whisper.cpp build and update its commit SHA by @yeahdongcn in #1758
- Bump to v0.11.2 by @rhatdan in #1757
- fix model name in stack.py by @pbalczynski in #1759
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1753769805 by @renovate[bot] in #1765
- vLLM v0.10.0 release by @ericcurtin in #1763
- fix(deps): update react monorepo to v19.1.1 by @renovate[bot] in #1762
- Fix excess error output in run command by @arortiz-rh in #1760
- call Model.validate_args() from Model.ensure_model_exists() by @mikebonnet in #1772
- De-duplicate bash build scripts by @ericcurtin in #1773
- Enable/Disable thinking on reasoning models by @rhatdan in #1768
- Include fix that allows us to build on older ARM SoC's by @ericcurtin in #1775
- Enable multiline chat by @engelmi in #1777
- Include the host in the quadlet's PublishPort directive by @Stebalien in #1771
- Fix run/generate for oci models by @olliewalsh in #1779
- chore(deps): update dependency typescript to ~5.9.0 by @renovate[bot] in #1782
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1753978585 by @red-hat-konflux-kflux-prd-rh03[bot] in #1781
- Typing and bug squashes by @ieaves in #1764
- Adding --add-to-unit option to --generate to allow creating or updati… by @Annakan in #1774
- Fix handling of configured_has_all by @rhatdan in #1784
- building stable diffusion. by @jtligon in #1769
- Use modelstore as tmpdir for hfcli by @engelmi in #1787
- Set TMPDIR to /var/tmp if not set by @rhatdan in #1786
New Contributors
- @pbalczynski made their first contribution in #1759
- @arortiz-rh made their first contribution in #1760
- @Stebalien made their first contribution in #1771
- @Annakan made their first contribution in #1774
Full Changelog: v0.11.2...v0.11.3
v0.11.2
What's Changed
- Bump to v0.11.1 by @rhatdan in #1726
- konflux: add pipelines for ramalama-vllm and layered images by @mikebonnet in #1717
- Don't override image when using rag if user specified it by @rhatdan in #1727
- Re-enable passing chat template to model by @engelmi in #1732
- No virglrenderer in RHEL by @ericcurtin in #1728
- Add stale github workflow to maintain older issues and PRs. by @rhatdan in #1733
- konflux: build -rag images on bigger instances with large disks by @mikebonnet in #1737
- musa: upgrade musa sdk to rc4.2.0 by @yeahdongcn in #1697
- Remove GGUF version check when parsing by @engelmi in #1738
- Define image within container with full name by @rhatdan in #1734
- musa: disable build of whisper.cpp, and update llama.cpp by @mikebonnet in #1745
- Include mmproj mount in quadlet by @olliewalsh in #1742
- Adds docs site by @ieaves in #1736
- Fix listing models by @engelmi in #1748
- fix(deps): update dependency huggingface-hub to ~=0.34.0 by @renovate[bot] in #1747
- chore(deps): update dependency typescript to ~5.8.0 by @renovate[bot] in #1746
- Use blobs directory as context directory on convert by @engelmi in #1739
- konflux: push images to the quay.io/ramalama org after integration testing by @mikebonnet in #1743
- CUDA vLLM variant by @ericcurtin in #1741
- Add setuptools_scm by @ericcurtin in #1749
- Fixes docsite page linking by @ieaves in #1752
- Fix kube volumemount for hostpaths and add mmproj by @olliewalsh in #1751
- More cuda vLLM enablement by @ericcurtin in #1750
- Fix assembling URLs for big models by @engelmi in #1756
Full Changelog: v0.11.1...v0.11.2
v0.11.1
What's Changed
- Bump to 0.11.0 by @rhatdan in #1694
- Mistral should point to lmstudio gguf by @ericcurtin in #1698
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049 by @renovate[bot] in #1699
- chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51 by @renovate[bot] in #1696
- reduce unnecessary image pulls during testing, and re-enable a couple tests by @mikebonnet in #1700
- Minor fixes to rpm builds by packit and spec file. by @smooge in #1704
- konflux: build cuda on arm64, and simplify testing by @mikebonnet in #1687
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787 by @red-hat-konflux-kflux-prd-rh03[bot] in #1710
- Included ramalama.conf in wheel by @carlwgeorge in #1711
- Improve NVIDIA GPU detection. by @jwieleRH in #1617
- README: remove duplicate statements by @rhatdan in #1707
- fix GPU selection and pytorch URL when building rag images by @mikebonnet in #1709
- Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8) by @tonyjames in #1712
- konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli by @mikebonnet in #1708
- Add vllm to cpu inferencing Containerfile by @ericcurtin in #1677
- build_rag.sh: install cmake by @mikebonnet in #1716
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1718
- container-images: add virglrenderer to vulkan by @slp in #1714
- added milvus support and qol console logs for rag command by @bmahabirbu in #1720
- fixes issue where format=markdown saves to duplicate absolute path by @bmahabirbu in #1719
- Engine should be created after checks by @rhatdan in #1722
- Use model organization as namespace when pulling Ollama models by @engelmi in #1721
- Consolidate run and chat commands together; also allow specification of prefix in ramalama.conf by @rhatdan in #1706
- If container fails on Run, warn and exit by @rhatdan in #1723
- Added temporary migration routine for non-namespaced ollama models by @engelmi in #1725
New Contributors
- @tonyjames made their first contribution in #1712
Full Changelog: v0.11.0...v0.11.1