I installed this wheel:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/basic/cu121
I also tried the AVX512 index: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX512/cu121
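Since the wheels are built per instruction set (basic/AVX/AVX2/AVX512), here is the quick check I can run to see which SIMD extensions this CPU reports (a minimal sketch, assuming Linux and that /proc/cpuinfo is the right place to look):

```python
# Rough check of which SIMD extensions this CPU advertises (Linux only).
# Assumption: a wheel built for an instruction set the CPU lacks could
# explain an "Illegal instruction" crash.
with open("/proc/cpuinfo") as f:
    flags = next(line for line in f if line.startswith("flags")).split()
print(sorted(f for f in set(flags) if f.startswith(("sse", "avx", "fma", "f16c"))))
```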
Here is my code:

```python
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.llms import LlamaCpp
from langchain.prompts import PromptTemplate
from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_csv_agent

template = """Question: {question}
Answer: Let's work this out in a step by step way to be sure we have the right answer."""

prompt = PromptTemplate(template=template, input_variables=["question"])
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = 10  # Change this value based on your model and your GPU VRAM pool.
n_batch = 210  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.

llm = LlamaCpp(
    model_path="/home/rtx-4070/Downloads/openorca-platypus2-13b.Q4_K_M.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,  # Verbose is required to pass to the callback manager
)
print("dfg")

llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.run(question)
```
I get this error:

```
Illegal instruction (core dumped)
```
CUDA Version: 12.1
Ubuntu: 20
GPU: RTX 4070
CPU: AMD Ryzen 7 5700X 8-Core
Langchain Version: 0.0.347
Langchain Experimental: 0.0.44
I created a fresh conda environment and installed everything in sequence: langchain, langchain-experimental, llama-cpp-python.
If I remove this package and install the standard llama-cpp-python instead, the model loads and responds, but only on CPU with BLAS = 0.
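In case it is useful, a minimal reproduction without LangChain (same GGUF path and settings as above; just a sketch, assuming the crash is in llama-cpp-python itself rather than in the chain or callbacks) would be:

```python
from llama_cpp import Llama

# Same model and offload settings as in the LangChain script above.
llm = Llama(
    model_path="/home/rtx-4070/Downloads/openorca-platypus2-13b.Q4_K_M.gguf",
    n_gpu_layers=10,
    n_batch=210,
    verbose=True,  # prints the system/build info (including BLAS = 0/1) at load time
)
print(llm("Q: Name the planets in the solar system. A:", max_tokens=32))
```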