@@ -329,7 +329,8 @@ ID USER GROUP NAME STAT CPU MEM HOS

#### Connecting to the Workload Cluster Kubernetes API Locally

To establish a local connection to the Workload Cluster Kubernetes API, you will need to export the following environment variables. The `$VROUTER_IP` will contain the public IP address of the vRouter instance, and the `$CONTROL_PLANE_IP` will contain the IP address of the workload cluster control plane instance. Note that the virtual machines change on each deploy, so change the name of the vRouter and control plane instance appropriately in the following code block:
To establish a local connection to the Workload Cluster Kubernetes API, you will need to export the following environment variables. The `$VROUTER_IP` will contain the public IP address of the vRouter instance, and the `$CONTROL_PLANE_IP` will contain the IP address of the workload cluster control plane instance. For simplicity, we will connect to the Kubernetes cluster directly from the OpenNebula frontend.
Note that the virtual machines change on each deployment, so adjust the names of the vRouter and control plane instances accordingly in the following code block:

```shell
export VROUTER_VM_NAME=<changeme>
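# The VM names change on every deployment; if you are unsure of the current
# names, list the running instances first and copy the vRouter and control
# plane names into the placeholders above (a convenience step that is not part
# of the original snippet).
onevm list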
@@ -6,10 +6,11 @@ weight: 3
---

<a id="cd_cloud"></a>
This document describes the procedure to deploy an AI-ready OpenNebula cloud using OneDeploy on a single [Scaleway Elastic Metal](https://www.scaleway.com/en/elastic-metal/) bare-metal server equipped with GPUs.

Here you have a practical guide to deploy an AI-ready OpenNebula cloud using OneDeploy on a single [Scaleway Elastic Metal](https://www.scaleway.com/en/elastic-metal/) instance equipped with GPUs. This setup is ideal for demonstrations, proofs-of-concept (PoCs), or for quickly trying out the solution without the need for a complex physical infrastructure.
The architecture is a converged OpenNebula installation, where the frontend services and KVM hypervisor run on the same physical host. This approach is ideal for demonstrations, proofs-of-concept (PoCs), or for quickly trying out the solution without the need for a complex physical infrastructure.

The outlined procedure is based on an instance with NVIDIA L40S GPUs as an example. A converged OpenNebula cloud, including frontend and KVM node, is deployed on the same bare metal server.
The outlined procedure is based on an instance with NVIDIA L40S GPUs as an example.

## Prerequisites

@@ -63,20 +64,25 @@ If the directory is not empty it means that IOMMU is active, which is a prerequisite
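For reference, the directory mentioned in the context line above is presumably `/sys/kernel/iommu_groups/`, the standard sysfs location for IOMMU groups; the guide's own check is not visible in this excerpt. A minimal sketch of the check, under that assumption:

```shell
# A non-empty iommu_groups directory indicates that the IOMMU is active
ls /sys/kernel/iommu_groups/
```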

### Server Pre-configuration

These steps prepare the server for the OneDeploy tool, which runs as the `root` user.
The following steps prepare the server to run OneDeploy, which operates with `root` privileges.

1. Enable Local Root SSH Access:
Generate an SSH key pair for the `root` user and authorize it for local connections. This allows Ansible to connect to `127.0.0.1` as `root`.
1. Obtain Root Privileges:
OneDeploy installs software and modifies system-level configuration files. To perform these actions, open a `root` shell.
```shell
sudo su
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N "" -q
sudo -i
```

2. Configure Local Root SSH Access:
Generate an SSH key pair for `root` and authorize it for local connections. This allows Ansible to connect to `127.0.0.1` as `root`.
```shell
ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -q
cat /root/.ssh/id_ed25519.pub >> /root/.ssh/authorized_keys
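# Optional check (an illustrative addition, not part of the original guide):
# key-based SSH to localhost as root should now work without a password prompt.
ssh -o StrictHostKeyChecking=accept-new root@127.0.0.1 true && echo "local root SSH OK"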
```

2. Create a Virtual Network Bridge:
3. Create a Virtual Network Bridge:
To provide network connectivity to the VMs, create a virtual bridge with NAT. This allows VMs to access the internet through the server's public network interface.

2.1 Create the Netplan configuration file for the bridge:
3.1 Create the Netplan configuration file for the bridge:
```shell
tee /etc/netplan/60-bridge.yaml > /dev/null << 'EOF'
network:
@@ -94,8 +100,8 @@ These steps prepare the server for the OneDeploy tool, which runs as the `root`
EOF
```

2.2 Apply the network configuration and enable IP forwarding. Replace `enp129s0f0np0` with your server's main network interface if it is different.
```default
3.2 Apply the network configuration and enable IP forwarding. Replace `enp129s0f0np0` with your server's main network interface if it is different.
```shell
netplan apply
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o enp129s0f0np0 -j MASQUERADE
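# Optional sanity check (an illustrative addition, not part of the original
# guide): confirm that forwarding is enabled and the NAT rule is in place.
sysctl net.ipv4.ip_forward
iptables -t nat -S POSTROUTING | grep 192.168.100.0/24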
@@ -60,8 +60,17 @@ To deploy the vLLM appliance for benchmarking, follow these steps:
onetemplate instantiate vllm --name vllm
```

4. Wait until the vLLM engine has loaded the model and the application is served. To confirm progress, access the VM via SSH and check the logs located in `/var/log/one-appliance/vllm.log`. You should see an output similar to this:
4. Wait until the vLLM engine has loaded the model and the application is being served. To confirm progress, access the VM via SSH and check the log file at `/var/log/one-appliance/vllm.log`.

4.1 To access the VM, run the following command:
```shell
onevm ssh 0
```

> Reviewer comment: Let's use the instance name instead of the id.
> Suggested change: `onevm ssh 0` -> `onevm ssh vllm`

You can also list all available VMs by running `onevm list`.

4.2 Once inside the VM, check the log file at `/var/log/one-appliance/vllm.log`. You should see output similar to the following:

```bash
[...]

(APIServer pid=2480) INFO 11-26 11:00:33 [api_server.py:1971] Starting vLLM API server 0 on http://0.0.0.0:8000
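Once the log reports that the API server is listening on port 8000, a quick way to confirm that the appliance is answering requests is to query its OpenAI-compatible API from inside the VM. This is a minimal sketch rather than part of the original guide, and the model list returned depends on how the appliance was configured:

```shell
# From inside the vllm VM: list the models currently served by the API
curl -s http://127.0.0.1:8000/v1/models
```

If this returns a JSON document describing the loaded model, the service is up and ready for benchmarking.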