Installation

Last updated 11 days ago

NVIDIA GPU Driver

Open a terminal and run the following commands to install the NVIDIA GPU driver.

sudo apt install nvidia-utils-535
sudo apt install nvidia-driver-535

If you get an error such as "Unable to locate package nvidia-driver-535", the apt database may be out of date. Run sudo apt update to refresh it, then retry the installation.

Now run sudo reboot to reboot the host. After rebooting, run the nvidia-smi command. The output should show information about the NVIDIA GPU.
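
As a quick sanity check after the reboot, the sketch below compares the driver's major version with the 535 series installed above. It relies on the --query-gpu=driver_version option of nvidia-smi; the helper name driver_major is hypothetical and only used for illustration.

```shell
# Hypothetical helper: print the major part of a driver version string.
driver_major() {
  printf '%s\n' "${1%%.*}"
}

# Only query the GPU when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if [ "$(driver_major "$version")" = "535" ]; then
    echo "Driver $version matches the 535 series"
  else
    echo "Unexpected driver version: $version"
  fi
fi
```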

Install NVIDIA Toolkit (CUDA)

In the terminal, execute the following commands to install CUDA, the NVIDIA toolkit.

wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-3
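
The toolkit package normally installs under /usr/local/cuda-12.3 and does not put nvcc on PATH by itself. A minimal sketch for setting up the current shell, assuming that default prefix (adjust CUDA_HOME if your installation differs):

```shell
# Assumed default install prefix for cuda-toolkit-12-3.
CUDA_HOME=/usr/local/cuda-12.3

# Put the CUDA binaries and libraries on the search paths for this shell.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report the compiler version if nvcc is now reachable.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found under $CUDA_HOME/bin"
fi
```

To keep the setting across logins, the two export lines can be added to your shell profile, for example ~/.bashrc.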

Disk Setup (LVM Setting)

To configure the AI SSD so that AIDE can use it when executing fine-tuning tasks, follow these steps:

Install LVM

sudo apt update
sudo apt install lvm2 xfsprogs

Check Disks Location

lshw -class disk -class storage | grep -E 'ai100|logical name|version: EIFZ'
lsblk | grep nvme

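Note the device identifiers of the ai100 disks in the output (for example, nvme6n1 and nvme8n1). The commands in the following steps use /dev/nvme1n1 and /dev/nvme2n1; replace them with the identifiers reported on your host if they differ.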
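
Because the following steps erase data, it can help to confirm that the chosen paths really are block devices before continuing. A minimal sketch; all_block_devices is a hypothetical helper, and the DISKS value is only an example that should be replaced with the identifiers found on your host.

```shell
# Hypothetical guard: succeed only if every argument is a block device.
all_block_devices() {
  for dev in "$@"; do
    if [ ! -b "$dev" ]; then
      echo "Not a block device: $dev" >&2
      return 1
    fi
  done
  return 0
}

# Example disks; replace with the identifiers reported on your host.
DISKS="/dev/nvme1n1 /dev/nvme2n1"

# $DISKS is intentionally unquoted so it splits into separate arguments.
if all_block_devices $DISKS; then
  lsblk -d -o NAME,SIZE,MODEL $DISKS
fi
```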

Clear Disks (Just In Case)

sudo wipefs -a /dev/nvme1n1 /dev/nvme2n1

Create LVM

sudo pvcreate /dev/nvme1n1 /dev/nvme2n1
sudo vgcreate ai /dev/nvme1n1 /dev/nvme2n1
sudo lvcreate --type striped -i 2 -I 128k -l 100%FREE -n ai ai
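
The -i 2 -I 128k options create two stripes with a 128 KiB stripe size, so consecutive 128 KiB chunks of the volume alternate between the two disks. The arithmetic can be sketched as follows; stripe_index is a hypothetical helper for illustration, not an LVM command.

```shell
STRIPES=2                      # -i 2
STRIPE_SIZE=$((128 * 1024))    # -I 128k, in bytes

# Print which stripe (0-based) holds the given byte offset of the volume.
stripe_index() {
  echo $(( ($1 / STRIPE_SIZE) % STRIPES ))
}

stripe_index 0        # first chunk, prints 0
stripe_index 131072   # second chunk, prints 1
stripe_index 262144   # third chunk wraps around, prints 0
```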

Mount LVM

Format the disk

sudo mkfs.xfs -f -s size=4k -m crc=0 /dev/ai/ai

Mount the disk

sudo mkdir -p /mnt/nvme0
sudo mount /dev/ai/ai /mnt/nvme0
sudo chown -R $USER:$USER /mnt/nvme0

Make Mount Persistent

echo '/dev/ai/ai /mnt/nvme0 xfs defaults,nofail 0 0' | sudo tee -a /etc/fstab

To remove the persistent mount setting, run: sudo sed -i '/\/dev\/ai\/ai/d' /etc/fstab
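
Running the append command more than once leaves duplicate entries in /etc/fstab. One way to avoid that is to append only when the exact line is missing. The sketch below demonstrates the idea on a scratch file; append_once is a hypothetical helper, not part of any package.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
append_once() {
  file=$1
  line=$2
  if grep -qxF -- "$line" "$file" 2>/dev/null; then
    echo "Already present in $file"
  else
    printf '%s\n' "$line" >> "$file"
  fi
}

# Demonstrate on a scratch file; calling it twice adds the entry only once.
scratch=$(mktemp)
append_once "$scratch" '/dev/ai/ai /mnt/nvme0 xfs defaults,nofail 0 0'
append_once "$scratch" '/dev/ai/ai /mnt/nvme0 xfs defaults,nofail 0 0'
cat "$scratch"
```

To apply the same helper to /etc/fstab, it would need to run from a root shell, since the file is not writable by regular users.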

Successful Example

If the LVM setup is successful, running lsblk shows the logical volume ai-ai under each of the two NVMe disks, mounted at /mnt/nvme0.

If you need to remove the LVM setup, run the following commands:

sudo umount /mnt/nvme0
sudo lvremove -y ai
sudo pvremove -y /dev/nvme1n1 /dev/nvme2n1 --force --force

Swap File Setting

Enable swap space to supplement DRAM, which can allow you to increase batch sizes.

Create swap file

sudo dd if=/dev/zero of=/mnt/nvme0/swapfile bs=1M count=256k
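
With bs=1M and count=256k, dd writes 256 * 1024 blocks of 1 MiB each, which makes a 256 GiB swap file. The sketch below reproduces that arithmetic and, if the mount point exists, prints the free space so the two can be compared; the variable names are illustrative.

```shell
# dd suffixes: bs=1M is 1 MiB per block, count=256k is 256 * 1024 blocks.
BLOCK_MIB=1
COUNT=$((256 * 1024))

# Swap file size in GiB (1 GiB = 1024 MiB).
SWAP_GIB=$(( BLOCK_MIB * COUNT / 1024 ))
echo "Swap file size: ${SWAP_GIB} GiB"

# Free space on the target mount, if it exists on this host.
if [ -d /mnt/nvme0 ]; then
  df -h /mnt/nvme0
fi
```

For a smaller swap file, lower the count value accordingly, for example count=64k for 64 GiB.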

Modify permission

sudo chmod 0600 /mnt/nvme0/swapfile

Initialize swap file

sudo mkswap /mnt/nvme0/swapfile

Enable the swap

sudo swapon /mnt/nvme0/swapfile
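
To confirm that the swap file is in use, you can run swapon --show or inspect /proc/swaps. A minimal sketch of the second approach; is_swap_active is a hypothetical helper whose optional second argument exists only so the check can be tried against a sample file.

```shell
# Hypothetical helper: succeed if SWAPFILE is listed in a /proc/swaps-style table.
is_swap_active() {
  swapfile=$1
  table=${2:-/proc/swaps}
  # Skip the header row and compare the first column with the given path.
  awk -v p="$swapfile" 'NR > 1 && $1 == p { found = 1 } END { exit !found }' "$table" 2>/dev/null
}

if is_swap_active /mnt/nvme0/swapfile; then
  echo "swap file is active"
else
  echo "swap file is not active"
fi
```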

Make the swap permanent

echo '/mnt/nvme0/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

If you would like to remove the swap, follow the steps below in order to prevent unexpected system issues.

sudo swapoff /mnt/nvme0/swapfile
sudo sed -i '/\/mnt\/nvme0\/swapfile/d' /etc/fstab
sudo rm /mnt/nvme0/swapfile

Install Docker

Run the following command to uninstall all conflicting packages:

for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done

apt-get might report that you have none of these packages installed.

Set up Docker's apt repository.

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the Docker packages.

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Add user to docker group.

sudo usermod -aG docker ACCOUNT

Here ACCOUNT is the account you log in with. After this, log out and log back in so that your session picks up membership of the docker group.
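
Group changes only apply to new login sessions, so it is worth checking the groups of the current session before running docker without sudo. A minimal sketch; in_group is a hypothetical helper built on id -nG, which lists the groups of the current process.

```shell
# Hypothetical helper: succeed if the current session's groups include GROUP.
in_group() {
  id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

if in_group docker; then
  echo "This session is in the docker group"
else
  echo "Not in the docker group yet; log out and back in"
fi
```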

Verify that the installation is successful.

docker run hello-world

This command downloads a test image and runs it in a container. When the container runs, it prints a confirmation message and exits.

Install NVIDIA Container Toolkit

Configure the production repository.

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the packages list from the repository.

sudo apt-get update

Install the NVIDIA Container Toolkit packages.

sudo apt-get install -y nvidia-container-toolkit

Configure the container runtime by using the nvidia-ctk command.

sudo nvidia-ctk runtime configure --runtime=docker

Restart the Docker daemon.

sudo systemctl restart docker
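
The nvidia-ctk command registers an nvidia runtime in /etc/docker/daemon.json. The sketch below checks for that entry with a simple text match; has_nvidia_runtime is a hypothetical helper, and its optional argument exists only so it can be tried on another file.

```shell
# Hypothetical helper: succeed if a Docker daemon.json mentions an "nvidia" runtime.
has_nvidia_runtime() {
  config=${1:-/etc/docker/daemon.json}
  [ -f "$config" ] && grep -q '"nvidia"' "$config"
}

if has_nvidia_runtime; then
  echo "nvidia runtime is registered with Docker"
else
  echo "nvidia runtime not found in /etc/docker/daemon.json"
fi
```

For an end-to-end check, NVIDIA's documentation suggests running a sample workload such as sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi, which should print the same GPU table as on the host.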

Install GenAI Studio

GenAI Studio ships with an installer so that users can install it with ease. Normally, all you need to do is download the installer and run it.

The GenAI Studio installation file is approximately 30GB. To ensure smooth system installation, we recommend having at least 100GB of free disk space.

Please contact your technical contact for the installation file, which is named in the format GenAI-Studio_<VERSION>_setup.run. If you downloaded the installer on a different machine, move it to the target host. Finally, run the installer and answer the questions it asks during the process.

Check the permissions of the installer file you downloaded. If it does not have execute permission, add it with the chmod 0755 INSTALLER_FILE command.
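
The permission check can be sketched as a small helper that changes the mode only when the execute bit is missing. The example runs on a scratch file standing in for the installer; ensure_executable is a hypothetical name.

```shell
# Hypothetical helper: add execute permission only when it is missing.
ensure_executable() {
  if [ ! -x "$1" ]; then
    chmod 0755 "$1"
  fi
}

# Demonstrate on a scratch file standing in for the installer.
installer=$(mktemp)
ensure_executable "$installer"
ls -l "$installer"
```

With the real installer, pass its path instead of the scratch file and then run it.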

Start GenAI Studio

If everything goes well, GenAI Studio is installed under the $HOME/Advantech/GenAI-Studio directory. Change to ~/Advantech/GenAI-Studio/bin and run ./app-up. After a few seconds, open a browser and visit the target host on port 3001.

Before the v1.1.0 release, the installation path was $HOME/GenAI-Studio.
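
Since the service takes a moment to come up, a short polling loop can tell you when port 3001 starts answering. This is a sketch under assumptions: it presumes the web UI is served over plain HTTP, that curl is installed, and TARGET_HOST and wait_for_url are hypothetical names.

```shell
# Hypothetical helper: poll URL until it answers or ATTEMPTS run out.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Any HTTP response counts as "up"; connection errors count as "down".
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}

# TARGET_HOST is a placeholder for the host running GenAI Studio.
TARGET_HOST=${TARGET_HOST:-localhost}
if wait_for_url "http://$TARGET_HOST:3001/" 3; then
  echo "GenAI Studio is answering on port 3001"
else
  echo "No response from port 3001 yet"
fi
```

Increase the attempt count if the host needs longer to start the service.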
