Validation
Max Token Input: Sets an upper limit on the number of tokens the model may generate.
Temperature Control: Provides a slider to adjust the randomness of the model's outputs; lower values yield more deterministic responses, while higher values yield more creative ones.
Top-p Sampling: Lets users adjust nucleus sampling, in which the model samples only from the smallest set of tokens whose cumulative probability reaches p, to tune response quality.
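These three parameters map directly onto the request body of most OpenAI-compatible inference APIs. A minimal sketch, assuming such an endpoint; the URL is a placeholder, not part of this document:

```python
import json
from urllib import request

# Hypothetical endpoint; substitute your deployment's actual URL.
API_URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "What is model validation?"}],
    "max_tokens": 256,   # Max Token Input: cap on generated tokens
    "temperature": 0.7,  # lower -> more deterministic, higher -> more creative
    "top_p": 0.9,        # nucleus sampling: smallest token set with cumulative prob >= 0.9
}

req = request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# response = request.urlopen(req)  # uncomment to send against a live server
```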
Users can select from multiple models available for validation.
Clicking on the model list allows you to choose from a variety of available models. These include both pre-trained base models and custom models that have been fine-tuned for specific tasks.
Example Models:
Foundation Model: meta-llama/Llama-3.1-8B-Instruct
Fine-tuned Model: AIR-8B - Epoch 4
The interface displays outputs from different models or configurations in parallel for easy comparison.
Each column corresponds to a model with its specific parameters.
A dedicated input field for entering questions or prompts to be validated.
Users can add questions using the text input area at the bottom of the interface or upload a JSON file to provide a batch of questions.
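The exact schema of the uploaded JSON file is not specified here; a plausible minimal format is a JSON array of question strings. A sketch for preparing and sanity-checking such a file before upload (the schema is an assumption, so verify it against your actual upload requirements):

```python
import json

# Assumed batch format: a JSON array of question strings.
# The schema the upload button actually expects may differ.
questions = [
    "Summarize the company's leave policy.",
    "What GPUs does the platform support?",
    "How do the epoch 1 and epoch 4 checkpoints differ?",
]

with open("questions.json", "w", encoding="utf-8") as f:
    json.dump(questions, f, ensure_ascii=False, indent=2)

# Quick sanity check before uploading
with open("questions.json", encoding="utf-8") as f:
    loaded = json.load(f)
assert all(isinstance(q, str) and q.strip() for q in loaded)
```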
Submit Button: Initiates the validation process based on the current configuration.
Reset Button: Clears all question inputs.
Case I: Foundation Model vs. Fine-tuned Model (Epoch 4)
Case II: Fine-tuned Model (Epoch 1) vs. Fine-tuned Model (Epoch 4)
Download: Download the questions and their corresponding model-generated answers as a CSV file.
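The downloaded CSV can be post-processed with standard tooling. A sketch using Python's csv module; the column names below are illustrative assumptions, so inspect the header row of your actual download:

```python
import csv

# Assumed layout: one row per question, one column per model's answer.
rows = [
    {"question": "What is Top-p?", "model_a": "answer A", "model_b": "answer B"},
]

# Write a file in the assumed layout (stands in for the downloaded CSV).
with open("validation_results.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "model_a", "model_b"])
    writer.writeheader()
    writer.writerows(rows)

# Read it back for analysis.
with open("validation_results.csv", newline="", encoding="utf-8") as f:
    results = list(csv.DictReader(f))
```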
Create (Model Quantization):
Create Workspace with this inference: Quantizes the model and creates a new Workspace that applies the quantized model directly. The default quantization format is q4_k_m; advanced quantization options are available in advanced mode. The model selection list displays fine-tuned models from the various training epochs, along with a statistical summary of user ratings for each model's responses to aid your selection.
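As a rough sense of what q4_k_m buys you: it stores weights at roughly 4.8 to 5 bits each on average (an approximate figure, not from this document; the exact value varies by tensor), versus 16 bits in FP16. A back-of-the-envelope size estimate:

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough model file size, ignoring metadata and per-tensor overhead."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

fp16_size = model_size_gb(8, 16)    # 8B params at 16 bits -> ~16 GB
q4km_size = model_size_gb(8, 4.85)  # assumed ~4.85 bits/weight average for q4_k_m
```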
Import to Ollama's Inference Repo: Places the model into a repository that Ollama can access and serve.
Once the model has been quantized, you can locate it in the AI Provider section, under LLM and then Ollama.
Upon clicking Submit, the system queries both models with the provided question. There will be a brief loading delay while the models are loaded onto the GPU. Once the models have generated their responses, a "like" icon appears next to each answer; if you are satisfied with a particular response, click the corresponding icon. The system records the number of "likes" each model receives, which is used for the subsequent model quantization step.
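The like counts amount to a simple preference tally between the two models. A sketch of the kind of aggregation involved; the model names and like log here are hypothetical:

```python
from collections import Counter

# Hypothetical log of which model's answer was "liked" for each question.
likes = [
    "AIR-8B-epoch4",
    "AIR-8B-epoch4",
    "Llama-3.1-8B-Instruct",
    "AIR-8B-epoch4",
]

tally = Counter(likes)
total = sum(tally.values())
# Fraction of liked answers each model received, to guide model selection.
preference = {model: count / total for model, count in tally.items()}
```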