Inference Chat

In GenAI Studio, a workspace is the core unit for organizing documents, conversations, and interactions with the language model. Each workspace keeps its content compartmentalized and specific to one context.

Workspace Concept in GenAI Studio

A workspace in GenAI Studio is a logical partition that acts as a dedicated environment for grouping related documents, chats, and configurations. It enables users to organize their interactions with the AI model by isolating contexts, ensuring that conversations and retrieved information remain relevant to specific topics, projects, or use cases. Think of a workspace as a "container" or "thought space" where documents and their associated embeddings (vectorized representations) are stored and accessed independently from other workspaces.

Key Characteristics of Workspaces

Document Containerization:
- Each workspace can hold its own set of documents (e.g., PDF, TXT, or DOCX files). A document uploaded to GenAI Studio is not automatically available to all workspaces; it must be explicitly moved or embedded into a specific workspace.
- This containerization ensures that the AI only retrieves information from the documents assigned to the active workspace, preventing irrelevant or cross-contextual data from affecting responses.
Contextual Clarity:
- Workspaces maintain their own conversational context. When you interact with the AI within a workspace, the model references only the documents and chat history associated with that workspace.
- This isolation eliminates confusion caused by overlapping or unrelated information, making responses more accurate and focused.
Privacy and Segregation:
- Workspaces do not "talk" to each other. Documents embedded in one workspace are inaccessible to another, ensuring data privacy and segregation.
- This feature is particularly useful for teams or individuals managing multiple projects, as it keeps sensitive or project-specific information compartmentalized.
Customizable Configurations:
- Each workspace can be configured with its own settings, such as a specific language model (LLM), embedding model, system prompt, response temperature, or document similarity threshold.
- For example, a workspace for technical research might use a low temperature for precise, consistent answers, while a creative writing workspace could use a higher temperature for more open-ended output.
Persistent Storage:
- Workspaces, along with their documents, embeddings, and chat histories, are stored in the configured storage system.
- This persistence ensures that shutting down and restarting GenAI Studio does not result in data loss, allowing users to resume their work seamlessly.
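
The characteristics above can be sketched as a small data model. This is a minimal illustration, not GenAI Studio's actual schema: the class names, field names, and default values are all assumptions chosen to mirror the settings listed under Customizable Configurations and the isolation described under Document Containerization.

```python
from dataclasses import dataclass, field


@dataclass
class WorkspaceConfig:
    """Per-workspace settings. Field names and defaults are hypothetical."""
    llm: str = "example-llm"
    embedding_model: str = "example-embedder"
    system_prompt: str = "You are a helpful assistant."
    temperature: float = 0.7
    similarity_threshold: float = 0.25


@dataclass
class Workspace:
    """A named container holding its own documents and configuration."""
    name: str
    config: WorkspaceConfig = field(default_factory=WorkspaceConfig)
    documents: dict[str, str] = field(default_factory=dict)  # document name -> text

    def embed_document(self, doc_name: str, text: str) -> None:
        # A document becomes visible only in the workspace it is added to.
        self.documents[doc_name] = text

    def visible_documents(self) -> list[str]:
        return sorted(self.documents)


# Two workspaces with different temperatures and separate document sets.
research = Workspace("Technical Research", WorkspaceConfig(temperature=0.1))
writing = Workspace("Creative Writing", WorkspaceConfig(temperature=0.9))
research.embed_document("spec.pdf", "Protocol specification text")

print(research.visible_documents())
print(writing.visible_documents())
```

Because each workspace owns its own document mapping, adding a document to one workspace leaves the other unchanged, which is the behavior the isolation bullets describe.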

How Workspaces Function in Practice

When a user interacts with GenAI Studio, the workflow involving workspaces typically follows these steps:

Workspace Creation:
- A user creates a workspace, giving it a name (e.g., "Project Alpha," "Recipe Collection," or "Legal Research").
- This workspace becomes a dedicated space for all related documents and interactions.
Document Embedding:
- Documents are uploaded to GenAI Studio and then moved to a specific workspace. During this process, the document is chunked into smaller text segments, which are converted into numerical vectors using an embedding model.
- These vectors are stored in a vector database (Qdrant) and associated with the workspace.
Query Processing with Retrieval-Augmented Generation (RAG):
- When a user asks a question in a workspace, GenAI Studio uses RAG to process the query:
  - The query is matched against the vector database to retrieve the most relevant text chunks from the workspace’s documents.
  - Typically, 4–6 relevant chunks are selected and passed to the LLM as context, along with the workspace’s system prompt and chat history.
  - The LLM then generates a response based on this context, ensuring that answers are grounded in the workspace’s documents.
Chat and Agent Interactions:
- Workspaces support both conversational and query-based interactions. In conversation mode, the AI retains context from previous messages within the workspace. In query mode, it provides direct answers based on document content.
- AI agents, when invoked (e.g., via @agent), operate within the workspace’s context and can perform tasks like summarizing documents, scraping websites, or generating charts, all while respecting the workspace’s boundaries.
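
The embedding and retrieval steps above can be sketched end to end. This is a simplified illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, `InMemoryVectorStore` stands in for the vector database, and every function name, parameter, and the prompt layout are hypothetical rather than GenAI Studio's implementation.

```python
import math
import re
from collections import Counter, defaultdict


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive segments of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Stand-in for the vector database; vectors are partitioned by workspace."""

    def __init__(self) -> None:
        self._by_workspace = defaultdict(list)  # workspace name -> [(vector, chunk)]

    def add_document(self, workspace: str, text: str, max_words: int = 40) -> None:
        for chunk in chunk_text(text, max_words):
            self._by_workspace[workspace].append((embed(chunk), chunk))

    def search(self, workspace: str, query: str, top_k: int = 4) -> list[str]:
        # Only vectors stored under the given workspace are ever compared.
        q = embed(query)
        scored = [(cosine(q, vec), chunk) for vec, chunk in self._by_workspace.get(workspace, [])]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [chunk for _, chunk in scored[:top_k]]


def build_prompt(system_prompt: str, history: list[str], chunks: list[str], question: str) -> str:
    """Combine system prompt, retrieved context, chat history, and the question."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    past = "\n".join(history)
    return f"{system_prompt}\n\nContext:\n{context}\n\nHistory:\n{past}\n\nQuestion: {question}"


store = InMemoryVectorStore()
store.add_document(
    "Recipe Collection",
    "Sourdough bread needs a long, slow fermentation. Pancakes need baking powder and milk.",
    max_words=7,
)
store.add_document("Legal Research", "A contract requires offer, acceptance, and consideration.")

question = "How long should sourdough fermentation take?"
recipe_chunks = store.search("Recipe Collection", question)
legal_chunks = store.search("Legal Research", question)
prompt = build_prompt("Answer from the context only.", [], recipe_chunks, question)
print(prompt)
```

The same question returns the sourdough chunk in the recipe workspace and nothing in the legal workspace, because the search never looks outside the workspace it is given. The resulting prompt would then be sent to the workspace's LLM.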

Benefits of the Workspace Concept

Organization: Workspaces provide a structured way to manage documents and conversations, making it easier to handle multiple projects or topics without overlap.
Efficiency: Vector caching and document sharing reduce computational overhead, because a document's existing embeddings are reused rather than recomputed.
Flexibility: Users can tailor each workspace to specific needs, from model selection to prompt design, enabling diverse use cases within a single application.
Scalability: Workspaces support collaboration through multi-user permissions, allowing teams to work on shared projects while maintaining control over access.
Privacy: The segregation of workspaces ensures that sensitive data remains confined to its intended context, enhancing security.
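
The efficiency point about reusing embeddings can be illustrated with a small cache. This is a sketch only: how GenAI Studio keys or stores cached vectors is not described here, so keying by a SHA-256 hash of the text is an assumption, and `EmbeddingCache` and `toy_embed` are hypothetical names.

```python
import hashlib


class EmbeddingCache:
    """Caches embeddings by content hash so identical text is embedded once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._cache = {}
        self.misses = 0  # number of times the embedding function actually ran

    def get(self, text: str):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]


def toy_embed(text: str) -> list[float]:
    """Placeholder for a real embedding model call."""
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(toy_embed)
first = cache.get("Quarterly report text")
second = cache.get("Quarterly report text")  # served from the cache
print(cache.misses)
```

The second lookup returns the stored vector without calling the embedding function again, which is where the saved computation comes from.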

Use Case Examples

Academic Research: A researcher creates a workspace for each paper or topic, embedding relevant articles and notes. Queries within the workspace yield precise answers based only on those materials.
Team Collaboration: A marketing team uses one workspace for campaign planning and another for competitor analysis, with each workspace containing relevant documents and custom prompts.
Personal Knowledge Management: An individual creates separate workspaces for recipes, travel plans, and work notes, keeping each domain distinct for clarity.