RAG API

This document describes the API endpoints for managing Retrieval-Augmented Generation (RAG) components in PySpur: document collections and the vector indices built from them.

Document Collections

Create Document Collection

Description: Creates a new document collection from uploaded files and metadata. The files are processed asynchronously in the background, so the returned status may still be processing; use Get Collection Progress to track completion.

URL: /rag/collections/

Method: POST

Form Data:

files: List[UploadFile]  # List of files to upload (optional)
metadata: str  # JSON string containing collection configuration

Where metadata is a JSON string representing:

class DocumentCollectionCreateSchema:
    name: str  # Name of the collection
    description: str  # Description of the collection
    text_processing: ChunkingConfigSchema  # Configuration for text processing

Response Schema:

class DocumentCollectionResponseSchema:
    id: str  # ID of the document collection
    name: str  # Name of the collection
    description: str  # Description of the collection
    status: str  # Status of the collection (processing, ready, failed)
    created_at: str  # When the collection was created (ISO format)
    updated_at: str  # When the collection was last updated (ISO format)
    document_count: int  # Number of documents in the collection
    chunk_count: int  # Number of chunks in the collection
    error_message: Optional[str]  # Error message if processing failed
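
Because metadata travels as a JSON string inside the form data rather than as a JSON body, the client has to serialize it first. The following is a minimal sketch of that step. The helper name `build_collection_metadata` and the `chunk_size` key are hypothetical: the fields of ChunkingConfigSchema are not listed in this document, so check the schema before relying on any specific key.

```python
import json


def build_collection_metadata(name: str, description: str, text_processing: dict) -> str:
    """Serialize collection settings into the JSON string sent as the `metadata` form field.

    `text_processing` is passed through unchanged; its keys must follow
    ChunkingConfigSchema, which is not defined in this document.
    """
    if not name:
        raise ValueError("name must be a non-empty string")
    return json.dumps(
        {
            "name": name,
            "description": description,
            "text_processing": text_processing,
        }
    )


# Hypothetical chunking settings, shown only to illustrate the nesting.
metadata = build_collection_metadata(
    name="product-docs",
    description="Manuals and release notes",
    text_processing={"chunk_size": 1000},
)
print(metadata)
```

The resulting string is what goes into the metadata form field, alongside any files.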

List Document Collections

Description: Lists all document collections.

URL: /rag/collections/

Method: GET

Response Schema:

List[DocumentCollectionResponseSchema]

Get Document Collection

Description: Gets details of a specific document collection.

URL: /rag/collections/{collection_id}/

Method: GET

Parameters:

collection_id: str  # ID of the document collection

Response Schema:

class DocumentCollectionResponseSchema:
    id: str  # ID of the document collection
    name: str  # Name of the collection
    description: str  # Description of the collection
    status: str  # Status of the collection (processing, ready, failed)
    created_at: str  # When the collection was created (ISO format)
    updated_at: str  # When the collection was last updated (ISO format)
    document_count: int  # Number of documents in the collection
    chunk_count: int  # Number of chunks in the collection
    error_message: Optional[str]  # Error message if processing failed
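
On the client side it can be convenient to turn this response into a typed object. The sketch below mirrors the fields listed above in a dataclass and parses the ISO timestamps. The class name `DocumentCollection`, the `is_ready` property, and all sample values are illustrative assumptions, not part of the PySpur API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class DocumentCollection:
    """Client-side mirror of DocumentCollectionResponseSchema (hypothetical helper)."""

    id: str
    name: str
    description: str
    status: str
    created_at: datetime
    updated_at: datetime
    document_count: int
    chunk_count: int
    error_message: Optional[str] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DocumentCollection":
        """Build an instance from the decoded JSON response."""
        return cls(
            id=data["id"],
            name=data["name"],
            description=data["description"],
            status=data["status"],
            # Assumes timestamps without a trailing "Z", which older
            # Python versions cannot parse with fromisoformat.
            created_at=datetime.fromisoformat(data["created_at"]),
            updated_at=datetime.fromisoformat(data["updated_at"]),
            document_count=data["document_count"],
            chunk_count=data["chunk_count"],
            error_message=data.get("error_message"),
        )

    @property
    def is_ready(self) -> bool:
        """True once background processing has finished successfully."""
        return self.status == "ready"


# Made-up sample payload in the documented shape.
sample = {
    "id": "col-123",
    "name": "product-docs",
    "description": "Manuals and release notes",
    "status": "ready",
    "created_at": "2025-01-15T10:30:00",
    "updated_at": "2025-01-15T10:42:10",
    "document_count": 3,
    "chunk_count": 57,
    "error_message": None,
}
collection = DocumentCollection.from_dict(sample)
print(collection.name, collection.is_ready)
```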

Delete Document Collection

Description: Deletes a document collection and its associated data.

URL: /rag/collections/{collection_id}/

Method: DELETE

Parameters:

collection_id: str  # ID of the document collection

Response: 200 OK with message

Get Collection Progress

Description: Gets the processing progress of a document collection.

URL: /rag/collections/{collection_id}/progress/

Method: GET

Parameters:

collection_id: str  # ID of the document collection

Response Schema:

class ProcessingProgressSchema:
    id: str  # ID of the collection
    status: str  # Status of processing
    progress: float  # Progress percentage (0-100)
    current_step: Optional[str]  # Current processing step
    total_files: Optional[int]  # Total number of files
    processed_files: Optional[int]  # Number of processed files
    total_chunks: Optional[int]  # Total number of chunks
    processed_chunks: Optional[int]  # Number of processed chunks
    error_message: Optional[str]  # Error message if processing failed
    created_at: str  # When processing started (ISO format)
    updated_at: str  # When processing was last updated (ISO format)
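
Since processing runs in the background, a client typically polls this endpoint until the work finishes. The sketch below shows one way to structure that loop. It assumes the progress status takes the same values listed for collections (processing, ready, failed), and it takes the HTTP call as an injected `fetch_progress` function, a hypothetical callable that returns the decoded progress response.

```python
import time
from typing import Any, Callable, Dict


def wait_until_done(
    fetch_progress: Callable[[], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until status is "ready"; raise on "failed" or when the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        progress = fetch_progress()
        if progress["status"] == "ready":
            return progress
        if progress["status"] == "failed":
            raise RuntimeError(progress.get("error_message") or "processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("processing did not finish before the timeout")
        sleep(poll_interval)


# Stand-in for real HTTP calls: three canned responses in the documented shape.
_responses = iter(
    [
        {"status": "processing", "progress": 40.0},
        {"status": "processing", "progress": 85.0},
        {"status": "ready", "progress": 100.0},
    ]
)
final = wait_until_done(lambda: next(_responses), poll_interval=0.0)
print(final["status"], final["progress"])
```

Injecting the fetch function keeps the loop independent of any particular HTTP client and makes it easy to exercise without a running server.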

Add Documents to Collection

Description: Adds documents to an existing collection. The documents are processed asynchronously in the background.

URL: /rag/collections/{collection_id}/documents/

Method: POST

Parameters:

collection_id: str  # ID of the document collection

Form Data:

files: List[UploadFile]  # List of files to upload

Response Schema:

class DocumentCollectionResponseSchema:
    # Same fields as the Get Document Collection response
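
The files field is a list, so each file is sent as a separate multipart part under the same field name. HTTP client libraries normally handle this encoding; the sketch below spells it out with the standard library only, to show what ends up on the wire. The helper name `encode_multipart_files` and the sample file names are assumptions made for illustration.

```python
import uuid
from typing import List, Tuple


def encode_multipart_files(
    files: List[Tuple[str, bytes, str]], field_name: str = "files"
) -> Tuple[bytes, str]:
    """Encode (filename, content, content_type) triples as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content, content_type in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # The closing delimiter marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart_files(
    [
        ("notes.txt", b"first file", "text/plain"),
        ("faq.md", b"second file", "text/markdown"),
    ]
)
print(content_type)
```

The body and header value would then be sent in a POST to /rag/collections/{collection_id}/documents/.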

Get Collection Documents

Description: Gets all documents and their chunks for a collection.

URL: /rag/collections/{collection_id}/documents/

Method: GET

Parameters:

collection_id: str  # ID of the document collection

Response Schema:

List[DocumentWithChunksSchema]

Where DocumentWithChunksSchema contains:

class DocumentWithChunksSchema:
    id: str  # ID of the document
    title: str  # Title of the document
    metadata: Dict[str, Any]  # Metadata about the document
    chunks: List[DocumentChunkSchema]  # List of chunks in the document
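
As a small illustration of working with this nested response, the sketch below reduces a list of documents to one row per document with its chunk count. The helper name `summarize_documents` and the sample data are assumptions. The fields of DocumentChunkSchema are not listed in this document, so the chunks here are empty placeholders.

```python
from typing import Any, Dict, List, Tuple


def summarize_documents(documents: List[Dict[str, Any]]) -> List[Tuple[str, str, int]]:
    """Return (document id, title, number of chunks) for each document."""
    return [(doc["id"], doc["title"], len(doc["chunks"])) for doc in documents]


# Made-up response in the documented shape; chunk contents are placeholders.
documents = [
    {"id": "doc-1", "title": "Install guide", "metadata": {}, "chunks": [{}, {}]},
    {"id": "doc-2", "title": "FAQ", "metadata": {}, "chunks": []},
]
summary = summarize_documents(documents)
for doc_id, title, chunk_count in summary:
    print(f"{doc_id}  {title}: {chunk_count} chunks")
```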

Delete Document from Collection

Description: Deletes a document from a collection.

URL: /rag/collections/{collection_id}/documents/{document_id}/

Method: DELETE

Parameters:

collection_id: str  # ID of the document collection
document_id: str  # ID of the document to delete

Response: 200 OK with message
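
The two path parameters are interpolated into the URL, so they should be percent-encoded if they can contain reserved characters. The sketch below builds the DELETE request with the standard library without sending it. The `BASE_URL` value and the sample IDs are assumptions; adjust them to your deployment.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: where the PySpur API is served in your deployment.
BASE_URL = "http://localhost:6080/api"


def build_delete_document_request(collection_id: str, document_id: str) -> Request:
    """Build (but do not send) the DELETE request for one document."""
    url = (
        f"{BASE_URL}/rag/collections/{quote(collection_id, safe='')}"
        f"/documents/{quote(document_id, safe='')}/"
    )
    return Request(url, method="DELETE")


request = build_delete_document_request("col-123", "doc 7")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it.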

Preview Chunk

Description: Previews how a document would be chunked with a given configuration.

URL: /rag/collections/preview_chunk/

Method: POST

Form Data:

file: UploadFile  # File to preview
chunking_config: str  # JSON string containing chunking configuration

Response Schema:

{
    "chunks": List[Dict[str, Any]],  # Preview of chunks
    "total_chunks": int  # Total number of chunks
}
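
The response carries both a list of chunks and a separate total. The sketch below reports the two side by side, on the assumption that the preview may contain fewer chunks than the total; this document does not state whether the list is truncated. The helper name and sample values are illustrative.

```python
from typing import Any, Dict


def preview_summary(response: Dict[str, Any]) -> str:
    """Describe how many chunks the preview shows relative to the reported total."""
    shown = len(response["chunks"])
    total = response["total_chunks"]
    return f"showing {shown} of {total} chunks"


# Made-up response; the keys inside each chunk are not specified in this document.
sample_preview = {"chunks": [{}, {}, {}], "total_chunks": 12}
summary_line = preview_summary(sample_preview)
print(summary_line)
```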

Vector Indices

Create Vector Index

Description: Creates a new vector index from a document collection. The index is created asynchronously in the background.

URL: /rag/indices/

Method: POST

Request Payload:

class VectorIndexCreateSchema:
    name: str  # Name of the index
    description: str  # Description of the index
    collection_id: str  # ID of the document collection
    embedding: EmbeddingConfigSchema  # Configuration for embedding

Response Schema:

class VectorIndexResponseSchema:
    id: str  # ID of the vector index
    name: str  # Name of the index
    description: str  # Description of the index
    collection_id: str  # ID of the document collection
    status: str  # Status of the index (processing, ready, failed)
    created_at: str  # When the index was created (ISO format)
    updated_at: str  # When the index was last updated (ISO format)
    document_count: int  # Number of documents in the index
    chunk_count: int  # Number of chunks in the index
    embedding_model: str  # Name of the embedding model
    vector_db: str  # Name of the vector database
    error_message: Optional[str]  # Error message if processing failed
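
Unlike the collection endpoints, this endpoint takes a request payload rather than form data. The sketch below builds such a request as a JSON body with the standard library, without sending it; sending the payload as JSON is an assumption based on the schema above. The `BASE_URL` value, the helper name, and the `model` key inside `embedding` are also assumptions, since the fields of EmbeddingConfigSchema are not listed in this document.

```python
import json
from urllib.request import Request

# Assumption: where the PySpur API is served in your deployment.
BASE_URL = "http://localhost:6080/api"


def build_create_index_request(
    name: str, description: str, collection_id: str, embedding: dict
) -> Request:
    """Build (but do not send) the POST request that creates a vector index."""
    payload = {
        "name": name,
        "description": description,
        "collection_id": collection_id,
        # Must follow EmbeddingConfigSchema, which is not defined in this document.
        "embedding": embedding,
    }
    return Request(
        f"{BASE_URL}/rag/indices/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


index_request = build_create_index_request(
    name="product-docs-index",
    description="Embeddings for the product docs",
    collection_id="col-123",
    embedding={"model": "example-embedding-model"},  # hypothetical key and value
)
print(index_request.get_method(), index_request.full_url)
```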

List Vector Indices

Description: Lists all vector indices.

URL: /rag/indices/

Method: GET

Response Schema:

List[VectorIndexResponseSchema]

Get Vector Index

Description: Gets details of a specific vector index.

URL: /rag/indices/{index_id}/

Method: GET

Parameters:

index_id: str  # ID of the vector index

Response Schema:

class VectorIndexResponseSchema:
    # Same as Create Vector Index response

Delete Vector Index

Description: Deletes a vector index and its associated data.

URL: /rag/indices/{index_id}/

Method: DELETE

Parameters:

index_id: str  # ID of the vector index

Response: 200 OK with message

Get Index Progress

Description: Gets the processing progress of a vector index.

URL: /rag/indices/{index_id}/progress/

Method: GET

Parameters:

index_id: str  # ID of the vector index

Response Schema:

class ProcessingProgressSchema:
    # Same as Get Collection Progress response
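
Several fields of the progress response are optional, so client code should not assume they are present. The sketch below formats a progress response into a single status line and skips whatever is missing. The helper name and the sample values, including the `embedding` step name, are made up for illustration.

```python
from typing import Any, Dict


def format_progress(progress: Dict[str, Any]) -> str:
    """Render a progress response as one line, omitting absent optional fields."""
    parts = [f"{progress['status']}: {progress['progress']:.0f}%"]
    if progress.get("current_step"):
        parts.append(f"step={progress['current_step']}")
    processed = progress.get("processed_chunks")
    total = progress.get("total_chunks")
    if processed is not None and total is not None:
        parts.append(f"chunks={processed}/{total}")
    if progress.get("error_message"):
        parts.append(f"error={progress['error_message']}")
    return ", ".join(parts)


# Made-up sample in the documented shape.
line = format_progress(
    {
        "status": "processing",
        "progress": 42.0,
        "current_step": "embedding",
        "total_chunks": 200,
        "processed_chunks": 84,
    }
)
print(line)
```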

Retrieve from Index

Description: Retrieves relevant chunks from a vector index based on a query.

URL: /rag/indices/{index_id}/retrieve/

Method: POST

Parameters:

index_id: str  # ID of the vector index

Request Payload:

class RetrievalRequestSchema:
    query: str  # Query to search for
    top_k: Optional[int] = 5  # Number of results to return
    score_threshold: Optional[float] = None  # Minimum score threshold
    semantic_weight: Optional[float] = 1.0  # Weight for semantic search
    keyword_weight: Optional[float] = 0.0  # Weight for keyword search

Response Schema:

class RetrievalResponseSchema:
    results: List[RetrievalResultSchema]  # List of retrieval results
    total_results: int  # Total number of results

Where RetrievalResultSchema contains:

class RetrievalResultSchema:
    text: str  # Text of the chunk
    score: float  # Relevance score
    metadata: ChunkMetadataSchema  # Metadata about the chunk
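
To tie the request and response schemas together, the sketch below builds a retrieval payload using the documented defaults and then filters a response by score. Both helper names and the sample response are assumptions. The filter also assumes that a higher score means a more relevant chunk, which this document does not state explicitly.

```python
from typing import Any, Dict, List, Optional


def build_retrieval_payload(
    query: str,
    top_k: int = 5,
    score_threshold: Optional[float] = None,
    semantic_weight: float = 1.0,
    keyword_weight: float = 0.0,
) -> Dict[str, Any]:
    """Assemble a RetrievalRequestSchema payload, leaving out an unset threshold."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    payload: Dict[str, Any] = {
        "query": query,
        "top_k": top_k,
        "semantic_weight": semantic_weight,
        "keyword_weight": keyword_weight,
    }
    if score_threshold is not None:
        payload["score_threshold"] = score_threshold
    return payload


def texts_above(response: Dict[str, Any], min_score: float) -> List[str]:
    """Return the text of each result whose score is at least min_score."""
    return [r["text"] for r in response["results"] if r["score"] >= min_score]


payload = build_retrieval_payload("How do I reset my password?", top_k=3)

# Made-up response in the documented shape; metadata contents are placeholders.
sample_response = {
    "results": [
        {"text": "Open Settings and choose Reset password.", "score": 0.91, "metadata": {}},
        {"text": "Billing is handled monthly.", "score": 0.32, "metadata": {}},
    ],
    "total_results": 2,
}
relevant = texts_above(sample_response, min_score=0.5)
print(payload)
print(relevant)
```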
