Step-by-Step Guide: Creating an MCP Server with Vectorize for Invoice Data Analysis

Step 2: Set Up the MCP Server to Integrate with Vectorize

Before you begin, make sure the following tools are installed. You need them to build and run the MCP server:

  1. Install Visual Studio Code: https://code.visualstudio.com/

  2. Install Python: https://www.python.org/downloads/

  3. Install Claude Desktop App: https://claude.ai/download
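Before moving on, it can help to confirm that the installed Python is recent enough. The sketch below assumes the "mcp" package needs Python 3.10 or newer; treat that number as an assumption and check the package's page for the current requirement. The helper name python_is_supported is made up for illustration.

```python
import sys

# Assumed minimum version for the `mcp` package; verify against its PyPI page.
MIN_VERSION = (3, 10)


def python_is_supported(version_info=None, minimum=MIN_VERSION) -> bool:
    """Return True if the given (or running) interpreter meets the minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} looks fine.")
    else:
        print(f"Python {current} is older than {MIN_VERSION[0]}.{MIN_VERSION[1]}; consider upgrading.")
```

Save it under any file name and run it with "python <file name>" from a terminal.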

Detailed Step-by-Step Implementation Guide

  1. Create a new directory named "mcprag" on your C:\ drive: open a terminal at C:\, type "mkdir mcprag", and press Enter.

  2. Open Visual Studio Code, click File, and select Open Folder.

  3. Select the folder "mcprag".

  4. From the Terminal menu, select "New Terminal".

  5. Type the following commands in the terminal, pressing Enter after each one:

PS C:\mcprag> mkdir vectorize-search
PS C:\mcprag> cd vectorize-search
PS C:\mcprag\vectorize-search> python -m venv .venv
PS C:\mcprag\vectorize-search> .venv\Scripts\activate

On macOS, activate the environment with "source .venv/bin/activate" instead.

(.venv) PS C:\mcprag\vectorize-search> pip install vectorize-client --upgrade
(.venv) PS C:\mcprag\vectorize-search> pip install mcp
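To confirm that the virtual environment is active and both packages were installed into it, a short check like the one below can be run from the same terminal. This is a sketch, not part of the server: the helper names are illustrative, and the module names "mcp" and "vectorize_client" are taken from the imports used in server.py.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv created with `python -m venv`."""
    return sys.prefix != sys.base_prefix


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    missing = missing_modules(["mcp", "vectorize_client"])
    if missing:
        print("Missing packages (install them with pip):", ", ".join(missing))
    else:
        print("Both packages can be imported.")
```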

  6. Create a new file called "server.py" in Visual Studio Code and paste the following Python code into it.

from mcp.server.fastmcp import FastMCP
import vectorize_client as v
import json
import sys
import time
import signal


# === Configuration ===
VECTORIZE_BASE_URL = "https://api.vectorize.io/v1"
VECTORIZE_ORGANIZATION_ID = "YOUR_ORGANIZATION_ID_HERE" # Replace with your actual organization ID
VECTORIZE_PIPELINE_ID = "YOUR_PIPELINE_ID_HERE" # Replace with your actual pipeline ID
TOKEN_FILE_PATH = r"C:\mcprag\vectorize-search\token.json" # Path to your token.json file



def read_token_from_file(file_path: str) -> str:
    """
    Reads the token from a JSON file.
    The JSON file should contain a key named 'TOKEN'.
    Example token.json file content:
        {
            "TOKEN": "YOUR_VECTORIZE_TOKEN_HERE"
        }
    """
    with open(file_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    token = data.get("TOKEN")
    if not token:
        raise ValueError("No 'TOKEN' key found in token.json.")
    return token


def submit_to_vectorize(query: str):
    """Submit the given query to the Vectorize API and retrieve documents."""
    try:
        # 1. Read the token from the JSON file
        token = read_token_from_file(TOKEN_FILE_PATH)
        
        # 2. Configure the Vectorize client
        api = v.ApiClient(
            v.Configuration(
                access_token=token,
                host=VECTORIZE_BASE_URL,
            )
        )
        
        pipelines = v.PipelinesApi(api)
        response = pipelines.retrieve_documents(
            VECTORIZE_ORGANIZATION_ID,
            VECTORIZE_PIPELINE_ID,
            v.RetrieveDocumentsRequest(question=query, num_results=5),
        )
        # Return the list of retrieved documents
        return response.documents
    except Exception as e:
        return {"error": f"Error while fetching documents: {str(e)}"}


def signal_handler(sig, frame):
    print("Safely stopping the server ...", file=sys.stderr)
    sys.exit(0)


signal.signal(signal.SIGINT, signal_handler)


# Create the MCP server. host and port apply only to the HTTP-based transports;
# mcp.run() below uses the default stdio transport.
mcp = FastMCP(
    name="vectorize-search",
    host="127.0.0.1",
    port=8080,
)


@mcp.tool()
def vectorize_search(query: str):
    """
    Uses Vectorize to perform a search for the given query.
    """
    try:
        if not isinstance(query, str):
            return {"error": "Input must be a string."}
        if not query.strip():
            return {"error": "Query cannot be empty."}
        
        result = submit_to_vectorize(query)
        return result
    except Exception as e:
        return {"error": f"Failed to execute query: {str(e)}"}


if __name__ == "__main__":
    try:
        # Log to stderr: with the default stdio transport, stdout carries MCP protocol messages
        print("Starting MCP server 'vectorize-search' ...", file=sys.stderr)
        mcp.run()
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        time.sleep(5)
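One detail worth knowing when the server is launched by Claude Desktop: mcp.run() with no arguments uses the stdio transport, where standard output carries the protocol messages, so diagnostic text is safer on standard error. A minimal sketch using the standard logging module follows. The logger name and the make_stderr_logger helper are illustrative and not part of server.py.

```python
import logging
import sys


def make_stderr_logger(name: str = "vectorize-search") -> logging.Logger:
    """Build a logger that writes to standard error only."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages away from any root handlers
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_stderr_logger()
    log.info("Starting MCP server 'vectorize-search' ...")
```

In server.py, calls such as log.info(...) could stand in for the print statements if you prefer leveled log output.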

  7. Get the required connection details from your Vectorize pipeline's Connect menu:

    1. Select your business pipeline name

    2. Click the "Connect" menu option

    3. Copy the "Organization ID" and replace the placeholder in the Python code: VECTORIZE_ORGANIZATION_ID = "YOUR_ORGANIZATION_ID_HERE"

    4. Copy the "Pipeline ID" and replace the placeholder in the Python code: VECTORIZE_PIPELINE_ID = "YOUR_PIPELINE_ID_HERE"

    5. Generate an access token in the Connect menu, download the token file, and save it as C:\mcprag\vectorize-search\token.json. The code expects this file to contain a "TOKEN" key.
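Once the IDs and the token file are in place, a quick check can catch a missing file or a placeholder left unedited before Claude Desktop ever starts the server. The following is a sketch only: check_setup is a hypothetical helper, and it assumes the token file is JSON with a "TOKEN" key, as read_token_from_file in server.py expects.

```python
import json
from pathlib import Path


def check_setup(org_id: str, pipeline_id: str, token_path: str) -> list:
    """Return a list of readable problems; an empty list means the setup looks complete."""
    problems = []
    if not org_id or org_id == "YOUR_ORGANIZATION_ID_HERE":
        problems.append("VECTORIZE_ORGANIZATION_ID still holds the placeholder.")
    if not pipeline_id or pipeline_id == "YOUR_PIPELINE_ID_HERE":
        problems.append("VECTORIZE_PIPELINE_ID still holds the placeholder.")

    path = Path(token_path)
    if not path.is_file():
        problems.append(f"Token file not found: {path}")
        return problems
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        problems.append("Token file is not valid JSON.")
        return problems
    if not isinstance(data, dict) or not data.get("TOKEN"):
        problems.append("Token file has no 'TOKEN' key.")
    return problems


if __name__ == "__main__":
    # Replace these arguments with the values you placed in server.py.
    issues = check_setup(
        "YOUR_ORGANIZATION_ID_HERE",
        "YOUR_PIPELINE_ID_HERE",
        r"C:\mcprag\vectorize-search\token.json",
    )
    for issue in issues:
        print("-", issue)
    if not issues:
        print("Setup looks complete.")
```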