Community Live 13: Jumpstart AI in Claris FileMaker – A step by step workshop

Overview of FileMaker’s New AI Features

FileMaker 2024 introduces a set of AI tools centered on semantic search, which retrieves data based on contextual meaning rather than exact keyword matches. Here are the key tools:

  • AI Setup and Configuration: Using Configure AI Account and Set AI Call Logging script steps, you can set up connections to AI providers and log detailed interactions for tracking and debugging.
  • Embedding and Search Tools: New script steps and functions allow you to generate, store, and manage vector embeddings, the backbone of semantic search.
  • Token Management and Optimization: To manage costs and optimize performance, functions like GetTokenCount and Get(LastStepTokenUsed) help control token usage during interactions with AI.

These features lay the groundwork for integrating sophisticated search capabilities that go beyond traditional keyword-based retrieval.
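Inside FileMaker, GetTokenCount reports accurate token counts from the provider. As a conceptual sketch of the budgeting logic you might wrap around it, the following Python uses a rough character-based heuristic (about 4 characters per token for English text) in place of a real count; the per-million-token price is illustrative, not a current OpenAI rate.

```python
# Rough token and cost estimate for embedding a batch of notes.
# Assumption: ~4 characters per token (a common heuristic for English);
# inside FileMaker, GetTokenCount gives you the real number.

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about one token per 4 characters."""
    return max(1, len(text) // 4)

def estimate_cost(notes: list[str], price_per_million_tokens: float) -> float:
    """Sum estimated tokens across all notes and convert to a dollar cost."""
    total = sum(estimate_tokens(n) for n in notes)
    return total / 1_000_000 * price_per_million_tokens

notes = ["Meeting notes from the quarterly review." * 25]
print(estimate_tokens(notes[0]))
```

A check like this, run before a large embedding job, lets you cap spend before any API call is made.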


Getting Started: Semantic Search Setup in FileMaker 2024

Step 1: Configuring the AI Account

To begin using AI-powered features in FileMaker, you need to configure a connection to an AI provider, such as OpenAI or a local large language model (LLM). This configuration establishes the session for AI interactions, allowing for both security and customization.

  1. Create a Configuration Script:
    • Open the Script Workspace in FileMaker and create a new script, naming it something like Configure AI Account.
    • Add the Configure AI Account Script Step: This script step is essential for setting up the connection between your FileMaker app and the chosen AI model.
  2. Configure Script Parameters:
    • Account Name: This is a unique, descriptive name that you choose to identify the AI connection. This name (e.g., “Sample Account”) will be used across other scripts to reference this AI account.
    • Model Provider: From the dropdown, select OpenAI (or another AI provider if you are using a custom model). For OpenAI, you will need to use their API key to access their model; local models will require different credentials.
    • API Key: This is the secure key that authenticates the connection to OpenAI or your chosen AI provider. Paste your API key here, which allows FileMaker to send requests to the provider.
  3. Running the Configuration:
    • Run this configuration script at the start of each session to ensure the AI account is ready for use. If your FileMaker solution involves multiple workflows needing AI access, you might include this script as part of the session’s OnFirstWindowOpen trigger.

Tip: Testing the script initially is recommended to ensure the API key and account configuration are correct. If there’s an issue, FileMaker will throw an error, which you can capture with Get(LastError) for troubleshooting.
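Under the hood, an AI account boils down to an endpoint plus an API key sent as a bearer token on every request. This Python sketch shows that shape using OpenAI's public REST convention for the endpoint URL and headers; FileMaker manages all of this for you once Configure AI Account has run, and the field names here are illustrative.

```python
# Conceptual sketch of what a named AI account holds: an endpoint and
# the authorization headers attached to every request. FileMaker stores
# this per account name after Configure AI Account runs.

def build_ai_session(account_name: str, api_key: str,
                     endpoint: str = "https://api.openai.com/v1") -> dict:
    """Bundle the pieces kept per named AI account."""
    return {
        "account": account_name,
        "endpoint": endpoint,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    }

session = build_ai_session("Sample Account", "sk-...your-key...")
print(session["headers"]["Authorization"].startswith("Bearer "))
```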


Step 2: Preparing Data for Semantic Search with Embeddings

Embedding is a process that converts textual data into numerical vectors. These vectors capture the “meaning” of the text, allowing you to perform searches based on similarity rather than keyword matches.

  1. Create a Container Field for Embeddings:
    • Open the Manage Database dialog and create a new container field (e.g., Note_Embedding) in the table where your searchable data resides.
    • Why Container Fields? FileMaker stores embeddings in container fields as binary data, which is more compact and faster to process than a text representation. Text fields can be used, but binary storage performs better, especially for large datasets.
  2. Setting Up the Embedding Script:
    • In Script Workspace, create a new script named Embedding Note Data.
    • Add the Insert Embedding in Found Set script step to handle embedding multiple records at once, eliminating the need for a loop. This script step is ideal for generating embeddings across large datasets in a single operation.
  3. Configuring the Embedding Script Step:
    • Account Name: Match this to the account name used in the Configure AI Account script, e.g., “Sample Account.”
    • Embedding Model: For OpenAI, enter the model name such as text-embedding-3-small, which is adequate for general-purpose embedding tasks. Check the latest documentation for updated model names, as providers sometimes change them.
    • Source Field: Choose the field containing the text you want to analyze (e.g., Note field with meeting notes or descriptions).
    • Target Field: Choose the Note_Embedding container field you created. This is where the embedding will be stored.
    • Optional Settings:
      • Replace Target Content: Enable this if you plan to update embeddings regularly, such as when source data changes. When disabled, embeddings are generated only for records whose target field is empty.
      • Additional JSON Parameters: Use this to optimize settings like token limits, batch size, or other provider-specific options if you’re working with large datasets.
  4. Running the Script to Generate Embeddings:
    • Run this script across your records to generate embeddings and store them in Note_Embedding. FileMaker will send the text to the AI provider, generate a vector, and store it as binary data in each record’s container field.

Important: Large datasets may incur costs if you’re using OpenAI’s API. To control costs, you might want to use OpenAI’s “small” model, which is more cost-effective for general embedding tasks.
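To see why binary container storage is more compact than text, consider the two ways a vector can be serialized. This Python sketch packs a 1,536-dimension vector (the size text-embedding-3-small produces) as raw 32-bit floats and compares it to a JSON string of the same vector; FileMaker's internal container format is its own, so this only illustrates the size difference between binary and text encodings.

```python
# Binary vs. text size for a float vector of embedding dimensions.
import json
import struct

dims = 1536                       # text-embedding-3-small vector size
vector = [0.123456789] * dims

as_text = json.dumps(vector)                   # text representation
as_binary = struct.pack(f"{dims}f", *vector)   # 4 bytes per float

print(len(as_text), len(as_binary))  # binary is several times smaller
```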


Step 3: Implementing Semantic Search

Once embeddings are stored, you can perform semantic searches based on the vectors. Unlike traditional searches, semantic search uses vector similarity, allowing results based on context rather than exact keywords.

  1. Setting Up a Semantic Search Script:
    • Create a new script named Perform Semantic Find.
    • Add the Perform Semantic Find Script Step: This is the primary step for executing a semantic search, and it requires parameters to determine how the search will work.
  2. Configuring the Search Script Step:
    • Query By: Choose between Natural Language and Vector Data.
      • Natural Language: Allows users to enter plain text that FileMaker will automatically convert to a vector for the search.
      • Vector Data: Use this if you’re comparing vectors directly rather than converting text to vectors.
    • Account Name: Set this to match your configured AI account.
    • Embedding Model: Use the same model you used for generating the embeddings to ensure vector consistency.
    • Text Field: Use a text field with global storage (e.g., Search_Query) where users can type their search term, so the query is available from any record or layout.
    • Record Set: Specify whether to search across All Records or a Found Set. For performance, narrow down the search scope with a pre-filtered found set when working with large datasets.
    • Target Field: This should be the container field holding the vector embeddings (e.g., Note_Embedding).
    • Return Count: Set the maximum number of results to return (the default is 10).
    • Cosine Similarity Condition: Optionally set a similarity threshold (between -1 and 1). Only records meeting this similarity will appear in the search results. Cosine similarity measures the degree of similarity between two vectors, where values closer to 1 are more alike.
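The comparison behind the similarity condition can be sketched in a few lines of Python. This mirrors the math of FileMaker's CosineSimilarity calculation function: 1 means the vectors point the same way, 0 means they are unrelated, and -1 means they point in opposite directions.

```python
# Cosine similarity between two vectors: the dot product divided by the
# product of the vector lengths.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0
```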

Step 4: Displaying and Sorting Results by Cosine Similarity

Cosine similarity allows for an advanced layer of control in search results by sorting results based on relevance.

  1. Understanding Cosine Similarity:
    • Cosine Similarity Function: Calculates a similarity score between -1 and 1.
      • 1 = Highly similar
      • 0 = Unrelated (orthogonal vectors, no significant similarity)
      • -1 = Opposite in meaning
    • Use Case: By adding the CosineSimilarity function in a calculation field or layout calculation, you can display how close each record is to the search term.
  2. Setting Up Cosine Similarity:
    • In Manage Database, create a calculation field (or add a calculation directly on the layout) that uses the CosineSimilarity function. Compare the embedding generated from the user’s search term in Search_Query to the embedding stored in Note_Embedding.
    • Sort results in descending order to show the most relevant records first.

Pro Tip: Setting a threshold in the script to hide records below a certain similarity level (e.g., 0.3) filters out less relevant records and keeps the result list focused.
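The sort-and-filter logic above has the same shape as Perform Semantic Find with a return count and a cosine similarity condition of 0.3. A minimal Python sketch, with made-up record names and scores for illustration:

```python
# Keep records at or above a similarity threshold, most similar first,
# capped at a maximum result count.

def rank_results(scored: dict[str, float], threshold: float = 0.3,
                 return_count: int = 10) -> list[str]:
    """Filter by threshold, sort descending by score, truncate to count."""
    kept = [(name, s) for name, s in scored.items() if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept[:return_count]]

scores = {"Note A": 0.82, "Note B": 0.31, "Note C": 0.12, "Note D": 0.55}
print(rank_results(scores))  # ['Note A', 'Note D', 'Note B']
```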


Additional Tips for Optimizing Performance and Security

  1. Data Security:
    • Restrict access to embedding fields based on user roles to prevent unauthorized access to vector data.
    • Consider local LLM deployment for sensitive data that must stay within your organization.
  2. Embedding Storage Best Practices:
    • For large datasets, split data into batches and run embeddings incrementally to avoid hitting API rate limits.
  3. Cost Management:
    • Track token usage with GetTokenCount and set budget caps. Optimize embeddings by storing frequently used ones and only re-generating as needed.
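For the batching tip above: in FileMaker you would constrain the found set to each batch before running Insert Embedding in Found Set. The chunking arithmetic itself is simple, sketched here in Python with an arbitrary batch size.

```python
# Split a list of record IDs into consecutive batches so each embedding
# run stays under provider rate limits.

def batches(record_ids: list[int], batch_size: int) -> list[list[int]]:
    """Return consecutive slices of record_ids, each at most batch_size long."""
    return [record_ids[i:i + batch_size]
            for i in range(0, len(record_ids), batch_size)]

ids = list(range(1, 11))   # ten record IDs
print(batches(ids, 4))     # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```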

Community Link: https://community.claris.com/en/s/question/0D5Vy00000KXLH2KAP/october-24-2024-community-live-jumpstart-ai-in-claris-filemaker-a-stepbystep-workshop

To participate during the live workshop, you will need your own OpenAI API key. Secure it well in advance of the livestream so you can follow along.

Here are some more resources to help you master the artificial intelligence script steps and calculation functions:

Custom GPT Cris created to help you add semantic search to your FileMaker apps (OpenAI login required to access this resource)

Engineering blog: Working with LLMs in FileMaker 2024

FileMaker Help on artificial intelligence script steps

FileMaker Help on artificial intelligence calculation functions