@kurokobo
Last active December 17, 2025 01:17
Dify: Weaviate 1.19 to 1.27+ Migration Guide (community-edited, simplified)
"""
# NOTE: THIS SCRIPT IS DEPRECATED AND OUTDATED
## TL;DR;
Use the official migration script instead.
https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py
## Background
This script was originally released as a community-edited version of the draft of the script presented by the Dify Team,
to address some issues encountered during migration of Weaviate collections in certain environments,
before the official script was finalized, as a temporary workaround.
However, all modifications made in this script have already been backported to the official script.
Therefore, this unofficial script is deprecated and will not be maintained in the future.
We strongly recommend using the latest official script.
If you face any issues with the official script, please report them to the Dify Team via their GitHub repository or any supported channels.
You can see the revisions made in this script by checking the git history: https://gist.github.com/kurokobo/51fbe7f92f4526957e12dacfa7783cdf/revisions
The original source for this script can be found at: https://github.com/langgenius/dify/issues/27291#issuecomment-3501003678.
The key changes made in this script were:
- Retrieve Weaviate connection info from environment variables to make this script run in the Worker container.
- Switch to cursor-based pagination in "replace_old_collection", since the migration could fail with large collections.
- Fix an issue where both the old and new collections remained without being deleted after migrating an empty collection.
"""
import sys
print("WARNING: This migration script is DEPRECATED and OUTDATED.")
print("Please use the following official migration script instead:")
print("https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py")
print("This script will now exit without making any changes.")
sys.exit(1)

Weaviate 1.19 to 1.27+ Migration Guide for Dify

  • ⚠️ This guide is not officially supported by the Dify Team.
  • ⚠️ This is a community-edited, simplified version of the official migration guide presented by the Dify Team.

Complete guide to safely migrate Dify knowledge bases from Weaviate 1.19 to 1.27/1.33.


✅ NOTE: BEFORE PROCEEDING FURTHER

If your environment contains only a few knowledge bases, you may be able to resolve the issue with the following much simpler steps instead of the more involved process on this page.

  1. Open the Settings page for your knowledge.
  2. Change the Embedding Model to something else.
  3. On the Documents page, wait until all documents become Available.
  4. Open the Settings page again and change the Embedding Model back to the original.
  5. On the Documents page, wait again until all documents become Available.
  6. Repeat these steps for each knowledge base.

The steps described in the following sections are aimed at large environments, where it is not feasible to edit every knowledge base manually.


📝 Outline

This guide covers the following two cases.
While Case A is recommended for a safer migration, this guide can also be applied to Case B:

  • Case A
    • You are currently running a version of Dify 1.9.1 or earlier with Weaviate 1.19 included.
    • All knowledge is functioning properly.
  • Case B
    • You have already upgraded to Weaviate 1.27+ and are running Dify 1.9.2 or later.
    • The knowledge created with the previous version is corrupted, and you have no backup to revert to the earlier version.

The procedure in this guide is as follows:

  1. Take a complete backup of your current Dify environment.
  2. If your Dify version is 1.9.1 or earlier, upgrade Dify.
  3. Operate the weaviate container and modify the directory structure of the LSM data.
  4. Operate the worker container and run the migration script.
  5. Perform cleanup.

📝 Migration Procedure

Note:
This procedure cannot be rolled back by any means other than a restore. Attempting to roll back using anything other than a restore may make things worse.
We recommend that you follow the steps to take a full backup first, in preparation for a possible restore.


Step 1: Backup Your Environment

Stop your Dify services:

cd /path/to/dify/docker
docker compose down

Then make a full copy or archive of your entire docker directory (/path/to/dify/docker, for example) as a safety measure.

If you encounter issues later, you can restore this backup to revert to the original state.
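
As a minimal sketch of the archive step, assuming `tar` with gzip support is available on the host, something like the following could create the archive and confirm it is readable. The helper name `backup_dify_dir` and the output path are hypothetical, not part of the original guide. Files under `volumes/` are often owned by root, so you may need to run it with elevated privileges.

```shell
# Hypothetical helper (not part of the original guide): archive a directory
# and confirm the archive is readable.
# Usage: backup_dify_dir /path/to/dify/docker /path/to/backups/dify-docker.tar.gz
backup_dify_dir() {
  src="$1"   # directory to back up
  out="$2"   # archive file to create (keep it outside of "$src")

  # -C switches to the parent directory so the archive holds relative paths.
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1

  # Listing the archive is a cheap check that it was written completely.
  tar -tzf "$out" > /dev/null
}
```

To restore from such an archive, stop Dify, move the current directory aside, and extract the archive into the parent directory with `tar -xzf`.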


Step 2: Upgrade to Weaviate 1.27+ (Only for Case A)

This step is only for Case A - users currently on Dify 1.9.1 or earlier with Weaviate 1.19.
If you are already running Weaviate 1.27+ (Case B), you can skip this step.

Follow the upgrade guide to move to the latest (or a specific) Dify version that uses Weaviate 1.27+.


Step 3: Fix Orphaned LSM Data

If Dify is stopped, start it and wait until it has fully launched.

cd /path/to/dify/docker
docker compose up -d

Ensure that Weaviate is running image version 1.27.0 or higher.

cd /path/to/dify/docker
docker compose ps weaviate  # The "IMAGE" column should show "semitechnologies/weaviate:1.27.0" or higher
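
If you prefer a scripted check over reading the IMAGE column by eye, a small comparison helper could look like the sketch below. It assumes a `sort` that supports `-V` (version sort), as GNU coreutils does. `version_at_least` is a hypothetical name, and you pass it the tag you see in the IMAGE column.

```shell
# Hypothetical helper: succeeds when ACTUAL >= REQUIRED.
# Usage: version_at_least REQUIRED ACTUAL
version_at_least() {
  required="$1"
  actual="$2"
  # With version sort, the smaller of the two versions comes first.
  lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
  [ "$lowest" = "$required" ]
}

# Example: replace the second argument with the tag from the IMAGE column.
if version_at_least 1.27.0 1.27.0; then
  echo "Weaviate version is new enough"
else
  echo "Weaviate version is too old" >&2
fi
```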

Enter the shell of your weaviate container:

cd /path/to/dify/docker
docker compose exec -it weaviate /bin/sh

Then run the following commands inside the container to fix LSM data:

cd /var/lib/weaviate
for dir in vector_index_*_node_*_lsm; do
  [ -d "$dir" ] || continue
  
  # Extract index ID and shard ID
  index_id=$(echo "$dir" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
  shard_id=$(echo "$dir" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')
  
  # Create target directory and copy
  mkdir -p "vector_index_${index_id}_node/$shard_id/lsm"
  cp -a "$dir/"* "vector_index_${index_id}_node/$shard_id/lsm/"
  
  echo "✓ Copied $dir"
done
exit
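
To see what the loop above would do before anything is copied, a dry run can print each source directory next to its computed target. The sketch below reuses the same `sed` expressions. The function name `preview_lsm_targets` is hypothetical, and it additionally flags names that do not match the expected pattern, a guard the original loop does not have. It is meant to be run inside the weaviate container before the copy loop.

```shell
# Hypothetical dry run: prints "source -> target" without copying anything.
# Usage: preview_lsm_targets /var/lib/weaviate
preview_lsm_targets() {
  data_dir="$1"
  for dir in "$data_dir"/vector_index_*_node_*_lsm; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")

    # Same extraction as the copy loop: five "_"-separated parts form the index ID.
    index_id=$(echo "$name" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
    shard_id=$(echo "$name" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')

    # An empty result means the name did not have the expected shape.
    if [ -z "$index_id" ] || [ -z "$shard_id" ]; then
      echo "SKIP $name (name does not match the expected pattern)"
      continue
    fi
    echo "$name -> vector_index_${index_id}_node/$shard_id/lsm"
  done
}
```

Any `SKIP` line is worth investigating before copying, since the copy loop would build a malformed target path for that directory.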

Then restart the weaviate container so that the changes are recognized:

cd /path/to/dify/docker
docker compose restart weaviate

Step 4: Migrate Schema

Place the migrate_weaviate_collections.py script in your /path/to/dify/docker/volumes/app/storage/ directory, then enter the shell of your worker container:

cp /path/to/migrate_weaviate_collections.py /path/to/dify/docker/volumes/app/storage/
cd /path/to/dify/docker
docker compose exec -it worker /bin/bash

Then run the following commands inside the container to execute the migration script:

uv run --no-cache /app/api/storage/migrate_weaviate_collections.py
exit

Restart Dify services:

docker compose down
docker compose up -d

Verify in Dify UI:

  1. Go to your Dify console
  2. Open your knowledge bases
  3. Try "Retrieval Testing"
  4. Confirm that retrieval completes without errors.

Step 5: Cleanup (Optional)

After a successful migration, you can delete the orphaned files to free up space.
Enter the shell of your weaviate container:

cd /path/to/dify/docker
docker compose exec -it weaviate /bin/sh

Then run the following commands inside the container to delete orphaned files:

cd /var/lib/weaviate
rm -rf vector_index_*_node_*
exit
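
Because `rm -rf` with a wildcard is irreversible, it may be worth listing exactly what the pattern matches before running it. The following is a minimal sketch, assuming `du` is available in the container; `list_orphaned_lsm_dirs` is a hypothetical name. The restructured directories created in Step 3 end in `_node` with no trailing underscore, so the pattern should not match them.

```shell
# Hypothetical helper: show the directories (with sizes) that
# "rm -rf vector_index_*_node_*" would delete. Nothing is removed.
# Usage: list_orphaned_lsm_dirs /var/lib/weaviate
list_orphaned_lsm_dirs() {
  data_dir="$1"
  for dir in "$data_dir"/vector_index_*_node_*; do
    [ -d "$dir" ] || continue   # skip the unexpanded pattern when nothing matches
    du -sh "$dir"
  done
}
```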

Also, you can delete the migration script from your storage volume:

rm /path/to/dify/docker/volumes/app/storage/migrate_weaviate_collections.py

📝 Files Needed

📝 Credits

  • Original migration approach: Dify team
  • LSM recovery method: Chinese Dify community user
  • Combined solution: Community effort
@kurokobo
Author

@suntao2015005848
Good catch, thanks!
It seems that /home/dify no longer exists in the container starting from version 1.11.0, which caused the cache directory for uv to fail to be created.

As you suggested, using -u root would work, but an even simpler solution is just to add --no-cache to the uv command. I've updated the guide accordingly.

Thanks again!
