caleb-kaiser · October 23, 2024 23:36 · haider827 · Mar 25, 2025
diff --git a/2-create_dataset.ipynb b/2-create_dataset.ipynb
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/gist/caleb-kaiser/d947d1276d1548708e9290b4d926e57e/2-create_dataset.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "<img src=\"https://raw.githubusercontent.com/comet-ml/opik/main/apps/opik-documentation/documentation/static/img/opik-logo.svg\" width=\"250\"/>"
      ],
      "metadata": {
        "id": "WlnFnpxC1uiD"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DEcQzUry-2vY"
      },
      "source": [
        "# Create an Evaluation Dataset With Opik\n",
        "\n",
        "In this exercise, you'll create an evaluation dataset with Opik. Datasets can be used to track test cases you would like to evaluate your LLM on. Once a dataset has been created, you can run Experiments on it. Each Experiment will evaluate an LLM application based on the test cases in the dataset using an evaluation metric and report the results back to the dataset."
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Imports & Configuration"
      ],
      "metadata": {
        "id": "A0pC5p72CR4r"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "%pip install opik comet_ml --quiet"
      ],
      "metadata": {
        "id": "p8aXUc0g3Hvt"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "wwVN4uKz-2vc"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import csv\n",
        "import opik\n",
        "import getpass\n",
        "from opik import Opik"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# opik configs\n",
        "if \"OPIK_API_KEY\" not in os.environ:\n",
        "    os.environ[\"OPIK_API_KEY\"] = getpass.getpass(\"Enter your Opik API key: \")\n",
        "\n",
        "opik.configure()"
      ],
      "metadata": {
        "id": "7tqZ3ITv3Xot"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Dataset\n",
        "\n",
        "The **`get_or_create_dataset`** method checks whether a dataset with the given name already exists. If it does, the existing dataset is returned; if not, a new dataset is created.\n",
        "\n",
        "When you insert items through the Python SDK, Opik also deduplicates them automatically, so you can insert the same item multiple times without duplicating it in the dataset.\n",
        "\n",
        "Together, these two features mean that you can use the SDK to manage your datasets in a \"fire and forget\" manner."
      ],
      "metadata": {
        "id": "tLJtCHDoCWBl"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Create or get the dataset\n",
        "client = Opik()\n",
        "dataset = client.get_or_create_dataset(name=\"foodchatbot_eval\")"
      ],
      "metadata": {
        "id": "CF8t6pTNCW7t"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Optional: Download Dataset From Comet\n",
        "\n",
        "If you have not previously created the `foodchatbot_eval` dataset in your Opik workspace, run the following code to download the dataset as a Comet Artifact and populate your Opik dataset.\n",
        "\n",
        "If you have already created the `foodchatbot_eval` dataset, you can skip to the next section."
      ],
      "metadata": {
        "id": "mBM8zw-F7Rw0"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import comet_ml\n",
        "\n",
        "comet_ml.login(api_key=os.environ[\"OPIK_API_KEY\"])\n",
        "experiment = comet_ml.start(project_name=\"foodchatbot_eval\")\n",
        "\n",
        "logged_artifact = experiment.get_artifact(artifact_name=\"foodchatbot_eval\",\n",
        "                                          workspace=\"examples\")\n",
        "local_artifact = logged_artifact.download(\"./\")\n",
        "experiment.end()"
      ],
      "metadata": {
        "id": "osGFU3YD7RY9"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Read the CSV file and insert items into the dataset\n",
        "with open('./foodchatbot_clean_eval_dataset.csv', newline='') as csvfile:\n",
        "    reader = csv.reader(csvfile)\n",
        "    next(reader, None) # skip the header\n",
        "    for row in reader:\n",
        "        index, question, response = row\n",
        "        dataset.insert([\n",
        "            {\"question\": question, \"response\": response}\n",
        "        ])"
      ],
      "metadata": {
        "id": "jDENwjnu8g_l"
      },
      "execution_count": null,
      "outputs": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "comet-eval",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.15"
    },
    "colab": {
      "provenance": [],
      "collapsed_sections": [
        "A0pC5p72CR4r",
        "tLJtCHDoCWBl"
      ],
      "include_colab_link": true
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
 }
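
The "fire and forget" behavior described in the Dataset section can be illustrated without an Opik account. The sketch below is only a rough model of that behavior, not Opik's implementation: `InMemoryClient` and `InMemoryDataset` are hypothetical stand-ins written for this example, the `question` and `response` CSV column names are assumed from how the notebook unpacks each row, and the sample rows are invented. It also builds the full list of items first and passes it to `insert` in a single call, rather than calling `insert` once per row.

```python
import csv
import io
import json


class InMemoryDataset:
    """Hypothetical stand-in for an Opik dataset; NOT the Opik implementation."""

    def __init__(self, name):
        self.name = name
        self._items = {}  # canonical JSON key -> item

    def insert(self, items):
        # Key each item by its canonical JSON form so identical content
        # is stored only once, however many times it is inserted.
        for item in items:
            key = json.dumps(item, sort_keys=True)
            self._items.setdefault(key, item)

    def get_items(self):
        return list(self._items.values())


class InMemoryClient:
    """Hypothetical stand-in for the client; NOT the Opik implementation."""

    def __init__(self):
        self._datasets = {}

    def get_or_create_dataset(self, name):
        # Return the existing dataset if the name is known, else create it.
        if name not in self._datasets:
            self._datasets[name] = InMemoryDataset(name)
        return self._datasets[name]


def rows_to_items(csv_text):
    """Turn CSV text with `question` and `response` columns into item dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"question": r["question"], "response": r["response"]} for r in reader]


# Invented sample data, standing in for the downloaded CSV file.
SAMPLE_CSV = (
    "index,question,response\n"
    "0,What is in a margherita pizza?,\"Tomato, mozzarella, and basil.\"\n"
    "1,Is tofu vegan?,Yes.\n"
)

client = InMemoryClient()
dataset = client.get_or_create_dataset("foodchatbot_eval")
dataset.insert(rows_to_items(SAMPLE_CSV))  # one call for all rows

# Re-running the same steps is harmless: same dataset, no duplicate items.
same_dataset = client.get_or_create_dataset("foodchatbot_eval")
same_dataset.insert(rows_to_items(SAMPLE_CSV))

print(same_dataset is dataset, len(dataset.get_items()))
```

In the notebook itself, the same pattern would mean collecting the rows into a list and passing it to the real `dataset.insert` once. Whether that is preferable to per-row inserts depends on the dataset size and is left as a judgment call.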