JoeJoe1313 · January 3, 2026 21:17
diff --git a/1.ipynb b/1.ipynb
diff --git a/2.ipynb b/2.ipynb
diff --git a/3.ipynb b/3.ipynb
diff --git a/4.ipynb b/4.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "583c59cd",
   "metadata": {},
   "source": [
    "For R2L models, the paper proposes \"reverse thinking\" based on Bayes' rule, which evaluates how likely the question is given a particular answer. The authors test three scoring paradigms and find that the simplest one performs the best:\n",
    "\n",
    "- **Paradigm 1 (Normalized + Prior):**\n",
    "\n",
    "$$\n",
    "s_i^{(1)} = \\frac{1}{M_i} \\log p_{R2L}(q \\mid a_i) + \\log p_{R2L}(a_i)\n",
    "$$\n",
    "\n",
    "- **Paradigm 2 (Unnormalized + Prior):**\n",
    "\n",
    "$$\n",
    "\\tilde{s}_i^{(2)} = \\log p_{R2L}(q \\mid a_i) + \\log p_{R2L}(a_i)\n",
    "$$\n",
    "\n",
    "- **Paradigm 3 (Unnormalized, Uniform Prior):**\n",
    "\n",
    "$$\n",
    "s_i^{(3)} = \\log p_{R2L}(q \\mid a_i)\n",
    "$$\n",
    "\n",
    "Empirically, Paradigm 3 consistently yields the highest MCQ accuracy. By ignoring the prior probability of the answer, it avoids noisy estimations and cleanly sidesteps the surface-form competition issue."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99bb236b",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
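
The three paradigms in the markdown cell above differ only in how the conditional and prior log-probabilities are combined, so they can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `Candidate` container, the function names, and the toy numbers are hypothetical, and `M_i` is assumed here to be the number of question tokens scored, which the cell does not state.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    # Hypothetical container for one answer option a_i.
    question_token_logprobs: List[float]  # per-token terms of log p_R2L(q | a_i)
    answer_logprob: float                 # log p_R2L(a_i)


def paradigm_scores(c: Candidate) -> Dict[int, float]:
    """Return the score of one candidate under each of the three paradigms."""
    cond = sum(c.question_token_logprobs)  # log p_R2L(q | a_i)
    m = len(c.question_token_logprobs)     # assumed meaning of M_i
    return {
        1: cond / m + c.answer_logprob,  # normalized + prior
        2: cond + c.answer_logprob,      # unnormalized + prior
        3: cond,                         # unnormalized, uniform prior
    }


def pick_answer(candidates: List[Candidate], paradigm: int = 3) -> int:
    """Index of the highest-scoring candidate under the chosen paradigm."""
    scores = [paradigm_scores(c)[paradigm] for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy, made-up log-probabilities for two answer options.
candidates = [
    Candidate([-1.0, -2.0, -1.0], answer_logprob=-5.0),
    Candidate([-2.0, -2.0, -2.0], answer_logprob=-1.0),
]
print({p: pick_answer(candidates, p) for p in (1, 2, 3)})
```

With these toy numbers the first option has the higher conditional likelihood but a much lower prior, so Paradigms 1 and 2 select the second option while Paradigm 3 selects the first. That shows how the prior term alone can flip the choice.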