- ✅ Webrecorder's browser extension effectively preserves ChatGPT web conversations.
- All data is available for each actively viewed conversation
- Full conversation structure with parent/child relationships
- Chronological ordering via timestamps, temporal anchors with microsecond precision
- ❌ Replay does not work in replayweb.page
- ❌ Privacy leak: Conversation list is always captured (titles, IDs, timestamps)
- Must create separate archives for each conversation of interest
- Timestamps come from the client, and capture takes place in the browser. It is worth checking whether changing the clock on the local machine can be used to forge them.
Method: Browser extension. In the extension settings, opt in to "archive cookies" and "archive local storage".
- ⚠️ Careful: From Webrecorder's UI: "Sharing content created with this setting enabled may compromise your login credentials. Archived items created with this settings should generally be kept private!"
- A conversation's detailed messages are only captured if the archive was created while actively viewing that specific conversation. Otherwise, only metadata appears in the conversation list.
- Fortunately, if you have not explicitly browsed to a conversation, its content is not included in the archive – only metadata e.g. title and time.
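To get a feel for what the conversation list exposes, the sketch below splits list items into conversations that were actively viewed (and so fully captured) and those present as metadata only. It assumes a list payload with an `items` array carrying `id`, `title`, `create_time`, and `update_time`. The function name `split_by_exposure`, the `viewed_ids` argument, and the sample IDs and titles are hypothetical, not taken from a real archive.

```python
def split_by_exposure(list_payload, viewed_ids):
    """Separate list items into fully captured and metadata-only conversations."""
    captured, metadata_only = [], []
    for item in list_payload.get("items", []):
        # Keep only the fields that the conversation list is known to expose.
        exposed = {
            key: item.get(key)
            for key in ("id", "title", "create_time", "update_time")
        }
        target = captured if item.get("id") in viewed_ids else metadata_only
        target.append(exposed)
    return captured, metadata_only


# Hypothetical payload: IDs and titles are placeholders, not real data.
sample_list = {
    "items": [
        {
            "id": "conv-a",
            "title": "Viewed chat",
            "create_time": "2026-01-30T15:14:58.016910Z",
            "update_time": "2026-01-30T15:18:33.779223Z",
        },
        {
            "id": "conv-b",
            "title": "Never opened",
            "create_time": "2026-01-29T10:00:00.000000Z",
            "update_time": "2026-01-29T10:05:00.000000Z",
        },
    ]
}

captured, metadata_only = split_by_exposure(sample_list, viewed_ids={"conv-a"})
for item in metadata_only:
    print("metadata only:", item["id"], item["title"])
```

Anything reported as metadata only still reveals its title and timestamps to whoever receives the archive, which is why one archive per conversation of interest is the safer habit.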
File: archive/data.warc.gz → decompressed to data.warc
API Endpoints: https://chatgpt.com/backend-api/conversations (conversation list) and https://chatgpt.com/backend-api/conversation/{conversation_id} (single conversation)
{
"items": [
{
"id": "conversation-uuid",
"title": "Conversation Title",
"create_time": "2026-01-30T15:14:58.016910Z",
"update_time": "2026-01-30T15:18:33.779223Z",
"is_archived": false,
...
}
],
"total": 29,
"limit": 28,
"offset": 0
}
Complete conversation structure:
{
"conversation_id": "uuid",
"title": "Conversation Title",
"create_time": 1234567890.123,
"update_time": 1234567891.456,
"mapping": {
"node-id-1": {
"id": "node-id-1",
"message": {
"id": "message-uuid",
"author": {
"role": "user"
},
"content": {
"content_type": "text",
"parts": [
"This is the actual message text"
]
},
"create_time": 1234567890.123
},
"parent": "parent-node-id",
"children": ["child-node-id"]
}
}
}
- `mapping`: Dictionary of conversation nodes (messages and system nodes)
- `message.author.role`: `"user"` (your prompts) or `"assistant"` (ChatGPT responses)
- `message.content.parts`: Array containing the actual message text
- `create_time`: Unix timestamp for chronological ordering
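As an illustration of how these fields fit together, here is a minimal sketch that flattens `mapping` into a chronological list of messages. It assumes that nodes without a message (such as a root or system node) carry `"message": null`, and that text parts are plain strings; neither assumption is confirmed by the structure above. The function name `ordered_messages` and the sample node IDs are hypothetical.

```python
from datetime import datetime, timezone


def ordered_messages(conversation):
    """Return (iso_time, role, text) tuples sorted by message create_time."""
    rows = []
    for node in conversation.get("mapping", {}).values():
        message = node.get("message")
        if not message:
            continue  # assumption: structural nodes carry no message
        created = message.get("create_time")
        if created is None:
            continue  # cannot place a message without a timestamp
        parts = message.get("content", {}).get("parts") or []
        text = "".join(part for part in parts if isinstance(part, str))
        role = message.get("author", {}).get("role")
        rows.append((created, role, text))
    rows.sort(key=lambda row: row[0])
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), role, text)
        for ts, role, text in rows
    ]


# Hypothetical conversation; the assistant node is listed first on purpose
# to show that ordering comes from create_time, not from dictionary order.
sample_conversation = {
    "mapping": {
        "node-3": {
            "message": {
                "author": {"role": "assistant"},
                "content": {"content_type": "text", "parts": ["Here is a cat."]},
                "create_time": 1769786099.0,
            },
            "parent": "node-2",
            "children": [],
        },
        "node-1": {"message": None, "parent": None, "children": ["node-2"]},
        "node-2": {
            "message": {
                "author": {"role": "user"},
                "content": {"content_type": "text", "parts": ["Show me a cat picture"]},
                "create_time": 1769786097.538748,
            },
            "parent": "node-1",
            "children": ["node-3"],
        },
    }
}

timeline = ordered_messages(sample_conversation)
for when, role, text in timeline:
    print(when, role, text)
```

Sorting by `create_time` is the simplest option, but it mixes branches if a conversation was regenerated or edited; following `parent` links from a chosen leaf would give a single thread instead.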
File: pages/pages.jsonl (JSONL format - one JSON object per line)
{
"title": "ChatGPT",
"url": "https://chatgpt.com/",
"id": "page-session-id",
"ts": "2026-02-12T09:09:40.044Z",
"text": "Extracted text from page including:\nYou said:\nShow me a cat picture\nChatGPT said:\n..."
}
WACZ archives contain multiple layers of cryptographic and temporal anchors that can be used to verify the authenticity and integrity of conversations and individual messages.
- WARC-Level Integrity (Per HTTP Transaction): Each WARC record contains cryptographic digests and metadata for verification.
- File-Level Integrity (WACZ Package): `datapackage.json` contains SHA-256 hashes of all archive components.
- Conversation-Level Anchors:
{
  "conversation_id": "697ccad2-0994-832f-a69b-0e2a1456e747",
  "title": "Banking Choices in Germany",
  "create_time": 1769786098.01691,
  "update_time": 1769786315.00338,
  "current_node": "4428eb41-7d52-4427-a3cc-ca43441ce839",
  "default_model_slug": "auto"
}
- Message-Level Anchors: Each message contains multiple unique identifiers and timestamps:
{
  "id": "36c27e0d-7917-49ff-a4d9-fc77f26dd29d",
  "author": { "role": "user", "name": null, "metadata": {} },
  "create_time": 1769786097.538748,
  "update_time": null,
  "status": "finished_successfully",
  "metadata": {
    "request_id": "247a8438-e77a-4d4d-b5c5-4362cd7ab256",
    "turn_exchange_id": "27723209-488e-46a5-b593-0767b8c61dc1",
    "message_source": null,
    "triggered_by_system_hint_suggestion": false
  },
  "content": { "content_type": "text", "parts": ["Message text"] }
}
- `message.id`: Unique UUID for this specific message
- `metadata.request_id`: Backend request UUID (tracks API call)
- `metadata.turn_exchange_id`: UUID linking user prompt to assistant response
- **Node ID**: The key in the mapping dictionary (often same as `message.id`)
- `create_time`: Message creation (Unix timestamp with microsecond precision)
- `update_time`: If message was edited (null if never edited)
- `author.role`: `"user"`, `"assistant"`, `"system"`, or `"tool"`
- `author.name`: User identifier (if available)
- `status`: Message processing status (`"finished_successfully"`, etc.)
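The file-level check listed above can be sketched with the Python standard library alone. This assumes that `datapackage.json` sits at the root of the WACZ zip and lists each component under `resources` with a `path` and a `hash` of the form `sha256:<hex>`. That layout reflects my reading of the WACZ format and is not shown in this document, so treat it as an assumption. The function name `verify_wacz` and the in-memory demo archive are hypothetical.

```python
import hashlib
import io
import json
import zipfile


def verify_wacz(wacz_file):
    """Return {path: bool} for every resource listed in datapackage.json."""
    results = {}
    with zipfile.ZipFile(wacz_file) as archive:
        package = json.loads(archive.read("datapackage.json"))
        for resource in package.get("resources", []):
            path = resource["path"]
            # Assumed hash format: "<algorithm>:<hex digest>", e.g. "sha256:ab12..."
            algorithm, _, expected = resource["hash"].partition(":")
            digest = hashlib.new(algorithm)
            try:
                with archive.open(path) as stream:
                    for chunk in iter(lambda: stream.read(1 << 20), b""):
                        digest.update(chunk)
            except KeyError:
                results[path] = False  # listed in the manifest but not in the zip
                continue
            results[path] = digest.hexdigest() == expected
    return results


# Demo on a tiny in-memory archive with one valid and one deliberately wrong hash.
warc_bytes = b"placeholder WARC bytes"
manifest = {
    "resources": [
        {
            "path": "archive/data.warc.gz",
            "hash": "sha256:" + hashlib.sha256(warc_bytes).hexdigest(),
        },
        {"path": "pages/pages.jsonl", "hash": "sha256:" + "0" * 64},
    ]
}
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("archive/data.warc.gz", warc_bytes)
    demo.writestr("pages/pages.jsonl", b"{}\n")
    demo.writestr("datapackage.json", json.dumps(manifest))
buffer.seek(0)

demo_results = verify_wacz(buffer)
print(demo_results)
```

A matching hash only shows that the file is unchanged since the manifest was written. It says nothing about whether the manifest itself, or the timestamps inside the captured responses, were produced honestly.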
The conversation tree structure provides internal consistency verification:
{
"node-id-1": {
"id": "36c27e0d-7917-49ff-a4d9-fc77f26dd29d",
"parent": "89d462ab-2d11-48bb-af9c-d9331c666a2a",
"children": ["03004e31-4833-4012-8cd8-ce39a52ee775"],
"message": { ... }
}
}
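One way to exercise this consistency property is to check that every `parent` link is mirrored by a `children` entry and vice versa. The sketch below works on the keys of the `mapping` dictionary. The function name `check_tree` and the sample node IDs are hypothetical, and a partial excerpt of a mapping will naturally report its missing neighbours.

```python
def check_tree(mapping):
    """Return a list of human-readable inconsistencies found in the mapping."""
    problems = []
    for key, node in mapping.items():
        parent = node.get("parent")
        if parent is not None:
            if parent not in mapping:
                problems.append(f"{key}: parent {parent} is missing from mapping")
            elif key not in (mapping[parent].get("children") or []):
                problems.append(f"{key}: not listed among children of {parent}")
        for child in node.get("children") or []:
            if child not in mapping:
                problems.append(f"{key}: child {child} is missing from mapping")
            elif mapping[child].get("parent") != key:
                problems.append(f"{key}: child {child} names a different parent")
    return problems


# Hypothetical two-node tree that is internally consistent.
consistent = {
    "a": {"parent": None, "children": ["b"]},
    "b": {"parent": "a", "children": []},
}
# Same tree with one link altered, as a tampered archive might look.
tampered = {
    "a": {"parent": None, "children": ["b"]},
    "b": {"parent": "x", "children": []},
}

consistent_problems = check_tree(consistent)
tampered_problems = check_tree(tampered)
for line in tampered_problems:
    print(line)
```

Changing a single link breaks the check from both sides, so an edit to one node has to be repeated in its neighbours to go unnoticed. This raises the effort of tampering but does not prevent it.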