rodrigodesalvobraz/whatsapp-chat-viewer


Generates a WhatsApp-style HTML page from an exported WhatsApp chat, with support for images, videos, audio, PDFs, and optional audio transcription.


Install the dependencies:

pip install -r requirements.txt

For audio transcription, set your OpenAI API key:

# Linux/macOS
export OPENAI_API_KEY="sk-..."

# Windows (takes effect in new terminal sessions)
setx OPENAI_API_KEY "sk-..."
Basic usage:

python whatsapp_viewer.py --dir "path/to/chat/folder"

This expects a chat.txt file and media files inside the folder, and generates output.html there.
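For illustration, a chat export folder might look like the following (file names are examples; actual WhatsApp export names vary):

path/to/chat/folder/
    chat.txt
    IMG-20240101-WA0001.jpg
    VID-20240102-WA0003.mp4
    AUD-20240103-WA0005.opus
    output.html        (generated by the script)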

python whatsapp_viewer.py "my_chat.txt" --dir "path/to/folder" --me "YourName"

The --me flag aligns your messages to the right.

Transcribe all audio files using OpenAI’s speech-to-text API:

python whatsapp_viewer.py "chat.txt" --dir "path/to/folder" --transcribe

Transcribe only the first N audios (useful for testing):

python whatsapp_viewer.py "chat.txt" --dir "path/to/folder" --transcribe --transcribe-only-x-audios 5

Transcriptions are cached as .original.txt files next to each audio file. Re-running the command skips already-transcribed audios.
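As a rough illustration of this caching behavior (not the tool's actual code; the exact cache file naming is an assumption):

from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def transcribe_cached(audio_path: Path, model: str = "gpt-4o-mini-transcribe") -> str:
    # Assumed cache name: the audio file name plus ".original.txt"
    cache = audio_path.with_name(audio_path.name + ".original.txt")
    if cache.exists():
        # Already transcribed on a previous run: skip the API call
        return cache.read_text(encoding="utf-8")
    with audio_path.open("rb") as f:
        result = client.audio.transcriptions.create(model=model, file=f)
    cache.write_text(result.text, encoding="utf-8")
    return result.text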

Correct transcriptions using an LLM with conversation context:

python whatsapp_viewer.py "chat.txt" --dir "path/to/folder" --correct

Interactive mode lets you review each correction (accept, reject, or edit):

python whatsapp_viewer.py "chat.txt" --dir "path/to/folder" --correct-interactive --transcribe-only-x-audios 10

Corrected transcriptions are saved as .txt files. The original .original.txt files are preserved. The HTML output uses corrected versions when available.
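As a hedged sketch of what context-aware correction might look like (the prompt wording, file naming, and context selection below are assumptions, not the tool's actual implementation):

from pathlib import Path
from openai import OpenAI

client = OpenAI()

def correct_transcription(audio_path: Path, context_messages: list[str],
                          llm_model: str = "gpt-4o-mini") -> str:
    # Assumed naming: raw STT output in <audio>.original.txt, corrected text in <audio>.txt
    original = audio_path.with_name(audio_path.name + ".original.txt").read_text(encoding="utf-8")
    prompt = (
        "Fix transcription errors in this WhatsApp voice message, using the "
        "surrounding chat messages as context. Return only the corrected text.\n\n"
        "Context:\n" + "\n".join(context_messages) +
        "\n\nTranscription:\n" + original
    )
    response = client.chat.completions.create(
        model=llm_model,
        messages=[{"role": "user", "content": prompt}],
    )
    corrected = response.choices[0].message.content.strip()
    # Save the corrected text alongside the audio; the .original.txt file is left untouched
    audio_path.with_name(audio_path.name + ".txt").write_text(corrected, encoding="utf-8")
    return corrected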

Argument                        Description
chat_txt                        Chat text file (default: chat.txt)
media_dir                       Media directory (default: .)
output_html                     Output HTML file (default: output.html)
--dir DIR                       Base directory for all files
--me NAME                       Your name in the chat (right-aligns your messages)
--transcribe                    Transcribe audio files using the OpenAI API
--transcribe-only-x-audios N    Limit transcription to the first N audios
--stt-model MODEL               Speech-to-text model (default: gpt-4o-mini-transcribe)
--correct                       Correct transcriptions using an LLM with conversation context
--correct-interactive           Interactively review each correction
--llm-model MODEL               LLM model for correction (default: gpt-4o-mini)
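For example, these options can be combined in a single run (the combination below is illustrative, not the only valid workflow):

python whatsapp_viewer.py "chat.txt" --dir "path/to/folder" --me "YourName" --transcribe --correct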

A sample/ directory is included with a short conversation between two fictional users, AI-generated images, audio files, and pre-generated transcriptions.

To generate the HTML yourself:

python whatsapp_viewer.py --dir sample --me Bob

Or browse the pre-generated output directly:


