A simple Python script that parses SIR (Special Intensive Revision) electoral roll PDFs published by the Election Commission of India, extracts voter details from the scanned Malayalam documents, and converts them into a clean CSV with Malayalam-to-English translation.
Special Intensive Revision (SIR) is a comprehensive verification process of electoral rolls conducted by the Election Commission of India to update, correct, add, or remove voter entries.
- Malayalam OCR using Tesseract
- Parses scanned Election Commission SIR PDFs
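
As a rough illustration of the OCR step listed above, here is a minimal sketch. It assumes the `pdf2image` and `pytesseract` packages, with Poppler and Tesseract's `mal` language data installed. The function names `ocr_pdf_pages` and `page_texts_to_csv` are hypothetical and are not taken from this project's script, which may work differently.

```python
import csv
import io


def ocr_pdf_pages(pdf_path, dpi=300, lang="mal"):
    """Return one OCR text string per page of a scanned PDF (hypothetical helper)."""
    # Imported lazily so the CSV helper below works without the OCR stack.
    from pdf2image import convert_from_path  # assumption: requires Poppler
    import pytesseract  # assumption: requires Tesseract with "mal" data

    # Rasterize each PDF page to an image, then run Tesseract on it.
    pages = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_string(page, lang=lang) for page in pages]


def page_texts_to_csv(page_texts):
    """Serialize raw page texts as CSV with a 1-based page number column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["page", "text"])
    for number, text in enumerate(page_texts, start=1):
        writer.writerow([number, text.strip()])
    return buffer.getvalue()
```

With those dependencies available, `page_texts_to_csv(ocr_pdf_pages("roll.pdf"))` would give raw per-page text as CSV, where `roll.pdf` is a placeholder path. Splitting that text into individual voter fields and translating it are separate steps not shown here.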