Hi Friends,

Even as I launch this today (my 80th birthday), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now, as I approach my 90th birthday (27 June 2023), I invite you to visit my Digital Avatar (www.hemenparekh.ai) and continue chatting with me, even when I am no longer here physically.

Sunday, 14 June 2026

TRANSCRIBING MY LETTERS

 Sanjivani :

 

The approximately 650 letters on my Blogger site (sweet memories) are, at present, very difficult to read (even for those who can read Gujarati).

 

I want to make these "readable" (in typed Gujarati script).

 

In this context, find below my chat with Claude for guidance on how to go about it.

 

Please take up this job next week and let me know how soon you will complete it.

 

The typed versions are to be uploaded on the related Gujarati page, just above the scanned Gujarati image.

 

By a copy of this mail, I am requesting Mitchelle to ask Tushar to help you with this job.

 

In case of any problem, consult Kishan.

 

With regards,

 

Hemen Parekh 

 

 

Hey bro: here is an unusual request. On my blog site, I have uploaded some 650 letters written to me by friends and family members. These are scanned images. Most are written in Gujarati. I wonder if you can OCR these and convert them into ASCII text, which would make them easy to read.

 

Claude :

 

 

What a wonderful personal archive, Bro! 650 letters from friends and family — that's a treasure trove of memories.

Let me be straightforward with you about what's possible here:

What I can do:

  • OCR and transcribe scanned letter images you upload directly to our chat — I can read images and extract Gujarati text (and transliterate or translate if you want)
  • Process them one at a time, or a few at a time if you upload multiple images

The challenge with 650 letters:

  • Uploading all 650 here manually would be very tedious
  • My context window limits how many I can handle in one session

Better options for bulk processing 650 letters:

  1. Google Drive + Google Docs OCR — Upload images to Google Drive, right-click → "Open with Google Docs" — it auto-OCRs, including Gujarati. Free and handles bulk.
  2. Google Cloud Vision API — Excellent Gujarati OCR, batch processing, affordable (first 1000 pages/month free). Kishan could set this up as a simple script pointing to your blog's image URLs.
  3. Tesseract OCR (open source) — Kishan can run a script locally with the guj language pack. Free, handles bulk.
  4. Adobe Acrobat — Has decent Indic script OCR if you have a subscription.
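
For option 3, a local run might look roughly like the sketch below. It assumes the Tesseract command-line tool and its Gujarati language data (`guj`) are already installed. The function names and the folder names `letters` and `typed` are hypothetical, and this is an illustration only, not the script offered in this chat. Tesseract is built mainly for printed text, so results on handwritten letters may be weak.

```python
import subprocess
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def build_tesseract_command(image_path, output_base, lang="guj"):
    """Return the command line: tesseract <image> <output base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_folder(image_dir, text_dir, lang="guj"):
    """Run Tesseract on every image in image_dir; return the .txt paths written."""
    image_dir, text_dir = Path(image_dir), Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not a scanned image
        output_base = text_dir / image_path.stem  # Tesseract appends ".txt"
        subprocess.run(
            build_tesseract_command(image_path, output_base, lang), check=True
        )
        written.append(Path(str(output_base) + ".txt"))
    return written

# Example (hypothetical folders): ocr_folder("letters", "typed")
```

Each scan named, say, `letter_001.jpg` would produce `typed/letter_001.txt` in Gujarati Unicode, which can then be pasted above the scanned image on the blog page.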

My suggestion: Since Kishan handles your technical work, the cleanest solution would be a small Python script using Google Cloud Vision API that fetches each letter image from your blog URLs and outputs Gujarati Unicode text files. I can write that script right now if you'd like!
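
A minimal sketch of that idea, for one letter, could look like the following. It is not the script drafted in this chat. It assumes the `google-cloud-vision` Python package is installed and Google Cloud credentials are already configured; the function names and the `typed` output folder are hypothetical.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def text_path_for(image_url, output_dir):
    """Map an image URL to a .txt path named after the image file."""
    stem = Path(urlparse(image_url).path).stem or "letter"
    return Path(output_dir) / f"{stem}.txt"


def ocr_image_bytes(image_bytes, client=None):
    """Send one image to Cloud Vision and return the recognised text."""
    # Imported here so the rest of the sketch loads without the package.
    from google.cloud import vision

    client = client or vision.ImageAnnotatorClient()
    response = client.document_text_detection(
        image=vision.Image(content=image_bytes),
        image_context={"language_hints": ["gu"]},  # "gu" = Gujarati
    )
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def transcribe_url(image_url, output_dir="typed"):
    """Download one letter image, OCR it, and save UTF-8 Gujarati text."""
    with urllib.request.urlopen(image_url) as reply:
        image_bytes = reply.read()
    out_path = text_path_for(image_url, output_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(ocr_image_bytes(image_bytes), encoding="utf-8")
    return out_path
```

Calling `transcribe_url` once per image URL would leave one text file per letter. Whoever runs it should spot-check the first few outputs against the scans before processing all 650.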

Want me to draft the script for Kishan?

 

Here's what Kishan gets, Bro:

The script has 3 modes — Kishan picks whichever fits your blog setup:

  • Mode A — if he downloads all 650 images to a local folder first
  • Mode B — if he can extract the image URLs from your blog (easiest if Blogger stores them systematically)
  • Mode C — if the letters are all on one blog page, it scrapes the images automatically
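
The script itself is not reproduced in this post. As a rough illustration only, the three ways of collecting the letter images could be sketched with the Python standard library as below. All function names are hypothetical, and Mode C assumes the page HTML has already been downloaded.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def images_from_folder(folder):
    """Mode A: image files already downloaded to a local folder."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )


def urls_from_list(list_text):
    """Mode B: one image URL per line; blank lines are ignored."""
    return [line.strip() for line in list_text.splitlines() if line.strip()]


class _ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Turn relative links into full URLs.
                self.urls.append(urljoin(self.base_url, src))


def urls_from_page(html, base_url):
    """Mode C: scrape image URLs from the HTML of a single blog page."""
    collector = _ImageCollector(base_url)
    collector.feed(html)
    return collector.urls
```

Mode C as sketched here picks up every image on the page, including icons and banners, so the resulting list would need filtering before OCR.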

Key features:

  • Gujarati Unicode output (reads correctly on any device)
  • Confidence score per letter (so low-quality scans get flagged)
  • --resume flag — if it crashes at letter 400, restart and it skips the first 400
  • --combine merges all 650 into one big searchable file
  • All 650 letters fall within Google's free tier (1,000 pages/month free)
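
The resume, combine and confidence-flag features come down to simple bookkeeping on the saved text files. A hedged sketch follows. It is not the actual script: the names, the 0.80 threshold and the one-file-per-letter layout are all assumptions, and the confidence scores are assumed to come from whichever OCR service is used.

```python
from pathlib import Path

LOW_CONFIDENCE = 0.80  # hypothetical threshold for flagging a scan for review


def pending(names, text_dir):
    """Resume: keep only the letters that have no saved .txt file yet."""
    text_dir = Path(text_dir)
    return [n for n in names if not (text_dir / f"{n}.txt").exists()]


def flag_low_confidence(scores, threshold=LOW_CONFIDENCE):
    """Return the names whose OCR confidence (0.0 to 1.0) is below threshold."""
    return sorted(name for name, score in scores.items() if score < threshold)


def combine(text_dir, combined_path):
    """Combine: merge every per-letter .txt file into one searchable file."""
    combined_path = Path(combined_path)
    parts = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        if path.resolve() == combined_path.resolve():
            continue  # do not fold an older combined file into itself
        text = path.read_text(encoding="utf-8")
        parts.append(f"===== {path.stem} =====\n{text}\n")
    combined_path.write_text("\n".join(parts), encoding="utf-8")
    return len(parts)
```

With this layout, a restart after a crash only needs to call `pending` on the full list of letter names, and letters returned by `flag_low_confidence` can be set aside for typing by hand.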

One question for you: are all 650 letters on a single blog page, or spread across multiple pages? That'll tell Kishan which mode to use first.


???
