Hi Friends,

Even as I launch this today (my 80th birthday), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now, as I approach my 90th birthday (27 June 2023), I invite you to visit my Digital Avatar (www.hemenparekh.ai) and continue chatting with me, even when I am no longer here physically.

Tuesday 2 January 2024

FW: Exploring Voice Clone Solutions and Cost-Effective Alternatives



I heard both. I presume you "trained" both using my own voice sample audio file, sent to you by Sandeep.


I found that the playht voice comes very close to mimicking my voice, so I suggest we subscribe to playht. It seems much better (closer to my own) than my voice on Personal.ai. Playht seems to have an Indian English accent.


You may talk to Sandeep, who will do the needful to register and subscribe, initially for ONE MONTH (a trial, which will give us some idea of likely monthly usage and cost). I suppose we will have the choice to renew or not.


How long will deployment take once Sandeep subscribes?




PS :


I am assuming that on our UI there will be some icon for voice ( Listen to me > … ). The voice will be heard ONLY if, after seeing the answer text, a visitor clicks on this icon.


From: Kishan Kokal [mailto:kokalkishan.official@gmail.com]
Sent: 16 November 2023 13:30
To: Hemen Parekh
Subject: Exploring Voice Clone Solutions and Cost-Effective Alternatives


Hello uncle,

I've been exploring open-source voice clone projects, and I recently tried one of them. Unfortunately, the results were not quite satisfactory. The output, which I named "tortoise.wav," was generated using tortoise-tts on my local computer and took approximately 11 minutes. The software generally performs better with an NVIDIA graphics card, but I don't have one.

In general, any generative AI application benefits significantly from a GPU due to its superior performance in matrix operations. However, hosting such applications on a VPS with 4GB of RAM and 2 CPU cores is not practical. It's likely to result in server crashes.

Running this AI application on Google Compute Engine is expensive: even a single attached NVIDIA T4 GPU costs $102.2 per month. Therefore, a third-party API is the more cost-effective solution.

I explored the playht API, which offers a free trial. I gave it a try, and the output, named "playht.wav," was quite impressive.

Considering the limitations of hosting on a VPS, I suggest leveraging third-party APIs like playht and elevenlabs. Playht charges $5 per month for up to 25,000 characters and $0.25 for every additional 1000 characters. Elevenlabs, on the other hand, charges $22 per month, including 100,000 total characters per month (equivalent to approximately 2 hours of generated audio using text-to-speech).
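To make the comparison concrete, here is a rough sketch of the monthly cost at a given usage level, using only the prices quoted above. Two things are assumptions on my part and not from either provider: that playht bills extra usage per started block of 1,000 characters, and the function names themselves, which are purely illustrative. The elevenlabs overage rate is not stated above, so the sketch returns no figure beyond the included 100,000 characters.

```python
import math
from typing import Optional

# Prices as quoted in this email (USD per month).
PLAYHT_BASE_USD = 5.00
PLAYHT_INCLUDED_CHARS = 25_000
PLAYHT_EXTRA_USD_PER_1000 = 0.25
ELEVENLABS_BASE_USD = 22.00
ELEVENLABS_INCLUDED_CHARS = 100_000


def playht_monthly_cost(chars: int) -> float:
    """Estimated playht bill for `chars` characters in one month.

    Assumption: usage above the included quota is billed per started
    block of 1,000 characters (not confirmed by the provider).
    """
    extra_chars = max(0, chars - PLAYHT_INCLUDED_CHARS)
    extra_blocks = math.ceil(extra_chars / 1000)
    return PLAYHT_BASE_USD + extra_blocks * PLAYHT_EXTRA_USD_PER_1000


def elevenlabs_monthly_cost(chars: int) -> Optional[float]:
    """Estimated elevenlabs bill, or None above the included quota.

    The overage rate is not given in this email, so no figure is
    guessed for usage beyond 100,000 characters.
    """
    if chars > ELEVENLABS_INCLUDED_CHARS:
        return None
    return ELEVENLABS_BASE_USD


if __name__ == "__main__":
    for chars in (10_000, 25_000, 50_000, 93_000, 100_000):
        print(chars, playht_monthly_cost(chars), elevenlabs_monthly_cost(chars))
```

Under these assumptions, playht stays cheaper until usage reaches about 93,000 characters in a month, where both come to $22. A one-month trial should show which side of that figure our actual usage falls on.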


In conclusion, running resource-intensive voice clone models on a VPS is impractical, and running them on cloud services with GPUs is expensive. Third-party APIs such as playht and elevenlabs are cost-effective, easy to use, and produce impressive results, which makes them compelling alternatives to self-hosting.

Best regards,
