Kishan,
I heard both. I presume you “trained” both using my own voice sample audio file, sent to you by Sandeep.
I found that the playht voice comes very close to mimicking my voice, so I suggest we subscribe to playht. It seems much better (closer to my own) than my voice on Personal.ai. Playht seems to have an INDIAN ENGLISH accent.
You may talk to Sandeep, who will do the needful to register and subscribe, initially for ONE MONTH (a trial, which will give us some idea of likely monthly usage / cost). I suppose we have a choice NOT to renew or to continue.
How long will deployment take once Sandeep subscribes?
Hcp
PS:
I am assuming that on our UI there will be some ICON for voice ( Listen to me > … ). The voice will be heard ONLY if, after seeing the ANSWER TEXT, a visitor clicks on this icon.
From: Kishan Kokal [mailto:kokalkishan.official@
Sent: 16 November 2023 13:30
To: Hemen Parekh
Subject: Exploring Voice Clone Solutions and Cost-Effective Alternatives
Hello uncle,
I've been exploring open-source voice clone projects, and I recently tried one of them. Unfortunately, the results were not quite satisfactory. The output, which I named "tortoise.wav," was generated using tortoise-tts on my local computer and took approximately 11 minutes. The software generally performs better with an NVIDIA graphics card, but I don't have one.
In general, any generative AI application benefits significantly from a GPU due to its superior performance in matrix operations. However, hosting such applications on a VPS with 4GB of RAM and 2 CPU cores is not practical. It's likely to result in server crashes.
Running this AI application on Google Compute Engine is expensive: even a single attached NVIDIA T4 GPU costs $102.2 per month. Therefore, opting for a third-party API is a more cost-effective solution.
I explored the playht API, which offers a free trial. I gave it a try, and the output, named "playht.wav," was quite impressive.
Considering the limitations of hosting on a VPS, I suggest leveraging third-party APIs like playht and elevenlabs. Playht charges $5 per month for up to 25,000 characters and $0.25 for every additional 1000 characters. Elevenlabs, on the other hand, charges $22 per month, including 100,000 total characters per month (equivalent to approximately 2 hours of generated audio using text-to-speech).
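To make the comparison concrete, here is a rough sketch of the arithmetic in Python. It uses only the prices quoted above. Two things are my assumptions and not confirmed by either provider: that playht bills each started block of 1000 extra characters, and that the function and constant names are just illustrative. Since I don't have an overage rate for elevenlabs, the sketch returns None once usage goes past the included 100,000 characters.

```python
import math
from typing import Optional

# Prices as quoted in this email; they may have changed since.
PLAYHT_BASE_USD = 5.00
PLAYHT_INCLUDED_CHARS = 25_000
PLAYHT_EXTRA_USD_PER_1000 = 0.25
ELEVENLABS_BASE_USD = 22.00
ELEVENLABS_INCLUDED_CHARS = 100_000


def playht_monthly_cost(chars: int) -> float:
    """Estimated playht bill in USD for `chars` characters in one month."""
    extra_chars = max(0, chars - PLAYHT_INCLUDED_CHARS)
    # Assumption: every started block of 1000 extra characters is billed in full.
    extra_blocks = math.ceil(extra_chars / 1000)
    return PLAYHT_BASE_USD + PLAYHT_EXTRA_USD_PER_1000 * extra_blocks


def elevenlabs_monthly_cost(chars: int) -> Optional[float]:
    """Estimated elevenlabs bill in USD, or None above the included quota."""
    if chars > ELEVENLABS_INCLUDED_CHARS:
        # No overage rate is quoted in this email, so no estimate is given.
        return None
    return ELEVENLABS_BASE_USD


if __name__ == "__main__":
    for chars in (10_000, 25_000, 50_000, 93_000, 100_000):
        print(chars, playht_monthly_cost(chars), elevenlabs_monthly_cost(chars))
```

Under these assumptions the two plans cost the same at about 93,000 characters per month, and playht is cheaper below that. The one-month trial should tell us which side of that line our actual usage falls on.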
In conclusion, running resource-intensive voice clone models on a VPS, or on cloud services with GPUs, is either impractical or expensive. With offerings like playht and elevenlabs, the cost-effectiveness, ease of use, and impressive results make third-party APIs a compelling alternative to self-hosting.
Best regards,
Kishan