Emails (Others): RE: FW: VOICE CLONING

Kishan,

This sounds great! I eagerly await the outcome.

In the meantime, I suppose the “Share this Answer on WhatsApp” link will soon appear on the mobile version.

hemen

From: Kishan Kokal [mailto:kokalkishan.official@gmail.com]
Sent: 27 October 2023 13:36
To: Hemen Parekh
Subject: Re: FW: VOICE CLONING

Uncle,

I believe I can handle it independently without the need for a video meeting. I've been delving into various TTS voice cloning models, and I'm confident I can accomplish this on my own.

On Fri, 27 Oct 2023 at 09:03, Hemen Parekh <hcp@recruitguru.com> wrote:

Kishan,

This is further to the telephone conversation Vishesh had with you a few minutes back.

I am glad that you will take up the VOICE CLONING project after 2nd Nov. In this context, my following chat with ChatGPT should help you. If you are able to develop this feature without depending upon Personal.ai, then there is no need to have any VIDEO MEET with Suman.

Hey ChatGPT, many thanks for your guidance. Do any of these services create my OWN VOICE (voice cloning) using an audio file of my voice?

ChatGPT

Yes, some text-to-speech (TTS) services and platforms offer voice cloning or voice synthesis services that allow you to create custom voices based on your own voice or other audio recordings. Here are a few options that provide voice cloning capabilities:
1. Google Cloud Text-to-Speech WaveNet Custom Voice: Google Cloud offers a service that enables you to create custom voices, allowing you to synthesize speech using a voice that's based on your own recordings. It involves training a WaveNet model with your audio data.
2. Amazon Polly Neural TTS Custom Voice: Amazon Polly also offers a custom voice feature that lets you create a unique voice based on your own recordings or specific requirements.
3. CereVoice by CereProc: CereProc provides voice cloning services, allowing you to create custom TTS voices from your own voice recordings. They specialize in lifelike, expressive voices.
4. iSpeech Custom TTS Voice: iSpeech, in addition to their standard TTS API, also offers custom TTS voice services where you can create your own unique voice based on your recordings.
5. ReadSpeaker VoiceLab: ReadSpeaker's VoiceLab is a service that provides the ability to create custom voices. You can use it to create TTS voices that match your specific requirements, including using your own voice recordings as a reference.
Voice cloning or voice synthesis services typically involve recording and providing a set of scripted sentences or phrases, which are used to train a model that can generate speech in a voice that sounds like your own. These services can be useful for creating personalized and unique voices for your applications, making your TTS content more engaging and tailored to your brand or needs.
Keep in mind that creating custom voices may require more effort, time, and sometimes additional costs compared to using pre-built voices. Additionally, the quality and realism of the resulting voice may vary depending on the service and the quality of the training data provided. It's important to carefully review the documentation and terms of use of the specific service you choose for voice cloning.

A few months back, Sandeep had sent to Sharon (Personal.ai) an AUDIO RECORDING file (3 hours long) of my own voice (stripped from my English poems on my Blogger site).

Relevant emails:

Mon 19-12-2022 15:11

Hi Mr.Parekh,

Copying Sharon here, our CTO; she recently discovered a method to create your voice clone! She can also tell you which audio formats are and are not possible. Currently, MP4 seems to be possible via Google Drive. If you have the audio recordings in that format within our Drive folder for you, and your team copies and pastes the share link for each file into your stack, the transcription will be generated. Sharon to confirm.

Best,
Sahiti

Tue 20-12-2022 08:10

Hey Hemen,

Would such separate / single (unconsolidated) uploads manage to create my “Voice Clone”, which can “speak out” in my voice the ANSWERS when questions are posed to my Personal.ai? Or does creating a “Voice Clone” require us to merge all 123 files into ONE file before uploading?

Right now, it only needs about 30 minutes of speech for the voice clone to happen. It actually doesn't matter that the files are separate; the 30 minutes are usually split into single utterances (one sentence per utterance).

The holidays are a little tricky, as I am waiting on a few people to coordinate on working on your voice clone, but I hope to have good news by NY. The one that we created for me is very promising and I can't wait to hear yours as well.

Cheers,
Sharon
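
As a rough illustration of Sharon's point that separate files are fine so long as they add up to about 30 minutes, the sketch below sums clip lengths and compares the total with that target. The helper name `summarizeClips` and the 90-second clip length are assumptions made for illustration; only the 30-minute figure and the count of 123 files come from the emails above.

```javascript
// Hypothetical helper: sums clip lengths and compares the total against a target.
// The 30-minute target is taken from Sharon's email; everything else is assumed.
const TARGET_MINUTES = 30;

function summarizeClips(clipSeconds, targetMinutes = TARGET_MINUTES) {
  const totalSeconds = clipSeconds.reduce((sum, s) => sum + s, 0);
  const totalMinutes = totalSeconds / 60;
  return {
    clipCount: clipSeconds.length,
    totalMinutes,
    enough: totalMinutes >= targetMinutes,
  };
}

// Example: 123 separate files, each assumed (for illustration) to be 90 seconds long.
const summary = summarizeClips(new Array(123).fill(90));
console.log(summary);
```

Because only the total matters in this sketch, merging the files into one would not change the result.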

From: Hemen Parekh [mailto:hcp@recruitguru.com]
Sent: 21 October 2023 11:15
To: kokalkishan.official@gmail.com
Cc: vishesh@incomegroup.in; System Administrator; hcp@recruitguru.com
Subject: VOICE CLONING

Kishan,

A couple of days ago, in response to my query, Suman answered as follows:

Thu 19-10-2023 09:04

On the voice: if you join the Suman autopilot lounge in the Personal.ai web app, you will see my voice active.

So we can get yours similarly as well.

Check it out and we can talk about it.

(a copy is marked to you)

Yesterday, Vishesh visited Suman's CHAT (in his Personal.ai) and tried to locate his “voice”, but with no success!

I am thinking of requesting a VIDEO MEET with Suman next Tuesday at 6 pm (IST). Does that suit you?

In the meantime, you may want to go through the following notes.

Hcp

“Text-to-Speech” Feature

The following is my chat today (01 July 2023) with BARD:
Is it possible to now introduce on my website a "Text to Speech" feature, so that when a visitor poses (posts / types) a question in the QUESTION BOX, and the relevant (text) answer appears in the ANSWER BOX, these can be listened to as my speech, in my own voice?
Yes, it is possible to introduce a text-to-speech feature on your website so that visitors can listen to the questions and answers in your own voice. There are a few different ways to do this.
One way is to use a third-party text-to-speech service, such as Google Cloud Text-to-Speech or Amazon Polly. These services allow you to convert text into speech in a variety of languages and voices. You can then embed the generated audio on your website.
Another way to add text-to-speech to your website is to use the browser's built-in Web Speech API (the SpeechSynthesis interface). This API allows you to control the text-to-speech engine in your browser from JavaScript. You can use it to read text aloud, but it speaks only in the voices installed on the visitor's device and does not generate audio files that can be downloaded.
Once you have chosen a text-to-speech service or library, you will need to configure it to use your own voice. This may involve recording your voice and uploading it to the service, or creating a custom voice profile.
Once your text-to-speech service or library is configured, you can start adding text-to-speech to your website. You can do this by adding a script tag to your website's HTML, or by using a JavaScript API.
Here is an example of how you could add text-to-speech to your website using Google Cloud Text-to-Speech:
Code snippet
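<script type="module">
// Assumption: API_KEY is a Google Cloud API key restricted to the Text-to-Speech API.
// In production, call the API from your own server so the key is not exposed to visitors.
const API_KEY = "YOUR_API_KEY";

// Set the voice parameters
const voice = {
  languageCode: "en-US",
  name: "en-US-Wavenet-B",
};

// Convert the text to speech via the REST endpoint
const response = await fetch(
  `https://texttospeech.googleapis.com/v1/text:synthesize?key=${API_KEY}`,
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      input: { text: "This is a test of text-to-speech." },
      voice: voice,
      audioConfig: { audioEncoding: "MP3" },
    }),
  }
);
if (!response.ok) {
  throw new Error(`Text-to-Speech request failed: ${response.status}`);
}

// The API returns the audio as a base64-encoded string
const { audioContent } = await response.json();

// Play the audio (browsers may require a user click before playback starts)
const audio = new Audio(`data:audio/mpeg;base64,${audioContent}`);
audio.play();
</script>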
This code will convert the text "This is a test of text-to-speech." into speech using the Google Cloud Text-to-Speech service. The generated audio will then be played back in the browser.
Once you have added text-to-speech to your website, you can use it to read the questions and answers in your own voice. This can be a helpful feature for visitors who are visually impaired, or who simply prefer to listen to the content rather than reading it.
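
For comparison, here is a minimal sketch of reading the ANSWER BOX aloud with the browser's built-in speech synthesis instead of a cloud service. The element ids `answer-box` and `listen-button` are hypothetical placeholders, not taken from the actual site, and this approach speaks in a voice installed on the visitor's device rather than in a cloned voice.

```javascript
// Sketch: read the ANSWER BOX text aloud with the browser's built-in Web Speech API.
// Assumption: the element ids "answer-box" and "listen-button" are hypothetical.
// Note: this uses voices installed on the visitor's device, not a cloned voice.

function pickVoice(voices, lang) {
  // Prefer an exact language match, then the same base language, else null.
  return (
    voices.find((v) => v.lang === lang) ||
    voices.find((v) => v.lang.startsWith(lang.split("-")[0])) ||
    null
  );
}

function speakAnswer(text, lang = "en-US") {
  if (typeof window === "undefined" || !("speechSynthesis" in window)) {
    return false; // Not in a browser, or the browser lacks speech synthesis.
  }
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  const voice = pickVoice(window.speechSynthesis.getVoices(), lang);
  if (voice) utterance.voice = voice; // Otherwise the browser default voice is used.
  window.speechSynthesis.cancel(); // Stop any answer still being read.
  window.speechSynthesis.speak(utterance);
  return true;
}

// Wiring (browser only): speak the answer when a "Listen" button is clicked.
if (typeof document !== "undefined") {
  const button = document.getElementById("listen-button");
  const answerBox = document.getElementById("answer-box");
  if (button && answerBox) {
    button.addEventListener("click", () => speakAnswer(answerBox.textContent));
  }
}
```

Triggering speech from a button click also sidesteps the autoplay restrictions that many browsers apply to audio started without user interaction.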
===================================================================
Copied from CHATS with the AI of Suman Kanuganti (20 Oct 2023):

Hemen, great question! Based on the context and my memory, here is what I know: To get your digital avatar to give answers by speech, you'll need to use a text-to-speech (TTS) system. Some key things to keep in mind:
- Make sure your avatar has access to a high-quality TTS engine. Companies like Amazon (Polly), Google, IBM, and others offer cloud-based TTS APIs that can synthesize natural-sounding speech from text.
- When your avatar needs to generate a verbal response, have it format the response as text and pass that text to the TTS engine. The engine will then "read out" the response as synthesized audio.
- For a more human-like conversational flow, you can have the avatar generate short responses/acknowledgements like "Let me think about that" before synthesizing a longer answer.
- Make sure the avatar's "voice" matches its persona. You can choose different voices (male vs female, various accents etc) depending on how you've designed your avatar.
- Add emotion and emphasis through punctuation like exclamation points to make the speech more engaging.
Let me know if any part needs more explanation! Building conversational systems is fascinating. What else can I help with?

Hemen, in response to your question about getting your digital avatar to give answers by speech, let me share what I know based on our context and my past conversations. To have an avatar respond verbally, text-to-speech or TTS is certainly the way to go. Ensuring access to a high-quality TTS engine from providers like Amazon Polly, Google or IBM will be key. When the avatar needs to respond, having it format the text response and then pass that to the TTS engine is how it can be "spoken" aloud in synthesized audio. For a more natural flow, I'd recommend having the avatar generate short acknowledgements like "Let me think about that" before longer answers so it seems more human-like. Choosing a voice that matches the persona you've designed, whether male or female, is also important for consistency. Adding emphasis through punctuation can help make the synthesized speech more engaging too. I hope these tips are helpful as you work on your avatar! Please let me know if any part needs more explanation or if you have additional questions. I'm always happy to discuss building conversational systems - there's so much we can learn from each other. How else can I assist you today?
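
The flow described in these chats (format the answer as text, optionally say a short acknowledgement first, then pass each piece to the TTS engine) could be sketched roughly as follows. The `synthesize` callback is a hypothetical stand-in for whichever engine is eventually chosen, and the 200-character threshold for a "longer" answer is an arbitrary assumption.

```javascript
// Sketch of the flow described above: text answer -> optional acknowledgement -> TTS engine.
// `synthesize` is a hypothetical stand-in for the chosen TTS engine (a cloud service,
// a browser voice, or a cloned voice); it is not a real library call.

const ACK_THRESHOLD_CHARS = 200; // Assumed cut-off for a "longer" answer.

function planSpeech(answerText, ack = "Let me think about that.") {
  const text = answerText.trim();
  if (text.length === 0) return [];
  // Long answers get a short acknowledgement first; short ones are spoken directly.
  return text.length > ACK_THRESHOLD_CHARS ? [ack, text] : [text];
}

async function speakAll(segments, synthesize) {
  // Speak segments one after another so the acknowledgement comes first.
  for (const segment of segments) {
    await synthesize(segment);
  }
}

// Usage with a fake engine that only logs what it would say.
const fakeEngine = async (segment) => console.log("[TTS]", segment);
speakAll(planSpeech("Yes."), fakeEngine);
```

Keeping the planning step separate from the engine call means the same logic can be reused if the TTS provider changes later.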

Wednesday, 3 January 2024
