Hi Friends,

Even as I launch this today (my 80th birthday), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now, as I approach my 90th birthday (27 June 2023), I invite you to visit my Digital Avatar (www.hemenparekh.ai) and continue chatting with me, even when I am no longer here physically.

Thursday 4 January 2024

FW: FOR YOUR EXPERIMENTATION

That's fantastic.

As Hemen uncle said, the file is approximately 40 GB in size.
How would you share such a massive file?

 

On Tue, 27 Dec 2022 at 14:25, System Admin <systemadmin@3pconsultants.co.in> wrote:

Dear Kishan

 

All the files are in PNG, JPG, or JSON format.

In case of any query, feel free to contact me on the number below.

 

 

Regards,

 

Sandeep Tamhankar

+91 98333 03938

3P Consultants

 

From: Hemen Parekh <hcp@recruitguru.com>
Sent: 27 December 2022 13:13
To: System Admin <systemadmin@3pconsultants.co.in>
Cc: kokalkishan.official@gmail.com
Subject: FW: FOR YOUR EXPERIMENTATION

 

Sandeep,

 

Can you please clarify this for Kishan?

 

hcp

 

From: Kishan Kokal [mailto:kokalkishan.official@gmail.com]
Sent: 27 December 2022 13:09
To: Hemen Parekh
Subject: Re: FOR YOUR EXPERIMENTATION

 

I am excited about the opportunity to contribute to the success of this project.

 

Regarding parsing, what kind of file do you wish to parse? (e.g., plain text (txt), CSV, PDF, DOCX)

 

 

On Tue, 27 Dec 2022 at 10:59, Hemen Parekh <hcp@recruitguru.com> wrote:

Kishan,

 

 

It was nice talking to you a few minutes ago.

I am glad to know that the UI which you developed (for my Digital Avatar) is built in HTML and JavaScript.

 

That means:

If you succeed in integrating this UI with the “QnA (Question and Answer)” API of Personal.ai, then we can actually TEST how well it works by posting some questions and checking whether we get relevant answers.

In this regard, find below the email received from Suman Kanuganti (Founder, Personal.ai):

 

 

The API key is in the settings; follow the doc here:

 

https://docs.personal.ai/docs/memory-api   

 

 

The Message API is in the same link, on the left. You can find the API key to use in the settings of Hemen's Personal.ai account. Here is the direct link for the Message API:

 

https://documenter.getpostman.com/view/13134732/TzscpSjZ#4acaf2f5-d5a7-46a6-a4c4-06e015cbc959

 

 

Next:

How to parse the 40 GB Word file to:

 

- Separate the words

- Arrange these words in descending order of “Frequency of Occurrence” (I suppose the frequency will appear within brackets, next to each word)

- Remove words representing verbs, adverbs, prepositions, common nouns, hyphenation, etc. (this may need to be done manually)

 

 

Whatever final list emerges could be considered the “Keywords / Topics” of my areas of knowledge.
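
The last two steps above can be sketched in Python as follows. This is only a minimal illustration, assuming the word frequencies have already been counted: the function name keyword_lines and the tiny STOPWORDS set are hypothetical, and a real stop list would be far longer and, as noted above, may still need manual review.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop list (an assumption for illustration).
# A real list would be much longer and would likely need manual curation.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to", "very"}


def keyword_lines(counts, stopwords=STOPWORDS, top=None):
    """Return 'word (frequency)' strings, most frequent first, without stop words."""
    kept = Counter({w: n for w, n in counts.items() if w not in stopwords})
    # most_common() sorts by frequency in descending order; top=None keeps all.
    return [f"{word} ({n})" for word, n in kept.most_common(top)]


if __name__ == "__main__":
    # Made-up sample counts, purely to show the output format.
    sample = Counter({"the": 50, "recruitment": 12, "of": 30, "resume": 7})
    for line in keyword_lines(sample):
        print(line)
```

The frequency appears in brackets next to each word, and anything found in the stop list is dropped before the list is printed.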

 

 

Following is the answer that ChatGPT provided:

 

 

 

Yes, there are several tools that can parse a large collection of Word documents and arrange the parsed words in descending order of frequency. One option is to use standard command-line tools such as "sort" and "uniq", which are available on most Unix-like operating systems (including Linux and macOS). You first convert the Word documents to plain text files (using a tool like "catdoc" or "antiword", both of which read the older .doc format), and then use these commands to count the words and sort them by frequency.

For example, you can use the following command to convert a Word document to a plain text file:

catdoc input.doc > output.txt

Then, you can use the following command to sort the words in the text file by frequency:

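# Put one lowercase word on each line, then count duplicates and sort by count
tr -cs '[:alpha:]' '\n' < output.txt | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -nr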

This will output a list of words and their frequencies, sorted in descending order.

Another option is to use a programming language like Python to parse the Word documents and count the frequencies of the words. The "python-docx" library (imported in code as "docx") can read .docx files. You can use it to extract the text from the Word documents, split the text into individual words, and then use a dictionary to count the frequencies of the words. Finally, you can sort the dictionary by value to obtain a list of words and their frequencies in descending order.
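
As a rough sketch of the Python approach described above, the counting step might look like the following. It uses only the standard library and assumes the text has already been extracted; the function names are hypothetical. With python-docx, one would presumably feed it the text of each paragraph (for example, p.text for each p in docx.Document(path).paragraphs), which is left out here so the example runs without extra packages. The word pattern only matches unaccented English letters, which is a simplification.

```python
import re
from collections import Counter

# Simplifying assumption: a "word" is a run of ASCII letters, optionally
# followed by an apostrophe and more letters (so "don't" stays one word).
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def count_words(chunks):
    """Count lowercase words across an iterable of text chunks (lines, paragraphs)."""
    counts = Counter()
    for chunk in chunks:
        counts.update(w.lower() for w in WORD_RE.findall(chunk))
    return counts


def count_words_in_file(path, encoding="utf-8"):
    """Stream a plain-text file line by line, so the whole file never sits in memory."""
    with open(path, encoding=encoding, errors="replace") as fh:
        return count_words(fh)


if __name__ == "__main__":
    # Made-up sample text, purely for illustration.
    demo = count_words(["The cat saw the dog.", "Dog days; the END"])
    for word, n in demo.most_common():
        print(f"{word} ({n})")
```

Reading line by line matters for very large inputs: only the table of distinct words is kept in memory, not the text itself.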

I hope this helps! Let me know if you have any other questions.

 

 

Are the above-mentioned “Process Instructions” sufficient for you to help me carry out this “listing of keywords”?

 

Regards,

 

Hemen Uncle
