Thursday 4 January 2024

FW: FW: Keywords for sample content

 Subject: RE: FW: Keywords for sample content


Both are possible, although right now our topics are mapped to keywords and not to higher-level abstractions.


1/ It sounds like the search functionality is our autocomplete functionality, where you can type a prefix and we'll show the list of topics that match. [See below when I type in doc … ] The number of occurrences is also part of this payload. [See screenshot 1]

2/ Once the user types in a topic, we can also show the list of questions that match the topics the AI has learned.


Let me know if this is what you are looking for.





On Mon, Jan 09, 2023 at 9:43 PM, Hemen Parekh <hcp@recruitguru.com> wrote:

Hi, Sharon,



Thanks for your prompt reply.


1 )….. I am thrilled at the possibility of the API delivering the list of “ topics “ ( in my draft U / I , I have used the word “ topic “ – and not “ keyword “ )



           During my meeting with Manoj / Kartavya, it was decided that :


          #   We will NOT show alphabet letters ( A .. B .. C .. ) for visitors to select / click in order to view a “ drop-list “ of topics starting with the clicked letter



         #   Instead, we will provide a SEARCH BAR , where a visitor can type .


              As soon as she types ( say ) “ D “ , all the topics starting with the letter D will appear. As soon as she types the next letter ( say ) “ O “ , all the topics starting with “ Do “ ( e.g. Document, Domain, Dormant, etc. ) will appear, with the number of occurrences in brackets
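A minimal sketch of the prefix search described above, assuming the topics arrive as a simple topic-to-count mapping. The payload shape, the function name `topics_with_prefix` and the counts are illustrative assumptions, not the actual Personal.ai API:

```python
# Hypothetical sample payload: topic -> number of occurrences (invented counts).
TOPIC_COUNTS = {
    "Document": 42,
    "Domain": 17,
    "Dormant": 3,
    "Data": 58,
    "Recruitment": 120,
}


def topics_with_prefix(prefix, topic_counts):
    """Return 'Topic (count)' labels for topics starting with prefix.

    Matching is case-insensitive and the result is sorted alphabetically.
    """
    needle = prefix.casefold()
    matches = [
        (topic, count)
        for topic, count in topic_counts.items()
        if topic.casefold().startswith(needle)
    ]
    matches.sort(key=lambda item: item[0].casefold())
    return [f"{topic} ({count})" for topic, count in matches]


# Each extra letter typed narrows the drop-list.
print(topics_with_prefix("D", TOPIC_COUNTS))
print(topics_with_prefix("Do", TOPIC_COUNTS))
```

Running the filter again on every keystroke is enough for a list of a few thousand topics; a sorted index or trie would only matter for much larger lists.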



2 )…… I do not want the autocomplete feature ,



          Clicking on a topic ( in the drop-list below the search bar ) should transfer that word ( say “ Document “ ) into the Question Box. At this stage, we have the following options :


         #  Visitor is advised to frame / type a question which contains the topic “ Document “




        #  We ourselves pop up a few questions containing that word, and ask the visitor to select / click any of those questions
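The second option could be sketched roughly as below. The question list, the function name `questions_for_topic` and the limit of three suggestions are assumptions for illustration only; the real questions would come from whatever the AI has learned:

```python
import re

# Hypothetical prepared questions, invented for this sketch.
PREPARED_QUESTIONS = [
    "What is a document management system?",
    "How should a resume document be structured?",
    "Which domain names did you register?",
    "Why do documents matter in recruitment?",
]


def questions_for_topic(topic, questions, limit=3):
    """Return up to `limit` questions containing `topic` as a whole word.

    Matching is case-insensitive. Plurals such as "documents" are NOT
    matched, because the pattern requires a word boundary after the topic.
    """
    pattern = re.compile(rf"\b{re.escape(topic)}\b", re.IGNORECASE)
    return [q for q in questions if pattern.search(q)][:limit]


# Clicking "Document" in the drop-list would pop up these suggestions.
print(questions_for_topic("Document", PREPARED_QUESTIONS))
```

Whole-word matching is a deliberate simplification here; stemming or a looser match could be swapped in if plural forms should also count.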



Since many visitors are not “ adept “ at framing questions that make sense, I prefer the second option : visitors don’t have to struggle to frame a question


Choosing the second option also has the advantage of eliminating the “ starting inertia “


Once a visitor has successively selected 2 / 3 “ prepared “ questions and received relevant answers, her confidence goes up



Can this be done ?











Hey Hemen —


1/ We can probably fetch the list of keywords for you and possibly expose that through an API to show top keywords. I can gather a list to see if it would match what you are thinking. 

2/ If by Search by keyword you mean the autocomplete feature that we have here — we can also expose that API.




On Mon, Jan 09, 2023 at 8:10 PM, Hemen Parekh <hcp@recruitguru.com> wrote:




Please take a look at the keywords list ( =  “ Topics “ on my Personal.ai page )


Despite Manoj putting in a lot of “ parsing “ efforts over the past week , this list cannot be used ( as it is ) in My Digital Avatar U / I , for the following reasons :



Ø It covers only “ 50 % of the content “ from Blogger site “ back up “ taken about a month ago  ( mostly “ text files “ of my blogs / notes / reports )


Ø Extracting keywords from the balance 50 % could take another week


Ø The list does not carry , within brackets alongside , the number of times ( frequency ) each keyword has been used by me in my Digital Content


Ø Since the “ Frequency of Occurrence “ is missing, it is not possible to re-arrange the list in the “ Descending Order of Occurrence “


Ø Only through such re-arrangement , the IMPORTANT keywords ( topics ) would rise to the TOP of the list, with unimportant words sinking to the bottom ( enabling some cut-off )


Ø If , over the past 30 years , I have used a particular word very frequently, it is a clear indication that the “ related text “, denotes my “ Area of Knowledge “


Ø In my Digital Avatar U / I , we want to direct ( in a subtle way ) , our visitors to click / select , such TOP RANKING keywords ( topics ), where the probability of getting a good / relevant answer, is much better . Without rearranging ( in descending order of occurrence ), this cannot be achieved


Ø Then there is the problem of > How often can we take a back-up ? How often can we repeat the “ parsing / rearranging “ exercise ?


Ø On top of this, keywords extracted by Manoj, will NOT perfectly match with the TOPIC words that you display ( on my ai page ), simply because you extract from 100006 Memory Blocks

Ø It is not desirable to have TWO , totally different DATABASES ( one of Keywords with Manoj and another of Topics with Sharon ), which need to be SYNCHRONIZED again and again , even if done programmatically . This is like wearing two wrist-watches, one running fast and another running slow – and having to synchronize these every 2 hours !


Ø It is also important to remember that, for the past 2 months, I have been uploading many TEXT documents on my Personal.ai page which are NOT being uploaded on my blogger site. Hence Keywords and Topics can never get synchronized !
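The frequency ranking asked for in the points above can be sketched as follows. This is an illustration only: the sample text, the function name `rank_keywords` and the cut-off value are assumptions, and it handles single-word keywords only:

```python
from collections import Counter
import re


def rank_keywords(text, keywords, min_count=2):
    """Count whole-word occurrences of each keyword in `text`.

    Returns (keyword, count) pairs in descending order of occurrence,
    dropping keywords seen fewer than `min_count` times (the cut-off).
    Ties are broken alphabetically.
    """
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    ranked = [(kw, words[kw.lower()]) for kw in keywords]
    ranked = [(kw, n) for kw, n in ranked if n >= min_count]
    ranked.sort(key=lambda item: (-item[1], item[0].lower()))
    return ranked


# Invented sample content, standing in for the blog back-up.
sample = (
    "Resume parsing helps recruiters. A resume is a document. "
    "Parsing a resume saves time."
)
print(rank_keywords(sample, ["Document", "Resume", "Parsing", "Time"]))
```

With the cut-off at 2, only the frequently used words survive, which is the "important keywords rise to the top" effect described above.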



In light of the foregoing, I urge you to consider the following :


Ø Manoj gives up the effort to parse / extract keywords and store them in a LOCAL DATABASE for displaying in the U / I ( attached )


Ø In the “ Search for Keywords “ feature ( left column of U / I ), the TOPICS get delivered from Personal.ai , through API


Ø Obviously, this might cover ( may be ) 5,000 TOPICS ( to be arranged ALPHABETICALLY and in DESCENDING ORDER of frequency )


This evening at 6 pm ( IST ), Manoj is giving me a DEMO of the working of the QnA API ( on Microsoft Teams )


If you ( or any member of your team ) want to watch, here is that link :



With regards,




Hi Hemen Sir,


Attaching the first cut of keywords from 50% of your content from Blogger site feeds.


Keywords -   contentKeywords1.csv



Content –  content1.txt



Please review the same and let us know if we can go ahead to get keywords for the rest of the content.





Manoj Hardwani


Hi Hemen,


Attaching the sample content from your blog and the keywords from this content.


I think it does require refinement.


Please review and provide your feedback on the same.



