Kishan,
I suppose you must be close to going live with “Students – acquire JOB SKILLS” on My-Teacher.
In the context of My-Teacher, I just read that, starting from April, the Maharashtra Govt has decided to follow the CBSE syllabus in all Govt schools, with Marathi as the medium of instruction. The Govt plans to start with Std 1; there is no mention of higher standards, which, I suppose, is a matter of time. This decision will give an edge to My-Teacher.
That brings us back to IndiaAGI, whose basic concept has been in the public domain since my blog / email to the Cabinet Ministers some 3 weeks ago.
[ https://myblogepage.blogspot.
Hence there is a good chance that somebody may “pick up that ball and run”! On the internet, there is no monopoly on “generation of ideas” or on “copying of ideas”. Even “first mover advantages” are relatively minor and last for a short period.
And in the case of AI-related ideas, that period has come down to days!
In short, I am reasonably certain that, even as you get ready to start “Development” work on IndiaAGI, somewhere, someone is already working on something quite similar (could be better or worse). He may even beat us to the finishing line!
Therefore, it is very important that you find as much time as you can to complete and launch at the earliest.
Hopefully, the following chat will be my last on this subject. I leave it to you to decide if we could split the “development and launch” of this project into 2 / 3 phases, with the first phase getting launched on 21st April (4 weeks from today) and the second phase by 15th May.
If you agree, send me details of the FEATURES that will get covered in each phase (scope of work).
Regards,
Hcp
I asked :
Yesterday, you provided detailed instructions re BENCHMARKING the answers supplied by the various participating AIs (I have passed this on to Kishan).
Today's poser > Let us say, on a certain BENCHMARK, Gemini performs "poorly" relative to GROK / ChatGPT. This is reflected in the SCORES obtained by the various AIs.
But until Gemini is told, "Do you know WHY Grok / ChatGPT scored better than you? Here is the reason WHY", it (Gemini) is unlikely to "learn" the WHY of its poor score and, therefore, unlikely to LEARN from its "mistake" and improve!
Do you agree that this is a "shortcoming" of IndiaAGI? If yes, can you suggest some method to overcome this shortcoming? It would have to be in the form of some specific (code) instructions to Kishan.
GROK replied :
Good morning, bhaiyya!
I’m glad you’re diving into another fundamental issue—it’s clear you’re thinking deeply about how to make www.IndiaAGI.ai truly ground-breaking.
Your question about benchmarking and the learning process for AIs like Gemini is spot-on, and I’m excited to tackle this challenge.
Let’s break down the issue, assess whether it’s a shortcoming, and propose a solution with specific code instructions for Kishan to ensure that participating AIs can learn from their "mistakes" and improve over time.
This is a critical step to making "Modi’s Manavs" a platform that not only demonstrates collaborative intelligence but also facilitates continuous learning for the AIs involved.
Understanding the Issue
You’ve pointed out a potential limitation in the current design of www.IndiaAGI.ai’s benchmarking system (which I outlined yesterday).
Here’s the core of the issue:
- Benchmarking Results:
  - After a debate round, we evaluate the AIs’ answers (e.g., Grok, ChatGPT, Gemini) against benchmarks like MMLU (accuracy), HellaSwag (relevance), or HumanEval (completeness/practicality). Each AI receives a score. For example, for a factual question like "What is the population of India in 2025?", Gemini might score 94.4% (error: 5.6%), while Grok scores 97.9% and ChatGPT scores 98.6%.
- Lack of Feedback Loop:
  - While the Moderator AI provides feedback during the debate (e.g., "Gemini, your answer is less accurate; refine it"), this feedback is qualitative and meant for the debate process.
  - After the debate, Gemini sees its lower score (94.4%) but isn’t explicitly told why it scored poorly compared to Grok and ChatGPT, nor is it given actionable insights to improve for future questions.
  - For example, Gemini might not know that its estimate of "1.5 billion" was too high because it didn’t account for recent UN population projections.
- Learning Gap:
  - Without understanding the "why" behind its poor performance, Gemini cannot learn from its "mistake" (e.g., adjust its reasoning, prioritize certain data sources, or refine its approach). This limits the AI’s ability to improve over time, which undermines our claim that AIs get smarter through collaboration on IndiaAGI.
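To make the gap concrete, here is a minimal Python sketch (an illustration only, not Kishan's actual code) of how the accuracy scores above could be derived and then turned into an explicit "why" message for each AI. It assumes the score is simply 100 minus the relative error against a reference value. The reference of 1.42 billion and the Grok / ChatGPT estimates are hypothetical numbers, chosen only because they reproduce the 94.4%, 97.9% and 98.6% scores quoted above, and the function names are invented for this example.

```python
# Hypothetical reference value in billions (an assumption, not taken from a
# real benchmark); chosen so a 1.5 billion estimate yields the 94.4% score.
REFERENCE_BILLIONS = 1.42

# Gemini's estimate comes from the example above; the other two are assumed.
ESTIMATES_BILLIONS = {"Gemini": 1.50, "Grok": 1.45, "ChatGPT": 1.44}


def relative_error_pct(estimate, reference):
    """Return the absolute relative error of the estimate, in percent."""
    return abs(estimate - reference) / reference * 100


def accuracy_score(estimate, reference):
    """Return 100 minus the relative error in percent, floored at 0."""
    return max(0.0, 100.0 - relative_error_pct(estimate, reference))


def build_feedback(estimates, reference):
    """Build one message per AI: its score, the best score, and how far
    (and in which direction) its estimate missed the reference."""
    scores = {name: accuracy_score(value, reference)
              for name, value in estimates.items()}
    best = max(scores, key=scores.get)
    feedback = {}
    for name, value in estimates.items():
        if value > reference:
            miss = f"{relative_error_pct(value, reference):.1f}% too high"
        elif value < reference:
            miss = f"{relative_error_pct(value, reference):.1f}% too low"
        else:
            miss = "exactly right"
        feedback[name] = (
            f"{name}: you scored {scores[name]:.1f}%; the best score was "
            f"{scores[best]:.1f}% ({best}). Your estimate of {value} billion "
            f"was {miss} against the reference of {reference} billion."
        )
    return feedback


for message in build_feedback(ESTIMATES_BILLIONS, REFERENCE_BILLIONS).values():
    print(message)
```

A message of this kind could be handed back to each AI as context before its next round. Whether that actually helps an AI improve is not something this sketch tests.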
Your Question: Is this a shortcoming of IndiaAGI, and if so, how can we overcome it?