Hi Friends,

Even as I launch this today ( my 80th Birthday ), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder,"Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now as I approach my 90th birthday ( 27 June 2023 ) , I invite you to visit my Digital Avatar ( www.hemenparekh.ai ) – and continue chatting with me , even when I am no more here physically

Tuesday, 28 June 2016

For Today's Discussions


I have a database of over 5 million job advts , downloaded over the past 6 / 7 years from various job portals of India

Each job advt record in this database consists of :
Ø  Advt ID

Ø  Designation ( being advertised )

Ø  Company Name ( Advertiser )

Ø  Job Description

Ø  Desired Profile

Ø  Compensation Offered

Ø  Experience ( desired ) – Years

Ø  Industry Type

Ø  Education Quali ( Min )

Ø  Location ( Posting City )

Ø  Keywords

Ø  Advt Posting  Date

Ø  Expiry Date
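As a rough illustration , one record with the fields listed above could be held in a structure like the one below. This is only a sketch : the class name , field names and types are my assumptions for illustration , not the schema actually used in the database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class JobAdvt:
    """One job advt record. Field names are illustrative, not the original schema."""
    advt_id: str
    designation: str
    company_name: str
    job_description: str
    desired_profile: str
    compensation: Optional[str]            # None when the advt does not mention it
    min_experience_years: Optional[float]  # None when not stated
    industry_type: str
    min_education: str
    location: str
    keywords: List[str] = field(default_factory=list)
    posting_date: Optional[date] = None
    expiry_date: Optional[date] = None


# A made-up record, only to show the shape of the data.
sample = JobAdvt(
    advt_id="ADVT-0001",
    designation="Business Analyst",
    company_name="Example Software Ltd",
    job_description="Gather requirements and prepare functional specifications.",
    desired_profile="MBA with 3 years of experience in requirement analysis.",
    compensation=None,
    min_experience_years=3,
    industry_type="Software",
    min_education="MBA",
    location="Mumbai",
    keywords=["requirement analysis", "UML"],
    posting_date=date(2016, 6, 1),
    expiry_date=date(2016, 7, 1),
)
```

Keeping " Compensation Offered " and " Experience " optional matters , because many advts leave these fields blank and that absence is itself worth counting.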


Some years back , ( when our website , www.World-Wide-Jobs.com , was up and running ) , we had developed a feature to analyze this database and display the findings visually , in different ways 

We were displaying PIE-CHARTS of :
Ø  Industry-wise Jobs

Ø  City-wise Jobs


You will observe that , with a much larger database available now , it is possible to analyze / display the “ No of Jobs “ , in many more ways

Not only that , it should be possible to analyze this huge database to predict the future expected PATTERN of the occurrence of jobs , in many different ways !


At any given time , the number of jobs getting advertised , is an important Economic Indicator

If the economy is booming and company Order Books are getting fatter , then more jobs will get advertized – and vice-versa

Hence , a time-series analysis of the no of new jobs getting posted on job portals should show a close , roughly straight line relationship with the state of the economy ( ie; a high co-efficient of correlation ) , something that the database itself can confirm or refute
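The co-efficient of correlation mentioned above can be checked with a few lines of code. The sketch below counts new advts per month and computes the Pearson co-efficient against an economic indicator. The posting dates and the indicator values are made up for illustration , and the function names are my own.

```python
from collections import Counter
from datetime import date
from math import sqrt


def monthly_counts(posting_dates):
    """Count new advts per (year, month)."""
    return Counter((d.year, d.month) for d in posting_dates)


def pearson(xs, ys):
    """Pearson co-efficient of correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sx * sy)


# Made-up posting dates and a made-up indicator, purely for illustration.
dates = [date(2016, 1, 5)] * 3 + [date(2016, 2, 9)] * 5 + [date(2016, 3, 2)] * 4
counts = monthly_counts(dates)
months = sorted(counts)
jobs_series = [counts[m] for m in months]  # advts posted in Jan, Feb, Mar
indicator = [100.0, 104.0, 102.0]          # hypothetical index values, same months
r = pearson(jobs_series, indicator)
```

A value of r close to +1 would support the straight line relationship ; a value near 0 would not. With real data one would also want to try the indicator shifted by a few months , since job advts may lead or lag the economy.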

Apart from that , can a Data mining of 5 million jobs , answer ( even partially ) , the following questions ?
Ø  Who ( which Companies ) are advertizing and when ?

Ø  What jobs / vacancies / positions are being advertized ?

Ø  What is the frequency with which a particular job gets advertized ? By entire industry ? By a given Company ?

Ø  Which regions / cities have max / min no of new jobs ?

Ø  What are regional disparities due to ?

Ø  Which Industries are advertising most – creating most jobs ?

Ø  What Edu Qualifications are in max demand ?

Ø  What kind of jobs demand what kind of Edu Qualifications ?

Ø  What is the level of co-relation between , Position and the years of Experience demanded ?

Ø  For identical positions being advertized , how much do “ Job Descriptions / Desired Profiles “ differ, from company to company ?

Ø  Are there significant differences in the “ No of years of Experience “ being demanded , for identical positions ?

Ø  What is the probability of finding the “ Keywords “ in “ Job Description / Desired Profile “ ?

Ø  What is the extent of duplication ( redundancy ? ) between , “ Job Description “ and “ Desired Profile “ ?

Ø  What percentage of Advts fail to make any mention of , Compensation Offered ?

Ø  When a company posts an advt for same / identical position , at different points of time , are there any differences in values ( fields ) ?

Ø  From an analysis of all the advts posted by a given Company ( over past 7 years ) , can any conclusion be reached as to the changing nature of that company’s business (by co-relating the “ Skills related Keywords “)?

Ø  Can the algorithm predict what job a company will advertize next – and when ?

Ø  Is there any correlation between , “ Designation / Position “ and the “ Keywords “ ?

Ø  From analyzing this huge data , can software auto-generate , a complete / editable job advt , as soon as a Recruiter simply types the “ Designation / Position “ ?
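Two of the questions above are simple enough to sketch right away : what percentage of advts fail to mention compensation , and how often the listed " Keywords " actually appear in the " Job Description / Desired Profile ". The dictionary keys and the sample advts below are assumptions for illustration only.

```python
def pct_without_compensation(advts):
    """Percentage of advts whose compensation field is empty or missing."""
    if not advts:
        return 0.0
    missing = sum(1 for a in advts if not (a.get("compensation") or "").strip())
    return 100.0 * missing / len(advts)


def keyword_hit_rate(advts):
    """Share of listed keywords that also appear in the description or profile."""
    hits = total = 0
    for a in advts:
        text = (a.get("job_description", "") + " " + a.get("desired_profile", "")).lower()
        for kw in a.get("keywords", []):
            total += 1
            if kw.lower() in text:
                hits += 1
    return hits / total if total else 0.0


# Made-up advts, only to exercise the two functions.
advts = [
    {
        "compensation": "",
        "job_description": "Develop Java services",
        "desired_profile": "B.E. with SQL skills",
        "keywords": ["Java", "SQL", "Agile"],
    },
    {
        "compensation": "6-8 lakh per annum",
        "job_description": "Manage accounts",
        "desired_profile": "CA",
        "keywords": ["Accounts"],
    },
]

missing_pct = pct_without_compensation(advts)
hit_rate = keyword_hit_rate(advts)
```

The keyword check here is a plain substring match , which is crude ; a real analysis would need to handle spelling variants and abbreviations.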

I believe , so far , no one has undertaken such a Data mining project

If carried out diligently , I am sure , the outcome would be of immense benefit to :
Ø HR Managers

      for Manpower Planning / Compensation Planning 

Ø Recruiting Managers

      for framing Man Specifications / Job Description Manuals 

Ø Educationists

      for deciding what Edu Quali are in demand and tailor the Courses 

Ø Students

      to figure out what “ Skills “ are in demand by Industry and prepare 

Ø Planning Commission ( NITI Aayog )

      for allocating Resources to States / Regions , based on imbalances 

Ø HRD Ministry

      for long term Macro-Planning in respect of Education 

Ø National Skill Development Corporation

      for chalking out Skills Development Programs in collaboration with Companies / Industries 

If undertaken – and executed seriously – then this Data mining project has the potential to place the Ministry of Statistics and Programme Implementation on the Centre-Stage of the National Education Planning Scenario

What can / will such a project yield ?

Without exaggerating , it would be safe to assume that , this vast database of job advts would contain :

Ø  50 million phrases / sentences

Ø  500 million words

Obviously , each word / phrase / sentence , is nothing more than a “ Database of Intentions “ of the Employer Companies

( to borrow from John Battelle’s well-researched book about Google )

Our goal shall be to make this ( Data mining Algorithm ) a dynamic / continuous “ Process “ , so that , we can measure the changing nature of these “ Intentions “ , over a long , long period

And we must enable a “ Researching Visitor ( of web site ) “, to benefit from these trends / patterns

Even though 5 million job advts may contain 500 million “ words “ , these are not Unique

Most of these are used again and again , hundreds or thousands of times

Thru data mining , it is not difficult to compute their “ Frequency of Usage “

And then , these frequencies can be graphically plotted against any particular time-period

Such Graphical Representations can be further broken up by ,

Ø  City Names

Ø  Company Names

Ø  Industry Names

Ø  Function Names

Ø  Designations ( Vacancy Names ).. etc

And such graphical analysis can be done , not only for “ Keywords “ but even for “ Key Phrases “ and “ Sentences “ !
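The frequency computation described above can be sketched as follows. Words are counted per year and per group ( here , city ) , which gives exactly the numbers one would plot against time. The dictionary keys and the sample advts are assumptions for illustration ; the same function works for company , industry or designation by changing group_key.

```python
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"[a-z]+")


def word_frequencies(advts, group_key="city"):
    """Return {(year, group): Counter of words} built from advt text."""
    table = defaultdict(Counter)
    for a in advts:
        words = WORD_RE.findall(a["text"].lower())
        table[(a["year"], a[group_key])].update(words)
    return table


def trend(table, word, group):
    """Year-by-year counts of one word within one group, ready for plotting."""
    return sorted((year, c[word]) for (year, g), c in table.items() if g == group)


# Made-up advts, only to show the shape of the result.
advts = [
    {"year": 2015, "city": "Mumbai", "text": "Java developer with Java and SQL"},
    {"year": 2016, "city": "Mumbai", "text": "Python developer"},
    {"year": 2016, "city": "Pune", "text": "Java tester"},
]

freq = word_frequencies(advts)
java_in_mumbai = trend(freq, "java", "Mumbai")
```

For " Key Phrases " , the same counting applies to pairs or triples of adjacent words instead of single words.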



Take a look at this project paper ( NOT ENCLOSED )

It is all about data mining of some 150 million records ( location points ) and about uncovering “ trends / patterns “ of physical movements of 300 human volunteers , over a “  period of time  “

I quote from article in Times of India ( 19 July 2013 ) :

“ ..the first system of its kind to predict long term human mobility in a unified way , parse the data. " Far Out " does not need to be told exactly what to look for  --- it automatically discovered regularities in the data “

“ Do you know precisely where you’ll be 285 days from now at 2 pm ?

Researchers have developed a new tracking software that can tell you exactly where you will be on a precise time and date , years into the future “

What we want to do with 5 million job advts database , is quite similar, viz ;
 predict ,

 WHO     ( which Company / Industry ) , will advertize

 WHAT   ( vacancies / positions / designations ),  and

 WHEN   ( time )



I am talking about developing an “ Expert System “ , thru discovery of specific “ Co-relations “ amongst various Data Fields of 5 million job advts

Eg :

Ø  What is the Co-relation between any given Designation / Vacancy-Name / Advertized Position and Educational Qualifications ?

Here are some examples :

Ø  Any designation  such as “ Production Manager “ would call for an “ Engineering Degree / Diploma “ ( but never a CS / CA )


Ø  Any designation in “ Finance Function “ will require,
·       B Com
·       M Com
·       CA    etc
       But never a BE(M ) / BE (Chem )


Ø  Any designation at Manager level will call for a minimum experience of 5 years ( but never a Fresh Graduate with NIL experience )


Ø  MBA / BBA / MMS etc are the most preferred Edu Qualifications for positions in Marketing



Ø  No vacancy in an Automobile Manufacturing Company , will call for a degree in Pharmacy


Ø  No Electrical Machinery Manufacturing company will ever demand a Medical Degree (MBBS )


To a human mind , these ( rules ) are so obvious !


But , no human mind can write-down ALL of such RULES , in 2 minutes ! – something that your Data mining Software can – and will – do in 5 seconds !


All that you need , after computing “ Frequencies of Occurrences “ , is to :


Ø  Plot the Co-efficients of Co-relations between various Fields ( of job advts )

Ø  Compute Probabilities for each and create hundreds of Probability Tables


And , since a thousand new job advts are getting added to our Job Advt Database , daily , the SAMPLE SIZE is perpetually increasing – thereby , increasing the Accuracies of your Predictions !
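One such Probability Table can be sketched in a few lines. The code below estimates , from simple relative frequencies , the probability of each Educational Qualification given a Designation. The dictionary keys and sample advts are assumptions for illustration ; any other pair of fields can be passed in the same way.

```python
from collections import Counter, defaultdict


def probability_table(advts, given, target):
    """Estimate P(target | given) from relative frequencies in the advts."""
    counts = defaultdict(Counter)
    for a in advts:
        counts[a[given]][a[target]] += 1
    return {
        g: {t: n / sum(c.values()) for t, n in c.items()}
        for g, c in counts.items()
    }


# Made-up advts, only to show the shape of the table.
advts = [
    {"designation": "Production Manager", "education": "BE"},
    {"designation": "Production Manager", "education": "BE"},
    {"designation": "Production Manager", "education": "Diploma"},
    {"designation": "Accounts Manager", "education": "CA"},
]

table = probability_table(advts, given="designation", target="education")
```

A " rule " such as " a Production Manager never needs a CA " then shows up as a combination that is simply absent from the table , or has a probability close to zero once the sample is large.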


Having done this , imagine the following scenario :


Recruitment Officer of Wipro , comes to our “ Post Job “ page and , in the field for “ Designation “ simply types ,

“ Business Analyst “


And Presto !

The entire Job Advt Form gets auto-filled , with MOST PROBABLE values !

Would not that amaze her ?

All that our software has done is to analyze the job advts of all “ Software Companies “ ( an Industry ) , and of WIPRO , for the position of Business Analyst and fill in the most probable values
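A minimal sketch of such an auto-fill is given below , assuming that " most probable " simply means " most frequent ". It first looks at the company's own past advts for that designation and falls back to all advts for the designation. The field names , company names and sample advts are hypothetical.

```python
from collections import Counter

FIELDS = ("min_education", "experience_years", "location")


def autofill(advts, designation, company=None):
    """Suggest the most frequent value of each field for a designation."""
    pool = [a for a in advts if a["designation"].lower() == designation.lower()]
    own = [a for a in pool if company and a["company"] == company]
    suggestion = {}
    for f in FIELDS:
        # Prefer the company's own history; fall back to all matching advts.
        for source in (own, pool):
            values = Counter(a[f] for a in source if a.get(f) is not None)
            if values:
                suggestion[f] = values.most_common(1)[0][0]
                break
    return suggestion


# Made-up advts from hypothetical companies.
advts = [
    {"designation": "Business Analyst", "company": "Alpha Soft",
     "min_education": "MBA", "experience_years": 3, "location": "Bengaluru"},
    {"designation": "Business Analyst", "company": "Beta Tech",
     "min_education": "BE", "experience_years": 2, "location": "Pune"},
    {"designation": "Business Analyst", "company": "Gamma Systems",
     "min_education": "BE", "experience_years": 2, "location": "Pune"},
    {"designation": "Production Manager", "company": "Delta Motors",
     "min_education": "BE", "experience_years": 8, "location": "Chennai"},
]

for_alpha = autofill(advts, "business analyst", company="Alpha Soft")
for_anyone = autofill(advts, "Business Analyst")
```

The Recruiter would still edit the suggested values ; the point is only to start from the most likely ones instead of a blank form.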


This is no rocket science !


We had actually attempted it partially , albeit in a crude way , in our earlier web site , www.IndiaRecruiter.net


What surprises me is , how come no one has attempted this so far !


Especially , Naukri / TimesJobs / MonsterIndia , who have accumulated millions of job advts !

Anyway , the fact that they have , so far , ignored this Line of Examination , will work to the advantage of the Ministry of Statistics and Programme Implementation , making YOU the very first person in the entire world to come up with a PREDICTION MODEL in the area of JOBS
            

However , without applying some simple data mining tool , it would not be possible to answer the following questions :

Where is the greatest decline of jobs being advertized ? 

How much is the percentage decline ?
Ø  In which Industry ?

Ø  In which Company ?

Ø  In which City ?

Ø  In which Region ?

Ø  In which Skills ?

Ø  For which Positions ?

Ø  For which Education Levels ? ………… etc
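The " percentage decline " questions above all reduce to the same computation : count advts per Industry ( or City , Company , Skill etc ) in two periods and compare. A sketch follows , with made-up data and hypothetical dictionary keys.

```python
from collections import Counter


def pct_change_by(advts, field, old_period, new_period):
    """Percentage change in advt counts per value of `field` between two periods."""
    old = Counter(a[field] for a in advts if a["period"] == old_period)
    new = Counter(a[field] for a in advts if a["period"] == new_period)
    # Only values present in the old period have a defined percentage change.
    return {k: 100.0 * (new[k] - n_old) / n_old for k, n_old in old.items()}


def greatest_decline(changes):
    """Return the (value, pct_change) pair with the largest fall, or None."""
    falls = {k: v for k, v in changes.items() if v < 0}
    return min(falls.items(), key=lambda kv: kv[1]) if falls else None


# Made-up advts: 4 IT and 2 Pharma in 2015, then 2 IT and 3 Pharma in 2016.
advts = (
    [{"period": "2015", "industry": "IT"}] * 4
    + [{"period": "2015", "industry": "Pharma"}] * 2
    + [{"period": "2016", "industry": "IT"}] * 2
    + [{"period": "2016", "industry": "Pharma"}] * 3
)

changes = pct_change_by(advts, "industry", "2015", "2016")
worst = greatest_decline(changes)
```

Passing " city " , " company " or " designation " as the field answers the other questions in the list without any new code.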


With a data mining tool , such individual graphs could emerge ( within a fraction of a second ) at the click of a button !


One could even co-relate these graphs with other publicly available statistical data such as :
Ø IIP   ( Index of Industrial Production )

Ø Stock Market Index

Ø Currency Exchange Rate ( eg; declining Rupee )

Ø Decline in GDP / Increasing Fiscal Deficit

Ø CAD ( Current Account Deficit )

Ø Foreign Investments

Ø  Primary Bank Rates of RBI…………………………….etc


With proper co-relations , one could even predict how much the job market will shrink , or grow , over the next 6 months !
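The simplest form of such a prediction is a straight line fitted by least squares between one indicator and the monthly no of new advts. The sketch below assumes a single indicator and uses made-up numbers ; a serious model would need several indicators , time lags and proper validation.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


# Hypothetical monthly values: an index of industrial production and the
# number of new advts posted in the same months.
iip = [100.0, 102.0, 104.0, 106.0]
new_jobs = [3000.0, 3100.0, 3200.0, 3300.0]

a, b = fit_line(iip, new_jobs)
forecast = a + b * 108.0  # expected new advts if the index moves to 108
```

A high correlation does not by itself prove that the indicator drives the job market , so any such forecast should be checked against months that were not used for the fit.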

Such a “ Predictive Model of Job Market “ would be of immense interest , not only to the economists but also to the HRD Ministry / Planning Commission / Educational Institutions and of course the students themselves 


hemen  parekh


Marol , Mumbai , India


( M ) +91 - 98,67,55,08,08
