Tuesday, May 10, 2016

Scarce Data Problem: Mysterious Indus Valley Script

Once upon a time, nearly 5000 years ago, there was an urban civilization with 1000 planned cities across the north-western part of the Indian subcontinent. It is called the Indus Valley Civilization.

The Cities 

Most of the cities lay along the banks of a river that is now dry, and at its peak this civilization had a population of over five million.
For its era it was a very advanced civilization, with houses built of baked bricks and public facilities such as baths and huge granaries. One of the cities, Harappa, had about 700 houses with private wells, and most cities had citadels. The civilization had also developed metallurgy.

The Oldest Writing on the Indian Subcontinent

One of the most interesting things about this civilization is that it developed a script and left behind around 4000 seals bearing around 400 distinct symbols.
The script is still undeciphered and the underlying language has not been identified.
  • The script has been found all over north-western India as well as in Egypt and Mesopotamia.
  • The script stayed in use for hundreds of years, and seals found across the region and across the centuries are regular and similar to one another.
  • No bilingual text like the Rosetta Stone is available.
  • Most seals are extremely brief, with the longest being only 17 characters.

Original Scarce Data Problem

  1. Does this script represent a spoken language?
  2. If yes, what language? What can we learn about this language? How is it related to the present-day languages of South Asia?
  3. What are these seals trying to tell us? What were they used for?

Answers to these questions can shed light on the early history of the Indian subcontinent and its cultural sphere. But this is not just an old Indian civilization; it is world heritage that belongs to all of humanity.


Call to Action:

How can we use our big data tool set and machine learning to decipher the Indus Valley script?
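
One possible first step, offered only as a rough sketch and not as a decipherment method: before asking what the signs mean, we can measure how predictable their order is. The Python snippet below estimates the conditional entropy of the next sign given the previous one. The function name `conditional_entropy` and the toy inscriptions are made up for illustration; real work would need a transcribed corpus of the roughly 4000 seals, which is not included here. A value near zero means the order is rigid, while a high value means the order is close to random.

```python
import math
from collections import Counter


def conditional_entropy(inscriptions):
    """Estimate H(next sign | previous sign) in bits from sign sequences."""
    pair_counts = Counter()   # how often each (previous, next) pair occurs
    prev_counts = Counter()   # how often each sign occurs as "previous"
    for signs in inscriptions:
        for prev, nxt in zip(signs, signs[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1

    total_pairs = sum(pair_counts.values())
    if total_pairs == 0:
        return 0.0  # no adjacent pairs, nothing to measure

    entropy = 0.0
    for (prev, nxt), count in pair_counts.items():
        p_pair = count / total_pairs                 # p(prev, next)
        p_next_given_prev = count / prev_counts[prev]  # p(next | prev)
        entropy -= p_pair * math.log2(p_next_given_prev)
    return entropy


# Hypothetical toy data: letters stand in for Indus signs.
rigid_order = [["A", "B", "C"]] * 4
loose_order = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"], ["C", "B", "A"]]

print("rigid:", conditional_entropy(rigid_order))
print("loose:", conditional_entropy(loose_order))
```

Because most seals are so short, any estimate like this will be noisy, which is exactly what makes this a scarce data problem.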

[All images from Wikipedia - thank you to all original creators]

