MaTra

Monday, May 7, 2012

Multilingual Computing at C-DAC Mumbai

The Centre for Development of Advanced Computing (C-DAC) Mumbai is actively involved in research and development in language computing and has produced many products that aim to bridge the language barrier in the era of digitization.

Knowledge Based Computer Systems (KBCS) at C-DAC Mumbai works in the fields of Natural Language Processing (NLP), Machine Learning (ML), Data Mining (DM), Expert Systems (ES), etc. The following are some of its prominent works in the area of multilingual computing:

MaTra
Xlit
StatMT
SuTra
Rupantar
ChitranTran

MaTra:

MaTra is a Fully-Automatic English to Hindi Indicative Machine Translation System. The approach taken by MaTra is 'Transfer Based' and is very well appreciated in the research community. MaTra is targeted to work with text in open domains like World Wide Web documents and news stories.

System uri: http://cdacmumbai.in/matra/

Salient Features:

Hybrid (Rule Based and Statistical) approach to Machine Translation
Uses a target-language-independent intermediate structured representation, so it can be easily adapted for machine translation from English to other Indian languages

Figure 1: MaTra: Machine Translation System

Xlit:

Xlit is a transliteration tool that converts words from English to Indian languages and back without losing their phonetic characteristics. For example, it converts 'bharat' to 'भारत' and 'school' to 'स्कूल'. It also suggests more than one option for a given word, such as भरत, भारात and बहारत for 'bharat'. Prototypes are available for Hindi, Marathi, Urdu and Kannada. XlitHindi, an extension for OpenOffice Writer, is available for download at the URL:

http://extensions.services.openoffice.org/project/xlithindi

System uri: http://cdacmumbai.in/xlit/editor/

Salient Features:

Can be easily integrated into any desktop or web application
Uses a generalized framework that supports developing a transliteration system for any language pair

Figure 2: Xlit: A transliteration system
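The idea of offering several candidates for one Roman spelling can be illustrated with a small sketch. This is not Xlit's actual algorithm or data: the ALTERNATIVES table, the segment and candidates functions, and the greedy splitting rule are all assumptions made up for this example. Each Roman chunk is given one or more possible Devanagari renderings, and every combination is listed.

```python
from itertools import product

# Hypothetical, hand-written alternatives for each Roman chunk.
# This table is an assumption for illustration, not Xlit's data.
ALTERNATIVES = {
    "bh": ["भ", "बह"],  # one aspirated consonant, or b followed by h
    "r": ["र"],
    "t": ["त"],
    "a": ["", "ा"],     # unwritten inherent vowel, or the long "aa" sign
}


def segment(word):
    """Split a Roman word into known chunks, longest chunk first."""
    chunks, i = [], 0
    keys = sorted(ALTERNATIVES, key=len, reverse=True)
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                chunks.append(key)
                i += len(key)
                break
        else:
            raise ValueError(f"no mapping for {word[i]!r}")
    return chunks


def candidates(word):
    """Return every combination of the per-chunk alternatives."""
    options = [ALTERNATIVES[chunk] for chunk in segment(word.lower())]
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for option in candidates("bharat"):
        print(option)
```

For 'bharat' the sketch lists eight spellings, among them भारत, भरत, भारात and बहारत. A real system would also have to rank the candidates, which this sketch does not attempt.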

StatMT:

StatMT is a Statistical Machine Translation (SMT) system that translates source language sentences (e.g., English) into target language sentences (e.g., Hindi, Marathi, Bengali) using statistical models. The StatMT system is part of the English to Indian Languages Machine Translation System (E-ILMT) consortium, which aims to design and deploy a machine translation system from English to Indian languages in the tourism and healthcare domains. More information about the system can be found at http://www.cdacmumbai.in/e-ilmt.

SuTra:

SuTra is a multi-user translation assistance tool that makes intelligent suggestions to translators on possible reuse of translations from older versions of a system or from systems in similar domains. The aim is to reduce the translator's effort and make translated versions of applications available in the least possible time. The system is released as open source and is available at http://sourceforge.net/projects/sutra/.

Figure 3: SuTra: An Intelligent Suggestive Translator for Localisation

Rupantar:

Rupantar is a utility for writing in Indian languages using Roman script. It also lets you convert text from one script to another, e.g., 'रमेश' in Hindi to ‘ரமேஷ்’ in Tamil. It uses a key-map-based technique for both writing and conversion.

System uri: http://www.cdacmumbai.in/rupantar

Salient Features:

Easy integration with other desktop and web applications
Fast and lightweight application

Figure 4: Rupantar
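A key-map-based conversion can be sketched as a character lookup table. The example below is only an illustration under stated assumptions: the DEVANAGARI_TO_TAMIL table is a tiny hand-picked subset and convert_script is a made-up helper, neither taken from Rupantar itself.

```python
# Hypothetical key map: a tiny hand-picked subset for illustration,
# not Rupantar's actual table.
DEVANAGARI_TO_TAMIL = {
    "र": "ர",
    "म": "ம",
    "े": "ே",
    "श": "ஷ்",  # nearest Tamil letter, written here with a pulli (virama)
}


def convert_script(text, key_map):
    """Replace each character found in key_map; leave the rest unchanged."""
    return "".join(key_map.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(convert_script("रमेश", DEVANAGARI_TO_TAMIL))
```

A one-to-one table like this is fast and small, but a practical converter would likely need context-sensitive rules as well, for example to decide when a final consonant takes the pulli.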

ChitranTran:

ChitranTran is a utility for extracting and transliterating text from images. The available prototype can extract text in English and Hindi and transliterate it into other Indian languages.

System uri: http://202.141.152.1/xlit/chitrantran/

Salient Features:

Easy integration with other desktop and web applications
Support for Indian language text extraction and transliteration

Figure 5: ChitranTran

Friday, April 1, 2011

Abdul Kalam Pitches A Multilingual, Mobile Web at WWW conference 2011

APJ Abdul Kalam, former President of India, speaking at the World Wide Web conference in Hyderabad, made a pitch for a multilingual web, saying that in its current form the World Wide Web has its shortcomings: “The language barrier is the biggest hindrance in making the Web truly democratic. Originally the lingua franca of the web was mainly English, and while the situation has started to change, much more needs to be done. The development of a country is directly determined by the amount of content in the country's native language available on the web.” More interestingly, Kalam suggested cross-lingual access to the web, saying that knowledge grows by sharing and that language should not be an impediment here. He said that rural folk need to be convinced that the web is useful for them, and that at present the community on the web tends to generate content for its own consumption.

Some of the important points from his speech: Kalam had the following suggestions for the development of the World Wide Web
- To look for ways in which a mobile device can provide integrated 3G and 4G applications in the user's mother tongue: for a farmer, the price of agricultural products; for a fisherman, the market price of fish.
- For Web 2.0 and 3.0 (the semantic web) to enhance services in native languages, and for the web to offer access without barriers of language, cost, creed or geography.
- For the mobile to become a personal authentication device, and for money transactions through the mobile to be highly secure.
- For sensors incorporated into a mobile device holder to be able to transmit data related to a patient and get the doctor's advice / consultancy.
- More societal applications, given the large bandwidth that 4G offers, which involve farmers and villagers who are less empowered. The future of the web is going to shift from connecting the corporates to connecting the individual in the rural society.
Kalam also spoke of a societal grid, combining the National Knowledge Network, a Healthcare Grid, an e-governance grid and a Rural grid.

Courtesy: Medianama, which carries a more detailed news report.

Thursday, February 10, 2011

SIGAI Workshop on Emerging Research Trends in Artificial Intelligence (ERTAI - 2011)


19th - 21st June, 2011, C-DAC, E-City Bengaluru, India

Supported by Computer Society of India (CSI)

About The Workshop

Based on the success of the previous ERTAI workshop, conducted in 2010, CSI-SIGAI has decided to hold the next ERTAI workshop during June 19-21, 2011 at C-DAC, Electronics City, Bengaluru. The backdrop of this workshop remains the same as last year: through ERTAI, we plan to continue to provide a forum where those pursuing research in AI can exchange ideas and seek guidance. Those seeking to enter AI research can also get a valuable feel for the current research in various streams of AI, in industry as well as in academia. ERTAI 2011 will enable new and aspiring research scholars to identify relevant and useful research topics and get guidance on their approach and direction.

We invite papers that describe work in progress by research scholars, spanning many areas including language processing, multi-agent systems, web mining, information retrieval, semantic web, e-learning, optimization problems, pattern recognition, etc. We also invite suggestions on relevant topics for invited talks, in both technical areas and research methodology areas. The detailed program for the workshop is being finalized and will be announced shortly.

Proposed Structure of Workshop

It will be a three-day programme consisting of:

Invited talks covering current trends, specific challenges, etc. in Artificial Intelligence
Invited talks on mentoring research scholars on publication, research methodology, etc.
Presentations by those currently pursuing research in AI area.

We will have a panel of experienced researchers to evaluate and mentor the research presentations.

Call For Papers

For the research presentations, we are now inviting brief research papers of 5-6 pages, outlining the problem being addressed, approach followed vis-à-vis existing approaches, current status / results, and future plans. A subset will be short-listed for presentation, based on a formal review process. Papers must have significant AI content to be considered for presentation. Relevant topics include (but are not limited to):

Knowledge Representation
Reasoning
Model-Based Learning
Expert Systems
Data Mining
State Space Search
Cognitive Systems
Vision & Perception
Intelligent User Interfaces
Reactive AI
Ambient Intelligence
Artificial Life
Evolutionary Computing
Fuzzy Systems
Uncertainty in AI
Machine Learning
Constraint Satisfaction
Ontologies
Natural Language Processing
Pattern Recognition
Intelligent Agents
Soft Computing
Planning & Scheduling
Neural Networks
Case-Based Reasoning

A one-page call for papers for the ERTAI - 2011 workshop may be obtained from here

Target Audience

Target audience will be primarily:

Faculty members pursuing research involving AI as the base or as a tool for an application.
Faculty members interested in pursuing research and exploring areas / options.
Research scholars working for a post graduate degree.
Students seriously interested in research, specifically on AI.

Important Dates

Full paper submission deadline - April 30, 2011
Acceptance intimation - May 25, 2011
Camera ready copy due - June 05, 2011

ERTAI Secretariat

ERTAI Secretariat
Centre for Development of Advanced Computing
68, Electronics City, Bengaluru 560100, India.
Telephone: +91 80 28523300
Fax: +91 80 28522590
Email: csi.sigai@gmail.com
Web: http://sigai.cdacmumbai.in/

Friday, May 28, 2010

Advanced Statistics and Data Mining - Summer School

A Summer School on Advanced Statistics and Data Mining is being organized by the Artificial Intelligence Department of the Computer Science Faculty of the Universidad Politécnica de Madrid from June 28 to July 9, 2010 in Madrid, Spain. More information can be found at:

http://www.dia.fi.upm.es/index.php?page=presentation&hl=es_ES

Tuesday, May 4, 2010

ACL 2010

The 48th Annual Meeting of the Association for Computational Linguistics will be held in Uppsala, Sweden, July 11–16, 2010. The conference will be organized by the Department of Linguistics and Philology at Uppsala University.
More details can be found at: http://acl2010.org/index.htm
