Answers to some frequently asked questions…
What Does WWL Do?
The Worldwide Lexicon (WWL) project will create a simple, easy-to-use system for locating and communicating with dictionary, encyclopedia, semantic network and machine translation servers throughout the Internet. Think of it as Gnutella for dictionaries: WWL enables programs to easily locate and communicate with web dictionaries and translation servers worldwide.
The project consists of two components. The first is a simple RPC (remote procedure call) protocol that enables applications to locate and submit queries to any dictionary or semantic network server that recognizes the Worldwide Lexicon Protocol.
The second component is a distributed computing (or distributed human computing) system that will enlist a large number of users to contribute definitions and translations in many languages. This will complement existing dictionaries, and will enable them to learn new words and phrases based on user input.
What Does WWL Not Do?
The Worldwide Lexicon is NOT machine translation software. The goal of the project is to create a uniform procedure for locating and talking to dictionary services. This facility can be incorporated into translation software (for example, to expand the scope of its vocabulary).
The Worldwide Lexicon Protocol can also be used to submit queries to full-text machine translation servers.
What Type Of Programs Can Use The Worldwide Lexicon?
The Worldwide Lexicon Protocol enables ANY computer program to locate and communicate with dictionary, encyclopedia and translation servers throughout the web. This can be used in a wide range of client (desktop) applications, such as:
text editors (embedded dictionary and translation tools)
web browsers (highlight a word to look up its meanings, translations in other languages)
instant messaging/chat (add word/phrase or full-text translation capability with just a few lines of code)
machine translation programs (WWL does not translate sentences; however, an MT program can use WWL to communicate with other dictionaries to expand the scope of its vocabulary)
research tools (use WWL to look up terms in specialized encyclopedias using a simple plug-in)
smarter email routing and filtering software
How Can I Incorporate WWL Into My Programs?
It’s easy. If your programming language supports SOAP, adding WWL to your application is as simple as adding a few lines of code to submit queries to WWL servers. Once you do this, your application will be able to talk to any dictionary or semantic network server that recognizes the Worldwide Lexicon Protocol.
Whether you want to add dictionary lookup capability to a text editor, or to build a “multilingual chat client”, adding WWL features to your programs will require minimal effort.
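As a rough illustration of the "few lines of code" involved, the sketch below builds a SOAP 1.1 request envelope using only the Python standard library. The `Lookup` method name, its three parameters, and the `urn:example:wwl` namespace are placeholders invented for this example; the actual method names and parameters are defined in the Worldwide Lexicon Protocol Specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 envelope namespace
WWL_NS = "urn:example:wwl"  # hypothetical namespace, not taken from the WWL spec


def build_lookup_request(word, source_lang, target_lang):
    """Return a SOAP envelope (as a string) for a hypothetical Lookup call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{WWL_NS}}}Lookup")  # assumed method name
    ET.SubElement(call, "word").text = word
    ET.SubElement(call, "sourceLanguage").text = source_lang
    ET.SubElement(call, "targetLanguage").text = target_lang
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_lookup_request("chat", "fr", "en"))
```

The resulting string would be sent as the body of an HTTP POST to a WWL server. In practice a SOAP library for your language would normally build the envelope and make the call for you.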
- Implementing WWL On Your Dictionary Server
- Introduction to the Worldwide Lexicon
- Building A Multilingual Chat Client
- WWL And Jabber
How Can I Update My Dictionary Server To Support WWL?
Most web scripting languages now support the SOAP RPC standard. Presumably you have already written scripts to process queries submitted via web forms. All you need to do is to write a script to process requests submitted via our SOAP interface. This interface can co-exist with your existing systems and web front ends.
You can also decide which WWL features you want to implement. Some dictionary owners may only provide read-only access. Others may want to allow users to contribute new entries, but require them to be approved by editors or trusted users. Others may want to build highly automated systems in which nearly all of the content is user controlled. Each WWL server owner decides whether to open their system up to the public.
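The three levels of openness described above could be modeled on the server roughly as follows. This is only a sketch: the `SubmissionPolicy` names, the `handle_submission` function and its return values are made up for illustration and are not part of the WWL protocol.

```python
from enum import Enum


class SubmissionPolicy(Enum):
    """Hypothetical policy levels a dictionary owner might choose from."""
    READ_ONLY = "read_only"   # no public submissions at all
    MODERATED = "moderated"   # submissions wait for an editor or trusted user
    OPEN = "open"             # submissions are published immediately


def handle_submission(policy, entries, pending, word, definition):
    """Apply the owner's policy to one user-contributed definition."""
    if policy is SubmissionPolicy.READ_ONLY:
        return "rejected"
    if policy is SubmissionPolicy.MODERATED:
        pending.append((word, definition))  # held for review
        return "pending"
    entries.setdefault(word, []).append(definition)
    return "published"
```

For example, with `entries = {}` and `pending = []`, a submission under `SubmissionPolicy.MODERATED` is appended to `pending` and leaves `entries` untouched.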
- Introduction to the Worldwide Lexicon
- Implementing WWL On Your Dictionary Server
- Real-Time Human Assisted Queries
When Will WWL Be Available to the Public?
We will publish the specification for the Worldwide Lexicon Protocol in early May, in preparation for the O’Reilly & Associates Emerging Technologies conference. The protocol is easy to implement in both client and server applications. The draft Worldwide Lexicon Protocol Specification is available now.
We are currently working on building several WWL supernodes which will be available for developers to use to test client and server applications. We will be making an announcement about this shortly. We have also created a SourceForge account which we will be using to host open source projects related to WWL.
- Worldwide Lexicon Protocol Specification
When Will The Distributed System Go Online?
The Worldwide Lexicon Protocol includes several SOAP methods that are used to signal Internet users to contribute to participating WWL servers. These features are fully defined in the draft WWL protocol, and will enable WWL server operators to open their systems to public feedback as soon as they support the WWL protocol.
WWL developers are also working on presence awareness software (the ‘lexicon@home’ client) that silently monitors an internet user’s keyboard and mouse activity. When the program senses the user is not busy, and subject to the user’s notification preferences, it invokes a WWLRequest() method on one or more WWL servers to see if there are any jobs awaiting processing. A WWL server may ask users to define words, key in translations, translate a short block of text, or score submissions from other users.
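The polling logic described above might look something like this minimal sketch. The `request_job` callback stands in for invoking WWLRequest() on a server; its signature, the five-minute idle threshold and the function names are assumptions made for this example, not details from the protocol.

```python
def is_idle(last_activity, now, idle_threshold):
    """True if no keyboard or mouse activity was seen for idle_threshold seconds."""
    return (now - last_activity) >= idle_threshold


def collect_jobs(servers, request_job, last_activity, now,
                 idle_threshold=300.0, notifications_enabled=True):
    """Ask each server for a job, but only if the user is idle and allows prompts."""
    if not notifications_enabled or not is_idle(last_activity, now, idle_threshold):
        return []
    jobs = []
    for server in servers:
        job = request_job(server)  # stand-in for a WWLRequest() call
        if job is not None:        # None means the server has nothing waiting
            jobs.append((server, job))
    return jobs
```

A real client would take `last_activity` from the operating system's input events and `now` from a clock such as `time.monotonic()`.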
This capability may also be embedded in existing client programs such as instant messaging software, smart cursor programs, browser plug-ins, etc. We are talking to software vendors that have a large installed base about adding WWL lexicon@home capability to future versions of their applications.
We hope to distribute a client program by early summer. We will most likely release this program once a sizeable number of dictionaries are participating in the WWL system.
- Building The Lexicon At Home Client
What Is The Multilingual Chat Service?
One of the most interesting applications for the Worldwide Lexicon is multilingual chat. This is NOT an automated translation system like Babelfish. Because chat is an interactive medium, it is possible to create software that uses the WWL to guide senders in translating words or phrases into other languages as they type.
How can this work when other translation systems often fail?
There are two reasons why WWL multilingual chat will work. First, whenever the chat program encounters a word or phrase that has multiple meanings, it will prompt the user to disambiguate (clarify) the expression by choosing from several options. The computer is not attempting to guess the correct meaning for the word. It forces the user to do this by pressing an extra key or by selecting a menu item.
Second, chat users communicate with short, informal messages. They adapt to the limitations of the medium. (Most people do not type very quickly, so they avoid writing verbose messages.) A WWL chat client will reward users who adapt to the system by using words that have fewer meanings, and by using simplified grammar.
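The disambiguation step can be sketched as below. The code is deliberately simplistic: it substitutes word by word from a small in-memory table, whereas a real client would query WWL servers and handle phrases. The `senses` table and the `choose` callback (which represents the keypress or menu selection) are hypothetical names used only for this example.

```python
def translate_message(message, senses, choose):
    """Translate word by word, asking the user whenever a word is ambiguous.

    senses maps a source word to a list of candidate translations.
    choose(word, options) returns the option the user picked.
    """
    output = []
    for word in message.split():
        options = senses.get(word.lower(), [])
        if not options:
            output.append(word)        # unknown word: pass through unchanged
        elif len(options) == 1:
            output.append(options[0])  # unambiguous: no prompt needed
        else:
            output.append(choose(word, options))  # the user decides, not the computer
    return " ".join(output)


def count_prompts(message, senses):
    """Number of words in the message that would trigger a prompt."""
    return sum(1 for w in message.split() if len(senses.get(w.lower(), [])) > 1)
```

A client could use something like `count_prompts` to show users how many prompts a draft message will cause, nudging them toward words with fewer meanings.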
We are working with the developers of the Jabber instant messaging network to create a basic toolset that developers can use to create instant messaging clients that facilitate bilingual and multilingual communication.
What Are You Doing For Quality Control?
Each WWL server operator decides the extent to which their system will allow public submissions. For example, a dictionary that indexes rapidly evolving slang expressions (such as those found in chat rooms) will, by necessity, be open to public submissions. Other dictionary owners will prefer to control access to their index, whether this is to preserve editorial control, or for copyright reasons.
The Worldwide Lexicon Protocol provides a mechanism for scoring entries so that dictionary operators can open their systems to public submissions while also filtering inaccurate and bogus submissions. The key to doing this is a randomized peer review process in which each new entry is submitted to several randomly chosen volunteers who agree to score new submissions.
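A randomized peer review of this kind might be sketched as follows. The 0-to-1 score scale, the accept and reject thresholds, and the function names are illustrative assumptions; the protocol specification defines the actual scoring methods.

```python
import random
from statistics import mean


def pick_reviewers(volunteers, k, rng=None):
    """Choose up to k distinct volunteers at random to score a new entry."""
    rng = rng or random.Random()
    pool = list(volunteers)
    return rng.sample(pool, min(k, len(pool)))


def review_entry(scores, accept_at=0.7, reject_at=0.3):
    """Classify an entry from reviewer scores in the range 0.0 to 1.0."""
    if not scores:
        raise ValueError("at least one score is required")
    average = mean(scores)
    if average >= accept_at:
        return "accept"
    if average <= reject_at:
        return "reject"
    return "borderline"  # a candidate for a manual second look
```

Because reviewers are drawn at random for every entry, a single user cannot reliably arrange to approve their own bogus submissions.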
In addition to the automated scoring system, dictionary owners can also implement their own internal review processes to double-check the most popular entries, or submissions with borderline scores. Each dictionary owner can decide how strict they want publishing controls to be. Some may want new submissions to be immediately available to all users. Others may force new submissions to pass an automated or editor-supervised review process.
Users can also decide which dictionaries they prefer to use. WWL servers that are known to be high quality can be given first priority for search requests. WWL servers of unknown or poor quality can be filtered. This can be done by the user, by WWL supernodes (which can collect user feedback), or some combination of both.
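On the client or supernode side, this kind of prioritizing and filtering could be as simple as the sketch below. The rating scale (0.0 to 1.0, with `None` for servers of unknown quality), the default cutoff and the function name are assumptions made for illustration.

```python
def rank_servers(ratings, min_rating=0.5, include_unrated=False):
    """Return server names ordered best first, dropping low-quality ones.

    ratings maps a server name to a quality rating from 0.0 to 1.0,
    or to None when the server's quality is unknown.
    """
    kept = []
    for server, rating in ratings.items():
        if rating is None:
            if include_unrated:
                kept.append((server, -1.0))  # unrated servers are tried last
        elif rating >= min_rating:
            kept.append((server, rating))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [server for server, _ in kept]
```

The ratings themselves could come from the user's own preferences, from feedback collected by a supernode, or from a blend of the two.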