II-03-01-Pór-InterviewProfLévy 283-292

 

 

Interview with Professor Pierre Lévy 1

George Pór 2

 

Semantic inter-operability: a condition for large-scale CI

 

George Pór: One of the things that inspired me to invite you to this conversation was knowing that whatever comes out of it will probably have the potential to trigger further interesting conversations in other circles of CI thinkers, doers, practitioners.

Pierre Lévy: OK.

GP: I have kept following your work since our last conversation, a few years ago, and have been impressed by the journey you've been on. Your focus on making your CI model grounded in, supported by, and supporting, a robust computational semantics, is both evolutionary and revolutionary.

PL: I'm not so sure so many people think like you.

GP: I mean, if we can't destroy the barriers to semantic interoperability, we won't realize global CI... We may have a global brain but not a global mind, let alone global CI.

PL: That's exactly what I think.

GP: And what I appreciate a lot is that you are not only thinking about it but also pioneering an important piece of it, what you call the “information economy mark-up language.”

PL: Actually, I call it “Information Economy Meta Language” (IEML).

GP: Oh, that's something new. I remember when you talked about CIML, the Collective Intelligence Mark-up Language, at the first CI Colloquium in Ottawa.

PL: The “ML” is a kind of veiled reference to “mark-up language,” but it really means “meta language.”

GP: Can you give us a picture of the world where the “semantic interoperability” challenges to CI are resolved with the help of your information economy meta-language?

PL: The problems of semantic interoperability are rather simple and clear. There are many natural languages and there are no simple and reliable means of automatic translation. This is the first point. The second is that we have many cataloguing systems, taxonomies, ontologies, and so on, and they are not compatible. In addition, the great majority of them were designed before the computer, like those that are employed by librarians. So they are not designed to exploit the new computing capabilities and the very important fact that in the near future, all the documents will be digitized and online.

Finally there is this problem in computer science itself or in AI. Let's acknowledge that the original research program of AI has not succeeded. If we think at the scale of the Internet or at the scale of global human CI itself, it is rather obvious that, currently, there is no solution to the problem of processing the meaning of this huge amount of interdependent digitized information flow.

Why is no artificial intelligence environment up to this task? Because the computer scientists who tried to work in this direction thought they could encompass human intelligence by logic. But there is much more to human intelligence than logic. This should have been obvious from the beginning, but apparently it was not the case and we (I mean the scientific community) had to go through a process of trial and error.


The missing symbolic meta-language

We have a kind of global brain and we have a general interconnection, at least a possible general interconnection between all the computers and digital repositories of the planet: a global digital memory in the process of technical interconnection.

All the documents can be connected by hyperlinks and all the people that are behind the computers can exchange information. But there is no common language, no common symbolic system that can convey human meaning, on one side, and be computable by the symbolic automata that are today at our disposal, on the other side. We have to explore new possibilities.

Today, we have a huge opportunity to expand our personal and collective intelligence. But cultural tradition did not pass on to us any computable symbolic system able to map an infinite semantic space.

The reason why we now have to invent such a symbolic system is that the situation of having a global human digital memory animated by powerful symbolic automata, and accessible from anywhere in real time, is completely new: less than a generation old!

This new environment offers us a fantastic opportunity to grow a better collective intelligence, from the scale of small teams to the scale of the human race, but there will be no big leap or significant threshold in collective cognition capabilities without reflexive power. If we, as homo sapiens, have a reflexive consciousness, it's not because we have big brains. Elephants have, and Neanderthals had, bigger brains. It's because we have this extraordinary inborn cognitive tool called “language” allowing us to add reflexivity to our minds…

By contrast, the other animal species have no language capabilities. Of course they have cognition and communication, but no reflexive consciousness, and consequently no (or very limited) cultural evolution.

Our current challenge is to get a reflexive consciousness at the scale of human collective intelligence. The kind of cyberspace-supported symbolic system that my CI Lab is currently working on aims at progressively developing a better consciousness of our collective intelligence and at supplying sophisticated maps and compasses to navigate our cultural evolution. Pursuing this goal, there will of course be more “tangible” outcomes, like semantic search engines and powerful methodologies for knowledge management.

I'm not sure that my IEML will be the symbolic system of CI, but if we don't try, and experiment, and engage ourselves in seeking solutions for the “reflexive CI consciousness” problem, we will never solve it. So we have to do something. And I'm in a privileged situation by being supported by academic institutions funding this work and providing the proper environment for me and my team (the University of Ottawa, the Canada Research Chair program of the Canadian federal government, the SSHRC [Social Sciences and Humanities Research Council]).

I said that IEML is a symbolic system that is able, in principle, to express any meaning that can be expressed in natural languages, on one side, and that this meaning can be recognized and processed automatically, on the other side. The trick behind this is, in fact, very simple. Computers can only process syntax and they have no access to semantics. So, I had to design a language the semantics of which would be, as much as possible, parallel, or isomorphic, to its syntax. It is probably impossible to get a language the semantics of which is completely and perfectly expressed by its syntax (except in mathematics where we are limited to numbers and logic), but we can achieve a much better correspondence between syntax and semantics than in natural languages. This is what IEML is about: improving the computability of meaning.

Now, the grammar of the language has been completely formalized. Every expression of the meta-language can be recognized, parsed and processed by a deterministic finite machine (practically: by a computer program). The grammar will be published soon, with an open source parser. IEML expressions can express any complex concept or describe any complex network. I think that we have here a strong mathematical foundation (formal languages, set theory and graph theory) allowing automatic processing. I'm currently working on a theory of semantic functions: semantic transformations, perception of semantic patterns, automatic ranking on semantic criteria, etc.

Ultimately we'll have tools to model and simulate cognitive, social and cultural autopoietic systems and interdependent ecosystems.
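[Editorial illustration, not part of the interview.] The two claims above, that every expression can be recognized by a deterministic finite machine and that meaning can run parallel to syntax, can be sketched with a toy notation. This is not IEML's actual grammar, which the interview does not give. The five letters and their glosses are borrowed from the figure legend later in this interview; the colon-terminated format and the names `parse`, `gloss` and `shared_positions` are assumptions made for this example only.

```python
# Toy notation invented for illustration; NOT the real IEML grammar.
# Letters and glosses come from the figure legend in the interview.
GLOSS = {"A": "actual", "U": "virtual", "S": "sign", "B": "being", "T": "thing"}


def parse(expr):
    """Recognize a string such as 'U:S:T:' with a two-state deterministic machine.

    State 'letter' expects one of the primitive letters;
    state 'colon' expects the ':' that terminates it.
    """
    state = "letter"
    tokens = []
    for ch in expr:
        if state == "letter" and ch in GLOSS:
            tokens.append(ch)
            state = "colon"
        elif state == "colon" and ch == ":":
            state = "letter"
        else:
            raise ValueError(f"unexpected character {ch!r} in {expr!r}")
    # Accept only if at least one token was read and the last one was terminated.
    if state != "letter" or not tokens:
        raise ValueError(f"incomplete expression {expr!r}")
    return tuple(tokens)


def gloss(expr):
    """Read a (toy) meaning directly off the syntax, position by position."""
    return tuple(GLOSS[token] for token in parse(expr))


def shared_positions(a, b):
    """Positions where two expressions carry the same primitive.

    A purely syntactic comparison that doubles as a crude semantic one.
    """
    return [i for i, (x, y) in enumerate(zip(parse(a), parse(b))) if x == y]


print(gloss("U:S:T:"))                       # ('virtual', 'sign', 'thing')
print(shared_positions("U:S:T:", "A:S:T:"))  # [1, 2]
```

Because every position maps to a fixed gloss, comparing two expressions needs no dictionary lookup in a natural language. That is the sense in which syntax and semantics are "parallel" in this toy; a real meta-language would need far richer structure.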

The Information Economy Meta Language in practice

That was the theory; now, the practical part. The language currently has only 2,000 words. It can accept something like 250 million words, 10^23 different phrases and 10^69 semes (that are triples of phrases). And if you want, you can arrange these semes in an open-ended complexity of graphs. It is practically infinite.
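[Editorial illustration, not part of the interview.] The figures quoted for phrases and semes fit together arithmetically if one assumes that each of the three slots of a seme can hold any phrase independently. That reading is an assumption; the interview only says that semes are triples of phrases. A quick check in Python, with hypothetical variable names:

```python
# Assumption: a seme is an ordered triple of phrases, each slot chosen freely.
PHRASES = 10**23        # number of distinct phrases quoted in the interview
SEMES = PHRASES**3      # ordered triples of phrases

assert SEMES == 10**69  # matches the number of semes quoted in the interview
print(len(str(SEMES)) - 1)  # exponent of ten: 69
```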

What we have to do now, with my very small team, is to take some terminologies, ontologies, and classification systems and translate them into IEML.

GP: That's exactly what I was going to ask about. You wrote somewhere that the multiplicity of ontologies and taxonomies is a challenge to the inter-operability of meaning. What would you reply, if a devil's advocate would ask, isn't IEML just adding to that multiplicity of ontologies and taxonomies?

PL: What I plan to do in the coming years is to take some ontologies from interesting fields, like public health, professional skills, e-commerce, etc., and to translate them into IEML so that we can build semantic search engines that can process the documents indexed in IEML even if they were indexed originally by different ontologies of separate fields.

I'm adding a new meta-layer, where documents indexed in the context of different ontologies can be searched by a semantic search engine that can work on a heterogeneous corpus. I would like to show that, in addition to translating different ontologies into the same meta-language, we can perform a much more precise and rigorous (a much more scientific) search than we can with current search engines, even with documents that were originally indexed by incompatible ontologies, provided that these ontologies have been translated into IEML.

All the work that has been done in any ontology can be “saved” and valued at a kind of universal level, in IEML, so the work of ontology builders is not lost.
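[Editorial illustration, not part of the interview.] The meta-layer idea can be sketched as follows: documents stay indexed with the terms of their own ontology, each ontology is translated once into shared codes, and search runs on those codes. Everything in this sketch is hypothetical, including the two small vocabularies, the codes `C1` to `C3` (placeholders, not IEML expressions), the documents, and the function names.

```python
from collections import defaultdict

# Hypothetical translations of two incompatible vocabularies into shared codes.
TRANSLATIONS = {
    "A": {"heart attack": "C1", "flu": "C2"},
    "B": {"myocardial infarction": "C1", "influenza": "C2", "nutrition": "C3"},
}

# Hypothetical corpus: (document id, ontology used for indexing, index terms).
DOCUMENTS = [
    ("doc1", "A", ["heart attack"]),
    ("doc2", "B", ["myocardial infarction", "nutrition"]),
    ("doc3", "B", ["influenza"]),
]


def build_index(documents, translations):
    """Index every document under the shared code of each of its terms."""
    index = defaultdict(set)
    for doc_id, ontology, terms in documents:
        for term in terms:
            index[translations[ontology][term]].add(doc_id)
    return index


def search(index, translations, ontology, term):
    """Translate the query term to its shared code, then look up that code."""
    code = translations[ontology][term]
    return sorted(index.get(code, set()))


INDEX = build_index(DOCUMENTS, TRANSLATIONS)
# A query phrased in ontology A also finds the document indexed with ontology B.
print(search(INDEX, TRANSLATIONS, "A", "heart attack"))  # ['doc1', 'doc2']
```

The original index terms are never discarded, which mirrors the point that the work of ontology builders is kept. The sketch matches codes exactly; it says nothing about how a real meta-language would compare meanings that are related but not identical.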

GP: And powered up because IEML will make them capable of traveling further and faster, in connecting with ontologies of other fields, supported by a mathematically formalized language.

People who work on the project

GP: Pierre, you said something about your “small team,” which reminded me to ask: now that you've laid strong foundations, could your work at this stage benefit from a way to widen the circle of people involved with it? In other words, would it be useful to engage more minds helping you further develop the dictionary and the methods in various other domains?

PL: I'm not sure. You were involved with the CI Lab when I tried to gather a network of people interested in building this new field of CI, but I realized that, at least from my perspective, it was too early. And for the past four years I have been working almost alone. Now, for the three years to come, I will not be alone because the tasks to be done need the skills of a team of good computer scientists, but it will be a small team, a little group in Brazil, France, Canada, and two or three experts in the U.S. In a way, that is already complex enough, because all those people will have to build a common computing environment.

And on the other hand I have to work with specialists of public health and various other fields, like the food industry, professional skills, etc., to translate their terminologies into IEML: no more than three to four, or at most five, different fields. And then we have to provide the empirical proof that it works, demonstrating that we have made a scientific leap in semantic search and knowledge management based on computational semantics.

This will take probably two to three years, maybe four years. In this process we will have discovered many problems and tried to solve them, we will have developed a methodology of IEML translation, and tested the computing and semantic search tools. When this phase of R&D has been completed, of course it will be time to open the circle.

But nevertheless, since April 2007, on the website of the Lab, http://www.ieml.org, there has been a wikimetal, for “wiki meta language.” This wiki will support collaborative work on the translation of the various terminologies into IEML. Today there are 2,000 words. But at the end of these three years of collaborative work, there will be at least, let's say, 15,000 words. We will gain experience in the processes of collaborative translation of various terminologies into IEML. Today the IEML words are interpreted only in French and English. At the end they will be interpreted in Spanish, Portuguese, and maybe in some Asian languages. Once we have a strong empirical scientific foundation, when we have proven that the theory is not invalidated by a large-scale experience, IEML will be able to “walk by itself.”

Computational semantics and the wisdom traditions

GP: The whole idea of CI Convergence is to make the important work of the various tribes of CI more visible to one another, and CI based on computational semantics is an essential field within the larger field.

PL: A kind of sub-discipline of CI.

GP: Yes, and it can enrich the other perspectives on CI. It can also inspire us to ask, just what are all the interesting things that we can think of when we think of CI as a field of multidisciplinary study and practice? That's one of the reasons why I would like to find ways to make your contribution to the field more visible to our colleagues. That's why I am looking for easy points of access to it by “lay people,” I mean, colleagues who are not specialists in your domain. Looking at the “resource flow” diagram is a small and easy step in discovering what you do; it can also trigger interest in understanding more of it, as it did for me.

 

Figure 1: The Six-Pole Resource Flow Diagram

 

PL: In the six-pole diagram above, A: means “actual,” U: means “virtual,” S: means “sign,” B: means “being” and T: means “thing.”

This notion of a symbolic meta-language that can encompass any aspect of human life can be found, for example, in the Chinese tradition of the I Ching, where the basic human situations and their dynamic tensions are represented by a purely combinatorial system. It can be found in the Jewish tradition: just think of the Kabbalah and its manipulation and combinations of letters having multiple layers of interpretation. It can be found in some Buddhist and Tantric traditions, if you think about the Kalachakra tradition, for example, where there are very complex mandalas with hundreds of deities and symbols and all around a complex organized space with at least three different levels of interpretation. We can also think about the very rich tradition of the western “arts of memory” that were included in the rhetorical disciplines. All these traditions have developed a kind of symbolic geometry, or a geometry of meaningful symbols.

My work connects not only horizontally, at this present time, as an effort to augment human CI with the intellectual, scientific and technical tools we have today, but it is also, in a kind of vertical time dimension, the continuation of a very ancient effort of various traditions. It strikes me that the quest for an all-encompassing symbolic system that tries to overcome the limitations of natural language by geometrizing or mathematizing the signification process is something that can be found in so many traditions, including the good old Western philosophic and scientific tradition. Think of the work of Leibniz for example, his universal characteristic, or even Peirce's attempts.

So there are some deep roots… it's not only “let's improve the semantic web.” It's more than that; just to add some dimension.

Why the semantic web is not enough

GP: Yes, I can see that. Regarding our contemporaries, whom do you think of as a leading light in computational semantics, today, or in any area that inspired your work?

PL: Of course the first name that comes to my mind is Doug Engelbart. He was one of the first to understand that what we had to do with computers was not “artificial intelligence” but augmentation of personal and collective intelligence. He is also one of the few to recognize that this cognitive augmentation is connected to adding sophistication to our symbolic tools. We owe him the first versions of the mouse, and many of our first hypertext and groupware tools. I also read with pleasure the works of John Sowa on ontology and knowledge representation.

I have a great admiration for Tim Berners-Lee because he connected the field of hypertext and the field of computer networks. The result was the invention of the web, a new layer for the addressing of digital memory: addressing the pages. And this allows us to send a link for such-and-such a page in our e-mails, and to navigate from any page towards any other page, at the scale of the Internet. This was a huge achievement.

Finally, for several years now, I have had a great intellectual exchange with Michel Biezunski and Steve Newcomb, who were the inventors of “topic maps,” a very powerful standard for hypertextual information architecture.

GP: How will the next stage of digital memory addressing differ from the semantic web that Tim Berners-Lee is championing?

PL: If you look at the current tools of the semantic web (XML, RDF, OWL), basically they are logical tools and not semantic tools. XML explicates the logical structure of a database; RDF is an attempt to perform a kind of cataloguing of Web resources by triples that can be connected in graphs. And OWL is just a language to formalize and process ontologies, but the different incompatible ontologies stay different and incompatible. Also, what you have inside the angle brackets “< >” in XML, RDF or OWL is still natural language expressions, with all their inherent limitations.

I really think that what we need now is to design a symbolic system that resonates with the scale, complexity and power of our new technical environment and I don't see this theoretical boldness in the current work of the semantic Web, even if what is being done here is obviously very useful.

I'm not sure at all if this new symbolic system will be IEML but I think we do need this kind of symbolic system. Maybe it's a matter of several generations.

GP: In any case, you are prototyping the first one, and you make the importance of the whole issue more visible and more ready to be looked at from various perspectives. (The one from which I'm looking at it, is the evolution of collective consciousness at increasing scale.)

PL: It would be nice if, in the CI field, people would begin to consider that there is not only a universal, infinite, and measurable physical space, but also a universal, infinite, and measurable semantic space, and that we could observe, understand and improve the processes in this semantic space. We have the hardware aspect of the observation instrument [i.e., linked computers to observe this semantic space, like telescopes or microscopes to observe physical space].

Now we need the software and symbolic part of it to make this observation instrument fully operational. So there is a whole new space that we can collectively explore and understand more… If I could bring people into looking at this space, it would be enough. It would be an achievement.

Setting the stage for beginning an exchange…

GP: That's very inspiring. You know, even before you build up the 15,000-word meta-language, and even before you develop your first prototype, your ideas are already inspiring some of us to see a new dimension of the CI field, which I and probably a number of us didn't think of before hearing about your work.

PL: I very much appreciate the work that you are doing in the convergence of people working in this field. I wanted to do it myself but realized it was not my “karma” to do it. But it has to be done. I also sent an email to Thomas Malone, the director of the CI center at MIT, and he answered very kindly. I think that it is a good thing that MIT opened such a research center with such a title. It is a kind of signal: it is no longer a marginal field, it is mainstream. It is good news.

GP: For me, your getting the Canada Research Chair on CI was already a significant step in the direction of CI being recognized. Thinking of the many different ways that different colleagues are approaching it, I just can't prevent myself from fantasizing about a “what if”: What if we are at a stage in our work where we've already laid the groundwork and, of course, there is still much more to do, but we do experience more freedom and curiosity in ourselves to look around and see who else is here on this field and what we can gift one another with. That's the dream that I hold when I'm sensing into what the Collective Intelligence Convergence can become. That's one of the possibilities that I feel attracted to.

Relevant link

Collective Intelligence Lab: http://www.ieml.org/

 

 

 

1 Professor Lévy is Canada Research Chair in Collective Intelligence at the University of Ottawa. The interview took place on 12 January 2007 and was shortened somewhat and also updated by Professor Lévy on 10 November 2007. Transcription: Ms. Sheri Herndon.

2 George Pór is an advisor to leaders in international business and government. A former Senior Research Fellow at INSEAD, he is currently a PrimaVera Research Fellow in Collective Intelligence at Universiteit van Amsterdam (http://primavera.feb.uva.nl/index.php?option=com_content&task=view&id=20) and publisher of the Blog of Collective Intelligence (http://www.community-intelligence.com/blogs/public). His clients include: British Petroleum, EDS, Ericsson, European Commission, European Foundation for Management Development, European Investment Bank, Ford Motor Co., Hewlett Packard, Intel, Siemens, Sun Microsystems, Swiss Re, and Unilever. He can be reached at George(at)Community-Intelligence.com.