ISSN 1566-8266

BOARD BNVKI:
Joost Kok (chair)
Rineke Verbrugge (member)
Wiebe van der Hoek (member)
Yao-Hua Tan (member)
Eric Postma (member)
Luc DeHaspe (member)
Walter Daelemans (member)
Gert-Jan Beijer (member)
BNVKI-SECRETARY:
Rineke Verbrugge
Rijksuniversiteit Groningen
Cognitive Science and Engineering
Grote Kruisstraat 2/1
9712 TS Groningen
[email protected]
Volume 16, number 6, December 1999

EDITORIAL BOARD:
Eric Postma (Editor-in-Chief)
Jaap van den Herik
Bart de Boer
Shan-Hwei Nienhuys-Cheng
Cees Witteveen
Antal van den Bosch (section editor)
Edwin de Jong (editor Belgium)
Richard Starmans (section editor)
Radboud Winkels (section editor)

EDITORIAL ADDRESS:
BNVKI newsletter
Joke Hellemons
Universiteit Maastricht
FdAW, Department of Computer Science
P.O. Box 616, 6200 MD Maastricht
Telephone: 043 388 34 77 / 35 04
Fax: 043 388 48 97
E-mail:
[email protected] http://www.cs. unimaas.nl/~bnvki
GREAT EXPECTATIONS
Editor-in-Chief

On the brink of the year 2000 it is almost obligatory either to look back on the past millennium or to look forward into the next. Opting for the latter, I could speculate on the impressive achievements that lie ahead for artificial intelligence. I would certainly not be the first to do so. Many authors have attempted to predict the future from the current state of research and technology. A striking example is Hans Moravec's contribution to the December 1999 issue of Scientific American. His great expectations of future artificial-intelligence research lead him to predict that by 2050 robot "brains" will start rivalling human intelligence. The prediction is based on the following line of reasoning. Estimating the computational power of the human retina at 1,000 MIPS (Million Instructions Per Second), Moravec asserts that the computational power of the entire brain is about 100 million MIPS, because the brain outweighs the retina by a factor of about 75,000 (the retina weighs 0.02 grams, whereas the brain weighs 1,500 grams). Extrapolating from the rapid development of computational power, Moravec reaches the conclusion that by 2050 artificial brains will start rivalling human brains.

Moravec is wrong for at least three reasons. First, the retina performs a plethora of functions and is still poorly understood by biologists. According to Moravec, the computational power of the retina follows from the fact that the retina detects an edge or motion in a million image regions in parallel. In computer vision, edge or motion detection requires at least 100 instructions; therefore, the entire retina performs 1,000 MIPS. However, the retina is much more than a parallel edge and motion detector. This is due, among other things, to the physical features and embodiment of the eye. For instance, the eyes are in constant motion, which implies that motion is detected all the time. Nevertheless, we do not see the world moving as we move our eyes. Undoubtedly, this observation has profound consequences for the design and function of the retinal circuitry. As a consequence, the retina performs much more than 1,000 MIPS.

The second reason why Moravec is wrong has to do with his confounding of weight with computational power. Although there may be a relation between the weight of brain tissue and the associated computational power, this relation is not necessarily linear. The brain has many structures for supporting neurons and their connections, so on the one hand Moravec may overestimate the weight. On the other hand, and more importantly, the structural features of the brain differ from those of the retina. Local connections abound in the retina, whereas many global long-range connections exist in the brain. These connections and their adaptive strengths are pivotal for our mental abilities, and there is no simple relationship between their weight and their computational power.

As a final point, the evolution of computer speed as a function of time is non-linear and is expected to saturate in the near future. Edward Rietman stressed this point in his BNAIC'99 lecture on evolvable neural hardware (see my report on page 166 of this newsletter). Therefore, even if the brain performs about 100 million MIPS, our computers may not yet have reached this level of performance in 2050. It is somewhat depressing that researchers like Moravec have not learned from the lessons of the past.
As we know from the past fifty years of AI research, overly optimistic expectations are bound to lead to disappointments. Therefore, I favour a more modest and realistic approach. I am convinced that the new century (or millennium) holds many challenges and opportunities for artificial-intelligence researchers. The increasing computational power will certainly change the face of artificial intelligence, robots will become more and more humanoid, and artificial-intelligence techniques will be incorporated into our daily environment. But I will not make more specific predictions than that. Instead, on behalf of the Editorial Board and the Board of the BNVKI, I wish you a very happy New Year, new century, and new millennium!

Photos by Joke Hellemons, Jaap van den Herik, and Hans Henseler.
Cover photograph: the invited speakers of BNAIC'99, (from left to right) Edward Rietman, Tom Mitchell, and Jonathan Schaeffer.

ERRATUM

In the previous BNVKI newsletter (Vol.16, No.5), Bart Verheij's contribution on page 145 contained two errors due to editorial mistakes. Firstly, a footnote included by the author was omitted. The footnote stated that the contribution was published in the journal Recht en Elektronische Media. Secondly, the footnote on page 147 contained a mistake; the proper footnote reads as follows: "Voor meer informatie over de systemen verwijs ik naar het proefschrift en de productbesprekingen in het tijdschrift R&EM. WVP is besproken in R&EM, nr.1 (1996), OVB in R&EM, nr.4 (1998)".
TABLE OF CONTENTS

Great Expectations (Editor in Chief) .... 160
Table of Contents .... 161
BNVKI-Board News (Joost Kok) .... 162
Minutes of the BNVKI-AIABN General Assembly (Rineke Verbrugge) .... 163
Again on the Right Track (Jaap van den Herik) .... 164
BNAIC'99 Reports .... 165
  Invited Lectures (Eric Postma) .... 165
  Logic and Reasoning 1 (Shan-Hwei Nienhuys-Cheng) .... 167
  Evolutionary Computation 1 (William Langdon) .... 168
  Machine Learning and Neural Networks 1 (Jan van den Berg) .... 169
  Belief Networks (Marc Gyssens) .... 169
  Search (Ida Sprinkhuizen-Kuyper) .... 170
  Agent Technology 1 (Catholijn Jonker) .... 170
  Knowledge Representations and Systems (Jos Uiterwijk) .... 171
  Special Session AI and Law (Yao-Hua Tan) .... 172
  Logic and Reasoning 2 (Wiebe van der Hoek) .... 173
  Machine Learning and Neural Networks 2 .... 173
  Robotics and Vision 2 (Rineke Verbrugge) .... 174
  Special Session AI in Medicine (Arie Hasman) .... 174
  Demonstrations 2 (Hans Henseler) .... 175
  Agent Technology 2 (Nico Roos) .... 175
  Robotics and Vision 3 (Edwin de Jong) .... 176
  Evolutionary Computation 3 (Elena Marchiori) .... 177
  Agent Technology 3 (John-Jules Meyer) .... 177
  Demonstrations 3 (Erica van de Stadt) .... 178
  Logic and Learning Theory (Richard Benjamins) .... 179
  Machine Learning and Neural Networks 3 (Antal van den Bosch) .... 179
  Winner of the SKBS Award (Hans Henseler) .... 180
Explorations in the Document Vector Model of Information Retrieval (Ruud van der Pol) .... 180
Cognitive Science Research at ULB and ULG (Edwin de Jong) .... 183
Section Knowledge Systems in Law and Computer Science (Radboud Winkels) .... 185
  Interfacing between Lawyers and Computers, an Architecture for Knowledge-based Interfaces to Legal Databases (Kees van Noortwijk) .... 185
  Casus-gebaseerde Juridische Argumentatie (Radboud Winkels) .... 189
  Kunnen Rechters Rechtspreken (Doeko Bosscher) .... 190
Section SIKS (Richard Starmans) .... 190
  Report on the SIKS master class by Jonathan Schaeffer (Ida Sprinkhuizen-Kuyper) .... 190
  Report on the SIKS master class by Tom Mitchell (Evgueni Smirnov) .... 191
Call for Papers .... 191
  14th European Conference on Artificial Intelligence .... 191
Conferences, Symposia, Workshops .... 192
E-mail Addresses / Board Members / Editors BNVKI Newsletter / How to Become a Member? / Submissions .... 193
Advertisement / Change of Address .... 193
The BNVKI is sponsored by AHOLD and by BOLESIAN. In 1999, the publication of the BNVKI newsletter is also supported by the Division of Computer Science Research in the Netherlands (previously called SION, now ACI).
BNVKI BOARD NEWS
Joost Kok
Chairman BNVKI

With much pleasure, I would like to tell you something about my impressions of the BNAIC in Maastricht. However, pleasant circumstances forced me to be in the hospital: during the session Machine Learning and Neural Networks III, Marijn Kok was born, and I must admit we (my wife and I) are very happy with Marijn. Hence, I have to rely on second-hand impressions of the BNAIC, which were all very favourable. Especially the programme and the organisation received positive remarks. I would therefore like to thank everybody who helped in organizing the BNAIC'99 conference, in particular Floris Wiesman (the chair of the organisational committee), for all their efforts.
Best Student Paper Award: Rens Kortmann
Taking up my usual task of pointing you to remarkable items in our AI society, I would like to draw your attention to the BNVKI tutorials on artificial intelligence and language processing (http://pcger33.uia.ac.be/bnvki/), which will be held in Tilburg on January 10, 2000. The day consists of two three-hour tutorials, on Symbolic Machine Learning for Natural Language Processing (Raymond Mooney) and Communication in a Multi-Agent System (Frank Dignum). As a BNVKI member one pays only a 50-guilder registration fee for this most interesting day.
Best Demonstration (SKBS) Award: Michiel van Wezel and Han La Poutré
It has become a tradition in this column to give a link to an interesting site. This time, I would like to point to the Research Index (http://citeseer.nj.nec.com/cs), in which citations to papers are listed. This link can be very useful for finding related papers on a certain subject. Finally, on behalf of the Board of the BNVKI it is my pleasure to congratulate Rens Kortmann (Best Student Paper Award), Michiel van Wezel and Han La Poutré (Best Demonstration (SKBS) Award), and Gianluca Bontempi and Mauro Birattari (Best Paper Award).

Best Paper Award: Gianluca Bontempi and Mauro Birattari
MINUTES OF THE BNVKI-AIABN GENERAL ASSEMBLY
November 4, 1999
Rineke Verbrugge, Secretary

Agenda
1. Opening
2. Announcements
3. Minutes of the previous general assembly
4. Election of New Board
5. ECCAI developments
6. Financial Report
7. Location of BNAIC-2000
8. BNVKI Newsletter
9. Cooperation with CLIN
10. Any other business

The chairman of the general assembly is Yao-Hua Tan, replacing BNVKI chairman Joost Kok, who could not be present.

Ad 1. Opening
The chairman opens the meeting at 13 h, and the agenda is passed around among the participants.

Ad 2. Announcements
• The chairman is pleased to announce that the BNVKI-AIABN keeps growing and now has 233 members. Unfortunately, the number of Belgian participants does not follow this trend and stays at 35.
• BNAIC'99 attracted 144 participants, also quite a high number. This BNAIC has been organized in cooperation with SNN. The BNVKI Board would like to continue this cooperation.
• In 1999, the BNVKI-AIABN sponsored some AI activities. R. Moody was invited as a speaker at a CLIN-NVKI workshop on Communication in Tilburg. A workshop in Rotterdam on "Formal Models of Electronic Commerce" was sponsored, and a guarantee was given to the "Dutch-German Workshop on Non-monotonic Reasoning". The Board intends to continue sponsoring such activities.
• Recently, the Board established a standard f 500,- sponsorship package for companies and organizations willing to sponsor the BNVKI-AIABN. Of course, it remains possible to sponsor our organization with amounts other than f 500,- per year.

Ad 3. Minutes of the previous general assembly
The minutes are accepted by general acclamation.

Ad 4. Election of New Board
In the past year, Luc de Raedt accepted a chair in Germany and stepped down as Board member. The Board proposes Luc DeHaspe (KU Leuven) as new member. The rest of the Board remains unchanged. The meeting accepts this proposal by general acclamation.

Ad 5. ECCAI developments
At the ECCAI general assembly of August 4, 1999 in Stockholm, some announcements were made that are also relevant for BNVKI-AIABN members:
• The journal AICom, which in the past was sent to all individual members of national AI organizations, will no longer be published in paper form. In the future, BNVKI-AIABN members will be able to access the electronic version at IOS Press, using passwords specifically earmarked for our members.
• Ph.D. supervisors are asked to nominate excellent Ph.D. theses defended in 1999 for the Second ECCAI Dissertation Award (1500 ECU). The nomination procedure is explained at http://www.eccai.org/dissertation.html. Please note that the deadline is quite soon.
• Ph.D. students who are members of BNVKI-AIABN and who will present a paper at ECAI-2000 can apply for a travel grant (400 ECU) from ECCAI. At least two such grants may be awarded to Ph.D. students from the Netherlands and Belgium.

Ad 6. Financial Report
For the financial year 1999, treasurer Wiebe van der Hoek shows that the BNVKI-AIABN looks to be financially healthy. H. de Swart suggests using a different bank account that offers more interest than the present one. The accounts committee has inspected the books and was satisfied to find them in order. During the meeting, a new accounts committee is proposed, consisting of Catholijn Jonker, Ida Sprinkhuizen-Kuyper and Annette ten Teije; this proposal is accepted by general acclamation. The meeting also accepts the treasurer's proposed budget for the year 2000 by general acclamation.

Ad 7. Location of BNAIC-2000
Antal van den Bosch presents the plan to locate BNAIC-2000 at Tilburg University, where the Info Lab and Computational Linguistics will organize it. The event will take place in late October, and satellite events like Benelog, Benelearn, and CLIN are planned to be coordinated with BNAIC-2000. Bert Kappen of SNN proposes that there should be more contact between BNAIC organizers and SNN from an early stage, so that SNN may have more influence on the structure of the conference programme.

Ad 8. BNVKI Newsletter
Editor Eric Postma encourages members to send in copy: contributions to the Newsletter are always highly welcome.

Ad 9. Cooperation with CLIN
This year some activities have been organized together with CLIN. Both sides want to continue the cooperation.

Ad 10. Any other business
Jaap van den Herik asks the Board to publish in the Newsletter the scheme of years in which Board members have to step down. He also expresses his opinion that it would be a good idea if BNAIC-2001 were to be located in Belgium.

AGAIN ON THE RIGHT TRACK
Jaap van den Herik
IKAT, Universiteit Maastricht

This issue of the BNVKI Newsletter tells you about the successful BNAIC'99 in Maastricht, on November 3-4, 1999. It was the first official cooperation with our Belgian colleagues. Having followed the presentations of many Ph.D. researchers at that conference as well as at the SIKS Doctoral Consortium (also in Maastricht, on November 2, 1999), I believe that the next century will bring an explosion of Ph.D. theses, say, in 2002. So, the future is promising.

But what about last year? Since 1994, your Editor counts the Ph.D. production as published in this Newsletter. The Ph.D. theses deal with AI and AI-related domains. At first, the scope was restricted to AI in The Netherlands. Thereafter we took into account the research area Law and Computer Science. Then we opened our Section also for SIKS announcements from the other branch of our School, namely the Database side. Moreover, related domains such as AI and Robotics, Medical AI, Neural Networks, and Knowledge Management were sometimes given a place to announce their scientific results. Finally, we broadened our scope towards Belgium.

An announcement is primarily dependent on the willingness of the author and the supervisor to inform the Editor of the Ph.D. defence. Hence, the numbers, the information, and the conclusions are rather biased and unreliable. Yet, they have some value of their own. For instance, it adequately addresses the question: how relevant is an announcement of a Ph.D. defence to the persons involved? With some pleasure I may say: the relevance is increasing.

Below we provide you with the total number of Ph.D. theses announced in 1999. For comparison we have given the list containing the numbers of the previous years. This list contains the adjusted numbers. The dedicated reader knows that after publication of the annual list, a Ph.D. researcher or a supervisor sometimes "feels forgotten". Hence, in the February issue of the year following the publication we now and then supply additional information. The current list reads:

1994: 22
1995: 23
1996: 21
1997: 30 (for reasons see above)
1998: 21
1999: 28

The number of 28 (for 1999) means that we are back on the right track of producing Ph.D. theses in accordance with the number of researchers (professors as well as aio's) in the field. I had hoped for 25 (see Vol.15, No.6, p.182).

As a courtesy to the 1999 newborn doctores I am pleased to honour them (again) by mentioning them below, together with their promotion dates: G.C. van den Eijkel (18-1), M.A.P.M. van Asseldonk (5-2), L.K.J.M. Vermeersch (8-2), M. Wiering (17-2), M. Weusten (10-3), P. van de Laar (12-3), K. Sima'an (31-3), J. Jarmulak (6-4), L. Matthijsen (9-4), M. Sloof (11-5), H. Vandecasteele (26-5), R. Potharst (4-6), B. de Boer (4-6), G. Zwaneveld (4-6), D. Beal (11-6), J. Penders (11-6), J.G.M. Schavemaker (18-6), J. Sowa (25-6), M.F. Moens (28-6), D. Spelt (10-9), H. Paijmans (14-9), N.J.T. Wijngaards (30-9), A. de Moor (1-10), M. Aznar Fernandez de Montessinos (25-10), J.H.J. Lenting (3-12), W.J. Willemse (3-12), F.A.B. Lohman (7-12), A. de Vries (17-12).

For the year 2000 we have received one announcement. Maybe the step to the next millennium is too large a step for adequate planning. I know that many have in mind to start the millennium with the best thing they can do (i.e., defend a Ph.D. thesis), but so far only two dates have been scheduled. The list below therefore contains only six announcements. Two of them are repetitions of earlier-published announcements, but they belong to the month in which this Newsletter is issued. The third one is a Ph.D. thesis we received from the Delft University of Technology, the fourth is from our SIKS colleague Professor Apers, while the fifth and sixth have their roots in Delft and Eindhoven respectively. We congratulate the new doctors F.A.B. Lohman, A. de Vries, E.G.P. Bovenkamp, and S.C. Pauws on the publication of their theses and wish them a successful defence. The relatedness between Information Management, Knowledge Management, and AI techniques is an issue which deserves substantial attention in the next five years.

J.H.J. Lenting (December 3, 1999). Informed Gambling: Conception and analysis of a multi-agent mechanism for discrete reallocation. Universiteit Maastricht. Promotor: Prof.dr. H.J. van den Herik; co-promotor: dr. P.J. Braspenning.

W.J. Willemse (December 3, 1999). Computational Intelligence: Life without Tables for the Actuary. Technische Universiteit Delft. Promotor: Prof.dr. H. Koppelaar.

F.A.B. Lohman (December 7, 1999). The Effectiveness of Management Information. Technische Universiteit Delft. Promotor: Prof.dr. H.G. Sol.

A. de Vries (December 17, 1999). Content and Multimedia Database Management Systems. Technische Universiteit Twente. Promotor: Prof.dr. P. Apers.

E.G.P. Bovenkamp (January 11, 2000). Fuzzy Temporal Reasoning. Technische Universiteit Delft. Promotor: Prof.dr.ir. E. Backer; co-promotor: dr.ir. J.C.A. van der Lubbe.

S.C. Pauws (January 12, 2000). Music and Choice: Adaptive Systems and Multimodal Interaction. Technische Universiteit Eindhoven. Promotoren: Prof.dr. D.G. Bouwhuis en Prof.dr. P.M.E. de Bra; copromotor: dr.ir. J.H. Eggen.

Finally, we have two Ph.D. reviews. In the Section Legal Knowledge-Based Systems, Kees van Noortwijk (EUR) discusses Luuk Matthijssen's (KUB) thesis, titled Interfacing between Lawyers and Computers, an Architecture for Knowledge-based Interfaces to Legal Databases. He concludes that Matthijssen's contribution is worth applying in other domains. After the BNAIC'99 reports the reader finds a review by Ruud van der Pol. He discusses Hans Paijmans' thesis Explorations in the Document Vector Model of Information Retrieval.

BNAIC'99 REPORTS

INVITED LECTURES
Report by Eric Postma

The BNAIC'99 organizational committee succeeded in attracting three invited speakers: Tom Mitchell (Carnegie Mellon University), Edward Rietman (Bell Labs/Lucent Technologies), and Jonathan Schaeffer (University of Alberta). In addition, Frans Groen from the University of Amsterdam was invited to provide an introduction to the Robot Soccer demo. Tom Mitchell opened the scientific part of the conference.

Extracting Information from the World Wide Web
Tom Mitchell

Mitchell began his talk with the observation that "the Internet is rapidly becoming the largest knowledge base of information in the world". Any one of the over 500 million web pages in this steadily increasing knowledge base can be readily retrieved by a standard workstation, so accessibility poses no problem. However, the automatic understanding or indexing of the contents of web pages does present a challenge to present-day computers. Mitchell's research group faces this challenge by developing dedicated machine-learning methods for automatic indexing of web pages. A prototype system for extracting information from web sites in a fully automatic way is now running at CMU. Mitchell showed some examples of how the prototype system deals with web pages of companies and universities. On tasks involving the classification of a computer science web page as a faculty or
a student page, the system achieves an accuracy of about 70 percent correct on previously unseen pages. This level of performance was achieved with a very simple ML method based on the "Bag-of-Words" (BOW) representation in combination with a Bayesian classifier. The BOW representation is a high-dimensional feature vector (50,000 to 100,000 dimensional!) in which each element represents a word in the English language. The contents of a web page are encoded into the feature vector by setting each element to the number of occurrences of the corresponding word. Obviously, the resulting feature representation of a web page is very sparse, as the majority of the elements will be zero. The reported performance was obtained by using a training set of 4,000 hand-labelled web pages derived from the sites of the computer science departments of four universities. After applying Bayes' rule to update the probabilities that BOW vectors belong to a certain class, the classifier was tested on the computer science web pages of CMU. From the experiment, Mitchell was able to show a ranking of the words most indicative of (computer science) student pages. The highest-ranking word was "resume", followed by "advisor", "student", and "stuff".
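For readers who want to see the idea in code, here is a minimal sketch of such a classifier: bag-of-words counts feeding a multinomial naive Bayes rule with Laplace smoothing. The tiny data set and all names are invented for illustration; the CMU system is of course far larger, but the arithmetic is the same in spirit.

    import math
    from collections import Counter

    def train(pages, labels):
        """Multinomial naive Bayes over bag-of-words counts.
        pages: list of token lists; labels: e.g. 'student' or 'faculty'."""
        word_counts = {c: Counter() for c in set(labels)}
        class_counts = Counter(labels)
        vocab = set()
        for tokens, label in zip(pages, labels):
            word_counts[label].update(tokens)
            vocab.update(tokens)
        return word_counts, class_counts, vocab

    def classify(tokens, model):
        word_counts, class_counts, vocab = model
        n = sum(class_counts.values())
        best, best_lp = None, -math.inf
        for c in class_counts:
            total = sum(word_counts[c].values())
            lp = math.log(class_counts[c] / n)        # class prior
            for w in tokens:                          # Laplace-smoothed word likelihood
                lp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

    pages = [["resume", "advisor", "courses"], ["publications", "teaching", "grants"]]
    model = train(pages, ["student", "faculty"])
    print(classify(["resume", "advisor"], model))     # -> student

With 50,000-100,000 dimensional count vectors almost all entries are zero, which is why real implementations store the counts sparsely, as the Counter objects do here.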
Hand labelling is a very tedious job, which presents a major obstacle to applying ML methods to the classification of web pages. Fortunately, Mitchell's research team found a way to minimize the amount of hand labelling required. The main idea is to exploit the redundancy of web pages. For instance, in most cases the words on the web page suffice for classifying it as (for instance) a student or faculty page. However, the names on the embedded hyperlinks (e.g., "advisor") provide an independent, reliable source for classification. Mitchell explained a learning technique called co-training. The main feature of co-training is that it applies a kind of bootstrapping procedure to generate a large training set from a small set of labelled instances. Two classifiers were trained, one on the words on the web page (using the BOW representation) and the other on the names of the hyperlinks. Training both classifiers on only a small (training) set of labelled instances yields rather weak classifiers (error rates of about 25%). The next step is to apply both classifiers to the much larger set of unlabelled instances. As the classifiers generate probabilities, rather than discrete positive or negative outcomes, the classified unlabelled instances can be ranked in terms of the probability that they have been correctly classified. Now the crucial step in co-training is to select the highest-ranked unlabelled instances, add them together with their class to the training set of labelled instances, and train both classifiers again on the extended set. Applying this procedure iteratively yields an ever-growing set of labelled examples and, in general, a decreased classification error. Mitchell was very enthusiastic about co-training and ended his talk by asking the audience to come up with suggestions for suitable application domains of co-training.
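The bootstrapping loop itself is compact. Below is a hedged sketch of the procedure as described: two naive Bayes classifiers, one per "view" (page words and hyperlink anchor words), each promoting its most confidently labelled pages into the shared training pool. All names and the tiny data set are illustrative assumptions; the actual CMU implementation was not detailed in the talk.

    import math
    from collections import Counter

    def nb_train(examples):
        """examples: list of (tokens, label) pairs -> smoothed naive Bayes model."""
        counts, priors, vocab = {}, Counter(), set()
        for tokens, y in examples:
            counts.setdefault(y, Counter()).update(tokens)
            priors[y] += 1
            vocab.update(tokens)
        return counts, priors, vocab

    def nb_predict(model, tokens):
        """Return (most probable label, its posterior probability)."""
        counts, priors, vocab = model
        n = sum(priors.values())
        logp = {}
        for y in priors:
            total = sum(counts[y].values())
            lp = math.log(priors[y] / n)
            for w in tokens:
                lp += math.log((counts[y][w] + 1) / (total + len(vocab)))
            logp[y] = lp
        m = max(logp.values())
        z = sum(math.exp(v - m) for v in logp.values())
        best = max(logp, key=logp.get)
        return best, math.exp(logp[best] - m) / z

    def co_train(labelled, unlabelled, rounds=5, per_view=2):
        """labelled: list of ((words, anchors), label); unlabelled: list of (words, anchors).
        Each round, each view's classifier promotes its most confident pages."""
        for _ in range(rounds):
            if not unlabelled:
                break
            for view in (0, 1):        # view 0: page words, view 1: anchor words
                model = nb_train([(page[view], y) for page, y in labelled])
                scored = sorted(((nb_predict(model, page[view]), page) for page in unlabelled),
                                key=lambda t: t[0][1], reverse=True)
                for (label, _), page in scored[:per_view]:
                    unlabelled.remove(page)          # promote into the labelled pool
                    labelled.append((page, label))
        return labelled

    L = [((["resume", "hobbies"], ["advisor"]), "student"),
         ((["publications"], ["teaching"]), "faculty")]
    U = [(["resume", "courses"], ["advisor"]), (["grants"], ["teaching"])]
    print([y for _, y in co_train(L, U)])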
Evolvable Hardware
Edward Rietman
Rietman opened his lecture by stating that, instead of talking about AI in chip manufacturing, he decided to talk about evolvable hardware. One of his first slides showed the well-known graph of the evolution of computational power as a function of time. At the top of the graph (in 1999) was the G4, the latest Macintosh chip, which achieves a performance comparable to a CRAY supercomputer, i.e., one billion floating-point operations per second. Despite the amazing speed of the evolution of hardware, Rietman stressed
that we are reaching the limits of what can be physically realized. Shrinking below the scale of a few tens of atoms inevitably leads to problems. The tendency to densely pack transistors (i.e., the building blocks of chips) limits the design and interconnection possibilities. One solution may be to exploit the third dimension. But, according to Rietman, the dissipation of the components prevents their dense packing in three dimensions. In his words: "there is no way of getting the heat out of the system".

OSCILLATORS

Rietman's solution is to follow the biological example in creating new types of biologically inspired computers. Parallel processing, a distributed power source, and oscillating computational elements should form the main ingredients of such a computer. In particular, the use of oscillators is relatively new in neural-hardware research. Rietman pointed at the omnipresence of oscillators and oscillations in neural tissue and claimed their incorporation to be pivotal in future neural computers. Some of his own experiments, involving neurons based on the (for electrical engineers) well-known Schmitt-trigger circuit, showed the nonlinear sigmoid relationship between input voltage and oscillator frequency characteristic of real neurons. He also discussed his experiments on in-silico evolution of oscillators using field-programmable gate arrays (FPGAs). In one of these experiments, oscillators with "impossibly" high oscillation frequencies evolved. It turned out that specific defects or characteristics of the chip (e.g., parasitic capacitances) were exploited by the genetic algorithm to achieve those extraordinarily high frequencies.
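For readers unfamiliar with the Schmitt-trigger neuron, the following toy simulation illustrates the kind of input-to-frequency mapping Rietman described: a capacitor charges towards the input voltage and a Schmitt trigger flips at two thresholds, so the oscillation frequency rises and then saturates with increasing input, a sigmoid-like response. All component values are invented for illustration; this is not Rietman's circuit.

    def oscillator_frequency(v_in, v_low=1.0, v_high=2.0, rc=1e-3, dt=1e-6, t_max=0.05):
        """Crude Schmitt-trigger relaxation oscillator: the capacitor voltage
        charges towards v_in and discharges towards 0; the trigger flips at the
        two thresholds. Returns the oscillation frequency in Hz (0 if the input
        is too low to ever reach the upper threshold)."""
        v_c, charging, flips = 0.0, True, 0
        for _ in range(int(t_max / dt)):
            target = v_in if charging else 0.0
            v_c += (target - v_c) * dt / rc       # first-order RC response
            if charging and v_c >= v_high:
                charging, flips = False, flips + 1
            elif not charging and v_c <= v_low:
                charging, flips = True, flips + 1
        return flips / (2 * t_max)                # two trigger flips per period

    for v in (1.5, 2.1, 2.5, 3.5, 5.0, 10.0):
        print(f"v_in = {v:5.2f} V  ->  {oscillator_frequency(v):8.1f} Hz")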
Rietman's lecture was interesting and thought-provoking. It shed light on future developments. However, it left me with some feeling of dissatisfaction, due to the lack of scientific embedding of his presentation. Rietman did not address the question of why oscillators are so important to study for future computers. Simply pointing at biological plausibility is not enough to warrant their study. From private talks with Rietman, I know that he has scientifically well-founded reasons for studying oscillators. For one reason or another, he presented himself during his lecture primarily as an electrical engineer instead of an AI researcher.

The Games Computers (and People) Play
Jonathan Schaeffer

On the second day of the BNAIC, Jonathan Schaeffer gave a highly entertaining lecture. He talked about the progress in game research using a carefully prepared multimedia PowerPoint presentation. Referring to Mitchell's Bag-of-Words method, Schaeffer stressed that there are many efficient non-human ways of solving difficult problems. (Apparently, Schaeffer assumed that humans do not use "stupid" methods such as the BOW method. Whether that is the case remains to be seen.) Making a case for concentrating on methods suitable for computers instead of methods inspired by the human example, he reviewed the history of games and search research. His historic overview included discussions of the games of Checkers, Chess, Backgammon, Othello, Scrabble, Bridge, Poker, and crossword puzzles. For each game he presented the Human Performance (e.g., Kasparov for the game of Chess), the Computer Performance (e.g., Deep Blue), the Verdict (e.g., the result of Kasparov vs. Deep Blue), and the Secret (e.g., the power of Deep Blue relies partly on brute force). With respect to the famous match between Kasparov and Deep Blue, Schaeffer
remarked that although the match was good publicity for IBM, from a scientific point of view it was of limited value. In experimental science, a single data point is not very informative; repeatability of experiments is an important requirement. Since Deep Blue has been dismantled, repeating the Kasparov-Deep Blue experiment is not possible. Therefore, the Verdict in this case is more or less unsettled. In Schaeffer's words: "in chess humans are (by a slim margin) still superior to computers".

Schaeffer's use of multimedia in his presentation allowed him to respond directly to, for instance, Kasparov's excuses after his loss. This type of presentation was very amusing and entertaining. Despite some problems with the audibility of parts of the presentation (which were certainly not due to Schaeffer, who tested his presentation extensively before the lecture), his lecture was a great success. He ended his lecture by expressing a concern about the future of AI. Given the huge efforts involved in dealing with games such as chess, where all the rules and states are known, what tremendous effort is required to deal with real-world problems, with their unknown and changing rules and states? This concern seems completely warranted, given the still limited applicability of AI techniques in the real world. On the other hand, it represents a great challenge for the next generation of AI researchers.

From left to right: Jaap van den Herik, Joke Hellemons, Martine Tiessen, Floris Wiesman, and Eric Postma

LOGIC AND REASONING 1
Report by Pierre-Yves Schobbens
University of Namur

The Complexities of a Refinement Operator for Prenex Conjunctive Normal Forms
Shan-Hwei Nienhuys-Cheng

The paper extends refinement operators, as used in ILP (Inductive Logic Programming, to which you can be introduced by the LNAI book of the same author). In ILP, an initial Logic Programming description of a concept is incrementally refined so as to match all positive and no negative examples. Traditionally, the description is given by Horn clauses, but here this is extended to Prenex Conjunctive Normal Forms (PCNF). PCNF are as expressive as first-order logic. The refinement operators propose ways to make such descriptions more specific (usually to fit a further example). Seven operators were shown to be enough, when used in a chain, to specialise to any PCNF formula, a property called "weak completeness". The author also brought bad news: the set of possibilities offered by these operators is very large, namely exponential without functions and doubly exponential with functions. It thus seems useful to look for effective heuristics to discover the right refinement rapidly, and also to look for subsets with more manageable complexities (beyond classical ILP).
Computing with Computational Histories
A. Bos, N. Roos, C. Witteveen
The paper, presented by André Bos, aims to define a new kind of complexity, taking into account previous attempts to solve similar problems. Thus the computer can use a list of previously solved problems, called the computational history. This concept is not so easy to define, as for
example they discovered that one possible definition reduces unexpectedly to the class of pre-compilable problems of Cadoli et al. To avoid this, the history is only allowed to depend on the alphabet of the fixed part of the problem. Up to now, complexity theory was independent of the alphabet used, so this new concept needs further investigation, to my mind.
EVOLUTIONARY COMPUTATION 1
Report by William Langdon
CWI

The Influence of Evolutionary Selection Schemes on the Iterated Prisoner's Dilemma
D.D.B. van Bragt, C.H.M. van Kemenade and J.A. La Poutré

Evolutionary Computation 1 was the first session in which technical papers from the conference were presented. It took place in parallel with two other sessions, but nevertheless both presentations were well attended. They took place in the Koning Willem II room; however, despite its attractiveness, including views of the beautiful grounds of the castle, both presenters kept their audience's attention.

David van Bragt of the CWI ([email protected]) presented his work on the importance of different selection techniques for the evolution of co-operation when using Genetic Algorithms (GAs) with the Prisoner's Dilemma, a non-zero-sum game. This was a resubmission of a paper he presented at the 5th International Conference of the Society for Computational Economics on Computing in Economics and Finance (CEF'99), Boston College, USA, 24-26 June. An extended version (with the same title and authors) will appear in the journal Computational Economics. The Iterated Prisoner's Dilemma (IPD) is a game that, following Robert Axelrod's work, has often been taken as a simple model of economics and psychology. In particular, the IPD is a suitable model for investigating the emergence of co-operative behaviour in spite of the immediate rewards to both parties for cheating on the other. David showed that the selection scheme is crucial to the evolution of populations of co-operative agents which are stable against invasion by more selfish behaviour. Several selection schemes were investigated. Each can be taken as a model of a different economy. For example, "Fitness Proportionate" requires global knowledge, while "Tournament Selection" uses only local information. In these simulations each agent's score is obtained by pairing it with 12 other agents from the same population (60) and playing IPD with each. Thus the agents' behaviours coevolve, unlike in Axelrod's early experiments, where several hand-coded opponents were provided.

Selection and the Evolution of Co-operative Populations

                            Fitness proportionate        Tournament size
                            direct     sigma scaling     2         4      8
    Generational            none       ok                ok        ok     unstable
    Elitist recombination   N/A        N/A               v. good   -->    less stable

David showed that using the measured fitness of the agent directly in roulette-wheel (or fitness-proportionate) selection gives too little selective advantage, and co-operative behaviour does not evolve. However, if the selection pressure is increased (e.g., by using binary tournament selection or by rescaling the fitness measure to give more children to slightly better parents), good behaviour arises which is reasonably stable. However, if the selection pressure is increased further (e.g., a tournament size of 8), co-operative behaviour arises quickly, but the population is randomly invaded by cheats, who rapidly increase in number and take over the whole population. Surprisingly, co-operative behaviour can evolve again in a population of cheats. David also showed that, as this varies between runs, averaging across many (25) runs yields smooth graphs but conceals the true random behaviour. Thus large tournament sizes appear to give rise to populations that on average score only marginally lower than with small tournament sizes. David also investigated Thierens' elitist recombination (ER). This guarantees that high-scoring agents or their successful children will be carried into the next generation. This was found to be the most successful at evolving co-operation and at providing protection against cheats for small tournament sizes. If the selection pressure was again increased, by increasing the tournament size, the population again became unstable, so that high-scoring cheats could evolve and multiply. However, co-operative behaviour was not displaced and quickly re-asserted itself (thus there is only a small drop in average performance with ER and large tournament sizes). Even in "good" populations there are some cheats. Weak selection and ER prevent their numbers from growing. However, very strong selection promotes the occasional high-scoring cheat rapidly. If ER is not used, they can take over the population. With lower selection pressure, co-operating agents have more time to spot and punish cheats. If the cheats were not punished, they would have scores even higher than the co-operating agents and so eventually take over the population. It might also be that the co-operating agents evolved with weak and strong selection are different. It would be interesting to know if different strategies evolve and how they change over time. One conclusion to be drawn is that it is vital in multi-agent simulations that all the relevant parameters are included in published descriptions.
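The selection-scheme experiment is easy to reproduce in miniature. The sketch below pairs each agent with 12 opponents from a population of 60, as in the talk, and uses k-ary tournament selection so that the selection pressure can be varied via the tournament size. The memory-one strategy encoding, payoff values, and mutation rate are my own illustrative assumptions; the paper's actual representation was not described in the report. The printed mean payoff per round distinguishes co-operation (close to 3) from mutual defection (close to 1).

    import random

    PAYOFF = {('C','C'): (3,3), ('C','D'): (0,5), ('D','C'): (5,0), ('D','D'): (1,1)}

    def play_ipd(a, b, rounds=50):
        """Iterated PD between two memory-one genomes.
        Genome: [first move, reply to CC, CD, DC, DD], each 'C' or 'D'."""
        idx = {('C','C'): 1, ('C','D'): 2, ('D','C'): 3, ('D','D'): 4}
        ma, mb = a[0], b[0]
        sa = sb = 0
        for _ in range(rounds):
            pa, pb = PAYOFF[(ma, mb)]
            sa, sb = sa + pa, sb + pb
            ma, mb = a[idx[(ma, mb)]], b[idx[(mb, ma)]]
        return sa, sb

    def fitness(pop, opponents=12):
        scores = [0.0] * len(pop)
        for i, agent in enumerate(pop):
            for j in random.sample([k for k in range(len(pop)) if k != i], opponents):
                sa, _ = play_ipd(agent, pop[j])
                scores[i] += sa
        return scores

    def tournament(pop, scores, size):
        """k-ary tournament selection: larger size = stronger selection pressure."""
        best = max(random.sample(range(len(pop)), size), key=lambda k: scores[k])
        return pop[best][:]

    def evolve(pop_size=60, generations=100, tsize=2, pmut=0.05):
        pop = [[random.choice('CD') for _ in range(5)] for _ in range(pop_size)]
        for g in range(generations):
            scores = fitness(pop)
            nxt = []
            for _ in range(pop_size):
                child = tournament(pop, scores, tsize)
                for i in range(5):
                    if random.random() < pmut:
                        child[i] = 'C' if child[i] == 'D' else 'D'
                nxt.append(child)
            if g % 10 == 0:
                print(f"gen {g:3d}: mean payoff/round = {sum(scores)/(pop_size*12*50):.2f}")
            pop = nxt
        return pop

    evolve(tsize=2)   # try tsize=8 to raise the selection pressure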
Solving Constraint Satisfaction Problems with Heuristic-based Evolutionary Algorithms
B.G.W. Craenen, A.E. Eiben and E. Marchiori

This work complements recent work by his co-authors on CSPs (for example, Parallel Problem Solving from Nature, Amsterdam, PPSN 1998, LNCS 1498, pp. 196-205), in which they used an adaptive performance measure combined with a genetic algorithm. Here Bart Craenen looked at the other class of evolutionary algorithms used to solve CSPs, i.e., the class where heuristics are combined with genetic algorithms using an integer representation. By systematically changing two probability parameters (tightness and density), 250 random CSPs of different difficulties were generated. The idea was for the heuristic+GA to find any solution that satisfied all the constraints. The CSPs ranged from the very easy (where all approaches found a solution) to the very difficult (where none found any; indeed, it is possible that these have no solutions). Bart highlighted the most interesting "mushy" region, where performance between GAs and other approaches varies. He tried three heuristic+GA approaches, chosen to represent most of the heuristic field. They include genetic repair after traditional crossover/mutation (ESP-GA), heuristic-based genetic operators (H-GA), and a sophisticated constraint-network method (Arc-GA). The same population size (10) was used with each. The three heuristic+GA approaches had similar performances. He gave data from a different GA approach (Dozier, WCCI-94) that gives much better performance in the mushy region. Bart suggests this is because Dozier's MID GA both uses a heuristic within the mutation operator and adaptively changes the fitness function, i.e., spans both classes of CSP solvers. In the future Bart intends to investigate several mechanisms that adapt the GA's fitness measure.
MACHINE LEARNING AND NEURAL NETWORKS 1
Report by Jan van den Berg
Erasmus Universiteit Rotterdam

Mining Frequent Item Sets in Memory-resident Databases
W. Pijls and J.C. Bioch

Wim Pijls gave the first talk in this session. The present-day availability of large and cheap RAMs makes it possible to first store and then analyse databases in new ways, by exploiting the features of memory-residentness. In this paper, the quick retrieval of any entry of the database is an important precondition. In a clear way, Wim explained the working of the new algorithm for mining 'frequent item sets'. A 'trie' (not a tree) is the basic data structure used. This trie is gradually constructed using an efficient depth-first search (which demands the above-mentioned quick retrieval of all entries). During the process, 'candidates' may become 'frequent item sets', namely if the number of occurrences of the candidate in the database surpasses a given threshold. The talk finished with a presentation of some experimental work. In the discussion, the audience suggested studying time and space complexity as well; however, this was also suggested in the paper by the authors themselves.
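The depth-first construction over a trie can be sketched as follows: nested dictionaries stand in for the trie, each candidate is counted against the memory-resident transaction list, and only candidates that pass the support threshold are extended. This is a generic illustration of the idea, not the authors' algorithm; the data and all names are invented.

    def mine_frequent(transactions, min_support):
        """Depth-first mining of frequent item sets over a memory-resident
        database. Candidate item sets live in a trie (nested dicts); a
        candidate becomes frequent when its count passes min_support."""
        transactions = [sorted(set(t)) for t in transactions]
        items = sorted({i for t in transactions for i in t})

        def support(itemset):
            s = set(itemset)
            return sum(1 for t in transactions if s.issubset(t))

        trie, frequent = {}, {}

        def extend(node, prefix, candidates):
            for k, item in enumerate(candidates):
                itemset = prefix + (item,)
                count = support(itemset)
                if count >= min_support:          # candidate becomes frequent
                    child = {}
                    node[item] = child            # grow the trie
                    frequent[itemset] = count
                    extend(child, itemset, candidates[k+1:])   # depth-first

        extend(trie, (), items)
        return frequent

    db = [['a','b','c'], ['a','c'], ['a','d'], ['b','c'], ['a','b','c']]
    for itemset, count in sorted(mine_frequent(db, 2).items()):
        print(itemset, count)

Because only frequent prefixes are ever extended, infrequent branches are pruned exactly as in Apriori-style algorithms, while the depth-first order keeps at most one path of the trie under construction at a time.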
Statistical Analysis of Gene Expression Data
L.F.A. Wessels, M.J.T. Reinders, R. Baldocchi, and J. Gray

In the next talk, some preliminary data-analysis studies were presented. The motivation for this work lies in the idea that the activity levels of genes provide important information on various cell processes. Understanding these processes is crucial for the development of a therapy for specific kinds of cancer. In his talk, Mr. Wessels explained how various clustering approaches have been tried while changing both the pre-processing of the data (here, the time signals) and the distance measure of the clustering technique itself. Next, some illumination was given on the interpretation of the clustering results (not easy for BNAIC people, since most of them are not especially familiar with the biology of living cells!). Using expert knowledge, it appeared to be possible to extract certain biologically significant hypotheses from the clusters found. More experiments (to be performed in the near future) are needed
to validate these hypotheses. To finish, an interesting observation by the speaker was the idea that - in complex areas like the activity levels of genes - existing domain knowledge may guide the selection of the right clustering technique.
BELIEF NETWORKS
Report by Marc Gyssens
Limburgs Universitair Centrum

Focused Quantification of a Belief Network using Sensitivity Analysis
N. Peek, V. Coupé, and J. Ottenkamp

Since quantification of a Bayesian belief network is hard, the authors propose to perform a sensitivity analysis first, to reveal which parameters are the most influential on the performance of the network. The quantification effort can then be focused on these parameters, while rough estimates suffice for the other parameters. The main shortcoming of the authors' proposal, which requires further research, is that the effects of refinements are non-monotonic, and that it is therefore difficult to establish a stopping criterion.

Exploiting Non-monotonic Influences in Qualitative Belief Networks
S. Renooij and L.C. van der Gaag

In the subsequent paper, the authors tackle the same problem, but with another solution, namely the use of a qualitative rather than a quantitative belief network. They focus on the inclusion of non-monotonic relationships on binary variables. In the future, they would like to extend their work to more complicated non-monotonic relationships and more general variables.

VAS: Quantifying a Qualitative Network
J. Donkers, R. Ferreira, J. Uiterwijk, and H.J. van den Herik

The last paper of the session presents ongoing work and somehow establishes a link between the two previous papers. VAS is an existing qualitative tool in a decision-support system used for rapid problem assessment by policy developers. The authors discuss how to transform this qualitative tool into a quantitative Bayesian probability network.

SEARCH
Report by Ida Sprinkhuizen-Kuyper
Universiteit Maastricht

A Genetic Local Search Algorithm for Random Binary Constraint Satisfaction Problems
E. Marchiori and A. Steenbeek

Elena Marchiori presented the talk. She discussed the class of random binary constraint satisfaction problems (RBCSP), characterized by four parameters. She considered an evolutionary algorithm combined with the improvement of the children by a well-chosen non-deterministic local optimisation procedure before determining their fitness and adding them to the population (Lamarckian evolution). The results obtained by this algorithm, RIGA, were compared to the results of MIDA by Dozier et al., and the comparisons seem very promising.
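The report does not spell out the four parameters; in the commonly used random binary CSP model they are the number of variables n, the domain size m, the constraint density p1, and the constraint tightness p2. Under that assumption, a generator looks like this (an illustrative sketch, not the authors' code):

    import random

    def random_binary_csp(n, m, p1, p2, seed=None):
        """Random binary CSP in the common four-parameter model: n variables,
        domain size m, constraint density p1, tightness p2. Returns (domains,
        constraints); constraints maps a variable pair to its forbidden value pairs."""
        rng = random.Random(seed)
        domains = {v: range(m) for v in range(n)}
        constraints = {}
        for i in range(n):
            for j in range(i + 1, n):
                if rng.random() < p1:                   # density: constrain this pair
                    forbidden = {(a, b)
                                 for a in range(m) for b in range(m)
                                 if rng.random() < p2}  # tightness: forbid value pairs
                    constraints[(i, j)] = forbidden
        return domains, constraints

    def conflicts(assignment, constraints):
        """Number of violated constraints; a solution has zero conflicts."""
        return sum((assignment[i], assignment[j]) in bad
                   for (i, j), bad in constraints.items())

    domains, cons = random_binary_csp(n=15, m=5, p1=0.3, p2=0.2, seed=1)
    assignment = [random.randrange(5) for _ in range(15)]
    print("violated constraints:", conflicts(assignment, cons))

Sweeping p1 and p2 over a grid is what produces the spectrum from trivially satisfiable to almost certainly unsatisfiable instances, with the interesting "mushy" region in between.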
Investigating pn2 Search
D.M. Breuker, J.W.H.M. Uiterwijk, and H.J. van den Herik

Jos Uiterwijk presented pn2 search. This search technique is an alternative to, or extension of, proof-number search. To save memory, a secondary pn search is used for evaluating the leaf nodes of the first search. Using an adequate test set of chess problems, it was shown that the pn2-search algorithm is a viable alternative with many advantages, the most important one being a considerable performance improvement in finding solutions.

HESSA solves the Job Shop Scheduling Problem
P. van Dael, D. Devogelaere, and M. Rijckaert

Patrick van Dael presented a paper in which an evolutionary search-scheduling algorithm (ESSA) is considered for the most difficult JSSPs. The ESSA proposed is a hybrid approach (HESSA) that focuses on the optimisation of locally optimised solutions. The new ESSA is applied to a benchmark problem. The results demonstrate that the new hybrid ESSA can be used to solve a JSSP.

The presentations about the evolutionary search algorithms both showed that combining these algorithms with other search algorithms (hybrid approaches, local search) is both possible
and very promising for obtaining good results for difficult search problems.
AGENT TECHNOLOGY 1
Report by Catholijn Jonker
Vrije Universiteit Amsterdam

A Formal Semantics of the Core of AGENT-0
K.V. Hindriks, F.S. de Boer, W. van der Hoek, and J.-J. Meyer

The formal semantics of agent languages was discussed by Koen Hindriks. One of the standard benefits of formalization has proved itself again: a number of gaps were discovered in the description of AGENT-0. However, the main focus of the presentation was on understanding decision making within agents that are specified in either AGENT-0 or 3APL. By providing two easy-to-understand syllogisms with respect to decision making, the authors managed to provide insight not only into the static semantics of both AGENT-0 and 3APL, but also into the differences between the two languages. The primary differences are that within AGENT-0 all applicable goals will be applied in the future and (in principle) no retraction of goals is possible, whereas in 3APL a preference ordering ensures that only preferred goals will be applied in the future and, furthermore, retraction of those goals is possible. Although the semantics provided for AGENT-0 is not complete (it covers a more or less coherent subset of the language), it is precise enough to give the reader a global understanding of what decision making in AGENT-0 entails. On the question whether the dynamics introduced by time (which is not considered in this article) will be the subject of ongoing research, Koen answered that they will probably not do this for AGENT-0, but that they will address it for 3APL.

Intelligent Agents, Markets and Competition – Consumers' Interests and Functionality of Destination Sites
K. Jonkheer

Given agent languages and agent architectures, it is possible to construct intelligent agents for all kinds of application areas. One of the areas experiencing booming growth in recent years is electronic commerce. In Intelligent agents, markets and competition – consumers' interests and functionality of destination sites, Kees Jonkheer addressed the influence that intelligent agents can have on electronic commerce. By studying customer interests and supplier strategies, a number of potential effects of intelligent agents on competition in electronic commerce were identified, and a statistical analysis of websites for travelling revealed that the functionality of those websites is relatively moderate. The statistical analysis provides a means to methodically identify the need to improve the functionality of websites. Furthermore, it makes clear that although a lot of companies present themselves on the web, a lot of work remains to be done to exploit the benefits of the web to their advantage. Some of the factors that will ensure the substantial influence of intelligent agents on electronic commerce are the use of digital passports and personal mobile portals. Digital passports could form the basis for buyer coalitions (e.g., www.travel-for-less.com for obtaining travel tickets), whereas personal mobile portals evolve from agents interfacing between the web and human users. The supplier strategies deal with the functionality of websites, but also with product-supply mechanisms (possibly involving boycotts and digital cartels). As a conclusion, Kees presented a list of both beneficial and potentially hazardous influences of intelligent agents on electronic commerce.

Deliberative Normative Agents: Principles and Architecture
A. Castelfranchi, F. Dignum, C.M. Jonker, and J. Treur

Having agent languages for programming (multi-)agent systems does not prescribe the architecture of the agents in such systems. Catholijn Jonker addressed an agent architecture for deliberative normative agents. In the work presented, researchers from different backgrounds (social sciences and artificial intelligence) cooperated to tackle the problem of dealing with unexpected events and dynamic environments. Social science provided the insight that conscious reasoning with and about norms allows agents to deal with exceptional circumstances. Such an agent is not only able to violate conventions (if that is deemed more important in a specific situation than adhering to the norm), it is also capable of coping with another agent's violation of a norm. In cases where the standard, hardwired cooperation protocols are insufficient, deliberative normative agents are autonomous in that they can decide to obey a norm that helps the cooperation over the missing information in the protocol or disambiguates that protocol. In cases where the hardwired protocols lead to inconsistencies and thus fail, deliberative normative agents, by obeying norms or consciously violating some of them, can go on where the hardwired agent fails.
KNOWLEDGE REPRESENTATION AND SYSTEMS
Report by Jos Uiterwijk
Universiteit Maastricht

Automatic Reuse of Knowledge: A Theory
P. Beys and M. Jansen

This session consisted of two presentations. In the first one, Pascal Beys from the Social Sciences Informatics Institute of the Faculty of Psychology, University of Amsterdam presented a consistent theory about the description of reusable knowledge components. A KADS-like approach setting some bases for automating the process of software reuse was described. Using the vocabulary of set algebra, the main components of a KBS, i.e., goals, methods, and domains, are formulated. An experimental system was shown and preliminary results for toy domains were discussed. The results were promising, but application to real-world problems is awaited. The full paper was published in the Proceedings of the 12th Banff Knowledge Acquisition Workshop.

Describing Problem Solving Methods using Anytime Performance Profiles
A. ten Teije and F. van Harmelen

The second paper was presented by Annette ten Teije (also on behalf of Frank van Harmelen) of the Department of AI of the Vrije Universiteit Amsterdam. She proposed the use of anytime performance profiles to describe the computational behaviour of problem-solving methods. A performance profile describes how the quality of the output of an algorithm gradually increases as a function of the computation time. Anytime algorithms are algorithms that give an answer irrespective of the amount of time allocated, and the quality of the answer is expected to grow with computation time. As an example, the performance profiles for three different methods for a classification task are given and their feasibility is discussed. The full paper was published in the Proceedings of the IJCAI-99 Workshop on Ontologies and Problem-Solving Methods: Lessons Learned and Future Trends, and is available in PDF format from http://SunSITE.Informatik.RWTH-Aachen.DE/Publications/CEUR-WS/Vol-18/10-tenteije.pdf.
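Performance profiles are easy to visualize with any interruptible optimiser. The toy sketch below records (time, quality) pairs while improving a random travelling-salesman tour with 2-opt moves; the recorded list is exactly a performance profile, and the loop can be cut off at any budget. The task and all names are illustrative; the paper's classification methods are not reproduced here.

    import random, time

    def anytime_tour(cities, budget_s):
        """Toy anytime optimiser: 2-opt improvement of a random tour.
        Returns the best tour found plus its performance profile, i.e. the
        (elapsed time, tour length) pairs at which the answer improved."""
        def dist(a, b):
            return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
        def length(tour):
            return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

        tour = cities[:]
        random.shuffle(tour)
        best = length(tour)
        profile = [(0.0, best)]
        start = time.perf_counter()
        while (elapsed := time.perf_counter() - start) < budget_s:
            i, j = sorted(random.sample(range(len(tour)), 2))
            cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # 2-opt move
            if (l := length(cand)) < best:
                tour, best = cand, l
                profile.append((elapsed, best))       # quality improved: log it
        return tour, profile

    cities = [(random.random(), random.random()) for _ in range(40)]
    _, profile = anytime_tour(cities, budget_s=0.5)
    for t, quality in profile[:5]:
        print(f"t = {t:.4f} s   tour length = {quality:.3f}")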
KNOWLEDGE REPRESENTATION AND SYSTEMS Report by Jos Uiterwijk Universiteit Maastricht
SPECIAL SESSION AI AND LAW Report by Yao-Hua Tan Erasmus Universiteit Rotterdam
Automatic Reuse of Knowledge: A Theory P. Beys, and M. Jansen
Transfer of knowledge in the legal domain L. Mommers
This session consisted of two presentations. In the first one, Pascal Beys from the Social Sciences Informatics Institute of the Faculty of Psychology, University of Amsterdam, presented a consistent theory for the description of reusable knowledge components. A KADS-like approach was described that lays some of the groundwork for automating the process of software reuse. Using the vocabulary of set algebra, the main components of a KBS, i.e., goals, methods, and domains, are formulated. An experimental system was shown and preliminary results for toy domains were discussed. The results are promising, but application to real-world problems is still awaited. The full paper was published in the Proceedings of the 12th Banff Knowledge Acquisition Workshop.

Describing Problem Solving Methods using Anytime Performance Profiles A. ten Teije and F. van Harmelen

The second paper was presented by Annette ten Teije (also on behalf of Frank van Harmelen) of the Department of AI of the Vrije Universiteit Amsterdam. She proposed the use of anytime performance profiles to describe the computational behaviour of problem solving methods. A performance profile describes how the quality of the output of an algorithm gradually increases as a function of the computation time. Anytime algorithms are algorithms that give an answer irrespective of the amount of time allocated, where the quality of the answer is expected to grow with computation time. As an example, the performance profiles of three different methods for a classification task are given and their feasibility is discussed. The full paper was published in the Proceedings of the IJCAI-99 Workshop on Ontologies and Problem-Solving Methods: Lessons Learned and Future Trends, and is available in PDF format from http://SunSITE.Informatik.RWTH-Aachen.DE/Publications/CEUR-WS/Vol18/10-tenteije.pdf
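To make the notion of a performance profile concrete, here is a minimal sketch in Python (mine, not taken from the paper; the task and the quality measure are invented for illustration): an anytime estimator of pi whose answer can be requested at any moment, and whose empirical performance profile is the list of (time budget, error) pairs.

    import math
    import time

    def anytime_pi(budget_seconds):
        # Leibniz series for pi: an anytime algorithm whose answer
        # improves (slowly) the longer it is allowed to run.
        estimate, sign, k = 0.0, 1.0, 0
        deadline = time.perf_counter() + budget_seconds
        while time.perf_counter() < deadline:
            estimate += sign * 4.0 / (2 * k + 1)
            sign, k = -sign, k + 1
        return estimate

    # Empirical performance profile: quality of the answer (here the
    # absolute error) as a function of the allocated computation time.
    profile = [(b, abs(anytime_pi(b) - math.pi)) for b in (0.001, 0.01, 0.1)]
    print(profile)  # the error shrinks as the time budget grows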
SPECIAL SESSION AI AND LAW Report by Yao-Hua Tan Erasmus Universiteit Rotterdam

Transfer of knowledge in the legal domain L. Mommers

Laurens Mommers addressed the issue of the transfer of legal knowledge from one agent to another, in particular when the knowledge is transferred from an information system to a human user. Especially in the legal domain it is essential that you can be sure that it is knowledge rather than mere belief that is transferred. To make this distinction operational, Mommers applies several qualifications that distinguish legal knowledge from mere belief. Examples of such qualifications are truth, justification with reasons, production by reliable sources, and coherence with general legal background information. These qualifications are taken from philosophical theories of knowledge. Mommers showed that different types of legal knowledge require different types of knowledge qualifications. For example, truth is a very important qualification for testimonial beliefs, while it is not applicable to interpretative beliefs: when a judge gives a specific interpretation of a legal rule, this is not a statement that can be true or false. For the transfer of legal beliefs, Mommers showed that the criteria of knowledge transfer differ between testimonial and interpretative beliefs.
Automated Argument Assistance for Lawyers B. Verheij
Bart Verheij presented an improved version (2.0) of the ArguMed system that he has been developing over the last couple of years. The ArguMed system is a graphical support tool for legal argumentation. The graphical interface enables lawyers to structure their arguments according to argumentation theory. For example, the system might propose the argumentation move of attacking the claim of your opponent by undercutting the justifications for his claim. These undercutting structures can get quite complicated; hence, a graphical support tool is very helpful in constructing defence or attack strategies in legal disputes. The problem with the previous version 1.0 of ArguMed was that it did not represent these undercutters explicitly: it only indicated that an argument was defeated by an undercutting argument, but it did not graphically represent this undercutting argument. In the new version 2.0 this explicit graphical representation of the undercutter argument is implemented. For this implementation the underlying argumentation theory of the 1.0 version had to be extended with a new type of argument, the so-called dialectical arguments, which contain attacks by undercutters.
Generating Exception Structures for Legal Information Serving R. Winkels, D.J.B. Bosscher, A.W.F. Boer and J.A. Breuker
Doeko Bosscher presented the research that he had done with his co-authors Radboud Winkels, Alexander Boer and Joost Breuker. They presented a legal information server that can answer queries about ship classification. Ship classification is based on normative rules, such as the rule that a cargo ship should not have more than 13 passengers. The problem is that in many cases there are more specific exception rules that lead to different conclusions, for example the exception rule that a special subtype of cargo ships, namely bulk carriers, is permitted to have more than 13 passengers. Since the information about bulk carriers is more specific than the information about carriers, the exception rule prevails. The problem is that users of the system often do not provide, in their queries, information detailed enough to trigger the most specific norm. In our example, if the user only states in his query that his ship is a carrier, while in fact it is a bulk carrier, and he asks whether he is allowed to carry 15 passengers, then the system will not be able to apply the most specific information, and hence it will give the wrong answer that he is not allowed to do so. This problem could be prevented if the legal information server added to its answer the extra information that the conclusion holds unless the carrier is a bulk carrier. Bosscher described how they added this functionality to the server, based on an ingenious idea: computing a tree of the classification rules that represents most of the exception structure between the norms in a database.
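The specificity mechanism described here is easy to illustrate. The Python sketch below is not the authors' server; the rule format and the two ship-classification rules are invented stand-ins showing how a more specific rule overrides a general one, and how an answer can be annotated with the exceptions under which it no longer holds.

    # Each rule: (conditions, conclusion). A rule is more specific than
    # another applicable rule if its conditions are a strict superset.
    RULES = [
        ({"cargo_ship"}, "at most 13 passengers allowed"),
        ({"cargo_ship", "bulk_carrier"}, "more than 13 passengers allowed"),
    ]

    def answer(facts):
        applicable = [r for r in RULES if r[0] <= facts]
        most_specific = max(applicable, key=lambda r: len(r[0]))
        # Exceptions: rules the query might have triggered with more facts.
        exceptions = [r for r in RULES if r[0] > most_specific[0]]
        caveats = [f"unless also {sorted(r[0] - most_specific[0])}: {r[1]}"
                   for r in exceptions]
        return most_specific[1], caveats

    print(answer({"cargo_ship"}))
    # ('at most 13 passengers allowed',
    #  ["unless also ['bulk_carrier']: more than 13 passengers allowed"])
    print(answer({"cargo_ship", "bulk_carrier"}))
    # ('more than 13 passengers allowed', [])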
LOGIC AND REASONING 2 Report by Wiebe van der Hoek Universiteit Utrecht

The logic of Knowledge Games: showing a card H. van Ditmarsch
Whereas several books and papers on modal epistemic logic might suggest that this area is by now well explored, Van Ditmarsch showed that, by focusing on specific epistemic notions in some very simple and natural card games, there is still a lot to be formalised. His main interest is in the dynamics of the knowledge of several agents during such a game, where a simple action like one player A showing one card to another player B gives rise to rather complicated notions of learning: private learning (agent B learns a card of agent A), subgroup learning (A and B learn which card B now knows of A), and group learning (the group as a whole learns that A and B learn what was described above). Van Ditmarsch's focus is on a semantic understanding of such actions: he demonstrated how the resulting Kripke models emerge out of such 'show' actions, given an initial Kripke model in which the action is performed.

On Criteria for Formal Theory Building J. Kamps

Kamps argues that the availability of automated reasoning tools has led to a renewed interest in axiomatising social scientific theories. A formalisation can provide clarity about such theories and their consequences, and formal claims, like the consistency and falsifiability of theories and the soundness of derivations, can be automatically verified (these properties are called 'criteria' by Kamps). The author has taken a classic organisation theory as a test example for such criteria. During this test it was not only established that some given claims about the example theory appeared to be wrong, but also that the criteria themselves could be used during the process of formalisation of the theory. Kamps then argues for a generalisation of this claim: rather than regarding the axiomatisation of a theory as a final and finishing step in its evolution, we should treat the process of axiomatisation as an important way of developing the theory itself.
Jan Heemskerk of KPN Research

MACHINE LEARNING AND NEURAL NETWORKS 2 Report by Bert Kappen Kath. Universiteit Nijmegen

Information Retrieval Systems using an Associative Conceptual Space and Self-Organising Maps M. Schuemie and J. van den Berg

The contribution by Schuemie (TUD) and Van den Berg (EUR) addresses the problem of visualising large sets of documents, in particular books. A common problem in this area is that someone interested in, for instance, books on Greek philosophy may miss the title “Collected works of Plato”, since it does not match any of the search queries. This example illustrates that a distance measure expressing semantic similarity is needed to obtain useful visualisations. The new idea in this paper is that a set of significant words appearing in documents can be obtained from the subject indices at the end of the books. Furthermore, the co-occurrence of these words on the same page of a book can be used as a measure of semantic similarity, and words can then be clustered using this similarity. The visualisation of documents proceeds in the standard WEBSOM way: each document is represented as a histogram over the semantic word clusters.
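The co-occurrence idea can be sketched as follows in Python (an illustration, not the authors' implementation; the page data are invented): count how often two index words share a page and use the count as a crude semantic similarity, on which any standard clustering could then operate.

    from collections import defaultdict
    from itertools import combinations

    # Invented toy data: for each page, the index words occurring on it.
    pages = [
        {"plato", "ethics"},
        {"plato", "dialogue", "ethics"},
        {"neural", "network"},
        {"network", "learning"},
    ]

    cooc = defaultdict(int)
    for words in pages:
        for a, b in combinations(sorted(words), 2):
            cooc[(a, b)] += 1

    def similarity(a, b):
        # Co-occurrence count as a crude semantic similarity.
        return cooc.get(tuple(sorted((a, b))), 0)

    print(similarity("plato", "ethics"))   # 2: often on the same page
    print(similarity("plato", "network"))  # 0: never co-occur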
Notes on Embedding a Trained Neural Network W. Peng, J. Nijhuis, and L. Spaanenburg

The paper by Peng, Nijhuis and Spaanenburg (RUG) addresses the problem of quantising the continuous weights of a trained neural network for hardware implementation. As one may expect, when the number of bits used to represent the weights is too small, the performance degrades. Four different strategies for quantisation are proposed. They differ in whether the quantisation interval is determined per neuron or globally for the whole network. The first strategy is clearly the most accurate, but requires more memory and more computation.
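The per-neuron versus global distinction is easy to make concrete. Below is a minimal Python sketch (mine, not the paper's exact strategies): uniform quantisation of a weight matrix to a given number of bits, where the quantisation interval is derived either from the whole network or per output neuron (per row of the weight matrix).

    import numpy as np

    def quantise(w, bits, per_neuron=False):
        # Uniform quantisation of weights to 2**bits levels.
        # per_neuron=True derives the interval per output neuron (row),
        # which is more accurate but needs one scale per neuron.
        levels = 2 ** bits - 1
        axis = 1 if per_neuron else None
        lo = w.min(axis=axis, keepdims=per_neuron)
        hi = w.max(axis=axis, keepdims=per_neuron)
        scale = (hi - lo) / levels
        return np.round((w - lo) / scale) * scale + lo

    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 8))
    for bits in (2, 4, 8):
        err_global = np.abs(w - quantise(w, bits)).max()
        err_neuron = np.abs(w - quantise(w, bits, per_neuron=True)).max()
        print(bits, round(err_global, 4), round(err_neuron, 4))
    # Fewer bits give a larger error; per-neuron intervals reduce it.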
ROBOTICS AND VISION 2 Report by Rineke Verbrugge Universiteit Groningen

Grounding a Lexicon in a Coordination Task on Mobile Robots P. Vogt
The first speaker's research falls within the recent tradition at the AI Lab in Brussels, where researchers, under the leadership of Luc Steels, aim at understanding the origins and evolution of language and meaning. In the present talk, the central experiment features two mobile robots, the “speaker” and the “hearer”. Their task is to develop categories and a lexicon, so that they can communicate planned actions such as continuing straight on or going to the left. This is meant to be realised during a series of so-called follow-me games. Technically, the robots first categorise segments of the time series of motor commands, using the method of delays recently developed by Rosenstein and Cohen. Then the “speaker” needs to name the category by a so-called form, on the basis of previous form-meaning associations in its lexicon. Subsequently, the “hearer” tries to decode the received form using the form-meaning associations from its own lexicon, and then performs the action, if it understood the message. It turns out that categorisation is achieved quite successfully. Naming, on the other hand, stays behind at a 55% success rate, so that the follow-me games as a whole are not very successful yet.

The Trade-off between Spatial and Temporal Resolution in Visual Systems R. Kortmann, E. Postma, and H.J. van den Herik

The second paper was presented by Rens Kortmann, who was motivated by the biological phenomenon that spatial and temporal resolution are traded off against each other in animals. With some poetic licence, the speaker compared human beings to the fly, which has poor spatial resolution due to its compound eyes, but has excellent temporal resolution by which it “is able to see every frame of a movie separately”. He explained the concept of trade-off by the guiding metaphor of two opposing forces, represented in a picture by two springs pulling at two wheels connected through a rope, and two parameters, represented by two tags attached to the rope. By turning the wheels, high values for one parameter are traded for low values of the other. In a set of experiments, a pursuit task was set up for a simulated Khepera robot. The speaker argued convincingly that the two opposing forces of light sensitivity and response accuracy caused the trade-off in the artificial system, and explained the trade-off function between spatial and temporal resolution as deduced from the results of the experiments. Future work will hopefully include implementing the model in a real robotic system and validating it against real biological data.

SPECIAL SESSION AI IN MEDICINE Report by Arie Hasman Universiteit Maastricht
Detection and Assessment of the Severity of Levodopa-Induced Dyskinesia in Patients with Parkinson's Disease by Neural Networks N.L.W. Keijsers, M.W.I.M. Horstink, and C.C.A.M. Gielen
During the session there were three presentations. The first presentation concerned the classification and scoring of involuntary movements induced in Parkinson's disease patients by the drug levodopa. When the times at which these involuntary movements occur can be determined, one can adjust the timing and the dosage of the drug to reduce them. For the classification and scoring of the involuntary movements several neural nets were constructed (depending on the task that was used to elicit involuntary movements). It appeared that the optimal number of hidden units in the neural net was equal to one, and there was also only one output node. This raised the question why a hidden layer was necessary at all, and why other methods, such as non-linear discriminant analysis, could not be used instead. The neural nets appeared to perform better than the regression analysis (which was used by other investigators) in tasks where the patients also made voluntary movements.
Modelling the Psychoactive Drug Selection Application Domain at the Knowledge Level D.M.H. Van Hyfte, P.A. de Clercq, T.B. TjandraMaga, F.G. Zitman, and P.F. de Vries Robbé
The second contribution concerned the modelling of the psychoactive drug prescription process. Prescribing such drugs requires a lot of expertise in different domains. Although rule-based systems have been developed to support this type of prescription, these systems have some disadvantages, the most important one being the lack of deep knowledge, as a result of which the reasoning leading to the selection of a drug is quite superficial. The contribution dealt with two subtasks of the prescription process: generating candidate therapeutic drug options and checking for possible contra-indications. A task-based approach was followed and described. The domain knowledge was modelled at different levels of abstraction, and some examples were presented. No data were presented about the use of the model; only the way of modelling the prescription process was dealt with.
GuiDE: an architecture for the acquisition and execution of clinical guideline application tasks P.A. de Clercq, J.A. Blom, A. Hasman, and H.H.M. Korsten
The third contribution concerned an architecture for the acquisition and execution of clinical guideline application tasks. Usually, domain models are implicit in computerised rule-based guideline systems: these systems have no notion of either the application domain or the employed problem-solving strategy, which makes them difficult to maintain. This contribution described a modelling approach based on domain ontologies and problem-solving methods. The architecture presented consists of a set of reusable software components, and both design-time and execution-time models are available. Examples were presented where the architecture was successfully applied to develop decision support systems.

AGENT TECHNOLOGY 2 Report by Nico Roos Universiteit Maastricht

Open Multi-Agent Systems: Agent Communication and Integration R.M. van Eijk, F.S. de Boer, W. van der Hoek, and J.-J.Ch. Meyer

In the session on Agent Technology 2 there were three talks on papers that had already appeared elsewhere. Rogier van Eijk gave the first talk, on ‘Open multi-agent systems: Agent communication and integration’; the co-authors of the corresponding paper are Frank de Boer, Wiebe van der Hoek and John-Jules Meyer. In his talk Rogier addressed the problem of adding new agents to an existing multi-agent system. This requires the integration of an agent into the multi-agent system and the communication of the new agent's skills to the other agents in the system. An abstract programming language for open multi-agent systems, together with a formal semantics, is proposed for this purpose.
Overview of Knowledge Sharing and Reuse Components: Ontologies and Problem-Solving Methods A.G. Pérez and V.R. Benjamins

Richard Benjamins gave the second talk of the session. His talk was based on the paper ‘Overview of knowledge sharing and reuse components: Ontologies and problem-solving methods’, which he wrote together with Asunción Gómez Pérez. The idea discussed in the talk is that of applying ontologies and problem-solving methods (PSMs) to realise reusable components. Ontologies and PSMs might offer a solution to the problem of integrating reusable components; this integration problem holds that the way knowledge is represented affects the nature of a problem and the way it is solved. Ontologies can be used to describe the domain knowledge and PSMs can be used to describe the reasoning process, both in a domain-independent way.
A Multi-Agent Architecture for an Intelligent Website in Insurance C.M. Jonker, R.A. Lam, and J. Treur
In the last talk, Remco Lam presented an application of a multi-agent system. The talk was based on the paper ‘A multi-agent architecture for an intelligent website in insurance’, which he wrote with Catholijn Jonker and Jan Treur. The problem for customers visiting the website of an insurance company is finding the right information, and being able to find it again later. The customer wants someone to help him or her. To provide this help, two types of agents are introduced: Personal Assistants (PAs) and Website Agents (WAs). The PA notifies the customer of changes in offers and looks for items the customer needs; it does this by communicating with the WAs. The WAs build profiles of customers, make offers, and communicate with other WAs to redirect customers.

DEMONSTRATIONS 2 Report by Hans Henseler TNO-TPD
Silvie Spreeuwenberg of LibRT
A Knowledge Based Tool to Validate and Verify an Aion Knowledge Base S. Spreeuwenberg and R. Gerrits
The first demonstration in this Thursday afternoon session was presented by Silvie Spreeuwenberg (see picture) and Rik Gerrits from LibRT (pronounced as liberty). They presented Valens: ‘A Knowledge Based Tool to Validate and Verify an Aion Knowledge Base’. Aion is a development environment for rule-based systems. Valens was developed to validate and verify Aion knowledge bases and is itself written as an Aion application. It turns out that a knowledge base can give non-deterministic results depending on the order of the rules in the knowledge base, which was illustrated with a 19-rule knowledge base on car-trouble diagnosis.
KMD-MATE: An analysis and design environment for Knowledge Management Support J.H. van Lieshout and E.C. van de Stadt

The second demonstration was presented by Jan van Lieshout and Erica van de Stadt from WizWise Technology. The goal of knowledge management is to manage and support knowledge-intensive work, in particular where knowledge is involved that goes beyond ordinary information. Unfortunately their KMD-MATE tool was not yet ready, so they gave a presentation about the KMD (Knowledge Manipulation Diagram) concept instead. They explained that the tool should improve the process of knowledge dissemination by assisting a user in designing a KMD.

Mondriaan Art by Evolution J.J. van Hemert and A.E. Eiben

The third and final demonstration of this session was presented by Jano van Hemert of Leiden University. He demonstrated “Mondriaan Art by Evolution”, a genetic algorithm that generates Mondriaan look-alikes. Jano developed the system as an instruction tool for a course on evolutionary learning that is taught by A.E. Eiben at Leiden University. Basically, Jano designed a binary string that encodes the DNA of Mondriaan look-alikes, so that he can use regular mutation and crossover operators. The fitness function, however, is the user, who is presented every generation with 9 Mondriaan imitations (see picture) that must be rated. According to this rating the user's favourites are then recombined into the next generation.

Jano van Hemert and one of his Mondriaan generations
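The evolutionary loop described above is standard and easy to sketch in Python. The genome layout below is invented (the report does not give the demo's actual encoding): a bit string stands for a picture, the user's ratings act as the fitness function, and mutation plus crossover of the favourites produce the next generation of nine imitations.

    import random

    GENES = 60  # invented: 6 lines x 10 bits (position, orientation, colour)

    def random_genome():
        return [random.randint(0, 1) for _ in range(GENES)]

    def mutate(genome, rate=0.02):
        # Flip each bit with a small probability.
        return [b ^ (random.random() < rate) for b in genome]

    def crossover(a, b):
        cut = random.randrange(1, GENES)
        return a[:cut] + b[cut:]

    def next_generation(population, ratings):
        # ratings: the user's score per individual; the user IS the
        # fitness function. Favourites are recombined into 9 offspring.
        ranked = [g for _, g in sorted(zip(ratings, population),
                                       key=lambda pair: -pair[0])]
        parents = ranked[:4]  # keep the user's favourites
        return [mutate(crossover(*random.sample(parents, 2)))
                for _ in range(9)]

    population = [random_genome() for _ in range(9)]
    ratings = [random.uniform(0, 10) for _ in population]  # stand-in user
    population = next_generation(population, ratings)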
ROBOTICS AND VISION 3 Report by Edwin de Jong Universiteit Leiden

Evolving Visual Feature Detectors T. Belpaeme

The third session on Robotics and Vision started with Tony Belpaeme of the VUB AI Lab, who presented a paper on evolving visual feature detectors. The primitives were low-level image operators such as thresholds, spatial filters, and orientation-selective filters. Genetic programming was applied to images to evolve feature detectors based on these primitives. Whereas in this work entropy was used as the fitness function, future work aims to link the process to a task, such as discrimination.

Appearance Based Robot Localization A. Kröse, R. Bunschoten, N. Vlassis, and Y. Motomura
Ben Kröse (UvA) is interested in robot localisation based directly on the observations of the robot. With this approach he aims to remove the need for a geometric model of the environment, which is a very welcome step for real-world robot applications. The principal components of the video images received by the robot's omnidirectional camera are used to estimate the conditional probability of being at a particular location given the current sensor data. A subset of the components is chosen based on an entropy measure. In other work, the author describes an active vision scheme that detects and reduces uncertainty in the estimates.

Dealing with Environmental Dynamics P. András, E. Postma, and H.J. van den Herik

The third presentation in this session was that of Peter András (Universiteit Maastricht), who investigated the difference between a recurrent and a feedforward neural network in a prediction task. The task is to predict the ground-level position of a descending fly, based on the fly's current position, or on its position and movement. The feedforward network makes a one-shot estimate, whereas the recurrent network takes a position as input, estimates the next position, and arrives at an answer by repeatedly feeding its own output back into the inputs. The predictions of the recurrent network were more accurate. Recurrent neural networks are an important research topic since they overcome a fundamental limitation of feedforward networks: they contain internal state, and this internal state can be developed on the basis of the learning task. However, this difference between recurrent and feedforward networks was not the source of the difference in performance here; the recurrent connections were only active during performance, not during learning. Rather, the success of the recursive method is caused by the fact that the simpler approximations of the recurrent solution outweigh the accumulated approximation errors. Thus, the architecture of the network fits the structure of the problem very well; given the impossibility of a universal learning method, this is actually desirable. The conclusion, however, was that recurrent neural networks are to be preferred over feedforward networks when serving as control mechanisms for situated robots. Although I agree that recurrent networks are very promising and have advantages in this area, this conclusion is too general to draw from experiments on one particular problem.
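The contrast between a one-shot prediction and repeatedly fed-back one-step predictions can be sketched as follows in Python (an illustration under invented dynamics, not the paper's networks or task):

    def one_step(x):
        # Stand-in for a learned one-step model of a falling trajectory.
        return 0.9 * x + 1.0

    def one_shot(x0, horizon):
        # Feedforward style: one direct (and here crude) jump,
        # by linear extrapolation of the first step.
        return x0 + horizon * (one_step(x0) - x0)

    def iterated(x0, horizon):
        # Recurrent style: feed the prediction back in, step by step.
        x = x0
        for _ in range(horizon):
            x = one_step(x)
        return x

    x0, horizon = 5.0, 10
    print(one_shot(x0, horizon), iterated(x0, horizon))
    # Here the iterated route is exact because the one-step model is the
    # true dynamics; in practice, per-step errors may accumulate, and
    # which route wins depends on how those errors compose.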
EVOLUTIONARY COMPUTATION 3 Report by Elena Marchiori Universiteit Leiden

Comparing Genetic Programming Variants for Data Classification J. Eggermont, A.E. Eiben and J.I. van Hemert

First, Jeroen Eggermont (LIACS, Leiden University) presented work on data classification using genetic programming, which is contained in two papers written together with A.E. Eiben and J.I. van Hemert. The authors compare various variants of GP for classification. In particular, they propose a GP representation based on a class of Boolean formulas that allows one to construct more transparent classification models with GP.
Size Fair Tree Genetic Programming Crossover W.B. Langdon

Next, Bill Langdon (CWI) presented his work on size-fair crossover, a novel crossover operator for tree-based GP. The author shows by means of experiments that size-fair crossover controls the variation in the size of the trees, thus reducing the growth in tree size during the execution of the genetic program without affecting its performance.

On the Modelling of Evolutionary Algorithms P.A.N. Bosman and D. Thierens

Finally, Peter Bosman (Utrecht University) presented joint work with D. Thierens describing a general framework for developing evolutionary algorithms. The authors discuss the main issues to be addressed when designing a general EA framework, such as modularity and expandability, and they introduce a general system called the EA Visualizer, which provides adequate solutions to these issues.
AGENT TECHNOLOGY 3 Report by John-Jules Meyer Universiteit Utrecht
After being ill for a week, with piles of work in arrears, what is more pleasant and helpful for recovering fully and getting to grips with reality again than writing a session report on the latest BNAIC? By the way, I thought BNAIC’99 was very successful, both in its scientific contents and as a social event at a beautiful location. Well, back to business. The session I chaired was the third on agent technology and was dominated by research performed at the Free University Amsterdam. It consisted of three presentations in total, given by only two presenters.
An Electronic Market Place: Generic Agent Models, Ontologies and Knowledge M. Albers, C.M. Jonker, M. Karami and J. Treur
Catholijn Jonker filled most of the session with two presentations, the first being that of the paper above. This paper already appeared in this year's PAAM proceedings. It describes an agent-based electronic market architecture that has been designed compositionally, using elements from both agent technology and knowledge technology. Special attention was paid to the distributed and dynamic nature of the processes in, and systems supporting, an e-market, as well as to the knowledge intensity of the domains involved. Design and implementation were done by means of the well-known DESIRE method/system. In a lively presentation Catholijn illustrated the general ideas by means of the example of a car market. In the discussion, the issue of ‘trust’ in e-market situations was raised by Yao-Hua Tan, among other matters.

Specification of Behavioural Requirements within Compositional Multi-Agent System Design D.E. Herlea, C.M. Jonker, J. Treur and N.J.E. Wijngaards

Catholijn's second presentation concerned a paper which appeared earlier in this year's MAAMAW. I do not have these proceedings, so I have to rely on my rather limited memory of the presentation itself for a discussion of this paper. (This is entirely due to my own mental state and not to the communication abilities of the speaker.) A key element of the proposed design method for multi-agent systems was the inclusion of two specification languages: one for specifying design descriptions, and one for specifying (behavioural) requirements and scenarios. I do remember getting slightly confused during the presentation as to the place of validation versus verification in the complete picture.

Rights, Duties and Commitments between Agents L. van der Torre and Yao-Hua Tan

This paper was also presented at this year's IJCAI, so there is only a brief abstract in the BNAIC proceedings. The paper deals with quite an interesting issue in deontic reasoning, viz. the distinction between creating a certain deontic state (in which certain obligations and permissions hold) and evaluating such a deontic state. Leon presented a logic in which one can reason about both aspects, using Veltman's update semantics. Although the choice for the latter seems an intuitively clear one, I must say that I nevertheless found the resulting system rather overwhelming in its complexity. But perhaps this is just because I heard the presentation for the first time and did not have the IJCAI proceedings at hand.
DEMONSTRATIONS 3 Report by Erica van de Stadt WizWise Technology

The beautiful Limburg landscape and the historical “Vaeshartelt” castle formed the pleasant and stimulating environment for the Eleventh Belgium-Netherlands Conference on Artificial Intelligence. Traditionally, this conference has paid attention to the practical side of applying AI algorithms and techniques in real support systems, and today this is still the case. Therefore, besides the sessions on research papers, three “Demonstrations” sessions were scheduled. This note gives a short impression of one of these three sessions. In the conference programme this session was denoted with the generic term “Demonstrations 3” and contained three contributions, which I will discuss in the order in which they were presented.

“Wij kiezen partij voor u” Online Voting Advice W. Teepe

The system presented by Wouter Teepe is a (web-enabled) support tool that should help voters decide what political party to vote for. It is a very appealing application: apparently a lot of people are familiar with the feeling of uncertainty when casting their votes. An arranged volunteer demonstrated the system. This volunteer patiently gave his opinion on 40 theses, in terms of degrees of agreement. Initially, the preferences for all parties are equally distributed; based on the user's input, the individual preferences are then adjusted to reflect the user's opinions better. It seemed that the model used to combine the user's inputs relies on the assumption that the theses presented (and the user's reactions to them) are independent. During the lively discussion that followed this demonstration, it was questioned whether adding all inputs independently to obtain the final preference scores is a sensible thing to do, since it is extremely difficult to compose a set of truly independent theses.
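The additive model questioned in the discussion is easy to state. The Python sketch below is my reconstruction of the general idea, not Teepe's actual system; the parties and stances are invented. Every thesis response independently adds to or subtracts from each party's score, which is exactly where the independence assumption enters.

    # Invented stance matrix: party -> stance per thesis (+1 agree, -1 disagree).
    stances = {
        "Party A": [+1, -1, +1],
        "Party B": [-1, +1, +1],
    }

    def advise(answers):
        # answers: the user's degree of agreement per thesis, in [-1, 1].
        # Contributions are summed thesis by thesis, i.e. the theses are
        # assumed to be independent.
        scores = {p: 0.0 for p in stances}
        for i, a in enumerate(answers):
            for party, s in stances.items():
                scores[party] += a * s[i]
        return sorted(scores.items(), key=lambda kv: -kv[1])

    print(advise([0.8, -0.2, 0.5]))  # ranked party preferences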
Interactive and continuous visualization of EAs: The EA Visualizer P.A.N. Bosman and D. Thierens

In this demonstration, a system was shown that visually supports the exploratory selection and tuning of evolutionary algorithms. The system (a Java applet) provides screens where the user can interactively choose the kind of EA to explore, together with the appropriate parameters and their settings, a starting population and a fitness evaluation criterion. While the EA runs, several graphical representations of its progress and performance can be displayed, so that the user can monitor its dynamic behaviour. The tool has been used successfully in research projects and for educational purposes.

Promising Practical Fruits of AI L. Hulzebos (special talk by Bolesian)

The third demonstration was actually not a demonstration but a special talk by Bolesian (one of BNAIC's sponsors). In this talk, entitled “Promising Practical Fruits of AI”, L. Hulzebos sketched Bolesian's approach to bringing new, innovative AI techniques to the market. In this approach, Bolesian adopts “concepts”, as they call them, from the research communities and promotes their application in business systems. “Concepts” is a very broad term used to denote theories, techniques, algorithms, etc. As illustrations of new “concepts”, ontologies and Bayesian networks were mentioned. From the discussion it became clear that achieving business acceptance for AI innovations is not an easy task. Summarising, I think we can say that the demonstrations certainly illustrate practical applications of AI and contribute to a pleasant and varied conference programme.
LOGIC AND LEARNING THEORY Report by Richard Benjamins Universiteit Amsterdam

A Bound on the Cross-validation Estimate for Algorithm Assessment G. Bontempi and M. Birattari

The first paper of this session, by G. Bontempi and M. Birattari from the Université Libre de Bruxelles, turned out to be the prize-winning paper. The paper distinguishes two different ways of assessing a learning procedure: the hypothesis-based approach and the algorithm-based approach. The hypothesis-based approach is concerned with estimating the performance of the selected hypothesis; the algorithm-based approach views a learned hypothesis as a function of the data. Cross-validation is interpreted by the authors as belonging to the algorithm-based framework, and they derive a new bound on its accuracy.
Implication-with-possible-exceptions H. Jurjus and H. de Swart

The paper by H. Jurjus and H. de Swart of Tilburg University introduces the notion of implication-with-possible-exceptions by means of the topological notion of a full subset. In their paper they argue that all laws of classical propositional logic are also “valid” (valid-up-to-possible-exceptions) for this new notion of implication.
The Replacement Operation for CCP Programs M. Bertolino, S. Etalle, and C. Palamidessi

The last paper of this session, by M. Bertolino, S. Etalle, and C. Palamidessi of the University of Maastricht and Penn State University, is concerned with optimisation techniques for the development of large and efficient applications. Replacement is a program transformation technique which consists of replacing an agent with another one in the body of a definition. Concurrent constraint programming (CCP) is a programming paradigm which uses the “store-as-constraint” model. The goal of this paper is to provide applicability conditions which ensure the correctness of the replacement operation for CCP, that is, which ensure that the transformed program is equivalent to the original one.
MACHINE LEARNING AND NEURAL NETWORKS 3 Report by Antal van den Bosch Universiteit Tilburg

The last session on “machine learning and neural networks” featured, apart from the themes suggested by its name, a blend of evolutionary computation and knowledge acquisition as well, making it nicely varied.

Application of Genetic Programming to Induction of Linear Classification Trees M. Bot and W.B. Langdon
The first talk was presented by Martijn Bot, who presented joint work with W.B. Langdon on the induction of linear classification trees. This kind of decision tree grows nodes in which a linear combination of continuous or integer feature values is made, leading to classifications or further tests in child nodes. Bot and Langdon show in a first series of experiments that “fitness sharing pareto scoring”, a way to score fitness in a population with a certain goal in mind (i.e., as many small trees with as high a generalisation performance as possible), yields bigger but more accurate linear classification trees than “domination pareto” scoring. A second series of experiments shows that the fitness-sharing pareto approach works reasonably well on a few well-known numeric ML benchmark tasks.

Unsupervised Classification in a Layered RBF Network of Spiking Neurons S.M. Bohté, H. La Poutré and J.N. Kok

Reminiscent of Prof. Rietman's invited lecture of the previous day, Bohté argued that spiking neurons are a potentially very powerful means for information coding, with even some biological evidence to back this up. The neural network architecture presented by Bohté successfully performs unsupervised clustering over multiple real-valued inputs. Certain “hard” types of clusters (e.g., interlocking shapes) are tackled by adding an intermediate layer of radial-basis-function neurons; this way, clusters can be encoded by neurons that fire with a certain synchronicity, rather than by the first spiking neuron only.

Top-down Design and Construction of Knowledge-Based Systems with Manual and Inductive Techniques F. Verdenius and M.W. van Someren

In the talk, based on a paper published earlier in the proceedings of the 12th Banff Knowledge Acquisition Workshop (Banff, Alberta, Canada), Verdenius described the synthesis of a method for building knowledge-based systems which combines traditional knowledge acquisition with machine learning in a divide-and-conquer fashion. The method balances a planning model against a resource economy model that estimates the costs of the elements in the planning. While the method is intended to be general, Verdenius described how the model was born from the reconstruction of a real-world process (ripening bananas for supermarkets) for which there was no initial plan. Unfortunately, he did not bring along some bananas in perfectly ripe shape to prove its effectiveness. Nevertheless, the session was closed well over scheduled time and in good humour.
WINNER OF THE SKBS-AWARD Hans Henseler - TNO

The SKBS award is a money prize provided by the Stichting Knowledge-Based Systems (SKBS) to stimulate the application of knowledge-based techniques. The demonstration by M. van Wezel of “Neural Vision 2.0 - Exploratory Data Analysis with Neural Networks” was elected as the winner of the SKBS award for best demonstration. It is based on a joint cooperation of M. van Wezel, J. Sprenger, R. van Stee and J. La Poutré of the CWI and J. van Wieringen of the Dutch Ministry of Transport, Public Works and Water Management (Ministerie van Verkeer en Waterstaat). The jury elected this demonstration because it is an AI application that actually fulfils the need of a user. Certainly, the runner-up, “Wij kiezen partij voor u: Online Voting Advice” by Wouter Teepe, also fulfils a clear need, but this demonstration was considered less advanced, although its effectiveness was clearly demonstrated. KPN's demo of intelligent agents and TNO's demo of artificially intelligent web crawling (which was closely related to the keynote speech of Tom Mitchell) were also good, but the jury felt it was not appropriate to award the money prize to either one. The EA Visualizer and Valens are interesting, but lacked a real application in the demonstration. The jury recommends everyone to try the “Mondriaan Art by Evolution” demonstration, but, as Jano said himself, it is not meant to be an actual application.
END OF BNAIC’99 REPORTS

EXPLORATIONS IN THE DOCUMENT VECTOR MODEL OF INFORMATION RETRIEVAL Ph.D. thesis by Hans Paijmans, KUB

Review by Ruud van der Pol Universiteit Maastricht

Information Retrieval (IR) is a research field that investigates how a person can find, as quickly as possible, documents that provide the information he or she needs. Finding the relevant documents in a large collection is difficult, as anyone who has ever consulted a search engine on the Internet has been able to experience first-hand. Over time, a way of searching has emerged in IR that is based on matching (judging the 'resemblance' of) two representations: on the one hand a representation of the information need (in the form of a query), and on the other hand a representation of the documents in the collection (for example, indexes). If, during matching, a document representation 'resembles' the query closely enough, the document is judged relevant to the information need. This way of searching comes in many variants, or 'models'. One of the best-known models for IR, the Vector Space Model, is based on vector representations. One can argue that all models work with such a representation: the document vector. These vectors are useful in IR: directly as indexes against which a query is matched (to find documents), and also for other tasks, notably the classification and the clustering of documents. Paijmans has carried out explorations in all three of these areas. The explorations took place within the framework of his Ph.D. research at the Katholieke Universiteit Brabant. The defence took place on 14 September 1999 and the supervisor was Prof.dr. H. Bunt. Below I summarise the most important contributions and provide some commentary.

SUMMARY

The thesis consists of two parts: a part of an introductory and exploratory nature (chapters 1 through 4), and a part containing the author's own experimental research (chapters 5 through 8). The second part is based on four articles from the period 1993-1999. The first part offers various introductions. In addition, there is a fascinating historical overview of IR, covering among other things the earliest organisation of libraries (initially little more than the archives of rulers). Important steps for the spreadability of information were the move from scroll (volumen) to bound book (codex) and, later, the art of printing. IR took a leap forward when the ordering of books was no longer tied to a location (the shelf) in the library, but to a relative index code or classification by subject (introduced by Dewey in 1876). This principle also underlies the present-day search systems mentioned above. An overview of models in IR shows that many different techniques have been developed for the three subprocesses (i.e., constructing the two kinds of representations, and matching). On the basis of those techniques, various 'models' of searching have been named. The naming is rather often confusing, Paijmans observes: a system is usually typed after one model, on the basis of one important characteristic, e.g., the way of matching, and the other characteristics of that model are then also expected to be present in the system. In this way a system is counted as belonging to just one model, whereas a system often has characteristics of other models as well.

To reduce the confusion, Paijmans proposes a general model, the Document Vector Model, under which he subsumes all the other models he discusses as subtypes. In this way a system belongs to several models. The strength of the document vector idea is the realisation that one can always work with vectors. Combined with the knowledge that vector calculus has quite a few tricks in store, this puts a powerful instrument in one's hands. Paijmans uses this as the common thread through the chapters of the second part.

COMPARISON OF TOPIC AND CLARIT

The first experiment concerns a comparison of the search systems Topic and Clarit, based on a reduction of their (in themselves richer) document representations to simple vectors. The quality of the vectors thus obtained is then measured. When searching, Topic uses, in addition to the index terms (obtained with the tf.idf method), additional terms from a knowledge base. The latter contains a thesaurus (formulated in rules), with weights for the strength of the links between the terms in it. Paijmans regards these extra terms, too, as components of the document vector. If one term is asked for, a document with associated terms will also be found. The terms in the document vector ultimately all come from the thesaurus. The other system, Clarit, extracts nouns and noun phrases from texts and uses these in the document vector. It does so on the basis of grammatical and statistical rules and a lexicon. The document vector also contains terms that do not occur in the lexicon but do occur in the text.

What was compared is how the document vectors, and thus not the whole search systems, perform on a collection of text documents with general subjects. To this end, the documents were classified by two people into six subject groups. The document vectors were then compared with the average vector of the documents from each of the six groups. In this way it could be established globally which of the two systems yields the best document vectors, i.e., the ones corresponding most closely to those from the clusters. There were, however, several snags. Neither system turned out to be clearly superior. Topic did turn out to be inclined to assign more general terms during indexing, and Clarit more specific ones. This difference could lead to greater recall when searching with Topic, and greater precision when searching with Clarit. It is perhaps caused by the fact that Clarit draws terms from the lexicon and from the document itself, whereas Topic draws them only from the knowledge base; more specific terms may well be unavailable in the latter yet present in the document.

A first question about this experiment concerns the generality of the conclusions: the specific contents of the knowledge base could well largely determine Topic's performance. Paijmans notes this himself. A second question, in my view, is to what extent a comparison is meaningful at all, since Topic is based on a representation of explicit domain knowledge and Clarit on generic linguistic knowledge. As Paijmans himself suggests, these systems might well be used not as mutually exclusive but as complementary. I therefore do not quite understand why they were pitted against each other, rather than simply exploring the performance of each. One gets the impression that Paijmans struggled with this himself, since he first states that the goal is to compare systems and later states that the point is to see whether a comparison on the basis of document vectors is possible.

FINDING INFORMATION-RICH PARTS OF TEXT

Finding the information-rich parts of text in a document serves several purposes, among them more effective indexing and ranking by relevance. In the chapter devoted to recognising information-rich (hence important) parts of text, the working hypothesis is that information-rich words are not distributed randomly over the document, but tend to cluster together. In a series of experiments, the correlation was studied between, on the one hand, the word weights determined by frequency counts according to the tf.idf method (term frequency x inverse document frequency) and, on the other hand, each of four possible ways of finding information-rich parts in texts, namely: (1) position of a word, (2) position of a sentence, (3) cue words (sentence fragments such as "And this is of great importance..."), and (4) word type.

The results could hardly be called confirmatory. Remarkably enough, no clear correlation was found between word weight and the position of the word or of the sentence. This contradicts the common view that the first or last sentence of a paragraph contains more information than the other sentences; that view is thus not confirmed in these experiments. The only comparison that yielded a positive correlation was the last: nouns, adjectives and verbs are of greater importance than most function words. To be precise: nominal pronouns (someone, everything, nothing, etc.), nominal adverbs (here, now, there, then, etc.) and adverbial nouns (January, Sunday, East, today, at home, etc.). The working hypothesis thus does not seem to hold: there is hardly any clustering of information-rich parts, at least in the present experiment, which covered a database of 24 scientific articles. A larger experiment might turn out differently, I think.
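Since the tf.idf weighting recurs throughout the thesis, a minimal Python sketch may be helpful (the formula is the standard one; the toy corpus is invented):

    import math

    # Invented toy corpus: each document is a list of words.
    docs = [
        ["ship", "classification", "rules"],
        ["ship", "passengers"],
        ["neural", "network", "classification"],
    ]

    def tf_idf(term, doc, docs):
        # Standard tf.idf: term frequency x inverse document frequency.
        tf = doc.count(term) / len(doc)
        df = sum(1 for d in docs if term in d)
        idf = math.log(len(docs) / df)
        return tf * idf

    print(tf_idf("ship", docs[0], docs))            # common term, low idf
    print(tf_idf("classification", docs[0], docs))  # also occurs in doc 3
    print(tf_idf("rules", docs[0], docs))           # rare term, higher weight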
LOCAL DICTIONARIES

In classification tasks and in IR it is awkward to work with document vectors of great length. It is efficient to work with smaller vectors, provided they are almost as effective. The question then is how to select the right parts of the document vectors. Paijmans has studied this question for classification tasks, among other things with the help of local dictionaries. Suppose you have a division into categories, with some documents in each category. You can then compile, per category, a small dictionary of the words in that class. Such a dictionary is called a 'local dictionary'.

One hypothesis is that the document vectors of the documents in a class are more telling when built from the words of the local dictionary than from the complete word list of the document collection. A question is then how to choose the right terms for the local dictionary. Others have tried to do so on the basis of term frequency, without taking, for instance, document length into account; moreover, their classification technique was dubious. Paijmans tries other methods besides frequency, as well as other classification techniques, and he varied the size of the document vectors.

For the classification he used four techniques, among them Rocchio's formula, well known in IR (for relevance feedback), and two ML techniques (rule induction and genetic algorithms). The most important conclusions from these experiments are:
- Local dictionaries are best built with, as the ranking method for word selection, a weight that combines local information with information from the whole database; this happens, for example, in the well-known tf.idf method. This is better than, for example, frequency information from the class examples alone (as had been done in the earlier research mentioned).
- The vector length mattered much less: lengths of 10, 20, 66, or 100 words gave roughly equal results.

The most interesting conclusion, to my mind, is the last one: that the vector length makes little difference, and that a very short vector already suffices. That raises the question whether it can be even shorter, and by how much! Incidentally, I find it a pity that relatively few conclusions emerge from this extensive experiment.

LEXICAL COHESION

The final experiment revolves around lexical cohesion. This phenomenon can be seen as a form of coherence between successive sentences, notably in the form of reference to, and repetition of, concepts from preceding sentences. The basic idea of this experiment is that the degree of lexical cohesion varies between authors and can perhaps be used to recognise the author of a text within a group of authors. This would go further than distinguishing two authors on the basis of ad-hoc text features that hold only for those two authors (such as the use of "while" by one and "whilst" by the other). In a series of experiments, lexical cohesion was computed with the help of document vectors. Per whole sentence, or per block of a fixed number of words, it was computed how large the role of a particular word in it is, and this was placed in the document vector. In this way the degree of cohesion between several sentences could be computed. Text fragments were then each classified by two (ML) algorithms (IBL4 and K*). Measurements were taken with various settings of the window size (whole sentences or a fixed number of words) and various word filterings.

The experiments covered (among other things) a corpus with only two authors, and a corpus with a larger number of authors. In theory, good classification results indicate that large differences exist between the texts, and vice versa. So texts by different authors should give good, and texts by the same author poor, classification results. Of the many results of these experiments I mention a few:
- For texts by two authors, text windows of a fixed number of words turn out to give better results (larger differences in classification performance) than text windows of whole sentences. An example of the performance: in one experiment, texts by various authors were classified correctly in 65% of the cases, and texts by two different authors in 80% (note that random classification already yields 50%). This is an example of a large difference; for most variants the differences found are smaller.
- For texts by many authors, of which only two texts were by the same author, at some settings these two were not the worst classified; in other words, the resemblance did not show clearly. Averaged over the results obtained at different settings, it did. Paijmans had no exact explanation for this. As a possible cause he mentions that the text fragments here are fairly short (2,000 words), compared with the texts of the two-author comparisons (ten times as long).

In conclusion, one may state that lexical cohesion is usable for author recognition in some cases, but far from all. This was of course to be expected. Despite this limited usability, one may also call the experiments successful: it is already quite an achievement that a distinction between authors is possible at all. For humans this is not a trivial task either. Moreover, this experiment shows that ML techniques are powerful tools for IR.
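The vector-based cohesion measure lends itself to a small Python sketch (my illustration, not Paijmans's exact procedure): represent each sentence, or each fixed-size block of words, as a term-count vector and take the cosine similarity of successive vectors as a cohesion score.

    import math
    from collections import Counter

    def cosine(a, b):
        num = sum(a[w] * b[w] for w in a)
        den = math.sqrt(sum(v * v for v in a.values())) * \
              math.sqrt(sum(v * v for v in b.values()))
        return num / den if den else 0.0

    def cohesion(sentences):
        # Mean cosine similarity between successive sentence vectors.
        vecs = [Counter(s.lower().split()) for s in sentences]
        sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
        return sum(sims) / len(sims)

    text = ["the ship carries passengers",
            "the ship is a bulk carrier",
            "bulk carriers may carry more passengers"]
    print(cohesion(text))  # repeated concepts across sentences raise the score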
GENERAL COMMENTARY

Besides the comments already given on the experiments, I offer some general commentary below. First, I regard the Document Vector Model introduced by Paijmans as only one of the possible points of view; another point of view could, for instance, put the query at the centre and subsume all other models under one overarching 'query model'. A third point of view could take the way of matching as central; the currently known models mostly owe their names to this. To avoid ambiguity one could also drop the word 'model' altogether and, instead, specify for each search system the techniques used for the three subprocesses. Paijmans himself could have sidestepped the matter by choosing as his title a phrase from the thesis itself: "aspects of IR and text categorization based on document vectors" (p. 175). This criticism is of course of minor importance: what matters is that the notion of the document vector proves very fruitful for IR research.

Paijmans's explorations in the Document Vector Model may be called original. That they do not always lead to hard conclusions seems to have more to do with the subject than with the person carrying out the research: the subject requires an enormous amount of statistical work before a conclusion can be drawn. Nevertheless, a follow-up study could still yield some additional conclusions; where local dictionaries are concerned, this reader, for one, would welcome that.

What strikes one about Paijmans's writing style is that, especially in the beginning, he repeatedly digresses. This sometimes makes the actual argument slightly less easy to follow. Against that stands, of course, the advantage that it is entertaining and, above all, instructive; it also shows that Paijmans has seen more of the field. At a few points a somewhat longer treatment would actually have been welcome, for instance on the choice of cue words in finding information-rich parts of text, and on latent semantic indexing in connection with local dictionaries. On the whole, the structure and organisation of the text are sufficiently clear.

CONCLUSION

All in all, this thesis is in my opinion a very useful contribution to the field of IR. It reports four interesting experiments, presented from the original and unifying angle of document vectors. The whole reads well, perhaps because of the often lucid explanations. Paijmans has moreover identified plenty of starting points for interesting further research in the areas of IR and ML; enough material to be taken up by others besides himself.
COGNITIVE SCIENCE RESEARCH AT ULB AND ULG

Axel Cleeremans
Université Libre de Bruxelles

Bob French
Université de Liège

The Cognitive Science Research Unit at the Université Libre de Bruxelles (ULB) and the Quantitative Psychology and Cognitive Science group at the Université de Liège (ULg) both aim to further our understanding of the mechanisms involved in information processing by humans and machines alike. In this brief presentation, we focus on the short history of both groups and on their developing interactions. Next, we summarize current research and future projects.

COGNITIVE SCIENCE AT ULG

The Quantitative Psychology and Cognitive Science (QPCS) group at the University of Liège was created by Bob French in May 1998. Bob French was originally trained in mathematics, obtaining a Master's degree in mathematics from Indiana University in 1975. He then left for Paris, where he worked for a decade as a free-lance translator. The swan song of his translation career was his translation into French, with J. Henry, of the Pulitzer Prize winning book by Douglas Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid. Bob received his Ph.D. from the University of Michigan under the joint supervision of John Holland and Douglas Hofstadter in 1992. He is the author of "The Subtlety of Sameness" (MIT Press, 1995).

The QPCS group is involved in a number of interdisciplinary projects focusing, broadly speaking, on the concept of emergence: computer modeling of analogy-making, connectionist models of memory, foundational issues in cognitive science, and evolutionary psychology. Ongoing projects include:

Foundational issues: Several projects focusing on the nature of knowledge representation and on the epistemology of AI, particularly with respect to the Turing test. A recurrent theme in these projects is the notion of emergence, that is, the relationship between symbolic and subsymbolic information processing, in domains such as, for instance, analogy-making.
Human memory and categorization: Several closely related projects aimed at exploring the relationships between connectionist principles and empirical findings on human memory and categorization, in domains such as bilingualism or development. A particularly important issue in this context is that of catastrophic interference in connectionist networks, and possible solutions to it.

Evolutionary psychology: Several ongoing projects aimed at exploring whether human information processing can usefully be analyzed as resulting from phylogenetic adaptation.

COGNITIVE SCIENCE AT ULB
The Cognitive Science Research Unit (CSRU) at the Université Libre de Bruxelles was founded in 1996 by Axel Cleeremans and Alain Content. After obtaining a degree in Psychology from ULB in 1986, Cleeremans moved to Carnegie Mellon to work with James L. McClelland on connectionist models of cognition. This collaboration concentrated on the analysis of the computational power of simple recurrent networks and on the extent to which such networks can constitute models of skill acquisition in simple sequential tasks. It resulted in the publication of the first book dedicated to computational accounts of implicit learning (Cleeremans, 1993, MIT Press). After obtaining his Ph.D. in 1991, Cleeremans returned at the Université Libre de Bruxelles as a Fonds National de la Recherche Scientifique (FNRS) research assistant. He now is a research associate with the same institution.
To address these issues, the CSRU combines empirical approaches (e.g., traditional experimental psychology techniques, particularly using a sequence learning paradigm), computer modeling (essentially through the development of neural network models of performance in specific tasks), and, more recently, brain imaging techniques. Current projects include the following:
Alain Content completed his Ph.D. research at the Laboratory of Experimental Psychology in Brussels, on determinants of reading acquisition and reading deficits. He then spent a few months in Jeff Elman’s Center for Research in Language, to explore the potential of connectionist networks to account for speech and written language processing. He now is an associate professor at ULB, and has published on visual and auditory word recognition, reading acquisition and dyslexia.
Foundational issues: Theoretical work on the role of consciousness in learning, on the nature of conscious experience, and on the philosophical and epistemological implications of computational theories of consciousness. Consciousness and sequence learning: Empirical and modeling work aimed at exploring (1) the contribution of conscious and unconscious processes to sequence learning (with Arnaud Destrebecqz, CSRU), (2) knowledge representation in sequence learning tasks with Maud Boyer, CSRU), and (3) the development of new connectionist models of sequence learning capable of capturing the time course of
The CSRU's main focus is centered on the role that elementary, associative learning mechanisms play in human cognition, and more specifically in skill acquisition and in language processing. The domain is controversial because of two major issues: BNVKI newsletter
189
December 1999
processing within single Arnaud Destrebecqz).
trials
(with
COLLABORATION
Members of both labs are involved in teaching an advanced degree in Cognitive Science initiated by Cleeremans in 1997. The program aims to offer an interdisciplinary perspective on issues that concern philosophers, psychologists, and computer scientists alike. From 2000 onwards and pending approval, this program will be organized jointly by both Ulg and ULB.
Neural correlates of implicit learning: Empirical work meant to explore the neural correlates of sequence learning and the role of REM sleep in memory consolidation (with Arnaud Destrebecqz, Pierre Maquet, Philippe Peigneux, Martial Vanderlinden and others from the Cyclotron Research Center and from the Neuropsychology Unit, Ulg)
Both labs are also involved in two joint research projects. First, the labs are involved in an inter-university IUAP program headed by V. De Keyser (Ulg) and G. d'Ydewalle (KUL). The program brings together numerous teams from the two linguistic communities of Belgium and focuses on the temporal control of dynamic task situations and on knowledge representation. Second, both teams, along with colleagues from the University of Warwick and Birkbeck College in England and from the University of Grenoble in France, will participate in a large European Commission Research Training Network Grant (2000-2004) to study learning and forgetting in natural and artificial systems, using state-of-the-art techniques from mathematics, computer modeling and neuro-imaging.
Implicit learning and language acquisition: Empirical and modeling work aimed at comparing symbolic, chunkingbased models with Simple Recurrent Networks in artificial grammar learning and other language acquisition/processing tasks (with Pierre Perruchet, Université de Bourgogne) Sequential processes in auditory word recognition: Empirical and computational work exploring the nature of processes and sensory cues involved in the segmentation of continuous speech (with Uli Frauenfelder, Université de Genève), as well as their acquisition by humans and machines. Visual word recognition: Empirical, statistical and computational work exploring the mechanisms of activation of phonological and semantic information from print, and the potential of connectionist approaches to account for these. These different projects contribute to develop a perspective on cognition that is based on the notion that learning is an integral part of human information processing and that the latter is best described in terms of continuous, graded and distributed representation and processing systems.
BNVKI newsletter
More information: Cognitive Science Research Unit: http://164.15.20.1/axcWWW/axc.html
Quantitative Psychology and Cognitive Science group: http://www.fapse.ulg.ac.be/Lab/cogsci/frdefault.hml
190
December 1999
SECTION KNOWLEDGE SYSTEMS IN LAW AND COMPUTER SCIENCE

INTERFACING BETWEEN LAWYERS AND COMPUTERS, AN ARCHITECTURE FOR KNOWLEDGE-BASED INTERFACES TO LEGAL DATABASES

Dissertation by Luuk Matthijssen, KUB
Review by mr. dr. Kees van Noortwijk, Erasmus Universiteit Rotterdam

INTRODUCTION

It is gratifying to see that access to legal databases has in recent years returned to the centre of attention. Remarkably enough, it was precisely this form of automation that first found its way into legal practice: the first Dutch legal databases date back to the second half of the 1970s. After a hesitant start, these electronic collections of statutes and case law were embraced by practising lawyers and legal scholars alike, and they have by now become indispensable. Until recently, however, they were somewhat disdained by researchers in legal informatics, in the sense that research efforts were directed mainly at applications that were 'more intelligent' or 'more substantively legal': automation applications whose purpose was not so much to provide access to text material, but, for example, to imitate certain forms of legal 'reasoning'. With a few exceptions, however, the (without exaggeration) dozens of research projects in this area have not produced systems usable in practice(1), so it is understandable that researchers in legal informatics have in recent years turned their attention to other areas again.

(1) One of the few exceptions are the so-called legal computer advice systems, of which more than ten different ones have been brought to market since the mid-1980s, mostly by the publisher Vermande in Lelystad in cooperation with the Erasmus Universiteit Rotterdam.

Opening up the daily growing databases of legal text material (legislation and, above all, case law) is one of the tasks on which much of the current effort is focused. Of course, these databases already have search functions that in principle make it possible to display and print any desired document. But every lawyer who has worked with them knows that in practice the results can be sorely disappointing. All too often a search yields either none or too few of the desired documents, or far too many documents, among which the desired ones must be located by laborious browsing. The dissertation of Luuk Matthijssen(2) addresses this problem. Among other things, it describes a 'knowledge-based' interface for legal databases, intended to make searching more efficient and more effective.

(2) Luuk Matthijssen obtained his doctorate on 9 April 1999 at the Katholieke Universiteit Brabant.
CONTENTS OF THE BOOK

The book opens with a brief historical overview of legal databases. New developments are also discussed, such as the improved accessibility of databases via the Internet and the combination of databases for particular professional groups. 'Boolean searching' (selecting documents by means of a combination of keywords occurring in them) still predominates, although the limitations of this method have often been described(3). In this connection Matthijssen discusses the 'conceptual gap' between the idea the user has of the subjects in the database on the one hand, and the limited representation of those subjects in the database index on the other. A user usually wants to find as many documents as possible about a given subject, but in order to find them must formulate a query stated in terms of the form of the documents (which words they must contain). Many users have difficulty with this shift from the subject or meaning level to the form level. Formulating effective queries is therefore currently a skill that requires much practice, and many lawyers consequently leave database searching to documentalists or library staff.

(3) See for example Wildemast and De Mulder, 'Some design considerations for a conceptual legal information retrieval system', in: C.A.F.M. Grütters et al. (eds.), Proceedings Jurix 1992, p. 81-92. Vermande, Lelystad 1992.
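The 'conceptual gap' is easy to make concrete. Below is a minimal sketch of Boolean keyword retrieval over an inverted index; the documents and queries are invented for illustration. A document matches only if the exact terms occur in it, so a query about "dismissal" misses a judgment that speaks only of "termination of the employment contract", even though it is about the same subject.

```python
# Minimal sketch of Boolean retrieval over an inverted index.
# Documents and queries are invented for illustration only.
docs = {
    1: "dismissal of the employee was ruled unlawful",
    2: "termination of the employment contract by mutual consent",
    3: "the employer requested dismissal for economic reasons",
}

# Build the inverted index: term -> set of document ids.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

def boolean_and(*terms):
    """Return ids of documents containing ALL the given terms."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

# A search for 'dismissal' finds documents 1 and 3, but misses
# document 2, which treats the same subject in different words:
print(boolean_and("dismissal"))              # {1, 3}
print(boolean_and("dismissal", "employer"))  # {3}
```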
THEORETICAL FRAMEWORK

This problem is discussed in Chapter 2, which provides a theoretical framework for accessing information in databases. A distinction is made between the indexing function, the query function, and the matching function of an information system. In practice, each of these functions has imperfections, which negatively affect the result of a search. Matthijssen regards the 'conceptual gap' mentioned earlier as a separate, fourth category of problems. It in fact concerns the functioning of the information system as a whole, and thus all three other functions, but it also contains a subjective element: the interaction with the user. In his research he aims in particular at improvements in this last area, using knowledge-based methods.

Chapters 3 and 4 form the core of the dissertation. In them Matthijssen discusses an 'intelligent interface' for information retrieval. The general purpose of such an interface is to act as an intermediary and thus to ease the communication between user and information system as much as possible. Central here are the information-processing aspects of the tasks that can be distinguished for a given legal domain. By using a hyperindex(4) based on these tasks, it should be possible to retrieve from the database precisely those documents that are necessary or useful for the task at hand. The index is constructed on the basis of a domain model, in which, among other things, the tasks in a given subdomain are enumerated and characterized. In addition, to analyse the domain and the tasks in order to determine exactly which information is needed for which task, Matthijssen uses an argumentation model. How all this can be given practical shape is described in Chapter 4, which uses as its example domain the objection procedure under the Algemene Wet Bestuursrecht (the Dutch General Administrative Law Act). Starting from a decision by a particular administrative body, tasks in that case include: analysing the administrative decision to check whether the right arguments were used, and collecting arguments to substantiate a notice of objection against the decision. Such tasks can often be subdivided into subtasks, each with possible information needs of its own.

(4) What is meant here is an index built up as a network of index terms connected by 'links', the index terms themselves being provided with descriptions, so as to ease 'browsing' towards the right term and the documents connected to it.
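The hyperindex of note (4) can be pictured as a small linked data structure. A minimal sketch follows; the terms, descriptions, links, and document ids are invented for illustration and are not taken from the dissertation.

```python
# Minimal sketch of a hyperindex: a network of index terms with
# descriptions and links, supporting browsing towards documents.
# All terms, descriptions, links, and document ids are invented.
class IndexTerm:
    def __init__(self, name, description):
        self.name = name
        self.description = description
        self.links = []       # related IndexTerm objects
        self.documents = []   # ids of documents indexed under this term

    def link(self, other):
        # Links are bidirectional, so the user can browse both ways.
        self.links.append(other)
        other.links.append(self)

objection = IndexTerm("objection procedure",
                      "Challenging an administrative decision (Awb).")
grounds = IndexTerm("grounds for objection",
                    "Arguments that can be raised against a decision.")
objection.link(grounds)
grounds.documents += ["ruling-1997-123", "ruling-1998-045"]

# Browsing: from a term, show its neighbours with their descriptions
# and the documents connected to them.
for neighbour in objection.links:
    print(neighbour.name, "-", neighbour.description, neighbour.documents)
```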
ARMOR

Chapter 5 then describes a prototype application in which the knowledge of this administrative-law domain has been incorporated. The application is called ARMOR, which stands for Argument Model based Retrieval System. Its goal is to provide citizens with all the information needed to write a notice of objection, including arguments that can be brought against the administrative decision. The application in fact goes beyond information retrieval alone: it also contains an expert-system-like module with which the administrative decision can be analysed, and (a link to) word-processing functions for writing the final notice of objection. The prototype was built with fourth-generation tools. The result is a Windows application whose visible part consists of a number of fill-in screens, each corresponding to a particular part of the main task (writing a notice of objection). By answering open and multiple-choice questions, the user enables the system to build up a model of the case. On the basis of this model, the system automatically determines which search terms are best used for this case to look up the relevant statute texts and case law in the database at each step, and, if desired, it carries out that search. All in all, this prototype goes beyond information retrieval alone, as Matthijssen himself concedes(5). His main argument for the chosen approach is that all the modules in the prototype together are needed to support the chosen legal task (drawing up a notice of objection) properly. The task-based methodology requires a great deal of information about the case in order to select exactly the right documents from the database.

(5) p. 175.

Chapter 6 describes how the operation of the prototype was tested by a number of subjects. They had to enter a given case into the system and, on the basis of the documents found in the database, draw up a correct notice of objection. In nearly all cases this succeeded without much trouble (although sometimes too many and/or incorrect documents were found). Possible other applications of this kind of retrieval system are also mentioned (among others in relation to searching the Internet), and attention is paid to the design process. Chapter 7 summarizes the research and lists the main conclusions; here we also find recommendations for applying the techniques investigated to other information systems, legal as well as non-legal.
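The step from a filled-in case model to database search terms can be sketched as follows. The field names and the term mapping below are invented for illustration; in the dissertation the actual mapping is derived from the task and argument models.

```python
# Minimal sketch of deriving search terms from a case model, in the
# spirit of ARMOR. Field names and the term mapping are invented.
case_model = {
    "decision_type": "building permit refusal",
    "ground_cited": "zoning plan violation",
    "objector": "neighbour",
}

# A task-specific mapping from case-model slots to candidate search terms.
term_map = {
    ("decision_type", "building permit refusal"): ["building permit", "refusal"],
    ("ground_cited", "zoning plan violation"): ["zoning plan", "violation"],
}

def search_terms(model):
    """Collect the search terms suggested by the filled-in case model."""
    terms = []
    for slot, value in model.items():
        terms += term_map.get((slot, value), [])
    return terms

print(search_terms(case_model))
# ['building permit', 'refusal', 'zoning plan', 'violation']
```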
EVALUATION OF THE BOOK

All in all, this is an interesting dissertation on a subject, information retrieval, whose importance can hardly be overstated. Electronic data collections are growing explosively in both number and size. The number of their users is also rising quickly, not least because of ever better accessibility (among others via the Internet). This exposes the shortcomings of the current search and selection methods for these collections (mainly Boolean searching) all the more clearly. Building a better ('intelligent') user interface is a good idea, since it can also improve access to the databases that exist today. This in contrast to, for example, improvements in the structure or the storage format of documents, which would in any case also require efforts on the part of the database publishers.
Nevertheless, I do not think that the interface presented here (or a derivative of it) can be the solution to all problems in the field of (legal) information retrieval. It is a domain-specific solution, and one that is relatively costly because of the considerable amount of work involved in modelling the tasks. Matthijssen himself also notes that the added value of such a system (better information retrieval) must be weighed against the cost of building it(6).

(6) p. 235.

Modelling tasks as the basis for the retrieval process is in itself an interesting idea, and in this case it also proves effective. But for a domain that is larger or less clearly delineated than the one presented here, it could well be extraordinarily difficult to model even a reasonable share of all possible tasks. For tasks that have not been modelled, one would have to fall back on traditional search methods. The system is therefore not a replacement for the 'general' search software currently used with legal databases, but rather a supplement to it.

Another problem, in my view, is the lack of dynamics in the hyperindex. Legal databases tend to grow quickly, and it is by no means certain that the search terms that are effective today for selecting as complete as possible a set of documents (say, judicial decisions in administrative law) will still be effective in a few years (or perhaps even a few months?). In other words: to stay up to date, the hyperindex would really have to be adapted continuously, something the user himself probably cannot do. In this respect the technique is in fact on a par with other 'manual' disclosure techniques, such as adding keywords or subject codes to documents and, perhaps to a somewhat lesser extent, building thesauri.

This does not alter the fact that the retrieval method described by Matthijssen is in principle applicable in practice, witness also the fine-looking and fully functional prototype he describes. Particularly where the domain is limited in size and user-friendliness must be as high as possible, a retrieval system like this one could prove its worth.

REFERENCE

Luuk Matthijssen (1999). Interfacing between Lawyers and Computers, an Architecture for Knowledge-based Interfaces to Legal Databases. Katholieke Universiteit Brabant. A commercial edition of this dissertation has been published by Kluwer Law International, The Hague, under ISBN 90-411-1181-6.
CASE-BASED LEGAL ARGUMENTATION

JURIX lecture by Bram Roth, Universiteit Maastricht, Thursday 14 October 1999, Utrecht
Report by Radboud Winkels (UvA)

Bram Roth has been a Ph.D. student (AiO) at the Universiteit Maastricht for a year now, and his research focuses on legal argumentation with cases. Since he has had to do quite a lot of teaching recently, he reported on the literature study of his first year.
What is case-based argumentation? A "modern" view of legal reasoning is to regard it as a dialectical process in which two parties each try to defend a different position by means of argumentation. Both parties put forward arguments that support their own position or undermine that of the other party. In doing so, they can invoke legal sources as justification. One of the possible sources is case law: an old case that has already been decided by a court. Such an old case can be used in various ways. If the conclusion in the old case is the same as the one now sought, one tries to emphasize the similarities between the two cases and to play down any differences. If the conclusion rather favours the other party, one tries to do the opposite. One thus reuses the argumentation of the old case, with adaptations where necessary, for the new case.

In the continental tradition, reasoning with cases is less common than in the Anglo-Saxon system. It is therefore not surprising that most research into case-based reasoning, and into case-based argumentation as well, comes from the United States.

Roth discussed three systems or computer models of case-based argumentation: HYPO (Ashley & Rissland, around 1990) and CATO (Aleven & Ashley, around 1996), both American systems, and a formal dialogue model (Prakken & Sartor, around 1998) of European origin(7). To be able to compare the different approaches, Roth first introduced four aspects of case-based argumentation:

1. The representation of the argumentation in the old case.
2. The notion of comparability of two cases: how is a new case compared with old cases?
3. The construction of a new argumentation.
4. Conflict resolution when several old cases apply.

(7) The Netherlands is represented in two of the three approaches: Vincent Aleven, who recently obtained his doctorate on CATO (see the BNVKI Newsletter of April 1999 for a review by Bram Roth), is Dutch, as is Henry Prakken, one of the authors of the formal dialogue model.

HYPO

In HYPO, cases are represented as factors that contribute to one position or to the other. An argumentation in an old case is then stored as a set of factors plus the court's judgment. HYPO's domain is the American trade secrets act (TSA), which regulates the protection of industrial secrets. For example: after years of research, a pharmaceutical company brings a new drug onto the market, and it becomes a great success. After an employee has resigned and gone to work for a competitor, that competitor brings the same drug onto the market within a very short time. Has the TSA been violated here? Research into the case law shows that several factors are relevant:

A. the information is valuable (+)
B. it concerns a unique product (+)
C. it is set down in a patent (+)
D. few security measures were taken (-)
E. the information was known outside the company (-)
F. it was disclosed in the public interest (-)

Some factors contribute to the conclusion that the TSA has been violated (A, B and C, marked '+'), others to the conclusion that it has not (D, E and F, marked '-'). Suppose that in the new pharmaceutics case the factors A, B, C and D apply. There are two relevant old cases. In case I, factors A, B and E played a role, and the decision was that the TSA had not been violated. In case II, factors A, B, C, D and F played a role, and the judgment was that the TSA had indeed been violated. Old cases are compared with the new case by examining the set of shared factors. The old case with the largest set of shared factors is the most on point, in this instance case II (all four factors of the new case, against only two for case I). The construction of new arguments consists of citing old cases; conflicts are resolved by the 'on-pointness' of the cases.
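The comparison step just described can be written out in a few lines. Below is a minimal sketch using the factor sets of the example; the encoding is mine for illustration, not HYPO's actual implementation.

```python
# Minimal sketch of HYPO-style case comparison by shared factors.
# Factor letters follow the example above; '+' factors favour the
# conclusion that the TSA was violated, '-' factors the opposite.
new_case = {"A", "B", "C", "D"}
old_cases = {
    "case I":  ({"A", "B", "E"}, "not violated"),
    "case II": ({"A", "B", "C", "D", "F"}, "violated"),
}

def most_on_point(new, old):
    """Return the old case sharing the most factors with the new one."""
    return max(old.items(), key=lambda kv: len(new & kv[1][0]))

name, (factors, outcome) = most_on_point(new_case, old_cases)
print(name, "is most on point; shared factors:", new_case & factors)
print("cited outcome:", outcome)   # case II -> 'violated'
```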
CATO
CATO is based on HYPO and therefore strongly resembles it, but it introduces abstract factors, with which a factor hierarchy can be built. Differences between cases can now be identified at a higher, derived level. For example, the factor "information kept under lock and key" can contribute positively to the more abstract factor "effort made to protect the secret", while the factor "information in advertising brochures" contributes to it negatively. This makes it possible to compare cases at a somewhat more abstract level than that of the concrete set of factors. CATO can also handle subtler argumentation strategies, based on emphasizing or downplaying differences between cases. The system is in fact also intended to teach students how to argue with cases, an aspect that Roth left out of consideration.
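The factor hierarchy can be sketched in the same style as the HYPO example; again the encoding is invented for illustration, not CATO's implementation.

```python
# Minimal sketch of a CATO-style factor hierarchy: concrete factors
# contribute positively or negatively to an abstract factor.
# Factor names follow the example above; the encoding is invented.
hierarchy = {
    "effort made to protect the secret": {
        "information kept under lock and key": +1,
        "information in advertising brochures": -1,
    },
}

def abstract_factors(case_factors):
    """Derive abstract factors from the concrete factors of a case."""
    result = {}
    for abstract, children in hierarchy.items():
        score = sum(sign for f, sign in children.items() if f in case_factors)
        if score != 0:
            result[abstract] = "+" if score > 0 else "-"
    return result

print(abstract_factors({"information kept under lock and key"}))
# {'effort made to protect the secret': '+'}
```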
FORMAL DIALOGUE MODEL

In the formal dialogue model of Prakken and Sartor, cases are represented as rules, for example:

Rule 1: IF product information is very valuable (A+), AND is set down in a patent (B+), AND is unknown outside the company (C+), THEN this information is a trade secret (+).
These rules function as a kind of summary of the original case. Henry Prakken, who was in the audience, added that their model does in fact also include a database with the original cases. Rules can be made applicable to a new case by dropping conditions (broadening the rule); the rule that compares better then beats the others. In Roth's view, the original cases no longer play a role in the approach of Prakken and Sartor: they have been reduced to abstract rules. Comparability of cases then becomes a matter of priorities between those rules, and arguing becomes the introduction of new rules by dropping conditions.
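The rule representation and the "broadening" move can likewise be sketched; the encoding below follows Rule 1 above but is otherwise invented for illustration.

```python
# Minimal sketch of the rule representation and of 'broadening' a rule
# by dropping conditions, after Prakken & Sartor as summarized above.
rule_1 = {
    "conditions": {"A+", "B+", "C+"},   # valuable, patented, unknown outside
    "conclusion": "trade secret (+)",
}

def applies(rule, case_facts):
    """A rule applies when all of its conditions hold in the case."""
    return rule["conditions"] <= case_facts

def broaden(rule, drop):
    """Make a rule applicable to a new case by dropping a condition."""
    return {"conditions": rule["conditions"] - {drop},
            "conclusion": rule["conclusion"]}

new_case = {"A+", "B+"}                 # no patent this time (no C+)
print(applies(rule_1, new_case))                  # False
print(applies(broaden(rule_1, "C+"), new_case))   # True: broadened rule applies
```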
CONCLUSIONS

After discussing these three approaches to arguing with cases, Roth closed with the observation that, in his view, more information can be extracted from old cases than is currently being done. He is thinking of things such as weighing information about the relative weight of sets of reasons for and against a conclusion (which ties in with ideas of his direct colleague Jaap Hage), and information concerning the rebuttal of one argument by another. This requires a richer representation of cases and a richer logic for reasoning with them. After this clear and calmly delivered talk, a lively discussion ensued about whether the three systems described above had been summarized fairly and, more interestingly, about how old argumentations could then be stored and reused. Proof that Bram Roth has chosen an important and appealing Ph.D. topic. We wish him every success in the years to come!

CAN JUDGES ADMINISTER JUSTICE?
Ronald van den Hoogen (CBM, RUU)

Report by Doeko Bosscher (UvA)

Throwing this cat among the pigeons, (mr.) Ronald van den Hoogen of the Centrum voor Beleid en Management (CBM) in Utrecht opened his talk at the most recent JURIX meeting. Van den Hoogen is conducting research into the relation between the proper administration of justice and Information and Communication Technology (ICT). For once, his starting point is not the question whether computers can administer justice, but the extent to which ICT influences the notion of proper administration of justice. In his talk he moved from the classical interpretations of proper administration of justice in the various views of the constitutional state to the influence of ICT. His first new question is whether citizens make new or different demands on the (individual) judge because of the judge's increased electronic reach. His second question is to what extent, alongside the informatization of government in general, it also affects the functioning of the judiciary. A whole array of projects has already been pushed through the Ministry of Justice,
but it is not yet very clear how these have changed the organization. Naturally, this digitization affects the demands made on judges with respect to speed, accuracy, and so on. Interesting questions also arise about the notions of independence and openness: does far-reaching decision support affect the independence of the judge, and to what extent must decision-supporting measures be public? All in all, a fine subject, on which Van den Hoogen hopes to obtain his doctorate in mid-2000.
Section Editor: Richard Starmans

THE SIKS MASTERCLASSES IN MAASTRICHT

SEARCH ALGORITHMS VERSUS SEARCH ENHANCEMENTS
A SIKS master class by Jonathan Schaeffer (University of Alberta)
Report by Ida Sprinkhuizen-Kuyper, UM

Jonathan Schaeffer gave a fascinating course on the state of affairs with respect to search. He noted a decreasing interest in search in AI textbooks: the book by Rich (1983) devoted 31% of its pages to search, against only 12% in the book by Russell and Norvig (1995) and 10% in Poole, Mackworth and Goebel (1998). Jonathan Schaeffer's take-home messages were:

• Efficient search is all about search enhancements.
• Given an application domain, the choice of the algorithm is usually a trivial one.
• 99% of the effort is spent on implementing, debugging, tuning, and analysing search enhancements.
• The AI textbooks have it backwards, by emphasizing the search algorithms and ignoring the enhancements.
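As a minimal illustration of the point (not an example from the master class), here is a plain depth-limited search whose only sophistication is a transposition table. The algorithm itself is a few lines; the enhancement is what avoids re-expanding states reached by different move orders, and it is where the engineering effort goes.

```python
# Minimal sketch (not from the master class) of one classic search
# enhancement: a transposition table added to a plain depth-limited
# search. Without the table, states reachable by many move orders
# are searched over and over again.
def depth_limited_search(state, depth, goal, successors, table=None):
    if table is None:
        table = {}
    if state == goal:
        return True
    if depth == 0:
        return False
    # Transposition table: skip states already searched to >= this depth.
    if table.get(state, -1) >= depth:
        return False
    table[state] = depth
    return any(depth_limited_search(s, depth - 1, goal, successors, table)
               for s in successors(state))

# Toy domain: states are integers, moves are +1 and *2; the goal is 10.
print(depth_limited_search(1, 6, 10, lambda s: [s + 1, s * 2]))  # True
```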
The importance of this message was illustrated by several convincing examples, such as the 15-puzzle (single-agent A*, simple), Chinook (alpha-beta), and Sokoban (single-agent A*, complex). Schaeffer showed that there is still much to be done in devising new search enhancements. Hopefully, new AI textbooks will devote more pages to search, and in particular to the importance of search enhancements. Schaeffer's master class addressed issues that are of general interest and welcome to (SIKS) Ph.D. students; many students working on their theses have experienced, or will experience, the limits of textbook descriptions of search (and other) algorithms. This, together with the clarity and persuasiveness of the presentation, made Schaeffer's master class a great success.

AN OVERVIEW OF MACHINE LEARNING

A SIKS master class by Tom Mitchell (Carnegie Mellon University)
Report by Evgueni Smirnov, UM

Professor Tom Mitchell's SIKS master class was a clear and concise introduction to machine learning. His presentation was largely based on his textbook Machine Learning (McGraw-Hill, 1997). He started with the question also posed in the introductory chapter of the book, namely "Why machine learning?". His answer was clear: the recent progress in algorithms and theory, combined with the increasing computational power of computers and the growing flood of online data, makes machine learning very important from both a theoretical and a practical point of view. In this respect he discussed the main niches of machine learning today, such as data mining, software applications, and self-customizing programs (e.g., Internet ML applications). After this inspiring introduction, Mitchell continued with a typical data-mining task and examples. He showed different techniques for applying decision trees, artificial neural networks, and Bayesian learning (including the naïve Bayes classifier and Bayesian belief networks) to various tasks. At the end of the master class, Mitchell discussed recent research trends in machine learning, such as learning from labelled and unlabelled data in the context of the Internet, and programming languages with learning capabilities. Afterwards, Mitchell answered questions from SIKS students in an informal and friendly atmosphere.

For all questions and remarks about SIKS, you can contact (on Mondays, Wednesday afternoons, and Thursdays):
Richard Starmans, Coordinator SIKS
Postbus 80.089, 3508 TB Utrecht
Tel. 030-253 4083 / 1454, fax 030-251 3791
E-mail: [email protected]
www: http://www.siks.nl
CALL FOR PAPERS

14TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2000)

August 20th – 25th, 2000, Berlin, Humboldt University

In this special turn of the century we invite you to celebrate ECAI 2000 in Berlin. In the tradition of previous ECAIs, the conference will bring together researchers and developers from academia and industry to present the state of the art in AI, both in research and in applications. The technical program will have a scientific track (paper presentations, invited talks, panel discussions), workshops, and tutorials.

Since the year 2000 is a special year, we will make ECAI 2000 a very special conference. Among the plans to realize this purpose is an exhibition concept that reflects AI's history, the ground it gained in the 20th century, and its envisioned paths of progress into the 21st century. Furthermore, we plan a rather unusual sidetrack meant to attract laymen, so that ECAI 2000 becomes an event in Berlin and not merely one of various scientific congresses hardly noticed by the public. To this end, we aim to present AI at ECAI 2000 in a very broad scope, in order to show its relation to other classical and advanced IT topics (e.g., databases, distributed computing, robotics, operations research, artificial life, neurosciences, virtual reality, and multimedia).

For the first time, ECAI comprises the Prestigious Applications of Intelligent Systems sub-conference (PAIS 2000). The purpose of this event is to provide a forum for industry practitioners to learn about the power and applicability of selected intelligent-systems techniques, and to share experience on the application, development, and deployment of intelligent systems in industry. This will be the largest showcase in Europe of real applications using intelligent-systems technology, and the ideal place to meet those working to make successful applications.

CALL FOR PAPERS

The ECAI 2000 Program Committee invites submission of papers for the Technical Program of the 14th biennial European Conference on Artificial Intelligence. Submissions are invited on substantial, original, and previously unpublished research in all fields of Artificial Intelligence.

FORMATTING GUIDELINES

It is highly recommended to submit papers using the final camera-ready formatting style. Submissions must not exceed five pages in camera-ready format. Submissions of unformatted papers are limited to 6000 words, including footnotes, figure captions, tables, appendices, and bibliography. Guidelines on the format of submissions will be available on the ECAI 2000 style guide page. Each accepted paper will be allocated five pages in the proceedings.

SUBMISSION PROCEDURE

Submission is a two-stage process. Authors are asked to submit a brief summary of their paper by 2 February 2000. The strongly preferred submission method is the web-based summary submission form. Submitted summaries will be assigned a unique tracking number that should be marked on the full paper submission. Authors without access to the web should send a summary including the title, authors, contact address, and abstract of the paper (maximum 200 words), plus keywords, to the ECAI 2000 Program Chair (by e-mail or postal mail). The summary information and the tracking number should also be included with the paper itself, on a separate sheet of paper. (Authors unable to use the web-based submission form may omit the tracking number.) Submission of the paper is in hard-copy form only; fax or electronic submissions will not be accepted. Six copies of the paper (each including the summary sheet) should be sent by postal mail or courier service to the ECAI 2000 Program Chair at the address below. The deadline for receipt of papers is 4 February 2000. Papers received after this date will not be reviewed. Notification of receipt of full papers will be mailed to the corresponding author soon after receipt.

ADDRESS FOR SUBMISSION
ECAI 2000 Program Chair
Werner Horn, Austrian Research Institute for AI (ÖFAI), Schottengasse 3, A-1010 Vienna, Austria

FURTHER INFORMATION
www.ecai2000.hu-berlin.de

CONFERENCES, SYMPOSIA, WORKSHOPS
Below, the reader finds a list of conferences and web sites or e-mail addresses for further information. A more extensive list of conferences can be found in the Calendar 1999, as published in AI Communications and in the SIGART Newsletter.

March 2-4, 2000
International Workshop on Network Sampling. Steve Thompson, Dept. of Methodology & Statistics, Universiteit Maastricht.

March 19-21, 2000
2000 ACM Symposium on Applied Computing (SAC2000), Special Track on Coordination Models, Languages and Applications. Information: http://www.cs.ucy.ac.cy/SAC2000.htm

March 20-22, 2000
Workshop in the AAAI Spring Symposium Series, Bringing Knowledge to Business Processes. Stanford University, California. Information: http://www.aifb.uni-karlsruhe.de/~sst/Research/Events/ss00

March 22-24, 2000
FroCoS2000: Third International Workshop, Frontiers of Combining Systems. Nancy, France. Information: http://www.loria.fr/conferences/frocos2000/

April 25-28, 2000
European Meeting on Cybernetics and Systems Research, University of Vienna. Information: http://www.ai.univie.ac.at/emcst/

April 26-28, 2000
ESANN 2000, 8th European Symposium on Artificial Neural Networks. Novotel Hotel, Katelijnestraat 65B, 8000 Brugge, Belgium.

May 8-10, 2000
Euromedia 2000, Antwerpen. Information: http://hobbes.rug.ac.be/~scs/conf/euromd2000/

May 13-16, 2000
International conference on Artificial Neural Networks in Medicine and Biology. Göteborg University, Sweden. Information: http://www.phil.gu.se/annimab.html

May 15-19, 2000
The first Summer School on Model-Based Systems and Qualitative Reasoning (MBS/QR). University Residential Centre of Bertinoro, Italy.

May 17-19, 2000
ICM, Information Systems for Enhanced Public Safety and Security. International Congress Centre, Munich, Germany. Information: http://www.eurocomm.org/2000

May 23-26, 2000
Second International ICSC Symposium, Neural Computation / NC'2000. Technical University of Berlin, Germany. Information: http://www.icsc.ab.ca/nc2000.htm

May 28 – June 2, 2000
First European Workshop on RoboCup. Information: http://www.cs.uu.nl/people/wiebe/EuRoboCup2000/

MAIL ADDRESSES BOARD MEMBERS BNVKI

Prof.dr. J.N. Kok
Wiskunde en Natuurwetenschappen, Dept. of Computer Science, Universiteit Leiden, Niels Bohrweg 1, 2333 CA Leiden. Tel.: (071) 5277057. E-mail: [email protected]

Dr. Y.H. Tan
EURIDIS, Erasmus Universiteit Rotterdam, Postbus 1738, 3000 DR Rotterdam. Tel.: (010) 4082255. E-mail: [email protected]

Dr. E.O. Postma
Universiteit Maastricht, IKAT, Postbus 616, 6200 MD Maastricht. Tel.: (043) 3883493. E-mail: [email protected]

Dr. R. Verbrugge
Rijksuniversiteit Groningen, Cognitive Science and Engineering, Grote Kruisstraat 2/1, 9712 TS Groningen. Tel.: (050) 3636334. E-mail: [email protected]

Dr. W. van der Hoek
Universiteit Utrecht, Department of Computer Science, P.O. Box 80089, 3508 TB Utrecht. Tel.: (030) 2533599. E-mail: [email protected]

Dr. L. DeHaspe
Katholieke Universiteit Leuven, Dept. of Computer Science, Celestijnenlaan 200A, B-3001 Heverlee, België. E-mail: [email protected]

G. Beijer
BOLESIAN BV, Steenovenweg 19, 5708 HN Helmond. Tel.: (0492) 502525. E-mail: [email protected]

Dr. W. Daelemans
Katholieke Universiteit Brabant, Vakgroep Taal- en Literatuurwetenschap, Postbus 90153, 5000 LE Tilburg. Tel.: (013) 4663070. E-mail: [email protected]

EDITORS BNVKI NEWSLETTER

Dr. E.O. Postma (editor-in-chief)
(See addresses board members.)

Prof.dr. H.J. van den Herik
Universiteit Maastricht, IKAT, Postbus 616, 6200 MD Maastricht. Tel.: (043) 3883485. E-mail: [email protected]

Dr. C. Witteveen
Technische Universiteit Delft, Department Technische Informatica, Julianalaan 132, 2628 BL Delft. Tel.: (015) 2782521. E-mail: [email protected]

Dr. R.G.F. Winkels
Universiteit van Amsterdam, Rechtsinformatica, Postbus 1030, 1000 BA Amsterdam. Tel.: (020) 5253485. E-mail: [email protected]

Dr. S.-H. Nienhuys-Cheng
Erasmus Universiteit Rotterdam, Informatica, Postbus 1738, 3000 DR Rotterdam. Tel.: (010) 4081345. E-mail: [email protected]

Ir. E.D. de Jong
Vrije Universiteit Brussel, AI Lab, Pleinlaan 2, B-1050 Brussel, Belgium. Tel.: +32 (0)2 6293713. E-mail: [email protected]

Dr. A. van den Bosch
Katholieke Universiteit Brabant, Taal- en Literatuurwetenschap, Postbus 90153, 5000 LE Tilburg. Tel.: (013) 4360911. E-mail: [email protected]

Dr. R.J.C.M. Starmans
Coordinator Research School SIKS, P.O. Box 80089, 3508 TB Utrecht. Tel.: (030) 2534083/1454. E-mail: [email protected]

Dr. B. de Boer
Vrije Universiteit Brussel, AI Lab, Pleinlaan 2, B-1050 Brussel, Belgium. Tel.: +32 (0)2 6293703. E-mail: [email protected]

HOW TO SUBSCRIBE
The BNVKI/AIABN Newsletter is a direct benefit of membership in the BNVKI/AIABN. Membership dues are NLG 75,- or BF 1.400 for regular members; NLG 50,- or BF 900 for doctoral students (AiO's); and NLG 40,- or BF 700 for students. In addition, members will receive two issues of the European journal AI Communications. The Newsletter appears bimonthly and contains information about conferences, research projects, job opportunities, funding opportunities, etc., provided enough information is supplied. Therefore, all members are encouraged to send news and items they consider worthwhile to the editorial office of the BNVKI/AIABN Newsletter. Subscription is done by payment of the membership dues to RABO-Bank no. 11.66.34.200 or Postbank no. 3102697 for the Netherlands, or Argenta Bank no. 979-9307518-82 for Belgium. In both cases, specify BNVKI/AIABN in Maastricht as the recipient, and please do not forget to mention your name and address. Sending of the BNVKI/AIABN Newsletter will only commence after your payment has been received. If you wish to conclude your membership, please send a written notification to the editorial office before December 1, 1999.

COPY
The editorial board welcomes product announcements, book reviews, product reviews, overviews of AI research in business, and interviews. Contributions stating controversial opinions or otherwise stimulating discussion are highly encouraged.

ADVERTISING
It is possible to have your advertisement included in the BNVKI/AIABN Newsletter. For further information about pricing etc., please contact the editorial office.

CHANGE OF ADDRESS
The BNVKI/AIABN Newsletter is sent from Maastricht. The BNVKI/AIABN board has decided that the BNVKI/AIABN membership administration takes place at the editorial office of the Newsletter. Therefore, please send address changes to:
Editorial Office BNVKI/AIABN Newsletter, Universiteit Maastricht, FdAW, Vakgroep Informatica, Postbus 616, 6200 MD Maastricht, Nederland. Tel: +31-(0)43-3883477. E-mail: [email protected]