Saptono: An experiment with nonsense∗
Rendra Suroso†
20051125
Abstract
Here we present Saptono, a random essay generator. It is linguistically sufficient to generate most kinds of sentences that may appear in a specific natural language, yet it is merely syntactically driven. Despite its randomness, Saptono can still generate essays similar to writings found in a certain field of the study of culture and contemporary philosophy.

Keywords: nonsense, generator

1 Intro

Common knowledge in computational linguistics holds that the connection between the syntax and the semantics of natural language is what a parser and a generator handle. A parser splits a sentence or phrase into its constituents and possibly comes up with a logical form, while a generator does the reverse. It is also common, in many representationalist AI projects that include a natural language processor, for corpora or simply sentences to be the input before their corresponding logical forms become responsible for belief fixation and updates. Finally, if required, a natural language output is generated. In Saptono (Studi APlikasi oTOmata NOnsens, Study of Application of Nonsense Automata), the work is simpler and hence far from adequate, or even misleading, as a simulation of real human reasoning about all concepts and themes. Saptono only has linguistic rules, a dictionary, and, at its heart, a random generator. And since it does not generate a predetermined sentential seed, Saptono is simply a nonsense automaton. Despite its limitations, we will show that Saptono is a good tool for generating corpora often found in a particular kind of writing, and this only takes us to a further question: how humans deal with nonsense, meaninglessness, or meaning itself, but not all kinds of meaning.

2 Grammar

Saptono is designed with Indonesian (Bahasa Indonesia) as its ‘mother tongue’. However, its desired output is only one kind of writing, i.e. the essay, so a more detailed design (morphological parser, detailed feature structure, etc.) is not required.
2.1 Generator

Saptono is implemented in SWI-Prolog which, just like any other dialect of the language, has a top-down, depth-first search. To save computational resources, we still apply this top-down search, but with a breadth-first approach for each rewriting, and each rewriting is controlled by random numbers. Schematically, see Fig. 1. As a consequence, Saptono is a lame parser but a very efficient generator.

        parent
          |
      randomizer
       /      \
   child1    child2

Figure 1 Random breadth-first rewriting

∗ Working Paper Bandung Fe Institute, December 2005
† Aut. Cog. Prog., Dept. Cog. Sci., Bandung Fe Institute, http://cogsci.bandungfe.net/ppl/brs/
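The random rewriting of Section 2.1 can be illustrated with a minimal sketch. It is written in Python for readability, whereas Saptono itself is implemented in SWI-Prolog, so this is not the actual code. The rule table `RULES`, the helper names `rewrite_level` and `derive`, and the reading of Figure 1 as "the randomizer picks one of the children" are all assumptions made for illustration.

```python
import random

# Placeholder rules mirroring Figure 1: a parent with two alternative
# rewritings. The names are illustrative, not Saptono's real categories.
RULES = {
    "parent": [["child1"], ["child2"]],
}


def rewrite_level(form, rules, rng):
    """Rewrite every non-terminal in the current form exactly once."""
    rewritten = []
    for symbol in form:
        if symbol in rules:
            # The randomizer picks one alternative for this rewriting.
            rewritten.extend(rng.choice(rules[symbol]))
        else:
            rewritten.append(symbol)  # terminal symbols are kept as they are
    return rewritten


def derive(start, rules, rng, max_levels=100):
    """Expand level by level (breadth-first) until only terminals remain."""
    form = [start]
    levels = 0
    while any(symbol in rules for symbol in form):
        if levels == max_levels:
            raise RuntimeError("derivation did not terminate within max_levels")
        form = rewrite_level(form, rules, rng)
        levels += 1
    return form


print(derive("parent", RULES, random.Random()))
```

Each pass rewrites the whole sentential form one level deeper, which is one possible reading of "breadth-first approach for each rewriting". Because no alternative is ever revisited, generation is cheap, but the same procedure gives no help with parsing.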
2.2 Syntax

The syntax is inspired by X-bar Grammar (Haegeman, 1994), except that Saptono's grammar is not necessarily related to the linguistic issues that the Government and Binding framework wants to tackle. It cares even less about transformation. The only consideration is that X-bar Grammar is a context-free grammar with a very general and simple derivational rule (Fig. 2); it has a huge number of possible derivations, some of which may lead to the empty string.

        XP
       /  \
    spc    X1
          /  \
        X0    YP

Figure 2 General rewriting in an X-bar Grammar (Cf. Haegeman, 1994)

In practice, a detailed account of the syntax of Indonesian is required. Indonesian is an SVO language. It has no tenses, since the sense of time resides only in loosely defined auxiliary verbs. It has no explicit counting of objects and almost no difference between singular and plural forms since, in most cases, plurals are represented by repeating the noun. It does, however, have various patterns of morphological derivation for verbs, primarily governed by semantic roles rather than syntactic categories. We simply treat the derived forms as different lexical entries for verbs, none more or less probable than the others. It also has syntactic exclusions, such as the position of the determiner, which occurs before the noun when it is an indefinite article but after the noun when it is definite. In the brief description of Saptono's grammar below, the character ‘→’ means ‘rewrite’ and ‘|’ means ‘or’. The randomizer operates at this ‘|’ character, to pick which derivation to follow, and in the selection of a lexical item at the terminal symbols.

ip   → np i1
i1   → i0 vp
np   → fin cnp1 | fin cnp2
cnp1 → det1 cn
cnp2 → cn det2
cn   → n0 ap
ap   → aaux a1 | fin a1
a1   → a0 conj | a0 fin
conj → c0 a0
vp   → vaux cvp
cvp  → vt0 np | vi0 pp | vi0 fin | vn0 ap | vti0 ip
pp   → p0 np

See Appendix B for a not-so-detailed description of each of the lexical categories in the grammar above. Once the rewriting rules are all set, and because Saptono has no morphological analyzer whatsoever, before generating a text we need to determine what theme the text is about and, finally, what dictionary it requires.

3 An Experiment

Not without specific reason, we choose contemporary cultural studies as the theme. Previous work, e.g. the Dada Engine (Bulhak, 1996), also chose the same theme, except that the Dada Engine, as a scripting language, is more thoroughly designed (and thematically specified) to generate not only essays but also various kinds of bogus texts. One of the Dada Engine's applications, the Postmodern Generator1, was provocative in that its author says that writing in postmodernist literature is a kind of mental debility. Saptono has a different working hypothesis: that a random generator is possible for certain kinds of writing. Its performance depends on its grammar and lexical categories, and it has nothing to do with neuropathology or psychopathology of any kind. It is easy to construct a jargon generator, but the architecture of the generator is by no means what is found in, say, aphasic patients. Some motivation, however, is inherited from the Postmodern Generator, in some stream of thought. Since Saptono as an NLP program is purely ‘technological’ (Allen, 1995), while its target is to simulate the production of a certain kind of text, another constraint is maintained: ‘the simpler, the more purely syntactically and randomly driven at every level and every step of generation, the better’.

1 http://www.cs.monash.edu.au/cgi-bin/postmodern
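To make the constraint of being "purely syntactically and randomly driven" concrete, the following Python sketch encodes our reading of the rewriting rules of Section 2.2 together with a few lexical items listed in Appendix B. It is an illustration only: Saptono itself is written in SWI-Prolog, the rule table is a reconstruction of the printed grammar, treating `i0` as empty is an assumption, and the names `GRAMMAR`, `LEXICON`, `expand` and `sentence` are hypothetical. For brevity it uses a plain recursive expansion rather than the breadth-first scheduling described in Section 2.1.

```python
import random

# Rewriting rules as read from Section 2.2; each inner list is one
# alternative separated by '|' in the paper's notation.
GRAMMAR = {
    "ip":   [["np", "i1"]],
    "i1":   [["i0", "vp"]],
    "np":   [["fin", "cnp1"], ["fin", "cnp2"]],
    "cnp1": [["det1", "cn"]],
    "cnp2": [["cn", "det2"]],
    "cn":   [["n0", "ap"]],
    "ap":   [["aaux", "a1"], ["fin", "a1"]],
    "a1":   [["a0", "conj"], ["a0", "fin"]],
    "conj": [["c0", "a0"]],
    "vp":   [["vaux", "cvp"]],
    "cvp":  [["vt0", "np"], ["vi0", "pp"], ["vi0", "fin"],
             ["vn0", "ap"], ["vti0", "ip"]],
    "pp":   [["p0", "np"]],
}

# A tiny lexicon using example words from Appendix B. "fin" is empty per
# the paper; treating "i0" as empty is an assumption of this sketch.
LEXICON = {
    "a0": ["diskursif", "intertekstual"],
    "aaux": ["yang"],
    "c0": ["atau", "dan", "meskipun"],
    "det1": ["seberkas", "sesosok"],
    "det2": ["tadi", "tersebut"],
    "fin": [""],
    "i0": [""],
    "n0": ["hasrat", "instrumen"],
    "vaux": ["mungkin", "pasti"],
    "vi0": ["berproduksi", "terbebat"],
    "vn0": ["menjadi", "merupakan"],
    "vt0": ["menahbiskan", "mengekstrapolasi"],
    "vti0": ["menandaskan", "mengatakan"],
    "p0": ["dengan", "lewat"],
}


def expand(symbol, rng):
    """Return the list of words derived from `symbol`."""
    if symbol in GRAMMAR:
        words = []
        # The randomizer operates at '|': pick one alternative.
        for child in rng.choice(GRAMMAR[symbol]):
            words.extend(expand(child, rng))
        return words
    # Terminal symbol: the randomizer selects a lexical item.
    word = rng.choice(LEXICON[symbol])
    return [word] if word else []


def sentence(rng):
    """Generate one sentence starting from the ip symbol."""
    return " ".join(expand("ip", rng)).capitalize() + "."


print(sentence(random.Random()))
```

The only recursive alternative is `cvp → vti0 ip`, so with uniform choices a derivation nests with probability 1/5 at each sentence level and terminates quickly in practice. With this small lexicon every noun phrase carries a determiner and an adjective; a fuller dictionary would presumably relax that.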
3.1 Lexicon

With respect to the theme mentioned above, we choose from the literature of the field some of the most popular words of all categories. Nouns are easy to identify, as are adjectives. Verbs are trickier, since applying an arbitrary verb to any given noun makes the sentence even more meaningless. We will deal with this issue in the future by applying a simple ontology. A sample sentence:

i. Lanskap imagologi mampu melatari signifikasi.
I. Landscape of imagology is able to become a background of signification.

3.2 Coherence

Saptono does not produce only jargon (i.e. np's) or a single sentence; it outputs a full essay. For this purpose, we need a procedure to simulate coherence or discourse. Our discourse analyzer is a simple procedure: take the last noun appearing in the previous sentence as the subject noun of the next sentence. This procedure applies only at the level of sentences. In a sentence with more than 2 n0's (e.g. a sentence with a pp, or a nested sentence with rewriting rule ip → . . . → ip → . . . ), the procedure only applies to the first and the last n0 of the same sentence. A sample:

ii. Parodi yang cenderung transpersonal tak pelak meninabobokkan budaya akibat polivalensi ortodoksi Saussurean dan metaforis mustahil mensedimentasikan surealisme yang sekilas subyektif. Surealisme tersebut dengan ragu disimulasikan oleh Liyan.
II. Parody that is transpersonal definitely stupefies culture because polyvalency of Saussurean and metaphorical orthodoxy is impossible to sediment surrealism that is at a glance subjective. This surrealism is doubtfully simulated by the Other.

3.3 Vagueness

Vagueness, which in its most well-known (but not in all) forms is the existence of modal operators, is added to category vaux. This means that modal operators (mungkin [possibly], seharusnya [necessarily]) have no more special position than that of a monadic logical connective (tidak, bukan [not]) or, specific to Indonesian, a temporal operator (belum [not yet], sudah [already], akan [will], etc.). In addition, we will not develop a temporal reasoner for Saptono, so it is impossible for Saptono to describe a process meaningfully.

3.4 Compound

To add more power in mimicking natural language expressions, we use not only single words for each lexical item but also compounds of words. For nouns, verbs, determiners and adjectives, this depends on the theme, but for other categories (vaux, aaux), the compounds are more persistent, or less theme-dependent.

3.5 Titling, Quotation and Reference

Again with respect to the theme under consideration, and to add to its mimicking power, we also include a title generated as a jargon (or np) by what is simply a subset of the same generator. In addition, several quotations are retrievable from a database based on keywords. The quotations are taken from real references. Here is a title:

iii. Ethos imagologi simulakral atau transpersonal.
III. Ethos of simulacral or transpersonal imagology.

And here is a sample quotation between two sentences:

iv. Makna transpersonal tadi tak mampu terlihat subyektif dan Saussurean. Baudrillard menandaskan bahwa,

    "Di luar tarikan gravitasi yang mempertahankan badan kita agar tetap dalam orbit, seluruh atom makna akan tersesat atau membebaskan diri di luar angkasa." (Baudrillard, 1999)

    Multiplisitas makna neo-Marxian dengan yakinnya berjalan.

with keyword makna (meaning) and corresponding reference:

v. Baudrillard, J. 1999. Galaksi Simulacra.
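The coherence procedure of Section 3.2 and the keyword-based quotation lookup of Section 3.5 can be sketched as follows. This is a hedged Python illustration, not Saptono's SWI-Prolog code: sentences are assumed to be available as lists of (word, category) pairs, `make_sentence` is a toy stand-in for the real grammar, and the names `last_noun`, `essay`, `quotes_for` and the layout of the `QUOTES` database are hypothetical. The words and the quotation are taken from the samples above.

```python
import random

# Toy lexicon; the words occur in the paper, the grouping is assumed.
NOUNS = ["makna", "budaya", "hasrat", "instrumen"]
VAUX = ["mungkin", "pasti"]
VT0 = ["menahbiskan", "mengekstrapolasi"]

# Hypothetical quotation database: keyword -> (quotation, reference).
QUOTES = {
    "makna": (
        "Di luar tarikan gravitasi yang mempertahankan badan kita agar "
        "tetap dalam orbit, seluruh atom makna akan tersesat atau "
        "membebaskan diri di luar angkasa.",
        "Baudrillard, J. 1999. Galaksi Simulacra.",
    ),
}


def make_sentence(subject, rng):
    """Toy stand-in for the grammar: subject, auxiliary, verb, object."""
    return [
        (subject, "n0"),
        (rng.choice(VAUX), "vaux"),
        (rng.choice(VT0), "vt0"),
        (rng.choice(NOUNS), "n0"),
    ]


def last_noun(tagged):
    """Return the last word tagged n0 in a sentence, or None."""
    for word, category in reversed(tagged):
        if category == "n0":
            return word
    return None


def essay(n_sentences, rng):
    """Chain sentences: the last noun becomes the next subject noun."""
    sentences = []
    subject = rng.choice(NOUNS)
    for _ in range(n_sentences):
        current = make_sentence(subject, rng)
        sentences.append(current)
        subject = last_noun(current)
    return sentences


def quotes_for(tagged, database):
    """Look up quotations whose keyword is a noun of the sentence."""
    nouns = dict.fromkeys(w for w, category in tagged if category == "n0")
    return [database[w] for w in nouns if w in database]


for tagged in essay(3, random.Random()):
    print(" ".join(word for word, _ in tagged).capitalize() + ".")
```

Only the first and the last noun of a sentence take part in the chaining, which matches the remark in Section 3.2 that nouns in between are ignored.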
4 Further Works

We are working on how to apply a simple ontology to all lexical categories. One candidate approach is to exploit the lexical items that lack extension (Cf. e.g.: Sowa, 2000 for a complete review), which are probably closely related to abstract concepts. This effort is a necessary requirement before we arrive at a higher target: the specific functioning of the language of thought, or mentalese (Cf. e.g.: Pinker, 1994).

5 Acknowledgment

The author thanks BFI colleagues and Bambang Subarnaz for being the first volunteers in the Turing Test with Saptono. This research is financially supported by Surya Research Int’l.

6 References

Allen, J. (1995). Natural Language Understanding. New York: Benjamin/Cummings.

Bulhak, A. (1996). The Dada Engine version 1.0 Manual. http://dev.null.org/dadaengine/manual-1.0/dada.html

Haegeman, L. (1994). Introduction to Government and Binding Theory. Oxford: Blackwell.

Pinker, S. (1994). The Language Instinct: The New Science of Language and Mind. London: Penguin.

Sowa, J. (2000). Knowledge Representation: Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks/Cole.

7 Appendix A: A sample of generated essay

Analisis obyek yang cenderung positivistik serta intertekstual.

oleh Saptono

Sedimentasi obyek urung terakumulasi lewat bantuan perwujudan budaya. Dengan demikian, budaya diskursif dan fetish tersebut kadang-kadang menyembunyikan manifestasi budaya lewat logos Diri yang neo-Marxian. Manifestasi Diri psikoanalitis mungkin melatari manifestasi imagi yang cenderung metaforis atas nama keniscayaan imagi simulakral. Kondisi imagi melangkah dengan mediasi logos komodifikasi. Komodifikasi berspekulasi atas nama prolegomena obyek neo-Marxian dan Lacanian. Enunsiasi obyek terkadang melatar-belakangi tinjauan Marxisme dengan multiplisitas kebermaknaan. Kebermaknaan belum berspekulasi. Analisis kebermaknaan mentransformasikan keniscayaan Liyan. Liyan nostalgis bisa saja mengejawantah. Liyan yang sekilas nostalgis mentransformasikan sedimentasi obyek. Obyek yang fetish serta Lacanian terkadang merupakan keniscayaan kuasa yang cenderung psikoanalitis atau Derridean dengan logos Marxisme meskipun sedimentasi refleksi diziarahi melalui kondisi hasrat maskulin. Jika hasrat yang diskursif dan subyektif disimulasikan oleh kondisi budaya, hasilnya, komodifikasi bisa saja bersikukuh, analisis komodifikasi terlihat fetish. Inferioritas komodifikasi menjadi pos-Whorfian sekaligus psikoanalitis dengan mediasi logos makna pos-Whorfian.

Baudrillard menandaskan bahwa, "Di luar tarikan gravitasi yang mempertahankan badan kita agar tetap dalam orbit, seluruh atom makna akan tersesat atau membebaskan diri di luar angkasa." (Baudrillard, 1999)

Enunsiasi makna metaforis menyembunyikan perwujudan kuasa atas campur tangan multiplisitas imagi simulakral. Prolegomena imagi terlalu terakumulasi. Kalaulah logos imagi nampak seakan-akan subyektif sekaligus psikoanalitis atas nama manifestasi imagi, imagi akan tampak neo-Marxian dan Lacanian. Kalaupun imagi tersebut tak mustahil berjalan melalui perwujudan hasrat diskursif atau bahkan subyektif, tak heran Diri diskursif urung terbungkam. Logos Diri yang cenderung Derridean menandaskan, kondisi polivalensi yang sekilas psikoanalitis cenderung nostalgis. Enunsiasi polivalensi yang simulakral terlihat neo-Marxian. Multiplisitas polivalensi yang sekilas transpersonal terbebat. Manifestasi polivalensi akan bergeming. Tinjauan polivalensi berproduksi. Manifestasi polivalensi berproduksi melalui logos instrumen yang cenderung diskursif akibat makna tak mustahil terakumulasi.

Makna transpersonal sekaligus Lacanian melatari posmodernisme atas nama prolegomena instrumen Foucauldian. Eksterioritas instrumen psikoanalitis telah dibungkam oleh kondisi kuasa. Tinjauan kuasa Derridean serta Saussurean mengekstrapolasi keniscayaan strategi maskulin atau bahkan simulakral dengan analisis diskursus. Eksterioritas diskursus menjadi intimidatif melalui perwujudan feminisme Lacanian atau maskulin meskipun multiplisitas instrumen mengatasi komodifikasi yang cenderung fetish atas nama tinjauan obyek subyektif. Kalau obyek terbungkam, tak heran hasrat diskursif sekaligus neo-Marxian memperdaya multiplisitas posmodernitas metaforis serta transpersonal dengan prolegomena posmodernitas fetish atau pos-Whorfian. Posmodernitas psikoanalitis sekaligus fetish tersebut akan berubah menjadi ambigu atau bahkan neo-Marxian dengan mediasi analisis budaya maskulin. Prolegomena budaya yang positivistik menahbiskan eksterioritas hasrat lewat bantuan manifestasi feminisme. Akibatnya, sedimentasi feminisme tidak nampak seakan-akan transpersonal lewat bantuan kondisi komodifikasi psikoanalitis sekaligus positivistik.

-------------
Baudrillard, J. 1999. Galaksi Simulacra.

8 Appendix B: The grammar

Below are the lexical categories used in the grammar of Saptono. Some categories (e.g.: i0, i1, a1, conj) are omitted because they do not really have any lexical function but are required by the grammar, or are directly terminated with an empty string, such that they require no linguistic-specific labels.

NON-TERMINAL SYMBOLS
ap      adjective phrase
cn      articled noun phrase
cnp1    noun phrase with indefinite article
cnp2    noun phrase with definite article
cvp     verb selector
ip      inflectional phrase, also known as sentence
np      noun phrase
pp      prepositional phrase
vp      verb phrase

TERMINAL SYMBOLS
a0      adjective                               {diskursif, intertekstual, . . .}
aaux    auxiliary to adjective phrase           {yang, . . .}
c0      conjunction                             {atau, dan, meskipun, . . .}
det1    indefinite article                      {seberkas, sesosok, . . .}
det2    definite article                        {tadi, tersebut, . . .}
fin     empty                                   {}
n0      noun                                    {hasrat, instrumen, . . .}
vaux    auxiliary to verb phrase                {mungkin, pasti, . . .}
vi0     intransitive verb                       {berproduksi, terbebat, . . .}
vn0     nominal verb                            {menjadi, merupakan, . . .}
vt0     transitive verb                         {menahbiskan, mengekstrapolasi, . . .}
vti0    intransitive verb for nested sentence   {menandaskan, mengatakan, . . .}
p0      preposition                             {dengan, lewat, . . .}