EUROPEAN POLYTECHNICAL INSTITUTE KUNOVICE
PROCEEDINGS
FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT
ICSC 2006
January 27, Kunovice, Czech Republic
Edited by: Prof. Ing. Imrich Rukovanský, CSc., and Doc. Ing. Pavel Ošmera, CSc.
Prepared for print by: Bc. Andrea Šimonová, DiS., Bc. Pavel Kubala, DiS., and Ing. Petr Matušík
Printed by:
© European Polytechnical Institute Kunovice, 2006 ISBN : 80–7314–084-5
Organized by
THE EUROPEAN POLYTECHNICAL INSTITUTE, KUNOVICE THE CZECH REPUBLIC
Conference Chairman
Ing. Oldřich Kratochvíl, Dr.h.c., rector
Conference Co-Chairmen
Prof. Ing. Imrich Rukovanský, CSc.
Assoc. Prof. Ing. Pavel Ošmera, CSc.
INTERNATIONAL PROGRAMME COMMITTEE O. Kratochvíl – Chairman (CZ) M. Baraňski (Poland) J. Baštinec (Czech Republic) J. Diblík (Czech Republic) P. Dostál (Czech Republic) U. K. Chakraborthy (USA) B. Kulcsár (Hungary) V. Mikula (Czech Republic)
P. Ošmera (Czech Republic) J. Petrucha (Czech Republic) I. Rukovanský (Czech Republic) G. Vértesy (Hungary) W. Zamojski (Poland) J. Zapletal (Czech Republic) T. Walkowiak (Poland)
ORGANIZING COMMITTEE I. Rukovanský (Chairman) P. Ošmera A. Šimonová P. Kubala J. Kavka P. Matušík M. Balus I. Polášková
Session 1: ICSC Chairman: Doc. RNDr. Josef Zapletal, CSc.
J. Šáchová T. Chmela J. Míšek Š. Mikuláš R. Jurča M. Zálešák
CONTENTS
A MESSAGE FROM THE GENERAL CHAIRMAN OF THE CONFERENCE
A NOTE ON DECISION MAKING UNDER RISK AND UNCERTAINTY (Zapletal Josef)
THE USE OF FUZZY LOGIC FOR SUPPORT OF DIRECT MAILING (Dostál Petr)
THE COLLATION OF VARIOUS METHODS FOR THE SOLUTION OF TRANSPORTATION PROBLEMS (Abdurrzzag Tamtam)
SOLUTION OF STRUCTURAL INTERBRANCH SYSTEM OF A DYNAMIC MODEL (Baštinec Jaromír, Diblík Josef)
PROBABILITY THEORY AND STATISTICS IN THE COMBINED FORM OF STUDY OF THE BACHELOR STUDENT PROGRAMMES AT FEEC BUT (Novák Michal)
APPLICATION OF FUZZY SYSTEMS TO SUPPORT DECISION MAKING AND CONTROL (Mikula Vladimír, Petrucha Jindřich)
EXAMPLES OF USING CONCEPTS OF PROBABILITY THEORY IN MANAGEMENT DECISION MAKING (Novák Michal, Fajmon Břetislav)
OPTIMIZATION OF MATERIAL CHARACTERIZATION BY ADAPTIVE TESTING (Vértesy Gábor, Tomáš Ivan, Mészáros István)
USE OF A COMPLETE GENETIC ALGORITHM TO OPTIMIZE A PRODUCTION PROCESS WITH RESPECT TO PROFIT MAXIMIZATION (Kostiha Jiří)
REVITALIZING COMPANY INFORMATION SYSTEMS AND COMPETITIVE ADVANTAGES (Lacko Branislav)
MODEL LEARNING AND INFERENCE THROUGH ANFIS (Amalka Al Khatib)
GRAMMATICAL EVOLUTION WITH BACKWARD PROCESSING (Ošmera Pavel, Popelka Ondřej, Rukovanský Imrich)
OBJECT RECOGNITION BY MEANS OF NEW AL (Šťastný Jiří, Minařík Martin)
APPLICATION OF GRAPH THEORY IN AN INTELLIGENT TRANSPORT SYSTEM (Klieštik Tomáš)
THE VORTEX-FRACTAL THEORY OF THE UNIVERSE STRUCTURES (Ošmera Pavel)
VORTEX-FRACTAL PHYSICS (Ošmera Pavel)
THE IMPORTANCE OF COMPUTER NETWORK MONITORING (Rukovanský Imrich)
DATA ANALYSIS USING NEURAL NETWORKS AND CONTINGENCY TABLES (Petrucha Jindřich)
DETECTION OF INITIAL DATA GENERATING BOUNDED SOLUTIONS OF LINEAR DISCRETE EQUATIONS (Baštinec Jaromír, Diblík Josef)
ON SOME PROPERTIES OF FRACTIONAL CALCULUS (Krupková Vlasta, Šmarda Zdeněk)
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT” EPI Kunovice, Czech Republic, January 27, 2006
ON THE STABILITY OF LINEAR INTEGRODIFFERENTIAL EQUATIONS (Šmarda Zdeněk)
EXISTENCE OF POSITIVE SOLUTIONS FOR RETARDED FUNCTIONAL DIFFERENTIAL EQUATIONS WITH UNBOUNDED DELAY AND FINITE MEMORY (Diblík Josef, Svoboda Zdeněk)
APPLICATION OF NON SIMPLEX METHOD FOR LINEAR PROGRAMMING (Tomšová Marie)
AUTHOR INDEX
A MESSAGE FROM THE GENERAL CHAIRMAN OF THE CONFERENCE
Dear guests and conference participants,
You are receiving the proceedings of the 4th scientific conference ICSC 2006, the International Conference on Soft Computing Applied in Computer and Economic Environment.
Ing. Oldřich Kratochvíl, Dr.h.c.
Prof. Ing. Imrich Rukovanský, CSc.
It is my pleasure to thank Prof. Ing. Imrich Rukovanský, CSc., for preparing the conference. Fuzzy logic and neural networks have become an important part of the work at our University over the last four years. The conference participants presented papers on their scientific work and on the results obtained during the past year. Some academics whose papers are part of these proceedings (Doc. Pavel Ošmera, CSc., Ing. Dostál) will submit their results to well-known journals abroad, noting that their research results were first published in these proceedings.
I am pleased that academics from Czech, Slovak, Hungarian, Polish and Russian universities took part in the conference. The conference was not only an important scientific event but also a social one. Allow me to wish every participant much success. I kindly invite them to join us at the 5th conference, ICSC 2007.
Kunovice, January 27, 2006
Dipl. Ing. Oldřich Kratochvíl, Dr.h.c. rector
A NOTE ON DECISION MAKING UNDER RISK AND UNCERTAINTY
Josef Zapletal
Evropský polytechnický institut, s.r.o.
Abstract: The article describes the methods of decision analysis, which differ somewhat from the decision methods of operations research; the latter are primarily optimization tools suited to simpler, well-structured decision problems. By contrast, the characteristic feature of decision analysis is that it tries to combine exact procedures and modelling tools with the knowledge and experience of the people solving these problems. Heuristic methods significantly influence both the procedures and the resulting solutions. We present the basic concepts, methods and tools of decision analysis, that is, of decision making under risk and uncertainty. These include the concept of subjective probability, the utility function under risk, and some graphical tools that support the solution of decision problems under risk and uncertainty.
Keywords: decision analysis, operations research, deterministic methods, stochastic methods, subjective probability, betting odds, utility function under risk, decision maker's attitude to risk, risk aversion, risk seeking, concave, linear and convex utility functions, certainty equivalent.
1 Introduction
This contribution is meant as a methodological guide for EPI students whose projects deal with decision making, in particular with decisions in the control of non-deterministic processes. My starting material was the textbook Manažérské rozhodování (Managerial Decision Making) by Jiří Fotr and Jiří Dědina, together with the works [2], [4], [5], [7], [13]. The example in the subsection on the method of relative magnitudes is taken from [6].
2 Subjective probabilities
2.1 Objective and subjective probability
An important part of preparing a decision is to clarify the possible future situations, especially the expected economic and political influences that affect the consequences of the decision variants under consideration. Some of these circumstances may be unfavourable (bumper harvests across large geographic regions, overproduction of certain goods in large, economically strong states, etc.). Others may be favourable (a market boom, the withdrawal of a competitor from the market as a result of modernized production, gaining new outlets for established production, and so on). For further evaluation it is therefore necessary to quantify in some way the degree of danger of the unfavourable influences and the degree of promise of the favourable ones. This measure is usually probability. Risk situations and their placement in a specific time, however, are usually not fully and regularly repeatable. We are mostly not in the position of a manager who has to decide how many spare parts to keep in reserve for a given expensive machine and who knows how many of these parts have failed and with what probability, so that a kind of objective system of probabilities is available to start from. When deciding about general possible situations, the manager has no past statistical data and can rely only on so-called subjective probabilities to assess the risk situations. These rest on the assumption that every subject, which is not merely a single individual, has a certain expectation of future development and a degree of belief in it. Subjective probability then expresses the degree of "personal conviction" of the subject (a whole team of forecasters and managers) that a certain phenomenon or event will occur.
2.2 Numerical expression of subjective probability
Subjective probability can be expressed either numerically or verbally. The numerical expression can take two forms. The first form uses numbers from 0 to 1, or from 0% to 100%.
A probability of zero means that the given situation or event will certainly not occur; a probability of 1, or 100%, indicates that it will occur with certainty. The second form of numerical expression is either a ratio giving the number of occurrences of the event out of the total number of possible cases (e.g. a failure of a certain device occurs on average once in 100 days), or so-called betting odds. In the latter case the manager expresses his belief in the occurrence of the event (for instance, the market success of a newly developed product) by a statement such as: "I would bet 3:1 that the product will succeed on the market." The probability of the product's market success is then
$$\frac{3}{3+1} = 0.75.$$
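The conversion from betting odds to a probability can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name `odds_to_probability` and its argument names are assumptions introduced here, not part of the original text.

```python
from fractions import Fraction


def odds_to_probability(in_favour: int, against: int) -> Fraction:
    """Convert betting odds 'in_favour : against' into a probability.

    Hypothetical helper: odds of a:b are read as a chances for the event
    against b chances that it does not occur, so p = a / (a + b).
    """
    if in_favour < 0 or against < 0 or in_favour + against == 0:
        raise ValueError("odds must be non-negative and not both zero")
    return Fraction(in_favour, in_favour + against)


# The manager's statement "I would bet 3:1 that the product succeeds".
p_success = odds_to_probability(3, 1)
print(float(p_success))  # 0.75
```

Exact fractions are used so that the result does not depend on floating-point rounding; converting to `float` is only needed for display.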
Numerical subjective probabilities are usually determined in cooperation with specialists in the relevant field. When assessing the success of a product's development this may be the head of the development team; when estimating sales volume it may be a marketing expert. To illustrate how subjective probabilities are determined, we present two methods: the method of relative magnitudes and the quantile method.
Method of relative magnitudes
This method is suitable for determining the subjective probabilities of events of which there is only a limited number. First, the event (situation) that the expert considers most probable is identified. Its probability then serves as the basis for determining the probabilities of the other events (situations). For clarity we give the following illustrative example. Suppose a company is buying new production equipment and, together with the purchase, must order a certain number of pieces of an important and usually very expensive spare part that fails at random. As a basis for this order, one must determine the probabilities of the individual numbers of failures of the part (in our case these are the events, or situations, that may occur) during the service life of the purchased equipment. If such equipment is already in operation in several installations and failure statistics are kept, the task belongs to operations research, specifically to inventory theory. Mostly, however, sufficiently extensive statistics are not available, and the subjective probabilities of the individual failure counts are determined from information obtained in a discussion between the analyst and an expert. Such a discussion yielded the following:
• the maximum expected number of failures is five (so the number of failures can range from zero to five);
• the most probable number of failures is two;
• the probabilities of one and of three failures are equal and approximately half the probability of two failures;
• the probabilities of no failure and of five failures are equal and roughly one tenth of the probability of two failures;
• the probability of four failures is approximately one fifth of the probability of two failures.
If we denote the probability of two failures (the most probable number) by $P$ and the probability of $i$ failures by $p_i$, the above implies
$$p_2 = P, \qquad p_1 = p_3 = \frac{P}{2}, \qquad p_0 = p_5 = \frac{P}{10}, \qquad p_4 = \frac{P}{5}.$$
The second of these equations states that the probability of one failure equals the probability of three failures, and both are approximately half the probability of two failures, which we denoted $P$. Because the whole probability space consists of six values (zero to five failures), it must hold that
$$p_0 + p_1 + p_2 + p_3 + p_4 + p_5 = 1.$$
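Substituting $p_0 = \frac{P}{10}$, $p_1 = \frac{P}{2}$, $p_2 = P$, etc. into this equation, we obtain
$$\frac{P}{10} + \frac{P}{2} + P + \frac{P}{2} + \frac{P}{5} + \frac{P}{10} = 1.$$
The solution of this equation is $P \approx 0.42$, from which we determine the probabilities of the individual numbers of failures:
$$p_0 = \frac{P}{10} = \frac{0.42}{10} \approx 0.04, \qquad p_1 = \frac{P}{2} = \frac{0.42}{2} = 0.21, \qquad p_2 = P = 0.42,$$
$$p_3 = \frac{P}{2} = \frac{0.42}{2} = 0.21, \qquad p_4 = \frac{P}{5} = \frac{0.42}{5} \approx 0.08, \qquad p_5 = \frac{P}{10} = \frac{0.42}{10} \approx 0.04.$$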
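The calculation above amounts to normalizing the expert's relative weights so that they sum to one. The following Python sketch reproduces it under that reading; the function name `relative_magnitude_probabilities` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def relative_magnitude_probabilities(weights):
    """Normalize relative weights so that the resulting probabilities sum to 1.

    Each weight is the probability of an outcome expressed as a multiple
    of P, the probability of the most likely outcome (hypothetical helper).
    """
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}


# Expert judgements from the example: number of failures -> multiple of P.
weights = {
    0: Fraction(1, 10),
    1: Fraction(1, 2),
    2: Fraction(1),
    3: Fraction(1, 2),
    4: Fraction(1, 5),
    5: Fraction(1, 10),
}

probabilities = relative_magnitude_probabilities(weights)
for failures, p in probabilities.items():
    print(failures, round(float(p), 2))
```

The weights sum to 2.4, so the exact value is P = 1/2.4 = 5/12, which rounds to the 0.42 used in the text.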
The subjective probabilities determined in this way form the probability distribution of the number of failures. This distribution can be written either as a table (see the probability row of Table 1), where each number of failures has a corresponding probability, or graphically as a histogram, with the numbers of failures on the x-axis and the corresponding probabilities on the y-axis (see Fig. 1).
Table 1 Probability distribution
Number of failures        0      1      2      3      4      5
Probability               0.04   0.21   0.42   0.21   0.08   0.04
Cumulative probability    0.04   0.25   0.67   0.88   0.96   1
Fig. 1 Probability distribution of the number of failures (histogram; x-axis: number of failures, 0 to 5; y-axis: probability, 0 to 0.45)
Note
The probability distribution of the number of failures can also be expressed by means of cumulative probabilities (see the cumulative probability row of Table 1). These express the probability that the number of failures will be less than or equal to a given number. For example, the probability that the number of failures will be at most two (i.e. two, one or no failures occur) is 0.67. We obtain this probability by summing (cumulating) the probabilities of zero, one and two failures: 0.67 = 0.04 + 0.21 + 0.42. The cumulative probability of zero to five failures is one; it can be stated with certainty that the number of failures during the given period of operation will not exceed five. Taken together, the cumulative probabilities in Table 1 define the distribution function of the random variable giving the number of failures.
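The cumulative row of Table 1 is a running sum of the probability row. A minimal Python sketch, using the rounded values from the table (the names `cumulative` and `distribution_function` are introduced here for illustration only):

```python
from itertools import accumulate

failures = [0, 1, 2, 3, 4, 5]
probabilities = [0.04, 0.21, 0.42, 0.21, 0.08, 0.04]  # probability row of Table 1

# Running sums, rounded to two decimals to suppress floating-point noise.
cumulative = [round(c, 2) for c in accumulate(probabilities)]


def distribution_function(k):
    """P(number of failures <= k) for the tabulated distribution."""
    return round(sum(p for n, p in zip(failures, probabilities) if n <= k), 2)


print(cumulative)
print(distribution_function(2))  # 0.04 + 0.21 + 0.42
```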
Quantile method
This method is suitable for determining a subjective probability distribution when the number of possible events (situations) is large or infinite. Most risk factors are of this kind, e.g. purchase and selling prices of products and raw materials, the level of demand, exchange rates, etc. The essence of the quantile method is best seen from an example: determining the probability distribution of future demand for a product newly introduced to the market. Suppose the discussion between the analyst and the marketing expert showed that annual demand may range from five thousand to ten thousand pieces (the pessimistic estimate of five thousand and the optimistic estimate of ten thousand delimit the interval within which demand can move). One can then proceed in two ways. In the first, the marketing expert states the demand levels that in his opinion correspond to certain fixed probability values, e.g. 0.25, 0.5 and 0.75. In the second, he states the probabilities that in his judgement correspond to certain chosen demand levels, e.g. six, seven, eight and nine thousand pieces. If, for example, in the first case the discussion led to the conclusion that probability 0.25 corresponds to a demand of seven thousand pieces, and probabilities 0.5 and 0.75 to demands of eight thousand and eight thousand five hundred pieces, then these pairs, together with the pairs (0; 5 000) and (1; 10 000), represent the subjective probability distribution of demand (see Fig. 2). Each pair is to be read as follows: for instance, the probability that annual demand for the product will not exceed seven thousand pieces is 0.25 (i.e. the probability that demand will be less than or at most equal to seven thousand pieces).
The probability that demand will not exceed eight thousand pieces is 0.5, the probability that it will not exceed eight thousand five hundred pieces is 0.75, and finally the probability 1 assigned to ten thousand means that the marketing expert considers it entirely certain that annual demand for the product will not exceed ten thousand pieces.
Probability    Demand (pcs/year)
1              10 000
0.75           8 500
0.5            8 000
0.25           7 000
0              5 000
Fig. 2 Demand levels for the given probability values
The subjective probability distribution of demand for the product can again be displayed graphically, with demand on the x-axis and the corresponding probabilities on the y-axis. This gives the graph of the distribution function.
The distribution function of demand shown in Fig. 3 corresponds to the cumulative probabilities of the number of failures in the previous example. For demand (and other risk factors with a large number of possible values) it is, however, also possible to determine a counterpart of the (non-cumulative) probabilities. This is the probability density, whose graph for our demand example is shown in Fig. 4. The x-axis again carries the demand values, but the y-axis carries not probabilities but the probability density. The density graph is interpreted as follows: the probability of a given range of demand values is the area of the corresponding region under the density curve in Fig. 4. The total area under the curve is normalized to one (expressing the certainty that annual demand for the product will be neither below five thousand pieces nor above ten thousand). From Fig. 2 and Fig. 3 we see that the probability that demand will not exceed seven thousand pieces is 0.25. The same probability (i.e. that demand will lie between five thousand and seven thousand pieces) is expressed by the area under the curve in Fig. 4 from the origin at five thousand to the vertical at seven thousand.
Fig. 3 Distribution function of demand (x-axis: demand, marked at 5000, 7000, 8000, 8500 and 10000 pcs/year; y-axis: cumulative probability, marked at 0, 0.25, 0.5, 0.75 and 1)
Similarly, the probability that demand lies in the interval from seven thousand to eight thousand five hundred pieces is given by the area of the hatched region under the curve in Fig. 4, bounded by the verticals at seven thousand and at eight thousand five hundred pieces. This probability can also be obtained from the corresponding values in Fig. 2 or Fig. 3 as the difference between the probability that demand is at most eight thousand five hundred pieces and the probability that it does not exceed seven thousand pieces, i.e. 0.75 – 0.25 = 0.5. The distribution function (see Fig. 3) and the probability density (see Fig. 4) thus stand in a one-to-one relationship: knowing one curve, we can determine the other.
Fig. 4 Probability density of demand (x-axis: demand, marked at 5000, 6000, 7000, 8000, 8500 and 10000 pcs/year)
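The relationship between the quantile pairs, the distribution function and the density can be illustrated in Python. The sketch below assumes that the distribution function is piecewise linear between the expert's quantile points; this interpolation is an assumption made here for simplicity (the original figures show smooth curves), and all function names are hypothetical.

```python
from bisect import bisect_right

# (demand in pcs/year, cumulative probability) pairs from the example.
QUANTILES = [(5000, 0.0), (7000, 0.25), (8000, 0.5), (8500, 0.75), (10000, 1.0)]


def distribution_function(x, points=QUANTILES):
    """Piecewise-linear distribution function through the quantile points."""
    xs = [demand for demand, _ in points]
    if x <= xs[0]:
        return 0.0
    if x >= xs[-1]:
        return 1.0
    i = bisect_right(xs, x)  # segment with xs[i-1] <= x < xs[i]
    (x0, p0), (x1, p1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (x - x0) / (x1 - x0)


def interval_probability(a, b):
    """P(a < demand <= b) as a difference of distribution function values."""
    return distribution_function(b) - distribution_function(a)


def density_steps(points=QUANTILES):
    """Slope of each linear segment, i.e. a step-function density."""
    return [((x0, x1), (p1 - p0) / (x1 - x0))
            for (x0, p0), (x1, p1) in zip(points, points[1:])]


print(interval_probability(7000, 8500))  # 0.75 - 0.25
for (lo, hi), d in density_steps():
    print(lo, hi, d)
```

Under this assumption the density is highest between 8 000 and 8 500 pieces, where the distribution function rises most steeply, and the areas of the density steps add up to one.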
2.3 Verbal expression of subjective probabilities
The advantage of expressing subjective probabilities numerically is its unambiguity. If we want to use these probabilities in building and solving models that support managerial decision making, only the numerical expression can be used. A certain drawback is that managers often avoid quantitative statements and prefer to work with verbal descriptions of subjective probabilities, which are generally understandable and acceptable. There is a certain relationship between numerical values and verbal descriptions of subjective probabilities, which can be expressed e.g. by Table 2 (Tepper – Kápl 1991). The verbal expression, however, also has considerable drawbacks. It cannot be used for building mathematical models that support the preparation of a managerial decision. Moreover, practical experience shows that the correspondence between numerical and verbal expressions given in Table 2 is not a binding norm, and that different people understand the verbal descriptions differently and attach different meanings to them (for details see Moore, 1983). The ambiguity of verbal expressions of subjective probability is therefore a significant shortcoming, which can hamper communication in team problem solving. In view of this, the verbal expression can serve as a first stage, followed by the application of some method for determining these probabilities numerically.
Table 2 Numerical and verbal expression of subjective probabilities
Numerical     Verbal
0             Completely ruled out
0.1           Extremely improbable
0.2 – 0.3     Fairly improbable
0.4           Improbable
0.6           Probable
0.7 – 0.8     Fairly probable
0.9           Highly probable
1             Absolutely certain
3 Utility function
3.1 The decision maker's attitude to risk
In decision making under risk and uncertainty, especially in the phase of evaluating variants and selecting the one to be implemented, the decision maker's attitude to risk plays an important role. The decision maker (manager, entrepreneur) may be risk-averse, risk-seeking, or risk-neutral. A risk-averse decision maker tries to avoid highly risky variants and looks for low-risk variants that guarantee, with considerable certainty, results acceptable to him. A risk-seeking decision maker, by contrast, looks for highly risky variants (which offer the prospect of particularly good results but also carry a higher danger of bad results or losses) and prefers them to less risky ones. For a risk-neutral decision maker, aversion and inclination to risk are in balance. The attitude to risk is one of the basic concepts of the theory of decision making under risk and uncertainty. Its definition is based on the decision maker's behaviour when choosing between two variants, one risky and one riskless. Suppose, for example, that the risky variant leads with probability $p_1$ to the outcome $x_1$ and with probability $1 - p_1$ to the outcome $x_2$. Let the riskless variant guarantee with certainty an outcome equal to the expected (mean) value of the outcome of the first variant, i.e. $x_1 p_1 + x_2 (1 - p_1)$. By definition, the decision maker is risk-averse if and only if, in every situation of this type, he prefers the second (riskless) variant to the first (risky) one. If he always prefers the first, risky variant to the second, riskless one, he is risk-seeking. For a risk-neutral decision maker the two variants are indifferent (i.e. he values them equally).
Suppose, for example, that the decision maker chooses between two variants. The first leads with probability 0.5 to a profit of CZK 10 million and with the same probability to zero profit. The second guarantees with certainty a profit of CZK 5 million (exactly the expected profit of the first variant, since 0.5 · 10 million CZK + 0.5 · 0 = 5 million CZK). Here a risk-averse decision maker chooses the second variant, which yields CZK 5 million with certainty (he thus avoids the situation that could arise under the first variant and lead to zero profit). A risk-seeking decision maker chooses the first variant, valuing the considerable chance (50% probability) of a profit of CZK 10 million (CZK 5 million more than the second variant guarantees). For a risk-neutral decision maker the two variants are equally attractive. Several factors influence the decision maker's attitude to risk. The most important are his personal disposition, past experience (i.e. the success or failure of earlier decisions), and the environment in which the choice among risky variants takes place.
3.2 Construction of the utility function
The utility function under risk serves as a tool for expressing the decision maker's attitude to risk quantitatively. (The terminology is not uniform: besides "funkce utility za rizika", Czech texts also use the terms "funkce užitku za rizika" and "užitková funkce za rizika".) It can be shown (Keeney – Raiffa, 1976) that the utility function of a risk-averse decision maker is concave, while that of a risk-seeking decision maker is convex. The utility function of a risk-neutral decision maker is linear. The utility function for criteria of the revenue type and of the cost type, depending on the decision maker's attitude to risk, is shown in Fig. 5 and Fig. 6.
Fig. 5 Increasing utility function of a revenue-type criterion (x-axis: criterion; y-axis: utility, 0 to 1)
Legend:
1 ... Concave utility function of a risk-averse decision maker.
2 ... Linear utility function of a risk-neutral decision maker.
3 ... Convex utility function of a risk-seeking decision maker.
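The link between the shape of the utility function and the choice between a risky and a riskless variant can be checked numerically. The Python sketch below uses the earlier example (CZK 10 million or nothing with equal probability, against a certain CZK 5 million). The three utility functions are hypothetical representatives of the concave, linear and convex shapes, chosen here only for illustration; they do not come from the original text.

```python
import math


def expected_utility(outcomes, u):
    """Expected utility of a variant given as (outcome, probability) pairs."""
    return sum(p * u(x) for x, p in outcomes)


def preferred_variant(risky, certain, u, tol=1e-12):
    """Compare the risky variant with a certain outcome under utility u."""
    eu_risky = expected_utility(risky, u)
    u_certain = u(certain)
    if abs(eu_risky - u_certain) <= tol:
        return "indifferent"
    return "risky" if eu_risky > u_certain else "certain"


risky = [(10.0, 0.5), (0.0, 0.5)]  # profit in millions of CZK, probability
certain = 5.0                      # equals the expected profit of the risky variant

# Hypothetical utility functions on [0, 10], scaled so that u(0) = 0, u(10) = 1.
utilities = {
    "concave (risk-averse)": lambda x: math.sqrt(x / 10),
    "linear (risk-neutral)": lambda x: x / 10,
    "convex (risk-seeking)": lambda x: (x / 10) ** 2,
}

for name, u in utilities.items():
    print(name, "->", preferred_variant(risky, certain, u))
```

With these assumed functions the concave utility prefers the certain variant, the convex one prefers the risky variant, and the linear one is indifferent, in line with the definitions in Section 3.1.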
To understand the utility function properly, it must be stressed that it does not express the decision maker's overall attitude to risk, i.e. with respect to the whole set of evaluation criteria, but his attitude to risk with respect to one given criterion. For this reason the function is also called a partial, or one-dimensional, utility function. For a given decision maker its shape may (and usually does) differ somewhat between criteria (e.g. for some criteria the utility functions are concave, expressing risk aversion, while with respect to other criteria the decision maker is risk-neutral and the corresponding utility functions are linear). Before showing how to construct the utility function for a chosen evaluation criterion, we must introduce one more basic concept on which the construction rests: the certainty equivalent. The certainty equivalent (for a given evaluation criterion) of a variant that leads to outcomes (with respect to this criterion) $x_1, x_2, \dots, x_n$ with probabilities $p_1, p_2, \dots, p_n$ is the outcome whose utility equals the expected (mean) utility of the variant with respect to this criterion. (The decision maker thus values an outcome equal to the certainty equivalent, or rather a variant leading with certainty to that outcome, just as highly as the risky variant described above.) Hence
$$u(\hat{x}) = \sum_{i=1}^{n} p_i\, u(x_i), \tag{1}$$
where
$\hat{x}$ ... certainty equivalent,
$u(\hat{x})$ ... utility of the certainty equivalent,
$u(x_i)$ ... utility of the outcome $x_i$.
Fig. 6 Decreasing utility function of a cost-type criterion (x-axis: criterion; y-axis: utility, 0 to 1)
Legend:
1 ... Concave utility function of a risk-averse decision maker.
2 ... Linear utility function of a risk-neutral decision maker.
3 ... Convex utility function of a risk-seeking decision maker.
Let us return to the earlier example of two variants, which we used to demonstrate the attitude to risk (the first was risky and led with probability 0.5 to a profit of CZK 10 million and with the same probability to zero profit; the second guaranteed a profit of CZK 5 million with certainty). For the risky variant we have $x_1 = 10$ million, $x_2 = 0$ and $p_1 = p_2 = 0.5$. If the decision maker values this risky variant exactly as highly as a variant that guarantees with certainty a profit of, say, CZK 3 million (i.e. the utility of a certain profit of CZK 3 million equals the expected utility of the risky variant, or by relation (1) $u(3) = 0.5\,u(10) + 0.5\,u(0)$), then the certainty equivalent of the risky variant is exactly CZK 3 million. The certainty equivalent can also be interpreted somewhat differently. If we regard the risky variant as a lottery with prizes of CZK 10 million and CZK 0 (each won with probability 0.5), the certainty equivalent is the minimum amount for which the subject is willing to sell the lottery. In our case this selling price would be CZK 3 million. The decision maker's attitude to risk can also be characterized by the relation between the certainty equivalent of a risky variant and its expected outcome. If, as in the preceding example, the certainty equivalent of the risky variant (whose expected profit is 0.5 · 10 + 0.5 · 0 = CZK 5 million) is lower than this expected profit (3 < 5), the decision maker is risk-averse with respect to the profit criterion. If the certainty equivalent is higher than the expected profit, i.e. higher than CZK 5 million, the decision maker is risk-seeking with respect to profit. For a risk-neutral decision maker the certainty equivalent of the risky variant is in our case exactly CZK 5 million. The difference between the expected outcome of a risky variant and its certainty equivalent is sometimes called the risk premium (for details see Fotr – Píšek, 1986).
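The certainty equivalent and the risk premium can be computed directly from relation (1) once a utility function and its inverse are known. The Python sketch below assumes a hypothetical concave utility $u(x) = \sqrt{x/10}$; it was chosen only because its inverse is simple, and it yields a certainty equivalent of CZK 2.5 million rather than the CZK 3 million used as an example in the text. All function names are introduced here for illustration.

```python
def certainty_equivalent(outcomes, u, u_inverse):
    """Outcome whose utility equals the expected utility of the variant."""
    expected_u = sum(p * u(x) for x, p in outcomes)
    return u_inverse(expected_u)


def risk_premium(outcomes, u, u_inverse):
    """Expected outcome minus certainty equivalent."""
    expected_value = sum(p * x for x, p in outcomes)
    return expected_value - certainty_equivalent(outcomes, u, u_inverse)


risky = [(10.0, 0.5), (0.0, 0.5)]  # profit in millions of CZK, probability


def u(x):
    """Assumed concave utility on [0, 10] with u(0) = 0 and u(10) = 1."""
    return (x / 10) ** 0.5


def u_inverse(v):
    """Inverse of the assumed utility function."""
    return 10 * v ** 2


ce = certainty_equivalent(risky, u, u_inverse)
rp = risk_premium(risky, u, u_inverse)
print(ce, rp)
```

Because the certainty equivalent is below the expected profit of CZK 5 million, the risk premium is positive, which is the signature of risk aversion described above.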
3.3 Example of determining a utility function
We demonstrate the construction of a utility function by determining it for an evaluation criterion formed by profit. Suppose that, in solving a certain investment decision problem, several alternatives were formulated that are risky to different degrees and whose possible profit ranges from CZK 0.25 million to CZK 9.8 million. Our task is to help the decision maker, who is responsible for choosing the alternative to be implemented, to determine a utility function of profit.
The first step in determining the utility function for a given criterion is to define its domain. As the end points of the domain we can choose either the smallest and largest values of the criterion in the given set of alternatives (here CZK 0.25 million and CZK 9.8 million) or values obtained by rounding them suitably. We choose rounding and determine the utility function of profit on the interval with lower bound CZK 0 million and upper bound CZK 10 million.
Once the domain is fixed, we can set the utility values at its end points. Here we use the fact that a utility function under risk (like a utility function under certainty) does not express an absolute valuation (attractiveness for the decision maker) of the possible criterion values but a relative one, so the choice of utility values at the end points of the domain is arbitrary. (This arbitrariness follows from the fact that a utility function is unique up to a positive linear transformation, i.e. the originally determined utility function u(x) can be replaced by any other function v(x) satisfying v(x) = a · u(x) + b with a > 0.) It is customary, however, for benefit-type criteria to set the utility of the lower bound of the domain to zero and the utility of the upper bound to one (sometimes one hundred), and the other way round for cost-type criteria. In our case we therefore set the utility of zero profit to zero and the utility of a profit of CZK 10 million to one (so u(0) = 0 and u(10) = 1), which already gives us two points of the utility function of profit.
We determine several further points of the utility function by means of certainty equivalents. Certainty equivalents are determined in a dialogue between the analyst (consultant) and the decision maker, in which the analyst asks the decision maker a sequence of questions. The first questions are easy to answer, but the questions become more demanding as the dialogue approaches the certainty equivalent. In our case the dialogue might proceed roughly as follows.
The analyst first asks: "Do you prefer an alternative that guarantees you a certain profit of CZK 1 million, or a risky alternative that leads to a profit of CZK 10 million with probability 0.5 and to zero profit with the same probability?" The decision maker (unless his risk aversion is exceptionally strong) will probably answer that he prefers the risky alternative. The analyst's next question is then, for example: "Do you prefer the alternative with a certain profit of CZK 8 million, or the risky alternative?" The decision maker (unless he is very strongly risk-seeking) will probably answer that he values the risk-free alternative with a certain profit of CZK 8 million more highly.
The following questions are similar in nature. In the second step the analyst asks whether the decision maker prefers the alternative with a certain profit of CZK 1.5 million (and then CZK 6 million) to the risky alternative. If our decision maker represents the prevailing type with some aversion to risk, he will probably still prefer the risky alternative to a certain profit of CZK 1.5 million, and he will quite certainly value a certain profit of CZK 6 million more highly than the risky alternative. Further questions of this kind lead, in later steps, to a situation in which the decision maker values the risky alternative slightly more than a certain profit of CZK 2.5 million and, likewise, considers a certain profit of CZK 3.5 million slightly better than the risky alternative. If the analyst then finds that the decision maker values a certain profit of CZK 3 million exactly as highly as the risky alternative, we have determined the certainty equivalent of the risky alternative that yields a profit of CZK 10 million with probability 0.5 and zero profit with the same probability. This certainty equivalent is CZK 3 million.
This indirect way of determining the certainty equivalent of a risky alternative in a dialogue between the analyst and the decision maker is less demanding for the decision maker than the direct way, i.e. asking: "What certain profit do you value exactly as highly as the risky alternative leading with the same probability 0.5 to a profit of CZK 10 million and to zero profit?" Besides being more demanding, the direct way also gives less reliable results. (For more on determining certainty equivalents see Fotr-Píšek, 1986.)
The certainty equivalent allows us to determine a third point (in addition to the two end points characterised above) of the utility function of profit. By relation (1) we must have
$$u(3) = 0.5 \cdot u(0) + 0.5 \cdot u(10) \qquad (2)$$
and, after substituting u(0) = 0 and u(10) = 1 into (2), we obtain
$$u(3) = 0.5 \cdot 0 + 0.5 \cdot 1 = 0.5 \qquad (3)$$
We have thus determined the utility of the certainty equivalent of the risky alternative, which is 0.5, and we have a further (third) point of the utility function of profit, with coordinates 3 and 0.5. Two more points of the utility function are determined in the same way, using the certainty equivalents of further risky alternatives constructed from the known certainty equivalent of CZK 3 million. We determine (again in a dialogue between the analyst and the decision maker) the certainty equivalents of two risky alternatives: the first leads to zero profit with probability 0.5 and to a profit of CZK 3 million with the same probability, and the second leads with probability 0.5 each to a profit of CZK 3 million and of CZK 10 million. If the certainty equivalent of the first risky alternative is CZK 1.3 million and that of the second is CZK 5.5 million, then by (1) we must again have
$$u(1.3) = 0.5 \cdot u(0) + 0.5 \cdot u(3) \qquad (4)$$
$$u(5.5) = 0.5 \cdot u(3) + 0.5 \cdot u(10) \qquad (5)$$
Since we already know that u(3) = 0.5, substituting u(0) = 0, u(3) = 0.5 and u(10) = 1 into (4) and (5) gives
$$u(1.3) = 0.5 \cdot 0 + 0.5 \cdot 0.5 = 0.25 \qquad (6)$$
$$u(5.5) = 0.5 \cdot 0.5 + 0.5 \cdot 1 = 0.75 \qquad (7)$$
We now know the coordinates of two more points of the utility function of profit, (1.3; 0.25) and (5.5; 0.75), so five points of the function are known in total. We plot the points obtained in a graph (profit on the x axis and the corresponding utility on the y axis, see Fig. 7). Fitting a suitable curve through these points gives an approximation of the graph of the utility function of profit for the given decision maker (and the given decision problem). Fig. 7 shows that the utility function of profit of this decision maker is concave. The decision maker is therefore risk-averse with respect to this criterion.
For practical use of the utility function in evaluating and selecting risky alternatives, it may be useful to fix a functional form of the utility function and to determine its parameters. For a decision maker with a neutral attitude to risk, whose utility function is linear, the formula is
$$u(x) = \frac{x - x^0}{x^* - x^0}, \qquad (8)$$
where x^0 is the worst and x^* the best value of the criterion (i.e. the end points of the domain). Quantifying it requires no estimation of parameters.
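The chain of substitutions in (2) to (7) and the comparison with the linear utility of a risk-neutral decision maker can be reproduced with a short Python sketch. The helper names (`lottery_utility`, `linear_utility`, `is_concave`) are hypothetical, and concavity is checked here only in the discrete sense that the slopes between neighbouring points do not increase, which is an assumption of this sketch rather than a procedure given in the text.

```python
def lottery_utility(u_first, u_second, p=0.5):
    """Expected utility of a two-outcome lottery, as in relation (1)."""
    return p * u_first + (1 - p) * u_second


def linear_utility(x, worst=0.0, best=10.0):
    """Utility of a risk-neutral decision maker on the interval [worst, best]."""
    return (x - worst) / (best - worst)


def is_concave(points):
    """Check that slopes between consecutive points never increase."""
    xs = sorted(points)
    slopes = [(points[b] - points[a]) / (b - a) for a, b in zip(xs, xs[1:])]
    return all(s1 >= s2 for s1, s2 in zip(slopes, slopes[1:]))


# End points of the domain (profit in millions of CZK) with the conventional utilities.
utility = {0.0: 0.0, 10.0: 1.0}
# Certainty equivalents elicited in the dialogue: 3 for (0, 10), 1.3 for (0, 3), 5.5 for (3, 10).
utility[3.0] = lottery_utility(utility[0.0], utility[10.0])
utility[1.3] = lottery_utility(utility[0.0], utility[3.0])
utility[5.5] = lottery_utility(utility[3.0], utility[10.0])

for x in sorted(utility):
    print(x, utility[x], linear_utility(x))
print("concave:", is_concave(utility))
```

Every elicited point lies on or above the straight line of the risk-neutral utility, which is the numerical counterpart of the concave shape described in the text.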
Fig. 7 Utility function of profit (horizontal axis: profit, marked at 0, 1.3, 3, 5.5 and 10 million CZK; vertical axis: utility, marked at 0, 0.25, 0.5, 0.75 and 1)
When the decision maker is risk-averse or risk-seeking, the corresponding concave or convex utility function can often be represented by an exponential function of the form
$$u(x) = e^{a x + b} \qquad (9)$$
To determine the parameters a and b of this function we use all the values of the utility function at the established points, i.e. the end points of its domain and the certainty equivalents. The coefficients are determined by least-squares approximation after taking the logarithm of equation (9).
Empirical research on the behaviour of decision makers under risk and on their utility functions shows that risk aversion strongly prevails. At the same time, a subject's attitude to risk often differs depending on whether gains or losses are at stake. While risk aversion prevails in the domain of gains, risk seeking prevails in the domain of losses (this holds mainly for smaller losses; for large or catastrophic losses that would ruin the subject, risk aversion again clearly prevails). The attitude of a decision maker who is risk-averse for gains and risk-seeking for losses can then be represented by a utility function with an inflection point, which separates its concave part for positive values of the profit criterion from its convex part for negative values of the criterion (see Fig. 8).
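The least-squares fit through the logarithm of (9) can be sketched as follows. Taking logarithms turns (9) into the straight line ln u = a·x + b, which is fitted by ordinary linear regression. Two points are assumptions of this sketch and not statements of the paper: the function name `fit_exponential` is hypothetical, and the point with zero utility, (0; 0), is left out because its logarithm is undefined.

```python
import math


def fit_exponential(points):
    """Fit u(x) = exp(a*x + b) by least squares on ln u = a*x + b.

    Points with non-positive utility are skipped (assumption: ln is undefined there).
    """
    usable = [(x, math.log(u)) for x, u in points if u > 0]
    n = len(usable)
    mean_x = sum(x for x, _ in usable) / n
    mean_y = sum(y for _, y in usable) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in usable)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in usable)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# The five points of the utility function of profit found in the example.
points = [(0.0, 0.0), (1.3, 0.25), (3.0, 0.5), (5.5, 0.75), (10.0, 1.0)]
a, b = fit_exponential(points)

for x, u in points:
    print(x, u, round(math.exp(a * x + b), 3))
```

Printing the fitted values next to the elicited ones makes it easy to judge how well the chosen functional form suits a particular decision maker.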
Fig. 8 Utility function with an inflection point (horizontal axis: criterion, with the loss domain to the left of 0 and the gain domain to the right; vertical axis: utility; the inflection point separates the risk-seeking part from the risk-averse part)
Regarding the utility function and its construction, it should also be noted that this function always expresses the subjective attitude of the decision maker to risk with respect to the given criterion. No objective utility function (for a given criterion) exists. The utility functions of different decision makers may differ (and usually do, as the results of empirical studies show). Even the utility function of the same decision maker for a given criterion, elicited at different times, need not be the same. Empirical research also shows that constructing a utility function is in general a difficult matter (in terms of the information that must be obtained from the subject to quantify it). Decision makers are therefore in many cases unwilling to work with it.
References
[1] BAŠTA, A. Plánové rozhodovací procesy a jejich systém. Praha : Academia, 1977.
[2] ČERNÝ, J.; GLÜCKAUFOVÁ, D. Vícekriteriální vyhodnocování v praxi. Praha : SNTL, 1982.
[3] EDEN, C.; JONES, S.; SIMS, D. Messing About in Problems. Oxford : Pergamon Press, 1983.
[4] FOTR, J. Příprava a hodnocení podnikatelských projektů. Praha : VŠE, 1993.
[5] FOTR, J. Manažérská rozhodovací analýza. Praha : VŠE, 1992.
[6] FOTR, J.; DĚDINA, J. Manažérské rozhodování. Praha : VŠE.
[7] FOTR, J.; PÍŠEK, M. Exaktní metody ekonomického rozhodování. Praha : Academia, 1986.
[8] IVANCEVICH, J. M.; DONNELLY, J. H.; GIBSON, J. L. Management. Principles and Functions. Homewood : R. D. Irwin, 1989.
[9] MOORE, P. G. The Business of Risk. Cambridge : University Press, 1983.
[10] NOVÁK, M. Examples of using concepts of probability theory in management decision making. International conference. Kunovice : EPI, s.r.o., 2006.
[11] NOVÁK, M. Probability theory in combined form of study at FEEC BUT. International conference. Kunovice : EPI, s.r.o., 2006.
[12] PÍŠEK, M.; VOBOŘIL, J. Vybrané metody dlouhodobého prognózování a jejich využití. Praha : Ekonomický ústav ČSAV, 1981.
[13] STAEHLE, W. H. Management. München : Verlag Franz Vahlen, 1989.
[14] VLČEK, R. Hodnotový management. Praha : Management Press, 1992.
[15] VLČEK, R. Příručka hodnotové analýzy. Praha : SNTL, 1983.
[16] WATSON, S. R.; BUEDE, D. M. Decision Synthesis. Cambridge : Cambridge University Press, 1987.
[17] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOŠ, 1995.
[18] ZÁRUBA, P. et al. Základy podnikového managementu. Praha : Aleko, 1991.
Address: Doc. RNDr. Josef Zapletal, CSc. Evropský polytechnický institut, s.r.o. Osvobození 699 686 04 Kunovice Czech Republic Tel./fax: +420 572 549 018 / +420 572 548 788 e-mail: [email protected]
THE USE OF FUZZY LOGIC FOR SUPPORT OF DIRECT MAILING Petr Dostál
Brno University of Technology
Abstract: The article deals with the use of fuzzy logic to support direct mailing. It briefly describes fuzzy logic, the calculation process, the scheme of the model, the rule blocks, and the attributes with their membership functions. Fuzzy logic is particularly advantageous for supporting direct mailing, where evaluation is very complicated.
Keywords: fuzzy logic, direct mailing, model, rule block, membership function, attributes
1. INTRODUCTION
Fuzzy logic is particularly advantageous for supporting direct mailing, where evaluation is very complicated, because it works with linguistic variables. Fuzzy logic measures the degree of certainty or uncertainty with which an element belongs to a set. People make decisions in an analogous way during mental and physical activities. The solution of a particular case is found by means of rules that were defined in fuzzy logic for similar cases. Fuzzy logic is one of the methods used in the area of direct mailing.
2. FUZZY PROCESSING
A fuzzy logic calculation consists of three steps: fuzzification, fuzzy inference and defuzzification.
• Fuzzification transforms real variables into linguistic variables. A linguistic variable is defined through its basic attributes; for the variable risk, for example, the attributes none, very low, low, medium, high and very high may be set up. Usually three to seven attributes are used per variable. The attributes are defined by so-called membership functions, such as Λ, π, Z, S and others. Membership functions are set up for both input and output variables.
• Fuzzy inference defines the behaviour of the system on the linguistic level by means of rules of the type <When>, <Then>. The conditional clauses evaluate the state of the input variables. They have the form <When> Inputa <And> Inputb ….. Inputx <Or> Inputy …….. <Then> Output1, which means: when (the state occurs) Inputa and Inputb, ….., Inputx or Inputy, ……, then (the situation is) Output1. Fuzzy logic thus represents an expert system. Each combination of attributes of the variables entering the system and occurring in the condition <When>, <Then> represents one rule. Every condition after <When> has a corresponding result after <Then>. Every rule and its degree of support (the weight of the rule in the system) must be determined. The rules are created by the expert.
• Defuzzification transfers the results of fuzzy inference to the output variables, which describe the results verbally (for example, whether the risk exists or not).
A system with fuzzy logic can work as an automatic system once the input data are entered. The input data can be represented by many variables.
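The three steps described above can be sketched in plain Python. Everything concrete in this sketch is an assumption made for illustration and is not taken from the paper or from any fuzzy logic tool: the 0 to 100 scale of the inputs, the break points of the Z, Λ and S shaped membership functions, the simplified attributes low, medium and high for both inputs, the three rules with their degrees of support, and the defuzzification by a weighted average of representative output values.

```python
def z_shape(x, full, zero):
    """Z-shaped membership: 1 up to `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def s_shape(x, zero, full):
    """S-shaped membership: 0 up to `zero`, rising linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def lambda_shape(x, left, peak, right):
    """Triangular (lambda-shaped) membership with its maximum at `peak`."""
    return max(0.0, min((x - left) / (peak - left), (right - x) / (right - peak)))


def fuzzify(x):
    """Step 1: turn a real value on a hypothetical 0..100 scale into linguistic attributes."""
    return {"low": z_shape(x, 20, 50),
            "medium": lambda_shape(x, 20, 50, 80),
            "high": s_shape(x, 50, 80)}


# Representative values of the output attributes used for defuzzification (assumed).
FINANCE_VALUES = {"bad": 0.0, "good": 0.5, "excellent": 1.0}


def evaluate_finance(salary, loan):
    s, l = fuzzify(salary), fuzzify(loan)
    # Step 2: inference with And = min, Or = max, scaled by the degree of support.
    activation = {
        "excellent": 1.0 * min(s["high"], l["low"]),      # When salary high And loan low
        "good": 0.8 * max(s["medium"], l["medium"]),      # When salary medium Or loan medium
        "bad": 1.0 * min(s["low"], l["high"]),            # When salary low And loan high
    }
    total = sum(activation.values())
    if total == 0:
        return None, None  # no rule fires for this input
    # Step 3: defuzzification to a crisp score and to a verbal result.
    score = sum(FINANCE_VALUES[name] * w for name, w in activation.items()) / total
    label = max(activation, key=activation.get)
    return label, score


print(evaluate_finance(90, 10))
print(evaluate_finance(10, 90))
print(evaluate_finance(50, 50))
```

In a real application the rules and their degrees of support would be supplied by the expert and tuned on cases with known outcomes.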
3. DIRECT MAILING
This case presents the use of fuzzy logic for direct mailing: deciding whether the client is visited personally, sent a letter, or not contacted at all. See the model in Fig. 1.
Fig. 1 Project chart
The input variables and their attributes are Loan (Fig. 2) (none, small, medium, high), Salary (Fig. 3) (low, medium, high), Age (Fig. 4) (young, medium, old, very old), Children (no, a few, many), State (single, married, divorced) and Place (big city, city, village).
Fig. 2. The attributes and membership functions of variable Loan
Fig. 3. The attributes and membership functions of variable Salary
Fig. 4. The attributes and membership functions of variable Age
The rule blocks and their attributes are Finance (excellent, good, bad) and Personality (unsuitable, suitable, good, excellent). Fig. 5 shows the attributes and membership functions of Finance.
Fig. 5. The attributes and membership functions of variable Finance
The output variable Mailing, with its attributes, evaluates whether the client will be visited, sent a letter, or not contacted. See Fig. 6.
Fig. 6 The attributes and membership functions of variable Mailing
Fig. 7 shows one of the two rule blocks, Finance, with its rules and degrees of support, which set up the relation between input and output variables.
Fig. 7 Rule block
Once the model is built, it must be tuned (the inputs are set to known values, the results are evaluated, and the rules or weights are changed if necessary). Once the system is tuned, it can be used in practice.
Fig. 8 The attributes and membership functions of output variable Mailing
The set-up of the rule blocks distinguishes individual cases. For example, the decision is not to contact the client when the person has a low salary and a lot of loans, lives in a village, is old, single and without children. Fig. 8 shows this result, where Mailing is evaluated as no contact with the client. The aim is for the investment in marketing to bring profit in the future. Whether the marketing project is profitable or loss-making can be evaluated only after a certain time.
4. CONCLUSION
The case presented is only a fraction of the possible uses of fuzzy logic in various areas of decision making. The theory of fuzzy logic contributes to the quality of decision making. Decision making is an important activity of firms; it is fair to say that successful decision making makes a firm successful. It must be emphasised that these methods only support decision making and that the responsibility for choosing the optimal alternative or alternatives lies with those who make the decision. Fuzzy logic, like artificial neural networks and genetic algorithms, is one of the relatively strong methods of artificial intelligence for the support of decision making.
Address: Ing. Petr Dostál, CSc. Brno University of Technology Kolejní 4 612 00 Brno, Czech Republic Tel. +420 541 143 714, Fax. +420 541 142 692 e-mail: [email protected], [email protected] http://www.iqnet.cz/dostal
THE COLLATION OF VARIOUS METHODS FOR THE SOLUTION OF TRANSPORTATION PROBLEMS Abdurrzzag Tamtam
Brno University of Technology
Abstract: The paper examines the effectiveness of various methods for solving transportation problems with respect to problem size. We study and compare the North West Corner method, the index method and the Vogel method. The first allocation of the goods required by customers is clearly most expensive with the North West Corner method, which is based on a geographical principle and ignores economic conditions. We show that the index method gives a better result than the North West Corner method, and that the Vogel method gives a better solution still with respect to size. We modified the Vogel method and show by example that, for larger matrices of suppliers and customers, the modified method gives the optimum solution.
Keywords: supplier, customer, cell, free cell (water), occupied cell (stone), North West Corner method, index method, Vogel method, modified Vogel method.
1) Introduction
A special class of linear programming problems are the so-called distributive problems. In full generality, such a problem can be formulated as follows: minimise the objective function
$$z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$$
under the conditions
$$\sum_{j=1}^{n} x_{ij} = a_i, \qquad \sum_{i=1}^{m} k_{ij} x_{ij} = b_j$$
and non-negativity of $x_{ij}$, where i = 1, 2, . . . , m and j = 1, 2, . . . , n.
These problems can also be solved with the Simplex method, but solving them this way is usually very lengthy and laborious. The special properties of distributive problems make it possible to use special methods, as in the transportation problem. There we suppose m suppliers Si (i = 1, 2, . . . , m) with inventories of si units of an identical commodity and n customers Kj (j = 1, 2, . . . , n) with requirements of kj units of the same commodity. We suppose that
$$\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} k_j$$
and say that a balanced transportation problem is given. Further, the cost of transporting one unit from supplier Si to customer Kj is known; we denote it cij. The number of units of the commodity transported from the i-th supplier to the j-th customer is denoted xij. The problem is to find the (m · n)-dimensional vector
$$[\, x_{11}, x_{12}, \dots, x_{1n}, x_{21}, x_{22}, \dots, x_{2n}, \dots, x_{m1}, x_{m2}, \dots, x_{mn} \,]$$
satisfying the following restrictions:
$$\begin{aligned}
x_{11} + x_{12} + x_{13} + \dots + x_{1n} &= s_1 \\
x_{21} + x_{22} + x_{23} + \dots + x_{2n} &= s_2 \\
&\;\;\vdots \\
x_{m1} + x_{m2} + x_{m3} + \dots + x_{mn} &= s_m \\
x_{11} + x_{21} + x_{31} + \dots + x_{m1} &= k_1 \\
x_{12} + x_{22} + x_{32} + \dots + x_{m2} &= k_2 \\
&\;\;\vdots \\
x_{1n} + x_{2n} + x_{3n} + \dots + x_{mn} &= k_n
\end{aligned}$$
and minimising the objective function
$$z = c_{11}x_{11} + c_{12}x_{12} + \dots + c_{1n}x_{1n} + c_{21}x_{21} + c_{22}x_{22} + \dots + c_{2n}x_{2n} + \dots + c_{m1}x_{m1} + c_{m2}x_{m2} + \dots + c_{mn}x_{mn}$$
The coefficients of the variables in the restrictions are equal to one or zero. The number of zero coefficients is n · m · (m + n - 2), while the number of unit coefficients is only 2 · m · n. The system of equations is dependent: summing the first m equations gives the same result as summing the last n equations. Therefore a basic solution has at most (m + n - 1) non-zero components. If the number of non-zero components is exactly (m + n - 1), the solution is called non-degenerate; if it is smaller, the solution is called degenerate. This is why we do not solve the transportation problem by the Simplex method and use simpler methods instead.
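The counts of zero and unit coefficients, and the dependence of the equations, can be checked numerically for a small case. The sketch below builds the coefficient matrix of the restrictions for m = 3 suppliers and n = 4 customers; the function name `constraint_matrix` and the row-by-row ordering of the variables x11, ..., xmn are choices made for this illustration.

```python
def constraint_matrix(m, n):
    """Coefficient matrix of the m supplier and n customer equations.

    Column k corresponds to x_ij with i = k // n and j = k % n (0-based).
    """
    rows = []
    for i in range(m):  # supplier equations: sum over j of x_ij = s_i
        rows.append([1 if k // n == i else 0 for k in range(m * n)])
    for j in range(n):  # customer equations: sum over i of x_ij = k_j
        rows.append([1 if k % n == j else 0 for k in range(m * n)])
    return rows


m, n = 3, 4
matrix = constraint_matrix(m, n)

ones = sum(sum(row) for row in matrix)
zeros = (m + n) * m * n - ones

# Summing the supplier equations and summing the customer equations give the same row.
supplier_sum = [sum(col) for col in zip(*matrix[:m])]
customer_sum = [sum(col) for col in zip(*matrix[m:])]

print(ones, zeros, supplier_sum == customer_sum)
```

For m = 3 and n = 4 the matrix has 7 rows and 12 columns, so at most 3 + 4 - 1 = 6 of the 12 variables are non-zero in a basic solution.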
2 Solving of Transportation Problem
We now study the optimisation process on a concrete example. The methods for solving the transportation problem are compared on a single example with three suppliers and four customers. This small example, with a matrix of 3x4 cells, shows the advantages of the index method, the Vogel method and the modified Vogel method over the North West Corner method, and also shows that the Vogel method outperforms the index method; it is too small, however, to show any difference between the Vogel method and our modified method.
Example: Suppose that we have three suppliers S1, S2 and S3 with inventories of 310, 200 and 190 units of a commodity and four customers K1, K2, K3 and K4 with demands for 250, 100, 150 and 200 units. The transport costs cij from the i-th supplier to the j-th customer are given in Table 1.
Table 1
Suppliers    K1     K2     K3     K4     Inventories
S1           20     14     11     12     310
S2            6     15     18     15     200
S3           17     12     19     23     190
Demands     250    100    150    200     700
North West Corner Method
The name comes from geography. The NW method occupies the fields (cells) of the table in a purely geographical order that has nothing in common with the economic point of view. We begin with the cell P11, determined by the row S1 and the column K1 (the north-west corner of a map); the value c11 plays no role in the construction. This cell is occupied by the largest possible part of the commodity required from S1, that is 250 units. The supplier S1 still has 310 - 250 = 60 units. We continue occupying cells eastwards until the inventory of the first supplier is spent, so the cell P12, determined by the pair S1, K2, receives 60 units. The inventory of the first supplier is now zero, but the second customer is not yet satisfied: K2 still requires 40 units. We pass on to the second supplier S2, and the cell P22 is occupied by 40 units. The supplier S2 still has 160 units of the commodity. This is enough to satisfy the whole requirement of customer K3 (150 units), which we put into the cell P23. The last 10 units from S2 go to the fourth customer in the cell P24. Customer K4 requires 200 units and has received only 10, so the remaining 190 units are transported from the third supplier. Six of the 12 cells are occupied. In this example (m + n - 1) is equal to 6, and hence the solution is not degenerate. The value of the objective function is z = 250·20 + 60·14 + 40·15 + 150·18 + 10·15 + 190·23 = 13660 financial units. We have obtained the first solution. It is not yet the best transportation plan, which can be verified by calculation using Table 2, which contains the first solution obtained by the NW method. Occupied cells are called stones and empty cells are called waters.
Table 2
(each cell shows the unit cost, followed by the allocated amount in brackets for occupied cells)
Suppliers    K1          K2         K3          K4          Inventories
S1           20 [250]    14 [60]    11          12          310
S2            6          15 [40]    18 [150]    15 [10]     200
S3           17          12         19          23 [190]    190
Demands     250         100        150         200          700
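The North West Corner procedure described above is mechanical enough to be written down directly. The following Python sketch uses the costs, inventories and demands of Table 1; the function name `north_west_corner` and the 0-based cell indices are choices of this illustration.

```python
def north_west_corner(supply, demand):
    """Fill cells from the top-left corner, moving east or south as lines are exhausted."""
    supply, demand = list(supply), list(demand)
    allocation = {}
    i = j = 0
    while i < len(supply) and j < len(demand):
        amount = min(supply[i], demand[j])
        allocation[(i, j)] = amount
        supply[i] -= amount
        demand[j] -= amount
        if supply[i] == 0:
            i += 1  # supplier exhausted: move to the next row
        else:
            j += 1  # customer satisfied: move to the next column
    return allocation


costs = [[20, 14, 11, 12],
         [6, 15, 18, 15],
         [17, 12, 19, 23]]
supply = [310, 200, 190]
demand = [250, 100, 150, 200]

nw_allocation = north_west_corner(supply, demand)
nw_cost = sum(costs[i][j] * amount for (i, j), amount in nw_allocation.items())
print(nw_allocation)
print(nw_cost)
```

Note that the cost matrix is used only to evaluate the result; the allocation itself never looks at the costs, which is exactly why the method tends to give an expensive first solution.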
Index Method
The economic point of view clearly matters. This method works with the costs cij: it begins by occupying the cell with the smallest cost and proceeds through cells of increasing cost up to the maximum cost. Throughout, the sum of the values in the stones of every row must equal the initial inventory of the corresponding supplier, and the sum of the values in the stones of every column must equal the requirement of the corresponding customer. We explain the method on our example. We begin with the cell P21, which has the smallest cost c21 = 6. We occupy it with the largest amount that supplier S2 can deliver, 200 units. This does not satisfy the demand of K1. The next smallest cost is c13 = 11, so we occupy the cell P13, again with the maximum amount, which is given by the demand of K3 (150 units). The next smallest cost, 12, occurs in two cells, P14 and P32. They lie in different rows and in different columns, so the order in which they are occupied does not matter. We put 160 units into the first of them, since S1 has no more. The amount needed by customer K2 is transported from S3 into the cell P32. The cells with the next higher costs, c12 = 14 and c22 = c24 = 15, are not occupied, because the inventories of suppliers S1 and S2 are exhausted. We come to the cost c31 = 17. Customer K1 requires 250 units in total and has received 200 units in the stone P21, so the cell P31 receives 50 units of the commodity. For analogous reasons (the supplier's inventory is exhausted or the customer's demand is satisfied) we skip the costs 18, 19 and 20. We finish the allocation in the cell P34 with 40 units. The resulting allocation is shown in Table 3.
The value of the objective function for the solution obtained by the index method and recorded in Table 3 is z = 200·6 + 150·11 + (100 + 160)·12 + 50·17 + 40·23 = 7740.
Table 3
(each cell shows the unit cost, followed by the allocated amount in brackets for occupied cells)
Suppliers    K1          K2          K3          K4          Inventories
S1           20          14          11 [150]    12 [160]    310
S2            6 [200]    15          18          15          200
S3           17 [50]     12 [100]    19          23 [40]     190
Demands     250         100         150         200          700
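The index method can be sketched as a loop over the cells sorted by cost. The function name `index_method` is hypothetical, and ties between equal costs are broken here simply by row and column order, which is an assumption of the sketch (in the example the tied cells lie in different rows and columns, so the order does not affect the result).

```python
def index_method(costs, supply, demand):
    """Occupy cells in order of increasing unit cost with the largest feasible amount."""
    supply, demand = list(supply), list(demand)
    cells = sorted((costs[i][j], i, j)
                   for i in range(len(supply))
                   for j in range(len(demand)))
    allocation = {}
    for _, i, j in cells:
        amount = min(supply[i], demand[j])
        if amount > 0:  # skip cells whose row or column is already exhausted
            allocation[(i, j)] = amount
            supply[i] -= amount
            demand[j] -= amount
    return allocation


costs = [[20, 14, 11, 12],
         [6, 15, 18, 15],
         [17, 12, 19, 23]]
supply = [310, 200, 190]
demand = [250, 100, 150, 200]

index_allocation = index_method(costs, supply, demand)
index_cost = sum(costs[i][j] * amount for (i, j), amount in index_allocation.items())
print(index_allocation)
print(index_cost)
```

The last cell to be filled, P34 with cost 23, illustrates the weakness of the method: whatever remains at the end has to be shipped regardless of its cost.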
We see that the index method gives a better result than the North West Corner method. The index method demonstrably applies economic criteria, and it requires deliberation throughout the allocation process. The North West Corner method, on the other hand, can be called mechanical: we follow a given algorithm without further thought. However, even the solution obtained by the index method need not be optimal. Especially for larger problems it can give results that are very far from the optimal solution. Better results can be obtained with an approximation method called the Vogel method (VAM method). We illustrate it on the same example so that all these methods can be compared. Using the Vogel method we even obtain the optimal solution for our example.
Vogel Approximation Method (VAM Method)
In the index method we occupy, as early as possible, the cell with the smallest cost in a given row and column. It may then happen that, after the inventories in that row are exhausted or the demand of the customer in that column is met, we must occupy a cell with a very high cost. The cost of the whole solution then grows disproportionately. The VAM method uses a procedure that avoids these situations, although unfortunately not completely.
Table 4
Suppliers      K1     K2     K3     K4     Inventories    Differences
S1             20     14     11     12     310            1
S2              6     15     18     15     200            9
S3             17     12     19     23     190            5
Demands       250    100    150    200     700
Differences    11      2      7      3
The basic idea of the VAM method is to prevent such a pattern of cell occupation, that is, to prevent the situation in which, towards the end of the distribution of inventories to customers, the remaining cells have disproportionately high costs. This unwelcome situation shows itself in rows or columns with a large difference between the minimal cost and the next higher one.
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT” EPI Kunovice, Czech Republic, January 27, 2006
Table 5 (allocated amounts are shown in parentheses after the cost)
Suppliers / Customers   K1         K2     K3     K4     Inventories   Differences
S1                      20         14     11     12     310           1,1
S2                       6 (200)   15     18     15     200           9,x
S3                      17         12     19     23     190           5,5
Demands                250        100    150    200     700
Differences            11,3       2,2    7,8    3,11
We avoid such situations by preferring, from the beginning of the allocation, those rows and columns in which the difference between the smallest cost and the next greater one is maximal. Having found such a line (row or column), we place the maximum possible amount into its cell with minimum cost. We apply this method to our example again. We calculate the differences in all rows and columns; the results are in the last row and last column of Table 4. The greatest difference (11) is in the first column. The cell with minimum cost in that column is P21, and we occupy it with the maximum possible amount, i.e. 200 units. The inventory of supplier S2 is thereby exhausted. We record this by a cross (x) among the differences of the second row, which describes the second supplier. We proceed analogously when the demand of a customer is satisfied: we write a cross into the difference cell of the corresponding column. After every operation that empties a supplier or satisfies a customer we omit the corresponding row or column, so all remaining differences must be recalculated. The situation after the first assignment into cell P21, with the recalculated differences, is given in Table 5. The highest difference (11) is now in the fourth column, and the lowest cost in that column is in cell P14. We occupy this cell with the whole demand of customer K4, i.e. 200 units. The demand of the fourth customer is thus satisfied, which we record by a cross among the differences of the fourth column. After the next recalculation the maximal difference (8) is in the third column. Because the capacity of the second supplier S2 is exhausted, the costs c2j, j = 1, 2, 3, 4, are not used when computing the new differences.
The lowest cost in the third column is in cell P13. We occupy this cell with the remaining amount of the first supplier S1, i.e. 110 units. Now both suppliers S1 and S2 are exhausted, and the costs in the first two rows are no longer considered when calculating the differences. Only the row of S3 remains, so each column difference is replaced by the corresponding remaining cost. We complete the demands of the customers from S3: K2 receives 100 units, K1 receives 50 units, and K3 receives 40 units of the same commodity. The whole process of dispatching the commodity and the sequences of differences for the individual rows and columns are given in Table 6. The value of the objective function obtained by the Vogel method in Table 6 is z = 200 · 6 + 110 · 11 + (200 + 100) · 12 + 50 · 17 + 40 · 19 = 7620. This is also the minimum value: no other feasible solution gives a smaller value of the objective function, and hence smaller transportation costs. In conclusion, the Vogel method usually gives the best results of the methods presented so far.
Table 6 (allocated amounts are shown in parentheses after the cost)
Suppliers / Customers   K1               K2            K3                 K4          Inventories   Differences
S1                      20               14            11 (110)           12 (200)    310           1,1,3,x
S2                       6 (200)         15            18                 15          200           9,x
S3                      17 (50)          12 (100)      19 (40)            23          190           5,5,5,5
Demands                250              100           150                200          700
Differences            11,3,3,17,17,x   2,2,2,12,x    7,8,8,19,19,19,x   3,11,x
Very often the Vogel method gives the optimal solution directly. This is usually the case for small problems, as in our example. For more complicated problems the solution obtained by the Vogel method is close to the optimum.
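The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's program: the function name `vogel`, the tie-breaking rule (the first line with the maximal difference is taken) and the convention that a line with a single remaining cost uses that cost as its difference are assumptions. The cost matrix is the one of Table 4.

```python
def vogel(costs, supply, demand):
    """Starting allocation by the Vogel method (illustrative sketch)."""
    supply, demand = list(supply), list(demand)
    rows, cols = list(range(len(supply))), list(range(len(demand)))
    alloc = {}

    def difference(values):
        # Difference between the two smallest costs; if only one cost is
        # left, the cost itself is used (assumed convention).
        s = sorted(values)
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        lines = [(difference([costs[i][j] for j in cols]), "row", i) for i in rows]
        lines += [(difference([costs[i][j] for i in rows]), "col", j) for j in cols]
        _, kind, k = max(lines, key=lambda t: t[0])  # first maximal difference
        if kind == "row":
            i, j = k, min(cols, key=lambda c: costs[k][c])
        else:
            i, j = min(rows, key=lambda r: costs[r][k]), k
        amount = min(supply[i], demand[j])           # maximum possible amount
        alloc[(i, j)] = amount
        supply[i] -= amount
        demand[j] -= amount
        if supply[i] == 0:
            rows.remove(i)                           # supplier exhausted
        if demand[j] == 0:
            cols.remove(j)                           # customer satisfied
    total = sum(costs[i][j] * q for (i, j), q in alloc.items())
    return alloc, total


costs = [[20, 14, 11, 12],   # S1
         [6, 15, 18, 15],    # S2
         [17, 12, 19, 23]]   # S3
supply = [310, 200, 190]
demand = [250, 100, 150, 200]

alloc, total = vogel(costs, supply, demand)
print(alloc)
print(total)
```

With these conventions the sketch occupies the cells P21, P14 and P13 in the same order as in the text and ends with the same total cost, z = 7620. A different tie-breaking rule may lead to a different starting allocation on other data.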
Modified Vogel Method
This method can be used when the cost matrix for suppliers and customers is of type m × n, where m ≥ 3 and n ≥ 3. When m and n are close to three, we obtain the same results as with the unmodified Vogel method. We now describe the modified Vogel method.
Table 7
Suppliers / Customers   K1               K2            K3                 K4          Inventories   Differences
S1                      20               14            11 (110)           12 (200)    310           3,9,9,x
S2                       6 (200)         15            18                 15          200           9,x
S3                      17 (50)          12 (100)      19 (40)            23          190           7,7,7,7
Demands                250              100           150                200          700
Differences            14,3,3,17,17,x   3,2,2,12,x    8,8,8,19,19,19,x   11,11,x
We take the three smallest costs $c_{ij}$ in every row and column and order them by size. Suppose that we are working in row $i_0$ and that the three smallest costs are $c_{i_0 j_1} \le c_{i_0 j_2} \le c_{i_0 j_3}$. We form the differences $d_1 = c_{i_0 j_2} - c_{i_0 j_1}$ and $d_2 = c_{i_0 j_3} - c_{i_0 j_2}$ and define the modified difference (max difference) as their sum, $d_M = d_1 + d_2$. We apply this modified method to our example; the final table is Table 7. We see that the modified method gives the same allocation as the VAM. Here the modified method brought nothing new or better, but neither is it worse. The modified differences can be more favourable for problems with larger matrices. We show this on the following example:
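Before turning to the larger example, the modified differences of the small problem can be checked with a short Python sketch. The helper name `modified_difference` is an assumption introduced for illustration; the cost matrix is the one of Table 4.

```python
def modified_difference(values):
    """Modified (max) difference d_M = d1 + d2 of the three smallest costs."""
    s = sorted(values)
    d1 = s[1] - s[0]   # second smallest minus smallest
    d2 = s[2] - s[1]   # third smallest minus second smallest
    return d1 + d2     # equals third smallest minus smallest


costs = [[20, 14, 11, 12],   # S1
         [6, 15, 18, 15],    # S2
         [17, 12, 19, 23]]   # S3

row_diffs = [modified_difference(row) for row in costs]
col_diffs = [modified_difference(col) for col in zip(*costs)]
print(row_diffs)
print(col_diffs)
```

The values agree with the first entries of the difference sequences in Table 7 (3, 9, 7 for the rows and 14, 3, 8, 11 for the columns), so the first cell to be occupied is again P21 in the first column.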
Example. Let us suppose that we have seven suppliers S1, S2, . . . , S7 with inventories of 100, 200, 300, 400, 500, 600 and 700 units of a commodity and four customers K1, K2, K3, K4 with demands of 960, 740, 650 and 450 units (the demands listed in Table 8, which balance the total inventory of 2800 units). The transport costs cij from the i-th supplier to the j-th customer are given in Table 8:
Table 8
Suppliers
K1 9
S1
10
S2
150
S3
300
Customers K2 K3 11 6
9
8
9 S4
5
S6
1
6
7
340
300
2,2,2,2,2,2,2, x
400
4,4,4,x
500
2,2,x
600
8,x
700
4,4,4,4,6,x
9
600 9
5,5,5,5,5,x
9
500 9
200
7
400
2
3,3,3,3,3,3,x
5
3
3
100 7
4
2
3
S7
3 50
6
Max differences
7 90
8
S5
K4
Inventories
3 360
Demands: 960 (K1), 740 (K2), 650 (K3), 450 (K4)
Max differences: 4,6,3,3,3,3 (K1); 4,4,4,5,x (K2); 2,2,3,3,3,3,x (K3); 4,4,4,4,4,2,x (K4)
The Vogel method produces a starting allocation with a cost of 11580 financial units. Our modified method produces a starting allocation with a cost of 9890 financial units (see Table 8). In this example the modified Vogel method thus gives a clearly better result than the classic Vogel method. It must be admitted, however, that a different choice among rows or columns with equal differences can lead to a different, and worse, result.
REFERENCES
[1] ACKOFF, R.; SASSIENI, M. Fundamentals of operations research. N.Y. : Wiley, 1968.
[2] BELLMAN, R.; DREYFUS, S. Dynamic programming. Princeton : PUP, 1962.
[3] DUDIRKIN, J. Opereach research. Praha : SNTL, 1994.
[4] CHURCHMAN, C. W.; ACKOFF, R.; ARNOFF, E. Introduction to operation research. N.Y. : Wiley, 1975.
[5] SAATY, T. Mathematical Methods of Operations Research. N.Y. : McGraw-Hill, 1959.
[6] ZAPLETAL, J.; ZÁSTĚRA, B. Vybrané kapitoly z operačního výzkumu. Zlín : VUT-FT, 1983.
[7] ZAPLETAL, J. Operační analýza. Skriptorium VOŠ, Kunovice : EPI, s.r.o., 1995.
[8] ZAPLETAL, J.; VACULÍK, J. Podpůrné metody rozhodovacích procesů. Brno : MU, 1998.
Address: Abdurrzzag Tamtam Faculty of Electrical engineering and Communication Brno University of Technology Purkynova 118, 612 00 Brno, Czech Republic e-mail: [email protected],
SOLUTION OF STRUCTURAL INTERBRANCH SYSTEM OF A DYNAMIC MODEL Jaromír Baštinec, Josef Diblík
Brno University of Technology
Abstract: A whole range of problems concerning mutual deliveries among the manufacturing branches of industry, and many quantities on the market, are determined by the crude productions of the individual branches. In this paper it is shown that this dynamic problem can be solved with special mathematical methods.
Keywords: branches of industry, crude production, technical coefficient, dynamic model.
Let us suppose that the economic system is divided into n manufacturing branches. We denote by $x_i$ the total amount produced by the i-th branch of industry. Further, we denote by $X_{ij}$ the amount of production of the i-th branch supplied to the j-th branch. Finally, we denote by $y_i$ the amount of products of the i-th branch for final usage (i.e. market, export). The relations among producers and their customers can be written as follows:

$$
\begin{aligned}
x_1 &= X_{11} + X_{12} + \dots + X_{1n} + y_1 \\
x_2 &= X_{21} + X_{22} + \dots + X_{2n} + y_2 \\
&\;\;\vdots \\
x_n &= X_{n1} + X_{n2} + \dots + X_{nn} + y_n
\end{aligned}
\tag{1}
$$

General experience allows us to make the following assumption: the supply $X_{ij}$ of the i-th branch to the j-th one is directly proportional to the crude production $x_j$ of the j-th branch. Then we have

$$X_{ij} = a_{ij} x_j \tag{2}$$

where the coefficient of proportionality $a_{ij}$ is called the technical coefficient. If we know the supplies among all branches of industry and the amounts for final usage from previous periods, we can calculate the technical coefficients: we compute all $x_i$, $i = 1, 2, \dots, n$, from system (1) and then obtain the technical coefficients from the equation

$$a_{ij} = \frac{X_{ij}}{x_j} \tag{3}$$

Expressing $X_{ij}$ by (2) for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, n$ and substituting into (1), we obtain the following system of linear equations:

$$
\begin{aligned}
x_1 &= a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n + y_1 \\
x_2 &= a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n + y_2 \\
&\;\;\vdots \\
x_n &= a_{n1} x_1 + a_{n2} x_2 + \dots + a_{nn} x_n + y_n
\end{aligned}
\tag{4}
$$

System (4) can be rewritten in matrix form:

$$X = A X + Y \tag{5}$$
The vector X is called the vector of crude production of the branches, the vector Y is called the vector of final usage, and the matrix A is called the matrix of technical coefficients. The system of equations can be rewritten in the form

$$(E - A) X = Y \tag{6}$$

where E is the unit matrix of type $n \times n$, i.e. with n rows and n columns.

The matrix $E - A$ can be regarded as an operator that assigns to the input vector X (vector of crude production of the branches) the output vector Y (vector of final usage). This direction is of little interest for economics. It is more interesting and more important to find the operator that assigns the vector X to a given Y. We obtain it by multiplying equation (6) by the inverse matrix $(E - A)^{-1}$:

$$X = (E - A)^{-1} Y \tag{7}$$

The matrix $(E - A)^{-1}$ is denoted by B and is called the matrix of the full material burden. The matrix equations (6) and (7) are called the fundamental form of the open static model of inter-branch relations. The elements of the matrix

$$B = (b_{ij}) \tag{8}$$

are called coefficients of the full material costs. The coefficient $b_{ij}$ gives the full consumption of the production of the i-th branch that is necessary for delivering the production of the j-th branch for final usage. Between $a_{ij}$ and $b_{ij}$ the following relation holds: $a_{ij} \le b_{ij}$ for $i, j = 1, 2, \dots, n$. For an economic interpretation of the coefficients $b_{ij}$ it is useful to write model (7) again as a system of equations:

$$x_i = b_{i1} y_1 + b_{i2} y_2 + \dots + b_{in} y_n, \quad i = 1, 2, \dots, n \tag{9}$$

For a new vector of final usage $Y' = (y_1', y_2', \dots, y_n')$ system (9) takes the form

$$x_i' = b_{i1} y_1' + b_{i2} y_2' + \dots + b_{in} y_n', \quad i = 1, 2, \dots, n \tag{10}$$

Let us denote $\Delta x_i = x_i' - x_i$, $\Delta y_i = y_i' - y_i$, $i = 1, 2, \dots, n$, and subtract the corresponding equations of systems (9) and (10). We obtain the system

$$\Delta x_i = b_{i1} \Delta y_1 + b_{i2} \Delta y_2 + \dots + b_{in} \Delta y_n, \quad i = 1, 2, \dots, n \tag{11}$$

The quantity $\Delta x_i$ expresses the change of the total crude production of the i-th branch when the components of the final usage change by $\Delta y_1, \Delta y_2, \dots, \Delta y_n$. It can be proved that the coefficients $b_{ij}$ are non-negative numbers, and hence $\Delta x_i \ge 0$ whenever all $\Delta y_j \ge 0$. We put $\Delta y_j = 1$ and $\Delta y_k = 0$ for all $k \ne j$. Then from (11)

$$\Delta x_i = b_{ij} \cdot 1, \quad i = 1, 2, \dots, n \tag{12}$$
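The open static model can be tried out numerically. The following Python sketch uses a hypothetical two-branch economy: the flows `flows` and the final usage `final` are invented for illustration, and the helper `solve` is an assumed name. It computes the technical coefficients by (3), the crude production by (7) and the matrix B column by column, using exact fractions.

```python
from fractions import Fraction as F


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


flows = [[21, 24], [42, 8]]   # X_ij, hypothetical supplies between branches
final = [60, 30]              # y_i, hypothetical final usage
n = len(final)

# System (1): crude production of each branch.
x = [sum(flows[i]) + final[i] for i in range(n)]
# Equation (3): technical coefficients a_ij = X_ij / x_j.
A = [[F(flows[i][j], x[j]) for j in range(n)] for i in range(n)]
E_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]

# Equation (7): crude production needed for the given final usage.
X = solve(E_minus_A, final)
# Column j of B is the response to a unit increase of y_j, see (12).
B_cols = [solve(E_minus_A, [1 if i == j else 0 for i in range(n)]) for j in range(n)]
B = [[B_cols[j][i] for j in range(n)] for i in range(n)]

print(x, X)
print(B)
```

For these data the solution of (7) reproduces the crude production obtained from (1), and every coefficient of B is at least as large as the corresponding technical coefficient, in agreement with the relation between $a_{ij}$ and $b_{ij}$ stated above.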
The coefficient $b_{ij}$ of the full material costs gives the amount by which the production of the i-th branch must be increased so that the j-th branch can increase its supply for final usage by one unit. It includes, first, the direct supply from the i-th branch to the j-th branch that is necessary for producing a unit in the j-th branch and, secondly, the supplies of the i-th branch that the j-th branch draws indirectly through the other branches.

In model (6) we assumed that we know the vector of crude production X and obtain the vector of final usage Y. In model (7), conversely, we know the vector of final usage Y and with its aid look for the vector of crude production X. As a third type of problem we solve the following situation: for some branches the crude production is given, and for the others the final usage. In this formulation the given vector contains k elements of crude production and n − k elements of final usage (n > k). The branches of the model can be reordered so that we obtain two families. The first family contains the branches for which the crude production is given. The second family contains the branches for which the final usage is known. We build the model so that we calculate the final usage for the first family and the crude production for the second. The first family will be denoted by index 1, the second by index 2. We look for a matrix R for which the following equality holds:

$$R \begin{pmatrix} X_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} Y_1 \\ X_2 \end{pmatrix} \tag{13}$$

It is obvious that we must construct four sub-matrices with the following property:

$$\begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix} \begin{pmatrix} X_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} Y_1 \\ X_2 \end{pmatrix} \tag{14}$$

where $R_{11}$ is a matrix of type $k \times k$, $R_{12}$ is of type $k \times (n-k)$, $R_{21}$ is of type $(n-k) \times k$ and $R_{22}$ is of type $(n-k) \times (n-k)$. From (14) it follows that

$$
\begin{aligned}
R_{11} X_1 + R_{12} Y_2 &= Y_1 \\
R_{21} X_1 + R_{22} Y_2 &= X_2
\end{aligned}
\tag{15}
$$
We derive the matrices $R_{11}$, $R_{12}$, $R_{21}$, $R_{22}$ as follows. We partition the matrix A of technical coefficients in accordance with the division of the branches into the two families:

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} + \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \tag{16}$$

Using the rules for matrix multiplication, we expand equation (16) into two equations:

$$
\begin{aligned}
A_{11} X_1 + A_{12} X_2 + Y_1 &= X_1 \\
A_{21} X_1 + A_{22} X_2 + Y_2 &= X_2
\end{aligned}
\tag{17}
$$

After rearranging equations (17) we obtain

$$
\begin{aligned}
Y_1 &= (E - A_{11}) X_1 - A_{12} X_2 \\
Y_2 &= -A_{21} X_1 + (E - A_{22}) X_2
\end{aligned}
\tag{18}
$$

Expressing $X_2$ from the second equation of (18), we get

$$X_2 = (E - A_{22})^{-1} Y_2 + (E - A_{22})^{-1} A_{21} X_1$$

and hence

$$Y_1 = (E - A_{11}) X_1 - A_{12} \left[ (E - A_{22})^{-1} Y_2 + (E - A_{22})^{-1} A_{21} X_1 \right] \tag{19}$$

Comparing with the first equation of (15), we get

$$
\begin{aligned}
R_{11} &= (E - A_{11}) - A_{12} (E - A_{22})^{-1} A_{21} \\
R_{12} &= -A_{12} (E - A_{22})^{-1}
\end{aligned}
\tag{20}
$$

Similarly, from the second equation of (15) we get

$$
\begin{aligned}
R_{21} &= (E - A_{22})^{-1} A_{21} \\
R_{22} &= (E - A_{22})^{-1}
\end{aligned}
\tag{21}
$$

While the interpretation of the elements of the matrices $(E - A)$ and $(E - A)^{-1}$ in models (6) and (7) is simple, the interpretation of the elements of the matrices $R_{ij}$; $i, j = 1, 2$ is a more complicated task. To understand the content of the elements of the sub-matrices $R_{ij}$; $i, j = 1, 2$ we study fractional products.
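Formulas (20) and (21) can be checked on a small example. The sketch below is an illustration under stated assumptions: the 3 × 3 matrix A, the split k = 1 and the production vector are hypothetical, and all helper names are invented here. It builds the four blocks of R and verifies relation (13) against a vector Y computed directly from (6).

```python
from fractions import Fraction as F


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def inverse(M):
    n = len(M)
    cols = [solve(M, [1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]


def block(M, rows, cols):
    return [[M[i][j] for j in cols] for i in rows]


# Hypothetical technical coefficients for n = 3 branches; the first k = 1
# branch belongs to family 1 (crude production given).
A = [[F(1, 5), F(1, 10), F(1, 5)],
     [F(1, 10), F(3, 10), F(1, 10)],
     [F(1, 5), F(1, 5), F(1, 10)]]
n, k = 3, 1
first, second = range(k), range(k, n)
A11, A12 = block(A, first, first), block(A, first, second)
A21, A22 = block(A, second, first), block(A, second, second)

W = inverse(sub(identity(n - k), A22))                          # (E - A22)^(-1)
R11 = sub(sub(identity(k), A11), matmul(matmul(A12, W), A21))   # (20)
R12 = [[-v for v in row] for row in matmul(A12, W)]             # (20)
R21 = matmul(W, A21)                                            # (21)
R22 = W                                                         # (21)

# Reference data: choose X, compute Y = (E - A) X by (6).
X_full = [F(100), F(80), F(60)]
Y_full = [x - s for x, s in zip(X_full, matvec(A, X_full))]

# Mixed problem: X1 and Y2 are given, Y1 and X2 are computed by (15).
X1, Y2 = X_full[:k], Y_full[k:]
Y1 = [a + b for a, b in zip(matvec(R11, X1), matvec(R12, Y2))]
X2 = [a + b for a, b in zip(matvec(R21, X1), matvec(R22, Y2))]
print(Y1, X2)
```

The computed Y1 and X2 coincide with the corresponding parts of the reference vectors, which is exactly what (13) requires. Exact fractions are used so that the comparison does not depend on rounding.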
Dynamic systems
Production in a given period depends on the accumulation in the previous period. Static models do not express this dependence. Accumulation is a component of the final product. In the balance of the relations between the branches we denoted the final products of the individual branches by $y_1, y_2, \dots, y_n$. The final product basically consists of two parts: $y_i^{(1)}$, the consumed part, and $y_i^{(2)}$, the accumulated part, i.e. $y_i = y_i^{(1)} + y_i^{(2)}$. To distinguish the individual periods of time, we denote the gross production of the i-th branch in year t by $x_i(t)$ and the final product in year t by $y_i(t)$. The accumulated product of the i-th branch becomes a part of the means of other branches. We denote by $\Delta y_{ij}(t)$ the part of the accumulated product of the i-th branch in year t which is invested into the j-th branch. Then

$$y_i^{(2)}(t) = \sum_{j=1}^{n} \Delta y_{ij}(t).$$

If the accumulated product consists only of floating means that are consumed in the following year (t+1), the following relation obviously holds between the increment of production in the j-th branch, $x_j(t+1) - x_j(t)$, and the investment $\Delta y_{ij}(t)$ of products of the i-th branch:

$$\Delta y_{ij}(t) = a_{ij} [x_j(t+1) - x_j(t)] = X_{ij}(t+1) - X_{ij}(t).$$

However, part of the accumulated product has the form of basic funds that are not consumed within a single year. Suppose the consumption of the investment $\Delta y_{ij}(t)$ is spread over $T_{ij}$ years. This means that only the $1/T_{ij}$ part of the investment $\Delta y_{ij}(t)$ is consumed within the following year. Reality is thus better characterised by the relation

$$\frac{\Delta y_{ij}(t)}{T_{ij}} = a_{ij} (x_j(t+1) - x_j(t)),$$

which after multiplication by $T_{ij}$ gives

$$\Delta y_{ij}(t) = a_{ij} T_{ij} (x_j(t+1) - x_j(t)).$$

The relation between the increment of production in the year (t+1) and the accumulation in the preceding year is therefore given by the system of technical coefficients $a_{ij}$ and by the system of average periods of usability $T_{ij}$, which are also of a technical nature. We therefore replace their product by the so-called "investment coefficient", denoted by $c_{ij}$:

$$c_{ij} = a_{ij} T_{ij}$$

The investment coefficients, like the technical coefficients, form a square matrix C. Using the investment coefficients, the relation between the increment of production in the j-th branch and the extent of investments into the production of the j-th branch may be written in the form

$$\Delta y_{ij}(t) = c_{ij} (x_j(t+1) - x_j(t))$$

The whole of the accumulated production is thus equal to

$$y_i^{(2)}(t) = \sum_{j=1}^{n} \Delta y_{ij}(t) = \sum_{j=1}^{n} c_{ij} (x_j(t+1) - x_j(t)).$$

This equation connects the accumulation of the i-th branch with the increments of production in the individual branches. Similar equations may be obtained for all the branches:

$$
\begin{aligned}
c_{11} (x_1(t+1) - x_1(t)) + c_{12} (x_2(t+1) - x_2(t)) + \dots + c_{1n} (x_n(t+1) - x_n(t)) &= y_1^{(2)}(t), \\
c_{21} (x_1(t+1) - x_1(t)) + c_{22} (x_2(t+1) - x_2(t)) + \dots + c_{2n} (x_n(t+1) - x_n(t)) &= y_2^{(2)}(t), \\
&\;\;\vdots \\
c_{n1} (x_1(t+1) - x_1(t)) + c_{n2} (x_2(t+1) - x_2(t)) + \dots + c_{nn} (x_n(t+1) - x_n(t)) &= y_n^{(2)}(t).
\end{aligned}
$$
From this system of equations we may directly determine how much must be accumulated from the production of the individual branches in the given year in order to reach the planned increments of production in the individual branches. If we write the system of equations in the matrix form

$$C \Delta X = Y^{(2)},$$

and, provided that C is invertible, rearrange it to

$$\Delta X = C^{-1} Y^{(2)},$$

we may determine the increments of production of the individual branches in the following year when the level and structure of accumulation are given. Now we consider the periods t = 1, 2, ..., T and write the balance

$$x_i(t) = \sum_{j=1}^{n} (X_{ij}(t) + z_{ij}(t)) + y_i(t), \quad i = 1, 2, \dots, n, \tag{22}$$

where $X_{ij}(t)$ is the supply of the i-th branch to the j-th branch for consumption in the t-th period of time, $z_{ij}(t)$ is the supply of the i-th branch to the j-th branch for investment during the t-th period of time, and $y_i(t)$ is the final product of the i-th branch in the t-th period of time. Suppose, as in the static model, that

$$X_{ij}(t) = a_{ij} x_j(t),$$

where $a_{ij}$ is the technical coefficient, which does not change in time. Further, suppose that the supply of the i-th branch to the j-th branch for investment during the period t is proportional to the increment of the production of the j-th branch in one period of time:

$$z_{ij}(t) = c_{ij} (x_j(t+1) - x_j(t)) = c_{ij} \Delta x_j(t),$$

where $\Delta x_j(t)$ is the increment of the total production of the j-th branch within one period and $c_{ij}$ is the investment coefficient, again independent of time. Substituting into the balance equation (22), we obtain, in matrix form,

$$C X(t+1) + (A - C - E) X(t) = -Y(t).$$

We have obtained a system of difference equations

$$
\begin{aligned}
C X(t+1) &= (E - A + C) X(t) - Y(t), \\
X(t+1) &= C^{-1} (E - A + C) X(t) - C^{-1} Y(t).
\end{aligned}
$$
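One period of the dynamic model can be computed directly from the balance (22) in matrix form, $X(t) = A X(t) + C (X(t+1) - X(t)) + Y(t)$, i.e. $C X(t+1) = (E - A + C) X(t) - Y(t)$. The following Python sketch does this for a hypothetical two-branch economy; the matrices A and C, the vectors X(0) and Y(0) and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction as F


def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def inv2(M):
    """Inverse of a 2 x 2 matrix of Fractions (non-zero determinant assumed)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def step(A, C, X, Y):
    """Return X(t+1) from C X(t+1) = (E - A + C) X(t) - Y(t)."""
    n = len(X)
    G = [[(1 if i == j else 0) - A[i][j] + C[i][j] for j in range(n)]
         for i in range(n)]
    rhs = [g - y for g, y in zip(matvec(G, X), Y)]
    return matvec(inv2(C), rhs)


A = [[F(1, 5), F(3, 10)], [F(2, 5), F(1, 10)]]   # technical coefficients
C = [[F(1), F(1, 2)], [F(1, 2), F(2)]]           # investment coefficients
X0 = [F(105), F(80)]                             # production in period t
Y0 = [F(50), F(25)]                              # final product in period t

X1 = step(A, C, X0, Y0)

# Check of the balance (22): X(t) = A X(t) + C (X(t+1) - X(t)) + Y(t).
dX = [b - a for a, b in zip(X0, X1)]
balance = [p + q + y for p, q, y in zip(matvec(A, X0), matvec(C, dX), Y0)]
print(X1, balance)
```

In this example the final product is smaller than the amount left over after inter-branch consumption, so the remainder is invested and the production of the first branch grows in the next period. With a final product that exceeds this amount, the same step would give a decrease.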
From this system we may determine how the inter-branch relationships should look in order to obtain the required growth. Denote $C^{-1} (E - A + C) =: M$ and $-C^{-1} =: N$ (the minus sign comes from the term $-Y(t)$ on the right-hand side of the matrix form of the balance). We obtain

$$X(t+1) = M X(t) + N Y(t). \tag{23}$$

This is a system of difference equations defined for all t. If |M| is not equal to 0, then a unique solution of the system exists. By successive substitution we can transform system (23) into a single difference equation of the n-th order, which may be solved e.g. with the aid of the roots of the characteristic equation (the eigenvalues). For M of small order the solution was described in [17]. Moreover, we can use the following theorem (for the proof see [3], p. 124).

Theorem: The unique solution of the initial value problem

$$X(n+1) = M(n) X(n) + G(n), \quad X(n_0) = X_0$$

is given by

$$X(n, n_0, X_0) = \left( \prod_{i=n_0}^{n-1} M(i) \right) X_0 + \sum_{r=n_0}^{n-1} \left( \prod_{i=r+1}^{n-1} M(i) \right) G(r),$$

where the factors of the products are ordered as $M(n-1) M(n-2) \cdots$ and an empty product is the unit matrix. If M is a constant matrix, then the solution is given by

$$X(n, n_0, X_0) = M^{n-n_0} X_0 + \sum_{r=n_0}^{n-1} M^{n-r-1} G(r).$$
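The formula for a constant matrix M can be compared with direct iteration of the recurrence. The Python sketch below does so for the system $x(n+1) = 2x(n) + y(n) + n$, $y(n+1) = 2y(n) + 1$ with $x(0) = 1$, $y(0) = 0$; the function names `closed_form` and `iterate` are assumptions introduced for this illustration.

```python
def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matpow(M, e):
    """M raised to a non-negative integer power by repeated multiplication."""
    size = len(M)
    R = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(e):
        R = matmul(R, M)
    return R


def closed_form(M, G, X0, n, n0=0):
    """X(n) = M^(n-n0) X0 + sum over r of M^(n-r-1) G(r), constant M."""
    X = matvec(matpow(M, n - n0), X0)
    for r in range(n0, n):
        term = matvec(matpow(M, n - r - 1), G(r))
        X = [a + b for a, b in zip(X, term)]
    return X


def iterate(M, G, X0, n, n0=0):
    """Direct iteration of X(r+1) = M X(r) + G(r)."""
    X = list(X0)
    for r in range(n0, n):
        X = [a + b for a, b in zip(matvec(M, X), G(r))]
    return X


M = [[2, 1], [0, 2]]


def G(n):
    return [n, 1]


X0 = [1, 0]

for n in range(6):
    print(n, closed_form(M, G, X0, n), iterate(M, G, X0, n))
```

Both functions return the same vectors for every n tried, for example (44, 15) for n = 4. The closed form is convenient for analysis, while the direct iteration is cheaper to compute for large n.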
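Example: We consider the system

$$
\begin{aligned}
x(n+1) &= 2x(n) + y(n) + n, \\
y(n+1) &= 2y(n) + 1, \\
x(0) &= 1, \quad y(0) = 0.
\end{aligned}
$$

Solution: In this case we have

$$M = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \quad G(n) = \begin{pmatrix} n \\ 1 \end{pmatrix}, \quad X(0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Then

$$M^n = \begin{pmatrix} 2^n & n 2^{n-1} \\ 0 & 2^n \end{pmatrix}.$$

Hence

$$X(n) = \begin{pmatrix} 2^n & n 2^{n-1} \\ 0 & 2^n \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sum_{r=0}^{n-1} \begin{pmatrix} 2^{n-r-1} & (n-r-1) 2^{n-r-2} \\ 0 & 2^{n-r-1} \end{pmatrix} \begin{pmatrix} r \\ 1 \end{pmatrix} = \begin{pmatrix} 2^n \\ 0 \end{pmatrix} + \sum_{r=0}^{n-1} \begin{pmatrix} r 2^{n-r-1} + (n-r-1) 2^{n-r-2} \\ 2^{n-r-1} \end{pmatrix} = (*)$$

We use the formula

$$\sum_{r=1}^{n-1} r a^r = \frac{a (1 - a^n) - n a^n (1 - a)}{(1 - a)^2}.$$

With $a = \tfrac{1}{2}$ it gives $\sum_{r=0}^{n-1} r 2^{n-r-1} = 2^{n-1} \sum_{r=1}^{n-1} r \left( \tfrac{1}{2} \right)^r = 2^n - n - 1$. With $a = 2$ and the substitution $s = n - r - 1$ it gives $\sum_{r=0}^{n-1} (n-r-1) 2^{n-r-2} = \tfrac{1}{2} \sum_{s=1}^{n-1} s 2^s = (n-2) 2^{n-1} + 1$. Finally, $\sum_{r=0}^{n-1} 2^{n-r-1} = 2^n - 1$. Therefore

$$(*) = \begin{pmatrix} 2^n + (2^n - n - 1) + (n-2) 2^{n-1} + 1 \\ 2^n - 1 \end{pmatrix} = \begin{pmatrix} 2^n + n 2^{n-1} - n \\ 2^n \left( 1 - \left( \tfrac{1}{2} \right)^n \right) \end{pmatrix}.$$

So we have

$$x(n) = 2^n + n 2^{n-1} - n, \quad y(n) = 2^n - 1,$$

which agrees with direct substitution into the system, e.g. x(1) = 2, x(2) = 6, y(1) = 1, y(2) = 3.

Acknowledgement
This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.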
REFERENCES
[1] ACKOFF, R. L. Progress in Operation Research. New York : John Wiley & Sons, Inc. 1961.
[2] CHURCHMAN, Ch. W.; ACKOFF, R. L.; ARNOFF, L. Introduction to Operations Research. New York : John Wiley & Sons, Inc. 1957.
[3] ELAYDI, S. N. An introduction to difference equations, Second Edition, Springer, 1999.
[4] HABR, J.; VEPŘEK, J. Systémová analýza a syntéza. Praha : SNTL, 1972.
[5] BECK, J.; LAGOVÁ, M.; ZELINKA, J. Lineární modely v ekonomii. Praha : SNTL, 1982.
[6] KLAPKA, J.; DVOŘÁK, J.; POPELKA, P. Metody operačního výzkumu. Brno : VUTIUM, 2001.
[7] PRÁGEROVÁ, A. Diferenční rovnice. Praha : SNTL, 1971.
[8] RAIS, K. Vybrané kapitoly z operační analýzy. Brno : PGS, 1985.
[9] ROCCAFERRERA, G. M. F. Operation Research Models for Business and Industry. Chicago, New York : S.W. publishing company, 1964.
[10] TER-MANUELIANC, A. Modelování problémů řízení. Praha : Institut řízení, 1977.
[11] VACULÍK, J.; ZAPLETAL, J. Podpůrné metody rozhodovacích procesů. Brno : Masarykova univerzita, 1998.
[12] WALTER, J. a kol. Operační výzkum. Praha : SNTL, 1973.
[13] WALTER, J. Stochastické modely v ekonomii. Praha : SNTL, 1970.
[14] ZAPLETAL, J.; ZÁSTĚRA, B. Vybrané kapitoly z operačního výzkumu. Zlín : VUTFT, 1983.
[15] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOŠ, 1995.
[16] ZAPLETAL, J. Structural Interbranch System of Static Model. International conference of EPI Kunovice, 2005, 303 – 307. ISBN 80-7314-052-7.
[17] BAŠTINEC, J. Structural Interbranch System of Dynamic Model. International conference of EPI Kunovice, 2005, 317 – 322. ISBN 80-7314-052-7.
Address: Doc. RNDr. Jaromír Baštinec, CSc. Department of Mathematics Faculty of Electrical Engineering and Communication Brno University of Technology Technická 8, 616 00 Brno, [email protected] Address: Prof. RNDr. Josef Diblík, DrSc. Department of Mathematics Faculty of Electrical Engineering and Communication Brno University of Technology Technická 8, 616 00 Brno, [email protected]
PROBABILITY THEORY AND STATISTICS IN THE COMBINED FORM OF STUDY OF THE BACHELOR STUDENT PROGRAMMES AT FEEC BUT
Michal Novák
Brno University of Technology
Abstract: At FEEC BUT the teaching of bachelor student programmes in the combined form of study began in the academic year 2004/2005. This contribution discusses how the basics of probability theory and statistics are included in the programmes. Some general information on the course as well as first experience from it are also given.
Keywords: distance learning, combined form of study, probability, statistics, teaching mathematics
Introductory information – context and prerequisites
Teaching in the combined form of study began at FEEC BUT in the academic year 2004/2005. The courses as well as their outlines and requirements are the same as in the attended form of study (with the exception of one subject offered in the attended form but not in the combined one). Mathematics in the bachelor student programmes is therefore taught in four subjects: Mathematical seminar, Mathematics 1, Mathematics 2 and Mathematics 3. Mathematical seminar is meant to revise secondary school knowledge of mathematics necessary for further studies. In the combined form of study it is a long weekend course at the beginning of the term; the students can either attend it or submit exercises only. Mathematics 1 (in the first term) includes basics of linear algebra, basics of differential and integral calculus of functions of one variable and basics of differential calculus of functions of several variables. Mathematics 2 (in the second term) includes solving differential equations and basics of theory of complex functions and integral transformations. Both courses consist of five tutorials.
The course on probability and statistics
Basic concepts of probability and statistics are taught during the third term in Mathematics 3. Its outline, however, includes basic numerical methods as well. Four of the six tutorials in the third term (3 lessons each) are devoted to Mathematics 3. The outline of the first Mathematics 3 course was as follows:
• Tutorial 1: introduction, revision, classical and geometrical probabilities, discrete and continuous random variable, issues of expected value and dispersion
• Tutorial 2: some basic distributions of probability (binomial, Poisson, exponential, uniform, normal), idea of statistical testing, basic statistical tests (sign test, z-test and mean expected value test), queuing theory (time permitting)
• Tutorial 3: numerical methods (not to be discussed here)
• Tutorial 4: final tutorial, revision, sample tests
The tutorials introduce students to the basic concepts of probability theory and statistics. Students learn that a relatively small number of formulas and theorems have profound effects in a number of situations. The choice of tasks emphasises the way of decoding the respective word problems and finding the way of applying the mathematical means to solve the problems. The part on statistical testing stresses the general concept of testing and its applicability in various contexts.
Students are required to submit exercises for each tutorial. These are assessed by a maximum of 30 points in total. The written only exam at the end of term is assessed by a maximum of 70 points. In order to pass the subject students are required to acquire the minimum of 50 points. Students have access to sample tests – these can be downloaded from teacher’s homepage. The final tutorial deals with the exam as well. Students can study the subject matter from [2], which respects the needs of distance form of study and was targeted at this subject. The text, which is also used by students in the attended form of study, is available as a PDF file from a number of links including the faculty website and teacher’s homepage. Special office hours in convenient time are set for the combined students only – consulting subject matter by telephone is widely used. Another contribution in the proceedings of this conference, [3], gives examples of tasks solved throughout the course. Applications of these problems as well as the necessary mathematical knowledge are included there as well.
Probability and statistics in the master study programmes
Since teaching in the combined form of study started only as late as 2004/2005, there are no students in the master study programmes yet. The master study programmes in the attended form of study, however, offer a course Probability, statistics, operations research, which deepens the knowledge of probability theory and statistics. The topics dealt with in the course include: basic statistical tests – t-test, F-test; confidence intervals; linear regression; post-hoc tests; goodness of fit test; nonparametric tests; mathematical methods in economics – linear programming, the transport problem; dynamic programming, recursive algorithm, inventory models.

Conclusion
Knowledge of probability theory and statistics is necessary for dealing with a great number of situations which can occur in various contexts. The combined form of study as a means designed to provide access to this knowledge to students who are already employed can help to improve position and status not only of such students but also of their employers.
References
[1] BAŠTINEC, J. Výuka matematiky na FEKT VUT Brno (v bakalářském i magisterském studiu). In 37. konferencia slovenských matematikov. Žilina : Slovenská matematická společnost, 2005.
[2] FAJMON, B.; RŮŽIČKOVÁ, I. Matematika 3. Brno : UMAT FEKT VUT, 2003, available from https://www.feec.vutbr.cz/et/skripta/umat/Matematika_3_S.pdf.
[3] NOVÁK, M. Examples of using concepts of probability theory in management decision making. This proceedings.
Address: Mgr. Michal Novák, Ph.D. Ústav matematiky, FEKT VUT v Brně Technická 8 616 00 Brno tel.: +420-541143135 e-mail: [email protected]
APPLICATION OF FUZZY SYSTEMS FOR DECISION SUPPORT AND CONTROL Vladimír Mikula, Jindřich Petrucha
Evropský polytechnický institut, s.r.o.
Abstract: This paper deals with the use of fuzzy logic to support the decision-making process. The basics of fuzzy logic have been published in numerous literature sources (see the list of references at the end of this article). A brief overview of the theory of fuzzy systems can be found in the tutorial article published by the author in the Proceedings of the International Conference of EPI, s.r.o. Kunovice, January 2005 [1]. The arrangement of a fuzzy system is described, with an explanation of the purpose and function of the individual blocks of the system, together with a brief description of its use for decision-making and control processes.
Keywords: fuzzy sets, fuzzy systems, universe of fuzzy sets, degree of membership, fuzzification of input and output variables, fuzzy rules, fuzzy associative memory FAM (bank of rules), fuzzy inferences, MAXMIN and MAXPROD methods (MAMDANI´s and LARSEN´s method), centroid, defuzzification of centroid, crisp value of output variable.
Introduction
In the real world, phenomena and activities are described as exactly as possible on the basis of idealized mathematical models (common in the natural and technical sciences). In some areas, however (e.g. in economics, or in organizing the operation of systems of a social nature), exact mathematical models are not always available, or they are difficult to formulate, or, even if such a model could be formulated, it would be very complicated and cumbersome and possibly unusable in practice. In these situations, methods that describe systems and phenomena on the basis of expert knowledge of the problem under study are applied more and more often. The solution uses procedures and methods from the field of artificial intelligence, which include artificial neural networks (imitating thinking) and fuzzy systems, based on fuzzy logic (and on so-called approximate reasoning) and using so-called linguistic variables, i.e. expression by vague linguistic means that assess the magnitude of the parameters of some quantity, graded over a certain range by apt words (labels), e.g. very small, small, medium, large, very large, etc. Although these notions are not sharply delimited (they are so-called unsharp, or fuzzy), they usually express the relevant domain of validity of a statement in an acceptable and comprehensible form, and they can be modelled by fuzzy sets. The use of "crisp" (binary, Boolean) logic, which recognizes only the truth (expressed by the symbol of logical one) or the falsity (expressed by logical zero) of a statement, is sometimes too coarse and unacceptable; grading the degree of truth by values continuously distributed over a defined interval (e.g. from zero to one) is more suitable. Fuzzy logic therefore introduces the notion of the degree of membership of an element in a given fuzzy set, for which μ ∈ <0, 1> holds. Using vague notions to describe reality by linguistic means has long been common and often very advantageous, and it is part of everyday life.
Proces řízení anebo rozhodování je pak expertně popsán pravidly typu JESTLI-ŽE < předběžná podmínka > PAK < následek, činnost >, nebo, jak je všeobecně používáno IF THEN < consequent >. Soubor těchto pravidel pak tvoří tzv. znalostní banku ( fuzzy associative memory, FAM) a postihuje pomocí patřičně kvantifikovaných lingvistických prostředků uvažovaný řídicí, nebo rozhodovací proces – je to obdoba programu sekvenčního digitálního počítače, operujícího v převážné většině na bázi ostré, booleovské logiky. Poznatky o fuzzy logice, která je obecnější než ostrá binární logika (jež je speciálním případem fuzzy logiky) jsou popsány ve velmi početné literatuře - viz reference na konci tohoto článku.
The arrangement of a fuzzy system for control and decision support is shown in Fig. 1.

[Fig. 1, block diagram: input variables x1 ... xn taken from the controlled or deciding object (1); input adjustment block (2); fuzzification block (3); bank of IF-THEN rules, FAM (4); fuzzy inference block (5); defuzzification block producing crisp values y1 ... ym (6); output adjustment block producing the adjusted output variables y1 ... ym (7).]
Fig. 1. Arrangement of a fuzzy system

The algorithm and the function of the individual blocks of the fuzzy system can be briefly described as follows:
1. Acquisition of the values of the input variables x1 to xn (e.g. from sensors on the controlled object, or from a suitable data bank).
2. Adjustment of these values (normalization, conversion to dimensionless numerical values).
3. Fuzzification of the input and output quantities, i.e. their division into partial fuzzy subsets in the respective universes and the assignment of names (labels) to these subsets in the sense of the above remarks on linguistic variables. The question is how to set the optimal number of subsets. The more subsets we choose, the finer the control or decision making, but the longer the computing time. A reasonable number of subsets turns out to be 3 to 9; an odd number is usually chosen (symmetric distribution around the middle value). Neighbouring subsets must partially overlap so that the transition across the universe is smooth. The degree of overlap requires deeper analysis, but the overlap coefficient can be chosen approximately as the ratio kp = A / B, where A is the width of the overlap interval of two neighbouring subsets on the x axis and B is the total interval on the x axis occupied by these two subsets. The overlap is often chosen so that the intersection point of the two overlapping subsets lies at µ = 0.3 to 0.5.
4. Construction of the bank of rules (FAM, fuzzy associative memory) on the basis of expert knowledge of the problem being solved. The rules are of the already mentioned type IF <antecedent> THEN <consequent>. The dimension of the rule bank equals the number of input variables n, and the number of rules P equals the product of the numbers of subsets pi into which the universes of the input quantities are divided, i.e. P = p1 · p2 · ... · pn. Consider, for example, a system with two input quantities, i.e. n = 2 (denote them x1 and x2), and one output quantity y. The rule bank is then two-dimensional (rectangular). Let the universe of x1 be divided into five subsets with the linguistic labels very low (VN), low (N), medium (S), high (V) and very high (VV); let the universe of x2 have three levels: low (N), medium (S) and high (V); and let the output quantity y also have three levels: small (M), medium (S) and large (V). The number of rules is P = p1 · p2 = 5 · 3 = 15. Let the rules determined by the expert, represented by the recommended contents of the individual cells of the knowledge bank, be distributed as in Fig. 2a. If the number of input variables is three, the rule bank is a three-dimensional prism; in general, for n input variables it is an n-dimensional body (Fig. 2b). A separate rule bank must be created for each output quantity.
5. Fuzzy inference. The fuzzified input and output quantities are fed to the inference block, where operations called fuzzy inferences are carried out on the basis of the rules stored in the knowledge bank. The result of these operations is the so-called centroids of the output quantities, which are usually subnormal fuzzy sets (not reaching the level µ = 1). The inferences are of the MAXMIN type (Mamdani's method) or of the MAXPROD type (Larsen's method). To explain the inference algorithm, consider the concrete fuzzified quantities in Fig. 3.
Rule bank for two input variables (Fig. 2a), output y:

x2 \ x1   VN   N    S    V    VV
N         V    V    V    V    V
S         M    S    S    S    S
V         M    M    M    M    M

[Fig. 2b: three-dimensional rule bank with axes x1, x2, x3.]
Fig. 2. Example of a rule bank for two (a) and three (b) input variables.

For MAXMIN inference we proceed as follows. For defined values of the input quantities, e.g. x1A and simultaneously x2A, we determine to which fuzzy subsets these values belong and with what degrees of membership µ(x1A), µ(x2A). For example, x1A lies in set V with µ = 0.7 and also in the neighbouring set S, where it reaches µ = 0.35. The value x2A lies in set N, where it reaches µ = 0.45, and also in set S with µ = 0.15. On the basis of the rule bank (we use the FAM given above) we write the following rules for this situation:
R1: IF x1 = x1A → V (0.7) AND x2 = x2A → N (0.45) THEN y → V (0.45)
OR
R2: IF x1 = x1A → S (0.35) AND x2 = x2A → S (0.15) THEN y → S (0.15)
OR next rule.
Because the antecedent uses the conjunction AND, which denotes intersection, we apply the rule for the intersection of fuzzy sets [1]; for R1 this gives $\mu_V(y) = \min(\mu_V(x_{1A}), \mu_N(x_{2A}))$. Rules R1 and R2 hold simultaneously, so we join them by the conjunction OR, i.e. by union. The other rules also hold simultaneously, but they do not show in this situation. We obtain the resulting fuzzy set of the output quantity (the resulting centroid) by clipping set V of the output quantity y at the height µ = 0.45 and set S at the height µ = 0.15. The partial centroids obtained in this way, denoted C1 and C2, are joined according to the rule for the union of fuzzy sets, $\mu_C = \max(\mu_{C1}, \mu_{C2})$, as shown in Fig. 3a.

[Fig. 3: membership functions of x1 (sets VN, N, S, V, VV, with µ(x1A) = 0.7 and 0.35), of x2 (sets N, S, V, with µ(x2A) = 0.45 and 0.15) and of y (sets M, S, V), with the partial centroids C1 and C2 obtained by MAXMIN and by MAXPROD.]
Fig. 3. Distribution of the fuzzified quantities of the fuzzy system under consideration and illustration of MAXMIN and MAXPROD inference.
In the MAXPROD method (Larsen's method), the partial centroids are obtained not by clipping the respective subsets of the output quantity but by scaling them down, i.e. by multiplying them (product, hence PROD) by the minimum value obtained from the individual rules. The partial centroids obtained in this way are then joined by the union operation (MAX), which gives the resulting centroid (Fig. 3).
6. Defuzzification of the centroid of the output quantity is the operation in which we evaluate the centroid to obtain a crisp value of the output quantity y. There are several defuzzification methods, but the most widely used one finds the coordinate of the centre of gravity of the centroid (COG), i.e. yCOG, according to
$$ y_{COG} = \frac{\int_{-\infty}^{\infty} \mu(y)\, y \, dy}{\int_{-\infty}^{\infty} \mu(y)\, dy} \approx \frac{\sum_{i=1}^{n} \mu(y_i)\, y_i}{\sum_{i=1}^{n} \mu(y_i)} $$
The second part of this relation replaces the integration by sums of the terms µ(yi) yi and µ(yi) obtained by sampling the centroid (Fig. 4).

[Fig. 4: centroid µ(y) sampled at the points y1, y2, ..., yi, ..., yn, with the centre of gravity (COG) marked at yCOG.]
Fig. 4. Centroid of the output quantity expressed by sampling.

If we determine yCOG for all values of the input quantities and plot the results in the coordinate system yCOG = f(x1, x2, ..., xn), we obtain the so-called decision surface (for a control process, the control surface) of the fuzzy system in (n + 1)-dimensional space.
7. Output adjustment block. The crisp output values yCOG obtained at the output of the defuzzifier may be mere dimensionless numbers. For a real system it may be necessary to give them a dimension and to adapt them to the desired ranges of the quantities for which the system is built (e.g. in real control systems the quantities y may be electric voltages set within defined ranges, or the length of the time interval during which the system is to operate, etc.). This is the purpose of the output adjustment block.
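The inference and defuzzification steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the output universe [0, 1], the triangular shapes of the output sets M, S and V, and all function names are hypothetical and are not taken from the article. Only the rule bank of Fig. 2a and the membership degrees 0.7, 0.35, 0.45 and 0.15 come from the text.

```python
# Sketch of MAXMIN (Mamdani) and MAXPROD (Larsen) inference followed by
# centre-of-gravity defuzzification.
# ASSUMPTIONS: output universe [0, 1] and triangular output sets M, S, V.

def tri(y, left, peak, right):
    """Triangular membership function (a shoulder if peak equals an end point)."""
    if y < left or y > right:
        return 0.0
    if y == peak:
        return 1.0
    if y < peak:
        return (y - left) / (peak - left)
    return (right - y) / (right - peak)

# Hypothetical output sets on the assumed universe [0, 1].
OUTPUT_SETS = {
    "M": lambda y: tri(y, 0.0, 0.0, 0.5),
    "S": lambda y: tri(y, 0.0, 0.5, 1.0),
    "V": lambda y: tri(y, 0.5, 1.0, 1.0),
}

# Rule bank of Fig. 2a: one string of output labels per x2 level.
X1_LABELS = ["VN", "N", "S", "V", "VV"]
FAM_ROWS = {"N": "VVVVV", "S": "MSSSS", "V": "MMMMM"}
FAM = {(x1, x2): row[i]
       for x2, row in FAM_ROWS.items()
       for i, x1 in enumerate(X1_LABELS)}

def firing_strengths(mu_x1, mu_x2):
    """AND inside a rule is min; OR over rules with the same consequent is max."""
    strengths = {}
    for a, mu_a in mu_x1.items():
        for b, mu_b in mu_x2.items():
            label = FAM[(a, b)]
            strengths[label] = max(strengths.get(label, 0.0), min(mu_a, mu_b))
    return strengths

def centroid(strengths, y, method="maxmin"):
    """Membership of the resulting centroid at the point y."""
    parts = []
    for label, height in strengths.items():
        mu = OUTPUT_SETS[label](y)
        # MAXMIN clips the output set, MAXPROD scales it.
        parts.append(min(mu, height) if method == "maxmin" else mu * height)
    return max(parts, default=0.0)

def cog(strengths, method="maxmin", samples=1001):
    """Sampled centre of gravity: sum(mu_i * y_i) / sum(mu_i)."""
    ys = [i / (samples - 1) for i in range(samples)]
    mus = [centroid(strengths, y, method) for y in ys]
    return sum(m * y for m, y in zip(mus, ys)) / sum(mus)

strengths = firing_strengths({"V": 0.7, "S": 0.35}, {"N": 0.45, "S": 0.15})
print(strengths)  # {'V': 0.45, 'S': 0.15}
print(cog(strengths, "maxmin"), cog(strengths, "maxprod"))
```

The sketch evaluates all four rules that fire for x1A and x2A; taking the maximum per output label leaves V at 0.45 and S at 0.15, the same heights as in rules R1 and R2.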
Application
The procedures described above can be applied in various fields. Let us give a simplified example from management decision making on the financing of a certain project to be carried out by a subject S, whose quality parameters P1, P2, ..., Pn are stored in a database DB. On the basis of these data, the information needed to decide whether or not to finance the project at the required level of costs can be evaluated in an expert manner. Deciding only between the two extreme values YES and NO may, however, be too coarse. It would be finer, and perhaps more acceptable, to use two further levels between them, e.g. RATHER YES and RATHER NO. A further level in the middle could lead to an indeterminate state, so we settle on four levels of recommendation whether to carry out the project: YES (A), RATHER YES (SA), RATHER NO (SN) and NO (N). The level of costs can also be quantified by suitable linguistic levels, e.g. HIGH (V), MEDIUM (S) and LOW (N). The qualitative parameter of the implementing subject (for clarity we consider only one significant parameter, P1, although the method can be extended to several parameters) is likewise divided by the expert into the desired number of linguistically graded values, e.g. VERY SMALL (VM), SMALL (M), MEDIUM (S), LARGE (V), VERY LARGE (VV). The algorithm of this process is shown in Fig. 5.
[Fig. 5, flow chart: database from which the qualitative parameters of the implementing subject can be determined → expert determination of the fuzzy subsets of the input and output quantities and of the rule bank → fuzzy system for decision support according to the scheme in Fig. 1 → final decision.]

Bank of IF-THEN rules (output y):

x1 \ x2   N    S    V
VM        N    N    N
M         SN   SN   N
S         SA   SA   SN
V         A    A    SA
VV        A    A    A

Fuzzified quantities: x1 with the sets VM, M, S, V, VV; x2 with the sets N, S, V; y with the sets N, SN, SA, A.
x1: qualitative parameter P1 of the implementing subject; x2: level of costs; y: decision of the management whether to finance the project at the given level of costs.

Fig. 5. Algorithm of the decision process, example of a bank of IF-THEN rules and layout of the fuzzified input and output quantities.
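As a rough illustration of the fuzzification and rule-bank steps in this example, the following Python sketch evaluates the rule bank of Fig. 5 for one pair of inputs. The normalization of x1 and x2 to [0, 1], the evenly spaced triangular sets crossing at µ = 0.5, and the names `triangular_partition` and `recommend` are assumptions made for this sketch; the article does not specify the membership functions numerically.

```python
# Sketch: fuzzification with evenly spaced triangular sets and lookup in the
# rule bank of Fig. 5.
# ASSUMPTIONS: inputs normalized to [0, 1]; neighbouring sets cross at mu = 0.5.

def triangular_partition(labels, lo=0.0, hi=1.0):
    """Return a function mapping a crisp x to {label: degree of membership}."""
    step = (hi - lo) / (len(labels) - 1)
    peaks = {label: lo + i * step for i, label in enumerate(labels)}

    def fuzzify(x):
        degrees = {}
        for label, peak in peaks.items():
            mu = max(0.0, 1.0 - abs(x - peak) / step)
            if mu > 0.0:
                degrees[label] = mu
        return degrees

    return fuzzify

fuzzify_x1 = triangular_partition(["VM", "M", "S", "V", "VV"])  # quality P1
fuzzify_x2 = triangular_partition(["N", "S", "V"])              # level of costs

# Rule bank of Fig. 5: FAM[x1 label][x2 label] -> recommendation label.
FAM = {
    "VM": {"N": "N",  "S": "N",  "V": "N"},
    "M":  {"N": "SN", "S": "SN", "V": "N"},
    "S":  {"N": "SA", "S": "SA", "V": "SN"},
    "V":  {"N": "A",  "S": "A",  "V": "SA"},
    "VV": {"N": "A",  "S": "A",  "V": "A"},
}

def recommend(x1, x2):
    """Firing strength of each recommendation (AND = min, OR = max)."""
    strengths = {}
    for a, mu_a in fuzzify_x1(x1).items():
        for b, mu_b in fuzzify_x2(x2).items():
            label = FAM[a][b]
            strengths[label] = max(strengths.get(label, 0.0), min(mu_a, mu_b))
    return strengths

# Hypothetical inputs: quality 0.6 and costs 0.5 on the normalized scales.
print(recommend(0.6, 0.5))
```

The resulting strengths would then be passed to the inference and defuzzification blocks of Fig. 1; the strongest label can serve as a first, coarse recommendation.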
Conclusion
This article gives a brief overview of a method of using fuzzy logic to support the decision-making process. It follows the previous tutorial article [1], published in the proceedings of the international conference of EPI Kunovice held in January 2005. The use of fuzzy sets makes it possible to obtain suitable results even when an exact mathematical model of the system is not available, or is very complicated and unsuitable for real-time solution, but expert knowledge of the problem exists. The necessary theoretical foundations can be found in the cited literature. The article briefly describes the individual blocks of a fuzzy system, their function and, where relevant, the most important design principles. Finally, an example of the concept of a fuzzy system for supporting management decisions on the financing of a certain project is shown. The method can also be used for various other tasks in banking, economics, industry, etc. The output of such a system should be taken as a qualified recommendation for decision making. The final decision naturally depends on the judgement of the deciding subject, which here is the management.
References: [1] MIKULA, V. Exploitation of fuzzy logic in control and decision processes. Proceedings of the International Conference of Kunovice : EPI, 2005. [2] ZADEH, L. A. Fuzzy Sets. Inf. and Control, 8, 1965, pp. 338-353. [3] KOSKO, B. Neural Networks and Fuzzy Systems. Prentice Hall Inc., 1992. [4] NOVÁK, V. Fuzzy množiny a jejich aplikace. Praha : Matematický seminář, SNTL, 1992. [5] NOVÁK, V. Základy fuzzy modelování. Praha : BEN - Technická literatura, 2000. [6] POKORNÝ, M. Umělá inteligence v modelování a řízení. Praha : BEN - Technická literatura, 1996. [7] ZADEH, L. A.; LANGARI, R.; YEN, R.; COX, E. Fuzzy Logic Educational Program. Motorola Co. 1992. [8] KAUFMANN, A. Initiation Élementaire aux Sous- ensambles Flous à l´ Usage des Débutants. École Polytechnique Féderal de Lausanne, 1992. [9] KONEČNÝ, V.; PEZLAR, R.; REJNUŠ, O. Fuzzy expertní systémy systémy a rozhodování. Brno : Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 2001.
Address: Prof. Ing. Vladimír Mikula, CSc. Evropský polytechnický institut, s.r.o. Osvobození 699, 686 04 Kunovice tel./fax: +420 572 549 018, +420 572 548 788 e-mail: [email protected] Address: Ing. Jindřich Petrucha, Ph.D. Evropský polytechnický institut, s.r.o. Osvobození 699, 686 04 Kunovice tel./fax: +420 572 549 018, +420 572 548 788 e-mail: [email protected]
EXAMPLES OF USING CONCEPTS OF PROBABILITY THEORY IN MANAGEMENT DECISION MAKING Michal Novák, Břetislav Fajmon
FEKT VUT v Brně
Abstract: Probability theory and statistics play an important part in everyday company and management life and decision making. In the contribution we show that even very basic concepts can be used in solving practical problems. The choice of tasks and ways of solving them conform to the curriculum of a course on probability theory and statistics offered in a combined form of study of bachelor student programmes at FEEC BUT, which is referred to elsewhere in these proceedings.
Keywords: probability theory, statistics, combined form of study, teaching mathematics
Introductory information
The curriculum of a course on probability theory and statistics offered in the combined form of study at FEEC BUT is described elsewhere in the proceedings of this conference [6]. The course is brief and introductory: it covers classical and geometrical probability, discrete and continuous random variables, basic terms of statistics, some very basic probability distributions and an introduction to statistical testing. Yet even these concepts alone can be used to solve some important practical tasks. We discuss several problems which can occur in various contexts and in a number of variations, and we show how knowledge of the basic concepts alone can help in solving them. The variety of the chosen problems is intentional: we want to show that the basics of probability theory and statistics can be applied in a number of situations at almost any position in a company without any special deep education, long training or use of specialised software.
The issue of guarantee period
Let us consider the following situation: The operating life of a product can be described by a certain distribution of probability. We are ready to tolerate a certain number of legitimate complaints in the guarantee period. What length of the guarantee period shall we set? Once we know the distribution of probability describing the operating life of our product, the task can be solved in a simple way. Let us denote the continuous random variable describing the operating life of our product by X, the expected value of the operating life by EX, the relative number of legitimate complaints we are ready to tolerate by α and the length of the guarantee period by G. Then in fact we need to find such G that $P(X < G) = \alpha$. For the sake of simplicity, let us consider the exponential distribution¹. Then we have $1 - e^{-G/EX} = \alpha$, which results in $G = -EX \ln(1 - \alpha)$.
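The formula for G can be checked with a short Python sketch. The function name and the numbers used (an expected operating life of 10 years and a tolerated complaint rate of 5 %) are hypothetical and serve only as an illustration.

```python
import math

def guarantee_period(expected_life, alpha):
    """Length G such that P(X < G) = alpha for an exponentially distributed X."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return -expected_life * math.log(1.0 - alpha)

# Hypothetical product: EX = 10 years, tolerated share of complaints 5 %.
G = guarantee_period(10.0, 0.05)
print(round(G, 3))                  # guarantee period in years
print(1.0 - math.exp(-G / 10.0))    # recovers alpha (up to rounding)
```

With these assumed values the guarantee period comes out at roughly half a year, far shorter than the expected operating life.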
¹ The choice follows from the fact that the exponential distribution is one of those taught in the course on probability and statistics referred to in [6]. It is to be noted, however, that the choice of exponential distribution in this context may be rather special.

Designing a special offer
Let us imagine the following special offer: We are going to offer something for free. Every eligible person is entitled to do something (cast a die, turn a lottery-wheel, etc.) with a relatively small number of possible outcomes. If certain states are reached, the person is given the advertised thing and entitled to another chance. This keeps repeating as long as the desired states are reached. We may consider the following specific example: Beers for free! If you can squeeze in, turn our lottery-wheel! Six numbers only! 5 and 6 mean a beer for free and another chance! As long as 5 and 6 keep falling, you keep drinking! The only way to stop drinking for free is to pray for 1 to 4!
The nature of this offer can be revealed using the concept of expected value. Let X be a discrete random variable denoting the number of beers drunk for free by one person and p(x) the probability mass function of X. Let us exclude the cases of stopping drinking for reasons other than turning up the wrong numbers. We get that $p(x) = \frac{2}{3^{x+1}}$ for $x \in \{0, 1, 2, \dots\}$ and p(x) = 0 otherwise. The expected number of beers drunk for free by one person in such a special offer can be computed as $EX = \sum_{x=0}^{\infty} x \cdot \frac{2}{3^{x+1}}$ and it turns out that EX = 0.5, which is definitely less than the offer suggests.
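The value EX = 0.5 can be verified numerically. The following sketch sums the series with exact fractions; truncating the sum at 200 terms is an arbitrary choice made here, and the function name is hypothetical.

```python
from fractions import Fraction

def pmf(x):
    """p(x) = 2 / 3**(x + 1): x winning turns (5 or 6) followed by one losing turn."""
    return Fraction(2, 3 ** (x + 1))

TERMS = 200  # truncation of the infinite series, chosen for this sketch

total_probability = sum(pmf(x) for x in range(TERMS))
expected_beers = sum(x * pmf(x) for x in range(TERMS))

print(float(total_probability))  # very close to 1
print(float(expected_beers))     # very close to 0.5
```

The same result follows from the mean of the geometric distribution: with winning probability q = 1/3, the expected number of wins before the first loss is q / (1 - q) = 0.5.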
Designing board games
Many board games contain fields known as “function fields”, i.e. fields which require some action or direct the game. The flow of the game can be controlled or directed by a suitable choice of the positions of these fields, or rather of the number of fields between them. This becomes apparent if the game is played with more than one die and the length of each player's move is the sum of the numbers on the dice. The graph shows the values of the probability mass function of a discrete random variable denoting the possible sums on two six-sided dice.

[Figure: Distribution of probability of sums on two dice. Horizontal axis: possible sums (2 to 12); vertical axis: probability (0 to 0.2).]

Seven is the most likely sum on two dice: it is three times more probable than three or eleven and six times more probable than two or twelve.
The only mathematical concept used in this example is the well-known formula of classical probability $P(A) = \frac{|A|}{|\Omega|}$, where |A| is the number of favourable outcomes of an experiment and |Ω| is the number of all possible outcomes.
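The distribution in the graph can be reproduced by direct enumeration of all outcomes, which is exactly the classical-probability formula applied to each sum. A minimal sketch (the function name is hypothetical):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sum_distribution(dice=2, sides=6):
    """P(sum = s) = |A| / |Omega|, obtained by enumerating all equally likely rolls."""
    counts = Counter(sum(roll) for roll in product(range(1, sides + 1), repeat=dice))
    total = sides ** dice
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}

dist = sum_distribution()
for s, p in dist.items():
    print(s, p, round(float(p), 3))
```

A game designer could use the same function with other values of `dice` and `sides` to see which distances between function fields are hit most often.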
The issue of guessing in multiple choice tests
There are many objections to multiple choice tests. It has often been suggested that the results may be influenced by simple guessing. Let us consider the following conditions: There is a given number of questions (let us denote it by n), each of which offers the same number of options (let us denote it by k), out of which r options are always correct. Let us suppose that the respondent does not know anything about the subject matter of the test, yet knows how many options are correct and guesses accordingly. What is the expected number of correctly answered questions if a question is regarded as answered correctly only when exactly the correct options are marked?
The probability that a question is answered correctly under these circumstances is the same for every question and equals $p = 1 / \binom{k}{r}$. The table shows the respective probabilities for given k and r. The expected number of correctly answered questions is EX = n · p. The example of 100 questions with six options, out of which three are always correct (which can be considered a reasonable compromise), gives EX = 100 · 0.05 = 5 and immediately negates the objection on guessing.

Probability p of a correct guess; rows: number of correct options (r), columns: number of options (k)

r \ k   3      4      5      6      7      8
1       0.33   0.25   0.20   0.17   0.14   0.13
2       0.33   0.17   0.10   0.07   0.05   0.04
3       -      0.25   0.10   0.05   0.03   0.02
4       -      -      0.20   0.07   0.03   0.01
5       -      -      -      0.17   0.05   0.02
6       -      -      -      -      0.14   0.04
7       -      -      -      -      -      0.13
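The table and the expected number of correct answers can be recomputed with the standard library. The sketch below assumes, as in the text, that the respondent marks exactly r options at random; the function names are hypothetical.

```python
from math import comb

def p_correct(k, r):
    """Probability of marking exactly the r correct options out of k by pure guessing."""
    return 1 / comb(k, r)

def expected_correct(n, k, r):
    """Expected number of correctly answered questions, EX = n * p."""
    return n * p_correct(k, r)

# Reproduce the table: rows r = 1..7, columns k = 3..8.
for r in range(1, 8):
    row = [f"{p_correct(k, r):.2f}" if k > r else "-" for k in range(3, 9)]
    print(r, *row)

print(expected_correct(100, 6, 3))  # about 5 questions out of 100
```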
Reading specifications
Reading and understanding specifications of products is an important part of everyday company life. Let us consider, for example, a test with the following specifications: point span 0 to 100; minimum number of points to pass the test: 50; the random variable X, which describes the results of the test, has the normal distribution with expected value µ = 62 and dispersion σ² = 25.
It is obvious that such a test is useless, since if X ~ No(µ, σ²), we have P(µ − 3σ < X < µ + 3σ) = 0.9973. In our case this means P(47 < X < 77) = 0.9973, so failing the proposed test is almost impossible.
Let us consider another simple example: The random variable X describing the time before a problem with a machine occurs has the exponential distribution of probability. The problem occurs on average once in H hours. We denote by T the operating time and by p the probability that the machine works without a problem for longer than T. For H = 2000 and p = 0.99 we get that T is as short as 20 hours. This could make us reject such a machine, since 20 hours of non-problematic operating time is indeed not acceptable. Yet with some background knowledge of probability theory we could easily object to such reasoning, since T increases rapidly as p decreases. The general formula for T is, in our case of the exponential distribution, T = −H ln p, which follows from P(X ≥ T) = p, where X ~ Exp(1/H) describes the time before the problem occurs in case that the problem occurs once in H hours.² The following table gives values of T for some values of p and H.
Values of T (in hours) for given p and H:

p          0.99   0.98   0.97    0.96    0.95    0.94    0.93    0.92    0.91    0.90    0.85    0.80
H = 2000   20.1   40.4   60.9    81.6    102.6   123.8   145.1   166.8   188.6   210.7   325.0   446.3
H = 3000   30.2   60.6   91.4    122.5   153.9   185.6   217.7   250.1   282.9   316.1   487.6   669.4
H = 4000   40.2   80.8   121.8   163.3   205.2   247.5   290.3   333.5   377.2   421.4   650.1   892.6
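Both specification examples can be recomputed with a short sketch. It assumes Python 3.8 or later for `statistics.NormalDist`; the function names are hypothetical, and the parameters are those given in the text (µ = 62, σ = 5, pass mark 50; T = −H ln p).

```python
import math
from statistics import NormalDist

def fail_probability(mu, sigma, pass_mark):
    """P(X < pass_mark) for normally distributed test results."""
    return NormalDist(mu, sigma).cdf(pass_mark)

def trouble_free_time(H, p):
    """T such that P(X >= T) = p when X is exponential with mean H."""
    return -H * math.log(p)

# Test example: the chance of failing is below one percent.
print(fail_probability(62, 5, 50))

# Machine example: a few entries of the table.
for H in (2000, 3000, 4000):
    print(H, [round(trouble_free_time(H, p), 1) for p in (0.99, 0.95, 0.90, 0.80)])
```

Lowering the required reliability from 0.99 to 0.80 lengthens T by a factor of more than twenty, which is the point the example makes about reading a single figure in isolation.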
² Naturally, another piece of knowledge would be necessary to support or question the fact that exponential distribution is used in this respect.

Comparing results
Comparing data is a necessity in a number of situations. However, wrong conclusions are often drawn from “comparing the incomparable”. Let us consider the following data sets with the same span of possible results, where xi ∈ {10, 11, ..., 19, 20}:
Set A: xi: 12, 12, 12, 13, 14, 14, 15, 15, 15, 16, 18, 19, 20
Set B: xi: 10, 10, 11, 11, 11, 13, 16, 17, 18, 19, 19, 20, 20
The value xi = 18 occurs in both data sets. The average value of both data sets is x̄ = 15, therefore in both data sets xi = 18 is better than the average. Yet after we employ the standardized z-values, which take into account both the average and the dispersion, we get that in set A the respective z-value is z = 1.22, while for set B we have z = 0.77, which means that the value xi = 18 is relatively much better in set A even though its distance from the (same) average is the same in both sets.
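The comparison can be sketched with the standard library as follows. The text does not say which form of the standard deviation was used; the population form (`statistics.pstdev`) is an assumption made here, and the function name is hypothetical. Under that assumption the value for set B rounds to 0.77, while set A comes out near 1.18, so the figure quoted for set A may rest on a different convention or rounding; the qualitative conclusion is unaffected.

```python
from statistics import mean, pstdev

SET_A = [12, 12, 12, 13, 14, 14, 15, 15, 15, 16, 18, 19, 20]
SET_B = [10, 10, 11, 11, 11, 13, 16, 17, 18, 19, 19, 20, 20]

def z_value(x, data):
    """Standardized value of x: distance from the mean in units of the
    (assumed) population standard deviation."""
    return (x - mean(data)) / pstdev(data)

for name, data in (("A", SET_A), ("B", SET_B)):
    print(name, mean(data), round(z_value(18, data), 2))
```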
The issue of statistical testing
Since only z-tests are dealt with in the course in the combined form of study at FEEC BUT, which is referred to in [6], let us consider only this type of statistical test. For the sake of simplicity let us deal with the test µ ≠ constant. The mathematical background of this test is very simple: once we set the hypotheses, we are looking for such $x_k$ that $P(X \ge x_k) = \frac{\alpha}{2}$, where X ~ No(µ, σ²) and α is the given significance level. The $x_k$ can be easily obtained with the help of the respective z-value, since the parameters µ and σ² (thus also σ) are known. This simple background can be applied in a number of various situations. Typically: We know that the operating time of a machine (random variable X) can be described by the normal distribution of probability as X ~ No(µ, σ²). We are going to test a new technique designed to increase the operating time. Some machines have been enhanced with the technique and their operating time measured. Given the significance level α, what conclusion on the quality of the technique can be drawn from the values? Statistical testing can be easily abused in order to manipulate the recipient into accepting wrong conclusions. This is especially true for the choice of the significance level and the misinterpretation of µ ≠ constant and µ > constant or µ < constant tests. Given a suitable significance level and disregarding the nature of the test, almost any hypothesis may seem acceptable.
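A two-sided z-test of the kind described above can be sketched as follows. This is the usual textbook form for the mean of a sample with known σ, which is an assumption about how the course applies the test; the function name and the data (25 measured operating times averaging 105 hours, µ = 100, σ = 10) are hypothetical. `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Return (z, critical value, reject H0?) for H0: mu = mu0 against mu != mu0."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # P(Z >= z_crit) = alpha / 2
    return z, z_crit, abs(z) > z_crit

# Hypothetical measurements on 25 enhanced machines.
sample = [105.0] * 25

print(two_sided_z_test(sample, mu0=100.0, sigma=10.0, alpha=0.05))
print(two_sided_z_test(sample, mu0=100.0, sigma=10.0, alpha=0.01))
```

With these assumed data the hypothesis µ = 100 is rejected at α = 0.05 but not at α = 0.01, which illustrates how strongly the conclusion depends on the chosen significance level.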
Conclusion
Knowledge of probability theory and statistics is an integral part of responsible decision making in many aspects of company routine. It is often believed that these theories are too abstract or too complex to be used by non-trained staff or without specialised software. The above contribution does not challenge this assumption, which is naturally valid in a great many contexts. It rather complements the idea by suggesting that there exist real-life situations which can be solved by a surprisingly modest amount of knowledge of probability theory.
References: [1] BAŠTINEC, J. Matematika pro bakaláře na FEKT VUT. In Matematika na vysokých školách. Praha : 2003. [2] BAŠTINEC, J.; DIBLÍK, J. Výuka matematiky v magisterském studiu na FEKT VUT. In XXIII International Colloquium on The Acquisition Process Management, Sborník abstraktů a elektronických verzí příspěvků na CD-ROM. Brno : Univerzita obrany, 2005. [3] CASELLA, G.; BERGER, R. L. Statistical Inference, 2nd ed. Duxbury Thompson Learning, 2002. [4] FAJMON, B.; RŮŽIČKOVÁ, I. Matematika 3. Brno : UMAT FEKT VUT, 2003, available from https://www.feec.vutbr.cz/et/skripta/umat/Matematika_3_S.pdf [5] FOTR, J.; DĚDINA, J. Manažerské rozhodování. Praha : Vysoká škola ekonomická, Fakulta podnikohospodářská, 1994. [6] NOVÁK, M. Probability theory and statistics in the combined form of study of the bachelor student programmes at FEEC BUT. This proceedings. [7] ZAPLETAL, J. Poznámka k rozhodování za rizika a nejistoty. This proceedings. [8] ZAPLETAL, J. Základy počtu pravděpodobnosti a matematické statistiky. Brno : PC-DIR, 1995.
Address: Mgr. Michal Novák, Ph.D. Ústav matematiky, FEKT VUT v Brně Technická 8 616 00 Brno tel.: +420-541143135 e-mail: [email protected] Address: RNDr. Mgr. Břetislav Fajmon, PhD. Ústav matematiky, FEKT VUT v Brně Technická 8 616 00 Brno tel.: +420-541143135 e-mail: [email protected]
OPTIMIZATION OF MATERIAL CHARACTERIZATION BY ADAPTIVE TESTING
Gábor Vértesy1, Ivan Tomáš2, István Mészáros3
1 Research Institute for Technical Physics and Materials Science, Hungarian Academy of Sciences
2 Institute of Physics, Academy of Sciences of the Czech Republic
3 Department of Materials Science and Engineering, Budapest University of Technology and Economics
Abstract: A new procedure, called Adaptive Testing, was applied for non-destructive characterization of cold-rolled austenitic stainless steel samples. The flat samples were magnetized by an attached yoke, and sensitive, reliable descriptors of their plastic deformation strain were obtained from an evaluation based on measurements of series of magnetic minor hysteresis loops, without magnetic saturation of the samples. The results were compared with the results of conventional reference measurements. A significant increase in sensitivity was found when Adaptive Testing was applied.
Keywords: Adaptive testing, nondestructive material evaluation, material parameter optimization
1. Introduction
Magnetic measurements are frequently used to characterize changes in the structure of ferromagnetic materials, because their magnetization processes are closely related to their microstructure. This also makes magnetic measurements an obvious candidate for non-destructive testing, i.e. for the detection and characterization of modifications and/or defects in materials and in products manufactured from them [1,2]. The majority of traditional magnetic investigations of variations in structural material properties simply make use of several parameters of the saturation-to-saturation major hysteresis loop (coercive force, remanent induction, saturation magnetization, permeability), see e.g. [3,4]. These traditional parameters are well suited to a general account of the magnetic properties of ferromagnetic samples, but they were never optimized to reflect magnetically the various structural properties of the measured specimens and their changes. An alternative, sensitive and experimentally friendly approach, the Adaptive Testing (AT) method, accumulates data on a selected physical process whose parameters are systematically varied over ranges broad enough to give the most complete picture of the behavior. Subsequent analysis of the recorded data leads to a large family of “degradation curves”, i.e. potential calibration curves, the most satisfactory of which is then picked by a software algorithm as the optimally adapted calibration curve for later tests of unknown samples of the inspected material altered in the expected way. Based on the measurement of magnetic minor loops, the method of Magnetic Adaptive Testing (MAT) was considered recently in [5] and [6]. MAT introduced general magnetic descriptors of diverse variations of non-magnetic properties of ferromagnetic materials, optimally adapted to the property and material under investigation.
According to this method, sets of minor hysteresis loops are scrutinized and sensitive descriptors of the property variation of the material are identified. In this work an example of the application of AT is given. We describe an experimental search for the calibration curve or curves best adapted to the magnetic examination of a particular steel subjected to cold rolling. Subsequent testing of any unknown sample of the same kind of steel would be expected to indicate the level of the applied plastic strain through the magnitude of the chosen magnetic feature of the material.
2. ADAPTIVE TESTING OF COLD ROLLED AUSTENITIC STEEL
Titanium-stabilized austenitic stainless steel of the 18/8 type was studied. Strip-shaped specimens were annealed at 1100 °C for 1 hour and then quenched in water in order to prevent any carbide precipitation and to achieve a homogeneous austenitic structure as the starting material structure. The as-prepared stainless steel specimens were cold-rolled at room temperature to different strains (from 33 to 63% strain for the investigated specimens). For the reference measurement, a major (saturation-to-saturation) magnetic hysteresis loop was taken from each sample.
A specially designed Permeameter [6] with a magnetizing yoke was applied for measurement of families of minor loops of the differential permeability of the magnetic circuit. The magnetizing coil wound on the ferrite yoke is fed with a triangular waveform current with step-wise increasing amplitudes and a fixed slope magnitude in all the triangles. This produces a time variation of the effective field, ha(t), in the magnetizing circuit, and a signal is induced in the pick-up coil. As long as ha(t) sweeps linearly with time, the voltage signal U(ha,hb) in the pick-up coil is proportional to the differential permeability, µ(ha,hb), of the magnetic circuit:

$$\mu(h_a, h_b) = \mathrm{const} \cdot U(h_a, h_b) = \mathrm{const} \cdot \frac{\partial B(h_a, h_b)}{\partial h_a} \cdot \frac{\partial h_a}{\partial t}$$

The Permeameter works under full control of a PC, which sends the control information to the function generator and collects the measured data. An input-output data acquisition card performs the measurement. The computer stores a data file for each measured family of minor “permeability” loops, one per measured sample.
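As a rough illustration of the excitation described above, the following sketch builds a triangular field waveform with step-wise increasing amplitudes and a fixed slope magnitude. The function name, the 0 → +A → −A → 0 shape of each triangle and the sampling interval are assumptions made for this example; they are not taken from the actual Permeameter control software.

```python
def triangular_sweep(amplitudes, slope, dt):
    """Return field samples h(t), one triangle per amplitude.

    Each triangle runs 0 -> +A -> -A -> 0 with |dh/dt| = slope
    (assumed shape; the last step before a turning point may be shorter).
    """
    samples = [0.0]
    step = slope * dt  # field increment per sampling interval
    for amp in amplitudes:
        for target in (amp, -amp, 0.0):
            h = samples[-1]
            direction = 1.0 if target > h else -1.0
            while abs(target - h) > step:
                h += direction * step
                samples.append(h)
            samples.append(target)  # land exactly on the turning point
    return samples


# Two triangles with amplitudes 100 and 200 (arbitrary field units).
sweep = triangular_sweep([100.0, 200.0], slope=10.0, dt=1.0)
```

Because the slope magnitude is the same in every triangle, the factor ∂ha/∂t in the relation above is constant in magnitude, which is what allows the induced voltage to be read directly as differential permeability.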
The experimental raw data are processed by a data-evaluation program, which divides the originally continuous data of each measured sample into a family of individual permeability half-loops. Then the family of either the top half-loops, the bottom half-loops, or their average is chosen for further processing. The program filters experimental noise and interpolates the experimental data onto a regular square grid of elements, µij ≡ µ(hai,hbj), of a “µ-matrix” with a preselected field step. The consecutive series of µ-matrices, each taken for one sample with strain ε from the series of increasingly deformed material, describes the magnetic reflection of the plastic deformation of the material. The matrices are processed by a matrix-evaluation program, which normalizes them by a chosen reference matrix and arranges all the mutually corresponding elements µij of all the evaluated µ-matrices into a µij(ε) table. Each µij(ε) column of the table numerically represents one µij(ε)-degradation function of the material. The matrix-evaluation program calculates the sensitivity of each degradation function and draws their “sensitivity map” in the plane of the field coordinates (hai,hbj) ≡ (i,j). This map shows the relative sensitivity of each µij(ε)-degradation function with respect to the plastic strain, ε, of the investigated material. The sensitivity of each degradation function is computed as the slope of its linear regression and is expressed by a shade in the sensitivity map.
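The matrix-evaluation step can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes that normalization means element-wise division by the reference matrix, that no reference element is zero, and that the matrices are given as plain nested lists. The function names are hypothetical.

```python
def linear_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def sensitivity_map(strains, matrices, reference):
    """Regression slope of every normalized degradation function mu_ij(eps).

    strains   : strain values, one per sample
    matrices  : mu-matrices (2-D lists), one per sample
    reference : matrix used for element-wise normalization (assumption)
    """
    rows, cols = len(reference), len(reference[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            curve = [m[i][j] / reference[i][j] for m in matrices]
            result[i][j] = linear_slope(strains, curve)
    return result
```

Each entry of the returned map corresponds to one pair of field coordinates (hai, hbj); plotting it as shades of gray would give a picture of the kind shown in Fig. 1.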
Fig. 1 Map of relative sensitivity of the µ-degradation functions, µij(ε)≡µ(hai,hbj)(ε). (The crossing point of the lines indicates the most sensitive µij(ε)-degradation function.) [Plot not reproduced; axes: ha [A/m] (horizontal), hb [A/m] (vertical).]
Permeability matrices of all the samples were calculated from the measured data, and the matrix-evaluation process was applied to compare the sensitivity of all the individual degradation functions, µij(ε), each corresponding to a pair of field coordinates (hai,hbj). The sample with the lowest strain was used for the normalization. Fig. 1 shows the map of the relative sensitivity of the µij(ε)-degradation functions. The elements depicted as the “whitest” in the sensitivity map correspond to the most sensitive µij(ε)-degradation functions. The most sensitive element, characterized by ha=700 A/m and hb=1200 A/m, corresponds to the top of the “white” area. Its location is shown by the two crossing perpendicular lines in Fig. 1. The sensitivity map also shows that within the “whitest” region the µij(ε)-degradation functions vary only very slightly, so the elements neighbouring the chosen one (ha700, hb1200) provide practically the same value. This makes the choice of a proper, sensitive descriptor very reliable. The most sensitive µij(ε)-degradation function is shown in Fig. 2. The Bmax values, determined from the reference measurement on the major loops, are also shown in the figure.
Fig. 2 The most sensitive µij(ε)-degradation function and the result of the reference measurement. [Plot not reproduced; axes: plastic strain [%] (horizontal), µ-degradation functions (vertical); curves: ha700,hb1200 and Reference.]
By integrating the permeability along the field ha, hysteresis loops and hysteresis-loop B-matrices can be obtained. The B-matrices contain the same information as the µ-matrices; however, the presentation of the ε-dependences of the corresponding Bij(ε)-degradation functions is different and sometimes advantageous. After the same matrix-evaluation procedure and the corresponding normalization, all the Bij(ε)-degradation functions for 0≤hbj≤2000 A/m, −hbj≤hai≤+hbj, are shown in Fig. 3. It is worth mentioning that all the Bij(ε)-degradation functions show the same type of dependence on plastic deformation.
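The integration of permeability along ha can be illustrated by a cumulative trapezoidal sum. The sketch below is only an assumption about how such a step might look: it treats one row of a µ-matrix (fixed hb) as a list of samples and sets the integration constant to zero at the first field value.

```python
def integrate_permeability(h_values, mu_values):
    """Cumulative trapezoidal integral of mu over h.

    Returns B-like values on the same grid, with B = 0 at h_values[0]
    (the integration constant is an assumption of this sketch).
    """
    b = [0.0]
    for k in range(1, len(h_values)):
        dh = h_values[k] - h_values[k - 1]
        b.append(b[-1] + 0.5 * (mu_values[k] + mu_values[k - 1]) * dh)
    return b


# A constant permeability of 2 integrates to a straight line with slope 2.
b_row = integrate_permeability([0.0, 1.0, 2.0, 3.0], [2.0, 2.0, 2.0, 2.0])
```

Applying the same function to every row of a µ-matrix would yield one row of the corresponding B-matrix.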
Fig. 3 Bij(ε)-degradation functions as functions of the plastic strain. [Plot not reproduced; axes: plastic strain [%] (horizontal), B-degradation functions (vertical).]
The sensitivity map for the Bij(ε)-degradation functions is presented in Fig. 4. In contrast to Fig. 1, here the white (and also black) areas indicate those regions where the magnitudes of the matrix elements vary rapidly, jumping from one element to the neighbouring one. For instance, elements corresponding to the steepest slopes in Fig. 3 are located in the black and white areas of Fig. 4. These descriptors are very sensitive, but their reliability is questionable, because large jumps in value can occur when moving from one element to its neighbour (even from a large positive value to a large negative one). For this reason, the choice of descriptors from the homogeneously gray areas seems to be the most reliable one. The optimal choice of the Bij(ε)-degradation functions is shown in Fig. 5.
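The compromise between sensitivity and reliability described above can be expressed, under simplifying assumptions, as a search for the most sensitive element whose neighbours have nearly the same sensitivity. The threshold `max_spread`, the use of the four nearest neighbours and the function names are choices made for this sketch only.

```python
def neighbour_spread(smap, i, j):
    """Largest absolute difference between element (i, j) and its
    existing 4-neighbours in the sensitivity map."""
    rows, cols = len(smap), len(smap[0])
    diffs = []
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            diffs.append(abs(smap[ni][nj] - smap[i][j]))
    return max(diffs) if diffs else 0.0


def pick_descriptor(smap, max_spread):
    """Indices of the most sensitive element whose neighbourhood is
    smooth enough, or None if no element qualifies."""
    best = None
    for i, row in enumerate(smap):
        for j, value in enumerate(row):
            if neighbour_spread(smap, i, j) > max_spread:
                continue  # jumps to neighbours are too large: unreliable
            if best is None or abs(value) > abs(smap[best[0]][best[1]]):
                best = (i, j)
    return best
```

With a loose threshold the function simply returns the element of largest magnitude; tightening the threshold pushes the choice toward the homogeneous regions of the map.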
Fig. 4 Map of relative sensitivity of the B-degradation functions, Bij(ε) ≡ B(hai,hbj)(ε). [Plot not reproduced; axes: ha [A/m] (horizontal), hb [A/m] (vertical).]
Fig. 5 The optimal choice of the Bij(ε)-degradation functions. [Plot not reproduced; axes: plastic strain [%] (horizontal), B-degradation functions (vertical); curve: ha100,hb1000.]
A third type of matrix, the µ’-matrix, can also be obtained. This is the matrix of the derivative of the permeability with respect to the field ha (the first derivative of permeability, ∂µ/∂ha, or equivalently the second derivative of magnetic induction, ∂²B/∂ha²). The sensitivity map of the µ’ij(ε)-degradation functions is shown in Fig. 6. The optimal µ’ij(ε)-degradation function (a compromise between sensitivity and reliability, as explained above) is shown in Fig. 7.
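One simple way to obtain such a derivative numerically is by finite differences along ha. The sketch below is an assumption about the numerical scheme (central differences inside the grid, one-sided differences at the ends); the source does not state which scheme was actually used.

```python
def derivative_along_field(h_values, mu_values):
    """Numerical d(mu)/dh on the given grid.

    Central differences at interior points, one-sided differences at
    the two end points (assumed scheme).
    """
    n = len(h_values)
    if n < 2:
        raise ValueError("need at least two grid points")
    derivative = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        derivative.append(
            (mu_values[hi] - mu_values[lo]) / (h_values[hi] - h_values[lo])
        )
    return derivative


# mu = h**2 on a unit grid: the interior values reproduce 2*h exactly.
d_row = derivative_along_field([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0])
```

Differentiation amplifies measurement noise, which is consistent with the larger scatter of the µ’-matrix elements noted in the discussion.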
Fig. 6 Map of relative sensitivity of the µ’ij(ε)-degradation functions. [Plot not reproduced; axes: ha [A/m] (horizontal), hb [A/m] (vertical).]
Fig. 7 Optimal choice of the µ’ij(ε)-degradation functions, as a function of plastic strain. [Plot not reproduced; axes: plastic strain [%] (horizontal), µ’-degradation functions (vertical); curve: ha450,hb950.]
3. DISCUSSION
As already mentioned, the originally paramagnetic austenite specimens became more and more ferromagnetic as a consequence of the applied cold rolling. All austenitic stainless steels are paramagnetic in the annealed, fully austenitic condition, and the only magnetic phase that can be induced (e.g. by cold rolling) in low-carbon austenitic stainless steels is the bcc α′-martensite, which is highly ferromagnetic. This process can be followed easily by magnetic measurements. By applying the adaptive testing method described above, the relatively small differences between the magnetic characteristics of the investigated samples can be determined much more sensitively than by conventional methods. By using the µij(ε)-degradation functions derived from the permeability matrices, the sensitivity of detecting the appearance of ferromagnetic α′-martensite can be increased by about a factor of 3 compared with the “classical” major-loop approach, if proper descriptors are used (for comparison see Fig. 2). The reliability of this determination is very good, as illustrated by the sensitivity map: a wide plateau is seen, where the sensitivity of the µij(ε)-degradation functions varies only very slightly if their field coordinates are mispositioned. In other words, the sensitivity (and also the reliability) of the measurement is not influenced by the exact choice of the matrix element.
The sensitivity with respect to the applied strain increases if the hysteresis-loop B-matrix parameters are considered instead of the permeability matrix. The sensitivity map shows the area where the most sensitive and/or most reliable descriptors are found. If the descriptors are chosen very carefully, a ratio of more than 1:7 can be obtained between the least and the most ferromagnetic of the investigated samples. The most sensitive area in the Bij-sensitivity map is not as large as in the case of the µij(ε)-degradation functions (there is no such wide plateau), but there is a well-defined area where the most sensitive descriptors are located. The scatter of the parameters is very low, as can be seen in Fig. 3. It is possible to increase the sensitivity further: a ratio of more than 1:100 can be reached, but in this case the exact choice of matrix elements becomes crucial. The shapes of the dependences of all the Bij-matrix elements on plastic deformation are very similar to each other. The descriptors evaluated from the hysteresis-loop B-matrices therefore seem to be especially suitable for very sensitive characterization of the changes introduced by cold rolling in the austenitic material. It is possible to increase the sensitivity even more substantially, and at the same time to avoid possible errors due to mispositioning, by choosing the first derivative of permeability (µ’-matrix) as the source of descriptors. In the case of µ’-matrices, the largest sensitivity can be obtained if the most reliable area is taken. For the presented series of samples it is 1:14 (see Fig. 7). On the other hand, the scatter of the µ’-matrix elements is the largest if all the elements are taken into account, but even in this case an area can be found from which sufficiently reliable elements can be taken.
4. CONCLUSIONS
This paper was dedicated to the indirect experimental measurement of variable material properties. In particular, it addressed the question of how to determine, from all available features of the chosen physical process employed to describe the investigated material variation (here the magnetization of the samples by an external field), the one that is best adapted to the particular material under inspection, to the particular variation/degradation of the material, and to the particular demands of the examiner. The introduced Adaptive Testing of materials proposes to collect data on the selected physical process while its parameters are systematically varied over ranges broad enough to give as complete a picture of the behavior as possible. Subsequent analysis of the recorded data leads to a large family of degradation functions, the most satisfactory of which is then defined as the optimally adapted calibration curve for later tests of unknown samples of the investigated material altered in the expected way. As shown in the presented experimental example, an optimum selection among the degradation functions has to consider not only the desired high sensitivity of the calibration curve, but also low experimental error of the measured values that constitute the curve and low curvature of the sensitivity surface (referred to as stability of the calibration curve) around the selected point of AT coordinates. The presented example demonstrates that AT, by focusing on the specific material and the specific degradation explored, involves a trade-off among sensitivity, stability, smoothness, shape, and experimental friendliness of the optimum calibration curves.
This focus usually leads to excellent AT results, which are substantially more advantageous for describing the explored material variations than the plain use of the traditional descriptors, which are focused on the employed physical process itself.
ACKNOWLEDGEMENTS
The financial support of the Hungarian Scientific Research Fund (T-035264 and T-062466) and of the Academy of Sciences of the Czech Republic (projects No. 1QS100100 and AVOZ 10100520) is appreciated.
REFERENCES
[1] JOHNSON, M. J.; LO, C. C. H.; ZHU, B.; CAO, H.; JILES, D. C. Nondestruct. Eval. 20 (2000) 11.
[2] JILES, D. C. Magnetic methods in nondestructive testing. K. H. J. Buschow et al., Ed., Encyclopedia of Materials Science and Technology, Oxford : Elsevier Press, 2001. p. 6021.
[3] JILES, D. C. NDT Int. 21 (1988) 311.
[4] DEVINE, M. K. Min. Met. Mater. (JOM) (1992) 24.
[5] TOMÁŠ, I. Magn. Magn. Mat. 268 (2004) 178.
[6] TOMÁŠ, I.; PEREVERTOV, O. JSAEM Studies in Applied Electromagnetics and Mechanics. T. Takagi and M. Ueasaka (Eds.), 9 (2001) 533.
Address: Dr. Gábor Vértesy, Dr.Sc. Research Institute for Technical Physics and Materials Science Hungarian Academy of Sciences, H-1525 Budapest, P.O.B. 49, Hungary phone:+3613922677 fax: +3613922226 e-mail: [email protected]
Address: RNDr. Ivan Tomáš, CSc. Institute of Physics, Academy of Sciences of the Czech Republic Na Slovance 2, 18221 Praha, Czech Republic phone: +420266052177 fax: +420286890527 e-mail: [email protected]
Address: Dr. István Mészáros Department of Materials Science and Engineering, Budapest University of Technology and Economics, H-1111 Budapest, Goldmann ter 3, Hungary phone: +3614632883 fax: +3614633250 e-mail: [email protected]
USE OF THE COMPACT GENETIC ALGORITHM FOR OPTIMIZING A PRODUCTION PROCESS WITH RESPECT TO PROFIT MAXIMIZATION
Jiří Kostiha
VUT Brno
Abstract: The article describes the compact genetic algorithm, its specific features and its differences from the classical genetic algorithm. It includes a demonstration of its efficiency on a standard test function and an application of the algorithm to a concrete economic optimization problem, namely maximizing the profit of a production process.
Keywords: compact, genetic, algorithm, evolutionary, optimization, multidimensional, problem, solution, maximization, profit, production, process
Introduction
Evolutionary algorithms (EA) are based on a technical interpretation of biological processes. The basic idea, described by Darwin, is the survival of the strongest individuals. During their lives individuals cross with one another and mutate, which produces new individuals that subsequently enter the process of selection and further crossover. The individuals best equipped for life have the greatest chance of crossover, whereas weaker individuals have only a small chance. This makes it very likely that the genetic lines of strong individuals continue, while the lines of weaker individuals die out. Based on this principle, the common features of EA can be identified:
• They are inspired by natural evolutionary processes.
• They work with a population of individuals.
• They approach the solution iteratively. One iteration cycle corresponds to one population.
• They are stochastic and heuristic in character.
• They provide a solution that need not be optimal but is suitable, and they provide it in a short time.
Genetic algorithms (GA) are so far the most successful of all evolutionary algorithms. Because of their efficiency and very good results they are applied in many fields of human activity. They are used, for example, for scheduling machine work in factories, in game theory, in management economics, for optimization of multimodal functions, in robot control, in recognition systems and in artificial life tasks. Genetic algorithms are also applied to so-called NP-complete problems, where almost all other algorithms fail, i.e. where the computation time depends exponentially or factorially on the number of variables. Many modifications of EA have appeared since their introduction, and new ones are still being developed. The compact genetic algorithm (CGA) is a special case. It belongs to the GA group but is built on an entirely different basis. The main differences lie in the technique of generating the population and in the recombination operators.
Evolutionary algorithms
  Genetic algorithms
    Classical genetic algorithm
    Compact genetic algorithm
Figure 1: Hierarchy of evolutionary algorithms
Comparison of CGA and the classical GA
The principle of the classical GA and of CGA, as of all EA, is to solve problems that can be described by a so-called fitness function and to search for its global extremum. Depending on the type of problem this may be minimization or maximization. The fitness function consists of the function itself and of constraints. Both the function and the constraints are specific to the given problem and must be formulated according to its specification.
The basis of GA are chromosomes, which in the mathematical interpretation correspond to individuals, sometimes also called agents. Individuals form a population, which changes with successive generations. The population represents information about those parts of the search space that have been sampled so far in the computation. The standard operators of crossover, mutation and selection then determine how this information is to be used in generating further potentially promising solutions. These operators form the basis of the classical GA.
CGA uses a probabilistic description of the current state of the computation (a model describing the current population). The crossover, mutation and selection operators are replaced here by sampling of the search space based on the given probabilistic model. Its basis is the probability chromosome (PCh), which is written as a single n-dimensional real vector. The algorithm does not work with a real population but with a vector from which a real chromosome (individual) is generated at computation time. Each position of the vector expresses the probability that the value 1 occurs at the given position of the chromosome. The initial state is given by the value 0.5 at all positions of the PCh. Figure 2 shows a PCh for a simple four-bit problem with two unknown parameters represented by genes 1 and 2. The right-hand part shows the most probable binary value of the generated individuals. Black numbers represent settled values that can no longer change. Gray numbers are unsettled and may still change during the computation.

                                 Gene 1               Gene 2
Probability chromosome (PCh)     0    0.8  0.6  1     0.3  0.5  0.2  1
Binary representation of PCh     0    1    1    1     0    1    0    1
Figure 2: Probability chromosome
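Generating individuals from a PCh can be sketched as follows, using the values of Figure 2. The function names are hypothetical, and resolving a probability of exactly 0.5 to the bit 1 in `most_probable` is an assumption chosen so that the result matches the binary row of the figure.

```python
import random


def sample_individual(pch, rng):
    """Draw one binary chromosome; bit k is 1 with probability pch[k]."""
    return [1 if rng.random() < p else 0 for p in pch]


def most_probable(pch):
    """Most likely bit pattern (a tie at 0.5 is resolved to 1 here)."""
    return [1 if p >= 0.5 else 0 for p in pch]


# PCh values of Figure 2: gene 1 followed by gene 2.
pch = [0.0, 0.8, 0.6, 1.0, 0.3, 0.5, 0.2, 1.0]
population = [sample_individual(pch, random.Random(seed)) for seed in range(10)]
```

Positions holding exactly 0 or 1 always produce the same bit, which is why such positions are described as settled.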
The computation proceeds by modifying the PCh, according to the generated population, in some suitable way so that the values of the generated individuals approach the global extremum. The computation procedures of the classical GA and of CGA are shown in Figure 3.
Figure 3: Comparison of a) the classical genetic algorithm and b) the compact genetic algorithm. [Diagram not reproduced; a) population, crossover, mutation, selection; b) probability chromosome (PCh), population, modification of PCh.]
The PCh is modified according to certain rules, of which there can be many. The rule used for the results presented here is as follows:
• select the best and the worst individual of the population
• if the bits of the two individuals differ and the better individual has 1, add 0.01 (1%) to the PCh value at that position
• if the bits of the two individuals differ and the better individual has 0, subtract 0.01 (1%) from the PCh value at that position
• if the bits have the same value, do nothing
Elitism
Elitism is a technique for remembering the best individual, or several best individuals, of the previous population and automatically including them in the next population. This guarantees the survival of the best individuals. In many cases this improves the efficiency of the algorithm. When applied in CGA it makes sense, as follows from the principle of the algorithm, to carry over only the single best individual. In the next generation all newly created individuals, i.e. the whole new population, are compared with the elite individual of the previous generation. The new best individual is selected and marked as elite. The PCh bits are modified according to the best and the worst individual, the best individual here being the elite individual.
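The update rule and the elitist selection described above can be sketched as follows. This is an illustration under assumptions, not the author's code: the task is taken to be a maximization, the worst individual is taken from the current population only, and the final clamping of probabilities to the interval from 0 to 1 is added here to keep the values valid.

```python
def select_best_worst(population, fitness, elite=None):
    """Best individual (including the previous elite, if any) and the
    worst individual of the current population, for maximization."""
    candidates = list(population)
    if elite is not None:
        candidates.append(elite)
    best = max(candidates, key=fitness)
    worst = min(population, key=fitness)
    return best, worst


def update_pch(pch, best, worst, step=0.01):
    """Shift each differing position of the PCh toward the better individual."""
    new_pch = []
    for p, b, w in zip(pch, best, worst):
        if b != w:
            p = p + step if b == 1 else p - step
        new_pch.append(min(1.0, max(0.0, p)))  # keep a valid probability
    return new_pch
```

In a full loop, the `best` returned in one generation would be passed as `elite` to the next call, so that the best individual found so far always takes part in the comparison.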
Stabilization bounds
The stabilization condition (see Termination conditions) can lead to premature settling in a local extremum, or at some value determined by premature settling of one or more bits. In such cases incorrect values may be obtained not only when the algorithm gets stuck in a local extremum, but also when it gets stuck "somewhere" on the function because of a settled bit that can no longer change its value. I have therefore extended the algorithm with stabilization bounds, shown in Figure 4. Interval b is the modification region within which the PCh values can move. The intervals a at the edges, in contrast, are forbidden. Stabilization then occurs at these bounds. This keeps a nonzero probability of generating individuals with the opposite bit value, and so the bit can still flip and settle at the opposite value. Possible premature, undesirable stabilization of the PCh bits is thus prevented.
Figure 4: Stabilization bounds. [Diagram not reproduced; the interval from 0 to 1 is divided into a forbidden interval a at each end and the modification region b between them.]
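In code, the stabilization bounds amount to clamping every PCh value to the modification region b. The sketch below assumes that both forbidden intervals a have the same width, called `margin` here; the default of 0.02 is only an example value of the order of a few percent.

```python
def clamp_to_bounds(pch, margin=0.02):
    """Keep every PCh value inside the allowed interval [margin, 1 - margin]."""
    return [min(1.0 - margin, max(margin, p)) for p in pch]


# Values that would settle at 0 or 1 are held at the bounds instead.
bounded = clamp_to_bounds([0.0, 0.5, 1.0])
```

Because no value can reach exactly 0 or 1, every bit keeps a small chance of being generated with the opposite value.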
Termination conditions
There can be many termination conditions. The classical GA has two basic ones: termination after a given number of generations and termination on reaching a given accuracy. The second condition applies mainly to test functions, where the value of the sought extremum is known. CGA adds one more important condition, namely termination when the PCh values stabilize at 0 and 1. This condition is interesting in that the algorithm, as it were, recognizes by itself that it has reached an extremum and can no longer generate a different value. The computed value need not be exactly the global extremum, which depends on the complexity of the problem, but it is a suitable value that can be regarded as a very good solution of the problem. On stabilization it is desirable to terminate the algorithm, because no better value can be found any more: all generated individuals are identical and correspond to the stabilized PCh. With stabilization bounds this no longer holds, but the probability of finding a better solution is very small and continuing the algorithm is inefficient.
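The generation limit and the stabilization condition can be combined in a small check such as the one below. The function names and the numerical tolerance are assumptions of this sketch; `margin` is the width of the forbidden interval when stabilization bounds are used and 0 otherwise.

```python
def is_stabilized(pch, margin=0.0, tol=1e-9):
    """True when every PCh value sits on a stabilization boundary,
    i.e. at margin or at 1 - margin (0 or 1 when margin is 0)."""
    return all(p <= margin + tol or p >= 1.0 - margin - tol for p in pch)


def should_stop(generation, max_generations, pch, margin=0.0):
    """Stop on the generation limit or on stabilization of the PCh."""
    return generation >= max_generations or is_stabilized(pch, margin)
```

The accuracy-based condition is left out here because it requires the value of the extremum to be known in advance, which holds only for test functions.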
Test problem
To test CGA I used the standard Rastrigin test function, shown in Figure 6. An advantage of this function is that it is written in the same way for any number of unknown parameters, i.e. it can be tested in any dimension without changing its mathematical formulation. Another useful property is that the global extremum lies at zero on all axes in all dimensions. Parameters:
• interval −5.12 to +5.12 on each axis
• 16-bit encoding of genes
• maximum achievable precision 0.00015625 for the encoding and interval size used
• 5n – 1 local extrema, n = number of dimensions
• the resulting numbers of generations needed to find the global minimum are arithmetic means of 10 runs
• computing hardware: a computer with an AMD Athlon 1600+ processor
Figure 5 shows the dependence of the number of generations on the population size. The thin upper line shows the run with the largest number of generations needed to find the minimum, and the thin lower line the run with the smallest number. The thick line marks the arithmetic mean of the 10 runs. The population size on the x axis increases in steps of 10. One run takes approximately one second.
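The test setup can be illustrated by decoding a 16-bit gene to the interval and evaluating the function. The text does not write the function out, so the commonly used form of the Rastrigin function is assumed here; the decoding with a step of 10.24 / 65536 is likewise an assumption, chosen because it reproduces the stated precision of 0.00015625.

```python
import math

BITS = 16
LOW, HIGH = -5.12, 5.12
STEP = (HIGH - LOW) / 2 ** BITS  # 10.24 / 65536 = 0.00015625


def decode_gene(bits):
    """Map a list of 16 bits (most significant first) to a real value."""
    k = int("".join(str(b) for b in bits), 2)
    return LOW + k * STEP


def rastrigin(xs):
    """Commonly used Rastrigin form; global minimum 0 at xs = (0, ..., 0)."""
    return 10.0 * len(xs) + sum(
        x * x - 10.0 * math.cos(2.0 * math.pi * x) for x in xs
    )


# The gene 1000000000000000 decodes to the position of the global minimum.
x_zero = decode_gene([1] + [0] * 15)
value = rastrigin([x_zero, x_zero])
```

With this decoding the value 0 is exactly representable, so the global minimum can be reached by the binary encoding without rounding error.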
Figure 5: Number of generations needed to find the global minimum of the Rastrigin function as a function of the number of individuals in the population, for two unknown parameters
Effect of elitism
For the test problem with two and three unknown parameters, elitism has a visible positive effect for smaller populations, roughly up to 40 individuals. With increasing dimension of the problem, i.e. with a growing number of unknown parameters, elitism turns out to be unsuitable for improving the efficiency of the algorithm: the average number of generations needed to find the global solution is larger. This is caused by frequent convergence to a local extremum and getting stuck in it.
Effect of stabilization bounds
For this problem, the use of stabilization bounds does not noticeably improve the average number of generations needed to find the global minimum. It does, however, make reaching a better solution more reliable and reduces how often the algorithm gets stuck. A small number on the order of a few percent proves to be a suitable value of the stabilization bound.
Figure 6: Standard Rastrigin test function for two unknown parameters: a) section by the xz plane at y = 0, b) view of the xy plane from above, c), d) 3D views
Example: maximizing the profit of a production process with CGA
Problem statement
A company makes five kinds of products (A, B, C, D, E) from three kinds of material (S1, S2, S3). The material is available for the planning period in limited quantities: 1500 kg of S1, 300 kg of S2 and 450 kg of S3. The production processes of the individual products run independently. No other constraints apply. The material consumption per piece of each product (kg per piece) and the wholesale prices of the products are as follows:

Product                             A      B      C      D      E
Consumption of S1 (kg per piece)    0      0.4    0.3    0.6    0.6
Consumption of S2 (kg per piece)    0.05   0.2    0.1    0.1    0
Consumption of S3 (kg per piece)    0.1    0.2    0.2    0.1    0.2
Wholesale price (CZK per piece)     20     120    100    140    40
Table 1: Problem data

Task: Calculate how many pieces of each product should be made under the given constraints so that the maximum revenue is achieved in the planning period.
Formulation of the economic-mathematical model
Profit maximization function:

$$z_{\max}(x_1, x_2, x_3, x_4, x_5) = 20x_1 + 120x_2 + 100x_3 + 140x_4 + 40x_5$$

Constraints:

$$S_1:\quad 0x_1 + 0.4x_2 + 0.3x_3 + 0.6x_4 + 0.6x_5 \le 1500$$
$$S_2:\quad 0.05x_1 + 0.2x_2 + 0.1x_3 + 0.1x_4 + 0x_5 \le 300$$
$$S_3:\quad 0.1x_1 + 0.2x_2 + 0.2x_3 + 0.1x_4 + 0.2x_5 \le 450$$
Solution
I used the 16-bit gene encoding from the test problem, whose range of values (0 to 65535) is more than sufficient for the individual genes. The function zmax is the fitness function, to which the constraints are added with penalties for undesirable solutions. In ten runs the algorithm marked the values in Table 2 as optimal and settled on them. Ninety percent of the values were marked within the first 1000 generations and within 10 seconds on a computer with an AMD Athlon 1600+ processor. The average value of the computed solutions is 379674 CZK. The optimal solution corresponds to 380000 CZK.

Run number    Marked maximum profit [CZK]
1             379900
2             379400
3             379680
4             379560
5             379960
6             380000
7             379900
8             379880
9             378820
10            379640
Table 2: Values computed by CGA
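A penalized fitness function for this model might look like the sketch below. The text does not specify the form or size of the penalties, so the linear penalty and the constant `PENALTY` are assumptions; the coefficients are those of the model above.

```python
PRICES = [20, 120, 100, 140, 40]      # CZK per piece, products A..E
USAGE = [
    [0.0, 0.4, 0.3, 0.6, 0.6],        # S1, kg per piece
    [0.05, 0.2, 0.1, 0.1, 0.0],       # S2, kg per piece
    [0.1, 0.2, 0.2, 0.1, 0.2],        # S3, kg per piece
]
LIMITS = [1500.0, 300.0, 450.0]       # kg available
PENALTY = 1000.0                      # assumed cost per kg of overuse


def revenue(x):
    """Value of the objective z for a production plan x."""
    return sum(price * pieces for price, pieces in zip(PRICES, x))


def overuse(x):
    """Kilograms by which each material limit is exceeded (0 if satisfied)."""
    return [
        max(0.0, sum(a * q for a, q in zip(row, x)) - limit)
        for row, limit in zip(USAGE, LIMITS)
    ]


def fitness(x):
    """Revenue reduced by a linear penalty for violated constraints."""
    return revenue(x) - PENALTY * sum(overuse(x))


plan = [0, 0, 1000, 2000, 0]          # pieces of A, B, C, D, E
```

The plan of 1000 pieces of C and 2000 pieces of D uses exactly 1500 kg of S1 and 300 kg of S2, needs 400 kg of S3, and yields 380000 CZK, which agrees with the optimum quoted above. A plan that ignores the limits, such as 3000 pieces of D, has a higher revenue but a lower penalized fitness.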
Conclusion
CGA, like other EA, provides in a relatively very short time a solution that is optimal or close to optimal. The algorithm is suitable where other algorithms fail or cannot deliver a solution in a sufficiently short time, and where it is not strictly necessary to know the best solution and knowledge of one of the very good solutions suffices. Economic problems have this character, so the algorithm is well suited to them.
Abbreviations and terms
CGA: Compact Genetic Algorithm, a special case of GA based on the PCh
EA: Evolutionary Algorithm, mathematical algorithms inspired by nature
Elite individual: the best individual over the whole computation, or over a part of it
Fitness: strength of an individual; expresses how suitable the solution represented by the individual is
Fitness function: function describing the given problem and specific to it; its global extremum, corresponding to the optimal solution of the problem, is sought
GA: Genetic Algorithm, a group of algorithms within EA
Gene: element of the parameter vector, the basic building unit of a chromosome; corresponds to an encoded variable
Generation: one step of the iteration cycle in the search for the optimal solution by GA
Chromosome: genetic information in the form of a string; consists of genes; in the mathematical interpretation corresponds to an individual
Individual (agent): carrier of genetic information; individuals form the population; in the mathematical interpretation corresponds to a chromosome
Crossover: construction of new individuals (offspring) from original individuals selected from the generation (parents)
Mutation: random change of bit values in a chromosome
PCh: Probability Chromosome, a special feature of CGA; the individuals of the population are constructed from the PCh, which is in turn modified according to given rules
Population: set of individuals in a given step of the iteration cycle, called a generation
Offspring: individual resulting from recombination of one or more parents
Recombination: process of generating a new individual; usually corresponds to crossover and mutation
Parent: individual entering recombination, which yields one or more new individuals
Selection: choice of individuals for the new population according to given rules
Address: Ing. Jiří Kostiha VUT Brno Kolejní 4 612 00 Brno tel.: +420 775 303 543 e-mail: [email protected]
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT” EPI Kunovice, Czech Republic, January 27, 2006
REVITALIZING COMPANY INFORMATION SYSTEMS AND COMPETITIVE ADVANTAGES Branislav Lacko
VUT v Brně
Abstract: The contribution deals with the innovation of computer-based information systems.
Key words: information system, revitalization, innovation of information systems, company growth
1 Introduction
At present there is hardly any company without a working information system. Nevertheless, our magazines and brochures still carry articles on information systems design, many of them arguing at length for the necessity of introducing information systems. The current problem, however, is not the absence of information systems in Czech companies but the quality of the existing ones. From this perspective it is appropriate to talk about the necessity of information systems innovation. The title of this paper uses the word "revitalizing" instead of "innovating" for the following reasons:
• The revitalization of enterprises, which many of our companies have been attempting, must necessarily include the revitalization of those enterprises' information systems as well.
• The word "revitalization" describes more precisely the goal of the change that the information systems of our enterprises should go through.
After being designed and implemented, each information system is handed over to its users in a certain condition and size and with certain characteristic features. From the moment of handover it can develop, stagnate or decline. By the development of an information system we understand the improvement of its qualitative and quantitative parameters in the course of its use. If those parameters remain unchanged during the system's use, stagnation occurs. If the parameters deteriorate, we speak of decline, i.e. a degradation of the information system.
The quantitative parameters describe characteristic values of the information system, especially of a technological, software, system and operational nature, e.g. the number of computers, the capacity of external memories, processor speed, etc. The qualitative parameters can be considered similarly, e.g. the way the database concept is used, the degree of task automation, and the like.
For each i-th parameter out of n parameters we can define a so-called increase index $r_{i,j}$ for the moment $t_j$ as follows. For each parameter we introduce its unit value $h_{i,0}$ at time t = 0, i.e. at the moment of its handover for operation, and set it equal to one. The increase index for a later moment is then the value of the parameter relative to this initial value. From the individual indices we can easily calculate the average increase index $R_j$ for a given moment $t_j$, and plot the development of the information system over time by means of this average index.
We call information system revitalization a situation in which the average development index changes by a decisive leap that results mainly from qualitative parameters, especially those oriented towards the benefits of the information system and directly supporting the decision-making processes in the enterprise.
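As a rough illustration of the index described above, the following sketch assumes that the increase index is the ratio of a parameter's value at the moment $t_j$ to its value at handover, and that the average index is the arithmetic mean over the n parameters. Both the ratio form and the arithmetic mean are assumptions, since the text does not spell the formulas out, and the parameter names and numbers are hypothetical.

```python
def increase_indices(initial, current):
    """Return r_{i,j} = h_{i,j} / h_{i,0} for each parameter (assumed ratio form)."""
    return {name: current[name] / initial[name] for name in initial}


def average_index(indices):
    """Return R_j as the arithmetic mean of the increase indices (assumed form)."""
    return sum(indices.values()) / len(indices)


# Hypothetical parameter values at handover (t = 0) and at a later moment t_j.
h0 = {"computers": 20, "disk_capacity_gb": 400, "automated_tasks": 10}
hj = {"computers": 30, "disk_capacity_gb": 800, "automated_tasks": 10}

r = increase_indices(h0, hj)
R_j = average_index(r)
print(r)
print(R_j)
```

An average index above one would indicate development, a value of one stagnation, and a value below one decline, under the same assumptions.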
2 Implication statement on the information system development
The causes of information system development must be sought in the requirements that the company staff place on the information system. Those requirements derive from the following facts:
• the company size increases
• experience in using the information system grows, so the demands on the information system increase as well
• the requirements for information increase as a consequence of the increasing information barrier in society
• the technological and programming means of computing technology develop
• new information technologies develop
• the theory and practice of information systems develop
• the demands for quality increase generally, etc.
The development of the enterprise, however, must be identified as the most important factor. By the development of the enterprise we understand a situation in which the size of the enterprise expands, the production volume grows and profits increase. We can make the following statement:
If an enterprise develops, the information system of the enterprise develops as well.
The enterprise development is the antecedent of this implication, and the information system development is its consequent. Let us comment on the individual combinations of the implication. Zero represents "false" and one represents "true" for the respective statement.
0⇒0 An enterprise that stagnates or even declines has no resources for information system development. It is probably managed incorrectly, so its management is probably not interested in developing the information system.
1⇒1 A developing enterprise must support its successful development by developing its information system, so as to meet the information demands of its employees and to provide relevant, top-quality information in support of decision-making processes.
0⇒1 This true combination of the implication says that even a badly performing enterprise can develop its information system. This may happen in any of the following situations:
• The enterprise has identified the poor functioning of its information system as an obstacle to its development and has decided to improve it to the necessary level. Unfortunately, this situation also covers the case of investing in the information system merely as a "miracle" cure that is supposed to prevent automatically any further stagnation or decline of the company.
• A stagnating enterprise makes use of the progress in information technologies and improves its system by various, mainly intensifying, methods as a consequence of general progress in this field.
• The environment of the enterprise (state administration bodies, legislation, competitors, client demands, public opinion, etc.) drives the enterprise to develop its information system.
1⇒0 This false combination signals a situation that may harm the prosperity and further development of the enterprise: the information system stagnates or declines and stops providing support to the management and employees of the company. It may become one of the factors that first cause problems and later may even cause the decline of the company.
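The four combinations discussed above are the rows of the truth table of material implication. A minimal sketch that enumerates them is given below; the variable names are illustrative only.

```python
def implies(antecedent, consequent):
    """Material implication: false only when the antecedent holds and the consequent does not."""
    return (not antecedent) or consequent


rows = []
for enterprise_develops in (False, True):
    for system_develops in (False, True):
        value = implies(enterprise_develops, system_develops)
        rows.append((enterprise_develops, system_develops, value))

for a, c, value in rows:
    print(f"{int(a)} => {int(c)} : {value}")
```

Only the combination 1⇒0 evaluates to false, which is the case the text singles out as harmful.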
3 A statement on the pace of information system development
The mere fact that the information system develops is not enough if we want it to support decision-making in the company really well. The following statement must also hold:
An information system must develop at a pace corresponding to the pace of the enterprise development.
Just as for the information system, we can determine an average index of company development R' from the company's parameters for certain moments of time $t_j$ (the results are best when the same moments are used for both indices).
The pace of development, i.e. the speed, can be expressed in both cases as the first derivative with respect to time of the average development index of the information system or of the enterprise, respectively. If an information system of suitable size was handed over for use and supports the management of the company to the necessary extent, the pace of information system development should be equal to or faster than the pace of the economic development of the company: $dR/dt \ge dR'/dt$.
If the pace is slower, the information system lags behind the company, and the resulting disproportion may again cause problems, and possibly even the stagnation and decline of the company if the situation is not resolved.
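The comparison of the two paces can be sketched numerically. The example below assumes that both average indices are sampled at the same equally spaced moments and approximates the first derivative by forward differences; the sampling scheme, the function names and the index values are all hypothetical.

```python
def pace(index_series, dt=1.0):
    """Forward finite differences approximating the first derivative of an index over time."""
    return [(b - a) / dt for a, b in zip(index_series, index_series[1:])]


def lagging_periods(system_index, company_index, dt=1.0):
    """Return the periods in which the information system develops more slowly than the company."""
    system_pace = pace(system_index, dt)
    company_pace = pace(company_index, dt)
    return [k for k, (p, q) in enumerate(zip(system_pace, company_pace)) if p < q]


# Hypothetical yearly values of R (information system) and R' (company).
R_system = [1.0, 1.2, 1.3, 1.3, 1.8]
R_company = [1.0, 1.1, 1.3, 1.6, 1.7]

lagging = lagging_periods(R_system, R_company)
print(lagging)
```

Periods reported by `lagging_periods` are those in which the disproportion described above starts to build up.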
4 The problem of revolutionary development of information systems
The previous considerations assumed gradual growth of the parameters of both the information system and the enterprise, i.e. so-called evolutionary development. In practice, revolutionary development of the information system, of the enterprise, or of both often occurs, for example the replacement of an outdated computer by a new and progressive one, a merger of two firms, or an increase in production through the opening of a newly built hall. Graphically, this shows as a leap in the respective growth curve (see Fig. 1).
Fig. 1 (plot of R against time t)
In his book [1], C. Gray explains why the growth of an enterprise is a necessary precondition for its successful existence. This does not mean that the number of employees has to increase permanently, but the company must keep developing ever better products, decreasing its costs, increasing its profits, etc. In today's dynamic market economy, company stagnation leads to collapse very quickly. In the present era of global computerization it is more important than ever to develop the company's information system as well.
Current computer producers meet this demand by designing the individual models of their standard computer lines so that together they cover a wide range of performance and external disk capacity. The user can thus extend his computer smoothly and gradually according to his growing requirements and financial means, which is very advantageous for him. At the start of the information system implementation he does not have to pay for computing or external memory capacity that he would not use. Moreover, he can schedule increases in capacity as his gradually earned financial resources allow. He also keeps working with the same operating system, the same database and communication environment, and the same standard and application software, thereby making good use of, and protecting, the resources previously invested in technical equipment and personnel training.
If a computer line is not conceived in this way, the individual modules increase their capacity with a certain delay. Fig. 2 shows a solution in which the user bought a new model with higher capacity immediately after the capacity of the old model was exhausted. For the whole of period B he pays unnecessarily for the added capacity before he can really use it. Fig. 3 shows a situation in which the user waits and does not cover his increased information requirements until they reach the capabilities of a new model.
Fig. 2 (plot of R against time t over periods A, B, C; curves labelled Demand and Satisfaction)
Fig. 3 (plot of R against time t over periods A, B, C)
There are losses in both cases, however: in the former because of paid but unexploited capacity, in the latter because of insufficient computer support for management and decision-making. Fig. 4 shows the course of the information system benefits in the individual periods A, B, C. For this reason, companies offering a limited number of computer models design the parameter ranges of the models to overlap, which makes it possible to switch to a more powerful model before the current one reaches the limits of its capacity. The ideal situation is one in which the user works with a supplier who is ready and willing to take the old model back in part exchange during the innovation. If the user has no such opportunity, the benefits decrease and losses arise from the value of the discarded computer.
Fig. 4 (plot of benefits in € against time t over periods A, B, C)
A much worse situation is that of a user who finds out that, at the start of the information system implementation, he chose a computer with no possibility of further capacity extension, and that the new computer on offer requires substantial investment in software modifications. A similar situation may arise when a communication network is replaced: this may involve redesigning the cabling to achieve higher transmission speeds, replacing still-functioning network cards with more powerful ones, and adapting the operating system, which now has to communicate with the new network software. All this delays the solution of the whole problem, and because of the overhaul and modifications the parameters may even deteriorate for a certain time. The growth curve may then run as shown in Fig. 5. Sections B and C are critical, as the users' requirements are not covered there. In addition, technical complications requiring extra costs often occur in section C. The cost curve therefore often looks like the one in Fig. 6. An irregular, sudden increase in costs is very unfavourable for company finances, as any financial expert will confirm. Everybody therefore tries to avoid such a situation; users should design their information system implementation concepts so that similar situations do not happen.
Fig. 5 (plot of IS capacity against time t over sections A, B, C, D)
Fig. 6 (plot of costs in € over sections A, B, C, D)
"Leap" changes in information system development may of course also occur for other than technological reasons (the financial situation of the company, changes in the company's information policy, etc.). Revitalization always represents a leap change in efficiency. The curves discussed above must therefore be considered very carefully when it is implemented.
5 Discussion on the information system revitalization goals
It is common in the information technology field that modern information technologies and their application are the motor of innovation in information systems. Nevertheless, it must be stated that the main driving force of information system revitalization should be the effort to improve the quality of the information system with regard to supporting company management and improving the efficiency of company processes, so that the competitiveness of the company can grow. The implementation of information technologies in itself can no longer be considered a competitive advantage. In reality, the inability to use modern information technologies effectively and to the maximum extent turns out to be a retarding factor in company development and impairs the company's competitiveness. The goals of information system revitalization must be conceived in relation to the overall improvement of the effectiveness and efficiency of company processes (Business Process Reengineering). Information, especially economic information, is essential for proper company management [8, 9]. The parallel processing of economic data, captured by modern controlling, and the use of the information obtained from those data are tied to the possibility of good company development [7].
6 Conclusion
The finding quoted at the end of the preceding section should be understood as an urgent emphasis on the importance of choosing the right information system revitalization strategy. The absence of an information system strategy causes Czech companies a lot of trouble. It is, after all, one of the most frequent reasons why the benefits of most existing information systems are very modest. This paper points out some important facts:
1. When a company is revitalized, the revitalization of its information system must be addressed as well.
2. The goals of information system revitalization must directly support the revitalization of the company.
3. A properly chosen strategy for implementing the information system revitalization, coherent with the planned development of the company, enables the dynamics of information system development to be set correctly.
Literature:
[1] GRAY, C. Růst podniku. Publikace edice Business Guide pro malé a střední podnikatele, Praha : Readers International Prague, 1993.
[2] STRASSMAN, P. A. Stages of Growth. Datamation, October 1976, pp. 46–50.
[3] LACKO, B. Analýza zkušeností z inovace počítačového systému v k. p. TOS KUŘIM. Kuřim : Interní publikace TOS KUŘIM, 1982.
[4] LACKO, B. Restrukturalizace báze dat. Sborník referátů semináře DATASEM 95, CS COMPEX Brno : 1995.
[5] LACKO, B. Vývojové trendy v informačních a řídicích systémech. Abstrakt habilitační přednášky, Brno : VUT FS, 1994, 16 pp.
[6] LACKO, B. Analýza dynamika rozvoje informačního systému. Sborník mezinárodní konference Systémová integrace 95, Praha : VŠE KIT, 1995, pp. 133–144.
[7] FEDOROVÁ, A. Some connections between the accountant management and economic development in Czech Republic. In: International Congress, Business and Economic Development in Central and Eastern Europe, Brno : TU of Brno, 2001.
[8] TVRDÍKOVÁ, M. Zavádění a inovace informačních systémů. Praha : Grada, 2000.
[9] MERUNKA, V.; POLÁK, J.; CARDA, A. Umění systémového návrhu. Praha : Grada, 2003.
Address: Doc. Ing. Branislav Lacko, CSc. VUT v Brně Technická 2 CZ-616 69 Brno e-mail: lacko @ fme.vutbr.cz
MODEL LEARNING AND INFERENCE THROUGH ANFIS Amal Al Khatib
Brno University of Technology
Abstract: This paper discusses the map and architecture of a learning procedure called ANFIS (adaptive-network-based fuzzy inference system). ANFIS is a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and input-output data pairs.
1. INTRODUCTION
Intelligent systems have appeared in many technical areas, such as consumer electronics, robotics and industrial control systems. Many of these intelligent systems are based on fuzzy control strategies, which describe the mathematical model of a complex system in terms of linguistic rules. Fuzzy set theory derives from the fact that most natural classes and concepts are fuzzy rather than crisp in nature. Even so, people can approximate well enough to perform many desired tasks: they summarize massive information inputs and still function effectively. Fuzzy logic is well suited to complex systems because it tolerates some imprecision.
2. ANFIS
2.1. Model Learning and Inference Through ANFIS
Assume that we already have a collection of input/output data and would like to build a fuzzy inference system that approximates the data. This system would consist of a number of membership functions and rules with adjustable parameters, similar to those of neural networks. Rather than choosing the parameters associated with a given membership function arbitrarily, these parameters can be chosen so as to tailor the membership functions to the input/output data and thereby account for the variations in the data values.
2.2. What Is ANFIS?
ANFIS (adaptive-network-based fuzzy inference system) is an adaptive network that is very similar to a neural network. Using a given input/output data set, ANFIS constructs a fuzzy inference system (FIS) whose membership function parameters are adjusted using either a backpropagation algorithm alone or backpropagation in combination with a least squares method. This allows fuzzy systems to learn from the data they are modeling.
2.3. ANFIS Objective
The purpose of ANFIS is to integrate the best features of fuzzy systems and neural networks. From fuzzy systems it takes the representation of prior knowledge as a set of constraints that reduce the optimization search space. From neural networks it takes the adaptation of backpropagation to a structured network in order to automate the parametric tuning.
2.4. ANFIS Architecture
For simplicity, I will assume that the fuzzy inference system under consideration has two inputs x and y and one output z. Suppose that the rule base contains two fuzzy if-then rules of Takagi and Sugeno's type:
Rule 1: If x is A1 and y is B1, then f1 = p1 x + q1 y + r1
Rule 2: If x is A2 and y is B2, then f2 = p2 x + q2 y + r2
Then the fuzzy reasoning is illustrated in Fig. 2.4.1.(a), and the corresponding equivalent ANFIS architecture is shown in Fig. 2.4.1.(b).
Fig. 2.4.1.(a)
Fig. 2.4.1.(b)
The node functions in the same layer are of the same function family, as described below.
Layer 1: Every node i in this layer is a square node with the node function
$$O_i^1 = \mu_{A_i}(x)$$
where x is the input to node i and $A_i$ is the linguistic label associated with this node function. In other words, $O_i^1$ is the membership function of $A_i$, and it specifies the degree to which the given x satisfies the quantifier $A_i$. Usually $\mu_{A_i}(x)$ is chosen to be bell-shaped, such as
$$\mu_{A_i}(x) = \frac{1}{1 + \left[\left(\frac{x - c_i}{a_i}\right)^2\right]^{b_i}}$$
where $\{a_i, b_i, c_i\}$ is the parameter set. As the values of these parameters change, the bell-shaped function varies accordingly, thus exhibiting various forms of membership functions for the linguistic label $A_i$. Parameters in this layer are referred to as premise parameters.
Layer 2: Every node in this layer is a circle node labeled Π which applies the logic operation that the user chooses (AND, OR), for example
$$w_i = \mu_{A_i}(x) \ \text{AND} \ \mu_{B_i}(y)$$
Each node output represents the firing strength of a rule.
Layer 3: Every node in this layer is a circle node labeled N. The i-th node calculates the ratio of the i-th rule's firing strength to the sum of all rules' firing strengths:
$$\bar{w}_i = \frac{w_i}{w_1 + w_2}$$
Layer 4: Every node i in this layer is a square node with the node function
$$O_i^4 = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$$
where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set. Parameters in this layer are referred to as consequent parameters.
Layer 5: The single node in this layer is a circle node labeled Σ that computes the overall output as the summation of all incoming signals, i.e.
$$O_1^5 = \text{overall output} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
Fig. 2.4.2 shows a two-input ANFIS with two rules. Two membership functions are associated with each input, so the input space is partitioned into four fuzzy subspaces, each of which is governed by a fuzzy if-then rule. The premise part of a rule delineates a fuzzy subspace, while the consequent part specifies the output within this fuzzy subspace.
Fig. 2.4.2. ANFIS map (inputs x and y; Layer 1: nodes A1, A2, B1, B2 with premise parameters; Layer 2: Π nodes with outputs w1, w2; Layer 3: N nodes with normalized outputs; Layer 4: nodes with consequent parameters and outputs w̄1 f1, w̄2 f2; Layer 5: Σ node)
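The five layers can be traced in a short forward-pass sketch for the two-rule, two-input case. It is only an illustration under stated assumptions: the AND operation of layer 2 is realized as a product (one common choice, not prescribed by the text), and all parameter values and function names are hypothetical.

```python
def bell(x, a, b, c):
    """Bell-shaped membership function 1 / (1 + (((x - c) / a)^2)^b)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-rule, two-input Sugeno-type ANFIS.

    premise: {"A": [(a, b, c), (a, b, c)], "B": [(a, b, c), (a, b, c)]}
    consequent: [(p, q, r), (p, q, r)]
    """
    # Layer 1: membership degrees of x in A1, A2 and of y in B1, B2.
    mu_a = [bell(x, *params) for params in premise["A"]]
    mu_b = [bell(y, *params) for params in premise["B"]]
    # Layer 2: firing strength of each rule (AND taken as product, an assumption).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule outputs w_bar_i * (p_i x + q_i y + r_i).
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output as the sum of all incoming signals.
    return sum(o4), w_bar


# Hypothetical premise and consequent parameters.
premise = {"A": [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)],
           "B": [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 1.0)]

output, w_bar = anfis_forward(1.0, 1.0, premise, consequent)
print(output, w_bar)
```

At the point x = y = 1 both rules fire equally in this configuration, so the output is the plain average of the two rule outputs.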
2.5. Hybrid Learning Algorithm
From the ANFIS architecture (Fig. 2.4.2) it can be observed that, given the values of the premise parameters, the overall output can be expressed as a linear combination of the consequent parameters. The output f in layer 5 can be written as
$$f = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = \bar{w}_1 f_1 + \bar{w}_2 f_2 = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2$$
which is linear in the consequent parameters $(p_1, q_1, r_1, p_2, q_2, r_2)$. As a result we have:
$S$ = set of total parameters
$S_1$ = set of premise parameters
$S_2$ = set of consequent parameters
After the initial parameters have been found by generating the fuzzy inference system (FIS), we can directly apply the hybrid learning rule, which is discussed in the following sections. More specifically, in the forward pass of the hybrid learning algorithm, the functional signals go forward as far as layer 4 and the consequent parameters are identified by the least squares estimate. In the backward pass, the error rates propagate backward and the premise parameters are updated by gradient descent. Table 2.5.1 summarizes the activities in each pass.
                              forward pass      backward pass
MF param. (premise)           fixed             back propagation
Rule param. (consequent)      least squares     fixed
Table 2.5.1. Two passes in the hybrid learning procedure
The consequent parameters thus identified are optimal (in the consequent parameter space) under the condition that the premise parameters are fixed. Accordingly, the hybrid approach is much faster than strict gradient descent, and it is worthwhile to look for the possibility of decomposing the parameter set. It should be noted, however, that the computational complexity of the least squares estimate is higher than that of gradient descent. In fact, there are four methods for updating the parameters, listed below according to their computational complexity:
1. Gradient descent only: all parameters are updated by gradient descent.
2. Gradient descent and one pass of least squares estimate (LSE): the LSE is applied only once at the very beginning to obtain the initial values of the consequent parameters, and then gradient descent takes over to update all parameters.
3. Gradient descent and LSE: this is the proposed hybrid learning rule.
4. Sequential (approximate) LSE only: the ANFIS is linearized with respect to the premise parameters.
The choice among these methods should be based on the trade-off between computational complexity and resulting performance.
3.9. Hybrid Learning Rule (Forward pass)
Although we can apply the gradient method to identify the parameters in an adaptive network, the method is generally slow and likely to become trapped in local minima. For this reason a hybrid learning rule is proposed, which combines the gradient descent method and the least squares estimate (LSE) to identify the parameters. For simplicity, it is assumed that the adaptive network under consideration has only one output
$$\text{output} = F(I, S)$$
where I is the set of input variables and S is the set of parameters. If there exists a function H such that the composite function $H \circ F$ is linear in some of the elements of S, then these elements can be identified by the least squares method. More formally, if the parameter set S can be decomposed into two sets
$$S = S_1 \oplus S_2$$
(where ⊕ represents the direct sum) such that $H \circ F$ is linear in the elements of $S_2$, then upon applying H to the output equation we have
$$H(\text{output}) = H \circ F(I, S)$$
which is linear in the elements of $S_2$. Now, given values of the elements of $S_1$, we can plug P training data pairs into the previous equation and obtain the matrix equation
$$AX = B$$
where X is an unknown vector whose elements are the parameters in $S_2$. Let $|S_2| = M$; then the dimensions of A, X and B are P × M, M × 1 and P × 1, respectively. Since P (the number of training data pairs) is usually greater than M (the number of linear parameters), this is an overdetermined problem and generally there is no exact solution to $AX = B$. Instead, a least squares estimate (LSE) of X, denoted $X^*$, is sought to minimize the squared error $\|AX - B\|^2$.
This is a standard problem that forms the basis of linear regression, adaptive filtering and signal processing. The best-known formula for $X^*$ uses the pseudo-inverse of A:
$$X^* = (A^T A)^{-1} A^T B$$
where $A^T$ is the transpose of A, and $(A^T A)^{-1} A^T$ is the pseudo-inverse of A if $A^T A$ is non-singular. Sequential formulas are used to compute the LSE of X. This sequential method is more efficient (especially when M is small) and can easily be modified into an on-line version for systems with changing characteristics. Specifically, let the i-th row vector of matrix A be $a_i^T$ and the i-th element of B be $b_i^T$; then X can be calculated iteratively using the sequential formulas
$$X_{i+1} = X_i + S_{i+1} a_{i+1} \left(b_{i+1}^T - a_{i+1}^T X_i\right)$$
$$S_{i+1} = S_i - \frac{S_i a_{i+1} a_{i+1}^T S_i}{1 + a_{i+1}^T S_i a_{i+1}}, \quad i = 0, 1, \ldots, P - 1$$
where $S_i$ is often called the covariance matrix and the least squares estimate $X^*$ is equal to $X_P$. The initial conditions are $X_0 = 0$ and $S_0 = \gamma I$, where γ is a large positive number and I is the identity matrix of dimension M × M.
Now we can combine the gradient method and the least squares estimate to update the parameters in an ANFIS structure. Each epoch of this hybrid learning procedure is composed of a forward pass and a backward pass. In the forward pass, we supply input data and the functional signals go forward to calculate each node output until the matrices A and B are obtained, and the parameters in $S_2$ are identified by the sequential least squares formulas. After the parameters in $S_2$ have been identified, the functional signals keep going forward until the error measure is calculated. In the backward pass, the error rates (the derivatives of the error measure with respect to each node output) propagate from the output end toward the input end, and the parameters in $S_1$ are updated by the gradient method (as explained in the following section). For given fixed values of the parameters in $S_1$, the parameters in $S_2$ thus found are guaranteed to be the global optimum point in the $S_2$ parameter space, owing to the choice of the squared error measure. This hybrid learning rule not only decreases the dimension of the search space in the gradient method but in general also cuts the convergence time substantially.
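The sequential least squares formulas above can be sketched in plain Python for a small number of linear parameters. This is an illustration only: the function name, the value of γ and the training data (generated from a hypothetical linear rule output 2x − y + 0.5, with rows of the form [x, y, 1]) are assumptions made for the example.

```python
def sequential_lse(rows, targets, gamma=1e6):
    """Sequential least squares estimate of X in AX = B, one row of A at a time."""
    m = len(rows[0])
    x = [0.0] * m  # X_0 = 0
    # S_0 = gamma * I, with gamma a large positive number.
    s = [[gamma if i == j else 0.0 for j in range(m)] for i in range(m)]
    for a, b in zip(rows, targets):
        s_a = [sum(s[i][k] * a[k] for k in range(m)) for i in range(m)]  # S_i a
        denom = 1.0 + sum(a[i] * s_a[i] for i in range(m))               # 1 + a^T S_i a
        # S_{i+1} = S_i - (S_i a a^T S_i) / denom; S_i is symmetric, so a^T S_i = (S_i a)^T.
        s = [[s[i][j] - s_a[i] * s_a[j] / denom for j in range(m)] for i in range(m)]
        s_new_a = [sum(s[i][k] * a[k] for k in range(m)) for i in range(m)]  # S_{i+1} a
        err = b - sum(a[i] * x[i] for i in range(m))                         # b - a^T X_i
        x = [x[i] + s_new_a[i] * err for i in range(m)]                      # X_{i+1}
    return x


# Hypothetical training data: rows [x, y, 1] and targets 2x - y + 0.5.
rows = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0], [2.0, 1.0, 1.0]]
targets = [2.0 * r[0] - 1.0 * r[1] + 0.5 for r in rows]

estimate = sequential_lse(rows, targets)
print(estimate)
```

Because $S_0 = \gamma I$ with finite γ, the result is only approximately equal to the exact least squares solution; the approximation improves as γ grows.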
3.10. Gradient descent (Backward pass)
The objective of this step is to update the premise parameters (the parameters of the membership functions). In this paper I use the backpropagation algorithm. The basic idea of ANFIS backpropagation is based on the overall error measure E:
$$E = \sum_{p=1}^{P} E^{(p)} = \sum_{p=1}^{P} \frac{1}{2} \left(d^{(p)} - O_5^{(p)}\right)^2$$
where
p = index of the training data pair (P pairs in total)
$d^{(p)}$ = p-th component of the desired output vector
$O_5^{(p)}$ = actual output in layer 5 of the ANFIS structure (see Fig. 2.4.2) for the p-th pair
Let θ denote a membership function parameter. Then for each parameter $\theta_i$ the update formula is
$$\Delta\theta_i = -\eta \frac{\partial E}{\partial \theta_i} = -\eta \sum_{p=1}^{P} \frac{\partial E^{(p)}}{\partial \theta_i}$$
where
$$\eta = \frac{\kappa}{\sqrt{\sum_i \left(\frac{\partial E}{\partial \theta_i}\right)^2}}$$
is the learning rate, κ is the step size, and $\partial E / \partial \theta_i$ is the derivative of the error measure with respect to the parameter $\theta_i$.
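A single parameter update with the normalized learning rate can be sketched as follows. The sketch assumes the learning rate is the step size κ divided by the Euclidean length of the gradient vector, as in Jang's formulation of ANFIS; the function name, the zero-gradient guard and the numbers are hypothetical.

```python
import math


def gradient_step(theta, grad, kappa=0.1):
    """One update theta_i <- theta_i - eta * dE/dtheta_i with eta = kappa / ||grad||."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(theta)  # zero gradient: leave the parameters unchanged
    eta = kappa / norm
    return [t - eta * g for t, g in zip(theta, grad)]


theta = [1.0, 2.0, 0.0]   # hypothetical premise parameters
grad = [3.0, 0.0, -4.0]   # hypothetical derivatives dE/dtheta_i
new_theta = gradient_step(theta, grad, kappa=0.5)
step_length = math.sqrt(sum((n - t) ** 2 for n, t in zip(new_theta, theta)))
print(new_theta, step_length)
```

With this normalization every update moves the parameter vector by exactly κ in parameter space, whatever the magnitude of the gradient.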
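The chain rule is used to calculate the partial derivatives. Based on the Sugeno inference mechanism, the error rate for the premise parameters can be calculated as follows:

$$\frac{\partial E^{(p)}}{\partial\theta_i} = -\sum_{j=1}^{2} \left(d^{(p)} - O_5^{(p)}\right) \frac{\partial O_5^{(p)}}{\partial O_{4,j}^{(p)}}\, \frac{\partial O_{4,j}^{(p)}}{\partial O_{3,j}^{(p)}}\, \frac{\partial O_{3,j}^{(p)}}{\partial O_{2,1}^{(p)}}\, \frac{\partial O_{2,1}^{(p)}}{\partial O_{1,1}^{(p)}}\, \frac{\partial O_{1,1}^{(p)}}{\partial\theta_i}$$

The derivation of ∂O_5/∂O_{4,j} is as follows:

$$\frac{\partial O_5}{\partial O_{4,j}} = \frac{\partial \sum_k O_{4,k}}{\partial O_{4,j}} = 1$$

for ∂O_{4,i}/∂O_{3,j}:

$$\frac{\partial O_{4,i}}{\partial O_{3,j}} = \frac{\partial\left(O_{3,i}\,(p_i x + q_i y + r_i)\right)}{\partial O_{3,j}} = \begin{cases} p_i x + q_i y + r_i, & i = j \\ 0, & i \neq j \end{cases}$$

for ∂O_{3,i}/∂O_{2,j}:

$$\frac{\partial O_{3,i}}{\partial O_{2,j}} = \frac{\partial}{\partial O_{2,j}}\left(\frac{O_{2,i}}{O_{2,1} + O_{2,2}}\right) = \begin{cases} \dfrac{O_{2,1} + O_{2,2} - O_{2,i}}{(O_{2,1} + O_{2,2})^2}, & i = j \\[2ex] \dfrac{-O_{2,i}}{(O_{2,1} + O_{2,2})^2}, & i \neq j \end{cases}$$

and for ∂O_{1,i}/∂a_i, ∂O_{1,i}/∂b_i and ∂O_{1,i}/∂c_i:

$$\frac{\partial O_{1,i}}{\partial a_i} = \frac{2 b_i \left(\frac{x - c_i}{a_i}\right)^{2 b_i}}{a_i \left[1 + \left(\frac{x - c_i}{a_i}\right)^{2 b_i}\right]^2}, \qquad \frac{\partial O_{1,i}}{\partial b_i} = \frac{-\left(\frac{x - c_i}{a_i}\right)^{2 b_i} \ln\!\left[\left(\frac{x - c_i}{a_i}\right)^{2}\right]}{\left[1 + \left(\frac{x - c_i}{a_i}\right)^{2 b_i}\right]^2}, \qquad \frac{\partial O_{1,i}}{\partial c_i} = \frac{2 b_i \left(\frac{x - c_i}{a_i}\right)^{2 b_i}}{(x - c_i) \left[1 + \left(\frac{x - c_i}{a_i}\right)^{2 b_i}\right]^2}$$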
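The three membership-function derivatives can be checked numerically. The sketch below assumes that layer 1 uses the generalized bell function O1 = 1 / (1 + ((x - c)/a)^(2b)), which is the function these derivatives correspond to; all function names and test values are hypothetical.

```python
import math


def bell(x, a, b, c):
    """Generalized bell membership function (assumed form of O1)."""
    u = ((x - c) / a) ** 2
    return 1.0 / (1.0 + u ** b)


def bell_grads(x, a, b, c):
    """Analytic derivatives of bell() with respect to a, b and c (needs x != c)."""
    u = ((x - c) / a) ** 2
    ub = u ** b                          # ((x - c) / a) ** (2 * b)
    denom = (1.0 + ub) ** 2
    d_a = 2.0 * b * ub / (a * denom)
    d_b = -ub * math.log(u) / denom
    d_c = 2.0 * b * ub / ((x - c) * denom)
    return d_a, d_b, d_c


def numeric_grads(x, a, b, c, h=1e-6):
    """Central-difference approximation of the same three derivatives."""
    d_a = (bell(x, a + h, b, c) - bell(x, a - h, b, c)) / (2 * h)
    d_b = (bell(x, a, b + h, c) - bell(x, a, b - h, c)) / (2 * h)
    d_c = (bell(x, a, b, c + h) - bell(x, a, b, c - h)) / (2 * h)
    return d_a, d_b, d_c


analytic = bell_grads(1.5, 2.0, 1.5, 0.5)
numeric = numeric_grads(1.5, 2.0, 1.5, 0.5)
```

Comparing `analytic` with `numeric` is a cheap way to catch sign or exponent mistakes before the derivatives are used in the backward pass.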
3. CONCLUSION The hybrid system ANFIS, whose inference mechanism is based on an adaptive network, has been considered in this paper. An important property of this system is that the parameters of the fuzzy system described with the help of ANFIS can be tuned. The methods used to tune these parameters are the same as those used to tune the weights in neural networks. As ANFIS is a multilayer network, tuning its parameters is a non-linear task (the parameters of the inner layers cannot be expressed linearly). Hence, the new algorithms 'inherit' the problems related to the training algorithms of multilayer neural networks. Moreover, the new algorithms require more computational power; ANFIS is a learning method that is computationally more complex, yet more effective, than some of the other methods proposed in the neural network field.
References
[1] JANG, J.-S. R. ANFIS: Adaptive-Network-Based Fuzzy Inference System, IEEE Transactions on Systems, Man and Cybernetics, Vol. 23, No. 3, May/June 1993, pp. 665-685.
[2] The MathWorks, Inc.: Neural Network Toolbox (Matlab Toolbox), in reference to the handbook of The MathWorks, Inc., Boston 1998.
[3] VALISHEVSKY, A. Comparative Analysis of Different Approaches towards Multilayer Perceptron Training, Scientific Proceedings of Riga Technical University, 2001.
[4] JANG, J.-S. R.; SUN, C.-T. Neuro-Fuzzy Modeling and Control, Proceedings of the IEEE, 83(3):378-406.
[5] JANG, J.-S. R.; GULLEY, N. The Fuzzy Logic Toolbox for use with MATLAB, Natick, MA: The MathWorks Inc., 1995.
Address: Ing. Amal Al Khatib VUT v Brně Technická 2 CZ-616 69 Brno e-mail: [email protected]
GRAMMATICAL EVOLUTION WITH BACKWARD PROCESSING
Pavel Ošmera, O. Popelka¹, Imrich Rukovanský²
¹ Brno University of Technology, ² European Polytechnical Institute Kunovice
Abstract: This paper describes Parallel Grammatical Evolution (PGE), which can evolve complete programs using a variable-length linear genome to govern the mapping of a Backus Naur Form grammar definition. To increase the efficiency of Grammatical Evolution (GE), the influence of backward processing was tested. The significance of backward coding (BC) and a comparison with the standard coding of GE are presented. BC can speed up Grammatical Evolution with high quality features. The adaptive significance of Parallel Grammatical Evolution with male and female populations has been studied.
1 INTRODUCTION Grammatical Evolution (GE) [1] can be considered a form of grammar-based genetic programming (GP). In particular, Koza's genetic programming has enjoyed considerable popularity and widespread use. Unlike a Koza-style approach, GE makes no distinction at this stage between what Koza describes as functions (operators in this case) and terminals (variables). Koza originally employed Lisp as his target language. This distinction is more of an implementation detail than a design issue. Grammatical evolution can be used to generate programs in any language, using Backus Naur Form (BNF). BNF grammars consist of terminals, which are items that can appear in the language, e.g. +, -, sin, log, and non-terminals, which can be expanded into one or more terminals and non-terminals. A non-terminal symbol is any symbol that can be rewritten to another string; conversely, a terminal symbol is one that cannot be rewritten. The major strength of GE with respect to GP is its ability to generate multi-line functions in any language. Rather than representing programs as parse trees, as in GP, a linear genome representation is used. A genotype-phenotype mapping is employed such that each individual's variable-length byte string contains the information needed to select production rules from a BNF grammar. The grammar allows the generation of programs in an arbitrary language that are guaranteed to be syntactically correct. The user can tailor the grammar to produce solutions that are purely syntactically constrained, or can incorporate domain knowledge by biasing the grammar to produce very specific forms of sentences. The GE system in [1-3] codes a set of pseudo-random numbers, which are used to decide which choice to take when a non-terminal has one or more outcomes. Because the GE mapping technique employs a BNF definition, the system is language independent and can theoretically generate arbitrarily complex functions.
There is quite an unusual approach in GE, as it is possible for certain genes to be used two or more times if the wrapping operator is used. BNF is a notation that represents a language in the form of production rules. It is possible to generate programs using the Grammatical Swarm Optimization (GSO) technique [2] with a performance similar to that of GE, which is notable given the relative simplicity of GSO, the small population sizes involved, and the complete absence of a crossover operator synonymous with program evolution in GP or GE. Grammatical evolution was one of the first approaches to distinguish between the genotype and the phenotype. GE evolves a sequence of rule numbers that are translated, using a predetermined grammar set, into a phenotypic tree.
Our approach uses a parallel structure of GE (PGE). A population is divided into several subpopulations that are arranged in a hierarchical structure [4]. Every subpopulation has two separate parts: a male group and a female group. Each group uses quite a different type of selection. In the first group a classical type of GA selection is used. In the second group only mutually different individuals can be included. This is biologically inspired computing similar to a harem arrangement. This strategy increases the inner adaptation of PGE. The following text explains why we used this approach. Analogy leads us one step further, namely to the combination of GE with sexual reproduction [5-6]. On the principle of sexual reproduction we can create a parallel GE with a hierarchical structure.
2 PARALLEL GRAMMATICAL EVOLUTION
The PGE is based on the grammatical evolution GE [1], where BNF grammars consist of terminals and non-terminals. Terminals are items which can appear in the language. Non-terminals can be expanded into one or more terminals and non-terminals.
The grammar is represented by the tuple {N,T,P,S}, where N is the set of non-terminals, T the set of terminals, P a set of production rules which map the elements of N to T, and S is a start symbol which is a member of N. For example, below is the BNF used for our problem:
N = {expr, fnc}
T = {sin, cos, +, -, /, *, X, 1, 2, 3, 4, 5, 6, 7, 8, 9}
S = <expr>
and P can be represented as 4 production rules:
1. <expr> := <fnc><expr> | <fnc><expr><expr> | <fnc><num><expr> | <var>
2. <fnc> := sin | cos | + | - | * | U-
3. <var> := X
4. <num> := 0,1,2,3,4,5,6,7,8,9
The production rules and the number of choices associated with each are given in Table 1. The symbol U- denotes a unary minus operation.
Table 1: The number of available choices for each production rule.
rule no    choices
1          4
2          6
3          1
4          10
There are notable differences when compared with [1]. We don't use two elements <pre_op> and <op>, but only one element <fnc> for all functions with n arguments. There are no rules for parentheses; they are substituted by a tree representation of the function. The element <num> and the rule <fnc><num><expr> were added to cover the generation of numbers. The rule <fnc><num><expr> is derived from the rule <fnc><expr><expr>. Using this approach we can generate expressions more easily. For example, when one argument is a number, +(4,x) can be produced, which is equivalent to (4 + x) in infix notation. The same result can be obtained if one <expr> in the rule <fnc><expr><expr> is first substituted with another nonterminal and then with a number, but this would need more genes. There are no rules with parentheses because all information is included in the tree representation of an individual. Parentheses are added automatically during the creation of the text output. If the GE is not restricted in any way, the search space can contain an infinite number of solutions. For example the function cos(2x) can be expressed as cos(x+x); cos(x+x+1-1); cos(x+x+x-x); cos(x+x+0+0+0...) etc. It is therefore desirable to limit the number of elements in the expression and the number of repetitions of the same terminals and non-terminals.
3 BACKWARD PROCESSING OF THE GE The chromosome is represented by a set of integers filled with random values in the initial population. Gene values are used during chromosome translation to decide which terminal or nonterminal to pick from the set. When selecting a production rule there are four possibilities, so we use gene_value mod 4 to select a rule. However, the list of variables has only one member (variable X) and gene_value mod 1 always returns 0. A gene is always read, whether or not a decision has to be made; this approach makes some genes in the chromosome redundant. Values of such genes can be created randomly, but the genes must be present. Fig. 1 shows the genotype-phenotype translation scheme. The body of the individual is shown as a linear structure, but in fact it is stored as a one-way tree (child objects have no links to parent objects). In the diagram we use abbreviated notations for nonterminal symbols: f - <fnc>, e - <expr>, n - <num>, v - <var>.
The column description in Fig. 1:
A. Objects of the individual's body (resulting trigonometric function),
B. Genes used to translate the chromosome into the phenotype,
C. Modulo operation; the divisor is the number of possible choices determined by the gene context,
D. Result of the modulo operation,
E. State of the individual's body after processing a gene on the corresponding line,
F. Blocks in the chromosome and corresponding production rules,
G. Block marks added to the chromosome.
Fig. 1: Relations between genotype (column B) and phenotype (column A)
Since the modulo operation takes two operands, the resulting number is influenced by the gene value and by the gene context (Fig. 1, column C). The gene context is the number of choices, determined by the currently used list (rules, functions, variables). Therefore genes with the same value might give different results of the modulo operation depending on what object they code. On the other hand, one terminal symbol can be coded by many different gene values as long as the result of the modulo operation is the same, e.g. (31 mod 3) = (34 mod 3) = 1. In the example given (Fig. 1, column A) the variable set has only one member, X. Therefore the modulo divisor is always 1 and the result is always 0, so a gene which codes a variable is redundant in that context (Fig. 1, column D). If the system runs out of genes during genotype-phenotype translation, the chromosome is wrapped and the genes at the beginning are reused.
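The role of the gene context and of wrapping can be sketched in a few lines. The helper names `pick` and `read_genes` and the choice lists are hypothetical; only the modulo rule and the reuse of genes from the beginning of the chromosome are taken from the text.

```python
def pick(choices, gene_value):
    """Select one item; the divisor is the number of choices (the gene context)."""
    return choices[gene_value % len(choices)]


def read_genes(genes, contexts):
    """Translate one gene per decision; wrap to the first gene when genes run out."""
    selected = []
    for i, choices in enumerate(contexts):
        gene = genes[i % len(genes)]        # wrapping: reuse genes from the start
        selected.append(pick(choices, gene))
    return selected


three_items = ["a", "b", "c"]               # any list with three choices
same = pick(three_items, 31) == pick(three_items, 34)   # 31 mod 3 == 34 mod 3 == 1
variable = pick(["X"], 17)                  # a single choice: always index 0
wrapped = read_genes([5, 2], [three_items, three_items, three_items])
```

In `wrapped` the third decision reuses the first gene (value 5), so the same gene value is interpreted a second time, possibly in a different context.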
4 PROCESSING THE GRAMMAR The processing of the production rules is done backwards, from the end of the rule to the beginning (Fig. 2). For example, the production rule <fnc><expr1><expr2> is processed as <expr2><expr1><fnc>. We use <expr1> and <expr2> at this point to denote which expression will be the first argument of <fnc>.
Fig. 2: Proposed backward notation of a function tree structure
The main difference between the <fnc> and <expr> nonterminals is in the number of real objects they produce in the individual's body. The nonterminal <fnc> always generates one and only one terminal; in contrast, <expr> generates an unknown number of nonterminal and terminal symbols. If the phenotype is represented as a tree structure, then the product of the <fnc> nonterminal is the parent object handling all objects generated by the <expr> nonterminals contained in the same rule. Therefore the rule <fnc><expr1><expr2> can be represented as a tree (Fig. 3).
Fig. 3: Production rule shown as a tree
To select a production rule (selection of a tree structure) only one gene is needed. To process the selected rule a number of n genes are needed, and finally one more gene is needed to translate the <fnc> nonterminal into a specific terminal symbol. If the processing is done backwards, the first processed terminals are leaves of the tree and the last processed terminal in a rule is the root of a subtree. The very last terminal is the root of the whole tree. Note that in forward processing (<fnc><expr1><expr2>) the first processed gene codes the rule, the second gene codes the root of the subtree and the last genes code leaves. When using forward processing and the coding of the rules described in [1], it is not possible to easily recover the tree structure from the genotype. This is because the <expr> nonterminals use an unknown number of successive genes and the last processed terminal is just a leaf of the tree. The proposed backward processing is shown in Fig. 1, column E.
4.1 PHENOTYPE TO GENOTYPE PROJECTION Using the proposed backward processing system, the translation to a phenotype subtree follows a certain scheme. It begins with a production rule (selecting the type of the subtree) and ends with the root of the subtree (in our case with a function) (Fig. 1, column F). In the genotype this means that one gene used to select a production rule is followed by n genes with different contexts, which are followed by one gene used to translate <fnc>. Therefore a gene coding a production rule forms a pair with a gene coding the terminal symbol for <fnc> (the root of the rule). Those genes can be marked when processing the individual. This is an example of a simple marking system:
BB - Begin block (a gene coding a production rule)
IB - Inside block
EB - End block (a gene coding the root of a subtree)
The EB and BB marks are pair marks and in the chromosome they define a block (Fig. 1, column G). Such blocks can be nested but they don't overlap (the same way as parentheses). The IB mark is not a pair mark, but it is always contained in a block (IB marks are presently generated by <num> nonterminals). Given a BB gene, the corresponding EB gene can be found using a simple LIFO method. A block of the chromosome enclosed in a BB-EB gene pair then codes a subtree of the phenotype. Such a block is fully autonomous and can be exchanged with any other block, or it can serve as a completely new individual. Only BB genes code the tree of the individual's body, while EB and IB genes code the terminal symbols in the resulting phenotype. The BB genes code the structure of the individual, and changing their values can change the applied production rule. Therefore a change (e.g. by mutation) in the value of a structural gene may trigger a change of context of many, or all, following genes. This simple marking system introduces a feedback from the phenotype to the genotype; however, it doesn't affect the universality of the algorithm. It doesn't depend on the terminal or nonterminal symbols used; it only requires the result to be a tree structure. Using this system it is possible to introduce a progressive crossover and mutation.
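The backward translation and the marking of genes can be sketched as a small recursive decoder. This is an illustration under stated assumptions, not the authors' implementation: the function list is split into a unary and a binary list of three items each, the gene that codes a variable is marked EB, wrapping is omitted, and all names (`decode_backward`, `RULES`, and so on) are hypothetical.

```python
# Hypothetical symbol lists; splitting the functions by arity is an assumption.
RULES = ("unary", "binary", "num_binary", "var")    # the four choices of rule 1
UNARY = ("sin", "cos", "U-")
BINARY = ("+", "-", "*")
VARS = ("X",)
NUMS = tuple("0123456789")


def decode_backward(genes):
    """Translate genes into a nested-tuple tree plus one block mark per used gene."""
    marks = []
    pos = 0

    def read(choices, mark):
        nonlocal pos
        if pos >= len(genes):
            raise ValueError("ran out of genes (wrapping is omitted in this sketch)")
        item = choices[genes[pos] % len(choices)]
        pos += 1
        marks.append(mark)
        return item

    def expr():
        rule = read(RULES, "BB")              # structural gene opens a block
        if rule == "var":
            return read(VARS, "EB")           # assumption: the variable closes it
        if rule == "unary":
            arg = expr()
            return (read(UNARY, "EB"), arg)   # the root is read last
        if rule == "binary":
            second = expr()                   # <expr2> is processed first
            first = expr()
            return (read(BINARY, "EB"), first, second)
        arg = expr()                          # rule <fnc><num><expr>, read backwards
        number = read(NUMS, "IB")
        return (read(BINARY, "EB"), number, arg)

    return expr(), marks


tree, marks = decode_backward([6, 7, 10, 14, 3])
```

For the sample chromosome, the first gene selects the rule with a number (6 mod 4 = 2), the next two genes code the variable X, 14 mod 10 gives the number 4, and the last gene selects the root +, which corresponds to +(4,x).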
4.2 CROSSOVER When using grammatical evolution the resulting phenotype coded by one gene depends on the value of the gene and on its context. If a chromosome is crossed at a random point, it is very probable that the context of the genes in the second part will change. In this way crossover destroys the phenotype, because the newly added parts code a different phenotype than in the original individual. This behavior can be eliminated using the block marking system. Crossover is then performed as an exchange of blocks. The crossover always uses an even number of crossover genes, where each odd gene must be a BB gene and each even gene must be the corresponding EB gene. The starting BB gene is presently chosen randomly; the first gene is excluded because it encapsulates (together with the last used gene) the whole individual. The operation takes two parent chromosomes and the result is always two child chromosomes. It is also possible to combine two identical individuals, and the resulting child chromosomes can still be entirely different. Given the parents:
1) cos( x + 2 ) + sin( x * 3 )
2) cos( x + 2 ) + sin( x * 3 )
the operation can produce the children:
3) cos( sin( x * 3 ) + 2 ) + sin( x * 3 )
4) cos( x + 2 ) + x
This crossover method works similarly to a direct combination of phenotype trees, but it operates purely on the chromosome. Therefore phenotype and genotype are still separated. The result is a chromosome which will generate an individual with a structure combined from its parents. In this way we obtain the encoding of an individual without a backward analysis of its phenotype. To perform a crossover the phenotype has to be evaluated (to mark the genes), but it is neither used nor known in the crossover operation (and it does not even have to exist).
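A minimal sketch of the block exchange follows. It assumes that every gene already carries a BB, IB or EB mark from a previous translation; the nesting counter plays the role of the LIFO stack, and the function names and sample chromosomes are hypothetical.

```python
import random


def block_end(marks, start):
    """Index of the EB gene that closes the block opened by the BB gene at start."""
    if marks[start] != "BB":
        raise ValueError("a block must start with a BB gene")
    depth = 0                                # depth of the implicit LIFO stack
    for i in range(start, len(marks)):
        if marks[i] == "BB":
            depth += 1
        elif marks[i] == "EB":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced block marks")


def block_crossover(genes_a, marks_a, genes_b, marks_b, rng=random):
    """Exchange one randomly chosen block between two parents (lists of ints)."""
    # The first gene is excluded: its block is the whole individual.
    starts_a = [i for i, m in enumerate(marks_a) if m == "BB" and i > 0]
    starts_b = [i for i, m in enumerate(marks_b) if m == "BB" and i > 0]
    if not starts_a or not starts_b:
        return list(genes_a), list(genes_b)
    sa = rng.choice(starts_a)
    sb = rng.choice(starts_b)
    ea = block_end(marks_a, sa)
    eb = block_end(marks_b, sb)
    child_a = genes_a[:sa] + genes_b[sb:eb + 1] + genes_a[ea + 1:]
    child_b = genes_b[:sb] + genes_a[sa:ea + 1] + genes_b[eb + 1:]
    return child_a, child_b


marks = ["BB", "BB", "EB", "IB", "EB"]
child_a, child_b = block_crossover([6, 7, 10, 14, 3], marks, [2, 3, 0, 5, 1], marks)
```

Because a block always starts with a rule-selecting gene and ends with the root of its subtree, an exchanged block keeps the context of every gene inside it, and the genes that follow the block are read in the same context as before.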
4.3 MUTATION Mutation can be divided into mutation of structural (BB) genes and mutation of other genes. Mutation of one structural gene can affect other genes by changing their context; therefore the rate of structural mutation should be very low. On the other hand, the rate of mutation of other genes can be set very high, which can speed up the search for an approximate solution. Given an individual:
sin( 2 + x ) + cos( 3 * x )
and using only mutation of non-structural genes, it is possible to get:
cos( 5 - x ) * sin( 1 * x )
Thus the structure doesn't change, but we can get a lot of new combinations of terminal symbols. The divided mutation makes it possible to use the benefits of a high mutation rate while eliminating the risk of damaging the structure of an individual.
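The divided mutation can be sketched as follows, assuming that the block marks of the genes are available. The default rates follow the values quoted for the male population (30% and less than 2%); the gene range 0 to 255 and the function name are assumptions made for the illustration.

```python
import random


def divided_mutation(genes, marks, structural_rate=0.02, other_rate=0.30,
                     max_gene=255, rng=random):
    """Mutate BB (structural) genes rarely and all other genes often."""
    child = []
    for gene, mark in zip(genes, marks):
        rate = structural_rate if mark == "BB" else other_rate
        if rng.random() < rate:
            gene = rng.randint(0, max_gene)   # new random value for this gene
        child.append(gene)
    return child


genes = [6, 7, 10, 14, 3]
marks = ["BB", "BB", "EB", "IB", "EB"]
# Structural mutation switched off: the tree shape of the individual is preserved.
mutant = divided_mutation(genes, marks, structural_rate=0.0, other_rate=1.0,
                          rng=random.Random(1))
```

With the structural rate set to zero only EB and IB genes change, which corresponds to the example above where the terminals change while the structure of the expression stays the same.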
4.4 POPULATION MODEL The system uses three populations forming a simple tree structure (Fig. 4). There is a Master population and two slave populations, which simulate different genders. The links among the populations lead only one way - from bottom to top.
Fig. 4: The population model
4.5 FEMALE POPULATION When a new individual is to be inserted in the population, a check is performed to decide whether it should be inserted. If the same or a similar individual already exists in the population, the new individual is not inserted. In a female population every genotype and phenotype occurs only once. The population maintains a very high diversity; therefore the mutation operation is not applied to this population. The removal of individuals is based on two criteria. The first criterion is the age of an individual, i.e. its length of stay in the population. The second criterion is the fitness of an individual. Using the second criterion a maximum population size is maintained. Parents are chosen using tournament selection.
4.6 MALE POPULATION New individuals are not checked, so duplicate phenotypes and genotypes can occur; mutation is also enabled for this population. The mutation rate can be safely set very high (30%) provided that the structural mutation rate is set very low (less than 2%). For the few best individuals mutation is non-destructive: if a protected individual is to be mutated, a clone is created and added to the population. If the system stagnates in a local solution, the mutation rate is raised using a linear function depending on the number of cycles for which the solution hasn't improved. Parents are chosen using a logarithmic function depending on the position of an individual in the population sorted by fitness. For every selected male parent a new selection of a female parent is made.
4.7 MASTER POPULATION The master population is superior to the male and female populations. Periodically the subpopulations send over their best solutions. Moreover the master population performs another evolution on its own. Parents are selected using the tournament system. The master population uses the same system of mutations as the male population, but for removing individuals from the population only the fitness criterion is used. Therefore master population also serves as an archive of best solutions of the whole system.
4.8 FITNESS FUNCTION An equidistant area of a given size is defined around the searched function. The fitness of an individual's phenotype is computed as the number of points inside this area divided by the number of all checked points (a value in the interval [0, 1]). This fitness function creates a strong selection pressure; therefore the system finds an approximate solution very quickly.
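A minimal sketch of such a fitness function is given below. It assumes that the equidistant area is a vertical band of half-width `half_width` around the target function; the band size, the function names and the sampling are assumptions of the sketch, not values from the experiments.

```python
import math


def band_fitness(candidate, target, xs, half_width=0.1):
    """Fraction of sample points where candidate lies inside the band around target."""
    inside = 0
    for x in xs:
        try:
            ok = abs(candidate(x) - target(x)) <= half_width
        except (ValueError, ZeroDivisionError, OverflowError):
            ok = False                       # an undefined value counts as outside
        inside += ok
    return inside / len(xs)


def target(x):
    return math.sin(2 * x) * math.cos(2 + x)


# 100 sample points in the interval [0, 2*pi]
xs = [2 * math.pi * i / 99 for i in range(100)]
perfect = band_fitness(target, target, xs)
far_off = band_fitness(lambda x: 10.0, target, xs)
```

Since the result only counts points inside the band, candidates that are close almost everywhere score near 1 even if they are not exact, which is consistent with the quick discovery of approximate solutions described above.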
4.9 RESULTS Given a sample of 100 points in the interval [0, 2π] and using the block marking system described in 4.1, PGE successfully found the searched function sin(2*x)*cos(2+x) in the majority of runs. The graph (Fig. 5) shows the maximum fitness in the system for ten runs and the average (bold). In contrast, with the phenotype to genotype projection disabled (Fig. 6), the majority of runs of the same system didn't find the searched function within 120 generations. We have simplified the generation of numbers by adding a new production rule, thus allowing the generation of functions containing integer constants. The described parallel system together with the phenotype to genotype projection improved the speed of the system. The progressive crossover and mutation eliminate the destruction of partial results and allowed us to generate more complicated functions (e.g. sin(2 * x)*cos(2 + x)).
Fig. 5: Convergence of the PGE using backward processing (average in bold) [plot of Fitness, 0 to 1, versus Generation, 1 to 100]
We have described a parallel system, Parallel Grammatical Evolution (PGE), that can map an integer genotype onto a phenotype with backward coding. PGE has proved successful for creating trigonometric identities. Parallel GEs with sexual reproduction can increase the efficiency and robustness of systems, and thus they can better track optimal parameters in a changing environment. From the experiments it can be concluded that PGE, a modification of standard GE with two sub-populations, performs much better than classical versions of GE.
Fig. 6: Convergence of the PGE using forward processing (average in bold) [plot of Fitness, 0 to 1, versus Generation, 1 to 100]
Fig. 7: Convergence of the PGE with 5 PC using backward processing (average in bold) [plot of Fitness, 0 to 1, versus Generation, 0 to 100]
Fig. 8: The parallel structure of PGE with 6 computers
The PGE algorithm was tested on a group of 6 computers in a computer network (see Fig. 8). Five computers ran the five subsystems MR1, MR2, MR3, MR4 and MR5, and one computer ran the master MR. The male subpopulation M of MR at the higher level follows the convergence of the subsystems. Fig. 7 presents 10 runs of the PGE program. The shortest computation took only 10 generations, and all computations finished before generation 40. This compares favourably with backward processing on one computer (see Fig. 5). Forward processing on one computer was the slowest (see Fig. 6).
5 CONCLUSIONS The increased awareness from other scientific communities, such as biology and mathematics, promises new insights and new opportunities. There is much to accomplish and there are many open questions. Interest from diverse disciplines continues to increase, and simulated evolution is becoming more generally accepted as a paradigm for optimization in practical engineering problems. Parallel grammatical evolution can be used for the automatic generation of programs. This can help us to find information as a part of complexity. We are far from supposing that all difficulties have been removed, but the first results with PGE are very promising.
REFERENCES
[1] O’NEILL, M.; RYAN, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language. Kluwer Academic Publishers 2003.
[2] O’NEILL, M.; BRABAZON, A.; ADLEY, C. The Automatic Generation of Programs for Classification Problems with Grammatical Swarm. Proceedings of CEC 2004, Portland, Oregon (2004) 104-110.
[3] PIASÉCZNY, W.; SUZUKI, H.; SAWAI, H. Chemical Genetic Programming - Evolution of Amino Acid Rewriting Rules Used for Genotype-Phenotype Translation. Proceedings of CEC 2004, Portland, Oregon (2004) 1639-1646.
[4] OŠMERA, P.; ŠIMONÍK, I.; ROUPEC, J. Multilevel distributed genetic algorithms. In Proceedings of the International Conference IEE/IEEE on Genetic Algorithms, Sheffield (1995) 505-510.
[5] OŠMERA, P.; ROUPEC, J. Limited Lifetime Genetic Algorithms in Comparison with Sexual Reproduction Based GAs. Proceedings of MENDEL’2000, Brno, Czech Republic (2000) 118-126.
[6] OŠMERA, P. Evolution of System with Unpredictable Behavior. Proceedings of MENDEL’2004, Brno, Czech Republic (2004) 1-6.
[7] OŠMERA, P. Genetic Algorithms and their Applications, habilitation thesis, in Czech language, 2002.
Address: Doc. Ing. Pavel Ošmera, CSc. Institute of Automation and Computer Science Brno University of Technology Technicka 2, 616 69 Brno, Czech Republic Tel.: +420 541 142 294 Fax: +420 541 142 490 e-mail: osmera@fme.vutbr.cz
Address: Bc. Ondřej Popelka Institute of Automation and Computer Science Brno University of Technology Technicka 2, 616 69 Brno, Czech Republic Tel.: +420 541 142 294 Fax: +420 541 142 490 e-mail: [email protected],
Address: Prof. Ing. Imrich Rukovanský, CSc. European Polytechnical Institute, s.r.o. Osvobození 699, 686 04 Kunovice, Czech Republic e-mail: [email protected]
OBJECT RECOGNITION BY MEANS OF NEW AL Jiří Štastný, Martin Minařík
Brno University of Technology
Abstract: This document provides an overview of algorithms for object recognition. Three basic algorithms are described: recognition with the aid of moments, recognition with the aid of a grammar, and recognition with the aid of a neural network (the back propagation algorithm). Finally, the speed and applicability of these algorithms are compared.
Key-Words: Back Propagation Algorithm, Moments, Grammar
1 Introduction Pattern recognition consists in sorting objects into classes. A class is a subset of objects whose elements have common features from the classification standpoint. An object has a physical character; in computer vision it is most frequently taken to mean a part of a segmented image. Methods for the classification of objects constitute the last and highest-level step in computer vision theory. The following methods were compared with one another:
• Recognition with the aid of moments
• Recognition with the aid of a grammar describing the edges of the object
• Recognition with the aid of a neural network (back propagation)
A real technological scene for object classification was simulated by digitizing five selected objects (see Fig. 1). For this purpose, two-dimensional images of three-dimensional objects were prepared. The aim was to test objects that resemble two-dimensional images of real objects. The choice of objects of similar shape was also intentional.
Fig. 1
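2 Recognition with the aid of the moment method The resultant moment characteristics for object detection, which are used in the program, are of the form:

$$\varphi_1 = \theta_{20} + \theta_{02} \tag{1}$$

$$\varphi_2 = (\theta_{20} - \theta_{02})^2 + 4\theta_{11}^2 \tag{2}$$

$$\varphi_3 = (\theta_{30} - 3\theta_{12})^2 + (3\theta_{21} - \theta_{03})^2 \tag{3}$$

$$\varphi_4 = (\theta_{30} + \theta_{12})^2 + (\theta_{21} + \theta_{03})^2 \tag{4}$$

$$\varphi_5 = (\theta_{30} - 3\theta_{12})(\theta_{30} + \theta_{12})\left[(\theta_{30} + \theta_{12})^2 - 3(\theta_{21} + \theta_{03})^2\right] + (3\theta_{21} - \theta_{03})(\theta_{21} + \theta_{03})\left[3(\theta_{30} + \theta_{12})^2 - (\theta_{21} + \theta_{03})^2\right] \tag{5}$$

$$\varphi_6 = (\theta_{20} - \theta_{02})\left[(\theta_{30} + \theta_{12})^2 - (\theta_{21} + \theta_{03})^2\right] + 4\theta_{11}(\theta_{30} + \theta_{12})(\theta_{21} + \theta_{03}) \tag{6}$$

$$\varphi_7 = (3\theta_{21} - \theta_{03})(\theta_{30} + \theta_{12})\left[(\theta_{30} + \theta_{12})^2 - 3(\theta_{21} + \theta_{03})^2\right] - (\theta_{30} - 3\theta_{12})(\theta_{21} + \theta_{03})\left[3(\theta_{30} + \theta_{12})^2 - (\theta_{21} + \theta_{03})^2\right] \tag{7}$$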
The moment description can represent both binary and grey-level regions. This method (see [7]) is based on computing seven moment features of the object. These moments are invariant with respect to translation, rotation and size of the object. Recognition with the aid of the moment method has yielded very good results. The method classified most objects without error already at the stage of learning on a reference model (etalon). The moment method can be used with an edge detector other than the one for which the moments were calculated, e.g. the Canny detector (which rounds the edges at higher sigma) and the Sobel operator, since the moments are not very sensitive to changes in object edges. The method is unfit for applications requiring the recognition of shapes that differ only minimally, because it is not sensitive to minor shape changes. It is fit for applications requiring fast recognition of dissimilar objects in different rotations.
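The invariance can be illustrated with a short sketch that computes the first four moment characteristics of a small image and of the same image rotated by 90 degrees. It assumes that θ_pq denotes the normalised central moment μ_pq / μ_00^(1 + (p + q)/2); the function names and the test image are hypothetical.

```python
def normalized_central_moments(image):
    """Return a function theta(p, q) for an image given as rows of intensities."""
    pts = [(x, y, v) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    m00 = sum(v for _, _, v in pts)            # the image must not be empty
    xc = sum(x * v for x, _, v in pts) / m00   # centroid
    yc = sum(y * v for _, y, v in pts) / m00

    def theta(p, q):
        mu = sum(v * (x - xc) ** p * (y - yc) ** q for x, y, v in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    return theta


def moment_invariants(image):
    """First four moment characteristics (phi1 to phi4)."""
    t = normalized_central_moments(image)
    phi1 = t(2, 0) + t(0, 2)
    phi2 = (t(2, 0) - t(0, 2)) ** 2 + 4 * t(1, 1) ** 2
    phi3 = (t(3, 0) - 3 * t(1, 2)) ** 2 + (3 * t(2, 1) - t(0, 3)) ** 2
    phi4 = (t(3, 0) + t(1, 2)) ** 2 + (t(2, 1) + t(0, 3)) ** 2
    return phi1, phi2, phi3, phi4


def rotate90(image):
    """Rotate the image by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


l_shape = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
original = moment_invariants(l_shape)
rotated = moment_invariants(rotate90(l_shape))
```

For a rotation by a multiple of 90 degrees the pixel grid maps onto itself, so `original` and `rotated` agree up to rounding; for other angles the values differ slightly because of resampling.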
3 Recognition with the aid of grammar While in flag methods of pattern recognition use is made of quantitative description of objects by numerical parameters, the flag vector, in syntactical methods the input description is of quantitative nature reflecting the structure of the object. The elementary properties of syntactically described objects are referred to as primitives. Primitives are edge parts of a certain shape or a graph or relation description of areas when the primitives are sub-areas of a certain shape. The task of syntactical pattern recognition of an image is to determine whether the image under analysis corresponds to the images of a given grammar, i.e. whether this grammar can generate this image. The image is represented by a language string given by the grammar. The simplest way of pattern recognition is „comparison with model“. A string representing the image is compared with elements of sentences set representing single model images. In the comparison, either complete or partial agreement with the model is necessary but on the basis of a certain adapting criterion. This method is simple and rapid. If a complete image description is necessary for pattern recognition, syntactical analysis is required. In object analysis tasks, the aim is to obtain a description that would include not only the listing of recognized objects and their mutual arrangement (structural information) but also their dimensions and distances found between them (semantic information). In the design of syntactical analyzer we must expect random effects such as image distortion. Primitives are the basic building element of image. When choosing them, their easy recognition must be taken into consideration. For images that are characterized by an edge or skeleton it is suitable to have parts of line as primitives. For example, a line segment can be characterized by its beginning and end, its length or angle. The same holds for curves. 
The choice of primitives depends on the application being solved. In general:
• Primitives must be easy to recognize, also by existing non-syntactical methods
• Primitives must provide a compact and sufficient description of images by means of specified relations
If primitives of greater complexity are used, we obtain a simpler structural description of objects, and this results in a simpler grammar for describing them. However, such primitives are harder to find in an image. Conversely, simpler primitives lead to a more complex grammar but are easy to identify in the image. After the primitives have been chosen, the next important step is to set up the rewriting rules of the grammar, based on experience and knowledge. An example of setting up a grammar for an object and choosing primitives is shown in Fig. 2. If we proceed anticlockwise from the point marked in Fig. 2, the string will be:
dfbcajbcag

Owing to the poor quality of the input image, the short straight segments shown in Fig. 2 may not be detected, and the string of the object then changes to one of:

dfbcjbcag
dfbcajcag
dfbcjcag

In this case it is suitable to modify the grammar so that it also generates the above strings. This approach is suitable for expected deformations. For random deformations it is advantageous to classify by means of a distance (the Levenshtein distance).
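Classification by the Levenshtein distance can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the model dictionary, the class names and the second model string are hypothetical, and only the string dfbcajbcag is taken from the text.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def classify(string, models):
    """Return the name of the model string nearest to the given string."""
    return min(models, key=lambda name: levenshtein(string, models[name]))


# Hypothetical model set: one class per model string
MODELS = {"object_fig2": "dfbcajbcag", "other_object": "dgdgdgdg"}
```

With these models, a damaged string such as dfbcjcag is still assigned to `object_fig2`, because it differs from the model string by only two deleted primitives.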
Fig. 2

The program developed uses a syntactical analyzer that operates according to the following algorithm:
1. If not all the strings have been analysed, read a new string and proceed with step 2, otherwise proceed with step 7
2. Perform the bottom-up analysis for the class selected
3. If the string belongs to the language of the grammar of the selected class, proceed with step 6
4. If the number of string rotations is less than the string length, rotate the string and proceed with step 2, otherwise proceed with step 5
5. If the number of object rotations is less than (360 / angle step), rotate the object by the given angle step and proceed with step 2
6. Enter the result and proceed with step 1
7. Write the message about pattern recognition
The syntactical analyzer has been designed for left-linear grammars.

String rotation: this means shifting the last terminal symbol to the beginning, e.g. for the first rotation:
abcde → eabcd → ...
Object rotation: this means rotating the object by a given angle and thus obtaining a different string.

If we need to classify N objects, we must create N classes, N grammars for them, and the respective languages L(G1), L(G2), ..., L(GN). For example, if grammar Gx generates words containing only one terminal symbol b, then all the objects containing just this one symbol b will belong to class X pertaining to this grammar. Objects containing more than one symbol b will be further analysed using the remaining grammars. If no grammar is found that corresponds to the given string, the object is rejected.

When primitives mark single edge segments, the grammar is very sensitive to small mistakes in edge detection. The grammar must be tailored to a particular type of edge detector. For example, it is a mistake to set up the grammar for objects to which the zero-crossing operator was applied and then to recognize objects by means of the Canny edge detector with a sizeable sigma (it modifies the edges). Grammars are suitable for applications that require recognizing differently rotated objects and in which the emphasis is on recognizing small changes in the edge segments. Setting up a grammar requires time and knowledge of the grammar description of edges. The grammar rules must be prepared manually; unlike in the other methods, this is not done automatically.
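The string-rotation part of the analyzer can be sketched as below. This is a simplified illustration under stated assumptions: each class grammar is replaced by a regular expression (a left-linear grammar generates a regular language, so the substitution is legitimate, but the pattern shown is hypothetical and not the grammar used by the authors), and the object-rotation step is omitted because it needs the image itself.

```python
import re


def rotations(string):
    """Yield the string and its rotations; each rotation moves the last symbol to the front."""
    for _ in range(len(string)):
        yield string
        string = string[-1] + string[:-1]


def recognize(string, grammars):
    """Return the first class whose language contains some rotation of the string, else None."""
    for name, pattern in grammars.items():
        for candidate in rotations(string):
            if re.fullmatch(pattern, candidate):
                return name
    return None


# Hypothetical class grammar: the object of Fig. 2, tolerating the loss of
# the two short segments discussed in the text
GRAMMARS = {"object_fig2": r"dfbca?jb?cag"}
```

For instance, `recognize("cagdfbcajb", GRAMMARS)` finds the class after seven rotations, since the input is the model string read from a different starting point.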
4 Back Propagation Algorithm

The back-propagation algorithm is an iterative method in which the network passes from an initial untrained state to a fully trained one (see [10]). The algorithm can be described in the following way:

random initialization of weights;
repeat
    repeat
        choose_pattern_from_training_set;
        put_chosen_pattern_in_input_of_network;
        compute_outputs_of_network;
        compare_outputs_with_required_values;
        modify_weights;
    until all_patterns_from_training_set_are_chosen;
until total_error < criterion;
The learning algorithm of back-propagation is essentially an optimization method that is able to find the weight coefficients and thresholds for the given neural network and training set. The network is assumed to be made up of neurons the behaviour of which is described by the formula:
$$y = S\left(\sum_{i=1}^{N} w_i x_i + \Theta\right) \tag{17}$$

where the output nonlinear function S is defined by the formula:

$$S(\varphi) = \frac{1}{1 + e^{-\gamma\varphi}} \tag{18}$$
where γ determines the steepness of the curve at the origin of coordinates. Input and output values are assumed to lie in the range <0, 1>. In the following formulas the superscript o denotes the output layer, the superscript h the hidden layer, and i, j are indexes. Index i indexes the output neurons and index j their inputs. Then $y_i^h$ is the output of the i-th neuron of the hidden layer and $w_{ij}^o$ is the weight connecting the i-th neuron of the output layer with the j-th neuron of the preceding hidden layer. The corresponding back-propagation algorithm can be written in the following steps:
1. Initialization. You set all the weights in the network at random to values in the recommended range <-0.3, 0.3>.
2. Pattern submitting. You choose a pattern from the training set and put it on the network inputs. Then you compute the outputs of the individual neurons by relations (17) and (18).
3. Comparison. First you compute the neural network energy (SSE) by relation (19).
$$E = \frac{1}{2}\sum_{i=1}^{n} (y_i - d_i)^2 \tag{19}$$

Then you compute an error for the output layer by the relation:

$$\delta_i^o = (d_i - y_i^o)\, y_i^o\, \gamma\, (1 - y_i^o) \tag{20}$$

4. Back-propagation of an error and weight modification. You compute for all neurons in the layer:

$$\Delta w_{ij}^l(t) = \eta\, \delta_i^l(t)\, y_j^{l-1}(t) + \alpha\, \Delta w_{ij}^l(t-1) \tag{21}$$

$$\Delta \Theta_i^l(t) = \eta\, \delta_i^l(t) + \alpha\, \Delta \Theta_i^l(t-1) \tag{22}$$

By the relation:

$$\delta_i^{h-1} = y_i^{h-1}\,(1 - y_i^{h-1}) \sum_{k=1}^{n} w_{ki}^h\, \delta_k^h \tag{23}$$

you back-propagate an error in the layer nearer the inputs. Then you modify the weights:
$$w_{ij}^l(t+1) = w_{ij}^l(t) + \Delta w_{ij}^l(t) \tag{24}$$

$$\Theta_i^l(t+1) = \Theta_i^l(t) + \Delta \Theta_i^l(t) \tag{25}$$
You apply step number 4 to all the layers of the network, starting with the output layer and continuing with the hidden layers.
5. Termination of pattern selection from the training set. If you have submitted all patterns from the training set to the network, continue with step number 6, otherwise go back to step number 2.
6. Termination of the learning process. If the neural network energy in the last computation was less than the selected criterion, terminate the learning process, otherwise continue with step number 2.

Flag vectors were submitted to the network, and the arm lengths were transformed into values from the interval <0, 1>. In the implemented computer program the number of vector components was set to 140. This method is the fastest of all the methods under comparison. To describe objects with this method, 70 feature vectors running from the centre of gravity to the object edges were used. The method can recognize objects with considerably modified shapes, but it may identify objects of similar shape incorrectly. The method recognizes differently rotated objects. The error of the method increases with decreasing size of the objects.
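Steps 1 to 6 can be put together in a short sketch for a network with one hidden layer. It is a minimal illustration under stated assumptions, not the implemented program: the class and function names, the default values of η, α and the criterion, and the cap on the number of epochs are all chosen here for the example. The factor γ is also included in the hidden-layer error, which coincides with relation (23) for γ = 1.

```python
import math
import random


class BackPropNet:
    """One hidden layer; every neuron follows relations (17) and (18)."""

    def __init__(self, n_in, n_hid, n_out, gamma=1.0, seed=0):
        rnd = random.Random(seed)
        sizes = [(n_hid, n_in), (n_out, n_hid)]
        self.gamma = gamma
        # Step 1: random weights and thresholds from <-0.3, 0.3>
        self.w = [[[rnd.uniform(-0.3, 0.3) for _ in range(c)] for _ in range(r)]
                  for r, c in sizes]
        self.theta = [[rnd.uniform(-0.3, 0.3) for _ in range(r)] for r, _ in sizes]
        # Previous increments, needed for the momentum terms in (21) and (22)
        self.dw = [[[0.0] * c for _ in range(r)] for r, c in sizes]
        self.dtheta = [[0.0] * r for r, _ in sizes]

    def _layer(self, l, x):
        return [1.0 / (1.0 + math.exp(-self.gamma * (sum(w * xi for w, xi in zip(row, x)) + th)))
                for row, th in zip(self.w[l], self.theta[l])]

    def forward(self, x):
        y_h = self._layer(0, x)
        return y_h, self._layer(1, y_h)

    def train_pattern(self, x, d, eta=0.5, alpha=0.5):
        """Steps 2 to 4 for one pattern; returns the energy (19) before the update."""
        y_h, y_o = self.forward(x)
        g = self.gamma
        delta_o = [(di - yi) * yi * g * (1.0 - yi) for di, yi in zip(d, y_o)]   # (20)
        delta_h = [g * yh * (1.0 - yh) *
                   sum(self.w[1][k][i] * delta_o[k] for k in range(len(delta_o)))
                   for i, yh in enumerate(y_h)]                                 # (23)
        for l, delta, y_prev in ((1, delta_o, y_h), (0, delta_h, x)):
            for i, d_i in enumerate(delta):
                for j, y_j in enumerate(y_prev):
                    self.dw[l][i][j] = eta * d_i * y_j + alpha * self.dw[l][i][j]  # (21)
                    self.w[l][i][j] += self.dw[l][i][j]                            # (24)
                self.dtheta[l][i] = eta * d_i + alpha * self.dtheta[l][i]          # (22)
                self.theta[l][i] += self.dtheta[l][i]                              # (25)
        return 0.5 * sum((yi - di) ** 2 for yi, di in zip(y_o, d))


def train(net, patterns, criterion=0.01, max_epochs=5000):
    """Steps 5 and 6: repeat whole passes until the summed energy drops below the criterion."""
    history = []
    for _ in range(max_epochs):
        history.append(sum(net.train_pattern(x, d) for x, d in patterns))
        if history[-1] < criterion:
            break
    return history
```

A call such as `train(BackPropNet(2, 3, 1), patterns)` with a list of (input, required output) pairs scaled to <0, 1> returns the energy after each pass, so the progress of learning can be inspected.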
5 Conclusion

Recommended application of pattern recognition methods:

Grammar recognition is suitable where rotated objects must be recognized, where single edge segments need to be detected with high accuracy without the risk of significant errors, and where high-speed classification is required. An error is regarded as significant if it cannot be covered by the grammar rules.

Recognition with the aid of moments is suitable where the exact course of the edge is not very important and a rough division into classes is sufficient. For example, it does not matter whether there is a sharp transition or a short curve between two edges.

Recognition with the aid of a neural network is suitable where high-speed classification of randomly rotated objects is required and where some differences between the learned etalons and the classified objects need to be tolerated.

The fastest methods for pattern recognition are recognition with the aid of grammar and recognition with the aid of a neural network. The moment method is the slowest.
Acknowledgement

This research was supported by the grants:
• No 102/03/0434 Limits for broad-band signal transmission on the twisted pairs and other system co-existence. The Grant Agency of the Czech Republic (GACR)
• No 102/03/0260 Development of network communication application programming interface for new generation of mobile and wireless terminals. The Grant Agency of the Czech Republic (GACR)
• No 102/03/0560 New methods for location and verification of compliance of quality of service in new generation networks. The Grant Agency of the Czech Republic (GACR)
• No CEZ: J22/98: 261100009 Nontraditional methods for investigating complex and vague systems
• No CZ 400011 (CEZ 262200011) Research of communication systems and technologies (Research design)
• Grant 1570 F1 New approach to the subject High-speed Communication Systems (grant of the Czech Ministry of Education, Youth and Sports)
• Grant 1563 F1 Restructure of telecommunication objects for third age university (grant of the Czech Ministry of Education, Youth and Sports)
References:
[1] FISHER, R.: World and Scene Representations. [Online], 2002. <www.dai.ed.ac.uk/CVonline/repres.htm>
[2] HEATH, M. and SARKAR, S.: Edge detection comparison. [Online], 1996. <marathon.csee.usf.edu/edge/edge_detection.html>
[3] GONZALES, R. C. and WOODS, R. E.: Digital Image Processing, Addison-Wesley Publishing Co., New York 1993.
[4] ŽÁRA, J. and BENEŠ, B.: Model Computer Graphic, Computer Press, Prague 1996.
[5] SMITH, M. W. and DAVIS, W. A.: A new Algorithm for Edge Detection. John Wiley, New York 1974.
[6] SVITÁK, R.: Edge Detection on Images. [Project-online]. ZU FAV. Plzeň 2001.
[7] ŠONKA, M. and HLAVÁČ, V. and BOYLE, R.: Image Processing, Analysis and Machine Vision. PWS, Boston 1998.
[8] JEŽEK, B.: Computer Graphics II. [Lectures-online]. UHK FIM. Hradec Králové 2002.
[9] BULB, M.: Programming [Online]. Prague 2000 <www.freesoft.cz/projekty/vyhen/clanky/prog/bres.html>
[10] ŠNOREK, M. and JIŘINA, M.: Neuronové sítě a neuropočítače, Prague, 1998.
[11] ŠONKA, M.: Course Digital Image Processing. [Online]. Prague 2002.
[12] ZAHN, T., CH.: Fourier Descriptors for Plane Closed Curves. In: IEEE Trans. on Computers, vol. C-21, No. 3, 1972.
[13] LOPEZ-CAVIEDES, M. and SANCHEZ-DIAZ, G.: A New Clustering Criterion in Pattern Recognition. International Journal WSEAS Transactions on Computers, Issue 3, Volume 3, July 2004, ISSN 1109-2750.
[14] RODRIGUEZ, J. N. and CO.: An Artificial Vision System for Identify and Classify Objects. International Journal WSEAS Transactions on Computers, Issue 2, Volume 3, April 2004, ISSN 1109-2750.
Address: Ing. RNDr. Jiří Šťastný, CSc., Institute of Automation and Computer Science, Brno University of Technology, Technicka 2, 616 69 Brno, Czech Republic, Tel.: +420 541 142 294, Fax: +420 541 142 490, e-mail: [email protected]
Address: Ing. Martin Minařík, Institute of Automation and Computer Science, Brno University of Technology, Technicka 2, 616 69 Brno, Czech Republic, Tel.: +420 541 142 294, Fax: +420 541 142 490, e-mail: [email protected]
APPLICATION OF GRAPH THEORY IN AN INTELLIGENT TRANSPORT SYSTEM³
Tomáš Klieštik

Žilinská univerzita

Abstract: The paper deals with the application of graph theory in an intelligent transport system, specifically with the application of Hamiltonian cycles in a graph, i.e. it solves the travelling salesman problem. Using this problem, a transport company can optimize (minimize) its transport costs by minimizing the transport route. The problem is illustrated on a model example and solved with the optimization software LINGO.

Keywords: graph, cycle in a graph, minimization, objective function, constraints, intelligent transport system
We describe and study real systems by means of ideal mathematical objects. One such ideal object, created by mathematics and used to express many situations that often differ completely in content, is the graph. Graph theory is among the youngest mathematical disciplines; it took shape as a systematic science only in the 1930s. Leonhard Euler is regarded as one of the first pioneers of graph theory; among other things, he became famous for solving the problem of the seven bridges of Königsberg. The task was to design a round trip that crosses all the bridges, but each of them only once. Euler proved that no such round trip exists. Graph theory was further developed by G. R. Kirchhoff, K. Appel, W. Haken, W. R. Hamilton and others. In this paper I discuss in more detail the results of graph theory first defined by William Rowan Hamilton. I apply them to the so-called travelling salesman problem, also known as the round-trip problem. The problem can be solved in two ways: by heuristic methods or by linear programming. I solve the round-trip problem by the methods of linear programming. Computationally it is an extremely time-consuming problem, and therefore a model example is used to outline how it can be solved with the optimization software LINGO.

In the travelling salesman problem we must determine the order in which a salesman who sets out from a certain place visits each of the other cities exactly once and returns to the starting city. The number of places in the network, n, is assumed to be known, as are the distances between the individual places c_ij (i = 1, 2, ..., n; j = 1, 2, ..., n). The problem can be written as a linear programming problem with bivalent variables: if the trip on the given route is made, the value of the variable is 1; if the trip is not made, the value of the variable is 0. We assume that at the t-th step (t = 1, 2, ..., n) a trip between some two places is made. We introduce bivalent variables x_ijt into the problem.

If the trip from place i to place j is made at the t-th step, x_ijt = 1; if not, x_ijt = 0. The coefficients c_ij satisfy:
$$c_{ij} = \begin{cases} c_{ij}, & \text{if a route from place } i \text{ to place } j \text{ exists} \\ 0, & \text{if } i = j \\ M, & \text{if no route from place } i \text{ to place } j \text{ exists} \end{cases} \tag{1}$$

³ This paper is an output of the scientific project CISKO, Š. et al.: Ekonomické aspekty inteligentného dopravného systému (dopravnej telematiky) v odbore cestnej dopravy, project VEGA 1/12349/04, ŽU v Žiline, FPEDaS
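The problem can then be formulated mathematically as follows:

$$\min \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{t=1}^{n} c_{ij}\, x_{ijt}, \qquad i, j, t = 1, 2, \ldots, n \tag{2}$$

subject to

$$\sum_{i=1}^{n}\sum_{j=1}^{n} x_{ijt} = 1, \qquad t = 1, 2, \ldots, n \tag{3}$$

$$\sum_{j=1}^{n}\sum_{t=1}^{n} x_{ijt} = 1, \qquad i = 1, 2, \ldots, n \tag{4}$$

$$\sum_{i=1}^{n}\sum_{t=1}^{n} x_{ijt} = 1, \qquad j = 1, 2, \ldots, n \tag{5}$$

$$\sum_{i=1}^{n} x_{ijt} - \sum_{k=1}^{n} x_{jk(t+1)} = 0, \qquad j = 1, 2, \ldots, n, \quad t = 1, 2, \ldots, n-1 \tag{6}$$

$$\sum_{i=1}^{n} x_{ijn} - \sum_{k=1}^{n} x_{jk1} = 0, \qquad j = 1, 2, \ldots, n \tag{7}$$

$$x_{ijt} \in \{0, 1\} \tag{8}$$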
Conditions (3) to (5) ensure that the travelling salesman can arrive at each place only once and leaves each place only once, making n trips in total. Conditions (6) and (7) ensure that if the salesman travels to some place s at the k-th step, then at step k+1 he can depart only from that same place s. Conditions (3) to (7) ensure that the round trip is uninterrupted and prevent the formation of cycles. Condition (8) is the bivalence condition. The linear programming problem then has n·n·n variables and n+n+n+n+(n-1)+n constraints⁴. Although the size of the problem is not an obstacle from the computational point of view (it can be solved quickly with one of the optimization software packages), it considerably complicates setting the problem up. The problem can be converted to the so-called Tucker formulation of the travelling salesman problem. Bivalent variables x_ij, which denote a trip between place i and place j, are introduced into the problem. To prevent the formation of cycles in the transport network, additional conditions are introduced into the linear programming problem in the form:
$$u_i - u_j + n\, x_{ij} \le n - 1, \qquad i, j = 2, 3, \ldots, n, \quad i \ne j$$

in which the variables u_i and u_j can take arbitrary values (they are real numbers assigned to place i and place j, respectively). The travelling salesman problem can then be formulated as a linear programming problem in the following way:

$$\min \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\, x_{ij} \tag{9}$$

subject to

$$\sum_{i=1}^{n} x_{ij} = 1, \qquad j = 1, 2, \ldots, n \tag{10}$$

$$\sum_{j=1}^{n} x_{ij} = 1, \qquad i = 1, 2, \ldots, n \tag{11}$$

$$u_i - u_j + n\, x_{ij} \le n - 1, \qquad i, j = 2, 3, \ldots, n, \quad i \ne j \tag{12}$$

$$x_{ij} \in \{0, 1\}, \qquad i, j = 1, 2, \ldots, n \tag{13}$$

⁴ In this paper I illustrate the problem on a model example with five places, i.e. the problem would have 125 variables and 29 constraints.
In the problem formulated in this way, the number of variables and the number of constraints are significantly reduced⁵. Even so, the problem is fairly large already for 5 places, and it is therefore advisable to use one of the optimization software packages. I decided to solve the problem in the program LINGO, which contains a special language that allows the problem to be written even more simply. A disadvantage is that only an educational demo version is freely available; it works like the full version but has certain limitations. One of them is the number of bivalent variables, which is only 30, i.e. we can optimize only five places. The problem is written as follows:

MODEL:
! Travelling salesman problem;
SETS:
  MIESTO/A,B,C,D,E/: U;
  MATICA(MIESTO,MIESTO): X, KM;
ENDSETS
! Minimization of the number of kilometres travelled;
MIN = @SUM(MATICA: KM*X);
! Row and column sums are equal to 1;
@FOR(MIESTO(I): @SUM(MIESTO(J): X(I,J)) = 1);
@FOR(MIESTO(J): @SUM(MIESTO(I): X(I,J)) = 1);
! The variables U can be arbitrary;
@FOR(MIESTO: @FREE(U));
! Tucker's conditions preventing cycles;
@FOR(MIESTO(I) | I #GT# 1:
  @FOR(MIESTO(J) | J #GT# 1:
    U(I) - U(J) + @SIZE(MIESTO)*X(I,J) <= @SIZE(MIESTO) - 1));
! Bivalence conditions;
@FOR(MATICA: @BIN(X));
DATA:
! Distance matrix;
KM =  0 25 60 87 42
     25  0 68 12 58
     60 68  0 33 40
     87 12 33  0 71
     42 58 40 71  0;
ENDDATA
END
⁵ In the model example there will be only 25 variables and 17 constraints.
After running the Solve command we obtain the following solution:

Global optimal solution found at iteration: 65
Objective value: 152.0000

Variable      Value        Reduced Cost
U( A)         0.000000     0.000000
U( B)         4.000000     0.000000
U( C)         2.000000     0.000000
U( D)         3.000000     0.000000
U( E)         0.000000     0.000000
X( A, A)      0.000000     0.000000
X( A, B)      0.000000     25.00000
X( A, C)      0.000000     60.00000
X( A, D)      0.000000     87.00000
X( A, E)      1.000000     42.00000
X( B, A)      1.000000     25.00000
X( B, B)      0.000000     0.000000
X( B, C)      0.000000     68.00000
X( B, D)      0.000000     12.00000
X( B, E)      0.000000     58.00000
X( C, A)      0.000000     60.00000
X( C, B)      0.000000     68.00000
X( C, C)      0.000000     0.000000
X( C, D)      1.000000     33.00000
X( C, E)      0.000000     40.00000
X( D, A)      0.000000     87.00000
X( D, B)      1.000000     12.00000
X( D, C)      0.000000     33.00000
X( D, D)      0.000000     0.000000
X( D, E)      0.000000     71.00000
X( E, A)      0.000000     42.00000
X( E, B)      0.000000     58.00000
X( E, C)      1.000000     40.00000
X( E, D)      0.000000     71.00000
X( E, E)      0.000000     0.000000
It follows that the optimal route is A-E-C-D-B-A, the transport cost of this route is 152 units, and the solution is reached after 65 iterations. In other words, under the given input conditions there is no other route that would lead to a lower cost than the route found by solving the travelling salesman problem.
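For five places the result can be verified independently by listing all round trips. The following sketch is not part of the LINGO model; it is a plain enumeration that uses the same distance matrix, with function names chosen here for illustration.

```python
from itertools import permutations

PLACES = "ABCDE"
KM = [
    [0, 25, 60, 87, 42],
    [25, 0, 68, 12, 58],
    [60, 68, 0, 33, 40],
    [87, 12, 33, 0, 71],
    [42, 58, 40, 71, 0],
]


def tour_length(order):
    """Length of the round trip that visits the places in the given order and returns to the start."""
    return sum(KM[a][b] for a, b in zip(order, order[1:] + order[:1]))


def shortest_tour():
    """Try every order of the remaining places with place A fixed as the start."""
    best = min(((0,) + rest for rest in permutations(range(1, len(KM)))), key=tour_length)
    return tour_length(best), "-".join(PLACES[i] for i in best + (0,))
```

Because the distance matrix is symmetric, the route A-E-C-D-B-A and the same route travelled in the opposite direction have the same length, so the enumeration may report either of them. Enumeration is practical only for a handful of places, since the number of orders grows as (n − 1)!.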
References:
[1] UNČOVSKÝ, L. Modely sieťovej analýzy. Bratislava : Alfa, 1991.
[2] BREZINA, I.; IVANIČOVÁ, Z. Kvantitatívne metódy v logistike. Bratislava : Ekonóm, 1999.
[3] DADO, M. et al. TASID – technológie a služby inteligentnej dopravy, vedecko technický projekt č. AV/819/2002, Cisko, Š. et al: čiastkový projekt – Ekonomické a mimoekonomické efekty a hodnotenie investícií, ŽU v Žiline, 2002-2005
[4] GREGOVÁ, E. Regionálne aspekty globalizácie: nová úloha konkurenčnej schopnosti regiónu. In: Globalizácia a jej sociálno-ekonomické dôsledky ´05: Zborník z medzinárodnej vedeckej konferencie. Rajecké Teplice 2005. ISBN: 80-8070-463-5

Address: Ing. Tomáš Klieštik, PhD., Žilinská univerzita, 010 26 Žilina, tel.: 041/5133221, e-mail: [email protected]
THE VORTEX-FRACTAL THEORY OF THE UNIVERSE STRUCTURES Pavel Ošmera
Brno University of Technology

Abstract: The strength of physical science lies in its ability to explain phenomena and to make predictions based on observable and repeatable phenomena according to known laws. Science is particularly weak in examining unique, nonrepeatable events. We try to piece together the knowledge of evolution with the help of biology, informatics and physics to describe a complex vortex structure of the universe. Evolution is a procedure in which matter, energy, and information come together. We would like to find plausible unifying mechanisms that explain vortex systems. Investigators with specialized training in overlapping disciplines can bring new insights to the area of study, enabling them to make original contributions. This paper is an attempt to explain a vortex-fractal principle of universe structures, vortex light rays and the nature of gravitation.
Keywords: evolution, universe, a basic particle structure, light, gravitation
1. Introduction

Matter has an innate tendency to self-organize and generate complexity [1-9]. This tendency has been at work since the birth of the universe, when a pinpoint of featureless matter budded from “nothing” at all [11]. Irreversibility and nonlinearity characterize phenomena in every field of complexity. Nonlinearity causes small changes on one level of organization to produce large effects (anomalies) at the same or higher levels. The smallest of events can lead to the most massive consequences. We can see emergent properties, which manifest as the result of positive and negative feedback. But the global features of a system cannot be understood only by analyzing its parts separately. Deterministic chaos arises from an infinitely complex fractal structure (see Fig. 1). A fractal’s form is the same no matter what length scale we use. By using the techniques of parallelism and massive parallelism in computer simulations we come a little closer to explaining the basic principles of complex systems. Our attention is directed to the most efficient algorithms of turbulence simulation, which can help us understand the behavior of very complex fractal objects such as a whirl. Chaotic systems are exquisitely sensitive to initial conditions, and their future behavior can only be reliably predicted over a short time period. Moreover, the more chaotic the system, the less compressible its algorithmic representation. Turbulence is regarded as one of the “grand challenge” problems in contemporary high-performance computing. Despite the astonishing progress during the fifty years since the visionary work of von Neumann, simulating turbulent fluid flow in a realistic way is still largely beyond the capability of today’s computers.

In essence, the common underlying theme linking the complexity of nature with computation is the emergence of complex organized behavior from many simpler cooperative and conflicting interactions between microscopic components, whether they are spinning electrons, atoms, etc.
Fig. 1 A spiral structure as a fractal
Earthquakes, avalanches, and financial crashes have a common fingerprint: the distribution of events follows a simple power law [3], [11]. This power law means that the physics of small avalanches is the same as that of large ones. Self-organization is a natural consequence of the time evolution of vast aggregates of simple agents (particles). By making these agents interact in a more complex way we could create an even greater variety of behavior, such as spiral structures (see Fig. 2b) reminiscent of galaxies (see Fig. 3a,b), hurricanes (see Fig. 3c), tornadoes and particles of matter. Nonliving things, for instance crystals, are capable of self-reproduction during growth. Evolution on the edge of chaos can be extended to nonliving systems [6 – 8]. The negative forces are caused by negative fluctuation, and the positive forces are caused by positive fluctuation and by selection as an influence of boundary conditions. Fractals seem to be very powerful in describing natural objects on all scales. Fractal dimension and fractal measure are crucial parameters for such a description [12 - 15]. Many natural objects exhibit self-similarity or partial self-similarity between the whole object and its parts. Different physical quantities describe the properties of fractal objects with a fractal dimension D in E-dimensional Euclidean space [12]. The fractal dimension D depends on the relation between the number of repetitions and the reduction of the individual object. There is a relationship between the dimensionality and the fractal properties of matter, which contains the golden mean constant φ = (√5 – 1)/2 = 0.6180339887. The constant φ is a special case of the fractal dimension D defined by the condition D (D – E + 2) = 1 for E = 3 [12]. Links between the inverse coupling constants of various interactions (gravitational, electromagnetic, weak and strong) in three-dimensional Euclidean space are discussed in [13].
Different properties of particles (and of the interactions between them) correspond to specific values of the fractal dimension. The values D = 0, E – 2, E – 1 and E play the most important role in such an analysis [13]. There exists a large body of knowledge about the process of natural evolution that can be used to guide simulations. This process is well suited for solving problems with unusual constraints where heuristic solutions are not available or generally lead to unsatisfactory results. Revolution often has an interdisciplinary character. Its central discoveries often come from people straying outside the normal bounds of their specialties. Naturalistic explanations of the universe’s origin are speculative [1,9,11]. But does this mean such inquiries are impotent or without value? The same criticism can be made of any attempt to reconstruct unique events in the past. We cannot complete our knowledge without answering some of the fundamental questions about nature. How did the universe begin? What is turbulence? Above all, in a universe ruled by entropy, drawing inexorably toward greater and greater disorder, how does order arise? Although the various speculative origin scenarios may be tested against data collected in laboratory experiments, these models cannot be tested against the actual events in question, i.e., the origin of complex structures. Such scenarios, then, must ever remain speculation, not knowledge. There is no way to know whether the results from these experiments tell us anything about the way the universe itself evolved. In a strict sense, these speculative reconstructions are not falsifiable; they may only be judged plausible or implausible. In the familiar Popperian sense of what science is, a theory is deemed scientific if it can be checked or tested by experiment against observable, repeatable phenomena.
The behavior of complex nonlinear systems with unpredictable behavior can be demonstrated by a relatively simple and transparent system such as a magnetic pendulum [8]. The idea is to set the pendulum swinging and guess which attractor will win. Even with just three magnets placed in a triangle, the pendulum’s motion cannot be predicted. The unexpected behavior can be extended to physiological and psychiatric medicine, economic forecasting, and perhaps the evolution of society. A physicist could not truly understand turbulence or complexity unless he understood pendulums. Chaos began to unite the study of different systems. A simulation brings its own problem: the tiny imprecision built into each calculation rapidly takes over, because this is a system with sensitive dependence on initial conditions. But people have to know about disorder if they are going to deal with it. Classical scientists want to discover regularities. It is not easy to find the grail of science, the Grand Unified Theory or “theory of everything”. On the other hand, there is a trend in science toward reductionism, the analysis of systems only in terms of their constituent parts: quarks, chromosomes, or neurons. Some scientists believe that they are looking for the whole. Magnetic fields are most easily understood in terms of magnetic field lines. These field lines define the direction and strength of the magnetic field at any location in 3D nonlinear space. The magnetic lines have both direction and strength: the closer we are to a magnetic source, the stronger the field. The magnetic field lines always begin on the north pole of a magnet and end on the south pole. The magnetic field of a magnetic dipole is approximately proportional to the inverse cube of the distance from the dipole. Therefore, if we double the distance from the magnet, the magnetic field strength will be reduced by a factor of 8. The magnetic system of a magnetic pendulum is very complex [8].
Even if we know the initial state, we cannot predict the final state. Even with just three magnets on the base plate, we cannot predict the motion. On the other hand, if we know the final state, we cannot reconstruct the history back to the initial state. The same problem arises with the origin of the universe.
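The factor of 8 quoted in this section for the magnetic dipole follows directly from the inverse-cube dependence. As a short worked step, with B(r) denoting the field strength at distance r and k a constant of proportionality (both symbols are introduced here only for illustration):

$$B(r) \approx \frac{k}{r^{3}} \quad\Longrightarrow\quad \frac{B(2r)}{B(r)} = \frac{r^{3}}{(2r)^{3}} = \frac{1}{8}$$

By the same reasoning, tripling the distance would reduce the field strength by a factor of 27.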
2. Self-organization of complex systems

Complex systems share certain crucial properties (non-linearity, a complex mixture of positive and negative feedback, nonlinear dynamics, emergence, collective behavior, spontaneous organization, etc.). In the natural world, such systems include the universe, brains, immune systems, ecology, cells, developing embryos, and ant colonies. In the human world, they include cultural and social systems [5, 6, 8]. Each of these systems is a network of a number of “agents” acting in parallel. In a brain, the agents are nerve cells; in ecology, the agents are species; in a cell, the agents are organelles such as the nucleus and the mitochondria; in an embryo, the agents are cells, and so on. Each agent finds itself in the environment produced by its interactions with the other agents in the system. It is constantly acting and reacting to what the other agents are doing. There are emergent properties that arise from the interaction of many parts: the kinds of things that a group of agents can do collectively but an individual cannot. There is no master agent, for example no master neuron in the brain. Complex systems have many levels of organization (hierarchical structures), with agents at any level serving as building blocks for agents at a higher level. An example of a self-organized structure is a whirlpool (see Fig. 2a). Nonlinearity in feedback processes serves to regulate and control. Evolution is chaos with feedback [17].
3. Vortex structures
Vortex structures with vortex lines, such as those created approximately in a whirlpool or in a tornado, are perhaps a plausible speculative model of elementary particle structures. A whirlpool structure (a turbulent eddy) with a funnel shape can form, for example, at the water outlet of a bath or in a PET bottle (see Fig. 2a). The streamlines are spirals (or circles) about a vortex axis, similar to the lines of the magnetic field around a wire carrying a current. The velocity v of the flow is inversely proportional to the distance from the vortex axis, as can be observed at the drain hole of a bath-tub [16,18]. The speed v also depends on friction. In the bath-tub the core is replaced by air. For a hurricane, the core is called the eye. If two or more vortex lines lie parallel side by side in the fluid, the core of each vortex line must move in the velocity field arising from the other vortex lines. So two parallel vortex filaments with opposite rotation (spin) follow straight-line courses side by side (see Fig. 6a), whereas with the same spin they dance around each other (Fig. 6b). If three or more vortices work together (see Fig. 6c), a more complex structure can be created. If one bends a vortex line into a closed ring, the vortex ring moves with unchanging shape in a straight line: each part of the ring must move in the velocity field of all the other parts. For example, experienced smokers can blow smoke rings (see Fig. 4).
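The motion of parallel vortex filaments described above can be checked with the classical point-vortex model of an ideal two-dimensional fluid. The sketch below is a minimal illustration, not part of the presented theory: the function names, the circulations of ±1 and the midpoint integration scheme are assumptions made for this example. Each vortex induces a velocity Γ/(2πr) perpendicular to the line joining it to the other vortex, in agreement with the 1/r law stated in the text.

```python
import math


def velocities(positions, circulations):
    """Velocity induced at each point vortex by all the other vortices."""
    result = []
    for i, (xi, yi) in enumerate(positions):
        u = v = 0.0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            dx, dy = xi - xj, yi - yj
            r2 = dx * dx + dy * dy
            # Speed Gamma / (2*pi*r), directed perpendicular to the separation.
            u += -circulations[j] * dy / (2 * math.pi * r2)
            v += circulations[j] * dx / (2 * math.pi * r2)
        result.append((u, v))
    return result


def simulate(positions, circulations, dt=0.001, steps=5000):
    """Advance the vortices in time with a midpoint (second-order) step."""
    pos = [tuple(p) for p in positions]
    for _ in range(steps):
        k1 = velocities(pos, circulations)
        mid = [(x + 0.5 * dt * u, y + 0.5 * dt * v) for (x, y), (u, v) in zip(pos, k1)]
        k2 = velocities(mid, circulations)
        pos = [(x + dt * u, y + dt * v) for (x, y), (u, v) in zip(pos, k2)]
    return pos


if __name__ == "__main__":
    start = [(-0.5, 0.0), (0.5, 0.0)]
    print("opposite rotation:", simulate(start, [1.0, -1.0]))  # pair translates together
    print("same rotation:    ", simulate(start, [1.0, 1.0]))   # pair circles its midpoint
```

With opposite circulations both vortices drift in the same direction at constant separation; with equal circulations the separation also stays constant but the pair rotates about its midpoint, which corresponds to the two cases of Fig. 6a and Fig. 6b.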
Fig. 2 a) The vortex in the PET-bottle, b) the vortex model
Fig. 3 Examples of spiral structures: a), b) galaxies, c) the Earth’s hurricane
Fig. 4 The annular structure that can be created from vortices (for example the electron)
The presented theory is based on the following assumptions and hypotheses:
1. There is a hidden substance in the universe that contains very small sub-particles with unmeasurable mass.
2. We will call these sub-particles “osmerons” (“osmero” was the name of the deity in ancient Egypt for 4 pairs of gods).
3. Vortices and annular structures (rotational structures) can be created from these sub-particles.
4. There are two types of vortices, VB and VT, with opposite flow of the energy E (see Fig. 5a,b).
5. A vortex pair of two VB or two VT can take two forms: the same rotation with parallel rotational axes, or the contra rotation with parallel rotational axes (see Fig. 6a,b and 13b,c).
6. The vortex VB and the vortex VT can create a pair with fore head orientation on the same rotational axis.
7. Vortices VB and vortices VT can create a chain (string) structure (see Fig. 14).
8. The vortex or annular structures can change from one structure to another very quickly.
9. There is a sufficient amount of accessible energy E.
If a semi-fractal description of nature is plausible for us, we can imagine that many objects of the universe are fractal vortices [19]. If we see vortex structures in the macro-world (such as spiral galaxies) and in the everyday world (such as the bath-tub whirlpool shown in Fig. 5a and the tornado in Fig. 5b, or a hurricane), it is plausible that particles in the micro-world have similar fractal-vortex structures. The flow of energy E in the tornado-vortex VT and in the bath-tub-vortex VB has the opposite direction (see Fig. 5a and Fig. 5b). The pressure p is higher at the bottom of the vortices. To create a vortex structure we need a minimum value of energy (a quantum of energy) and a sufficient number of sub-particles (osmerons).
Perhaps there is a relation between Planck’s constant and the minimum energy of the vortex structure VB or VT.
Fig. 5 Vortex structures: a) the vortex VB at the drain hole of a bath-tub, b) the tornado-vortex VT
The forces between two vortices and the motion of the two types of vortex pairs are shown in Fig. 6. The behavior of the vortex pair shown in Fig. 6a can help us to explain the expansion of the universe (Hubble’s law, with anti-gravitational forces Fag). The behavior of the vortex pair shown in Fig. 6b can help us to explain the disc and spiral shape of the Milky Way with vortex-gravitational forces Fg. However, it cannot explain the spherical shape of celestial bodies such as the Sun or the Earth. Other forces can occur in vortex structures (see Fig. 16a, Fig. 17a). More than two vortices can form complex structures (for example the three vortices shown in Fig. 6c, or more vortices in Fig. 16).
Fig. 6 The motion and gravitational forces Fg and anti-gravitational forces Fag of the vortex pair V1 and V2: a) with the contra rotation of ω1, ω2, b) with the equal rotation of ω1, ω2, c) the motion of 3 vortex particles p1, p2, and p3 (all particles have the same direction of rotation).
We can build a chain structure from the vortices (see Fig. 10, 14). The forces and motion of two vortex pairs are shown in Fig. 7a. There are two possible arrangements of lines between the pairs, depending on the direction of rotation of P1 and P2. The lines between vortex pairs P1 and P2 (see Fig. 7a) with the same rotation are shown in Fig. 7b. The arrangement of lines for the contra rotation of the pairs is shown in Fig. 7c. More vortices can create a vortex ray (see Fig. 8a).
Fig. 7 a) The forces and motion of two vortex pairs with opposite flow of the energy E, b) lines of hidden sub-particles between pairs with the same rotation of vortices, c) lines of hidden sub-particles between pairs with the opposite rotation of vortex pairs
The accumulated energy in a complex vortex structure, for example in photon rays (see Fig. 8), increases with the frequency of the vortex vibrations along the rotational axis. The photon-ray can be a vortex row (chain) with a very small mass [1]. Every photon-ray can have the opposite rotation to its neighboring photon-rays (see Fig. 8a). The number of vortex-rays in the circular structure of the stream must be even (see Fig. 8a) to form divergent rays (see Fig. 8b). Figures 8a, b, c can help us to explain the behavior of vortex-rays both as the energy flow of particle structures (vortices) and as the wave transport of energy. Because the side rays in Fig. 8b have no outer neighbors, they are deflected by the forces F from the neighbor that is nearer to the center of the ray flow. This can explain the wave behavior of light as a flow of photons forming vortex rays behind the hole. The laser beam structure in Fig. 8c can be explained by the particle motion shown in Fig. 6c. The photon-vortex structure V (or Vp) can be created from the annular vortex-electron structure e in two ways: a) by a change of the shape of the vortex-electron structure (see Fig. 9a), b) by cutting the closed electron structure (see Fig. 9b).
Fig. 8 Vortex rays of the light: a) the spin example of vortex-rays (if f = f1 = 1 Hz then E = Ep = h · f1), b) the spreading of vortex-rays behind the hole (the wave diffraction of light), c) the spreading of vortex-rays with the same rotation (the laser beam).
Fig. 9 The release of vortex pair Vp from vortex-electron structure e: a) with change of the shape of vortex-electron-structure, b) by cutting the closed electron-structure.
4. Elementary particles
Macroscopic matter consists of molecules, which are built out of atoms; these atoms define the elements. From the four elements of the ancient Greeks we moved to 92 natural and about 20 artificial elements, each of which may appear in several isotopes [16]. We learned that each atom consists of a small nucleus and a large electron cloud around it. The nucleus is in turn composed of P charged protons and N neutral neutrons. A neutral atom thus has P electrons determining its chemical behavior. For the same P, different atomic weights P + N describe different isotopes. In 1932 we had just three basic particles, the proton, the neutron, and the electron, from which to build all known tangible matter. The neutrino (little neutron) was proposed in 1930 by Pauli to take away some of the energy, momentum and spin arising in the beta decay of a neutron into a proton and an electron. Neutrinos show very little interest in any reactions and have zero or very small mass. “Zero” neutrino mass means smaller than measurable. Each particle seems to have an antiparticle of the same mass but with the opposite sign of the electric charge. For example, the positron balances the electron, and the antiproton was found in a particle accelerator built specifically to produce such antiprotons according to E = mc². Basic particles consist of smaller parts. The proton consists of two up quarks and one down quark, or in short: proton = (uud), neutron = (udd). Quarks have fractional electric charges ±1/3, ±2/3, which explains the existence of doubly charged particles consisting of three quarks with charge 2/3 each. Quarks appear in six types (six “flavors”): u, d, c, s, t, b (= up, down, charm, strange, top, bottom). These six flavors are grouped into three generations, which correspond to the three lepton generations. Since each quark has spin 1/2, the mesons with two quarks are Bosons, and the baryons with three quarks are Fermions. Each of the quarks and leptons has its antiparticle; mesons are formed from one quark and one antiquark.
Quarks, in contrast to leptons, appear in three “colors”. We have at present 36 different quarks of various colors and flavors, and 12 different leptons, or 48 fundamental particles altogether. The masses of the three (up or down) quarks forming the proton or neutron are much smaller than the mass of that nucleon; most of the mass is hidden in the interaction energy due to the enormous color forces between quarks [16]. Perhaps this has something to do with a rotational vortex structure of nucleons. One very speculative picture of the electron structure is presented in Fig. 4. Experimental investigations of possible types of reactions show that certain particle numbers are conserved, in the sense that the number of incoming particles of a given type must equal the number of particles of the same type after the reaction is over. Each of the three lepton generations has its own conserved particle number, and so do the quarks for all generations together. Perhaps a rotational structure is changed during lepton generation (see Fig. 9). Leptons do not consist of quarks. The electric charge is also always conserved, with antiparticles having the opposite charge, whereas mass can be transformed into energy and back. Antiparticles always count as negative for the particle number and charge (but not for mass). Thus radiation energy can form an electron-positron pair, since the number of e-leptons is then still zero. So normal matter needs only the electron, u, and d as constituents, a nice simplification. Investigators with specialized training in overlapping disciplines can bring new insights to an area of study, enabling them to make original contributions. This can suggest ways in which the structure of the universe could have arisen. An open question remains: what is gravity? [16]
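The quark bookkeeping above can be verified with a few lines of exact arithmetic. This is only an illustrative sketch: the names `QUARK_CHARGE`, `hadron_charge` and `count_fundamental_fermions` are hypothetical helpers introduced here, and the count simply multiplies flavors by colors and doubles the result for antiparticles, as the text does.

```python
from fractions import Fraction

# Electric charges of the up and down quarks in units of the elementary charge e.
QUARK_CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}


def hadron_charge(quarks):
    """Sum the charges of the quarks named in a string such as 'uud'."""
    return sum((QUARK_CHARGE[q] for q in quarks), Fraction(0))


def count_fundamental_fermions(flavors=6, colors=3, lepton_types=6):
    """Count quarks and leptons, with the factor 2 accounting for antiparticles."""
    quarks = flavors * colors * 2
    leptons = lepton_types * 2
    return quarks, leptons, quarks + leptons


if __name__ == "__main__":
    print("proton  (uud):", hadron_charge("uud"))   # 1
    print("neutron (udd):", hadron_charge("udd"))   # 0
    print("three up (uuu):", hadron_charge("uuu"))  # 2
    print("quarks, leptons, total:", count_fundamental_fermions())  # (36, 12, 48)
```

The (uuu) case is the doubly charged combination mentioned in the text.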
With these conservation laws we now understand a crucial difference between mesons (quark + antiquark) and baryons (three quarks). The meson number is not conserved, since the quark and antiquark can annihilate each other. The baryon number is conserved, since the quark number is conserved. For example, the free neutron decays after about 15 minutes into a proton, an electron, and an anti-electron-neutrino. This is allowed since both the proton and the neutron have the same baryon number of one. But a neutron star or pulsar does not decay into protons because of the strong forces between neutrons. It seems that the force between two quarks is quite strong and, for large distances, independent of distance. We cannot observe isolated quarks, somewhat as we cannot isolate the north and south poles of a magnetic dipole. So if we try to pull quarks apart, we need so much energy that we merely create new particles. Only “white” combinations of quarks, where the color forces have cancelled each other (like quark-antiquark, or three quarks with the three fundamental colors), are observed as isolated particles [16]. We still feel gravitation since there are no negative masses, in contrast to positive and negative electric charges, which cancel each other in their force over long distances. Electric forces do not propagate with infinite velocity but only with the large but finite velocity of light c. Perhaps c is the maximum velocity of spreading in the hidden substance. In quantum theory, light waves are described as photons, and Coulomb forces are transmitted via these quasi-particles. Similarly, gravitational forces propagate with velocity c, perhaps with the help of quantized gravity waves called “gravitons” (not yet detected as quantized quasi-particles) [16]. Quite generally at present, forces are supposed to come from the exchange of intermediate Bosons (virtual particles).
Virtual particles are packets of energy ∆E = mc² with a short lifetime ∆t, such that the energy-time uncertainty relation ∆E ∆t ≤ h/2π allows their creation. The color forces between quarks are transmitted by gluons (i.e., by particles gluing the quarks together) of zero mass. They bind three quarks together as a nucleon (proton or neutron). At some distance from this nucleon some remnant of the color forces is felt, since they have not canceled each other exactly. Coulomb forces and gravitation are felt over infinite distances without an exponential cut-off, and thus their exchange particles have “zero” mass. Color forces must also have infinite range, since otherwise we could isolate single quarks; thus the gluons are also massless. The weak interaction covers only very short distances because of the large mass of the corresponding intermediate Bosons. The interaction energy remains the same if all spins reverse their orientation. What has been described here is the so-called standard model, which includes color forces. The Grand Unified Theory (GUT) combines it with electromagnetic and weak forces, and the Theory of Everything would include gravity. Magnetism we understand on the basis of suitable models. How can a spontaneous magnetization be formed? A very speculative picture of how an electromagnetic field can be created during a jump of an electron between two atoms is presented in [19]. Magnetic and electric lines are presented as a vortex flow of the hidden substance (sub-particles with a mass smaller than measurable). Finally, I would like to present my very speculative origin scenario. It is possible that all very complex systems exist in anomalous states, as vortex structures. These anomalous states have a hierarchical structure. Maybe 3D matter is a first anomalous stage after a collision of “supervortex” spaces. At the second level is the origin of living systems. At the third level is a brain with consciousness.
There is no greater anomaly in nature than matter that can live and can have a consciousness.
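As a rough numerical illustration of the range argument given earlier in this section (a heavy exchange boson means a short range, a massless one an infinite range), the uncertainty relation ∆E ∆t ≈ h/2π with ∆E = mc² gives a range of about ħ/(mc). The sketch below is an order-of-magnitude estimate only; the function name `exchange_range_fm` and the W boson mass of roughly 80.4 GeV are inputs assumed for this example and do not appear in the text.

```python
import math

# hbar * c in MeV * femtometre (1 fm = 1e-15 m), rounded.
HBAR_C_MEV_FM = 197.327


def exchange_range_fm(mass_mev):
    """Estimate the range (in fm) of a force carried by a boson of rest energy mass_mev."""
    if mass_mev < 0:
        raise ValueError("mass must be non-negative")
    if mass_mev == 0:
        return math.inf  # massless carrier: no exponential cut-off
    return HBAR_C_MEV_FM / mass_mev


if __name__ == "__main__":
    print("massless carrier:", exchange_range_fm(0.0))
    print("W boson (about 80.4 GeV, assumed value):", exchange_range_fm(80400.0), "fm")
```

For the assumed W boson mass the estimate is of the order of a few thousandths of a femtometre, far smaller than a nucleon, which is consistent with the very short range of the weak interaction stated above.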
Fig. 10 a), b) A speculative structure of a photon flow (electromagnetic lines)
If a fractal description with a fractal dimension is plausible for us, we can imagine that almost all objects of the universe are fractal vortices with different fractal dimensions. If we see vortex structures in the macro-world (such as spiral galaxies) and in the everyday world (such as tornadoes, whirlpools, and hurricanes), it is plausible that particles in the micro-world have the same fractal-vortex structure [20]. To create a fractal vortex we need a minimum value of energy, a quantum of energy. There can be a relation between Planck’s constant h and the fractal dimension of the vortex. The accumulated energy of a vortex structure increases with the frequency of the vortex vibrations (see Fig. 14), in agreement with the physical law for the photon’s energy. A photon flow can be a vortex row with a very small mass (see Fig. 10, 14). It can be an open structure created by cutting the closed “electron” structure in Fig. 10a. Maybe it is a better model than the classical planetary model. Perhaps our universe is not a superstring space but a “supervortex” space. Vortex structures can explain magnetism, perhaps gravity, etc. Vortices can attract each other using their different polarities (see Fig. 6b). Vortices with their rotation have inertia, which explains what the mass of matter can be in comparison with the hidden substance (sub-particles without mass). We can see the fractal vortex structure, for example, in Jupiter’s weather or Earth’s weather (see Fig. 3c). Perhaps vortex structures are a plausible speculation, but research is needed to test it. The increased awareness from other scientific communities, such as biology and mathematics, promises new insights and new opportunities. There is much to accomplish and there are many open questions. Interest from diverse disciplines continues to increase, and the evolution of complex structures is becoming more generally accepted as a paradigm for imagining the basic principles of nature.
We can make our model more complex, and more faithful to reality, or we can make it simpler and easier to handle (to generalize and abstract). Some patterns are fractal, exhibiting structures self-similar in scale.
5. The annular structure of the basic particles
What are the shape and the structure of basic particles such as the electron, the proton, and the neutron? All these particles have spin. We define the spin s as the ratio of the sum of the threads (coils) c1 and c2 to the number of electron-threads Ce (see Fig. 11 and Fig. 12). The proton and the neutron are described as groups of three quarks [20]. Our attempt to use the quarks u and d to form the annular shape of the proton and the neutron is presented in Fig. 11, 16 (not to scale: the electron is smaller, and the proton and the neutron are thicker and larger).
Fig. 11 The annular and closed energy structure of the basic particles with their spins s (fractional electric charges) (quarks u and d are only abstract and open substructures, “building blocks”, of the proton and the neutron)
The spin structure of basic particles can be explained with the description in Fig. 12, where the circular structures are opened so that they can be drawn easily (Fig. 12a to Fig. 12d). The closed structure from Fig. 12a is shown in Fig. 12e and that from Fig. 12b in Fig. 12f. The zero spin of the neutron can be formed from nonzero threads (coils) c1 and c2 with c1 = −c2.
Fig. 12 The spin structure of particles (Fig. 12e is the closed structure from Fig. 12a, Fig. 12f is the closed structure from Fig. 12b)
Forces between two vortex pairs with different axes and directions of the energy flow are presented in Fig. 13.
Fig. 13 Forces between two vortex pairs in different arrangements
Forces between vortices in the photon’s (or the gluon’s) flow of energy are presented in Fig. 14.
Fig. 14 Forces and oscillations of the photon flow
Fig. 15 Forces between two electrons
The forces between two electrons and their trajectories tr are shown in Fig. 15. Electrons with the same direction of trajectory tr form electron rays. This occurs in two cases of electron-electron orientation (shown on the right in Fig. 15). Two neighboring electrons in the electron ray slightly attract each other (with the force Fa) due to the opposite direction of their magnetic lines. All reaction forces in the electron ray have the same direction (the same as the trajectory tr). The behavior of electron rays is similar to the behavior of the photon-rays described in Fig. 8a. The strong nuclear forces can be explained by the vortex bonds Vp1 and Vp2 between protons (see Fig. 16a) and neutrons.
Fig. 16 a) The strong nuclear forces by vortex bonds Vp1 and Vp2 between protons, b) the spin structure of the proton or the neutron, c) the model of the proton’s (or neutron’s) structure (the same as in Fig. 16b)
Fig. 17 a) The vortex bonds V1a, V1b and V2a, V2b between two electrons that can transport energy, b) gravitational forces between two particles with masses m1 and m2.
The force Fe (Coulomb’s law) between the electron and the proton depends on the line density in the area S, which is inversely proportional to the square of the distance, d² [20]. The light beam (the photon flow) is a complex structure that can transfer energy by excitation of the vortex row. There is a distance between two vortices at which the couple force F has its maximum value (see Fig. 14). Around this position every vortex can oscillate, as shown in the center of Fig. 14 (the wave theory of light). The vortices (photons) in the light flow oscillate (vibrate) to transfer energy (this is not the same as particle translation). But one vortex pair (photon) can also move and transfer energy separately. The strong nuclear forces can be explained by the vortex bonds between vortex pairs Vp1 and Vp2 (see Fig. 16a). The gravitational force FG is proportional to the density of magnetic lines in the area S (see Fig. 17b), which decreases with the square of the distance, d², and increases in proportion to the masses m1 and m2 (Newton’s law). The more vortex bonds there are between protons in the nucleus (see Fig. 16a), the higher the density of the magnetic field in the area S and the stronger the gravitational forces. The higher the energy of the nucleus, the more vortex bonds are created and the stronger the forces between protons and neutrons (the strong interaction). The main component of the gravitational force FG is the complex magnetic field with two-way magnetic lines (see Fig. 17b) that alternate with one another. There is an analogy between electric forces (Coulomb’s law) and magnetic forces with a gravitational influence (Newton’s law). The whole universe is filled with magnetic lines. To explain the structure of the magnetic lines we need “sub-subparticles” smaller than the basic particles such as the electron, proton, and neutron. These “sub-subparticles” can form flexible magnetic lines with a structure similar to that of photons, but they have to be smaller (perhaps they can be gluons).
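Because Coulomb’s law and Newton’s law share the same inverse-square dependence on the distance d, the ratio of the two forces between an electron and a proton does not depend on d. The sketch below evaluates that ratio with the standard textbook formulas; it is a plain numerical comparison and says nothing about the vortex mechanism proposed here. The constants are rounded reference values and the function names are assumptions made for this example.

```python
# Rounded reference constants in SI units (assumed inputs, not taken from the paper).
K_COULOMB = 8.9875517923e9      # N m^2 / C^2
G_NEWTON = 6.67430e-11          # N m^2 / kg^2
E_CHARGE = 1.602176634e-19      # C
M_ELECTRON = 9.1093837015e-31   # kg
M_PROTON = 1.67262192369e-27    # kg


def coulomb_force(q1, q2, d):
    """Magnitude of the electrostatic force between two point charges."""
    return K_COULOMB * abs(q1 * q2) / d ** 2


def newton_force(m1, m2, d):
    """Magnitude of the gravitational force between two point masses."""
    return G_NEWTON * m1 * m2 / d ** 2


def force_ratio(d):
    """Electric-to-gravitational force ratio for an electron-proton pair at distance d."""
    return coulomb_force(E_CHARGE, E_CHARGE, d) / newton_force(M_ELECTRON, M_PROTON, d)


if __name__ == "__main__":
    for d in (5.29e-11, 1.0):  # roughly the Bohr radius, and one metre
        print(f"d = {d:g} m: Fe / FG = {force_ratio(d):.3e}")
```

The ratio comes out near 10^39 at both distances, so any common mechanism for the two forces has to account for this very large difference in strength.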
Fig. 18 Forces between the proton p and electron e
The position of the electron relative to the proton during the levitation depends on two different types of forces. The magnetic force Fm repels the electron from the proton (see Fig. 18a) and the charge-reaction force Fr of the electron attracts it. The charge forces Fr work on the principle of action and reaction (the same principle as the rocket engine, but with the hidden substance, as shown in Fig. 15). The magnetic repulsion forces are stronger when the particles are closer. The electron hangs where this repulsion balances the attractive force of the charges, that is, at the point of equilibrium where the total force is zero. If the electron were not spinning, the magnetic torque would turn it over. When the electron is spinning, the torque acts gyroscopically and the axis does not overturn but rotates about the direction of the proton’s magnetic field. This rotation is called precession. For the electron to remain suspended, equilibrium is not enough. The equilibrium must also be stable, so that a slight horizontal or vertical displacement produces a force pushing the electron back toward the equilibrium point. The reaction force Fr of the electron and the strength Fm of magnetization between the proton and the electron determine the equilibrium distance d where magnetism balances the “rocket” force Fr. Slight changes of temperature alter the magnetization of particles.
6. Conclusions
The annular-vortex model might be better than a classical planetary one. Our universe might be considered as a “supervortex” space. Vortex structures can explain the electromagnetic field, and perhaps gravitation too. Vortices can attract each other using their different polarities (see Fig. 13a). Planck’s constant h might be the energy Ep of one vortex pair Vp (see Fig. 8a).
Closed vortex structures (Fermions) with their rotation have inertia, which explains what the mass of matter can be in comparison with the hidden substance (the sub-particles “osmerons”, with very small and unmeasurable size, in the vortex structure). Radiation is an open vortex structure (Bosons, for example light with photons) and matter is a closed vortex structure with mass (for example electrons, protons, and neutrons, and the complex structures that follow from them, such as the nucleus; see Fig. 16a). Electron structures rotate, whereas proton (neutron) structures need not rotate to have a rotating magnetic field. Both the electron and the proton have a rotating magnetic field. Vortex structures might be a plausible speculation for computer models and calculations. Fractals seem to be very powerful in describing natural objects on all scales.
References [1] THAXTON, Ch. B.; BRADLEY, W. L.; OLSEN, R. L. The Mystery of Life’s Origin: Reassessing Current Theories. New York : Philosophical Library, 1984. [2] DAWKINS, R. The Selfish Gene. Oxford : Oxford University Press, 1976. [3] KAUFFMAN, S. A. Investigations. New York : Oxford University Press, 2000.
[4] [5] [6] [7] [8]
[9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20]
PRIGOGINE, I.; STENGERS, I. Order out of Chaos. Flamingo, 1985. OŠMERA, P. Complex Adaptive Systems. Proceedings of MENDEL’2001, Brno : Czech Republic (2001) 137 – 143. OŠMERA, P. Complex Evolutionary Structures. Proceedings of MENDEL’02, Brno : Czech Republic (2002) 109 – 116. OŠMERA, P. Evolvable Controllers using Parallel Evolutionary Algorithms. Proceedings of MENDEL’2003, Brno : Czech Republic (2003) 126 – 132. OŠMERA, P. Evolution of System with Unpredictable Behavior. Proceedings of MENDEL’2004, Brno : Czech Republic (2004) 1 – 6. OŠMERA, P. Genetic Algorithms and their Applications. Habilitation thesis (in Czech), 2002. WALDROP, M. M. Complexity – The Emerging Science at the Edge of Order and Chaos. Viking, 1993. OŠMERA, P.; POPELKA, O.; PANACEK, T. Parallel Grammatical Evolution. Proceedings of MENDEL’2005, Brno : Czech Republic (2005). COVENEY, P.; HIGHFIELD, R. Frontiers of Complexity. Faber and Faber, 1996. ZMEŠKAL, O.; NEZADAL, M.; BUCHNICEK, M. Fractal-Cantorian geometry, Hausdorff dimension and fundamental laws of physics. Chaos, Solitons & Fractals 17 (2003) 113-119. ZMEŠKAL, O.; NEZADAL, M.; BUCHNICEK, M. Coupling constants in fractal and cantorian physics. Chaos, Solitons & Fractals (2005), article in press. EL NASCHIE, M. S. On the exact mass spectrum of quarks. Chaos, Solitons & Fractals 2002, 14; 369-76. EL NASCHIE, M. S. Quantum gravity, Clifford algebras and fundamental constants of nature. Chaos, Solitons & Fractals 2002, 14; 437-50. STAUFFER, D.; STANLEY, H. E. From Newton to Mandelbrot. A Primer in Theoretical Physics with Fractals for the Personal Computer. Springer-Verlag Berlin Heidelberg, 1996. GLEICK, J. Chaos – Making a New Science. Vintage, 1998. CAPRA, F. The Web of Life. HarperCollins Publishers, 1996. OŠMERA, P. Evolution of the universe structures. Proceedings of MENDEL 2005, Brno : Czech Republic (2005) 1-6. OŠMERA, P. The Vortex-fractal Theory of the Gravitation. Proceedings of MENDEL 2005, Brno : Czech Republic (2005) 7-14.
Address: Doc. Ing. Pavel Ošmera, CSc. Institute of Automation and Computer Science Brno University of Technology Technicka 2, 616 69 Brno, Czech Republic Tel.: +420 541 142 294 Fax: +420 541 142 490 e-mail: osmera@fme.vutbr.cz
VORTEX-FRACTAL PHYSICS Pavel Ošmera
Brno University of Technology
Abstract: We would like to find plausible unifying mechanisms for an explanation of vortex systems. This paper is an attempt to attain a new and profound understanding of nature’s behavior through a vortex-fractal principle for everything. We give a vortex explanation of polarization and of the diffraction grating, and we compare quantum electrodynamics (QED) with the vortex-fractal description. This new approach can be called the physics of vortex structures (FVS).
Keywords: vortex, polarization, diffraction grating, basic particle structure, light, gravitation.
1. Introduction
The electrical force, like the gravitational force, decreases inversely as the square of the distance between charges. This relationship is called Coulomb’s law. There are two kinds of “matter”, which we can call positive and negative. Like kinds repel and unlike kinds attract, unlike gravity, where there is only attraction. But this is not precisely true when charges are moving: the electrical forces also depend on the motion of the charges in a complicated way [2]. One part of the force between moving charges we call the magnetic force. It is really one aspect of a vortex effect. That is why we call the subject “electromagnetism”. We find, from experiment, that the force that acts on a particular charge, no matter how many other charges there are or how they are moving, depends only on the position of that particular charge, on the velocity of the charge, and on the amount of charge [2]. We can write the force on a charge q moving with a velocity v as
F = q (E + v × B). (1.1)
We call E the electric field and B the magnetic field at the location of the charge. There is still “something” there when the charge is removed. We consider the field as a mathematical function of position and time. For an arbitrary closed surface, the net outward flow, or flux, is the average outward normal component of the velocity times the area of the surface:
Flux = (average normal component) · (surface area). (1.2)
In the case of an electric field, we can mathematically define something analogous to an outflow, and we again call it flux, but of course it is not the flow of any substance, because the electric field is not the velocity of anything [2]. In the vortex-fractal hypothesis of electron structure [6] it can be the velocity of osmerons [8]. Osmerons are sub-particles that create, for example, the vortex structure of an electron [7]. The name osmeron was derived from the name of the Egyptian deity with 4 pairs of gods for the primary creative forces (from a chaotic beginning).
Osmerons are so small that their size and mass are unmeasurable (see Fig. 1).
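Equation (1.1) can be evaluated directly for given field and velocity vectors. The following sketch is a minimal illustration using plain Python tuples; the helper names `cross` and `lorentz_force` and the numerical values in the usage lines are assumptions chosen for the example, with all quantities understood to be in SI units.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def lorentz_force(q, E, v, B):
    """Force F = q (E + v x B) on a charge q moving with velocity v, as in Eq. (1.1)."""
    v_cross_b = cross(v, B)
    return tuple(q * (e_i + c_i) for e_i, c_i in zip(E, v_cross_b))


if __name__ == "__main__":
    # A unit positive charge moving along x in a magnetic field along z, no electric field.
    print(lorentz_force(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
    # The same motion with an additional electric field along y.
    print(lorentz_force(1.0, (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

In the first case the force points along the negative y direction, perpendicular to both v and B; in the second case the electric and magnetic contributions cancel exactly.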
Fig. 1 A vortex structure of light rays
Physics has a history of synthesizing many phenomena into a few theories. For example, it was discovered that heat phenomena are easily understandable from the laws of motion. The theory of gravitation, on the other hand, was not understandable from the other theories. Gravitation is, so far, not understandable in terms of other phenomena [4]. Quantum mechanics supplied the theory behind chemistry, so fundamental theoretical chemistry is really physics. The theory of the interaction of light and matter is called “quantum electrodynamics” (QED). There are constants in quantum electrodynamics that have been measured and calculated with very high accuracy, so the QED theory is probably not too far off. It is necessary to distinguish two questions: how Nature works and why Nature works that way. There is another question: “What holds the nucleus together?” [2]. In a nucleus there are several protons, all of which are positive. Why don’t they push themselves apart? It turns out that in nuclei there are, in addition to electrical forces, nonelectrical forces, called nuclear forces, which are greater than the electrical repulsion. The nuclear forces, however, have a short range: they fall off much more rapidly than 1/r² [2]. It seems to me that the nucleus is one complex energy structure created from protons and neutrons connected by vortex bonds [7, 8]. In this complex nucleus structure, energy runs in one complex loop [7]. We may ask, finally, what holds a negatively charged electron together (since it has no nuclear forces). If an electron is all made of one kind of substance, each part should repel the other parts [2]. If we accept the vortex electron structure, it can be the vortex forces between the photons from which the electron is created [7, 8]. What is the charge? It can be something related to the flow of osmerons through the annular electron structure (ring).
The electrical force, like the gravitational force, decreases inversely as the square of the distance, here the distance between charges. The same principle must be behind both. Perhaps there is a very small escape of energy from the complex loop of the nucleus in the vortex bonds (“gravitons”, a small number of osmerons) that creates the gravitational field.
2. Diffraction grating
Light of a particular color can be split one more time in a different way, according to its so-called “polarization”. Light is something like raindrops: each little lump of light is called a photon, and if the light is all one color, all the “raindrops” are the same size and have the same vortex structure (see Fig. 1). The human eye is a very good instrument: it takes only about five or six photons to activate a nerve cell and send a message to the brain [4]. Light goes in straight lines; it bends when it goes into water; when it is reflected from a surface like a mirror, the angle at which the light hits the surface is equal to the angle at which it leaves the surface. Light can be separated into colors; you can see beautiful colors on a mud puddle when there is a little bit of oil on it (because the thickness of the oil film is not exactly uniform); a lens focuses light; and so on [4]. When a photon comes down on the surface of the glass, it interacts with electrons throughout the glass, not just on the surface. The photon and the electrons do some kind of dance, the net result of which is the same as if the photon hit only the surface [4]. There is a relationship between the thickness of a sheet of glass and partial reflection [4]. It appears that partial reflection can be “turned off” or “amplified” by the presence of an additional surface, which demonstrates a phenomenon called “interference”. As the thickness of the glass increases, partial reflection goes through a repeating cycle from zero to 16%, with no signs of dying out [4]. This strange phenomenon of partial reflection by two surfaces can be explained for intense light by a theory of waves, but the wave theory cannot explain how the detector makes equally loud clicks as the light gets dimmer. Quantum electrodynamics “resolves” this wave/particle duality by working with the probability that a photon will hit a detector.
Grand principle of QED: the probability of an event is equal to the square of the length of an arrow called the “probability amplitude”. General rule of QED: draw an arrow for each way the event can happen and then combine the arrows (“add” them) by hooking the head of one to the tail of the next [4] (see Fig. 2c). Every phenomenon of light that has been observed in detail can be explained by the theory of quantum electrodynamics (QED) [4]. In Fig. 2a,b the same diffraction on a DVD surface is explained by vortex structures, with the same result as in Fig. 2c. The trajectories of some osmerons are changed, or the osmerons are absorbed, so the symmetry of the vortex structure changes to asymmetry (compare with the symmetrical vortex structure in Fig. 1). The diameter D2 is greater for the blue rays than for the red rays. The diffraction of red rays is greater than that of blue rays because the asymmetry of the vortex structure is higher for red light than for blue light (see Fig. 2b). A diffraction grating with grooves at the right distance for red light also works for other colors. If you shine white light down onto the grating, red light comes out at one place, orange light comes out slightly above it, followed by yellow, green, and blue light: all the colors of the rainbow. Where there is a series of grooves close together, you can often see colors, for example when you hold a CD or, better, a DVD under bright light at the correct angle (see Fig. 2d and Fig. 3). What is interesting is that a light ray does not really travel only in a straight line; it “smells” the neighboring paths around it and uses a small core of nearby space [4]. When the single slot b is made smaller, the detector D starts clicking not only at the position on the straight line (photon 1 in Fig. 4). When we have two slots and the distance d between them decreases (see Fig. 4), we can see interference between photon 1 and photon 2.
This is an example of the “uncertainty principle”: there is a kind of “complementarity” between knowledge of where light goes through the two holes and where it goes afterwards; precise knowledge of both is impossible. So the idea that light goes in a straight line is a convenient approximation to describe what happens in the world that is familiar to us [4].
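As a rough numerical illustration of the arrow rule quoted above (it is not part of the vortex model), the following Python sketch adds one unit arrow per groove of a grating and squares the length of the resulting arrow. The groove spacing of 740 nm (a typical DVD track pitch), the number of grooves, the wavelengths, the far-field geometry at normal incidence and all function names are assumptions made for this illustration only.

```python
import cmath
import math


def first_order_angle(wavelength_nm, groove_spacing_nm=740.0):
    """Angle (degrees) at which arrows from neighbouring grooves differ by one full turn."""
    return math.degrees(math.asin(wavelength_nm / groove_spacing_nm))


def relative_probability(wavelength_nm, angle_deg, groove_spacing_nm=740.0, grooves=50):
    """Add one unit arrow per groove and return the squared length of the sum,
    normalised so that perfectly aligned arrows give 1.0."""
    # Extra path length between neighbouring grooves for light leaving at angle_deg.
    step_nm = groove_spacing_nm * math.sin(math.radians(angle_deg))
    total = 0j
    for k in range(grooves):
        # Each arrow is turned in proportion to its path length (one turn per wavelength).
        total += cmath.exp(2j * math.pi * k * step_nm / wavelength_nm)
    return abs(total) ** 2 / grooves ** 2


# Assumed representative wavelengths for red and blue light.
red_nm, blue_nm = 650.0, 450.0
red_angle = first_order_angle(red_nm)
blue_angle = first_order_angle(blue_nm)
```

Under these assumptions the arrows for red light line up at a larger angle than those for blue light, which agrees with the ordering of colors described above; at the "blue" angle the red arrows point in many directions and nearly cancel.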
Fig. 2 Diffraction on DVD surface
Fig. 3 Example: how we can measure the wavelength λ of light (for example, of a red laser)
Fig. 4 Photons (or electrons) coming through a sheet with two holes

Sometimes our observations (measurements) involve conditions that are special and in fact represent a limited experience with nature. It is only a small section of natural phenomena that one gets from direct experience. It is only through refined measurements and careful experimentation that we gain a wider vision [4]. And then we see unexpected things: we see things that are far from what we would guess, far from what we could have imagined, and our imagination is stretched just to comprehend those things which are there. For the two-slot problem in Fig. 4 there is at least one simplification: electrons behave in this respect in exactly the same way as photons; they are both screwy in exactly the same way [4].
3. Polarization
Polarization produces a large number of different possible couplings. Not all possible combinations of polarized electrons and photons couple (see Fig. 6). So far, no fundamental spin 0 particles have been found. But we can see vortex rings in water [9] and a travelling vortex ring in the air [2]. It is clear from Fig. 5 that at points 1 and 2 there is no phase change, but osmerons at points 3 and 4 have opposite phase shifts, which changes the symmetry of the vortex structure in the light ray.
Fig. 5 Polarization of light ray that has a vortex structure
Fig. 6 Forces that are between the electron and the proton (a, b) and their outer vortex structures (c, d)
Fig. 7 Vortex structure of electron and anti-electron with two face orientation
Fig. 8 Vortex structure of the proton and the antiproton
4. Conclusions
The vortex models might be better than the classical ones. Our universe might be considered a “supervortex” space. Vortex structures can explain the electromagnetic field, and perhaps gravitation too. Vortices can attract each other through their different polarities. Planck’s constant h might be the energy Ep of one vortex pair Vp. Vacuum is a space full of osmerons. The density of osmerons can be non-homogeneous, and vortex structures are created from it. Closed vortex structures have inertia because of their rotation, which may explain what the mass of matter is, in contrast to open structures, which create radiation. Radiation is an open vortex structure (for example, light with photons), and matter is a closed vortex structure with mass (for example electrons, protons and neutrons, followed by complex structures such as the nucleus). All things are only complex vortex structures. Because the electron structure has a face and a back, we can distinguish two states, “0” and “1”. This can be used for coding in “electron computers” and “electron memories” (by analogy with quantum computers). Fractal dimensions seem to be very powerful in describing natural objects on all scales. The behavior and creation of annular vortex structures with zero spin was described in [2], [9].

Water forms a spiraling, funnel-shaped vortex as it drains from a 1.5 or 2-liter PET soda bottle. A simple connector made from the two original lids, with a 1 cm hole, allows the water to drain into a second bottle. Fill only one of the soda bottles about two-thirds full of water. Place the two bottles on a table with the filled bottle on top. Watch the water slowly drip down into the lower bottle as air simultaneously bubbles up into the top bottle. The flow of water may come to a complete stop. To create a vortex structure it is necessary to add chaotic movement (shaking) or, better, to rotate the top bottle rapidly in a circle a few times.
Notice the shape of the top vortex, and that there is a second vortex in the lower bottle (in principle it is a tornado structure destroyed by gravity). The whole assembly can then be inverted and the process repeated. This simple model demonstrates the basic principle of vortex structures and how we can get from chaos to a self-organized structure (the basic principle of evolution for nonliving systems). This knowledge can be used when we are trying to empty liquid quickly from a tank (canister).
References
[1] FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume I, Addison-Wesley Publishing Company, 1977.
[2] FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume II, Addison-Wesley Publishing Company, 1977.
[3] FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume III, Addison-Wesley Publishing Company, 1977.
[4] FEYNMAN, R. P. QED – The Strange Theory of Light and Matter. Princeton University Press, 1988.
[5] FEYNMAN, R. P. The Character of Physical Law, Penguin Books, 1992.
[6] OŠMERA, P. Evolution of universe structures, Proceedings of MENDEL 2005, Brno, Czech Republic (2005) 1-6.
[7] OŠMERA, P. The Vortex-fractal Theory of the Gravitation, Proceedings of MENDEL 2005, Brno, Czech Republic (2005) 7-14.
[8] OŠMERA, P. The Vortex-fractal Theory of Universe Structures. Proceedings of the 4th International Conference on Soft Computing ICSC 2006, January 27, 2006, Kunovice, Czech Republic.
[9] LIM, T. T.; NICKELS, T. B. Instability and reconnection in the head-on collision of two vortex rings, letter to Nature, vol. 357, May 1992.
[10] WALLRAFF, A.; LUKASHENKO, A.; LISENFELD, J.; KEMP, A.; FISTUL, M. V.; KOVAL, Y.; USTINOV, A. V. Quantum dynamics of a single vortex, letter to Nature, vol. 425, September 2003.
Address: Doc. Ing. Pavel Ošmera, CSc. Institute of Automation and Computer Science Brno University of Technology Technicka 2, 616 69 Brno, Czech Republic Tel.: +420 541 142 294 Fax: +420 541 142 490 e-mail: osmera@fme.vutbr.cz
THE IMPORTANCE OF COMPUTER NETWORK MONITORING Imrich Rukovanský
Evropský polytechnický institut, s.r.o.
Abstract: Whether a computer network works efficiently can be demonstrated only by observing its activity, that is, by measurement. This can be done with monitors (network analyzers), which can track up to dozens of performance parameters (throughput, CPU utilization, utilization of communication links between nodes, bottlenecks) at various levels of the network. The nature of the measurement, however, differs from case to case and depends on which performance parameters are tracked, whether only a selected part of the network or the network as a whole is observed, and whether a hardware monitor, a software monitor, or a combination of both is used. This paper documents the importance of computer network monitoring as well as the diversity of practical measurements.
Keywords: computer network, monitor, measurement, wireless links, WiFi, switch, router, server, throughput, load, performance.
Introduction
The complexity, extent and diversity of computer networks keep growing. Transfer rates between network nodes increase steadily, local networks are interconnected into geographically large units, and modern network operating systems emerge that ensure maximum throughput of the job flow through the network. With the aim of making the most of costly hardware and software, various philosophies are applied: resource sharing, interleaved and concurrent activities of different users, the construction of sophisticated database systems, and the ability to group network nodes according to the nature of the tasks being solved (clustering). However, the questions of whether and to what extent a computer network works efficiently, to what extent the capacity of the computers in the network nodes is used, how the communication facilities between individual nodes are used, whether the network topology is chosen efficiently, how the databases are used (including the number of concurrently working users), what the activity at the level of input and output units is, and a whole range of others cannot be answered without monitoring (measuring) the network or some of its parts. Monitoring (measurement) of a computer network is carried out with hardware or software monitors, or with a combination of both. Performing all kinds of measurement on a computer network at once is practically impossible; several hundred kinds of measurement are in use today. Moreover, measurement is usually carried out only on a selected part of the network (switches, routers, servers, parts of a LAN, etc.) or on selected activities (packet flows, throughput, congestion, etc.). Therefore, before any intended monitoring, a fundamental question must be answered: what we want to achieve by monitoring, what the measured values will be used for, and in which part (component) of the network we expect an increase in efficiency or performance. The optimization of the operation of computer networks and their measurement is addressed within the research tasks of the Department of Applied Informatics at EPI Kunovice. Some concrete results are presented here for illustration [1], [2], [6].
1 Applying measurement when introducing a new technology into an existing network
In order to increase the throughput of the EPI Kunovice computer network, it was decided to add wireless access to it. A design of the network conceived in this way was prepared, including the selected locations for access points, the specific WiFi technology chosen, and other matters related to the hardware and software of the network. During implementation and subsequent commissioning of the modified network, measurements had to be made to verify the real benefit of the chosen solution.
Measurement of access points
In the first phase of the WiFi implementation, the access points had to be tuned so that they did not interfere with one another within range and so that their transmission power complied with the standards of the ČTU (Czech Telecommunication Office). To determine the actual state, measurements were made at all access points in the given location. They showed that, with WiFi introduced in the school buildings, correct operation of the network will require checking the frequencies of the access points very often and, where needed, retuning them according to which channel shows the best signal. The measurement was made using a monitor running on Linux (Debian distribution) with the Cacti software, which is freely distributed under an open source license. This software is routinely used at EPI, s.r.o. to track the state of the network, especially to monitor the load and the data flow in individual network nodes. Without the results of monitoring, ensuring efficient operation of the network would be unrealistic, given its diversity and the intranet links between nodes built on various technologies, from wireless microwave to optical and metallic [3].

Measurement of network throughput after the introduction of WiFi
To determine the contribution of the wireless WiFi technology to the school network, the statistics originally measured by the network monitor mentioned above were taken as a baseline and compared with newly measured values. Testing a randomly chosen user of the wireless connection showed that the throughput of the school intranet increased, as expected. Comparison of statistical data from earlier operation and from the current regime revealed the transfer of a larger volume of data, and in shorter intervals. The favorable effect of the increased throughput can be observed not only in the intranet itself but also in the Internet transfers of the EPI network. These facts are illustrated in Fig. 1.
Fig. 1 Load on the transmission paths for a selected user.
The figure shows the download speed in kilobytes per second. From July onward, data transfer into the intranet is many times higher than in the previous month. An increase in load can also be seen both in the outgoing direction to the Internet (green curve) and in the incoming direction (blue curve); the increased transfer capacity is illustrated by the violet curve (Total traffic). The results obtained by measurement clearly confirmed the benefit of incorporating the new WiFi technology into the school network [3].

2 Measurement of network elements and LAN throughput
The local area network under study belongs to Branson, a company focused on welding plastics (ultrasonic, thermal and vibration welding), and is part of the worldwide WAN of the company Emerson. The company produces most of its products for the automotive industry. The LAN is built exclusively on Cisco products, which ensures efficient sharing and use of all services the hardware provides. It consists of three main servers, which at the time of the first measurement ran Windows NT Server 4.0 (a migration to the new operating system Microsoft Windows Server 2003 Standard Edition is currently under way), one Cisco 1700 Modular Access Router (dedicated to communication with the parent company and with the divisions in Europe), one switch of the 4000 class, six further Catalyst switches of the 3500 class, and about 200 workstations and devices (printers, CNC machines, etc.) that require an IP address. Since the LAN was designed and put into operation in 2000, the question arose whether its performance would be sufficient for current needs and requirements. The philosophy of data storage in the network is changing from the earlier local storage to central storage of all files and data. In this connection it is necessary to determine the throughput of the individual network elements and whether they can cope with the growing load associated with the new functions of the application environment. The measurement itself, as indicated earlier, is carried out at two levels: monitoring of network elements (switches) and monitoring of individual branches or segments of the network.

Measurement of switch throughput
The measurement focuses on switches because all traffic on the network is handled by these devices; since the data are obtained directly from the devices themselves, this gives a detailed overview of the overall network traffic. Two tools are used. Values from the central Catalyst 4006 switch are obtained with MRTG (The Multi Router Traffic Grapher), a monitor designed to track the traffic load on network links. It generates HTML pages containing PNG images that provide a live visual representation of this traffic (see http://www.stat.ee.ethz.ch/mrtg/). The smaller switches are measured with a tool called Cluster management, in principle a monitor similar to the previous one but developed by Cisco itself for its own networks, which shows in the level of detail of the measured data, comprising about 30 values. Measuring the elements (switches) makes a wide variety of information about the current activity of the network available: information about the ports themselves, detailed information on packets sent and received, percentage values, the number of packets that passed through the devices, bandwidth utilization, and so on.
Besides “performance” parameters, the tools also provide information on network faults and thus a complete, immediate overview of the state of the network. Because Cluster management continuously shows what is happening in the network devices, it is possible to respond to situations appropriately as they arise. Measurements were taken at equal time intervals during working hours, which gave a thorough overview of how efficiently the individual network components are used. It turned out that a better overview requires a shorter time period; 30 minutes was chosen. Data collection with MRTG ran automatically, whereas the Cluster management measurements were taken manually: the information was copied into Excel sheets and then processed and evaluated. Transfer activity on the Catalyst 4006 is illustrated in Fig. 2, which shows port no. 13 together with the measured values of greatest interest.
       Max                  Average              Current
In     4560.0 B/s (0.0%)    1513.0 B/s (0.0%)    283.0 B/s (0.0%)
Out    5063.0 B/s (0.0%)    1924.0 B/s (0.0%)    486.0 B/s (0.0%)

Fig. 2 Measured transfer rates on the selected port.
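Monitors of the MRTG type usually derive such figures from interface byte counters that are read at regular intervals. The following Python sketch shows one way in which the Max, Average and Current cells of the table above could be computed. It is only an illustration: the function names, the 32-bit counter width and the link speed of 100 Mbit/s are assumptions, not values taken from the measured network.

```python
def counter_rate(previous, current, seconds, counter_bits=32):
    """Bytes per second between two readings of a wrapping byte counter."""
    if seconds <= 0:
        raise ValueError("sampling interval must be positive")
    # The modulo handles a counter that wrapped around between the two readings.
    delta = (current - previous) % (1 << counter_bits)
    return delta / seconds


def summarize(rates, link_bits_per_second=100_000_000):
    """Format Max / Average / Current cells from a list of rates in B/s.

    link_bits_per_second is an assumed port speed, used only for the
    utilization percentage shown in parentheses.
    """
    if not rates:
        raise ValueError("at least one sample is required")

    def cell(rate):
        utilization = 100.0 * rate * 8 / link_bits_per_second
        return f"{rate:.1f} B/s ({utilization:.1f}%)"

    return {
        "Max": cell(max(rates)),
        "Average": cell(sum(rates) / len(rates)),
        "Current": cell(rates[-1]),  # the most recent sample
    }
```

With the assumed 100 Mbit/s port, a rate of a few thousand bytes per second corresponds to a few hundredths of a percent of the link capacity, which would explain why every cell of the table shows 0.0%.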
The measurements on the Catalyst 3500 switches were of a different nature. Given the variety and amount of data, the switch 129.115.4.38 and its parameter Transit Rate were selected for illustration.

[Chart: Transit Rate per port of switch 129.115.4.38; vertical axis from 0 to 0.9, horizontal axis time of day from 6:00 to 15:00; one series for each port FastEthernet0/1 to FastEthernet0/24 and GigabitEthernet0/1 to GigabitEthernet0/2]
Fig. 3 Current transfer rates at the given moments.
Fig. 3 shows the current transfer rates (Mbps) at which data on the given switch are forwarded to other devices. In our case this is the transfer from the main switch through this switch to the end devices. The results of the measurement and the evaluation of the statistical data showed that the computer network will largely cover the growing performance demands placed on it. However, they also showed that the mechanical and electrical design department needs one 24-port switch to separate out the workplaces that work with increased volumes of data; the existing conditions no longer meet their capacity needs. In addition, it was found that efficient operation of the network will require creating a VLAN to separate a certain group of workers together with the central data storage server [4].

Measurement of the throughput of LAN segments or branches
To obtain a comprehensive overview of the performance of the entire LAN, it was decided to complement the conclusions obtained by measuring the elements with monitoring of selected network segments. These are the branches on which the modernization of our LAN is expected to place the most demanding application requirements, above all a series of upgrades of CAD applications such as EPLAN, SolidEdge and others. Measurement on the busiest branches is meant to ensure a smooth flow of packets throughout the LAN and thus a smooth transition to the new operating conditions of the network. The monitored parameters include:
• monitoring of packet flow (workstations and LAN nodes), as well as losses, routing, exchange and others
• identification of the stations that transmit most intensively and those that receive most intensively
• monitoring of frames that are too long or too short, CRC errors, and error statistics
• monitoring of various parameters at the individual OSI layers (pairing of MAC addresses, network addresses, IP addresses, etc.)
The network will be monitored with the proven network analyzers Sniffer Analyzer and NetXRay, together with the network management tools included in the OS WIN2003 Server SE (Bandwidth test).
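Two of the monitored parameters listed above, the error statistics for frames and the identification of the most active stations, can be illustrated by a short Python sketch. It does not reproduce how Sniffer Analyzer or NetXRay work internally. The Frame record, the function names and the sample station names are hypothetical; the limits of 64 and 1518 bytes are the usual minimum and maximum sizes of an untagged Ethernet frame.

```python
from collections import Counter
from dataclasses import dataclass

MIN_FRAME = 64     # shortest valid Ethernet frame in bytes
MAX_FRAME = 1518   # longest valid untagged Ethernet frame in bytes


@dataclass(frozen=True)
class Frame:
    """Hypothetical record of one captured frame."""
    src: str
    dst: str
    length: int
    crc_ok: bool = True


def classify(frame):
    """Return 'crc_error', 'too_short', 'too_long' or 'ok' for one frame."""
    if not frame.crc_ok:
        return "crc_error"
    if frame.length < MIN_FRAME:
        return "too_short"
    if frame.length > MAX_FRAME:
        return "too_long"
    return "ok"


def error_statistics(frames):
    """Count the frames in each class."""
    return Counter(classify(f) for f in frames)


def top_talkers(frames, n=3):
    """Return the n stations that sent and the n that received the most bytes."""
    sent, received = Counter(), Counter()
    for f in frames:
        sent[f.src] += f.length
        received[f.dst] += f.length
    return sent.most_common(n), received.most_common(n)


sample = [
    Frame("A", "B", 1000),
    Frame("A", "C", 60),
    Frame("B", "A", 1600),
    Frame("C", "A", 500, crc_ok=False),
]
```

In practice the records would come from a capture tool rather than from a hand-written list; the sketch only shows the kind of aggregation behind the statistics.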
Sniffer Analyzer is a tool for analyzing activity in LANs and WANs wherever sophisticated packet analysis is required. It captures anomalies at all levels of the network, reveals problems related to network throughput and proposes possible solutions. It is also a proven tool for detecting network faults. Bandwidth test is used to verify the throughput of a network branch [5].

3 Monitoring server performance with Microsoft Operations Manager 2005
Besides users, leading computer companies are also convinced of the importance of monitoring the performance of complex computer structures, and hence of computer networks. In the previous example we pointed to monitors from Cisco that are specially built to track the performance of networks made of that company's elements. Microsoft comes to the market with a somewhat different solution: for monitoring performance and capacity parameters of Windows 2000 and 2003 servers it offers the monitor Microsoft Operations Manager 2005 (MOM 2005). It enables a whole range of measurements, from CPU utilization and disk space to database parameters, including the connectivity of concurrently working users, database space usage, and so on. To verify the real capabilities of the monitor and its usability in a concrete deployment, it was installed in a computer network with 250 users running MS Windows in a W2k3 domain, with a file server, a domain controller, a W2k Exchange server and an Ebi server. After the monitor was installed, no problem showed up on the company network. After a closer look at the options, stronger filtering was switched on and the first monitoring results appeared. The monitor recognized what the network consists of, showed that it contains switches and hubs and that 1 Gbit, 100 Mbit and 10 Mbit elements are in use, and immediately recommended replacing the 10 Mbit elements with at least 100 Mbit ones. Another positive feature of MOM is that the computer network can be divided into groups in various ways, and particular types of filtering and problem investigation can be applied to each group. For example, the network under study includes a DEMO department, which serves to demonstrate the company's products (servers and custom-made programs running on them). This group uses static addresses, while everywhere else except the IT department DHCP is used (automatic assignment of IP addresses from a server). IP address collisions occur frequently in this group. MOM proved able to map this situation well and to inform the network administrator about the problem: a user sets an IP address and MOM reports that the user is in collision. Shortly afterwards the user calls the administrator to say that he cannot get onto the network, but the administrator already knows where the problem lies. A very useful MOM function, which many users will welcome, is that for better orientation it can create an expected model of the computer network, which can be clearly displayed and in which the individual groups can be annotated with information about exactly what is happening and what should not be happening. MOM lets the user customize error messages as well as the style of their display. The user decides the severity of particular error messages and specifically sets the importance of those situations the monitor should warn about. The individual levels can of course be colored differently, their colors changed, and so on. Although only the first part of the experiments with MOM has taken place so far (autumn 2005), its usefulness and importance for monitoring network activity can be confirmed. Based on the findings obtained so far:
• we can clearly recommend acquiring MOM wherever a more extensive network topology is involved. It demonstrably provides valuable information about the actual activity of the network and can also prevent problems. The investment in MOM pays off.
• we note that it is intended for monitoring server systems and, as we expected, it is not very suitable as a monitoring tool for problems related to the infrastructure of routers or switches [2, 6].
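The address collisions in the DEMO group described above can be illustrated by a small Python sketch. It does not show how MOM detects collisions; it only demonstrates the underlying idea that a collision exists when one IP address is observed with more than one hardware (MAC) address. The function name and the sample addresses are hypothetical.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return IP addresses that were seen with more than one MAC address.

    observations is an iterable of (ip, mac) pairs, for example taken
    from ARP tables or DHCP logs (an assumption made for this sketch).
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Compare MAC addresses case-insensitively.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


observed = [
    ("10.0.0.5", "00:11:22:33:44:55"),
    ("10.0.0.6", "00:11:22:33:44:66"),
    ("10.0.0.5", "00:11:22:33:44:77"),  # second device using a statically set address
    ("10.0.0.6", "00:11:22:33:44:66"),  # the same device seen again, no collision
]
```

A report built from such a mapping would let the administrator see which station set the conflicting static address before the affected user calls.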
Conclusion
There is no longer any doubt about the importance of computer network monitoring. First of all, it is of interest to network users, who obtain up-to-date information about the real activity of the network by measurement. Besides performance parameters such as throughput, utilization, etc., they can also obtain various reports on the fault-free state of the network and other necessary information.
Renowned companies that manufacture various components or build extensive computer networks therefore offer a whole range of monitors that make it possible to track (measure) and then efficiently operate these complex computer structures. In conclusion, however, it must be kept in mind that monitors (network analyzers) are not universal: they depend on which parameters they are to measure and which network component they are to monitor (e.g. servers, switches, etc.), and they are usually developed for the elements of a particular company (Cisco, Microsoft).
REFERENCES
[1] RUKOVANSKÝ, I. Sledování výkonnosti počítačových sítí. Kunovice : research report, EPI, s.r.o., December 2005.
[2] HANCE, B. Microsoft Operations Manager 2005. Computerworld, no. 8, 2005, p. 21.
[3] CHMELA, T. Využití WiFi ke zvýšení propustnosti sítě. Kunovice : EPI, s.r.o., bachelor's thesis, 2005.
[4] UNČÍK, M. Měření síťových prvků počítačové sítě a vyhodnocení měření. Kunovice : materials for a bachelor's thesis, EPI, s.r.o., 2005.
[5] SLOBODA, P. Měření propustnosti lokální počítačové sítě. Kunovice : materials for a bachelor's thesis, EPI, s.r.o., 2005.
[6] KADLEC, R. Ověření funkčních schopností MOM 2005 na konkrétní síti. Kunovice : presentation of the results of team 3-2-1, EPI, s.r.o., 2006.
Address: Prof. Ing. Imrich Rukovanský, CSc. Evropský polytechnický institut, s.r.o. Osvobození 699 686 04 Kunovice tel. / fax: +420 572 549 018, +420 572 548 788 e-mail: [email protected]
DATA ANALYSIS USING NEURAL NETWORKS AND PIVOT TABLES Jindřich Petrucha
Evropský polytechnický institut, s.r.o.
Abstract: The article describes the possibilities of using external sources to obtain input data on the European Union and of converting these data into a format suitable for OLAP analysis. It describes the process of transforming and cleaning the data so that they can be converted into a pivot table, from which the data are fed into an external program that analyzes a selected time series. The external program, a neural network simulator, performs the learning process on the selected data and the next stage of the analysis.
Keywords: OLAP, neural network, time series, pivot table, data cleaning, ETL.
1. Introduction
The possibilities of using external sources for data analysis are constantly growing, because many data are published on the Internet and continuously updated. The problem therefore lies not so much in obtaining the data as in analyzing them with modern information technology tools. Many specialized tools require a certain standardized format for data import, or the data have to be drawn from data warehouses using the data pump technique, which allows the required data to be selected. When we work with data on the Internet, most of them are in text format with various graphic adjustments that make the displayed data visually clearer but, on the other hand, make the format entirely unsuitable for automatic processing. This redundant formatting has to be removed, either with software tools or manually in various editors, to bring the data into the required format. Data prepared in this way can be used for analysis with artificial intelligence tools, which improve the quality of the decision-making process.
2. Data extraction
2.1 General principles
An important step is the stage described in [1] as ETL (Extraction, Transformation, Loading): extraction of data from a transactional source, transformation of these data into the required structures, and subsequent loading of the data into a data warehouse or directly into a software system. The aim of this stage is to centralize the data so that the condition of fast access to them is met. It is advisable to define the goals of this stage at the outset so that the general criteria are met. A prerequisite for these activities is our ability to process data according to the principles given in Oracle's company literature, which defines that data are transformed into information when:
• we have the data,
• we know that we have the data,
• we know where we have the data,
• we have access to them,
• we can trust the source of the data.
It is not always possible to meet all the conditions of this definition, because in every system that involves the human factor errors occur, whether intentional or caused by an accidental mistake of a person working at the transactional level of processing. The amount of data is in many cases so large that only a small part is subjected to analysis, from which knowledge arises in the business intelligence process. In the initial phase of ETL we need to obtain data stored in some external information source. This may be an enterprise information system in which most transactional operations are carried out, whether accounting, inventory management, customer and supplier relations, or similar systems. If these systems contain compatible database systems, we can obtain the data directly from relational tables or through a data export into suitable formats that most systems support, for example CSV text files. Another option is to monitor the market environment of the Internet and obtain the needed data from there. Here it is possible to use modern tools such as RSS feed readers and to watch for new information sources appearing as URL changes directly on the pages of a particular company.
Nebo můžeme využívat datové „ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT” EPI Kunovice, Czech Republic, January 27, 2006
137
zdroje informačních a vyhledávacích serverů. Tento systém se využívá hlavně v oblasti finančních trhů při sledování indexů jednotlivých akciových trhů a následném vyhodnocení hodnoty akcií. 2.2 Extrakce a čištění dat na konkrétním příkladu Pro konkrétní případ si vyberme aktuální statistická data o evropské unii, která popisují stav inflace a vývoj průmyslové výroby v jednotlivých státech evropské unie za určité období. Tato data je možné získat dle odkazu [5] v pdf formátu v přehledových tabulkách tak jak je znázorněno na obr. 1. V tabulce je na levé straně časová osa a na horní části se nacházejí názvy jednotlivých států.
Fig. 1. Statistical data from the EUROSTAT information source
To extract these data, the selected part can be copied and pasted into a text file, which is then given semicolon separators between the individual values and split into individual records. The difficulty of this step is shown by the text obtained after selecting the data from the PDF: the data form a single record with no separating information. This poorly structured form has to be transformed so that the required data can be separated and loaded into a spreadsheet.
EUEMUAustriaBelgiumDenmarkFinlandFranceGermanyGreeceIrelandItalyLuxembourgetherlandsPortugalSpainSwed enUKCyprusCREstoniaHungaryLatviaLithuaniaMaltaPolandSlovakiaSloveniaI.99-0,2%-0,1%-0,1%0,4%0,2%-0,2%0,4%-0,1%-1,3%-0,8%0,1%-1,7%0,0%-0,4%0,3%-0,4%-0,6%0,7%0,9%1,1%2,6%1,0%1,0%na1,5%3,0%1,0%II.990,3%0,3%0,2%0,2%0,5%0,4%0,4%0,2%0,7%0,7%0,2%1,9%0,7%0,0%0,1%0,1%0,2%-1,8%-0,1%0,2%1,3%0,2%0,0%na0,5%0,8%0,4%
The transformation is best made into the CSV format, which the Excel spreadsheet understands and can import. The next important step for the analysis is to create a data structure suitable for a pivot table, which represents an OLAP data cube in which the analysed data can be viewed in various ways. Fig. 2 shows the pivot table with the data inserted. The time axis is kept on the left; it is divided into individual years and further into months identified by number. In the pivot table, various aggregations along the time axis can be carried out, and the states to be followed for a given indicator can be selected. Below each year there is a row with an aggregate value, the average of the data in the given column. This aggregation formula can also be changed as required. Fig. 2 shows the data selected for 2004 and 2005 for the EU states, with CR (the Czech Republic) highlighted. The inflation value for CR is very low and comparable with countries such as Austria, Germany or France.
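The splitting step described above can be sketched in Python. This is only an illustration under stated assumptions, not the tool used by the author: the function names `split_records` and `to_csv` are hypothetical, period labels are assumed to have the form Roman numeral, dot, two-digit year (for example `I.99`), and each value is assumed to be either `na` or a number with one decimal place followed by `%`. The run-together country names in the header cannot be split unambiguously, so they would have to be supplied by hand.

```python
import csv
import io
import re

# Assumed shape of one record: a period label followed by a run of values.
RECORD = re.compile(r"([IVX]+\.\d{2})((?:-?\d+,\d%|na)+)")
VALUE = re.compile(r"-?\d+,\d%|na")


def split_records(raw):
    """Return (label, values) pairs; values are floats in percent, None for 'na'."""
    rows = []
    for label, body in RECORD.findall(raw):
        values = []
        for token in VALUE.findall(body):
            if token == "na":
                values.append(None)
            else:
                # "0,4%" -> 0.4 (decimal comma replaced by a decimal point)
                values.append(float(token.rstrip("%").replace(",", ".")))
        rows.append((label, values))
    return rows


def to_csv(rows, delimiter=";"):
    """Write one semicolon-separated line per record; missing values stay empty."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for label, values in rows:
        writer.writerow([label] + ["" if v is None else v for v in values])
    return out.getvalue()
```

Calling `to_csv(split_records(text))` on the pasted text would give one line per month, which a spreadsheet can then import with the semicolon as separator.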
Fig. 2. Pivot table with inflation data for selected EU countries
3. Data analysis using neural networks
3.1 Time series analysis
For further analysis of the data in the pivot table we can use additional artificial intelligence tools, which improve the analytical process by finding relationships and links that are not explicitly present in the data. One such tool is neural networks implemented as software simulators, which allow the user to specify the network architecture, run the learning stage according to a set criterion, and simulate the processing of input and output patterns. Various measures can be used as the evaluation criterion on the training set. In our case we use the mean squared error (MSE) over the whole period considered, comprising N predictions, and its value must not exceed a given threshold:
MSE = (1/N) ∑ (prediction − actual value)²
The analysis uses one-step prediction, which is built directly into the simulator. The processing of data from the pivot table is shown in the following paragraphs.
3.2 Using the neural network simulator for time series analysis
The input data file is an ASCII file with a fixed structure that specifies the parameters of the neural network and the time series according to the structure of the input pattern. Sample ASCII file:
Neuron - casova rada inflace CR od roku 2000
5 -pocet vstupu
1 -pocet vystupu
1-krok predikce
70 -pocet dat casove rady
10 -pocet neuronu ve skryte vrstve
**** data casove rady ****
0.0170
0.0020
0.0000
-0.0020
0.0030
……
The header lines give, in order, the title (inflation time series for CR since 2000), the number of inputs, the number of outputs, the prediction step, the number of time series values, and the number of neurons in the hidden layer. The data are inserted into this file either directly from the pivot table or from the data on which the pivot table draws. This approach also makes it very easy to modify parameters such as the number of neurons in the hidden layer. The simulator is written in Pascal (Delphi) and allows learning on a selected part of the time series. Fig. 3 shows the main window of the simulator with the inflation data for CR normalised to the interval from zero to one. The figure clearly shows the fluctuation of the values that the neural network will try to analyse and learn. Elements 1 to 50 were selected for learning and elements 10 to 55 for evaluating the error. The learning process was set to 50000 learning cycles so that the total error could be followed conveniently in the lower right window of the simulator. After learning had finished, a test was run; it is shown in the upper left window of the simulator, where the green values are the time series data and the blue (dashed) values are those predicted by the artificial neural network simulator. Fig. 4 shows in detail the range from element 50 to element 70, where from element 55 onwards the simulator was working with previously unseen data. The simulator captures the changes and tends to predict sharper fluctuations, as was apparent in previous years. The largest difference is at element 61, where the simulator predicts a rise in inflation that did not occur. For more accurate values it would be advisable to double the number of cycles and to train the simulator on a longer period.
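The data preparation around the simulator can be illustrated with a short Python sketch. It does not reproduce the simulator itself; it only shows, under assumptions, how a series might be normalised to the interval from zero to one, cut into input patterns for one-step prediction (5 inputs, 1 output, as in the sample file), and scored with the MSE formula of Section 3.1. The function names `normalise`, `windows` and `mse` are hypothetical, and min-max scaling is an assumed choice of normalisation.

```python
def normalise(series):
    """Min-max scaling to [0, 1] (assumed normalisation; requires max > min)."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series]


def windows(series, n_inputs=5, step=1):
    """Pairs (inputs, target): n_inputs consecutive values and the value `step` ahead."""
    pairs = []
    for start in range(len(series) - n_inputs - step + 1):
        inputs = series[start:start + n_inputs]
        target = series[start + n_inputs + step - 1]
        pairs.append((inputs, target))
    return pairs


def mse(predictions, actuals):
    """MSE = (1/N) * sum of (prediction - actual value)^2 over N predictions."""
    n = len(predictions)
    return sum((p - y) ** 2 for p, y in zip(predictions, actuals)) / n
```

With a series of 70 values, 5 inputs and a one-step horizon, `windows` yields 65 patterns; a learning range and an evaluation range can then be selected by slicing this list.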
Fig. 3. Artificial neural network simulator with data inserted from the pivot table for CR.
Fig. 4. Detailed view of the data from the artificial neural network simulator for CR inflation.
This methodology gives great flexibility in analysing data found in various information sources on the internet and in searching for relationships between individual data items. Pivot tables allow summary data for a selected period to be analysed very clearly, and linking them to neural network simulators puts a high-quality tool into the hands of the user. The procedure is suitable for any system with a defined time axis along which one can easily navigate.
4. Conclusion
The example presented step by step in the preceding paragraphs shows that high-quality analysis depends on having data that are prepared, cleaned of various fluctuations, and suitable as a basis for decision making. OLAP tools are ready to process data but offer very few options for transforming them into the required format. Very often the available data are poorly structured and have to be processed in the ETL stage, for which software tools must be prepared according to the character of the input data. In my opinion this stage is the most complex one and takes the most time in data preparation. Pivot tables are a suitable tool for data analysis and allow a time series to be selected and passed on to other software systems. Neural network simulators are best used where a certain trend can be observed in the analysed data.
References:
[1] LACKO, L. Datové sklady, analýza OLAP a dolování dat s příklady v Microsoft SQL Serveru a Oracle. 1st ed. Brno: Computer Press, 2003. 486 p. ISBN 80-7226-969-0.
[2] DOSTÁL, P. Moderní metody ekonomických analýz – Finanční kybernetika. 1st ed. Zlín: Univerzita Tomáše Bati, 2002. 110 p. ISBN 80-7318-075-8.
[3] PETRUCHA, J.; MIKULA, V. Application of Neural Networks for Time Series Prediction and Making the Adequate Program Simulator. In Proceedings First INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIROMENTS, January 30-31. Kunovice: Evropský polytechnický institut, 2003. pp. 165-171. ISBN 80-7314-017-9.
[4] PETRUCHA, J. Technologie analýzy dat – OLAP systémy v prostředí DBPROVE. ACTA UNIVERSITATIS AGRICULTURAE ET SILVICULTURAE MENDELIANAE BRUNENSIS, 2000, vol. XLVIII, no. 2, pp. 149-155. ISSN 1211-8516.
[5] http://www.csas.cz/banka/application?pageid=downloads&dtree=cs&selnod=57/dataEU_public.pdf [online]. 2005 [cited 2006-01-18]. Available from WWW: <http://www.csas.cz/banka>.
Address:
Ing. Jindřich Petrucha, Ph.D.
Evropský polytechnický institut, s.r.o.
Osvobození 699, 686 04 Kunovice
tel./fax: +420 572 549 018, +420 572 548 788
e-mail: [email protected]
DETECTION OF INITIAL DATA GENERATING BOUNDED SOLUTIONS OF LINEAR DISCRETE EQUATIONS¹
Jaromír Baštinec, Josef Diblík
Brno University of Technology
Abstract: For the situation in which graphs of solutions of discrete equations remain in a prescribed domain, the problem of determining their initial data is discussed. Special attention is paid to linear discrete equations, for which initial data generating solutions whose graphs remain in a prescribed domain are found. Illustrative examples are considered, too.
Key words: Linear discrete equation, bounded solutions, initial data. AMS Subject Classification: 39A10, 39A11
1 Introduction and the problem considered
1.1 General suppositions
We consider a scalar discrete equation
\[ \Delta u(k) = f(k, u(k)) \tag{1.1} \]
with $f : \mathbb{N}(a) \times \mathbb{R} \to \mathbb{R}$, where $\mathbb{N}(a) = \{a, a+1, \dots\}$, $a \in \mathbb{N}$ and $\mathbb{N} = \{0, 1, \dots\}$. Together with the discrete equation (1.1) we consider an initial problem, posed as follows: for a given $s \in \mathbb{N}$ we seek the solution $u = u(k)$ of (1.1) satisfying the initial condition
\[ u(a+s) = u_s \in \mathbb{R} \tag{1.2} \]
with a prescribed constant $u_s$. Let us recall that the solution of the initial problem (1.1), (1.2) is defined as an infinite sequence of numbers $\{u^k\}_{k=0}^{\infty}$ with $u^k = u(a+s+k)$, i.e.
\[ u^0 = u_s = u(a+s),\quad u^1 = u(a+s+1),\ \dots,\ u^n = u(a+s+n),\ \dots \]
such that the equality (1.1) holds for every $k \in \mathbb{N}(a+s)$. The existence and uniqueness of the solution of the initial problem (1.1), (1.2) is a consequence of the properties of the function $f$. If $f$ depends continuously on its second argument, then the solution of the initial problem (1.1), (1.2) depends continuously on its initial data.
Let $b(k)$, $c(k)$ be real functions defined on $\mathbb{N}(a)$ such that $b(k) < c(k)$ for every $k \in \mathbb{N}(a)$. We define a set $\omega \subset \mathbb{N}(a) \times \mathbb{R}$ as
\[ \omega := \{(k, u) : k \in \mathbb{N}(a),\ u \in \omega(k)\} \quad\text{with}\quad \omega(k) := \{u : b(k) < u < c(k)\}, \]
and the closure of the set $\omega$ as
\[ \overline{\omega} := \{(k, u) : k \in \mathbb{N}(a),\ u \in \overline{\omega}(k)\} \quad\text{with}\quad \overline{\omega}(k) := \{u : b(k) \le u \le c(k)\}. \]
Obviously,
\[ \{(k, u) : k \in \mathbb{N}(a),\ u \in \omega(k)\} = \omega = \bigcup_{k \in \mathbb{N}(a)} \{(k, u) : u \in \omega(k)\}. \]
¹ Preliminary version.
Let us introduce the set $B = B_1 \cup B_2$ with
\[ B_1 := \{(k, u) : k \in \mathbb{N}(a),\ u = b(k)\} \subset \mathbb{N}(a) \times \mathbb{R}, \qquad B_2 := \{(k, u) : k \in \mathbb{N}(a),\ u = c(k)\} \subset \mathbb{N}(a) \times \mathbb{R}, \]
and define the boundary of $\omega$ as
\[ \partial\omega := \{(k, u) : k \in \mathbb{N}(a),\ (u - b(k))(u - c(k)) = 0\} = B. \]
Define, moreover,
\[ \partial\omega(k) := \{b(k), c(k)\} \]
and, for $(k, u) \in \mathbb{N}(a) \times \mathbb{R}$, the auxiliary functions
\[ U_1(k, u) := u - b(k), \qquad U_2(k, u) := u - c(k). \]
Definition 1.1. The full difference $\Delta U_1(k, u)|_{(k,u) \in B_1}$ of the function $U_1(k, u)$ for a given $(k, u) \in B_1$ with respect to the discrete equation (1.1) and the set $B_1$ is defined as
\[ \Delta U_1(k, u)|_{(k,u) \in B_1} := f(k, b(k)) - b(k+1) + b(k). \]
The full difference $\Delta U_2(k, u)|_{(k,u) \in B_2}$ of the function $U_2(k, u)$ for a given $(k, u) \in B_2$ with respect to the discrete equation (1.1) and the set $B_2$ is defined as
\[ \Delta U_2(k, u)|_{(k,u) \in B_2} := f(k, c(k)) - c(k+1) + c(k). \]
Definition 1.2. A point $(k, u) \in B$ with $k \in \mathbb{N}(a)$ is called a point of the type of strict egress for the set $\omega$ with respect to the discrete equation (1.1) if
\[ \Delta U_1(k, u)|_{(k,u) \in B_1} < 0 \]
in the case when $(k, u) \in B_1$, and
\[ \Delta U_2(k, u)|_{(k,u) \in B_2} > 0 \]
in the case when $(k, u) \in B_2$.
The following lemma is an easy consequence of Definitions 1.1 and 1.2.
Lemma 1.3. The point $(k, u) \in B$ with $k \in \mathbb{N}(a)$ is a point of the type of strict egress for the set $\omega$ with respect to the discrete equation (1.1) if and only if
\[ f(k, b(k)) - b(k+1) + b(k) < 0 \]
in the case when $(k, u) \in B_1$, and
\[ f(k, c(k)) - c(k+1) + c(k) > 0 \]
in the case when $(k, u) \in B_2$.
1.2 Nonlinear case and description of problem considered
The following theorem, concerning the asymptotic behavior of solutions of equation (1.1), is a particular case of a more general result in [3, Theorem 2, p. 520] (see also [4]).
Theorem 1.4. Let us suppose that $f$ is defined on $\overline{\omega}$ with values in $\mathbb{R}$ and is continuous with respect to its second argument. If, moreover, each point $(k, u) \in B$ is a point of the type of strict egress for the set $\omega$ with respect to the discrete equation (1.1), then there exists an initial problem
\[ u^*(a) = u^* \in \omega(a) \tag{1.3} \]
such that the corresponding solution $u = u^*(k)$ satisfies the relation
\[ u^*(k) \in \omega(k) \tag{1.4} \]
for every $k \in \mathbb{N}(a)$.
Now we are able to describe the general version of the problem considered. Analysing the result given by Theorem 1.4, we conclude that the existence of at least one solution of the problem (1.1), (1.3) with the asymptotic behavior characterized by relations (1.4) is stated without a concrete determination of the corresponding initial data $u^*$ itself. In this contribution we try to fill this gap, in particular in the linear case. Questions concerning the behavior of solutions of discrete equations are considered e.g. in [1, 2], [5]-[9]. Unfortunately, the problem of determining the corresponding initial data was not considered there.
1.3 Linear case and the problem considered
Let us put $f(k, u(k)) := \varphi(k) u(k) + \delta(k)$ in (1.1) with $\varphi(k), \delta(k) : \mathbb{N}(a) \to \mathbb{R}$ and consider the corresponding linear equation
\[ \Delta u(k) = \varphi(k) u(k) + \delta(k) \tag{1.5} \]
together with an initial problem
\[ u(a) = u^*. \tag{1.6} \]
It is easy to verify that in the linear case Theorem 1.4 takes the following form.
Theorem 1.5. Let the inequalities
\[ (1 + \varphi(k)) b(k) + \delta(k) - b(k+1) < 0, \tag{1.7} \]
\[ (1 + \varphi(k)) c(k) + \delta(k) - c(k+1) > 0 \tag{1.8} \]
hold for every $k \in \mathbb{N}(a)$. Then there exists an initial problem
\[ u^*(a) = u^* \in \omega(a) \tag{1.9} \]
such that the corresponding solution $u = u^*(k)$ of equation (1.5) satisfies for every $k \in \mathbb{N}(a)$ the inequalities
\[ b(k) < u^*(k) < c(k). \tag{1.10} \]
Now let us formulate the problem under consideration.
Problem 1.6. Determine at least one value $u^*$ such that the corresponding solution $u = u^*(k)$ of the linear problem (1.5), (1.6) satisfies the relations $(k, u^*(k)) \in \omega$ for every $k \in \mathbb{N}(a)$, i.e. satisfies inequalities (1.10) for every $k \in \mathbb{N}(a)$.
We will show in the sequel that the conditions of Theorem 1.5 together with the condition $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ are sufficient for determining at least one such initial value $u^*$.
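The hypotheses (1.7), (1.8) of Theorem 1.5 can be spot-checked numerically on a finite range of indices. The Python sketch below is only an illustration: the function name `egress_inequalities_hold` is hypothetical, the sample data ($\varphi(k) = 1/k$, $\delta(k) = -1/(k+2)$, $b \equiv 0$, $c \equiv 1$, $a = 1$) are those of Example 2.9, and a finite check can of course only refute the inequalities, never prove them for every $k \in \mathbb{N}(a)$. Exact rational arithmetic is used to avoid rounding effects.

```python
from fractions import Fraction


def egress_inequalities_hold(phi, delta, b, c, a, n_terms):
    """Check (1.7) and (1.8) for k = a, ..., a + n_terms - 1 (finite check only)."""
    for k in range(a, a + n_terms):
        lower = (1 + phi(k)) * b(k) + delta(k) - b(k + 1)  # (1.7): must be < 0
        upper = (1 + phi(k)) * c(k) + delta(k) - c(k + 1)  # (1.8): must be > 0
        if not (lower < 0 and upper > 0):
            return False
    return True


# Sample data (taken from Example 2.9, a = 1).
phi = lambda k: Fraction(1, k)
delta = lambda k: Fraction(-1, k + 2)
b = lambda k: Fraction(0)
c = lambda k: Fraction(1)
```

For these data `egress_inequalities_hold(phi, delta, b, c, 1, 200)` holds, while replacing the upper boundary by the constant 1/10 makes (1.8) fail already at $k = 1$.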
2 Main Results
In the following we put
\[ \prod_{i=k_1}^{k_2} G(i) \equiv 1 \quad\text{and}\quad \sum_{i=k_1}^{k_2} G(i) \equiv 0 \]
if $k_1, k_2 \in \mathbb{N}(a)$, $k_1 > k_2$ and $G$ is a function well-defined on $\mathbb{N}(a)$.
2.1 Auxiliary Lemma
Lemma 2.1. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then the sequence $\{u_{cs}\}_{s=0}^{\infty}$ with
\[ u_{cs} := \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))}, \quad s \in \mathbb{N}, \tag{2.1} \]
is a decreasing convergent sequence and the sequence $\{u_{bs}\}_{s=0}^{\infty}$ with
\[ u_{bs} := \frac{b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))}, \quad s \in \mathbb{N}, \tag{2.2} \]
is an increasing convergent sequence. Moreover, $u_{cs} > u_{bs}$ holds for every $s \in \mathbb{N}$, and for the limits
\[ c^* = \lim_{s \to \infty} u_{cs}, \qquad b^* = \lim_{s \to \infty} u_{bs} \tag{2.3} \]
the inequality $c^* \ge b^*$ holds.
Proof. Let us divide the proof into several steps.
a) Property $u_{cs} > u_{bs}$, $s \in \mathbb{N}$.
Let us show that $u_{cs} > u_{bs}$ for every $s \in \mathbb{N}$. This follows from (2.1), (2.2), since $c(a+s) > b(a+s)$ for every $s \in \mathbb{N}$ and the common denominator in (2.1), (2.2) is positive.
b) The sequence $\{u_{cs}\}_{s=0}^{\infty}$ is decreasing.
Let us verify that $u_{cs} > u_{c,s+1}$ for every $s \in \mathbb{N}$. For $s = 0$ and $s = 1$, (2.1) gives
\[ u_{c0} = c(a) \quad\text{and}\quad u_{c1} = \frac{c(a+1) - \delta(a)}{1 + \varphi(a)}. \]
The inequality $u_{c0} > u_{c1}$ is, due to the property $1 + \varphi(a) > 0$, a consequence of (1.8) with $k = a$, since
\[ (1 + \varphi(a)) c(a) + \delta(a) - c(a+1) > 0. \]
Let us consider the general case. For $k = a + s$, $s \in \mathbb{N}$, the inequality (1.8) gives
\[ (1 + \varphi(a+s)) c(a+s) + \delta(a+s) - c(a+s+1) > 0 \]
or (since $1 + \varphi(a+s) > 0$)
\[ c(a+s) > \frac{c(a+s+1) - \delta(a+s)}{1 + \varphi(a+s)}. \tag{2.4} \]
Then, using (2.1), we estimate the general term $u_{cs}$, $s \in \mathbb{N}$, of the sequence $\{u_{cs}\}_{s=0}^{\infty}$:
\begin{align*}
u_{cs} &= \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \\
&> \frac{\dfrac{c(a+s+1) - \delta(a+s)}{1 + \varphi(a+s)} - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \quad \text{[due to (2.4)]} \\
&= \frac{c(a+s+1) - \delta(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j)) \, (1 + \varphi(a+s))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} \\
&= \frac{c(a+s+1) - \delta(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s} (1 + \varphi(j))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} \\
&= \frac{c(a+s+1) - \sum_{i=0}^{s} \delta(a+i) \prod_{j=a+i+1}^{a+s} (1 + \varphi(j))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} = u_{c,s+1}.
\end{align*}
So the inequality $u_{cs} > u_{c,s+1}$ holds for every $s \in \mathbb{N}$.
c) The sequence $\{u_{bs}\}_{s=0}^{\infty}$ is increasing.
Let us show that $u_{bs} < u_{b,s+1}$ for $s \in \mathbb{N}$. For $s = 0$ and $s = 1$, (2.2) gives
\[ u_{b0} = b(a) \quad\text{and}\quad u_{b1} = \frac{b(a+1) - \delta(a)}{1 + \varphi(a)}. \]
The inequality $u_{b0} < u_{b1}$ is, due to the property $1 + \varphi(a) > 0$, a consequence of (1.7) with $k = a$, since
\[ (1 + \varphi(a)) b(a) + \delta(a) - b(a+1) < 0. \]
Let us consider the general case. For $k = a + s$, $s \in \mathbb{N}$, the inequality (1.7) gives
\[ (1 + \varphi(a+s)) b(a+s) + \delta(a+s) - b(a+s+1) < 0 \]
or (since $1 + \varphi(a+s) > 0$)
\[ b(a+s) < \frac{b(a+s+1) - \delta(a+s)}{1 + \varphi(a+s)}. \tag{2.5} \]
Then, using (2.2), we estimate the general term $u_{bs}$, $s \in \mathbb{N}$, of the sequence $\{u_{bs}\}_{s=0}^{\infty}$:
\begin{align*}
u_{bs} &= \frac{b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \\
&< \frac{\dfrac{b(a+s+1) - \delta(a+s)}{1 + \varphi(a+s)} - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \quad \text{[due to (2.5)]} \\
&= \frac{b(a+s+1) - \delta(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j)) \, (1 + \varphi(a+s))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} \\
&= \frac{b(a+s+1) - \delta(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s} (1 + \varphi(j))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} \\
&= \frac{b(a+s+1) - \sum_{i=0}^{s} \delta(a+i) \prod_{j=a+i+1}^{a+s} (1 + \varphi(j))}{\prod_{i=0}^{s} (1 + \varphi(a+i))} = u_{b,s+1}.
\end{align*}
Consequently, the inequality $u_{bs} < u_{b,s+1}$ is verified for every $s \in \mathbb{N}$. The lemma is proved, since all the remaining assertions are elementary consequences of the theory of number sequences.
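The sequences of Lemma 2.1 are easy to evaluate numerically, which gives a quick sanity check of the monotonicity just proved. The following Python sketch evaluates (2.1) and (2.2) with exact rational arithmetic; the function name `u_bound` is hypothetical, and the sample data ($\varphi(k) = 1/k$, $\delta(k) = -1/(k+2)$, $b \equiv 0$, $c \equiv 1$, $a = 1$) are those of Example 2.9, used here only for illustration.

```python
from fractions import Fraction


def u_bound(boundary, phi, delta, a, s):
    """Evaluate (2.1) when boundary = c (gives u_cs) or (2.2) when boundary = b (gives u_bs)."""
    total = Fraction(0)
    for i in range(s):                       # i = 0, ..., s-1
        prod = Fraction(1)
        for j in range(a + i + 1, a + s):    # j = a+i+1, ..., a+s-1 (empty product = 1)
            prod *= 1 + phi(j)
        total += delta(a + i) * prod
    denom = Fraction(1)
    for i in range(s):                       # product of (1 + phi(a+i)), i = 0, ..., s-1
        denom *= 1 + phi(a + i)
    return (boundary(a + s) - total) / denom


# Sample data (taken from Example 2.9, a = 1).
phi = lambda k: Fraction(1, k)
delta = lambda k: Fraction(-1, k + 2)
b = lambda k: Fraction(0)
c = lambda k: Fraction(1)

u_c = [u_bound(c, phi, delta, 1, s) for s in range(30)]
u_b = [u_bound(b, phi, delta, 1, s) for s in range(30)]
```

For these data the lists start with $u_{c0} = 1$, $u_{c1} = 2/3$ and $u_{b0} = 0$, $u_{b1} = 1/6$; the first list decreases, the second increases, and both approach $1/2$.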
2.2 Main Results
Lemma 2.2. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$. Then the solutions $u(k)$, $U(k)$, $k \in \mathbb{N}(a)$, of two problems for the linear equation (1.5), $u(a) = \alpha$ and $U(a) = \beta$ with $\alpha < \beta$, satisfy the inequalities
\[ u(k) < U(k) \tag{2.6} \]
for every $k \in \mathbb{N}(a)$.
Proof. For $k = a + 1$ we have
\[ u(a+1) = (1 + \varphi(a)) u(a) + \delta(a) = (1 + \varphi(a)) \alpha + \delta(a), \]
\[ U(a+1) = (1 + \varphi(a)) U(a) + \delta(a) = (1 + \varphi(a)) \beta + \delta(a), \]
and, since $1 + \varphi(a) > 0$, $u(a+1) < U(a+1)$. Let inequality (2.6) hold for $k = a, a+1, \dots, a+p$. Then
\[ u(a+p+1) = (1 + \varphi(a+p)) u(a+p) + \delta(a+p), \]
\[ U(a+p+1) = (1 + \varphi(a+p)) U(a+p) + \delta(a+p), \]
and obviously $u(a+p+1) < U(a+p+1)$. The proof is complete.
Lemma 2.3. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then:
a) The solution $u = u^*_{cs}(k)$, $k \in \mathbb{N}(a)$, of the problem
\[ u^*_{cs}(a) = u_{cs}, \quad s \in \mathbb{N}, \tag{2.7} \]
for the linear equation (1.5) satisfies the relations
\[ u^*_{cs}(k) \in \omega(k), \quad k = a, a+1, \dots, a+s-1, \tag{2.8} \]
and
\[ u^*_{cs}(a+s) = c(a+s). \tag{2.9} \]
Moreover,
\[ u^*_{c,s+1}(k) < u^*_{cs}(k), \quad k = a, a+1, \dots, a+s. \tag{2.10} \]
b) The solution $u = u^*_{bs}(k)$, $k \in \mathbb{N}(a)$, of the problem
\[ u^*_{bs}(a) = u_{bs}, \quad s \in \mathbb{N}, \tag{2.11} \]
for the linear equation (1.5) satisfies the relations
\[ u^*_{bs}(k) \in \omega(k), \quad k = a, a+1, \dots, a+s-1, \tag{2.12} \]
and
\[ u^*_{bs}(a+s) = b(a+s). \tag{2.13} \]
Moreover,
\[ u^*_{b,s+1}(k) > u^*_{bs}(k), \quad k = a, a+1, \dots, a+s. \tag{2.14} \]
Proof. α) Consider the initial problem (2.7). Then
\[ u^*_{cs}(a+1) = (1 + \varphi(a)) u^*_{cs}(a) + \delta(a) = (1 + \varphi(a)) u_{cs} + \delta(a). \tag{2.15} \]
Suppose that the formula
\[ u^*_{cs}(a+k) = u_{cs} \prod_{i=0}^{k-1} (1 + \varphi(a+i)) + \sum_{i=0}^{k-1} \delta(a+i) \prod_{j=i+1}^{k-1} (1 + \varphi(a+j)) \tag{2.16} \]
holds for $k = 0, 1, \dots, s-1$. For $k = 0$ and $k = 1$ it holds, since we get the relations (2.7) and (2.15), respectively. Moreover,
\begin{align*}
u^*_{cs}(a+s) &= (1 + \varphi(a+s-1)) u^*_{cs}(a+s-1) + \delta(a+s-1) \\
&= (1 + \varphi(a+s-1)) \left[ u_{cs} \prod_{i=0}^{s-2} (1 + \varphi(a+i)) + \sum_{i=0}^{s-2} \delta(a+i) \prod_{j=i+1}^{s-2} (1 + \varphi(a+j)) \right] + \delta(a+s-1) \\
&= u_{cs} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-2} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)) + \delta(a+s-1) \\
&= u_{cs} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)).
\end{align*}
Comparing the last expression with (2.16), we conclude that (2.16) holds for $k = s$ and, moreover, for every $k \in \mathbb{N}$. Using the representation (2.1), we get
\[ u^*_{cs}(a+s) = \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)) = c(a+s). \]
So the formula (2.9) holds.
β) Consider the initial problem (2.11). Then
\[ u^*_{bs}(a+1) = (1 + \varphi(a)) u^*_{bs}(a) + \delta(a) = (1 + \varphi(a)) u_{bs} + \delta(a). \tag{2.17} \]
Suppose that the formula
\[ u^*_{bs}(a+k) = u_{bs} \prod_{i=0}^{k-1} (1 + \varphi(a+i)) + \sum_{i=0}^{k-1} \delta(a+i) \prod_{j=i+1}^{k-1} (1 + \varphi(a+j)) \]
holds for $k = 0, 1, \dots, s-1$ (for $k = 0$ and $k = 1$ we get the relations (2.11) and (2.17), respectively). Moreover,
\begin{align*}
u^*_{bs}(a+s) &= (1 + \varphi(a+s-1)) u^*_{bs}(a+s-1) + \delta(a+s-1) \\
&= (1 + \varphi(a+s-1)) \left[ u_{bs} \prod_{i=0}^{s-2} (1 + \varphi(a+i)) + \sum_{i=0}^{s-2} \delta(a+i) \prod_{j=i+1}^{s-2} (1 + \varphi(a+j)) \right] + \delta(a+s-1) \\
&= u_{bs} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-2} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)) + \delta(a+s-1) \\
&= u_{bs} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)).
\end{align*}
Comparing the last expression with the supposed formula, we conclude that the formula holds for $k = s$ and, consequently, for every $k \in \mathbb{N}$. Using the representation (2.2), we get
\[ u^*_{bs}(a+s) = \frac{b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=i+1}^{s-1} (1 + \varphi(a+j)) = b(a+s). \]
So the formula (2.13) is proved.
γ) By Lemma 2.1 we have $u_{cs} > u_{bs}$, $s \in \mathbb{N}$. Then, by Lemma 2.2, the inequalities $u^*_{cs}(k) > u^*_{bs}(k)$ hold for every $k \in \mathbb{N}(a)$. Since, by Lemma 2.1, $u_{cs} > u_{c,s+1}$ and $u_{bs} < u_{b,s+1}$ for $s \in \mathbb{N}$, the properties (2.10) and (2.14) are a consequence of Lemma 2.1 and Lemma 2.2. Let us prove relations (2.8), (2.12). Since $u^*_{bs}(a) < u^*_{cs}(a)$ and $u^*_{b,s+1}(a) > u^*_{bs}(a)$, $u^*_{c,s+1}(a) < u^*_{cs}(a)$, Lemma 2.2 gives
\[ u^*_{bs}(k) < u^*_{b,s+1}(k) < u^*_{c,s+1}(k) < u^*_{cs}(k). \]
For $k = a + s$ we get
\[ b(a+s) = u^*_{bs}(a+s) < u^*_{b,s+1}(a+s) < u^*_{c,s+1}(a+s) < u^*_{cs}(a+s) = c(a+s). \]
The last inequalities can be rewritten as
\[ b(a+\tilde{s}) = u^*_{b\tilde{s}}(a+\tilde{s}) < u^*_{b,s}(a+\tilde{s}) < u^*_{c,s}(a+\tilde{s}) < u^*_{c\tilde{s}}(a+\tilde{s}) = c(a+\tilde{s}). \]
Putting here successively $\tilde{s} = 0, 1, 2, \dots, s-1$, we see that (2.8) and (2.12) hold, too.
Theorem 2.4. [Main Result] Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then every initial problem (1.9) with $u^* \in [b^*, c^*]$, where $b^*$ and $c^*$ are defined by (2.3), determines a solution satisfying inequalities (1.10).
Proof. The proof is a straightforward consequence of Lemmas 2.1, 2.2, 2.3. For the solutions of the problems $u(a) = b^*$, $U(a) = c^*$ we have $b(k) < u(k) \le U(k) < c(k)$ for every $k = a, a+1, \dots$, and $u(k) \le \tilde{u}(k) \le U(k)$ for every $k = a, a+1, \dots$ if $\tilde{u}(a) \in [b^*, c^*]$.
Consequence 1. If Theorem 2.4 holds, then the expression
\[ u(a+s) = u^* \prod_{i=0}^{s-1} (1 + \varphi(a+i)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{p=i+1}^{s-1} (1 + \varphi(a+p)) \]
with $u^* \in [b^*, c^*]$ is a solution of the problem (1.5), (1.9) satisfying inequalities (1.10).
Proof. The proof follows immediately from the statement of Theorem 2.4 and from the explicit form of the solution of the problem (1.5), (1.9), which can be derived from (1.5) directly.
Theorem 2.5. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. The initial problem
\[ u(a) = u^{\nabla} \]
with $u^{\nabla} \in [b(a), c(a)] \setminus [b^*, c^*]$ generates a solution $u = u^{\nabla}(k)$ of equation (1.5) that does not satisfy inequalities (1.10) for all $k \in \mathbb{N}(a)$.
Proof. Let us suppose that $u^{\nabla} \in (c^*, c(a)]$. Then there exists a number $s = s^{\nabla} \in \mathbb{N}$ such that $u^{\nabla} > u_{c s^{\nabla}}$. By Lemma 2.2, the inequalities
\[ u^{\nabla}(k) > u^*_{c s^{\nabla}}(k), \]
where $u^*_{c s^{\nabla}}(k)$ is the solution with $u^*_{c s^{\nabla}}(a) = u_{c s^{\nabla}}$, hold for every $k \in \mathbb{N}(a)$. In accordance with Lemma 2.3,
\[ u^*_{c s^{\nabla}}(k) \in \omega(k), \quad k = a, a+1, \dots, a + s^{\nabla} - 1, \]
and
\[ u^*_{c s^{\nabla}}(a + s^{\nabla}) = c(a + s^{\nabla}). \]
Moreover, due to (1.5) and (1.8),
\[ u^*_{c s^{\nabla}}(a + s^{\nabla} + 1) = (1 + \varphi(a + s^{\nabla})) u^*_{c s^{\nabla}}(a + s^{\nabla}) + \delta(a + s^{\nabla}) = (1 + \varphi(a + s^{\nabla})) c(a + s^{\nabla}) + \delta(a + s^{\nabla}) > c(a + s^{\nabla} + 1). \]
Consequently,
\[ u^{\nabla}(a + s^{\nabla} + 1) > c(a + s^{\nabla} + 1). \]
So the inequalities (1.10) do not hold for $k = a + s^{\nabla} + 1$. The case $u^{\nabla} \in [b(a), b^*)$ can be considered similarly.
The following two corollaries follow directly from Theorem 2.4 and Theorem 2.5.
Corollary 1. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then a solution $u = u(k)$ of equation (1.5) satisfies inequalities (1.10) for every $k \in \mathbb{N}(a)$ if and only if $u(a) \in [b^*, c^*]$.
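The dichotomy expressed by Theorem 2.4, Theorem 2.5 and Corollary 1 can be observed numerically by iterating equation (1.5) from a chosen initial value and recording the first index at which the solution leaves the strip between $b(k)$ and $c(k)$. The sketch below is an illustration only: the function name `first_exit` is hypothetical, the sample data are those of Example 2.9 (for which $b^* = c^* = 1/2$), and a finite number of steps can only show that a solution leaves the strip, not that it stays inside for all $k$.

```python
from fractions import Fraction


def first_exit(u0, phi, delta, b, c, a, n_steps):
    """Iterate u(k+1) = (1 + phi(k)) u(k) + delta(k) from u(a) = u0.

    Return the first k in a, ..., a + n_steps with u(k) outside (b(k), c(k)),
    or None if the solution stays inside on that range.
    """
    u = u0
    for k in range(a, a + n_steps + 1):
        if not (b(k) < u < c(k)):
            return k
        u = (1 + phi(k)) * u + delta(k)
    return None


# Sample data (taken from Example 2.9, a = 1).
phi = lambda k: Fraction(1, k)
delta = lambda k: Fraction(-1, k + 2)
b = lambda k: Fraction(0)
c = lambda k: Fraction(1)

inside = first_exit(Fraction(1, 2), phi, delta, b, c, 1, 200)
above = first_exit(Fraction(1, 2) + Fraction(1, 100), phi, delta, b, c, 1, 200)
below = first_exit(Fraction(1, 2) - Fraction(1, 100), phi, delta, b, c, 1, 200)
```

With the initial value $1/2$ the solution stays inside on the whole tested range, whereas a perturbation of $\pm 1/100$ produces a solution that crosses the upper boundary at $k = 10$ or reaches the lower boundary at $k = 99$, respectively.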
Corollary 2. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Let, moreover, $b^* = c^*$. Then the equation (1.5) has a unique solution $u = u^*(k)$ satisfying inequalities (1.10) for every $k \in \mathbb{N}(a)$. This solution is determined by the initial data $u^*(a) = u^* = b^*$.
It is interesting to find sufficient conditions for the case $b^* = c^*$. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Let us denote
\[ \Delta(s) = u_{cs} - u_{bs}, \quad s = 0, 1, \dots \]
Then the length of the interval $[b^*, c^*]$ can be estimated (due to the monotonicity of the sequences $\{u_{cs}\}_{s=0}^{\infty}$, $\{u_{bs}\}_{s=0}^{\infty}$) as
\[ 0 \le c^* - b^* < \Delta(s), \quad s = 0, 1, \dots \]
From the definition of the expressions $u_{cs}$, $u_{bs}$ (see (2.1) and (2.2)) we see that
\[ \Delta(s) = u_{cs} - u_{bs} = \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} - \frac{b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} = \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))}, \]
i.e.
\[ \Delta(s) = \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))}. \]
The following corollary is obvious.
Corollary 3. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then the following inequalities hold:
\[ 0 < u_{cs} - c^* < \Delta(s), \quad s \in \mathbb{N}, \]
\[ 0 < b^* - u_{bs} < \Delta(s), \quad s \in \mathbb{N}. \]
Theorem 2.6. Let $\varphi(k) > -1$ for every $k \in \mathbb{N}(a)$ and let the inequalities (1.7), (1.8) hold. Then $b^* = c^*$ if
\[ \lim_{s \to \infty} \Delta(s) = 0. \tag{2.18} \]
Proof. From Theorem 1.5, the existence of a solution of the problem (1.5), (1.9) follows. Then
\begin{align*}
c^* - b^* &= \lim_{s \to \infty} \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} - \lim_{s \to \infty} \frac{b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \\
&= \lim_{s \to \infty} \frac{c(a+s) - b(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j)) + \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} \\
&= \lim_{s \to \infty} \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} = \lim_{s \to \infty} \Delta(s) = 0.
\end{align*}
Then $c^* = b^*$.
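The width $\Delta(s)$ gives a computable bound on how well $u_{cs}$ and $u_{bs}$ localise the admissible initial values. A minimal Python sketch of the closed form for $\Delta(s)$ follows; the function name `gap` is hypothetical and the sample data are again those of Example 2.9 ($\varphi(k) = 1/k$, $b \equiv 0$, $c \equiv 1$, $a = 1$), for which the product in the denominator telescopes to $s + 1$.

```python
from fractions import Fraction


def gap(b, c, phi, a, s):
    """Delta(s) = (c(a+s) - b(a+s)) / prod_{i=0}^{s-1} (1 + phi(a+i))."""
    denom = Fraction(1)
    for i in range(s):
        denom *= 1 + phi(a + i)
    return (c(a + s) - b(a + s)) / denom


# Sample data (taken from Example 2.9, a = 1).
phi = lambda k: Fraction(1, k)
b = lambda k: Fraction(0)
c = lambda k: Fraction(1)

gaps = [gap(b, c, phi, 1, s) for s in range(50)]
```

Here `gaps[s]` equals $1/(s+1)$, so condition (2.18) is satisfied for these data, although the convergence is slow.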
Remark 2.7. The condition (2.18) is valid e.g. in the case when
\[ \lim_{s \to \infty} [c(a+s) - b(a+s)] = 0 \quad\text{and}\quad \lim_{s \to \infty} \prod_{i=0}^{s-1} (1 + \varphi(a+i)) > 0, \]
or in the case when the functions $c(k)$, $b(k)$ are bounded on $\mathbb{N}(a)$ and $\varphi(k) \ge \varepsilon > 0$, $\varepsilon = \text{const}$, for every $k \in \mathbb{N}(a)$.
2.3 Concluding remarks
Let us consider a particular case of equation (1.5) with $\varphi(k) = -1 + A$, $A \in \mathbb{R}$, $A > 1$, and with a function $\delta(k)$:
\[ u(k+1) = A u(k) + \delta(k). \tag{2.19} \]
The following theorem is a consequence of the previous results.
Theorem 2.8. Let $A > 1$ and $|\delta(k)| < M$ on $\mathbb{N}(a)$. Then the initial problem
\[ u^*(a) = u^* = -\sum_{i=0}^{\infty} \frac{\delta(a+i)}{A^{i+1}} \tag{2.20} \]
generates a unique bounded solution of equation (2.19) on $\mathbb{N}(a)$.
Proof. The series in (2.20) is obviously convergent, since it is majorized by the convergent series
\[ \frac{M}{A} \sum_{i=0}^{\infty} \frac{1}{A^i} = \frac{M}{A - 1}. \]
Put $c(k) := \varepsilon M$, $b(k) := -\varepsilon M$ with $\varepsilon > 1/(A-1)$, $\varepsilon = \text{const}$. Then inequalities (1.7) and (1.8) hold since
\[ (1 + \varphi(k)) b(k) + \delta(k) - b(k+1) \le M(\varepsilon(1 - A) + 1) < 0 \]
and
\[ (1 + \varphi(k)) c(k) + \delta(k) - c(k+1) \ge M(\varepsilon(A - 1) - 1) > 0 \]
for every $k \in \mathbb{N}(a)$. All assumptions of Theorem 2.4 hold. Let us compute the limits $c^*$, $b^*$. In accordance with (2.3), (2.1), we get
\[ c^* = \lim_{s \to \infty} \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1 + \varphi(j))}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} = \lim_{s \to \infty} \frac{\varepsilon M - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} A}{\prod_{i=0}^{s-1} A} = -\sum_{i=0}^{\infty} \frac{\delta(a+i)}{A^{i+1}} = u^*. \]
Since
\[ \lim_{s \to \infty} \Delta(s) = \lim_{s \to \infty} \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1 + \varphi(a+i))} = \lim_{s \to \infty} \frac{2 \varepsilon M}{A^s} = 0, \]
Theorem 2.6 gives $u^* = c^* = b^*$, in accordance with (2.20). Then, by Corollary 2, there exists a unique solution $u = u^*(k)$ (generated just by relation (2.20)) satisfying for every $k \in \mathbb{N}(a)$ the inequalities
\[ -\varepsilon M < u^*(k) < \varepsilon M. \]
These inequalities express the boundedness of $u^*(k)$ on $\mathbb{N}(a)$. Since inequalities (1.7), (1.8) are valid for arbitrarily large positive $M$, we can conclude that equation (2.19) has only one bounded solution.
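Formula (2.20) can be tried out on a simple instance of equation (2.19). The Python sketch below is an illustration under assumptions that are not part of the paper: $A = 2$, $a = 0$ and $\delta(k) = (-1)^k$ are sample choices, and the names `u_star` and `iterate` are hypothetical. For these data the series (2.20) is geometric with sum $-1/3$. Exact rational arithmetic is used because iterating (2.19) in floating point amplifies rounding errors by the factor $A$ at every step.

```python
from fractions import Fraction

A = Fraction(2)  # assumed sample value, A > 1
a = 0            # assumed starting index


def delta(k):
    """Assumed bounded forcing term delta(k) = (-1)^k, so |delta(k)| < M for any M > 1."""
    return Fraction(1 if k % 2 == 0 else -1)


def u_star(n_terms):
    """Partial sum of (2.20): -sum_{i < n_terms} delta(a+i) / A^(i+1)."""
    return -sum((delta(a + i) / A ** (i + 1) for i in range(n_terms)), Fraction(0))


def iterate(u0, n_steps):
    """Values u(a), ..., u(a + n_steps) of u(k+1) = A u(k) + delta(k) with u(a) = u0."""
    values = [u0]
    for k in range(a, a + n_steps):
        values.append(A * values[-1] + delta(k))
    return values


bounded = iterate(Fraction(-1, 3), 40)                        # exact sum of the series
perturbed = iterate(Fraction(-1, 3) + Fraction(1, 1000), 40)  # slightly different start
```

Starting from $-1/3$ the solution alternates between $-1/3$ and $1/3$ and is therefore bounded, while the perturbed start drifts away at the rate $A^k$, which is consistent with the uniqueness statement of Theorem 2.8.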
Theorem 2.8 serves only as an illustration of the results obtained. Let us note that this result coincides with the result described in the book [5, p. 77, Exercise 20, part (c)].
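Theorem 2.8 can be illustrated numerically. The following is only a sketch under assumed data that do not come from the paper: A = 2, a = 0 and δ(k) = (−1)^k. For this choice the series (2.20) is geometric and sums to u* = −(−1)^a/(A+1). Exact rational arithmetic is used because the recurrence (2.19) amplifies any rounding error by the factor A in every step.

```python
from fractions import Fraction

A = Fraction(2)   # assumed value, any A > 1 would do
a = 0             # assumed initial index


def delta(k):
    """Assumed bounded forcing term, |delta(k)| = 1 < M for M = 2."""
    return Fraction((-1) ** k)


def u_star(terms):
    """Partial sum of the series (2.20)."""
    return -sum(delta(a + i) / A ** (i + 1) for i in range(terms))


def iterate(u0, steps):
    """Apply u(k+1) = A*u(k) + delta(k) `steps` times starting from u(a) = u0."""
    u, k = u0, a
    for _ in range(steps):
        u = A * u + delta(k)
        k += 1
    return u


# Closed form of (2.20) for this particular delta (geometric series).
exact = -Fraction((-1) ** a) / (A + 1)

bounded = iterate(exact, 40)                          # stays at +-1/3
perturbed = iterate(exact + Fraction(1, 10**6), 40)   # grows like A**k
print(float(u_star(60)), float(bounded), float(perturbed))
```

Starting exactly at u* the iterates keep the absolute value 1/(A+1), while an initial error of 10^-6 is multiplied by A^40, which is the uniqueness statement of the theorem seen from the numerical side.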
2.4 Examples

In this section, two illustrative examples are considered. In the first one we determine initial data generating a unique bounded solution of an equation of the type (1.5). In the second one every solution is unbounded, but the initial data determine a particular solution with slightly different asymptotic behaviour.

Example 2.9. Let us consider the equation (1.5) with $\varphi(k) = 1/k$, $\delta(k) = -1/(k+2)$ and a = 1, i.e. the equation
\[ \Delta u(k) = \frac{1}{k}\,u(k) - \frac{1}{k+2}, \qquad k \in N(1). \tag{2.21} \]
Define $b(k) \equiv 0$, $c(k) \equiv 1$, $k \in N(1)$. Then
\[ (1+\varphi(k))\,b(k) + \delta(k) - b(k+1) = \delta(k) = -\frac{1}{k+2} < 0 \]
for all $k \in N(1)$ and the inequalities (1.7) hold. Moreover,
\[ (1+\varphi(k))\,c(k) + \delta(k) - c(k+1) = \frac{1}{k} - \frac{1}{k+2} = \frac{2}{k(k+2)} > 0 \]
and the inequalities (1.8) hold for all $k \in N(1)$. All conditions of Theorem 2.4 hold and therefore there exists a solution $\tilde u(k)$ of equation (2.21) such that
\[ 0 < \tilde u(k) < 1 \tag{2.22} \]
for every $k \in N(1)$. Moreover, using (2.3) and (2.1) we get
\[ c^* = \lim_{s\to+\infty} u_{cs} = \lim_{s\to+\infty} \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1+\varphi(j))}{\prod_{i=0}^{s-1} (1+\varphi(a+i))} = \lim_{s\to+\infty} \frac{1 + \sum_{i=0}^{s-1} \frac{1}{(1+i)+2} \prod_{j=i+2}^{s} \left(1 + \frac{1}{j}\right)}{\prod_{i=0}^{s-1} \left(1 + \frac{1}{1+i}\right)} \]
\[ = \lim_{s\to+\infty} \frac{1 + (s+1) \sum_{i=2}^{s+1} \frac{1}{i(i+1)}}{s+1} = \lim_{s\to+\infty} \frac{1}{s+1} + \lim_{s\to+\infty} \sum_{i=2}^{s+1} \left(\frac{1}{i} - \frac{1}{i+1}\right) = \lim_{s\to+\infty} \left(\frac{1}{2} - \frac{1}{3} + \frac{1}{3} - \frac{1}{4} + \dots + \frac{1}{s+1} - \frac{1}{s+2}\right) = \frac{1}{2}. \]
Since
\[ \lim_{s\to\infty} \Delta(s) = \lim_{s\to\infty} \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1+\varphi(a+i))} = \lim_{s\to\infty} \frac{1}{\prod_{i=0}^{s-1} \frac{2+i}{1+i}} = \lim_{s\to\infty} \frac{1}{s+1} = 0, \]
then, in accordance with Theorem 2.6, b* = c* = 1/2. Therefore the equation (2.21) has a unique solution $u = \tilde u(k)$, $k \in N(1)$, satisfying inequalities (2.22), determined by the initial data u(1) = 1/2. Indeed, it is easy to verify that the function
\[ \tilde u(k) = \frac{k}{k+1}, \qquad k \in N(1), \]
is such a solution of equation (2.21). Moreover, the general solution of (2.21) is given by the formula $u(k) = \tilde u(k) + C\,k$, where C is any constant. It means that $\tilde u(k)$ is the unique bounded solution of equation (2.21).

Example 2.10. Let us consider the equation (1.5) with $\varphi(k) = 2/(k+1)$, $\delta(k) = 2$ and a = 2, i.e. the equation
\[ \Delta u(k) = \frac{2}{k+1}\,u(k) + 2, \qquad k \in N(2). \tag{2.23} \]
Define $b(k) = k^2$, $c(k) = (k+1)^2$, $k \in N(2)$. Then
\[ (1+\varphi(k))\,b(k) + \delta(k) - b(k+1) = \left(1 + \frac{2}{k+1}\right) k^2 + 2 - (k+1)^2 = \frac{1-k}{k+1} < 0 \]
for all $k \in N(2)$ and the inequalities (1.7) hold. Moreover,
\[ (1+\varphi(k))\,c(k) + \delta(k) - c(k+1) = \left(1 + \frac{2}{k+1}\right)(k+1)^2 + 2 - (k+2)^2 = 1 > 0 \]
and the inequalities (1.8) hold for all $k \in N(2)$. All conditions of Theorem 2.4 hold and therefore there exists a solution $\tilde u(k)$ of equation (2.23) such that
\[ k^2 < \tilde u(k) < (k+1)^2 \tag{2.24} \]
for every $k \in N(2)$. Using (2.3) and (2.1) we get
\[ c^* = \lim_{s\to+\infty} u_{cs} = \lim_{s\to+\infty} \frac{c(a+s) - \sum_{i=0}^{s-1} \delta(a+i) \prod_{j=a+i+1}^{a+s-1} (1+\varphi(j))}{\prod_{i=0}^{s-1} (1+\varphi(a+i))} = \lim_{s\to+\infty} \frac{(s+3)^2 - 2 \sum_{i=0}^{s-1} \prod_{j=i+3}^{s+1} \frac{j+3}{j+1}}{\prod_{i=0}^{s-1} \frac{i+5}{i+3}} \]
\[ = \lim_{s\to+\infty} \frac{(s+3)^2 - 2(s+3)(s+4) \sum_{i=4}^{s+3} \frac{1}{i(i+1)}}{\frac{1}{12}(s+3)(s+4)} = \lim_{s\to+\infty} \left[ 12\,\frac{s+3}{s+4} - 24 \left( \frac{1}{4} - \frac{1}{5} + \frac{1}{5} - \frac{1}{6} + \dots + \frac{1}{s+3} - \frac{1}{s+4} \right) \right] = 12 - \frac{24}{4} = 6. \]
Since
\[ \lim_{s\to\infty} \Delta(s) = \lim_{s\to\infty} \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1} (1+\varphi(a+i))} = \lim_{s\to\infty} \frac{(s+3)^2 - (s+2)^2}{\frac{1}{12}(s+3)(s+4)} = \lim_{s\to\infty} \frac{12(2s+5)}{(s+3)(s+4)} = 0, \]
then, in accordance with Theorem 2.6, b* = c* = 6. Therefore the equation (2.23) has a unique solution $u = \tilde u(k)$, $k \in N(2)$, satisfying inequalities (2.24), determined by the initial data $\tilde u(2) = 6$. Indeed, it is easy to verify that the function $\tilde u(k) = k(k+1)$, $k \in N(2)$, is such a solution of equation (2.23). It is easy to see that the general solution of (2.23) is given by the formula $u(k) = \tilde u(k) + C\,(k+1)(k+2)$, where C is any constant.

Acknowledgment

The first author was supported by the Grant 201/04/0580 of the Czech Grant Agency (Prague), the second author was supported by the Council of Czech Government MSM 00216 30503.
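Returning to Examples 2.9 and 2.10, the closed-form solutions stated there can be cross-checked by iterating the recurrence directly. The sketch below assumes that equation (1.5) has the form u(k+1) = (1 + φ(k))u(k) + δ(k), which is consistent with (2.19), (2.21) and (2.23); the helper names `step` and `orbit` are illustrative only.

```python
from fractions import Fraction


def step(u, k, phi, delta):
    """One step of u(k+1) = (1 + phi(k)) * u(k) + delta(k)."""
    return (1 + phi(k)) * u + delta(k)


def orbit(u0, a, n, phi, delta):
    """Return the exact values u(a), u(a+1), ..., u(a+n)."""
    values, u = [u0], u0
    for k in range(a, a + n):
        u = step(u, k, phi, delta)
        values.append(u)
    return values


# Example 2.9: phi(k) = 1/k, delta(k) = -1/(k+2), a = 1, u(1) = 1/2.
orbit1 = orbit(Fraction(1, 2), 1, 50,
               lambda k: Fraction(1, k), lambda k: Fraction(-1, k + 2))

# Example 2.10: phi(k) = 2/(k+1), delta(k) = 2, a = 2, u(2) = 6.
orbit2 = orbit(Fraction(6), 2, 50,
               lambda k: Fraction(2, k + 1), lambda k: Fraction(2))

# A perturbed start in Example 2.9 picks up the homogeneous part C*k.
shifted = orbit(Fraction(1, 2) + Fraction(1, 100), 1, 50,
                lambda k: Fraction(1, k), lambda k: Fraction(-1, k + 2))

print(orbit1[:4], orbit2[:4])
```

With exact rationals the first orbit reproduces k/(k+1) and the second k(k+1); the perturbed orbit differs from k/(k+1) by exactly k/100, in line with the general solution of (2.21).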
References
[1] AGARWAL, R. P. Difference Equations and Inequalities, 2nd ed., Marcel Dekker, Inc., 2000.
[2] AGARWAL, R. P.; POPENDA, J. Periodic solutions of first order linear difference equations, Mathl. Comput. Modelling 22 (1995), 11-19.
[3] DIBLÍK, J. Discrete retract principle for systems of discrete equations, Comput. Math. Appl. 42 (2001), 515-528.
[4] DIBLÍK, J. Asymptotic behaviour of solutions of discrete equations, Fund. Differ. Equ. 11 (2004), 37-48.
DIBLÍK, J. Retract principle for difference equations, Proceedings of the Fourth International Conference on Difference Equations, Poznan, Poland, August 27-31, 1998. Eds.: S. Elaydi, G. Ladas, J. Popenda and J. Rakowski, Gordon and Breach Science Publ., 2000, 107-114.
[5] ELAYDI, S. N. An Introduction to Difference Equations, 2nd ed., Springer, 1999.
[6] GOLDA, W.; WERBOWSKI, G. Oscillation of linear functional equations of the second order, Funkc. Ekvac. 37 (1994), 221-227.
[7] GYORI, I.; PITUK, M. Asymptotic formulae for the solutions of a linear delay difference equation, J. Math. Anal. Appl. 195 (1995), 376-392.
[8] GYORI, I.; PITUK, M. Comparison theorems and asymptotic equilibrium for delay differential and difference equations, Dyn. Systems and Appl. 5 (1996), 277-302.
[9] MIGDA, M.; MIGDA, J. Asymptotic behaviour of solutions of difference equations of second order, Demonstr. Math. XXXII (1999), 767-773.
Address: Doc. RNDr. Jaromír Baštinec, CSc. Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 8, 616 00 Brno, [email protected]
Address: Doc. RNDr. Josef Diblík, DrSc. Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 8, 616 00 Brno, [email protected]
ON SOME PROPERTIES OF FRACTIONAL CALCULUS Vlasta Krupková, Zdeněk Šmarda Brno University of Technology
Abstract: This paper is devoted to some properties of the fractional integral and derivative. Applications of this calculus to some classes of fractional integral equations are shown.
Key words: Fractional integral, fractional derivative, Laplace transform.
1. Introduction

The fractional calculus is a generalization of integration and differentiation to operators of non-integer order [4,7]. The idea of fractional calculus has been known since the development of the classical calculus, with the first reference probably being associated with Leibniz and L'Hospital in 1695. Fourier, Euler and Laplace are among the many who worked on fractional calculus and its mathematical consequences [6]. The best-known definitions in fractional calculus are the Riemann-Liouville and Grünwald-Letnikov definitions [1]. Motivated by the requirements of physical reality, Caputo reformulated the more "classic" definition of the Riemann-Liouville fractional derivative in order to use integer-order initial conditions when solving fractional-order differential equations [7]. As recently as 1996, Kolwankar reformulated the Riemann-Liouville fractional derivative again in order to differentiate nowhere differentiable fractal functions [2]. In this paper we also deal with other constructions of the fractional derivative, and we show some of their particularities that arise when solving certain classes of integral equations.
2. The fractional integral

Understanding the definitions and the use of fractional calculus requires a few relatively simple mathematical notions, which are recalled here.

Euler's Gamma function:
\[ \Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx, \tag{1} \]
with the special case for $t = n$:
\[ \Gamma(n) = (n-1)! \tag{2} \]
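Mittag-Leffler function in two parameters:
\[ E_{\alpha,\beta}(t) = \sum_{k=0}^{\infty} \frac{t^k}{\Gamma(\alpha k + \beta)}, \qquad \alpha > 0,\ \beta > 0. \tag{3} \]
It is a generalization of the exponential function:
\[ E_{1,1}(t) = \sum_{k=0}^{\infty} \frac{t^k}{\Gamma(k+1)} = \sum_{k=0}^{\infty} \frac{t^k}{k!} = e^t. \]
Further particular cases are
\[ E_{2,1}(t) = \cosh\left(\sqrt{t}\right), \qquad E_{1/2,1}(t) = e^{t^2}\,\mathrm{erfc}(-t). \tag{4} \]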
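The special cases of the series (3) can be checked numerically by truncating the sum. The following is a minimal sketch, not a production-quality evaluator: the function name `mittag_leffler` and the truncation rule are assumptions made here, and the reference values use the standard identities E_{1,1}(t) = e^t, E_{2,1}(t) = cosh(√t) and E_{1/2,1}(t) = e^{t²} erfc(−t).

```python
import math


def mittag_leffler(alpha, beta, t, terms=80):
    """Partial sum of the series (3); adequate only for moderate |t|."""
    total = 0.0
    for k in range(terms):
        arg = alpha * k + beta
        if arg > 170.0:   # math.gamma overflows shortly beyond this point
            break
        total += t ** k / math.gamma(arg)
    return total


# Compare the truncated series with the closed forms.
print(mittag_leffler(1.0, 1.0, 1.5), math.exp(1.5))
print(mittag_leffler(2.0, 1.0, 2.0), math.cosh(math.sqrt(2.0)))
print(mittag_leffler(0.5, 1.0, 0.7), math.exp(0.7 ** 2) * math.erfc(-0.7))
```

For large |t| the plain power series suffers from cancellation and slow convergence, so a different algorithm would be needed there.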
The Laplace transform:
\[ L\{f(t)\} := \int_0^{\infty} e^{-pt} f(t)\,dt = F(p). \tag{5} \]
Also commonly used is the convolution of two functions,
\[ f(t) * g(t) := \int_0^t f(t-\tau)\,g(\tau)\,d\tau = g(t) * f(t), \qquad L\{f(t) * g(t)\} = F(p)\,G(p). \tag{6} \]
One final important property of the Laplace transform that should be addressed is the Laplace transform of a derivative of integer order n of the function f(t):
\[ L\{f^{(n)}(t)\} = p^n F(p) - \sum_{k=0}^{n-1} p^{n-k-1} f^{(k)}(0) = p^n F(p) - \sum_{k=0}^{n-1} p^{k} f^{(n-k-1)}(0). \tag{7} \]
Cauchy formula for evaluating the n-th integration of the function:
\[ \int_0^t \cdots \int_0^t f(\tau)\,d\tau = \frac{1}{(n-1)!} \int_0^t (t-\tau)^{n-1} f(\tau)\,d\tau. \tag{8} \]
To abbreviate this formula, we introduce the operator $I^n$:
\[ I^n f(t) := f_n(t) = \frac{1}{(n-1)!} \int_0^t (t-\tau)^{n-1} f(\tau)\,d\tau. \tag{9} \]
For direct use in (8), n is restricted to be an integer. The primary restriction is the use of the factorial, which has no meaning for non-integer values. The Gamma function, however, extends the factorial to all positive reals, and thus can be used in place of the factorial as in (2). Hence, by replacing the factorial with its Gamma function equivalent, we can generalize (9) to all $\alpha \in \mathbb{R}_+$:
\[ I^\alpha f(t) := f_\alpha(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t-\tau)^{\alpha-1} f(\tau)\,d\tau. \tag{10} \]
This approach is commonly referred to as the Riemann-Liouville approach. This formulation of the fractional integral has some very important properties that will be needed later when solving equations involving integrals and derivatives of fractional order. First, we consider integration of order α = 0 to be the identity operator, i.e.
\[ I^0 f(t) = f(t). \]
Also, given the nature of the integral's definition, and based on the principle from which it came (the Cauchy formula for repeated integration), we can see that just as
\[ I^n I^m = I^{m+n} = I^m I^n, \qquad m, n \in \mathbb{N}, \tag{11} \]
so too
\[ I^\alpha I^\beta = I^{\alpha+\beta} = I^\beta I^\alpha, \qquad \alpha, \beta \in \mathbb{R}_+. \]
The one condition placed upon a function f(t) that needs to be satisfied for these and other similar properties to remain true is that f(t) be a causal function, i.e. that it vanishes for $t \le 0$. Although this is a matter of convention, the convenience of this condition is especially clear in the context of the property demonstrated in (11). The effect is that $f(0) = f_n(0) = f_\alpha(0) = 0$. Using the Gamma function we define a function
\[ \Phi_\alpha(t) := \frac{t_+^{\alpha-1}}{\Gamma(\alpha)}. \]
From this we obtain
\[ \Phi_\alpha(t) * f(t) = \int_0^t \frac{(t-\tau)_+^{\alpha-1}}{\Gamma(\alpha)}\, f(\tau)\,d\tau, \]
where the subscript + denotes that the function vanishes for $t \le 0$. Now the formula of the fractional integral (10) can be written in the form
\[ I^\alpha f(t) = \Phi_\alpha(t) * f(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t-\tau)^{\alpha-1} f(\tau)\,d\tau. \tag{12} \]
We will find the Laplace transform of the Riemann-Liouville fractional integral. In (12) we showed that the fractional integral can be expressed as the convolution of two functions, $\Phi_\alpha(t)$ and $f(t)$. The Laplace transform of $t^{\alpha-1}$ is
\[ L\{t^{\alpha-1}\} = \Gamma(\alpha)\,p^{-\alpha}. \]
Thus, the Laplace transform of the fractional integral is found to be
\[ L\{I^\alpha f(t)\} = p^{-\alpha} F(p). \tag{13} \]
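The Riemann-Liouville integral (10) can be approximated numerically in a few lines. The sketch below is one possible approach, not the authors' method: the substitution s = (t − τ)^α removes the weak singularity of the kernel, so that I^α f(t) = (1/(αΓ(α))) ∫ f(t − s^{1/α}) ds over 0 ≤ s ≤ t^α, which is then evaluated by the midpoint rule. The function name and the number of nodes are assumptions. For f(τ) = τ the result is compared with the known value Γ(2)/Γ(2 + α) · t^{1+α}.

```python
import math


def fractional_integral(f, alpha, t, n=20000):
    """Midpoint-rule approximation of I^alpha f(t) from (10), alpha > 0, t >= 0."""
    upper = t ** alpha            # s runs over [0, t**alpha]
    h = upper / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += f(t - s ** (1.0 / alpha))
    return total * h / (alpha * math.gamma(alpha))


alpha, t = 0.5, 2.0
approx = fractional_integral(lambda tau: tau, alpha, t)
exact = math.gamma(2.0) / math.gamma(2.0 + alpha) * t ** (1.0 + alpha)
print(approx, exact)

# For alpha = 1 the operator reduces to ordinary integration.
print(fractional_integral(math.cos, 1.0, 1.0), math.sin(1.0))
```

The substitution keeps the integrand bounded, so no special quadrature weights are needed; accuracy degrades somewhat when f itself is not smooth at the origin.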
3. The fractional derivative

Because the Riemann-Liouville approach to the fractional integral began with an expression for repeated integration of a function, one's first instinct may be to imitate this approach for the derivative. In the following we introduce two definitions of the fractional derivative in the sense of the Riemann-Liouville approach. Consider a differentiation of order $\alpha \in \mathbb{R}_+$ and select an integer m such that $m - 1 < \alpha < m$. Given these numbers, we have two possible ways to define the derivative. In the first one, we integrate the function f(t) by order $m - \alpha$ and then differentiate the resulting function $f_{m-\alpha}(t)$ by order m. We will call this method the Left Hand Definition (LHD) of the fractional derivative; it is given by
\[ D_L^\alpha f(t) := \begin{cases} \dfrac{d^m}{dt^m} \left[ \dfrac{1}{\Gamma(m-\alpha)} \displaystyle\int_0^t \dfrac{f(\tau)}{(t-\tau)^{\alpha+1-m}}\,d\tau \right], & m-1 < \alpha < m, \\[2ex] \dfrac{d^m}{dt^m} f(t), & \alpha = m. \end{cases} \tag{14} \]
The Right Hand Definition (RHD) attempts to arrive at the same result using the same operations, but in the reverse order. The resulting formula is
\[ D_R^\alpha f(t) := \begin{cases} \dfrac{1}{\Gamma(m-\alpha)} \displaystyle\int_0^t \dfrac{f^{(m)}(\tau)}{(t-\tau)^{\alpha+1-m}}\,d\tau, & m-1 < \alpha < m, \\[2ex] \dfrac{d^m}{dt^m} f(t), & \alpha = m. \end{cases} \tag{15} \]
This second definition, although referred to here as the Right Hand Definition, was originally formulated by Caputo and is therefore commonly referred to as the Caputo fractional derivative. Demonstrating the practicality of the RHD over the LHD is simple. For example, the fractional derivative of a constant using the LHD is not zero; in fact, for $0 < \alpha < 1$,
\[ D_L^\alpha C = \frac{C\,t^{-\alpha}}{\Gamma(1-\alpha)}, \]
which is a substantial problem in the physical world. In Section 2 we derived the Laplace transform of the fractional integral (13). Using this result, we may similarly find the Laplace transform of the LHD fractional derivative. This derivative may be written in the form
\[ D_L^\alpha f(t) = g^{(m)}(t), \quad \text{where } g(t) = I^{m-\alpha} f(t), \qquad m-1 \le \alpha < m. \]
Using the formula (7) and the Laplace transform of the fractional integral, we obtain
\[ L\{D_L^\alpha f(t)\} = p^m G(p) - \sum_{k=0}^{m-1} p^k g^{(m-k-1)}(0) = p^\alpha F(p) - \sum_{k=0}^{m-1} p^k D_L^{(\alpha-k-1)} f(0). \tag{16} \]
It is obvious that the required initial conditions are, for all k up to m − 1, fractional order derivatives of f(t). For the RHD, we will write the derivative in the form
\[ D_R^\alpha f(t) = I^{m-\alpha} g(t), \quad \text{where } g(t) = f^{(m)}(t), \qquad m-1 \le \alpha < m. \]
Using the formula (13), we get
\[ L\{D_R^\alpha f(t)\} = p^{-(m-\alpha)} G(p) = p^\alpha F(p) - \sum_{k=0}^{m-1} p^{\alpha-k-1} f^{(k)}(0). \tag{17} \]
In this formulation, the order α does not appear in the derivatives of f(t). So, quite conveniently, integer order derivatives of f(t) are used as the initial conditions, and these are easily interpreted from physical data and observations. Consider the fractional integral equation of the first kind
\[ \frac{1}{\Gamma(\alpha)} \int_0^t \frac{u(\tau)}{(t-\tau)^{1-\alpha}}\,d\tau = f(t), \qquad 0 < \alpha < 1. \tag{18} \]
This equation can be written in the form
\[ I^\alpha u(t) = f(t). \]
We have
\[ I^\alpha u(t) = \Phi_\alpha(t) * u(t) \;\Rightarrow\; L\{\Phi_\alpha(t) * u(t)\} = \frac{U(p)}{p^\alpha} = F(p). \tag{19} \]
We can reorder the result (19) into one of two forms,
\[ U(p) = p^\alpha F(p) = p \left[ \frac{F(p)}{p^{1-\alpha}} \right] \]
or
\[ U(p) = p^\alpha F(p) = \frac{1}{p^{1-\alpha}} \left( pF(p) - f(0) \right) + \frac{f(0)}{p^{1-\alpha}}. \]
Inverting the first form into the time domain, we get
\[ u(t) = \frac{1}{\Gamma(1-\alpha)} \frac{d}{dt} \int_0^t \frac{f(\tau)}{(t-\tau)^\alpha}\,d\tau = D_L^\alpha f(t), \]
which is equivalent to the solution of (18) with the LHD. The second form can be similarly inverted to yield
\[ u(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{f'(\tau)}{(t-\tau)^\alpha}\,d\tau + f(0)\,\frac{t^{-\alpha}}{\Gamma(1-\alpha)} = D_R^\alpha f(t) + f(0)\,\frac{t^{-\alpha}}{\Gamma(1-\alpha)}, \]
which is equivalent to the solution of (18) with the RHD. Now we consider the fractional integral equation of the second kind in the form
\[ u(t) + \frac{\lambda}{\Gamma(\alpha)} \int_0^t \frac{u(\tau)}{(t-\tau)^{1-\alpha}}\,d\tau = f(t) \;\Leftrightarrow\; \left(1 + \lambda I^\alpha\right) u(t) = f(t). \tag{20} \]
Applying the Laplace transform to (20) we get
\[ L\{(1 + \lambda I^\alpha)\,u(t)\} = L\{f(t)\} \;\Rightarrow\; \left(1 + \frac{\lambda}{p^\alpha}\right) U(p) = F(p). \tag{21} \]
The equation (21) can be rearranged in many ways, but we will solve it using the LHD. We can rewrite the equation (21) as follows:
\[ U(p) = \left[ p\,\frac{p^{\alpha-1}}{p^\alpha + \lambda} - 1 \right] F(p) + F(p). \tag{22} \]
The equation (22) is next inverted back into the time domain. In order to do this one needs the Laplace transform of the Mittag-Leffler function,
\[ L\{E_{\alpha,1}(-\lambda t^\alpha)\} = \frac{p^{\alpha-1}}{p^\alpha + \lambda}. \tag{23} \]
By the relationship given in (7), it is clear that the expression between the brackets in (22) is the Laplace transform of the first derivative of the Mittag-Leffler function in (23), i.e.
\[ L\left\{ \frac{d}{dt} E_{\alpha,1}(-\lambda t^\alpha) \right\} = L\left\{ E_{\alpha,1}^{(1)}(-\lambda t^\alpha) \right\} = p\,\frac{p^{\alpha-1}}{p^\alpha + \lambda} - 1. \]
From this, inverting (22), we obtain
\[ u(t) = f(t) + E_{\alpha,1}^{(1)}(-\lambda t^\alpha) * f(t). \]
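As a plausibility check of the last formula, take f(t) ≡ 1. The convolution of the derivative of the Mittag-Leffler function with 1 is E_{α,1}(−λt^α) − 1, so the formula gives u(t) = E_{α,1}(−λt^α). The sketch below substitutes this u into the left-hand side of (20) and evaluates the residual numerically. The values α = 1/2, λ = 1, t = 1, the helper names and the quadrature (the substitution s = (t − τ)^α followed by the midpoint rule) are assumptions made for illustration.

```python
import math


def mittag_leffler(alpha, x, terms=60):
    """Partial sum of E_{alpha,1}(x); adequate for moderate |x|."""
    total = 0.0
    for k in range(terms):
        arg = alpha * k + 1.0
        if arg > 170.0:   # stay below the overflow limit of math.gamma
            break
        total += x ** k / math.gamma(arg)
    return total


def fractional_integral(f, alpha, t, n=4000):
    """Midpoint rule for I^alpha f(t) after the substitution s = (t - tau)**alpha."""
    h = t ** alpha / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += f(max(0.0, t - s ** (1.0 / alpha)))
    return total * h / (alpha * math.gamma(alpha))


alpha, lam, t = 0.5, 1.0, 1.0


def u(tau):
    """Candidate solution of (20) for f(t) = 1."""
    return mittag_leffler(alpha, -lam * tau ** alpha)


residual = u(t) + lam * fractional_integral(u, alpha, t) - 1.0
print(u(t), residual)
```

The residual is small but not at machine precision, because u behaves like 1 − c·τ^α near τ = 0 and the midpoint rule converges more slowly for such integrands.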
Acknowledgement This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] CHEN, Y. Q. Fractional-order calculus in Signal processing and Control, CSOIS, ECE Dept. of Utah State University, 2003, 1-83.
[2] KOLWANKAR, K. M.; GANGAL, A. D. Fractional differentiability of nowhere differentiable functions and dimensions, Chaos, Vol. 6, No 4, 1996, Amer. Inst. of Physics.
[3] NISHIMOTO, K. An essence of Nishimoto's Fractional Calculus, Descartes Press Corp., 1991.
[4] OLDHAM, K. B.; SPANIER, J. The Fractional Calculus, Acad. Press, New York, 1974.
[5] PIRES, E. J. S. Fractional Order Dynamics in a GA planar, Signal Processing 83, 2003, 2377-2386.
[6] PODLUBNY, I. Fractional Differential Equations, Mathematics in Science and Engineering Vol. 198, Acad. Press, 1999.
[7] PODLUBNY, I. Fractional Differential Equations, Acad. Press, San Diego, 1999.
[8] RAYNAUD, H. F.; ZERGALNOH, A. State-space Representation for Fractional order controllers, Automatica 36, 2000, 1017-1021.
Address: RNDr. Vlasta Krupková, CSc. University of Technology Technická 8 616 00 Brno e-mail: [email protected] Address: Doc. RNDr. Zdeněk Šmarda, CSc. University of Technology Technická 8 616 00 Brno e-mail: [email protected]
ON THE STABILITY OF LINEAR INTEGRODIFFERENTIAL EQUATIONS Zdeněk Šmarda University of Technology, Brno

Abstract: The stability of solutions of linear integrodifferential equations is investigated with respect to the asymptotic behaviour of the corresponding base and regular systems of ordinary differential equations.
Key words: Stability solutions, integrodifferential equations, base and regular system.
1. Introduction

While studying integro-differential equations (IDE) we often meet two basic problems which have no analogy in the theory of ordinary differential equations:
• The system of IDE has singular points of the first and second order, i.e. points through which more than one solution passes or at which no solution exists at all, even in simple cases under the assumption of continuity (see [1,2]).
• An unknown function is under the integral sign, so the right-hand side of the IDE cannot be evaluated at a given point. This means that we are actually unable to determine the direction field of the IDE, although the direction field exists. Therefore known qualitative methods for the investigation of ordinary differential equations, e.g. Ważewski's topological method, cannot be applied to IDE (see [4,5,6,7]).
In this paper we investigate the stability of solutions of linear systems of IDE and of the corresponding base systems. We give examples of base systems and linear systems of IDE in which the integral terms change the asymptotic behaviour of the base systems. The asymptotic behaviour of linear systems of IDE relative to their base systems depends mainly on the kernels of the integral terms. Sufficient conditions are given under which a linear system of IDE is stable or asymptotically stable under the assumption that the base system is unstable.
2. Examples and basic notions

Consider the linear system of IDE
\[ u'(t) = A(t)\,u(t) + \int_0^t K(t,s)\,u(s)\,ds, \tag{1} \]
where $u^{(i)}(t) = (u_1^{(i)}(t), \dots, u_n^{(i)}(t))^T$, $i = 0, 1$, A(t) is an n × n matrix function, $A(t) \in C^1(J)$, $\det A(t) \ne 0$ for $t \in J$, K(t, s) is an n × n matrix function, $K(t,s) \in C^1(J \times J)$, $J = [0, \infty)$. The function K(t, s) will be called the kernel of (1). We also consider the base system of (1),
\[ u'(t) = A(t)\,u(t). \tag{2} \]
Let V(t), V(0) = E, be the fundamental matrix of (2). Then the Cauchy problem for (2) with u(0) = b, where $b = (b_1, \dots, b_n)^T$ is a constant vector, has the unique solution
\[ u(t) = V(t)\,b. \tag{3} \]
The zero solution of (2) will be called stable if for any constant vector c and $t \to \infty$ we have
\[ \|u(t)\| \le \|V(t)\|\,\|c\| \le M\,\|c\|, \qquad M \in \mathbb{R}_+. \]
If $\|u(t)\| \to 0$ as $t \to \infty$, then the zero solution of (2) will be called asymptotically stable. If $\|u(t)\| \to \infty$ as $t \to \infty$, then the zero solution of (2) will be called unstable. Here $\|\cdot\|$ is a usual norm in $\mathbb{R}^n$ or $\mathbb{R}^{2n}$.
The system (2) is linear. Thus, exactly one of the following cases always occurs: (I) All solutions of (2) are stable. (II) All solutions of (2) are asymptotically stable. (III) All solutions of (2) are unstable. Consequently, we will say that the system (2) is stable, asymptotically stable, or unstable. The same holds in the case of (1).

Example 1. Consider the system of IDE
\[ u'(t) = \begin{pmatrix} -\dfrac{1}{t+1} & 0 \\ 0 & -\dfrac{1}{t+1} \end{pmatrix} u(t) + \int_0^t \begin{pmatrix} 0 & \dfrac{\tau+1}{t+1} \\ 0 & 0 \end{pmatrix} u(\tau)\,d\tau \tag{3} \]
with the initial conditions $u_1(0) = b_1$, $u_2(0) = b_2$. The particular solution of (3) is
\[ u(t) = \begin{pmatrix} \dfrac{1}{t+1} & \dfrac{t^2}{2(t+1)} \\ 0 & \dfrac{1}{t+1} \end{pmatrix} b. \]
Thus, the system (3) is unstable. The corresponding base system has the form
\[ u_k'(t) = -\frac{1}{t+1}\,u_k(t), \qquad u_k(0) = b_k, \quad k = 1, 2, \tag{4} \]
with the particular solution
\[ u_k(t) = \frac{1}{t+1}\,b_k. \]
From here the system (4) is asymptotic stable so that the integral term in (3) changed the asymptotic stable system (4) in the unstable system (3). Example 2. Consider the system of IDE 1 + t t + 2) ( 1 )( u ' (t ) = 0
0 u (t ) + 1 t +1
0
t
∫ 0 0
τ +1
(t + 1) (t + 2) 2(τ + 1) u(τ )dτ − (t + 1) 3 4
3
(5)
with the initial conditions u1(0) = b1, u2(0) = b2. The particular solution of (5)
u1 ( t ) =
t +1 2(t + 1) 2 3 2 + − + b1 + b2 ln 2 t+2 (t + 1) 3 t + 2 t + 1 (t + 1) 1 u 2 (t ) = 2 − b2. t + 1
Hence the system (5) is only stable. The corresponding base system has the form
\[ u'(t) = \begin{pmatrix} \dfrac{1}{(t+1)(t+2)} & 0 \\ 0 & \dfrac{1}{t+1} \end{pmatrix} u(t) \tag{6} \]
with the particular solution
\[ u_1(t) = \frac{2(t+1)}{t+2}\,b_1, \qquad u_2(t) = (t+1)\,b_2. \]
Thus, the system (6) is unstable. Now it is obvious that the integral term in (5) changed the unstable system (6) into the stable system (5).
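Example 1 above can be cross-checked numerically. The sketch below assumes the following reading of system (3): A(t) = diag(−1/(t+1), −1/(t+1)) and a kernel whose only nonzero entry is K_12(t, τ) = (τ+1)/(t+1). Under this assumption the integral term equals w(t)/(t+1) with w(t) = ∫ (τ+1) u_2(τ) dτ over [0, t], so the IDE becomes an ordinary system for (u_1, u_2, w) that a classical Runge-Kutta scheme can integrate. The step count and the end time are arbitrary choices.

```python
def rhs(t, y):
    """Right-hand side of the augmented ODE system for (u1, u2, w)."""
    u1, u2, w = y
    return (-u1 / (t + 1) + w / (t + 1), -u2 / (t + 1), (t + 1) * u2)


def rk4(y, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    t, h = 0.0, t_end / steps
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    return y


def closed_form(t, b1, b2):
    """Particular solution of Example 1 for the initial vector (b1, b2)."""
    return (b1 / (t + 1) + b2 * t * t / (2 * (t + 1)), b2 / (t + 1))


b1, b2, t_end = 1.0, 1.0, 10.0
u1_num, u2_num, _ = rk4((b1, b2, 0.0), t_end, 10000)
u1_ref, u2_ref = closed_form(t_end, b1, b2)
print(u1_num, u1_ref, u2_num, u2_ref)
```

The first component grows like t/2 whenever b_2 ≠ 0, while both components of the base system (4) decay like 1/(t+1), which is the change of behaviour the example is meant to show.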
3. Main results

Let $R(t,s) \in C^1(J \times J)$ be the resolvent of the matrix kernel $-A^{-1}(t)K(t,s)$. Put
\[ u(t) = z(t) + \int_0^t R(t,s)\,z(s)\,ds. \tag{7} \]
Substituting (7) into (1) we obtain
\[ z'(t) = B(t)\,z(t) - \int_0^t R_t'(t,s)\,z(s)\,ds, \tag{8} \]
where $B(t) = A(t) + A^{-1}(t)K(t,t)$. Consider the system
\[ z'(t) = B(t)\,z(t), \tag{9} \]
which will be called the regular system with respect to (1), and let W(t) be the fundamental matrix of (9).

Theorem. Let the following assumptions hold:
(i) The base system (2) is unstable.
(ii) The regular system (9) is either stable or asymptotically stable.
(iii)
\[ \int_0^\infty \int_0^t \left\| W^{-1}(s)\,R_t'(t,s)\,W(s) \right\| ds\,dt < N < \infty. \]
(iv)
\[ F(t) = \int_0^t \left\| R(t,s)\,W(s) \right\| ds < L < \infty, \]
and $F(t) \to 0$ as $t \to \infty$ in the case of asymptotic stability.
Then the system (1) is either stable or asymptotically stable.

Proof. The resolvent R(t, s) of (7) satisfies the equation (see [3])
\[ R(t,s) = Q(t,s) + \int_0^t Q(t,\mu)\,R(\mu,s)\,d\mu, \tag{10} \]
where $Q(t,s) = -A^{-1}(t)K(t,s)$. From (8) we get
\[ z(t) = W(t)\,b - \int_0^t W(t)\,W^{-1}(s) \int_0^s R_t'(s,\mu)\,z(\mu)\,d\mu\,ds. \tag{11} \]
Multiplying both sides of (11) by $W^{-1}(t)$ and putting $\alpha(t) \equiv W^{-1}(t)\,z(t)$ we obtain
\[ \|\alpha(t)\| \le \|b\|\,N. \]
Thus
\[ \|z(t)\| \le \|W(t)\|\,\|b\|\,N. \]
Now, from (7) it follows that
\[ \|u(t)\| \le \left\| W(t)\,W^{-1}(t)\,z(t) + \int_0^t R(t,s)\,W(s)\,W^{-1}(s)\,z(s)\,ds \right\| \le \|W(t)\|\,\|W^{-1}(t)\,z(t)\| + \int_0^t \|R(t,s)\,W(s)\|\,\|W^{-1}(s)\,z(s)\|\,ds \]
\[ \le \left( \|W(t)\| + \int_0^t \|R(t,s)\,W(s)\|\,ds \right) N\,\|b\| \le \left( \|W(t)\| + L \right) N\,\|b\|. \tag{12} \]
From here it is obvious that the system (1) is stable . In the case F(t) —> 0 and ||W(t)|| —> 0 as t → ∞ , we obtain from (12) that the system (1) is asymptotic stable. The proof is complete. Results of this Theorem we apply to the system of IDE in the example 2, i.e. τ +1 1 0 0 4 3 t (t + 1) (t + 2) ( t + 1 )( t + 2 ) u (t ) + u (τ )dr u ' (t ) = 2(τ + 1) 1 0 − 0 0 t +1 (t + 1) 3 with the initial conditions u1 (0) = b1 , u 2 (0) = b2 . We already know that the base system
∫
1 0 u (t ) u (t ) = (t + 1)(t + 2) 1 0 t +1 is unstable. The matrix B(t) of the regular system z'(i) = B(t)z(t) has the form 1 1 1 B (t ) = A(t ) + A −1 (t )K (t , t ) = t + 2 ( t + 1 )( t + 2 ) t +1 0 −1 and the resolvent 1 1 0 − R (t , τ ) = (t + 1)(t + 2) τ + 1 0 −2 The fundamental matrix of the regular system 2(t + 1) t +1 1 1 − 3(t + 2) (t + 1)3 . W (t ) = t + 2 1 0 t +1 ´
Thus, the regular system is stable. Remain to verify the conditions (iii),(iv). In this case there is valid 2 1 0 − R (t , τ )W (τ ) = ( t + 1 )( t + 2 ) (τ + 1)2 0 2
∫ R(t ,τ )W (τ ) dτ = 6 − ln 2 = N 〈 ∞ t
5
0
W −1 (t )Rt´ (t , τ )W (τ ) = ∞ t
∫∫ 0
0
t+2
2 2(t + 1)(τ + 1)
1 1 − (t + 1)2 (t + 2)2
0 1 . 0 0
W −1 (t )Rt´ (t , τ )W (τ ) dτdt 〈 3 = L 〈 ∞
Now it is obvious that all assumptions of the Theorem are fulfilled, and without computing the particular solution of the system of IDE (5) we conclude that this system is also stable. The example is complete.
Acknowledgement This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] BYKOV, J. V. Theory of integrodifferential equations, Kirg. Univ. Frunze, 1957 (in Russian).
[2] IMANALIEV, M. Oscillation and stability of solutions of singular-perturbation integro differential equation, Akad. nauk, ILIM Frunze, 1974 (in Russian).
[3] ŠKRÁŠEK, J. Základy aplikované matematiky III., Praha: SNTL, 1993.
[4] ŠMARDA, Z. On some particularities of integro-differential equations, Proceedings of APLIMAT 2003, 253-257.
[5] ŠMARDA, Z. On solutions of an implicit singular system of integro differential equations depending on a parameter, Demonstratio Mathematica, Vol. XXXI, No 1 (1998), 125-130.
[6] ŠMARDA, Z. On an initial value problem for singular integro-differential equations, Demonstratio Mathematica, Vol. XXXV, No 4 (2002), 803-811.
[7] YANG, G. Minimal positive solutions to some singular second-order differential equations, J. Math. Anal. Appl. 266 (2002), 479-491.
Address: Doc. RNDr. Zdeněk Šmarda, CSc. University of Technology Technická 8 616 00 Brno e-mail: [email protected]
EXISTENCE OF POSITIVE SOLUTIONS FOR RETARDED FUNCTIONAL DIFFERENTIAL EQUATIONS WITH UNBOUNDED DELAY AND FINITE MEMORY Josef Diblík, Zdeněk Svoboda University of Technology, Brno
Abstract: For systems of retarded functional differential equations with unbounded delay and with finite memory, sufficient and necessary conditions for the existence of positive solutions on an interval of the form $[t^*, \infty)$ are derived. A general criterion is given together with corresponding applications (including the linear case). Examples are inserted to illustrate the results.
Key words and phrases: Positive solution, delayed equation, p- function.
1 Introduction

One of the basic problems in control theory is to find a control of a process such that the parameters of the process stay in a required region. The simplest type of such a region is often described by inequalities. Continuous processes are usually described by differential equations. In this paper a criterion is given for the existence of positive solutions (i.e. solutions with positive coordinates on a considered interval) for systems of retarded functional differential equations (RFDE's) with unbounded delay and with finite memory. First let us briefly explain the terms emphasized above and recall the basic notions of RFDE's with unbounded delay but with finite memory. A function $p \in C[\mathbb{R} \times [-1,0], \mathbb{R}]$ is called a p-function if it has the following properties [12, p. 8]:
(i) $p(t, 0) = t$.
(ii) $p(t, -1)$ is a nondecreasing function of t.
(iii) there exists a $\sigma \ge -\infty$ such that $p(t, \vartheta)$ is an increasing function of $\vartheta$ for each $t \in (\sigma, \infty)$. (Throughout the following text we suppose $t \in (\sigma, \infty)$.)
In the theory of RFDE's the symbol $y_t$, which expresses "taking into account" the history of the process y(t) considered, is used. With the aid of p-functions the symbol $y_t$ is defined as follows:

Definition 1 ([12, p. 8]) Let $t_0 \in \mathbb{R}$, $A > 0$ and $y \in C([p(t_0, -1), t_0 + A), \mathbb{R}^n)$. For any $t \in [t_0, t_0 + A)$, we define
\[ y_t(\vartheta) := y(p(t, \vartheta)), \qquad -1 \le \vartheta \le 0, \]
and write $y_t \in C := C[[-1, 0], \mathbb{R}^n]$.

Note that the symbol "$y_t$" frequently used in the theory of delayed functional differential equations with bounded delays (e.g., $y_t(s) := y(t+s)$, where $-\tau \le s \le 0$, $\tau > 0$, $\tau = \text{const}$) is a particular case of the above definition. Indeed, in this case we can put $p(t, \vartheta) := t + \tau\vartheta$, $\vartheta \in [-1, 0]$. In this paper we investigate the existence of positive solutions of the system
\[ \dot y(t) = f(t, y_t), \tag{1} \]
where $f \in C([t_0, t_0 + A) \times C, \mathbb{R}^n)$, $A > 0$, and $y_t$ is defined in accordance with Definition 1. This system is called the system of p-type retarded functional differential equations (p-RFDE's) or a system with unbounded delay with finite memory.

Definition 2 The function $y \in C([p(t_0, -1), t_0 + A), \mathbb{R}^n) \cap C^1([t_0, t_0 + A), \mathbb{R}^n)$ satisfying (1) on $[t_0, t_0 + A)$ is called a solution of (1) on $[p(t_0, -1), t_0 + A)$.

Suppose that $\Omega$ is an open subset of $\mathbb{R} \times C$ and the function $f: \Omega \to \mathbb{R}^n$ is continuous. If $(t_0, \varphi) \in \Omega$, then there
exists a solution $y = y(t_0, \varphi)$ of the system of p-RFDE's (1) through $(t_0, \varphi)$ (see [12, p. 25]). Moreover, this solution is unique if $f(t, \varphi)$ is locally Lipschitzian with respect to the second argument $\varphi$ ([12, p. 30]) and is continuable in the usual sense of extended existence if f is quasibounded ([12, p. 41]). Suppose that the solution $y = y(t_0, \varphi)$ of p-RFDE's (1) through $(t_0, \varphi) \in \Omega$, defined on $[t_0, A]$, is unique. Then the property of continuous dependence holds too (see [12, p. 33]), i.e. for every $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$ such that $(s, \psi) \in \Omega$, $|s - t_0| < \delta$ and $\|\psi - \varphi\| < \delta$ imply $\|y_t(s, \psi) - y_t(t_0, \varphi)\| < \varepsilon$ for all $t \in [\zeta, A]$, where $y(s, \psi)$ is the solution of the system of p-RFDE's (1) through $(s, \psi)$, $\zeta = \max\{s, t_0\}$ and $\|\cdot\|$ is the supremum norm in $\mathbb{R}^n$. Note that these results can be adapted easily for the case (which will be used in the sequel) when $\Omega$ has the form $\Omega = [t^*, \infty) \times C$ where $t^* \in \mathbb{R}$.

1.1 Problem of existence of positive solutions

In this paper we are concerned with the problem of existence of positive solutions (i.e. the problem of existence of solutions having all their coordinates positive on the considered intervals) for nonlinear systems of RFDE's with unbounded delay but with finite memory. Let us cite some known results for retarded functional differential equations. For the scalar equation
\[ \dot x(t) + p(t)\,x(t - \tau(t)) = 0 \tag{2} \]
with $p, \tau \in C([t_0, \infty), \mathbb{R}_+)$, $\tau(t) \le t$, $\lim_{t\to\infty} (t - \tau(t)) = \infty$ and $\mathbb{R}_+ = [0, \infty)$, a criterion for the existence of a positive solution
is given in the book [10]. Namely, (2) has a positive solution with respect to t1 if and only if there exists a continuous function λ(t) on [T1, ∞) with T1 = inf_{t ≥ t1} {t − τ(t)}, such that λ(t) > 0 for t ≥ t1 and
$$\lambda(t) \ge p(t)\, e^{\int_{t-\tau(t)}^{t} \lambda(s)\,ds}, \qquad t \ge t_1. \tag{3}$$
(A function x is called a solution of (2) with respect to an initial point t1 ≥ t0 if x is defined and continuous on [T1, ∞), differentiable on [t1, ∞), and satisfies (2) for t ≥ t1.) Results in this direction are also formulated in the book [11] and in the papers [1, 2]. Positive solutions of (2) in the critical case were studied, e.g., in [4]-[10]. The cited criterion was generalized to nonlinear systems of RFDE's with bounded retardation in [3] and to nonlinear systems of RFDE's with unbounded delay and with finite memory in [6]. These generalizations are in a sense "direct" generalizations, since their formulations suppose the existence of a positive (vector) function playing a role similar to that of λ in (3).
2 Sufficient conditions
Let a constant vector k >> 0 and a vector λ(t), defined and locally integrable on [p*, ∞), be given. Then the operator T is well defined by
$$T(k,\lambda)(t) := k\,e^{\int_{p^*}^{t}\lambda(s)\,ds} = \left(k_1 e^{\int_{p^*}^{t}\lambda_1(s)\,ds},\ \dots,\ k_n e^{\int_{p^*}^{t}\lambda_n(s)\,ds}\right).$$
Define for every i ∈ {1, 2, ..., n} two types of subsets of the set C:
$$\tau^{i} := \left\{\varphi \in C : 0 \ll \varphi(\vartheta) \ll T(k,\lambda)_t(\vartheta),\ \vartheta \in [-1,0],\ \text{except for } \varphi_i(0) = k_i e^{\int_{p^*}^{t}\lambda_i(s)\,ds}\right\}$$
and
$$\tau_{i} := \left\{\varphi \in C : 0 \ll \varphi(\vartheta) \ll T(k,\lambda)_t(\vartheta),\ \vartheta \in [-1,0],\ \text{except for } \varphi_i(0) = 0\right\}.$$
Theorem 1 Suppose f ∈ C(Ω, R^n) is locally Lipschitzian with respect to the second argument and quasibounded. Let a constant vector k >> 0 and a vector λ(t), defined and locally integrable on [p*, ∞), be given. If, moreover, the inequalities
$$\mu_i \lambda_i(t) > \frac{\mu_i}{k_i}\, e^{-\int_{p^*}^{t}\lambda_i(s)\,ds} \cdot f_i(t,\varphi) \tag{4}$$
hold for every i ∈ {1, 2, ..., n}, (t, φ) ∈ [t*, ∞) × τ^i, and the inequalities
$$\mu_i f_i(t,\varphi) > 0 \tag{5}$$
hold for every i ∈ {1, 2, ..., n}, (t, φ) ∈ [t*, ∞) × τ_i, where µ_i = −1 for i = 1, ..., p and µ_i = 1 for i = p+1, ..., n, then there exists a positive solution y = y(t) on [p*, ∞) of the system p-RFDE's (1).
The proofs of this and the following theorems are based on the retract method and on the Lyapunov method. Analogous considerations can be found in [6].
3 Scalar linear application
Let us consider the scalar linear equation with delay
$$\dot{y}(t) = -\int_{\tau(t)}^{t} K(t,s)\,y(s)\,ds \tag{6}$$
where K : [t*, ∞) × [p*, ∞) → R+ is a continuous function, and τ : [t*, ∞) → [p*, ∞) is a nondecreasing function with τ(t) < t.
Theorem 2 The equation (6) has a positive solution y = y(t) on [p*, ∞) if and only if there exists a function λ ∈ C([p*, ∞), R) such that λ(t) > 0 for t ≥ t* and
$$\lambda(t) \ge \int_{\tau(t)}^{t} K(t,s)\, e^{\int_{s}^{t}\lambda(u)\,du}\,ds \tag{7}$$
on the interval [t*, ∞).
Inequality (7) can be used for finding sufficient conditions for the existence of a positive solution of Eq. (6). Let us give two of them.
In the case when τ(t) ≡ p* < t* and K(t, s) ≡ c(t) for every t ∈ [t*, ∞), Eq. (6) takes the form
$$\dot{y}(t) = -c(t)\int_{p^*}^{t} y(s)\,ds \tag{8}$$
Theorem 3 For the existence of a solution of Eq. (8), positive on [p*, ∞), the inequality
$$c(t) \le \frac{\delta^2}{e^{\delta (t - p^*)} - 1}, \qquad t \in [t^*, \infty), \tag{9}$$
with a positive constant δ is a sufficient condition.
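The source does not give the proof of Theorem 3. One plausible way to see where the bound (9) comes from, assuming that the constant test function λ(t) ≡ δ > 0 is substituted into inequality (7) with K(t, s) ≡ c(t) and τ(t) ≡ p* (this choice of λ is an assumption made here for illustration), is the following short computation:

```latex
% Sketch: substitute the constant function \lambda(t) \equiv \delta into (7).
\int_{p^*}^{t} c(t)\, e^{\int_{s}^{t} \delta \, du}\, ds
  = c(t) \int_{p^*}^{t} e^{\delta (t - s)}\, ds
  = c(t)\, \frac{e^{\delta (t - p^*)} - 1}{\delta} .
% Hence (7) with \lambda \equiv \delta reads
\delta \ \ge\ c(t)\, \frac{e^{\delta (t - p^*)} - 1}{\delta}
\quad\Longleftrightarrow\quad
c(t) \ \le\ \frac{\delta^{2}}{e^{\delta (t - p^*)} - 1},
\qquad t \in [t^*, \infty).
```

Under this assumption, (9) is exactly inequality (7) evaluated at λ ≡ δ, so Theorem 2 would yield the positive solution.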
In the case when τ(t) ≡ t − l, l ∈ R+, and K(t, s) ≡ c(t) for every t ∈ [t*, ∞), Eq. (6) takes the form
$$\dot{y}(t) = -c(t)\int_{t-l}^{t} y(s)\,ds \tag{10}$$
Theorem 4 For the existence of a solution of Eq. (10), positive on [t* − l, ∞), the inequality
$$c(t) \le M, \qquad t \in [t^*, \infty), \tag{11}$$
is sufficient for M = α(2 − α)/l² = const, with the constant α being the positive root of the equation 2 − α = 2e^{−α}. (The approximate values are α = 1.5936 and M = 0.6476/l².)
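The constants quoted in Theorem 4 can be reproduced numerically. The following is a minimal sketch, not the authors' computation: it finds the positive root α of 2 − α = 2e^{−α} by bisection and evaluates M = α(2 − α)/l². The function names `positive_root` and `bound_M` and the bracketing interval (1, 2) are choices made here for illustration.

```python
import math


def positive_root(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection for the positive root of g(a) = 2 - a - 2*exp(-a).

    a = 0 is a trivial root; the bracket (1, 2) isolates the positive one.
    """
    def g(a):
        return 2.0 - a - 2.0 * math.exp(-a)

    # g changes sign on the bracket, so a root lies inside it.
    if not (g(lo) > 0.0 > g(hi)):
        raise ValueError("bracket does not enclose the positive root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bound_M(l=1.0):
    """Constant M = alpha * (2 - alpha) / l**2 from Theorem 4."""
    alpha = positive_root()
    return alpha * (2.0 - alpha) / l ** 2


if __name__ == "__main__":
    print("alpha =", round(positive_root(), 4))
    print("M (l = 1) =", round(bound_M(1.0), 4))
```

For l = 1 the printed values can be compared with the approximations stated in the theorem; for other delays the bound scales as 1/l².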
Acknowledgment This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] BEREZANSKI, L.; BRAVERMAN, E. On oscillation of a logistic equation with several delays. J. Comput. Appl. Mathem. 113 (2000), 255-265.
[2] ČERMÁK, J. A change of variables in the asymptotic theory of differential equations with unbounded delay. J. Comput. Appl. Mathem. 143 (2002), 81-93.
[3] DIBLÍK, J. A criterion for existence of positive solutions of systems of retarded functional differential equations. Nonl. Anal., TMA 38 (1999), 327-339.
[4] DIBLÍK, J. Positive and oscillating solutions of differential equations with delay in critical case. J. Comput. Appl. Mathem. 88 (1998), 185-202.
[5] DIBLÍK, J.; KOKSCH, N. Positive solutions of the equation ẋ(t) = −c(t)x(t − τ) in the critical case. J. Math. Anal. Appl. 250 (2000), 635-659.
[6] DIBLÍK, J.; SVOBODA, Z. An existence criterion of positive solutions of p-type retarded functional differential equations. J. Comput. Appl. Mathem. 147 (2002), 315-331.
[7] DOMSHLAK, Y. On oscillation properties of delay differential equations with oscillating coefficients. Funct. Diff. Equat., Israel Seminar 2 (1996), 59-68.
[8] DOMSHLAK, Y.; STAVROULAKIS, I. P. Oscillation of first-order delay differential equations in a critical case. Appl. Anal. 61 (1996), 359-371.
[9] ELBERT, Á.; STAVROULAKIS, I. P. Oscillation and non-oscillation criteria for delay differential equations. Proc. Amer. Math. Soc. 123 (1995), 1503-1510.
[10] ERBE, L. H.; KONG, Q.; ZHANG, B. G. Oscillation Theory for Functional Differential Equations. New York : Marcel Dekker, 1995.
[11] GYÖRI, I.; LADAS, G. Oscillation Theory of Delay Differential Equations. Oxford : Clarendon Press, 1991.
[12] LAKSHMIKANTHAM, V.; WEN, L.; ZHANG, B. Theory of Differential Equations with Unbounded Delay. Kluwer Academic Publishers, 1994.
[13] HALE, J. K.; LUNEL, S. M. V. Introduction to Functional Differential Equations. New York : Springer-Verlag, Inc., 1993.
Address: Doc. RNDr. Josef Diblík, DrSc. Department of Mathematics Faculty of Electrical Engineering and Communication Brno University of Technology Technická 8, 616 00 Brno, [email protected]
Address: RNDr. Zdeněk Svoboda, CSc. Department of Mathematics Faculty of Electrical Engineering and Communication Brno University of Technology Technická 8, 616 00 Brno, [email protected]
APPLICATION OF NON SIMPLEX METHOD FOR LINEAR PROGRAMMING Marie Tomšová University of Technology Brno
Abstract: This paper shows a method for the solution of optimization problems which is different from the commonly used simplex method. The simplex method is considered one of the basic models from which many linear programming techniques are directly and indirectly derived. The simplex method is an iterative process which approaches an optimum solution step by step until the maximum or minimum of the objective function is reached. Each iteration in this process consists of shortening the distance (mathematically and also graphically) from the objective function to the intercepted vertex of a convex set determined by the inequalities which describe the problem. The simplex method is not the only technique known and used for solving linear programming problems. For pedagogical purposes other methods are also useful; see for example R. Dorfman, P. A. Samuelson, and R. M. Solow, Linear Programming and Economic Analysis, New York: McGraw-Hill Book Comp. Inc., 1958. I introduce a method other than the simplex method. This method is based on the principle of the graphical method of optimization of linear problems for two variables, but it is generalized to n variables and an arbitrary finite number of inequalities describing the problem.
Key words. Linear programming, system of inequalities, disposal and slack variable, dummy variable, objective function, iteration, key row, key column, key element, polyhedron, hyper-plane.
1. The leading article
Remark
I reported on this problem at the fourth Mathematical Workshop in Brno in 2005. This contribution aims at spreading knowledge of the described method. I prepared a programme for solving concrete problems. This programme is given as an appendix.
Introduction
The general problem of linear programming is usually formulated as follows: Let a_ij, b_i, c_j (i = 1, 2, ..., m; j = 1, 2, ..., n) be given real numbers and let us denote I1 ⊂ I = {1, 2, ..., m} and J1 ⊂ J = {1, 2, ..., n}. The problem of maximizing the function
$$\sum_{j=1}^{n} c_j x_j \tag{1}$$
on the set of
$$\sum_{j=1}^{n} a_{ij} x_j \le b_i \quad (i \in I_1) \tag{2}$$
$$\sum_{j=1}^{n} a_{ij} x_j = b_i \quad (i \in I - I_1) \tag{3}$$
$$x_j \ge 0 \quad (j \in J_1) \tag{4}$$
is called the maximizing problem of linear programming in mixed form if I1 ≠ ∅, I1 ≠ I, or J1 ≠ J.
The problem of linear programming given by (1) to (4), where I1 = I and J1 = J, that is, the problem of maximizing the function
$$\sum_{j=1}^{n} c_j x_j \tag{5}$$
on the set given by a linearly independent system of linear inequalities
$$\sum_{j=1}^{n} a_{ij} x_j \le b_i \quad (i \in I_1) \tag{6}$$
$$x_j \ge 0 \quad (j \in J_1) \tag{7}$$
is called the maximizing problem of linear programming in the form of inequalities. For an arbitrary set M ⊂ R^n, where R^n is the n-dimensional vector space, and for an arbitrary linear function z : M → R, the relation
$$\min_{x \in M} z(x) = -\max_{x \in M}\,(-z(x))$$
holds whenever one of the extremes exists. Hence we can also transform a minimizing problem into a maximizing problem with linear equations or linear inequalities. We do the rearrangement by multiplying the objective function by the number −1.
2. The solution of the general problem
In the following considerations we drop the condition (4) and hence also (7). We rewrite the system (6) and add the objective function as the last row, in the form:
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n + b_1 &\ge 0\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n + b_2 &\ge 0\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n + b_m &\ge 0\\
c_1x_1 + c_2x_2 + \dots + c_nx_n &= 0
\end{aligned} \tag{8}$$
We call the set of points x = (x1, x2, ..., xn) ∈ R^n satisfying the inequalities of (8) a polyhedron. We call the polyhedron open if m ≤ n. We know that the objective function z(x), x ∈ R^n, attains its optimal values at vertices or on whole faces of the polyhedron. In the first part of the computational procedure we find one of the vertices of the polyhedron and simultaneously transform it into the coordinate origin. At the same time we transform all hyperplanes of the polyhedron and the objective function with respect to the given transformation. We continue as follows: We select an arbitrary hyperplane and denote it by the index i ∈ {1, 2, ..., m}. We divide the whole column of coefficients of the variable x1 by the coefficient a_i1 (assumed to be nonzero) and simultaneously put
$$x_1 = x_1' - a_{i2}x_2 - a_{i3}x_3 - \dots - a_{in}x_n - b_i \tag{9}$$
Here x1 on the left-hand side stands for the variable already rescaled by the column division; in terms of the original variable the combined substitution is x1 = (x1′ − a_i2 x2 − ... − a_in xn − b_i)/a_i1.
We substitute the transformation relation (9) into the system (8); for m > 1 the mathematical representation of the problem is transformed into the following form. Every row k ∈ {1, ..., i − 1, i + 1, ..., m} becomes
$$\frac{a_{k1}}{a_{i1}}\,x_1' + \left(a_{k2} - \frac{a_{k1}a_{i2}}{a_{i1}}\right)x_2 + \left(a_{k3} - \frac{a_{k1}a_{i3}}{a_{i1}}\right)x_3 + \dots + \left(a_{kn} - \frac{a_{k1}a_{in}}{a_{i1}}\right)x_n + b_k - b_i\,\frac{a_{k1}}{a_{i1}} \ge 0,$$
while the i-th row becomes
$$x_1' + 0 + 0 + \dots + 0 + 0 \ge 0.$$
In the following step we choose an arbitrary row in which the coefficient of the variable x2 is different from zero. The existence of such a row follows from the assumption that the m rows are linearly independent. In what follows we suppose that this assumption is satisfied by the row s, s ≤ m. It is obvious that s ≠ i. We continue by dividing the whole second column by the expression
$$a_{s2} - \frac{a_{s1}a_{i2}}{a_{i1}}$$
and then we introduce the following transformation:
$$x_2 = -\frac{a_{s1}}{a_{i1}}\,x_1' + x_2' - \left(a_{s3} - \frac{a_{s1}a_{i3}}{a_{i1}}\right)x_3 - \dots - \left(a_{sn} - \frac{a_{s1}a_{in}}{a_{i1}}\right)x_n - b_s + b_i\,\frac{a_{s1}}{a_{i1}}$$
After this transformation the s-th row will be of the form:
$$0 + x_2' + 0 + \dots + 0 + 0 \ge 0$$
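The two substitution steps above follow one pattern: a chosen row becomes a new coordinate, and every other row is updated column-wise. A minimal sketch of one such step is given below. It is an illustration under assumptions, not the author's program: each row is stored as [a_k1, ..., a_kn, b_k] and read as a_k1 x1 + ... + a_kn xn + b_k ≥ 0, and the helper name `substitute` is hypothetical.

```python
from fractions import Fraction


def substitute(rows, i, j):
    """Turn row i into the new coordinate x_j' (0-based indices).

    Column j of every row k is divided by the pivot a_ij, and every other
    column l (including the constant column) becomes a_kl - a_kj * a_il / a_ij.
    Exact rational arithmetic avoids rounding noise.
    """
    pivot = Fraction(rows[i][j])
    if pivot == 0:
        raise ValueError("pivot coefficient must be nonzero")
    pivot_row = [Fraction(v) for v in rows[i]]
    result = []
    for row in rows:
        factor = Fraction(row[j]) / pivot
        new_row = [Fraction(v) - factor * p for v, p in zip(row, pivot_row)]
        new_row[j] = factor  # coefficient of the new variable x_j'
        result.append(new_row)
    return result


if __name__ == "__main__":
    # Illustrative data: 5 - x >= 0, 12 - x - 2y >= 0, objective 3x + 5y.
    system = [[-1, 0, 5], [-1, -2, 12], [3, 5, 0]]
    for row in substitute(system, 0, 0):
        print([str(v) for v in row])
```

In this small illustration the first row is chosen as the new coordinate x′ = 5 − x, so the point x′ = 0 lies on the hyperplane x = 5; applying the same function again with another row and column corresponds to the second step described in the text.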
If m < n, we continue by analogy until all m transformations have been carried out. In this way we calculate one point of an edge of the polyhedron, which is transformed into the coordinate origin. If m ≥ n, we find after n transformations one vertex of the polyhedron, which is transformed into the coordinate origin. The whole calculation is done on a computer; therefore we work only with the matrix of coefficients of the polyhedron. We apply all the steps of the transformation to the objective function c1 x1 + c2 x2 + ... + cn xn + 0 ≥ 0 as well, and after the first transformation we obtain
$$\frac{c_1}{a_{i1}}\,x_1' + \left(c_2 - \frac{c_1a_{i2}}{a_{i1}}\right)x_2 + \left(c_3 - \frac{c_1a_{i3}}{a_{i1}}\right)x_3 + \dots + \left(c_n - \frac{c_1a_{in}}{a_{i1}}\right)x_n + 0 - b_i\,\frac{c_1}{a_{i1}} \ge 0$$
Further adaptations of the coefficients of the objective function run simultaneously with the adaptations of the coefficients of the polyhedron, as given in the previous description of the algorithm applied to the polyhedron. As the next step we extend the matrix of coefficients describing the system (8) with the objective function, which is of the type (m + 1) × (n + 1), by adding a matrix of the type n × (n + 1) which consists of the unit matrix of the type n × n with an added column vector of zeros of length n. This step is necessary for the explicit expression of the point of an edge, or of the vertex, of the polyhedron and of the optimal value of the objective function. All the affine transformations described above are applied to this expanded matrix. After carrying out the described transformation algorithm we obtain, in the last column of the matrix of the type n × (n + 1), the original coordinates of the point of the edge or of the vertex of the polyhedron which is transformed into the coordinate origin. We show the expanded matrix of coefficients of the type (m + 1 + n) × (n + 1) before the transformation algorithm:
$$\begin{pmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n} & b_1\\
a_{21} & a_{22} & a_{23} & \dots & a_{2n} & b_2\\
\vdots & \vdots & \vdots & & \vdots & \vdots\\
a_{m1} & a_{m2} & a_{m3} & \dots & a_{mn} & b_m\\
c_1 & c_2 & c_3 & \dots & c_n & 0\\
1 & 0 & 0 & \dots & 0 & 0\\
0 & 1 & 0 & \dots & 0 & 0\\
\vdots & \vdots & \vdots & & \vdots & \vdots\\
0 & 0 & 0 & \dots & 1 & 0
\end{pmatrix}$$
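The layout of the expanded matrix can be made concrete with a short sketch. The function name `expanded_matrix` and the list-of-lists representation are assumptions made here for illustration; the source only specifies the block structure (constraint rows, objective row, unit matrix with a zero column).

```python
def expanded_matrix(a, b, c):
    """Build the (m + 1 + n) x (n + 1) matrix described in the text.

    a: m rows of n constraint coefficients, b: m constants,
    c: n objective coefficients.
    """
    m, n = len(a), len(c)
    if len(b) != m or any(len(row) != n for row in a):
        raise ValueError("inconsistent dimensions")
    rows = [list(a[k]) + [b[k]] for k in range(m)]  # constraint rows
    rows.append(list(c) + [0])                      # objective row
    for j in range(n):                              # unit matrix + zero column
        rows.append([1 if col == j else 0 for col in range(n)] + [0])
    return rows


if __name__ == "__main__":
    # Illustrative data: 5 - x >= 0, 12 - x - 2y >= 0, objective 3x + 5y.
    for row in expanded_matrix([[-1, 0], [-1, -2]], [5, 12], [3, 5]):
        print(row)
```

Because the unit block is carried through the same column operations as the constraint rows, its last column records how the original coordinates are expressed at the new origin, which is why the text reads the vertex coordinates from there.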
3. Optimizing and decision making process
We suppose that m > n and that the transformation algorithm has transformed one of the vertices of the polyhedron into the coordinate origin. If after this transformation all the coefficients in the first m rows of the last, (n + 1)-th, column are nonnegative numbers and simultaneously all transformed coefficients c′_k of the objective function in the (m + 1)-th row are negative, the maximizing process of the objective function z(x) is finished. The last n rows of the (n + 1)-th column contain the original coordinates of the vertex of the polyhedron at which the objective function attains its maximum, and the value of this maximum is at the position [m + 1, n + 1] of the transformed matrix. If this situation does not occur, it is necessary to do the following analysis. We suppose that the coefficients in the first m rows of the (n + 1)-th column are again nonnegative, but some transformed coefficient of the objective function in the (m + 1)-th row is positive. Let this occur in the j0-th column. We look at all transformed coefficients in the j0-th column. If all transformed coefficients a′_{i j0} of the polyhedron are nonnegative, then the problem has no solution: it is possible to move along this edge of the polyhedron to infinity. The polyhedron is not bounded and the solution does not exist.
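The decision rules of this section can be sketched as a small function that inspects a transformed matrix. This is an illustration only: the function name `classify` and the returned labels are hypothetical, the test uses "nonpositive" objective coefficients (the stopping rule quoted in the example section) rather than the strictly negative case stated above, and the step to take when neither rule applies is not described in the text, so the sketch merely reports it.

```python
def classify(matrix, m, n):
    """Inspect a transformed matrix whose current vertex is the origin.

    Rows 0..m-1 are constraints [a'_k1, ..., a'_kn, b'_k] read as
    a'_k . x' + b'_k >= 0; row m is the objective row. Any further
    rows (the unit block) are ignored here.
    """
    if any(matrix[k][n] < 0 for k in range(m)):
        return "origin is not a feasible vertex"
    objective = matrix[m][:n]
    if all(cj <= 0 for cj in objective):
        return "optimal"        # maximum value sits at matrix[m][n]
    for j0, cj in enumerate(objective):
        # A positive objective coefficient with no restricting row:
        # x'_j0 can grow without bound.
        if cj > 0 and all(matrix[k][j0] >= 0 for k in range(m)):
            return "unbounded"
    return "continue"           # next step is not specified in the text


if __name__ == "__main__":
    # Example of Section 4 expressed at the vertex C = [2, 5], with new
    # coordinates u = 12 - x - 2y and v = 19 - 2x - 3y (worked out by hand).
    at_c = [
        [3, -2, 2],    # x  = 2 + 3u - 2v >= 0
        [-2, 1, 5],    # y  = 5 - 2u + v  >= 0
        [-3, 2, 3],    # 5 - x            >= 0
        [1, 0, 0],     # u                >= 0
        [0, 1, 0],     # v                >= 0
        [-1, -1, 31],  # z = 31 - u - v
    ]
    print(classify(at_c, 5, 2), at_c[5][2])
```

The sample matrix is the example of Section 4 rewritten by hand in the coordinates attached to the vertex C; both objective coefficients are negative there, which is the stopping situation described above.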
4. Example
4.1 Remark
To explain the given method I drafted a program whose address on the server Pal of the Technical University in Brno is Q:\vyuka\matematm\Tomsova\Polyhedron\matice.exe. We show an application of this program. Data are denoted as CONCRETE.
4.2 Example
Maximize the objective function z = 3x + 5y in the area bounded by the following restrictions:
1. x ≥ 0
2. y ≥ 0
3. x ≤ 5
4. x + 2y ≤ 12
5. 2x + 3y ≤ 19
Solution: For a better graphical overview we draw a figure in which the area of the polyhedron is bounded by the lines corresponding to the restrictions, with the vertices 0 = [0, 0], A = [5, 0], B = [5, 3], C = [2, 5], D = [0, 6].
After starting the program we pass through the vertices in the sequence 0, A, B, C, D. We can monitor the coefficients and values of the objective function in the green coloured row. The coordinates of the vertices are in the last column in the last two rows. The program finishes at the moment when all coefficients of the objective function are nonpositive, and the result is: The problem has exactly one solution at the point C and the value of the objective function is 31.
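The stated result can be cross-checked independently of the described program. The sketch below is not the author's method: it uses plain brute-force vertex enumeration (intersect every pair of boundary lines, keep the feasible points, evaluate z = 3x + 5y), which is practical only for small two-variable examples. The names `vertices` and `best_vertex` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each restriction is stored as (a, b, c), meaning a*x + b*y <= c.
CONSTRAINTS = [(-1, 0, 0), (0, -1, 0), (1, 0, 5), (1, 2, 12), (2, 3, 19)]


def vertices(cons):
    """Feasible intersection points of all pairs of boundary lines."""
    found = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never intersect
        # Cramer's rule, in exact rational arithmetic.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            found.add((x, y))
    return found


def best_vertex(cons, cx=3, cy=5):
    """Vertex maximizing the objective cx*x + cy*y."""
    return max(vertices(cons), key=lambda p: cx * p[0] + cy * p[1])


if __name__ == "__main__":
    for x, y in sorted(vertices(CONSTRAINTS)):
        print(f"[{x}, {y}] -> z = {3 * x + 5 * y}")
    print("best vertex:", best_vertex(CONSTRAINTS))
```

The enumeration recovers the five vertices listed in the example and confirms that the objective function is largest at C.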
References
[1] CHURCHMAN, CH. W.; ACKOFF, R. L.; ARNOFF, L. Introduction to Operations Research. New York : John Wiley & Sons, Inc., 1957.
[2] KLAPKA, J.; DVOŘÁK, J.; POPELA, P. Metody operačního výzkumu. Brno : VUTIUM, 2001.
[3] RAIS, K. Vybrané kapitoly z operační analýzy. Brno : PGS, 1985.
[4] VACULÍK, J.; ZAPLETAL, J. Podpůrné metody rozhodovacích procesů. Brno : Masarykova univerzita, 1998.
[5] WALTER, J. a kol. Operační výzkum. Praha : SNTL, 1973.
[6] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOS, 1995.
Address: Mgr. Marie Tomšová University of Technology Technická 8, 616 00 Brno, e-mail: [email protected]
AUTHOR INDEX
A
ABDURRZZAG, T. 27
AMALKA AL K. 81
B
BAŠTINEC, J. 35, 143
D
DIBLÍK J. 35, 143, 169
DOSTÁL, P. 21
F
FAJMON B. 51
K
KLIEŠTIK T. 105
KOSTIHA J. 65
KRUPKOVÁ V. 157
L
LACKO, B. 73
LAŠŠÁK, V. 89
M
MÉSZÁROS, I. 57
MIKULA, V. 45
MINAŘÍK, M. 99
N
NOVÁK, M. 43, 51
O
OŠMERA, P. 89, 109, 123
P
PETRUCHA, J. 45, 137
POPELKA, O. 89
R
RUKOVANSKÝ, I. 89, 131
S, Š
SVOBODA, Z. 169
ŠMARDA, Z. 157, 163
ŠŤASTNÝ, J. 99
T
TOMÁŠ, I. 57
TOMŠOVÁ, M. 173
V
VÉRTESY, G. 57
Z
ZAPLETAL, J. 9
SPONSORS
Sverepec 365, Považská Bystrica, postal code: 01701, tel.: 00421-42-4321110, fax: 00421-42-4379930, owners: Milan Richtárik, Jozef Ďurajka
VS-mont, s. r. o., Lazy pod Makytou, Slovakia, email: [email protected], tel./fax: 00421 42 4681 965, tel.: 00421 42 4681 952
Udiča 366, 018 01 Považská Bystrica, telephone: 042/4260768, 042/4260769, fax: 042/4340269, e-mail: [email protected] (goods orders), [email protected] (economic department), [email protected] (secretariat)
Title: ICSC 2006 – Fourth International Conference on Soft Computing Applied in Computer and Economic Environment
Author: Collective of authors
Publisher, copyright holder, produced by: Evropský polytechnický institut, s.r.o., Osvobození 699, 686 04 Kunovice
Print run: 200 copies
Number of pages: 182
Edition: first
Year of publication: February 2006
ISBN 80-7314-084-5