Overcoming coherence issues in distributed photo capture Pieter-Jan Vandormael June 17, 2008
ii Do not print pages i-iv.
iii Do not print pages i-iv.
iv
Samenvatting Inleiding Ubiquitous computing Een trend in de hedendaagse computerwetenschap is de toenemende beschikbaarheid van rekenkracht en netwerkverbondenheid in een alomtegenwoordige en geminiaturiseerde vorm. Dit is te danken aan een aantal technologie¨en die algemeen ingeburgerd zijn geraakt: Draagbare toestellen, de algemeenheid van het internet en webtechnologie¨en als de “lingua franca” van computers, draadloze communicatiesystemen die wereldwijde bereikbaarheid verzorgen, en tenslotte, sensorische technologie¨en (zoals het Global Positioning System, gps) die een bewustzijn van de omgeving teweegbrengen. Het aantal van dergelijke toestellen die voor het publiek beschikbaar worden neemt gestaag toe. Ubiquitous computing heeft betrekking op het benutten van deze toenemende integratie van rekenkrachtige toestellen in de dagdagelijkse wereld [1].
Context awareness, robotica en sensorknoop-netwerken Het is aan de hand van contextbewustzijn (context awareness), dat de fysische wereld verstrengeld wordt met computersystemen. Een belangrijk en overblijvend probleem is dat, in vergelijking met de subtiele voeling die mensen hebben met hun tastbare wereld, de bestaande systemen erg ruw zijn. Niet alleen zijn sensors onvermijdelijk onnauwkeurig, maar ook de laatste stap om semantisch rijke informatie te produceren op een accurate wijze uit ruwe sensorgegevens, is extreem moeilijk. De wereld van de robotica bijvoorbeeld, is deze kwestie al vele jaren aan het aanpakken. In nichedomeinen, zoals huishoudelijk stofzuigen, kunnen robotten redelijk goed presteren. Een ander voorbeeld is in-network processing. Hierbij voeren knopen (nodes) in een sensornetwerk taken uit zoals het samenvoegen van sensorgegevens van naburige knopen, zodat meetgegevens voor een gebied eerder dan voor een enkele sensor tezamen verwerkt kunnen worden.
Het belang van locatie In beide domeinen is het deelprobleem van de kennis van de locatie van de robot of knoop opgelost. Maar veralgemeningen vanuit deze domeinen blijven moeilijk. In context aware computing, een subdomein van ubiquitous computing (niet van toepassing op het onderwerp van dit eindwerk, maar verwant), speelt plaatsbepaling een belangrijke rol in het koppelen van de virtuele wereld aan de re¨ele wereld en omgekeerd. Maar ook in andere domeinen maakt goede positie-informatie het verschil. Geotagging is het toevoegen van positie-informatie aan v
vi verschillende soorten media. Een populair gebruik momenteel is het toevoegen van geografische metagegevens aan foto’s op populaire foto-uitwisselingswebsites zoals Flickr, zodat fotoverzamelingen doorzocht kunnen worden op locatie. Vanuit een ander perspectief, namelijk beginnend bij een kaart of herkenningspunten, koppelen experimentele webdiensten en producten, zoals Google’s Street View, Microsoft Live Labs’ Photosynth, en andere, foto’s en posities op interessante nieuwe manieren. Voorlopig is het echter niet mogelijk voor gebruikers om hun eigen foto’s te uploaden.
Het delen van foto’s Nieuwe manieren om foto’s te delen met andere gebruikers, naast de al bestaande gecentraliseerde websites, worden ook onderzocht. Een voorbeeld is PIX-grid [2], een peer-topeer-systeem op internetschaal dat gebruikers in staat stelt om publiekelijk of binnen een groep foto’s uit te wisselen en op een effici¨ente manier foto’s te lokaliseren gebaseerd op semi-automatisch toegevoegde metagegevens (komende van de toestellen en de gebruiker).
Distributed capture Tegelijkertijd krijgt het nieuwe concept van collaborative capture of distributed capture [3] vorm. Distributed capture (oftewel collaboratieve of verspreide opname en samenstelling; iets wat over een langere periode kan plaatsvinden) bestaat erin een set van verspreide en overal aanwezige persoonlijke toestellen te gebruiken om data op te nemen van een gebeurtenis, of om een omgeving op te meten, als ware het een verspreide sensorische infrastructuur, en om dan de resulterende samenstelling daarvan uit te wisselen en te synchroniseren met het doel de totale kwaliteit te verbeteren.
Doelstellingen Het onderwerp van dit eindwerk is het onderzoeken van de ruimtelijke coherentie in het vernieuwende concept van distributed photo capture, in het bijzonder het probleem van algemene coherentie in plaatsgemarkeerde (geotagged) fotoverzamelingen. Aangezien bijna geen verwant werk bestaat over dit onderwerp en het werk uit andere verwante domeinen slechts in beperkte mate van toepassing is, kan dit werk beschouwd worden als een eerste stap in een nieuwe richting. Distributed photo capture betreft het samenstellen van foto’s genomen met een camera of draagbaar toestel, door verschillende gebruikers en op verschillende maar gelieerde plaatsen, tot een coherente verzameling. De coherentie van de verzameling is hier duidelijk het beoordelingspunt, en meer bepaald is het de ruimtelijke coherentie die van belang is in dit werk. Een fotoverzameling wordt ruimtelijk incoherent als een foto wordt toegevoegd die zich niet op de positie bevindt waar de gebruiker hem verwacht. Gebaseerd op een virtuele-wereldvoorstelling van de verzameling, zou de gebruiker kunnen verwachten het ene te zien terwijl hij uiteindelijk iets anders te zien krijgt dat niet overeenkomt met de werkelijkheid. De hoofddoelstelling is om coherentieproblemen in distributed photo capture te verkennen. Aangezien dit concept erg nieuw is, bestaat een belangrijk deel van dit eindwerk erin om eerst en vooral het probleem duidelijk te formuleren. Ten tweede worden manieren om de inconsistenties te corrigeren onderzocht. Tot slot zal een prototypische toepassing in verband met een gedistribueerde fotoverzameling en gebruikmakend van een foutverspreidingstechniek
vii ontwikkeld worden om deze theorie in praktijk om te zetten. Dit eindwerk poogt ook een begin te vormen voor toekomstig werk in dit domein.
Plaatsbepaling met afstandsmetingen door gebruikers Locaties in data aggregatie Veel onderzoek werd verricht naar context, zelfs in die toepassingen waar in feite locaties een belangrijke rol spelen. Het aspect van context dat gerelateerd is aan locatie, is nabijheid. Dit is tegelijk een krachtig en vaag begrip, afhankelijk van wat nodig is voor de toepassing. In de Ambient Computing and Embedded Systems (ACES) research groep werd een toepassing ontwikkeld om te navigeren door een fotoverzameling gebruikmakende van context [4]. Foto’s genomen van de omgeving worden weergegeven in hun ori¨entatie relatief ten opzichte van de gebruiker (en zijn locatie). Het concept van context dient hier vooral als een multidimensionale afstand die tijd en ruimte vervat. Foto’s genomen door eender wie op eender welk moment, zolang ze maar genomen zijn op plaatsen kort bij elkaar, worden samengezet. Het is mogelijk om door de foto’s heen te klikken en te zien wat verderop ligt. Zodoende vormt de fotoverzameling een coherente virtuele wereld. De fysische positie van de gebruiker wordt voortdurend gevolgd en zijn context wordt mee veranderd, zodat de virtuele wereld gekoppeld is aan de re¨ele wereld. Hiervoor wordt gps gebruikt. Het probleem met plaatsbepaling met gps is dat, terwijl het vrij accuraat is binnen zijn foutmarges, het niet overal bruikbaar is (zoals in gebouwen omwille van signaalverzwakking) en de informatie niet snel genoeg vernieuwd wordt. Het is verder niet mogelijk om fijnmazige mobiele plaatsbepaling (tot op cm’s) in te wisselen voor iets anders, zoals de verversingssnelheid, met dezelfde ontvanger. Een ander probleem is het gewaarworden van veranderingen in ori¨entatie bij gps. Elke gebruiker kan ook foto’s toevoegen aan de verzameling. Het markeren van foto’s met geografische co¨ ordinaten is tegenwoordig een populair onderwerp. Ook al is het eindgebruik van deze applicatie goed bekend, het verkrijgen en opnemen van inhoud blijft toch nog moeilijk. Dit situeert zich in het domein van de (ruimtelijk) data aggregatie (het verzamelen van gegevens). De gebruikte toestellen zijn kleine pda’s annex fotocamera’s met enige netwerkmogelijkheden en een gps-ontvanger. Een interessant, gerelateerd en zelfs commercieel gerealiseerde toepassing is [5] van Cyclomedia, waarvoor herhaaldelijk panoramische beelden werden getrokken in Nederlandse steden, en later aan elkaar genaaid tot 3D kaarten. Dit bewijst dat er oneindig veel toepassingsmogelijkheden zijn, als posities maar met voldoende precisie kunnen worden bepaald. Het is zelfs mogelijk om vrij nauwkeurig de hoogte van gebouwen e.d. te meten vanaf deze foto’s voor technische doeleinden (zoals wegenwerken, renovatieprojecten, kadastrale gegevens, . . . ). Dit eindwerk heeft als opzet de ruimtelijke consistentie in het verzamelen van de gegevens te onderzoeken: hoe foto’s op verschillende plaatsen met elkaar in verband kunnen worden gebracht. Meer bepaald willen we te weten komen hoe foto’s (in een net van fotoposities) posities gegeven kunnen worden terwijl hun onderlinge structuur coherent blijft. Om dit te kunnen doen, moet de fotoverzameling op zo’n manier aan de gebruiker worden gepresenteerd dat erdoor navigeren simpel, realistisch, redelijk transparant en vrij van inconsistenties is. De manier waarop met locaties wordt omgegaan moet ons helpen dit te bereiken. Om deze reden zullen we ons toeleggen op technieken voor plaatsbepaling.
viii
Een toepassingsvoorbeeld In lijn met het bovenvermelde onderzoek stellen we een voorbeeld van een toepassing voor waarbij de gebruiker gevraagd wordt foto’s te nemen en de afstand en de richting tussen de twee plaatsen onderling waar de foto’s genomen zijn in te geven. Op deze manier kan een kaart geconstrueerd worden van de omgeving waarin de foto’s verbonden zijn volgens hun afstand en relatieve richting (hoek). Deze informatie kan gedeeld worden tussen gebruikers, en deze collectief gebouwde kaart kan dan dienen als een fotografische gids voor anderen. Men kan inzien dat de kaart snel behoorlijk groot kan worden, zodat ervoor gezorgd moet worden dat de toepassing goed meeschaalt naar grote verzamelingen. Het is zinvol de verzameling alsook de berekeningen betrokken in het plaatsbepalen en koppelen van de foto’s, gedistribueerd uit te voeren, zodat de bronnen kunnen gedeeld worden op een manier die haalbaar is voor dergelijke beperkte pda’s en bij het ontbreken van een centrale dienst. Beeld een gebruiker in die staat in de hoek van een kamer, kijkt naar de andere kant van de kamer, en in die positie een foto neemt. De gebruiker wandelt dan diagonaal (45◦ naar rechts) de kamer over in 20 passen, kijkt naar de andere muur in het noorden, en neemt opnieuw een foto. De gebruiker kan deze informatie samen met de foto’s invoeren in de toepassing (20 passen, 45◦ , naar het noorden kijkend). Op deze manier heeft de gebruiker zelf een relatieve positieschatting uitgevoerd zonder te rekenen op andere systemen zoals gps. Hoewel de schatting van de gebruiker mogelijk minder nauwkeurig is, kan men ervan op aan dat als de gebruiker zich vervolgens naar links zou verplaatsen ten opzichte van waar hij zich daarvoor bevond, de nieuwe, derde foto zeker ook ingevoerd zou worden als zijnde links van de tweede foto, en nooit andersom, zoals zou kunnen gebeuren indien ze beide samen binnen de foutmarges van gps vielen. Deze vorm van relatieve plaatsbepaling kan toch absolute co¨ordinaten bekomen als de eerste hoek van de kamer een bekend ori¨entatiepunt zou zijn. Deze hoek zou op voorhand kunnen gelokaliseerd worden aan de hand van andere middelen zoals gps. Plaatsbepaling, of het identificeren van een locatie, kan dus gebeuren ten opzichte van andere elementen (zoals bijvoorbeeld, ten opzichte van andere naburige sensornodes in sensornetwerken, of door middel van odometrie in robotica), of kan absoluut zijn in de vorm van wereldco¨ordinaten (zoals bijvoorbeeld gps-positionering of visueel herkenbare ori¨entatiepunten). Hoewel de gebruiker niet gemakkelijk voor de hand liggende foute schattingen zal maken in opeenvolgende foto’s, zou het kunnen dat hij/zij een erg grote wandeling maakt en dan terug aankomt op zijn/haar vertrekpunt, om dan tot de conclusie te komen dat de eerste en laatste foto’s zich in de toepassing niet op dezelfde plaats bevinden, omdat hij of zij een grote cumulatieve fout heeft opgebouwd onderweg. Dit probleem kan verholpen worden door gekende locaties (ori¨entatiepunten of landmarks genoemd) te verwerken in de wandeling, zodat de tussenliggende metingen kunnen gecorrigeerd worden aan de hand van interpolatietechnieken, en wel als volgt: veronderstel dat de gebruiker begint in een hoek van een grote kamer, hetgeen een gekende locatie is, en dan de kamer diagonaal oversteekt in deelstappen. Uiteindelijk komt hij aan in de tegenoverliggende hoek, die ook een gekende locatie is. Daar kan de fout groot zijn maar kan deze verbeterd worden omdat de positie gekend is. De deelstappen tussenin kunnen vervolgens ook bijgesteld worden. Een ander type van samenhang dat wenselijk is, zit ook in de werking van de toepassing vervat: als er een foto is van een deur en de gebruiker klikt op de deur om erdoor te gaan, dan zouden we wensen dat hij of zij een foto ziet van wat zich achter de deur bevindt, bijvoorbeeld een gang, zonder dat dit beeld verschoven of geroteerd is, en bovenal zonder iets anders te
ix tonen dan deze gang, zoals een kamer ernaast. De fouten die we verwachten van gebruikers zullen zowel in de geschatte afstanden als in de geschatte hoeken zitten: zowel systematische (bijv. de optimistische gebruiker die denkt dat het niet zo ver is) als niet-systematische misschattingen (aangezien het voor mensen moeilijk is om afstanden te schatten, zelfs met het doel duidelijk in zicht). Daarnaast ook een fout in de hoekschattingen, zelfs voor niet al te grote afstanden, in het bijzonder wanneer het niet 90◦ of 0◦ betreft. Het is mogelijk dat de fouten groter worden voor grotere afstanden en uiteindelijk zo groot dat de gebruiker in deelstappen zal moeten werken. De focus van de toepassing ligt echter op consistentie en transparantie voor de gebruiker, eerder dan nauwkeurigheid, zodat deze fout niet stoort als de gebruiker hem niet opmerkt. Het wordt nog moeilijker wanneer er zaken, zoals gebouwen of begroeiing, . . . het directe zicht belemmeren, en wanneer de gebruiker dus zal moeten afwijken van het pad in vogelvlucht. Hetzelfde geldt voor heuvels en niveauverschillen in het terrein, waarbij het probleem een derde dimensie krijgt. Men zou zelfs grotere misschattingen verwachten hier. Deze kunnen gereduceerd worden als de gebruiker in stappen wil werken (bijvoorbeeld, eerst tot aan het gebouw, dan langs de zijkanten van het gebouw, en nadien tot aan het doel).
Het plaatsbepalingsprobleem in het algemeen Plaatsbepaling is de centraal kwestie in ons probleem. We willen in staat zijn om te bepalen waar iets zich bevindt met betrekking tot iets anders. Meer exact geformuleerd willen we de positie van een punt kunnen schatten. Zoals eerder gezegd spelen zowel relatieve schattingen als absolute schattingen een rol. Ondanks de grote hoeveelheid bestaand werk over het probleem van de plaatsbepaling, daar het een oud probleem is, bestaat er toch geen generische oplossing voor alle facetten van ubiquitous computing. Voor tenminste twee grote domeinen echter zijn er wel plaatsbepalingstechnieken specifiek ontwikkeld, namelijk voor robotica en ad-hoc sensornetwerken. Beide domeinen hebben oplossingen die tot op zekere hoogte van toepassing zijn op het plaatsbepalingsprobleem. Spijtig genoeg bestaat er niets dat nauw verwant is aan ons bovenstaand toepassingsvoorbeeld, of dat (veel) van de eigenschappen bezit die we vooropgesteld hebben, zodat het nodig was in het eindwerk een overzicht en literatuurstudie te maken van de gevonden technieken en van welke idee¨en ze bevatten die ons zouden kunnen helpen. Uit dit overzicht werd uiteindelijk vooral ´e´en algoritme weerhouden, dat dadelijk toegelicht zal worden in zijn oorspronkelijke vorm, nadat eerst nog twee belangrijke begrippen zijn verduidelijkt.
Twee belangrijke begrippen: consistentie en coherentie Consistentie (consistency) betekent “de mogelijkheid om te worden samengevoegd zonder contradictie”1 , of, wanneer een toestand uitgedrukt wordt, “overeenkomst of harmonie van delen of eigenschappen ten opzichte van elkaar of een geheel”. Wanneer deze term gebruikt wordt in de context van ordening in gedistribueerde systemen, dan doelt consistentie typisch op causale consistentie of ordening. Net zoals daarbij enkel de volgorde die ertoe doet (dat wil zeggen de “ging-vooraf”-relatie) belangrijk is, geldt bij het ruimtelijke ordenen van een verzameling plaatsgemarkeerde foto’s een gelijkaardig concept: enkel de verbanden tussen de foto’s moeten gerespecteerd worden (bijvoorbeeld een “bevindt-zich-links-van”-verband). 1
Deze definities zijn vrije vertalingen van de overeenkomstige Engelstalige definities afkomstig uit het Merriam-Webster’s online woordenboek.
x Bovendien moet wat belangrijk is (van structuur) in de echte wereld terug te vinden zijn in de virtuele voorstelling. Coherentie daarentegen heeft als definitie “een systematisch of logisch verband of consistentie” en “logisch of esthetisch geordend of samengevoegd”. Wij hanteren coherentie als een ruimere term dan consistentie, voor een toestand van samenhang van een hele verzameling eerder dan een eigenschap tussen twee foto’s. Het totaalbeeld van de fotoverzameling moet coherent zijn en dus kloppen met de werkelijkheid.
Een algoritme naar analogie met een netwerk van veren Het enige algoritme dat in deze samenvatting besproken wordt komt uit het domein van de robotica. Daar schat een rijdende robot op verschillende momenten zijn positie aan de hand van zijn odometer (dat wil zeggen een afstandsmeter gebaseerd op de omwentelingen van zijn wielen) en een kompas, bijvoorbeeld. De verschillende posities op verschillende momenten komen overeen met de verschillende posities van de foto’s in ons voorbeeld en de afstandsschattingen respectievelijk hoekschattingen van gebruikers vervangen de odometrie en het kompas. Het algoritme maakt deel uit van een familie technieken die een kaart van posities consistent maken door paarsgewijze geometrische verbanden op te leggen tussen individuele metingen (dwz. posities). Andere verwante algoritmen maken de mechanische analogie met een net van knooppunten met daartussen veren of elastiekjes dat zijn evenwicht zoekt. Het fysische evenwicht (waarbij minimale energie bereikt wordt) vertaalt zich dan naar een optimale foutspreiding in een netwerk van fotoposities en de schattingen daartussen. De aanpak in [6] maakt niet effectief gebruik van deze analogie in het algoritme zelf, maar is rekenkundig goedkoop, eenvoudig te implementeren en garandeert het vinden van een globaal optimale oplossing. Het gebruikt een snelle, on-line en iteratieve methode om globale consistente kaarten op te stellen aan de hand van enkel lokale meetinformatie. Om de geometrische consistentie te bewaren wordt een relaxatiealgoritme gebruikt. Het basisidee is om iteratief elke knoop om beurt te kiezen en deze te verplaatsen naar waar zijn buurknopen denken dat deze zich moet bevinden.
Een algoritme voor distributed photo capture Probleem en doelstellingen Het verzorgen van coherentie in distributed capture is een erg nieuw onderwerp. In de INRIA Ambient Computing and Embedded Systems research groep te Rennes wordt momenteel een verwante studie uitgevoerd naar het tijdsaspect [3], namelijk naar het op een coherente manier samenvoegen van multimediafragmenten die collaboratief opgenomen zijn door meerdere toestellen. Dit eindwerk richt zich daarentegen naar het ruimtelijke aspect. De term “distributed” slaat hier op het feit dat een set van foto’s, genomen op verschillende posities, allemaal dezelfde plaats of hetzelfde object kunnen tonen. Het coherentieprobleem is dat als dat zo is in werkelijkheid, dat er dan ook voor moet gezorgd worden dat dezelfde ervaring gerecre¨eerd wordt in de virtuele voorstelling, zoals een kaart, die gehanteerd wordt. Ons doel is het verbeteren van de coherentie en daarvoor zullen we een foutverspreidingstechniek gebruiken. Het algoritme daarvoor dient gedecentraliseerd te zijn zodat de berekeningen gedaan kunnen worden in een gedistribueerd systeem. Naast een algoritme zullen we ook een prototypische toepassing ontwikkelen en implementeren (zie verder).
xi Gezien het domein vele facetten heeft, moeten we de reikwijdte van dit werk enigszins beperken. De oplossing die wij zullen vooropstellen is een incrementele methode die voortdurend de samenhang in een dynamisch veranderende verzameling van knopen verbetert. Hierbij is het belangrijk dat correcties enkel een lokaal effect hebben, eerder dan dat een optimale toestand behaald wordt zoals men zou verkrijgen bij een methode die in ´e´en keer op een hele verzameling wordt toegepast. Verder is het delen van de plaatsgegevens van foto’s belangrijk, niet het verspreiden van de foto’s zelf of het gebruikte communicatieprotocol. Ook primeren technieken voor plaatsbepaling en maken we geen keuze met betrekking tot welke technologie juist gebruikt wordt om absolute co¨ordinaten toe te kennen aan foto’s, noch voorzien we in details zoals aanwezig in geografische informatiesystemen. Enkel het toevoegen van foto’s aan de applicatie wordt ontwikkeld, niet het wijzigen of verwijderen ervan. Wat betreft de gedistribueerde werking van het systeem wordt daar op het niveau van het algoritme rekening mee gehouden, maar wordt dit niet uitgewerkt in de toepassing. Tot slot is een formele evaluatie van de performantie van de oplossing geen hoofdprioriteit, maar wel het aantonen van de uitvoerbaarheid van zo’n systeem. Andere aspecten zoals gebruikersinterfaces, veiligheid en beheerbaarheid behoren evenmin tot de opdracht.
Representatie Hoe het oplossen van coherentieproblemen in plaatsbepaling ook de coherentie verbetert in distributed capture wordt duidelijk als we foto’s en posities gaan koppelen. Omdat het koppelen van de foto aan de positie van het hoofdobject in de foto zelf een aantal nadelen heeft (maar ook voordelen) die wij ongewenst achten voor deze toepassing (zoals het feit dat zulke foto’s niet navigeerbaar zijn als een virtuele wandeling), wordt ervoor gekozen om de positie van een foto af te leiden uit het camerastandpunt. Dit geeft een zicht naar buiten toe, maar de foto’s moeten hiervoor geschikt zijn. In het interieur bevatten de foto’s muren, deuren en gangen en kamers, buiten tonen ze het zicht van op hun positie. Punten waar we de positie van willen bepalen, dus de camerastandpunten van foto’s, noemen we knopen (zoals in een graaf of netwerk; nodes). Er zijn qua schattingen (de verbanden tussen de foto’s die we links zullen noemen) twee soorten: relatieve links zijn schattingen van afstand en hoek (in eerste instantie een absolute hoek ten opzichte van het noorden) van de eerste knoop naar de volgende, door de gebruikers. Absolute links daarentegen bevatten vrij precieze informatie, zoals van een bron zoals gps of het aanwijzen op een satellietkaart, of zelfs in een ge¨ıdealiseerd geval (zoals in de eerste versie van het algoritme) perfecte informatie over een positie. Hoe meer gekende knopen met een dergelijke schatting, des te minder correcties er in het algemeen moeten gebeuren.
Keuze van algoritme Een eerste simpele versie van het algoritme steunt sterk op het relaxatiealgoritme dat eerder werd besproken. Het beoogde gebruik is zeer gelijkaardig, alleen zullen we andere gegevens gebruiken in een andere situatie, en is ook het gedistribueerd zijn van tel. Het algoritme maakt gebruik van informatie voorzien door gebruikers en poogt met die beperkte gegevens de best mogelijke oplossing te bekomen. Immers, om te zorgen dat de fout goed verspreid is en niet extreem groot in ´e´en plaats, en dus dat de kaart werkelijk coherent is, is het nodig dat er voldoende gegevens beschikbaar zijn. Verder is het voortdurend aannemen van de nieuwe schattingsinformatie en het lokaal doorvoeren van deze aanpassing de snelste manier om de
xii coherentie te verbeteren. Bovendien maakt dit verspreide berekeningen mogelijk. Terwijl dit algoritme enkele van de meer geavanceerde technieken mist om de coherentie te verbeteren, is het ´e´en van de weinige dat toelaat om gebruikt te worden in een eenvoudig gedistribueerd systeem. Zoals vele algoritmen voor sensornetwerken is het niveau van dit algoritme te omschrijven als een gewone nauwkeurigheidsmaximalisatie. Toch wordt er zoveel mogelijk informatie benut en zo weinig mogelijk benaderd en is het niveau dus hoger. De verdienste van dit algoritme is de mogelijkheid om gedecentraliseerd te worden uitgevoerd, terwijl het toch eenvoudig blijft om te implementeren. Ook worden er aanpassingen gemaakt om het algoritme meer specifiek te maken voor het photo capturing gegeven (bijvoorbeeld voorzieningen om de kijkrichtingen van foto’s te draaien).
Algoritme, eerste versie Het algoritme bestaat uit twee stappen voor elke iteratie. Beide stappen worden uitgevoerd voor elke knoop i om beurt, binnen ´e´en iteratie. De volgorde waarin knopen moeten worden bijgewerkt wordt niet bespoken in [6] maar is niet noodzakelijk onbelangrijk. Stap 1 Voor de huidige knoop i en elke buurknoop j, tussen welke er een relatieve link bestaat 0 ) voor knoop i met afstand dji en absolute hoek θji , wordt er een positieschatting (x0ji , yji berekend gebruikmakend van de positie (xj , yj ) van de buurknoop j. De techniek die gebruikt wordt om de positie te berekenen kan gezien worden als een combinatie van multilateratie en multiangulatie. x0ji = xj + dji cos θji
(1a)
0 yji = yj + dji sin θji
(1b)
Op dezelfde manier wordt de variantie vji op de positie van de huidige knoop i in deze schatting berekend aan de hand van de variantie vj van elke buurknoop j en de variantie van de link tussen de knopen uji . vji = vj + uji (2) Stap 2 In de tweede stap wordt de positie (xi , yi ) van de huidige knoop i bijgewerkt met een gewogen 0 ) die gemaakt zijn voor alle buren j in de vorige gemiddelde over de positieschattingen (x0ji , yji stap. Het doel is om een nieuwe positie (xi , yi ) te produceren voor knoop i die uiteindelijk (na over i te lopen en te itereren) meer geschikt zal zijn. Eerst worden de varianties van de schattingen vji opgeteld tot de nieuwe variantie vi voor knoop i, zodat de ge¨ınverteerde varianties P van de schattingen gebruikt kunnen worden als 0 de gewichten voor het gemiddelde. Hier is j de som over de buren j van knoop i (zonder j = i). In de praktijk wordt dit gedaan in een lus over j, waarbij steeds wordt opgeteld bij een tijdelijke waarde 1/vi , en vervolgens dit wordt ge¨ınverteerd om vi te bekomen. 0
X 1 1 = vi vji j
(3)
xiii Vervolgens wordt, ook in een lus over j voor de sommatie, de positie berekend en bijgewerkt. 0
xi =
X x0ji vi j
vji
(4a)
0
yi =
0 v X yji i j
vji
(4b)
De volgende opmerkingen moeten hierbij in het kort gemaakt worden: Randvoorwaarden De informatie van absolute links wordt toegevoegd door, tussen elke twee stappen in, de vers berekende positie en de variantie van de betrokken knopen terug te zetten naar respectievelijk de opgegeven positie en nul. Stopcriterium In [6] wordt het niet langer overschrijden van een bepaalde drempelwaarde qua totale verandering in posities gebruikt als een mogelijk stopcriterium. In deze en volgende versie echter, met oog op de gedistribueerde toepassing, is het beter om voor elke knoop individueel te stoppen als geen van zijn buren nog noemenswaardig bijgewerkt zijn wat betreft hun posities. Als dit het geval is dan zou een herberekening van de positie van de knoop ook geen noemenswaardig verschil meer opleveren aangezien de gebruikte informatie nauwelijks gewijzigd is. Lokaliteit Het algoritme kan lokaal te werk gaan en enkel lokaal een invloed uitoefenen. Als het voorgesteld stopcriterium niet te streng is dan zal een verbetering van de positie van de huidige knoop niet doorsijpelen naar vele buren (bovendien zal de bijsturing in positie steeds afnemen), als het netwerk tenminste stevig genoeg is. Dit hangt allemaal af van hoe veel accuraat gekende posities er zijn, hoe hoog de connectiviteitsgraad is en hoe streng het stopcriterium is. Nauwkeurigheid gaat ten koste van snelheid. Geschiktheid voor een verspreide berekening De lokaliteitseigenschap maakt het mogelijk om in theorie het algoritme te implementeren voor verspreide uitvoering. Berekeningen kunnen dan uitgevoerd worden per knoop op het toestel dat de knoop “bezit” (dit is het toestel van de gebruiker die de knoop heeft toegevoegd). Als een knoop bijgewerkt wordt en de verandering significant is, dan kunnen buurknopen op hetzelfde toestel ook voor bijwerking gemarkeerd worden, en dan kan een vraag om bijwerking verstuurd worden naar de andere buurknopen op andere toestellen. Real-time invoeging Deze lokaliteitseigenschap maakt het ook mogelijk om een knoop onderweg toe te voegen zonder veranderingen in het hele net teweeg te brengen (maar enkel in een klein gebied rondom de knoop). De dynamiek in dit algoritme laat toe dat knopen op elk moment toegevoegd worden (al moet uiteraard vermeden worden dat dit de stappen van een lopende berekening stoort). Convergentiebewijs In [6] bij de originele versie van het algoritme wordt bewezen dat er altijd convergentie is naar een globaal optimum. Het is daarentegen niet bewezen dat het algoritme zoals boven gegeven, met bijvoorbeeld de bijkomende randvoorwaarden, en dat bovendien numeriek ge¨ımplementeerd is, ook altijd zal convergeren. De eerste versie van het algoritme werd ge¨ımplementeerd in een Mathematica [7] voorbeeld als een haalbaarheidsstudie.
xiv
Algoritme, tweede versie In de tweede versie wordt functionaliteit toegevoegd om rekening te houden met het feit dat niet enkel de camerastandpunten coherent gemaakt moeten worden maar ook de foto’s zelf. Meer hoekgegevens worden opgenomen in het algoritme aangezien een foto ook een ori¨entatie heeft. Ook met een gedistribueerde uitvoering wordt deze keer meer rekening gehouden. De basiswerking van het algoritme is hetzelfde en verloopt als volgt: eerst en vooral wordt het algoritme gestart op knoop i omdat ofwel deze juist toegevoegd is of omdat een gegeven van ´e´en van de buurknopen noemenswaardig bijgewerkt werd om een uitvoering van het algoritme ter verbetering van deze knoop te rechtvaardigen. De buren worden gevraagd om, op dezelfde manier als in stap 1 in de eerste versie, de positie van knoop i te schatten. De meest recente informatie over de buren (hun positie en variantie) wordt verzonden door de buren wanneer deze veranderde (of wanneer een knoop werd toegevoegd, ter initialisatie). De definitie van relatieve links wordt aangepast zodat deze foto’s verbinden in plaats van knopen. Ook heeft elke relatieve link nu een afstand en twee relatieve hoeken (´e´en voor de verbinding ten opzichte van de “beginfoto”, en ´e´en voor de “doelfoto” ten opzichte van de verbinding). Als gevolg daarvan hangt de geschatte positie nu ook af van de ori¨entatie van de foto op een buurknoop. Absolute links worden nu op dezelfde manier meegerekend als relatieve links in het algoritme. De tweede stap (om bij te werken) is ook gelijkaardig aan stap 2 in de eerste versie, behalve dat er nu een stuk bijkomt om de foto’s te draaien en zoveel mogelijk in overeenstemming te brengen met de informatie in de links. Er bestaat hier het gevaar dat verschillende invloeden elkaar tegenwerken en dat een nulvector bekomen wordt (wat geen oplossing betekent). Dit wordt zoveel mogelijk vermeden maar indien dit toch optreedt dan wordt de ori¨entatie van de foto gewoon niet bijgewerkt en de vorige (altijd geldige) ori¨entatie bewaard.
Toepassing: Geometer Ontwerp Een prototype van een distributed photo capture toepassing, Geometer genaamd, wordt ontworpen en ge¨ımplementeerd. De belangrijkste vereiste is dat een gebruiker foto’s moet kunnen toevoegen en dat de positie van het camerastandpunt horend bij deze foto bepaald wordt door een aanwijzing op een kaart of door schattingen van de afstand en hoek tussen standpunten door de gebruiker. Met een dergelijke verzameling van foto’s, die ook foto’s van anderen op andere toestellen bevat, kan de gebruiker dan virtueel rondwandelen en verdere foto’s toevoegen. Hiervoor zal bovenstaand algoritme gebruikt worden. Verdere beperkingen worden opgelegd door het platform. Dit brengt een aantal niet-functionele vereisten met zich mee qua implementatie, betrouwbaarheid, performantie en aanpasbaarheid. Het gebruikte platform is Android [8] van The Open Handset Alliance, “een software stack for draagbare toestellen met inbegrepen een besturingssysteem, middleware en kernapplicaties. Ontwikkelaars kunnen toepassingen cre¨eren voor het platform gebruikmakende van de Android sdk (software development kit). Toepassingen worden geschreven in de Java programmeertaal en worden uitgevoerd op Dalvik, een virtuele machine (vm) ontworpen voor embedded gebruik draaiende bovenop een Linux kernel”. Onze motivatie om Android te gebruiken is dat dit toelaat om te ontwikkelen met een
xv draagbaar toestel, zoals een gsm of pda, in gedachten, niet alleen qua performanties maar ook qua gebruikersinterface en -ervaring. Deze voorkeurstoestellen werden geconcipieerd als always-on netwerkapparaten met uitvoerige zintuiglijke -en netwerkmogelijkheden en webtechnologie¨en. Dit, en het feit dat Google, de belangrijkste participant, de huidige sterkhouder is in location aware toepassingen en diensten, maakte het een voor de hand liggende keuze. De lijst van features leent zich voor onze toepassing en de filosofie komt overeen met het gebruik dat vooropgesteld was voor onze toepassing. Het platform is open en makkelijk toegankelijk, gebruikt de vertrouwde Java-taal, en voorziet in een emulator zodat het bezit van een echt apparaat niet nodig is.
Implementatie De implementatie houdt uitvoerig rekening met de bijzondere architectuur en opzet van Android, en gebruikt verschillende delen (hulpmiddelen, bibliotheken, API’s, standaarden) van het platform (van een weergave voor kaarten tot de debugtool). Bij de behandeling van het algoritme daarnet werd nog geen keuze gemaakt wat betreft variantiewaarden voor de schattingen. Een lagere waarde markeert een schatting als nauwkeuriger. Voor de tweede versie van het algoritme, zoals ge¨ımplementeerd in Geometer, werd voor de variantie uji van een relatieve link tussen knoop i en j met afstand dji , de volgende formule gebruikt: q uji = 1.1
(dji · 0.1)2 + (2dji sin (vθ · 2 · π/2 · 360))2
(5)
De variantie wordt geschat als een combinatie van 10% van de afstand, wat de diepte waarmee de positie zou kunnen afwijken voorstelt, en de mogelijke zijdelingse afwijkingsafstand, gebaseerd op een geschatte onnauwkeurigheid van de hoek. Hiervoor stelt vθ = 5 graden de onnauwkeurigheid op de hoek tussen de eerste foto en de richting van de link voor. De onnauwkeurigheid van de andere hoek wordt ingebracht door de vermenigvuldiging met 110%. De 10% is een keuze die gebaseerd is op de 5% die gebruikt wordt in [6]. Er werd gekozen om vθ een vaste waarde te geven omdat niet bekend is of de nauwkeurigheid van de schatting voor een hoek gecorreleerd is met de grootte van die schatting. Enkel de afstand heeft dus eigenlijk een invloed op de variantie van de link. Voor absolute links wordt een constante positieve waarde van 100 gezet. Dit maakt de absolute links ongeveer 10000 ` a 100000 keer nauwkeuriger dan de relatieve links. Bij schattingen naar gedraaide foto’s op hetzelfde standpunt wordt wel een nulwaarde gebruikt als variantie, zodat geen onnauwkeurigheid zou toegevoegd worden aan de positie van een knoop aangezien het hier toch over schattingen “binnen” ´e´en knoop gaat (de variantie van de knoop zal altijd eerst door een andere schatting worden bepaald op een niet-nul waarde). De variantie voor absolute links moet specifiek verschillen van nul of van een klein getal: omdat absolute links niet langer op de manier van de eerste versie van het algoritme opgelegd worden zou een nulwaarde een deling door nul geven. Dit kan opgelost worden door een kleine constante te gebruiken. Maar erger, aangezien zoveel mogelijk gebruik gemaakt wordt van integer calculus uit effici¨entieredenen, zou deze constante ook het maximum aantal links (met absolute links als meest limiterende) worden dat kan gemaakt worden naar een enkele knoop. Veronderstel, in overeenstemming met vergelijking 3, dat er 100 links zijn met elk vji = 100. Dan is vpre = 100/100, waarbij vpre een numerieke tussenuitkomst is die wiskundig equivalent is aan v1i . De uiteindelijke variantie voor de knoop is dan vi = 100/100 = 1. Maar neem 101
xvi links daarentegen, en dit zou vpre = 101/100 en vi = 100/101 worden, wat numeriek vi = 0 is. Door deze 0 worden de co¨ ordinaten foutief gewogen (de som van de gewichten moet ten ´ en oplossing zou zijn om de varianties overal als doubles te bewaren. alle tijden 110% zijn). E´ Een andere, en dit is degene die toegepast werd, is om te detecteren dat vpre groter is dan ´e´en en dan een kleine hoeveelheid toe te voegen aan elke vji en de berekening te herdoen. Dit houdt bij benadering de verhouding van de aandelen van de links en de varianties van hun andere knopen gelijk, wat in de eerste plaats de bedoeling was.
Resultaten Het gerealiseerde prototype laat gebruikers toe om foto’s toe te voegen en voert dan het algoritme uit om de bijhorende posities coherent te maken. Het algoritme is in staat om de samenhang te verbeteren en om de fout te verspreiden. De performantie van het algoritme is niet uitgesproken, maar vormt vooral een begin. Sommige kwesties vragen verdere aandacht, namelijk wat het algoritme betreft: Toenemende variantie Het feit dat de variantie van een knoop altijd toeneemt en nooit afneemt kan nog ongekende gevolgen hebben in een gedistribueerd systeem. Volgorde in de lussen De volgorde waarin knopen bijgewerkt worden heeft wel degelijk een invloed op het convergentieproces. Oscillaties Omwille van veranderingen aan het originele algoritme is het niet langer bewezen dat altijd convergentie zal optreden naar een globaal optimum. Bovendien wordt dit ook verstoord door een numerieke implementatie, ook bij het originele algoritme. Zo zorgt de berekening van de afstand in graden voor een niet-wederkerigheid en dus een onnauwkeurigheid omwille van de ellipso¨ıde-benadering. Tezamen met andere afrondingsfouten zou dit instabiliteiten kunnen veroorzaken. Nulvector Het nulvectorprobleem, zoals vermeldt bij de bespreking van de tweede versie van het algoritme, en onze oplossing ervoor wordt een probleem wanneer er vele links zijn. Dan bestaat de mogelijkheid dat de links zich uitmiddelen en dat de vector een lengte zal hebben die quasi gelijk is aan nul. Het kan ook de convergentie verslechteren. Wat Android aangaat is het een interessant platform gebleken. Echter de sdk is nog niet zo stabiel en bevat of bevatte een aantal bugs die een hindernis zijn geweest in de implementatie. Ook op het vlak van performantie schiet Android tekort. In het bijzonder tussen sdk versie m3-rc37a en m5-rc14 is er een verschil in de capaciteit van de emulator om met vele objecten te werken, ten koste van de snelheid. De kaartweergave en opzoekingen in de gegevensbank zijn aan de trage kant, ook in de referentieimplementaties van Android. Zoals verwacht is XMPPcommunicatie (en het Intent-concept) niet toereikend voor onze toepassing. Het testen en debuggen tot slot is moeilijker dan normaal. De applicatie is stabiel als het ´e´en enkel toestel betreft. Er is tot op zekere hoogte ook functionaliteit aanwezig voor gedistribueerde uitvoering op meerdere toestellen maar dit werd niet volledig uitgewerkt (en gestabiliseerd). In het algemeen zijn er teveel beperkingen die het succesvol implementeren van een dergelijk systeem beletten. In retrospectie was er in het huidige ontwerp beter voor geopteerd geweest om de wachtrij in het algoritmisch deelsysteem te baseren op foto’s in plaats van knopen, vooral met het oog
xvii op toekomstige uitbreidingen. Zonder dit is het immers moeilijk om een volgorde te regelen waarin foto’s gedraaid worden. Aan de andere kant zou deze verandering ook de applicatie verzwaren. Tot slot zou best ook een limiet ge¨ımplementeerd worden op het aantal knopen dat wordt weergegeven op de kaart.
Besluit Vooraleerst werden in dit eindwerk verschillende aspecten besproken van het ruimere probleemdomein, dat plaatsbepaling voor distributed data capture met ruimtelijke coherentieproblemen betreft. Een uitgebreide literatuurstudie en voorbeelden werden beschreven in het eindwerk, waarvan in deze samenvatting ´e´en voorbeeld en ´e´en algoritme werden weerhouden. Aangezien het domein nog te nieuw is, werd teruggegrepen naar twee verwante domeinen, namelijk dat van plaatsbepaling voor ad-hoc sensornetwerken en voor robotica. Terwijl plaatsbepaling toch al een bekend en goed onderzocht probleem is, is er nog bijkomend onderzoek nodig om het bruikbaar te maken voor het domein van ubiquitous computing. Er werden minder relevante en bruikbare technieken gevonden dan verhoopt, en geen enkel algoritme was een oplossing voor ons probleem op alle vlakken. Eerder dan een reeds bestaande aanpak volledig over te nemen bleek het dus nodig om zelf bij te sturen en een nieuwe richting in te slaan. Ook bleek dat in dit vroege stadium van dit onderzoeksdomein een algoritme dat gewone nauwkeurigheidsmaximalisatie kan uitvoeren meer binnen de mogelijkheden ligt dan geavanceerde coherentieverbeterende technieken. Toch is er ´e´en soort van algoritmen, gebaseerd op de mechanische analogie met het evenwicht in een netwerk van veren, en in het bijzonder de iteratieve aanpak in [6], dat aantrekkelijk leek als een basis voor onze oplossing. Ondanks de tekortkomingen op vlak van verfijning en foutverspreidingsvermogen, biedt het mogelijkheden voor een aanpassing naar gedecentraliseerde berekening, en is het ´e´en van de weinige algoritmes die toelaten een eenvoudig gedistribueerd systeem te cre¨eren. Het is haalbaar om het te implementeren in een prototypische toepassing voor het voorbeeld van het verspreid nemen van foto’s. Met deze kennis werd het algoritme vertaald en aangepast aan een gedistribueerde implementatie en de specifieke opgave van de toepassing. Maar eerst werden de probleemomschrijving en de doelstellingen geformuleerd. Het specifieke probleem is hier het verzekeren van de coherentie in distributed photo capture in een plaatsgemarkeerde gedistribueerde fotoverzameling. Gezien het erg nieuwe onderwerp werd gefocust op het coherentieprobleem zelf. De haalbaarheid van het algoritme werd een eerste keer nagegaan in Mathematica. Verderbouwend op dit theoretische werk, werd een prototypische toepassing ontwikkeld van een gedistribueerde plaatsgemarkeerde fotoverzameling, “Geometer” genaamd, ter voorbeeld en om de realizeerbaarheid van een dergelijk systeem aan te tonen. Het systeem werd gebouwd op het Android platform voor draagbare toestellen. Het ontwerp van Geometer zorgt voor een scheiding tussen de gebruikersinterface en de backend, waar de berekeningen gedaan worden. Bepaalde delen zijn slechts los verbonden wat het geheel een grotere uitbreidbaarheid bezorgt. Het gerealiseerde prototype laat individuele gebruikers toe om foto’s toe te voegen en voert dan het algoritme uit om de bijhorende posities coherent te maken. De gebruikers kunnen dan een kaart bekijken of virtueel door de foto’s wandelen. De aanwezigheid van mogelijkheden voor gedistribueerde uitvoering in de implementatie, naar het theoretische ontwerp, moet enkel beschouwd worden als een extra en ongetest toevoegsel. Het algoritme toont zich bekwaam, maar de performantie is niet uitgesproken. Toch
xviii vormt het een goed begin. Het Android platform biedt interessante mogelijkheden maar is nog niet voldoende stabiel en heeft nog verschillende bugs, veranderingen in de API’s en andere zwakheden. Het meest dringend is een verbetering van de performantie en van de mogelijkheden voor het testen.
Acknowledgement This master’s thesis closes off a period of my life, and it certainly has been an interesting one. I would like to show my gratitude to people who have supported me throughout. For a start I would like to thank Prof. Y. Berbers for giving me the opportunity to perform my thesis work abroad and for supporting this endeavour all the way. My sincere appreciation to Paul Couderc, my counsellor in Rennes, first of all for accepting my candidature, for the animated discussions on the problem definition, for counsel on the small day-to-day issues I had, and for his inexhaustible source of experience on writing papers and reports. Thanks to Dr. Yves Vandewoude for proofreading and suggestions, and for advice on the finalizing details. I also extend my gratitude to Prof. M. Banˆatre, team supervisor, and Prof. G. Lorette, Erasmuscoordinator, both in Rennes. To Prof. P. Dierckx, Erasmuscoordinator, and Julien Pauty, post-doc coming from ACES, both in Leuven for advice on the foreign experience. Subsequently also thanks to the other people involved in the Erasmus or educational administrations both in Leuven and Rennes, and to the research group (that is provisionally still called) ACES at INRIA/IRISA. The best of luck to Xavier Le Bourdon with his Ph.D. work on distributed multimedia capture (which is related to this thesis) and thanks for making his stand on alternatives representation-wise and implementation-wise. Finally I would like to give a moment’s thought to all my friends, colleagues and housemates, both in Leuven and in Rennes, in spite off the short time I was there, for support, understanding and motivation. To my parents by way of thanks for all you’ve done for me. Pieter-Jan, Heverlee, 18 juni 2008
xix
xx
Abstract This master’s thesis tackles coherence issues in the novel concept of distributed data capture, in particular the problem of spatial coherence in geotagged photo collections. Distributed photo capture concerns the aggregation of photos taken with a mobile device, by several users and at different but related places, into a coherent distributed collection. Since the concept is very new, a part of this thesis is first to clearly formulate the problem. In a literature study the large domain of localization (and related coherence improving techniques) is explored, in particular through the ad-hoc sensor network and robotic node localization subdomains. An adapted iterative mesh relaxation algorithm that improves the map consistency is proposed. The algorithm applies an error diffusion technique to attain some level of consistency and allows for decentralized calculation. A prototype application has been developed and implemented to illustrate the possibilities.
xxi
xxii
Contents Contents
xxiii
List of Figures
xxvii
List of Tables
xxix
1 Introduction 1.1 Involved domains . . . . . . . . . . 1.1.1 Ubiquitous computing . . . 1.1.2 Context awareness, robotics 1.1.3 The importance of location 1.1.4 Sharing photos . . . . . . . 1.1.5 Distributed capture . . . . 1.2 Objectives . . . . . . . . . . . . . . 1.3 Structure of this text . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . and sensor-node networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
2 Localization with user distance measurements 2.1 Locations in data aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.1 Example application 1: building maps by taking pictures at distances measured by a user . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.2 Example application 2: cooperative human surveying . . . . . . . . . . 2.2 The localization problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Overview and concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.2 Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.3 Criteria, challenges and metrics . . . . . . . . . . . . . . . . . . . . . . 2.3 Ad-hoc sensor network localization . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 Convex optimization & SDP . . . . . . . . . . . . . . . . . . . . . . . 2.3.2 DV-distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.3 Savarese et al. two-stage localization scheme . . . . . . . . . . . . . . . 2.3.4 Savvides et al. four-stage localization scheme . . . . . . . . . . . . . . 2.3.5 MDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.6 Sectoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.7 SA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.8 NBP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Robotic node localization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4.1 Simultaneous map building and localization . . . . . . . . . . . . . . . 2.4.2 Map learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
1 1 1 1 1 2 2 2 2 5 5 6 8 8 8 10 12 14 15 16 17 17 18 19 19 19 19 20 20
xxiv
2.5
CONTENTS 2.4.3 Elastic correction . . . . . . . . . . . . . . . . . . . . . 2.4.4 Mesh relaxation and map consistency . . . . . . . . . 2.4.5 Cooperative methods, such as for our second example Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
21 22 23 24
3 Distributed photo capture 3.1 Problem and objectives . . . . . . . . . . . 3.2 Scope . . . . . . . . . . . . . . . . . . . . . 3.3 Representation . . . . . . . . . . . . . . . . 3.3.1 Viewpoints . . . . . . . . . . . . . . 3.3.2 Nodes . . . . . . . . . . . . . . . . . 3.3.3 Links . . . . . . . . . . . . . . . . . 3.4 Algorithm . . . . . . . . . . . . . . . . . . . 3.4.1 Motivation . . . . . . . . . . . . . . 3.4.2 Algorithm, first version . . . . . . . 3.4.3 Boundary conditions . . . . . . . . . 3.4.4 Stopping criterion . . . . . . . . . . 3.4.5 Locality . . . . . . . . . . . . . . . . 3.4.6 Possibility for distributed calculation 3.4.7 Real-time insertion . . . . . . . . . . 3.4.8 Proof of convergence . . . . . . . . . 3.4.9 Algorithm, second version . . . . . . 3.5 Conclusion . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . .
27 27 28 29 29 30 30 31 31 31 32 33 33 33 33 34 34 35
4 Implementation: Geometer 4.1 Requirements analysis . . . . . . . . . . . 4.2 Use cases . . . . . . . . . . . . . . . . . . 4.3 Domain model . . . . . . . . . . . . . . . 4.4 The Android platform . . . . . . . . . . . 4.4.1 Motivation . . . . . . . . . . . . . 4.4.2 Emulator . . . . . . . . . . . . . . 4.4.3 Architecture . . . . . . . . . . . . 4.4.4 API’s and software . . . . . . . . . 4.4.5 Development tools . . . . . . . . . 4.5 Design class diagram and implementation 4.5.1 Design . . . . . . . . . . . . . . . . 4.5.2 UI and Activities . . . . . . . . . . 4.5.3 Domain . . . . . . . . . . . . . . . 4.5.4 Services . . . . . . . . . . . . . . . 4.6 Implementation . . . . . . . . . . . . . . . 4.6.1 Variance . . . . . . . . . . . . . . . 4.7 Results . . . . . . . . . . . . . . . . . . . . 4.7.1 Algorithm issues . . . . . . . . . . 4.7.2 Experience with Android . . . . . 4.7.3 Application issues . . . . . . . . . 4.8 Conclusion . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
37 37 39 42 42 42 42 43 43 44 44 44 48 49 50 54 54 55 55 59 60 61
. . . . . . . . . . . . . . . . . . . . .
CONTENTS 5 Conclusion 5.1 Results . . . . . 5.2 Produced work 5.3 Future work . . 5.4 Conclusion . . Bibliography
xxv
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
63 63 63 64 64 65
xxvi
CONTENTS
List of Figures 2.1
Overview of steps and methods. . . . . . . . . . . . . . . . . . . . . . . . . . .
14
4.1 4.2 4.3 4.4 4.5 4.6 4.7
Use case diagram. . . . . . . . . . . . . . . Domain model. . . . . . . . . . . . . . . . . Major components of the Android operating Design class diagram. . . . . . . . . . . . . Geometer map mode screen. . . . . . . . . . The emulator. . . . . . . . . . . . . . . . . . Mathematica example. . . . . . . . . . . . .
39 41 43 46 56 57 58
xxvii
. . . . . . . . . . system. . . . . . . . . . . . . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
xxviii
LIST OF FIGURES
List of Tables 2.1
Overview of the algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . .
xxix
24
xxx
LIST OF TABLES
Chapter 1
Introduction 1.1 1.1.1
Involved domains Ubiquitous computing
A trend in modern computing is the increasing availability of computing power and connectivity in a ubiquitous and miniaturized form. This is due to several technologies that have become widely accepted: mobile phones, the universality of the internet and web technologies as the “lingua franca” of computers, wireless communication systems providing global connectivity, and finally, sensing technologies (such as the US Global Positioning System, GPS) which provide environment awareness. The number of devices that become available to the masses is augmenting steadily. Ubiquitous computing is about exploiting the increasing integration of computing devices with our everyday physical world [1].
1.1.2
Context awareness, robotics and sensor-node networks
Through context awareness, the everyday physical world is integrated with computer systems. A key problem remaining is that, compared to the subtle understanding that humans have of their physical world, the existing systems are quite crude. Not only are sensors inevitably inaccurate, but the final stage of producing semantically rich information accurately from raw sensor data is extremely difficult. The world of robotics, for example, has been tackling this difficulty for many years. In tightly restricted domains such as domestic vacuum-cleaning, robots can perform reasonably well. Another example is in-network processing, where nodes in a sensor network perform tasks such as aggregating sensor data from nearby nodes so as to examine values for an area rather than a single sensor.
1.1.3
The importance of location
In both these domains the sub-problem of knowing the location of each robot or node has been solved. But generalization from these domains remains elusive. In context aware computing, a subfield of ubiquitous computing, not applicable to the subject of this text but related, localization (the determination of location) plays an important part to bring the virtual world into the real world and vice versa. But also in other domains good location information makes a difference. Geotagging is the act of adding location information to various media. A popular use at the moment is adding geographical metadata to photos on popular public 1
2
CHAPTER 1. INTRODUCTION
photo-sharing websites such as Flickr so that photo collections can be searched on location. From a different perspective, namely starting from a map or landmarks, experimental web services and products such as Google’s Street View, Microsoft Live Labs’ Photosynth, and others, couple photos and locations in exciting new ways. At the moment however it is not possible for users to upload photos there.
1.1.4
Sharing photos
New ways to share photos between users, next to the already existing centralized websites, are also being looked into. An example is PIX-grid [2], an internet-scale peer-to-peer system that enables users to share (globally or only within a group) and efficiently locate photographs based on semi-automatically provided meta-information (by the devices and the user).
1.1.5
Distributed capture
At the same time, the novel concept of collaborative or distributed capture [3] is taking form. Distributed capture (which can take place over a longer period of time) consists in using a collection of personal devices dispersed throughout to capture data of an event or a reality as a distributed sensing infrastructure, and exchanging and synchronizing the resulting aggregate to increase the global quality.
1.2
Objectives
The subject of this master’s thesis is the investigation of spatial coherence in the novel concept of distributed photo capture, in particular the problem of global coherence in geotagged photo collections. As nearly no related work has already been done and work from related domains is not that much applicable, this should be considered an experimentation in a new direction. Distributed photo capture concerns the aggregation of photos taken with a photo camera or mobile device, by several users and at different but related places, into a coherent collection. The coherence of the collection is clearly the quality metric here and it is more precisely the spatial coherence that is under scrutiny in this text. A photo collection becomes spatially incoherent when a photo is added that is not at the location where the user expects it. Based on the virtual world model of the collection, the user might expect to see one thing but might get to see something quite different that does not match reality. The main objective is to explore the coherence issues in distributed photo capture. Since the concept is very new, a big part of this thesis is first and uttermost to clearly formulate the problem. Secondly, ways to correct inconsistencies need to be examined. Finally, a prototype application of a distributed photo collection with an error diffusion technique shall be developed to put the theory into practice. This thesis also tries to be a starting point for future work in this domain.
1.3
Structure of this text
The bibliographic study in Chapter 2 comprises a state of the art and survey of the larger problem of localization, chiefly the ways to solve the coherence issues that go with it, and enlightens the general aims that are posed for the domain. From this literature overview an algorithm and some ideas are selected to be translated into a decentralized algorithm
1.3. STRUCTURE OF THIS TEXT
3
specifically adapted for the use in a photo application. In Chapter 3 the specific problem and expectations are marked out, choices are made about representations, and the algorithm and the adaptations are presented. Chapter 4 is dedicated to the development process of the prototype application and choices made therein. The results are discussed. Finally, Chapter 5 offers a concluding overview of the realizations.
4
CHAPTER 1. INTRODUCTION
Chapter 2
Localization with user distance measurements This partly introductory chapter situates the specific domain and problem at hand. Section 2.1 introduces the general theme and problem related to localization. It also lines out the scope and reason for existence of this work and underlines its meaningfullness. The problem is illustrated with two examples. In Section 2.2 the problem is worked out in detail. An overview of the two existing related scientific domains is given and concepts from both worlds are explained. Furthermore general issues, challenges and metrics from the general field of networks and distributed systems are fit into the specific domain. Sections 2.3 and 2.4 continue on the same topic, and effectively constitute a literature study of algorithms used in the domain of respectively ad-hoc sensor network localization and robotic node localization.
2.1
Locations in data aggregation
A lot of research has been done on context, even in those applications where in fact location plays an important role. The aspect of context that is related to location, is proximity. However this is both a powerful and vague notion, depending on what is needed for the application. In the Ambient Computing and Embedded Systems (ACES) research group an application to navigate through a photo collection using context was developed [4]. Pictures that were taken of the surroundings are displayed in their directions with respect to to the user (and his location). The concept of context serves here mostly as a multidimensional distance combining time and space. Pictures taken by anyone or at any time, as long as they were taken in places close to each other, are brought together. It is possible to click through the photos and see what lies ahead. As such the photo collection forms a coherent virtual world. The user’s physical location is also continuously monitored and his context is updated accordingly, linking the virtual world to the real world. For this positioning GPS is used. The problem here is that GPS, while quite accurate within its margins, is not usable everywhere (such as in buildings because of signal attenuation) and does not update fast enough. It is also not possible to trade fine-grained mobile positioning (up to cm’s) for something else, like the refresh rate, with the same receiver. Perceiving changes in headings is another problem with GPS by itself. Any user can also add pictures to the collection. Tagging photos using geographical coordinates is a hot topic nowadays. While the application end-usage itself has been worked 5
6
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
out, obtaining content for it remains difficult. It is situated in the domain of (spatial) data aggregation. The devices used are small PDAs annex photo cameras with some networking capability and a GPS system. An interesting, related and even commercially realized application is [5] by Cyclomedia, where repeatedly panoramic images were taken in Dutch cities, and later stitched into 3D maps. This proves that there are infinite possibilities of application, if only locations can be determined precisely enough. It is even possible to measure quite accurately heights of buildings and such from these pictures for technical purposes (such as roadworks, renovation projects, cadastral information, . . . ). This thesis intends to look into the spatial consistency in aggregating the data: how pictures at different places can be referred to against each other. Specifically we want to know how pictures (in a mesh of pictures locations) can be given locations while keeping their relative structure coherent. To be able to do so, the photo collection should be presented to the user so that navigating through it is easy and realistic, furthermore quite transparent and free from inconsistencies. The way locations are handled should helps us in achieving this. Because of this we will concentrate on techniques for localization. A (photo) database can always be shared in a distributed manner using any sort of distributed file system or peer-to-peer system, so this will not be the problem. For our purpose, distributedness implies for instance that the addition of a picture has only a local impact. Supposing that a photo collection system is implemented on the level of PDA’s, this means there will only be communication with a few other PDAs and any load of calculations can be spread. Some other common issues with distributed systems, such as how to deal with administrative boundaries within the collection or rights to correct the spatial data, are not addressed here. Finally there is the problem of determining the angle at which the photos where taken, which can never be really corrected using given (pure) position data. This might be solved by not only using odometry and the users input but also analyzing the visual data with vision and scene analysis techniques. An odometer is a device that measures the traveled distance by the number of wheel revolutions. Odometric data (good range uncertainty, bad direction uncertainty) and visual data (bad range uncertainty, good direction uncertainty) techniques tend to complement each other, in the same way as odometric data and bearing/compass data does.
2.1.1
Example application 1: building maps by taking pictures at distances measured by a user
In line with the previous research we propose an example of an application where a user is asked to take pictures and input the distance and the direction relatively in between the spots where the pictures are taken. In this way a map can be built of the environment wherein the photos are related by their distance and relative direction (angle). This information can be shared between users and this collectively built map can then serve as a photographic guide to others. One can expect the map would grow quite big easily, so the application should scale well to large collections. It would make sense to make the collection and any processing involved in locating and relating the pictures distributed so resources can be shared in a way that is achievable for such limited PDA devices and without a central service. Imagine a user standing at the corner of a room, facing one far end of the room, and taking a picture. The user would then walk diagonally (say 45◦ to the right) across the room
2.1. LOCATIONS IN DATA AGGREGATION
7
in 20 steps, face another wall to the North, and again take a picture. The user could enter this data along with the photos into the application (20 steps, 45◦ , facing North). In this way the user has done a relative position estimation himself and does not rely on external systems as GPS. While the user’s estimate might be much less accurate, we can count on it that if the user would now move to the left of where he was before, the new, third picture would certainly be classified as being left to the second picture, and never the other way around, as it could have been if it was within the margin of inaccuracy of GPS. This form of relative positioning can still be given absolute coordinates if the first corner of the room were a known landmark, which has been located by means of tools such as GPS beforehand. Localization or location identification can be relative to other elements (e.g., to other neighboring nodes in sensor networks, or by using odometry in robotics), or be absolute in terms of global coordinates (e.g., GPS positioning, visually recognizable landmarks). While the user will not easily make obvious errors in subsequent photos, the user might go for a walk, make a very large tour and then arrive at his/her starting point, only finding out the first and the last pictures are not in the same spot in the application, because he or she has built up a large cumulative error on the way. This problem can be resolved by incorporating known locations (called landmarks) as part of the tour, such that intermediate measurements can be corrected with interpolation techniques, as follows: suppose the user starts at the corner of a large room, which is a known landmark, and then traverses the room diagonally in steps. Finally he ends up in the opposite corner, which is also a known landmark. There the error might be large but can be corrected because the position is known. The steps in between can subsequently be adjusted as well. Another sort of consistency we would like is also in the logic of the application: if there is a picture of a door and the user clicks on the door to proceed, we would like him or her to see a picture of what is behind the door, for instance a corridor, without the image being panned to the side or rotated, and certainly not showing something else than this corridor, such as an adjacent chamber. The errors we expect from users are both in distance and in angles: both systematic distance (the optimistic user always thinks it is not as far) and non-systematic distance errors (it being difficult for humans to judge distances, even in clear view). Also, some error in the angle estimations, even for not too great distances, especially when it is not 90◦ or 0◦ . The errors have a chance of becoming greater for larger distances and in the end so large that the user will need to work in steps. The focus of the application is however on consistency and transparency for the user, over accuracy, so this error is not be harmful if the user does not notice it. It gets even more difficult when things, such as buildings, shrubbery, . . . , are in the way of the direct line of sight, and when the user must deviate from the straight path. The same goes for hills and different level of terrain, when the problem gets a height component as well. One would expect even greater errors here. They can be reduced if the user is willing to work in steps (e.g., to the building first, then the sides of the building, and afterwards to the endpoint), except when there is nothing to reference to. Later on, we will refer to this example as distributed photo capture when multiple users collaborate in this application.
8
2.1.2
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
Example application 2: cooperative human surveying
In fact this kind of problem is what human surveyors call an open traverse [9]. An open traverse is an environment in which there are no fixed reference points, and as a result, open traverses are prone to cumulative errors. For this reason, human surveyors tend to eschew open traverses and instead make use of closed traverses. Nevertheless if there is no other way, two humans could work together, one as an explorer and the other as fixed observer, serving as a reference point for the explorer, crossing the open traverse while switching roles. The rest of this text is mostly centered around the first example application. Extensions to the second or other examples are obvious.
2.2 2.2.1
The localization problem Overview and concepts
Localization is the problem that is central here in what we want to do. We want to be able to tell where something is in aspect to some other thing. In more strict terms, we want to estimate the position of a point. We use the word “estimate” as there is always some inaccuracy involved in practice. The type of problem we will look into here is partially relative, partially absolute. As said earlier there are lots of points where we have some relative position measurements (such as distance measurements), and then there are also some points which are already known. Despite the significant amount of existing work on localization problems, being an old problem, a generic solution suitable for all facets of ubiquitous computing does not exist. However, localization techniques have been designed specifically for at least two large domains of modern computing: robotics and ad-hoc sensor networks. Both domains have some solutions that are for some part applicable to our localization problem. Unfortunately nothing that is closely related to our first example application, or that has (a lot of) the features we set forth, exists, therefore we will give an overview of the found techniques that contain ideas that might help us. Localization is also one of the issues on the open research issues list for sensor networks by Akyildiz [10]. Robotics The domain of robotics studies mobile autonomous robots. These robots have computing power on-board (or somewhere else over a network link) and can sense and actuate standalone. They are usually equipped with an odometer (a proprioceptive sensor), some range sensors (such as lasers or radar; exteroceptive sensors) to detect walls, and sometimes also a compass for bearing information. They also have an emitter/detector for beacons (for instance a camera to see visual bright strips). In mobile robotics it is important for robots to be able to explore the environment. In the end, people in the field of robotics [11] would like to develop robots that can figure out their position (localization) precisely using a given map and a local current view (range sensor, camera) of the surroundings. However, instead of humans building these maps, it would be more efficient and accurate to let robots construct the maps. For this the robots sensing position must be known precisely. It is this chicken-or-the-egg problem concerning map building and localization that brings us close to our example application problem, which could be defined as long-term globally referenced position estimation without a priori information.
2.2. THE LOCALIZATION PROBLEM
9
Map learning is the process of roaming an environment and constructing a map representation from a set of measurements. These measurements, namely position, distance, orientation and identification of spatially remote objects are error prone. Maps are essential for mobile robot navigation in complex environments, being needed for self-localization, path planning and human-robot interaction. To operate in unknown environments a robot needs the ability to build its own maps. However the sensor information available to the robot is noisy and can produce errors when integrated into the map. Ad-hoc Sensor networks In sensor networks, a network of randomly dispersed small-size and low-power nodes needs to work together to perform a certain activity. A part of this problem is knowing the position of all the nodes to a certain precision, and also the geometry of the nodes relative to each other. This is used in other issues, such as optimal routing or sensor data fusion. In the modern sensor networks there are some nodes, a fraction of all nodes (for instance one fifth; this is a design choice), which have hardware to more precisely pinpoint position or which are installed at points with known coordinates. They are called anchor nodes or beacon nodes. Anchor nodes (landmarks) are nodes of a similar type as the rest of the common nodes but have precise live or predefined location information, for instance with respect to some global coordinate system, such as GPS, or by earlier measurements during deployment. Beacon nodes (access points) on the contrary are different and are external nodes, that don’t necessarily belong to the sensor network, but can be used as reference points nonetheless. The given problem is now to determine the location of the other nodes relative to these anchor nodes, and in this way also their absolute position. The sensor networks under scrutiny are ad-hoc networks “acting as high resolution eyes and ears of the surrounding physical space, making the vision of Mark Weiser [12] into reality” [13]. Ad-hoc means that there is no prior knowledge on how they will be constituted, and their relation might be disordered. We want these location networks to be self-organizing/configuring. The domain of ad-hoc sensor networks involves multiple nodes and is not concerned with temporal consistency, whereas our problem might only have one PDA, the location of which we need to gather and join, both in space and in time. Nevertheless this domain has had lots of attention paid regarding distributed processing and low energy usage (this implies minimizing communications), which is useful to our application in the distributed systems domain and the ubiquitous computing domain as well. An additional area of concern is also scalability. Nothing exactly related to our example applications is found, therefore we will give an overview of techniques that exist and contain helpful parts or ideas. Translating these two domains to our first example As to robotics we compare the robot with one PDA/user. The beacons in our case are the landmarks with known positions users use as references. It can be estimated that, both in robotics and in our example, not so many simultaneous references to landmarks are made because only one or a few are ‘visible’ at each moment. While the robots mostly use odometry, users might count their steps for instance. Both are not good at retaining a direction and have distance errors. Both may also use a compass to improve their judgment of direction. We are mostly interested in the path taken by the robots, and not so much in the walls they
10
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
measure using some laser sensing. Both the robot and the user constitute a single identity that moves in time. So adding new measurements can make the path traveled (history) more precise. There are also some papers [9] that talk about two or more robots cooperating, as in our second example. The scale at which users take measurements will probably be one order greater in comparison with the robots. Except for some experiments with scene analysis techniques, the robots have nothing to do with taking pictures that should form a coherent set. Their purpose is mostly map building. In sensor networks the emphasis is on the large scale communication or distributiveness. In our example too, the collection might easily grow to become very large, along with all kinds of appearing scalability issues, which is why distributed algorithms are important to us. For this reason we are also interested in algorithms where new measurements can be added with only a local disturbance, and not in the whole collection. The nodes are static, so there is no time aspect, and thus they compare to our picture locations (each spot where the user makes the measurements). The nodes have a much more limited processing power than our PDAs. The beacon/anchor nodes are equivalent to our landmarks, although they are much more referred. In sensor networks distances are in general not very well estimated. This is because sensor nodes mostly use radio range measuring techniques like time-of-arrival, timedifference-of-arrival and received-signal-strength [13]. In this aspect it might be very similar to user estimates. Moreover the distance estimates are worse if something (such as buildings, terrain) is blocking sight to the landmark, which goes for user estimates too. They also have a limited range in which they can take relatively good measurements, as users do. In contrast to the robots, which cover the temporal aspect quite well, the sensor nodes help us with the large spatial aspect, where there is nevertheless only the limited area one node can reach (so-called fog of war). We will review the issues of the problem, coherence in localization techniques, first, also in the context of the above example of distributed photo capture. Then we will explain in depth what the challenges are for any solution, which boils down to the algorithms we will see later, and what the criteria are by which they would be evaluated. Finally, we will look into the found literature and the localization techniques suggested there.
2.2.2
Issues
Consistency Consistency means “the ability to be asserted together without contradiction”1 , or, when expressing a state, “agreement or harmony of parts or features to one another or a whole”. When used in the context of ordering in distributed systems, consistency usually refers to causal consistency or ordering. A well-known application is the ordering of messages for distributed computations, where there is data replication and concurrent reads and writes or concurrent messages occur (such as in distributed shared memory, or distributed transactions). Rather than allowing messages to be delivered in any order, or instead of imposing a total order and making sure that every process receives exactly the same order, there is an approach in between these extremes that suffices for most applications: causal ordering, also called the happened-before relation. When a message is received by a process, it can potentially affect any subsequent message sent by that process. If that is the case, then those messages should be received in that order at 1
These definitions are from the Merriam-Webster’s online dictionary.
2.2. THE LOCALIZATION PROBLEM
11
all processes. Unrelated messages may be delivered in any order. This can be stated more formally as follows: sendi (m) → sendj (n)
implies
deliverk (m) → deliverk (n)
A similar concept can be developed for the spatial ordering of a set of geotagged photos. The ordering here is an ordering in three dimensions in the virtual world model to match the real world. Supposedly mapped on a one-dimensional world, this would mean that photos that show a closer view of an object should be ordered closer to the object on a line in virtual space. Note that it is the ordering of the photos between themselves that matters, rather than the exact distance to the object. Causal ordering is then translated to only imposing ordering between photos that are linked. The virtual model, which is equivalent with a receiving process, is made consistent with the real world, being the sending process. Causality signifies that in the real world the locations of two photos are compared or estimated with respect to eachother, and this connection is carried over into the virtual model. We also use the term “consistency” when comparing two photos at near positions, but always the criterion of this consistency is matched against the real world (such as: are they showing similar things in the right directions, as in the real world?). In this sense consistency is related to correctness. Coherence versus consistency Coherence, on the other hand, is defined as “systematic or logical connection or consistency” and “logically or aesthetically ordered or integrated”. To us, coherence is a broader term. It points more at the state and togetherness of a collection. Being coherent, all photos together should give an “image” that is sound with reality, for instance making a qualitative walkthrough possible. From a different point of view, coherence is said to be a weaker form of consistency where the state just needs to be valid but can temporarily differ between the instances. Still, the difference between consistency and coherence are not fundamentally important and both expressions are used interchangeably. Coherence (and consistency) is our goal and the error diffusion techniques we will encounter are our means of reaching this goal. Other issues Some other issues are clearly involved and important as well, allthough not considered in this text. Privacy and administrative boundaries Users might want to keep some photos to themselves, while making others public. Users may also unite in groups of friends that share their proper photos only among themselves. Security and access rights If this is the case then security of the information needs to be enforced against any form of wrong use or abuse. This includes notions such as authentication, confidentiality and integrity: how to prevent someone from knowing where you are or submitting photos under your name, for instance. Who can change what? While this is a very important issue when creating anything useful, it is not an issue considered in this work, mostly since its priority is second to making a proof of concept of a working system. Incentives Some ways to encourage people to take pictures themselves, instead of only using the collection for browsing, may need to be put in place.
12
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
Legal issues Both the sharing of content and the content itself may incur legal issues.
2.2.3
Criteria, challenges and metrics
There are several ways of judging the localization (and coherence improving) techniques and algorithms: Accuracy (metrically) For our application this is not the most important metric. However for most fields this is nearly the only and key metric. (15 cm accuracy 95% of the time, as the precision). In other words, how precise is the map in a local sense? Topological consistency How consistent are the maps in a global sense? Scalability A system is scalable if it will remain effective when there is a significant increase in the number of resources and the number of users. What we are looking for specifically are low computational orders such as preferably but improbable O(log n) and at maximum O(n). Some other challenges, such as cost and distributedness depend on or influence the scalability. Cost The cost of a single system built around these techniques should not be too high. And the cost of a multi-system deployment should not rise much more than linearly in function of the number of systems. For instance it is impossible to afford an expensive precise GPS system in every sensor node. It is also unaffordable to replicate the entire worldwide picture collection in every PDA. Distributed or centralized It is clear that scalability requires the absence of bottlenecks, which usually means the algorithm needs to be decentralized. However the question of distributedness versus centralization is not limited to availability or performance (including communications delays). Probably the most convincing reason in favor of a distributed system, is the aspect of local or limited control versus global control. Instead of forwarding responsibility, and thus also control, over the photos in the collection to some central and global entity, the devices themselves maintain responsibility over their own photos. This is consistent with the view that what belongs at the device should also be handled and kept there. If we want to keep the information at the device, then it would seem logical to also assign the responsibility (and the calculation of the algorithm) to this device, the entity that has the needed information. Some algorithms perform well with only a partial/incomplete view and knowledge instead of a global view on the data set, and this opens up possibilities. An example is the principle of locality (as in field theory in physics). We prefer distributed (i.e. decentralized) algorithms by choice. Parallelism in the localization of objects How many objects can be located at the same time? Some sensor node network technologies allow a node to interact with (and measure to) at most one other node at once. Most technologies in sensor networks use radio however where this is not a problem. Tolerance to measurement errors As mentioned in [14] there is a range error problem: a solution should be able to cope with large range measurements errors. In sensor networks
2.2. THE LOCALIZATION PROBLEM
13
there are some methods that generate errors as large as ±50% of the measurement. Generally the rule seems to apply that how bigger the error on the measurement the worse the accuracy of the solution, and this seems to be inherent to the problem. Needed beacon density As mentioned in [14] there is the sparse anchor node problem: in 2D space at least three reference points are needed to uniquely determine a location of an unknown object. There need to be sufficient anchors to remove ambiguities for the wanted accuracy. Also range measurements will not always be made against all anchor nodes directly (e.g., because of limited radio ranges in sensor networks). Another important point is that we can demand no particular topology of anchors, but instead only some arbitrary distribution. The following criteria are not considered in the evaluation of the algorithms. Ad-hoc flexibility Algorithms ideally should be adaptability to any environment without extra work. Generally some kind of automatic configuration is used. Sometimes there are limitations in certain environments. Also in sensor networks this criterion includes ease of deployment. Some related challenges are: Self-organization In sensor networks, can one just drop the nodes and leave them to sort everything out? (Self-)Calibration Is there any calibration required (e.g., to remove systematic errors in robot odometry)? More closely to our envisioned application, can the application adjust to the user’s systematic errors? Adaptivity of fidelity Not important since we want transparency, but can we choose to opt for more accuracy at the cost of something else (time, number of measurements, energy)? Generality of formalism. How wide is the spectrum of applications that can profit from the algorithm? For a distributed algorithm the following criteria are also involved: Tolerance to node failures This is only in sensor networks. Any algorithms used should be robust to some nodes leaving or joining unexpectedly. In distributed systems failures are partial, some components fail while others continue to function. In our example, if some other user’s device is not available, the pictures and informations stored therein will not be used. Transparency We want to conceal from the user how the locations are determined and how the context is gathered, so that the system is perceived as a whole. Access transparency Pictures and locations from other devices should ideally be amalgamated as if they were from the local device. A photo browser should be unaware of the fact that some locations, or the responsibility thereof, are located on a remote device. This means that local and remote locations are treated in the same way. Concurrency Several instances will be working in parallel, but can responsibilities be divided properly so that no conflicts occur at attempts to access a shared resource at the same time? This is again related to the local/limited view property.
14
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS The following criteria are again not considered in the evaluation of the algorithms.
Inconsistent state Is the algorithm resilient against temporarily inconsistent states? Energy efficiency More important for sensor networks where the nodes are really very small and calculations (local computations) are much cheaper than (wireless) communications in terms of energy consumption. Not considered.
2.3
Ad-hoc sensor network localization
We will look into the domain of sensor networks first as this is the most recent and thus we expect to find the more improved algorithms here. The future is expected to bring a wide range of novel sensor/actuator networks, containing a diverse collection of sensors and actuators, including devices that are fixed (surveillance cameras, motion sensors), devices that are carried (mobile phones, handheld computers), and devices that are mobile (robots). To have some form of cooperation or sensor fusion, these network elements will have to know their relative spatial configuration. Sensor node localization has been the topic of active research and many systems have made their appearance in the past years. Despite these efforts, very few systems are actually ad-hoc. Even fewer methods have been proposed that offer a fully distributed operation.
Figure 2.1: Overview of steps and methods. [13] [13] provides a recent overview of techniques and technologies involved in sensor node localization. Figure 2.1 shows this overview in three steps: 1. Distance measurement (be it
2.3. AD-HOC SENSOR NETWORK LOCALIZATION
15
range based, range free, or otherwise), 2. Location estimation, 3. Optimization (refinement). While we are not concerned with the technologies themselves, in any of the papers, the techniques for location estimation and refinement (these two elements are often mentioned as a whole) do. This overview is by far not complete but shows which techniques have survived scrutiny in the domain of sensor networks. We will try to point out the strong points and weaknesses of the most promising later on in a discussion of the involved papers themselves. For step 2 there are several primitive techniques for position determination. We will list some, and a subset will be discussed later: triangulation or multiangulation, trilateration or multilateration (either atomic multilateration or collaborative multilateration), proximity, scene analysis, min-max algorithms, and least square algorithms. For whole system methods (at least step 2 and 3) some techniques require centralized computation and are range-based (such as convex optimization, MDS-MAP (maximum a posteriori estimate)), others are range free (such as the centroid algorithm, APIT (approximate point-in-test), amorphous localization, and DV-hop algorithm which is range free but also has range based variants). Some other techniques focus on incorporating mobility, very often using probabilistic methods: sequential Monte Carlo localization, Bayesian analysis and Markov modeling, and Kalman filtering. Technology-wise [15] also makes an overview, which for the bigger part isn’t of much use to us, but also explains the lateration and min-max position calculation techniques. As a side choice should we use simple lateration or min-max (rectangular bounding box with center) to compute the positions in step 2? Min-max is computationally cheaper and is relatively insensitive to errors, however it is an approximation and lateration is not. Also lateration does not require that anchor nodes are at the edges of the network (it does make a difference but less than with min-max). At a first glimpse for our problem we would go with lateration, under the presumption there is enough computational power for lateration, and to not loose any precision in this phase, and finally so we can move out of our constellation of anchors. To show proof that these algorithms work, [16] contains a theoretic study: when is the problem solvable and what is its computational complexity. It also looks at the densities of the beacons sufficient for trilateration to be carried out in a designated order of steps. Yet another overview is [17], which describes the evolution from connectivity-based algorithms to distance measurements-based algorithms and their improvements. We are only interested in the latter. While being very simple and not being sensitive to range errors (but keep in mind that hops will always be an approximation to distance measurements), a disadvantage of connectivity-based algorithms is that they can only provide a coarse grained estimate. Also, the error is highly dependent on the node density, the number of anchors and the network topology. [17] also enlightens the differences between centralized and distributed algorithms. A lot of the overview papers conclude by saying “No size fits all” which is why we look at several techniques altogether: convex optimization and SDP, DV-distance, Savarese et al. two-stage localization scheme, Savvides et al. four-stage localization scheme, MDS, sectoring, SA, and finally NBP. We will recall upon these overview papers to provide insight and make comparisons when dealing with the more technical papers below.
2.3.1
Convex optimization & SDP
The localization problem can be formulated as a convex optimization problem (formulation in a quadratic form) and solved using existing algorithms for solving linear programs and
16
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
semidefinite programming (SDP, which is a generalization of linear programming, LP) algorithms. Here the task is the minimization of the error in sensor positions to fit the distance measurements. The proximity constraints between nodes which are within range of each other are modeled as convex constraints, and these can be solved by efficient convex programming techniques. [18] uses a first-order method (an efficient algorithm based on interior-point methods for solving linear programs) and rectangular bounds, and deals mostly with connectivity-induced constraints (either angular or radial). A second-order cone problem (SOCP) method (more of a SDP method but still convex programming) can also be used [19]. Apart from being a centralized algorithm, the biggest disadvantage is that the anchor nodes need to be placed on the outside edge. This is quite unacceptable for our example application as we would like users being able to thread outside of a known area too. It also does not scale well: the idea was mentioned to either limit the number of constraints on each node or to solve the problem hierarchically. In the latter the procedure could be applied to several subnetworks and then the subnetwork centroids could be abstracted to nodes in the larger network and positioned accordingly with another iteration of the same procedure. This mirrors Lagrangian Decomposition for the solution of large optimization problems on multiple processors, and with these plural hierarchical steps should scale linearly. To do better in solving the localization problem one could also add other bounding-away constraints. But these are not convex so the SDP relaxation method is needed to be able to formulate tighter convex constraints, which can then be relaxed into linear matrix inequalities. With SDP the many possible solutions are converted into a constrained maximization problem by requiring bi-valued (0 or 1) coefficients. The mean location error appears to be a monotonically decreasing function of the number of anchors. It can be fine-tuned using a gradient search procedure. In [20] the basic idea is to convert the nonconvex quadratic distance constraints into linear constraints by introducing a relaxation to remove the quadratic term in the formulation. At the very least the disadvantage of being a centralized method remains. Still, there are also some advantages: Very few anchor nodes are required for accurate estimates. Also the estimation errors are minimal even when the anchor nodes are not suitably placed within the network. We will move on and see if we can find the same in a distributed algorithm.
2.3.2
DV-distance
The DV-distance algorithm [21] is a distributed distance measurements based extension to the DV-hop (distance vector hop) algorithm [21]. All unknown nodes triangulate their position from distance values to at least three anchors. While in the DV-hop approach these distance values come from network hop counts converted into distances using a calculated average distance of a hop, the DV-distance approach directly uses distance measurements. DV-hop is quite good in dense and regular topologies and is about three times more accurate therein than convex programming. The DV-based algorithms behave generally well and have a low signaling complexity. For sparse or irregular networks however the accuracy degrades to the radio range. A third contemplated algorithm called ‘Euclidean’ [21], is claimed to provide better accuracy for anisotropic topologies (topologies that are misleading in the relation between the hops and the true distance), and is generally more predictable in performance, at the cost of communication. Yet, the usefulness of the result of the paper [21] is limited as the network topology is not necessarily realistic and far from sparse.
2.3. AD-HOC SENSOR NETWORK LOCALIZATION
2.3.3
17
Savarese et al. two-stage localization scheme
The two-phase localization scheme [14] (also called Robust positioning) by Savarese et al. is for the first part similar to the DV-distance algorithm. This start-up phase (called HopTERRAIN) is there to address the sparse anchor node problem and to give each node an initial estimate. The initial estimates are then refined amongst direct neighbors in an iterative refinement phase by a least-squares trilateration/triangulation (solving a system of linearized equations by means of a least-squares algorithm). In this phase the range error problem is addressed. The algorithm originally uses hops but can be extended to using distances. In this form it is distributed and still simple. The algorithm itself does not rely on routing within the network, but uses only local broadcasting. On the other hand, it is also limited because of its simplicity. Errors propagate fast throughout the whole network. Some network topologies are inherently hard, or even impossible, to locate (such as clusters connected by a single link). There is not much difference between orderly nodes (in a grid) or random distributed nodes, though. It is shown that it works poorly in networks where each node has less than 7 direct neighbors, which is quite grave for our first example. Here too, most anchors need to be placed on the perimeter of the network. A confidence value, chosen by the nodes themselves, is used to counter the error propagation, making the least-squares algorithm weighted. Also, groups of ill-connected nodes (less than 3 neighbors for a single ill-connected node) are heuristically detected. All these modifications are distributed too. This algorithm is about twice as accurate as DV-hop in sparse networks. It also outperforms the Euclidean algorithm in [21] in case of few anchors.
2.3.4
Savvides et al. four-stage localization scheme
The four-stage localization scheme by Savvides et al. [22], called collaborative multilateration or the n-hop multilateration primitive, is also similar to the DV-distance algorithm and to the algorithm by Savarese et al. Still, it has a fundamentally different mechanism for limiting error propagation in comparison with the previous algorithm, is also fully distributed, pays attention to scalability and is meant to be fine-grained. Both a centralized model and a distributed model, which in comparison does not compromise the accuracy, are presented. The latter makes it tolerant to node failures, distributes the communication cost, and lets nodes collaborate, all without any additional supporting mechanism such as routing. There are four phases. First, a collaborative subtree is formed using nodes that meet the criteria of high confidence. It uses the notion of tentative uniqueness for the criteria. A node is called tentative uniquely localizable if it has at least three neighbours that are noncolinear anchors or tentatively uniquely localizable nodes themselves. This tree serves to have a gradient/sequence later on (it can be searched with Distributed depth first search, DDFS), to avoid oscillation and thus error creation. In parallel and independently, secondly, initial estimates are computed. A rectangular bounding box approximation is used and positions are taken to be the center of this box. The third phase is a refinement phase. Here a Kalman filter is used, which is the same as iterative least squares techniques in a static network. For the distributed model an approximation to a distributed Kalman filter is used. This iterative refinement phase is less sophisticated than the one in Savareses algorithm. And finally nodes that weren’t included in the subtree are positioned in a post-processing phase. We have found some distributiveness but some other problems remain. The algorithms
18
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
so far seem to suffer mostly from employing a technique that is too evidently linked with a technology, rather than looking for the best heuristic. In any case we are looking for a different emphasis.
2.3.5
MDS
Multidimensional scaling (MDS) was originally a set of data (dissimilarity) analysis techniques that display the structure of distance-like data as a geometric picture , revealing spatial structures in the data. Typically dimensionality reduction techniques are used followed by multidimensional scaling and coordinate alignment techniques. Specifically, the initially obtained location estimates are refined using a least-squares minimization. This is a centralized algorithm. MDS can be further improved [23] by dividing the entire sensor network into groups where adjacent groups may overlap and share common sensors. Then these local maps are stitched together to form a global map by using the common nodes and aligning the known and estimated positions of the anchors. In theory this allows the centralized algorithm to be made distributed. In practice, it is better to run it centralized because of some disadvantages, such as the large number of required iterations and thus the high communication cost. MDS can be helpful to overcome adverse network and terrain conditions, such as anisotropy in topology and complex or uneven terrain. It is also efficient in eliminating cumulative measurement errors, which are caused by inaccurate distance measurements. The algorithm of Savarese (Section 2.3.3) uses iterative computing to counteract these but adds a large number of communication costs into the algorithm and still cannot generate good position estimation in some circumstances. Worst in this case are the connectivity-based algorithms such as DV-hop, where the accuracy relies on the average radio range estimation, and it tends to deteriorate when there is anisotropy. In [23] the fact that buildings blocking the line of sight may render a distance measurement very inaccurate is explicitly mentioned. In [23] the classical MDS and an iterative optimization of MDS are presented. MDS is wielded as a data-analytic approach to discover the dimensions that underlie the judgments of distance and model data in a geometric space. They call their algorithm a multivariate optimization based iterative algorithm for sensor location calculations. The main advantage is that MDS generates relatively accurate position estimates, even with only limited and errorprone distance information available. The iterativeness lies in the fact that not all distances between any combination of nodes is known and that these need to be derived. An on demand algorithm, to only estimate the positions of a partial set of nodes in the network around a small area, is also supplied. In [24] an extension was proposed in which all the nodes each build local maps which are joined afterwards (to a global coordinate system) using linear translation. A node measures the distances to its neighbors and then broadcasts this information. This results in each node knowing the distance to its neighbors and some distances between those neighbors. In this way, first, local partial maps with relative positions are constructed, and then these adjacent local maps are combined by aligning (mirroring, rotating) the coordinate systems, and finally translated to absolute coordinates using the anchors known positions. These MDS methods seem to require few anchors. Yet, there is error propagation, large number of iterations are necessary, and, the extensions are not very robust since range errors accumulate when combining maps.
2.4. ROBOTIC NODE LOCALIZATION
19
So far, none of the distributed algorithms generate only a local impact upon adding pictures. Next, we present some more prominent and exotic approaches to solving the localization problem (that might be exceptionally strong against some weaknesses).
2.3.6
Sectoring
In [25] it is considered whether adding the ability to estimate bearing to neighboring nodes qualitatively improves the performance of ad-hoc localization schemes. Such a scheme, with both range and bearing information, is given and is shown to have a greater accuracy. In fact the bearing information need not even be precise and an approximate detection of bearing (such that neighbors can be placed in sectors) suffices. A distributed iterative least-meansquared refinement procedure is used to minimize the relative position error between nodes, based on a weighted least squares technique [26]. Subsequently this algorithm is extended with, first, bearing, and second, sectoring, capabilities, and a comparison is made between these.
2.3.7
SA
The stochastic optimization approach suggests an alternative formulation and solution of the distance-based localization problem using combinatorial optimization notions and tools, such as the simulated annealing (SA) technique [27]. It has high robustness against converging to a false local minimum due to the employed technique. The computational cost of the SA approach is however higher.
2.3.8
NBP
Here the problem is formulated as an inference problem on a graphical model. A variant of belief propagation (BP), namely the nonparametric belief propagation (NBP) algorithm [28], is applied. It is an iterative local message exchange algorithm. Normally, localization, formulated as a nonlinear least-squares optimization problem, in the special case that the noise on distance observations is well modeled by a Gaussian distribution, can be relatively well calibrated using an iterative optimization approach. In contrast in this method, which also holds for non-Gaussianity, reformulating localization as stated above allows to apply NBP which serves as an approximation with several advantages: its distributed implementation is easy and a small number of iterations are sufficient to converge. Also, it is capable of providing information about uncertainties and accommodating non-Gaussian distance measurement errors. Non-Gaussian uncertainty is claimed in [28] to be a common occurrence in sensor localization problems. Furthermore, a limit for communications can be set and the accuracy adjusted to that.
2.4
Robotic node localization
Some form of ad-hoc localization also exists in the domain of mobile robotics. The localization problem is similar. One main difference is that mobile robots have (additional) odometric measurements, and that issues such as scalable communication and power consumption are not studied by the robotics community. Robotics also has been studied more in real life environments, instead of simulations.
20
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
The sensor network domain is very much focused on having small nodes and saving energy and appear to be handling things rather approximately on the algorithmical level. In fact the domain of robotic node localization predates the field of sensor networks. While we will not find many distributed algorithms here, we hope to find algorithms that can handle the problem of consistency better, rather than the plain accuracy maximization in the sensor networks field. Two key problems [9] exist in robotics localization: • Odometry is subject to cumulative drift: consider, for example, a robot that starts out near a landmark such as a doorway, and takes an extended journey around the environment. Eventually, it arrives at a doorway again. If there is any significant odometric drift (as there is bound to be, if the environment is large enough), it will be difficult for the robot to tell whether it has arrived at the original doorway, or a different one. Without this information, the robot cannot construct an accurate map. • Natural landmarks can be ambiguous. A way needs to be found to model uncertainty in ‘known’ positions. For instance, to which exact point are measurements made when referring to a large building from different sides? This too might become a problem for us. Localization is an extremely well studied area in mobile robotics. Most early advanced approaches however can be classified [29] as variants on two basic techniques: Kalman filters and Bayesian/Markov. They are very useful in simultaneous localization and map building, but they scale badly: with Kalman filters one needs to invert matrices of order n2 , where n is the number of elements to be localized. With B/M one must maintain probability distributions in 3n dimensions. Another disadvantage is that these methods are suited for mobile application, which means that every new measurement is rather immediately integrated and that previous data are not corrected.
2.4.1
Simultaneous map building and localization
[11] is one of these approaches, an older one, and talks about the chicken-or-the-egg problem concerning map building and localization. They develop an extended Kalman filter (EKF) based localization algorithm, based on algorithms for building a stochastic map of spatial relationships, the computational cost of which they claim is much better. They make an assumption about, among others, Gaussianity and independence of noise disturbances. This Gaussianity is contested in a later paper [28] (albeit in sensor networks). The procedure itself is a so-called “three-stage relocation-fusion approach”. We won’t go much deeper into these two types of methods, but we will mostly talk about methods that relax a mesh iteratively.
2.4.2
Map learning
[30] is one of the earliest works on the problems involved in constructing maps from measurements taken with sensors subject to errors. Unfortunately it deals mostly with the solution of redoing the measurements (e.g., by revisiting) to lower the error, while we are mostly interested in doing the best with what we have. Also the studied algorithm creates a map that is a graph but does not mention distances or distance measurements whatsoever. There is an
2.4. ROBOTIC NODE LOCALIZATION
21
even earlier but rather non-public work [31] by the same authors on the optimization problems related to constructing the most accurate map consistent with a set of measurements. Some types of uncertainty are addressed in [30]: movement or directional uncertainty (general sensor errors such as in odometry), recognition uncertainty (not identifying locations one has seen before) and continuity uncertainty (missing some things on the way, so not ending up with the full picture in the end). Quoting [30] there are two parts involved in map learning: exploration and assimilation. “Exploration involves moving about in the world gathering information, and assimilation involves using that information to construct a useful representation of space. Exploration and assimilation are generally handled in parallel, with assimilation performed incrementally as new information becomes available during exploration.” In our first example, exploration is at the liberty of the user, while assimilation is the work done by the computer to render the users actions useful.
2.4.3
Elastic correction
[32] takes the graph and makes a mechanical analogy with a truss of springs where each route is a spring and each landmark is a node. Errors are corrected as a result of the deformations induced from the forces arising within the structure as inconsistent measurements are taken. The problem of odometry (or drift) errors is specifically mentioned, and could be comparable to the errors on user-estimated distances, including the same directionality error accumulation. The uncertainty on odometry is modeled by the elasticity parameters characterizing the structure. Odometers typically produce both systematic errors, because of hardware, and nonsystematic errors, because of undesired interactions between the robot and the environment such as uneven ground or wheel slippage. The latter are called dead-reckoning errors in [32], referring to the notion of latency hiding in computer game techniques to provide failure transparency. While the systematic errors might be possible to correct, the non-systematic errors cannot be predicted at all, but a normal distribution with null mean is assumed, and the algorithm is used to minimize this overall error. Topological (recognition uncertainty) errors are not considered as the robot (or user) should be able to recognize a landmark he has seen before. The basic procedure works usually as follows, citing [32]: “While exploring the environment, the robot calculates the relative position of each landmark compared to the one met immediately before, by applying deadreckoning. When it meets a landmark it has already seen, self-positioning and error correction are achieved together by combining the new measurements collected with the knowledge accumulated so far.” If the (drift) error on the measurements is not corrected by using measurements from multiple routes repeatedly it will accumulate and the measured position of landmarks using relative positioning will start to diverge very far from their real absolute position, thus leading to an inconsistent mapping of the environment. For the algorithm itself basically the physical equations concerning the linear axial and rotational springs are gathered in a set for the whole structure and the displacements of the nodes are then solved by applying a so-called “stiffness method”. Both distance and angle are used. The drawback is that this approach requires the inversion of a large n-dependent (where n is the number of measurements) matrix making it computationally expensive in large environments or with long measurements. This kind of (non-)scalability is unacceptable for our
22
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
example application.
2.4.4
Mesh relaxation and map consistency
The next few methods all have in common that they want to obtain a globally consistent map by enforcing pairwise geometric relationships between individual range measurements. They have the notion of maintaining relative spatial relationships between locations or coordinate systems. [33] is quite similar to [32]: it handles a maximum likelihood estimation problem and algorithm, in which the relationships between range measurements are treated as random variables, minimizing the Mahalanobis distance. The disadvantage is the 3n × 3n matrix inversion. We also find that the approach is not general enough (except for the maximum likelihood estimation problem itself), since it seems to deal mostly with range scans (position of points in walls and such), and not so much the subsequent positions of the robot itself, which would be most useful to our example. The approach in [6] is claimed to be different from previous work such as [32] in that it is computationally cheap, easy to implement and is guaranteed to find a globally optimal solution. It introduces a fast on-line method of learning globally consistent maps using only local metric information. In fact, all next algorithms will be iterative. The noise model on the position estimates is a Gaussian distribution, with the noise distributed equally in all directions around the position. This way a circle shape can be used. The angle is measured in an absolute way using a compass and thus not from odometry as with the distance. To maintain geometric consistency a relaxation algorithm is used. The basic idea is to iteratively pick each node in turn and then move it to where its neighbors think it should be. The convergence to a global optimum is proven by making the analogy with energy minimization in a spring system. The previous robotic approaches using relaxation are generalized in [29] into a formalism for generalized localization. In this way it could be very useful for adaption to our first example application. The generalization lies in the fact that instead of the pose of one object (e.g., a robot), now the pose of all elements (e.g., sensor nodes, or photographed spots) in a network are determined relative to a global coordinate system. In this way the paper forms a bridge between the domains of robotics and of distributed sensor networks. When considering only static beacon-sensing elements, each measurement imposes a constraint on the relative pose of two network elements. With a set of such measurements, the problem is to find a set of global poses such that these constraints are satisfied. This is what concerns the typical mesh relaxation problem for static elements, comparable to nodes in sensor networks. For dynamic localization, furthermore, where elements are mobile and in addition equipped with a motion sensor (such as odometry), each network element is represented by a series of bodies, each of which describes the pose of that element at a particular time. By finding the lowest energy configuration this time, we can determine the global pose of all network elements at all times, as would be useful in our first example or in the case of robotics. It not only improves the current position, but also all the previous position measurements. This means that the more you measure, the more precise all your measurements will become, generally speaking, as is also shown with CLAM or multi-SLAM (see Section 2.4.5) approaches. Allthough similar to the previous approaches, [29] considers localization as a coordinate transformation: every entity defines a local coordinate system and every measurement defines
2.4. ROBOTIC NODE LOCALIZATION
23
a relationship between local coordinate systems. The aim is to find the set of coordinate transformations that are consistent with these relationships. In the absence of uncertainty it is possible to solve equations concerning these coordinate transformations directly. Real systems have uncertainty, however, which is the motivation for the mesh-based approach. All spring analogies can be equally thought of as forms of maximum likelihood estimations with gradient descent. Being a maximum likelihood estimator, the mesh-based approach is general and robust. Moreover it is scalable, namely linear with the number of elements being localized. The algorithm is claimed to be simple by the authors, about 100 lines of C code. Their results seem to deliver highly rectilinear maps, which matches their real environment, even if rectilinear environments were never assumed in the formalism. Also odometric drift is being corrected very clearly, and odometric drift is a systematic bias while the formalism assumes unbiased uncertainty. Some small downsides are that for dynamic problems the size of the mesh will grow linearly in time (O(n)) and that it assumes beacons have unique identities (as to the anonymous beacons problem). To solve this they suggest that a combinatorial layer could be added on top of the algorithm, looking at multiple meshes with only unique beacons and sorting out the most probable mesh. While this particular method looks promising if these conclusions hold, their precision (and consistency) might still be too low for use in our first example. Distributed mesh calculations also need to be looked into, but the following solution is handed to us by the authors: the mesh can be divided into regions, each of which is assigned to a separate computer, such that they need only communicate about the elements at the boundaries of their regions, thus improving the scalability. In contrast to [6], the authors of [29] don’t consider it proven that convergence will be to a global instead of a local optimum, and they call for further investigation.
2.4.5
Cooperative methods, such as for our second example
CLAM Cooperative Localization and Mapping (CLAM, often also called multi-SLAM) [9] involves an extension to SLAM (Simultaneous Localization and Mapping, as in Section 2.4.1), where two or more robots cooperate to build a map of the environment. The goal is to achieve increased accuracy, rather than increased speed. The two robots play the roles of explorer and observer, as explained before (Section 2.1.2). In the first strategy these roles are fixed, in the second they swap roles in turn. The robots stay within sight of each other. A Bayesian inference technique is used to determine the occupancy state of each cell on a rectangular grid global occupancy map (a cell is either occupied, non-occupied or unknown), based on the range data provided by each robot and then fused. The second strategy proves to be the better because of stronger constraints. While this paper contains some good ideas for our second example application, it does not provide an improvement for the approaches we have already seen for the first example application. Synergetic localization for groups of mobile robots [34] is another Kalman filter approach, but adjusted to allow it to be decentralized and used in a distributed multi-robot localization algorithm. In this case ‘distributed’ refers to spreading the computation over several robots that cooperate. Here too it is unclear whether
24
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
their solution can be translated to an approach without exteroceptive sensors. Also, while the algorithm is decentralized, it is not really distributed as such. The computations are divided over the robots, but the data in the computations may not be related or local.
2.5
Conclusion
Algorithm Convex opt. SDP DV-distance Savarese Savvides MDS Sectoring (SA) NBP Simultaneous (Map learning) Elastic correction Range scan alignment (MR1) Map consistency (MR2) MR3 (CLAM) (Synergetic loc.)
Distr. pot. ◦ (?)
Order
Accuracy Tol.
? (? ? ?)
◦
??
? ◦ ? (? ? ?) ?? ?? ◦ ? ?
? ?? ?? ??? ? ? ??, +?
?? ?? ?? ??? ? ?? ??? ?
? ?
??? ? ? ??
?
???
?? (? ? ?)
???
◦ (?)
??? ? ?
? ? ?? ???
??? ??? ??? ◦ (?) ? ?? ◦ ◦ ◦ ◦
??
? ? ??
Anchor nodes 20%/◦ (??) 15%/? 10%/? 20%/? 5%/◦ 5%/??
Properties orient.
local local orient.
N/A N/A N/A N/A
local orient. orient. orient. orient.
?
N/A
local, orient.
??
N/A N/A N/A
local, orient. coop. coop.
Table 2.1: Overview of the algorithms. The column titles in full read: the name of the algorithm matching the previous sections, potential to make it distributed, the computational complexity order, the accuracy, the tolerance to bad measurements, the strictness of the anchor nodes, and finally some other properties. Table 2.1 shows an overview of the algorithms. This overview is only indicatively correct. For instance comparing accuracies is very difficult and not much data is available. Especially the specificities of the algorithms are hard to capture in one table. The top half are the algorithms for ad-hoc sensor network localization, the bottom half for robotic node localization. The first column lists the algorithms in order as described above. Algorithm names completely between brackets are algorithms that are less complete in the table, since less comparable. “MR” stands for the Mesh relaxation algorithms (1, 2 and 3) in order. Next columns give the criteria: possibility to make it distributed or already distributed, the order (scaling, efficiency), the accuracy, the tolerance to bad measurements. The fifth column first shows the targeted proportion of anchor nodes and then if the anchor nodes can be placed freely or are more constrained (mostly placing at edges or not). The last column contains the properties: “local” means that the algorithm has a local impact, “orient.” means orientation information (bearing or odometry) is used, and “coop.” means it is a cooperative method (for more than one robot). More stars indicate a better score. Marks indicated between brackets
2.5. CONCLUSION
25
are better marks achieved through some extension or special case. For the order ? ? ? more or less corresponds to O(n), ?? corresponds to O(n2 ) and ? to O(n3 ). The order (scaling) of DV-distance and “Savarese” mostly concerns network routing. Some algorithms are more accurate because they also use bearing or odometry information (indicated by “orient˙’’). Anchor nodes are not applicable for the bottom half. The expected connectivity (how many neighbours) is many (> 10) for the top half methods, and few (about 2) for the bottom half. Empty fields are unknown or not applicable. Although localization is an old problem, further research is necessary to make it really usable for ubiquitous computing. In particular, for our simple and obvious but apparently fresh application example, there is considerable difficulty of finding an approach that translates easily (or does not even require translation at all) and that meets all our requirements to some degree at the same time. As the overview shows the different methods are quite scattered and have lots of different focal points. Some excel at some criteria but are inadequate for the rest of the criteria. Two extremely complex techniques, MDS and NBP, can work with and improve the more inaccurate measurements. On the accuracy of the results however, there is no proof of real big differences in general cases or these results are not known. Only four algorithms have a computational complexity that approaches O(n). But most conclusive is the potential for a distributed implementation. The algorithms of Savarese et al. and Savvides et al. are distributed and could be interesting for establishing more sophisticated communication protocols to improve the decentralized communication efficiency. They look into such problems as error propagation and oscillations. Clearly, a method which also has direction information is more accurate. We are not only looking for a distributed algorithm but also one that has the local impact property (so scalability is in check and nodes can easily be added later). This brings us to the second mesh relaxation (map consistency) algorithm. The predecessing robotic node localization algorithms have developed the idea that is used here, but the algorithm in question is the first that is eligible for a distributed adaptation. We will examine this algorithm further in the next chapter. The third mesh relaxation algorithm loses much of this advantage, but seems superior and more general for some other criteria, and therefore could be a good successor if the distributedness can be assured.
26
CHAPTER 2. LOCALIZATION WITH USER DISTANCE MEASUREMENTS
Chapter 3
Distributed photo capture In this chapter we specifically address the problem in our example application for distributed photo capture. Sections 3.1 and 3.2 recapture and elaborate on the problem, objectives and the scope of our application. Section 3.3 discusses the representational choices. Finally, Section 3.4 proposes our algorithm for the prototype solution of next chapter. The algorithm is based on and adapted from the relaxation algorithm [6] from the domain of robotic node localization from Chapter 2. In contrast with the more general views on the domain in the previous chapter, this chapter aims more particularly at the problem and result central to this thesis.
3.1
Problem and objectives
While multiple applications would benefit from a solution for this kind of problem, in the prototype implementation we will exclusively deal with the needs of our example application from Section 2.1.1 about a photo system in a two-dimensional world (no generalization to include the height/level aspect). Multiple users can collaborate or work exchangeably on the addition of the photos to the application. Ensuring coherence in distributed capture is a very novel topic. Related work on the temporal aspect (merging in a coherent manner multimedia fragments captured collaboratively by a set of devices) is in progress at INRIA Rennes in the Ambient Computing and Embedded Systems research group [3]. This thesis is an initial attempt to tackle the spatial/geographic aspect. The term “distributed” here alludes to the fact that a set of photos, shot at different locations, all can show one place or object. The coherence issue here is that if they do so in reality, then they should recreate the experience of that distributed view of the one place or object on the virtual map they’re placed on. The goal of our solution is improving the coherence, and error diffusion is the means. The goal is to theoretically devise and then in practice implement a prototype that can be used as in the example. For this we need an algorithm. This algorithm should be decentralized so the computation can be done in a distributed system. In the next chapter the working prototype application is developed and implemented on the Android platform emulator that mimics a mobile device (see Section 4.4). 27
28
3.2
CHAPTER 3. DISTRIBUTED PHOTO CAPTURE
Scope
The size of the field and the number of (related) aspects requires us to narrow down the scope of our work. The following are out of scope: Offline method Rather than an off-line correction method that can be applied to a whole collection of nodes at once and where optimality has priority, our work is targeted at an on-line method: as a continuous and incremental method, corrections are applied to a dynamically changing collection of nodes, with no guarantee of an optimal solution but with a concentrated effort on trying to limit the influence (range) of the corrections. Photo images The sharing of photos (of the photo data itself) is not considered. The localization of photos is important to us (i.e. correcting and sharing the information on position and direction). We assume photos are present and shared between all involved users. In practice this could be done using a distributed filesystem or some peer-to-peer filetransfer system. Functionality to take photos with an on-board camera is present in Android and can be easily substituted in the future for the photo browser that is currently used. However, since we lack a real device this would not be convenient. Localization technology We also dissociate ourselves from a particular choice of a localization technology (that is, a technical absolute estimation device with high accuracy). We are concerned with the localization technique but not the technology used to acquire absolute estimates. While the problem is centered around the distance estimates of users, it is clear that also some absolute references are necessary. In some situations GPS is certainly a possibility. Android has functionality for built-in location sensors such as GPS and compasses and this can be added quite easily should one want to. Again, since the emulator is used, this would not be convenient. Another option is using map estimation (i.e., indicating on a (satellite) map), which can also serve as a supplement to GPS. We lack a technique with very high accuracy to obtain a proportion of well-known locations and this is out of the scope of this work. Modification or deletion of photos We concentrate our effort only on showing that it is feasible to add data coherently, rather than implementing advanced modification and deletion functionality, which evidently comes with synchronization problems, that are again not unique to this application. Users will be able to add photos, but not modify or delete them. Discovery and communication protocol Also, discovery techniques, in a distributed application’s sense, to find other users of the same service, were not considered. As a substitute discovery is being done in practice by a user giving the list of other people to contact. No proper low-level communication protocol will be developed but the usage of an existing (albeit less appropriate system) is projected. The application will be coupled to the user, but it will be hard for a user to move the database of stored information to another device, therefore coupling will also be with the personal device in some way. Distributed operation We clearly lack a real distributed and mobile environment. But this is solved in a creative way, as is shown in the next chapter, partly by using some features of the Android platform. The goal is to develop a theoretical solution for this kind of environment, however it is not actually the aim to have a working prototype
3.3. REPRESENTATION
29
for such a distributed environment. Our solution should be on a par in design, but it cannot be expected that a fully-fledged distributed system is realised on any existing mobile platform. While having theoretical scalability is very important, constructing a practical implementation that lives up to this is very hard, and testing would be even harder. We lack the resources to properly test the usage of more than one instance of the application at a time. Failure handling, as in distributed systems, will not be included. Geographical details While we will make use of geographical terminology and basic techniques, this work does not focus on the subtleties of geographic information systems. We will be using geographical world coordinates so that the solution is shown using absolute coordinates, however the finesses of this other domain are out of the scope of this work. In particular coordinates near the poles or near the date line (where negative coordinates turn into positive coordinates) will not be taken into account. Also plain relational databases will be used and not the more advanced known techniques such as geodatabases. Formal performance evaluation Since our main priority is to show the feasibility of the prototype, a formal evaluation of the performance is only of minor importance to us at this stage. It would require us to set up a specific structure for measuring, including wellchosen problems and reference solutions. No standardized problems exist in literature, so comparing would be hard in any case. User interface The user interface will be written in English. The user interface will not receive any particular attention in terms of usability. Related aspects As mentioned before, aspects such as administration, permissions, authentication, security or rights management, as well as many other non-considered criteria and challenges from Chapter 2, neither will be looked into this time.
3.3
Representation
This section illustrates how solving coherence issues in localization also establishes coherence in distributed capture.
3.3.1
Viewpoints
How will we relate photos to locations? One way is to directly couple the location of the main object in the photo with the photo itself. This is useful for contextual relating because the object-centeredness allows to group several photos of the same object, for instance photos that are rotated around the object in respect. However, there is no easy way to determine what the main object is. Inside a building a photo of an object will likely be a close-up. It might be difficult to find this object by its photo. Outside, it could be a view from afar of a touristic landmark. But the landmark could look different from different sides and so its connected photos would all be different and one photo by itself would give unclear location information. Also all the photos having this landmark would be inserted on one spot, and no photos would surround it, so there is no easy way to follow photos to get to the landmark. It
30
CHAPTER 3. DISTRIBUTED PHOTO CAPTURE
is difficult to spatially map out the photos, as it is difficult to calculate distance to an object using vision techniques. In fact the photos we expect are not about objects at all, they are about views from different viewpoints. This is a different manner. The viewpoints we talk about are the positions of the camera when the photo was taken. We take the location of the camera or cameraman and link this to the photo. Rather than an inward and centralized view, this gives an outward and scattered view on the environment with photos “looking around”. In the interior photos contain walls and doors or views of corridors, outside they show the scenery from that point. With such a set of photos it is possible to create a walkthrough experience for the user, much like he himself would see with his own eyes. Grouping of photos can be done based on the camera viewpoint. So when the camera is rotated photos are added in the same group. Moving to another group happens when the cameraman (the user who inputs the photos) walks around. Disadvantages here are of course the lack of possibility to insert just any photo coherently into the system. Photos of close-ups or photos not taken straight at eye-level are discouraged. In short, photos are showing (a view of) a place rather than an object.
3.3.2
Nodes
There are some points or places (the viewpoints) of which we want to determine the position in GPS-like coordinates. We will call these points nodes 1 .
3.3.3
Links
Some of these nodes’ positions will be determined using GPS or pointing on a map. Others will be estimated by the user from another node. Thus, there are two kinds of estimates, which we will call links, that give information on the positions of the nodes and topologically place or connect the nodes: Relative links Relative links in the simple form are user estimates of distance and absolute angle (versus North for instance) from one node to the next. Which kind of angles needs to be estimated is left unspecified at first. The first version of the algorithm uses just one distance and one absolute angular measurement. Of course multiple links can be made. The links are bidirectional. Absolute links Absolute links are considered local metric information with very little inaccuracy on known positions. High accuracy is needed for several reasons matching those of anchor nodes in sensor networks. The more known nodes, the fewer real correcting work needs to be done (supposing that there are no conflicts). In any case a sufficient proportion is needed for the problem to be feasible. And if these known nodes have a small error on their position, then it will be virtually impossible to get a smaller error than this for nearby nodes. Also, a couple of nodes with absolute coordinates are needed to be able to transfer everything into absolute coordinates. 1
As in nodes of a graph, not the sort of node devices from ad-hoc sensor networks
3.4. ALGORITHM
3.4
31
Algorithm
The simple version of the algorithm is closely based on the relaxation algorithm described in [6] and as the second algorithm in Section 2.4.4 from the field of robotic node localization. Here too the purpose is to maintain a coherent representation of the environment and to allow reconciliation with future estimates. However we will be using it with different data in a different situation. Also distributedness is an additional goal here. The algorithm still relies to a great extent on the information provided by the users, but tries to use as much of the information as possible to come up with the best possible solution. Making sure that the error is distributed and not extremely located in one place requires lots of data being present and will only then make the map truly coherent. Also, taking in the information from each new estimate and applying it locally, is the quickest way to improve the coherence and also makes distributed calculation possible. In [6] a Gaussian noise distribution is assumed in all directions around the positions of the nodes so a circular probability area is achieved and a single variance measure can be used. The variance measure typically is taken to be proportional to the estimate distance as the variance represents the uncertainty of the measurement. A covariant noise model (form of an ellipse) was found to be unnecessary [6]. The general idea behind the relaxation algorithm is based on the mesh of springs idea, where coherence is attained for optimization to the lowest energy in the mesh. However the algorithm does not solve a system of physics-like equations but works iteratively. The principle is to handle each node in turn, determine its position through the links from each of its topological neighbours, and move it there (“where its neighbours think it should be”). By repeating this the positions will converge to a globally optimal solution.
3.4.1
Motivation
While this algorithm lacks some of the advanced techniques to improve coherence, it is one of the few that allow to create a lightweight, simple and distributed system. Its skill level can be described as plain accuracy maximization, as found in many of the sensor node algorithms, yet at a higher level because it tries to use all the information available, without approximating. Its merit is its capacity to be adapted for decentralized calculation, while remaining rather straightforward to implement. Also adaptations can be brought about to make it more specific to the photo capturing aspect (for instance to the turn the photos as well next to positioning them).
3.4.2
Algorithm, first version
The algorithm, consists of two steps for each iteration. Both steps are performed for each node i in turn, within one iteration. The order in which the nodes should be updated is not discussed in [6] but is not necessarily unimportant. Step 1 For the current node i and each neighbouring node j, between which there is a relative link 0 ) for node i using the with distance dji and absolute angle θji , a position estimate (x0ji , yji position (xj , yj ) of the neighbouring node j is calculated. The used technique to calculate the
32
CHAPTER 3. DISTRIBUTED PHOTO CAPTURE
position can be seen as a combination of multilateration and multiangulation. x0ji = xj + dji cos θji
(3.1a)
0 yji
(3.1b)
= yj + dji sin θji
In the same way the variance vji on the position of the current node i in this estimate is calculated from the variance vj of each neighbouring node j and the variance of the link between the nodes uji . (3.2) vji = vj + uji Step 2 In the second step the position (xi , yi ) of the current node i is updated with a weighted 0 ) made from all of the neighbours j of the previous average of the position estimates (x0ji , yji step. The purpose is to produce a new position (xi , yi ) for node i that in the end (after looping i and iterating) is more suitable. First, the new variance vi for node i is accumulated from the variances in the estimates vji , so the inverted variances in the estimates can be used as the weights for the average. P0 Here, j is the sum over the neighbours j of node i (excluding j = i). In practice, this is done in a loop over j, constantly adding to a temporary value 1/vi , and then inverting this to get vi . 0 X 1 1 = (3.3) vi vji j
Next, also in a loop over j for the summation, the position is calculated and updated. 0
xi =
X x0ji vi j
vji
(3.4a)
0
yi =
0 v X yji i j
3.4.3
vji
(3.4b)
Boundary conditions
Links that are absolute, and thus not between two nodes, are considered to contain perfect information free of doubt (which is an idealization of reality) in this version of the algorithm. Such absolute links were never considered in the original algorithm. The information is applied by resetting the newly calculated position and variance of the implied nodes to respectively the provided position and zero in between each of the steps. It is useful to do this for instance for the first and last node, but it is not strictly necessary. Of course, to be able to calculate absolute positions for every node, at least one such absolute link is necessary. If the order in which nodes are processed starts from such a node and then proceeds with its neighbours etc., then convergence will be faster in general (assuming that nodes are in more or less a chain formation). For a real mesh structure this is not trivial and convergence will not be predictable (but unlikely as fast as in the latter case).
3.4. ALGORITHM
3.4.4
33
Stopping criterion
In [6] the total change in positions falling below some threshold is given as a possible stopping criterion. However in this and the next of our versions, in view of the distributed application, it is better to stop for each node when no more neighbouring nodes have had significant updates to their position. If this is the case then calculating the position for the node again would have no significant influence since the used data has nearly not changed. The fact that the position update difference for one node drops below a certain threshold still is the stopping criterion in a way. If this node is recalculated but not corrected much, while its neighbours are still updating significantly, and if this node never really gets corrected anymore, so it is not this node that is asking its neighbours to update, then this will bring the algorithm to a halt eventually under normal circumstances.
3.4.5
Locality
The algorithm can work locally. If this proposed stopping criterion is not too strict then an update in the position of a current node will not ripple through more than a couple of neighbours (with the change in position getting smaller and smaller), given that the mesh is rigid enough. This all depends on how many accurately known locations there are and how high the connectivity is. A high connectivity (many direct neighbours) increases the accuracy of the current node’s position. An absolute link makes the node’s position perfectly accurate (nearly perfectly accurate, in the next version). However, a high connectivity throughout the whole mesh makes the convergence speed slower because of more involved computations. Accuracy comes at the cost of speed. The same goes for a too strict stopping criterion, which could make changes propagate themselves nearly endlessly and would destroy the convergence speed.
3.4.6
Possibility for distributed calculation
This locality property makes it possible to theoretically implement the algorithm in a distributed fashion. Calculations can be executed per node, on the device that “owns” the node. If a node gets updated and the change is noteworthy, neighbouring nodes can be set to be updated too in case they are on the same device, and an update request can be sent out to the other neighbouring nodes on other devices.
3.4.7
Real-time insertion
This local impact property also makes it possible to add a node along the way and without incurring changes in the whole system (but only in a small area surrounding the node). The dynamism in this algorithm allows for new nodes to be added at any time (except in the steps of one running core calculation of course). Its presence should be announced much like update requests. The device that added the node should take responsibility for it and its calculations. This means that when the device is switched off afterwards no non-cached information can be requested and no updates can be made to that node, as is inevitable in a distributed system. Because effects of change or addition are local the insertion is also fast. There is no need to recalculate the whole system of nodes. One iteration of the algorithm for the new node is
34
CHAPTER 3. DISTRIBUTED PHOTO CAPTURE
sufficient for a first roughly nearby position. Still, it might take quite a bit more to stabilize the effect of the insertion.
3.4.8
Proof of convergence
By referring to the mesh of springs analogy, the authors of [6] show that each update to any node will always result in a decrease in energy. Because the energy function moreover is both bounded at zero and quadratic, they prove that there is always convergence to a global optimum (minimum in energy) for their specific version of the algorithm. However, it is not proven that the algorithm as stated above (with for instance the additional boundary conditions) always converges. Implementation without slower convergence is also difficult if real world coordinates are used, rather than some artificial Cartesian coordinates.
3.4.9
Algorithm, second version
The first version of the algorithm was implemented in a Mathematica [7] example as a feasibility study. In the second version functionality is added to take in account the fact that not only the viewpoints of the photos must be made coherent but also the photos themselves. Since a photo needs to be oriented as well, more angular data is added. The distributedness is looked at more explicitly this time. Instead of starting from the main loop that runs over all nodes, we look at the calculations at one node in particular, and any necessary communications to the neighbours that are involved. The basis of the algorithm is the same and goes as follows: first of all, the algorithm is started on the node i because either it was newly created or because some data from one of its neighbours got corrected significantly enough to also justify a correction run for this node. For the node i, in the same way as in Section 3.4.2, the neighbours are asked to estimate the position. The most recent information about the neighbours (their position and variance) got sent by the neighbours when it changed (or when this node was newly created since it needed to be presented to its neighbours). We adapt the definition of relative links so that they connect photos instead of general nodes. Each relative link now has a distance and two relative angles (one for the link in respect to the “startphoto”, and one for the “endphoto” in respect to the link) for reasons explained in Chapter 4 in Section 4.2. A consequence is that the position calculated by estimating depends on the orientation of the photo at the neighbouring node. This is only logical: if the neighbour’s position gets corrected and its photo gets turned a bit to the left, then the position of this node should not only translate with the neighbour’s position, but should also slightly rotate counterclockwise (to the left) around it. The absolute links are no longer forcefully implied in the calculation of a node’s position as was the case previously in Section 3.4.3. Moreover, analogous to relative links, the absolute links are considered to deal with photos rather than nodes. Since there can be more than one absolute link involved per photo/node now, in this step we should speak of (the more abstract) link j instead of neighbouring node j. Absolute links are given a variance too. But since there is no neighbouring node on the other side the variance vji only has this one component. Apart from this difference (and apart from the fact that they will not transport any requests for neighbour updates) they are incorporated into the calculations of the algorithm just like the relative links.
3.5. CONCLUSION
35
The second step (for updating) is also similar to Section 3.4.2, except that there is a part appended to the step. After the second step the node’s position is corrected, but then it is still necessary to turn the photo to be as much as possible in correspondence with the information in the links. For each photo p on node i the following step is done: for each link 0 is calculated as it should to that photo j, the absolute angle (orientation) of the photo αjp be to be perfectly in line with the information within the link. Next, the angle is averaged with weights (according to the variances) for all these links and the photo is turned to this absolute angle αp . As with averaging vectors starting from one point there is the possibility that the different angles cancel each other out in the calculation of the average. Take for example the average of an angle of 0 degrees and of an angle of 180 degrees. Rather than 90 degrees or -90 degrees, as one might answer too quickly, the result is unsolvable. If it is looked upon as two vectors of equal length pointing in opposite direction, then it becomes obvious that the resultant vector of the sum of both is a vector in any direction with zero length. For this reason the average angle αp is calculated with the x and y components of the 0 . estimated angles αjp 0
αpx
=
0 )v X cos (αjp i j
vji
(3.5a)
0
αpy
=
0 )v X sin (αjp i j
vji
(3.5b)
An arctangent is used to calculate the average angle from its components (converting rectangular coordinates to polar). Actually just an arctangent does not suffice, the correct quadrant also needs to be determined (as is done in Java’s “Math.atan2()” method). αp = arctan (
αpy ) αpx
(3.6)
In case αpx and αpy are very small (and the zero length vector problem has occurred), by design, the angle calculation is skipped and the photo’s orientation just is not corrected at all. This would only happen very exceptionally and clearly there would not be enough or correct information to improve the coherence anyway. Since a photo at some time had one link to it, namely at creation, and therefore always had a valid angle at some point, the problem can be solved in this way, because the old orientation is kept. As already mentioned in Section 3.4.4, the stopping criterion per node now decides if the update should be propagated to the neighbours or not.
3.5
Conclusion
Ensuring coherence in distributed capture is a very novel topic. We concentrate on the coherence problem itself. Several side-issues and typical topics in distributed systems are beyond the scope of this thesis. A good way for the problem to be represented was considered. A new algorithm was translated from a different domain and proposed for this application field: the mesh relaxation algorithm [6] from Section 2.4.4 was chosen mainly because it is lightweight, simple and was a candidate for an adaptation to decentralized calculation. Its
36
CHAPTER 3. DISTRIBUTED PHOTO CAPTURE
effectiveness can be described as plain accuracy maximization. The algorithm has the locality property. This makes it possible to be used in a distributed system. This form, together with adaptations to the specific photo capture application, was presented as a second version of the algorithm. This version will be implemented in a prototype system in Chapter 4.
Chapter 4
Implementation: Geometer In this chapter we will document the process of implementation of a prototype distributed photo capture application, named “Geometer”. Section 4.1 starts with an analysis of the requirements, both functional and non-functional, that are limited by the indicated constraints. The analysis step continues in Section 4.2 by giving an overview of the use cases the system should accomplish, including one elaborate example. In Section 4.3 a domain model is drawn up to discover and work out the most important concepts. The application will be developed on Android, a platform for mobile devices. This platform is presented in Section 4.4. Describing the design phase, Section 4.5 elaborates on the design class diagram. Some peculiarities and choices of the implementation are discussed in Section 4.6. Finally, the implementation and the results are considered in Section 4.7 and the remaining issues, both in the platform, the algorithm and the application, are discussed.
4.1
Requirements analysis
Functional requirements Functional requirements determine the functionality that the system needs to offer. • A user should be able to add photos he took at location to the system. • The locations of the camera viewpoints from which these photos where taken should be determined by – estimations the user makes on a map, and – distance and angle estimations the user makes in between camera viewpoints. • The user should be able to view and walk through the photos, including photos added by other selected users on other devices in the distributed system. • More estimations can be added to existing photos, also to photos by other users. Constraints The following constraints implicate a number of non-functional requirements and therefore are important in the analysis. Some constraints are imposed because of the selected algorithm, others because of the platform and the virtual or (in the future) physical device. 37
38
CHAPTER 4. IMPLEMENTATION: GEOMETER
Imposed by the algorithm: The employed algorithm will be our second version adapted mesh relaxation algorithm. It requires a decentralized implementation and the locality property. It also implies some representations. Imposed by the platform (and the device): The platform will be Android as explained in detail in Section 4.4. [35] contains a list of efficiency optimizations for code written for Android. The particular context of embedded devices makes Java code that is different from conventional Java code a necessity most of the time. Examples are: no abundant object creation, less general code (this boils down to saying what is actually done, such as specifying the exact type and not the generalized type), no internal getters/setters, using local variables, and avoiding floating point numbers. Some of these unfortunately conflict to some extent with and should carefully be weighed against conventional good practice. Also, Android (and the Dalvik VM) does not fully implement all aspects of regular Java but rather is a subset (a different but comparable subset as J2ME). Not all external libraries are currently compatible with Android. The choice of Android determines the heterogeneity and openness of the system. The Android middleware provides heterogeneity to some extent, but is rather limited. Android’s low coupling helps the openness, but the full focus on an open design is out of scope. The usage of a mobile device disadvantages the communications delay, and therefore negatively impacts the performance. The user interface should allow for different types of input as typically seen in mobile devices: a touchscreen with (multiple-)finger or stylus input, mobile phone buttons, a joystick, small buttons or a virtual keyboard, and since the emulator is used, also a PC mouse and keyboard. Non-functional requirements Implementation The application should make use of Android, and its API’s (application programming interfaces), and as much as possible comply with the (de facto) standards. This includes living up to the particular Android application anatomy [36] and design philosophy [37]. Reliability User inputs should be checked on validity. Performance While it is very hard to attain nearly-instant responsiveness and results, especially because of data that needs to be fetched from other devices, it should be possible for users to add and browse photos in the time they need to walk from one spot to the next. Android imposes a 5 second input event timeout in the user interface that should strictly be respected. Preferably the responsiveness does not increase with the number of entries in the system or in the view. Adaptability The application should be loosely coupled so that parts can be swapped in or out. Future possible changes are providing real camera or localization interfaces. Also, to be consistent with the architecture of Android, it should be possible to add new views and register alternative intent handlers.
4.2. USE CASES
4.2
39
Use cases
Use cases help to understand the way users will interact with the application. The set of identified use cases is shown in Figure 4.1. The text of the “Add a distance estimate” use case is included below as a sample.
Figure 4.1: Use case diagram.
Use case 1: Add a distance estimate Scope: Geometer distributed photo capture application Primary actor: User Stakeholders and interests: • User: Adds photos and other information into the system and browses these and other photos and information. • Other Users: Simultaneously browse the photos and the information (what the User adds instantly also becomes visible and accessible to them when completed). Brief description: The User adds a distance estimate. This can also be an estimate of a rotation. The User starts at a position of an existing photo and then moves (and/or turns) to a new position. Here the User either takes a new picture and adds this as a new photo, or uses the existing photo that is already added at this position. The user makes an estimate of the distance and angles between these two photos. This information (and the new photos) is added to Geometer. Preconditions: The application is running on the User’s device, and optionally on Other Users’ devices too.
40
CHAPTER 4. IMPLEMENTATION: GEOMETER
Success guarantee (postconditions): Estimate is added. If it was an estimate to a new photo then a new photo is added as well. This information becomes browsable for the User and the Other Users. Main success scenario (basic flow): 1. User selects the existing photo (and its node) from where he will make the estimate. User is standing at that photo’s position. (a) Include use case “Select photo”. 2. User requests to add a distance estimate. 3. Geometer observes the currently selected photo and asks to designate the photo to which the estimate is made. 4. User moves from that position to the new position and estimates distance and the angle in which he started to walk off (relative to the direction in which the previous picture was taken). 5. User takes a new picture with his camera and estimates the angle of taking the picture (relative to the direction he was moving in). 6. User adds and designates the new photo. (a) Include use case “Add photo”. 7. Geometer asks for the three estimate values. 8. User enters the three estimate values he made. 9. Geometer validates the inputs. (a) The inputs are incorrect. i. Geometer notifies the User. ii. Go to step 7. 10. Geometer records the estimate. 11. Geometer shows and selects the photo to which the estimate was made. Extensions (alternative flows): *A At any time, User goes on with something else and does not complete operation: (a) Geometer does not record anything. 5-6A User makes an estimate to an existing photo: (a) User looks up and selects the existing photo (and its node) to which he will make the estimate. i. Include use case “Select photo”.
4.2. USE CASES
41
(b) User verifies that his new position is the same position from where the photo was taken. User estimates the angle at which the picture was taken in the past (relative to the direction he was moving). (c) User designates this photo. 4-8A User makes an estimate to a new photo at the same position but in another direction (rotated): (a) User rotates, estimates the angle of this rotation, and takes a new picture with his camera. (b) User adds and designates the new photo. i. Include use case “Add photo”. (c) Geometer asks for the estimate value (rotation angle). (d) User enters the estimate value he made.
Figure 4.2: Domain model.
42
4.3
CHAPTER 4. IMPLEMENTATION: GEOMETER
Domain model
Figure 4.2 is the domain model showing noteworthy domain concepts. It was mainly used as a source of inspiration for the design process.
4.4
The Android platform
Quoting [8] the Android platform is “a software stack for mobile devices including an operating system, middleware and key applications. Developers can create applications for the platform using the Android SDK (software development kit). Applications are written using the Java programming language and run on Dalvik, a custom virtual machine (VM) designed for embedded use which runs on top of a Linux kernel”. Android is being developed by The Open Handset Alliance, a group of more than 30 technology and mobile companies, with Google as an unofficial frontrunner. Instead of Google choosing the path of a single proprietary mobile phone product line (previously coined the “gPhone” by the press), it has opted to pursue an open platform, which can be used by both individual developers and mobile device manufacturers. A rather complete first SDK version (containing an emulator) was released early November 2007, with several updates since then. Moreover real devices are expected to be released for mass consumption in a (few) year’s time.
4.4.1
Motivation
Android allows developers to develop with a mobile device in mind, not only performance-wise but also in user interface and user experience. These devices are envisioned as always-on networking devices with elaborate network and sensing capabilities and web technologies. This, and the fact that Google is today’s stronghold in location aware applications and services, made it a likely choice. Its feature list lends itself for our application and its philosophy matches the envisioned use of our application. The platform is open and accessible, and uses the familiar Java language. There is no need to be in possession of a device since an emulator exists.
4.4.2
Emulator
For now, an emulator is provided that closely matches how a real device would look. This emulates the entire device and runs the operating system (from the linux kernel and up). The performance of running the emulator on a modern PC is said to be comparable or slower than it would be on a real device. This is known since some prototypes have already been demonstrated. Network performance can be set to mimic a mobile device’s limited network capabilities (both in speed and delay), allthough we will assume no limitations larger than WiFi or 3G/UMTS/HSDPA. The emulator also mimics some other features, such as an SD card (with an image file), a camera (by not having any real input) or a GPS-receiver (with a prerecorded route file). But most network features are real and live, such as looking up geographical data and maps, sending messages, . . .
4.4. THE ANDROID PLATFORM
4.4.3
43
Architecture
As described above, a stack (as shown in Figure 4.3) is specified including an operating system, middleware, up to an adapted Java virtual machine optimized for mobile devices.
Figure 4.3: Major components of the Android operating system. [8]
4.4.4
API’s and software
Device Rather than just a mobile phone, a personal and powerful mobile device is envisaged. Modern network technologies, comprising Bluetooth, EDGE, 3G, and WiFi will be available depending on hardware. Also interesting are the optional presence of a GPS-receiver, a compass and an accelerometer depending on hardware. The Location-Based Services (LBS) API allows software to obtain the phone’s current location. This includes locations obtained from the Global Positioning System (GPS) satellite constellation, but it is not limited to that. Maps Android contains an optional API (i.e., licensable) to display and use Google Maps. Satellite maps and street plans, including traffic info, can be scrolled. Custom overlays can be added. Addresses and other landmarks can be looked up. Under the hood, there are some auxiliary functions that help to handle the maps and deal with coordinates. XMPP Another optional API gives you XMPP messaging (Extensible Messaging and Presence Protocol, often associated with Jabber or GTalk). Quoting [8] “applications will frequently need
44
CHAPTER 4. IMPLEMENTATION: GEOMETER
to communicate between devices. For instance, you might wish to send messages back and forth between two devices, to implement an interactive game of checkers”. This communication passes through a central server and is not peer-to-peer. Neither is it particularly suited for sending data at a high pace and high bandwidth between applications on a large scale. Because of its disregard of power consumption it is not a protocol to consider for an actual implementation with distributed communication over mobile devices. However, it is the most suitable form of communication offered to us as a standard within the platform in lack of a proper communication protocol. In theory everyone can freely set up an XMPP server. An advantage of XMPP is that communication passes between users and not between devices, and in the end it is the users’ information that is in the system. A problem is nevertheless, as mentioned in Section 3.2, the difficulty for a user to take the database with him when he changes devices. However this is a typical limitation. In theory multiple users should be able to share one device in running the application, provided they mark the user switch. Database Android makes use of an SQLite3 database and has several types of wrapper and helper classes, such as ContentProvider and others, to interact with it. This provides low coupling and extensibility.
4.4.5
Development tools
Apart from the emulator Android also comes with some tools to assist the developer in creating applications. This includes an Eclipse plug-in, a command-line tool for accessing the emulator (a debug bridge), configuration tools on the emulator, a debugger, and a profiler and tracer. Most of these are early versions however.
4.5
Design class diagram and implementation
The diagram in Figure 4.4 is by no means complete: it purposefully does not show all the elements and all the attributes and operations. Constructors and destructors have been left out (as they mostly just have the attributes as arguments), and also the getters of the attributes of primitive type (those that are listed in the top compartment, and not through aggregations). The same goes for methods that are being overridden or implemented in a (grand-)child. Nested classes are hidden too. The package indicating labels in brackets below the entity names help to group the entities.
4.5.1
Design
The application contains two active controlling and mutually interacting key entities that dominate the structure. One controller (following the Controller pattern) is situated in the frontend, that deals with the user interface. The other is in the backend, which contains the technical packages (db, network, math). The domain is shared between these two parts. Some classes must only have one (and exactly one) instance and therefore apply the Singleton pattern. These encompass Geometer, ControllerService, ControllerActivity, NetworkThread, and MathQueueThread.
4.5. DESIGN CLASS DIAGRAM AND IMPLEMENTATION
45
Two Android building blocks Two of the building blocks [36, 38] that are inherent to Android are Activities and Services. An Activity [39] is an entity that has a life cycle, and usually also is a single screen in the application (it doesn’t have to be but it will likely want to interact with the user). This fits into the Android philosophy of having applications and part of applications that are never shut down explicitly. An application process’s lifetime is not directly controlled by the application itself. Parts, such as these individual Activities, are sent to the background, put asleep, or shut down, as lack of memory demands. In each case, state is saved and then can be restored later on. The entry point to the application (i.e. “the class containing the main method”) is also an Activity, seeing that Android is centered around user interface views, and the first called method is the onCreate() method. The typical entry point in Geometer is Geometer. Services [40], on the other hand, are entities running in the background for an indefinite period of time. They do important work for the user, but work that should not block or burden the Activities. The glue for all this loose coupling is found in AndroidManifest.xml, where each component and its function should be specified. Intents (see Section 4.5.2) are registered here and coupled to IntentReceivers. Components also need to claim permissions for some services. Frontend and backend ControllerActivity is the central point of the user interface frontend. It is an Activity without any screens itself. ControllerService likewise is the central point of the backend and steers the maths, database and network functionalities. The lifetimes of both parts are quite independent. At regular start-up Geometer binds itself (using an Intent) to ControllerService, which may or may not be running already, but in any case is started if not running yet. When the connection is set up, Geometer starts ControllerActivity as a sub-Activity. On the frontend side of the connection ControllerService maintains itself as an instance of the ControllerServiceI interface stub, that is automatically generated with the required ControllerServiceI.aidl. This stub is passed to ControllerActivity and used on the frontend side to unilaterally make requests to the backend via remote procedure calls. When the user interface is no longer needed it may get closed down by the system (including ControllerActivity and Geometer) but the backend (and in particular ControllerService) can keep running independently to keep calculating or keep communicating over the network. While it runs in the same process, it runs in a different thread so the user interface is not blocked by anything that happens in the backend. It will end itself if it is idle and unused for some time. ControllerService, besides being started by Geometer, can also be started when an Intent message comes in over the network. This means users do not need to have Geometer running explicitly or still in foreground to work together with other users. Requests for information will still be complied with and recalculations will still be performed if the device is on-line and has the necessary resources to spare. The connection is not the only way of communicating. The backend reads and writes into a database about the elements that make up the domain and that database is also read by the frontend. In the process, persistant state is constantly saved, which is convenient for the
46
CHAPTER 4. IMPLEMENTATION: GEOMETER
Figure 4.4: Design class diagram, continued on next page (with small overlap).
4.5. DESIGN CLASS DIAGRAM AND IMPLEMENTATION
47
48
CHAPTER 4. IMPLEMENTATION: GEOMETER
lifecycle handling.
4.5.2
UI and Activities
The frontend mainly deals with the ui and geometer packages, and with ControllerActivity. An Activity is implemented as a class that extends the Activity base class. It will setup a user interface composed of View (widget) objects and responds to events. The lifecycle of an Activity is concretized by callback methods at changes of state such as onCreate(), onStart(), onRestart(), onResume(), onFreeze(), onPause(), onStop() and onDestroy(), each with different responsibilities concerning saving or restoring state. Intents Switching screens is done by starting other Activities and interacting with them. This is done loosely coupled by using Intents. Intents resemble in many ways the Command pattern. Intents decouple requests from actions. In fact, the same method is used between any component. Android makes much of low coupling because it is open as to applications and components. Users can prefer their own photo browser over the built-in photo browser or Geometer’s. If all three components adhere to the same Intents model then this switching does not need any change of code. Users can also move back to previous screens. Android uses a history stack for this. Results can be received by the onActivityResult() method from Activities that have finished their task. User interfaces are built using XML files and Views. Each screen has an XML file that specifies the layout, the widgets (View objects) to use, and their properties. With these files Android automatically generates an R.java class. In the Activities the screens are attached using setContentView(R.layout.myscreen). There are other XML files with strings (for international descriptors) and image resources, . . . too. The layouting effort needed by the developper is minimal and user interfaces across applications will have a consistent look and feel. Apart from Geometer and ControllerActivity, the MapObserver, PhotoObserver, AddUI, ImageSwitcherActivity and ForwardPhotoGridActivity classes all extend Activity. MapObserver actually extends MapActivity and as such displays a map with overlays and several functions (both accessible through a menu and buttons, and with other basic means of input). PhotoObserver serves as an other mode in which the user can walk through the photos (by browsing them). LookupMapObserver and LookupPhotoObserver are variations with added functionality for a photo to be chosen as a target for an estimation. This class hierarchy makes use of the Observer pattern to keep the right photo selected in the different modes, hence the implementation of the Observer interface. The subject is the Selection class, with PhotoSelection as concrete subject. The AddUI hierarchy implements the different dialog screens where estimations are added. Upon such an action, a request is sent to ControllerService’s addEstimate() method over the connection with the necessary information in a hashmap. ImageSwitcherActivity is our implementation (since Android currently lacks one) of a file browser for photos in the form of a gallery. Images are read off the SD card by Android itself and put into a database, which is queried by the browser. While the specifications
4.5. DESIGN CLASS DIAGRAM AND IMPLEMENTATION
49
left the default way to browse files on Android and the functioning of a default file browser unspecified, we consider this to be a likely and interchangeable manner. ForwardPhotoGridActivity finally provides a custom widget-like screen to display a set of photos together and make a selection possible. When walking through the photos in photo mode it is quite trivial to take the photo to the left or right (by turning), but when it comes to moving forward and showing the photo ahead a small algorithm is used to select good candidates (based on the distance, the angle and how straight ahead the photo is). If there are multiple good candidates then this screen is displayed to offer the user a choice.
4.5.3
Domain
There are three element types involved in the domain (and in the domain package), which could also be recognised in the domain model and earlier parts of this text (Section 3.3): nodes, links and photos. Each element has a unique ID throughout the system (i.e. also between remote and local sites). This ID consists of the unique identifier of the user that created the element and thus holds responsibility over it (the creator field, which is the e-mail address used for the GTalk service), and a unique incremented number (non-recurring per user). Together we call this the “fullID” (the full...ID field1 ). Every element type’s top class extends ContentObserver so that, for the frontend side, when an element is changed at the backend side and committed to the database the element object will terminate itself and be recreated. This restoration is done later on when another object detects that a reference basically is dead (because the other object is terminated) and looks the element that should be there up again by its fullID. Apart from the destructors, the constructors are tedious too to be able to gradually and serially construct a consistent state of objects. Furthermore there are some supportive primitive classes in the primitive package.
Nodes Nodes are the camera positions to be corrected by the algorithm. They have a Position and a variance value that are derived for them by the algorithm based on the information the links provide. Each node has a list (both by object reference and by fullID) of all the photos at that node. The Node class is specialized into a LocalNode subclass, for regular nodes that are managed by the local user, and a RemoteNode class, which uses the RemoteProxy pattern to model cached versions of nodes managed by remote users. Any changes are announced to the ControllerService after deciding whether the changes need only be handled locally and committed, or the changes should be sent on over the network. This happens through the inform...() methods, when the node is just newly created, when a link is added to a photo of this node, and when another node has been corrected by the algorithm and the updates might need to ripple through. 1
“. . . ” substitutes any element name.
50
CHAPTER 4. IMPLEMENTATION: GEOMETER
Photos The Photo class represents a single photo and has an absolute angle (indicating in which direction the photo was shot) that is derived for it by the algorithm and a ContentURI (a URI is a universal resource identifier) that translates to the full pathname of the image file. Each photo also has a reference to the node to which it belongs (both as fullID and as object reference) and a list (again in the double manner) without duplicates of all the links in which it is involved. There are two add...Link() methods to add links, one for effectively new links, the other for restoring the photo object later on. Links Links come in different flavours but they have a variance value, derived for them by the algorithm, in common. Following the TemplateMethod pattern to some extent, the variance and coordinates are calculated here in the Link class, more precisely in the calculate...() methods (for the main calculations themselves) and in the preCalculate...() methods (for calculations of which the results can be reused several times), which need to be overridden by the subclasses. The informNodeChanged() method serves to pass the notification that a node is corrected by the algorithm from one node to its neighbour. The links are classified as RelativeLinks or AbsoluteLinks. The RelativeLink is a link that starts at a photo and ends at another photo (both are double references to a Photo object and a fullID), while the AbsoluteLink type only involves one photo (again a double reference). In Geometer both types currently have only one non-abstract implementation. The UserDistanceEstimation class is such an implementation of RelativeLink and has a Distance and two RelativeAngle fields. The calculation methods basically calculate the position of one end (that is, the node of either the start- or endphoto) given the information in the link and the position of the other end. The UserRotatedDistanceEstimation class is a variation on this class where the distance is fixed at zero, both ends are on the same node, and only the photo relative angle is used. Finally, the UserMapEstimation class contains an absolute Position and one AbsoluteAngle that indicates the angle of the photo. No calculations are necessary here, one only needs to return the specified position information. The variance is set to a constant value.
4.5.4
Services
The backend consists of three parts, namely the database, network communications and mathematics parts (in respectively the db, network and math packages), which are controlled by ControllerService. ControllerService has a local class implementation of the ControllerServiceI interface in the mBinder field. Most prominently the addEstimate() method, which is called when estimates are added through the frontend, is implemented in there. Database The single class GeometerProvider provides a standardized interface to the database by extending Android’s DatabaseContentProvider. ContentProviders [41] implement a standard request syntax for data, and a standard access mechanism for the returned data. Once again,
4.5. DESIGN CLASS DIAGRAM AND IMPLEMENTATION
51
ContentProviders are loosely linked to their clients. Each ContentProvider exposes a unique string (a URI) identifying the type of data that it will handle, and the client must use that string to store or retrieve data of that type. Since these URIs should be accessible for external applications as well and determine the identity of Geometer, they are defined in the top Geometer class. A typical URI in Geometer looks like “content://fr.geometer.Geometer/nodes” (to specify all nodes), for instance, or “content://fr.geometer.Geometer/nodes/
[email protected]/42” (for a specific node uniquely determined by the unique user’s GTalk address and an index). As an example, MapObserver calls getNodesInRectangleForUI() in ControllerActivity to perform queries using GeometerProvider. A ContentObserver object is kept in MapObserver to get a callback when the content affected (possibly returned) by the performed query has changed, so that the query is performed again after a certain timeout (and only once per timeout). The timeout is managed by a ResettableTimeout object which sends a message to the handler of MapObserver when the timer expires. Another timer is used to make sure that only one query caused by user inputs is done every so many milliseconds. The lookup...() methods in Controller are also called by classes from the domain when these are instantiated at the frontend side. Both ControllerActivity and ControllerService implement the Controller interface so they can be used transparently within the domain for this reason. The purpose is that when a node is requested, the node is queried and constructed from the database (thus restored to its latest version). A hashmap serving as a cache (one for each of the domain element types) is searched first before the database is queried. The same functionality can be found in ControllerService except that no ContentObservers are used. The ControllerService is the only class writing to the database however. The save...() methods (wrapped by the commitChange() methods) commit either a specific domain element or all currently open elements of a type. These methods are called by the network component when a new remote element is reported, or by the changed domain elements themselves (which are changed for instance by repositioning after running the algorithm). The database also remembers any foreign elements that are encountered, but is not the first-hand source, does not take responsibility and therefore only serves as a cache in this case. The above is only for Geometer’s internal (however public) domain data. Miscellaneous settings that need to be saved in between running the application are stored in the Application Preferences through the specific functionality provided for this task by Android. Network There are two activities associated with network communications, namely sending and receiving. The received Intents crystallize in the NetworkReceiver class because it extends IntentReceiver and we registered Intents for Geometer in the AndroidManifest.xml. However the actual requests are processed in ControllerService in the on...Told() and on...Asked() methods. The NetworkThread sends out Intents and runs in a different thread so it does not block the rest. The first thing that is done in the class is setting up a GTalk session that will remain connected throughout the lifetime of the backend. NetworkThread also asks ControllerService to look up the contacts listed in the Android contacts application (with a ContentObserver so that any changes on the fly are also taken in). The contacts are necessary because their
52
CHAPTER 4. IMPLEMENTATION: GEOMETER
specified contact method (the GTalk address) will be used in a multicast to all contacts in case the information sent out is not to a specific target (as would be the case with an answer to a request). Furthermore NetworkThread gets the user’s identifier (GTalk address) and supplies it to ControllerService and thus to the rest of the application so it can be used in the fullID’s. The class has several methods (the ask...(), doAsk...() and tell...() methods) to assemble and send messages. The assembly constitutes of writing basic information (such as source or target for instance) in the root of the Intent and including information bundles. These methods use lower level pack...() methods to effectively package and serialize the information (that is, domain elements) in bundles into the Intents. There are seven distinct communicative actions: askNodesInRectangles The most important is a multicast of a request of all the nodes in a given near-rectangular area (which matches the map visible on screen). Upon reception a device should send back all the (or as many as possible) valid nodes (with a tellTotalNode) that its user manages and that are situated in the area. askTotalNode If the current device only needs to know the latest version of one specific node, or one of its photos or any links in which they are involved, then this request is sent out. The device whose user owns the node needs to reply with a tellTotalNode. tellTotalNode This message sends all the involved photos and links with the given node, including the node itself, to either everyone in the contacts list (in case it is an unsolicited informative message) or the target that asked for it (in case it serves as an answer message). The first-hand source is the owner of the node and it also includes any (also foreign) photos that are at that node and links that have those photos as endphotos. Actually this method uses several other tell... actions (see below) to get its job done, so entities are sent separately. tellRecalculateNode When a change is made that influences a domain element for which this device is not the local device, then a suggestion to recalculate is sent out, including the available first-hand information (i.e. of local elements) that will be needed to recalculate the element. Again, this other information is sent separately. Upon reception the actual local device should integrate the new information and should schedule the algorithm on the node. Likely it will send out new similar messages afterwards. A special case is when a user on this device adds an estimate between two remote photos/nodes: this action message needs to be sent to both the remote nodes involved, but no first-hand information on the (respective other) nodes can be sent since we only have firsthand information on the link. We need to ask the owner devices to forward the necessary information to the devices of the nodes. tellNode, tellLink, tellPhoto Both tell... actions above make use of these three more elementary actions to serially send information about individual domain elements. On reception, if these are foreign entities, then the entities will be added to the database, or the matching existing foreign entities (such as remote nodes) will be updated.
4.5. DESIGN CLASS DIAGRAM AND IMPLEMENTATION
53
In general, too many (unnecessary) information is sent, because of limitations and a bug (see below). The requirement of total ordering and reliable messaging is left to the specific communication technology.
Mathematics More about the primitives, variables and units The classes from the primitives package represent the data that is subject to the calculations in the math package. Double precision floating point variables were used only where impossible to avoid because of rounding errors or because of interfacing with Android’s classes. Everywhere else integer arithmetic with long variables was used. However, since most actual values were non-integers, scaling operations back and forth needed to be performed. In the Position class the actual integer x and y coordinates (longitude and latitude) in microdegrees are scaled up so the largest theoretically accepted value (1000 degrees) gets equated internally with the maximum representable long value. The calculations are performed with this internal version and this version is kept in memory. Afterwards the values can be scaled down again. Also checks are made for the maximum and minimum values. Since it concerns degrees we expect an input between -180 and 180 (inclusive) degrees for x, and between -80 and 80 (inclusive) degrees for y. Normalization is done by adding or subtracting a sufficient number of times 360 degrees. Position also has functionality to calculate the approximate distance or bearing (more precisely: the approximate initial bearing in degrees East of true North on a great circle) over the earth’s surface to another position. For this calculation the WGS84 ellipsoid from Android’s location and map API is used. For the Distance the same scaling as for Position holds. However here the actual distance in meters is kept as primary source (as a double), while a second field holds a cached version of the scaled distance in microdegrees (as a long). The reason for this is that the distance in meters is an absolute value chosen by the user, whereas the distance in degrees changes depending on where it is measured (because of the ellipsoid). The calculation from distance in meters to distance in microdegrees at some position and in some direction is done, in the same way as with Position, by first calculating how much one meter distance approximately is in microdegrees near the same position and in the same direction. Both the position and angle need to be specified. It is evident that this brings certain limitations under which validity holds. The distance must not be too large or the approximation for its conversion does not hold: for instance, the factor changes at different ends of the line between two positions. Also a straight line is only an approximation so an angle can be used (the line will bend off to another angle if it is long enough). A distance in meters by our choice needs to be between 0 and 10000 meters. In short, the calculations apply only very locally, and this goes for all the values involved in the distance estimations. An Angle gets its (integer) value in degrees. This is then normalized so it lies between 0 and 360 (exclusive) degrees. The value that is kept (as a double) in memory is in radians. Angles are positive counterclockwise and absolute angles are versus North. In subclass AbsoluteAngle however an absolute angle in radians specially for the maths calculations is returned versus East, to be in check with default mathematical conventions. This does not matter for the relative angles in subclass RelativeAngle.
54
CHAPTER 4. IMPLEMENTATION: GEOMETER
Queueing The MathQueueThread class implements a PriorityBlockingQueue to schedule algorithm runs for the nodes. This is useful since there are several nodes that are possibly being managed by this device and the speed at which calculations are done can sometimes be slower than the speed at which nodes are scheduled to be calculated (as requests can also come in over the network). Because the queue has a priority we can also make sure that newly created nodes are absolutely scheduled first (to make sure they become initialized). The queue also refuses to schedule a node that has already been scheduled and is not yet (completely ) calculated, as this would be a redundant effort. If there are no more nodes in the queue then it waits for some time and finally checks if the whole service can’t be stopped. Algorithm If there is a scheduled node, MathQueueThread (the context) will execute the doAlgorithm() method of MathAlgorithm (the strategy in the Strategy pattern), with as only implementation IterativeRelaxationAlgorithm (the concrete strategy). The IterativeRelaxationAlgorithm’s doAlgorithm() method performs the second version algorithm (Section 3.4.9) on the node. Some smaller sub-calculations are done in the links or in the primitives. The most important variables, the x and y coordinates and variance values, are of the long type. Care is taken to avoid rounding errors, mostly when variance comes into play (some values hence are in double or scaled there). The stopping criterion is set at 0.2 m.
4.6
Implementation
The design and theory leaves open some room for choices. This includes the parameters used throughout the algorithm.
4.6.1
Variance
The variance values of the estimations still need to be chosen wisely. A lower value says the estimation is deemed to be more accurate. For the user distance estimations the variance could be calculated based on the given inputs, and this was tried out in the Mathematica example. For a link between node i and node j, with a distance dji , the variance uji is the following: q uji = (dji · 0.1)2 + (2dji sin (vθ · 2 · π/2 · 360))2 (4.1) where vθ = 2 (degrees) if the angle is a multiple of 90 degrees, vθ = 5 if another multiple of 45 degrees, and vθ = 10 otherwise. The variance is estimated as the combination of 10% of the distance, which represents the depth by which the position might be off, and the sideways distance by which the position might be off, based on an estimated angle inaccuracy. The sideways distance is approximated by drawing a circle through the position (with the “known” position in the center) and taking twice the chord of the possible angle error superimposed on the link line. The 10% is a choice that is based on the 5% used in [6]. For the angle error one could reason that users are more precise in estimating perpendicular and half-perpendicular angles and are very weak in estimating any other angle, and therefore these angles could be given a higher appraisal. This is what is illustrated above. On the other hand, one could argue that people will hardly ever enter any values other than round-offs to multiples of 90 degrees, or to some extent 45 degrees, and that in fact precisely entered values (such as 33.5 degrees) should be rewarded with low variance values.
4.7. RESULTS
55
For this reason, in the second version of the algorithm, as implemented in Geometer, vθ has been set to a constant value, namely 5 degrees, irrespective of the value of the angle. Also this concerns one of the two angles. For the other angle the above variance value is multiplied by a constant factor, namely 110%, to represent the uncertainty on the orientation of the photo from which the link started in the first place. In short, only the distance has an influence on the variance of the link. For absolute links, in the second version of the algorithm, a constant non-zero value of 100 is set. It was zero in the first version of the algorithm. This makes the absolute links considered somewhat 10000 to 100000 times more accurate than user distance estimations. Rotated photo user estimations have a variance of zero, so they will not add uncertainty to a node’s position, after all they are estimations “within” a node (the node’s variance will always first be determined by some other estimation than of the rotated photo). The variance for the absolute links must specifically not be zero or a small number: because absolute links are no longer implied boundary condition-wise as in the first version, a zero variance would give a division by zero. This can be solved by using a small constant. But what is worse, since we are using integer calculus as much as possible, this constant would also be the maximum number of links (with absolute links being the most limiting here) that can be made to a single node. Indeed, in accordance with (3.3) suppose there are 100 links each with vji = 100. Then vpre = 100/100, where vpre is an intermediary numerical result mathematically equivalent to v1i . The final variance for the node then is vi = 100/100 = 1. Take 101 links instead, and this would become vpre = 101/100 and vi = 100/101, but this numerically is vi = 0. This 0 means the coordinates are falsely weighted (the sum of the weights must add up to 100%). One solution would be to keep the variances as doubles everywhere. Another, and this is implemented, is to detect that vpre is bigger than one and to add a small value to each vji and recalculate. This approximately keeps the proportions of involvement of the links and their other node’s variances the same, which is the purpose in the first place.
4.7
Results
The realised prototype allows users to add photos and performs the algorithm to make their positions coherent. The use cases have been made possible. Figure 4.5 shows a screen (with a satellite map and on a white background for clarity) where locally several photos have been added to build up a walkthrough of an indoor environment. Figure 4.6 shows the same (this time in the total emulator), except lacking three map estimations on the corner nodes and the last added link on the left side. Comparing both shows some of the correcting qualities of the prototype (as Figure 4.5 has more information). On Figure 4.7 the same set of photos is shown corrected by the first version of the algorithm in the Mathematica example.
4.7.1
Algorithm issues
The algorithm is capable of keeping things “together” and of minimizing the distributed error by diffusing it. Performance of the algorithm in its last version is not exceptional, however it is a good starting point. Still, some issues need to be looked into:
56
CHAPTER 4. IMPLEMENTATION: GEOMETER
(a) The background is a satellite image.
(b) On a white background.
Figure 4.5: Geometer map mode screen with a full set of photos added of an indoor walkthrough.
4.7. RESULTS
57
Figure 4.6: The emulator in full with the same set of photos but not yet the full tour connected back at the left side.
58
CHAPTER 4. IMPLEMENTATION: GEOMETER
Figure 4.7: The same set of photos as made coherent in the Mathematica example. Increasing variance The fact that a node’s variance always increases and never decreases may have an impact in the fully distributed system, especially since in the current implementation multicast is used and not broadcast and thus only a subset is “in range” or in the “inner circle”. As a start, the variance values should not overflow in a large system. Of course this goes for any of the values in the algorithm, and its scalability ends at some stage. Secondly, there is a concern that rounding errors in the calculation of the variance would make the value increase more than normally. This would make a part of the mesh that is present for a longer time, proportionally have a higher variance than the other parts of the system. Order of the loops The order in which the nodes are handled can make a difference. There are open possibilities to sort the nodes in the queue. Also it could be beneficial to make sure that on average nodes’ positions are recalculated equally as much, rather than one node’s position (or a couple nodes’ positions) being calculated over and over. The queue as currently implemented favours this but no guarantees are made. Oscillations Because of the changes made to the original algorithm, it is no longer certain that global convergence will happen in any case. Actually, the same holds for the original algorithm in a real numerical implementation. The properties of the (any) distributed application may prohibit the system from ever reaching a global optimum. Also, to calculate the distance in degrees an ellipsoid is used that only is an approximation to reality and only is valid under certain conditions. Often in practice using our calculation, the distance from A to B appears to be slightly different from the distance from B to A (i.e. the distance approximations are nonreciprocal). This is logical as it all depends on the supplied near position. This too introduces
4.7. RESULTS
59
at least a slower convergence of the algorithm, and could pave the way to instabilities. Other rounding errors also have an influence on convergence, and these rounding errors increase as real-world coordinates are used in a large and vast system. Finally, the algorithm is not suitable for large loops or chains of nodes, where such oscillation effects easily might become visible. Zero length vector The zero length vector problem, as mentioned in Section 3.4.9, really becomes a problem if there are very many links. The reason for this is that many links could average out more easily and that many links could diminish the length of the vector in any case. It might also worsen the convergence.
4.7.2
Experience with Android
The used Android platform allows for exciting uses and extensions. Geometer is mostly in conformity with various official guidelines and unofficial conventions. Bugs Still, the Android SDK is very young. Not all design choices have stabilized and lots of bugs are still present, several of which affected or still affect Geometer. • A “GREF has increased to 601” message and termination of the Dalvik VM with “excessive global references (601)” occur when too many objects have been constructed and the garbage collector has been unable to clean up all the destructed objects. This is not yet solved but has been made less severe so not a first-line problem any more. • The up navigation button makes the map go down a bit first before going up. • Extra clicks are required between touch and key modes: when switching between mouse and keyboard input you need to click/press twice (even in the Android menus). • Problems exist with (un-)packing Bundles. Bundles packed in Bundles are not unpacked properly. This would be useful for Intents and prevents optimizing the communications over XMPP. • In SDK version m3-rc37a floats and doubles fail to be retrieved from the database. This was by-passed by using varchars instead of doubles. Performance Android also has a fair bit of performance issues that are not yet solved. Notably between SDK version m3-rc37a and m5-rc14 the emulator became a bit more capable of handling many objects (for instance on screen), at the expense of becoming a lot slower (very clearly in start-up time). Androids map API implementation is quite beta and even the reference implementation example can get slow quite often. Also it scales badly in terms of elements drawn in the overlays. The database queries and the wrapper classes (such as the ContentObservers) are relatively slow too. The responsiveness of the map mode of Geometer depends largely on
60
CHAPTER 4. IMPLEMENTATION: GEOMETER
these two issues. Even more so, with more nodes in view more overlay elements need to be drawn and more database queries need to be done, which rapidly slows down performance with more than 10 nodes. Still, moving the map without any nodes in view (so without much of the Geometer-specific implementation) is not too fast either. The Intent system, while providing loose coupling, incurs quite an overhead on calls between the different parts of the system. This also applies for the XMPP communication. As expected it is not scalable and the roundtrips are not really fast either. Testing and debugging Android provides assistance in debugging but this is all very early days. A profiling tool only became available halfway through. The debugging also is slow, and since it concerns heavily threaded software the debugging is not always straightforward. A problem with Android is the burden and difficulty of testing the software. This prevents the use of a test-driven development method. For the larger part of this thesis JUnit did not work on Android. However there are several other unit testing packages, such as P-Unit or Positron, that worked to some extent and with mixed results. Still, even if unit testing would have been easily possible, there is still a lack of mock objects, which prevents for instance Services, communications and databases from being tested. We have been able to do some small tests, when necessary, on some classes in the primitives package, but have been unable to test for instance even ControllerActivity because its functionality mostly interacts with other subsystems which couldn’t be substituted. Good news is that this seems to be rapidly improving with plenty attention from the Android developers.
4.7.3
Application issues
While the functionality to work on a single device is fully incorporated and stable, the (nontheoretical) distributed functionality is not completely worked out (and unstable). It would not have a practical use anyway because of the non-scalability of XMPP. In general too many obstacles and necessary concessions currently prevent the implementation of a fullscale distributed system. Some related functions, such as keeping the ControllerService running, after the Activities (such as Geometer) have been hidden or closed in favour of other applications, do not work under all circumstances. Fully focused on photos MathQueueThread queues nodes and not photos. In hindsight it could have been a better design to make the queue based on photos instead of nodes for future extensions. For the bigger part of the functionality nodes are perfectly suitable, however it is difficult to specify an order in which to turn the photos, which becomes impractical when there is one photo that not yet has an orientation and needs to be initialized first. On the other hand, basing the queue on photos would make the application heavier. Nodes in view A way to request only a limited number of nodes in view was not implemented. To soften the slowness of the map a limit could be set of showing only 20 nodes and requiring the user
4.8. CONCLUSION
61
to zoom in to see more nodes. However the map cannot be zoomed in infinitely which leaves another problem.
4.8
Conclusion
The development and implementation process was described in a requirements analysis, use cases, a domain model and a design class diagram. The advantages and disadvantages of the chosen Android platform were explained, and we feel that the overall balance, keeping in mind that it is a system in evolution, lies in favor. Some implementation choices were also enlightened. And finally the result and any issues were discussed.
62
CHAPTER 4. IMPLEMENTATION: GEOMETER
Chapter 5
Conclusion 5.1
Results
Foremost several aspects of the large problem domain, which is localization for distributed data capture with spatial coherence issues, were made explicit. Two examples were presented as illustrations. An extensive literature study was done. As the domain is very new, search concentrated mostly on two related, but nevertheless different, domains, namely ad-hoc sensor network localization and robotic node localization. Although localization is an old problem, further research is necessary to make it really usable for ubiquitous computing. Less related and usable techniques than hoped for were found, and none of the found algorithms was an allround winner for our specifications. There was considerable difficulty of finding an approach that would translate easily and that would meet all our requirements to some degree. Rather than picking a ready and applicable approach we had to derive and experiment in a new direction. Also, instead of advanced coherence improving techniques, it turns out that a more plain accuracy maximization is more attainable at initial stage. Still, one family of algorithms that is based on a mechanical spring analogy and uses mesh relaxation, seems attractive as a technique for our problem. Particularly the iterative approach in [6] seems most fit for the job. While it requires some concessions on sophistication and error diffusion capabilities, it offers possibilities for adaptation to decentralized calculation, and it is one of the few algorithms that allow to create a lightweight, simple, distributed system. It is feasible to be implemented in a prototype application featuring the distributed photo capture example. Based on this knowledge, the algorithm was translated and adapted for a distributed implementation and this specific application problem. But first, the distributed photo capture problem definition and scope were formulated. The specific problem here is ensuring coherence in distributed photo capture in a geotagged distributed photo collection. It is a very novel topic so effort has been concentrated on the coherence problem itself. A representation was chosen with consideration for the peculiarities of the problem. A feasibility study of the algorithm was made in Mathematica.
5.2
Produced work
Following up on this theoretical work, a prototype application was implemented of a distributed geotagged photo collection, called “Geometer”, as an example and to show the real63
64
CHAPTER 5. CONCLUSION
izability of such a system. The system is built on the Android platform for mobile devices. Geometer’s architecture design notably contains a far-reaching separation between the user interface and the backend, where the computational thread is situated. Some parts are only loosely coupled and therefore the whole gains a lot more extensibility. The realised prototype allows individual users to add photos and performs the algorithm to make their positions coherent. The users can then view a map or walk through the photos. In practice distributed operation is included, following the theoretical design, but this should be considered a much untested bonus. The algorithm proves capable, but performance of the algorithm is not exceptional. Still it is a good starting point. The Android platform allows for exciting uses but is not stabilized yet. Several bugs and API changes and weaknesses were encountered. Most pressingly performance and testing are major concerns.
5.3
Future work
Some implementation choices and algorithm issues still can be improved as is discussed in Section 4.7 in Chapter 4. As to the prototype, a more stable and tested distributed implementation stands out as future work. Secondly, work could be done on some of the things left out of scope that are not so abundantly researched in the field of distributed systems. An example is the modification and deletion, and with this also synchronization issues, of photos in the application in addition to the functionality to add photos. Another is including a temporal aspect too. Hopefully the advent of a real device can give a better idea of the real-world usefulness of the application. Yet another idea could be to weaken the strict links between the photos and implement some hyperlinked system with room for other content as well, similar to a wiki. It would be useful to assess the distributed properties and robustness herein of the algorithm in more detail. The convergence guarantee and speed of the algorithm with its current adaptations could be diagnosed and confirmed. The prototype allows to experiment with other solutions, albeit that the Android API might change quite a bit the next few months, so for the prototype to be used in real life other changes too will likely need to be made.
5.4
Conclusion
The results match the stipulated objectives of a large literature study, a problem analysis, the formulation of a solution, and the design and implementation of a prototype. A start was made for the novel concept of distributed capture and several domains have converged in the solution. A prototype has been made and the level of feasibility of the solution thus far has been discussed.
Bibliography [1] G. Coulouris, J. Dollimore, and T. Kindberg, Distributed Systems: Concepts And Design. Addison Wesley Longman, 2005. [2] K. Aberer, P. Cudre-Mauroux, A. Datta, and M. Hauswirth, “PIX-Grid: A Platform for P2P Photo Exchange,” Proceedings of Ubiquitous Mobile Information and Collaboration Systems (UMICS 2003), collocated with CAiSE’03, 2003. [3] X. L. Bourdon and P. Couderc, “A protocol for distributed multimedia capture for personal communicating devices,” in Autonomics ’07: Proceedings of the 1st international conference on Autonomic computing and communication systems. ICST, Brussels, Belgium, Belgium: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2007, pp. 1–8. [4] J. Pauty, P. Couderc, and M. Banˆatre, “Using context to navigate through a photo collection,” Proceedings of the 7th international conference on Human computer interaction with mobile devices & services, pp. 145–152, 2005. [5] E. Verbree and A. van Anrooij, “Interactive navigation through distance added valued panoramic images,” ISPRS, Panoramic Photogrammetry Workshop, Feb, 2004. [6] T. Duckett, S. Marsland, and J. Shapiro, “Learning globally consistent maps by relaxation,” Robotics and Automation, 2000. Proceedings. ICRA’00. IEEE International Conference on, vol. 4, 2000. [7] Wolfram Mathematica. [Online]. Available: http://www.wolfram.com [8] Android at Google Code. [Online]. Available: documentation.html
http://code.google.com/android/
[9] A. Howard and L. Kitchen, “Cooperative localisation and mapping,” International Conference on Field and Service Robotics (FSR99), pp. 92–97, 1999. [10] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” Communications Magazine, IEEE, vol. 40, no. 8, pp. 102–114, 2002. [11] J. Leonard and H. Durrant-Whyte, “Simultaneous map building and localization for an autonomous mobilerobot,” Intelligent Robots and Systems’ 91.’Intelligence for Mechanical Systems, Proceedings IROS’91. IEEE/RSJ International Workshop on, pp. 1442– 1447, 1991. 65
66
BIBLIOGRAPHY
[12] M. Weiser, “The Computer for the Twenty-First Century,” Scientific American, vol. 265, no. 3, pp. 94–104, 1991. [13] K. Muthukrishnan, M. Lijding, and P. Havinga, “Towards smart surroundings: Enabling techniques and technologies for localization,” Lecture notes in computer science, pp. 350–362. [14] C. Savarese, J. Rabaey, and K. Langendoen, “Robust Positioning Algorithms for Distributed Ad-Hoc Wireless Sensor Networks.” [15] K. Langendoen and N. Reijers, “Distributed localization in wireless sensor networks: a quantitative comparison,” Computer Networks, vol. 43, no. 4, pp. 499–518, 2003. [16] J. Aspnes, T. Eren, D. Goldenberg, A. Morse, W. Whiteley, Y. Yang, B. Anderson, and P. Belhumeur, “A theory of network localization,” IEEE Transactions on Mobile Computing, vol. 5, no. 12, pp. 1663–1678, 2006. [17] G. Mao, B. Fidan, and B. Anderson, “Wireless sensor network localization techniques,” Computer Networks, vol. 51, no. 10, pp. 2529–2553, 2007. [18] L. Doherty, K. Pister, and L. El Ghaoui, “Convex Position Estimation in Wireless Sensor Networks.” [19] G. Xue and Y. Ye, “An efficient algorithm for minimizing a sum of Euclidean norms with applications,” SIAM J. on Optimization, 1997. [20] P. Biswas and Y. Ye, “Semidefinite programming for ad hoc wireless sensor network localization,? Dept. of Computer Science,” Stanford University, Tech. Rep., Sept 2003, Tech. Rep. [21] D. Niculescu and B. Nath, “Ad Hoc Positioning System (APS).” [22] A. Savvides, H. Park, and M. Srivastava, “The n-Hop Multilateration Primitive for Node Localization Problems,” Mobile Networks and Applications, vol. 8, no. 4, pp. 443–451, 2003. [23] X. Ji and H. Zha, “Sensor positioning in wireless ad-hoc sensor networks using multidimensional scaling,” INFOCOM 2004. Twenty-third AnnualJoint Conference of the IEEE Computer and Communications Societies, vol. 4, 2004. ˇ [24] S. Capkun, M. Hamdi, and J. Hubaux, “GPS-free Positioning in Mobile Ad Hoc Networks,” Cluster Computing, vol. 5, no. 2, pp. 157–167, 2002. [25] K. Chintalapudi, A. Dhariwal, R. Govindan, and G. Sukhatme, “Ad-hoc localization using ranging and sectoring,” INFOCOM 2004. Twenty-third AnnualJoint Conference of the IEEE Computer and Communications Societies, vol. 4, 2004. [26] W. Bach, D. Dam, L. Evers, M. Jonker, H. Scholten, and P. Havinga, “An Iterative Quality-based Localization Algorithm for Ad Hoc Networks.” [27] A. Kannan, G. Mao, and B. Vucetic, “Simulated annealing based localization in wireless sensor network,” The 30th IEEE Conference on Local Computer Networks, pp. 513–514, 2005.
BIBLIOGRAPHY
67
[28] A. Ihler, J. Fisher III, R. Moses, and A. Willsky, “Nonparametric Belief Propagation for Self-Localization of Sensor Networks,” Selected Areas in Communications, IEEE Journal on, vol. 23, no. 4, pp. 809–819, 2005. [29] A. Howard, M. Mataric, and G. Sukhatme, “Relaxation on a mesh: a formalism for generalized localization,” Intelligent Robots and Systems, 2001. Proceedings. 2001 IEEE/RSJ International Conference on, vol. 2, 2001. [30] K. Basye, T. Dean, and J. Vitter, “Coping with Uncertainty in Map Learning,” Machine Learning, vol. 29, no. 1, pp. 65–88, 1997. [31] T. Dean, “On the complexity of integrating spatial measurements,” Proc. SPIE, vol. 1003, p. 358, 1988. [32] M. Golfarelli, D. Maio, and S. Rizzi, “Elastic correction of dead-reckoning errors in map building,” Intelligent Robots and Systems, 1998. Proceedings., 1998 IEEE/RSJ International Conference on, vol. 2, 1998. [33] F. Lu and E. Milios, “Globally Consistent Range Scan Alignment for Environment Mapping,” Autonomous Robots, vol. 4, no. 4, pp. 333–349, 1997. [34] S. Roumeliotis and G. Bekey, “Synergetic localization for groups of mobile robots,” Decision and Control, 2000. Proceedings of the 39th IEEE Conference on, vol. 4, 2000. [35] Android: how to write efficient code. [Online]. Available: http://code.google.com/ android/toolbox/performance.html [36] Android application anatomy. [Online]. Available: http://code.google.com/android/ intro/anatomy.html [37] Android design philosophy. [Online]. Available: toolbox/philosophy.html
http://code.google.com/android/
[38] Android building blocks. [Online]. Available: http://code.google.com/android/devel/ building-blocks.html [39] Android Activity. [Online]. Available: android/app/Activity.html
http://code.google.com/android/reference/
[40] Android Service. [Online]. Available: android/app/Service.html
http://code.google.com/android/reference/
[41] Android: storing, retrieving and exposing data. [Online]. Available: //code.google.com/android/devel/data.html
http: