SURF stimuleringsregeling Learning Analytics 2013
Learning Analytics, formative assessment & learning dispositions
Final report for the 'stimuleringsregeling Learning Analytics 2013' (Learning Analytics 2013 incentive scheme)
Lead institution: Maastricht University School of Business and Economics
Partners:
Project duration: 1 July 2013 to 31 May 2014
Project leader: Dirk Tempelaar
Date: 27 June 2014 [final version]
Contents
1. Brief summary
2. Objective, target group and approach
3. Results
4. Conclusions
5. Continuation
6. Other remarks
7. Cost overview
www.creativecommons.org/licenses/by/3.0/nl
1. Brief summary
The project comprised two applications of 'dispositional learning analytics based on formative assessment', and an investigation of the outcomes of these educational innovations. Formative assessment was implemented through the e-tutorials MyMathLab and MyStatLab. Tracking and performance data from both systems were combined with tracking data from other learning-support systems, such as the learning management system BlackBoard, and from registration systems. As a third component of our LA application, learning disposition data were added: data on students' learning characteristics, obtained through surveys. Both teaching experiments took place in the first semester. The second semester was used for further research into the possibilities for generating learning feedback, in addition to the learning feedback that had already been provided online. The first experiment in particular, owing to the richness of the collected data on the one hand and the large size of the student population on the other, proved to be fertile ground for empirical research into LA-based innovations. Publications resulting from this research are attached to this report as appendices. Appendix 6 in particular provides a detailed description of the first, large-scale teaching experiment (accepted by Computers in Human Behavior, special issue on Learning Analytics). Note: because the project had already been completed by then, this report is essentially identical to the preliminary final report. The only change between mid-May and now is the confirmation that the intended 'crown jewel' among the dissemination activities, the publication in Appendix 6, has indeed resulted in an acceptance.
Appendices:
1. Article in Tijdschrift voor Remedial Teaching
2. Article in Innovative Infotechnologies for Science, Business and Education, 2014 N2 (17)
3. Paper for ORD 2014
4. Chapter in: M. Kalz and E. Ras (Eds.): CAA 2014, CCIS 439, pp. 67-78, 2014, Springer International Publishing Switzerland
5. Chapter in the Proceedings of ICOTS 2014
6. Manuscript for Computers in Human Behavior
2. Objective, target group and approach
On the cover of this report, our project has already been positioned at the intersection of Learning Analytics, formative assessment, and learning dispositions. The questions we addressed in our teaching experiments and research each concern different combinations of these three themes, such as:
- What role can formative assessment play in LA applications, and how does feedback based on formative assessment relate to feedback derived from other LA data sources?
- What added value do learning dispositions have in LA applications, on top of the use of track data from both learning management systems and systems for formative assessment? What is the role of the time dimension here: which information sources are able to generate early feedback, so that enough time remains for interventions?
- How can LA-based learning feedback be shaped in a problem-based programme, where the student, in collaboration with tutors, is the main actor in making learning choices?
The experiments took place in two different programmes (SBE, the School of Business & Economics, and UCM, the University College), in both cases in an introduction to quantitative methods. Both applications took place in programmes that follow the principle of problem-based learning (PBL). However, our applications did not specifically depend on that. What did play a crucial role is the student-centred nature of the instructional design: it is the student who makes the important choices in how and what to learn, so the feedback must in the first instance be directed at that student. In our applications, formative assessment turned out to be the most important information source, in the sense of being the best predictor. This will also be related to the disciplines in which the experiments took place: mathematics and statistics. These are two exact disciplines, characterised by a strongly structured, hierarchical build-up of the domain and a quantitative orientation. In such disciplines formative assessment, and the use of e-tutorials, works well; so well, in fact, that they have often completely replaced classical teaching formats such as exercise classes. It is in this context that LA based on formative assessment can also play a large role; in other contexts that potential may be smaller.
The chosen method entailed experimenting, in two different courses with different types of students, with administering questionnaires on learning dispositions, replacing exercise classes by the use of e-tutorials including formative assessment, and organising learning feedback based on LA, using data from all available data sources. The LA learning feedback was directed both at the student and at tutors. Online feedback was always partial in character, i.e. it always concerned a single information source. After the teaching period had ended, further 'off-line' research was carried out: when we combine the different information sources, what can be said about the information richness of each? Which timing aspects play a role?
The project period covered the entire academic year 2013/2014, including the summer of 2013 for preparation. Teaching at UM is organised in 8-week block periods.
In the first block period of the academic year, September-October, the first and large-scale experiment took place, among 1000 SBE first-year students. In the second block period, November-December, the smaller-scale experiment took place within UCM, with 80 participants. The second semester was used entirely for research: the 'off-line' analyses. The project took place entirely within UM. It was organised within a single faculty, SBE, but carried out both in SBE and in the college, UCM. The organisation of the project was entirely in the hands of the lead applicant, but a large number of tutors played a role in its execution. The Maastricht PBL system is, after all, characterised by a combination of large scale and small scale: the programmes attract relatively large numbers of students, but the educational organisation is very small-scale where the core of the PBL system is concerned: the tutorial group. Tutors of 72 SBE tutorial groups and 6 UCM tutorial groups played an important role in providing learning feedback.
3. Results
The teaching experiments went as expected. The risks involved were limited: in the preceding SURF TTL projects, the introduction of e-tutorials and formative assessment had already been organised, including the generation of learning feedback based on that formative assessment. The step towards generating learning feedback based on all information sources, i.e. besides formative assessment also BB track data, learning dispositions and entry testing, was very manageable. The same holds for involving the tutors in the LA application, in addition to feedback to all participating students. During the project, prompted by discussions with project leaders of other LA projects, the question of the information richness of alternative information sources came increasingly to the fore. How rich is BB track data (what is its predictive power) compared with data from formative assessment? What does disposition data add? Does it matter at which moment you make this comparison? These questions have guided the various presentations and articles produced with the outcomes of the project. Beforehand, the expectation had been that another research question would join these: can it be established empirically that LA is effective, i.e. that applying LA increases learning outcomes? This is a difficult question to answer in itself, because it requires comparing two different cohorts: one that had LA available and one that did not. This type of non-randomised experiment provides a shaky basis for comparative analyses, because you cannot control whether, and which, other factors also change. When it then also turned out that, of the various LA information sources, the data from formative assessment were by far the best predictors, and precisely that data component had already been in place in previous editions of the courses, it quickly became clear that little could be inferred from our experiments about the effectiveness of LA.
In terms of the question of the predictability of learning outcomes, the following alternative information sources can be distinguished in our research:
- Learning dispositions, including student data from registration systems;
- Data from diagnostic entry tests;
- BlackBoard track data, per week/day;
- E-tutorial track data in 'practice mode', per week/day;
- E-tutorial performance data in 'quiz mode'.
The order in which these alternative information sources are listed is not accidental. Learning dispositions, data from registration systems, and diagnostic entry test data are known at the very start of the course. In terms of timing this is the ideal data: if you already signal a risk of drop-out or underperformance here, then literally 'all the time' remains to intervene. That applies less to track data: although such data is much easier to collect than disposition data or test data, it becomes available later, because part of the course has to elapse before track data (of sufficient accuracy) becomes available. The sequence is closed by the performance data: it will rarely be the case that the first real performance results are already available in week 1 or 2. In our case, the data from the first quiz only becomes available at the end of week 3.
The other ordering we can impose on the information sources is that of predictive power: how well can they help in predicting the final result, and thereby in signalling underperformance or drop-out? The ordering you then obtain is an entirely different one: almost the reverse of the previous one. By far the best predictor in both experiments is quiz performance data: the quiz score correlates strongly with the exam score. However, the first quiz data only become available halfway through the first block period, and follow-up quizzes even later.
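To make this comparison of information sources concrete, the sketch below ranks predictor blocks by the multiple correlation R of a regression of exam score on each block, in the spirit of the analyses reported in Appendix 6. It is a minimal illustration rather than the project's actual analysis code; the data frame and the column names in the predictor blocks are hypothetical placeholders.

```python
# Minimal sketch: rank alternative LA information sources by predictive power,
# measured as the multiple correlation R of a linear regression on exam score.
# Assumes a pandas DataFrame `df` with one row per student; all column names
# below are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

predictor_blocks = {
    "dispositions": ["deep_learning", "selfregulation", "anxiety", "boredom"],
    "entry_tests":  ["math_entry", "stats_entry"],
    "bb_track":     ["bb_clicks_w1", "bb_clicks_w2", "bb_clicks_w3"],
    "etutorial":    ["mml_hours_w1", "mml_hours_w2", "mml_mastery_w3"],
    "quiz":         ["quiz1_score"],
}

def multiple_correlation(df: pd.DataFrame, predictors: list[str], outcome: str) -> float:
    """Fit an OLS model and return the multiple correlation R = sqrt(R^2)."""
    data = df.dropna(subset=predictors + [outcome])
    model = LinearRegression().fit(data[predictors], data[outcome])
    r2 = model.score(data[predictors], data[outcome])
    return float(np.sqrt(max(r2, 0.0)))

def rank_sources(df: pd.DataFrame, outcome: str = "exam_score") -> pd.Series:
    """Return the information sources ordered from most to least predictive."""
    rs = {name: multiple_correlation(df, cols, outcome)
          for name, cols in predictor_blocks.items()}
    return pd.Series(rs).sort_values(ascending=False)
```

Run week by week, such a ranking also exposes the timing trade-off discussed above: quiz data dominate once available, while dispositions and entry tests are the only blocks available in week 0.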
If earlier signalling is desired, diagnostic entry testing turns out to be a 'second-best' source, immediately followed by learning disposition data and track data from the e-tutorials. In our two applications, the predictive power of BB track data was very limited and was completely dominated by the predictive power of e-tutorial track data. That outcome is of course to some extent context dependent: certain activities that in our courses were organised within the e-tutorial could also have been organised in BB. As it was, BB was primarily used for timetable information, distributing slides and recordings of lectures, and down/uploading application assignments. The intensity of use of this type of BB function is apparently hardly related to study success.
The various information sources allow yet a third ordering: how 'expensive' they are, i.e. how much effort it takes to collect them. One pole of this ordering is certainly formed by the track data: modern systems for technology-enhanced education almost always allow various forms of track data (connect time, number of clicks, number of repeated attempts, ...) to be generated easily. At the other pole of the spectrum are the learning disposition data: gathering survey data takes relatively much effort, in addition to the concern of ensuring that all students provide a truthful response. Because learning dispositions end up mid-table in terms of predictive power, that could be an argument for not collecting them: expensive, with limited returns. Two arguments can be raised against this. First, our first experiment showed that disposition data is predictive not only of study success, but also of track data in the e-tutorials. Second, learning dispositions often offer a more attractive starting point for intervention and counselling than track data. For many students the feedback 'you practise too little' will not be very inviting, whereas the feedback 'the learning approach you use is not the most effective one' may much sooner prompt change.
The outcomes of the LA project have been used for various presentations and publications. Below is a list of the official presentations and the (mostly associated) publications. In addition, the role of formative assessment, and the role LA can play when formative assessment is used, has been explained in numerous talks, both internal and external (such as, in the past week, a talk on 14 May at the SBO assessment conference, and on 15 May a presentation on the UM assessment system for a group of UT administrators and lecturers).
Presentations:
1. Invited speaker at the IIT-2013 conference (Innovative Information Technologies for Science, Business and Education), held 14-16 November in Vilnius, Lithuania, on the occasion of the Lithuanian presidency of the Council of the EU. Lecture: Learning Analytics and formative assessments in blended learning of mathematics.
2. Presentation for participants in the SURF LA incentive scheme on the limitations of BB for LA: BB & Learning Analytics, enige kritische kanttekening, 17 March 2014.
3. Presentation at the 2014 OnderwijsResearch Dagen (ORD), 11-13 June 2014, Groningen, entitled: Learning Analytics, formatieve toetsing & leerdisposities.
4. Presentation at the 2014 International Computer Assisted Assessment (CAA) Conference, 30 June - 1 July 2014, Zeist, entitled: Computer Assisted, Formative Assessment and Dispositional Learning Analytics in Learning Mathematics & Statistics.
5. Presentation at the 2014 International Conference on Teaching Statistics (ICOTS), 13-18 July 2014, Flagstaff, USA, entitled: Formative Assessment and Learning Analytics in Statistics Education.
6. Presentation at the EARLI SIG Metacognition conference, 3-6 September 2014, Istanbul, entitled: Track data and self-report data in explaining self-regulated learning: experiences from a dispositional learning analytics application.
Publications:
1. Tijdschrift voor Remedial Teaching 2014/2, volume 22, no. 2: RT-programma's aan de poort van ho werken. Invited article on projects carried out under both the SURF TTL and the SURF LA scheme. See Appendix 1 for the final version of the article.
2. International journal IITSBE (Innovative Infotechnologies for Science, Business and Education) 2014 N2 (17): Learning Analytics and formative assessments in blended learning of mathematics. This is the written version of presentation #1, to be published in an open access journal. Final text in Appendix 2.
3. Learning Analytics, formatieve toetsing & leerdisposities: paper for the ORD conference. Appendix 3.
4. Computer Assisted, Formative Assessment and Dispositional Learning Analytics in Learning Mathematics & Statistics. Chapter in: M. Kalz and E. Ras (Eds.): CAA 2014, CCIS 439, pp. 67-78, 2014, Springer International Publishing Switzerland. This is the written version of CAA presentation #4. See Appendix 4.
5. Formative Assessment and Learning Analytics in Statistics Education. Paper for the Proceedings of the ICOTS 2014 conference; written version of presentation #5. See Appendix 5.
6. In search for the most informative data for feedback generation: Learning Analytics in a data-rich context. Paper submitted in response to the call for a special issue on LA of the 'flagship' journal in our discipline: Computers in Human Behavior. The article has been accepted for publication. This paper reports in full on the first teaching experiment carried out within the SURF project.
Publications 1, 4 and 6 were written together with my regular co-authors Bart Rienties (Open University, UK) and Bas Giesbers (Rotterdam School of Management), both researchers of technology-enhanced education in general and LA in particular.
4. Conclusions
The most important conclusion of our project is that LA as a method is less broadly applicable than we assumed beforehand. In our project, in which (in hindsight, and therefore somewhat by chance) we had very rich data at our disposal, it quickly became apparent that the richness of data from the process of formative assessment made all other types of data pale in comparison. This suggests that when no, or only limited, data is available that actually measures essential learning processes or learning outcomes, the application of LA also has only limited prospects. It also became clear that it can be risky to fall back on 'second best' alternative data in the absence of very rich data. In our case, BB click data suggested itself as such an alternative, but its predictive power turned out to be minimal. The availability of learning disposition data can partly explain that low predictive power. A large number of BB clicks can be a measure of a high level of learning activity, and thus carry a positive meaning. But in other cases a large number of BB clicks can be an indicator of inefficient learning behaviour, such as a step-wise rather than a deep learning approach, and thus carry a negative meaning. The interpretation of activity in a learning management system is therefore far from always unambiguous.
The predictive power of learning disposition data is certainly much more limited than that of data from formative assessment. At the same time, gathering disposition data is time consuming. In applications where rich data is already available, or easy to obtain, the use of dispositional LA is therefore not the obvious choice. What does hold is that the type of learning feedback that can be generated with dispositional LA is more comprehensive than with LA based on tracking data only. This richer feedback, and the additional starting points for counselling it provides, can be a reason in itself to use dispositional LA.
5. Continuation
The SURF project carried out here is embedded in long-running applied educational innovation research within UM-SBE, which focuses on the role of technology-enhanced education within a problem-based programme. During the period of the SURF TTL incentive schemes, our research projects were primarily directed at introducing and studying digital, formative assessment. In the last phase of our TTL project, a start was made with generating feedback on learning processes, using LA principles and based primarily on data from the formative assessment systems. This type of project is part of UM's educational innovation policy, and of the (scientific) effect studies based on it, and will therefore be continued in the future, also after the incentive scheme has ended. For now, the focus lies on the combination of formative assessment and dispositional Learning Analytics (LA in which learning feedback is based both on track data and on individual characteristics of students, such as learning approaches). The person with primary responsibility for these projects is also the project leader of the SURF incentive project, and thus the contact point for more information about the project (Dirk Tempelaar).
The discussion of results in section 3 already reported six presentations and six publications produced during the project or still to appear. In addition, the many talks to which an assessment expert is naturally invited in this period (thanks to the increased attention for assessment and assessment procedures, under pressure of the new points of attention in programme accreditations) are perfect opportunities to raise awareness of formative assessment and LA. Because our research is embedded in long-term research of the faculty and the university, this 'missionary work' will certainly continue after the incentive scheme ends. It should be noted, though, that a certain delay relative to the SURF agenda is unavoidable. Colleagues within and outside UM are currently mainly interested in formative assessment and how to introduce it into existing teaching. It is at present still quite difficult to draw attention to the introduction of LA. Only once formative assessment is well and truly in place will people probably be ready to take the next step: LA based on the rich information that formative assessment yields.
6. Other remarks
The large differences between the earlier TTL projects and the current LA project in terms of funding gave rise to some scepticism at the start of the project: what impact would the now very limited size of the SURF subsidies have on the execution of the projects? That scepticism proved unfounded: because each project was smaller, there was much more contact with other projects than I had expected beforehand (during the TTL projects, contact was mainly with partners in one's own project), and the project meetings at the SURF office with all project leaders turned out to be stimulating brainstorm sessions. In that respect, 'small but beautiful' was borne out.
In those Utrecht sessions, quite a lot of attention was paid to the role of BlackBoard in LA applications, partly because a large number of participating institutions use BB. Stimulated by those discussions, we also looked more closely in our project at what the role of BB can be in formative assessment, and in generating learning feedback based on it. The outcome of that research question has not been discussed earlier in this report, but it is unambiguously negative: BB as a system is unsuitable for formative assessment. That is not due to the range of test options in BB as such: the test functionality stands the comparison with professional assessment software such as QMP. In terms of reporting options, BB still suffices as long as it is restricted to item scores and the aggregated item score, the test score. In other words: for summative assessment the standard BB reports are adequate. That changes when you want to use BB for formative assessment. Then it is essential not only to indicate whether the student answered a question correctly or incorrectly, and how many correct answers there are in total, but it is moreover desirable to provide specific feedback: which incorrect answer did the student give, why is that answer incorrect, and why is the correct answer correct. Downloading that detailed data from BB tests (and the same applies to BB surveys) is a drama. BB is not capable of generating simple download files with question codes and answer codes for each student, item and answer option. Instead, the BB download file contains, for every student and every item, the full text of the question plus the full text of the answer. For tests in mathematics and statistics, where both questions and answers are built with the BB equation editor, those question and answer texts are accompanied by image files representing the formulas (for every item, for every student, for every test). In our application (with 1000 students and 40 items per test), an average download of one test administration is between 500 and 1000 megabytes (last year, when we administered the same tests in a different package, it was always less than 300 kilobytes, so more than a factor 1000 difference). That is unworkable in a situation where BB gives downloads of large files a low priority, so that they take place at best in the night after the download request, and at worst over the weekend. The essence of formative assessment is rapid feedback, and that is not compatible with the current versions of the assessment functionality in BB. A further problem is that BB does not use an unambiguous definition of learning activity (sometimes number of clicks, sometimes time, depending on the specific BB function or reporting option), and that BB time measurements are inflated because the clock keeps running during long periods of inactivity.
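To illustrate the kind of compact export that would make formative-assessment feedback workable, the sketch below condenses a verbose per-student, per-item test export into a small matrix of question and answer codes. It is a hypothetical illustration only: the input columns (student_id, item_text, answer_text, correct) and the generated code tables are assumptions, not an actual BB export format.

```python
# Minimal sketch: condense a verbose test export (full question and answer text
# per student and per item) into compact question/answer codes suitable for
# fast, formative feedback. Column names and code tables are hypothetical.
import pandas as pd

def condense_export(raw: pd.DataFrame) -> pd.DataFrame:
    """Map full question/answer texts to short codes and pivot to one row per student."""
    # Build code tables on the fly: one short code per distinct question and answer text.
    item_codes = {text: f"Q{i + 1:02d}" for i, text in enumerate(raw["item_text"].unique())}
    answer_codes = {text: f"A{i + 1:02d}" for i, text in enumerate(raw["answer_text"].unique())}

    compact = raw.assign(
        item=raw["item_text"].map(item_codes),
        answer=raw["answer_text"].map(answer_codes),
    )[["student_id", "item", "answer", "correct"]]

    # One row per student, one column per item, cells holding the chosen answer code.
    return compact.pivot(index="student_id", columns="item", values="answer")
```

A file of codes like this stays in the kilobyte range for a 1000-student, 40-item test, which is what makes same-day feedback feasible.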
7. Cost overview
Because the outcomes of the project opened up the prospect of interesting publication opportunities, all of the project leader's research time (2 days/week) from January through May was put into the project. As a result, the size of the project measured in person-hours has grown considerably (not even counting the hours of co-authors). Given the fixed size of the SURF share in the funding of hours, this implies that the realised subsidy and matching percentages have shifted strongly: roughly 20% of the hours were financed from the SURF subsidy, the remaining 80% from university funds.
Financial accountability (cost overview) for the final report, Learning Analytics 2013 incentive scheme (project with matching). Project: Learning Analytics, formatieve toetsing & leerdisposities. Reporting period: 1/7/2013 to 31/5/2014.
                              Budget, project      Realised costs,       Remaining budget   Forecast to
                              proposal (A)         reporting period (B)  (A)-(B)            end of project
Material costs
  Phases 1-4, total           0                    0                     0                  0
Personnel costs
  Phase 1                     40 hours             40 hours              0 hours            40 hours
  Phase 2                     100 hours            200+PM hours          -100-PM hours      200+PM hours
  Phase 3                     30 hours             30 hours              0 hours            30 hours
  Phase 4                     80 hours             360 hours             -280 hours         360 hours
  Project management          (included as 'PM' in the hour totals)
  Total personnel costs       250 hours            630+PM hours          -380-PM hours      630+PM hours
Other costs
  (Vilnius lecture, other conferences, contingencies; EU funding, university funding)
  Total other costs           0                    0                     0                  0
Total project costs           250 hours            630+PM hours          -380-PM hours      630+PM hours
In search for the most informative data for feedback generation: Learning Analytics in a data-rich context

Dirk T. Tempelaar (a,*), [email protected]
Bart Rienties (b), [email protected]
Bas Giesbers (c), [email protected]

(a) Maastricht University, School of Business and Economics, PO Box 616, 6200 MD Maastricht, Netherlands
(b) Open University UK, Institute of Educational Technology, UK
(c) Rotterdam School of Management, Erasmus University, Netherlands

(*) Corresponding author. Address: Department of Quantitative Economics, Maastricht University, School of Business and Economics, PO Box 616, 6200 MD Maastricht, Netherlands. Tel.: +31 43 38 83858; fax: +31 43 38 84874.
Abstract Learning analytics seek to enhance learning processes through systematic measurements of learning-related data and to provide informative feedback to learners and teachers. Track data from learning management systems (LMS) constitute a main data source for learning analytics. This empirical contribution provides an application of Buckingham Shum and Deakin Crick's theoretical framework of dispositional learning analytics: an infrastructure that combines learning dispositions data with data extracted from computer-assisted, formative assessments and LMSs. In a large introductory quantitative methods module, 922 students were enrolled in a module based on the principles of blended learning, combining face-to-face problem-based learning sessions with e-tutorials. We investigated the predictive power of learning dispositions, outcomes of continuous formative assessments and other system-generated data in modelling student performance, and their potential to generate informative feedback. Using a dynamic, longitudinal perspective, computer-assisted formative assessments seem to be the best predictor for detecting underperforming students and academic performance, while basic LMS data did not substantially predict learning. If timely feedback is crucial, both use-intensity related track data from e-tutorial systems and learning dispositions are valuable sources for feedback generation.
Keywords: Blended learning; Dispositional learning analytics; e-Tutorials; Formative assessment; Learning dispositions
1 Introduction Learning analytics provide institutions with opportunities to support student progression and to enable personalised, rich learning (Bienkowski, Feng, & Means, 2012; Oblinger, 2012; Siemens, Dawson, & Lynch, 2013; Tobarra, Robles-Gómez, Ros, Hernández, & Caminero, 2014). With the increased availability of large datasets, powerful analytics engines (Tobarra et al., 2014), and skilfully designed visualisations of analytics results (González-Torres, García-Peñalvo, & Therón, 2013), institutions may be able to use the experience of the past to create supportive, insightful models of primary (and perhaps real-time) learning processes (Rienties, Slade, Clow, Cooper, & Ferguson, submitted for publication; Baker, 2010; Stiles, 2012). According to Bienkowski et al. (2012, p. 5), “education is getting very close to a time when personalisation will become commonplace in learning”, although several researchers (García-Peñalvo, Conde, Alier, & Casany, 2011; Greller & Drachsler, 2012; Stiles, 2012) indicate that most institutions may not be ready to exploit the variety of available datasets for learning and teaching. Many learning analytics applications use data generated from learner activities, such as the number of clicks (Siemens, 2013; Wolff, Zdrahal, Nikolov, & Pantucek, 2013), learner participation in discussion forums (Agudo-Peregrina, Iglesias-Pradas, Conde-González, & Hernández-García, 2014; Macfadyen & Dawson, 2010), or (continuous) computer-assisted formative assessments (Tempelaar, Heck, Cuypers, van der Kooij, & van de Vrie, 2013; Tempelaar, Kuperus, Cuypers, van der Kooij, van de Vrie, & Heck, 2012; Wolff et al., 2013). User behaviour data are frequently supplemented with background data retrieved from learning management systems (LMS) (Macfadyen & Dawson, 2010) and other student admission systems, such as accounts of prior education (Arbaugh, 2014; Richardson, 2012; Tempelaar, Niculescu, Rienties, Giesbers, & Gijselaers, 2012). For example, in one of the first learning analytics studies, focused on 118 biology students, Macfadyen and Dawson (2010) found that some LMS variables (# of discussion messages posted, # assessments finished, # mail messages sent), but not all (e.g., time spent in the LMS), were useful predictors of student retention and academic performance. Buckingham Shum and Deakin Crick (2012) propose a dispositional learning analytics infrastructure that combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys, which are fed back to students and teachers through visual analytics. For example, longitudinal studies in motivation research (Järvelä, Hurme, & Järvenoja, 2011; Rienties, Tempelaar, Giesbers, Segers, & Gijselaers, 2012) and students' learning approaches (Nijhuis, Segers, & Gijselaers, 2008) indicate strong variability in how students learn over time in face-to-face settings (e.g., becoming more focussed on deep learning rather than surface learning), depending on the learning design, teacher support, tasks, and learning dispositions of students. Indeed, in a study amongst 730 students, Tempelaar, Niculescu, et al. (2012) found that positive learning emotions contributed positively to becoming an intensive online learner, while negative learning emotions, like boredom, contributed negatively to learning behaviour. Similarly, in an online community of practice of 133 instructors supporting EdD students, Nistor et al. (2014) found that self-efficacy (and expertise) of instructors predicted online contributions. However, a combination of LMS data with intentionally collected data, such as self-report data stemming from student responses to surveys, is an exception rather than the rule in learning analytics (Buckingham Shum & Ferguson, 2012; Greller & Drachsler, 2012; Macfadyen & Dawson, 2010; Tempelaar et al., 2013). In our empirical contribution, focusing on a large scale module in introductory mathematics and statistics, we aim to provide a practical application of such an infrastructure based on combining longitudinal learning and learner data. In collecting learner data, we opted to use three validated self-report surveys firmly rooted in current educational research, covering learning styles (Vermunt, 1996), learning motivation and engagement (Martin, 2007), and learning emotions (Pekrun, Goetz, Frenzel, Barchfeld, & Perry, 2011). This operationalisation of learning dispositions closely resembles the specification of cognitive, metacognitive and motivational learning factors relevant for the internal loop of informative tutoring feedback (e.g., Narciss, 2008; Narciss & Huth, 2006). For learning data, data sources are used from more common learning analytics applications, constituting both data extracted from an institutional LMS (González-Torres et al., 2013; Macfadyen & Dawson, 2010) and system track data extracted from the e-tutorials used for practicing and formative assessments (e.g., Tempelaar et al., 2013; Tempelaar, Kuperus, et al., 2012; Wolff et al., 2013). The prime aim of the analysis is predictive modelling (Baker, 2010; Sao Pedro, Baker, Gobert, Montalvo, & Nakama, 2013), with a focus on the roles that each of the 100+ predictor variables from the several data sources can play in generating timely, informative feedback for students.
2 Literature review 2.1 Learning analytics A broad goal of learning analytics is to apply the outcomes of analysing data gathered by monitoring and measuring the learning process (Buckingham Shum & Ferguson, 2012; Siemens, 2013). A vast body of research on student retention (Credé & Niehorster, 2012; Marks, Sibley, & Arbaugh, 2005; Richardson, 2012) indicates that academic performance can be reasonably well predicted by a range of demographic, academic integration, social integration, psycho-emotional and social factors, although most predictive models can explain only up to 30% of variance. Recent studies in learning analytics (Agudo-Peregrina et al., 2014; Macfadyen & Dawson, 2010; Tempelaar et al., 2013; Wolff et al., 2013) seem to indicate that adding LMS user behaviour to these models can substantially improve the explained variance of academic performance. However, according to Agudo-Peregrina et al. (2014) there is no consensus in the learning analytics community on which user behaviour and interaction data are appropriate to measure, understand and model learning processes and academic performance. Clow (2013, p. 692) argues that “as a field, learning analytics is data-driven and is often atheoretical, or more precisely, is not explicit about its theoretical basis”. Although several researchers have worked to link learning analytics to pedagogical theory (Clow, 2013; Dawson, 2008; Macfadyen & Dawson, 2010; Suthers, Vatrapu, Medina, Joseph, & Dwyer, 2008), this is still the exception rather than the rule. However, Macfadyen and Dawson (2010, p. 597) note that “knowledge of actual course design and instructor intentions is critical in determining which variables can meaningfully represent student effort or activity, and which should be excluded”. For example, Tempelaar et al. (2013) found empirical evidence for the role of a broad range of learning dispositions in learning analytics applications in a study amongst 1832 students. Demographic characteristics, cultural differences, learning styles, learning motivation and engagement, and learning emotions all proved to be facets of learning dispositions having a substantial impact on learning mathematics and statistics. This study extends the analysis of predictive modelling for generating learning feedback by looking at the role of any data source in a multivariate context, that is, in the presence of several alternative data sources. In Verbert, Manouselis, Drachsler, and Duval (2012), six objectives are distinguished in using learning analytics: predicting learner performance and modelling learners, suggesting relevant learning resources, increasing reflection and awareness, enhancing social learning environments, detecting undesirable learner behaviours, and detecting affects of learners. Although the combination of self-report learner data with learning data extracted from e-tutorial systems (see below) allows us to contribute to at least five of these objectives of applying learning analytics (as described in Narciss & Huth, 2006), we will focus in this contribution on the first objective: predictive modelling of performance and learning behaviour (Baker, 2010; Sao Pedro et al., 2013). The ultimate goal of this predictive modelling endeavour is to find out which components from a rich set of data sources best serve the role of generating timely, informative feedback and signalling risk of underperformance.
2.2 Formative testing and feedback A classic function of testing is that of taking an aptitude test. After completion of the learning process, we expect students to demonstrate mastery of the subject. In keeping with this testing tradition, the feedback resulting from such “classical” tests is typically limited to a grade (Boud & Falchikov, 2006; Whitelock, Richardson, Field, Van Labeke, & Pulman, 2014). Another limitation of classical summative testing is that feedback becomes available only after finishing all learning activities (Segers, Dochy, & Cascallar, 2003). An alternative form of assessment, formative assessment, has an entirely different function: that of informing student and teacher (Segers et al., 2003). This information should help to better shape teaching and learning and is especially useful when it becomes available prior to or during the learning process. Feedback plays a crucial part in assisting the regulation of learning processes (Boud & Falchikov, 2006; Hattie, 2009; Lehmann, Hähnlein, & Ifenthaler, 2014; Whitelock et al., 2014). Several alternative operationalisations to support feedback are possible. For example, using two experimental studies with different degrees of generic and directed prompts, Lehmann et al. (2014) found that directed prereflected prompts encourage positive activities in online environments. In a synthesis of 800+ meta-analyses, Hattie (2009) found that the way students receive feedback was one of the most powerful factors in enhancing learning experiences. Diagnostic testing is an example of this, just as is a test-directed learning approach, which constitutes the basic educational principle of many e-tutorial systems (Tempelaar, Rienties, & Giesbers, 2009). Because feedback from tests constitutes a main function for learning, it is crucial that this information is readily available, preferably even instantly. At this point digital testing comes on the scene: it is unthinkable to get feedback from formative assessments in time without using computers. Previous research by Wolff et al. (2013) found that a combination of LMS data with data from continuous assessments was the best predictor for performance drops amongst 7701 students. In particular, the number of clicks in an LMS just before the next assessment significantly predicted continuation of studies (Wolff et al., 2013). Similarly, in a study of six online and two blended courses, Agudo-Peregrina et al. (2014) found that interactions with assessment tools, followed by interactions with peers and teachers, and active participation significantly predicted academic performance in the six online courses. However, no clear paths of learning analytics data were found for the two blended courses. In contrast, Tempelaar et al. (2013) did find that both dispositional data and data extracted from formative testing had a substantial impact on student performance in a blended course of 1832 students.
2.3 Case study: Mathematics and statistics Our empirical contribution focuses on freshmen students in quantitative methods (mathematics and statistics) of the business & economics school at Maastricht University. This education is directed at a large and diverse group of students, which benefits the research design. As a basic LMS system, Blackboard is used to share basic course information to students. Given the restricted functionality of this LMS in terms of personalised, adaptive learning content with rich varieties of feedback and support provision (for a detailed critique on the limitations of LMS, see Conde, García, Rodríguez-Conde, Alier, & García-Holgado, 2014; García-Peñalvo et al., 2011), two external e-tutorials were utilised: MyStatLab (MSL) and MyMathLab (MML). These e-tutorials are generic LMSs for learning statistics and mathematics developed by the publisher Pearson. Although MyLabs can be used as a learning environment in the broad sense of the word (it contains, amongst others, a digital version of the textbook), it is primarily an environment for test-directed learning and practicing. Each step in the learning process is initiated by a question, and students are encouraged to (try to) answer each question. If a student does not master a question (completely), she/he can either ask for help to solve the problem step-by-step (Help Me Solve This), or ask for a fully worked example (View an Example), as demonstrated in Fig. 1. These two functionalities are examples of Knowledge of Result/response (KR) and Knowledge of the Correct Response (KCR) types of feedback; see Narciss and Huth (2006) and Narciss (2008). After receiving this type of feedback, a new version of the problem loads (parameter based) to allow the student to demonstrate his/her newly acquired mastery. When a student provides an answer and opts for ‘Check Answer’, Multiple-Try Feedback (MTF, Narciss, 2008) is provided, whereby the number of times feedback is provided for the same task depends on the format of the task (only two for a multiple choice type of task as in Fig. 1, more for open type tasks requiring numerical answers).
Fig. 1 MyMathLab task and feedback options.
3 Research methods 3.1 Research questions While an increasing body of research is becoming available on how students' usage of and behaviour in LMSs influence academic performance (e.g., Arbaugh, 2014; Macfadyen & Dawson, 2010; Marks et al., 2005; Wolff et al., 2013), how the use of e-tutorials or other formats of blended learning affects performance (e.g., Lajoie & Azevedo, 2006), and how feedback based on learning dispositions stimulates learning (Buckingham Shum & Deakin Crick, 2012), to the best of our knowledge no study has looked at how all these factors can be combined into one research context, and what the relative contributions of LMSs, formative testing, e-tutorials, and applying dispositional learning analytics to student performance are. In our empirical contribution, focusing on a large scale module in introductory mathematics and statistics followed by 922 students, we aim to provide a practical application of such an infrastructure based on combining longitudinal learning data from our LMS, the two e-tutorials, and (self-reported) learner data. The prime aim of the analysis is predictive modelling (Baker, 2010; Sao Pedro et al., 2013; Wolff et al., 2013), with a focus on the role each of these data sources can play in generating timely, informative feedback for students. Q1. To what extent do (self-reported) learning dispositions of students, LMS data and e-tutorial data (formative assessments) predict academic performance over time? Q2. To what extent do predictions based on these alternative data sources refer to unique facets of performance, and to what extent do these predictions overlap? Q3. Which source(s) of data (learning dispositions, LMS data, e-tutorial formative tests) provide the most potential to provide timely feedback for students?
3.2 Methodology 3.2.1 Context of study The educational system in which students learn mathematics and statistics is best described as a ‘blended’ or ‘hybrid’ system. The main component is face-to-face: problem-based learning (PBL), in small groups (14 students), coached by a content expert tutor (Rienties, Tempelaar, Van den Bossche, Gijselaers, & Segers, 2009; Schmidt, Van Der Molen, Te Winkel, & Wijnen, 2009; Tempelaar et al., 2009). Participation in these tutorial groups is required, as for all courses based on the Maastricht PBL system. The online component of the blend is optional: the use of the two e-tutorials (Tempelaar et al., 2013). This optional component fits the Maastricht educational model, which is student-centred and places the responsibility for making educational choices primarily on the student (Schmidt et al., 2009; Tempelaar et al., 2013). At the same time, due to strong diversity in prior knowledge of mathematics and statistics, not all students, in particular those at the high end, will benefit equally from using these environments. However, the use of the e-tutorials, and achieving good scores in the practicing modes of the MyLab environments, is stimulated by making bonus points available for good performance in the quizzes. Quizzes are taken every two weeks and consist of items that are drawn from the same item pools applied in the practicing mode. We chose this particular constellation because it stimulates students with limited prior knowledge to make intensive use of the MyLab platforms. Students with limited prior knowledge may realise that they fall behind other students, and therefore need to achieve a good bonus score both to compensate and to support their learning. The most direct way to do so is to practice frequently in the MML and MSL environments. The bonus is maximised at 20% of what one can score in the exam. The student-centred characteristic of the instructional model requires, first and foremost, adequate informative feedback to students so that they are able to monitor their study progress and their topic mastery in an absolute and relative sense. The provision of relevant feedback starts on the first day of the course, when students take two diagnostic entry tests for mathematics and statistics (Tempelaar et al., 2013). Feedback from these entry tests provides a first signal of the importance of using the MyLab platforms. Next, the MML and MSL environments take over the monitoring function: at any time students can see their progress in preparing for the next quiz, and get feedback on their performance in completed quizzes and in the practice sessions. The same (individual and aggregated) information is also available to the tutors in the form of visual dashboards (Clow, 2013; González-Torres et al., 2013; Verbert et al., 2012). Although the primary responsibility for directing the learning process lies with the student, the tutor acts complementary to that self-steering, especially in situations where the tutor considers that a more intense use of the e-tutorials is desirable, given the position of the student concerned. In this way, the application of learning analytics shapes the instructional support.
3.2.2 Participants The most recent cohort of freshmen (2013/2014), containing 922 students who in some way participated in learning activities (i.e., had been active in BlackBoard), was included. There is large diversity in the student population: only 24% were educated in the Dutch high school system. The largest group, 46% of the freshmen, were educated according to the German Abitur system. High school systems in Europe differ strongly, most particularly in the teaching of mathematics and statistics. Therefore, it is crucial that the first module offered to these students is flexible and allows for individual learning paths (Tempelaar et al., 2009, 2013; Tempelaar, Kuperus, et al., 2012). In the investigated course, students worked on average 38.2 h in MML and 24.4 h in MSL, 30–50% of the available time of 80 h for learning in both topics.
3.3 Instruments and procedure As illustrated in Fig. 2, we will investigate the relationships between a range of data sources, leading to in total 102 different variables. In the subsections that follow, the several data sources are described that provide the predictor variables for our predictive modelling.
Fig. 2 Visualisation of module structure and types of learner and learning data.
3.3.1 Registration systems capturing demographic data In line with the academic retention and academic analytics literature (Marks et al., 2005; Richardson, 2012), several demographic factors are known to influence performance. A main advantage of this type of data is that institutions can relatively easily extract this information from student admission systems, which makes these factors logical candidates to include in learning analytics models. Demographic data were extracted from central registration systems: nationality, gender, age and prior education. Since, by law, introductory modules like ours need to be based on the coverage of Dutch high school programs, we converted nationality data into an indicator for having been educated in the Dutch high school system. 24% of students were educated in the Dutch high school system, 76% in international systems, mostly of continental European countries. About 39% of students are female and 61% male. Age demonstrates very little variation (nearly all students are below 20) and no relationship with any performance measure, and was excluded. The main demographic variable is the type of mathematics track in high school: advanced, preparing for sciences or technical studies in higher education, or basic, preparing for the social sciences (the third level, mathematics for arts and humanities, does not provide access to our program). Exactly two thirds of the students have the basic mathematics level, one third the advanced level. (See Tempelaar et al., 2009, 2013; Tempelaar, Kuperus, et al., 2012 for a detailed description.)
3.3.2 Diagnostic entry tests At the very start of the course, and thus part of the Week0 data, are the entry tests for mathematics and statistics that all students were required to take. Both entry tests are based on national projects directed at signalling deficiencies in the areas of mathematics and statistics encountered in the transition from high school to university (see Tempelaar, Niculescu, et al., 2012 for an elaboration). Topics included in the entry tests refer to foundational topics, often covered in junior high school programs, such as basic algebraic skills or statistical literacy.
3.3.3 Learning dispositions data Learning dispositions of three different types were included: learning styles, learning motivation and engagement, and learning emotions. The first two facets were measured at the start of the module and, from the longitudinal perspective, are assigned to the Week0 data. Learning style data are based on the learning style model of Vermunt (1996, 1998). Vermunt's model distinguishes learning strategies (deep, step-wise, and concrete ways of processing learning topics) and regulation strategies (self-regulation, external regulation, and lack of regulation of learning). Recent Anglo-Saxon literature on academic achievement and dropout assigns an increasingly dominant role to the theoretical model of Andrew Martin (2007): the ‘Motivation and Engagement Wheel’. This model includes both behaviours and thoughts, or cognitions, that play a role in learning. Both are subdivided into adaptive and mal-adaptive (or obstructive) forms. Adaptive thoughts consist of Self-belief, Value of school and Learning focus, whereas adaptive behaviours consist of Planning, Study management and Perseverance. Maladaptive thoughts include Anxiety, Failure Avoidance and Uncertain Control, and lastly, maladaptive behaviours include Self-Handicapping and Disengagement. The resulting quadrants are: adaptive thoughts and adaptive behaviours (the ‘boosters’), mal-adaptive behaviours (the ‘guzzlers’) and obstructive thoughts (the ‘mufflers’). The third component, learning emotions, is more than a disposition: it is also an outcome of the learning process. Therefore, the timing of the measurement of learning emotions is Week4, halfway into the module, so that students have sufficient involvement and experience in the module to form specific learning emotions, but still timely enough to make it a potential source of feedback. Learning emotions were measured through four scales of the Achievement Emotions Questionnaire (AEQ) developed by Pekrun et al. (2011): Enjoyment, Anxiety, Boredom and Hopelessness. All learning dispositions are administered through self-report surveys scored on a 7-point Likert scale.
3.3.4 Learning management system User track data from LMSs are often at the heart of learning analytics applications. In our context, too, intensive use has been made of our LMS, BlackBoard (BB). In line with Agudo-Peregrina et al. (2014), we captured tracking data from six learning activities. First, the diagnostic entry tests were administered in BB, and through the MyGrades function students could access feedback on their test attempts. Second, the surveys for learning dispositions were administered in BB. Third, two lectures per week were provided, overview lectures at the start of the week and recap lectures at the end of the week, which were all videotaped and made available as webcasts through BB. Fourth, several exercises for doing applied statistical analyses, including a student project, were distributed through BB, with a requirement to upload solution files again in BB. Finally, communication from the module staff, various course materials and a series of old exams (to practice for the final exam) were made available in BB. For all individual BB items, Statistics Tracking was switched on to create use-intensity data at BB function and item level.
3.3.5 E-tutorials MyMathLab and MyStatLab Students worked in the MyMathLab and MyStatLab e-tutorials for all seven weeks, practicing homework exercises selected by the module coordinator. The MyLab systems track three measures for each task: the mastery score achieved (MMLMastery), time on task (MMLHours), and the number of attempts required to reach the mastery level achieved (MMLAttempts). These data were aggregated over the on average 25 weekly tasks for mathematics, and about 20 tasks for statistics, to produce six predictors, three for each topic, for each of the seven weeks. Less aggregated data sets have been investigated, but due to high collinearity in the data of individual tasks, these produced less stable prediction models. The three (bonus) quizzes took place in weeks 3, 5 and 7. Quizzes were administered in the MyLab tools and consisted of selections of practice tasks from the two previous weeks.
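A minimal sketch of the weekly aggregation step described above is given below; the per-task log format (columns student_id, topic, week, mastery, hours, attempts) is an assumption for illustration, not the actual MyLab export schema.

```python
# Minimal sketch: aggregate per-task MyLab logs into the weekly predictors
# (mastery, hours, attempts per topic). Input column names are hypothetical.
import pandas as pd

def weekly_predictors(tasks: pd.DataFrame) -> pd.DataFrame:
    """Aggregate task-level records into one row per student and week."""
    agg = (tasks
           .groupby(["student_id", "topic", "week"])
           .agg(mastery=("mastery", "mean"),     # average mastery over the ~20-25 weekly tasks
                hours=("hours", "sum"),          # total time on task in that week
                attempts=("attempts", "sum"))    # total attempts in that week
           .unstack("topic"))
    # Flatten the column index to names such as mastery_math, hours_stats, ...
    agg.columns = [f"{measure}_{topic}" for measure, topic in agg.columns]
    return agg.reset_index()
```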
3.3.6 Academic performance
Six measures of academic performance in the quantitative methods module were included for predictive modelling: the scores on the two topic components of the final written exam (MathExam and StatsExam), the aggregated scores for the three quizzes in both topics (MathQuiz and StatsQuiz), the overall score in the module, QMTotal (weighting the final exam with weight 5, and the bonus score from quizzes and homework with weight 1), and the module passing rate, QMPass.
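Read literally, and assuming exam and bonus scores are expressed on a common scale, this 5:1 weighting amounts to the following composite (a paraphrase of the stated weights, not a formula taken from the module regulations):

\[ \mathrm{QMTotal} \;=\; \frac{5 \cdot \mathrm{Exam} + 1 \cdot \mathrm{Bonus}}{5 + 1} \]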
3.4 Data analysis
Complete information on the various instruments was obtained for 873 out of 922 students (95%). The prediction models applied in this study are all of the linear, hierarchical regression type. More complex models have been investigated, in particular interaction models, but none of these more advanced model types passed our model selection criterion that prediction models should be stable over all seven weekly intervals. Collinearity in the track data similarly forced us to aggregate that type of data into weekly units; models based on less aggregated data, such as individual task data, gave rise to collinearity issues.
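A minimal sketch of such longitudinal prediction modelling is given below: an ordinary least squares model is re-estimated on incrementally growing weekly predictor sets, and its multiple correlation R is recorded per week. The data frame, the column naming pattern and the outcome name are assumptions for illustration, not the study's actual variable names.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def multiple_R(X, y):
    """Multiple correlation R: the square root of R^2 of an OLS fit."""
    model = LinearRegression().fit(X, y)
    return np.sqrt(model.score(X, y))

def weekly_R_profile(df, outcome, sources, weeks=range(1, 8)):
    """Re-estimate the model on incrementally growing weekly data sets (Week1..Week7).

    `df` holds one row per student; predictor columns are assumed to follow the
    hypothetical pattern '<source>_w<week>', e.g. 'MMLMastery_w3'.
    """
    profile, cols = {}, []
    for week in weeks:
        cols += [c for c in (f"{s}_w{week}" for s in sources) if c in df.columns]
        profile[week] = multiple_R(df[cols], df[outcome])
    return pd.Series(profile, name=f"R for {outcome}")

# Example call (hypothetical column names):
# weekly_R_profile(students, "MathExam", ["MMLMastery", "MMLHours", "MMLAttempts"])
```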
4 Results
The aim of this study being predictive modelling in a rich data context, we will focus the reporting on the coefficient of multiple correlation, R, of the several prediction models. Although the ultimate aim of prediction modelling is often the comparison of explained variation, which is based on the square of the multiple correlation, we opted for using R itself, to allow for more detailed comparisons between alternative models. Values for R are documented in Table 1 for prediction models based on alternative data sets. For data sets that are longitudinal in nature and allow for incremental weekly data sets, the growth in predictive power is illustrated in time graphs for BB track data, MyLabs track data and test performance data. To ease comparison, all graphs share the same vertical scale.

Table 1 Predictive power, as multiple correlation R, of various data sets and various timings, for six performance measures.

Data source               Timing   MathExam  StatsExam  MathQuiz  StatsQuiz  QMscore  QMpass
Demographics              Week0    .431      .291       .393      .190       .393     .302
EntryTests                Week0    .429      .299       .451      .218       .405     .283
Learning styles           Week0    .240      .219       .223      .221       .250     .212
Motivation & engagement   Week0    .304      .309       .326      .320       .343     .271
BlackBoard                Week0    .121      .091       .162      .153       .138     .100
AllWeek0                  Week0    .587      .451       .582      .406       .573     .445
BlackBoard                Week1    .133      .132       .183      .161       .163     .104
MyLabs                    Week1    .387      .268       .480      .362       .372     .310
AllWeek1                  Week1    .611      .490       .662      .564       .635     .497
BlackBoard                Week2    .151      .140       .196      .171       .178     .133
MyLabs                    Week2    .395      .330       .500      .391       .390     .324
AllWeek2                  Week2    .615      .502       .673      .585       .645     .506
BlackBoard                Week3    .162      .141       .198      .171       .181     .140
MyLabs                    Week3    .482      .372       .613      .443       .463     .399
Quiz1                     Week3    .637      .540       .851      .779       .715     .568
AllWeek3                  Week3    .716      .621       .887      .820       .781     .625
Learning emotions         Week4    .481      .333       .481      .294       .473     .380
BlackBoard                Week4    .163      .142       .212      .194       .184     .144
MyLabs                    Week4    .503      .399       .651      .516       .497     .410
AllWeek4                  Week4    .736      .632       .899      .830       .799     .646
BlackBoard                Week5    .172      .144       .215      .194       .190     .146
MyLabs                    Week5    .541      .475       .695      .571       .567     .455
Quiz2                     Week5    .711      .589       .962      .948       .792     .613
AllWeek5                  Week5    .762      .659       .968      .954       .833     .662
BlackBoard                Week6    .173      .150       .217      .210       .191     .146
MyLabs                    Week6    .547      .483       .700      .572       .566     .461
AllWeek6                  Week6    .762      .662       .969      .954       .835     .663
BlackBoard                Week7    .177      .150       .221      .213       .191     .146
MyLabs                    Week7    .557      .493       .707      .572       .567     .464
Quiz3                     Week7    .727      .615       1.000     1.000      .816     .633
AllWeek7                  Week7    .772      .670       1.000     1.000      .845     .675
4.1 Predicting performance by demographic data
For the mathematics-related performance measures, and for measures relating to completion of the module, there is only one significant predictor variable: the mathematics track in high school. Its impact is substantial: its beta weight in predicting, for example, MathExam is 0.43, by itself explaining about 20% of the variation. Performance in statistics is different: there is a substantial impact of the internationalisation dummy, favouring students educated in the Dutch high school system. That impact finds its explanation in the extraordinary role of statistics in the Dutch high school system, in comparison to other continental European systems. Lastly, gender is significant in predicting StatsExam, favouring male students. However, more predictors do not imply better prediction: because statistics is absent from so many high school programmes, mathematics performance is much better predicted than statistics performance, with overall performance in an intermediate position. In other words, in line with previous research (Marks et al., 2005; Richardson, 2012; Tempelaar et al., 2013), prior education seems to be a useful factor to include in learning analytics modelling.
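As a quick check of this figure: for a single standardized predictor, the explained variation equals the squared beta weight,

\[ R^2 \;=\; \beta^2 \;=\; 0.43^2 \;\approx\; 0.18, \]

i.e. on the order of the 20% reported above.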
4.2 Predicting performance by EntryTest data
Entry test data have substantial predictive power both for performance in mathematics (R = .43 for MathExam, R = .45 for MathQuiz) and for overall performance (R = .41 for QMscore). These correlations are very similar in value to those of the prior mathematics education variables, indicating that the entry tests provide a good summary of what students have learned in high school. The predictive power for statistics-related performance is lower (R = .30 for StatsExam, R = .22 for StatsQuiz), because many students have not been educated in statistics before, so that the entry test cannot be very informative of later performance in the module.
4.3 Predicting performance by Learning Dispositions data
In terms of predictive power, learning dispositions sit in between BB track data and the three data sources of a more cognitive nature, as is clear from Table 1. Different from the MyLab, EntryTest, and demographic predictors, the impact of learning dispositions is of a rather constant level, irrespective of the type of performance measure. For learning styles, R ranges from .21 for the passing rate to .25 for the overall score, whereas for the motivation and engagement data the range is from .27 to .34. Learning emotions achieve even higher levels of predictive power, but as noted before these variables are measured in the midst of the module, so they are themselves best viewed as a mixture of disposition and outcome of the learning process. The prediction relationships take different shapes, depending on the performance measure. For instance, amongst the learning style variables, critical processing of learning material, the processing strategy most indicative of deep learning, acts as the most powerful predictor of exam performance, both for mathematics and for statistics. In contrast, the regulation strategy self-regulation of learning content is the strongest predictor of quiz performance (with a negative beta, indicating that students who follow their own learning agenda underperform relative to students who adopt the agenda built into the MyLabs).
4.4 Predicting performance by learning management system data
Given the wealth of BB data, a preliminary analysis was applied to find out which indicators of learning intensity performed well in each of the consecutive weeks. BB data are highly collinear, implying different choices of predictor variables in the models for each of the seven weeks. The single variable playing a consistent role in all of the weekly models is overall activity in BB: the total number of clicks per week. Fig. 3 demonstrates the predictive power, in terms of multiple correlation coefficients, of longitudinal models based on overall user activity.
Fig. 3 Predictive power of BB track data for six performance measures.
The figure signals two important features. First, there is little progress in predictive power over time: the earliest predictions are about as good as later predictions. Week0 BB usage, that is, use of BB in the week before the module starts, actually has the highest predictive power for the several performance variables of all individual weeks. In line with previous findings (Agudo-Peregrina et al., 2014; Macfadyen & Dawson, 2010), the second observation is that the predictive power of our LMS data remains low: the multiple correlations for all six performance indicators converge to a value of about 0.2, indicating that no more than about 4% of performance variation can be explained by BB track data. Although there is strong variation in LMS data, this variation is not consistently related to variation in performance. There is one exception to this general result: the number of downloads of old exams for practicing purposes is a reasonable predictor (beta equal to 0.25). However, nearly all of these downloads took place in Week8, the week of the exam itself, which makes this variable of little use in a prediction model aimed at providing early feedback to students.
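To make the "total number of clicks per week" predictor concrete, the sketch below aggregates a hypothetical BlackBoard Statistics Tracking export into weekly click counts per student. The file name, column names and module start date are assumptions for illustration, not details taken from the study.

```python
import pandas as pd

# Hypothetical click-level export: one row per logged action, with columns
# 'student_id', 'timestamp' and 'item' (names assumed for illustration).
bb_log = pd.read_csv("bb_tracking.csv", parse_dates=["timestamp"])

module_start = pd.Timestamp("2013-09-02")  # assumed first day of Week1
week = ((bb_log["timestamp"] - module_start).dt.days // 7 + 1).clip(lower=0)

# Total clicks per student per week, reshaped to one column per week (BBclicks_w0 .. BBclicks_w8).
clicks_per_week = (bb_log.assign(week=week)
                         .groupby(["student_id", "week"])
                         .size()
                         .unstack(fill_value=0)
                         .add_prefix("BBclicks_w"))
```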
4.5 Predicting performance by MyMathLab and MyStatLab e-tutorial data
After aggregation to weekly data, three use intensity measures remain: mastery level, time on task, and average number of attempts per task, both for MML and for MSL. All of these variables are highly positively correlated: for mathematics, for example, the correlation between mastery level and time over the whole module is .49, the correlation between mastery level and number of attempts is .63, and that between time and attempts is .47. However, if we include all three predictor variables in one equation, the outcome (for mathematics, estimated over all weeks) is a set of prediction equations with values of R of .51 and .66, respectively. A remarkable and very consistent feature of all prediction equations using mastery, time, and attempts data is that the beta of mastery is always positive, whereas the betas of time on task and number of attempts are always negative, although all bivariate correlations between time on task and the performance measures are positive. There is, however, a simple explanation for this sign reversal: the mastery, time on task and attempts variables are strongly collinear. Practicing longer in the two MyLab systems increases expected performance, since students who practice more achieve higher mastery levels. Similarly, redoing a task a second or third time will generally increase the mastery level.1 Now that the potential of building prediction models for performance on data from the two MyLab systems has been established, the next step was to design these prediction models using incremental sets of track data. Starting with the Week1 data set, we extend the data set in weekly steps, arriving after seven weeks at the final set of predictor variables, containing mastery, time on task and number of attempts data of seven consecutive weeks for the MML and MSL systems. Fig. 4 describes the development of the multiple correlation coefficients R over time, that is, over subsequent weekly data sets.
Fig. 4 Predictive power of MML and MSL system data for six performance measures.
Since the predictor data sets are incremental, the values of the multiple correlations increase over the weeks. Those for performance in the mathematics exam and for the overall score start at values around 0.4 in the first week and increase to values between 0.5 and 0.6 in the last week. In other words, activity in the e-tutorials seems to be a good candidate for inclusion in learning analytics modelling.
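The sign reversal discussed above (positive bivariate correlations, but negative regression weights for time on task and attempts once mastery is controlled for) can be reproduced qualitatively with a small simulation. The latent 'ability'/'effort' story and all coefficients below are illustrative assumptions, not estimates from the study's data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
# Hypothetical generative story: 'ability' mainly drives exam scores, 'effort' mainly
# drives time on task and attempts, and mastery reflects both.
ability, effort = rng.standard_normal((2, n))
mastery  = 0.8 * ability + 0.6 * effort + 0.3 * rng.standard_normal(n)
time_on  = 0.2 * ability + 0.9 * effort + 0.3 * rng.standard_normal(n)
attempts = 0.1 * ability + 0.9 * effort + 0.3 * rng.standard_normal(n)
exam     = 0.9 * ability + 0.2 * effort + 0.5 * rng.standard_normal(n)

# All pairwise correlations are positive.
print(np.round(np.corrcoef([exam, mastery, time_on, attempts]), 2))

# Multiple regression: typically mastery gets a positive weight, while time on task
# and attempts get negative weights, despite their positive bivariate correlations.
X = np.column_stack([np.ones(n), mastery, time_on, attempts])
beta, *_ = np.linalg.lstsq(X, exam, rcond=None)
print(np.round(beta, 2))  # intercept, mastery, time on task, attempts
```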
4.6 Predicting performance by Quiz data
That the best predictor of performance is performance itself will not surprise many teachers and researchers. Although the quizzes in our context are more formative than summative in nature (they bring only a bonus score, to a maximum of 20% of what one can score), they constitute the most reliable predictor of all six performance measures. Focusing on performance in the exam (since predicting quiz scores, or total scores, from the quizzes themselves brings about endogeneity issues), multiple correlation values develop from R = .64 to R = .73 for mathematics, and from R = .54 to R = .62 for statistics, over the three quizzes. Fig. 5 demonstrates this development in predictive power, where the EntryTests are used as the starting point of the time trajectories. In line with Wolff et al. (2013), quizzes seem to be a good indicator of learning in prediction modelling.
Fig. 5 Predictive power of EntryTest and Quiz data for six performance measures.
4.7 Predicting performance by all weekly data
The very last step in assessing the quality of the prediction models entails the combination of the different data sources in each of the longitudinal models. Fig. 6 provides insight into the development of predictive power over time when all available data are combined.
Fig. 6 Predictive power of all data combined for six performance measures.
As indicated before, the predictive power towards the Quiz performance components is an artefact of using predictor variables that increasingly coincide with the predicted performance component. The main criterion is therefore the prediction of the two exam components of performance. In Week3, the multiple correlations R for predicting MathExam and StatsExam are a substantial .72 and .62. Given the importance of Week3 data with regard to potential interventions for students at risk (i.e., at risk of failing the module and/or dropping out), Fig. 7 provides scatterplots of the prediction equations for the two exam performance components in the first row and the two quiz performance components in the second row, with mathematics in the left panel and statistics in the right panel. Scatterplots produced for later weeks demonstrate higher predictive power, but leave less time to intervene: with still five full weeks to catch up, Week3 feedback appears to be the best compromise between timely feedback and sufficiently high predictive power (the ribbon pattern in the first two panels is a consequence of exam scores being expressed as integer numbers).
Fig. 7 Scatterplots of prediction equations for exam (first row) and quiz (second row) performance, for mathematics (left) and statistics (right).
When we compare the predictive power of all data combined with that of prediction models based on a single data source, there is evidence of considerable overlap in the information content of the various data sources. Especially MyLab track data, EntryTest data, Quiz data and prior education data share variation. From the perspective of providing unique information, the learning dispositions data set is the most complementary. For example, in the Week0 data set, demographic variables predict MathExam with R = .43 and StatsExam with R = .29. Adding learning dispositions to the demographic variables increases R to .53 and .42, respectively, with entry testing and BB data having the more limited effect of further increasing R to .59 and .45. Part of the complementary nature of disposition data lies in the specific position it takes in predicting the passing rate. Of all performance variables, the passing rate is by far the most difficult to predict, since the score required to pass lies near the top of the score distribution. Relatively small differences in test scores thus make the difference between failing and passing, making the passing rate more difficult to predict than the final score itself. From that perspective, disposition data do a relatively good job in pass/fail predictions, providing support for the notion of Buckingham Shum and Deakin Crick (2012) that learning analytics should combine LMS data with learner data.
5 Discussion
In this empirical study into predictive modelling of student performance, we investigated several different data sources to explore their potential for generating informative feedback for students and teachers using learning analytics: data from registration systems, entry test data, students' learning dispositions, BlackBoard tracking data, tracking data from two e-tutorial systems, and data from systems for formative, computer assisted assessment. In line with recommendations by Agudo-Peregrina et al. (2014), we collected both dynamic, longitudinal user data and semi-static data, such as prior education. It appears that the role of BlackBoard track data in predicting student performance is dominated by the predictive power of any of the other data components, implying that in applications with such rich data available, BlackBoard data have no added value in predicting performance and signalling underperforming students. This seems to confirm initial findings by Macfadyen and Dawson (2010), who found that simple clicking behaviour in an LMS is at best a poor proxy for the actual learning behaviour of students. Data extracted from the testing mode of the MyLab systems, the Quiz data, dominate in a similar respect all other data, including data generated by the practicing mode of the MyLabs, indicating the predictive power of "true" assessment data (even if it comes from assessments that are more formative than summative in nature). However, assessment data are typically delayed data (Boud & Falchikov, 2006; Whitelock et al., 2014; Wolff et al., 2013), not available before midterm or, as in our case, the third week of the module. Up to the moment this richest data component becomes available, entry test data and the combination of mastery data and use intensity data generated by the e-tutorial systems are a second-best alternative for true assessment data. This links well with Wolff et al. (2013), who found that performance on initial assessments during the first parts of online modules was a substantial predictor of final exam performance. A similar conclusion can be drawn with regard to the learning disposition data: up to the moment that assessment data become available, they serve a unique role in predicting student performance and signalling underperformance beyond the system track data of the e-tutorials. From the moment that computer assisted, formative assessment data become available, their predictive power is dominated by that of performance in those formative assessments. Dispositions data are not as easily collected as system tracking data from LMSs or e-tutorial systems (Buckingham Shum & Deakin Crick, 2012). The answer to the question whether the effort to collect dispositional data is worthwhile is therefore strongly dependent on when richer (assessment) data become available, and on the need for timely signalling of underperformance. If timely feedback is required, the combination of data extracted from e-tutorials, in both practicing and testing modes, and learning disposition data appears to be the best mix to serve learning analytics applications. In contrast to Agudo-Peregrina et al. (2014), who found no consistent patterns in two blended courses using learning analytics, we did find that our mix of various data sources allowed us to accurately predict academic performance, both from a static and from a dynamic perspective. The inclusion of extensive usage of computer-assisted tests might explain part of this difference, as might the more fine-grained learning disposition data that allowed us to model learning patterns from the start of the module. Even if dispositions were to overlap more strongly with other predictor variables, such as prior education, dispositions would still hold a unique position with regard to the final aim of feedback. Feedback is informative if two conditions are satisfied: it is predictive, and it allows for intervention. Feedback based on prior education may be strongly predictive, but is certainly incapable of informing interventions that eliminate the foreseen cause of underperformance (Boud & Falchikov, 2006; Whitelock et al., 2014). Feedback related to learning dispositions, such as signalling suboptimal learning strategies or inappropriate regulation of learning, is generally open to interventions to improve the learning process (Lehmann et al., 2014; Pekrun et al., 2011). So, to the extent that learning dispositions share predictive power with alternative aspects of learning, feedback in terms of these dispositions will generally be preferred over feedback framed in any of the other aspects of learning. These findings strongly support the integrative approach to learning analytics as advocated by Buckingham Shum and Deakin Crick (2012). As '[t]here is substantial and growing evidence within educational research that learners' orientation towards learning -their learning dispositions- significantly influence the nature of their engagement with new learning opportunities…' (Buckingham Shum & Deakin Crick, 2012, p. 2), the combination of 'first generation' technology-driven learning
analytics with insights from educational research provides the step towards 'second generation' learning analytics: a development of learning analytics that empowers students to become independent professionals who can shape their own learning. Future developments should further investigate how best to present feedback based on learning disposition data, in combination with technology-generated data, to students. A crucial limitation of our study is that we focused our analyses on formal learning interactions only, as measured by data from the three learning systems. Although we added self-reported data from a range of learning disposition instruments to get a more fine-grained, nuanced understanding of the data, several studies indicate that students increasingly use informal networks (Agudo-Peregrina et al., 2014; Hommes et al., 2012) and learning tools (e.g., Facebook, Twitter, text messages) to share knowledge and learn together. For example, Hommes et al. (2012) found that informal social learning links primarily predicted academic performance amongst 300 medical students. Using dynamic social network analyses, Rienties, Hernandez Nanclares, Hommes, and Veermans (2014) found that 30-80% of learning occurred outside formally assigned groups. Agudo-Peregrina et al. (2014) argue that learning analytics should take into consideration data from
Personal Learning Environments (PLEs), although several ethical issues (Rienties et al., submitted for publication) need to be addressed in terms of informed consent if institutions are to use PLE data, such as Facebook data.
6 Conclusion
The generation of timely feedback based on early performance predictions and early signalling of underperformance are crucial objectives in many learning analytics applications. The added value of data sources for such applications will therefore depend on the predictive power of the data, the timely availability of the data, and the uniqueness of the information in the data. In this study, we integrated data from many different sources and found evidence for strong predictive power of data from formative testing. However, not all modules will contain formative testing, and even if they do, data from formative testing might not become available early enough. In that case, data from e-tutorial systems such as the MyLabs, both mastery level data and time on task and attempts data, constitute a good second-best information source, as do entry test data or prior education data. Learning data from these various sources share a cognitive nature, and thus share an important overlap in predictive power. Learner data in the form of learning dispositions have a unique role in such learning analytics applications, since their contribution to performance prediction is largely orthogonal to that of the other data sources. In a rich data context such as the one investigated here, the role of BB track data appeared to be minimal.
Acknowledgement The project reported here has been supported and co-financed by the Dutch SURF-foundation as part of the Learning Analytics Stimulus program.
Appendix A. Supplementary material Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.chb.2014.05.038.
References
Agudo-Peregrina Á.F., Iglesias-Pradas S., Conde-González M.Á. and Hernández-García Á., Can we predict success from log data in VLEs? Classification of interactions for learning analytics and their relation with performance in VLE-supported F2F and online learning, Computers in Human Behavior 31, 2014, 542–550, DOI: 10.1016/j.chb.2013.05.031.
Arbaugh J.B., System, scholar, or students? Which most influences online MBA course effectiveness?, Journal of Computer Assisted Learning, 2014, DOI: 10.1111/jcal.12048.
Tempelaar, D. T., Heck, A., Cuypers, H., van der Kooij, H., & van de Vrie, E. (2013). Formative Assessment and Learning Analytics. In D. Suthers & K. Verbert (Eds.), Proceedings of the 3rd International Conference on Learning Analytics and Knowledge, 205–209. New York: ACM. DOI: 10.1145/2460296.2460337.
Tempelaar, D. T., Kuperus, B., Cuypers, H., van der Kooij, H., van de Vrie, E., & Heck, A. (2012). The Role of Digital, Formative Testing in e-Learning for Mathematics: A Case Study in the Netherlands. In: "Mathematical e-learning" [online dossier]. Universities and Knowledge Society Journal (RUSC), 9(1). UoC.
Tempelaar, D. T., Niculescu, A., Rienties, B., Giesbers, B., & Gijselaers, W. H. (2012). How achievement emotions impact students' decisions for online learning, and what precedes those emotions. Internet and Higher Education, 15(3), 161–169. DOI: 10.1016/j.iheduc.2011.10.003.
Tempelaar, D. T., Rienties, B., & Giesbers, B. (2009). Who profits most from blended learning? Industry and Higher Education, 23(4), 285–292.
Rienties, B., Tempelaar, D. T., Giesbers, B., Segers, M., & Gijselaers, W. H. (2012). A dynamic analysis of social interaction in Computer Mediated Communication; a preference for autonomous learning. Interactive Learning Environments. DOI: 10.1080/10494820.2012.707127.
Rienties, B., Tempelaar, D. T., Van den Bossche, P., Gijselaers, W. H., & Segers, M. (2009). The role of academic motivation in Computer-Supported Collaborative Learning. Computers in Human Behavior, 25(6), 1195–1206. DOI: 10.1016/j.chb.2009.05.012.
Rienties, B., Slade, S., Clow, D., Cooper, M., & Ferguson, R. (submitted for publication). Risks and ethical concerns of Learning Analytics: a state-of-the-art review.
Rienties, B., Hernandez Nanclares, N., Hommes, J., & Veermans, K. (2014). Understanding emerging knowledge spillovers in small-group learning settings; a networked learning perspective. In V. Hodgson, M. De Laat, D. McConnell & T. Ryberg (Eds.), The Design, Experience and Practice of Networked Learning (Vol. 7, 127–148). Dordrecht: Springer.
Baker R., Data mining for education, International Encyclopedia of Education 7, 2010, 112–118.
Bienkowski, M., Feng, M., & Means, B. (2012). Enhancing teaching and learning through educational data mining and learning analytics: An issue brief. US Department of Education, Office of Educational Technology (pp. 1–57).
Boud D. and Falchikov N., Aligning assessment with long-term learning, Assessment & Evaluation in Higher Education 31 (4), 2006, 399–413, DOI: 10.1080/02602930600679050.
Buckingham Shum, S., & Deakin Crick, R. (2012). Learning dispositions and transferable competencies: Pedagogy, modelling and learning analytics. Paper presented at the 2nd International Conference on Learning Analytics & Knowledge, Vancouver, British Columbia.
Buckingham Shum S. and Ferguson R., Social learning analytics, Journal of Educational Technology & Society 15 (3), 2012, DOI: 10.1145/2330601.2330616.
Clow D., An overview of learning analytics, Teaching in Higher Education 18 (6), 2013, 683–695, DOI: 10.1080/13562517.2013.827653.
Conde M.A., García F., Rodríguez-Conde M.J., Alier M. and García-Holgado A., Perceived openness of learning management systems by students and teachers in education and technology courses, Computers in Human Behavior 31, 2014, 517–526, DOI: 10.1016/j.chb.2013.05.023.
Credé M. and Niehorster S., Adjustment to college as measured by the Student Adaptation to College Questionnaire: A quantitative review of its structure and relationships with correlates and consequences, Educational Psychology Review 24 (1), 2012, 133–165, DOI: 10.1007/s10648-011-9184-5.
Dawson S., A study of the relationship between student social networks and sense of community, Journal of Educational Technology & Society 11 (3), 2008.
García-Peñalvo F.J., Conde M.Á., Alier M. and Casany M.J., Opening learning management systems to personal learning environments, Journal of Universal Computer Science 17 (9), 2011, 1222–1240, DOI: 10.3217/jucs-017-09-1222.
González-Torres A., García-Peñalvo F.J. and Therón R., Human-computer interaction in evolutionary visual software analytics, Computers in Human Behavior 29 (2), 2013, 486–495, DOI: 10.1016/j.chb.2012.01.013.
Greller W. and Drachsler H., Translating learning into numbers: A generic framework for learning analytics, Journal of Educational Technology & Society 15 (3), 2012.
Hattie J., Visible learning: A synthesis of over 800 meta-analyses relating to achievement, 2009, Routledge, New York.
Hommes J., Rienties B., de Grave W., Bos G., Schuwirth L. and Scherpbier A., Visualising the invisible: A network approach to reveal the informal social side of student learning, Advances in Health Sciences Education 17 (5), 2012, 743–757, DOI: 10.1007/s10459-012-9349-0.
Järvelä S., Hurme T. and Järvenoja H., Self-regulation and motivation in computer-supported collaborative learning environments, In: Ludvigson S., Lund A., Rasmussen I. and Säljö R. (Eds.), Learning across sites: New tools, infrastructure and practices, 2011, Routledge, New York, NY, 330–345.
Lajoie S.P. and Azevedo R., Teaching and learning in technology-rich environments, In: Alexander P. and Winne P. (Eds.), Handbook of educational psychology, 2nd ed., 2006, Erlbaum, Mahwah, NJ, 803–821.
Lehmann T., Hähnlein I. and Ifenthaler D., Cognitive, metacognitive and motivational perspectives on preflection in self-regulated online learning, Computers in Human Behavior 32, 2014, 313–323, DOI: 10.1016/j.chb.2013.07.051.
Macfadyen L.P. and Dawson S., Mining LMS data to develop an “early warning system” for educators: A proof of concept, Computers & Education 54 (2), 2010, 588–599, DOI: 10.1016/j.compedu.2009.09.008.
Marks R.B., Sibley S.D. and Arbaugh J.B., A structural equation model of predictors for effective online learning, Journal of Management Education 29 (4), 2005, 531–563, DOI: 10.1177/1052562904271199.
Martin A.J., Examining a multidimensional model of student motivation and engagement using a construct validation approach, British Journal of Educational Psychology 77 (2), 2007, 413–440, DOI: 10.1348/000709906X118036.
Narciss S. and Huth K., Fostering achievement and motivation with bug-related tutoring feedback in a computer-based training for written subtraction, Learning and Instruction 16 (4), 2006, 310–322, DOI: 10.1016/j.learninstruc.2006.07.003.
Narciss S., Feedback strategies for interactive learning tasks, In: Spector J.M., Merrill M.D., van Merrienboer J.J.G. and Driscoll M.P. (Eds.), Handbook of research on educational communications and technology, 3rd ed., 2008, Lawrence Erlbaum Associates, Mahwah, NJ, 125–144.
Nijhuis J., Segers M. and Gijselaers W., The extent of variability in learning strategies and students’ perceptions of the learning environment, Learning and Instruction 18 (2), 2008, 121–134, DOI: 10.1016/j.learninstruc.2007.01.009.
Nistor N., Baltes B., Dascălu M., Mihăilă D., Smeaton G. and Trăuşan-Matu Ş., Participation in virtual academic communities of practice under the influence of technology acceptance and community factors. A learning analytics application, Computers in Human Behavior 34, 2014, 339–344, DOI: 10.1016/j.chb.2013.10.051.
Oblinger D.G., Let’s talk… Analytics, EDUCAUSE Review 47 (4), 2012, 10–13.
Pekrun R., Goetz T., Frenzel A.C., Barchfeld P. and Perry R.P., Measuring emotions in students’ learning and performance: The Achievement Emotions Questionnaire (AEQ), Contemporary Educational Psychology 36 (1), 2011, 36–48, DOI: 10.1016/j.cedpsych.2010.10.002.
Richardson J.T.E., The attainment of White and ethnic minority students in distance education, Assessment & Evaluation in Higher Education 37 (4), 2012, 393–408, DOI: 10.1080/02602938.2010.534767.
Sao Pedro M., Baker R.S.J., Gobert J., Montalvo O. and Nakama A., Leveraging machine-learned detectors of systematic inquiry behavior to estimate and predict transfer of inquiry skill, User Modeling and User-Adapted Interaction 23 (1), 2013, 1–39, DOI: 10.1007/s11257-011-9101-0.
Schmidt H.G., Van Der Molen H.T., Te Winkel W.W.R. and Wijnen W.H.F.W., Constructivist, problem-based learning does work: A meta-analysis of curricular comparisons involving a single medical school, Educational Psychologist 44 (4), 2009, 227–249, DOI: 10.1080/00461520903213592.
Segers M., Dochy F. and Cascallar E., Optimising new modes of assessment: In search of qualities and standards, 2003, Kluwer Academic Publishers, Dordrecht.
Siemens G., Learning analytics: The emergence of a discipline, American Behavioral Scientist 57 (10), 2013, 1380–1400, DOI: 10.1177/0002764213498851.
Siemens G., Dawson S. and Lynch G., Improving the quality and productivity of the higher education sector: Policy and strategy for systems-level deployment of learning analytics, Solar Research, 2013.
Stiles, R. J. (2012). Understanding and managing the risks of analytics in higher education: A guide. Educause.
Suthers D.D., Vatrapu R., Medina R., Joseph S. and Dwyer N., Beyond threaded discussion: Representational guidance in asynchronous collaborative learning environments, Computers & Education 50 (4), 2008, 1103–1127, DOI: 10.1016/j.compedu.2006.10.007.
Tobarra L., Robles-Gómez A., Ros S., Hernández R. and Caminero A.C., Analyzing the students’ behavior and relevant topics in virtual learning communities, Computers in Human Behavior 31, 2014, 659–669, DOI: 10.1016/j.chb.2013.10.001.
Verbert K., Manouselis N., Drachsler H. and Duval E., Dataset-driven research to support learning and knowledge analytics, Journal of Educational Technology & Society 15 (3), 2012, 133–148.
Vermunt J.D., Metacognitive, cognitive and affective aspects of learning styles and strategies: A phenomenographic analysis, Higher Education 31, 1996, 25–50, DOI: 10.1007/BF00129106.
Vermunt J.D., The regulation of constructive learning processes, British Journal of Educational Psychology 68, 1998, 149–171, DOI: 10.1111/j.2044-8279.1998.tb01281.x.
Whitelock, D., Richardson, J., Field, D., Van Labeke, N., & Pulman, S. (2014). Designing and testing visual representations of draft essays for higher education students. Paper presented at LAK 2014, Indianapolis.
Wolff, A., Zdrahal, Z., Nikolov, A., & Pantucek, M. (2013). Improving retention: Predicting at-risk students by analysing clicking behaviour in a virtual learning environment. Paper presented at the Third International Conference on Learning Analytics and Knowledge.
Footnotes
1 In follow-up multiple regression modelling, time and number of attempts have a negative impact: for a given mastery level, students who need more time, or more trials, to reach that level have lower expected performance, which is quite intuitive. Amongst students with complete mastery (mastery level > 95%), the negative correction for time on task is stronger than for students with incomplete mastery. The opposite is true for the correction for number of attempts: amongst students with high mastery, the negative correction for redoing a task is smaller than for students with low mastery. Although these interaction effects appeared to be stable patterns for both topic areas, we chose not to include them in the development of the longitudinal models: their impact on predictive power is small, and they introduce more collinearity and less parsimony.
Highlights
• Formative assessment data have high predictive power in generating learning feedback.
• Track data from e-tutorial systems are second-best predictors for timely feedback.
• Predictive power of LMS data falls short in LA applications with rich data sources.
• Learning dispositions take a unique position, being complementary to all other data.
• Combination of several data sources in LA is key to get timely, predictive feedback.
FORMATIVE ASSESSMENT AND LEARNING ANALYTICS IN STATISTICS EDUCATION Dirk T. Tempelaar Maastricht University School of Business & Economics, The Netherlands Tongersestraat 53, 6211 LM Maastricht - P.O. Box 616, 6200 MD Maastricht - The Netherlands
[email protected]
Learning analytics seeks to enhance the learning process through systematic measurements of learning related data, and informing learners and teachers of the results of these measurements, so as to support the control of the learning process. Learning analytics has various sources of information, two main types being intentional and learner activity related metadata. This contribution provides a practical application of Buckingham Shum and Deakin Crick’s theoretical framework of a learning analytics infrastructure that combines learning dispositions data with data extracted from computer based, formative assessments. In a large introductory statistics course based on the principles of blended learning, combining face-to-face problem-based learning sessions with technology enhanced education, we demonstrate that students’ learning choices profit from providing them with feedback based on learning analytics, so as to optimize individual learning.
INTRODUCTION
The prime data source for most learning analytic applications is data generated by learner activities, such as learner participation in continuous, formative assessments. That information is frequently supplemented by background data retrieved from learning management systems and other institutional systems, as for example accounts of prior education. A combination with intentionally collected data, such as self-report data stemming from student responses to surveys, is however the exception rather than the rule. In their theoretical contribution to LAK2012, Buckingham Shum and Deakin Crick (2012) propose a learning analytics infrastructure that combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys and fed back to students and teachers through visual analytics. In our empirical contribution on the application of learning analytics in statistics education, we aim to provide a practical application of such an infrastructure based on combining learning and learner data. In collecting learner data, we opted to use a wide range of well-validated self-report surveys firmly rooted in current educational research, including learning styles, learning motivation and engagement, and learning emotions. Learner data were reported to both students and teachers. Our second data source is rooted in the instructional method of formative testing, and brings about the second focus of this empirical study: to demonstrate the crucial role of data derived from computer-based formative assessments in designing effective learning analytic infrastructures.
FORMATIVE ASSESSMENT
The classic function of testing is that of taking an aptitude test. After completion of the learning process, we expect students to demonstrate mastery of the subject. According to test tradition, feedback resulting from such classic tests is no more than a grade, and that feedback becomes available only after finishing all learning. The alternative form of assessment, formative assessment, has an entirely different function: that of informing student and teacher. The information should help better shape the teaching and learning and is especially useful when it becomes available during or prior to the learning. Diagnostic testing is an example of this, just as is practice testing. Because the feedback that tests yield for learning here constitutes their main function, it is crucial that this information is readily available, preferably even directly.
LEARNING ANALYTICS The broad goal of learning analytics is to apply the outcomes of analyzing data gathered by monitoring and measuring the learning process, as feedback to assist directing that same learning process. Several alternative operationalizations are possible to support this. In Verbert, Manouselis, Drachsler, and Duval (2012), six objectives are distinguished: predicting learner performance and modeling learners, suggesting relevant learning resources, increasing reflection and awareness,
enhancing social learning environments, detecting undesirable learner behaviors, and detecting learner affect. In the following sections describing our approach, we will demonstrate that the combination of self-report learner data with learning data from test-directed instruction allows us to contribute to at least five of these objectives of applying learning analytics. Only the social dimension is limited: social interaction is restricted to learners being able to assess their individual learning profiles by comparing their own strong and weak characteristics with those of other students. These profiles are based on both learner behavior, including all undesirable aspects of it, and learner characteristics: the dispositions, attitudes and values. Learner profiles are used to model different types of learners, and to predict learner performance for each individual student. Since our instructional format is of student-centered type, with the student, and not the teacher, steering the learning process, it is crucial to feed all this information back to learners themselves, so as to make them fully aware of how to optimize their individual learning trajectories.
CASE STUDY: STATISTICS EDUCATION
Our empirical contribution focuses on freshmen education in quantitative methods (mathematics and statistics) of the business & economics school at Maastricht University, one of the educational projects in the SURF project. Our project is directed at a large and diverse group of students. The population of students studied here consists of two cohorts of freshmen: 2011/2012 and 2012/2013, containing 1,832 students who in some way participated in school activities (have been active in the digital learning environment Blackboard). Besides BlackBoard, a digital learning environment for formative assessment, MyStatLab, was utilized by a large majority of students. The diversity of the student population derives mainly from its very international composition: only 34.8% attended a Dutch high school, whereas all others were educated in international high school systems. The largest group, 41.9% of the freshmen, were educated according to the German Abitur system. High school systems in Europe differ strongly, most particularly in the teaching of mathematics and statistics. In that European palette the Netherlands occupies a rather unique position, both in choice of subjects (one of the few European systems with substantial focus on statistics) and in the chosen pedagogical approach. But even beyond the Dutch position, there exist large differences, such as between the Anglo-Saxon and German-oriented high school systems. Therefore it is crucial that the first course offered to these students is flexible and allows for individual learning paths. To some extent, this is realized in offering optional, developmental summer courses, but for the main part, this diversity issue needs to be solved in the program itself. The digital environments for test-directed learning play an important role in this.
EDUCATIONAL PRACTICE
The educational system in which students learn mathematics and statistics is best described as a ‘blended system’. The main component is ‘face-to-face’: problem-based learning (PBL), in small groups (14 students), coached by a content expert tutor. Participation in these tutor groups is required, as for all courses based on the Maastricht PBL system. The online component of the blend is optional: the use of the technology-enhanced MyStatLab (MSL) learning environment.
MSL is a generic digital learning environment, developed by the publisher Pearson, for learning statistics. It adapts to the specific choice of a textbook from Pearson. Although MSL can be used as a learning environment in the broad sense of the word (it contains, among other things, a digital version of the textbook), it is primarily an environment for test-directed learning. Each step in the learning process is initiated by submitting a question. Students are encouraged to (try to) answer the question. If they do not (fully) master the problem, they can either ask for step-by-step help in solving it (Help Me Solve This) or ask to be shown a fully worked example (View an Example). Next, a new version of the problem loads (parameter based) to allow the student to demonstrate their newly acquired mastery. In the investigated courses, students work on average 19.2 hours in MSL, about a quarter of the available time of 80 hours for learning statistics. In this study, we use two different indicators for the intensity of use of MSL: Stats#hours, the number of hours a student spent practicing in the MSL environment, and StatsTestScore, the average score for the practice questions, all chapters aggregated.
DISPOSITIONAL VARIABLES FOR LEARNING ANALYTICS
First, we use data from the regular student administration, such as whether or not a student attended Dutch high school, whether or not the student had advanced prior math schooling, gender, nationality and entry test score. Students with advanced prior schooling are better at math, without incurring more need to practice, but they are not better at statistics, which corresponds to the fact that in programs at advanced level, the focus is not on statistics but on abstract math. Dutch students make considerably less use of both test environments and hence achieve a slightly lower score, benefiting from a smoother transition than international students, but relying just somewhat too much on that. Students with a high entry test score do better in mathematics and a little better in statistics in the test environments, without the need to exercise more. Finally, there are modest gender effects, the strongest in the intensity of exercising: female students are more active than male students. The remaining data from the student records of administrative systems regard the nationality of students. Because cultural differences in education have been given an increasingly important role, and because the strong international composition of the Maastricht student population makes it very suitable for such an analysis, the nationality data are converted into so-called national culture dimensions, based on the framework of Hofstede (Hofstede, Hofstede, & Minkov, 2010). In that framework, there are a number of cultural dimensions that refer to values that are strongly nationally determined. In this study we use six of these dimensions: Power Distance, Individualism versus Collectivism, Masculinity versus Femininity, Uncertainty Avoidance, Long-Term vs. Short-Term Orientation and Indulgence vs. Restraint. Scores for each of these national dimensions are assigned to the individual students. Correlating these scores with the four indicators of practice test intensity results in several significant effects, all in line with Hofstede's framework. The most significant effects are for students from a masculine culture, where mutual competition is an important driver in education, for students from cultures that value the long term over the short term and, somewhat in relation thereto, for students from cultures that value sobriety rather than enjoyment. In this, masculinity and hedonism have a stronger impact on the intensity of exercising than on the proceeds of exercising, in contrast to long-term orientation, which has about equal impact on both aspects. Uncertainty avoidance contributes, as expected, to practicing, albeit to a lesser extent and again primarily toward intensity of exercising rather than its outcome. Power distance and individualism play a less salient role in learning, as expected. Although the effects are smaller in size, learning data based on the learning style model of Vermunt (1996) exhibit a characteristic pattern. Vermunt’s model distinguishes learning strategies (deep, stepwise, and concrete ways of processing learning topics) and regulation strategies (self, external, and lack of regulation of learning). Deep-learning students demonstrate no strong relationship with test-directed learning: they exercise slightly less, but achieve a slightly better score. That is certainly not true for the stepwise learning students.
Especially for these students the availability of practice tests seems to be meaningful: they practice more often and longer than other students and achieve, especially for statistics, a better score than the other students. Recent Anglo-Saxon literature on academic achievement and dropout assigns an increasingly dominant role to the theoretical model of Andrew Martin: the ‘Motivation and Engagement Wheel’ (Martin, 2007). That model includes both behaviors and thoughts or cognitions that play a role in learning. Both are then divided into adaptive and mal-adaptive or obstructive forms. As a result, the four quadrants are: adaptive behavior and adaptive thoughts, mal-adaptive behavior and obstructive thoughts. All adaptive thoughts and all adaptive behaviors have a positive impact on the willingness of students to use the test environments, where the effect of adaptive behaviors dominates that of cognitions. The mal-adaptive variables show a less uniform picture. Mal-adaptivity manifests itself differently in female and male students: for female students primarily in the form of limiting thoughts, especially fear and uncertainty; in male students primarily as mal-adaptive behaviors: self-handicapping and disengagement. That difference has a significant impact on learning. Mal-adaptive behaviors negatively impact the use of the test environments: all the correlations, both for use intensity and performance, are negative. The effect of inhibiting thoughts, however, is different: uncertainty and anxiety have a stimulating effect on the use of the test environments rather than an inhibitory effect. The combination of both effects provides a partial explanation for the observed gender effects in the use of the test environments.
CONCLUSIONS
The intensive use of formative assessment tools makes a major difference for academic performance. But in a student-centered curriculum it is not sufficient that teachers are convinced of the benefits that formative assessment in digital learning environments entails. Students regulate their own learning process, making their own choices about how intensively they will practice, and are therefore the ones who need to become convinced of the usefulness of these digital tools. In this, learning analytics can play an important role: it provides a multitude of information that the student can use to adapt the personal learning environment as much as possible to their own strengths and weaknesses. For example, in our experiment the students were informed about their personal learning dispositions, attitudes and values, together with information on how learning in general interacts with the choices they can make in composing their learning blend. At the same time, the multitude of information available from learning analytics is also the problem: that information requires individual processing. Some information is more important for one student than for another, requiring a personal selection of information to take place. Learning analytics deployed within a system of student-centered education thus has its own challenges. The aim of this contribution extends beyond demonstrating the practical importance of Buckingham Shum and Deakin Crick’s learning analytics infrastructure. Additionally, this research provides many clues as to what individualized information feedback could look like. In the learning blend described in this case study, the face-to-face component PBL constitutes the main instructional method. The digital component is intended as a supplementary learning tool, primarily for students for whom the transition from secondary to university education entails above average hurdles. Part of these problems is of a cognitive type: e.g., international students who never received statistics education as part of their high school mathematics program, or other freshmen who might have been educated in certain topics without achieving the required proficiency levels. For these kinds of cognitive deficiencies, the digital test-directed environments proved to be an effective tool to supplement PBL. But this applies not only to adjustment problems resulting from knowledge backlogs. Students encounter several types of adjustment problems for which the digital tools appear to be functional. The above addressed learning dispositions are a good example: student-centered education in fact presupposes deep, self-regulated learning, whereas many students have little experience with this, and feel on more familiar ground with step-wise, externally regulated learning. As the analyses demonstrate, the digital test environments help in this transformation. It also becomes clear that the test environments are instrumental for students with non-adaptive cognitions about learning mathematics and statistics, such as anxiety. An outcome that is intuitive: the individual practice sessions with computerized feedback will for some students be a safer learning environment than the PBL tutorial group sessions.
ACKNOWLEDGEMENTS
This project has been financed by SURFfoundation as part of the Learning Analytics program.
REFERENCES
Buckingham Shum, S., & Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics.
Proceedings LAK2012: 2nd International Conference on Learning Analytics & Knowledge, pp. 92-101. ACM Press: New York Hofstede, G., Hofstede, G. J., & Minkov, M. (2010). Cultures and organizations: Software of the mind. Revised and expanded third edition. Maidenhead: McGraw-Hill. Martin, A. J. (2007). Examining a multidimensional model of student motivation and engagement using a construct validation approach. British Journal of Educational Psychology, 77, 413-440. Verbert, K., Manouselis, N., Drachsler, H., & Duval, E. (2012). Dataset-Driven Research to Support Learning and Knowledge Analytics. Educational Technology & Society, 15 (3), 133– 148. Vermunt, J. D. (1996). Leerstijlen en sturen van leerprocessen in het Hoger Onderwijs. Amsterdam/Lisse: Swets & Zeitlinger.
Computer Assisted, Formative Assessment and Dispositional Learning Analytics in Learning Mathematics and Statistics

Dirk T. Tempelaar (1), Bart Rienties (2), and Bas Giesbers (3)
(1) School of Business and Economics, Maastricht University, The Netherlands, [email protected]
(2) Institute of Educational Technology, Open University UK, Milton Keynes, UK, [email protected]
(3) Rotterdam School of Management, Erasmus Universiteit, Rotterdam, The Netherlands, [email protected]
Abstract. Learning analytics seeks to enhance the learning process through systematic measurements of learning-related data and to provide informative feedback to learners and teachers, so as to support the regulation of the learning process. Track data from technology enhanced learning systems constitute the main data source for learning analytics. This empirical contribution provides an application of Buckingham Shum and Deakin Crick’s theoretical framework of dispositional learning analytics [1]: an infrastructure that combines learning dispositions data with data extracted from computer assisted, formative assessments. In a large introductory quantitative methods module based on the principles of blended learning, combining face-to-face problem-based learning sessions with e-tutorials, we investigate the predictive power of learning dispositions, outcomes of continuous formative assessments and other system-generated data in modeling student performance and their potential to generate informative feedback. Using a dynamic, longitudinal perspective, computer assisted formative assessments appear to be the best predictor of academic performance and the best basis for detecting underperforming students, while basic LMS data did not substantially predict learning. Keywords: blended learning, computer assisted assessment, dispositional learning analytics, e-tutorials, formative assessment, learning dispositions, student profiles.
1 Introduction
Many learning analytics (LA) applications use data generated by learner activities, such as learner participation in discussion forums, wikis or (continuous) computer assisted formative assessments. This user behavior data is frequently supplemented with background data retrieved from learning management systems (LMS) and other student admission systems, as for example accounts of prior education. In their theoretical contribution to LAK2012 [1] (see also the 2013 LASI Workshop [2]), Buckingham Shum and Deakin Crick propose a dispositional LA infrastructure that
combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys, which are fed back to students and teachers through visual analytics. However, a combination with intentionally collected data, such as self-report data stemming from student responses to surveys, is the exception rather than the rule in LA ([3], [4], and [5]). In our empirical contribution focusing on a large scale module in introductory mathematics and statistics, we aim to provide a practical application of such an infrastructure based on combining learning and learner data. In collecting learner data, we opted to use a wide range of validated self-report surveys firmly rooted in current educational research, including learning styles, learning motivation and engagement, and learning attitudes. This operationalization of learning dispositions closely resembles the specification of cognitive, metacognitive and motivational learning factors relevant for the internal loop of informative tutoring feedback (see [6], [7] for examples). Other data sources used are more common for LA applications, and constitute both data extracted from a learning management system, as well as system track data extracted from the e-tutorials used for practicing and formative assessments. The prime aim of the analysis is to provide a stepping stone for predictive modeling, with a focus on the role each of these data sources can play in generating timely, informative feedback. This paper extends our earlier study [8], which found empirical evidence for the role of dispositional data in LA applications.
2 Background
2.1 Computer Assisted Formative Assessment
The classic function of assessment is that of taking an aptitude test. After completion of the learning process, we expect students to demonstrate mastery of the subject. According to test tradition, feedback resulting from such classic assessment is no more than a grade, which becomes available only after finishing all learning activities. In recent years, the conception of assessment as a summative function (i.e. assessment of learning) has been broadened toward the conception of assessment as a formative function (i.e. assessment for learning). That is, as a means to provide feedback to both student and teacher about teaching and learning prior to or during the learning process [9, 10]. Examples of formative assessment are diagnostic testing and the test-directed learning approaches that constitute the basic educational principle of many e-tutorial systems [11]. Because feedback from assessments constitutes a main function for learning, it is crucial that this information is readily available, preferably even directly. At this point digital testing enters the stage: it is unthinkable to get just-in-time feedback from formative assessments without using computers.
2.2 Learning Analytics
A broad goal of LA is to apply the outcomes of analyzing data gathered by monitoring and measuring the learning process, with feedback playing a crucial part in assisting the regulation of that same learning process. Several alternative operationalizations are possible to support this. In [12], six objectives are distinguished: predicting learner performance and modelling learners, suggesting relevant learning resources, increasing
reflection and awareness, enhancing social learning environments, detecting undesirable learner behaviors, and detecting affects of learners. Although the combination of self-report learner data with learning data extracted from e-tutorial systems allows us to contribute to at least five of these objectives of applying learning analytics (as described in [8]), in this contribution we will focus on the first objective: predictive modeling of performance and learning behavior. The ultimate goal of this predictive modeling endeavor is to investigate which components from a rich set of data sources best serve the role of generating timely, informative feedback and afford signaling the risk of underperformance.
2.3 Related Work
Previous research by Wolff, Zdrahal, Nikolov, and Pantucek [13] found that a combination of LMS data with data from continuous summative assessments was the best predictor of performance drops amongst 7,701 students. In particular, the number of clicks in a LMS just before the next assessment significantly predicted continuation of studies [13]. As is evident from our own previous research [8], formative assessment data, supplemented with learning disposition data, also had a substantial impact on student performance in a blended course of 1,832 students.
3 Case Study: Mathematics and Statistics
3.1 Internationalization of Higher Education
Our empirical contribution focuses on freshmen in the quantitative methods (mathematics and statistics) course of the Maastricht University School of Business & Economics. The course is the first module for students entering the program. It is directed at a large and diverse group of students, which benefits the research design. The population consists of 1,840 freshmen, in two cohorts: 2012/2013 and 2013/2014, who in some way participated in learning activities (i.e., have been active in the learning management system BlackBoard). Besides BlackBoard, two different e-tutorial systems for technology-enhanced learning and practicing were utilized: MyStatLab and MyMathLab. The diversity of the student population mainly lies in its international composition: only 23% received their prior (secondary) education from the Dutch high school system. The largest group, 45% of the freshmen, was educated according to the German Abitur system. The remaining 32% are mainly from central- and south-European countries. High school systems in Europe differ strongly, most particularly in the teaching of mathematics and statistics. Therefore it is crucial that the first module offered to these students is flexible and allows for individual learning paths.
3.2 Test-Directed E-tutorials
The two e-tutorial systems MyStatLab (MSL) and MyMathLab (MML) are generic digital learning environments for learning statistics and mathematics, developed by the publisher Pearson. Although the MyLabs can be used as a learning environment in the broad sense of the word (each contains, among other things, a digital version of the textbook),
it is primarily an environment for test-directed learning and practicing. Each step in the learning process is initiated by submitting a question. Students are encouraged to (try to) answer each question (see Fig. 1 for an example). If they do not (completely) master a question, the student can either ask for help to solve the problem step-by-step (Help Me Solve This), or ask for a fully worked example (View an Example). These two functionalities are examples of Knowledge of Result/response (KR) and Knowledge of the Correct Response (KCR) types of feedback; see Narciss [6], [7]. After receiving this type of feedback, a new version of the problem loads (parameter based) to allow the student to demonstrate his/her newly acquired mastery. When a student provides an answer and opts for ‘Check Answer’, Multiple-Try Feedback (MTF, [6]) is provided, whereby the number of times feedback is provided for the same task depends on the format of the task (only two for a multiple-choice task as in Fig. 1, more for open tasks requiring numerical answers).
Fig. 1. MyMathLab task and feedback options
In the investigated course, students on average work 35.7 hours in MML and 23.6 hours in MSL, which is 30% to 40% of the available time of 80 hours for learning in both topics. In the present study, we use two different indicators for the intensity of the MyLabs usage: MMLHours and MSLHours indicate the time a student spends practicing in each respective MyLab environment per week; MMLMastery and MSLMastery indicate the average final score achieved for the practice questions in any week.
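The paper does not specify how these weekly indicators are extracted from the e-tutorial systems; purely as an illustration, the following Python sketch derives per-week hours and mastery from a hypothetical export of MyLab attempt logs (all column names and values are assumptions, not the project's actual data format).

import pandas as pd

# Hypothetical MyMathLab attempt log: one row per completed practice task.
logs = pd.DataFrame({
    "student_id":  [1, 1, 1, 2, 2],
    "week":        [1, 1, 2, 1, 2],
    "minutes":     [40, 25, 55, 10, 70],
    "final_score": [0.8, 1.0, 0.6, 0.4, 0.9],  # score on the last attempt of a task
})

weekly = (logs.groupby(["student_id", "week"])
              .agg(MMLHours=("minutes", lambda m: m.sum() / 60),  # practice time per week
                   MMLMastery=("final_score", "mean"))            # average final score per week
              .reset_index())
print(weekly)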
3.3 Educational Practice
The educational system in which students learn mathematics and statistics is best described as a ‘blended’ or ‘hybrid’ system. The main component is 'face-to-face’:
problem-based learning (PBL, see [14] for an elaborate overview), in small groups (14 students), coached by a content expert tutor. Participation in these tutor groups is required, as for all courses based on the Maastricht PBL system. The online component of the blend, that is, the use of the two e-tutorials, is optional. The reason for making the online component optional is that this best fits the Maastricht educational model, which is student-centered and places the responsibility for making educational choices primarily with the student. At the same time, due to the diversity in prior knowledge, not all students will benefit equally from using these environments; in particular for those at the high performance end, extensive practicing will not be the most effective allocation of learning time. However, the use of e-tutorials is stimulated by making bonus credits available for good performance in the quizzes, and for achieving good scores in the practicing modes of the MyLab environments. Quizzes are taken every two weeks and consist of items that are drawn from the same item pools applied in the practicing mode. We chose this particular constellation because it stimulates students with little prior knowledge to make intensive use of the MyLab platforms. They realize that they may fall behind other students in writing the exam, and therefore need to achieve a good bonus score, both to compensate and to support their learning. The most direct way to do so is to practice frequently in the MML and MSL environments. The bonus is capped at 20% of what one can score in the exam. The student-centered character of the instructional model first and foremost requires adequate informative feedback to students, so that they are able to monitor their study progress and their topic mastery in an absolute and relative sense. The provision of relevant feedback starts on the first day of the course, when students take two diagnostic entry tests for mathematics and statistics. Feedback from these entry tests provides the first signal to students of the importance of using the MyLab platforms. Next, the MML and MSL environments contain a monitoring function: at any time students can see their progress in preparing for the next quiz, and can get feedback on their performance in completed quizzes and in the practice sessions. The same information is also available to the tutors. Although the primary responsibility for directing the learning process lies with the student, the tutor can act as a complement to that self-steering, especially in situations where the tutor considers that a more intense use of the e-tutorials is desirable, given the position of the student concerned. In this way, the application of LA shapes the instructional situation.
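As a small illustration of the incentive structure just described, the sketch below combines an exam score with a bonus that is capped at 20% of the exam maximum. The scales, names and the exact way bonus and exam are combined are assumptions; the module's actual grading rules are not reproduced here.

def course_total(exam_score: float, bonus_score: float, exam_max: float = 100.0) -> float:
    """One possible reading of the bonus scheme: bonus credits earned in quizzes and
    MyLab practice are capped at 20% of the exam maximum and added to the exam score."""
    return exam_score + min(bonus_score, 0.20 * exam_max)

print(course_total(62.0, 25.0))  # bonus capped at 20.0, so the total becomes 82.0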
4 The Array of Learning Analytics Data Sources
In order to explore the potential of feedback based on the several components of the learning blend, we investigate the relationship between an array of LA data sources and academic performance in the Quantitative Methods module. Academic performance consists of the individual scores in both topic components of the final written exam (MathExam and StatsExam), and the overall grade in the module (QMGrade). The overall grade weights the final exam with factor 5 and the bonus score from quizzes and homework with factor 1. In designing models covering two class years, performance scores have been standardized by calculating Z-scores in order to compare performance across the two cohorts. Prediction models for these three learning performance measures are based on the following data sources:
• Formative assessment data, consisting of:
─ Week0: diagnostic entry tests for mathematics and statistics, with a strong focus on basic algebraic skills, a well-known area of high school deficiencies.
─ Week1: mastery scores and practice time in MyMathLab and MyStatLab.
─ Week2: mastery scores and practice time in MyMathLab and MyStatLab.
─ Week3: mastery scores and practice time in MyMathLab and MyStatLab, and Quiz1 scores for mathematics and statistics.
─ Week4: mastery scores and practice time in MyMathLab and MyStatLab.
─ Week5: mastery scores and practice time in MyMathLab and MyStatLab, and Quiz2 scores for mathematics and statistics.
─ Week6: mastery scores and practice time in MyMathLab and MyStatLab.
─ Week7: mastery scores and practice time in MyMathLab and MyStatLab, and Quiz3 scores for mathematics and statistics.
• BlackBoard use intensity data, in terms of number of clicks, again decomposed into weekly figures (BB time-on-task data was initially included in the study, but appeared to be dominated by click data with regard to predictive power, and was therefore excluded from the final analyses).
• Learning dispositions and demographic data from surveys and administrative systems. In terms of designing longitudinal models, these data are assigned to Week0.
Demographic data were obtained from the regular student administration. An important part of demographic data is prior education. High school educational systems generally distinguish between a basic level of mathematics education preparing for the social sciences, and an advanced level preparing for the sciences. An indicator variable is used for mathematics at advanced level (about one third of the students), with basic level mathematics prior schooling being the reference group. Students with advanced prior schooling are generally better in mathematics, but not in statistics, which corresponds to the fact that in programs at advanced level the focus is abstract mathematics (calculus) rather than statistics. Other demographic data refer to gender, nationality and age.
Learning style data based on the learning style model of Vermunt [15] constitute the first component of measured learning dispositions (see also Vermunt & Vermetten [16]). Vermunt distinguishes four domains or components of learning in his model: cognitive processing strategies, metacognitive regulation strategies, learning conceptions or mental models of learning, and learning orientations. In each domain, five different scales describe different aspects of the learning component. In this study, we applied the two domains of processing and regulation strategies, since these facets of learning styles are most open to interventions based upon learning feedback. In Vermunt’s model, three types of learning strategies are distinguished: deep learning, step-wise (or surface) learning, and concrete ways of processing learning topics. In a similar way, three types of regulation strategies are distinguished: self-regulation of learning, external regulation of learning, and lack of regulation. Combining scores on processing and regulation strategies, we can find alternative profiles of learning approaches often seen in students in higher education. For instance, the meaning-directed learning approach combines high levels of deep learning, with students critically processing the learning materials, with high levels of self-regulation of both learning process and learning content. These students are the ‘ideal’ higher
education students: self-directed, independent learners. The typical learning approach of students with high scores on step-wise learning, who depend a lot on memorization and rehearsal processes and at the same time score high on external regulation of learning, carries considerably more risk with regard to academic success. These learning approaches very often guarantee success in high school, but start to fail at university. Students with high scores for lack of regulation of any type run the highest risk; drop-out for these profiles is higher than for any other profile.
Recent Anglo-Saxon literature on academic achievement and dropout assigns an increasingly dominant role to the theoretical model of Andrew Martin: the ‘Motivation and Engagement Wheel’ [17]; see Fig. 2.
Fig. 2. Motivation and Engagement Wheel (Source: [17])
This model includes both behaviors and thoughts, or cognitions, that play a role in learning. Both are subdivided into adaptive and mal-adaptive or impeding forms. As a result, the four quadrants are: adaptive behavior and adaptive thoughts (the ‘boosters’), mal-adaptive behavior (the ‘guzzlers’), and impeding thoughts (the ‘mufflers’). Adaptive thoughts consist of Self-belief, Learning focus, and Value of school, whereas adaptive behaviors consist of Persistence, Planning, and Task management. Maladaptive or impeding thoughts include Anxiety, Failure avoidance, and Uncertain control, and lastly, maladaptive behaviors include Self-sabotage and Disengagement. Further components of learning dispositions are learning attitudes, and intrinsic versus extrinsic motivation to learn. All learning dispositions are administered through self-report surveys. From 1,794 out of 1,840 students (97.5%), complete information was obtained on the various instruments. Similar to the feedback based on student activity in the two MML and MSL platforms, learning dispositions data were also used to provide feedback during the course. Students were given access to visualizations of their characteristic learning approaches, relative to the profile of the average student. Next to that, all students received individual data on personal dispositions, in order to analyze these data as a required
statistical project. The only retrospective part of this study is the investigation of the predictive power of the several data sources with regard to course performance, as discussed in the next section.
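As an illustration of how these heterogeneous sources could be brought together per student before any modeling, the sketch below merges hypothetical exports (administrative data, entry tests, dispositions, weekly MyLab indicators) into one analysis table. All frame and column names are assumptions, not the project's actual export format.

import pandas as pd

# Hypothetical per-student exports from the different systems.
admin        = pd.DataFrame({"student_id": [1, 2], "math_advanced": [1, 0], "gender": ["F", "M"]})
entry_tests  = pd.DataFrame({"student_id": [1, 2], "entry_math": [0.7, 0.4], "entry_stats": [0.5, 0.3]})
dispositions = pd.DataFrame({"student_id": [1, 2], "deep_learning": [3.8, 2.9], "anxiety": [2.1, 3.4]})
mylab_weekly = pd.DataFrame({"student_id": [1, 1, 2], "week": [1, 2, 1],
                             "MMLMastery": [0.8, 0.9, 0.5], "MMLHours": [1.5, 2.0, 0.5]})

# Pivot the weekly MyLab indicators to one column per week, then join all sources on student_id.
wide = mylab_weekly.pivot(index="student_id", columns="week", values=["MMLMastery", "MMLHours"])
wide.columns = [f"{var}_wk{wk}" for var, wk in wide.columns]
wide = wide.reset_index()

analysis = (admin.merge(entry_tests, on="student_id")
                 .merge(dispositions, on="student_id")
                 .merge(wide, on="student_id", how="left"))
print(analysis.head())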
5 Predicting Performance
Before turning to longitudinal models predicting performance using week-by-week data, the first step is to determine the maximum predictive power for each of the data sources, using aggregated data for all weeks. For one category of data, the outcome appears to be simple: BlackBoard track data can predict no more than 1% of variation in the three performance measures. In other words, the (multiple) correlation of BlackBoard user track data and the performance variables is not above 0.1. From a substantive perspective, that excludes the category of BlackBoard data for developing prediction models as being practically insignificant. With regard to the MyLab data, both overall mastery in MML and MSL correlate strongly with all performance measures (correlations in the range of 0.35 to 0.55), whereas correlations between time in the system and performance measures are weaker, but still substantial (in the range 0.1 to 0.2). Composing regression models that predict performance measures from multiple regressions containing both mastery and time in MyLab systems variables generates the following prediction equations (in normalized performance measures, using Z-scores, and standardized betas):

ZMathExam = 0.562 MMLMastery – 0.277 MMLHours, R = 0.47
ZStatsExam = 0.506 MSLMastery – 0.251 MSLHours, R = 0.40
ZQMGrade = 0.36 MMLMastery – 0.196 MMLHours + 0.341 MSLMastery – 0.092 MSLHours, R = 0.58
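Standardized coefficients of this kind can be obtained by fitting an ordinary least-squares regression on z-scored variables. The sketch below illustrates this with statsmodels on simulated data; the column names mirror the variables above, but the data and the exact estimation procedure used in the project are assumptions.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"MMLMastery": rng.normal(size=n)})
df["MMLHours"] = 0.6 * df["MMLMastery"] + rng.normal(scale=0.8, size=n)      # collinear with mastery
df["MathExam"] = 0.5 * df["MMLMastery"] - 0.2 * df["MMLHours"] + rng.normal(size=n)

z = (df - df.mean()) / df.std()                                              # z-scoring yields standardized betas
fit = sm.OLS(z["MathExam"], sm.add_constant(z[["MMLMastery", "MMLHours"]])).fit()
print(fit.params)                                                            # standardized betas
print("R =", np.sqrt(fit.rsquared))                                          # multiple correlation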
All prediction equations have substantial multiple correlations, which suggests that feedback based on overall mastery and time for both MyLab systems has good prospects. A remarkable and very consistent feature of all three prediction equations is that the beta of mastery is always positive and the beta of time in system is always negative, although all bivariate correlations between time in system variables and performance measures are positive. There is however a simple explanation for this sign reversal: mastery and time in system variables are strongly collinear, with a 0.59 correlation for the MML platform and a 0.66 correlation for the MSL platform. Practicing longer in the two MyLab systems increases expected performance, since students who practice more achieve higher mastery levels. In a multiple regression model, however, one corrects for mastery level, and then time has a negative impact: for a given mastery level, students who need more time to reach that level have lower expected performance, which is quite intuitive. After the potential of building prediction models for performance based on data from the two MyLab systems has been established, the next step is to design these prediction models using incremental data sets of system data. Starting with the Week0 data set, containing data that are available at the very start of the module (in our
example: data from the diagnostic entry tests), we extend the data set in weekly steps, arriving at the final set of predictor variables after seven weeks. Thus, the incremental system data contains entry test data, mastery and time in system data of seven consecutive weeks, and MyLab quiz data administered in weeks 3, 5, and 7. Instead of providing regressions for all seven weeks and all three performance measures, Fig. 3 describes the development of the multiple correlation coefficient R in time, that is, over incremental weekly data sets.
Fig. 3. Longitudinal Performance Predictions based on Formative Assessments: Multiple Correlation R
Since the predictor data sets are incremental, the values of the multiple correlation increase over the weeks. Those for performance in the mathematics exam and the overall grade start at values around 0.45 in Week0 and increase to values between 0.7 and 0.8 in the last week. In contrast, there is less power in predicting performance in statistics, the difference being caused by the statistics entry test being less informative for later statistics performance than the mathematics entry test is for later mathematics performance. The circumstance that many of the students have not previously been educated in statistics is crucial for understanding why the entry test is not very informative. Predictor sets used for the generation of Fig. 3 include only MyLab data, together with entry test data; no learning dispositions data have been used yet. When we add these data, assuming that they are available at the start of the course so that they are part of the new Week0 data set, we arrive at Fig. 4, describing the development of the multiple correlation coefficients R over all weeks. The main impact of the availability of learning disposition data is the strong increase in predictive power in the first weeks. From the third week onwards, when data from the first quiz become available, the difference in predictive power between models including and those excluding learning dispositions is minimal. Apparently, collinearity between scores in the first
quiz and the set of learning dispositions implies that dispositions have hardly any additional predictive power beyond that of quiz performance; most of their impact is also captured in quiz performance scores.
Fig. 4. Longitudinal Performance Predictions based on Formative Assessments and Learning Dispositions: Multiple Correlation R
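A minimal sketch of how such a week-by-week multiple correlation curve (as in Figs. 3 and 4) could be traced is given below; the grouping of predictors per week, and the column names in the example call, are assumptions rather than the project's actual variable names.

import numpy as np
import statsmodels.api as sm

def incremental_r(df, outcome, predictors_per_week):
    """Fit an OLS model on a growing predictor set and return the multiple correlation R per week."""
    r_by_week, used = {}, []
    for week, cols in sorted(predictors_per_week.items()):
        used.extend(cols)
        fit = sm.OLS(df[outcome], sm.add_constant(df[used])).fit()
        r_by_week[week] = np.sqrt(fit.rsquared)
    return r_by_week

# Hypothetical usage, with an analysis table such as the one sketched in Section 4:
# incremental_r(analysis, "ZMathExam",
#               {0: ["entry_math", "deep_learning"],
#                1: ["MMLMastery_wk1", "MMLHours_wk1"],
#                2: ["MMLMastery_wk2", "MMLHours_wk2"]})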
6 Conclusions
In this empirical study into predictive modeling of student performance, we investigated three different data sources to explore the potential of generating informative feedback using LA: BlackBoard tracking data, students’ learning dispositions, and data from systems for formative, computer assisted assessments. The last data source allows further classification into data generated in the practice mode (both mastery and system time data) and data generated by formative assessments (performance data). It appears that the combination of dispositions data and assessment system data dominates the role of BlackBoard track data in predicting student performance, implying that in applications with such rich data available, BlackBoard data have no added value in predicting performance and signaling underperforming students. This seems to confirm initial findings by Macfadyen and Dawson [5], who found that simple clicking behavior in an LMS is at best a poor proxy for actual user behavior of students. In a similar respect, data extracted from the testing mode of the MyLab systems dominate data generated by the practicing mode of the MyLabs, indicating the predictive power of true assessment data, even if it comes from assessments that are primarily formative in nature. However, assessment data are typically delayed data, not available before midterm, or, as in our case, the third week of the course. Up to the moment this
richest data component becomes available, mastery data and use intensity data generated by the e-tutorial systems are a second-best alternative to true assessment data. This links well with Wolff et al. [13], who found that performance on initial assessments during the first parts of an online module was a substantial predictor of final exam performance. A similar conclusion can be drawn with regard to the learning disposition data: up to the moment that assessment data become available, they serve a unique role in predicting student performance and signaling underperformance beyond the system track data of the e-tutorials. From the moment that computer assisted, formative assessment data become available, their predictive power is dominated by that of performance in those formative assessments. Dispositions data are not as easily collected as system tracking data from learning management systems or e-tutorial systems. The answer to the question whether the effort to collect dispositional data is worthwhile therefore depends strongly on when richer (assessment) data become available, and on the need for timely signaling of underperformance. If timely feedback is required, the combination of data extracted from e-tutorials, in both practicing and test modes, and learning disposition data appears to be the best mix to serve LA applications.

Acknowledgements. The project reported here has been supported and co-financed by SURFfoundation as part of the Learning Analytics Stimulus program.
References
1. Buckingham Shum, S., Deakin Crick, R.: Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics. In: Proceedings LAK 2012: 2nd International Conference on Learning Analytics & Knowledge, pp. 92–101. ACM Press, New York (2012)
2. LASI Dispositional Learning Analytics Workshop (2013), http://Learningemergence.net/events/lasi-dla.wkshp
3. Buckingham Shum, S., Ferguson, R.: Social Learning Analytics. Journal of Educational Technology & Society 15(3) (2012)
4. Greller, W., Drachsler, H.: Translating Learning into Numbers: A Generic Framework for Learning Analytics. Journal of Educational Technology & Society 15(3) (2012)
5. Macfadyen, L.P., Dawson, S.: Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education 54(2), 588–599 (2010)
6. Narciss, S.: Feedback strategies for interactive learning tasks. In: Spector, J.M., Merrill, M.D., van Merrienboer, J.J.G., Driscoll, M.P. (eds.) Handbook of Research on Educational Communications and Technology, 3rd edn., pp. 125–144. Lawrence Erlbaum Associates, Mahwah (2008)
7. Narciss, S., Huth, K.: Fostering achievement and motivation with bug-related tutoring feedback in a computer-based training on written subtraction. Learning and Instruction 16, 310–322 (2006)
8. Tempelaar, D.T., Cuypers, H., Van de Vrie, E.M., Heck, A., Van der Kooij, H.: Formative Assessment and Learning Analytics. In: Proceedings LAK 2013: 3rd International Conference on Learning Analytics & Knowledge, pp. 205–209. ACM Press, New York (2013)
9. Birenbaum, M.: New insights into learning and teaching and their implications for assessment. In: Segers, M., Dochy, F., Cascallar, E. (eds.) Optimizing New Modes of Assessment: In Search of Qualities and Standards, vol. 1, pp. 13–37. Kluwer Academic Publishers, Dordrecht (2003)
10. Wyatt-Smith, C., Klenowski, V., Colbert, P.: Assessment Understood as Enabling. In: Wyatt-Smith, C., Klenowski, V., Colbert, P. (eds.) Designing Assessment for Quality Learning, vol. 1, pp. 1–20. Springer, Netherlands (2014)
11. Shute, V., Kim, Y.: Formative and Stealth Assessment. In: Spector, J.M., Merrill, M.D., Elen, J., Bishop, M.J. (eds.) Handbook of Research on Educational Communications and Technology, 4th edn., pp. 311–321. Springer, New York (2014)
12. Verbert, K., Manouselis, N., Drachsler, H., Duval, E.: Dataset-Driven Research to Support Learning and Knowledge Analytics. Educational Technology & Society 15(3), 133–148 (2012)
13. Wolff, A., Zdrahal, Z., Nikolov, A., Pantucek, M.: Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment. In: Proceedings LAK 2013: 3rd International Conference on Learning Analytics & Knowledge, pp. 145–149. ACM Press, New York (2013)
14. Loyens, S.M.M., Kirschner, P.A., Paas, F.: Problem-based learning. In: Harris, K.R., Graham, S., Urdan, T., Bus, A.G., Major, S., Swanson, H. (eds.) APA Educational Psychology Handbook: Application to Learning and Teaching, vol. 3, pp. 403–425. American Psychological Association, Washington, DC (2011)
15. Vermunt, J.D.: Leerstijlen en sturen van leerprocessen in het Hoger Onderwijs. Swets & Zeitlinger, Amsterdam/Lisse (1996)
16. Vermunt, J.D., Vermetten, Y.: Patterns in Student Learning: Relationships between Learning Strategies, Conceptions of Learning, and Learning Orientations. Educational Psychology Review 16, 359–385 (2004)
17. Martin, A.J.: Examining a multidimensional model of student motivation and engagement using a construct validation approach. British Journal of Educational Psychology 77, 413–440 (2007)
Learning Analytics, Formative Assessment & Learning Dispositions
Dirk Tempelaar, Maastricht University
Summary. In this contribution we report on an application of the model of learning analytics proposed by Buckingham Shum and Deakin Crick, based on a combination of system-generated data and learning disposition data. The core of that model is that individual learning dispositions, or 'learning power', can be an important missing factor in learning analytics applications. In our application of learning analytics in a learning context characterized by a hybrid (blended) learning environment, which combines problem-based learning with digital platforms for assessment and test-directed learning, disposition data indeed prove to play an important complementary role, next to track data, in predicting learning outcomes.
Proposal. The aim of the research described here is to realize an application of the model of learning analytics proposed by Buckingham Shum and Deakin Crick, based on a combination of system-generated data and learning disposition data (Buckingham Shum & Deakin Crick, 2012). The core of that model is that individual learning dispositions, or 'learning power', are an important missing factor in many learning analytics applications. Within the framework of the SURF Learning Analytics stimulus scheme 2013, the possibilities of learning analytics were investigated in a learning context characterized by the use of digital platforms for assessment and test-directed learning, and by the use of student characteristics (see also Tempelaar, Cuypers, Van de Vrie, Heck & Van der Kooij, 2013, and Tempelaar, Cuypers, Van de Vrie, Van der Kooij & Heck, 2012). The target group was a population of first-year business and economics students of Maastricht University (UM), a population that is attractive from a research perspective because of its size and its strongly international composition. These students were followed in the course that provides a first introduction to mathematics and statistics. The learning process took place in a hybrid (blended) learning environment that combines problem-based learning with the use of the learning platforms MyMathLab, MyStatLab (Pearson) and BlackBoard, which produced a large database of system track data. This was supplemented with learning disposition data obtained with self-report instruments based on social-cognitive learning models. The project is a follow-up of earlier projects carried out within the SURF TTL program, a program aimed at the application of digital, formative assessment (Tempelaar, Kuperus, Cuypers, Van der Kooij, Van de Vrie & Heck, 2012). The theoretical framework of learning analytics sketched by Verbert, Manouselis, Drachsler & Duval (2012) distinguishes six objectives for learning analytics applications: modeling and predicting learning behavior, recommending relevant learning resources, strengthening metacognitive functions, strengthening the role of social learning environments, and detecting both learning preferences and suboptimal learning behavior. By operationalizing learning disposition data on the basis of a broad selection of established models from social-cognitive learning theory, dispositions that are widely applied in research into 'student profiling in blended learning', we aim to realize several of these objectives in this study. The learning disposition instruments used concern implicit theories of intelligence, beliefs about study effort, achievement goals, academic motivation, learning styles, subject attitudes, and motivation & engagement (see also Tempelaar et al. (2012) for a more detailed description of these instruments). In terms of research method, the choice for the model of learning analytics proposed by Buckingham Shum and Deakin Crick implies that systematically retrieving and analyzing two types of data sources was crucial. First, obtaining as complete a collection of system data as possible, originating from registration systems (nationality, prior education, mathematics level), test systems (prior knowledge testing), and the digital learning environments MyMathLab, MyStatLab and BlackBoard. In addition, surveying students on
their learning dispositions, using instruments previously employed in studies into learning preferences in hybrid learning environments. On the basis of both types of data, models for learning behavior were constructed. Existing social-cognitive learning models were chosen to guide the modeling process; no data mining was used in this study. The main overall result lies in the confirmation of the premise of the Buckingham Shum and Deakin Crick model that system data and learning disposition data constitute two complementary sources of data for learning analytics applications. System data can distinguish the 'real drop-outs' at an early stage and enable targeted interventions. Learning disposition data, on the other hand, are better able to determine risk profiles of students who get into trouble later in the course, and thus contribute to the functions of detecting learning preferences, signaling suboptimal learning behavior, and generating recommendations based on those detections. The strength of the model thus lies very specifically in the complementarity of the two data components. Among the detailed results, the predictive power of national culture dimensions based on Hofstede's culture model stands out. The very international composition of the investigated population certainly contributes to this, but it is clear that where students can partly decide for themselves how to use learning resources, cultural differences are an important determinant of those choices. The significance of this contribution possibly lies in giving direction to research in the young field of learning analytics. Many empirical applications rely solely on one data component: the system data. The advantage of using track data is that, compared to the questionnaire data commonly used in more traditional social-cognitive research, it can be obtained easily and at large scale. However, both types of data appear to have their own merits, which makes the use of disposition data in addition to track data in learning analytics studies highly recommendable.

References:
Buckingham Shum, S., & Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics. In Proceedings LAK2012: 2nd International Conference on Learning Analytics & Knowledge, pp. 92-101. New York: ACM Press.
Tempelaar, D. T., Cuypers, H., Van de Vrie, E., Heck, A., & Van der Kooij, H. (2013). Formative Assessment and Learning Analytics. In Proceedings of the 3rd International Conference on Learning Analytics and Knowledge. New York: ACM. 978-1-4503-1785-6/13/04.
Tempelaar, D. T., Cuypers, H., Van de Vrie, E., Van der Kooij, H., & Heck, A. (2012). Toetsgestuurd leren en learning analytics. OnderwijsInnovatie, September 2012, 17-26.
Tempelaar, D. T., Kuperus, B., Cuypers, H., Van der Kooij, H., Van de Vrie, E., & Heck, A. (2012). The Role of Digital, Formative Testing in e-Learning for Mathematics: A Case Study in the Netherlands. In: "Mathematical e-learning" [online dossier]. Universities and Knowledge Society Journal (RUSC), 9(1). UoC.
Verbert, K., Manouselis, N., Drachsler, H., & Duval, E. (2012). Dataset-Driven Research to Support Learning and Knowledge Analytics. Educational Technology & Society, 15(3), 133–148.
LEARNING ANALYTICS & FORMATIVE ASSESSMENTS IN BLENDED LEARNING OF MATHEMATICS & STATISTICS Dirk T. Tempelaar Maastricht University School of Business & Economics, Tongersestraat 53, The Netherlands email:
[email protected]

Abstract. Learning analytics seeks to enhance the learning process through systematic measurements of learning-related data, and by informing learners and teachers of the results of these measurements, so as to support the control of the learning process. Learning analytics has various sources of information, two main types being intentionally collected data and learner activity related metadata. This contribution provides a practical application of Buckingham Shum and Deakin Crick’s theoretical framework of dispositional learning analytics [1]: an infrastructure that combines learning dispositions data with data extracted from computer based, formative assessments. In a large introductory statistics course based on the principles of blended learning, combining face-to-face problem-based learning sessions with technology enhanced education, we demonstrate that students’ learning choices profit from providing them with feedback based on learning analytics, so as to optimize individual learning choices. This study is based on a project financed by SURFfoundation as part of the Dutch Learning Analytics program.

Keywords: blended learning; dispositional learning analytics; formative assessment; learning dispositions; student profiles; technology enhanced learning.

Short title: Learning analytics.

Introduction. The prime data source for most learning analytic applications is data generated by learner activities, such as learner participation in continuous, formative assessments. That information is frequently supplemented by background data retrieved from learning management systems and other administrative systems, as for example accounts of prior education. A combination with intentionally collected data, such as self-report data stemming from student responses to surveys, is however the exception rather than the rule. In their theoretical contribution to LAK2012 [1], see also the 2013 LASI Workshop [2], Buckingham Shum and Deakin Crick propose the dispositional learning analytics infrastructure that combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys and fed back to students and teachers through visual analytics. Their proposal considers, for example, spider diagrams to provide learners insight into their learning dispositions, values and attitudes. In our empirical contribution focusing on large-scale education in introductory math and statistics, we aim to provide a practical application of such an infrastructure based on combining learning and learner data. In collecting learner data, we opted to use a wide range of well validated self-report surveys firmly rooted in current educational research, including learning styles, learning motivation and engagement, and learning emotions. Learner data were reported to both students and teachers using visual analytics similar to those described in [1], so instead of focusing on the technology used to feed back learner data, we will focus here on the crucial role of the richness of the profile of learner dispositions, values and attitudes. Our second data source is rooted in the instructional method of test-directed learning, and brings about the second focus of this empirical study: to
demonstrate the crucial role of data derived from computer-based formative assessments in designing effective learning analytics infrastructures. This paper extends our earlier study [3].

1. Formative Assessment.
The classic function of testing is that of taking an aptitude test. After completion of the learning process, we expect students to demonstrate mastery of the subject. According to test tradition, feedback resulting from such classic tests is no more than a grade, and that feedback becomes available only after finishing all learning. The alternative form of assessment, formative assessment, has an entirely different function: that of informing student and teacher. The information should help better shape the teaching and learning, and is especially useful when it becomes available during or prior to the learning. Diagnostic testing is an example of this, just as is practice testing. Because here the feedback that tests yield for learning constitutes their main function, it is crucial that this information is readily available, preferably even directly. At this point digital testing comes on the scene: it is unthinkable to get feedback from formative assessments in time without using computers.

2. Learning Analytics.
The broad goal of learning analytics is to apply the outcomes of analysing data gathered by monitoring and measuring the learning process, as feedback to assist in directing that same learning process. Several alternative operationalizations are possible to support this. In [4], six objectives are distinguished: predicting learner performance and modelling learners, suggesting relevant learning resources, increasing reflection and awareness, enhancing social learning environments, detecting undesirable learner behaviours, and detecting affects of learners. In the following sections describing our approach, we will demonstrate that the combination of self-report learner data with learning data from test-directed instruction allows us to contribute to at least five of these objectives of applying learning analytics. Only the social objective is addressed in a restricted way: learners are able to assess their individual learning profiles in terms of a comparison of their own strong and weak characteristics relative to the position of other students. These profiles are based on both learner behaviour, including all undesirable aspects of it, and learner characteristics: the dispositions, attitudes and values. Learner profiles are used to model different types of learners, and to predict learner performance for each individual student. Since our instructional format is of a student-centred type, with the student, and not the teacher, steering the learning process, it is crucial to feed back all this information to learners themselves, so as to make them fully aware of how to optimize their individual learning trajectories.

3. Case Study: Mathematics and Statistics.
Our empirical contribution focuses on freshman education in quantitative methods (mathematics and statistics) of the business & economics school at Maastricht University. This education is directed at a large and diverse group of students, which benefits the research design. The population of students studied here consists of two cohorts of freshmen: 2011/2012 and 2012/2013, containing 1,800 students who in some way participated in school activities (have been active in the digital learning environment Blackboard).
Besides BlackBoard, two different digital learning environments for technology-enhanced learning and practicing were utilized: MyStatLab and MyMathLab. The diversity of the student population derives mainly from its very international composition: only 23% attended Dutch high school, whereas all others were educated in international high school systems. The largest group, 45% of the freshmen, was educated according to the German Abitur system. High school systems in Europe differ strongly, most particularly in the teaching of mathematics and statistics. In that European palette the
Netherlands occupies a rather unique position, both in the choice of subjects (one of the few European systems with a substantial focus on statistics) and in the chosen pedagogical approach. But even beyond the Dutch position, there exist large differences, such as between the Anglo-Saxon and German-oriented high school systems. Therefore it is crucial that the first course offered to these students is flexible and allows for individual learning paths. To some extent, this is realized by offering optional, developmental summer courses, but for the main part, this diversity issue needs to be solved in the program itself. The digital environments for test-directed learning play an important role in this.

4. Technology-Enhanced Learning.
The two technology-enhanced MyLabs, MyStatLab (MSL) and MyMathLab (MML), are generic digital learning environments, developed by the publisher Pearson, for learning statistics and mathematics. Each adapts to the specific choice of a Pearson textbook. Although the MyLabs can be used as a learning environment in the broad sense of the word (each contains, among other things, a digital version of the textbook), they are primarily environments for test-directed learning and practicing. Each step in the learning process is initiated by submitting a question. Students are encouraged to (try to) answer the question. If they do not (completely) master it, the student can either ask for help to solve the problem step by step (Help Me Solve This), or ask for a fully worked example (View an Example). Next, a new version of the problem loads (parameter based) to allow the student to demonstrate their newly acquired mastery. In the investigated courses, students work an average of 35.7 hours in MML and 23.6 hours in MSL, 30% to 40% of the available time of 80 hours for learning in both topics. In this study, we use two different indicators for the intensity of use of the MyLabs: #hours, the number of hours a student spent practicing in both MyLab environments, and TestScore, the average score for the practice questions, all chapters aggregated, again for both topics.

5. Educational Practice.
The educational system in which students learn mathematics and statistics is best described as a ‘blended system’. The main component is ‘face-to-face’: problem-based learning (PBL), in small groups (14 students), coached by a content expert tutor. Participation in these tutor groups is required, as for all courses based on the Maastricht PBL system. The online component of the blend, the use of the two test-directed learning environments, is optional. The reason for having this component optional is, on the one hand, that this best fits the Maastricht educational model, which is student-directed and places the responsibility for making educational choices primarily with the student, and, on the other hand, the circumstance that not all students will benefit equally from using these environments: due to the diversity in prior knowledge, it is supposed to have less added value for students at the high end. However, the use of technology-enhanced environments is stimulated by making bonus points available for good performance in the quizzes. Quizzes are taken every two weeks and consist of items that are drawn from item pools very similar to the item pools applied in the two digital practice platforms. We chose this particular constellation, since it stimulates students with little prior knowledge to make intensive use of the test platforms.
They realize that they fall behind other students in writing the exam, and need to achieve a good bonus score, both to compensate and to support their learning. The most direct way to do so is to frequently practice in the MML and MSL environments. The student-directed character of the instructional model requires first and foremost adequate information for students, so that they are able to monitor their study progress and their topic mastery in an absolute and relative sense. That provision of relevant information starts on the first day of the course, when students take two entry tests for
mathematics and statistics, so as to make their positions clear. Feedback from the entry tests provides the first signals of the importance of using the test platforms. Next, the digital MML and MSL environments take over the monitoring function: students can at any time see their progress in preparing for the next quiz, and get feedback on their performance in the quizzes already taken and on the conduct of the practice sessions. The same information is also available to the teachers. Although the primary responsibility for directing the learning process is with the student, the tutor acts as a complement to that self-steering, especially in situations where the tutor considers that a more intense use of digital learning environments is desirable, given the position of the student concerned. In this way, the application of learning analytics shapes the instructional situation.

6. Impact of Technology-Enhanced Learning.
To explore the role of technology-enhanced learning, we investigated the relationship between the intensity of use of the two technology-enhanced platforms and academic performance. Two indicators measure academic performance: the exam, containing a mathematics and a statistics part (MathExam and StatExam), and three quizzes for both subtopics, summed into a MathQuiz and a StatQuiz score. Before examining the relationship between practice and performance, we corrected for differences in prior knowledge in two ways: by the level of prior mathematics education, and by the student's score in the math entry test. As far as prior education is concerned: high school systems distinguish a basic level preparing for the social sciences and an advanced level preparing for the sciences. An indicator variable is used for math at advanced level (MathAdv), which is true for one third of the students, with basic level math prior schooling being the reference group. Moreover, the level of prior math knowledge is determined by the day-one entry or diagnostic test, of which the score is labelled EntryTest, focusing on the mastery of basic algebraic skills. One of the most straightforward ways to investigate the role of technology-enhanced learning on achievement is to use regression analyses in which performance variables are explained by prior knowledge and data on the intensity of using the practice tests. These regressions indicate that prior knowledge, both as type of prior schooling and as score in the entry test, explains part of the performance differences. But the most important predictor of course performance is the level that students gain in the test platforms. The number of different tests students need to acquire that level, or the time they need to practice to acquire that level, has a corrective effect, which is intuitive: knowledge achieved through testing helps, but if a student needs a lot of time or effort to reach that level, this signals more problematic learning. An alternative demonstration of the impact of using the test environments is obtained by dividing the population of students into students with high and low mastery in the entry test and high and low intensity of using the test platforms, and comparing exam scores and pass/fail outcomes. The fit resulting from these prediction models is very high. For example, in a median split on performance in the math platform, 92% of the students with the better practice performance pass, against 59% of the students with lower practice performance.
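The median-split comparison mentioned above is straightforward to reproduce on any student-level data set; the sketch below shows one way to compute pass rates for the two halves. The column names are hypothetical, and the pass/fail column is assumed to contain booleans or 0/1 values.

import pandas as pd

def pass_rates_by_median_split(df: pd.DataFrame, practice_col: str, passed_col: str) -> pd.Series:
    """Split students at the median of a practice-performance indicator and
    compare pass rates (fraction passing) in the two halves."""
    above = df[practice_col] >= df[practice_col].median()
    return df.groupby(above)[passed_col].mean().rename({False: "below median", True: "above median"})

# Hypothetical usage: pass_rates_by_median_split(students, "MMLTestScore", "passed_exam")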
6.1. Learning Analytics: Demographic Characteristics.
Having demonstrated that on average students benefit from the opportunity of technology-enhanced learning, the question arises whether this is equally true for all students. This question asks for learning analytics applications using data from sources other than the learning environments to identify the specific student groups most in need of these practice environments. In this section of our empirical study, we follow [1], [5] to investigate individual differences in the intensity of using digital learning tools. As a first step, we make use of data from the regular student administration, such as whether or not Dutch high school,
whether or not advanced prior math schooling, gender, nationality and entry test score. Students with advanced prior schooling are better at math, without displaying a greater need to practice. They are not better at statistics, which corresponds to the fact that in programs at advanced level the focus is not on statistics but on abstract math. Dutch students make considerably less use of both test environments and hence achieve a slightly lower score: they benefit from a smoother transition than international students, but rely somewhat too much on that. Students with a high entry test score do better in mathematics and a little better in statistics in the test environments, without the need to exercise more. Finally, there are modest gender effects, the strongest in the intensity of exercising: female students are more active than male students.

6.2. Learning Analytics: Cultural Differences.
The remaining data from the student records of administrative systems concern the nationality of students. Because cultural differences in education have been given an increasingly important role, and because the strongly international composition of the Maastricht student population makes it very suitable for such an analysis, the nationality data are converted into so-called national culture dimensions, based on the framework of Hofstede [6]. In that framework, there are a number of cultural dimensions that refer to values that are strongly nationally determined. In this study we use six of these dimensions: Power Distance, Individualism versus Collectivism, Masculinity versus Femininity, Uncertainty Avoidance, Long-Term vs. Short-Term Orientation and Indulgence vs. Restraint. Scores for each of these national dimensions are assigned to the individual students. Correlating these scores with the four indicators of practice test intensity results in several significant effects, all in line with Hofstede's framework. The most significant effects are for students from a masculine culture, where mutual competition is an important driver in education, for students from a culture that values the long term over the short term and, somewhat related to this, cultures that value sobriety rather than enjoyment. In this, masculinity and hedonism have a stronger impact on the intensity of exercising than on the proceeds of exercising, in contrast to long-term orientation, which has about equal impact on both aspects. Uncertainty avoidance contributes, as expected, to practicing, albeit to a lesser extent and again primarily toward the intensity of exercising rather than its outcome. Power distance and individualism play a less salient role in learning, as expected.

6.3. Learning Analytics: Learning Styles.
Although the effects are smaller in size, learning data based on the learning style model of Vermunt [7] exhibit a characteristic role. Vermunt’s model distinguishes learning strategies (deep, step-wise, and concrete ways of processing learning topics) and regulation strategies (self-regulation, external regulation, and lack of regulation of learning). Deep-learning students demonstrate no strong relationship with test-directed learning: they exercise slightly less, but achieve a slightly better score. That is certainly not true for the stepwise learning students. Especially for these students the availability of practice tests seems to be meaningful: they practice more often and longer than other students and achieve, especially for statistics, a better score than the other students.
These patterns repeat themselves in the learning regulation variables that characterize the two ways of learning: self-regulation being characteristic for deep learning, external regulation as a feature for stepwise learning. Indeed, the students whose learning behavior has to be externally regulated, are those who benefit most from the test environments: both in intensity and performance they surpass the other students. A notable (but weak) pattern is finally visible in learning behaviour lacking regulation: these students tend to practice more often and longer than the other students but achieve in both subtopics lower performance levels. Apparently, even the structure of the two test environments is incapable to compensate the of lack of regulation for these student.
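A minimal sketch of how such correlational analyses could be set up is given below. The variable names and toy values are hypothetical placeholders, not the project's actual data or pipeline; the real analyses used the dispositional instruments and usage indicators described in the text.

```python
# Minimal sketch of the kind of analysis described above: linking dispositional
# data to e-tutorial usage intensity. Variable names and toy data are hypothetical.
import pandas as pd

# Four usage-intensity/performance indicators per student, exported from the e-tutorials.
usage = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "math_hours": [12, 30, 25, 8],
    "math_score": [0.55, 0.80, 0.75, 0.40],
    "stat_hours": [10, 28, 22, 9],
    "stat_score": [0.50, 0.78, 0.70, 0.45],
})

# Dispositional data: survey scales plus Hofstede dimensions mapped via nationality.
dispositions = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "masculinity": [14, 66, 79, 30],
    "long_term_orientation": [36, 83, 61, 51],
    "stepwise_learning": [2.1, 4.0, 3.5, 2.4],
})

merged = usage.merge(dispositions, on="student_id")

# Correlate every disposition with each of the four usage indicators,
# mirroring the correlational analyses reported above.
usage_cols = ["math_hours", "math_score", "stat_hours", "stat_score"]
disp_cols = ["masculinity", "long_term_orientation", "stepwise_learning"]
print(merged[disp_cols + usage_cols].corr().loc[disp_cols, usage_cols].round(2))
```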
6.4. Learning Analytics: (Mal)Adaptive Thoughts and Behaviours. Recent Anglo-Saxon literature on academic achievement and dropout assigns an increasingly dominant role to the theoretical model of Andrew Martin: the 'Motivation and Engagement Wheel' [8]. That model includes both behaviours and thoughts (cognitions) that play a role in learning, each divided into adaptive and mal-adaptive or obstructive forms. The resulting quadrants are: adaptive thoughts and adaptive behaviours (together the 'boosters'), mal-adaptive behaviours (the 'guzzlers') and obstructive thoughts (the 'mufflers'). Figure 1 depicts, in two panels, the relationships of adaptive and mal-adaptive thoughts and behaviours with the usage data.
Fig. 1. Role of (mal)adaptive thoughts and behaviours.
The first panel documents the adaptive thoughts Self-belief, Value of school and Learning focus, and the adaptive behaviours Planning, Study management and Perseverance. All adaptive thoughts and all adaptive behaviours have a positive impact on the willingness of students to use the test environments, with the effect of the adaptive behaviours dominating that of the cognitions. The mal-adaptive variables show a less uniform picture. Because gender effects play a prominent role here, the female/male dummy variable is added to the four use-intensity indicators in the panel. From these additional correlations we conclude that mal-adaptivity manifests itself differently in female and male students: in female students primarily in the form of obstructive thoughts, especially anxiety and uncertainty; in male students primarily as mal-adaptive behaviours, self-handicapping and disengagement. That difference has a significant impact on learning. Mal-adaptive behaviours negatively affect the use of the test environments: all correlations, both for use intensity and for performance, are negative. The effect of obstructive thoughts, however, is different: uncertainty and anxiety have a stimulating rather than an inhibitory effect on the use of the test environments. The combination of both effects provides a partial explanation for the observed gender effects in the use of the test environments.

6.5. Learning Analytics: Learning Emotions. Research on the role of emotions in learning is also of relatively recent date. Leading in this research is Pekrun's control-value theory of learning emotions [9]. That theory holds that the emotions arising during learning are shaped by the feeling of being 'in control' and of doing something worthwhile. Pekrun's model distinguishes several emotions; for this study we selected the emotions that contribute most strongly to student success or failure: the negative emotions Anxiety, Boredom and Hopelessness, and the positive emotion Enjoyment. Emotions are measured context-specifically; for example, Anxiety is defined in the context of learning mathematics. Learning emotions are typically measured in the middle of the course, unlike all other instruments, which are administered at the beginning of the course. Correlations can therefore not be interpreted within a cause-effect framework, as we can for most other variables. The most plausible interpretation is that of mutual influence: emotions will affect the use of the test environments, but conversely the experience gained in practicing, and ideally the performance achieved in practicing, will also shape learning emotions. The associations we find all have the predicted directions: negative emotions show negative relationships with the use of the test environments; the positive emotion, and the feeling of being in control, show positive relationships. It is striking that performance in the test environment, especially for mathematics, is much more strongly associated with learning emotions than the intensity of practicing in the test environments.

Conclusions
The intensive use of technology-enhanced environments makes a major difference for academic performance. But in a student-centred curriculum it is not sufficient that teachers are convinced of the benefits of test-based learning in digital learning environments. Students regulate their own learning process and make their own choices on how intensively they will practice; therefore, they are the ones who need to become convinced of the usefulness of these digital tools. Learning analytics can play an important role here: it provides a wealth of information that students can use to adapt their personal learning environment as much as possible to their own strengths and weaknesses. For example, in our experiment students were informed about their personal learning dispositions, attitudes and values, together with information on how these dispositions generally interact with the choices they can make in composing their learning blend. At the same time, the wealth of information available from learning analytics is also the problem: that information requires individual processing. Some information is more important for one student than for another, so a personal selection of information has to take place. Learning analytics deployed within a system of student-centred education thus has its own challenges.

The aim of this contribution extends beyond demonstrating the practical importance of Buckingham Shum and Deakin Crick's dispositional learning analytics infrastructure. In addition, this research provides many clues as to what individualized information feedback could look like. In the learning blend described in this case study, the face-to-face component, PBL, constitutes the main instructional method. The digital component is intended as a supplementary learning tool, primarily for students for whom the transition from secondary to university education entails above-average hurdles. Part of these problems are of a cognitive type: for example, international students who never received statistics education as part of their high school mathematics programme, or other freshmen who may have been taught certain topics without achieving the required proficiency levels. For these kinds of cognitive deficiencies, the technology-enhanced environments proved to be an effective supplement to PBL. But this applies not only to adjustment problems resulting from knowledge backlogs: students encounter several other types of adjustment problems for which the digital tools appear to be functional.
The learning dispositions addressed above are a good example: student-centred education in fact presupposes deep, self-regulated learning, whereas many students have little experience with this and feel on more familiar ground with step-wise, externally regulated learning. As the analyses demonstrate, the digital test environments help in this transformation. The analyses also make clear that the test environments are instrumental for students with non-adaptive cognitions about learning mathematics and statistics, such as anxiety. That outcome is intuitive: for some students, individual practice sessions with computerized feedback will be a safer learning environment than the PBL tutorial group sessions. Finally, the learning analytics outcomes also make clear where the limits of digital practice lie: with students with non-adaptive behaviours and negative learning emotions. If learning evokes boredom and provokes self-handicapping, even the challenges of test-based learning will fall short.

Acknowledgements
The project reported here has been financed by SURF-foundation as part of the Learning Analytics Stimulus programme.

References
[1] Buckingham Shum, S., & Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics. In Proceedings of LAK2012: 2nd International Conference on Learning Analytics & Knowledge (pp. 92-101). New York: ACM Press.
[2] LASI Dispositional Learning Analytics Workshop (2013). Learningemergence.net/events/lasi-dla.wkshp.
[3] Tempelaar, D. T., Cuypers, H., Van de Vrie, E. M., Heck, A., & Van der Kooij, H. (2013). Formative Assessment and Learning Analytics. In Proceedings of LAK2013: 3rd International Conference on Learning Analytics & Knowledge (pp. 205-209). New York: ACM Press.
[4] Verbert, K., Manouselis, N., Drachsler, H., & Duval, E. (2012). Dataset-Driven Research to Support Learning and Knowledge Analytics. Educational Technology & Society, 15(3), 133-148.
[5] Whitmer, J., Fernandes, K., & Allen, W. R. (2012). Analytics in Progress: Technology Use, Student Characteristics, and Student Achievement. EDUCAUSE Review Online, July.
[6] Hofstede, G., Hofstede, G. J., & Minkov, M. (2010). Cultures and organizations: Software of the mind (revised and expanded third edition). Maidenhead: McGraw-Hill.
[7] Vermunt, J. D. (1996). Leerstijlen en sturen van leerprocessen in het Hoger Onderwijs. Amsterdam/Lisse: Swets & Zeitlinger.
[8] Martin, A. J. (2007). Examining a multidimensional model of student motivation and engagement using a construct validation approach. British Journal of Educational Psychology, 77, 413-440.
[9] Pekrun, R. (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18, 315-341.
[Contents page of the Tijdschrift voor Remedial Teaching issue in which the reprinted article appeared. Relevant entry: 'RT-programma's aan de poort van hoger onderwijs werken', by Dirk Tempelaar, Bart Rienties and Bas Giesbers, p. 18.]
RT-programma's aan de poort van hoger onderwijs werken

The academic performance of higher education students who took part in remedial mathematics teaching in the summer preceding their studies is many times higher than that of students who did not participate. This is the outcome of research by Maastricht University and Erasmus University. These remedial teaching programmes make use of ICT in the form of e-tutorials, which can adapt extensively and thus match the individual learning needs of the learner. The authors of this article stood at the cradle of the Webspijker programme of Maastricht University and explain how it came about and what it involves.
Disappointing student performance and a disappointing intake have been an important point of attention in higher education (ho) for more than a decade (Onderwijsraad, 2006). In science, engineering and social science programmes, knowledge deficiencies in mathematics are identified as a frequent cause: shortfalls that result both from not reaching the intended final attainment level in prior education and from a poor fit between prior and subsequent education (Onderwijsraad, 2006).

The first category of deficiencies played an important role in the debate on the failure of Het Studiehuis, the educational reform of the late 1990s that was meant to improve the all-important transition from pre-university education (vwo) to higher education. In the Netherlands the second category plays at least as important a role: the intake is becoming ever more diverse, because fewer and fewer students follow the classic route from vwo to higher education. The transfer from vocational to university education is one example; increasingly, the effects of internationalisation are another. Programmes with a strong international profile, such as the business programmes at Maastricht University (UM), have an intake in which holders of a vwo diploma form a minority (less than 30 per cent). Because national education systems differ strongly, even between neighbouring countries such as Belgium and Germany, this internationalisation implies an enormous increase in the diversity of prior education and prior knowledge. Its immediate consequence is the urgent need for remedial teaching that adapts flexibly to the knowledge level of the individual student.

Electronic learning environment
At several higher education institutions, mathematics remedial teaching modules have in recent years been designed on the basis of the electronic learning environment ALEKS. ALEKS stands for Assessment and LEarning in Knowledge Spaces (see also www.aleks.com, and type Higher Education Mathematics in the search box). It works as follows. Knowledge space theory (KST) is the branch of artificial intelligence concerned with the question of how the total knowledge within a scientific domain, the 'knowledge space', can be represented. This is done by means of knowledge trees. Such a tree consists of a large number of nodes (the individual lessons, or 'problem types') plus links between those nodes. When two lessons are linked, this means there is an order relation (precedence relation) between them: you must first have completed the one lesson before you can start the other.

'Through adaptive testing the electronic learning environment ALEKS determines the knowledge level of the student'

Using adaptive tests, ALEKS determines the knowledge state (K) of the student: the combination of nodes representing the lessons the student has mastered. The tree structure is then used to let the student choose from the lessons that can be done as the next step (ready to learn). That may be the next lesson in, for example, simplifying fractions, just as in a textbook, but it may also be a lesson on another subtopic (writing fractions as percentages) or even an entirely different topic (solving a linear equation), as long as the student is ready for it. The new lessons for which the student has all the required prior knowledge are labelled by ALEKS as the outer fringe. But the choice is even wider, because ALEKS also keeps track of how well the student masters lessons done earlier for which, according to the adaptive test, some repetition would be useful. These lessons form the inner fringe and are also part of the set of candidate next lessons. The next lesson is therefore not, as with a textbook, always the one lesson on the next page (the linear method); ALEKS offers a choice of several lessons, all of which connect precisely to the knowledge of the individual student. See Figure 1 for a graphical representation.

In the Netherlands, SURF, the collaborative ICT organisation of Dutch universities and universities of applied sciences, has played an important role as initiator and funder of flexible remedial teaching projects. Initially the focus was on the design of remedial teaching modules and on gaining experience with their use. The Webspijkeren projects, in which UM participated between 2004 and 2008, were examples of this (Rienties et al., 2005; Tempelaar et al., 2007, 2008; Tempelaar, Kuperus et al., 2012; Wieland et al., 2007). Between 2009 and 2013 the emphasis shifted towards jointly developing item banks, in the SURF programme named Toetsing en Toetsgestuurd Leren; among others, the NKBW projects (nationale kennisbank basisvaardigheden wiskunde) compiled item banks for mathematics within it. At the same time the scope was broadened: no longer aimed only at the intake, but also at progression in, above all, the first year of programmes with a substantial mathematics component, such as business, economics and the life sciences, as well as technical programmes at the universities of technology, such as architecture and chemistry. Since 2013 the SURF programme has again been given a new focus: Learning Analytics (LA) (Tempelaar, Cuypers et al., 2012, 2013). The importance of this recent development for remedial teaching is explained later in this article, in the Learning Analytics box.
Figure 1: Outer fringe and inner fringe of knowledge state K. An efficient way to describe the knowledge state of a student is to enumerate both the inner fringe and the outer fringe (in this figure labelled binnenrandgebied and buitenrandgebied). This also matches the way the student's learning is steered: at any moment the student can choose either a lesson from the outer fringe (a new topic) or the repetition of a lesson from the inner fringe that was learned earlier but is not yet fully mastered.
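The outer and inner fringe described above can be derived mechanically from a precedence graph and a knowledge state. The sketch below is only a simplified illustration of that idea; it is not ALEKS's actual algorithm, and the lesson names, mastery scores and threshold are hypothetical.

```python
# Simplified illustration of knowledge-space bookkeeping (hypothetical data):
# which lessons are 'ready to learn' (outer fringe) and which mastered lessons
# deserve repetition (inner fringe)?

# Precedence relations: each lesson maps to the lessons that must be mastered first.
prerequisites = {
    "simplify_fractions": [],
    "fractions_as_percentages": ["simplify_fractions"],
    "linear_equations": ["simplify_fractions"],
    "systems_of_equations": ["linear_equations"],
}

# Knowledge state K: lessons the adaptive test marks as mastered,
# with an estimated mastery level between 0 and 1.
knowledge_state = {"simplify_fractions": 0.9, "linear_equations": 0.6}

# Outer fringe: not yet mastered, but all prerequisites are mastered.
outer_fringe = [
    lesson for lesson, prereqs in prerequisites.items()
    if lesson not in knowledge_state and all(p in knowledge_state for p in prereqs)
]

# Inner fringe (in the informal sense used above): mastered lessons whose
# estimated mastery level suggests that repetition would be useful.
REPEAT_THRESHOLD = 0.7
inner_fringe = [lesson for lesson, level in knowledge_state.items() if level < REPEAT_THRESHOLD]

print("Ready to learn (outer fringe):", outer_fringe)
print("Worth repeating (inner fringe):", inner_fringe)
```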
Online summer course
Eventually this work led to an online summer course based on ALEKS. It became a substantial remedial programme of at least a hundred study hours, much larger than the classic refresher at the start of the academic year, given the strongly heterogeneous group of incoming students. Distance education was chosen deliberately: in this way the course could adapt to the many forms of secondary education and the many types of deficiencies that result from them. Flexibility was also needed because the course had to fit around the holiday plans of prospective students. Hence the choice for the summer months.

An improbably large effect
In the various projects, the effect of taking the summer courses has been examined closely. This is a tricky topic, because participation is voluntary. There are substantial differences between the academic performance of students who participated in the summer course and that of non-participants: the participants perform very much better. The raw effect is roughly as large as the difference between Wiskunde A and Wiskunde B in prior education. That is an improbably large effect: the summer course amounts to about a hundred study hours, whereas in secondary education the entire upper years are taught at A or B level.

'The raw effect of the course is roughly as large as the difference between Wiskunde A and Wiskunde B in prior education'

That (too large) raw effect must partly be attributed to self-selection: students who voluntarily choose to sacrifice part of their summer holiday to brush up their mathematics will differ from other students in more favourable student characteristics: more motivation, better learning regulation, more favourable learning dispositions, and so on. The raw effect is the sum of the true learning effect of the summer course and the effect resulting from the self-selection of students who do or do not participate. Splitting the raw effect into these two separate components is a demanding statistical exercise that requires a lot of additional data: in fact, all possible student characteristics that may play a role in the choice whether or not to participate. Research at both UM and Erasmus University suggests that there is indeed a substantial selection effect, but that after correcting for it a considerable learning effect remains (roughly: 50 per cent of the raw effect is selection, 50 per cent is learning gain from the remedial teaching; see Tempelaar et al., 2011). Remarkably, the largest learning effect is achieved not on subject matter from the upper years, but on more basic algebraic skills that belong primarily to the lower-secondary programme.

Learning Analytics
Learning Analytics (LA) aims to strengthen the learning process by measuring learning in a systematic way and informing students and teachers with the outcomes, so that learning can be steered better. Essential for LA is the availability of a large amount of data about the learning process, which is really only possible with computer-supported learning. The most important source in that case is computer 'logging' data. Such data can help signal that a student has not been active for a longer period, that a student is studying topics in a non-obvious order, or that inappropriate learning strategies are being used. Digital learning environments that are set up to be strongly adaptive, matching individual learning needs, can also generate more complex feedback. In remedial teaching programmes, feedback based on LA plays a crucial role. A famous example of an LA application is the 'traffic light project' of the American Purdue University: first-year students are warned with a traffic-light symbol whenever the combination of study effort and study performance gives cause for extra attention, drawing on data from digital systems. In ALEKS this kind of feedback is available through various 'dashboards' that indicate progress in the learning process; see Figure 2 for an example. A special form of LA is the one in which the feedback is based on a combination of system data and learning disposition data. This application is called dispositional LA: the feedback is given a personal colouring on the basis of the learning preferences of the student. See Tempelaar and Cuypers et al. (2012, 2013) for more background. This variant places high demands on the availability of relevant student characteristics such as learning style, motivation and attitudes; in many applications these data will be only partially available. They are, incidentally, precisely the data that are also needed to determine effects, so the two applications will often go hand in hand.

Figure 2: ALEKS dashboard (partial) for the teacher. [Screenshot of the Instructor Module showing, per student, progress in this course, last login and last assessment date.]

'Recent developments in higher education, with internationalisation at the forefront, give an impulse to new forms of remedial teaching'

We believe that recent developments in higher education, with internationalisation at the forefront, give an impulse to new forms of remedial teaching. Computer-supported learning materials, adaptive testing and adaptation to individual learning needs are crucial in this. New developments such as Learning Analytics will come to play an ever larger role in it.
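The split of the raw course effect into a selection component and a learning component, as discussed above, is essentially a covariate-adjustment exercise. The sketch below illustrates the idea on simulated data with a single hypothetical disposition variable; the studies cited used far richer sets of student characteristics.

```python
# Hypothetical illustration of splitting a raw participation effect into a
# selection component and a learning component by adjusting for dispositions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000
motivation = rng.normal(0, 1, n)                                     # favourable disposition
participate = (motivation + rng.normal(0, 1, n) > 0).astype(int)     # self-selection into the course
score = 0.5 * participate + 0.5 * motivation + rng.normal(0, 1, n)   # exam score; true course effect is 0.5

df = pd.DataFrame({"score": score, "participate": participate, "motivation": motivation})

raw = smf.ols("score ~ participate", data=df).fit().params["participate"]
adjusted = smf.ols("score ~ participate + motivation", data=df).fit().params["participate"]

print(f"raw effect: {raw:.2f}")            # inflated by self-selection
print(f"adjusted effect: {adjusted:.2f}")  # closer to the true learning effect
print(f"selection share of raw effect: {1 - adjusted / raw:.0%}")
```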
The research described here has benefited substantially from support through the SURF programmes Webspijkeren, Toetsing en Toetsgestuurd Leren, and Learning Analytics.
References
Onderwijsraad (2006). Versteviging van kennis in het onderwijs. Den Haag: Onderwijsraad.
Rienties, B., Dijkstra, J., Rehm, M., Tempelaar, D. T., & Blok, G. (2005). Online bijspijkeronderwijs in de praktijk. Tijdschrift voor Hoger Onderwijs, 23(4), 239-253.
Tempelaar, D. T., Cuypers, H., Van de Vrie, E., Heck, A., & Van der Kooij, H. (2013). Formative Assessment and Learning Analytics. In D. Suthers & K. Verbert (Eds.), Proceedings of the 3rd International Conference on Learning Analytics and Knowledge (pp. 205-209). New York: ACM.
Tempelaar, D. T., Cuypers, H., Van de Vrie, E., Van der Kooij, H., & Heck, A. (2012). Toetsgestuurd leren en learning analytics. OnderwijsInnovatie, September 2012, 17-26. (http://www.ou.nl/documents/10815/575b8d77-da70-490d-9585-1ceea8c349a1)
Tempelaar, D. T., Kuperus, B., Cuypers, H., Van der Kooij, H., Van de Vrie, E., & Heck, A. (2012). The Role of Digital, Formative Testing in e-Learning for Mathematics: A Case Study in the Netherlands. In: Mathematical e-learning [online dossier]. Universities and Knowledge Society Journal (RUSC), 9(1). UOC.
Tempelaar, D. T., Rienties, B., Kaper, W., Giesbers, B., Van Gastel, L., Van de Vrie, E., Van der Kooij, H., & Cuypers, H. (2011). Effectiviteit van facultatief aansluitonderwijs wiskunde in de transitie van voortgezet naar hoger onderwijs. Pedagogische Studiën, 88(4), 231-248.
Tempelaar, D. T., Rienties, B., Van Engelen, A. J. M., Brouwer, N., Wieland, A., & Van Wesel, M. (2007). Web-Spijkeren I & II: wiskunde reparatieonderwijs. OnderwijsInnovatie, 9(2), 17-26. (http://www.ou.nl/documents/10815/f84d1d97-19e9-4924-af83-84ff751c4c43)
Tempelaar, D. T., Rienties, B., & Van Wesel, M. (2008). Toetsend leren voor flexibel remediatie-onderwijs. Examens, februari 2008, 28-32.
Wieland, A., Brouwer, N., Kaper, W., Heck, A., Tempelaar, D. T., Rienties, B., Van Leijen, M., & Ten Boske, B. (2007). Factoren die een rol spelen bij de ontwikkeling van remediërend onderwijs. Tijdschrift voor Hoger Onderwijs, 25(1), 2-15.

Dirk Tempelaar is an associate professor at the Maastricht University School of Business & Economics.
Bart Rienties is a Reader in Learning Analytics at the Open University UK, Institute of Educational Technology.
Bas Giesbers is e-learning project leader at the Rotterdam School of Management, Erasmus University.