Enhancing Collaboration through Assessment & Reflection
Chris Phielix
Members of the assessment committee:
Prof. dr. M. Brekelmans
Prof. dr. F. Fischer
Prof. dr. S. Järvelä
Prof. dr. J. van Tartwijk
Dr. K. Kreijns
This research was carried out in the context of the Dutch Interuniversity Centre for Educational Research.
© 2012, Chris Phielix. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, by photocopy, by recording, or otherwise, without permission from the author.
ISBN: 978-90-393-5839-9
Printed by: Drukkerij Donath B.V., Zeist
Cover design: VanAnne.nl
Enhancing Collaboration through Assessment & Reflection
Samenwerking Verbeteren door middel van Beoordeling en Reflectie
(with a summary in Dutch)
Doctoral thesis for the degree of Doctor at Utrecht University, by authority of the Rector Magnificus, Prof. dr. G.J. van der Zwaan, in accordance with the decision of the Doctorate Board, to be defended in public on
28 September 2012 at 12.45 hours in the afternoon by
Chris Phielix, born on 11 September 1977 in Utrecht
Promotor: Prof. dr. P. A. Kirschner
Co-promotor: Dr. F. J. Prins
Contents

1. General Introduction
2. Collaboration in a CSCL-environment
   2.1 Collaborative Learning
   2.2 Social and Cognitive Processes in Collaborative Learning
   2.3 CSCL-groups versus Face-to-Face groups
   2.4 Effects of CSCL-design on Social Interaction
   2.5 Peer Feedback to Enhance Performance
   2.6 Reflection to Enhance Performance and Consensus
   2.7 Social Relation Models to Measure Consensus in Peer Ratings
   2.8 Group Awareness Tools to Enhance Collaboration
3. Tools and Measures
   3.1 Introduction
   3.2 Virtual Collaborative Research Institute (VCRI)
   3.3 Self and Peer Assessment Tool (Radar)
   3.4 Reflection Tool (Reflector)
   3.5 Coding Scheme Output Co-reflection
   3.6 Social Performance Scales
4. Awareness of Group Performance¹
   Abstract
   4.1 Introduction
   4.2 Research Questions
   4.3 Method and Instrumentation
   4.4 Results
   4.5 Discussion and Conclusion
5. Half-time Awareness in a CSCL Environment²
   Abstract
   5.1 Introduction
   5.2 Research Questions
   5.3 Method and Instrumentation
   5.4 Results
   5.5 Discussion and Conclusion
6. Using Reflection to Increase Consensus among Peer Raters³
   Abstract
   6.1 Introduction
   6.2 Research Questions
   6.3 Method
   6.4 Results
   6.5 Discussion and Conclusion
   6.6 Future Research and Implications
7. General Discussion and Conclusions
   7.1 Introduction
   7.2 Summary of the Studies
   7.3 Synthesis
   7.4 Methodological Issues
   7.5 Theoretical and Practical Implications
References
8. Samenvatting (Summary in Dutch)
   8.1 Introductie
   8.2 Samenvatting van de studies
   8.3 Synthese
   8.4 Methodologische opmerkingen
   8.5 Theoretische en praktische implicaties
Dankwoord (Acknowledgements)
List of Publications
Curriculum Vitae
List of ICO-Dissertations 2011
1. General Introduction
Collaborative learning, often supported by computer networks (computer supported collaborative learning; CSCL), is enjoying considerable interest at all levels of education (Strijbos, Kirschner, & Martens, 2004). Though CSCL environments have been shown to be promising educational tools and expectations as to their value and effectiveness are high, groups learning in CSCL environments do not always reach their full potential (e.g., Baltes, Dickson, Sherman, Bauer, & LaGanke, 2002; Fjermestad, 2004; Hobman, Bordia, Irmer, & Chang, 2002; Lipponen, Rahikainen, Lallimo, & Hakkarainen, 2003; Thompson & Coovert, 2003). Two important reasons for the disparity between potential and actual performance lie in the social interaction between the group members, which is influenced by (1) the design of the CSCL environment (Kreijns, Kirschner, & Jochems, 2003), and (2) group members' self perceptions about their actual behavior and performance (Dunning, Heath, & Suls, 2004).

First, the design of CSCL environments is often solely functional, focusing on the cognitive processes needed to accomplish a task and/or solve a problem and on the achievement of optimal learning performance (Kreijns & Kirschner, 2004). These functional CSCL environments force (or coerce; Kirschner, Beers, Boshuizen, & Gijselaars, 2008) group members to only carry out the task and thus limit the possibility for socio-emotional processes to take place. These socio-emotional processes, which are the basis for group forming and group dynamics, are essential for developing strong social relationships, strong group cohesiveness, feelings of trust, and a sense of community among group members (Cutler, 1996; Jehng, 1997; Kreijns & Kirschner).

Second, group members tend to overestimate their social and cognitive performance in the group. Group members' self-views (self perceptions) show a tenuous to modest relationship with their actual behavior and performance (Dunning, Heath, & Suls, 2004), which indicates that group members are often not aware that their actual behavior and performance are less effective than they think. This tendency of group members to believe that they are performing effectively, while they often do not, can undermine the group's social (e.g., team development) and cognitive performance (e.g., quantity and quality of work), causing it not to reach its full potential (Karau & Williams, 1993; Stroebe, Diehl, & Abakoumkin, 1992).

To this end, CSCL environments can be augmented with tools that make group members aware of the discrepancies between their self perceptions and their actual behavior and performance (e.g., Janssen, Erkens, Kanselaar, & Jaspers, 2007). These tools, also known as group awareness tools, provide information about the social and collaborative environment in which a person participates (e.g., they inform students how their actual behavior and performance is perceived by their peers; see Buder, 2007, 2011). This enhanced group awareness can lead to more effective and efficient collaboration (e.g., Buder & Bodemer, 2008; Janssen, Erkens, & Kirschner, 2011). Two operationalizations of such tools were developed for this research project, namely (1) a shared self and peer assessment tool (Radar) and (2) a shared reflection tool (Reflector). These tools are intended to help group members become better aware of their individual and group behavior and to stimulate them to set goals and formulate plans for improving the group's social and cognitive performance.
The use of self and peer assessment has become increasingly popular in the field of education. Although the impact of peer assessment on learning processes and activities is often viewed as important, empirical evidence for its effects on learning is scarce (Strijbos &
Sluijsmans, 2010; Topping, 1998; Van Gennip, Segers, & Tillema, 2009, 2010; Van Zundert, Sluijsmans, & Van Merriënboer, 2010). There is a need for empirical (quasi-)experimental studies in which clearly described methods and conditions are related to outcome variables (Topping, 2010). Furthermore, according to Hattie and Timperley (2007), the effect of feedback (i.e., shared assessments) can be increased when students answer three reflective questions: (1) Where am I going?, (2) How am I going?, and (3) Where to next? Thus, for both scientific and practical reasons it is important to examine the effects of peer assessment (with or without reflection) on the group's social and cognitive processes. Therefore, the central research question in this thesis is: To what extent do peer assessment and reflection affect behavior and performance in small CSCL groups?

The central research question is addressed in three empirical studies, each focusing on specific research questions concerning different aspects and availability of the peer assessment and/or reflection tool. This research is primarily focused on the socio-emotional aspects of group functioning. The main research goals are (1) examining ways to let group members become aware of their behavior by means of an assessment tool, and (2) examining ways to alter behavior and performance by means of a reflection tool.

In this thesis, assessment is defined as the process through which students monitor and rate the behavior and/or performance of themselves (i.e., self assessment) and/or their fellow group members (i.e., peer assessment). Reflection is defined as the intellectual and affective activities individuals engage in to explore their experiences to reach new understandings and appreciations of those experiences (Boud, Keogh, & Walker, 1985). In this thesis, students not only reflect individually but also collaboratively on their experiences. This process of collaborative reflection (i.e., co-reflection) is defined as a collaborative critical thinking process involving cognitive and affective interactions between two or more individuals who explore their experiences in order to reach new intersubjective understandings and appreciations (Yukawa, 2006, p. 206). Behavior is defined as the perceived social (non-task-related) and cognitive (task-related) aspects or activities that are important for successful collaboration. These social aspects, further referred to as social behavior, are measured by self and peer ratings in Radar on four variables: influence, friendliness, cooperativeness, and reliability. The cognitive aspects, further referred to as cognitive behavior, are measured by self and peer ratings in Radar on two variables, namely productivity and quality of contribution. Performance is defined as the social and cognitive achievement or output at the end of the collaboration process. Social achievement, further referred to as social performance, is measured by four subscales (i.e., team development, group satisfaction, level of group conflict, and attitude toward problem-based collaboration). Cognitive output, further referred to as cognitive performance, is measured by the grade that was given to the group's product (i.e., essay or paper).
The theoretical framework for the three empirical studies is presented in chapter 2, which addresses collaborative learning (CL), social and cognitive processes in CL, CSCL-groups versus face-to-face groups, effects of CSCL-design on social interaction, peer feedback, self and peer assessment, reflection, social relation models, and facilitating group awareness to enhance collaboration. Chapter 3 describes the tools and measures used in this thesis, namely the self and peer assessment tool (Radar), the reflection tool (Reflector), the CSCL environment (VCRI) in which these tools are embedded, the scales to measure the groups' social performance, and a coding scheme for the output of the collaborative reflection (co-reflection). The first empirical
study, reported in chapter 4, examines the separate and interaction effects of Radar and Reflector on the groups' social and cognitive performance. In this study, Radar provides information on five traits deemed important for assessing behavior in groups. Four traits are related to social behavior, namely influence, friendliness, cooperation, and reliability. The fifth trait, productivity, is related to cognitive behavior. The Reflector in this study is aimed at the group's past functioning. The second empirical study, reported in chapter 5, examines to what extent half-time awareness (i.e., receiving the tools for the first time halfway through the collaboration process) affects the group's social and cognitive performance. In this study the five traits of Radar are complemented with a sixth trait related to cognitive behavior: quality of contribution. The Reflector in this study is aimed at future group functioning. In the third empirical study, reported in chapter 6, it was hypothesized that a combination of Radar and Reflector would be more effective than Radar alone: the combination was expected to lead to more objective and valid ratings, more consensus among raters, and enhancement of the groups' social and cognitive performance, compared to groups with only Radar. Chapter 7 provides a general discussion and concludes this thesis. This chapter lists the main findings of the three studies, discusses their theoretical implications, considers methodological issues, provides suggestions for future research, and describes the practical implications of the results.
2. Collaboration in a CSCL-environment
2.1 Collaborative Learning
There is a growing interest in educational settings, especially in higher education, in letting people learn and/or work together in small groups (Strijbos, Kirschner, & Martens, 2004). This is known as collaborative learning (CL). According to Roschelle and Teasley (1995), collaborative learning involves the "mutual engagement of participants in a coordinated effort to solve the problem together" (p. 70). Higher education is increasingly turning to collaborative learning to cope with the increasingly high demands of the environment (e.g., Cohen & Bailey, 1997; Salas, Sims, & Burke, 2005). Groups or teams are formed based on the assumption that they offer a variety of backgrounds, points of view, education and/or expertise, and bring multiple perspectives to solving complex problems (Van den Bossche, Gijselaers, Segers, & Kirschner, 2006).

Closely related to CL is co-operative learning. The difference between CL and co-operative learning is that CL is a personal philosophy of interaction, while co-operative learning is a set of processes (i.e., structure) designed to help people interact in order to accomplish a specific goal or develop an end product (Kirschner, 2001). The underlying premise of CL is consensus building through co-operation by group members, in which they share authority and responsibility for the group's actions. Co-operative learning is more directive and teacher-centred, whereas CL is more student-centred (Kirschner).

CL has been found to enhance students' cognitive performance (Johnson & Johnson, 1999; Slavin, 1997) and stimulate students to engage in knowledge construction (Stahl, 2004). These effects can be reinforced when collaborative learning is embedded in an authentic context and applied to ill-structured and complex learning tasks (Jonassen, 1991, 1994). As such, it is often used in combination with other conceptions of learning such as problem-based learning or project-based learning, where students have to solve problems or carry out tasks that are embedded in realistic contexts. Having students work in small groups makes it possible for them to cope with problems that would otherwise take too much time or be too difficult to solve on their own. Finally, CL also stimulates the development of teamwork skills, thus facilitating competence-based learning as well (Druskat & Kayes, 2000). The interaction during collaboration supports the use of effective discourse learning methods (i.e., explicitation, discussion, reasoning, reflection, and convincing others), and provides opportunities for developing social and communication skills, developing positive attitudes towards co-members and learning material, and building social relationships and group cohesion (Johnson & Johnson, 1989, 1999).
2.2 Social and Cognitive Processes in Collaborative Learning
A key to successful collaborative learning is social interaction (Kreijns, Kirschner, & Jochems, 2003; Liaw & Huang, 2000; Northrup, 2001). Social interaction is important for the group's cognitive (i.e., task-related) processes in collaboration, such as discussion, reasoning, reflection, critical thinking, and creating a shared understanding of the problem (e.g., Garrison, Anderson, & Archer, 2001; Johnson & Johnson, 1999; Kreijns et al., 2003). These cognitive processes are essential for the group's cognitive performance, such as effective and efficient problem solving, task
accomplishment, and knowledge construction. For example, Henry (1995) found that groups who shared and discussed task-relevant information outperformed groups who did not. However, forming a group of skilled individuals does not automatically make it a team and does not guarantee success (Salas, Sims, & Burke, 2005). To turn a group of skilled individuals into a team, group members need to put effort not only into carrying out the task or solving the problem, but also into teamwork. It is teamwork that ensures the success of groups, and there is no reason to believe that this would be different for groups whose focus is on learning (Kay, Maisonneuve, Yacef, & Reimann, 2006). According to Salas, Sims, and Burke, teamwork can be defined as "…a set of interrelated thoughts, actions, and feelings of each team member that are needed to function as a team and that combine to facilitate coordinated, adaptive performance and task objectives resulting in value-added outcomes" (p. 562). In other words, to achieve a well-functioning or effective group, group members also need to put effort into the social (i.e., non-task-related) processes, such as developing positive affective relationships, strong group cohesiveness, feelings of trust and belonging, and a sense of community (e.g., Boud, Cohen, & Sampson, 1999; Johnson, Johnson, & Smith, 2007; Kreijns & Kirschner, 2004). These social processes allow group members to get to know and understand each other so as to become a 'healthy' community of learning (Gunawardena, 1995; Rourke, 2000; Wegerif, 1998). For example, Rovai (2001) found that 'feelings of community' can increase the flow of information, support, commitment to group goals, and satisfaction with group efforts. Furthermore, Guzzo and Dickson (1996) found that group cohesion enhances task performance and effectiveness.

To successfully collaborate, group members need to coordinate and regulate both cognitive and social processes (Ellis, 1997; Erkens, 2004; Erkens, Jaspers, Prangsma, & Kanselaar, 2005; Kreijns et al., 2003). During collaboration, group members are interdependent, and therefore they have to plan task-related activities, discuss collaboration strategies, monitor the collaboration process, and evaluate and reflect on the manner in which they collaborated. Processes that involve coordination, regulation, monitoring, evaluation, and reflection of both the cognitive (task-related) and social (non-task-related) processes can also be referred to as meta-cognitive processes, and are considered important for successful group performance (Artzt & Armour-Thomas, 1997; Van Meter & Stevens, 2000). For example, Erkens et al. (2005) found that meta-cognitive activities such as making plans were related to the quality of written texts, and Jehn and Shah (1997) found that task monitoring was related to group performance. Furthermore, several studies showed that group performance can be increased by letting group members discuss how their group is performing and how collaboration may be improved (e.g., Johnson, Johnson, Roy, & Zaidman, 1985; Yager, Johnson, Johnson, & Snider, 1986). However, processes such as planning task-related activities, discussing collaboration strategies, monitoring collaboration processes, and evaluating and reflecting on the manner in which the group collaborated will not automatically occur by simply bringing students together (e.g., Fischer, Bruhn, Gräsel, & Mandl, 2002; Gräsel, Fischer, Bruhn, & Mandl, 2001; Hewitt, 2005; Weinberger, 2003).
For successful collaborative learning these processes need to be supported by adequate scaffolds or tools. Therefore, in this research project, groups are given two computer supported collaborative learning (CSCL) tools, namely a shared assessment tool (Radar) and a reflection tool (Reflector), to provide them with information to monitor, coordinate, and regulate their own and the group's social and cognitive performance.
2.3 CSCL-groups versus Face-to-Face groups
The rapid development of information and communication technologies has led to computer applications such as external representations, group awareness widgets, and shared participation
tools, which have proven to be useful for supporting collaborative learning (e.g., Janssen, Erkens, Kanselaar, & Jaspers, 2007; Kreijns, Kirschner, & Jochems, 2003; Slof, Erkens, Kirschner, Janssen, & Phielix, 2010). Several researchers report cognitive and social benefits for groups in CSCL environments as compared to contiguous (i.e., face-to-face) groups. First, concerning the cognitive aspects of collaboration, researchers have found that students working in CSCL-environments report higher levels of learning (Hertz-Lazarowitz & Bar-Natan, 2002), make higher quality decisions, deliver more complete reports, participate more equally (Fjermestad, 2004; Janssen, Erkens, Kanselaar, & Jaspers, 2007), and engage in more complex, broader, and challenging discussions (Benbunan-Fich, Hiltz, & Turoff, 2003) than students working face-to-face. With respect to the social aspects, students working in CSCL-environments report higher levels of satisfaction compared to students in contiguous groups (Fjermestad, 2004).

There are, however, also contradictory results. First, concerning the cognitive aspects of collaboration, students working in CSCL-environments sometimes perceive their discussions as more confusing (Thompson & Coovert, 2003) and as less productive (Straus, 1997; Straus & McGrath, 1994), and need more time to reach consensus and to make decisions (Fjermestad, 2004) than students working face-to-face. Second, students working in CSCL-environments have been found to show lower levels of participation (Lipponen, Rahikainen, Lallimo, & Hakkarainen, 2003), to experience higher levels of conflict (Hobman, Bordia, Irmer, & Chang, 2002), to experience lower levels of group cohesiveness (Straus; Straus & McGrath), and to experience lower levels of satisfaction (Baltes, Dickson, Sherman, Bauer, & LaGanke, 2002) as compared to students working in contiguous groups. In other words, students working in CSCL-environments do not always reach their full potential.

Finally, there are also studies which show that there is little difference between face-to-face and CSCL-groups, especially with respect to characteristic problems and difficulties (O'Donnell & O'Kelly, 1994; Salomon & Globerson, 1989), such as social loafing (i.e., group members investing less effort in a group than when working individually) or the free-rider effect (i.e., students letting other group members do the work for them). Thus, both CSCL-groups and face-to-face groups can benefit from tools that enhance students' awareness of behavior and performance by providing team members with explicit information concerning their behavior (e.g., being too dominant) or their performance (e.g., contributing low-quality work).
2.4 Effects of CSCL-design on Social Interaction
Two important reasons for the disparity between the potential and the performance of groups learning in CSCL-environments lie in the design of the CSCL-environment, and in the actual or perceived social and cognitive behavior of the group members. With respect to the former, most CSCL-environments focus on supporting cognitive or task-related processes in collaboration and limit possibilities for social or non-task-related processes (Kreijns et al., 2004). For instance, despite technological advances, most CSCL-environments are still text-based computer-mediated communication systems using email, chat, and/or discussion boards, which cannot easily convey visual, nonverbal cues (Kreijns et al., 2003). The absence of these cues can cause specific communication and interaction problems since there are few possibilities to exchange socio-emotional and affective information, and there is little information about group members' presence, self-image, attitudes, moods, actions, and reactions (Short, Williams, & Christie, 1976). According to Short et al., these nonverbal cues are related to forming, building, and maintaining social relationships. Such lean systems can, thus, negatively affect impression formation and social behavior (e.g., Garton & Wellman, 1995; Walther, Anderson, & Park, 1994).
With respect to the latter (i.e., behavior), group members form interpersonal perceptions during interaction (Kenny, 1994) based on what they see and experience. They form impressions (e.g., norms, values, beliefs) about themselves, the group, other group members, and what the other group members think of them. These impressions are based on perceived cognitive behaviors (e.g., a person's or the team's productivity) and social behaviors (e.g., the dominance and/or friendliness of group members) during interaction. Based upon these perceptions, group members determine their own social and cognitive behavior, and develop relationships with each other. However, research has shown that self-perceptions of performance and perceptions of group performance are generally unrealistically positive in contiguous groups (Saavedra et al., 1993; Stroebe, Diehl, & Abakoumkin, 1992; Yammarino & Atwater, 1997) and in computer-mediated groups (Weisband & Atwater, 1999). Group members tend to overestimate their social and cognitive performance in the group. Group members' self-views (self perceptions) show a tenuous to modest relationship with their actual behavior and performance (Dunning, Heath, & Suls, 2004), which indicates that group members are often not aware that their actual behavior and performance are less effective than they think. This tendency of group members to believe that they are performing effectively, while they often do not, can result in reduced effort by group members (i.e., social loafing; Williams, Harkins, & Latané, 1981), further undermining the group's social (e.g., team development) and cognitive performance (e.g., quantity and quality of work), and causing it not to reach its full potential (Karau & Williams, 1993; Stroebe, Diehl, & Abakoumkin, 1992). Unfortunately, group members are often not aware that they are loafing or are unwilling to admit it (Karau & Williams, 1993).

To enhance social interaction and alleviate biased self perceptions, small groups can be outfitted with CSCL tools that make group members aware of the discrepancies between their self perceptions and their actual behavior and performance (e.g., Janssen, Erkens, Kanselaar, & Jaspers, 2007). These tools, also known as group awareness tools, provide information about the social and collaborative environment in which a person participates (e.g., they inform students how their actual behavior and performance is perceived by their peers; see Buder, 2007, 2011). This enhanced group awareness can lead to more effective and efficient collaboration (e.g., Buder & Bodemer, 2008; Janssen, Erkens, & Kirschner, 2011). Two operationalizations of such tools were developed for this research project, namely (1) a shared self and peer assessment tool (Radar) and (2) a shared reflection tool (Reflector). These tools are intended to help group members become better aware of their individual and group behavior and to stimulate them to set goals and formulate plans for improving the group's social and cognitive performance. Radar allows the group members to rate their own social and cognitive behavior (i.e., self assessment) as well as that of their fellow group members (i.e., peer assessment), and shares this information anonymously with all group members. The effect of Radar on students' behavior may, however, depend on the willingness and ability of students to reflect upon the information (i.e., peer feedback) provided by this tool.
For instance, Radar can provide students with cues for needed behavioral adaptation (e.g., I come across as too strong), but they might lack the will and ability to reflect upon this information (e.g., what should I do to 'lighten up'?). Therefore, Radar was combined with Reflector, which stimulated group members to individually reflect upon their own functioning, their received peer ratings, and the functioning of the group as a whole. Reflector also stimulated group members to collaboratively reflect on their group performance and formulate plans for improvement. Accordingly, it was hypothesized that a combination of Radar and Reflector would be most effective for influencing the group members' behavior and enhancing their performance. The next sections deal with aspects central to these tools, namely peer feedback, group awareness, self and peer assessment, and reflection.
2.5 Peer Feedback to Enhance Performance
Peer feedback can be used as a method for providing group members with information concerning their behavior in a group (i.e., their interpersonal behavior). This peer feedback can be focused on evaluation and/or development. Topping (1998) has a more evaluative perspective on peer feedback and defines it as an "arrangement in which individuals consider the amount, level, value, worth, quality or success of the products or outcomes of learning of peers of similar status" (p. 250). In comparison, Earley, Northcraft, Lee, and Lituchy (1990) and Kluger and DeNisi (1996) have a more developmental perspective on feedback, which is focused on performance improvement and describes feedback as information provided to an individual in order to increase performance. In this thesis the developmental perspective on feedback is used, because in this research project peer feedback is provided by sharing the information of an assessment tool (i.e., Radar) and a reflection tool (i.e., Reflector) in order to enhance the group's social interaction and performance. Therefore, in this thesis, peer feedback is defined as information provided by peers of similar status (i.e., fellow students) that is intended to increase performance.

Peer feedback can be provided by an individual or a team (i.e., individuals working in a team context), and naturally, it can also be received by an individual or a team (i.e., individual feedback gathered at the team level and presented to the whole team). Furthermore, peer feedback can be given on performance outcomes (i.e., outcome feedback) or on processes (i.e., process feedback). Outcome feedback, which is the most common, is information concerning performance outcomes (Balcazar, Hopkins, & Suarez, 1986). Outcome feedback has been found to increase performance at both the individual level (Kluger & DeNisi) and the team level (Burgio, Engel, Hawkins, McCormick, & Scheve, 1990; Goltz, Citera, Jensen, Favero, & Komaki, 1989), especially when it is combined with goal setting (Mento, Steel, & Karren, 1987; Neubert, 1998; Tubbs, 1986). The underlying mechanism for the effects of outcome feedback is an increase in effort, which in turn leads to an increase in performance (Kluger & DeNisi; Locke & Latham, 1990). Process feedback is information concerning one's learning or work process, without taking the product or performance into account (Balcazar, Hopkins, & Suarez).

In this thesis, each group member provided both process and outcome feedback in order to increase social interaction during collaboration and enhance group performance at the end. For example, each group member provided anonymous feedback on the collaboration process by rating the social aspects of collaboration (e.g., friendliness, cooperation) of themselves and their fellow group members. Furthermore, each group member also provided feedback on the performance of his/her peers by rating the cognitive aspects of their collaboration (i.e., productivity, quality of contribution) during the collaboration process. This feedback is visualized in a radar diagram in a CSCL environment (see the description of Radar in section 3.3).
2.5.1 Providing peer feedback is a skill
Providing peer feedback is a complex skill that does not effectively or efficiently emerge or develop in a spontaneous way (Prins, Sluijsmans, Kirschner, & Strijbos, 2005). There is often resistance to large-scale peer feedback interventions, which are typically accompanied by pages of supportive information on feedback, complex feedback instruments, and several peer feedback tasks, because the added value of peer feedback is often not acknowledged (Prins, Sluijsmans, & Kirschner, 2006). In addition, peer feedback procedures are sometimes perceived as too artificial, too difficult, or as costing too much time or effort, and students may feel uneasy giving feedback to their peers. Feedback-providers often do not have the appropriate style of delivering peer feedback, and
feedback-receivers often do not have the skills to regulate an effective feedback dialogue (Prins et al., 2006). Therefore, in order to use peer feedback effectively and efficiently, (a) simple small-scale peer feedback tools should be developed, and (b) peer feedback skills need to be supported, acquired, and practiced (Prins et al., 2006; Sluijsmans, 2002). Sluijsmans and Van Merriënboer (2000; Sluijsmans, 2002) analyzed peer feedback skills in the domain of teacher education and identified three important subskills that need to be supported: (1) defining the assessment criteria, (2) assessing the product or contribution to group performance of a peer, and (3) delivering the peer feedback. Prins et al. (2005) added receiving peer feedback as a fourth subskill and emphasized the importance of a feedback dialogue for the development and emergence of effective peer feedback. In general, support or training is necessary for each subskill because it cannot a priori be assumed that students are experienced in the different peer assessment practices (Sluijsmans).

Therefore, in this research project, a CSCL environment (VCRI) was augmented with (1) an easy-to-complete and easy-to-interpret self and peer assessment tool (Radar), and (2) a shared collaborative reflection tool (Reflector). Firstly, Radar supported the assessment input by defining the assessment criteria and stimulating group members to rate themselves and their peers on six traits deemed important for group work. Radar also supported the assessment output by providing each group member with the anonymous peer ratings and average peer ratings of his/her fellow group members. Secondly, Reflector structured and regulated the feedback dialogue by sharing group members' individual reflections on their received peer ratings with all fellow group members, and by stimulating them to collaboratively reflect on the functioning of the group as a whole, to reach a shared conclusion on this matter, and to set shared goals for improving it.
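As an illustration of the kind of aggregation Radar performs on the assessment input and output, the sketch below shows how self and peer ratings on the six traits could be stored and averaged into the anonymous peer averages that each group member receives. It is a minimal sketch in Python for illustration only: the data structure, function name, rating scale, and example values are assumptions and do not represent the actual Radar/VCRI implementation described in chapter 3.

    # Minimal sketch of Radar-style aggregation (illustrative only; not the VCRI/Radar code).
    # ratings[rater][ratee] maps each trait to a numeric score; the scale is assumed here.
    TRAITS = ["influence", "friendliness", "cooperation", "reliability",
              "productivity", "quality of contribution"]

    def average_peer_ratings(ratings, ratee):
        """Average the ratings a group member received from peers (self rating excluded),
        so the result can be shared anonymously with the whole group."""
        peer_scores = [ratings[rater][ratee] for rater in ratings
                       if rater != ratee and ratee in ratings[rater]]
        return {trait: sum(score[trait] for score in peer_scores) / len(peer_scores)
                for trait in TRAITS}

    # Hypothetical example: Ann's ratings on an assumed 1-5 scale (values invented).
    ratings = {
        "Bert":  {"Ann": {trait: 4 for trait in TRAITS}},
        "Chris": {"Ann": {trait: 3 for trait in TRAITS}},
        "Ann":   {"Ann": {trait: 5 for trait in TRAITS}},  # self rating: shown, but not averaged
    }
    print(average_peer_ratings(ratings, "Ann"))  # {'influence': 3.5, ...}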
2.5.2 Self and peer assessment as a form of peer feedback
Working in a group can be very frustrating, especially when a fellow group member does not live up to the standards of his/her peers (e.g., when a group member forgets his/her appointments, does not fulfill his/her tasks on time, is less productive, produces low-quality work, or free-rides on the work of others). In order to make a group member more aware of his/her behavior and performance during collaboration, group members can critically assess themselves by reflecting on their own performance, and/or receive an assessment of their performance from others (e.g., their fellow group members). Boud and Falchikov (1989) define self assessment as students making judgments about their own learning, mainly about their achievements and learning outcomes. Peer assessment can be defined as an educational arrangement in which students judge a fellow student's performance qualitatively and/or quantitatively (Topping, 1998; Strijbos & Sluijsmans, 2010), which stimulates students to share responsibility, reflect, discuss, and collaborate (Birenbaum, 1996; Boud, 1990; Orsmond, Merry, & Callaghan, 2004; Sambell & McDowell, 1998). Somervell (1993) stresses that providing peer assessment can be seen as a part of the self assessment process, informing self assessment. By peer assessing, group members are stimulated to externalize and articulate their thoughts about their own social and cognitive performance, that of their peers, and that of the group as a whole (Fischer, Bruhn, Gräsel, & Mandl, 2002).

Sharing self and peer assessments with others can be seen as providing information to enhance group performance. Thus, self and peer assessments can be seen as forms of peer feedback (e.g., Kluger & DeNisi, 1996; Strijbos, Narciss, & Dunnebier, 2010). These assessments can take many different forms, such as grading a learning outcome or product (e.g., essay, paper), providing feedback on a process (e.g., group functioning, team development), or evaluating a
performance (e.g., a person's or the team's productivity). Peer assessment can be anonymous or identified, and can be given or received by individuals or groups. Self and peer assessments can (1) support students in forming judgments about what can be referred to as good group behavior and high-quality performance (Topping), (2) provide teachers with a more accurate perception of students' individual behavior and performance in collaborative group work (Cheng & Warren, 2000), and (3) foster reflection on the student's own learning process and learning activities (Dochy et al., 1999). Therefore, self and peer assessments are increasingly applied in formative assessments and to evaluate collaborative processes during group work (Dochy et al., 1999; Strijbos & Sluijsmans, 2010). Although the impact of peer assessment on learning processes and activities is often claimed to be strong, empirical evidence for its effects on learning is scarce (Strijbos & Sluijsmans, 2010; Topping, 1998; Van Gennip, Segers, & Tillema, 2009, 2010; Van Zundert, Sluijsmans, & Van Merriënboer, 2010). Several review studies (e.g., Topping; Van Gennip et al.) display a large variety and much ambiguity in peer assessment practices, caused by the many different forms and characteristics of peer assessment. This diversity makes it difficult to unravel what the real effects of peer assessment on learning are. Thus, there is a need for empirical (quasi-)experimental studies in which clearly described methods and conditions are related to outcome variables.

In this thesis, Radar provided group members with information or feedback that is relevant for the groups' social processes (e.g., team development, group satisfaction) and cognitive processes (e.g., task accomplishment). This shared information is intended to (1) assist group members in becoming better aware of their own behavior and that of the group as a whole, and (2) stimulate group members to improve their interpersonal behavior and enhance group performance. For example, McLeod and Liker (1992) found that group-level process feedback on interpersonal behavior influenced the dominance behavior of individual group members. Two other studies investigating individualized peer feedback on interpersonal group member behavior (e.g., communication and collaboration) found that it led to increased cooperation, communication, satisfaction, and motivation (Dominick, Reilly, & McGourty, 1997; Druskat & Wolff, 1999). Thus, in this thesis it was expected that enhancing interpersonal behavior would positively affect the group's social performance (Geister, Konradt, & Hertel, 2006; McLeod & Liker), and would also indirectly affect its cognitive performance (Kreijns et al., 2003).
2.5.3 Using Assessment to Enhance Performance: Drawbacks and Solutions
The use of self and peer assessment in small groups can provide group members with useful information about their own behavior and performance. However, there are six possible problems that may arise when self and peer assessments are used to enhance performance. First, although peers have been called the most accurate and informed judges of the behavior of other group members (Kane & Lawler, 1978; Lewin & Zwany, 1976; Murphy & Cleveland, 1991), and peers have many opportunities to observe their colleagues' social (non-task-related) and cognitive (task-related) behavior, they cannot observe all aspects that are relevant and important for successful collaboration. For example, peers can observe the performance (e.g., work results) and behavior (e.g., friendliness) of a specific group member during interaction with him/her, but they are unable to observe interactions taking place between other group members outside their presence, especially when groups use computer-mediated communication (CMC) systems (e.g., chat). As a result, group members lack much information about fellow group members' socio-emotional processes (e.g., frustrations, feelings of trust), which are important for a successful collaboration process (e.g., Järvelä, Järvenoja, & Veermans, 2008; Järvelä, Volet, & Järvenoja, 2010). To partially overcome this lack of information, in this research project, a CSCL environment was
augmented with an easy-to-complete and easy-to-interpret self and peer assessment tool (Radar). Using Radar, students rate themselves and their peers on four social aspects (i.e., influence, friendliness, cooperation, and reliability) and two cognitive aspects of collaboration (i.e., productivity and quality of contribution). Radar shares these self and peer ratings with each group member by visualizing the ratings anonymously in a radar diagram. Because group members' self ratings are shared in Radar, all group members receive information on their fellow group members' intentions (e.g., to be involved in the collaboration process, and to be friendly, cooperative, and trustworthy). Because group members' peer ratings are shared in Radar, all group members receive information on how their social and cognitive intentions are perceived and experienced by their peers. The strength of Radar lies in its ability to make implicit aspects of collaboration (e.g., frustrations among peers) explicit for all group members. Radar enhances students' awareness of behavior and performance by providing them with explicit information concerning their behavior (e.g., being too dominant) or their performance (e.g., contributing low-quality work).

The second problem of the use of self and peer assessment in general is that students tend to emphasize their strengths and positive performances, and attribute weaknesses and negative performances to others (e.g., Klein, 2001; Saavedra & Kwun, 1993). This tendency, also known as attribution (e.g., Eccles & Wigfield, 2002; Weiner, 1985), can result in unrealistically high self ratings and low peer ratings. To overcome this tendency, students need to become more aware of the (often unrealistic and inaccurate) standards they use to compare and rate the social and cognitive behavior of themselves and their peers. Therefore, in this research project, Radar shares self and peer ratings with all group members in order to make them more aware of their inaccurate self and peer perceptions, by showing discrepancies between their own perceptions (i.e., self and provided peer ratings) and those of others (i.e., received self and peer ratings of fellow group members).

The third problem is that peers may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983). During completion of self and peer ratings, students make many mental comparisons (Goethals, Messick, & Allison, 1991), which are selected, interpreted, and/or biased (Saavedra & Kwun, 1993). Interpersonal relationships among group members may cause peers to be too lenient or less discriminating when rating their friends. By working closely together, group members often develop friendly relationships with one another. Therefore, peers may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983), or rate everyone similarly in order not to cause friction within the group (Murphy & Cleveland, 1995). A solution for this problem is the use of anonymous peer ratings. According to Kagan, Kigli-Shemesh, and Tabak (2006), anonymous peer ratings are one of the most effective and objective ways to gather information on individual behavior and performance. Some researchers (e.g., Cestone, Levine, & Lane, 2008), however, suggest that anonymous peer ratings lead to harsher criticisms and evaluations, and as a result have a negative impact on the relationships between group members.
Others (e.g., Bamberger, Erev, Kimmel, & Oref-Chen, 2005) found no empirical evidence that anonymous peer assessment harmed relationships or impaired group task focus and functioning. Therefore, in this research project, Radar anonymously shares all provided peer ratings with each group member in order to (1) make group members more aware of the discrepancies between their self ratings and received peer ratings, and (2) stimulate peers to provide more accurate peer ratings.

The fourth problem is that information gathered and received through anonymous peer ratings is only reliable and useful for the enhancement of performance when all peer raters agree (i.e., show high levels of consensus) about what can be referred to as good group behavior and high-quality performance. To overcome large assimilation in peer ratings, group members need to become
aware of the different perceptions and standards they use to compare and rate the social and cognitive behavior of themselves and their peers, and need to develop shared norms and standards. This process of norm setting, also known as norming (Tuckman & Jensen, 1977), in which group members reach consensus about their behavior, goals, and strategies, is an important stage in group development towards becoming a well-performing group (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002). Therefore, in this research project, Radar anonymously shares all provided and received ratings of each group member in order to make group members more aware of the different (1) interpersonal perceptions of the social and cognitive behavior of themselves and their peers, and (2) standards they use to compare and rate each other's social and cognitive behavior. To reach shared norms and standards, Reflector stimulates individual reflection on discrepancies between self and received peer ratings, and stimulates group members to reflect collaboratively upon their group performance (see section 3.4). This reflection process allows group members to discuss discrepancies in individual norms, and to reach a shared standard about what can be referred to as good group behavior and high-quality group performance.

The fifth problem is that solely providing information about different interpersonal perceptions is probably not enough to alter group members' behavior or change their rating standards (if needed). To do so, group members need to reflect individually upon their self and peer ratings by asking themselves (1) what arguments they have for giving high or low ratings to themselves or their peers, (2) whether they understand the (different) interpersonal perceptions and ratings of the behavior of themselves and their peers, (3) whether they accept these (different) perceptions, and (4) whether these provide clues for changing their own perceptions and behavior (e.g., Prins, Sluijsmans, & Kirschner, 2006).

The sixth problem is that it cannot be assumed that group members will automatically reflect at a high cognitive level on their provided and received peer assessments (e.g., Kollar & Fischer, 2010). Therefore, in this research project, Reflector explicitly stimulated, structured, and regulated the reflection process upon the information provided by Radar. How Reflector stimulated the reflection process is described in the next section.
2.6 Reflection to Enhance Performance and Consensus
Reflection plays a very important role in individual learning processes (Chen, Wei, Wu, & Uden, 2009), as well as in collaborative learning processes (Yukawa, 2006). Reflection can be defined as the intellectual and affective activities individuals engage in to explore their experiences in order to reach new understandings and appreciations of those experiences (Boud, Keogh, & Walker, 1985). Reflection can lead to new perspectives on experience, changes in behavior, readiness for application, and commitment to action (Boud et al., 1985). With respect to feedback and assessment as discussed in the previous sections, students need to be challenged to reflect on their own performance and to determine whether the feedback (from peers in their group) provides clues that can be used for behavioral change (Prins, Sluijsmans, & Kirschner, 2006). According to Hattie and Timperley (2007), feedback effectiveness is increased when students answer three reflective questions: (1) Where am I going? / What are the goals? (feed up), (2) How am I going? / What progress is being made toward the goal? (feed back), and (3) Where to next? / What activities need to be undertaken to make better progress? (feed forward). Reflection on peer feedback thus makes group members more aware of their own behavior, how it affects others, and whether they should alter it. This awareness allows for a better understanding of the activities of others, providing context for one's own activity (Dourish & Bellotti, 1992). In this thesis, a reflection tool (Reflector) was designed to assist group members in becoming better aware of their individual and group behavior, and to stimulate them to set goals
and formulate plans to enhance social and cognitive group performance. Group members using Reflector individually reflect and provide information on: their own perspective on their personal performance (feed up); differences between their self perception and the perception of their peers concerning their personal performance (feed back); whether they agree with those perceptions (feed back); and their individual perspective on group performance (feed up). However, because group performance is determined by the individual effort of all group members, in this thesis students do not only reflect individually but also collaboratively on their experiences. This process of collaborative reflection (i.e., co-reflection) is defined as 'a collaborative critical thinking process involving cognitive and affective interactions between two or more individuals who explore their experiences in order to reach new intersubjective understandings and appreciations' (Yukawa, 2006, p. 206). Using Reflector, students were stimulated to collaboratively reflect (i.e., co-reflect) on the group's performance and reach a shared conclusion on this (feed back). Based on their shared conclusion, group members set goals to improve their group performance (feed forward).

In this thesis, it was assumed that reflection could lead to awareness of unrealistically positive self and peer perceptions, and could support students in forming shared norms about what can be referred to as good group behavior and high-quality performance. Thus, it was expected that groups who reflect upon their individual and group performance would exhibit more moderate peer ratings and reach higher levels of consensus in their peer ratings. For reflection to have an effect on consensus among peer ratings and on group performance, in this research project group members reflected individually (self reflection) and collaboratively (co-reflection) upon the performance of themselves, their peers, and the group as a whole. Firstly, Reflector shared and visualized the self reflections with all fellow group members. Secondly, Reflector structured and regulated a feedback dialogue by stimulating group members to collaboratively reflect on the functioning of the group as a whole and to reach a shared conclusion on this matter. This feedback dialogue, in which group members discuss how well their group is functioning and how group processes may be improved, is also known as group processing (Webb & Palincsar, 1996). By externalizing and articulating their thoughts about the social and cognitive performance of themselves, their peers, and the group, group members can reach new intersubjective understandings and appreciations on this matter. This, in turn, can enhance group members' awareness of their misconceptions, their behavior, and how their behavior affects their peers and the group as a whole (e.g., Fischer, Bruhn, Gräsel, & Mandl, 2002; Leinonen & Järvelä, 2006), and can support students in reaching a shared norm (i.e., consensus) about what can be referred to as good group behavior and high-quality performance. Furthermore, when group members show a high level of consensus in their peer ratings, it can be assumed that they not only looked more carefully at each other's behavior, but also at their group performance. Thus, they will most likely also agree on their perceived social and cognitive group performance, as would be evidenced by high correlations between the perceived social behavior and the perceived social performance.
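To summarize how the individual and collaborative reflection steps described above map onto the feed up / feed back / feed forward distinction, the outline below lists the prompt sequence schematically. It is a paraphrased sketch for illustration only: the wording, grouping, and names are assumptions and do not reproduce the actual Reflector interface described in chapter 3.

    # Schematic outline of the reflection steps (paraphrased; not the actual Reflector tool).
    INDIVIDUAL_REFLECTION = [
        ("feed up",   "How do you judge your own performance in this group?"),
        ("feed back", "Where do your self ratings and the received peer ratings differ, "
                      "and do you agree with your peers' perception?"),
        ("feed up",   "How do you judge the performance of the group as a whole?"),
    ]
    CO_REFLECTION = [
        ("feed back",    "Discuss the shared reflections and reach a shared conclusion on "
                         "how well the group is functioning."),
        ("feed forward", "Set shared goals and formulate a plan to improve the group's "
                         "social and cognitive performance."),
    ]

    for phase, prompt in INDIVIDUAL_REFLECTION + CO_REFLECTION:
        print(f"[{phase}] {prompt}")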
2.7 Social Relation Models to Measure Consensus in Peer Ratings Research has shown that when students evaluate the performance of their peers, their ratings are likely to be interdependent (Kenny, 1994). As discussed in section 2.5.3, due to friendly relationships, group members may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983), or rate everyone similarly in order not to cause friction within the group (Murphy & Cleveland, 1995). According to Magin (2001), social relationships between persons who assess each other’s performance can be seen as a potential
source of bias of which raters are often not aware. This reciprocal bias affects both the reliability and the validity of peer assessments (Montgomery, 1986). For example, in a group of four students (i.e., Ann, Bert, Chris, and Dennis), Ann’s rating of Bert is related to Bert’s rating of Ann. When Ann perceives Bert as friendly, she will likely behave in a friendly way towards Bert, and will therefore also be seen as friendly by Bert, and vice versa. Thus, relationship factors and interdependencies among peers likely contain interesting and useful information to better understand the complexities of peer ratings. When these interdependencies are ignored, meaningful information about the relationships among peers is lost, and the results of statistical analyses (e.g., ANOVA) may be distorted (Bonito & Kenny, 2010; Kenny, 1994; Kenny & Judd, 1986). To this end, the Social Relations Model (SRM) was used to examine to what extent relationship factors among peers influence their ratings. SRM does not require an assumption of independence, and provides both a theoretical basis and a statistical tool (e.g., SOREMO) to answer questions regarding the impact of rater and ratee characteristics on peer ratings (Kenny, 1994). SRM can be used to estimate sources of variance in ratings. It focuses on breaking down sources of variance (i.e., actor, partner, and dyadic variances) and on correlating them with each other and with self ratings in order to fully understand and explain these ratings. For example, with SRM it is possible to partition sources of variance in peer ratings into actor variance (i.e., caused by the tendency of raters to rate all peers similarly - high or low - on a particular trait), partner variance (i.e., caused by the tendency of ratees to elicit similar ratings from all peer raters), and dyadic variance (i.e., caused by the unique relationship between two group members after removing their individual-level tendencies). According to Kenny (1994), for ratings of personality traits, the variance partitioning shows that about 20% of the variance is due to actor effects, 15% to partner effects, 20% to dyadic relationships, and 45% to error. To illustrate the use of SRM in this study, consider the following example: Bert perceives Ann as a cooperative group member and, therefore, gives her a high rating on this trait. Several possible explanations can be given for Bert’s rating of Ann. First, it could be the result of Bert’s tendency to rate or perceive all his peers as cooperative, as would be evidenced by a significant actor (i.e., perceiver) effect. When some raters tend to rate all others as high on a trait, whereas other raters see all others as low on a trait, there is assimilation in the ratings of individual raters (i.e., considerably high actor variance in the group; Kenny, 1994). Second, Bert’s rating of Ann may be attributed to Ann’s tendency to evoke similar ratings from all her peers, as would be evidenced by a significant partner (i.e., target) effect. When all group members agree that some peers are high on a trait, whereas other peers are low on the trait, there is consensus among raters (i.e., considerably high partner variance; Kenny, 1994). Note that a high degree of consensus in peer ratings does not necessarily mean that these peer ratings are valid. For example, when all peer raters agree that a group member is unfriendly, this could also be the result of peers picking on one specific group member, racism, or bullying behavior.
Third, Bert’s rating of Ann may be due to factors that are uniquely dyadic, that is, effects due to the unique relationship between two group members (i.e., between Bert and Ann). The relationship effect represents the degree to which Bert’s rating of Ann cannot be explained by actor or partner effects. If Bert does not see most group members as cooperative and most group members do not consider Ann to be cooperative, yet Bert repeatedly sees Ann as cooperative, then there would be a large relationship effect. In sum, for peer ratings to be reliable (which is a precondition for validity), partner variance needs to be large and actor variance needs to be small. Therefore, in this research project, SRM will be used to examine whether peer ratings of students with Reflector will show more consensus
(i.e., higher partner variances), and less assimilation (i.e., lower actor variances), compared to students without this tool.
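To make the variance partitioning concrete, the two-way SRM decomposition can be written schematically as follows; this is a minimal sketch of the standard round-robin model (Kenny, 1994), not the exact specification used in the SOREMO analyses reported later in this thesis.

    X_{ij} = \mu + a_i + b_j + g_{ij} + e_{ij}

Here X_{ij} is the rating that rater i gives to ratee j, \mu the group mean, a_i the actor effect of rater i, b_j the partner effect of ratee j, g_{ij} the dyadic (relationship) effect, and e_{ij} error. The corresponding variance decomposition is

    \sigma_X^2 = \sigma_a^2 + \sigma_b^2 + \sigma_g^2 + \sigma_e^2

in which consensus corresponds to a relatively large partner variance \sigma_b^2 and assimilation to a relatively large actor variance \sigma_a^2.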
2.8 Group Awareness tools to Enhance Collaboration A self and peer assessment tool such as Radar, combined with a co-reflection tool such as Reflector, provides students with information which they can use to monitor their collaboration process, and to evaluate and reflect on the manner in which they collaborate. This information allows students to determine whether selected strategies are working as expected, and whether the group’s social performance (e.g., team development, group satisfaction) and cognitive performance (e.g., equal participation of all peers, quality of the products) are up to standard. Thus, group awareness tools such as Radar and Reflector can facilitate group awareness, which can be defined as knowledge and perception about the social and collaborative environment in which a person participates (Buder, 2007). Several researchers have suggested that awareness plays an important role in facilitating CSCL (Dourish & Bellotti, 1992; Gutwin & Greenberg, 2004; Kirschner, Strijbos, Kreijns, & Beers, 2004). During collaboration, students have to be aware of the activities of their group members, because this allows them to decide which activities they themselves need to engage in. For example, Janssen, Erkens, Kanselaar, and Jaspers (2007) used a participation tool that visualized participation in order to raise students’ awareness of how much each group member contributed to his or her group’s online communication, and found that students using this tool participated more and engaged more in coordination and regulation of social activities during collaboration. One of the reasons why group awareness tools positively affect collaboration is that they help students become aware of their behavior. For example, a study by Zumbach, Hillers, and Reimann (2004) showed that their group awareness tool, which visualized parameters of interaction (e.g., participation behavior, group members’ motivation (self ratings), and number of contributions), positively affected students’ learning process, group performance, and motivation. Therefore, in this study, it is expected that group performance will be positively affected by visualizing group members’ behavior in a Radar diagram. This awareness is further enhanced by Reflector, which stimulates students to reflect upon the discrepancies between their self perceptions (i.e., self ratings) and their actual behavior (i.e., received peer ratings), which can provide students with cues for behavioral adaptation. Another reason for the effect of group awareness tools is that they facilitate awareness of group norms. For example, Radar creates the opportunity for social comparison, which means that students can compare themselves to other group members. Radar provides students with self and (average) peer ratings of their group members’ social and cognitive behavior, which are visible and available to all group members. By comparing themselves to other group members, students may become motivated to set higher standards for themselves and to increase their effort. For example, in studies by Janssen et al. (2007) and Michinov and Primois (2005), social comparison processes were stimulated by providing students with measures of the participation of other group members, which were visible and available to all group members. Results showed that this awareness positively affected group members’ participation.
Furthermore, visualization of group norms (i.e., self and peer ratings in Radar) and explicit reflection upon these norms and standards (i.e., reflection on discrepancies between self and peer ratings in Reflector) allow group members to discuss discrepancies in group members’ individual norms, and to reach shared norms and standards by which to measure their behavior and performance (a process also known as norming, see Tuckman & Jensen, 1977). Therefore, in this thesis, it was expected that this shared standard would lead to higher consensus among peer raters, that is, the degree to which all group members rate some peers as very friendly and rate other peers as very
unfriendly. It was also expected that this shared standard would lead to a more valid perception of their social performance, for example, higher correlations between perceived behavior (i.e., cooperative behavior) during collaboration and perceived performance at the end (i.e., team development). Finally, group awareness tools can also raise students’ awareness of group functioning. Group awareness tools such as Radar and Reflector provide students with feedback about the social and cognitive aspects of their collaboration. For example, Radar shares and visualizes students’ self and peer ratings on social and cognitive behavior with all group members, and Reflector shares and displays students’ reflections on their own performance and that of the group as a whole with all group members. This feedback provides students with information that they can use to monitor, coordinate, and regulate the group’s social and cognitive performance. This, in turn, can enhance students’ awareness of group functioning. This awareness of group functioning is further enhanced by Reflector, which supports group members in collaboratively reflecting (co-reflecting) on the functioning of the group and reaching a shared conclusion on this matter. Based on their shared conclusion, Reflector then stimulates group members to set shared goals for improving the group’s social and cognitive performance, and allows students to determine whether it is necessary to change their strategy, goals, or plans for future group functioning.
3. Tools and Measures
3.1 Introduction This chapter describes the tools and measures used in this thesis: the CSCL environment (VCRI) in which the self and peer assessment tool (Radar) and the co-reflection tool (Reflector) were embedded, the scales used to measure behavior, the coding scheme for the output of the collaborative reflection (co-reflection), and the scales used to measure the groups’ performance. Table 3.1 provides an overview of the measures of behavior and performance.

Table 3.1 Overview of Scales, Subscales and Instruments
Scale                   Subscales                                                     Instrument
Social behavior         Influence, Friendliness, Cooperation, Reliability             Radar
Cognitive behavior      Productivity, Quality of Contribution                         Radar
Social performance      Team Development, Group-process Satisfaction, Intra-group     Questionnaire
                        Conflicts, Attitude towards Collaborative Problem Solving
Cognitive performance   -                                                             Paper grade
Behavior is defined as the perceived social (i.e., non-task related) and cognitive (i.e., task-related) aspects or activities that are important for successful collaboration. These social aspects, further referred to as social behavior, are measured by self and peer ratings in Radar on four variables: influence, friendliness, cooperativeness, and reliability. The cognitive aspects, further referred to as cognitive behavior, are measured by self and peer ratings in Radar on two variables: productivity and quality of contribution. Performance is defined as the social and cognitive achievement or output at the end of the collaboration process. The social achievement, further referred to as social performance, is measured by four subscales (i.e., team development, group satisfaction, levels of group conflict, and attitude toward problem-based collaboration). The cognitive output, further referred to as cognitive performance, is measured by the grade that was given to the group’s product (i.e., essay, paper).
3.2 Virtual Collaborative Research Institute (VCRI) The VCRI (Jaspers, Broeken, & Erkens, 2004) is a groupware program designed to support collaborative learning on research projects and inquiry tasks. When using the VCRI, each student works at his or her own computer. The VCRI can be augmented with more than 10 different tools, of which 6 were used for this research project (see Figure 3.1). The Chat tool (top left) is used for synchronous communication between group members. The chat history is automatically stored and can be re-read by participants at any time. Users can search for relevant historical information using the Sources tool (top centre). The Co-Writer (top right) is a shared word-processor, which can be used to write a group text. Using the Co-Writer, students can simultaneously work on different parts of their texts. Notes (bottom left) is a note pad which allows the user to make notes and to copy and paste selected information. Radar for peer feedback (bottom centre) and Reflector for reflection (bottom right) will be described in the following sections.
Figure 3.1 Screenshot of VCRI with the six tools used in this experiment.
Windows of the available tools are automatically arranged on the screen when students log on to the VCRI.
3.3 Self and Peer Assessment Tool (Radar)
Figure 3.2 Radar – Input screen
Radar was designed and developed to be an easy-to-complete and easy-to-interpret self and peer assessment tool. The aim of the tool was to enhance awareness of group members’ social and cognitive behavior and, in turn, to enhance social and cognitive group performance. Therefore, Radar elicits information on group members’ social and cognitive behavior and visualizes this information in a radar diagram (see Figure 3.2). Radar provides users with anonymous information on how their cognitive and social behaviors are perceived by themselves, their peers, and the group as a whole with respect to specific traits found to tacitly affect how one ‘rates’ others (den Brok, Brekelmans, & Wubbels, 2006).
3.3.1 Social and cognitive behavior scales in Radar In the first study, as described in chapter 4, Radar provided information on five traits deemed important for assessing behavior in groups. Four traits were related to social behavior, namely influence, friendliness, cooperation, and reliability. The fifth trait was related to cognitive behavior: productivity. These traits are derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels, 2006; Kenny, 1994; Salas, Sims, & Burke, 2005). However, in the follow-up study (see chapter 5) the tool was complemented with a sixth trait that also represents cognitive or task-related behavior: quality of contribution. The following sections will describe the six traits in Radar and the reasons for their choice. Influence is directly derived from Wubbels, Créton, and Hooymayers’ (1985) influence dimension, which they labeled dominance and submissiveness in their model for interpersonal teacher behavior. This dimension is also used by Bales (1988) and represents the prominence, status, power, and personal influence that the individual is seen to have in relation to other group members. The variable is labeled ‘influence’, and not ‘dominance’ or ‘submissiveness’, because the latter two can be perceived as negative traits. The label ‘influence’ is neutral and can be perceived as either a positive or a negative trait. Influence was chosen because it is closely related to leadership. Group members who receive a high rating on this variable can be perceived as good democratic leaders (Salas, Sims, & Burke, 2005). According to Bales (1988), dominant group members, high participators, extroverts, and group members who show a tendency to impose their views on the group will also be rated high on influence. However, group members who are very self-centred, egoistic, and constantly disagree with every decision or action of their peers will probably receive a high rating on ‘influence’ as well, but will most likely receive a low rating on friendliness. Friendliness is one of the eight behavior categories from Wubbels et al.’s (1985) model for interpersonal teacher behavior. Bales (1988) used a similar dimension (i.e., friendliness vs. unfriendliness). Bales and Cohen (1979) defined this as the extent to which individual members are friendly with and respectful to each other. This variable was selected because friendly behavior is closely related to helpful, cooperative, and backup behavior. Johnston and Briggs (1968) found that teams whose members were able to help and compensate for each other during periods of high stress made fewer errors. Group members who are friendly, helpful, and cooperative towards other group members can also be seen as being flexible. Research has shown that providing flexibility in how work is completed increases team effectiveness (Campion, Medsker, & Higgs, 1993). A low rating on this variable is probably associated with self-interested and self-protective behaviors and values. Behaviors and values such as being egalitarian, cooperative, helpful, supportive, or protective of others, and providing backup, will probably be perceived as friendly and receive a high rating. However, a friendly group member does not necessarily have to be seen as cooperative, and vice versa. Cooperation, which denotes the degree to which someone is willing to work with others, is derived directly from Wubbels et al.’s (1985) dimension proximity (i.e., opposition vs.
cooperation), which they defined as the property of being close together, or in group settings as the feeling of being a group (i.e., group cohesiveness). Strijbos, Kirschner, and Martens (2004) defined cooperation as the extent to which individual members collaborate with each other, which is central to group performance or group efficiency. This variable is also used by Bales (1988), who labelled it ‘forward’ behavior, and is associated with behaviors and values such as cooperation, helpfulness, backup behavior, adaptability, and supportiveness. Cooperative group members are willing to accept group norms, values, and definitions of the task, and want to get on with carrying out the group
task. This variable was selected because non-cooperative behavior (i.e., the free-rider effect and the sucker effect) will have negative effects on group performance and group effectiveness. The free-rider effect, or hitchhiking effect, occurs when group members think that their individual effort is unnecessary because the task can be performed by the other group members. This often occurs when the individual group members receive a grade based on the performance of the whole group (Kerr, 1983; Kerr & Bruun, 1983). The sucker effect occurs when productive group members believe that they invest more time and effort in the group product or performance than their fellow group members. The productive group members will often reduce their individual efforts, because they refuse to support the non-contributing members (Kerr, 1983; Kerr & Bruun, 1983). Notice the overlap between the variables ‘friendliness’ and ‘cooperation’ (e.g., cooperative, helpful, supportive, backup behavior). Group members who are very cooperative are most likely to be perceived as friendly as well. Reliability is a trait reflecting ‘trust’, which has been identified as an important precursor for successful collaboration, both in face-to-face teams (Castleton_Partners/TCO, 2007) and in CSCL (Jarvenpaa & Leidner, 1999). According to Emans, Koopman, Rutte, and Steensma (1996), trust is the cognitive and affective assurance of group members that they respect each other’s interests and, therefore, can orient themselves towards each other’s words, actions, and decisions with an easy conscience. A number of experimental studies have shown that the use of computer-mediated communication (CMC) systems leads to decreased trust and less cooperative behavior in CMC groups (Bos, Olson, Gergle, Olson, & Wright, 2002; Jensen, Farnham, Drucker, & Kollock, 2000). Productivity and Quality of contribution are the extent to which individual group members contribute quantitatively and qualitatively to tasks or duties central to group performance or group efficiency. These traits, representing cognitive or task-related behavior, were selected because research has shown that group members (1) do not always participate equally (Karau & Williams, 1993), and (2) monitor the performance (i.e., quantity and quality) of other group members (Salas et al., 2005).
3.3.2 Procedure for completing the Radar In Radar, group members are both assessors and assessees. As assessor, a group member selects the to-be-assessed peers in the group (i.e., the assessees), whose profiles then appear as dotted lines in the center circle of the radar diagram. Each group member is represented by a specific color. Assessors rate themselves and all other group members on each of the six subscales (i.e., traits), which are divided into 41 assessment points ranging from 0 to 4, in steps of 0.1 (see Figure 3.3). For example, a student can rate his/her peer 3.2 for friendliness. To simplify data analysis, ratings were transformed into integers on a 100-point scale by multiplying the ratings (0-4) by 25; a rating of 3.2 was thus saved in the database as 80 points (3.2 times 25) on the 100-point scale.
Figure 3.3 Detailed image of a subscale in Radar (simplified)
Care was taken to ensure that all assessors used the same definition of the six traits. Prior to the experiment, the researcher informed participants that text balloons with content
information and definitions would appear when they moved the cursor across one of the traits in the tool. For example, when the cursor is moved across ‘influence’, a balloon pops up with the text ‘A high score on influence means that this person has an influence on what happens in the group, on the behavior of other group members, and on the form and content of the group product’ (see Table 3.2).

Table 3.2 Social and Cognitive Behavior as Measured by Radar (balloon text/description per subscale)

Social behavior
  Influence: A high score on influence means that this person has a big influence on: other group members; what happens in the group; the structure and content of the groups’ product.
  Friendliness: A high score on friendliness means that this person: is friendly and helpful; provides a positive contribution to the group atmosphere; responds in a friendly and helpful manner to questions, suggestions, and ideas of others.
  Cooperation: A high score on cooperation means that this person: collaborates well in the group; is willing to take over tasks of others; takes initiative; communicates well; tries to think along and cooperate in finding solutions for problems that occur.
  Reliability: A high score on reliability means that this person: is reliable; keeps his/her word; does what he/she is supposed to do; finishes his/her task at the appointed time.

Cognitive behavior
  Productivity: A high score on productivity means that this person: is productive; works hard; has a high contribution to problem solving; has a high contribution to the groups’ product.
  Quality of contribution: A high score on ‘quality of contribution’ means that: his/her work is perceived as useful and good; he/she produces work of high quality; he/she makes a positive contribution to the content and structure of the groups’ product.
Ratings are automatically saved in a database. Assessment is anonymous; group members can see the received peer assessments of themselves and their fellow group members, but not who entered the assessment. Students can only access individual and average assessments of their peers after they have completed the assessment themselves. When all group members have completed their self and peer assessments, two modified radar diagrams become available. The first - Information about yourself - shows the output of the self assessment (e.g., Chris about Chris) along with the received average peer assessments of her/his own behavior (e.g., Group about Chris). The self assessment is not taken into account for computing the average peer assessment scores. To provide more information about the variance in the average score of their peer assessment, students can also choose to view the individual peer assessments about their own behavior (e.g., Group members about Chris). The second - Information about the group (see Figure 3.4) - represents the average scores of the group members, so that group members can get a general impression about the functioning of the group.
Figure 3.4 Radar – Information about the group
All group members are represented as a solid line in the diagram, each with a different color. Participants can expand or simplify the Radar diagram by including or excluding group members from the view by clicking a name in the legend. It is assumed that the peer feedback from Radar makes group members aware of the differences between their intended behavior (self assessment) and how this behavior is perceived by their peers (peer assessment). It is also assumed that group members will be stimulated to improve their social and cognitive behavior, knowing that (1) every group member is assessed by peers, and (2) these scores are shared anonymously in the group. Therefore, it is expected that group members using Radar throughout the task will exhibit higher self and peer assessment scores over time on all six traits.
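As an illustration of the storage and aggregation procedure described in this section, the sketch below shows, in Python, how a 0-4 rating could be stored as an integer on the 100-point scale and how the average peer assessment for one trait could be computed while excluding the self assessment. The function and variable names are hypothetical and are not taken from the VCRI/Radar source code; the sketch only mirrors the procedure as described here.

    def to_database_score(rating):
        # Convert a Radar rating on the 0-4 scale (steps of 0.1) to an integer on the 100-point scale.
        return round(rating * 25)  # e.g., 3.2 -> 80

    def average_peer_score(received_ratings, assessee):
        # Average the received ratings for one trait, excluding the assessee's own self rating.
        # `received_ratings` maps rater name -> stored 0-100 score given to `assessee` (hypothetical structure).
        peer_scores = [score for rater, score in received_ratings.items() if rater != assessee]
        return sum(peer_scores) / len(peer_scores)

    # Example with the fictitious group from chapter 2 (Ann, Bert, Chris, Dennis):
    # Chris rates himself 3.6; his peers rate him 3.2, 2.8 and 3.0 on friendliness.
    received = {"Chris": to_database_score(3.6), "Ann": to_database_score(3.2),
                "Bert": to_database_score(2.8), "Dennis": to_database_score(3.0)}
    print(average_peer_score(received, "Chris"))  # 75.0, i.e., 3.0 on the original 0-4 scale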
3.4 Reflection Tool (Reflector) For this research project, a reflection tool (Reflector, see Figure 3.5) was designed to assist group members in becoming better aware of their individual and group behavior, and to stimulate them to set goals and formulate plans to enhance social and cognitive group performance.
Figure 3.5 Reflector
In the first study (see chapter 4), no significant main effects were found for Reflector on group performance. This was ascribed to the fact that the tool did not focus on future functioning and goal setting (Hattie & Timperley, 2007; Mento, Steel, & Karren, 1987; Neubert, 1998; Tubbs, 1986). For feedback to be effective, the receiver needs to answer three major questions: (1) Where am I going? / What are the goals? (feed up), (2) How am I going? / What progress is being made toward the goal? (feed back), and (3) Where to next? / What activities need to be undertaken to make better progress? (feed forward) (Hattie & Timperley, 2007).
Therefore, in the follow-up study (see chapter 5), Reflector was redesigned to make group members better aware of their individual and group behavior, and to stimulate them to set goals and formulate plans to enhance social and cognitive group performance. Group members using Reflector individually reflect and provide information on (1) their own perspective on their personal performance (feed up), (2) differences between their self perception and the perception of their peers concerning their personal performance (feed back), (3) whether they agree with those perceptions (feed back), and (4) their individual perspective on group performance (feed up). Because group performance is determined by the individual effort of all group members, Reflector also (5) stimulates group members to collaboratively reflect (i.e., co-reflect) on group performance and reach a shared conclusion on this (feed back). Based on their shared conclusion, group members (6) set goals to improve group performance (feed forward). The Reflector was implemented to stimulate group members to co-reflect on their individual behavior and overall group performance. The tool contained six reflective questions:
1. What is your opinion of how you functioned in the group? Give arguments to support this.
2. What differences do you see between the assessment that you received from your peers and your self assessment?
3. Why do or don’t you agree with your peers concerning your assessment?
4. What is your opinion of how the group is functioning? Give arguments to support this.
5. What does the group think about its functioning in general? Discuss and formulate a conclusion shared by all the group members.
6. Set specific goals (i.e., who, what, when) to improve group performance.
The first four questions are completed in Reflector, with completion indicated by clicking an ‘Add’ button. This allows students to share their answers with the rest of the group and to see the answers of the others. Students can only gain access to their peers’ answers after they have added their own, so as not to influence one another. The last two questions are completed in Co-Writer, in a specific frame (Co-Reflection), which allows the group to write a shared conclusion and formulate shared goals. Responses made by the students in Reflector are not scored or evaluated.
3.5 Coding Scheme Output Co-reflection To improve the social and cognitive performance of the group, group members reflected collaboratively (i.e., co-reflected) to set goals and formulate plans to improve their social and cognitive activities. Categories for the coding scheme were derived from studies on social interaction and coordination processes in CSCL, and categories were added until no ‘rest’ categories remained. Finally, two independent researchers coded and categorized the goals and plans into nine categories (see Table 3.3). Inter-rater reliability was substantial (Cohen’s Kappa = .79). The first three categories are communication, focusing on task, and task coordination, activities that are crucial for successful collaboration (Barron, 2003; Erkens, Jaspers, Prangsma, & Kanselaar, 2005; Slof, Erkens, Kirschner, Jaspers, & Janssen, 2010). Furthermore, students need to carry out meta-cognitive activities such as planning and monitoring (the fourth and fifth categories) to employ a proper problem-solving strategy and reflect on its suitability (Lazonder & Rouet, 2008; Narciss, Proske, & Koerndle, 2007). Students also must develop positive affective relationships with each other (Kreijns, Kirschner, & Jochems, 2003), thus friendliness is a sixth category.
Table 3.3 Coding Scheme for Output Co-reflection: Specific Goals to Improve Group Performance (label, code, description, and example per category)
Communication (Com): Improve communication or discuss teamwork. Example: "We have to improve our communication and discuss our teamwork more often."
Focusing on task (Focus): Improve concentration or focus on task. Example: "We’ll focus more on our work."
Task coordination (Task): Improve coordination, task or role planning. Example: "We’ll divide the tasks more effectively. Let’s make clear who does what."
Planning (Plan): Improve time planning. Example: "We’ll set deadlines and improve our time planning."
Monitoring (Mon): Improve peer monitoring. Example: "We’ll monitor each other’s progress."
Friendliness (Friend): Improve friendliness towards each other. Example: "We shouldn’t be so unfriendly towards each other."
Productivity (Prod): Improve productivity. Example: "We’ll increase our productivity and participate more equally."
Quality (Qual): Improve quality of work. Example: "We’ll improve the quality of our work."
No suggestions (None): No suggestions for improvement. Example: "We have no suggestions for improvement."
Productivity and quality are the seventh and eighth categories, because in effective groups group members mutually depend on the willingness, effort, and participation of their peers (Janssen, Erkens, Kanselaar, & Jaspers, 2007; Karau & Williams, 1993; Williams, Harkins, & Latané, 1981). The category ‘no suggestions’ was added for students who did not have any suggestions to improve their performance.
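The inter-rater agreement reported in this section (Cohen’s Kappa = .79) could, for instance, be computed from the two coders’ category labels as sketched below; scikit-learn is assumed to be available, and the label lists are invented for illustration only and do not reproduce the actual coded co-reflection data.

    from sklearn.metrics import cohen_kappa_score

    # Hypothetical category codes assigned by two independent coders to the same co-reflection fragments.
    coder_1 = ["Com", "Focus", "Task", "Plan", "Mon", "Friend", "Prod", "Qual", "None", "Task"]
    coder_2 = ["Com", "Focus", "Task", "Plan", "Mon", "Friend", "Prod", "Qual", "None", "Plan"]

    kappa = cohen_kappa_score(coder_1, coder_2)
    print(f"Cohen's kappa = {kappa:.2f}")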
3.6 Social Performance Scales Forming a group of skilled individuals does not automatically make it a team and does not guarantee success (Salas, Sims, & Burke, 2005). To form a group of skilled individuals into a team, group members not only need to put effort into carrying out the task or solving the problem, but also into teamwork. It is teamwork that ensures the success of groups, and there is no reason to believe that this would be different for groups whose focus is on learning (Kay, Maisonneuve, Yacef, & Reimann, 2006). To achieve a well-functioning or effective group, group members also need to put effort into the social (i.e., non-task related) processes, such as strong group cohesiveness, developing positive affective relationships, feelings of trust and belonging, and a sense of community (e.g., Boud, Cohen, & Sampson, 1999; Johnson, Johnson, & Smith, 2007; Kreijns & Kirschner, 2004). These social processes allow group members to get to know and understand each other so as to become a ‘healthy’ community of learning (Gunawardena, 1995; Rourke, 2000; Wegerif, 1998). For example, Rovai (2001) found that ‘feelings of community’ can increase the flow of information, support, commitment to group goals, and satisfaction with group efforts. Furthermore, Guzzo and Dickson (1996) found that group cohesion enhances task performance and effectiveness. To this end, the output of the group’s social processes (i.e., perceived social performance) was measured with a questionnaire at the end of the collaboration process. Four previously tested and validated scales were used to measure team development (α = .92, 10 items), group-process satisfaction (α = .76, 6 items, both from Savicki, Kelley, & Lingenfelter, 1996), intra-group conflicts (α = .92, 7 items, from Saavedra, Early, & Van Dyne, 1993), and attitude towards collaborative problem solving (α = .81, 7 items, from Clarebout, Elen,
& Lowyck, 1999). These scales were translated into Dutch and transformed into 5-point Likert scales (1 = totally disagree, 5 = totally agree; see Table 3.4) by Strijbos, Martens, Jochems, and Broers (2007). The Team Development scale provides information on the perceived level of group cohesion. The Group-process Satisfaction scale provides information on the perceived satisfaction with general group functioning. The Intra-group Conflicts scale provides information on the perceived level of conflict between group members. The Attitude towards Collaborative Problem Solving scale provides information on the perceived level of group effectiveness and how group members felt about working and solving problems in a group. The 30 items in the four scales were subjected to principal component analysis. Prior to performing this analysis, the suitability of the data for factor analysis was assessed. Inspection of the correlation matrix showed that all coefficients were .5 or higher. The Kaiser-Meyer-Olkin value was .73, exceeding the recommended value of .6, and Bartlett’s Test of Sphericity reached statistical significance, supporting the factorability of the correlation matrix. The analysis revealed the presence of one main component with an eigenvalue exceeding 1, explaining 76.6% of the variance. Cronbach’s alpha of the composed ‘Social Performance (total)’ scale was .90.

Table 3.4 Social Performance Scales (number of items k, example item, and Cronbach’s α)
Team Development (k = 10, α = .77). Example item: ‘Group members contribute ideas and solutions to problems.’
Group-process Satisfaction (k = 6, α = .71). Example item: ‘I felt that my group worked very hard together to solve this problem.’
Intra-group Conflicts (k = 7, α = .84). Example item: ‘I found myself unhappy and in conflict with members of my group.’
Attitude towards Collaborative Problem Solving (k = 7, α = .74). Example item: ‘Collaborating in a group is challenging.’
Social performance (total) (k = 30, α = .90). Items: all items of the four scales above.
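As a sketch of the scale analyses reported above, the code below computes Cronbach’s alpha from an item-response matrix and runs a principal component analysis with scikit-learn; the `items` dataframe (respondents by 30 questionnaire items, with reversed items already recoded) is filled with random data for illustration and does not reproduce the reported values.

    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA

    def cronbach_alpha(items):
        # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the scale total).
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Hypothetical item-response matrix: 40 respondents x 30 items on a 5-point Likert scale.
    rng = np.random.default_rng(0)
    items = pd.DataFrame(rng.integers(1, 6, size=(40, 30)))

    print("Cronbach's alpha:", round(cronbach_alpha(items), 2))

    pca = PCA()
    pca.fit(items)
    print("Variance explained by the first component:", round(pca.explained_variance_ratio_[0], 3))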
Appendix 3.1 Scale: Team Development
Using the scale below, provide a rating that you think is most descriptive of your group. 1 = totally disagree, 10 = totally agree.
Team Development
1. Group members understand group goals and are committed to them. (Commitment)
   Alle groepsleden begrijpen het gezamenlijke doel en zijn hieraan toegewijd.
2. Group members are friendly, concerned, and interested in each other. (Acceptance)
   Alle groepsleden gedragen zich vriendelijk, betrokken en geïnteresseerd t.o.v. elkaar.
3. Group members acknowledge and confront conflict openly. (Clarification)
   Alle groepsleden erkennen conflicten en treden deze open tegemoet.
4. Group members listen with understanding to others. (Belonging)
   Alle groepsleden reageren begripvol op elkaar.
5. Group members include others in the decision-making process. (Involvement)
   Alle groepsleden betrekken elkaar bij het nemen van beslissingen.
6. Group members recognize and respect individual differences. (Support)
   Alle groepsleden herkennen en respecteren individuele verschillen.
7. Group members contribute ideas and solutions to problems. (Achievement)
   Alle groepsleden geven suggesties en dragen bij aan het oplossen van problemen/taken.
8. Group members value the contributions and ideas of others. (Pride)
   Alle groepsleden waarderen de ideeën en bijdragen van anderen.
9. Group members recognize and reward group performance. (Recognition)
   Alle groepsleden erkennen en waarderen het groepsresultaat.
10. Group members encourage and appreciate comments about group efforts. (Satisfaction)
    Alle groepsleden stimuleren en waarderen opmerkingen t.a.v. de inzet van de groep.
Adapted from: Kormanski, C. (1990). Team building patterns of academic groups. The Journal for Specialists in Group Work, 15(4), 206-214. Translated into Dutch by: Strijbos, J. W., Martens, R. L., Jochems, W. M. G., & Broers, N. J. (2007). The effect of functional roles on perceived group efficiency during computer-supported collaborative learning: a matter of triangulation. Computers in Human Behavior, 23, 353–380.
Appendix 3.2 Scale: Group Process Satisfaction
Using the scale below, provide a rating that you think is most descriptive of your group. 1 = strongly disagree, 2 = disagree, 3 = moderately disagree, 4 = neutral, 5 = moderately agree, 6 = agree, 7 = strongly agree.
Group Process Satisfaction
1. I enjoyed talking with my group on the network.
   Ik vond het plezierig om met mijn medegroepsleden te communiceren via een computernetwerk.
2. I felt good that I could participate with my group in coming to a conclusion about the problem.
   Ik vond het prettig om als groep te werken aan de oplossing van de studietaak.
3. I did not feel that people listened to me when I had an idea about the problem.
   Ik had niet het idee dat anderen naar mij luisterden wanneer ik ideeën of suggesties inbracht.
4. I felt that I could express my thoughts and feelings openly to others on the network while solving the problem.
   Ik heb het idee dat ik mijn opinies en suggesties vrijelijk kon inbrengen gedurende het werk aan de studietaak.
5. I did not feel that people understood my thoughts and feelings after I expressed them while solving this problem.
   Ik had niet het idee dat anderen mijn opinies en suggesties begrepen, die ik inbracht tijdens het groepswerk.
6. I felt that my group worked very hard together to solve this problem.
   Ik ben van mening dat mijn groep intensief heeft samengewerkt aan de studietaak.
NB. Reversed items that need to be recoded are items 3 and 5.
Adapted from: Savicki, V., Kelley, M., & Lingenfelter, D. (1996). Gender, group composition, and task type in small task groups using computer-mediated communication. Computers in Human Behavior, 12, 549–565. Translated into Dutch by: Strijbos, J. W., Martens, R. L., Jochems, W. M. G., & Broers, N. J. (2007). The effect of functional roles on perceived group efficiency during computer-supported collaborative learning: a matter of triangulation. Computers in Human Behavior, 23, 353–380.
Appendix 3.3 Scale: Intragroup Conflict
Using the scale below, provide a rating that you think is most descriptive of your group. 1 = totally disagree, 2 = moderately disagree, 3 = neutral, 4 = moderately agree, 5 = totally agree.
Intragroup Conflict
1. There was a lot of tension among people in our group.
   Er was veel spanning tussen groepsleden.
2. People in our group never interfered with each other’s work.
   Groepsleden bemoeiden zich niet met elkaars werkzaamheden.
3. Most people in our group got along with one another.
   De meeste groepsleden konden goed met elkaar overweg.
4. Given the way group members performed their roles I often felt frustrated.
   De wijze waarop groepsleden hun rol vervulden, stelde me vaak teleur.
5. I found myself unhappy and in conflict with members of my group.
   Ik voelde me niet op mijn gemak en was vaak in conflict met groepsleden.
6. People I depended on to get my job done in the group often let me down.
   Groepsleden waarvan ik voor mijn werkzaamheden afhankelijk was, lieten mij vaak zitten.
7. I found myself in conflict with other group members because of their actions (or lack of actions).
   Ik was vaak in conflict met groepsleden door hun activiteiten in de groep (of het gebrek daaraan).
NB. Reversed items that need to be recoded are items 2 and 3.
Adapted from: Saavedra, R., Early, P. C., & Van Dyne, L. (1993). Complex interdependence in task-performing groups. Journal of Applied Psychology, 78, 61–72. Translated into Dutch by: Strijbos, J. W., Martens, R. L., Jochems, W. M. G., & Broers, N. J. (2007). The effect of functional roles on perceived group efficiency during computer-supported collaborative learning: a matter of triangulation. Computers in Human Behavior, 23, 353–380.
Appendix 3.4 Scale: Instructional Beliefs About Problem-Based Collaboration
Using the scale below, provide a rating that you think is most descriptive of your group. 1 = totally disagree, 2 = moderately disagree, 3 = neutral, 4 = moderately agree, 5 = totally agree.
Instructional Beliefs About Problem-Based Collaboration
1. Working in a group on a task is dull.
   In een groep aan een taak werken is saai.
2. Solving problems in a group is dull.
   Problemen oplossen in een groep is saai.
3. Working in a group is efficient.
   Werken in een groep is efficiënt.
4. Solving problems is exciting.
   Het oplossen van problemen is uitdagend.
5. Working in a group is exciting.
   Werken in een groep is uitdagend.
6. Working in a group is inefficient.
   Werken in een groep is inefficiënt.
7. Solving problems in a group is exciting.
   Problemen oplossen in een groep is uitdagend.
NB. Reversed items that need to be recoded are items 1, 2, and 6.
Adapted from: Clarebout, G., Elen, J., & Lowyck, J. (1999, August). An invasion in the classroom: Influence on instructional and epistemological beliefs. Paper presented at the eighth bi-annual conference of the European Association of Research on Learning and Instruction (EARLI), Goteborg, Sweden. Translated into Dutch by: Strijbos, J. W., Martens, R. L., Jochems, W. M. G., & Broers, N. J. (2007). The effect of functional roles on perceived group efficiency during computer-supported collaborative learning: a matter of triangulation. Computers in Human Behavior, 23, 353–380.
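Before scale scores can be computed from the questionnaires in these appendices, the reversed items flagged above have to be recoded. A minimal sketch of such a recoding step, assuming pandas, is given below; the dataframe and column names are invented for illustration and do not correspond to the actual data files.

    import pandas as pd

    def recode_reversed(responses, reversed_items, scale_max=5):
        # On a 1..scale_max Likert scale, a reversed score x is recoded to (scale_max + 1) - x.
        recoded = responses.copy()
        recoded[reversed_items] = (scale_max + 1) - recoded[reversed_items]
        return recoded

    # Hypothetical responses to the 7-item scale of Appendix 3.4 (items 1, 2 and 6 are reversed).
    responses = pd.DataFrame({"item1": [1, 2], "item2": [2, 1], "item3": [4, 5], "item4": [5, 4],
                              "item5": [4, 4], "item6": [2, 1], "item7": [5, 5]})
    recoded = recode_reversed(responses, ["item1", "item2", "item6"])
    print(recoded.mean(axis=1))  # scale score per respondent after recoding

For the 7-point Group Process Satisfaction items in Appendix 3.2, the same function would be called with scale_max=7.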
4. Awareness of Group Performance¹
Abstract This chapter presents the first empirical study and deals with the separate and interaction effects of an assessment tool and a reflection tool on the groups’ social and cognitive performance during computer supported collaborative learning (CSCL). A CSCL-environment was augmented with a self and peer assessment tool (Radar) and a reflection tool (Reflector) in order to make group members aware of both their individual and their group behavior. Radar visualizes how group members perceive their own social and cognitive performance and that of their peers during collaboration along five dimensions. Reflector stimulates group members to reflect upon their own performance and the performance of the group. A 2x2 factorial between-subjects design was used to examine whether Radar and Reflector would lead to better team development, more group satisfaction, lower levels of group conflict, more positive attitudes toward problem-based collaboration, and a better group product. Results show that groups with Radar perceived their team as being better developed, experienced lower conflict levels, and had a more positive attitude towards collaborative problem solving than groups without Radar. The quality of the group products, however, did not differ. The results demonstrate that peer feedback on the social performance of individual group members can enhance the performance and attitudes of a CSCL-group.
4.1 Introduction Well-performing teams go through several stages of group development (e.g., Gersick, 1988; Tuckman & Jensen, 1977). While the research showing this was carried out in face-to-face teams, Tuckman and Jensen’s (1977) concept of group development stages also seems to be relevant to virtual learning groups (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002). Tuckman and Jensen observed and distinguished five stages, namely: (1) forming (i.e., getting to know each other and the task at hand), (2) storming (i.e., establishing roles and positions within the group), (3) norming (i.e., reaching consensus about behavior, goals, and strategies), (4) performing (i.e., reaching conclusions and delivering results), and (5) adjourning (i.e., dismantling of the group when the task is completed). Each of these five stages involves two aspects: interpersonal relationships (i.e., social and socio-emotional aspects) and behavior to accomplish the task (i.e., cognitive aspects). It is especially the social or socio-emotional aspects of group development processes, such as developing positive affective relationships, group cohesiveness, feelings of trust, and a sense of community, that are very important for a group to reach its full potential (Kreijns & Kirschner, 2004). However, most computer supported collaborative learning (CSCL) environments focus primarily on the support of cognitive processes in collaboration and limit the possibility for social processes to take place (Kreijns & Kirschner, 2004). Moreover, group members are often not fully aware that their behavior is not in the best interest of the groups’ development or product (Karau & Williams, 1993). Therefore, in this study, a CSCL-environment was augmented with an assessment tool (Radar) and a reflection tool (Reflector) in order to support the social processes during collaboration, and to make group members aware of their individual and group behavior.
¹Based on Phielix, C., Prins, F. J., & Kirschner, P. A. (2010). Awareness of group performance in a CSCL environment: Effects of peer feedback and reflection. Computers in Human Behavior, 26, 151-161.
According to Hattie and Timperley (2007), the effect of feedback (e.g., shared self and peer ratings in Radar) can be increased when students reflect upon this matter. To this end, it was hypothesized that a combination of Radar and Reflector would be most effective for influencing group members’ behavior and enhancing their performance. The aim of this study is to examine the effects of these two tools on team development, group satisfaction, level of group conflict, attitude towards collaborative problem solving, and the quality of the groups’ product.
4.2 Research Questions This study investigated the effect of a peer assessment tool and a reflection tool on both the social and cognitive behavior of individual group members working in a CSCL-environment, and the social and cognitive performance of the group as a whole. To this end, an existing CSCL-environment was augmented with two independent, but complementary, tools. The first was an individualized peer assessment tool - Radar - which was meant to stimulate and provide group members with information about the social and cognitive behavior of themselves, their peers, and the group as a whole. This information was presented from the perspectives of the group members themselves (i.e., self perceptions), their peers (i.e., peer perceptions), and the group as a whole. The second tool was a shared reflection tool - Reflector - which was meant to stimulate group members to reflect on and provide information about their personal perspectives on the group’s performance, their own contributions, their own behavior and how this behavior was perceived by their peers, as well as to co-reflect on the group performance and reach a shared understanding on this. The following research questions will be addressed:
1. Do groups with Radar and Reflector show larger differences between self assessments and peer assessments across three successive measurement moments than groups with only Radar? It was expected that the peer feedback provided by Radar at the first assessment would make group members aware of their unrealistic self perceptions and peer perceptions, resulting in a decrease of self assessment and peer assessment scores at a subsequent assessment. Also, a combination of Radar and Reflector should lead to even lower self assessment and peer assessment scores than groups with only Radar.
2. Do groups with Radar and Reflector show more congruency between self assessments and peer assessments at T3 than groups without Radar and/or Reflector? It was expected that Radar and Reflector would cause group members to adjust their unrealistically positive self perceptions towards the more realistic perceptions of their peers between a second and third measurement. Therefore, groups with Radar and Reflector should show the highest positive correlations between self assessments and peer assessments.
3. Do members of groups with Radar and Reflector perceive themselves and others to exhibit better social and cognitive behavior than those in groups without Radar and/or Reflector? It was expected that both Radar and Reflector would positively affect perceived social and cognitive behavior, with a combination of Radar and Reflector being most effective.
4. Do groups with Radar and Reflector perform better socially than groups without Radar and/or Reflector? In other words, do groups using Radar and Reflector develop better, have higher group satisfaction, have lower levels of group conflict, and have a more positive attitude towards collaborative problem solving than groups without Radar and/or Reflector? It was expected that both Radar and Reflector would positively affect the social behavior in the group, and that this would lead to an increase in the social performance of the group. A combination of both tools should be most effective.
5. Do groups with Radar and Reflector perform better cognitively than groups without Radar and/or Reflector? In other words, do groups with Radar and Reflector produce a group product of higher quality than groups without Radar and/or Reflector? It was expected that both Radar and Reflector would positively affect the social behavior in the group and that this would indirectly lead to an increase in the cognitive performance of the group. A combination of both tools should be most effective.
4.3 Method and Instrumentation 4.3.1 Participants Participants were 39 sophomore Dutch high school students (19 male, 20 female) with an average age of 15.5 years (SD = .60, min = 14, max = 17), from an academic high school in the Netherlands. Students came from two classes and were enrolled in the second stage of the pre-university education track, which encompasses the final three years of high school. The participants were randomly assigned by the researchers to groups of three or four, and to one of the four conditions (see Design). Group compositions were heterogeneous in ability and gender.
4.3.2 Design A 2x2 between-subjects factorial design was used with the factors Radar unavailable (¬Ra) versus available (+Ra), and Reflector unavailable (¬Rf) versus available (+Rf). This resulted in four conditions (¬Ra¬Rf, +Ra¬Rf, ¬Ra+Rf, +Ra+Rf). The condition with both Radar and Reflector (+Ra+Rf) consisted of 11 students (2 groups of 4 and 1 group of 3), the condition without Radar but with Reflector (¬Ra+Rf) of 12 students (3 groups of 4), and the conditions with Radar but without Reflector (+Ra¬Rf) and without both tools (¬Ra¬Rf) each of 8 students (2 groups of 4).

Table 4.1 Overview of Scales, Subscales and Instruments
Scale                   Subscales                                                     Instrument
Social behavior         Influence, Friendliness, Cooperation, Reliability             Radar
Cognitive behavior      Productivity                                                  Radar
Social performance      Team Development, Group-process Satisfaction, Intra-group     Questionnaire
                        Conflicts, Attitude towards Collaborative Problem Solving
Cognitive performance   -                                                             Essay grade
4.3.3 Measures See Table 4.1 for the measures of social and cognitive behavior and performance. Social behavior. The perceived social behavior in the group was measured by the self assessments and peer assessments in Radar on four variables, namely ‘influence’, ‘friendliness’, ‘cooperativeness’, and ‘reliability’. These variables were rated on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high). Cognitive behavior. The perceived cognitive behavior in the group was measured by the self assessments and peer assessments in Radar on the variable ‘productivity’, which was rated on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high).
Cognitive performance. The grade given to the groups’ collaborative writing task (i.e., the essay) was used as a measure of cognitive performance. The essays were graded by two researchers, both experienced in grading essays. The inter-rater reliability was high (n = 10, Cronbach’s α = .86). Social performance. To measure social performance, previously tested and validated scales were used to measure team development (α = .92, 10 items), group-process satisfaction (α = .76, 6 items, both from Savicki, Kelley, & Lingenfelter, 1996), intra-group conflicts (α = .92, 7 items, from Saavedra, Early, & Van Dyne, 1993), and attitude towards collaborative problem solving (α = .81, 7 items, from Clarebout, Elen, & Lowyck, 1999). These scales were translated into Dutch and transformed into 5-point Likert scales (1 = totally disagree, 5 = totally agree; see Table 4.2) by Strijbos, Martens, Jochems, and Broers (2007). The Team Development scale provides information on the perceived level of group cohesion. The Group-process Satisfaction scale provides information on the perceived satisfaction with general group functioning. The Intra-group Conflicts scale provides information on the perceived level of conflict between group members. The Attitude towards Collaborative Problem Solving scale provides information on the perceived level of group effectiveness and how group members felt about working and solving problems in a group.

Table 4.2 Examples of Social Performance Scales and their reliabilities in this study
Team Development (k = 10, α = .77). Example item: ‘Group members contribute ideas and solutions to problems.’
Group-process Satisfaction (k = 6, α = .71). Example item: ‘I felt that my group worked very hard together to solve this problem.’
Intra-group Conflicts (k = 7, α = .84). Example item: ‘I found myself unhappy and in conflict with members of my group.’
Attitude towards Collaborative Problem Solving (k = 7, α = .74). Example item: ‘Collaborating in a group is challenging.’
4.3.4 Task and procedure The students collaborated in groups of three or four on a collaborative writing task in sociology. Every student worked at a computer. Each group had to write one essay about Fitna - a very contentious film - which argues that Islam encourages, among other things, terrorism, anti-Semitism, sexism, violence against women, and Islamic universalism. This task was considered highly civically relevant by the school. The collaborative writing task consisted of two 90-minute sessions separated by one week. The groups collaborated in a CSCL environment called Virtual Collaborative Research Institute (VCRI; Jaspers, Broeken, & Erkens, 2002), which is a groupware program designed to support collaborative learning on research projects and inquiry tasks. VCRI will be further described in the Instruments section. Students were instructed to use VCRI to communicate with the other group members and to make complete use of the tools for peer feedback and reflection when the experimental condition allowed this. Students received content information and definitions regarding the five variables on which they had to assess themselves and their peers. Students were told that they had four lessons to complete the task, that it would be graded by their teacher, and that it would affect their grade for the course. The introduction to the task stressed the importance of working together as a group and pointed out that each individual
group member was responsible for the successful completion of the group task. To successfully complete the task, all group members had to participate. During collaboration, groups with a peer assessment tool (i.e., +Ra¬Rf, +Ra+Rf) used the tool at the beginning of the experiment (T1), halfway through the experiment (i.e., at the end of the first session; T2), and at the end of the second and final session (T3). The groups with a reflection tool (i.e., ¬Ra+Rf, +Ra+Rf) used the tool twice, namely halfway through the experiment (T2) and at the end of the final session (T3). While groups with Radar and/or Reflector used the tools, groups without Radar and/or Reflector continued working on their collaborative writing task. Groups with Radar and/or Reflector received extra time for their collaborative writing task so that time-on-task was equal for all conditions. At the end of the final session (T3), the peer assessment and reflection tools became available for all conditions so that all participants could assess their peers and reflect on their behaviors. Finally, all participants completed a 30-item questionnaire measuring the social performance of the group.
4.3.5 Tools
Virtual Collaborative Research Institute (VCRI). The Virtual Collaborative Research Institute (VCRI) is a groupware program that supports collaborative working and learning on research projects and inquiry tasks (Jaspers, Broeken, & Erkens, 2004). VCRI contains more than 10 different tools, but only 6 were used for this experiment (see Figure 4.1).
Figure 4.1 Screenshot of VCRI with the six tools used in this experiment.
The Chat tool (top left) is used for synchronous communication between group members. The chat history is automatically stored and can be re-read by participants at any time. Users can
search for relevant historical information using the Sources tool (top centre). The Co-Writer (top right) is a shared word-processor, which can be used to write a group text. Using the Co-Writer, students can simultaneously work on different parts of their texts. Notes (bottom left) is a note pad which allows the user to make notes and to copy and paste selected information. Radar for peer feedback (bottom centre) and Reflector for reflection (bottom right) will be described in the following sections. Windows of the available tools are automatically arranged on the screen when students log on to the VCRI.
Figure 4.2 Radar - Group information
Peer assessment tool (Radar). The VCRI was augmented with a peer assessment tool for eliciting and sharing information on group members’ social and cognitive behavior. This information is visualized in a radar diagram; therefore the peer assessment tool is named ‘Radar’ (see Figure 4.2). Radar provides users with anonymous information on how their cognitive and social behavior is perceived by themselves, their peers, and the group as a whole. The information gathered is based on specific traits that have been found to tacitly affect how one ‘rates’ other people (den Brok, Brekelmans, & Wubbels, 2006). Radar provides information on five traits that are important for assessing behavior in groups. Four are related to social or interpersonal behavior, namely (1) influence; (2) friendliness; (3) cooperation; (4) reliability; and one to cognitive behavior, namely (5) productivity. These traits are derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels; Kenny, 1994; Salas, Sims, & Burke, 2005).
Influence is directly derived from Wubbels, Créton, and Hooymayers’ (1985) influence dimension (i.e., dominance vs. submissiveness) in their model for interpersonal teacher behavior. This dimension is also used by Bales (1988) and represents the prominence, status, power, and personal influence that the individual is seen to have in relation to other group members. The variable is labeled ‘influence’, and not ‘dominance’ or ‘submissive’, because those labels can be perceived as negative traits. Friendliness is one of the eight behavior categories from Wubbels, Créton, and Hooymayers’ (1985) model for interpersonal teacher behavior. Bales (1988) used a similar dimension (i.e., friendliness vs. unfriendliness). Bales and Cohen (1979) defined this as the extent to which individual members are friendly and respectful to each other. Cooperation, which denotes the degree to which someone is willing to work with others, is derived directly from Wubbels et al.’s (1985) dimension Proximity (i.e., opposition vs. cooperation). They defined proximity as the property of being close together, or in group settings as the feeling of being a group (i.e., group cohesiveness). Reliability is considered a trait reflecting ‘trust’, which has been identified as an important precursor for successful collaboration, both in face-to-face teams (Castleton_Partners/TCO, 2007) and in CSCL (Jarvenpaa & Leidner, 1999). According to Emans, Koopman, Rutte, and Steensma (1996), trust can be seen as the cognitive and affective assurance of group members that they respect each other’s interests and, therefore, can orient themselves towards each other’s words, actions, and decisions with an easy conscience. Productivity is the extent to which individual members contribute to tasks or duties central to group performance or group efficiency (Salas, Sims, & Burke, 2005). This trait, which represents cognitive or task-related behavior, was selected because research has shown that group members monitor the performance of their other group members in comparison to their own performance (Salas, Sims, & Burke). In Radar, all group members are both assessor and assessee. In the role of assessor, the to-be-assessed peer in the group can be selected and her/his profile will appear as dotted lines in the centre circle of the radar diagram. Each group member is represented by a specific color. Assessors rate themselves and all other group members on each of the five subscales (i.e., traits), which are divided into 41 points of assessment ranging from 0 to 4. For example, a student can rate his/her peer 3.2 for friendliness. To simplify data analysis, ratings were transformed into integers on a 100-point scale by multiplying the ratings (0-4) by 25. Thus, after multiplication, a rating of 3.2 was saved in the database as 80 points (3.2 × 25) on the 100-point scale. To make sure that all assessors interpreted the five traits in the same way, assessors saw a text balloon with content information and definitions when they moved the cursor across one of the five traits in the tool. For example, when the assessor moves the cursor across ‘influence’ a balloon pops up with the text ‘A high score on influence means that this person has a big influence on what happens in the group, on other group members’ behavior, and on the form and content of the group product (the essay)’. The ratings are automatically saved in a database.
The assessment is anonymous; group members can see the output of the assessments of the other group members, but cannot see who entered the data. In order to stimulate students to complete the Radar, they can only gain access to the individual and average assessments of their peers after they have completed the assessment themselves. When all group members have completed their self assessments and peer assessments, two modified radar diagrams become available. The first - Information about yourself - shows the output of the self assessment (e.g., Chris about Chris) along with the average scores of the peer assessments of her/him (e.g., Group about Chris). The self assessment is not
taken into account for computing the average scores. To provide more information about the variance in the average score of their peer assessment, students can also choose to view the individual peer assessments about their own behavior (e.g., Group members about Chris). The second - Information about the group (see Figure 4.2) - represents the average scores of the group members, so that group members can get a general impression about the functioning of the group. All group members are represented as a solid line in the diagram, each with a different color. The student can include or exclude group members from the diagram by clicking a name in the legend.
Reflection tool (Reflector). VCRI was also augmented with a reflection tool (Reflector) in order to stimulate group members to reflect and/or co-reflect on their individual behavior and overall group performance. This tool contained the five reflective questions discussed earlier:
1. What is your opinion on how the group functioned? Give arguments to support this.
2. What do you contribute to the functioning of the group? Give examples.
3. What do other group members think about your functioning in the group? Why do you think this?
4. What is your opinion on how you functioned in the group? Give arguments to support this.
5. What does the group think about its functioning in general? Discuss and formulate a conclusion that is shared by all group members.
The first four questions are answered in Reflector, and completion is indicated by clicking an ‘Add’-button. This allows students to share their answers with the rest of the group and allows them to see the others’ answers. Students can only gain access to the answers of their peers after they have added their own answers, so as not to be influenced by one another. The fifth question is completed in the Co-Writer, which allows writing a shared conclusion. The responses made by the students in the Reflector are not scored or evaluated.
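For illustration only, the sketch below shows in Python how the Radar score handling described above might look. It is not the actual Radar/VCRI implementation; the data layout and names are assumptions, but the arithmetic follows the text: ratings from 0 to 4 are multiplied by 25 and stored on a 100-point scale, and the group view averages the peer ratings while excluding the self assessment.

# Illustrative sketch only; not the actual Radar/VCRI code.
# Assumed layout: ratings[assessor][assessee][trait] holds ratings on the 0-4 input scale.
TRAITS = ["influence", "friendliness", "cooperation", "reliability", "productivity"]

def to_100_point(rating: float) -> int:
    # Transform a 0-4 rating to the stored 0-100 scale (e.g., 3.2 -> 80).
    return round(rating * 25)

def radar_views(ratings: dict, member: str) -> dict:
    # Self view and the 'Group about me' view (peer average, self assessment excluded).
    self_view = {t: to_100_point(ratings[member][member][t]) for t in TRAITS}
    peers = [assessor for assessor in ratings if assessor != member]
    group_about_me = {
        t: round(sum(to_100_point(ratings[p][member][t]) for p in peers) / len(peers))
        for t in TRAITS
    }
    return {"self": self_view, "group_about_me": group_about_me}

# Hypothetical three-person group rating the member 'Chris'.
ratings = {
    "Chris": {"Chris": {t: 3.2 for t in TRAITS}},
    "Anna":  {"Chris": {t: 2.8 for t in TRAITS}},
    "Bram":  {"Chris": {t: 3.0 for t in TRAITS}},
}
print(radar_views(ratings, "Chris"))  # self scores of 80, peer average of about 72 per trait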
4.3.6 Data Analyses
First, to examine whether groups with Radar and Reflector show larger differences for self assessments and peer assessments than groups with only Radar between T1, T2 and T3, paired samples t-tests (one-tailed) with the dependent variables influence, friendliness, cooperation, reliability and productivity are used to (1) compare the self assessment scores at T1, T2 and T3, and (2) compare the peer assessment scores at T1, T2 and T3. Differences between the self and peer assessments at T1, T2 and T3 are analyzed using an independent t-test (two-tailed). Second, to examine whether groups with Radar and Reflector show more congruency between self assessments and peer assessments than groups with only Radar, a Pearson product-moment correlation coefficient is used. We expect that peer assessments at T1 and T2 will affect self assessments at T2 and T3; therefore, correlations will be calculated between peer assessments at T1, T2 and T3 and self assessments at T2 and T3. Third, to examine whether groups with Radar and/or Reflector perceived better social and cognitive behavior than groups without these tools, a two-way between-groups analysis of variance (ANOVA) (two-tailed) is conducted to explore the effect of Radar and/or Reflector on influence, friendliness, cooperation, reliability and productivity, as measured at T3 for both self assessment and peer assessment. Fourth, to examine whether Radar and/or Reflector lead to higher social performance, a two-way between-groups analysis of variance (one-tailed) is conducted with the dependent variables
‘team development’, ‘group satisfaction’, ‘level of group conflicts’, and ‘attitude towards collaborative problem solving’, as measured by the questionnaire at the end of the experiment. Fifth, to examine whether Radar and/or Reflector lead to higher cognitive performance, a two-way between-groups analysis of variance (one-tailed) is conducted with the grade on the essay as the dependent variable.
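As a rough illustration of how the tests listed above could be carried out, the following Python sketch uses SciPy and statsmodels. It is not the analysis script of this study; the data file and column names (influence_self_t1, radar, reflector, and so on) are hypothetical, and the eta-squared value is the conventional conversion t²/(t² + df) rather than a formula reported in this chapter.

# Hypothetical analysis sketch; file and column names are assumptions, not study data.
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("radar_scores.csv")  # assumed: one row per student

# Paired samples t-test: self-assessed Influence at T1 versus T2 (halve p for a one-tailed test).
t_val, p_two = stats.ttest_rel(df["influence_self_t1"], df["influence_self_t2"])
eta_sq = t_val ** 2 / (t_val ** 2 + (len(df) - 1))  # conventional eta squared for a t-test

# Pearson product-moment correlation: peer assessment at T2 with self assessment at T3.
r, p_r = stats.pearsonr(df["influence_peer_t2"], df["influence_self_t3"])

# Independent samples t-test: groups with versus without Radar on a questionnaire scale.
with_radar = df.loc[df["radar"] == 1, "team_development"]
without_radar = df.loc[df["radar"] == 0, "team_development"]
t_ind, p_ind = stats.ttest_ind(with_radar, without_radar)

# Two-way between-groups ANOVA: Radar x Reflector on Influence at T3.
model = smf.ols("influence_self_t3 ~ C(radar) * C(reflector)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))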
4.4 Results
4.4.1 Self assessment scores
Table 4.3 shows the mean scores and standard deviations of self assessments at T1, T2 and T3 per condition. At T1 and T2 only groups with Radar (conditions +Ra¬Rf and +Ra+Rf) could complete a self assessment. At T3 all conditions received and completed a self assessment. Except where noted, tests were one-sided. The rule of thumb (Kittler, Menard, & Phillips, 2007) for effect sizes (η²) was small ≥ .01, medium ≥ .06, and large ≥ .14.

Table 4.3 Mean and Standard Deviations of Self Assessments per Condition

T | Condition | n | Influence M (SD) | Friendliness M (SD) | Cooperation M (SD) | Reliability M (SD) | Productivity M (SD)
1 | +Ra¬Rf | 8 | 73.62 (9.97) | 66.00 (4.60) | 69.38 (8.88) | 62.38 (10.94) | 62.25 (6.63)
1 | +Ra+Rf | 11 | 73.09 (9.18) | 68.91 (9.69) | 70.09 (16.59) | 62.27 (11.33) | 67.18 (13.42)
2 | +Ra¬Rf | 8 | 75.88 (7.34) | 68.38 (8.23) | 66.25 (7.96) | 59.88 (18.60) | 66.50 (11.25)
2 | +Ra+Rf | 11 | 69.27 (13.45) | 64.00 (8.61) | 67.09 (12.64) | 60.00 (21.37) | 61.82 (19.10)
3 | +Ra¬Rf | 8 | 76.25 (4.71) | 73.63 (6.97) | 69.88 (3.83) | 70.50 (9.93) | 72.13 (7.70)
3 | +Ra+Rf | 11 | 66.54 (12.68) | 68.46 (16.98) | 69.63 (13.43) | 62.09 (17.84) | 65.09 (15.29)
3 | ¬Ra¬Rf | 8 | 75.75 (15.24) | 76.50 (14.14) | 77.63 (12.62) | 69.00 (16.61) | 76.13 (14.14)
3 | ¬Ra+Rf | 12 | 69.50 (10.62) | 67.58 (22.15) | 70.92 (5.81) | 68.33 (16.81) | 73.33 (11.69)
To examine whether groups with Radar and Reflector show higher discrepancies for self assessments between T1, T2 and T3 than groups with only Radar, a paired samples t-test was used to compare the average self assessment scores at T1, T2 and T3 with respect to perceived social and cognitive behavior (influence, friendliness, cooperation, reliability and productivity). No significant differences between the first and the second assessment were found for groups with only Radar (+Ra¬Rf). Compared with the second assessment, students at T3 perceived significantly more Reliability, t (7) = 2.53, p = .02, η² = .48. Compared with the first assessment, students at T3 perceived more Friendliness, t (7) = 3.10, p = .009, η² = .58, and more Productivity, t (7) = 2.55, p = .02, η² = .48. For groups with both Radar and Reflector (+Ra+Rf) no significant differences were found between the average self assessment scores at T1, T2 and T3. Independent t-tests comparing self assessment scores of conditions +Ra¬Rf and +Ra+Rf at T1, T2 and T3 revealed that groups with only Radar (+Ra¬Rf) perceived significantly more Influence at T3 than groups with both Radar and Reflector (+Ra+Rf), t (13) = -2.33, p = .04 (two-tailed). The magnitude of the difference in means (mean difference = 9.71, 95% CI: .72 to 18.69) was large (η² = .48). No other significant differences were found.
4.4.2 Peer assessment scores
Table 4.4 shows the mean scores and the standard deviations of peer assessments at T1, T2 and T3 per condition. It was assumed that peer feedback provided by Radar would make group
members aware of possibly unrealistic positive perceptions of the performance of their peers. The expectation was that peer assessment scores would decrease at T2.

Table 4.4 Mean and Standard Deviations of Peer Assessments per Condition

T | Condition | n | Influence M (SD) | Friendliness M (SD) | Cooperation M (SD) | Reliability M (SD) | Productivity M (SD)
1 | +Ra¬Rf | 24 | 74.17 (6.70) | 66.42 (7.22) | 69.88 (6.27) | 69.25 (7.35) | 63.33 (8.29)
1 | +Ra+Rf | 30 | 75.17 (11.51) | 75.13 (9.85) | 73.07 (12.77) | 70.73 (11.81) | 68.27 (13.72)
2 | +Ra¬Rf | 24 | 75.04 (7.67) | 67.08 (8.83) | 66.58 (9.18) | 66.00 (13.05) | 66.79 (10.32)
2 | +Ra+Rf | 30 | 68.80 (13.48) | 63.77 (11.15) | 69.40 (7.93) | 64.30 (15.32) | 69.50 (11.43)
3 | +Ra¬Rf | 24 | 75.29 (6.95) | 71.63 (7.81) | 71.04 (5.04) | 72.08 (6.98) | 71.00 (7.58)
3 | +Ra+Rf | 30 | 70.07 (8.80) | 69.60 (17.09) | 69.40 (11.36) | 69.83 (12.23) | 66.73 (12.91)
3 | ¬Ra¬Rf | 24 | 71.79 (18.87) | 66.62 (18.17) | 71.04 (17.89) | 71.46 (16.99) | 68.25 (18.49)
3 | ¬Ra+Rf | 36 | 70.28 (12.87) | 66.14 (17.41) | 68.19 (16.87) | 69.53 (17.87) | 68.28 (14.02)
To examine whether groups with Radar and Reflector show greater differences than groups with only Radar for peer assessments between T1, T2 and T3, a paired samples t-test (one-tailed) was used to compare average peer assessment scores at T1, T2 and T3 with respect to perceived social and cognitive behavior (i.e., influence, friendliness, cooperation, reliability and productivity). No significant differences were found between the first and the second assessments for groups with only Radar (+Ra¬Rf). Compared to the second assessment, students at T3 perceived significantly more Friendliness, t (23) = 2.80, p = .01, η² = .25, more Cooperativeness, t (23) = 2.29, p = .02, η² = .19, more Reliability, t (23) = 2.62, p = .008, η² = .23, and higher Productivity, t (23) = 2.38, p = .01, η² = .20. Compared with the first assessment, students at T3 perceived significantly more Friendliness, t (23) = 3.27, p = .002, η² = .32, and more Productivity, t (23) = 4.33, p = .00, η² = .45. Significant differences were found between the first and the second assessments for groups with both Radar and Reflector (+Ra+Rf). Compared to T1, students with Radar and Reflector at T2 perceived significantly less Influence, t (29) = -2.00, p = .03, η² = .06, less Friendliness, t (29) = -4.40, p = .00, η² = .25, and less Reliability, t (29) = -1.81, p = .04, η² = .05. Compared with the second assessment, students at T3 perceived significantly more Friendliness, t (29) = 2.05, p = .03, η² = .07, and more Reliability, t (29) = 1.88, p = .04, η² = .06. Compared to T1, students at T3 perceived significantly less Influence, t (29) = -2.15, p = .02, η² = .07.
4.4.3 Comparing self and peer assessments for groups with Radar
An independent t-test (one-tailed) was used to examine the differences between self assessments and peer assessments at T1, T2 and T3 with respect to perceived social and cognitive behavior (i.e., influence, friendliness, cooperation, reliability and productivity). Tables 4.3 and 4.4 show the mean scores and standard deviations of self assessments and peer assessments per condition. Students with only Radar (+Ra¬Rf) perceived their peers at T1 as significantly more Reliable than themselves, t (30) = -2.02, p = .03. The magnitude of the difference in means (mean difference = -6.88, 95% CI: -13.82 to .07) was moderate (η² = .12). In comparing the other students in their team with themselves, students with both Radar and Reflector (+Ra+Rf) perceived their peers at T1 as being significantly more Friendly, t (39) = -1.80, p = .04, with a moderate (η² = .08) magnitude of the difference in means (mean
difference = -6.22, 95% CI: -13.21 to .77) and as significantly more Reliable, t (39) = -2.05, p = .02, with a moderate (η² = .10) magnitude of the difference in means (mean difference = -8.46, 95% CI: -16.80 to -.13). No other significant differences were found between self and peer assessments for conditions +Ra¬Rf and +Ra+Rf at T1, T2 and T3, or for conditions ¬Ra+Rf and ¬Ra¬Rf at T3.
4.4.4 Examining congruency between self and peer assessments
A Pearson product-moment correlation coefficient was used to test congruency between peer assessments at T1, T2, T3 and self assessments at T2 and T3 with respect to perceived social and cognitive behavior (i.e., influence, friendliness, cooperation, reliability and productivity). Preliminary analyses were performed to ensure no violation of the assumptions of normality, linearity and homoscedasticity. Table 4.5 shows the Pearson correlations for peer assessments at T1, T2, T3, and self assessments at T2, T3.

Table 4.5 Pearson correlations for Peer Assessments at T1, T2, T3 and Self Assessment at T2, T3

Condition | Assessment | Influence (Self-2, Self-3) | Friendliness (Self-2, Self-3) | Cooperation (Self-2, Self-3) | Reliability (Self-2, Self-3) | Productivity (Self-2, Self-3)
+Ra¬Rf (n = 8) | Peer-1 | -.47, -.04 | .22, -.15 | -.27, -.68 | -.46, -.04 | .17, .04
+Ra¬Rf (n = 8) | Peer-2 | -.23, -.43 | -.52, -.72* | -.08, -.45 | .33, .48 | -.27, -.34
+Ra¬Rf (n = 8) | Peer-3 | -.64, -.81* | .11, -.33 | -.65, -.59 | .71*, .67 | -.08, -.39
+Ra+Rf (n = 11) | Peer-1 | .41, .50 | -.01, .29 | .47, .29 | .33, .28 | .56, .48
+Ra+Rf (n = 11) | Peer-2 | .64*, .81** | .37, .53 | .01, -.02 | .47, .22 | .73*, .62*
+Ra+Rf (n = 11) | Peer-3 | .51, .69* | .62*, .67* | -.05, -.30 | .42, .47 | .72*, .35
There was a strong negative correlation for groups with only Radar (+Ra¬Rf) between self assessment and peer assessment scores for Influence at T3, r = -.81, n = 8, p = .01, and between peer assessment scores at T2 and self assessments at T3 for Friendliness, r = -.72, n = 8, p = .04. A strong positive correlation was found between the self assessment scores for Reliability at T2 and peer assessments at T3, r = .71, n = 8, p = .05. For groups with both Radar and Reflector (+Ra+Rf), there was a strong positive correlation between self assessment and peer assessment scores for Influence at T2, r = .64, n = 11, p = .03, and also at T3, r = .69, n = 11, p = .02. Peer assessment scores for Influence at T2 correlated strongly with self assessments at T3, r = .81, n = 11, p = .00, indicating a convergence of self and peer perceptions. Peer assessment scores for Friendliness at T3 correlated strongly with self assessments at T2, r = .62, n = 11, p = .04, and self assessments at T3, r = .67, n = 11, p = .03. Peer assessment scores for Productivity at T2 correlated strongly with self assessments at T2, r = .73, n = 11, p = .01, and with self assessments at T3, r = .62, n = 11, p = .04, indicating a convergence of self and peer perceptions. A strong positive correlation was also found between self assessment scores at T2 and peer assessments at T3, r = .72, n = 11, p = .01.
4.4.5 Comparing peer assessment scores for all conditions at T3
It was expected that at the end of the task (T3), groups with both Radar and Reflector (condition +Ra+Rf) would perceive more social behavior (e.g., less influence, more friendliness) and better cognitive behavior (e.g., more productivity) than groups with only Radar (+Ra¬Rf), only Reflector (¬Ra+Rf) or without either (¬Ra¬Rf). A two-way between-groups ANOVA was conducted to explore the effect of Radar and/or Reflector at T3 for both peer assessment and self
assessment. Analysis of peer assessments showed no significant interaction between Radar and Reflector and no significant main effects. Analysis of self assessments showed no significant interaction or main effect for Radar, but did show a statistically significant main effect for Reflector on Influence, F (1, 35) = 4.54, p = .04 (two-tailed), partial η² = .12. In an independent t-test comparing the self assessment scores on Influence for groups with and without Reflector, the ¬Ra+Rf and +Ra+Rf conditions were combined. Groups with Reflector scored significantly lower on Influence (M = 68.08, SD = 11.48) than groups without (M = 76.00, SD = 10.90), t (37) = -2.16, p = .04. The magnitude of the difference in means (mean difference = -7.91, 95% CI: -15.33 to -.49) was moderate (η² = .11).
4.4.6 Impact of tools on social performance
A two-way between-groups ANOVA was conducted to explore the effect of Radar and Reflector on social performance with respect to team development, group satisfaction, group conflicts and attitude towards collaborative problem solving. Participants were divided into four groups according to their condition (¬Ra¬Rf, +Ra¬Rf, ¬Ra+Rf, +Ra+Rf). There were no significant interaction effects between Radar and Reflector and no significant main effects for Reflector. There was a main effect for Radar on team development, F (1, 30) = 4.19, p = .05, partial η² = .12, level of group conflict, F (1, 31) = 4.49, p = .04, partial η² = .13, and attitude towards collaborative problem solving, F (2, 31) = 1.44, p = .04, partial η² = .13. An independent t-test was conducted to examine the main effects of Radar on team development, group conflict and attitude towards collaborative problem solving. Conditions +Ra¬Rf and +Ra+Rf were combined into a new group named ‘with Radar’, and conditions ¬Ra+Rf and ¬Ra¬Rf were combined into group ‘without Radar’ (see Table 4.6).

Table 4.6 Independent Samples t-test Between Groups With and Without Radar

Scale | Treatment | N | M | SD | Mean difference | p | η²
Team development | with Radar | 16 | 4.08 | .35 | .26* | .04 | .09
Team development | without Radar | 18 | 3.82 | .48
Group satisfaction | with Radar | 17 | 3.95 | .55 | .00 | .49 | .00
Group satisfaction | without Radar | 18 | 3.95 | .70
Level of group conflict | with Radar | 17 | 1.79 | .37 | -.38* | .03 | .11
Level of group conflict | without Radar | 18 | 2.17 | .71
Attitude towards collaborative problem solving | with Radar | 17 | 3.89 | .39 | .32* | .04 | .09
Attitude towards collaborative problem solving | without Radar | 18 | 3.57 | .62
The results in Table 4.6 show that groups with Radar (+Ra¬Rf and +Ra+Rf) scored significantly higher on team development, t (32) = 1.79, p = .04, experienced significantly less group conflict, t (36) = -2.03, p = .03, and had a significantly more positive attitude towards collaborative problem solving, t (29) = 1.84, p = .04, than groups without Radar (¬Ra+Rf and ¬Ra¬Rf).
4.4.7 Impact of tools on cognitive performance
A two-way between-groups ANOVA was conducted to explore the effect of Radar and Reflector on group cognitive performance, as measured by the grade given to their essays. There
were no significant interaction effects between Radar and Reflector, and no significant main effects for Radar or Reflector. Table 4.7 shows means and standard deviations for cognitive performance per condition.

Table 4.7 Mean and Standard Deviations for Cognitive Performance (Grade for the Essay) per Condition

Condition | M | SD | Min | Max
¬Ra¬Rf | 6.00 | .71 | 5.5 | 6.5
¬Ra+Rf | 5.83 | 1.04 | 5.0 | 7.0
+Ra¬Rf | 6.25 | 2.47 | 4.5 | 8.0
+Ra+Rf | 5.17 | 1.61 | 4.0 | 7.0
4.5 Discussion and Conclusion
The first aim of this study was to examine whether groups with the peer assessment tool (Radar) and the reflection tool (Reflector) showed larger differences for self assessments and peer assessments between T1, T2 and T3 than groups with only Radar. Based on Stroebe, Diehl, and Abakoumkin (1992), we assumed that group members would generally form unrealistically positive perceptions of self performance and peer performance. Therefore, we expected that peer feedback provided by Radar at the first assessment (T1) would make group members aware of these perceptions, resulting in a decrease of self assessment and peer assessment scores at the second assessment (T2). Analysis of self assessment scores showed no significant decrease in scores at T2, but, as expected, analyses of peer assessment scores at T2 for groups with both Radar and Reflector (+Ra+Rf) showed a decrease in scores for Influence, Friendliness, and Reliability as compared to the first assessment (T1). The second aim of this study was to determine whether the self assessment and peer assessment scores of groups with Radar and Reflector would be more similar (be more congruent) than the scores of groups with only Radar. We assumed that group members would adjust their unrealistic positive self perceptions towards more realistic perceptions between T2 and T3. Therefore, positive correlations were expected between the peer assessments at T2 and the self assessments at T3. As expected, the peer assessments of groups with both Radar and Reflector (+Ra+Rf) at T2 correlated strongly with the self assessments at T3 for Influence and Productivity, indicating a convergence of self and peer perceptions. For groups with only Radar (+Ra¬Rf), a strong negative correlation was found between peer assessment scores at T2 and self assessments at T3 for Friendliness. This suggests that a combination of the two tools will make students more aware of their social and cognitive behavior during collaboration. The third aim of this study was to enhance students’ social and cognitive behavior with Radar and/or Reflector. We assumed that at T3, groups with both Radar and Reflector (+Ra+Rf) would perceive more social behavior (e.g., less influence, more friendliness) and cognitive behavior (e.g., more productivity) than groups without Radar (¬Ra+Rf), without Reflector (+Ra¬Rf), or without both (¬Ra¬Rf). A two-way ANOVA of the self assessment scores showed a significant main effect for Reflector on Influence but not on other variables. Independent t-tests between groups with and without Reflector showed that groups with Reflector scored significantly lower on Influence than groups without Reflector, but no differences were found for the other variables. This may be because Reflector’s questions stress the individual contribution to group functioning and, thus, make group members especially aware of their influence rather than
their friendliness, reliability or other characteristics. How one contributes (i.e., amount and value) can possibly be perceived as an influence on group functioning. For the peer assessment scores, a two-way ANOVA did not show the expected main effects of both Radar and Reflector on Influence, Friendliness, Cooperativeness, Reliability, and Productivity at T3. This might be due to a tendency for all group members to assess their peers more positively than they normally would, because of the high level of group satisfaction based upon successful task completion. This is in line with Locke and Latham (1990), who found that group satisfaction is strongly related to group performance. This would also explain the increase of self assessment scores and peer assessment scores at T3 as compared to T2. The fourth and fifth aims of this study were to examine the effects of Radar and/or Reflector on students’ perceived team development, group satisfaction, level of group conflict, attitude towards collaborative problem solving, and the grade given for their essay. As expected, main effects were found for Radar on team development, group conflict, and attitude towards collaborative problem solving. However, no effects were found for group satisfaction and grade. The lack of a significant main effect for Radar on group satisfaction is probably due to the short period of time in which the groups had to collaborate in order to accomplish the task. Deadlines and the task at hand can influence group development (Gersick, 1988). The short amount of time could ‘force’ group members to fulfill a role or task in which they do not feel comfortable or satisfied. Changing circumstances, such as desired role changes or a disappointing level of task accomplishment, may cause group development to revert to the stage of storming (Bales & Cohen, 1979), which can be contentious, unpleasant, and even painful to group members who do not like conflicts (Tuckman & Jensen, 1977). In the same vein, the period of time may have been too short to find effects of the tools on cognitive performance. Therefore, further studies will examine the effects of Radar and Reflector during a longer period (e.g., three months) during which students collaborate on a complex learning task. Several limitations of this study should be kept in mind. First, the statistical power of this study is low because of the relatively small sample size (N = 39). However, even with this small sample, significant main effects were found for Radar on team development, level of group conflict and attitude towards collaborative problem solving. Second, in this study Radar is both an intervention and a measurement tool for the dependent variables (e.g., Influence, Friendliness). Therefore, the design did not allow us to determine whether the decrease of self assessment and peer assessment scores halfway through the collaboration at T2 was caused by Radar or Reflector, or whether this also occurred in the control group. In future studies, an extra control group will be added for which Radar will become available at T2. Although the effects of this study are mainly ascribed to Radar, we still assume that a combination of Radar and Reflector will be most effective.
An explanation for why no significant main effects of Reflector on social group performance were found could be that Reflector focused here on past and present group functioning and not on future functioning, which might have resulted in superficial reflections that do not take future group behavior into account. In further studies, the Reflector will also focus on future group functioning. That is, it will also stimulate group members to formulate plans and set goals for improving social and cognitive group performance. Research has shown, for example, that outcome feedback can increase individual and group performance, especially when it is combined with goal setting (Mento, Steel, & Karren, 1987; Neubert, 1998; Tubbs, 1986). There is no reason that this should not also be the case for process feedback. In conclusion, the effects of Radar on group functioning are very promising. They show that social group performance in CSCL environments, such as team development, level of group
conflicts, and attitude towards collaborative problem solving, can be enhanced by adding this easy-to-complete and easy-to-interpret peer assessment tool. For Reflector, it was argued that the focus of the questions should be directed towards future group performance and goal setting.
5. Half-time Awareness in a CSCL environment²
Abstract This study examined to what extent half-time awareness (i.e., receiving Radar and Reflector halfway through the collaboration process) affects the group’s social and cognitive performance. Compared to the previously described study (see chapter 4), in this study, Radar’s five traits were complemented with a sixth trait, namely quality of contribution. Also, reflection prompts aimed at future group functioning were added to Reflector. The underlying assumption was that group performance would be positively influenced by making group members aware of the social and cognitive behavior of themselves, their peers, and the group as a whole. Participants were 108 sophomore Dutch high school students working in dyads, triads and groups of four on a collaborative writing task, with or without the tools. Results demonstrate that awareness stimulated by the peer feedback and reflection tools enhances group-process satisfaction and social performance of CSCL-groups.
5.1 Introduction
Collaborative learning supported by computer networks (computer supported collaborative learning; CSCL), while enjoying considerable interest at all levels of education (Strijbos, Kirschner, & Martens, 2004), is sometimes hampered by social problems that arise between team members (Lipponen, Rahikainen, Lallimo, & Hakkarainen, 2003; Hobman, Bordia, Irmer, & Chang, 2002). To address these problems, self and peer assessments can be used to provide useful feedback on group functioning, as they allow group members to better judge their own and others’ behavior and contributions to the group and thus avoid the social problems often encountered (Dochy, Segers, & Sluijsmans, 1999). Though positive effects of self and peer assessments have been reported (e.g., Cutler & Price, 1995; McDowell, 1995; Phielix, Prins, & Kirschner, 2010), they are also prone to biases. Saavedra and Kwun (1993) found, for example, that group members tend to overestimate their own performance, which affects their peer assessments. To counteract this, Dochy et al. propose that a combination of peer feedback and reflection could enhance the validity of self and peer assessments, and possibly also enhance behavioral change (e.g., Prins, Sluijsmans, & Kirschner, 2006). In the study described here, a CSCL-environment was augmented with a peer assessment tool (Radar) and a reflection tool (Reflector) to help make group members better aware of individual and group behavior and stimulate them to set goals and formulate plans for improving the group’s social and cognitive performance. The study has two main goals, namely to examine the effects of these tools on perceived social and cognitive behavior and to examine social and cognitive group performance (i.e., team development, group process satisfaction, level of group conflict, attitude towards collaborative problem solving, and the quality of the group’s product).
²Based on Phielix, C., Prins, F. J., Kirschner, P. A., Erkens, G., & Jaspers, J. (2011). Group awareness of social and cognitive performance in a CSCL environment: Effects of a peer feedback and reflection tool. Computers in Human Behavior, 27, 1087–1102.
5.1.1 Redesigning the self and peer assessment tool
In our first study, as described in chapter 4, we developed a self and peer assessment tool (Radar) to help group members become better aware of their individual and group behavior by providing them with information about the social and cognitive behavior of themselves, their peers, and the group as a whole. In that study, Radar provided information on five traits deemed important for assessing behavior in groups. Four traits were related to social behavior, namely influence, friendliness, cooperation, and reliability. The last trait was related to cognitive behavior: productivity. These traits are derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels, 2006; Kenny, 1994; Salas, Sims, & Burke, 2005). However, in this follow-up study the tool is complemented with a sixth trait that also represents cognitive or task-related behavior: quality of contribution. This trait was selected because research has shown that group members monitor the quantity and quality of fellow group members’ performances (e.g., Salas et al., 2005).
5.1.2 Redesigning the reflection tool
In the first study, as described in chapter 4, a shared reflection tool (Reflector) was developed to stimulate group members to reflect on their own past and present performances, as well as that of the group as a whole, to enhance group performance. In that first study, no significant main effects were found for Reflector on group performance. This was ascribed to the fact that the tool was not focused on future functioning and goal setting (Hattie & Timperley, 2007; Mento, Steel, & Karren, 1987; Neubert, 1998; Tubbs, 1986). For feedback to be effective, the receiver needs to answer three major questions: (1) Where am I going? / What are the goals? (feed up), (2) How am I going? / What progress is being made toward the goal? (feed back), and (3) Where to next? / What activities need to be undertaken to make better progress? (feed forward) (Hattie & Timperley, 2007). Therefore, in this follow-up study, Reflector was redesigned to make group members better aware of their individual and group behavior, and to stimulate them to set goals and formulate plans to enhance social and cognitive group performance. Group members using Reflector individually reflect and provide information on (1) their own perspective on their personal performance (feed up), (2) differences between their self perception and the perception of their peers concerning their personal performance (feed back), (3) whether they agree with those perceptions (feed back), and (4) their individual perspective on group performance (feed up). Because group performance is determined by the individual effort of all group members, Reflector also (5) stimulates group members to collaboratively reflect (i.e., co-reflect) on group performance and reach a shared conclusion on this (feed back). Based on their shared conclusion, group members (6) set goals to improve group performance (feed forward). Co-reflection is defined as “a collaborative critical thinking process involving cognitive and affective interactions between two or more individuals who explore their experiences in order to reach new intersubjective understandings and appreciations” (Yukawa, 2006, p. 206).
5.2 Research Questions
This study investigated whether a peer assessment tool and a reflection tool would enhance social and cognitive group performance in a CSCL-environment. To this end, an existing CSCL-environment was augmented with two independent, but complementary, tools. The first was an individualized peer assessment tool - Radar - to provide group members with information about the social and cognitive behavior of themselves, their peers, and the group as a whole. The second
was a shared reflection tool - Reflector - to stimulate group members to co-reflect on social and cognitive group performance and to set goals and formulate plans to enhance group performance. In the first experimental condition, group members used Radar at the start of the collaboration (T1), and Radar and Reflector halfway through (T2) and at the end (T3). In order to examine the effect of the tools halfway through the collaboration process, in the second experimental condition, group members used Radar and Reflector halfway through the collaboration (T2) and at the end (T3). In the control condition, the group members used the tools only at the end (T3). The following research questions were addressed:
1. Do group members in condition 1 (tools at T1, T2, and T3) perceive that there is less social and cognitive behavior at the second assessment (T2) compared to their perceptions of that behavior at the first assessment (T1)? Peer feedback should make group members aware of unrealistically positive self and peer perceptions, resulting in more realistic perceptions of their functioning and, thus, a decrease of self and peer assessment scores at a subsequent assessment (e.g., Phielix et al., 2010). In other words, group members in condition 1 should exhibit lower self assessment and peer assessment scores at T2 as compared to T1.
2. Do group members in condition 1 (tools at T1, T2, and T3) perceive that there is less social and cognitive behavior halfway through (T2) than group members in condition 2 (tools at T2 and T3) who used Radar for the first time at T2? Peer feedback provided by Radar should make group members aware of their unrealistically positive self and peer perceptions (Phielix et al., 2010), resulting in more realistic perceptions of their functioning and, thus, in a decrease in self and peer assessment scores at a subsequent assessment. In other words, group members in condition 1 should exhibit lower self and peer assessment scores at T2 than group members in condition 2.
3. Do group members in condition 1 (tools at T1, T2, and T3) and condition 2 (tools at T2 and T3) perceive that there is more social and cognitive behavior at the end of the collaboration process (T3) compared to halfway through (T2) and, for condition 1, the beginning (T1)? Information from Radar and Reflector halfway through the task (T2) stimulates group members to set goals to improve their own and the group’s social and cognitive performance. In other words, group members using Radar and Reflector at T2 should exhibit higher self assessment and peer assessment scores at the end (T3) compared to scores halfway through (T2) and, for condition 1, at the beginning (T1).
4. Do group members in condition 1 (tools at T1, T2, and T3) perceive that there is better social and cognitive behavior than group members in condition 2 (tools at T2 and T3) and condition 3 (tools at T3)? Information from Radar and Reflector stimulates group members to set goals to improve their own and the group’s social and cognitive performance. In other words, group members in condition 1 should perceive the social and cognitive behavior at T3 to be better than group members in conditions 2 and 3. Also, group members in condition 2 should perceive social and cognitive behavior at T3 to be better than group members in condition 3.
5. Do group members using Radar and Reflector show more congruence between self and peer assessment scores at a subsequent assessment? Group members need time to adjust their unrealistically positive self perceptions, and thus non-significant or small correlations should be found between self and peer assessments after the first completion of Radar, but significant and higher correlations should be found at a subsequent assessment. Also, significant differences should be found between self and peer assessments after the first completion of Radar, and these differences should become non-significant or smaller at a subsequent assessment.
6. Do group members using Reflector set goals and formulate plans to enhance social and cognitive group performance? Hypothesis: the reflective questions in Reflector stimulate group members to co-reflect on social and cognitive performance and to set goals and formulate plans to enhance social and cognitive group performance.
7. Do group members in condition 1 (tools at T1, T2, and T3) perceive that there is higher social performance (i.e., better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving) at T3 than group members in conditions 2 (tools at T2 and T3) and 3 (tools at T3)? Radar and Reflector should positively affect social behavior in the group, leading to increased social performance of the group. In other words, group members in condition 1 should perceive higher social performance at T3 than group members in conditions 2 and 3. Also, group members in condition 2 should perceive higher social performance at T3 than group members in condition 3.
8. Do groups in condition 1 (tools at T1, T2, and T3) exhibit higher cognitive performance (i.e., produce higher quality group products) at T3 than groups in conditions 2 (tools at T2 and T3) and 3 (tools at T3)? Radar and Reflector, by positively affecting social behavior in the group, should indirectly increase cognitive performance of the group. In other words, groups in condition 1 should exhibit higher cognitive performance at T3 than groups in conditions 2 and 3. Also, groups in condition 2 should exhibit higher cognitive performance at T3 than groups in condition 3.
5.3 Method and Instrumentation
5.3.1 Participants
Participants were 108 sophomore Dutch high school students (58 male, 50 female) in four classes with an average age of 16 (M = 15.8, SD = .50, Min = 15, Max = 18). Prior to the experiment, they were randomly assigned by the teacher to dyads (n = 16), triads (n = 84) and groups of four (n = 8), and randomly assigned by the researcher to one of three conditions (see Design). Groups were heterogeneous in ability and gender.
5.3.2 Design
For this study two experimental conditions and one control condition were used (Table 5.1). The first experimental condition (n = 59) received Radar at the beginning (T1) and both Radar and Reflector halfway
through (T2) and at the end (T3) of the collaboration process. The second experimental condition (n = 23) received these tools halfway through (T2) and at the end (T3) of the collaboration. The control condition (n = 26) did not receive the tools during collaboration, but completed them at the end of the collaboration (T3) as measurement instruments.

Table 5.1 Design Overview

Condition | T1 – beginning | T2 – halfway | T3 – end
1 – Tools at T1, T2 and T3 (n = 59) | Radar | Radar & Reflector | Radar, Reflector & Questionnaire
2 – Tools at T2 and T3 (n = 23) | – | Radar & Reflector | Radar, Reflector & Questionnaire
3 – Tools at T3 (n = 26) | – | – | Radar, Reflector & Questionnaire
5.3.3 Measures
Social behavior. Perceived social behavior was measured by the self and peer assessments in Radar on four variables, namely ‘influence’, ‘friendliness’, ‘cooperation’ and ‘reliability’ (see Table 5.2). These variables are rated on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high). To simplify data analysis, the ratings are transformed to a scale from 0 to 100 by multiplying the ratings (0-4) by 25.

Table 5.2 Social and Cognitive Behavior as Measured by Radar

Scale | Subscales | Balloon text / description
Social behavior | Influence | A high score on influence means that this person has a big influence on: other group members; what happens in the group; the structure and content of the group’s product.
Social behavior | Friendliness | A high score on friendliness means that this person: is friendly and helpful; makes a positive contribution to the group atmosphere; responds in a friendly and helpful way to questions, suggestions and ideas of others.
Social behavior | Cooperation | A high score on cooperation means that this person: collaborates well in the group; is willing to take over tasks of others; takes initiative; communicates well; tries to think and cooperate in finding solutions for problems that occur.
Social behavior | Reliability | A high score on reliability means that this person: is reliable; keeps his/her word; does what he/she is supposed to do; finishes his/her task at the appointed time.
Cognitive behavior | Productivity | A high score on productivity means that this person: is productive; works hard; contributes substantially to problem solving; contributes substantially to the group’s product.
Cognitive behavior | Quality of Contribution | A high score on quality of contribution means that: his/her work is perceived as useful and good; he/she produces high quality work; he/she makes a positive contribution to the content and structure of the group’s product.
Cognitive behavior. Perceived cognitive behavior was measured by the self and peer assessments in Radar on the variables ‘productivity’ and ‘quality of contribution’ (see Table 5.2), rated on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high). The same transformation was carried out here.
Coding Scheme Output Co-reflection. To improve the social and cognitive performance of the group, group members reflect together (i.e., co-reflect) to set goals and formulate plans to improve their social and cognitive activities. Categories for the coding scheme were derived from studies on social interaction and coordination processes in CSCL and were added until there were no ‘rest categories’. Finally, two independent researchers coded and categorized the goals and plans into nine categories (see Table 5.3). Inter-rater reliability was substantial (Cohen’s Kappa = .79).

Table 5.3 Coding Scheme for Output Co-reflection: Specific Goals to Improve Group Performance

Label | Code | Description | Example
Communication | Com | Improve communication or teamwork | We have to improve our communication and discuss our teamwork more often.
Focusing on task | Focus | Improve concentration or focus on task | We’ll focus more on our work.
Task coordination | Task | Improve coordination, task- or role planning | We’ll divide the tasks more effectively. Let’s make clear who does what.
Planning | Plan | Improve time planning | We’ll set deadlines and improve our time planning.
Monitoring | Mon | Improve peer monitoring | We’ll monitor each other’s progression.
Friendliness | Friend | Improve friendliness towards each other | We shouldn’t be so unfriendly towards each other.
Productivity | Prod | Improve productivity | We’ll increase our productivity and participate more equally.
Quality | Qual | Improve quality of work | We’ll improve the quality of our work.
No suggestions | None | No suggestions for improvement | We have no suggestions for improvement.
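As an illustration of how the inter-rater agreement reported above can be computed, the sketch below uses Cohen’s kappa. The category labels follow Table 5.3, but the two raters’ codings shown here are invented for illustration; they are not the codings from this study.

# Illustration only: Cohen's kappa for two raters coding co-reflection statements.
from sklearn.metrics import cohen_kappa_score

CODES = ["Com", "Focus", "Task", "Plan", "Mon", "Friend", "Prod", "Qual", "None"]

# Hypothetical codings of the same ten statements by two independent raters.
rater_1 = ["Com", "Plan", "Task", "Prod", "None", "Friend", "Com", "Qual", "Mon", "Focus"]
rater_2 = ["Com", "Plan", "Task", "Prod", "None", "Friend", "Task", "Qual", "Mon", "Focus"]

kappa = cohen_kappa_score(rater_1, rater_2, labels=CODES)
print(f"Cohen's kappa = {kappa:.2f}")  # chance-corrected agreement between the raters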
The first three categories are communication, focusing on task and task coordination, activities which are crucial for successful collaboration (Barron, 2003; Erkens, Jaspers, Prangsma, & Kanselaar, 2005; Slof, Erkens, Kirschner, Jaspers, & Janssen, 2010). Furthermore, students need to carry out meta-cognitive activities such as planning and monitoring to employ a proper problem-solving strategy and reflect on its suitability (Lazonder & Rouet, 2008; Narciss, Proske, & Koerndle, 2007). Students also must develop positive affective relationships with each other (Kreijns, Kirschner, & Jochems, 2003); thus friendliness is the sixth category. Productivity and quality are the seventh and eighth categories because in effective groups, group members mutually depend on the willingness, effort and participation of their peers (Janssen, Erkens, Kanselaar, & Jaspers, 2007; Karau & Williams, 1993; Williams, Harkins, & Latané, 1981). The category no suggestions is added for students who do not have any suggestions to improve their performance. Social performance. Four previously tested and validated scales were used to measure team development (α = .92, 10 items), group-process satisfaction (α = .76, 6 items, both from Savicki, Kelley, & Lingenfelter, 1996), intra-group conflicts (α = .92, 7 items, from Saavedra, Early, & Van Dyne, 1993), and attitude towards collaborative problem solving (α = .81, 7 items, from
Clarebout, Elen, & Lowyck, 1999). These scales were translated into Dutch and transformed into 5-point Likert scales (1 = totally disagree, 5 = totally agree; see Table 3.1) by Strijbos, Martens, Jochems, and Broers (2007). The Team Development scale provides information on the perceived level of group cohesion. The Group-process Satisfaction scale provides information on the perceived satisfaction with general group functioning. The Intra-group Conflicts scale provides information on the perceived level of conflict between group members. The Attitude towards Collaborative Problem Solving scale provides information on the perceived level of group effectiveness and how group members felt about working and solving problems in a group. The 30 items in the four scales were subjected to principal component analysis. Prior to performing this analysis, the suitability of the data for factor analysis was assessed. Inspection of the correlation matrix showed that all coefficients were .5 and higher. The Kaiser-Meyer-Olkin value was .73, exceeding the recommended value of .6, and Bartlett’s Test of Sphericity reached statistical significance, supporting the factorability of the correlation matrix. The analysis revealed the presence of one main component with an eigenvalue exceeding 1, explaining 76.6% of the variance. Cronbach’s alpha of the composed ‘Social Performance (total)’ scale was .90.

Table 5.4 Examples of Social Performance Scales and their reliabilities in this study

Scale | k | Example | Cronbach’s α
Team Development | 10 | Group members contribute ideas and solutions to problems. | .77
Group-process Satisfaction | 6 | I felt that my group worked very hard together to solve this problem. | .71
Intra-group Conflicts | 7 | I found myself unhappy and in conflict with members of my group. | .84
Attitude towards Collaborative Problem Solving | 7 | Collaborating in a group is challenging. | .74
Social performance (total) | 30 | See all items of four scales stated above. | .90
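To illustrate the scale checks described above, the sketch below computes Cronbach’s alpha from the item variances and inspects the eigenvalues of the item correlation matrix, which underlie a principal component analysis. The data file and the assumption that the first ten columns form the Team Development scale are hypothetical; the KMO and Bartlett statistics reported in the text are not reproduced here.

# Illustrative sketch; the item file and column ordering are assumptions.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the scale total)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

items = pd.read_csv("social_performance_items.csv")  # assumed: respondents x 30 Likert items

alpha_total = cronbach_alpha(items)              # composed 'Social Performance (total)' scale
alpha_team = cronbach_alpha(items.iloc[:, :10])  # assuming the first 10 items are Team Development

# Eigenvalues of the correlation matrix; components with eigenvalues above 1 (Kaiser criterion).
eigenvalues = np.sort(np.linalg.eigvalsh(items.corr().values))[::-1]
explained = eigenvalues / eigenvalues.sum()
print(f"alpha (total) = {alpha_total:.2f}; first component explains {explained[0]:.1%} of the variance")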
Cognitive performance. The grade given to each group’s collaborative writing task (i.e., an essay) was used as a measure of cognitive performance.
5.3.4 Task and procedure
Students collaborated in dyads and groups of three or four on a collaborative writing task in sociology. Every student worked at a computer. Each group had to write an essay on a highly relevant current-events topic. Prior to this collaborative writing task, students collaborated for one month choosing the topic, searching for relevant sources of information, writing a short paper and giving a presentation. Thus, all information needed to write the essay was available for all groups. The collaborative writing task consisted of three 45-minute sessions over a period of one week. The groups collaborated in a CSCL-environment called Virtual Collaborative Research Institute (Jaspers, Broeken, & Erkens, 2004), a groupware program that supports collaborative learning on research projects and inquiry tasks (see Instruments section). Students were instructed to use the environment to communicate with other group members and to make complete use of the tools for peer feedback and reflection when the experimental condition allowed this. Students received content information and definitions regarding the six traits on which they had to assess themselves and their peers. Students were told that they had three lessons to complete the task,
that it would be graded by their teacher, and that it would affect their grade for the course. The introduction to the task stressed the importance of working together as a group and pointed out that each individual group member was responsible for the successful completion of the group task. To successfully complete the task, all group members had to participate. While groups with the tools used them, groups without them worked on the collaborative writing task. Time-on-task (i.e., writing the essay) was equal for all conditions. At the end of the final session (T3), both tools were made available for all conditions so that all participants could assess their peers and reflect on their behaviors. Finally, all participants completed a 30-item questionnaire on the social performance of the group.
5.3.5 Instruments
Virtual Collaborative Research Institute (VCRI). The Virtual Collaborative Research Institute (VCRI) is a groupware program that supports collaborative working and learning on research projects and inquiry tasks (Jaspers, Broeken, & Erkens, 2004). VCRI contains more than 10 different tools, but only 5 were used for this experiment (see Figure 5.1). Co-Writer (top left) is a shared word-processor for writing a group text. Using Co-Writer, students can simultaneously work on different parts of their texts. The Chat tool (top center) is used for synchronous communication. The chat history is automatically stored and can be re-read by participants at any time.
Figure 5.1 Screenshot of VCRI with the five tools used in this experiment.
Notes (bottom right) is a note pad which allows the user to make notes and to copy and paste selected information. Radar for peer feedback (bottom left) and Reflector for reflection (top right) will be described in the following sections. Windows of the available tools are automatically arranged on the screen when students log on to the VCRI. Peer assessment tool (Radar). VCRI was augmented with a peer assessment tool for eliciting information on group members’ social and cognitive behavior. This information is visualized in a radar diagram; therefore the peer assessment tool is named Radar (see Figure 5.2). Radar provides users with anonymous information on how their cognitive and social behaviors are perceived by themselves, their peers, and the group as a whole with respect to specific traits found to tacitly affect how one ‘rates’ others (see den Brok, Brekelmans, & Wubbels, 2006). Radar provides information on six traits deemed important for assessing behavior in groups. Four are related to social or interpersonal behavior, namely (1) influence; (2) friendliness; (3) cooperation; (4) reliability; and two are related to cognitive behavior, namely (5) productivity and (6) quality of contribution. These traits are derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels; Kenny, 1994; Salas, Sims, & Burke, 2005). Influence is directly derived from Wubbels, Créton, and Hooymayers’ (1985) influence dimension (i.e., dominance vs. submissiveness) in their model for interpersonal teacher behavior. This dimension is also used by Bales (1988) and represents the prominence, status, power, and personal influence that the individual is seen to have in relation to other group members. The variable is labeled ‘influence’, and not ‘dominance’ or ‘submissive’, because the latter two can be perceived as negative traits. Friendliness is one of the eight behavior categories from Wubbels et al.’s (1985) model for interpersonal teacher behavior. Bales (1988) used a similar dimension (i.e., friendliness vs. unfriendliness). Bales and Cohen (1979) defined this as the extent to which individual members are friendly with and respectful to each other. Cooperation, which denotes the degree to which someone is willing to work with others, is derived directly from Wubbels et al.’s (1985) dimension Proximity (i.e., opposition vs. cooperation). They defined proximity as the property of being close together, or in group settings as the feeling of being a group (i.e., group cohesiveness). Reliability is a trait reflecting ‘trust’, which has been identified as an important precursor for successful collaboration, in face-to-face teams (Castleton_Partners/TCO, 2007) and in CSCL (Jarvenpaa & Leidner, 1999). According to Emans, Koopman, Rutte, and Steensma (1996), trust can be seen as the cognitive and affective assurance of group members that they respect each other’s interests and, therefore, can orient themselves towards each other’s words, actions, and decisions with an easy conscience. Productivity and Quality of contribution are the extent to which individual group members contribute quantitatively and qualitatively to tasks or duties central to group performance or group efficiency.
These last two traits, representing cognitive or task-related behavior, were selected because research has shown that group members (1) do not always participate equally (Karau & Williams, 1993), and (2) monitor the performance (i.e., quantity and quality) of other group members (Salas et al., 2005). In Radar, group members are both assessors and assessees. As assessor, a student selects the peers in the group to be assessed, whose profiles then appear as dotted lines in the center circle of the radar diagram. Each group member is represented by a specific color. Assessors rate
themselves and all other group members on each of the six subscales (i.e., traits), which are divided into 41 points of assessment ranging from 0 to 4. For example, a student can rate his/her peer 3.2 for friendliness. To simplify data analysis, ratings were transformed into integers on a 100-point scale by multiplying them by 25; a rating of 3.2 was thus saved in the database as 80 points (3.2 × 25) on the 100-point scale. Care was taken to ensure that all assessors used the same definition of the six traits. Prior to the experiment, the researcher pointed out to participants that text balloons with content information and definitions would appear when they moved the cursor over one of the traits in the tool. For example, when the assessor moves the cursor over 'influence', a balloon pops up with the text 'A high score on influence means that this person has an influence on what happens in the group, on the behavior of other group members, and on the form and content of the group product (the essay)'.
Figure 5.2 Radar – Input screen
Figure 5.3 Radar – Group information
Ratings are automatically saved in a database. For groups of 3 and 4 members, the assessment is anonymous. Group members can see the assessments of the other group members, but not who entered the data. To stimulate students to complete Radar, they can only gain access to the individual and average assessments of their peers after they have completed the assessment themselves. When all group members have completed their self and peer assessments, two modified radar diagrams become available. The first - Information about yourself - shows the output of the self assessment (e.g., Chris about Chris) along with the average scores of the peer assessments of her/him (e.g., Group about Chris). The self assessment is not taken into account for computing the average scores. To provide more information about the variance in the average score of their peer assessment, students can also choose to view the individual peer assessments about their own behavior (e.g., Group members about Chris). The second - Information about the group (see Figure 5.3) - represents the average scores of the group members, so that group members can get a general impression about the functioning of the group. All group members are represented as a solid line in the diagram, each with a different color. Participants can include or exclude group members from the diagram by clicking a name in the legend. It is assumed that information (peer feedback) from Radar makes group members aware of the differences between their intended behavior (measured by self assessment) and how this behavior is perceived by their peers (measured by peer assessment). It is also assumed that every
group member will be stimulated to improve his/her social and cognitive behavior, knowing that (1) every group member will be assessed by his/her peers, and (2) these scores will be shared (anonymously) within the group. Therefore, it is expected that group members using Radar throughout will exhibit higher self and peer assessment scores on all six traits at the end compared to halfway through and the beginning.
Reflection tool (Reflector). VCRI was also augmented with a reflection tool (Reflector) for stimulating group members to co-reflect on their individual behavior and overall group performance. This tool contained six reflective questions:
1. What is your opinion of how you functioned in the group? Give arguments to support this.
2. What differences do you see between the assessment that you received from your peers and your self assessment?
3. Why do or don't you agree with your peers concerning your assessment?
4. What is your opinion of how the group is functioning? Give arguments to support this.
5. What does the group think about its functioning in general? Discuss and formulate a conclusion shared by all the group members.
6. Set specific goals (who, what, when) to improve group performance.
The first four questions are completed in the Reflector, with completion indicated by clicking an 'Add' button. This allows students to share their answers with the rest of the group and to see the answers of the others. Students can only gain access to their peers' answers after they have added their own, so as not to influence one another. The last two questions are completed in Co-Writer, in a specific section named Co-Reflection, which allows the group to write a shared conclusion and formulate shared goals. Responses made by the students in the Reflector are not scored or evaluated.
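To make the mechanics described above concrete, the sketch below shows how Radar-style ratings could be stored and summarized. It is a minimal, hypothetical Python illustration, not the actual VCRI/Radar implementation; the function and variable names (TRAITS, to_hundred_point, peer_average) and the example data are assumptions made for this sketch.

```python
# Hypothetical sketch (not the original Radar code): convert 0-4 ratings to the
# 100-point scale and compute the average received peer rating per trait,
# excluding the self assessment (the "Group about Chris" view).

TRAITS = ["influence", "friendliness", "cooperation",
          "reliability", "productivity", "quality"]

def to_hundred_point(rating: float) -> int:
    """Convert a 0-4 rating (41 scale points in steps of 0.1) to 0-100."""
    if not 0.0 <= rating <= 4.0:
        raise ValueError("rating must lie between 0 and 4")
    return round(rating * 25)  # e.g., 3.2 -> 80

def peer_average(ratings: list[dict], assessee: str) -> dict:
    """Average the received peer ratings per trait, leaving out the self rating."""
    received = [r for r in ratings
                if r["assessee"] == assessee and r["assessor"] != assessee]
    return {t: sum(r[t] for r in received) / len(received) for t in TRAITS}

# Example: two peers rate 'Chris'; his own rating is excluded from the average.
ratings = [
    {"assessor": "Chris", "assessee": "Chris", **{t: to_hundred_point(3.6) for t in TRAITS}},
    {"assessor": "Anna",  "assessee": "Chris", **{t: to_hundred_point(3.2) for t in TRAITS}},
    {"assessor": "Bram",  "assessee": "Chris", **{t: to_hundred_point(2.8) for t in TRAITS}},
]
print(peer_average(ratings, "Chris"))  # {'influence': 75.0, ...}
```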
5.3.6 Data analyses
To examine whether group members in condition 1 (tools at T1, T2, and T3) exhibit lower self assessment and peer assessment scores halfway (T2) compared to the beginning (T1), a one-way repeated measures ANOVA with dependent variables related to perceived social and cognitive behavior (i.e., influence, friendliness, cooperation, reliability, productivity, and quality of contribution) is used to compare self assessment and peer assessment scores at T1, T2 and T3.
To examine whether group members in condition 1 (tools at T1, T2, and T3) exhibit lower self assessment and peer assessment scores at T2 than groups in condition 2 (tools at T2 and T3), intra-class correlations will be calculated to examine group effects, after which multilevel analyses will be carried out to examine the effect of the tools halfway through (at T2) with respect to perceived social and cognitive behavior as measured by Radar.
To examine whether group members using Radar and Reflector perceive more social and cognitive behavior at the end of the task compared to previous assessments, a one-way repeated measures ANOVA with dependent variables influence, friendliness, cooperation, reliability, productivity, and quality of contribution is used to compare self and peer assessment scores for condition 1 at T1, T2 and T3. This is followed by a paired samples t-test (one-tailed) to compare self and peer assessment scores for condition 2 at T2 and T3.
To examine whether group members in condition 1 (tools at T1, T2, and T3) perceive better social and cognitive behavior at T3 than their peers in conditions 2 (tools at T2 and T3) or 3
(control; tools at T3), intra-class correlations will be calculated to determine whether group effects exist. After this, multilevel analyses will be used to examine the effect of the tools at the end of the experiment (T3) with respect to perceived social and cognitive behavior as measured by Radar.
To examine whether group members using Radar and Reflector show more congruency between self and peer assessment scores at a subsequent assessment, a Pearson product-moment correlation coefficient is used to test congruency between self and peer assessments at T1, T2, and T3 with respect to perceived social and cognitive behavior. Additionally, an independent t-test (one-tailed) will be used to examine the differences between self and peer assessment scores per condition at T1, T2, and T3 with respect to perceived social and cognitive behavior.
To examine whether group members using Reflector set goals and formulate plans to enhance social and cognitive group performance, goals and plans were independently coded and categorized by two researchers. Mean frequencies per group of goals and plans will be presented per condition in a table.
To examine whether group members in condition 1 (tools at T1, T2 and T3) perceive higher social performance (i.e., better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving) than group members in conditions 2 (tools at T2, T3) and 3 (tools at T3), intra-class correlations will be calculated to examine group effects. Then, multilevel analysis will be used to examine the effect of condition on the dependent variables social performance, team development, group satisfaction, level of group conflicts, and attitude towards collaborative problem solving, as measured by the questionnaire at the end of the experiment.
Finally, to examine whether groups in condition 1 (tools at T1, T2 and T3) exhibit higher cognitive performance than groups in conditions 2 (tools at T2 and T3) and 3 (tools at T3), a one-way between-groups ANOVA (one-tailed) with planned comparisons is conducted with the grade on the essay as dependent variable.
Except where noted, tests were one-tailed. The rule of thumb (Cohen, 1988, pp. 284-287) for effect sizes (η²) was small ≥ .01, medium ≥ .06, and large ≥ .14.
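To show the shape of the analyses listed above, the sketch below illustrates how the intra-class correlation, the multilevel comparison of conditions, and the repeated measures ANOVA might be computed in Python with statsmodels. It is a hypothetical illustration only: the file names, column names (group_id, condition, student_id, occasion) and model specifications are assumptions, and it does not reproduce the software or settings actually used for the analyses reported in this chapter.

```python
# Hypothetical sketch of the analysis steps described above (illustrative data
# layout; not the software actually used for the reported results).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import AnovaRM

df = pd.read_csv("radar_t2.csv")  # assumed: one row per student at T2

# Intra-class correlation from a random-intercept ("null") model: the share of
# variance at the group level indicates whether multilevel analysis is needed.
null_model = smf.mixedlm("friendliness ~ 1", data=df, groups=df["group_id"]).fit()
between = float(null_model.cov_re.iloc[0, 0])  # group-level variance
within = null_model.scale                      # residual (student-level) variance
icc = between / (between + within)

# Multilevel model with students nested in groups, testing the condition effect.
model = smf.mixedlm("friendliness ~ C(condition)", data=df,
                    groups=df["group_id"]).fit()
print(icc, model.params, model.pvalues)

# Repeated measures ANOVA for condition 1 across T1-T3 (long format:
# one row per student per measurement occasion).
long_df = pd.read_csv("radar_condition1_long.csv")
print(AnovaRM(long_df, depvar="friendliness", subject="student_id",
              within=["occasion"]).fit())
```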
5.4 Results
5.4.1 Research question 1
Table 5.5 shows the mean scores and standard deviations of self assessment scores at T1, T2, and T3 for each condition.

Table 5.5 Means and Standard Deviations of Self Assessments per Condition
T  Condition  n   Influence M (SD)  Friendliness M (SD)  Cooperation M (SD)  Reliability M (SD)  Productivity M (SD)  Quality M (SD)
1  1          59  64.85 (13.90)     72.75 (15.96)        68.97 (17.19)       70.42 (18.01)       64.68 (13.82)        65.98 (13.75)
2  1          59  70.20 (16.76)     76.46 (16.54)        73.78 (20.97)       72.86 (18.05)       71.93 (18.28)        72.75 (14.78)
2  2          23  62.74 (14.37)     64.96 (17.85)        62.13 (14.02)       68.70 (16.51)       63.52 (11.04)        65.70 (10.02)
3  1          59  71.15 (15.97)     77.90 (16.66)        73.59 (19.18)       75.36 (16.74)       72.98 (16.82)        72.19 (15.97)
3  2          23  64.26 (9.89)      66.39 (17.12)        62.91 (16.15)       65.57 (14.87)       66.00 (10.47)        71.48 (11.61)
3  3          26  65.85 (13.34)     67.00 (16.69)        68.42 (16.90)       68.31 (14.75)       60.96 (12.70)        63.81 (11.38)
A one-way repeated measures ANOVA was conducted to compare self assessment and peer assessment scores for condition 1 at the beginning (T1), halfway through (T2) and at the end (T3),
with respect to perceived social and cognitive behavior. Unexpectedly, students in condition 1 (tools at T1, T2 and T3) exhibited significantly higher self assessment scores halfway through (T2) compared to the beginning (T1). There were significant effects for influence (mean difference = 5.36, 95% CI: 1.21 to 9.50), Wilks' Lambda = .79, F (2, 57) = 7.39, p < .001, partial η² = .21; productivity (mean difference = 7.25, 95% CI: 1.84 to 12.66), Wilks' Lambda = .81, F (2, 57) = 6.73, p < .005, partial η² = .19; and for quality of contribution (mean difference = 6.76, 95% CI: 2.39 to 11.14), Wilks' Lambda = .80, F (2, 57) = 7.13, p < .005, partial η² = .20. Group members that used the tools throughout perceived themselves halfway through as having more influence, being more productive, and making higher quality contributions.

Table 5.6 Means and Standard Deviations of Average Peer Assessments per Condition
T  Condition  n   Influence M (SD)  Friendliness M (SD)  Cooperation M (SD)  Reliability M (SD)  Productivity M (SD)  Quality M (SD)
1  1          59  65.52 (10.86)     75.90 (13.58)        69.02 (12.62)       69.79 (13.49)       64.10 (11.77)        68.40 (9.57)
2  1          59  71.82 (12.29)     77.81 (15.56)        74.04 (15.90)       72.19 (16.31)       73.53 (13.32)        73.33 (11.36)
2  2          23  63.35 (8.38)      67.54 (11.91)        61.57 (12.70)       69.93 (8.63)        64.98 (7.12)         65.43 (9.51)
3  1          59  71.36 (12.48)     77.29 (15.26)        73.09 (15.77)       74.96 (14.13)       74.31 (10.67)        73.76 (11.31)
3  2          23  64.91 (8.18)      70.35 (14.32)        64.98 (9.75)        67.30 (9.78)        70.43 (9.82)         68.67 (9.17)
3  3          26  66.62 (12.86)     72.73 (8.67)         68.88 (15.68)       68.52 (12.34)       65.13 (12.35)        68.31 (12.45)
Table 5.6 shows the mean scores and standard deviations of average peer assessment scores at T1, T2 and T3 per condition. Unexpectedly, students in condition 1 (tools at T1, T2 and T3) exhibited significantly higher average peer assessment scores halfway through (T2) compared to the beginning (T1). There were significant effects for influence (mean difference = 6.30, 95% CI: 2.39 to 10.20), Wilks' Lambda = .77, F (2, 57) = 8.38, p < .005, partial η² = .23; cooperation (mean difference = 5.02, 95% CI: .43 to 9.60), Wilks' Lambda = .89, F (2, 57) = 3.58, p < .05, partial η² = .11; productivity (mean difference = 9.43, 95% CI: 4.75 to 14.11), Wilks' Lambda = .63, F (2, 57) = 1.69, p < .0005, partial η² = .37; and quality of contribution (mean difference = 4.94, 95% CI: 1.65 to 8.21), Wilks' Lambda = .78, F (2, 57) = 7.90, p < .005, partial η² = .22. Group members that used the tools throughout perceived their peers halfway through as having more influence, being more cooperative and productive, and making higher quality contributions.
5.4.2 Research question 2 Intra-class correlations were calculated to examine group effects, followed by multilevel analysis to examine the effect of condition halfway through (at T2) with respect to perceived social and cognitive behavior as measured by the Radar (see Table 5.7). As significant intra-class correlations were found for friendliness, cooperation, and reliability, indicating that there is a group effect, multilevel analyses were used to examine the differences on these variables between conditions 1 and 2. The associated significant χ²-values indicate that conditions (tools) have an effect on how group members perceived their own social and cognitive behavior (e.g., influence, friendliness, etc.).
Table 5.7 Multilevel Analyses for Comparing Condition 1 and Condition 2 on Self Assessment Scores at T2
Scale                     Intra-class correlation r   Comparing condition 1 vs. 2 β   SE     Chi-square χ²
Influence                 -0.10                        7.46*                          3.92   8.15*
Friendliness               0.56***                    10.94*                          5.61   8.94*
Cooperation                0.29*                      11.98*                          5.53   9.77**
Reliability                0.36**                      3.67                           5.41   5.67
Productivity               0.14                        8.46*                          4.38   8.42*
Quality of Contribution    0.05                        7.03*                          3.50   8.25*
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
Unexpectedly, the significant β-values show that group members in condition 1 (tools at T1, T2 and T3) perceived themselves as having more influence, being more friendly, cooperative, and productive, and making higher quality contributions, than group members in condition 2 (tools at T2 and T3). No significant differences were found for reliability.

Table 5.8 Multilevel Analyses for Comparing Condition 1 and Condition 2 on Peer Assessment Scores at T2
Scale                     Intra-class correlation r   Comparing condition 1 vs. 2 β   SE     Chi-square χ²
Influence                  0.46***                     8.47*                          3.53   9.72**
Friendliness               0.70***                    10.37*                          5.31   9.14*
Cooperation                0.66***                    12.47*                          5.33   11.00**
Reliability                0.57***                     2.26                           4.97   5.26*
Productivity               0.59***                     8.55*                          4.06   9.10**
Quality of Contribution    0.53***                     7.90*                          3.53   9.61**
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
As significant intra-class correlations were found for all dependent variables, indicating that there is a group effect, multilevel analyses were carried out to examine the differences on these variables between conditions 1 and 2 (see Table 5.8). The associated significant χ²-values indicate that conditions (tools) have an effect on group members' perceived social and cognitive behavior (e.g., influence, friendliness, etc.). Unexpectedly, the significant β-values show that group members in condition 1 (tools at T1, T2 and T3) perceived more social behavior (i.e., influence, friendliness, cooperation) and cognitive behavior (i.e., productivity and quality of contribution) than group members in condition 2 (tools at T2 and T3). No significant differences were found for reliability.
5.4.3 Research question 3
A one-way repeated measures ANOVA was conducted to compare self assessment and peer assessment scores for condition 1 at the beginning (T1), halfway (T2) and at the end (T3), with respect to perceived social and cognitive behavior. See the results for research question 1 for the comparison of self and peer assessment scores at the beginning (T1) and halfway through (T2) for condition 1.
Group members in condition 1 (tools at T1, T2 and T3) exhibited significantly higher self assessment scores at the end (T3) compared to the beginning (T1). There were significant effects for reliability (mean difference = 4.93, 95% CI: .10 to 9.76), Wilks' Lambda = .89, F (2, 57) = 3.64, p < .05, partial η² = .11; productivity (mean difference = 8.30, 95% CI: 2.74 to 13.87), Wilks' Lambda = .81, F (2, 57) = 6.73, p < .005, partial η² = .19; and quality of contribution (mean difference = 6.20, 95% CI: 1.33 to 11.07), Wilks' Lambda = .80, F (2, 57) = 7.13, p < .005, partial η² = .20. No significant differences in self assessment scores were found comparing scores halfway (T2) with scores at the end (T3).
For peer assessment scores, groups in condition 1 (tools at T1, T2 and T3) exhibited significantly higher average peer assessment scores at the end (T3) compared to the beginning (T1). There were significant effects for influence (mean difference = 5.84, 95% CI: 1.72 to 9.95), Wilks' Lambda = .77, F (2, 57) = 8.38, p < .005, partial η² = .23; reliability (mean difference = 5.17, 95% CI: .57 to 9.77), Wilks' Lambda = .88, F (2, 57) = 3.88, p < .05, partial η² = .12; productivity (mean difference = 10.22, 95% CI: 5.92 to 14.51), Wilks' Lambda = .63, F (2, 57) = 1.69, p < .0005, partial η² = .37; and quality of contribution (mean difference = 5.36, 95% CI: 1.67 to 9.05), Wilks' Lambda = .78, F (2, 57) = 7.90, p < .005, partial η² = .22. No significant differences in average peer assessment scores were found comparing scores halfway (T2) with scores at the end (T3).
To compare self and peer assessment scores at T2 and T3 for condition 2, a paired samples t-test (one-tailed) was conducted. As expected, students in condition 2 (tools at T2 and T3) perceived a significantly higher quality of contribution at T3 (mean difference = 5.78, SD = 10.66, 95% CI: 1.17 to 10.39), t (22) = 2.60, p < .01. Significant differences were also found between peer assessment scores halfway (T2) and at the end (T3): as expected, students perceived significantly more productivity at T3 (mean difference = 5.46, SD = 9.43, 95% CI: 1.42 to 9.50), t (22) = 2.80, p < .01.
Table 5.9 Multilevel Analyses for Effects of Condition on Self Assessment Scores at T3
                                       Comparing condition 1 vs. 2   Comparing condition 1 vs. 3            Comparing condition 2 vs. 3
Scale                     rI           β         SE                  β        SE      Chi-square χ²         β        SE      Chi-square χ²
Influence                 0.13          6.89*    3.81                 5.30    3.65    13.10***              -1.59    4.37    4.88*
Friendliness              0.42**       11.51*    5.13                10.90*   4.91    17.16***              -0.61    5.15    5.10**
Cooperation               0.30**       11.17*    5.26                 5.17    5.04    14.80***              -5.51    5.51    6.35**
Reliability               0.33**        9.79*    4.71                 7.05    4.51    15.03***              -2.74    4.68    5.25*
Productivity              0.02          6.98*    3.59                12.02*   3.44    21.07***               5.04    3.45    6.41**
Quality of Contribution   0.09           .71     3.61                 8.38*   3.46    14.65***               7.67*   3.96    8.29**
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
5.4.4 Research question 4
In comparing self assessment scores for condition 1 (tools at T1, T2, and T3) with condition 2 (tools at T2 and T3) and condition 3 (tools at T3), significant intra-class correlations were found for friendliness, cooperation, and reliability, indicating a group effect (see Table 5.9). Multilevel analyses were, thus, used to examine the differences between the conditions. The associated significant χ²-values indicate that conditions (tools) have an effect on how group members perceived their own social and cognitive behavior (e.g., influence, friendliness, etc.). As expected, the significant β-values show that group members in condition 1 perceived themselves as having more influence, being friendlier, more cooperative, more productive, and making contributions of higher quality, than group members in condition 2. No significant differences were found for reliability.
Comparing conditions 1 and 3, group members in condition 1 perceived themselves as being more friendly, more productive, and making contributions of higher quality than group members in condition 3. No significant differences were found for influence, cooperation, and reliability.
Comparing self assessment scores for condition 2 (tools at T2 and T3) with condition 3 (tools at T3), the significant β-value (see Table 5.9) shows that group members in condition 2 perceived themselves as making contributions of higher quality than group members in condition 3. No other significant differences in self assessment scores were found between conditions 2 and 3.
Table 5.10 Multilevel Analyses for Effects of Condition on Peer Assessment Scores at T3
                                       Comparing condition 1 vs. 2   Comparing condition 1 vs. 3            Comparing condition 2 vs. 3
Scale                     rI           β         SE                  β        SE      Chi-square χ²         β        SE      Chi-square χ²
Influence                 0.34***       6.45*    3.53                 4.74    3.38    12.89***              -1.71    3.55    4.59*
Friendliness              0.74***       6.94     5.13                 4.56    4.91    12.43**               -2.38    5.30    5.51**
Cooperation               0.77***       8.11     5.63                 4.21    5.40    12.99***              -3.90    5.84    6.01**
Reliability               0.62***       7.66*    4.50                 6.44    4.31    13.82***              -1.22    4.19    4.82*
Productivity              0.27*         3.88     3.04                 9.18**  2.91    17.37***               5.30    3.36    6.72**
Quality of Contribution   0.35**        5.09     3.37                 5.45*   3.23    12.36**                0.36    3.79    4.48*
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
In comparing peer assessment scores for condition 1 (tools at T1, T2 and T3) with conditions 2 (tools at T2 and T3) and 3 (tools at T3), significant intra-class correlations indicated a group effect for all dependent variables (see Table 5.10). Multilevel analyses were, thus, used to examine the differences between the conditions. The associated significant χ²-values indicate that conditions (tools) have an effect on group members' perceived social and cognitive behavior (e.g., influence, friendliness, etc.). As expected, significant β-values show that group members in condition 1 perceived their peers as having more influence and being more reliable than group members in condition 2. Unexpectedly, no significant differences in peer assessment scores were found for friendliness, cooperation, productivity, or quality of contribution.
Comparing conditions 1 and 3, as expected, group members in condition 1 perceived more productivity and higher quality contributions than group members in condition 3. No significant differences were found for perceived social behavior (i.e., influence, friendliness, cooperation, and reliability). Finally, in comparing peer assessment scores for condition 2 (tools at T2 and T3) with condition 3 (tools at T3), no significant differences in peer assessment scores at T3 were found.
5.4.5 Research question 5 A Pearson product-moment correlation coefficient was used to test congruency between self and peer assessments at T1, T2, T3 with respect to perceived social and cognitive behavior. Preliminary analyses were performed to ensure no violation of the assumptions of normality, linearity and/or homoscedasticity. The rule of thumb (Cohen, 1988) for the strength of the correlation (r) was small = .10-.29, medium = .30-.49, and large = .50-1.0. Table 5.11 shows the Pearson correlations between average peer assessments and self assessments at T1, T2, and T3.
Table 5.11 Pearson Correlations between Average Peer Assessments and Self Assessments at T1, T2 and T3
Condition  T  n   Influence r (p)  Friendliness r (p)  Cooperation r (p)  Reliability r (p)  Productivity r (p)  Quality r (p)
1          1  59  .28 (.03*)       .27 (.04*)          .52 (.00**)        .28 (.03*)         .23 (.08)           .32 (.01*)
1          2  59  .62 (.00**)      .72 (.00**)         .54 (.00**)        .53 (.00**)        .62 (.00**)         .55 (.00**)
1          3  59  .69 (.00**)      .66 (.00**)         .59 (.00**)        .57 (.00**)        .34 (.01*)          .46 (.00**)
2          2  23  -.18 (.42)       .07 (.74)           .13 (.57)          .02 (.94)          .24 (.28)           .06 (.78)
2          3  23  .65 (.00**)      .47 (.02*)          .38 (.07)          .02 (.92)          .61 (.00**)         .52 (.01*)
3          3  26  .36 (.07)        -.02 (.91)          .30 (.14)          .14 (.51)          .01 (.97)           -.04 (.86)
As expected, the results in Table 5.11 show non-significant or relatively small correlations for self and peer assessment scores at group members' first assessment, and significant medium to large correlations at subsequent assessments. For students in condition 1 (tools at T1, T2 and T3), nearly all correlations between self and peer assessment scores are significantly positive, except for productivity at T1. Compared to T1, all correlations increased significantly at T2. Compared to T2, correlations for friendliness, productivity and quality of contribution show small decreases at T3, but remain significantly positive. Compared to T2, correlations for influence, cooperation and reliability increased at T3. Compared to the correlations at the beginning (T1), this indicates a higher convergence of self and peer assessments at the end of the collaboration process. For students in condition 2 (tools at T2 and T3), no significant correlations were found between self and peer assessment scores at T2. Compared to T2, significant correlations were found for influence, friendliness, productivity and quality of contribution at T3. For students in condition 3 (tools at T3), no significant correlations were found between self and peer assessment scores at T3.
An independent t-test (one-tailed) was used to examine differences between self and peer assessments at T1, T2 and T3 with respect to perceived social and cognitive behavior. Tables 5.5 and 5.6 show the mean scores and standard deviations of self and peer assessments per condition. No significant differences in self and peer assessments for condition 1 were found at T1, nor for conditions 1 and 2 at T2. At T3, students in condition 2 perceived their productivity significantly lower (mean difference = -4.43, SD = 9.02), t (22) = -2.36, p < .01, than their peers. No significant differences in self and peer assessments were found at T3 for conditions 1 and 3.
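For readers who want to see the shape of this congruency analysis, the following is a minimal, hypothetical sketch in Python using scipy; the input file and column names (self_score, peer_mean) are assumptions for illustration and do not describe the software actually used for the analyses reported here.

```python
# Illustrative sketch of the congruency check: correlate each student's self
# rating with the average rating received from peers, per trait and per
# measurement occasion. Column names and the input file are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("radar_long.csv")  # columns: condition, occasion, trait, self_score, peer_mean

for (condition, occasion, trait), sub in df.groupby(["condition", "occasion", "trait"]):
    r, p = stats.pearsonr(sub["self_score"], sub["peer_mean"])
    print(f"condition {condition}, T{occasion}, {trait}: r = {r:.2f}, p = {p:.3f}")
```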
5.4.6 Research question 6
It was expected that the reflective questions in Reflector would stimulate group members to (co-)reflect on social and cognitive performance, and to set goals and formulate plans to enhance the social and cognitive group performance. To examine whether groups using Reflector set goals and formulated plans to enhance social and cognitive group performance, their responses were independently coded and categorized by two researchers. Mean frequencies, percentages and cumulative percentages of plans and goals per group and per condition are presented in Table 5.12.
Table 5.12 Mean Frequencies, Percentages and Cumulative Percentages of Future Goals per Group per Condition
                   Condition 1: tools at T1, T2 and T3   Condition 2: tools at T2 and T3   Condition 3: tools at T3
                   (n groups = 20)                       (n groups = 8)                    (n groups = 9)
Label              Mean f   %       Cum. %               Mean f   %       Cum. %           Mean f   %       Cum. %
Communication      0.70     35.00   35.00                0.25     13.33   13.33            0.22     20.00   20.00
Task coordination  0.40     20.00   55.00                0.75     40.00   53.33            0.22     20.00   40.00
Focussing          0.20     10.00   65.00                0.13     6.67    60.00            0.22     20.00   60.00
Productivity       0.15     7.50    72.50                0.13     6.67    66.67            0.33     30.00   90.00
Quality            0.10     5.00    77.50                0.13     6.67    73.33            0.00     0.00    90.00
Planning           0.05     2.50    80.00                0.38     20.00   93.33            0.00     0.00    90.00
Friendliness       0.05     2.50    82.50                0.00     0.00    93.33            0.00     0.00    90.00
Monitoring         0.05     2.50    85.00                0.13     6.67    100.00           0.00     0.00    90.00
No suggestions     0.30     15.00   100.00               0.00     0.00    100.00           0.11     10.00   100.00
Goals per group    2.00     100.00                       1.88     100.00                   1.11     100.00
Groups in conditions 1 and 2 formulated twice as many plans and goals as groups in condition 3, since groups in conditions 1 and 2 completed Reflector twice, compared to once for condition 3. Groups in condition 1 (tools at T1, T2 and T3) formulated goals focused on improving communication, task coordination, and focus on the task. On average, groups in condition 1 formulated nearly three times as many goals focused on improving communication as groups in conditions 2 or 3. Groups in condition 2 (tools at T2 and T3) formulated goals that were focused on improving task coordination, planning and communication. Groups in condition 3 (tools at T3) formulated goals focused on improving productivity, communication, task coordination and focus on the task. On average, groups in condition 3 formulated three times as many goals focused on improving productivity as groups in conditions 1 or 2. The most often mentioned goals relate to improving task coordination, communication, productivity, and focus on the task. This indicates that groups formulated goals to achieve better teamwork, which can enhance their social and cognitive performance.
5.4.7 Research question 7
It was expected that groups in condition 1 would perceive higher social performance at T3 than groups in conditions 2 and 3, and that groups in condition 2 would perceive higher social performance at T3 than groups in condition 3. Table 5.13 shows the means and standard deviations for scores on social performance. To examine whether groups in condition 1 (tools at T1, T2, and T3) perceive higher social performance (i.e., better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving) than groups in condition 2 (tools at T2 and T3) and condition 3 (tools at T3), first, intra-class correlations were calculated to examine any group effects on the social performance scales. Second, multilevel analysis was used to examine the effect of condition on the social performance scales as measured by the questionnaire at the end of the experiment, after collaboration in the CSCL environment (VCRI). Table 5.14 shows intra-class correlations and multilevel analyses for effects of condition on social performance scales.
Table 5.13 Means and Standard Deviations for Scores on Social Performance Scales
                              Condition 1: tools at T1, T2 and T3   Condition 2: tools at T2 and T3   Condition 3: tools at T3
                              (n = 54)                              (n = 22)                          (n = 25)
Scale                         M       SD                            M       SD                        M       SD
Team development              4.09    0.56                          3.48    0.62                      3.80    0.37
Group-process satisfaction    4.04    0.52                          3.59    0.61                      3.71    0.59
Intra-group conflicts         1.95    0.58                          2.43    0.67                      2.23    0.50
Attitude                      3.85    0.50                          3.59    0.60                      3.65    0.55
Social Performance (total)    4.01    0.46                          3.56    0.56                      3.73    0.44
In comparing condition 1 (tools at T1, T2 and T3) with conditions 2 (tools at T2 and T3) and 3 (tools at T3) on perceived social performance, significant intra-class correlations were found for all social performance scales except attitude towards collaborative problem-solving, indicating that the group has an effect on the perceived social performance of individual group members. Multilevel analyses are, thus, needed to examine the effect of condition (tools) on perceived social performance.

Table 5.14 Intra-class Correlations and Multilevel Analyses for Effects of Condition on Social Performance Scales
                                            Comparing condition 1 vs. 2   Comparing condition 1 vs. 3            Comparing condition 2 vs. 3
Scale                         rI            β         SE                  β        SE      Chi-square χ²         β        SE      Chi-square χ²
Team development              .60***        .63***    .18                 .27      .18     7.84**                -.32     .18      2.01
Group-process satisfaction    .41**         .41**     .17                 .31*     .17     3.90                  -.13     .20     -1.00
Intra-group conflicts         .62***       -.49*      .21                -.27      .20     3.07                   .20     .22     -0.19
Attitude                      .03           .26*      .14                 .19      .13     -.07                  -.06     .17     -1.57
Social Performance (total)    .49***        .45**     .15                 .26*     .14     4.88*                 -.17     .17     -0.49
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
As expected, the significant β-values in Table 5.14 show that groups in condition 1 (tools at T1, T2 and T3) perceived their team as being better developed than groups in condition 2 (tools at T2 and T3). However, no significant differences were found for team development between condition 1 (tools at T1, T2 and T3) and condition 3 (tools at T3), as indicated by a non-significant β-value. As expected, the significant β-values show that groups in condition 1 (tools at T1, T2 and T3) experienced higher levels of group satisfaction than groups in condition 2 (tools at T2 and T3) and condition 3 (tools at T3). These effects should be interpreted with caution, as the associated χ²-value was only marginally significant (p = .07). As expected, groups in condition 1 (tools at T1, T2 and T3) experienced lower levels of conflict than groups in
condition 2 (tools at T2 and T3), but no significant differences were found between condition 1 and condition 3 (tools at T3), as indicated by a non-significant β-value. This effect should also be interpreted with caution, as the associated χ²-value was not significant (p = .11). As expected, groups in condition 1 (tools at T1, T2 and T3) had a significantly more positive attitude towards collaborative problem solving than groups in condition 2 (tools at T2 and T3), but no significant differences were found for attitude between condition 1 (tools at T1, T2 and T3) and condition 3 (tools at T3). As expected, groups in condition 1 experienced significantly higher social performance than groups in condition 2 and condition 3.
In comparing condition 2 (tools at T2 and T3) with condition 3 (tools at T3) on perceived social performance (see Table 5.14), no significant differences were found between conditions 2 and 3. The non-significant χ²-values indicate that condition (tools) had no effect on the perceived social performance for condition 2.

Table 5.15 Means and Standard Deviations for Cognitive Performance per Condition
                                         Cognitive performance (grade essay)
Condition                                n groups   M      SD     Min   Max
1 – tools available at T1, T2 and T3     20         6.81   .71    4.0   8.5
2 – tools available at T2 and T3         8          6.54   1.04   4.5   8.5
3 – tools available at T3                9          6.36   1.61   4.0   8.5
5.4.8 Research question 8
It was expected that groups in condition 1 would exhibit higher cognitive performance at T3 than groups in conditions 2 and 3, and that groups in condition 2 would exhibit higher cognitive performance at T3 than groups in control condition 3. A one-way between-groups ANOVA (one-tailed) with planned comparisons was conducted to explore the effect of Radar and Reflector on group cognitive performance, as measured by the grade given to their (group) essays. Table 5.15 shows means and standard deviations for group performance per condition. No significant effects were found.
5.5 Discussion and Conclusion
This study examined the effects of a peer assessment tool (Radar) and a reflection tool (Reflector) on (1) perceived social and cognitive behavior, and (2) social and cognitive performance of the group. Most of the expectations were met. Results showed that groups using the tools throughout perceived better social and cognitive behavior halfway through, showed more convergence between self and peer assessments, and reported higher social group performance, than groups not using the tools. Results did not show a decrease of self and peer assessment scores halfway through, indicating that there is no support that Radar and Reflector can help to reduce group members' unrealistically positive perceptions of self and peer performance. There was also no support that using Radar and Reflector indirectly leads to higher cognitive performance. Below, the eight research questions of this study are addressed, accompanied by results and explanations.
The first question addressed in this study was to examine whether group members in condition 1 (tools at T1, T2, and T3) perceive less social and cognitive behavior halfway (T2) compared to the beginning (T1). Based on Stroebe, Diehl, and Abakoumkin (1992) and the results of a previous study (Phielix et al., 2010), it was hypothesized that group members generally form
unrealistically positive perceptions of self performance and peer performance. Therefore, we expected that the peer feedback provided by Radar at the first assessment (T1) would make group members better aware of these perceptions, resulting in a decrease of self and peer assessment scores at the second assessment (T2). Unexpectedly, most self and peer assessment scores increased significantly at the second assessment (T2). Group members using the tools throughout perceived their peers halfway through as having more influence, being more cooperative and productive, and making higher quality contributions. Therefore, the hypothesis that Radar and Reflector can reduce unrealistic positive perceptions of self and peer performance was not supported by the data. A possible explanation could be that, in comparison to the previous study, these group members were used to collaborating with each other: prior to this experiment, group members had collaborated with each other for one month. It is likely that this earlier collaboration period caused students to have more realistic perceptions, and that these perceptions increased over time through the use of Radar and Reflector.
The second question addressed was to examine whether group members in condition 1 (tools at T1, T2, and T3) perceive less social and cognitive behavior than group members in condition 2 (tools at T2 and T3), who used Radar for the first time at T2. As stated above, it was hypothesized that information provided by Radar should make group members aware of their unrealistic positive self perceptions and peer perceptions (Hattie & Timperley, 2007; Phielix et al., 2010). Therefore it was expected that group members in condition 1 (using Radar for the second time) would exhibit lower self assessment and peer assessment scores at T2 than group members in condition 2 (using Radar for the first time). Unexpectedly, group members in condition 1 perceived more social behavior (i.e., influence, friendliness, cooperation) and cognitive behavior (i.e., productivity and quality of contribution) halfway through than group members in condition 2. As stated above, an explanation could be that group members in condition 1, who completed Radar for the second time, managed to improve their social and cognitive behavior over time, resulting in significantly higher self and peer assessment scores at T2 compared to T1.
The third question addressed was to examine whether groups in condition 1 (tools at T1, T2 and T3) perceive more social and cognitive behavior at the end (T3) compared to halfway (T2) and the beginning (T1). Based on Hattie and Timperley (2007) and results of a previous study (Phielix et al., 2010), it was hypothesized that information ('feed back' and 'feed forward') from Radar and Reflector halfway (T2) would stimulate group members to set goals to improve the social and cognitive performance of themselves and the group. Therefore, it was expected that self assessment and peer assessment scores would be higher at the end (T3), compared to halfway (T2) and the beginning (T1). As expected, group members using the tools throughout perceived their peers at the end as having more influence, being more reliable, more productive, and making higher quality contributions, compared to the beginning. Unexpectedly, no significant differences were found comparing scores halfway (T2) with scores at the end (T3).
An explanation could be that the time between the beginning (T1) and the end (T3) was long enough for the tools to have an effect, but the time between halfway (T2) and the end (T3) was too short.
The fourth question addressed was to examine whether group members in condition 1 (tools at T1, T2, and T3) perceive better social and cognitive behavior at the end (T3) than group members in condition 2 (tools at T2 and T3) or condition 3 (tools at T3). It was hypothesized that information from Radar and Reflector would stimulate group members to set goals to improve the social and cognitive performance of themselves and of the group. Significant χ²-values indicated that conditions (tools) have an effect on how group members perceived the social and cognitive behavior of themselves and their peers. As expected, group members using the tools throughout (condition 1) perceived themselves as having more influence, being friendlier, more cooperative,
more productive, and making contributions of higher quality than group members using the tools from halfway onwards (condition 2). Group members using the tools throughout also perceived themselves as being more friendly, more productive, and making contributions of higher quality than group members not using the tools (condition 3). Group members using the tools from halfway onwards (condition 2) perceived themselves as making contributions of higher quality than group members not using the tools (condition 3). Significant χ²-values also indicated that conditions (tools) have an effect on how group members perceived the social and cognitive behavior of their peers. Group members using the tools throughout perceived their peers as having more influence and being more reliable than group members using the tools from halfway onwards. Unexpectedly, no significant differences were found for perceived social behavior (i.e., influence, friendliness, cooperation, and reliability) when comparing group members using the tools throughout with group members not using the tools. Nevertheless, group members using the tools throughout perceived more productivity and higher quality contributions than group members not using the tools. No significant differences in peer assessment scores were found between conditions 2 and 3. These inconclusive results indicate that no clear patterns can be found in the self and peer assessment scores of Radar concerning perceived social and cognitive behavior. For example, it is hard to explain why significant differences in friendliness are found between conditions 1 and 2 at T2 (halfway through), but no significant differences are found within condition 1.
The fifth question addressed was to determine whether group members using Radar and Reflector show more congruency between self and peer assessment scores at a subsequent assessment. It was hypothesized that group members need time to adjust their unrealistic positive self perceptions, and thus it was expected that non-significant or small correlations would be found between self assessment and peer assessment scores after group members' first completion of Radar, but that significant and higher correlations would be found at a subsequent assessment. It was also expected that significant differences would be found between self and peer assessment scores after group members' first completion of Radar, and that these differences would become non-significant or smaller at a subsequent assessment. One significant difference was found between self and peer assessments: students in condition 2 perceived their own productivity as significantly lower than their peers did. Furthermore, we did not find any significant differences between self and peer assessment scores at T1, T2 or T3, but did find, as expected, non-significant or relatively small correlations for self assessment and peer assessment scores at group members' first assessment, and significant medium to large correlations at subsequent assessments. At the end of the collaboration process, group members using the tools (conditions 1 and 2) showed higher convergence of self assessment and peer assessment scores over time than group members not using the tools (condition 3). These results indicate that group members using the tools establish a shared perception of the social and cognitive behavior of individual group members.
Results also indicate that group members use their personal references (e.g., norms, values, beliefs) about themselves, their peers and the group during completion of the first self and peer assessment in Radar (Kenny, 1994). As stated, it appears that after completing Radar for the first time, group members need time to (1) process, reflect and act upon the received feedback, (2) monitor and assess their own social and cognitive performance and that of their peers during collaboration, and (3) establish shared norms, values and beliefs.
The sixth question addressed was to examine whether group members using Reflector set goals and formulate plans to enhance social and cognitive group performance. Results show that Reflector stimulates group members to set goals and formulate plans to improve their social and cognitive group performance. Results also show that goals and plans were mainly focused on the improvement of activities such as task coordination, communication, productivity, and focussing
on the task. An explanation why goals and plans were mainly focused on these activities could be that these activities are derived from dimensions of Radar. For example, the wish to improve 'task coordination' and 'communication' could indicate that students want to improve their cooperation, and it is likely that these activities are derived from Radar output on the dimension 'cooperation'. The need to improve 'productivity' and 'focussing on task' could indicate that students want to improve the productivity and quality of their work, which are both dimensions of Radar. The social and cognitive activities stated above are crucial for successful collaboration (Barron, 2003; Erkens et al., 2005), and indicate that groups formulated goals to enhance their social and cognitive performance (teamwork).
The seventh question addressed was to examine whether group members in condition 1 (tools at T1, T2, and T3) perceive higher social performance, in terms of better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving, than group members in condition 2 (tools at T2 and T3) and condition 3 (tools at T3). First, as expected, significant intra-class correlations were found for all social performance scales, except for 'attitude towards collaborative problem-solving'. An explanation for not finding a significant intra-class correlation for attitude towards collaborative problem solving could be that, in comparison with the other scales (e.g., team development), this scale is not determined by a single collaboration session (e.g., a collaborative writing task), but is a summary evaluation of several collaboration sessions (Petty, Wegener, & Fabrigar, 1997).
Second, unexpectedly, no significant differences were found between conditions 2 and 3. Using the tools from halfway onwards had no effect on the perceived social performance of condition 2. The period of time may have been too short to find effects of the tools on social performance: group members in condition 2, who received and used the tools from halfway onwards, had half the amount of time (67.5 minutes from halfway to the end) to change their social and cognitive behavior, compared to group members in condition 1, who received and used the tools from the beginning to the end (135 minutes).
Third, as expected, comparing conditions 1 and 2, group members in condition 1 (tools at T1, T2 and T3) perceived their team as being better developed, experienced higher levels of group satisfaction and lower levels of conflict, and had a more positive attitude towards collaborative problem solving than group members in condition 2 (tools at T2 and T3). Comparing conditions 1 and 3, group members in condition 1 (tools at T1, T2 and T3) experienced higher levels of group satisfaction and higher social performance than group members in condition 3 (tools at T3). Nevertheless, the associated χ²-values were low or not significant, indicating that using the tools throughout can have a positive effect on social performance, in terms of better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving, but also that there are other (not examined) factors (e.g., group members' discourse or communication) that have a major influence on these social performance scales. These results support our hypothesis that frequent use of Radar, combined with Reflector focused on future group functioning and goal setting, enhances social group performance.
In future research it would be interesting to examine the effect of the tools on group members' intended and actual behavior by analyzing their communication using discourse analysis (e.g., Erkens & Janssen, 2008).
The final question addressed was to examine whether using Radar and Reflector would lead to higher cognitive performance, measured by the grade given to the essays. As in the previous study (Phielix et al., 2010), no significant effects of Radar and Reflector were found on the grades given to the essays. Again, the period of time may have been too short to find effects of the tools on cognitive performance.
A few limitations of this study should be kept in mind. First, as mentioned, a limitation of this study might lie in the short period of time in which group members had to use the tools, fulfill the task, and establish shared norms, values and beliefs. Significant effects of the tools on social and cognitive behavior were found, but they might have been stronger had more time been available. In future research Radar and Reflector will be used for a longer period (e.g., three months), during which the tools will be available from the beginning for all conditions and will have to be used several times.
Second, in this study self and peer assessments (interpersonal perceptions) were used in order to change the social and cognitive behavior of individual group members and the group as a whole. Students were both feedback providers (assessors) and feedback receivers (assessees); thus, students' interpersonal perceptions and behavior were influenced by the feedback they provided and received. In future research it would be interesting to use a larger sample size and Social Relations Models (SRM) to examine how much variance in self assessment and peer assessment scores can be explained by the actor (assessor), partner (assessee), dyad (specific combination of two students), and group (specific combination of three or more students).
Third, in this quantitative study the output of the Reflector, concerning group members' intentions to enhance their social and cognitive performance by setting goals and formulating plans, was only analyzed in a quantitative way. It would be interesting to examine in a qualitative study whether these intentions lead to actual changes in social and cognitive behavior and activities (e.g., using discourse analysis to find out whether the intention to become more friendly actually led to more friendly and helpful behavior in the chat).
In conclusion, the effects of Radar and Reflector on group functioning are very promising. Results show that, by adding these easy to complete and easy to interpret peer feedback and reflection tools to a CSCL environment, students (1) become aware of interpersonal perceptions and behavior, (2) exhibit more positive social and cognitive behavior, (3) establish shared perceptions of interpersonal behavior, and (4) can enhance the social performance of the group.
6. Using Reflection to Increase Consensus among Peer Raters³
Abstract
There is a tenuous relationship between students' self perceptions and their actual performance. Students' tendency to believe that they are performing effectively, while they are not, undermines social and cognitive group performance. In this study, groups used a self and peer assessment tool (Radar), with or without a reflection tool (Reflector), in a CSCL-environment to enhance students' performance. Participants were 191 second-year university students working in groups of three, four and five on a collaborative writing task. Results did not show that the combination of Radar and Reflector led to enhanced group performance. Results did, however, show that supplementing a self and peer assessment tool with a reflection tool led to more consensus among raters, more moderate and less optimistic self and peer perceptions, and more valid judgments of social performance. Based on these findings, it is recommended that formative self and peer assessments be used frequently over a longer period of time and be combined with reflection prompts aimed at future functioning.
6.1 Introduction
Collaborative learning, often supported by computer networks (computer supported collaborative learning; CSCL), is enjoying considerable interest at all levels of education (Strijbos, Kirschner, & Martens, 2004). Though CSCL environments have been shown to be promising educational tools and though expectations as to their value and effectiveness are high, groups learning in CSCL environments do not always reach their full potential (e.g., Fjermestad, 2004; Lipponen, Rahikainen, Lallimo, & Hakkarainen, 2003; Thompson & Coovert, 2003). One of the most important reasons for this disparity between their potential and their results can be found in the social interaction between the group members, which is influenced by the design of the CSCL environment and/or the social and cognitive behavior of the group members (Kreijns, Kirschner, & Jochems, 2003).
First, the design of CSCL environments is often solely functional, focussing on the cognitive processes needed to accomplish a task and/or solve a problem and on achieving optimal learning performance (Kreijns & Kirschner, 2004). These functional CSCL environments force (coerce; Kirschner, Beers, Boshuizen, & Gijselaars, 2008) group members to solely carry out their task and thus limit the possibility for socio-emotional processes to take place. These socio-emotional processes, which are the basis for group forming and group dynamics, are essential for developing strong social relationships, strong group cohesiveness, feelings of trust, and a sense of community among group members.
Second, group members' self-views (self perceptions) show a tenuous to modest relationship with their actual behavior and performance (Dunning, Heath, & Suls, 2004). Students tend to overrate themselves and hold overinflated views of their expertise, skill, and character (e.g., Chemers, Hu, & Garcia, 2001; Dunning, Heath, & Suls, 2004; Falchikov & Boud, 1989). This tendency of group members to believe that they are performing effectively, while they often do not, can undermine the groups' social (e.g., team development) and cognitive performance
³Based on Phielix, C., Prins, F. J., Kirschner, P. A., Janssen, J., & Slof, B. Using reflection to increase consensus among peer raters. Manuscript submitted for publication.
(e.g., quantity and quality of work), causing the group not to reach its full potential (Karau & Williams, 1993; Stroebe, Diehl, & Abakoumkin, 1992).
To enhance social interaction and alleviate biased self perceptions, small groups can be augmented with computer supported collaborative learning (CSCL) tools that can make group members aware of their social (e.g., friendliness) and cognitive behavior (e.g., productivity) during collaboration (e.g., Phielix, Prins, Kirschner, Erkens, & Jaspers, 2011). These tools, also known as group awareness tools, provide information about the social and collaborative environment in which a person participates (e.g., they inform students how their actual behavior and performance is perceived by their peers; see Buder, 2007; 2011). This enhanced group awareness can lead to more effective and efficient collaboration (e.g., Buder & Bodemer, 2008; Janssen, Erkens, & Kirschner, 2011).
Two operationalizations of such tools were used in this study, namely a shared self and peer assessment tool (Radar) and a shared reflection tool (Reflector). Using Radar, students anonymously rate the social (e.g., friendliness) and cognitive (e.g., productivity) behavior of themselves and their fellow group members. Using Reflector, group members are stimulated to individually reflect upon their own functioning, their received peer ratings, and the functioning of the group as a whole. Reflector also stimulates group members to reflect collaboratively on their group performance and to formulate plans for improvement. To this end, it was hypothesized that a combination of Radar and Reflector would lead to more objective and valid ratings, more consensus among raters, and enhancement of the groups' social and cognitive performance.
6.2 Research Questions
This study examines whether the use of a shared reflection tool (Reflector) can enhance the level of consensus among peer raters, and enhance their social and cognitive group performance. The following research questions and hypotheses will be addressed:
Question 1: What is the effect of Reflector on peer-rating consensus across time?
Hypothesis 1a: Reflector enables reflection upon individual and group behavior, and supports students in forming judgments (i.e., shared understanding) about what can be referred to as good group behavior and high-quality performance. Thus, students using Reflector (+Re) will show higher levels of consensus among peer ratings (i.e., higher partner variance) and lower levels of assimilation (i.e., lower actor variance) than students not using Reflector (¬Re).
Hypothesis 1b: Compared to solely using Radar, using a combination of Radar and Reflector will be more effective in changing group members' perceptions and behavior, because Reflector facilitates explicit reflection upon group members' behavior, norms and group functioning. Thus, self and average peer ratings of students using Reflector (+Re) will develop differently (i.e., grow or decline) across time compared to students not using Reflector (¬Re).
Hypothesis 1c: Reflector can enhance awareness of unrealistic self perceptions by stimulating reflection on discrepancies between self and received peer ratings, which can result in higher correlations between self ratings and received peer ratings across time. Students using Reflector (+Re) will show higher correlations between self and received peer ratings than students not using Reflector (¬Re).
Question 2: What is the effect of Reflector on the social and cognitive group performance?
Hypothesis 2a: Groups with Reflector (+Re) will score higher on social group performance compared to groups without Reflector (¬Re), because groups with Reflector set goals and formulate plans to enhance their social performance.
Hypothesis 2b: Groups with Reflector (+Re) will score higher on cognitive group performance (i.e., their essay grade) than groups without Reflector (¬Re), because groups with Reflector set goals and formulate plans to enhance their cognitive performance.
Hypothesis 2c: Reflector enables reflection upon individual and group behavior, and supports students in forming judgments (i.e., a shared understanding) about what can be referred to as good group behavior and high-quality performance. Thus, students using Reflector (+Re) will exhibit more valid peer ratings, that is, higher correlations between their perceived social behavior (i.e., average peer ratings) and their perceived social performance, compared to groups without Reflector (¬Re).
6.3 Method
6.3.1 Participants
Participants were 191 second-year Dutch university Educational Science students (37 male, 154 female) with an average age of 23.6 years (SD = 7.16, Min = 19, Max = 55). Prior to the experiment, they were randomly assigned by the teacher to groups of three (ngroups = 7), four (ngroups = 40), and five (ngroups = 2), and randomly assigned by the researchers to one of two conditions (see Design). Groups were heterogeneous in ability.
6.3.2 Design
An experimental design was used with one experimental and one control condition (see Table 6.1). The experimental condition (n = 105) received a self and peer assessment tool (Radar) and a reflection tool (Reflector); the control condition (n = 86) received only Radar. During a period of eight weeks, participants in both conditions used Radar four times. Additionally, from the second measurement occasion onwards (T2: week 3), participants in the experimental condition also had to complete Reflector. In the first week, students were assigned to their groups and formulated plans for the weeks to come. Because group work was minimal in this first week, there was not much activity to reflect upon. Therefore, in the first week, groups only had to complete Radar to gather a baseline measurement.

Table 6.1 Design of the study
Condition                    T1 – week 1   T2 – week 3        T3 – week 6        T4 – week 8
1. With Reflector (+Re)      Radar         Radar, Reflector   Radar, Reflector   Radar, Reflector, Questionnaire
2. Without Reflector (¬Re)   Radar         Radar              Radar              Radar, Questionnaire
6.3.3 Task and procedure
Students collaborated in groups of three, four, or five on a collaborative research task in educational psychology. To successfully complete this task, each group had to write a research paper about a pilot study that they conducted over a period of eight weeks. During this period, they had to complete a self and peer assessment tool (Radar) four times, with or without a supplemental (co-)reflection tool (Reflector). Prior to this collaborative writing task, students had no experience in collaboratively writing a paper. The groups collaborated in a CSCL-environment called Virtual Collaborative Research Institute (VCRI; Jaspers, Broeken, & Erkens, 2004), a groupware program that supports
collaborative learning on research projects and inquiry tasks (see Tools section). Students were instructed to make full use of the tool for peer feedback and, in the experimental condition, the tool for reflection. While using the tools, students were instructed to use the environment to communicate with the other group members. Students received content information and definitions regarding the six traits on which they had to assess themselves and their peers. Students were told that they had eight weeks to complete the task, that it would be graded by their teacher, and that it would affect their final grade for the course. The introduction to the task stressed the importance of working together as a group and pointed out that each individual group member was responsible for the successful completion of the group task. At the end of the final session all participants completed a 30-item questionnaire on the social performance of the group.
Figure 6.1 Screenshot of VCRI with the five tools used in this experiment.
6.3.4 Tools
Virtual Collaborative Research Institute (VCRI). VCRI (Jaspers et al., 2004) is a groupware program that supports collaborative working and learning on research projects and inquiry tasks. It contains more than 10 different tools, of which five were used for this experiment (see Figure 6.1). Co-Writer (top left) is a shared word-processor for writing the group text, in which students in a team can simultaneously work on different parts of the text. The Chat tool (top center) is used for synchronous communication that is automatically stored and can be re-read by participants at any time. Notes (bottom right) is a note pad which allows the user to take notes and
copy and paste selected information. Radar for self and peer assessment (bottom left) and Reflector for co-reflection (top right) will be described in the following sections. Frames of the available tools are automatically arranged on the screen when students log on to the VCRI.
Self and Peer Assessment tool (Radar). Radar is a self and peer assessment tool for eliciting information on group members' social and cognitive behavior, visualized in a radar diagram (see Figure 6.2). Radar provides users with anonymous information on how their cognitive and social behaviors are perceived by themselves, their peers, and the group as a whole with respect to specific traits found to tacitly affect how one 'rates' others (den Brok, Brekelmans, & Wubbels, 2006). Radar provides information on six traits important for assessing behavior in groups. Four are related to social or interpersonal behavior, namely (1) influence, (2) friendliness, (3) cooperation, and (4) reliability; two are related to cognitive behavior, namely (5) productivity and (6) quality of contribution. These traits are derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels, 2006; Kenny, 1994; Salas, Sims, & Burke, 2005).
Figure 6.2 Radar – Input screen
Figure 6.3 Radar – Group information
Influence is directly derived from Wubbels, Créton, and Hooymayers' (1985) influence dimension, which they labeled dominance and submissiveness in their model for interpersonal teacher behavior. This dimension is also used by Bales (1988) and represents the prominence, status, power, and personal influence that the individual is seen to have in relation to other group members. The variable is labeled 'influence', and not 'dominance' or 'submissiveness', because the latter two can be perceived as negative traits. Friendliness is one of the eight behavior categories from Wubbels et al.'s (1985) model for interpersonal teacher behavior. Bales (1988) used a similar dimension (i.e., friendliness vs. unfriendliness). Bales and Cohen (1979) defined this as the extent to which individual members are friendly with and respectful to each other. Cooperation, which denotes the degree to which someone is willing to work with others, is derived directly from Wubbels et al.'s (1985) dimension proximity (i.e., opposition vs. cooperation), which they defined as the property of being close together, or in group settings as the feeling of being a group (i.e., group cohesiveness).
Reliability is a trait reflecting 'trust', which has been identified as an important precursor for successful collaboration, both in face-to-face teams (Castleton Partners/TCO, 2007) and in CSCL (Jarvenpaa & Leidner, 1999). According to Emans, Koopman, Rutte, and Steensma (1996), trust is the cognitive and affective assurance of group members that they respect each other's interests and, therefore, can orient themselves towards each other's words, actions, and decisions with an easy conscience. Productivity and Quality of contribution are the extent to which individual group members contribute quantitatively and qualitatively to tasks or duties central to group performance or group efficiency. These traits, representing cognitive or task-related behavior, were selected because research has shown that group members (1) do not always participate equally (Karau & Williams, 1993), and (2) monitor the performance (i.e., quantity and quality) of other group members (Salas et al., 2005). In Radar, group members are both assessors and assessees. As assessor, the to-be-assessed peers in the group (i.e., the assessees) can be selected, and their profiles will appear as dotted lines in the center circle of the radar diagram. Each group member is represented by a specific color. Assessors rate themselves and all other group members on each of the six subscales (traits). Each of the subscales is divided into 41 points of assessment ranging from 0 to 4 (see Figure 6.4). For example, a student can rate his/her peer 3.2 for friendliness. To simplify data analysis, ratings were transformed into integers on a 100-point scale by multiplying the ratings (0-4) by 25. Thus, a rating of 3.2 was saved in the database as 80 points (3.2 × 25) on the 100-point scale.
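The transformation from the 0-4 rating scale to the 100-point database scale is a simple linear rescaling. The following minimal sketch (Python; the function name is illustrative and not part of the VCRI code base) shows the arithmetic:

# Convert a Radar rating on the 0-4 scale (steps of 0.1) to the 0-100 integer scale.
def to_db_points(rating_0_4: float) -> int:
    if not 0.0 <= rating_0_4 <= 4.0:
        raise ValueError("Radar ratings range from 0 to 4")
    return round(rating_0_4 * 25)

print(to_db_points(3.2))  # -> 80, as in the example above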
Figure 6.4 Detailed image of a subscale in Radar (simplified)
Care was taken to ensure that all assessors used the same definitions of the six traits. Prior to the experiment the researcher informed participants that text balloons with content information and definitions would appear when they moved the cursor across one of the traits in the tool. For example, when the cursor is moved across 'influence' a balloon pops up with the text 'A high score on influence means that this person has an influence on what happens in the group, on the behavior of other group members, and on the form and content of the group product (the paper)'. Ratings are automatically saved in a database. Assessment is anonymous; group members can see the assessments of the other group members, but not who entered them. Students can only access individual and average assessments of their peers after they have completed the assessment themselves. When all group members have completed their self and peer assessments, two modified radar diagrams become available. The first - Information about yourself - shows the output of the self assessment (e.g., Chris about Chris) along with the average scores of the peer assessments of him/her (e.g., Group about Chris). The self assessment is not taken into account when computing these average scores. To provide more information about the variance in the average score of their peer assessment, students can also choose to view the individual peer assessments of their own behavior (e.g., Group members about Chris). The second - Information about the group (see Figure 6.3) - represents the average scores of the group members, so that group members can get a general impression of the functioning of the group.
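How the 'Group about X' average can be derived from the stored ratings is sketched below, assuming a simple nested dictionary of ratings; the data and names are hypothetical and do not reflect the actual VCRI database structure. The key step is that the ratee's own self rating is excluded from the average, as described above.

# Hypothetical ratings on one trait, 0-100 scale: ratings[rater][ratee]
ratings = {
    "Anna":  {"Anna": 75, "Bob": 60, "Chris": 80},
    "Bob":   {"Anna": 70, "Bob": 65, "Chris": 85},
    "Chris": {"Anna": 80, "Bob": 55, "Chris": 95},
}

def average_peer_rating(ratee: str) -> float:
    # Average of all ratings received by `ratee`, leaving out the self rating.
    peer_scores = [row[ratee] for rater, row in ratings.items() if rater != ratee]
    return sum(peer_scores) / len(peer_scores)

print(average_peer_rating("Chris"))  # (80 + 85) / 2 = 82.5, shown next to Chris's self rating of 95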
All group members are represented as a solid line in the diagram, each with a different color. Participants can expand or simplify the Radar diagram by including or excluding group members from the view by clicking a name in the legend. It is assumed that the peer feedback from Radar makes group members aware of the differences between their intended behavior (self assessment) and how this behavior is perceived by their peers (peer assessment). It is also assumed that group members will be stimulated to improve their social and cognitive behavior, knowing that (1) every group member is assessed by peers, and (2) these scores are shared anonymously in the group. Therefore, it is expected that group members using Radar throughout the task will exhibit higher self and peer assessment scores over time on all six traits.
Reflection tool (Reflector). Reflector was implemented to stimulate group members to co-reflect on their individual behavior and overall group performance. It contained six reflective questions:
1. What is your opinion of how you functioned in the group? Give arguments to support this.
2. What differences do you see between the assessment that you received from your peers and your self assessment?
3. Why do or don't you agree with your peers concerning your assessment?
4. What is your opinion of how the group is functioning? Give arguments to support this.
5. What does the group think about its functioning in general? Discuss and formulate a conclusion shared by all the group members.
6. What is needed to improve your group performance? Set specific goals (i.e., who, what, when) to improve group performance.
The first four questions are completed in Reflector, with completion indicated by clicking an 'Add' button. This allows students to share their answers with the rest of the group and to see the answers of the others. Students can only gain access to their peers' answers after they have added their own, so as not to influence one another. The last two questions are completed in Co-Writer, in a specific frame (Co-Reflection), which allows the group to write a shared conclusion and formulate shared goals. Responses made by the students in Reflector are not scored or evaluated.
6.3.5 Measures
Table 6.2 provides the coding scheme for the output of the co-reflection.
Reflection Characteristics. To improve the social and cognitive group performance, group members reflect together (i.e., co-reflect) to set goals and formulate plans to improve their social and cognitive activities. Categories for the coding scheme were derived from studies on social interaction and coordination processes in CSCL and were added until there were no 'rest categories'. Two independent researchers coded and categorized the goals and plans into nine categories (see Table 6.2). Inter-rater reliability was substantial (Cohen's Kappa = .79). The first three categories are communication, focusing on task, and task coordination, activities which are crucial for successful collaboration (Barron, 2003; Erkens, Jaspers, Prangsma, & Kanselaar, 2005; Slof, Erkens, Kirschner, Jaspers, & Janssen, 2010).
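As an illustration of the inter-rater reliability check, the sketch below computes Cohen's kappa for two coders in Python; the category codes are hypothetical examples, not the actual coded data (the reported value in this study was .79).

from sklearn.metrics import cohen_kappa_score

# Hypothetical codes assigned by the two independent researchers to the same goals/plans.
coder_1 = ["Com", "Plan", "Task", "None", "Prod", "Com", "Plan", "Mon"]
coder_2 = ["Com", "Plan", "Task", "None", "Qual", "Com", "Plan", "Mon"]

kappa = cohen_kappa_score(coder_1, coder_2)
print(f"Cohen's kappa = {kappa:.2f}")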
Table 6.2 Coding Scheme for Output Co-reflection: Specific Goals to Improve Group Performance
Label               Code     Description                                     Example
Communication       Com      Improve communication or discuss teamwork      We have to improve our communication and discuss our teamwork more often.
Focusing on task    Focus    Improve concentration or focus on task         We'll focus more on our work.
Task coordination   Task     Improve coordination, task- or role planning   We'll divide the tasks more effectively. Let's make clear who does what.
Planning            Plan     Improve time planning                          We'll set deadlines and improve our time planning.
Monitoring          Mon      Improve peer monitoring                        We'll monitor each others' progression.
Friendliness        Friend   Improve friendliness towards each other        We shouldn't be so unfriendly towards each other.
Productivity        Prod     Improve productivity                           We'll increase our productivity and participate more equally.
Quality             Qual     Improve quality of work                        We'll improve the quality of our work.
No suggestions      None     No suggestions for improvement                 We have no suggestions for improvement.
Furthermore, students need to carry out meta-cognitive activities such as planning and monitoring to employ a proper problem-solving strategy and to reflect on its suitability (Lazonder & Rouet, 2008; Narciss, Proske, & Koerndle, 2007). Students also must develop positive affective relationships with each other (Kreijns, Kirschner, & Jochems, 2003); thus friendliness is a sixth category. Productivity and quality are the seventh and eighth categories because, in effective groups, group members mutually depend on the willingness, effort and participation of their peers (Janssen, Erkens, Kanselaar, & Jaspers, 2007; Karau & Williams, 1993; Williams, Harkins, & Latané, 1981). The category no suggestions is added for students who do not have any suggestions to improve their performance. Table 6.3 provides an overview of the measures of social and cognitive behavior/performance.

Table 6.3 Overview of Scales, Subscales and Instruments
Scale                   Subscales                                                              Instrument
Social behavior         Influence, Friendliness, Cooperation, Reliability                      Radar
Cognitive behavior      Productivity, Quality of Contribution                                  Radar
Social performance      Team development, Group-process satisfaction, Intra-group conflicts,   Questionnaire
                        Attitude towards collaborative problem solving
Cognitive performance   -                                                                      Paper grade
Social behavior. See Table 6.4. Perceived social behavior was measured by the self and peer assessments in Radar on four variables (influence, friendliness, cooperativeness, reliability).
Table 6.4 Social and Cognitive Behavior as Measured by Radar
Scale                Subscale                  Balloon text / description
Social behavior      Influence                 A high score on influence means that this person has a big influence on: other group members; what happens in the group; structure and content of the group's product.
                     Friendliness              A high score on friendliness means that this person: is friendly and helpful; provides a positive contribution to the group atmosphere; responds in a friendly and helpful way to questions, suggestions and ideas of others.
                     Cooperation               A high score on cooperation means that this person: collaborates well in the group; is willing to take over tasks of others; takes initiatives; communicates well; tries to think and cooperate in finding solutions for problems that occur.
                     Reliability               A high score on reliability means that this person: is reliable; keeps his/her word; does what he/she is supposed to do; finishes his/her task at the appointed time.
Cognitive behavior   Productivity              A high score on productivity means that this person: is productive; works hard; has a high contribution to problem solving; has a high contribution to the group's product.
                     Quality of contribution   A high score on 'quality of contribution' means that: his/her work is perceived as useful and good; he/she produces work of high quality; he/she has a positive contribution to the content and structure of the group's product.
Cognitive behavior. See Table 6.4. Perceived cognitive behavior was measured by the self and peer assessments in Radar on the variables 'productivity' and 'quality of contribution', rated on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high). The same transformation to the 100-point scale was carried out here. Social performance. To measure social performance, previously tested and validated scales were used to measure team development (α = .92, 10 items), group-process satisfaction (α = .76, 6 items, both from Savicki, Kelley, & Lingenfelter, 1996), intra-group conflicts (α = .92, 7 items, from Saavedra, Early, & Van Dyne, 1993), and attitude towards collaborative problem solving (α = .81, 7 items, from Clarebout, Elen, & Lowyck, 1999). These scales were translated into Dutch and transformed into 5-point Likert scales (1 = totally disagree, 5 = totally agree; see Table 6.5) by Strijbos, Martens, Jochems, and Broers (2007). The Team Development scale provides information on the perceived level of group cohesion. The Group-process Satisfaction scale provides information on the perceived satisfaction with general group functioning. The Intra-group Conflicts scale provides information on the perceived level of conflict between group members. The Attitude towards Collaborative Problem Solving scale provides information on the perceived level of group effectiveness and how group members felt about working and solving problems in a group. The 30 items in the four scales were subjected to a principal component analysis. Prior to performing this analysis, the suitability of the data for factor analysis was assessed. Inspection of the correlation matrix showed that all coefficients were .5 or higher. The Kaiser-Meyer-Olkin value was .73, exceeding the recommended value of .6, and Bartlett's Test of Sphericity reached statistical significance, supporting the factorability of the correlation matrix. The analysis revealed the presence of one main component with an eigenvalue exceeding 1, explaining 76.6% of the variance. Cronbach's alpha of the composed 'Social Performance (total)' scale was .90.
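The scale checks reported above can be reproduced along the following lines; the sketch below uses placeholder data and standard formulas for Cronbach's alpha and Bartlett's test of sphericity, and inspects the eigenvalues of the correlation matrix. It is an illustration only, not the analysis script used in this study.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
items = rng.normal(size=(175, 30)) + rng.normal(size=(175, 1))  # placeholder respondents x items matrix

def cronbach_alpha(x):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    k = x.shape[1]
    return k / (k - 1) * (1 - x.var(axis=0, ddof=1).sum() / x.sum(axis=1).var(ddof=1))

def bartlett_sphericity(x):
    # Bartlett's chi-square based on the determinant of the correlation matrix.
    n, p = x.shape
    corr = np.corrcoef(x, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    return chi2, stats.chi2.sf(chi2, p * (p - 1) / 2)

print("Cronbach's alpha:", round(cronbach_alpha(items), 2))
print("Bartlett chi-square, p:", bartlett_sphericity(items))
eigenvalues = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))
print("components with eigenvalue > 1:", int((eigenvalues > 1).sum()))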
Table 6.5 Examples of Social Performance Scales and their Reliabilities in this Study
Scale                                             k    Example                                                                  Cronbach's α
Team Development                                  10   Group members contribute ideas and solutions to problems.               .77
Group-process Satisfaction                         6   I felt that my group worked very hard together to solve this problem.   .71
Intra-group Conflicts                              7   I found myself unhappy and in conflict with members of my group.        .84
Attitude towards Collaborative Problem Solving     7   Collaborating in a group is challenging.                                .74
Social performance (total)                        30   See all items of the four scales stated above.                          .90
Cognitive performance. The grade given to each group’s collaborative research task (i.e., a paper) was used as a measure of cognitive performance.
6.3.6 Data analyses
First, a manipulation check was conducted to examine whether the reflection tool stimulates groups to formulate goals and plans to enhance their functioning and performance. For this manipulation check, responses in Reflector were independently coded and categorized by two researchers to examine which goals and plans group members set and formulated in Reflector to enhance social and cognitive group performance. For the first research question, Social Relations Model analyses were used to examine whether students using Reflector (+Re) would show higher levels of consensus among peer ratings across time (i.e., higher partner variance), and lower levels of assimilation (i.e., lower actor variance), than students not using Reflector (¬Re). The computer program SOREMO (Kenny, 1998) was used to analyze data from the round-robin designs. Depending on the normality of the distribution of the variances, a Mann-Whitney U test (for non-normal distributions) or a MANOVA (for normal distributions) was used to examine differences in the relative amount of actor and partner variance between students using Reflector (+Re) and students not using Reflector (¬Re). Absolute variances were not used, because differences in absolute variances between the experimental and control condition could be caused by differences in the total sum of all absolute variances within a condition. Secondly, a MANOVA was used to examine whether there were significant differences between the conditions (with or without Reflector) at the beginning (at T1) for self assessment and peer assessment scores in Radar. Thirdly, because measurement occasions were nested within students and students were nested within groups, a three-level multilevel model was used to examine whether self and peer ratings developed differently in the two conditions (with or without Reflector) by examining the interaction between measurement occasion and condition. For the second research question, firstly, multilevel analyses were used to examine whether groups with Reflector (+Re) exhibit higher levels of social performance than groups without Reflector (¬Re). Secondly, to examine whether groups with Reflector (+Re) exhibit higher cognitive performance than groups without Reflector (¬Re), an independent one-tailed t-test was used to examine differences between conditions 1 and 2 with paper grade as the dependent variable. Finally, to determine whether Reflector supported students in forming judgments (a shared
standard) about what can be referred to as good group behavior and high-quality performance, a Pearson product-moment correlation coefficient was used to test whether students using Reflector (+Re) exhibited higher correlations between their perceived social behavior (i.e., average peer ratings) and their perceived social performance at the end (T4), compared to groups without Reflector (¬Re). Except where noted, tests were one-tailed, since most hypotheses were directional. The rule of thumb (Cohen, 1988, pp. 284-287) for the strength of the correlation (r) was small = .10-.29, medium = .30-.49, and large = .50-1.0; for effect sizes (η²) it was small ≥ .01, medium ≥ .06, and large ≥ .14. For the data analysis, only groups that completed three or more assessments were taken into account.
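The decision rule for comparing relative variances between conditions can be illustrated as follows. The sketch below (Python, with hypothetical data; the actual variance estimates came from SOREMO) checks normality per condition and falls back to a Mann-Whitney U test when normality is violated; for simplicity the parametric branch uses a t-test rather than the MANOVA mentioned above.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
var_with_re = rng.beta(2, 5, size=24)     # hypothetical relative partner variances, +Re groups
var_without_re = rng.beta(2, 8, size=18)  # hypothetical relative partner variances, ¬Re groups

normal = all(stats.shapiro(v).pvalue > .05 for v in (var_with_re, var_without_re))
if normal:
    test = stats.ttest_ind(var_with_re, var_without_re)     # parametric comparison (simplified)
else:
    test = stats.mannwhitneyu(var_with_re, var_without_re)  # non-parametric comparison
print(type(test).__name__, test.statistic, test.pvalue)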
6.4 Results
6.4.1 Manipulation check: formulated plans and goals in Reflector
Formulated group plans and goals were coded and categorized to examine which goals and plans were set and formulated in Reflector to enhance social and cognitive group performance. Table 6.5 shows, per measurement occasion, the absolute frequencies of the number of groups that formulated a goal or plan in a specific category. For example, at T2 (week 3) 10 of the 23 groups (43%) formulated a plan or set a goal to enhance their communication. Groups in the experimental condition (Radar and Reflector) completed the Reflector three times, at T2 (week 3), T3 (week 6), and T4 (week 8). Over time, groups formulated goals focused on improving communication (28%), task coordination (27%), planning (24%), increasing productivity (7%), focusing on the task (5%), increasing quality of contribution (4%) and monitoring their peers (4%). Only one of the groups formulated a goal focused on improving friendliness towards each other. At the second measurement in week 3 (T2), which was the first reflection occasion, 8 of 23 groups (35%) indicated that they did not have any plans or goals for improvement, compared to 4 groups (17%) at week 6 (T3) and 6 groups (26%) at week 8 (T4). Compared to T2, at T3 the number of groups that formulated goals to improve their task coordination was twice as large (14 groups). Furthermore, at week 6 (T3), 13 of 23 groups (57%) formulated goals to improve their planning activities, compared to 8 groups (35%) at T2 and at T4. Overall, the three most often mentioned goals relate to improving communication, task coordination, and planning, indicating that Reflector stimulated groups to formulate goals and plans to enhance their social and cognitive performance. Therefore, the manipulation can be considered successful.
Table 6.5 Absolute Frequencies and Percentages (%) of Formulated Future Goals and Plans, Condition 1 – Radar and Reflector (ngroups = 23)
Label                           T2 (week 3)    T3 (week 6)    T4 (week 8)    Overall (T2, T3, T4)
Communication                   10    29%      13    25%      11    30%       34    28%
Task coordination                7    21%      14    27%      12    32%       33    27%
Focussing                        2     6%       3     6%       1     3%        6     5%
Productivity                     3     9%       3     6%       1     3%        7     6%
Quality                          2     6%       2     4%       1     3%        5     4%
Planning                         8    24%      13    25%       8    22%       29    24%
Friendliness                     1     3%       1     2%       1     3%        3     2%
Monitoring                       1     3%       2     4%       2     5%        5     4%
Total plans & goals             34   100%      51   100%      37   100%      122   100%
Groups without plans or goals    8              4              6              18
6.4.2 Effect of Reflector on consensus among peer raters across time
Tables 6.6 and 6.7 show the means and standard deviations of self and average peer ratings at T1, T2, T3 and T4.

Table 6.6 Means and Standard Deviations of Self Assessments per Condition
                  Influence       Friendliness    Cooperation     Reliability     Productivity    Quality
T   Cond.   n     M      SD       M      SD       M      SD       M      SD       M      SD       M      SD
1   +Re     92    60.41  11.68    69.07  11.82    68.58  13.12    72.22  14.09    64.72  12.33    64.36  11.09
    ¬Re     91    63.47  13.45    69.10  12.94    65.68  14.48    74.69  15.14    64.65  11.54    66.91  12.61
2   +Re     91    64.34  10.84    68.51  11.16    67.45  13.26    70.88  13.27    65.08  10.98    68.01   9.84
    ¬Re     90    67.19  11.69    71.28   9.32    67.76   9.85    71.46   9.82    66.57  12.61    68.17  11.01
3   +Re     78    65.60  12.23    68.06  10.62    67.24  11.07    69.47  11.74    65.60  11.52    67.27  10.01
    ¬Re     86    68.76  10.10    70.57   9.29    68.00  11.93    69.17  10.52    66.47  11.85    69.23  10.10
4   +Re     80    66.05  12.33    67.64  11.28    66.15  11.72    69.59  11.93    68.71  12.69    69.00  11.54
    ¬Re     80    69.48  10.29    74.00   9.05    70.95   9.75    71.38   9.70    70.60  10.44    73.29  10.42

Table 6.7 Means and Standard Deviations of Average Peer Assessments per Condition
                  Influence       Friendliness    Cooperation     Reliability     Productivity    Quality
T   Cond.   n     M      SD       M      SD       M      SD       M      SD       M      SD       M      SD
1   +Re     92    63.81   8.55    72.15  10.53    68.94  10.72    71.01  10.76    66.83  10.53    67.74   9.42
    ¬Re     91    64.37  10.82    73.31  12.81    69.84  11.77    72.81  12.18    67.45  11.74    67.17  11.81
2   +Re     91    66.28   6.92    72.13  10.69    68.18  10.70    70.46  10.37    68.24   7.33    68.60   7.84
    ¬Re     90    68.57   7.76    74.90   9.81    70.80   9.51    72.82   8.23    69.26   8.57    69.71   7.33
3   +Re     78    66.52   7.61    69.50   9.04    66.70   9.87    68.86   8.94    67.26   7.80    67.40   6.95
    ¬Re     87    69.54   7.67    73.39   8.54    69.38   8.43    71.17   8.04    69.50   9.50    70.18   8.04
4   +Re     80    65.88   8.22    68.69   9.54    65.23   9.98    67.95   8.11    67.59   7.73    68.55   7.32
    ¬Re     81    68.67   9.17    74.44   8.73    71.05  10.15    72.36   8.34    71.29   9.82    71.38   9.14
Social Relations Model analyses were used to examine the differences in partner (ratee) and actor (rater) variance between the conditions. Table 6.8 shows the relative partner and actor variance
per dependent variable (i.e., influence, friendliness, etc.) per condition across time. The relative variances are proportions of the total variance. Figures 6.5a through 6.5l are graphs for each dependent variable with relative partner and actor variance over time per condition.

Table 6.8 Relative Partner and Actor Variance for Peer Assessments per Dependent Variable per Condition
              Influence         Friendliness      Cooperation       Reliability       Productivity      Quality
              Partner  Actor    Partner  Actor    Partner  Actor    Partner  Actor    Partner  Actor    Partner  Actor
+Re   T1      .20*     .35*     .09*     .68*     .11*     .64*     .06      .44*     .11      .41*     .08      .42*
      T2      .39*     .20*     .12*     .63*     .21*     .41*     .30*     .38*     .26*     .18*     .19*     .40*
      T3      .43*     .19*     .13      .49*     .29*     .34*     .43*     .18*     .40*     .22*     .38*     .21*
      T4      .28*     .35*     .04      .42*     .17*     .53*     .23*     .37*     .22*     .30*     .22      .27*
¬Re   T1      .16*     .37*     .09      .64*     .00      .60*     .05      .51*     .16*     .39*     .16*     .60*
      T2      .09      .25*     .00      .74*     .05      .70*     .06      .55*     .19*     .44*     .21*     .36*
      T3      .14*     .48*     .01      .67*     .04      .61*     .07*     .64*     .12*     .63*     .05      .65*
      T4      .20*     .46*     .05      .54*     .00      .66*     .17*     .58*     .23*     .55*     .19      .37*
* p < .05
Results show that all actor variances, with or without Reflector, are significant. Here, variance in ratings is determined by the actors (raters); that is, the variance is caused by the tendency of raters to rate all peers similarly - high or low - on a particular trait. For students using Reflector, significant partner variances were found for peer ratings on the four Radar dimensions that tap into social interpersonal behavior (i.e., influence, friendliness, cooperation, reliability). Here, variance in ratings is determined by the partners (ratees); that is, the variance is caused by the tendency of ratees to elicit similar ratings from all peer raters. For students not using Reflector, no significant partner variances were found for peer ratings on friendliness and cooperation, indicating that there is no consensus amongst raters for these traits, that is, no agreement among all actors (raters) that some peers are high on a trait and others are low on that trait. In contrast to students using Reflector (for whom partner variance exceeded actor variance, for example, for influence at T2 and T3), partner variances for students not using Reflector never exceeded actor variances. This means that peer ratings of students not using Reflector are more determined by characteristics of the raters (e.g., their norms and stereotypes) than by characteristics of the ratees. In general, variance partitioning for ratings of personality traits shows that approximately 20% of the variance is due to the actor, and 15% to the partner (Kenny et al., 2001, 2006). Higher partner variances (Min = .11, Max = .43) were found across time for students using Reflector than for students not using Reflector (Min = .07, Max = .21), which indicates that groups with Reflector show higher levels of consensus among raters across time than groups without. For students using Reflector, with the exception of friendliness, partner variance increases until T3, but decreases at T4 (Min = .17, Max = .28). For students not using Reflector, partner variance for influence, reliability, and productivity increases at T4 (Min = .17, Max = .23). This indicates that the level of consensus among raters using Reflector decreased at T4, but increased for raters not using Reflector.
[Figures 6.5a-6.5l: line graphs of relative partner and actor variance (%) across T1-T4 per condition; graph content not reproduced]
Figure 6.5a Partner variance (%) for influence          Figure 6.5b Actor variance (%) for influence
Figure 6.5c Partner variance (%) for friendliness       Figure 6.5d Actor variance (%) for friendliness
Figure 6.5e Partner variance (%) for cooperativeness    Figure 6.5f Actor variance (%) for cooperativeness
Figure 6.5g Partner variance (%) for reliability        Figure 6.5h Actor variance (%) for reliability
Figure 6.5i Partner variance (%) for productivity       Figure 6.5j Actor variance (%) for productivity
Figure 6.5k Partner variance (%) for quality            Figure 6.5l Actor variance (%) for quality
At T3, actor variance for students using Reflector approaches the normal partitioning (approximately 20%), with the exception of friendliness (.49) and cooperation (.34), but increases at T4 (Min = .27, Max = .53). Although actor variance for students not using Reflector decreases at T4 (Min = .37, Max = .66) - with the exception of cooperation - actor variances are high from the beginning (T1) until the end (T4) (Min = .25, Max = .74). This means that at the end of the collaboration process (T4) peer ratings of students using Reflector are more determined by characteristics of the raters (e.g., their norms and stereotypes) than by characteristics of the ratees (i.e., their social and cognitive behavior in the group), a development opposite to that of students not using Reflector. For example, Figure 6.5a shows the development of relative partner variance for influence at the four measurement occasions (i.e., T1-T4) for both conditions (Radar with or without Reflector). This graph can be interpreted as follows: in the ideal situation, partner variance is equal in both conditions at the beginning of the collaboration process (at T1), because both conditions are homogeneous and none of the participants has used the tools until that time. Ideally, partner variance slowly increases over time in both conditions, and, due to Reflector, partner variance in condition 1 (with Reflector) exceeds partner variance in condition 2 (without Reflector). The development of partner variance for influence shows an almost ideal development over time, except for the decrease of partner variance at T4 for groups with Reflector (condition 1), and the decrease at T2 for groups without Reflector (see Figure 6.5a). Figure 6.5b shows the development of the percentages of relative actor variance for influence over time for both conditions. In the ideal situation, actor variance is equal (preferably zero) in both conditions at T1. Ideally, actor variance slowly decreases over time in both conditions, and, due to Reflector, actor variance in condition 1 (with Reflector) is lower than in condition 2 (without Reflector). The development of actor variance for influence shows an almost ideal development over time, except for the increase of actor variance at T4 for groups with Reflector (condition 1), and the increase at T3 for groups without Reflector (see Figure 6.5b). Because the relative variances were not normally distributed, a Mann-Whitney U test was used to examine whether the relative amount of actor and partner variance differed significantly between students using Reflector (+Re) and students not using Reflector (¬Re). The significant results are shown in Table 6.9.

Table 6.9 Significant Differences in Relative Variances Between Students With (+Re) and Without Reflector (¬Re)
T    Variable               Md (+Re)   n groups   Md (¬Re)   n groups   U     z       r     p
T1   Partner Cooperation    .02        24         .00        18         137   -2.11   .33   .02
     Actor Quality          .24        24         .49        18         147    1.75   .27   .04
T2   Partner Influence      .27        24         .04        19         135   -2.29   .35   .01
T3   Partner Productivity   .20        19         .07        18         111   -1.84   .30   .03
     Actor Reliability      .31        19         .68        18         107    1.95   .32   .03
     Actor Productivity     .40        19         .55        18         106    1.98   .33   .02
     Actor Quality          .26        19         .66        18          76    2.89   .48   .00
T4   Partner Cooperation    .11        19         .01        17         109   -1.72   .29   .04
     Actor Productivity     .38        19         .62        17         108    1.70   .28   .04
Results show that, for students using Reflector, partner variances for peer ratings on influence (at T2), cooperation (at T1, T4) and productivity (at T3) were significantly higher
compared to students not using Reflector. These results indicate that peer ratings of students using Reflector show higher levels of consensus on influence (at T2), cooperation (at T1, T4) and productivity (at T3) than peer ratings of students not using Reflector. For students using Reflector, actor variances for peer assessments on reliability (at T3), productivity (at T3, T4), and quality of contribution (at T1, T3) were significantly lower compared to students not using Reflector. This indicates that ratings of students using Reflector on reliability (at T3), productivity (at T3, T4), and quality of contribution (at T1, T3) are less determined by the tendency of an actor (rater) to see all other group members as high or low on a particular trait.
6.4.3 Effect of Reflector on students' self ratings across time
First, a MANOVA was used to examine whether there were significant differences between the conditions (with or without Reflector) at the beginning (at T1) for self and peer ratings in Radar. Results showed no significant differences between the conditions at the beginning of the collaboration (at T1). Multilevel analyses were then used to examine the effect of Reflector on the self ratings across time (T1, T2, T3 and T4). Note that significant group-level variance (p < .001) was found for all dependent variables, indicating that the group influenced the peer ratings. Table 6.6 shows the means and standard deviations of the self assessments at T1, T2, T3 and T4. A linear growth model using hierarchical linear modeling (i.e., repeated measures multilevel analysis) was used to examine how students' ratings developed over the four measurement occasions and to test whether this development was affected by experimental condition (without Reflector vs. with Reflector). Furthermore, the interaction between measurement occasion and condition was examined to determine whether growth or decline of self ratings developed differently in the two conditions. Because measurement occasions were nested within students and students were nested within groups, a three-level multilevel model was used. The results of this analysis with respect to students' self evaluations of their behavior are presented in Table 6.10.

Table 6.10 Multilevel Analyses for Effects of Occasion and Condition on Self Assessment Scores
                            Influence         Friendliness      Cooperation      Reliability       Productivity      Quality
Fixed effects               β        SE β     β        SE β     β        SE β    β        SE β     β        SE β     β         SE β
Intercept                   62.67    (.82)    69.22    (.83)    67.01    (.86)   72.51    (.90)    64.23    (.85)    65.62     (.77)
Occasion                    1.95**   (.44)    .41      (.44)    .44      (.45)   -1.00**  (.47)    1.48***  (.45)    1.69***   (.42)
Condition                   -3.73**  (.92)    -2.81**  (.93)    -.94     (.99)   -1.05    (1.00)   -1.45    (.99)    -2.88***  (.79)
Occasion × condition        -.15     (.85)    -1.85*   (.85)    -2.28**  (.89)   .38      (.94)    -.54     (.90)    -.57      (.82)
Decrease in deviance (χ²)   15.69***          8.84**            .91              1.11              2.11              10.33***
Note. Standard errors are in parentheses. * p < .05 ** p < .01 *** p < .001
The results represent the so-called conditional model (i.e., model with predictor variables such as experimental condition). This model examined the linear effect of measurement occasion on influence rating by incorporating occasion as a fixed effect. Because the first measurement was taken as a reference point, and there were four measurement occasions in total, this variable could range from 0 to 3. The fixed effects can therefore be interpreted as follows.
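A rough equivalent of this growth model can be specified in Python with statsmodels; the sketch below is only an approximation of the reported three-level analysis (occasions within students within groups), and the file and column names (self_ratings_long.csv, score, occasion, condition, group, student) are assumptions about a long-format data set, not the original data.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("self_ratings_long.csv")     # hypothetical: one row per student per occasion

model = smf.mixedlm(
    "score ~ occasion * condition",           # fixed effects: occasion (0-3), condition (0/1), interaction
    data=df,
    groups=df["group"],                       # random intercept for groups
    vc_formula={"student": "0 + C(student)"}  # variance component for students nested within groups
)
result = model.fit(reml=False)                # ML fit, so deviances of nested models can be compared
print(result.summary())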
First, the estimate for the intercept indicates the average self rating of group members at the first measurement occasion (e.g., 62.67 for influence). Second, the linear effect of measurement occasion was significantly positive for influence (β = 1.95, p < .01), productivity (β = 1.48, p < .001), and quality of contribution (β = 1.69, p < .001), indicating that in both conditions average self ratings on influence increased by about 2 points from one occasion to the next, and self ratings on productivity and quality of contribution by 1.5 and 1.7 points respectively. A significant negative linear effect of measurement occasion was found for reliability (β = -1.00, p < .01), indicating that group members perceived themselves as being a little less reliable (by 1 point) from one occasion to the next. Third, condition (without Reflector vs. with Reflector) was added as a fixed effect to the multilevel model. As can be seen, a significant effect of condition was found for influence (β = -3.73, p < .01), friendliness (β = -2.81, p < .01), and quality of contribution (β = -2.88, p < .001). This indicates that, on average, students using Reflector rated themselves significantly lower on influence (by 3.73 points), friendliness, and quality of contribution than students not using Reflector. Fourth, interaction effects between condition and measurement occasion were examined. A significant interaction effect was found for friendliness (β = -1.85, p < .05) and cooperation (β = -2.28, p < .01). This means that the influence of measurement occasion on self ratings differed between students using and not using Reflector, and indicates that self ratings developed differently across time. For students using Reflector, self ratings declined across time (e.g., by 1.85 points per occasion for friendliness) relative to students not using Reflector. Finally, when comparing the model fit of the conditional model (with predictors) to the unconditional model (without predictors), significant decreases in deviance were found for ratings on influence (χ² = 15.69, p < .001), friendliness (χ² = 8.84, p < .01), and quality of contribution (χ² = 10.33, p < .001), indicating that the conditional model fit the data significantly better for these self ratings. In sum, the results show that Reflector has an effect on students' self ratings. Due to Reflector, self ratings develop differently (i.e., they decline) across time. Generally, students using Reflector rated themselves significantly lower on influence, friendliness and quality of contribution than students not using it. These results indicate that Reflector makes students aware of their unrealistically positive self perceptions, which leads to more moderate and less optimistic self ratings.
6.4.4 Effect of Reflector on students' peer ratings across time
In the same way as for the self ratings, a three-level multilevel model was used to examine whether growth or decline of peer ratings developed differently in the two conditions by examining the interaction between measurement occasion and condition. The results of this analysis are given in Table 6.11. The linear effect of measurement occasion was significantly positive for influence (β = 1.01, p < .01) and quality of contribution (β = .71, p < .05), indicating that in both conditions average ratings on influence increased by 1 point from one occasion to the next, and ratings on quality of contribution by 0.7 points. A significant effect of condition was found for influence (β = -2.23, p < .01), friendliness (β = -3.37, p < .001), cooperativeness (β = -2.75, p < .01), reliability (β = -2.59, p < .01), productivity (β = -1.98, p < .01), and quality of contribution (β = -1.89, p < .01). This indicates that, on average, students in the condition with Reflector rated their group members significantly lower (e.g., 2.23 points lower on influence) than students in the condition without. No significant interaction effect between condition and measurement occasion was found, indicating that peer ratings tended to develop similarly in both conditions.
Table 6.11 Multilevel Analyses for Effects of Occasion and Condition on Peer Assessment Scores
                            Influence         Friendliness       Cooperation       Reliability       Productivity      Quality
Fixed effects               β        SE β     β         SE β     β        SE β     β        SE β     β        SE β     β        SE β
Intercept                   65.26    (.68)    72.98     (.99)    69.38    (.86)    71.92    (.78)    67.44    (.73)    67.73    (.71)
Occasion                    1.01**   (.37)    -.50      (.46)    -.44     (.46)    -.66     (.41)    .64      (.40)    .71*     (.38)
Condition                   -2.23**  (.78)    -3.37***  (.98)    -2.75**  (.98)    -2.59**  (.86)    -1.98**  (.84)    -1.89**  (.79)
Occasion × condition        -.36     (.73)    -1.24     (.88)    -1.12    (.90)    -.60     (.79)    -.93     (.79)    -.85     (.74)
Decrease in deviance (χ²)   8.15**            11.41***           7.72**            8.87**            5.54**            5.67**
Note. Standard errors are in parentheses. * p < .05 ** p < .01 *** p < .001
Finally, when comparing the model fit of the conditional model (with predictors) to the unconditional model (without predictors), significant decreases in deviance were found for influence (χ² = 8.15, p < .01), friendliness (χ² = 11.41, p < .001), cooperativeness (χ² = 7.72, p < .01), reliability (χ² = 8.87, p < .01), productivity (χ² = 5.54, p < .01), and quality of contribution (χ² = 5.67, p < .01), indicating that the conditional model fit the data significantly better. In sum, the results show that Reflector affects peer ratings. Although peer ratings develop similarly in both conditions across time, on average, students using Reflector rated their peers significantly lower on all traits compared to students without. These results indicate that Reflector can (1) make students aware of unrealistically positive peer perceptions, and (2) support them in forming a shared understanding about what can be referred to as good group behavior and high-quality performance, which leads to more moderate and less optimistic peer ratings.

Table 6.12 Pearson Correlations between Self Ratings and Received Average Peer Ratings at T1, T2, T3 and T4
Condition                   T    N     Influence   Friendliness   Cooperative   Reliability   Productivity   Quality of contribution
                                       r           r              r             r             r              r
With Reflector (+Re)        1    92    .23*        -.02           .05           .13           .17            .18
                            2    91    .47**       .25*           .27*          .36**         .36**          .21*
                            3    78    .23*        .37**          .23*          .17           .28*           .27*
                            4    80    .46**       .38**          .27*          .15           .31**          .25*
Without Reflector (¬Re)     1    91    .34**       .23*           .11           -.00          .20            .16
                            2    90    .45**       .24*           .16           .26*          .14            .48**
                            3    86    .44**       .33**          .40**         .29**         .48**          .45**
                            4    80    .62**       .40**          .31**         .43**         .52**          .59**
* p < .05 (2-tailed) ** p < .01 (2-tailed)
6.4.5 Effect of Reflector on self ratings and received peer ratings
A Pearson product-moment correlation coefficient was used to test the congruency between self and received peer ratings at T1, T2, T3, and T4 with respect to perceived social and cognitive behavior (see Table 6.12). Preliminary analyses were performed to ensure no violation of the assumptions of normality, linearity and/or homoscedasticity. The rule of thumb (Cohen, 1988) for the strength of the correlation (r) was small = .10–.29, medium = .30–.49, and large = .50–1.0.
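For a single trait at a single occasion, this congruency test amounts to one Pearson correlation per condition; the sketch below illustrates it with hypothetical arrays and Cohen's (1988) rule of thumb, and is not the analysis script used in this study.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
self_ratings = rng.normal(68, 11, size=80)                      # hypothetical self ratings (0-100 scale)
peer_ratings = 0.4 * self_ratings + rng.normal(40, 8, size=80)  # hypothetical received average peer ratings

r, p = stats.pearsonr(self_ratings, peer_ratings)
strength = "small" if abs(r) < .30 else "medium" if abs(r) < .50 else "large"  # Cohen (1988)
print(f"r = {r:.2f} ({strength}), p = {p:.3f}")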
Results show non-significant or relatively small correlations at T1, and significant medium to large correlations at subsequent assessments for both groups. Surprisingly, at the end of the collaboration (T4), groups with Reflector showed no significant correlations between self and peer ratings on reliability, and medium and low correlations on productivity and quality of contribution, compared to high correlations for groups without Reflector.

Table 6.13 Multilevel Analyses for Effects of Condition on Social Performance Scales
                               Condition 1 (n = 89)   Condition 2 (n = 86)   Comparing Condition 1 vs. 2
Scale                          M       SD             M       SD             β       SE β    χ²
Team development               3.67    0.73           3.87    0.48           -.27*   .15     3.31*
Group-process satisfaction     3.08    0.41           3.14    0.34           -.07    .07     .91
Intra-group conflicts          2.58    0.59           2.43    0.50           .21*    .12     2.85*
Attitude                       3.07    0.21           3.08    0.18           -.02    .03     .00
Social Performance (total)     3.14    0.20           3.18    0.15           -.05    .04     1.60
* p < .05 (1-tailed)
6.4.6 Effect of Reflector on perceived social performance
Multilevel analysis was used to examine whether groups with Reflector (+Re) perceived higher social performance (i.e., better team development, higher group satisfaction, less group conflict, and more positive attitudes towards collaborative problem solving) than groups without Reflector (¬Re). Table 6.13 shows the multilevel analyses for effects of condition on the social performance scales. The significant β-values show that groups with Reflector perceived their team as being less developed and as having more intra-group conflicts than groups without Reflector. However, no significant differences were found for total social performance, group-process satisfaction, and attitude towards collaborative problem solving.

Table 6.14 Means and Standard Deviations for Cognitive Performance per Condition
                             Cognitive performance (paper grade)
Condition                    n groups   M      SD    Min   Max
1 – Radar and Reflector      23         7.35   .51   6.5   8.5
2 – Radar (control)          24         7.46   .44   6.5   8.5
6.4.7 Effect of Reflector on cognitive performance
An independent t-test (one-tailed) was conducted to explore the effect of Reflector on cognitive group performance as measured by the grade given to their (group) paper. Table 6.14 shows the means and standard deviations for group performance per condition. No significant differences were found.
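The grade comparison can be illustrated with a one-tailed independent t-test in Python; the grade vectors below are hypothetical, not the actual group grades.

import numpy as np
from scipy import stats

grades_with_re = np.array([7.0, 7.5, 8.0, 7.0, 7.5, 6.5, 7.5, 8.5])     # hypothetical condition 1 grades
grades_without_re = np.array([7.5, 7.0, 8.0, 7.5, 6.5, 7.5, 8.5, 7.0])  # hypothetical condition 2 grades

# One-tailed test of hypothesis 2b: groups with Reflector score higher than groups without.
t, p = stats.ttest_ind(grades_with_re, grades_without_re, alternative="greater")
print(f"t = {t:.2f}, one-tailed p = {p:.3f}")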
6.4.8 Effect of Reflector on congruency between peer ratings and performance
A Pearson product-moment correlation coefficient was used to test the congruency between group members' average peer ratings and their perceived social performance at T4. Preliminary analyses were performed to ensure no violation of the assumptions of normality, linearity and/or homoscedasticity. Table 6.15 shows the Pearson correlations between average peer ratings and perceived social performance.

Table 6.15 Correlations for Average Peer Assessments (With and Without Reflector) and Social Performance at T4
                              n     Team          Group-process   Intra-group   Attitude towards     Social performance
                                    development   satisfaction    conflicts     CL problem solving   (total)
                                    r             r               r             r                    r
With Reflector (+Re)
  Influence                   80    .15           .31**           -.09          .13                  .26*
  Friendliness                80    .38***        .44***          -.28**        .19                  .44***
  Cooperativeness             80    .36***        .35***          -.22*         .10                  .41***
  Reliability                 80    .24*          .19*            -.16          .09                  .25*
  Productivity                80    .14           .17             -.12          .23*                 .20*
  Quality of contribution     80    .14           .21*            -.08          .12                  .22*
Without Reflector (¬Re)
  Influence                   81    .15           .12             -.15          .17                  .13
  Friendliness                81    .04           .05             -.04          .02                  .04
  Cooperativeness             81    .31**         .16             -.25*         -.05                 .17
  Reliability                 81    .25*          .21*             -.23*        .05                  .17
  Productivity                81    .26**         .16             -.21*         .04                  .17
  Quality of contribution     81    .28**         .20*            -.21*         -.02                 .20*
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
As expected, compared to groups without Reflector (¬Re), groups with Reflector (+Re) show more significant correlations between their average peer ratings and their perceived social performance (e.g., team development, group-process satisfaction, attitude towards collaborative problem solving, and social performance in total). For students using Reflector, results show that all peer ratings on perceived social behavior (i.e., influence, friendliness, cooperativeness, and reliability) and all peer ratings on cognitive behavior (i.e., productivity and quality of contribution) correlate significantly and positively with their perceived social performance (in total). These results suggest that, for students using Reflector, scores on total social performance are based on the perceived social and cognitive behavior of their peers in the group. For students not using Reflector, there were no significant correlations between their perceived social behavior and their perceived social performance (in total). There was, however, a significant correlation between their perceived cognitive behavior (i.e., quality of contribution) and their perceived social performance (in total). These results indicate that these students' high or low scores on social performance are not based on perceived positive or negative social behavior in the group, but on the cognitive behavior of their peers (i.e., the quality of their contribution). It was not expected that at T4 partner variance for all dependent variables would decrease for groups with Reflector, and that partner variance for all dependent variables (except friendliness) would increase for groups using only Radar (see paragraph 6.4.2). Therefore, a Pearson product-moment correlation coefficient was additionally calculated to test for congruence between group members' average peer ratings at T3 and their perceived social performance (see Table 6.16). Compared to
students using Reflector, who showed many significant correlations between their perceived behavior and social performance, students not using Reflector showed only two significant correlations. Students with Reflector showed significant negative correlations between their perceived social behavior (i.e., friendliness, cooperativeness, and reliability) and intra-group conflicts, which indicates that high levels of perceived friendliness, cooperativeness, and reliability were associated with low levels of perceived intra-group conflicts. Compared to students using Reflector, who showed significant correlations between all six Radar variables and perceived social performance (in total), students not using Reflector showed no significant correlations.

Table 6.16 Correlations for Average Peer Assessments With(out) Reflector at T3 and Social Performance at T4
                              n     Team          Group-process   Intra-group   Attitude towards     Social performance
                                    development   satisfaction    conflicts     CL problem solving   (total)
                                    r             r               r             r                    r
With Reflector (+Re)
  Influence                   80    .12           .21*            .00           .14                  .25*
  Friendliness                80    .50***        .30**           -.39***       .25*                 .46***
  Cooperativeness             80    .44***        .30**           -.28**        .18                  .45***
  Reliability                 80    .32**         .18             -.25*         .15                  .28**
  Productivity                80    .22*          .22*            -.16          .10                  .25*
  Quality of contribution     80    .28**         .21*            -.16          .24*                 .34**
Without Reflector (¬Re)
  Influence                   81    .07           .21*            -.15          .07                  .05
  Friendliness                81    .02           .04             -.08          -.10                 -.06
  Cooperativeness             81    .16           .16             -.13          -.23*                .05
  Reliability                 81    .13           .08             -.16          -.12                 -.01
  Productivity                81    .14           .12             -.06          .06                  .15
  Quality of contribution     81    .08           .07             -.11          .07                  .04
* p < .05 (1-tailed) ** p < .01 (1-tailed) *** p < .001 (1-tailed)
6.5 Discussion and conclusion
The major aims of this study were to examine whether the use of a shared reflection tool (Reflector) could enhance the level of consensus among peer raters and enhance their social and cognitive group performance. Results show that the use of a co-reflection tool in a CSCL environment leads to (1) groups formulating goals and plans to enhance their social and cognitive performance; (2) higher levels of consensus among peer raters across time; (3) less assimilation in individual peer ratings across time; (4) more moderate and less optimistic self and peer perceptions across time; (5) more moderate and less optimistic perceptions of social group performance (i.e., less team development and more intra-group conflicts); and (6) more valid perceptions of social performance (i.e., significant correlations between average peer ratings on behavior and perceived social performance). The first research question addressed the effect of the reflection tool (Reflector) on self and peer ratings across time. Social Relations Model (SRM) analyses were used to examine interdependencies among ratings. The findings support the assumption that supplementing a self and peer assessment tool (Radar) with a co-reflection tool (Reflector) can lead to higher levels of consensus (i.e., higher partner variance) among peer raters. A Mann-Whitney U test was used to examine significant differences between students using Reflector (+Re) and not using Reflector (¬Re) in relative actor and partner variances. Although the statistical power of the Mann-Whitney U test was low (.33), peer ratings of students using Reflector showed significantly higher levels of
consensus (i.e., higher partner variances) on influence (at T2), cooperation (at T1, T4) and productivity (at T3), compared to students not using Reflector. Partner variance was thus significantly higher for students using Reflector than for students not using Reflector. Furthermore, peer ratings of students using Reflector showed significantly lower actor variances on reliability (at T3), productivity (at T3, T4), and quality of contribution (at T1, T3), compared to students not using Reflector. This indicates that, in contrast to students not using Reflector, peer ratings of students using Reflector are less determined by the tendency of a rater to see all other group members as high or low on a particular trait. Overall, the findings support the assumption that visualization of group norms (i.e., self and peer ratings in Radar) and explicit reflection upon these norms and standards (i.e., reflection on discrepancies between self and peer ratings in Reflector) support group members in creating a shared standard about what can be referred to as good group behavior and high-quality performance. According to Kenny et al. (2001, 2006), variance partitioning for ratings of personality traits generally shows that about 20% of the variance is due to the actor, and 15% to the partner. Surprisingly, considerably higher partner variances (Min = 11%, Max = 43%) were found across time for students using Reflector, compared to students not using Reflector (Min = 7%, Max = 21%). Unexpectedly, for students using Reflector, actor variance increased and most partner variances decreased at T4. Also, actor and partner variance for students not using Reflector showed the complete opposite development at T4. A possible explanation for this decrease in partner variance and increase in actor variance for students using Reflector could be that, in the final stage of the collaboration process, students are less focused on the process (group members' behavior) and more on getting the product (the paper) finished before the deadline (e.g., Aubert & Kelsey, 2003). This is supported by the number of groups in Reflector at T4 (6 out of 23) that had no suggestions to improve their group functioning. Another explanation is that, at the end of the collaboration process (at T4), students using Reflector did not observe and rate their peers' behavior as carefully as they did before, because they were euphoric about finishing their paper on time, and/or because their final ratings could no longer affect their peers' behavior or the group's performance. Multilevel analyses were used to examine the effect of Reflector on self and peer ratings across time. As expected, self ratings of students using Reflector (+Re) developed differently across time compared to students not using Reflector (¬Re): due to Reflector, self ratings declined across time. Peer ratings, however, developed similarly in both conditions across time. Results indicated that, on average, students using Reflector rated themselves significantly lower on influence, friendliness and quality of contribution than students without. Furthermore, students using Reflector rated their peers significantly lower on all traits, compared to students without.
The findings support the assumption that Reflector can (1) make students aware of their generally unrealistic positive self perceptions, and (2) support group members in forming a shared standard about what can be referred to as good group behavior and high-quality performance, which leads to more moderate and less optimistic self and peer perceptions (i.e., self and peer ratings). A Pearson correlation coefficient was used to test congruency between self and received peer ratings at T1, T2, T3, and T4, with respect to perceived social and cognitive behavior (see Table 6.12). As expected, results show non-significant or relatively small correlations at the first assessment, and significant medium to large correlations at subsequent assessments for both groups. This indicates that over time, the tools (Radar and Reflector) positively affect the congruency between self and received peer ratings. Unexpectedly, at the end of the collaboration (T4), groups with Reflector showed no significant correlations between self and peer ratings on reliability, and showed low and medium correlations on productivity and quality of contribution,
compared to medium and high correlations for groups without Reflector. An explanation could be that students not using Reflector maintain the tendency to overrate themselves (Dunning, Heath, & Suls, 2004), and rate their peers too leniently (Landy & Farr, 1983; see the higher self and peer ratings of groups without Reflector in paragraphs 6.5.3 and 6.5.4), whereas groups with Reflector need time to adjust their unrealistic self and peer ratings to reach more consensus on reliability, productivity and quality of contribution (see results in Table 6.9 at T3). The second research question pertained to the effect of the Reflector on the groups’ social and cognitive performance. Findings did not support the assumption that the use of a co-reflection tool enhanced the social and cognitive performance of the group. Groups with Reflector perceived their team as being less developed and having more intra-group conflicts than groups without. An explanation could be that, because Reflector stimulates group members to reflect explicitly (i.e., look more closely) upon the functioning of the group and to set shared goals for improvement, this process leads to more group awareness of different perspectives on group functioning, resulting in more conflict and less development. This would be in line with Tuckman and Jensen’s (1977) concept of group development stages. Tuckman and Jensen observed and distinguished five stages, namely: (1) forming (i.e., getting to know each other and the task at hand), (2) storming (i.e., establishing roles and positions within the group), (3) norming (i.e., reaching consensus about behavior, goals and strategies), (4) performing (i.e., reaching conclusions and delivering results), and (5) adjourning (i.e., dismantling of the group when the task is completed). It can be expected that a tool like Reflector may cause group development to revert to the stage of storming, which can be contentious, unpleasant, and even painful to group members who do not like conflicts (e.g., Bales & Cohen, 1979; Tuckman & Jensen, 1977). However, differences between the two conditions (with or without Reflector) are small and no significant differences were found for social performance in total. Furthermore, results indicate that perceptions of groups without Reflector are not valid. The perceived social performance of students not using Reflector, as measured by the questionnaire, does not correlate with their perceived behavior (peer ratings) as measured by Radar. Another explanation for not finding any significant differences could be that the effect of Radar on social and cognitive performance is stronger than the effect of the Reflector, so that no significant differences emerge between groups with or without Reflector. The small range in grades given to the students’ papers is probably also one of the reasons for not finding any significant differences in cognitive performance between conditions. Findings did support the assumption that students using Reflector would exhibit more valid peer ratings, that is, higher correlations between peer ratings on social behavior and the perceived social performance, compared to groups without.
As expected, for students using Reflector, results show that all peer ratings on perceived social behavior (influence, friendliness, cooperativeness, and reliability), as well as all peer ratings on cognitive behavior (productivity and quality of contribution), correlate significantly positively with their perceived social performance (in total). These results indicate that, for students using Reflector, scores on total social performance are based on the perceived social and cognitive behavior of their peers in the group. For students not using Reflector, results did not show any significant correlations between their perceived social behavior and their perceived social performance (in total). Results did show a significant correlation between students’ perceived cognitive behavior (i.e., quality of contribution) and their perceived social performance (in total). These results indicate that, for students not using Reflector, scores on social performance are not based on perceived positive or negative social behavior in the group, but on the perceived cognitive behavior of their peers (i.e., the quality of their contribution). An explanation for these findings could be that visualization of
group norms and explicit reflection upon these norms and standards support group members in creating a shared standard, which leads to a more valid perception of their social performance, for example, higher correlations between perceived behavior (i.e., cooperative behavior) during collaboration and perceived performance at the end (i.e., team development). Apparently, students do not automatically reflect on a high cognitive level on their perceived and received peer assessments, and need a reflection tool (i.e., Reflector) to do so (Kollar & Fischer, 2010). In sum, results show that supplementing a self and peer assessment tool with a co-reflection tool leads to more consensus among raters, more moderate and less optimistic self and peer perceptions, and more valid judgments of social performance.
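For readers less familiar with the Social Relations Model, a minimal sketch of the round-robin decomposition that underlies the actor and partner variances reported above (the notation follows Kenny's general formulation and is illustrative; it is not the exact model specification used in the analyses of this chapter):

\[ X_{ij} = \mu + a_i + b_j + g_{ij} \]

Here \(X_{ij}\) is the rating that group member \(i\) gives to member \(j\), \(\mu\) is the group mean, \(a_i\) is the actor effect of rater \(i\) (the tendency to rate all peers similarly high or low), \(b_j\) is the partner effect of ratee \(j\) (the tendency to elicit similar ratings from all raters), and \(g_{ij}\) is the relationship component (including error). The total rating variance is then partitioned as

\[ \sigma^2_X = \sigma^2_a + \sigma^2_b + \sigma^2_g \]

so that consensus corresponds to a relatively large partner variance \(\sigma^2_b\), and assimilation to a relatively large actor variance \(\sigma^2_a\).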
6.6 Future Research and Implications In this study Social Relations Models (SRM) were used to examine interdependencies among ratings. To our knowledge, very few studies have used SRM to study group dynamics (see a review by Marcus, 1998), or applied it to study consensus among peer ratings on performance (e.g., Greguras, Robie, & Born, 2001; Greguras, Robie, Born, & Koenigs, 2007). In this study, SRM provided a very useful theoretical basis and statistical tool to partition sources of variance in peer ratings into actor variance, caused by the tendency of the raters to rate all peers similarly high or low on a particular trait, and partner variance, caused by the tendency of the ratees to elicit similar ratings from all peer raters. This study can be regarded as our first introduction to and pilot of SRM for educational research. For future research it would be interesting to examine the correlation between self-ratings and the individual-level SRM effects (actor and partner). The correlations between self-ratings and actor effects of trait ratings measure assumed similarity (i.e., Does the way a person sees him/herself correspond to how he/she sees others?). The correlations between self-ratings and partner effects of trait ratings measure self-other agreement (i.e., Do others see a person as that person sees him/herself?). SRM is to be recommended for educational research on interpersonal perceptions, group dynamics, and round-robin designs where persons rate each other and themselves on particular traits or behavior. A few limitations and practical implications of this study should be kept in mind. As mentioned, a limitation of this study might lie in the fact that students collaborated face-to-face, and only used the CSCL environment to complete their self and peer ratings across time. This could explain why, compared to previous studies (i.e., Phielix et al., 2010, 2011), no significant effects of the Reflector were found on the groups’ social performance. Full collaboration of the participants in the CSCL environment could have enhanced the effect of the Radar and Reflector, because then, group members would have to rely on the tools to gain information on the behavior and performance of their peers. Another explanation could be the difference in reflection skills between the participants of the previous studies (i.e., sophomore high school) and the current study (i.e., second year university). Reflection skills of second year university students should be more developed than those of sophomore high school students. Therefore, the added value of the Reflector may be higher for high school students than for university students, which could explain why an effect of Reflector was found for high school students but not for the university students in this study. In this study the output of the Reflector concerning group members’ intentions to enhance their social and cognitive performance by setting goals and formulating plans was, for practical reasons, primarily analyzed in a quantitative way. To examine whether these intentions lead to actual changes in social and cognitive behavior and activities (e.g., using discourse analysis to find out whether the intention to become friendlier actually led to more friendly and helpful behavior) it would have been necessary to record all interactions during collaboration over a period of eight weeks. All 49 groups, however, collaborated face-to-face at different times and
locations (e.g., at home or at the university), which makes it very complicated and impractical to record students’ interactions. The output of the Reflector provides no information on how the group members accomplished their goals or executed their formulated plans. It would be interesting to know what group members will do differently in their next collaborative assignment. Therefore, in future research, such reflective questions will be added to the Reflector to provide this information. To study the effect of Reflector on the development of perceived social performance over time, it would have been interesting to know how students, using or not using Reflector, perceived their group’s social performance during the collaboration process (at T2 or T3). However, letting them complete a questionnaire at T2 or T3 would have stimulated students not using Reflector to reflect on their group functioning and would thus have influenced the experiment. The lack of interviews with students after the collaboration process makes it hard to explain why partner variances decreased at T4 for students using Reflector, and why partner variances increased for students not using Reflector. Therefore, in further research we will interview a number of students to gain more information concerning the motivation for their self and peer ratings at different occasions. A practical implication is that results in this study indicate that self and peer ratings are not self-evidently reliable and valid. However, findings in this study show that students’ self and received peer ratings show more convergence over time. Results also indicated that reflection prompts caused self and peer ratings to become more objective and valid over time. Therefore, when self or peer ratings are used for developmental purposes, it is recommended that students complete several ratings (e.g., three measurement occasions), over a longer period of time (e.g., eight weeks), supplemented with reflection prompts aimed at awareness of (self and peer) behavior, group norms, and future (group) functioning. In conclusion, Social Relations Model (SRM) analyses proved to be very useful to examine variance and consensus in self and peer ratings. Results show that supplementing a self and peer assessment tool (i.e., Radar) with a shared co-reflection tool (i.e., Reflector), consisting of only six simple reflective questions, can lead to higher levels of consensus among peer raters, more moderate and less optimistic self and peer perceptions, and more valid judgments of social performance. These results indicate that, for self and peer ratings to be less optimistic and more moderate and valid, they should always be combined with individual and/or group reflection.
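As an illustration of the kind of secondary analysis suggested above, the following sketch (Python with NumPy) shows how actor and partner effects could be approximated from one group's round-robin rating matrix, and how assumed similarity and self-other agreement could then be computed as correlations between self-ratings and these effects. It is a simplified approximation, assuming plain row and column deviations and omitting the small-group correction weights of the full SRM estimators; the data and names are hypothetical, and it is not the procedure used in this chapter.

    import numpy as np

    def actor_partner_effects(ratings):
        """ratings: n x n matrix, ratings[i, j] = rating member i gives to member j.
        The diagonal (self ratings) is ignored. Returns simplified actor and partner
        effects as off-diagonal row and column mean deviations from the grand mean."""
        r = np.array(ratings, dtype=float)
        n = r.shape[0]
        off = ~np.eye(n, dtype=bool)                                   # off-diagonal mask
        grand_mean = r[off].mean()
        row_means = np.array([r[i, off[i]].mean() for i in range(n)])  # ratings given
        col_means = np.array([r[off[:, j], j].mean() for j in range(n)])  # ratings received
        actor = row_means - grand_mean      # rater tendency (assimilation)
        partner = col_means - grand_mean    # ratee tendency (consensus)
        return actor, partner

    # Hypothetical 4-person group: diagonal holds self ratings, off-diagonal peer ratings.
    group = np.array([[8, 6, 7, 5],
                      [7, 9, 8, 6],
                      [5, 6, 7, 4],
                      [6, 7, 8, 8]])
    actor, partner = actor_partner_effects(group)
    self_ratings = np.diag(group)

    # Across many groups these vectors would be pooled; within one small group
    # the correlations below are only illustrative.
    assumed_similarity = np.corrcoef(self_ratings, actor)[0, 1]
    self_other_agreement = np.corrcoef(self_ratings, partner)[0, 1]
    print(actor, partner, assumed_similarity, self_other_agreement)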
7. General Discussion and Conclusions
7.1 Introduction Group members’ self-views (i.e., self perceptions) show a tenuous to modest relationship with their actual behavior and performance (Dunning et al., 2004). Students tend to overrate themselves and hold overinflated views of their expertise, skill, and character (e.g., Chemers, Hu, & Garcia, 2001; Dunning, Heath, & Suls, 2004; Falchikov & Boud, 1989). This tendency of group members to believe that they are performing effectively, while they often are not, can undermine the group’s social (e.g., team development) and cognitive performance (e.g., quantity and quality of work), and cause it not to reach its full potential (Karau & Williams, 1993; Stroebe, Diehl, & Abakoumkin, 1992). To enhance collaboration and alleviate biased self perceptions, the learning environment can be augmented with computer supported collaborative learning (CSCL) tools that make group members aware of the discrepancies between their self perceptions and their actual behavior and performance (e.g., Janssen, Erkens, Kanselaar, & Jaspers, 2007). These tools, also known as group awareness tools, provide information about the social and collaborative environment in which a person participates (e.g., they inform students how their actual behavior and performance are perceived by their peers; see Buder, 2007, 2011). This enhanced group awareness can lead to more effective and efficient collaboration (e.g., Buder & Bodemer, 2008; Janssen, Erkens, & Kirschner, 2011). Two operationalizations of such tools were developed for this research project, namely (1) a shared self and peer assessment tool (Radar) and (2) a shared reflection tool (Reflector). These tools are intended to help group members become better aware of their individual and group behavior and to stimulate them to set goals and formulate plans for improving the group’s social and cognitive performance. According to Hattie and Timperley (2007), the effect of feedback (i.e., shared self and peer ratings in Radar) can be increased when students answer three reflective questions: (1) Where am I going?; (2) How am I going?; and (3) Where to next?. Accordingly, it was hypothesized that a combination of Radar and Reflector would be most effective for influencing group members’ behavior and enhancing their performance. The three empirical studies described in this thesis were all aimed at answering the following central research question: To what extent does assessment and reflection affect behavior and performance in small CSCL groups? Each of the three empirical studies focused on specific research questions concerning different aspects and availability of the assessment and/or reflection tool. The main research goals were (1) examining ways to let group members become aware of their behavior by means of an assessment tool, and (2) examining ways to alter behavior and performance by means of a reflection tool. Before we can answer this question theoretically, it is necessary to define the central concepts of the research question stated above. In this thesis, assessment is defined as the process through which students monitor and rate the behavior and/or performance of themselves (i.e., self
assessment) and/or their fellow group members (i.e., peer assessment). Reflection is defined as the intellectual and affective activities individuals engage in to explore their experiences to reach new understandings and appreciations of those experiences (Boud, Keogh, & Walker, 1985). In this thesis, however, students do not only reflect individually but also collaboratively on their experiences. This process of collaborative reflection (i.e., co-reflection) is defined as a collaborative critical thinking process involving cognitive and affective interactions between two or more individuals who explore their experiences in order to reach new intersubjective understandings and appreciations (Yukawa, 2006, p. 206). Behavior is defined as the perceived social (non-task related) and cognitive (task-related) aspects or activities that are important for successful collaboration. These social aspects, further referred to as social behavior, are measured by self and peer ratings in Radar on four variables: influence, friendliness, cooperativeness, and reliability. The cognitive aspects, further referred to as cognitive behavior, are measured by self and peer ratings in Radar on two variables: productivity and quality of contribution. Performance is defined as the social and cognitive achievement or output at the end of the collaboration process. The social achievement, further referred to as social performance, is measured by four subscales (i.e., team development, group satisfaction, levels of group conflict, and attitude toward problem-based collaboration). The cognitive output, further referred to as cognitive performance, is measured by the grade that was given to the group’s product (e.g., essay or paper). The next sections provide a summary of each study, and a synthesis of the results to answer the central research question. This is followed by a discussion of the methodological issues, as well as the theoretical and practical implications.
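To summarize the measurement structure just defined, a minimal sketch in Python (the class and field names are illustrative only and are not identifiers from the actual VCRI/Radar software):

    from dataclasses import dataclass

    # The six traits rated in Radar: four social, two cognitive.
    SOCIAL_TRAITS = ["influence", "friendliness", "cooperativeness", "reliability"]
    COGNITIVE_TRAITS = ["productivity", "quality_of_contribution"]

    @dataclass
    class BehaviorRating:
        """One self or peer rating of a group member on the six Radar traits."""
        rater: str
        ratee: str
        scores: dict  # trait name -> score

    @dataclass
    class GroupPerformance:
        """Performance as defined above: four social subscales plus the product grade."""
        team_development: float
        group_satisfaction: float
        group_conflict: float
        attitude_problem_based_collaboration: float
        product_grade: float  # cognitive performance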
7.2 Summary of the Studies The first empirical study (Chapter 4) examined, first, how the tools (Radar and Reflector) affect group members’ behavior over time, and second, main and interaction effects of Radar and Reflector on the groups’ social and cognitive performance at the end of the collaboration process. In this study Radar provided group awareness information on five traits deemed important for assessing behavior in groups. Four were related to social behavior, namely influence, friendliness, cooperation, and reliability. The last - productivity - was related to cognitive behavior. The reflection opportunity, provided by Reflector, was aimed at past and present group functioning. The second empirical study (Chapter 5) examined to what extent half-time awareness (i.e., first receiving the tools halfway through the collaboration process) affects the group’s social and cognitive performance. In this study the five Radar traits just mentioned were complemented with a sixth related to cognitive behavior, namely quality of contribution. Reflection in this study was aimed at future group functioning. The third empirical study (Chapter 6) examined the effect of Reflector on the level of consensus among peer raters and the groups’ social and cognitive performance.
7.2.1 Study 1: Awareness of group performance The aims of the first study (see Chapter 4) were to (1) design an assessment and a reflection tool, and (2) examine main and interaction effects of these tools on group members’ behavior and performance during CSCL. The VCRI (Virtual Collaborative Research Institute), a tried and tested CSCL-environment, was augmented with two independent and complementary tools in order to help group members become better aware of their individual and group behavior, and to stimulate them to reflect on their individual and group performance. The first tool was a self and peer assessment tool - Radar - which provided group members with information about their own social and cognitive behavior, that of their peers, and the behavior of the group as a whole. Group
members rated themselves and their peers on five traits deemed important for assessing social behavior (i.e., influence, friendliness, cooperation, and reliability) and cognitive behavior (i.e., productivity) in groups. The second tool was a shared reflection tool - Reflector - which allowed group members to share their individual reflections on their own functioning, on the ratings that they received from their peers, and on the functioning of the group as a whole, and also stimulated them to collaboratively reflect on their group performance and reach a shared understanding on this. Reflection was focused on past and present group functioning. Participants were 39 sophomore Dutch high school students. A 2x2 factorial between-subjects design, with the factors Radar unavailable (¬Ra) – available (+Ra), and Reflector unavailable (¬Rf) – available (+Rf), was used to examine whether these tools would lead to better social and cognitive behavior, better social performance (i.e., better team development, more group satisfaction, lower levels of group conflict, and more positive attitudes toward problem-based collaboration), and better cognitive performance (e.g., a better group product). As expected, at a second assessment (T2), results showed a decrease in peer ratings for groups with Radar and Reflector on influence, friendliness, and reliability as compared to the first assessment (T1). Peer ratings of groups with both tools (+Ra+Rf) at T2 correlated strongly with the self-assessments at the third and final measurement (T3) for influence and productivity, indicating a convergence of self and peer perceptions. A main effect on self-ratings was found for Reflector on influence, but not on other variables. Groups with Reflector rated themselves higher on influence than groups without Reflector. A main effect was found for Radar on social group performance. Groups with Radar perceived their team as being better developed, experienced lower conflict levels, and had a more positive attitude towards collaborative problem solving than groups without Radar. In conclusion, results were promising and demonstrated that use of Radar and Reflector in a CSCL environment can lead to more realistic self perceptions (i.e., higher convergence of self and peer ratings), and can enhance social group performance, compared to groups that do not have these tools.
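As an illustration of how main and interaction effects in such a 2x2 between-subjects design could be tested, a minimal sketch in Python with pandas and statsmodels (the variable names and simulated data are hypothetical, and this is not necessarily the exact analysis reported in Chapter 4):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(0)
    n = 40  # 10 participants per cell, for illustration only

    # Factors: Radar available (yes/no) and Reflector available (yes/no).
    df = pd.DataFrame({
        "radar": np.repeat(["no", "yes"], n // 2),
        "reflector": np.tile(np.repeat(["no", "yes"], n // 4), 2),
    })
    # Simulated social performance scores with a small Radar main effect.
    df["social_performance"] = (3.5
                                + 0.4 * (df["radar"] == "yes")
                                + rng.normal(0, 0.5, n))

    # Two-way ANOVA with main effects and the Radar x Reflector interaction.
    model = ols("social_performance ~ C(radar) * C(reflector)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))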
7.2.2 Study 2: Half-time awareness in a CSCL environment The second study (Chapter 5) examined the extent to which awareness halfway through the learning process (i.e., receiving the tools for the first time halfway through the collaboration process) affects the social and cognitive performance. Again, VCRI was augmented with Radar and Reflector to (1) help group members become aware of individual and group behavior, (2) stimulate them to reflect on their individual and group performance, and (3) stimulate them to set goals and formulate plans for improving the social and cognitive performance. In this follow-up study, Radar was complemented with a sixth trait that represents cognitive or task-related behavior, namely quality of contribution. Furthermore, reflection prompts were not only focused on past and present group functioning, but also on future group functioning. Reflector was complemented with a text which stimulated students to set goals and formulate plans to enhance social and cognitive group performance. Participants were 108 sophomore Dutch high school students, who collaborated in dyads (n = 16), triads (n = 84) or groups of four (n = 8) on a collaborative writing task in sociology. In the first experimental condition (n = 59), group members used Radar at the start (T1), and used Radar and Reflector halfway through (T2) and at the end (T3) of the collaboration process. In order to examine the effect of the tools halfway through the collaboration process, in the second experimental condition (n = 23), group members used Radar and Reflector halfway through the collaboration (T2) and at the end (T3). In the control condition (n = 26), group members used the
tools only at the end (T3), which also served as the outcome measure. At the end all participants completed a questionnaire that measured perceived social group performance (i.e., team development, group satisfaction, levels of group conflict, and attitude toward problem-based collaboration). The grade given to each group’s collaborative writing task (i.e., an essay) was used as a measure of cognitive group performance. Unexpectedly, at the second measurement (T2 – halfway through), groups that completed the tools for the second time (condition 1 – using the tools throughout) perceived higher levels of social and cognitive behavior than groups that completed the tools for the first time (condition 2 – using the tools halfway through). At the end, no differences were found for cognitive performance between conditions. As expected, at the end of the collaboration process (T3), group members using tools (conditions 1 and 2) showed higher convergence of self and peer ratings over time than group members not using tools (condition 3). Results indicated that the tools have an effect on how group members perceived the social and cognitive behavior of themselves and of their peers. Overall, groups using the tools throughout perceived higher levels of social and/or cognitive behavior of themselves and their peers at the end (T3), compared to groups not using the tools. Group members using tools throughout (condition 1) perceived better social performance than group members using the tools only from halfway (condition 2) or not using the tools (condition 3). In conclusion, results showed that the use of Radar and Reflector in a CSCL environment supports students to (1) become more aware of interpersonal perceptions and behavior, (2) exhibit higher levels of social and cognitive behavior at the end, (3) establish shared perceptions on interpersonal behavior, and (4) enhance social group performance.
7.2.3 Study 3: Using reflection to increase consensus among peer raters The third study (Chapter 6) examined whether the use of a shared reflection tool (Reflector) could enhance the level of consensus among peer raters, and enhance their social and cognitive group performance. For this study an experimental design was used with one experimental and one control condition. The experimental condition (n = 105) received Radar and Reflector, and the control condition (n = 86) received only Radar. Participants were 191 second-year Dutch university Educational Science students (37 male, 154 female), who collaborated face-to-face in groups of three, four, or five on a collaborative research task in educational psychology. Each group had to write a research paper about a pilot-study which they conducted over a period of eight weeks. During this period, students in both conditions completed the Radar four times. Additionally, students in the experimental condition completed the Reflector three times, starting at the second measurement occasion. To complete Radar and/or Reflector, students logged into VCRI. Radar provided group members with information about the social (i.e., influence, friendliness, cooperation, reliability) and cognitive behavior (i.e., productivity and quality of contribution) of themselves, their peers, and the group as a whole. Reflector stimulated group members to reflect individually and collaboratively on the received peer ratings, their own performance and the performance of the group. It was assumed that Reflector would lead to awareness of unrealistic self and peer perceptions, support group members in setting shared goals to improve their group performance, and support group members in forming judgments about what can be referred to as good group behavior and high-quality performance. Therefore, it was expected that, compared to groups without Reflector, groups with Reflector would show (1) higher levels of consensus between peer raters, (2) a different development of self and peer ratings over time, (3) better perceived social
performance at the end, (4) better cognitive performance at the end, and (5) a more valid perception of their social performance. Unexpectedly, perceived social and cognitive behavior (self and peer ratings) did not increase at the end of the collaboration process in either condition (with or without Reflector). At the end, groups with Reflector showed more moderate and less optimistic perceptions of social group performance (i.e., less team development and more conflicts). No significant differences were found between groups with or without Reflector for cognitive group performance. As expected, results showed that supplementing a self and peer assessment tool (i.e., Radar) with a reflection tool (i.e., Reflector) can lead to: higher levels of consensus among peer raters across time; more moderate self and peer perceptions across time; and more valid perceived social performance, that is, significant correlations between average peer ratings on behavior and perceived social performance. In conclusion, results showed that supplementing Radar with Reflector leads to more consensus among raters, more moderate self and peer perceptions, and more valid judgments of social performance.
7.3 Synthesis
7.3.1 Why the tools? / Why should the tools work? Working in a group can be very frustrating, especially when a fellow group member does not live up to the standards of his/her peers (e.g., when a group member is failing to keep his/her appointments, not fulfilling his/her tasks on time, being less productive, producing low-quality work, or free-riding on the work of others). In order to make group members more aware of their behavior and performance during collaboration, group members can do a critical self assessment by reflecting on their own performance, and/or receive an assessment of their performance by others (e.g., their fellow group members). The use of self and peer assessment in small groups can provide group members with useful information about their own behavior and performance. However, there are six possible problems that may arise when self and peer assessments are used to enhance performance. First, although peers have been called the most accurate and informed judges of the behavior of other group members (Kane & Lawler, 1978; Lewin & Zwany, 1976; Murphy & Cleveland, 1991), and peers have many opportunities to observe their colleagues’ social (non-task related) and cognitive (task-related) behavior, they cannot observe all aspects that are relevant and important for successful collaboration. For example, peers can observe the performance (e.g., work results) and behavior (e.g., friendliness) of a specific group member during interaction with him/her, but they are unable to observe interactions taking place between other group members outside their presence, especially when groups use computer mediated communication (CMC) systems (e.g., chatbox). As a result, group members lack a lot of information about fellow group members’ socioemotional processes (e.g., frustrations, feelings of trust), which are important for a successful collaboration process (e.g., Järvelä, Järvenoja, & Veermans, 2008; Järvelä, Volet, & Järvenoja, 2010). To partially overcome this lack of information, in this research project, a CSCL environment was augmented with an easy-to-complete and easy-to-interpret self and peer assessment tool (Radar). Using Radar, students rate themselves and their peers on four social aspects (i.e., influence, friendliness, cooperation, and reliability), and two cognitive aspects of collaboration (i.e., productivity and quality of contribution). Radar shares these self and peer ratings with each group member by visualizing these ratings anonymously in a Radar diagram. Because group members’ self ratings are shared in Radar, all group members receive information on their fellow group members’ intentions (e.g., to be involved in the collaboration process, be
friendly, cooperative and trustworthy). Because group members’ peer ratings are shared in Radar, all group members receive information on how their social and cognitive intentions are perceived and experienced by their peers. The strength of Radar lies in its ability to make implicit aspects of collaboration (e.g., frustrations among peers) explicit for all group members. Radar enhances students’ awareness of behavior and performance by providing them with explicit information concerning their behavior (e.g., being too dominant) or their performance (e.g., contributing low quality work). To further enhance students’ awareness of behavior and performance, the CSCL environment was also augmented with a shared collaborative reflection tool (Reflector), which stimulates and supports students to reflect upon the discrepancies between their self perceptions (i.e., self ratings) and their actual behavior (i.e., received peer ratings), and provides students with cues for behavioral adaptation. The second problem of the use of self and peer assessment in general is that it cannot be assumed that students have the skills to provide and receive effective peer feedback (Prins, Sluijsmans, Kirschner, & Strijbos, 2005; Sluijsmans, 2002). Sluijsmans and Van Merriënboer (2000) analyzed peer feedback skills in the domain of teacher education and identified three important sub-skills that need to be supported: (1) defining the assessment criteria, (2) assessing the product or contribution to group performance of a peer, and (3) delivery of the peer feedback. Prins et al. (2005) added receiving peer feedback as a fourth peer feedback sub-skill and emphasized the importance of a feedback dialogue for the development and emergence of effective peer feedback. Therefore, in this research project, Radar supported group members in providing peer feedback (i.e., the assessment input) as well as receiving it (i.e., the assessment output). For example, Radar supported the assessment input by defining the assessment criteria and stimulating group members to rate themselves and their peers on six traits deemed important for group work. Also, Radar supported the assessment output by providing each group member with the (average) peer ratings of his/her fellow group members. Furthermore, Reflector supported group members in reflecting on this matter and regulated a feedback dialogue. Reflector structured and regulated the feedback dialogue by sharing group members’ individual reflections on their received peer ratings with all fellow group members, and stimulating them to collaboratively reflect on the functioning of the group as a whole, reach a shared conclusion on this matter, and set shared goals for improving it. The third problem of the use of self and peer assessment in general is that students tend to emphasize their strengths and positive performances, and attribute weaknesses and negative performances to others (e.g., Klein, 2001; Saavedra & Kwun, 1993). This tendency, also known as attribution (e.g., Eccles & Wigfield, 2002; Weiner, 1985), can result in unrealistically high self ratings and low peer ratings. To overcome this tendency, students need to become more aware of the – often unrealistic and inaccurate – standard they use to compare and rate social and cognitive behavior of themselves and their peers.
Therefore, in this research project, Radar shares self and peer ratings with all group members in order to make them more aware of their inaccurate self and peer perceptions, by showing discrepancies between their own perceptions (i.e., self and provided peer ratings) and those of others (i.e., received self and peer ratings of fellow group members). The fourth problem is that peers may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983). During completion of self and peer ratings, students make many mental comparisons (Goethals, Messick, & Allison, 1991), which are selected, interpreted, and/or biased (Saavedra & Kwun, 1993). Interpersonal relationships among group members may cause peers to be too lenient or less discriminating when rating their friends. By working closely together, group members often develop friendly relationships with one
another. Therefore, peers may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983), or rate everyone similarly in order not to cause friction within the group (Murphy & Cleveland, 1995). A solution for this problem is the use of anonymous peer ratings. According to Kagan, Kigli-Shemesh, and Tabak (2006), anonymous peer ratings are one of the most effective and objective ways to gather information on individual behavior and performance. Some researchers (e.g., Cestone, Levine, & Lane, 2008), however, suggest that anonymous peer ratings provide harsher criticisms and evaluations, and as a result have a negative impact on the relationships between group members. Others (e.g., Bamberger, Erev, Kimmel, & Oref-Chen, 2005) found no empirical evidence that anonymous peer assessment harmed relationships and impaired group task focus and functioning. Therefore, in this research project, Radar anonymously shares all provided peer ratings with each group member in order to (1) make group members more aware of the discrepancies between their self and received peer ratings, and (2) stimulate peers to provide more accurate peer ratings. The fifth problem is that information gathered and received by anonymous peer ratings is only reliable and useful for enhancement of performance when all peer raters agree (i.e., show high levels of consensus) about what can be referred to as good group behavior and high-quality performance. To overcome large dissimilarities in peer ratings, group members need to become aware of the different perceptions and standards they use to compare and rate the social and cognitive behavior of themselves and their peers, and develop shared norms and standards. This process of norm setting, also known as norming (Tuckman & Jensen, 1977), in which group members reach consensus about their behavior, goals and strategies, is an important stage in group development to become a well performing group (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002). Therefore, in this research project, Radar anonymously shared all provided and received ratings of each group member in order to make group members more aware of the different (1) interpersonal perceptions on the social and cognitive behavior of themselves and their peers, and (2) standards they use to compare and rate each other’s social and cognitive behavior. To reach shared norms and standards, Reflector stimulates individual reflection on discrepancies between self and received peer ratings, and stimulates group members to reflect collaboratively upon their group performance (see section 3.3). This reflection process allows group members to discuss discrepancies in group members’ individual norms, and to reach a shared standard about what can be referred to as good group behavior and high-quality group performance. For example, findings in Chapter 6 show that supplementing Radar with Reflector leads to higher levels of consensus (i.e., higher partner variance) among peer raters. The sixth problem is that solely providing information about different interpersonal perceptions is probably not enough to alter group members’ behavior or change their rating standards (if needed).
To do so, group members need to reflect individually upon their self and peer ratings by asking themselves (1) what arguments they have to give high or low ratings to themselves or their peers, (2) whether they understand the (different) interpersonal perceptions and ratings on the behavior of themselves and their peers, (3) whether they accept these (different) perceptions, and (4) whether these provide clues to change their own perceptions and behavior (e.g., Prins, Sluijsmans, & Kirschner, 2006). However, it cannot be assumed that group members will automatically reflect on a high cognitive level on their perceived and received peer assessments (e.g., Kollar & Fischer, 2010). Thus, the effect of Radar on group members’ behavior depends on the willingness and ability of group members to reflect upon the information provided by this tool. For instance, Radar can provide group members with cues for needed behavioral adaptation (e.g., I come across as too strong), but they might lack the will and ability to reflect upon this information (e.g., what should I do to ‘lighten up’?).
To overcome this problem, Reflector stimulated, structured and regulated the reflection process. Reflector stimulated group members to individually reflect and provide information on: their own perspective on their personal performance; differences between their self perception and the perception of their peers concerning their personal performance; whether they agree with those perceptions; and their individual perspective on group performance. Because group performance is determined by the individual effort of all group members, Reflector also stimulated group members to collaboratively reflect on group performance and reach a shared conclusion on this. Finally, based on their shared conclusion, Reflector stimulated groups to set shared goals for improving the group’s social and cognitive performance. Reflector structured and regulated the reflection process by allowing group members to share their individual reflections with their peers; however, group members could only gain access to their peers’ reflections after they had completed and added their own reflection.
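To make the Radar visualization described in this section concrete, a minimal sketch in Python with NumPy and matplotlib (the function name, 10-point scale, and example values are illustrative assumptions, not taken from the actual Radar implementation) of how a self rating profile and the averaged, anonymous peer ratings on the six traits could be drawn as a radar diagram:

    import numpy as np
    import matplotlib.pyplot as plt

    TRAITS = ["influence", "friendliness", "cooperation",
              "reliability", "productivity", "quality of contribution"]

    def radar_diagram(self_ratings, mean_peer_ratings, max_score=10):
        """Plot a self rating profile against the averaged (anonymous) peer ratings."""
        angles = np.linspace(0, 2 * np.pi, len(TRAITS), endpoint=False)
        # Close the polygon by repeating the first value.
        angles = np.concatenate([angles, angles[:1]])
        s = np.concatenate([self_ratings, self_ratings[:1]])
        p = np.concatenate([mean_peer_ratings, mean_peer_ratings[:1]])

        ax = plt.subplot(polar=True)
        ax.plot(angles, s, label="self rating")
        ax.plot(angles, p, label="average peer rating")
        ax.set_xticks(angles[:-1])
        ax.set_xticklabels(TRAITS, fontsize=8)
        ax.set_ylim(0, max_score)
        ax.legend(loc="lower right")
        plt.show()

    # Hypothetical profile: the self ratings sit above the received peer ratings.
    radar_diagram([8, 7, 8, 9, 7, 8], [6, 6, 5, 7, 5, 6])

In the actual tool, each group member would see his or her own profile plotted against the anonymised ratings received from peers, making discrepancies between self and peer perceptions directly visible.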
7.3.2 To what extent did the tools affect group members’ behavior? In this research project, it was expected that the information (i.e., peer ratings) provided by Radar at the first assessment would make group members aware of their unrealistic self and peer perceptions, resulting in a decrease of self and peer ratings at a subsequent assessment. However, the effect of Radar depends on the willingness and ability of group members to reflect upon the received information (i.e., self and peer ratings). Therefore, it was also hypothesized that a combination of Radar and Reflector would be even more effective. Thus, it was expected that a combination of Radar and Reflector would lead to even lower self and peer ratings than Radar alone. It was assumed that Radar and Reflector would have this effect on the self and peer ratings, because Radar creates the opportunity for social comparison, which means that students can compare themselves to other group members. Radar facilitates an awareness of group norms by providing students with self and (average) peer ratings of their group members’ social and cognitive behavior, which are visible and available for all group members. This awareness is further enhanced by Reflector, which stimulates students to reflect upon the discrepancies between their self perceptions (i.e., self ratings) and their actual behavior (i.e., received peer ratings). By comparing themselves to other group members, students are motivated to set higher standards (e.g., Janssen, Erkens, Kanselaar, & Jaspers, 2007; Michinov & Primois, 2005), resulting in a decrease of self and peer ratings. In line with the hypotheses, Radar and Reflector had an effect on group members’ perceived social and cognitive behavior halfway through the collaboration process, at the second measurement occasion (T2). That is, for groups using Radar and Reflector, results show a decrease in self ratings (Study 3) and in peer ratings (Studies 1 & 3) at the second measurement (T2). In Study 2, however, self and peer ratings increased at T2, which was not in line with the hypothesis. A possible explanation could be that, in comparison to Studies 1 and 3, group members in Study 2 were familiar with collaborating with each other. Prior to the experiment of Study 2, participants had collaborated with each other for one month. It is likely that this earlier collaboration period caused students to have more realistic perceptions, and these perceptions increased over time by using Radar and Reflector. Furthermore, by comparing themselves to other group members, students are not only motivated to set higher standards, but also to increase their effort (e.g., Janssen et al., 2007; Michinov & Primois, 2005). Because of this increase in effort and the fact that Reflector supports group members to set shared goals to improve their group performance, it was expected that social and cognitive behavior would improve towards the end of the collaboration process (i.e., increasing self and peer ratings at the end). Results showed that Radar and Reflector had an effect on how group members perceived the social and cognitive behavior of themselves and their peers at the
end of the collaboration process. Groups with Reflector rated themselves significantly higher on influence at the end than groups without Reflector (see Study 1). Peer ratings increased at the end of the collaboration process, but no significant main effect was found for Radar or Reflector (Study 1). Groups using the tools throughout perceived higher levels of social and/or cognitive behavior of themselves and their peers at the end (T3), compared to groups not using the tools (see Study 2). In Study 3, however, self and peer ratings for all groups (Radar with or without Reflector) decreased over time, with a main effect of Reflector on self ratings. Self ratings of students using Reflector were significantly lower on influence, friendliness and quality of contribution than those of students without. Also, students using Reflector rated their peers significantly lower on all traits, compared to students without. In conclusion, the findings stated above are not in line with the hypothesis that self and peer ratings should increase at the end (see hypothesis stated in Studies 1 & 2), but do support the assumption that (1) Radar motivates students to set higher standards, which results in a decrease of self and peer perceptions (i.e., self and peer ratings), and that (2) Reflector further enhances the effect of the Radar, which results in more moderate and less optimistic self and peer ratings for groups with Reflector (see hypothesis stated in Studies 1, 2 & 3).
7.3.3 To what extent did the tools lead to more valid self and peer perceptions? Visualization of group norms (i.e., self and peer ratings) in Radar and explicit reflection upon these norms and standards in Reflector (i.e., reflection on discrepancies between self and peer ratings, and co-reflection on group performance) allow group members to discuss discrepancies in group members’ individual norms, and to reach a shared standard about what can be referred to as good group behavior and high-quality performance. This process of norm setting, also known as norming, in which group members reach consensus about their behavior, goals and strategies, is an important stage in group development to become a well performing group (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002; Tuckman & Jensen, 1977). Therefore, it was expected that use of Reflector would lead to (1) higher correlations (i.e., congruency) between self and received peer ratings, and (2) a more valid perception of their social performance, for example, higher correlations between perceived behavior (i.e., cooperative behavior) during collaboration and their perceived performance at the end (i.e., team development). It was also expected that use of Reflector would lead to (3) higher consensus among peer raters, that is, the degree to which all group members agree in rating some peers as, for example, very friendly and other peers as very unfriendly. First, in line with the hypothesis, groups using Reflector show higher correlations between self and received peer ratings. Results indicate that groups with Radar and Reflector show more and higher positive correlations between self and peer ratings than groups with only Radar (Study 1). Furthermore, groups using Radar and Reflector showed higher convergence of self and peer ratings over time than groups not using Radar and Reflector (see Study 2). In Study 3, both groups (with and without Reflector) show non-significant or relatively small correlations at the first assessment, but significant medium to large correlations at subsequent assessments. However, at the end of the collaboration (T4), groups with Reflector showed no significant correlations between self and peer ratings on reliability, and medium and low correlations on productivity and quality of contribution, compared to high correlations on these traits for groups without Reflector. An explanation could be that students not using Reflector maintain the tendency to overrate themselves (Dunning, Heath, & Suls, 2004), and rate their peers too leniently (Landy & Farr, 1983; see the higher self and peer ratings of groups without Reflector in paragraphs 6.5.3 and 6.5.4), whereas groups with Reflector need time to adjust their self and perceived peer
ratings to reach more consensus on reliability, productivity and quality of contribution (see results in Table 6.9 at T3). Second, in line with the hypothesis, groups using Reflector show higher correlations between perceived behavior (i.e., peer ratings in Radar during collaboration) and their performance (i.e., outcomes of the questionnaire on social performance at the end). Results show that, for students with Reflector, peer ratings correlate significantly positively with their perceived social performance. For students without Reflector, peer ratings did not correlate with their perceived social performance. This indicates that, only for students with Reflector, scores on social performance (in total) are based on the perceived social and cognitive behavior of their peers in the group (Study 3). Third, in line with the hypothesis, groups with Radar and Reflector show higher levels of consensus (i.e., higher partner variances and lower actor variances) between peer raters (Study 3). This means that ratings of students with Reflector are more determined by the perceived behavior of the ratees, and less determined by the tendency of the rater to rate all other group members as high or low on a particular trait. Thus, use of Reflector leads to more valid peer ratings. In conclusion, the obtained results strongly indicate that, over time, group awareness tools such as Radar and Reflector can lead to more congruency (higher correlations) between self ratings and received peer ratings (Studies 1, 2 & 3), a more valid perception of their social performance (Study 3), and higher levels of consensus between peer raters (Study 3).
7.3.4 To what extent did the tools enhance group performance at the end? In this research project, it was expected that social and cognitive group performance would be positively affected by Radar and Reflector because, as stated in the previous sections, it was assumed that Radar would enhance group members’ awareness of their own behavior, motivate students to set higher standards and increase their effort. It was expected that this positive effect of Radar on group members’ social and cognitive behavior would be further enhanced by Reflector because it stimulates (1) individual reflection on discrepancies between their self and received peer ratings in Radar, (2) collaborative reflection upon their group performance in order to reach a shared conclusion about it, and (3) the setting of shared goals for improving the group’s social and cognitive performance. In order to reach a shared conclusion about the groups’ performance, group members need to discuss how well their group is functioning, what can be referred to as good group behavior and high-quality performance, and how group processes can be improved. These discussions, also known as group processing (Webb & Palincsar, 1996), help groups to pinpoint, comprehend, and solve collaboration problems (e.g., free riding by some group members) and contribute to successful collaborative behavior (Yager, Johnson, Johnson, & Snider, 1986). In line with the hypothesis that Radar and Reflector enhance social performance, results show that groups using the tools throughout (Radar with or without Reflector) perceived higher levels of social group performance (i.e., group process satisfaction) compared to groups not using the tools (see Studies 1 and 2). These findings are in line with a study by Zumbach, Hillers, and Reimann (2004) which showed that their group awareness tool (visualizing parameters of interaction, such as participation, motivation, and contribution) positively affected students’ learning process, group performance, and motivation. However, not in line with the hypothesis stated above, groups with only Radar perceived more team development and fewer intra-group conflicts than groups with Radar and Reflector (Study 3). An explanation could be that, because Reflector stimulates group members to reflect explicitly (i.e., look more closely) upon the functioning of the group and set shared goals for improvement, this process leads to more group awareness of
different perspectives on group functioning, resulting in more conflicts and less team development. This would be in line with Tuckman and Jensen’s (1977) concept of group development stages (i.e., forming, storming, norming, performing, and adjourning). It can be expected that a tool like Reflector may cause group development to revert to the stage of storming, which can be contentious, unpleasant, and even painful to group members who do not like conflicts (e.g., Bales & Cohen, 1979; Tuckman & Jensen, 1977). However, differences between the two conditions (with or without Reflector) are small and no significant differences were found for social performance in total. Furthermore, no significant main effects were found for Radar or Reflector on the groups’ cognitive performance (i.e., the grade given to their paper). An explanation could be that, in Studies 1 and 2, time was too short for the tools to have an effect on the groups’ cognitive performance. The small range in grades given to the students’ papers (see Study 3) is probably also one of the reasons for not finding any significant differences in cognitive performance between conditions.
7.3.5 What are the preconditions for Radar and Reflector to be effective? Although the Radar is easy to complete and interpret, and Reflector stimulates, structures and regulates the reflection process during collaboration, students still require a certain metacognitive skillfulness, which enables them to critically compare their social and cognitive behavior with that of fellow group members, and critically reflect upon their individual and group performance (e.g., Prins, Veenman, & Elshout, 2006; Veenman, Wilhelm, & Beishuizen, 2004). Veenman et al. (2004) found that metacognitive skillfulness is a general, person-related characteristic across age groups, rather than being domain-specific. Therefore, we recommend using these tools at educational levels similar to sophomore high school or higher. Results indicate that students’ self and received peer ratings show more convergence over time (all studies), and these ratings also become more objective and reliable when supplemented with reflection prompts (Study 3). Therefore, when self or peer ratings are used for developmental purposes, students should complete several ratings (e.g., three measurement occasions) over an appropriate period of time (e.g., eight weeks), supplemented with reflection prompts aimed at future (group) functioning. Results of Study 2 showed that most self and peer ratings increased significantly at the second assessment (T2), compared to a decrease in Studies 1 and 3. An explanation for these contradictory results could be the familiarity between group members. In comparison to Studies 1 and 3, group members in Study 2 were familiar with collaborating with each other. It is possible that this prior collaboration period caused students to already have more realistic (i.e., less positive) self and peer perceptions, compared to students without this collaboration history. This assumption would be in line with a study by Janssen, Erkens, Kirschner, and Kanselaar (2009), who found that higher familiarity led to more critical and exploratory group norm perceptions, and more positive perceptions of online communication and collaboration. In conclusion, the contradictory results of Study 2 versus Studies 1 and 3 imply that Radar and Reflector are effective in both familiar and unfamiliar groups. However, it should be kept in mind that higher familiarity can lead to more positive group norm perceptions (see Study 2), but also to more negative (i.e., critical) group norm perceptions (e.g., Janssen et al., 2009). As stated in paragraph 7.3.1, peers may be unwilling to provide accurate ratings and instead may rate their friends too leniently (Landy & Farr, 1983), or rate everyone similarly in order not to cause friction within the group (Murphy & Cleveland, 1995). Therefore, it is recommended that users ensure anonymity in Radar as long as there is no relevant reason to eliminate it,
because this will probably be one of the most effective and objective ways to gather information on individual behavior and performance (Kagan, Kigli-Shemesh, & Tabak, 2006). As stated in the introduction, the effect of Radar on students’ behavior depends on the willingness and ability of students to reflect upon the information provided by this tool. However, it cannot be assumed that group members will automatically reflect at a high cognitive level on the peer assessments they give and receive (e.g., Kollar & Fischer, 2010). Furthermore, Reflector enhances the effect of the Radar, which results in more moderate and less optimistic self and peer ratings for groups with Reflector (see the hypothesis stated in Studies 1, 2 & 3). Therefore, the recommended sequence for completing the tools is first Radar, then Reflector.
7.3.6 Conclusions about the tools In conclusion, the three empirical studies in this thesis examined to what extent a self and peer assessment tool (Radar) and a co-reflection tool (Reflector) affect behavior and performance in small CSCL groups. The obtained results strongly indicate that, over time, Radar and Reflector can have a positive effect on perceived social and cognitive behavior (Studies 1 & 2), lead to more congruency (higher correlations) between self and peer ratings (Studies 1, 2 & 3), and have a positive effect on social group performance (Studies 1 & 2). Furthermore, results indicate that reflection prompts aimed at the discrepancies between self ratings and received peer ratings stimulate students to set higher standards for themselves, resulting in more moderate and less optimistic self and peer perceptions (Study 3). Reflection prompts aimed at reaching a shared conclusion about the group’s functioning and at formulating plans to improve it stimulate group members to create shared norms and standards (Tuckman & Jensen, 1977) for what can be referred to as good group behavior and high-quality performance, lead to more consensus among peer raters (Study 3), and result in a more valid perception of social group performance (Study 3).
7.4
Methodological Issues
Although results show that group awareness tools such as Radar and Reflector can positively affect the interaction and social performance in small groups, there are some methodological issues that should be kept in mind.
7.4.1 Measuring Group Awareness Equals Intervention In our studies, Radar is an intervention tool, but also a measurement tool. This has the disadvantage that it is not possible to measure awareness without intervening. For example, it is not possible to measure students’ awareness of behavior (i.e., discrepancies between their self-perceptions of behavior and their actual behavior as perceived by their peers) without explicitly asking students about it. If students are not aware of these discrepancies, asking them this question will inevitably stimulate reflection on the matter and increase their awareness of behavior. Furthermore, having Radar as both a measurement and an intervention tool limited the designs of the studies. For example, for groups with only Reflector it was not possible to measure students’ awareness of behavior during collaboration without intervening (e.g., through interviews, retrospection, or augmenting these groups with Radar). To overcome this problem, we used a time-series design with three conditions (tools throughout, tools halfway, no tools) to examine to what extent half-time awareness (receiving the tools for the first time halfway through the collaboration process) affects the group’s social and cognitive behavior and performance.
7.4.2 Comparability of the three empirical studies Another issue is the comparability of the three empirical studies. There are several differences between the three empirical studies, such as the way in which students collaborated and used the tools, familiarity, sample size, educational level, age/maturation, and duration of the collaboration task. The most important difference between the studies is the way in which students collaborated and used the tools. In Studies 1 and 2, students fully collaborated in a CSCL environment and could only communicate through a chat tool. In Study 3, students collaborated face-to-face and only used the CSCL environment to work with the group awareness tools (Radar with or without Reflector) across time. Students could set goals and formulate plans outside the CSCL environment, during face-to-face interactions. This could explain why, compared to Studies 1 and 2, no significant effects of Reflector were found on the groups’ social performance. Full collaboration in the CSCL environment could have enhanced the effect of Radar and Reflector, because then group members would have had to rely on the tools to gain information on the behavior and performance of their peers. Another explanation for not finding significant effects of Reflector on the groups’ social performance could be the difference in reflection skills between the students of Studies 1 and 2 (sophomore Dutch high school students) and Study 3 (second-year educational science students). Reflection skills of second-year university students are far more developed than those of sophomore high school students. Therefore, the added value of Reflector is much higher for high school students than for university students, which could explain why an effect of Reflector was only found for high school students. Groups also differed in their familiarity with collaborating with each other. In comparison to Studies 1 and 3, group members in Study 2 were familiar with collaborating with each other. Prior to Study 2, participants had collaborated for one month on a sociology project. It is possible that this prior collaboration period caused students to already have more realistic (i.e., less positive) self and peer perceptions, compared to students without this collaboration history. This assumption would be in line with a study by Janssen, Erkens, Kirschner, and Kanselaar (2009), who found that higher familiarity led to more critical and exploratory group norm perceptions. Thus, although it was hypothesized that Radar and Reflector would lead to a decrease in self and peer ratings during collaboration, ratings of students who had worked with each other previously would most likely increase over time when using Radar and Reflector. This assumption would also be in line with the results of Study 2, which showed that most self and peer ratings increased significantly at the second assessment (T2), compared to a decrease in Studies 1 and 3. In conclusion, assuming that familiarity explains why the findings of Study 2 were not in line with the hypothesis that ratings would decrease due to Radar and Reflector, we still consider the hypothesis plausible based on the findings in Studies 1 and 3. Another difference between these studies is sample size. The sample size in the first study (pilot study) is relatively small (N = 39) compared to the second (N = 108) and third study (N = 191).
Although in Study 1 significant main effects were found for Radar on the groups’ social performance (e.g., team development, level of group conflict, and attitude towards collaborative problem solving), the added value of this pilot study can be questioned in comparison to Studies 2 and 3. Finally, it should be noted that the development of students’ awareness of behavior was not only examined in two short-term studies, but also in a medium-term study. The duration of the collaboration task varied across two 90-minute sessions separated by one week (Study 1), three 45-minute sessions over a period of one week (Study 2), and a collaboration period of eight weeks (Study 3). Results indicate that students’ self and received peer ratings show more convergence over time (all studies), and these ratings also become more objective and reliable when
supplemented with reflection prompts (Study 3). Therefore, when self or peer ratings are used for developmental purposes, students should complete several ratings (i.e., three measurement occasions) over a longer period of time (i.e., eight weeks), supplemented with reflection prompts aimed at future (group) functioning.
7.4.3 Radar: design issues and considerations An important criterion in designing the self and peer assessment tool (Radar) was that it should be easy to complete and easy to interpret. The aim of the tool was to enhance awareness of group members’ social and cognitive behavior, and, in turn, enhance social and cognitive group performance. Therefore, Radar provides information on six traits important for assessing behavior in groups. Four are related to social or interpersonal behavior, namely (1) influence, (2) friendliness, (3) cooperation, and (4) reliability; and two are related to cognitive behavior, namely (5) productivity and (6) quality of contribution. Although these traits were derived from studies on interpersonal perceptions, interaction, group functioning, and group effectiveness (e.g., Bales, 1988; den Brok, Brekelmans, & Wubbels, 2006; Kenny, 1994; Salas, Sims, & Burke, 2005), it can be questioned whether the number of axes (dependent variables) in Radar is sufficient, and whether there are other, more adequate, traits or dimensions (i.e., collaboration skills) to enhance behavior and performance. Nevertheless, it was hard to find studies or good practices that we could use as examples. There are hardly any other empirical studies that examined the effect of self and peer assessment on students’ perceptions of interpersonal behavior and on students’ social and cognitive performance (Van Gennip, Segers, & Tillema, 2009, 2010). In this thesis, students rated perceived social and cognitive behavior on a continuous scale ranging from 0 to 4 (0 = none, 4 = very high). It can be questioned whether quantitative feedback through self and peer ratings is the best method to enhance group awareness of behavior and performance. There are other quantitative and qualitative methods to enhance awareness. For example, two other quantitative methods are peer ranking and peer nomination. Peer ranking consists of having each group member rank all of the others from best to worst on one or more factors. Peer nomination consists of having each member of the group nominate the member who is perceived to be the highest in the group on a particular characteristic or dimension of performance (Dochy, Segers, & Sluijsmans, 1999). These quantitative methods, however, mainly provide information about the best performing peer, and provide little information about the performance of the other group members. According to Pond, Ul-Haq, and Wade (1995), these methods are also not able to prevent rating bias. An example of a qualitative method is to ask students to write a short assessment report with qualitative feedback (i.e., compliments, arguments, and recommendations) concerning fellow group members or a group product (e.g., Prins, Sluijsmans, Kirschner, & Strijbos, 2005). Nevertheless, in our opinion, using self and peer ratings is one of the fastest, easiest to implement, easiest to complete, and most detailed methods to enhance students’ awareness of behavior and performance. Another issue is the construct validity of self and peer ratings in Radar. Although care was taken to ensure that all raters used the same definition of the six traits in Radar (e.g., by providing text balloons with content information and definitions), it could be that construct validity was limited because raters used different standards to rate themselves and their peers, especially at the first measurement occasion.
In this thesis, it was assumed that the validity of self and peer ratings in Radar would be enhanced over time by Reflector, which allowed group members to jointly discuss the ratings they gave (Farh, Cannella, & Bedeian, 1991; Saavedra & Kwun, 1993). This assumption is supported by several findings in this thesis, which indicate that use of Reflector
(i.e., stimulating group members to collaboratively reflect upon their ratings) leads to higher levels of agreement (consensus) between raters, and higher congruency (correlations) between self and received peer ratings. However, high consensus and congruency may only indicate that the same standards were applied (e.g., peers agree with each other that a specific group member is unfriendly); they provide no information about whether the grounds for agreement are right or wrong (e.g., perhaps the peers had a fight with this specific group member during lunch). Qualitative measurements are required to examine whether these self and peer ratings are based on group members’ actual behavior. An example of qualitative research on the construct validity of self and peer ratings in Radar was carried out by Van Strien (2009), who studied the construct validity of self and peer ratings on influence, friendliness, and productivity in Radar by examining whether these ratings (i.e., ratings on productivity) were actually based on the frequency of specific actions in the CSCL environment (i.e., the amount of text that was added to the group’s essay). He found that these ratings are construct valid only to a limited degree, but that reflection on one’s functioning within the group may have a positive effect on this validity. These findings are in line with the assumption that the validity of ratings can be enhanced by allowing group members to jointly discuss the ratings they gave (Farh et al., 1991; Saavedra & Kwun, 1993). Therefore, to increase the construct validity of self and peer ratings in an educational setting, it is recommended that (1) students jointly discuss the criteria and standards they use to rate themselves and their peers, or that (2) students define their own assessment constructs collaboratively (e.g., Dochy, Segers, & Sluijsmans, 1999). To reach a high level of construct validity, it is preferable that students reach a shared understanding about their standards at classroom level, so that construct validity is not only high within groups, but also between groups.
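To make the distinction between congruency and consensus concrete, the following minimal Python sketch may help. It is illustrative only and not part of the thesis software: the rating matrix is invented, and the dispersion-based consensus proxy is a simplification of the SRM variance-component approach actually used in Study 3.

import numpy as np

# Hypothetical round-robin matrix for one Radar trait (0-4 scale):
# rows = raters, columns = ratees, diagonal = self-ratings.
ratings = np.array([
    [3.5, 2.0, 3.0, 2.5],
    [3.0, 3.5, 2.5, 2.0],
    [2.5, 2.0, 3.0, 2.5],
    [3.0, 2.5, 3.5, 3.0],
])
n = ratings.shape[0]
self_ratings = np.diag(ratings)

# Mean received peer rating per member (self-rating excluded).
received = np.array([ratings[np.arange(n) != j, j].mean() for j in range(n)])

# Congruency: correlation between self-ratings and mean received peer ratings.
congruency = np.corrcoef(self_ratings, received)[0, 1]

# Rough consensus proxy: average spread of the peer ratings each member
# receives (smaller spread = more agreement among the raters).
disagreement = np.mean([ratings[np.arange(n) != j, j].std() for j in range(n)])

print(f"congruency r = {congruency:.2f}, mean rater disagreement = {disagreement:.2f}")

In this sketch, higher congruency means that members see themselves roughly as their peers see them, whereas lower disagreement means that peers converge on similar judgments of each member; the two indices can move independently, which is exactly why both are reported in the studies above.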
7.4.4 Reflector: design issues and considerations An important criterion in designing the reflection tool (Reflector) was that it should be easy to complete and easy to interpret. The aim of the tool was to stimulate and regulate reflection on group members’ individual behavior and on overall group performance. Therefore, Reflector stimulated each group member to individually reflect upon their own functioning, their received peer ratings, and the functioning of the group as a whole. The tool shared these individual reflections with all group members and stimulated group members to reflect collaboratively on their group performance and to set shared goals and formulate plans for improving the group’s social and cognitive performance. Results showed that goals and plans were mainly focused on improvement of activities (i.e., task coordination, communication, productivity, and focusing on the task), which are crucial for successful collaboration (Barron, 2003; Erkens et al., 2005). However, in this thesis, the output of Reflector concerning group members’ intentions to enhance their social and cognitive performance by setting goals and formulating plans was primarily analyzed in a quantitative way. Further qualitative research is necessary to examine whether these intentions lead to actual changes in social and cognitive behavior and activities (e.g., using discourse analysis to find out whether the intention to become friendlier actually led to more friendly and helpful behavior in the chat). Furthermore, Reflector output provides no information on how group members accomplished their goals or executed their formulated plans. In addition, it is also interesting to know what group members have learned and what they will do differently in their next collaborative assignment. Therefore, in future research, these reflective questions will be added to Reflector to stimulate students’ (self-)regulation skills and to gain more insight into group members’ executive activities in order to enhance their performance.
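The reflection flow just described can be summarized schematically. The outline below is illustrative only; the phase names and prompt wordings are paraphrases of the description above, not the literal Reflector items.

# Illustrative outline of the Reflector flow; prompt wording is paraphrased.
REFLECTOR_PHASES = [
    ("individual", [
        "Reflect on your own functioning in the group.",
        "Reflect on the peer ratings you received and how they compare with your self-ratings.",
        "Reflect on the functioning of the group as a whole.",
    ]),
    ("shared", [
        "All individual reflections are made visible to every group member.",
    ]),
    ("collaborative", [
        "Reflect together on the group's social and cognitive performance.",
        "Set shared goals and formulate plans to improve group performance.",
    ]),
]

for phase, prompts in REFLECTOR_PHASES:
    print(phase.upper())
    for prompt in prompts:
        print("  -", prompt)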
Finally, the lack of interviews with students after the collaboration process makes it hard to explain why and how Reflector affected students’ perceptions of their behavior and social performance. Therefore, for future research it is recommended to gather more information concerning students’ motivations for their self and peer ratings at different occasions (e.g., by using think-aloud protocols during completion of Radar, or by interviewing participants afterwards).
7.5
Theoretical and Practical Implications
Overall, the effects of Radar and Reflector on group members’ individual behavior and their social group performance look very promising. To our knowledge, previous research offers no concise conclusion about (1) the extent to which peer assessment and reflection prompts affect group behavior and performance, and (2) what kind of reflection prompts lead to effective reflection processes (e.g., Chen, Wei, Wu, & Uden, 2009; Strijbos & Sluijsmans, 2010; Topping, 1998; Van Gennip, Segers, & Tillema, 2009, 2010; Van Zundert, Sluijsmans, & Van Merriënboer, 2010). Therefore, this research project can be considered one of the first to examine a combination of several domains relevant for successful group behavior and performance, such as Computer Supported Collaborative Learning, Team Development, Interpersonal Perceptions, Self and Peer Assessment, and Reflection. Results indicate that self and peer ratings are not self-evidently reliable and valid. However, findings in this thesis suggest that students’ self and received peer ratings show more convergence over time (all studies), especially when these ratings are supplemented with reflection prompts. In Study 1, groups with Radar and Reflector showed more and higher positive correlations between self and peer ratings than groups with only Radar. In Study 2, groups using Radar and Reflector showed higher convergence of self and peer ratings over time than groups not using Radar and Reflector. In Study 3, both groups (with or without Reflector) showed non-significant or relatively small correlations between self and peer ratings at the first assessment, and significant medium to large correlations at subsequent assessments. In this thesis, reflection was not used for professional development (e.g., Schön, 1983, 1987) or personal growth (e.g., Korthagen, 1985). It was used to make group members aware of their behavior, group norms, and group functioning, in order to reach new understandings and appreciations of these matters (e.g., Boud, Keogh, & Walker, 1985). As expected, over time, reflection prompts caused self and peer ratings to become more objective and reliable (see Study 3). Stimulating students to reflect upon their own behavior (self ratings) and that of their peers (perceived and received peer ratings) results in ratings that are more determined by the perceived behavior of the ratees, and less determined by the tendency of the rater to rate all other group members as high or low on a particular trait (Study 3). Therefore, when self or peer ratings are used for developmental purposes, it is recommended that students complete several ratings (i.e., three measurement occasions) over a longer period of time (i.e., eight weeks), supplemented with reflection prompts aimed at awareness of (self and peer) behavior, group norms, and future (group) functioning. Unfortunately, the group awareness tools used in this research did not have a significant main effect on cognitive performance (i.e., the grade given to the group’s paper). A possible explanation could be that better social group performance (i.e., higher levels of team development and group satisfaction) does not necessarily lead directly to higher cognitive performance (i.e., quality of the group product). However, results did show that groups using these tools throughout (Radar with or without Reflector) perceived higher levels of social group performance compared to groups not using the tools (Studies 1 and 2).
These results indicate that group awareness tools, such as a self and peer assessment tool (i.e., Radar), can be used to enhance social group performance (i.e.,
better team development and group process satisfaction), particularly for high school students (Studies 1 and 2). In this research (see Study 3), Social Relations Models (SRM) provided a very useful theoretical basis and statistical tool to partition sources of variance in peer ratings into actor variance (caused by the tendency of raters to rate all peers similarly - high or low - on a particular trait) and partner variance (caused by the tendency of ratees to elicit similar ratings from all peer raters). Therefore, SRM were used to examine interdependencies among ratings. To our knowledge, few studies have applied SRM to study group dynamics (see a review by Marcus, 1998) or to study consensus among peer ratings on performance (e.g., Greguras, Robie, & Born, 2001; Greguras, Robie, Born, & Koenigs, 2007). For future research it would be interesting to examine the correlation between self-ratings and the individual-level SRM effects (actor and partner). The correlations between self-ratings and actor effects of trait ratings measure assumed similarity (i.e., does the way a person sees him- or herself correspond to how he or she sees others?). The correlations between self-ratings and partner effects of trait ratings measure self-other agreement (i.e., do others see a person as that person sees him- or herself?). SRM is recommended for educational research on interpersonal perceptions, group dynamics, and round-robin designs in which persons rate each other and themselves on particular traits or behaviors.
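In formal terms, following the standard SRM round-robin decomposition (e.g., Kenny, 1994; the notation here is chosen for illustration and is not quoted from the thesis), the rating $X_{ij}$ that rater $i$ gives to ratee $j$ on a trait can be written as

$X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}, \qquad \operatorname{Var}(X_{ij}) = \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\gamma},$

where $\mu$ is the group mean, $\alpha_i$ the actor effect (rater $i$'s tendency to rate all peers high or low), $\beta_j$ the partner effect (ratee $j$'s tendency to elicit similar ratings from all raters), and $\gamma_{ij}$ the relationship component (confounded with error in a single round robin). Assumed similarity then corresponds to the correlation between self-ratings $s_i$ and actor effects $\alpha_i$, and self-other agreement to the correlation between self-ratings $s_j$ and partner effects $\beta_j$.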
References Artzt, A. F., & Armour-Thomas, E. (1997). Mathematical problem solving in small groups: Exploring the interplay of students’ metacognitive behaviors, perceptions, and ability levels. Journal of Mathematical Behavior, 16(1), 63–74. Balcazar, F., Hopkins, B. L., & Suarez, Y. (1986). A critical, objective review of performance feedback. Journal of Organizational Behavior Management, 7, 65-89. Bales, R. F. (1988). A new overview of the SYMLOG system: Measuring and changing behavior in groups. In R. B. Polly, A. P. Hare, & P. J. Stone (Eds.), The SYMLOG practitioner: Applications of small group research (pp. 319-344). New York: Preager. Bales, R. F., & Cohen, S. P. (1979). SYMLOG: a system for the multiple level observation of groups. New York: Macmillan Publishing. Baltes, B. B., Dickson, M. W., Sherman, M. P., Bauer, C. C., & LaGanke, J. (2002). Computermediated communication and group decision making: A meta-analysis. Organizational Behavior and Human Decision Processes, 87(1), 156-179. Bamberger, P. A., & Bar Niv, O. (2006). Intentional Rating Distortion and Peer Evaluation in Management Education: Why and How to Identify “Game-Players". Journal of Learning in Higher Education, 2, 77−87. Bamberger, P., Erev, I., Kimmel, T., & Oref-Chen, T. (2005). Peer assessment, individual performance and contribution to group processes: The impact of rater anonymity. Group and Organization Management, 30, 344−377. Barron, B. (2003). When smart groups fail. Journal of the Learning Sciences, 12, 307–359. Beers, P. J., Boshuizen, H. P. A., Kirschner, P. A., & Gijselaers, W. H. (2005). Computer support for knowledge construction in collaborative learning environments. Computers in Human Behavior, 21, 623–643. Benbunan-Fich, R., Hiltz, S. R., & Turoff, M. (2003). A comparative content analysis of face-toface vs. asynchronous group decision making. Decision Support Systems, 34(4), 457–469. Birenbaum, M. (1996). Assessment 2000: towards a pluralistic approach to assessment. In M. Birenbaum, & F. Dochy (Eds.), Alternatives in assessment of achievements, learning processes and prior knowledge (pp. 3-29). Boston, MA: Kluwer. Bodemer, D. (2011). Tacit guidance for collaborative multimedia learning. Computers in Human Behavior, 27(3), 1079–1086. Bonito, J. A., & Kenny, D. A. (2010). The measurement of reliability of social relations components from round-robin designs. Personal Relationships, 17, 235–251. Bos, N., Olson, J. S., Gergle, D., Olson, G. M., & Wright, Z. (2002). Effects of four computermediated communications channels on trust development. Proceedings of SIGCHI: ACM special interest group on computer– human interaction (pp. 135–140). New York7 ACM Press. Boud, D. (1990) Assessment and the promotion of academic values, Studies in Higher Education, 15, 101–111. Boud, D., Cohen, R., & Sampson, J. (1999). Peer learning and assessment. Assessment & Evaluation in Higher Education, 24, 413-426. Boud, D., & Falchikov, N. (1989). Quantitative studies of self-assessment in higher education: a
References critical analysis of findings, Higher Education, 18, 529-549. Boud, D., Keogh, R., & Walker, D. (1985). Promoting reflection in learning: A model. In D. Boud, R. Keogh, & D. Walker (Eds.), Reflection: Turning experience into learning (pp. 18– 40). London: Routledge Falmer. Brok, P. den, Brekelmans, M., & Wubbels, Th. (2006). Multilevel issues in studies using students' perceptions of learning environments: The case of the Questionnaire on Teacher Interaction. Learning Environments Research, 9, 199-213. Buder, J. (2007). Net-based knowledge-communication in groups: Searching for added value. Journal of Psychology, 215(4), 209–217. Buder, J. (2011). Group awareness tools for learning: Current and future directions. Computers in Human Behavior, 27, 1114–1117. Buder, J., & Bodemer, D. (2008). Supporting controversial CSCL discussions with augmented group awareness tools. International Journal of Computer-Supported Collaborative Learning, 3(2), 123–139. Burgio, L. D., Engel, B. T., Hawkins, A., McCormick, K., & Scheve, A. (1990). A staff management dystem for maintaining improvements in continence with elderly nursing home residents. Journal of Applied Behavior Analysis, 23, 111-118. Campbell, J. D. (1986). Similarity and uniqueness: The effects of attribute type, relevance, and individual differences in self-esteem and depression. Journal of Personality and Social Psychology, 50, 281–294. Campion, M. A., Medsker, G., & Higgs, C. (1993). Relations between work group characteristics and effectiveness: Implications for designing effective work groups. Personnel Psychology, 46, 823-847. Castleton_Partners/TCO. (2007). Building trust in diverse teams. London/Cambridge, UK: Castleton Partners / TSO International Diversity Management. Cestone, C. M., Levine, R. E., & Lane, D. R. (2008). Peer assessment and evaluation of teambased learning. In L. K. Michaelsen, M. Sweet, & D. X. Parmelee (Eds.), Team-based learning: Small group learning’s next big step, (pp. 69-78). New Directions for Teaching and Learning, No. 116. San Francisco, CA: Jossey-Bass. Chemers, M. M., Hu, L., & Garcia, B.F. (2001). Academic self-efficacy and first-year college student performance and adjustment. Journal of Educational Psychology, 93, 55–64. Chen, N. S.,Wei, C.W.,Wu, K. T., & Uden, L. (2009). Effects of high level prompts and peer assessment on online learners’ reflection levels. Computers & Education, 52, 283–291. Cheng,W., & Warren, M. (1997). Having second thoughts: student perceptions before and after a peer assessment exercise. Studies in Higher Education, 22, 233-239. Cho, K., Schunn, C. D., & Wilson, R. W. (2006). Validity and reliability of scaffolded peer assessment of writing from instructor and student perspectives. Journal of Educational Psychology, 98, 891-901. Clarebout, G., Elen, J., & Lowyck, J. (1999, August). An invasion in the classroom: Influence on instructional and epistemological beliefs. Paper presented at the eighth bi-annual conference of the European Association of Research on Learning and Instruction (EARLI), Goteborg, Sweden. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (Rev. ed.). Hillsdale, NJ: Erlbaum. Cohen, S. G., & Bailey, D. E. (1997). What makes teams work: Group effectiveness research from the shop floor to the executive suite. Journal of Management, 23, 239-290. Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of
References behavioral measurements: Theory of generalizability for scores and profiles (p. 410). New York, NY: Wiley. Cutler, H., & Price, J. (1995). The development of skills through peer assessment. In A. Edwards, & P. Knight (Eds.) Assessing Competence in Higher Education, (pp. 150-159). Birmingham, UK: Staff and Educational Development Association. Cutler, R. H. (1996). Technologies, relations, and selves. In: L. Strate, R. Jacobson, & S.B. Gibson (eds.), Communication and cyberspace: social interaction in an electronic environment. Cresskill, NJ: Hampton Press, pp. 317–333. Dancer, W. T. & Dancer, J. (1992). Peer rating in higher education, Journal of Education for Business, 67, 306-309. Dochy, F., Segers, M., & Sluijsmans, D. (1999). The use of self-, peer and co-assessment in higher education: A review. Studies in Higher Education, 24(3), 331-350. Dominick, P. G., Reilly, R. R., & McGourty, J. W. (1997). The effects of peer feedback on team member behavior. Group & Organization Management, 22, 508-525. Dourish, P., & Bellotti, V. (1992). Awareness and coordination in a shared workspace. In M. Mantel & R. Baecker (Eds.), Proceedings of the ACM Conference on computer-supported cooperative work (pp. 107-114). New York: ACM Press. Druskat, V. U., & Kayes, D. C. (2000). Learning versus performance in short-term project teams. Small Group Research, 31, 328-353 Druskat, V. U., & Wolff, S. B. (1999). Effects and timing of developmental peer appraisals in self-managing work groups. Journal of Applied Psychology, 84, 58-74. Dunning, D., Heath, C., & Suls, J. M. (2004). Flawed self-assessment: Implications for health, education, and business. Psychological Science in the Public Interest, 5, 69–106. Earley, P. C., Northcraft, G. B., Lee, C., & Lituchy, T. R. (1990). Impact of process and outcome feedback on the relation of goal setting to task performance. Academy of Management Journal, 33, 87-105. Eccles, J. S., & Wigfield, A. (2002).Motivational beliefs, values, and goals. Annual Review of Psychology, 53, 109–132. Ellis, S. (1997). Strategy choice in sociocultural context. Developmental Review, 17(4), 490–524. Emans, B., Koopman, P., Rutte, C., & Steensma, H. (1996). Teams in organisaties: Interne en externe determinanten van resultaatgerichtheid [Teams in organizations: Internal and external determinants of outcome orientation]. Gedrag en Organisatie, 9, 309–327. Erkens, G. (2004). Dynamics of coordination in collaboration. In J. Van der linden & P. Renshaw (Eds.), Dialogic learning: Shifting perspectives to learning, instruction, and teaching (pp. 191–216). Dordrecht: Kluwer Academic Publishers. Erkens, G., & Janssen, J. (2008). Automatic coding of dialogue acts in collaboration protocols. Computer-Supported Collaborative Learning, 3, 447-470. Erkens, G., Jaspers, J., Prangsma, M., & Kanselaar, G. (2005). Coordination processes in computer supported collaborative writing. Computers in Human Behavior, 21, 463–486. Falchikov, N. (1995). Peer feedback marking: developing peer assessment, Innovations in Education and Training International, 32, 175-187. Falchikov, N., & Boud, D. (1989). Student self-assessment in higher education: A meta-analysis. Review of Educational Research, 59, 395–430. Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: a metaanalysis comparing peer and teacher marks. Review of Educational Research, 70, 287-322. Fischer, F., Bruhn, J., Gräsel, C., & Mandl, H. (2002). Fostering collaborative knowledge construction with visualization tools. 
Learning and Instruction, 12, 213-232.
References Fjermestad, J. (2004). An analysis of communication mode in group support systems research. Decision Support Systems, 37, 239-263. Garton, L., & Wellman, B. (1995). Social impacts of electronic mail in organizations: a review of the research literature. In B. R. Burleseon (Ed.), Communication yearbook (Vol. 18) (pp. 438–453). Thousand Oaks, CA: Sage. Garrison, D. R., Anderson, T., Archer, W. (2001). Critical thinking and computer conferencing: A model and tool to access cognitive presence. American Journal of Distance Education. 15(1), 7–23. Geister, S., Konradt, U., & Hertel, G. (2006). Effects of Process Feedback on Motivation, Satisfaction, and Performance in Virtual Teams. Small Group Research, 37, 459-489. Gersick, C. (1988). Time and transition in work teams: Towards a new model of group development. The Academy of Management Journal, 31, 9-41. Goethals, G. R. (1986). Fabricating and ignoring social reality: Self-serving estimates of consensus. In J. M. Olson, C. P. Hermann, & M. P. Zanna (Eds.), Relative deprivation and social comparison: The Ontario Symposium (Vol. 4, pp. 135–157). Hillsdale, NJ: Lawrence Erlbaum. Goethals, G. R., Messick, D. M., & Allison, S. T. (1991). The uniqueness bias: Studies of constructive social comparison. In J. Suls & T. A. Wills (Eds.), Social comparison research: Contemporary theory and research (pp. 149–176). Hillsdale, NJ: Lawrence Erlbaum. Goltz, S. M., Citera, M., Jensen, M., Favero, J., & Komaki, J. L. (1989). Individual feedback: Does it enhance effects of group feedback? Journal of Organizational Behavior Management, 10, 77-92. Gräsel, C., Fischer, F., Bruhn, J., & Mandl, H. (2001). Let me tell you something you do know. A pilot study on discourse in cooperative learning with computer networks. In H. Jonassen, S. Dijkstra, & D. Sembill (Eds.), Learning with multimedia – results and perspectives (pp. 111-137). Frankfurt a. M.: Lang. Greguras, G. J., Robie, C., & Born, M. P. (2001). Applying the Social Relations Model to Self and Peer Evaluations. Journal of Management Development, 20, 508–525. Greguras, G. J., Robie, C., Born, M., & Koenigs, R. (2007). A social relations analysis of team performance ratings. International Journal of Selection and Assessment, 14, 434 – 448. Gunawardena, C. N. (1995). Social presence theory and implications for interaction and collaborative learning in computer conferences. International Journal of Educational Telecommunications, 1(2&3), 147–166. Gutwin, C., & Greenberg, S. (2004). The importance of awareness for team cognition in distributed collaboration. In E. Salas & S. M. Fiore (Eds.), Team cognition: Understanding the factors that drive processes and performance (pp. 177–201). Washington: APA Press. Guzzo, R. A., & Dickson, M. W. (1996). Teams in organizations: Recent research on performance and effectiveness. Annual Review of Psychology, 47, 307-338. Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77, 81–112. Henry, R. A. (1995). Improving group judgment accuracy: information sharing and determining the best member. Organizational Behavior and Human Decision Processes, 62(2), 190– 197. Hertz-Lazarowitz, R., & Bar-Natan, I. (2002). Writing development of Arab and Jewish students using cooperative learning (CL) and computer-mediated communication (CMC). Computers & Education, 39(1), 19–36.
References Hewitt, J. (2005). Toward an understanding of how threads die in asynchronous computer conferences. The Journal of the Learning Sciences, 14(4), 567-589. Hobman, E. V., Bordia, P., Irmer, B., & Chang, A. (2002). The expression of conflict in computer-mediated and face-to-face groups. Small Group Research, 33, 439-465. Janssen, J., Erkens, G., & Kirschner, P. A. (2011). Group awareness tools: It’s what you do with it that matters. Computers in Human Behavior, 27, 1046–1058. Janssen, J., Erkens, G., Kirschner, P. A. & Kanselaar, G. (2009). Influence of group member familiarity on online collaborative learning. Computers in Human Behavior, 25, 161–170. Janssen, J., Erkens, G., Kanselaar, G., & Jaspers, J. (2007). Visualization of participation: Does it contribute to successful computer-supported collaborative learning. Computers & Education, 49, 1037-1065. Jaspers, J., Broeken, M., & Erkens, G. (2004). Virtual Collaborative Research Institute (VCRI) (Version 2.0). Utrecht: Onderwijskunde Utrecht, ICO/ISOR. Järvelä, S., Järvenoja, H., & Veermans, M. (2008). Understanding the dynamics of motivation in socially shared learning. International Journal of Educational Research, 47(2), 122–135. Järvelä, S., Volet, S., & Järvenoja, H. (2005). Motivation in collaborative learning: New concepts and methods for studying social processes of motivation. A paper presented at the Earli 2005 conference, 22-27 August 2005, Nicosia, Cyprus. Järvelä, S., Volet, S., & Järvenoja, H. (2010). Research on Motivation in Collaborative Learning: Moving Beyond the Cognitive–Situative Divide and Combining Individual and Social Processes. Educational Psychologist, 45(1), 15–27. Jarvenpaa, S., & Leidner, D. (1999). Communication and trust in global virtual teams. Organization Science, 10, 791-815. Jehn, K. A., & Shah, P. P. (1997). Interpersonal Relationships and Task Performance: An Examination of Mediating Processes in Friendship and Acquaintance Groups. Journal of Personality and Social Psychology, 72(4), 775-790. Jehng, J.J. (1997). The psycho-social processes and cognitive effects of peer-baded collaborative interactions with computers. Journal of Educational Computing Research, 17(1), 19-46 Jensen, C., Farnham, S. D., Drucker, S. M., & Kollock, P. (2000). The effect of communication modality on cooperation in online environments. Proceedings of the CHI 2002 Conference on Human Factors in Computer Systems. New York: ACM. Johnson, D. W., & Johnson, R. T. (1989). Cooperation and competition: Theory and research. Edina, MN: Interaction Book Company. Johnson, D. W., & Johnson, R. T. (1999). Learning together and alone: Cooperative, competitive, and individualistic learning (5th ed.). Boston, MA: Allyn and Bacon. Johnson, D. W., Johnson, R. T., Roy, P., & Zaidman, B. (1985) Oral Interaction in Cooperative Learning Groups: Speaking, Listening, and the Nature of Statements Made by High-, Medium-, and Low-Achieving Students. Journal of Psychology, 119(4), 303-321. Johnson, D. W., Johnson, R. T., & Smith, K. (2007). The state of cooperative learning in postsecondary and professional settings. Educational Psychology Review, 19, 15-29. Johnson, S. D., Suriya, C., Yoon, S. W., Berret, J. V., & La Fleur, J. (2002). Team development and group processes of virtual learning teams. Computers & Education, 39, 379–393. Johnston, W. A., & Briggs, G. E. (1968).Team performance as a function of task arrangement and work load. Journal of Applied Psychology, 52, 89-94. Jonassen, D. H. (1991). 
Objectivism versus constructivism: Do we need a new philosophical paradigm? Educational Technology Research and Development, 39(3), 5-14. Jonassen, D. H. (1994). Thinking technology: Toward a constructivist design model. Educational
References Technology, 34(4), 34-37. Kagan, I., Kigli-Shemesh, R., & Tabak, N. (2006). ‘Let me tell you what I really think about you’ – evaluating nursing managers using anonymous staff feedback. Journal, 14, 356-365. Kane, J. S., & Lawler, E. E. (1978). Methods of peer assessment. Psychological Bulletin, 85, 555586. Karau, S., & Williams, K. (1993). Social loafing: A meta-analytic review and theoretical integration. Journal of Personality and Social Psychology, 65, 681-706. Kay, J., Maisonneuve, N., Yacef, K., & Reimann, P. (2006). The Big Five and Visualisations for Team Work Activity. Proceedings of Intelligent Tutoring Systems (ITS06), M. Ikeda, K. D. Ashley & T-W. Chan (eds). Taiwan. Lecture Notes in Computer Science 4053, SpringerVerlag, 197-206. Kenny, D. A. (1991). A general model of consensus and accuracy in interpersonal perception. Psychological Review, 98, 155-163. Kenny, D. A. (1994). Interpersonal perception: A social relations analysis. New York: Guilford. Kenny, D. A., & Judd, C. M. (1986). Consequences of Violating the Independence Assumption in Analysis of Variance. Psychological Bulletin, 99, 422–431. Kenny, D. A. (1998). ‘SOREMO Version V.2’, unpublished manuscript, University of Connecticut, Storrs, CT. Kerr, N. (1983). The dispensability of member effort and group motivation losses: Free-rider effects. Journal of Personality and Social Psychology, 44, 78–94. Kerr, N., & Bruun, S. (1983). The dispensability of member effort and group motivation losses: Free rider effects. Journal of Educational Computing Research, 5, 1-15. Kirschner, P. A. (2001). Using integrated electronic environments for collaborative teaching/learning. Research Dialogue in Learning and Instruction, 2(1), 1–10. Kirschner, P. A., Beers. P. J., Boshuizen, H. P. A., & Gijselaers, W. H. (2008). Coercing shared knowledge in collaborative learning environments. Computers in Human Behavior, 24, 403420. Kirschner, P. A., Strijbos, J., Kreijns, K., & Beers, P. J. (2004). Designing electronic collaborative learning environments. Educational Technology Research and Development, 52(3), 47–66. Klein, W. M. (2001). Post hoc construction of self-performance and other performance in selfserving social comparison. Society for Personality and Social Psychology, 27(6), 744–754. Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119, 254-284. Kollar, I., & Fischer, F. (2010). Peer assessment as collaborative learning: A cognitive perspective. Learning and Instruction, 20, 344-348. Kormanski, C. (1990). Team building patterns of academic groups. The Journal for Specialists in Group Work, 15(4), 206-214. Kreijns, K., & Kirschner, P. A., (2004). Determining sociability, social space and social presence in (a)synchronous collaborating teams. Cyberpsychology and Behavior, 7, 155-172. Kreijns, K., Kirschner, P. A., & Jochems, W. (2003). Identifying the pitfalls for social interaction in computer-supported collaborative learning environments: a review of the research. Computers in Human Behavior, 19, 335-353. Landy, F. J., & Farr, J. L. 1983. The measurement of work performance: Methods, theory, and applications. New York: Academic Press. Lazonder, A. W., & Rouet, J. F. (2008). Information problem solving instruction: Some cognitive and metacognitive issues. Computers in Human Behavior, 24, 753–765.
References Leinonen, P., & Järvelä, S. (2006). Facilitating interpersonal evaluation of knowledge in a context of distributed team collaboration. British Journal of Educational Technology, 37 (6), 897916. Leinonen, P., Järvelä, S., & Häkkinen, P. (2005). Conceptualizing the awareness of collaboration: A qualitative study of a global virtual team. Computer Supported Cooperative Work, 14 (4), 301-322. Lewin, A. Y., & Zwany, A. (1976). Peer nominations. A model, literature critique and a paradigm for research. Personnel Psychology, 29, 423-447. Liaw, S., & Huang, H. (2000). Enhancing interactivity in web-based instruction: A review of the literature. Educational Technology, 40(3), 41-45. Lipponen, L., Rahikainen, M., Lallimo, J., & Hakkarainen, K. (2003). Patterns of participation and discourse in elementary students' computer-supported collaborative learning. Learning and Instruction, 13, 487-509. Locke, E. A., & Latham, G. P. (1990). A theory of goal setting & task performance. Englewood Cliffs, NJ: Prentice Hall. Magin, D. (2001): Reciprocity as a source of bias in multiple peer assessment of group work. Studies in Higher Education, 26, 53-63. Marcus, D. K. (1998). Studying Group Dynamics with the Social Relations Model. Group Dynamics: Theory, Research, and Practice, 2, 230–240. McDowell, L. (1995). The impact of innovative assessment on student learning, Innovations in Education and Training International, 32, 302-313. McLeod, P. L., & Liker, J. K. (1992). Process feedback in task groups: An application of goal setting. Journal of Applied Behavioral Science, 28, 15-52. Mento, A. J., Steel, R. P., & Karren, R. J. (1987). A meta-analytic study of the effects of goal setting on task performance: 1966-1984. Organizational Behavior and Human Decision Processes, 39, 52-83. Michinov, N., & Primois, C. (2005). Improving productivity and creativity in online groups through social comparison process: new evidence for asynchronous electronic brainstorming. Computers in Human Behavior, 21(1), 11–28. Montgomery, B. (1986) An interactionist analysis of small group peer assessment, Small Group Behavior, 17, 19–37. Murphy, K. R., & Cleveland, J. N. (1991). Performance appraisal. Needham Heights, MA: Allyn & Bacon. Murphy, K. R., & Cleveland, J. N. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage. Narciss, S., Proske, A., & Koerndle, H. (2007). Promoting self-regulated learning in web-based learning environments. Computers in Human Behavior, 23, 1126–1144. Neubert, M. J. (1998). The value of feedback and goal setting over goal setting alone and potential moderators of this effect: A meta-analysis. Human Performance, 11, 321-335. Northrup, P. T. (2001). A framework for designing interactivity into web-based instruction. Educational Technology, 41(2), 31-39. O’Donnell, A. M., & O’Kelly, J. (1994). Learning from peers: beyond the rhetoric of positive results. Educational Psychology Review, 6(4), 321–349. Orsmond, P., Merry, S., & Callaghan, A. (2004). Implementation of a formative assessment model incorporating peer and self-assessment. Innovations in Education and Teaching International, 41, 273-290. Petty, R. E.,Wegener, D. T., Fabrigar, L. R. (1997). Attitudes and attitude change. Annual Review
References of Psychology, 48, 609–647. Phielix, C., Prins, F. J., Kirschner, P. A. (2010). Awareness of group performance in a CSCL environment: Effects of peer feedback and reflection. Computers in Human Behavior, 26, 151-161. Phielix, C., Prins, F. J., Kirschner, P. A., Erkens, G., & Jaspers, J. (2011). Group awareness of social and cognitive performance in a CSCL environment: Effects of a peer feedback and reflection tool. Computers in Human Behavior, 27, 1087–1102. Prins, F. J., Sluijsmans, D. M. A., & Kirschner, P. A. (2006). Feedback for general practitioners in training: quality, styles, and preferences. Advances in Health Sciences Education, 11, 289303. Prins, F. J., Sluijsmans, D. M. A., Kirschner, P. A., & Strijbos, J. W. (2005). Formative peer assessment in a CSCL environment: A case study. Assessment and Evaluation in Higher Education, 30(4), 417–444. Prins, F. J., Veenman, M. V. J., & Elshout, J. J. (2006). The impact of intellectual ability and metacognition on learning: New support for the threshold of problematicity theory. Learning and Instruction, 16, 374–387. Rochelle, J., & Teasley, S. (1995). The construction of shared knowledge in collaborative problem solving. In C. O’Malley (Ed.), Computer-supported collaborative learning (pp. 69–97). New York: Springer-Verlag. Rourke, L. (2000). Operationalizing social interaction in computer conferencing. Proceedings of the 16th annual conference of the Canadian Association for Distance Education. Quebec City, Canada. Retrieved April 1, 2004, from http://www.ulaval.ca/aced2000cade/english/proceedings.html Rovai, A. P. (2001). Classroom community at a distance: A comparative analysis of two ALNbased university programs. Internet and Higher Education, 4, 105–118. Saavedra, R., Early, P. C., & Van Dyne, L. (1993). Complex interdependence in task-performing groups. Journal of Applied Psychology, 78, 61–72. Saavedra, R., & Kwun, S. K. (1993). Peer evaluation in self-managing work groups. Journal of Applied Psychology, 78, 450-462. Salas, E., Sims, D. E., & Burke, C. S. (2005). Is there a "Big Five" in teamwork? Small Group Research, 36, 555-599. Salomon, G., & Globerson, T. (1989). When teams do not function the way they ought to? International Journal of Educational Research, 13(1), 89–99. Sambell, K., & McDowell, L. (1998). The value of self and peer assessment to the developing lifelong learner. In C. Rust (Ed.), Improving student learning - Improving students as learners (pp. 56-66). Oxford, UK: Oxford Centre for Staff and Learning Development. Savicki, V., Kelley, M., & Lingenfelter, D. (1996). Gender, group composition, and task type in small task groups using computer-mediated communication. Computers in Human Behavior, 12, 549–565. Schön, D. A. (1987). Educating the reflective practitioner. San Francisco, CA: Jossey-Bass. Short, J., Williams, E., & Christie, B. (1976). The social psychology of telecommunications. London: John Wiley & Sons. Slavin, R. E. (1997). Educational psychology: Theory and practice (5th ed.). Needham Heights, MA: Allyn & Bacon. Slof, B., Erkens, G., Kirschner, P. A., Janssen, J., & Phielix, C. (2010). Fostering Complex Learning-task Performance through Scripting Student Use of Computer Supported Representational Tools. Computers and Education, 55, 1707–1720. Slof, B., Erkens, G., Kirschner, P. A., & Jaspers, J. G. M. (2010). Design and effects of
References representational scripting on group performance. Educational Technology Research and Development, 58, 589–608. Slof, B., Erkens, G., Kirschner, P. A., Jaspers, J. G. M., & Janssen, J. (2010). Guiding students’ online complex learning-task behavior through representational scripting. Computers in Human Behavior, 26, 927–939. Sluijsmans, D. M. A. (2002) Student involvement in assessment: the training of peer assessment skills, unpublished doctoral dissertation, Open University of the Netherlands, Heerlen. Sluijsmans, D., & Prins, F. (2006). A conceptual framework for integrating peer assessment in teacher education. Studies in Educational Evaluation, 32, 6-22. Sluijsmans, D., & Van Merriënboer, J. J. G. (2000) A peer assessment model (Heerlen, Open University of the Netherlands). Somervell, H. (1993). Issues in assessment, enterprise and higher education: the case for self-, peer and collaborative assessment, Assessment and Evaluation in Higher Education, 18, 221-233. Stahl, G. (2004). Groupware goes to scholl: Adapting BSCW to the classroom. International Journal of Computer Applications in Technology, 19(3/4), 1–13. Straus, S. G. (1997). Technology, group process, and group outcomes: Testing the connections in computer-mediated and face-to-face groups. Human-Computer Interaction, 12, 227-266. Straus, S. G., & McGrath, J. E. (1994). Does the medium matter? The interaction of task type and technology on group performance and member reactions. Journal of Applied Psychology, 79(1), 87-97. Strijbos, J. W., Kirschner, P. A., & Martens, R. L. (Eds.). (2004). What we know about CSCL: And implementing it in higher education. Boston, MA: Kluwer Academic/Springer Verlag. Strijbos, J. W., Martens, R. L., Jochems, W. M. G., & Broers, N. J. (2007). The effect of functional roles on perceived group efficiency during computer-supported collaborative learning: a matter of triangulation. Computers in Human Behavior, 23, 353–380. Strijbos, J. W., Narciss, S., & Dunnebier, K. (2010). Peer feedback content and sender’s competence level in academic writing revision tasks: are they critical for feedback perceptions and efficiency? Learning and Instruction, 20(4), 291-303. Strijbos, J. W., & Sluijsmans, D. (2010). Unraveling peer assessment: Methodological, functional, and conceptual developments. Learning and Instruction, 20(4), 265-269. Stroebe, W., Diehl, M., & Abakoumkin, G. (1992). The illusion of group effectivity. Personality and Social Psychology Bulletin, 18, 643-650. Suls, J., & Wan, C. K. (1987). In search of the false-uniqueness phenomenon: Fear and estimates of social consensus. Journal of Personality and Social Psychology, 59, 229–241. Thompson, L. F., & Coovert, M. D. (2003). Teamwork online: the effects of computer conferencing on perceived confusion, satisfaction and postdiscussion accuracy. Group Dynamics, 7(2), 135–151. Topping, K. J. (1998). Peer assessment between students in colleges and universities. Review of Educational Research, 68, 249-276. Tubbs, M. E. (1986). Goal setting: A meta-analytic examination of the empirical evidence. Journal of Applied Psychology, 71, 474-483. Tuckman, B. W., & Jensen, M. A. C. (1977). Stages of small group development revisited. Group and Organizational Studies, 2, 419-427. Van den Bossche, P., Gijselaers, W., Segers, M., & Kirschner, P. A. (2006). Social and cognitive factors driving teamwork in collaborative learning environments: Team learning beliefs and behaviors. Small Group Research, 37, 490-521
References Van der Pol, J., Van den Berg, B. A. M., Admiraal, W. F., & Simons, P. R. J. (2008). The nature, reception, and use of online peer feedback in higher education. Computers and Education, 51, 1804–1817. Van Gennip, N. A. E., Segers, M. S. R., & Tillema, H. H. (2009). Peer assessment for learning from a social perspective: the influence of interpersonal variables and structural features. Educational Research Review, 4, 41-54. Van Gennip, N. A. E., Segers, M. S. R., & Tillema, H. H. (2010). Peer assessment as a collaborative learning activity: the role of interpersonal variables and conceptions. Learning and Instruction, 20(4), 280-290. Van Meter, P., & Stevens, R. J. (2000). The role of theory in the study of peer collaboration. Journal of Experimental Education, 69(1), 113–127. Van Zundert, M., Sluijsmans, D. M. A., & Van Merriënboer, J. J. G. (2010). Effective peer assessment processes: research findings and future directions. Learning and Instruction, 20(4), 270-279. Veenman, M. V. J., Wilhelm, P., & Beishuizen, J. J. (2004). The relation between intellectual and metacognitive skills from a developmental perspective. Learning and Instruction, 14, 89– 109. Walther, J. B., Anderson, J. F., & Park, D. (1994). Interpersonal effects in computer-mediated interaction: a meta-analysis of social and anti-social communication. Communication Research, 19, 460-487. Webb, N. M., & Palincsar, A. S. (1996). Group processes in the classroom. In D. C. Berliner (Ed.), Handbook of educational psychology (pp. 841–873). New York: Simon & Schuster/Macmillan. Wegerif, R. (1998, March). The social dimension of asynchronous learning networks. Journal of Asynchronous Learning Networks, 2(1), 34–49. Weinberger, A. (2003). Scripts for computer-supported collaborative learning. Effects of social and epistemic cooperation scripts on collaborative construction. Doctoral dissertation, Ludwig-Maximilians-University, Munich, Germany. Available at: http://edoc.ub.unimuenchen.de/archive/00001120/01/Weinberger_Armin.pdf. Weiner, B. (1985). An attributional theory of achievement motivation and emotion. Psychological Review, 92, 548–573. Weisband, S., & Atwater, L. (1999). Evaluating self and others in electronic and face-to-face groups. Journal of Applied Psychology, 84(4), 632-639. Wherry, R.J., & Bartlett, C. J. 1982. The control of bias in ratings. Personnel Psychology, 35, 521-551. Williams, K. D., Harkins, S. G., & Latané, B. (1981). Identifiability as a deterrent to social loafing: Two cheering experiments. Journal of Personality and Social Psychology, 40, 303311. Woolhouse, M. (1999). Peer assessment: the participants' perception of two activities on a further education teacher education course. Journal of Further and Higher Education, 23, 211-219. Wubbels, T., Créton, H. A., & Hooymayers, H. P. (1985, March–April). Discipline problems of beginning teachers, interactional teacher behavior mapped out. Paper presented at the annual meeting of the American Educational Research Association, Chicago. Yager, S., Johnson, R. T., Johnson, D. W., & Snider, B. (1986). The impact of group processing on achievement in cooperative learning groups. Journal of Social Psychology, 126(3), 389– 397.
References Yammarino, F. J., & Atwater, L. E. (1997). Do managers see themselves as others see them? Implications of self-other rating agreement for human resources management. Organizational Dynamics, 25(4), 35-44. Yukawa, J. (2006). Co-reflection in online learning: Collaborative critical thinking as narrative. International Journal of Computer-Supported Collaborative Learning, 1(2), 203-228. Zumbach, J., Hillers, A., & Reimann, P. (2004). Distributed problem-based learning: the use of feedback mechanisms in online learning. In T. S. Roberts (Ed.), Online collaborative learning: Theory and practice (pp. 86–102). Hershey, PA: Idea Group Inc.
8. Summary (Samenvatting) 8.1 Introduction Collaborative learning, often supported by computer networks (computer-supported collaborative learning, CSCL), enjoys great interest at all levels of education (Strijbos, Kirschner, & Martens, 2004). Although CSCL environments have proven to be a promising educational instrument and expectations regarding their value and effectiveness are high, groups in CSCL environments do not always reach their full potential (e.g., Fjermestad, 2004; Lipponen, Rahikainen, Lallimo, & Hakkarainen, 2003; Thompson & Coovert, 2003). Two important reasons for the difference between potential and actual performance lie in the social interaction between group members, which is influenced by (1) the design of the CSCL environment (Kreijns, Kirschner, & Jochems, 2003), and (2) group members’ self-perceptions of their actual behavior and performance (Dunning, Heath, & Suls, 2004). First, the design of CSCL environments is often purely functional and focuses solely on the cognitive processes needed to complete a task, solve problems, or achieve optimal learning performance (Kreijns & Kirschner, 2004). These functional CSCL environments only offer group members the possibility to carry out the task (Kirschner, Beers, Boshuizen, & Gijselaers, 2008), and in this way restrict the opportunity for social-emotional processes to take place. These social-emotional processes, which form the basis for group formation and group dynamics, are essential for developing strong social relationships, strong group cohesion, feelings of trust, and a sense of community among group members. Second, students’ self-perceptions (perceptions of their own behavior) show only a weak to moderate relationship with their actual behavior and performance (Dunning et al., 2004). Students tend to overestimate themselves and to maintain an inflated view of their expertise, skill, and character (e.g., Chemers, Hu, & Garcia, 2001; Dunning, Heath, & Suls, 2004; Falchikov & Boud, 1989). This tendency of group members to believe that they are performing effectively, while this is often not the case, can undermine the group’s social performance (e.g., team development) and cognitive performance (e.g., the quantity and quality of the contributed work), and prevent the group from reaching its full potential (Karau & Williams, 1993; Stroebe, Diehl, & Abakoumkin, 1992). To strengthen social interaction and to alleviate biased self-perceptions to some extent, the learning environment can be extended with instruments that support collaborative learning. Examples of such instruments are computer programs (henceforth called tools) that make group members aware of the differences between their own perceptions and their actual behavior and performance (see, for example, Janssen, Erkens, Kanselaar, & Jaspers, 2007). These tools, also known as group awareness tools, provide information about the environment in which a person participates (for example, by informing students how their actual behavior is perceived by their group members; see Buder, 2007, 2011). This improved group awareness can lead to more effective and efficient collaboration (see, for example, Buder & Bodemer, 2008; Janssen, Erkens, & Kirschner, 2011).
Twee operationaliseringen van dergelijke tools zijn ontwikkeld voor dit onderzoek, namelijk een zelf- en peer-beoordelingstool (Radar) en een reflectie-tool (Reflector). De Radar maakt het mogelijk voor groepsleden om
zowel hun eigen sociale gedrag (bijvoorbeeld vriendelijkheid) en cognitieve gedrag (bijvoorbeeld productiviteit) te beoordelen, als dat van hun groepsgenoten. Deze beoordelingen worden (anoniem) gedeeld met alle groepsleden, om hen op deze manier bewust te maken van hun eigen gedrag. Het effect van de Radar op het gedrag van de studenten is echter afhankelijk van de bereidheid en het vermogen van de studenten om te reflecteren op de ontvangen beoordelingen. De Radar kan een groepslid bijvoorbeeld aanwijzingen geven voor een benodigde gedragsaanpassing (bijvoorbeeld dat hij/zij onvriendelijk overkomt), maar bij dit groepslid kan de wil en het vermogen ontbreken om te reflecteren op deze informatie (bijvoorbeeld wat hij/zij moet veranderen om vriendelijker over te komen). Daarom werd de Radar gecombineerd met de Reflector, die groepsleden ondersteunt en stimuleert om (a) individueel te reflecteren op hun eigen functioneren, (b) op hun ontvangen beoordelingen, en (c) op het functioneren van de groep als geheel. De Reflector stimuleert groepsleden ook om gezamenlijk na te denken over het groepsfunctioneren en om plannen te formuleren ter verbetering. Om deze reden werd verondersteld dat een combinatie van Radar en Reflector effectiever zou zijn om het gedrag van de groepsleden te beïnvloeden dan enkel het gebruik van de Radar. De drie empirische studies beschreven in dit proefschrift zijn allemaal gericht op het beantwoorden van de volgende centrale onderzoeksvraag: Hoe en in welke mate beïnvloeden beoordeling en reflectie het gedrag en de prestatie in kleine CSCL-groepen? Voordat deze vraag theoretisch kan worden beantwoord, is het noodzakelijk om de centrale concepten van de onderzoeksvraag te definiëren. In dit proefschrift wordt beoordeling gedefinieerd als het proces waarin studenten het gedrag en/of de prestaties waarderen van henzelf (zelfbeoordeling) en/of dat van hun groepsgenoten (peer-beoordeling). Reflectie is gedefinieerd als de intellectuele en affectieve activiteiten die individuen ondernemen om hun ervaringen te doorgronden om op deze manier tot nieuwe inzichten en waarderingen te komen (Boud, Keogh, & Walker, 1985). Dit proces wordt gestimuleerd door middel van reflectievragen in de Reflector, welke individuele en gezamenlijke reflectie stimuleren over het individuele functioneren en het groepsfunctioneren. Gedrag wordt gedefinieerd als de sociale (niet-taakgerelateerde) en cognitieve (taakgerelateerde) aspecten of activiteiten die van belang zijn voor een succesvolle samenwerking. Deze sociale aspecten, verder aangeduid als sociaal gedrag, worden gemeten door zelf- en peer-beoordelingen in de Radar op vier variabelen: invloed, vriendelijkheid, coöperativiteit, en betrouwbaarheid. De cognitieve aspecten, verder aangeduid als cognitief gedrag, worden gemeten door de zelf- en peer-beoordelingen in de Radar op twee variabelen: de productiviteit en de kwaliteit van de bijdrage. Prestatie is gedefinieerd als de sociale en cognitieve prestatie aan het eind van het samenwerkingsproces. De sociale prestatie wordt gemeten op vier subschalen (teamontwikkeling, groepstevredenheid, groepsconflicten, en de houding ten opzichte van probleem-gestuurd samenwerken), door middel van een vragenlijst aan het einde van de samenwerking. De cognitieve prestatie wordt gemeten door het cijfer dat groepen hebben ontvangen voor hun groepsproduct (bijvoorbeeld hun betoog of onderzoeksverslag).
De eerste empirische studie (hoofdstuk 4) onderzocht hoe de tools (Radar en Reflector) door de tijd heen het gedrag van de groepsleden veranderden. Tevens werden de hoofd- en interactie-effecten van de Radar en de Reflector onderzocht op de sociale en cognitieve groepsprestaties aan het einde van het samenwerkingsproces. In deze studie versterkte Radar het
groepsbewustzijn door informatie te verschaffen over vijf eigenschappen die van belang geacht worden voor de beoordeling van gedrag in groepen. Vier eigenschappen hadden betrekking op sociaal gedrag, namelijk invloed, vriendelijkheid, samenwerking en betrouwbaarheid. De laatste - productiviteit - was gerelateerd aan cognitief gedrag. Het groepsbewustzijn werd verder versterkt door groepsleden in de Reflector te laten reflecteren op het huidige functioneren van de groep. De tweede empirische studie (hoofdstuk 5) onderzocht in welke mate de tools van invloed zijn op de sociale en cognitieve groepsprestaties, wanneer deze tools pas halverwege het samenwerkingsproces beschikbaar worden voor de groep. De zojuist genoemde vijf eigenschappen in de Radar werden aangevuld met een zesde eigenschap gerelateerd aan cognitief gedrag, namelijk de kwaliteit van de bijdrage. Reflectie in deze studie was gericht op het toekomstige functioneren van de groep. In de derde empirische studie (hoofdstuk 6) werd het effect van de Reflector onderzocht op het overeenstemmingsniveau tussen beoordelaars, evenals het effect op de sociale en cognitieve groepsprestaties. De volgende paragrafen bevatten een samenvatting van elk onderzoek en een samenvoeging (synthese) van de resultaten om de algemene onderzoeksvraag te beantwoorden. Dit wordt gevolgd door een algemene discussie over de bevindingen, beperkingen van dit onderzoek en mogelijkheden voor toekomstig onderzoek.
8.2 Samenvatting van de studies
8.2.1 Studie 1
De doelen van de eerste studie (zie hoofdstuk 4) waren (1) het ontwerpen van een zelf- en peer-beoordelingstool (waarmee gelijke groepsleden elkaars gedrag kunnen beoordelen) en een reflectie-tool, en (2) het onderzoeken van de hoofd- en interactie-effecten van deze tools op het gedrag en de groepsprestaties. De VCRI (Virtual Collaborative Research Institute), een beproefde CSCL-omgeving, werd aangevuld met twee onafhankelijke en elkaar aanvullende tools om groepsleden te ondersteunen zich beter bewust te worden van hun individuele en collectieve gedrag, en hen te stimuleren na te denken over hun individuele en collectieve prestaties. De eerste tool – Radar – is een zelf- en peer-assessment tool die studenten voorziet van informatie over hun eigen sociale en cognitieve gedrag, het gedrag van hun groepsgenoten (peers), en dat van de groep als geheel. Groepsleden beoordeelden zichzelf en hun peers op vijf eigenschappen die van belang werden geacht voor de beoordeling van sociaal gedrag (invloed, vriendelijkheid, samenwerking, en betrouwbaarheid) en cognitief gedrag (productiviteit) tijdens de samenwerking. De tweede tool is een reflectie-tool - Reflector – die groepsleden in staat stelt om hun individuele reflecties te delen betreffende (1) hun eigen functioneren, (2) hun ontvangen beoordelingen, en (3) het functioneren van de groep als geheel. Tevens stimuleerde Reflector de groepsleden om gezamenlijk na te denken over het groepsfunctioneren en hier een gezamenlijk standpunt over te bereiken. De reflectie was gericht op het groepsfunctioneren in het verleden en het heden. De deelnemers waren 39 Nederlandse middelbare scholieren (Havo 4). Een 2x2 factorial between-subjects design met de factoren ‘Radar niet beschikbaar’ (¬Ra), ‘Radar beschikbaar’ (+Ra), ‘Reflector niet beschikbaar’ (¬Rf), en ‘Reflector beschikbaar’ (+Rf), werd gebruikt om te onderzoeken of deze tools zouden leiden tot beter sociaal en cognitief gedrag, betere sociale prestaties (bijvoorbeeld betere teamontwikkeling, meer groepstevredenheid, minder groepsconflicten, en een positievere houding tegenover probleem-gestuurd samenwerken), en betere cognitieve prestaties (bijvoorbeeld een beter groepsproduct). Zoals verwacht, bleken groepsleden zichzelf te overschatten bij de eerste beoordeling en vertoonden de resultaten bij de tweede beoordeling (T2) een afname van de peer-beoordelingen voor groepen met Radar en Reflector (+Ra+Rf) op invloed, vriendelijkheid en betrouwbaarheid. Peer-beoordelingen op T2 van groepen met beide tools (+Ra+Rf) correleerden sterk met de self-assessments
bij de derde (T3) en laatste meting (T4) voor invloed en productiviteit, wat wijst op een convergentie van zelf- en peer-beoordelingen. Er werd een hoofdeffect gevonden van de Reflector op de variabele invloed, maar niet op andere variabelen. Groepen met Reflector beoordeelden zichzelf hoger op invloed dan groepen zonder Reflector. Er werd een hoofdeffect gevonden voor Radar op de sociale groepsprestaties. Groepen met Radar ervaarden een betere teamontwikkeling, ervaarden minder conflicten, en hadden een positievere houding ten aanzien van probleem-gestuurd samenwerken dan groepen zonder Radar. Tot slot, de resultaten waren veelbelovend en toonden aan dat het gebruik van Radar en Reflector in een CSCL-omgeving leidt tot meer realistische zelfwaarnemingen (dat wil zeggen, een hogere convergentie van zelf- en peer-beoordelingen), en betere sociale groepsprestaties, in vergelijking met groepen die niet over deze tools beschikken.
8.2.2 Studie 2
De tweede studie (hoofdstuk 5) onderzocht in welke mate de tools van invloed zijn op de sociale en cognitieve groepsprestaties, wanneer deze pas halverwege het samenwerkingsproces beschikbaar worden voor de groep. Wederom werd de VCRI aangevuld met Radar en Reflector om (1) groepsleden te helpen zich bewust te worden van hun individuele en collectieve gedrag, en (2) hen te stimuleren om te reflecteren op hun individuele en collectieve prestaties. In deze follow-up studie was de Radar aangevuld met een zesde eigenschap voor de beoordeling van cognitief/taakgerelateerd gedrag, namelijk ‘kwaliteit van de bijdrage’. Bovendien waren de reflectievragen in de Reflector niet alleen gericht op groepsfunctioneren in het verleden en het heden, maar ook op het toekomstig groepsfunctioneren. In de Reflector werden groepsleden nu ook gestimuleerd om doelen te stellen en plannen te formuleren voor het verbeteren van hun sociale en cognitieve prestaties. Deelnemers waren 108 Nederlandse middelbare scholieren (Havo 4 en Vwo 4), die in tweetallen (n = 16), drietallen (n = 84) of groepen van vier (n = 8) een gezamenlijke schrijfopdracht moesten uitvoeren voor maatschappijleer. De groepsleden in de eerste experimentele conditie (n = 59) gebruikten de Radar vanaf het begin (T1), en gebruikten Radar en Reflector halverwege (T2) en aan het einde (T3) van het samenwerkingsproces. Om het effect van de tools halverwege het samenwerkingsproces te onderzoeken, gebruikten de groepsleden in de tweede experimentele conditie (n = 23) Radar en Reflector halverwege de samenwerking (T2) en aan het eind (T3). In de controle conditie (n = 26) ontvingen de groepsleden de tools pas aan het einde (T3), welke tevens diende als eindmeting. Aan het eind vulden alle deelnemers een vragenlijst in over hun sociale groepsprestaties (bijvoorbeeld teamontwikkeling, groepstevredenheid, groepsconflicten, en houding tegenover probleem-gestuurd samenwerken). Het cijfer dat aan elke groep werd gegeven voor hun gezamenlijke schrijfopdracht (essay) diende als maatstaf voor de cognitieve prestatie van de groep. Onverwacht ervaarden, bij de tweede meting (T2 - halverwege), de groepen die de tools voor de tweede keer gebruikten (conditie 1 – tools beschikbaar vanaf begin) hogere niveaus van sociaal en cognitief gedrag dan groepen die de tools halverwege voor het eerst gebruikten (conditie 2 – tools beschikbaar halverwege). Aan het einde werden geen verschillen gevonden voor de cognitieve prestaties tussen de condities. Zoals verwacht vertoonden groepsleden met tools (conditie 1 en 2) aan het einde van het samenwerkingsproces (T3) een hogere convergentie van zelf- en peer-beoordelingen door de tijd heen dan groepsleden zonder tools (conditie 3). Resultaten toonden aan dat de tools een effect hebben op hoe groepsleden het sociale en cognitieve gedrag van zichzelf en hun groepsgenoten beoordelen. Over het algemeen beoordeelden groepen met tools zichzelf en hun groepsgenoten
hoger op sociaal en/of cognitief gedrag aan het einde (T3) dan groepen zonder tools. Groepen die vanaf het begin de tools gebruikten (conditie 1) rapporteerden betere sociale prestaties dan groepen die de tools vanaf halverwege gebruikten (conditie 2), en groepen zonder tools (conditie 3 – tools beschikbaar aan het eind). Tot slot, de resultaten toonden aan dat het gebruik van Radar en Reflector in een CSCL-omgeving leidt tot (1) hoger bewustzijn van interpersoonlijke percepties en gedrag, (2) beter sociaal en cognitief gedrag aan het einde van de samenwerking, (3) gedeelde percepties op interpersoonlijk gedrag, en (4) verbetering van de sociale groepsprestaties.
8.2.3 Studie 3
De derde studie (hoofdstuk 6) onderzocht het effect van de Reflector op (1) de mate van overeenstemming (consensus) tussen peer-beoordelaars, en (2) de sociale en cognitieve groepsprestaties. Voor dit onderzoek werd een experimenteel design gebruikt met één experimentele en één controle conditie. De experimentele conditie (n = 105) kreeg een zelf- en peer-beoordelingstool (Radar) en reflectie-tool (Reflector). De controle conditie (n = 86) kreeg alleen de Radar. Deelnemers waren 191 tweedejaars Nederlandse universitaire studenten Onderwijskunde (37 mannen, 154 vrouwen), en werkten face-to-face in groepjes van drie, vier en vijf aan een gezamenlijke onderzoekstaak in onderwijspsychologie. Elke groep moest een research paper schrijven over hun pilot-studie, welke zij in een periode van acht weken moesten uitvoeren. Tijdens deze periode vulden de studenten uit beide condities de Radar vier keer in. Daarbij vulden de studenten in de experimentele conditie de Reflector in op drie meetmomenten, te beginnen bij het tweede meetmoment. Studenten gebruikten de VCRI-omgeving enkel om de Reflector en/of Radar in te vullen. De Radar voorzag groepsleden van informatie over het sociale gedrag (invloed, vriendelijkheid, samenwerking en betrouwbaarheid) en cognitieve gedrag (productiviteit en kwaliteit van de bijdrage) van zichzelf, hun groepsgenoten, en de groep als geheel. De Reflector stimuleerde groepsleden om individueel en gezamenlijk na te denken over de ontvangen peer-beoordelingen, hun eigen prestaties, en die van de groep. Er werd verondersteld dat Reflector groepsleden (1) bewust maakt van onrealistische zelf- en peer-percepties, (2) ondersteunt bij het vaststellen van gezamenlijke doelen om hun groepsprestaties te verbeteren, en (3) ondersteunt bij het bepalen van normen en standaarden over wat kan worden aangeduid als goed groepsgedrag en prestaties van hoge kwaliteit. Om deze redenen werd verwacht dat, in vergelijking met groepen zonder Reflector, groepen met Reflector (1) een hogere mate van consensus tussen peer-beoordelaars vertonen, (2) na verloop van tijd een andere ontwikkeling van zelf- en peer-beoordelingen laten zien, (3) betere sociale en cognitieve groepsprestaties laten zien aan het einde van het samenwerkingsproces, en (4) een meer valide beeld van hun sociale groepsprestaties vertonen. Onverwacht ervaarde geen van beide groepen (met of zonder Reflector) een toename in het sociale en cognitieve gedrag aan het einde van het samenwerkingsproces. Aan het einde toonden groepen met Reflector een meer gematigde en minder optimistische perceptie van hun sociale groepsprestaties (minder teamontwikkeling en meer groepsconflicten). Er werden geen significante verschillen gevonden op de cognitieve prestaties tussen de groepen met of zonder Reflector. Tot slot, zoals verwacht, toonden de resultaten aan dat het combineren van een zelf- en peer-beoordelingstool (Radar) met een reflectie-tool (Reflector) leidt tot: een hoger niveau van consensus tussen peer-beoordelaars, meer gematigde en minder optimistische zelf- en peer-beoordelingen na verloop van tijd, en meer valide beoordelingen van de sociale groepsprestatie
(dat wil zeggen, significante correlaties tussen de gemiddelde peer-beoordelingen op gedrag en de ervaren sociale groepsprestaties aan het einde).
8.3 Synthese
8.3.1 Waarom hebben we tools nodig om de samenwerking te verbeteren?
Werken in een groep kan zeer frustrerend zijn, vooral wanneer een groepslid niet aan de normen en waarden van zijn/haar groepsgenoten kan voldoen. Bijvoorbeeld wanneer een groepslid er niet in slaagt om zijn/haar afspraken na te komen, zijn/haar taken niet op tijd af heeft, minder productief is dan de rest van de groep, werk levert van een lagere kwaliteit, en/of meelift op het werk van anderen. Om een groepslid bewuster te maken van zijn/haar gedrag en prestaties tijdens de samenwerking, kunnen groepsleden een kritische zelfbeoordeling uitvoeren door te reflecteren op het eigen functioneren, en/of een beoordeling van hun gedrag en prestaties ontvangen van anderen (bijvoorbeeld hun groepsgenoten). Het gebruik van zelf- en peer-beoordelingen in kleine groepen kan groepsleden van nuttige informatie voorzien over hun eigen gedrag en prestaties. Echter, er kunnen zich zes problemen voordoen wanneer zelf- en peer-beoordelingen worden ingezet om prestaties te verbeteren. Ten eerste, hoewel groepsgenoten de meest nauwkeurige en best geïnformeerde beoordelaars zijn van het gedrag en de prestaties van hun groepsleden (Kane & Lawler, 1978; Lewin & Zwany, 1976; Murphy & Cleveland, 1991) en zij veel mogelijkheden hebben om het sociale (niet-taakgerelateerde) en het cognitieve (taakgerelateerde) gedrag waar te nemen, kunnen ze niet alle aspecten waarnemen die relevant en belangrijk zijn voor een succesvolle samenwerking. Groepsleden kunnen bijvoorbeeld tijdens de samenwerking door middel van observatie wel informatie verzamelen over prestaties (bijvoorbeeld werkresultaten) en gedrag (bijvoorbeeld vriendelijkheid) van specifieke groepsleden, maar kunnen niet observeren hoe de samenwerking (interactie) verloopt tussen andere groepsleden waar zij zelf niet bij aanwezig zijn. Het observeren van de samenwerking tussen groepsleden is met name lastig wanneer groepsleden communiceren via de computer (CMC), zoals bijvoorbeeld een chat. Groepsleden missen hierdoor veel informatie over de sociaal-emotionele processen, zoals bijvoorbeeld hun frustraties en gevoelens van vertrouwen, terwijl deze processen wel een cruciale rol spelen in het realiseren van een succesvolle samenwerking (Järvelä, Järvenoja, en Veermans, 2008; Järvelä, Volet, & Järvenoja, 2010). Om dit gebrek aan informatie (gedeeltelijk) te compenseren werd in dit onderzoek een CSCL-omgeving (VCRI) uitgerust met een zelf- en peer-beoordelingsinstrument (Radar) dat gemakkelijk is in te vullen en te interpreteren. Met behulp van de Radar beoordelen groepsleden zichzelf (zelfbeoordeling) en elkaar (peer-beoordeling) op vier sociale aspecten van samenwerking, namelijk invloed, vriendelijkheid, samenwerking en betrouwbaarheid. Tevens beoordelen zij zichzelf en elkaar op twee cognitieve aspecten, namelijk productiviteit en kwaliteit van de bijdrage. De Radar deelt deze zelf- en peer-beoordelingen met alle groepsleden door deze te visualiseren in een radar-diagram. Doordat de zelfbeoordelingen via de Radar worden gedeeld, ontvangen alle groepsleden informatie over elkaars intenties, bijvoorbeeld in hoeverre zij invloed beogen te hebben op het samenwerkingsproces en in hoeverre zij vriendelijk, samenwerkend en betrouwbaar beogen te zijn. Doordat de peer-beoordelingen via de Radar worden gedeeld, ontvangen alle groepsleden informatie over hoe hun intenties worden waargenomen en ervaren door hun mede-groepsleden.
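Ter illustratie volgt hieronder een minimale schets in Python van de achterliggende aggregatie (met fictieve data en zelfgekozen namen; dit is nadrukkelijk niet de feitelijke implementatie van de Radar in de VCRI): per eigenschap wordt, op de schaal van 0 tot 4, de zelfbeoordeling naast het geanonimiseerde gemiddelde van de ontvangen peer-beoordelingen gezet, zoals die in het radar-diagram getoond kunnen worden.

# Schets onder aannames (fictieve data): aggregatie van zelf- en peer-beoordelingen
# per eigenschap, als invoer voor een radar-diagram.
EIGENSCHAPPEN = ["invloed", "vriendelijkheid", "samenwerking",
                 "betrouwbaarheid", "productiviteit", "kwaliteit van de bijdrage"]

def radar_data(zelf: dict, ontvangen_peers: list) -> dict:
    """zelf: zelfbeoordeling (eigenschap -> score 0-4);
    ontvangen_peers: lijst met peer-beoordelingen voor hetzelfde groepslid."""
    data = {}
    for e in EIGENSCHAPPEN:
        scores = [p[e] for p in ontvangen_peers]
        data[e] = {
            "zelf": zelf[e],
            # anoniem gedeeld: alleen het gemiddelde van de ontvangen beoordelingen
            "peer_gemiddelde": sum(scores) / len(scores) if scores else None,
        }
    return data

# Voorbeeld: wie zichzelf op 'vriendelijkheid' een 4 geeft maar gemiddeld een 2,5
# ontvangt, ziet dit verschil tussen zelfperceptie en ontvangen peer-perceptie direct.
voorbeeld = radar_data(
    {e: 4.0 for e in EIGENSCHAPPEN},
    [{e: 2.0 for e in EIGENSCHAPPEN}, {e: 3.0 for e in EIGENSCHAPPEN}],
)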
De kracht van de Radar ligt in het feit dat dit instrument impliciete aspecten van de samenwerking (bijvoorbeeld frustraties onder groepsleden) expliciet en zichtbaar kan maken voor alle groepsleden. De Radar maakt groepsleden bewust van hun gedrag en prestaties door expliciete informatie te verschaffen over zijn/haar gedrag (bijv. dat iemand te dominant is) of prestatie (bijv. dat het werk van iemand onder de maat is). Om dit bewustzijn nog
verder te vergroten is de VCRI ook uitgerust met een reflectie-instrument (Reflector), welke groepsleden ondersteunt en stimuleert om na te denken over de verschillen tussen hun eigen waarnemingen (hun zelfbeoordelingen) en hun feitelijk gedrag (de ontvangen beoordelingen van hun groepsgenoten). Het tweede probleem van het gebruik van zelf- en peer-beoordelingen in het algemeen is dat het niet vanzelfsprekend is dat studenten de vaardigheden bezitten om effectieve feedback (beoordelingen) te geven en te ontvangen (Prins, Sluijsmans, Kirschner, & Strijbos, 2005; Sluijsmans, 2002). Sluijsmans en Van Merriënboer (2000) analyseerden peer-feedbackvaardigheden in het docentonderwijs en concludeerden dat er drie onderdelen moeten worden ondersteund, namelijk (1) het definiëren van de beoordelingscriteria, (2) het beoordelen van de bijdrage van een groepslid aan het product of de groepsprestatie, en (3) het geven van de feedback. Prins en collega’s (2005) voegden hier een vierde onderdeel aan toe, namelijk het ontvangen van de feedback. Volgens Prins en collega’s is het belangrijk dat er een feedbackdialoog ontstaat tussen de feedback-gever en ontvanger, zodat de ontvanger de feedback op waarde kan schatten, kan bepalen of hij/zij de feedback accepteert en bereid is om zijn/haar gedrag aan te passen. Om deze redenen ondersteunt de Radar groepsleden zowel bij het geven van feedback (het invoeren van de beoordelingen) als bij het ontvangen. Radar ondersteunt bijvoorbeeld de invoer door de beoordelingscriteria te definiëren en groepsleden zichzelf en elkaar te laten beoordelen op zes eigenschappen die van belang zijn bij een goede samenwerking. Tevens ondersteunt de Radar het ontvangen van de feedback door ieder groepslid zijn/haar peer-beoordelingen te verschaffen. De Reflector ondersteunt en reguleert de feedback-dialoog door groepsleden op hun ontvangen peer-beoordelingen te laten reflecteren en deze individuele reflecties te delen met de overige groepsleden. Tevens stimuleert de Reflector groepsleden om gezamenlijk te reflecteren op het functioneren van de groep als geheel, hier een gedeelde conclusie over te formuleren en doelen op te stellen ter verbetering van het groepsfunctioneren. Het derde probleem is dat studenten de neiging hebben om hun sterke punten en positieve prestaties te benadrukken, en zwakheden en negatieve prestaties toe te schrijven aan de overige groepsleden (Klein, 2001; Saavedra & Kwun, 1993). Deze neiging, ook wel bekend als attributie (zie Eccles & Wigfield, 2002; Weiner, 1985), kan resulteren in onrealistisch hoge zelfbeoordelingen en lage peer-beoordelingen. Dit probleem kan worden opgelost door groepsleden bewust te maken van – de vaak onrealistische en inaccurate – standaarden die zij gebruiken om zichzelf en anderen te beoordelen. Om deze reden deelde de Radar de zelf- en peer-beoordelingen met alle groepsleden, zodat zij zich bewust konden worden van de verschillen tussen hun persoonlijke percepties (de gegeven zelf- en peer-beoordelingen) en de percepties van anderen (de ontvangen zelf- en peer-beoordelingen). Het vierde probleem is dat groepsleden elkaar soms niet accuraat en kritisch willen beoordelen en in plaats daarvan hun groepsgenoten te mild beoordelen (Dancer & Dancer, 1992; Landy & Farr, 1983).
Tijdens de samenwerking observeren en vergelijken de groepsleden elkaars gedrag en prestaties (Goethals, Messick, & Allison, 1991), maar deze observaties worden gekleurd door selectieve waarneming, interpretaties en/of vooroordelen (Bonito & Kenny, 2010; Saavedra & Kwun, 1993). Zo kunnen bijvoorbeeld vriendschappelijke relaties tussen groepsleden ertoe leiden dat groepsleden elkaar te mild beoordelen. Het is ook mogelijk dat groepsleden elkaar gelijk beoordelen om spanningen in de groep te voorkomen (Murphy & Cleveland, 1995). Een oplossing voor dit probleem is het gebruik van anonieme peer-beoordelingen. Volgens Kagan, Kigli-Shemesh en Tabak (2006) is dit één van de meest effectieve en objectieve manieren om informatie te vergaren over individueel gedrag en prestatie. Sommige onderzoekers (zoals Cestone, Levine, & Lane, 2008) suggereren echter dat anonieme peer-beoordelingen leiden
tot felle kritieken en evaluaties, wat vervolgens weer een negatieve invloed kan hebben op de onderlinge relaties binnen de groep. Andere onderzoekers (zoals Bamberger, Erev, Kimmel, & Oref-Chen, 2005) hebben geen empirisch bewijs kunnen vinden dat anonieme peer-beoordelingen onderlinge relaties zouden beschadigen of het groepsfunctioneren zouden dwarsbomen. Om deze reden is in dit onderzoek ervoor gekozen dat de Radar de gegeven zelf- en peer-beoordelingen anoniem deelt met de overige groepsleden, zodat groepsleden (1) zich meer bewust worden van de verschillen tussen hun zelf-beoordeling en hun ontvangen peer-beoordelingen, en (2) gestimuleerd worden om accurater en kritischer te beoordelen. Het vijfde probleem is dat de verkregen informatie uit anonieme beoordelingen alleen betrouwbaar en zinvol is voor de verbetering van prestaties wanneer alle groepsleden het met elkaar eens zijn (een hoge mate van consensus vertonen) over wat zij verstaan onder goed gedrag en prestaties van hoge kwaliteit. Wanneer een groepslid bijvoorbeeld van drie groepsgenoten één overwegend positieve beoordeling ontvangt, één overwegend negatieve en nog één overwegend neutrale beoordeling, dan is het voor de ontvanger niet duidelijk of hij/zij nu wel of niet zijn/haar gedrag moet aanpassen. Om grote verschillen in beoordelingen te voorkomen moeten groepsleden zich bewust worden van de verschillende percepties en standaarden die zij gebruiken om gedrag en prestaties te beoordelen en vervolgens gezamenlijk een gedeelde standaard van normen en waarden ontwikkelen. Dit proces van normen- en waardenbepaling, dat bekend staat onder de naam ‘norming’ (Tuckman & Jensen, 1977), speelt een belangrijke rol in de groepsontwikkeling om tot een succesvolle samenwerking te komen (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002). Om deze reden verschaft de Radar informatie over, en maakt het groepsleden bewust van, (1) de verschillende percepties die groepsleden hebben op het gedrag van zichzelf en elkaar, en (2) de standaarden die groepsleden gebruiken om elkaars gedrag te beoordelen. Om tot een gedeelde standaard te komen, stimuleert de Reflector groepsleden om (a) te reflecteren op de verschillen tussen hun zelf-beoordeling en ontvangen peer-beoordelingen, en (b) gezamenlijk te reflecteren op de prestaties van de groep als geheel en hier een gedeelde conclusie over te formuleren. Dit reflectieproces stelt groepsleden in staat om te discussiëren over de verschillen tussen hun individuele normen en een gezamenlijke standaard te ontwikkelen over wat zij als groep verstaan onder goed gedrag en prestaties van hoge kwaliteit. Dat dit proces daadwerkelijk plaatsvindt, blijkt uit de resultaten van hoofdstuk 6, waarin groepsleden een hogere mate van overeenstemming (consensus) vertonen in hun peer-beoordelingen. Het zesde probleem is dat het enkel verschaffen van informatie (zoals peer-beoordelingen) waarschijnlijk niet voldoende is om, indien gewenst, het gedrag te veranderen of tot een gezamenlijke standaard te komen. Om dit te bewerkstelligen dienen groepsleden individueel te reflecteren door zichzelf af te vragen (1) welke argumenten zij hebben om zichzelf of hun groepsgenoten hoog of laag te beoordelen, (2) of zij de verschillen in percepties tussen hun zelfbeoordeling en ontvangen peer-beoordelingen begrijpen, (3) of zij deze verschillen accepteren, en (4) of deze verschillen aanwijzingen bevatten om hun gedrag aan te passen (zie Prins, Sluijsmans, & Kirschner, 2006).
Hoe dan ook, er kan niet vanuit worden gegaan dat groepsleden automatisch op hoog metacognitief niveau reflecteren op hun gegeven en ontvangen beoordelingen (zie Kollar & Fischer, 2010). Het effect van de Radar op het gedrag en de prestaties van groepsleden hangt dus af van hun bereidheid en vaardigheid om te reflecteren op de verschafte informatie in de Radar. Om dit probleem te overbruggen wordt het reflectieproces gestimuleerd, gestructureerd en gereguleerd door de Reflector. De Reflector stimuleert groepsleden om te reflecteren op - en informatie te verschaffen over - hoe zij over hun persoonlijke prestatie denken; hoe zij denken over de verschillen tussen hun zelf- en ontvangen peer-beoordelingen; in hoeverre zij het eens zijn met deze verschillen; of zij bereid zijn om hun
gedrag aan te passen; en hoe zij denken over het functioneren van de groep als geheel. Omdat het groepsfunctioneren wordt bepaald door de individuele inzet van alle groepsleden stimuleert de Reflector hen om hier gezamenlijk op te reflecteren en doelen te stellen ter verbetering. Verder structureert en reguleert de Reflector het reflectieproces door de individuele reflecties van de groepsleden te delen met hun groepsgenoten. Groepsleden kregen pas toegang tot elkaars individuele reflecties in de Reflector nadat zij zelf eerst hun reflectie hadden ingevuld en afgerond.
8.3.2 In hoeverre hebben de tools invloed op het gedrag van de groepsleden?
In dit proefschrift werd verondersteld dat de verschafte informatie door de Radar bij de eerste meting de groepsleden bewust zou maken van hun onrealistische zelf- en peer-beoordelingen (zelf- en peer-percepties) en zou resulteren in een daling van de zelf- en peer-beoordelingen bij de tweede meting. Echter, het effect van de Radar is afhankelijk van de bereidheid en het vermogen van groepsleden om te reflecteren op de ontvangen zelf- en peer-beoordelingen. Groepsleden dienen deze informatie te verwerken en zich af te vragen of zij de verschillen tussen hun zelfbeoordeling en ontvangen peer-beoordelingen begrijpen en accepteren, om zich vervolgens af te vragen of deze beoordelingen aanwijzingen geven voor gedragsverandering (zie bijvoorbeeld Hattie & Timperley, 2007; Prins, Sluijsmans, & Kirschner, 2006). Dit reflectieproces wordt gestimuleerd, gestructureerd en gereguleerd door de Reflector. Daarom werd verondersteld dat een combinatie van Radar en Reflector effectiever zou zijn dan enkel het gebruik van de Radar en zou leiden tot nog lagere zelf- en peer-beoordelingen. De reden dat Radar en Reflector dit effect hebben op de zelf- en peer-beoordelingen is dat de Radar de mogelijkheid biedt tot sociale vergelijking, wat betekent dat studenten zichzelf kunnen vergelijken met andere groepsleden. De Radar maakt groepsleden bewust van hun groepsnormen door de zelf- en (gemiddelde) peer-beoordelingen zichtbaar en beschikbaar te maken voor alle groepsleden. Deze bewustwording wordt verder versterkt door de Reflector, welke groepsleden ondersteunt en stimuleert om na te denken over de verschillen tussen hun zelfbeoordelingen en ontvangen peer-beoordelingen. Het besef dat men met anderen wordt vergeleken motiveert de groepsleden om hogere normen voor zichzelf te stellen en hun inspanningen te verhogen (zie Janssen, Erkens, Kanselaar, & Jaspers, 2007; Michinov & Primois, 2005), wat zal resulteren in lagere zelf- en peer-beoordelingen. In lijn met de veronderstelling hadden Radar en Reflector een effect op het (waargenomen) sociale en cognitieve gedrag bij de tweede meting (T2). Dat wil zeggen, groepen met Radar en Reflector vertonen een daling in de zelf-beoordelingen (Studie 3) en in de peer-beoordelingen (Studies 1 & 3) bij de tweede meting (T2). Onverwacht vertoonden de zelf- en peer-beoordelingen in Studie 2 een stijging op T2. Een mogelijke verklaring zou kunnen zijn dat, in vergelijking met Studies 1 en 3, groepsleden in Studie 2 gewend waren om met elkaar samen te werken. Voorafgaand aan het experiment van Studie 2 werkten de deelnemers een maand met elkaar samen. Waarschijnlijk heeft deze eerdere samenwerking ertoe geleid dat studenten een meer realistische perceptie hadden van elkaar en dat deze percepties in de loop der tijd positiever zijn geworden door gebruik van Radar en Reflector. Tevens, door groepsleden zich met elkaar te laten vergelijken, zijn zij niet alleen gemotiveerd om hogere normen voor zichzelf te stellen, maar ook om hun inzet te vergroten (zie Janssen et al., 2007; Michinov & Primois, 2005). Door deze verhoogde inzet en het feit dat de Reflector groepsleden ondersteunde en stimuleerde om doelen te stellen ter verbetering van het groepsfunctioneren, werd verwacht dat het gedrag (de zelf- en peer-beoordelingen) zou toenemen naar het einde van het samenwerkingsproces. Uit de resultaten bleek dat groepen met Reflector zichzelf aan het einde van de samenwerking significant hoger beoordeelden op invloed dan
groepen zonder Reflector (zie Studie 1). Peer-beoordelingen namen aan het einde van de samenwerking ook toe, maar er werd geen significant hoofdeffect gevonden voor Radar of Reflector (Studie 1). Groepen met tools ervaarden hogere niveaus van sociaal en/of cognitief gedrag van zichzelf en hun groepsgenoten aan het einde (T3), in vergelijking met groepen zonder tools (zie Studie 2). Onverwacht daalden in Studie 3 de zelf- en peer-beoordelingen voor alle groepen (Radar met of zonder Reflector) na verloop van tijd, met een hoofdeffect van de Reflector op de zelfbeoordelingen. Tot slot, bovenstaande resultaten komen niet overeen met de veronderstelling dat het gebruik van Radar en Reflector leidt tot hogere zelf- en peer-beoordelingen aan het eind van de samenwerking (zie hypotheses in Studies 1 & 2). Echter, de resultaten ondersteunen wel de veronderstelling dat (1) de Radar groepsleden motiveert om hogere normen te stellen, resulterend in lagere zelf- en peer-beoordelingen, en dat (2) de Reflector het effect van de Radar versterkt, resulterend in meer gematigde en minder optimistische beoordelingen voor groepen met Reflector (zie hypotheses in Studies 1, 2 & 3).
8.3.3 In hoeverre hebben de tools invloed op de validiteit van de beoordelingen?
Visualisatie van groepsnormen (zelf- en peer-beoordelingen in Radar) en expliciete reflectie op deze normen en standaarden (dat wil zeggen, reflectie op verschillen tussen de zelf- en peer-beoordelingen) in de Reflector, maken het mogelijk dat groepsleden eventuele verschillen in normering kunnen bespreken, om zo een gedeelde (gezamenlijke) normering (standaard) te bereiken om hun gedrag en prestaties te meten. Dit normeringsproces (Tuckman & Jensen, 1977), waarbij groepsleden overeenstemming (consensus) bereiken over hun gedrag, doelstellingen en strategieën, is een belangrijke fase in de groepsontwikkeling om tot een goed presterende groep te komen (Johnson, Suriya, Yoon, Berret, & La Fleur, 2002). Om deze reden werd in dit onderzoeksproject verwacht dat deze gedeelde normering zou leiden tot (1) hogere correlaties tussen zelf- en ontvangen peer-beoordelingen, en (2) een meer valide beeld van hun sociale prestaties, bijvoorbeeld hogere correlaties tussen het waargenomen gedrag (bijvoorbeeld de beoordeling op de variabele ‘samenwerking’ in de Radar) en hun ervaren prestaties aan het einde (zoals de beoordeling voor ‘teamontwikkeling’ in de enquête). Er werd ook verwacht dat deze gedeelde normering zou leiden tot (3) een hogere mate van overeenstemming (consensus) tussen peer-beoordelaars, dat wil zeggen de mate waarin alle groepsleden sommige groepsgenoten als zeer vriendelijk beoordelen en de andere groepsgenoten als zeer onvriendelijk. Ten eerste, in lijn met de veronderstellingen, vertoonden groepen met Radar en Reflector meer en hogere positieve correlaties tussen zelf- en peer-beoordelingen dan groepen met alleen Radar (Studie 1). Bovendien vertoonden groepen met Radar en Reflector een hogere convergentie van zelf- en peer-beoordelingen door de tijd dan groepen zonder Radar en Reflector (zie Studie 2). In Studie 3 vertoonden beide groepen (met en zonder Reflector) niet-significante of relatief kleine correlaties bij het eerste meetmoment en middelmatige tot hoge correlaties bij volgende metingen. Onverwacht vertoonden groepen met Reflector aan het einde van de samenwerking (T4) geen significante correlaties tussen zelf- en peer-beoordelingen op de variabele ‘betrouwbaarheid’, en middelmatige tot lage correlaties op de variabelen ‘productiviteit’ en ‘kwaliteit van de bijdrage’, in vergelijking met hoge correlaties op deze variabelen voor groepen zonder Reflector. Een mogelijke verklaring hiervoor is dat groepsleden die geen Reflector gebruiken de neiging houden om zichzelf te overschatten (Dunning, Heath, & Suls, 2004) en hun groepsgenoten te mild blijven beoordelen (Landy & Farr, 1983; zie bijvoorbeeld de hogere zelf- en peer-beoordelingen van groepen zonder Reflector in paragraaf 6.5.3 en 6.5.4), terwijl groepen
met Reflector tijd nodig hebben om hun beoordelingen aan te passen en meer consensus te bereiken met hun groepsgenoten (zie bijvoorbeeld de resultaten in Tabel 6.9 op T3). Ten tweede, in lijn met de veronderstellingen laten de resultaten van studenten met Reflector zien dat hun peer-beoordelingen van sociaal en cognitief gedrag significant positief correleren met hun ervaren sociale prestaties (in totaal). De peer-beoordelingen van studenten zonder Reflector correleren niet met hun ervaren sociale prestaties (in totaal). Dit geeft aan dat, voor studenten met Reflector, beoordelingen van de totale sociale prestaties gebaseerd zijn op het waargenomen sociale en cognitieve gedrag van hun groepsgenoten (Studie 3). Ten derde, in lijn met de veronderstelling vertoonden groepen met Radar en Reflector een hogere mate van consensus (dat wil zeggen, hogere partner-varianties en lagere actor-varianties) tussen peer-beoordelaars (Studie 3). Dit betekent dat beoordelingen van studenten met Reflector meer bepaald worden door het waargenomen gedrag van de beoordeelde, en minder worden bepaald door de neiging van de beoordelaar om alle groepsleden hoog of laag te beoordelen op een bepaalde eigenschap. Om deze reden wordt geconcludeerd dat het gebruik van Reflector leidt tot meer valide peer-beoordelingen. Tot slot, bovenstaande resultaten wijzen sterk in de richting dat na verloop van tijd, groepsbewustzijn-tools als Radar en Reflector kunnen leiden tot hogere correlaties tussen de zelf- en peer-beoordelingen (Studies 1, 2 & 3), een meer valide perceptie van de sociale groepsprestatie (Studie 3), en een hogere mate van overeenstemming (consensus) tussen peer-beoordelaars (Studie 3).
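De hier gebruikte begrippen actor- en partner-variantie komen uit het Social Relations Model (zie paragraaf 2.7). Als globale schets, uitgaande van de gangbare notatie en niet van de exacte schattingsprocedure uit hoofdstuk 6: een peer-beoordeling van beoordelaar i over groepsgenoot j wordt ontleed als

\[ X_{ij} = \mu + a_i + p_j + r_{ij}, \]

waarbij \(a_i\) het actor-effect is (de neiging van beoordelaar \(i\) om iedereen relatief hoog of laag te beoordelen), \(p_j\) het partner-effect (de mate waarin groepslid \(j\) door iedereen hoog of laag wordt beoordeeld) en \(r_{ij}\) het relatie-/residueffect. Een hoge consensus tussen peer-beoordelaars komt er dan op neer dat de partner-variantie \(\sigma^2_p\) groot is ten opzichte van de actor-variantie \(\sigma^2_a\): de beoordelingen worden vooral bepaald door wie er beoordeeld wordt, niet door wie er beoordeelt.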
8.3.4 In hoeverre hebben de tools invloed op de groepsprestaties aan het eind?
In dit onderzoek werd verwacht dat de sociale en cognitieve groepsprestaties verbeterd konden worden door gebruik te maken van de Radar en de Reflector omdat, zoals beschreven in de voorgaande paragrafen, verondersteld werd dat de Radar groepsleden bewust kon maken van hun gedrag, hen kon motiveren om hogere normen te stellen, en hun inzet te verhogen. Tevens werd verondersteld dat de Reflector het effect van de Radar zou vergroten. In lijn met de hypothese ervaarden groepen met tools (Radar met of zonder Reflector) een hoger niveau van sociale groepsprestaties (dat wil zeggen, groepstevredenheid) dan groepen zonder tools (zie Studies 1 en 2). Deze bevindingen komen overeen met onderzoek van Zumbach, Hillers en Reimann (2004) waaruit bleek dat hun groepsbewustzijn-instrument, welke verschillende aspecten van interactie visualiseerde (bijvoorbeeld participatie, motivatie en bijdrage), een positieve invloed had op het leerproces van studenten, hun groepsprestaties en hun motivatie. Echter, niet in lijn met de veronderstelling, ervaarden groepen met alleen Radar meer teamontwikkeling en minder conflicten binnen de groep dan groepen met Radar en Reflector (Studie 3). Een verklaring zou kunnen zijn dat Reflector groepsleden stimuleert expliciet te reflecteren op (beter te kijken naar) het groepsfunctioneren en gezamenlijke doelen te stellen voor verbetering en dat dit proces leidt tot meer bewustwording van de verschillende perspectieven op het groepsfunctioneren, resulterend in meer conflicten en minder teamontwikkeling. Dit proces komt overeen met de groepsontwikkelingsfase ‘storming’ van Tuckman en Jensen (1977). Tuckman en Jensen onderscheiden vijf fasen in groepsontwikkeling, te weten: (1) het vormen van de groep (forming), waarin de groepsleden bekend raken met elkaar en de uit te voeren taak, (2) het stormen (storming), waarin de rollen en posities binnen de groep worden bepaald, (3) het normeren (norming), waarin overeenstemming wordt bereikt over gedrag, doelen en strategieën, (4) het uitvoeren (performing), waarin resultaten worden geleverd en conclusies worden getrokken, en (5) het beëindigen (adjourning), waarin na het voltooien van de taak de samenwerking wordt beëindigd. Het is te verwachten dat een instrument als Reflector een groep
doet terugkeren naar de fase van ‘het stormen’, welke als onaangenaam en soms zelfs als pijnlijk wordt ervaren door groepsleden die niet graag in conflict zijn (bijv. Bales & Cohen, 1979; Tuckman & Jensen, 1977). Echter, de verschillen tussen de twee condities (met of zonder Reflector) zijn klein en ook zijn er geen significante verschillen gevonden op de sociale prestatie (in totaal). Er werden overigens ook geen significante hoofdeffecten gevonden voor Radar of Reflector op de cognitieve groepsprestatie (het cijfer voor hun groepsopdracht). Een verklaring hiervoor zou kunnen zijn dat in Studies 1 en 2 de samenwerkingsperiode te kort was voor de Radar en Reflector om een effect te hebben op de cognitieve groepsprestaties. De kleine spreiding in de cijfers die aan de groepsproducten zijn gegeven kan ook een van de redenen zijn waarom er geen verschil is gevonden tussen de condities.
8.3.5 Wat zijn de voorwaarden om Radar en Reflector effectief te laten zijn?
Ondanks dat de Radar gemakkelijk is in te vullen en te interpreteren en de Reflector het reflectieproces structureert, reguleert en stimuleert, hebben groepsleden toch bepaalde metacognitieve vaardigheden nodig die hen in staat stellen om hun sociale en cognitieve gedrag kritisch te vergelijken met dat van hun groepsgenoten en kritisch te reflecteren op hun individuele- en groepsprestaties (zie Prins, Veenman, & Elshout, 2006; Veenman, Wilhelm, & Beishuizen, 2004). Volgens Veenman en collega’s (2004) zijn metacognitieve vaardigheden persoonsgerelateerd over leeftijdsgroepen, daarom bevelen wij het gebruik van Radar en Reflector aan in onderwijsniveaus vergelijkbaar met het middelbaar onderwijs of hoger. Resultaten in Studie 2 laten zien dat de meeste zelf- en peer-beoordelingen toenemen bij de tweede meting, in tegenstelling tot Studies 1 en 3 waar de beoordelingen afnemen. Een verklaring voor deze tegengestelde resultaten ligt mogelijk in de onderlinge bekendheid van de groepsleden. In vergelijking met Studies 1 en 3 waren groepsleden in Studie 2 gewend om met elkaar samen te werken. Het is mogelijk dat deze eerdere samenwerking ertoe heeft geleid dat deze groepsleden reeds een realistischer (minder positief) beeld hadden van elkaars gedrag. Deze veronderstelling komt overeen met de resultaten van een studie door Janssen, Erkens, Kirschner, en Kanselaar (2009), waarin wordt aangetoond dat onderlinge bekendheid leidde tot kritischere (hogere) groepsnormen en een positieve perceptie tegenover samenwerken. De tegenovergestelde resultaten uit Studie 2 en Studies 1 en 3 geven dus aan dat Radar en Reflector effect hebben op zowel bekende als onbekende groepen. Desondanks moet er rekening worden gehouden met het feit dat een hogere mate van bekendheid kan leiden tot lagere groepsnormen (zie Studie 2), maar ook tot meer kritische (hogere) groepsnormen (Janssen et al., 2009). Zoals beschreven in paragraaf 7.3.1, zijn groepsleden soms niet bereid om accurate beoordelingen te geven en beoordelen zij hun groepsgenoten te mild (Landy & Farr, 1983) of allemaal gelijk om spanning te voorkomen (Murphy & Cleveland, 1995). Daarom is het aanbevolen dat groepsleden de anonimiteit in de Radar waarborgen zolang er geen aanleiding is om deze op te heffen, want dit zal leiden tot de meest objectieve en effectieve beoordelingen (Kagan, Kigli-Shemesh, & Tabak, 2006). Zoals beschreven in de introductie, hangt het effect van de Radar af van de bereidheid en vaardigheid van groepsleden om te reflecteren op de ontvangen informatie. Er kan echter niet automatisch vanuit worden gegaan dat groepsleden op hoog metacognitief niveau zullen reflecteren op de informatie uit de Radar (zie Kollar & Fischer, 2010). Tevens bevestigen de resultaten uit dit onderzoek dat de Reflector het effect van de Radar versterkt, wat resulteert in meer gematigde en minder optimistische zelf- en peer-beoordelingen voor groepen met Reflector (zie Studies 1, 2 & 3). Om deze redenen is de aanbevolen volgorde van invullen: eerst de Radar, daarna de Reflector.
8.3.6 Conclusies over de tools (Radar en Reflector)
De drie empirische studies in dit proefschrift onderzochten (1) manieren om groepsleden zich meer bewust te laten worden van hun sociale en cognitieve groepsgedrag, en (2) manieren om individueel sociaal en cognitief gedrag te veranderen door middel van reflectie. De verkregen resultaten zijn sterke aanwijzingen dat, na verloop van tijd, groepsbewustzijn-tools, zoals een zelf- en peer-beoordelingstool (Radar) en reflectie-tool (Reflector), een positief effect hebben op het ervaren sociale en cognitieve gedrag (Studies 1 & 2), leiden tot meer congruentie (hogere correlaties) tussen zelf- en peer-beoordelingen (Studies 1, 2 & 3), en een positief effect hebben op de sociale prestaties (Studies 1 & 2). Bovendien geven de resultaten aan dat reflectievragen gericht op bewustwording van gedrag studenten aanzetten om hogere normen voor zichzelf te stellen, wat resulteert in meer gematigde en minder optimistische zelf- en peer-beoordelingen (Studie 3). Reflectievragen die groepsleden bewust maken van verschillende normen in de groep en stimuleren om tot gezamenlijke gedeelde normen (Tuckman & Jensen, 1977) te komen, leiden tot meer consensus tussen peer-beoordelaars (Studie 3), en een meer valide beeld van de sociale groepsprestaties (Studie 3).
8.4 Methodologische opmerkingen
Hoewel de resultaten laten zien dat groepsbewustzijn-tools, zoals Radar en Reflector, een positieve invloed hebben op de interactie en sociale prestaties in kleine groepen, zijn er enkele methodologische kwesties die aan de orde moeten worden gesteld.
8.4.1 Het meten van groepsbewustzijn staat gelijk aan een interventie
In dit onderzoeksproject is de Radar een interventietool, maar ook een meetinstrument. Dit heeft als nadeel dat het niet mogelijk is om het bewustzijn te meten zonder de situatie te veranderen. Het is bijvoorbeeld niet mogelijk om te meten in hoeverre studenten zich bewust zijn van hun gedrag (dat wil zeggen, de verschillen tussen hun zelfbeeld op gedrag en hun feitelijke gedrag zoals waargenomen door hun groepsgenoten), zonder studenten hier expliciet naar te vragen. In het geval dat studenten zich niet bewust zijn van deze verschillen, zal het stellen van deze vraag hen onvermijdelijk stimuleren om hierop te reflecteren en het bewustzijn van de student vergroten. Bovendien beperkte het feit dat Radar zowel meetinstrument als interventie was enigszins de opzet (het ontwerp) van de studies. Voor groepen met alleen Reflector was het bijvoorbeeld niet mogelijk om het bewustzijn van studenten van hun gedrag te meten tijdens de samenwerking zonder tussenkomst van de onderzoekers (bijvoorbeeld met behulp van interviews, terugblikken, of inzetten van de Radar). Om dit probleem te verhelpen, is gebruik gemaakt van een tijdreeks-ontwerp (time-series design) met drie condities (tools beschikbaar vanaf het begin, vanaf halverwege, of pas aan het einde – na de samenwerking), om te onderzoeken in hoeverre bewustwording halverwege (het ontvangen van de tools halverwege het samenwerkingsproces) van invloed is op het gedrag en de groepsprestaties.
8.4.2 Vergelijkbaarheid van de drie empirische studies
Een ander probleem is de vergelijkbaarheid van de drie empirische studies. Er zijn een aantal verschillen tussen de drie studies, zoals de manier van samenwerking, gebruik van de tools, vertrouwdheid, steekproefgrootte, opleidingsniveau, leeftijd, en de duur van de samenwerkingstaak. Het belangrijkste verschil tussen de studies is de manier waarop studenten hebben samengewerkt en tools gebruikten. In Studies 1 en 2 werkten de studenten volledig
samen in een CSCL-omgeving en konden zij alleen communiceren via een chat-tool. In Studie 3 werkten studenten face-to-face samen, en gebruikten zij de CSCL-omgeving alleen om op verschillende momenten de bewustwordings-instrumenten (Radar met of zonder Reflector) in te vullen. Deelnemers hadden de mogelijkheid om doelen te stellen en plannen te maken buiten de CSCL-omgeving om, bijvoorbeeld tijdens face-to-face interacties. Dit zou kunnen verklaren waarom, in vergelijking met Studies 1 en 2, geen significante effecten van de Reflector werden gevonden op de sociale groepsprestaties. Volledige samenwerking in de CSCL-omgeving zou het effect van de Radar en Reflector hebben versterkt, want dan zouden groepsleden moeten vertrouwen op de tools om informatie over het gedrag en de prestaties van hun groepsgenoten te krijgen. Een andere verklaring voor het niet vinden van significante effecten van Reflector op de sociale groepsprestaties kan het verschil in reflectievaardigheden zijn tussen de deelnemers van Studie 1 en 2 (vierdejaars middelbare scholieren) en Studie 3 (tweedejaars onderwijskunde-studenten). Reflectievaardigheden van tweedejaars studenten zijn doorgaans verder ontwikkeld dan die van vierdejaars middelbare scholieren. Om deze reden is de toegevoegde waarde van de Reflector ook veel hoger voor middelbare scholieren dan voor universitaire studenten, wat zou kunnen verklaren waarom er alleen een effect van de Reflector wordt gevonden bij middelbare scholieren. Groepen verschilden ook in hun vertrouwdheid om met elkaar samen te werken. In vergelijking met Studies 1 en 3 waren groepsleden in Studie 2 wel vertrouwd met het samenwerken met elkaar. Voorafgaand aan Studie 2 werkten de deelnemers namelijk een maand lang samen aan een project voor het vak maatschappijleer. Het is mogelijk dat deze voorafgaande samenwerkingsperiode ertoe heeft geleid dat deze deelnemers een meer realistisch beeld hadden (dat wil zeggen minder positief) van zichzelf en hun groepsgenoten, ten opzichte van deelnemers die voorafgaand niet een maand met elkaar hebben samengewerkt. Deze aanname komt overeen met de resultaten van een studie door Janssen, Erkens, Kirschner, en Kanselaar (2009), waarin wordt aangetoond dat onderlinge bekendheid leidde tot kritischere (hogere) groepsnormen en een positieve perceptie tegenover samenwerken. Dus, ondanks de hypothese dat gebruik van Radar en Reflector zal leiden tot een afname in zelf- en peer-beoordelingen, zullen beoordelingen van groepsleden die al eerder met elkaar hebben samengewerkt waarschijnlijk toenemen na verloop van tijd. Deze aanname is ook in overeenstemming met de resultaten van Studie 2, waaruit blijkt dat de meeste zelf- en peer-beoordelingen aanzienlijk zijn toegenomen bij het tweede meetmoment (T2). Groepen die de tools beschikbaar hadden vanaf het begin ervaarden op het tweede meetmoment (T2) meer inbreng (invloed) tijdens het proces, betere samenwerking, hogere productiviteit, en een hogere kwaliteit van de bijdragen dan groepen zonder tools. Echter, in tegenstelling tot Studie 2 namen de beoordelingen op het tweede meetmoment af in de Studies 1 en 3. In conclusie, aannemend dat onderlinge bekendheid ervoor heeft gezorgd dat de resultaten uit Studie 2 niet overeenkomen met de gestelde hypothese, beschouwen we de hypothese nog steeds als plausibel en ondersteund door de resultaten uit Studies 1 en 3. Een ander verschil tussen de studies is de steekproefomvang.
De steekproefomvang in de eerste studie (pilotstudie) is relatief klein (N = 39) ten opzichte van de tweede (N = 108) en derde studie (N = 191). Hoewel in Studie 1 significante hoofdeffecten werden gevonden voor Radar op de sociale groepsprestaties (bijvoorbeeld teamontwikkeling, groepsconflict en houding ten opzichte van probleem-gestuurd samenwerken), kan over de toegevoegde waarde van deze pilotstudie worden getwist als men kijkt naar de resultaten van Studies 2 en 3. Ten slotte dient te worden opgemerkt dat de bewustwording van het gedrag niet enkel is onderzocht in twee kortetermijnstudies, maar ook in een middellangetermijnstudie. De duur van de samenwerkingstaak varieerde van twee sessies van 90 minuten, gescheiden door een week
(Studie 1), via drie sessies van 45 minuten over een periode van een week (Studie 2), tot een samenwerkingsperiode van acht weken (Studie 3). Resultaten geven aan dat de zelfbeoordelingen en ontvangen peer-beoordelingen van studenten dichter bij elkaar komen te liggen (convergeren) na verloop van tijd (zie alle studies), en dat beoordelingen objectiever en betrouwbaarder worden wanneer deze worden gecombineerd met reflectievragen (Studie 3). Op basis van deze resultaten kan geconcludeerd worden dat zelf- en peer-beoordelingen voor ontwikkelingsdoeleinden bij voorkeur meerdere malen ingezet worden (bijvoorbeeld drie keer), over een langere periode (bijvoorbeeld acht weken), en aangevuld dienen te worden met reflectievragen die gericht zijn op toekomstig (groeps)functioneren.
8.4.3 Radar: ontwerppunten en overwegingen
Een belangrijk criterium voor het ontwerpen van een zelf- en peer-beoordelingsinstrument (zoals de Radar) was om een eenvoudig in te vullen en gemakkelijk te interpreteren beoordelingsinstrument te ontwikkelen. Doel van het instrument was om het bewustzijn van groepsleden van hun sociale en cognitieve gedrag te vergroten, en daarmee ook indirect de sociale en cognitieve groepsprestaties te verbeteren. Om deze reden biedt de Radar informatie over zes kenmerken die van belang zijn voor de beoordeling van gedrag in groepen. Vier kenmerken zijn gerelateerd aan sociaal of interpersoonlijk gedrag, namelijk (1) invloed, (2) vriendelijkheid, (3) samenwerking, en (4) betrouwbaarheid; de andere twee kenmerken zijn gerelateerd aan cognitief gedrag, namelijk (5) productiviteit en (6) kwaliteit van de bijdrage. Hoewel deze kenmerken zijn afgeleid van studies over interpersoonlijke percepties, interactie, groepsfunctioneren en groepseffectiviteit (bijvoorbeeld Bales, 1988; Den Brok, Brekelmans, & Wubbels, 2006; Kenny, 1994; Salas, Sims, & Burke, 2005), kan ter discussie worden gesteld of het aantal assen (afhankelijke variabelen) in Radar voldoende is, en of er geen andere, meer geschikte eigenschappen of vaardigheden zijn om groepsleden bewust te maken van hun gedrag en indirect hun prestatie te verbeteren. Voor zover bekend waren er weinig studies of goede voorbeelden voorhanden die we konden gebruiken als richtlijn. Er zijn nauwelijks andere empirische studies die onderzoek hebben gedaan naar het effect van zelf- en peer-beoordelingen op interpersoonlijk gedrag en sociale en cognitieve prestaties (Van Gennip, Segers, & Tillema, 2009, 2010). In dit proefschrift beoordeelden studenten het sociale en cognitieve gedrag van zichzelf en hun groepsgenoten op een continue schaal van 0 tot 4 (0 = geen, 4 = zeer hoog). Het is de vraag of deze kwantitatieve manier van feedback geven de beste methode is om groepsleden bewuster te maken van hun gedrag en prestaties. Er zijn andere kwantitatieve en kwalitatieve methoden om het bewustzijn te verhogen. Twee andere kwantitatieve methoden zijn bijvoorbeeld peer-ranking en peer-nominatie. Bij peer-ranking dienen de deelnemers alle groepsleden te rangschikken van beste tot slechtste op een of meerdere factoren. Bij peer-nominatie benoemt elke deelnemer een groepslid die wordt gezien als de beste in de groep op een bepaald kenmerk (Dochy, Segers, & Sluijsmans, 1999). Echter, deze kwantitatieve methoden verschaffen alleen informatie over de best presterende collega, en geen enkele informatie over de prestaties van de andere groepsleden. Volgens de Vijver, Ul-Haq, en Wade (1995) blijft ook bij deze alternatieve kwantitatieve methoden het probleem bestaan dat beoordelaars bevooroordeeld zijn. Een voorbeeld van een kwalitatieve feedbackmethode om dit tegen te gaan, is studenten te vragen om een kort evaluatierapport te schrijven (dat wil zeggen, complimenten, argumenten en aanbevelingen) met betrekking tot groepsgenoten of groepsproducten (zie bijvoorbeeld Prins, Sluijsmans, Kirschner, & Strijbos, 2005). Desondanks zijn zelf- en peer-beoordelingen naar onze mening één van de snelste, meest eenvoudig te implementeren en in te vullen, en meest gedetailleerde methoden om studenten te helpen zich bewuster te worden van hun gedrag en prestaties.
Een ander probleem is de constructvaliditeit van zelf- en peer-beoordelingen in de Radar. Hoewel er zorgvuldigheid in acht is genomen om ervoor te zorgen dat alle beoordelaars dezelfde definitie hanteerden voor de zes kenmerken/eigenschappen in de Radar (bijvoorbeeld door het verstrekken van tekstballonnen met definities en informatie over de inhoud), kan het zijn dat de constructvaliditeit werd beperkt doordat beoordelaars verschillende normen hanteerden om zichzelf en hun groepsgenoten te beoordelen, vooral bij de eerste meting. In dit proefschrift werd verondersteld dat de betrouwbaarheid van de zelf- en peer-beoordelingen in de Radar na verloop van tijd zou worden vergroot door de Reflector, omdat de Reflector groepsleden stimuleert om gezamenlijk hun gegeven en ontvangen beoordelingen te bespreken (Farh, Cannella, & Bedeian, 1991; Saavedra & Kwun, 1993). Deze veronderstelling wordt ondersteund door een aantal bevindingen in dit proefschrift, welke aantonen dat het gebruik van Reflector (dat wil zeggen, het stimuleren van groepsleden om gezamenlijk te reflecteren op hun beoordelingen) leidt tot een hoger niveau van overeenstemming (consensus) tussen beoordelaars, en een hogere congruentie (correlaties) tussen zelf- en ontvangen peer-beoordelingen. Echter, een hoge consensus en hoge congruentie geven enkel aan dat dezelfde standaard is gebruikt (bijvoorbeeld groepsgenoten zijn het met elkaar eens dat een specifiek groepslid onvriendelijk is); het geeft geen informatie over de vraag of de redenen voor de overeenstemming terecht of onterecht zijn (bijvoorbeeld, groepsgenoten hebben misschien wel ruzie gehad met dit specifieke groepslid tijdens de lunch). Er zijn kwalitatieve analyses nodig om te onderzoeken of deze zelf- en peer-beoordelingen gebaseerd zijn op het feitelijke gedrag van de groepsleden. Een voorbeeld van kwalitatief onderzoek naar de constructvaliditeit van zelf- en peer-beoordelingen in Radar is uitgevoerd door Van Strien (2009). Van Strien analyseerde de constructvaliditeit van de zelf- en peer-beoordelingen in de Radar voor de variabelen invloed, vriendelijkheid en productiviteit door te onderzoeken of de zelf- en peer-beoordelingen (bijvoorbeeld op productiviteit) feitelijk gebaseerd zijn op de frequentie van specifieke acties in de CSCL-omgeving (bijvoorbeeld op de hoeveelheid tekst die is toegevoegd aan het groepsproduct). Van Strien vond dat deze beoordelingen slechts in beperkte mate constructvalide zijn, maar dat reflectie op het groepsfunctioneren een positief effect heeft op de constructvaliditeit. Deze bevindingen zijn in lijn met de veronderstelling dat de validiteit van de beoordelingen kan worden verhoogd door groepsleden gezamenlijk te laten reflecteren op hun beoordelingen (Farh et al., 1991; Saavedra & Kwun, 1993). Om de constructvaliditeit van zelf- en peer-beoordelingen te verhogen in een onderwijssetting, is het raadzaam dat lerenden gezamenlijk de criteria en normen opstellen die zij willen gebruiken om zichzelf en hun groepsgenoten te beoordelen (bijvoorbeeld Dochy, Segers, & Sluijsmans, 1999). Om een hoge constructvaliditeit te realiseren dienen lerenden bij voorkeur met de hele klas tot een gezamenlijke gedeelde standaard te komen, zodat de constructvaliditeit niet alleen hoog is binnen groepen, maar ook tussen groepen.
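Een dergelijke kwantitatieve controle op constructvaliditeit kan er, sterk vereenvoudigd, als volgt uitzien (een minimale schets in Python met fictieve data en zelfgekozen variabelenamen; dit is niet de daadwerkelijke analyse van Van Strien): de correlatie tussen ontvangen beoordelingen op ‘productiviteit’ en de gelogde omvang van de bijdrage aan het groepsproduct.

# Schets onder aannames (fictieve data): Pearson-correlatie tussen gemiddelde ontvangen
# peer-beoordelingen op 'productiviteit' (0-4) en het aantal toegevoegde woorden per groepslid.
from statistics import correlation  # beschikbaar vanaf Python 3.10

peer_productiviteit = [1.5, 2.0, 3.5, 3.0, 2.5]   # gemiddelde ontvangen beoordeling per groepslid
toegevoegde_woorden = [120, 260, 780, 540, 300]   # gelogde bijdrage aan het groepsproduct

r = correlation(peer_productiviteit, toegevoegde_woorden)
print(f"Pearson r = {r:.2f}")  # een hogere r wijst op beoordelingen die aansluiten bij feitelijk gedrag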
8.4.4 Reflector: ontwerppunten en overwegingen
Een belangrijk criterium bij het ontwerpen van de Reflector was om een eenvoudig in te vullen en gemakkelijk te interpreteren reflectie-instrument te ontwikkelen. Doel van het instrument was het stimuleren en reguleren van reflectie op het individuele gedrag van groepsleden en de algemene groepsprestaties. De Reflector stimuleert ieder groepslid om individueel te reflecteren op het eigen functioneren, de ontvangen peer-beoordelingen en het groepsfunctioneren. De Reflector deelt deze individuele reflecties met alle groepsleden en stimuleert groepsleden om gezamenlijk te reflecteren op het groepsfunctioneren, hierover overeenstemming te bereiken en vervolgens gemeenschappelijke doelen te stellen ter verbetering van het groepsfunctioneren. Resultaten toonden aan dat doelen en plannen voornamelijk gericht waren op verbetering van
activiteiten (dat wil zeggen, taakcoördinatie, communicatie, productiviteit en concentreren op de taak), die overigens cruciaal zijn voor een succesvolle samenwerking (Barron, 2003; Erkens et al., 2005). Echter, in dit proefschrift zijn de uitkomsten van de Reflector (zoals de gestelde doelen en geformuleerde plannen ter verbetering van het groepsfunctioneren) voornamelijk kwantitatief geanalyseerd. Er is kwalitatief onderzoek nodig om na te gaan of deze gestelde doelen daadwerkelijk leiden tot veranderingen in gedrag en activiteiten (bijvoorbeeld door met behulp van discoursanalyse na te gaan of de intentie om vriendelijker te worden daadwerkelijk heeft geleid tot meer vriendelijk en behulpzaam gedrag in de chat). De uitkomsten van de Reflector verschaffen ook geen informatie over hoe de groepsleden hun doel hebben bereikt of hun plan hebben uitgevoerd. Daarnaast is het ook interessant om te weten wat groepsleden hebben geleerd en wat ze anders doen in hun volgende samenwerkingsopdracht. Daarom zullen in toekomstig onderzoek deze reflectieve vragen worden toegevoegd aan de Reflector. Tot slot maakt het ontbreken van interviews met studenten na het samenwerkingsproces het moeilijk om te verklaren waarom en hoe de Reflector invloed heeft gehad op de zelf- en peer-beoordelingen. Daarom wordt voor toekomstig onderzoek aangeraden om meer informatie te verzamelen over de motivatie van leerlingen betreffende hun zelf- en peer-beoordelingen op verschillende momenten (dat wil zeggen, door gebruik te maken van hardop-denken tijdens het invullen van de Radar, of door deelnemers achteraf te interviewen).
8.5 Theoretische en praktische implicaties
Over het algemeen zijn de effecten van de Radar en de Reflector op het individuele gedrag en de groepsprestaties veelbelovend. Voor zover wij weten is er in eerder onderzoek geen kernachtige conclusie getrokken over (1) in welke mate peer-beoordelingen en vastgestelde reflectievragen een effect hebben op gedrag en prestatie, en (2) in welke mate vastgestelde reflectievragen leiden tot effectieve reflectieprocessen (zie Chen, Wei, Wu, & Uden, 2009; Strijbos & Sluijsmans, 2010; Topping, 1998; Van Gennip, Segers, & Tillema, 2009, 2010; Van Zundert, Sluijsmans, & Van Merrienboer, 2010). Om deze reden kan dit onderzoeksproject worden beschouwd als één van de eerste onderzoeken naar een combinatie van meerdere domeinen die relevant zijn voor een succesvolle samenwerking, zoals CSCL, teamontwikkeling, interpersoonlijke percepties, zelf- en peer-beoordelingen en reflectie. Resultaten laten zien dat zelf- en peer-beoordelingen niet per definitie betrouwbaar en valide zijn. Desondanks laten de bevindingen in dit onderzoek (zie alle studies) zien dat zelf- en ontvangen peer-beoordelingen na verloop van tijd dichter bij elkaar komen te liggen (convergeren), met name als deze beoordelingen worden ondersteund met de vastgestelde reflectievragen in de Reflector. Bijvoorbeeld, in Studie 1 laten groepen met Radar en Reflector meer en hogere positieve correlaties zien tussen zelf- en peer-beoordelingen dan groepen met alleen Radar. In Studie 2 vertonen groepen met Radar en Reflector meer convergentie tussen zelf- en peer-beoordelingen dan groepen zonder Radar en Reflector. Ook in Studie 3 vertonen beide groepen (met of zonder Reflector) bij de eerste meting geen of kleine correlaties tussen de zelf- en peer-beoordelingen, maar medium tot hoge correlaties op de metingen daarna. In dit proefschrift is reflectie niet gebruikt voor professionele ontwikkeling (zie Schön, 1983, 1987) of persoonlijke groei (zie Korthagen, 1985), maar is het ingezet om groepsleden bewust te maken van hun gedrag, groepsnormen en groepsfunctioneren, om zo nieuwe inzichten, begrip en ervaring op deze aspecten te verkrijgen (Boud, Keogh, & Walker, 1985). Zoals verwacht werden door de reflectievragen na verloop van tijd de zelf- en peer-beoordelingen objectiever en betrouwbaarder (zie Studie 3). Het stimuleren van groepsleden om te reflecteren op hun eigen
gedrag en dat van hun groepsgenoten leidt ertoe dat beoordelingen meer worden bepaald door het gedrag van de beoordeelde, en minder door de neiging van de beoordelaar om iedereen hoog of laag te scoren op een bepaald aspect (Studie 3). Dus, wanneer zelf- en peer-beoordelingen worden ingezet voor ontwikkelingsdoeleinden, dan is het aan te bevelen om meerdere beoordelingsmomenten te verdelen over een relatief lange samenwerkingsperiode (bijv. 8 weken), aangevuld met reflectievragen die gericht zijn op bewustwording van gedrag, groepsnormen en toekomstig groepsfunctioneren. De groepsbewustzijn-tools in dit onderzoek hadden helaas geen significant effect op de cognitieve groepsprestatie (het cijfer dat groepen kregen voor hun groepsproduct). Een mogelijke verklaring zou kunnen zijn dat betere sociale groepsprestaties (zoals teamontwikkeling en groepstevredenheid) niet direct leiden tot een betere kwaliteit van het groepsproduct. Desondanks tonen de resultaten aan dat groepen die deze tools (Radar met of zonder Reflector) gedurende de hele samenwerking tot hun beschikking hebben, hogere niveaus van sociale groepsprestatie rapporteren dan groepen zonder deze tools (Studies 1 en 2). Deze resultaten tonen aan dat tools als de Radar en de Reflector de sociale groepsprestaties kunnen verbeteren, met name bij middelbare scholieren (Studies 1 en 2). In dit onderzoek (zie Studie 3) verschafte het Social Relations Model (SRM) een bruikbare theoretische basis en een statistisch instrument om verschillende variantiebronnen te splitsen in actor-variantie (veroorzaakt door de neiging van beoordelaars om alle groepsgenoten gelijk – hoog of laag – te beoordelen op een bepaald aspect) en partner-variantie (afhankelijk van de mate waarin een beoordeelde gelijke beoordelingen ontvangt van zijn/haar beoordelaars). SRM werd ingezet om de afhankelijkheid van peer-beoordelingen te onderzoeken. Voor zover wij weten hebben maar enkele studies SRM gebruikt om groepsdynamiek te bestuderen (zie de review van Marcus, 1998) of om consensus vast te stellen onder peer-beoordelingen van prestaties (zie Greguras, Robie, & Born, 2001; Greguras, Robie, Born, & Koenigs, 2007). Voor toekomstig onderzoek is het interessant om de correlaties te onderzoeken tussen zelfbeoordelingen en de SRM-effecten op individueel niveau (actor- en partner-effecten). Bijvoorbeeld, de correlaties tussen zelfbeoordelingen en actor-effecten meten de veronderstelde overeenkomst, met andere woorden: komt de manier waarop een persoon zichzelf ziet overeen met hoe hij/zij anderen ziet? De correlaties tussen zelfbeoordelingen en partner-effecten meten zelf-ander overeenkomst, ofwel: komt de manier waarop anderen een persoon zien overeen met hoe hij/zij zichzelf ziet? SRM kan worden aanbevolen voor onderwijskundig onderzoek naar interpersoonlijke percepties, groepsdynamiek, en situaties waarin personen zichzelf en elkaar beoordelen op bepaalde eigenschappen of gedrag.
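Ter illustratie van deze variantiedecompositie: het basismodel van het SRM (Kenny, 1994) kan worden geschreven als (de notatie hieronder is een gangbare weergave en komt niet letterlijk uit het proefschrift):

\[
x_{ij} \;=\; \mu + \alpha_i + \beta_j + \gamma_{ij},
\]

waarbij \(x_{ij}\) de beoordeling is die beoordelaar \(i\) geeft aan beoordeelde \(j\), \(\mu\) het groepsgemiddelde, \(\alpha_i\) het actor-effect (de algemene beoordelingstendens van \(i\)), \(\beta_j\) het partner-effect (de mate waarin \(j\) van verschillende beoordelaars vergelijkbaar hoge of lage beoordelingen ontvangt) en \(\gamma_{ij}\) het relatie-effect inclusief meetfout. De in de tekst genoemde actor- en partnervariantie zijn de varianties van respectievelijk \(\alpha\) en \(\beta\).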
Dankwoord De voorkant van dit proefschrift is misleidend. Er staat namelijk maar één naam en dat wekt de indruk dat dit proefschrift de verdienste is van één persoon. Niets is minder waar. Dit proefschrift is enkel en alleen tot stand gekomen dankzij de samen- en medewerking van een heleboel mensen. Ik wil er graag een aantal persoonlijk noemen en bedanken. Als onderzoeker in het onderwijs ben je afhankelijk van de bereidwillige medewerking van onderwijsmanagers, coördinatoren, docenten, studenten en leerlingen. Dankzij de geweldige inzet en medewerking van Ruud Nelissen, Lia Klerkx en Harm Oostland heb ik tweemaal data kunnen verzamelen op het Openbaar Lyceum Zeist. Fijn dat de deur nog altijd open staat. Waardevolle data werd ook verzameld tijdens de cursus Onderwijspsychologie van de opleiding Onderwijskunde aan de Universiteit Utrecht. Casper, Femmy, Frans, Gerdine, Gijsbert en Jeroen, dank voor jullie hulp bij het verzamelen van de data. De dataverzameling vond niet alleen plaats op Nederlandse bodem, maar ook aan de Katholieke Hogeschool Leuven op het departement Gezondheidszorg en Technologie. Jetske Strijbos, Kristel Vanhoof en Thomas Scheers, dank voor uw warme welkom, flexibiliteit en geloof in de Radar en Reflector. Uw enthousiasme en motivatie werkten aanstekelijk. Bij het ontwikkelen van de Radar en Reflector, het analyseren van de data en het schrijven van de artikelen heb ik de afgelopen jaren nauw samen mogen werken met Frans Prins en Paul Kirschner, mijn co-promotor en promotor. Tijdens de samenwerking fungeerden zij als mijn persoonlijke Radar en Reflector. Paul, je bent een kei in het bewaken van de productiviteit en het waarborgen van de kwaliteit. Je weet mensen te inspireren, motiveren en samen te brengen. Ik voel me vereerd dat ik mij als Aio onder jouw vleugels heb mogen ontwikkelen. Frans, in de rol als dagelijks begeleider heb jij veruit de meeste invloed gehad op mijn sociale en cognitieve processen binnen dit promotietraject. Je bent een kei in het doorhakken van knopen, prioriteiten stellen en doelgericht werken. Je weet mensen te enthousiasmeren, gerust te stellen en boven zichzelf uit te laten stijgen. We vormden niet alleen een professioneel duo in Utrecht, maar ook een muzikaal duo. Zowel binnen als buiten werktijd voegde jij een vrolijke noot toe. Paul en Frans, bedankt voor jullie ideeën, energie en vertrouwen. Ik kijk met plezier terug op onze samenwerking en hoop dit in de toekomst voort te zetten. Er zijn nog meer mensen intensief betrokken geweest bij de ontwikkeling van de Radar en Reflector. Jos, ontzettend bedankt voor al je uren programmeren en het testen van de tools tot diep in de nacht en in het weekend. Fijn dat ik altijd op je kon rekenen. Gijsbert, bedankt voor het meedenken en jouw eindeloze inzet voor het onderwijskundig onderzoek en onderwijs in Utrecht. Jeroen, dankzij jou verliep de structurering en analyse van de gigantische berg data een stuk soepeler. Heerlijk om collega’s met zoveel expertise om mij heen te hebben. Ook Johan van Strien wil ik hier graag bedanken. Als student-assistent heb je ons team enorm geholpen bij het analyseren van de data. Je werk was nauwkeurig en gestructureerd. De sociale interactie met mijn collega’s van de afdeling Onderwijskunde heeft ook een belangrijke rol gespeeld in mijn promotietraject. Met hen kon ik praten en lachen over werk en niet werk gerelateerde zaken. Dankzij hen voelde de afdeling als een tweede thuis. Marieke, zelden ben ik zo warm en enthousiast ontvangen op mijn ‘oude’ werkplek. 
Dank daarvoor. Hendrien, de herinnering aan onze gesprekken tovert nog steeds een brede glimlach op mijn
gezicht. Luce, ook jouw openhartigheid maakt me blij. Tim, buddy, bedankt voor je gulle lach en support. Harmen en Bert, strijders, “over jullie en al onze ervaringen zou ik een boek kunnen schrijven, maar dat valt buiten het bestek van dit proefschrift” (cf., Slof, 2011, p. 148). Met mij als hekkensluiter is de reis van de ‘Three Stooges’ nu eindelijk voltooid. Gedurende dit traject hebben we elkaar goed leren kennen door samen tientallen kilometers te lopen, honderden kilometers te fietsen en duizenden kilometers te vliegen. Het was een mooi avontuur en ik verheug me op het vervolg. “Wij hebben laten zien dat collegialiteit en vriendschap perfect samen kunnen gaan. Ik kon me geen betere kamergenoten wensen” (cf., Schaap, 2011, p. 152). Gedurende de laatste fase van mijn proefschrift was ik tevens werkzaam bij het ICLON aan de Universiteit Leiden. Daar trof ik een professionele organisatie en geweldige collega’s. Dankzij hen voelde ik mij ook daar al snel op mijn plek. Klaas, ik denk met plezier terug aan onze gesprekken. Dank voor je openheid, humor en flexibiliteit. Je hebt een groot hart voor het vak, maar ook voor de mensen om je heen. Ik heb veel van je geleerd. Jacobiene, Rosanne, Rikkert, Monika, Alessandra, Eke, Ben, Dineke, Esther, Elise, Albert, Cris, Jeannette, Tamara, Nelleke, Mariska, Anneke en andere collega’s, bedankt voor de prettige samenwerking, betrokkenheid en steun. Lieve familie en vrienden, bedankt voor jullie interesse in mijn werk en alle zorg en aandacht voor mij! Otto, Sjaan, Mariska, Ralph, Kris, Peter, Elly, Willem, Cate, Lizan en Thijs, wat ben ik gezegend met zo’n lieve en warme (schoon)familie. Fijn dat ik altijd op jullie kan rekenen. Volleybalteam Citokio (met aanhang) en zanggroep The Young Vocals wil ik hier ook graag noemen. Jullie zorgden de afgelopen jaren voor de nodige fysieke en mentale ontlading. Ook jullie hebben een belangrijke rol gespeeld in het slagen van dit project. Andrea, bedankt voor je betrokkenheid en de Engelse spellingcontrole. Ten slotte, Anne, wat ben je toch een fantastische vrouw! Ik heb de nodige uren, dagen, avonden, weekenden en vakanties van je geclaimd om mijn proefschrift af te ronden, maar kon altijd rekenen op je onvoorwaardelijke steun en liefde. Het proefschrift is nu af! Dank voor het ontwerpen van de mooie omslag. Ik ben er nu weer helemaal voor jou en Thomas.
List of Publications
Submitted journal articles
Phielix, C., Prins, F. J., Kirschner, P. A., Janssen, J., & Slof, B. (submitted). Using reflection to increase reliability of peer assessments on social and cognitive behavior: Do we need to reflect?
Van Strien, L. H., Phielix, C., Prins, F. J., Kirschner, P. A., & Erkens, G. (submitted). Construct validity of self- and peer ratings of influence, friendliness, and productivity in an electronic peer feedback and reflection tool. Small Group Research.
Journal articles, refereed
Phielix, C., Prins, F. J., & Kirschner, P. A. (2010). Awareness of group performance in a CSCL environment: Effects of peer feedback and reflection. Computers in Human Behavior, 26, 151-161.
Phielix, C., Prins, F. J., Kirschner, P. A., Erkens, G., & Jaspers, J. (2011). Group awareness of social and cognitive performance in a CSCL environment: Effects of a peer feedback and reflection tool. Computers in Human Behavior, 27, 1087-1102.
Slof, B., Erkens, G., Kirschner, P. A., Janssen, J., & Phielix, C. (2010). Fostering complex learning-task performance through scripting student use of computer supported representational tools. Computers & Education, 55, 1707-1720.
Journal articles, non-refereed
Books
Erkens, G., Jaspers, J., Van Gisbergen, M., Phielix, C., & Kanselaar, G. (2003). Projectonderwijs in ICT-leeromgeving in de tweede fase VO. Utrecht, The Netherlands: Universiteit Utrecht.
Conference presentations
Individual papers
Phielix, C., Prins, F. J., Janssen, J., & Kirschner, P. A. (2011). Using a Reflection Tool to Increase Reliability of Peer Assessments in a CSCL Environment. In Spada, H., Stahl, G., Miyake, N., & Law, N. (Eds.), Proceedings of the Tenth International Conference on Computer Supported Collaborative Learning (pp. 326-333). Hong Kong, China: International Society of the Learning Sciences, Inc.
Phielix, C., Prins, F. J., & Kirschner, P. A. (2010). Group awareness of social and cognitive behavior in a CSCL environment. Paper presented at the Ninth Biennial International Conference of the Learning Sciences, Chicago, United States of America.
Phielix, C., Prins, F. J., & Kirschner, P. A. (2009, August). The design of Peer Feedback and Reflection Tools in a CSCL Environment. Paper presented at the Thirteenth Biennial International Conference of the European Association for Research on Learning and Instruction, Amsterdam, The Netherlands.
Phielix, C., Prins, F. J., & Kirschner, P. A. (2009, June). The design of Peer Feedback and Reflection Tools in a CSCL Environment. In O’Malley, C., Suthers, D., Reimann, P., & Dimitracopoulou, A. (Eds.), Proceedings of the Eighth International Conference on Computer Supported Collaborative Learning (pp. 626-635). Rhodes, Greece: International Society of the Learning Sciences, Inc.
Phielix, C., Prins, F. J., & Kirschner, P. A. (2008, June). Het effect van een peer assessment- en reflectie tool op sociale interactie, sociabiliteit en groepsprestatie in een CSCL omgeving. Round table gepresenteerd op de Onderwijs Research Dagen, Eindhoven.
Phielix, C., Prins, F. J., & Kirschner, P. A. (2009, May). Individueel- en groepsbewustzijn in een CSCL-omgeving: het effect van een peer feedback- en reflectie tool. [Individual and group awareness in a CSCL environment: the effect of a peer feedback and reflection tool.] In Onderwijs: een kwestie van emancipatie en (on)gelijkheid. Proceedings van de 36e Onderwijs Research Dagen 2009 (pp. 149-150). Leuven: VOR, VFO, Katholieke Universiteit Leuven.
Workshops
Fischer, F., Prins, F. J., & Phielix, C. (2010, March). Social Scaffolds in Innovative Learning Environments. Workshop performed at the ICO Winter School, Regensburg (March 21-25, 2010).
Sluijsmans, D., Van Gennip, N., Gielen, S., Phielix, C., Prins, F., Strijbos, J. W., Van de Watering, G., & Van Zundert, M. (2007, November). Guidelines for integrating peer assessment in team work: Assuring psychological safety, group awareness and individual responsibility. Workshop performed at the second European Practice Based and Practitioner Research (PBPR) Conference on Learning and Instruction, Maastricht, The Netherlands.
Curriculum Vitae
Chris Phielix was born on September 11th 1977 in Utrecht, The Netherlands. He completed secondary education in 1997 at O.S.G. Schoonoord in Zeist. In September 1999 he started his study Educational Sciences at Utrecht University. In the third year of his study he worked as an assistant-teacher in several courses in the field of Educational Sciences. In the final year of his study Phielix wrote his Master's thesis, which addressed one of the research questions of an NWO project (the PRO-ICT project), in which he also collaborated as a student assistant researching project-based CSCL in secondary education. After receiving his Master's degree in February 2004, he worked as a teacher at Utrecht University. In December 2004 he started working as a research assistant at CITO, an international testing and assessment company in Arnhem, and in May 2005 he became an educational consultant at FLOAT in Doorn. Phielix started his PhD research in December 2006 at the Research Centre Learning in Interaction (RCLI) at Utrecht University, as part of the research group led by Prof. Paul A. Kirschner. Alongside his PhD project, Phielix was PhD coordinator for the ICO theme group 'Innovative learning arrangements' and lectured at the Department of Social and Behavioral Sciences, expecting to earn his teaching qualification for lecturing at universities in 2012. In January 2011, Phielix joined the Graduate School of Teaching (ICLON) at Leiden University, where he works as a supervisor and researcher at the expertise centre 'Teacher Learning', led by Dr. Klaas van Veen. In June 2012 Phielix completed his PhD project, which focused on socio-emotional aspects of group functioning. The main research goals were (1) examining ways to make group members aware of their socio-emotional group behavior by means of a peer-feedback tool, and (2) examining ways to alter individual socio-emotional group behavior by means of reflection prompts. This thesis is the result of that research project, aspects of which have been published in international journals and presented at several national and international conferences.
List of ICO-Dissertations 2011
221. Slof, B. (28-01-2011). Representational scripting for carrying out complex learning tasks. Utrecht: Utrecht University.
222. Fastré, G. (11-03-2011). Improving sustainable assessment skills in vocational education. Heerlen: Open University of the Netherlands.
223. Min-Leliveld, M.J. (18-05-2011). Supporting medical teachers’ learning: Characteristics of effective instructional development. Leiden: Leiden University.
224. Van Blankenstein, F.M. (18-05-2011). Elaboration during problem-based small group discussion: A new approach to study collaborative learning. Maastricht: Maastricht University.
225. Dobber, M. (21-06-2011). Collaboration in groups during teacher education. Leiden: Leiden University.
226. Jossberger, H. (24-06-2011). Towards self-regulated learning in vocational education: Difficulties and opportunities. Heerlen: Open University of the Netherlands.
227. Schaap, H. (24-06-2011). Students' personal professional theories in vocational education: Developing a knowledge base. Utrecht: Utrecht University.
228. Kolovou, A. (04-07-2011). Mathematical problem solving in primary school. Utrecht: Utrecht University.
229. Beausaert, A.J. (19-10-2011). The use of personal development plans in the workplace: Effects, purposes and supporting conditions. Maastricht: Maastricht University.
230. Favier, T.T. (31-10-2011). Geographic information systems in inquiry-based secondary geography education: Theory and practice. Amsterdam: VU University Amsterdam.
231. Brouwer, P. (15-11-2011). Collaboration in teacher teams. Utrecht: Utrecht University.
232. Molenaar, I. (24-11-2011). It’s all about metacognitive activities; Computerized scaffolding of self-regulated learning. Amsterdam: University of Amsterdam.
233. Cornelissen, L.J.F. (29-11-2011). Knowledge processes in school-university research networks. Eindhoven: Eindhoven University of Technology.
234. Elffers, L. (14-12-2011). The transition to post-secondary vocational education: Students’ entrance, experiences, and attainment. Amsterdam: University of Amsterdam.
235. Van Stiphout, I.M. (14-12-2011). The development of algebraic proficiency. Eindhoven: Eindhoven University of Technology.