Perspectives in Psychology
5.4 MENTAL PROCESSES
The second major body of research in cognitive science has sought to explain the mental processes that operate on the representations we construct of our knowledge of the world. Of course, it is not possible to separate our understanding, nor our discussion, of representations and processes. Indeed, the sections on mental models and expertise made this abundantly clear! However, a body of research exists that has tended to focus more on process than representation. It is to this that we now turn.
All of what follows in this section rests on the assumption that cognitive actions operate on mental representations. As the cognitive actions occur, mental representations change in some way. And changes in mental representations mean changes in our knowledge of the world, which we call learning. By and large, we can therefore think of three families of cognitive processes, each bringing about its own kind of change in mental representation, and therefore resulting in its own kind of learning. The distinctions, predictably, are not always clear. But the three kinds of mental processes have to do with (1) information processing, (2) symbol manipulation, and (3) knowledge construction. We shall examine each of these in turn.
5.4.1 Information-Processing Accounts of Cognition
As we have seen, one of the basic tenets of cognitive theory is that information that is present in an instructional stimulus is acted on by a variety of mediating variables before the student produces a response. Information-processing accounts of cognition describe stages that information moves through in the cognitive system and suggest processes that operate at each step. We therefore begin this section with a general account of information processing in human beings. This account sets the stage for our consideration of cognition as symbol manipulation and as knowledge construction.
Although the rise of information-processing accounts of cognition cannot be ascribed uniquely to the development of the computer, the early cognitive psychologists' descriptions of human thinking use distinctly computerlike terms. Like computers, people were supposed to take information from the environment into "buffers," to "process" it before "storing it in memory." Information-processing models describe the nature and function of putative "units" within the human perceptual and cognitive systems, and how they interact. They trace their origins to Atkinson and Shiffrin's (1968) model of memory, which was the first to suggest that memory consisted of a sensory register, a short-term store, and a long-term store. According to Atkinson and Shiffrin's account, information is registered by the senses and then placed into a short-term storage area. Here, unless it is worked with in a "rehearsal buffer," it decays after about 15 seconds. If information in the short-term store is rehearsed to any significant extent, it stands a chance of being placed into the long-term store, where it remains more or less permanently. With no more than minor changes, this model of human information processing has persisted in the instructional technology literature (R. Gagné, 1974; E. Gagné, 1985) and in recent ideas about long-term and short-term, or working, memory (Gagné & Glaser, 1987). The importance that every instructional designer gives to practice stems from the belief that rehearsal improves the chance of information passing into long-term memory.
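The flow of information through the stores described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Atkinson and Shiffrin's formal model: the 15-second decay figure comes from the account above, whereas the class name MemorySystem, its methods, and the threshold of three rehearsals for transfer to the long-term store are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

DECAY_SECONDS = 15            # decay figure taken from the account above
REHEARSALS_TO_TRANSFER = 3    # hypothetical threshold, chosen only for illustration


@dataclass
class MemorySystem:
    # short-term store: item -> [time last refreshed, number of rehearsals]
    short_term: dict = field(default_factory=dict)
    # long-term store: items that have been rehearsed often enough
    long_term: set = field(default_factory=set)

    def register(self, item, now):
        """Place a newly sensed item into the short-term store."""
        self.short_term[item] = [now, 0]

    def forget_decayed(self, now):
        """Drop items that have gone unrehearsed for DECAY_SECONDS or more."""
        self.short_term = {
            item: entry
            for item, entry in self.short_term.items()
            if now - entry[0] < DECAY_SECONDS
        }

    def rehearse(self, item, now):
        """Refresh an item; enough rehearsals copy it to the long-term store."""
        self.forget_decayed(now)
        if item not in self.short_term:
            return  # already decayed, nothing left to rehearse
        entry = self.short_term[item]
        entry[0] = now
        entry[1] += 1
        if entry[1] >= REHEARSALS_TO_TRANSFER:
            self.long_term.add(item)


memory = MemorySystem()
memory.register("phone number", now=0)
memory.register("stranger's name", now=0)
for t in (5, 10, 15):
    memory.rehearse("phone number", now=t)
memory.forget_decayed(now=20)
print(sorted(memory.long_term), sorted(memory.short_term))
```

In this sketch the rehearsed item reaches the long-term store while the unrehearsed one is lost, which is the behavior the account attributes to rehearsal.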
A major problem that this approach to explaining human cognition pointed to was the relative inefficiency of human beings at information processing. This was taken to be a result of the limited capacity of working memory, which holds roughly seven (Miller, 1956) or five (Simon, 1974) pieces of information at one time. (E. Gagné [1985, p. 13] makes an interesting comparison between a computer's and a person's capacity to process information. The computer wins handily. However, human capacity to be creative, to imagine, and to solve complex problems does not enter into the equation.) It therefore became necessary to modify the basic model to account for these observations. One modification arose from studies like those of Shiffrin and Schneider (1977) and Schneider and Shiffrin (1977). In a series of memory experiments, these researchers demonstrated that, with sufficient rehearsal, people automatize what they have learned so that what was originally a number of discrete items becomes one single "chunk" of information. With what is referred to as "overlearning," the limitations of working memory can be overcome. The notion of chunking information in order to make it possible for people to remember collections of more than five things has become quite prevalent in the information-processing literature (see Anderson, 1983). And rehearsal strategies intended to induce chunking became part of the standard repertoire of tools used by instructional designers.
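The capacity argument can be made concrete with a small sketch. It assumes a span of seven units (Miller's rough estimate) and uses fixed-size grouping as a stand-in for the recoding that rehearsal is said to produce. The function names and the digit string are hypothetical and serve only as illustration.

```python
WORKING_MEMORY_SPAN = 7  # assumed span, following Miller's (1956) rough estimate


def chunk(items, size):
    """Recode a list of discrete items into larger units of `size` items each."""
    return ["".join(items[i:i + size]) for i in range(0, len(items), size)]


def fits_in_working_memory(units, span=WORKING_MEMORY_SPAN):
    """True if the number of units does not exceed the assumed span."""
    return len(units) <= span


digits = list("206555013942")   # twelve separate digits (made-up example)
chunks = chunk(digits, 3)       # the same digits recoded as four three-digit units

print(fits_in_working_memory(digits))   # twelve units exceed the span
print(fits_in_working_memory(chunks))   # four units fit within it
```

The information held is the same in both cases. Only the number of units that must be kept active changes.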
Another problem with the basic information-processing account arose from research on memory for text in which it was demonstrated that people remembered the ideas of passages rather than the text itself (Bransford & Franks, 1971; Bransford & Johnson, 1972). This suggested that what was passed from working memory to long-term memory was not a direct representation of the information in short-term memory but a more abstract representation of its meaning. These abstract representations are, of course, schemata, which we discussed at some length earlier. Schema theory added a whole new dimension to ideas about information processing. So far, information-processing theory assumed that the driving force of cognition was the information that was registered by the sensory buffers, that is, that cognition was data driven, or bottom-up. Schema theory proposed that information processing was, at least in part, top-down. This meant, according to Neisser (1976), that cognition is driven as much by what we know as by the information we take in at a given moment. In other words, the contents of long-term memory play a large part in the processing of information that passes through working memory. For instructional designers, it became apparent that strategies were required that guided top-down processing by activating relevant schemata and aided retrieval by providing the correct context for recall. The "elaboration theory of instruction" (Reigeluth & Stein, 1983; Reigeluth & Curtis, 1987) achieves both of these ends (see 18.4.3). Presenting an epitome of the content at the beginning of instruction activates relevant schemata. Providing synthesizers at strategic points during instruction helps students remember, and integrate, what they have learned up to that point.
Bottom-up, information-processing approaches have recently regained ground in cognitive theory as the result of the recognition of the importance of preattentive perceptual processes (Marr, 1982; Arbib & Hanson, 1987; Boden, 1988; Treisman, 1988; Pomerantz, Pristach & Carlson, 1989). Our overview of cognitive science, mentioned before, described computational approaches to cognition. In this return to a bottom-up approach, however, we can see marked differences from the bottom-up, information-processing approaches of the 60s and 70s. Bottom-up processes are now clearly confined within the barrier of what Pylyshyn (1984) called cognitive impenetrability. These are processes over which we can have no attentive, conscious, effortful control. Nonetheless, they impose a considerable amount of organization on the information we receive from the world. In vision, for example, it is likely that all information about the organization of a scene, except for some depth cues, is determined preattentively (Marr, 1982). What is more, preattentive perceptual structure predisposes us to make particular interpretations of information, top-down (Owens, 1985a, 1985b; Duong, 1994). In other words, the way our perception processes information determines how our cognitive system will process it. Subliminal advertising works!
Although we still talk rather glibly about short-term and long-term memory and use rather loosely other terms that come from information-processing models of cognition, information-processing theories have matured considerably since they first appeared in the late 50s. The balance between bottom-up and top-down theories, achieved largely within the framework of computational theories of cognition, offers researchers a good conceptual framework within which to design and conduct studies. Equally, instructional designers who are serious about bringing cognitive theory into educational technology will find in this latest incarnation of information-processing theory an empirically valid and rationally tenable basis for making decisions about instructional strategies.
5.4.2 Cognition as Symbol Manipulation
How is information that is processed by the cognitive system represented by it? One very popular answer is "as symbols." This notion lies close to the heart of cognitive science and, as we saw in the very first section of this chapter, it is also the source of some of the most virulent attacks on cognitive theory (Clancey, 1993). The idea is that we think by mentally manipulating symbols that are representations, in our mind's eye, of referents in the real world. There is a direct mapping between objects and actions in the external world and the symbols we use internally to represent them. Our manipulation of these symbols places them into new relationships with each other, allowing new insights into objects and phenomena. Our ability to reverse the process by means of which the world was originally encoded as symbols therefore allows us to act on the real world in new and potentially more effective ways.
We need to consider both how well people can manipulate symbols mentally and what happens as a result. The clearest evidence for people's ability to manipulate symbols in their "mind's eye" comes from Kosslyn's (1985) studies of mental imagery. Kosslyn's basic research paradigm was to have his subjects create a mental image and then to instruct them directly to change it in some way, usually by "zooming" in and out on it. Evidence for the success of his subjects at doing this was found in their ability to answer questions about properties of the imaged objects that could only be inspected as a result of such manipulation.
The work of Shepard and his colleagues (Shepard & Cooper, 1982) represents another "classical" case of our ability to manipulate images in our mind's eye. The best known of Shepard's experimental methods is as follows. Subjects are shown two three-dimensional solid figures seen from different angles. The figures may be the same or different. The subjects are asked to judge whether the figures are the same or different. In order to make the judgment, it is necessary to rotate mentally one of the figures in three dimensions in an attempt to orient it to the same position as the target, so that a direct comparison may be made. Shepard consistently found that the time it took to make the judgment was almost perfectly correlated with the number of degrees through which the figure had to be rotated, suggesting that the subject was rotating it in real time in the mind's eye.
Finally, Salomon (1979) speaks more generally of "symbol systems" and of people's ability to internalize them and use them as "tools for thought." In an early experiment (Salomon, 1974), he had subjects study paintings in one of the following three conditions: (a) A film showed the entire picture, zoomed in on a detail, and zoomed out again, for a total of 80 times. (b) The film cut from the whole picture directly to the detail without the transitional zooming. (c) The film showed just the whole picture. In a posttest of cue attendance, in which subjects were asked to write down as many details as they could from a slide of another picture, low-ability subjects performed better if they were in the "zooming" group. High-ability subjects did better if they just saw the entire picture. Salomon concluded that zooming in and out on details, which is a symbolic element in the symbol system of film, television, and any form of motion picture, modeled for the low-ability subjects a strategy for cue attendance that they could execute for themselves cognitively. This was not necessary for the high-ability subjects. Indeed, there was evidence that modeling the zooming strategy reduced performance of high-ability subjects because it got in the way of mental processes that were activated without prompting. Bovy (1983) found results similar to Salomon's using "irising" rather than zooming. A similar interaction between ability and modeling was reported by Winn (1986) for serial and parallel pattern-recall tasks.
Salomon has continued to develop the notion of internalized symbol systems serving as cognitive tools. Educational technologists have been particularly interested in his research on how the symbolic systems of computers can "become cognitive," as he put it (Salomon, 1988). The internalization of the symbolic operations of computers led to the development of a word processor, called the "Writing Partner" (Salomon, Perkins & Globerson, 1991), that helped students write. The results of a number of experiments showed that interacting with the computer led the users to internalize a number of its ways of processing, which led to improved metacognition relevant to the writing task. Most recently (Salomon, 1993), this idea has evolved even further, to encompass the notion of distributing cognition among students and machines (and, of course, other students).
This research has had two main influences on educational technology. The first, derived from work in imagery of the kind reported by Kosslyn and Shepard, provided an attractive theoretical basis for the development of instructional systems that incorporate large amounts of visual material (Winn, 1980, 1982). The promotion and study of visual literacy (Dondis, 1973; Sless, 1981) is one manifestation of this activity. A number of studies have shown that the use of visual instructional materials can be beneficial for some students studying some kinds of content. For example, Dwyer (1972, 1978) has conducted an extensive research program on the differential benefits of different kinds of visual materials, and has generally reported that realistic pictures are good for identification tasks, line drawings for teaching structure and function, and so on. Explanations for these different effects rest on the assumption that different ways of encoding material facilitate some cognitive processes rather than others; that is, some materials are more effectively manipulated in the mind's eye for given tasks than others.
The second influence of this research on educational technology has been in the study of the interaction between technology and cognitive systems. Salomon's research, which we just described, is of course an example of this. The work of Papert and his colleagues at MIT's Media Lab is another important example. Papert (1983) began by proposing that young children can learn the "powerful ideas" that underlie reasoning and problem solving by working (perhaps playing is the more appropriate term) in a microworld over which they have control. The archetype of such a microworld is the well-known LOGO environment (see 22.214.171.124) in which the student solves problems by instructing a "turtle" to perform certain tasks. Learning occurs when the children develop problem definition and debugging skills as they write programs for the turtle to follow. Working with LOGO, children develop fluency in problem solving as well as specific skills, like problem decomposition and the ability to modularize problem solutions. Like Salomon's (1988) subjects, the children who work with LOGO (and in other technology-based environments [Harel & Papert, 1991]) internalize a lot of the computer's ways of using information and develop skills in symbol manipulation that they use to solve problems.
There is, of course, a great deal of research into problem solving through symbol manipulation that is not concerned particularly with technology. The work of Simon and his colleagues is central to this research. (See Klahr & Kotovsky's [1989] edited volume that pays tribute to his work.) It is based largely on the notion that human reasoning operates by applying rules to encoded information that manipulate the information in such a way as to reveal solutions to problems. The information is encoded as a "production system" that operates by testing whether the conditions of rules are true or not, and following specific actions if they are (see also 24.8.1). A simple example: "If the sum of an addition of a column of digits is 10 or greater, then write down the right-hand integer and carry 1 to add to the next column." The "if . . . then . . ." structure is a simple production system in which a mental action is carried out (add 1 to the next column) if a condition is true (the sum is 10 or greater).
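The column-addition rule can be written out as a minimal production system. The sketch below is an illustration only, not a reconstruction of any published model: the names Production, RULES, and add_by_productions are hypothetical, working memory is reduced to a small dictionary, and the carry rule fires when the column sum is 10 or more, since a sum of exactly 10 also requires a carry.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Production:
    """A condition-action pair: the action runs only if the condition holds."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


def carry_action(state):
    # Write the right-hand digit and carry the rest to the next column.
    state["written"].append(state["column_sum"] % 10)
    state["carry"] = state["column_sum"] // 10


def write_action(state):
    # The sum is a single digit: write it down, nothing to carry.
    state["written"].append(state["column_sum"])
    state["carry"] = 0


RULES = [
    Production("carry", lambda s: s["column_sum"] >= 10, carry_action),
    Production("write", lambda s: s["column_sum"] < 10, write_action),
]


def add_by_productions(a, b):
    """Add two non-negative integers column by column using RULES."""
    digits_a = [int(d) for d in reversed(str(a))]
    digits_b = [int(d) for d in reversed(str(b))]
    width = max(len(digits_a), len(digits_b))
    digits_a += [0] * (width - len(digits_a))
    digits_b += [0] * (width - len(digits_b))

    state = {"written": [], "carry": 0, "column_sum": 0}  # working memory
    for da, db in zip(digits_a, digits_b):
        state["column_sum"] = da + db + state["carry"]
        for rule in RULES:                 # test each condition in turn
            if rule.condition(state):
                rule.action(state)         # fire the first rule that matches
                break
    if state["carry"]:
        state["written"].append(state["carry"])
    return int("".join(str(d) for d in reversed(state["written"])))


print(add_by_productions(47, 85))  # -> 132
```

Each pass through the loop is one recognize-act cycle: the conditions are tested against the current contents of working memory, and the matching action changes those contents.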
An excellent illustration is to be found in Larkin and Simon's (1987) account of the superiority of diagrams over text for solving certain classes of problems. Here, they develop a production system model of pulley systems to explain how the number of pulleys attached to a block, and the way in which they are connected, affects the amount of weight that can be raised by a given force. The model is quite complex. It is based on the idea that people need to search through the information presented to them in order to identify the conditions of a rule (e.g., if a rope passes over two pulleys between its point of attachment and a load, its mechanical advantage is doubled) and then compute the results of applying the production rule in those given circumstances. The two steps, searching for the conditions of the production rule and computing the consequences of its application, draw on cognitive resources (memory and processing) to different degrees. Larkin and Simon's argument is that diagrams require less effort to search for the conditions and to perform the computation, which is why they are so often more successful than text for problem solving.
It is easier to explain the symbol manipulation required to search for information and use it to compute the answer to a question with a simpler example. Winn, Li, and Schill (1991) conducted an empirical test of some aspects of Larkin and Simon's account using family trees rather than pulley systems. Subjects examined either family trees or statements about who was related to whom. They were given questions to answer about kinship, such as, "Is Mary Jack's second cousin?" The dependent measure of most interest was the speed at which subjects were able to answer the questions. Because the information presented in the text required more cognitive manipulation than that provided by the family trees, from which answers could be obtained by simple inspection, it was expected that subjects seeing diagrams would be able to answer kinship questions more quickly than those who saw text. This turned out to be the case.
These results, along with analysis of strategies that subjects used to find answers to the questions, supported the following interpretation. The text condition provided simple factual statements about who was whose parent, such as "Jack is Mary's parent; Jack is Edward's parent; Mary is Penny's parent. . . ." To answer a question from text, such as, "Is Amy Joseph's first cousin?", the subject had to read through the list until the first relevant piece of information was found, which in this case would be a statement about who Amy's parent was. That information had to be stored in memory, while the second piece of information, about Joseph's parents, was sought and remembered. For first cousins, it was necessary to repeat this search-and-store process twice more, to find who were the parents of Amy's and Joseph's parents, before all the conditions of the production could be satisfied. This required encoding and retrieval of at least four pieces of information, assuming the subject was 100% efficient. Next, the answer had to be computed from this information. Either the lineage of Amy and Joseph made them first cousins or it did not.
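The search-and-store procedure for the text condition can be sketched in a few lines. The family below is hypothetical (only the first three statements echo the example sentences above), each person is given a single listed parent for simplicity, and the function names are ours. The sketch counts how many items must be held in memory and how many statements must be read before the first-cousin rule can be applied.

```python
# Hypothetical list of "X is Y's parent" statements, in the order they are read.
PARENT_FACTS = [
    ("Jack", "Mary"),
    ("Jack", "Edward"),
    ("Mary", "Penny"),
    ("Mary", "Amy"),
    ("Edward", "Joseph"),
]


def lookup(child, facts):
    """Read statements in order until the child's parent is found.

    Returns (parent, number of statements read); parent is None if not listed.
    """
    for count, (parent, kid) in enumerate(facts, start=1):
        if kid == child:
            return parent, count
    return None, len(facts)


def is_first_cousin_from_text(x, y, facts):
    """Return (answer, items held in memory, statements read)."""
    memory, reads = [], 0
    for person in (x, y):                  # first two searches: the parents
        parent, n = lookup(person, facts)
        memory.append(parent)
        reads += n
    for parent in memory[:2]:              # two more searches: the grandparents
        grandparent, n = lookup(parent, facts)
        memory.append(grandparent)
        reads += n
    parent_x, parent_y, grand_x, grand_y = memory
    # Compute the answer: different parents who share a parent.
    answer = None not in memory and parent_x != parent_y and grand_x == grand_y
    return answer, len(memory), reads


print(is_first_cousin_from_text("Amy", "Joseph", PARENT_FACTS))  # -> (True, 4, 12)
```

Four items have to be stored before the answer can be computed, which corresponds to the minimum of four pieces of information noted above. The number of statements read depends on the order of the list.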
In the case of family trees, once the first person in the problem had been found, all that was necessary to do was to trace up and down the tree the required number of branches and read off the name at the end. Nothing had to be stored in memory, and no computations were required. This, of course, was only the case when kinship terms (cousin, sibling) and the conventions of family trees were known to subjects. When this was not the case, and subjects had to apply kinship rules explicitly, the advantage of the graphic was reduced. For example, in one experiment, some subjects worked with Chinese names and kinship terms defined for them in a rule. So the requirements of symbol manipulation to solve problems are removed when the conventions of the graphic representation are known. Interestingly, the most rapid responses were given by subjects, in the graphic condition, who were told no kinship rules at all. They simply used their knowledge that cousins are always on the same level of a family tree and did not examine parents at all.
This study, and Larkin and Simon's production system model that lay behind it, illustrate very well the symbol manipulation approach to theories of cognitive processing. In the case of both pulleys and families, subjects encode objects (pulleys, ropes, weights, people's names, and kinship) as symbols that they are required to store in memory and manipulate through comparisons, tracing relationships among them, and so on. When the symbols are represented as diagrams of pulley systems or family trees, relationships among them that are crucial to understanding the systems and answering questions about them are made explicit by their relative placement on the page and by drawings of the links among them: ropes between pairs of pulleys, lines between names in the family tree. This makes the search for conditions of production rules much simpler and does not draw on memory at all. Computation consists of reading off the answer once all the conditions have been met. If, in addition, the graphic representation uses conventions with which the reader is familiar, search and computation can be short-circuited completely, making the task trivial by comparison.
Many other examples of symbol manipulation through production systems exist. In the area of mathematics education, the interested reader will wish to look at projects reported by Resnick (1976) and Greeno (1980) in which instruction makes it easier for students to encode and manipulate mathematical concepts and relations. Applications of Anderson's (1983) ACT* production system in intelligent computer-based tutors to teach geometry, algebra, and LISP are also illustrative (Anderson & Reiser, 1985; Anderson, Boyle & Yost, 1985).
For the educational technologist, the question arises of how to make symbol manipulation easier so that problems may be solved more rapidly and accurately. Larkin and Simon and Winn, Li, and Schill show that one way to do this is to show conceptual relationships by layout and links in a graphic. A related body of research concerns the relations between illustrations and text. (See summaries in Willows & Houghton, 1987; Houghton & Willows, 1987; Mandl & Levin, 1989; Schnotz & Kulhavy, 1994.) Central to this research is the idea that pictures and words can work together to help students understand information more effectively and efficiently. There is now considerable evidence that people encode information in one of two memory systems, a verbal system and an imaginal system. This "dual coding" (Paivio, 1983; Clark & Paivio, 1991) or "conjoint retention" (Kulhavy, Lee & Caterino, 1985) has two major advantages. The first is redundancy. Information that is hard to recall from one source is still available in the other. Second is the uniqueness of each coding system. As Levin, Anglin, and Carney (1987) have ably demonstrated, different types of illustration are particularly good at performing unique functions. Realistic pictures are good for identification, cutaways and line drawings for showing the structure or operation of things. Text is more appropriate for discursive and more abstract presentations.
Specific guidelines for instructional design have been drawn from this research, many presented in the summaries mentioned in the previous paragraph. Other useful sources are chapters by Mayer and by Winn in Fleming and Levie's (1993) volume on message design. The theoretical basis for these principles is by and large the facilitation of symbol manipulation in the mind's eye that comes from certain types of presentation.
However, as we saw at the beginning of this chapter, the basic assumption that we think by manipulating symbols that represent objects and events in the real world has been called into question (Clancey, 1993). There are a number of grounds for this criticism. The most compelling is that we do not carry around in our heads representations that are accurate "maps" of the world. Schemata, mental models, symbol systems, search, and computation are all metaphors that give a superficial appearance of validity because they predict behavior. However, the essential processes that underlie the metaphors are more amenable to genetic and biological than to psychological analysis. We are, after all, living systems that have evolved like other living systems. And our minds are embodied in our brains, which are organs just like any other. We shall leave the implications of this line of argument to those writing other chapters in this handbook. For now, we shall turn to a relatively uncontroversial and well-rooted corollary, that people construct knowledge for themselves rather than receiving it from someone else.
5.4.3 Cognition as Knowledge Construction
One result of the mental manipulation of symbols is that new concepts can be created. Our combining and recombining of mentally represented phenomena leads to the creation of new schemata that may or may not correspond to things in the real world. When this activity is accompanied by constant interaction with the environment in order to verify new hypotheses about the world, we can say that we are accommodating our knowledge to new experiences in the "classic" interactions described by Neisser (1976) and Piaget (1968), mentioned earlier. When we construct new knowledge without direct reference to the outside world, then we are perhaps at our most creative, conjuring from memories thoughts and expressions of it that are entirely novel.
When we looked at schema theory, we described Neisser's (1976) "perceptual cycle," which describes how what we know directs how we seek information; how we seek information determines what information we get; and how the information we receive affects what we know. This description of knowledge acquisition provides a good account of how top-down processes, driven by knowledge we already have, interact with bottom-up processes, driven by information in the environment, to enable us to assimilate new knowledge and accommodate what we already know to make it compatible.
What arises from this description, which we did not make explicit earlier, is that the perceptual cycle, and thus the entire knowledge acquisition process, is centered on the person, not the environment. Some (Duffy & Jonassen, 1992; Cunningham, 1992a; and Chapters 7 and 23 in this handbook) extend this notion to mean that the schemata a person constructs do not correspond in any absolute or objective way to the environment. A person's understanding is therefore built from that person's adaptations to the environment entirely in terms of the experience and understanding that the person has already constructed. There is no process whereby representations of the world are directly "mapped" onto schemata. We do not carry representational images of the world in our mind's eye. Semiotic theory, which has recently made an appearance on the educational stage (Cunningham, 1992b; Driscoll, 1990; Driscoll & Lebow, 1992), goes one step further, claiming that we do not apprehend the world directly at all. Rather, we experience it through the signs we construct to represent it. Nonetheless, if students are given responsibility for constructing their own signs and knowledge of the world, semiotic theory can guide the development and implementation of learning activities, as Winn, Hoffman, and Osberg (1995) have demonstrated.
A thorough discussion of these ideas takes place in Chapters 7 and 23 and so will not be pursued here. What is of relevance in this discussion of cognitive processes, however, is the notion that people do construct understanding for themselves in ways that are often idiosyncratic and that often defy expression to someone else. We all "know the world" in ways that differ, sometimes quite sharply, from those of other people. This idiosyncrasy of knowledge has led some (Merrill, 1992) to react severely against instructional theories that aim at fostering construction of knowledge that varies among individuals, on the grounds that some knowledge and skills must be acquired and expressed in a uniform manner. Idiosyncratic understanding of brain surgery or how to fly a plane could lead to disaster! However, one can reasonably make the case that some knowledge can be, indeed is best, constructed by individuals for themselves without the imposition of a right answer or a correct set of actions to follow as a result.
The significance of knowledge construction for educational technology lies in its marking a shift away from didactic, content-specific instruction to building environments that make it easy for students to construct their understanding of knowledge domains. Zucchermaglio (1993) describes "filled" and "empty" technologies. The former are instructional systems, like CAI and intelligent tutors, that consist of shells plus content. For example, Anderson, Boyle, and Yost's (1985) algebra tutor consists of a variety of generic components, found in any intelligent tutorial, such as the capability of constructing a student model, of making inferences, and so on (see chapters in Polson & Richardson, 1988). In addition, it contains a knowledge base about algebra from which the other components draw. On the other hand, empty technologies are shells that provide teachers and students with the capability of interacting with content, exploring information, and creating output, but which do not contain a predetermined knowledge base. An example is the "Bubble Dialogue" project (McMahon & O'Neil, 1993), which consists of a HyperCard stack that permits students to construct dialogues. The program allows students to write both the overt speech and the covert thoughts of the characters whose roles they play. Yet what the students write about is not prescribed, and the tool has been used for many purposes ranging from teaching writing to developing understanding about social problems.
If cognition is understood to involve the construction of knowledge by students, it is therefore essential that they be given the freedom to do so. This means that, within Spiro et al.'s (1992) constraints of "advanced knowledge acquisition in ill-structured domains," instruction is less concerned with content, and sometimes only marginally so. Instead, educational technologists need to become more concerned with how students interact with the environments within which technology places them and with how objects and phenomena in those environments appear and behave. This requires educational technologists to read carefully in the area of human factors (for example, Ellis, 1993; Barfield & Furness, 1995) where a great deal of research exists on the cognitive consequences of human-machine interaction. It requires less emphasis on instructional design's traditional attention to task and content analysis. It requires alternative ways of thinking about (Winn, 1993b) and doing (Cunningham, 1992a) evaluation. In short, it is only through the cognitive activity that interaction with content engenders, not the content itself, that people can learn anything at all.
Information-processing models of cognition have had a great deal of influence on research and practice of educational technology. Instructional designers' day-to-day frames of reference for thinking about cognition, such as working memory and long-term memory, come directly from information-processing theory. The emphasis on rehearsal in many instructional strategies arises from the small capacity of working memory. Attempts to overcome this problem have led designers to develop all manner of strategies to induce chunking. Information-processing theories of cognition continue to serve our field well.
Research into cognitive processes involved in symbol manipulation has been influential in the development of intelligent tutoring systems (Wenger, 1987), as well as in information-processing accounts of learning and instruction. The result has been that the conceptual bases for some (though not all) instructional theory and instructional design models have embodied a production system approach to instruction and instructional design (see Landa, 1983; Scandura, 1983; Merrill, 1992). To the extent that symbol manipulation accounts of cognition are being challenged, these approaches to instruction and instructional design are also challenged by association.
Accounts of learning through the construction of knowledge by students have been generally well accepted since the mid-70s and have served as the basis for a number of the assumptions educational technologists have made about how to teach. Attempts to set instructional design firmly on cognitive foundations (DiVesta & Rieber, 1987; Bonner, 1988; Tennyson & Rasch, 1988) reflect this orientation. We examine these in the next section.