Similarity, Interactive Activation, and Mapping

Robert L. Goldstone

Indiana University

February, 1993



The question of "What makes things seem similar?" is important both for the pivotal role of similarity in theories of cognition and for an intrinsic interest in how people make comparisons. Similarity frequently involves more than listing the features of the things to be compared and comparing the lists for overlap. Often, the parts of one thing must be aligned, or placed in correspondence with the parts of the other thing. The quantitative model with the best overall fit to human data assumes an interactive activation process whereby correspondences between the parts of compared things mutually and concurrently influence each other. An essential aspect of this model is that matching and mismatching features influence similarity more if they belong to parts that are placed in correspondence. In turn, parts are placed in correspondence if they have many features in common and if they are consistent with other developing correspondences.

Similarity, Interactive Activation, and Mapping

Americans in England, and Brits in the States, often spontaneously remark on the similarity between cricket and baseball. The resemblance seems fairly mundane and commonplace; both sports involve batters, fielders, pitchers, and balls, among other commonalities. A tempting first pass at a theory of cricket-baseball similarity might be to list all the rules, equipment, and player positions for cricket, do the same for baseball, and keep a tally of how many of these aspects are shared by the two sports. The more aspects shared between the sports, the more we might expect trans-Atlantic travellers to acknowledge the sports' similarity. In fact, formalized instantiations of this theory have been proposed by psychologists to explain subjective assessments of similarity.

This paper will examine the most influential formulations of this "list features and compare for overlap" theory of similarity. These theories have difficulty in accounting for the similarity of structured scenes. It is argued that processes that go beyond comparing overlap or tallying matches are required to align structured displays. In our example, parts of the cricket match must be placed in correspondence with the baseball game's parts before their similarity can be assessed. Similarity is not increased much if features belong to noncorresponding parts. If an American in London observes, "Cricket is like baseball. An American baseball is white and cricket players all wear white," we might think a peculiar and unlikely comparison has been made; baseballs do not correspond to cricket players' shirts, and so their matching features are not very important. A quantitative model of similarity is developed in which object and feature correspondences are simultaneously and interactively determined.

The experiments and model described here are concerned with evaluating the similarity of scenes. A scene is any group of objects, elements, words, actions, or events. The parts of a scene are usually interrelated, and have internal structure. For example, a baseball has certain relations to pitchers and bats, and is composed of features such as white and round. A correspondence is a connection or association between parts from two scenes. Parts may correspond because of superficial similarities or because they play the same role in their respective scenes.

The motivation for studying similarity is three-fold. First, similarity plays a pivotal role in cognitive theories of problem solving, memory, categorization, and many other phenomena. The act of comparing events, objects, and scenes, and establishing similarities between them has considerable influence on the cognitive processes on which we depend. Second, similarity provides an elegant diagnostic tool for examining the structure of our mental entities, and the processes that operate on these entities. Given that psychologists have no microscope with direct access to people's representations of their knowledge, appraisals of similarity provide a powerful, if indirect, lens onto representation/process assemblies. Third, similarity occupies an important middle ground between perception and higher-level knowledge. Similarity is grounded by perceptual functions; a tone of 200 Hz and a tone of 202 Hz sound similar and little can be done to alter this perceived similarity. One of the major claims of this paper will be that similarity also shares many commonalities with the more "cognitive" process of analogical reasoning.

Geometric Models of Similarity

Geometric models of similarity have been among the most influential approaches to analyzing similarity (Carroll & Wish, 1974; Torgerson, 1965; Nosofsky, 1992). These approaches are exemplified by nonmetric multidimensional scaling (MDS) models (Shepard, 1962a, 1962b). MDS models represent similarity relations between entities in terms of a geometric model that consists of a set of points embedded in a dimensionally organized metric space. The input to MDS routines may be similarity judgments, dissimilarity judgments, confusion matrices, correlation coefficients, joint probabilities, or any other measure of pairwise proximity. The output from an MDS routine is a geometric model of the data, with each object of the data set represented as a point in an N-dimensional space. The distance between two objects' points in the space is taken to be inversely related to the objects' similarity.

In MDS, the distance between points i and j is typically computed by:

d(i,j) = [ Σ(k=1 to n) |Xik - Xjk|^r ]^(1/r)

where n is the number of dimensions, Xik is the value of dimension k for item i, and r is a parameter that allows different spatial metrics to be used (r=1 is a city-block metric, r=2 is a Euclidean metric). Similarity is assumed to be related to the interpoint distance by a monotonically decreasing function.
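For concreteness, the distance computation can be sketched in a few lines of Python; the function name and the example coordinates are illustrative inventions, not part of any MDS package.

```python
# A minimal sketch of the Minkowski distance used in MDS models.
# Each item is a vector of coordinates in the N-dimensional space;
# the parameter r selects the spatial metric.

def mds_distance(x, y, r=2.0):
    """Distance between items x and y (r=1: city-block, r=2: Euclidean)."""
    return sum(abs(xi - yi) ** r for xi, yi in zip(x, y)) ** (1.0 / r)

a, b = (0.0, 0.0), (3.0, 4.0)
print(mds_distance(a, b, r=2))  # Euclidean distance: 5.0
print(mds_distance(a, b, r=1))  # city-block distance: 7.0
```

Similarity would then be obtained by applying some monotonically decreasing function, such as exp(-d), to this interpoint distance.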

The Contrast Model

The other main approach to similarity in cognitive psychology defines similarity in terms of a feature-matching process based on the weighting of common and distinctive features. Amos Tversky's Contrast model (1977) is the best known feature set approach to similarity. In this model, entities are represented as a collection of features and similarity is computed by:

S(A,B) = θf(A∩B) - αf(A-B) - βf(B-A)

The similarity of A to B is expressed as a linear combination of the measures of the common and distinctive features. The term (A∩B) represents the features that items A and B have in common. (A-B) represents the features that A has but B does not. (B-A) represents the features that B, but not A, possesses. The terms θ, α, and β refer to weights for the common and distinctive components. The function f is often assumed to satisfy feature additivity, such that f(X∪Y)=f(X)+f(Y) when X and Y are disjoint. A feature may be any property, characteristic, or aspect of a stimulus; features may be concrete, or abstractions such as "symmetric" or "beautiful."
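As an illustration, the Contrast Model can be sketched in Python, taking f to be set cardinality (one simple choice that satisfies feature additivity); the feature sets below are invented for the cricket/baseball example.

```python
# A sketch of Tversky's Contrast Model over feature sets.
# theta, alpha, and beta weight the common and distinctive components;
# f is taken here to be set cardinality (an assumption for illustration).

def contrast_similarity(A, B, theta=1.0, alpha=0.5, beta=0.5):
    """S(A,B) = theta*f(A & B) - alpha*f(A - B) - beta*f(B - A)."""
    return (theta * len(A & B)    # common features
            - alpha * len(A - B)  # features of A only
            - beta * len(B - A))  # features of B only

baseball = {"batters", "fielders", "pitchers", "round ball", "innings"}
cricket  = {"batters", "fielders", "pitchers", "round ball", "overs"}
print(contrast_similarity(baseball, cricket))  # 4*1 - 0.5*1 - 0.5*1 = 3.0
```

Note that when alpha and beta differ, the measure is asymmetric: S(A,B) need not equal S(B,A).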

Similarity and Alignment

While traditional MDS models and the Contrast Model have primarily been analyzed for their differences (Gati & Tversky, 1982; Tversky, 1977; Tversky & Hutchinson, 1986), one similarity is that their representations of compared items are not structured hierarchically (entities are not embedded in one another) or propositionally (entities do not take other entities as arguments). Because of their relatively unstructured representations, the models, for the most part, ignore issues involving the alignment of dimensions and features. If structurally richer representations are used to represent compared scenes, then an issue virtually irrelevant in simpler representations arises: "What scene parts correspond to each other?" In the Contrast Model, features are aligned with one another on the basis of identity. Features that are identical are matched, and therefore become part of the (A∩B) term. Features of one scene that are not identical to any feature of the other scene will not be placed in correspondence at all, and will be relegated to the (A-B) or (B-A) term. MDS models align identical dimensions.

Neither model has the notion that correspondences should be consistent with one another. Dimension/feature alignments are not contingent upon each other. In general, the determination of a feature or dimension match does not depend on the other feature matches that are present between two scenes.

Contingencies between correspondences arise when propositionally or hierarchically structured scenes are compared. Propositional representations are characterized by relational predicates that take arguments. A two-place predicate such as Above would take two arguments. The meaning of a proposition often changes if the argument order changes. For example, Above (Triangle, Circle) does not represent the same fact as Above (Circle, Triangle). Propositionally, if two relations correspond to each other, there will be a tendency to place the arguments to the relations into correspondence. For example, if above(triangle, circle) and above(square, diamond) are compared, triangle and square will be placed in alignment because of the alignment between the two above relations. Likewise, inconsistent, many-to-one correspondences will be avoided. A many-to-one mapping occurs if two or more elements from one scene are placed into correspondence with a single element from the other scene. Many-to-one mappings are inconsistent because, in many domains, a coherent interpretation of the relation between two scenes cannot be formulated if a part from one scene has two competing correspondences (Marr & Poggio, 1979). Because propositional information is not represented in the MDS and Contrast models, arguments being placed in correspondence do not cause predicates to be placed in correspondence.

Hierarchical representations involve entities that are embedded in one another. Hierarchical representations are required to represent the fact that X is part of Y or that X is a kind of Y. Hierarchically, if two features are placed in correspondence, then there will be a tendency to place the objects that contain the features into correspondence (and vice versa). Because hierarchical information is not part of the simple representations used in the MDS and Contrast models, one feature-to-feature correspondence does not cause whole objects to be placed in correspondence or vice versa.

The importance of determining correspondences, and the influence of correspondences on each other, is clear from work in analogical reasoning. Gentner (1983) and Holyoak and Thagard (1989) argue that analogies are constructed by creating correspondences between elements that tend to be consistent with other emerging correspondences. For example, in the analogy between an atom and our solar system, the sun is placed into correspondence with an atom's nucleus because they occupy the same role in the relation rotates-around(object1, object2). In addition, the sun-to-nucleus correspondence is supported by other alignments, such as the electron-to-planet correspondence (Gentner, 1983). There is psychological evidence (Clement & Gentner, 1991; Gentner & Toupin, 1986) that people judge the goodness of an analogy by the coherence of the correspondences that are created.

The burden of the rest of this paper is to show that similarity assessments share deep similarities with analogical reasoning. Specifically, the similarity of scenes cannot be determined until the scenes' parts are placed in correspondence. As with analogy, parts will tend to be placed in correspondence to the extent that they are consistent with other correspondences. The importance of a correspondence for similarity depends on the degree to which the correspondence is globally consistent - consistent with the emerging pattern of other correspondences. As an example, consider Fig. 1A, which is roughly based on stimuli used by Gati and Tversky (1984) to test the influence of common and distinctive features on similarity. Is scene B or C more similar to A? There is good reason to choose C. While both B and C share the feature spotted with A, the spots do not belong to corresponding parts of A and B. For A and B, the spots are a match-out-of-place (MOP), where a MOP is defined to be a feature match between objects that are not aligned with one another. For A and C, the spots are a match-in-place (MIP), where a MIP is a feature match between objects that are aligned.


Insert Figure 1 about here


We might expect that C will be judged to be more similar to A than will B (MIPs increase similarity more than MOPs), and that B will be judged more similar to A than will D (MOPs increase similarity somewhat). According to the first prediction, a feature match is more influential to similarity if it belongs to corresponding objects. In order to know how much a feature correspondence increases similarity, it would be necessary to know about other object and feature correspondences. The first two experiments investigate these predictions and place general constraints on a theory of similarity that is applicable to structured scenes.
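The MIP/MOP distinction can be made concrete with a short sketch. The representation below (scenes as lists of objects, objects as dictionaries of dimension values, and an explicit object-to-object alignment) is a simplified illustration, not the model developed later, and the scene contents are invented stand-ins for the spotted-feature example.

```python
# A sketch of counting matches-in-place (MIPs) and matches-out-of-place
# (MOPs). Objects are dicts of dimension: value pairs (assumed to share
# the same dimensions); 'alignment' maps each object index in scene1 to
# the index of its corresponding object in scene2.

def count_mips_mops(scene1, scene2, alignment):
    mips = mops = 0
    for i, obj1 in enumerate(scene1):
        for j, obj2 in enumerate(scene2):
            matches = sum(obj1[d] == obj2[d] for d in obj1)
            if alignment[i] == j:
                mips += matches  # matches between aligned objects
            else:
                mops += matches  # matches between non-aligned objects
    return mips, mops

A = [{"shading": "spotted"}, {"shading": "striped"}]
C = [{"shading": "spotted"}, {"shading": "striped"}]    # spots in place
B = [{"shading": "checkered"}, {"shading": "spotted"}]  # spots out of place
print(count_mips_mops(A, C, {0: 0, 1: 1}))  # (2, 0): two MIPs
print(count_mips_mops(A, B, {0: 0, 1: 1}))  # (0, 1): the spots are a MOP
```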

Throughout the experiments, complex artificial scenes consisting of clearly delineated objects and features were used. The materials are justified on several grounds. First, as Figure 1 illustrates, previous researchers have tested scenes where object alignments that depend on other correspondences are required. It is quite difficult, in fact, to develop complex artificial materials in which object alignment is not an issue. Second, in many cases, the properties of the stimuli that make them artificial are the very same properties that make the stimuli powerful methodological tools. Prior knowledge and familiarity would potentially weaken claims about a general influence of alignment on similarity. The independence of scene components allows selective manipulation of parts of a scene while leaving the rest of the scene intact. For quantitative modelling, it is important to use scenes that have a relatively straightforward decomposition. Third, a wealth of data suggests that even for perceptual similarity, subjects are sensitive to high-level structure (e.g. Palmer, 1978). Many natural comparisons (e.g. landscapes, stories, human bodies, families, etc.) involve scenes that are highly structured.

Experiment 1

The first experiment was designed to answer a number of questions. First, does a shared feature between two objects that are aligned (a MIP) increase similarity judgments more than a shared feature between two objects that do not correspond to each other (a MOP)? Second, does a MOP count at all; that is, do scenes with a MOP get higher similarity ratings than scenes with no match at all? Third, what is the relationship between establishing correspondences and similarity? Specifically, is clarity of alignment/mapping necessarily proportional to the judged similarity of two scenes, or are there cases where it is unclear what scene elements map onto each other but the judged similarity is high? Fourth, are similarity ratings influenced by mappings? If subjects align objects that objectively have fewer features in common (they make the "wrong" mapping), are their similarity ratings relatively low?

Briefly, two scenes were shown side-by-side on a computer screen. Each scene was composed of two stylized butterflies. For each pair of scenes, subjects assigned a similarity rating and then indicated which butterflies mapped onto each other. The two scenes were systematically related such that the number of MIPs and MOPs were independently varied.


Subjects. Thirty-five undergraduate students from the University of Illinois at Urbana-Champaign served as subjects in order to fulfill a course requirement.

Materials. Subjects were shown 120 displays on Macintosh SE screens. Each display contained four butterflies - two on either side of a black vertical bar running down the middle of the screen. A scene consisted of two butterflies. Each butterfly was composed of four features: wing shading (twenty-two different values, including striped, spotted, checkerboard, black, brick, etc.), head style (triangle, square, circle, or M-shaped), tail style (radiating lines, zig-zag, cross lines, or line with ball), and body shading (the same range of values as wing shading). None of the wing shadings on any of the four butterflies had the same value as the body shading of any butterfly. The display area was 17 cm high by 21 cm across. Each individual butterfly was approximately 6 cm by 4 cm. Viewing distance was not controlled but was approximately 60 cm.

Design. On each trial, one scene composed of two butterflies was constructed (the "initial scene") and the other scene (the "changed scene") was constructed by selectively changing some of the features of the initial scene. On two thirds of the trials, one dimension of the initial butterfly was altered to create the changed scene; on the remaining third, two dimensions were altered.


Insert Table 1 about here


Dimensions were altered in one of six ways, as illustrated by Table 1. Each butterfly can be abstractly represented by letters, with different letters referring to different features. For example, the butterfly represented by "ABCD" has value A for its head (a "circle head" for example), value B for its tail, value C for its body, and value D for its wings. The butterfly "AXCY" has identical values to the first butterfly for its head and body, and different values for its tail and wings. The following factors were randomized: the order of the dimensions (for example, on one quarter of the trials the first value designates wings), the left/right order of the initial and changed scenes, and the actual value represented by the letters (on some trials "A" refers to a square head, while on other trials it refers to a circle head). When two dimensions were altered, the second dimension to be altered was randomly selected from the remaining three dimensions. The second dimension was randomly altered in any of the ways listed in Table 1, except the first method.


Insert Figure 2 about here


Fig. 2 is an example of the DH->HD method. The butterflies can abstractly be described as: Butterfly A=ABCD, B=EFGH, C=ABCH, and D=EFGD. Butterflies A and C are identical except along the body-shading dimension (the fourth dimension), with C having B's wavy-line shading. B and D are also identical except along the body-shading dimension, with D having A's value of stripes. Given that A corresponds to C and B corresponds to D on the basis of the number of dimension values they have in common, the body-shading dimension contributes two MOPs: both the striped and the wavy-line feature matches belong to butterflies that are not placed in correspondence.


Insert Figure 3 about here


Fig. 3 shows an example of a trial where two dimensions are changed. In the top display, the body shading dimension is changed via method DH->HY. Butterfly A corresponds to Butterfly C on the basis of featural overlap, but Butterflies A and D have identical body shadings. Neither butterfly B nor C share their body shadings with any other butterfly. The second dimension which is changed is the tail dimension, which is changed via DH->HD. A, while corresponding to C, has the same tail as D. B, while corresponding to D, has the same tail as C.

The actual physical location of the butterflies could very likely act as a cue for mapping the butterflies from one scene onto another. To control for this, three different spatial layouts were used. In the "same positions" layout, butterflies that correspond to each other (according to their feature overlap) are placed in the same relative locations in their respective scenes (Fig. 3 has this layout). In the "opposite positions" layout, butterflies that do not correspond to each other are placed in the same relative locations (Fig. 3 would have this layout if butterflies C and D swapped their positions). In the "unrelated positions" layout, neither butterfly of one scene has the same relative location as either of the butterflies of the other scene (Fig. 2 has this layout). Within each scene, the two butterflies were always placed diagonal to each other.

Procedure. Each trial began with the simultaneous display of the initial scene and the changed scene. The subjects' task was to rate the two scenes' similarity on a scale from one to nine. A rating of one indicated very low similarity; a rating of nine indicated very high similarity. It was stressed to the subjects that they were supposed to rate the similarity of the whole left scene to the whole right scene. After subjects gave their similarity rating, they were asked to place the butterflies from the left scene "into correspondence" with the butterflies on the right scene. If a subject believed that the top butterfly of the left scene corresponded to the top butterfly of the right scene, and the two bottom butterflies corresponded to each other, then they were instructed to press the "1" key. If they believed the top butterfly of each scene corresponded to the bottom butterfly of the other scene, then they were told to press the "2" key. They were given immediate visual feedback as to their mapping choice; black lines 2 cm thick were displayed on the screen, connecting the butterflies that subjects indicated should be placed in correspondence. After two seconds, the screen was erased and subjects proceeded to the next trial.

Fifteen practice trials were given, to familiarize subjects with the rating and mapping tasks. These were followed by 105 trials as described above. On each trial the method of transforming the initial scene into the changed scene, the particular dimensions changed, and the values along those dimensions were all subject to random selection.


The data of most importance are the subjects' similarity ratings and their proposed mappings. Both of these data are presented in Table 2, broken down by method of changing dimensions and by number of dimensions changed.

Mapping Judgments. For all trials, there was an "experimenter determined" or "correct" mapping and the subject's mapping. The correct mapping was defined as the consistent mapping that maximized the number of features that were shared between butterflies that were placed in correspondence. A mapping was called "true" if the experimenter-determined mapping agreed with the subject-determined mapping.
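This maximization rule can be stated as a small procedure. The sketch below (an illustration of the definition, not the software actually used) enumerates the consistent one-to-one mappings and returns the one sharing the most features between corresponding butterflies; butterflies are written as four-letter tuples in the notation of Table 1.

```python
from itertools import permutations

# A sketch of the "experimenter-determined" mapping: the consistent
# one-to-one mapping that maximizes the number of features shared by
# butterflies placed in correspondence.

def best_mapping(scene1, scene2):
    def score(mapping):
        return sum(f1 == f2
                   for i, j in enumerate(mapping)
                   for f1, f2 in zip(scene1[i], scene2[j]))
    return max(permutations(range(len(scene2))), key=score)

# DH->HD example: A=ABCD, B=EFGH vs C=ABCH, D=EFGD (letters as features)
left  = [("A", "B", "C", "D"), ("E", "F", "G", "H")]
right = [("A", "B", "C", "H"), ("E", "F", "G", "D")]
print(best_mapping(left, right))  # (0, 1): A maps to C, B maps to D
```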


Insert Table 2 about here


Overall, when only one feature was changed, 89% of subject mappings were true, compared to 78% true mappings when two features were changed. The mapping percentages can also be broken down by method of changing dimensions. A breakdown by method yields the following true mapping percentages: Method DH->DH = 89%, DH->DD = 84%, DH->DY = 86%, DH->XY = 84%, DH->HY = 77%, DH->HD = 66%. Methods DH->DH and DH->DY, and methods DH->DD, DH->DY, and DH->XY, were not significantly different. All other method pairs were significantly different (unpaired t tests with Scheffe's adjustment for post-hoc tests, df=68, p<.05). Therefore, both the number of MIPs and the number of MOPs have an effect on the percentage of true mappings. Methods DH->DH and DH->XY do not differ in number of MOPs, but DH->DH has two more MIPs than DH->XY. The modest but significant difference between these methods' true mapping percentages is attributable to the number of MIPs. By similar logic, DH->XY and DH->HD do not differ in number of MIPs, but DH->HD has two more MOPs. DH->HY, with one more MOP than DH->XY, is intermediate between DH->XY and DH->HD. Differences in the number of MOPs influence the true mapping percentage more than differences in the number of MIPs do.

Table 2 and Fig. 4 also show that the influence that one feature change has on mapping accuracy increases if other features are changed. If only one feature is changed, then all of the methods of changing that feature have an equal effect on subjects' mapping accuracy (Bonferroni-adjusted t test, df=68, p>.1 for each pair of methods). One likely explanation is that if only a single feature is changed, then mapping accuracy is close to a "ceiling." Butterfly layout has an influence on mapping accuracy, with the following true mapping percentages: opposite = 81%, same = 84%, unrelated = 85%. Butterflies placed in opposite locations yielded lower mapping accuracy than butterflies placed in the same (t=2.5, df=68, p<.05) or unrelated (t=2.86, df=68, p<.05) locations.


Insert Figure 4 about here


There is some evidence that the butterflies' dimensions are not equally salient. Mapping accuracies were recorded for trials in which the method was DH->HD and only one dimension was changed. If the head dimension was changed via DH->HD, then 82% of the trials had true mappings. Changing body shading, wing shading, and tail type yielded 88%, 87%, and 90% true mappings respectively. Thus, the head dimension seemed to influence mappings more than the other dimensions did. This salience difference should not have greatly affected the findings reported here because the particular dimension changed on any trial was randomized and completely independent of the method used.

Similarity Judgments. All analyses on similarity judgments, unless otherwise stated, use only true mapping trials. If a subject's mapping does not agree with the experimenter-determined mapping, then lower similarity ratings might be expected because the subject would be aligning butterflies in a less than optimal manner. This expectation is supported; the average similarity rating on true mapping trials is 6.47 whereas false mapping trials have an average similarity rating of 5.64 (two tailed paired t=8.396, df=68, p<.001).

The mean similarity ratings for the 6 methods of changing features are: DH->DH = 7.1, DH->DD = 6.5, DH->DY = 6.4, DH->XY = 5.5, DH->HY = 5.5, and DH->HD = 5.9. The only overall means that were not different were for methods DH->XY and DH->HY (overall F(5, 34) = 6.5, mse=.02, p<.001; for all other pairs, p<.05 by Fisher's PLSD adjustment for multiple post-hoc comparisons). These results demonstrate both influences due to MIPs and MOPs. For example, methods DH->DY and DH->XY have the same number of MOPs, varying only on their number of MIPs. Conversely, DH->XY and DH->HD vary only on their number of MOPs. In general, as the number of MOPs and MIPs increases, so does the similarity rating. Furthermore, MIPs increase similarity to a greater extent than MOPs. For example, methods DH->DH and DH->HD, or methods DH->DY and DH->HY, have the same number of matches in common with the initial scene, but the methods with the matches in place have significantly higher similarity ratings than the methods with matches out of place.


Insert Figure 5 about here


Butterfly layout has a significant influence on similarity ratings (F(2,34), mse=.04, p<.05), with same positions = 6.3, unrelated = 6.1, and opposite = 5.9. Each of these similarities is significantly different from each other (p<.05 by Fisher's PLSD). There is a marginally significant interaction between the method of changing the first feature and the method of changing the second feature on similarity ratings (F (25, 50) = 4.1, mse =.06, p <.1). The data for this interaction are shown in Fig. 5.


The results point to large influences due to the method of changing features, on both mapping and similarity measures. Subjects produce more correct mappings (mappings that maximize the number of matching features placed in correspondence) as the number of MOPs decreased and the number of MIPs increased. Changing one feature resulted in a larger decline in mapping accuracy when there was also another feature that was changed.

Similarity and mapping are differently affected by the method of changing features. While mapping performance decreases as a function of increasing MOPs, similarity ratings increase. MIPs have a greater impact on similarity ratings than do MOPs; the difference between displays with MIPs and displays with an equal number of MOPs is large relative to the difference between displays with any number of MOPs and displays with no MOPs. Conversely, MOPs have a greater impact on mapping accuracy than do MIPs.

In general, there is strong evidence that similarity is not simply evaluated on the basis of the number of simple features that two scenes share. In particular, the organization of the shared features into objects is critical. First, a matching feature that belongs to objects that do not correspond does not influence similarity as much as a match that belongs to corresponding objects. Second, on trials where subjects map objects onto each other in ways that do not maximize the number of matching features in place, the rated similarity is lower. On the other hand, the similarity computation is not equivalent to the mapping computation either. The mapping of objects from one scene to another is weakened by adding matching features out of place, but these features still increase similarity.

Object correspondences depend on more than the objects' relative spatial locations. MIPs increase similarity more than MOPs even when butterflies are placed in unrelated and opposite positions. In fact, in the opposite positions condition, MOPs would be MIPs (and vice versa) if correspondences were determined simply on the basis of spatial locations. Instead, object correspondences are determined by featural similarity. Objects that have many features in common are placed in correspondence, and this increases the weight of any feature common to both objects. This interpretation is the pillar of the "interactive activation" model to be developed later. What makes objects correspond to one another is the common features they possess. In turn, what makes a particular common feature salient is the extent to which it belongs to corresponding objects.
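The mutual dependence described above can be caricatured with a toy iteration. The sketch below is emphatically not the quantitative model developed later; it only illustrates the idea that correspondence strengths grow with the feature matches they license, while competing correspondences inhibit one another through normalization. All parameter values and the update rule are arbitrary choices for illustration.

```python
# A toy interactive-activation sketch (not the quantitative model
# developed later). Correspondence strength c[i][j] between object i of
# scene1 and object j of scene2 grows with their feature matches,
# weighted by the current strength; row normalization makes competing
# correspondences inhibit each other.

def interactive_alignment(scene1, scene2, steps=20, rate=0.5):
    n = len(scene1)
    c = [[1.0 / n] * n for _ in range(n)]  # start with uniform strengths
    for _ in range(steps):
        new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                matches = sum(a == b for a, b in zip(scene1[i], scene2[j]))
                new[i][j] = c[i][j] + rate * matches * c[i][j]
        for i in range(n):                 # normalize each row
            total = sum(new[i]) or 1.0
            c[i] = [v / total for v in new[i]]
    return c

left  = [("A", "B", "C", "D"), ("E", "F", "G", "H")]
right = [("A", "B", "C", "H"), ("E", "F", "G", "D")]
c = interactive_alignment(left, right)
# c[0][0] exceeds c[0][1]: the featurally similar butterflies win out
```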

One of the main conclusions from Experiment 1 is that similarity judgments are influenced by the structure of features in a display - matching features count the most when their objects correspond. One methodological objection to this conclusion is that subjects were required to create a mapping between the objects in the display. Perhaps the different influence of MIPs and MOPs on similarity only arose because subjects were explicitly required to make object-to-object mappings.

To address this objection, a replication of Experiment 1 was conducted, with the single change that object-to-object mappings were not required of subjects after they gave their similarity ratings. The results from this replication closely followed the results from Experiment 1. In particular, for the six methods of changing features, similarity ratings were: DH->DH = 6.8, DH->DD = 6.3, DH->DY = 6.3, DH->XY = 5.1, DH->HY = 5.1, and DH->HD = 5.6. The results replicate the findings that MIPs increase similarity more than MOPs, and that MOPs increase similarity somewhat.

The results from Experiment 1 and its replication point to a natural process whereby subjects perform object-to-object mappings in assessing similarity. The process is natural in the sense that even when object-to-object mappings are not required for the judgment task, they still exert an influence. The task instructions to "judge the similarity between the scene on the left and the scene on the right" do not mention an object-level organization at all. The replication of Experiment 1, while not permitting trial-by-trial analysis of subjects' mappings and their agreement with optimal mappings, does bypass the object-mapping task demands of Experiment 1. The General Discussion provides further discussion and interpretation of Experiment 1's results.

Experiment 2

One objection to Experiment 1 is that it involves comparing scenes that are not well integrated. Each scene was composed of two parts. The parts were complete and well structured objects on their own, but the scenes that were made up of the two objects may not be well structured or cohesive, potentially limiting the generality of Experiment 1.

A second objection is that the parts that composed the scenes were remarkably similar; both were butterflies with similar overall shapes. It might be argued that alignment processes are only invoked when there is confusion as to which parts correspond. While this objection does not explain how the correct correspondences are set up when scenes are composed of similar objects, it again points to a possible lack of generality in the current results.

To address these objections, a second experiment was conducted in which the scenes were composed of single objects (schematic birds) and the parts of the birds were clearly different and had delineated roles. Thus, the birds had wings, bodies, and heads, and these parts were connected into a cohesive bird structure. Mapping data were not collected as in Experiment 1 because pilot data showed that subjects virtually always performed the role-to-role mapping (e.g. the head of one bird is mapped onto the head of the other bird).


Subjects. Twenty-nine undergraduate students from the University of Michigan served as subjects in order to fulfill a course requirement.


Insert Figure 6 about here


Materials. Fig. 6 provides two examples of the displays that were used. Each display consisted of a bird on the left side of a Macintosh SE screen and a bird on the right side. Each bird consisted of four components: head, upper wing, lower wing, and tail. Each of these components contained one of 21 symbols. These symbols all were composed of straight lines configured in a distinctive geometric pattern. The two birds in a scene were the same, except for changes made to some of the components. The 15 methods of changing bird components are shown in Table 3. These particular methods were chosen so as to vary the number of matches in and out of place from 0 to 4, with all possible MIP/MOP combinations represented. In Table 3, each four-letter string represents a bird, with each letter position representing a particular role or part of the bird and each letter representing a particular geometric symbol. Thus, birds ABCD and BACD have four symbols in common, but no symbol occupies the same part in the two birds.


Insert Table 3 about here


The particular components that were grouped together, the particular symbols that were used, the physical left/right order of the birds, and the particular method of changing components were randomized for every trial, under the constraint that each method of changing components was used equally often during the experiment.

Two of the 15 methods of changing birds are shown in Fig. 6. In the top display, there are two MOPs and two MIPs between the birds, corresponding to Method 6 (ABCD->BACD). The hourglass and T-shapes belong to both birds, but they belong to different parts within the birds. The other two shapes are shared by both birds, and are also in the same positions in the birds. The bottom display instantiates Method 14 (ABCD->ABCZ). The birds were each 7.6 cm long. Schematic beaks and tails were added to the birds to increase their bird-like appearance. The two birds were separated by 5 cm.

Procedure. Each trial began with the display of two birds as described above. The subjects' task was to rate the similarity of the two birds on a scale from one to nine. A rating of one indicated very low similarity; a rating of nine indicated very high similarity. After a subject pressed a number on the keyboard, the screen was erased, and after two seconds the next trial appeared. Fifteen practice trials were given to familiarize subjects with the rating task, followed by 160 trials as described above.

Results and Discussion

The results are presented in Fig. 7, where each of the 15 methods of changing features is recoded in terms of the number of MIPs and MOPs that it yields. With this recoding, both MIPs (F (3, 25)=5.3, mse = .03, p<.01) and MOPs (F (3, 25)=3.4, mse = .05, p<.05) increase similarity, and MIPs increase similarity more than MOPs. For example, 3 MIPs/0 MOPs elicits a similarity rating of 6.9 while 0 MIPs/3 MOPs elicits a similarity rating of 3.6. Again, we find that the difference between 2 MOPs and 1 MOP is greater than the difference between 1 MOP and 0 MOPs (paired t=2.52, df=28, p<.05). The (2 MOPs - 1 MOP) difference is also greater than the (3 MOPs - 2 MOPs) difference (paired t=2.36, df=28, p<.05). This difference is mostly due to displays with 1 MIP. It is important to note that the 3 MOP/1 MIP condition corresponds to the DH->DD method from Experiment 1. That is, going from 2 MOPs to 3 MOPs clearly adds a MOP, but it is a MOP that creates a two-to-one mapping. For example, two birds have an hourglass shape in their heads, and one of the birds also has an hourglass shape in its body. The head-body hourglass match does not increase similarity significantly. Methods that correspond to the DH->DD method consistently show no effect of the additional MOP. Specifically, Method 3's similarity rating is no greater than Method 5's, Method 8's is no greater than Method 12's, and Method 10's is no greater than Method 14's.


Insert Figure 7 about here


On the whole, Experiment 2 replicates the findings of Experiment 1 using scenes that seem to form a cohesive unit. While it might be expected that there would be no effect due to MOPs when the parts of a scene have distinct roles, this does not appear to be the case. For example, even though the head of one bird clearly corresponds to the head of the other bird, a symbol match between a head and a wing still increases similarity.

The SIAM Model of Structural Similarity

The data from Experiments 1 and 2 provide a constraint on models of similarity that are applied to scenes with hierarchical or propositional structure. The model SIAM (Similarity as Interactive Activation and Mapping) is developed in order to organize and explain the reported data. SIAM shares architectural commonalities with McClelland and Rumelhart's (1981) interactive activation model of word perception, and is highly related to Falkenhainer, Gentner, and Forbus' SME system of analogical reasoning (1990), which is, in turn, based on Gentner's structure-mapping theory (Gentner, 1983, 1989). SIAM also bears strong conceptual and architectural resemblances to Holyoak and Thagard's (1989) ACME system for analogical reasoning. SIAM can be viewed as extending work in analogy to quantitatively model similarity assessments.

SIAM assumes that similarity and correspondences are determined by a process of interactive activation between feature, object, and role correspondences. The degree to which features from two scenes are placed in correspondence depends on how strongly their objects are placed in correspondence. Reciprocally, how strongly two objects are placed in correspondence depends on the correspondence strength of their features. A similar pattern of simultaneous mutual influence occurs between objects and roles.

SIAM's network architecture is composed of nodes that excite and inhibit each other. As in ACME, nodes represent hypotheses that two entities correspond to one another in two scenes. In SIAM, there are three types of nodes: feature-to-feature nodes, object-to-object nodes, and role-to-role nodes.

Feature-to-feature nodes each represent a hypothesis that two features correspond to each other. There will be one node for every pair of features that belong to the same dimension; if each scene has O objects with F features each, there will be O²F feature-to-feature nodes. As the activation of a feature-to-feature node increases, the two features referenced by the node are placed in stronger correspondence.

Object-to-object nodes represent the hypothesis that two objects correspond. There will be O² object-to-object nodes if there are O objects in each of two scenes. As the activation of an object-to-object node increases, the two objects are placed in stronger correspondence with each other.

Role-to-role nodes represent the hypothesis that two relational arguments correspond. As the activation of a role-to-role node increases, the two relational arguments being referenced are placed in stronger correspondence. Role-to-role nodes operate to place scene parts in correspondence that play the same role within a scene.

Rather than being learned, all of the connection strengths between nodes are based on consistency. As in ACME, Marr and Poggio's (1979) work on stereopsis, and McClelland and Rumelhart's original work, activation spreads in SIAM by two principles: 1) nodes that are consistent with one another send excitatory activation to each other and 2) nodes that are inconsistent inhibit one another. Fig. 8 illustrates the basic varieties of excitatory and inhibitory connections in SIAM that occur between features and objects. Each of the twenty slots in Fig. 8 stands for an object-to-object or feature-to-feature node in a comparison between two scenes, each scene with two objects, and each object containing four features. There are four ways in which the activation from one node can influence the activation of another node:

1. Feature-to-feature nodes inhibit and excite other feature-to-feature nodes. Feature correspondences that result in 2-to-1 mappings are inconsistent; correspondences that do not yield 2-to-1 mappings and deal with the same dimension are consistent. Inconsistent nodes inhibit each other; consistent nodes excite each other; nodes that refer to different dimensions have no influence on each other. The node that places the tail of Object A in correspondence with the tail of C (the tail slot that intersects Object A and Object C in Fig. 8) is inconsistent with the node that places the tail of C in correspondence with the tail of B because they would place two features from one scene into correspondence with a single feature of the other scene. The node that aligns the tails of Objects A and C is consistent with the node that aligns the tails of Objects B and D.

2. Object-to-object nodes inhibit and excite other object-to-object nodes. Analogous to feature-to-feature nodes, object correspondences that are inconsistent inhibit one another, and consistent object correspondences excite one another. The node that places B and D in correspondence inhibits the node that places A and D in correspondence (A and B cannot both map onto D) and excites the node that places A and C in correspondence.

3. Feature-to-feature nodes excite consistent object-to-object nodes, and vice versa. The node that places A in correspondence with C is excited by the node that places the wing of A into correspondence with the wing of C. The excitation is bidirectional; a node placing two features in correspondence will be excited by the node that places the objects composed of the features into correspondence.

4. Match values influence feature-to-feature nodes. Features are placed in correspondence to the extent that their feature values match. Match values range from 0 to 1, where 0 is used for dimension values that are maximally different, and 1 is used for identical dimension values (a similar coding scheme is used by Medin & Schaffer, 1978). If Objects A and C both have striped bodies, then the match value for the node that places these bodies into correspondence will be 1.0. This high match value will strengthen the node that aligns the bodies of A and C.
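The consistency rules in principles 1 and 2 can be sketched in code. The sketch below is an illustrative reconstruction, not the implementation used in the reported simulations; the tuple encoding of a node and the function name are assumptions.

```python
from itertools import product

# Illustrative sketch of SIAM's consistency rule for feature-to-feature
# nodes (principle 1). A node is encoded (assumption) as the tuple
# (left_object, right_object, dimension): the hypothesis that the two
# objects' features on that dimension correspond.
def connection(node_a, node_b):
    """Return 'excite', 'inhibit', or None for two distinct nodes."""
    (la, ra, da), (lb, rb, db) = node_a, node_b
    if da != db:                 # different dimensions: no influence
        return None
    if la == lb or ra == rb:     # shared endpoint -> a 2-to-1 mapping
        return "inhibit"
    return "excite"              # same dimension, disjoint endpoints

# All feature-to-feature nodes for two 2-object scenes and two dimensions.
nodes = list(product(["A", "B"], ["C", "D"], ["head", "tail"]))
```

Placing the tail of A in correspondence with the tail of C inhibits placing the tail of B with the tail of C, but excites placing the tail of B with the tail of D, matching the example above. The same endpoint-sharing rule applies at the object-to-object level.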


Insert Figure 8 about here


While object-to-object mappings are influenced by a scene's features, they are also influenced by relations between objects, and the roles of the objects in those relations. In the butterfly scenes, each butterfly occupies a certain role that can be thought of as an argument in a relational proposition. For example, if Butterfly A is above B in a scene, we might express this by above(A,B). Objects that play the same role in their scenes will tend to be placed in alignment, perhaps even if they share few or no features in common. Conversely, roles tend to be aligned if their objects are similar. The interactions between object correspondences and role correspondences are shown in Fig. 9 and include:

1. Role-to-role nodes inhibit and excite other role-to-role nodes. The node that places the first argument of a relation in the left scene into correspondence with the first argument of a relation in the right scene is inconsistent with and inhibits the node that places the second argument of the left relation in correspondence with the second argument of the right relation. Node consistency is determined as it was for objects and features.

2. Role-to-role nodes excite consistent object-to-object nodes, and vice versa. The left and right scenes contain the relations Above(A,B) and Above(C,D) respectively. The node that places A into correspondence with C is excited by the node that places the first argument of the left relation into correspondence with the first argument of the right relation. The activation also flows in reverse, from the A-corresponds-to-C node to the first-argument-to-first-argument node.

3. Match values excite role-to-role nodes. Roles are placed in correspondence to the extent that they are similar, and occur in similar relations/schemas. A match value of 1.0 is given to nodes that align identical roles in identical relations. As such, the node that aligns the first arguments of the above relations receives excitation from its match value.


Insert Figure 9 about here


Network activity begins with features being placed in correspondence according to their similarity. While the similarity of entire scenes is calculated by SIAM, the similarity of individual features is derived directly from perceptual properties. After features begin to be placed in correspondence, SIAM begins to place objects into correspondence that are consistent with the feature correspondences. Once objects begin to be put in correspondence, activation is fed back down to the feature (mis)matches that are consistent with the object alignments. In this way, object matches influence the activation of feature matches and feature matches influence the activation of object matches concurrently. Similarly, role correspondences influence object correspondences, and vice versa.

Processing in SIAM starts with a description of the scenes to be compared. Scenes are described in terms of relations that take objects as arguments, and objects that contain feature slots that are filled with particular feature values. For example, one butterfly scene might be represented by Above-and-left((object1 (head square) (tail zig-zag) (body-shading white) (wing-shading checkered)), (object2 (head triangle) (tail zig-zag) (body-shading striped) (wing-shading spotted))). The weights associated with the node-to-node links are all parametrically determined, not learned. Processing consists of activation passing. On each "slice" of time, activation spreads between nodes. Nodes send activation to each other for a specified number of time units. The network's pattern of activation determines both the similarity of the scenes and the alignment of the scenes' features, objects, and relation arguments. Nodes that have high activity will be weighted highly in the similarity assessment and their elements will tend to be placed in alignment.
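The input description can be rendered as ordinary nested data. The structure below is a hypothetical transcription of the butterfly-scene description above, not the representation used in the actual simulations; the key names mirror the slot names in the text.

```python
# Hypothetical transcription of SIAM's input description as nested
# Python data: a relation whose arguments are objects, each holding
# dimension/value pairs.
scene = {
    "relation": "above-and-left",
    "arguments": [
        {"head": "square", "tail": "zig-zag",
         "body-shading": "white", "wing-shading": "checkered"},
        {"head": "triangle", "tail": "zig-zag",
         "body-shading": "striped", "wing-shading": "spotted"},
    ],
}
```

From a description like this, the node-construction step would build one feature-to-feature node per same-dimension feature pair, one object-to-object node per object pair, and one role-to-role node per argument pair.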

Implementation details

All nodes in SIAM vary from 0 to 1 in their activation. To determine how active a node should be on the next time step, the nodes that excite or inhibit it send "advice" indicating what the node's activity should be. The node computes a weighted average of the advice, and adjusts its activation accordingly. Network activity begins with match values set for feature-to-feature nodes and role-to-role nodes, based upon the input descriptions. If two features or roles have identical values, a match value of 1 is assigned, otherwise a parameter-dependent value between 0 and 1 is assigned to the match value that is positively related to the elements' similarity. At the beginning of the first time cycle, all node activations are set to 0.5. A value of 0.5 signifies a state of maximal uncertainty as to whether the two elements are aligned. To the extent that the node that places A in correspondence with B has an activation value greater than 0.5, A and B are aligned with one another. To the extent that the node activation is less than 0.5, A and B are not aligned.

The activation of node i at time t+1, Ai,t+1, is determined by the advice sent from nodes with incoming connections to i via

Ai,t+1 = Ai,t(1-L)+MiL.

L is the overall rate of activation change, and Mi is the weighted average of the advice of the nodes having input connections to node i. If L=1, then Ai,t+1 is simply the weighted average advice as to what i's activation should be. If L<1, then node activations have inertia, and will tend to depart from old activation values less than the connecting nodes advise. Mi is calculated by

Mi = [ Σ(j=1 to n) Rji Wji Sj ] / [ Σ(j=1 to n) Wji Sj ].

Rji is node j's recommended activation value for node i, Wji is the weight of this advice, Sj is the salience of the dimension connected by node j, and n is the number of nodes that have input connections to node i. Thus, if all of the Rji values are large (close to 1.0), then Mi will be close to 1.0, entailing that the activation of node i will increase. The formula for Mi is essentially an average of the advice from every node that connects to i, weighted by the parametrically determined importance of the advice.
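The update rule can be sketched as follows. The weighted-average form of Mi is taken from its verbal description above, so this is an illustrative reconstruction rather than the published code.

```python
# Sketch of SIAM's activation update. `advice` holds one (R_ji, W_ji, S_j)
# triple per node j with an input connection to node i.
def update_activation(a_i, advice, L=0.3):
    """Return A_i at time t+1 given its current value and incoming advice."""
    den = sum(w * s for _, w, s in advice)
    # M_i: advice averaged with weights W_ji * S_j (assumed form).
    m_i = sum(r * w * s for r, w, s in advice) / den if den else a_i
    return a_i * (1 - L) + m_i * L
```

With L = 1 the node jumps directly to the weighted-average advice; with L < 1 it retains inertia, departing from its old activation less than the connecting nodes advise.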

For excitatory connections:

Rji = Ai + (1 - Ai)(Aj -.5) if Aj>0.5

Rji = Ai - Ai(.5 - Aj) if Aj<0.5

For inhibitory connections:

Rji = Ai + (1 - Ai)((1 - Aj)-.5) if (1- Aj)>0.5

Rji = Ai - Ai(.5-(1 - Aj)) if (1 - Aj)<0.5

In this implementation of the model, both excitatory and inhibitory connections can increase or decrease the activity of node i. The difference is that, as node j's activation increases, node i's activity decreases if the connection is inhibitory and increases if it is excitatory. The equations for Rji take into account both the activation of j and i. As j's value becomes more extreme (distant from .5), Rji becomes further removed from Ai. Rji is constrained to fall between 0 and 1 because Ai and Aj fall between 0 and 1. Thus, the recommended activation of a node is based on the original activation of the node (the Ai term), whether the recommendation is to excite or inhibit the node, how far the node's activation is from its minimum or maximum level [e.g. the (1 - Ai) term], and how strong the recommendation is [e.g. the (1 - Aj) term].
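The four advice equations collapse to a single pair of functions, since the inhibitory case is the excitatory case with Aj replaced by (1 - Aj). A minimal sketch:

```python
# Advice R_ji sent along an excitatory connection (the equations above).
def advice_excitatory(a_i, a_j):
    if a_j > 0.5:
        return a_i + (1 - a_i) * (a_j - 0.5)   # push A_i toward 1
    return a_i - a_i * (0.5 - a_j)             # push A_i toward 0

# Inhibitory advice: the identical form with A_j replaced by (1 - A_j).
def advice_inhibitory(a_i, a_j):
    return advice_excitatory(a_i, 1 - a_j)
```

Both recommendations stay within [0, 1], and a neutral sender (Aj = 0.5) recommends no change, matching the constraints noted above.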

The Wji terms are parameters that determine the relative influence of different types of network information on node activation (Table 4 shows the complete list of model parameters). For example, if node j is an object-to-object node consistent with the feature-to-feature node i, then Wji is equal to the parameter object-to-feature-wt and Wij is equal to feature-to-object-wt.


Insert Table 4 about here


Once the node activations have been adjusted, similarity is computed via

similarity = a [ Σi Ai Si Mi / Σi Ai Si ] + (1 - a) Σi Ai Si Mi,

where Si = Sia + Sib, Sia is the salience of scene a's dimension i, and Sib is the salience of the other scene's dimension i. This similarity formula has a non-normalized (Tversky, 1977) and a normalized component. In the current modelling, a is set to 1.0; similarity is thus based on a completely normalized measure. The normalized similarity will fall between 0 and 1. Similarity is computed as a function of the match values for each feature-to-feature node, weighted by the activation of the node. The more active a feature-to-feature node is, the more the particular matching or mismatching value shared by the features will influence similarity. If the features have the same value, then similarity will increase more if the feature-to-feature node's activation is high. Likewise, mismatching features decrease similarity more if they are placed in strong correspondence. Thus, similarity is a function of the featural similarities between the objects, with the importance of a featural similarity determined by its degree of alignment. Assuming that present features have greater salience than absent features (perhaps Six = 0 for all absent features) and that presentation order influences salience, asymmetrical similarity effects (Tversky, 1977) can be accommodated.
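With a set to 1, the similarity computation reduces to an activation- and salience-weighted average of match values. The algebraic form below is reconstructed from the verbal description and should be read as an assumption rather than the exact published formula.

```python
# Normalized scene similarity (a = 1): match values M_i averaged with
# weights A_i * S_i (node activation times dimension salience).
def scene_similarity(nodes):
    """nodes: iterable of (A_i, S_i, M_i) per feature-to-feature node."""
    nodes = list(nodes)
    den = sum(a * s for a, s, _ in nodes)
    return sum(a * s * m for a, s, m in nodes) / den if den else 0.0
```

A mismatch (M = 0) hurts similarity more when its node is highly active, which is the model's account of why mismatches between aligned objects matter most.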

Mapping accuracy in SIAM is modelled by comparing object-to-object node activations. The node activations associated with one mapping of objects from one scene to another are compared with alternate mappings' activations. The probability of performing a particular mapping of scene elements is

P(M) = [ Σ(i ∈ C(M)) Ai ] / [ Σ(i=1 to n) Ai ],

where {C(M)} is the set of object-to-object nodes consistent with mapping M, and n is the total number of object-to-object nodes. According to this formula, a mapping is likely to be made if the objects placed in correspondence by the mapping are strongly aligned (their object-to-object Ai value is high) relative to the other object alignments.
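The choice rule can be sketched directly. The ratio form (activation of mapping-consistent nodes relative to all object-to-object nodes) is reconstructed from the verbal description and is an assumption.

```python
# P(M): summed activation of object-to-object nodes consistent with
# mapping M, relative to the summed activation of all such nodes.
def mapping_probability(consistent, all_nodes):
    return sum(consistent) / sum(all_nodes)
```

For example, with the A-to-C and B-to-D nodes at 0.8 and the crossed A-to-D and B-to-C nodes at 0.2, the A-to-C/B-to-D mapping would be chosen with probability 0.8.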

An example of SIAM's processing

Fig. 10 illustrates the processing of SIAM at several different time steps. The example is taken from Fig. 3, a display with 8 MIPs and 3 MOPs. As with Fig. 8, each of the 20 slots represents a particular feature-to-feature or object-to-object correspondence. The values in parentheses show match values (1 = matching feature, 0 = mismatching feature). The other four values show the activation of correspondences after 1, 2, 3, and 10 time steps of activation passing. The default parameter settings in Table 4 are assumed.


Insert Figure 10 about here


On the first time step of activation passing, the strength of feature-to-feature correspondences depends on match values alone; all other sources of information recommend maintaining an activation of 0.5. Thus, features will begin to be aligned if they have identical appearances. After the second time step, feature correspondences influence each other. For example, the alignment between A's head and C's head is now stronger than the alignment between the bodies of A and D. Although both of these correspondences align identical features, the B-to-D head alignment strengthens the A-to-C head alignment, while the A-to-D body alignment does not have similar support. At the end of the second time step, object correspondences still have not influenced feature-to-feature alignments. For example, the strength of the A-to-C head alignment is equal to that of the A-to-D tail alignment, even though A will eventually come to be placed in correspondence with C and not D. Objects, however, are beginning to be placed in correspondence according to the advice of feature-to-feature nodes. For example, the activation of the node that places object B in correspondence with object C drops below 0.5 because these objects have only one matching feature.

By the end of ten time steps, object and feature correspondences show substantial influences due to each other. Although A has two feature matches in common with both C and D, the object-to-object correspondence between A and C is stronger than the correspondence between A and D. The A-to-C correspondence is supported by the B-to-D correspondence, and both of these correspondences inhibit the A-to-D correspondence. As a result, the feature correspondences between A and D are weaker than between A and C. Although the A-to-D tail correspondence receives local support from the B-to-C tail match, it does not receive much support from the A-to-D object correspondence, and consequently, it is weaker than the A-to-C head correspondence that receives both featural and object support. The feature matches between A and C, and between B and D, are MIPs, and gradually exert more of an influence on similarity relative to the MOPs between A and D (or B and C). By the same token, mismatching features between aligned objects also exert more of an influence on similarity with processing. The tail mismatch between A and C will decrease similarity more than the head mismatch between A and D, because similarity is a function of a mismatch value, weighted by its activation.

Alternative Models of Similarity

In order to evaluate the successes and failures of SIAM, two simple and intuitive contrasting models of similarity are developed. In the first model, no alignment of objects takes place when similarity is computed. The second model concedes that alignment of objects is required and that a distinction between MIPs and MOPs must be made, but holds that SIAM's process of interactive activation is not required.

Simple & Conjunctive Features Model (SCFM)

According to this model, all of the features of the two scenes are listed, and similarity is simply a monotonically increasing function of the number of shared features weighted by their salience. If simple and conjunctive features are permitted, the advantage of MIPs over MOPs in increasing similarity is predicted. Simple features are the primitive components of a scene's objects. Conjunctive features are aggregations of simple features. A scene composed of the objects ABCD and EFGH would be described as possessing the following features:

simple features: {A,B,C,D,E,F,G,H}

2-way conjunctive features: {AB,AC,AD,BC,BD,CD,EF,EG,EH,FG,FH,GH}

3-way conjunctive features: {ABC,ABD,ACD,BCD, EFG,EFH,EGH,FGH}

4-way conjunctive features:{ABCD,EFGH}

Thus, the scene would be more similar to than . Although both of these scenes have the same number (8) of simple features in common with the original scene, the first has 30 simple and conjunctive features in common with the original scene, as compared to the second scene's 16 matching features.
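The enumeration above can be reproduced mechanically. The sketch below assumes a scene is written as one string of feature letters per object, with conjunctions formed only within an object, as in the listing above.

```python
from itertools import combinations

# SCFM feature listing: every within-object combination of 1 to 4
# simple features counts as a (simple or conjunctive) feature.
def scfm_features(scene):
    """scene: list of strings, one per object (e.g. ["ABCD", "EFGH"])."""
    feats = {k: set() for k in range(1, 5)}
    for obj in scene:
        for k in feats:
            feats[k] |= {frozenset(c) for c in combinations(obj, k)}
    return feats

# Reproduces the listing above: 8 simple, 12 two-way, 8 three-way,
# and 2 four-way features for the objects ABCD and EFGH.
counts = {k: len(v) for k, v in scfm_features(["ABCD", "EFGH"]).items()}
```

Intersecting the feature sets of two scenes then gives the shared-feature tallies that SCFM's regression terms count.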

In testing SCFM, a linear regression model is used with four predictor terms: simple features, 2-way conjunctive features, 3-way conjunctive features, and 4-way conjunctive features. A special case of this model occurs if the simple feature term is the only term that is given a non-zero weight. This is equivalent to a model in which MIPs and MOPs both increase similarity, and no distinction is made between MIPs and MOPs. The general model is:

similarity = a (number-of-shared-simple-features) + b (number-of-shared-2-way-conjunctive features) + c (number-of-shared-3-way-conjunctive-features) + d (number-of-shared-4-way-conjunctive-features)

Historical justification for this model comes from work in configural cue learning. For example, in category learning research, it has been argued that people are sensitive to configurations of cues in addition to simple cues (Gluck & Bower, 1988; Hayes-Roth & Hayes-Roth, 1977).

Weighted MIPs and MOPs Model (WMMM)

In WMMM, a distinction is made between MIPs and MOPs. As such, WMMM accepts the basic claim that alignment is a necessary part of similarity assessment. WMMM takes as input the number of MIPs and MOPs that are shared between two scenes. WMMM performs a linear regression of three variables on similarity assessments obtained from subjects:

similarity = a (number of MIPs) + b (number of MOPs) + c (number of MIPs * number of MOPs)
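The regression equation can be stated as a one-line predictor. The weight values below are illustrative placeholders, not fitted estimates from the reported data; in the actual modelling, a, b, and c are estimated by regressing subjects' ratings on the MIP and MOP counts.

```python
# WMMM's predicted similarity for a display, given regression weights.
# Weight values are illustrative placeholders (a > b reflects the
# finding that MIPs increase similarity more than MOPs).
def wmmm_similarity(n_mips, n_mops, a=1.0, b=0.25, c=0.0):
    return a * n_mips + b * n_mops + c * n_mips * n_mops
```

With these placeholder weights, a 3-MIP/0-MOP display is predicted to be rated well above a 0-MIP/3-MOP display, in the direction of the data from Experiments 1 and 2.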

Final Modelling Considerations

The version of SIAM that was tested has 2 free parameters: number-of-cycles and object-to-feature-wt. (These parameters were chosen because of their conceptual importance, but other pairs of parameters gave equivalent performance.) This compares to 4 parameters for SCFM and 3 parameters for WMMM. Separate parameters are not required for distinctive features in SCFM and WMMM. Although distinctive features do decrease similarity (Gati & Tversky, 1984), in the data that will be modelled the total number of features in scenes is held constant. As such, there is a perfect correlation of r = -1 between the number of shared and distinctive features between two scenes. Adding distinctive feature terms to WMMM or SCFM would not improve fits to the current experiments' data because distinctive features are informationally redundant with the shared feature terms. Of course, the alternative models would have to be augmented if they were applied to a broader range of stimuli.

One ramification of this analysis is that a special case of WMMM is equivalent to one simplified version of Tversky's Contrast Model. If a distinction is made between MIPs and MOPs within the Contrast Model, then

Sim(A,B) = af(d(AMIPs ∩ BMIPs) + e(AMOPs ∩ BMOPs)) - bf(A - B) - cf(B - A)

where AMIPs ∩ BMIPs refers to the MIPs shared between A and B, and AMOPs ∩ BMOPs refers to the shared MOPs. If no distinction is made between shared MIPs and MOPs, the Contrast model would not accommodate the basic result that MIPs increase similarity more than MOPs. We can simplify the Contrast model by introducing new constants, combining the (A - B) and (B - A) terms because the order of scenes is always randomized, and eliminating the multicollinear distinctive features term, yielding:

Sim(A,B) = df(AMIPs ∩ BMIPs) + ef(AMOPs ∩ BMOPs)
This reformulation assumes feature additivity. The equation is equivalent to WMMM with no MIPs*MOPs multiplicative term. This isomorphism only applies to the version of the Contrast model in which feature measures within a component are additively combined, (B - A) and (A - B) are equally weighted, only simple features are encoded, and features have equal salience.

The fit of a model is determined relative to the other two models. For each pair of models, a stepwise linear regression determines if one of the models significantly increases the proportion of the empirical data's variance that is accounted for when it is included in the other model. Model B accounts for trends in the data that cannot be accounted for by Model A if the linear regression that contains terms from both models accounts for a significantly greater proportion of the data's variance than does Model A alone. An F test with numerator degrees of freedom equal to the number of linear regression parameters is used to determine significance (p < .05).
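The comparison just described amounts to the standard F test for a block of added regression terms. A minimal sketch follows; the variable names are mine, and the degrees-of-freedom convention shown is the textbook added-terms form, which may differ in detail from the exact test used.

```python
# F statistic for a nested-model comparison: does the full model
# (reduced model plus the other model's terms) leave significantly
# less residual variance than the reduced model alone?
def nested_f(rss_reduced, rss_full, df_extra, n_obs, n_full_params):
    numerator = (rss_reduced - rss_full) / df_extra
    denominator = rss_full / (n_obs - n_full_params)
    return numerator / denominator
```

A large F indicates that adding the second model's terms captures trends the first model misses, which is the criterion applied below.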

Model Fits

Experiment 1

In Experiment 1, the primary manipulation was the method of changing butterfly features from one scene to another. Two dimensions of one of the scenes were altered by any one of six methods to yield the second scene. Because some of the resulting 36 combinations of "methods of changing dimensions" were functionally identical, 21 different displays were produced by applying combinations of change methods.


Insert Table 5 about here


As shown in Table 5, SIAM improves the model fits of both WMMM and SCFM, and neither of these alternative models improves SIAM's fit. WMMM improves the fit of SCFM, but SCFM does not improve WMMM's fit. As such, the quality of model fits ascends from SCFM to WMMM to SIAM.

SCFM originally seemed to be a promising model because of its implicit distinction between MIPs and MOPs. An indirect outcome of SCFM's processing is that MIPs increase similarity more than MOPs. Adding a MIP to two scenes increases similarity more than adding a MOP does, because adding a MIP adds a larger number of shared conjunctive features. Why then do SIAM and WMMM do better than SCFM? By examining the largest discrepancies between the data and SCFM, one reasonable hypothesis emerges. The data/model residual with the largest absolute magnitude (-.612) comes from the display {left scene = , right scene = }. While this display has only 4 MIPs (A, B, E, F) and 2 MOPs (G, H), it possesses 6 simple features (A, B, C, D, G, H) and 3 two-way conjunctive features (AB, GH, EF). Because of the relatively large number of two-way conjunctive features, SCFM gives the scene a similarity rating higher than that given by subjects. The second largest discrepancy from SCFM (-.561) comes from the display {,}, again because SCFM predicts too high a similarity rating.

Both of these displays are characterized by concentrated MOPs - MOPs that are concentrated in a single pair of objects. The objects EFGH and EFCD share two features, but these objects are not placed in optimal correspondence because of the superior match between the two lower objects. Even though EFGH and EFCD have several features in common, WMMM and SIAM treat their common features as MOPs and not as MIPs. Out of the 21 displays, 6 have more than one MOP concentrated in a single pair of objects. The consistently negative residuals of these items (one-tailed t test with H0: μ = 0, t(5) = -2.30, p < .05) reveal the systematic tendency for SCFM to overestimate the similarity of displays with several feature matches between noncorresponding objects.

SIAM predicts subjects' data from the first experiment with reliably greater accuracy than WMMM. Therefore, SIAM's success at modelling the subjects' data is not simply due to the fact that it weights both MIPs and MOPs, and weights MIPs more. The reasons for SIAM's superior fit over WMMM are presented in the general discussion.

In summary, the results suggest that the two models that take into account object alignment in determining the weight of a feature match in similarity have a predictive advantage over the model that encodes simple and conjunctive features. While SCFM can often mimic the differential influence of MIPs and MOPs, when object alignment and object similarity are teased apart, SCFM incorrectly bases scene similarity on object similarity.

Experiment 2

Experiment 2 required subjects to rate the similarity of stylized birds with internal symbols. The similarity ratings for the 15 different scenes can be modelled by WMMM and SIAM. SCFM is no longer applicable. In the first experiment, scenes were composed of multiple objects and each object contained multiple features. However, with the bird stimuli, each scene component (the four bird parts) has only one simple feature. Consequently, conjunctive features are not formed within a scene component, and SCFM is no different from the special case of WMMM that weights MIPs and MOPs equally.

The analogy between Experiment 1 and Experiment 2 must be made with some care for purposes of modelling in SIAM. It might be thought that a single butterfly corresponds to a single bird. Just as a butterfly is composed of features, a bird is composed of parts. By this analysis, the bird scenes would have one object with four dimensions. This analogy, while superficially appealing, will not be used. Instead, it is proposed that a bird scene corresponds to a butterfly scene. That is, the single bird corresponds to two butterflies. While there are two components to the butterfly scenes, the bird scene has four components - one for each bird part. While the butterflies are only loosely connected by relations such as above-and-to-the-left(X,Y), the bird components are tightly related in what will be called a "bird schema" represented by Bird-schema(body part, head part, upper wing part, lower wing part). By this analysis, butterflies play the same role as bird components, not whole birds. The tempting correspondence between birds and butterflies is spurious because whole birds are never placed in correspondence whereas the birds' parts are. Thus, a bird scene's representation might be: Bird-schema ((body (symbol inverted-A)) (head (symbol number-sign)) (upper-wing (symbol plus-sign)) (lower-wing (symbol sideways-L))). The scene is composed of four parts, and each of the parts has only one dimension/value pair.
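The representational claim can be made concrete with a small sketch. The encodings below are schematic assumptions for illustration, not the model's actual data structures, and the particular dimension values are invented.

```python
# A butterfly scene: two loosely related objects, each carrying four
# dimension/value pairs.
butterfly_scene = [
    {"head": "round",   "tail": "zig-zag", "body": "striped", "wings": "shaded"},
    {"head": "pointed", "tail": "curved",  "body": "black",   "wings": "clear"},
]

# A bird scene: four tightly related parts bound into one schema, each
# part carrying a single dimension/value pair (its symbol).
bird_scene = {
    "body":       {"symbol": "inverted-A"},
    "head":       {"symbol": "number-sign"},
    "upper-wing": {"symbol": "plus-sign"},
    "lower-wing": {"symbol": "sideways-L"},
}
```

On this analysis the units that get placed in correspondence are the butterflies in one case and the bird parts in the other; whole birds never enter into correspondences, which is why each bird part, like each butterfly, is treated as a scene component.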

Because of the psychologically salient relations between scene components in the bird-like forms, role-to-role connections are included in SIAM. This addition could add four more parameters to SIAM; however, only role-to-object-wt is selected as a free parameter; the other role-related parameters are set to their default values. The greater the value of role-to-object-wt, the more the role of a part influences its correspondences.

As shown in Table 5, SIAM's estimate significantly improves WMMM's fit, but the terms from WMMM do not improve SIAM's fit. The account of SIAM's superior fits will be reserved for the general discussion.

Summary of Modelling

In general, SIAM provides a better account of the data from the experiments than did either WMMM or SCFM. The model/data correlations for all three models are surprisingly good. Even SCFM often achieves correlations greater than 0.95. Reasons for these excellent fits are: 1) each data point represents many pieces of raw data, pooling over many dimension and dimension-value instantiations, 2) the manipulations are strong - the generally positive relation between common features and similarity is indisputable, and 3) only plausible similarity models are tested - models that do not explicitly or implicitly account for the difference between MIPs and MOPs are not considered. It might be suggested, "If all the models do so well, perhaps SCFM or WMMM should be preferred on the basis of their parsimony." SIAM has a similar number of parameters to SCFM and WMMM, but its processing is a good deal more elaborate than the other models'. However, even though all three models do well, systematic discrepancies between subject-obtained similarity assessments and SCFM and WMMM exist. In the general discussion, these systematic under- and over-estimations will be enumerated and SIAM's account of these phenomena will be given. The first two experiments were in large part exploratory, designed to place general constraints on any theory of structural similarity. Given the modelling success of SIAM, the model seems to merit closer scrutiny. Three experiments were designed to test specific predictions of SIAM.

Experiment 3

One of the distinguishing features of SIAM, relative to WMMM, is its sensitivity to the distribution, not just the sheer number, of MIPs and MOPs. WMMM predicts that all displays with X MIPs and Y MOPs are equally similar, if features are equally salient. SIAM is sensitive to the distribution of feature matches across dimensions and objects. Under typical parameter values, SIAM predicts that scenes with MIPs that are concentrated in a single pair of objects or a limited number of dimensions will be more similar to each other than scenes with more distributed MIPs. The reason for this prediction is that concentrated feature matches increase each other's salience more than distributed matches do. Positive feedback of node activation occurs between nodes that are directly or indirectly consistent with one another, and has a greater magnitude for directly consistent correspondences.

Consider feature matches that are concentrated in a single pair of objects (Fig. 11, middle display). The objects with many feature matches will be placed in strong correspondence. Once this occurs, the four feature matches that the objects have in common will all receive strong activation from the object-to-object correspondence. There will be no mismatching features that receive as large an amount of fed-back activation. If feature matches are distributed between two pairs of objects (Fig. 11, top display), then no pair of objects will be placed in as strong correspondence, and both matching and mismatching features will receive equal activation from object-to-object correspondences.

Similarly, feature matches that are concentrated on a single dimension (Fig. 11, bottom display) will excite each other if they are consistent, because of the influence of feature-to-feature nodes on each other. A node that places two features from dimension X into correspondence will send excitation to other nodes that consistently place other dimension X features in correspondence. If matching features are distributed across several dimensions, then excitation of mutually consistent dimensional correspondences does not occur. This experiment attempts to confirm these predictions, with particular displays devised to vary the concentration of matching features in objects and dimensions.
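The object-concentration advantage can be caricatured with a one-pass computation that is far simpler than SIAM's iterative dynamics: weight each feature comparison by its object pair's proportion of matches, as a stand-in for the settled object-to-object activation. The rule and numbers are illustrative assumptions, not SIAM itself.

```python
def toy_similarity(pairs):
    """One-pass caricature of SIAM's object-to-feature feedback.

    `pairs` is a list of object pairs; each pair is a list of feature
    match values (1 = MIP, 0 = mismatch).  An object pair's activation is
    its proportion of matches, and every feature comparison in that pair
    is weighted by this activation."""
    weighted_matches = weights = 0.0
    for features in pairs:
        activation = sum(features) / len(features)  # object-to-object node
        for m in features:
            weighted_matches += activation * m
            weights += activation
    return weighted_matches / weights

concentrated = toy_similarity([[1, 1, 1, 1], [0, 0, 0, 0]])  # 4 MIPs in one pair
distributed  = toy_similarity([[1, 1, 0, 0], [1, 1, 0, 0]])  # 4 MIPs spread out
```

With the same raw count of 4 MIPs, the concentrated display comes out more similar (1.0 vs. 0.5 in this toy): the fully matching pair's high activation amplifies its own matches, while in the distributed display matching and mismatching features receive equal weight.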


Subjects. Thirty-five undergraduate students from the University of Illinois served as subjects in order to fulfill a course requirement.


Insert Figure 11 about here


Materials. The butterfly materials from previous experiments were used. In particular, scenes were composed of two butterflies, each with four dimension values. An abstract description of the 11 displays is shown in Table 6. Displays either had distributed matching features, or matching features that were concentrated in a pair of objects or in a dimension. Displays had 2, 4, or 6 MIPs in total, and never had any MOPs. Fig. 11 shows three of the displays (top = display 4, middle = 8, bottom = 6). For distributed displays, the relevant matching features were found neither on the same objects nor on the same dimensions. For dimensionally concentrated scenes, the relevant MIPs belonged to the same dimension (body shading, wing shading, tail type, or head type). For scenes with MIPs concentrated in objects, two corresponding objects contained all of the relevant MIPs.


Insert Table 6 about here


As in Experiments 1 and 2, the particular order of dimensions and dimension values was randomized on every trial. Eighteen filler items were interspersed among the 11 critical displays.

Procedure. Experiment 3 used the same procedure as did Experiment 1. Subjects first rated the similarity of scenes on a scale from 1 to 9 (1 = not very similar at all, 9 = highly similar). After the similarity estimate was typed into the computer, subjects indicated which butterflies corresponded to each other.


The similarity ratings shown in Table 6 include only those trials in which the subject performs the "true" mapping, the mapping that maximizes the number of MIPs relative to MOPs. The percentage of true mapping trials is also given. Collapsing over the number of MIPs in a display, displays with distributed MIPs have an average similarity of 4.89, object-concentrated displays have an average rating of 5.37, and dimension-concentrated displays have an average rating of 5.13. Each of these pairwise differences is significant (Bonferroni family-wise adjustment, paired t(34) > 2.3, p < .05).

Mapping accuracies also varied as a function of MIP distribution. The average mapping accuracies for distributed, object-concentrated, and dimension-concentrated displays are 81.3, 77.0, and 75.9% respectively. The mapping accuracy for the distributed displays is significantly greater than for the concentrated displays (paired t(34) > 2.4, p < .05).


SIAM's prediction of relatively low similarity ratings for distributed, as opposed to concentrated, displays was supported by the experiment. Displays with MIPs concentrated in a single pair of objects were rated as most similar, followed by displays with MIPs concentrated in a small number of dimensions, followed by displays with fully distributed MIPs.

Because the total number of MIPs and MOPs was kept constant, the differences due to feature distribution cannot be accounted for by WMMM. SCFM predicts the advantage of object-concentrated features. If MIPs are concentrated in two objects, then there will be more shared conjunctive features than if the MIPs are distributed across objects. SCFM does not predict the advantage of dimension-concentrated features, because conjunctive features only combine features within an object, not between objects.

Previous research (Evans & Smith, 1988) has shown that, for adults, dimensional identity is a special case of similarity, and is strongly weighted in similarity computations. This account could explain the relatively high similarity for many of the concentrated-features displays (Displays 2, 6, 8, 10, 11). For these displays, either identical objects or dimensions with identical values are present. However, the valuing of identity cannot explain the greater similarity for Displays 3, 5, and 7 relative to their distributed counterparts. For example, in Display 3, placing the 2 MIPs in the same objects increases similarity even though identical objects are not produced.

There are ways to explain the results of Experiment 3 without positing SIAM's processing. One of the simplest ways is to claim: "The influence that a particular matching feature has on similarity increases as a function of the number/weight of matching features of the same type." (Goldstone, Medin, & Gentner, 1991). Thus, if two corresponding objects have zig-zag tails, then this MIP will increase similarity more if there are other MIPs along the tail dimension. Types would have to be proposed for dimensions and objects. Importantly, it is not sufficient to claim that "the influence of a matching feature on similarity increases as a function of the number/weight of other matching features." This claim has been made by many researchers (Hintzman, 1986; Medin & Schaffer, 1978; Nosofsky, 1986) and is consistent with results from Experiments 1 and 2. This claim, however, does not predict any differences in display similarity when MIPs are concentrated rather than distributed. An account of Experiment 3's results must predict that the influence of a feature match depends not just on the quantity of other feature matches, but on their particular type as well. In SIAM, a feature-to-feature node receives activation directly from other feature-to-feature nodes for the same dimension and consistent object-to-object nodes, and only indirectly from different-dimension feature-to-feature nodes. SIAM's non-homogeneous network structure provides a method for equally salient MIPs to differentially affect a particular MIP's influence on similarity.

Experiment 4

In SIAM, similarity is computed by a weighted function of (mis)matching features between scenes. In determining similarity, a (mis)matching feature is weighted by the activation of its feature-to-feature node. Feature-to-feature node activation has been interpreted in terms of attention. If a feature-to-feature node's activation is high, then a large amount of attention is paid to the features' degree of match. Increasing attention is paid to both matching and mismatching features between two objects as the objects become strongly aligned.

This attentional mechanism is in contrast to a second plausible mechanism - that feature matches, but not mismatches, become more important as the features' objects are placed in correspondence. One may hypothesize that if two objects are placed in correspondence, then their features seem more similar to each other, regardless of whether the features are physically similar or not. This view will be called the "object-biased response account" because it argues that people bias their feature perception by object alignments. If objects correspond to each other, then people will be more likely to think that all of their features match. If objects do not correspond to each other, people will be more likely to think that all of their features mismatch.

SIAM is uncommitted as to whether object-biased responses occur. SIAM is committed to the hypothesis that greater attention is given to aligned features (features that belong to objects that are placed in strong correspondence) than unaligned features. In signal detection theory (Swets, Tanner, & Birdsall, 1961), response bias (b) is theoretically separable from response sensitivity (d'). The object-biased response account makes a claim about response bias: people are more likely to say that two features have the same dimension value if they belong to aligned objects than if they belong to unaligned objects, regardless of whether the features do have the same value. SIAM makes a claim about sensitivity: people are more sensitive/discriminating in their perception of whether two features have the same dimension value if they belong to aligned objects than if they belong to unaligned objects. Both claims may be correct.

To test effects of type of display on d' and b, subjects were presented scenes composed of two butterflies that were displayed on the screen for a short period of time. Subjects first gave a similarity rating for the two scenes. Then, two pointers appeared on the screen, pointing to the previous locations of two butterflies. Subjects responded as to whether the butterflies referred to by the pointers had matching values on a particular dimension (head, tail, body, or wings). Using Fig. 2 as an example, the following four types of questions were asked:

Aligned Matches: Do A and C have the same WING SHADING? (The correct answer is "Yes.")

Aligned Mismatches: Do A and C have the same BODY SHADING? (The correct answer is "No.")

Unaligned Matches: Do A and D have the same BODY SHADING? (The correct answer is "Yes.")

Unaligned Mismatches: Do A and D have the same WING SHADING? (The correct answer is "No.")

A sensitivity (d') difference between aligned and unaligned features is implied if the first two questions are answered with greater overall accuracy than the last two questions. This pattern, predicted by SIAM, would reveal that people are more sensitive at making feature comparisons if the features belong to strongly aligned objects. A bias difference between aligned and unaligned features is implied if, overall, the "Aligned Match" and "Unaligned Mismatch" questions are answered more accurately than the other two questions.


Subjects. Forty-five undergraduate students from the University of Michigan served as subjects in order to fulfill a course requirement.

Materials. The basic butterfly materials from previous experiments were used. An abstract description of the 8 displays is shown in Table 7. For example, Fig. 2 is an instantiation of Method 2. The first four methods have relatively clear butterfly-to-butterfly mappings, containing a minimum of 6 MIPs. The second four methods are identical to the first four methods (1=5, 2=6, 3=7, 4=8) except that mappings are less clear due to the 2 MIPs that are eliminated on the third dimension. The four methods within a specific level of mapping clarity correspond to methods from Experiment 1: DH->DY, DH->HY, DH->XY, and DH->HD.


Insert Table 7 about here


The subject's task was to say whether a cued butterfly from the initial scene had the same dimension value as a cued butterfly from the changed scene. The underlined dimension values in Table 7 are cued. For each of the eight displays, two dimension values are cued. If the initial scene is and the changed scene is then two mismatching features that belong to non-corresponding butterflies are cued (A ≠ D). Thus, this display presents subjects with an unaligned mismatch, and the proper response would be to say that the features are different. The instantiation of dimensions, position of butterflies, dimension values, and left/right placement of the two scenes were randomized in the same manner as in Experiment 1.

Procedure. At the start of a trial, subjects were presented a display composed of two scenes, each containing two butterflies. A vertical bar divided the scenes. The scenes stayed on the screen for either a short (5 sec) or long (12.5 sec) time. The two display durations were randomly selected with equal probability on each trial. Subsequently, the screen was erased, and subjects were prompted to give a similarity rating of the scenes on a scale from 1 to 9 (1 = not very similar at all, 9 = highly similar). After the similarity estimate was typed into the computer, two small (1 cm) cuing circles appeared on the screen, in the same position as two of the previously shown butterflies. One butterfly from each scene was cued. The dimension that corresponded to the fourth column of Table 7 was probed. At the bottom of the screen, a sentence appeared: "Did the butterflies marked by the dots have the same _____?" The blank was filled in with "type of head," "type of tail," "body shading," or "wing shading." Subjects pressed "Y" if they believed that the marked butterflies had the same value on the probed dimension, and pressed "N" if they did not. Subjects received no feedback on the correctness of their answer. After a three second pause, the next trial began. Each subject received 112 trials, seven repetitions of the 16 questions in Table 7. The experiment required approximately 30 minutes to complete.


The similarity ratings and percentage of correct feature comparisons are shown for each type of display and each display duration in Table 7. Combining both durations and all displays, the following percentages of correct feature comparisons are found: aligned match (correct answer is "same") = 73.4%, aligned mismatch ("different") = 73.3%, unaligned match ("same") = 53.1%, and unaligned mismatch ("different") = 81.4%. For each subject, b values are computed for comparisons between aligned and unaligned object features. "Hits" are defined as correct "same" comparisons; "false alarms" are defined as "same" responses given to displays with non-identical cued features. b values are computed by taking the log of the ratio of the hit rate's normal ordinate to the false alarm rate's normal ordinate. b values are significantly smaller for aligned (average b = 0.0) than unaligned (b = 0.3928) object features (paired t(44) = 3.3, p < .05). Thus, subjects are more likely to respond "the two features are the same" if the features' objects are aligned rather than unaligned, controlling for whether the features do in fact match.

Signal detection analysis can also be applied to test SIAM's sensitivity claim: is sensitivity/discriminability (d') higher for displays that compare aligned, rather than unaligned, objects' features? For each subject, d' is computed as Z(hit) - Z(false alarm) for aligned and unaligned feature displays. Overall, d' = 1.24 for aligned displays, and d' = 0.97 for unaligned displays. This significant difference (paired t(44) = 2.27, p < .05) confirms SIAM's prediction that sensitivity is greater for features that belong to aligned, rather than unaligned, objects.
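The pooled percentages reported above are enough to reconstruct d' and b approximately, using the standard signal detection formulas and Python's standard library. Because the paper's statistics are averaged over subjects rather than pooled, these values only approximate the reported ones.

```python
from math import log
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def d_prime(hit_rate, fa_rate):
    # Sensitivity: distance between the hit and false-alarm criteria.
    return N.inv_cdf(hit_rate) - N.inv_cdf(fa_rate)

def log_beta(hit_rate, fa_rate):
    # Bias: log of the ratio of the normal ordinates (pdf heights) at
    # the hit and false-alarm criteria, as in the text's definition of b.
    return log(N.pdf(N.inv_cdf(hit_rate)) / N.pdf(N.inv_cdf(fa_rate)))

# Pooled accuracies from the text: "hit" = correct "same" on a match,
# "false alarm" = "same" response on a mismatch.
aligned_hit,   aligned_fa   = 0.734, 1 - 0.733
unaligned_hit, unaligned_fa = 0.531, 1 - 0.814

d_aligned   = d_prime(aligned_hit, aligned_fa)       # ≈ 1.25
d_unaligned = d_prime(unaligned_hit, unaligned_fa)   # ≈ 0.97
b_aligned   = log_beta(aligned_hit, aligned_fa)      # ≈ 0.0
b_unaligned = log_beta(unaligned_hit, unaligned_fa)  # ≈ 0.40
```

The pooled reconstruction lands close to the reported subject-averaged values (d' of 1.24 and 0.97; b of 0.0 and 0.3928), which is a useful sanity check on the analysis.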

As might be expected, sensitivity is greater for displays that are shown for a long (d' = 1.41), rather than short (d' = 0.87) duration (paired t(44) = 3.2, p < .05). More relevant to SIAM, sensitivity is greater for displays with clear object-to-object alignments (d' = 1.4) than for displays with less clear alignments (d' = 0.81, t(44) = 3.3, p < .05). In addition, there is a marginally significant interaction between clarity of alignment and alignment (F(1, 44) = 2.94, mse = .04, p < .1), such that larger d' differences between aligned and unaligned displays are found for clearly mapped displays than for less clearly mapped displays.

The similarity data support previous empirical claims. MIPs count more than MOPs (Display 1's rating is greater than 2's; 5>6), and 2 MOPs increase similarity over 0 and 1 MOPs (Display 4>3, 8>7), but 1 MOP does not reliably increase similarity over 0 MOPs (Display 2=3, 6=7). In addition, similarity ratings increase with longer viewing durations. The average similarity ratings for short and long duration displays were 5.99 and 6.35 respectively (paired t(7) = 3.84, p<.05).


Subjects judged whether two features were identical or not. Both their sensitivity and bias in this task are influenced by the pattern of correspondences of the features' objects. Subjects are more accurate/sensitive at the task if the features belong to objects that are aligned. It is not simply that people assume that all features belonging to corresponding objects match. SIAM predicts feature discrimination sensitivity to be greater for aligned than unaligned objects because feature-to-feature nodes that are consistent with the aligned objects receive substantial activation from object-to-object nodes. The activation of a feature-to-feature node is taken to be a measure of the attention paid to the feature match or mismatch.

The response bias indicates that if butterflies correspond, then subjects are more likely to respond "yes, the features match" than if the butterflies are not aligned. Subjects tend to assume that butterflies that do not have many common features do not agree on any particular probed feature either, whereas similar butterflies are more likely to agree on any particular feature.

While the present interpretation of SIAM only requires sensitivity changes due to object alignment, on a broad level, both sensitivity and bias effects suggest that the perception/interpretation of a particular feature (mis)match depends on object correspondences. To know how sensitive and biased subjects are in comparing features it is necessary to know whether the objects correspond to each other or not.

The influence of object alignment on response bias may seem antithetical to SIAM's assumptions. One might argue, "according to SIAM, feature-to-feature node activity represents attention to a feature (mis)match. If greater attention results in greater accuracy, then we would expect greater accuracy in the aligned mismatch condition than the unaligned mismatch condition. Precisely the opposite effect is found." This argument assumes one possible relation between accuracy and attention, but not the only one. One plausible and simple model for feature comparisons is:

P("Same") = match-value*feature-to-feature-activation

That is, the probability of responding that two features are the same is equal to the product of their similarity and the attention given to placing them in alignment. As opposed to the critic's implicit model, the equation takes into account not only the attention paid to the feature match (the feature-to-feature activation), but also the particular match value being attended. With this model of feature comparison, SIAM can accommodate influences of object alignment on sensitivity and response bias.
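A small numerical sketch shows how this single equation can yield both the bias and the sensitivity effects. All parameter values here are hypothetical: match values of 1.0 and 0.3 for matching and mismatching features, and feature-to-feature activations of 0.75 for aligned and 0.5 for unaligned objects.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse standard normal CDF

def p_same(match_value, activation):
    # The text's feature-comparison model: P("same") is the product of
    # the match value and the feature-to-feature node's activation.
    return match_value * activation

MATCH, MISMATCH = 1.0, 0.3      # hypothetical match values
ALIGNED, UNALIGNED = 0.75, 0.5  # hypothetical activations

hit_a, fa_a = p_same(MATCH, ALIGNED),   p_same(MISMATCH, ALIGNED)
hit_u, fa_u = p_same(MATCH, UNALIGNED), p_same(MISMATCH, UNALIGNED)

# Accuracy pattern mirrors the data: unaligned matches are hardest,
# unaligned mismatches easiest.
accuracies = {"aligned match": hit_a, "aligned mismatch": 1 - fa_a,
              "unaligned match": hit_u, "unaligned mismatch": 1 - fa_u}

d_aligned   = z(hit_a) - z(fa_a)  # sensitivity is higher when aligned
d_unaligned = z(hit_u) - z(fa_u)
```

With these assumed values, "same" responses are rarest for unaligned pairs (the bias effect) and d' is larger for aligned pairs (the sensitivity effect), reproducing the qualitative pattern of Experiment 4 from one multiplicative rule.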

SIAM predicts that sensitivity is higher for feature matches and mismatches that are placed in aligned objects than unaligned objects. If two objects are placed in strong correspondence, then all of the matching and mismatching features of those objects are made more important for similarity assessments. A relevant criticism (Massaro, 1989) has been leveled against McClelland and Elman's (1986) interactive activation model of speech perception. The original version of the interactive activation model predicted non-constant sensitivities for different speech contexts [though newer versions need not (McClelland, 1991)]. No such sensitivity changes are empirically obtained in the spoken word stimuli that have been tested (Massaro, 1989). However, in the case of our butterfly scenes, we do in fact find sensitivity differences for feature matches depending on the alignments of objects. For our domain, the fact that an interactive activation model predicts context-dependent sensitivity changes is a point in favor of the model.

General Discussion

Empirical data and computational modelling argue for the inclusion of structural alignment in a theory of similarity. The model with the best overall fit to the initial experiments assumed an interactive activation process whereby correspondences between scene parts mutually and concurrently influence each other. Specific predictions of the interactive activation model of similarity were tested and empirically supported.

Empirical Evidence in Favor of SIAM

A. SIAM predicts the differential effect of MIPs and MOPs on similarity. The experiments have been unanimous in showing that MOPs increase similarity, and that MIPs increase similarity more than MOPs. MIPs are common features that belong to corresponding scene parts. MOPs are common features that belong to scene parts that do not correspond to one another. Accommodating this basic result was the prerequisite for all of the tested models.

B. SIAM predicts 2 MOPs > 1 MOP = 0 MOPs. Two MOPs increase similarity significantly over 1 MOP, but in many cases 1 MOP does not significantly increase similarity over 0 MOPs. This pattern specifically holds when the 2 MOPs are arranged as dictated by method "DH->HD." In other words, when the dimension values of two butterflies are swapped to create the changed scene, the matches associated with those values have a relatively strong influence on similarity.

The finding is naturally handled by SIAM's posited interaction between (in)consistent feature-to-feature nodes. Feature-to-feature nodes that consistently place feature values from the same dimension in correspondence send direct activation to one another. Two MOPs, if created by swapping feature values, will support one another, although they are inconsistent with the globally optimal object-to-object correspondences.

The phenomenon that (2 MOPs - 1 MOP) > (1 MOP - 0 MOPs) cannot be explained by a nonlinear relation between featural overlap and judged similarity. For example, it has been claimed (e.g. Nosofsky, 1986; Shepard, 1987) that the function relating psychological distance to similarity is exponentially decreasing. Adding a common feature to two scenes increases their similarity more when the scenes already have many common features than when they do not have many commonalities. This explanation, however, does not account for the finding that (2 MOPs - 1 MOP) > (1 MOP - 0 MOPs) holds at different levels of overall similarity. If we only analyze data from Experiment 1 where one dimension is changed between displayed scenes, we find similarity ratings of: 0 MOPs = 6.0, 1 MOP = 6.1, and 2 MOPs = 6.4. If we change two dimensions, then the influence of one of the changed dimensions is: 0 MOPs = 4.7, 1 MOP = 4.7, 2 MOPs = 5.1. Even though these latter ratings are all lower than the former ratings, and are therefore in the less sensitive region of an exponentially decreasing similarity curve, we still find that the difference between 2 MOPs and 1 MOP is significant, whereas the difference between 1 MOP and 0 MOPs is not. The difference between 1 and 2 MOPs when two dimensions are changed is larger than the difference between 0 and 1 MOP when one dimension is changed, even though the similarity ratings for the latter are higher.

C. SIAM predicts the influence of feature distribution on similarity ratings. MIPs that are concentrated in object pairs or in dimensions are subjectively more similar than MIPs that are distributed across object pairs and/or dimensions. The dimension-concentrated advantage stems from the influence of feature-to-feature nodes on (in)consistent feature-to-feature nodes. Feature-to-feature nodes for a particular dimension are inconsistent if they create a many-to-one mapping between feature values, otherwise they are consistent. Consistent feature-to-feature nodes mutually excite each other, thus increasing their influence on similarity. The object-concentrated advantage accrues from feature-to-object and object-to-feature connections. Objects with many concentrated feature matches will be placed in strong correspondence, and will feed activation back down to the individual feature matches.

An example of the influence of feature distribution on subjects' similarity ratings is evident in Fig. 5. Compare the case where one dimension is changed via "DH->HD" and the other dimension is changed via "DH->XY" to the case where two dimensions are changed via "DH->HY." The first display can be represented by {, } where two scenes each have two butterflies with four dimensions. The second display can be represented by {, }. For the purposes of WMMM, the model that measures similarity by a weighted combination of MIPs and MOPs, these two displays are equivalent; they both possess 4 MIPs and 2 MOPs.

However, for SIAM, as for subjects, the first display receives a higher similarity rating. In SIAM, the two MOPs support one another in the first display because they fall on the same dimension and will therefore send activation to one another, as well as inhibition to the D->H mismatches. In the second display, each of the two MOPs will send activation to nodes that place mismatching features in correspondence. For example, the node that places the two C features in correspondence will excite the node that places G and X in correspondence. Similarity is higher for the first display because there will be greater activation of nodes that place matching features in alignment.

D. SIAM predicts the nonsignificant effect due to a MOP that competes against a MIP. In Experiment 2, a symbol match between bird parts with different roles generally increased the birds' similarity, but not if one of the parts also had a matching symbol with a part with the same role. Abstractly, "AX" is more like "YA" than "YZ," but "AX" is not more like "AA" than "AZ." SIAM predicts this effect because a MOP that competes against a symbol correspondence that has role-to-role support will be strongly inhibited. For example, in the display (ABCD, ABDD) obtained by method 10 (Table 3), the part-to-part correspondence of the last parts in each scene will be large - both the symbols and the roles serve to align the parts. This strong part-to-part correspondence will inhibit inconsistent part-to-part correspondences, including the one that maps the fourth part of the first scene onto the third part of the second scene. Because 2-to-1 mappings are inconsistent, the stronger mapping will inhibit the weaker, lessening the weaker's influence on similarity. The weaker mapping will also inhibit the stronger, but to a lesser extent.

WMMM does not predict this effect. In fact, WMMM's three greatest overestimates from Experiment 2 occur on Methods 3, 8, and 10 (see Table 3), precisely the three stimuli that have a 2-to-1 mapping. While WMMM overestimates the similarity of birds with 2-to-1 mappings, SIAM correctly discounts the influence of a MOP if one of the matching symbols also yields a MIP.

E. SIAM predicts the distinctiveness of identity. In keeping with models that assume an accelerated relation between featural overlap (or multidimensional proximity) and subjective similarity, we find that scenes with identical objects are rated as far more similar than scenes with nearly identical objects. While SIAM does not explicitly treat identical scenes specially, the processing of SIAM yields a large jump in similarity from "almost the same" to "identical" for two reasons. First, consistent correspondences mutually support one another. If all feature-to-feature and role-to-role correspondences are consistent, then every correct correspondence will quickly be found, being facilitated from all sides. Given that the optimal SIAM/data fit is obtained with a finite number of cycles, speed in developing correspondences translates to greater attention paid to MIPs. Quick convergence on the optimal mappings is achieved because of mutual support between consistent mappings. Second, because of SIAM's activation function, a single "nay-sayer" greatly lowers similarity. If a node's activation is high (0.9 for example), and three inputs advise the node to increase to 1.0, and one input advises the node to decrease to 0.0, the node's activation will drop to 0.75 (if L=1), not rise.
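The disproportionate force of a single "nay-sayer" falls out of a simple averaging update rule. The sketch below is an assumption, not the exact SIAM activation function: it moves a node's activation by the rate-scaled average of the changes its inputs advise, which reproduces the numerical example above (three inputs advising 1.0 are outweighed by one advising 0.0).

```python
def update_activation(a, advised_targets, L=1.0):
    """One activation-adjustment step (illustrative sketch, not SIAM's
    exact function).  Each input 'advises' the node to move toward a
    target activation: 1.0 for consistent evidence, 0.0 for
    inconsistent evidence.  The node moves by the rate-scaled average
    of the advised changes, so one strongly dissenting input can
    outweigh several mildly agreeing ones."""
    mean_change = sum(t - a for t in advised_targets) / len(advised_targets)
    return a + L * mean_change

# Activation is already high (0.9).  Three inputs advise 1.0, one
# advises 0.0: the activation drops toward 0.75 rather than rising.
print(update_activation(0.9, [1.0, 1.0, 1.0, 0.0]))  # approx. 0.75
```

The averaging form captures why the agreeing inputs can each contribute at most a small upward change (1.0 - 0.9 = 0.1) while the dissenter contributes a large downward one (0.0 - 0.9 = -0.9).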

F. SIAM predicts the ordinal pattern of mapping results. Unlike SCFM and WMMM, SIAM models mapping accuracy as well as similarity assessments. Mapping accuracy is given by the ratio of correct object-to-object activations to all object-to-object activations. SIAM correctly predicts the rank order of true mapping percentages for each of the six methods of changing butterflies in Experiment 1. SIAM predicts that MOPs generally decrease mapping accuracy, and that with the particular stimuli used in the first experiment, MOPs decrease mapping accuracy more than MIPs increase mapping accuracy. If a pair of scenes has many MIPs and no MOPs, then adding one more MIP will not increase mapping accuracy much - a MIP can only advise the correct object-to-object nodes to rise to the activation level that many other feature-to-feature nodes are already suggesting. Because of the activation function's emphasis on "dissenting" inputs, a single MOP will greatly decrease the activation of the object-to-object nodes associated with optimally aligned objects.
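The mapping-accuracy measure itself is easy to state. In this sketch the dictionary encoding and the pair labels are hypothetical; the ratio is the one defined above, correct object-to-object activation over total object-to-object activation:

```python
def mapping_accuracy(object_activations, correct_pairs):
    """Mapping accuracy as defined in the text: summed activation of
    the correct object-to-object nodes divided by summed activation
    of all object-to-object nodes.

    object_activations: dict mapping (scene1_obj, scene2_obj) pairs
        to node activations (hypothetical encoding).
    correct_pairs: the object pairings that are truly aligned.
    """
    total = sum(object_activations.values())
    correct = sum(act for pair, act in object_activations.items()
                  if pair in correct_pairs)
    return correct / total

# Hypothetical activations for a two-butterfly display: the correct
# pairings (A-C, B-D) are more active than the crossed pairings.
acts = {("A", "C"): 0.8, ("B", "D"): 0.7, ("A", "D"): 0.2, ("B", "C"): 0.3}
print(mapping_accuracy(acts, {("A", "C"), ("B", "D")}))  # approx. 0.75
```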

In SIAM, butterfly positions are modelled by including roles in the description of scenes. Two relations were introduced, Above-and-left(upper-left argument, lower-right argument) and Below-and-left(lower-left argument, upper-right argument). SIAM successfully models the qualitative relations between similarity and mapping judgments and butterfly position. Scenes with corresponding butterflies in corresponding roles receive the highest similarity because all sources of information are consistent with one another. In Fig. 3, the strong role-to-role correspondence between the upper-left arguments sends activation to the object-to-object node that places the circle-headed butterflies together. The circle-headed butterflies are also placed in correspondence by the cumulative action of the object-to-object and feature-to-feature nodes.

If butterflies are positioned in unrelated locations, similarity and mapping accuracy decrease. The correct object-to-object correspondences receive less excitation from the role-to-role nodes. Similarity suffers because the decrease in object-to-object activation of the correctly aligned butterflies is passed along to their feature-to-feature nodes.

Finally, if corresponding butterflies are given opposite positions, similarity and mapping accuracy decrease still further. Now, the role-to-role correspondences compete against the feature-to-feature correspondences in determining object-to-object correspondences. While the 81% true mapping rate suggests that feature-to-feature correspondences eventually achieve superiority, the presence of a contradictory information source retards the development of the correct mapping. Because the true object-to-object correspondences are not highly activated, the feature-to-feature correspondences belonging to aligned objects are weakened. Under all tested parameter settings, SIAM predicts that mapping accuracy decreases going from same to unrelated to opposite object positions. Under virtually all parameter settings, including the best fitting parameters, SIAM predicts that same>unrelated>opposite in terms of similarity ratings.

Although SIAM assumes a strong conceptual link between object alignment and similarity, it also correctly predicts the dissociation between mapping accuracy and similarity. MOPs generally increase similarity but decrease mapping accuracy. SIAM accounts for this dissociation because mapping accuracy and similarity depend on separate layers in SIAM's architecture (object and feature nodes, respectively).

G. SIAM predicts the influence of alignment on sensitivity of feature match detection. In Experiment 4, subjects are more sensitive at discriminating between matching and mismatching pairs of features if the features occur in aligned, rather than unaligned, objects. Furthermore, the influence of object alignment on (mis)match detection sensitivity is greater if the object alignment is clear than if it is more ambiguous. Both of these effects are predicted by SIAM, assuming that the attention paid to the (mis)match of a particular pair of features is positively related to the activation of the pair's feature-to-feature node. SIAM can be supplemented with a simple model of feature match detection that allows it to predict the influence of object alignment on both sensitivity and response bias.

H. SIAM predicts greater comparison difficulty for scenes with many objects/few features than for scenes with many features/few objects. This result is predicted by SIAM but not SCFM, assuming that ease of comparison is inversely related to the number of nodes that must be created. Scenes with many objects are difficult for SIAM because the number of feature-to-feature and object-to-object nodes increases as a quadratic function of the number of objects per scene.
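The node-count argument can be made concrete. This sketch assumes, per the architecture described above, one object-to-object node per cross-scene object pair and one feature-to-feature node per object pair per dimension; the function name is illustrative:

```python
def node_counts(n_objects, n_features):
    """Number of correspondence nodes SIAM must create for a pair of
    scenes with n_objects objects per scene and n_features features
    per object.  Object-to-object nodes grow as n_objects squared;
    feature-to-feature nodes add one node per object pair per
    dimension, so they too grow quadratically in objects but only
    linearly in features."""
    object_nodes = n_objects ** 2
    feature_nodes = n_objects ** 2 * n_features
    return object_nodes + feature_nodes

# Holding total feature count constant (objects x features = 8),
# many objects/few features costs far more nodes:
print(node_counts(4, 2))  # 16 + 32 = 48
print(node_counts(2, 4))  # 4 + 16 = 20
```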

I. SIAM predicts that object alignment, not simply object similarity, influences similarity. The design of Experiments 1 and 2 allowed object alignment to be teased apart from object similarity. SIAM predicts that object alignment will, in the long run, serve as the basis for weighting feature matches; MIPs become relatively influential in similarity assessments, as compared to MOPs. SCFM, the model that bases similarity on simple and conjunctive features, essentially predicts that object similarity determines how much a feature match will count. Objects will tend to be aligned if they share many features; however, object alignment also depends on the similarity of other object pairs in the scene. When alignment and similarity are dissociated, as they are when MOPs are concentrated in a single pair of objects, object alignment, not the sheer number of matching features, is the better predictor of scene similarity. Early in SIAM's processing, objects are placed in correspondence on the basis of their featural similarity. With time, SIAM decreases the correspondence strength of objects that are featurally similar but are inconsistent with other emerging object correspondences.

J. SIAM predicts that scenes with corresponding objects in unrelated positions can be more similar than scenes with corresponding objects in opposite positions, as was shown in Experiment 1. This result is surprising, given that scenes in the opposite condition share a global feature (the upper-left/lower-right pattern) that unrelated scenes do not.

For opposite position displays in SIAM there will be pressure from role-to-role nodes to place butterflies in correspondence that are not optimally aligned. Role-to-role activity will inhibit the alignment of featurally similar objects. Moreover, the alignment of featurally dissimilar objects receives more activation, increasing the activation of mismatching features associated with the unaligned objects. In contrast, butterflies in unrelated positions do not generate role-to-role nodes that strongly interfere with the featurally optimal object-to-object alignment.

Thus, correct butterfly mappings are harder to determine for scenes in the opposite condition, and as such, feature matches between corresponding butterflies are missed or given low weight. In support of this conjecture, the results from Experiment 1 indicate that displays in the opposite condition have lower mapping accuracies than in either the same or unrelated conditions. This explanation requires correspondences to be influenced by inconsistent, crossmapped correspondences, and is not accommodated by similarity models with independent feature/dimension weights.

Going from opposite to unrelated scenes removes one relational commonality, yet similarity increases. Furthermore, there is evidence that the removed relation, Above-and-left(X, Y), is psychologically salient. In one of the filler scenes from Experiment 3, two scenes with completely different butterflies are compared. For these scenes, opposite-positioned butterflies (with completely different butterflies, opposite and same positions are logically equivalent) receive a significantly greater similarity rating (3.2) than unrelated-positioned butterflies (2.7). Thus, the spatial organization of the butterflies in a scene does serve as a basis for evaluating similarity when the identity of the butterfly filling a role is not relevant.

This pattern of results implies a nonmonotonic relation between the number of shared stimulus aspects and judged similarity. Nonmonotonicities provide a potentially powerful tool for discriminating feature set or dimension/value models of similarity from SIAM. In SIAM, (mis)matching features compete and collaborate to gain attention from the similarity assessor. While SIAM generally predicts that feature matches increase similarity, SIAM also takes into account the influence that each feature match has on other features. Models of similarity that posit independent sets of features, and matches that do not influence each other, have no notion of alignments that are consistent or inconsistent with each other. In SIAM, feature matches can be consistent or inconsistent because scenes are described hierarchically and structurally. It seems difficult to explain the finding that unrelated scenes are rated significantly more similar than opposite scenes unless the notion of inconsistent competing correspondences is invoked. Both the Contrast and MDS models of similarity assume monotonicity. SIAM can predict violations of monotonicity if adding a common element to two scenes decreases attention to other common elements and/or increases attention to scene differences to such an extent that the increase in similarity due to the common feature is overshadowed.

K. SIAM predicts that MOPs influence similarity less when they compete against MIPs. In Experiment 2, despite the fairly large influence of MOPs on similarity, MOPs do not increase similarity when they compete with MIPs. Abstractly, scenes AA and AB are no more similar than are scenes AC and AB. If a symbol takes part in a MIP, no further increase in similarity accrues if it takes part in a MOP as well. A small but significant increase in similarity due to MOPs that are directly inconsistent with MIPs was found in Experiment 1 but not in its replication that did not require subjects to show object-to-object mappings. This effect is consistent with SIAM's account of similarity in which scene parts are placed in correspondence and inconsistent correspondences compete against one another. A strong correspondence exists between identical symbols that are located in the same object part. Both physical similarity and role serve to align these symbols. Consequently, a second correspondence, between another symbol of the same form and one of the tightly aligned symbols, will be strongly inhibited by this correspondence.

Conceptual Advantages of SIAM

In addition to the empirical data supporting SIAM, there are also a number of conceptual advantages that SIAM has over WMMM, SCFM, and traditional models of similarity.

A. SIAM provides an account of alignment. SIAM, in addition to computing similarity, computes a set of object and feature correspondences. In WMMM, feature matches are labelled "in place" or "out of place" as part of the scene pre-processing. SIAM does not require this preprocessing.

B. The dynamic time course of similarity is potentially modelled by SIAM. Processing time has a natural analogue in SIAM - number of cycles of activation adjustment. SIAM describes not only the ultimate similarity ratings, but also the development of similarity measures with time. SIAM makes specific predictions about the influence of time on similarity (Goldstone & Medin, in press). Early on, similarity will be influenced by MIPs and MOPs roughly equally. After more cycles have completed, MIPs will be substantially more influential to similarity than MOPs. With more cycles of activation adjustment, feature and object correspondences will be influenced more by other object correspondences. A correspondence's global consistency will determine its strength to a greater extent with increasing processing. SIAM's time course predictions can be used to account for differences between similarity tasks that require fast versus slow similarity assessments (Goldstone & Medin, in press).

C. Structured scenes are naturally described. The single most important trait of SIAM is that it places structured scene descriptions in alignment. SIAM provides for hierarchically and propositionally organized scenes. Most models of similarity are limited to feature set or dimension/value representations. SIAM's emphasis on aligning structured descriptions, in addition to making contact with work in analogical reasoning, has the promise of being extendable to scenes that have interrelated and internally organized parts.


The central premise has been that similarity comparisons require alignment of the compared scenes' parts. Individual alignments, rather than being independently computed, are influenced by the overall pattern of other emerging alignments. Even when subjects are not instructed to do so, even when indirect measures of similarity are used, subjects set up correspondences between the parts of the things they are comparing. Similarity assessments are well captured by an interactive activation process between feature and object correspondences.

As was true of the original interactive activation process proposed by McClelland and Rumelhart (1981), nodes representing consistent hypotheses excite one another, and nodes representing inconsistent hypotheses inhibit each other. In SIAM, each node represents an hypothesis that two entities from two scenes correspond to each other. Under this architecture, feature, object, and role correspondences concurrently constrain one another. A model that assumes that object alignment influences similarity, but does not assume an interactive activation process, provides less successful accounts of contextual and distributional influences on feature matches, sensitivity differences, and mapping results. Clearly, not every possible non-interactive alignment model was tested. Still, SIAM provides explanations for a number of findings - findings that are problematic for several theoretically motivated and intuitively plausible models.

While a process of scene alignment is strongly implicated by the empirical results, several plausible alignment methods are deficient. First, it is not satisfactory to claim, "scene parts are randomly aligned." Object alignment depends on the similarity of the scenes' objects, as revealed by mapping data and the systematically different influence of MIPs and MOPs on similarity. Second, it is not satisfactory to claim, "part alignment depends on a single aspect or dimension." One might suppose that objects such as butterflies in a scene are placed in alignment on the basis of a single dimension, such as the butterflies' heads or their location. Although these are highly salient dimensions, butterflies with different locations or different heads are placed in alignment if they share several other features. Location may be a privileged dimension in visual perception and an important dimension for determining object identity (Kubovy, 1981), but knowledge of location is not sufficient to determine correspondence.

Third, it is not sufficient to claim, "scene alignment depends only on locally-determined similarity. A part corresponds to the part that it is most similar to." This account fails to take into account the global consistency of alignments. In the display {(AAAA, AABB), (AABC, CCBB)}, the parts that are most similar are usually not placed in alignment by subjects. AABB is most similar to AABC in the second scene, and AABC is most similar to AABB in the first scene. Yet, subjects tend to align AAAA with AABC and AABB with CCBB. Alignments tend to be constructed so as to be globally, not locally, optimal. According to global optimality, the set of correspondences that is forged maximizes the total similarity of consistently aligned objects. Correspondences depend on the entire pattern of object similarities as opposed to individually determined similarities. On any given trial a subject may not produce the optimal correspondence, but globally optimal correspondences occur on the majority of trials and similarity is reduced when the optimal correspondence is not made.
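The contrast between local and global optimality can be sketched directly. The brute-force permutation search below is an illustration of the optimality criterion, not SIAM's actual settling process, and the function names are hypothetical:

```python
from itertools import permutations

def feature_overlap(a, b):
    """Number of matching features between two equal-length objects."""
    return sum(x == y for x, y in zip(a, b))

def best_global_alignment(scene1, scene2):
    """Globally optimal one-to-one alignment: choose the ordering of
    scene2's objects that maximizes the total overlap of consistently
    aligned pairs, rather than pairing each object with its locally
    most similar counterpart."""
    best = max(permutations(scene2),
               key=lambda perm: sum(feature_overlap(a, b)
                                    for a, b in zip(scene1, perm)))
    return list(zip(scene1, best))

# The display discussed above: AABB is locally most similar to AABC
# (3 matching features), but the globally optimal alignment pairs
# AAAA with AABC and AABB with CCBB (2 + 2 = 4 matches versus 3 + 0).
print(best_global_alignment(["AAAA", "AABB"], ["AABC", "CCBB"]))
```

An exhaustive search like this is only feasible for small scenes; SIAM instead approximates the globally optimal alignment through iterative excitation and inhibition.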

If scenes are described as "flat" lists of features, then no two feature alignments are inconsistent. The concept of feature matches being inconsistent if they put different objects in alignment requires, at the least, that scenes be hierarchically organized in terms of objects and features. SIAM's general activation rule is "alignments that are consistent excite one another; alignments that are inconsistent inhibit one another." Issues of consistency arise because objects contain features and if features match, then consistency dictates that the objects that contain the features should also match. Similar dependencies exist between features from the same dimension, between roles and objects, and between objects. Because global consistency is only a viable constraint for scenes that have relations between their internal components, issues of alignment and structural representation, as they relate to similarity, must be addressed simultaneously.
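The general activation rule, "consistent alignments excite, inconsistent alignments inhibit," can be stated compactly. In this sketch (names and the unit weights are hypothetical), two correspondence hypotheses are inconsistent exactly when they would map one entity onto two different partners:

```python
def connection_weight(node_a, node_b):
    """Sign of the connection between two correspondence nodes.
    Each node is a (scene1_entity, scene2_entity) pair hypothesizing
    that the two entities correspond.  Nodes that would create a
    2-to-1 mapping inhibit each other; otherwise they excite."""
    a1, a2 = node_a
    b1, b2 = node_b
    if node_a == node_b:
        return 0.0          # no self-connection
    if (a1 == b1) != (a2 == b2):
        return -1.0         # shares one entity only: inconsistent
    return 1.0              # consistent hypotheses: excitatory

# Object A cannot correspond to both C and D:
print(connection_weight(("A", "C"), ("A", "D")))  # -1.0
# A-C and B-D form a consistent set of correspondences:
print(connection_weight(("A", "C"), ("B", "D")))  # 1.0
```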

It might be argued that determining features and feature weights is a chore for perceptual psychologists and knowledge-engineers. That is, the way to discover constraints on features and feature weights is to investigate the observer's perceptual and knowledge systems. SIAM does not take this route. SIAM, like Tversky's Contrast model, says nothing about what features can or should be included in scene descriptions. SIAM assumes that perceptual and conceptual systems have provided structural descriptions of scenes. However, in SIAM the weight given to feature (mis)matches in a comparison is not simply a function of perceptual processing and conceptual knowledge. Perceptual (e.g. "bright colors are salient") and knowledge-based (e.g. "in identifying portraits made by paranoid patients, shifty eyes are a salient feature") constraints are augmented by a domain-independent principle of alignment. There appear to be regularities concerning the salience of feature matches that cannot be attributed to influences of knowledge or individual featural saliences. Instead, to know whether a feature match between two scenes will count as a feature match (and how much it will count) for increasing similarity, it is necessary to know whether the feature match belongs to corresponding parts.


References

Carroll, J. D., & Wish, M. (1974). Models and methods for three-way multidimensional scaling. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes (Eds.), Contemporary developments in mathematical psychology (Vol. 2, pp. 57-105). San Francisco: Freeman.

Clement, C., & Gentner, D. (1991). Systematicity as a selection constraint in analogical mapping. Proceedings of the Tenth Annual Conference of the Cognitive Science Society. (pp. 412-419). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Evans, P. M., & Smith, L. B. (1988). The development of identity as a privileged relation in classification: When very similar is not similar enough. Cognitive Development, 3, 265-284.

Falkenhainer, B., Forbus, K.D., & Gentner, D. (1990). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1-63.

Gati, I., & Tversky, A. (1982). Representations of qualitative and quantitative dimensions. Journal of Experimental Psychology: Human Perception and Performance, 8, 325-340.

Gati, I., & Tversky, A. (1984). Weighting common and distinctive features in perceptual and conceptual judgments. Cognitive Psychology, 16, 341-370.

Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155-170.

Gentner, D. (1989). The mechanisms of analogical learning. In S. Vosniadou & A. Ortony (Eds.), Similarity, analogy, and thought. New York: Cambridge University Press.

Gentner, D., & Toupin, C. (1986). Systematicity and surface similarity in the development of analogy. Cognitive Science, 10(3), 277-300.

Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117, 227-247.

Goldstone, R. L., & Medin, D. L. (in press). The time course of similarity. Journal of Experimental Psychology: Learning, Memory, & Cognition.

Goldstone, R.L., Gentner, D., & Medin, D.L. (1989). Relations Relating Relations. Proceedings of the Eleventh Annual Conference of the Cognitive Science Society. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Goldstone, R. L., Medin, D. L., & Gentner, D. (1991). Relations, attributes, and the non-independence of features in similarity judgments. Cognitive Psychology, 23, 222-264.

Hayes-Roth, B., & Hayes-Roth, F. (1977). Concept learning and the recognition and classification of exemplars. Journal of Verbal Learning and Verbal Behavior, 16, 321-338.

Hintzman, D. L. (1986). "Schema abstraction" in a multiple-trace memory model. Psychological Review, 93, 411-429.

Hofstadter, D. R., & Mitchell, M. (in press). An overview of the Copycat project. In K. Holyoak & J. Barnden (Eds.), Advances in connectionist and neural computation theory, Vol. 2: Connectionist approaches to analogy, metaphor, and case-based reasoning. Norwood, NJ: Ablex.

Holyoak, K. J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295-355.

Kubovy, M. (1981). Concurrent-pitch segregation and the theory of indispensable attributes. In M. Kubovy and J. Pomerantz (Eds.) Perceptual Organization. Lawrence Erlbaum Associates: Hillsdale, NJ.

Marr, D., and Poggio, T. (1979). A computational theory of human stereo vision. Proceedings of the Royal Society of London, 204, 301-328.

Massaro, D.W. (1989). Testing between the TRACE model and the fuzzy logical model of speech perception. Cognitive Psychology, 21, 398-421.

McClelland, J. L. (1991). Stochastic interactive processes and the effect of context on perception. Cognitive Psychology, 23, 1-44.

McClelland, J. L., & Rumelhart, D.E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375-407.

McClelland, J.L., & Elman, J.L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.

Medin, D. L., & Schaffer, M. M. (1978). A context theory of classification learning. Psychological Review, 85, 207-238.

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39-57.

Nosofsky, R. M. (1992). Exemplar-based approach to relating categorization, identification, and recognition. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum Associates.

Palmer, S. E. (1978). Structural aspects of visual similarity. Memory & Cognition, 6, 91-97.

Shepard, R. N. (1962a) The analysis of proximities: Multidimensional scaling with an unknown distance function. Part I. Psychometrika, 27, 125-140.

Shepard, R. N. (1962b) The analysis of proximities: Multidimensional scaling with an unknown distance function. Part II. Psychometrika, 27, 219-246.

Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237, 1317-1323.

Swets, J. A., Tanner, W. P., & Birdsall, T. G. (1961). Decision processes in perception. Psychological Review, 68, 301-340.

Torgerson, W. S. (1965). Multidimensional scaling of similarity. Psychometrika, 30, 379-393.

Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-352.

Tversky, A., & Gati, I. (1982). Similarity, separability, and the triangle inequality. Psychological Review, 89, 123-154.

Tversky, A., & Hutchinson, J. W. (1986). Nearest neighbor analysis of psychological spaces. Psychological Review, 93, 3-22.

Author Notes

The work described here was reported in the author's doctoral thesis submitted to the Department of Psychology, University of Michigan, 1991. I wish to thank my advisor, Douglas Medin, for his advice, support, and mentorship. Dedre Gentner has also contributed to all stages of the research - from conceptual foundations to experimental design. Together, their ideas have shaped the very core of this work. In addition, this research was funded by National Science Foundation grant BNS 87-20301 awarded to Dedre Gentner and Douglas Medin. I also wish to thank Edward Smith and Keith Smith for their innumerable contributions. This work has benefitted from useful comments and suggestions from Lawrence Barsalou, Evan Heit, Keith Holyoak, Frances Kuo, Michael McCloskey, Arthur Markman, Robert Nosofsky, Brian Ross, and Linda Smith. Correspondence concerning this article should be addressed to Robert Goldstone, Psychology Department, Indiana University, Bloomington, Indiana 47405.

Table 1

Stimulus Types When Only One Feature is Changed


Method    Initial scene        Changed scene        Feature matches

DH->DH    Butterfly 1 = ABCD   Butterfly 1 = ABCD   8 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGH   0 matches out of place

DH->DD    Butterfly 1 = ABCD   Butterfly 1 = ABCD   7 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGD   1 match out of place

DH->DY    Butterfly 1 = ABCD   Butterfly 1 = ABCD   7 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGY   0 matches out of place

DH->XY    Butterfly 1 = ABCD   Butterfly 1 = ABCX   6 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGY   0 matches out of place

DH->HY    Butterfly 1 = ABCD   Butterfly 1 = ABCH   6 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGY   1 match out of place

DH->HD    Butterfly 1 = ABCD   Butterfly 1 = ABCH   6 matches in place
          Butterfly 2 = EFGH   Butterfly 2 = EFGD   2 matches out of place

Table 2

Mapping and Similarity Judgments Broken Down by Method of Changing Dimensions from Initial Scene to Changed Scene


                One Dimension Changed          Two Dimensions Changed
Method          Similarity   True Mapping %    Similarity   True Mapping %

1 DH->DH        8.3          91
2 DH->DD        7.1          90                5.8          83
3 DH->DY        7.0          90                5.4          85
4 DH->XY        6.0          89                4.8          83
5 DH->HY        6.1          86                4.7          76
6 DH->HD        6.4          86                5.1          62

Table 3

Methods of Changing Features
Initial Bird
Changed Bird
Number of Matches out of Place
Number of Matches in Place

Table 4

Nineteen Parameters of SIAM


num_cycles = number of cycles of activation adjustment executed. No default.

L = overall rate of activation change (0-1). Default =1

feature-mismatch-value = the match value (0-1) given to features that mismatch. Default = 0

role-mismatch-value = the match value (0-1) given to roles that mismatch. Default = 0

S_ix = salience of dimension i for scene x (0-1). Default = 1

feature-mismatch-wt = weight of a feature mismatch on feature-to-feature node. Default = 1

feature-match-wt = weight of a feature match on feature-to-feature node. Default = 1

role-mismatch-wt = weight of a role mismatch on role-to-role node. Default = 1

role-match-wt = weight of a role match on role-to-role node. Default = 1

feature-to-feature-inhib-wt=weight of inconsistent feature-to-feature nodes on each other. Default = 1

feature-to-feature-excit-wt = weight of consistent feature-to-feature nodes on each other. Default = 1

object-to-feature-wt=weight of object-to-object node on a consistent feature-to-feature node. Default = 1

feature-to-object-wt = weight of feature-to-feature node on a consistent object-to-object node. Default = 1

object-to-object-inhibit-wt = weight of inconsistent object-to-object nodes on each other. Default= 1

object-to-object-excit-wt = weight of consistent object-to-object nodes on each other. Default = 1

object-to-role-wt = weight of object-to-object node on a consistent role-to-role node. Default = 1

role-to-object-wt = weight of role-to-role node on a consistent object-to-object node. Default = 1

role-to-role-inhibit-wt = weight of inconsistent role-to-role nodes on each other. Default = 1

role-to-role-excit-wt = weight of consistent role-to-role nodes on each other. Default = 1

Table 5

Comparison of Model Fits on the First Two Experiments

Experiment     Model   Parameters   R2      Models improved upon

Experiment 1   SCFM    4            0.875   None
Experiment 1   WMMM    3            0.942   SCFM
Experiment 1   SIAM    2            0.956   SCFM/WMMM
Experiment 2   WMMM    3            0.96    None
Experiment 2   SIAM    3            0.977   WMMM

Table 6

Similarity and Mapping Accuracy

Initial Scene:

Type of Display                          Similarity   True Mapping

2 MIPs distributed                       3.86         70.6%
2 MIPs concentrated in dimension         4.04         59.5
2 MIPs concentrated in object            4.13         66.5
4 MIPs distributed                       4.58         84.0
4 MIPs semi-concentrated in dimension
4 MIPs concentrated in dimension         4.84         81.6
4 MIPs semi-concentrated in object       4.80         79.0
4 MIPs concentrated in object            5.20         77.3
6 MIPs distributed                       6.24         89.2
6 MIPs concentrated in dimension         6.48         86.7
6 MIPs concentrated in object            6.77         87.1

Table 7

Accuracy on "Same/Different" Comparison as a Function of Display Time

Initial Scene:

[Table body not recoverable from the source. The surviving entries indicate that each row paired a feature comparison with a correct response of "Same" (A) or "Different" (D), and that overall accuracy was 84.9% for "same" responses and 79.0% for "different" responses.]

Figure Captions

Figure 1. Based loosely on stimuli used by Gati and Tversky (1984), these landscapes illustrate the hypothesized increase in similarity due to MIPs (matches in place) relative to MOPs (matches out of place), and due to MOPs relative to non-matching features. B and C both contain a spotted object, but only C's spotted cloud corresponds to A's spotted object.

Figure 2. Sample stimuli from Experiment 1. The scene with butterflies A and B is compared to the scene with butterflies C and D. Although A corresponds to C on the basis of head, tail, and wing matches, A and D have a MOP with respect to their body shading. Similarly, B and D are aligned, but B has the same body shading as C. The method of changing the body shadings is "DH->HD."

Figure 3. Two dimensions are changed between the two scenes. Tails are altered by method "DH->HD" and body shadings are altered by method "DH->HY." The display contains four MIPs and three MOPs, and one feature from each scene does not match any feature in the other scene.

Figure 4. Mapping results from Experiment 1. A true mapping occurs when a subject gives the butterfly-to-butterfly alignment that maximizes the number of MIPs relative to MOPs.

Figure 5. Similarity ratings from Experiment 1, as a function of method of changing dimensions between two scenes.

Figure 6. Two sample displays from Experiment 2. A MOP is defined as a matching symbol between two bird parts that assume different roles in their birds. A MIP is a matching symbol between parts that assume the same role in their birds.
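The MIP/MOP distinction in the caption above can be made concrete with a short sketch. Counting logic, scene contents, and the role-based representation below are illustrative assumptions, not the experiment's actual stimuli or code: each scene is a dict of objects, each object a dict mapping part roles to feature symbols.

```python
# Hedged sketch of MIP/MOP counting as defined above. A matching symbol
# between parts with the same role is a MIP (match in place); a matching
# symbol between parts with different roles is a MOP (match out of place).
# Scene contents are hypothetical, not the experiment's stimuli.

scene1 = {"bird1": {"head": "A", "tail": "B"},
          "bird2": {"head": "C", "tail": "D"}}
scene2 = {"bird1": {"head": "A", "tail": "D"},
          "bird2": {"head": "B", "tail": "C"}}

def count_mips_mops(s1, s2):
    """Return (MIPs, MOPs) across all part pairings of the two scenes."""
    parts1 = [(role, f) for d in s1.values() for role, f in d.items()]
    parts2 = [(role, f) for d in s2.values() for role, f in d.items()]
    mips = mops = 0
    for r1, f1 in parts1:
        for r2, f2 in parts2:
            if f1 == f2:          # symbols match...
                if r1 == r2:      # ...in the same role: MIP
                    mips += 1
                else:             # ...in different roles: MOP
                    mops += 1
    return mips, mops

print(count_mips_mops(scene1, scene2))  # (2, 2): A and D in place, B and C out of place
```

Here the A and D symbols recur in the same roles (two MIPs), while B and C recur in different roles (two MOPs).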

Figure 7. Similarity ratings from Experiment 2, as a function of the number of MIPs and MOPs shared between two scenes.

Figure 8. Illustration of the different node and connection types in SIAM. All node connections follow the basic rule that consistent nodes excite, while inconsistent nodes inhibit, each other. Nodes that place objects and features into correspondence are shown. Each node is represented by a rectangular slot. For example, the upper-left slot represents the node that places Objects A and C into correspondence, and the slot below it places the head of Object A into correspondence with the head of C. The upper-left and the upper-right slots have an inhibitory connection between them because Object A cannot correspond to both Object C and Object D.

Figure 9. Illustration of the interconnections between object-to-object and role-to-role nodes. Each cell in the 2X2 arrays represents an object-to-object or role-to-role node. Once again, all node connections are based on consistency. The role-to-role node that places the first arguments of the two Above relations into alignment will excite the node that aligns A and C, because these two objects play the role of "first argument in Above relation."

Figure 10. Example of processing in SIAM. The example is taken from Fig. 3. Each slot represents a node. The values in parentheses show match values (1 = matching feature, 0 = mismatching feature). The four values within a slot show node activations after 1, 2, 3, and 10 time steps have transpired. An abstract depiction of the feature values from Fig. 3 is shown in the lower right corner. For example, Object B shares one feature match with C (X) and two feature matches with D (U and V).
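The interactive activation process that Figs. 8-10 describe can be sketched in a few lines. This is a minimal illustration, not the published SIAM implementation: the match values, starting activation, rate parameter, and clamping rule are all assumptions, but the core principle is the one in the captions above, namely that consistent correspondence nodes excite each other while inconsistent ones inhibit each other, so a globally consistent alignment wins over time steps.

```python
# Minimal sketch (not the published SIAM model) of consistency-based
# interactive activation. Two objects per scene give four object-to-object
# nodes: A-C, A-D, B-C, B-D. A-C and B-D are mutually consistent (no shared
# object) and excite each other; nodes sharing an object inhibit each other.
# Match values and the 0.2 rate parameter are illustrative assumptions.

nodes = ["A-C", "A-D", "B-C", "B-D"]
# Hypothetical proportion of matching features for each object pairing.
match = {"A-C": 1.0, "A-D": 0.33, "B-C": 0.33, "B-D": 0.67}

def consistent(n1, n2):
    """Two correspondences are consistent iff they share no object."""
    a1, c1 = n1.split("-")
    a2, c2 = n2.split("-")
    return a1 != a2 and c1 != c2

activation = {n: 0.5 for n in nodes}  # all nodes start at a neutral value

for step in range(10):
    new_act = {}
    for n in nodes:
        # Net input: the node's own feature match, plus support from
        # consistent nodes, minus pressure from inconsistent competitors.
        net = match[n] - 0.5
        for other in nodes:
            if other != n:
                sign = 1.0 if consistent(n, other) else -1.0
                net += sign * (activation[other] - 0.5)
        # Update with a small rate parameter and clamp to [0, 1].
        new_act[n] = min(1.0, max(0.0, activation[n] + 0.2 * net))
    activation = new_act

# The mutually consistent pairings (A-C, B-D) win the competition:
print(sorted(activation, key=activation.get, reverse=True))
```

Over successive time steps the strong A-C correspondence pulls B-D up with it and suppresses A-D and B-C, mirroring the way activations evolve across the 1, 2, 3, and 10 time-step snapshots in the figure.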

Figure 11. Sample stimuli from Experiment 3. Results indicate that the similarity of the two scenes in the top display is lower than the similarity of the scenes in the other two displays. All of the displays contain four MIPs and no MOPs. Letters representing MIPs are underlined.