Saturday, May 31, 2014

Paper Review: Learning to Manipulate and Categorize in Human and Artificial Agents

Learning to Manipulate and Categorize in Human and Artificial Agents

Giuseppe Morlino, Claudia Gianelli, Anna M. Borghi, Stefano Nolfi

Cognitive Science (2014) 1–26

DOI: 10.1111/cogs.12130

This study investigates the acquisition of integrated object manipulation and categorization abilities through a series of experiments in which human adults and artificial agents were asked to learn to manipulate two-dimensional objects that varied in shape, color, weight, and color intensity. The analysis of the obtained results and the comparison of the behavior displayed by human and artificial agents allowed us to identify the key role played by features affecting the agent/environment  interaction, the relation between category and action development, and the role of cognitive biases originating from previous knowledge.

The paper looks at the effect of action on categorisation. The authors argue that categorisation is grounded in the sensorimotor system, in line with current experiments and theory, and again suggest a central role for action in cognition.

They also look at the issues around how categories enable the flexible usage of objects, and how the grasping of objects changes according to the tasks needed, as per the classic idea of affordances by Gibson (1979).

Important quote: "Affordances are proposed to be the product of the conjunction, in the brain, of repeated visuomotor experiences." Probably a no-brainer to the design community, but important to me, as I need to see this generalise to virtual worlds.  It should be noted that the systems used in this experiment were synthetic, so the effects should generalise to a virtual world, as it is simply shapes and colours with physical properties.  However, there is a history of visual search research with simple shapes not generalising to real images.  This must be considered in any assumptions of efficacy in virtual world simulations.

The experiments involved manipulating 2D objects on a screen with a mouse pointer in placing and shaking tasks. The weight of the objects was aligned with the categories, and some of the categories were also based on colour, blinking and shape. The 20 human participants were compared to neural network agents.

"The results indicated the discriminative features affecting the agent environment interaction such as weight facilitate the acquisition of the required categorisation abilities with respect to alternative features that are equally informative but that do not affect the outcome of the agent actions."  This leads them to the conclusion that categorisation, for both humans and agents and notwithstanding any other factors, is affected by the embodiment of the activity: weight required interaction, not just observation.

The results support a model in which interacting with light versus heavy objects produces categories far more effectively than the other features. Embodied action thus has a great effect on categorisation. Whether it affects every category is still uncertain, as the other visual features (grounded cognition effects) still caused categories to form, just not as early in training.

They consider this to contribute to a STRONG position of embodiment being central to the creation of categories, and not just being a more peripheral contributor.

They also note a shape effect with humans, i.e. participants used a curvilinear path with circles and a rectilinear path with squares. Thus previous memories of the objects influenced their actions and thus the categories.

They also note that the categories arise from the interaction of the agent with the environment, rather than exclusively from top-down or bottom-up processes. The resulting categories are neither overgeneralised nor overly fine-grained, but the product of a dynamic process between agent and environment.

While this is a categorisation task, and not a memory task, one still has to wonder, for my work, whether memory of a process would be enhanced much more by embodied interactions than by visual interactions alone. One could hypothesise that if a category is formed more strongly through embodied action, then the memory of that category (if it maps to, say, activity specifications) should be stronger when it is acted out. So an Oculus and Kinect space should work measurably better in process memory tasks than a purely visual space, with both working better than a simple interview.

Something to think about I guess.

