The Computer Science Colloquium
Thursday, March 18, 4:15pm, room 9204/05
Robert M. Haralick
"Concepts For Symbolic Data Analysis"
Large data sets whose entries take symbolic or categorical values are common. The question is what the data set is telling us. What information is it carrying? What is its structure? Symbolic Data Analysis is a methodology for answering these kinds of questions.
There are two major dimensions of conceptual issues. The first consists of the concepts that relate to information and structure. The second consists of the concepts that arise because the data set must be understood as a perturbed sample from a population, while the information actually desired concerns the population itself. The sampling gives rise to the issue of missing data. The perturbing gives rise to the issue of editing data out of the data set.
For the structural issue, we suggest the concepts of the rules and multicliques of an N-ary relation. For the missing-data and perturbed-data issues, we suggest the concept of the separation relation and show how this leads to a morphological method using the generalized operations of dilation and erosion, from which the opening and closing operations are defined. The opening operation edits out perturbed data and the closing operation fills in missing data. Both opening and closing have a topological connection, which, time permitting, we will explain.
The Colloquium is supported by generous contributions from
Bloomberg, Information Builders, Inc., and Netlogic,
Inc.
365 Fifth Ave, New York City 10016 | Room 4319 | Phone: 212.817.8190 | Fax: 212.817.1510 | compsci@gc.cuny.edu


