Proceedings of the Information Systems Foundations Workshop

Ontology, Semiotics and Practice 1999


 

Semiotic Approach to Understanding Representation In Information Systems

Graeme Shanks

Department of Information Systems

The University of Melbourne

Email: g.shanks@dis.unimelb.edu.au

Abstract

Representation is at the core of the discipline of information systems. In this paper, I provide clear definitions of important concepts and draw on ideas from semiotics and ontology to provide a sound theoretical basis for representation. The key concepts of data, information and meaning are first defined. The syntactic, semantic, pragmatic and social semiotic levels are then used to define a framework for understanding different aspects of representation. Ontology is shown to be useful at the syntactic and semantic levels. Examples of how the framework can be used to understand representation at three levels are then provided: languages for representation; representations of the structure of information systems; and representations of the content of information systems.

Keywords

Information systems foundations, representation, semiotics, ontology

INTRODUCTION

Information systems is an applied discipline and its aim is to improve practice (Keen 1980). Information systems research must therefore be both relevant and rigorous (Keen 1991). It must be relevant in that the outcomes of the research should be usable and useful in practice, and accessible to practitioners. It must be rigorous in that it is soundly based in theory and undertaken in a systematic way using appropriate research methods. This balance is desirable but difficult to achieve.

Information systems researchers have long relied on reference disciplines to provide the foundations for their research (Keen 1980, Weber 1997). Ideas from computer science, management science, psychology, sociology and economics have been borrowed to form the basis of high quality research in information systems (Weber 1997). This is considered by many to be a positive feature of the discipline of information systems. However, the question remains: what is at the core of the discipline of information systems? What is it about information systems that is unique and separates it from its reference disciplines?

This paper first argues that representation is at the core of information systems. However, representation is intertwined with communication and action (FRISCO 1998). I use these ideas and borrow from Mingers (1995) to provide definitions for data, information and meaning. The semiotic levels provide a means to structure thinking about representation and communication (Stamper 1992, FRISCO 1998). Ideas from ontology are shown to be most relevant at the syntactic and semantic levels. These concepts together form a basis for understanding representation in information systems. Examples of how the framework can be used to understand representation at the three levels of language, structure and content are then provided.

INFORMATION SYSTEMS AND REPRESENTATION

Information systems are used to represent the structure and behaviour of other systems (Weber 1997). They are intended to be the basis for coordinated action in some social system, for example an organisation. The coordinated action is established by representation (conceptualisations or models) (FRISCO 1998). The representation is expressed as sentences in some defined language, ranging from semi-formal languages such as entity relationship modelling and the Unified Modeling Language to the formal languages of mathematics and logic.

The key concept about an information system then is that it is essentially a representation of something else. It plays a crucial role in communication and coordination, as things in the world can know about other things only via information (Weber 1997). Given the central importance of representation and communication in the discipline of information systems, we should clearly build upon theories from semiotics, the study of the use of symbols to convey knowledge (Stamper 1992). We should also draw on ontology, that part of metaphysics concerned with how human beings conceive of the world. We should also explicitly state our underlying philosophical position.

CONCEPTUAL FOUNDATIONS

My understanding of representation in information systems is built upon a specific philosophical position (Burrell and Morgan 1979) and clear definitions for data, information and meaning (Mingers 1995). Semiotic levels (Stamper 1992) form the basis of the approach and ideas from ontology are used at the syntactic and semantic levels.

Philosophical Position

A realist ontological position is adopted: we accept that the world consists of things with attributes that are related in a causal way (Burrell and Morgan 1979). More precisely, we adopt a critical realist position, in which “we know the world only through our perceptions” and we recognize the “fallibility of the knowledge we have about the world” (Weber 1997, p171). The world is described in terms of its states (the values of its attributes at a certain point in time), events and laws (that determine the allowed combinations of values of attributes) (Weber 1997). In an information system, the state of the real world system is represented by symbols (textual, numeric, or graphical). A system of connotation is a set of rules that govern the interpretation of these symbols by people. Effective communication with a set of symbols requires that groups of people share the same system of connotation (Mingers 1995). Definitions for data and information follow from the realist ontological position.
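The description of the world in terms of states, events and laws can be made concrete with a small sketch. The following Python fragment is an illustration only, not part of Weber's or Bunge's formalism: the names `Thing`, `apply_event` and the bank account example are hypothetical, and a law is simply assumed to be a predicate over attribute values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A state is the set of attribute values of a thing at one point in time.
State = Dict[str, object]

# A law is modelled (as an assumption) as a predicate that says whether
# a combination of attribute values is allowed.
Law = Callable[[State], bool]


@dataclass
class Thing:
    """A thing in the world with attributes and the laws that constrain them."""
    state: State
    laws: List[Law]

    def is_lawful(self, candidate: State) -> bool:
        return all(law(candidate) for law in self.laws)

    def apply_event(self, changes: State) -> bool:
        """An event is a change of state; it is accepted only if the
        resulting state satisfies every law."""
        candidate = {**self.state, **changes}
        if self.is_lawful(candidate):
            self.state = candidate
            return True
        return False


# Hypothetical example: an account whose balance may not go negative.
account = Thing(
    state={"owner": "A. Customer", "balance": 100},
    laws=[lambda s: s["balance"] >= 0],
)

accepted = account.apply_event({"balance": 40})   # lawful event
rejected = account.apply_event({"balance": -10})  # violates the law
print(accepted, rejected, account.state["balance"])  # True False 40
```

In this sketch the information system holds symbols (the dictionary entries) that stand for the state of the real world thing, and the laws restrict which symbol combinations count as a faithful representation.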

Data is defined as a collection of symbols that are brought together because they are considered relevant to some purposeful activity (Mingers 1995). These symbols may be at different levels of abstraction, for example conceptual models and the contents of databases.

Information is carried by symbols and is an objective (although abstract) commodity that exists independently of any person who may interpret the symbols. The information carried by a symbol is causally implied by the occurrence of the symbol. The information carried by a symbol relates to who produces it, why and how it was produced and its relationship to the real world system state it signifies. Information reflects the intentions of the creator of the symbols. It is the propositional content of a symbol: that which is implied by the occurrence of the sign (Mingers 1995).

This definition of information is also reflected in Hirschheim et al. (1995, p14):

“In everyday life, data corresponds to saying something (be it true or not) while information corresponds to speech acts which convey intentions”

A subjectivist epistemological position is assumed and we accept that our understanding of the world depends on our prior knowledge and experience (Burrell and Morgan 1979). Not all the information contained in a data set is available to people who interpret it. A definition for meaning follows from the subjectivist epistemological position.

Meaning is defined as the particular understanding that people derive from symbols; it is generated from the information that accompanies data. Meaning involves a person understanding a symbol and relating that understanding to their knowledge and experience (Mingers 1995). Meaning depends on the systems of connotation, and on the background and experience, of the producer and interpreter of the data. The meaning associated with a set of symbols may therefore be different for the producer and interpreter.
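The distinction between data and meaning can be illustrated with a minimal sketch. It assumes, purely for illustration, that a system of connotation can be reduced to a lookup from symbols to interpretations; the status codes and both lookup tables are hypothetical.

```python
from typing import Dict, Optional

# A system of connotation, simplified here to rules mapping symbols
# to interpretations (an assumption made for this sketch).
Connotation = Dict[str, str]


def interpret(symbol: str, connotation: Connotation) -> Optional[str]:
    """Return the meaning a person derives from a symbol, or None if the
    symbol is not covered by that person's system of connotation."""
    return connotation.get(symbol)


# Hypothetical status codes stored in a customer record (the data).
data = ["A", "S"]

# The producer and the interpreter hold different systems of connotation.
producer = {"A": "active customer", "S": "suspended customer"}
interpreter = {"A": "active customer", "S": "senior customer"}

for symbol in data:
    print(symbol, "|", interpret(symbol, producer), "|", interpret(symbol, interpreter))

# Symbols for which producer and interpreter derive the same meaning.
shared = [s for s in data if interpret(s, producer) == interpret(s, interpreter)]
print(shared)  # ['A']
```

The data are identical for both people, yet only the symbol `A` communicates effectively, because only there do the two systems of connotation agree.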

A summary of the philosophical position and important definitions is provided in Table 1 below.

 

Table 1 Summary of Philosophical Position and Important Definitions

Ontological Position          Realist - the world system consists of things and attributes that are related in a causal way, and is described in terms of its states (the values of its attributes at a certain point in time), events and laws (the allowed combinations of values of attributes).

Epistemological Position          Subjectivist - our understanding of the world depends on our prior knowledge and experience.

Data          Data is a collection of symbols that signify real world system states and are brought together because they are considered relevant to some purposeful activity.

Information          Information is an objective commodity carried by symbols and relates to who produced it, why and how it was produced and its relationship to the real world state it signifies.

Meaning          Meaning involves a person understanding a set of symbols and relating that understanding to their knowledge and experience. The meaning associated with a set of symbols may be different for the producer and interpreter of the symbols.

Semiotic Theory

Semiotic theory concerns the use of symbols to convey knowledge. Stamper (1992) defines six levels for analysing symbols. These are the physical, empirical, syntactic, semantic, pragmatic and social levels. The physical and empirical levels concern the physical media and use of the physical media for communication of symbols. They are not generally considered to be in the domain of information systems and will not be considered in this paper. The four semiotic levels that are of interest in representation in information systems are the syntactic, semantic, pragmatic, and social levels. Although these levels are separated for analytical convenience, they are closely interrelated and build on each other. The four levels are briefly discussed below and goals for quality representation are established for each.

Syntactic          The syntactic level of representation is concerned with the form of symbols rather than their meaning. A representation uses a defined set of symbols according to a set of rules. If the syntax is formally defined then symbolic forms may be transformed into other symbolic forms. Two symbolic forms are equivalent if they may be transformed into one another. The goal for the syntactic level is that the representation is correct and consistent (Lindland et al. 1994, Shanks and Darke 1998). This means that all sentences in the representation should conform to the predefined syntax rules of the particular grammar used in the representation.

Semantic          The semantic level of representation concerns the meaning of symbols. People, depending on their prior knowledge and experience, assign meanings to symbols. Meaning is the mapping of a symbol to a real world object or state and may be different for different people. However, people with similar knowledge and experience frequently share meanings. The goal for the semantic level is that the representation is complete and accurate at particular points in time (Lindland et al. 1994, Shanks and Darke 1998). This means that the representation should capture all the meaning accurately.

Pragmatic         The pragmatic level of representation concerns the usage of symbols. It takes into account contextual issues including the characteristics of the person using the symbols, the task they are engaged in and the organisational context. The goal for the pragmatic level is that the representation is useful and usable (Shanks and Darke 1998). This means that the representation is suitable for the task at hand.

Social               The social level of representation concerns the understanding of the meaning of symbols, and takes into account an understanding of different stakeholder viewpoints and an awareness of any biases and other cultural and political issues involved. The goal for the social level is that a shared understanding of the representation is achieved (Shanks and Corbitt 1999).

 

Figure 1            Semiotic Framework for Understanding Representation  (adapted from Lindland et al. 1994)

Clearly, the set of symbols and rules defined in the grammar is of critical importance, together with stakeholder knowledge and beliefs. The grammar should be suitable for the type of system being represented. Particular grammars may be adequate for some representations and inadequate for others. Similarly, different grammars may be suitable for some stakeholder groups but not others. A good match between system, grammar and stakeholders should be achieved.

Ontology

Ontology is concerned with how human beings conceive of the world. Weber (1997) notes that it deals with theories about the nature of things in general as opposed to theories about particular things. He uses Bunge’s ontology to develop theories about the types of features that information system grammars should have and to evaluate the quality of particular representations (scripts).

Clearly, theories about the types of features that information system grammars should have fit well with the syntactic semiotic level. Weber (1997, p83) notes two primary criteria that can be used to evaluate the goodness of the “deep structure” of a particular representation, namely completeness and accuracy. This aspect of his work with ontology fits well with the semantic semiotic level, but within the context of business usage (pragmatic semiotic level) and business understanding (social semiotic level).

The pragmatic and social semiotic levels cannot be related as easily to concepts from ontology. However, as information systems are primarily social systems that facilitate communication and coordinate action, these aspects of representation should not be ignored. They are also core to the discipline of information systems.

USE OF THE FOUNDATIONS

There are three perspectives on representation that the concepts presented above can inform:

   the nature of the languages used in the representations,

   the representation of the types of things and events, or structure of the conceptual models (the intension), and

   the representation of the things and events, or content (the extension).

Each of these perspectives is discussed below in the context of data modelling.

Languages for Representation

The languages used for representation must be suitable for both the domain being modelled and the stakeholders who will use the model (Lindland et al. 1994). The entity relationship language is suitable for the domain of data warehouse design and for stakeholders who are trained in its use. A simpler, more informal language may be more suitable for end users. Darke and Shanks (1996) discuss the relationship between types of representation (formal, semi-formal and informal) and stakeholders.

Weber (1997) has used Bunge’s ontology to evaluate the completeness, clarity, redundancy and other characteristics of representation languages. This approach has much value in finding deficiencies in existing modelling languages and suggesting ways of improving them. It also provides a sound base for proposing hypotheses that can be subsequently tested using empirical research approaches.

Representing the Structure of Information Systems – Data Model Quality

The quality of conceptual models in general and data models in particular is of great importance to practitioners, as there are few generally accepted guidelines for evaluating alternative models and little agreement as to what makes a “good” model (Moody and Shanks 1998). Data model quality is often understood by defining lists of desirable dimensions. However, these dimensions are often overlapping, vaguely defined, ambiguous and not soundly based in theory.

The semiotic approach for understanding representation has been used to develop a framework for understanding the quality of data models (Shanks and Darke 1998). This framework organises and structures the key concepts and features of quality in conceptual modelling and shows how information systems foundations can be used to inform practice-oriented frameworks. Components of the framework include a set of quality goals for each of the four semiotic levels (these goals may also be associated with particular data model quality dimensions). The means to achieve each of these goals are defined, together with measures for each of the goals and the stakeholders responsible for producing, maintaining and reading the data model.

The data model quality goals, stakeholders, improvement strategies and measures are all intuitively usable and useful to practitioners. However, the concept of four sets of clear goals is informed by semiotics. The dimensions for the syntactic and semantic levels, and to some extent their measures, can be derived using concepts from ontology. Without the theoretical base, there is no way of clearly arguing that at least part of the framework is complete. Figure 2 shows the components of the framework and their interrelationships and Table 1 summarises details within the data model quality framework.

Figure 2            Framework for Understanding Data Quality

 

Table 1 Data Model Quality - Summary of Goals, Means and Measures

| Semiotic Level | Goal | Means | Measures |
| --- | --- | --- | --- |
| Syntactic | Correct | Syntax checking, training for data modellers | Syntax error ratio |
| Semantic | Complete and valid | Training for data modellers and users, stakeholder participation, prototyping | Expert rating, comparison with generic models |
| Pragmatic | Usable and useful | Visualisation, animation, explanation, simulation | Stakeholder rating |
| Social | Shared understanding | Viewpoint analysis, conflict resolution, cultural immersion | User surveys |
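As an illustration of the syntactic measure in the table above, the following sketch computes a syntax error ratio for a toy entity relationship model. The paper does not define how the ratio is calculated; here it is assumed to be the number of rule-breaking constructs divided by the number of constructs checked, and the two syntax rules and all names are hypothetical.

```python
from typing import List, Tuple


def syntax_error_ratio(entities: List[str],
                       relationships: List[Tuple[str, str, str]]) -> float:
    """Fraction of constructs that break one of two illustrative rules."""
    errors = 0
    declared = set()
    for name in entities:
        # Rule 1 (assumed): entity names are unique.
        if name in declared:
            errors += 1
        declared.add(name)
    for _, source, target in relationships:
        # Rule 2 (assumed): a relationship connects declared entities.
        if source not in declared or target not in declared:
            errors += 1
    total = len(entities) + len(relationships)
    return errors / total if total else 0.0


# Hypothetical model: one duplicate entity, one dangling relationship.
entities = ["Customer", "Order", "Customer"]
relationships = [
    ("places", "Customer", "Order"),
    ("contains", "Order", "Product"),  # Product is never declared
]

ratio = syntax_error_ratio(entities, relationships)
print(f"{ratio:.2f}")  # 0.40
```

A check of this kind addresses only the form of the model. Whether `Customer` and `Order` are the right entities for the organisation is a semantic question and needs the means listed for that level, such as stakeholder participation.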

Representing the Content of Information Systems – Data Quality

Data quality problems are becoming increasingly prevalent in practice, particularly in data warehousing and the implementation of enterprise resource planning systems (Shanks 1999). Understanding data quality and how it can be managed and improved are critical issues for practitioners. Data quality is also often understood by defining lists of desirable data quality dimensions that are often overlapping, vaguely defined, ambiguous and not soundly based in theory.

The semiotic approach for understanding representation has been used to develop a framework for understanding data quality (Shanks and Darke 1998). This framework consists of essentially the same components as the framework for understanding data model quality defined previously. The data quality goals, stakeholders, improvement strategies and measures are all readily understood by practitioners. However, the concept of four sets of clear goals is informed by semiotics. Concepts from ontology may again be used for goals at the syntactic and semantic levels. Table 2 summarises details within the data quality framework.

Table 2 Data Quality - Summary of Goals, Means and Measures

| Semiotic Level | Goal | Means | Measures |
| --- | --- | --- | --- |
| Syntactic | Consistent | Corporate data model, syntax checking, training for data producers | Percentage of inconsistent data values |
| Semantic | Complete and accurate | Training for data producers, minimise data transformations and transcriptions | Percentage of errors in data or population sample |
| Pragmatic | Usable and useful | Monitoring data consumers, high quality data delivery systems, data tagging | Time of update, user surveys, effect on decision-making processes and outcomes |
| Social | Shared understanding | Viewpoint analysis, conflict resolution, cultural immersion | User surveys |
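The syntactic and semantic measures in the table above lend themselves to a short worked example. The sketch below is one possible reading of those measures, not a definition from the framework: the four-digit postcode format, the record keys and the audit sample are all hypothetical.

```python
import re
from typing import Dict, List


def percent_inconsistent(values: List[str], pattern: str) -> float:
    """Syntactic measure: share of values that do not match the format."""
    if not values:
        return 0.0
    bad = sum(1 for v in values if re.fullmatch(pattern, v) is None)
    return 100.0 * bad / len(values)


def percent_errors(stored: Dict[str, str], verified: Dict[str, str]) -> float:
    """Semantic measure: share of sampled records whose stored value
    differs from the independently verified real world value."""
    if not verified:
        return 0.0
    wrong = sum(1 for key, truth in verified.items() if stored.get(key) != truth)
    return 100.0 * wrong / len(verified)


# Hypothetical postcode column; the four-digit format is an assumed rule.
postcodes = {"c1": "3010", "c2": "301O", "c3": "2109", "c4": "3052"}
# Hypothetical audit sample of two records checked against the real world.
audit = {"c1": "3010", "c4": "3053"}

syntactic = percent_inconsistent(list(postcodes.values()), r"\d{4}")
semantic = percent_errors(postcodes, audit)
print(syntactic, semantic)  # 25.0 50.0
```

Record `c2` fails the syntactic check because it contains the letter O, while record `c4` is well formed but does not match the verified value. The two measures therefore detect different kinds of defect, which is why the framework keeps the levels separate.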

 

CONCLUSION

This paper argues that representation is at the core of the discipline of information systems. It has explicitly adopted a realist ontological position and a subjectivist epistemological position and provided clear definitions for information system, data, information, and meaning. It has suggested that semiotics and ontology are two key areas of theory that form the foundations of representation in information systems.

To be relevant to information systems practitioners, representation should be considered at three levels: grammars for representation, representations of structure, and representations of content. These theoretical foundations can be used to inform the development of frameworks that are of use to practitioners.

The four semiotic levels discussed cover both intrinsic and contextual aspects of representation, while ontology focuses on intrinsic aspects of representation. Information systems are primarily social systems that facilitate communication and coordinate action. The contextual issues cannot be ignored.

REFERENCES

Burrell, G. and Morgan, G. (1979) Sociological Paradigms and Organisational Analysis, Heinemann, London

Darke, P. and Shanks, G. (1996) Stakeholder Viewpoints in Requirements Definition: A Framework for Understanding Viewpoint Development Approaches, Requirements Engineering, 1:2, pp 88-105

FRISCO (1998) A Framework of Information Systems Concepts, IFIP Working Group 8.1 FRISCO

Hirschheim, R., Klein, H. and Lyytinen, K. (1995) Information Systems Development and Data Modelling: Conceptual and Philosophical Foundations, Cambridge University Press

Kahn, B., Strong, D.M. and Wang, R.Y. (1997) A Model for Delivering Quality Information as Product and Service, Proceedings of the 1997 Conference on Information Quality, Boston: MIT, 80-94

Keen, P.G.W. (1980) MIS Research: Reference Disciplines and a Cumulative Tradition, Proc. 1st International Conference on Information Systems, Philadelphia, 9-18

Keen, P.G.W. (1991) Relevance and Rigor in Information Systems Research: Improving Quality, Confidence, Cohesion and Impact, in Information Systems Research: Contemporary Approaches and Emergent Traditions, H-E Nissen, H. Klein, and R. Hirschheim (eds.) North Holland, Amsterdam, 27-49

Krogstie, J., Lindland, O.I. and Sindre, G. (1995) Towards a Deeper Understanding of Quality in Requirements Engineering, Proc. 7th Intl. Conf. Advanced Information Systems Engineering, Jyvaskyla, Finland (June)

Lindland, O., Sindre, G. and Solvberg, A. (1994) Understanding Quality in Conceptual Modelling, IEEE Software (March) 42-49

Mingers, J.C. (1995) Information and Meaning: foundations for an intersubjective account, Information Systems Journal, Vol 5, 285-306

Moody, D. and Shanks, G. (1998) What Makes a Good Data Model? Evaluating the Quality of Data Models, Australian Computer Journal (August)

Shanks, G. (1999) A Framework for Understanding Data Quality, Proc. 4th Australian Data Management Association (DAMA) Conference, Melbourne (October)

Shanks, G. and Corbitt, B. (1999) Understanding Data Quality: Social and Cultural Aspects, forthcoming in Proc. 11th Australian Conf on Information Systems, Wellington (Dec)

Shanks, G. and Darke, P. (1998) Understanding Metadata and Data Quality in a Data Warehouse, Australian Computer Journal (November)

Stamper, R. (1992) Signs, Organisations, Norms and Information Systems, Proc. 3rd Australian Conference on Information Systems, Wollongong

Strong, D.M., Lee, Y.W. and Wang, R.Y. (1997) Data Quality in Context, Communications of the ACM, Vol 40, No 5, 103-110

Wand, Y. and Wang, R. (1996) Anchoring Data Quality Dimensions in Ontological Foundations, Communications of the ACM, Vol 39, No 11, 86-95

Weber, R. (1997) Ontological Foundations of Information Systems, Coopers and Lybrand, Melbourne

 

ACKNOWLEDGEMENTS

Many thanks to the participants in the IS Foundations Workshop for their constructive discussion about an earlier version of this paper and in particular to Kit Dampney for his insightful comments.

 

COPYRIGHT

Graeme Shanks (c) 1999. The author assigns to the IS Foundations Workshop held at the Department of Computing, Macquarie University on Wednesday 29 September 1999 and educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive license to the IS Foundations Workshop to publish this document in full in the Workshop Proceedings. Those documents may be published on the World Wide Web, CD-ROM, in printed form, and on mirror sites on the World Wide Web. Any other usage is prohibited without the express permission of the author.


 

