Ontology as a teaching about the world as a whole. Basic ontological models. Ontology and ontological system models


THE BEGINNING - Ontologies in corporate systems. Part I

Ontological systems can be used to solve business problems, create intelligent systems, and represent knowledge on the Internet. The range of technologies related to this issue is very wide and includes multi-agent systems, automatic knowledge extraction from natural language texts, information retrieval, intelligent annotation, automatic preparation of abstracts, etc.

This second part of the article briefly reviews theoretical concepts, tools, and practical examples of ontology use.

FORMAL MODEL OF ONTOLOGY

Ontology consists of terms (concepts), their definitions and attributes, as well as associated axioms and inference rules.

The formal ontology model O = <T, R, F> is an ordered triple of finite sets, where:

  • T is the set of terms of the subject area (SbA) described by the ontology O;
  • R is the set of relationships between the terms of the given SbA;
  • F is the set of interpretation functions defined on the terms and/or relations of the ontology O.
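The triple can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Ontology`, the helper `build_example`, and the sample terms are hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


@dataclass
class Ontology:
    """Illustrative container for the three finite sets: terms, relations, functions."""
    terms: Set[str] = field(default_factory=set)
    # Each relation is stored as (relation_name, source_term, target_term).
    relations: Set[Tuple[str, str, str]] = field(default_factory=set)
    # Interpretation functions, keyed by a hypothetical name.
    functions: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def add_relation(self, name: str, source: str, target: str) -> None:
        # A relation may only connect terms that belong to the term set.
        if source not in self.terms or target not in self.terms:
            raise ValueError("both terms must belong to the ontology")
        self.relations.add((name, source, target))

    def related(self, name: str, source: str) -> Set[str]:
        # All terms reachable from `source` through one `name` relation.
        return {t for (n, s, t) in self.relations if n == name and s == source}


def build_example() -> Ontology:
    onto = Ontology(terms={"vehicle", "car", "engine"})
    onto.add_relation("is_a", "car", "vehicle")
    onto.add_relation("part_of", "engine", "car")
    glossary = {"car": "a road vehicle with an engine"}
    # One interpretation function: map a term to its textual definition.
    onto.functions["definition"] = lambda term: glossary.get(term, "")
    return onto
```

Calling `build_example().related("part_of", "engine")` returns the set containing `"car"`. Relations are kept as named triples so that domain-independent relations such as "part-whole" can be reused across subject areas.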

Ontology models are classified as follows:

  • simple (have only concepts);
  • frame-based (only have concepts and properties);
  • based on logic (e.g. Ontolingua, DAML + OIL).

Relationships represent the type of interaction between SbA concepts. An example of a binary relationship is "is a part of". Note that the relations worth using when building an ontology are far less diverse than the terms and, as a rule, are not specific to a particular SbA ("part-whole", "is a subclass of", "has an effect on", "is similar to", etc.).

Axioms are used to model statements that are always true.

Certain types of relationships can be established between concepts. A glossary of terms for a specific application area, or a thesaurus whose concepts and relationships define natural language terms, can be regarded as an ontology. Information retrieval methods are used to link verbally defined concepts and to find the concepts relevant to a query. Well-known examples of this type of ontology are the indexes of Internet search engines.

To describe more complex systems, concepts such as the extensible ontology model are used.

ONTOLOGY DESCRIPTION LANGUAGES

To implement various ontologies, representation languages are needed that have sufficient expressive power and spare the user "low-level" problems.

The key point in ontology design is the choice of an appropriate ontology specification language. The purpose of such languages is to make it possible to specify additional machine-interpretable semantics of resources, to bring machine data representations closer to the real-world state of affairs, and to significantly increase the expressive power of conceptual modeling of loosely structured Web data.

The spread of the ontological approach to knowledge representation has led to a variety of ontology representation languages and of tools for editing and analyzing ontologies.

The traditional ontology specification languages include Ontolingua, CycL, description logic-based languages (such as LOOM), and frame-based languages (OKBC, OCML, FLogic).

Later languages are based on Web standards (XOL, SHOE, UPML). RDF(S), DAML, OIL, and OWL, which are discussed below, were created specifically for exchanging ontologies over the Web.

In general, traditional and Web-based ontology specification languages differ in how expressively they can describe a domain and in some capabilities of their inference engines. Typical language primitives additionally include:

  • constructions for aggregation, multiple class hierarchies, inference rules, axioms;
  • various forms of modularization for recording ontologies and relationships between them;
  • the possibility of meta-description of ontologies, which is useful for establishing relationships between different types of ontologies.

Today, some of these languages have gained great popularity and are widely used (in particular, to describe information resources and Internet services).

RDF. Within the Semantic Web project for the semantic interpretation of Internet information resources, the Resource Description Framework (RDF) was proposed as a standard for describing document metadata using XML syntax.

RDF is built on an "object - attribute - value" data model and can serve as a universal language for describing the semantics of resources and the relationships between them. Resources are described as a directed labeled graph: each resource can have properties, which in turn can be resources or collections of resources. All RDF vocabularies share a basic structure that describes resource classes and the types of relationships between them. This makes it possible to use heterogeneous, decentralized vocabularies created for machine processing according to different principles and methods. An important feature of the standard is extensibility: the structure of a resource description can be defined by using and extending built-in RDF Schema concepts such as classes, properties, types, and collections. The RDF Schema model includes class and property inheritance.
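The "object - attribute - value" model and the directed labeled graph it induces can be illustrated with plain Python tuples. This is a minimal sketch, not an RDF library: the resource and property names below are hypothetical, and the helpers `build_graph` and `follow` are introduced only for this example.

```python
from collections import defaultdict

# Hypothetical (object, attribute, value) statements, used only for illustration.
TRIPLES = [
    ("doc:report", "dc:creator", "person:ivanov"),
    ("doc:report", "dc:title", "Annual report"),
    ("person:ivanov", "foaf:name", "Ivanov"),
]


def build_graph(triples):
    """Index triples by subject: each edge is labeled with its attribute."""
    graph = defaultdict(list)
    for subject, attribute, value in triples:
        graph[subject].append((attribute, value))
    return graph


def follow(graph, resource, *path):
    """Follow a chain of attributes from a resource; return None if it breaks."""
    current = resource
    for attribute in path:
        values = [v for a, v in graph.get(current, []) if a == attribute]
        if not values:
            return None
        current = values[0]  # take the first value when there are several
    return current


graph = build_graph(TRIPLES)
creator_name = follow(graph, "doc:report", "dc:creator", "foaf:name")
```

Because the value of one statement (`person:ivanov`) is itself the object of another, the statements chain into a graph, which is what lets a property value be a resource in its own right.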

RDF has already received support from many leading software vendors. A number of software products make it possible to create RDF descriptions for various kinds of systems. Integration of existing information stores into a common base of semantic descriptions, and integration of the RDF-base concept with the MPEG format, are envisaged.

RDF Schema (RDFS) is a standard proposed by the W3C for representing ontological knowledge. It specifies many different valid data schemas. Domain models are described by means of resources, properties, and their values. RDFS provides a good foundation for describing vocabularies of subject area types. One of its limitations is that it cannot express axiomatic knowledge, that is, axioms and the inference rules based on them.

DAML + OIL is a semantic markup language for Web resources that extends the RDF and RDF Schema standards with richer modeling primitives. The latest version of DAML + OIL provides a rich set of constructs for creating ontologies and marking up information so that it can be read and understood by a machine.

The first proposals for describing ontologies on the basis of RDFS were DARPA's DAML-ONT (DARPA Agent Markup Language) and the European Commission's OIL (Ontology Inference Layer). These standards for specifying and exchanging ontologies were developed to support knowledge exchange and knowledge integration. The joint solution DAML + OIL emerged from these proposals. A DAML + OIL ontology consists of headers, class elements, property elements, and instances.

OWL (Web Ontology Language) is an ontology representation language that extends the capabilities of XML, RDF, RDF Schema, and DAML + OIL. The project aims to create a powerful semantic analysis engine and to remove the limitations of the DAML + OIL constructs.

OWL ontologies are sequences of axioms and facts, as well as references to other ontologies. They contain a component for attribution and other detailed information, are Web documents, and can be referenced through a URI.

In the Semantic Web project mentioned above, "machine processing of meaning" is to be made as explicit as possible by annotating documents with meaningful indexes built from ontological terms. Ontologies are therefore viewed as a key technology for the Semantic Web (Fig. 1).

Ontologies play an important role in organizing and sharing Web-based knowledge processing. Ontologies, defined as shared formal conceptualizations of specific domains, provide a common understanding of topics about which both people and applications can exchange information. Ontologies differ from XML schemas in that they are representations of knowledge, not message formats (most Web standards combine message formats and protocol specifications).

ONTOLOGY PROCESSING TOOLS

One advantage of ontologies is the availability of software tools that provide general, domain-independent support for ontological analysis. A number of such tools support editing, visualization, documentation, import and export of ontologies in different formats, as well as their presentation, merging, and comparison.

Editors

Ontolingua. In addition to the ontology editor itself, this system contains a Webster network component designed to define concepts, a server providing access to Ontolingua ontologies via the OKBC (Open Knowledge Base Connectivity) protocol, and Chimaera, a toolkit for analyzing and combining ontologies.

Protégé is a freely distributed Java program for building (creating, editing, and viewing) ontologies of a particular application area. It includes an ontology editor that lets you design ontologies by expanding a hierarchical structure of abstract and concrete classes and slots. From the resulting ontology, Protégé can generate knowledge acquisition forms for entering instances of classes and subclasses.

The tool supports the OWL language and can generate HTML documents that display the structure of ontologies. Because it uses the OKBC frame-based knowledge representation model, it can also be adapted for editing SbA models represented not in OWL but in other formats (UML, XML, SHOE, DAML + OIL, RDF and RDFS, etc.).

DOE is a simple editor that allows the user to create ontologies. The ontology specification process consists of three stages.

At the first stage, the user builds taxonomies of concepts and relationships, explicitly stating the position of each element in its hierarchy. The user then indicates what distinguishes a concept from its "parent" and how it is similar to or different from its "siblings". The user can also add synonyms and encyclopedic definitions in several languages for all concepts.

At the second stage, the two taxonomies are considered from different points of view. The user can extend them with new objects or add constraints on the domains of relationships.

At the third stage, the ontology can be translated into the language of knowledge representation.

OntoEdit is a tool for viewing, checking, and modifying ontologies. It supports the OIL and RDFS ontology representation languages, as well as OXML, an internal XML-based knowledge representation language. Like Protégé, it is a stand-alone Java application, but its source code is closed. The free version, OntoEdit Free, is limited to 50 concepts, 50 relationships, and 50 instances.

OilEd is a stand-alone graphical ontology editor developed within the On-To-Knowledge project. It is freely distributed under the General Public License (GPL). The tool uses the OIL language to represent ontologies. OilEd lacks support for class instances.

WebOnto is designed for viewing, creating, and editing ontologies. It uses the Operational Conceptual Modeling Language (OCML) to model ontologies. The user can create various structures, including classes with multiple inheritance. The tool has a number of useful features: it lets users view relations, classes, and rules, and several users can work on an ontology together.

ODE (Ontological Design Environment) interacts with users at the conceptual level, provides them with a set of tables to fill in (concepts, attributes, relations), and automatically generates code in the LOOM, Ontolingua, and FLogic languages. The tool was further developed into WebODE, which integrates all ODE services into a single architecture and stores its ontologies in a relational database.

Complex tools

These tools are needed not only to enter and edit ontological information, but also to analyze it by performing typical operations on ontologies, for example:

  • alignment: establishing various kinds of correspondence between two ontologies so that they can use each other's information;
  • mapping: finding semantic links between similar elements of different ontologies;
  • merging: an operation that generates a third ontology from two ontologies, combining the information of both.

PROMPT is used to merge and align ontologies. It is implemented as a plugin for the Protégé system. Given the two ontologies to be merged, PROMPT builds a list of operations (for example, merging terms or copying them into a new ontology) and presents it to the user, who can perform one of the proposed operations. The list of operations is then updated, and a list of conflicts and their possible solutions is created. This is repeated until the new ontology is ready.
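The idea of proposing merge and copy operations can be sketched in a few lines. This is not PROMPT's actual algorithm: it assumes that ontologies are reduced to sets of term names, that terms match when their names are equal ignoring case, and that the user accepts every suggestion. The function names are hypothetical.

```python
def suggest_operations(terms_a, terms_b):
    """Suggest ("merge", a, b) for matching names and ("copy", term, None) otherwise."""
    index_b = {term.lower(): term for term in terms_b}
    operations = []
    matched_b = set()
    for term in sorted(terms_a):
        twin = index_b.get(term.lower())
        if twin is not None:
            operations.append(("merge", term, twin))
            matched_b.add(twin)
        else:
            operations.append(("copy", term, None))
    # Terms of the second ontology that found no partner are copied as they are.
    for term in sorted(terms_b):
        if term not in matched_b:
            operations.append(("copy", term, None))
    return operations


def apply_operations(operations):
    """Build the merged term set, assuming every suggestion is accepted."""
    merged = set()
    for _kind, first, _second in operations:
        merged.add(first.lower())
    return merged


ops = suggest_operations({"Car", "Engine"}, {"car", "Wheel"})
merged_terms = apply_operations(ops)
```

In an interactive tool the user would accept or reject each suggestion and the list would be recomputed after every step; the sketch collapses that loop into a single pass.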

Chimaera is an interactive tool for merging ontologies, based on the Ontolingua ontology editor.

In OntoMerge, the original ontologies are translated into a common representation in a special language.

OntoMorph defines a set of transformation operators that can be applied to an ontology.

OBSERVER combines ontologies with information about the mapping between them and finds synonyms in the original ontologies.

ONION is based on an ontology algebra and provides tools for defining articulation (linking) rules between ontologies.

METHODOLOGY FOR CREATING ONTOLOGIES

Practical ontology development includes:

  • defining the classes in the ontology;
  • arranging the classes in a taxonomic hierarchy (subclass - superclass);
  • defining slots and describing the allowed values of these slots;
  • filling in the slot values of instances.

After that, you can create a knowledge base by defining individual instances of these classes, entering values for specific slots, and adding further slot constraints.
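The four steps above (classes, hierarchy, slots with allowed values, instances) can be illustrated with a short sketch. The names `OntologyClass` and `create_instance` and the sample classes are assumptions made for the example; real tools such as Protégé have their own, much richer models.

```python
class OntologyClass:
    """A class with an optional superclass and a set of slots."""

    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        # Slot name -> allowed values: either a Python type or a set of literals.
        self.own_slots = dict(slots or {})

    def all_slots(self):
        # Slots are inherited down the subclass - superclass hierarchy.
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.own_slots}

    def is_subclass_of(self, other):
        cls = self
        while cls is not None:
            if cls is other:
                return True
            cls = cls.parent
        return False


def create_instance(cls, **values):
    """Create an instance, checking each slot value against its constraint."""
    slots = cls.all_slots()
    for slot, value in values.items():
        if slot not in slots:
            raise KeyError(f"{cls.name} has no slot {slot!r}")
        allowed = slots[slot]
        if isinstance(allowed, type):
            valid = isinstance(value, allowed)
        else:
            valid = value in allowed
        if not valid:
            raise ValueError(f"{value!r} is not allowed for slot {slot!r}")
    return {"class": cls.name, **values}


employee = OntologyClass("Employee", slots={"name": str})
manager = OntologyClass("Manager", parent=employee,
                        slots={"level": {"junior", "senior"}})
anna = create_instance(manager, name="Anna", level="senior")
```

Here `Manager` inherits the `name` slot from `Employee`, and an attempt to create a manager with `level="chief"` raises `ValueError`, which is the kind of slot constraint the knowledge base relies on.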

Let's highlight some fundamental rules of ontology development. They look quite categorical, but in many cases they will help to make the right design decisions.

  • There is no single correct way to model a domain — there are always viable alternatives. The best solution almost always depends on the intended application and expected extensions.
  • Ontology development is necessarily an iterative process.
  • Concepts in ontology should be close to objects (physical or logical) and relationships in the domain of interest. Most likely, these are nouns (objects) or verbs (relationships) in sentences that describe the subject area.

Knowing what the ontology is supposed to be used for and how detailed or general it will be can influence many modeling decisions.

It is necessary to determine which of the alternatives will help to better solve the problem and will be more visual, more extensible and easier to maintain. It should be remembered that an ontology is a model of the real world, and concepts in an ontology must reflect this reality.

After the initial version of the ontology has been determined, we can evaluate and debug it, using it in some applications and / or discussing it with subject matter experts. As a result, the initial ontology will most likely need to be revised. And this iterative design process will continue throughout the entire life cycle of the ontology.

Reusing existing ontologies may be necessary if the system needs to interact with other applications that are already committed to particular ontologies or controlled vocabularies. Many useful ontologies are already available in electronic form and can be imported. There are libraries of reusable ontologies, such as Ontolingua or DAML. There are also a number of publicly available commercial ontologies, such as UNSPSC, RosettaNet, and DMOZ.

ORGANIZATIONAL ONTOLOGIES AND KNOWLEDGE PORTALS

Despite the fact that many ontologies have already been developed, reflecting knowledge about a wide variety of objects, when describing specific subjects of economic activity, their specifics must be taken into account and introduced into the corresponding ontological models.

The ontological representation of knowledge about subjects of economic activity, which are part of any system, can be used to combine their information resources into a single information space (Fig. 2).

An enterprise ontology contains classes of concepts with semantic relations assigned to them. It consists of a set of technology ontologies and organizational ontologies. Organizational ontologies reflect the organizational and functional structure of the enterprise: the staffing structure (employees, administration, service personnel), partners, resources, and so on, and the relationships between them. Technology ontologies contain concepts that describe production processes. The general knowledge of the SbA to which the subjects of economic activity belong is captured by an industry ontology.

The developed ontologies will allow employees of the same industry or corporation to use common terminology and avoid mutual misunderstandings that can complicate cooperation and lead to serious losses. For example, an organizational ontology clearly reflects the hierarchy of and connections between enterprise departments, as well as their areas of competence, and links to specific regulations provide a common basis for negotiations. The ontologies will also support work with structured data sources, that is, sources for which a data schema can be built (data types and the relationships between them are described) and for which there is a formal way to obtain individual data items. Examples of structured data sources include various databases (for example, relational and object databases), as well as loosely structured resources described in XML, RDF, OWL, and DAML + OIL formats.

As an example of the practical use of ontological models of technologies, consider the ONTOLOGIC system, designed to create and support distributed systems of normative and reference information (RDI), to maintain dictionaries, reference books, and classifiers, and to support the coding system for accounting objects (see Fig. 3).

The basis of the system is a technological environment for continuous, real-time interaction between users: information consumers (employees of services and functional units) and the experts responsible for maintaining reference information.

To ensure unambiguous identification and classification of objects in reference data systems, a technique has been developed that uses an ontological model for the formal description of classified data. The model identifies the key properties of the objects being classified and builds a classification code from them. Classes (groups of homogeneous products) are distinguished by the homogeneity of a set of technical and consumer characteristics, and for each material a classification code is formed that includes the class code and the codes of all properties and their values for that material.
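How a classification code might be assembled from a class code and property codes can be sketched as follows. The code format, the separator, and all codes and property names here are hypothetical; the article does not describe the actual encoding used by ONTOLOGIC.

```python
def classification_code(class_code, properties, property_codes, value_codes):
    """Join the class code with one coded segment per property value."""
    parts = [class_code]
    # A fixed (sorted) property order gives the same code for the same material,
    # regardless of the order in which its properties were entered.
    for prop in sorted(properties):
        value = properties[prop]
        parts.append(f"{property_codes[prop]}{value_codes[prop][value]}")
    return "-".join(parts)


# Hypothetical code tables for one class of homogeneous products.
PROPERTY_CODES = {"material": "M", "diameter_mm": "D"}
VALUE_CODES = {
    "material": {"steel": "01", "copper": "02"},
    "diameter_mm": {50: "050", 100: "100"},
}

code = classification_code(
    "0412",
    {"material": "steel", "diameter_mm": 50},
    PROPERTY_CODES,
    VALUE_CODES,
)
```

With these tables, `code` is `"0412-D050-M01"`. Two suppliers that name the same pipe differently still produce the same code as long as they agree on the class and property values, which is the point of identifying resources by properties rather than by names.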

Ontology provides a consistent accumulation of any amount of information in a standard classification structure. This approach guarantees unambiguous identification of resources regardless of different interpretations of their names by different manufacturers.

This technology provides for the creation of a standard solution for managing master data and reference data for industrial enterprises, holdings and government agencies. As a technological platform, SAP MDM (Master Data Management) is used, designed to integrate various (including multi-platform) applications across a company, holding, industry, government, etc., as well as for organizing and managing industry or corporate regulatory reference information (master data).

EXAMPLES OF ONTOLOGY APPLICATION

TOVE (Toronto Virtual Enterprise). The goal of the project is to create a data model that should:

  • provide a common terminology for the subject area that its applications can share and that every participant in communication can understand;
  • give a precise and, as far as possible, consistent definition of the meaning of each term based on first-order logic;
  • define the semantics through a set of axioms that make it possible to answer many questions about the subject area automatically.

TOVE should provide the construction of an integrated model of a certain subject area, consisting of the following ontologies: operations, states and time, organization, resources, products, service, production, price, quantity.

Ontolingua is a system developed at Stanford University that provides a distributed collaborative environment for viewing, creating, editing, modifying, and using ontologies. The system server supports up to 150 active users, some of whom supplement the system with a description of their projects.

Among many other projects, the Enterprise project uses Ontolingua.

Enterprise Project. The aim of the project is to improve (and, where necessary, replace) existing modeling techniques with a set of tools that integrate various enterprise modeling techniques and tools. The planned tools are to support: capture and description of a specific subject area; definition of tasks and requirements (consistent with the ontology); identification and assessment of solutions and alternative designs; and implementation of the chosen strategy.

Independently developed tools may use different terminology, which can lead to conflicts and ambiguity when they are integrated. To solve this problem, an ontology was built that defines a set of frequently used and generally accepted terms, such as activity, process, organization, strategy, and marketing.

KACTUS. The goal of the project is to build a methodology for reusing knowledge about technical systems during their life cycle. It is necessary to use the same knowledge bases for design, evaluation, operation, maintenance, redesign and training.

KACTUS supports an integrated approach that includes manufacturing, engineering and knowledge engineering methods by creating an ontological and computational framework for reusing the acquired knowledge in parallel with various applications in the technical field. This is achieved by building a domain ontology and reusing it in various application domains. In addition, an attempt is made to combine these ontologies with existing standards (for example, STEP), using ontologies wherever it is possible to record data about a specific area.

The main formalism in KACTUS is CML (Conceptual Modeling Language).

The KACTUS toolkit is an interactive environment in which you can experiment with theoretical results (organize ontology libraries, transform data between ontologies, make transformations for various formalisms), as well as carry out practical actions (viewing, editing and refining an ontology in different formalisms).

OntoSeek is an information retrieval system that is designed for semantically oriented information retrieval by combining an ontology driven meaning matching engine and powerful modeling systems.

SHOE (Simple HTML Ontology Extensions) allows authors to annotate their Web pages with semantic content. The main component of SHOE is an ontology, which contains information about a certain area. Using this information, search and query tools provide a more relevant query response than existing search engines by providing the ability to include knowledge in Web pages that intelligent agents can actually read. To do this, SHOE supplements HTML with a set of special tags to represent knowledge. SHOE allows you to find knowledge using taxonomy and inference rules that exist in an ontology.

Plinus. The aim of the project is the semi-automatic extraction of knowledge from natural language texts, in particular, literature on the mechanical properties of ceramic materials. Since the texts cover a wide range of concepts, many integrated ontologies are required to cover concepts such as ceramic materials and their properties, how they are processed, various material defects such as cracks and pores. Ontology defines the language with which the semantic portion of the vocabulary is expressed.

CONCLUSION

The activities of individuals and organizations increasingly depend on the information they have and on their ability to use it effectively (to extract knowledge). Yet some groups of people involved in information processing use special terms that other organizations use in a different context, while different organizations often use different designations for the same concepts.

All this greatly complicates mutual understanding. It is therefore necessary to develop formalized knowledge representation models that ensure information processing at the semantic level in knowledge management systems (KMS).

Currently, industrial companies show significant interest in KMS, being aware of the high applied potential of knowledge-based systems for solving a number of practical problems of an enterprise (organization). Knowledge management issues are becoming crucial for a developing economy, where knowledge is capitalized and therefore acquires a completely different status.

Ontologies play a decisive role in the knowledge description model; without them, according to experts, there is no entry into any subject area. Designing an ontology is a creative process, so the potential applications of the ontology, as well as the developer's understanding of and perspective on the domain, will undoubtedly influence design decisions.

    Gladun Anatoly Yasonovich, Candidate of Technical Sciences, senior researcher, International Scientific Research Center of Information Technologies and Systems of NASU.

    Rogushina Yulia Vitalievna, Candidate of Physical and Mathematical Sciences, senior researcher, Institute of Program Systems, National Academy of Sciences of Ukraine.

A short answer to the exam question for the FIS course - artificial intelligence systems (all questions).

An ontology is a formal specification of a shared conceptual model.

O = (C, R, A), where

  • O is an ontology,
  • C is the set of concepts of the subject area,
  • R is the set of relations between them,
  • A is the set of axioms (laws and rules that describe the principles governing the existence of concepts).

Classification of ontologies

In terms of the depth of elaboration, all ontologies are divided into:

  • "heavyweight" (heavy-weighted) ontologies, which contain axioms (C, R, A);
  • "lightweight" (light-weighted) ontologies, which do not contain them (C, R).

By the level of generalization, the following 4 categories of ontologies can be distinguished:

  1. Representation ontologies describe a conceptual model that is the basis of the knowledge representation formalism.
  2. General ontologies are similar to domain ontologies, but the concepts they describe are common to several domains. Typically, such ontologies describe concepts such as state, event, process, action, component.
  3. A domain ontology expresses a conceptualization corresponding to a specific domain.
  4. An application ontology contains all the descriptions needed to model the knowledge required for a particular application. Typically, an application ontology is a combination of concepts taken from a domain ontology and a general ontology, and it may contain extensions specific to the methods used and the problems to be solved.

Formal Ontology Model

O = <X, R, Φ>, where

  • X is a finite set of concepts of the subject area,
  • R is a finite set of relationships between concepts,
  • Φ is a finite set of interpretation functions defined on the concepts and/or relations of the ontology.

X must be finite and non-empty. R and Φ are finite, but may be empty.

Let R = ∅ and Φ = ∅. Then the ontology O degenerates into a simple dictionary:

O = V = <X, {}, {}>.

In the case R = ∅, Φ ≠ ∅, each element of X can be associated with an interpretation function f from Φ.

X = X1 ∪ X2, where

  • X1 is the set of interpreted terms,
  • X2 is the set of interpreting terms.
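The two degenerate cases can be checked mechanically. In the sketch below, an interpretation function is modeled, purely as an assumption, as a mapping from an interpreted term to the set of terms that define it; the function names and the sample terms are hypothetical.

```python
def is_simple_dictionary(relations, phi):
    """True when both R and the set of interpretation functions are empty."""
    return not relations and not phi


def split_terms(phi):
    """Return (X1, X2): interpreted terms and the terms that interpret them."""
    x1 = set(phi)
    x2 = set().union(*phi.values()) if phi else set()
    return x1, x2


# Case R empty, interpretation non-empty: one term defined through two others.
phi = {"bachelor": {"man", "unmarried"}}
x1, x2 = split_terms(phi)
concepts = x1 | x2
```

Here `x1` is `{"bachelor"}`, `x2` is `{"man", "unmarried"}`, and their union gives the full concept set X. With an empty `phi` and no relations, `is_simple_dictionary` returns `True`, which corresponds to the plain dictionary V.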

The concept of substance in ontological systems. The concept of substance and being. The search for the substantial basis of being in the history of philosophy. Substance as a self-determined basis of existential processes. General idea of the relationship between spirit and matter, soul and body. Substance, spirit and mind. Categories "absolute", "relative", "general", "singular", "essence" and "phenomenon" for solving the question of the relationship between substance and forms of its manifestation. Materialism and idealism on the nature of consciousness and thinking and their relationship to matter.

Materialistic substantialism. Varieties of materialistic ontology. Sensual-material Cosmos as the main feature of ancient natural philosophy. Dialectical materialism as one of the variants of materialistic substantialism and its place in modern philosophy. Understanding of matter as an objective reality and as a substance of all processes in the world. The principle of the materialistic unity of the world. Science and materialistic philosophy. Modern ideas about the structure of matter, substance and fields. The hierarchy of material systems in the world. Structural infinity and eternity of matter as a substance. Universal attributes of matter. The relationship between the general and specific properties of matter. Structural levels of matter and forms of its systemic organization. Methods for revealing the universal properties of matter and proof of their universality. Interaction and motion as attributes of matter. The relationship of interaction and connection. Types of relationships in the world. Asymmetry of causal relationships in irreversible changes. The problem of the spread of connections and interactions in space and time. Is the world infinite or is it a coherent whole entity, an integral system? Interaction and autonomy of material systems. The main forms of motion of matter and the criteria for their classification. The relationship between living and inanimate nature.

Idealistic substantialism. Varieties of idealistic substantialism in the history of philosophy. The idea of the universalism of the world and the sensory-perceiving Cosmos in ancient philosophy. Antique idealism. Religious and philosophical models of idealistic substantialism. Features of the construction of an ontological system in logical idealism. Spiritually ideal beginnings of life. The ratio of the ideal and the material in the idealistic interpretation. Attributes of an ideal substance: consciousness, goal-setting, freedom, creativity. Consciousness as the ideal substantial basis of the world. The concept of eidos as a cause-and-purpose construction of the world, as a self-thinking creature in ancient philosophy. The ancient concept of the Cosmos as a "world subject". The absolute spirit in Hegel's philosophy. The concept of the world cosmic mind. The concept of God in the history of religion and philosophy as the ideal substantial basis of the world. Logos and God.



Creationist variants of ontology. The relationship between God and the World in the ontological systems of the Middle Ages. Reason and will. Divine spirit and human soul. Development of ideas about the soul. The soul as the bearer of consciousness and the entire spiritual world of a person. Spirituality concept. Spirituality and religiosity. Ideal-semantic content of consciousness and its ontological status. Achievements and limitations of idealistic ontology.

Personalistic substantialism. Man as a microcosm in the philosophy of the Renaissance. The values of human existence and the place of Man in the Cosmos. Creativity as the main sign of a person's special place in the world. Leibniz's monadology and ideal-realism of N.O. Lossky. Dynamic understanding of matter. Anthropic principle in cosmology. A cosmic approach to man and consciousness. Features of ontological searches in Russian philosophy.

The crisis of ontologism and anti-substantialist models of philosophy. The crisis of ontologism in the history of philosophy, the thesis about the "death of metaphysics" (premises, motives, declarations and arguments). Being and Consciousness: the Problem of Correspondence of Philosophical Ontological Constructions to Objective Reality. Ontological picture of the world, the real world and the individual. Constructive and creative activity of the human "I" and criticism of ontologism.

Ontological models in modern philosophy. Metaphysics rehabilitation programs and "new ontology" projects. Hierarchical models of ontology: Being as a set of forms of motion of matter by F. Engels. The layers of being N. Hartmann. E. Husserl's regional ontologies. The problem of identifying regional ontologies: the ontology of society. Ontology of consciousness and self-awareness. Ontology of language. Ontology of personal existence (existence). Ontology of corporeality. Ontology of culture. Variants of existential metaphysics: M. Heidegger's fundamental ontology. The world of transcendental being K. Jaspers.

Dialectical-materialistic model of ontology. Materialistic solution to the fundamental question of philosophy. The concept of matter as an objective reality. Structural levels of being.

The problem of typologization of ontological models. Monistic, pluralistic and dualistic ontologies. Essentialist and anti-essentialist ontologies. Hierarchical and non-hierarchical ontological constructions. Natural Philosophical Models. Theistic models. Existential-anthropological models. Phenomenological and hermeneutic models.

Being and development

The problem of movement in the history of philosophy. Correlation of movement, change and development. Basic properties of movement. Philosophical models of development: creationism, emanation theory, preformism, emergentism, evolutionism. Variety of forms of movement and structural levels of being. Changing and unchanging being. The problem of the universality of movement. Paradoxes of movement.

Development and emergence of new forms of being. Development and dialectics. Dialectical concepts of development. Their structure, laws, principles, basic concepts. The paradox of the emergence of the new. The problem of the relationship between the actual and the potential in development. Non-linear development. Laws and categories of development.

Types of dialectics. Source, mechanism and direction of development. Philosophical laws describing the development of the world (G.W.F. Hegel, K. Marx, dialectical materialism). The law of unity, interaction and struggle of opposites. The law of mutual transition of quantitative and qualitative changes. The law of dialectical negation.

Modern views on the evolution of man, society and the Universe. Man, nature, space. The phenomenon of life and its place in the Universe. The problem of other forms of life in the Universe and the hypothesis of the uniqueness of the human mind (V. Shklovsky). The global crisis of the technogenic-consumer civilization and the concept of the noosphere. Features of the anthropocosmic turn in modern science and culture.

Man as a "bio-logistic" being.

"Logos" component of a person. Man as presence. The concept of "cultural machines". The main phenomena of human existence. Man as a "symbolic" being. The structure of the "symbolic space". Historical types of mentality. Transcendental conditions for the generation of symbols: declarativeness and human capacity for synthetic acts. The human right to make mistakes. Progress and aggravation of global problems of mankind. Synergetics and self-organization processes in open nonlinear systems. Global evolutionism in the structure of modern consciousness. Self-organization processes in open nonlinear systems. Synergetics and its basic concepts (attractors, bifurcation points, fluctuations, fractals). Global evolutionism.

The role of information in development processes. Changing the system of communication means in the modern world as the most important condition for accelerating the pace of development.

Kursk 2007


BBK

Published by decision of the editorial board of Kursk State University

Reviewer -

: Study guide for university students. - Kursk: Publishing house of Kursk State University, 200. - 84 p.

The training manual is devoted to the most promising approach to modeling subject areas - ontological. The basic concepts, definitions, methodology of development and construction of ontologies are considered on the example of the educational knowledge base "Animal World". One of the means for constructing ontologies, Protégé, is considered.

It is intended for senior students studying in the specialty …… .. software and administration of information systems.


Introduction

1. Theoretical aspects of building ontologies

1.1. Definition of ontology

1.2. Models of ontology and ontological system

1.3. Application of ontologies

1.4. Ontology engineering tools

2. Creation of a domain ontology in Protégé

2.1. Preliminary remarks

2.2. Basic information about Protégé

2.3. Creation of a domain ontology in Protégé

3. Semester assignment

The order of the project

Literature


Introduction

An expert system is a collection of three interdependent "modules": a knowledge base, an inference engine, and a user interface. The inference engine and the interface are usually combined and called the shell of the expert system, so one can speak of two components: the shell and the knowledge base. Of these, the knowledge base is by far the more important. The problem of finding an adequate method of modeling the subject area and, as a consequence, of formalizing knowledge for subsequent entry into the knowledge base is, if not central, then at least important in the theory of artificial intelligence.



There are many methods of representing knowledge. These are well-known logical and frame methods, as well as semantic networks and production rules. When creating knowledge-based systems (expert systems are undoubtedly one of them), various ways of representing knowledge are used.

Each of these methods has advantages and disadvantages. At present, the use of an ontology as the knowledge base of a knowledge-based system is of considerable interest. Note that some of the literature identifies the knowledge base with the ontology. Generally speaking, there is no unambiguous definition of a domain ontology; the ontology is often defined in whatever way suits the developer at the moment. This and some other interesting problems related to ontologies, as well as issues of their technical implementation, are considered in this tutorial.

Theoretical aspects of building ontologies

Defining an ontology

As noted earlier, knowledge representation is an important issue in artificial intelligence. The term "knowledge representation" can mean either a way of coding knowledge in a knowledge base, or a formal system that is used to formalize knowledge.

The practice of developing knowledge-based systems for complex subject areas and tasks has shown that in each subject area there is a certain structure that occupies an intermediate position between the knowledge representation used in the subject area model and the subject area model (knowledge base).

This structure is called "domain ontology".

In philosophy, ontology is the doctrine of being, of what exists, in contrast to epistemology, the doctrine of knowledge. From another point of view, an ontology is knowledge formally represented on the basis of a conceptualization. A conceptualization involves the description of a set of objects and concepts, knowledge about them and the relationships between them.

An ontology is an explicit specification of a conceptualization. Formally, an ontology consists of terms organized into a taxonomy, their definitions and attributes, as well as associated axioms and inference rules.

In the simplest case, an ontology describes only a hierarchy of concepts related by categorization relations. In more complex cases, suitable axioms are added to it to express other relationships between concepts and in order to limit their intended interpretation.

With this in mind, an ontology is a knowledge base describing facts that are assumed to be always true within a certain community based on the generally accepted meaning of the vocabulary used.

Let's highlight the following interpretations of this term:

1. Ontology as a philosophical discipline.

2. Ontology as an informal conceptual system.

3. Ontology as a formal view of semantics.

4. Ontology as a "conceptualization" specification.

5. Ontology as a representation of a conceptual system through a logical theory, characterized by:

o special formal properties or

o only by its purpose

6. Ontology as a vocabulary used by logical theory.

7. Ontology as a (metalevel) specification of logical theory.

Within the first interpretation, ontology means the philosophical discipline that studies the nature and organization of existence.

According to the second interpretation, an ontology is a conceptual system that can act as the basis of a certain knowledge base. According to interpretation 3, the ontology on the basis of which the knowledge base is built is expressed in terms of suitable formal structures at the semantic level. Thus, these two interpretations view ontology as a conceptual “semantic” entity, whether formal or informal, while Interpretations 5-7 treat ontology as a special “syntactic” object. The fourth interpretation is one of the most problematic, since its exact meaning depends on the understanding of the terms “specification” and “conceptualization”.

The first of the approaches to defining the concept of "domain ontology", conventionally called humanitarian, involves definitions in intuitively understood terms. The second approach to defining the concept of ontology is conventionally called computer. Within this approach, computer languages are developed to represent ontologies.

The main advantage of the computer approach is the formality of the proposed means for describing ontologies. However, the definition of the concept of domain ontology within this approach does not clarify the substantive essence of this concept; on the contrary, it obscures this essence with numerous technical details related to computer implementation, and does not distinguish it from other concepts, in particular from the concept of a domain model (knowledge base).

Within the framework of the third, mathematical approach, attempts are made to define the concept of ontology in mathematical terms or with the help of mathematical constructions.

Ontology is a logical theory that limits the valid models of a logical language. Ontology in this case must provide axioms that limit the meaning of non-logical symbols (predicates and functions) of a logical language used as "primitives" for certain presentation purposes. The purpose of an ontology is to characterize conceptualization by limiting the possible interpretations of non-logical symbols of a logical language in order to establish a consensus on how to describe knowledge using that language. Conceptualization is seen as a set of informal rules that constrain the structure of a piece of reality.

So, the subject area ontology is understood as:

1. A domain ontology is that part of the knowledge of the subject area which is assumed to be invariable. The rest of the domain knowledge is assumed to be changeable, but must remain consistent with the domain ontology.

2. A domain ontology is that part of the domain knowledge that constrains the meanings of the terms of the domain. The meanings of the terms do not depend on the remaining (changeable) part of the knowledge of the subject area.

3. A domain ontology is a set of conventions about a domain; the other part of the domain knowledge is the set of empirical and other laws of this domain. The ontology determines the degree to which experts in the subject area agree on the meanings of terms.

4. A domain ontology is an explicitly specified external approximation of an implicitly specified conceptualization. A conceptualization is a subset of the set of all situations that can be represented. The set of situations corresponding to the knowledge base is a subset of the conceptualization. This subset is some approximation of the set of situations that are possible in reality.

In what follows, for definiteness, we will assume that an ontology is a formal explicit description of the concepts in the domain under consideration (classes, sometimes called concepts), of the properties of each concept that describe its various features and attributes (slots, sometimes called roles or properties), and of the restrictions imposed on slots (facets, sometimes called role constraints). An ontology, together with a set of individual class instances, forms a knowledge base.
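The working definition above can be illustrated with a minimal sketch. The names `Slot`, `OntologyClass` and `Instance`, the single cardinality facet, and the animal data are all assumptions made for illustration; they are not the data model of Protégé or of any other tool.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class, with facets restricting its values."""
    name: str
    value_type: type
    max_cardinality: int = 1  # facet: how many values the slot may hold


@dataclass
class OntologyClass:
    """A concept of the domain together with its slots."""
    name: str
    slots: dict[str, Slot] = field(default_factory=dict)

    def add_slot(self, slot: Slot) -> None:
        self.slots[slot.name] = slot


@dataclass
class Instance:
    """An individual that belongs to a class and fills its slots."""
    of_class: OntologyClass
    values: dict[str, list] = field(default_factory=dict)

    def violations(self) -> list[str]:
        """Return a description of every facet the instance violates."""
        problems = []
        for slot_name, vals in self.values.items():
            slot = self.of_class.slots.get(slot_name)
            if slot is None:
                problems.append(f"unknown slot: {slot_name}")
                continue
            if len(vals) > slot.max_cardinality:
                problems.append(f"too many values for {slot_name}")
            if any(not isinstance(v, slot.value_type) for v in vals):
                problems.append(f"wrong value type for {slot_name}")
        return problems


# The ontology: one class with two slots (hypothetical example data).
animal = OntologyClass("Animal")
animal.add_slot(Slot("legs", int))
animal.add_slot(Slot("habitat", str, max_cardinality=3))

# The knowledge base: the ontology plus individual instances.
wolf = Instance(animal, {"legs": [4], "habitat": ["forest", "steppe"]})
broken = Instance(animal, {"legs": [4, 6], "diet": ["meat"]})
```

Here `wolf.violations()` is empty, while `broken` breaks the cardinality facet of `legs` and uses a slot the class does not declare.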

Here are some reasons for developing ontologies. Ontologies are needed for:

· sharing a common understanding of the structure of information among people or software agents;

· enabling reuse of knowledge in the subject area;

· making domain assumptions explicit;

· separating knowledge of the subject area from operational knowledge;

· analyzing knowledge in the subject area.

Sharing a common understanding of the structure of information among people or software agents is one of the most common goals in ontology development. For example, suppose several different websites contain information on medicine or offer medical services that are paid for over the Internet. If these websites share and publish the same underlying ontology of the terms they all use, then computer agents can extract information from these different sites and aggregate it. Agents can use the aggregated information to answer user queries or as input to other applications.
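As a rough sketch of this scenario, assume two hypothetical sites publish records whose keys come from one shared vocabulary. The term names, the records and the helper functions below are invented for illustration only.

```python
# Terms of a shared (hypothetical) ontology of medical services.
SHARED_TERMS = {"service", "price", "clinic"}

# Records published by two hypothetical sites using those terms.
site_a = [{"service": "x-ray", "price": 40, "clinic": "A"}]
site_b = [
    {"service": "x-ray", "price": 35, "clinic": "B"},
    {"service": "blood test", "price": 15, "clinic": "B", "promo": "yes"},
]


def collect(*sites):
    """Merge records from several sites, keeping only shared-ontology terms."""
    merged = []
    for site in sites:
        for record in site:
            merged.append({k: v for k, v in record.items() if k in SHARED_TERMS})
    return merged


def cheapest(records, service):
    """Answer a user request: which clinic offers the service at the lowest price?"""
    offers = [r for r in records if r["service"] == service]
    return min(offers, key=lambda r: r["price"])["clinic"] if offers else None


all_offers = collect(site_a, site_b)
```

Because both sites use the same terms, the agent needs no site-specific parsing logic: `cheapest(all_offers, "x-ray")` compares offers from both sources directly.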

Enabling reuse of domain knowledge has been one of the driving forces behind the recent surge in ontology research. For example, models of many different subject areas need to represent the notion of time. This representation includes the notions of time intervals, points in time, relative measures of time, etc. If one group of researchers develops such an ontology in detail, others can simply reuse it in their own subject areas. In addition, if we need to build a large ontology, we can integrate several existing ontologies that describe parts of the large domain. It is also possible to reuse a general ontology such as UNSPSC and extend it to describe the domain of interest.

Making explicit the domain assumptions that underlie an implementation makes it easy to change these assumptions when our knowledge of the domain changes. Hard-coding assumptions about the world in a programming language makes them not only hard to find and understand, but also hard to change, especially for someone without programming skills. In addition, explicit specifications of domain knowledge are useful for new users who need to learn the meanings of domain terms.

Separating domain knowledge from operational knowledge is another common use of ontologies. We can describe the task of configuring a product from its components according to a required specification and implement a program that performs this configuration independently of the product and the components themselves. We can then develop an ontology of computer components and their characteristics and apply the algorithm to configure made-to-order computers. We can also use the same algorithm to configure elevators if we supply it with an ontology of elevator components.
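The separation can be sketched as follows. The function `configure` holds the operational knowledge; the two component lists stand in for domain ontologies. The selection rule (cheapest component with the required features) and all data are assumptions chosen to keep the example short.

```python
def configure(components, requirements):
    """Pick, for every required role, the cheapest component that fits.

    components: list of dicts with keys "name", "role", "price", "features"
    requirements: dict mapping a role to the set of features it must have
    """
    chosen = {}
    for role, needed in requirements.items():
        candidates = [
            c for c in components
            if c["role"] == role and needed <= c["features"]
        ]
        if not candidates:
            raise ValueError(f"no component satisfies role {role!r}")
        chosen[role] = min(candidates, key=lambda c: c["price"])["name"]
    return chosen


# Domain knowledge 1: a hypothetical description of computer components.
computer_parts = [
    {"name": "cpu-a", "role": "processor", "price": 200, "features": {"64bit"}},
    {"name": "cpu-b", "role": "processor", "price": 120, "features": set()},
    {"name": "ram-a", "role": "memory", "price": 80, "features": {"ecc"}},
]

# Domain knowledge 2: a hypothetical description of elevator components.
elevator_parts = [
    {"name": "motor-x", "role": "drive", "price": 900, "features": {"quiet"}},
    {"name": "door-y", "role": "door", "price": 300, "features": {"automatic"}},
]

pc = configure(computer_parts, {"processor": {"64bit"}, "memory": {"ecc"}})
lift = configure(elevator_parts, {"drive": set(), "door": {"automatic"}})
```

The same `configure` function serves both domains; only the data passed to it changes.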

Analyzing knowledge in the subject area becomes possible once a declarative specification of the terms is available. Formal analysis of terms is extremely valuable both when trying to reuse existing ontologies and when extending them.

The question often arises about the difference between an ontology and a database. Let's point out the main differences between them.

The result of a database query is usually a collection of instance data and links to text documents, while the result of an ontology query may include elements of the ontology itself (for example, all subclasses of a certain class).
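The difference between the two kinds of query can be sketched on a toy hierarchy. The class names, the instance table and both functions are hypothetical.

```python
# Hypothetical class hierarchy: each class maps to its direct superclass.
SUPERCLASS = {
    "Mammal": "Animal",
    "Bird": "Animal",
    "Wolf": "Mammal",
    "Fox": "Mammal",
    "Eagle": "Bird",
}

# Instance data, as a database table would hold it.
INSTANCES = {"Akela": "Wolf", "Reynard": "Fox", "Sam": "Eagle"}


def subclasses(cls):
    """All direct and indirect subclasses of cls (elements of the ontology)."""
    direct = [c for c, parent in SUPERCLASS.items() if parent == cls]
    result = set(direct)
    for child in direct:
        result |= subclasses(child)
    return result


def instances_of(cls):
    """Instances of cls or of any of its subclasses (instance data)."""
    classes = subclasses(cls) | {cls}
    return {name for name, c in INSTANCES.items() if c in classes}
```

A call such as `subclasses("Mammal")` returns parts of the ontology itself, whereas `instances_of("Mammal")` returns instance data, which is what a database query would typically yield.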

Ontologies themselves include semantics

Database schemas and catalogs often do not provide explicit semantics for their data. Either the semantics were never specified, or they were specified explicitly at database design time, but the specification did not become part of the database specification and is no longer available. Hence, when working with databases, we need certain protocols to deal with conflicting constraints when the database changes. Ontologies, by contrast, are logical systems that themselves include semantics.

Ontologies are more commonly reused

A database schema defines the structure of a specific database, and other databases and schemas rarely reuse or extend it directly. The schema is part of an integrated system and is rarely used apart from it. With ontologies, the situation is exactly the opposite: ontologies usually reuse and extend other ontologies, and they are not tied to a specific system.

Ontologies are decentralized in nature

Traditionally, developing and updating a database schema is a centralized process: the developers of the original schema (or employees in the same organization) usually make changes and maintain the schema. At the very least, database schema designers usually know which databases are using their schema. By its very nature, ontology development is a much more decentralized and collaborative process. As a result, there is no centralized control over who uses a particular ontology. It is much more difficult (and maybe impossible) to distribute or synchronize updates: we do not know who is using the ontology, we cannot inform them about the updates, and we cannot assume that they will find out about them themselves. The lack of centralized and synchronized control also makes it difficult (and often impossible) to trace the sequence of operations that transformed one version of an ontology into another.

Ontology information models are richer

In many ontologies, the number of representation primitives is much larger than in a typical database schema. For example, many ontology languages and systems allow the specification of cardinality constraints, inverse properties, transitive properties, disjoint classes, etc. Some languages (for example, DAML+OIL) add primitives for defining new classes as unions or intersections of other classes, as enumerations of their members, or as sets of objects that satisfy a certain constraint.
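The class-defining primitives mentioned here can be imitated, very roughly, with Python sets standing for class extensions. This is only an analogy under that assumption: real ontology languages define such classes intensionally, and the member lists below are invented.

```python
# Extensions of three hypothetical classes, given by enumerating their members.
Predator = {"wolf", "fox", "eagle"}
Mammal = {"wolf", "fox", "hare"}
Bird = {"eagle", "sparrow"}

# New classes defined from existing ones.
PredatoryMammal = Predator & Mammal  # intersection of two classes
WarmBlooded = Mammal | Bird          # union of two classes

# A class defined by a constraint on a property value.
LEGS = {"wolf": 4, "fox": 4, "hare": 4, "eagle": 2, "sparrow": 2}
TwoLegged = {x for x in WarmBlooded if LEGS[x] == 2}
```

A relational schema has no direct counterpart for such definitions; there they would usually be written as queries or views rather than as part of the schema.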

Classes and Instances can be the same

Databases clearly distinguish between schema information and instance information. In many powerful knowledge representation systems, it is difficult to determine where the ontology ends and the instances begin. The use of metaclasses (classes whose instances are themselves classes) in many systems (e.g. Protégé, Ontolingua, RDFS) blurs or erases the line between classes and instances. Metaclasses are sets whose elements are also sets. This means that "instance" and "class" are really just roles of a concept.
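Python happens to support metaclasses as well, which allows a small illustration of a concept playing both roles. The names `Species` and `Wolf` are hypothetical, and this is Python's own mechanism, not the metaclass machinery of Protégé, Ontolingua or RDFS.

```python
class Species(type):
    """A metaclass: every instance of Species is itself a class."""

    def describe(cls):
        return f"{cls.__name__} ({cls.latin_name})"


class Wolf(metaclass=Species):
    latin_name = "Canis lupus"  # a slot value held by the class itself

    def __init__(self, name):
        self.name = name


akela = Wolf("Akela")
```

`Wolf` is a class with respect to `akela` and an instance with respect to `Species`, so `Wolf.describe()` works while `akela.describe()` does not exist.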

Ontology and ontological system models

The concept of ontology implies the definition and use of an interrelated and mutually consistent set of three components: a taxonomy of terms, definitions of terms and rules for their processing. Let us introduce the following definition of the concept of an ontology model:

The formal model of an ontology O is understood as an ordered triple

$O = \langle X, R, F \rangle$,

where

X is a finite set of concepts (notions, terms) of the subject area represented by the ontology O;

R is a finite set of relations between the concepts (notions, terms) of the given subject area;

F is a finite set of interpretation (axiomatization) functions defined on the concepts and/or relations of the ontology O.

The natural constraint imposed on the set X is that it be finite and nonempty. The situation is different with the components F and R in the definition of the ontology O. Clearly, F and R must also be finite sets. Let us indicate the boundary cases associated with their emptiness.

1. Let $R = \varnothing$ and $F = \varnothing$. Then the ontology O is transformed into a simple dictionary:

$O = V = \langle X, \{\}, \{\} \rangle$.

Such a degenerate ontology can be useful for specifying, extending and maintaining subject-area dictionaries, but dictionary ontologies are of limited use, since they do not explicitly introduce the meaning of terms. In some cases, however, when the terms used belong to a very narrow (for example, technical) vocabulary and their meanings are already well agreed in advance within a certain (for example, scientific) community, such ontologies are applied in practice. Well-known examples of this type of ontology are the indexes of Internet search engines.
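The three components X, R and F and the dictionary boundary case can be sketched as a small data structure. The class name `Ontology`, the encoding of relations as triples and the example terms are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Ontology:
    """A triple of finite sets: concepts X, relations R, interpretations F."""
    X: set[str]
    R: set[tuple[str, str, str]] = field(default_factory=set)  # (relation, a, b)
    F: dict[str, Callable] = field(default_factory=dict)       # name -> function

    def __post_init__(self):
        if not self.X:
            raise ValueError("the set of concepts X must be nonempty")
        for _, a, b in self.R:
            if a not in self.X or b not in self.X:
                raise ValueError("relations must link concepts from X")

    def is_simple_dictionary(self) -> bool:
        """Boundary case: no relations and no interpretation functions."""
        return not self.R and not self.F


vocabulary = Ontology({"wolf", "fox"})
animals = Ontology({"wolf", "animal"}, {("is_a", "wolf", "animal")})
```

`vocabulary` is a simple dictionary in the sense above, while `animals` is not, because its set of relations is nonempty.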

2. Let $R = \varnothing$ and $F \neq \varnothing$. Then each element of the set of terms X can be associated with an interpretation function f from F. Formally, this statement can be written as follows. Let

$X = X_1 \cup X_2, \quad X_1 \cap X_2 = \varnothing$,

where $X_1$ is the set of interpreted terms;

$X_2$ is the set of interpreting terms.

Then $\exists\, (x \in X_1;\ y_1, y_2, \ldots, y_k \in X_2)$ such that $x = f(y_1, y_2, \ldots, y_k)$, where $f \in F$.

The emptiness of the intersection of the sets $X_1$ and $X_2$ excludes cyclical interpretations, and the introduction of a function of k arguments is intended to provide a more complete interpretation. The kind of mapping f from F determines the expressive power and practical usefulness of this kind of ontology. If the interpretation function is specified by the value assignment operator ($X_1 := X_2$, where $X_1$ is the name of the interpretation $X_2$), then the ontology is transformed into a passive dictionary:

$O = V_P = \langle X_1 \cup X_2, \{\}, \{:=\} \rangle$.

Such a dictionary is passive, since all definitions of terms from $X_1$ are taken from the already existing and fixed set $X_2$. Its practical value is higher than that of a simple dictionary, but is clearly insufficient, for example, to represent knowledge in information processing tasks on the Internet, because of the dynamic nature of this environment.

In order to take the latter circumstance into account, let us assume that some of the interpreting terms from the set $X_2$ are specified procedurally rather than declaratively, and are computed each time a term from the set $X_1$ is interpreted. In this case, the ontology is transformed into an active dictionary:

$O = V_A = \langle X_1 \cup X_2, \{\}, F \rangle$,

where, as before, $X_1 \cap X_2 = \varnothing$.

The value of such a dictionary for information processing tasks in the Internet environment is higher than that of the previous model, but still insufficient, since the interpreted elements from $X_1$ are not related to each other in any way and, therefore, only play the role of entry keys into the ontology.
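The difference between declarative and procedural interpretation can be sketched as follows. The dictionary names `X1` and `X2` mirror the interpreted and interpreting terms; the entries and the function `interpret` are illustrative assumptions.

```python
# X2: interpreting terms. Declarative entries are fixed values;
# procedural entries are callables evaluated on every interpretation.
X2 = {
    "def_wolf": "a large wild canine",
    "def_size": lambda: len(X1),  # procedural: recomputed each time
}

# X1: interpreted terms, each assigned one interpreting term from X2.
X1 = {"wolf": "def_wolf", "vocabulary_size": "def_size"}


def interpret(term):
    """Return the meaning of an interpreted term from X1."""
    meaning = X2[X1[term]]
    return meaning() if callable(meaning) else meaning
```

With only declarative entries this behaves as a passive dictionary. The procedural entry makes it active: the result of `interpret("vocabulary_size")` changes when new terms are added to `X1`.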

To represent the ontology model needed to solve information processing problems on the Internet, the assumption that the set of relations R is empty obviously has to be dropped.

Let us consider possible options for the formation of a set of relations on the basis of ontology concepts.

Let us introduce a special subclass of ontology, a simple taxonomy, as follows:

$O = T = \langle X, \{\mathit{is\_a}\}, \{\} \rangle$.

A taxonomic structure is a hierarchical system of concepts related to each other by the relation is_a ("to be an element of a class").

The is_a relation has semantics fixed in advance and allows organizing the structure of ontology concepts in the form of a tree.
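A tree-shaped taxonomy of this kind can be sketched with a child-to-parent mapping. The concepts and the two helper functions are hypothetical.

```python
# is_a edges of a hypothetical taxonomy: child -> parent (a tree, one parent each).
IS_A = {
    "wolf": "canine",
    "fox": "canine",
    "canine": "mammal",
    "mammal": "animal",
}


def ancestors(concept):
    """Walk is_a links upward and return the chain of more general concepts."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_a(concept, general):
    """True if `concept` is, directly or through intermediate links, a `general`."""
    return general in ancestors(concept)
```

Because every concept has at most one parent, the structure is a tree and the walk in `ancestors` always terminates at the root.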

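Classification of ontology models

| Model components | $R = \{\}$, $F = \{\}$ | $R = \{\}$, $F = \{:=\}$ | $R = \{\}$, $F \neq \{\}$ | $R = \{\mathit{is\_a}\}$, $F = \{\}$ |
|---|---|---|---|---|
| Formal definition | $\langle X, \{\}, \{\} \rangle$ | $\langle X_1 \cup X_2, \{\}, \{:=\} \rangle$ | $\langle X_1 \cup X_2, \{\}, F \rangle$ | $\langle X, \{\mathit{is\_a}\}, \{\} \rangle$ |
| Explanation | Subject-area dictionary | Passive subject-area dictionary | Active subject-area dictionary | Taxonomy of subject-area concepts |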

A model suited to information processing on the Internet generalizes these special cases so as to provide:

representation of the set of concepts X as a network structure;

use of a sufficiently rich set of relations R, including not only taxonomic relations, but also relations reflecting the specifics of a particular subject area, as well as means of extending the set R;

use of declarative and procedural interpretations and relations, including the ability to define new interpretations.

Let us introduce the concept of an ontological system. A formal model of an ontological system is understood as a triple of the form

$\Sigma^{O} = \langle O^{meta}, \{O^{d\&t}\}, \Xi^{inf} \rangle$,

where $O^{meta}$ is the top-level ontology (metaontology);

$\{O^{d\&t}\}$ is the set of subject ontologies and ontologies of domain tasks;

$\Xi^{inf}$ is the model of the inference engine associated with the ontological system.

The use of an ontology system and a special inference machine makes it possible to solve various problems in such a model. By expanding the system of models, it is possible to take into account the user's preferences, and by changing the model of the inference machine, introduce specialized criteria for the relevance of information obtained in the search process and form special repositories of accumulated data, as well as replenish the used ontologies, if necessary.
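The three-part structure can be sketched as a container holding a metaontology, a set of subject and task ontologies, and a replaceable inference engine. Every name below, including the naive `keyword_search` engine, is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OntologicalSystem:
    """Top-level ontology, subject/task ontologies, and an inference engine."""
    meta: set[str]        # general concepts of the metaontology
    infer: Callable       # inference engine: (system, query) -> answer
    domain_and_task: dict[str, set[str]] = field(default_factory=dict)

    def ask(self, query: str):
        return self.infer(self, query)


def keyword_search(system, query):
    """A deliberately naive engine: names of ontologies containing the query."""
    return sorted(
        name for name, concepts in system.domain_and_task.items()
        if query in concepts
    )


def count_matches(system, query):
    """An alternative engine with a different relevance criterion."""
    return len(keyword_search(system, query))


system = OntologicalSystem(
    meta={"object", "property", "value"},
    infer=keyword_search,
    domain_and_task={
        "zoology": {"wolf", "fox"},
        "search_task": {"query", "document"},
    },
)
```

Replacing `system.infer` with `count_matches` changes how queries are answered without touching the ontologies, and adding entries to `domain_and_task` extends the system without touching the engine.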

The model has three ontological components:

Metaontology;

Subject ontology;

Ontology of tasks.

Metaontology operates with general concepts and relations that do not depend on a specific subject area. Meta-level concepts are general concepts such as "object", "property", "value", etc. At the metaontology level we thus obtain an intensional description of the properties of the subject ontology and of the ontology of tasks. The meta-level ontology is static, which makes efficient inference possible at this level.

A subject ontology contains concepts that describe a specific subject area, relations that are semantically significant for this subject area, and a set of interpretations of these concepts and relations (declarative and procedural). Domain concepts are specific to each applied ontology, whereas relations are more universal. Therefore, relations such as part_of, kind_of, contained_in, member_of, see_also and some others are usually singled out as the basis of the subject ontology model.

The relation part_of is defined on the set of concepts; it is a membership relation and shows that a concept can be part of other concepts. It is a relation of the "part-whole" type, is close in its properties to the relation is_a, and can be specified by corresponding axioms. Other relations of the "part-whole" type can be introduced similarly.
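One axiom commonly associated with part-whole relations is transitivity. The text does not list the axioms, so transitivity is an assumption here, and the facts below are invented; the sketch shows how such an axiom yields derived facts.

```python
# Direct part_of facts of a hypothetical domain: (part, whole).
PART_OF = {("claw", "paw"), ("paw", "leg"), ("leg", "wolf")}


def transitive_closure(pairs):
    """Apply the axiom: part_of(a, b) and part_of(b, c) imply part_of(a, c)."""
    closure = set(pairs)
    while True:
        derived = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if derived <= closure:
            return closure
        closure |= derived


all_parts = transitive_closure(PART_OF)
```

After the closure is computed, `("claw", "wolf")` is among the derived facts even though it was never stated directly.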

The situation is different with the relation see_also. It has different semantics and different properties. Therefore, it is advisable to introduce it not declaratively, but procedurally, just as is done when defining new types in programming languages that support abstract data types.

The ontology of tasks contains, as its concepts, the types of tasks to be solved, and its relations, as a rule, specify the decomposition of tasks into subtasks. If the application system solves a single type of task (for example, searching for information relevant to a query), then the ontology of tasks can be described by the dictionary model. Thus, the model of an ontological system makes it possible to describe the ontologies of the different levels necessary for its functioning. The relationship between ontologies is shown in the figure:

In general, the inference engine of an ontological system can rely on a network representation of the ontologies at all levels. Its operation then involves:

activation of the concepts and/or relations that fix the problem to be solved (description of the initial situation);

determination of the target state (situation);

inference over the network, in which activation waves propagate from the nodes of the initial situation using the properties of the relations associated with them. The criterion for stopping the process is reaching the target situation or exceeding the allowed execution time (time-out).
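The three steps above can be sketched as a breadth-first propagation over a small network with both stop criteria. The network, the function name and the choice of breadth-first order are assumptions; the text does not prescribe a particular propagation strategy.

```python
import time
from collections import deque

# A hypothetical ontology network: concept -> directly related concepts.
NETWORK = {
    "wolf": ["canine", "forest"],
    "canine": ["mammal"],
    "forest": ["habitat"],
    "mammal": ["animal"],
    "habitat": [],
    "animal": [],
}


def spread_activation(start_nodes, target, timeout_s=1.0):
    """Propagate activation breadth-first until the target is reached.

    Returns the path to the target, or None if the network is exhausted
    or the time limit is exceeded.
    """
    deadline = time.monotonic() + timeout_s
    frontier = deque([node] for node in start_nodes)
    activated = set(start_nodes)
    while frontier:
        if time.monotonic() > deadline:
            return None  # stop criterion: time-out
        path = frontier.popleft()
        if path[-1] == target:
            return path  # stop criterion: target situation reached
        for neighbour in NETWORK.get(path[-1], []):
            if neighbour not in activated:
                activated.add(neighbour)
                frontier.append(path + [neighbour])
    return None


route = spread_activation(["wolf"], "animal")
```

The initial situation is the list of start nodes, the target situation is a single node, and the function returns the chain of concepts along which activation reached it.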

Application of ontologies

Summarizing the various typifications of ontology, we can distinguish classifications according to:

Degree of dependence on a specific task or subject area;

The level of detail of axiomatization;

The "nature" of the subject area, etc.

In addition to these dimensions, it is possible to introduce classifications related to the development, implementation and maintenance of an ontology.

According to the degree of dependence on a specific task or subject area, the following are usually distinguished:

Top-level ontologies;

Domain-oriented ontologies;

Task-oriented ontologies;

Applied ontologies.

Top-level ontologies describe very general concepts such as space, time, matter, object, event, action, etc., which are independent of a particular problem or area. Therefore, it seems reasonable, at least in theory, to unify them for large user communities.

An example of such a general ontology is CYC®. The project of the same name, CYC®, is focused on creating a multi-context knowledge base and a special inference engine, developed by Cycorp. The main goal of this gigantic project is to build a knowledge base of all general concepts (starting with such concepts as time, entity, etc.), including the semantic structure of terms, the relationships between them and axioms. It is assumed that such a knowledge base can be accessed by a variety of knowledge-based software tools and will play the role of a base of "background knowledge". According to some data, the ontology already contains 10^6 concepts and 10^5 axioms. To represent knowledge within this project, a special language, CycL, has been developed.

Another example of a top-level ontology is the ontology of the Generalized Upper Model system, focused on supporting natural language processing for English, German and Italian. The level of abstraction of this ontology lies between lexical and conceptual knowledge, which is determined by the requirement to simplify interfaces with linguistic resources. The Generalized Upper Model includes a taxonomy organized as a hierarchy of concepts (about 250 concepts) and a separate hierarchy of relationships.

The creation of sufficiently general top-level ontologies is a very serious task that has not yet been satisfactorily solved.

Subject ontologies and task ontologies describe, respectively, the vocabulary associated with a subject area (medicine, commerce, etc.) or with a specific task or activity (diagnostics, sales, etc.) by specializing the terms introduced in the top-level ontology. Examples of domain-specific and task-specific ontologies are TOVE and Plinius, respectively.

The ontology in the TOVE (Toronto Virtual Enterprise) project is focused on representing a model of a corporation. The main goal of its development is to answer users' questions on business process reengineering by extracting knowledge explicitly represented in the ontology. The system can also perform deductive inference of answers. The ontology has no means of integration with other ontologies. Formally, the ontology is described using frames.

Ontologies of some branches of molecular biology have now been built; they offer terminology for defining a variety of chemical elements and for describing processes inside a cell. The TAMBIS ontology (TaO) describes bioinformatics and covers the basic concepts of molecular biology and bioinformatics: macromolecules, their purpose, structure, functions, cellular location and the processes in which they interact. The TaO ontology is built using the OIL language.

There is also an experimental ontology for bioinorganic centers known as COME. COME consists of three types of entities: Molecule (MOL), Bioinorganic Motive (BIM) and Bioinorganic Proteins (PRX).

Ontologies are also built for more narrowly focused areas, such as chemical crystals, ceramic materials, and bioenergy centers. One example is the Chemical-Crystals ontology, which describes the different types of crystalline structure of substances. It is built using the methodology known as METHONTOLOGY.

Another example is the ontology of pure substances. Pure substances are defined in terms of chemical composition, i.e. through structural rules that define pure substances in terms of chemicals and natural numbers. A hierarchical model of the ontology of physical chemistry has also been developed. This modular ontology defines the set of sections of the subject area and the connections between them, describes the system of concepts of each section, and specifies the connections between the concepts of different sections. It consists of eight related sections: "Elements", "Substances", "Reactions", "Fundamentals of Thermodynamics", "Thermodynamics. Chemical properties", "Thermodynamics. Physical properties", "Thermodynamics. Relationship between physical and chemical properties", and "Chemical kinetics". The ontology of this subject area is based on a metaontology, which defines the metaconcepts used in defining the concept system of each section.

Applied ontologies describe concepts that depend both on a specific subject area and on the tasks solved within it. Concepts in such ontologies often correspond to the roles that objects of the subject area play while a certain activity is performed. An example is the ontology of the Plinius system, designed for the semi-automatic extraction of knowledge from texts in the field of chemistry. Unlike the other ontologies mentioned above, it has no explicit taxonomy of concepts. Instead, it defines several sets of atomic concepts (for example, a chemical element, an integer, etc.) and rules for constructing other concepts from them. The ontology describes about 150 concepts and 6 rules. Formally, the Plinius ontology is also described using frames.

Introduction

Ontologies are increasingly used to model the subject areas of automated information systems. Most often this approach is applied to intelligent systems, in particular those designed to operate on the Internet. The reason is that an ontological model makes it possible to develop a metadata model, which makes the system considerably easier for a wide range of users to interact with.

An ontology is a structure that describes the meanings of the elements of a certain system. It is an attempt to structure the surrounding world, to describe a specific subject area in the form of concepts, rules, and statements about these concepts, from which relations, classes, functions, etc. can be formed. Domain ontologies are limited to describing the world within a specific subject area.

Constructing an ontological model of the subject area of an information system that supports the commercialization of the results of innovative scientific research is an urgent and complex scientific and practical task. The complexity stems, in particular, from the many interdisciplinary connections and from the differing goals of the system's end users: scientists, experts, businessmen, politicians, and employees of public and commercial organizations.

The purpose of this work is to develop an ontological model of the subject area of an information system to support the commercialization of scientific research results.

CERIF 2008 at a glance

Scientific research of many kinds is carried out all over the world, and the research workflow is similar across countries. As a rule, strategic planning comes first; then a research program is announced, proposals are solicited, suitable proposals are accepted, and the research results are monitored, analyzed, and subsequently used for various purposes.

Research in the same area of knowledge can be carried out simultaneously in several scientific organizations, including within one country. In addition, in the age of globalization, research organizations in one country can rely on results obtained in other countries. It is therefore important to ensure the exchange of complete and reliable information and data sets between countries and funding bodies at all stages of research, from the application stage to the publication of a review of the innovative development. The problem of standardizing scientific research data arose back in the 1980s. The first solutions were generalized database schemas for storing research results, and the CERIF standard (Common European Research Information Format) later emerged on their basis.

The organization euroCRIS has been actively involved in modeling the subject area of scientific research on the basis of this standard in the European Union for the last 14 years. The main properties of the standard are:
1) the standard supports the concept of objects or entities with attributes, such as a project, a person, or an organization;
2) the standard supports n:m relationships between objects by means of linking relations, and thus provides rich semantics, including roles and temporal characteristics;
3) the standard is fully international in terms of language and character set;
4) the standard is extensible without damaging the core data model, which makes it possible to interoperate at the core level without precluding even richer interaction.

The main objects in the CERIF standard are Person, OrganizationUnit and Project, each of which is recursively linked to itself and maintains relationships with the other objects. The standard also describes many additional objects, which make it possible to describe research projects, their participants, the results of their joint work, etc. in full. Data semantics is specified at a dedicated semantic level, in tables that describe the possible roles and interactions between individual objects.

Relationships between a project, a person, and an organization are expressed in the CERIF standard by special links, which are considered one of the strengths of the CERIF model. A link always connects two objects. All links follow the same scheme: they inherit names and identifiers from the parent objects and additionally carry the start and end dates of the link. Each link expresses its semantics by referring, through special identifiers, to the CERIF semantic layer. Thus all possible relationships between projects, people, and organizations are set using these links, and the nature of each relationship (who is the author of what, who is subordinate to whom, what is part of what, etc.) is given by the semantic layer, in which all these roles are defined.
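The link scheme described above can be illustrated with a minimal Python sketch. It is not the CERIF data model itself: the class names `Entity` and `Link`, the identifier format, and the role identifiers in `SEMANTIC_LAYER` are all invented for illustration. The sketch only shows the idea that a link joins exactly two objects, carries start and end dates, and takes its meaning from a separate semantic layer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A base object such as a Person, OrganizationUnit, or Project."""
    kind: str        # e.g. "Person" or "Project"
    entity_id: str   # hypothetical identifier format


@dataclass(frozen=True)
class Link:
    """A link between exactly two objects, with a role and a validity period."""
    source: Entity
    target: Entity
    role_id: str                 # refers to an entry in the semantic layer
    start: date
    end: Optional[date] = None   # None means the link has no end date yet

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


# Stand-in for the semantic layer: role identifiers mapped to role names.
# Both the identifiers and the names are invented for this example.
SEMANTIC_LAYER = {"role-001": "coordinator", "role-002": "author"}


def roles_on(links, source, target, day):
    """Return the names of the roles linking two objects on a given day."""
    return sorted(
        SEMANTIC_LAYER[link.role_id]
        for link in links
        if link.source == source
        and link.target == target
        and link.active_on(day)
    )
```

Because the role lives in the semantic layer rather than in the link, the same pair of objects can be connected by several links with different roles and different time periods, which is how an n:m relationship with temporal semantics is obtained.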

To represent the results of scientific activity, the CERIF standard provides special objects: ResultPublication, ResultPatent, and ResultProduct (publication, patent, product). In addition to the main and result objects, CERIF also uses many so-called second-level objects, such as FundProg (funding program), Event (event), Prize (prize), Facil (facilities), Equip (equipment), etc. Second-level objects make it possible to represent the context of a study through links with the main and result objects.

The CERIF model supports multilingualism for names, titles, descriptions, keywords, abstracts, and even semantics. The language used is stored in the LangCode attribute, whose values are at most five characters long (for example, en, de, fr, si, en-uk, en-us, fr-fr, fr-be, fr-nl). The Trans attribute records the type of translation: o = original (original language), h = human (human translation), or m = machine (machine translation). In addition to the main, result, and second-level objects, multilingualism is also supported by the classifiers at the semantic level of CERIF, so classification schemes can be maintained in different languages.
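As a rough illustration of the LangCode and Trans attributes, the following Python sketch stores a text together with its language code and translation type. The `LocalizedText` class and the `pick_title` function are hypothetical, and the fallback order (exact language, then base language, then the original text) is an assumption made for this example, not something the standard prescribes.

```python
from dataclasses import dataclass

# Trans values as described in the text: original, human, machine.
TRANS_TYPES = {"o": "original", "h": "human", "m": "machine"}


@dataclass(frozen=True)
class LocalizedText:
    text: str
    lang_code: str   # LangCode, at most five characters, e.g. "en" or "fr-be"
    trans: str       # Trans: "o", "h", or "m"

    def __post_init__(self):
        if not 1 <= len(self.lang_code) <= 5:
            raise ValueError(f"invalid LangCode: {self.lang_code!r}")
        if self.trans not in TRANS_TYPES:
            raise ValueError(f"unknown Trans value: {self.trans!r}")


def pick_title(titles, preferred):
    """Choose a title for the preferred language (assumed fallback order)."""
    base = preferred.split("-")[0]
    matchers = (
        lambda t: t.lang_code == preferred,              # exact match
        lambda t: t.lang_code.split("-")[0] == base,     # same base language
    )
    for matches in matchers:
        found = [t for t in titles if matches(t)]
        if found:
            # Prefer original text, then human, then machine translation.
            return min(found, key=lambda t: "ohm".index(t.trans))
    originals = [t for t in titles if t.trans == "o"]
    return originals[0] if originals else None
```

Keeping the translation type next to each text lets a consumer decide how much to trust a given version, for example by showing a machine translation only when nothing better exists.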

The CERIF standard is recommended for use in CRIS systems (Current Research Information Systems), which bring together all the information underlying scientific research. Such systems greatly facilitate the interaction of investors and researchers. Research teams get easy access to the information they need to develop innovative ideas, leaders and managers can more easily track and evaluate ongoing research activities, and investors and research councils can optimize the financing of innovative projects.

A real example of using the standard is the IST World portal, which is built on the CERIF standard. It provides information on the experts, research groups, centers, and companies involved in creating technologies for the growing information society. The service focuses on the expertise and experience of the main participants in this process in European countries. The repository contains information on projects of the fifth, sixth, and seventh framework programs of the European Commission, as well as information related to these research projects collected in Bulgaria, Cyprus, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Romania, Russia, Serbia, Slovenia, Slovakia, and Turkey.

Russia has no single information system for current scientific research. Attempts to create such systems are fragmented across various programs and projects. In Chernogolovka, a project is being carried out within the Russian Academy of Sciences under the HAAB grant. Its purpose is to create and develop an information system that supports the commercialization of the results of intellectual activity by providing interested legal entities and individuals with data on innovative developments of RAS institutes that may subsequently be commercialized. In this system, innovative developments are understood as information images of intellectual property objects, technical solutions, technological requests, ideas, and other intangible assets obtained as a result of scientific and technical activities.

Analyzing the CERIF standard, we find that it does not cover the subject areas related to the work of experts and the preparation of an innovative development for the commercialization process. Therefore, the authors proposed an extension of the model proposed by this standard to the above subject areas.

The innovation process from a structural point of view is a set of consistently interrelated actions for the creation, development and dissemination of innovation. The innovation process involves an evolutionary change in the state of an innovative product, its transformation from an idea into a product, as well as monitoring its further market fate.

Domain model to support innovation

The subject area of an information system to support the commercialization of research results combines several subject areas: the subject area of scientific research, the subject area of possible implementation areas, and the subject area of experts on the commercialization of innovative developments. The last of these is meant to solve the following problem: dynamically forming the paths of interaction in the many-to-many relation between the first two.

The ontology of the field of research activity is the structure of a system that reflects the process of scientific activity. Scientific research is possible only if complete and reliable information and data sets are available at every stage, from the application to the publication of a development review. Information systems for ongoing research should bring together all the information that underpins scientific research. Such systems can be used by a wide range of people, from researchers to investors. Research organizations can post information about their innovative developments on the Internet and search for proposals from potential investors and customers. Potential investors and customers, in turn, can place R&D orders and investment proposals in the field of high technologies and search for innovative developments.

In the subject area of scientific research, the following main classes can be distinguished (Fig. 1):


Figure 1. The main classes of the subject area for scientific research

Project contains information about projects and research whose result will be innovative developments in one form or another, as well as their timing. Projects can be associated with other projects and with people, organizations, patents, publications, products, and other objects of the system.

Organization contains information about organizations related to projects. It includes a description of the organization: settlement currency, number of employees, turnover, etc. Organizations can also be linked to one another and to other objects in the system.

Person contains information about people involved in scientific projects. People can also be linked to one another and to other objects.

The additional object Names contains information about different spellings of the name of one person, including spellings in different languages.

Publication contains information about research results in the form of publications. It includes the publication's imprint data: release date, edition, series, pages, ISBN, ISSN, summary, comments, etc. Publications can be linked to one another, to other research results, and to other objects of the system: projects, organizations, people, etc.

Patent contains information about patents issued for research results. It includes the country where the patent was issued, the date of registration, and a summary. Patents can be associated with publications, projects, organizations, and people.

Product contains information about products obtained as a result of research, i.e. about innovative developments, together with a description of the product. Products can be associated with publications, projects, people, and organizations.

Additional objects provided by the CERIF standard are also involved in the subsystem: Language serves to display information about the language of data representation in the system, Address - to display information about the physical addresses of people and organizations, Electronic Address - to display information about the email addresses of people and organizations, Country - to display information about countries, Currency - for information about currencies, Funding Program - for information about the program under which the project is being carried out, etc.

The semantic-level objects Class and Classification Scheme are used to characterize types of relations, forms of statements, and classifications of subjects, for example to designate types of publications or types of products.

In the subject area of possible areas of implementation, the following classes can be distinguished (Fig. 2):

Organization contains information about organizations interested in investing in innovative developments and in conducting research and development. It includes a description of the organization: settlement currency, number of employees, turnover, etc.

Person contains information about people employed in organizations, or about individual potential investors. People can be linked to one another and to other objects. The Names object, which contains information about different spellings of the name of one person, also applies to this subject area.

Proposal contains information about proposals from potential investors for research and development, for investments, or for the development of a specific topic. It includes descriptions of the proposals as well as information about their timing. Proposals can be linked to one another, as well as to people, organizations, and other objects of the system.

Patent contains information about patents for developments in which the organization wants to invest.

Product contains information about products of interest to investors.

By analogy with the subject area of scientific research, additional objects can be distinguished in the subject area of possible implementation areas: Language, Address, Electronic Address, Country, Currency, etc.

In the subject area of expert assessment of the possibility of commercializing innovative developments, the following classes can be distinguished (Fig. 3):

Figure 3. The main classes of the subject area of experts.

Person contains information about experts who assess and analyze innovative developments and make decisions about the possibility of their commercialization. The same additional Names object contains information about the different spellings of the name of one person.

Organization contains information about the organizations in which the experts are employed.

Product contains information on the scientific and technical developments that are evaluated by experts.

Separately, the Assessment object can be distinguished; it stores expert opinions on the possibility of commercializing developments.
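One way to picture how the expert area connects the other two is the following minimal Python sketch. The fields of the `Assessment` record and the `interaction_paths` function are assumptions made for illustration; the text only states that Assessment stores expert opinions and that the expert area should form many-to-many paths between research results and implementation areas.

```python
from collections import defaultdict
from typing import NamedTuple


class Assessment(NamedTuple):
    """A hypothetical expert opinion record; all fields are illustrative."""
    expert: str
    development: str   # an item from the scientific research area
    area: str          # an item from the possible implementation areas
    recommended: bool  # whether the expert sees commercialization potential


def interaction_paths(assessments):
    """Map each development to the implementation areas experts recommend."""
    paths = defaultdict(set)
    for item in assessments:
        if item.recommended:
            paths[item.development].add(item.area)
    # Sort for a stable, readable result.
    return {dev: sorted(areas) for dev, areas in paths.items()}
```

In this sketch one development can lead to several implementation areas and one area can be reached from several developments, so the set of positive assessments is what materializes the many-to-many relation.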

By analogy with the subject areas of scientific research and of possible implementation areas, additional objects can be distinguished in the subject area of experts: Language, Address, Electronic Address, Country, Currency, etc.

The general structure that unites all three subsystems fully reflects the process of conducting scientific research and assessing the possibility of their commercialization (Fig. 4).



Figure 4. Subject area of the information system to support the commercialization of scientific research results

Basic principles of building an information system and its users

In the information system to support the commercialization of the results of scientific research of the RAS, three subsystems can be distinguished: the subsystem of scientific research carried out at the institutes of the RAS (the subsystem of institutes), the subsystem of possible implementation areas (the subsystem of potential investors), and the subsystem for expert evaluation of the possibility of commercializing innovative developments (the subsystem of experts). Accordingly, three groups of users can be distinguished, one for each subsystem: intellectual property owners (researchers), investors, and experts.

In the information system, each user who owns an intellectual property object (a researcher) can present information about that object and about his scientific and technical developments, regardless of how complete the development is (patent, solution, idea, etc.). This information takes the form of an aggregate information image of an innovative development, which can include a resume, a technological proposal, information about the owner, etc. In addition, he can add information about the patent protection of his developments and post additional information about them.

Potential investors, R&D customers, or their representatives can place in the system their investment proposals, information about their needs (interests), and orders for R&D or for expert assessment of an innovative development. They can also search for innovative developments and review existing expert assessments of developments.

The system can provide a separate virtual platform for experts, who can develop a questionnaire (arrange a technology audit), analyze business ideas, and evaluate the investment attractiveness of innovative developments.

Each user of the information system, depending on his interests, can search for information objects and related information, select them, and analyze them in order to decide whether further contacts are worthwhile.

A user who is not registered in the system can still take part in its work through guest access. After reviewing the publicly available summaries of innovative developments, investors' proposals, and expert assessments, he can determine whether the system contains developments or research proposals of interest, understand what criteria experts use to assess investment attractiveness, and then decide whether to register and continue working in the information system to support the commercialization of scientific research.

Conclusion

The authors believe that the new result of this work is an ontological model of the subject area of an information system for supporting the life cycle of innovative developments of RAS institutes.

The developed model makes it possible to develop a software architecture for such a system, develop metadata, and build a set of interrelated thesauri to support the semantics of end-user requests.

