"Sources and Techniques" Title Page


Chapter 8: A First Approach to the Study of
Intelligence, Information, and Collection

Section One -- Brief Review of the Development of Information Science

The origin of information science can be traced back to 1945, when Vannevar Bush, Director of the US Office of Scientific Research and Development, published an article entitled "As We May Think." For the first time, the role of scientific information in large-scale R&D was revealed, and a prototype mechanized literature indexing system (the Memex) was introduced. Since then, scientists around the world have paid attention to information. A number of famous scientists gathered in London in 1948 to hold the first information science conference. The meeting was sponsored by the Royal Society and had a great deal of impact. In 1958, an international information science conference was held in Washington DC, sponsored by the National Science Foundation, the National Achieve Society, the American Academy of Sciences and the American Scientific Research Evaluation Committee. The foundation of information science was laid down in these two meetings. During the past 40 years, the development of information as a science can be discussed in three stages.

First phase (1950s-1960s): applied research. Most of the topics dealt with the sharp contradiction involved in the production, supply and utilization of information, exploring theories and methods to look up information, and establishing information organizations that provide the optimal service. In addition, issues related to information users and their demands were also investigated.

Second phase (1970s): application of new technology. Due to rapid advances in electronics and communications, the use of new technology in information was put on the agenda. During this period, the major projects included using mainframe computers to establish domestic literature processing systems; establishing various databases and networks for online search and automation; studying the design, evaluation, interconnection and compatibility of automated information systems; and assessing the impact of modern information technology on conventional intelligence work, as well as its social, economic and political influence.

Third phase (1980s): basic theoretical research. In this period, the focus was basic theory in all countries. In developed nations the focus shifted from "design and development" to basic theory. During this period more in-depth work was done on networking of information systems, automatic categorization, automatic indexing, machine translation, etc.

Section Two -- Current Thinking on Intelligence, Information, and Collection

I. Intelligence

Intelligence is a new discipline. It is still being developed and there is not a commonly agreed upon definition.

Russian professor A. I. Mihaylov believed that intelligence is a science that "studies the structure and basic characteristics of information and the general pattern in scientific exchange."

T. Saracevic, an American information expert, believed that information is a science that studies human communications and the characteristics of communication systems.

British information expert Brooks considered information a study of the action and reaction between "World 2" and "World 3."

In 1979, the International Organization for Standardization (ISO) introduced the definition that information is the study of the function, structure and transfer of information and the management of information systems.

In China, the "basic glossary of information and documents" defines information as a study of the theory, pattern and method of acquiring, transferring and using information, and of the management of information systems.

Theories about information came from actual practice, yet they have fallen far behind it. Assuming the field did start in 1945, it has been around for more than 40 years. Has it become an independent discipline? There are different views in the international community. With the exception of a few people such as E.P. Semenyuk of the USSR, the majority still believe that information has not yet become an independent discipline, because theories must be established before a subject becomes an independent discipline. Theories of information are still evolving at present.

According to V.M. Kedrov et al. of Russia, the following four conditions must be met to establish an independent discipline.

Recently, Semenyuk presented a long paper to the Soviet Academy of Sciences and argued that all the conditions for information to become an independent discipline have been met. First, there is a clear subject to investigate in information, i.e., scientific information and its exchange. Second, there is a conceptual system concerning its subject. Third, basic laws governing the subject have been demonstrated, such as Price's law of the exponential increase of documents and Bradford's law of the scattering of publications. Fourth, a number of principles and theories have been established to interpret many facts, such as the principle that scientific exchange and information activities have social, economic and cultural constraints. Fifth, there are unique methods to study information, such as the blank analysis method.
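The two laws named above are mentioned only in passing. As a rough illustration, and assuming their usual textbook formulations (exponential growth of the literature for Price's law, and journal zones in the ratio 1 : n : n^2 for the law of dispersion, commonly cited as Bradford's law of scattering), they might be sketched as follows. The function names and sample figures are hypothetical and are not taken from Semenyuk's paper.

```python
import math
from typing import List


def literature_size(n0: float, growth_rate: float, years: float) -> float:
    """Exponential growth of the literature: N(t) = N0 * exp(k * t).

    n0 is the starting number of documents and growth_rate is k per year.
    Both are illustrative inputs, not values from the text.
    """
    return n0 * math.exp(growth_rate * years)


def doubling_time(growth_rate: float) -> float:
    """Years needed for the literature to double: ln(2) / k."""
    if growth_rate <= 0:
        raise ValueError("growth_rate must be positive")
    return math.log(2) / growth_rate


def bradford_zones(core_journals: int, multiplier: float, zones: int = 3) -> List[int]:
    """Journals per zone under the 1 : n : n^2 pattern.

    Each zone is assumed to yield roughly the same number of relevant
    articles; only the number of journals needed to supply them grows.
    """
    return [round(core_journals * multiplier ** i) for i in range(zones)]


# Hypothetical figures: a literature that doubles every 10 years,
# and a subject whose core consists of 5 journals with n = 4.
k = math.log(2) / 10
size_after_decade = literature_size(100, k, 10)
zones = bradford_zones(5, 4)
```

Under these assumptions, the practical consequence for a collector is that coverage beyond the core zone costs disproportionately more journals for the same yield of articles.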

However, Professor V. Slamecka of the United States believed that there is a considerable distance for information to become an independent discipline. He briefly discussed the progress in information in the past 20 years as follows.

Brooks of the United Kingdom believed that information is still drifting in the sea of practicing common sense. Philosophically, information neither has a well-defined position nor a theoretical basis.

Krauss et al. of the former German Democratic Republic believed that information was still mainly limited to applying knowledge from other disciplines to solve practical problems.

We are in agreement with the latter type of scholars. The status and unique features described above indicate that information is not yet an independent discipline. Its theoretical system is still evolving.

Finally, let us quote Qian Xuesen to end this section. In July 1983, he was the first person to say: "S&T intelligence must be treated as a scientific discipline. To do a good job in this area, we must build up this discipline in China." "We no longer treat defense S&T intelligence as a task. It must be considered a scientific discipline." And, "Why don't we spend two years to devote to defense S&T intelligence." Nevertheless, contrary to his wishes, very little progress was made. Six years later, Qian Xuesen wrote: "I proposed to create a discipline to study information six years ago called knowledge and information activation technology. An academic discussion group was initiated, but it stopped in six months because most participants were not interested." In 1989 Qian Xuesen wrote: "This type of conservatism originates from society. To a large extent it is because there are too many issues concerning the environment we live in and the orderliness we live by. It is difficult to motivate people." Hence, Qian Xuesen put his hope in the 21st century. He wanted the intelligence community to welcome the 21st century by devoting itself to the study of defense S&T intelligence.

II. Information

As we know, S&T intelligence can be divided into four areas. First, information needs to be gathered. Second, information needs to be organized according to a certain order and a database needs to be established. Third, information needs to be indexed for ease of access. Fourth, information needs to be analyzed, or studied. According to the custom of the Chinese technology intelligence community, the first three areas are considered as data handling.

In order to transfer information and acquire intelligence, there is a need to gather data. Data gathering has been around for a long time. As information has become a topic of study, people have become interested in conducting scientific research on it, especially given the sharp contrast between the "information explosion" and "intelligence poverty." Objectively, there is a need for information workers to provide an effective and scientific method to satisfy the growing need for information. As mentioned earlier, in the 1950s-1960s information workers explored the principles and methods of information indexing and the preparation and evaluation of abstract and index publications. In the 1970s the use of mainframe computers to provide on-line search by networking with various databases, which belonged within the domain of information, became a central subject in the field, with remarkable success.

The study of information materials started at almost the same time as information studies, but it was not recognized as a discipline. Because information materials are tied even more closely to day-to-day practice, or they are even more highly specialized, or the study subject is even more specific, it appears to be more mature in certain areas, especially in applications such as information indexing, data labeling (by subject, by category), glossary, etc. From the standpoint of building an intelligence organization, in the data field every nation more or less follows the same model. There is very little disagreement. Of course, as a whole, there is a lack of a theoretical concept framework. It is still focused on applied research. The majority of the work is directed toward solving practical problems. These are facts facing people working in this field.

Information work is a spin-off of library science. It has benefited significantly from library science. Methods such as indexing and abstracting are very successful because they are the foundation of library science. However, it is also severely restricted by library science. For instance, a library is a "house of books" and its collection should be "large and complete." Hence collection, otherwise known as "purchasing," plays a very minor role. Over the years, anywhere in the world, very few people have ever performed in-depth scientific research on data collection.

III. Collection

For the fields of intelligence and information it is possible to list the subjects, contents, methods and theoretical structures being studied by various scholars. However, the field of collection is lagging far behind. There is a lack of content even for research on collection. Flipping through various information journals, out of a total of 500-600 pages collection accounts for merely a dozen pages. Furthermore, there is a lack of variety. It seems that everyone follows the same principles of specificity, accumulation, prediction, planning, purchasing, exchange, requesting, on-site searching and duplication.

A profound scientific investigation of collection to build it up as a discipline occurred in the past 5 years. Qian Xuesen made a great contribution. He laid the foundation for this discipline. In a defense S&T intelligence meeting held on July 2, 1983, he pointed out with foresight that "information collection is a science that needs to be rigorously studied." Since then, S&T intelligence workers and information collectors put more effort to "deepen" their understanding of collection. Thus collection studies began. Collection, as a science, is still in its infancy. It will take at least a few years, if not a decade or perhaps until the middle of the 21st century for collection to become an independent discipline.

Collection research has the following characteristics.

1. Most research projects are work-related; i.e., collection is treated as a task, rather than a science. Although there are a few research projects, most of them are specifically focused on a process or link. From the system standpoint there is no high level research. Hence the research is carried out at a lower level with relatively few results. It is also often limited by the users. Although this type of research can solve some practical problems, it does not have a major impact on the profession as a whole. Collection is being impacted by social progress. Overall, it is in an "ultra stable state." Only in recent years, as national S&T system reform has deepened and funding for information has been reduced or held level, has it become more difficult to manage.

2. To initiate a study of collection and its related tasks the first problem encountered is information gathering. There are no mature and reliable methods. First, the task of collection depends strongly on social connections. It often involves multiple channels and multiple nodes. Therefore a collector cannot experience the entire process of information collection. An individual or a group of people in a certain department cannot control all information. Furthermore, due to interference from either social or human factors, even with partial information in hand, it is difficult to screen out false information and then establish either a mathematical or physical model based on true information. Information collection and truth screening are far more serious and complex than data sequencing, labeling and indexing. This is caused by the unique situations encountered in information collection. They are the "high thresholds" preventing the study of collection as a science.

All collectors from various departments must cooperate fully to solve this problem. However, due to differences in understanding, collection research is highly uneven in different departments. Some departments are not interested at all. It appears it will take a long time for this issue to be resolved.

3. As far as study methods are concerned, traditional methods used in information and library science still apply. The field has barely begun to "borrow" or "modify" techniques used in other disciplines, so its methods naturally lack uniqueness. Most people still use a direct descriptive method to describe qualitatively what actions are taken and how the work is being done, or they give a qualitative explanation of practical experience in information collection. In conclusion, the most commonly used methods are limited to experience-based surveys and statistics. Very few modern scientific methods are being applied.

4. Some scholars and collection workers have realized that general principles and laws to guide and explain collection must be derived from a philosophical level. Some beneficial investigations have begun. Certain opinions and concepts have been introduced. Although these viewpoints are not mature enough to be accepted by the general public, it signifies some progress in the science of collection. It propels the study of collection science into the next phase.

Section Three -- Comparison of Study Subjects in Intelligence, Information and Collection

As discussed earlier, one of the premises for an independent discipline is a clearly defined study subject. Let us understand and compare the subjects to be studied in intelligence, information and collection as follows.

I. Subjects of Intelligence

As far as intelligence and its study subjects are concerned, there are numerous definitions and some are distinctly different. They are given by information scholars and operators with a variety of experiences, knowledge backgrounds, and targets. A number of classic methods to define it were described earlier.

Collectively, we have an idea. Regardless of whether so-called "intelligence research" work should be included as a part of intelligence, we should use a novel approach to observe intelligence both laterally from a social perspective and longitudinally from a historical perspective, based on more than 30 years of experience in S&T intelligence, in order to gradually create theories and form a branch of intelligence with Chinese characteristics. Some call it information science, others call it intelligence science. Why not call it informagence science [note: the terms are given in English]?

Under the premise of separating intelligence from information, let us name the science that studies S&T intelligence intelligence science. Another way to express it is to call the study of intelligence and its processes intelligence science. Specifically, it is the study of the patterns, principles and laws related to intelligence, including its concepts, attributes, structures, and functionality, as well as the processes of creation, transfer, exchange and absorption.

Intelligence has two branches, i.e., information and intelligence analysis. The former includes information collection, sequencing, indexing and retrieval.

II. Subjects of Information

There are similarities between what we refer to as "information" and what the foreign intelligence community refers to as intelligence work. It is close to information and communication. Nevertheless, it is definitely not equal to documentation.

Can the science and technology used to study information be called information science? In other words, is the study of information and information flow processes a science? Specifically, information science is the study of patterns, principles and laws related to the concepts, attributes, types and functionality of information, as well as the processes of information creation, transfer, flow and utilization.

Obviously, compared to intelligence, the subject and contents of information science are a subset of those in intelligence science.

III. Subjects of Collection

Data collection is a technical discipline and warrants additional research. Can the science to study data collection be called collection science? In other words, is it appropriate to call the study of information collection and information collection process the science of collection?

Obviously, collection is a branch of information. The subject and contents of collection science are a subset of those studied in information science. In one field, the focus is on "commonality" while in the other the focus is on "individuality." Of course, compared to intelligence science they are an even smaller subset.

The science of collection studies intelligence sources, information sources, the needs of intelligence users, transfer channels, collection techniques, and basic theories of collection. The core contents include intelligence sources and collection techniques. Collection science does not study laws and methods related to information activation and extraction because those are areas covered by intelligence analysis. The basic objective of information collection is to obtain information required by the clients.

Section Four -- Disciplinary Characteristics of Intelligence, Information and Collection

Because information and collection are branches of intelligence, we must first discuss the disciplinary characteristics of intelligence before we investigate those of information and collection.

I. Disciplinary Characteristics of Intelligence, Information and Collection

According to conventional systems, is information a social science, or a natural science, or a technological science? There is no agreement among scholars around the world.

Professor Mihaylov of Russia believed that information belongs to the domain of social science because it studies "phenomena and laws unique to humans."

Russian Academy Fellow Ershov believed that "information is a natural science that studies the transfer and processing of information."

F. K. Klaus of the former German Democratic Republic believed that "information is a discipline on the periphery of natural science, social science, engineering and science."

The majority view in China agrees with that of Klaus et al. In China, although some people believe information belongs to social science and others consider it a management science, most people believe that "information is a comprehensive applied science that borders natural science, technological science, and social science."

A different view was presented by Qian Xuesen on August 7, 1984 in a national discussion meeting on thinking. It was introduced for the first time that "information is an applied science in the domain of thinking."

Recently, A.A. Dorovnichyn of Russia introduced yet another new concept. He believed that "like mathematics, information is a methodology." Mathematics is the slave of other disciplines. Intelligence is also a slave. Its mere existence is only to help other disciplines. It does not study or create any specific matter or natural process. Rather, it provides methods for other disciplines. This is a unique viewpoint.

Let us discuss our understanding and viewpoint as follows.

Modern science and technology have developed into a closely related entity with numerous disciplines. This entity is a system that needs to have clearly defined layers and departments. There is a need to pinpoint the position and layer of a modern topic in this system in order to study its disciplinary properties.

To pinpoint the position of information in modern science and technology, we must first clarify whether it is an independent discipline, or a subset of other disciplines. In reality, information is a new area. It has not yet evolved into a new discipline. Its theoretical system has not yet been created. It is not a separate department. From the standpoint of its study and contents, it is not appropriate to assign it to either natural science or social science. Information and information materials have developed into large-scale enterprises in all countries. Information is a powerful tool to understand the world objectively. It is an extension of the human brain and the five senses. It is attached to human thinking and belongs to the domain of cognition. It is a very powerful methodology. Hence Qian Xuesen was profoundly correct to put it under the science of thinking. The science of thinking is a science that studies laws and methods governing thinking.

Although still in its inception stage, Qian Xuesen predicted that there will be a Chinese Academy of Thinking in the 21st century. Thinking, social science, natural science, mathematics, system science and the human body are the six subsystems of modern science and technology.

From the experience gathered in the past 100 years in natural science, the six subsystems may be further divided into three layers, i.e., basic science, technology and engineering, based on whether it either directly or indirectly impacts on the objective world.

Longitudinally, which layer does intelligence belong to? Our actual experience shows that on one hand intelligence is the application of basic sciences such as thinking, information, and culture (the study of creating intellectual wealth). On the other hand, it involves engineering and technology, including intelligence analysis, data handling techniques, database, and design of information structures and systems. Hence, it belongs to the technology layer. Its disciplinary characteristics are similar to those of control theory, operational research, applied mathematics, applied mechanics, and electronics. They all belong to technology.

Transversely, intelligence is most appropriately placed as a cross sectional discipline. This is determined by its study subject and research contents. Unlike other disciplines, intelligence research is not the study subject itself. Instead, it is a study of commonality – intelligence phenomena and general motion – intelligence process. Rather than the specific characteristics of various subject matters and processes, intelligence is the study of the laws governing the transfer, processing, activation and utilization of information produced by the subject matter as it develops. In other words, intelligence is situated in a position where various disciplines, including natural science, social science and human body science, merge. It provides a common method to all disciplines – how to effectively gather, store, index, activate and utilize information. Committee member Gao Yisheng of the Chinese Academy of Sciences complained in a meeting that "one problem is the huge amount of information. How can we grasp what is most critical? We need intelligence workers to teach us some effective methods." This illustrates that intelligence is a methodology. It also proves, from a different angle, that intelligence is a cross sectional discipline, similar to mathematics, information theory, system theory, and control theory.

In conclusion, we believe intelligence belongs to the subsystem of thinking. Longitudinally, it is at the level of technology. Transversely, it is a cross sectional discipline.

Since information science is a branch of intelligence, its disciplinary characteristics are similar to those of intelligence as well. It is also a technology and a cross sectional discipline. Furthermore, information science is a combination of basic sciences such as thinking, information and culture, as well as an application of intelligence. In addition, it is the theoretical basis for techniques such as information creation, data acquisition, data sequencing, database, data retrieval, data transfer, and data flow. Furthermore, transversely it supports other disciplines by providing theories and methods for data acquisition, data processing, data retrieval, and data utilization.

Similarly, collection is a technology and a cross sectional discipline. It is comprised of basic sciences such as thinking, information and culture. It is also an application of information science. In addition it is the theoretical basis for all data acquisition techniques. Furthermore, transversely it provides theoretical and methodological support to other disciplines. Certainly, from the operating standpoint, collection may be more involved with management and coordination than indexing and retrieval. However, collection is not a management science.
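The placement argued for in this section (one of six subsystems, one of three layers, cross-sectional or not, and a branch-of relation between the three fields) can be written down as a small data structure. This is only a sketch of the classification as stated here. The class and field names are hypothetical, and placing information science and collection in the "thinking" subsystem is an assumption inferred from the statement that their characteristics are similar to those of intelligence.

```python
from dataclasses import dataclass
from typing import List, Optional

# The six subsystems and three layers named in the text.
SUBSYSTEMS = (
    "thinking", "social science", "natural science",
    "mathematics", "system science", "human body science",
)
LAYERS = ("basic science", "technology", "engineering")


@dataclass(frozen=True)
class Discipline:
    name: str
    subsystem: str            # which of the six subsystems it belongs to
    layer: str                # longitudinal position
    cross_sectional: bool     # transverse position
    parent: Optional["Discipline"] = None  # the field it is a branch of

    def __post_init__(self) -> None:
        if self.subsystem not in SUBSYSTEMS:
            raise ValueError(f"unknown subsystem: {self.subsystem}")
        if self.layer not in LAYERS:
            raise ValueError(f"unknown layer: {self.layer}")

    def lineage(self) -> List[str]:
        """Names from this discipline up to its root field."""
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


intelligence = Discipline("intelligence science", "thinking", "technology", True)
information = Discipline("information science", "thinking", "technology", True,
                         parent=intelligence)
collection = Discipline("collection science", "thinking", "technology", True,
                        parent=information)
```

Calling `collection.lineage()` walks the branch-of relation upward, which mirrors the claim that collection is a subset of information science and information science a subset of intelligence science.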

II. Query the Theory of "Peripheral Discipline"

Very few people in China consider intelligence and collection as social sciences. However, quite a few people believe they belong within the domain of peripheral disciplines.

What is a peripheral discipline? There are numerous ways to create a peripheral discipline. Basically, there are two expressions. One is the creation of a new discipline in an area where two related disciplines cross over, such as biochemistry. The other is to use the theoretical methods of one (or more) discipline to study the subjects in another discipline. For instance, the laws of physics are used to study the motion of heavenly bodies to create astrophysics.

However, intelligence and information are not quite the same. The subject matter is not the specialty of the discipline. In other words, it is not a study of either the state or the motion of matter. Instead, it discards specific features of various disciplines, matter, phenomena and processes to study their common patterns, theorems and criteria in an abstract manner. Hence, it cannot be called a peripheral discipline.

Of course, an intelligence phenomenon is a phenomenon. System theory, information theory, and control theory have permeated into intelligence and information. Some physical principles and mathematical methods have been applied in intelligence and information research. Technologies such as computers, communications and data storage are being widely used. Nevertheless, they cannot be used as bases to determine the academic characteristics of intelligence. Since system theory, information theory, and control theory can be used to study a wide range of subject matters, they have powerful methodological capabilities. Since new technologies such as computers can be used over a wide range, they can provide excellent technical support. The fact that the theories, methods, and techniques described above are used to study population does not make population science a peripheral discipline.

In addition, intelligence cannot be considered a peripheral discipline simply because the subject matter and contents of intelligence and information involve both social and natural science. In essence, it is not developed as a result of crossover permeation between social science and natural science.

Finally, we want to review the academic characteristics of intelligence, information, and collection from the standpoint of their significance. As we know, a leap of understanding of the objective world is considered a scientific revolution. A leap in changing the objective world is a technological revolution. Then, once a leap in the understanding of intelligence and its process, or collection and its process, takes place, is it a scientific or technological revolution? This question may lead people to reflect on the peripheral discipline argument and take the technology argument into consideration more profoundly.

Section Five -- General Methods to Study Intelligence, Information and Collection

Any discipline has its own unique methods. Intelligence is still evolving and does not have a comprehensive and unique set of methods. Many of its methods are either derived from library science, or transplanted from social science and natural science. It takes more hard work and further investigation by all workers in the field of intelligence and information to advance and perfect its methodology.

I. Commonly Used Methods

1. Terminology Analysis

Before attempting to solve any problem, a person working in intelligence and information should have a clear definition of each term and the concept it represents, and gradually build up an understanding of the relationship among different concepts in order to stabilize their positions in the theoretical system. Defining basic (key) terms is helpful to form various assumptions, which is the basis of research. In light of the fact that theoretical concepts lag in intelligence, this is a highly practical method.

Marxist-Leninist epistemology believes that a concept is an objective reflection of the nature of a thing or a phenomenon in words. A specific word (or phrase) to describe a concept is a term. Using terminology analysis to define the meaning of a term is to illustrate the content, as well as the most important and essential characteristics of the concept.

Terminology analysis usually goes through four stages.

First step: Pick terms of interest to the task as initial preparation work.

Second step: Collect all possible information on the term. Collect information from every aspect so that the ensuing analysis rests on a solid foundation and its outcome is not limited by the data collected. Information should be collected from special papers, theses, dictionaries and handbooks.

Third step: Perform terminology analysis to determine the concept and most essential characteristics of the term. Rigorously read through the information gathered and extract anything that describes the term. Deliberate and compare repeatedly to find differences and contradictions, and write down all the questions. Use these as the basis for the terminology analysis. First, from the standpoint of historical materialism, analyze the term from a historical perspective in order to understand any changes in its definition and the formation and development of the concept, bearing in mind that history itself is still evolving. Next, perform an etymological analysis to understand the original meaning of the term and its possible interpretations. Finally, perform a comprehensive contrast analysis to extract the essence. Taking your actual work experience and the status and prospects of intelligence science into consideration, introduce your own assumption of the concept and essential characteristics of the "term" or "derivative."

Fourth step: Put your own "terminology assumption" into practice to see whether it can adequately explain various problems encountered in the real world. Examine whether it is properly placed in the theoretical concept system. Find contradictions and correct them. Finally, express the concept and essential characteristics of the term more accurately, or give a definition to a newly "derived term."

In this book, the concept definition of "intelligence source" and the derivation of "information source" are perfect examples of terminology analysis.

2. Concept Inference

Concept inference is based on the dialectical materialist theory of methodology. On this basis, the concept to be analyzed is compared to the fact, phenomenon, or event. Here, the fact, phenomenon, and event are a reflection of the concept itself.

In simpler terms, concept inference is to find more specific, experience-related markers of a concept after the definition of the term has been determined by terminology analysis. These markers should be visible and measurable. They provide the material basis for the concept behind the term. Upon completion of the research work and after obtaining new information, analyze and understand the concept of the term at a higher level to make the definition and connotation of the term more accurate, enriched and comprehensive.

In the study of intelligence, information and collection, abstract concepts are often encountered. It is difficult to directly link these abstract concepts to the facts, phenomena and events encountered. Hence, they are hard to measure, and it is necessary to "decompose" a concept into various components in order to convert them into measurable markers in the real world of intelligence. By doing so, it is then possible to collect more data and information for either qualitative or quantitative analysis so that the research can dig deeper. The feedback obtained makes the concept more complete.

In intelligence, information and collection, concept inference is a commonly used method. Let us use an example to explain the steps and specific procedures to implement concept inference.

Assume that there is a need to study the "reading skill" of various groups of readers. Since "reading skill" is an abstract concept, it is often very difficult to collect any information that is a direct measure of "reading skill." In this case, concept inference is useful in the research. Usually it may take four steps.

Step 1: On the basis of terminology analysis, find a number of specific concepts (lower level) that determine "reading skill" to some extent.

In this example, "reading skill" may be decomposed into the following specific concepts, including "reading contents," "book list knowledge," "systematic and continual nature of reading," "capability and skill to select specific information," "capability to grasp and profoundly understand the contents read," "capability to apply knowledge contained in the information in practice," "reading hygiene," "reading skill (experience and techniques to protect effective reading)," etc.

Step 2: The concepts described above, in whole as well as in common, form the concept of "reading skill." In any study concerning "reading skill," it is necessary to pick some of them. In some cases, it is necessary to choose more of them and in others fewer. The selection depends on the subject matter and objective of the study. Hence, the second step is to select specific concepts that are meaningful, important and relevant to the direction of the study, and are necessary to accomplish the objective of the task. These selected concepts can then serve as the characteristics of the abstract concept of "reading skill."

Step 3: This step can be summarized as a determination of the experience characteristics of "reading skill." Experience characteristics are the final characteristics of the concept "decomposition" process. These are visible and measurable features. Based on these features, a judgement can be more precise.

Let us use "reading range," i.e., the first characteristic of "reading skill" as an example. It can be further "decomposed" into the following experience characteristics:

Once experience characteristics are identified, it is possible to apply a number of suitable methods to monitor and record facts and information. For instance, reading contents can be determined through an analysis of the materials checked out. Reading hours can be measured through reader surveys, or by observation. Quantity read can be surveyed or analyzed by reviewing library cards. Reading purpose can be determined by interviewing readers, or analyzing the contents and assessment of the information. The difficulty of the reading material can be revealed by using one or more methods to measure the degree of difficulty of articles.

Step 4: Collect, organize, process and analyze the data and information on all experience characteristics and try to elucidate a pattern. Then by deduction apply it back to concept characteristics and to basic concept. Use the criteria illustrated to support or direct your own research work. In this example the mode of operation of scientific readers is determined by reviewing the "reading skill" of different reader groups.
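The decomposition in Steps 1 to 4 can be sketched as a small data structure plus a per-group summary. This is only an illustrative sketch: the experience characteristics and collection methods are taken from the text above, while the record layout, the group names, the numbers, and the function `summarize` are hypothetical.

```python
from statistics import mean

# Experience characteristics named in the text, each paired with the
# collection method the text suggests. The dictionary layout is an assumption.
EXPERIENCE_CHARACTERISTICS = {
    "reading contents": "analysis of materials checked out",
    "reading hours": "reader survey or observation",
    "quantity read": "survey or review of library cards",
    "reading purpose": "reader interviews or content analysis",
    "difficulty of reading material": "measured difficulty of articles",
}

def summarize(records, characteristic):
    """Step 4 in miniature: average one measurable characteristic per reader group."""
    by_group = {}
    for rec in records:
        by_group.setdefault(rec["group"], []).append(rec[characteristic])
    return {group: mean(values) for group, values in by_group.items()}

# Made-up observations (reading hours per week) for two hypothetical reader groups.
records = [
    {"group": "engineers", "reading hours": 6.0},
    {"group": "engineers", "reading hours": 8.0},
    {"group": "students", "reading hours": 10.0},
    {"group": "students", "reading hours": 14.0},
]

result = summarize(records, "reading hours")
print(result)
```

Each measurable characteristic would be summarized in the same way before the pattern is carried back, by deduction, to the abstract concept.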

In this paper, concept inference is also used to determine the assessment criteria for intelligence users' needs.

3. Information Activation Method

This is a fully scientific (inter-disciplinary) method. In essence, it extracts nutrients from a variety of data to obtain intelligence.

Similar to other researchers in basic science, technology and engineering, researchers in intelligence, information and collection must also read a great deal in order to obtain intelligence from work done in the past and experience gathered by their peers. It is a misunderstanding and a joke to study intelligence while neglecting acquiring intelligence.

The core of information activation work includes collecting quality materials of the subject matter in sufficient quantity, assessing these materials in accordance with reliable standards, eliminating all questionable portions, repeatedly checking and double-checking if necessary, and eventually arriving at dependable results. Finally, one must work hard to reveal any unnoticed pattern that exists objectively in order to activate the information to obtain intelligence.

It is highly inadequate for a researcher to collect information at the last moment. Instead, it should build up over time. A researcher should routinely read 5 foreign periodicals, 10 Chinese periodicals, and some special reports. Furthermore, he or she should periodically attend international academic exchange meetings.

Information activation is needed in both theoretical and applied research. In addition, it can be used in various stages of the research project. In preparation, it can help determine the direction of research and set up a plan. In implementation, it can be helpful in terminology analysis, concept inference and hypothesis creation, as well as in providing some facts. Upon completion, it can help verify the accuracy of the results.

At present, information activation is widely used in intelligence research. In the example introduced in this book, a user study conducted by the US DoD Document Center in 1975 regarding the "status and future trend of information storage and transfer technology" was done using the information activation technique alone.

4. Observation and Experimentation

Observation and experimentation is a common method in natural science. It has been transplanted into intelligence.

The close relationship between observation and experimentation is well known. Experimentation is based on observation, and observation is preserved as a component of every experiment. However, observation is not always based on experimentation. Nevertheless, an experiment becomes meaningless without observation.

Although experimentation evolves from observation, the two are absolutely not on opposing sides. Instead, they form a single entity. When observing a subject, in addition to selecting a suitable method, every effort must be made to ensure that the researcher (observer) does not influence the development process of the phenomenon under study. An experiment is done to understand the nature and pattern of a certain phenomenon. Certain necessary conditions are consciously created, or altered, to actively influence the process of the subject matter to meet the objectives. This is an important difference between experimentation and observation.

(1) Observation

In general a scientific observation is a specially organized, planned and goal-oriented activity to understand a subject matter. It can be an individual method or a component of another method. The difference between a scientific observation and an ordinary observation is that it obeys the objectives and tasks of the study. It must have a well-defined range of terminology and concept. These terms and concepts are needed in the study. It has a detailed observation plan and follows rigorous methods. The information obtained from various observation methods should be comparable. In addition, the data collected by means of observation usually need to be verified and tested for reliability.

The quantity and quality of the subject matter must be selected and determined in an observation so that it can accurately represent the major characteristics of the subject matter. This task can be accomplished by using certain empirical equations and statistical methods.

Finally, a method must be chosen.

Based on the way facts and data are obtained, it can be divided into direct observation and indirect observation. In direct observation facts and information are obtained when there is a direct link between the observer and the subject matter. In this case, the observer records what he sees. In indirect observation the observer does not come in contact with the subject matter. Instead, it is done by other people who are familiar with the subject matter.

On the basis of the relationship between the researcher and the subject matter under observation, it can be divided into intervening observation and non-intervening observation. When a researcher observes the subject matter "from the sideline" during a pre-determined time period according to plan, it is a non-intervening observation. In an intervening observation the researcher becomes a member of the group under study. He works with the group and participates in all their activities in order to study the subject matter from the inside.

In addition, there are open observation and undercover observation. In the former case, the subjects know that they are being observed. In the latter case, they do not know they are being observed. An observation may be long or short. Of course, a long observation is most beneficial. A short observation is usually used to clarify a specific situation and detail or to collect certain evidence.

The results must be recorded in accordance with certain requirements. The format should suit the objectives and tasks of the study. Not only is the format important but also the time of record is critical.

Finally, statistical analysis must be performed to reach some conclusions. Certain techniques in statistical analysis and fuzzy logic can be used to find the nature and pattern associated with the subject matter.

The disadvantages of observation include a large workload and deviation of the information from reality.

(2) Experimentation

Experimentation is an extension of observation. It is the most commonly used study method in natural science. Any research institution has a large number of laboratories. One of the unique features of experimentation is the ability to reproduce facts and situations of interest. Another feature is the ability to create and change a series of experimental conditions and to observe the creation, development and change due to such conditions. The objective is to determine any intrinsic correlation between these effects and the objective conditions to unveil the nature and pattern of the effect itself.

The premise is to establish a hypothesis to be validated. To verify the validity of a hypothesis, a detailed plan is required to establish the necessary experimental conditions, and to observe and record the results in detail. Finally, the data is analyzed both qualitatively and quantitatively using suitable mathematical methods.

Based on logic structure, there are two types of experiment to verify a hypothesis. The first is the contrast method. In this type of experiment the hypothesis is verified by comparing two or more (test and control) groups. The second type is the serial progressive approach. In this type of experiment there is no control. The way the hypothesis is verified is by comparing the results obtained before and after the experiment.
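The difference between the two experiment types can be shown with a short sketch. The function names and all scores below are made up for illustration, and the effect is estimated by a plain difference of means. A real study would add the error analysis that an experimental method requires.

```python
from statistics import mean

def contrast_effect(test_group, control_group):
    """Contrast method: compare a test group against a control group."""
    return mean(test_group) - mean(control_group)

def serial_effect(before, after):
    """Serial progressive approach: no control group; compare the same
    subjects before and after the experiment (paired differences)."""
    return mean(a - b for b, a in zip(before, after))

# Hypothetical scores, e.g. documents retrieved per search session.
test_scores = [14, 16, 18]
control_scores = [10, 12, 14]
before = [10, 11, 12]
after = [13, 13, 16]

contrast = contrast_effect(test_scores, control_scores)
serial = serial_effect(before, after)
print(contrast, serial)
```

In the contrast method the control group absorbs influences that affect everyone, whereas the serial progressive approach attributes the whole before/after change to the experimental condition.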

Because intelligence, information and collection have strong social characteristics, experiments can be divided, based on the place and conditions of the experiment, into natural experiments and laboratory experiments.

A natural experiment takes place in a normal workplace under normal conditions, such as a reading room in a library. By changing the working conditions and providing various extraneous factors, the effect on the subject of study is observed. This provides the researcher with information that is otherwise impossible to obtain. This is the unique feature of a natural experiment.

A natural experiment can be carried out easily. However, it is highly susceptible to unintentional interference. To "eliminate" any interference and to ensure the accuracy of the experiment, an experiment may be organized to take place in a laboratory. In this case, it is necessary to have a specific site. The people or subject matter must also be appropriately selected. Various experimental conditions are then created without external interference to unveil and measure the reaction of the study subject.

When using an experimental method, error analysis should be carefully done just as in natural science. This is a critical step.

Currently, experimentation is widely used in intelligence, information and collection, for example in studying intelligence users' needs, construction of transfer channels, selection of information sources, and design of indexing and retrieval systems.

5. Survey Statistics

Survey statistics is a commonly used method in social study. It was the first method to be transplanted into intelligence research. It is the most popular method and is being widely used.

Survey is a method where a sufficient number of "samples" is taken according to a specific scientific principle.

Statistics is a method where statistical analysis is done on various records related to a specific problem to obtain more information.

The methods introduced earlier to study the needs of intelligence users, including survey questionnaires, interviews, and citation analyses, belong to the domain of survey statistics. These classic methods have been introduced in detail before and will not be repeated here.

Of course, survey statistics is not limited to studying intelligence users' needs. It is widely used in the research of information transfer channels, methods and techniques for data acquisition, and policies and plans for collection. Without any exaggeration, improvement measures in every aspect of today's intelligence work are based on survey statistics.

6. Expert Appraisal

Expert appraisal can be considered as a logical and statistical process. Information obtained from experts based on their experience and practice is analyzed, judged and synthesized.

Expert appraisal is being applied to various studies in intelligence, information and collection. It is primarily used to solve problems in two areas.

First, it is used to assess the quality of various objects in intelligence research. These objects may be intelligence sources and users, and previous processes and study methods. In this case expert appraisal not only can be used alone, but also as a component of other research methods. Today it is still the primary method to judge the value of the information collected.

Second, expert appraisal is often used to predict the future development of a certain object. Because it is difficult to set up a mathematical model for some objects, expert appraisal is invaluable in predicting their future development. It does not require detailed computation and experimentation to arrive at the future development of subject matter that is familiar to the expert. Of course, it is a prediction of trends and directions, rather than details.

Expert appraisal is divided into "individual appraisal" and "collective appraisal." For a relatively simple problem, such as assessing the value of a piece of information, "individual appraisal" may be used. It involves visiting one or a few experts to assess the situation. For a relatively complex or important issue, such as evaluating certain research methods or predicting the future development of certain events, "collective appraisal" is required. A group of experts needs to be assembled. Conclusions will be drawn based on statistical and probability analyses of the appraisal from each individual expert. By doing so a brand new appraisal from a quality standpoint can be obtained based on the opinions of the group. Collective appraisal can reduce the level of subjectivity, bias and narrowness often encountered in individual appraisals.

The procedures to use "collective appraisal" are as follows:

(1) Establish an expert appraisal analysis group. Its tasks are to clearly define the topic and objective of the study, select experts, determine the method and procedure for the survey, prepare, issue and collect survey forms, perform statistical analysis on the results, and finally summarize the results of the appraisal.

(2) Establish a group of experts for appraisal. The organizational structure can either be a "solid entity" or a loosely held organization. However, it must contain a group of pre-selected experts who are familiar with the subject matter. They must be representative in quantity and viewpoint. Choose experts who can assess the problem from various aspects. Retain key figures that are experienced in related fields. The accuracy and reliability of the appraisal can be directly impacted by the quality and representation of the expert group.

(3) Prepare for survey work. Primarily, prepare the necessary background materials and draft the survey outline. To process survey data quantitatively, it is usually necessary to decompose the survey outline into a form in order to standardize and tabulate the answers. To a large extent the accuracy of the appraisal depends on how detailed the questions put to the experts are and how accurately they are expressed. Hence, preparation of a scientific questionnaire is an important step in "collective appraisal."

(4) Organize to implement the survey and appraisal.

(5) Retrieve survey opinions to perform data analysis.

There are three ways to perform a "collective appraisal." One is to visit or interview experts to go over the form. Then the results are summarized by the expert appraisal analysis group. The second approach is to invite the expert group to a meeting to discuss the results. The third approach was first adopted by the RAND Corporation and is called the Delphi method. The key feature is to first mail the survey and background material to every expert. Each expert then replies in writing after studying the issues. The organizer collects their opinions and then sends all the answers back to the experts anonymously, with or without editing, to allow each expert to evaluate his own opinion based on the arguments of his peers. He can supplement or modify his opinions and send in his answers one more time. The expert appraisal analysis group can then perform statistical analysis and summarize the results of the second reply. Of course, if the issue is highly complex, the process may be repeated a few more times to make the conclusion more precise and focused. This method is an extension of a discussion meeting. It has the following advantages over a discussion meeting. (1) Experts have ample time to review the materials and to perform in-depth research. It can overcome the problem of having to speak on the spot without adequate preparation. (2) By reading through the previous survey, the opinions of others are known. With this level of understanding, one can perfect and modify one's own opinions. (3) Due to anonymity the group is not influenced by the opinions of a few well-known experts. This helps open up the field and encourage independent thinking. It can also avoid face-to-face confrontation when opposing views are presented. It allows each party to calmly analyze the view and reasoning behind the other party's opinion and to complement his own. (4) Since usually several dozen or more experts are involved, all answers are given in a tabulated format to facilitate quantitative analysis. In view of these advantages, this method has been widely used since 1960. Some research materials point out that more than 20% of predictions in modern times are made this way.
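The statistical summary performed after each round of written replies can be sketched as follows. The text does not say which statistics the analysis group uses. The median and interquartile range below are a common choice and are an assumption here, as are the function names, the stopping rule, and all the numbers.

```python
from statistics import median, quantiles

def summarize_round(estimates):
    """Summarize one anonymous round of expert estimates."""
    q1, _, q3 = quantiles(estimates, n=4)  # quartiles, default "exclusive" method
    return {"median": median(estimates), "iqr": q3 - q1}

def has_converged(summary, tolerance):
    """Hypothetical stopping rule: stop once the spread of opinion is small."""
    return summary["iqr"] <= tolerance

# Made-up estimates from seven experts (e.g. years until a technology matures).
round_1 = [5, 8, 10, 12, 15, 20, 30]
# After seeing the anonymous first-round answers, the experts revise their replies.
round_2 = [9, 10, 11, 12, 12, 13, 15]

s1 = summarize_round(round_1)
s2 = summarize_round(round_2)
print(s1, has_converged(s1, tolerance=4))
print(s2, has_converged(s2, tolerance=4))
```

If the spread is still large after a round, the summary is sent back to the experts and the process is repeated, as described above.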

7. Mathematical Methods

There is one unique feature in modern S&T development. As computer technology advances, mathematical methods are widely used in various fields. Hence the trend is to turn scientific knowledge into mathematical expression. Intelligence, information and collection are no exception. For example, a mathematical abstraction of a complex intelligence or information flow process can be obtained by applying probability and control theory, and quantitative analysis can then be carried out using mathematical models. Marx believed that "a science is truly developed after mathematics can be used to deal with it."

It usually takes the following steps to apply mathematics to intelligence and information science.

(1) Use the language of mathematics to describe the problem to be studied and to build a suitable mathematical model.

(2) Find a method to solve the mathematical model.

(3) Interpret and evaluate the mathematical solution to form a judgement or prediction of the problem.

II. Several Notable Issues in Philosophy and Methodology

In exploring the philosophy and methodology of studying intelligence, information and collection, we are in agreement with the overall principle introduced by Qian Xuesen for intelligence research. He said: "Never just limit yourself to your own ideas because it prevents you from seeing the whole system."

1. The development of disciplines such as thinking, systems, control theory and information theory reflects an overall change to make thinking a more scientific process. In the study of information and collection, we have to actively adapt to this change. Of course, information and collection are collective bodies of a series of processes and factors, i.e., systems. The purposes of studying information and collection are to investigate the law governing the organization of information and collection, to understand the law that puts the system in order from a random state, and to explore how this orderly entity remains functional. To study the correlation between processes or factors, an optimization method based on the selection theory must be used. We cannot limit ourselves to factors affecting an individual entity. Instead, the correlation must be deduced based on decision theory. For a long time, classic decision theory was used exclusively to study information and collection. This is a deficiency in the field.

2. Because to date people are still using qualitative methods that describe experience, similar to what happened in conventional library science, progress in information and collection is very slow. Furthermore, the field does not seem very scientific and the unique characteristics of the discipline appear ambiguous. In the future certain novel techniques should be used to further strengthen quantitative methods. Attention should be paid to methodology that extracts and summarizes from past practice, in order to gradually form a distinctly unique system for studying information and collection. Another disadvantage of using the methods of library science is that they are confined to a primitive descriptive approach.

We must point out here that as far as collection is concerned, not even intuitive experience has been accurately and sufficiently described as of the present moment. Hence, when one is ready to initiate a study on collection, one must combine intuitive description with theoretical extraction. In the early stage, we should look for issues encountered in routine work and use topics to drive key tasks.

3. Since information is often inappropriately accumulated and inherited, it makes information and collection more resistant to change. Over the years we have felt that it is easier to stir up the pot than to reform. In the future, in addition to microscopic methods aiming at partial improvements, we should especially pay attention to studying some macroscopic methods that can advance information and collection as a whole in order to result in a fundamental reform.

III. New Approaches and Methods Discovered in Research and Introduced from Abroad

Intelligence, information and collection are new disciplines. The theoretical concepts are still yet to be formed. The study methods are also just evolving. Hence there is an urgent need to refer to certain new theories, ideas and methods in philosophy, social science and natural science. This point is of great significance to the formation, development and even key breakthroughs in the discipline of information and collection. A number of new theories and ideas that are related to theoretical research on intelligence and information are introduced below. These methods and ideas were developed abroad in the past several years.

1. Idea Gene Theory by Dawkins of the United Kingdom and Intelligence Gene Theory by S.K. Sen of India

According to modern genetics, the gene is the basic biological element of inheritance. It exists on the chromosome in the form of a linear array. In 1976, Dawkins proposed that human ideas, just like a biological body, are transplanted, expand and reappear as time and space vary. They too have a basic unit, the thinking gene. A thinking gene is the basic element of science, as well as the heir and propagator of human culture.

A biological species evolves based on invariance and mutation of genes. Dawkins believed that the development of a scientific idea is remarkably similar to the evolution of a biological species. Some new ideas undergo changes to form new laws, doctrines and theories. He attributed the inheritance and development of ideas to heredity and the mutation of thinking genes. He also believed that a thinking gene is a high fidelity replica of the idea. It can form a gene composite entity that can live and multiply.

In 1981, S.K. Sen of India introduced the concept of replacing "thinking gene" with "intelligence gene" in an attempt to lay a solid theoretical basis for information science. He equated the state of living to an increase in intelligence. He believed that an organic entity evolves by way of hereditary genes, natural selection, replication fidelity and mutation. Intelligence, however, increases gradually by way of heredity of intelligence genes, error detection, social constraints, and thinking changes.

S.K. Sen criticized the methods used to quantitatively evaluate and measure knowledge increases. He believed that these methods are based on published articles or available literature as a whole, and do not agree with the actual evolutionary process of ideas. He proposed to build a quantitative method to measure intelligence on gene theory. Counting the number of times an article has been read is not a good way to measure the novelty of a new idea. Citation statistics cannot actually reflect whether the idea illustrated in an article is rejected, accepted, or partially utilized. Therefore he proposed to establish an idea gene structure, idea gene chain, and idea gene exchange model.

S.K. Sen also pointed out that the current categorization method is irrational. He suggested that we reconstruct a categorization system based on an idea evolution chart. In information retrieval, he also proposed to start from looking for the idea gene from the literature and then gather data by means of natural progression to form an idea gene string. Then, prepare it into a novel idea index for use.

Liu Zhihui of the Chongqing Branch of China Institute of S&T Intelligence pointed out that it is more appropriate to change "intelligence gene" to "knowledge gene." The phenomenon of knowledge inheritance and mutation should be looked at from a dialectical standpoint. In the evolutionary process, knowledge primarily represents the inheritance of certain academic ideas. At the same time, to some extent, it contains a criticism of this idea. In genetics, the dominant character, i.e., inheritance, is to carry forward tradition, and its recessive character, i.e., mutation, is to criticize tradition. As knowledge mutates, the primary behavior is to criticize the idea. At the same time, to a certain degree, it also inherits this idea. Its dominant character is to criticize tradition, i.e., mutation. Its recessive character is to inherit tradition, i.e., inheritance. The idea gene theory describes the process of thinking based on the theory of evolution. An idea is a system of knowledge. A knowledge gene is the basic concept of science. The equations, laws and patterns created from these basic concepts are the DNA of knowledge. Knowledge DNA is the primary constituent of the knowledge cell. It is the basic structure of "inheritance" and idea "mutation." The entire building of science is created by knowledge cells.

The idea gene theory of Dawkins and intelligence gene theory of S.K. Sen are worthy of further study. Based on such theories, the traditional categorization method will be challenged and current quantitative methods of information analysis will be impacted. The theoretical study of intelligence may undergo a major reform after introducing the idea gene theory into this field.

2. Brooks' View

Since 1978 Brooks has dedicated himself to the basic theory of information. He presented a series of insightful viewpoints. His representative work, "Fundamentals of Information" (published in 1980), received worldwide attention. His view can be summarized in the following five points.

(1) He advocated that information ought to be a discipline of science. He said that "from the standpoint of philosophy, information does not have a place nor does it have any theoretical basis." Hence he proposed to abandon some narrow concepts developed in their embryonic stage (limited to literature) and investigate the essence of information and a series of basic issues related to information from a broader perspective.

(2) On the basis of the three-world theory, information should study the interaction between world 2 and world 3 to form an independent discipline. He said that "information is fundamentally needed and is objective, rather than subjective, knowledge." "People studying library science and information science have to pay attention to Popper's three-world theory because it provides a theoretical basis for library and information activities other than from a practical viewpoint." He emphatically denied that information is a combination of some related fields (such as linguistics, communication theory, computer science and statistics). He believes that information has its own unique research subjects and domains that have not yet been touched, i.e., the interaction between world 2 and world 3.

(3) The basic equation to express the effect of information on knowledge structure is K(S) + ΔI = K(S + ΔS), where K(S) is the original knowledge structure, ΔI is the incremental information, and ΔS is the improvement. Brooks considers it a quasi-mathematical formula. He presented this equation because people are extremely ignorant about how knowledge grows. He also stressed that knowledge increase is not a simple addition. Instead, it is an adjustment of the knowledge structure.

(4) Ranking & sequencing and logarithmic perspective are two basic methods to quantitatively measure information. In his reply to Neil's criticism, Brooks said "information also needs a quantitative measure, otherwise it is merely a summation technique and not a true science." What did he advocate using? One is ranking and sequencing. This method can preserve more information and data, and it is easy to use. The second is logarithmic perspective. Brooks daringly applied the Weber-Fechner law to the subjective human recognition process. That is, the human recognition process operates in a logarithmic manner. He also pointed out the presence of a "recognition space" and an "information space." These differ from physical space and are governed by logarithmic relations.
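For reference, the Weber-Fechner law mentioned in point (4) is usually written in the form below. The symbols p, k, S and S_0 are the conventional ones from psychophysics and are not taken from the text; how Brooks maps them onto "recognition space" is not specified here, so the block is only a reminder of the standard law.

```latex
% Weber-Fechner law (standard psychophysics form; symbols are assumptions):
%   p   : perceived magnitude of a stimulus
%   S   : physical magnitude of the stimulus
%   S_0 : threshold magnitude below which nothing is perceived
%   k   : constant that depends on the sense and the units used
\[
  p = k \,\ln\frac{S}{S_0}
\]
% Consequence: equal ratios of stimulus give equal steps of perception,
% e.g. doubling S always adds the same amount k \ln 2 to p.
```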

(5) Objective knowledge is organized based on the logic content of the literature. Brooks believed that the material organized by current information workers by category and by subject is not knowledge. Instead, it is literature. A system built this way can only provide literature clues. The real information still needs to be analyzed and revealed by the user. Therefore, he proposed to organize it by a knowledge map. A knowledge map is a direct display of the interaction and linkage points for people to create and think based on the logic content of the paper. Brooks conducted a small experiment to draw a knowledge map by using Farradane's index. However, he did not disclose any results.

Brooks is an up-and-coming figure in information science. His viewpoints are worth pursuing. His understanding of the characteristics of knowledge goes a giant step deeper. His equation, although simple and coarse, nonetheless establishes an inherent relation between information and knowledge. It points to a direction for basic research: to further elucidate the relation between information and knowledge and to establish a more elaborate mathematical model. Brooks' idea of drawing a knowledge map coincides with S.K. Sen's proposal to build an idea gene evolution chart. Of course, it is very difficult to draw such a map. There are numerous technical issues to be resolved. However, once a "knowledge map" or "idea gene chart" is successfully drawn, it is equivalent to building a brain outside the human body. It will be a significant contribution as people handle more and more information every day. In particular, he sharply criticized the status quo of theoretical information research and challenged the traditional view. He fought hard to understand basic issues of information from a philosophical perspective and firmly believed that information will become a new independent discipline.

Brooks' views are novel and we should study and absorb their scientific merit.

3. Brief Introduction to the "Three New Theories"

In the tide of the new technology revolution, modern science has developed from the "old three theories" (i.e., systems, control and information) to the "new three theories" (i.e., dissipative structure theory, coordination theory, also known as synergetics, and mutation theory, also known as catastrophe theory). They are powerful tools to understand the multi-dimensional correlation between the internal factors of an object and the external environment.

Dissipative structure theory was originally proposed by Professor Prigogine of the Free University of Brussels in Belgium in 1969. It was primarily used to investigate the mechanisms, conditions and laws governing the evolution of a system that starts in a highly random state and ends in a stable and orderly state.

Prigogine pointed out that in a system far from equilibrium, when a variable reaches its critical value, the system may switch abruptly from a random state to an orderly state in terms of time, space or function. This orderly macrostructure, formed in the non-linear region far from equilibrium, must continuously exchange material and energy with the external world to maintain its stability so that it will not disappear due to small perturbations. Prigogine calls this kind of structure, which needs to dissipate material and energy to maintain its orderliness, a dissipative structure. Under certain conditions, the system can organize itself. This is the self-organizing effect.

According to Prigogine's hypothesis, a system requires at least four conditions to form a dissipative structure: (1) it must be an open system, (2) it must be far from equilibrium, (3) there must be non-linear interaction among various factors, and (4) large fluctuations (a huge rise or fall) lead to orderliness.

In 1973, Professor Haken of Stuttgart University in Germany introduced the coordination theory. It extends Prigogine's dissipative structure theory from an open system far from equilibrium to a closed system that is in equilibrium.

Haken believes that under certain conditions, a coordination effect is produced through non-linear interaction among various sub-systems. As a result, a system may change from a random state to an orderly state, from a state of low-level order to a state of high-level order, or from an orderly state to a random state.

In coordination theory, the degree of orderliness of the system is expressed as an order variable (order parameter). This order variable is a critical, undamped and slow-relaxing variable that governs the entire process. It determines the structure and functionality of the result.

Quantitative study of both dissipative structure theory and coordination theory relies on mathematical methods associated with mutation theory. Mutation theory is the study of discontinuities in natural states and structures, as well as in socioeconomic activities, using topology, singularity theory, the qualitative theory of differential equations, and stability theory. Mutation theory organically combines dissipative structure theory with coordination theory to promote the development of systems theory.
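As an illustration of the kind of discontinuity mutation theory describes, the standard "cusp" model from the mathematical literature is sketched below. It is not taken from the text, and the variable names x, a and b are assumptions chosen for the example: x is the state of the system and a, b are slowly varying control parameters.

```latex
% Cusp model (a textbook example, used here only as an illustration).
% Potential function of the state x with control parameters a, b:
\[
  V(x; a, b) = \tfrac{1}{4}x^{4} + \tfrac{1}{2}a\,x^{2} + b\,x
\]
% Equilibrium states are the stationary points of V:
\[
  \frac{\partial V}{\partial x} = x^{3} + a\,x + b = 0
\]
% The number of equilibria changes on the curve where the cubic has a
% repeated root (its discriminant vanishes):
\[
  4a^{3} + 27b^{2} = 0
\]
% Crossing this curve while a, b change smoothly can make the state x
% jump abruptly from one equilibrium to another.
```

The point of the example is that smooth changes in the control parameters can produce a sudden change of state, which is the behavior the text attributes to systems passing a critical value.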

In recent years, some people have been trying to introduce the "three new theories" into information research. Some preliminary attempts have been made. However, it will still take a great deal of hard work to understand how dissipative structure theory and coordination theory can be used to interpret the theories and practices of intelligence.

Section Six -- Significance of Intelligence, Information and Collection Research

From the standpoint of day-to-day work, the significance of this research does not seem hard to state. From the standpoint of science and technology, however, the task is not easy. One must consider the impact of the technological and industrial revolution brought about by new technology, and also that major advances in systems science are right around the corner, that major breakthroughs in thinking science are imminent, and that a new scientific revolution is in the near future. Against such a broad background, it is difficult to express clearly the significance of studying intelligence, information and collection.

Even so, our feeling is that the significance and impact of such a study are definitely not limited to improving the effectiveness of various tasks such as data collection, sequencing, indexing and activation. Nor are they limited to making better use of the information. Roughly speaking, in addition to understanding how to utilize information fully and in a timely way to create intellectual wealth and total knowledge for mankind, such a study also involves how the human brain can function to the full extent. Let us assume that the thinking mechanism of the human brain is understood, artificial intelligence has been developed and knowledge engineering is widespread. By then, information technology will also have advanced considerably. What would the state of intelligence, information and collection be? What is going to happen to other sciences and technologies? What kind of impact would it have on the industrial and technological revolution? If these questions are asked, then indeed it is impossible to accurately and comprehensively describe the significance of studying intelligence, information and collection.

As for collection science, it is still in the inception stage. As the social function of intelligence continues to develop, a series of changes is bound to occur in the concept and understanding of collection. Eventually, people will realize the significance of collection research. It is, at least, a major social issue concerning the gathering of humanity's intellectual wealth.

In conclusion, when considering the profound significance of these topics, one should not limit oneself to the task at hand. One should keep one's eyes open and think further ahead.
