Network Analysis Technology

A vast number of documents accumulate in offices every day, yet many of them remain unorganized. As a result, documents sometimes cannot be retrieved in time when they are needed, and the same documents often have to be prepared again because the originals cannot be found. Fuji Xerox has been conducting research on technology to solve these problems by forming a network of the documents accumulated in an office based on how they are related, and then analyzing this network to find useful information. This network analysis technology uses an algorithm that models the brain's excellent ability to recall the appropriate memories in response to a given situation or context.
Thanks to recent progress in neuroscience, the mechanism of memory recall in the brain has been revealed in considerable detail. Fuji Xerox has also engaged in computational research to elucidate the neural mechanism of memory recall. The proposed algorithm was inspired by the evidence obtained from these two lines of fundamental research.
"Memory" is categorized into "long-term memory (LTM)" and "short-term memory (STM)." LTM is stored in the brain as an associative network of ideas that have been acquired through various experiences from birth. STM, on the other hand, is a process to temporarily activate a group of ideas extracted from LTM in response to a given situation or context. The associative network of a group of ideas thus activated, which is a sub-graph of the LTM network, is referred to as a "community."
The LTM network in the brain can be compared to a variety of complex networks (e.g., the WWW/internet, citation networks of documents, networks of people in social networking services, and gene regulatory networks or protein interaction networks in biological cells). This comparison suggests that relevant information can be extracted from such networks as a "community" by simulating the way the brain recalls STM from LTM (Fig. 1).

Fig. 1: Extraction of Relevant Information from a Large-Scale Complex Network by Analogy to Memory Retrieval in the Brain

Here we demonstrate the application of our brain-inspired computational algorithm to a citation network of Japanese patents (Note1) (Fig. 2). First, a user specifies a set of patents putatively related to the technological topic that the user wants to explore. These patents are referred to as "seed patents." Initial "activation" is set on the seed patents. Activation then spreads along the citation links in the network of patents. This process corresponds to the propagation of neuronal activation in a network of neurons (Note2). The spreading activation eventually converges to a steady state that depends on the seed patents representing the topic. This corresponds to the state in which the brain has recalled STM.
In the steady state, patents that are highly relevant to the topic have high activity values. Sorting the patents in descending order of activity therefore ranks them by their relevance to the topic. In addition, visualizing the network of highly relevant patents (namely, the extracted community) provides an overview of the field of technology that the user wants to explore.
In this visualization (Fig. 2), each document icon represents a patent, the size of the icon expresses the activity value (level of relevance) of the patent, and the color identifies the applicant (company). The arrows represent citation relations between patents. Each arrow points from a cited patent to the citing patent, indicating that the latter was influenced by the former. The icons are arranged by disclosure date from top (oldest) to bottom (newest). The user can thus see how this technological field has developed and which patents, or which relations between patents, are the most influential in it.
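The steps described above (setting activation on seed patents, spreading it along citation links until a steady state is reached, ranking patents by activity, and extracting the community) can be illustrated with a minimal sketch. It is not the algorithm actually used by Fuji Xerox: the article does not give the update rule, so the sketch assumes a random walk with restart, treats citation links as undirected for the purpose of spreading, and uses hypothetical function names, a hypothetical `restart` parameter, and made-up patent identifiers.

```python
from collections import defaultdict


def spread_activation(citations, seeds, restart=0.15, tol=1e-10, max_iter=1000):
    """Return {patent: activity} at the (approximate) steady state.

    citations: iterable of (cited, citing) pairs.
    seeds: patents that receive the initial activation.
    Assumption: random walk with restart stands in for the unspecified rule.
    """
    neighbors = defaultdict(set)
    for cited, citing in citations:
        # Assumption: activation may flow in both directions along a citation.
        neighbors[cited].add(citing)
        neighbors[citing].add(cited)
    seed_set = set(seeds)
    for seed in seed_set:
        neighbors[seed]  # make sure seeds without citations are in the network

    nodes = list(neighbors)
    initial = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    activity = dict(initial)

    for _ in range(max_iter):
        # A fixed fraction of the activation is re-injected at the seeds,
        # which makes the steady state depend on the chosen seed patents.
        updated = {n: restart * initial[n] for n in nodes}
        for n in nodes:
            outgoing = (1.0 - restart) * activity[n]
            if neighbors[n]:
                share = outgoing / len(neighbors[n])
                for m in neighbors[n]:
                    updated[m] += share
            else:
                updated[n] += outgoing  # an isolated patent keeps its activation
        change = sum(abs(updated[n] - activity[n]) for n in nodes)
        activity = updated
        if change < tol:
            break
    return activity


def rank_patents(activity):
    """Sort patents in descending order of activity (relevance to the topic)."""
    return sorted(activity.items(), key=lambda item: item[1], reverse=True)


def extract_community(citations, activity, top_k):
    """Return the top_k most active patents and the citations among them."""
    members = {patent for patent, _ in rank_patents(activity)[:top_k]}
    edges = [(cited, citing) for cited, citing in citations
             if cited in members and citing in members]
    return members, edges


if __name__ == "__main__":
    # Hypothetical citation pairs (cited, citing); not real patent numbers.
    demo_citations = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P8", "P9")]
    demo_activity = spread_activation(demo_citations, seeds=["P1"])
    for patent, value in rank_patents(demo_activity):
        print(f"{patent}: {value:.4f}")
    print(extract_community(demo_citations, demo_activity, top_k=3))
```

In this sketch, patents that cannot be reached from the seeds keep an activity of zero, so they fall to the bottom of the ranking and stay out of the extracted community. Note that a seed patent is not guaranteed to rank first: a well-connected patent near the seeds can accumulate more activation than the seed itself.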

Fig. 2: Extracting Relevant Patents from a Citation Network

We also plan to apply our algorithm to general office documents other than patents. By modeling the brain's excellent capability to process large amounts of information, we will continue to take on the challenge of creating new technologies for handling large numbers of documents.

  • Note1 A network of patents established through examiner citation
  • Note2 In the brain, information is processed by the propagation of neuronal activation in a network composed of an enormous number of neurons.