Title: ENVIRONMENTAL INFORMATICS (1)
1ENVIRONMENTAL INFORMATICS (1)
- Draft outline of a discipline devoted to the
study of environmental information: creation, storage, access, organization,
dissemination, integration, presentation and
usage
- Rudolf B. Husar
- Center for Air Pollution Impact and Trend
Analysis (CAPITA)
- Washington University, St. Louis, MO 63130
- September 1992
2ENVIRONMENTAL INFORMATICS: Application of
Information Science, Engineering and Technology
to Environmental Problems
- Rudolf B. Husar
- Director, Center for Air Pollution Impact and
Trend Analysis
- Washington University, St. Louis, MO
- Environmental information is becoming
unmanageable by traditional methods.
- There is a need to develop effective methods to
store, organize, access, disseminate, filter,
combine and deliver this peculiar resource.
- Information science seeks to explain information as
a resource and the manner in which it is created,
transformed and used.
- Information engineering deals with the design of
information systems, while information technology
deals with the actual processes of storage,
transformation and delivery.
- Presented topics will include: information as a
resource; the user-driven data model; value-added
processes; application of database, geographic
information system, hypertext, multimedia and
expert system technologies; and the integration of
these technologies into information systems.
- The principles of Environmental Informatics will
be discussed in the context of Global Change
databases organized by ORNL-CDIAC, NASA and by
Washington University.
- The talk will be augmented by a live demonstration
of the Voyager 1 Data Delivery System that
combines database, GIS, hypertext, direct
manipulation and multimedia technologies.
3THE PROBLEM
- The researcher cannot get access to data;
- if he can, he cannot read them;
- if he can read them,
- he does not know how good they are;
- and if he finds them good,
- he cannot merge them with other data.
- From:
- Information Technology and the Conduct of
Research: The User's View.
- National Academy Press, Washington, D.C. 1989
4 DATA PATHWAY
- Monitoring Site
- Principal Investigator
- Information Centers
5INFORMATICS - THE SCIENCE
- Systems exist that organize, store, manipulate,
retrieve, analyze, evaluate, and provide
information in various chunks to a variety of
people.
- The practice of informatics has evolved from
professional know-how and technology, not as a
product of 'basic' research.
- Informatics is in a pre-scientific stage of
naming, taxonomy, descriptions and definitions.
- First, we need to understand how existing
information systems work.
- Next, we need to formulate a model of these
practices: components, activities, values added,
clients served and the problems solved by the information system (IS).
- Finally, we have to apply the newly gained
insights (science) to the design of better IS.
- Note: The steam engine was used in practice well
before the Carnot cycle theory was invented.
- SCIENCE
- The field is in a pre-scientific stage. Mostly
taxonomy of working systems.
- Goals:
- Understanding the forms of environmental
knowledge
- Usages of environmental knowledge
- Processes of new knowledge creation
6 INFORMATICS - THE ENGINEERING
- Information systems exist that organize, store,
manipulate, retrieve, analyze, evaluate, and
provide information in various chunks to a
variety of people.
- Design of information storage and flow systems.
Emphasis on user-driven design to complement
technology- and content-driven info flows.
- Goals:
- Augment human decision and learning
processes.
- Unite data and metadata.
- Reduce resistances to info flow.
- The activities of information engineering
include:
- Matching the information need of the user to the
information sources, using available technology.
- Developing methodologies for the organization,
transformation and delivery of environmental
data/information/knowledge.
- Identifying the key information values and the
processes that will enhance those values.
- Seeking out a set of universal values that can be
added to information and that are independent of the
user environment (i.e. accessibility, common
coding, and documentation).
- Developing new tools that will enhance and augment
the human mind in dealing with environmental
information, e.g. to minimize the 'info-glut'.
7 INFORMATICS - THE TECHNOLOGY
- Information systems are implemented using
suitable technologies. The information revolution
is driven by the confluence of computer hardware,
software and communications technologies.
- Hardware: Computers, communications,
microelectronics.
- Software: Database, hypertext, geographic
information systems (GIS), multimedia,
object orientation.
- Communications: Wide area (Internet) and local
networks, bulletin boards, CD-ROM.
- Intellectual Technologies: Indexing,
classification/organization, searching,
presenting.
- These technologies, along with developments in
information engineering and science, provide the
hope to overcome the information/data glut.
- Develop knowledge and data storage, delivery and
processing systems.
- Goals:
- Merge database, hypertext, numerical
modeling technologies
- User-programmable, socially well-behaving info
systems
- Ultimately interoperable with the universe
8USER-DRIVEN INFORMATION PROCESSING
- Action
- DECISION PROCESSES: matching goals, compromising, bargaining, choosing
- Productive Knowledge
- JUDGMENTAL PROCESSES: presenting options, advantages, disadvantages
- Informing Knowledge
- ANALYZING PROCESSES: separating, evaluating, validating, interpreting, synthesizing
- Information
- ORGANIZING PROCESSES: grouping, classifying
9VALUE ADDED PROCESSES
- Metaphors are useful in describing new,
unfamiliar topics. Environmental information
systems can be viewed as refineries that
transform low-value data into information and
knowledge through a series of value-adding
processes.
- Data constitute the raw input from which
productive knowledge, used for decision making, is
derived.
- Data refers to numbers, files and the associated
labeling that describes them. Data are turned into
information when one establishes relationships
among data, e.g. in a relational database. Informing
knowledge educates, while productive knowledge is
used for decision making.
- In fact, one of the practical definitions of
knowledge is 'whatever is used for
decision-making'.
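- A minimal sketch of the "data into information" step, assuming a relational store: two raw tables become information once a relationship (a join on a shared key) is established between them. The table names, column names and values below are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw data: a site list and a set of observations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (site_id TEXT PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE obs  (site_id TEXT, year INTEGER, ozone_ppb REAL);
""")
conn.executemany("INSERT INTO site VALUES (?, ?, ?)",
                 [("A1", "Downtown", "MO"), ("B2", "Rural", "IL")])
conn.executemany("INSERT INTO obs VALUES (?, ?, ?)",
                 [("A1", 1990, 62.0), ("A1", 1991, 58.0),
                  ("B2", 1990, 41.0), ("B2", 1991, 43.0)])

def mean_by_state(connection):
    """Relate observations to their sites, then summarize per state."""
    rows = connection.execute("""
        SELECT s.state, AVG(o.ozone_ppb)
        FROM obs AS o JOIN site AS s ON o.site_id = s.site_id
        GROUP BY s.state ORDER BY s.state
    """).fetchall()
    return dict(rows)

summary = mean_by_state(conn)
print(summary)
```

Neither table alone says anything about states; the join is the value-adding step that relates a number to a place.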
10USES OF ENVIRONMENTAL DATA
- Environmental data/information is used to:
- Provide Historical Record
- Identify Deviation from Expected Trend
- Anticipate Future Environmental Problems
- Provide Legal/Regulatory Record
- Support Research
- Support Education
- Support Communication
- The main uses are in science, education and to
support regulations.
11CONTENT, TECHNOLOGY AND USER DRIVEN DATA FLOWS
- Most agencies are disseminating information
relevant to their own domain of activity. Such
data flow is content driven.
- New technologies such as papyrus, the printed
book, CD-ROM and computer networks provide bursts
of information flow, resulting in
technology-driven information flows.
- However, in scientific, educational, and
regulatory use of environmental data, there is a
need for compatible information from various
domains, requiring data merging and synthesis.
Such data flow is user driven, since the user
dictates the form, content and flow of the data.
- Content- and technology-driven data flows are fine,
but they are inadequate to handle modern
information needs. The challenge is to develop
the user-driven model and to reconcile and
integrate it with the other models.
12ENVIRONMENTAL INFORMATICS
- The study of environmental information and its
use in environmental management, science and
education.
- More than the study of computers in
environmental information. Its focus is on the
environmental field, rather than on computers and
the technology.
- Precedent: Medical Informatics, a mature field
with a goal, domain, textbooks, college courses,
research groups and funding agencies.
- ENVIRONMENTAL INFORMATICS, EI
- A tentative definition of EI is:
- The study of environmental information and its
use in decision making, education and science.
- EI focuses on the environmental field, rather
than on computers and the technology. Its
approach is to systematically study environmental
information through the branches of science, engineering
and technology.
- Much of the presentation below is a synthesis of
ideas 'borrowed' and adapted to the environmental
field.
- Other relevant fields include library sciences,
management sciences, information engineering.
13INFORMATION AS A RESOURCE
- Environmental information, and information in
general, has several unique characteristics. In
the post-industrial era, material goods were
replaced by information as the commodity of
transactions. It became a resource in itself.
- Like other resources, information needs to be
acquired, organized and distributed, i.e. managed.
However, it is a remarkable resource:
- It cannot be depleted by use.
- In fact, it expands and gets better with use.
- Information is not scarce; it is in chronic
surplus.
- Scarcity is in the time to process it into
knowledge.
- The processing costs are borne by the info
user.
- Info can be owned by many at the same time.
- It is shared, not exchanged, in transactions.
- Therefore, one must develop different tools from
those that proved useful for natural, capital,
human and technological resource management.
14 DATA FLOW IMPEDIMENTS
15 ASSUMPTIONS AND RATIONALE
- For the foreseeable future, environmental
information will grow in quantity and quality.
- Individual agencies are collecting, organizing,
and disseminating information relevant to their
own domain of activity.
- There is not enough manpower and time to digest,
analyze, integrate, and ultimately make use of
the accumulated environmental information.
- Therefore, there is a need for a systematic
effort to develop suitable data organization,
manipulation, integration, and delivery systems.
- A possible mechanism for accomplishing this
task is to form a consortium of
informatics-minded institutions - the EI Group.
- The problem is not so much the quantity of data,
but rather the form in which it is delivered, e.g.
the automobile windshield delivers lots of data
but we can still process it with ease.
- What is needed is a faster way to metabolize the
expanding environmental data sets.
- Therefore, there is a need for a systematic
effort to better understand environmental
information: its characteristics, use and
management.
16USER DRIVEN FLOW OF ENVIRONMENTAL INFORMATION
- In scientific, educational, and regulatory use of
environmental data, there is a need for a
multiplicity of compatible data sets and
knowledge from various domains.
- There is a set of universal values that can be
added to the data, such as accessibility, common
coding, and documentation. These values, in
conjunction with a set of software tools, could
minimize the "info-glut".
- Use of data for science, education, regulation
and policy requires:
- Specification of the information need by the
user.
- An information system (model, educational
software or a decision support system) that is
capable of delivering the needed information.
- Domain data supplied by the producer or brokers.
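- A minimal sketch of "common coding" and "documentation" as values added to data, assuming two hypothetical producers that report the same quantity in different units. The record layout, unit table and agency names are illustrative assumptions, not part of any existing system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    site: str
    year: int
    value: float
    unit: str    # documentation travels with the number
    source: str  # who produced it

# Hypothetical conversion factors to a common coding (micrograms per cubic metre).
TO_UG_M3 = {"ug/m3": 1.0, "mg/m3": 1000.0}

def to_common_coding(rec):
    """Re-express one record in the common unit, keeping its provenance."""
    return Record(rec.site, rec.year, rec.value * TO_UG_M3[rec.unit],
                  "ug/m3", rec.source)

def merge(*datasets):
    """Merge producer data sets only after they share a common coding."""
    merged = [to_common_coding(r) for ds in datasets for r in ds]
    return sorted(merged, key=lambda r: (r.site, r.year))

agency_a = [Record("A1", 1990, 35.0, "ug/m3", "agency A")]
agency_b = [Record("A1", 1991, 0.04, "mg/m3", "agency B")]
combined = merge(agency_a, agency_b)
for rec in combined:
    print(rec)
```

The user specifies the need (one site, one unit, all years); the merge step is only possible because each record carries its unit and source along with the value.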
17POSSIBLE ACTIVITIES OF EI GROUP
- EI Science: Define the domain of EI;
environmental information as a resource; seek
general laws of EI; info uses; driving forces.
- EI Engineering: Study the components of EI
systems; creation; value-added processes;
data/information/knowledge structures for storage
and transmission; design of EI systems.
- Education: Develop educational materials on EI;
conduct workshops, training sessions.
- Work closely with others on:
- Data Integration: Collect, reconcile,
integrate, document data/information/knowledge
bases.
- Data Exchange: Foster exchange through
depositories, data catalogs, transfer mechanisms,
and nomenclature standards.
- Tools Development: Evaluate and develop
software tools for the access, manipulation, and
presentation of environmental information.
18REQUIREMENTS FOR THE EI GROUP
- The EI GROUP has to have a solid understanding
of environmental data needs for science,
education, policy development, regulations,
and other uses.
- Know how to translate the information needs into
information systems and to design data flow and
transformation systems (information engineering).
- The EI GROUP has to be well versed in modern
information science and technology as applicable
to environmental informatics. Where necessary,
the Group has to develop new concepts and
technologies.
- It has to interface with the users of the
environmental information, to assure the
usefulness of the effort.
- Interface with, and utilize, the existing
governmental and private data sources, building
on and enhancing, not competing with, those efforts.
19OUTPUT OF THE EI GROUP
- Technology: Adopt and apply evolving technologies
for DBMS, GIS, Hypertext, Expert Systems, User
Interfaces, Multimedia, Object Orientation.
- Public Databases: Prepare relevant, high-quality,
well-documented, compatible, integrated, raw and
aggregated environmental databases to be usable
for science, education, enforcement, and other
purposes. Make such high-quality, high-value
environmental information available to many
users.
- Software Tools: Provide "smart" data
display/manipulation tools that will help turn
data into knowledge, e.g. GIS, Voyager, Movie,
Hypertext, Video/Sound.
- Federal agencies have recognized these needs
and formed the Interagency Working Group on Data
Management for Global Change (IWGDMGC).
- The federal effort could be augmented by
companion academic efforts, possibly through a
consortium of informatics-minded institutions -
the EI Group.
20POSSIBLE ACTIVITIES OF THE EI GROUP
- Data Integration: Collect, reconcile,
integrate, document information bases.
- Data Exchange: Foster exchange of environmental
data through depositories, data catalogs,
transfer mechanisms, and nomenclature standards.
- Tools Development: Evaluate and develop
software tools for the access, manipulation, and
presentation of environmental information.
- End-Use Projects: Conduct specific research and
development projects for science, education, and
regulations.
- Education: Conduct workshops, training
sessions, and prepare educational material for
environmental informatics.
22EI GROUP OUTPUT
- New Developments: Environmental Informatics
- Science: Define the domain of EI. Develop new
methods to classify, organize, and create
environmental knowledge.
- Engineering: Create an infrastructure and
methodology for the organization, transformation,
and delivery of environmental information.
- Technology: Examine the evolving technologies for
Database Management Systems (DBMS), Geographic
Information Systems (GIS), Hypertext, Expert
Systems, User Interface, Multimedia, Object
Orientation. Apply and adapt these technologies
to environmental information.
- Provide High Grade Environmental Databases for
Public Use
- Prepare relevant, high-quality, well-documented,
compatible, integrated, raw and aggregated
environmental databases to be usable for science,
education, enforcement, and other purposes. Make
such high-quality, high-value environmental
information available to many users.
- Provide Software Tools
- Provide "smart" data manipulation tools that will
help turn data into knowledge.
- Provide tools for data access, manipulation, and
presentation (e.g. GIS, Voyager, Movie,
Hypertext, Video/Sound).
23Funding
24Information and Decision Making (1) - Arno Penzias,
Ideas and Information
- An instrument operator, traffic controller,
economist ... all process information. A common
thread among these activities is decision
making. A decision may be as simple as
selecting ... replacing a ... or as complex as
developing new clean air legislation. Decisions
are followed by actions, and actions generally result in
new information. This rather circular behavior
keeps the decision process going until some goal
is met, the task is finished, or the project is
set aside for a time. Healthy flow of information
separates winning organizations from losers.
(More on the flow concept here)
- For quality information, today's consistently
successful decision makers rely on a combination
of man and machine. Getting the best combination
requires understanding how the two fit together
and the roles each may play. It also requires
having an information strategy that is suitable
for both the decision-maker's preferences and the
problem at hand.
- Knowledge is whatever information is used to make
a decision.
- "Deciding" is acting on information.
- Managers are transformers of information.
- p. 125
25Information and Decision Making (2) - Arno Penzias,
Ideas and Information
- Information Flow and Decision Making
- An instrument operator, traffic controller,
economist ... all process information. A common
thread among these activities is decision
making. A decision may be as simple as
selecting ... replacing a ... or as complex as
developing new clean air legislation. Decisions
are followed by actions, and actions generally
result in new information. This rather circular
behavior keeps the decision process going until
some goal is met, the task is finished, or the
project is set aside for a time.
- Barring blind luck, the quality of a decision
cannot be any better than the quality of the
information behind it.
- Healthy flow of information separates winning
organizations from losers. (More on the flow
concept here)
- Knowledge is whatever information is used to make
a decision.
- "Deciding" is acting on information.
- Managers are transformers of information.
- p. 125
- Despite the explosive growth in computing, we
have yet to feel the full impact of the
information-processing resource that
microprocessors offer. The computing power will
intensify the challenge of developing ever more
powerful methods of telling machines to do what
we wish them to do. This requires the solution
of "the software problem".
- Solving "the software problem" includes
producing software more quickly, with fewer bugs
and at lower cost: software that is easier to
understand, modify and reuse in different
applications. Give users the ability to customize
a system by modifying it.
26Information and Decision Making (3) - Arno Penzias,
Ideas and Information
- UNIX - Social behavior
- Most applications use different formats to move
information between them. UNIX programs
communicate with each other in a specific way.
This arrangement allows the programmer to plug
programs together like Lego sets, without
worrying about the details of interfacing. UNIX's
modularity permits users to build customized
application programs out of modular parts from
libraries and programs borrowed from friends.
Convenient "user programmability" has the
potential to unleash the creative powers of many
users instead of relying on the program creator
for all the insights needed to create well-suited
applications.
- What next? The search for nonprocedural programming
that frees users from worrying about how a given
task is to be accomplished and allows them to
merely state what they want.
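- A minimal sketch of the "plug programs together like Lego sets" idea, written in Python rather than as a UNIX shell pipeline. Each stage accepts and returns a stream of text lines, so any stage can be connected to any other. The helper names (compose, grep, cut, sort_unique) and the sample records are illustrative assumptions that merely echo familiar UNIX tools.

```python
from functools import reduce

def compose(*stages):
    """Chain stages left to right, much like `a | b | c` in a shell."""
    return lambda stream: reduce(lambda data, stage: stage(data), stages, stream)

def grep(word):
    """Keep only the lines that contain `word`."""
    return lambda lines: (ln for ln in lines if word in ln)

def cut(field, sep=","):
    """Keep only one separator-delimited field of each line."""
    return lambda lines: (ln.split(sep)[field] for ln in lines)

def sort_unique(lines):
    """Sort the lines and drop duplicates."""
    return sorted(set(lines))

records = [
    "StLouis,ozone,62",
    "Chicago,ozone,55",
    "StLouis,SO2,12",
    "StLouis,ozone,58",
]

# Which cities report ozone? Built only from reusable parts.
pipeline = compose(grep("ozone"), cut(0), sort_unique)
result = pipeline(records)
print(result)  # ['Chicago', 'StLouis']
```

Because every stage shares the same interface (lines in, lines out), a user can rearrange or borrow stages without knowing how any of them works internally.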
27Information and Decision Making (4) - Arno Penzias,
Ideas and Information
- Networking
- To benefit from information created for different
purposes, under different conditions and at
different locations, users need convenient
interfaces to the systems providing the data.
Ultimately, the intervening networking technology
that provides the interface should be flexible
enough to accept information in whatever format
the data source provides it and translate it to
the format most suitable for human
perception.
- Human pattern recognition skills, tactile
sensitivities and similar interfaces to the
external world attest to the massive processing
power that the brain dedicates to such functions.
- Evidently, the experience of evolution has
demonstrated the need for a variety of sensitive
interfaces. The greatest subtlety of our own
human interfaces appears to be in the way we
effortlessly integrate disparate sensory inputs.
It is the single good feeling you get in a
theater or sports arena from words, music,
spectacle, and someone sitting next to you, all
at the same time. In contrast, most of our
present technology tends to deal with each input
(the words, the visual input, etc.) as a separate
entity.
- User preferences and productivity needs are the
driving forces behind the call for better
interfaces between people and machines.
- Much of the additional computer processing power
will be devoted to providing better interfaces
between people and machines.
28Information and Decision Making (5) - Arno Penzias,
Ideas and Information
- Computers and human information processing
- Computers afford humans much valuable help
in processing massive amounts of data. However,
machines are best at manipulating numbers or
symbols; people connect them to meaning.
- Machines offer little serious competition in areas
of creativity, integration of disparate
information, and flexible adaptation to
unforeseen circumstances. Here the human mind
functions best. Computing systems lack a key
attribute of human intelligence: the ability to
move from one context to another.
- Just-in-time information processing; symbiotic
co-evolution
- Computers and communication systems can speed up
the ... Connectivity can speed ...
- Today, access to on-line data reduction schemes
enables us to think about the results as we get
them. These better tools can profoundly change
the way we work. Today, we can ask questions in
time to get answers, make decisions and create
more powerful ideas. Generate knowledge faster.
- While ideas flow from human minds, computers can
help shape much of the information that leads
to those ideas. By providing needed information
in a timely way and in digestible form, electronic
data processing and delivery systems can help someone
make informed decisions.
- Tools of the mind; mind amplifying. In the same way as
the steam engine amplifies human physical power,
computer/communication technologies can amplify
human mental powers.
- In this sense, the goal of the information
technology promoted here is not so much to
introduce artificial intelligence, but to
amplify the actual intelligence of humans to
perform increasingly complex tasks.
31Spatial Time Series Analysis-Forecasting-Control,
Bennett, R.J., Pion Limited, London 1979
- Description (Characterization)
- In order to understand the functioning of
organisms, one has to understand:
- 1. individual holons (downward face)
- 2. the relationship between the holons (upward);
Koestler's holarchy
- Involves summarizing the response characteristics
of the system by purely descriptive measures.
- Description is accomplished by monitoring,
followed by descriptive statistics.
- Explanation
- Associate and explain events that occur in
space-time. Build associative, causal
relationships; build a model. Analysis stages (p.
20):
- Stage 1. Prior hypothesis of system structure
- Stage 2. System identification and specification
- Stage 3. Parameter estimation
- Stage 4. Check of model fit
- Stage 5. System explanation, forecasting,
control
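- A minimal sketch of the five analysis stages, assuming the simplest possible structure: a purely temporal first-order autoregressive model x[t] = phi * x[t-1] + noise, fitted by least squares. This is an illustration of the workflow only, not the spatial models treated in the book; the function names and the synthetic series are hypothetical.

```python
import random

def fit_ar1(series):
    """Stage 3: least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    num = sum(a * b for a, b in zip(series[:-1], series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den

def residual_variance(series, phi):
    """Stage 4: mean squared one-step residual as a crude check of fit."""
    res = [b - phi * a for a, b in zip(series[:-1], series[1:])]
    return sum(r * r for r in res) / len(res)

def forecast(last_value, phi, steps):
    """Stage 5: iterate the fitted model forward from the last observation."""
    out = []
    for _ in range(steps):
        last_value = phi * last_value
        out.append(last_value)
    return out

# Stages 1-2: hypothesize and specify a first-order autoregressive structure,
# here tested on a synthetic series with a known coefficient.
random.seed(0)
true_phi = 0.8
x = [1.0]
for _ in range(499):
    x.append(true_phi * x[-1] + random.gauss(0.0, 0.1))

phi_hat = fit_ar1(x)
print("estimated phi:", round(phi_hat, 2))
print("residual variance:", residual_variance(x, phi_hat))
print("3-step forecast:", forecast(x[-1], phi_hat, 3))
```

If the residual variance in Stage 4 is large relative to the variance of the series, the hypothesized structure from Stage 1 should be revisited before any forecasting or control is attempted.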
32Moore's Law
- The single most important thing to know about the
evolution of technology is Moore's Law. Most
readers will already be familiar with this "law."
However, it is still true today that the best of
industry executives, engineers, and scientists
fail to account for the enormous implications of
this central concept.
- Gordon Moore, a founder of Intel Corporation,
observed in 1965 that the trend in the
fabrication of solid state devices was for the
dimensions of transistors to shrink by a factor
of two every 18 months. Put simply, electronics
doubles its power for a given cost every year and
a half.
- In the three decades since Moore made his
observation the industry has followed his
prediction almost exactly. Many learned papers
have been written during that period predicting
the forthcoming end of this trend, but it
continues unabated today. Papers projecting the
end are still being written, accompanied with
impressive physical, mathematical, and economic
reasons why this rate of progress cannot
continue. Yet it does.
- Moore's Law is not a "law" of the physical world.
It is merely an observation of industry behavior.
It says that things in electronics get better,
that they get better exponentially, and that this
happens very fast. Some, even Gordon Moore
himself, have conjectured that this is simply a
self-fulfilling prophecy. Since every corporation
knows that progress must happen at a certain
rate, they maintain that rate for fear of being
left behind.
- It is also possible that Moore's Law is much
broader than it appears. Possibly it applies to
all of technology, and has applied for centuries
while we were unaware of its consequences or
mechanisms. Perhaps it was only possible to be
explicit about technological change in 1965
because the size of transistors gave us for the
first time a quantitative measure of progress. If
this is so, then we are embedded in an expanding
universe of technology, where the dimensions of
the world about us are forever changing in an
exponential fashion.
- The notion of exponential change is deceptively
hard to understand intuitively. All of us are
accustomed to linear projection. We seem to view
the world through linear glasses -- if something
grows by a certain amount this year, it will grow
an equal amount the next year. But according to
Moore's Law, electronics that is twice as
effective in a year and a half will be sixteen
times as effective in 6 years and over a thousand
times as effective in 15 years. This implies
periodic overthrows of everything we know. An
executive in the telecommunications industry
recently said that the problem he confronted was
that the "mean time between decisions exceeded
the mean time between surprises." Moore's Law
guarantees the frequency of surprises.
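- The arithmetic behind "sixteen times in 6 years" and "over a thousand times in 15 years" can be checked with a short sketch. It assumes the 18-month doubling period stated above; the function names are illustrative, and the linear projection is one hypothetical reading of "linear glasses" (adding the first period's gain every period).

```python
def improvement_factor(years, doubling_period_years=1.5):
    """Exponential growth: one doubling per period."""
    return 2 ** (years / doubling_period_years)

def linear_projection(years, doubling_period_years=1.5):
    """'Linear glasses': add the first period's gain (one unit) every period."""
    return 1 + years / doubling_period_years

for years in (1.5, 6, 15):
    print(years, improvement_factor(years), linear_projection(years))
```

Six years is four doublings (2^4 = 16) and fifteen years is ten doublings (2^10 = 1024), while the linear projection reaches only 5 and 11 over the same spans.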
33Metcalfe's Law -- Network Externalities
- There is another "law" that affects the
introduction of new technology -- this time in an
inhibiting fashion. Metcalfe's Law, also known to
economists generally as the principle of network
externalities, applies when the value of a new
communications service depends on how many other
users have adopted this service. If this is the
case, then the early adopters of a given service
or product are disincented, since the value they
would obtain is very small in the absence of
other users. In this situation innovation is
often throttled.
- Metcalfe's law often applies to communications
services. A classic example, of course, is the
videotelephone. There is no value in having the
first videotelephone, and it only acquires value
slowly as the population of users increases. If
there are n users at a given time, then there are
n(n-1) possible one-way connections. Thus the
value grows as the square of the number of users.
The value starts slowly, then reaches some point
where it begins to rise rapidly. It seems as if
there needs to be a critical mass for takeoff,
and that there is no way to achieve that critical
mass, given the burden on initial subscribers. - Metcalfe's Law has defeated many technological
possibilities, left stillborn at the starting
gate of market penetration. Nonetheless, there
are important examples of breakthroughs. For
example, facsimile became a market success, but
only after decades of technological viability.
Even so, facsimile is a complex story, involving
the evolution of standards, the inevitable
progress of electronics, the equally-inevitable
progress in the efficiency of signal-processing
algorithms, and the rise of the business need for
messaging services. - Moore's and Metcalfe's laws make an interesting
pair. In the communications field Moore's law
guarantees the rise of capabilities, while
Metcalfe's law inhibits them from happening.
Devices that appear to have little intrinsic
value without the existence of a large networked
community continue to diminish in cost themselves
until they reach the point where the value and
cost are commensurate. Thus Moore's Law in time
can overcome Metcalfe's Law.
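The interplay described above can be sketched numerically. The connection count n(n-1) comes from the text; the cost curve (halving every year and a half) and the per-contact value are assumptions made only for illustration, and all function names are hypothetical.

```python
import math


def one_way_connections(n):
    """Possible one-way connections among n users: n(n-1), as in the text."""
    return n * (n - 1)


def device_cost(years, initial_cost, halving_period=1.5):
    """Assumed cost curve: the device cost halves every `halving_period` years."""
    return initial_cost / 2 ** (years / halving_period)


def break_even_users(cost, value_per_contact=1.0):
    """Smallest n at which one user's value, value_per_contact * (n - 1),
    is at least the device cost (assumed value model)."""
    return math.ceil(cost / value_per_contact) + 1


# The first videotelephone has no one to call.
print(one_way_connections(1))

# As the assumed cost falls, fewer users are needed before adoption pays off.
for years in (0, 6, 12):
    cost = device_cost(years, initial_cost=1000)
    print(years, cost, break_even_users(cost))
```

Under these assumed numbers the critical mass shrinks as the device gets cheaper, which is one way to read the claim that Moore's Law can in time overcome Metcalfe's Law.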
34Metcalfe's Law -- Network Externalities (2)
- Economists know it as the law of increasing
returns, of network externalities, but the idea
is that the more people that are connected to a
network the more valuable it is. Specifically,
the value of a network grows as the square of the
number of users. The value is measured by how
many people I can communicate with out there, so
the total value of the network grows as the
square of the number of users. Now, what this
means is that a small network has almost no
value, and a large network has a huge value.
What it gives you is the lock-in phenomenon of
winner takes all. You want to have the same
thing as everybody else. The idea is that you
don't want to be the first person on your block
to get the plague. But when all your friends get
it, you think about getting it. The more people
have it, the more you're likely to get it, and
suddenly there is this capture effect where
everybody has it. This law of network
externality governs so much of the business and
is at the heart of the Microsoft trial. Why does
Microsoft have a monopoly? Is this a natural
phenomenon that has to do with networks?
- David Reed coined another law -- Reed's Law -- that
says there's something beyond Metcalfe's Law.
There are three kinds of networks.
- First, there's broadcast like radio and TV, which
we'll call a Sarnoff network. The value of that
network is proportional to the number of people
receiving the broadcast. Amazon would be this
type of network, because people shop there but
don't interact with each other.
- Then there's the Metcalfe's Law-type network
where people talk to each other, for example,
classified ads. Reed said that the important
thing about the Internet is neither of those.
- The Internet exhibits a third kind of law, where
communities with special interests can form. The
thing about communities is there are 2^n of them,
so in a large network the value of having so many
possible communities and subnetworks is the
dominant factor. He predicts a scaling of
networks, starting with small networks having
only the Sarnoff linear factor, larger networks
dominated by the square factor, and giant
networks dominated by the 2^n factor of the
formation of communities.
- Napster is another example of what's going on in
information technology. First, it's an example
of the kind of network where winner takes all.
Napster is where all the songs are, so that's
where everybody else is. If Napster goes under,
when they go under, then all the little sites
won't be able to replace it because people won't
find what they want there. Napster also brings
up one of the other properties of information,
which is troublesome and is going to shape our
society in the coming years -- the idea that
information can be copied perfectly at zero
cost. That flies in the face of so much of what
we believe about commerce. As my friend Douglas
Adams said to me, we protect our intellectual
property by the fact that it's stuck onto atoms,
but when it's no longer stuck onto atoms, there
is really no way to protect it. He would like to
sell his books at half a cent a page, the idea
being that for every page you read, you pay him
half a cent. If you get into the book 20 pages
and you say, "This book is really bad," you don't
pay anymore. That would eliminate the "copying
of information at zero cost" issue that he
experiences as an author. He says people come up
to him in the street and say, "I've read your
book 10 times," and he says, "Yes, but you didn't
pay 10 times."
- So these are some of the things that trouble me
about the future of information technology. What
are its limits? Will the laws of network effects
doom us all to a shared mediocrity? What will
happen to intellectual property and its effect on
creativity? Is it like the railroads, or is this
something fundamentally different that will last
through the next century?
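The three scaling regimes attributed to Reed on this slide (linear for broadcast, square for pairwise connections, 2^n for possible communities) can be compared in a small sketch. The weights a, b and c are arbitrary assumptions, chosen only so that each regime is visible at some network size; they are not values from the talk, and the function names are hypothetical.

```python
def network_values(n, a=1.0, b=0.05, c=1e-30):
    """Value contributed by each kind of network effect for n users.

    a, b, c are assumed weights, not measured quantities.
    Intended for modest n (below about 1000), since 2**n is
    converted to a float here.
    """
    return {
        "sarnoff": a * n,        # broadcast: proportional to audience size
        "metcalfe": b * n ** 2,  # pairwise: grows as the square of the users
        "reed": c * 2 ** n,      # group-forming: one term per possible subset
    }


def dominant_term(n, **weights):
    """Name of the effect that contributes the most value at size n."""
    values = network_values(n, **weights)
    return max(values, key=values.get)


for n in (10, 50, 200):
    print(n, dominant_term(n))
```

With these assumed weights the small network is dominated by the linear term, the mid-sized one by the square term, and the large one by the 2^n term, which mirrors the predicted progression.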
35The Evolution of the World Wide Web
- The most important case study in communications
technology is the emergence of the World Wide
Web. This revolutionary concept seemed to spring
from nothingness into global ubiquity within the
span of only two years. Yet its development was
completely unforeseen in the industry -- an
industry that had pursued successive long and
fruitless visions of videotelephony, home
information systems, and video-on-demand, and had
spent decades in the development of ISDN with no
apparent application. It now seems incredible
that no one had foreseen the emergence of the
Web, but except for intimations in William
Gibson's science fiction novel Neuromancer, there
is no mention in either scientific literature or
in popular fiction of this idea prior to its
meteoric rise to popularity. - There is a popular notion that all technologies
take 25 years from ideation to ubiquity. This has
been true of radio, television, telephony, and
many other technologies prevalent in everyday
life. How, then, did the Web achieve such
ubiquity in only a few years? Well, the
historians argue, the Web relied on the Internet,
which in turn was enabled by the widespread
adoption of personal computers. Surely this took
25 years. We might even carry this further. The
personal computer would not have been possible
without the microprocessor, which depended on the
integrated circuit evolution, which itself
evolved from the invention of the transistor, and
so forth. By such arguments nearly every
development, it seems, could be traced back to
antiquity. - Although the argument about the origin and length
of gestation seems an exercise in futility, the
important point is that many revolutions are
enabled by a confluence of events. The seed of
the revolution may not seem to lie in any
individual trend, but in the timely meeting of
two or more seemingly-unrelated trends. In the
case of the World Wide Web the prevalence of PCs
and the growing ubiquity of the Internet formed
an explosive mixture ready to ignite. Perhaps no
invention was really even required. The world was
ready -- it was time for the Web. While this
physical infrastructure was forming in the
world's networks and on the desktops of users,
there was a parallel evolution of standards for
the display and transmission of graphical
information. HTML, the Hypertext Markup Language,
and HTTP, the Hypertext Transfer Protocol, were
unknown acronyms to the majority of technical
people, let alone the lay public. But the
definition of these standards that would enable
the computers and networks to exchange rich
mixtures of text and pictures was taking shape in
Switzerland at the physics laboratory CERN, where
Tim Berners-Lee was the principal champion.
- The role of standards in today's information
environment is critical, but often unpredictable.
What is really important is that many users agree
on doing something exactly the same way, so that
everyone achieves the benefits of
interoperability with everyone else. It is
exactly the same concept of network externalities
that is at work in Metcalfe's law. An
international standard can stimulate the market
adoption of a particular approach, but it can
also be ignored by the market. Unless users adopt
a standard it is like the proverbial tree falling
in the forest without a sound. Standards are, for
the most part, advisory. User coalitions or
powerful corporations can force their own
standards in a fascinating and ever-changing
multi-player game. Moreover, de facto standards
often emerge from the marketplace itself.
- So in the early 1990s there was a prevalent
physical infrastructure with latent capabilities
and an abstract agreement on standards for
graphics. One more development and two brilliant
marketing ideas were required to jumpstart the
Web. The development was that of Mosaic at the
National Center for Supercomputing Applications
at the University of Illinois. Mosaic was the
first widely used graphical browser, a type of program now known
throughout the world for providing a simple
point-and-click user interface to distributed
information. Following the initial versions of
Mosaic from NCSA, commercial browsers were
popularized by Netscape and Microsoft. - The revolutionary marketing ideas needed for the
Web now seem obvious and ordinary. A decade ago,
however, they were not at all obvious. One idea
was to enable individual users to provide the
content for the Web. The other idea was to give
browsers away free to everyone. Between these ideas,
Metcalfe's Law was overcome. Even though browsers
initially had almost no value, since there were
no pages to browse, they could be obtained
electronically at no cost. The price was directly
related to the value. Thus browsers spread
rapidly, just as their value began to build with
the accumulation of web pages. - Allowing the users to provide content was counter
to every idea that had been held by industry. The
telecommunications and computer industries had
tried for a decade to develop and market remote
access to information and entertainment held in
centralized databases. This was the cornerstone
of what were called "home information systems"
that were given trials in many cities during the
1970s and 1980s. Later, the vision pursued by the
industry was that of video-on-demand -- the dream
of providing access to every movie and television
show ever made, like a giant video rental store, over
a cable or telephone line.
- Virtually every large telecommunications company
had trials and plans for video-on-demand, and the
central multi-media servers required for content
storage were being developed by Microsoft,
Oracle, and others. The Web exemplifies some
powerful current trends -- the empowerment of
users, geographically-distributed content,
distributed intelligence, and intelligence and
control at the periphery of the network. Another
principle is that of open, standard interfaces
that allow users and third parties to build new
applications and capabilities upon a standardized
infrastructure. - It is hard to criticize industry for pursuing the
centralized approach. Imagine proposing the Web
to a corporate board in 1985, and describing how
browsers would be given away free, and how
industry would depend upon the users to provide
whatever content might appear. Even today many
corporations wonder and worry about the business
model for the Web, and few are making any profits
at all.
36Information Technology and the Conduct of
Research: The User's View. National Academy Press,
Washington, D.C. 1989
- Committee rationale: There are serious
impediments to the wider and more effective use
of information technology. Committee members were
active researchers outside the field of
"information technology". Lacking extensive
knowledge of the field, the panel approached it
by asking researchers about their experiences.
- p. 1: Information technology - the set of computer
and communications technologies - has changed the
conduct of scientific, engineering and clinical
research. New technologies offer the prospect of
new ways of finding, understanding, storing, and
communicating information and should increase the
capabilities and productivity of researchers.
Among these new technologies are simulations, new
methods of presenting observational and
computational results as visual images, the use
of knowledge-based systems as "intelligent
assistants" and more flexible and intuitive ways
for people to interact with and control
computers.
- The conduct of research: The everyday work of a
researcher involves writing proposals, developing
theoretical models, designing experiments,
collecting data, analyzing data, communicating
with colleagues, studying research literature,
reviewing colleagues' work, and writing articles.
The committee looks at three particular aspects of
research: data collection and analysis,
communication and collaboration, and information
storage and retrieval.
37Information Technology and the Conduct of
Research: The User's View. National Academy Press,
Washington, D.C. 1989
- DATA COLLECTION AND ANALYSIS
- It is one of the most widespread uses of
information technology in research.
- Trends
- Increased use of computers
- Dramatic increase of data storage and processing
capacity
- Creation of new computer-controlled instruments
that produce more data
- Increased communication among researchers using
networks
- Availability of software packages for standard
research tasks (e.g. statistical analysis)
- Difficulties
- 1. Uneven distribution of computing resources:
the haves and have-nots
- 2. Finding the right software. Commercial
software is often unsuitable for specialized
needs. Most researchers, although they are not
skilled software creators, develop their own
software with the help of graduate students. Such
software is designed for one purpose and is
difficult to understand, maintain, or transport
to other computing environments.
- 3. Transmitting data over networks at high speed.
- COMMUNICATION AND COLLABORATION
- Routine word processing and electronic mail are
the most pervasive forms of computer use.
Electronic publishing and data communication and
coordination are increasingly used.
- Trends
- Information can be shared more quickly
- New collaborative arrangements
- Difficulties
- Incompatibility of technologies
- Networks are anarchic.
38Information Technology and the Conduct of
Research: The User's View. National Academy Press,
Washington, D.C. 1989
- INFORMATION STORAGE AND RETRIEVAL
- How information is stored determines how
accessible it is. Scientific text is stored in
print (hard copy) and is accessible through the
indices and catalogs of a library. Data and
databases are stored mostly on computer disks.
- A database along with the procedures for
indexing, cataloging and searching makes up an
information management system.
- Difficulties
- The researcher cannot get access to data; if he
can, he cannot read them; if he can read them,
he does not know how good they are; and if he
finds them good, he cannot merge them with other
data.
- Difficulty accessing data stored by other
researchers. Such access permits reanalysis and
replication, both essential elements of
scientific process. At present data storage is
largely an individual researcher's concern, in
line with the tradition that researchers have
first right to their data. The result has been a
proliferation of idiosyncratic methods for
storing, organizing, and indexing data, with each
researcher's data essentially inaccessible to all
other researchers.
- Formats in data files vary from researcher to
researcher, even within a discipline. These
differences hinder a researcher from merging
someone else's data into his own database. Hence,
considerable effort must be dedicated to
converting data formats. There is also not enough
metadata.
- Finally, when a researcher reads another database,
he has no notion as to the quality of the data it
contains. The data sets do not have enough QC
information and descriptive metadata. There is a
need for evaluated high quality databases.
- Even given a high-quality, well-described
database, a major difficulty exists in conducting
searches. Most information searches are
incomplete, cumbersome, inefficient, expensive,
and executable only by specialists. Searches are
incomplete because the databases themselves are
incomplete. Updating is expensive because data are
stored in more than one database. Searches are
cumbersome and inefficient because different
databases are organized according to different
principles (data models).
- Another difficulty in storing data is
private ownership. By tradition, researchers hold
their data privately. In general, they neither
submit their data to a central archive nor make
their data available via computer. Increasingly,
however, in disciplines such as meteorology and
biomedical sciences, submission of primary data
into databanks has become accepted as a duty.
In some fields, the supporting agencies require
that the data be archived in machine-readable
format and that any professional article be
accompanied by a disk describing the underlying
data. Also, a comprehensive reference service for
computer-readable data (a master directory) should
be developed.
- In addition, peer review of articles and
proposals has been constrained by the difficulty
of gaining access to the data used for the
analysis. If writers were required to make their
primary data available, reviewers could repeat at
least part of the reported analysis. Such a
review would be more stringent, would demand more
effort from reviewers, and raises some operational
questions that need to be resolved, but would
arguably lead to more careful checking of
published results.
- Underlying difficulties in information storage
and retrieval are problems in the institutional
management of resources. Who is to manage,
maintain, and update information services? Who is
to create and enforce standards? At present, the
research community has three alternatives: the
federal government, which manages resources such
as MEDLINE and GenBank; professional societies
such as the American Chemical Society, which
manages the Chemical Abstracts Service; and
non-profit organizations such as the Institute for
Scientific Information.
39Information Technology and the Conduct of
Research: The User's View. National Academy Press,
Washington, D.C. 1989
- Recommendations
- Institutions supporting researchers must develop
support policies, services, and standards for
better use of information technology. The
institutions are universities, university
departments, funding agencies, scientific
associations, network administrators, information
service providers, software vendors and
professional groups.
- The Federal Government should support software
development for scientific research. The software
should meet standards of compatibility,
reliability, and documentation, and should be made
available to other researchers.
- Data collected with government support rightfully
belong in the public domain, although a reasonable
time for first publication should be respected.
- There is a pressing need for more compact forms
of storage.
- Tool building for non-defense software should be
encouraged.
- The Federal Government should fund pilot projects
on information storage and dissemination
concepts in selected disciplines and implement
software markets with emphasis on the development
of generic tools useful for multiple disciplines.
- The institutions, led by the federal government,
should develop an information technology network
for use by all qualified researchers.
40Measuring for Environmental Results, by William K.
Reilly, EPA Journal, May-June 1989
- A key element in any effort to measure
environmental success is information--information
on where we've been with respect to environmental
quality, where we are now, and where we want to
go. Since its beginning, EPA has devoted a great
deal of time, attention, and money to gathering
data. We are spending more than half a billion
dollars a year on collecting, processing, and
storing environmental data. Vast amounts of data
are sitting in computers at EPA Headquarters, at
Research Triangle Park, North Carolina, and at
other EPA facilities across the country. - But having all this information--about air and
water quality, about production levels and health
effects of various chemicals, about test results
and pollution discharges and wildlife
habitats--doesn't necessarily mean that we do
anything with it. The unhappy truth is that we
have been much better at gathering raw data than
at analyzing and using data to identify or
anticipate environmental problems and make
decisions on how to prevent or solve them. As
John Naisbitt put it in his book Megatrends, "We
are drowning in information but starved for
knowledge." - Our various data systems, and we have hundreds of
them, are mostly separate and distinct, each with
its own language, structure, and purpose.
Information in one system is rarely transferable
to another system. I suspect that few EPA
employees have even the faintest idea of how much
data are available within this Agency, let alone
how to gain access to it. And if that is true of
our own employees, how must the public feel when
they ponder the wealth of information lurking,
just out of reach, in EPA's huge and seemingly
impenetrable data bases? - The strategic information effort I have
described, however, will require a new attitude
on the part of every EPA program manager--a
willingness to break out of the traditional
constraints of media-specific and
category-specific thinking. - Just as important, we must find ways to share our
data more effectively with the people who paid
for it in the first place: the American public.
Eventually, as EPA makes progress in
standardizing and integrating its information
systems, the information in those systems--apart
from trade secrets--should be as accessible as
possible. Such information could be made
available through on-line computer
telecommunications, through powerful new compact
disc (CD-ROM) technologies, and perhaps a
comprehensive annual report on environmental
trends. - Sharing information with the public is an
important step toward establishing a common base
of understanding with the American people on
questions of environmental risk. As the recent
furor over residues of the chemical Alar on apple
products shows, there can be a wide gap between
public perceptions of risk and the degree of risk
indicated by the best available scientific data. - EPA must share and explain our information about
the hazards of life in our complex industrial
society with others--with other nations, with
state and local governments, with academia, with
industry, with public-interest groups, and with
citizens. We need to raise the level of debate
on environmental issues and to ensure the
informed participation of all segments of our
society in achieving our common goal: a cleaner,
healthier environment.
- Environmental data, collected and used within the
strategic framework I have described, can and
will make a significant contribution to
accomplishing our major environmental objectives
over the next few years. Strategic data will
help us:
- Create incentives and track our progress
in finding ways to prevent pollution before it
is generated.
- Improve our understanding of the complex
environmental interactions that contribute to
international problems like acid rain,
stratospheric ozone depletion and global
warming.
- Identify threats to our nation's ecology
and natural systems--our wetlands, our marine
and wildlife resources--and find ways to reduce
those threats.
- Manage our programs and target our
enforcement efforts to achieve the greatest
environmental results.
41USES OF ENVIRONMENTAL DATA
- Environmental data are used for many purposes.
They may support environmental management
or serve the good of society by deriving more
general environmental knowledge.
- Provide Historical Record
- Identify Deviation from Expected Trend
- Anticipate Future Environmental Problems
- Provide Legal Record
- Support Environmental Research
- Support Environmental Education
- Support Communication
- Record Monitoring and Control Procedures
42Taylor Model
- One of the specific tools employed by the staff
of University Library was the Taylor Model.5
Taylor's model is a theoretical model and is not
predictive. The University Library adapted it as
a working tool and, in turn, adopted the concepts
of "value-adding" and the importance of
Information Use Environments as cri