Title: Data Flow in UML
1. Data Flow in UML
[Figure labels: SAGE (12 prod units); UML (50 prod units); PGM (20 prod; CORBA (17 prod units); SCE (40 pr]
- Dr. Jeffrey E. Smith
- Mercury Computer Systems, Inc.
- jesmith_at_mc.com
2. Agenda
- Model based parallel programming alternatives
- Focus on framework/UML Conceptualization
- Data Parallel CORBA
- Data Flow in UML Superstructure
3. Motivation: From "Portable HP SW for SIP - What's Next", Lincoln Labs
- Moore's law addresses computations, not complexity
- In their roadmap for advancing RT embedded software development, they identified model-based development and automated mapping support as the key long-term technologies
- Blue Jean datapoint
5. Methods to Conceptualize/Apply High-Performance Data Flow Applications
6. Observations
- UML doesn't yet include a consistent model of data flow (not really)
- Translating UML diagrams to any source might be an avenue of tool support worth exploring
7. Goals: Component Reuse, Software Productivity, Leverage Existing Investments, Wider Programming Base
[Diagram labels: Requirements and Design; UML; Model Behavior; Constructor (Programmer 1); Translate; Parallel/DSP Prototypers (Graph(ical), CORBA, SCE, V/P Compilers, ...); Executable Prototype; Source; POSIX-Compliant API; Optimizer (Programmer 2); POSIX-Compliant kernel; Executable Deliverable]
8. Dynamic Compilation Can Provide a Solution
Collect runtime execution behavior:
- Memory usage
  - instruction and data caches
  - translation look-aside buffers
- Control flow
  - branch probabilities
  - program traces
- Call graphs
  - gprof statistics
- Data dependencies
  - data-dependent control flow
- Variable values
  - value locality
  - interprocedural dataflow
- Hardware counters
  - pipeline stalls
[Diagram labels: High-Level Algorithms; Work with OMG; UML; UML with Data Flow; Common CASE Data-Flow Machine Development; (Par.) CORBA; IDE; 1-7 Transforms; Non-Optimized Low-Level Algorithms; Profile-Guided Optimizations; Feedback; Optimized Low-Level Algorithms]
9. Next Steps
- Application to IR formation, fusion, template matching
- Collect software productivity metrics on above and MITRE benchmarks
- Experiment with optimization of UML-transformed software (through data parallel CORBA or specialized data parallel compiler IDEs) to efficient embedded platforms
- Work with OMG on introducing data flow in a way that supports streaming, high-performance, data-flow distributed computers
- Examine possibility of embedding dynamic profile optimization into runtime system
- Work with CASE and IDE vendors to integrate model-based development of efficient streaming, high-performance, data-flow distributed computer targets
10. Trick is to
1) Discover common patterns (SCE, PAS, Par. CORBA, ...)
2) Feed this forward into standard OMG specs
3) Simplify our own software architectures/APIs
[Diagram labels: Action Semantics DataFlow; PAS Channel]
11. CORBA Sequence
12. Data-Parallel CORBA Sequence
13. Meta-Classes
14. Data Structures
15. Runtime Associations
16. Control Flow
- Each step is taken when the previous one finishes
- regardless of whether inputs are available, accurate or complete (pull)
- Emphasis is on order in which steps are taken
[Diagram (not UML notation): Weather Info; Start; Analyze Weather Info; Chart Course; Cancel Trip]
17. Object/Data Flow
- Each step is taken when all the required input objects/data are available
- and only when all the inputs are available (push)
- Emphasis is on objects flowing between steps
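The pull/push contrast between control flow and object/data flow can be sketched in a few lines of Python. This is an illustration only, not part of the original deck: `run_control_flow`, `DataFlowNode`, `offer`, and the two step functions are hypothetical names loosely modeled on the weather example.

```python
from typing import Callable, Dict, List, Optional


def run_control_flow(steps: List[Callable[[dict], None]], state: dict) -> List[str]:
    """Control flow (pull): run each step as soon as the previous one finishes,
    whether or not its inputs are present. Returns the order of execution."""
    order = []
    for step in steps:
        step(state)
        order.append(step.__name__)
    return order


class DataFlowNode:
    """Data flow (push): the node fires only when every required input has arrived."""

    def __init__(self, name: str, inputs: List[str], fn: Callable[..., object]):
        self.name = name
        self.inputs = inputs
        self.fn = fn
        self.pending: Dict[str, object] = {}

    def offer(self, port: str, value: object) -> Optional[object]:
        """Push a value onto an input port; return the output if the node fired."""
        self.pending[port] = value
        if all(p in self.pending for p in self.inputs):
            args = [self.pending[p] for p in self.inputs]
            self.pending = {}  # inputs are consumed by the firing
            return self.fn(*args)
        return None


# Hypothetical steps modeled on the weather example.
def analyze_weather_info(state: dict) -> None:
    # Runs even if "weather_info" never arrived, so it must cope with that.
    state["analysis"] = state.get("weather_info", "unknown")


def chart_course(state: dict) -> None:
    state["course"] = "plotted using " + state["analysis"] + " weather"
```

In the control-flow version the course is charted even though no weather information was supplied; a `DataFlowNode` with inputs `["weather", "destination"]` returns `None` from `offer` until both values have been pushed.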
18. UML 2.0 Superstructure RFP Excerpt
Further, the way that objects and other data flow between parts of a system is crucial to understanding its architecture. The UML currently supports object/data flow only at the lowest level of granularity [not even], between the steps in an activity graph [as well as other locations, in a contradictory way]. It is important for architects to be able to model object and data flows between entities at a higher level of granularity, such as classifiers and packages [as well as many other requirements coming up].
[ ] signifies my comments
19. Why bring back data flow explicitly into UML?
- With parallel computation increasingly used to increase computation speeds, there is interest in linking streaming data flow machines with a matching modeling paradigm
- To bring back a data flow standard: developers have been building unique custom DFDs out of standard UML structure (patterns); some CASE vendors added data flow at the model/meta-model level
- To link/integrate existing DFD toolsets with UML toolsets and existing simulators, e.g. Ptolemy [Parks]
- Functional modeling (the only third left out of OMT) fits OO and non-OO modeling paradigms and can be united with other UML models [SD, DSH]
- Currently addressed piecemeal in UML (shown later), none of which conforms to the pre-existing modeler's view (OMT view) of data flow
20. Why bring back data flow explicitly into UML (cont)?
- Object model defines system components, dynamic model (state machines) defines system control, but functional model (data flow) defines what computations occur in a system: functional dependencies between processes
- Need expressed in software process/workflow, defense, medical, wireless and digital video domains
- Example: When the response to the Action Semantics RFP was presented in OMG Plenary, diagrams were not done in UML (were in data flow); reason given was it would take too much space in UML
21. Why bring back data flow explicitly into UML (continued)?
- Different (yet related) underlying semantics than State Machines
- "is-used-to-produce" relation
- Can have consistent parent/child (state/substate) diagram from state machine point of view that violates data consistency model [TK]
- Unique inheritance (decomposition) requirement
- Example definition: Let P be a process and D a decomposition of P. D is consistent with P iff the I/O relationships that are 1) specified for P must also hold for D and 2) not specified to hold for P must not hold for D [TK]
- Relation to state machines: A trigger (t) of a control process "is-used-to-produce" a response (r) of the same process iff there is a transition in the STD that is triggered by t and responds with r [TK].
- It is conceptually simpler for some applications: simply a digraph together with a binary precedence relation.
- It is impossible to represent continuous flow, especially with feedback, in a State Machine because of the theoretically infinite number of states to represent. This is a natural modeling view with data transforms.
- STDs are sequential within one machine, DFDs are not
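The consistency definition on this slide lends itself to a mechanical check. The sketch below is one possible reading of it, not the formalization given in [TK]: it assumes each process is described by a set of (input, output) pairs meaning "input is-used-to-produce output", and that the I/O relationships of a decomposition are obtained by chaining those pairs through internal flows. The names `derived_io` and `is_consistent` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (input flow, output flow): input "is-used-to-produce" output


def derived_io(subprocesses: Dict[str, Set[Pair]],
               inputs: Iterable[str], outputs: Iterable[str]) -> Set[Pair]:
    """I/O relationships implied by a decomposition D (assumed model).

    Each subprocess contributes edges flow -> flow; an external input is used
    to produce an external output if a chain of such edges connects them."""
    edges: Dict[str, Set[str]] = {}
    for pairs in subprocesses.values():
        for src, dst in pairs:
            edges.setdefault(src, set()).add(dst)

    external_outputs = set(outputs)
    result: Set[Pair] = set()
    for start in inputs:
        seen: Set[str] = set()
        stack = [start]
        while stack:  # depth-first search over flows reachable from `start`
            flow = stack.pop()
            for nxt in edges.get(flow, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result |= {(start, out) for out in seen & external_outputs}
    return result


def is_consistent(spec: Set[Pair], subprocesses: Dict[str, Set[Pair]],
                  inputs: Iterable[str], outputs: Iterable[str]) -> bool:
    """D is consistent with P iff it yields exactly the relationships specified for P:
    none missing (condition 1) and none extra (condition 2)."""
    return derived_io(subprocesses, inputs, outputs) == spec
```

For a process P with specification `{("a", "x"), ("b", "y")}`, a decomposition in which one subprocess maps `a` to an internal flow `m` and another maps `m` to `x` and `b` to `y` passes the check. Adding a pair `("b", "x")` to the second subprocess fails condition 2, and dropping the `("a", "m")` pair fails condition 1.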
22. Interaction Diagrams (Sequence, Collaboration)
- Different (yet related) underlying semantics than Interaction Diagrams
- Interaction diagrams are for interaction among objects
- Cannot represent interaction at a lower level (among methods of different classes)
- Cannot represent interaction among systems
23. Why are aspects of data flow not yet supported?
- Ambiguous order of input to processes
- Considered difficult to unite the semantics of data flow models with other OO models (other research has proved this false)
- Some DFDs allowed for control flow, and control flow is duplicated in many of the dynamic models
- Could be non-deterministic, since not all processes or data flows are necessarily used to produce the high-level process outputs
- No way to represent sequencing, iteration and conditionals
- Considered to be included, but inconsistently and multiply
24. Where do UML experts think data flow either exists or would fit?
- UML Profile for Enterprise Distributed Object Computing
- Activity Diagrams
- Collaboration Diagrams
- Action Semantics (data flow is a model element that acts as a temporary data store between in and out pins)
- Data-parallel CORBA
- Using new (data flow) patterns of existing UML structures
- A UML ActivityGraph Profile for EDOC Task Model (ad/99-10-07)
- Object interaction diagram (I assume this option is merely seconding the Collaboration Diagrams suggestion)
25. Initial Requirements Collection
- Establish criteria for (automatable) checking of (internal/external) completeness (well-formed, well-connected, well-introduced, well-rooted) [TK], consistency [Kung], decomposition (inheritance), boundedness [Parks], determinacy [Parks, KM] and termination [Parks, KM].
- Elementary processes modeled, like Petri nets [Petri], with pre- and postconditions describing the behavior of processes.
- Provide ability to express data dimensions and other data properties (in Class Diagram) and explicit linkage to these from Data Flow Diagram.
- Map to rest of UML: consistent with State Diagrams (events trigger "is-used-to-produce" relation), Action Semantics, EDOC [EDOC], Collaborations, Activity, Class Diagrams (generalization and process functional dependencies to associations) [Kung], Use Case Diagrams (actors) [Parks], RT UML (ports map to I/O specs) and Deployment Diagrams (see next 2 bullets).
- Provide ability to express parallelization along data dimensions and mapping to hardware resources in Deployment Diagram. Must express data distribution types (sequential or parallel) and subtypes (round robin, random even, random statistical, first available, etc.).
- Allow ability to specify that arrows in DFD are associated with (virtual) channels (see Virtual Interface Specification) in Deployment Diagram.
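Two of the distribution subtypes named above, round robin and first available, can be made concrete with a short sketch. The function names and the per-item cost model are assumptions made for illustration; they are not taken from Data Parallel CORBA or any other specification cited in this deck.

```python
from typing import Dict, List, Sequence


def round_robin(items: Sequence, workers: int) -> Dict[int, List]:
    """Deal items to workers in turn: item i goes to worker i % workers."""
    assignment: Dict[int, List] = {w: [] for w in range(workers)}
    for i, item in enumerate(items):
        assignment[i % workers].append(item)
    return assignment


def first_available(costs: Sequence[float], workers: int) -> Dict[int, List[int]]:
    """Give each item (by index) to the worker that becomes free earliest.

    `costs[i]` is an assumed processing time for item i; ties go to the
    lowest-numbered worker."""
    free_at = [0.0] * workers
    assignment: Dict[int, List[int]] = {w: [] for w in range(workers)}
    for i, cost in enumerate(costs):
        w = min(range(workers), key=lambda k: (free_at[k], k))
        assignment[w].append(i)
        free_at[w] += cost
    return assignment
```

Round robin ignores how long each item takes, while first available adapts to it: with costs `[5, 1, 1, 1]` and two workers, the first worker keeps only the expensive item and the second takes the other three.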
26. Initial Requirements Collection (cont)
- Non-side-effecting operations, or previously defined actions, are decomposed using functional models and these are generally used at the aggregate level [SD].
- Aggregate objects are passed as an input parameter and returned as an output parameter, allowing a process to access any object (data stores, object classes, or associations) with the parameter [SD].
- Place all control flow info in a state machine (to solve 4.4 and 4.5) [SD].
- Provide for data store I/O not included in action semantics.
- Must be able to model partial objects (multiple partial partitions of data) described in Data Parallel CORBA Spec.
- Provide method to express process synchronization as something external to processes (as opposed to state machines where this would be defined in a state) without knowledge of composition context.
Constraints to unify behavior, class, and functional models
27. Initial Requirements Collection (cont) for Modeling Streaming Data
- Provide a uni-directional data streaming interface with data flow.
- Model structured (number of dimensions, extent in each dimension, packing order, element type) and unstructured global data (number of data sets, size of data) [DRI].
- Model object I/O requirements, e.g. support for structured/non-structured data, dimensions, element types and data partitioning specification (e.g. indivisible or block type and, for each dimension, maximum size, minimum number of required elements, modulo size, block length, left and right overlap specs, etc.) [DRI].
- Model data stream control, e.g. push and pull of data, QoS based on data control (e.g. rate and latency constraints), control data stream, control tagged data, etc. [DRI].
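As a rough illustration of what a block-type partitioning specification with left and right overlap might compute along one dimension, consider the sketch below. It does not use the DRI interface; `block_partition` and its parameters are hypothetical, and the even-split policy is an assumption.

```python
from typing import List, Tuple


def block_partition(extent: int, parts: int,
                    left_overlap: int = 0, right_overlap: int = 0) -> List[Tuple[int, int]]:
    """Split indices 0..extent-1 into `parts` contiguous blocks.

    Returns half-open (start, stop) ranges. Each block is widened by the
    requested overlaps and clipped to the bounds of the dimension."""
    if parts <= 0 or extent < 0:
        raise ValueError("parts must be positive and extent non-negative")
    base, extra = divmod(extent, parts)  # the first `extra` blocks get one more element
    ranges = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        stop = start + size
        ranges.append((max(0, start - left_overlap), min(extent, stop + right_overlap)))
        start = stop
    return ranges
```

Overlap of this kind is typically needed when each partition must see a few neighboring elements, for example when a filter window straddles a block boundary.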
[Diagram labels: Name; Name 1; Name 2; Name 3; Properties; Data Distribution (sub)Type; Input Specifications; Output Specifications; Global Data]
Associate I/O specs with port attributes
Need semantics to model data
28. Existing Data Flow Semantic Models
- Petri Nets [Petri]
- Kung, et al. [TK, Kung]
- Karp and Miller Computation Graphs [KM]
- Kahn Process Networks [Kahn]
- Parks Bounded Execution [Parks]
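Of the models listed, Petri nets are the one the requirements slides single out for describing elementary processes through pre- and postconditions. A minimal sketch of the usual place/transition firing rule follows; the class and method names are hypothetical, and arc weights are assumed to be non-negative integers.

```python
from typing import Dict

Marking = Dict[str, int]  # place name -> number of tokens


class Transition:
    """A Petri net transition: `pre` tokens are consumed, `post` tokens are produced."""

    def __init__(self, name: str, pre: Marking, post: Marking):
        self.name = name
        self.pre = pre
        self.post = post

    def enabled(self, marking: Marking) -> bool:
        """Precondition: every input place holds at least the required tokens."""
        return all(marking.get(place, 0) >= n for place, n in self.pre.items())

    def fire(self, marking: Marking) -> Marking:
        """Return the marking after firing; the marking passed in is not modified."""
        if not self.enabled(marking):
            raise ValueError(f"transition {self.name} is not enabled")
        new = dict(marking)
        for place, n in self.pre.items():
            new[place] = new.get(place, 0) - n
        for place, n in self.post.items():
            new[place] = new.get(place, 0) + n
        return new
```

The enabling rule is the same "fire only when all inputs are available" condition that distinguishes data flow from control flow: a transition with `pre = {"weather": 1, "destination": 1}` stays disabled while only the weather token is present.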
29. Completely different connection in Action Semantics, EDOC, Activity Diagrams, different CASE vendors
30. References
- [BS] D. Bhatt and J. Shackleton, A Design Notation and Toolset for High-Performance Embedded Systems Development, Lectures on Embedded Systems, LNCS 1494, Springer-Verlag, VIII, October 1998.
- [DRI] Document for the DARPA Data Reorganization Effort, www.data-re.org, Feb 2000.
- [EDOC] Cooperative Research Centre for Enterprise Distributed Systems Technology, UML Profile for Enterprise Distributed Object Computing, ad/99-10-07.
- [Kahn] G. Kahn, The Semantics of a Simple Language for Parallel Programming, Info. Proc., pages 471-475, Stockholm, Aug. 1974.
- [KM] R. M. Karp and R. E. Miller, Properties of a Model for Parallel Computations: Determinacy, Termination, Queueing, SIAM Journal on Applied Mathematics, Vol. 14, No. 6, November 1966.
- [Kung] C. H. Kung, Conceptual Modeling in the Context of Software Development, IEEE Transactions on Software Engineering, 15(10):1176-1187, Oct. 1989.
- [Parks] T. M. Parks, Bounded Scheduling of Process Networks, Technical Report UCB/ERL-95-105, PhD Dissertation, EECS Department, University of California, Berkeley, CA, December 1995.
- [Petri] C. A. Petri, Kommunikation mit Automaten, PhD dissertation, translation by C. F. Greene, Supplement 1 to Technical Report RADC-TR-65-337, Vol. 1, Rome Labs, Griffiss Air Force Base, NY, 1965.
- [TK] Y. Tao and C. Kung, Formal Definition and Verification of Data Flow Diagrams, J. Systems Software, 16:29-36, 1991.
- [SD] S. DeLoach, Formal Transformations from Graphically-Based Object-Oriented Representations to Theory-Based Specification, PhD thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH, June 1996, AFIT/DS/ENG/96-05, AD-A310 608.