Title: Type Systems and Structures
1Type Systems and Structures
Programming Language Principles Lecture 22
- Prepared by
- Manuel E. Bermúdez, Ph.D.
- Associate Professor
- University of Florida
2Data Types
- Most PLs have them.
- Two purposes:
- Provide context for operations, e.g. a + b (ints or
floats). In Java,
- Widget x = new Widget() allocates memory for an
object of type Widget, and invokes a constructor.
- Limit semantically legal operations, e.g. n +
"x".
3Type equivalence and compatibility.
- At hardware level, bits have no type.
- In a PL, need types
- to associate with values,
- to resolve contextual issues and
- to check for illegal operations.
4Type System
- Mechanism for defining types and associating them
with PL constructs.
- Rules for determining type equivalence and
compatibility. Type inference rules are used
to determine the type of an expression from its
parts, and from its context.
5Type Systems
- Distinction between "type of expression" and
"type of object" is important only in PLs with
polymorphism.
- Subroutines have a type in some languages (RPAL
lambda-closure), if they need to be passed as
parameters, stored, or returned from a function.
6Type Systems (contd)
- Type checking: the process of enforcing type
compatibility rules. A "type clash" occurs when
a rule is violated.
- Strongly typed language: enforces that
operations are applied only to objects of the types
intended. Example: C is not very strongly typed;
"while (*p++) ..." is used to traverse an array.
7Type Systems (contd)
- Statically typed language: strongly typed, with
enforcement occurring at compile time. Examples:
ANSI C (more so than classic C), Pascal (almost;
untagged variant records are the exception).
- Some (few) languages are completely untyped:
Bliss, assembly language.
8Type Systems (contd)
- Dynamic (run-time) type checking: RPAL, Lisp,
Scheme, Smalltalk.
- Other languages (ML, Miranda, Haskell) are
polymorphic, but use significant type inference
at compile time.
9Type Definitions
- Early on (Fortran, Algol, BASIC), available types
were few and non-extensible.
- Many languages distinguish
- type declaration (introduce name and scope)
- type definition (describe the object or type
itself).
10Type Definitions (contd)
- Three approaches to describe types:
- Denotational.
- A type is a set of values (domain).
- An object has a type if its value is in the set.
- Constructive.
- A type is either atomic (int, float, bool, etc.)
or is built (constructed) from atomic types, e.g.
arrays, records, sets, etc.
- Abstraction.
- A type is an interface: a set of operations upon
certain objects.
11Classification of Types
- Scalar (a.k.a. discrete, ordinal) types
- The terminology varies (bool, logical,
truthvalue).
- Scalars sometimes come in several widths (short,
int, long in C; float and double, too).
- Integers sometimes come "signed" and "unsigned."
12Classification of Types (contd)
- Characters sometimes come in different sizes
(char and "wide" characters in C, accommodating Unicode).
- Sometimes "complex" and "rational" are provided.
- COBOL and PL/1 provide a "decimal" type. Example
(PL/1):
- FOR I = 0 TO 32/2 ...
13Classification of Types (contd)
- Enumerations
- Pascal: type day = (yesterday,
today, tomorrow);
- A newly defined type, so
- var d : day;
- for d := today to tomorrow do ...
- Can also use to index arrays:
- var profits : array[day] of real;
- In Pascal, enumeration is a full-fledged type.
14Classification of Types (contd)
- C: enum day {yesterday, today, tomorrow};
- (roughly) equivalent to
- typedef int day;
- const day yesterday = 0,
-           today = 1,
-           tomorrow = 2;
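A minimal C sketch of the point above: enumeration constants are ordinary int values, so they take part in integer arithmetic and array indexing with no range check from the compiler. The names `next_day`, `profits` and `record_profit` are illustrative assumptions, not from the slides.

```c
#include <assert.h>

/* Enumeration constants have type int; by default they are 0, 1, 2. */
enum day { yesterday, today, tomorrow };

/* Hypothetical helper: plain integer arithmetic on an enum value.
   Unlike Pascal, C does not check that the result is still a valid day. */
int next_day(enum day d)
{
    return d + 1;   /* may run past `tomorrow` without complaint */
}

/* An enum can index an array only because it is an integer. */
double profits[3];

void record_profit(enum day d, double amount)
{
    int i = d;
    assert(i >= yesterday && i <= tomorrow);  /* manual range check */
    profits[i] = amount;
}
```

Calling `next_day(tomorrow)` compiles and returns 3, a value that names no day; Pascal's full-fledged enumeration type would reject the equivalent.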
15Classification of Types (contd)
- Subrange types.
- Values are a contiguous subset of the base type
values. The range imposes a type constraint.
- Pascal:
- type water_temp = 32 .. 212;
16Classification of Types (contd)
- Composite Types.
- Records. A heterogeneous collection of fields.
- Variant records. Only one of the fields is valid
at any given time. Union of the fields, vs.
Cartesian product.
- Arrays. Mapping from indices to data fields.
17Classification of Types (contd)
- Sets. Collections of distinct elements, from a
base type.
- Pointers. l-values. Used to implement recursive
data types: an object of type T contains
references to other objects of type T.
- Lists. Length varies at run-time, unlike (most)
arrays.
- Files. Hold a current position.
18Orthogonality
- Pascal: variant fields required to follow
non-variant ones.
- Most PLs provide limited ability to specify
literal values of composite types.
- Example: In C,
- int x[] = {3, 2, 1};
- Initializer, only allowed for
declarations,
- not assignments.
- In Ada, use aggregates to assign composite
values.
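The restriction can be sketched in C as follows. This assumes a C99 compiler; the names `sum_declared`, `reset`, `struct point` and `shifted_origin` are made up for illustration. An array accepts the brace syntax only in its declaration, while a struct can be assigned from a C99 compound literal, which is about as close as C gets to an Ada aggregate.

```c
#include <assert.h>
#include <string.h>

struct point { int x, y; };

/* Brace initializers are allowed in a declaration... */
int sum_declared(void)
{
    int x[] = {3, 2, 1};
    return x[0] + x[1] + x[2];
}

/* ...but `dst = {3, 2, 1};` is not legal C. An array has to be
   copied, here with memcpy from a second, initialized array. */
void reset(int dst[3])
{
    static const int init[3] = {3, 2, 1};
    memcpy(dst, init, sizeof init);
}

/* A struct, unlike an array, can be assigned from a compound literal. */
struct point shifted_origin(void)
{
    struct point p;
    p = (struct point){ .x = 1, .y = 2 };
    return p;
}
```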
19Type Equivalence
- Structural equivalence: Two types are equivalent
if they contain the same components.
- Varies from one language to another.
20Type Equivalence (contd)
- Example
- type r1 = record
-   a, b : integer
- end;
- type r2 = record
-   b : integer;
-   a : integer
- end;
-
- var v1 : r1;
-     v2 : r2;
- v1 := v2;
- Are these types compatible ?
- What if a and b are reversed ?
- In most languages, no. In ML, yes.
21Type Equivalence (contd)
- Name equivalence: based on type definitions,
usually same name.
- Assumption: named types are intended to be
different.
- Alias types: definition of one type is the name
of another.
- Question: Should aliased types be the same type?
22Type Equivalence (contd)
- In Modula-2:
- TYPE stack_element = INTEGER;
- MODULE stack;
- IMPORT stack_element;
- EXPORT push, pop;
- procedure push (e : stack_element);
- procedure pop ( ) : stack_element;
- Stack module cannot be reused for other types.
23Type Equivalence (contd)
- However,
- TYPE celsius = REAL;
-      fahrenh = REAL;
- VAR c : celsius;
-     f : fahrenh;
- f := c; (* should probably be an error *)
24Type Equivalence (contd)
- Strict name equivalence: aliased types are
distinct (not equivalent).
- type a = b is considered both a declaration and a
definition.
- Loose name equivalence: aliased types are
equivalent.
- type a = b is considered a declaration; a and b
share the definition.
25Type Equivalence (contd)
- Ada's compromise: the programmer indicates
whether an alias is compatible with its base type.
- An alias can be a subtype (compatible with base type):
- subtype stack_element is integer;
26Type Equivalence (contd)
- In Ada, an alias can also be a derived type (not
compatible):
- subtype stack_element is integer;
- type celsius is new REAL;
- type fahrenh is new REAL;
- Now the stack is reusable, and celsius is not
compatible with fahrenh.
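C offers a rough analogue of this distinction, sketched below under the assumption that temperatures are stored as double. A typedef behaves like a loose alias, since both names denote the same type. Two separately declared struct types are distinct even with identical fields, which approximates Ada's derived types. All names apart from celsius and fahrenh are illustrative.

```c
#include <assert.h>

/* typedef creates aliases: both names denote double, so values of
   the two "types" can be mixed freely. */
typedef double celsius_alias;
typedef double fahrenh_alias;

/* Separately declared struct types are not compatible with each
   other, even though their fields are identical. */
typedef struct { double degrees; } celsius;
typedef struct { double degrees; } fahrenh;

/* An explicit conversion function is the only way across.
   Writing `fahrenh f = c;` with c of type celsius is rejected
   at compile time. */
fahrenh to_fahrenh(celsius c)
{
    fahrenh f = { c.degrees * 9.0 / 5.0 + 32.0 };
    return f;
}
```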
27Type Conversion and Casts
- Many contexts in which types are expected:
- assignments, unary and binary operators,
parameters.
- If types are different, programmer must convert
the type (conversion or casting).
28Three Situations
- Types are structurally equivalent, but language
requires name equivalence. Conversion is
trivial.
- Example (in C):
- typedef int number;
- typedef int quantity;
- number n;
- quantity m;
- n = m;
29Three Situations (contd)
- Different sets of values, but same
representation.
- Example: subrange 3..7 of int.
- Generate run-time code to check for appropriate
values (range check).
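The range check can be pictured as the code a compiler might emit before storing into a variable of subrange 3..7. The sketch below writes that check by hand in C; `SUB_LO`, `SUB_HI`, `in_subrange` and `assign_subrange` are hypothetical names, and a real compiler would typically raise an error or exception rather than return a flag.

```c
#include <assert.h>
#include <stdbool.h>

enum { SUB_LO = 3, SUB_HI = 7 };   /* bounds of the subrange 3..7 */

/* The test itself: same representation as int, smaller set of values. */
bool in_subrange(int v)
{
    return v >= SUB_LO && v <= SUB_HI;
}

/* Checked assignment: stores v and returns true if v is in range;
   otherwise leaves *dst unchanged and returns false. */
bool assign_subrange(int *dst, int v)
{
    if (!in_subrange(v))
        return false;
    *dst = v;
    return true;
}
```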
30Three Situations (contd)
- Different representations.
- Example (in C):
- int n;
- float x;
- n = x;
- Generate code to perform conversion at run-time.
31Type Conversions
- Ada: name of type used as a pseudofunction.
- Example: n := integer(r);
- C, C++, Java: name of type used as prefix
operator, in ()s.
- Example: n = (int) r;
32Type Conversions (contd)
- If conversion not supported in the language,
convert to pointer, cast, and dereference (ack!):
- r = *((float *) &n);
- Re-interpret bits in n as a float.
33Type Conversions (contd)
- OK in C, as long as
- n has an address (won't work with expressions)
- n and r occupy the same amount of storage.
- programmer doesn't expect run-time overflow
checks !
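A hedged sketch of the same bit reinterpretation follows. It uses memcpy instead of the pointer cast, because dereferencing a `float *` that points at an integer object runs afoul of ISO C's aliasing rules. The function name `bits_to_float` is illustrative, and any particular bit pattern only has a predictable meaning on platforms where float is IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of n as a float, with no numeric conversion. */
float bits_to_float(uint32_t n)
{
    float r;
    /* Same precondition as on the slide: equal storage sizes. */
    assert(sizeof r == sizeof n);
    memcpy(&r, &n, sizeof r);   /* copies the representation, not the value */
    return r;
}
```

By contrast, `(float) n` would produce the float closest in value to n, which is a genuine conversion rather than a reinterpretation.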
34Type Compatibility and Coercions
- Coercion: implicit conversion.
- Rules vary greatly from one language to another.
35Type Compatibility and Coercions (contd)
- Ada: Types T and S are compatible (coercible) if
either
- T and S are equivalent.
- One is a subtype of the other (or both subtypes
of the same base type).
- Both are arrays (same number, and same type, of
elements).
- Pascal: same as Ada, but allows coercion from
integer to real.
36Type Compatibility and Coercions (contd)
- C: Many coercions allowed. General idea: convert
to narrowest type that will accommodate both
types.
- Promote char (or short int) to int, guaranteeing
neither is char or short.
- If one operand is a floating type, convert the
narrower one:
- float -> double -> long double
37Type Compatibility and Coercions (contd)
- Note: this accommodates mixtures of integer and
floating types.
- If neither type is a floating type, convert the
narrower one:
- int -> unsigned int -> long int -> unsigned long int
38Examples
- char c;               /* signed or unsigned -- implementation-defined */
- short int s;
- unsigned int u;
- int i;
- long int l;
- unsigned long int ul;
- float f;
- double d;
- long double ld;
39Examples (contd)
- i + c    /* c converted to int */
- i + s    /* s converted to int */
- u + i    /* i converted to unsigned int */
- l + u    /* u converted to long int */
- ul + l   /* l converted to unsigned long int */
- f + ul   /* ul converted to float */
- d + f    /* f converted to double */
- ld + d   /* d converted to long double */
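Two of these conversions have visible consequences, shown in the small C sketch below. The function names are illustrative. The first mixes int with unsigned int, the second relies on the promotion of char to int.

```c
#include <assert.h>

/* i is converted to unsigned int before the comparison, so a
   negative i becomes a very large unsigned value. */
int less_than_mixed(int i, unsigned int u)
{
    return i < u;
}

/* Both chars are promoted to int before the addition, so the sum
   is computed in int and can exceed the range of char. */
int add_chars(char a, char b)
{
    return a + b;
}
```

Here `less_than_mixed(-1, 1u)` yields 0 (false), because -1 converts to the largest unsigned int, and `add_chars(100, 100)` yields 200 even where char cannot hold 200.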
40Type Compatibility and Coercions (contd)
- Conversion during assignment.
- usual arithmetic conversions don't apply.
- simply convert from type on the right, to type on
the left.
41Examples
- s = l;   /* l's low-order bits -> signed number */
- s = ul;  /* ditto */
- l = s;   /* s sign-extended to longer length */
- ul = s;  /* ditto, ul's high-bit affected ? */
- s = c;   /* c extended (signed or not) to */
-          /* s's length, interpreted as signed */
- f = l;   /* l converted to float, precision lost */
- d = f;   /* f converted, no precision lost. */
- f = d;   /* d converted, precision lost */
-          /* result may be undefined */
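A short C sketch of two of these assignment conversions, assuming float is IEEE 754 single precision (24 significand bits) and long has at least 32 bits. The function names are made up for illustration.

```c
#include <assert.h>

/* l = s : the short is sign-extended, so the value is preserved. */
long widen(short s)
{
    long l = s;
    return l;
}

/* f = l : a float may have fewer significand bits than l needs,
   so the stored value can be rounded. volatile forces a real
   store to float rather than keeping extra precision in a register. */
long through_float(long l)
{
    volatile float f = (float)l;
    return (long)f;
}
```

Under the stated assumption, 16777216 (2^24) survives the round trip through float but 16777217 does not, which is the "precision lost" case.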
42 Type Inference
- Usually easy.
- Type of assignment is type of left-side.
- Type of operation is (common) type of operands.
43 Type Inference (contd)
- Not always easy.
- Pascal:
- type A = 0 .. 20;
-      B = 10 .. 20;
- var a : A;
-     b : B;
- What is the type of a + b ? In Pascal, it's the
base type (integer).
44 Type Inference (contd)
- Ada:
- The type of the result would be an anonymous type
10..40 (the smallest and largest possible sums).
- The compiler would generate run-time checks for
values out of bounds.
- Curbing unnecessary run-time checks is a major
problem.
45 Type Inference (contd)
- Pascal allows operations on sets:
- var A : set of 1..10;
-     B : set of 10..20;
-     C : set of 1..15;
-     i : 1..30;
- C := A + B * [1..5, i];
-
- The type of the expression is set of integer
(the base type). Range check is required when
assigning to C.
46Type Inference (contd)
47Type Inference in ML
- Programmer can declare types, but if not, ML
infers them, using unification (more later in
Prolog).
48Type Inference in ML (contd)
- ML infers the return type of "fib":
- i+1 implies i is of type int.
- i=n implies n is of type int.
- fib_helper(0,1,0) implies f1, f2 of type int,
and confirms (doesn't contradict) i is of type
int.
- fib_helper returning f2 implies fib_helper
returns int.
- fib returning fib_helper(0,1,0) implies fib
returns int.
49Type Inference in ML (contd)
- ML checks type consistency: no contradictions or
ambiguities.
- By inferring types, ML allows polymorphism:
- fun compare (x,p,q) =
-   if x = p then
-     if x = q then "all three match"
-     else "first two match"
-   else
-     if x = q then "second two match"
-     else "none match";
50Type Inference in ML (contd)
- The type of the function is not specified. Type inference
yields any type for which '=' is legal (many of
them !).
- Result is a polymorphic 'compare' function.
-
- It's possible to underspecify the type:
- fun square (x) = x * x;        (* int or float ? *)
- fun square (x : int) = x * x;  (* ambiguity gone *)
51Records (structs) and Variants (unions)
52Records (structs) and Variants (unions)
53Records (structs) and Variants (unions, contd)
- Usage:
- var copper : element;
- copper.name := 'Cu';
-
- A record can be "packed", filling in holes, but
forcing the compiler to generate code that accesses
fields using multi-instruction sequences (less
efficient).
54Records (structs) and Variants (unions, contd)
55Records (structs) and Variants (unions, contd)
- Usage:
- element copper;
- strcpy(copper.name, "Cu");
56Records (structs) and Variants (unions, contd)
- Most languages allow assignment of one record to
another, but if not, a "block_copy" routine can
solve the problem.
- Most languages don't allow equality comparison.
A "block_compare" routine might have problems
with garbage in the holes.
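The "holes" are easy to observe in C. In the sketch below, `struct sample` is a made-up record, not the element record of these slides. The amount of padding is implementation-dependent, which is why the comparison goes field by field instead of comparing raw bytes with something like memcmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sample {
    char   tag;     /* 1 byte, typically followed by padding */
    double value;   /* usually aligned to a multiple of its size */
};

/* Field-by-field comparison: padding bytes are never examined, so
   garbage in the holes cannot make equal records look different. */
bool sample_equal(const struct sample *a, const struct sample *b)
{
    return a->tag == b->tag && a->value == b->value;
}

/* Bytes of the record that belong to no field (the holes). */
size_t sample_padding(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(double));
}
```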
57Records (structs) and Variants (unions, contd)
- Compilers often rearrange fields to reduce space.
58Pascal with Statements
- Introduce a nested scope, in which record fields
are visible without record name.
- Useful for deeply nested structures.
- Example:
- with copper do begin
-   name := 'Cu';
-   atomic_number := 29;
-   atomic_weight := 63.546;
-   metallic := true
- end;
59Pascal with Statements (contd)
- Problems with Pascal's with statement:
- Can only manipulate fields of ONE record, not
two. Not a shortcut for copying fields from one
record to another.
- Local names that match a field name become
inaccessible.
- Can be difficult to read, especially in long or
deeply nested with statements.
60Pascal with Statements (contd)
- Modula-2 allows aliases for complicated
expressions:
- WITH e = copper DO BEGIN
-   e.name := 'Cu';
-   e.atomic_number := 29;
-   e.atomic_weight := 63.546;
-   e.metallic := true
- END;
61Pascal with Statements (contd)
- Can access more than one record at a time:
- WITH e = copper, f = iron DO
-   e.metallic := f.metallic
- END;
62Pascal with Statements (contd)
- In Modula-3, the with statement goes further:
- WITH d = (...) DO
-   IF d <> 0 THEN val := n/d ELSE val := 0 END
- END;
63Pascal with Statements (contd)
- C gets around this using the conditional
expression:
- double d = (...);
- val = (d ? n/d : 0);
64Pascal with Statements (contd)
- C has no need for a with statement; just use
pointers:
- element *e = ...;
- element *f = ...;
- strcpy(e->name, f->name);   /* name is a char array, so copy it */
- e->atomic_number = f->atomic_number;
- e->atomic_weight = f->atomic_weight;
- e->metallic = f->metallic;
65Variant Records
- Choice between alternative fields.
- Only one is valid at any given time.
66Example (Pascal)
67Example (Pascal, contd)
- "naturally_occuring" is the "tag", which
indicates whether the element contains
- A source and a prevalence, or
- A half_life.
68Example (Pascal, contd)
69In C
70Variant Records (contd)
- Unions are not integrated with structs,
- so there are additional names:
- element e;
- e.extra_fields.natural_info.source = 3;
- e.extra_fields.half_life = 3.5;
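The declaration these accesses refer to did not survive in the text, so the following is only a plausible reconstruction. The member names come from the usage lines on these slides, and the tag keeps the slides' spelling. The field types, the array size and the helper `make_synthetic` are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical reconstruction of the element record in C. */
typedef struct {
    char   name[3];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
    bool   naturally_occuring;      /* the tag */
    union {                         /* the variant part needs its own name */
        struct {
            int    source;
            double prevalence;
        } natural_info;             /* ...and so does the nested struct */
        double half_life;
    } extra_fields;
} element;

/* Illustrative constructor for an element that does not occur naturally:
   it sets the tag and the matching union member together. */
element make_synthetic(const char *name, double half_life)
{
    element e;
    memset(&e, 0, sizeof e);
    strncpy(e.name, name, sizeof e.name - 1);   /* last byte stays '\0' */
    e.naturally_occuring = false;
    e.extra_fields.half_life = half_life;
    return e;
}
```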
71Variant Records (contd)
- In general, type safety is compromised:
- type tag = (is_int, is_real, is_bool);
- var irb : record
-   case which : tag of
-     is_int : (i : integer);
-     is_real : (r : real);
-     is_bool : (b : Boolean)
- end;
72Variant Records (contd)
- Usage:
- irb.which := is_real;
- irb.r := 3.0;
- irb.i := 7; (* run-time error *)
73Variant Records (contd)
- Changing the tag field should make all other
fields in the variant uninitialized, but it's
very expensive to keep track of at run-time.
Most compilers won't catch this:
- irb.which := is_real;
- irb.r := 3.0;
- irb.which := is_int;
- writeln(irb.i);
- (* uninitialized, or worse, shares space
with irb.r *)
74Variant Records (contd)
- Worse yet, the tag field is optional:
- type tag = (is_int, is_real, is_bool);
- var irb : record
-   case tag of (* 'which' field is gone ! *)
-     is_int : (i : integer);
-     is_real : (r : real);
-     is_bool : (b : Boolean)
- end;
75Variant Records (contd)
- No way to catch
- irb.r := 3.0;
- writeln(irb.i);
-
- Designers of Modula-3 dropped variant records,
for these safety reasons.
- Similarly, designers of Java dropped the unions of C
and C++.
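In C the usual defence is a hand-written tagged union in which every access goes through functions that set or check the tag. The sketch below mirrors the irb record; the wrapper names `set_real` and `get_int` are illustrative, and nothing in the language stops a programmer from bypassing them and touching the union directly.

```c
#include <assert.h>
#include <stdbool.h>

enum tag { is_int, is_real, is_bool };

struct irb {
    enum tag which;        /* the tag is mandatory in this design */
    union {
        int    i;
        double r;
        bool   b;
    } u;
};

/* Writing a value always updates the tag along with it. */
void set_real(struct irb *v, double r)
{
    v->which = is_real;
    v->u.r = r;
}

/* Checked read: returns false instead of reinterpreting the bits
   of another member as an int. */
bool get_int(const struct irb *v, int *out)
{
    if (v->which != is_int)
        return false;
    *out = v->u.i;
    return true;
}
```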
76Variants in Ada
- Must have a tag (discriminant).
- If tag changes, all fields in the variant must be
changed, by
- assigning a whole record (A := B), or
- assigning an aggregate.
77Example (with discriminant default value)
78Variants in Ada (contd)
- Declaration can use the default:
- copper : element;
- Declaration can override the default:
- plutonium : element (false);
- americium : element
-   (naturally_occuring => false);
79Variants in Ada (contd)
- The type declaration may
- provide a default: a variable declared without a
constraint is unconstrained, and its tag may be
changed (only by whole-record assignment).
- not provide a default: every variable declaration
must supply the tag (constrained discriminant),
which then cannot be changed.
80Variants in Ada (contd)
- In short, discriminants are never uninitialized.
- In Ada, variants are required to appear at the
end of the record.
- The compiler assigns a constant address to every
field.
81Variants in Modula-2
- In Modula-2, this restriction is dropped.
Usually, a fixed address is assigned to each
field, leaving holes where variants differ in
size.
82Variants in Modula-2 (contd)