Media: Text - PowerPoint PPT Presentation

Provided by: perdanaFs

Transcript and Presenter's Notes

Title: Media: Text


1
Media: Text
  • "Words and symbols in any form, spoken or
    written, are the most common system of
    communication." (source unknown)

2
Text - Representation
  • ASCII
      • 7-bit code
      • 128 values in the ASCII character set (English
        alphabet)
      • use of the 8th bit in text editors/word processors
        creates incompatibility
  • ISO character sets
      • extend ASCII to support non-English text
        and extra symbols
      • ISO Latin provides support for accented
        characters (à, ö, ø, etc.)
      • ISO sets include Chinese, Japanese, Korean
        & Arabic
  • UNICODE
      • 16-bit format (Roman vs. Western European or
        Japanese Kanji)
      • about 65,000 different symbols (2^16 = 65,536)
      • 25 supported scripts in Version 2.0 of the Unicode
        Standard: Arabic, Armenian, Bengali, Bopomofo,
        Cyrillic, Devanagari, Georgian, Greek, Gujarati,
        Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada,
        Katakana, Latin, Lao, Malayalam, Oriya, Phonetic,
        Tamil, Telugu, Thai, Tibetan
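
The difference between the three representations on this slide can be seen by encoding the same characters under each one. The sketch below is illustrative only: the sample characters and the helper name `try_encode` are assumptions, and Python's `utf-16-be` codec stands in for the 16-bit format the slide describes.

```python
# Encode sample characters under ASCII, ISO Latin-1 and a 16-bit Unicode form.
samples = ["A", "ö", "漢"]

def try_encode(ch, codec):
    """Return the encoded bytes, or None if the codec cannot represent ch."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None

for ch in samples:
    print(
        ch,
        hex(ord(ch)),                 # the character's numeric value
        try_encode(ch, "ascii"),      # 7-bit: only values 0..127
        try_encode(ch, "latin-1"),    # 8-bit: adds accented Latin letters
        try_encode(ch, "utf-16-be"),  # 16-bit code units
    )
```

A result of `None` marks a character that the character set has no code for, which is the incompatibility the slide refers to.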

3
ASCII
  • All uppercase and lowercase letters
  • Punctuation symbols like ! . , ? etc.
  • Digits 0, …, 9
  • Arithmetic symbols + - * /
  • Assorted special symbols like @
    ( ) etc.
  • Invisible formatting characters
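
The groups listed above can be checked against the 128 ASCII values with Python's `string` module. This is a rough sketch: the function name `ascii_category` and the category labels are assumptions, and punctuation, arithmetic and special symbols are lumped together because `string.punctuation` does not separate them.

```python
import string
from collections import Counter

def ascii_category(ch):
    """Classify a single character into one of the slide's ASCII groups."""
    if ord(ch) > 127:
        return "not ASCII"
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    if ch in string.punctuation:
        return "punctuation or symbol"
    # Everything left is a space or a control code (tab, newline, ...).
    return "invisible or control"

# Tally all 128 ASCII values by category.
tally = Counter(ascii_category(chr(code)) for code in range(128))
for name, count in sorted(tally.items()):
    print(f"{name}: {count}")
```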

4
ASCII

5
Text - Representation
  • Marked-up text
      • nroff, troff
      • LaTeX
      • SGML
      • HTML
      • HyTime
      • XML, XSL, XLL
  • Structured text
      • structure of text represented in a data structure,
        usually tree-based
      • ODA: structure embedded in the byte stream with
        content
  • Hypertext
      • non-linear
      • graph or web structure: nodes and links
      • currently the subject of intensive ISO standards
        activity
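
To illustrate how marked-up text maps onto a tree-based structure, the sketch below parses a small XML fragment with Python's standard `xml.etree.ElementTree` module and prints its element hierarchy. The document content, the tag names and the helper `outline` are made up for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical marked-up document; the tag names are illustrative.
doc = (
    "<article>"
    "<title>Media: Text</title>"
    "<section>"
    "<para>ASCII is a 7-bit code.</para>"
    "<para>Unicode covers many scripts.</para>"
    "</section>"
    "</article>"
)

def outline(element, depth=0):
    """Return the element tree as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(doc)
print("\n".join(outline(root)))
```

The markup is linear text, but the parsed result is a tree: each node holds a tag, its text content and its child nodes.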

6
Text - Operations
  • Character operations
      • basic data type with assigned value
      • permits direct character comparison (e.g. a < b)
  • String operations
      • comparison
      • concatenation
      • substring extraction and manipulation
  • Editing
      • perhaps the most familiar set of operations on
        text
      • cut/copy/paste
      • strings vs. blocks, dependent on document structure
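
The character, string and editing operations above can be sketched in a few lines of Python. The helper names `cut` and `paste` are assumptions made for the example, and they operate on plain strings only, not on structured blocks.

```python
# Character operations: each character has an assigned numeric value,
# so characters can be compared directly.
print(ord("a"), ord("b"))
print("a" < "b")

# String operations: comparison, concatenation, substring extraction.
word = "multi" + "media"
print(word == "multimedia")
print(word[5:])
print(word.replace("multi", "hyper"))

# Editing: cut removes a span and returns it as a clipboard value;
# paste inserts a clipboard value at a position.
def cut(text, start, end):
    """Return (remaining text, removed span) for text[start:end]."""
    return text[:start] + text[end:], text[start:end]

def paste(text, position, clip):
    """Return text with clip inserted at the given position."""
    return text[:position] + clip + text[position:]

remaining, clipboard = cut("hello world", 0, 6)
print(paste(remaining, len(remaining), " " + clipboard.strip()))
```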

7
Text - Operations
  • Formatting
      • interactive or non-interactive (WYSIWYG vs. LaTeX)
      • formatted output
          • bitmap
          • page description language (PostScript, PDF)
      • font management
          • typeface
          • point size (1 point = 1/72 of an inch)
          • TrueType fonts: geometric description, kerning
  • Pattern-matching and searching
      • search and replace
      • wildcards
      • regular expressions
      • for large bodies of text, or text databases, use
        of inverted indices, hashing techniques and
        clustering
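
The searching techniques above can be sketched with Python's standard `re` module plus a small inverted index. The sample documents and the name `build_inverted_index` are assumptions for the example; a real text database would add ranking, stemming and persistent storage.

```python
import re
from collections import defaultdict

# Hypothetical mini-collection of documents, keyed by document id.
docs = {
    1: "ASCII is a 7-bit code",
    2: "Unicode is a 16-bit format",
    3: "Huffman codes compress text",
}

def build_inverted_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

index = build_inverted_index(docs)

# Regular expression search: capture the number in every "<n>-bit" phrase.
bit_widths = re.findall(r"\b(\d+)-bit\b", " ".join(docs.values()))

# Search and replace on whole words only.
replaced = re.sub(r"\bcode\b", "encoding", docs[1])

print(sorted(index["bit"]), bit_widths, replaced)
```

The index is built once, so a word lookup is a dictionary access rather than a scan of every document, which is why the technique suits large bodies of text.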

8
Text - Operations
  • Sorting
      • numerous varieties of sort, all of them
        extensively studied in basic programming
      • sort complexity is a major factor in data
        handling performance
  • Compression
      • ASCII uses 7 bits per character, though most
        word processors also use the 8th bit, taking
        up a full byte per character
      • information theory estimates 1-2 bits per
        character to be sufficient for natural language
        text
      • this redundancy can be removed by encoding
          • Huffman coding varies the number of bits used to
            represent characters, with the shortest codes for
            the highest-frequency characters
          • Lempel-Ziv identifies repeating strings and
            replaces them with pointers to a table
          • both techniques compress English text at a ratio
            of between 2:1 and 3:1
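
A minimal sketch of Huffman coding, assuming one byte per character in the uncompressed text. The function name `huffman_codes` and the sample sentence are illustrative choices, and the computed ratio ignores the cost of storing the code table, so it overstates what a real compressor would achieve on such a short input.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {character: bit string}, shorter codes for frequent characters."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {char: code built so far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "this is an example of a huffman tree"
codes = huffman_codes(text)
encoded_bits = sum(len(codes[ch]) for ch in text)
ratio = (8 * len(text)) / encoded_bits
print(encoded_bits, round(ratio, 2))
```

Lempel-Ziv works differently: instead of shortening frequent characters, it replaces repeated strings with references to earlier occurrences.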

9
Text - Operations
  • Encryption
      • text encryption is widely used in electronic mail
        and networked information systems
      • most widely used techniques
          • DES
          • RSA public-key
          • PGP
      • subject of major controversy
          • key escrow systems
          • Clipper chip
          • strong encryption now being legally outlawed in
            a number of countries
  • Language-specific operations
      • spell-checking
      • parsing and grammar checking
      • style analysis
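
To make the public-key idea behind RSA concrete, here is a toy sketch using deliberately tiny primes. It is not secure and not how production systems encrypt text: the primes, the function names and the one-character-per-block scheme are all assumptions for illustration, and real software relies on vetted libraries, large keys and padding. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
def make_toy_keys(p=61, q=53, e=17):
    """Build a (public, private) key pair from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e modulo phi
    return (e, n), (d, n)

def encrypt_text(text, public_key):
    """Encrypt each character's code separately (toy scheme, insecure)."""
    e, n = public_key
    return [pow(ord(ch), e, n) for ch in text]

def decrypt_text(blocks, private_key):
    """Invert encrypt_text using the private exponent."""
    d, n = private_key
    return "".join(chr(pow(c, d, n)) for c in blocks)

public, private = make_toy_keys()
cipher = encrypt_text("Media: Text", public)
print(cipher)
print(decrypt_text(cipher, private))
```

Anyone holding the public key can encrypt, but only the holder of the private exponent can decrypt, which is what distinguishes RSA from a shared-key cipher such as DES.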

10
About Fonts and Faces
  • A typeface is a family of graphic characters (usually
    including many type sizes and styles)
  • A font is a collection of characters of a single
    size and style
  • Styles include boldface and italic (also underlining
    and outlining)
  • Serif vs. sans serif ("sans" is French for "without")