Title: High-Level Language Interface
1Kip R. Irvine
- Chapter 12
- High-Level Language Interface
2Chapter Overview
- Why Link ASM and HLL Programs?
- General and Calling Conventions
- External Identifiers
- Inline Assembly Code
- __asm Directive
- File Encryption Example
- Linking to C++ Programs
- Linking to Borland C++
- ReadSector Example
- Special Section: Optimizing Your Code
- Loop Optimization Example
- FindArray Example
- Creating the FindArray Project
3Why Link ASM and HLL Programs?
- Use high-level language for overall project development
  - Relieves programmer from low-level details
- Use assembly language code
  - Speed up critical sections of code
  - Access nonstandard hardware devices
  - Write platform-specific code
  - Extend the HLL's capabilities
4General Conventions
- Considerations when calling assembly language procedures from high-level languages:
  - Both must use the same naming convention (rules regarding the naming of variables and procedures)
  - Both must use the same memory model, with compatible segment names
  - Both must use the same calling convention
5Calling Convention
- Identifies specific registers that must be preserved by procedures
- Determines how arguments are passed to procedures: in registers, on the stack, in shared memory, etc.
- Determines the order in which arguments are passed by calling programs to procedures
- Determines whether arguments are passed by value or by reference
- Determines how the stack pointer is restored after a procedure call
- Determines how functions return values
6External Identifiers
- An external identifier is a name that has been placed in a module's object file in such a way that the linker can make the name available to other program modules.
- The linker resolves references to external identifiers, but can only do so if the same naming convention is used in all program modules.
7Example
8Name decoration
- Compilers for older programming languages such as COBOL and PASCAL usually convert identifiers to all uppercase letters.
- More recent languages such as C, C++, and Java preserve the case of identifiers.
- In addition, languages that support function overloading (such as C++) use a technique known as name decoration that adds additional characters to function names.
- A function named MySub(int n, double b), for example, might be exported as MySub#int#double.
912.1.2 Inline Assembly Code
- Assembly language source code that is inserted directly into a HLL program.
- Compilers such as Microsoft Visual C++ and Borland C++ have compiler-specific directives that identify inline ASM code.
- Efficient: inline code executes quickly because CALL and RET instructions are not required.
- Simple to code because there are no external names, memory models, or naming conventions involved.
- Decidedly not portable because it is written for a single platform.
10__asm Directive in Microsoft Visual C++
- Can be placed at the beginning of a single statement
- Or, it can mark the beginning of a block of assembly language statements
- Syntax:
    __asm statement
    __asm {
      statement-1
      statement-2
      ...
      statement-n
    }
11Commenting Styles
All of the following comment styles are acceptable, but the latter two are preferred:
    mov esi,buf    ; initialize index register
    mov esi,buf    // initialize index register
    mov esi,buf    /* initialize index register */
12You Can Do the Following . . .
- Use any instruction from the Intel instruction set
- Use register names as operands
- Reference function parameters by name
- Reference code labels and variables that were declared outside the asm block
- Use numeric literals that incorporate either assembler-style or C-style radix notation
- Use the PTR operator in statements such as
  - inc BYTE PTR [esi]
- Use the EVEN and ALIGN directives
- Use LENGTH, TYPE, and SIZE directives
13You Cannot Do the Following . . .
- Use data definition directives such as DB, DW, or BYTE
- Use assembler operators other than PTR
- Use STRUCT, RECORD, WIDTH, and MASK
- Use macro directives such as MACRO, REPT, IRC, IRP
- Reference segments by name.
  - (You can, however, use segment register names as operands.)
14Register Usage
- In general, you can modify EAX, EBX, ECX, and EDX in your inline code because the compiler does not expect these values to be preserved between statements
- Conversely, always save and restore ESI, EDI, and EBP.
See the Inline Test demonstration program.
15Notes
- Cannot use the OFFSET operator
  - Use LEA instead: lea esi, buffer
- The TYPE operator returns one of the following, depending on its target:
  - The number of bytes used by a C or C++ type or scalar variable
  - The number of bytes used by a structure
  - For an array, the size of a single array element
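The three TYPE cases above line up closely with what sizeof reports in C++. The following is a minimal sketch of that analogy, not a description of how the inline assembler works internally. The Package structure and all function names are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structure, used only for this illustration.
struct Package {
    long  originalSize;
    long  packedSize;
    short flags;
};

// Case 1: bytes used by a scalar variable's type.
std::size_t typeOfScalar() {
    long counter = 0;
    return sizeof(counter);
}

// Case 2: bytes used by a whole structure (padding included).
std::size_t typeOfStructure() {
    return sizeof(Package);
}

// Case 3: for an array, the size of ONE element, not the whole array.
std::size_t typeOfArrayElement() {
    long table[20] = {0};
    return sizeof(table[0]);
}

// Small self-check of the three cases.
void demoTypeAnalogy() {
    assert(typeOfScalar() == sizeof(long));
    assert(typeOfStructure() >= 2 * sizeof(long) + sizeof(short));
    assert(typeOfArrayElement() == sizeof(long));
}
```

The array case is the one that most often surprises: the result describes a single element, so it is the same for a 20-element table and a 2000-element table.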
17File Encryption Example
- Reads a file, encrypts it, and writes the output to another file.
- The TranslateBuffer function uses an __asm block to define statements that loop through a character array and XOR each character with a predefined value.
View the Encode2.cpp program listing
18
    // translat.cpp
    #include "translat.h"

    /* Translate a buffer of <count> bytes, using an
       encryption character <eChar>. Uses an XOR
       operation (ASM function). */

    void TranslateBuffer( char * buf, unsigned count,
                          unsigned char eChar )
    {
      __asm {
        mov esi,buf        ; set index register
        mov ecx,count      /* set loop counter */
        mov al,eChar
      L1:
        xor [esi],al
        inc esi
        Loop L1
      } // asm
    }
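For comparison, the same byte-by-byte XOR can be written in portable C++. This is a sketch of an equivalent loop, not the listing from the example program. The names TranslateBufferPortable and demoRoundTrip, the sample text, and the key value 0x5A are all assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Portable stand-in for the __asm loop: XOR every byte of buf with eChar.
void TranslateBufferPortable(char* buf, std::size_t count, unsigned char eChar) {
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] = static_cast<char>(static_cast<unsigned char>(buf[i]) ^ eChar);
    }
}

// XOR is its own inverse, so applying the same key twice restores the text.
void demoRoundTrip() {
    std::string text = "Attack at dawn";
    const std::string original = text;

    TranslateBufferPortable(&text[0], text.size(), 0x5A);   // encrypt
    assert(text != original);

    TranslateBufferPortable(&text[0], text.size(), 0x5A);   // decrypt
    assert(text == original);
}
```

One behavioral difference is worth noting: the for loop does nothing when count is 0, whereas a LOOP instruction entered with ECX equal to 0 decrements ECX first and therefore keeps iterating.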
19
    // ENCODE.CPP
    // This program copies and encodes a file.

    #include <iostream>
    #include <fstream>
    #include "translat.h"
    using namespace std;

    int main()
    {
      const int BUFSIZE = 200;
      char buffer[BUFSIZE];
      unsigned int count;      // character count

      unsigned short encryptCode;
      cout << "Encryption code [0-255]? ";
      cin >> encryptCode;

      ifstream infile( "infile.txt", ios::binary );
      ofstream outfile( "outfile.txt", ios::binary );

      cout << "Reading INFILE.TXT and creating OUTFILE.TXT...\n";

      while (!infile.eof() )

    // translat.h
    void TranslateBuffer( char * buf, unsigned count,
                          unsigned char eChar );
20Overhead
21
      while (!infile.eof() )
      {
        infile.read(buffer, BUFSIZE );
        count = infile.gcount();
        __asm {
          lea esi,buffer
          mov ecx,count
          mov al,encryptChar
        L1:
          xor [esi],al
          inc esi
          Loop L1
        } // asm
        outfile.write(buffer, count);
      }
      return 0;
    }
22Linking Assembly Language to C++
- Basic Structure - Two Modules
  - The first module, written in assembly language, contains the external procedure
  - The second module contains the C/C++ code that starts and ends the program
- The C++ module adds the extern qualifier to the external assembly language function prototype.
- The "C" specifier must be included to prevent name decoration by the C++ compiler:
    extern "C" functionName( parameterList );
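The effect of the "C" specifier can be sketched in a single C++ file. AddTwo, Scale, and demoLinkage are hypothetical names chosen for this illustration. To keep the sketch self-contained, AddTwo is defined in C++ here; in a real two-module project its body would be the assembly language procedure.

```cpp
#include <cassert>

// Declared with C linkage: the linker looks for a C-style name with no
// C++ type decoration, which is what an assembly module would export.
extern "C" long AddTwo(long a, long b);

// Ordinary C++ overloads: the compiler decorates each name with its
// parameter types so that the linker can tell the two functions apart.
long   Scale(long v)   { return v * 2; }
double Scale(double v) { return v * 2.0; }

// Stand-in definition (assumption): in a real project this body would
// live in the assembly language module, not in the C++ source file.
extern "C" long AddTwo(long a, long b) { return a + b; }

// Calls look the same to the programmer regardless of linkage.
void demoLinkage() {
    assert(AddTwo(3, 4) == 7);
    assert(Scale(5L) == 10);
    assert(Scale(1.5) == 3.0);
}
```

Because functions with C linkage carry no parameter types in their exported names, two of them cannot share a name; overloading is only available to functions with C++ linkage.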
23Name Decoration
Also known as name mangling. HLL compilers do this to uniquely identify overloaded functions. A function such as
    int ArraySum( int * p, int count )
would be exported as a decorated name that encodes the return type, function name, and parameter types. For example:
    int_ArraySum_pInt_int
The problem with name decoration is that the C++ compiler assumes that your assembly language function's name is decorated. The C++ compiler tells the linker to look for a decorated name. C++ compilers vary in the way they decorate function names.
24Linking to Borland C++
- We will look at a C++ program that calls an external assembly language procedure named ReadSector
  - Reads a range of sectors from a disk drive
  - Not possible with pure C++ code
  - ASM code uses 16-bit MS-DOS functions
- Tools
  - 16-bit version of Borland C++ 5.01
  - Borland TASM 4.0 assembler (included with Borland C++)
25ReadSector Sample Output
Sector display program.

Enter drive number [1=A, 2=B, 3=C, 4=D, 5=E,...]: 1
Starting sector number to read: 0
Number of sectors to read: 20

Reading sectors 0 - 20 from Drive 1

Sector 0
--------------------------------------------------
------ .lt.(P3j2IHC........_at_..................)Y...
MYDISK FAT12 .3. .......x..v..V.U."....N..
.........E...F..E.8N"....w.r...f.. f..W.u...
..V....s.3..F...f..F..V..F....v..F..V..
.......H...F ..N.a.....r98-t.......at9Nt...
.r............t.lt.t.......... ............f..
.......E..N....F..V......r....p..B.-fj.RP.Sj .j
...t...3..v...v.B...v..............V...d.ar._at_u.B.
.Iuw....'..I nvalid system disk...Disk I/O
error...Replace the disk, and then press any
key....IOSYSMSDOS SYS...A......._at_...U.
26ReadSector Source Code
Main C++ program source code
ASM ReadSector procedure source code

    // SectorMain.cpp - Calls ReadSector Procedure

    #include <iostream.h>
    #include <conio.h>
    #include <stdlib.h>

    const int SECTOR_SIZE = 512;

    extern "C" ReadSector( char * buffer, long startSector,
                           int driveNum, int numSectors );
27
    int main()
    {
      char * buffer;
      long startSector;
      int driveNum;
      int numSectors;

      system("CLS");
      cout << "Sector display program.\n\n"
           << "Enter drive number [1=A, 2=B, 3=C, 4=D, 5=E,...]: ";
      cin >> driveNum;
      cout << "Starting sector number to read: ";
      cin >> startSector;
      cout << "Number of sectors to read: ";
      cin >> numSectors;

      buffer = new char[numSectors * SECTOR_SIZE];

      cout << "\n\nReading sectors " << startSector << " - "
           << (startSector + numSectors) << " from Drive "
           << driveNum << endl;

      ReadSector( buffer, startSector, driveNum, numSectors );
      DisplayBuffer( buffer, startSector, numSectors );
      system("CLS");
      return 0;
    }
28Assembly Module
    TITLE Reading Disk Sectors            (ReadSec.asm)

    ; The ReadSector procedure is called from a 16-bit
    ; Real-mode application written in Borland C++ 5.01.
    ; It can read FAT12, FAT16, and FAT32 disks under
    ; MS-DOS, and Windows 95/98/Me.
    ; Last update: 12/5/01

    Public _ReadSector
    .model small
    .386

    DiskIO STRUC
      strtSector DD ?      ; starting sector number
      nmSectors  DW 1      ; number of sectors
      bufferOfs  DW ?      ; buffer offset
      bufferSeg  DW ?      ; buffer segment
    DiskIO ENDS

    .data
    diskStruct DiskIO <>
29
    .code
    ;----------------------------------------------------------
    _ReadSector PROC NEAR C
    ARG bufferPtr:WORD, startSector:DWORD, driveNumber:WORD, \
        numSectors:WORD
    ;
    ; Read n sectors from a specified disk drive.
    ; Receives: pointer to buffer that will hold the sector
    ;   data, starting sector number, drive number,
    ;   and number of sectors.
    ; Returns: nothing
    ;----------------------------------------------------------
      enter 0,0
      pusha
      mov eax,startSector
      mov diskStruct.strtSector,eax
      mov ax,numSectors
      mov diskStruct.nmSectors,ax
      mov ax,bufferPtr
      mov diskStruct.bufferOfs,ax
      push ds
      pop diskStruct.bufferSeg

      mov ax,7305h               ; ABSDiskReadWrite
      mov cx,0FFFFh              ; always this value
      mov dx,driveNumber         ; drive number
      mov bx,OFFSET diskStruct   ; sector number
      mov si,0                   ; read mode
      int 21h                    ; read disk sector

      popa
      leave
      ret
    _ReadSector ENDP
    END
33Example: Large Random Integers
- To show a useful example of calling an external function from Borland C++, we can call LongRandom, an assembly language function that returns a pseudorandom unsigned 32-bit integer.
- This is useful because the standard rand() function in the Borland C++ library only returns an integer between 0 and RAND_MAX (32,767).
- Our procedure returns an integer between 0 and 4,294,967,295.
- This program is compiled in the large memory model, allowing the data to be larger than 64K, and requiring that 32-bit values be used for the return address and data pointer values.
- The external function declaration in C++ is:
    extern "C" unsigned long LongRandom();
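To make the declaration concrete, here is a C++ stand-in with the same signature that produces values across the full 32-bit range. It is only a sketch: the slides do not show the assembly source, so the algorithm below (a linear congruential generator using the widely published Numerical Recipes constants) is an assumption and not the method used by the original procedure. The g_seed variable, its starting value, and demoLongRandom are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generator state; the starting value is arbitrary.
static std::uint32_t g_seed = 12345u;

// Stand-in for the assembly routine (assumption): a linear congruential
// generator. These constants are NOT taken from the original ASM code.
extern "C" unsigned long LongRandom() {
    g_seed = g_seed * 1664525u + 1013904223u;   // wraps modulo 2^32
    return g_seed;
}

// The same seed always yields the same sequence, and every value
// fits in 32 bits.
void demoLongRandom() {
    g_seed = 12345u;
    const unsigned long first  = LongRandom();
    const unsigned long second = LongRandom();

    g_seed = 12345u;
    assert(LongRandom() == first);
    assert(LongRandom() == second);
    assert(first <= 0xFFFFFFFFul && second <= 0xFFFFFFFFul);
}
```

A generator of this kind is adequate for demonstrations but is not suitable for cryptographic use.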
34Special Section: Optimizing Your Code
- The 90/10 rule: 90% of a program's CPU time is spent executing 10% of the program's code
- We will concentrate on optimizing ASM code for speed of execution
- Loops are the most effective place to optimize code
- Two simple ways to optimize a loop:
  - Move invariant code out of the loop
  - Substitute registers for variables to reduce the number of memory accesses
- Take advantage of high-level instructions such as XLAT, SCASB, and MOVSD.
35Loop Optimization Example
- We will write a short program that calculates and displays the number of elapsed minutes, over a period of n days.
- The following variables are used:
    .data
    days         DWORD ?
    minutesInDay DWORD ?
    totalMinutes DWORD ?
    str1 BYTE "Daily total minutes ",0
36Sample Program Output
    Daily total minutes 1440
    Daily total minutes 2880
    Daily total minutes 4320
    Daily total minutes 5760
    Daily total minutes 7200
    Daily total minutes 8640
    Daily total minutes 10080
    Daily total minutes 11520
    .
    .
    Daily total minutes 67680
    Daily total minutes 69120
    Daily total minutes 70560
    Daily total minutes 72000
View the complete source code.
37Version 1
No optimization.

      mov days,0
      mov totalMinutes,0
    L1:                          ; loop contains 15 instructions
      mov eax,24                 ; minutesInDay = 24 * 60
      mov ebx,60
      mul ebx
      mov minutesInDay,eax
      mov edx,totalMinutes       ; totalMinutes += minutesInDay
      add edx,minutesInDay
      mov totalMinutes,edx
      mov edx,OFFSET str1        ; "Daily total minutes "
      call WriteString
      mov eax,totalMinutes       ; display totalMinutes
      call WriteInt
      call Crlf
      inc days                   ; days++
      cmp days,50                ; if days < 50,
      jb L1                      ; repeat the loop
38Version 2
Move calculation of minutesInDay outside the loop, and assign EDX before the loop. The loop now contains 10 instructions. The running total is accumulated in EBX so that EDX keeps the string offset for WriteString.

      mov days,0
      mov totalMinutes,0
      mov eax,24                 ; minutesInDay = 24 * 60
      mov ebx,60
      mul ebx
      mov minutesInDay,eax
      mov edx,OFFSET str1        ; "Daily total minutes "
    L1:
      mov ebx,totalMinutes       ; totalMinutes += minutesInDay
      add ebx,minutesInDay       ; (EBX, so EDX is not overwritten)
      mov totalMinutes,ebx
      call WriteString           ; display str1 (offset in EDX)
      mov eax,totalMinutes       ; display totalMinutes
      call WriteInt
      call Crlf
      inc days                   ; days++
      cmp days,50                ; if days < 50,
      jb L1                      ; repeat the loop
39Version 3
Move totalMinutes to EAX, use EAX throughout loop. Use constant expression for minutesInDay calculation. The loop now contains 7 instructions.

    C_minutesInDay = 24 * 60     ; constant expression

      mov days,0
      mov totalMinutes,0
      mov eax,totalMinutes
      mov edx,OFFSET str1        ; "Daily total minutes "
    L1:
      add eax,C_minutesInDay     ; totalMinutes += minutesInDay
      call WriteString           ; display str1 (offset in EDX)
      call WriteInt              ; display totalMinutes (EAX)
      call Crlf
      inc days                   ; days++
      cmp days,50                ; if days < 50,
      jb L1                      ; repeat the loop
      mov totalMinutes,eax       ; update variable
40Version 4
Substitute ECX for the days variable. Remove initial assignments to days and totalMinutes.

    C_minutesInDay = 24 * 60     ; constant expression

      mov eax,0                  ; EAX = totalMinutes
      mov ecx,0                  ; ECX = days
      mov edx,OFFSET str1        ; "Daily total minutes "
    L1:                          ; loop contains 7 instructions
      add eax,C_minutesInDay     ; totalMinutes += minutesInDay
      call WriteString           ; display str1 (offset in EDX)
      call WriteInt              ; display totalMinutes (EAX)
      call Crlf
      inc ecx                    ; days (ECX)++
      cmp ecx,50                 ; if days < 50,
      jb L1                      ; repeat the loop
      mov totalMinutes,eax       ; update variable
      mov days,ecx               ; update variable
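The step from Version 1 to Version 4 can be mirrored in C++ to check that both forms produce the same totals. This is a sketch of the idea only. The function names are hypothetical, a vector stands in for the console output, and a modern optimizing compiler will usually perform both transformations without help.

```cpp
#include <cassert>
#include <vector>

// Mirrors Version 1: the invariant product is recomputed on every pass
// and each step goes through named variables.
std::vector<unsigned> dailyTotalsUnoptimized(unsigned dayCount) {
    std::vector<unsigned> totals;
    unsigned days = 0, minutesInDay = 0, totalMinutes = 0;
    do {
        minutesInDay = 24 * 60;            // loop-invariant work
        totalMinutes += minutesInDay;
        totals.push_back(totalMinutes);    // stands in for the display calls
        ++days;
    } while (days < dayCount);             // test at the bottom, like JB L1
    return totals;
}

// Mirrors Version 4: the constant is hoisted out of the loop and two
// locals play the roles of EAX (total) and ECX (day counter).
std::vector<unsigned> dailyTotalsOptimized(unsigned dayCount) {
    const unsigned kMinutesInDay = 24 * 60;
    std::vector<unsigned> totals;
    unsigned total = 0;
    unsigned day = 0;
    do {
        total += kMinutesInDay;
        totals.push_back(total);
        ++day;
    } while (day < dayCount);
    return totals;
}

// Both versions must agree with each other and with the sample output.
void demoLoopVersions() {
    const std::vector<unsigned> slow = dailyTotalsUnoptimized(50);
    const std::vector<unsigned> fast = dailyTotalsOptimized(50);
    assert(slow == fast);
    assert(fast.size() == 50);
    assert(fast.front() == 1440u);
    assert(fast.back() == 72000u);
}
```

Comparing the two result lists is a cheap way to confirm that an optimization changed only the cost of the loop and not its output.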
41Using Assembly Language to Optimize C++
- Find out how to make your C++ compiler produce an assembly language source listing
  - /FAs command-line option in Visual C++, for example
- Optimize loops for speed
- Use hardware-level I/O for optimum speed
- Use BIOS-level I/O for medium speed
42FindArray Example
Let's write a C++ function that searches for the first matching integer in an array. The function returns true if the integer is found, and false if it is not:

    #include "findarr.h"

    bool FindArray( long searchVal, long array[], long count )
    {
      for(int i = 0; i < count; i++)
        if( searchVal == array[i] )
          return true;

      return false;
    }
43Code Produced by C Compiler
optimization switch turned off (1 of 3)

    _searchVal$ = 8
    _array$ = 12
    _count$ = 16
    _i$ = -4

    _FindArray PROC NEAR
    ; 29 : {
      push ebp
      mov ebp, esp
      push ecx
    ; 30 : for(int i = 0; i < count; i++)
      mov DWORD PTR _i$[ebp], 0
      jmp SHORT $L174
    $L175:
      mov eax, DWORD PTR _i$[ebp]
      add eax, 1
      mov DWORD PTR _i$[ebp], eax
44Code Produced by C Compiler
(2 of 3)
    $L174:
      mov ecx, DWORD PTR _i$[ebp]
      cmp ecx, DWORD PTR _count$[ebp]
      jge SHORT $L176
    ; 31 : if( searchVal == array[i] )
      mov edx, DWORD PTR _i$[ebp]
      mov eax, DWORD PTR _array$[ebp]
      mov ecx, DWORD PTR _searchVal$[ebp]
      cmp ecx, DWORD PTR [eax+edx*4]
      jne SHORT $L177
    ; 32 : return true;
      mov al, 1
      jmp SHORT $L172
    $L177:
    ; 33 :
    ; 34 : return false;
      jmp SHORT $L175
45Code Produced by C Compiler
(3 of 3)
    $L176:
      xor al, al                 ; AL = 0
    $L172:
    ; 35 : }
      mov esp, ebp               ; restore stack pointer
      pop ebp
      ret 0
    _FindArray ENDP
46Hand-Coded Assembly Language (1 of 2)
    true = 1
    false = 0

    ; Stack parameters:
    srchVal  equ [ebp+08]
    arrayPtr equ [ebp+12]
    count    equ [ebp+16]

    .code
    _FindArray PROC near
      push ebp
      mov ebp,esp
      push edi

      mov eax, srchVal           ; search value
      mov ecx, count             ; number of items
      mov edi, arrayPtr          ; pointer to array
47Hand-Coded Assembly Language (2 of 2)
      repne scasd                ; do the search
      jz returnTrue              ; ZF = 1 if found

    returnFalse:
      mov al, false
      jmp short exit

    returnTrue:
      mov al, true

    exit:
      pop edi
      pop ebp
      ret
    _FindArray ENDP
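What REPNE SCASD does in the hand-coded version can be modeled step by step in C++. The sketch below is a behavioral model written for illustration, not generated or disassembled code. FindArrayScan and demoFindArrayScan are hypothetical names, and zeroFlag is an ordinary variable standing in for the CPU's Zero flag.

```cpp
#include <cassert>
#include <cstddef>

// Models REPNE SCASD: compare searchVal (EAX) with successive elements
// while the remaining count (ECX) is nonzero and no match has been seen.
bool FindArrayScan(long searchVal, const long* array, std::size_t count) {
    bool zeroFlag = false;                 // plays the role of ZF
    while (count != 0 && !zeroFlag) {      // REPNE stops at ECX = 0 or ZF = 1
        zeroFlag = (*array == searchVal);  // SCASD compares EAX with [EDI]
        ++array;                           // EDI advances to the next element
        --count;                           // ECX is decremented
    }
    return zeroFlag;                       // JZ returnTrue
}

// Exercises a hit, a miss, and an empty range.
void demoFindArrayScan() {
    const long data[] = { 10, 20, 30, 40 };
    assert(FindArrayScan(30, data, 4));
    assert(!FindArrayScan(99, data, 4));
    assert(!FindArrayScan(10, data, 0));
}
```

The model starts with zeroFlag cleared, so an empty array always reports "not found". In the assembly version, REPNE executes no comparisons when ECX is 0 and leaves the flags unchanged, so the result in that case depends on whatever ZF held beforehand.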
48Creating the FindArray Project
(using Microsoft Visual Studio 6.0)
- Run Visual C++ and create a project named FindArray.
- Add a CPP source file to the project named main.cpp. This file should contain the C++ main() function that calls FindArray. View a sample.
- Add a new header file named FindArr.h to the project. This file contains the function prototype for FindArray. View a sample.
- Create a file named Scasd.asm and place it in the project directory. This file contains the source code for the FindArray procedure. View a sample.
- Use ML.EXE to assemble the Scasd.asm file, producing Scasd.obj. Do not try to link the program.
- Insert Scasd.obj into your C++ project. (Select Add Files... from the Project menu.) (this needs to be verified)
- Build and run the project.
49Creating the FindArray Project
(using Microsoft Visual Studio)
- Run Visual C++ .NET and create a new project named FindArray.
- Add a blank C++ source file to the project named main.cpp. Type the main() function that calls FindArray. View a sample.
- Add a new header file named FindArr.h to the project. This file contains the function prototype for FindArray. View a sample.
- Create a file named Scasd.asm and place it in the project directory. This file contains the source code for the FindArray procedure. View a sample.
- Use ML.EXE to assemble the Scasd.asm file, producing Scasd.obj. Do not try to link the program.
- Insert Scasd.obj into your C++ project.
- Build and run the project.
50The End