Mixed Language Programming
Chapter Twelve
12.1 Chapter Overview
Most assembly language code doesn’t appear in a stand-alone assembly language program. Instead, most assembly code is actually part of a library package that programs written in a high level language wind up calling. Although HLA makes it really easy to write standalone assembly applications, at one point or another you’ll probably want to call an HLA procedure from some code written in another language or you may want to call code written in another language from HLA. This chapter discusses the mechanisms for doing this in three languages: low-level assembly (i.e., MASM or Gas), C/C++, and Delphi/Kylix. The mechanisms for other languages are usually similar to one of these three, so the material in this chapter will still apply even if you’re using some other high level language.
12.2 Mixing HLA and MASM/Gas Code in the Same Program
It may seem kind of weird to mix MASM or Gas and HLA code in the same program. After all, they’re both assembly languages and almost anything you can do with MASM or Gas can be done in HLA. So why bother trying to mix the two in the same program? Well, there are three reasons:
• You’ve already got a lot of code written in MASM or Gas and you don’t want to convert it to
HLA’s syntax.
• There are a few things MASM and Gas do that HLA cannot, and you happen to need to do one
of those things.
• Someone else has written some MASM or Gas code and they want to be able to call code
you’ve written using HLA.
In this section, we’ll discuss two ways to merge MASM/Gas and HLA code in the same program: via in-line
assembly code and through linking object files.
12.2.1 In-Line (MASM/Gas) Assembly Code in Your HLA Programs
As you’re probably aware, the HLA compiler doesn’t actually produce machine code directly from your HLA source files. Instead, it first compiles the code to a MASM or Gas-compatible assembly language source file and then it calls MASM or Gas to assemble this code to object code. If you’re interested in seeing the MASM or Gas output HLA produces, just edit the filename.ASM file that HLA creates after compiling your filename.HLA source file. The output assembly file isn’t amazingly readable, but it is fairly easy to correlate the assembly output with the HLA source file.
HLA provides two mechanisms that let you inject raw MASM or Gas code directly into the output file it produces: the #ASM..#ENDASM sequence and the #EMIT statement. The #ASM..#ENDASM sequence copies all text between these two clauses directly to the assembly output file, e.g.,
    #asm
        mov eax, 0      ;MASM/Gas syntax for MOV( 0, EAX );
        add eax, ebx    ;   "       "     "  ADD( ebx, eax );
    #endasm
The #ASM..#ENDASM sequence is how you inject in-line (MASM or Gas) assembly code into your HLA programs. For the most part there is very little need to use this feature, but in a few instances it is valuable. Note, when using Gas, that HLA specifies the “.intel_syntax” directive, so you should use Intel syntax when supplying Gas code between #asm and #endasm.
Beta Draft - Do not distribute
© 2001, By Randall Hyde
Volume Four
For example, if you’re writing structured exception handling code under Windows, you’ll need to access
the double word at address FS:[0] (offset zero in the segment pointed at by the 80x86’s FS segment register).
Unfortunately, HLA does not support segmentation and the use of segment registers. However, you can drop
into MASM for a statement or two in order to access this value:
    #asm
        mov ebx, fs:[0]     ; Loads the head of the SEH handler chain into EBX
    #endasm
At the end of this instruction sequence, EBX will contain the pointer to the first exception registration record in the current thread’s structured exception handler chain, which Windows keeps at FS:[0].
HLA blindly copies all text between the #ASM and #ENDASM clauses directly to the assembly output
file. HLA does not check the syntax of this code or otherwise verify its correctness. If you introduce an
error within this section of your program, the assembler will report the error when HLA assembles your
code by calling MASM or Gas.
The #EMIT statement also writes text directly to the assembly output file. However, this statement does
not simply copy the text from your source file to the output file; instead, this statement copies the value of a
string (constant) expression to the output file. The syntax for this statement is as follows:
#emit( string_expression );
This statement evaluates the expression and verifies that it’s a string expression. Then it copies the string data to the output file. Like the #ASM/#ENDASM statement, the #EMIT statement does not check the syntax of the MASM or Gas statement it writes to the assembly file. If there is a syntax error, MASM or Gas will catch it later on when HLA assembles the output file.
When HLA compiles your programs into assembly language, it does not use the same symbols in the
assembly language output file that you use in the HLA source files. There are several technical reasons for
this, but the bottom line is this: you cannot easily reference your HLA identifiers in your in-line assembly
code. The only exceptions to this rule are external identifiers. HLA external identifiers use the same name in
the assembly file as in the HLA source file. Therefore, you can refer to external objects within your in-line
assembly sequences or in the strings you output via #EMIT.
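As a minimal sketch of this rule, the following program gives a static variable an explicit external name and then touches it from an #ASM..#ENDASM block. The variable name hitCount and its initial value are made up for illustration, and the sketch assumes HLA accepts an external declaration followed by the actual definition in the same source file (the usual header file pattern):

```hla
program externInline;
#include( "stdlib.hhf" )

static
    // External declaration: the assembly file uses exactly the name "hitCount".
    // (hitCount is a hypothetical name chosen for this sketch.)
    hitCount: uns32; external( "hitCount" );

    // Actual definition in this same module.
    hitCount: uns32 := 41;

begin externInline;

    // Raw MASM/Gas (Intel syntax) text; HLA copies it without checking it.
    #asm
        inc dword ptr hitCount
    #endasm

    stdout.put( "hitCount = ", hitCount, nl );

end externInline;
```

Because the in-line instruction names hitCount directly, it only assembles if the symbol appears unmangled in the assembly file, which is what the external name is assumed to guarantee here.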
One advantage of the #EMIT statement is that it lets you construct MASM or Gas statements under
(compile-time) program control. You can write an HLA compile-time program that generates a sequence of
strings and emits them to the assembly file via the #EMIT statement. The compile-time program has access
to the HLA symbol table; this means that you can extract the identifiers that HLA emits to the assembly file
and use these directly, even if they aren’t external objects.
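The idea of generating instructions under compile-time control can be sketched with a small compile-time loop. This is only an illustration, not code from the HLA distribution: the program name emitLoop and the compile-time variable k are assumptions, and the emitted text uses the Intel syntax that both MASM and Gas (under HLA) are expected to accept:

```hla
program emitLoop;
#include( "stdlib.hhf" )

begin emitLoop;

    mov( 0, eax );

    // Compile-time loop: it runs while HLA compiles the program,
    // not while the program executes.
    ?k:int32 := 1;
    #while( k <= 4 )

        // Build one instruction string per iteration, e.g. "add eax, 3",
        // and write it to the assembly output file.
        #emit( "add eax, " + string( k ) );
        ?k := k + 1;

    #endwhile

    stdout.put( "eax = ", (type int32 eax), nl );

end emitLoop;
```

The loop leaves four ADD instructions in the assembly file (adding 1, 2, 3, and 4 to EAX); no loop exists in the generated code itself.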
The @StaticName compile-time function returns the name that HLA uses to refer to most static objects
in your program. The following program demonstrates a simple use of this compile-time function to obtain
the assembly name of an HLA procedure:
program emitDemo;
#include( "stdlib.hhf" )

    procedure myProc;
    begin myProc;

        stdout.put( "Inside MyProc" nl );

    end myProc;

begin emitDemo;

    ?stmt:string := "call " + @StaticName( myProc );
    #emit( stmt );

end emitDemo;
Program 12.1 Using the @StaticName Function
This example creates a string value (stmt) that contains something like “call ?741_myProc” and emits this assembly instruction directly to the assembly source file (“?741_myProc” is typical of the type of name mangling that HLA does to static names it writes to the output file). If you compile and run this program, it should display “Inside MyProc” and then quit. If you look at the assembly file that HLA emits, you will see that it has given the myProc procedure the same name it appends to the CALL instruction (see footnote 1).
The @StaticName function is only valid for static symbols. This includes STATIC, READONLY, and STORAGE variables, procedures, and iterators. It does not include VAR objects, constants, macros, class iterators, or methods.
You can access VAR variables by using the [EBP+offset] addressing mode, specifying the offset of the desired local variable. You can use the @offset compile-time function to obtain the offset of a VAR object or a parameter. The following program demonstrates how to do this:
program offsetDemo;
#include( "stdlib.hhf" )

var
    i:int32;

begin offsetDemo;

    mov( -255, i );
    ?stmt := "mov eax, [ebp+(" + string( @offset( i )) + ")]";
    #print( "Emitting '", stmt, "'" )
    #emit( stmt );
    stdout.put( "eax = ", (type int32 eax), nl );

end offsetDemo;
Program 12.2 Using the @Offset Compile-Time Function
This example emits the statement “mov eax, [ebp+(-8)]” to the assembly language source file. It turns out that -8 is the offset of the i variable in the offsetDemo program’s activation record.
Of course, the examples of #EMIT up to this point have been somewhat ridiculous since you can achieve the same results by using HLA statements. One very useful purpose for the #EMIT statement, however, is to create some instructions that HLA does not support. For example, as of this writing HLA does not support the LES instruction because you can’t really use it under most 32-bit operating systems. However, if you found a need for this instruction, you could easily write a macro to emit this instruction and appropriate operands to the assembly source file. Using the #EMIT statement gives you the ability to reference HLA objects, something you cannot do with the #ASM..#ENDASM sequence.
1. HLA may assign a different name than “?741_myProc” when you compile the program. The exact symbol HLA chooses varies from version to version of the assembler (it depends on the number of symbols defined prior to the definition of myProc). In this example, there were 741 static symbols defined in the HLA Standard Library before the definition of myProc.
12.2.2 Linking MASM/Gas-Assembled Modules with HLA Modules
Although you can do some interesting things with HLA’s in-line assembly statements, you’ll probably
never use them. Further, future versions of HLA may not even support these statements, so you should avoid
them as much as possible even if you see a need for them. Of course, HLA does most of the stuff you’d want
to do with the #ASM/#ENDASM and #EMIT statements anyway, so there is very little reason to use them at
all. If you’re going to combine MASM/Gas (or other assembler) code and HLA code together in a program,
most of the time this will occur because you’ve got a module or library routine written in some other assembly
language and you would like to take advantage of that code in your HLA programs. Rather than convert
the other assembler’s code to HLA, the easy solution is to simply assemble that other code to an object file
and link it with your HLA programs.
Once you’ve compiled or assembled a source file to an object file, the routines in that module are callable
from almost any machine code that can handle the routines’ calling sequences. If you have an object
file that contains a SQRT function, for example, it doesn’t matter whether you compiled that function with
HLA, MASM, TASM, NASM, Gas, or even a high level language; if it’s object code and it exports the
proper symbols, you can call it from your HLA program.
Compiling a module in MASM or Gas and linking that with your HLA program is little different than
linking other HLA modules with your main HLA program. In the assembly source file you will have to
export some symbols (using the PUBLIC directive in MASM or the .GLOBAL directive in Gas) and in your
HLA program you’ve got to tell HLA that those symbols appear in a separate module (using the EXTERNAL
option).
Since the two modules are written in assembly language, there is very little language imposed structure
on the calling sequence and parameter passing mechanisms. If you’re calling a function written in MASM
or Gas from your HLA program, then all you’ve got to do is to make sure that your HLA program passes
parameters in the same locations where the MASM/Gas function is expecting them.
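To illustrate, here is a rough sketch of what the callee side could look like if the separately assembled module happened to be written in HLA rather than MASM or Gas. The unit name hlaUpper and the procedure name upperAL are hypothetical; the point is only that caller and callee agree on the external name and on AL as the parameter location:

```hla
unit hlaUpper;

    // Prototype with an explicit external name; the caller must use the
    // same name. (upperAL is a hypothetical name for this sketch.)
    procedure upperAL( c:char in al ); external( "upperAL" );

    // The character arrives in AL and the result is returned in AL.
    procedure upperAL( c:char in al ); @nodisplay; @noframe;
    begin upperAL;

        // Only lower case letters are changed.
        if( al in 'a'..'z' ) then

            and( $5f, al );     // Clear bit 5 to convert to upper case.

        endif;
        ret();

    end upperAL;

end hlaUpper;
```

A main program would repeat the same prototype with external( "upperAL" ) and link against the object file this unit produces, exactly as it would for a MASM or Gas module exporting that symbol.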
About the only issue you’ve got to deal with is the case of identifiers in the two programs. By default, MASM is case insensitive, while Gas is case sensitive. HLA, on the other hand, enforces case neutrality (which, essentially, means that it is case sensitive). If you’re using MASM, there is a MASM command line option (“/Cp”) that tells MASM to preserve case in all public symbols. It’s a really good idea to use this option when assembling modules you’re going to link with HLA so that MASM doesn’t mess with the case of your identifiers during assembly.
Of course, since MASM (when it preserves case) and Gas process symbols in a case sensitive manner, it’s possible to create two separate identifiers that are the same except for alphabetic case. HLA enforces case neutrality so it won’t let you (directly) create two different identifiers that differ only in case. In general, this is such a bad programming practice that one would hope you never encounter it (and God forbid you actually do this yourself).
However, if you inherit some MASM or Gas code written by a C hacker, it’s quite possible the code uses this
technique. The way around this problem is to use two separate identifiers in your HLA program and use the
extended form of the EXTERNAL directive to provide the external names. For example, suppose that in
MASM you have the following declarations:
public AVariable
public avariable
.
.
.
.data
AVariable dword ?
avariable byte ?
If you assemble this code with the “/Cp” or “/Cx” (total case sensitivity) command line options, MASM will
emit these two external symbols for use by other modules. Of course, were you to attempt to define variables
by these two names in an HLA program, HLA would complain about a duplicate symbol definition.
However, you can connect two different HLA variables to these two identifiers using code like the following:
static
    AVariable: dword; external( "AVariable" );
    AnotherVar: byte; external( "avariable" );
HLA does not check the strings you supply as parameters to the EXTERNAL clause. Therefore, you
can supply two names that are the same except for case and HLA will not complain. Note that when HLA
calls MASM to assemble its output file, HLA specifies the “/Cp” option that tells MASM to preserve case in
public and global symbols. Of course, you would use this same technique in Gas if the Gas programmer has
exported two symbols that are identical except for case.
The following program demonstrates how to call a MASM subroutine from an HLA main program:
// To compile this module and the attendant MASM file, use the following
// command line:
//
//      ml -c masmupper.masm
//      hla masmdemo1.hla masmupper.obj
//
// Sorry about no make file for this code, but these two files are in
// the HLA Vol4/Ch12 subdirectory that has its own makefile for building
// all the source files in the directory and I wanted to avoid confusion.

program MasmDemo1;
#include( "stdlib.hhf" )

    // The following external declaration defines a function that
    // is written in MASM to convert the character in AL from
    // lower case to upper case.

    procedure masmUpperCase( c:char in al ); external( "masmUpperCase" );

static
    s: string := "Hello World!";

begin MasmDemo1;

    stdout.put( "String converted to uppercase: '" );
    mov( s, edi );
    while( mov( [edi], al ) <> #0 ) do

        masmUpperCase( al );
        stdout.putc( al );
        inc( edi );

    endwhile;
    stdout.put( "'" nl );

end MasmDemo1;