• Welcome to Theos PowerBasic Museum 2017.

The O2 Project - Structured Machine Code (FB)

Started by Charles Pegge, March 13, 2008, 03:12:07 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Charles Pegge

Update: 17 Mar 2008
New Zip: No longer case sensitive.
double quote strings added.



This is an experimental language to support structured x86 machine code programming. As you can see it is extremely small. The entire compiler-linker is a single function, of about 5kb. - Small enough to carry inside my head :)

The purpose of this micro language is to see whether it is practical to create substantial programs using machine code with the addition of support for block structured programming, like a high level language. It is also very strong on scoping, allowing labels to be restricted to the blocks in which they are defined.

The zip file below contains the source code, executable and an x86 machine code list for quick reference.

A Simple Program looks like this:

Oxygen
    '
    ' TESTS
    '
    ' 17:45 11 Mar 2008
    '
    '--------------'
    ' nested loops
    '--------------'

    b9 nl5
    33 d2
    (
        b8 nl10
        (
            42
            48 7f repeat
        )
        49
        7f repeat
    )
    eb gEnd ' jump over some string data
    [ this is a string ]

    .End_of_Prog

    8b c2
    c3


Apart from the block structure there is a g (goto or gosub) which will only allow forward jumps. All labels are defined after the code that uses them.

Only the first two letters of a label are significant but with forward referencing, a label is out of scope as soon as it is defined. This allows labels to be 'recycled' many times throughout a program.



FreeBasic


'   O2
'   alias Oxygen    :D

'   Compiler for structured machine code
'
'------------------------------------------------------------------------

'   11:26 13 Mar 2008
'   Charles E V Pegge
'   Using FreeBasic .183
'------------------------------------------------------------------------



' pass location of code and location of data space:

function caller( byval p as byte pointer, byval q as byte pointer) as uinteger
asm
  mov eax, [p]
  mov ebx, [q]
  call eax             ' call to the afdress contained in eax
  mov [function],eax   ' assume eax contains something meaningful and return it
end asm               '
end function


'------------------------------------------------------------------------

'pass the source string to this function and obtain a binary coding:

'-------------------------------------


function hexlink(byref s as string,byref ert as long,byref i as long) as string
dim as long nb(127),xi(255,3)
dim as long neb=0,xib=0
dim as long j=1, l=len(s), k=0, m=0, n=0, v=0,a=0
dim as string t=string$(len(s),chr$(0)),tb=""
ert=0:i=1
do ' main loop
    do ' skip space for next word
        if (i>l)or(asc(s,i)>32) then exit do
        i+=1
    loop
    if i>l then exit do
    do ' case interpret word
        a=asc(s,i)
        ' commemt til end of line
        if a=39 then ' ' comment
            do ' advance to end of line
                i+=1: if (i>l)or(asc(s,i)<=13) then exit do
            loop
            exit do ' next word on next line
        end if
        ' reserve a place in the jump table:
        if a=58 then ' :
            i+=2: tb=tb+chr$(&he9)+mkl$(j-1)+chr$(&h90)+chr$(&h90)+chr$(&h90): k+=1: exit do
        end if
        if a=36 then ' $ reserve space
            m=0
            if asc(s,i+1)=36 then m=4: i+=1 ' $$ align
            v=val("&h"+mid$(s,i+1,4))
            t= left$(t,j-1) + string$(v,chr$(0)) + mid$(t,j)
            j+=v
            v=(j-1) and &hfffffffc:v+=1
            if v<j then j=v+4
            exit do
        end if
        if a=&h5b then  ' [ ] nestable superquote ascii char strings
            m=0: a=i
            do
                i+=1
                if i>l then exit do
                if asc(s,i)=&h5b then m+=1 ' [
                if asc(s,i)=&h5d then m-=1: if m<0 then exit do ' ]
            loop
            a+=1
            mid$(t,j)=mid$(s,a,i-a): j+=i-a: i+=1: exit do
        end if
        ' block programming
        if a=&h28 then neb+=1:nb(neb)=j:i+=1:exit do '(
        if (a=&h29)or(a=&h2e) then ') . fwd refs only
            if a=&h2e then
             i+=1 ' dot label
             v=asc(s,i)*256:a=asc(s,i+1): if a>64 then v+=a ' 2 significant letters
             a=v
            end if
            if xib>0 then
                v=xib-1
                do 'scan exits table
                    if v<0 then exit do
                    if (xi(v,1)=a)and(xi(v,2)>=neb) then
                        m=xi(v,0)
                        do
                            if m<0 then mid$(t,-m)=mkl$(j+m-4): exit do
                            n=(j-m-1)
                            if (n<0)or(n>127) then ert=2: goto exitf ' short fwd jump out of range
                            mid$(t,m)=chr$(n)
                            exit do ' cases
                        loop
                        xi(v,1)=0: xi(v,2)=0
                    end if
                    v-=1
                loop
                ' consolidate list
                v=0:m=0
                do
                    do
                        if xi(v,1)<>0 then ' skip valid entries
                            v+=1:if m<=v then m=v
                            exit do
                        end if
                        if xi(m,1)=0 then exit do ' omit used entries
                        ' otherwise relocate unresolved entries
                        xi(v,0)=xi(m,0):xi(v,1)=xi(m,1):xi(v,2)=xi(m,2)
                        xi(m,0)=0: xi(m,1)=0:xi(m,2)=0:v+=1
                        exit do 'cases
                    loop
                    m+=1:if m>xib then exit do
                loop
                xib=v
            end if
            if a=&h29 then neb-=1 ' )
            if neb<0 then ert=4: goto exitf ' too many right braces
            i+=1:exit do
        end if
        m=asc(s,i+1)
        if a=&h67 then
             i+=1:if m=&h6c then i+=1
             v=asc(s,i)*256:a=asc(s,i+1): if a>64 then v+=a ' 2 significant letters
             if m=&h6c then xi(xib,0)=-j:xi(xib,1)=v:xi(xib,2)=neb:xib+=1:j+=4: exit do' g  ' long
             xi(xib,0)=j:xi(xib,1)=v:xi(xib,2)=neb:xib+=1:j+=1: exit do' g 'short
        end if
        if a=&h78 then
            if m=&h6c then i+=1: xi(xib,0)=-j:xi(xib,1)=&h29:xi(xib,2)=neb:xib+=1:j+=4: exit do' x exit
            xi(xib,0)=j:xi(xib,1)=&h29:xi(xib,2)=neb:xib+=1:j+=1: exit do' x exit
        end if
        if a=&h72 then ' r repeat
            if m=&h6c then i+=1:v=nb(neb)-j-4: mid$(t,j)=mkl$(v):j+=4:exit do ' r repeat
            v=nb(neb)-j-1: if v<-128 then ert=3: goto exitf ' short loop out of range
            mid$(t,j)=chr$(v and &hff):j+=1:exit do
        end if
        '
        ' otherwise assume a decimal octal or hex code
        if a=&h68 then 'h for hex
            i+=2
            if m=&h77 then mid$(t,j)=mkshort$(val("&h"+mid$(s,i,4))):j+=2: exit do ' w
            if m=&h6c then mid$(t,j)=mkl$(val("&h"+mid$(s,i,8))):    j+=4: exit do ' l
            i-=1: mid$(t,j)=chr$(val("&h"+mid$(s,i,2))):             j+=1: exit do ' default hex byte
        end if
        if a=&h6e then 'n for decimal
            i+=2
            if m=&h77 then mid$(t,j)=mkshort$(val(mid$(s,i,16))):  j+=2: exit do ' w
            if m=&h6c then mid$(t,j)=mkl$(val(mid$(s,i,16))):      j+=4: exit do ' l
            if m=&h71 then mid$(t,j)=mklongint(val(mid$(s,i,26))): j+=8: exit do ' q
            if m=&h73 then mid$(t,j)=mks$(val(mid$(s,i,16))):      j+=4: exit do ' s
            if m=&h64 then mid$(t,j)=mkd$(val(mid$(s,i,24))):      j+=8: exit do ' d
            if m=&h62 then mid$(t,j)=chr$(val(mid$(s,i,16))):      j+=1: exit do ' b
            i-=1: mid$(t,j)=chr$(val(mid$(s,i,16))):               j+=1: exit do ' default decimal byte
        end if
        if i+2<l then 'octal byte
            if asc(s,i+2)>47 then mid$(t,j)=chr$(val("&o"+mid$(s,i,3))): j+=1: exit do
        end if
        mid$(t,j)=chr$(val("&h"+mid$(s,i,2))): j+=1:exit do ' default hexadecimal byte
        exit do ' always drop out
    loop ' end cases
    '
    do ' find delimiting char before next word
        if (i>l)or(asc(s,i)<=32) then exit do
        i+=1
    loop
loop ' for next word
'
' calculate rel jumps for table
i=0
do
    if i>=k then exit do
    m=i*8: mid$(tb,m+2) = mkl$( cvl(mid$(tb,m+2,4))+len(tb)-m-5 ) ' relative jump
    i+=1
loop
exitf:
    if ert=0 then if xib>0 then ert=1000000+xi(xib-1,1) 'unresolved refs
    if ert>0 then function=chr$(&hc3): exit function
    function=tb+left$(t,j-1)
end function






function peeks(byval v as any ptr , l as long) as string
dim as string s=string$(l,chr$(0))
dim as long i=0
dim as byte ptr j=v
do
    if i>=l then function=s: exit function
    s[i]=peek(j): j+=1:i+=1
loop
end function



function displayhex(s as string) as string
    dim as long c=0,i=0,j=3,l=len(s)
    dim as string t=chr$(13)+chr$(10)+space(l*3.125 +1)
    do
        i+=1:if i>l then exit do
        mid$(t,j)=right$("0"+hex$(asc(s,i)),2):j+=3:c+=1
        if c>15 then c=0:mid$(t,j)=chr$(13)+chr$(10):j+=2
    loop
    function=left$(t,j-1)
end function


function displayoct(s as string) as string
    dim as long c=0,i=0,j=3,l=len(s)
    dim as string t=chr$(13)+chr$(10)+space(l*4.125 +1)
    do
        i+=1:if i>l then exit do
        mid$(t,j)=right$("00"+oct$(asc(s,i)),3):j+=4:c+=1
        if c>15 then c=0:mid$(t,j)=chr$(13)+chr$(10):j+=2
    loop
    function=left$(t,j-1)
end function




dim as string s,w,src,crz=chr$(10)+chr$(13)
dim as long ert,i,f1=freefile


if (command$="")or(asc(command$)=63) then
    print "O2 opcode compiler. Version 0.0  12 Mar 2008"
    print "Usage: O2 filename":print
    print "Instruction set:"
    print " ------------------------------------------------------------------------"
    print "   INSTRUCTION SET"
    print " ------------------------------------------------------------------------"
    print " all words are delimited by white space or comment mark"
    print "   '        comment to end of line"
    print "   [..]     string literal (the square brackets are nestable)"
    print "   .label   forward labels (only the first 2 letters are significant)"
    print "   :        make entry in jump table"
    print "   $        make space (value in hexadecimal)"
    print "   $$        make space & align to nearest 4 bytes"
    print "   2 digits  0-9 a-f hexadecimal byte (these are not case sensitive)"
    print "   3 digits  0-7 octal byte "
    print "   g         short forward relative jump (but not into an inner block)"
    print "   gl        long forward relative jump  (ditto)"
    print "   (         start of block"
    print "   )         end of block"
    print "   x         short jump exit from block"
    print "   xl        long jump exit from block"
    print "   r         short jump repeat from start of block"
    print "   rl        long jump repeat from start of block"
    print "   h         hexadecimal numbers: (not case sensitive)"
    print "   hw            word:   2 byte integer"
    print "   hl            long:   4 byte integer"
    print "   n         decimal numbers:"
    print "   nb            byte    1 byte"
    print "   nw            word:   2 byte integer"
    print "   nl            long:   4 byte integer"
    print "   nq            quad:   8 byte integer"
    print "   ns            single: 4 byte floating point"
    print "   nd            double: 8 byte floating point"
    print " "
    print " ------------------------------------------------------------------------"

    end
end if
err=0
open command$ for binary access read as #f1
if err then print "Unable to locate source: "+command$:close:end
src=space(lof(f1)): get$ #f1,,src
close #f1

s=hexlink(src,ert,i)
w=string$(1000,chr$(0))

print "Length: ";len(s)
print

print "Coding: ";displayhex(s): print

if ert then
    if ert<1000000 then print left$(src,i) else print "Reference ";chr$(ert-1000000)
    print "Error:   ";ert
    print "location: ";i
end if



' execute source in s with shared table in w

if ert=0 then print "Return:  0x";hex$(caller(strptr(s),strptr(w)))


print "end"
end


Petr Schreiber

Thanks Charles,

works great, but it does some trouble with MS Windows DEP ( Data Execution Prevention).
Any way how to go around it? ( except dark magic )


Petr
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

Charles Pegge

#2
That's interesting Petr, a warning from the Windows Police. did you ever see that with MC_Exec in ThinBasic - it's using the same technique but of course with a much smaller program. Are you using XP SP2?

Does this happen when you execute a c3 by itself ?

Petr Schreiber

Hi,

well, now I realised I had thinBASIC in DEP ignore list.
I moved O2 there too, and since then no problems, but I guess for "common users" this could be a problem.

With DEP for all, I get the message both for thinBASIC and O2.


Petr
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

Charles Pegge


Thanks Petr, The curious thing is my PC has DEP turned on for all programs but I don't get the DEP error message when running O2 or R$  which also executes code in a string.

DEP seems to be processor dependent (I have an AMD64X2)
Quote
DEP can take advantage of software and hardware support. To use DEP, your computer must be running Microsoft Windows XP Service Pack 2 (SP2) or later, or Windows Server 2003 Service Pack 1 or later. DEP software alone helps protect against certain types of malicious code attacks but to take full advantage of the protection that DEP can offer, your processor must support "execution protection". This is a hardware-based technology designed to mark memory locations as non-executable. If your processor does not support hardware-based DEP, it's a good idea to upgrade to a processor that offers execution protection features


For those of us who are unfamiliar with DEP, here is a description and some instructions on how to configure it.

http://www.updatexp.com/dep-exceptions.html