The compiler advantage

Started by John Spikowski, August 11, 2013, 07:20:11 AM

John Spikowski

Quote from: Theo
We should not start to compare interpreters with real compilers.
Script Basic is an interpreter and as such has its advantages and disadvantages.
At the same time a compiler has other goals.
If you compile something, your target is maximum speed; the result is possibly a datatype conversion to a CPU-internal datatype, which is what we see here. If I write a script language, I just stay within the highest precision, because nobody will expect best performance from an interpreted language, just that it works in any case. Therefore we should not compare two different kinds of things.
Please keep the discussions on interpreted languages in the sub forum for that, or else I need to clean up the postings.
Because we have already solved the problem here (Jose explained it), there is no need to mix oil into the water here.
People may have to read the solution for this problem, so it should be easy to find. Academic discussions should be at other places.
@John: Open another post in the Scriptbasic forum IF you see a real topic/advantage there.

Theo,

Twenty years ago I would have agreed with you. With today's hardware I find little advantage in using a low level compiler like PB for 90% of the code I write to earn a living. If I do need a boost in an area of my code, I just spend a few minutes and create a C extension module function that looks like any native script function. With Charles's DLLC extension module I can run ScriptBasic in a threaded model with a common callback handler that allows me to use libraries that aren't thread safe. I can script any API at runtime, access/define structures and even do low level COM. If you still think a compiler is the only way to go, the complete feature set I just mentioned can be embedded in your compiled application with only a few lines of code.

Here are a couple of tests I did on Ubuntu 64 bit and Android Linux comparing SB with compiled applications. I'm unable to see the advantage.


FOR x = 65 TO 90
  a = a & CHR(x)
NEXT x
FOR x = 1 TO 1000
  FOR y = 26 TO 1 STEP -1
    b = b & MID(a, y, 1)
  NEXT y
  b = b & a
NEXT x
PRINT LEN(b),"\n"


jrs@laptop:~/BaCon/V220$ time ./strbench
52000

real   0m0.840s
user   0m0.832s
sys   0m0.004s
jrs@laptop:~/BaCon/V220$ cd ~/sb/sb22/test

jrs@laptop:~/sb/sb22/test$ time scriba strbench.sb
52000

real   0m1.024s
user   0m1.016s
sys   0m0.004s
jrs@laptop:~/sb/sb22/test$
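
For readers without BaCon or ScriptBasic at hand, the same string benchmark can be sketched in Python. This is only a port of the loop logic shown above for trying on other machines; the function name `strbench` and the timing wrapper are illustrative, and its timings say nothing about the BaCon or scriba numbers in the listing.

```python
import time


def strbench(rounds=1000):
    # Build "A".."Z", mirroring the first FOR loop (CHR(65) to CHR(90)).
    a = "".join(chr(x) for x in range(65, 91))
    b = ""
    for _ in range(rounds):
        # Append the alphabet reversed, one character at a time,
        # like MID(a, y, 1) with y counting down from 26 to 1.
        for y in range(26, 0, -1):
            b += a[y - 1]
        # Then append the alphabet in forward order.
        b += a
    return b


if __name__ == "__main__":
    start = time.perf_counter()
    result = strbench()
    elapsed = time.perf_counter() - start
    print(len(result))  # 1000 rounds * (26 + 26) characters = 52000
    print(f"elapsed: {elapsed:.3f}s")
```

Each round appends 52 characters, which is why both runs above print 52000.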


This is an SB MD5 example running on Android compared to the Linux C version. The SB version has to do the following.


  • Load scriba
  • Parse/tokenize script
  • Load extension module and declare external functions
  • Load 4.3 MB string from a file into a SB string variable
  • Call the extension module MD5 function passing the loaded text of the bible.
  • Format binary results to a HEX string
  • Cleanup memory, unload extension module and exit scriba
Compare that to a compiled C program that is a standard Linux utility and there is a difference of only three hundredths of a second (0.12s versus 0.09s real time in the run below).


declare sub MD5 alias "md5fun" lib "t"
declare sub LoadString alias "loadstring" lib "t"

s = LoadString("Bible.txt")

m = MD5(s)

for x = 1 to len(m)
  print right("0" & hex(asc(mid(m,x,1))),2)
next
printnl


shell@android:/sdcard/scriptbasic $ time scriba md5cc.sb
17CE80BA9F6A0F74093C575205C9CB17
    0m0.12s real     0m0.06s user     0m0.04s system

shell@android:/sdcard/scriptbasic $ time md5 Bible.txt
17ce80ba9f6a0f74093c575205c9cb17  Bible.txt
    0m0.09s real     0m0.05s user     0m0.02s system
shell@android:/sdcard/scriptbasic $
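
The hex-formatting loop in the script (two hex digits per byte of the binary digest) can be sketched in Python as follows. This uses Python's standard `hashlib` rather than the `t` extension module from the post, and the helper name `md5_hex_upper` is made up for illustration.

```python
import hashlib


def md5_hex_upper(data: bytes) -> str:
    # The raw MD5 digest is 16 bytes, like the binary string returned by MD5(s).
    digest = hashlib.md5(data).digest()
    # Same idea as right("0" & hex(asc(mid(m,x,1))), 2):
    # format every byte as exactly two uppercase hex digits.
    return "".join(f"{byte:02X}" for byte in digest)


if __name__ == "__main__":
    # Any bytes will do; the post hashes the contents of Bible.txt.
    print(md5_hex_upper(b"hello"))
```

The zero padding matters: without it, a byte such as 0x09 would print as a single digit and the result would no longer be 32 characters long.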

Aslan Babakhanov

Yet another holy war: comparing language speeds :)

Processing "boring" string concatenation or math in a loop is not the best basis for a comparison.

I would recommend categorizing the tests and at least defining the goal:

1) Select programming languages
2) Select OS's

If you plan to test on Android and x86, then test results may not be equal due to processor architecture differences.
For example, the implementation of jump operations on ARM processors differs from that of the x86 family.
The same applies to register/memory operations.

Here are the basic test methods:

* loop operations
* if/endif
* data conversion/casting parameters
* string manipulations
* calling subroutines
* math operations: without and with optimizations

Do not call any API functions.
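
A categorized suite along these lines could look like the following Python sketch. The workloads, sizes and names (`CATEGORIES`, `run_suite`) are arbitrary choices for illustration, not a standard benchmark; the point is only that each category is timed separately and that nothing calls out to an external API.

```python
import timeit


def loop_test():
    total = 0
    for i in range(10_000):
        total += i
    return total


def branch_test():
    hits = 0
    for i in range(10_000):
        if i % 2 == 0:
            hits += 1
    return hits


def conversion_test():
    # Integer -> string -> float round trip.
    return [float(str(i)) for i in range(2_000)]


def string_test():
    return "".join(chr(65 + i % 26) for i in range(10_000))


def _callee(x):
    return x + 1


def call_test():
    value = 0
    for _ in range(10_000):
        value = _callee(value)
    return value


def math_test():
    return sum(i * i / (i + 1) for i in range(10_000))


CATEGORIES = {
    "loop": loop_test,
    "if/endif": branch_test,
    "conversion": conversion_test,
    "string": string_test,
    "subroutine call": call_test,
    "math": math_test,
}


def run_suite(repeat=3, number=5):
    # Keep the best (minimum) time per category to reduce noise.
    results = {}
    for name, func in CATEGORIES.items():
        results[name] = min(timeit.repeat(func, repeat=repeat, number=number))
    return results


if __name__ == "__main__":
    for name, seconds in run_suite().items():
        print(f"{name:16s} {seconds:.4f}s")
```

Reporting one number per category makes it visible where two language implementations actually differ, instead of folding everything into a single total.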


Theo Gottwald

Scripting is the tool of choice if you need some sort of SysAdmin work quickly automated.
Sort files. Patch things. Like cleaning up your PC household.
When it comes to real applications that need the highest speed, nothing can beat a compiler.

But I agree with you, John, that most SysAdmin work, more than 90%, can be done using scripting.
But as soon as you touch, for example, rendering (raytracing) or image processing,
scripting (unless you use scripting languages that are optimized exactly for that) is no longer an option.
Same with sound processing. You still need every CPU cycle to get the expected performance.

Because not only has CPU power risen with time; the amount of graphical and other data that needs to be processed has increased in the same way.

Scripting: quickly solve tasks.
Application programming: use a compiler, of course.

I must say that I am not primarily a database or API programmer.
If my programs used 95% of their runtime to call other DLLs or APIs, then of course there would be no difference between compiler and interpreter.
In my field (explained above) a scripting language will have no chance unless there are special optimizations first.
But then they will need to be compiled.

John Spikowski

My point is that the difference between interpreters and compilers on today's hardware is not night and day as it once was. I haven't seen many web applications being built with compilers. Java is the most widely used language on the planet and it's not a compiler in a true sense. There is no reason not to use scripting to save time and call compiled code when a boost in performance is needed. Charles realized that compilers have a different role and wrote his with JIT features.

I'm not a PowerBASIC hater and was part of the beta team for years. I moved on when Bob's censorship made it seem like I was dealing with a company based out of North Korea. If PowerBASIC went public today, would you buy stock in the company?


Anand Kumar

#4
As far as an interpreter is concerned, whenever it is executed, the run-time and the dependencies associated with it are loaded into memory.  The memory requirements for an instance depend on the static features of the ScriptBasic language, which are defined in the grammar file.  So if we run 1000 concurrent instances of script-basic, it should consume most of the system resources because of its memory requirements (this depends on the available memory and other resources).  For a compiled program, only the assembled code, its associated run-time (which is ~6kb for PB) and dependencies are loaded into memory.

When an interpreted program is executed...  once the run-time is loaded, the program to be executed is loaded into memory by the interpreter's loader + parser + syntax analyzer, sentences are validated and then a tree structure that is ready for execution is created (tokens).  Now, multiple instances of this tree structure can be executed without the need for parsing, loading etc. Even then its performance is not on par with a compiled program.  This is because the interpreter recreates the entire set of tools used by the operating system to load a program into memory and maintain the program flow, and this is true for any scripting language.  For a compiled program, the parser/syntax analyzer does not exist and the program is just loaded and executed.  In the case of client side scripts, the script engine is loaded into memory when the browser is loaded, so there is less of a performance penalty, but the rest of the issues remain.  In the case of server side scripts, the server does load the script engine into memory, but the delays due to loading/parsing/syntax analyzing a specific script still exist.
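
The "parse once, execute many times" idea can be illustrated with Python's built-in `compile()` and `exec()`. This is a generic sketch and not how ScriptBasic itself is implemented; `SOURCE` and the two function names are made up for the example.

```python
SOURCE = "total = sum(i * i for i in range(100))"


def run_reparsing(times):
    # Parse and compile the source text again on every single run.
    namespace = {}
    for _ in range(times):
        namespace = {}
        exec(compile(SOURCE, "<script>", "exec"), namespace)
    return namespace.get("total")


def run_precompiled(times):
    # Parse and compile once, then execute the prepared form repeatedly.
    code = compile(SOURCE, "<script>", "exec")
    namespace = {}
    for _ in range(times):
        namespace = {}
        exec(code, namespace)
    return namespace.get("total")


if __name__ == "__main__":
    import timeit

    print("reparse each run:", timeit.timeit(lambda: run_reparsing(1000), number=5))
    print("compile once:    ", timeit.timeit(lambda: run_precompiled(1000), number=5))
```

Both variants compute the same result; the second one simply removes the loader/parser/syntax-analysis step from every run after the first, which is the saving described above. The execution of the prepared form is still interpreted in both cases.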

During execution of the tokens, the appropriate functions that are defined in the syntax file are called by the execution engine based on the control flow that is maintained by the interpreter.  Parameter type conversion happens in the individual function before the function is executed, and these parameters can be either variables or arrays.  In Script Basic the return values are predominantly variables (based on my experience) and here again, unless defined explicitly, the scope of the variables/arrays is global (if we do not include a local definition in the function/procedure we will modify the global data).  Again, in Script Basic, variables are accessed by mapping each variable to a global list of locations, and when we get/set values we modify values in this list.  Contrary to this, in a compiled program, variables are data memory locations whose content is modified during program execution and functions are program memory locations to which jumps are made according to the control flow.  So you can see the fundamental difference that results in performance issues.  This is inherent in the design, and just because we have faster machines it does not mean it is not present.
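
To make the contrast concrete, here is a toy tree-walking evaluator in Python in which variables live in a slot list and every node is dispatched at run time. It is a generic illustration of the mechanism described above, not ScriptBasic's actual data structures; the node layout and the names `evaluate`, `TREE` and `direct` are invented for the example.

```python
# Node forms (tuples):
#   ("num", value)        literal number
#   ("var", slot_index)   variable stored in a shared slot list
#   ("add", left, right)  addition
#   ("mul", left, right)  multiplication


def evaluate(node, slots):
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        # Variable access is an indexed lookup in the slot list.
        return slots[node[1]]
    if kind == "add":
        return evaluate(node[1], slots) + evaluate(node[2], slots)
    if kind == "mul":
        return evaluate(node[1], slots) * evaluate(node[2], slots)
    raise ValueError(f"unknown node kind: {kind!r}")


# Tree for the expression  x * 2 + y  with x in slot 0 and y in slot 1.
TREE = ("add", ("mul", ("var", 0), ("num", 2)), ("var", 1))


def direct(x, y):
    # Stand-in for the compiled form: the expression is written out directly.
    return x * 2 + y


if __name__ == "__main__":
    print(evaluate(TREE, [3, 4]), direct(3, 4))
```

Every operation in the tree costs a kind check, a dispatch and a recursive call on top of the arithmetic itself. A compiler resolves all of that ahead of time, which is the per-operation overhead the paragraph above refers to.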

Actually, I can even say that with an interpreter, on every execution lots of CPU cycles and system resources are wasted because of the code necessary for loading+parsing+syntax analyzing+variable reference+function calling+translation on every run.  In case of a Compiler, this is done at compile time and only execution of assembled instructions relevant to the program occurs when the program is executed.  Therein lies the difference. 

In the case of client side scripting, there are too many platforms, and compilers are designed for a specific machine architecture, so interpreters are the only available mechanism to support program execution.  Here again difficulties arise due to the version of the interpreter in use and the grammar supported by that interpreter (e.g. Java 1.4 vs Java 1.7).

So let's not compare apples and oranges.  Each of these language tools is designed for a specific purpose.  Compilers are for language translation (source program language to target machine language) while interpreters do what the name says... executing the actions of program constructs written in a source language at run-time.


John Spikowski

#5
Quote
So if we run 1000 concurrent instances of script-basic, it should consume most of the system resources because of its memory requirements (this depends on the available memory and other resources).

The ScriptBasic runtime (libscriba <360KB) is only loaded into memory once. Each tokenized binary script is run in its own thread sharing the common runtime. The ScriptBasic webserver is a perfect example. It was used for a special event that at one point was seeing 4 million hits an hour with no noticeable stress on the server.
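
The general pattern of one shared runtime and one prepared script, with each instance running in its own thread on private data, can be sketched in Python. This is only an analogy for the model described above and does not use or reproduce ScriptBasic's threading code; `SCRIPT`, `run_instance` and `run_many` are hypothetical names.

```python
import threading

# The compiled script is created once and shared read-only by all threads.
SCRIPT = compile("result = sum(range(n))", "<script>", "exec")


def run_instance(n, out, index):
    # Each instance gets its own namespace, so the data is private per thread.
    namespace = {"n": n}
    exec(SCRIPT, namespace)
    out[index] = namespace["result"]


def run_many(inputs):
    out = [None] * len(inputs)
    threads = [
        threading.Thread(target=run_instance, args=(n, out, i))
        for i, n in enumerate(inputs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(run_many([10, 100, 1000]))
```

Only the per-instance namespace grows with the number of concurrent runs; the interpreter and the prepared script exist once in memory.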

That was a HUGE misconception and I wanted to get that point squared away right from the start. I'll follow up with the rest of your points shortly. (busy at the moment)

Playing in the trees

I don't know what BASIC interpreter you're referring to, but that's not how ScriptBasic works. SB uses a contiguous block of memory to hold the highly optimized binary tokenized script. I see little difference between a BASIC to C compiled program's runtime library and SB's C compiled runtime library. Granted, in tight loops C/compiled BASIC programs will excel. How many real world applications that you would actually use a BASIC language for are in that state? As I mentioned before, if I find a section of code that would be better served as a compiled C function, I create an extension module and move on.

Very little of the SB code (application) isn't running in a compiled state. Even the interpretive script is optimized for execution. Not all interpreters work this way. Eros's thinBasic, for example, is a line by line interpreter, and if you check out some of the code users have posted, it looks just like PowerBASIC SDK style BASIC but easier to use.

Recreating the wheel

Quote
Actually, I can even say that with an interpreter, on every execution lots of CPU cycles and system resources are wasted because of the code necessary for loading+parsing+syntax analyzing+variable reference+function calling+translation on every run.  In case of a Compiler, this is done at compile time and only execution of assembled instructions relevant to the program occurs when the program is executed.  Therein lies the difference.

ScriptBasic parses the text script once and keeps its binary form in a cache directory. I can easily convert the user script to a C source file with the -C ScriptBasic command line switch. This creates a standalone executable normally in the 40-60KB range depending on the complexity of the script. (Extension modules used can be statically linked in as well.)
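
The idea of a cache directory holding the already-parsed form of a script can be sketched in Python with `compile()` and the standard `marshal` module. This is an analogy only: the cache layout, the hashing of the source text and the name `load_cached` are assumptions made for the example and say nothing about ScriptBasic's actual cache format.

```python
import hashlib
import marshal
import os
import tempfile


def load_cached(source: str, cache_dir: str):
    """Return (code_object, cache_hit) for the given source text."""
    # Key the cache entry on the source text, so an edited script is re-parsed.
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".bin")
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return marshal.loads(fh.read()), True
    code = compile(source, "<script>", "exec")
    with open(path, "wb") as fh:
        fh.write(marshal.dumps(code))
    return code, False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache:
        for attempt in range(2):
            code, hit = load_cached("x = 6 * 7", cache)
            namespace = {}
            exec(code, namespace)
            print(attempt, "cache hit" if hit else "parsed", namespace["x"])
```

The first run pays for parsing and writes the binary form; later runs of the unchanged script load it directly and skip the parser.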


Anand Kumar

John,

This depends on the grammar that you utilize.  The syntax definitions of version 2.0 are around 1000 lines long (including the comments) supporting roughly 250 basic functions.  If you keep enhancing the grammar to support more functions then the run-time size increases.  My run-time engine is around 1.5 MB and it supports more than 500 functions.  When you use it as a Server, which is a special case, the run-time engine is loaded into memory and used as a service for token creation, and a separate thread can be spawned for executing these tokens.  That is a way to optimize the system performance.

When you speak about 4 million hits, the questions that need to be asked are:

1) Are the hits executing the same program or is each hit referring to a different program? If it is the same program then the need for loading + parsing + syntax analyzing is removed...  A copy of the generated tokens is sufficient for execution.  If they are unique then 4 million trees need to be created, which should consume appropriate memory.

2) Are they concurrent hits or are they sequential hits?  If they are concurrent then 4 million threads need to be spawned; this is an OS feature which has been utilized.

3) What is the size of the program in terms of number of lines? (The maximum lines I have written using Scriptbasic is ~10000 while with PowerBasic it is ~250000.) This is necessary because if the number of lines increases then the tree size also increases and system resource consumption increases.

4) What is the duration of execution of each of these threads? I have run script-basic programs (based on my extensions) that run for the whole day.  My PB programs run for weeks, but their purpose is different.

Script Basic is a good tool...  Peter has made it extensible but still it is an interpreter with its own limitations. 

John Spikowski

#7
Quote
This depends on the grammar that you utilize.  The syntax definitions of version 2.0 are around 1000 lines long (including the comments) supporting roughly 250 basic functions.  If you keep enhancing the grammar to support more functions then the run-time size increases.  My run-time engine is around 1.5 MB and it supports more than 500 functions.  When you use it as a Server, which is a special case, the run-time engine is loaded into memory and used as a service for token creation, and a separate thread can be spawned for executing these tokens.  That is a way to optimize the system performance.

PLEASE update your scriba and extension modules to 2.2, as the ScriptBasic 2.0 version (before my time) is over 10 years old. gcc deprecations and fixes for 64 bit and Android are just some of the 2.2 changes that were made. The only syntax additions were made by Tom, and he added just over a dozen new math functions (most were slated for completion and left as stubs by Peter). He also added a new RAD() function which the new math functions accept.

All my additions to ScriptBasic are done as extension module functions.

Quote
When you speak about 4 million hits, the questions that need to be asked are:

1) Are the hits executing the same program or is each hit referring to a different program? If it is the same program then the need for loading + parsing + syntax analyzing is removed...  A copy of the generated tokens is sufficient for execution.  If they are unique then 4 million trees need to be created, which should consume appropriate memory.

2) Are they concurrent hits or are they sequential hits?  If they are concurrent then 4 million threads need to be spawned; this is an OS feature which has been utilized.

As I understand the story, the SB web server and the MySQL extension module were used as a registration application for a marathon or some other big event. You would need to ask Peter, as I got the story second hand. Keep in mind the 4 million hits was the total number over an hour period. I have no idea what the peak load was or if they used the MT session extension module.
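
To put that figure in perspective, the average request rate implied by 4 million hits in one hour is simple arithmetic. The short Python sketch below only computes that average; the peak load, as noted, is unknown.

```python
SECONDS_PER_HOUR = 60 * 60


def average_hits_per_second(hits_per_hour):
    # Average rate only; it says nothing about bursts or concurrency.
    return hits_per_hour / SECONDS_PER_HOUR


if __name__ == "__main__":
    rate = average_hits_per_second(4_000_000)
    print(f"{rate:.0f} hits per second on average")
```

So the figure corresponds to roughly 1,100 requests per second on average, which is far from 4 million simultaneous threads.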

Quote
3) What is the size of the program in terms of number of lines? (The maximum lines I have written using Scriptbasic is ~10000 while with PowerBasic it is ~250000.) This is necessary because if the number of lines increases then the tree size also increases and system resource consumption increases.

WOW! a 10,000 line SB program, wait a second and let me get Guinness on the line as this may be a record.  8)

Quote
4) What is the duration of execution of each of these threads? I have run script-basic programs (based on my extensions) that run for the whole day.  My PB programs run for weeks, but their purpose is different.

Once again, you would have to ask Peter.

Quote
Script Basic is a good tool...  Peter has made it extensible but still it is an interpreter with its own limitations.

I have yet to run into a limitation with ScriptBasic. I wish I could say the same for PB. (Win32 only, incomplete COM, unknown future, ...)


Theo Gottwald

#8
John - can I help you on that?
Anything in the field of image processing or ray tracing ...
Especially when it gets to larger resolutions. Often not even a compiler is best for that but pure ASM.

On the other hand, office automation is a perfect use for scripts, because you "just call" the host program.
This can be done by script languages. VB Script is popular in this field.

That said, I believe that a developer of a script language is on swampy terrain if he tries to beat a compiled language in terms of speed.
It just does not make sense. It's like wanting to win a race against a Porsche with a tank.
I have only seen one approach that was very fast, and that was from ASM guru Hutch several years ago.
But it will never beat a compiler.

As I see it, Scripting languages serve special purposes, where speed is not important.
Here is an example:

Smart Package Robot Script

Note that this program is not my program but from DS (while some parts of it come from me).
I am showing this just as an example of a script language that is used for special purposes.

As you can see, maximum speed is not important in this usage. Having a compiler, on the other hand, would not help the user.
This is a place where special commands are important. These are the places where dedicated script languages can survive.

That's why I say "Get a specialized purpose for your scripting language". How about a WOW script language, John?
I have heard that there are people out there who earn a LOT of MONEY with these ...
Some of them even became millionaires.
That's nothing other than another dedicated "script language". No matter how fast ...

Thinking about it i should make a BOT for FARMVILLE :-)

@John, if you have something substantial to say, your posts will not be deleted. But large pictures posted for no reason and useless posts get deleted. And please keep a bit of forum discipline, otherwise we can delete more than just posts.
Sometimes we like to joke, but primarily we look at what's good for our forum readers.
We want compressed information here. We want useful source code. Try that.

José Roca

Some of them have been deleted by me, especially the one in which you were insulting the members of this forum. You have your own forums: use them to post and leave us alone.

John Spikowski

To some it seems all I do is complain and bitch about nothing anybody cares about. In reality I'm probably the most active BASIC developer out there. (prove me wrong!) Here is another example of my adventures with porting C based BASIC languages to Android Linux and compiling native.



This is Return to BASIC, a traditional interactive BASIC, running on my Samsung Galaxy Tab 2 10.1 tablet using SDL. The background SDL user interface controls can be toggled on/off from the application's preferences menu. I'll post an rtb.apk as soon as I fix the remaining Unicode issues, which SDL 1.2 doesn't support.

John Spikowski

Quote
You have your own forums: use them to post and leave us alone.

The last thing I want to do is upset you. I have tremendous respect for the effort you put into your include files, which are the reason PowerBASIC still has a product they can sell.

I assume what you mean by us is the loyal PowerBASIC users that still believe there is a bright future for PB and users of all those other BASIC wannabe languages will soon get it.

If you prefer that this forum remain PowerBASIC centric, then you should make that fact known so folks like Patrice, Charles, Theo and myself aren't posting off topic content.



José Roca

Don't twist my words. You know exactly what I mean. If you want to talk about O2, SB or whatever, do it, but stop bashing PB or other compilers. If you want to bash other compilers, do it in your forums and annoy your users. You already did it in your All Basic forum.

When Charles asked me for permission to post about Linux and FreeBasic (and later O2), not only did I allow it, but I made him a moderator. I also asked Fred to be a moderator, mainly because of his experience with C++. And Patrice is now using C++.

Please stop trying to discourage PB users. By annoying them, not only are you not going to increase the number of SB or O2 users, but it will backfire. Nobody is going to register in the O2 forum, just to not have to deal with you.

Enough said. If you continue with your campaign against PB in this forum, I know how to stop it...

John Spikowski

Quote
Nobody is going to register in the O2 forum, just to not have to deal with you.

The difference is Charles isn't trying to sell you anything and doesn't have the overhead of a staff and expenses to support.

I think Charles, I and ScriptBasic have been releasing some state of the art technology (JIT virtual DLLs, multi-threading with a smart (common) event handler that works with non-thread safe code)  What have you heard (other than excuses for taking the PB website down for weeks unexpectedly)  from the PB team?

I have no beef with PowerBASIC, but I'm not going to lie and call it God's gift to programmers and the only language you will ever need.

José Roca

I can't tell you what I know because I have signed an NDA.

I don't need JIT.

What are virtual DLLs?

PB has a good implementation of threads. Besides threaded variables and the old THREAD statements, there is also a THREAD object and you can also make any function, method or class thread safe by just adding the THREADSAFE clause to it.
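
For readers who do not use PB, the effect of marking a routine thread safe can be approximated in Python with a lock-based decorator. This sketch assumes that THREADSAFE serializes calls so that only one thread executes the routine at a time; that reading is an assumption here and should be checked against the PowerBASIC documentation. The names `threadsafe`, `increment` and `run` are invented for the example.

```python
import functools
import threading


def threadsafe(func):
    # One lock per decorated function: only one thread may be inside at a time.
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper


counter = {"value": 0}


@threadsafe
def increment(times):
    # A read-modify-write sequence that is unsafe without serialization.
    for _ in range(times):
        counter["value"] += 1


def run(workers=4, times=10_000):
    counter["value"] = 0
    threads = [threading.Thread(target=increment, args=(times,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run())
```

Serializing the whole routine is the simplest way to make it safe, at the cost of the threads taking turns whenever they call it.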

Can you instruct me how to do more mundane things like adding a resource file to my executable? All my EXEs use a resource file and I haven't found a way to do it with O2.