
Code-Formatter PB 10

Started by Theo Gottwald, January 18, 2011, 07:58:28 AM

Paul Elliott

Larry,

I think I tried to get the continued lines indented 1 tab stop UNLESS they were under
one of your blocks of multi-line assignments. If you combine all of your GDI Plus code & includes
into 1 file and run it, you will probably see what I'm talking about ( as your code was what
I based it on ).


Jon Eskdale

Hi Paul
I have come across a problem where I am getting
Untrapped Error #5 (Illegal function call) has occurred following execution of AGN2

I've narrowed it down to this bit of source which it doesn't like.  I've edited it to make it short to demonstrate the problem


SELECT CASE wMsg
     ' MLGDEBUG HEX$(wMsg)
     CASE %WM_CREATE  'Allocate storage for the vGridData structure.
        LOCAL tm   AS TEXTMETRIC, ps AS PAINTSTRUCT,_
              si   AS SCROLLINFO, _'lp AS POINTAPI, _
              rc   AS RECT, wRect AS RECT, _
              hdc  AS DWORD, hPen AS DWORD, hBrush AS DWORD, _
              hBrushSel AS DWORD, _
              y AS LONG, x AS LONG, I AS LONG, J AS LONG , iVscrollInc AS LONG, hScrlInc AS LONG
       STATIC MyPoint AS POINTAPI      'the rightclick menu uses this.  Needs to be static
End Select


It seems to be the Array Scan that is generating the error while it is processing the line y as LONG, x as LONG....

Any Ideas - Thanks
Jon


Paul Elliott

Ok, the attached replacement DoFormat.bas should fix the few things found.   ;)



Theo Gottwald

Paul, please post the complete source code, so anybody can use the actual final version.
It would also be good if you added a statement on how you see the state of the project: "ready ... finished ... in work ... lot to do ...",
something like that.
I'll wait with my own tests until I get something that you tell me you think is really good.

Paul Elliott

Jon,

Did that fix your problem? It should have ( it did in my tests of the code you posted ).

Just curious. I don't get on the internet a lot lately.

Paul


Paul Elliott

Larry,

I spent several hours looking into seeing how hard it would be to handle all those split
keyword pairs and I've got a big headache.

I think the only way to do that is to do everything via the token method and have a lot
of flags to specify what part of a PB statement has been done & which part is being worked
on. But I may be wrong. If you can figure out a way of doing it ( either in the current version
or the original version or in your own version ) then please let us know.

I also did some work on the Remark handling. Seems that it is not necessary to have a colon
before a REM after a PB statement.

Had never seen a continuation char as the first & only thing on a line. Thought that it needed
a space before it.

Still looking at having the full sub/function code on a single line and the possibility of having
multiple sets of them on 1 line, and the possibility of having that AND having the last one
ending on subsequent lines. This not only affects the formatting program but also the
flowcharting program I'm working on. That last program then gets complicated because it
flags each line as the start or end of a sub/function. Things were much simpler before you
threw those quirks into the mix.

Thanks a lot (  I think? ).

Paul

Larry Charlton

:) I wouldn't sweat it too much.

I did solve it with the tokenizer I wrote.  What I did was implement a "get next statement token" call.  Essentially it skips over _ to the next line until it hits a colon, comment, or CR/LF.  When it hits the end of a statement it returns false.  Calling it again returns the next token until there are no more tokens.  The source code is sitting in PbSlice2 if you want to look at it; it's public domain if anything in there would help you.  It's a bit of a different approach though, and all based off interfaces.  No reason it couldn't be functionized though.  The primary files are in src\support, called Indent.inc and TokenizeLine.inc.  The code still feels a bit clunky so I might take another crack at it.  This is a rewrite of the same routine from PbSlice, which I also published for reference.  Same idea, bit different logic.
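The idea of walking a source file statement by statement rather than line by line can be sketched roughly as follows. This is not the PbSlice2 code: it is a small Python illustration with a deliberately simplified token pattern, and the names `TOKEN_RE`, `tokenize_line` and `statements` are made up for the example. It assumes a lone `_` token continues the statement, that `'` starts a comment (REM is not handled), and that `:` separates statements.

```python
import re

# Hypothetical, simplified token pattern: string literal, comment, colon,
# a single punctuation character, or a run of other non-space characters.
TOKEN_RE = re.compile(r'''"[^"]*"|'.*|:|[,()=]|[^\s:"',()=]+''')


def tokenize_line(line):
    """Split one physical line into tokens (strings and comments stay whole)."""
    return TOKEN_RE.findall(line)


def statements(lines):
    """Yield one list of tokens per logical statement.

    A statement ends at a colon, at a comment, or at the end of a line
    that is not continued with a lone underscore.
    """
    current = []
    for line in lines:
        continued = False
        for tok in tokenize_line(line):
            if tok.startswith("'"):
                break                  # comment: ignore the rest of the line
            if tok == "_":
                continued = True       # statement carries on to the next line
                break                  # anything after the underscore is skipped
            if tok == ":":
                if current:
                    yield current      # colon closes the current statement
                current = []
            else:
                current.append(tok)
        if not continued and current:
            yield current              # end of line closes the statement
            current = []
    if current:
        yield current                  # file ended inside a continuation


source = [
    "LOCAL a AS LONG, _",
    "      b AS LONG : x = 1  ' set x",
    "y = 2",
]
for stmt in statements(source):
    print(stmt)
```

Instead of returning false at the end of each statement, the sketch yields a complete token list per statement; the caller never sees where the physical line breaks were, which is the point of the approach.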

FWIW I started out looking at statements on lines also, until I came to the conclusion you have to look at source as a stream of tokens.  I did cheat a bit for my indenter and just ignored includes, which causes a few quirks.  Once you realize it's impossible to "correctly" indent every file in a stream, it seems like a reasonable compromise to me.  It also allows indenting a single file and, more importantly, allows multiple threads to indent different source files if needed.  For me, if I don't have keystroke response to everything regardless of size, it needs work...  Clearly I'll have to give at some point.

Paul Elliott

Larry,

How do you format all the lines in-between the current token ( beginning of a keyword pair ) and  the line with the ending
token of the keyword pair?  without combining the continued lines which would alter the original structure.

the current problem is because the program is in the middle of a structure/function and needs to know when it ends so that
it can go back to normal formatting.

still need more work on REM handling. forgot to handle being enclosed in double-quotes.
but I don't get much internet time any more.

speaking of PBSlice2, should it be kicking out all those messages about <1> is not <1> ( or something like that ... I'm not
at the computer that I ran it on ) ?

for this formatter, tokenizing the whole file would be possible as it now reads the whole file into an array and could output
to another array for further processing before finally outputting to a file.  it might be possible to completely tokenize the
whole file at once ( adding cr/lf markers ). but then you must keep all the continued lines as continued lines ( or at least
have that as an option ) as that is what the original programmer wanted. but then you need to keep track of where in the
token array that lines start so that block structures can be formatted ( lining up parts based on the maximum size of names within
the structure ).
but I've done a few specialized routines that work much easier when dealing with the whole line.  and the original formatting
code only dealt with reading 1 line at a time.


Peter Weis

Hello Paul,

you won't get around reading whole blocks, rather than just one line, and then formatting them, which I have already made a first attempt at!

Regards, Peter

[Translation/Theo]

Paul, Peter assumes that it may be needed to work on blocks instead of on lines.
He assumes that you will have to read the whole block instead of a single line.

(I just translate, ...)


Larry Charlton

Probably the same way everyone else does.  I just have an indent level.  Whenever I come across a line that changes the indent level I either adjust the indent level before or after I output the line.  For starting blocks I indent after; for ending blocks I outdent before.  You can search for left character _, space_, and tab_; the only downside was that finding one embedded in a string or comment was a bit difficult.  Luckily I already had a tokenizer for other reasons that did away with the need to worry about it.
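A minimal sketch of that indent-level scheme, written in Python purely for illustration. The keyword sets and the `indent_lines` name are hypothetical and far from a complete PowerBASIC list; the sketch assumes only block-form IF (a single-line `IF ... THEN statement` would wrongly open a block) and ignores ELSE, CASE and continuation lines.

```python
# Hypothetical keyword sets; a real formatter would need many more entries.
OPENERS = {"FUNCTION", "SUB", "IF", "FOR", "SELECT", "TYPE"}
CLOSERS = {"END", "NEXT"}


def indent_lines(lines, tab="    "):
    """Re-indent lines using a single running indent level."""
    level = 0
    out = []
    for line in lines:
        text = line.strip()
        first = text.split(None, 1)[0].upper() if text else ""
        if first in CLOSERS:
            level = max(level - 1, 0)   # ending blocks: outdent BEFORE output
        out.append(tab * level + text if text else "")
        if first in OPENERS:
            level += 1                  # starting blocks: indent AFTER output
    return out


source = [
    "FUNCTION Foo() AS LONG",
    "IF x THEN",
    "y = 1",
    "END IF",
    "END FUNCTION",
]
print("\n".join(indent_lines(source)))
```

Because only the first word of each line is inspected, the sketch depends on every block keyword starting its own line, which is exactly the restriction the split keyword pairs discussed in this thread break.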

It probably will.  Most of the messages are due to the file paths changing (though I did find 3 bugs so far based on them).  I may work on that to eliminate it, but I'm also working on a nano sized unit testing framework that should help fix all that.

What I did was a blend.  I realized that one use of the tokenizer might be to create coloring for tokens in a viewer/editor.  The viewer/editor couldn't display all the lines at once, so I probably didn't want to tokenize the whole file.  Also, adjusting token positions and stuff could get tedious if I ever edited.

What I ended up doing was splitting a file into an array of lines.  The tokenizer only works on a single line.  i.e. it splits all the tokens in a line and lets me loop through them rapidly.  Gives me benefits of rapidly inserting new lines in large files, changing text in any way on one or more lines rapidly and gives me quick access to tokens for a line when I need them.  Now I can work with files, lines, or tokens, whichever is best for a problem.

Paul Elliott

Larry,

the current version does that ( sorta ). it loads the file into an array and for the most part runs top to bottom, splitting the line
up into tokens & rebuilding the line for output. indenting works much as you describe. 

the parts that I wrote to format certain structures do loop thru from a given starting point ( the beginning keyword of the
structure ) and look for the ending keyword.  the problem is that it currently needs the ending keyword pair ( ie End Type or
End Enum or End whatever ) to be on a single line. your quirks file shows that the keyword pair can be separated by many
lines and if there are options following them those can also be many lines removed.

my first thought is to loop thru the whole array and build a 2nd array of just the combined lines from the ones that are continued.
would also need to keep track of which lines were used to create the combined line.
my question is how would you like the continued lines to be indented relative to the 1st line? would just a simple 1 tab stop
indent be okay?
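The "2nd array of combined lines" idea, including the record of which source lines went into each combined line, might look something like this. It is a Python sketch only, not code from the formatter; `combine_continued` is a made-up name. It assumes the continuation `_` is the last character on the line, either alone or preceded by a space or tab, and it does not check whether the `_` sits inside a string or comment.

```python
def combine_continued(lines):
    """Merge continued lines into logical lines.

    Returns (combined, origins), where origins[i] lists the indexes of the
    source lines that were merged into combined[i].
    """
    combined, origins = [], []
    parts, used = [], []
    for idx, line in enumerate(lines):
        text = line.rstrip()
        used.append(idx)
        if text == "_" or text.endswith((" _", "\t_")):
            parts.append(text[:-1].strip())      # drop the continuation char
        else:
            parts.append(text.strip())
            combined.append(" ".join(p for p in parts if p))
            origins.append(used)
            parts, used = [], []
    if used:                                     # file ended mid-continuation
        combined.append(" ".join(p for p in parts if p))
        origins.append(used)
    return combined, origins


source = ["TYPE Foo", "  a AS LONG", "END _", "  _", "  TYPE", "x = 1"]
combined, origins = combine_continued(source)
print(combined)
print(origins)
```

With the keyword pair rejoined as `END TYPE` in the combined array, the existing single-line search for the ending keyword can run unchanged, and `origins` says which original lines to re-indent and write back without altering how they were split.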

I realize that this probably wouldn't show up very often ( except in your code .. you seem to come up with lots of new styles
that I've never seen or considered ... I'm NOT complaining ).
did you ever run your GDI Plus code thru this program to see if I handled your array loading to your liking? or the Class code?


Larry Charlton

That's what I did.  A continuation added 1 indent to the next line.  End of continuation subtracted one after.
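As a rough Python illustration of that rule (again a sketch, not the PbSlice2 source): `indent_continuations` is a hypothetical helper that treats any line ending in `_` as continued, which would misfire on a line whose last token is an identifier ending in an underscore or on a `_` followed by a comment.

```python
def indent_continuations(lines, base_level=0, tab="    "):
    """Indent the lines that follow a continuation one extra level."""
    out = []
    extra = 0                      # 1 while inside a continued statement
    for line in lines:
        text = line.strip()
        out.append(tab * (base_level + extra) + text)
        # A trailing underscore adds one indent to the NEXT line; a line
        # without one drops the extra indent again after it is output.
        extra = 1 if text.endswith("_") else 0
    return out


source = [
    "LOCAL a AS LONG, _",
    "b AS LONG, _",
    "c AS LONG",
    "x = 1",
]
print("\n".join(indent_continuations(source)))
```

The physical line breaks are left exactly where the programmer put them; only the leading whitespace changes.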