'===========================================================================
' Subject: CUSTOM TOKENIZED STRINGS          Date: 03-14-98 (10:14)
' Author:  Brian McLaughlin                  Code: PB
' Origin:  alt.lang.powerbasic             Packet: PB.ABC
'===========================================================================
'Posted below is code to break a string into tokens, based
'on whatever delimiter(s) you select. I am releasing it to
'the public domain.

$LIB ALL OFF

DECLARE FUNCTION Tokenize$ (Source$, Delimiter$, Index%)
DECLARE FUNCTION GETSTRLOC& (BYVAL Handle%)    '<-- REQUIRED!
DECLARE FUNCTION GETSTRALLOC& (BYVAL Bytes%)   '<-- REQUIRED!

'Demo code ----------------------------------------
CLS
Source$ = "C:\BIN\BOX\LID\HINGE\FILENAME.EXE"
Delim$ = "\:."
Index% = 0
PRINT Source$
DO
  Tok$ = Tokenize$(Source$, Delim$, Index%)
  PRINT Tok$ + " ";     '<-- notice semicolon
  PRINT Index%
LOOP WHILE LEN(Tok$)
END
'----------------------------------------------

'=============================================================================
FUNCTION Tokenize$ (Source$, Delim$, Index%)
'=============================================================================
DIM Count      AS LOCAL INTEGER  'keeps count of chars between delimiters
DIM FirstChar  AS LOCAL INTEGER  'offset of first tested char during this call
DIM LocalIndex AS LOCAL INTEGER  'an index we can manipulate freely
DIM Returned   AS LOCAL STRING

Returned = ""
Count = 0
LocalIndex = Index%
FirstChar = 0

ASM Push DS
ASM Push SI
ASM Push DI
ASM Les  BX, Index%
ASM Mov  SI, ES:[BX]        ; SI = Index% value
ASM Les  BX, Delim$         ; ES:BX = address of Delim$ handle
ASM Mov  AX, ES:[BX]        ; AX = Delim$ handle
ASM Push AX
ASM Call GETSTRLOC          ; puts Delim$ address in DX:AX, length in CX
ASM Push DX                 ; save Delim$ segment on stack
ASM Push AX                 ; save offset of Delim$ first byte on stack
ASM Push CX                 ; save CX (length of Delim$) on stack
ASM Les  BX, Source$        ; ES:BX = address of Source$ handle
ASM Mov  AX, ES:[BX]        ; AX = handle of Source$
ASM Mov  BX, AX             ; make a copy of AX in BX
ASM Push AX
ASM Call GETSTRLOC          ; puts Source$ address in DX:AX, length in CX
ASM Jcxz SourceNull         ; if Source$ is a null string, return it as is
ASM Mov  ES, DX             ; ES = segment of Source$
ASM Mov  DX, CX             ; DX = length of Source$
ASM Pop  CX                 ; CX = length of Delim$
ASM Jcxz DelimNull          ; if Delim$ is a null string, return Source$
ASM Pop  DI                 ; DI = offset of Delim$
ASM Pop  DS                 ; DS:DI point to Delim$
ASM Mov  BX, SI             ; save SI before we trash it
ASM Mov  SI, AX             ; ES:SI = address of first byte of Source
ASM Mov  AX, BX             ; AX = last tested byte of Source
ASM Sub  DX, AX             ; DX = length of untested Source
ASM Add  SI, AX             ; SI = address of first untested byte of Source
ASM Mov  FirstChar, SI      ; save offset of first untested char
ASM Xor  AX, AX             ; zero in AX
ASM Push DI                 ; save address of start of Delim on stack
ASM Push CX                 ; save the length of Delim on stack

GetSourceChar:
ASM Or   DX, DX             ; are there any untested chars in Source?
ASM Jz   TokenCompleted     ; if not, prepare to exit
ASM Mov  AL, ES:[SI]        ; AL = next untested char in Source

GetDelimChar:               ' target of Loop, below NotDelimChar label
ASM Mov  AH, [DI]           ; AH = next delimiter character in Delim
ASM Cmp  AL, AH             ; does the source char match the delim char?
ASM Jne  NotDelimChar       ; no match means this source char isn't a delim
ASM Cmp  Count, 0           ; the source char IS a delim, but does Count = 0?
ASM Jnz  TokenCompleted     ; if Count > 0, we've reached end of this token
ASM Inc  FirstChar          ; if Count = 0, inc address of first return char

SetForNextSourceChar:       ' otherwise, just skip ahead to next Source char
ASM Inc  LocalIndex         ; show there's one more tested char
ASM Dec  DX                 ; show there's one less untested char in Source
ASM Inc  SI                 ; point SI at next char in Source
ASM Pop  CX                 ; restore CX as a loop counter = length of Delim$
ASM Pop  DI                 ; restore DI to address of first byte of Delim
ASM Push DI                 ; save the same values back on the stack
ASM Push CX                 ; in the same order as before, for the next loop
ASM Jmp  GetSourceChar      ; and go fetch the next char in Source to test

NotDelimChar:
ASM Inc  DI                 ; point to next delim char
ASM Loop GetDelimChar       ; loop back to get next delim char
ASM Inc  Count              ; show token is one char longer
ASM Jmp  SHORT SetForNextSourceChar  ; prepare to get next char from Source

SourceNull:
ASM Add  SP, 2              ; adjust stack pointer before jumping to TokenExit
ASM Mov  LocalIndex, CX     ; and make certain Index = 0 for next call
ASM Mov  Returned, BX       ; return handle of Source$
ASM Jmp  SHORT TokenExit

DelimNull:                  ' if we're here Delim$ was null
ASM Mov  Returned, BX       ; just put the handle for Source$ in Returned
ASM Mov  LocalIndex, CX     ; and remember we've checked out the whole string
ASM Jmp  SHORT TokenExit

TokenCompleted:
ASM Cmp  Count, 0           ; are we returning a null string?
ASM Jne  SourceLeft         ; if not, we'll be back
ASM Mov  LocalIndex, 0      ; if so, set Index back to zero

SourceLeft:
ASM Mov  AX, Count
ASM Push AX
ASM Call GETSTRALLOC
ASM Mov  Returned, AX       ; save handle of allocated string
ASM Push SS
ASM Pop  DS                 ; reset DS to data seg, so we won't crash
ASM Push AX
ASM Call GETSTRLOC
ASM Jcxz TokenExit
ASM Push ES
ASM Pop  DS
ASM Mov  SI, FirstChar      ; DS:SI must point at source
ASM Mov  ES, DX
ASM Mov  DI, AX             ; ES:DI must point at destination
ASM Rep  Movsb              ; copy Count chars from Source to Returned

TokenExit:                  ' if we got here, the return handle is in Returned
ASM Add  SP, 4              ; adjust stack pointer: we left two values
ASM Pop  DI
ASM Pop  SI
ASM Pop  DS

Index% = LocalIndex
Tokenize$ = Returned
END FUNCTION
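
'---------------------------------------------------------------------------
'The routine below is NOT part of the original post. It is a minimal sketch,
'in plain BASIC, of the behavior the assembly version appears to implement,
'and may be handy as a cross-check when testing. TokenizeRef$, ScanPos and
'TokStart are hypothetical names chosen for illustration. Assumptions: Index%
'is a zero-based offset no larger than LEN(Source$), and MID$ returns a null
'string when asked to start past the end of Source$.
'---------------------------------------------------------------------------
FUNCTION TokenizeRef$ (Source$, Delim$, Index%)
DIM ScanPos  AS LOCAL INTEGER   'one-based position of the char being tested
DIM TokStart AS LOCAL INTEGER   'one-based position of the token's first char

IF LEN(Source$) = 0 OR LEN(Delim$) = 0 THEN
  'mirror the SourceNull and DelimNull paths: hand back Source$, reset Index%
  Index% = 0
  TokenizeRef$ = Source$
ELSE
  ScanPos = Index% + 1          'MID$ is one-based, Index% is zero-based

  'skip any delimiters that precede the token
  DO WHILE ScanPos <= LEN(Source$) AND INSTR(Delim$, MID$(Source$, ScanPos, 1)) > 0
    ScanPos = ScanPos + 1
  LOOP
  TokStart = ScanPos

  'collect chars until the next delimiter or the end of Source$
  DO WHILE ScanPos <= LEN(Source$) AND INSTR(Delim$, MID$(Source$, ScanPos, 1)) = 0
    ScanPos = ScanPos + 1
  LOOP

  IF ScanPos > TokStart THEN
    Index% = ScanPos - 1        'the closing delimiter is left untested
    TokenizeRef$ = MID$(Source$, TokStart, ScanPos - TokStart)
  ELSE
    Index% = 0                  'nothing left: return null, start over next time
    TokenizeRef$ = ""
  END IF
END IF
END FUNCTION

'To compare the two, one could add a DECLARE for TokenizeRef$ and call it in
'the demo loop with its own index variable; under the assumptions above it
'should return the same tokens and index values as Tokenize$.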