SAPI Speech Recognition

Started by John Thompson, January 06, 2009, 01:04:15 PM

John Thompson

Has anyone had any luck using SAPI Speech Recognition in PB/Win 9?  I was able to get it working in PB/Win 8, but have been unable to convert it to PB/Win 9.  I need the SAPI SpSharedRecoContext, and tried to access it in a way similar to the SAPI Text To Speech method available on this forum, but have been unsuccessful.  I can provide a more detailed example in PB/Win 8 showing it working, but I get stuck at this very first step (SpSharedRecoContext).  Any ideas?  Thanks!  -John


#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPI.INC"

FUNCTION PBMAIN () AS LONG

    LOCAL pISpVoice AS ISpVoice
    pISpVoice = NEWCOM CLSID $CLSID_SpVoice
    IF ISNOTHING(pISpVoice) THEN MSGBOX "Oops",,"ISpVoice" : EXIT FUNCTION
    pISpVoice = NOTHING

    LOCAL pISpRecoContext AS ISpRecoContext
    pISpRecoContext = NEWCOM CLSID $CLSID_SpSharedRecoContext
    IF ISNOTHING(pISpRecoContext) THEN MSGBOX "Oops",,"ISpRecoContext" : EXIT FUNCTION
    pISpRecoContext = NOTHING

END FUNCTION

José Roca

#1
 
Using the CLSID means that you need to have a particular version installed (5.1 if you are using my include files). If you are using another version, you need to use the appropriate CLSID.

Instead, you can use the version independent ProgID, i.e.


pISpRecoContext = NEWCOM "SAPI.SpSharedRecoContext"


I have tested your code on my computer and it works fine.

José Roca

 
If you have more questions, I need to know which version of SAPI you're using and which headers you're using: my WINAPI headers, or the headers generated by the PB COM Browser or my COM Browser.

Apparently, the version of SAPI that you're using is 5.0:
http://www.powerbasic.com/support/pbforums/showthread.php?t=38648&highlight=sapi

John Thompson

Thanks for your help!  That fixed that problem.  I keep running into them, though, with PB/Win 9.  Is there a section I'm missing of the PB help file that explains this stuff???


    DIM pISpRecoContext     AS LOCAL    ISpRecoContext
    DIM pISpRecoGrammar     AS LOCAL    ISpRecoGrammar
    pISpRecoContext = NEWCOM "SAPI.SpSharedRecoContext"
    IF ISNOTHING(pISpRecoContext) THEN MSGBOX "Oops",,"ISpRecoContext" : EXIT SUB
    lRC = pISpRecoContext.CreateGrammar(1,pISpRecoGrammar)
    MSGBOX FORMAT$(lRC)
    IF ISNOTHING(pISpRecoGrammar) THEN MSGBOX "Oops",,"ISpRecoGrammar" : EXIT SUB


The above code compiles fine and the fifth line (the CreateGrammar call) returns S_OK, but pISpRecoGrammar cannot be used for anything...  The instructions are below, but are in C++.


'ppGrammar
'[out] Address of a pointer which receives the ISpRecoGrammar object. The application must call IUnknown::Release on the object when finished using it.

'ISpRecoContext::CreateGrammar
HRESULT CreateGrammar(
   ULONGLONG          ullGrammarId,
   ISpRecoGrammar   **ppGrammar
);


I was using 5.0 in the past, but am currently using SAPI SDK 5.1 and your newest Win32API* #includes.

Thanks for your help!

Everything COM is working as expected for MS MapPoint and other apps... SAPI keeps kicking my butt, though.

José Roca

#4
 
SAPI has two sets of interfaces: low-level "Sp" interfaces, intended for C programmers, and dual "Speech" interfaces. I advise you to use the "Speech" interfaces with PB.


#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPI.INC"

FUNCTION PBMAIN () AS LONG

   LOCAL pISpeechRecoContext AS ISpeechRecoContext
   LOCAL pISpeechRecoGrammar AS ISpeechRecoGrammar
   pISpeechRecoContext = NEWCOM "SAPI.SpSharedRecoContext"
   IF ISNOTHING(pISpeechRecoContext) THEN MSGBOX "Oops",,"ISpeechRecoContext" : EXIT FUNCTION

   pISpeechRecoGrammar = pISpeechRecoContext.CreateGrammar(1)
   IF ISNOTHING(pISpeechRecoGrammar) THEN MSGBOX "Oops",,"ISpeechRecoGrammar" : EXIT FUNCTION

END FUNCTION


Using the above code you will get a reference to the ISpeechRecoGrammar interface, instead of ISpRecoGrammar.


' ########################################################################################
' Interface name = ISpeechRecoGrammar
' IID = {B6D6F79F-2158-4E50-B5BC-9A9CCD852A09}
' Attributes = 4160 [&H1040] [Dual] [Dispatchable]
' Inherited interface = IDispatch
' ########################################################################################

#IF NOT %DEF(%ISpeechRecoGrammar_INTERFACE_DEFINED)
    %ISpeechRecoGrammar_INTERFACE_DEFINED = 1

INTERFACE ISpeechRecoGrammar $IID_ISpeechRecoGrammar

   INHERIT IDispatch

   ' =====================================================================================
   PROPERTY GET Id <1> ( _                              ' VTable offset = 28
   ) AS VARIANT                                         ' __retval_out VARIANT* Id
   ' =====================================================================================
   PROPERTY GET RecoContext <2> ( _                     ' VTable offset = 32
   ) AS ISpeechRecoContext                              ' __retval_out ISpeechRecoContext** RecoContext
   ' =====================================================================================
   PROPERTY SET State <3> ( _                           ' VTable offset = 36
     BYVAL LONG _                                       ' __in SpeechGrammarState State
   )                                                    ' void
   ' =====================================================================================
   PROPERTY GET State <3> ( _                           ' VTable offset = 40
   ) AS LONG                                            ' __retval_out SpeechGrammarState* State
   ' =====================================================================================
   PROPERTY GET Rules <4> ( _                           ' VTable offset = 44
   ) AS ISpeechGrammarRules                             ' __retval_out ISpeechGrammarRules** Rules
   ' =====================================================================================
   METHOD Reset <5> ( _                                 ' VTable offset = 48
     OPTIONAL BYVAL LONG _                              ' __opt_in long NewLanguage = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdLoadFromFile <7> ( _                       ' VTable offset = 52
     BYVAL STRING _                                     ' __in BSTR FileName
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdLoadFromObject <8> ( _                     ' VTable offset = 56
     BYVAL STRING _                                     ' __in BSTR ClassId
   , BYVAL STRING _                                     ' __in BSTR GrammarName
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdLoadFromResource <9> ( _                   ' VTable offset = 60
     BYVAL LONG _                                       ' __in long hModule
   , BYVAL VARIANT _                                    ' __in VARIANT ResourceName
   , BYVAL VARIANT _                                    ' __in VARIANT ResourceType
   , BYVAL LONG _                                       ' __in long LanguageId
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdLoadFromMemory <10> ( _                    ' VTable offset = 64
     BYVAL VARIANT _                                    ' __in VARIANT GrammarData
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdLoadFromProprietaryGrammar <11> ( _        ' VTable offset = 68
     BYVAL STRING _                                     ' __in BSTR ProprietaryGuid
   , BYVAL STRING _                                     ' __in BSTR ProprietaryString
   , BYVAL VARIANT _                                    ' __in VARIANT ProprietaryData
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdSetRuleState <12> ( _                      ' VTable offset = 72
     BYVAL STRING _                                     ' __in BSTR Name
   , BYVAL LONG _                                       ' __in SpeechRuleState State
   )                                                    ' void
   ' =====================================================================================
   METHOD CmdSetRuleIdState <13> ( _                    ' VTable offset = 76
     BYVAL LONG _                                       ' __in long RuleId
   , BYVAL LONG _                                       ' __in SpeechRuleState State
   )                                                    ' void
   ' =====================================================================================
   METHOD DictationLoad <14> ( _                        ' VTable offset = 80
     OPTIONAL BYVAL STRING _                            ' __opt_in BSTR TopicName = L""
   , OPTIONAL BYVAL LONG _                              ' __opt_in SpeechLoadOption LoadOption = 0
   )                                                    ' void
   ' =====================================================================================
   METHOD DictationUnload <15> ( _                      ' VTable offset = 84
   )                                                    ' void
   ' =====================================================================================
   METHOD DictationSetState <16> ( _                    ' VTable offset = 88
     BYVAL LONG _                                       ' __in SpeechRuleState State
   )                                                    ' void
   ' =====================================================================================
   METHOD SetWordSequenceData <17> ( _                  ' VTable offset = 92
     BYVAL STRING _                                     ' __in BSTR Text
   , BYVAL LONG _                                       ' __in long TextLength
   , BYVAL ISpeechTextSelectionInformation _            ' __in ISpeechTextSelectionInformation* Info
   )                                                    ' void
   ' =====================================================================================
   METHOD SetTextSelection <18> ( _                     ' VTable offset = 96
     BYVAL ISpeechTextSelectionInformation _            ' __in ISpeechTextSelectionInformation* Info
   )                                                    ' void
   ' =====================================================================================
   METHOD IsPronounceable <19> ( _                      ' VTable offset = 100
     BYVAL STRING _                                     ' __in BSTR Word
   ) AS LONG                                            ' __retval_out SpeechWordPronounceable* WordPronounceable
   ' =====================================================================================

END INTERFACE

#ENDIF   ' /* __ISpeechRecoGrammar_INTERFACE_DEFINED__ */


Now, the only problem that you can have is when you have to pass a string. If the parameter is BYVAL VARIANT, you must pass an ANSI string, because PB does the Unicode conversion when you assign a string to a variant; but if the parameter is a BSTR (BYVAL STRING), then you have to use UCODE$(<string>).

Using the dual interfaces, error codes are returned by OBJRESULT.
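
To make the last two points concrete, here is a minimal, untested sketch that loads a command grammar from a file through the dual interface. It assumes the same SAPI.INC headers as above; the file name "mygrammar.xml" is hypothetical, and the load option 0 is simply the default shown in the interface listing. Because FileName is declared BYVAL STRING (a BSTR), the string is converted with UCODE$, and OBJRESULT is read immediately after the call to see whether it succeeded.

```
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPI.INC"

FUNCTION PBMAIN () AS LONG

   LOCAL pISpeechRecoContext AS ISpeechRecoContext
   LOCAL pISpeechRecoGrammar AS ISpeechRecoGrammar
   LOCAL hr AS LONG

   pISpeechRecoContext = NEWCOM "SAPI.SpSharedRecoContext"
   IF ISNOTHING(pISpeechRecoContext) THEN MSGBOX "Oops",,"ISpeechRecoContext" : EXIT FUNCTION

   pISpeechRecoGrammar = pISpeechRecoContext.CreateGrammar(1)
   IF ISNOTHING(pISpeechRecoGrammar) THEN MSGBOX "Oops",,"ISpeechRecoGrammar" : EXIT FUNCTION

   ' FileName is a BSTR parameter (BYVAL STRING), so convert it with UCODE$.
   ' "mygrammar.xml" is a hypothetical file name used only for illustration.
   pISpeechRecoGrammar.CmdLoadFromFile(UCODE$("mygrammar.xml"), 0)

   ' Save the result right away, before any other object call is made.
   hr = OBJRESULT
   IF hr <> 0 THEN   ' 0 = S_OK
      MSGBOX "CmdLoadFromFile failed: &H" & HEX$(hr, 8)
      EXIT FUNCTION
   END IF

END FUNCTION
```

Copying OBJRESULT into a local variable first is a precaution, on the assumption that a later method or property call would replace its value.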