Tuesday, 19 September, 2023 UTC


Summary

react-native-voicebox-speech-rec
React Native Voicebox Speech Recognition: A powerful speech recognition library for React Native applications, enabling real-time speech-to-text transcription.
  • 📱 Cross-platform: Consistent and reliable performance on both iOS and Android.
  • 🌐 Driven by the SpeechRecognition interface from the Web Speech API.
Versions
1.0.0
iOS support
Android support
Sample App
Features
  • Real-time Transcription: Convert speech to text in real time as the user speaks.
  • Custom Handlers: Easily set up callback handlers for start, end, and error events during speech recognition.
  • Language Flexibility: Set a specific language for recognition or default to the user’s device system language.
  • No Cloud Dependency: Leverage the native speech recognition engines on iOS and Android without relying on external cloud services.
  • Universal Language Support: Compatible with most languages.
Installation
npm install react-native-voicebox-speech-rec --save
or
yarn add react-native-voicebox-speech-rec
Then link the iOS package by running:
cd ios && pod install && cd ..
or
npx pod-install
Make sure AndroidX is enabled in your React Native project: edit android/gradle.properties and add the following two lines if they are not already present:
android.useAndroidX=true
android.enableJetifier=true
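As a rough sketch, the check above could be automated with a small helper that reads the text of android/gradle.properties and reports which of the two flags still need to be added. The function name missingAndroidXFlags is hypothetical and not part of this library, and the parser only handles simple key=value lines.

```typescript
// Hypothetical helper, not part of react-native-voicebox-speech-rec.
// The two flags this README asks for in android/gradle.properties.
const REQUIRED_FLAGS: Record<string, string> = {
  'android.useAndroidX': 'true',
  'android.enableJetifier': 'true',
};

// Returns the "key=value" lines that are missing or set to another value.
// Simplification: only "key=value" lines are parsed; other separators
// allowed by the .properties format (":" or whitespace) are ignored.
function missingAndroidXFlags(gradleProperties: string): string[] {
  const found: Record<string, string> = {};
  for (const rawLine of gradleProperties.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments ("#" or "!" starts a comment).
    if (line === '' || line.charAt(0) === '#' || line.charAt(0) === '!') continue;
    const separator = line.indexOf('=');
    if (separator === -1) continue;
    found[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return Object.keys(REQUIRED_FLAGS)
    .filter((key) => found[key] !== REQUIRED_FLAGS[key])
    .map((key) => `${key}=${REQUIRED_FLAGS[key]}`);
}
```

In a Node script, the file contents could be read with fs.readFileSync and passed to this function; an empty result means both flags are already set.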
Linking
Please use React Native version >=0.60. React Native will automatically link the library to your project.
Development Notes
📱 iOS Simulator Limitation: The simulator simulates voice input with repeated “Test” phrases and does not use the actual microphone input. For genuine speech recognition testing, use a physical iPhone.
🤖 Android Emulator: No specific issues noted. Just remember to enable host audio input for the emulator every time you start it.
Usage
Just two steps:
  • In App.jsx or App.tsx in your React Native project, wrap your components with SpeechRecognitionRootView from react-native-voicebox-speech-rec:
import React from 'react';
import { PaperProvider } from 'react-native-paper';
import { ConversationPage } from './conversation/ConversationPage';
import { SpeechRecognitionRootView } from 'react-native-voicebox-speech-rec';
import { SafeAreaProvider } from 'react-native-safe-area-context';

function App(): JSX.Element {
  return (
    <SafeAreaProvider>
      <PaperProvider>
        <SpeechRecognitionRootView>
          <ConversationPage />
        </SpeechRecognitionRootView>
      </PaperProvider>
    </SafeAreaProvider>
  );
}
export default App;
  • In the file where you want to use speech recognition (e.g. ConversationPage in the code example above), call the useSpeechRecognition hook to get the APIs:
import React, { useCallback, useEffect, useMemo } from 'react';
import { ScrollView, Alert } from 'react-native';
import { Text } from 'react-native-paper';
import { SafeAreaView } from 'react-native-safe-area-context';
import { MicrophoneButton } from './MicrophoneButton';
import { useSpeechRecognition } from 'react-native-voicebox-speech-rec';

export const ConversationPage = React.memo(() => {
  /** ***************************************************************
   * Speech Recognition APIs from useSpeechRecognition hook
   *************************************************************** */
  const {
    startSpeechRecognition,
    stopSpeechRecognition,
    speechContentRealTime,
    setSpeechRecErrorHandler,
    setSpeechRecStartedHandler,
    setSpeechRecCompletedHandler,
  } = useSpeechRecognition();

  // Event handler when speech recognition starts
  useEffect(() => {
    setSpeechRecStartedHandler(() => {
      console.log('👆 Speech Recognition Started!');
    });
  }, [setSpeechRecStartedHandler]);

  // Event handler when speech recognition has errors
  useEffect(() => {
    setSpeechRecErrorHandler((errorMessage: any) => {
      Alert.alert(
        'Error in speech recognition',
        String(errorMessage),
        [
          {
            text: 'OK',
            style: 'cancel',
          },
        ],
        { cancelable: false }
      );
    });
  }, [setSpeechRecErrorHandler]);

  // Event handler when speech recognition ends
  useEffect(() => {
    setSpeechRecCompletedHandler(async (userChatMessage: string) => {
      if (userChatMessage.length > 0) {
        console.log('🎉 Speech Recognition Completed. Recognized Content: ', userChatMessage);
      } else {
        console.log('🎉 Speech Recognition Completed. User spoke nothing. ');
      }
    });
  }, [setSpeechRecCompletedHandler]);

  /** **********************************************************************
   * Start speech recognition when user presses the microphone button
   ********************************************************************** */
  const handleConversationButtonPressed = useCallback(async () => {
    startSpeechRecognition();
  }, [startSpeechRecognition]);

  /** **********************************************************************
   * End speech recognition when user releases the microphone button
   ********************************************************************** */
  const handleConversationButtonReleased = useCallback(() => {
    stopSpeechRecognition();
  }, [stopSpeechRecognition]);

  const speechRecContentArea = useMemo(() => {
    return <Text variant="titleLarge">{speechContentRealTime}</Text>;
  }, [speechContentRealTime]);

  return (
    <SafeAreaView>
      {/* Show realtime speech recognition content in this area */}
      <ScrollView>{speechRecContentArea}</ScrollView>

      <MicrophoneButton
        disabled={false}
        handleButtonPressed={handleConversationButtonPressed}
        handleButtonReleased={handleConversationButtonReleased}
      />
    </SafeAreaView>
  );
});
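The press and release wiring above could also be factored into a small controller that is independent of React. The sketch below is one possible shape, not the library's own code: createPushToTalk, minHoldMs and the SpeechControls type are hypothetical names, and the type is inferred from the API table in this README rather than from the library's published typings. It also assumes that cancelSpeechRecognition (listed in the API section) is the right call for discarding an accidental tap, which this README does not spell out.

```typescript
// Subset of the hook's return value, typed from this README's API table
// (an assumption; the library's own typings may differ).
type SpeechControls = {
  startSpeechRecognition: (localeTag?: string) => void;
  stopSpeechRecognition: () => void;
  cancelSpeechRecognition: () => void;
};

// Hypothetical helper: a press shorter than minHoldMs is treated as
// accidental and cancelled; longer presses are stopped normally.
// `now` is injectable so the timing logic can be tested without waiting.
function createPushToTalk(
  controls: SpeechControls,
  minHoldMs: number = 300,
  now: () => number = Date.now,
) {
  let pressedAt: number | null = null;
  return {
    onPressIn(): void {
      pressedAt = now();
      controls.startSpeechRecognition();
    },
    onPressOut(): void {
      const startedAt = pressedAt;
      if (startedAt === null) return; // released without a matching press
      pressedAt = null;
      if (now() - startedAt < minHoldMs) {
        controls.cancelSpeechRecognition();
      } else {
        controls.stopSpeechRecognition();
      }
    },
  };
}
```

Inside ConversationPage, the object returned by useSpeechRecognition could be passed as controls, with onPressIn and onPressOut handed to the microphone button in place of the two callbacks shown earlier.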
Example App
For complete and more advanced usage, see the example React Native app in the example folder.
API
Here are all the APIs provided by the useSpeechRecognition hook.
| API | Description | Platform |
| --- | --- | --- |
| startSpeechRecognition(localeTag) | Starts listening for speech in a specific locale, given as an IETF BCP 47 tag (e.g., en-US, en-GB, zh-CN). If no localeTag is given, the system locale of the user’s device is used. | iOS, Android |
| stopSpeechRecognition | Stops listening for speech. | iOS, Android |
| cancelSpeechRecognition | Cancels listening for speech. | iOS, Android |
| speechContentRealTime | A string containing the text recognized so far while the user is speaking. Use it to display the recognized text in real time. | iOS, Android |
| setSpeechRecStartedHandler | Sets a callback handler that is called when speech recognition starts without error. The handler signature is async () => void. | iOS, Android |
| setSpeechRecCompletedHandler | Sets a callback handler that is called when speech recognition completes without error. The handler signature is async (speechRecResult: string) => void, where speechRecResult is the text recognized from the user’s speech. | iOS, Android |
| setSpeechRecErrorHandler | Sets a callback handler that is called when speech recognition encounters an error. The handler signature is async (errorMessage: string) => void, where errorMessage explains why the error happened. | iOS, Android |
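Because startSpeechRecognition expects a BCP 47 style tag such as en-US, it may help to tidy user-supplied or stored locale strings before passing them in. The helper below is a minimal sketch under stated assumptions: normalizeLocaleTag is a hypothetical name, not part of this library, and it recognizes only the common language, language-REGION and language-Script-REGION shapes rather than the full BCP 47 grammar.

```typescript
// Hypothetical helper, not part of react-native-voicebox-speech-rec.
// Matches a simplified subset of BCP 47: language[-Script][-REGION].
const SIMPLE_TAG = /^([a-z]{2,3})(?:-([a-z]{4}))?(?:-([a-z]{2}|\d{3}))?$/i;

// Returns a tag with conventional casing (e.g. "en_us" becomes "en-US"),
// or undefined when the input is empty or not in the supported subset.
function normalizeLocaleTag(input?: string | null): string | undefined {
  if (!input) return undefined;
  // Accept underscore-separated forms such as "en_US" as well.
  const match = SIMPLE_TAG.exec(input.trim().replace(/_/g, '-'));
  if (!match) return undefined;
  const language = match[1];
  const script = match[2];
  const region = match[3];
  const parts = [language.toLowerCase()];
  if (script) {
    parts.push(script.charAt(0).toUpperCase() + script.slice(1).toLowerCase());
  }
  if (region) {
    parts.push(region.toUpperCase());
  }
  return parts.join('-');
}
```

A call such as startSpeechRecognition(normalizeLocaleTag(savedLocale)) would then pass either a tidy tag or undefined. This assumes that passing undefined behaves the same as omitting localeTag, in which case the table above says the device's system locale is used.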
License
MIT
Packages used
Many thanks to the following libraries, which this project depends on:
  • react-native-localize
  • react-native-webview
  • react-native-volume-manager
Enjoy using this library!