Offline voice control for pidog

Hi
Some weeks ago I posted a couple of videos on SunFounder’s Facebook page showing joystick and voice control of PiDog. I had several requests to supply the code but could not, as I had built it inside a ROS environment.

I now need voice control that works outside ROS and is wholly offline.

So, I’ve written a short (circa 100 lines) Python wrapper around SunFounder’s keyboard control.

I’ll add this to my start-on-boot setup so that I can use it outside the house standalone, without any external controller or computer. Easier for kids to play with, etc.

I’m happy to post the code here, but I’m not sure if there is still interest. I’m also not sure if I’m allowed to post unsolicited software here.

I’ll wait a few days and see if I get any bans!

Please note that you will need to adapt the code, venv and Python libraries to your own setup. So if you’re not comfortable with this, don’t use it! I don’t want to be responsible for breaking someone’s system.
And you will need a microphone!


I’m quite interested in what you did there, to be honest. I love to tinker with these kinds of things. I would say this kind of community sharing of ideas is one of the main purposes of a forum like this, I would presume. GitHub might also be a place where you can commit such a small thing, I suppose.

I like your efforts for the community.

Nice to hear from you again. I did look at GitHub for a release but, to be honest, I didn’t want to spend the time learning the system. It looked quite tricky/formal to learn and maintain, with little value to myself.

I’ve not seen anyone else posting their software here either. So I’m just nervously tiptoeing around until I learn the system/protocols. Don’t want to upset anyone!

Although my existing code supports all the SunFounder keyboard commands, I would really need to add a couple more for general offline use before posting, such as a clean voice-controlled shutdown. No rush if there is little interest!

No complaints… so my voice recognition code is copied below. I cannot provide any support or guarantees for this. If you’re not too familiar with Linux/Python/venv/apt/pip etc., don’t use it. All the instructions and edits required are in the code below as comments. It needs a USB microphone added for the voice capture. Mine has AGC, but it didn’t seem to work too well, so I turned it off and fixed the gain at maximum via ALSA.

#!/usr/bin/env python3

#NOTES......
#Anyone using this will need to edit some of the code in this section and/or install stuff
#Depending on what one already has available. I cannot provide support

#Recommend creating this file straight into the existing examples folder
#so as not to have to mess with python paths, with whatever filename you prefer.

# If voice recog is needed on boot, i.e. no external computer/network required,
# then run this program via systemd. The robot also recognises the
# "DOGNAME shut down now" command for a clean shutdown when away from
# another computer, so that the SD card is not corrupted by turning the
# power off "live". The dog will bark 3x to acknowledge this.
#


#Ensure the following libraries are installed via one's preferred method
# e.g. sudo apt install python3-WHATEVER
#OR
#     pip install WHATEVER inside a venv
#OR
#     etc

import queue
import sounddevice as sd
from vosk import Model, KaldiRecognizer
import sys
import json
import os


# MyKbCtrl is a soft link to SunFounder's 11_keyboard_control.py
# One of many ways to import a module with a numeric first character
# One needs to create this link manually, e.g.
# ln -s 11_keyboard_control.py MyKbCtrl.py
# or use another builtin method

from MyKbCtrl import *
from MyKbCtrl import run_operation

#download VOSK here: Many other sources are available
#https://alphacephei.com/vosk/models 
#I use the vosk-model-small-en-us-0.15

#Change to your own path location for above downloaded VOSK database
DBDIR=os.path.expanduser("~")+"/RoboDog/DataBases/" + "VoiceRecog/Models/vosk-model-small-en-us-0.15"

#Change to your own dog name (chosen name must be in VOSK DB and all in lower case)
#Ensure to retain single trailing space character
DogName="nelson "



############################################################
#NO CHANGES SHOULD BE NEEDED AFTER HERE, BUT MIC MAY NOT WORK?
#I've used two different USB mics, both work fine, but cannot
#test all of them!
############################################################


KEYWORDPOS=3
HEADER=14


# list all audio devices known to your system just for info/debug
print(sd.query_devices())

device_info = sd.query_devices(sd.default.device[0], 'input')
samplerate = int(device_info['default_samplerate'])

# display the default input device for debug/info
print("===> Initial Default Device Number:{} Description: {}".format(sd.default.device[0], device_info))

# setup queue and callback function
q = queue.Queue()

def recordCallback(indata, frames, time, status):
    if status:
        print(status, file=sys.stderr)
    q.put(bytes(indata))

print("===> Building the model and recognizer objects.  This will take a few minutes.")
model = Model(DBDIR)
recognizer = KaldiRecognizer(model, samplerate)
recognizer.SetWords(False)

#Simply maps words to key presses in SunFounder's keyboard program, for simplicity.
#I've probably missed several! Could also add number recognition and
#thus add loops around walk-style commands to make it walk further etc.
#I've had to add a few duplicates, probably my accent! The included debug messages
#can be used to modify the words to match one's own accent if needed!
def executeAction(StrippedString):
     print("Command I heard is",StrippedString) # for debug
     if    StrippedString == "doze off"  :  run_operation(("1"))
     elif  StrippedString == "push up"   :  run_operation(("2"))
     elif  StrippedString == "howl"      :  run_operation(("3"))
     elif  StrippedString == "twist body":  run_operation(("4"))
     elif  StrippedString == "twist"     :  run_operation(("4"))
     elif  StrippedString == "scratch"   :  run_operation(("5"))
     elif  StrippedString == "bark"      :  run_operation(("q"))
     elif  StrippedString == "forward"   :  run_operation(("w"))
     elif  StrippedString == "forwards"  :  run_operation(("w"))
     elif  StrippedString == "trot"      :  run_operation(("W"))
     elif  StrippedString == "pant"      :  run_operation(("e"))
     elif  StrippedString == "punt"      :  run_operation(("e"))
     elif  StrippedString == "pumped"    :  run_operation(("e"))
     elif  StrippedString == "wag tail"  :  run_operation(("r"))
     elif  StrippedString == "handshake" :  run_operation(("t"))
     elif  StrippedString == "turn left" :  run_operation(("a"))
     elif  StrippedString == "backward"  :  run_operation(("s"))
     elif  StrippedString == "backwards" :  run_operation(("s"))
     elif  StrippedString == "turn right":  run_operation(("d"))
     elif  StrippedString == "shake head":  run_operation(("f"))
     elif  StrippedString == "high five" :  run_operation(("g"))
     elif  StrippedString == "lie"       :  run_operation(("z"))
     elif  StrippedString == "lie down"  :  run_operation(("z"))
     elif  StrippedString == "stand"     :  run_operation(("x"))
     elif  StrippedString == "stand up"  :  run_operation(("x"))
     elif  StrippedString == "sit"       :  run_operation(("c"))
     elif  StrippedString == "stretch"   :  run_operation(("v"))
     elif (StrippedString == "shut down now"):
         run_operation(("q"))
         run_operation(("q"))
         run_operation(("q"))
         print("Initiating Shutdown")
         os.system("sudo shutdown now")
     else :
        print("..... but I'm sorry I can't '%s'" % StrippedString)



#Start program...
print("===> Ready to obey.... Press Ctrl+C to stop")
try:
    # Pass the sample rate explicitly so the stream matches the recognizer
    with sd.RawInputStream(samplerate=samplerate,
                           dtype='int16',
                           channels=1,
                           callback=recordCallback):
        while True:
            data = q.get()
            if recognizer.AcceptWaveform(data):
                recognizerResult = recognizer.Result()
                # convert the recognizerResult string into a dictionary 
                resultDict = json.loads(recognizerResult)
                # Read the recognised text from the dictionary rather than
                # slicing the raw JSON string at fixed offsets
                text = resultDict.get("text", "")
                if text:
                    if text.startswith(DogName):
                        # Strip the dog's name, leaving just the command
                        executeAction(text[len(DogName):])
                    else:
                        # Just so the user can see what was really heard, i.e. a bit of "voice" debug
                        print("I heard this = ", text.split())
                else:
                    print("no input sound")

except KeyboardInterrupt:
    print('===> Finished')
    my_dog.close()
except Exception as e:
    print(str(e))

Enjoy!
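
The comments in the script mention number recognition as a possible extension. Below is a minimal sketch of how that might look, assuming the recogniser result is a JSON string with a "text" field as in the script above. The names parse_command, COMMAND_KEYS and NUMBER_WORDS are hypothetical (they are not part of the posted code or of Vosk), and only a handful of phrases are included for illustration.

```python
import json

# Phrase -> key press. A small subset of the phrases in the script above.
COMMAND_KEYS = {
    "sit": "c",
    "stand": "x",
    "stand up": "x",
    "bark": "q",
    "forward": "w",
    "forwards": "w",
    "turn left": "a",
    "high five": "g",
}

# Spoken numbers that may follow a command (hypothetical extension).
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def parse_command(result_json, dog_name="nelson"):
    """Return (key, repeat) for a recognised command, or None.

    result_json is the recogniser's JSON string, e.g. '{"text": "nelson sit"}'.
    """
    words = json.loads(result_json).get("text", "").split()
    if not words or words[0] != dog_name:
        return None  # nothing heard, or not addressed to the dog
    words = words[1:]

    # Try the whole phrase first, so "high five" is not read as "high" x 5.
    phrase = " ".join(words)
    if phrase in COMMAND_KEYS:
        return COMMAND_KEYS[phrase], 1

    # Otherwise treat a trailing number word as a repeat count.
    if len(words) >= 2 and words[-1] in NUMBER_WORDS:
        key = COMMAND_KEYS.get(" ".join(words[:-1]))
        if key is not None:
            return key, NUMBER_WORDS[words[-1]]
    return None
```

The returned pair could then drive the existing function, for example by calling run_operation(key) in a loop repeat times. Whether repeated walk commands behave well on a real dog is untested here.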


As previously posted, I’ve added hazard/step detection via MiDaS-based AI and then occupancy mapping to my PiDog. This is now also coupled to path planning, such that my PiDog chooses its own destination, plans a route, and heads off there autonomously, avoiding any hazards en route. This code and its libraries are unfortunately too large for this style of forum. However, my PiDog was then lacking some personality, so I interspersed the autonomous wandering with random events from SunFounder’s keyboard control to make my robot’s actions a bit more “dog like”. I’ve cut my larger AI code base down such that only the random action section remains and, whilst it is very simple code, I hope it is of some interest/fun.

#!/usr/bin/env python3

import random
import time

#Recommend dropping this file straight into the existing examples folder
#so as not to have to mess with python paths

#Set by user according to environment
#It is up to the user to decide if the environment is safe enough to wander!
#ONFLOOR=True  # on the floor, so walk commands will be executed
ONFLOOR=False  # e.g. on a bench, so no walk commands will be executed

# MyKbCtrl is a soft link to SunFounder's 11_keyboard_control.py
# One of many ways to import a module with a numeric first character
# One needs to create this link manually, e.g.
# ln -s 11_keyboard_control.py MyKbCtrl.py
# or use another builtin method such as __import__()

from MyKbCtrl import *
from MyKbCtrl import run_operation

#Simply picks key presses for SunFounder's keyboard program at random.
#I've probably missed several! Should probably have used a dictionary too.
#I debug this on a bench, so any "walking" style commands need to be
#easily removable to prevent PiDog walking off the edge!
#I've already written the AI for hazard/step detection, but that's
#too big to publish easily

def ExecAnAction(RanAction):
  print("Action = ",RanAction,end="")
  if    RanAction==1 :  run_operation(("1"))  #= "doze off"    
 #elif  RanAction==2 :  run_operation(("2"))  #= "push up" keeps falling over!
  elif  RanAction==3 :  run_operation(("3"))  #= "howl"      
  elif  RanAction==4 :  run_operation(("4"))  #= "twist body"
  elif  RanAction==5 :  run_operation(("4"))  #= "twist"     
  elif  RanAction==6 :  run_operation(("5"))  #= "scratch"   
  elif  RanAction==7 :  run_operation(("q"))  #= "bark"      
  elif  RanAction==10:  run_operation(("e"))  #= "pant"      
  elif  RanAction==11:  run_operation(("e"))  #= "punt"      
  elif  RanAction==12:  run_operation(("e"))  #= "pumped"    
  elif  RanAction==13:  run_operation(("r"))  #= "wag tail"  
  elif  RanAction==14:  run_operation(("t"))  #= "handshake" 
  elif  RanAction==18:  run_operation(("f"))  #= "shake head"
  elif  RanAction==19:  run_operation(("g"))  #= "high five" 
  elif  RanAction==20:  run_operation(("z"))  #= "lie"       
  elif  RanAction==21:  run_operation(("x"))  #= "stand"     
  elif  RanAction==22:  run_operation(("c"))  #= "sit"       
  elif  RanAction==23:  run_operation(("v"))  #= "stretch"   
  if (ONFLOOR): # My own code version calls my AI path planners here
    if    RanAction==8 : run_operation(("w"))   #= "forward"   
    elif  RanAction==9 : run_operation(("W"))   #= "trot"   
    elif  RanAction==15: run_operation(("a"))   #= "turn left" 
    elif  RanAction==16: run_operation(("s"))   #= "backward"  
    elif  RanAction==17: run_operation(("d"))   #= "turn right"
  else:
    if    RanAction==8 :  print("Needs Safe AI Walkabout code")  #= "forward"   
    elif  RanAction==9 :  print("Needs Safe AI Walkabout code")  #= "trot"   
    elif  RanAction==15:  print("Needs Safe AI Walkabout code")  #= "turn left" 
    elif  RanAction==16:  print("Needs Safe AI Walkabout code")  #= "backward"  
    elif  RanAction==17:  print("Needs Safe AI Walkabout code")  #= "turn right"

try:
   while True:
     ExecAnAction(random.randrange(1,24))
     #Give batteries an occasional rest!
     Rest= random.randrange(1,7)
     print("  Rest = ",Rest)
     time.sleep(Rest)


except KeyboardInterrupt:
    print('===> Finished')
    my_dog.close()
except Exception as e:
    print(str(e))
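
As a separate aside from the script itself: the comments note that a dictionary would probably have been tidier. A minimal sketch of that idea follows, using the key presses and descriptions from the comments in the script. The names STATIONARY_ACTIONS, WALKING_ACTIONS and pick_action are hypothetical. Note that it picks each key with equal probability, whereas the original numbering lists some keys more than once, and "push up" is left out because the script has it commented out.

```python
import random

# Key press -> description, taken from the comments in the script above.
STATIONARY_ACTIONS = {
    "1": "doze off", "3": "howl", "4": "twist body", "5": "scratch",
    "q": "bark", "e": "pant", "r": "wag tail", "t": "handshake",
    "f": "shake head", "g": "high five", "z": "lie", "x": "stand",
    "c": "sit", "v": "stretch",
}

# Only offered when the dog is on the floor.
WALKING_ACTIONS = {
    "w": "forward", "W": "trot", "a": "turn left",
    "s": "backward", "d": "turn right",
}


def pick_action(on_floor, rng=random):
    """Return a random (key, description) pair.

    Walking keys are only candidates when on_floor is True.
    """
    actions = dict(STATIONARY_ACTIONS)
    if on_floor:
        actions.update(WALKING_ACTIONS)
    key = rng.choice(list(actions))
    return key, actions[key]
```

In the main loop this could be used as key, name = pick_action(ONFLOOR) followed by run_operation(key), with the rest of the loop unchanged.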

The code is supplied in “table top mode”, i.e. no walk commands are executed.
Changing ONFLOOR to True at the top of the code enables the walk commands, so make sure that the dog is in a safe, supervised environment. It could end up anywhere! Ctrl-C stops it.

No other code changes are necessary.
A soft link to SunFounder’s keyboard script will be needed, as explained in the code’s comments, or use a built-in such as __import__().
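
For anyone who would rather avoid the soft link, here is a minimal sketch of the import route using the standard library’s importlib, which accepts module names that start with a digit. The helper name import_numeric_module is hypothetical, and the usage lines assume the file sits in the same folder as 11_keyboard_control.py and that the module exposes run_operation and my_dog as used in the scripts above.

```python
import importlib
import sys


def import_numeric_module(module_name, folder=None):
    """Import a module whose filename starts with a digit.

    A plain "import 11_keyboard_control" is a syntax error, but
    importlib.import_module takes the name as a string, so it works.
    If folder is given, it is added to the module search path first.
    """
    if folder is not None and folder not in sys.path:
        sys.path.insert(0, folder)
    return importlib.import_module(module_name)


# Usage (assumed layout, not run here):
#   kb = import_numeric_module("11_keyboard_control")
#   kb.run_operation("c")   # "sit" in the scripts above
#   kb.my_dog.close()
```

With this approach the names live on the returned module object (kb.run_operation, kb.my_dog) rather than being pulled in by a star import, so the calls in the scripts would need that prefix.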


Thank you! Can’t wait to try it during the vacation!