Pidog OpenAI Image Prompt Fails

My PiDog OpenAI image analysis was working great, but since yesterday evening it has been failing to return results. My terminal output is listed below. Any ideas?

pi@raspberrypi:~/pidog/gpt_examples $ ~/keyboard.sh
config_file: /home/pi/.config/pidog/pidog.conf
robot_hat init … done
imu_sh3001 init … done
rgb_strip init … done
dual_touch init … done
sound_direction init … done
sound_effect init … done
ultrasonic init … done
vilib 0.3.11 launching …
picamera2 0.3.24
ALSA lib pcm.c:8570:(snd_pcm_recover) underrun occurred

Web display on:
http://192.168.86.38:9000/mjpg

Starting web streaming …

  • Serving Flask app 'vilib.vilib'
  • Debug mode: off

input: what do you see?
1739372497.743 user >>> what do you see?
failed
chat takes: 4.616 s
tts takes: 3.309 s
actions: ['stop']
speak start
speak done

input:

We have recently updated our code, so we recommend that you re-update all the library code and run the examples again to see if they work.
cd ~/robot-hat
git pull
sudo python3 setup.py install

cd ~/vilib
git pull
sudo python3 install.py

cd ~/pidog
git pull
sudo python3 setup.py install

cd ~/pidog
sudo bash i2samp.sh

After updating the code, reboot your system and run the example again to see if it works.
If running i2samp.sh fails, we recommend running it a few more times.
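
For convenience, a small shell loop can retry it automatically (just a sketch; three attempts is arbitrary):

cd ~/pidog
for i in 1 2 3; do sudo bash i2samp.sh && break; done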

Thanks! I will give this a try. One question: should I update/upgrade the system first?
sudo apt update
sudo apt upgrade
I think I saw somewhere it was best not to upgrade?

FYI, I get this message when running sudo python3 setup.py install:
The "setup.py" installation method is planned to be abandoned.
Please execute "install.py" to install.

Issue Description:

I’m working on integrating OpenAI’s Vision API with my PiDog project. Text-based queries work fine, but when I attempt to send an image to OpenAI for analysis, even with the updated code you provided, the API returns a server error. The code worked fine last week, but something changed, perhaps on the OpenAI side. The failure occurs after the image successfully uploads to OpenAI’s API.

Here’s the response I receive when running an image prompt:

input: test
1739497226.651   user >>> test
🔍 OpenAI Full Response: 
Run(id='run_kqEc5KWUYdL49rCgR6PP4eZY', 
    assistant_id='asst_Hiz7rLKNny9H2e5irztj9F8W', 
    last_error=LastError(code='server_error', message='Sorry, something went wrong.'), 
    status='failed', 
    model='gpt-4o'
)
❌ OpenAI Run Status: failed

Steps Taken:

  1. Text-only queries work fine with the gpt-4o model.
  2. The image successfully uploads using OpenAI’s file upload API.
  3. The API request fails when sending the image for processing, returning "server_error" in the response.
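
To help isolate this from the PiDog wrapper code, here is a minimal standalone sketch of the same Assistants flow; the assistant ID and image path are placeholders for my own values:

from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY from the environment

ASSISTANT_ID = "asst_XXXXXXXX"  # placeholder: my assistant's ID
IMG_PATH = "img_input.jpeg"     # placeholder: image captured by the Pi camera

# Upload the image with the "vision" purpose
img_file = client.files.create(file=open(IMG_PATH, "rb"), purpose="vision")

# New thread with a single text + image_file message
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=[
        {"type": "text", "text": "what do you see?"},
        {"type": "image_file", "image_file": {"file_id": img_file.id}},
    ],
)

# Run and poll; print the status and last_error so the failure is visible
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=ASSISTANT_ID,
)
print(run.status, run.last_error)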

Code Snippet:

def dialogue_with_img(self, msg, img_path):
    chat_print("user", msg)

    # Upload image to OpenAI's vision API
    img_file = self.client.files.create(
        file=open(img_path, "rb"),
        purpose="vision"
    )

    # Send request with text and image
    message = self.client.beta.threads.messages.create(
        thread_id=self.thread.id,
        role="user",
        content=[
            {"type": "text", "text": msg},
            {"type": "image_file", "image_file": {"file_id": img_file.id}}
        ],
    )

    # Run API request and wait for response
    run = self.client.beta.threads.runs.create_and_poll(
        thread_id=self.thread.id,
        assistant_id=self.assistant_id,
    )

    print("🔍 OpenAI Full Response:", run)  # Debugging

    if run.status == 'completed':
        messages = self.client.beta.threads.messages.list(thread_id=self.thread.id)
        for message in messages.data:
            if message.role == 'assistant':
                for block in message.content:
                    if block.type == 'text':
                        value = block.text.value
                        chat_print(self.assistant_name, value)
                        return value
        return None  # No valid response
    else:
        print(f"❌ OpenAI Run Status: {run.status}")
        return None
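
Since the run reports last_error.code == 'server_error', I am also trying a simple retry in case the failure is transient. This run_with_retry helper is my own sketch (attempt count and delay are arbitrary):

import time

def run_with_retry(self, max_attempts=3, delay=5):
    # Re-create and poll the run a few times; 'server_error' can be transient
    for attempt in range(max_attempts):
        run = self.client.beta.threads.runs.create_and_poll(
            thread_id=self.thread.id,
            assistant_id=self.assistant_id,
        )
        if run.status == 'completed':
            return run
        print(f"❌ attempt {attempt + 1} failed: {run.last_error}")
        time.sleep(delay)
    return run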

Questions & Help Needed:

  1. Is there an issue with how I’m passing the image to OpenAI?
  2. Has anyone successfully used OpenAI Vision with PiDog in the last few days?
  3. Could this be a temporary API issue, or do I need a different approach?

Any insights would be greatly appreciated! 🚀

FYI, as an experiment I tried saving the image to Imgur, and OpenAI returns responses using this approach, but I would prefer to source the image file from the Raspberry Pi directly. For your reference, here’s a copy of my code.

def dialogue_with_img(self, msg, img_path):
    chat_print("user", msg)

    # Upload the image to Imgur
    print("📤 Uploading image to Imgur…")
    try:
        im = pyimgur.Imgur(IMGUR_CLIENT_ID)
        uploaded_image = im.upload_image(img_path, title="Uploaded from PiDog")
        image_url = uploaded_image.link
        print(f"✅ Uploaded: {image_url}")
    except Exception as e:
        print(f"❌ Failed to upload image to Imgur: {e}")
        return None

    # Send image URL to OpenAI with explicit model specification
    print("🤖 Sending image to OpenAI using model: gpt-4o")
    try:
        response = self.client.chat.completions.create(
            model="gpt-4o",  # ← Explicitly specify the model
            messages=[
                {"role": "user",
                 "content": [
                     {"type": "text", "text": msg},
                     {"type": "image_url", "image_url": {"url": image_url}}
                 ]}
            ]
        )
        # Extract response
        print("✅ OpenAI Response Received:", response)
        return response.choices[0].message.content
    except Exception as e:
        print(f"❌ Error in OpenAI request: {e}")
        return None
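
To avoid Imgur entirely, one alternative I am considering is embedding the local file as a base64 data URL in the same chat.completions call, so the image can be sourced from the Raspberry Pi directly. A sketch (assumes the capture is JPEG):

import base64

def encode_image_as_data_url(img_path):
    # Read the local file and return a data URL that gpt-4o vision accepts
    with open(img_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"

# In the message content, replace the Imgur URL with the data URL:
# {"type": "image_url", "image_url": {"url": encode_image_as_data_url(img_path)}}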

And the terminal response is listed below:

pi@raspberrypi:~ $ ~/keyboard.sh
vilib 0.3.14 launching …
picamera2 0.3.25

Web display on:
http://192.168.86.31:9000/mjpg

Starting web streaming …
Camera initialized.

Starting OpenAI ChatGPT with camera input…
Enter your query (or press Enter to skip): * Serving Flask app 'vilib.vilib'

  • Debug mode: off
    what do you see?
    Sending query to OpenAI…
    1739500704.167 user >>> what do you see?
    📤 Uploading image to Imgur…
    ✅ Uploaded: Uploaded from PiDog - Imgur
    🤖 Sending image to OpenAI using model: gpt-4o
    ✅ OpenAI Response Received:
    Response received in 7.22 seconds.
    OpenAI Response: The image shows a room with various items. In the foreground, there is a cardboard box and what appears to be a wooden mechanism, possibly a music box or a similar device with a rotating drum. Behind this is a dark chair back. In the background, there is a piece of furniture with framed photographs and other objects placed on top. The lighting in the room is dim.
    Enter your query (or press Enter to skip):

😄 I really appreciate PiDog’s ability to see and respond to its environment. My recent code change successfully restores that capability. However, I noticed that my update removed the Assistant ID, which seems to have also disabled PiDog’s ability to respond to commands like ‘bark’ or ‘scratch.’ I’m looking for guidance on how to retain both the vision-based responses and the command-execution functionality. Any insights would be greatly appreciated!
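
One idea I am exploring (my own sketch, not an official fix): keep the chat.completions vision call, but ask gpt-4o for a JSON object that carries both the spoken answer and an actions list, then parse it the way the Assistant reply used to be parsed. The system prompt and JSON keys here are my own invention:

import json

SYSTEM_PROMPT = (
    "You are PiDog. Reply ONLY with a JSON object of the form "
    '{"actions": ["bark", "scratch"], "answer": "spoken reply"}.'
)

def dialogue_with_img_and_actions(self, msg, image_url):
    response = self.client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # forces valid JSON back
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": [
                {"type": "text", "text": msg},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
        ],
    )
    data = json.loads(response.choices[0].message.content)
    return data.get("actions", []), data.get("answer")

That way the existing action-dispatch code could keep working while the vision path stays on chat.completions.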

Wow, it looks like you fixed my issue. I downloaded the latest code and it works as designed again! Thanks so much! 😀


Perfect. Feel free to reach out if you run into any other issues.
