Need help with OpenAI stream completion

torano · Jun 4, 2023

Problem

I am trying to use the OpenAI Chat Completion API (GPT-3.5/4) from GDScript at runtime, not in the editor. The problem is that I can't get it to work with the 'stream' option enabled.

What I've tried

I wrote code that successfully calls the Chat API, but the response is very slow to arrive.
I found that with stream=true the generated text is delivered piece by piece, so I want to use that.
I followed the guide here, but I can't get it to work.

This is the working code with stream=false.

extends Node
class_name OpenAIChatCompletion

signal chat_completed(end, content)

export(String, "gpt-3.5-turbo", "gpt-4") var model = "gpt-4"
export var temperature = 1.0

var is_completed = true
var url = "https://api.openai.com/v1/chat/completions"
var headers = ["Content-Type: application/json", "Authorization: Bearer "]
var http_request
var stream = false

func _ready():
	http_request = HTTPRequest.new()
	add_child(http_request)
	http_request.connect("request_completed", self, "_http_request_completed")

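func _http_request_completed(result, response_code, response_headers, body):
	# The parameter is named response_headers so it does not shadow the
	# member variable 'headers' declared above.
	print("result: " + str(result))
	
	# body is a raw byte array; decode it as UTF-8 before parsing the JSON.
	var response = parse_json(body.get_string_from_utf8())
	print(response)
	emit_signal("chat_completed", true, response.choices[0].message.content)
	is_completed = true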
	
func send(api_key, messages):
	var headers_with_api = headers.duplicate()
	headers_with_api[1] = headers_with_api[1] + api_key
	
	var body = to_json(
		{
			"model": model,
			"messages": messages,
			"temperature" : temperature,
			"stream" : stream
		}
	)
	
	is_completed = false
	var error = http_request.request(url, headers_with_api, true, HTTPClient.METHOD_POST, body)

	if error != OK:
		push_error("An error occurred in the HTTP request.")
		is_completed = true

If I set the 'stream' variable to true, I expected the 'response' variable in the _http_request_completed function to hold an array (or a dictionary?) containing multiple responses.
But print(response) prints Null, and the following code gives an error.

	for chunk in response:
		print(chunk)

Question

Does anyone know how to get it to work? Any advice would be appreciated.

klaas · Jun 5, 2023

What is in the body variable?

print("body: " + str(body))

Is the body empty? Maybe the JSON is corrupt and cannot be parsed?

torano · Jun 7, 2023

klaas
Thanks for your response.
It prints out an array of numbers, like body: [101, 92, 116, 97, 58, 32, 123, 34...]
It seems that something is coming back from the web server, so I guess the problem is in parse_json(body.get_string_from_utf8()), maybe?
I don't know how to solve it though.

klaas · Jun 7, 2023

Yes ... I could have seen this earlier.

request_completed returns the body as a byte array (PoolByteArray in Godot 3, PackedByteArray in Godot 4). You have to convert this to a string. get_string_from_utf8 should do the job correctly in most cases. If you don't know the server's encoding, you have to look at the response headers.
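For checking the headers, here is a minimal sketch in Godot 3 syntax. It assumes the headers argument of request_completed is a list of "Name: value" strings; get_content_type is a made-up helper name, not a Godot function.

```gdscript
# Hypothetical helper: returns the value of the Content-Type response header,
# or an empty string if the server did not send one.
func get_content_type(response_headers):
	for header in response_headers:
		# Header names are case-insensitive, so compare in lower case.
		if header.to_lower().begins_with("content-type:"):
			# "content-type:" is 13 characters long; keep what follows it.
			return header.substr(13, header.length() - 13).strip_edges()
	return ""
```

For a streamed OpenAI response the value would be expected to start with text/event-stream rather than application/json, which is already a hint that the body is not one single JSON document.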

Just print body.get_string_from_utf8() ... and have a look at the result.

I released an HTTP add-on a few weeks ago ... maybe it can help you

https://godotengine.org/asset-library/asset/1797

torano · Jun 9, 2023

klaas

print(body.get_string_from_utf8()) outputs this:

// if stream=true
data: {"id":"SomeId","object":"chat.completion.chunk","created":SomeNumber,"model":"gpt-4-0314","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}

data: {"id":"SomeId","object":"chat.completion.chunk","created":SomeNumber,"model":"gpt-4-0314","choices":[{"delta":{"content":"I"},"index":0,"finish_reason":null}]}

data: {"id":"SomeId","object":"chat.completion.chunk","created":SomeNumber,"model":"gpt-4-0314","choices":[{"delta":{"content":"'m"},"index":0,"finish_reason":null}]}

// if stream=false
{
  "id": "SomeId",
  "object": "chat.completion",
  "created": SomeNumber,
  "model": "gpt-4-0314",
  "usage": {
    "prompt_tokens": 270,
    "completion_tokens": 134,
    "total_tokens": 404
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "something"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}

So, basically, with stream=true it returns multiple JSON bodies instead of a single JSON body. Each one is also prefixed with the extra string "data:".
I think that is the reason why parsing fails.
Do you have any idea how to properly parse this?

klaas · Jun 9, 2023

Well ... isn't the streaming option supposed to be read continuously, as "streaming" suggests? How much sense does it make to wait for the whole stream to finish? Isn't that the same as requesting it all at once without the stream option?

Anyway ... you could just split the result string ... it seems the entries are separated by two line breaks (right?). Then strip the "data:" prefix and JSON-parse each element of the split result.
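A minimal, untested sketch of that splitting idea, in Godot 3 syntax to match the code posted earlier in the thread. parse_stream_body is a made-up helper name. It also assumes the stream ends with a "data: [DONE]" line, which is how OpenAI marks the end of a stream but is not visible in the output pasted above.

```gdscript
# Hypothetical helper (not part of Godot or the OpenAI API): extracts the text
# pieces from a complete "stream": true response body.
func parse_stream_body(text):
	var contents = []
	# Splitting on single line breaks also covers the blank separator lines,
	# because empty entries are skipped (allow_empty = false).
	for raw_line in text.split("\n", false):
		var line = raw_line.strip_edges()
		if not line.begins_with("data:"):
			continue
		# Drop the 5-character "data:" prefix, then any padding.
		var payload = line.substr(5, line.length() - 5).strip_edges()
		if payload == "[DONE]":
			# Assumed end-of-stream marker; nothing to parse after it.
			break
		var chunk = parse_json(payload)
		if typeof(chunk) != TYPE_DICTIONARY or not chunk.has("choices"):
			continue
		if chunk["choices"].empty():
			continue
		# The first chunk only carries the role, so "content" may be missing.
		var delta = chunk["choices"][0].get("delta", {})
		if delta.has("content"):
			contents.append(delta["content"])
	return contents
```

Inside _http_request_completed this could be called as parse_stream_body(body.get_string_from_utf8()), and the pieces joined with PoolStringArray(pieces).join(""). Note that this only fixes the parsing: with HTTPRequest the pieces still arrive all at once when the request has finished.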

torano · Jun 9, 2023

klaas
Oh yeah, that is the weirdest part. I should get multiple responses one by one, but I get only one.
I think the problem is with request_completed, but I don't know if there is any other option available.

klaas · Jun 10, 2023

HTTPRequest does not work that way: it buffers the whole response and only emits request_completed once the response has finished. If you really want to support this kind of HTTP long-polling/streaming, you have to use HTTPClient. There you have the method read_response_body_chunk(). But it isn't that convenient to use, and you also need a deeper understanding of the HTTP protocol and of data streams in general.
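A rough, untested sketch of the HTTPClient approach, in Godot 3 syntax (yield, PoolByteArray; Godot 4 would use await and PackedByteArray instead). The signal names, stream_chat and _consume_complete_lines are made up for illustration, and the "[DONE]" end marker is an assumption about the OpenAI stream format. Bytes are buffered until a full line has arrived, so a JSON payload or a multi-byte UTF-8 character split across two chunks is not decoded too early.

```gdscript
extends Node

# Hypothetical signals for this sketch.
signal chunk_received(content)
signal stream_finished

var client = HTTPClient.new()

# Call this on a node that is inside the scene tree, because it yields on idle_frame.
func stream_chat(api_key, messages, model = "gpt-4"):
	# Port 443 with use_ssl = true for HTTPS.
	var err = client.connect_to_host("api.openai.com", 443, true)
	if err != OK:
		push_error("Could not start connecting.")
		return
	while client.get_status() in [HTTPClient.STATUS_RESOLVING, HTTPClient.STATUS_CONNECTING]:
		client.poll()
		yield(get_tree(), "idle_frame")
	if client.get_status() != HTTPClient.STATUS_CONNECTED:
		push_error("Connection failed.")
		return

	var headers = ["Content-Type: application/json", "Authorization: Bearer " + api_key]
	var body = to_json({"model": model, "messages": messages, "stream": true})
	err = client.request(HTTPClient.METHOD_POST, "/v1/chat/completions", headers, body)
	if err != OK:
		push_error("Could not send the request.")
		return
	while client.get_status() == HTTPClient.STATUS_REQUESTING:
		client.poll()
		yield(get_tree(), "idle_frame")
	if client.get_response_code() != 200:
		push_error("Unexpected response code: " + str(client.get_response_code()))
		client.close()
		return

	var pending = PoolByteArray()
	while client.get_status() == HTTPClient.STATUS_BODY:
		client.poll()
		var bytes = client.read_response_body_chunk()
		if bytes.size() == 0:
			# Nothing new yet; let the engine run a frame before polling again.
			yield(get_tree(), "idle_frame")
			continue
		pending.append_array(bytes)
		pending = _consume_complete_lines(pending)
	client.close()
	emit_signal("stream_finished")

# Emits chunk_received for every complete "data:" line in the buffer and
# returns the bytes after the last line break, which are still incomplete.
func _consume_complete_lines(pending):
	var last_newline = -1
	for i in range(pending.size() - 1, -1, -1):
		if pending[i] == 10: # 10 is the byte value of "\n"
			last_newline = i
			break
	if last_newline == -1:
		return pending
	# subarray() takes inclusive indices in Godot 3.
	var text = pending.subarray(0, last_newline).get_string_from_utf8()
	for raw_line in text.split("\n", false):
		var line = raw_line.strip_edges()
		if not line.begins_with("data:"):
			continue
		var payload = line.substr(5, line.length() - 5).strip_edges()
		if payload == "[DONE]":
			continue
		var chunk = parse_json(payload)
		if typeof(chunk) != TYPE_DICTIONARY or not chunk.has("choices"):
			continue
		if chunk["choices"].empty():
			continue
		var delta = chunk["choices"][0].get("delta", {})
		if delta.has("content"):
			emit_signal("chunk_received", delta["content"])
	if last_newline == pending.size() - 1:
		return PoolByteArray()
	return pending.subarray(last_newline + 1, pending.size() - 1)
```

A caller would connect chunk_received to a function that appends the text to a label, then call stream_chat(api_key, messages). Polling once per frame keeps the game responsive; a thread would be another option but adds its own complexity.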