If these aren't generic messages, then I wouldn't use something as generic as Open Message Format. I would suggest qualifying the name. Open LLM Message Format, OLMF. Or LOMF. Or OLIMF. Something to qualify the scope and advertise what it is for.
I see, thank you for the feedback, I will do a brainstorm on the name, I originally thought that open message format seemed somewhat catchy, but I can totally see how it is generic and not as descriptive as it should be.
Advice: Maybe consider the "comparison box" up higher as it's the easiest fastest way for a new reader to see current state. My first thought was "how many of the big players are compatible with this" and surprisingly, many are now.
The second larger up hill battle you have is specs like this are usually driven by a company or some sort of consortium. Who are you in relation to these orgs to hope to influence/encourage such spec usage?
In response to the second part, this project is a part of the Open LLM Initiative, which also has projects like the Global LLM Challenge. We are also planning to push this project into a consortium once it gains some traction, and we are currently working with some tools so that they adopt this format.
The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.
It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. But strict requirements on message order seem to be going away, a sort of Postel's Law adaptation perhaps.
Gemini has a way of basically doing text completion by leaving out the role [2]. But I suppose that's out of the standard.
Parameters like top_p are very eclectic between providers, and so I suppose it makes sense to leave them out, but temperature is pretty universal.
In general this looks like a codification of a minimal OpenAI GPT API, which is reasonable. It's become the de facto standard, and provider gateways all seem to translate to and from the OpenAI API. I think it would be easier to understand if the intro made it more clear that it's really trying to specify an emergent standard and isn't proposing something new.
hey @ianbicking - thanks a lot for the feedback. I've merged a change to fix the links [1].
> The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.
For the metadata you are right. Request and response tokens are billed separately and should be captured accordingly. I've put a PR to address that [2]
> It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. ...
We do assume that last message in the array to be from user. But we are not forcing it at the moment.
I've hit cross-LLM-compatibility errors in the past with message order, multiple system messages, and empty messages.
Multiple system messages are kind of a hack to invoke that distinct role in different positions, especially the last position. I.e., second to last message is what the user said, last message is a system message telling the LLM to REALLY FOLLOW THE INSTRUCTIONS and not get overly distracted by the user. (Though personally I usually rewrite the user message for that purpose.)
Multiple user messages in a row is likely caused by some failure in the system to produce an assistant response, like no network. You could ask the client to collapse those, but I think it's most correct to allow them. The user understands the two messages as distinct.
Multiple assistant messages, or no trailing user message, is a reasonable way to represent "please continue" without a message. These could also be collapsed, but that may or may not be accurate depending on how the messages are truncated.
This all gets even more complicated once tools are introduced.
(I also notice there's no max_tokens or stop reason. Both are pretty universal.)
These message order questions do open up a more meta question you might want to think about and decide on: is this a prescriptive spec that says how everyone _should_ behave, a descriptive spec that is roughly the outer bounds of what anyone (either user or provider) can expect... or a combination like prescriptive for the provider and descriptive for the user.
Yeah, I can completely see this, the goal of this was to be specifically for the messages object, and not a completions object, since in my experience, you usually send messages from front end to backend and then create the completion request with all the additional parameters when sending from backend to an LLM provider. So when just sending from an application to the server, trying to just capture the messages object seemed ideal. This was also designed to try and maximize cross compatibility, so it is not what the format "should be" instead, it is trying to be a format that everyone can adopt without disrupting current setups.
Huh, that's a different use case than I was imagining. I actually don't know why I'd want a standard API from a frontend and backend that I control.
In most applications where I make something chat-like (honestly a minority of my LLM use) I have application-specific data in the chat, and then I turn that into an LLM request only immediately before sending a completion request, using application-specific code.
Well, in the case of the front-end (like streamlit, gradio, etc) they send conversational messages in their own custom ways - this means I must develop against them each specifically, and that slows down any quick experimentation I would want to do as a developer. This is the client <> server interaction.
And then the conversational messages sent to the LLM are also somewhat unique to each provider. One improvement for simplicity purposes could be that we get a standard /chat/completions API for server <> LLM interaction and define a standard "messages" object in that API (vs the stand-alone messages object as defined in the OMF").
Perhaps that might be simpler, and easier to understand
Open Message Format (OMF) is a specification for structuring message exchanges between users, servers, and large language models (LLMs). It defines an API contract and schema for the "message" object, which supports communication in conversational agents. OMF allows developers to work with different LLMs and tools without writing extra translation code.
If these aren't generic messages, then I wouldn't use something as generic as Open Message Format. I would suggest qualifying the name. Open LLM Message Format, OLMF. Or LOMF. Or OLIMF. Something to qualify the scope and advertise what it is for.
I see, thank you for the feedback, I will do a brainstorm on the name, I originally thought that open message format seemed somewhat catchy, but I can totally see how it is generic and not as descriptive as it should be.
I think it would be challenging to come up with a more ambiguous name.
Hi, do you think it would be beneficial to specify that it is for LLM interaction within the name itself?
Advice: Maybe consider the "comparison box" up higher as it's the easiest fastest way for a new reader to see current state. My first thought was "how many of the big players are compatible with this" and surprisingly, many are now.
The second larger up hill battle you have is specs like this are usually driven by a company or some sort of consortium. Who are you in relation to these orgs to hope to influence/encourage such spec usage?
In response to the second part, this project is a part of the Open LLM Initiative, which also has projects like the Global LLM Challenge. We are also planning to push this project into a consortium once it gains some traction, and we are currently working with some tools so that they adopt this format.
Thank you so much for the feedback about the location of the comparison box, I have merged a change that moves it up higher in the readme.
Lots of broken links in the doc, though I guess the YAML file specifies everything: https://github.com/open-llm-initiative/open-message-format/b...
The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.
It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. But strict requirements on message order seem to be going away, a sort of Postel's Law adaptation perhaps.
Gemini has a way of basically doing text completion by leaving out the role [2]. But I suppose that's out of the standard.
Parameters like top_p are very eclectic between providers, and so I suppose it makes sense to leave them out, but temperature is pretty universal.
In general this looks like a codification of a minimal OpenAI GPT API, which is reasonable. It's become the de facto standard, and provider gateways all seem to translate to and from the OpenAI API. I think it would be easier to understand if the intro made it more clear that it's really trying to specify an emergent standard and isn't proposing something new.
[1] https://github.com/open-llm-initiative/open-message-format/b...
[2] https://ai.google.dev/gemini-api/docs/text-generation?lang=r...
hey @ianbicking - thanks a lot for the feedback. I've merged a change to fix the links [1].
> The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.
For the metadata you are right. Request and response tokens are billed separately and should be captured accordingly. I've put a PR to address that [2]
> It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. ...
We do assume that last message in the array to be from user. But we are not forcing it at the moment.
[1] https://github.com/open-llm-initiative/open-message-format/p...
[2] https://github.com/open-llm-initiative/open-message-format/p...
I've hit cross-LLM-compatibility errors in the past with message order, multiple system messages, and empty messages.
Multiple system messages are kind of a hack to invoke that distinct role in different positions, especially the last position. I.e., second to last message is what the user said, last message is a system message telling the LLM to REALLY FOLLOW THE INSTRUCTIONS and not get overly distracted by the user. (Though personally I usually rewrite the user message for that purpose.)
Multiple user messages in a row is likely caused by some failure in the system to produce an assistant response, like no network. You could ask the client to collapse those, but I think it's most correct to allow them. The user understands the two messages as distinct.
Multiple assistant messages, or no trailing user message, is a reasonable way to represent "please continue" without a message. These could also be collapsed, but that may or may not be accurate depending on how the messages are truncated.
This all gets even more complicated once tools are introduced.
(I also notice there's no max_tokens or stop reason. Both are pretty universal.)
These message order questions do open up a more meta question you might want to think about and decide on: is this a prescriptive spec that says how everyone _should_ behave, a descriptive spec that is roughly the outer bounds of what anyone (either user or provider) can expect... or a combination like prescriptive for the provider and descriptive for the user.
Validation suites would also make this clearer.
Yeah, I can completely see this, the goal of this was to be specifically for the messages object, and not a completions object, since in my experience, you usually send messages from front end to backend and then create the completion request with all the additional parameters when sending from backend to an LLM provider. So when just sending from an application to the server, trying to just capture the messages object seemed ideal. This was also designed to try and maximize cross compatibility, so it is not what the format "should be" instead, it is trying to be a format that everyone can adopt without disrupting current setups.
Huh, that's a different use case than I was imagining. I actually don't know why I'd want a standard API from a frontend and backend that I control.
In most applications where I make something chat-like (honestly a minority of my LLM use) I have application-specific data in the chat, and then I turn that into an LLM request only immediately before sending a completion request, using application-specific code.
Well, in the case of the front-end (like streamlit, gradio, etc) they send conversational messages in their own custom ways - this means I must develop against them each specifically, and that slows down any quick experimentation I would want to do as a developer. This is the client <> server interaction.
And then the conversational messages sent to the LLM are also somewhat unique to each provider. One improvement for simplicity purposes could be that we get a standard /chat/completions API for server <> LLM interaction and define a standard "messages" object in that API (vs the stand-alone messages object as defined in the OMF").
Perhaps that might be simpler, and easier to understand
OM
Open Message Format (OMF) is a specification for structuring message exchanges between users, servers, and large language models (LLMs). It defines an API contract and schema for the "message" object, which supports communication in conversational agents. OMF allows developers to work with different LLMs and tools without writing extra translation code.
Couldn't get past the thoughts about the naming after reading the first paragraph...
I propose a better name: Simple Message Spec (SMS)!
Squat all Open-* names!
[dead]