OpenAI LLM APIs: OpenAI or Microsoft Azure?

Table of contents

When large language models come into play as part of building a competitive advantage within services or products, many additional questions arise. Effective implementation that brings real business value requires the analysis of aspects such as reliability, high availability, security, data privacy and many more. Such research is time-consuming, which is why deepsense.ai comes with some of it done for you already.

Table of contents

For the purposes of our projects, we have reviewed the key aspects related to the use of OpenAI models. In this article, we share our insights related to two main ways of accessing the OpenAI models, both directly from the organization’s API and via Microsoft Azure OpenAI Service. We review the two options for the central block of this kind of platform that can be covered by the services offered by the two.

What is OpenAI and what is the role of Microsoft in the equation?

OpenAI is an artificial intelligence company founded in December 2015. Its influence in terms of making AI available to a large community of users and becoming part of pop culture has been nothing short of spectacular. Each series of their famous GPT models quickly became popular, giving rise to excitement, doubts, concerns, and… new business opportunities!

The partnership between OpenAI and Microsoft dates back to November 2021, and it seems that their common goal and a shared ambition to responsibly advance cutting-edge AI research and democratize AI as a new technology platform only strengthens this collaboration. Microsoft is one of the biggest public cloud providers. Their immense cloud resources and robust services have supported OpenAI research in their collaboration, but are not limited only to that company. Currently Microsoft offers resources to other Azure users, allowing them to benefit from the results of the cooperation with OpenAI within a well-known, mature and safe cloud environment.

OpenAI vs Azure OpenAI Service: The two options in detail

OpenAI has released a simple and intuitive API to interact with their models. The requests are handled by OpenAI servers. Such an approach allows us to focus the team’s efforts on research rather than infrastructure and computing challenges. The models were developed as a text in, text out interface based on text prompts. The pattern provided in the request decides. The API and Python SDK openai package can be used.

As far as data security is concerned, the situation changed in April 2023. OpenAI ChatGPT had received heavy criticism from both users and experts concerning its further data usage and retention policies. Currently it is easier to ensure data are not being used to train the models by default. The OpenAI blog confirms that, having chosen to disable history for ChatGPT, new conversations are going to be retained for 30 days before permanent deletion only for monitoring abuse. The TLS communication of OpenAI API allows for in-transit encryption for customer-to-OpenAI requests and responses.

From the perspective of Microsoft Azure, having the early feedback lessons from people who are pushing the envelope and are on the cutting edge of this gives us a great deal of insight into and a head start on what is going to be needed as this infrastructure moves forward. Microsoft currently provides exciting offers that combine the main advantages of the two parties via Microsoft Azure OpenAI Service.

As we can read in the article by Eric Boyd, Corporate Vice President, AI Platform: The power of large-scale generative AI models with the enterprise promises customers have come to expect from our Azure cloud. The service enables the customers to run the same models as OpenAI, benefiting from the enterprise-scale features of the Azure cloud.

In terms of data security, apart from private networking and regional availability, Azure provides privacy and confidentiality when one entrusts them with data. Azure fulfills the following expectations:

Your data is not shared without your consent and is not mined by Microsoft Azure.
The data is processed only with the owner’s consent, and this regulation also applies to Microsoft-authorized subcontractors or subprocessors. The constraints and contractual privacy commitments are sustained.
The data is removed from the cloud once you leave the Azure service or your subscription expires.
The data is secured with state-of-the-art encryption mechanisms both at rest and in transit.

To sum up, if you need to ensure no further usage of your data, Microsoft, as a provider, creates a safe place for your data. If you already possess any data on Azure servers, the choice of service is low-hanging fruit. Additionally, their safety will not be affected in Azure OpenAI Service. As this is a very explicit declaration about safety, we recommend using this service especially when data vulnerability is of concern.

OpenAI models from both providers

There are several engines (corresponding to the models) as their capabilities (and customization opportunities through fine-tuning) differ. For both OpenAI API and Azure, there are a few categories of models which are ready-to-use or allow zero- or few-shot learning depending on the use case. One can always choose between alternative categories:

basic language models that are adaptable to content generation, summarization, and natural language to code translation.
the costs of few-shot learning are reduced – pre-trained language models that can be further trained on specific tasks or domains. This ultimately saves money and improves performance through shorter prompts (no need to put examples in the prompt – decreases token usage).
embedding model series – their responsibility is to convert the text into vector representation to make it possible to measure the relatedness of text strings and be used for further tasks such as search, classification, clustering, diversity management, recommendations and anomaly detection.
fine-tuned version (with text or code prefix for text and code generation respectively).

The models released by OpenAI have been organized into series corresponding to experiments and the respective tasks the engines perform and the engine categories listed above partially overlap with the series but represent a different quality.

Although one can find the entire list of models accessible via OpenAI API in their docs, we are going to limit our attention to the models that are also available in the Azure OpenAI service. Shortly, we will skip the Dall-E, Whisper, Moderation and some GPT-3.5 series models – they are not explicitly accessible in the latter service and are beyond the scope of this related post summarizing aspects of LLM API usage. We compare the context length and time range of the training data set (the maximum number of tokens if approximated to thousands). The details of the models are far beyond the scope of this post, yet we want to provide the basic intuition behind the engines that can be used.

The GPT-3 base models available are Ada, Babbage, Curie and Davinci – the ordering is not accidental and represents the capability vs speed trade-off with Ada as the fastest engine and Davinci as the most powerful. All models were trained on data from up to October 2019 and have a maximum of 2k tokens.

Even though all GPT-3 based engines are usually substituted with the more powerful GPT-3.5 series model, they are also served by fine-tuned alternatives (text prefix) and are available for embedding tasks or are fine-tunable with user data.

For Azure OpenAI Service, these are the models available for getting embeddings. The same holds true for OpenAI, but they also recommend a second generation embedding model, text-embedding-ada-002 (8k tokens) designed to replace the previous generation of embedding models at significantly lower cost.

If one is interested in a modern series of language models optimized for chat but suitable for traditional completion tasks as well, there are at least two served by both providers. The GPT-3.5 series has a flagship model gpt-3.5-turbo (4k tokens) – it is recommended as a cost-effective and the most capable model. Please take into account the fact that, though OpenAI API supports multiple countries (without explicitly mentioning the server regions) not all regions of Azure OpenAI Service enable this engine (for example Central US, in contrast to Eastern US or Western Europe).

Another alternative is GPT-4 of an even higher series that has recently aroused the interest of the public. At the time of writing, it is available in the API documentation as limited beta. In the case of this engine, the regional support in Azure is narrowed down to the Eastern and South Central regions in the case of the US and is not available in other regions. One should consider the fact that currently there is a waitlist to join to get access to this engine API in Open API. This engine is a large multimodal model with 8k and 32k maximum token variants and is meant to solve difficult problems and perform with advanced reasoning capabilities and provide results at a greater level of accuracy compared to the previous models thanks to its broad general knowledge.

Code generation (Codex)

The situation with LLMs understanding and generating code changed in March 2023. For users who are interested in using the engines that were commonly used, namely the code-davinci and code-cushman models (8k and 2k tokens respectively and the speed advantage of the latter), they are still available in Azure OpenAI Service. The same engines are deprecated in OpenAI API and the chat models are recommended for the task – the capabilities remain similar.

Having discussed the details of the various models above, we present a table to summarize the high-level specifications of the models.

Series	Latest model	Max tokens (by 1.024)	Training data
GPT-3	davinci	2k	Up to Oct 2019
	curie
	babbage
	ada
	code-cushman		N/A
	code-davinci-001	8k	Up to Jun 2021
GPT-3.5	text-davinci-002	4k
GPT-3.5	gpt-3.5-turbo(-*)	4k	Up to Sep 2021
GPT-4	gpt-4(-*)	8k, 32k	Up to Sep 2021

Table 1. OpenAI model sizes and training data overview. Source: OpenAI docs.

OpenAI vs Microsoft Azure OpenAI Service: similarities

At this point in our analysis, it is worth mentioning two main similarities.

API Compatibility

Co-development of the APIs with OpenAI and ensuring compatibility to make the transition and integration of the models seamless is the strategy behind Microsoft Azure OpenAI Service. Thus, switching between both services should be fairly easy and might simply involve an alternative configuration to your codebase. If the business use-case and cost optimization strategy is suitable, dynamic migration between providers comes into play.

Limitations and safety

OpenAI claims that the models may encode social biases (negative sentiment towards certain groups/stereotypes). These issues are addressable for each provider but in a different manner (please see the section about differences). Moreover, the models lack knowledge of events which took place after August 2020.

OpenAI vs Microsoft Azure OpenAI Service: differences

In addition to the characteristics of both solutions already described in this article, it is worth noting three basic differences.

Pricing

Pricing is one of the key differences between the providers. The Microsoft Azure calculator helps to estimate pricing of fine-tuning, as well as hosting a fine-tuned model, whereas with OpenAI API one is charged only for the tokens used in requests to that model. The pricing of single API requests (per 1k tokens) of both providers is alike..

Regional availability

Based on the OpenAI API usage policies, all customer data is processed and stored in the US. If the volume of your data is large enough, it is recommended to keeping the compute power close to the data in order to avoid moving it excessively is recommended, so as to avoid network transit overhead. This seems to suggest that the choice of Azure OpenAI Service would be best if you need elasticity or a particular region of availability.

Limitations and safety

From the perspective of an Azure OpenAI Service user, the recommended text-embedding-ada-002 model is available in Azure OpenAI, but the number of tokens used might be limited (see this thread).

The model used with OpenAI API can be supported with their Moderation model to classify and prevent content that is hateful, harmful, violent, sexual or discriminates against minorities. Put simply, it must be compliant with the usage policies of OpenAI. Similarly, the content requirements must be met in the case of its Azure counterpart, so the content policies are intended to improve the safety of the platform for both the input and output of the engines, and are always explicitly filtered.

For more information, see and compare the OpenAI safety standards and the Azure service Responsible AI section in the docs.

Open AI models: final thoughts

A multitude of available models allow for various implementations for your services or products and provide a dynamic approach to scaling and optimizing costs and performance. The adoption of OpenAI solutions creates exciting new business opportunities for various industries. On the other hand, there is an exciting world of alternatives, and our experts can guide you on your journey. Let us know if our AI development agency can help build your vision incorporating large language models!