Skip to main content

Plan and prepare to develop AI solutions on Azure

With the explosion of generative AI as a tool for developers, we are now tasked with the creation of comprehensive AI solutions, which can include machine learning models, AI-based services, and prompt engineering, all alongside custom code.

Azure exposes several services to developers for creating AI solutions. AI-102/103 aims to be a catalog and general introduction to the tools available for AI solutions.

What is AI?

"Artificial Intelligence" encompasses a wide range of software capabilities that enable applications to pantomime human behavior. AI used to be little more than rat nests of if-else statements, but machine learning has evolved drastically over the years. The field of machine learning has developed models of semantic relationships between datapoints, i.e., how do these characters relate to words, how do words relate to concepts. Machine learning models enable applications to appear to interpret input in various formats, reason about said input, and then generate a response.

The most common AI capabilities that we can integrate into applications via Azure include:

Generative AI and agents Leverages large language models (LLMs) to generate responses to prompts written in natural human language. Common uses are interactive chat applications, code generation, and content review.
Natural Language Processing The field of NLP makes use of statistic and semantic models to make sense of human language in text (i.e., emails, documents, social media posts, instant messages). Generative AI can perform NLP tasks, but NLP has specialized uses, one of the most easily understood by the general public being sentiment analysis.
Computer Speech Pretty straightforward: large language models have made it easier for computers to recognize speech and synthesize human language audio responses. Recent AI advances have made handling background noise, recognizing interruptions, and handling a broad swathe of languages and accents far easier.
Computer Vision Refers to the ability of AI applications and agents to accept, interpret, and process visual input from images, videos, and live camera streams.
Information Extraction The ability to combine generative AI models for language reasoning, natural language techniques for document understanding, and computer vision and speech for media analysis enables the development of AI solutions that can extract key information from documents, forms, images, recordings, and other kinds of content.

Microsoft Foundry

Foundry is the product Microsoft uses to package all of their AI services on Azure. When you spin up a Foundry resource on Azure, it acts as a container for one or more child projects, and each project contains the necessary items for AI services:

  • Models
    • Foundry offers a catalog of deployable LLMs from various providers, which can be connected to and interacted with through the Foundry project's endpoint and the Azure OpenAI endpoint (using their respective APIs and SDKs)
  • Agents
    • Named AI configurations that encapsulate a LLM, instructions to the LLM, and tools the LLM can access. Agents are considered "autonomous AI entities" and can automate tasks and collaborate with users and other agents. Foundry exposes an agent service that allows for accessing agents within the Foundry project.
  • Tools
    • Agents leverage tools to accomplish tasks, similar to how we do things as people. Tools can include things like web search and code interpreters, but can also include connections to services via MCP servers. Foundry provides its own toolkit for text analysis, speech recognition, content understanding, and more within a given Foundry project.
  • Knowledge
    • Every session with an agent starts from zero context. We can accomplish long-running or complex tasks by giving agents an encyclopedia to access so they have better ideas of what to do with various prompts from users. Foundry offers a tool, Foundry IQ, to create a single MCP-based knowledge connection.

Like everything in Azure, Foundry has its own mini-portal/dashboard where we access all the usual tooling & operations. Microsoft has also created a Foundry SDK for accessing the portal via code.

Foundry Tools

As AI has ramped up, we've already found several common use cases for AI that can enhance both new and existing software products. Foundry offers its own line of Tools for off-the-shelf solutions for common AI tasks:

Azure Language Provides models and various APIs for analyzing natural language via text and performing tasks such as text extraction, sentiment analysis, and summarization.
Azure Speech Provides APIs for implementing text to speech and speech to text for your application, including real-time live speech for conversational apps & agents.
Azure Translator Leverages LLMs to translate input between a large number of languages.
Azure Document Intelligence Leverage pre-built or your own custom models for extracting fields from complex documents (invoices, receipts, forms).
Azure Content Understanding A more flexible but complicated/expensive version of Document Intelligence. Azure Document Intelligence might replace a dedicated structured form reader, but Content Understanding is able to work with "flexible" inputs: forms with considerable variety, photos of forms, etc.

The branding of "Foundry Tools" is a replacement for the branding "Azure AI Services" and "Azure Cognitive Services" - traces of both will still be within Foundry Tools.

Developer tools and SDKs

This section is just a big sell for VS Code, Copilot, etc., but for VS Code, the Foundry team offers their own extension. You can interact with your Foundry resources and the projects within them via the VS Code extension.

They also offer a Microsoft Foundry SDK for interacting with your Foundry resources, and separately offer the Foundry Tools SDK which allows you to leverage tools within Foundry from your application.

Responsible AI

This is almost 100% on the test. Don't skip this. It's essentially Ethics 101 and it's free points.

  • Fairness
    • AI systems should treat all people fairly. Your models should make predictions without incorporating bias on gender, ethnicity, or other factors that may result in unfair advantage or disadvantage to specific groups of users.
  • Reliability and safety
    • Anything you do with AI, especially if it's used in perilous or fragile industries like automotive and health, must be subject to rigorous testing and deployment processes.
  • Privacy and security
    • AI systems should be secure and respect privacy. AI models rely on large volumes of data which may contain personal details that must be kept private.
  • Inclusiveness
    • "AI should bring benefits to all parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors."
  • Transparency
    • "Users should be made fully aware of the purpose of the system, how it works, and what limitations may be expected. When an AI application relies on personal data, such as a facial recognition system that takes images of people to recognize them; you should make it clear to the user how their data is used and retained, and who has access to it."
  • Accountability
    • "Although many AI systems seem to operate autonomously, ultimately it's the responsibility of the developers who trained and validated the models they use, and defined the logic that bases decisions on model predictions to ensure that the overall system meets responsibility requirements."