By Liu Jun, Spring AI Alibaba initiator and Apache Member
In October 2018, Alibaba Cloud released Spring Cloud Alibaba as open source to help Java developers build microservice applications based on the Spring Cloud programming model. In the six years since, large language models (LLMs) and AI have profoundly transformed all aspects of work and life. These changes are not limited to mobile screens but extend throughout the entire physical world. Against this backdrop, Alibaba Cloud has open sourced Spring AI Alibaba, which aims to help Java developers build AI applications. You are welcome to join the Spring AI Alibaba community and participate in co-creating a new physical world.
Spring AI Alibaba is the first open source AI application development framework for Java developers developed by Alibaba Cloud. Built on Spring AI, this framework is a best practice for implementing Alibaba Cloud Tongyi models and services in the Java AI application development field. Spring AI Alibaba provides solutions that integrate high-level AI-driven API abstractions and cloud-native infrastructure to help developers build AI applications. For more information about the code of Spring AI Alibaba, see alibaba/spring-ai-alibaba on GitHub. This article describes the core features of Spring AI Alibaba and uses the development of a flight booking assistant as an example to show how Spring AI Alibaba simplifies AI application development. The sample source code is available on GitHub and the official website of Spring AI Alibaba.
The release of Spring AI Alibaba comes against the backdrop of the rapid development of generative AI and LLMs over the past year. Model services are becoming increasingly prevalent in everyday life. However, only a few companies and algorithm engineers are training LLMs. Users and developers focus more on integrating generative AI capabilities into applications.
The most intuitive way to integrate AI models into applications is to call their APIs directly. For example, you can call the APIs of Alibaba Cloud Tongyi or OpenAI models. This method is flexible but costly because you need to understand the API specifications and learn various interaction patterns with AI models. If you are developing a Spring-based AI application, you can use tools such as RestTemplate to reduce the cost of calling the APIs. However, RestTemplate is a general-purpose HTTP client and provides no abstractions for common AI application development paradigms. Therefore, a framework that simplifies AI application development becomes essential for Java developers.
In this case, Spring released Spring AI as open source. This framework can be used to simplify the process of developing agent-based applications for Spring developers. Alibaba Cloud then released Spring AI Alibaba as open source, which is developed based on Spring AI and deeply integrated into models of Alibaba Cloud Model Studio and Tongyi to provide best practices. Spring AI Alibaba helps Java developers conveniently develop AI agent-based applications.
Alibaba Cloud and Spring have built a successful partnership over the years. The two parties have collaborated to create the Spring Cloud Alibaba microservice framework and provide comprehensive solutions. Spring Cloud Alibaba has become one of the most widely used open source microservice frameworks in the Chinese mainland. The ecosystem of Spring Cloud Alibaba has received more than 100,000 stars.
Alibaba Cloud aims to sustain the same level of collaboration with Spring in the development of Spring AI Alibaba. The Spring community primarily handles the atomic capabilities of agent-based application development and API abstractions. The Spring AI Alibaba community focuses on deep integration with Alibaba Cloud Tongyi models and cloud-native infrastructure. Spring AI Alibaba is also responsible for abstracting and implementing core capabilities closely tied to the deployment of agent-based applications, such as workflow orchestration, development toolsets, application evaluation, observability, configuration management, and traffic control. The Spring AI Alibaba project is poised for long-term, sustainable, and healthy development with the support of both the Spring and Alibaba Cloud communities.
Spring AI Alibaba provides the following core features that accelerate and simplify the development of Java agent-based applications:
Spring AI Alibaba is a framework designed specifically for Spring and Java developers to build agent-based applications. You can develop applications with Spring AI Alibaba in much the same way as you develop a standard Spring Boot application, with little extra learning effort.
Spring AI Alibaba provides comprehensive abstractions for common AI agent-based application development paradigms. The abstractions include atomic capabilities such as chat model integration, prompt templates, and function calling, and high-level abstractions such as agent orchestration and chat memory.
By default, Spring AI Alibaba is deeply integrated with Tongyi models. In addition, Spring AI Alibaba provides best practices for application deployment and O&M, including gateway management, configuration management, deployment, and observability.
The following sections describe the terms and API definitions of Spring AI Alibaba.
Spring AI provides API abstractions and adaptations to enable basic interactions with chat models. The development of an agent-based application is essentially a continuous interaction process with LLM services. During this process, applications provide semantically structured inputs to chat models. The chat models run inference and return the outputs.
Interactions with models can involve various types of inputs and outputs. For example, the inputs for the early versions of ChatGPT are text. In addition to text-based models, models that support images, videos, and audio are also available. Specific models support multimodal inputs. For example, you can use a combination of images and text as the inputs.
Spring AI provides complete abstractions for chat models that support input types such as text, images, and audio. As an application developer, you can directly call the APIs provided by Spring AI. The request parameters are used as the model inputs and the returned results are used as the model outputs. Spring AI also supports various communication modes such as synchronous, asynchronous, and streaming modes. In addition, Spring AI allows you to switch underlying model services in a transparent and seamless manner.
Prompts are inputs used to interact with models. The inputs are essentially compound structured data that contain information about multiple roles, including system, user, and assistant. Prompt management is a key step to simplify AI application development.
Spring AI provides an abstraction for managing prompt templates. You can define templates in advance and replace keywords within the templates when you use models.
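To make the idea of a prompt template concrete, the following is a minimal plain-Java sketch of placeholder substitution. It does not use the Spring AI template classes: the class name PromptTemplateSketch, the render method, and the {name} placeholder syntax are assumptions chosen for illustration only.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring AI: fills {name} placeholders in a template.
final class PromptTemplateSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces every {key} in the template with its value and fails fast on a missing key.
    static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = values.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Tell me a {adjective} joke about {topic}.";
        // Prints: Tell me a short joke about airports.
        System.out.println(render(template, Map.of("adjective", "short", "topic", "airports")));
    }
}
```

Failing on a missing key is a design choice of this sketch: a prompt that silently keeps an unfilled placeholder is usually harder to debug than an early exception.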
Spring AI also allows you to load prompt templates directly from resource files. Sample code:
In most cases, the results returned by LLMs are unstructured. However, data passed from upstream to downstream services in an application must be structured and well-defined. Therefore, Spring AI provides the feature for structured outputs. This feature can be used to automatically add data format information to the prompts, help the models understand the required output format, and then convert the results into outputs in the JavaBean format.
The preceding figure shows an example of how Spring AI simplifies the process of converting output formats, including adding format information to the input and converting the output format.
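The two steps of structured output, adding format information to the prompt and converting the reply into a typed object, can be sketched in plain Java as follows. This is a simplification under stated assumptions: the Flight record, the line-based "key: value" reply format, and all method names are hypothetical, and a canned string stands in for the model reply. It is not the converter API that Spring AI provides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Spring AI's converter API.
final class StructuredOutputSketch {

    // The typed object that downstream code wants to receive.
    record Flight(String number, String destination) {}

    // Step 1: append format instructions so the model knows the expected output shape.
    static String withFormat(String question) {
        return question
                + "\nAnswer with exactly two lines:"
                + "\nnumber: <flight number>"
                + "\ndestination: <city>";
    }

    // Step 2: convert the raw text reply into a typed object.
    static Flight parse(String reply) {
        Map<String, String> fields = new HashMap<>();
        for (String line : reply.strip().split("\\R")) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon).strip(), line.substring(colon + 1).strip());
            }
        }
        String number = fields.get("number");
        String destination = fields.get("destination");
        if (number == null || destination == null) {
            throw new IllegalArgumentException("Reply does not match the requested format: " + reply);
        }
        return new Flight(number, destination);
    }

    public static void main(String[] args) {
        System.out.println(withFormat("Which flight did the customer book?"));
        // A canned reply stands in for the model here.
        System.out.println(parse("number: FA123\ndestination: Hangzhou"));
    }
}
```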
Function calling is a typical paradigm in AI applications for interacting with models. Function calling helps models provide better responses to user questions. When you submit a question as the input, you can include available functions, including function names and descriptions. After the model receives the question and the functions, the model calls functions if necessary based on the inference of the question.
Spring AI standardizes processes such as function definition and registration and automatically injects functions into the prompt before a model request is initiated. When the model calls a function, Spring AI handles the function calling and sends the function calling result and the original question back to the model. The model then decides the next operation based on the updated input. During this process, multiple interactions are involved with the model. Each function calling represents a complete interaction.
The following section provides an example to show how a specific function is called during an interaction with a model.
The code in the preceding figure shows the interaction process between the application and the model without function calling. The input requires the model to calculate the square root of a number based only on its built-in inference capabilities. The model then generates an inaccurate result.
In this case, a function is defined to calculate the square root and is registered as a special function that can interact with the model by using the annotations provided by Spring AI. Sample code:
The code in the following figure shows the interaction between the application and the model after the function is called. To generate the final answer, the application interacts with the model in two rounds. The first round of interaction involves initiating the request to calculate the square root. In this round, the prompt includes tools, which contain the defined function. The model returns a special result that contains ToolExecutionRequest, which indicates that Spring AI is required to calculate the square root by calling the defined function. In the second round of interaction, Spring AI sends the original question along with the function calling result to the model. The model then generates the final answer.
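The two-round exchange described above can be sketched in plain Java with a stub in place of a real model. Everything in this sketch is an assumption made for illustration: the Reply record, the stubModel method, the "user:" and "tool:" message prefixes, and the tool name squareRoot are hypothetical and are not Spring AI classes or the ToolExecutionRequest wire format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the function calling loop; none of these types are Spring AI classes.
final class FunctionCallingSketch {

    // A model reply is either a final answer or a request to run a named tool.
    record Reply(String answer, String toolName, double toolArgument) {
        boolean isToolRequest() {
            return toolName != null;
        }
    }

    // Stand-in for a real model: it requests the tool first and answers once a tool result is present.
    // It expects the first message to contain the number to process.
    static Reply stubModel(List<String> messages) {
        for (String message : messages) {
            if (message.startsWith("tool:")) {
                return new Reply("The square root is " + message.substring("tool:".length()), null, 0);
            }
        }
        String digits = messages.get(0).replaceAll("[^0-9]", "");
        return new Reply(null, "squareRoot", Double.parseDouble(digits));
    }

    // Sends the question, runs each requested tool, and feeds the result back until a final answer arrives.
    static String ask(String question, Map<String, Function<Double, Double>> tools) {
        List<String> messages = new ArrayList<>(List.of("user:" + question));
        Reply reply = stubModel(messages);                 // round 1: question plus available tools
        while (reply.isToolRequest()) {
            Function<Double, Double> tool = tools.get(reply.toolName());
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + reply.toolName());
            }
            messages.add("tool:" + tool.apply(reply.toolArgument()));
            reply = stubModel(messages);                   // round 2: question plus tool result
        }
        return reply.answer();
    }

    public static void main(String[] args) {
        Function<Double, Double> sqrt = Math::sqrt;
        // Prints: The square root is 45.0
        System.out.println(ask("What is the square root of 2025?", Map.of("squareRoot", sqrt)));
    }
}
```

The loop, rather than a fixed pair of calls, reflects that a model may request more than one function before it produces the final answer.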
Retrieval-augmented generation (RAG) is another commonly used paradigm for developing agent-based applications. RAG is conceptually similar to function calling as a method for applications to help models in inference and answering questions. However, the interaction process of RAG differs from that of function calling.
As shown in the preceding figure, the interaction of RAG and a model is generally divided into two parts: offline and runtime. The offline part involves the process of vectorizing domain-specific data and storing the vectorized data in a vector database. During runtime, Spring AI retrieves data from the vector database to generate a prompt enriched with more contextual information than the original question. The context-enhanced prompt is then provided to the model, which generates a response based on the user question, the context, and the built-in inference capabilities of the model.
Spring AI provides abstractions for loading, analyzing, and vectorizing data and storing vectorized data offline, as well as for data retrieval and prompt enhancement during the runtime.
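The offline and runtime parts of RAG can be illustrated with a small plain-Java sketch. It makes strong simplifying assumptions: word counts stand in for embeddings from a real embedding model, an in-memory list stands in for a vector database, and only the single best match is added to the prompt. The class and method names are hypothetical and are not Spring AI APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of RAG: index documents offline, retrieve and enrich the prompt at runtime.
final class RagSketch {

    private final List<String> documents = new ArrayList<>();
    private final List<Map<String, Integer>> vectors = new ArrayList<>();

    // Toy "embedding": lower-cased word counts. A real system would call an embedding model.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
            normA += entry.getValue() * entry.getValue();
        }
        for (int value : b.values()) {
            normB += value * value;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Offline part: vectorize a document and store it.
    void index(String document) {
        documents.add(document);
        vectors.add(embed(document));
    }

    // Runtime part: find the most similar document and prepend it to the question as context.
    String augment(String question) {
        if (documents.isEmpty()) {
            return question;
        }
        Map<String, Integer> query = embed(question);
        int best = 0;
        double bestScore = cosine(query, vectors.get(0));
        for (int i = 1; i < vectors.size(); i++) {
            double score = cosine(query, vectors.get(i));
            if (score > bestScore) {
                best = i;
                bestScore = score;
            }
        }
        return "Context: " + documents.get(best) + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        RagSketch rag = new RagSketch();
        rag.index("Checked baggage allowance is 23 kg per passenger.");
        rag.index("Cancellations made within 48 hours of departure incur a fee.");
        System.out.println(rag.augment("Is there a fee for cancellations?"));
    }
}
```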
For information about how to use Spring AI Alibaba to develop a generative AI application, visit https://sca.aliyun.com/ai/get-started/
You can use Spring AI Alibaba to develop applications in much the same way as you use Spring Boot. You only need to add the spring-ai-alibaba-starter dependency and inject the ChatClient bean into your application to enable conversational queries.
1. Add the dependency.
<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>spring-ai-alibaba-starter</artifactId>
    <version>1.0.0-M2</version>
</dependency>
Note: The Spring AI-related dependency package has not been published to the Maven Central Repository. If a dependency parsing error, such as an error related to the spring-ai-core dependency, occurs, you must add the following repository configurations to the pom.xml file of your project:
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>
2. Specify an API key in the application.yaml file. You can create an API key by using the free quotas provided by Alibaba Cloud Model Studio.
spring:
  ai:
    dashscope:
      api-key: ${AI_DASHSCOPE_API_KEY}
3. Inject the ChatClient agent proxy into your application.
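import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Spring injects a ChatClient.Builder; build() creates the client used by this controller.
    public ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Sends the "input" request parameter to the model and returns the reply text.
    @GetMapping("/chat")
    public String chat(String input) {
        return this.chatClient.prompt()
                .user(input)
                .call()
                .content();
    }
}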
In this example, an application closer to daily use cases is created to show the powerful capabilities of Spring AI Alibaba in developing agent-based applications.
The application is a flight booking assistant developed by using Spring AI Alibaba. It is designed to help customers with operations such as booking, changing, and canceling flight tickets and to answer their questions. The application is built to perform the following operations:
• Engage in conversations with users and understand the natural language expressions of users based on AI LLMs.
• Support multi-turn continuous conversations and understand user intent within context.
• Understand and strictly adhere to flight booking-related terms and regulations, such as aviation regulations and rules for ticket refunds, changes, and cancellations.
• Call tools to complete tasks if necessary.
The following figure shows the architecture of the flight booking assistant based on the design intent of the application.
Based on the architecture, the following operations are required to deploy the flight booking assistant:
Develop a common Java application by using Spring Boot. The application can be used to continuously receive user questions and provide answers related to flight booking to users. The application is an agent-based application because it interacts with AI. This way, AI is used to help the application understand user questions and make decisions for users.
The following figure shows the simplified architecture of the flight booking assistant after AI model service is integrated.
The preceding architecture shows that the AI model first understands user requests related to flight booking. The model then decides the next operation to perform and drives business processes. However, the following challenges may occur: A general-purpose LLM cannot accurately and reliably resolve flight booking-related issues. Decisions made by models may not be reliable. If a user initiates a request to change a flight ticket, the model can accurately understand the user intent. However, the model may not know whether the user meets the refund or change rules because each airline may have different policies for changing flight tickets. In addition, the model may not be familiar with the regulations for service fees for changing flight tickets. In scenarios in which economic disputes and legal risks may occur, an AI model must know all the details of the ticket changing policies and confirm whether the user information complies with each rule before the model decides whether to approve the ticket changing request.
However, the preceding requirements cannot be met by using only the AI model. In this case, RAG is applied. Knowledge about changing and canceling flight tickets can be injected into both the application and the AI model based on RAG. This helps the AI model make decisions based on the specified rules.
The following figure shows the augmented architecture of the flight booking assistant after RAG is used.
After RAG is used, the application becomes an intelligent expert in the flight booking field like a trained customer service representative. The application can engage in user-friendly conversations and guide user behaviors based on the rules.
AI agents can help applications understand user requirements and make decisions but cannot be used to execute decisions like applications. The execution of decisions is still implemented by applications. This rule also applies to traditional applications. Intelligent and pre-orchestrated applications need to call functions and modify database records to implement data persistence.
Spring AI allows a model to execute a decision by calling a specific function. In this example, the application can call a function to change or refund a flight ticket and write user data to databases. This is a best practice of the function calling feature.
The following figure shows the architecture of the flight booking assistant after the function calling feature is used.
LLMs are stateless and can only view the content of the current round of conversation. Therefore, to implement the multi-round conversation capability, the application must retain the context of the previous round of conversation each time and send the context along with the latest question as a prompt to the model. In this case, conversation memory provided by Spring AI Alibaba can be added to the application to maintain the conversation contexts conveniently.
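Because the model itself keeps no state, the application has to replay earlier turns with each request. The following plain-Java sketch shows one simple way to do that with a bounded message window. The class name, the "role: text" message format, and the fixed window size are assumptions for illustration and do not describe the chat memory implementation in Spring AI Alibaba.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of conversation memory: keep the latest messages and replay them in each prompt.
final class ChatMemorySketch {

    private final Deque<String> history = new ArrayDeque<>();
    private final int maxMessages;

    ChatMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Records one message and evicts the oldest ones beyond the window.
    void add(String role, String text) {
        history.addLast(role + ": " + text);
        while (history.size() > maxMessages) {
            history.removeFirst();
        }
    }

    // Builds the prompt for the next request: retained history followed by the new question.
    String buildPrompt(String question) {
        StringBuilder prompt = new StringBuilder();
        for (String message : history) {
            prompt.append(message).append('\n');
        }
        return prompt.append("user: ").append(question).toString();
    }

    public static void main(String[] args) {
        ChatMemorySketch memory = new ChatMemorySketch(4);
        memory.add("user", "My reservation number is 101 and my name is Jun.");
        memory.add("assistant", "Thanks, I found your booking.");
        System.out.println(memory.buildPrompt("Can I change it to tomorrow?"));
    }
}
```

Bounding the window keeps the prompt within the context limit of the model, at the cost of forgetting the oldest turns.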
In summary, the following core features of Spring AI Alibaba are used for the flight booking assistant:
Spring AI Alibaba not only provides abstractions for the preceding atomic capabilities but also provides ChatClient, a higher-level API abstraction for AI agents. ChatClient allows you to use a fluent API to combine multiple components into a cohesive AI agent.
You can use ChatClient to directly declare the features that the flight booking assistant uses to interact with the AI model. The features include prompt management, RAG, chat memory, and function calling. This way, the application is instantiated as an AI agent proxy object. Sample code:
this.chatClient = modelBuilder
        .defaultSystem("""
                You are a customer service agent for Funnair airline. You need to answer user questions in a friendly, helpful, and pleasant manner.
                You are interacting with customers by using an online chat system.
                After a customer provides information to book or cancel a flight ticket, you must provide answers based on the following requirements:
                Obtain the following information from the customer: reservation number and customer name.
                Check the conversation history to obtain the preceding information before you ask questions to the customer.
                Before you change a flight booking, make sure that the change conforms to the related rules.
                If service fees are required to change a flight booking, you must obtain user consent before you change the flight booking.
                Use the provided features to obtain flight booking details, change flight bookings, and cancel flight bookings.
                Call functions to help with decision making if necessary.
                Respond in Chinese.
                Today is {current_date}.
                """)
        .defaultAdvisors(
                new PromptChatMemoryAdvisor(chatMemory), // Chat memory
                // new VectorStoreChatMemoryAdvisor(vectorStore), // Alternative: chat memory backed by the vector store
                new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()), // RAG
                new LoggingAdvisor())
        .defaultFunctions("getBookingDetails", "changeBooking", "cancelBooking") // Function calling
        .build();
This way, you only need to inject ChatClient into the application beans to implement intelligent capabilities without the need to specify details for interacting with the LLM.
The following figure shows an example of using the flight booking assistant.
For information about the source code of the sample project, see alibaba/spring-ai-alibaba on GitHub.
Spring AI Alibaba aims to provide an open source AI framework that is deeply integrated with the comprehensive open source ecosystem of Alibaba Cloud to help Java developers build AI-native application architectures. The future development of the project focuses on the following aspects:
• Prompt template management
• Event-driven AI applications
• Vector databases
• Deployment modes such as function computation
• Observability construction
• Development capabilities of AI agent nodes, including content moderation, throttling, and multi-model switching
• Developer toolsets
• Project official website: https://sca.aliyun.com/ai
• Source code and examples: https://github.com/alibaba/spring-ai-alibaba