
Use Dify to create an AI-powered Q&A assistant

Last Updated: Nov 01, 2024

Dify is a platform that can integrate enterprise or individual knowledge bases with large language model (LLM) applications. You can use Dify to design customized AI-assisted Q&A solutions and apply the solutions to your business. This helps facilitate business development and management. Container Service for Kubernetes (ACK) is a service that can instantly and smoothly scale your business on demand. You can use ACK to facilitate application development.

Overview

Sample AI Application Customized by Using Dify

image

Sample AI Web Application

image

To create a customized AI-powered Q&A assistant, perform the following steps:

  1. Install ack-dify: Create an ACK cluster and install the ack-dify component in the cluster.

  2. Create an AI-powered Q&A assistant: Log on to the Dify platform to create an AI-powered Q&A assistant and use a website to expose the assistant.

  3. Customize the AI-powered Q&A assistant: Prepare a knowledge base for the Q&A assistant. This way, the assistant can provide professional answers to questions based on the knowledge base.

image

The following figure shows how Dify interacts with an ACK cluster.

image

Introduction to Dify

Dify is an open source platform for LLM application development. Dify combines Backend as a Service (BaaS) and LLM Operations (LLMOps) to streamline the development of generative AI applications. Dify makes AI application orchestration and data operations easy for both developers and non-technical users. Dify is pre-built with the key technology stacks required for building LLM applications. This eliminates the need to redevelop solutions and technologies that already exist, allowing you to focus on business innovation and requirements.

The following figure shows the architecture of Dify.

image

The architecture of Dify includes the following key parts:

  1. Key components required by LLM applications. Dify is integrated with the key components required by LLM applications. With these integrations, Dify supports a wide variety of models and provides a user-friendly prompt orchestration interface, a high-performance Retrieval-Augmented Generation (RAG) system, and a customizable agent framework.

  2. An intuitive interface to visualize orchestration and operations. Dify provides an intuitive interface to support visualized prompt orchestration, operations, and management. This greatly accelerates AI application development and allows developers to quickly integrate LLMs into their AI applications and continuously maintain and optimize the applications.

  3. A set of out-of-the-box application templates and orchestration frameworks. Dify provides out-of-the-box application templates and orchestration frameworks that developers can use to quickly develop generative AI applications powered by LLMs. In addition, when Dify is deployed on ACK, your applications can scale smoothly on demand as your business grows.

Dify is a comprehensive, flexible, and easy-to-use platform for developing and deploying generative AI applications.

1. Install ack-dify

1.1 Set up the environment

  1. Create an ACK Pro cluster that runs Kubernetes 1.22 or later. For more information, see Create an ACK managed cluster and Upgrade clusters.

    To ensure cluster stability, we recommend that you reserve at least 2 vCPUs and 4 GB of memory for the Kubernetes system of the cluster and install the Container Storage Interface (CSI) plug-in.

    When you create the cluster, select Dynamically Provision Volumes by Using Default NAS File Systems and CNFS, Enable NAS Recycle Bin, and Support Fast Data Restore.

    image

  2. Connect a kubectl client to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

1.2 Install ack-dify

To deploy Dify in the ACK cluster, perform the following steps to install the ack-dify component:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters. Click the name of the cluster you created. On the cluster details page, click the callouts in the following figure in sequence to install ack-dify.

    The Application Name and Namespace parameters are optional. If you do not specify them, the default application name ack-dify and the default namespace dify-system are used. Click Next (callout ⑥) and then click Yes in the Confirm message. Select the latest version for Chart Version, and then click OK to install ack-dify.

    image

  2. Wait 1 minute and run the following command on your on-premises machine. If the output shows that all pods in the dify-system namespace are in the Running state, ack-dify is installed in the cluster.

    kubectl get pod -n dify-system

    If a pod is in the Pending state, the reason may be that the persistent volume claim (PVC) required by ack-dify does not exist in the cluster. To resolve this issue, you must manually create a default Container Network File System (CNFS) file system and use a StorageClass of the NAS type. For more information, see Use CNFS to manage NAS file systems (recommended). For more information about how to troubleshoot pod issues, see Pod troubleshooting.

2. Create an AI-powered Q&A assistant

2.1 Access Dify

  1. Enable access to ack-dify over the Internet.

    If you use the production environment, we recommend that you enable access control to ensure data security.

    image

    After you update the ack-dify Service, an IP address is displayed in the External IP column. To access Dify, use your browser to access the IP address.

    image

  2. Create a Dify account.

    Access the external IP address of the ack-dify Service. On the page that appears, follow the on-screen instructions to set up an administrator account and type an email address, username, and password to register with Dify.

    image

2.2 Create an AI-powered Q&A assistant

  1. Use the external IP address of the ack-dify Service and your browser to log on to Dify.

  2. Add an AI model and configure an API key for the model. In this example, a Tongyi Qianwen model is added, as shown in the following figure.

    Tongyi Qianwen provides a free quota. After the free quota is exhausted, you are charged based on the tokens that you consume. Compared with deploying a self-managed LLM, it is more cost-effective to purchase model services from a model provider.

    1. To obtain an API key, choose Username > Settings > Model Provider > TONGYI (Setup) > Get your API key from AliCloud.

    2. Specify the API key you obtained in the input box in the following figure and click Save.

    image

  3. Create a general-purpose AI-powered Q&A assistant.

    Choose Studio > Create from Blank. Specify a name and description for the assistant. Use the default settings for other parameters.

    image

2.3 Test the AI-powered Q&A assistant

You can type questions on the web page to test whether the assistant can provide answers. The following figure provides an example. The general-purpose assistant supports only simple conversations and cannot answer domain-specific questions such as "What is Dify?".

image

3. Customize the AI-powered Q&A assistant

3.1 Prepare a knowledge base

After you perform the preceding steps, an AI-powered Q&A assistant is created. If you want the assistant to provide professional answers to questions such as "What is Dify?", you must prepare a relevant knowledge base and integrate the knowledge base with the assistant.

To simplify the configuration, a corpus file named dify_doc.md is provided in this example. To create a knowledge base and upload the corpus file to the knowledge base, perform the following steps:

  1. Upload dify_doc.md to the knowledge base.

    Choose Knowledge > Create Knowledge > Import from file > Browse > Next.

    image

  2. Click Next and then follow the on-screen instructions to configure the parameters on the Text Preprocessing and Cleaning page.

    You can use the default settings. The knowledge base automatically cleans the corpus, chunks the text in the corpus, and creates an index for the corpus. This makes it easy for the assistant to search the knowledge base when generating answers.

The following steps show how the dify_doc.md corpus file was prepared.

Prepare a Dify knowledge base. Dify allows you to upload individual files in the following formats: .txt, .html, .md, and .pdf.

In this example, the corpus files are the .md files in the Dify documentation Git repository. You need to pull the .md files from the repository and merge them into one file. Perform the following steps:

  1. Clone the Git repository. Run the following command to clone the Dify documentation repository to your on-premises machine.

    git clone https://github.com/langgenius/dify-docs.git
  2. Process the corpus files. You can use the following Python code to merge the .md files in the repository into one file and chunk the text to prepare it for vectorization.

    from langchain_text_splitters import MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
    import os
    
    
    def merge_markdown_files_from_directory(root_dir, output_file):
        """
        merge_markdown_files_from_directory function
          1. Description: merges all .md files in the specified directory into one output file. 
          2. Parameters:
            root_dir: the root directory. 
            output_file: the path of the output file. 
          3. Steps:
            Use the os.walk() method to traverse the root directory and its subdirectories. 
            If the suffix of a file is .md, add the path of the file to the markdown_files list. 
            Open the output file, read data from each file in the markdown_files list, and write the data to the output file. 
        """
        markdown_files = []
        for root, dirs, files in os.walk(root_dir):
            for file in files:
                if file.endswith('.md'):
                    markdown_files.append(os.path.join(root, file))
    
        with open(output_file, 'w', encoding='utf-8') as outfile:
            for file_path in markdown_files:
                with open(file_path, 'r', encoding='utf-8') as infile:
                    outfile.write(infile.read())
                    outfile.write('\n\n')  # Separate files so that Markdown headers at file boundaries are parsed correctly.
    
                   
    def process_and_write_markdown(file_path: str, headers_to_split_on: list, chunk_size: int, chunk_overlap: int, output_file: str):
        """
        process_and_write_markdown function
          1. Description:
            Chunks the text in a .md file based on the specified headers and chunking rules, and then writes the chunked text into the output file. 
          2. Parameters:
            file_path: the path of a .md file. 
            headers_to_split_on: a list of (header marker, header name) tuples. Example: [("#", "Header 1"), ("##", "Header 2")]. 
            chunk_size: the maximum size of a chunk. 
            chunk_overlap: the size of the overlapping part between two consecutive chunks. 
            output_file: the path of the output file. 
          3. Steps:
            Read data from the .md file in the path specified by the file_path parameter. 
            Create a MarkdownHeaderTextSplitter to chunk the text in the .md file based on the headers_to_split_on tuple. 
            Create a RecursiveCharacterTextSplitter to further chunk the text based on the chunk_size and chunk_overlap parameters. 
            Open the output file and write the metadata and content of the chunks to the output file. 
        """
        try:
            # Read data from the .md file in the path specified by the file_path parameter. 
            with open(file_path, "r", encoding="utf-8") as doc:
                markdown_content = doc.read()
    
            # Create a MarkdownHeaderTextSplitter to chunk the text in the .md file based on the headers_to_split_on tuple. 
            splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on, strip_headers=True)
            md_splits = splitter.split_text(markdown_content)
    
            # Create a RecursiveCharacterTextSplitter to further chunk the text based on the chunk_size and chunk_overlap parameters. 
            text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
            splits = text_splitter.split_documents(md_splits)
    
            # Open the output file and write the metadata and content of the chunks to the output file. 
            with open(output_file, "w", encoding="utf-8") as f:
                for line in splits:
                    f.write(str(line.metadata))
                    f.write("\n")
                    f.write(line.page_content)
                    f.write("\n\n\n\n")
    
        except FileNotFoundError:
            raise FileNotFoundError(f"The file {file_path} does not exist.")
    
    
    # Example:
    if __name__ == "__main__":
        """
         1. Set the following parameters:
          root_directory: the root directory. 
          merged_file_path: the path of the .md file that contains the merged content. 
          output_file: the path of the output file. 
          headers_to_split_on: a list of headers. 
          chunk_size and chunk_overlap: the chunk size and the size of the overlapping part between two consecutive chunks. 
        2. Steps:
          Call the merge_markdown_files_from_directory function to merge all .md files into one .md file in the path specified by the merged_file_path parameter. 
          Call the process_and_write_markdown function to process the file that contains the merged content and write the data of the file to the output file. 
          The script performs the preceding steps to merge multiple .md files, process the file content, and export the merged content to an output file. 
        """
        
        # The directory to be processed.
        root_directory = 'path/to/dify-docs/en'
        # The path of the file that contains the merged content.
        merged_file_path = './merged_markdown.md'
        # The path of the output file after data cleaning.
        output_file = './dify_doc.md'
        
        merge_markdown_files_from_directory(root_directory, merged_file_path)
        headers_to_split_on = [
            ("#", "Header 1"),
            ("##", "Header 2"),
            ("###", "Header 3"),
        ]
        chunk_size = 500
        chunk_overlap = 50
        process_and_write_markdown(merged_file_path, headers_to_split_on, chunk_size, chunk_overlap, output_file)
    
  3. Upload the dify_doc.md corpus file to the knowledge base.
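The effect of the chunk_size and chunk_overlap parameters used in the script above can be illustrated with a stdlib-only sketch. This is a simplification: RecursiveCharacterTextSplitter additionally prefers to split on paragraph and sentence boundaries, while this sketch splits at fixed character offsets.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into chunks of at most chunk_size characters, where
    consecutive chunks share chunk_overlap characters of context."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# With the values from the script above (chunk_size=500, chunk_overlap=50),
# a 1,200-character text yields three chunks, and each pair of consecutive
# chunks shares 50 characters.
chunks = chunk_text("a" * 1200, chunk_size=500, chunk_overlap=50)
print(len(chunks))  # 3
```

The overlap preserves context that would otherwise be lost at chunk boundaries, which improves retrieval quality at the cost of a slightly larger index.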

3.2 Orchestrate and publish the AI-powered Q&A assistant

Configure prompts for the assistant and configure the knowledge base as the context.

  1. Configure prompts: Copy the following content to the Instructions text editor. Prompts provide instructions and constraints for the assistant when it generates answers. Prompts help improve the accuracy of answers provided by the assistant.

    You will act as Dify's AI assistant, dedicated to answering customers' questions about Dify products and their features. Your responses should be based on the existing knowledge base to ensure accuracy. If a question is beyond your knowledge, please honestly inform them that you do not know the answer, in order to maintain the integrity of the information. Please communicate in a friendly and warm tone, and feel free to use emoticons appropriately to enhance the interactive experience.
  2. Configure the knowledge base as the context: Click Add in the Context parameter. Select the knowledge base you created and click Add. This allows the assistant to answer questions based on the knowledge base.

  3. In the upper-right corner of the page, choose Publish > Update to save the configurations and make the configurations take effect.

The following figure provides an example.

image

3.3 Test the assistant

After you integrate a knowledge base into the assistant, the assistant can provide more professional and accurate answers.

image

Summary

Dify provides the following key features for individual developers and enterprises:

  • Comprehensive support for LLMOps: Dify provides comprehensive O&M capabilities for existing AI applications. For example, Dify supports real-time monitoring based on logs and metrics and provides continuous optimizations for prompts, datasets, and models based on production data and user feedback.

  • RAG engines: Dify provides end-to-end RAG pipelines to perform data operations, including document importing and information searches. RAG pipelines simplify data preparation and can directly process common file formats, such as PDF and PPT.

  • Agent: Dify allows developers to define agents based on function calling in LLMs or the ReAct paradigm. You can add built-in or custom tools to agents. Dify provides more than 50 built-in tools.

  • Workflow orchestration: Dify provides a visualized canvas where developers can drag and connect different components to quickly build complex AI workflows. This method does not require complex coding and makes application development easy and intuitive for developers.

  • Observability: Dify tracks the quality and costs of LLM applications through dashboards and provides assessments. You can use these capabilities to monitor your LLM applications.

  • Enterprise features (SSO and access control): Dify helps mitigate the risks of data leaks and data corruption for enterprises and organizations to ensure data security and service continuity.
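The retrieval step at the core of a RAG engine can be sketched with stdlib-only code. This is an illustration only, not Dify's implementation: a production RAG engine uses embedding models and a vector index instead of the toy bag-of-words vectors used here.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency vector over lowercase word tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list, top_k: int = 1) -> list:
    # Rank knowledge-base chunks by similarity to the query; the top
    # chunks are then passed to the LLM as context for answering.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

knowledge_base = [
    "Dify is an open source platform for LLM application development.",
    "ACK is a managed Kubernetes service for running containerized workloads.",
]
print(retrieve("What is Dify?", knowledge_base))  # the Dify chunk ranks first
```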

Apply the AI-powered Q&A assistant to the production environment

To apply the AI-powered Q&A assistant to the production environment of an enterprise or individual developer, use the following methods:

  1. Publish the assistant by using a website.

    After you create an AI application by using Dify, you can publish the application by using a web application that is accessible over the Internet. The application works based on the prompts and configurations that you orchestrated. For more information, see Publish as a Single-page Web App.

  2. Expose access to the Dify API.

    Dify offers an API that complies with the BaaS concept. Developers can use the powerful capabilities of LLMs directly from the frontend without the need to focus on complex backend architectures and deployment processes. For more information, see Developing with APIs.

  3. Perform custom development based on frontend components.

    If you develop new products from scratch or are in the product prototype design phase, you can quickly launch AI sites by using Dify. For more information, see Re-develop Based on Frontend Templates.

  4. Embed the AI application in the websites of enterprises or individual developers.

    Dify allows you to embed your AI applications in websites. You can build and embed AI customer service chatbots and Q&A chatbots in the official websites of enterprises within minutes. For more information, see Embedding In Websites.
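The API approach in the list above can be sketched with stdlib-only code that builds (but does not send) a request to the Dify chat-messages endpoint. The endpoint path, payload fields, and placeholder values here are assumptions to verify against the API reference exposed by your Dify deployment.

```python
import json
import urllib.request

def build_chat_request(api_base: str, api_key: str, query: str, user: str):
    """Build (but do not send) a POST request for the Dify chat-messages API.

    api_base is the address of your Dify deployment, for example the
    external IP address of the ack-dify Service. api_key is an app API
    key created in Dify. Confirm the endpoint path and payload fields
    against the API reference of your Dify version.
    """
    payload = {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",  # "streaming" returns server-sent events
        "user": user,                 # a stable identifier for the end user
    }
    return urllib.request.Request(
        url=f"{api_base.rstrip('/')}/v1/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# urllib.request.urlopen(req) would send the request; the hostname and
# key below are placeholders.
req = build_chat_request("http://<EXTERNAL-IP>", "app-xxxxxxxx", "What is Dify?", "user-1")
print(req.full_url)
```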

Example of embedding AI applications in websites

This section provides an example on how to embed an LLM application in the website of an enterprise or individual developer.

If you use the production environment, we recommend that you enable access control to ensure data security.

  1. Enable access to ack-dify over the Internet. Use the external IP address of the ack-dify Service to access Dify. For more information, see Access Dify.

    image

  2. Build a simple web application to debug your AI Q&A assistant.

    Perform the following steps to deploy a web application in the ACK cluster. You can embed an LLM application developed by using Dify in the web application.

    1. Obtain the code of the assistant from Dify.

      Select a method to embed the chat application in your website, as shown in the following figure.

      image

    2. Create a Deployment in the ACK cluster to run your web application and create a Service to expose the web application.

      The following YAML template provides an example on how to deploy a static HTML page on an NGINX server.

      1. Log on to the ACK console. On the ConfigMap page, select the default namespace. Click Create from YAML. Copy the following YAML template to the code editor and replace the code block (see the callout in the following figure) with the code of the assistant you obtained from Dify in the previous step.

        image

        Sample code:

        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: web-deployment
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: web
          template:
            metadata:
              labels:
                app: web
            spec:
              containers:
              - name: web
                image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
                ports:
                - containerPort: 80
                volumeMounts:
                - name: web-content
                  mountPath: /usr/share/nginx/html
              volumes:
              - name: web-content
                configMap:
                  name: web-config
          
        
        ---
        apiVersion: v1
        kind: Service
        metadata:
          name: web-service
        spec:
          selector:
            app: web
          ports:
            - protocol: TCP
              port: 80
              targetPort: 80
          type: LoadBalancer
        
        
        ---
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: web-config
        data:
          index.html: |
            <!DOCTYPE html>
            <html lang="en">
            <head>
                <meta charset="UTF-8">
                <meta name="viewport" content="width=device-width, initial-scale=1.0">
                <title>The simplest website service</title>
                <style>
                    #dify-chatbot-bubble-button {
                        background-color: #1C64F2 !important;
                    }
                </style>
            </head>
            <body>
                <h1>Welcome to my website!</h1>
            
            
                <script>
                 window.difyChatbotConfig = {
                  token: 'W1b8mRL******yiD6',
                  baseUrl: 'http://8.154.XX.XX'
                 }
                </script>
                <script
                 src="http://8.154.XX.XX/embed.min.js"
                 id="W1b8mRL******yiD6"
                 defer>
                </script>
            </body>
            </html>
      2. After the web application is deployed, the following page appears.

        image

      3. Enable access to the web application over the Internet.

        If you use the production environment, we recommend that you enable access control to ensure data security.

        image

        After you update the web-service Service, an IP address is displayed in the External IP column. To access the web application, use your browser to access the IP address.

        image

        Important
        • If you want to allow other devices to access the web application, make sure that the firewall or security group of the cluster allows data transfer on port 80. For more information, see Add a security group rule.

        • To prevent cross-site scripting (XSS) attacks, make sure that no malicious code is embedded in the application code and third-party code. You can add extensions or modify the sample code based on your business requirements.

      4. View results.

        image

Continuous improvement

To further improve and optimize LLM applications, consider learning about RAG. This can help you better understand the capabilities and use cases of large models, and how to optimize their performance.

Billing

You are charged a cluster management fee for the ACK Pro cluster and fees for the other cloud resources that you use. This solution may use the following cloud services: Elastic Compute Service (ECS), Classic Load Balancer (CLB), Elastic IP Address (EIP), and File Storage NAS (NAS). If these cloud services are used, you are charged based on their billing rules. For more information, see Billing.