In the experience center of the Search Development Workbench console, you can try out services such as document parsing, image text extraction, and document splitting through a visual interface. This helps you quickly evaluate whether the services meet your business requirements.
Features
The following table describes the services that are provided by the experience center.
Category | Description |
Document Content Parsing | Document Content Parsing Service (ops-document-analyze-001) provides a general-purpose document parsing service. You can use this service to extract logical structures, such as titles and paragraphs, from unstructured documents that contain text, tables, and images, and generate structured data. |
Image Content Parsing | |
Document Slice | Common Document Slicing Service (ops-document-split-001) provides a general-purpose text splitting service. You can use this service to split structured data in the HTML, MARKDOWN, and TXT formats by paragraph, by semantics, or by specific rules. You can also extract code, images, and tables from rich text. |
Text vectorization | |
Text sparse vectorization | This service converts text data into sparse vectors, which occupy less storage space. Sparse vectors can express keywords and information about frequently used terms. You can combine sparse and dense vectors in a hybrid search to improve retrieval performance. OpenSearch text sparse vectorization service-generic (ops-text-sparse-embedding-001) provides a text vectorization service that supports more than 100 languages. The input text can be up to 8,192 tokens in length. |
Query Analysis | This service analyzes queries by using LLMs and natural language processing (NLP) capabilities to understand user intent, extend similar questions, and convert natural language questions into SQL statements. This improves the performance of conversational search in retrieval-augmented generation (RAG) scenarios. Query Analysis Service 001 (ops-query-analyze-001) supports general-purpose, LLM-based query analysis to understand user intent and extend similar questions. |
Sorting Service | |
Large model | |
Procedure
Use the document parsing service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Document Content Parsing(document-analyze) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Set the Experience Data parameter to Sample data or My data. If you select My data, click Manage data to upload your data. Supported file types are TXT, PDF, HTML, DOC, DOCX, PPT, and PPTX, and each file can be up to 20 MB in size. You can upload data in either of the following ways:
Upload local files: Uploaded files are automatically cleared after seven days. The console does not permanently store your data.
Provide the URLs of files and specify the file type: You can provide multiple file URLs. Each file URL occupies a separate line.
Note: If you select an incorrect file type, the uploaded data fails to be parsed. To prevent parsing failures, select the file type that matches the file data.
Important: When you import files by using URLs, make sure that your operations comply with laws, regulations, and the management rules of the relevant platform and do not infringe on the legitimate rights or interests of rights holders. You are solely responsible for any violations of these requirements. As a tool provider, Search Development Workbench does not assume any responsibility for any data parsing or data download that you perform.
If you want to use your data, select an uploaded file or a URL from the drop-down list.
Click Get Results. The system calls the document parsing service to parse the file. You can view the results on the following tabs:
Results: displays the parsing progress and results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the document parsing service.
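Before you upload your own files, it can help to check them against the limits stated in the steps above. The following is a minimal local sketch, not part of the console or its API: the function name `check_upload` is hypothetical, and it assumes that 20 MB means 20 × 1024 × 1024 bytes and that the file type can be inferred from the file name extension.

```python
from pathlib import PurePath

# File types and size limit taken from the procedure above.
SUPPORTED_EXTENSIONS = {"txt", "pdf", "html", "doc", "docx", "ppt", "pptx"}
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Compare extensions case-insensitively, without the leading dot.
    extension = PurePath(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is larger than 20 MB: {size_bytes} bytes")
    return problems
```

For example, `check_upload("report.PDF", 1024)` returns an empty list, whereas a PNG file or a 21 MB DOCX file produces one problem each.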
Use the document splitting service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Document Slice(document-split) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Set the Experience Data parameter to Sample data or My data. If you set the Experience Data parameter to My data, you can enter your data in the editor. Supported data formats include TXT, HTML, and MARKDOWN.
Note: If you select an incorrect data format, the data that you enter fails to be parsed. To prevent parsing failures, make sure that the selected format matches your data.
Configure the Maximum Slice Length parameter. Unit: tokens. Default value: 300. Maximum value: 1024.
Click Get Results. The system calls the document splitting service to split the data. You can view the results on the following tabs:
Results: displays the splitting progress and results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the document splitting service.
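To build intuition for how the Maximum Slice Length parameter bounds each slice, the following sketch packs paragraphs greedily into slices. It is an illustration only and is not the service's algorithm: the function `split_text` is hypothetical, and it assumes that a token is a whitespace-separated word, whereas the real service uses its own tokenizer and also supports semantic and rule-based splitting.

```python
DEFAULT_MAX_SLICE_LENGTH = 300  # default value from the procedure above
MAX_SLICE_LENGTH_LIMIT = 1024   # maximum value from the procedure above


def split_text(text: str, max_slice_length: int = DEFAULT_MAX_SLICE_LENGTH) -> list[str]:
    """Greedily pack paragraphs into slices of at most max_slice_length tokens.

    Assumption: a "token" is a whitespace-separated word. The real service
    uses its own tokenizer, so actual slice boundaries will differ.
    """
    if not 1 <= max_slice_length <= MAX_SLICE_LENGTH_LIMIT:
        raise ValueError(f"max_slice_length must be between 1 and {MAX_SLICE_LENGTH_LIMIT}")

    slices: list[str] = []
    current: list[str] = []
    for paragraph in text.split("\n\n"):
        tokens = paragraph.split()
        # A paragraph longer than the limit is cut into limit-sized pieces.
        for start in range(0, len(tokens), max_slice_length):
            piece = tokens[start:start + max_slice_length]
            # Close the current slice if adding this piece would exceed the limit.
            if current and len(current) + len(piece) > max_slice_length:
                slices.append(" ".join(current))
                current = []
            current.extend(piece)
    if current:
        slices.append(" ".join(current))
    return slices
```

A smaller limit yields more, shorter slices. Paragraph boundaries are preferred as cut points, and a paragraph is split internally only when it exceeds the limit on its own.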
Use the text vectorization or text sparse vectorization service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Text vectorization(text-embedding) or Text sparse vectorization(text-sparse-embedding) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Set the Content Type parameter to document or query.
Add text groups or directly enter JSON-formatted data.
Click Get Results. The system calls the text vectorization or text sparse vectorization service to vectorize the text. You can view the results on the following tabs:
Results: displays the vectorization progress and results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the text vectorization or text sparse vectorization service.
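Once you have dense and sparse vectors for a query and a document, one common way to use them together in a hybrid search is a weighted sum of the two similarities. The sketch below shows that idea under stated assumptions: the sparse vector is represented as a `{term: weight}` mapping, the function names are hypothetical, and the `alpha` weight is an illustrative knob. The format that the services actually return, and the way OpenSearch combines the scores, may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sparse_dot(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two sparse vectors stored as {term: weight} mappings."""
    # Only terms present in both vectors contribute, so iterate the smaller one.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[term] for term, weight in a.items() if term in b)


def hybrid_score(dense_q, dense_d, sparse_q, sparse_d, alpha: float = 0.5) -> float:
    """Weighted sum of dense and sparse similarity; alpha is a hypothetical knob."""
    dense_part = cosine_similarity(dense_q, dense_d)
    sparse_part = sparse_dot(sparse_q, sparse_d)
    return alpha * dense_part + (1 - alpha) * sparse_part
```

Because a sparse vector stores only its non-zero terms, it stays small even for a large vocabulary, which is the storage advantage that the feature table mentions.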
Use the sorting service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Sorting Service(ranker) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Set the Experience Data parameter to Sample data or My data.
Enter information in the Query field.
Click Get Results. The system calls the sorting service to sort the documents based on the relevance between the query and document content, and returns the scoring results. You can view the results on the following tabs:
Results: displays the sorting and scoring results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the sorting service.
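The sorting step above scores each document against the query and orders the documents by that score. The following sketch mimics that flow with a deliberately simple stand-in scorer. The real service uses a model to judge relevance, so `overlap_score` and `rank_documents` are hypothetical names that illustrate the score-then-sort pattern only.

```python
def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of distinct query terms found in the document."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    document_terms = set(document.lower().split())
    return len(query_terms & document_terms) / len(query_terms)


def rank_documents(query: str, documents: list[str]) -> list[tuple[int, float]]:
    """Return (original_index, score) pairs ordered from most to least relevant."""
    scored = [(index, overlap_score(query, doc)) for index, doc in enumerate(documents)]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Returning the original index next to each score lets the caller map the ranking back to the input documents, which is useful when only the top few results are passed on to a later stage.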
Use an LLM
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Large model(text-generation) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Enter your question and click the Submit icon. The LLM analyzes the question and provides an answer.
Important: All answers are generated by AI models. The content generated by the AI models may be inaccurate or incomplete and does not reflect the attitudes or opinions of Alibaba Cloud.
The Results tab displays the number of input and output tokens that are consumed in this conversation. You can click Clear Conversation to delete this conversation.
Use the image content parsing service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Image Content Parsing(image-analyze) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Set the Experience Data parameter to Sample data or My data.
Click Get Results. The system calls the image content parsing service, which either analyzes the image content and generates results or identifies and exports the key information in the image. You can view the results on the following tabs:
Results: displays the content parsing results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the image content parsing service.
Use the query analysis service
Log on to the OpenSearch console.
In the top navigation bar, select a region. In the upper-left corner, select Search Development Workbench.
In the left-side navigation pane, click Experience Center.
On the Experience Center page, select Query Analysis(query-analyze) from the Service Category drop-down list and select a service from the Experience Services drop-down list.
Directly enter your question in the Query field. Alternatively, construct a multi-round conversation in the Historical Message section and then enter your question in the Query field. The model combines the multi-round conversation and question to identify your query intent.
To convert a natural language question into SQL statements, turn on Show NL2SQL and select a created service configuration from the Service Configuration drop-down list. Then, enter your question in natural language. The NL2SQL service converts the question into SQL statements.
Click Get Results. You can view the results on the following tabs:
Results: displays the query analysis results.
Result source code: displays the raw response returned by the service. You can click Copy Code to copy the response or click Download File to download it to your computer.
Sample code: allows you to view and download the sample code for calling the query analysis service.
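The combination of a multi-round conversation and a new question can be pictured as a single structured input. The sketch below assembles such an input as JSON. The field names `query`, `history`, `role`, and `content`, as well as the function `build_query_payload`, are hypothetical and chosen for illustration. Use the Sample code tab to see the request format that the service actually expects.

```python
import json
from typing import Optional

ALLOWED_ROLES = {"user", "assistant"}


def build_query_payload(query: str, history: Optional[list[dict]] = None) -> str:
    """Serialize a question plus optional earlier turns into a JSON string.

    The field names "query", "history", "role", and "content" are hypothetical.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    turns = []
    for turn in history or []:
        if turn.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {turn.get('role')!r}")
        # Copy only the expected keys so that stray fields are not sent along.
        turns.append({"role": turn["role"], "content": turn["content"]})
    return json.dumps({"query": query, "history": turns}, ensure_ascii=False)
```

Keeping the earlier turns next to the new question is what allows a follow-up such as "And in Beijing?" to be interpreted in the context of the previous exchange.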