By Rongrong Chen (Mingxu)
With topics such as AIGC and large models attracting so much attention recently, how do you choose a storage medium for different industry scenarios? What factors should you consider when making a selection?
This article introduces common storage types and their differences to help readers select the appropriate storage type for different requirements and scenarios.
At the physical layer, storage is a disk: a storage device that records data using magnetic recording technology. The disk is the computer's main storage medium. It can hold a large amount of binary data and retains that data even after a power failure. Early computers used floppy disks; today, hard disks are common.
A storage disk has three common metrics: throughput, IOPS, and latency. They are related as follows: Throughput per second = I/O size × IOPS, where IOPS is tied to latency through the degree of parallelism (roughly, IOPS = degree of parallelism / latency).
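As a rough numerical sketch of how the three metrics relate, the snippet below assumes that throughput equals I/O size multiplied by IOPS and estimates IOPS from latency and the number of requests in flight (Little's law). The function names and the sample numbers are illustrative assumptions, not figures for any Alibaba Cloud product.

```python
def iops_from_latency(parallelism: int, latency_s: float) -> float:
    """Little's law estimate: requests in flight divided by time per request."""
    return parallelism / latency_s


def throughput_bytes_per_s(io_size_bytes: int, iops: float) -> float:
    """Throughput is the I/O size multiplied by the operations per second."""
    return io_size_bytes * iops


# Hypothetical disk: 4 KiB requests, 1 ms latency, 32 requests in flight.
iops = iops_from_latency(parallelism=32, latency_s=0.001)
throughput = throughput_bytes_per_s(io_size_bytes=4096, iops=iops)
print(f"{iops:.0f} IOPS, {throughput / 2**20:.0f} MiB/s")
```

The same arithmetic shows why small-block workloads are usually limited by IOPS and latency, while large-block workloads are usually limited by throughput.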
Alibaba Cloud storage is built on the Pangu system, which virtualizes and pools the disk resources at the physical layer to build a distributed resource scheduling system. This lets Alibaba Cloud offer storage as a pay-as-you-go, on-demand service, much like a utility such as water, electricity, or coal.
Files, blocks, and objects are three storage formats that hold, organize, and present data in different ways.
The three storage products have different interface protocols:
Because they store different kinds of data and organize it in different structures, the three products suit different application scenarios.
A NAS file system is organized as a directory tree. Thousands of virtual machines can access it concurrently through the POSIX interface, and it supports random and direct reads and writes as well as online modification.
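To illustrate what random read and write with online modification means at the POSIX level, here is a minimal sketch that overwrites a few bytes in the middle of an existing file without rewriting the rest. A temporary local file stands in for a file on a mounted NAS file system; that substitution and the sample contents are assumptions made for the example.

```python
import os
import tempfile

# A temporary local file stands in for a file on a mounted file system.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello storage world")
    path = tmp.name

# Open for in-place update, jump to byte offset 6, and overwrite 7 bytes.
with open(path, "r+b") as f:
    f.seek(6)
    f.write(b"STORAGE")

# Read the file back to confirm that only the targeted bytes changed.
with open(path, "rb") as f:
    modified = f.read()

os.remove(path)
print(modified)
```

The seek-then-write pattern is what a flat object store does not offer: there, changing part of an object generally means uploading it again.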
OSS uses a flat architecture in the style of Simple Storage Service (S3): objects are organized in a flat namespace rather than a directory tree. OSS does not support random reads and writes of files. It is suited to uploading, downloading, and distributing large amounts of data over the Internet.
Objects are stored in buckets. Objects are like files and buckets are like folders or directories; both are located by uniform resource identifiers (URIs). Although the console appears to show a tree structure, the displayed folder /.resource is only a prefix.
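The idea that a folder is only a prefix can be sketched with a plain dictionary standing in for a bucket. The keys, the `list_objects` helper, and the `/` delimiter are hypothetical and chosen for illustration; this is not the OSS SDK.

```python
# Hypothetical bucket contents: a flat mapping from object key to data.
bucket = {
    "logs/2023/app.log": b"data",
    "logs/2023/db.log": b"data",
    "images/cat.png": b"data",
    "readme.txt": b"data",
}


def list_objects(bucket: dict, prefix: str = "", delimiter: str = "/"):
    """Return (objects, common_prefixes) found directly under a prefix.

    Mimics how a console can show "folders" without any real directories.
    """
    objects, common_prefixes = [], set()
    for key in sorted(bucket):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter only looks like a folder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


root_objects, root_folders = list_objects(bucket)
log_objects, log_folders = list_objects(bucket, prefix="logs/2023/")
print(root_objects, root_folders)
print(log_objects, log_folders)
```

No directory is ever created here: the apparent folders are computed from the shared leading part of the keys each time the bucket is listed.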
| Storage Service | Latency | Throughput | Protocol | Access mode (the interface through which a virtual machine accesses stored data) | Scenarios |
| --- | --- | --- | --- | --- | --- |
| Apsara File Storage NAS | Milliseconds | Hundreds of Gbit/s | NFS and SMB | Thousands of ECS instances perform concurrent random read and write operations on an Apsara File Storage NAS file system through POSIX. | Highly concurrent access, online modification, and direct read and write |
| Object Storage Service (OSS) | Tens of milliseconds | Hundreds of Gbit/s | HTTP and HTTPS (RESTful API) | Millions of clients concurrently access an OSS bucket and perform append operations over the web. | Uploading, downloading, and distributing large amounts of data over the Internet |
| Elastic Block Storage (EBS) | A few microseconds | Tens of Gbit/s | Proprietary (self-developed) protocol | A single ECS instance performs random read and write operations on a block storage device through POSIX. | High-performance, low-latency workloads (such as I/O-intensive databases and single ECS instances) |
A protocol defines the format and order of messages exchanged between two or more communicating entities and the actions taken to send and/or receive a message or other event.
Interfaces are often linked to modules in the case of programming. The module is a physical grouping of program entity definitions and is a program unit that can be written and compiled separately. A module includes two parts: interface and implementation. The interface of a module specifies some program entities defined in the module that can be used by other modules. The implementation of a module refers to the specific implementation of the program entities defined in the module. The interface acts as a constraint between the module designer and the user: the user uses the functions provided by the module according to the module interface, and the designer implements modules according to the specified module interface.
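The split between a module's interface and its implementation can be sketched as follows. The `Storage` and `InMemoryStorage` classes and the `save_greeting` function are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Interface: the entities that other modules are allowed to use."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under a key."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under a key."""


class InMemoryStorage(Storage):
    """Implementation: one concrete way to satisfy the interface."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]


def save_greeting(storage: Storage) -> bytes:
    """User code: depends only on the interface, not on any implementation."""
    storage.put("greeting", b"hello")
    return storage.get("greeting")


result = save_greeting(InMemoryStorage())
print(result)
```

Because `save_greeting` relies only on the interface, the in-memory implementation could be replaced by another one without changing the calling code, which is the constraint between designer and user described above.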
Protocols and interfaces often appear together or are used interchangeably. Both are essentially abstract sets of rules whose meaning depends on context. In computer networks, "protocol" usually means a network or communication protocol, and each layer of the network model has its own protocols. In programming, "interface" usually refers to something specific and can be understood narrowly as an interaction point (like a service window in a government affairs hall), a function, or a method. In some contexts, an API is simply called an interface by default, although the relationship between the two is one of inclusion, not equivalence.
Broadly speaking, interface and protocol are both highly abstract concepts and can be used together. In a narrow sense, each is something specific (for example, Java has a definable interface construct, which is often compared with abstract classes), and in that sense the two are different.
Protocols are common rules and paradigms for multiple communicating entities. Interfaces are concrete implementations of the rules specified in protocols.
Connection: The protocol is the established rule of the interface, and the interface is the concrete implementation of the protocol.
Difference: There is little need to draw a sharp line between protocols and interfaces. What connects them matters more than what separates them.
1. Definition: Network File System (NFS) is a UNIX presentation-layer protocol for file sharing developed by Sun Microsystems. It lets users access files elsewhere on the network as if the files were on their own computers.
2. Differentiated Features of NFS
a) NFS provides only basic file processing functions. It has no data transmission capability of its own in either the four-layer TCP/IP model or the seven-layer OSI model, and relies on the RPC protocol to provide TCP/IP data transmission.
b) By default, NFS traffic is not encrypted: data is transmitted in plaintext, and the protocol is completely transparent to the client. Access control is limited to using the client's IP address or hostname to decide whether the client may mount a specified shared directory. Kerberos can be used to add authentication and encryption to NFS.
3. Like other file sharing protocols, NFS uses a client/server (C/S) architecture.
4. NFS implementation principle: the owner, group, and permissions of shared resources
a) The NFS server and client identify the owner information of the shared resource by UID and GID. When the client mounts an NFS share directory, the UID and GID of the resources in the shared directory will be consistent with those on the server. The client will map the UID and GID to the corresponding user name and group name on the client. The permissions and ACL information (if supported) of shared resources on the NFS server and client will be consistent.
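The UID and GID mapping described above can be sketched with two small lookup tables. The account names and IDs are made up for illustration; on real hosts the mapping would come from /etc/passwd and /etc/group or a directory service.

```python
# Hypothetical account tables for two hosts.
server_users = {1000: "alice", 1001: "bob"}
client_users = {1000: "carol", 1002: "bob"}

# Ownership as it travels with the shared file: numeric IDs only.
report = {"path": "/share/report.txt", "uid": 1000, "gid": 1000}
notes = {"path": "/share/notes.txt", "uid": 1001, "gid": 1001}


def owner_name(file_meta: dict, users: dict) -> str:
    """Resolve a numeric UID locally, falling back to the raw number."""
    uid = file_meta["uid"]
    return users.get(uid, str(uid))


# The same UID resolves to whatever name each host assigns to it.
print(owner_name(report, server_users), owner_name(report, client_users))
print(owner_name(notes, server_users), owner_name(notes, client_users))
```

In this sketch the numeric IDs stay identical on both sides while the displayed names differ, which is why account IDs are usually kept consistent across NFS servers and clients.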
iSCSI SAN is typically used for two purposes:
a) An API defines a set of programming interfaces used by applications. An API may be implemented with one system call or with several, and it may equally be implemented without any system calls at all.
b) An API can present the same interface to applications on different operating systems, even though its implementation on each system may differ. For example, when an application calls printf(), the call goes to printf in the C library, which calls write in the C library, which in turn invokes a system call provided by the kernel. On Windows this may be function A; on Linux it may be function B. Different kernels provide different system calls (each of which is a function) to accomplish the same task.
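The layering in the printf example has a loose analogue in Python, sketched below: a high-level API, a stream method underneath it, and a thin wrapper around the operating system's write call. Treating these three calls as a mirror of the C chain is an assumption made for illustration; the pipe is used only so the result of the lowest layer is easy to inspect.

```python
import os
import sys

# High-level API: formatting and buffering are handled by the library.
print("hello via print()")

# One layer down: write directly to the standard output stream object.
sys.stdout.write("hello via sys.stdout.write()\n")
sys.stdout.flush()

# Lowest layer shown here: os.write is a thin wrapper over the operating
# system's write call and works on a raw file descriptor.
message = b"hello via os.write()\n"
read_fd, write_fd = os.pipe()
written = os.write(write_fd, message)
os.close(write_fd)
received = os.read(read_fd, 1024)
os.close(read_fd)
```

The same script runs on different operating systems, while the kernel function that ultimately performs the write differs from one system to another.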
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.