By Digoal
Multiple geographic representation methods are available in the industry. Generally, we use latitudes and longitudes to represent locations on Earth. PostgreSQL PostGIS describes this type of information, including the altitude. It is the most widely used technology with the highest precision and the best functionality.
However, not all databases use this technology. Many databases chose Geohash technology in the early stages of development because PostGIS is more challenging to implement.
What is Geohash?
As defined by Oracle, Geohash is used for encoding latitude and longitude coordinates into a text string. It is a hierarchical spatial data structure that divides space into buckets of a grid shape. The size of the grid depends on the Geohash's precision.
Geohash typically uses 32 characters for encoding. Base36 geo encodings, which uses 36 characters for encoding, are also used in the industry.
After encoding, points close to each other have closer values. A prefix always contains a range represented by longer code characters. For example, wb12x comprises five characters, and it contains the grid wb12xabcd. In this case, wb12x represents a larger grid, and wb12xabcd represents a smaller grid. Do not ignore boundary issues. For example, -180 and 180 indicate the same place, but they have completely different Geohash codes. This is one of the issues with Geohash.
First, create a Base32 mapping table (assume that 32 characters are used for encoding).
For example, if the Geohash code is ezs42, it can be translated into 13, 31, 24, 4, and 2 according to the following table.
Use 5 bits to represent each of these five numbers. Then we have 01101 11111 11000 00100 00010.
Next, starting from 0 in this bit string, connect the odd-position bits into a longitude code (0111110000000) and the even-position bits into a latitude code (101111001001).
What should we do after obtaining the latitude and longitude bit strings?
Longitude range: -180 to 180
Latitude range: -90 to 90
Divide the range in half each time. For example, divide [-180, 180] to [-180, 0] and (0, 180]. When the bit is 0, the point falls within the range on the left. When the bit is 1, the point falls within the range on the right. Then divide the next range in half. For example, divide [-180, 0] to [-180, -90] and (-90, 0]. Select a range again based on the bit value and repeat the preceding steps.
Get a final range and use the median of the range as the corresponding latitude or longitude.
The above figure shows the calculation process of the bit string 101111001001, with 1 falling in the ranges on the right and 0 falling within the ranges on the left.
In this table, val indicates the median of the range, and err indicates the error.
A longer Geohash string has a smaller error. Let's look at the error table below.
Geohash has the following critical drawbacks:
1) The grid-based algorithm cannot meet high-precision requirements.
2) The earth is an irregular ellipse, so the Geohash deviation changes sharply as the latitude increases. If a point is closer to the Arctic or Antarctic region, the distance calculation result is more inaccurate.
3) The hash values of the longitude boundaries (-180, 180) are entirely different, but they are adjacent, which affects the adjacency calculation.
PostgreSQL PostGIS is a robust data management system that has been developed over decades. It has been applied to military, scientific research, and commercial scenarios.
PostGIS adds extra types, such as geometry, geography, and raster, to the PostgreSQL database. It also adds functions, operators, and index enhancements that apply to these spatial types.
These additional functions, operators, index bindings, and types augment the power of the core PostgreSQL DBMS, making it a fast, feature-plenty, and robust spatial database management system.
The PostGIS 2+ series provides:
The following paragraphs are translated and cited from http://www.cnblogs.com/LBSer/p/3629149.html
Comparison in a simple and interesting scenario:
Feature | Mysql spatial extension | PostGIS |
---|---|---|
Spatial indexes | Only MyISAM supports the R-Tree index. InnoDB does not. | GiST index (a variant of the R-Tree index) |
Spatial data types | 2D data only | 2D, 3D, and curves |
Spatial operation functions | Limited spatial functions | Spatial operation functions defined by the OGC standard |
Spatial projection | Not supported | Multiple common projected coordinate systems are supported |
Transactions | Not supported | PostGIS supports a series of long-running transactions, which can effectively support complex spatial analysis. |
Loading Speed | MySQL > PostGIS (transactions) (Maybe the author did not optimize PostGIS.) | - |
Speed of creating a spatial index | MySQL < PostGIS (diff split algo) | - |
Query efficiency | MySQL and PostGIS perform well in different types of queries. | - |
Use of GIS | Seldom used | Commonly used. For example, PostgreSQL and PostGIS are used by OpenStreetMap. |
Assume that you have received a task to find all points in the blue polygon. You can find only the points in the minimum bounding rectangle (the red box) by using MySQL Spatial. However, you may find points in any polygon if you use PostGIS.
If you need to calculate the distance between two points, calculate only the Euclidean spatial distance using MySQL Spatial, while PostGIS enables you to calculate the real spatial distance in different projected coordinate systems.
In addition to geometry and geography, PostGIS also supports Geohash. Although Geohash is not recommended, PostGIS has a built-in conversion function for converting geometry into Geohash.
ST_GeoHash — Return a GeoHash representation of the geometry.
text ST_GeoHash(geometry geom, integer maxchars=full_precision_of_point);
Return a Geohash representation of the geometry. A Geohash encodes a point into a text form that is sortable and searchable based on prefix. A shorter Geohash is a less precise representation of a point. It can also be thought of as a box, which contains the actual point.
If no maxchars is specified ST_GeoHash returns a Geohash based on full precision of the input geometry type. Points return a Geohash with 20 characters of precision (about enough to hold the full double-precision of the input). Other types return a Geohash with a variable amount of precision, based on the feature's size. Larger features are represented with less precision and smaller features with more precision. The idea is that the box implied by the Geohash will always contain the input feature.
If maxchars is specified, ST_GeoHash returns a Geohash with at most that many characters, so a possibly lower precision representation of the input geometry. For non-points, the starting point of the calculation is the center of the geometry's bounding box.
Availability: 1.4.0
ST_GeoHash does not work with geometries that are not in geographic (lon/lat) coordinates.
This method supports Circular Strings and Curves.
SELECT ST_GeoHash(ST_SetSRID(ST_MakePoint(-126,48),4326));
st_geohash
----------------------
c0w3hf1s70w3hf1s70w3
SELECT ST_GeoHash(ST_SetSRID(ST_MakePoint(-126,48),4326),5);
st_geohash
------------
c0w3h
PostgreSQL: Database Requirement Analysis and Design for Logistics Tracking System
PostGIS Long Lat Geometry Distance Search Tuning Using the GiST KNN Function
digoal - December 21, 2020
digoal - January 25, 2021
ApsaraDB - April 8, 2021
digoal - January 19, 2021
digoal - December 23, 2020
digoal - January 18, 2021
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreLeverage cloud-native database solutions dedicated for FinTech.
Learn MoreMore Posts by digoal