Following on from this question, How can I get epsg code in pyproj with version==2.1.3, I discovered that initialising a pyproj Proj instance with an EPSG code string sometimes results in an ambiguity when trying to determine the projection's EPSG code:
from pyproj import Proj
crs = Proj(init='epsg:4544').crs
print(crs.to_epsg())
# prints None
If you adjust the min_confidence parameter, then you can get it out:
from pyproj import Proj
crs = Proj(init='epsg:4544').crs
print(crs.to_epsg(min_confidence=25))  # <-- !!
# prints 4544
I feel like this is something I should understand, but I don't. What is the ambiguity here? Why does the to_epsg method require this min_confidence parameter to return a valid EPSG code in this case?
Reading the source code is not enlightening regarding the source of the ambiguity:
Parameters
----------
min_confidence: int, optional
A value between 0-100 where 100 is the most confident. Default is 70.
Returns
-------
int or None: The best matching EPSG code matching the confidence level.
Reading that function, as well as to_authority, I surmise that the ambiguity arises when there are multiple matches, and the confidence score allows pyproj to return the best candidate. I don't particularly understand that either, but it does make sense. What really doesn't make sense is how None can be a better candidate than 4544 in this case, especially when the Proj instance was initialised without an exception or warning.
Notably, using a CRS instance instead, but with the same EPSG string input, seems to be unambiguous:
from pyproj import CRS
crs = CRS('epsg:4544')
print(crs.to_epsg())
# prints 4544
>>> import pyproj
>>> pyproj.__version__
'2.2.0'
Answer
This is a great question and I will do my best to answer.
To begin, the init style syntax is deprecated (https://pyproj4.github.io/pyproj/stable/gotchas.html#init-auth-auth-code-should-be-replaced-with-auth-auth-code). So, instead of CRS(init="epsg:4544"), you should use CRS("epsg:4544").
I discovered that initialising a pyproj Proj instance with an EPSG code string sometimes results in an ambiguity when trying to determine the projection's EPSG code.
The reason that to_epsg does not return the EPSG code for a CRS initialized with the init= syntax is that the two CRS objects are different:
>>> from pyproj import CRS
>>> crs_deprecated = CRS(init="epsg:4544")
>>> crs = CRS("epsg:4544")
>>> crs == crs_deprecated
False
Upon further inspection, you can see that the difference is in the axis order (it is not immediately apparent to the eye, because the rest of the output is identical):
>>> crs_deprecated
Name: CGCS2000 / 3-degree Gauss-Kruger CM 105E
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: China - 103.5°E to 106.5°E
- bounds: (103.5, 22.5, 106.5, 42.21)
Coordinate Operation:
- name: Gauss-Kruger CM 105E
- method: Transverse Mercator
Datum: China 2000
- Ellipsoid: CGCS2000
- Prime Meridian: Greenwich
>>> crs
Name: CGCS2000 / 3-degree Gauss-Kruger CM 105E
Axis Info [cartesian]:
- X[north]: Northing (metre)
- Y[east]: Easting (metre)
Area of Use:
- name: China - 103.5°E to 106.5°E
- bounds: (103.5, 22.5, 106.5, 42.21)
Coordinate Operation:
- name: Gauss-Kruger CM 105E
- method: Transverse Mercator
Datum: China 2000
- Ellipsoid: CGCS2000
- Prime Meridian: Greenwich
why does the to_epsg method require this min_confidence parameter to return a valid EPSG code in this case?
The min_confidence parameter exists because of cases like the example above. If you create a CRS instance from a WKT/PROJ string, in most cases you want to be sure that the EPSG code given by to_epsg will give you a CRS instance similar to the one created from the WKT/PROJ string. If no EPSG code matches your WKT/PROJ string with at least min_confidence, you don't want to get an EPSG code back, as it would make you think that the WKT/PROJ string and the EPSG code are one and the same when they are not.
However, if you only want the EPSG code that is closest to the PROJ/WKT string, then you can reduce min_confidence to a threshold you are comfortable with.
Here is an example of that:
>>> crs_deprecated = CRS("+init=epsg:4326")
>>> crs_deprecated.to_epsg(100)
>>> crs_deprecated.to_epsg(70)
>>> crs_deprecated.to_epsg(20)
4326
>>> crs_latlon = CRS("+proj=latlon")
>>> crs_latlon.to_epsg(100)
>>> crs_latlon.to_epsg(70)
4326
>>> crs_epsg = CRS.from_epsg(4326)
>>> crs_epsg.to_epsg(100)
4326
>>> crs_wkt = CRS(crs_epsg.to_wkt())
>>> crs_wkt.to_epsg(100)
4326
>>> crs_wkt == crs_epsg
True
>>> crs_epsg == crs_latlon
False
>>> crs_epsg == crs_deprecated
False
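The thresholding behaviour described above can be sketched in plain Python. This is only a toy model, not pyproj's implementation: pick_epsg, the candidate list and the confidence value of 25 are all hypothetical, chosen to mirror the fact that the question's lookup succeeds at min_confidence=25 but not at the default of 70.

```python
from typing import List, Optional, Tuple

# Hypothetical (EPSG code, confidence) pairs. In pyproj the candidates
# and their scores come from PROJ; the values here are made up.
Candidate = Tuple[int, int]


def pick_epsg(candidates: List[Candidate], min_confidence: int = 70) -> Optional[int]:
    """Return the highest-confidence code that meets the threshold, else None."""
    acceptable = [c for c in candidates if c[1] >= min_confidence]
    if not acceptable:
        # Nothing is trustworthy enough, so refuse to guess.
        return None
    return max(acceptable, key=lambda c: c[1])[0]


# One weak match, as in the question's init-style CRS (score is illustrative).
candidates = [(4544, 25)]
print(pick_epsg(candidates))      # None
print(pick_epsg(candidates, 25))  # 4544
```

Seen this way, None is not a "better candidate" than 4544. It is the answer given when no candidate clears the requested confidence.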
Hopefully this helps resolve some of the confusion.
For reference, these are the pyproj related versions I used in this example:
$ python -c "import pyproj; pyproj.show_versions()"
System:
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00) [GCC 7.3.0]
executable: ~/miniconda3/envs/pyproj/bin/python
machine: Linux-4.15.0-51-generic-x86_64-with-debian-buster-sid
PROJ:
PROJ: 6.1.1
data dir: ~/scripts/pyproj/pyproj/proj_dir/share/proj
Python deps:
pyproj: 2.2.1
pip: 18.0
setuptools: 41.0.1
Cython: 0.29.10
aenum: (2, 1, 2)