URLs
The URL property type is used for link references. Only the schemes http, https, ftp, and mailto are supported.
Attribute | Value | Detail |
---|---|---|
name | url | Used in schema definitions |
label | URL | plural: URLs |
group | urls | Used in search indexing to query all properties of a given type |
matchable | | Suitable for use in entity matching |
pivot | | Suitable for use as a pivot point for connecting to other entities |
Python API
Validation and normalisation of URLs is performed by the functions in rigour.urls.
followthemoney.types.UrlType
Bases: PropertyType
A uniform resource locator (URL). This will perform some normalisation
on the URL so that it's sure to be using valid encoding/quoting, and to
make sure the URL has a scheme (e.g. http, https, ...).
Source code in followthemoney/types/url.py
clean_text(text, fuzzy=False, format=None, proxy=None)
Perform intensive care on URLs to make sure they have a scheme and a host name. If no scheme is given HTTP is assumed.
Source code in followthemoney/types/url.py
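The cleaning behaviour described above can be illustrated with a minimal sketch built only on the Python standard library. This is not the code in followthemoney/types/url.py or rigour.urls. The helper name sketch_clean_url and the constant SUPPORTED_SCHEMES are hypothetical, and the handling of mailto addresses and empty paths is an assumption made for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Schemes listed as supported at the top of this page.
SUPPORTED_SCHEMES = ("http", "https", "ftp", "mailto")


def sketch_clean_url(text: str) -> Optional[str]:
    """Hypothetical helper: return a URL with a scheme and host, or None."""
    text = text.strip()
    if not text:
        return None
    # Assume HTTP when no scheme is given.
    if "://" not in text and not text.lower().startswith("mailto:"):
        text = "http://" + text
    try:
        parts = urlsplit(text)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. broken IPv6 hosts.
        return None
    if parts.scheme not in SUPPORTED_SCHEMES:
        return None
    if parts.scheme == "mailto":
        # Assumption: a mailto URL carries an address in place of a host.
        return text if parts.path else None
    if not parts.hostname:
        return None
    # Assumption: an empty path is rewritten to "/".
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(sketch_clean_url("example.com/about"))
print(sketch_clean_url("gopher://example.com"))
```

The sketch returns None for anything it cannot repair, so callers can tell a rejected value apart from a cleaned one.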
compare(left, right)
Compare two URLs and return a float indicating how similar they are. This ignores fragments and performs hard URL normalisation.
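A fragment-insensitive comparison can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the library's implementation: the helper names are hypothetical, the score is simplified to 1.0 for a match and 0.0 otherwise, and the normalisation steps (lower-casing the host, dropping a trailing slash) are assumed stand-ins for the "hard" normalisation mentioned above.

```python
from urllib.parse import urlsplit


def _comparison_key(url: str) -> tuple:
    """Hypothetical helper: reduce a URL to the parts used for comparison."""
    parts = urlsplit(url.strip())
    # urlsplit already lower-cases the scheme; the fragment is left out on purpose.
    return (
        parts.scheme,
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
    )


def sketch_compare_urls(left: str, right: str) -> float:
    """Return 1.0 if both URLs match after normalisation, else 0.0 (assumed scoring)."""
    return 1.0 if _comparison_key(left) == _comparison_key(right) else 0.0


print(sketch_compare_urls("HTTP://Example.com/a/#top", "http://example.com/a"))
print(sketch_compare_urls("http://example.com/a", "http://example.com/b"))
```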