URLs
The URL property type is used for link references. Only the schemes http, https, ftp, and mailto are supported.
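The scheme restriction can be illustrated with the standard library alone. This is a minimal sketch, not the library's own check: `SUPPORTED_SCHEMES` and `has_supported_scheme` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlsplit

# The four schemes named in the text above (hypothetical constant name).
SUPPORTED_SCHEMES = {"http", "https", "ftp", "mailto"}


def has_supported_scheme(url: str) -> bool:
    """Return True if the URL's scheme is one of the four supported ones."""
    # urlsplit returns an empty scheme when the text has none.
    return urlsplit(url).scheme.lower() in SUPPORTED_SCHEMES


print(has_supported_scheme("https://example.com/a"))
print(has_supported_scheme("gopher://example.com"))
```

A value without any scheme, such as `example.com`, fails this check as written; the cleaning step described under `clean_text` is what supplies a default scheme.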
| Attribute | Value | Detail |
|---|---|---|
| name | url | Used in schema definitions |
| label | URL | plural: URLs |
| group | urls | Used in search indexing to query all properties of a given type |
| matchable | | Suitable for use in entity matching |
| pivot | | Suitable for use as a pivot point for connecting to other entities |
Python API
Validation and normalisation of URLs is performed by the functions in rigour.urls.
followthemoney.types.UrlType
Bases: PropertyType
A uniform resource locator (URL). This will perform some normalisation
on the URL so that it's sure to be using valid encoding/quoting, and to
make sure the URL has a scheme (e.g. http, https, ...).
Source code in followthemoney/types/url.py
clean_text(text, fuzzy=False, format=None, proxy=None)
Perform intensive care on URLs to make sure they have a scheme and a host name. If no scheme is given HTTP is assumed.
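The behaviour described here (assume HTTP when no scheme is given, and require a host name) might be sketched as follows. This is an illustration built on `urllib.parse`, not the code in `followthemoney/types/url.py`; the function name `clean_url_sketch` is hypothetical, and the sketch only handles host-based URLs, so `mailto:` addresses are out of scope.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def clean_url_sketch(text: str) -> Optional[str]:
    """Illustrative only: default to http:// and reject URLs without a host."""
    text = text.strip()
    if not text:
        return None
    # Assumption: a value without "://" carries no scheme, so HTTP is assumed.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    # A URL that still has no host name cannot be repaired.
    if not parts.hostname:
        return None
    return urlunsplit(parts)


print(clean_url_sketch("example.com/path"))
print(clean_url_sketch("http://"))
```

Returning `None` for unusable input is a design choice of this sketch; it lets a caller drop the value rather than store a malformed link.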
compare(left, right)
Compare two URLs and return a float indicating how similar they are. This ignores fragments and performs hard URL normalisation.
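One way to picture a fragment-insensitive comparison is the sketch below. It assumes an all-or-nothing score (1.0 for a match after normalisation, 0.0 otherwise); the real method may normalise differently or grade similarity more finely. `normalise_for_compare` and `compare_urls_sketch` are hypothetical names, not part of followthemoney or rigour.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_for_compare(url: str) -> str:
    """Lower-case scheme and network location, drop the fragment."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # Simplification: lower-casing the whole netloc also affects any userinfo.
    netloc = parts.netloc.lower()
    # Treat an empty path and "/" as the same resource.
    path = parts.path or "/"
    # The final empty string discards the fragment.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def compare_urls_sketch(left: str, right: str) -> float:
    """Return 1.0 if both URLs normalise to the same string, else 0.0."""
    same = normalise_for_compare(left) == normalise_for_compare(right)
    return 1.0 if same else 0.0


print(compare_urls_sketch("http://Example.com/a#top", "http://example.com/a"))
```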