utilities

Download and management utilities for syncing files

Source code

General Methods

class FirnCorr.utilities.reify(wrapped)[source]

Class decorator that puts the result of the method it decorates into the instance
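
The caching behaviour described above can be illustrated with a minimal sketch. It follows the common `reify` descriptor pattern and is not copied from the FirnCorr source; the `Grid` class and its `spacing` method are hypothetical.

```python
class reify:
    """Non-data descriptor that caches a method's result on the instance."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.__doc__ = wrapped.__doc__

    def __get__(self, inst, objtype=None):
        if inst is None:
            return self
        value = self.wrapped(inst)
        # store the result under the method's name so that later lookups
        # find the instance attribute and never reach this descriptor again
        setattr(inst, self.wrapped.__name__, value)
        return value


class Grid:
    """Hypothetical class used only to demonstrate the decorator."""
    calls = 0

    @reify
    def spacing(self):
        Grid.calls += 1
        return 0.25


grid = Grid()
first = grid.spacing   # runs the method and caches the result
second = grid.spacing  # served from the instance dictionary
```

Because the descriptor defines no `__set__`, the cached instance attribute takes precedence on every access after the first.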

FirnCorr.utilities.get_data_path(relpath: list | str | Path)[source]

Get the absolute path within a package from a relative path

Parameters:
relpath: list, str or pathlib.Path

Relative path

FirnCorr.utilities.get_cache_path(relpath: list | str | Path | None = None, appname='firncorr')[source]

Get the path to the user cache directory for an application

Parameters:
relpath: list, str, pathlib.Path or None

Relative path

appname: str, default ‘firncorr’

Application name

FirnCorr.utilities.import_dependency(name: str, extra: str = '', raise_exception: bool = False)[source]

Import an optional dependency

Adapted from pandas.compat._optional::import_optional_dependency

Parameters:
name: str

Module name

extra: str, default “”

Additional text to include in the ImportError message

raise_exception: bool, default False

Raise an ImportError if the module is not found

Returns:
module: obj

Imported module
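
A rough sketch of how an optional import with these parameters might behave, using only `importlib`. Returning `None` for a missing module is an assumption made for this example; the value the FirnCorr function returns in that case is not stated here.

```python
import importlib


def import_dependency(name, extra="", raise_exception=False):
    """Import a module by name, optionally raising if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if raise_exception:
            message = f"Missing optional dependency '{name}'. {extra}"
            raise ImportError(message) from exc
        # assumption: a missing module is reported as None in this sketch
        return None


json_module = import_dependency("json")
missing = import_dependency("no_such_module_for_this_example")
```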

FirnCorr.utilities.dependency_available(name: str, minversion: str | None = None)[source]

Checks whether a module is installed without importing it

Adapted from xarray.namedarray.utils.module_available

Parameters:
name: str

Module name

minversion: str, optional

Minimum version of the module

Returns:
available: bool

Whether the module is installed

FirnCorr.utilities.is_valid_url(url: str) bool[source]

Checks if a string is a valid URL

Parameters:
url: str

URL to check
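
As an illustration, a URL check of this kind could be written with `urllib.parse`. Requiring both a scheme and a network location is an assumption of this sketch; the actual validity rule used by FirnCorr may differ.

```python
from urllib.parse import urlparse


def is_valid_url(url):
    """Treat a string as a URL if it has both a scheme and a host."""
    try:
        parsed = urlparse(str(url))
    except ValueError:
        return False
    return bool(parsed.scheme) and bool(parsed.netloc)
```

With this rule, `https://example.com/data` and `s3://bucket/key` count as URLs, while a local path such as `/home/user/file.nc` does not.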

FirnCorr.utilities.Path(filename: str | Path, *args, **kwargs)[source]

Create a URL or pathlib.Path object

Parameters:
filename: str or pathlib.Path

File path or URL

class FirnCorr.utilities.URL(urlname: str | Path, *args, **kwargs)[source]

Handles URLs similar to pathlib.Path objects

classmethod from_parts(parts: str | list | tuple)[source]

Return a URL object from components

Parameters:
parts: str, list or tuple

URL components

joinpath(*pathsegments: list[str])[source]

Append URL components to the existing URL

Parameters:
pathsegments: list[str]

URL components to append

resolve()[source]

Resolve the URL

is_file()[source]

Boolean flag if path is a local file

is_dir()[source]

Boolean flag if path is a local directory

geturl()[source]

String representation of the URL object

get(*args, **kwargs)[source]

Get contents from URL

headers(*args, **kwargs)[source]

Get headers from URL

load(*args, **kwargs)[source]

Load JSON response from URL

ping(*args, **kwargs) bool[source]

Ping URL to check connection

query(*args, **kwargs)[source]

List contents from URL

read(*args, **kwargs)[source]

Open URL and read response

request(*args, **kwargs)[source]

Make URL request

urlopen(*args, **kwargs)[source]

Open URL and return response

property name

URL basename

property netloc

URL network location

property parent

URL parent path as a URL object

property parents

URL parents as a list of URL objects

property parts

URL parts as a tuple

property path

URL path

property s3bucket

AWS s3 bucket name

property s3key

AWS s3 key

property scheme

URL scheme

property stem

URL stem
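
The properties listed above mirror `pathlib.Path` attributes. The sketch below shows how comparable values can be derived from a URL string with the standard library alone; the bucket name and key are hypothetical, and the mapping of `s3bucket` and `s3key` to the network location and path is an assumption.

```python
import posixpath
from urllib.parse import urlsplit

# hypothetical S3 URL used only for illustration
url = "s3://example-bucket/firn/model/output.nc.gz"
parsed = urlsplit(url)

scheme = parsed.scheme                # URL scheme
netloc = parsed.netloc                # network location (bucket name for S3)
path = parsed.path                    # path component with leading slash
name = posixpath.basename(path)       # final path component
stem = posixpath.splitext(name)[0]    # name without its last suffix
parent = posixpath.dirname(url)       # URL one level up
s3key = path.lstrip("/")              # assumed meaning of the S3 key
```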

FirnCorr.utilities.detect_compression(filename: str | Path) bool[source]

Detect if file is compressed based on file extension

Parameters:
filename: str or pathlib.Path

Model file

Returns:
compressed: bool

Input file is gzip compressed
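
A minimal sketch of extension-based detection, assuming that only a trailing `.gz` suffix marks a gzip-compressed file:

```python
import re
from pathlib import Path


def detect_compression(filename):
    """Return True if the file name ends with a .gz extension."""
    name = Path(filename).name
    return bool(re.search(r"\.gz$", name, re.IGNORECASE))
```

The check looks only at the name, so a gzip file saved without the extension would not be detected.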

FirnCorr.utilities.compressuser(filename: str | Path)[source]

Tilde-compress a file to be relative to the home directory

Parameters:
filename: str or pathlib.Path

Input filename to tilde-compress
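
Tilde-compression is the inverse of `Path.expanduser`. One way it could be done with `pathlib` is sketched below; returning paths outside the home directory unchanged is an assumption of this example.

```python
from pathlib import Path


def compressuser(filename):
    """Replace a leading home directory with '~' where possible."""
    filename = Path(filename).expanduser().absolute()
    try:
        relative = filename.relative_to(Path.home())
    except ValueError:
        # not inside the home directory: leave the path as it is
        return filename
    return Path("~") / relative


compressed = compressuser(Path.home() / "data" / "file.nc")
```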

FirnCorr.utilities.get_hash(local: str | IOBase | Path, algorithm: str = 'md5', include_algorithm: bool = False)[source]

Get the hash value from a local file or BytesIO object

Parameters:
local: obj, str or pathlib.Path

BytesIO object or path to file

algorithm: str, default ‘md5’

Hashing algorithm for checksum validation

include_algorithm: bool, default False

Include the algorithm name in the returned hash
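
A sketch of hashing either an in-memory buffer or a file on disk with `hashlib`. The `algorithm:digest` output format for `include_algorithm` and the empty string for a missing file are assumptions made for this example.

```python
import hashlib
import io
from pathlib import Path


def get_hash(local, algorithm="md5", include_algorithm=False):
    """Hash the contents of a file-like object or a local file."""
    if isinstance(local, io.IOBase):
        # read the whole buffer, then restore the caller's position
        position = local.tell()
        local.seek(0)
        data = local.read()
        local.seek(position)
    else:
        path = Path(local).expanduser()
        if not path.is_file():
            return ""  # assumption: missing files give an empty hash
        data = path.read_bytes()
    digest = hashlib.new(algorithm, data).hexdigest()
    # assumption: the algorithm name is prefixed with a colon separator
    return f"{algorithm}:{digest}" if include_algorithm else digest


buffer_hash = get_hash(io.BytesIO(b"firn"))
labelled = get_hash(io.BytesIO(b"firn"), "sha256", include_algorithm=True)
```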

FirnCorr.utilities.url_split(s: str)[source]

Recursively split a URL path into a list

Parameters:
s: str

URL string
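
The recursive split can be sketched with `posixpath.split`, peeling off one component per call. Keeping the scheme and host together as the first element, and the set of recognised schemes, are assumptions of this example.

```python
import posixpath


def url_split(s):
    """Recursively split a URL path into its components."""
    head, tail = posixpath.split(s)
    if head in ("http:", "https:", "ftp:", "s3:"):
        # reached 'scheme://host': keep it as a single component
        return [s]
    if head in ("", posixpath.sep):
        return [tail]
    return url_split(head) + [tail]


parts = url_split("https://example.com/data/2020/file.nc")
```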

FirnCorr.utilities.convert_arg_line_to_args(arg_line)[source]

Convert file lines to arguments

Parameters:
arg_line: str

Line string containing a single argument and/or comments
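
A function with this name is typically assigned to an `argparse.ArgumentParser` so that argument files can contain comments. The sketch below assumes `#` starts a comment and that arguments are separated by whitespace; the option names and file contents are illustrative.

```python
import argparse
import os
import re
import tempfile


def convert_arg_line_to_args(arg_line):
    """Yield the arguments on one line, ignoring anything after a '#'."""
    for arg in re.sub(r"#.*$", "", arg_line).split():
        yield arg


parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
# argparse calls this hook once per line of the argument file
parser.convert_arg_line_to_args = convert_arg_line_to_args
parser.add_argument("--directory")
parser.add_argument("--verbose", action="store_true")

# write a small argument file and parse it
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("--directory /data/firn  # output directory\n")
    f.write("# a full-line comment\n")
    f.write("--verbose\n")
args = parser.parse_args(["@" + f.name])
os.unlink(f.name)
```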

FirnCorr.utilities.get_unix_time(time_string: str, format: str = '%Y-%m-%d %H:%M:%S')[source]

Get the Unix timestamp value for a formatted date string

Parameters:
time_string: str

Formatted time string to parse

format: str, default ‘%Y-%m-%d %H:%M:%S’

Format for input time string
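
As a sketch, the conversion can be done with `time.strptime` and `calendar.timegm`. Interpreting the input as UTC and returning `None` for an unparseable string are assumptions of this example.

```python
import calendar
import time


def get_unix_time(time_string, format="%Y-%m-%d %H:%M:%S"):
    """Seconds since 1970-01-01 00:00:00 for a UTC date string."""
    try:
        parsed = time.strptime(time_string.rstrip(), format)
    except (TypeError, ValueError):
        return None  # assumption: unparseable input gives None
    return calendar.timegm(parsed)


epoch = get_unix_time("1970-01-01 00:00:00")
y2k = get_unix_time("2000-01-01 00:00", format="%Y-%m-%d %H:%M")
```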

FirnCorr.utilities.isoformat(time_string: str)[source]

Reformat a date string to ISO formatting

Parameters:
time_string: str

Formatted time string to parse

FirnCorr.utilities.even(value: float)[source]

Rounds a number to an even number less than or equal to original

Parameters:
value: float

Number to be rounded

FirnCorr.utilities.ceil(value: float)[source]

Rounds a number upward to its nearest integer

Parameters:
value: float

Number to be rounded upward
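
Both rounding helpers, `even` and `ceil`, can be expressed with floor division. This is a minimal sketch matching the descriptions above, not necessarily the expressions used in the FirnCorr source.

```python
def even(value):
    """Largest even integer less than or equal to value."""
    return 2 * int(value // 2)


def ceil(value):
    """Smallest integer greater than or equal to value."""
    return -int(-value // 1)
```

Floor division rounds toward negative infinity, so both functions also behave as described for negative inputs.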

FirnCorr.utilities.copy(source: str | Path, destination: str | Path, move: bool = False, **kwargs)[source]

Copy or move a file with all system information

Parameters:
source: str or pathlib.Path

Source file

destination: str or pathlib.Path

Copied destination file

move: bool, default False

Remove the source file
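
A sketch of copying a file together with its metadata using `shutil`. It assumes that "system information" refers to the permission bits and timestamps preserved by `shutil.copystat`; the file names are illustrative.

```python
import shutil
import tempfile
from pathlib import Path


def copy(source, destination, move=False):
    """Copy a file and its metadata, optionally removing the source."""
    source = Path(source).expanduser().absolute()
    destination = Path(destination).expanduser().absolute()
    shutil.copyfile(source, destination)
    # carry over permission bits and modification times
    shutil.copystat(source, destination)
    if move:
        source.unlink()
    return destination


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "source.txt"
    src.write_text("firn density")
    dst = copy(src, Path(tmp) / "moved.txt", move=True)
    moved_text = dst.read_text()
    source_remains = src.exists()
```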

Create a symbolic link to a file

Parameters:
source: str or pathlib.Path

Source file

destination: str or pathlib.Path

Symbolic link file

FirnCorr.utilities.check_ftp_connection(HOST: str, username: str | None = None, password: str | None = None)[source]

Check internet connection with ftp host

Parameters:
HOST: str

Remote ftp host

username: str or NoneType

ftp username

password: str or NoneType

ftp password

FirnCorr.utilities.ftp_list(HOST: str | list, username: str | None = None, password: str | None = None, timeout: int | None = None, basename: bool = False, pattern: str | None = None, sort: bool = False)[source]

List a directory on a ftp host

Parameters:
HOST: str or list

Remote ftp host path split as list

username: str or NoneType

ftp username

password: str or NoneType

ftp password

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

basename: bool, default False

Return the file or directory basename instead of the full path

pattern: str or NoneType, default None

Regular expression pattern for reducing list

sort: bool, default False

Sort output list

Returns:
output: list

Items in a directory

mtimes: list

Last modification times for items in the directory

FirnCorr.utilities.from_ftp(HOST: str | list, username: str | None = None, password: str | None = None, timeout: int | None = None, local: str | ~pathlib.Path | None = None, hash: str = '', chunk: int = 8192, verbose: bool = False, fid: object = <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, label: str | None = None, mode: oct = 509, **kwargs)[source]

Download a file from a ftp host

Parameters:
HOST: str or list

Remote ftp host path

username: str or NoneType

ftp username

password: str or NoneType

ftp password

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

local: str, pathlib.Path or NoneType, default None

Path to local file

hash: str, default ‘’

MD5 hash of local file

chunk: int, default 8192

Chunk size for transfer encoding

verbose: bool, default False

Print file transfer information

fid: object, default sys.stdout

Open file object for logging file transfers if verbose

label: str or NoneType, default None

Label for logging file transfer information if verbose

mode: oct, default 0o775

Permissions mode of output local file

Returns:
remote_buffer: obj

BytesIO representation of file

FirnCorr.utilities._create_default_ssl_context() SSLContext[source]

Creates the default SSL context

FirnCorr.utilities._create_ssl_context_no_verify() SSLContext[source]

Creates an SSL context for unverified connections

FirnCorr.utilities._set_ssl_context_options(context: SSLContext) None[source]

Sets the default options for the SSL context

FirnCorr.utilities.check_connection(HOST: str, context: ~ssl.SSLContext = <ssl.SSLContext object>, timeout: int = 20)[source]

Check internet connection with http host

Parameters:
HOST: str

Remote http host

context: obj, default FirnCorr.utilities._default_ssl_context

SSL context for urllib opener object

timeout: int, default 20

Timeout in seconds for blocking operations

FirnCorr.utilities.http_list(HOST: str | list, timeout: int | None = None, context: ~ssl.SSLContext = <ssl.SSLContext object>, parser=<lxml.etree.HTMLParser object>, format: str = '%Y-%m-%d %H:%M', pattern: str = '', sort: bool = False, **kwargs)[source]

List a directory on an Apache http Server

Parameters:
HOST: str or list

Remote http host path

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

context: obj, default FirnCorr.utilities._default_ssl_context

SSL context for urllib opener object

parser: obj, default lxml.etree.HTMLParser()

HTML parser for lxml

format: str, default ‘%Y-%m-%d %H:%M’

Format for input time string

pattern: str, default ‘’

Regular expression pattern for reducing list

sort: bool, default False

Sort output list

Returns:
colnames: list

Column names in a directory

collastmod: list

Last modification times for items in the directory
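
The `pattern` and `sort` options act on the parsed listing. A sketch of that reduction step is shown below, with the HTML parsing left out; `reduce_listing` is a hypothetical helper, not a FirnCorr function.

```python
import re


def reduce_listing(names, mtimes, pattern="", sort=False):
    """Filter a listing by regular expression and optionally sort it."""
    if pattern:
        keep = [i for i, name in enumerate(names) if re.search(pattern, name)]
        names = [names[i] for i in keep]
        mtimes = [mtimes[i] for i in keep]
    if sort:
        # sort by name and keep each modification time with its item
        order = sorted(range(len(names)), key=lambda i: names[i])
        names = [names[i] for i in order]
        mtimes = [mtimes[i] for i in order]
    return names, mtimes


listing = ["b_2020.nc", "readme.txt", "a_2019.nc"]
times = [30.0, 20.0, 10.0]
matched, matched_times = reduce_listing(listing, times, r"\.nc$", sort=True)
```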

FirnCorr.utilities.from_http(HOST: str | list, timeout: int | None = None, context: ~ssl.SSLContext = <ssl.SSLContext object>, local: str | ~pathlib.Path | None = None, hash: str = '', chunk: int = 16384, headers: dict = {}, verbose: bool = False, fid: object = <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, label: str | None = None, mode: oct = 509, **kwargs)[source]

Download a file from a http host

Parameters:
HOST: str or list

Remote http host path split as list

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

context: obj, default FirnCorr.utilities._default_ssl_context

SSL context for urllib opener object

local: str, pathlib.Path or NoneType, default None

Path to local file

hash: str, default ‘’

MD5 hash of local file

chunk: int, default 16384

Chunk size for transfer encoding

headers: dict, default {}

Dictionary of headers to append to the URL request

verbose: bool, default False

Print file transfer information

fid: object, default sys.stdout

Open file object for logging file transfers if verbose

label: str or NoneType, default None

Label for logging file transfer information if verbose

mode: oct, default 0o775

Permissions mode of output local file

Returns:
remote_buffer: obj

BytesIO representation of file
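
The `hash`, `chunk`, `local` and `mode` parameters suggest a download that is buffered in memory and written to disk only when the content differs from the local copy. The sketch below illustrates that idea with an in-memory stream standing in for the HTTP response; `save_if_changed` is a hypothetical helper and the exact logic in FirnCorr may differ.

```python
import hashlib
import io
import shutil
import tempfile
from pathlib import Path


def save_if_changed(response, local, hash="", chunk=16384, mode=0o775):
    """Buffer a stream in chunks and write it if the MD5 hash differs."""
    remote_buffer = io.BytesIO()
    shutil.copyfileobj(response, remote_buffer, chunk)
    remote_buffer.seek(0)
    remote_hash = hashlib.md5(remote_buffer.getvalue()).hexdigest()
    if local is not None and hash != remote_hash:
        local = Path(local)
        local.parent.mkdir(parents=True, exist_ok=True)
        with local.open("wb") as f:
            shutil.copyfileobj(remote_buffer, f, chunk)
        local.chmod(mode)
        remote_buffer.seek(0)
    return remote_buffer


payload = b"example granule bytes"
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "granule.nc"
    buffer = save_if_changed(io.BytesIO(payload), target, chunk=8)
    written = target.read_bytes()
    # a matching hash means the existing file is left untouched
    target.write_bytes(b"sentinel")
    known = hashlib.md5(payload).hexdigest()
    save_if_changed(io.BytesIO(payload), target, hash=known)
    untouched = target.read_bytes()
```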

FirnCorr.utilities.from_json(HOST: str | list, timeout: int | None = None, context: ~ssl.SSLContext = <ssl.SSLContext object>, headers: dict = {}) dict[source]

Load a JSON response from a http host

Parameters:
HOST: str or list

Remote http host path split as list

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

context: obj, default FirnCorr.utilities._default_ssl_context

SSL context for urllib opener object

headers: dict, default {}

Dictionary of headers to append to the URL request

Returns:
json_response: dict

JSON response

FirnCorr.utilities.mar_list(HOST: str | list, timeout: int | None = None, context: ~ssl.SSLContext = <ssl.SSLContext object>, parser=<lxml.etree.HTMLParser object>, pattern: str = '', sort: bool = False)[source]

List a directory from the MAR server at the Université de Liège

Parameters:
HOST: str or list

Remote http host path

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

context: obj, default FirnCorr.utilities._default_ssl_context

SSL context for urllib opener object

parser: obj, default lxml.etree.HTMLParser()

HTML parser for lxml

pattern: str, default ‘’

Regular expression pattern for reducing list

sort: bool, default False

Sort output list

Returns:
colnames: list

Column names in a directory

collastmod: list

Last modification times for items in the directory

FirnCorr.utilities.build_opener(username: str, password: str, context: ~ssl.SSLContext = <ssl.SSLContext object>, password_manager: bool = False, get_ca_certs: bool = False, redirect: bool = False, authorization_header: bool = True, urs: str = 'https://urs.earthdata.nasa.gov')[source]

Build urllib opener for NASA Earthdata with supplied credentials

Parameters:
username: str

NASA Earthdata username

password: str

NASA Earthdata password

context: obj, default ssl.SSLContext(ssl.PROTOCOL_TLS)

SSL context for urllib opener object

password_manager: bool, default False

Create password manager context using default realm

get_ca_certs: bool, default False

Get list of loaded “certification authority” certificates

redirect: bool, default False

Create redirect handler object

authorization_header: bool, default True

Add base64 encoded authorization header to opener

urs: str, default ‘https://urs.earthdata.nasa.gov’

Earthdata login URS 3 host

Returns:
opener: object

OpenerDirector instance
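
The `authorization_header` option adds a base64-encoded credential to the opener. A minimal sketch of that single step follows, leaving out the SSL context, password manager and redirect handling; `basic_auth_opener` is a hypothetical helper and the credentials are placeholders.

```python
import base64
import urllib.request


def basic_auth_opener(username, password):
    """Build an opener that sends an HTTP Basic authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    opener = urllib.request.build_opener()
    # note: this header is sent with every request made through the opener
    opener.addheaders = [("Authorization", f"Basic {token}")]
    return opener


opener = basic_auth_opener("user", "secret")
header = dict(opener.addheaders)["Authorization"]
```

Base64 is an encoding rather than encryption, so an opener like this should only be used over HTTPS.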

FirnCorr.utilities.gesdisc_list(HOST: str | list, username: str | None = None, password: str | None = None, build: bool = False, timeout: int | None = None, urs: str = 'urs.earthdata.nasa.gov', parser=<lxml.etree.HTMLParser object>, format: str = '%Y-%m-%d %H:%M', pattern: str = '', sort: bool = False)[source]

List a directory on NASA GES DISC servers

Parameters:
HOST: str or list

Remote https host

username: str or NoneType, default None

NASA Earthdata username

password: str or NoneType, default None

NASA Earthdata password

build: bool, default False

Build opener with NASA Earthdata credentials

timeout: int or NoneType, default None

Timeout in seconds for blocking operations

parser: obj, default lxml.etree.HTMLParser()

HTML parser for lxml

format: str, default ‘%Y-%m-%d %H:%M’

Format for input time string

pattern: str, default ‘’

Regular expression pattern for reducing list

sort: bool, default False

Sort output list

Returns:
colnames: list

Column names in a directory

collastmod: list

Last modification times for items in the directory

FirnCorr.utilities.cmr_filter_json(search_results: dict, endpoint: str = 'data', request_type: str = 'application/x-netcdf')[source]

Filter the CMR json response for desired data files

Parameters:
search_results: dict

JSON response from CMR query

endpoint: str, default ‘data’

URL endpoint type

  • 'data': NASA Earthdata https archive

  • 'opendap': NASA Earthdata OPeNDAP archive

  • 's3': NASA Earthdata Cumulus AWS S3 bucket

request_type: str, default ‘application/x-netcdf’

Data type for reducing CMR query

Returns:
granule_names: list

Model granule names

granule_urls: list

Model granule urls

granule_mtimes: list

Model granule modification times

FirnCorr.utilities.cmr(short_name: str, version: str | None = None, start_date: str | None = None, end_date: str | None = None, provider: str = 'GES_DISC', endpoint: str = 'data', request_type: str = 'application/x-netcdf', verbose: bool = False, fid: object = <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Query the NASA Common Metadata Repository (CMR) for model data

Parameters:
short_name: str

Model shortname in the CMR system

version: str or NoneType, default None

Model version

start_date: str or NoneType, default None

Starting date for CMR product query

end_date: str or NoneType, default None

Ending date for CMR product query

provider: str, default ‘GES_DISC’

CMR data provider

  • 'GES_DISC': GESDISC

  • 'GESDISCCLD': GESDISC Cumulus

  • 'PODAAC': PO.DAAC Drive

  • 'POCLOUD': PO.DAAC Cumulus

endpoint: str, default ‘data’

URL endpoint type

  • 'data': NASA Earthdata https archive

  • 'opendap': NASA Earthdata OPeNDAP archive

  • 's3': NASA Earthdata Cumulus AWS S3 bucket

request_type: str, default ‘application/x-netcdf’

Data type for reducing CMR query

verbose: bool, default False

Print CMR query information

fid: object, default sys.stdout

Open file object for logging CMR URL if verbose

Returns:
granule_names: list

Model granule names

granule_urls: list

Model granule urls

granule_mtimes: list

Model granule modification times

FirnCorr.utilities.build_request(short_name: str, dataset_version: str, url: str, host: str | None = None, variables: list = [], format: str = 'bmM0Lw', service: str = 'L34RS_MERRA2', version: str = '1.02', bbox: list[int] | list[float] = [-90, -180, 90, 180], **kwargs)[source]

Build requests for the GES DISC subsetting API

Parameters:
short_name: str

Model shortname in the CMR system

dataset_version: str

Model version

url: str

URL for granule returned by the CMR system

host: str or NoneType, default None

Override host provider for GES DISC subsetting

Default is host provider given by CMR request

variables: list, default []

Variables for product to subset

format: str, default ‘bmM0Lw’

Coded output format for GES DISC subsetting API

service: str, default ‘L34RS_MERRA2’

GES DISC subsetting API service

version: str, default ‘1.02’

GES DISC subsetting API service version

bbox: list, default [-90,-180,90,180]

Bounding box to spatially subset

kwargs: dict, default {}

Additional parameters for GES DISC subsetting API

Returns:
request_url: str

Formatted URL for GES DISC subsetting API