Welcome to Feathr’s Python API documentation!
If you are an end user, read Feathr User APIs.
If you have any suggestions for our API documentation, please help us improve it by creating a GitHub issue.
Feathr APIs for End Users
- class feathr.ABC[source]
Bases:
object
Helper class that provides a standard way to create an ABC using inheritance.
- class feathr.Aggregation(value)[source]
Bases:
Enum
The built-in aggregation functions for LookupFeature
- AVG = 1
- ELEMENTWISE_AVG = 6
- ELEMENTWISE_MAX = 8
- ELEMENTWISE_MIN = 7
- ELEMENTWISE_SUM = 9
- LATEST = 10
- MAX = 2
- MIN = 3
- NOP = 0
- SUM = 4
- UNION = 5
- class feathr.AvroJsonSchema(schemaStr: str)[source]
Bases:
SourceSchema
Avro schema written in Json string form.
- class feathr.BackfillTime(start: datetime, end: datetime, step: timedelta)[source]
Bases:
object
Time range to materialize/backfill feature data.
- start
start time of the backfill, inclusive.
- end
end time of the backfill, inclusive.
- step
duration of each backfill step. e.g. if daily, use timedelta(days=1)
- class feathr.BooleanFeatureType[source]
Bases:
FeatureType
Boolean feature value, either true or false.
- class feathr.BytesFeatureType[source]
Bases:
FeatureType
Bytes feature value.
- class feathr.DerivedFeature(name: str, feature_type: ~feathr.definition.dtype.FeatureType, input_features: ~typing.Union[~feathr.definition.feature.FeatureBase, ~typing.List[~feathr.definition.feature.FeatureBase]], transform: ~typing.Union[str, ~feathr.definition.transformation.RowTransformation], key: ~typing.Optional[~typing.Union[~feathr.definition.typed_key.TypedKey, ~typing.List[~feathr.definition.typed_key.TypedKey]]] = [<feathr.definition.typed_key.TypedKey object>], registry_tags: ~typing.Optional[~typing.Dict[str, str]] = None)[source]
Bases:
FeatureBase
A derived feature is a feature defined on top of other features, rather than external data source.
- name
derived feature name
- feature_type
type of derived feature
- key
All features with corresponding keys that this derived feature depends on
- input_features
features that this derived features depends on
- transform
transformation that produces the derived feature value, based on the input_features
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this feature is deprecated, etc.
- class feathr.DoubleFeatureType[source]
Bases:
FeatureType
Double feature value, for example, 1.3d, 2.4d. Double has better precision than float.
- class feathr.DoubleVectorFeatureType[source]
Bases:
FeatureType
Double vector feature value, for example, [1.3d, 3.3d, 9.3d]
- class feathr.Enum(value)[source]
Bases:
object
Generic enumeration.
Derive from this class to define new enumerations.
- name
The name of the Enum member.
- value
The value of the Enum member.
- class feathr.ErrorCode(value)[source]
Bases:
Enum
An enumeration.
- invalid_code = 1
- not_implemented = 2
- class feathr.ExpressionTransformation(expr: str)[source]
Bases:
RowTransformation
A row-level transformation defined with the Feathr expression language.
- expr
expression that transforms the raw value into a new value, e.g. amount * 10.
- class feathr.FeathrClient(config_path: str = './feathr_config.yaml', local_workspace_dir: Optional[str] = None, credential=None, project_registry_tag: Optional[Dict[str, str]] = None)[source]
Bases:
object
Feathr client.
The client is used to create training dataset, materialize features, register features, and fetch features from the online storage.
For offline storage and compute engine, Azure ADLS, AWS S3 and Azure Synapse are supported.
For online storage, currently only Redis is supported. Users of this client are responsible for setting up all the information needed to start a Redis client, via environment variables or a Spark cluster. Host address, port, and password are needed to start the Redis client.
- config_path
config path. See [Feathr Config Template](https://github.com/linkedin/feathr/blob/main/feathr_project/feathrcli/data/feathr_user_workspace/feathr_config.yaml) for more details. Defaults to “./feathr_config.yaml”.
- Type
str, optional
- local_workspace_dir
the local workspace directory. If not set, Feathr will create a temporary folder to store local workspace related files.
- Type
str, optional
- credential
credential to access cloud resources, most likely to be the returned result of DefaultAzureCredential(). If not set, Feathr will initialize DefaultAzureCredential() inside the __init__ function to get credentials.
- Type
optional
- project_registry_tag
tags to add to the project in the Feathr registry. This might be useful if you want to tag your project as deprecated, or allow certain customizations at the project level. Defaults to empty.
- Raises
RuntimeError – Failed to create the client because the necessary environment variables for Redis client creation are not set.
- build_features(anchor_list: List[FeatureAnchor] = [], derived_feature_list: List[DerivedFeature] = [], verbose: bool = False)[source]
Build features based on the current workspace. All actions that trigger a Spark job will be based on the result of this action.
- get_features_from_registry(project_name: str) Dict[str, FeatureBase] [source]
Get features from the registry by project name. Features retrieved from the registry are automatically built.
- get_offline_features(observation_settings: ObservationSettings, feature_query: Union[FeatureQuery, List[FeatureQuery]], output_path: str, execution_configuratons: Optional[Union[SparkExecutionConfiguration, Dict[str, str]]] = None, udf_files=None, verbose: bool = False)[source]
Get offline features for the observation dataset.
- Parameters
observation_settings – settings of the observation data, e.g. timestamp columns, input path, etc.
feature_query – features that are requested to add onto the observation data
output_path – output path of the job, i.e. the observation data with features attached.
execution_configuratons – a dict that will be passed to the Spark job when the job starts up, i.e. the “spark configurations”. Note that not all of the configurations will be honored, since some of them are managed by the Spark platform, such as Databricks or Azure Synapse. Refer to the [spark documentation](https://spark.apache.org/docs/latest/configuration.html) for a complete list of Spark configurations.
- get_online_features(feature_table, key, feature_names)[source]
Fetches feature values for a certain key from an online feature table.
- Parameters
feature_table – the name of the feature table.
key – the key of the entity
feature_names – list of feature names to fetch
- Returns
A list of feature values for this entity, ordered by the requested feature names. For example, if feature_names = [‘f_is_medium_trip_distance’, ‘f_day_of_week’, ‘f_day_of_month’, ‘f_hour_of_day’], then the returned feature values are [b’true’, b’4.0’, b’31.0’, b’23.0’]. If the feature_table or key doesn’t exist, a list of Nones is returned, for example [None, None, None, None]. If a feature doesn’t exist, a None is returned for that feature, for example [None, b’4.0’, b’31.0’, b’23.0’].
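A small sketch of wrapping this call. The table and feature names are hypothetical, and client is assumed to be an already configured FeathrClient with a reachable Redis instance:

```python
def fetch_trip_features(client, key):
    """Fetch two hypothetical features for one entity and decode the bytes."""
    values = client.get_online_features(feature_table="nycTaxiDemoFeature",
                                        key=key,
                                        feature_names=["f_day_of_week",
                                                       "f_hour_of_day"])
    # Per the docs above, missing features come back as None; decode the rest.
    return [None if v is None else float(v) for v in values]
```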
- list_registered_features(project_name: Optional[str] = None) List[str] [source]
List all the already registered features. If project_name is not provided or is None, it will return all the registered features; otherwise it will only return features under this project
- materialize_features(settings: MaterializationSettings, execution_configuratons: Optional[Union[SparkExecutionConfiguration, Dict[str, str]]] = None, verbose: bool = False)[source]
Materialize feature data
- Parameters
settings – Feature materialization settings
execution_configuratons – a dict that will be passed to the Spark job when the job starts up, i.e. the “spark configurations”. Note that not all of the configurations will be honored, since some of them are managed by the Spark platform, such as Databricks or Azure Synapse. Refer to the [spark documentation](https://spark.apache.org/docs/latest/configuration.html) for a complete list of Spark configurations.
- multi_get_online_features(feature_table, keys, feature_names)[source]
Fetches feature values for a list of keys from an online feature table. This is the batch version of the get API.
- Parameters
feature_table – the name of the feature table.
keys – list of keys for the entities
feature_names – list of feature names to fetch
- Returns
A dict of feature values for the requested entities, ordered by the requested feature names. For example, if keys = [12, 24] and feature_names = [‘f_is_medium_trip_distance’, ‘f_day_of_week’, ‘f_day_of_month’, ‘f_hour_of_day’], then the returned feature values are {‘12’: [b’false’, b’5.0’, b’1.0’, b’0.0’], ‘24’: [b’true’, b’4.0’, b’31.0’, b’23.0’]}. If the feature_table or key doesn’t exist, lists of Nones are returned, for example {‘12’: [None, None, None, None], ‘24’: [None, None, None, None]}. If a feature doesn’t exist, a None is returned for that feature, for example {‘12’: [None, b’4.0’, b’31.0’, b’23.0’], ‘24’: [b’true’, b’4.0’, b’31.0’, b’23.0’]}.
- register_features(from_context: bool = True)[source]
Registers features based on the current workspace
- Parameters
from_context – If from_context is True (default), the features will be generated from the current context, with the features previously built by client.build_features(). Otherwise, the features will be generated from configuration files.
- class feathr.Feature(name: str, feature_type: ~feathr.definition.dtype.FeatureType, key: ~typing.Optional[~typing.Union[~feathr.definition.typed_key.TypedKey, ~typing.List[~feathr.definition.typed_key.TypedKey]]] = [<feathr.definition.typed_key.TypedKey object>], transform: ~typing.Optional[~typing.Union[str, ~feathr.definition.transformation.Transformation]] = None, registry_tags: ~typing.Optional[~typing.Dict[str, str]] = None)[source]
Bases:
FeatureBase
A feature is an individual measurable property or characteristic of an entity. It has a feature name, feature type, and a convenient row transformation used to produce its feature value.
- name
Unique name of the feature. Only letters, numbers, and ‘_’ are allowed in the name, and it cannot start with a number. Note that ‘.’ is NOT allowed!
- feature_type
the feature value type. e.g. INT32, FLOAT, etc. Should be part of feathr.dtype
- key
The key of this feature. e.g. user_id.
- transform
A row transformation used to produce its feature value. e.g. amount * 10
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this feature is deprecated, etc.
- class feathr.FeatureAnchor(name: str, source: Source, features: List[Feature], registry_tags: Optional[Dict[str, str]] = None)[source]
Bases:
HoconConvertible
A feature anchor defines a set of features on top of a data source, a.k.a. a set of features anchored to a source.
A feature producer can write multiple anchors for a feature, exposing the same feature name for feature consumers to reference.
- name
Unique name of the anchor.
- source
data source that the features are anchored to. Should be either of INPUT_CONTEXT or feathr.source.Source
- features
list of features defined within this anchor.
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this anchor is deprecated, etc.
- class feathr.FeatureBase(name: str, feature_type: ~feathr.definition.dtype.FeatureType, transform: ~typing.Optional[~typing.Union[str, ~feathr.definition.transformation.Transformation]] = None, key: ~typing.Optional[~typing.Union[~feathr.definition.typed_key.TypedKey, ~typing.List[~feathr.definition.typed_key.TypedKey]]] = [<feathr.definition.typed_key.TypedKey object>], registry_tags: ~typing.Optional[~typing.Dict[str, str]] = None)[source]
Bases:
HoconConvertible
The base class for features. It has a feature name, feature type, and a convenient transformation used to produce its feature value.
- name
Unique name of the feature. Only letters, numbers, and ‘_’ are allowed in the name, and it cannot start with a number. Note that ‘.’ is NOT allowed!
- feature_type
the feature value type. e.g. INT32, FLOAT, etc. feathr.dtype
- key
The key of this feature. e.g. user_id.
- transform
A transformation used to produce its feature value. e.g. amount * 10
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this feature is deprecated, etc.
- as_feature(feature_alias)[source]
Provide the feature a different alias, which can be used to reference the feature in a transformation expression. This is useful for derived features that depend on the same feature with different keys.
- exception feathr.FeatureNameValidationError[source]
Bases:
ValueError
An exception for feature name validation. Feature names must consist of letters, numbers, or underscores, and cannot begin with a number. Periods are also disallowed, as some compute engines, such as Spark, will treat them as operators in feature names.
- class feathr.FeaturePrinter[source]
Bases:
object
The class for pretty-printing features
- static pretty_print_anchors(anchor_list: List[FeatureAnchor]) None [source]
Pretty print feature anchors
- Parameters
anchor_list – a list of FeatureAnchor
- static pretty_print_feature_query(feature_query: FeatureQuery) None [source]
Pretty print feature query
- Parameters
feature_query – feature query
- static pretty_print_materialize_features(settings: MaterializationSettings) None [source]
Pretty print materialization settings
- Parameters
settings – materialization settings
- class feathr.FeatureQuery(feature_list: List[str], key: Optional[Union[TypedKey, List[TypedKey]]] = None)[source]
Bases:
HoconConvertible
A FeatureQuery contains a list of features
- feature_list
a list of feature names
- key
key of the features in feature_list; all features must share the same key
- class feathr.FeatureType[source]
Bases:
HoconConvertible
Base class for all feature types
- class feathr.FloatFeatureType[source]
Bases:
FeatureType
Float feature value, for example, 1.3f, 2.4f.
- class feathr.FloatVectorFeatureType[source]
Bases:
FeatureType
Float vector feature value, for example, [1.3f, 2.4f, 3.9f]
- class feathr.HdfsSink(output_path: str)[source]
Bases:
Sink
Offline HDFS-compatible (HDFS, Delta Lake, Azure Blob Storage, etc.) sink used to store feature data. The result is in Avro format.
- output_path
output path
- class feathr.HdfsSource(name: str, path: str, preprocessing: Optional[Callable] = None, event_timestamp_column: Optional[str] = None, timestamp_format: Optional[str] = 'epoch', registry_tags: Optional[Dict[str, str]] = None)[source]
Bases:
Source
A data source (table) stored on an HDFS-like file system. Data can be fetched through a POSIX-style path.
- preprocessing
A preprocessing python function that transforms the source data for further feature transformation.
- Type
Optional[Callable]
- event_timestamp_column
The timestamp field of your record. Sliding window aggregation features assume each record in the source data has a timestamp column.
- Type
Optional[str]
- timestamp_format
The format of the timestamp field. Defaults to “epoch”. Possible values are: - epoch (seconds since epoch), for example 1647737463 - epoch_millis (milliseconds since epoch), for example 1647737517761 - Any date formats supported by [SimpleDateFormat](https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html).
- Type
Optional[str], optional
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this source is deprecated, etc.
- class feathr.HoconConvertible[source]
Bases:
ABC
Represent classes that can convert into Feathr HOCON config.
- class feathr.InputContext[source]
Bases:
Source
‘Passthrough’ source, a.k.a. request feature source. Feature source data is from the observation data itself. For example, you have an observation data table t1 and another feature data table t2. Some of your feature data can be transformed from the observation data table t1 itself, like geo location, then you can define that feature on top of the InputContext.
- class feathr.Int32FeatureType[source]
Bases:
FeatureType
32-bit integer feature value, for example, 123, 98765.
- class feathr.Int32VectorFeatureType[source]
Bases:
FeatureType
32-bit integer vector feature value, for example, [1, 3, 9]
- class feathr.Int64FeatureType[source]
Bases:
FeatureType
64-bit integer(a.k.a. Long in some system) feature value, for example, 123, 98765 but stored in 64-bit integer.
- class feathr.Int64VectorFeatureType[source]
Bases:
FeatureType
64-bit integer vector feature value, for example, [1, 3, 9]
- class feathr.KafKaSource(name: str, kafkaConfig: KafkaConfig)[source]
Bases:
Source
A Kafka source object, used in streaming feature ingestion.
- class feathr.KafkaConfig(brokers: List[str], topics: List[str], schema: SourceSchema)[source]
Bases:
object
Kafka config for a streaming source
- brokers
broker/server address
- topics
Kafka topics
- schema
Kafka message schema
- class feathr.LookupFeature(name: str, feature_type: ~feathr.definition.dtype.FeatureType, base_feature: ~feathr.definition.feature.FeatureBase, expansion_feature: ~feathr.definition.feature.FeatureBase, aggregation: ~feathr.definition.aggregation.Aggregation, key: ~typing.Optional[~typing.Union[~feathr.definition.typed_key.TypedKey, ~typing.List[~feathr.definition.typed_key.TypedKey]]] = [<feathr.definition.typed_key.TypedKey object>], registry_tags: ~typing.Optional[~typing.Dict[str, str]] = None)[source]
Bases:
FeatureBase
A lookup feature is a feature defined on top of two other features: the feature value of the base feature is used as a key to look up the feature value from the expansion feature. For example, a lookup feature user_purchased_item_avg_price could be keyed by user_id and computed as follows: the base feature is user_purchased_item_ids (for a given user_id, it returns the item ids purchased by the user); the expansion feature is item_price (for a given item id, it returns the item price); the aggregation function is the average of the item prices.
- name
Derived feature name
- feature_type
Type of derived feature
- key
Join key of the derived feature
- base_feature
The feature value of this feature will be used as key to lookup from the expansion feature
- expansion_feature
The feature to be looked up
- aggregation
Specifies the aggregation applied to the feature values looked up from the expansion feature, for the case where the base feature value is converted into multiple lookup keys, e.g. the feature value is an array and each element is used once as a lookup key.
- class feathr.MaterializationSettings(name: str, sinks: List[Sink], feature_names: List[str], backfill_time: Optional[BackfillTime] = None)[source]
Bases:
object
Settings for feature materialization.
- name
The materialization job name
- sinks
sinks where the materialized features should be written to
- feature_names
list of feature names to be materialized
- backfill_time
time range and frequency for the materialization. Defaults to now().
- get_backfill_cutoff_time() List[datetime] [source]
Get the backfill cutoff time points for materialization. E.g. for BackfillTime(start=datetime(2022, 3, 1), end=datetime(2022, 3, 5), step=timedelta(days=1)), it returns the cutoff time list [2022-3-1, 2022-3-2, 2022-3-3, 2022-3-4, 2022-3-5]; for BackfillTime(start=datetime(2022, 3, 1, 1), end=datetime(2022, 3, 1, 5), step=timedelta(hours=1)), it returns [2022-3-1 01:00:00, 2022-3-1 02:00:00, 2022-3-1 03:00:00, 2022-3-1 04:00:00, 2022-3-1 05:00:00].
- class feathr.ObservationSettings(observation_path: str, event_timestamp_column: Optional[str] = None, timestamp_format: str = 'epoch')[source]
Bases:
HoconConvertible
Time settings of the observation data. Used in feature join.
- observation_path
path to the observation dataset, i.e. the input dataset to join features onto
- event_timestamp_column
The timestamp field of your record. Sliding window aggregation features assume each record in the source data has a timestamp column.
- Type
Optional[str]
- timestamp_format
The format of the timestamp field. Defaults to “epoch”. Possible values are: - epoch (seconds since epoch), for example 1647737463 - epoch_millis (milliseconds since epoch), for example 1647737517761 - Any date formats supported by [SimpleDateFormat](https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html).
- Type
Optional[str], optional
- class feathr.RedisSink(table_name: str, streaming: bool = False, streamingTimeoutMs: Optional[int] = None)[source]
Bases:
Sink
Redis-based sink used to store online feature data; can be used in batch or streaming jobs.
- table_name
output table name
- streaming
whether it is used in streaming mode
- streamingTimeoutMs
maximum running time for streaming mode. It is not used in batch mode.
- class feathr.RowTransformation[source]
Bases:
Transformation
Base class for all row-level transformations.
- class feathr.Sink[source]
Bases:
HoconConvertible
A data sink.
- class feathr.Source(name: str, event_timestamp_column: Optional[str] = '0', timestamp_format: Optional[str] = 'epoch', registry_tags: Optional[Dict[str, str]] = None)[source]
Bases:
HoconConvertible
External data source for features. Typically a data ‘table’.
- name
name of the source. It’s used to differentiate from other sources.
- event_timestamp_column
column name or field name of the event timestamp
- timestamp_format
the format of the event_timestamp_column, e.g. yyyy/MM/DD, or EPOCH
- registry_tags
A dict of (str, str) that you can pass to feature registry for better organization. For example, you can use {“deprecated”: “true”} to indicate this source is deprecated, etc.
- class feathr.SourceSchema[source]
Bases:
HoconConvertible
- class feathr.SparkExecutionConfiguration(spark_execution_configuration=typing.Dict[str, str])[source]
Bases:
object
A wrapper class to enable Spark execution configurations, which will be passed to the underlying Spark engine.
- spark_execution_configuration
dict[str, str] which will be passed to the underlying spark engine
- Returns
dict[str, str]
- class feathr.StringFeatureType[source]
Bases:
FeatureType
String feature value, for example, ‘apple’, ‘orange’.
- class feathr.Template(source: ~typing.Union[str, ~jinja2.nodes.Template], block_start_string: str = '{%', block_end_string: str = '%}', variable_start_string: str = '{{', variable_end_string: str = '}}', comment_start_string: str = '{#', comment_end_string: str = '#}', line_statement_prefix: ~typing.Optional[str] = None, line_comment_prefix: ~typing.Optional[str] = None, trim_blocks: bool = False, lstrip_blocks: bool = False, newline_sequence: te.Literal['\n', '\r\n', '\r'] = '\n', keep_trailing_newline: bool = False, extensions: ~typing.Sequence[~typing.Union[str, ~typing.Type[Extension]]] = (), optimized: bool = True, undefined: ~typing.Type[~jinja2.runtime.Undefined] = <class 'jinja2.runtime.Undefined'>, finalize: ~typing.Optional[~typing.Callable[[...], ~typing.Any]] = None, autoescape: ~typing.Union[bool, ~typing.Callable[[~typing.Optional[str]], bool]] = False, enable_async: bool = False)[source]
Bases:
object
A compiled template that can be rendered.
Use the methods on Environment to create or load templates. The environment is used to configure how templates are compiled and behave. It is also possible to create a template object directly; this is not usually recommended. The constructor takes most of the same arguments as Environment. All templates created with the same environment arguments share the same ephemeral Environment instance behind the scenes.
A template object should be considered immutable. Modifications on the object are not supported.
- environment: Environment
- environment_class
alias of
Environment
- classmethod from_code(environment: Environment, code: code, globals: MutableMapping[str, Any], uptodate: Optional[Callable[[], bool]] = None) Template [source]
Creates a template object from compiled code and the globals. This is used by the loaders and environment to create a template object.
- classmethod from_module_dict(environment: Environment, module_dict: MutableMapping[str, Any], globals: MutableMapping[str, Any]) Template [source]
Creates a template object from a module. This is used by the module loader to create a template object.
New in version 2.4.
- generate(*args: Any, **kwargs: Any) Iterator[str] [source]
For very large templates it can be useful to not render the whole template at once but evaluate each statement after another and yield piece for piece. This method basically does exactly that and returns a generator that yields one item after another as strings.
It accepts the same arguments as render().
- async generate_async(*args: Any, **kwargs: Any) AsyncIterator[str] [source]
An async version of generate(). Works very similarly but returns an async iterator instead.
- get_corresponding_lineno(lineno: int) int [source]
Return the source line number of a line number in the generated bytecode as they are not in sync.
- globals: MutableMapping[str, Any]
- make_module(vars: Optional[Dict[str, Any]] = None, shared: bool = False, locals: Optional[Mapping[str, Any]] = None) TemplateModule [source]
This method works like the module attribute when called without arguments, but it will evaluate the template on every call rather than caching it. It’s also possible to provide a dict which is then used as context. The arguments are the same as for the new_context() method.
- async make_module_async(vars: Optional[Dict[str, Any]] = None, shared: bool = False, locals: Optional[Mapping[str, Any]] = None) TemplateModule [source]
As template module creation can invoke template code for asynchronous executions, this method must be used instead of the normal make_module() one. Likewise, the module attribute becomes unavailable in async mode.
- property module: TemplateModule
The template as module. This is used for imports in the template runtime but is also useful if one wants to access exported template variables from the Python layer:
>>> t = Template('{% macro foo() %}42{% endmacro %}23')
>>> str(t.module)
'23'
>>> t.module.foo() == u'42'
True
This attribute is not available if async mode is enabled.
- new_context(vars: Optional[Dict[str, Any]] = None, shared: bool = False, locals: Optional[Mapping[str, Any]] = None) Context [source]
Create a new Context for this template. The vars provided will be passed to the template. Per default the globals are added to the context. If shared is set to True, the data is passed as-is to the context without adding the globals. locals can be a dict of local variables for internal usage.
- render(*args: Any, **kwargs: Any) str [source]
This method accepts the same arguments as the dict constructor: a dict, a dict subclass, or some keyword arguments. If no arguments are given the context will be empty. These two calls do the same:
template.render(knights='that say nih')
template.render({'knights': 'that say nih'})
This will return the rendered template as a string.
- async render_async(*args: Any, **kwargs: Any) str [source]
This works similarly to render() but returns a coroutine that, when awaited, returns the entire rendered template string. This requires the async feature to be enabled.
Example usage:
await template.render_async(knights='that say nih; asynchronously')
- stream(*args: Any, **kwargs: Any) TemplateStream [source]
Works exactly like generate() but returns a TemplateStream.
- class feathr.Transformation[source]
Bases:
HoconConvertible
Base class for all transformations that produce feature values.
- class feathr.TypedKey(key_column: str, key_column_type: ValueType, full_name: Optional[str] = None, description: Optional[str] = None, key_column_alias: Optional[str] = None)[source]
Bases:
object
The key of a feature. A feature is typically keyed by some id(s), e.g. product id, user id.
- key_column
The id column name of this key. e.g. ‘product_id’.
- key_column_type
Types of the key_column
- full_name
Unique name of the key. Recommend using [project_name].[key_name], e.g. ads.user_id
- description
Documentation for the key.
- key_column_alias
Used in some advanced derived features. Default to the key_column.
- class feathr.UdfTransform(name: str)[source]
Bases:
Transformation
User defined transformation. To be supported.
- name
name of the user defined function
- class feathr.ValueType(value)[source]
Bases:
Enum
Data type to describe feature keys or observation keys.
- UNSPECIFIED
key data type is unspecified.
- BOOL
key data type is boolean, either true or false
- INT32
key data type is 32-bit integer, for example, an invoice id, 93231.
- INT64
key data type is 64-bit integer, for example, an invoice id, 93231.
- FLOAT
key data type is float, for example, 123.4f.
- DOUBLE
key data type is double, for example, 123.4d.
- STRING
key data type is string, for example, a user name, ‘user_joe’
- BYTES
key data type is bytes.
- BOOL = 1
- BYTES = 7
- DOUBLE = 5
- FLOAT = 4
- INT32 = 2
- INT64 = 3
- STRING = 6
- UNSPECIFIED = 0
- class feathr.WindowAggTransformation(agg_expr: str, agg_func: str, window: str, group_by: Optional[str] = None, filter: Optional[str] = None, limit: Optional[int] = None)[source]
Bases:
Transformation
Aggregate the value of an expression over a fixed time window, e.g. sum(amount*10) over the last 3 days.
- agg_expr
expression that transforms the raw value into a new value, e.g. amount * 10
- agg_func
aggregation function. Available values: SUM, COUNT, MAX, MIN, AVG, MAX_POOLING, MIN_POOLING, AVG_POOLING, LATEST
- window
Time window length to apply the aggregation. Supports 4 types of units: d (day), h (hour), m (minute), s (second). Example values are “7d”, “5h”, “3m”, or “1s”.
- group_by
Feathr expression applied as the group-by field after the agg_expr transformation and before aggregation; same as ‘group by’ in SQL.
- filter
Feathr expression applied to each row as a filter before aggregation
- feathr.abstractmethod(funcobj)[source]
A decorator indicating abstract methods.
Requires that the metaclass is ABCMeta or derived from it. A class that has a metaclass derived from ABCMeta cannot be instantiated unless all of its abstract methods are overridden. The abstract methods can be called using any of the normal ‘super’ call mechanisms.
Usage:
class C(metaclass=ABCMeta):
    @abstractmethod
    def my_abstract_method(self, …):
        …
- feathr.copy(x)[source]
Shallow copy operation on arbitrary Python objects.
See the module’s __doc__ string for more info.
- class feathr.datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]])[source]
Bases:
date
The year, month and day arguments are required. tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be ints.
- astimezone()
tz -> convert to local time in new timezone tz
- combine()
date, time -> datetime with same date and time fields
- ctime()
Return ctime() style string.
- date()
Return date object with same year, month and day.
- dst()
Return self.tzinfo.dst(self).
- fold
- fromisoformat()
string -> datetime from datetime.isoformat() output
- fromtimestamp()
timestamp[, tz] -> tz’s local time from POSIX timestamp.
- hour
- isoformat()
[sep] -> string in ISO 8601 format, YYYY-MM-DDT[HH[:MM[:SS[.mmm[uuu]]]]][+HH:MM]. sep is used to separate the year from the time, and defaults to ‘T’. timespec specifies what components of the time to include (allowed values are ‘auto’, ‘hours’, ‘minutes’, ‘seconds’, ‘milliseconds’, and ‘microseconds’).
- max = datetime.datetime(9999, 12, 31, 23, 59, 59, 999999)
- microsecond
- min = datetime.datetime(1, 1, 1, 0, 0)
- minute
- now()
Returns new datetime object representing current time local to tz.
- tz
Timezone object.
If no tz is specified, uses local timezone.
- replace()
Return datetime with new specified fields.
- resolution = datetime.timedelta(microseconds=1)
- second
- strptime()
string, format -> new datetime parsed from a string (like time.strptime()).
- time()
Return time object with same time but with tzinfo=None.
- timestamp()
Return POSIX timestamp as float.
- timetuple()
Return time tuple, compatible with time.localtime().
- timetz()
Return time object with same time and tzinfo.
- tzinfo
- tzname()
Return self.tzinfo.tzname(self).
- utcfromtimestamp()
Construct a naive UTC datetime from a POSIX timestamp.
- utcnow()
Return a new datetime representing UTC day and time.
- utcoffset()
Return self.tzinfo.utcoffset(self).
- utctimetuple()
Return UTC time tuple, compatible with time.localtime().
- feathr.deepcopy(x, memo=None, _nil=[])[source]
Deep copy operation on arbitrary Python objects.
See the module’s __doc__ string for more info.
- feathr.get_result_df(client: FeathrClient, format: Optional[str] = None, res_url: Optional[str] = None) DataFrame [source]
Download the job result dataset from cloud as a Pandas dataframe.
- Parameters
format – format override; could be “parquet”, “delta”, etc.
res_url – output URL to download files from. Note that this will not block the job, so you need to make sure the job is finished and the result URL contains actual data.
- feathr.pprint(object, stream=None, indent=1, width=80, depth=None, *, compact=False)[source]
Pretty-print a Python object to a stream [default is sys.stdout].
- class feathr.timedelta[source]
Bases:
object
Difference between two datetime values.
timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
All arguments are optional and default to 0. Arguments may be integers or floats, and may be positive or negative.
- days
Number of days.
- max = datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)
- microseconds
Number of microseconds (>= 0 and less than 1 second).
- min = datetime.timedelta(days=-999999999)
- resolution = datetime.timedelta(microseconds=1)
- seconds
Number of seconds (>= 0 and less than 1 day).
- total_seconds()
Total seconds in the duration.