Corrupted registry breaks usage of feature store #3703

@LeonKolyang

Expected Behavior

Connecting to the feast registry and pulling features from a connected registry should work without an issue.
All feature views should be callable.

Current Behavior

The registry entry for one of four feature views appears to be corrupted. Opening a new connection succeeds for the first three feature views. On loading the fourth feature view, instantiation fails with the error below. The error occurs regardless of where we run store = FeatureStore(config=repo_config).

  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/bin/feast", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/cli.py", line 618, in materialize_incremental_command
    store = FeatureStore(repo_path=str(repo), fs_yaml_file=fs_yaml_file)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/usage.py", line 362, in wrapper
    raise exc.with_traceback(traceback)
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/usage.py", line 348, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/feature_store.py", line 160, in __init__
    self._registry = SqlRegistry(registry_config, self.config.project, None)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/infra/registry/sql.py", line 201, in __init__
    self.cached_registry_proto = self.proto()
                                 ^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/infra/registry/sql.py", line 842, in proto
    objs: List[Any] = lister(project)  # type: ignore
                      ^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/infra/registry/sql.py", line 582, in list_feature_views
    return self._list_objects(
           ^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/infra/registry/sql.py", line 1006, in _list_objects
    return [
           ^
  File "/root/.cache/pypoetry/virtualenvs/feast-poc-t0rhiBNj-py3.11/lib/python3.11/site-packages/feast/infra/registry/sql.py", line 1008, in <listcomp>
    proto_class.FromString(row[proto_field_name])
google.protobuf.message.DecodeError: Error parsing message

Existing connections fail to pull any feature view from the feature store with:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/mlserver/parallel/worker.py", line 136, in _process_request
    return_value = await method(
  File "/app/./customer-eta/eta/customer_eta.py", line 90, in predict
    _input = self._preprocessor.preprocess(payload)
  File "/app/./customer-eta/eta/steps/preprocessing.py", line 28, in preprocess
    feature_data = self.feature_store.get_online_features(
  File "/usr/local/lib/python3.10/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/feast/feature_store.py", line 1583, in get_online_features
    return self._get_online_features(
  File "/usr/local/lib/python3.10/site-packages/feast/feature_store.py", line 1605, in _get_online_features
    _feature_refs = self._get_features(features, allow_cache=True)
  File "/usr/local/lib/python3.10/site-packages/feast/feature_store.py", line 524, in _get_features
    feature_service_from_registry = self.get_feature_service(
  File "/usr/local/lib/python3.10/site-packages/feast/usage.py", line 299, in wrapper
    raise exc.with_traceback(traceback)
  File "/usr/local/lib/python3.10/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/feast/feature_store.py", line 386, in get_feature_service
    return self._registry.get_feature_service(name, self.project, allow_cache)
  File "/usr/local/lib/python3.10/site-packages/feast/infra/registry/sql.py", line 383, in get_feature_service
    self._refresh_cached_registry_if_necessary()
  File "/usr/local/lib/python3.10/site-packages/feast/infra/registry/sql.py", line 259, in _refresh_cached_registry_if_necessary
    self.refresh()
  File "/usr/local/lib/python3.10/site-packages/feast/infra/registry/sql.py", line 238, in refresh
    self.cached_registry_proto = self.proto()
  File "/usr/local/lib/python3.10/site-packages/feast/infra/registry/sql.py", line 828, in proto
    projects = self._get_all_projects()
  File "/usr/local/lib/python3.10/site-packages/feast/infra/registry/sql.py", line 1074, in _get_all_projects
    rows = conn.execute(stmt).all()
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1385, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1577, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1948, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2129, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1905, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.10/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.10/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.10/site-packages/pymysql/connections.py", line 557, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/usr/local/lib/python3.10/site-packages/pymysql/connections.py", line 861, in _execute_command
    self._write_bytes(packet)
  File "/usr/local/lib/python3.10/site-packages/pymysql/connections.py", line 806, in _write_bytes
    raise err.OperationalError(
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
[SQL: SELECT entities.entity_name, entities.project_id, entities.last_updated_timestamp, entities.entity_proto
FROM entities]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
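
The second traceback ends in a broken pipe on an existing MySQL connection, which usually points to a stale connection rather than to the stored data. As a possible stopgap on the serving side, the call could be retried so that a later attempt runs on a fresh connection. This is a minimal sketch under that assumption: call_with_retry is a hypothetical helper, not part of Feast or SQLAlchemy, and it has not been verified against this setup.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    transient: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying with exponential backoff on `transient` errors.

    Hypothetical helper: the final failure is re-raised unchanged so the
    caller still sees the original exception and traceback.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except transient:
            if attempt == attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

In the serving code from the traceback, fn would wrap the get_online_features call and transient would be (sqlalchemy.exc.OperationalError,). SQLAlchemy's create_engine also accepts pool_pre_ping and pool_recycle for handling stale pooled connections; whether Feast 0.31.1 lets the registry pass these through is not established in this report.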

Steps to reproduce

We haven't been able to reproduce the error yet. We have a CI job that runs feast materialize-incremental every 5 minutes, pushing the same demo data. At some point that job started failing, without any changes to the data, the dependencies, or the setup in general.

Authentication and manual querying of the registry and databases works fine.

Specifications

  • Version: feast == 0.31.1, python == 3.10.10
  • Platform: AWS, registry: RDS, online store: dynamodb, offline store: redshift
  • Subsystem:

Possible Solution

We dumped the feature store and recreated it by running feast apply. This is fine for now, since we are in an experimental setup. Once we move towards using feast in production, we would like to understand what is happening here and how to resolve it without dumping the whole feature store.
