tensorflow - Using keras.layers.Add() in a keras.sequential model

Question

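Using TF 2.0 and tfp probabilistic layers, I have built a keras.sequential model. I want to export it for serving with TensorFlow Serving, and I want to include the preprocessing and postprocessing steps in the servable.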

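My preprocessing steps are fairly simple: filling NAs with explicit values, encoding some strings as floats, normalizing inputs, and denormalizing outputs. For training, I have been using pandas and numpy for the pre/postprocessing.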

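I know that I can export my Keras model's weights, wrap the keras.sequential model's architecture in a bigger TensorFlow graph, use low-level ops like tf.math.subtract(inputs, vector_of_feature_means) for the pre/postprocessing operations, define tf.placeholders for my inputs and outputs, and make a servable, but I feel like there has to be a cleaner way of doing this.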

Is it possible to use keras.layers.Add() and keras.layers.Multiply() for explicit preprocessing steps inside a keras.sequential model, or is there a more standard way of doing these things?

score 0 · Accepted Answer

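As I understand it, the standard and efficient way of doing these things is to use TensorFlow Transform. Using TF Transform does not mean we have to adopt the entire TFX pipeline; TF Transform can also be used standalone.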

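TensorFlow Transform creates a Beam transformation graph, which injects these transformations as constants into the TensorFlow graph. Because the transformations are represented as constants in the graph, they are consistent across training and serving. The advantages of that consistency are: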

It eliminates training-serving skew.
It eliminates the need to include code in the serving system, which improves latency.
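
The idea of freezing analysis results as constants can be illustrated without TF Transform at all. The following is a minimal plain-Python sketch, not the TF Transform API: `analyze` and `make_transform` are hypothetical names, and min/max scaling stands in for whatever statistics a real pipeline would compute.

```python
def analyze(training_values):
    """Full pass over the training data; returns the statistics to freeze."""
    return {"min": min(training_values), "max": max(training_values)}


def make_transform(stats):
    """Build a per-example transform with the statistics baked in as constants."""
    lo, hi = stats["min"], stats["max"]
    span = hi - lo

    def transform(x):
        # Assumption for this sketch only: a constant column maps to 0.0.
        if span == 0:
            return 0.0
        return (x - lo) / span

    return transform


training_values = [10.0, 20.0, 30.0]
transform = make_transform(analyze(training_values))

# Training and serving call the same function with the same frozen constants,
# so a serving request is scaled exactly like the training rows were.
scaled_training = [transform(v) for v in training_values]
scaled_request = transform(25.0)
```

Because the serving path never recomputes the minimum and maximum, there is no second implementation of the preprocessing that could drift away from the one used in training.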

Sample code for TF Transform is shown below.

Code for importing all the dependencies:

try:
  import tensorflow_transform as tft
  import apache_beam as beam
except ImportError:
  print('Installing TensorFlow Transform.  This will take a minute, ignore the warnings')
  !pip install -q tensorflow_transform
  print('Installing Apache Beam.  This will take a minute, ignore the warnings')
  !pip install -q apache_beam
  import tensorflow_transform as tft
  import apache_beam as beam

import tensorflow as tf
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata
from tensorflow_transform.tf_metadata import dataset_schema

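Below is the preprocessing function in which all the transformations are defined. As of now, TF Transform does not provide a direct API for missing-value imputation, so for that step alone we have to write our own code using low-level APIs.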

def preprocessing_fn(inputs):
  """Preprocess input columns into transformed columns."""
  # Since we are modifying some features and leaving others unchanged, we
  # start by setting `outputs` to a copy of `inputs`.
  outputs = inputs.copy()

  # Scale numeric columns to have range [0, 1].
  for key in NUMERIC_FEATURE_KEYS:
    outputs[key] = tft.scale_to_0_1(outputs[key])

  for key in OPTIONAL_NUMERIC_FEATURE_KEYS:
    # This is a SparseTensor because it is optional. Here we fill in a default
    # value when it is missing.
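    # tf.sparse_to_dense was removed in TF 2.x; build a SparseTensor and
    # densify it with tf.sparse.to_dense instead.
    sparse = tf.sparse.SparseTensor(outputs[key].indices, outputs[key].values,
                                    [outputs[key].dense_shape[0], 1])
    dense = tf.sparse.to_dense(sparse, default_value=0.)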
    # Reshaping from a batch of vectors of size 1 to a batch to scalars.
    dense = tf.squeeze(dense, axis=1)
    outputs[key] = tft.scale_to_0_1(dense)

  # For all categorical columns except the label column, we generate a
  # vocabulary but do not modify the feature.  This vocabulary is instead
  # used in the trainer, by means of a feature column, to convert the feature
  # from a string to an integer id.
  for key in CATEGORICAL_FEATURE_KEYS:
    tft.vocabulary(inputs[key], vocab_filename=key)

  # For the label column we provide the mapping from string to index.
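  # tf.contrib was removed in TF 2.x; use a static hash table from tf.lookup.
  table_keys = ['>50K', '<=50K']
  initializer = tf.lookup.KeyValueTensorInitializer(
      keys=table_keys,
      values=tf.cast(tf.range(len(table_keys)), tf.int64),
      key_dtype=tf.string,
      value_dtype=tf.int64)
  table = tf.lookup.StaticHashTable(initializer, default_value=-1)
  outputs[LABEL_KEY] = table.lookup(outputs[LABEL_KEY])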

  return outputs
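
To make the missing-value handling in the optional-feature loop concrete, here is a plain-Python sketch of the same two steps: densify a sparse column with a default, then scale to [0, 1]. It is an illustration only and does not call TensorFlow; `fill_missing` and `scale_to_unit_range` are hypothetical helpers, and the handling of a constant column is an assumption of this sketch rather than a statement about `tft.scale_to_0_1`.

```python
def fill_missing(indices, values, batch_size, default_value=0.0):
    """Densify a sparse optional column into one scalar per batch row."""
    dense = [default_value] * batch_size
    for (row, _col), value in zip(indices, values):
        dense[row] = value
    return dense


def scale_to_unit_range(column):
    """Min-max scale a dense column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption for this sketch only: a constant column maps to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Rows 0 and 2 carry a value; row 1 is missing and receives the default.
indices = [(0, 0), (2, 0)]
values = [4.0, 8.0]

dense = fill_missing(indices, values, batch_size=3)
scaled = scale_to_unit_range(dense)
```

Note that the filled-in default takes part in the min/max computation, so the choice of default value affects how the observed values are scaled.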

You can refer to the links below for more details and for a TF Transform tutorial.

https://www.tensorflow.org/tfx/transform/get_started

https://www.tensorflow.org/tfx/tutorials/transform/census
