I'm implementing a regression network that maps images to poses, using the TensorFlow Python API, and I'm trying to process the output of a FixedLengthRecordReader. I'm adapting the cifar10 example with minimal changes for my purposes. The cifar10 example reads the raw bytes of each record, decodes them, and then splits the result:
result.key, value = reader.read(filename_queue)

# Convert from a string to a vector of uint8 that is record_bytes long.
record_bytes = tf.decode_raw(value, tf.uint8)

# The first bytes represent the label, which we convert from uint8->int32.
result.label = tf.cast(
    tf.slice(record_bytes, [0], [label_bytes]), tf.int32)

# The remaining bytes after the label represent the image, which we reshape
# from [depth * height * width] to [depth, height, width].
depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                         [result.depth, result.height, result.width])

# Convert from [depth, height, width] to [height, width, depth].
result.uint8image = tf.transpose(depth_major, [1, 2, 0])
I am reading from a list of binary files where each record is saved as (pose_data, image_data). Because my pose data is float32 and my image data is uint8, I want to slice first and then cast. Unfortunately, the value returned by reader.read is a zero-dimensional string tensor, so slicing it doesn't work:
key, value = reader.read(filename_queue)
print value.dtype        # <dtype: 'string'>
print value.get_shape()  # ()
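Concretely, what I would like to write is something like the sketch below, where pose_bytes and image_bytes are placeholder names for the byte counts of the two parts of my record; it fails because value is a scalar string rather than a vector of bytes:

# What I would like to do -- does not work, since `value` has shape ().
key, value = reader.read(filename_queue)
pose_str  = tf.slice(value, [0], [pose_bytes])            # error: cannot slice a 0-d string
image_str = tf.slice(value, [pose_bytes], [image_bytes])
pose  = tf.decode_raw(pose_str, tf.float32)
image = tf.decode_raw(image_str, tf.uint8)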
The result of tf.decode_raw(value, dtype) is a one-dimensional tensor, but it requires a dtype to be specified, and tf.string is not one of the types it accepts.
Is it possible to slice before decoding? Or do I have to decode -> cast back to string -> slice -> recast? Is there another way altogether?
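For concreteness, the only alternative I've come up with so far is to keep the decode-everything-as-uint8 approach from the cifar10 example and then reinterpret the pose bytes afterwards. A plain tf.cast won't do, since it converts values numerically rather than reinterpreting the underlying bytes, so I'd presumably need something like tf.bitcast. A rough sketch, assuming tf.bitcast can collapse a trailing dimension of 4 uint8s into one float32, and with num_pose_vals, pose_bytes, image_bytes, height, width, and depth standing in for my actual record layout:

# Decode the whole record as bytes, as in the cifar10 example.
record_bytes = tf.decode_raw(value, tf.uint8)

# Pose: first pose_bytes bytes, reinterpreted (not value-converted) as float32.
pose_raw = tf.slice(record_bytes, [0], [pose_bytes])               # shape [pose_bytes]
pose = tf.bitcast(tf.reshape(pose_raw, [num_pose_vals, 4]),
                  tf.float32)                                      # shape [num_pose_vals]

# Image: remaining bytes, already uint8; reshape depends on how I wrote the files.
image = tf.reshape(tf.slice(record_bytes, [pose_bytes], [image_bytes]),
                   [height, width, depth])

I haven't verified that this works, and it feels roundabout, hence the question.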