tokenizer = tfds.deprecated.text.Tokenizer()

The call that used to be tfds.features.text.Tokenizer() has changed to the line above. Things sure move fast.
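If one script has to run against both older and newer tensorflow_datasets releases, one option is to try each candidate location in turn. The sketch below is only an illustration: resolve_first is a hypothetical helper, not part of tfds, and it is exercised against a stand-in namespace so that it runs without TensorFlow installed.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_first(root, dotted_paths):
    """Return the first attribute found under `root` among `dotted_paths`.

    Hypothetical helper (not a tfds API): each dotted path is tried in
    order, and AttributeError is raised only if none of them resolves.
    """
    for path in dotted_paths:
        try:
            return reduce(getattr, path.split("."), root)
        except AttributeError:
            continue
    raise AttributeError(f"none of {dotted_paths!r} found")


# Stand-in for a package that only exposes the new location.
# `str` is used as a placeholder class so the example needs no tfds install.
fake_tfds = SimpleNamespace(
    deprecated=SimpleNamespace(text=SimpleNamespace(Tokenizer=str))
)
found = resolve_first(
    fake_tfds, ["features.text.Tokenizer", "deprecated.text.Tokenizer"]
)
```

With the real package, the imported tfds module would be passed as root and the result called to build the tokenizer. This assumes the old path raises AttributeError on releases where it has been removed.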

www.tensorflow.org/datasets/api_docs/python/tfds/deprecated/text/Tokenizer

 


 

O:\PycharmProjects\catdogtf2.2\venv\Scripts\python.exe O:\PyCharm\plugins\python\helpers\pydev\pydevconsole.py --mode=client --port=64082

import sys; print('Python %s on %s' % (sys.version, sys.platform))

sys.path.extend(['O:\\PycharmProjects\\catdogtf2.2', 'O:/PycharmProjects/catdogtf2.2'])

Python 3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)]

Type 'copyright', 'credits' or 'license' for more information

IPython 7.17.0 -- An enhanced Interactive Python. Type '?' for help.

PyDev console: using IPython 7.17.0

Python 3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] on win32

In[2]: runfile('O:/PycharmProjects/catdogtf2.2/010.py', wdir='O:/PycharmProjects/catdogtf2.2')

2020-08-11 23:43:20.730164: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll

2020-08-11 23:43:23.946122: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll

2020-08-11 23:43:24.000619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:

pciBusID: 0000:09:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5

coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s

2020-08-11 23:43:24.001194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll

2020-08-11 23:43:24.011525: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll

2020-08-11 23:43:24.019517: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll

2020-08-11 23:43:24.024019: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll

2020-08-11 23:43:24.032859: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll

2020-08-11 23:43:24.038094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll

2020-08-11 23:43:24.052864: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll

2020-08-11 23:43:24.053230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0

2020-08-11 23:43:24.053758: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

2020-08-11 23:43:24.067367: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1c9cd3779b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:

2020-08-11 23:43:24.067961: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version

2020-08-11 23:43:24.068675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:

pciBusID: 0000:09:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5

coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s

2020-08-11 23:43:24.069504: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll

2020-08-11 23:43:24.069864: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll

2020-08-11 23:43:24.070004: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll

2020-08-11 23:43:24.070148: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll

2020-08-11 23:43:24.070297: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll

2020-08-11 23:43:24.070429: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll

2020-08-11 23:43:24.070567: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll

2020-08-11 23:43:24.070797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0

2020-08-11 23:43:25.001396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:

2020-08-11 23:43:25.001678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0

2020-08-11 23:43:25.001850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N

2020-08-11 23:43:25.002247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6198 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:09:00.0, compute capability: 7.5)

2020-08-11 23:43:25.006712: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1c9f7068730 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

2020-08-11 23:43:25.007107: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 SUPER, Compute Capability 7.5

(<tf.Tensor: shape=(), dtype=string, numpy=b'not a cloud to be seen neither on plain nor mountain. These last'>, <tf.Tensor: shape=(), dtype=int64, numpy=2>)

(<tf.Tensor: shape=(), dtype=string, numpy=b'To win the heart; there Love, there young Desire,'>, <tf.Tensor: shape=(), dtype=int64, numpy=1>)

(<tf.Tensor: shape=(), dtype=string, numpy=b'To parching airs beside the running stream;'>, <tf.Tensor: shape=(), dtype=int64, numpy=0>)

(<tf.Tensor: shape=(), dtype=string, numpy=b'Their people as the pastured flock the ram'>, <tf.Tensor: shape=(), dtype=int64, numpy=0>)

(<tf.Tensor: shape=(), dtype=string, numpy=b"A vessel's plank is smooth and even laid,">, <tf.Tensor: shape=(), dtype=int64, numpy=1>)

b'not a cloud to be seen neither on plain nor mountain. These last'

[213, 12965, 228, 9770, 15265, 11378, 3288, 17101, 5332, 13656, 4080, 8818, 14602]

Epoch 1/3

2020-08-11 23:43:42.133082: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll

2020-08-11 23:43:52.484863: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:184] Filling up shuffle buffer (this may take a while): 35287 of 50000

2020-08-11 23:43:54.470819: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:233] Shuffle buffer filled.

2020-08-11 23:43:54.500071: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll

697/697 [==============================] - 13s 19ms/step - loss: 0.5028 - accuracy: 0.7522 - val_loss: 0.3978 - val_accuracy: 0.8140

Epoch 2/3

2020-08-11 23:44:19.091046: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:184] Filling up shuffle buffer (this may take a while): 35584 of 50000

2020-08-11 23:44:21.195127: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:233] Shuffle buffer filled.

697/697 [==============================] - 12s 18ms/step - loss: 0.2949 - accuracy: 0.8707 - val_loss: 0.4052 - val_accuracy: 0.8206

Epoch 3/3

2020-08-11 23:44:43.657965: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:184] Filling up shuffle buffer (this may take a while): 35537 of 50000

2020-08-11 23:44:45.518081: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:233] Shuffle buffer filled.

697/697 [==============================] - 12s 17ms/step - loss: 0.2191 - accuracy: 0.9055 - val_loss: 0.3737 - val_accuracy: 0.8298

79/79 [==============================] - 2s 20ms/step - loss: 0.3737 - accuracy: 0.8298

Eval loss: 0.374, Eval accuracy: 0.830
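The log above prints one line of text followed by a list of 13 integers, one per token. As a rough picture of what the tokenizer and encoder do, here is a simplified stand-in written in plain Python. It is not the tfds implementation: the function names are hypothetical, and the ids it produces will not match the log, because the real ids depend on the full vocabulary built from all three files.

```python
import re


def simple_tokenize(text):
    """Split on runs of non-word characters and drop empty strings."""
    return [tok for tok in re.split(r"\W+", text) if tok]


def build_vocab(lines):
    """Map each distinct token to an id starting at 1; 0 is left for padding."""
    tokens = sorted({tok for line in lines for tok in simple_tokenize(line)})
    return {tok: idx for idx, tok in enumerate(tokens, start=1)}


def simple_encode(text, vocab):
    """Encode known tokens; unknown tokens share one id past the vocabulary."""
    oov_id = len(vocab) + 1
    return [vocab.get(tok, oov_id) for tok in simple_tokenize(text)]


line = "not a cloud to be seen neither on plain nor mountain. These last"
vocab = build_vocab([line])
ids = simple_encode(line, vocab)
```

Reserving id 0 for padding is also why the script further down increments vocab_size by one before building the Embedding layer.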

 

 

import tensorflow as tf

 

import tensorflow_datasets as tfds

import os

 

DIRECTORY_URL = 'https://storage.googleapis.com/download.tensorflow.org/data/illiad/'

FILE_NAMES = ['cowper.txt', 'derby.txt', 'butler.txt']

 

for name in FILE_NAMES:

    text_dir = tf.keras.utils.get_file(name, origin=DIRECTORY_URL + name)

 

parent_dir = os.path.dirname(text_dir)

 

parent_dir

 

 

def labeler(example, index):

    return example, tf.cast(index, tf.int64)

 

 

labeled_data_sets = []

 

for i, file_name in enumerate(FILE_NAMES):

    lines_dataset = tf.data.TextLineDataset(os.path.join(parent_dir, file_name))

    labeled_dataset = lines_dataset.map(lambda ex: labeler(ex, i))

    labeled_data_sets.append(labeled_dataset)

 

BUFFER_SIZE = 50000

BATCH_SIZE = 64

TAKE_SIZE = 5000

 

all_labeled_data = labeled_data_sets[0]

for labeled_dataset in labeled_data_sets[1:]:

    all_labeled_data = all_labeled_data.concatenate(labeled_dataset)

 

all_labeled_data = all_labeled_data.shuffle(

    BUFFER_SIZE, reshuffle_each_iteration=False)

 

for ex in all_labeled_data.take(5):

    print(ex)

 

tokenizer = tfds.deprecated.text.Tokenizer()

 

vocabulary_set = set()

for text_tensor, _ in all_labeled_data:

    some_tokens = tokenizer.tokenize(text_tensor.numpy())

    vocabulary_set.update(some_tokens)

 

vocab_size = len(vocabulary_set)

vocab_size

 

encoder = tfds.deprecated.text.TokenTextEncoder(vocabulary_set)

 

example_text = next(iter(all_labeled_data))[0].numpy()

print(example_text)

 

encoded_example = encoder.encode(example_text)

print(encoded_example)

 

 

def encode(text_tensor, label):

    encoded_text = encoder.encode(text_tensor.numpy())

    return encoded_text, label

 

 

def encode_map_fn(text, label):

    # py_func doesn't set the shape of the returned tensors.

    encoded_text, label = tf.py_function(encode,

                                         inp=[text, label],

                                         Tout=(tf.int64, tf.int64))

 

    # `tf.data.Datasets` work best if all components have a shape set

    #  so set the shapes manually:

    encoded_text.set_shape([None])

    label.set_shape([])

 

    return encoded_text, label

 

 

all_encoded_data = all_labeled_data.map(encode_map_fn)

 

train_data = all_encoded_data.skip(TAKE_SIZE).shuffle(BUFFER_SIZE)

train_data = train_data.padded_batch(BATCH_SIZE)

 

test_data = all_encoded_data.take(TAKE_SIZE)

test_data = test_data.padded_batch(BATCH_SIZE)

 

sample_text, sample_labels = next(iter(test_data))

 

sample_text[0], sample_labels[0]

 

vocab_size += 1

 

model = tf.keras.Sequential()

 

model.add(tf.keras.layers.Embedding(vocab_size, 64))

 

model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)))

 

# One or more dense layers.

# Edit the list in the `for` line to experiment with layer sizes.

for units in [64, 64]:

    model.add(tf.keras.layers.Dense(units, activation='relu'))

 

# Output layer. The first argument is the number of labels.

model.add(tf.keras.layers.Dense(3))

 

model.compile(optimizer='adam',

              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),

              metrics=['accuracy'])

 

model.fit(train_data, epochs=3, validation_data=test_data)

 

eval_loss, eval_acc = model.evaluate(test_data)

 

print('\nEval loss: {:.3f}, Eval accuracy: {:.3f}'.format(eval_loss, eval_acc))
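Two details of the script can be checked by hand: padded_batch pads every sequence in a batch with zeros up to the longest one, and TAKE_SIZE = 5000 with BATCH_SIZE = 64 accounts for the 79 evaluation steps shown in the log. The plain-Python sketch below illustrates both. The helper names are hypothetical and only mimic the behaviour; they are not tf.data calls.

```python
import math


def pad_batch(sequences, pad_value=0):
    """Right-pad every sequence to the length of the longest one in the batch."""
    width = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_value] * (width - len(seq)) for seq in sequences]


def steps_per_epoch(num_examples, batch_size):
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_examples / batch_size)


# Three encoded lines of different lengths, padded to length 3.
padded = pad_batch([[5, 2, 9], [7], [3, 4]])

# 5000 test examples in batches of 64: 78 full batches plus one partial batch.
test_steps = steps_per_epoch(5000, 64)
```

Here padded comes out as [[5, 2, 9], [7, 0, 0], [3, 4, 0]] and test_steps as 79, matching the 79/79 progress line from model.evaluate.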

 
