View Sample of Vectorised Data Using TensorFlow in Python


TensorFlow is an open-source machine learning framework developed by Google. It is used with Python to implement machine learning algorithms and deep learning applications, both in research and in production.

The ‘tensorflow’ package can be installed on Windows with pip using the following command −

pip install tensorflow

A tensor is the data structure used in TensorFlow. Tensors flow along the edges of a flow diagram, connecting its operations. This flow diagram is known as the ‘data flow graph’. A tensor is essentially a multidimensional array or a list.
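To make the idea of a tensor as a multidimensional array concrete, here is a minimal sketch that assumes TensorFlow 2.x with eager execution. The variable names and values are illustrative only and are not part of the original example.

```python
import tensorflow as tf

# Illustrative tensors of rank 0, 1 and 2 (names and values are made up).
scalar = tf.constant(7)                         # a single number
vector = tf.constant([20, 21, 58])              # a list of integers
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # a 2 x 2 array of floats

# Every tensor carries a rank (number of dimensions), a shape and a dtype.
for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, "rank:", t.shape.rank, "shape:", tuple(t.shape), "dtype:", t.dtype.name)
```

The vectorized sentence shown later in this article is a rank-1 integer tensor of the same kind as `vector`.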

We will be using the Iliad dataset, which contains text from three translations of the work by William Cowper, Edward (Earl of Derby) and Samuel Butler. The model is trained to identify the translator when given a single line of text. The text files have been preprocessed, which includes removing the document header and footer, line numbers and chapter titles.

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

Example

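# Note: all_labeled_data (a dataset of (text, label) pairs) and
# preprocess_text (the vectorizing function) are not defined in this
# snippet; they are assumed to exist from earlier steps of the tutorial.
print("Look at sample data after processing it")
# Take the first (text, label) pair from the dataset
example_text, example_label = next(iter(all_labeled_data))

print("The sentence is : ", example_text.numpy())
# Vectorize the single example to inspect the result
vectorized_text, example_label = preprocess_text(example_text, example_label)

print("The vectorized sentence is : ", vectorized_text.numpy())
print("Run the pre-process function on the data")

# Apply the same function to every element of the dataset
all_encoded_data = all_labeled_data.map(preprocess_text)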

Code credit − https://www.tensorflow.org/tutorials/load_data/text

Output

Look at sample data after processing it
The sentence is : b'But I have now both tasted food, and given'
The vectorized sentence is : [ 20 21 58 49 107 3497 909 2 4 540]
Run the pre-process function on the data
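The example above relies on `all_labeled_data` and `preprocess_text`, which are defined elsewhere. The following is a minimal, self-contained sketch of how such objects could look, assuming TensorFlow 2.x. The two-line dataset, the toy vocabulary, the label values and the whitespace tokenization are hypothetical stand-ins, not the definitions used in the original tutorial, so the integer ids differ from the output shown above.

```python
import tensorflow as tf

# Hypothetical stand-in for the labelled dataset: (line of text, translator index).
lines = [
    "But I have now both tasted food, and given",
    "a second illustrative line of text",
]
labels = [0, 1]
all_labeled_data = tf.data.Dataset.from_tensor_slices((lines, labels))

# Hypothetical toy vocabulary: the token at position i gets the integer id i.
vocab = ["but", "i", "have", "now", "both", "tasted", "food", "and", "given"]
initializer = tf.lookup.KeyValueTensorInitializer(
    keys=tf.constant(vocab),
    values=tf.range(len(vocab), dtype=tf.int64),
)
# Tokens that are not in the vocabulary fall into a single extra bucket.
table = tf.lookup.StaticVocabularyTable(initializer, num_oov_buckets=1)

def preprocess_text(text, label):
    # Lower-case, drop everything except letters and spaces, split on whitespace.
    cleaned = tf.strings.regex_replace(tf.strings.lower(text), "[^a-z ]", "")
    tokens = tf.strings.split(cleaned)
    # Replace each token with its integer id from the lookup table.
    return table.lookup(tokens), label

example_text, example_label = next(iter(all_labeled_data))
vectorized_text, example_label = preprocess_text(example_text, example_label)
print("The vectorized sentence is : ", vectorized_text.numpy())

# The same function can be mapped over the whole dataset.
all_encoded_data = all_labeled_data.map(preprocess_text)
```

A lookup table is used here rather than a Python dictionary because the function passed to `map` is traced into a graph, so it has to be built from TensorFlow operations.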

Explanation

Once the data has been vectorized, every token in a sentence has been converted to an integer.
The conversion is needed because the model operates on numeric input and cannot interpret raw text.
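The token-to-integer conversion described above can be sketched in plain Python, without TensorFlow. This is an illustration of the general idea under stated assumptions: the two-line corpus, the punctuation handling, the frequency-based ordering and the convention of reserving 0 for padding and 1 for unknown tokens are choices made for this sketch, not details taken from the original code.

```python
from collections import Counter

# Hypothetical two-line corpus used only for this illustration.
corpus = [
    "But I have now both tasted food, and given",
    "and I have given food",
]

PUNCTUATION = ".,;:!?\"'"
UNKNOWN_ID = 1  # assumed convention: 0 is kept for padding, 1 for unknown tokens

def tokenize(line):
    # Lower-case, split on whitespace and strip surrounding punctuation.
    words = (w.strip(PUNCTUATION) for w in line.lower().split())
    return [w for w in words if w]

# Count every token, then order by frequency (ties broken alphabetically).
counts = Counter(tok for line in corpus for tok in tokenize(line))
ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))

# Assign ids starting at 2, after the two reserved values.
vocab = {tok: i + 2 for i, tok in enumerate(ordered)}

def vectorize(line):
    # Look up each token; anything unseen maps to UNKNOWN_ID.
    return [vocab.get(tok, UNKNOWN_ID) for tok in tokenize(line)]

print(vectorize(corpus[0]))
print(vectorize("I have tasted wine"))
```

The first call returns one integer per word of the sentence, and in the second call the unseen word "wine" is mapped to the unknown id.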

AmitDiwan

Updated on: 2021-01-19T07:54:19+05:30
