What is the difference between Exception and Error?

  • Exception: an unexpected condition that can arise during normal program execution; it can be caught and handled appropriately.
  • Error: a condition that should not occur under normal circumstances; most Errors leave the program in an unrecoverable state. Both classes extend java.lang.Throwable.

Java exception class hierarchy: (diagram omitted)

  • Checked exceptions must be explicitly caught or declared (e.g. IOException), whereas unchecked exceptions need not be caught or declared (e.g. NullPointerException).

  • Checked exceptions extend java.lang.Exception (excluding RuntimeException and its subclasses); unchecked exceptions extend java.lang.RuntimeException.
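
A minimal sketch of the distinction (the file name and class names here are illustrative):

```java
import java.io.FileReader;
import java.io.IOException;

public class CheckedDemo {
    // Checked: the compiler forces callers to catch IOException or declare it.
    static void readFile() throws IOException {
        new FileReader("absent.txt").close();
    }

    public static void main(String[] args) {
        try {
            readFile();
        } catch (IOException e) {
            System.out.println("handled: " + e.getMessage());
        }
        // Unchecked: compiles with no catch or throws clause,
        // but fails at runtime with NullPointerException.
        String s = null;
        System.out.println(s.length());
    }
}
```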

Ways to handle exceptions:
  • try-catch-finally
  • throw and throws (note that only a Throwable can be thrown, e.g. throw new IOException(); throw new Object() would not compile)
Caveats:

```java
try {
    // do something
} catch (Exception e) { // do not catch bare Exception; it is too broad, so catch the most specific type you can
    e.printStackTrace(); // do not just print the trace; write it to a log, or in a complex system the output easily gets lost
}
```
  • A try-catch block adds some runtime overhead, or put differently, it can limit the optimizations the JVM applies to the enclosed code. Only wrap the statements that can actually throw; avoid one large try around an entire method.
  • Every time Java instantiates an Exception, it takes a snapshot of the current call stack (the stack trace of the current thread), which is a relatively expensive operation.
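
Because of that cost, one known mitigation on hot paths where the trace is not needed is to pre-allocate the exception and skip the snapshot by overriding fillInStackTrace(); a minimal sketch (the class name is illustrative):

```java
public class FastException extends RuntimeException {
    // One shared instance; note its stack trace is intentionally empty.
    public static final FastException INSTANCE = new FastException("fast");

    private FastException(String message) {
        super(message);
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        return this; // skip the expensive stack walk
    }
}
```
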
Common exceptions:
  • java.lang.NullPointerException (null pointer exception)

```java
String s = null;
s.length(); // s is null, so calling a method on it throws NullPointerException
```
  • java.lang.ArrayIndexOutOfBoundsException (array index out of bounds, a subclass of IndexOutOfBoundsException)

```java
int[] a = {1, 2};
a[6] = 3; // the array length is 2, so valid indices are 0 and 1; index 6 is out of bounds
```
  • java.lang.ArithmeticException (arithmetic exception)

```java
int i = 3 / 0; // integer division by zero throws ArithmeticException
```
  • java.io.FileNotFoundException (file not found exception; note it lives in java.io, not java.lang): the file path is wrong or the file does not exist

```java
File file = new File("F:/mywork/work.txt");

if (!file.exists()) {
    throw new FileNotFoundException();
}
```
  • java.lang.ClassNotFoundException (class not found exception)

```java
package com.midnight.test;

public class Test {
    public static void main(String[] args) {
        try {
            // Compiles fine, but throws at runtime if there is no Hello class
            // under the package com.midnight.test.
            Class.forName("com.midnight.test.Hello");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
```
  • java.util.ConcurrentModificationException (thrown by fail-fast iterators): if a collection is structurally modified while a thread is traversing it through its iterator, whether by another thread or by the same thread, the iterator throws ConcurrentModificationException.
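
A minimal single-threaded reproduction (the fail-fast check does not need multiple threads):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CmeDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        for (String s : list) {      // for-each traverses via the list's Iterator
            if ("a".equals(s)) {
                list.remove(s);      // structural change behind the iterator's back:
            }                        // the next iteration throws ConcurrentModificationException
        }
    }
}
```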

  • java.lang.NoClassDefFoundError (class definition not found; note that this is an Error): the class file exists, but not where the JVM expects it

```java
package suba; // this class belongs to package suba

public class Hello {
    public static void main(String[] args) {
        System.out.println("hello");
    }
}
// If Hello.class sits directly in directory a, compilation succeeds, but running
// it from a fails with NoClassDefFoundError: the class's real name is suba.Hello,
// so its file must live in the subdirectory a/suba.
```

Talk about your understanding of the Java platform

  1. Java is an object-oriented language; in Java almost everything is an Object. Its core features are encapsulation, inheritance, and polymorphism. Encapsulation packages fields (attributes) and methods together into a class. Inheritance lets one class (the subclass) inherit from another (the superclass), which avoids duplicating large amounts of code and makes it easier to maintain. Polymorphism means a superclass reference can hold any of its subclasses: when writing code we may not know the concrete type, so we program against the superclass, and the actual subclass is determined at run time (a sketch follows this list).
  2. It provides a rich set of class libraries:
    • core libraries, such as IO/NIO, networking, utils, concurrency, and collections
    • security libraries
    • libraries such as jdk.management
    • a vast body of third-party libraries
  3. The Java Virtual Machine:
    • garbage collectors
    • runtime dynamic (JIT) compilation
    • auxiliary features such as JFR (Java Flight Recorder)
  4. Tools:
    • auxiliary tools, such as jlink, jar, and jdeps
    • compilers: javac, sjavac
    • diagnostic tools: jmap, jstack, jcmd, jconsole
  5. The Java ecosystem: Spring, Hadoop, Spark, Maven
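
As a sketch of the polymorphism point above (class names are illustrative): the variable's declared type is the superclass, and the method that actually runs is chosen by the runtime type.

```java
class Animal {
    void speak() { System.out.println("..."); }
}

class Dog extends Animal {
    @Override
    void speak() { System.out.println("woof"); }
}

class Cat extends Animal {
    @Override
    void speak() { System.out.println("meow"); }
}

public class PolymorphismDemo {
    public static void main(String[] args) {
        Animal a = new Dog(); // superclass reference, subclass instance
        a.speak();            // prints "woof": dispatched at run time
        a = new Cat();
        a.speak();            // prints "meow"
    }
}
```
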
Understanding JRE vs. JDK:
  • The JRE (Java Runtime Environment) is the environment for running Java programs; it mainly contains the JVM and the class libraries. The standalone JRE may well disappear in the near future.
  • The JDK (Java Development Kit) is the development environment; it includes the JRE plus compilers, diagnostic tools, and other development utilities.
Understanding "compile once, run anywhere": a .java file is first compiled into a .class file, which is then executed by the JVM; the JVM hides the differences between underlying platforms.
  • Interpretation: the .class bytecode is translated and executed instruction by instruction;
  • Just-in-time compilation (JIT): the bytecode of a whole (hot) method is compiled into machine code and executed natively;
  • AOT (ahead-of-time compilation): a JDK 9 feature that compiles .class files into machine code before the program runs.

The JIT compiler caches compiled code in the code cache, so the next time that code is reached it runs as machine code directly; a plain interpreter has to interpret it again on every execution. As for identifying hot code, two simple heuristics are: which code frequently appears at the top of the stack (sampling), and which code is executed more times than a threshold (counting).

Compiled vs. interpreted languages:
  • Compiled languages: compilation produces instructions for the target platform directly. C, for example, compiles straight to machine code for a given hardware environment, so the program must be recompiled for each different platform.
  • Interpreted languages: compilation produces platform-independent intermediate code, which is translated into machine code at run time. Java, for example, produces class files that can run on any platform through the JVM; note, however, that the JVM itself is platform-specific, with separate builds for Windows, macOS, and Linux.

Note: for further reading, see the Geek Time column "Java核心技术36讲" (36 Lectures on Core Java Technology).

Sequence to Sequence

This time we will build a sequence-to-sequence neural network; the material is based on a Udacity lesson.


Dataset

The dataset consists of a source file and a target file whose lines correspond one to one.

import numpy as np
import time

import helper

source_path = 'data/letters_source.txt'
target_path = 'data/letters_target.txt'

source_sentences = helper.load_data(source_path)
target_sentences = helper.load_data(target_path)
source_sentences[:50].split('\n')
['bsaqq',
 'npy',
 'lbwuj',
 'bqv',
 'kial',
 'tddam',
 'edxpjpg',
 'nspv',
 'huloz',
 '']

Each line of target_sentences contains the characters of the corresponding source_sentences line arranged in sorted order.

target_sentences[:50].split('\n')
['abqqs',
 'npy',
 'bjluw',
 'bqv',
 'aikl',
 'addmt',
 'degjppx',
 'npsv',
 'hlouz',
 '']
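
A quick, illustrative way to verify that relationship on the loaded data:

```python
# The first source line 'bsaqq' sorts to the first target line 'abqqs'.
assert ''.join(sorted(source_sentences.split('\n')[0])) == target_sentences.split('\n')[0]
```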

Preprocessing

Convert each string into a sequence of character ids.

def extract_character_vocab(data):
    special_words = ['<PAD>', '<UNK>', '<GO>',  '<EOS>']

    set_words = set([character for line in data.split('\n') for character in line])
    int_to_vocab = {word_i: word for word_i, word in enumerate(special_words + list(set_words))}
    vocab_to_int = {word: word_i for word_i, word in int_to_vocab.items()}

    return int_to_vocab, vocab_to_int

# Build int2letter and letter2int dicts
source_int_to_letter, source_letter_to_int = extract_character_vocab(source_sentences)
target_int_to_letter, target_letter_to_int = extract_character_vocab(target_sentences)

# Convert characters to ids
source_letter_ids = [[source_letter_to_int.get(letter, source_letter_to_int['<UNK>']) for letter in line] for line in source_sentences.split('\n')]
target_letter_ids = [[target_letter_to_int.get(letter, target_letter_to_int['<UNK>']) for letter in line] + [target_letter_to_int['<EOS>']] for line in target_sentences.split('\n')] 

print("Example source sequence")
print(source_letter_ids[:3])
print("\n")
print("Example target sequence")
print(target_letter_ids[:3])
Example source sequence
[[28, 12, 16, 9, 9], [22, 4, 5], [8, 28, 20, 6, 19]]


Example target sequence
[[16, 28, 9, 9, 12, 3], [22, 4, 5, 3], [28, 19, 8, 6, 20, 3]]
print(source_sentences[:20])
print(target_letter_to_int)
bsaqq
npy
lbwuj
bqv

{'<PAD>': 0, '<UNK>': 1, '<GO>': 2, '<EOS>': 3, 'p': 4, 'y': 5, 'u': 6, 'x': 7, 'l': 8, 'q': 9, 't': 10, 'g': 11, 's': 12, 'h': 13, 'c': 14, 'r': 15, 'a': 16, 'd': 17, 'o': 18, 'j': 19, 'w': 20, 'v': 21, 'n': 22, 'i': 23, 'e': 24, 'z': 25, 'f': 26, 'm': 27, 'b': 28, 'k': 29}

The steps above complete the conversion from characters to integer ids.

Model

from distutils.version import LooseVersion
import tensorflow as tf
from tensorflow.python.layers.core import Dense


# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.1'), 'Please use TensorFlow version 1.1 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))
TensorFlow Version: 1.1.0

Hyperparameters

# Number of Epochs
epochs = 60
# Batch Size
batch_size = 128
# RNN Size
rnn_size = 50
# Number of Layers
num_layers = 2
# Embedding Size
encoding_embedding_size = 15
decoding_embedding_size = 15
# Learning Rate
learning_rate = 0.001

Inputs

def get_model_inputs():
    input_data = tf.placeholder(tf.int32, [None, None], name='input')
    targets = tf.placeholder(tf.int32, [None, None], name='targets')
    lr = tf.placeholder(tf.float32, name='learning_rate')

    target_sequence_length = tf.placeholder(tf.int32, (None,), name='target_sequence_length')
    max_target_sequence_length = tf.reduce_max(target_sequence_length, name='max_target_len')
    source_sequence_length = tf.placeholder(tf.int32, (None,), name='source_sequence_length')
    
    return input_data, targets, lr, target_sequence_length, max_target_sequence_length, source_sequence_length

Sequence-to-Sequence Model

We build the seq2seq model from the following components:

2.1 Encoder
    - Embedding
    - Encoder cell
2.2 Decoder
    1- Process decoder inputs
    2- Set up the decoder
        - Embedding
        - Decoder cell
        - Dense output layer
        - Training decoder
        - Inference decoder
2.3 Seq2seq model connecting the encoder and decoder
2.4 Build the training graph hooking up the model with the optimizer

2.1 Encoder

First we build the encoder: the input data is embedded and then fed into the encoder RNN.


  • Feed the embedded input into the RNN to get the per-time-step outputs and the hidden state. We do not need the per-time-step outputs and can ignore them; we only keep the final hidden state, which summarizes the entire input sequence.
def encoding_layer(input_data, rnn_size, num_layers,
                   source_sequence_length, source_vocab_size, 
                   encoding_embedding_size):


    # Encoder embedding
    enc_embed_input = tf.contrib.layers.embed_sequence(input_data, source_vocab_size, encoding_embedding_size)

    # RNN cell
    def make_cell(rnn_size):
        enc_cell = tf.contrib.rnn.LSTMCell(rnn_size,
                                           initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
        return enc_cell

    enc_cell = tf.contrib.rnn.MultiRNNCell([make_cell(rnn_size) for _ in range(num_layers)])
    
    enc_output, enc_state = tf.nn.dynamic_rnn(enc_cell, enc_embed_input, sequence_length=source_sequence_length, dtype=tf.float32)
    
    return enc_output, enc_state

2.2 Decoder

The decoder is built from the following parts:

1- Process decoder inputs
2- Set up the decoder components
    - Embedding
    - Decoder cell
    - Dense output layer
    - Training decoder
    - Inference decoder

Process decoder inputs

The target sequence we feed in looks like this (in the code these are integer ids; characters are shown here only for illustration). We need to cut off the last token of each sequence, since it is never fed to the decoder; tf.strided_slice() does exactly that. We then prepend a start token <GO> to each sequence, which gives the final decoder input. (figures omitted)

# Process the input we'll feed to the decoder
def process_decoder_input(target_data, vocab_to_int, batch_size):
    '''Remove the last word id from each batch and concat the <GO> to the beginning of each batch'''
    ending = tf.strided_slice(target_data, [0, 0], [batch_size, -1], [1, 1])
    dec_input = tf.concat([tf.fill([batch_size, 1], vocab_to_int['<GO>']), ending], 1)

    return dec_input
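
A quick sanity check of this transformation with hand-picked ids (illustrative values, TF 1.x session style):

```python
# Each row should lose its last id and gain <GO> (id 2) at the front.
with tf.Session() as demo_sess:
    demo_targets = tf.constant([[16, 28, 9, 9, 3],
                                [22, 4, 5, 3, 0]], dtype=tf.int32)
    demo_input = process_decoder_input(demo_targets, target_letter_to_int, batch_size=2)
    print(demo_sess.run(demo_input))
    # [[ 2 16 28  9  9]
    #  [ 2 22  4  5  3]]
```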

Set up the decoder components

    - Embedding
    - Decoder cell
    - Dense output layer
    - Training decoder
    - Inference decoder

1- Embedding

Embedding maps each integer id to a dense vector; tf.nn.embedding_lookup does this lookup directly:
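
A tiny illustration of the lookup (the values are made up):

```python
# Each id selects the corresponding row of the embedding matrix.
embeddings = tf.constant([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]])
ids = tf.constant([2, 0])
with tf.Session() as demo_sess:
    print(demo_sess.run(tf.nn.embedding_lookup(embeddings, ids)))
    # [[2.  2.1]
    #  [0.  0.1]]
```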

2- Decoder Cell

The decoder cell has the same structure as the encoder cell.

3- Dense output layer

Finally, a fully connected (dense) layer maps the decoder's output at each time step to scores over the target vocabulary.

4- Training decoder

The training decoder is used during training; it takes the (shifted) target sequence as its input.

5- Inference decoder

The inference decoder is used at prediction (inference) time; its input at each step is its own output from the previous step.

The training decoder and the inference decoder use the same parameters (they share weights), and the initial hidden state comes from the encoder.

def decoding_layer(target_letter_to_int, decoding_embedding_size, num_layers, rnn_size,
                   target_sequence_length, max_target_sequence_length, enc_state, dec_input):
    # 1. Decoder Embedding
    target_vocab_size = len(target_letter_to_int)
    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

    # 2. Construct the decoder cell
    def make_cell(rnn_size):
        dec_cell = tf.contrib.rnn.LSTMCell(rnn_size,
                                           initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
        return dec_cell

    dec_cell = tf.contrib.rnn.MultiRNNCell([make_cell(rnn_size) for _ in range(num_layers)])
     
    # 3. Dense layer to translate the decoder's output at each time 
    # step into a choice from the target vocabulary
    output_layer = Dense(target_vocab_size,
                         kernel_initializer = tf.truncated_normal_initializer(mean = 0.0, stddev=0.1))


    # 4. Set up a training decoder and an inference decoder
    # Training Decoder
    with tf.variable_scope("decode"):

        # Helper for the training process. Used by BasicDecoder to read inputs.
        training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=dec_embed_input,
                                                            sequence_length=target_sequence_length,
                                                            time_major=False)
        
        
        # Basic decoder
        training_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                           training_helper,
                                                           enc_state,
                                                           output_layer) 
        
        # Perform dynamic decoding using the decoder
        training_decoder_output = tf.contrib.seq2seq.dynamic_decode(training_decoder,
                                                                       impute_finished=True,
                                                                       maximum_iterations=max_target_sequence_length)[0]
    # 5. Inference Decoder
    # Reuses the same parameters trained by the training process
    with tf.variable_scope("decode", reuse=True):
        start_tokens = tf.tile(tf.constant([target_letter_to_int['<GO>']], dtype=tf.int32), [batch_size], name='start_tokens')

        # Helper for the inference process.
        inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(dec_embeddings,
                                                                start_tokens,
                                                                target_letter_to_int['<EOS>'])

        # Basic decoder
        inference_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                        inference_helper,
                                                        enc_state,
                                                        output_layer)
        
        # Perform dynamic decoding using the decoder
        inference_decoder_output = tf.contrib.seq2seq.dynamic_decode(inference_decoder,
                                                            impute_finished=True,
                                                            maximum_iterations=max_target_sequence_length)[0]
         

    
    return training_decoder_output, inference_decoder_output

2.3 Seq2Seq Model


def seq2seq_model(input_data, targets, lr, target_sequence_length, 
                  max_target_sequence_length, source_sequence_length,
                  source_vocab_size, target_vocab_size,
                  enc_embedding_size, dec_embedding_size, 
                  rnn_size, num_layers):
    
    # Pass the input data through the encoder. We'll ignore the encoder output, but use the state
    _, enc_state = encoding_layer(input_data, 
                                  rnn_size, 
                                  num_layers, 
                                  source_sequence_length,
                                  source_vocab_size, 
                                  enc_embedding_size)
    
    
    # Prepare the target sequences we'll feed to the decoder in training mode
    dec_input = process_decoder_input(targets, target_letter_to_int, batch_size)
    
    # Pass encoder state and decoder inputs to the decoders
    training_decoder_output, inference_decoder_output = decoding_layer(target_letter_to_int, 
                                                                       dec_embedding_size, 
                                                                       num_layers, 
                                                                       rnn_size,
                                                                       target_sequence_length,
                                                                       max_target_sequence_length,
                                                                       enc_state, 
                                                                       dec_input) 
    
    return training_decoder_output, inference_decoder_output
    


training_decoder_output and inference_decoder_output have the same output structure (figure omitted).

Finally, we use tf.contrib.seq2seq.sequence_loss() to compute the loss.

# Build the graph
train_graph = tf.Graph()
# Set the graph to default to ensure that it is ready for training
with train_graph.as_default():
    
    # Load the model inputs    
    input_data, targets, lr, target_sequence_length, max_target_sequence_length, source_sequence_length = get_model_inputs()
    
    # Create the training and inference logits
    training_decoder_output, inference_decoder_output = seq2seq_model(input_data, 
                                                                      targets, 
                                                                      lr, 
                                                                      target_sequence_length, 
                                                                      max_target_sequence_length, 
                                                                      source_sequence_length,
                                                                      len(source_letter_to_int),
                                                                      len(target_letter_to_int),
                                                                      encoding_embedding_size, 
                                                                      decoding_embedding_size, 
                                                                      rnn_size, 
                                                                      num_layers)    
    
    # Create tensors for the training logits and inference logits
    training_logits = tf.identity(training_decoder_output.rnn_output, 'logits')
    inference_logits = tf.identity(inference_decoder_output.sample_id, name='predictions')
    
    # Create the weights for sequence_loss
    masks = tf.sequence_mask(target_sequence_length, max_target_sequence_length, dtype=tf.float32, name='masks')

    with tf.name_scope("optimization"):
        
        # Loss function
        cost = tf.contrib.seq2seq.sequence_loss(
            training_logits,
            targets,
            masks)

        # Optimizer
        optimizer = tf.train.AdamOptimizer(lr)

        # Gradient Clipping
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -5., 5.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)

Batching

We want the padded target batches to look like this: (figure omitted)

and the padded source batches to look like this: (figure omitted)

def pad_sentence_batch(sentence_batch, pad_int):
    """Pad sentences with <PAD> so that each sentence of a batch has the same length"""
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [pad_int] * (max_sentence - len(sentence)) for sentence in sentence_batch]

def get_batches(targets, sources, batch_size, source_pad_int, target_pad_int):
    """Batch targets, sources, and the lengths of their sentences together"""
    for batch_i in range(0, len(sources)//batch_size):
        start_i = batch_i * batch_size
        sources_batch = sources[start_i:start_i + batch_size]
        targets_batch = targets[start_i:start_i + batch_size]
        pad_sources_batch = np.array(pad_sentence_batch(sources_batch, source_pad_int))
        pad_targets_batch = np.array(pad_sentence_batch(targets_batch, target_pad_int))
        
        # Need the lengths for the _lengths parameters
        pad_targets_lengths = []
        for target in pad_targets_batch:
            pad_targets_lengths.append(len(target))
        
        pad_source_lengths = []
        for source in pad_sources_batch:
            pad_source_lengths.append(len(source))
        
        yield pad_targets_batch, pad_sources_batch, pad_targets_lengths, pad_source_lengths
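
A quick check of the padding helper (illustrative values):

```python
# Shorter sequences are right-padded with pad_int up to the batch maximum.
print(pad_sentence_batch([[5, 6], [7, 8, 9]], 0))
# [[5, 6, 0], [7, 8, 9]]
```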

Training

# Split data to training and validation sets
train_source = source_letter_ids[batch_size:]
train_target = target_letter_ids[batch_size:]
valid_source = source_letter_ids[:batch_size]
valid_target = target_letter_ids[:batch_size]
(valid_targets_batch, valid_sources_batch, valid_targets_lengths, valid_sources_lengths) = next(get_batches(valid_target, valid_source, batch_size,
                           source_letter_to_int['<PAD>'],
                           target_letter_to_int['<PAD>']))

display_step = 20 # Check training loss after every 20 batches

checkpoint = "./best_model.ckpt" 
with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())
        
    for epoch_i in range(1, epochs+1):
        for batch_i, (targets_batch, sources_batch, targets_lengths, sources_lengths) in enumerate(
                get_batches(train_target, train_source, batch_size,
                           source_letter_to_int['<PAD>'],
                           target_letter_to_int['<PAD>'])):
            
            # Training step
            _, loss = sess.run(
                [train_op, cost],
                {input_data: sources_batch,
                 targets: targets_batch,
                 lr: learning_rate,
                 target_sequence_length: targets_lengths,
                 source_sequence_length: sources_lengths})

            # Debug message updating us on the status of the training
            if batch_i % display_step == 0 and batch_i > 0:
                
                # Calculate validation cost
                validation_loss = sess.run(
                [cost],
                {input_data: valid_sources_batch,
                 targets: valid_targets_batch,
                 lr: learning_rate,
                 target_sequence_length: valid_targets_lengths,
                 source_sequence_length: valid_sources_lengths})
                
                print('Epoch {:>3}/{} Batch {:>4}/{} - Loss: {:>6.3f}  - Validation loss: {:>6.3f}'
                      .format(epoch_i,
                              epochs, 
                              batch_i, 
                              len(train_source) // batch_size, 
                              loss, 
                              validation_loss[0]))

    
    
    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, checkpoint)
    print('Model Trained and Saved')
Epoch   1/60 Batch   20/77 - Loss:  2.471  - Validation loss:  2.490
Epoch   1/60 Batch   40/77 - Loss:  2.277  - Validation loss:  2.230
Epoch   1/60 Batch   60/77 - Loss:  1.957  - Validation loss:  2.004
Epoch   2/60 Batch   20/77 - Loss:  1.665  - Validation loss:  1.745
Epoch   2/60 Batch   40/77 - Loss:  1.673  - Validation loss:  1.631
Epoch   2/60 Batch   60/77 - Loss:  1.505  - Validation loss:  1.538
Epoch   3/60 Batch   20/77 - Loss:  1.379  - Validation loss:  1.443
Epoch   3/60 Batch   40/77 - Loss:  1.452  - Validation loss:  1.418
Epoch   3/60 Batch   60/77 - Loss:  1.351  - Validation loss:  1.377
Epoch   4/60 Batch   20/77 - Loss:  1.209  - Validation loss:  1.269
Epoch   4/60 Batch   40/77 - Loss:  1.262  - Validation loss:  1.230
Epoch   4/60 Batch   60/77 - Loss:  1.152  - Validation loss:  1.192
Epoch   5/60 Batch   20/77 - Loss:  1.080  - Validation loss:  1.131
Epoch   5/60 Batch   40/77 - Loss:  1.135  - Validation loss:  1.107
Epoch   5/60 Batch   60/77 - Loss:  1.041  - Validation loss:  1.080
Epoch   6/60 Batch   20/77 - Loss:  0.978  - Validation loss:  1.029
Epoch   6/60 Batch   40/77 - Loss:  1.023  - Validation loss:  1.005
Epoch   6/60 Batch   60/77 - Loss:  0.935  - Validation loss:  0.982
Epoch   7/60 Batch   20/77 - Loss:  0.870  - Validation loss:  0.938
Epoch   7/60 Batch   40/77 - Loss:  0.908  - Validation loss:  0.918
Epoch   7/60 Batch   60/77 - Loss:  0.836  - Validation loss:  0.895
Epoch   8/60 Batch   20/77 - Loss:  0.777  - Validation loss:  0.852
Epoch   8/60 Batch   40/77 - Loss:  0.814  - Validation loss:  0.828
Epoch   8/60 Batch   60/77 - Loss:  0.754  - Validation loss:  0.799
Epoch   9/60 Batch   20/77 - Loss:  0.686  - Validation loss:  0.751
Epoch   9/60 Batch   40/77 - Loss:  0.720  - Validation loss:  0.732
Epoch   9/60 Batch   60/77 - Loss:  0.678  - Validation loss:  0.710
Epoch  10/60 Batch   20/77 - Loss:  0.613  - Validation loss:  0.673
Epoch  10/60 Batch   40/77 - Loss:  0.645  - Validation loss:  0.654
Epoch  10/60 Batch   60/77 - Loss:  0.610  - Validation loss:  0.647
Epoch  11/60 Batch   20/77 - Loss:  0.532  - Validation loss:  0.590
Epoch  11/60 Batch   40/77 - Loss:  0.569  - Validation loss:  0.572
Epoch  11/60 Batch   60/77 - Loss:  0.531  - Validation loss:  0.585
Epoch  12/60 Batch   20/77 - Loss:  0.476  - Validation loss:  0.532
Epoch  12/60 Batch   40/77 - Loss:  0.511  - Validation loss:  0.509
Epoch  12/60 Batch   60/77 - Loss:  0.477  - Validation loss:  0.493
Epoch  13/60 Batch   20/77 - Loss:  0.416  - Validation loss:  0.468
Epoch  13/60 Batch   40/77 - Loss:  0.462  - Validation loss:  0.459
Epoch  13/60 Batch   60/77 - Loss:  0.432  - Validation loss:  0.447
Epoch  14/60 Batch   20/77 - Loss:  0.372  - Validation loss:  0.423
Epoch  14/60 Batch   40/77 - Loss:  0.419  - Validation loss:  0.414
Epoch  14/60 Batch   60/77 - Loss:  0.392  - Validation loss:  0.404
Epoch  15/60 Batch   20/77 - Loss:  0.333  - Validation loss:  0.384
Epoch  15/60 Batch   40/77 - Loss:  0.382  - Validation loss:  0.377
Epoch  15/60 Batch   60/77 - Loss:  0.358  - Validation loss:  0.369
Epoch  16/60 Batch   20/77 - Loss:  0.300  - Validation loss:  0.351
Epoch  16/60 Batch   40/77 - Loss:  0.348  - Validation loss:  0.345
Epoch  16/60 Batch   60/77 - Loss:  0.327  - Validation loss:  0.338
Epoch  17/60 Batch   20/77 - Loss:  0.272  - Validation loss:  0.324
Epoch  17/60 Batch   40/77 - Loss:  0.316  - Validation loss:  0.317
Epoch  17/60 Batch   60/77 - Loss:  0.299  - Validation loss:  0.311
Epoch  18/60 Batch   20/77 - Loss:  0.246  - Validation loss:  0.296
Epoch  18/60 Batch   40/77 - Loss:  0.286  - Validation loss:  0.293
Epoch  18/60 Batch   60/77 - Loss:  0.272  - Validation loss:  0.283
Epoch  19/60 Batch   20/77 - Loss:  0.221  - Validation loss:  0.270
Epoch  19/60 Batch   40/77 - Loss:  0.256  - Validation loss:  0.269
Epoch  19/60 Batch   60/77 - Loss:  0.245  - Validation loss:  0.258
Epoch  20/60 Batch   20/77 - Loss:  0.198  - Validation loss:  0.244
Epoch  20/60 Batch   40/77 - Loss:  0.228  - Validation loss:  0.243
Epoch  20/60 Batch   60/77 - Loss:  0.221  - Validation loss:  0.234
Epoch  21/60 Batch   20/77 - Loss:  0.183  - Validation loss:  0.224
Epoch  21/60 Batch   40/77 - Loss:  0.205  - Validation loss:  0.218
Epoch  21/60 Batch   60/77 - Loss:  0.195  - Validation loss:  0.212
Epoch  22/60 Batch   20/77 - Loss:  0.161  - Validation loss:  0.206
Epoch  22/60 Batch   40/77 - Loss:  0.185  - Validation loss:  0.218
Epoch  22/60 Batch   60/77 - Loss:  0.173  - Validation loss:  0.194
Epoch  23/60 Batch   20/77 - Loss:  0.136  - Validation loss:  0.182
Epoch  23/60 Batch   40/77 - Loss:  0.162  - Validation loss:  0.176
Epoch  23/60 Batch   60/77 - Loss:  0.150  - Validation loss:  0.170
Epoch  24/60 Batch   20/77 - Loss:  0.120  - Validation loss:  0.166
Epoch  24/60 Batch   40/77 - Loss:  1.044  - Validation loss:  0.249
Epoch  24/60 Batch   60/77 - Loss:  0.199  - Validation loss:  0.250
Epoch  25/60 Batch   20/77 - Loss:  0.112  - Validation loss:  0.157
Epoch  25/60 Batch   40/77 - Loss:  0.135  - Validation loss:  0.149
Epoch  25/60 Batch   60/77 - Loss:  0.126  - Validation loss:  0.145
Epoch  26/60 Batch   20/77 - Loss:  0.097  - Validation loss:  0.140
Epoch  26/60 Batch   40/77 - Loss:  0.120  - Validation loss:  0.135
Epoch  26/60 Batch   60/77 - Loss:  0.114  - Validation loss:  0.132
Epoch  27/60 Batch   20/77 - Loss:  0.087  - Validation loss:  0.129
Epoch  27/60 Batch   40/77 - Loss:  0.109  - Validation loss:  0.124
Epoch  27/60 Batch   60/77 - Loss:  0.105  - Validation loss:  0.122
Epoch  28/60 Batch   20/77 - Loss:  0.079  - Validation loss:  0.119
Epoch  28/60 Batch   40/77 - Loss:  0.099  - Validation loss:  0.115
Epoch  28/60 Batch   60/77 - Loss:  0.097  - Validation loss:  0.113
Epoch  29/60 Batch   20/77 - Loss:  0.072  - Validation loss:  0.111
Epoch  29/60 Batch   40/77 - Loss:  0.090  - Validation loss:  0.106
Epoch  29/60 Batch   60/77 - Loss:  0.089  - Validation loss:  0.105
Epoch  30/60 Batch   20/77 - Loss:  0.066  - Validation loss:  0.103
Epoch  30/60 Batch   40/77 - Loss:  0.083  - Validation loss:  0.099
Epoch  30/60 Batch   60/77 - Loss:  0.082  - Validation loss:  0.097
Epoch  31/60 Batch   20/77 - Loss:  0.060  - Validation loss:  0.096
Epoch  31/60 Batch   40/77 - Loss:  0.076  - Validation loss:  0.092
Epoch  31/60 Batch   60/77 - Loss:  0.076  - Validation loss:  0.090
Epoch  32/60 Batch   20/77 - Loss:  0.055  - Validation loss:  0.089
Epoch  32/60 Batch   40/77 - Loss:  0.070  - Validation loss:  0.085
Epoch  32/60 Batch   60/77 - Loss:  0.070  - Validation loss:  0.084
Epoch  33/60 Batch   20/77 - Loss:  0.051  - Validation loss:  0.083
Epoch  33/60 Batch   40/77 - Loss:  0.064  - Validation loss:  0.079
Epoch  33/60 Batch   60/77 - Loss:  0.064  - Validation loss:  0.078
Epoch  34/60 Batch   20/77 - Loss:  0.047  - Validation loss:  0.077
Epoch  34/60 Batch   40/77 - Loss:  0.059  - Validation loss:  0.074
Epoch  34/60 Batch   60/77 - Loss:  0.059  - Validation loss:  0.073
Epoch  35/60 Batch   20/77 - Loss:  0.043  - Validation loss:  0.072
Epoch  35/60 Batch   40/77 - Loss:  0.054  - Validation loss:  0.069
Epoch  35/60 Batch   60/77 - Loss:  0.055  - Validation loss:  0.068
Epoch  36/60 Batch   20/77 - Loss:  0.039  - Validation loss:  0.068
Epoch  36/60 Batch   40/77 - Loss:  0.050  - Validation loss:  0.065
Epoch  36/60 Batch   60/77 - Loss:  0.050  - Validation loss:  0.063
Epoch  37/60 Batch   20/77 - Loss:  0.036  - Validation loss:  0.063
Epoch  37/60 Batch   40/77 - Loss:  0.046  - Validation loss:  0.061
Epoch  37/60 Batch   60/77 - Loss:  0.046  - Validation loss:  0.059
Epoch  38/60 Batch   20/77 - Loss:  0.034  - Validation loss:  0.060
Epoch  38/60 Batch   40/77 - Loss:  0.042  - Validation loss:  0.057
Epoch  38/60 Batch   60/77 - Loss:  0.043  - Validation loss:  0.055
Epoch  39/60 Batch   20/77 - Loss:  0.031  - Validation loss:  0.056
Epoch  39/60 Batch   40/77 - Loss:  0.039  - Validation loss:  0.053
Epoch  39/60 Batch   60/77 - Loss:  0.040  - Validation loss:  0.052
Epoch  40/60 Batch   20/77 - Loss:  0.029  - Validation loss:  0.053
Epoch  40/60 Batch   40/77 - Loss:  0.036  - Validation loss:  0.050
Epoch  40/60 Batch   60/77 - Loss:  0.037  - Validation loss:  0.048
Epoch  41/60 Batch   20/77 - Loss:  0.027  - Validation loss:  0.050
Epoch  41/60 Batch   40/77 - Loss:  0.034  - Validation loss:  0.047
Epoch  41/60 Batch   60/77 - Loss:  0.034  - Validation loss:  0.045
Epoch  42/60 Batch   20/77 - Loss:  0.025  - Validation loss:  0.048
Epoch  42/60 Batch   40/77 - Loss:  0.031  - Validation loss:  0.044
Epoch  42/60 Batch   60/77 - Loss:  0.032  - Validation loss:  0.042
Epoch  43/60 Batch   20/77 - Loss:  0.023  - Validation loss:  0.045
Epoch  43/60 Batch   40/77 - Loss:  0.029  - Validation loss:  0.041
Epoch  43/60 Batch   60/77 - Loss:  0.029  - Validation loss:  0.039
Epoch  44/60 Batch   20/77 - Loss:  0.021  - Validation loss:  0.043
Epoch  44/60 Batch   40/77 - Loss:  0.028  - Validation loss:  0.039
Epoch  44/60 Batch   60/77 - Loss:  0.027  - Validation loss:  0.037
Epoch  45/60 Batch   20/77 - Loss:  0.019  - Validation loss:  0.041
Epoch  45/60 Batch   40/77 - Loss:  0.026  - Validation loss:  0.037
Epoch  45/60 Batch   60/77 - Loss:  0.026  - Validation loss:  0.034
Epoch  46/60 Batch   20/77 - Loss:  0.018  - Validation loss:  0.039
Epoch  46/60 Batch   40/77 - Loss:  0.024  - Validation loss:  0.035
Epoch  46/60 Batch   60/77 - Loss:  0.024  - Validation loss:  0.032
Epoch  47/60 Batch   20/77 - Loss:  0.016  - Validation loss:  0.037
Epoch  47/60 Batch   40/77 - Loss:  0.023  - Validation loss:  0.033
Epoch  47/60 Batch   60/77 - Loss:  0.022  - Validation loss:  0.030
Epoch  48/60 Batch   20/77 - Loss:  0.015  - Validation loss:  0.035
Epoch  48/60 Batch   40/77 - Loss:  0.021  - Validation loss:  0.031
Epoch  48/60 Batch   60/77 - Loss:  0.021  - Validation loss:  0.029
Epoch  49/60 Batch   20/77 - Loss:  0.014  - Validation loss:  0.034
Epoch  49/60 Batch   40/77 - Loss:  0.020  - Validation loss:  0.030
Epoch  49/60 Batch   60/77 - Loss:  0.019  - Validation loss:  0.027
Epoch  50/60 Batch   20/77 - Loss:  0.012  - Validation loss:  0.032
Epoch  50/60 Batch   40/77 - Loss:  0.019  - Validation loss:  0.028
Epoch  50/60 Batch   60/77 - Loss:  0.018  - Validation loss:  0.025
Epoch  51/60 Batch   20/77 - Loss:  0.011  - Validation loss:  0.031
Epoch  51/60 Batch   40/77 - Loss:  0.018  - Validation loss:  0.027
Epoch  51/60 Batch   60/77 - Loss:  0.017  - Validation loss:  0.024
Epoch  52/60 Batch   20/77 - Loss:  0.010  - Validation loss:  0.030
Epoch  52/60 Batch   40/77 - Loss:  0.017  - Validation loss:  0.026
Epoch  52/60 Batch   60/77 - Loss:  0.016  - Validation loss:  0.023
Epoch  53/60 Batch   20/77 - Loss:  0.009  - Validation loss:  0.028
Epoch  53/60 Batch   40/77 - Loss:  0.016  - Validation loss:  0.025
Epoch  53/60 Batch   60/77 - Loss:  0.015  - Validation loss:  0.022
Epoch  54/60 Batch   20/77 - Loss:  0.009  - Validation loss:  0.027
Epoch  54/60 Batch   40/77 - Loss:  0.015  - Validation loss:  0.024
Epoch  54/60 Batch   60/77 - Loss:  0.014  - Validation loss:  0.021
Epoch  55/60 Batch   20/77 - Loss:  0.008  - Validation loss:  0.025
Epoch  55/60 Batch   40/77 - Loss:  0.014  - Validation loss:  0.024
Epoch  55/60 Batch   60/77 - Loss:  0.013  - Validation loss:  0.020
Epoch  56/60 Batch   20/77 - Loss:  0.007  - Validation loss:  0.023
Epoch  56/60 Batch   40/77 - Loss:  0.013  - Validation loss:  0.023
Epoch  56/60 Batch   60/77 - Loss:  0.012  - Validation loss:  0.019
Epoch  57/60 Batch   20/77 - Loss:  0.007  - Validation loss:  0.021
Epoch  57/60 Batch   40/77 - Loss:  0.013  - Validation loss:  0.023
Epoch  57/60 Batch   60/77 - Loss:  0.011  - Validation loss:  0.018
Epoch  58/60 Batch   20/77 - Loss:  0.007  - Validation loss:  0.019
Epoch  58/60 Batch   40/77 - Loss:  0.012  - Validation loss:  0.023
Epoch  58/60 Batch   60/77 - Loss:  0.011  - Validation loss:  0.017
Epoch  59/60 Batch   20/77 - Loss:  0.006  - Validation loss:  0.018
Epoch  59/60 Batch   40/77 - Loss:  0.011  - Validation loss:  0.022
Epoch  59/60 Batch   60/77 - Loss:  0.010  - Validation loss:  0.017
Epoch  60/60 Batch   20/77 - Loss:  0.006  - Validation loss:  0.017
Epoch  60/60 Batch   40/77 - Loss:  0.010  - Validation loss:  0.021
Epoch  60/60 Batch   60/77 - Loss:  0.010  - Validation loss:  0.017
Model Trained and Saved

Prediction

def source_to_seq(text):
    '''Prepare the text for the model'''
    sequence_length = 7  # pad every input up to a fixed length of 7
    return [source_letter_to_int.get(word, source_letter_to_int['<UNK>']) for word in text] + [source_letter_to_int['<PAD>']] * (sequence_length - len(text))

input_sentence = 'hello'
text = source_to_seq(input_sentence)

checkpoint = "./best_model.ckpt"

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(checkpoint + '.meta')
    loader.restore(sess, checkpoint)

    input_data = loaded_graph.get_tensor_by_name('input:0')
    logits = loaded_graph.get_tensor_by_name('predictions:0')
    source_sequence_length = loaded_graph.get_tensor_by_name('source_sequence_length:0')
    target_sequence_length = loaded_graph.get_tensor_by_name('target_sequence_length:0')
    
    #Multiply by batch_size to match the model's input parameters
    answer_logits = sess.run(logits, {input_data: [text]*batch_size, 
                                      target_sequence_length: [len(text)]*batch_size, 
                                      source_sequence_length: [len(text)]*batch_size})[0] 


pad = source_letter_to_int["<PAD>"] 

print('Original Text:', input_sentence)

print('\nSource')
print('  Word Ids:    {}'.format([i for i in text]))
print('  Input Words: {}'.format(" ".join([source_int_to_letter[i] for i in text])))

print('\nTarget')
print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
print('  Response Words: {}'.format(" ".join([target_int_to_letter[i] for i in answer_logits if i != pad])))
INFO:tensorflow:Restoring parameters from ./best_model.ckpt
Original Text: hello

Source
  Word Ids:    [14, 4, 27, 27, 21, 0, 0]
  Input Words: h e l l o <PAD> <PAD>

Target
  Word Ids:       [4, 14, 27, 27, 21, 3]
  Response Words: e h l l o <EOS>