4 Pretrained Model Libraries and Fine-tuning Code
<p><font color=green><br />
Neural networks owe much of their power to training complex models on very large datasets, which means training from scratch can take days or even weeks of intensive computation. Our own datasets are usually small, so training a complex network on them directly is difficult and prone to overfitting.
One way around this is to train on a large dataset once to produce a pretrained model, then adapt that model to the new dataset. This approach is called transfer learning, or fine-tuning.
</font>
The models listed below were all trained on the <a href="http://www.image-net.org/challenges/LSVRC/2012/" title="ILSVRC-2012-CLS">ILSVRC-2012-CLS</a> image classification dataset.
Your training data must be preprocessed into the same format the pretrained model expects.</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Framework</th>
<th>Accuracy (ImageNet)</th>
<th>Size</th>
<th>Code</th>
</tr>
</thead>
<tbody>
<tr>
<td>Inception v3</td>
<td>TensorFlow</td>
<td>top-1 78.0%, top-5 93.9%</td>
<td>96.2M</td>
<td><a href="https://github.com/tensorflow/models/blob/master/research/slim/nets/inception_v3.py" title="code">code</a></td>
</tr>
<tr>
<td>Xception</td>
<td>Keras</td>
<td>top-1 79.0%, top-5 94.5%</td>
<td>79.8M (no top), 87.6M (with top)</td>
<td><a href="https://github.com/fchollet/deep-learning-models/releases/tag/v0.4" title="link">link</a></td>
</tr>
<tr>
<td>ResNet V1 50</td>
<td>TensorFlow</td>
<td>top-1 75.5%, top-5 92.2%</td>
<td>80.7M</td>
<td><a href="https://github.com/tensorflow/models/blob/master/research/slim/nets/resnet_v1.py" title="code">code</a></td>
</tr>
<tr>
<td>ResNet-152</td>
<td>MXNet</td>
<td>86.4%</td>
<td>230M</td>
<td><a href="http://data.mxnet.io/models/imagenet/resnet/152-layers/" title="link">link</a></td>
</tr>
<tr>
<td>ResNet-50</td>
<td>MXNet</td>
<td>77.4%</td>
<td>98M</td>
<td><a href="http://data.mxnet.io/models/imagenet/resnet/50-layers/" title="link">link</a></td>
</tr>
<tr>
<td>VGG16</td>
<td>TensorFlow</td>
<td>top-5 92.7%</td>
<td>528M</td>
<td><a href="https://www.cs.toronto.edu/~frossard/vgg16/vgg16_weights.npz" title="weights">weights</a>, <a href="https://www.cs.toronto.edu/~frossard/vgg16/vgg16.py" title="code">code</a></td>
</tr>
</tbody>
</table>
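<p>As a quick illustration of how such pretrained weights are consumed, here is a minimal transfer-learning sketch in Keras (this snippet is our own, not from any of the linked repos; it assumes <code>keras.applications</code> is installed, and the 256-class head is a hypothetical example):</p>
<pre><code class="language-python">from keras.applications.xception import Xception
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# load the ImageNet weights without the 1000-class classifier ("no top")
base = Xception(weights='imagenet', include_top=False, input_shape=(299, 299, 3))

# attach a new classifier head sized for our own dataset (256 classes, as an example)
x = GlobalAveragePooling2D()(base.output)
out = Dense(256, activation='softmax')(x)
model = Model(inputs=base.input, outputs=out)

# freeze the pretrained backbone so only the new head is trained at first
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
</code></pre>
<p>This is exactly the "no top" variant from the Xception row above: the downloaded weights cover only the convolutional backbone, and the classifier is supplied by you.</p>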
<h3>MXNet Fine-tuning Example: ResNet-50</h3>
<p>This is the official MXNet example: it fine-tunes an ImageNet-pretrained ResNet-50 on (part of) the Caltech-256 dataset.</p>
<h4>Data preparation</h4>
<ul>
<li>Take 60 images from each class as the training set; the images left over serve as the validation set.</li>
<li>Resize images to 256 pixels on the shorter edge (im2rec's --resize 256).</li>
<li>Convert both sets to RecordIO (.rec) format using im2rec.py.</li>
</ul>
<p>Data preparation script:</p>
<pre><code class="language-python">wget http://www.vision.caltech.edu/Image_Datasets/Caltech256/256_ObjectCategories.tar
tar -xf 256_ObjectCategories.tar
mkdir -p caltech_256_train_60
for i in 256_ObjectCategories/*; do
c=`basename $i`
mkdir -p caltech_256_train_60/$c
for j in `ls $i/*.jpg | shuf | head -n 60`; do
mv $j caltech_256_train_60/$c/
done
done
python ~/mxnet/tools/im2rec.py --list --recursive caltech-256-60-train caltech_256_train_60/
python ~/mxnet/tools/im2rec.py --list --recursive caltech-256-60-val 256_ObjectCategories/
python ~/mxnet/tools/im2rec.py --resize 256 --quality 90 --num-thread 16 caltech-256-60-val 256_ObjectCategories/
python ~/mxnet/tools/im2rec.py --resize 256 --quality 90 --num-thread 16 caltech-256-60-train caltech_256_train_60/
</code></pre>
<p>If this first step is too slow, you can skip it and download the preprocessed caltech-256-60 dataset directly; it is already in .rec format.</p>
<pre><code class="language-python">import os, sys

if sys.version_info[0] >= 3:
    from urllib.request import urlretrieve
else:
    from urllib import urlretrieve

def download(url):
    # download url into the current directory, skipping files that already exist
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urlretrieve(url, filename)

download('http://data.mxnet.io/data/caltech-256/caltech-256-60-train.rec')
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-val.rec')
</code></pre>
<p>Next, build data iterators over the .rec files; these play the same role as a batch generator in TensorFlow.</p>
<pre><code class="language-python">import mxnet as mx

def get_iterators(batch_size, data_shape=(3, 224, 224)):
    # training iterator: shuffled, with random cropping and mirroring for augmentation
    train = mx.io.ImageRecordIter(
        path_imgrec = './caltech-256-60-train.rec',
        data_name   = 'data',
        label_name  = 'softmax_label',
        batch_size  = batch_size,
        data_shape  = data_shape,
        shuffle     = True,
        rand_crop   = True,
        rand_mirror = True)
    # validation iterator: deterministic, no augmentation
    val = mx.io.ImageRecordIter(
        path_imgrec = './caltech-256-60-val.rec',
        data_name   = 'data',
        label_name  = 'softmax_label',
        batch_size  = batch_size,
        data_shape  = data_shape,
        rand_crop   = False,
        rand_mirror = False)
    return (train, val)
</code></pre>
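<p>A quick way to sanity-check the iterators (our addition, assuming the .rec files above are present) is to pull one batch and inspect its shape:</p>
<pre><code class="language-python"># fetch a single DataBatch from the training iterator
train, val = get_iterators(batch_size=8)
batch = train.next()
print(batch.data[0].shape)   # expected: (8, 3, 224, 224)
print(batch.label[0].shape)  # expected: (8,)
</code></pre>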
<p>Download the pretrained 50-layer ResNet model:</p>
<pre><code class="language-python">def get_model(prefix, epoch):
download(prefix+'-symbol.json')
download(prefix+'-%04d.params' % (epoch,))
get_model('http://data.mxnet.io/models/imagenet/resnet/50-layers/resnet-50', 0)
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)</code></pre>
<h4>Training</h4>
<p>1. Adapt the pretrained model for fine-tuning; this is the key step:</p>
<pre><code class="language-python">def get_fine_tune_model(symbol, arg_params, num_classes, layer_name='flatten0'):
    """
    symbol: the pretrained network symbol
    arg_params: the argument parameters of the pretrained model
    num_classes: the number of classes for the fine-tune dataset
    layer_name: the layer name before the last fully-connected layer
    """
    all_layers = symbol.get_internals()
    # cut the network at layer_name and attach a fresh classifier head
    net = all_layers[layer_name + '_output']
    net = mx.symbol.FullyConnected(data=net, num_hidden=num_classes, name='fc1')
    net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
    # drop the old fc1 weights; the new fc1 is initialized from scratch
    new_args = dict({k: arg_params[k] for k in arg_params if 'fc1' not in k})
    return (net, new_args)
</code></pre>
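<p>For a different backbone the right <code>layer_name</code> to cut at may differ. One way to find it (our suggestion, not part of the official example) is to list the last few internal outputs of the symbol:</p>
<pre><code class="language-python"># print the last internal output names; for MXNet's ResNet-50 the layer
# just before the original classifier is 'flatten0' (names vary by model)
print(sym.get_internals().list_outputs()[-10:])
</code></pre>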
<p>2. Build and train the new model.
Here arg_params passes the pretrained weights into the module; allow_missing=True lets the newly added fc1 layer fall back to the initializer.</p>
<pre><code class="language-python">import logging
head = '%(asctime)-15s %(message)s'
logging.basicConfig(level=logging.DEBUG, format=head)

def fit(symbol, arg_params, aux_params, train, val, batch_size, num_gpus):
    devs = [mx.gpu(i) for i in range(num_gpus)]
    mod = mx.mod.Module(symbol=symbol, context=devs)
    mod.fit(train, val,
        num_epoch=8,
        arg_params=arg_params,
        aux_params=aux_params,
        allow_missing=True,
        batch_end_callback=mx.callback.Speedometer(batch_size, 10),
        kvstore='device',
        optimizer='sgd',
        # this learning rate is a bit large; accuracy fluctuates in later epochs
        optimizer_params={'learning_rate': 0.01},
        initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2),
        eval_metric='acc')
    metric = mx.metric.Accuracy()
    # mod.score returns [(name, value)]; return just the accuracy value
    return mod.score(val, metric)[0][1]
</code></pre>
<p>3. Start training:</p>
<pre><code class="language-python">num_classes = 256
batch_per_gpu = 16
num_gpus = 8
(new_sym, new_args) = get_fine_tune_model(sym, arg_params, num_classes)
batch_size = batch_per_gpu * num_gpus
(train, val) = get_iterators(batch_size)
mod_score = fit(new_sym, new_args, aux_params, train, val, batch_size, num_gpus)
assert mod_score > 0.77, "Low training accuracy."
</code></pre>
<p>After 8 epochs the validation accuracy reaches about 78%.</p>
<p>4. The ResNet-152 model:</p>
<pre><code class="language-python">get_model('http://data.mxnet.io/models/imagenet-11k/resnet-152/resnet-152', 0)
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
(new_sym, new_args) = get_fine_tune_model(sym, arg_params, num_classes)
mod_score = fit(new_sym, new_args, aux_params, train, val, batch_size, num_gpus)
assert mod_score > 0.86, "Low training accuracy."
</code></pre>
<p>After 8 epochs the accuracy reaches 86.4%.</p>
<h3>Training on CPU</h3>
<p>To train on CPU, change
<code>batch_size = batch_per_gpu * num_gpus</code> to <code>batch_size = 16</code>,</p>
<p>and replace</p>
<pre><code class="language-python"> devs = [mx.gpu(i) for i in range(num_gpus)]
mod = mx.mod.Module(symbol=symbol, context=devs)</code></pre>
<p>with: <code>mod = mx.mod.Module(symbol=symbol, context=mx.cpu())</code></p>
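<p>Putting both changes together, a CPU variant of fit() could look like this (a sketch of our own; note we also drop kvstore='device', which only applies to multi-GPU training):</p>
<pre><code class="language-python">def fit_cpu(symbol, arg_params, aux_params, train, val, batch_size):
    # identical to fit() above except that everything runs on the CPU context
    mod = mx.mod.Module(symbol=symbol, context=mx.cpu())
    mod.fit(train, val,
        num_epoch=8,
        arg_params=arg_params,
        aux_params=aux_params,
        allow_missing=True,
        batch_end_callback=mx.callback.Speedometer(batch_size, 10),
        optimizer='sgd',
        optimizer_params={'learning_rate': 0.01},
        initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2),
        eval_metric='acc')
    metric = mx.metric.Accuracy()
    return mod.score(val, metric)[0][1]
</code></pre>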
<h4>Sample log output</h4>
<pre><code>2018-08-01 12:18:26,913 Epoch[0] Batch [1530] Speed: 2.69 samples/sec accuracy=0.670000
2018-08-01 12:19:04,238 Epoch[0] Batch [1540] Speed: 2.68 samples/sec accuracy=0.610000
2018-08-01 12:19:07,955 Epoch[0] Train-accuracy=0.600000
2018-08-01 12:19:07,956 Epoch[0] Time cost=5762.745
</code></pre>
<h3>TensorFlow VGG16 Fine-tuning Example</h3>
<p>VGG-16 is a deep convolutional neural network; the 16 refers to its number of weight layers. It reaches 92.7% top-5 test accuracy on ImageNet, a dataset of some 14 million images in 1000 classes. This example is based on the <a href="https://github.com/dgurkaynak/tensorflow-cnn-finetune" title="tensorflow-cnn-finetune">tensorflow-cnn-finetune</a> repository on GitHub, but instead of its MARVEL ship dataset we use the Plant Seedlings dataset from Kaggle.</p>
<ul>
<li>Download the VGG16 weights:</li>
</ul>
<pre><code class="language-bash">./download_weights.sh</code></pre>
<ul>
<li>Dataset</li>
</ul>
<p>Download the dataset locally, then create train.txt under ./data with one "image-path label" pair per line, for example:</p>
<pre><code>/mnt/plantdata/train/Maize/749646c56.png 9
/mnt/plantdata/train/Common_Chickweed/937319dc7.png 3
/mnt/plantdata/train/Common_wheat/cb0bc5c02.png 10
/mnt/plantdata/train/Small-flowered_Cranesbill/ceaf23106.png 6
/mnt/plantdata/train/Scentless_Mayweed/ef6841bdb.png 2
/mnt/plantdata/train/Black-grass/adc5443dc.png 0
/mnt/plantdata/train/Sugar_beet/702261484.png 7
/mnt/plantdata/train/Shepherds_Purse/1c6a48d4f.png 1
</code></pre>
<p>Create the validation list ./data/val.txt in the same way.</p>
<p>To generate these image-path/label lines from the raw images, use read_write_data.py from the repository.</p>
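<p>If you would rather not use read_write_data.py, a minimal sketch for producing such a list yourself could look like the following (our own illustration; the label indices must match whatever class-to-index mapping the repo uses, and the alphabetical order below is only an assumption):</p>
<pre><code class="language-python">import os

def write_list(root, out_path, class_to_idx):
    # walk root/class_name/* and write one "path label" line per image
    with open(out_path, 'w') as f:
        for cls, idx in sorted(class_to_idx.items(), key=lambda kv: kv[1]):
            class_dir = os.path.join(root, cls)
            for fname in sorted(os.listdir(class_dir)):
                if fname.lower().endswith(('.png', '.jpg', '.jpeg')):
                    f.write('%s %d\n' % (os.path.join(class_dir, fname), idx))

# hypothetical mapping; derive the real one the same way the repo does
classes = sorted(os.listdir('/mnt/plantdata/train'))
class_to_idx = {c: i for i, c in enumerate(classes)}
write_list('/mnt/plantdata/train', './data/train.txt', class_to_idx)
</code></pre>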
<ul>
<li>Model:
<strong>in this example only the last layer is retrained</strong></li>
</ul>
<pre><code class="language-python">weights = np.load('vgg16_weights.npz')
keys = sorted(weights.keys())
for i, name in enumerate(keys):
    parts = name.split('_')
    layer = '_'.join(parts[:-1])
    # if layer in skip_layers:
    #     continue
    # skip fc8 so only the last layer is trained from scratch
    if layer == 'fc8' and self.num_classes != 1000:
        continue
    with tf.variable_scope(layer, reuse=True):
        if parts[-1] == 'W':
            var = tf.get_variable('weights')
            session.run(var.assign(weights[name]))
        elif parts[-1] == 'b':
            var = tf.get_variable('biases')
            session.run(var.assign(weights[name]))
</code></pre>
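<p>The snippet above only controls which pretrained weights get loaded; to actually restrict training to the last layer, the optimizer must also be limited to fc8's variables. A hedged TF1-style sketch of our own (here <code>loss</code> is the model's cross-entropy loss defined elsewhere in the repo, and the 0.001 learning rate is an arbitrary example value):</p>
<pre><code class="language-python">import tensorflow as tf

# collect only fc8's variables (the new classifier head)
train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='fc8')

# apply gradient updates to fc8 only; all other layers stay frozen
train_op = tf.train.GradientDescentOptimizer(0.001).minimize(loss, var_list=train_vars)
</code></pre>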
<ul>
<li>Training</li>
</ul>
<pre><code class="language-python">python finetune.py --num_epochs=2 --num_classes=12 --batch_size=10</code></pre>
<p>With the parameter settings above, the training run's TensorBoard scalar curves:
[figure: TensorBoard scalars of the training run]</p>
<p>After two epochs the validation accuracy is 0.65.</p>
<h5>References</h5>
<p><a href="https://github.com/tensorflow/models/blob/master/research/slim/README.md#pre-trained-models">https://github.com/tensorflow/models/blob/master/research/slim/README.md#pre-trained-models</a>
<a href="https://github.com/fchollet/deep-learning-models/releases/">https://github.com/fchollet/deep-learning-models/releases/</a>
<a href="https://mxnet.incubator.apache.org/faq/finetune.html">https://mxnet.incubator.apache.org/faq/finetune.html</a>
<a href="http://data.mxnet.io/models/imagenet-11k-place365-ch/">http://data.mxnet.io/models/imagenet-11k-place365-ch/</a>
<a href="https://github.com/dgurkaynak/tensorflow-cnn-finetune">https://github.com/dgurkaynak/tensorflow-cnn-finetune</a></p>