Haha, I'm here again!!! Setting a flag once more: now that school has started, I'll keep up the update frequency!!!
This time we test EfficientDet on smoking detection. So, old rules: the result pictures come first!!!
Then, the same process as always. Let's walk through it!!!
<> One. Environment configuration
* python==3.7.4
* tensorflow-gpu==1.14.0
* keras==2.2.4
* numpy==1.17.4
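If you want to reproduce this environment, a minimal install (assuming the GPU driver and the CUDA 10.0 toolkit that tensorflow-gpu 1.14 needs are already set up) would be something like:

pip install tensorflow-gpu==1.14.0 keras==2.2.4 numpy==1.17.4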
This time I trained on a rented GPU machine. There was no way around it: EfficientNet eats too much video memory, and my own machine simply couldn't handle it.
<> Two. The smoking dataset
This dataset was annotated with labelme, so the annotations come as JSON files. But this time we need a VOC-style XML dataset, so the JSON data has to be converted.
Sample picture:
The annotated JSON data:
The converted XML format:
The json-to-xml conversion source code is as follows:
# -*- coding: utf-8 -*-
"""
Created on Sun May 31 10:19:23 2020
@author: ywx
"""
import os
import codecs
import json
import shutil
from glob import glob

import cv2
import numpy as np
from sklearn.model_selection import train_test_split

# 1. Label paths
labelme_path = "annotations/"  # original labelme annotation path
saved_path = "VOC2007/"        # save path
isUseTest = True               # whether to create a test set

# 2. Create the required folders
if not os.path.exists(saved_path + "Annotations"):
    os.makedirs(saved_path + "Annotations")
if not os.path.exists(saved_path + "JPEGImages/"):
    os.makedirs(saved_path + "JPEGImages/")
if not os.path.exists(saved_path + "ImageSets/Main/"):
    os.makedirs(saved_path + "ImageSets/Main/")

# 3. Collect the files to process
files = glob(labelme_path + "*.json")
files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
print(files)

# 4. Read the json annotations and write VOC xml
for json_file_ in files:
    json_filename = labelme_path + json_file_ + ".json"
    json_file = json.load(open(json_filename, "r", encoding="utf-8"))
    height, width, channels = cv2.imread('jpeg/' + json_file_ + ".jpg").shape
    with codecs.open(saved_path + "Annotations/" + json_file_ + ".xml", "w", "utf-8") as xml:
        xml.write('<annotation>\n')
        xml.write('\t<folder>' + 'WH_data' + '</folder>\n')
        xml.write('\t<filename>' + json_file_ + ".jpg" + '</filename>\n')
        xml.write('\t<source>\n')
        xml.write('\t\t<database>WH Data</database>\n')
        xml.write('\t\t<annotation>WH</annotation>\n')
        xml.write('\t\t<image>flickr</image>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t</source>\n')
        xml.write('\t<owner>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t\t<name>WH</name>\n')
        xml.write('\t</owner>\n')
        xml.write('\t<size>\n')
        xml.write('\t\t<width>' + str(width) + '</width>\n')
        xml.write('\t\t<height>' + str(height) + '</height>\n')
        xml.write('\t\t<depth>' + str(channels) + '</depth>\n')
        xml.write('\t</size>\n')
        xml.write('\t<segmented>0</segmented>\n')
        for multi in json_file["shapes"]:
            points = np.array(multi["points"])
            labelName = multi["label"]
            xmin = min(points[:, 0])
            xmax = max(points[:, 0])
            ymin = min(points[:, 1])
            ymax = max(points[:, 1])
            label = multi["label"]
            if xmax <= xmin or ymax <= ymin:
                continue  # skip degenerate boxes
            xml.write('\t<object>\n')
            xml.write('\t\t<name>' + labelName + '</name>\n')
            xml.write('\t\t<pose>Unspecified</pose>\n')
            xml.write('\t\t<truncated>1</truncated>\n')
            xml.write('\t\t<difficult>0</difficult>\n')
            xml.write('\t\t<bndbox>\n')
            xml.write('\t\t\t<xmin>' + str(int(xmin)) + '</xmin>\n')
            xml.write('\t\t\t<ymin>' + str(int(ymin)) + '</ymin>\n')
            xml.write('\t\t\t<xmax>' + str(int(xmax)) + '</xmax>\n')
            xml.write('\t\t\t<ymax>' + str(int(ymax)) + '</ymax>\n')
            xml.write('\t\t</bndbox>\n')
            xml.write('\t</object>\n')
            print(json_filename, xmin, ymin, xmax, ymax, label)
        xml.write('</annotation>')

# 5. Copy the pictures into VOC2007/JPEGImages/
image_files = glob("jpeg/" + "*.jpg")
print("copy image files to VOC2007/JPEGImages/")
for image in image_files:
    shutil.copy(image, saved_path + "JPEGImages/")

# 6. Split the files into train/val/test lists
txtsavepath = saved_path + "ImageSets/Main/"
ftrainval = open(txtsavepath + '/trainval.txt', 'w')
ftest = open(txtsavepath + '/test.txt', 'w')
ftrain = open(txtsavepath + '/train.txt', 'w')
fval = open(txtsavepath + '/val.txt', 'w')
total_files = glob("./VOC2007/Annotations/*.xml")
total_files = [i.replace("\\", "/").split("/")[-1].split(".xml")[0] for i in total_files]
trainval_files, test_files = [], []
if isUseTest:
    trainval_files, test_files = train_test_split(total_files, test_size=0.15, random_state=55)
else:
    trainval_files = total_files
for file in trainval_files:
    ftrainval.write(file + "\n")
train_files, val_files = train_test_split(trainval_files, test_size=0.15, random_state=55)
for file in train_files:
    ftrain.write(file + "\n")
for file in val_files:
    fval.write(file + "\n")
for file in test_files:
    print(file)
    ftest.write(file + "\n")
ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
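For reference, the converter only relies on the shapes list in each labelme JSON file: every shape needs a label and a points array. A minimal hand-made example (the label name smoke is my assumption for this dataset, not taken from the actual annotation files):

{
  "shapes": [
    {
      "label": "smoke",
      "points": [[123.0, 45.0], [210.0, 150.0]]
    }
  ]
}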
<> Three. EfficientDet theory
EfficientDet is a target detection network built on EfficientNet, so you need to read up on EfficientNet first. You can also take a look at the introduction to EfficientNet in my earlier post on the development history of convolutional neural networks.
In short, EfficientNet scales the image resolution, the network width, and the network depth jointly: a single compound coefficient controls all three, and different coefficient values give models of different accuracy.
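For reference, this is the compound scaling rule from the EfficientNet paper: with a compound coefficient $\phi$ and constants $\alpha, \beta, \gamma$ found by a small grid search, depth $d$, width $w$, and resolution $r$ are scaled as

d = \alpha^{\phi}, \quad w = \beta^{\phi}, \quad r = \gamma^{\phi}, \quad \text{s.t. } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \;\; \alpha \ge 1, \beta \ge 1, \gamma \ge 1

so that increasing $\phi$ by one roughly doubles the total FLOPS.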
Overall, EfficientDet is a target detection network that uses EfficientNet as its backbone, follows it with a BiFPN feature network, and then outputs the detection results.
1. EfficientNet
EfficientNet is mainly composed of Efficient Blocks (mobile inverted residual bottlenecks), each built around a small residual connection and a large residual connection, with an attention module added inside. The implementation of one such block:
def mb_conv_block(inputs, block_args, activation, drop_rate=None, prefix=''):
    """Mobile Inverted Residual Bottleneck."""
    has_se = (block_args.se_ratio is not None) and (0 < block_args.se_ratio <= 1)
    bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1
    # workaround over non-working dropout with None in noise_shape in tf.keras
    Dropout = get_dropout(backend=backend, layers=layers, models=models, utils=keras_utils)

    # Expansion phase: 1x1 conv widens the channels by expand_ratio
    filters = block_args.input_filters * block_args.expand_ratio
    if block_args.expand_ratio != 1:
        x = layers.Conv2D(filters, 1, padding='same', use_bias=False,
                          kernel_initializer=CONV_KERNEL_INITIALIZER,
                          name=prefix + 'expand_conv')(inputs)
        x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'expand_bn')(x)
        x = layers.Activation(activation, name=prefix + 'expand_activation')(x)
    else:
        x = inputs

    # Depthwise Convolution
    x = layers.DepthwiseConv2D(block_args.kernel_size, strides=block_args.strides,
                               padding='same', use_bias=False,
                               depthwise_initializer=CONV_KERNEL_INITIALIZER,
                               name=prefix + 'dwconv')(x)
    x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'bn')(x)
    x = layers.Activation(activation, name=prefix + 'activation')(x)

    # Squeeze and Excitation phase: channel attention over the depthwise output
    if has_se:
        num_reduced_filters = max(1, int(block_args.input_filters * block_args.se_ratio))
        se_tensor = layers.GlobalAveragePooling2D(name=prefix + 'se_squeeze')(x)
        target_shape = ((1, 1, filters) if backend.image_data_format() == 'channels_last'
                        else (filters, 1, 1))
        se_tensor = layers.Reshape(target_shape, name=prefix + 'se_reshape')(se_tensor)
        se_tensor = layers.Conv2D(num_reduced_filters, 1, activation=activation,
                                  padding='same', use_bias=True,
                                  kernel_initializer=CONV_KERNEL_INITIALIZER,
                                  name=prefix + 'se_reduce')(se_tensor)
        se_tensor = layers.Conv2D(filters, 1, activation='sigmoid',
                                  padding='same', use_bias=True,
                                  kernel_initializer=CONV_KERNEL_INITIALIZER,
                                  name=prefix + 'se_expand')(se_tensor)
        if backend.backend() == 'theano':
            # For the Theano backend, we have to explicitly make
            # the excitation weights broadcastable.
            pattern = ([True, True, True, False] if backend.image_data_format() == 'channels_last'
                       else [True, False, True, True])
            se_tensor = layers.Lambda(lambda x: backend.pattern_broadcast(x, pattern),
                                      name=prefix + 'se_broadcast')(se_tensor)
        x = layers.multiply([x, se_tensor], name=prefix + 'se_excite')

    # Output phase: 1x1 conv projects back down to output_filters
    x = layers.Conv2D(block_args.output_filters, 1, padding='same', use_bias=False,
                      kernel_initializer=CONV_KERNEL_INITIALIZER,
                      name=prefix + 'project_conv')(x)
    x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'project_bn')(x)

    # Identity skip connection (with drop-connect) when stride is 1 and channels match
    if block_args.id_skip and all(s == 1 for s in block_args.strides) \
            and block_args.input_filters == block_args.output_filters:
        if drop_rate and (drop_rate > 0):
            x = Dropout(drop_rate, noise_shape=(None, 1, 1, 1), name=prefix + 'drop')(x)
        x = layers.add([x, inputs], name=prefix + 'add')
    return x
2. BiFPN
BiFPN improves on FPN's multi-scale feature fusion: it is a weighted bidirectional feature pyramid network. FPN introduces a top-down path to fuse the multi-scale features P3~P7; BiFPN adds a bottom-up path on top of that and learns a weight for every fused input.
The BiFPN module is similar to the FPN (feature pyramid network) but more complicated. Its main purpose is to enhance features and extract more representative ones.
Here is the FPN network:
And this is the network diagram of BiFPN:
A single BiFPN module looks like this:
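The core of BiFPN's weighted fusion is "fast normalized fusion": each input feature map gets a learnable non-negative weight, and the weights are normalized by their sum. A minimal sketch as a custom Keras layer (the class name and exact placement are my own illustration, not this repo's code):

import tensorflow as tf
from keras import layers

class WeightedAdd(layers.Layer):
    """Fuses equal-shape feature maps: out = sum(w_i * x_i) / (sum(w_j) + eps)."""
    def __init__(self, epsilon=1e-4, **kwargs):
        super(WeightedAdd, self).__init__(**kwargs)
        self.epsilon = epsilon

    def build(self, input_shape):
        n = len(input_shape)  # number of incoming feature maps
        self.w = self.add_weight(name='w', shape=(n,),
                                 initializer='ones', trainable=True)
        super(WeightedAdd, self).build(input_shape)

    def call(self, inputs):
        w = tf.nn.relu(self.w)  # keep the fusion weights non-negative
        fused = tf.add_n([w[i] * inputs[i] for i in range(len(inputs))])
        return fused / (tf.reduce_sum(w) + self.epsilon)

    def compute_output_shape(self, input_shape):
        return input_shape[0]

Usage would look like P4_td = WeightedAdd()([P4_in, P5_upsampled]). Compared with a plain add, the network can learn how much each resolution contributes to the fused feature.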
<> Four. Training process
1. Prepare the dataset
Prepare the smoking data; training uses VOC-format data.
* Put the annotation files into the Annotations folder under VOCdevkit/VOC2007.
* Put the picture files into the JPEGImages folder under VOCdevkit/VOC2007.
* Before training, use the voc2efficientdet.py file to generate the txt lists.
VOCdevkit
└─VOC2007
   ├─ImageSets    # dataset list files, generated by voc2yolo3.py
   ├─Annotations  # image annotations of the dataset, xml format
   ├─JPEGImages   # image files of the dataset
   └─voc2yolo4.py # used to generate the dataset list files
2. Generate the data EfficientDet needs
Next, run voc_annotation.py in the root directory. Before running it, change the classes in voc_annotation.py to your own classes.
Each line of the generated file corresponds to an image's path and the positions of its ground-truth boxes.
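A generated line typically looks like the following (the path and numbers are made up for illustration; the common convention is the image path followed by x_min,y_min,x_max,y_max,class_id for each box):

VOCdevkit/VOC2007/JPEGImages/000001.jpg 120,30,200,150,0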
3. Modify voc_classes.txt
Before training you also need to modify the voc_classes.txt file inside model_data, changing its classes to your own.
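Since this is a single-class task, voc_classes.txt ends up with just one line (assuming the label used during annotation was smoke):

smoke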
4. Modify yolo_anchors.txt
Run kmeans_for_anchors.py to generate yolo_anchors.txt.
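If you are curious what the anchor clustering does, here is a minimal sketch of k-means over box widths and heights with 1 − IoU as the distance, which is the standard approach; the repo's kmeans_for_anchors.py may differ in details:

import numpy as np

def iou(boxes, clusters):
    # boxes: (N, 2) array of widths/heights; clusters: (k, 2)
    w = np.minimum(boxes[:, None, 0], clusters[None, :, 0])
    h = np.minimum(boxes[:, None, 1], clusters[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (clusters[:, 0] * clusters[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, seed=0):
    boxes = boxes.astype(float)
    rng = np.random.RandomState(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)].copy()
    last = None
    while True:
        nearest = np.argmax(iou(boxes, clusters), axis=1)  # highest IoU = closest
        if np.array_equal(nearest, last):
            return clusters  # assignments stable, done
        for i in range(k):
            if np.any(nearest == i):
                clusters[i] = np.median(boxes[nearest == i], axis=0)
        last = nearest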
5. Train
Run train.py.
* In the main function, phi controls which version of EfficientDet is used.
* In the main function, the model_path parameter controls the pre-trained weights; a sketch of both settings follows below.
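As a sketch, the two settings in train.py's main section would look roughly like this (the variable layout and the weight filename are placeholders; check the actual file):

if __name__ == "__main__":
    phi = 0  # which EfficientDet version to build: 0 -> d0, 1 -> d1, ...
    model_path = 'model_data/efficientdet-d0.h5'  # pre-trained weights to start from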
6. Test pictures
You need to modify the model location in the efficientdet.py file, replacing it with your own trained model, and change phi to the EfficientDet version you trained. Then run python predict.py in the root directory to test.
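The edit in efficientdet.py usually amounts to changing a few default values along these lines (the key names and paths are illustrative, not necessarily the file's exact contents):

_defaults = {
    "model_path": 'logs/smoke_trained_weights.h5',  # your trained model (placeholder path)
    "classes_path": 'model_data/voc_classes.txt',
    "phi": 0,                                       # must match the phi used for training
}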
OK, that's it for this time!!!
Oh, one more thing that came to mind: I kept the training logs, so let me show you a wave of the training run!!!
Alright, next time, depending on the situation, I'll update something else, maybe a BERT text post!!!