I ran into this problem when running experiments on a new dataset:
IndexError: too many indices for array: array is 1-dimensional, but 2 were i…
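The cause is that some XML files in the dataset contain empty labels.
First, a quick explanation of what an empty label is:
<zhoz></zhoz> is an empty tag, i.e. there is no value inside it.
A normal tag looks like <zhoz>56</zhoz>.
I followed the method from another blogger's post: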
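To make the distinction concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The helper name `find_empty_tags` is hypothetical and not part of the original post; it lists elements that have neither text nor child elements.

```python
import xml.etree.ElementTree as ET

def find_empty_tags(xml_text):
    """Return the tag names of elements with no text and no child elements."""
    root = ET.fromstring(xml_text)
    empty = []
    for elem in root.iter():
        has_text = elem.text is not None and elem.text.strip() != ""
        has_children = len(elem) > 0
        if not has_text and not has_children:
            empty.append(elem.tag)
    return empty

print(find_empty_tags("<a><zhoz></zhoz></a>"))    # ['zhoz']
print(find_empty_tags("<a><zhoz>56</zhoz></a>"))  # []
```

Whitespace-only text is treated as empty here, which is a design choice for this sketch rather than a rule of the XML format.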

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Move XML annotations that contain no <object> element, together with their images
import os
import shutil
import xml.etree.ElementTree as ET

origin_ann_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/Annotations/'  # original annotations
new_ann_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/xml-save/'  # destination for empty annotations
origin_pic_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/JPEGImages/'
new_pic_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/pic-save/'

# Create the destination directories if they do not exist yet
os.makedirs(new_ann_dir, exist_ok=True)
os.makedirs(new_pic_dir, exist_ok=True)

total = 0    # XML files processed
labeled = 0  # files with at least one <object>
empty = 0    # files with no <object>
for filename in sorted(os.listdir(origin_ann_dir)):
  old_xml = os.path.join(origin_ann_dir, filename)
  # Skip subdirectories and anything that is not an XML file
  if not (os.path.isfile(old_xml) and filename.endswith('.xml')):
    continue
  total += 1
  print("processing", total, filename)
  root = ET.parse(old_xml).getroot()  # parse the XML file and get its root node
  if root.findall('object'):
    labeled += 1
    continue
  # No <object> element: move the annotation and its image out of the dataset
  print(filename)
  empty += 1
  # Use splitext instead of replace("xml", "jpg"), which would also rewrite "xml" inside the name
  stem = os.path.splitext(filename)[0]
  old_pic = os.path.join(origin_pic_dir, stem + '.jpg')
  if os.path.isfile(old_pic):
    shutil.move(old_pic, os.path.join(new_pic_dir, stem + '.jpg'))
  shutil.move(old_xml, os.path.join(new_ann_dir, filename))
print("ok, ", labeled)
print("empty, ", empty)

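This located the XML file responsible for the empty label. Its contents are shown below; note that it has no <object> element: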

<annotation>
	<folder>obj1344</folder>
	<filename>obj1344_frame0000172.jpg</filename>
	<path>D:\Research\valid\obj1344\obj1344_frame0000172.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>480</width>
		<height>270</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
</annotation>
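Why would a file like this end in the IndexError above? The post does not show the traceback, so the following is an assumption: many VOC-style loaders collect one row per `<object>` and then slice the result as a 2-D array. With no objects, the array comes out 1-dimensional and the slice fails. The function `boxes_from_objects` is hypothetical and only illustrates the mechanism with NumPy.

```python
import numpy as np

def boxes_from_objects(objects):
    """Build an array with one [xmin, ymin, xmax, ymax, label] row per object."""
    return np.array(objects, dtype=np.float32)

# One annotated object: the array is 2-D, so column slicing works.
labeled = boxes_from_objects([[181, 144, 355, 270, 0]])
print(labeled.shape)   # (1, 5)
print(labeled[:, :4])

# No objects at all: the array is 1-D with shape (0,).
empty = boxes_from_objects([])
print(empty.shape)     # (0,)
try:
    empty[:, :4]
except IndexError as err:
    print("IndexError:", err)
```

A loader could guard against this by skipping such samples or by reshaping the empty array to `(0, 5)`, but removing or re-annotating the offending files, as done in this post, fixes the data at its source.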

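With the file identified, the image needs to be re-annotated. After opening it in labelImg and annotating it, the resulting XML file looks like this: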

<annotation>
	<folder>JPEGImages</folder>
	<filename>obj1344_frame0000172.jpg</filename>
	<path>D:\new\SSD-master\datasets\VOC2007\JPEGImages\obj1344_frame0000172.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>480</width>
		<height>270</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>plastic</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>181</xmin>
			<ymin>144</ymin>
			<xmax>355</xmax>
			<ymax>270</ymax>
		</bndbox>
	</object>
	<object>
		<name>timestamp</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>10</xmin>
			<ymin>5</ymin>
			<xmax>471</xmax>
			<ymax>43</ymax>
		</bndbox>
	</object>
	<object>
		<name>timestamp</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>14</xmin>
			<ymin>216</ymin>
			<xmax>480</xmax>
			<ymax>270</ymax>
		</bndbox>
	</object>
</annotation>
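Before putting a re-annotated file back into the dataset, it can be worth checking that every box lies inside the image. The sketch below is not from the original post; `read_boxes` and `box_is_valid` are hypothetical helpers, and the sample string is a shortened version of the annotation above.

```python
import xml.etree.ElementTree as ET

def read_boxes(xml_text):
    """Return (width, height, boxes); each box is (name, xmin, ymin, xmax, ymax)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.findall("object"):
        bnd = obj.find("bndbox")
        coords = [int(bnd.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.find("name").text, *coords))
    return width, height, boxes

def box_is_valid(box, width, height):
    """A box is valid if it has positive area and lies inside the image."""
    _, xmin, ymin, xmax, ymax = box
    return 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height

sample = """<annotation>
  <size><width>480</width><height>270</height><depth>3</depth></size>
  <object>
    <name>plastic</name>
    <bndbox><xmin>181</xmin><ymin>144</ymin><xmax>355</xmax><ymax>270</ymax></bndbox>
  </object>
</annotation>"""

width, height, boxes = read_boxes(sample)
print(boxes)  # [('plastic', 181, 144, 355, 270)]
print(all(box_is_valid(b, width, height) for b in boxes))  # True
```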

After replacing the original XML file with the re-annotated one, the code runs.
