libvirt: VFIO PCI passthrough

 

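I. Preparation

1. Kernel boot parameters

Edit the /etc/default/grub file.

Add intel_iommu=on to the GRUB_CMDLINE_LINUX parameter, then regenerate the GRUB configuration and reboot so that the change takes effect.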
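After rebooting, it can help to confirm that the flag actually reached the running kernel. The following is a minimal sketch, not part of the original procedure; the function name has_intel_iommu is made up for illustration.

```shell
# Report whether a kernel command line enables the Intel IOMMU.
# Usage: has_intel_iommu "<command line string>"
has_intel_iommu() {
    for word in $1; do
        if [ "$word" = "intel_iommu=on" ]; then
            return 0
        fi
    done
    return 1
}

# On a real host, check the running kernel (skipped if /proc/cmdline is unreadable)
if [ -r /proc/cmdline ]; then
    if has_intel_iommu "$(cat /proc/cmdline)"; then
        echo "intel_iommu=on is active"
    else
        echo "intel_iommu=on is missing from the running kernel command line"
    fi
fi
```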

2. Load the modules

modprobe vfio

modprobe vfio-pci
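To verify that both modules are loaded, /proc/modules can be inspected. This is a small sketch under two assumptions: the modules are built as loadable modules (built-in drivers do not appear in /proc/modules), and the helper name module_loaded is hypothetical. Note that vfio-pci is listed there as vfio_pci.

```shell
# Succeed if the named module appears in /proc/modules.
# $1: module name (dashes are treated as underscores)
# $2: modules file (optional, defaults to /proc/modules)
module_loaded() {
    name=$(echo "$1" | tr '-' '_')
    grep -q "^$name " "${2:-/proc/modules}"
}
```

For example, `module_loaded vfio-pci && echo loaded` prints "loaded" once modprobe has succeeded.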

3. Prepare a virtual machine

For example, a virtual machine named vfio_test.

4. Enable IOMMU in the BIOS

Navigate through each BIOS screen using the "arrow" keys and find the "IOMMU," "I/O Memory Management Unit," or "Intel® VT-d" setting (usually located under the "Advanced" or "Chipset/Northbridge/Tylersburg IOH/Intel VT for Directed I/O Configuration" settings menu). Move the cursor over the setting selection box using the "arrow" keys and press the "Page Up" or "Page Down" or specified key to select "Enabled."

II. An iommu_group containing a single device

1. Choose the PCI device to pass through; here its PCI address is 0000:0b:00.0

[root@host192 libvirt]# lspci | grep -i fibre

0b:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)

 

2. Look up its iommu_group: the group is 19, and the whole group contains only one device, 0000:0b:00.0

[root@host192 libvirt]# readlink /sys/bus/pci/devices/0000\:0b\:00.0/iommu_group

../../../../kernel/iommu_groups/19

[root@host192 libvirt]# ls /sys/bus/pci/devices/0000\:0b\:00.0/iommu_group/devices/

0000:0b:00.0
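The two lookups above can be wrapped in small helpers. This is a sketch only; the function names and the optional sysfs-root argument (handy for testing against a fake tree) are assumptions, not something the original procedure defines.

```shell
# Print the IOMMU group number of a PCI device.
# $1: full PCI address, e.g. 0000:0b:00.0
# $2: sysfs root (optional, defaults to /sys)
iommu_group_of() {
    basename "$(readlink "${2:-/sys}/bus/pci/devices/$1/iommu_group")"
}

# Print every device that shares the group, one per line.
iommu_group_members() {
    ls "${2:-/sys}/bus/pci/devices/$1/iommu_group/devices/"
}
```

On the host above, `iommu_group_of 0000:0b:00.0` would report the same group number that the readlink output ends with.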

 

3. Unbind the device in the iommu_group from its current driver

echo 0000:0b:00.0 >>/sys/bus/pci/devices/0000\:0b\:00.0/driver/unbind

Note: after this command runs, the driver directory under /sys/bus/pci/devices/0000\:0b\:00.0/ disappears
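Because the driver directory only exists while a driver is bound, a script that repeats this step may want to check for it first. A minimal sketch, assuming a hypothetical helper name unbind_device and an optional sysfs-root argument:

```shell
# Unbind a PCI device from its current driver, if it has one.
# $1: full PCI address
# $2: sysfs root (optional, defaults to /sys)
unbind_device() {
    unbind_file="${2:-/sys}/bus/pci/devices/$1/driver/unbind"
    if [ -e "$unbind_file" ]; then
        echo "$1" > "$unbind_file"
    else
        echo "$1 has no driver bound, nothing to do"
    fi
}
```

Running `unbind_device 0000:0b:00.0` a second time then reports that nothing is bound instead of failing on a missing path.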

 

4. Register the device's vendor ID and device ID with vfio-pci through new_id

[root@host192 libvirt]# lspci -n -s 0000:0b:00.0

0b:00.0 0c04: 1077:2432 (rev 03)

[root@host192 libvirt]# echo 1077 2432 >/sys/bus/pci/drivers/vfio-pci/new_id
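The vendor and device IDs are the two halves of the third field in the `lspci -n` output. The sketch below extracts them from such a line; the function name ids_from_lspci_line is made up for illustration, and the input format is assumed to match the output shown above.

```shell
# Extract "vendor device" from one line of `lspci -n` output.
# Example input: 0b:00.0 0c04: 1077:2432 (rev 03)
ids_from_lspci_line() {
    echo "$1" | awk '{ split($3, id, ":"); print id[1], id[2] }'
}

ids_from_lspci_line "0b:00.0 0c04: 1077:2432 (rev 03)"   # prints: 1077 2432
```

The result can be written directly to /sys/bus/pci/drivers/vfio-pci/new_id.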

 

5. Check that the iommu_group was bound successfully: a new entry named 19, the group number, appears under /dev/vfio

[root@host192 vfio]# ls /dev/vfio

19  vfio
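A complementary check is to look at which driver the device is now bound to. This is a sketch with a hypothetical helper name, bound_driver; it relies on the driver entry in sysfs being a symlink to the driver's directory.

```shell
# Print the name of the driver a PCI device is bound to, or "none".
# $1: full PCI address
# $2: sysfs root (optional, defaults to /sys)
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

After a successful bind, `bound_driver 0000:0b:00.0` should print vfio-pci.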

 

6. Write the XML and attach the device to the virtual machine

[root@host192 ljl]# cat net2.xml

<hostdev mode='subsystem' type='pci' managed='yes'>

     <driver name='vfio'/>

     <source>

       <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>

     </source>

</hostdev>

[root@host192 ljl]# virsh attach-device vfio_test net2.xml --config

Device attached successfully
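The address attributes in the XML are just the four parts of the PCI address, each prefixed with 0x. As an illustration, the following sketch generates the same hostdev element from an address; the function name hostdev_xml is an assumption, and the address is expected in the full domain:bus:slot.function form.

```shell
# Print a <hostdev> element for a PCI address such as 0000:0b:00.0.
hostdev_xml() {
    # Split the address into domain, bus, slot and function
    domain=${1%%:*}
    rest=${1#*:}
    bus=${rest%%:*}
    rest=${rest#*:}
    slot=${rest%%.*}
    func=${rest#*.}
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}
```

For example, `hostdev_xml 0000:0b:00.0 > net2.xml` produces a file equivalent to the one shown above.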

 

7. Start the virtual machine

[root@host192 ljl]# virsh start vfio_test

Domain vfio_test started

 

III. An iommu_group containing multiple devices (for example, a dual-port NIC)

[root@host192 0000:02:00.0]# ls /sys/bus/pci/devices/0000\:02\:00.0/iommu_group/devices/

0000:02:00.0  0000:02:00.1

 

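With multiple devices the procedure is similar to the single-device case, except that every device in the iommu_group must be unbound. For example, to pass through the device 0000:02:00.0:

echo 0000:02:00.1 >>/sys/bus/pci/devices/0000:02:00.0/driver/unbind

echo 0000:02:00.0 >>/sys/bus/pci/devices/0000:02:00.0/driver/unbind

Note: unbind 0000:02:00.1 first. Once 0000:02:00.0 is unbound, the …/devices/0000:02:00.0/driver directory disappears, so 0000:02:00.1 can no longer be unbound through that path. (Writing 0000:02:00.1 to this unbind file works because both functions are bound to the same driver.)

After that, 02:00.0 can be passed through in the same way as before.

The binding process can be done with a shell script: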

 

 
#!/bin/sh
# Unbind every device in the iommu_group of the given PCI address from its
# current driver, then register the devices' IDs with vfio-pci.

if [ $# -ne 1 ]
then
    echo "usage: $(basename "$0") pciaddr"
    exit 1
fi

pciaddr=$1
prefullpath="/sys/bus/pci/devices/$pciaddr"
fullpath="$prefullpath/iommu_group/devices"
driverfullpath="$prefullpath/driver/unbind"

# collect the group members before unbinding anything
devices=$(ls "$fullpath")

# unbind the other devices first: once $pciaddr itself is unbound,
# $driverfullpath no longer exists (this assumes they share its driver)
for i in $devices
do
    if [ "$i" != "$pciaddr" ]
    then
        echo "$i" > "$driverfullpath"
    fi
done

# unbind itself
echo "$pciaddr" > "$driverfullpath"

# new_id: "0b:00.0 0c04: 1077:2432 (rev 03)" becomes "1077 2432"
for i in $devices
do
    prdmsg=$(/usr/sbin/lspci -n -s "$i" | /usr/bin/sed 's/:/ /g' | /usr/bin/awk '{ print $4 " " $5 }')
    # an ID that is already registered may be rejected; that is harmless here
    echo "$prdmsg" > /sys/bus/pci/drivers/vfio-pci/new_id 2>/dev/null || true
done

exit 0

 

 

IV. An iommu_group containing not only multiple devices but also bridges

[root@host192 ljl]# ls /sys/bus/pci/devices/0000\:03\:00.0/iommu_group/devices/

0000:00:1c.0  0000:00:1c.4 0000:02:00.2  0000:03:00.0  0000:04:00.0

0000:00:1c.2  0000:02:00.0  0000:02:00.4  0000:03:00.1 0000:04:00.1

[root@host192 ljl]# lspci -s 0000:00:1c.0

00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1

[root@host192 ljl]# lspci -s 0000:00:1c.2

00:1c.2 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 3

[root@host192 ljl]# lspci -s 0000:03:00.0

03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

 

The main difference again lies in the unbind and new_id steps: every device in the iommu_group has to be unbound through its own driver/unbind file.

For example (here the iommu_group number is 8):

for i in $(ls /sys/kernel/iommu_groups/8/devices/)

do

   echo $i >/sys/kernel/iommu_groups/8/devices/$i/driver/unbind

   ven=$(cat /sys/kernel/iommu_groups/8/devices/$i/vendor)

   dev=$(cat /sys/kernel/iommu_groups/8/devices/$i/device)

   echo $ven $dev >/sys/bus/pci/drivers/vfio-pci/new_id

done
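Before running such a loop it can be useful to see which members of the group are bridges. The sketch below reads the class file in sysfs and treats class codes starting with 0x0604 (PCI-to-PCI bridge) as bridges; the function name is_pci_bridge and the optional sysfs-root argument are assumptions made for illustration.

```shell
# Succeed if the device's class code marks it as a PCI-to-PCI bridge (0x0604xx).
# $1: full PCI address
# $2: sysfs root (optional, defaults to /sys)
is_pci_bridge() {
    class=$(cat "${2:-/sys}/bus/pci/devices/$1/class")
    case "$class" in
        0x0604*) return 0 ;;
        *)       return 1 ;;
    esac
}
```

For example, `for i in $(ls /sys/kernel/iommu_groups/8/devices/); do is_pci_bridge "$i" && echo "$i is a bridge"; done` lists the bridges in group 8.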

 

V. Troubleshooting notes

1. Error when starting the virtual machine

error: internal error: process exited while connecting to monitor: 2015-08-11T06:11:06.627255Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to open /dev/vfio/vfio: Operation not permitted

2015-08-11T06:11:06.627315Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to setup container for group 19

2015-08-11T06:11:06.627331Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to get group 19

2015-08-11T06:11:06.627351Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: Device initialization failed.

2015-08-11T06:11:06.627371Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: Device 'vfio-pci' could not be initialized

Solution:

In /etc/libvirt/qemu.conf, add the entry "/dev/vfio/vfio" to the cgroup_device_acl list, then restart libvirtd so that the change takes effect.
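To confirm that the entry is really active and not sitting in a commented-out example, a quick check along these lines may help. It is a rough sketch: the helper name acl_lists is made up, and it searches every uncommented line of the file rather than parsing the cgroup_device_acl list itself.

```shell
# Succeed if a qemu.conf-style file mentions the given device path
# (in double quotes) on a line that is not commented out.
# $1: configuration file, $2: device path
acl_lists() {
    grep -v '^[[:space:]]*#' "$1" | grep -q -F "\"$2\""
}
```

For example, `acl_lists /etc/libvirt/qemu.conf /dev/vfio/vfio && echo present`.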

 

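2. After passing through a graphics card, the virtual machine cannot start

pci 0000:08:00.0 is not assignable

Cause: by default, qemu refuses to pass through devices that are not isolated by ACS (Access Control Services). To pass such devices through, this check has to be relaxed.

/etc/libvirt/qemu.conf

Enable the relaxed_acs_check = 1 option in this file; these devices can then be passed through to the virtual machine.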

 

3. The virtual machine cannot start: failed to set iommu for container

Possible cause: vfio_iommu_type1.allow_unsafe_interrupts is not enabled.

After enabling vfio_iommu_type1.allow_unsafe_interrupts, the virtual machine starts.

The option can also be enabled at system boot.
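The original does not show how the option is switched on, so the sketch below is an assumption based on the usual module-parameter mechanisms: the parameter file under /sys/module for the running system, and the kernel command line or a modprobe.d file for boot time. The helper name unsafe_interrupts_enabled is hypothetical.

```shell
# Succeed if the module parameter file reports the option as enabled ("Y").
# $1: parameter file (optional)
unsafe_interrupts_enabled() {
    param="${1:-/sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts}"
    [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]
}

# To enable at runtime (as root):
#   echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
# To enable at every boot, either add this to the kernel command line:
#   vfio_iommu_type1.allow_unsafe_interrupts=1
# or place this line in a file under /etc/modprobe.d/ (the file name is arbitrary):
#   options vfio_iommu_type1 allow_unsafe_interrupts=1
```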

 

4. The virtual machine cannot start: /dev/vfio/16 Operation not permitted

In /etc/libvirt/qemu.conf, add the entry "/dev/vfio/16" to the cgroup_device_acl list.

VI. References

http://www.redhat.com/archives/libvir-list/2013-March/msg00514.html

http://www.linux-kvm.org/images/b/b4/2012-forum-VFIO.pdf

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/chap-Guest_virtual_machine_device_configuration.html#sect-PCI_devices-PCI_passthrough
